OpenClaw with Local Models: Complete Guide to Ollama and Offline AI

See also: provider-specific notes in Ollama model provider, and cost tips in the cost playbook.

Why run local models?

Privacy: sensitive prompts never leave your LAN.
Cost: no per-token fees once the hardware is paid for (see the calculator).
Offline: keeps working during internet outages, with reduced capability.

Trade-offs: inference is slower on CPU, small models are weaker at tool use, and you own tuning and updates.

Ollama setup steps

Install Ollama from ollama.com on the same host as OpenClaw, or on another host reachable at a LAN IP.
Pull a model: ollama pull llama3.2 (pick a model that fits your RAM).
Point OpenClaw at Ollama as described in the provider config.
Run openclaw doctor to verify connectivity.
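
Ollama can also be checked directly over its HTTP API, independently of OpenClaw. The sketch below is a minimal example, assuming Ollama listens on its default address http://localhost:11434 and using its /api/tags endpoint, which lists locally pulled models. The helper names list_local_models and has_model are hypothetical and are not part of OpenClaw or Ollama.

```python
import json
import urllib.request

# Assumption: Ollama's default listen address; change it if Ollama runs on another LAN host.
OLLAMA_URL = "http://localhost:11434"


def list_local_models(base_url: str = OLLAMA_URL, timeout: float = 5.0) -> list[str]:
    """Return the names of models Ollama has pulled, via GET /api/tags."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        payload = json.load(resp)
    return [m["name"] for m in payload.get("models", [])]


def has_model(names: list[str], wanted: str) -> bool:
    """True if `wanted` matches a pulled model, with or without a tag such as ':latest'."""
    return any(n == wanted or n.split(":", 1)[0] == wanted for n in names)
```

A typical use would be to call has_model(list_local_models(), "llama3.2") and treat an OSError from list_local_models (connection refused, timeout) as "Ollama is not reachable at that address".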

Hybrid local + cloud routing

Use local models for triage, summarization, and PII-heavy tasks; route complex tool-use to Claude/GPT (model comparison). Configure fallbacks in model providers.
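
The routing rule above can be sketched as a small decision function. This illustrates the policy only and is not OpenClaw's configuration format: the Task type, the route function, and the "local"/"cloud" labels are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical labels for illustration; they are not OpenClaw config keys.
LOCAL = "local"
CLOUD = "cloud"
LOCAL_TASKS = {"triage", "summarization"}


@dataclass
class Task:
    kind: str                  # e.g. "triage", "summarization", "tool_use"
    contains_pii: bool = False


def route(task: Task, cloud_available: bool = True) -> str:
    """Pick a backend: PII-heavy and light tasks stay local, the rest goes to the cloud."""
    if task.contains_pii:
        return LOCAL           # sensitive prompts never leave the LAN
    if task.kind in LOCAL_TASKS:
        return LOCAL
    # Complex tool use prefers the cloud; fall back to local when the cloud is unreachable.
    return CLOUD if cloud_available else LOCAL
```

Checking the PII flag first means a sensitive tool-use request stays local even though it would otherwise go to the cloud. That ordering is a design choice that favors privacy over capability.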

Hardware guidance

8 GB RAM: 7B quantized models, light channels only.
16 GB RAM: 8–13B models, moderate automation.
32 GB+ / GPU: Larger models, faster multi-agent workloads.
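
A rough way to sanity-check these tiers is to estimate the memory the model weights alone need: parameters times bits per weight, divided by 8. The sketch below assumes 4-bit quantization by default and ignores the KV cache, runtime overhead, and the operating system, so real usage will be higher. The function names are hypothetical, and treating any GPU as the top tier is a simplification of the list above.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int = 4) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def suggested_tier(ram_gb: float, has_gpu: bool = False) -> str:
    """Map host RAM to the tiers listed above."""
    if has_gpu or ram_gb >= 32:
        return "larger models, multi-agent workloads"
    if ram_gb >= 16:
        return "8-13B models, moderate automation"
    if ram_gb >= 8:
        return "7B quantized models, light channels only"
    return "below the smallest tier listed"
```

For example, weight_memory_gb(7) gives 3.5, which is why a 4-bit 7B model can fit on an 8 GB host with room left for the OS and context, while the same model at 16 bits per weight needs about 14 GB for weights alone.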

Host sizing is also covered in hosting optimization.