OpenClaw with Local Models: Complete Guide to Ollama and Offline AI
Also see: Provider-specific notes in Ollama model provider and cost tips in cost playbook.
Why run local models?
- Privacy — sensitive prompts never leave your LAN.
- Cost — zero per-token fees after hardware (calculator).
- Offline — works during outages (with reduced capability).
Trade-offs: inference is slower on CPU, small models are weaker at tool use, and you own tuning and updates.
Ollama setup steps
- Install Ollama from ollama.com on the same host as OpenClaw (or on a host reachable at a LAN IP).
- Pull a model: ollama pull llama3.2 (pick models matching your RAM).
- Point OpenClaw to Ollama per provider config.
- Run openclaw doctor to verify connectivity.
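If the connectivity check fails, it can help to confirm independently that Ollama is up and that the model was pulled. The sketch below is one way to do that, assuming Ollama is listening on its default address (http://localhost:11434) and using its /api/tags endpoint, which lists locally available models. The helper names (fetch_tags, installed_models, has_model) are hypothetical and are not part of OpenClaw or Ollama.

```python
import json
import urllib.request

# Assumption: Ollama's default listen address. Change this for a LAN host.
OLLAMA_URL = "http://localhost:11434"


def installed_models(tags_payload: dict) -> list[str]:
    """Extract model names from the body of a /api/tags response."""
    return [m["name"] for m in tags_payload.get("models", [])]


def has_model(tags_payload: dict, wanted: str) -> bool:
    """True if `wanted` matches an installed model, with or without a tag."""
    for name in installed_models(tags_payload):
        # "llama3.2" should match "llama3.2:latest"; a full "name:tag" must match exactly.
        if name == wanted or name.split(":", 1)[0] == wanted:
            return True
    return False


def fetch_tags(base_url: str = OLLAMA_URL, timeout: float = 5.0) -> dict:
    """Ask the Ollama server which models it has pulled (raises OSError if unreachable)."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)
```

Against a running server, has_model(fetch_tags(), "llama3.2") reports whether the pull succeeded. If fetch_tags raises an OSError, the server is probably not running or not reachable at that address.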
Hybrid local + cloud routing
Use local models for triage, summarization, and PII-heavy tasks; route complex tool use to Claude/GPT (see model comparison). Configure fallbacks as described in model providers.
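The routing rule above can be sketched as a small decision function. This is only an illustration of the policy, not OpenClaw's configuration format or API. The Task type, the task kind strings, and the route function are hypothetical names chosen for the example.

```python
from dataclasses import dataclass

# Task kinds the guide suggests keeping on local models.
LOCAL_TASKS = {"triage", "summarization"}


@dataclass(frozen=True)
class Task:
    kind: str                   # e.g. "triage", "summarization", "tool_use"
    contains_pii: bool = False  # True for PII-heavy work


def route(task: Task, cloud_available: bool = True) -> str:
    """Return "local" or "cloud" for a task, following the hybrid policy."""
    # PII-heavy work stays on the LAN regardless of task kind.
    if task.contains_pii:
        return "local"
    if task.kind in LOCAL_TASKS:
        return "local"
    # Everything else (e.g. complex tool use) prefers the cloud model,
    # falling back to local when the cloud provider is unreachable.
    return "cloud" if cloud_available else "local"
```

Checking the PII flag first is a deliberate choice in this sketch: a tool-use task that carries sensitive data stays local even though a cloud model would likely handle it better.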
Hardware guidance
- 8 GB RAM: 7B quantized models, light channels only.
- 16 GB RAM: 8–13B models, moderate automation.
- 32 GB+ / GPU: Larger models, faster multi-agent workloads.
Host sizing is also covered in hosting optimization.
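As a rough sanity check on the RAM tiers above, the sketch below encodes them as a lookup and adds a back-of-the-envelope estimate of weight size (parameters times bits per weight, divided by 8). The function names are hypothetical and the 4-bit default is an assumption about the quantization in use. The estimate covers weights only and ignores context cache and runtime overhead, so real memory use will be higher.

```python
def suggested_tier(ram_gb: float, has_gpu: bool = False) -> str:
    """Map host memory to the tiers listed in the hardware guidance."""
    if has_gpu or ram_gb >= 32:
        return "larger models, faster multi-agent workloads"
    if ram_gb >= 16:
        return "8-13B models, moderate automation"
    if ram_gb >= 8:
        return "7B quantized models, light channels only"
    return "no tier listed in the guide"


def approx_weight_gb(params_billion: float, bits_per_weight: int = 4) -> float:
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    # 1 billion parameters at 8 bits each is 1 GB, so scale by bits / 8.
    return params_billion * bits_per_weight / 8
```

For example, approx_weight_gb(7) gives 3.5, which is consistent with a 7B quantized model fitting on an 8 GB host with room left for the operating system and OpenClaw.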