OpenClaw with Local Models: Complete Guide to Ollama and Offline AI

See also: provider-specific notes in the Ollama model provider page, and cost tips in the cost playbook.

Why run local models?

  • Privacy — sensitive prompts never leave your LAN.
  • Cost — zero per-token fees after hardware (calculator).
  • Offline — works during outages (with reduced capability).

Trade-offs: inference is slower on CPU, small models are weaker at tool use, and you own tuning and updates.

Ollama setup steps

  1. Install Ollama from ollama.com on the same host as OpenClaw (or on a host OpenClaw can reach over the LAN).
  2. Pull a model: ollama pull llama3.2 (pick a model that fits in your RAM).
  3. Point OpenClaw at Ollama as described in the provider config.
  4. Run openclaw doctor to verify connectivity.
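
Before running openclaw doctor, it can help to confirm that Ollama itself is answering. The sketch below is a standalone check, not part of OpenClaw: it assumes Ollama is listening on its default address (http://localhost:11434) and uses Ollama's /api/tags endpoint, which lists pulled models. The helper names (parse_model_names, has_model, list_local_models) are made up for this example.

```python
import json
import urllib.request

# Ollama's default address; replace with the LAN IP if Ollama runs elsewhere.
OLLAMA_URL = "http://localhost:11434"


def parse_model_names(payload: dict) -> list[str]:
    """Extract model names from the JSON body returned by /api/tags."""
    return [model["name"] for model in payload.get("models", [])]


def has_model(names: list[str], wanted: str) -> bool:
    """Return True if `wanted` is pulled; a bare name matches any tag."""
    for name in names:
        if name == wanted:
            return True
        # "llama3.2" should match "llama3.2:latest" when no tag was requested.
        if ":" not in wanted and name.split(":", 1)[0] == wanted:
            return True
    return False


def list_local_models(base_url: str = OLLAMA_URL, timeout: float = 5.0) -> list[str]:
    """Ask the Ollama server which models have been pulled."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return parse_model_names(json.load(resp))


if __name__ == "__main__":
    try:
        names = list_local_models()
    except OSError as exc:  # URLError and connection errors subclass OSError
        print(f"Ollama not reachable at {OLLAMA_URL}: {exc}")
    else:
        print("Pulled models:", names)
        print("llama3.2 available:", has_model(names, "llama3.2"))
```

If the script reports that Ollama is not reachable, fix that first (service not running, firewall, wrong IP) before debugging the OpenClaw side.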

Hybrid local + cloud routing

Use local models for triage, summarization, and PII-heavy tasks; route complex tool use to Claude or GPT (see model comparison). Configure fallbacks as described in model providers.
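
The routing rule above can be sketched as a small decision function. This is an illustration of the policy only, not OpenClaw's configuration format or API: the Task type, the choose_route function, and the "local" / "cloud" labels are all hypothetical, and the rule that PII takes priority over tool use is an assumption made for this sketch.

```python
from dataclasses import dataclass

# Task kinds the guide suggests keeping on local models.
LOCAL_TASK_KINDS = {"triage", "summarization"}


@dataclass(frozen=True)
class Task:
    """Hypothetical description of a unit of work to be routed."""
    kind: str
    contains_pii: bool = False
    needs_tools: bool = False


def choose_route(task: Task, local_available: bool = True) -> str:
    """Return "local" or "cloud" for a task.

    Assumed priority order: PII first, then tool use, then task kind.
    """
    if task.contains_pii:
        # Fail closed: PII never goes to the cloud, even if the local model
        # is down. The caller has to queue or reject the task in that case.
        return "local"
    if task.needs_tools:
        return "cloud"
    if task.kind in LOCAL_TASK_KINDS and local_available:
        return "local"
    # Everything else, including local tasks during a local outage.
    return "cloud"


if __name__ == "__main__":
    print(choose_route(Task("summarization")))
    print(choose_route(Task("coding", needs_tools=True)))
    print(choose_route(Task("coding", contains_pii=True, needs_tools=True)))
```

Keeping the PII check first means a privacy-sensitive request is never silently upgraded to a cloud model just because it also needs tools; whether that trade-off suits a given deployment is a policy decision.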

Hardware guidance

  • 8 GB RAM: 7B quantized models, light channels only.
  • 16 GB RAM: 8–13B models, moderate automation.
  • 32 GB+ / GPU: Larger models, faster multi-agent workloads.
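
As a rough sanity check on the tiers above, weight memory is approximately parameter count times bits per weight divided by 8. The sketch below applies that arithmetic; the 1.2x overhead factor (for context cache and runtime buffers) and the 2 GB reserved for the OS and OpenClaw are placeholder assumptions, not measured values, and the function names are invented for this example.

```python
def estimate_weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    # billions of params * bits / 8 bits-per-byte = billions of bytes = GB
    return params_billions * bits_per_weight / 8


def fits_in_ram(
    params_billions: float,
    bits_per_weight: float,
    ram_gb: float,
    overhead_factor: float = 1.2,  # assumed allowance for cache and buffers
    reserve_gb: float = 2.0,       # assumed headroom for the OS and OpenClaw
) -> bool:
    """Rough check: do weights plus assumed overhead fit in the given RAM?"""
    needed = estimate_weights_gb(params_billions, bits_per_weight) * overhead_factor
    return needed + reserve_gb <= ram_gb


if __name__ == "__main__":
    for params, bits, ram in [(7, 4, 8), (13, 4, 8), (13, 4, 16), (7, 16, 16)]:
        size = estimate_weights_gb(params, bits)
        verdict = "fits" if fits_in_ram(params, bits, ram) else "does not fit"
        print(f"{params}B at {bits}-bit: ~{size:.1f} GB weights, {verdict} in {ram} GB")
```

Under these assumptions a 4-bit 7B model (about 3.5 GB of weights) fits in 8 GB, while a 4-bit 13B model (about 6.5 GB) needs the 16 GB tier, which is consistent with the guidance above. Long contexts raise memory use beyond this estimate.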

Host sizing is also covered in hosting optimization.