Ollama

Run local models for enrichment, routing, and intent fallback.

Ollama backs local steps in AgentFlow: enrichment passes, the intent resolver when intent.resolver.use_ollama_fallback is on, and cost-aware routing that targets work.default_enricher. It is not a separate product integration; it is the HTTP endpoint and model tags you wire under agents.ollama, plus matching models entries.

Configure

Point endpoint at your daemon, set model and embedding_model to the chat and embedding tags you pulled, and mirror the model id in a models profile with provider: ollama, class: local, and the usage list you expect to run locally.

agents:
  ollama:
    endpoint: http://localhost:11434
    model: qwen2.5-coder:14b
    embedding_model: nomic-embed-text
    timeout: 300

models:
  ollama_local_qwen:
    provider: ollama
    class: local
    model: qwen2.5-coder:14b
    usage: [summarize, classify, pre_review, context_selection]
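
One way to catch drift between agents.ollama.model and the models profiles is a small shell check. The sketch below is not an AgentFlow command: check_mirror is a hypothetical helper, the config path is passed as an argument because it depends on your setup, and the awk parsing assumes the two-space indentation shown above rather than full YAML. It also does not verify provider: ollama on the matching profile.

```shell
# Hypothetical helper (not part of AgentFlow): succeed only if the model id
# under agents.ollama also appears as a model: value somewhere under models.
# Usage: check_mirror PATH_TO_CONFIG
check_mirror() {
  awk '
    /^[^ #]/   { top = $1; child = "" }   # top-level key, e.g. "agents:"
    /^  [^ #]/ { child = $1 }             # second-level key, e.g. "ollama:"
    $1 == "model:" && top == "agents:" && child == "ollama:" { want = $2 }
    $1 == "model:" && top == "models:"                       { seen[$2] = 1 }
    END { if (want == "" || !(want in seen)) exit 1 }
  ' "$1"
}
```

Called as check_mirror followed by your config file, it exits 0 when the ids match and 1 when the agents.ollama model is missing or has no mirrored profile.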

Prerequisites

Start the daemon, pull the weights you referenced, then let doctor confirm the tree and config agree.

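ollama serve                     # stays in the foreground; run the rest in a second terminal
ollama pull qwen2.5-coder:14b    # chat tag from agents.ollama.model
ollama pull nomic-embed-text     # embedding tag from agents.ollama.embedding_model
agentflow doctor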
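
If doctor reports a problem, it can help to ask the daemon directly which tags it has. The sketch below relies on Ollama's GET /api/tags listing. missing_tags is a hypothetical helper, not an AgentFlow or Ollama command, and the endpoint in the usage comment is the one from the example config. The match is a plain prefix match on the name field, so it is a rough check rather than a strict one.

```shell
# Hypothetical helper (not part of AgentFlow or Ollama): print every wanted
# tag that is absent from a GET /api/tags response.
# Usage: missing_tags "$json" TAG...
missing_tags() {
  # Strip whitespace so the match works on compact or pretty-printed JSON.
  json=$(printf '%s' "$1" | tr -d ' \n\t')
  shift
  for tag in "$@"; do
    # Fixed-string prefix match: "nomic-embed-text" also matches
    # "nomic-embed-text:latest".
    printf '%s' "$json" | grep -qF "\"name\":\"$tag" || echo "$tag"
  done
}

# Against a live daemon (endpoint taken from agents.ollama.endpoint):
#   missing_tags "$(curl -fsS http://localhost:11434/api/tags)" \
#     qwen2.5-coder:14b nomic-embed-text
```

Empty output means every tag was found; any printed tag still needs an ollama pull.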

Usage

agentflow enrich calls Ollama directly when you pass --agent ollama; agentflow work --prefer-local keeps eligible steps on the local profile your routing describes.

agentflow enrich billing-v2 --agent ollama
agentflow work "refactor utils" --prefer-local

Embeddings / RAG

embedding_model is reserved for retrieval work the project has not fully shipped yet. Today’s agentflow index persists text chunks in SQLite; do not expect vector similarity search from a stock build.