Local investigation
Bounded grep and filesystem scans before cloud agent calls.
Before AgentFlow forwards repository context to a cloud agent, it investigates the project locally. The implementation lives in application/internal/investigation and feeds both agentflow investigate and the V3 pipeline prelude. The engineering goal is narrow but important: shrink prompt size and surface the files that matter using bounded, repeatable tools (grep, filesystem scans, heuristics) instead of asking a model to ingest the whole tree.
What happens on your machine
Investigation usually starts with grep against feature or task patterns, with hard caps on output bytes so a pathological match cannot exhaust your context budget. A filesystem scan walks candidate paths and flags anything above large_file_bytes, which keeps huge binaries or logs from being packed by mistake. Sensitive path detection applies secret_path_denylist and configured globs so keys and credentials are less likely to enter prompts or reports. Finally, related test heuristics propose tests tied to candidate source paths, which supports verify steps without enumerating the entire repository.
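The scan, sensitive-path, and related-test steps described above can be sketched in a few lines. This is a minimal illustration, not AgentFlow's implementation: the function names (is_sensitive, scan, related_tests) and the test-naming conventions are assumptions, and only glob matching is shown, not secret_path_denylist. The threshold and globs mirror the example defaults on this page.

```python
import fnmatch
import os
from pathlib import PurePosixPath

# Values mirror the example defaults on this page; all names are hypothetical.
LARGE_FILE_BYTES = 524288
SENSITIVE_GLOBS = ["*.pem", ".git/*"]


def is_sensitive(rel_path, globs=SENSITIVE_GLOBS):
    """Return True if the relative path or its file name matches any glob."""
    name = PurePosixPath(rel_path).name
    return any(
        fnmatch.fnmatch(rel_path, g) or fnmatch.fnmatch(name, g) for g in globs
    )


def scan(root, large_file_bytes=LARGE_FILE_BYTES):
    """Walk root and sort files into candidates, oversized, and sensitive."""
    report = {"candidates": [], "large": [], "sensitive": []}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            full = os.path.join(dirpath, filename)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            if is_sensitive(rel):
                # Sensitive paths are reported but never packed.
                report["sensitive"].append(rel)
            elif os.path.getsize(full) > large_file_bytes:
                # Oversized files are flagged instead of being read.
                report["large"].append(rel)
            else:
                report["candidates"].append(rel)
    return report


def related_tests(source_path, all_paths):
    """Guess test files for a source path by naming convention (a heuristic)."""
    stem = PurePosixPath(source_path).stem
    guesses = {f"test_{stem}", f"{stem}_test", f"{stem}.test", f"{stem}.spec"}
    return [p for p in all_paths if PurePosixPath(p).stem in guesses]
```

Checking sensitivity before size means a large key file is still classified as sensitive rather than merely large, which is the safer of the two labels when deciding what may enter a prompt.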
Those artefacts flow into context packing and cost estimation. They are inputs, not verdicts: they do not replace human judgment about scope or security.
CLI commands
Use these commands to run or debug investigation outside a full work pipeline:
agentflow investigate billing-v2
agentflow investigate billing-v2 --task task-003
agentflow inspect diff
agentflow inspect symbol Handler
agentflow inspect tests billing-v2
The inspect subcommands wrap targeted helpers when you already know whether you care about a diff, a symbol, or test associations.
Limits and configuration
Investigation honors timeouts and byte caps from config. Example defaults:
mcp:
  investigation:
    large_file_bytes: 524288
    max_grep_output_bytes: 262144
    command_timeout_seconds: 120
    sensitive_globs: ["*.pem", ".git/*"]
Tightening these values trades recall for safety and steadier cost. Loosening them can surface more surrounding context, but it also increases the risk of oversized prompts and accidental exposure of secrets.
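To make the effect of the timeout and byte cap concrete, here is a rough sketch of how a bounded command runner could apply them. The helper name run_bounded and its return shape are hypothetical and are not taken from AgentFlow; the two limits correspond to command_timeout_seconds and max_grep_output_bytes.

```python
import subprocess
import sys


def run_bounded(cmd, timeout_seconds, max_output_bytes):
    """Run cmd and return (output, truncated, timed_out).

    Hypothetical helper: output is capped at max_output_bytes and the
    process is killed once timeout_seconds elapses. For simplicity the
    full output is buffered before truncation; a stricter version would
    stream and stop reading at the cap.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL
    )
    timed_out = False
    try:
        out, _ = proc.communicate(timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        # Stop the runaway command and collect whatever it produced.
        proc.kill()
        out, _ = proc.communicate()
        timed_out = True
    truncated = len(out) > max_output_bytes
    return out[:max_output_bytes], truncated, timed_out
```

With the example defaults, a grep pass might be invoked as run_bounded(["grep", "-rn", "billing", "."], 120, 262144). Returning the truncated and timed_out flags alongside the output lets later stages record that results are partial instead of silently treating them as complete.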
Related reading
- Local-first concepts
- CLI: investigate
- MCP tools (optional exposure via MCP when enabled)