Sub-agents & Context Cleanliness

Sub-agents are the single highest-impact pattern in this codebase. They let an agent system delegate heavy work (code reading, exploration, planning) to isolated runtimes that report back with summaries. The parent's context window grows by a few hundred tokens, while the sub-agent can consume hundreds of thousands.

The Agent tool, in detail

crates/tools/src/lib.rs:580 is the mvp_tool_specs entry for Agent. The schema:

{
  "description": "string (required)",
  "prompt": "string (required)",
  "subagent_type": "string (optional)",
  "name": "string (optional)",
  "model": "string (optional)"
}

Dispatch is at line 1238: "Agent" => from_value::<AgentInput>(input).and_then(run_agent). The run_agent function (callable at the dispatch site) does the heavy work:

Normalize the subagent_type (lines 5099–5116) via normalize_subagent_type():
- Empty → general-purpose
- explore/explorer → Explore
- plan → Plan
- verification/verify/verifier → Verification
- clawguide/guide → claw-guide
- statusline → statusline-setup
- Anything else → passed through unchanged
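The alias table above can be sketched as a single match. This is a minimal re-creation under stated assumptions, not the code at lines 5099–5116: the `Option<&str>` parameter, the whitespace trimming, and the case-insensitive matching are all guesses made for illustration.

```rust
/// Hypothetical re-creation of the alias table described above.
/// Assumptions: input is optional, surrounding whitespace is ignored,
/// and aliases match case-insensitively.
fn normalize_subagent_type(raw: Option<&str>) -> String {
    let trimmed = raw.unwrap_or("").trim();
    match trimmed.to_ascii_lowercase().as_str() {
        // Empty or missing falls back to the default agent type.
        "" => "general-purpose".to_string(),
        "explore" | "explorer" => "Explore".to_string(),
        "plan" => "Plan".to_string(),
        "verification" | "verify" | "verifier" => "Verification".to_string(),
        "clawguide" | "guide" => "claw-guide".to_string(),
        "statusline" => "statusline-setup".to_string(),
        // Anything else is passed through unchanged (original casing kept).
        _ => trimmed.to_string(),
    }
}
```

The pass-through arm is what lets unknown or custom type names survive normalization untouched.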

Build the subagent's tool allowlist (lines 3657–3736). Per type:

Subagent type	Allowed tools
`Explore`	read_file, glob_search, grep_search, WebFetch, WebSearch, ToolSearch, Skill, StructuredOutput
`Plan`	Everything in `Explore`, plus TodoWrite and SendUserMessage (no bash)
`Verification`	bash, read_file, glob_search, grep_search, WebFetch, WebSearch, ToolSearch, TodoWrite, StructuredOutput, SendUserMessage, PowerShell
`claw-guide`	read_file, glob_search, grep_search, WebFetch, WebSearch, ToolSearch, Skill, StructuredOutput, SendUserMessage
`statusline-setup`	bash, read_file, write_file, edit_file, glob_search, grep_search, ToolSearch
`general-purpose` (default)	bash, read_file, write_file, edit_file, glob_search, grep_search, WebFetch, WebSearch, TodoWrite, Skill, ToolSearch, NotebookEdit, Sleep, SendUserMessage, Config, StructuredOutput, REPL, PowerShell
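One way to read the table is as a lookup from normalized type to tool names, plus a membership check at dispatch time. The sketch below copies the tool lists from the table; the function names `allowed_tools` and `is_tool_allowed` are hypothetical, and treating every unrecognized type as `general-purpose` is an assumption the source does not confirm.

```rust
/// Read-only tool set for Explore, copied from the table above.
const EXPLORE_TOOLS: &[&str] = &[
    "read_file", "glob_search", "grep_search", "WebFetch",
    "WebSearch", "ToolSearch", "Skill", "StructuredOutput",
];

/// Hypothetical helper: the allowlist for an already-normalized subagent type.
fn allowed_tools(subagent_type: &str) -> Vec<&'static str> {
    let mut tools = EXPLORE_TOOLS.to_vec();
    match subagent_type {
        "Explore" => {}
        // Plan and claw-guide extend the Explore set; neither gets bash.
        "Plan" => tools.extend_from_slice(&["TodoWrite", "SendUserMessage"]),
        "claw-guide" => tools.push("SendUserMessage"),
        "Verification" => {
            tools = vec![
                "bash", "read_file", "glob_search", "grep_search", "WebFetch",
                "WebSearch", "ToolSearch", "TodoWrite", "StructuredOutput",
                "SendUserMessage", "PowerShell",
            ];
        }
        "statusline-setup" => {
            tools = vec![
                "bash", "read_file", "write_file", "edit_file",
                "glob_search", "grep_search", "ToolSearch",
            ];
        }
        // Assumption: unknown types fall back to the general-purpose set.
        _ => {
            tools = vec![
                "bash", "read_file", "write_file", "edit_file", "glob_search",
                "grep_search", "WebFetch", "WebSearch", "TodoWrite", "Skill",
                "ToolSearch", "NotebookEdit", "Sleep", "SendUserMessage",
                "Config", "StructuredOutput", "REPL", "PowerShell",
            ];
        }
    }
    tools
}

/// Hypothetical dispatch-time check: may this agent type call this tool?
fn is_tool_allowed(subagent_type: &str, tool: &str) -> bool {
    allowed_tools(subagent_type).iter().any(|t| *t == tool)
}
```

Under this reading, an `Explore` agent asking for `bash` is rejected before any command runs, which is what makes the read-only types safe to use liberally.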

Spawn a thread (line 3577): std::thread::Builder::new().spawn(move || run_agent_job(&job)). Not a subprocess. The sub-agent runs in the same process, in a separate thread, with its own runtime.
Build a fresh ConversationRuntime (lines 3603–3630) inside the thread:
- Fresh Session (line 3625) — empty message history, no inheritance from parent
- Type-specific system prompt (line 3520) — built per subagent_type, not the parent's prompt
- Filtered ToolExecutor — only the allowlisted tools above
- Independent PermissionPolicy (likely matching the parent's mode, but with a separate state machine)
- Independent HookRunner, UsageTracker, etc.
Execute the prompt in the sub-runtime (loop until no more tool uses, just like the main loop covered in The Request Loop, Traced).
Persist results (lines 3509–3538): write AgentOutput to .claude/agents/{agent_id}.json and .claude/agents/{agent_id}.md.
Return a manifest to the parent, not the full transcript. The parent gets {agent_id, status, output_file, ...} — not the entire conversation the sub-agent had. This is the context-isolation contract.
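The spawn, fresh-session, and manifest steps above can be condensed into a small sketch. Everything here is illustrative: `Session`, `AgentManifest`, `run_agent_sketch`, and the `"completed"` status string are hypothetical stand-ins, the tool loop is faked, and nothing is written to disk. Only `std::thread::Builder` and the `.claude/agents/{agent_id}.md` path pattern come from the text.

```rust
use std::io;
use std::thread;

/// Hypothetical stand-in for the sub-agent's private message history.
struct Session {
    messages: Vec<String>,
}

/// Hypothetical manifest shape; the text names agent_id, status, output_file.
#[derive(Debug, Clone, PartialEq)]
struct AgentManifest {
    agent_id: String,
    status: String,
    output_file: String,
}

/// Runs a pretend sub-agent on its own thread and returns only a manifest.
fn run_agent_sketch(agent_id: &str, prompt: &str) -> io::Result<AgentManifest> {
    let agent_id = agent_id.to_string();
    let prompt = prompt.to_string();
    let handle = thread::Builder::new()
        .name(format!("agent-{}", agent_id))
        .spawn(move || {
            // Fresh session: starts empty, nothing inherited from the parent.
            let mut session = Session { messages: Vec::new() };
            session.messages.push(prompt);
            // Pretend tool loop: the transcript grows inside this thread only.
            for i in 0..30 {
                session.messages.push(format!("tool_result #{}", i));
            }
            // The transcript is dropped here; the real code persists it to disk.
            AgentManifest {
                status: "completed".to_string(),
                output_file: format!(".claude/agents/{}.md", agent_id),
                agent_id,
            }
        })?;
    handle
        .join()
        .map_err(|_| io::Error::new(io::ErrorKind::Other, "sub-agent thread panicked"))
}
```

The point of the shape is the return type: the 31 messages accumulated in `session` never cross the thread boundary, so the caller's state grows by three short strings.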

What "context cleanliness" means here

When the parent invokes Agent({prompt: "find all uses of foo in the codebase"}), three things happen:

The parent's context window barely changes: only the tool_use block ("Agent({...})") and the eventual tool_result (the manifest summary) are added.
The sub-agent starts blank: it sees no part of the parent's conversation. It can read the same CLAUDE.md, since both share the cwd, but it has no access to the parent's runtime memory of having read it.
The sub-agent's transcript is on disk — recoverable, but not loaded into the parent's window.

So the parent's context grows by ~200 tokens for the dispatch + manifest, instead of by however-many-thousand tokens the sub-agent actually consumed. If the sub-agent reads 30 files and runs 5 greps, the parent never sees those tool calls or their outputs.

This is the single most effective context-saving move available to a long-running agent. It is why production Claude Code sessions can offload all the heavy code-reading to Explore agents. This writeup itself is built from the agents' structured digests, which collectively compress hundreds of thousands of tokens of source into ~10K tokens of summary.

The brief-like-a-stranger contract

Because the sub-agent has zero context from the parent, the parent must put everything the task needs into the prompt itself. Claude Code's system prompt is explicit:

"Brief the agent like a smart colleague who just walked into the room — it hasn't seen this conversation, doesn't know what you've tried, doesn't understand why this task matters."

"Never delegate understanding. Don't write 'based on your findings, fix the bug' or 'based on the research, implement it.'"

The system prompt also contains worked examples of well-formed Agent prompts, all sharing the structure: goal, context, what's been tried, what to report, length cap.

This contract is enforced by the architecture, not by checks. You can send a short Agent prompt; it just won't work well. The mechanism produces good prompts because bad prompts visibly fail.
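The five-part structure mentioned above (goal, context, what's been tried, what to report, length cap) can be made hard to forget by building prompts from a struct. This is a hypothetical helper, not something the source says exists in claw-code or Claude Code; the `AgentBrief` type, its field names, and the rendered layout are all assumptions.

```rust
/// Hypothetical prompt builder mirroring the five-part brief structure.
struct AgentBrief<'a> {
    goal: &'a str,
    context: &'a str,
    already_tried: &'a [&'a str],
    report_format: &'a str,
    max_words: usize,
}

impl<'a> AgentBrief<'a> {
    /// Renders the brief as a plain-text prompt for the Agent tool.
    fn render(&self) -> String {
        let tried = if self.already_tried.is_empty() {
            "- nothing yet".to_string()
        } else {
            self.already_tried
                .iter()
                .map(|t| format!("- {}", t))
                .collect::<Vec<_>>()
                .join("\n")
        };
        format!(
            "Goal: {}\n\nContext: {}\n\nAlready tried:\n{}\n\nReport back: {}\nKeep the report under {} words.",
            self.goal, self.context, tried, self.report_format, self.max_words
        )
    }
}
```

Because every field is required, a one-line "fix the bug" prompt cannot be constructed without the caller at least confronting the missing context.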

Workers vs sub-agents (the distinction matters)

Two things in claw-code might both be called "sub-agents" colloquially:

Sub-agents spawned via the Agent tool: in-process threads with a manifest-only return value and automatic context isolation. This is a default tool with a fast spawn, meant to be used liberally.
Workers spawned via the WorkerCreate tool (crates/runtime/src/worker_boot.rs:255–294): also threads in claw-code, but designed to model long-running processes that the parent supervises across many interactions. Workers carry richer state (lines 219–235): worker_id, cwd, status, trust_auto_resolve, trust_gate_cleared, auto_recover_prompt_misdelivery, prompt_delivery_attempts, prompt_in_flight, last_prompt, expected_receipt, replay_prompt, last_error, events. They move through phases: Spawning → TrustRequired → ToolPermissionRequired → ReadyForPrompt → Running → Finished | Failed.

The distinction in claw-code: an Agent does one thing and reports back. A Worker is a long-lived process the parent talks to repeatedly via WorkerSendPrompt/WorkerObserve/WorkerObserveCompletion, and the parent can intervene in its lifecycle (WorkerRestart, WorkerTerminate, WorkerResolveTrust).
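The worker phase sequence listed above reads naturally as a small state machine. The sketch below is an illustration, not the code in worker_boot.rs: the `WorkerPhase` enum and its methods are hypothetical, the happy path is assumed to be strictly linear, and the rule that any non-terminal phase may move to `Failed` is an assumption.

```rust
/// Hypothetical enum for the phases named in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WorkerPhase {
    Spawning,
    TrustRequired,
    ToolPermissionRequired,
    ReadyForPrompt,
    Running,
    Finished,
    Failed,
}

impl WorkerPhase {
    /// Finished and Failed end the lifecycle.
    fn is_terminal(self) -> bool {
        matches!(self, WorkerPhase::Finished | WorkerPhase::Failed)
    }

    /// Next phase on the happy path; None once the worker is terminal.
    fn advance(self) -> Option<WorkerPhase> {
        match self {
            WorkerPhase::Spawning => Some(WorkerPhase::TrustRequired),
            WorkerPhase::TrustRequired => Some(WorkerPhase::ToolPermissionRequired),
            WorkerPhase::ToolPermissionRequired => Some(WorkerPhase::ReadyForPrompt),
            WorkerPhase::ReadyForPrompt => Some(WorkerPhase::Running),
            WorkerPhase::Running => Some(WorkerPhase::Finished),
            WorkerPhase::Finished | WorkerPhase::Failed => None,
        }
    }

    /// Assumption: any non-terminal phase may fail; otherwise only the
    /// next happy-path phase is a legal target.
    fn can_transition_to(self, next: WorkerPhase) -> bool {
        if self.is_terminal() {
            return false;
        }
        next == WorkerPhase::Failed || self.advance() == Some(next)
    }
}
```

An Agent has no equivalent of this: it has no phases the parent can observe or intervene in, only a manifest at the end.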

In real Claude Code, the Worker concept doesn't exist as a model-visible tool — WorkerCreate is a claw-code-specific extension for the ultraworkers autonomous-coordination demo. Real Claude Code stops at Agent (and the various mcp__* tools that may model long-lived servers).

Running multiple Agents in parallel

Claude Code's system prompt explicitly instructs:

"If the user specifies that they want you to run agents 'in parallel', you MUST send a single message with multiple Agent tool use content blocks."

Mechanically: each tool_use block in a single assistant message gets dispatched in sequence by the runtime (the inner for-loop covered in The Request Loop, Traced), but because each Agent spawns a thread, they run concurrently. The runtime joins on all of them before returning the tool_results. The model sees results as a batch in the next turn.

Real concurrency, real time savings, no context contamination between them.
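The dispatch-then-join behavior described above can be sketched with plain `std::thread`. This is a simplified model under assumptions: `run_agents_in_parallel` is a hypothetical name, each "agent" just formats a string, and a panicked thread is reported as a placeholder result rather than whatever the real runtime does.

```rust
use std::thread;

/// Spawns one thread per prompt, then joins them all in dispatch order.
fn run_agents_in_parallel(prompts: Vec<String>) -> Vec<String> {
    // Sequential dispatch: each iteration only spawns, so it returns at once
    // and the agents overlap in time.
    let handles: Vec<thread::JoinHandle<String>> = prompts
        .into_iter()
        .enumerate()
        .map(|(i, prompt)| {
            thread::spawn(move || {
                // Stand-in for a full sub-agent run.
                format!("agent {} done: {}", i, prompt)
            })
        })
        .collect();

    // Join everything before returning, so results arrive as one batch.
    handles
        .into_iter()
        .map(|h| h.join().unwrap_or_else(|_| "agent panicked".to_string()))
        .collect()
}
```

Collecting the handles before joining is the important detail: joining inside the first loop would serialize the agents and lose the time savings.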

Worktree isolation

The Agent tool's input schema in claw-code doesn't include an isolation field, but real Claude Code does: isolation: "worktree" causes the harness to create a temporary git worktree for the sub-agent to work in. If the sub-agent makes no changes, the worktree is cleaned up. Otherwise the path and branch are returned.

This is filesystem-level isolation on top of context isolation — useful when multiple sub-agents are doing concurrent edits and you don't want them stepping on each other. The branch_lock.rs module (covered in Multi-Agent Coordination) is the related pattern at the workflow level.
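The create, check, and clean-up cycle described above maps onto ordinary git commands. The sketch below only builds argument lists and decides the outcome. It is not how Claude Code's harness is implemented, which the source does not show. The helper names and the `WorktreeOutcome` type are hypothetical, and "no changes" is simplified to mean empty `git status --porcelain` output, which ignores commits made on the branch.

```rust
/// Arguments for `git worktree add -b <branch> <path>`.
fn worktree_add_args(branch: &str, path: &str) -> Vec<String> {
    vec!["worktree", "add", "-b", branch, path]
        .into_iter()
        .map(String::from)
        .collect()
}

/// Arguments for `git worktree remove <path>`.
fn worktree_remove_args(path: &str) -> Vec<String> {
    vec!["worktree", "remove", path]
        .into_iter()
        .map(String::from)
        .collect()
}

/// Hypothetical result of finishing an isolated sub-agent run.
#[derive(Debug, PartialEq)]
enum WorktreeOutcome {
    /// No changes: the worktree should be removed.
    Removed,
    /// Changes exist: hand the path and branch back to the parent.
    Kept { path: String, branch: String },
}

/// Decides the outcome from the output of `git status --porcelain`.
/// Simplification: only uncommitted changes are detected.
fn finish_worktree(porcelain_status: &str, path: &str, branch: &str) -> WorktreeOutcome {
    if porcelain_status.trim().is_empty() {
        WorktreeOutcome::Removed
    } else {
        WorktreeOutcome::Kept {
            path: path.to_string(),
            branch: branch.to_string(),
        }
    }
}
```

To run these for real, the argument lists could be passed to `std::process::Command::new("git").args(...)` with `current_dir` set to the repository root.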
