Patterns to Borrow

Twelve patterns to lift, ranked roughly by impact-per-line-of-code, plus a file:line cheat sheet and the parity gaps to be aware of.

1. Deferred tool schemas

The pattern: Eagerly load only ~6 core tool schemas; expose the rest as names with a ToolSearch round-trip to fetch their schema on demand.

Why: Tool schemas can each be hundreds of tokens. A 50-tool registry would put 10K+ tokens in the system prompt that the model rarely needs. Deferred loading means typical sessions materialize only 5–10 tool schemas.

Cost: One extra round-trip the first time each deferred tool is used. Negligible relative to the prompt savings.
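
A minimal sketch of the idea, not the codebase's actual registry: every tool name is advertised, but a schema is only materialized for core tools or after a lookup. `ToolSpec`, `ToolRegistry`, and `tool_search` are hypothetical names chosen for illustration.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// A tool entry: a name plus its (potentially large) JSON schema text.
struct ToolSpec {
    name: &'static str,
    schema: &'static str,
}

/// Hypothetical registry: advertises every name, but only exposes schemas
/// for core tools and for tools that were fetched on demand.
struct ToolRegistry {
    tools: BTreeMap<&'static str, ToolSpec>,
    loaded: BTreeSet<&'static str>,
}

impl ToolRegistry {
    fn new(tools: Vec<ToolSpec>, core: &[&'static str]) -> Self {
        let tools: BTreeMap<_, _> = tools.into_iter().map(|t| (t.name, t)).collect();
        // Only core names that actually exist in the registry start out loaded.
        let loaded = core.iter().copied().filter(|n| tools.contains_key(n)).collect();
        Self { tools, loaded }
    }

    /// Schemas that belong in the prompt right now.
    fn prompt_schemas(&self) -> Vec<&'static str> {
        self.loaded.iter().map(|n| self.tools[n].schema).collect()
    }

    /// Names advertised without a schema.
    fn deferred_names(&self) -> Vec<&'static str> {
        self.tools.keys().copied().filter(|n| !self.loaded.contains(n)).collect()
    }

    /// The search round-trip: return the schema and mark the tool as loaded.
    fn tool_search(&mut self, name: &str) -> Option<&'static str> {
        let spec = self.tools.get(name)?;
        self.loaded.insert(spec.name);
        Some(spec.schema)
    }
}
```

After one `tool_search("cron_create")` call, that schema joins `prompt_schemas()` for the rest of the session; unknown names return `None` and change nothing.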

→ See The Tool Surface.

2. Static-vs-dynamic system-prompt boundary

The pattern: Assemble the system prompt from explicitly static and explicitly dynamic sections, separated by a sentinel string. A downstream cache layer can split on the sentinel.

Why: Even if you don't have prompt caching today, you might tomorrow. The boundary makes that future migration trivial.

Cost: Three lines of code to add the sentinel and the splitter.
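
As a rough illustration, the builder and the splitter can be this small. The sentinel text and both function names are assumptions for the sketch, not the values used in prompt.rs.

```rust
/// Hypothetical sentinel; any string that cannot occur in prompt text works.
const BOUNDARY: &str = "\n<!-- DYNAMIC_BOUNDARY -->\n";

/// Static sections first, then the sentinel, then everything that changes per session.
fn build_system_prompt(static_sections: &[&str], dynamic_sections: &[&str]) -> String {
    format!(
        "{}{}{}",
        static_sections.join("\n\n"),
        BOUNDARY,
        dynamic_sections.join("\n\n")
    )
}

/// Returns (cacheable prefix, volatile suffix).
fn split_for_cache(prompt: &str) -> (&str, &str) {
    match prompt.split_once(BOUNDARY) {
        Some((stable, volatile)) => (stable, volatile),
        // No sentinel found: treat nothing as cacheable.
        None => ("", prompt),
    }
}
```

The design choice that matters is ordering: anything placed before the sentinel must be byte-identical across turns, or the cacheable prefix is worthless.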

→ See Context, Caching, Compaction.

3. Plan files as durable intent

The pattern: Plan mode forbids edits to anything except a single named markdown file. The model writes the plan incrementally; the user reviews; ExitPlanMode requests approval.

Why: Long agent runs drift. Plans externalize intent so drift is detectable. The single-file constraint forces the plan to be the source of truth, not scattered conversation.

Cost: A mode flag in the runtime and a filter applied at tool registration.
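
One way the write gate might look, assuming the runtime already knows the plan file's path. `Mode`, `Decision`, and `check_write` are hypothetical names; a real implementation would also canonicalize paths before comparing them.

```rust
use std::path::Path;

#[derive(Clone, Copy, PartialEq, Debug)]
enum Mode {
    Plan,
    Execute,
}

#[derive(Debug, PartialEq)]
enum Decision {
    Allow,
    Deny(&'static str),
}

/// In plan mode, the only writable target is the plan file itself.
/// Note: this compares paths component-wise and does not resolve symlinks or `..`.
fn check_write(mode: Mode, plan_file: &Path, target: &Path) -> Decision {
    match mode {
        Mode::Execute => Decision::Allow,
        Mode::Plan if target == plan_file => Decision::Allow,
        Mode::Plan => Decision::Deny("plan mode: only the plan file is writable"),
    }
}
```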

→ See Plan vs Execution.

4. Sub-agent context isolation contract

The pattern: Sub-agents run in fresh runtimes (own session, own tool subset, own usage tracker). Parent gets back only a manifest, not the transcript. The sub-agent's full transcript persists to disk.

Why: Without this, every sub-agent's exploration leaks into the parent's context. With it, the parent stays at 5–10K tokens while sub-agents do the heavy reading.

Cost: A thread spawn + a fresh ConversationRuntime per Agent invocation.
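
The contract can be sketched as follows. `SubRuntime` is a hypothetical stand-in for a fresh ConversationRuntime, and the canned transcript lines replace real model and tool calls; only the shape of what crosses the thread boundary is the point.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;
use std::thread;

/// What the parent receives: a compact manifest, never the transcript.
#[derive(Debug)]
struct Manifest {
    summary: String,
    turns: usize,
    transcript_path: PathBuf,
}

/// Hypothetical stand-in for a fresh per-agent runtime.
struct SubRuntime {
    transcript: Vec<String>,
}

impl SubRuntime {
    fn new() -> Self {
        Self { transcript: Vec::new() }
    }

    fn run(&mut self, task: &str) -> String {
        // A real runtime would loop over model calls and tool results here.
        self.transcript.push(format!("user: {task}"));
        self.transcript.push("tool: read 40 files".to_string());
        self.transcript.push("assistant: found 3 call sites".to_string());
        "found 3 call sites".to_string()
    }
}

/// Runs the sub-agent on its own thread with its own runtime. The full
/// transcript goes to disk; only the manifest is returned to the caller.
fn spawn_subagent(task: String, transcript_path: PathBuf) -> io::Result<Manifest> {
    let handle = thread::spawn(move || -> io::Result<Manifest> {
        let mut rt = SubRuntime::new();
        let summary = rt.run(&task);
        fs::write(&transcript_path, rt.transcript.join("\n"))?;
        Ok(Manifest { summary, turns: rt.transcript.len(), transcript_path })
    });
    handle
        .join()
        .map_err(|_| io::Error::new(io::ErrorKind::Other, "sub-agent panicked"))?
}
```

The parent's context grows by the size of `summary`, regardless of how much the sub-agent read; anyone who needs the detail can open `transcript_path`.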

→ See Sub-agents & Context Cleanliness.

5. Per-subagent-type tool allowlists

The pattern: When a sub-agent is spawned with a subagent_type (e.g., "Explore"), only a curated subset of tools is exposed to it.

Why: A read-only Explore agent shouldn't be able to run bash. Locking down the tool surface per agent type prevents whole classes of mistakes (the Explore agent can't accidentally mutate state).

Cost: A switch statement mapping types to tool name lists.
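
A sketch of that switch, with hypothetical tool names and a hypothetical "general-purpose" type alongside the "Explore" type mentioned above. Unknown types get an empty list here, which is one possible fail-closed default rather than a description of the real behavior.

```rust
/// Maps a sub-agent type to the tools it may see. Tool names are illustrative.
fn allowed_tools(subagent_type: &str) -> &'static [&'static str] {
    match subagent_type {
        // Read-only exploration: no bash, no writes.
        "Explore" => &["read_file", "glob_search", "grep_search"],
        "general-purpose" => &["read_file", "glob_search", "grep_search", "write_file", "bash"],
        // Fail closed: an unrecognized type gets no tools at all.
        _ => &[],
    }
}

fn is_allowed(subagent_type: &str, tool: &str) -> bool {
    allowed_tools(subagent_type).contains(&tool)
}
```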

→ See Sub-agents & Context Cleanliness.

6. Lifecycle hooks separated from prompts

The pattern: Predictable, periodic, or always-do behaviors live in settings.json hooks, not in the system prompt. The harness, not the model, enforces them.

Why: "After every commit, run lint" cannot be reliably done by prompting the model — it'll forget. A PostToolUse hook will do it every time, observably.

Cost: A hook event taxonomy (PreToolUse / PostToolUse / PostToolUseFailure / SessionStart / etc.) and a config-driven runner.
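
A simplified sketch of the selection half of such a runner: given an event and an optional tool name, return the configured commands in order. `Hook`, `HookTable`, and the matcher field are assumptions for illustration; executing the commands and parsing settings.json are left out.

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum HookEvent {
    PreToolUse,
    PostToolUse,
    PostToolUseFailure,
    SessionStart,
}

/// One configured hook. `tool_matcher: None` means "fire for any tool".
struct Hook {
    event: HookEvent,
    tool_matcher: Option<&'static str>,
    command: &'static str,
}

/// Hypothetical in-memory form of the hooks section of a settings file.
struct HookTable {
    hooks: Vec<Hook>,
}

impl HookTable {
    /// Commands to run for this event, in configuration order.
    fn commands_for(&self, event: HookEvent, tool: Option<&str>) -> Vec<&'static str> {
        self.hooks
            .iter()
            .filter(|h| h.event == event)
            .filter(|h| match (h.tool_matcher, tool) {
                (None, _) => true,
                (Some(want), Some(got)) => want == got,
                (Some(_), None) => false,
            })
            .map(|h| h.command)
            .collect()
    }
}
```

Because the harness calls `commands_for` on every tool result, the lint step runs whether or not the model remembers it.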

→ See Extensibility (Hooks, Plugins, Skills, MCP).

7. Mock-service end-to-end testing

The pattern: Faithful mock of your model API + scripted scenarios + clean-environment CLI runs = deterministic, fast, comprehensive harness tests.

Why: You can't ship an agent harness without integration tests. Real-model tests are flaky and expensive. The mock pattern makes them cheap and reliable.

Cost: ~1K LOC for the mock, ~1K LOC for the harness scaffolding, ~50 LOC per scenario.
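
The sketch below swaps the HTTP mock service for an in-process trait object, which is a deliberately smaller technique than the one described above, but it shows the core of a scripted scenario: canned replies in, recorded requests out. `ModelApi`, `ScriptedMock`, and `run_session` are hypothetical names.

```rust
use std::collections::VecDeque;

/// The seam the harness talks through. In the pattern above this is an HTTP API.
trait ModelApi {
    fn complete(&mut self, prompt: &str) -> String;
}

/// Replays a fixed script and records every request it sees.
struct ScriptedMock {
    replies: VecDeque<&'static str>,
    requests: Vec<String>,
}

impl ScriptedMock {
    fn new(replies: &[&'static str]) -> Self {
        Self { replies: replies.iter().copied().collect(), requests: Vec::new() }
    }
}

impl ModelApi for ScriptedMock {
    fn complete(&mut self, prompt: &str) -> String {
        self.requests.push(prompt.to_string());
        // Running past the script is a scenario bug, so fail loudly.
        self.replies.pop_front().expect("scenario script exhausted").to_string()
    }
}

/// Hypothetical code under test: forwards each input and collects the replies.
fn run_session(api: &mut dyn ModelApi, inputs: &[&str]) -> Vec<String> {
    inputs.iter().map(|i| api.complete(i)).collect()
}
```

A scenario then asserts on three things: the replies the harness surfaced, the requests it sent, and that the script was fully consumed.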

→ See The Mock Parity Harness.

8. Capability-probed sandboxing

The pattern: Don't check for the binary's existence — actually execute a no-op command through the proposed sandbox to verify it works. Cache the result.

Why: A binary's presence does not imply that it works: container restrictions, kernel configuration, or missing capabilities can all break it. The probe catches this once, at a cost of about 50ms.

Cost: A OnceLock<bool> and a single execve.
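
A minimal sketch of the probe, assuming a Linux host with util-linux `unshare`. The specific flags and the `true` no-op are assumptions for illustration, not necessarily what sandbox.rs runs.

```rust
use std::process::{Command, Stdio};
use std::sync::OnceLock;

/// Runs `program args...` and reports whether it exited successfully.
/// A missing binary, a spawn failure, or a non-zero exit all count as "does not work".
fn probe(program: &str, args: &[&str]) -> bool {
    Command::new(program)
        .args(args)
        .stdin(Stdio::null())
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .map(|s| s.success())
        .unwrap_or(false)
}

/// Probes once per process and caches the answer.
fn sandbox_available() -> bool {
    static AVAILABLE: OnceLock<bool> = OnceLock::new();
    // Assumed probe: enter a user namespace and run the no-op `true`.
    *AVAILABLE.get_or_init(|| probe("unshare", &["--user", "--map-root-user", "true"]))
}
```

The result depends entirely on the host: inside a locked-down container the binary may exist and this still returns false, which is exactly the case an existence check would miss.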

→ See Permissions & Sandboxing.

9. Five-tier permission model with predictable precedence

The pattern: Five checks in a fixed order, then a default deny: deny rules > context overrides > ask rules > allow rules > mode sufficiency > final deny. The order is documented; there is no ambiguity.

Why: Permission models that are "just check these things in some order" become unauditable as they grow. A documented decision tree is checkable.

Cost: A single authorize() function with explicit precedence comments.
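
The precedence above can be sketched as one function with early returns. The `Policy` and `Outcome` types, the exact-name rule matching, and the `ReadOnly` and `WorkspaceWrite` variants are assumptions for illustration; only `DangerFullAccess` and the ordering come from the text.

```rust
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
enum PermissionMode {
    ReadOnly,
    WorkspaceWrite,
    DangerFullAccess,
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Allow,
    Ask,
    Deny(&'static str),
}

/// Hypothetical rule set: each rule is just a tool name.
struct Policy<'a> {
    deny: &'a [&'a str],
    ask: &'a [&'a str],
    allow: &'a [&'a str],
    active_mode: PermissionMode,
}

fn authorize(
    policy: &Policy<'_>,
    tool: &str,
    required: PermissionMode,
    context_override: Option<Outcome>,
) -> Outcome {
    // 1. Deny rules always win.
    if policy.deny.contains(&tool) {
        return Outcome::Deny("matched deny rule");
    }
    // 2. Context overrides (for example, a hook's decision).
    if let Some(outcome) = context_override {
        return outcome;
    }
    // 3. Ask rules.
    if policy.ask.contains(&tool) {
        return Outcome::Ask;
    }
    // 4. Allow rules.
    if policy.allow.contains(&tool) {
        return Outcome::Allow;
    }
    // 5. Mode sufficiency: the active mode must be at least the required one.
    if policy.active_mode >= required {
        return Outcome::Allow;
    }
    // Nothing matched: deny by default.
    Outcome::Deny("no rule matched and mode is insufficient")
}
```

Note that a context override cannot rescue a tool that matches a deny rule, because the deny check returns first.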

→ See Permissions & Sandboxing.

10. Recovery recipes for known failure modes

The pattern: Catalogue the top N failure modes from real usage; bind each to a recipe (AcceptTrustPrompt, RebaseBranch, RetryMcpHandshake, etc.). When a recipe matches a tool error, enqueue it instead of failing or naive-retrying.

Why: Naive retry is a coin flip. A named recipe is testable, observable, and recoverable.

Cost: A small enum + a regex/match table from error patterns to recipes.
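
A sketch of the enum and match table, using plain substring matching instead of regexes to stay dependency-free. The recipe names come from the text above; the error substrings and function names are made up for illustration.

```rust
use std::collections::VecDeque;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Recipe {
    AcceptTrustPrompt,
    RebaseBranch,
    RetryMcpHandshake,
}

/// Lowercase error fragments mapped to recipes. The fragments are hypothetical.
const RECIPE_TABLE: &[(&str, Recipe)] = &[
    ("trust this folder", Recipe::AcceptTrustPrompt),
    ("non-fast-forward", Recipe::RebaseBranch),
    ("mcp handshake", Recipe::RetryMcpHandshake),
];

/// First matching recipe wins; matching is case-insensitive.
fn recipe_for(error: &str) -> Option<Recipe> {
    let lower = error.to_lowercase();
    RECIPE_TABLE
        .iter()
        .find(|(needle, _)| lower.contains(*needle))
        .map(|(_, recipe)| *recipe)
}

/// Enqueues a recipe when one matches. Returns false when the caller
/// should fall back to its ordinary failure path.
fn on_tool_error(queue: &mut VecDeque<Recipe>, error: &str) -> bool {
    match recipe_for(error) {
        Some(recipe) => {
            queue.push_back(recipe);
            true
        }
        None => false,
    }
}
```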

→ See Context, Caching, Compaction.

11. Branch-locks-before-merge collision detection

The pattern: Before two parallel agents touch overlapping code on a branch, declare intent (lane_id + branch + modules). Detect collisions at intent time, not at merge time.

Why: Detecting at merge time means rework. Detecting at intent time means routing — pause one lane, broaden the other's scope, or split the work cleanly.

Cost: An Arc<Mutex<HashMap<branch, Vec<intent>>>> and pairwise overlap detection (O(n²) is fine).
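
A sketch of intent registration with pairwise overlap detection. `Intent`, `BranchLocks`, and `declare` are hypothetical names, and "overlap" here means an exact module-name match, which is a simplifying assumption.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

#[derive(Clone, Debug)]
struct Intent {
    lane_id: String,
    modules: Vec<String>,
}

/// Shared table of declared intents, keyed by branch name.
#[derive(Clone, Default)]
struct BranchLocks {
    inner: Arc<Mutex<HashMap<String, Vec<Intent>>>>,
}

impl BranchLocks {
    /// Registers the intent, or returns the lane ids it collides with.
    /// A colliding intent is not recorded.
    fn declare(&self, branch: &str, intent: Intent) -> Result<(), Vec<String>> {
        let mut map = self.inner.lock().unwrap();
        let existing = map.entry(branch.to_string()).or_default();
        // Compare against every other lane on the same branch.
        let collisions: Vec<String> = existing
            .iter()
            .filter(|other| other.lane_id != intent.lane_id)
            .filter(|other| other.modules.iter().any(|m| intent.modules.contains(m)))
            .map(|other| other.lane_id.clone())
            .collect();
        if collisions.is_empty() {
            existing.push(intent);
            Ok(())
        } else {
            Err(collisions)
        }
    }
}
```

The `Err` value is what makes routing possible: the caller learns which lanes to pause or renegotiate with before any code has been written.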

→ See Multi-Agent Coordination.

12. Cache-aware self-pacing

The pattern: When scheduling future wake-ups (or any intra-session pacing), respect the cache TTL. Stay in cache (sub-5-min) when actively iterating; commit to longer waits (20+ min) when the cache miss is unavoidable. Don't pick exactly 5 minutes.

Why: Anthropic's prompt cache has a 5-minute default TTL. A 300s wait lands right on expiry, so you pay the cache miss without getting a long wait in exchange. Either stay under the TTL or commit to a longer interval.

Cost: Documentation in your ScheduleWakeup (or equivalent) tool description, plus a model that follows that guidance by default.
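
If you would rather enforce the rule in the harness than rely on the tool description, a clamp like the following is one option. The 30-second safety margin, the constant names, and the `pace` function are assumptions; the 5-minute TTL and the 20-minute floor come from the text above.

```rust
const CACHE_TTL_SECS: u64 = 300;
/// Assumed margin so a wake-up lands comfortably inside the TTL.
const SAFE_MARGIN_SECS: u64 = 30;
const WARM_MAX_SECS: u64 = CACHE_TTL_SECS - SAFE_MARGIN_SECS;
/// Once a cache miss is certain, wait at least this long to amortize it.
const LONG_WAIT_FLOOR_SECS: u64 = 20 * 60;

/// Moves a requested delay out of the dead zone between "still cached"
/// and "long enough to be worth a miss".
fn pace(requested_secs: u64, actively_iterating: bool) -> u64 {
    if requested_secs <= WARM_MAX_SECS || requested_secs >= LONG_WAIT_FLOOR_SECS {
        // Already on one side of the dead zone: leave it alone.
        requested_secs
    } else if actively_iterating {
        // Stay warm: shorten the wait to fit inside the TTL.
        WARM_MAX_SECS
    } else {
        // The miss is unavoidable: lengthen the wait instead.
        LONG_WAIT_FLOOR_SECS
    }
}
```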

→ See Context, Caching, Compaction.

Reference: critical files cheat sheet

By topic, the file:line refs you'll want to navigate to:

Request loop: crates/runtime/src/conversation.rs:318 (run_turn), :346–504 (loop body), :559–582 (auto-compact), :744 (cache events), crates/runtime/src/sse.rs:18 (push_chunk)

Caching/compaction: crates/runtime/src/prompt.rs:40 (boundary sentinel), :113–221 (builder), :169–191 (compose order), crates/api/src/prompt_cache.rs:20 (config), :314 (detect_cache_break), crates/runtime/src/compact.rs:96–183 (algorithm)

Plan/execute: crates/tools/src/lib.rs:1244–1245 (EnterPlanMode/ExitPlanMode dispatch), crates/runtime/src/task_registry.rs:56 (TaskRegistry), crates/runtime/src/recovery_recipes.rs:46–86 (scenarios + recipes)

Sub-agents: crates/tools/src/lib.rs:580 (Agent spec), :1238 (dispatch), :5099–5116 (normalize_subagent_type), :3577 (thread spawn), :3603–3630 (fresh runtime), :3657–3736 (per-type allowlists), crates/runtime/src/worker_boot.rs:255–294 (Worker lifecycle)

Permissions: crates/runtime/src/permissions.rs:9–15 (PermissionMode), :148–291 (authorize), crates/runtime/src/permission_enforcer.rs:39–173 (enforcer methods), crates/runtime/src/sandbox.rs:156–303 (resolve_sandbox_status, unshare probe), crates/runtime/src/file_ops.rs:42–54 (workspace boundary), :669–687 (symlink escape), crates/runtime/src/bash_validation.rs:103–594 (validators)

Extensibility: crates/runtime/src/hooks.rs:23–25 (event types), :155 (HookRunner), crates/plugins/src/lib.rs:117 (PluginManifest), crates/commands/src/lib.rs:2553 (resolve_skill_path)

MCP: crates/runtime/src/mcp_lifecycle_hardened.rs:16–28 (phases), :257 (validator), crates/runtime/src/mcp.rs:26–37 (name normalization), crates/runtime/src/mcp_tool_bridge.rs:74–90 (registry), crates/runtime/src/mcp_stdio.rs:480 (server manager)

Multi-agent: crates/runtime/src/lane_events.rs:6–66 (event taxonomy), :1019–1149 (constructors), crates/runtime/src/branch_lock.rs:23–77 (collision detection), crates/runtime/src/team_cron_registry.rs:51–138 (team + cron registries)

Mock harness: crates/mock-anthropic-service/src/lib.rs (whole crate), crates/rusty-claude-cli/tests/mock_parity_harness.rs (whole file), mock_parity_scenarios.json (manifest)

Reference: parity gaps worth knowing

Where claw-code falls short of upstream Claude Code:

Prompt-caching breakpoints: claw-code observes cache behavior but doesn't insert cache_control: ephemeral markers (see Context, Caching, Compaction). Real Claude Code almost certainly does.
Bash validation: 1 of 18 upstream submodules implemented, and the integration into bash.rs is incomplete (see Permissions & Sandboxing).
Trust resolver: #[cfg(test)] only — not active in production builds. Real Claude Code has folder-trust prompts working at session boot.
Sandboxing on non-Linux: returns supported: false on macOS/Windows — bash runs unsandboxed. Real Claude Code may have macOS sandbox-exec or Windows AppContainer integration.
Permission category granularity: 21/50 tools require DangerFullAccess, including all the worker/team/cron-management tools. Real Claude Code likely splits this finer.
Compaction quality: claw-code's summarizer is structured but not LLM-driven. Real Claude Code likely uses a sub-model call for high-quality summaries.

These gaps don't make the codebase less instructive — they make it more so, because the architecture is legible without the production-grade complexity. Read it for shape; fill in your own production details.

The patterns above are portable across language and runtime — they're architectural moves, not implementation tricks. Pick three, build them well, ship the rest after.