Multi-Agent Coordination

This page covers what claw-code adds beyond Claude Code parity — the lane-based coordination machinery that enables ultraworkers/claw-code's "Discord-driven autonomous multi-agent" demo. Real Claude Code stops at the Agent tool. Claw-code goes further with workers, teams, lanes, and branch locks.

Lanes as event streams

crates/runtime/src/lane_events.rs (2509 LOC) — there's no explicit Lane struct. A "lane" is implicitly an event stream identified by a lane_id, with metadata (SessionIdentity, LaneOwnership, confidence levels).

LaneEventName (lines 6–50) enumerates ~20 event types:

Lifecycle: Started, Ready, Finished, Failed, Closed
State: Blocked, Red, Green, Reconciled, Merged, Superseded
Provenance: CommitCreated, PrOpened, MergeReady
Failure-specific: PromptMisdelivery, BranchStaleAgainstMain, BranchWorkspaceMismatch
Ship provenance: ShipPrepared, ShipCommitsSelected, ShipMerged, ShipPushedMain

LaneEventStatus (lines 54–66): Running, Ready, Blocked, Red, Green, Completed, Failed, Reconciled, Merged, Superseded, Closed.

The key insight: lanes are coordination channels for workflow events, not for agent invocations. When a worker transitions state (Spawning → ReadyForPrompt), it emits a LaneEvent. When tests pass, it emits Green. When a branch goes stale relative to main, BranchStaleAgainstMain fires. The events are the public state of the lane.
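To make the "events are the public state" idea concrete, here is a minimal Python sketch, not the claw-code implementation. The EventLog class, its methods, and the detail field are hypothetical names invented for illustration; only a small subset of the event names above is modeled.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class LaneEventName(Enum):
    # A small subset of the event names listed above.
    STARTED = "started"
    BLOCKED = "blocked"
    GREEN = "green"
    COMMIT_CREATED = "commit_created"
    FINISHED = "finished"
    FAILED = "failed"


@dataclass(frozen=True)
class LaneEvent:
    lane_id: str
    name: LaneEventName
    detail: str = ""  # hypothetical free-form payload


@dataclass
class EventLog:
    """Append-only log. A 'lane' is just the events that share a lane_id."""

    events: list[LaneEvent] = field(default_factory=list)

    def emit(self, event: LaneEvent) -> None:
        self.events.append(event)

    def lane(self, lane_id: str) -> list[LaneEvent]:
        # There is no Lane object: the lane is recovered by filtering the log.
        return [e for e in self.events if e.lane_id == lane_id]

    def latest(self, lane_id: str) -> Optional[LaneEventName]:
        # The most recent event is what observers see as the lane's state.
        lane_events = self.lane(lane_id)
        return lane_events[-1].name if lane_events else None
```

Nothing in this sketch stores a lane: an observer asks the log for a lane_id and reads the state off the events it gets back.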

The public API surface is LaneEvent::new() (line 1019) plus convenience constructors at lines 1036–1149: started(), finished(), commit_created(), blocked(), failed(), ship_prepared(), and others.

Critical helpers:

is_terminal_event() (line 434) — checks whether an event is final
compute_event_fingerprint() (line 447) — content hash for deduplication
dedupe_terminal_events() (line 913) — strips redundant terminal events
reconcile_terminal_events() (line 502) — picks a canonical terminal when duplicates exist

The dedup machinery exists because lanes can receive the same event from multiple sources (the worker emits it; the orchestrator confirms it; a watcher re-emits). Without dedup, the event log would be noisy.
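The fingerprint-then-dedupe idea can be sketched in a few lines of Python. This is an illustration under assumptions, not a port of lane_events.rs. The set of terminal names, the fields that feed the hash, and the source field are all guesses made for the example.

```python
import hashlib
from dataclasses import dataclass

# Assumption: only these two names are treated as terminal in this sketch.
TERMINAL_NAMES = {"finished", "failed"}


@dataclass(frozen=True)
class Event:
    lane_id: str
    name: str
    detail: str = ""
    source: str = ""  # who reported it; deliberately excluded from the hash


def is_terminal(event: Event) -> bool:
    return event.name in TERMINAL_NAMES


def fingerprint(event: Event) -> str:
    # Hash the content only, so the same outcome reported by different
    # sources produces the same fingerprint.
    payload = "\x1f".join((event.lane_id, event.name, event.detail))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def dedupe_terminal(events: list[Event]) -> list[Event]:
    """Keep the first copy of each terminal event; pass others through."""
    seen: set[str] = set()
    kept: list[Event] = []
    for event in events:
        if is_terminal(event):
            fp = fingerprint(event)
            if fp in seen:
                continue  # same outcome already recorded
            seen.add(fp)
        kept.append(event)
    return kept
```

Keeping the first occurrence is the simplest policy. Choosing a canonical terminal event among conflicting reports (the job the text assigns to reconcile_terminal_events()) would need an extra ranking step that this sketch omits.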

Branch locks

crates/runtime/src/branch_lock.rs (144 LOC). Prevents collision when multiple workers try to modify overlapping code on the same branch.

from dataclasses import dataclass

# Python-style sketch of the intent record in branch_lock.rs
@dataclass
class BranchLockIntent:
    lane_id: str
    branch: str
    worktree: str | None = None
    modules: list[str] | None = None

detect_branch_lock_collisions() (lines 23–49) takes a slice of intents and returns the collisions. It's O(n²) pairwise — fine for small intent sets.

overlapping_modules() (lines 51–63) handles the nested case: if lane A says modules=["runtime"] and lane B says modules=["runtime/mcp"], they overlap. The check uses modules_overlap() (lines 65–69) which checks for exact match or prefix.

shared_scope() (lines 71–77) returns the broader scope when there's an overlap — useful for routing the conflict resolution.
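The three helpers fit together roughly as follows. This Python sketch is a reading of the description above, not a translation of branch_lock.rs. It makes three assumptions: prefixes are matched on "/"-separated path segments, an intent that declares no modules claims nothing, and a collision is reported as a plain tuple.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class BranchLockIntent:
    lane_id: str
    branch: str
    modules: tuple[str, ...] = ()


def modules_overlap(a: str, b: str) -> bool:
    # Exact match, or one module path is a directory prefix of the other.
    return a == b or a.startswith(b + "/") or b.startswith(a + "/")


def shared_scope(a: str, b: str) -> str:
    # For two overlapping paths, the shorter one is the broader scope.
    return a if len(a) <= len(b) else b


def overlapping_modules(x: BranchLockIntent, y: BranchLockIntent) -> list[str]:
    scopes = {
        shared_scope(a, b)
        for a in x.modules
        for b in y.modules
        if modules_overlap(a, b)
    }
    return sorted(scopes)


def detect_collisions(
    intents: list[BranchLockIntent],
) -> list[tuple[str, str, list[str]]]:
    """Pairwise O(n^2) scan: same branch plus overlapping modules collides."""
    collisions = []
    for x, y in combinations(intents, 2):
        if x.branch != y.branch:
            continue
        shared = overlapping_modules(x, y)
        if shared:
            collisions.append((x.lane_id, y.lane_id, shared))
    return collisions
```

With lane A on modules ("runtime",) and lane B on ("runtime/mcp",), both on the same branch, the scan reports one collision whose shared scope is "runtime". The same two intents on different branches do not collide.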

Teams + crons

crates/runtime/src/team_cron_registry.rs:51–53:

import threading

class TeamRegistry:
    def __init__(self):
        self._lock = threading.Lock()
        self._teams: dict[str, Team] = {}

A Team is a named group of task_ids (lines 21–28) with a status (Created/Running/Completed/Deleted). Methods: create() (line 67), get(), list(), delete(), and remove(). delete() is a soft delete that only marks the status; remove() is a hard delete and is permanent.

The semantics: TeamCreate({name, tasks}) creates several tasks at once and groups them. They run in parallel (by being separately dispatched), and the team is the unit of supervision (TeamDelete cancels all).
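The difference between soft and hard delete is easiest to see in code. The following Python sketch mirrors the registry shape described above. The ID scheme, the return types, and the Team field names are assumptions made for illustration, and cancelling the grouped tasks is left out.

```python
import threading
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TeamStatus(Enum):
    CREATED = "created"
    RUNNING = "running"
    COMPLETED = "completed"
    DELETED = "deleted"


@dataclass
class Team:
    team_id: str
    name: str
    task_ids: list[str]
    status: TeamStatus = TeamStatus.CREATED


class TeamRegistry:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._teams: dict[str, Team] = {}
        self._next_id = 1  # hypothetical ID scheme

    def create(self, name: str, task_ids: list[str]) -> Team:
        with self._lock:
            team = Team(f"team-{self._next_id}", name, list(task_ids))
            self._next_id += 1
            self._teams[team.team_id] = team
            return team

    def get(self, team_id: str) -> Optional[Team]:
        with self._lock:
            return self._teams.get(team_id)

    def delete(self, team_id: str) -> bool:
        # Soft delete: the record stays, only its status changes.
        with self._lock:
            team = self._teams.get(team_id)
            if team is None:
                return False
            team.status = TeamStatus.DELETED
            return True

    def remove(self, team_id: str) -> bool:
        # Hard delete: the record is gone for good.
        with self._lock:
            return self._teams.pop(team_id, None) is not None
```

After delete() the team can still be looked up and audited; after remove() it cannot.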

CronRegistry (lines 136–138) handles scheduled triggers. A CronEntry (lines 122–133) has cron_id, schedule, prompt, description, enabled, timestamps, last_run_at, run_count. Schedules are stored as plain 5-field cron expression strings. There is no cron parser in this module; presumably an external scheduler evaluates the expression and calls the registry's record_run() (lines 208–218).

This is your scheduled-agents system. Hook a cron to a prompt + agent, and it fires on schedule.
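Since the registry only stores entries and run bookkeeping, the scheduler side has to be imagined. The Python sketch below shows one way the pieces could fit; it is not the claw-code implementation. fire_if_enabled, add(), get(), and the dispatch callback are hypothetical, and the sketch does not parse the cron expression at all. It assumes something external has already decided the entry is due.

```python
import threading
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CronEntry:
    cron_id: str
    schedule: str  # plain 5-field cron string, stored but never parsed here
    prompt: str
    description: str = ""
    enabled: bool = True
    last_run_at: Optional[float] = None
    run_count: int = 0


class CronRegistry:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._entries: dict[str, CronEntry] = {}

    def add(self, entry: CronEntry) -> None:
        with self._lock:
            self._entries[entry.cron_id] = entry

    def get(self, cron_id: str) -> Optional[CronEntry]:
        with self._lock:
            return self._entries.get(cron_id)

    def record_run(self, cron_id: str, now: Optional[float] = None) -> bool:
        # Bookkeeping only: bump the counter and stamp the run time.
        with self._lock:
            entry = self._entries.get(cron_id)
            if entry is None:
                return False
            entry.last_run_at = time.time() if now is None else now
            entry.run_count += 1
            return True


def fire_if_enabled(
    registry: CronRegistry,
    cron_id: str,
    dispatch: Callable[[str], None],
    now: Optional[float] = None,
) -> bool:
    """Hypothetical scheduler hook, called when an entry is due."""
    entry = registry.get(cron_id)
    if entry is None or not entry.enabled:
        return False
    dispatch(entry.prompt)  # hand the prompt to whatever runs the agent
    return registry.record_run(cron_id, now)
```

Keeping parsing and timing outside the registry means the registry stays a simple, lockable record store, which matches the description above.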

Workers (recap)

crates/runtime/src/worker_boot.rs (2026 LOC) defines workers as in-process threads (not subprocesses) coordinated via shared Arc<Mutex<WorkerRegistry>>. The state machine:

Spawning → TrustRequired → ToolPermissionRequired → ReadyForPrompt → Running → Finished | Failed

The interesting design choice is trust auto-resolution. Each worker has trust_auto_resolve: bool and trusted_roots: &[String]. When a worker boots in a cwd that matches one of the trusted roots, the trust gate is auto-cleared (WorkerTrustResolution::AutoAllowlisted). Otherwise it transitions to TrustRequired and the parent must explicitly call WorkerResolveTrust({worker_id}).
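The trust gate can be sketched as a small pure function. This Python version is illustrative only. It assumes that "matches" means the cwd is a trusted root or sits somewhere beneath one, and it returns None to stand for "go to TrustRequired and wait for the parent". Both choices are assumptions, not confirmed behavior of worker_boot.rs.

```python
from enum import Enum
from pathlib import PurePosixPath
from typing import Optional, Sequence


class WorkerTrustResolution(Enum):
    # Only the variant named in the text is modeled.
    AUTO_ALLOWLISTED = "auto_allowlisted"


def is_under_trusted_root(cwd: str, trusted_roots: Sequence[str]) -> bool:
    # Compare whole path segments, so /work/repository is not treated
    # as being inside /work/repo.
    path = PurePosixPath(cwd)
    for root in trusted_roots:
        root_path = PurePosixPath(root)
        if path == root_path or root_path in path.parents:
            return True
    return False


def resolve_trust_on_boot(
    cwd: str,
    trust_auto_resolve: bool,
    trusted_roots: Sequence[str],
) -> Optional[WorkerTrustResolution]:
    """Return a resolution if the gate clears itself, else None.

    None means the worker would move to TrustRequired and the parent
    has to resolve trust explicitly.
    """
    if trust_auto_resolve and is_under_trusted_root(cwd, trusted_roots):
        return WorkerTrustResolution.AUTO_ALLOWLISTED
    return None
```

Matching on path segments rather than raw string prefixes is a deliberate choice in this sketch: a plain startswith() check would let a sibling directory with a similar name inherit trust.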

Misdelivery recovery is controlled by auto_recover_prompt_misdelivery: bool. When it is set and the worker's terminal indicates the prompt didn't land cleanly (e.g., dropped characters from a slow tty), the harness retries automatically up to a few times before failing.
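A verify-then-retry loop captures the idea. In the Python sketch below, send and read_back are hypothetical callbacks standing in for the terminal, an exact string comparison stands in for whatever check the harness really performs, and max_attempts=3 is an assumed value, since the text only says "a few times".

```python
from typing import Callable


def deliver_prompt(
    send: Callable[[str], None],
    read_back: Callable[[], str],
    prompt: str,
    *,
    auto_recover: bool,
    max_attempts: int = 3,  # assumption: the source does not give a number
) -> bool:
    """Send a prompt and confirm it landed intact.

    Returns True once the terminal echoes exactly what was sent, and
    False if every allowed attempt was garbled.
    """
    attempts = max_attempts if auto_recover else 1
    for _ in range(attempts):
        send(prompt)
        if read_back() == prompt:
            return True  # delivery confirmed
    return False  # caller should surface this as a misdelivery failure
```

The point of the pattern is the check after send(): a successful write is treated as a claim to verify, not as proof that the worker received the prompt.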

For the distinction between Workers and sub-agents (the Agent tool), see Sub-agents & Context Cleanliness.

Putting it together

A user types into Discord: "build feature X." A clawhip listener picks this up and sends it to a worker. The worker:

Boots in a fresh worktree (branch_lock prevents collision with other lanes)
Auto-trusts the worktree (allowlist match)
Receives the prompt
Runs an agent loop (just like the main loop covered in The Request Loop, Traced) but for an extended duration (maybe hours), spawning sub-agents (Architect, Executor, Reviewer per PHILOSOPHY.md)
Emits LaneEvents as it progresses (CommitCreated, PrOpened)
On completion, fires Finished or Failed with metadata

Multiple lanes can run in parallel. The branch_lock ensures they don't stomp each other. The lane_events log is the auditable history of what happened. The recovery_recipes catch known failures and either retry or escalate.

This is the architecture of autonomous-with-supervision coding. The human stays in Discord; the claws (sub-agents) do the labor; the lanes are the visible trace.

For your own agents, the patterns to extract:

Lane = an event stream identified by an ID, not a struct. Events are the public state.
Dedup terminal events. Multiple sources will report the same outcome; pick one.
Branch locks before, not after. Detect collisions at intent time, not at merge time.
Trust auto-resolution via allowlist. Make "this is safe" cacheable so workers don't prompt every boot.
Misdelivery as a recoverable failure. Don't trust the fact that you sent a prompt to mean the agent received it cleanly.

Continue: The Mock Parity Harness