Context and Multi-Agent

Managing context windows, sessions, and parallel agent workflows.

1. The Compaction Pipeline

Claude Code uses a 5-layer compaction pipeline before every model call. Each layer is cheaper than the next, so earlier layers run first and semantic compression is the last resort.

Layer                 Mechanism                                Cost
1. Budget reduction   Trim tool output to fit budget           Free
2. Snip               Drop old, low-value turns                Free
3. Microcompact       Remove whitespace and formatting noise   Minimal
4. Context collapse   Merge adjacent tool results              Low
5. Auto-compact       Semantic summarization by the model      High

Use /compact with custom instructions to preserve critical context when the window fills up. For example: /compact keep the database schema and test plan.

Context hygiene: Use /clear between unrelated tasks. Context from Task A pollutes Task B — the model wastes tokens attending to irrelevant history and may hallucinate connections that don't exist.

Sessions are append-only JSONL files. Nothing is ever deleted from the file on disk; compaction only affects what gets sent to the model in the next request.
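The append-only property can be sketched with a throwaway file standing in for a real session; the JSON shape below is illustrative, not Claude Code's actual on-disk schema:

```shell
# Sketch: a session as an append-only JSONL file (schema is illustrative)
session=$(mktemp)

# Each turn is appended as one JSON object per line; earlier lines are
# never rewritten or deleted, only skipped when building the next request.
printf '%s\n' '{"type":"user","text":"add a login form"}'    >> "$session"
printf '%s\n' '{"type":"assistant","text":"done, see diff"}' >> "$session"

wc -l < "$session"    # one line per turn
```

Because the file only grows, compaction is purely a read-time decision: the writer never loses history, and any earlier state can be recovered (which is what makes forking possible).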

2. Session Management

Resume sessions with claude -c (latest session) or claude -r (interactive picker). Session-scoped permissions are NOT restored on resume — this is intentional, preventing approval grandfathering where a permission granted for one task carries over to a different context.

Forking sessions

Fork a session to branch from a specific point. The forked session gets a copy of the conversation up to that point but diverges independently from there.

The Ralph Loop

A pattern for sustained multi-task sessions without context degradation:

  1. Pick task from the backlog
  2. Implement the change
  3. Validate (tests, lint, manual check)
  4. Commit if passing
  5. Reset context (/clear)
  6. Repeat
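The loop above can be sketched in shell. `run_agent` and `validate` are placeholders for your own setup — in practice `run_agent` might invoke `claude -p "<task>"` (non-interactive prompt mode) and `validate` your test suite:

```shell
# Sketch of the Ralph Loop; run_agent and validate are placeholders.
backlog=$(mktemp)
printf '%s\n' 'fix flaky auth test' 'bump lodash' > "$backlog"

run_agent() { echo "agent: $1"; }   # placeholder for: claude -p "$1"
validate()  { true; }               # placeholder for: npm test && npm run lint

while task=$(head -n 1 "$backlog") && [ -n "$task" ]; do
  run_agent "$task"                   # each run starts from a fresh context
  validate && echo "commit: $task"    # placeholder for: git commit -am "$task"
  tail -n +2 "$backlog" > "$backlog.tmp" && mv "$backlog.tmp" "$backlog"  # pop
done
```

Each iteration starts the agent cold, so the backlog file and git history, not the context window, are the only state carried between tasks.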

Memory persists through git history and progress logs (like tasks/todo.md), not the context window. The commit message and file diffs carry the knowledge forward; the agent re-reads them when it picks up the next task.

3. Subagent Delegation

Subagents run in isolated context windows and return only summaries to the parent. This prevents "subagent conversation inflation" — where a parent agent's context fills up with the full transcript of every delegated task instead of just the result.

When to use subagents

  • Research and documentation lookup
  • Code exploration across unfamiliar areas
  • Parallel analysis of independent subsystems

Coordination tiers

Tier                  Agents   Communication                     Use case
Subagents             2-3      In-process, parent-child          Research, exploration
Agent teams           3-5      Local, peer messaging via files   Feature implementation
Cloud orchestrators   N        Async, assign-and-walk-away       Bulk migrations, audits

Rule of thumb: Three focused agents consistently outperform one generalist working three times as long. The constraint of a narrow context window forces each agent to stay on-task.

4. Git Worktree Isolation

One task, one branch, one worktree, one agent. Create a worktree and its branch with:

git worktree add -b fix-auth-bug ../fix-auth-bug

Each worktree gets its own filesystem checkout, so agents editing files in parallel never produce merge conflicts at the file level.
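A sketch of spinning up one worktree per task in a throwaway repository; the repo path and branch names are illustrative:

```shell
# Sketch: one isolated checkout per task branch (names are illustrative)
parent=$(mktemp -d)
git init -q -b main "$parent/repo" && cd "$parent/repo"
git -c user.email=agent@example.com -c user.name=agent \
    commit -q --allow-empty -m "init"

for task in fix-auth-bug add-rate-limit; do
  git worktree add -q -b "$task" "../$task"   # sibling dir, new branch
done

git worktree list   # main checkout plus one entry per task worktree
```

When a task branch merges, `git worktree remove ../fix-auth-bug` cleans up the checkout without touching the branch.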

Practical limits

5-7 concurrent agents is the practical ceiling. Beyond that, API rate limits and human review overhead cancel out the throughput gains.

Worktrees only isolate the filesystem

Ports, databases, and the Docker daemon remain shared across all worktrees. If two agents both try to bind port 3000 or migrate the same database, they collide. Use per-worktree port offsets (e.g., PORT=3000 + worktree_index) and separate test databases.
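One way to implement the port-offset suggestion is to derive a stable offset from the worktree path itself; the hashing scheme below is an assumption for illustration, not a built-in feature:

```shell
# Sketch: stable per-worktree port offset so parallel agents don't
# collide on port 3000 (hashing scheme is an assumption)
worktree_port() {
  # cksum gives a deterministic numeric hash of the path; keep offsets 0-99
  offset=$(printf '%s' "$1" | cksum | cut -d' ' -f1)
  echo $((3000 + offset % 100))
}

PORT=$(worktree_port "$PWD")   # export before starting the dev server
echo "$PORT"
```

The same trick extends to test databases, e.g. naming them `app_test_$PORT` so each worktree migrates its own copy.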

Task independence test

Before parallelizing, verify all three conditions:

  1. File exclusivity — no two agents edit the same file
  2. Interface stability — no agent changes an API another agent depends on
  3. Bounded scope — each task has a clear definition of done

If any condition fails, serialize the tasks instead.
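The file-exclusivity condition can be checked mechanically by diffing each branch against their merge base; the repository and branch names below are illustrative:

```shell
# Sketch: detect file-level overlap between two task branches
# (repo layout and branch names are illustrative)
work=$(mktemp -d) && cd "$work" && git init -q -b main
git config user.email agent@example.com && git config user.name agent
echo base > app.py && git add . && git commit -qm init

git checkout -qb fix-auth-bug  && echo auth  > auth.py  && git add . && git commit -qm auth
git checkout -q main
git checkout -qb add-rate-limit && echo limit > limit.py && git add . && git commit -qm limit

# Compare the file sets each branch touches relative to the merge base
base=$(git merge-base fix-auth-bug add-rate-limit)
git diff --name-only "$base" fix-auth-bug   | sort > files_a
git diff --name-only "$base" add-rate-limit | sort > files_b
overlap=$(comm -12 files_a files_b)
[ -z "$overlap" ] && echo "safe to parallelize" || echo "serialize: $overlap"
```

This catches condition 1 automatically; conditions 2 and 3 (interface stability, bounded scope) still need human judgment.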

5. Practical Patterns

Where parallel agents shine

  • Research and proof-of-concept spikes
  • Understanding existing systems (reading, not writing)
  • Low-stakes maintenance (dependency bumps, lint fixes)
  • Specified, directed work with clear acceptance criteria

Detailed specs reduce review effort dramatically. An agent working from a two-paragraph description of what to build and how to test it produces code that takes minutes to review. An agent working from "make the auth better" produces code that takes longer to review than to rewrite.

Documentation flywheel

Use agents to document how your codebase works — architecture decisions, data flow, module boundaries. Then use those docs as context for future prompts. Each pass improves the documentation, which improves the next agent's output, which produces better documentation.

Merge strategies

Strategy            Speed    Safety   When to use
Lead-agent merge    Fast     Lower    Internal tools, prototypes
Human review gate   Slower   Higher   Production code, shared APIs

The real bottleneck: Verification is the bottleneck, not code generation. An agent can produce a 500-line feature in minutes, but proving it works correctly under edge cases still requires human judgment and well-designed test suites.
Main agent (200K window) → /clear (reset) → fresh context (new task) → subagent (separate window) → summary back (only results)