As agent counts scale from 2-3 subagents to 20-30 parallel workers, a new category of infrastructure emerges: multi-agent workspace managers. These tools solve problems that frameworks and coding agents were never designed to handle—persistent agent identity, automated merge coordination, work attribution across dozens of simultaneous branches, and supervisory hierarchies that recover from failures without human intervention.
The relationship to existing tools is analogous to Docker's emergence alongside virtual machines. VMs handled single-server isolation; Docker handled container orchestration at scale. Similarly, coding agents (Claude Code, Cursor) handle single-agent workflows; workspace managers handle the infrastructure layer when the bottleneck shifts from implementation capacity to coordination overhead.
Your Mental Model
Workspace managers are infrastructure, not agents. The distinction matters. A coding agent (Claude Code, Copilot) generates code. A framework (LangGraph, CrewAI) coordinates agent logic. A workspace manager operates one layer below both—it provisions isolated environments, routes work assignments, merges outputs, and supervises health across an entire fleet of coding agents. The workspace manager does not write code itself; it ensures that 20 agents writing code simultaneously do not destroy each other's work.
This is the difference between a developer and a DevOps platform. The developer writes features; the platform ensures those features deploy safely. Workspace managers are the DevOps layer for agent swarms.
The Problem Space
Why Existing Tools Break at Scale
Single-agent tools assume a model where one agent works in one repository at a time. This assumption creates cascading failures when agent count increases:
| Problem | At 1-3 Agents | At 20-30 Agents |
|---|---|---|
| Branch conflicts | Rare, manually resolved | Constant, blocks all progress |
| Merge coordination | Developer handles | Requires automated queue |
| Context loss | Restart and re-explain | 20 agents lose context simultaneously |
| Work attribution | Obvious (one agent did it) | Unclear which agent produced what |
| Failure recovery | Restart the agent | Need supervision to detect and restart |
| Cost tracking | Single billing stream | Dozens of parallel billing streams |
The Coordination Tax
[2026-02-11]: At small scale, coordination overhead is negligible. A developer managing 2-3 Claude Code subagents via Task tool spends minimal time on coordination. At 20+ agents, coordination becomes the dominant cost:
Agent Count: 1 3 5 10 20 30
Useful Work: 95% 85% 75% 60% 40% 25% (without workspace manager)
Coordination: 5% 15% 25% 40% 60% 75%
Workspace managers exist to invert this ratio—keeping useful work above 70% even at 20+ agent scale by automating coordination that would otherwise consume human attention.
Gas Town: Primary Example
Gas Town is an open-source multi-agent workspace manager created by Steve Yegge (former Google, Amazon, Grab engineer). Released December 2025, MIT licensed, built in Go, with 9,000+ GitHub stars as of February 2026.
Gas Town provides two CLI tools:
gt— Workspace and agent management (provisioning, assignment, completion, context priming)bd— Git-backed issue tracking and workflow state (bead management, step progression)
Architecture Overview
~/gt/ Town root (all projects live here)
├── .beads/ Town-level beads (hq-* prefix)
├── mayor/ Mayor config (global coordinator)
├── deacon/ Background supervisor daemon
│ └── dogs/boot/ Health triage subsystem
│
└── <rig>/ Per-project container
├── mayor/rig/ Canonical clone (source of truth)
├── refinery/rig/ Merge queue processor
├── witness/ Per-rig supervisor
├── crew/<name>/ Human workspaces
└── polecats/<name>/ Worker worktrees (one per agent)
Key architectural decisions:
- Git worktrees as isolation: Each agent operates in its own worktree (
polecats/<name>/), sharing the same Git history but maintaining independent working directories. No filesystem conflicts between agents. - Canonical clone separation: The
mayor/rig/directory holds the canonical repository state. Agent worktrees branch from it. This prevents any single agent from corrupting the source of truth. - Supervisory hierarchy: The
deacondaemon monitors agent health. Thewitnessper-rig supervisor detects stalled or failed agents. Recovery happens automatically without human intervention. - Bead-based state: All workflow state persists as "beads"—Git-backed artifacts that survive agent restarts. No ephemeral state means no context loss on failure.
Core Workflows
Receiving Work
gt hook # Check what work is assigned to this agentThe hook command is the agent's entry point. It reads the current bead assignment and returns the task specification. Agents call this at session start (typically via a SessionStart hook in Claude Code) to understand their current assignment.
Completing Work
gt done # Complete work, push branch, submit to merge queue, cleanupgt done handles the entire completion sequence atomically:
- Stage and commit changes
- Push the agent's branch
- Submit to the merge queue (refinery)
- Clean up the worktree for reuse
This eliminates the common failure mode where agents commit but forget to push, or push but fail to signal completion.
Context Restoration
gt prime # Restore context for new sessionsAgent sessions are ephemeral—context dies when the session ends. gt prime reconstructs working context from Git-backed beads, enabling agents to resume work after restarts without human re-explanation. This maps directly to the Context vs Memory distinction: beads serve as persistent memory that survives session boundaries.
Workflow Progression
bd mol current # Check current position in workflow
bd close <step> --continue # Close step and auto-advance to nextThe bd tool manages workflow state through a bead-based issue tracker. Steps close explicitly, and --continue auto-advances to the next step in the sequence. This prevents agents from stalling between workflow steps.
The Merge Queue (Refinery)
The refinery is Gas Town's automated merge coordinator. When an agent completes work and calls gt done, the branch enters the merge queue:
Agent A completes ──> Refinery Queue ──> AI-Assisted Merge ──> Canonical Clone
Agent B completes ──> (mayor/rig/)
Agent C completes ──>
How it works:
- Completed branches queue in order of submission
- The refinery attempts automated merge against the canonical clone
- If conflicts arise, an AI agent resolves them (not the original worker)
- Successful merges update the canonical clone
- Other agents' worktrees rebase against the updated canonical
This decouples work completion from merge resolution. Agents never block on merge conflicts—they move to the next task while the refinery handles integration.
Trade-off: AI-assisted merge introduces risk of incorrect conflict resolution. The refinery prioritizes throughput over perfect accuracy, relying on subsequent review to catch errors.
Agent Identity and Persistence
Gas Town assigns persistent identities to agents through the polecats/ naming convention. Each agent has:
- A named worktree:
polecats/agent-name/persists across sessions - A work history: Git commits attributed to the agent
- Accumulated context: Beads record what the agent has worked on
This contrasts with ephemeral agent models (Claude Code subagents, LangGraph nodes) where agents have no identity beyond a single invocation. Persistent identity enables:
- Skill routing: Assign agents to work matching their prior experience
- Accountability: Trace which agent produced which output
- Context efficiency: Agents familiar with a codebase area need less context loading
Supervision Architecture
┌────────────────────────────────────┐
│ Deacon │
│ (Town-level supervisor daemon) │
│ │
│ ┌──────────┐ ┌──────────┐ │
│ │ Witness A │ │ Witness B │ ... │
│ │ (Rig 1) │ │ (Rig 2) │ │
│ └─────┬────┘ └─────┬────┘ │
│ │ │ │
│ ┌─────┴─────┐ ┌────┴─────┐ │
│ │ dogs/boot │ │ dogs/boot│ │
│ │ (triage) │ │ (triage) │ │
│ └───────────┘ └──────────┘ │
└────────────────────────────────────┘
│ │
┌────┴────┐ ┌───┴────┐
│Polecats │ │Polecats│
│(Agents) │ │(Agents)│
└─────────┘ └────────┘
Three supervision layers operate independently:
- Deacon (town-level): Background daemon monitoring all rigs. Detects infrastructure-level failures (disk space, network, process crashes).
- Witness (rig-level): Per-project supervisor. Detects agent-level failures (stalled work, repeated errors, timeout).
- Dogs/boot (triage): Health triage subsystem that classifies failures and routes recovery actions.
This layered approach means a single agent failure does not cascade. The witness detects the failure, dogs/boot classifies it, and recovery proceeds while other agents continue working.
Overstory: TypeScript/Bun Workspace Manager
[2026-02-13]: Overstory represents the session-as-orchestrator alternative to Gas Town's daemon-based architecture. Built in TypeScript using Bun runtime (~31K LOC, 912 tests), Overstory demonstrates that workspace management patterns transcend language ecosystems and coordination models.
Architecture Overview
.overstory/ Target project root
├── config.yaml Project configuration
├── agent-manifest.json Agent registry
├── hooks.json Central hooks config
├── agents/ Agent state
│ └── {name}/
│ ├── identity.yaml Persistent identity (CVs)
│ └── work-history.md Append-only work log
├── worktrees/ Git worktrees (gitignored)
│ └── {agent-name}/
├── specs/ Task specifications
│ └── {bead-id}.md
├── logs/ Agent logs (gitignored)
│ └── {agent-name}/{timestamp}/
├── mail.db SQLite mail (gitignored, WAL)
└── metrics.db SQLite metrics (gitignored)
Key Architectural Decisions:
- Session-as-orchestrator: Your active Claude Code session coordinates agents via
overstoryCLI. No separate daemon process. - Bun runtime: Zero runtime dependencies—only Bun built-ins (sqlite, spawn, file I/O).
- Hook integration: SessionStart/UserPromptSubmit/PreToolUse hooks provide coordination points.
- SQLite messaging: WAL mode enables concurrent agent messaging at ~1-5ms latency.
- Two-layer definitions: Base agent .md files (HOW) + dynamic overlay CLAUDE.md (WHAT).
Core Workflows
Spawning Workers
overstory sling --task bd-abc --capability builder --name auth-login \
--spec .overstory/specs/bd-abc.md --files src/auth/login.ts,src/auth/types.tsCreates worktree, writes overlay CLAUDE.md, deploys hooks, starts tmux session running Claude Code.
Communication
overstory mail send --to orchestrator --subject "Build complete" \
--body "Implemented login flow. Tests passing." --type resultMessages persist in .overstory/mail.db (SQLite). Hook automatically injects unread messages into agent context on next prompt submission.
Merge Queue
overstory merge --branch overstory/auth-login/bd-abc4-tier resolution: Clean → Auto-Resolve → AI-Resolve → Re-Imagine. Escalates automatically until conflicts resolve.
Monitoring
overstory status # Show all active agents, worktrees, beads state
overstory watch # Start watchdog daemon
overstory metrics # Show metrics summaryComparison with Gas Town
| Dimension | Overstory | Gas Town |
|---|---|---|
| Runtime | TypeScript/Bun | Go |
| Orchestrator | Session-as-orchestrator (your active Claude Code session) | External daemon (Mayor) |
| CLI | overstory (17 commands) | gt/bd dual CLIs |
| Messaging | Custom SQLite mail (~1-5ms) | Typed mail protocol |
| Cost Model | Subscription (fixed monthly cost) | API tokens (~$100/hr) |
| Dependencies | Zero (Bun built-ins only) | Go stdlib |
| Agent Definition | Base .md + overlay CLAUDE.md | Similar pattern |
| Merge Resolution | 4-tier (Clean/Auto/AI/Re-Imagine) | Refinery with AI resolution |
| Supervision | Tiered (Daemon → Triage → Monitor → Supervisor) | Daemon → Boot → Deacon → Witness |
Convergence: Both implement persistent identity, worktree isolation, typed messaging, tiered health monitoring, and merge queue infrastructure. The architectural alignment across different technology stacks (Go vs TypeScript) and coordination models (daemon vs session) validates these patterns as fundamental to swarm coordination.
When to Choose Overstory
Good fit:
- TypeScript/Bun ecosystem preferred
- Subscription cost model acceptable (fixed monthly vs per-token)
- Session-based coordination workflow natural (human stays engaged)
- Zero-dependency deployment valued
- Hook-based mechanical enforcement sufficient
Poor fit:
- Multi-language polyglot teams (Go ecosystem familiarity in Gas Town)
- API token budget model required (pay-per-use vs subscription)
- External daemon coordination preferred (orchestrator survives session crashes)
- Production validation maturity critical (Gas Town has more operational history as of early 2026)
Workspace Manager Selection Framework
When choosing between workspace managers, consider scale requirements, ecosystem preferences, and coordination models:
Scale and requirements:
1-5 agents → Claude Code Agent Teams (built-in)
5-15 agents → Overstory (session-based) OR Gas Town (daemon-based)
15-30 agents → Gas Town (proven at scale as of early 2026)
Ecosystem preference:
TypeScript/Bun → Overstory
Go → Gas Town
Language-agnostic → Either (patterns converge)
Cost model:
Subscription (fixed) → Overstory
API tokens (pay-per-use) → Gas Town
Either acceptable → Choose by ecosystem
Coordination model:
Session-as-orchestrator → Overstory
External daemon → Gas Town
When Workspace Managers Make Sense
Good Fit
- 10+ parallel agents: The coordination overhead threshold where manual management becomes the bottleneck
- Multi-repository projects: Complex merge coordination across repositories benefits from automated queuing
- Attribution requirements: Regulated environments or teams requiring clear traceability of agent-produced code
- Human design capacity as bottleneck: When there are more implementation tasks than agents to assign them to—workspace managers maximize agent utilization
- Long-running projects: Persistent agent identity and bead-based state compound value over weeks and months
Poor Fit
- 1-5 agents: Claude Code subagents or agent teams handle this scale with far less overhead. The infrastructure cost of a workspace manager exceeds its coordination benefit.
- Small projects: If the codebase fits in a single agent's context, workspace management adds unnecessary complexity.
- Early-stage practitioners: Workspace managers target practitioners at prompt maturity Level 7-8 (see Prompt Maturity Model). Understanding single-agent workflows is prerequisite.
- Cost-sensitive environments: Gas Town's operational model consumes approximately $100/hour in aggregate token costs across 20-30 agents. Budget must justify the throughput gain.
- Exploratory work: When requirements are unclear and iteration speed matters more than parallelism, single-agent exploration is more efficient than coordinated swarms.
The Scale Decision
Agent Count Decision Framework:
1-2 agents ──> Single session or subagents (Task tool)
No workspace manager needed.
3-8 agents ──> Agent teams (TeammateTool) or framework orchestration
Workspace manager optional, likely overkill.
10-20 agents ──> Workspace manager becomes valuable.
Coordination overhead exceeds manual capacity.
20-30 agents ──> Workspace manager essential.
Without one, coordination consumes >60% of effort.
30+ agents ──> Workspace manager + custom scaling infrastructure.
Current tools approaching upper limits.
Comparison with Other Approaches
| Dimension | Gas Town | Overstory | Claude Code Agent Teams | Google ADK | LangGraph |
|---|---|---|---|---|---|
| Scale target | 20-30 agents | 10-15 agents | 2-8 agents | Varies | Varies |
| Primary abstraction | Workspace/worktree | Worktree + session | Teammate session | Agent/workflow | Graph node |
| Persistence | Git-backed beads | Filesystem + SQLite | Session-only | State store | Checkpoints |
| Agent identity | Persistent (named CVs) | Persistent (identity.yaml) | Ephemeral | Configurable | Ephemeral |
| Coordination model | Mail protocol + beads | SQLite mail (~1-5ms) | Message passing + tasks | Shared state | Graph edges |
| Orchestrator model | External daemon (Mayor) | Session-as-orchestrator | Built-in (flat) | Configurable | Explicit graph |
| Merge strategy | Automated AI refinery | 4-tier escalation | Manual | N/A | N/A |
| Supervision | Deacon/Witness/Dogs | Daemon/Triage/Monitor/Supervisor | Flat (lead monitors) | Configurable | External |
| Failure recovery | Automatic (supervisor chain) | Tiered (mechanical → AI) | Manual (respawn) | Configurable | Checkpoint restore |
| Cost profile | ~$100/hr (20-30 agents) | Subscription (fixed monthly) | Lower (2-8 agents) | Lower | Lower |
| Maturity | Early (December 2025) | Early (February 2026) | Experimental | Production | Production (v1.0) |
| Language | Go (CLI tools) | TypeScript/Bun | TypeScript (SDK) | Python | Python |
| Dependencies | Go stdlib | Zero (Bun built-ins) | Node + npm packages | Python + pip packages | Python + pip packages |
| License | MIT | MIT | Proprietary | Apache 2.0 | MIT |
Key differentiators:
- Git-native persistence: Gas Town's bead system and Overstory's filesystem + SQLite persistence both ensure state survives crashes. Other approaches use ephemeral or application-specific state.
- Merge automation: Gas Town and Overstory automate merge conflict resolution through AI-assisted tiers. At 20+ agents producing concurrent branches, merge becomes the primary bottleneck without automation.
- Supervisory depth: Three-layer supervision (Gas Town: deacon/witness/dogs; Overstory: daemon/triage/monitor/supervisor) provides fault isolation missing from flat coordination models. A failed agent is detected and recovered without affecting peers.
- Coordination model: Gas Town's daemon-based Mayor vs Overstory's session-as-orchestrator represent fundamental architectural trade-offs. Daemon provides crash independence; session provides infrastructure simplicity.
Implementation Patterns
Integration with Coding Agents
Gas Town is agent-agnostic at the worker level—any coding agent that can operate in a Git worktree can serve as a polecat. The integration surface is three commands:
# SessionStart hook (in .claude/settings.json or equivalent)
gt prime # Load context from beads
# During work
gt hook # Read current assignment
# On completion
gt done # Push, submit to refinery, cleanupFor Claude Code specifically, these map to:
| Gas Town Command | Claude Code Integration Point |
|---|---|
gt prime |
SessionStart hook |
gt hook |
Agent prompt preamble |
gt done |
Post-completion workflow step |
bd mol current |
Status check within agent loop |
Bead-Based State Management
Beads are Gas Town's unit of persistent state. Every workflow step, task assignment, and completion record is a bead stored in Git:
.beads/
├── hq-project-setup # Town-level bead
├── hq-architecture-review # Town-level bead
└── task-implement-auth # Task bead assigned to polecat
Properties of beads:
- Versioned: Every bead change is a Git commit
- Attributable: Beads record which agent created or modified them
- Recoverable: Git history enables rollback to any prior state
- Portable: Beads survive agent restarts, machine changes, and session boundaries
This maps to the Context as Code mental model—treating agent state as version-controlled artifacts rather than ephemeral memory.
Work Distribution Pattern
┌──────────────┐
│ Mayor │ (Global coordinator)
│ Assigns → │
└──────┬───────┘
│
┌────┴──────────────────────────┐
│ Bead Queue │
│ [task-1] [task-2] [task-3] │
└────┬──────┬──────┬────────────┘
│ │ │
▼ ▼ ▼
Polecat Polecat Polecat
(Alice) (Bob) (Carol)
│ │ │
▼ ▼ ▼
gt done gt done gt done
│ │ │
▼ ▼ ▼
┌────┴──────┴──────┴────────────┐
│ Refinery │
│ Merge queue + AI resolution │
└───────────────────────────────┘
│
▼
Canonical Clone (mayor/rig/)
The mayor assigns tasks as beads. Polecats (worker agents) pull assignments via gt hook, execute in isolated worktrees, and submit via gt done. The refinery merges outputs in submission order, resolving conflicts with AI assistance. No agent waits for another agent's merge to complete.
Anti-Patterns
Anti-Pattern: Workspace Manager for Small Teams
Deploying Gas Town for 2-3 agents introduces infrastructure overhead (deacon daemon, refinery process, worktree management) that exceeds the coordination benefit. Claude Code agent teams or subagents handle this scale with zero infrastructure.
Better approach: Start with agent teams. Migrate to a workspace manager when coordination overhead visibly limits throughput—typically at 8-10 concurrent agents.
Anti-Pattern: Treating Workspace Managers as Frameworks
Workspace managers and agent frameworks serve different layers. Attempting to use Gas Town as a replacement for LangGraph or CrewAI conflates infrastructure with logic. Gas Town manages where agents work; frameworks manage how agents reason.
Better approach: Use workspace managers alongside frameworks. Gas Town provisions the worktree; Claude Code (or another agent) operates within it.
Anti-Pattern: Ignoring the Cost Curve
At $100/hour for 20-30 agents, workspace-managed swarms consume significant resources. Running a full swarm on exploratory work or unclear requirements wastes budget on parallel execution of potentially discarded work.
Better approach: Use single-agent exploration to clarify requirements. Deploy the swarm only for well-specified implementation tasks where parallelism yields clear throughput gains.
Anti-Pattern: Skipping the Maturity Ladder
Jumping directly to 20-agent workspace management without experience at lower scales creates operational risk. Practitioners unfamiliar with single-agent failure modes will struggle to diagnose swarm-level failures.
Better approach: Progress through the scale ladder: single agent (1-2) to agent teams (3-8) to workspace manager (10+). Each tier builds operational intuition for the next.
The Emerging Category
Gas Town represents the first visible entry in what may become a recognized tool category. The pattern is familiar from infrastructure evolution:
| Era | Problem | Category That Emerged |
|---|---|---|
| 2000s | Managing many servers | Configuration management (Chef, Puppet, Ansible) |
| 2010s | Managing many containers | Container orchestration (Docker, Kubernetes) |
| 2020s | Managing many microservices | Service mesh (Istio, Linkerd) |
| 2025+ | Managing many agents | Workspace management (Gas Town, ...) |
Signals this category is real:
- Scale demand exists: Production systems already run 10-30 agents (Claude Code agent teams, custom frameworks). The tooling gap is observable.
- Shared problems recur: Merge coordination, agent supervision, state persistence, and work attribution are not Gas Town-specific—any system at this scale faces them.
- Infrastructure layer is distinct: Workspace management is neither agent logic (framework layer) nor code generation (agent layer). It occupies a recognizable infrastructure niche.
Signals this category is premature:
- Single entrant: Gas Town is effectively the only tool in this space as of February 2026. Categories need competition to validate.
- Early maturity: Gas Town launched December 2025. Production hardening takes years.
- Model capabilities may eliminate the need: If models improve enough, fewer agents working smarter may outperform many agents coordinated by infrastructure.
- Cost barrier: $100/hour operational cost limits adoption to well-funded teams and high-value projects.
Open Questions
- How does merge quality degrade as branch divergence increases across 20+ concurrent agents?
- What is the practical upper limit on agent count before the refinery becomes a bottleneck?
- Can workspace managers integrate with non-Git version control systems, or is Git-native state fundamental?
- How does persistent agent identity interact with model updates (agent "personality" may shift between model versions)?
- What monitoring and observability patterns emerge for workspace-managed swarms?
- How do workspace managers handle multi-repository dependencies (agent in repo A needs changes from agent in repo B)?
- Will model improvements (larger context, better reasoning) reduce the need for high agent counts?
- What security model governs agent access within worktrees (can a compromised agent affect other worktrees)?
- How does bead-based state compare to database-backed state for operational query patterns?
- What happens when the refinery's AI merge resolution produces subtle bugs that pass tests?
Connections
- To Claude Code: Gas Town's polecats can run Claude Code as the underlying coding agent. Integration happens through SessionStart hooks (
gt prime) and completion workflows (gt done). Claude Code agent teams operate at smaller scale (2-8 agents) compared to Gas Town's target (20-30). - To Agent Frameworks: Workspace managers operate at a different layer than frameworks. LangGraph coordinates agent logic; Gas Town coordinates agent infrastructure. The two are complementary, not competitive.
- To Orchestrator Pattern: Gas Town's mayor implements an orchestrator pattern, but at infrastructure level rather than prompt level. The mayor assigns work and monitors completion; it does not reason about task decomposition.
- To Expert Swarm Pattern: The expert swarm pattern describes agent coordination logic. Workspace managers provide the infrastructure substrate that makes swarms practical at scale—isolation, merge automation, supervision.
- To Context Fundamentals: Gas Town's bead system directly addresses the context vs. memory gap. Beads provide persistent memory that survives session boundaries, enabling agents to resume work without re-explanation.
- To Workflow Coordination: Gas Town's
gt done+ refinery pattern automates the commit-push-merge workflow that manual coordination handles at smaller scale.
Sources
- Gas Town GitHub Repository (MIT License, 9k+ stars)
- Steve Yegge - Gas Town announcement and documentation
- Gas Town CLI reference
- Steve Yegge's background: Former Google, Amazon, Grab principal engineer; creator of Gas Town (December 2025)