Multi-Agent Workspace Managers | Agentic Engineering

As agent counts scale from 2-3 subagents to 20-30 parallel workers, a new category of infrastructure emerges: multi-agent workspace managers. These tools solve problems that frameworks and coding agents were never designed to handle—persistent agent identity, automated merge coordination, work attribution across dozens of simultaneous branches, and supervisory hierarchies that recover from failures without human intervention.

The relationship to existing tools is analogous to the emergence of container orchestration alongside containers. Docker handled packaging and isolation for individual containers; orchestrators such as Kubernetes handled running them at scale. Similarly, coding agents (Claude Code, Cursor) handle single-agent workflows; workspace managers handle the infrastructure layer when the bottleneck shifts from implementation capacity to coordination overhead.

Your Mental Model

Workspace managers are infrastructure, not agents. The distinction matters. A coding agent (Claude Code, Copilot) generates code. A framework (LangGraph, CrewAI) coordinates agent logic. A workspace manager operates one layer below both—it provisions isolated environments, routes work assignments, merges outputs, and supervises health across an entire fleet of coding agents. The workspace manager does not write code itself; it ensures that 20 agents writing code simultaneously do not destroy each other's work.

This is the difference between a developer and a DevOps platform. The developer writes features; the platform ensures those features deploy safely. Workspace managers are the DevOps layer for agent swarms.

The Problem Space

Why Existing Tools Break at Scale

Single-agent tools assume a model where one agent works in one repository at a time. This assumption creates cascading failures when agent count increases:

Problem	At 1-3 Agents	At 20-30 Agents
Branch conflicts	Rare, manually resolved	Constant, blocks all progress
Merge coordination	Developer handles	Requires automated queue
Context loss	Restart and re-explain	20 agents lose context simultaneously
Work attribution	Obvious (one agent did it)	Unclear which agent produced what
Failure recovery	Restart the agent	Need supervision to detect and restart
Cost tracking	Single billing stream	Dozens of parallel billing streams

The Coordination Tax

[2026-02-11]: At small scale, coordination overhead is negligible. A developer managing 2-3 Claude Code subagents via the Task tool spends minimal time on coordination. At 20+ agents, coordination becomes the dominant cost:

Agent Count:     1    3    5    10    20    30
Useful Work:   95%  85%  75%   60%   40%   25%  (without workspace manager)
Coordination:   5%  15%  25%   40%   60%   75%

Workspace managers exist to invert this ratio—keeping useful work above 70% even at 20+ agent scale by automating coordination that would otherwise consume human attention.
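The figures in the table above can be tabulated to see where coordination overtakes useful work. The following is a small sketch that uses only the percentages listed there; the function names are illustrative and not part of any tool.

```python
from typing import Optional

# Percentages copied from the table above (no workspace manager).
AGENT_COUNTS = [1, 3, 5, 10, 20, 30]
USEFUL_WORK_PCT = [95, 85, 75, 60, 40, 25]


def coordination_pct(useful_pct: int) -> int:
    """Coordination share is whatever useful work leaves over."""
    return 100 - useful_pct


def first_count_where_coordination_dominates() -> Optional[int]:
    """Smallest listed agent count where coordination exceeds useful work."""
    for count, useful in zip(AGENT_COUNTS, USEFUL_WORK_PCT):
        if coordination_pct(useful) > useful:
            return count
    return None


if __name__ == "__main__":
    for count, useful in zip(AGENT_COUNTS, USEFUL_WORK_PCT):
        print(f"{count:>2} agents: useful {useful}%, coordination {coordination_pct(useful)}%")
    print("coordination dominates from:", first_count_where_coordination_dominates())
```

With the listed figures, the crossover falls between 10 and 20 agents, which matches the threshold the rest of this chapter uses for adopting a workspace manager.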

Gas Town: Primary Example

Gas Town is an open-source multi-agent workspace manager created by Steve Yegge (formerly an engineer at Google, Amazon, and Grab). It was released in December 2025 under the MIT license, is built in Go, and had 9,000+ GitHub stars as of February 2026.

Gas Town provides two CLI tools:

gt — Workspace and agent management (provisioning, assignment, completion, context priming)
bd — Git-backed issue tracking and workflow state (bead management, step progression)

Architecture Overview

~/gt/                          Town root (all projects live here)
├── .beads/                    Town-level beads (hq-* prefix)
├── mayor/                     Mayor config (global coordinator)
├── deacon/                    Background supervisor daemon
│   └── dogs/boot/             Health triage subsystem
│
└── <rig>/                     Per-project container
    ├── mayor/rig/             Canonical clone (source of truth)
    ├── refinery/rig/          Merge queue processor
    ├── witness/               Per-rig supervisor
    ├── crew/<name>/           Human workspaces
    └── polecats/<name>/       Worker worktrees (one per agent)

Key architectural decisions:

Git worktrees as isolation: Each agent operates in its own worktree (polecats/<name>/), sharing the same Git history but maintaining independent working directories. No filesystem conflicts between agents.
Canonical clone separation: The mayor/rig/ directory holds the canonical repository state. Agent worktrees branch from it. This prevents any single agent from corrupting the source of truth.
Supervisory hierarchy: The deacon daemon monitors agent health. The per-rig witness supervisor detects stalled or failed agents. Recovery happens automatically, without human intervention.
Bead-based state: All workflow state persists as "beads"—Git-backed artifacts that survive agent restarts. No ephemeral state means no context loss on failure.
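To make the worktree isolation concrete, here is a minimal sketch of how one worktree per agent could be provisioned with plain Git. It only builds the `git worktree add` command line and does not run it. The branch naming scheme, the `main` start point, and the rig name are assumptions for illustration; this is not how `gt` is implemented.

```python
from pathlib import Path
from typing import List


def canonical_clone(rig_root: Path) -> Path:
    """Directory the layout above names as the source of truth."""
    return rig_root / "mayor" / "rig"


def worktree_add_command(rig_root: Path, agent_name: str, start_point: str = "main") -> List[str]:
    """Build a `git worktree add` invocation for one agent.

    The command is meant to be run from inside the canonical clone.
    """
    worktree_path = rig_root / "polecats" / agent_name
    branch = f"polecat/{agent_name}"  # hypothetical branch naming scheme
    return ["git", "worktree", "add", "-b", branch, str(worktree_path), start_point]


if __name__ == "__main__":
    rig = Path.home() / "gt" / "myrig"  # "myrig" is a made-up rig name
    print("run in:", canonical_clone(rig))
    print(" ".join(worktree_add_command(rig, "alice")))
```

Because every worktree shares the canonical clone's object store, each agent gets an independent working directory and branch without a full second clone.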

Core Workflows

Receiving Work

gt hook    # Check what work is assigned to this agent

The hook command is the agent's entry point. It reads the current bead assignment and returns the task specification. Agents call this at session start (typically via a SessionStart hook in Claude Code) to understand their current assignment.

Completing Work

gt done    # Complete work, push branch, submit to merge queue, cleanup

gt done handles the entire completion sequence in one command:

Stage and commit changes
Push the agent's branch
Submit to the merge queue (refinery)
Clean up the worktree for reuse

This eliminates the common failure mode where agents commit but forget to push, or push but fail to signal completion.
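One way to picture the completion sequence is as an ordered list of steps that stops at the first failure, so a later step never runs on top of a failed earlier one. The sketch below shows that idea only; the step bodies are placeholders and nothing here reflects Gas Town's actual implementation.

```python
from typing import Callable, List, Optional, Tuple

# A step is a name plus a callable that returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_completion(steps: List[Step]) -> Optional[str]:
    """Run steps in order; return the first failing step's name, or None."""
    for name, action in steps:
        if not action():
            return name  # stop here; remaining steps are not attempted
    return None


if __name__ == "__main__":
    # Placeholder steps: the push is made to fail for demonstration.
    steps: List[Step] = [
        ("commit", lambda: True),
        ("push", lambda: False),
        ("submit", lambda: True),
        ("cleanup", lambda: True),
    ]
    print("failed at:", run_completion(steps))
```

Reporting the failing step by name gives a supervisor something specific to act on, rather than a worktree left in an unknown state.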

Context Restoration

gt prime   # Restore context for new sessions

Agent sessions are ephemeral—context dies when the session ends. gt prime reconstructs working context from Git-backed beads, enabling agents to resume work after restarts without human re-explanation. This maps directly to the Context vs Memory distinction: beads serve as persistent memory that survives session boundaries.

Workflow Progression

bd mol current              # Check current position in workflow
bd close <step> --continue  # Close step and auto-advance to next

The bd tool manages workflow state through a bead-based issue tracker. Steps close explicitly, and --continue auto-advances to the next step in the sequence. This prevents agents from stalling between workflow steps.
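The close-and-advance behavior can be modeled as a small state machine. This is a sketch under assumptions: the `Workflow` class, its method names, and the step names are hypothetical, and the real `bd` data model may differ.

```python
from typing import List, Optional


class Workflow:
    """Ordered steps with an explicit close and an optional auto-advance."""

    def __init__(self, steps: List[str]) -> None:
        self.steps = list(steps)
        self.closed: List[str] = []
        self.position = 0

    def current(self) -> Optional[str]:
        """Step at the current position, or None when the workflow is finished."""
        if self.position >= len(self.steps):
            return None
        return self.steps[self.position]

    def close(self, step: str, advance: bool = False) -> Optional[str]:
        """Close the current step; with advance=True, move on to the next one."""
        if step != self.current() or step in self.closed:
            raise ValueError(f"{step!r} is not an open current step")
        self.closed.append(step)
        if advance:
            self.position += 1
        return self.current()


if __name__ == "__main__":
    wf = Workflow(["design", "implement", "test"])
    print(wf.close("design", advance=True))
    print(wf.close("implement"))
```

Without `advance`, the workflow stays parked on a closed step until something moves it forward, which is the stall the auto-advance option is meant to avoid.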

The Merge Queue (Refinery)

The refinery is Gas Town's automated merge coordinator. When an agent completes work and calls gt done, the branch enters the merge queue:

Agent A completes ──> Refinery Queue ──> AI-Assisted Merge ──> Canonical Clone
Agent B completes ──>                                          (mayor/rig/)
Agent C completes ──>

How it works:

Completed branches queue in order of submission
The refinery attempts automated merge against the canonical clone
If conflicts arise, an AI agent resolves them (not the original worker)
Successful merges update the canonical clone
Other agents' worktrees rebase against the updated canonical

This decouples work completion from merge resolution. Agents never block on merge conflicts—they move to the next task while the refinery handles integration.

Trade-off: AI-assisted merge introduces risk of incorrect conflict resolution. The refinery prioritizes throughput over perfect accuracy, relying on subsequent review to catch errors.
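The queueing behavior described above can be sketched as a FIFO queue with a pluggable conflict resolver. This is a toy model under stated assumptions: files are modeled as a path-to-content dictionary, a conflict means the canonical copy of a path changed after the branch's base version, and the `resolve` callback stands in for the AI resolver. None of these names come from Gas Town.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict, List


@dataclass
class Submission:
    agent: str
    base_version: int        # canonical version the branch started from
    changes: Dict[str, str]  # path -> new content


class Refinery:
    def __init__(self, resolve: Callable[[str, str, str], str]) -> None:
        self.canonical: Dict[str, str] = {}
        self.version = 0
        self._last_changed: Dict[str, int] = {}  # path -> version that last touched it
        self._queue: Deque[Submission] = deque()
        self._resolve = resolve  # (path, canonical_content, branch_content) -> merged
        self.log: List[str] = []

    def submit(self, submission: Submission) -> None:
        self._queue.append(submission)  # submission order is merge order

    def process_all(self) -> None:
        while self._queue:
            sub = self._queue.popleft()
            merged: Dict[str, str] = {}
            for path, content in sub.changes.items():
                # Conflict: canonical changed this path after the branch forked.
                if self._last_changed.get(path, 0) > sub.base_version:
                    content = self._resolve(path, self.canonical[path], content)
                    self.log.append(f"{sub.agent}: resolved conflict in {path}")
                merged[path] = content
            self.version += 1
            self.canonical.update(merged)
            for path in merged:
                self._last_changed[path] = self.version


if __name__ == "__main__":
    refinery = Refinery(resolve=lambda path, ours, theirs: ours + theirs)
    refinery.submit(Submission("A", 0, {"a.py": "A"}))
    refinery.submit(Submission("B", 0, {"a.py": "B", "b.py": "b"}))
    refinery.process_all()
    print(refinery.canonical, refinery.log)
```

The worker that submitted a branch is never called back: conflict resolution happens inside `process_all`, which mirrors the decoupling of completion from integration.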

Agent Identity and Persistence

Gas Town assigns persistent identities to agents through the polecats/ naming convention. Each agent has:

A named worktree: polecats/agent-name/ persists across sessions
A work history: Git commits attributed to the agent
Accumulated context: Beads record what the agent has worked on

This contrasts with ephemeral agent models (Claude Code subagents, LangGraph nodes) where agents have no identity beyond a single invocation. Persistent identity enables:

Skill routing: Assign agents to work matching their prior experience
Accountability: Trace which agent produced which output
Context efficiency: Agents familiar with a codebase area need less context loading
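Skill routing can be illustrated with a simple overlap heuristic: send a task to the agent whose recorded work history touches the most files in common with the task. This is a hypothetical sketch; the source does not specify how routing decisions are made, and the agent names and paths are examples.

```python
from typing import Dict, Optional, Set


def route_task(task_paths: Set[str], history: Dict[str, Set[str]]) -> Optional[str]:
    """Pick the agent with the largest path overlap.

    Ties go to the alphabetically first agent; None means no agent has
    touched any of the task's paths.
    """
    best_agent: Optional[str] = None
    best_overlap = 0
    for agent in sorted(history):
        overlap = len(task_paths & history[agent])
        if overlap > best_overlap:
            best_agent, best_overlap = agent, overlap
    return best_agent


if __name__ == "__main__":
    history = {
        "alice": {"src/auth/login.ts", "src/auth/types.ts"},
        "bob": {"src/db/pool.ts"},
    }
    print(route_task({"src/auth/login.ts"}, history))
```

A heuristic like this only works because identities persist: with ephemeral agents there is no history to consult.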

Supervision Architecture

┌────────────────────────────────────┐
│               Deacon               │
│   (Town-level supervisor daemon)   │
│                                    │
│  ┌───────────┐  ┌───────────┐      │
│  │ Witness A │  │ Witness B │  ... │
│  │  (Rig 1)  │  │  (Rig 2)  │      │
│  └─────┬─────┘  └─────┬─────┘      │
│        │              │            │
│  ┌─────┴─────┐  ┌─────┴─────┐      │
│  │ dogs/boot │  │ dogs/boot │      │
│  │ (triage)  │  │ (triage)  │      │
│  └───────────┘  └───────────┘      │
└────────────────────────────────────┘
         │              │
    ┌────┴────┐    ┌────┴────┐
    │Polecats │    │Polecats │
    │(Agents) │    │(Agents) │
    └─────────┘    └─────────┘

Three supervision layers operate independently:

Deacon (town-level): Background daemon monitoring all rigs. Detects infrastructure-level failures (disk space, network, process crashes).
Witness (rig-level): Per-project supervisor. Detects agent-level failures (stalled work, repeated errors, timeout).
Dogs/boot (triage): Health triage subsystem that classifies failures and routes recovery actions.

This layered approach means a single agent failure does not cascade. The witness detects the failure, dogs/boot classifies it, and recovery proceeds while other agents continue working.
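The detect-then-classify flow can be sketched in two small functions: one that flags agents whose last heartbeat is too old, and one that maps a failure kind to a recovery action. The timeout value, the heartbeat mechanism, and the action names are all assumptions for illustration; only the failure kinds (stalled work, repeated errors, timeout) come from the description above.

```python
from typing import Dict, List

STALL_TIMEOUT_SECONDS = 300.0  # assumed threshold, not a Gas Town default

# Hypothetical mapping from failure kind to recovery action.
RECOVERY_ACTIONS = {
    "stalled": "restart_session",
    "repeated_errors": "reassign_task",
    "timeout": "restart_session",
}


def find_stalled(last_heartbeat: Dict[str, float], now: float,
                 timeout: float = STALL_TIMEOUT_SECONDS) -> List[str]:
    """Witness-style check: agents whose last heartbeat is older than the timeout."""
    return sorted(agent for agent, seen in last_heartbeat.items() if now - seen > timeout)


def triage(failure_kind: str) -> str:
    """Triage-style classification; unknown kinds escalate to a human."""
    return RECOVERY_ACTIONS.get(failure_kind, "escalate_to_human")


if __name__ == "__main__":
    heartbeats = {"alice": 1000.0, "bob": 1290.0}
    for agent in find_stalled(heartbeats, now=1301.0):
        print(agent, "->", triage("stalled"))
```

Keeping detection and classification separate means a new failure kind only needs a new entry in the triage table, not a change to the detector.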

Overstory: TypeScript/Bun Workspace Manager

[2026-02-13]: Overstory represents the session-as-orchestrator alternative to Gas Town's daemon-based architecture. Built in TypeScript on the Bun runtime (~31K LOC, 912 tests), Overstory suggests that workspace management patterns carry across language ecosystems and coordination models.

Architecture Overview

.overstory/                         Target project root
├── config.yaml                     Project configuration
├── agent-manifest.json             Agent registry
├── hooks.json                      Central hooks config
├── agents/                         Agent state
│   └── {name}/
│       ├── identity.yaml           Persistent identity (CVs)
│       └── work-history.md         Append-only work log
├── worktrees/                      Git worktrees (gitignored)
│   └── {agent-name}/
├── specs/                          Task specifications
│   └── {bead-id}.md
├── logs/                           Agent logs (gitignored)
│   └── {agent-name}/{timestamp}/
├── mail.db                         SQLite mail (gitignored, WAL)
└── metrics.db                      SQLite metrics (gitignored)

Key Architectural Decisions:

Session-as-orchestrator: Your active Claude Code session coordinates agents via the overstory CLI. No separate orchestrator daemon is required.
Bun runtime: Zero runtime dependencies—only Bun built-ins (sqlite, spawn, file I/O).
Hook integration: SessionStart/UserPromptSubmit/PreToolUse hooks provide coordination points.
SQLite messaging: WAL mode enables concurrent agent messaging at ~1-5ms latency.
Two-layer definitions: Base agent .md files (HOW) + dynamic overlay CLAUDE.md (WHAT).

Core Workflows

Spawning Workers

overstory sling --task bd-abc --capability builder --name auth-login \
  --spec .overstory/specs/bd-abc.md --files src/auth/login.ts,src/auth/types.ts

This creates the worktree, writes the overlay CLAUDE.md, deploys hooks, and starts a tmux session running Claude Code.

Communication

overstory mail send --to orchestrator --subject "Build complete" \
  --body "Implemented login flow. Tests passing." --type result

Messages persist in .overstory/mail.db (SQLite). A hook automatically injects unread messages into the agent's context on the next prompt submission.
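A SQLite-backed mailbox of this kind can be sketched with the standard library. The table layout, column names, and function names below are assumptions for illustration and are not Overstory's actual schema; the only details taken from the text are the use of SQLite, WAL mode, and the send/read-unread flow.

```python
import sqlite3
from typing import List, Tuple


def open_mailbox(path: str) -> sqlite3.Connection:
    """Open (or create) a mailbox database in WAL mode."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")  # lets readers proceed while a writer is active
    conn.execute(
        """CREATE TABLE IF NOT EXISTS messages (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               recipient TEXT NOT NULL,
               subject TEXT NOT NULL,
               body TEXT NOT NULL,
               msg_type TEXT NOT NULL,
               is_read INTEGER NOT NULL DEFAULT 0
           )"""
    )
    conn.commit()
    return conn


def send(conn: sqlite3.Connection, to: str, subject: str, body: str, msg_type: str) -> None:
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO messages (recipient, subject, body, msg_type) VALUES (?, ?, ?, ?)",
            (to, subject, body, msg_type),
        )


def fetch_unread(conn: sqlite3.Connection, recipient: str) -> List[Tuple[str, str]]:
    """Return unread (subject, body) pairs in arrival order and mark them read."""
    with conn:
        rows = conn.execute(
            "SELECT id, subject, body FROM messages "
            "WHERE recipient = ? AND is_read = 0 ORDER BY id",
            (recipient,),
        ).fetchall()
        conn.executemany("UPDATE messages SET is_read = 1 WHERE id = ?",
                         [(row[0],) for row in rows])
    return [(row[1], row[2]) for row in rows]
```

In this sketch, a prompt-submission hook would call something like `fetch_unread` for the current agent and prepend the results to the prompt, so each message is delivered once.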

Merge Queue

overstory merge --branch overstory/auth-login/bd-abc

Merging uses 4-tier resolution (Clean → Auto-Resolve → AI-Resolve → Re-Imagine) and escalates automatically until the conflicts resolve.
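Tiered escalation can be pictured as trying each strategy in order and stopping at the first one that succeeds. In this sketch the tier names follow the text, while the `resolve_merge` function and the tier callables are placeholders; what each tier actually does inside Overstory is not modeled.

```python
from typing import Callable, List, Optional, Tuple

# A tier is a name plus a callable that returns True if it merged the branch.
Tier = Tuple[str, Callable[[str], bool]]


def resolve_merge(branch: str, tiers: List[Tier]) -> Optional[str]:
    """Try tiers cheapest-first; return the name of the tier that succeeded."""
    for name, attempt in tiers:
        if attempt(branch):
            return name  # later, more expensive tiers are never invoked
    return None  # every tier failed; needs human attention


if __name__ == "__main__":
    # Placeholder outcomes: the first two tiers are made to fail.
    tiers: List[Tier] = [
        ("clean", lambda branch: False),
        ("auto-resolve", lambda branch: False),
        ("ai-resolve", lambda branch: True),
        ("re-imagine", lambda branch: True),
    ]
    print(resolve_merge("overstory/auth-login/bd-abc", tiers))
```

Ordering tiers from cheapest to most expensive keeps the common case (a clean merge) fast and reserves model calls for branches that need them.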

Monitoring

overstory status       # Show all active agents, worktrees, beads state
overstory watch        # Start watchdog daemon
overstory metrics      # Show metrics summary

Comparison with Gas Town

Dimension	Overstory	Gas Town
Runtime	TypeScript/Bun	Go
Orchestrator	Session-as-orchestrator (your active Claude Code session)	External daemon (Mayor)
CLI	overstory (17 commands)	gt/bd dual CLIs
Messaging	Custom SQLite mail (~1-5ms)	Typed mail protocol
Cost Model	Subscription (fixed monthly cost)	API tokens (~$100/hr)
Dependencies	Zero (Bun built-ins only)	Go stdlib
Agent Definition	Base .md + overlay CLAUDE.md	Similar pattern
Merge Resolution	4-tier (Clean/Auto/AI/Re-Imagine)	Refinery with AI resolution
Supervision	Tiered (Daemon → Triage → Monitor → Supervisor)	Daemon → Boot → Deacon → Witness

Convergence: Both implement persistent identity, worktree isolation, typed messaging, tiered health monitoring, and merge queue infrastructure. The architectural alignment across different technology stacks (Go vs TypeScript) and coordination models (daemon vs session) suggests these patterns are fundamental to swarm coordination.

When to Choose Overstory

Good fit:

TypeScript/Bun ecosystem preferred
Subscription cost model acceptable (fixed monthly vs per-token)
Session-based coordination workflow natural (human stays engaged)
Zero-dependency deployment valued
Hook-based mechanical enforcement sufficient

Poor fit:

Teams that prefer the Go ecosystem (Gas Town is the closer fit)
API token budget model required (pay-per-use vs subscription)
External daemon coordination preferred (orchestrator survives session crashes)
Production validation maturity critical (Gas Town has more operational history as of early 2026)

Workspace Manager Selection Framework

When choosing between workspace managers, consider scale requirements, ecosystem preferences, and coordination models:

Scale and requirements:
  1-5 agents → Claude Code Agent Teams (built-in)
  5-15 agents → Overstory (session-based) OR Gas Town (daemon-based)
  15-30 agents → Gas Town (proven at scale as of early 2026)

Ecosystem preference:
  TypeScript/Bun → Overstory
  Go → Gas Town
  Language-agnostic → Either (patterns converge)

Cost model:
  Subscription (fixed) → Overstory
  API tokens (pay-per-use) → Gas Town
  Either acceptable → Choose by ecosystem

Coordination model:
  Session-as-orchestrator → Overstory
  External daemon → Gas Town

When Workspace Managers Make Sense

Good Fit

10+ parallel agents: The coordination overhead threshold where manual management becomes the bottleneck
Multi-repository projects: Complex merge coordination across repositories benefits from automated queuing
Attribution requirements: Regulated environments or teams requiring clear traceability of agent-produced code
Implementation backlog as bottleneck: When there are more well-specified implementation tasks than agents to take them, workspace managers maximize agent utilization
Long-running projects: Persistent agent identity and bead-based state compound value over weeks and months

Poor Fit

1-5 agents: Claude Code subagents or agent teams handle this scale with far less overhead. The infrastructure cost of a workspace manager exceeds its coordination benefit.
Small projects: If the codebase fits in a single agent's context, workspace management adds unnecessary complexity.
Early-stage practitioners: Workspace managers target practitioners at prompt maturity Level 7-8 (see Prompt Maturity Model). Understanding single-agent workflows is a prerequisite.
Cost-sensitive environments: Gas Town's operational model consumes approximately $100/hour in aggregate token costs across 20-30 agents. Budget must justify the throughput gain.
Exploratory work: When requirements are unclear and iteration speed matters more than parallelism, single-agent exploration is more efficient than coordinated swarms.

The Scale Decision

Agent Count Decision Framework:

1-2 agents  ──> Single session or subagents (Task tool)
                No workspace manager needed.

3-8 agents  ──> Agent teams (TeammateTool) or framework orchestration
                Workspace manager optional, likely overkill.

10-20 agents ──> Workspace manager becomes valuable.
                 Coordination overhead exceeds manual capacity.

20-30 agents ──> Workspace manager essential.
                 Without one, coordination consumes >60% of effort.

30+ agents  ──> Workspace manager + custom scaling infrastructure.
                 Current tools approaching upper limits.
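The decision framework above reduces to a threshold lookup. The sketch below encodes it with two labeled assumptions, because the listed ranges leave a gap and an overlap: a count of 9 is treated as part of the agent-teams tier, and a count of exactly 20 is treated as "essential". The function name and return strings are illustrative.

```python
def recommend(agent_count: int) -> str:
    """Map an agent count to the tier named in the decision framework."""
    if agent_count < 1:
        raise ValueError("agent_count must be at least 1")
    if agent_count <= 2:
        return "single session or subagents"
    if agent_count <= 9:  # assumption: 9 falls in the agent-teams tier
        return "agent teams or framework orchestration"
    if agent_count < 20:  # assumption: exactly 20 counts as essential
        return "workspace manager valuable"
    if agent_count <= 30:
        return "workspace manager essential"
    return "workspace manager plus custom scaling infrastructure"


if __name__ == "__main__":
    for n in (1, 5, 12, 25, 40):
        print(n, "->", recommend(n))
```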

Comparison with Other Approaches

Dimension	Gas Town	Overstory	Claude Code Agent Teams	Google ADK	LangGraph
Scale target	20-30 agents	10-15 agents	2-8 agents	Varies	Varies
Primary abstraction	Workspace/worktree	Worktree + session	Teammate session	Agent/workflow	Graph node
Persistence	Git-backed beads	Filesystem + SQLite	Session-only	State store	Checkpoints
Agent identity	Persistent (named CVs)	Persistent (identity.yaml)	Ephemeral	Configurable	Ephemeral
Coordination model	Mail protocol + beads	SQLite mail (~1-5ms)	Message passing + tasks	Shared state	Graph edges
Orchestrator model	External daemon (Mayor)	Session-as-orchestrator	Built-in (flat)	Configurable	Explicit graph
Merge strategy	Automated AI refinery	4-tier escalation	Manual	N/A	N/A
Supervision	Deacon/Witness/Dogs	Daemon/Triage/Monitor/Supervisor	Flat (lead monitors)	Configurable	External
Failure recovery	Automatic (supervisor chain)	Tiered (mechanical → AI)	Manual (respawn)	Configurable	Checkpoint restore
Cost profile	~$100/hr (20-30 agents)	Subscription (fixed monthly)	Lower (2-8 agents)	Lower	Lower
Maturity	Early (December 2025)	Early (February 2026)	Experimental	Production	Production (v1.0)
Language	Go (CLI tools)	TypeScript/Bun	TypeScript (SDK)	Python	Python
Dependencies	Go stdlib	Zero (Bun built-ins)	Node + npm packages	Python + pip packages	Python + pip packages
License	MIT	MIT	Proprietary	Apache 2.0	MIT

Key differentiators:

Git-native persistence: Gas Town's bead system and Overstory's filesystem + SQLite persistence both ensure state survives crashes. Other approaches use ephemeral or application-specific state.
Merge automation: Gas Town and Overstory automate merge conflict resolution through AI-assisted tiers. At 20+ agents producing concurrent branches, merge becomes the primary bottleneck without automation.
Supervisory depth: Layered supervision (Gas Town: deacon/witness/dogs; Overstory: daemon/triage/monitor/supervisor) provides fault isolation missing from flat coordination models. A failed agent is detected and recovered without affecting peers.
Coordination model: Gas Town's daemon-based Mayor vs Overstory's session-as-orchestrator represent fundamental architectural trade-offs. Daemon provides crash independence; session provides infrastructure simplicity.

Implementation Patterns

Integration with Coding Agents

Gas Town is agent-agnostic at the worker level—any coding agent that can operate in a Git worktree can serve as a polecat. The integration surface is three commands:

# SessionStart hook (in .claude/settings.json or equivalent)
gt prime    # Load context from beads
 
# During work
gt hook     # Read current assignment
 
# On completion
gt done     # Push, submit to refinery, cleanup

For Claude Code specifically, these map to:

Gas Town Command	Claude Code Integration Point
`gt prime`	SessionStart hook
`gt hook`	Agent prompt preamble
`gt done`	Post-completion workflow step
`bd mol current`	Status check within agent loop
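As an illustration of the first row in the table, the following sketch generates a settings fragment that runs `gt prime` on session start. The nested structure reflects the Claude Code hooks schema as best understood here (an event name mapping to a list of entries, each with a `hooks` list of `command` items); treat it as an assumption and check the current Claude Code hooks documentation before relying on it. The helper name is hypothetical.

```python
import json


def session_start_hook_settings(command: str = "gt prime") -> dict:
    """Build a settings fragment that runs `command` when a session starts.

    The schema shape is an assumption based on Claude Code's hooks format.
    """
    return {
        "hooks": {
            "SessionStart": [
                {"hooks": [{"type": "command", "command": command}]}
            ]
        }
    }


if __name__ == "__main__":
    # Print JSON suitable for merging into .claude/settings.json.
    print(json.dumps(session_start_hook_settings(), indent=2))
```

Generating the fragment rather than hand-editing it makes it easier to deploy the same hook into every agent's worktree.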

Bead-Based State Management

Beads are Gas Town's unit of persistent state. Every workflow step, task assignment, and completion record is a bead stored in Git:

.beads/
├── hq-project-setup           # Town-level bead
├── hq-architecture-review     # Town-level bead
└── task-implement-auth         # Task bead assigned to polecat

Properties of beads:

Versioned: Every bead change is a Git commit
Attributable: Beads record which agent created or modified them
Recoverable: Git history enables rollback to any prior state
Portable: Beads survive agent restarts, machine changes, and session boundaries

This maps to the Context as Code mental model—treating agent state as version-controlled artifacts rather than ephemeral memory.

Work Distribution Pattern

┌──────────────┐
│    Mayor     │  (Global coordinator)
│  Assigns →   │
└──────┬───────┘
       │
  ┌────┴──────────────────────────┐
  │         Bead Queue            │
  │  [task-1] [task-2] [task-3]   │
  └────┬──────┬──────┬────────────┘
       │      │      │
       ▼      ▼      ▼
   Polecat  Polecat  Polecat
   (Alice)  (Bob)    (Carol)
       │      │      │
       ▼      ▼      ▼
   gt done  gt done  gt done
       │      │      │
       ▼      ▼      ▼
  ┌────┴──────┴──────┴────────────┐
  │         Refinery              │
  │  Merge queue + AI resolution  │
  └───────────────────────────────┘
       │
       ▼
  Canonical Clone (mayor/rig/)

The mayor assigns tasks as beads. Polecats (worker agents) pull assignments via gt hook, execute in isolated worktrees, and submit via gt done. The refinery merges outputs in submission order, resolving conflicts with AI assistance. No agent waits for another agent's merge to complete.

Anti-Patterns

Anti-Pattern: Workspace Manager for Small Teams

Deploying Gas Town for 2-3 agents introduces infrastructure overhead (deacon daemon, refinery process, worktree management) that exceeds the coordination benefit. Claude Code agent teams or subagents handle this scale with zero infrastructure.

Better approach: Start with agent teams. Migrate to a workspace manager when coordination overhead visibly limits throughput—typically at 8-10 concurrent agents.

Anti-Pattern: Treating Workspace Managers as Frameworks

Workspace managers and agent frameworks serve different layers. Attempting to use Gas Town as a replacement for LangGraph or CrewAI conflates infrastructure with logic. Gas Town manages where agents work; frameworks manage how agents reason.

Better approach: Use workspace managers alongside frameworks. Gas Town provisions the worktree; Claude Code (or another agent) operates within it.

Anti-Pattern: Ignoring the Cost Curve

At $100/hour for 20-30 agents, workspace-managed swarms consume significant resources. Running a full swarm on exploratory work or unclear requirements wastes budget on parallel execution of potentially discarded work.

Better approach: Use single-agent exploration to clarify requirements. Deploy the swarm only for well-specified implementation tasks where parallelism yields clear throughput gains.

Anti-Pattern: Skipping the Maturity Ladder

Jumping directly to 20-agent workspace management without experience at lower scales creates operational risk. Practitioners unfamiliar with single-agent failure modes will struggle to diagnose swarm-level failures.

Better approach: Progress through the scale ladder: single agent (1-2) to agent teams (3-8) to workspace manager (10+). Each tier builds operational intuition for the next.

The Emerging Category

Gas Town represents the first visible entry in what may become a recognized tool category. The pattern is familiar from infrastructure evolution:

Era	Problem	Category That Emerged
2000s	Managing many servers	Configuration management (Chef, Puppet, Ansible)
2010s	Managing many containers	Container orchestration (Kubernetes, Docker Swarm)
2020s	Managing many microservices	Service mesh (Istio, Linkerd)
2025+	Managing many agents	Workspace management (Gas Town, ...)

Signals this category is real:

Scale demand exists: Production systems already run 10-30 agents (Claude Code agent teams, custom frameworks). The tooling gap is observable.
Shared problems recur: Merge coordination, agent supervision, state persistence, and work attribution are not Gas Town-specific—any system at this scale faces them.
Infrastructure layer is distinct: Workspace management is neither agent logic (framework layer) nor code generation (agent layer). It occupies a recognizable infrastructure niche.

Signals this category is premature:

Few entrants: Gas Town and Overstory are effectively the only tools in this space as of February 2026. Categories need competition to validate.
Early maturity: Gas Town launched December 2025. Production hardening takes years.
Model capabilities may eliminate the need: If models improve enough, fewer agents working smarter may outperform many agents coordinated by infrastructure.
Cost barrier: $100/hour operational cost limits adoption to well-funded teams and high-value projects.

Open Questions

How does merge quality degrade as branch divergence increases across 20+ concurrent agents?
What is the practical upper limit on agent count before the refinery becomes a bottleneck?
Can workspace managers integrate with non-Git version control systems, or is Git-native state fundamental?
How does persistent agent identity interact with model updates (agent "personality" may shift between model versions)?
What monitoring and observability patterns emerge for workspace-managed swarms?
How do workspace managers handle multi-repository dependencies (agent in repo A needs changes from agent in repo B)?
Will model improvements (larger context, better reasoning) reduce the need for high agent counts?
What security model governs agent access within worktrees (can a compromised agent affect other worktrees)?
How does bead-based state compare to database-backed state for operational query patterns?
What happens when the refinery's AI merge resolution produces subtle bugs that pass tests?

Connections

To Claude Code: Gas Town's polecats can run Claude Code as the underlying coding agent. Integration happens through SessionStart hooks (gt prime) and completion workflows (gt done). Claude Code agent teams operate at smaller scale (2-8 agents) compared to Gas Town's target (20-30).
To Agent Frameworks: Workspace managers operate at a different layer than frameworks. LangGraph coordinates agent logic; Gas Town coordinates agent infrastructure. The two are complementary, not competitive.
To Orchestrator Pattern: Gas Town's mayor implements an orchestrator pattern, but at infrastructure level rather than prompt level. The mayor assigns work and monitors completion; it does not reason about task decomposition.
To Expert Swarm Pattern: The expert swarm pattern describes agent coordination logic. Workspace managers provide the infrastructure substrate that makes swarms practical at scale—isolation, merge automation, supervision.
To Context Fundamentals: Gas Town's bead system directly addresses the context vs. memory gap. Beads provide persistent memory that survives session boundaries, enabling agents to resume work without re-explanation.
To Workflow Coordination: Gas Town's gt done + refinery pattern automates the commit-push-merge workflow that manual coordination handles at smaller scale.

Sources

Gas Town GitHub Repository (MIT License, 9k+ stars)
Steve Yegge - Gas Town announcement and documentation
Gas Town CLI reference
Steve Yegge's background: Former Google, Amazon, Grab principal engineer; creator of Gas Town (December 2025)