Treat agent knowledge like software, not documents.
This extends the "Specs as Source Code" mental model beyond specifications to all context artifacts: knowledge bases, expertise files, tool descriptions, and system prompts. If it shapes agent behavior, it's source code.
Your Mental Model
Knowledge artifacts are source code. Version control them, test them, refactor them, and document them with the same rigor you apply to Python or JavaScript. When you edit an agent's context without tracking what changed and why, you're cowboy-coding in production.
Knowledge bases aren't documentation—they're the runtime instructions that determine agent behavior. Treating them as "just text files" is like treating your application code as "just text files" and editing it in Notepad without version control.
What This Looks Like in Practice
The ACE playbook format exemplifies context as code:
[str-00001] helpful=5 harmful=0 :: Use structured output for complex tasks
[cal-00003] helpful=8 harmful=0 :: Token cost = (input + output) * rate
[mis-00004] helpful=6 harmful=0 :: Don't retry on rate limits without backoff
[con-00002] helpful=7 harmful=0 :: Context window = working memory
[too-00001] helpful=9 harmful=0 :: Grep before Edit to avoid blind writes
Each line is:
- Uniquely identified ([prefix-ID]): enables precise references and easy refactoring
- Performance tested (helpful=X harmful=Y): like unit tests for knowledge
- Category-organized (str-, cal-, mis-, con-, too-): modular design
- Self-describing: the content itself explains what it does
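Because every line follows the same shape, a few lines of code can turn the playbook into structured records. The sketch below is a minimal illustration, not the ACE reference parser: the regular expression is inferred from the sample lines above, and the `Entry` class and `parse_entry` function are hypothetical names.

```python
import re
from dataclasses import dataclass

# Assumed line shape, inferred from the sample entries:
#   [prefix-NNNNN] helpful=X harmful=Y :: content
ENTRY_RE = re.compile(
    r"^\[(?P<prefix>[a-z]+)-(?P<num>\d+)\]\s+"
    r"helpful=(?P<helpful>\d+)\s+harmful=(?P<harmful>\d+)\s+::\s+"
    r"(?P<content>.+)$"
)


@dataclass(frozen=True)
class Entry:
    entry_id: str  # e.g. "too-00001"
    category: str  # e.g. "too"
    helpful: int
    harmful: int
    content: str


def parse_entry(line: str) -> Entry:
    """Parse one playbook line; raise ValueError if it does not match."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a playbook entry: {line!r}")
    return Entry(
        entry_id=f"{match['prefix']}-{match['num']}",
        category=match["prefix"],
        helpful=int(match["helpful"]),
        harmful=int(match["harmful"]),
        content=match["content"],
    )


entry = parse_entry(
    "[too-00001] helpful=9 harmful=0 :: Grep before Edit to avoid blind writes"
)
print(entry.entry_id, entry.helpful, entry.harmful)
```

Rejecting malformed lines with an exception, rather than skipping them silently, plays the same role a syntax error plays for ordinary source code.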
Software Engineering Patterns Applied to Context
Version Control: Track Changes, Enable Rollback
# Traditional documentation
docs/agent-knowledge.md # edited directly, no history
# Context as code
git log --oneline knowledge/strategies.md
a3f2b1c Add retry strategy for transient failures
e4d5c6f Remove deprecated authentication approach
f7a8b9c Refactor error handling strategies
git diff f7a8b9c..e4d5c6f knowledge/strategies.md
- [str-00015] helpful=2 harmful=3 :: Use basic auth for API calls
+ [str-00015] helpful=8 harmful=0 :: Use OAuth2 with refresh tokens
When agent behavior regresses, you can use git bisect to find which knowledge change caused it.
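To automate that search, `git bisect run` needs a script whose exit code tells it whether a revision is good (0), bad (1), or should be skipped (125). The sketch below shows one possible predicate. It is an assumption-laden stand-in: a real check would probably run an agent evaluation, whereas this one only flags revisions in which any entry has more harmful votes than helpful ones. The function names and the default path are illustrative.

```python
import re
from pathlib import Path

GOOD, BAD, SKIP = 0, 1, 125  # exit codes understood by `git bisect run`

COUNTERS = re.compile(r"helpful=(\d+)\s+harmful=(\d+)")


def bisect_status(text):
    """Return the exit code for one revision of the knowledge file."""
    if text is None:
        return SKIP  # file missing at this revision, so it cannot be judged
    for helpful, harmful in COUNTERS.findall(text):
        if int(harmful) > int(helpful):
            return BAD  # hypothetical regression signal
    return GOOD


def main(path="knowledge/strategies.md"):
    """Read the file if it exists and classify the current checkout."""
    file = Path(path)
    text = file.read_text(encoding="utf-8") if file.exists() else None
    return bisect_status(text)
```

Saved as a script that ends with `sys.exit(main())`, this could be passed to `git bisect run` after marking one known-good and one known-bad commit.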
Testing: Helpful/Harmful Counters as Unit Tests
# Before testing
[str-00023] :: Always validate user input
# After testing (knowledge has test results)
[str-00023] helpful=12 harmful=0 :: Always validate user input
The counters are like test pass/fail metrics:
- helpful > 0, harmful = 0 → Proven valuable, keep it
- helpful = 0, harmful > 0 → Causes problems, remove or refactor
- helpful > 0, harmful > 0 → Context-dependent, needs conditions
You can track these over time like code coverage metrics.
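The three rules above translate directly into a small classifier. This is a minimal sketch: the label strings are illustrative, and the "untested" case (both counters at zero) is added as an assumption so that every input gets a label.

```python
def classify(helpful: int, harmful: int) -> str:
    """Map a pair of counters to the action the rules above suggest."""
    if helpful == 0 and harmful == 0:
        return "untested"  # assumed fourth case: no evidence either way
    if harmful == 0:
        return "proven"  # keep it
    if helpful == 0:
        return "problematic"  # remove or refactor
    return "context-dependent"  # needs conditions


print(classify(12, 0))  # the str-00023 entry after testing
```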
Modular Organization: Category Prefixes, Unique IDs
strategies/
str-00001.md # High-level approaches
str-00002.md
calculations/
cal-00001.md # Formulas and computations
cal-00003.md
mistakes/
mis-00001.md # Anti-patterns to avoid
mis-00004.md
concepts/
con-00001.md # Domain knowledge
con-00002.md
tools/
too-00001.md # Tool usage patterns
too-00005.md
Just like code modules, categories enable:
- Focused loading: Only load relevant categories for specific tasks
- Dependency tracking: [str-00012] references [con-00003] and [too-00007]
- Easier refactoring: Move entries between categories without breaking references
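Focused loading and dependency tracking both fall out of the ID scheme. The sketch below assumes entries cite each other by writing the bracketed ID inline. That citation syntax, the sample playbook text, and all function names are illustrative assumptions rather than part of the ACE format.

```python
import re

# Illustrative playbook; the str-00012 and too-00007 contents are made up.
PLAYBOOK = """\
[str-00012] helpful=4 harmful=0 :: Plan before editing; see [con-00003] and [too-00007]
[con-00003] helpful=7 harmful=0 :: Context window = working memory
[too-00007] helpful=5 harmful=0 :: Read a file before editing it
[cal-00003] helpful=8 harmful=0 :: Token cost = (input + output) * rate
"""

ID_RE = re.compile(r"\[([a-z]+-\d+)\]")


def load_categories(text, prefixes):
    """Keep only the lines whose entry ID starts with one of the prefixes."""
    wanted = tuple(f"[{prefix}-" for prefix in prefixes)
    return [line for line in text.splitlines() if line.startswith(wanted)]


def references(text):
    """Map each entry ID to the IDs mentioned in its content."""
    graph = {}
    for line in text.splitlines():
        ids = ID_RE.findall(line)
        if ids:
            graph[ids[0]] = ids[1:]
    return graph


def dangling(text):
    """List referenced IDs that have no entry of their own."""
    graph = references(text)
    return sorted({ref for refs in graph.values() for ref in refs if ref not in graph})


print(load_categories(PLAYBOOK, ["con", "cal"]))
print(references(PLAYBOOK)["str-00012"])
```

A `dangling` check like this could run before merging a knowledge change, much as a linter catches an import of a deleted module.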
Refactoring: Semantic Deduplication
# Before refactoring (duplication)
[str-00008] :: For database queries, use connection pooling
[str-00015] :: When connecting to databases, use a connection pool
[too-00023] :: Database access should use connection pooling
# After refactoring (DRY principle)
[str-00008] helpful=15 harmful=0 :: Use connection pooling for database access
# References: too-00023, db-architecture.md
Like code refactoring, you extract common patterns, eliminate redundancy, and maintain a single source of truth.
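Finding candidates for deduplication can be partly automated. The sketch below uses word overlap (Jaccard similarity) with crude suffix stripping as a cheap stand-in for semantic similarity; a real pipeline would more likely compare embeddings. The stopword list, the 0.5 threshold, and the function names are all assumptions chosen for this example.

```python
import re

STOPWORDS = {"a", "an", "the", "for", "to", "use", "should", "when", "with"}


def stem(word):
    """Crude suffix stripping so 'pooling' matches 'pool'."""
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word


def tokens(text):
    """Lowercased, stemmed content words of an entry."""
    return {stem(w) for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def jaccard(a, b):
    """Share of content words the two texts have in common (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def group_duplicates(entries, threshold=0.5):
    """Greedy grouping: an entry joins the first group whose first member is similar enough."""
    groups = []
    for entry_id, content in entries:
        for group in groups:
            if jaccard(group[0][1], content) >= threshold:
                group.append((entry_id, content))
                break
        else:
            groups.append([(entry_id, content)])
    return [[entry_id for entry_id, _ in group] for group in groups]


ENTRIES = [
    ("str-00008", "For database queries, use connection pooling"),
    ("str-00015", "When connecting to databases, use a connection pool"),
    ("too-00023", "Database access should use connection pooling"),
    ("mis-00004", "Don't retry on rate limits without backoff"),
]
print(group_duplicates(ENTRIES))
```

The grouping only proposes merges. Deciding which wording survives, and how to combine the counters, remains a human (or reviewed agent) decision.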
Documentation: Each Entry is Self-Describing
# Weak (requires external context)
[str-00042] :: Use the pattern
# Strong (self-contained)
[str-00042] helpful=6 harmful=0 :: For multi-step workflows, use plan-build-review pattern to separate planning from execution
Like good function names and docstrings, each knowledge entry should be understandable in isolation.
When to Apply This Model
Good Fit
Production agent systems: When agents run in production, their knowledge determines user-facing behavior. Treat it with production code rigor.
Multi-agent systems: When knowledge is shared across multiple agents, version control and testing prevent one agent's changes from breaking another.
Evolving domains: When knowledge needs frequent updates (new APIs, changing policies), treating context as code makes evolution traceable and reversible.
Team collaboration: When multiple people contribute to agent knowledge, version control and structure prevent conflicts and enable review.
Poor Fit
Prototype exploration: When you're still figuring out what knowledge the agent needs, heavyweight structure slows discovery. Start informal, formalize later.
Static, finished systems: If the knowledge is complete and won't change, the overhead of treating it as source code isn't justified.
Single-person, short-lived projects: For quick experiments, simple text files work fine. Add structure when the project grows.
The Continuum: From Documents to Code
Documents → Code, from least to most structured:
1. Plain text notes (no structure)
2. Markdown with sections (light structure)
3. Structured markdown with metadata (ACE playbook)
4. Machine-parseable format with schema (JSON/YAML)
5. Formal specifications with validation (contracts)
You don't need to jump straight to the "code" end. The ACE playbook hits a sweet spot: human-readable markdown with just enough structure (IDs, counters, categories) to enable software engineering practices.
Agent-as-Code: Version-Controlled Agent Definitions
[2026-02-06]: The "context as code" mental model extends to agent definitions themselves. BMAD-METHOD demonstrates treating agents as first-class version-controlled code artifacts—portable, reusable, and shareable like any code file.
The Pattern
Instead of runtime configurations or opaque API settings, agents are self-contained markdown files with embedded YAML:
<!-- agent-definition.md -->
# Security Expert Agent
Specialized agent for security architecture, threat modeling, and vulnerability analysis.
## Role
You are a security architect with expertise in authentication, authorization,
and secure system design.
## Capabilities
- OAuth2/OIDC protocol implementation
- Threat modeling and risk assessment
- Security audit and code review
- Compliance requirements (SOC 2, HIPAA)
## Workflows
- security-audit: Comprehensive security review
- threat-model: Analyze attack surfaces
- auth-design: Design authentication flows
---
# Agent Configuration (YAML)
name: security-expert
type: specialist
model: claude-sonnet-4.5
temperature: 0.3
tools: [Read, Grep, Execute]
Why This Matters
Institutional knowledge preservation: A security expert's knowledge—how to think about threat models, what to check in audits, which patterns prevent vulnerabilities—becomes portable code. When team members change, the expertise remains.
Cross-project reusability:
# Copy agent definition to new project
cp ~/agents/security-expert.md ./project-x/.agents/
# Agent immediately available with full expertise
Transparent system composition:
agents/
├── security-expert.md # 245 lines
├── architect.md # 312 lines
├── developer.md # 289 lines
└── qa-engineer.md # 198 lines
Total expertise: 1,044 lines of version-controlled knowledge
Manifest-Driven Transparency
BMAD tracks agents, workflows, and tasks via CSV manifests:
# agents.csv
AgentId,Name,Description,FilePath,WorkflowTriggers
bmm-001,Security Expert,Security architecture and threat modeling,agents/security-expert.md,security-audit|threat-model
bmm-002,Architect,System design and technical decisions,agents/architect.md,create-architecture|tech-review
This provides:
- Complete inventory: What agents exist?
- Capability mapping: What can each agent do?
- Dependency tracking: Which workflows require which agents?
- Easier debugging: Manifest shows system composition at a glance
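A manifest in this shape is easy to query with standard tooling. The sketch below is not BMAD's own loader; it assumes the column names shown in the sample `agents.csv` and a pipe-separated `WorkflowTriggers` column, and the function names are hypothetical.

```python
import csv
import io

MANIFEST = """\
AgentId,Name,Description,FilePath,WorkflowTriggers
bmm-001,Security Expert,Security architecture and threat modeling,agents/security-expert.md,security-audit|threat-model
bmm-002,Architect,System design and technical decisions,agents/architect.md,create-architecture|tech-review
"""


def load_manifest(text):
    """Parse manifest rows, splitting the trigger column into a list."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["WorkflowTriggers"] = row["WorkflowTriggers"].split("|")
        rows.append(row)
    return rows


def agents_by_workflow(rows):
    """Invert the manifest: workflow trigger -> agent definition paths."""
    index = {}
    for row in rows:
        for trigger in row["WorkflowTriggers"]:
            index.setdefault(trigger, []).append(row["FilePath"])
    return index


rows = load_manifest(MANIFEST)
index = agents_by_workflow(rows)
print(index["threat-model"])  # ['agents/security-expert.md']
```

The inverted index answers the dependency question directly: given a workflow name, which agent files must be present?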
Version Control Benefits
Agents under version control enable:
# Track agent evolution
git log agents/security-expert.md
a3f2b1c Added OWASP Top 10 checks to security audit workflow
e4d5c6f Updated OAuth2 PKCE flow recommendations
f7a8b9c Improved threat modeling methodology
# Compare agent versions
git diff v1.0..v2.0 agents/security-expert.md
# Rollback problematic changes
git revert f7a8b9c
When agent behavior regresses, version control enables diagnosis: "What changed in the agent definition that caused this?"
Production Example: BMAD-METHOD
BMAD-METHOD (34.5k GitHub stars) implements agent-as-code across 26 agents:
Core orchestrator:
- BMad Master: Coordinates all 26 agents
Business Method Module (9 agents):
- Mary (Analyst), John (PM), Winston (Architect), Amelia (Developer)
- Quinn (QA), Bob (Scrum Master), Barry (Quick Flow Dev)
- Sally (UX), Paige (Technical Writer)
Builder Module (3 agents):
- Agents for creating custom agents, modules, workflows
Each agent:
- Self-contained markdown file
- Embedded YAML configuration
- Fuzzy-match triggers for activation
- Portable across projects
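Because each agent is one file, loading it takes very little code. The sketch below assumes the layout from the earlier `agent-definition.md` example: markdown instructions, a `---` line, then flat `key: value` configuration. It is not BMAD's loader, it handles only that flat subset (a real implementation would likely use a YAML library), and it leaves every value as a string.

```python
AGENT_FILE = """\
# Security Expert Agent
Specialized agent for security architecture, threat modeling, and vulnerability analysis.
## Role
You are a security architect.
---
# Agent Configuration (YAML)
name: security-expert
type: specialist
temperature: 0.3
tools: [Read, Grep, Execute]
"""


def parse_agent(text):
    """Split an agent file into (instructions, config) at the last '---' line."""
    lines = text.splitlines()
    separators = [i for i, line in enumerate(lines) if line.strip() == "---"]
    if not separators:
        raise ValueError("no '---' separator found")
    split_at = separators[-1]

    config = {}
    for line in lines[split_at + 1:]:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        key, _, value = line.partition(":")
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            # Inline list such as [Read, Grep, Execute]
            config[key.strip()] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        else:
            config[key.strip()] = value  # kept as a string, e.g. "0.3"
    return "\n".join(lines[:split_at]).strip(), config


instructions, config = parse_agent(AGENT_FILE)
print(config["name"], config["tools"])
```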
Framework Extension: BMB Module
BMAD includes agents specifically for creating agents:
/create-agent "API integration specialist"
↓
Builder agent generates:
- agent-definition.md (role, capabilities, instructions)
- agent-config.yaml (model, tools, workflows)
- workflow templates for common tasks
Meta-pattern: Agents as code enables agents that generate agents. The builder module creates new agent definitions following the same markdown + YAML pattern.
Comparison to Runtime Configurations
| Runtime Config | Agent-as-Code |
|---|---|
| JSON/YAML blob in API call | Markdown file with embedded config |
| Exists only during execution | Persists as version-controlled file |
| Opaque system composition | Transparent (read the files) |
| Hard to share across projects | Copy file, instant reuse |
| Manual documentation required | Self-documenting |
| No change history | Full git history |
When to Use Agent-as-Code
Good fit:
- Building reusable agent libraries for an organization
- Multi-project environments where agents should be consistent
- Teams needing audit trail of agent behavior changes
- Complex agent systems requiring transparency
Not necessary for:
- One-off agent uses
- Simple single-agent systems
- Exploratory prototyping (formalize later)
- Projects with stable, unchanging agent requirements
Integration with Other Patterns
Self-Improving Experts pattern: Agent-as-code enables the improve phase to update agent definitions directly. Learnings from production update the markdown file, improving future runs.
Orchestrator pattern: Manifest-driven transparency shows which agents the orchestrator can coordinate, their capabilities, and dependencies.
Knowledge Evolution: Agent definitions are knowledge artifacts that evolve over time, following the same version control practices as other context.
Implementation Checklist
If implementing agent-as-code:
- Define agent structure (markdown format, YAML schema)
- Create agent manifest (CSV or JSON tracking system)
- Version control agent definitions
- Document agent creation guidelines
- Build tooling for agent discovery (loading from manifest)
- Test agent portability (copy to new project, verify function)
- Add agent update workflow (how to improve agents)
Open Questions
- What's the right balance between reusable generic agents vs specialized project-specific agents?
- How to handle agent definition conflicts when merging across projects?
- Can agent-as-code support dynamic agent generation at runtime, or only static definitions?
- What testing framework validates agent definitions without executing them?
Implications
Knowledge Reviews Like Code Reviews
# PR: Update authentication strategies
Changes to knowledge/strategies/:
- [str-00042] helpful=2 harmful=5 :: Use basic auth
+ [str-00042] helpful=8 harmful=0 :: Use OAuth2 with PKCE flow
+ [str-00058] helpful=0 harmful=0 :: For mobile apps, use refresh token rotation
Reviewer: "str-00042 improvement looks good. For str-00058, have we tested
harmful=0? Refresh token rotation can cause issues if not handled correctly."
Knowledge Regression Testing
def test_agent_follows_knowledge():
    """Verify the agent applies str-00042 (OAuth2 with PKCE)."""
    # load_agent_with_knowledge, mock_request, and increment_helpful are
    # project-specific helpers that are not defined in this snippet.
    agent = load_agent_with_knowledge("knowledge/strategies.md")
    response = agent.handle_auth_request(mock_request)
    assert response.auth_method == "OAuth2"
    assert response.uses_pkce
    # Reached only if both assertions passed, so record the success.
    increment_helpful("str-00042")
Knowledge Metrics
# Knowledge health dashboard
Total entries: 247
Proven (helpful > 5, harmful = 0): 89 (36%)
Untested (helpful = 0, harmful = 0): 143 (58%)
Problematic (harmful > 0): 15 (6%)
Recent changes (last 7 days):
+ 12 new entries
~ 8 modified entries
- 3 removed entries
Coverage by category:
str- (strategies): 67 entries
cal- (calculations): 23 entries
mis- (mistakes): 34 entries
con- (concepts): 89 entries
too- (tools): 34 entries
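A dashboard like this can be generated straight from the playbook file. The sketch below reuses the thresholds shown above (proven means helpful > 5 with harmful = 0). Since those three buckets leave entries with 1 to 5 helpful votes unclassified, an "emerging" bucket is added here as an assumption. The function name and output shape are illustrative.

```python
import re
from collections import Counter

ENTRY_RE = re.compile(r"^\[([a-z]+)-\d+\]\s+helpful=(\d+)\s+harmful=(\d+)\s+::")

PLAYBOOK = """\
[str-00001] helpful=5 harmful=0 :: Use structured output for complex tasks
[cal-00003] helpful=8 harmful=0 :: Token cost = (input + output) * rate
[mis-00004] helpful=6 harmful=0 :: Don't retry on rate limits without backoff
[con-00002] helpful=7 harmful=0 :: Context window = working memory
[too-00001] helpful=9 harmful=0 :: Grep before Edit to avoid blind writes
"""


def health_report(text):
    """Count entries per status bucket and per category prefix."""
    status = Counter()
    categories = Counter()
    for line in text.splitlines():
        match = ENTRY_RE.match(line)
        if match is None:
            continue  # not a playbook entry
        category, helpful, harmful = match[1], int(match[2]), int(match[3])
        categories[category] += 1
        if harmful > 0:
            status["problematic"] += 1
        elif helpful > 5:
            status["proven"] += 1
        elif helpful == 0:
            status["untested"] += 1
        else:
            status["emerging"] += 1  # assumed bucket: 1 to 5 helpful, 0 harmful
    return {
        "total": sum(categories.values()),
        "status": dict(status),
        "categories": dict(categories),
    }


report = health_report(PLAYBOOK)
print(report["total"], report["status"])
```

Running this on every commit and storing the result gives the trend lines that the "recent changes" section of the dashboard implies.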
Common Pitfalls
Over-Engineering Early
Problem: Creating elaborate versioning and testing infrastructure before you know what knowledge the agent needs.
Solution: Start with simple markdown. Add structure (IDs, categories, counters) when you have enough entries that organization becomes painful. Add testing when you have enough history to know what "helpful" looks like.
Treating All Context Equally
Problem: Applying heavy structure to ephemeral context that doesn't need it (one-off prompts, temporary instructions).
Solution: Distinguish between:
- Core knowledge (long-lived, reused, tested) → Treat as code
- Task-specific context (one-off, temporary) → Keep lightweight
- Generated content (can be regenerated) → Don't version control
Losing the Human-Readable Aspect
Problem: Making context so structured and formal that humans can't easily read and edit it.
Solution: The ACE playbook maintains readability. Avoid formats that require parsing tools to understand. Markdown with light structure is the sweet spot.
Connections
- To Specs as Source Code: Agent definitions are specs for agent behavior—same mental model applied to agents
- To Knowledge Evolution: Agent definitions evolve through version control like other knowledge artifacts
- To Self-Improving Experts: Expertise files are context that agents execute—agent-as-code makes the agent itself executable source
- To Context: Agent definitions shape what context enters working memory and how it's interpreted
Sources
- ACE framework paper (helpful/harmful counters, knowledge metrics)
- BMAD-METHOD GitHub Repository - Production agent-as-code implementation with 26 agents
- Agent As Code: BMAD-METHOD™ - Practitioner article on agent-as-code paradigm