Skill

agent-driven-development

Orchestrates implementation plans with worktree isolation, TDD discipline, and two-stage review. Referenced by execute-plan, fixit, and bugbash.

developer-tools

automation


Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/claude-skills:agent-driven-development

User invocable

Model invocable

Inline context

Default effort


Supporting Files

code-quality-reviewer-prompt.md
implementer-prompt.md
sandbox-mode.md
spec-reviewer-prompt.md

SKILL.md

165 lines · ~2.1k tokens

Stats

Language: Shell

Stars: 7

Forks: 2

Maintenance: Excellent

Last Commit: Jun 1, 2026


Agent-Driven Development

A reusable orchestration loop for agent-driven implementation. Skills like execute-plan, fixit, and bugbash reference this pattern rather than defining their own execution mechanics. It combines worktree isolation, TDD discipline, and two-stage review (spec compliance, then code quality). Each task gets a fresh agent to prevent context pollution.

Why This Pattern

Delegating implementation to fresh agents with isolated context produces better results than accumulating work in one long session. The controller curates exactly what each agent needs -- no more, no less. This preserves the controller's own context for coordination while keeping each agent focused.

The two-stage review (spec compliance, then code quality) catches different failure modes: building the wrong thing vs. building the right thing poorly. Both reviews are mandatory, and spec compliance must pass before code quality review begins.

The Core Loop

For each task in the plan:

1. Create worktree -- .claude/worktree/<task-slug>/
2. Dispatch implementer agent with task context + TDD reference
3. Implementer follows TDD (skills/test-driven-development/SKILL.md), self-reviews per verification-before-completion (skills/verification-before-completion/SKILL.md)
4. Implementer reports status: DONE | DONE_WITH_CONCERNS | NEEDS_CONTEXT | BLOCKED
5. Handle status (see Autonomous Execution below)
6. Dispatch spec reviewer -- checks implementation matches spec
7. If issues: implementer fixes, spec reviewer re-reviews (loop until clean)
8. Dispatch code quality reviewer -- checks implementation is well-built
9. If issues: implementer fixes, quality reviewer re-reviews (loop until clean)
10. Merge worktree back to main branch
    - Clean merge: done
    - Textual conflict: controller resolves and runs tests
    - Semantic conflict (tests fail after merge): re-dispatch against updated base
11. Mark task complete in native Task system -- dependents auto-unblock
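
The review portion of the loop above can be sketched as follows. This is a minimal illustration under stated assumptions, not the skill's implementation: `Review`, `review_until_clean`, `run_reviews`, and the `max_rounds` cap are hypothetical names, and the callables stand in for agent dispatches.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Review:
    """Result of one reviewer pass (hypothetical shape)."""
    approved: bool
    issues: List[str] = field(default_factory=list)


def review_until_clean(review: Callable[[], Review],
                       fix: Callable[[List[str]], None],
                       max_rounds: int = 10) -> int:
    """Reviewer reports issues, implementer fixes, reviewer re-reviews.

    Returns the number of review passes used. max_rounds is an assumed
    safety cap; the skill itself only says "loop until clean".
    """
    for round_no in range(1, max_rounds + 1):
        result = review()
        if result.approved:
            return round_no
        fix(result.issues)
    raise RuntimeError("review did not converge; escalate the task")


def run_reviews(spec_review: Callable[[], Review],
                quality_review: Callable[[], Review],
                fix: Callable[[List[str]], None]) -> dict:
    """Spec compliance must pass before code quality review starts."""
    spec_rounds = review_until_clean(spec_review, fix)
    quality_rounds = review_until_clean(quality_review, fix)
    return {"spec_rounds": spec_rounds, "quality_rounds": quality_rounds}
```

Because `run_reviews` only calls the quality reviewer after the spec loop returns, the ordering constraint is enforced by control flow rather than by convention.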

Task Coordination via Native Tasks

Use TaskCreate with addBlockedBy to build dependency graphs. The controller creates all tasks upfront from the plan. Tasks become eligible for execution when all their blockers complete.

TaskCreate("Update specs", ...)
TaskCreate("Write failing tests", ..., addBlockedBy: [spec-task-id])
TaskCreate("Implement auth module", ..., addBlockedBy: [test-task-id])
TaskCreate("Implement API routes", ..., addBlockedBy: [test-task-id])  <- parallel with above
TaskCreate("Integration tests", ..., addBlockedBy: [auth-id, api-id])

This naturally expresses the dependency graph. Independent tasks (auth module and API routes above) become eligible simultaneously and can run in parallel worktrees.
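
To make the eligibility rule concrete, here is a small sketch that models the same graph as a plain dictionary. It does not call the Task system; `eligible_tasks`, `PLAN`, and the short task ids are hypothetical names used only for illustration.

```python
from typing import Dict, List, Set


def eligible_tasks(blocked_by: Dict[str, List[str]], completed: Set[str]) -> List[str]:
    """Return unfinished tasks whose blockers are all complete."""
    return sorted(
        task
        for task, blockers in blocked_by.items()
        if task not in completed and all(b in completed for b in blockers)
    )


# Dependency graph from the example above (task ids are illustrative).
PLAN = {
    "spec": [],
    "tests": ["spec"],
    "auth": ["tests"],
    "api": ["tests"],
    "integration": ["auth", "api"],
}

print(eligible_tasks(PLAN, set()))                      # ['spec']
print(eligible_tasks(PLAN, {"spec", "tests"}))          # ['api', 'auth']
print(eligible_tasks(PLAN, {"spec", "tests", "auth"}))  # ['api']
```

Once the tests task completes, both implementation tasks become eligible in the same step, which is what allows them to run in parallel worktrees.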

Worktree-Per-Task Isolation

Every task gets its own worktree at .claude/worktree/<task-slug>/. This provides:

- Maximum parallelism -- independent tasks run simultaneously without file conflicts
- Clean rollback -- if a task fails, delete the worktree, no cleanup needed
- Isolated state -- each agent sees a consistent snapshot, not half-finished work from another task

First-time setup

If .claude/worktree/ does not exist in the project:

1. Create the directory
2. Add .claude/worktree/ to the project's .gitignore (append if not already present)
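
One way to make this setup idempotent is sketched below. The function name `ensure_worktree_root` and its return value are assumptions for illustration; the skill does not prescribe how the controller performs these two steps.

```python
from pathlib import Path

IGNORE_ENTRY = ".claude/worktree/"


def ensure_worktree_root(project: Path) -> bool:
    """Create .claude/worktree/ and make sure git ignores it.

    Returns True if .gitignore was modified, False if the entry was
    already present.
    """
    (project / ".claude" / "worktree").mkdir(parents=True, exist_ok=True)

    gitignore = project / ".gitignore"
    existing = gitignore.read_text() if gitignore.exists() else ""
    if IGNORE_ENTRY in (line.strip() for line in existing.splitlines()):
        return False

    # Keep the file newline-terminated so the new entry lands on its own line.
    prefix = "" if existing == "" or existing.endswith("\n") else "\n"
    with gitignore.open("a") as fh:
        fh.write(f"{prefix}{IGNORE_ENTRY}\n")
    return True
```

Calling it a second time is a no-op, so the controller can run it unconditionally before creating the first worktree.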

Merge strategy

After an agent completes a task and passes both reviews:

1. Switch to the main working branch
2. Merge the worktree branch
3. If textual conflicts arise, resolve them and run the full test suite
4. If tests fail after merge (semantic conflict), the merge introduced an incompatibility -- re-dispatch the task against the updated base
5. Clean up the worktree branch and directory
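
The sketch below lays out the git commands for one task's lifecycle and the decision after a merge. It only builds argument lists and runs nothing. The `task/<slug>` branch naming, the `main` default, and both function names are assumptions, not part of the skill.

```python
from typing import Dict, List


def worktree_commands(task_slug: str, main_branch: str = "main") -> Dict[str, List[List[str]]]:
    """Build git argv lists for create, merge, and cleanup (not executed here)."""
    path = f".claude/worktree/{task_slug}"
    branch = f"task/{task_slug}"  # assumed naming scheme
    return {
        "create": [["git", "worktree", "add", "-b", branch, path, main_branch]],
        "merge": [["git", "switch", main_branch],
                  ["git", "merge", branch]],
        "cleanup": [["git", "worktree", "remove", path],
                    ["git", "branch", "-d", branch]],
    }


def after_merge(textual_conflict: bool, tests_pass: bool) -> str:
    """Map the merge outcome to the controller's next action."""
    if not tests_pass:
        # Semantic conflict: the merged result is incompatible with the updated base.
        return "redispatch"
    return "resolved_and_merged" if textual_conflict else "clean_merge"
```

Each inner list could be passed to `subprocess.run(..., check=True)` by a controller that executes git directly.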

Autonomous Execution

Once execution starts, the controller never asks the user anything. Handle all statuses internally:

- DONE -- proceed to spec review
- DONE_WITH_CONCERNS -- read the concerns. If they're about correctness or scope, address before review. If they're observations ("this file is getting large"), note them for the final report and proceed to review.
- NEEDS_CONTEXT -- provide the missing context from the plan, specs, or codebase and re-dispatch
- BLOCKED -- escalation ladder:
  1. Provide more context and re-dispatch with the same model
  2. Re-dispatch with a more capable model
  3. Break the task into smaller pieces
  4. Park the task and note it in the final report

One summary at the end. No mid-execution interruptions.
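
The status handling above amounts to a small decision function. The following is a sketch only: the action strings, the `blocked_attempts` counter, and the `concerns_affect_correctness` flag are hypothetical names chosen for illustration.

```python
from enum import Enum


class Status(Enum):
    DONE = "DONE"
    DONE_WITH_CONCERNS = "DONE_WITH_CONCERNS"
    NEEDS_CONTEXT = "NEEDS_CONTEXT"
    BLOCKED = "BLOCKED"


# Escalation ladder for BLOCKED, in order; every rung changes something.
ESCALATION = [
    "add_context_same_model",
    "more_capable_model",
    "split_task",
    "park_and_report",
]


def next_action(status: Status, blocked_attempts: int = 0,
                concerns_affect_correctness: bool = False) -> str:
    """Pick the controller's next step without asking the user."""
    if status is Status.DONE:
        return "spec_review"
    if status is Status.DONE_WITH_CONCERNS:
        if concerns_affect_correctness:
            return "address_concerns"
        return "note_and_spec_review"
    if status is Status.NEEDS_CONTEXT:
        return "add_context_and_redispatch"
    # BLOCKED: climb one rung per prior attempt and stop at parking the task.
    rung = min(blocked_attempts, len(ESCALATION) - 1)
    return ESCALATION[rung]
```

Because the ladder always terminates at parking the task, the controller never has a reason to interrupt the user mid-execution.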

Model Selection

Use the least capable model that can handle each role. This reduces cost and speeds up execution.

- Mechanical tasks (1-2 files, clear spec, isolated function): use model: "haiku"
- Integration tasks (multi-file coordination, pattern matching, judgment calls): use model: "sonnet"
- Architecture, design, and review tasks: use the most capable model (default)

Complexity signals

| Signal | Model |
| --- | --- |
| Touches 1-2 files with complete spec | haiku |
| Touches multiple files with integration concerns | sonnet |
| Requires design judgment or broad codebase understanding | default (most capable) |
| Review roles (spec compliance, code quality) | default (most capable) |
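
The signals above can be read as a simple precedence order, sketched here. The function name, its parameters, and the choice to send a small task with an incomplete spec to "sonnet" are assumptions; the table does not cover that case.

```python
def pick_model(files_touched: int, spec_complete: bool,
               needs_design_judgment: bool = False,
               is_review: bool = False) -> str:
    """Return the model tier for a task, checking the strongest signal first."""
    if is_review or needs_design_judgment:
        return "default"  # most capable model
    if files_touched <= 2 and spec_complete:
        return "haiku"
    # Multi-file work, or a small task whose spec is incomplete (assumed).
    return "sonnet"
```

Checking review and design signals first means a one-file task that still needs design judgment is not downgraded to the cheapest model.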

Parallel Execution

When multiple tasks are unblocked simultaneously (no dependency between them), dispatch them in parallel:

- Each gets its own worktree
- Each gets its own implementer agent
- Reviews can also run in parallel across different tasks
- Merges happen sequentially to avoid conflicts (first-done merges first, subsequent tasks rebase if needed)

The Task system handles this naturally -- when a blocking task completes, all tasks it was blocking become eligible. The controller dispatches all eligible tasks at once.
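
A minimal sketch of "parallel work, sequential merges" using a thread pool is shown below. The `run_wave` name and the `implement` and `merge` callables are hypothetical stand-ins for agent dispatch and the merge step; real agents would not necessarily run as threads.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, List


def run_wave(tasks: List[str],
             implement: Callable[[str], object],
             merge: Callable[[str], None]) -> List[str]:
    """Run independent tasks concurrently, then merge them one at a time.

    Returns the merge order, which is completion order (first done merges first).
    """
    merged: List[str] = []
    with ThreadPoolExecutor(max_workers=max(1, len(tasks))) as pool:
        futures = {pool.submit(implement, task): task for task in tasks}
        # as_completed is consumed in the calling thread, so merges never overlap.
        for future in as_completed(futures):
            future.result()  # re-raise any implementer failure
            task = futures[future]
            merge(task)
            merged.append(task)
    return merged
```

The implementers run concurrently inside the pool, while the loop body that merges runs only in the controller's thread, which serializes merges without extra locking.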

Debugging Integration

For bug-fix tasks, the implementer agent should also read:

- skills/debug/root-cause-tracing.md -- systematic hypothesis-driven debugging
- skills/debug/defense-in-depth.md -- making fixes robust against related failures

Include these references in the implementer's dispatch prompt when the task involves diagnosing or fixing bugs (as opposed to greenfield implementation).
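
As an illustration, a controller could assemble the reference list for a dispatch prompt like this. The function name and the `is_bug_fix` flag are hypothetical; the paths are the ones named in this document.

```python
from typing import List

BASE_REFS = [
    "skills/test-driven-development/SKILL.md",
    "skills/verification-before-completion/SKILL.md",
]
DEBUG_REFS = [
    "skills/debug/root-cause-tracing.md",
    "skills/debug/defense-in-depth.md",
]


def implementer_references(is_bug_fix: bool) -> List[str]:
    """Return the files an implementer should read for this task."""
    refs = list(BASE_REFS)  # copy so the module-level list is never mutated
    if is_bug_fix:
        refs.extend(DEBUG_REFS)
    return refs
```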

Prompt Templates

The following prompt templates define agent behavior. The controller provides task-specific context when dispatching each agent.

- ./implementer-prompt.md -- implementation agent instructions
- ./spec-reviewer-prompt.md -- spec compliance reviewer instructions
- ./code-quality-reviewer-prompt.md -- code quality reviewer instructions

Sandbox-aware dispatch

When the calling session runs from inside a sandboxed worktree (cannot write to MAIN_REPO at the OS layer), the normal git worktree add + git merge flow fails. ./sandbox-mode.md defines a graceful degradation that auto-detects sandbox mode via ~/.claude/bin/repo-writable-check.sh and falls back to host task-spawning, staged-command, or async verify-then-archive paths. Skills like /fixit and /bugbash reference it from their dispatch and on-completion sections.

Red Flags

Never:

- Skip reviews (spec compliance or code quality)
- Proceed with unfixed review issues
- Dispatch parallel agents to the same worktree
- Start code quality review before spec compliance passes
- Move to the next task while either review has open issues
- Let implementer self-review replace actual review (both are needed)
- Ignore agent escalations -- if an agent says it's stuck, something needs to change
- Accept "close enough" on spec compliance -- if the reviewer found issues, they must be fixed
- Skip review re-loops -- when the reviewer finds issues, the implementer fixes them and the reviewer re-reviews
- Ask the user questions mid-execution -- handle everything internally, report at the end
- Force the same model to retry without changes when blocked

If a reviewer finds issues:

1. Implementer (same agent) fixes them
2. Reviewer reviews again
3. Repeat until approved

If an agent fails a task:

- Dispatch a fix agent with specific instructions
- Do not fix manually in the controller (context pollution)
