From autopilot
End-to-end autonomous coding pipeline. Use when the user says "autopilot", "build this from scratch", "go from idea to code", "autonomous build", or wants to go from a topic/idea to working code with a single command. Orchestrates crew, multi-model-debate, and arch-guard.
Slash command: /autopilot:autopilot
End-to-end pipeline: idea → research → design → build → review → report.
Invoked as: $ARGUMENTS
head -20 .autopilot/state.json 2>/dev/null || echo "(no state)"
test -f arch-guard.json && echo "DETECTED" || echo "NOT_FOUND"
test -d .caw && echo "YES" || echo "NO"

| Flag | Effect |
|---|---|
| `--skip-research` | Skip Phase 1, start at design |
| `--skip-debate` | Skip debate sub-step in Phase 2 |
| `--no-arch` | Force-skip arch-guard even if config exists |
| `--from-plan <path>` | Skip Phase 1+2, use existing design doc as design-brief |
| `--continue` | Resume from .autopilot/state.json |
| `--verbose` | Detailed per-phase progress |
| `--no-questions` | Minimize interactive prompts (still shows user gate) |
| `--worktree` | Isolate each build step in a git worktree (create → build → merge back) |
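The flag handling in the table above can be sketched as follows. This is a minimal illustration, not the skill's actual parser: the function name `parse_arguments` and the shape of the returned dictionary are assumptions, and only the flag names come from the table.

```python
import shlex

# Flags from the table above. --from-plan is the only one that takes a value.
BOOLEAN_FLAGS = {
    "--skip-research", "--skip-debate", "--no-arch", "--continue",
    "--verbose", "--no-questions", "--worktree",
}
VALUE_FLAGS = {"--from-plan"}


def parse_arguments(arguments: str) -> dict:
    """Split an $ARGUMENTS string into a topic and a flag dictionary (hypothetical helper)."""
    tokens = shlex.split(arguments)
    flags = {}
    topic_words = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        key = token.lstrip("-").replace("-", "_")
        if token in BOOLEAN_FLAGS:
            flags[key] = True
        elif token in VALUE_FLAGS:
            if i + 1 >= len(tokens):
                raise ValueError(f"{token} requires a path")
            flags[key] = tokens[i + 1]
            i += 1  # skip the value we just consumed
        else:
            # Everything that is not a flag belongs to the topic.
            topic_words.append(token)
        i += 1
    return {"topic": " ".join(topic_words), "flags": flags}
```

Calling `parse_arguments('"build a notification system" --worktree')` yields the topic string plus `{"worktree": True}`, which maps directly onto the `config` block of the state file.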
    [1/5] RESEARCH   crew:explore --research-deep          autonomous
    [2/5] DESIGN     crew:explore --arch + debate + arch   autonomous → USER GATE
    [3/5] BUILD      arch-guard scaffold + crew:go         autonomous
    [4/5] REVIEW     codex + arch-check + crew:review      autonomous (parallel)
    [5/5] REPORT     synthesis                             autonomous
Parse `$ARGUMENTS`. Extract `<topic>` (everything that isn't a flag).

- If `--continue`: read `.autopilot/state.json`.
  - If `phase == "complete"` AND `completion.missing > 0`: gap-filling mode. Read `.autopilot/remaining-work.md`, set `phase = "build"` and `build.status = "pending"`, and resume from Phase 3 using remaining-work.md as the task description for crew:go.
  - Otherwise, find the first phase whose status is != `complete` and != `skipped`, and resume from there. Skip to that phase section below.
- If `--from-plan <path>`: read the file at `<path>`, copy its content to `.autopilot/design-brief.md`, skip to Phase 3.
- Create the `.autopilot/` directory.
- Run `git rev-parse --git-dir 2>/dev/null`. If not a git repo, run `git init` and create an initial commit (`git add -A && git commit -m "chore: initial commit (pre-autopilot)"`). This is required because crew agents use git worktrees for isolation.
- Write `.autopilot/state.json`:

{
  "schema_version": "1.1",
  "topic": "<topic>",
  "phase": "research",
  "started_at": "<ISO timestamp>",
  "config": {
    "skip_research": false,
    "skip_debate": false,
    "no_arch": false,
    "arch_guard_detected": false,
    "verbose": false,
    "no_questions": false,
    "worktree": false
  },
  "phases": {
    "research": { "status": "pending" },
    "design": { "status": "pending", "user_approved": false, "revision_count": 0 },
    "build": { "status": "pending" },
    "review": { "status": "pending", "rounds": 0 },
    "report": { "status": "pending" }
  },
  "deliverables": [],
  "completion": {
    "total": 0,
    "built": 0,
    "missing": 0,
    "verdict": "pending"
  },
  "gap_fill_round": 0,
  "last_error": null
}
- If `arch-guard.json` exists and `--no-arch` is not set, set `config.arch_guard_detected = true`.
- Store the parsed flags in `config`.
- Print: `AUTOPILOT started: "<topic>"`
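The `--continue` resume rule from the initialization steps can be sketched like this. It is a minimal illustration under stated assumptions: the function name `resume_phase` is hypothetical, the phase order is taken from the five-phase overview, and `build.status` is assumed to live under `phases.build.status` as in the state file above.

```python
PHASE_ORDER = ["research", "design", "build", "review", "report"]


def resume_phase(state: dict) -> str:
    """Return the phase to resume from, following the --continue rules (hypothetical helper)."""
    # Gap-filling mode: a finished run that still has missing deliverables
    # goes back to the build phase.
    if state["phase"] == "complete" and state["completion"]["missing"] > 0:
        state["phase"] = "build"
        state["phases"]["build"]["status"] = "pending"
        return "build"
    # Otherwise resume at the first phase that is neither complete nor skipped.
    for name in PHASE_ORDER:
        if state["phases"][name]["status"] not in ("complete", "skipped"):
            return name
    # Nothing left to do.
    return "complete"
```

A state whose research phase is `skipped` and whose design phase is `failed` resumes at `design`; a completed run with `completion.missing > 0` is sent back to `build`.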
Every phase ends with one of:
- … `.autopilot/deferred-questions.md`

On BLOCKED: save state and stop. The user can run `--continue` after resolving the blocker.
On 3 consecutive failures in the same phase: stop and suggest manual skill invocation.
Phase 1: Research

Skip if: `--skip-research` flag is set → mark `research.status = "skipped"`, go to Phase 2.

- Print: `[1/5] Researching...`
- Update state: `research.status = "running"`, `phase = "research"`
- Spawn an Agent with the prompt `"/crew:explore --research-deep <topic>"`
- Expected output: `.caw/research/<slug>/RESEARCH-REPORT.md`
- Store `research.report_path` in state
- Log open questions to `.autopilot/deferred-questions.md`
- On success: set `research.status = "complete"` and print `[1/5] Research complete`
- On failure: set `research.status = "failed"`, record the error in `last_error`, and print `[1/5] Research failed: <error>. Use --continue to retry or --skip-research to skip.`

Phase 2: Design

Skip if: `--from-plan` was used → already skipped.
- Print: `[2/5] Designing...`
- Update state: `design.status = "running"`, `phase = "design"`
- Read the research report (from `research.report_path` or locate in `.caw/research/`)
- Spawn an Agent with the prompt `"/crew:explore --arch <topic>"`, providing research report context
- Expected output: `.caw/design/architecture.md`
- Log open questions to `.autopilot/deferred-questions.md`

Skip the debate if: `--skip-debate` flag is set OR the multi-model-debate plugin is not available.

- From `.caw/design/architecture.md`, extract the top 2-3 key design decisions (component boundaries, technology choices, data model approaches)
- For each decision, invoke `Skill("multi-model-debate:debate-orchestration")` with the decision as topic
- If the plugin is missing, print: `[2/5] Debate skipped (multi-model-debate plugin not found)`

Skip the architecture check if: `config.arch_guard_detected` is false OR `--no-arch` is set.

- Invoke `Skill("arch-guard:arch-check")` to surface existing layer/reference constraints
- If arch-guard is not configured, print: `Consider running /arch-guard:setup to define architecture rules (skipping for now)`

Consolidate the design:

- Read `.caw/design/architecture.md` (+ debate reports if any + arch-guard constraints if any)
- Write `.autopilot/design-brief.md` with sections:
  - … `.autopilot/deferred-questions.md`

During design consolidation, evaluate whether arch-guard should be enabled. Write the decision in the "Architecture Governance" section of design-brief.md.
Enable arch-guard when ANY of these apply:
Skip arch-guard when ALL of these apply:
- … `src/` only)

If enabling: set `config.arch_guard_detected = true` in state.json, then invoke `Skill("arch-guard:setup")` to generate arch-guard.json from the design. This runs BEFORE the user gate so the user can review the rules.
After writing .autopilot/design-brief.md, parse it and extract every concrete deliverable: files to create, classes/interfaces to implement, configuration files, test files, app manifests, migration scripts.
For each deliverable, record in state.json under "deliverables" array:
- `id`: sequential identifier (`d1`, `d2`, ...)
- `name`: short name (e.g., "AgentStepExecutor")
- `type`: one of `file`, `class`, `interface`, `function`, `config`, `directory`, `test`
- `expected_path`: best-guess file path where this should exist after build
- `source_section`: which design-brief section it came from (e.g., "Components", "Data Model")
- `status`: `"pending"`

Set `completion.total` to the deliverable count. Leave `completion.verdict` as `"pending"`.
Extraction heuristic: scan design-brief sections (Components, Data Model, Build Sequence, Tests) for nouns that map to code artifacts. Include both interfaces AND their expected implementations. Include test projects. Include infrastructure configs (Docker, migrations).
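The record format above can be sketched as a small builder. This is an illustration only: the helper name `make_deliverables` and its tuple input are assumptions, and the extraction of names from the design brief (the heuristic itself) is left to the model and is not shown here.

```python
ALLOWED_TYPES = {"file", "class", "interface", "function", "config", "directory", "test"}


def make_deliverables(items: list[tuple[str, str, str, str]]) -> list[dict]:
    """Build deliverable records from (name, type, expected_path, source_section) tuples."""
    records = []
    for index, (name, kind, expected_path, source_section) in enumerate(items, start=1):
        if kind not in ALLOWED_TYPES:
            raise ValueError(f"unknown deliverable type: {kind}")
        records.append({
            "id": f"d{index}",  # sequential identifier: d1, d2, ...
            "name": name,
            "type": kind,
            "expected_path": expected_path,
            "source_section": source_section,
            "status": "pending",
        })
    return records
```

The result would be stored as `state["deliverables"]`, with `state["completion"]["total"]` set to `len(records)`.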
Present to the user via AskUserQuestion:
## Autopilot Design Review
### What will be built
<brief summary from design-brief>
### Key components
<component list>
### Tech decisions
<decisions with debate results if any>
### Planned Deliverables ({completion.total} items)
<table: name | type | expected_path — from state.json deliverables>
### Architecture governance
<"arch-guard enabled — {N} rules" or "arch-guard skipped — {reason}">
### Architecture constraints
<from arch-guard rules, or "N/A">
### Open questions
<from deferred-questions.md, or "none">
---
Options:
1. **Approve** — proceed to build ({completion.total} deliverables)
2. **Revise** — provide feedback (I'll re-design, max 3 rounds)
3. **Abort** — cancel pipeline
- Approve: set `design.user_approved = true`, `design.status = "complete"`, print `[2/5] Design approved`
- Revise: save the feedback to `.autopilot/design-feedback.md`, increment `design.revision_count`, re-run 2a with feedback as additional context. Max 3 revision rounds; after 3, present the final version and require approve or abort.
- Abort: set `phase = "cancelled"`, `design.status = "cancelled"`, print `AUTOPILOT_CANCELLED`, stop.

Skip ADR generation if: `config.no_arch` is set OR design-brief has no "Tech Decisions" section.

- Read `.autopilot/design-brief.md`, extract each entry from the "Tech Decisions" section
- For each entry, invoke `Skill("arch-guard:adr", "<decision title>")` with context:
- Create the `docs/adr/` directory if missing
- Store `design.adr_paths = [<paths>]`
- Print: `[2/5] {N} ADR(s) generated`

Phase 3: Build

- Print: `[3/5] Building...`
- Update state: `build.status = "running"`, `phase = "build"`

Skip scaffolding if: `config.arch_guard_detected` is false.

- Invoke `Skill("arch-guard:scaffold")` with the module name
- Invoke `Skill("arch-guard:contract-first")` to define interfaces
- Invoke `Skill("arch-guard:implement")` for interface stubs

CRITICAL: YOU (autopilot) execute the build loop yourself. Do NOT delegate the entire build to a single agent. Execute each step individually with commit + simplify between steps.
If .caw/task_plan.md does not exist, create it:
- Spawn `Agent(subagent_type="crew:planner")` with the prompt: the design-brief content + "Create a task plan in .caw/task_plan.md with phases and steps."
- Verify that `.caw/task_plan.md` exists.

Read `.caw/task_plan.md`. For each pending step, execute this sequence:
If config.worktree is true — Worktree-Isolated Mode:
a. Create worktree — Invoke worktree:create for this step:
Skill("worktree:create", "step-{N}")
This creates .worktrees/step-{N} with a branch step-{N}.
b. Build — Spawn Builder to work INSIDE the worktree:
Agent(subagent_type="crew:builder", prompt="Implement Step {N}: {step description}. Context files: {list}. IMPORTANT: Work in directory .worktrees/step-{N}/ — all file reads/writes must be relative to that directory. When done, commit your changes with: cd .worktrees/step-{N} && git add -A && git commit -m '[feat] Step {N}: {step description}'")
Wait for builder to complete.
c. Merge back — Invoke worktree:merge from the worktree context:
Skill("worktree:merge", "")
Note: worktree:merge must be invoked from inside the worktree. If the Skill tool cannot change cwd, run manually:
cd .worktrees/step-{N} && git status --porcelain
If worktree has uncommitted changes, commit them first. Then merge from the main repo:
BRANCH="step-{N}"
REPO_ROOT=$(git rev-parse --show-toplevel)
ORIGINAL_REPO=$(cd "$REPO_ROOT" && git rev-parse --git-common-dir | xargs dirname)
git -C "$ORIGINAL_REPO" merge --squash "$BRANCH"
git -C "$ORIGINAL_REPO" commit -m "[feat] Step {N}: {step description}"
d. Simplify — Spawn code-simplifier on modified files:
Agent(subagent_type="code-simplifier:code-simplifier", prompt="Simplify the files modified in Step {N}: {file list}")
e. Tidy commit — Run Bash directly:
git status --porcelain
If output is non-empty:
git add -A
git commit -m "[tidy] Simplify Step {N}"
f. Cleanup — Remove the worktree:
git worktree remove .worktrees/step-{N} 2>/dev/null
git branch -d step-{N} 2>/dev/null
g. Next — Proceed to the next pending step. Repeat a-f.
If config.worktree is false — Default Mode (no isolation):
a. Build — Spawn Builder for THIS step only:
Agent(subagent_type="crew:builder", prompt="Implement Step {N}: {step description}. Context files: {list}.")
Wait for builder to complete.
b. Commit — Run these Bash commands directly (do NOT delegate):
git status --porcelain
If output is non-empty:
git add -A
git commit -m "[feat] Step {N}: {step description}"
c. Simplify — Spawn code-simplifier on modified files:
Agent(subagent_type="code-simplifier:code-simplifier", prompt="Simplify the files modified in Step {N}: {file list}")
d. Tidy commit — Run Bash directly:
git status --porcelain
If output is non-empty:
git add -A
git commit -m "[tidy] Simplify Step {N}"
e. Next — Proceed to the next pending step. Repeat a-d.
After all steps complete:
- Update state: `build.status = "complete"`
- Print: `[3/5] Build complete ({completion.built}/{completion.total} deliverables)`

After the build loop completes, iterate through every entry in `state.json.deliverables` and update both the individual `status` field and the `completion` summary.
CRITICAL: You MUST update each deliverable's status field in the deliverables array. Do NOT only update completion counts — the per-item status is required for reporting, gap-filling, and --continue resume.
Check rules by type:
- file / config / directory: check if `expected_path` exists (use Bash: `test -f` or `test -d`)
- class / interface / function: Grep `expected_path` (or project-wide if the path is approximate) for the declaration keyword (`class <name>`, `interface <name>`, `function <name>`, `def <name>`)
- test: check that the file exists AND contains at least one test attribute/decorator (`[Fact]`, `[Test]`, `@Test`, `def test_`, etc.)

Update each deliverable in the deliverables array with one of:

- `"status": "built"`
- `"status": "missing"`
- `"status": "partial"`

Example, before:
{ "id": "d1", "name": "FooService", "type": "class", "expected_path": "src/Foo.cs", "status": "pending" }
After verification (file exists, class declaration found):
{ "id": "d1", "name": "FooService", "type": "class", "expected_path": "src/Foo.cs", "status": "built" }
Then compute completion from the updated statuses:
- `completion.built` = count where `status == "built"`
- `completion.missing` = count where `status == "missing"` or `"partial"`
- `completion.total` = `deliverables.length`
- `completion.verdict`:
  - `"complete"` if `missing == 0`
  - `"partial"` if `missing > 0` AND `built >= 50%` of total
  - `"minimal"` if `built < 50%` of total

Write the full updated state.json with both the modified deliverables array and the completion summary.
Note: build.status remains "complete" regardless — the build itself didn't fail, it scoped down. The gap information flows to review and report.
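The completion arithmetic can be written out as a short function. The name `summarize_completion` is hypothetical; the counting and the 50% threshold follow the rules stated above.

```python
def summarize_completion(deliverables: list[dict]) -> dict:
    """Compute the completion summary from per-deliverable statuses."""
    total = len(deliverables)
    built = sum(1 for d in deliverables if d["status"] == "built")
    # "partial" deliverables count as missing for the summary.
    missing = sum(1 for d in deliverables if d["status"] in ("missing", "partial"))
    if missing == 0:
        verdict = "complete"
    elif built * 2 >= total:  # built >= 50% of total, without float rounding
        verdict = "partial"
    else:
        verdict = "minimal"
    return {"total": total, "built": built, "missing": missing, "verdict": verdict}
```

With 8 of 12 deliverables built and 4 missing, the verdict is `"partial"`; with 1 of 4 built, it is `"minimal"`.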
Skip if: config.arch_guard_detected is already true (set in Phase 2d) OR config.no_arch is set.
Fallback for cases where Phase 2d didn't enable arch-guard but the built project grew into a multi-module structure:
- Invoke `Skill("arch-guard:setup")`, set `config.arch_guard_detected = true`

If `config.arch_guard_detected` is true (either pre-existing or just created in 3c):

- Invoke `Skill("arch-guard:test-gen")` to generate architecture guard-rail tests (layer dependency, reference direction, etc.)
- Run the tests (`dotnet test`, `npm test`, etc.) to verify they pass against the current build
- Print: `[3/5] Architecture tests generated and passing`

This catches layer violations and missing integrations immediately after build, rather than leaving them as manual "Next Steps".

On build failure:

- Update state: `build.status = "failed"`, record error
- Print: `[3/5] Build failed. Use /autopilot --continue to retry (delegates to crew:go --continue).`

Phase 4: Review

- Print: `[4/5] Reviewing...`
- Update state: `review.status = "running"`, `phase = "review"`, increment `review.rounds`

Use the Agent tool: send a single message with up to 3 Agent calls:
Stream A - Codex Review (conditional):

- If the codex CLI is available (`which codex` returns 0): spawn an Agent that runs `codex -q "Review these changed files for bugs, security issues, and code quality: {file list}"` via the Bash tool

Stream B - Architecture Review (conditional):

- If `config.arch_guard_detected` is true (either pre-existing or auto-generated in 3c): spawn an Agent that runs `Skill("arch-guard:arch-check")` and `Skill("arch-guard:impl-review")` on the changed files

Stream C - CW Review:

- Spawn an Agent with the prompt `"/crew:review --all"` for functional, security, and quality review

Stream D - Completeness Review (conditional):

- If `state.json.completion.verdict != "complete"`:
  - Read `state.json.deliverables` where status == missing or partial
  - Classify each gap as one of:
    - `intentional_deferral`: stub or TODO exists, explicitly deferred
    - `oversight`: nothing exists, not mentioned anywhere
    - `partial`: file exists but implementation is incomplete
  - Write the results to `.autopilot/review-results.md` under a "## Completeness Gaps" section

When reviews complete, compare findings:

- … `.autopilot/review-results.md`
- … `"/crew:review --fix"` to auto-fix
- … `review.rounds`)

Write to `.autopilot/review-results.md` and print:
| Review Stream | Status | Score |
|-------------------|--------------------|---------------------|
| Codex Review | DONE / SKIPPED | — |
| Architecture | DONE / SKIPPED | B (82) |
| CW Review | DONE_WITH_CONCERNS | 2 minor |
| Completeness | DONE / SKIPPED | 8/12 (4 missing) |
| Cross-Model Check | DONE | 1 divergence |
Set review.status based on overall result. Print [4/5] Review complete.
Phase 5: Report

1. Print: `[5/5] Generating report...`
2. Update state: `report.status = "running"`, `phase = "report"`
3. Read:
   - `.autopilot/design-brief.md`
   - `.autopilot/review-results.md`
   - `.caw/research/<slug>/RESEARCH-REPORT.md` (if exists)
   - Output of `git diff --stat` for the file change summary
4. Write `.autopilot/REPORT.md`:

# Autopilot Report
## Topic
{topic}
## What Was Built
{summary from design-brief + component list}
## Architecture Score
{from arch-reviewer, or "N/A — arch-guard not active"}
## Review Results
{from review-results.md — issues found, fixed, remaining}
## Completeness
{table from state.json.deliverables: name | type | status (BUILT/MISSING/PARTIAL) | expected_path}
**Verdict**: {completion.verdict} ({completion.built}/{completion.total} deliverables built, {completion.missing} missing)
## Remaining Work
{if completion.missing > 0: numbered list of missing deliverables with name, type, path, and brief description of what needs to be implemented}
{if completion.missing == 0: "All designed deliverables were built."}
## Files Created/Modified
{git diff --stat output}
## Suggested Commit Message
{conventional commit message based on what was built}
If `completion.missing > 0`: also write `.autopilot/remaining-work.md` as a standalone file containing the Remaining Work section. This file is structured for use as input to gap-filling.

5. Update state: `report.status = "complete"`, `report.report_path = ".autopilot/REPORT.md"`
6. If `completion.missing > 0` AND this is NOT already a gap-fill round (check `state.gap_fill_round`, default 0):
   - Print: `[5/5] Gaps detected ({completion.missing} items) - auto gap-filling...`
   - Set `state.gap_fill_round = 1`
   - Resume from Phase 3 using `.autopilot/remaining-work.md` as the task plan source

   If `completion.missing == 0` OR `state.gap_fill_round >= 1` → proceed to step 7.
7. Update state: `phase = "complete"`

If `completion.verdict == "complete"` (or the deliverables array is empty, for backward compatibility):
---
SIGNAL: AUTOPILOT_COMPLETE
---
If completion.verdict == "partial" or "minimal":
---
SIGNAL: AUTOPILOT_COMPLETE_WITH_GAPS ({completion.built}/{completion.total} deliverables)
Remaining: .autopilot/remaining-work.md
Use /autopilot --continue to build remaining items.
---
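The gap-fill decision and the choice between the two closing signals can be sketched as two small functions. Both names (`should_gap_fill`, `final_signal`) are hypothetical, and raising on an unexpected verdict is an assumption; the source only describes the `complete`, `partial`, and `minimal` cases.

```python
def should_gap_fill(state: dict) -> bool:
    """True when gaps remain and no gap-fill round has run yet."""
    return state["completion"]["missing"] > 0 and state.get("gap_fill_round", 0) == 0


def final_signal(state: dict) -> str:
    """Pick the closing signal line from the completion verdict."""
    completion = state["completion"]
    # An empty or absent deliverables array is treated as complete (backward compat).
    if completion["verdict"] == "complete" or not state.get("deliverables"):
        return "SIGNAL: AUTOPILOT_COMPLETE"
    if completion["verdict"] in ("partial", "minimal"):
        return (
            "SIGNAL: AUTOPILOT_COMPLETE_WITH_GAPS "
            f"({completion['built']}/{completion['total']} deliverables)"
        )
    # Assumption: any other verdict at this point indicates a state bug.
    raise ValueError(f"unexpected verdict: {completion['verdict']}")
```

Because `gap_fill_round` is set to 1 before the build is resumed, `should_gap_fill` returns false on the second pass, which limits the pipeline to a single automatic gap-fill round.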
| Phase | On Failure | Recovery |
|---|---|---|
| Research | Log error, mark failed, stop | --continue retries, or --skip-research |
| Design | Preserve partial artifacts, stop | --continue retries with existing research |
| Build | crew:go has 5-level error recovery | --continue delegates to crew:go --continue |
| Review | Individual stream failure = skip that stream | Report notes which reviews ran |
| Report | Should not fail (read-only synthesis) | --continue retries |
Global: 3 consecutive failures on the same phase → suggest manual skill invocation for that phase.
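The global "3 consecutive failures" rule could be tracked with a small counter like the one below. The class `FailureTracker` is hypothetical and not part of the skill; resetting the count on success or when the failing phase changes is one reading of "consecutive failures on the same phase".

```python
class FailureTracker:
    """Counts consecutive failures per phase (hypothetical helper)."""

    def __init__(self, limit: int = 3) -> None:
        self.limit = limit
        self.phase = None
        self.count = 0

    def record(self, phase: str, succeeded: bool) -> bool:
        """Record one attempt; return True when the pipeline should stop."""
        if succeeded:
            # Any success breaks the streak.
            self.phase, self.count = None, 0
            return False
        if phase == self.phase:
            self.count += 1
        else:
            # A failure in a different phase starts a new streak.
            self.phase, self.count = phase, 1
        return self.count >= self.limit
```

When `record` returns true, the pipeline would stop and suggest invoking the skill for that phase manually.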
/autopilot "build a notification system"
AUTOPILOT started: "build a notification system"
[1/5] Researching... done
[2/5] Designing... done (user approved)
[3/5] Building... done (8 steps, 12 files)
[4/5] Reviewing... done (3 streams, 1 fix round)
[5/5] Generating report... done
Report: .autopilot/REPORT.md
---
SIGNAL: AUTOPILOT_COMPLETE
---
This skill has disable-model-invocation: true, which means the Skill tool cannot invoke it directly. This affects any context where a model needs to call /autopilot on behalf of a user (e.g., Telegram bots, remote chat interfaces, programmatic orchestration).
Workaround: Use the Agent tool to spawn a subagent that runs the skill:
Agent(prompt="/autopilot <topic> [flags]")
The subagent gets a fresh context, loads the skill content, and executes the full pipeline. This is the same pattern autopilot itself uses to invoke crew skills (crew:explore, crew:go, crew:review) which also have disable-model-invocation.
Will:
- Create the `.autopilot/` directory and all artifacts within it
- Maintain `.autopilot/state.json` for resume support
- Run `git diff --stat` for reporting

Won't:

- … `--from-plan` is used
- Modify files outside `.autopilot/` except through delegated skills

npx claudepluginhub jaebit/claudemate --plugin autopilot