From autobuild
Autonomous build iteration orchestrator. Runs structured improvement cycles with multi-agent review. Use when asked to iterate, improve, fix bugs, refactor, run GC, implement features, do quality improvement, run cleanup, or execute structured development cycles. Phases - research, hypothesis, plan, implement, test, review, record. 10 commands, 2 calls per phase.
Trigger: slash command
/autobuild:autobuild
Summary Claude sees in its skill listing (used to decide when to auto-load this skill):
Breaks improvement work into phases. Spawns agent panels per stage. Enforces quality via two independent gates. Use for iteration, quality improvement, bug fixes, refactors, GC, feature implementation.
Solves:
Phase instructions, agent definitions, and exit criteria live in resources/. Run orchestrate start to see what each phase requires.
Always run this single line BEFORE invoking orchestrate. No-op when the package is already importable; auto-installs when missing OR when a stale shim is on PATH but the package is uninstalled in the active Python:
python3 -c "import stellars_claude_code_plugins" 2>/dev/null || python3 -m pip install --user --upgrade stellars-claude-code-plugins
orchestrate CLI appears after install. Or run entrypoint directly:
python .claude/skills/autobuild/orchestrate.py
Both share the same engine. Never ask the user whether to install - just run the pre-flight line.
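The pre-flight check can also be expressed in Python. This is a minimal sketch, not part of the plugin: the module and distribution names are copied from the one-liner above, while `needs_install` and `install_command` are hypothetical helper names. Unlike the one-liner, `find_spec` only locates the module and does not import it, so it will not catch a package that is present but broken.

```python
from __future__ import annotations

import importlib.util
import sys

# Names taken from the pre-flight one-liner in the text.
MODULE_NAME = "stellars_claude_code_plugins"
DIST_NAME = "stellars-claude-code-plugins"


def needs_install(module_name: str = MODULE_NAME) -> bool:
    """Return True when the active Python cannot locate the module."""
    return importlib.util.find_spec(module_name) is None


def install_command(dist_name: str = DIST_NAME) -> list[str]:
    """Build the pip invocation the one-liner falls back to (not executed here)."""
    return [sys.executable, "-m", "pip", "install", "--user", "--upgrade", dist_name]
```

A caller would run `subprocess.run(install_command(), check=True)` only when `needs_install()` returns True, which mirrors the no-op behaviour described above.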
--benchmark "instruction"
--benchmark = generative instruction string, NOT a shell command. Typically references a file: --benchmark "Read MODEL_BENCHMARK.md, evaluate each [ ] item, mark [x] if passing, report remaining [ ] count as violation score."
Benchmark runs in TEST phase ONLY. IMPLEMENT and REVIEW MUST NOT evaluate benchmark.
--objective "Implement the program defined in PROGRAM.md (read .claude/skills/autobuild/PROGRAM.md)"
Use PROGRAM.md for complex objectives, *_BENCHMARK.md for checklists.
Complex multi-iteration objectives: define in PROGRAM.md, checklist in BENCHMARK.md, then run:
# Fixed number of iterations
orchestrate new --type full \
--objective "Implement the program defined in PROGRAM.md (read PROGRAM.md)" \
--iterations 3 \
--benchmark "Read BENCHMARK.md and evaluate each [ ] item. Mark [x] if passing. Report remaining [ ] count as violation score."
# Run until benchmark conditions are met (--iterations 0 = unlimited)
orchestrate new --type full \
--objective "Implement the program defined in PROGRAM.md (read PROGRAM.md)" \
--iterations 0 \
--benchmark "Read BENCHMARK.md and evaluate each [ ] item. Mark [x] if passing. Report remaining [ ] count as violation score."
Orchestrator reads structured objective at each phase. Benchmark tracks progress. TEST phase verifies each checklist item against codebase, reports violation count.
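To make the "violation score" concrete, here is a small sketch of the arithmetic only. In the skill itself the benchmark is a generative instruction that an agent evaluates, not a script; the function name `violation_score` and the sample checklist are illustrative assumptions.

```python
import re

# Matches an unchecked Markdown task item such as "- [ ] something".
UNCHECKED = re.compile(r"^\s*[-*]\s+\[ \]", re.MULTILINE)


def violation_score(checklist_text: str) -> int:
    """Count the checklist items still marked [ ]."""
    return len(UNCHECKED.findall(checklist_text))


# Hypothetical BENCHMARK.md content, for illustration only.
sample = """\
- [x] CLI parses all flags
- [ ] TEST phase reports a score
- [ ] README updated
"""
print(violation_score(sample))  # prints 2
```

A score of 0 means every item is marked [x].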
--iterations 0 runs until score = 0 (all conditions met). Safety cap: 20 iterations.
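The stopping rule can be sketched as a loop. This is an illustration under stated assumptions, not the orchestrator's code: `run_one` is a hypothetical callback that performs one iteration and returns its violation score, and the sketch assumes a fixed `--iterations N` always runs N cycles without stopping early.

```python
from __future__ import annotations

from typing import Callable

SAFETY_CAP = 20  # from the text: unlimited runs stop after 20 iterations


def run_iterations(run_one: Callable[[int], int], iterations: int) -> list[int]:
    """Run cycles and return the violation score recorded after each one."""
    scores: list[int] = []
    # iterations == 0 means "unlimited", bounded by the safety cap.
    limit = SAFETY_CAP if iterations == 0 else iterations
    for i in range(1, limit + 1):
        score = run_one(i)
        scores.append(score)
        # Only the unlimited mode stops as soon as the benchmark is clean.
        if iterations == 0 and score == 0:
            break
    return scores
```

For example, `run_iterations(lambda i: max(0, 3 - i), 0)` stops after the third cycle, while a callback that never reaches 0 is cut off at 20 cycles.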
new command flags
| Flag | Description |
|---|---|
| --type | full, gc, or hotfix (required) |
| --objective | Iteration goal (required) |
| --iterations | Cycles. 0 = unlimited until benchmark passes. Default: 1 |
| --benchmark | Generative instruction evaluated during TEST phase |
| --continue | Resume interrupted session |
| --restart | Restart current iteration (optionally update objective/benchmark/iterations) |
| --dry-run | Preview without executing |
| --record-instructions | Custom RECORD instructions (e.g. "Update .claude/JOURNAL.md, git add, commit, push"). Default: no journal/git unless code changed |
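The required flags and defaults in the table can be mirrored in a hypothetical argparse parser. This sketch is not the CLI's actual implementation; it covers only a subset of the flags and exists to show how the "required" and "Default: 1" entries behave.

```python
import argparse


def build_new_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring part of the flag table for `orchestrate new`."""
    parser = argparse.ArgumentParser(prog="orchestrate new")
    parser.add_argument("--type", choices=["full", "gc", "hotfix"], required=True)
    parser.add_argument("--objective", required=True)
    # 0 means unlimited until the benchmark passes; the table gives 1 as the default.
    parser.add_argument("--iterations", type=int, default=1)
    parser.add_argument("--benchmark", default=None)
    parser.add_argument("--dry-run", action="store_true")
    return parser


args = build_new_parser().parse_args(["--type", "gc", "--objective", "remove dead code"])
print(args.type, args.iterations, args.dry_run)  # prints: gc 1 False
```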
NO EXCEPTIONS. Every code change, file edit, commit MUST occur inside an orchestrator phase. Do NOT:
Bypassing gates = bypassing quality control.
Phase discipline:
Tempted to "just quickly fix something" outside? DON'T. Start a phase.
Every phase = 2-call pattern:
orchestrate new --type full --objective "improve X" --iterations 3
orchestrate start --understanding "I will spawn 3 research agents"
# ... do the work the CLI told you to do ...
orchestrate end --evidence "done" --agents "a,b,c" --output-file "path"
CLI guides each phase with instructions, agent definitions, exit criteria.
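When driving the 2-call pattern from a script, the two invocations can be built as argument lists. This is a minimal sketch: the subcommands and flags come from the examples above, while `start_cmd` and `end_cmd` are hypothetical helper names. Nothing is executed here.

```python
from __future__ import annotations

import shlex


def start_cmd(understanding: str) -> list[str]:
    """First call: enter the phase with a readback of the plan."""
    return ["orchestrate", "start", "--understanding", understanding]


def end_cmd(evidence: str, agents: list[str], output_file: str) -> list[str]:
    """Second call: close the phase with evidence, agents used, and output file."""
    return [
        "orchestrate", "end",
        "--evidence", evidence,
        "--agents", ",".join(agents),
        "--output-file", output_file,
    ]


print(shlex.join(start_cmd("I will spawn 3 research agents")))
# prints: orchestrate start --understanding 'I will spawn 3 research agents'
```

Passing the lists to `subprocess.run(..., check=True)` avoids shell quoting problems in the free-text arguments.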
FULLY AUTONOMOUS - NO HUMAN IN LOOP:
Human role:
Everything else autonomous:
end. No pause. No questions.
Gates = quality control, not human review. Trust the gates.
| Type | Use when |
|---|---|
| full | Feature work, improvements, research-driven changes |
| gc | Cleanup, dead code, refactoring |
| hotfix | Targeted bug fix, minimal ceremony |
Inject context into any phase. User says "focus on X" → use context command. Stores as banner in phase instructions, broadcast to all agents spawned after:
orchestrate context --phase RESEARCH --message "focus on connector routing"
10 commands total. Run orchestrate --help or orchestrate <command> --help for details.
# start a 3-iteration improvement cycle
orchestrate new --type full --objective "fix connector routing" --iterations 3
# with benchmark tracking (generative instruction, evaluated during TEST phase)
orchestrate new --type full --objective "improve D3 score" --iterations 5 --benchmark "Read MODEL_BENCHMARK.md and evaluate each [ ] item. Mark [x] if passing. Report count of remaining [ ] as violation score."
# enter phase - readback validates your understanding
orchestrate start --understanding "I will spawn 3 research agents to investigate D3 failures"
# complete phase - record what was done, which agents, and output file
orchestrate end --evidence "3 agents found rotation errors" --agents "researcher,architect,product_manager" --output-file ".autobuild/phase_01_research/findings.md"
# check progress
orchestrate status
# reviewer rejects - go back to IMPLEMENT
orchestrate reject --reason "guardian blocked: benchmark overfit detected"
# skip optional phase (--force for required phases, conservative gate)
orchestrate skip --reason "no iterations remaining" --force
# inject user guidance into current phase
orchestrate context --message "focus on label clearance not edge-snap"
# log a failure mode found during work
orchestrate log-failure --mode "FM-ROTATION" --desc "rotate(-90) instead of +90 for downward arrows"
# show failure log and hypothesis catalogue
orchestrate failures
orchestrate hypotheses
npx claudepluginhub stellarshenson/claude-code-plugins --plugin autobuild