From claude-evolve
Run the full claude-evolve loop for a workspace — the omnibus. Drives evolution.csv through its cycle (code pending candidates, score them, ideate the next generation when the queue drains, repeat) as a self-respawning pool of background worker subagents, so the main conversation stays a clean dashboard. Use when the user says "run evolution", "evolve", "start the evolution run", "process the pending candidates", or wants the whole pipeline driven end to end. Equivalent to `claude-evolve run`: codex (GPT-5.5) codes each candidate first with the Sonnet worker judging the result and falling back to coding it itself, the evaluator scores, Opus ideates.
How this skill is triggered — by the user, by Claude, or both
Slash command
/claude-evolve:evolve [--working-dir DIR] [--max-workers N][--working-dir DIR] [--max-workers N]The summary Claude sees in its skill listing — used to decide when to auto-load this skill
`/evolve` is the orchestrator. It runs the evolution loop the way `claude-evolve run` does, but with subagents instead of external CLIs:
/evolve is the orchestrator. It runs the evolution loop the way claude-evolve run does, but with subagents instead of external CLIs:
pending candidate (codex/GPT-5.5 edits evolution_<id>.py to match its idea; the Sonnet worker judges the result and codes it itself if codex falls short).pending candidates remain, ideate the next generation (Opus at high effort, via the evolve-ideate skill).The design is stolen from the technical-lead /ship skill: this conversation is a re-spawn pool. The parent (this session) does almost nothing — it resolves setup once, launches a few background worker agents, and relaunches each one the instant it returns. All the noisy work (file reads, edits, evaluator output) happens inside the worker subagents, so the main thread stays a short, readable status feed. That isolation is the whole point of doing code+score in subagents.
$CLAUDE_PLUGIN_ROOT is set for this skill invocation, but background agents you spawn will not inherit it — so capture an absolute path now and bake it into every worker prompt.
PLUGIN_ROOT="$CLAUDE_PLUGIN_ROOT"
echo "PLUGIN_ROOT=$PLUGIN_ROOT"
python3 "$PLUGIN_ROOT/scripts/evolve_csv.py" --working-dir "<WORKING_DIR>" params
python3 "$PLUGIN_ROOT/scripts/evolve_csv.py" --working-dir "<WORKING_DIR>" cleanup
python3 "$PLUGIN_ROOT/scripts/evolve_csv.py" --working-dir "<WORKING_DIR>" ensure-baseline
python3 "$PLUGIN_ROOT/scripts/evolve_csv.py" --working-dir "<WORKING_DIR>" stats
--working-dir (auto-detects evolution/config.yaml or ./config.yaml). Resolve evolution_dir from params and use its absolute path everywhere below.params gives you max_workers (override with --max-workers), worker_max_candidates (candidates per worker before it returns), auto_ideate, and min_completed_for_ideation.stats (pending / complete / failed).Decide N = min(max_workers, pending_count) but at least 1. Launch N worker Agents in a single message, each:
subagent_type: "claude-evolve:coder" ← the plugin's coder agent (Sonnet, restricted tools); its definition holds the whole worker protocol — codex-first coding, judgment, fallback, scoring. Do not pass a model override.run_in_background: true ← required, so the parent stays free and workers run concurrentlyname: "evolve-worker-<i>" ← so a completion notification maps backdescription: "evolve worker <i>"prompt: just the per-run parameters (the protocol lives in the agent definition):PLUGIN_ROOT: {PLUGIN_ROOT}
WORKING_DIR: {WORKING_DIR}
K: {K}
Process up to K candidates per your protocol, then return your one-line summary.
with {PLUGIN_ROOT}, {WORKING_DIR} (both absolute), and {K} (= worker_max_candidates) substituted.
End your turn after launching. The harness re-invokes you when a worker completes.
Keep a tiny running tally in your replies (workers live, candidates completed this run, consecutive ideation no-ops). You hold no other state. On each worker completion:
cycled — … or drained — …). Surface a one-line summary to the user.cycled (it hit the {K} cap with work still flowing): relaunch that worker — one new Agent, identical subagent_type/name/prompt, run_in_background: true. End the turn.drained (it found no pending work): do not immediately relaunch it. Check whether the whole pool is now idle:
python3 "$PLUGIN_ROOT/scripts/evolve_csv.py" --working-dir "<WORKING_DIR>" stats
pending > 0 (a race — another worker is still mid-flight and will free more, or work remains): relaunch the drained worker. End the turn.pending == 0 and all workers have returned drained (pool fully idle): go to Phase 3 (ideate).Only when the queue is fully drained and the pool is idle.
python3 "$PLUGIN_ROOT/scripts/evolve_csv.py" --working-dir "<WORKING_DIR>" cleanup
python3 "$PLUGIN_ROOT/scripts/evolve_csv.py" --working-dir "<WORKING_DIR>" stats
If pending > 0 after cleanup (stuck candidates got reset), go back to Phase 1 and relaunch the pool.top-performers --n 1) and stop:
auto_ideate is false (the workspace opts out of auto-ideation), orcomplete < min_completed_for_ideation (not enough completed candidates to learn from), orpending rows). When it returns, note how many ideas it added.
Each turn, keep it to a few lines: which workers are live, what just completed (id → score / refusal), and pending/complete counts. The detail lives in the CSV and inside the subagents — the user is watching a dashboard, not a transcript. When the run ends, give the final leader and a one-paragraph summary (generations run, candidates completed, best score).
failed, failed-validation, failed-parent-missing) and move on — never fake a score, never weaken the evaluator, never "fix" a candidate just to make it pass.algorithm.py, evaluator.py, or BRIEF.md to change outcomes. Workers only ever write evolution_<id>.py.npx claudepluginhub willer/claude-evolve --plugin claude-evolveProvides behavioral guidelines to reduce common LLM coding mistakes, focusing on simplicity, surgical changes, assumption surfacing, and verifiable success criteria.
Searches, retrieves, and installs Agent Skills from prompts.chat registry using MCP tools like search_skills and get_skill. Activates for finding skills, browsing catalogs, or extending Claude.
Creates, edits, and optimizes skills for Claude Code, including drafting, evaluating with test prompts, iterating on performance, and improving skill descriptions for better triggering accuracy.