Skill

evolve

Run the full claude-evolve loop for a workspace (the omnibus). Drives evolution.csv through its cycle (code pending candidates, score them, ideate the next generation when the queue drains, repeat) as a self-respawning pool of background worker subagents, so the main conversation stays a clean dashboard. Use when the user says "run evolution", "evolve", "start the evolution run", "process the pending candidates", or wants the whole pipeline driven end to end. Equivalent to `claude-evolve run`: codex (GPT-5.5) codes each candidate first, the Sonnet worker judges the result and codes the candidate itself if codex falls short, the evaluator scores, and Opus ideates.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/claude-evolve:evolve [--working-dir DIR] [--max-workers N]

User invocable

Model invocable

Inline context

Default effort

Argument hint: [--working-dir DIR] [--max-workers N]

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

`/evolve` is the orchestrator. It runs the evolution loop the way `claude-evolve run` does, but with subagents instead of external CLIs:

SKILL.md

96 lines · ~1.8k tokens

Stats

Language: Python

Parent stars: 1

Maintenance: Good

Last Commit: Jun 14, 2026

evolve

/evolve is the orchestrator. It runs the evolution loop the way claude-evolve run does, but with subagents instead of external CLIs:

Code each pending candidate (codex/GPT-5.5 edits evolution_<id>.py to match its idea; the Sonnet worker judges the result and codes it itself if codex falls short).
Score it (run the workspace evaluator under the sandbox; record the number).
When no pending candidates remain, ideate the next generation (Opus at high effort, via the evolve-ideate skill).
Repeat until ideation can't make progress or the user stops it.

The design is stolen from the technical-lead /ship skill: this conversation is a re-spawn pool. The parent (this session) does almost nothing — it resolves setup once, launches a few background worker agents, and relaunches each one the instant it returns. All the noisy work (file reads, edits, evaluator output) happens inside the worker subagents, so the main thread stays a short, readable status feed. That isolation is the whole point of doing code+score in subagents.

Phase 0 — Setup (run once, inline)

$CLAUDE_PLUGIN_ROOT is set for this skill invocation, but background agents you spawn will not inherit it — so capture an absolute path now and bake it into every worker prompt.

PLUGIN_ROOT="$CLAUDE_PLUGIN_ROOT"
echo "PLUGIN_ROOT=$PLUGIN_ROOT"
python3 "$PLUGIN_ROOT/scripts/evolve_csv.py" --working-dir "<WORKING_DIR>" params
python3 "$PLUGIN_ROOT/scripts/evolve_csv.py" --working-dir "<WORKING_DIR>" cleanup
python3 "$PLUGIN_ROOT/scripts/evolve_csv.py" --working-dir "<WORKING_DIR>" ensure-baseline
python3 "$PLUGIN_ROOT/scripts/evolve_csv.py" --working-dir "<WORKING_DIR>" stats

If the user didn't give a workspace, omit --working-dir (auto-detects evolution/config.yaml or ./config.yaml). Resolve evolution_dir from params and use its absolute path everywhere below.
params gives you max_workers (override with --max-workers), worker_max_candidates (candidates per worker before it returns), auto_ideate, and min_completed_for_ideation.
State one setup line back to the user: workspace dir, worker count, and the current stats (pending / complete / failed).
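The four setup calls above differ only in their final subcommand, and `--working-dir` is dropped when the user gave no workspace. A minimal sketch of that pattern follows; the helper name `setup_commands` and the example paths are hypothetical, and only the script path and subcommand names come from this skill.

```python
from typing import List, Optional

# Subcommands in the order this skill runs them during Phase 0.
SETUP_STEPS = ["params", "cleanup", "ensure-baseline", "stats"]


def setup_commands(plugin_root: str, working_dir: Optional[str] = None) -> List[List[str]]:
    """Build one argv list per Phase 0 call (hypothetical helper, not part of the plugin)."""
    base = ["python3", f"{plugin_root}/scripts/evolve_csv.py"]
    if working_dir is not None:
        # Left out when the user gave no workspace, so the script auto-detects it.
        base += ["--working-dir", working_dir]
    return [base + [step] for step in SETUP_STEPS]


if __name__ == "__main__":
    # Placeholder paths for illustration only.
    for argv in setup_commands("/abs/path/to/plugin", "/abs/path/to/workspace"):
        print(" ".join(argv))
```

Building argument lists rather than one shell string avoids quoting problems when a workspace path contains spaces.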

Phase 1 — Launch the worker pool (one message, all background)

Set N = max(1, min(max_workers, pending_count)), so at least one worker always launches. Launch N worker Agents in a single message, each with:

subagent_type: "claude-evolve:coder" ← the plugin's coder agent (Sonnet, restricted tools); its definition holds the whole worker protocol — codex-first coding, judgment, fallback, scoring. Do not pass a model override.
run_in_background: true ← required, so the parent stays free and workers run concurrently
name: "evolve-worker-<i>" ← so a completion notification maps back
description: "evolve worker <i>"
prompt: just the per-run parameters (the protocol lives in the agent definition):

PLUGIN_ROOT: {PLUGIN_ROOT}
WORKING_DIR: {WORKING_DIR}
K: {K}
Process up to K candidates per your protocol, then return your one-line summary.

with {PLUGIN_ROOT}, {WORKING_DIR} (both absolute), and {K} (= worker_max_candidates) substituted.

End your turn after launching. The harness re-invokes you when a worker completes.
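The pool sizing and per-worker parameters described in this phase can be sketched as plain data. This is an illustration under stated assumptions, not the plugin's code: the function names are hypothetical, the dictionaries only mirror the field names listed above (they are not a real API call), and numbering workers from 1 is a guess.

```python
from typing import Dict, List


def worker_count(max_workers: int, pending_count: int) -> int:
    """N = min(max_workers, pending_count), but never fewer than 1."""
    return max(1, min(max_workers, pending_count))


def worker_prompt(plugin_root: str, working_dir: str, k: int) -> str:
    """Fill in the per-run parameters; both paths are expected to be absolute."""
    return (
        f"PLUGIN_ROOT: {plugin_root}\n"
        f"WORKING_DIR: {working_dir}\n"
        f"K: {k}\n"
        "Process up to K candidates per your protocol, then return your one-line summary."
    )


def launch_specs(
    plugin_root: str, working_dir: str, k: int, max_workers: int, pending_count: int
) -> List[Dict[str, object]]:
    """Describe the N background workers to launch in one message (hypothetical helper)."""
    n = worker_count(max_workers, pending_count)
    prompt = worker_prompt(plugin_root, working_dir, k)
    return [
        {
            "subagent_type": "claude-evolve:coder",
            "run_in_background": True,
            "name": f"evolve-worker-{i}",
            "description": f"evolve worker {i}",
            "prompt": prompt,
        }
        for i in range(1, n + 1)  # numbering from 1 is an assumption
    ]
```

Every worker receives the same prompt; only `name` and `description` differ, which is what lets a completion notification map back to a slot in the pool.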

Phase 2 — Re-spawn pool (the parent's only ongoing job)

Keep a tiny running tally in your replies (workers live, candidates completed this run, consecutive ideation no-ops). You hold no other state. On each worker completion:

Read its final line (cycled — … or drained — …). Surface a one-line summary to the user.
cycled (it hit the {K} cap with work still flowing): relaunch that worker — one new Agent, identical subagent_type/name/prompt, run_in_background: true. End the turn.
drained (it found no pending work): do not immediately relaunch it. Check whether the whole pool is now idle:
```
python3 "$PLUGIN_ROOT/scripts/evolve_csv.py" --working-dir "<WORKING_DIR>" stats
```
- If pending > 0 (a race — another worker is still mid-flight and will free more, or work remains): relaunch the drained worker. End the turn.
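- If pending == 0 and other workers are still running: do not relaunch; end the turn and wait for the next completion.
- If pending == 0 and all workers have returned drained (pool fully idle): go to Phase 3 (ideate).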
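The completion rules above reduce to a small decision function. The sketch below is one possible reading: the function names and the returned action strings are hypothetical, it assumes a worker's final line begins with the word `cycled` or `drained`, and the `wait` branch is this sketch's interpretation of the case where nothing is pending but other workers are still live.

```python
def worker_status(final_line: str) -> str:
    """Classify a worker's final line by its leading keyword."""
    line = final_line.strip()
    for status in ("cycled", "drained"):
        if line.startswith(status):
            return status
    raise ValueError(f"unrecognised worker summary: {final_line!r}")


def next_action(final_line: str, pending: int, other_workers_live: int) -> str:
    """Return 'relaunch', 'wait', or 'ideate' for a worker that just returned."""
    status = worker_status(final_line)
    if status == "cycled":
        return "relaunch"  # hit the K cap with work still flowing
    if pending > 0:
        return "relaunch"  # drained, but the stats call shows work remains
    if other_workers_live == 0:
        return "ideate"    # queue empty and the pool is fully idle
    return "wait"          # assumption: let the remaining workers finish first
```

Here `pending` is the count read from the `stats` call, and `other_workers_live` comes from the running tally the parent keeps in its replies.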

Phase 3 — Ideate the next generation

Only when the queue is fully drained and the pool is idle.

First reset any stragglers, then re-check:

python3 "$PLUGIN_ROOT/scripts/evolve_csv.py" --working-dir "<WORKING_DIR>" cleanup
python3 "$PLUGIN_ROOT/scripts/evolve_csv.py" --working-dir "<WORKING_DIR>" stats

If pending > 0 after cleanup (stuck candidates got reset), go back to Phase 1 and relaunch the pool.

Stop conditions — if any holds, the run is complete; tell the user the final leader (top-performers --n 1) and stop:
- auto_ideate is false (the workspace opts out of auto-ideation), or
- complete < min_completed_for_ideation (not enough completed candidates to learn from), or
- the previous ideation pass added 0 new ideas (evolution has converged — don't loop forever on empty ideation).
Otherwise run one ideation pass using the evolve-ideate skill for this workspace (it fans out the Opus strategy subagents and appends new pending rows). When it returns, note how many ideas it added.
- 0 added → record a consecutive no-op; if this is the 2nd in a row, stop as converged.
- ≥1 added → go back to Phase 1 and relaunch the worker pool for the new generation.
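The stop conditions and the no-op counter can be sketched as a small state object. The class, function names, and action strings are hypothetical. The text can be read as stopping after one empty ideation pass or after two in a row; this sketch assumes the latter and exposes the threshold as `max_noops` so either reading can be tried.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IdeationState:
    auto_ideate: bool
    complete: int
    min_completed_for_ideation: int
    consecutive_noops: int = 0


def stop_reason(state: IdeationState, max_noops: int = 2) -> Optional[str]:
    """Return why the run should stop, or None if an ideation pass may run."""
    if not state.auto_ideate:
        return "auto_ideate is false"
    if state.complete < state.min_completed_for_ideation:
        return "not enough completed candidates"
    if state.consecutive_noops >= max_noops:
        return "converged"
    return None


def after_ideation(state: IdeationState, ideas_added: int, max_noops: int = 2) -> str:
    """Update the no-op counter and pick the next step after one ideation pass."""
    if ideas_added >= 1:
        state.consecutive_noops = 0
        return "relaunch_pool"  # back to Phase 1 for the new generation
    state.consecutive_noops += 1
    # Assumption: below the threshold, re-enter Phase 3 and re-check the stop conditions.
    return "stop" if state.consecutive_noops >= max_noops else "recheck"
```

Setting `max_noops=1` gives the stricter reading, where a single empty pass ends the run.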

Reporting

Each turn, keep it to a few lines: which workers are live, what just completed (id → score / refusal), and pending/complete counts. The detail lives in the CSV and inside the subagents — the user is watching a dashboard, not a transcript. When the run ends, give the final leader and a one-paragraph summary (generations run, candidates completed, best score).

Honesty (claude-evolve's core rule)

A failing candidate is a real result. Record it (failed, failed-validation, failed-parent-missing) and move on — never fake a score, never weaken the evaluator, never "fix" a candidate just to make it pass.
If the evaluator or sandbox is broken (every candidate fails identically), stop and tell the user — don't grind through a whole generation of identical failures pretending it's progress.
Never edit algorithm.py, evaluator.py, or BRIEF.md to change outcomes. Workers only ever write evolution_<id>.py.
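One way to notice the "every candidate fails identically" case is to compare failure messages across the candidates scored so far. This is only a heuristic sketch: the function name, the `min_samples` threshold, and the idea that the parent has the failure messages to hand are all assumptions, not something this skill specifies.

```python
from typing import Sequence


def looks_systemic(failure_messages: Sequence[str], total_scored: int, min_samples: int = 3) -> bool:
    """True when every scored candidate failed and all failure messages are identical."""
    if total_scored < min_samples:
        return False  # too few results to tell a broken evaluator from bad luck
    if len(failure_messages) != total_scored:
        return False  # at least one candidate did not fail
    return len({message.strip() for message in failure_messages}) == 1
```

A `True` result is a cue to stop and tell the user, not a reason to alter the evaluator or any candidate.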
