From second-claude-code
Evolves prompt assets that repeatedly fail structural checks. Harvests the failing asset, requires a maintainer-authored regex check, then runs mutation-driven optimization on a loop branch.
How this skill is triggered — by the user, by Claude, or both
Slash command
/second-claude-code:evolveThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
> **Every evolve check is hand-authored by the maintainer — never by the optimized model class — and it scores structural conformance of the prompt asset, not runtime behavior. The maintainer's merge review of `winner.diff` is the final behavioral gate.**
Every evolve check is hand-authored by the maintainer — never by the optimized model class — and it scores structural conformance of the prompt asset, not runtime behavior. The maintainer's merge review of
winner.diffis the final behavioral gate.
recurrence ≥ 3). No provenance, no target.min_delta to the 0.02 floor; evolve refuses to run below it.winner.diff before merge.codex/loop-* branch; merging to main is a manual maintainer decision.The ouroboros maintainer loop. It closes the self-improvement ring: PDCA runs → Check gate fails → failure logged → evolve harvest surfaces the weak asset → the maintainer authors a structural check → the unmodified loop engine evolves the asset on an isolated branch → the maintainer merges → the next PDCA run starts stronger.
evolve is a target-selection + check-authoring layer on top of loop. It runs no optimizer of its own and relaxes no loop gate.
sources_min_3, min_two_reviewers, …) keeps failing across runs and the root cause is the producing prompt asset, not the individual run.The loop's mutations are five fixed text substitutions: should→must, can→must, a legacy command-prefix rewrite to the full /second-claude-code: namespace, trailing-whitespace trim, blank-line collapse. None add content. A candidate beats baseline only when the maintainer's regex flips fail→pass because a substitution fired — i.e. the asset still holds a should / can / legacy-prefix token the regex wants rewritten.
A check requiring new content the asset lacks (a missing heading, an added section) can never be satisfied by v1 mutations — every candidate stays byte-identical and the run ends min_delta_not_met. Those need manual editing or a future mutation set. Author assertions inside the promotable space.
| Command | Purpose |
|---|---|
list-failures [--min-recurrence N] [--asset PATH] | Scan real gate_fail events, surface assets that crossed the recurrence threshold. |
show-failure <id> | Inspect one provenance record + prior ratified checks for that asset. |
harvest <id> --assertion '<regex>[,<regex>]' [--target PATH] | Attach the maintainer-authored check and generate benchmarks/loop/evolve-<id>.json. |
run <name> | Shell out to the unmodified loop engine, then report any holdout regression (advisory). |
resume <run_id> | Delegate to loop-runner.mjs resume. |
node scripts/evolve-runner.mjs list-failures — which assets recur. Provenance only, never a check.node scripts/evolve-runner.mjs harvest <id> --assertion '/second-claude-code:' — a normalization check (a legacy short command prefix is rewritten to the full namespace). Writes a validated suite with the check inline (base64) so the worktree needs no extra file.scripts/evolve-scorer.mjs and the target asset, then node scripts/evolve-runner.mjs run evolve-<id> — runs the loop on an isolated branch; prior ratified checks ride along as weight-0 holdouts. (Refuses if either is missing from HEAD.).captures/loop-<runId>/winner.diff. Merge to main manually only if genuinely better.check_author: "maintainer" and refuses an empty --assertion, but cannot tell a human regex from a model one. The "never model-authored" guarantee rests on this skill plus your review — never let the agent invent the assertion.min_delta is pinned to 0.02; evolve run refuses below it. (Invoking loop-runner.mjs run evolve-* directly bypasses this — always use evolve run.)skills/**/SKILL.md, agents/*.md, commands/*.md, templates/*.md), restricted to a safe charset so they cannot carry shell metacharacters into the generated command.${CLAUDE_PLUGIN_DATA}/evolve/failures.jsonl (local-only)config/evolve-asset-map.jsonbenchmarks/loop/evolve-*.json.captures/loop-<run_id>/ (owned by the loop engine)prompt-detect.mjs routing. A self-mutating loop must never trigger autonomously, and is never daemon-scheduled.winner.diff.npx claudepluginhub unclejobs-ai/second-claude-code --plugin second-claude-codeRuns autonomous improvement loops: triages work, cycles through research/plan/implement/validate, and repeats until a kill switch or regression stops the run.
Evolves any measurable artifact (prompt, skill, code) through autonomous mutation-evaluate-gate loops. Supports GT case suites and scalar metric loops with automatic keep/revert.
Directs multi-cycle improvement campaigns by forming hypotheses, scouting before attacking, and extracting transferable patterns. Use for sustained autonomous quality advancement across sessions.