From claude-cordyceps
Autonomous looped repo work. For each work unit (phase, bucket, item, whatever the user calls them): plan → peer-review council → revise (≤2 rounds) → implement → test → diff council → commit. Background-ping wake-up between iterations. Repeat until the user's roadmap is exhausted or a halt trigger fires. PERSISTENCE: Claude Code's task tool is the primary state — work units and per-unit grinder steps are TaskCreate'd at run start, TaskUpdate'd on every transition. Tasks survive auto-compact, so wake-up reads the task list to find where it left off. ACTIVATE on: "grind the backlog", "grind through this", "auto-grind <X>", "run the grinder on <X>", "go autonomous on <X>", "continue with the <buckets|phases|items>", "use the auto repo grinder". REQUIRES: `cordy` on PATH and at least 2 peer drivers reachable. Refuses to start with fewer drivers — single-peer "council" defeats the diversity benefit. Auto mode (or equivalent permission-bypass) must be active; the loop cannot pause for permission prompts. NOT for one-shot tasks. If the user wants a single feature shipped, just ship it. The grinder is for "here is a roadmap, go through it for hours, wake me when done or stuck".
How this skill is triggered — by the user, by Claude, or both
Slash command
/claude-cordyceps:grinderThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Self-driving repo workflow. Plan a work unit, council the plan, revise,
Self-driving repo workflow. Plan a work unit, council the plan, revise, implement, test, council the staged diff, commit. Repeat. Wake-up between iterations is driven by background-task completion notifications. State survives auto-compact via Claude Code's task tool.
The grinder is opinionated by design — it locks in defaults so the runner doesn't make new judgment calls every loop. The interesting decisions happen at planning and severity gating; everything else is mechanics.
Companion skill: cordy — how to drive cordy councils. The grinder
uses cordy idioms; don't re-explain them here.
PHASES.md,
ROADMAP.md, BACKLOG.md), an inline list in the trigger message, a
"what's next" recommendation from a prior turn, a paragraph of prose,
a single goal that decomposes naturally. The grinder reads what's
given and serializes it into work units.Use what's already there. If the prior turn (or a referenced file) already articulates the work units — a 3-bucket recommendation, phases A through F, a numbered punch-list — serialize those units as-is. Don't re-decompose. The user did the planning; your job is to execute.
If the work is genuinely vague ("clean up the codebase"), ask once for specificity, then proceed.
Before the first wake ping fires, do this in order:
cordy daemon start 2>&1 || true # idempotent
cordy --ephemeral drivers --json
Count drivers with probe.available: true. Fewer than 2 → halt with
STATUS.md. Note which drivers are up; that's the default panel.
git rev-parse HEAD # baselineSha
<discovered test command> # baseline test counts
See "Tests pass = ?" below for runner discovery. Capture
{passed, failed, skipped} and the failure set as knownFailures
(those are pre-existing, not regressions).
One parent task per work unit. Name it with the unit identifier as the user described it — "Phase B: resource pools become ruleset-driven", "Bucket 1: build cyberpunk ruleset", etc. Use the user's vocabulary (phase, bucket, item, milestone), not the skill's.
Sub-tasks per work unit, in execution order:
draft planplan councilrevise plan (only if council surfaces findings)implementtestdiff councilrevise impl (only if council surfaces findings)commitMark all as pending. Only the first sub-task of the first work unit will go in_progress immediately.
Example shape (matching what the user actually sees in the task panel):
◻ Phase B: resource pools become ruleset-driven
◻ B: draft plan
◻ B: plan council
◻ B: implement
◻ B: test
◻ B: diff council
◻ B: commit
◻ Phase C: combat formulas become hookable
◻ C: draft plan
...
Some state can't fit in tasks — baseline info, accumulated changelog
entries, deferred findings, halt reports. These live under
<repo>/.cordyceps/grinder/:
<repo>/.cordyceps/grinder/
baseline.json — {baselineSha, baselineTests, panel,
droppedDrivers, consecutiveFailures}
changelog-pending.md — accumulated entries for end-of-run release
DEFERRED.md — deferred low-severity findings
STATUS.md — written on halt only
Don't create files preemptively — write them as the run produces
content. STATUS.md should be absent during a normal run.
If <repo>/.gitignore exists and doesn't contain .cordyceps/, append
it. Don't create a .gitignore if absent (project may have
intentionally omitted it).
Now you're in the loop.
TaskList()
Decision tree:
| Condition | Action |
|---|---|
| No tasks exist | Fresh run. Do initial setup. |
| Tasks exist, none in_progress, pending sub-tasks remain | Pick the next pending sub-task in order, mark in_progress, resume. |
| One sub-task in_progress | Resume from that exact step. Read auxiliary state (baseline.json, working tree diff vs baselineSha) to figure out where within the step you were. |
| All tasks completed | End-of-run release commit (Step 11), then halt. |
Crash recovery: working tree dirty AND in_progress sub-task doesn't
match the diff. Run git diff <baselineSha>; if the diff matches what
the recorded sub-task would produce, attempt to resume. If ambiguous,
write STATUS.md "ambiguous recovery — needs human" and halt.
┌─ for each work unit (parent task) ──────────────────────────┐
│ │
│ 1. baseline → SHA + test counts (per-unit) │
│ 2. draft plan → plan content; mark sub done │
│ 3. plan council → cordy council, background │
│ 4. revise plan (≤2) → integrate findings │
│ 5. implement → edit files; no scope creep │
│ 6. test → vs baseline; new failures→revert│
│ 7. diff council → cordy diff --staged, background │
│ 8. revise impl (≤2) → fix critical/sec/medium/quick │
│ 9. commit → conventional msg; no attribution│
│ 10. wake ping → background sleep+echo │
│ 11. mark unit complete → next work unit's tasks begin │
│ │
└─────────────────────────────────────────────────────────────┘
TaskUpdate to in_progress when starting each sub-task; completed
when finishing. Status changes are the breadcrumbs that let wake-up
recover.
Each work unit gets a fresh baseline (the previous unit's commit becomes
the new baseline). Update baseline.json:
git rev-parse HEAD # new baselineSha
<test command> # may differ from initial baseline
This gates regression detection. New failures vs. this unit's baseline = regression caused by this unit.
Mark draft plan sub-task in_progress.
For substantive work units, the plan structure that produces good council critique:
For trivial units (one-line fixes, doc-only changes), skip the template; a paragraph in the task description is enough.
Write to a tempfile when councilling:
cat > /tmp/grinder-${UNIT_ID}-plan.md <<'PLAN'
...
PLAN
Mark draft plan completed; mark plan council in_progress.
Council the plan via cordy. Use the cordy skill's host-as-chair pattern:
cordy council review /tmp/grinder-${UNIT_ID}-plan.md \
--panel codex,gemini,claude:opus \
--no-chair --timeout 600 --json \
> /tmp/grinder-${UNIT_ID}-plan-council.json 2>&1
Run with the Bash tool's run_in_background: true. The wake
notification fires when the council finishes — don't poll, don't
Monitor-stream (cordy buffers per-reviewer output until all reviewers
finish; there are no intermediate frames).
While the council runs in background, you can do useful prep
(read related files, draft scaffolding for the impl). When the
notification lands, read reviews[]. You're the chair (host-as-chair).
Mark plan council completed.
Council findings are advisory. Rules:
| Severity | Action |
|---|---|
block-ship | Address (incorporate) OR override with one-line technical rationale. No process attribution. |
critical, security | Same as block-ship. |
medium, quick-win | Address if cost is low; otherwise defer to DEFERRED.md. |
style, maintainability | Your call. Override silently if you disagree. |
If no findings worth addressing → skip revising; mark revise plan as
completed (with a brief note) and proceed.
If findings warrant changes:
revise plan in_progressrevise plan completed.If round 2 surfaces fundamental disagreement (a reviewer flags the unit as wrong-direction with substantive reasoning you can't refute) → write STATUS.md and halt. Don't bulldoze structural objections.
Mark implement in_progress.
Edit only files named in the plan's "Files in scope" + their direct imports. "While I'm here" cleanup is forbidden — backlog it to DEFERRED.md.
If implementation reveals a need for files outside scope:
Mark implement completed.
Mark test in_progress. Run the test command. Compare to baseline:
| Result | Action |
|---|---|
Same failed count, same failure set as baseline | Pass — proceed. |
New failures (not in baselineTests.knownFailures) | Regression. git restore <files>, retry implementation. |
| Test command itself fails to run | Halt with STATUS.md. Tests are a prerequisite. |
| 2 retries, regression persists | Mark unit blocked, increment consecutiveFailures, continue to next unit. |
For long test suites (5+ minutes): run with run_in_background: true,
do other prep, wake when tests finish.
Mark test completed.
Mark diff council in_progress. Stage exactly what's intended for
this commit:
git add <files-from-plan>
git status # confirm staged matches intent
git diff --staged --stat # double-check what reviewers will see
Then run the diff council on staged-only:
cordy council diff --staged \
--panel codex,gemini,claude:opus \
--no-chair --timeout 600 --json \
> /tmp/grinder-${UNIT_ID}-diff-council.json 2>&1
Background + wake-on-completion.
If unstaged changes exist post-implementation that weren't staged — cruft escaped scope — halt and surface. The grinder shouldn't silently leave or revert unexplained changes.
Mark diff council completed.
Severity gate for the diff council:
| Severity | Action |
|---|---|
critical, security | MUST address before commit. Edit, re-stage, re-run diff council (counts as a revision round). |
medium, quick-win | Address — cost-benefit favorable. |
low, info | Defer to DEFERRED.md. |
style | Defer or accept silently. |
If nothing to revise → skip; mark sub-task completed, proceed.
After 2 revision rounds with critical/security findings still unaddressed → halt and surface.
Mark commit in_progress. Per CLAUDE.md commit hygiene:
<type>(<scope>): <subject> under ~70 charsgit show --stat HEAD to verify only intended files
landed. If extra files appeared → git reset HEAD~1, redo cleanly.Append a one-line entry to changelog-pending.md:
- fix(council): pin reviewer mode at spawn (commit abc1234)
- feat(rulesets): add ruleset registry plugin (commit def5678)
Mark commit completed.
Last action before "ending" the loop iteration:
sleep 1 && echo "lazarus"
…run as a Bash tool background task (run_in_background: true). The
completion notification wakes the next loop iteration.
This pattern survives auto-compact: the OS process is independent of the context window, the notification lands, the new context reads the task list and resumes.
If you launched a long test suite or council in background and it's still running when you'd otherwise fire a ping, don't fire a separate ping — the existing background task's completion is itself a wake signal.
TaskUpdate the parent work-unit task to completed. The next work
unit's first sub-task (draft plan) becomes the next pending item;
the next wake will pick it up automatically.
If this was the last work unit, do the end-of-run release commit:
changelog-pending.md.package.json → versionCargo.toml → [package] versionpyproject.toml → [project] versionplugin.json + marketplace.json (Claude Code plugins) → bothCHANGELOG.md under a new
## [X.Y.Z] — YYYY-MM-DD section, ordered by Keep-a-Changelog
category.changelog-pending.md.STATUS.md summarizing the run (units done, deferred
count, commits made, tests passing).Do NOT push, tag, or release in the loop. Those are explicit human operations.
The runner reads council output, decides. The council has no veto.
Discover the test runner from the manifest, in this order:
| Manifest signal | Command |
|---|---|
package.json scripts.test | npm test (or pnpm test / yarn test if lockfile indicates) |
Makefile with test target | make test |
pyproject.toml (look for pytest, tox, nox config) | the configured runner |
Cargo.toml | cargo test |
go.mod | go test ./... |
.github/workflows/*.yml test job | mirror the CI command |
| None of the above | halt; ask user to specify |
Pass = exit 0 + no new failures vs baseline + lint/build clean (run the project's lint and build commands too if they exist; same discovery pattern).
Cap revision rounds at 2 per gate (plan council, diff council). After round 2:
Don't loop indefinitely chasing reviewer agreement; reviewers from different families won't fully converge on subjective findings.
Commits, code comments, CHANGELOG, PR bodies. No "per council", no "C1/R2", no "after review", no "spawned panel found X", no "chair recommendation", no "round-2 convergence". Describe the technical what changed and why. Comments only when logic is non-obvious.
This restates a global rule because the grinder is especially prone to slipping into review-stage shorthand — every unit ends in a council, and the temptation to write "fixed per chair" is strong. Resist.
Auto mode normally prompts for destructive operations. The grinder's loop cannot pause for prompts. Inside the per-unit loop, these are pre-authorized:
rm of git-tracked files (recoverable via git restore)git reset --hard <baselineSha> to revert a failed implementationNOT auto-passed — still halt and write STATUS.md:
rm -rf of untracked filesmain / master / production branchesImplementation may touch only files named in the plan's "Files in scope" + their direct imports. Cleanup outside scope → backlog to DEFERRED.md. Resist scope creep.
Clean up /tmp/grinder-* and /tmp/council-* at end of each work
unit. Over a 12-hour run these accumulate.
Don't proactively manage. If hit, the grinder's git commit will
fail; squash later as a release-prep task. Not worth interrupting the
loop for.
Write <repo>/.cordyceps/grinder/STATUS.md and stop the loop (no
wake ping) on:
| Trigger | What goes in STATUS.md |
|---|---|
| Destructive op outside the auto-pass list | What was attempted, why blocked |
| Plan-council unanimous block-ship after 2 rounds | Reviewer findings + runner's analysis |
| Diff-council critical/security findings unaddressable after 2 rounds | Same |
| Test infrastructure missing or unrecognized | What was tried + ask user to specify |
| 3 consecutive work-unit failures | Unit IDs + failure modes |
| Driver count drops below 2 mid-run | Which drivers dropped + why |
| Goal-impossible discovery during implementation | What was discovered + suggested next direction |
| Cordy daemon won't start after one restart attempt | Daemon error + suggested action |
| Ambiguous crash-recovery state | Diff vs baseline + what step was last recorded |
STATUS.md format:
# Grinder halted: <YYYY-MM-DD HH:MM>
## Trigger
<which trigger above>
## Context
<what work unit, what step, what was happening>
## What's complete
<commits made, units done>
## What's deferred (DEFERRED.md)
<count + categories>
## What's needed from you
<one specific question or decision>
cordy daemon start 2>&1 || true # idempotent
DRIVERS=$(cordy --ephemeral drivers --json)
AVAILABLE=$(echo "$DRIVERS" | jq -r '[.[] | select(.probe.available)] | length')
[ "$AVAILABLE" -lt 2 ] && halt
If a panel member errored or parse-failed in the previous unit, drop
it from the default panel for the rest of the run. Don't keep retrying
a broken driver. Track in baseline.json.droppedDrivers[] with the
reason.
If the daemon failed to start, retry once. Second failure → halt with STATUS.md.
Council reviews on real files take 5–15 minutes. Cordy buffers per-reviewer output until all reviewers finish, then writes one JSON blob in one shot.
run_in_background: true.tail / wc -c poll — output is empty until the slowest
reviewer returns.If a council hasn't notified in ≳1.5× the panel's max --timeout,
something is genuinely stuck. Investigate, don't blindly retry.
# Deferred findings — generated by grinder
## Work unit: B (resource pools)
### style: Magic number 30000 in chunkByLines
Could be named MAX_CHUNK_BYTES (already defined elsewhere — minor
refactor).
### info: Comment in src/foo.ts:42 mentions deprecated API
Outdated reference; fix on next maintenance pass.
## Work unit: C (combat formulas)
...
The user reviews this at end of run and can spawn follow-up work or discard.
git show --stat HEAD.npx claudepluginhub ellyseum/claude-cordyceps --plugin claude-cordycepsProvides UI/UX resources: 50+ styles, color palettes, font pairings, guidelines, charts for web/mobile across React, Next.js, Vue, Svelte, Tailwind, React Native, Flutter. Aids planning, building, reviewing interfaces.
Fetches up-to-date documentation from Context7 for libraries and frameworks like React, Next.js, Prisma. Use for setup questions, API references, and code examples.