Enforce a specification-driven development loop where all code, tests, and fixes are traced back to a single SPEC.md file, with automated drift detection, adversarial spec review, and root-cause backpropagation from bugs into the spec.
Plan-then-execute against SPEC.md. Native Claude Code loop, no sub-agents.
Drift detector. Diff SPEC.md against code. Read-only, zero writes.
Spare-budget design pass. Make one shallow module deep — smaller interface, behavior held, tests green before & after.
Sharpen a fuzzy idea into §G/§C before spec. Calibrated interrogation, one question at a time.
Gather external knowledge into §R so build grounds in facts, not hallucinations. Every finding cites a source.
Bug → spec protocol. When a bug is found or a test fails, trace the cause, decide whether a new §V invariant would catch recurrence, append to §B. This is the one non-obvious thing SDD does that plan-then-execute doesn't. Triggers on test failure, bug report, post-mortem, or explicit user ask.
Plan-then-execute implementation against SPEC.md. Native single-thread loop, no sub-agents. On test or build failure, auto-invokes the backprop skill before retrying — a failed verification always considers whether a new §V invariant would prevent recurrence. Triggers when the user asks to build, implement, execute the spec, or tackle a specific §T task (`build §T.3`, `build --next`, `implement next task`, `run the build`). Expects SPEC.md to exist; if not, defers to the spec skill.
Caveman encoding for SPEC.md and spec-adjacent writes. Loaded by /spec, /build, /check. Cuts tokens ~75% vs prose while staying precise. Triggers on any write to SPEC.md or when user says "caveman", "compress this", "be brief".
Read-only drift detector. Diffs SPEC.md against current code and reports violations grouped by severity. Writes nothing: it suggests remedies via the spec or build skills but never invokes them. Triggers when the user asks to check drift, audit the spec, verify invariants, or asks whether code still matches the spec. Phrasings: "check drift", "audit the spec", "does the code still match §V", "check invariants", "spec vs code".
Optional design-improvement pass for when you have spare usage to drain. Finds the shallowest modules in the code the spec touches, researches a deeper design, and proposes refactors that shrink interfaces and hide decisions — behavior held constant, tests green before and after. Proposes §I/§V/§T edits, never silent rewrites. Triggers when the user says "deepen this", "improve the design", "this module feels shallow", "pull complexity down", "use spare budget on the codebase", or invokes /ck:deepen. Leans on the codebase-design skill's deep-module vocabulary when present.
compressed spec-driven development for claude code
one file · one loop · zero sub-agents
Plan-then-execute forgets. SDD remembers — but most SDD frameworks bury that value under agent swarms, dashboards, and ceremony that costs more tokens than it saves.
Cavekit is the simplest full loop: grill → spec → research → review →
build, over one SPEC.md file, no sub-agents. Three commands you run
every time; four more you reach for only when the change earns it.
The spine is three properties that earn their tokens:
SPEC.md at repo root survives context resets. It is
the agent's long-term memory: lose the window, reload the spec, keep going.

… §B entry; classes
of bug become §V invariants the spec never forgets.

And one rule that keeps it from bloating into the frameworks it replaces:
right-size. A one-line fix is just /build. The full chain is for
genuinely uncertain or high-blast-radius work — never for a typo.
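
As a rough illustration of the bug-to-spec idea above (a bug becomes a §B entry, and a recurring class of bug is tied to a §V invariant), here is a minimal Python sketch. The column names, the `BugEntry` type, the `append_bug` helper, and the `## §B` heading style are all hypothetical; the real schema is whatever FORMAT.md defines. The sketch also assumes §B is the last section in the file.

```python
from dataclasses import dataclass


@dataclass
class BugEntry:
    """One row of the §B pipe table. Column names are hypothetical."""
    bug_id: str
    cause: str
    invariant: str  # id of the §V invariant guarding against recurrence, or "-"


def append_bug(spec_text: str, entry: BugEntry) -> str:
    """Return spec_text with one row appended to the §B table.

    Assumes §B is the final section, so appending a pipe-delimited
    line at the end of the file extends that table.
    """
    row = f"| {entry.bug_id} | {entry.cause} | {entry.invariant} |"
    return spec_text.rstrip("\n") + "\n" + row + "\n"


# Toy spec text; not the real FORMAT.md layout.
spec = (
    "## §V\n"
    "V1: cart total never negative\n"
    "## §B\n"
    "| id | cause | invariant |\n"
    "|---|---|---|\n"
)
updated = append_bug(spec, BugEntry("B1", "discount applied twice", "V1"))
print(updated.splitlines()[-1])
```

The function returns new text instead of writing a file, which keeps the sketch compatible with the rule that only the spec skill mutates SPEC.md.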
the loop — run these every time:
| cmd | job |
|---|---|
| `/ck:spec` | create / amend / backprop SPEC.md. Sole mutator. |
| `/ck:build` | native plan → execute against spec. Names which test proves each §V. Auto-backprops on failure. |
| `/ck:check` | read-only drift report. Lists §V / §I / §T violations. The drift detector. |
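
To picture what a read-only drift report grouped by severity might contain, here is a small Python sketch. It is not how `/ck:check` is implemented: the `Violation` fields, the item ids, and the severity names (`high`, `medium`, `low`) are assumptions made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Violation(NamedTuple):
    section: str   # "§V", "§I", or "§T"
    ref: str       # hypothetical item id, e.g. "V2"
    severity: str  # hypothetical levels: "high", "medium", "low"
    note: str


def group_by_severity(violations):
    """Group violations by severity. Pure function: no file writes."""
    report = defaultdict(list)
    for v in violations:
        report[v.severity].append(v)
    return dict(report)


# Made-up findings, only to exercise the grouping.
found = [
    Violation("§V", "V2", "high", "no test proves this invariant"),
    Violation("§T", "T3", "low", "task marked done, code missing"),
    Violation("§I", "I1", "high", "signature differs from spec"),
]
report = group_by_severity(found)
for level in ("high", "medium", "low"):
    for v in report.get(level, []):
        print(f"{level}: {v.section} {v.ref} - {v.note}")
```

Nothing in the sketch touches SPEC.md or the code, mirroring the "writes nothing" property of the drift detector.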
reach for these — only when the change earns the ceremony:
| cmd | job |
|---|---|
| `/ck:grill` | interrogate a fuzzy idea into a sharp §G/§C, one question at a time, before you spec. |
| `/ck:research` | gather external knowledge into §R so build grounds in facts, not hallucinations. Every finding cites a source. |
| `/ck:review` | adversarial senior review of the spec before build. Refutes, hardens §V, ends in a go/no-go gate. |
| `/ck:deepen` | spare-budget design pass: make one shallow module deep. Behavior held, tests green before & after. |
One line, via the skills CLI:
npx skills add JuliusBrussee/cavekit
Installs nine skills into ~/.claude/skills/: spec, build, check
(the loop), grill, research, review, deepen (reach-for), plus
caveman and backprop (the utilities). Claude activates each when its
trigger context matches — e.g. "write a spec for…" invokes spec, a fuzzy
idea invokes grill, a risky change before build invokes review. Claude
Code picks them up on next launch.
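
To confirm the install landed, one option is a short Python check for the nine skill names listed above. This is a sketch under one assumption: that each skill is installed as a directory named after the skill inside the skills folder. The `missing_skills` helper is hypothetical and not part of Cavekit.

```python
from pathlib import Path

# The nine skill names listed above.
EXPECTED_SKILLS = [
    "spec", "build", "check",
    "grill", "research", "review", "deepen",
    "caveman", "backprop",
]


def missing_skills(skills_dir: Path) -> list[str]:
    """Return the expected skills that have no directory under skills_dir.

    Assumption: one directory per skill, named after the skill.
    """
    return [name for name in EXPECTED_SKILLS if not (skills_dir / name).is_dir()]
```

Calling `missing_skills(Path.home() / ".claude" / "skills")` would return an empty list when all nine are present, if the directory-per-skill assumption holds.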
Or via the Claude Code marketplace (also adds the /ck:spec, /ck:build,
/ck:check, /ck:grill, /ck:research, /ck:review, /ck:deepen slash
commands):
/plugin marketplace add juliusbrussee/cavekit
/plugin install ck@cavekit
Or clone directly:
git clone https://github.com/juliusbrussee/cavekit.git ~/.claude/plugins/cavekit
See FORMAT.md. Sections: §G goal, §C constraints, §I
interfaces, §R research (optional, pipe table), §V invariants, §T tasks
(pipe table), §B bugs (pipe table). Each verb owns specific sections —
no verb rewrites a section it does not own.
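
The ownership rule can be sketched as a lookup table plus a guard. The mapping below is hypothetical, loosely inferred from the command descriptions (grill sharpens §G/§C, research fills §R, review hardens §V, backprop appends to §B); the authoritative assignment is the one in FORMAT.md. The `may_write` and `check_edit` helpers are illustrative names only.

```python
# Hypothetical ownership table; FORMAT.md holds the real one.
OWNERS = {
    "grill": {"§G", "§C"},
    "research": {"§R"},
    "review": {"§V"},
    "backprop": {"§B"},
}


def may_write(verb: str, section: str) -> bool:
    """True if the verb owns the section under the table above."""
    return section in OWNERS.get(verb, set())


def check_edit(verb: str, sections: list[str]) -> list[str]:
    """Return the sections the verb tried to touch but does not own."""
    return [s for s in sections if not may_write(verb, s)]


print(check_edit("research", ["§R", "§V"]))  # → ['§V']
```

A verb absent from the table, such as the read-only check, owns nothing, so any section it tried to touch would be reported.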
| path | role |
|---|---|
| FORMAT.md | spec schema + caveman encoding + sectioned ownership |
| commands/ | seven thin slash-command entry points → the skills (loop + reach-for) |
| skills/spec | spec mutator, sole writer |
| skills/build | plan-execute, verification contract |
| skills/check | drift report |
| skills/grill | sharpen a fuzzy idea → §G/§C before spec |
| skills/research | external knowledge → §R, every finding sourced |
| skills/review | adversarial senior review of the spec → hardens §V |
| skills/deepen | spare-budget design pass: make one module deep |
| skills/caveman | encoding utility |
| skills/backprop | bug → spec protocol (six steps) |
`cat SPEC.md` is the dashboard.

The previous generation is not deprecated: it is frozen at tag
v3.1.0 and remains a fully working plugin.
What it is:
npx claudepluginhub juliusbrussee/cavekit --plugin ck