From ywc-agent-toolkit
Enforces RED→GREEN→REFACTOR TDD cycle with mandatory test-failure confirmation before any production code. Use for new features, bug fixes, or behavior changes before writing code.
How this skill is triggered — by the user, by Claude, or both
Slash command
/ywc-agent-toolkit:ywc-tdd-ritualThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
**Announce at start:** "I'm using the ywc-tdd-ritual skill to enforce RED → GREEN → REFACTOR with a watch-it-fail gate before any production code."
Announce at start: "I'm using the ywc-tdd-ritual skill to enforce RED → GREEN → REFACTOR with a watch-it-fail gate before any production code."
This skill is the canonical writing-time discipline for ywc. It exists because every implementation skill (ywc-code-gen, ywc-sequential-executor, ywc-parallel-executor) can run with or without TDD, and the cost of running without is borne later by ywc-debug-rootcause, ywc-impl-review, and CI — typically 3–10× the cost of writing the test first. Adapted from superpowers:test-driven-development, tightened to delegate completion claims to ywc-verify-done and root-cause work to ywc-debug-rootcause.
NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
If production code was written before the test, delete it and start over from RED. "Adapt the existing code while writing the test" is the same as "writing the test after"; it produces a test biased by the implementation that does not actually catch defects.
When tempted to skip the cycle, check this table first:
| Excuse | Reality |
|---|---|
| "I'll write the test right after implementing — same result" | Tests written after the code pass immediately. A test that passes the first time it runs proves nothing — it might be testing the wrong thing, the wrong path, or the wrong assertion. You never saw it catch the bug, so it cannot be trusted to catch one later. Test-first forces the test to fail for the right reason first. |
| "Too simple to need a test" | "Simple" code is where unchecked assumptions land. The test takes 30 seconds; the bug found in production takes hours. There is no size threshold below which the gate stops applying. |
| "I already manually tested every edge case" | Manual testing is ad-hoc and leaves no artifact. The next change will re-introduce the bug because there is no automated guard. Manual coverage is not coverage. |
| "Deleting the X hours of implementation I already wrote is wasteful" | Sunk cost. The time is gone either way. The choice now is (a) delete and rewrite under TDD with high confidence, or (b) keep the unverified code and inherit a debugging debt that will cost more later. Keeping it is the more expensive option. |
| "I'll keep the code as reference while I write the test first" | "Reference" becomes "model to copy", which becomes "the test I built around the existing code" — i.e., a test-after with extra steps. Delete means delete. Implement fresh from the test. |
| "I need to explore the design first; TDD blocks exploration" | Exploration is fine. Treat the exploration as a spike — keep notes, then throw away the spike code before starting TDD. The TDD cycle begins from the test, not from the spike. |
| "The test setup is so big the cycle slows me down" | A test that needs a 200-line setup is signal that the design is wrong. Listen to the test: extract a seam, inject the collaborator, or simplify the interface. Hard-to-test code is hard-to-use code. |
| "I'll skip the 'watch it fail' step — I know it would fail" | The watch-it-fail step is not for the test, it is for you. It catches: (a) the test passing for an unrelated reason (e.g., the assertion is wrong), (b) typos that turn the test into a no-op, (c) the test exercising code that already exists. Without this step, the cycle silently degrades into test-after. |
| "I have 6 features to add — I'll TDD the hard one and skip the easy ones" | The easy ones are exactly where the bug hides, because attention is elsewhere. The discipline applies per feature, not per session. |
| "This is a refactor, no behavior change — tests not needed" | Refactor = same behavior, different structure. The existing test suite must pass before and after. If no test covers the surface being refactored, write the test first (RED on the existing behavior), then refactor under it. Refactoring without tests is rewriting. |
Violating the letter of this discipline is violating the spirit. The cycle is the only mechanism in the ywc workflow that catches design defects before they ship.
RED → Verify RED → GREEN → Verify GREEN → REFACTOR → Verify GREEN → Next
Each transition has a verification step that must be executed in the current message — no transition is silent.
Write one minimal test that captures one behavior the production code is required to exhibit.
Requirements:
rejects offset > total with 400" beats "pagination test 3".<run the test, scoped to just the new test>
Confirm three things from the output:
If the test passes the first time it runs, the test is wrong:
expect(x).toBeDefined() where x is always defined), orFix the test. Do not advance to GREEN with a passing-on-first-run test.
Write the simplest production code that makes the failing test pass.
Requirements:
<run the new test>
<run the broader suite to confirm no regression>
Confirm:
If the broader suite breaks, the GREEN code is too aggressive (touched a public surface other tests rely on) or too narrow (mutated shared state). Either revert or extend the suite — never silence a failing test to advance.
Now, and only now, improve the implementation:
No new behavior during refactor. The same tests must pass after each edit. If a refactor breaks a test, the refactor changed behavior — revert and split into a separate cycle.
Re-run the suite after the refactor. If it stays green, the cycle is complete. If a test breaks, treat as Step 5 violation.
If there are more behaviors to implement, return to Step 1 with the next test.
When all required behaviors are implemented, hand off to ywc-verify-done for the completion claim (build + lint + full test suite + per-claim evidence block).
The cycle produces three commit shapes per behavior, in this order. (Skip the REFACTOR commit when no cleanup was needed.)
| Stage | Commit message shape | Verification at commit time |
|---|---|---|
| RED | test: <one-line behavior> (tests fail at this commit) | Test fails for the expected reason |
| GREEN | feat: <one-line behavior> or fix: <bug> | New test passes; broader suite passes |
| REFACTOR | refactor: <what was tightened> | All tests still pass |
Per-stage verification blocks follow ywc-verify-done's shape (command → exit code → claim) — see ywc-verify-done/SKILL.md. The RED commit's verification block proves the test failed; the GREEN commit's proves it passes; the REFACTOR commit's proves nothing broke.
When delegated from ywc-code-gen --tdd, the executor emits these three commits per behavior automatically and threads the verification block into its per-step report.
ywc-code-gen with --tdd flag (Step 7.5 enforces the cycle per generated module); ywc-debug-rootcause Phase 4 §1 (regression test for the bug).ywc-verify-done (every Verify step uses its evidence-block format and forbidden-vocabulary rules); ywc-debug-rootcause (if the RED step itself uncovers an unexpected pre-existing failure, that is a Phase 1 finding — route there).ywc-confidence-gate (pre-cycle ≥90 confidence reduces RED rewrite churn); ywc-impl-review (downstream code review uses the test as the spec evidence).Before declaring a behavior complete, verify:
ywc-verify-done)(Procedural failure modes specific to TDD. Behavioral rationalizations are in the table above — do not duplicate here.)
ywc-verify-done at the end of the cycle because "the cycle already verified it". The cycle verifies one behavior; ywc-verify-done verifies the claim (build green, lint green, suite green, claim wording free of forbidden vocabulary). Both are required for completion.| Reference | Use when |
|---|---|
| references/red-green-checklist.md | Walking through the per-step entry / exit conditions during the cycle |
| references/test-shape-cookbook.md | Picking the right shape of test for the behavior (state vs interaction, unit vs integration, snapshot vs assertion) |
| ../ywc-verify-done/SKILL.md | Per-step verification block shape (command + exit code + claim) |
| ../ywc-debug-rootcause/SKILL.md | When an existing test starts failing during a cycle, route the investigation here before another fix attempt |
npx claudepluginhub yongwoon/ywc-agent-toolkitGuides creation, editing, and verification of skills for AI coding agents using test-driven development with subagent scenarios. Use when authoring or debugging skills.