From tdd-pipeline
Enforce strict red-green-refactor TDD for any task. Use when implementing features, fixing bugs, adding functionality, or building new modules. Routes to the full 7-agent pipeline for new modules with 3+ behaviors, or runs inline red-green-refactor for bug fixes and small changes. Triggers on: "use TDD", "fix this bug", "add a feature", "implement", "run the pipeline", "TDD pipeline", "build a module with TDD", or any coding task in a project with TDD in its CLAUDE.md.
How this skill is triggered — by the user, by Claude, or both
Slash command
/tdd-pipeline:tdd-orchestrateThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Strict red-green-refactor for every code change.
Strict red-green-refactor for every code change.
Task: $0
If no task was provided, ask the user what to implement or fix. One sentence is enough.
Use the full pipeline (see "Pipeline" section) when:
Use inline red-green-refactor when:
Most tasks are inline. Default to inline unless the scope clearly warrants the full pipeline.
Read the relevant source and test files. Identify where the change goes. Do NOT write a plan document or summarize what you're about to do — read the code and move to step 2.
If you catch yourself writing a plan, outlining an approach, or explaining what you will do: STOP. Write a test instead.
Write a test against the REAL production code. Import the actual module. Call the actual function. Assert the behavior you want.
Run the test. It must fail. If it passes, either the feature already exists or your test doesn't exercise what you think it does. Investigate before proceeding.
Watch for default-value traps. If you're testing that a function returns false, nil, 0, or an empty string, and the stub already returns that value by default, your test passes immediately — you never see red. Fix this by either: (a) choosing inputs that should produce a non-default value, or (b) making the stub return an intentionally wrong value (e.g., a sentinel or the opposite of what you expect). A test that never fails against a stub proves nothing.
The test must fail for the RIGHT reason. A compile error or import failure is not a valid red state — fix those first so the test runs and fails on the assertion.
Write the smallest change to make the test pass. No refactoring. No cleanup. No extra features. Just make the red test turn green.
Run the test. If it fails, fix the implementation (not the test) until it passes.
Run the FULL test suite. If other tests broke, fix the implementation until everything passes.
If the code is clear and clean, skip this step. If you refactor, run the full test suite after every change.
Commit with a descriptive message. One logical change per commit.
If the task requires multiple changes, repeat from step 2 for each behavior. One test-implement cycle per behavior.
These are real mistakes from past sessions. Each one wastes significant time.
Don't test duplicated logic. Your test must import and call the production code. If your test reimplements the logic it's supposed to verify, it proves nothing.
Don't skip the red step. Every test must fail before you write implementation code. Writing the test and implementation together means you don't know whether the test catches regressions.
Don't get stuck planning. The plan is: write a failing test. If you've spent more than 2 minutes without creating or editing a test file, you're stalling.
Don't assert default values against stubs. If your test checks that a function returns false and the stub returns false by default, the test passes without any implementation. Either test for a truthy/non-default value, use inputs that force a non-default result, or make the stub return a deliberately wrong value so the test fails until real logic exists.
Don't test the wrong file descriptor. When testing TTY behavior, verify which fd (stdin, stdout, stderr) the code actually checks.
Don't mock what you can call. If the real code is available and fast, call it. Mocks diverge from production and hide bugs.
Use this section when the routing decision above chose the full pipeline. For inline tasks, ignore everything below.
If the task specifies a module name, use it. Otherwise, ask the user for the module name and behavior list.
Example: /tdd-orchestrate parser
You are a PURE DISPATCHER. You NEVER write code.
Violations you MUST NOT commit:
What you DO:
If you catch yourself about to use Write or Edit on a source or test file, STOP. Dispatch an agent.
All of these mean STOP. Follow the pipeline.
Use the Agent tool to launch sub-agents. Before dispatching, read the skill files from this plugin and include their content verbatim in the agent prompt:
${CLAUDE_PLUGIN_ROOT}/skills/test-writer/SKILL.md${CLAUDE_PLUGIN_ROOT}/skills/test-reviewer/SKILL.md${CLAUDE_PLUGIN_ROOT}/skills/implementer/SKILL.md${CLAUDE_PLUGIN_ROOT}/skills/code-reviewer/SKILL.md${CLAUDE_PLUGIN_ROOT}/skills/agent-briefing/SKILL.mdAgent types:
subagent_type: programmersubagent_type: reviewerRead the project's CLAUDE.md for test commands, file
paths, and language-specific context. Every value
below marked with (CLAUDE.md) must come from there.
Dispatch a programmer agent with:
test-writer skill contentagent-briefing skill contentThe agent writes the test file and type stubs to the source file path (CLAUDE.md). Stubs contain only signatures -- no real logic.
Dispatch a reviewer agent with:
test-reviewer skill contentFix loop: if NEEDS_FIXES, dispatch a programmer
agent with the test-writer skill and the reviewer's
feedback as the fix list. Then re-dispatch the
reviewer. Max 3 rounds, then escalate to user.
Run the module test command (CLAUDE.md).
Only proceed when tests compile and all fail.
Dispatch a programmer agent with:
implementer skill contentagent-briefing skill contentThe agent replaces the stub source file with the real implementation to make all tests pass.
Run these checks yourself (do NOT dispatch an agent):
If any check fails: re-dispatch implementer with specific feedback. Do NOT waste a reviewer dispatch.
Dispatch a reviewer agent with:
code-reviewer skill contentFix loop: if NEEDS_FIXES, dispatch a programmer
agent with the implementer skill and the reviewer's
feedback as the fix list. Then re-dispatch the
reviewer. Max 3 rounds, then escalate to user.
After code reviewer approves:
npx claudepluginhub kelp/kelp-claude-plugins --plugin tdd-pipelineEnforces strict test-driven development: write a failing test first, then minimal code to pass, then refactor. Activates when TDD is explicitly requested or chosen for an atomic task.
Guides Red-Green-Refactor cycles for test-first feature implementation and bug fixes. Requires /optimus:init and working test infrastructure.