Skill

tdd

Enforces strict RED-GREEN-REFACTOR TDD cycles: writes one failing test, minimal code to pass, then refactors. Delegates to test-provenance-guard and code-quality skills during REFACTOR. Useful for test-first or test-after workflows.

testing

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/mthines-agent-skills:tdd [after] [<feature-description>]

User invocable

Model invocable

Inline context

Default effort

Argument hint: [after] [<feature-description>]

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Enforce strict RED-GREEN-REFACTOR discipline. One test at a time. Tests describe WHAT the system does, never HOW.

Supporting Files

rules/green.md
rules/red.md
rules/refactor.md
rules/test-after.md

SKILL.md

170 lines · ~2.4k tokens

Stats

Language: TypeScript

Stars: 5

Forks: 2

Maintenance: Excellent

Last Commit: Jun 17, 2026

Test-Driven Development

Enforce strict RED-GREEN-REFACTOR discipline. One test at a time. Tests describe WHAT the system does, never HOW.

Input

Check $ARGUMENTS for the feature or behavior description.

If $ARGUMENTS contains a feature description, use it directly.
If $ARGUMENTS contains "after" (e.g., /tdd after), run in Test-After Mode — write tests for existing code. See rules/test-after.md.
If $ARGUMENTS is empty, ask the user what behavior they want to implement or test.
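
The three branches above could be sketched as a small dispatcher. This is an illustration only, not the skill's actual implementation: `parseTddArguments` and `TddInvocation` are hypothetical names, and treating a leading `after` token as the mode switch is an assumption taken from the argument hint `[after] [<feature-description>]`.

```typescript
// Hypothetical sketch: the names and the leading-token rule are assumptions.
type TddInvocation =
  | { mode: "test-first"; feature: string }
  | { mode: "test-after"; feature: string | null }
  | { mode: "ask-user" };

function parseTddArguments(rawArguments: string): TddInvocation {
  const trimmed = rawArguments.trim();

  // Empty input: ask the user what behavior to implement or test.
  if (trimmed === "") {
    return { mode: "ask-user" };
  }

  const tokens = trimmed.split(/\s+/);
  const firstToken = tokens[0] ?? "";

  // A leading "after" selects Test-After Mode; any remaining text is the description.
  if (firstToken.toLowerCase() === "after") {
    const feature = tokens.slice(1).join(" ");
    return { mode: "test-after", feature: feature === "" ? null : feature };
  }

  // Anything else is used directly as the feature description.
  return { mode: "test-first", feature: trimmed };
}
```

Checking only the first token keeps a description such as "notify user after login" in the default test-first mode.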

Step 0: Discover Project Test Setup

Before writing any tests:

Find existing tests — glob for **/*.test.*, **/*.spec.*, **/*_test.*, **/test_*.*, **/tests/** to identify the test framework, naming conventions, and directory structure.
Find the test runner — check package.json scripts, Makefile, pyproject.toml, Cargo.toml, go.mod, or similar for the test command.
Adopt existing patterns — match the project's test style exactly: same imports, same assertion library, same file naming, same directory placement. Never introduce a new test framework or pattern.
Identify the run command — store it mentally as TEST_CMD for use throughout the cycle. If you can run a single test file or test case, prefer that over the full suite.

If no tests exist yet, ask the user which framework to use before proceeding.
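
As a rough illustration of the discovery step, the five glob patterns listed above can be approximated with a path classifier. `isTestPath` is a hypothetical helper, the regular expressions are hand-written approximations of the globs, and forward-slash paths are assumed.

```typescript
// Hypothetical helper: approximates the glob patterns from Step 0 with regular expressions.
const TEST_BASENAME_PATTERNS: RegExp[] = [
  /\.test\./, // **/*.test.*
  /\.spec\./, // **/*.spec.*
  /_test\./, // **/*_test.*
  /^test_.*\./, // **/test_*.*
];

function isTestPath(filePath: string): boolean {
  // Assumes forward-slash separators.
  const segments = filePath.split("/").filter((segment) => segment !== "");
  const basename = segments[segments.length - 1] ?? "";
  const directories = segments.slice(0, -1);

  // **/tests/**: any file below a directory named "tests".
  if (directories.some((directory) => directory === "tests")) {
    return true;
  }

  return TEST_BASENAME_PATTERNS.some((pattern) => pattern.test(basename));
}
```

Running a classifier like this over the project's file list shows which naming convention dominates, and that is the one to adopt.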

Step 1: Prioritize by Business Criticality

Before diving into implementation, identify what matters most:

Core user flows first — what are the primary actions users perform? Test those before edge cases.
Ask if unclear — if the feature has multiple behaviors, ask the user to rank them or list the critical paths.
Build a test list — write a numbered list of behaviors to test, ordered by importance. Each item should be a single, specific behavior (not "test the login flow" but "reject login with expired password").

Present the test list to the user for confirmation before starting the cycle.
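
One possible way to represent the test list is as data that gets sorted by priority and rendered as a numbered list. The `PlannedTest` shape, the three priority labels, and `buildTestList` are assumptions made for this sketch; the skill itself only asks for a numbered list ordered by importance.

```typescript
// Hypothetical representation of the test list; the priority labels are assumptions.
type Priority = "core" | "secondary" | "edge-case";

interface PlannedTest {
  behavior: string; // one single, specific behavior
  priority: Priority;
}

const PRIORITY_ORDER: Record<Priority, number> = {
  core: 0,
  secondary: 1,
  "edge-case": 2,
};

function buildTestList(items: PlannedTest[]): string[] {
  return items
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => PRIORITY_ORDER[a.priority] - PRIORITY_ORDER[b.priority])
    .map((item, index) => `${index + 1}. ${item.behavior}`);
}

const loginTestList = buildTestList([
  { behavior: "reject login with malformed email", priority: "edge-case" },
  { behavior: "accept login with valid credentials", priority: "core" },
  { behavior: "reject login with expired password", priority: "core" },
  { behavior: "lock account after five failed attempts", priority: "secondary" },
]);
```

Each entry names one observable behavior, so each maps to exactly one RED-GREEN-REFACTOR cycle.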

Step 2: RED-GREEN-REFACTOR Cycle

For each item in the test list, execute one full cycle. Follow the rules strictly:

RED Phase

See rules/red.md

Write exactly ONE failing test. Run it. Confirm it fails with the expected error. Do NOT write implementation code.

GREEN Phase

See rules/green.md

Write the MINIMUM code to make the failing test pass. No more. Run the test. Confirm it passes. Run the full relevant test suite to check for regressions.

Even in this phase, apply the basic readability primitives while you write: meaningful names, guard clauses for the cases the test forces, no nesting beyond 2 levels. These cost almost nothing during authoring but are expensive to bolt on later. Don't optimize, abstract, or add unrequested features — REFACTOR will handle deeper improvements via Skill('code-quality').
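
A single RED-to-GREEN step for the behavior "reject order when inventory is zero" might look like the sketch below. Everything in it is hypothetical: `createOrder`, `OrderResult`, and the plain-function test stand in for whatever domain code and test framework the project already uses (per Step 0, a real test would be written in the project's existing framework).

```typescript
// RED: written first. Before createOrder handled zero inventory, this check failed.
function shouldRejectOrderWhenInventoryIsZero(): void {
  const result = createOrder("sku-1", 1, 0);
  if (result.status !== "rejected" || result.reason !== "out-of-stock") {
    throw new Error("expected the order to be rejected when inventory is zero");
  }
}

// Hypothetical result type for this illustration.
type OrderResult =
  | { status: "created"; order: { sku: string; quantity: number } }
  | { status: "rejected"; reason: "out-of-stock" };

// GREEN: the minimum code that makes the test pass.
// Only the case the test forces gets a guard clause; no extra validation is added yet.
function createOrder(sku: string, quantity: number, inventory: number): OrderResult {
  if (inventory === 0) {
    return { status: "rejected", reason: "out-of-stock" };
  }
  return { status: "created", order: { sku, quantity } };
}

// Re-run the test after the GREEN change; it throws if the behavior regresses.
shouldRejectOrderWhenInventoryIsZero();
```

Cases such as a quantity larger than the inventory are deliberately left out: they belong to later items on the test list, each with its own failing test.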

REFACTOR Phase

See rules/refactor.md

Evaluate whether refactoring is needed. If yes, refactor while keeping all tests green. If no, move to the next cycle.

Invoke Skill('code-quality') for this phase. The code-quality skill is the source of truth for what "clean" means — guard clauses, low cognitive complexity, single-responsibility functions, intent-revealing names, single source of truth for union-type metadata, schema-first validation with type inference, total functions, type-driven design, functional core / imperative shell, and the named refactor recipes catalog (R1–R20). Routing through the skill — rather than reading individual rule files — picks up its Review Mode procedure, the structured output contract, and any rules added since this skill was written. Cite recipes by ID in commit messages and PR descriptions (e.g., "R1 Consolidate Parallel Maps", "R10 Total-ise the Function") so reviews are reproducible.

Step 3: Cycle Completion Check

After each RED-GREEN-REFACTOR cycle:

Run the full relevant test suite (not just the new test).
If all tests pass, move to the next item on the test list.
If a test fails, stop and fix it before proceeding. Never accumulate broken tests.
After every 3 cycles, briefly report progress to the user.

Step 4: Final Verification

After all items in the test list are complete:

Run the full test suite one final time.
Check test coverage if the project has coverage tooling — report any critical paths that are uncovered.
Provide a summary of what was tested and what was implemented.

Critical Rules (apply to ALL phases)

Test Quality

Test behavior, not implementation — tests must exercise public interfaces only. A test must survive a complete internal refactor unchanged.
One behavior per test — each test should verify exactly one thing. The test name should describe that behavior.
No testing framework internals — never test that setTimeout works, that React renders, or that Go's http.ListenAndServe starts. Test YOUR code.
Maximum 10-15 tests per file — if you need more, split by behavior group.
Factory functions for test data — use buildUser(overrides?) patterns instead of inline object literals scattered across tests.
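
The `buildUser(overrides?)` pattern mentioned above can be sketched as follows. The `User` fields and default values are invented for illustration; a real factory would mirror the project's own types.

```typescript
// Hypothetical User shape; the fields and defaults are assumptions for this sketch.
interface User {
  id: string;
  email: string;
  role: "member" | "admin";
  passwordExpired: boolean;
}

// Returns a valid default user; each test overrides only the fields it cares about.
function buildUser(overrides: Partial<User> = {}): User {
  return {
    id: "user-1",
    email: "user@example.com",
    role: "member",
    passwordExpired: false,
    ...overrides,
  };
}

// Usage: the override names the one detail that matters to the test.
const userWithExpiredPassword = buildUser({ passwordExpired: true });
```

Because defaults live in one place, adding a required field to `User` means updating the factory rather than every test.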

Mocking Strategy

DO mock: external HTTP APIs, third-party services, file system (when testing logic, not I/O), time/dates, randomness.
DO NOT mock: your own code, framework features, database (prefer test DB or in-memory), internal modules (unless crossing a major boundary).
Never mock what you don't own — if you don't control the interface, write an adapter and mock that.
If the test needs more than 3 mocks, the design is wrong — refactor the code under test first.
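
The "write an adapter and mock that" rule can be illustrated with time, which is on the DO mock list. In this sketch the project owns a small `Clock` interface, production code wraps `Date.now()` behind it, and tests substitute a fixed clock. `Clock`, `fixedClock`, `isPasswordExpired`, and the 90-day policy are all hypothetical.

```typescript
// Adapter interface owned by the project; production code depends on this, not on Date directly.
interface Clock {
  now(): number; // milliseconds since the Unix epoch
}

// Production adapter: the only place that touches the global Date API.
const systemClock: Clock = { now: () => Date.now() };

// Hypothetical policy used only for this illustration.
const PASSWORD_MAX_AGE_MS = 90 * 24 * 60 * 60 * 1000;

function isPasswordExpired(passwordSetAt: number, clock: Clock): boolean {
  return clock.now() - passwordSetAt > PASSWORD_MAX_AGE_MS;
}

// Test double: a hand-written fake of the owned interface, no mocking library needed.
function fixedClock(timestamp: number): Clock {
  return { now: () => timestamp };
}

const expiredOneMsPastLimit = isPasswordExpired(0, fixedClock(PASSWORD_MAX_AGE_MS + 1));
const validExactlyAtLimit = isPasswordExpired(0, fixedClock(PASSWORD_MAX_AGE_MS));
```

The test controls time through one injected dependency, and `isPasswordExpired` itself stays unmocked.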

Naming Convention

Follow the project's existing convention. If none exists, use:

describe block: the unit under test (function, class, component)
it/test block: should [expected behavior] when [condition]
Example: describe('createOrder') → it('should reject order when inventory is zero')

Code Quality (during GREEN and REFACTOR)

The code-quality skill is the source of truth for what "well-written code" means in this workflow. Apply during GREEN as inline primitives; invoke as a full pass during REFACTOR via Skill('code-quality').

GREEN-phase primitives (apply inline while writing the minimal implementation):

Guard clauses + early returns instead of nested ifs — flat code is easier to read and easier to test.
Cognitive complexity ≤ 15 per function (SonarSource scoring). If the function feels hard to test (lots of mocks, complex setup), that is the metric warning you that the function does too much.
Names that describe intent, not types or position. pendingOrders beats arr2.
One responsibility per function — if you cannot name it without "and", split it.
Validate at boundaries, trust internally. No defensive null checks for impossible states.
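
The first primitive, guard clauses instead of nested ifs, is easiest to see side by side. Both functions below are invented for illustration (the `Cart` shape and the shipping rules are assumptions) and return the same result for every input.

```typescript
// Hypothetical domain type for this illustration.
interface Cart {
  items: number;
  subtotal: number;
  isMember: boolean;
}

// Nested version: three levels deep, so the reader must track every open branch.
function shippingCostNested(cart: Cart): number {
  if (cart.items > 0) {
    if (cart.subtotal < 50) {
      if (cart.isMember) {
        return 0;
      } else {
        return 5;
      }
    } else {
      return 0;
    }
  } else {
    return 0;
  }
}

// Guard-clause version: each special case exits early, leaving the main path flat.
function shippingCost(cart: Cart): number {
  if (cart.items <= 0) return 0; // empty cart
  if (cart.subtotal >= 50) return 0; // free-shipping threshold
  if (cart.isMember) return 0; // members ship free
  return 5;
}
```

Each guard corresponds to one behavior a test can name, which keeps the one-behavior-per-test rule easy to follow.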

For REFACTOR, invoke Skill('code-quality') so the skill's Review Mode procedure runs against the GREEN output. The skill returns findings under the High / Medium / Low / Maintainability / Correctness / Testability headings; address each as a separate refactoring move (one change at a time per the "Procedure" below). Cite recipe IDs (R1–R20) in commit messages.

Anti-Patterns to Avoid

Anti-Pattern	Why It's Bad	What to Do Instead
Writing tests in bulk	Tests imagined behavior, not observed	One test per cycle
Testing and implementing together	Unconsciously designs tests around implementation	Strict phase separation
"Make sure tests pass" prompt	Encourages implementation-first thinking	"Write a FAILING test"
Changing test expectations to pass	Masks real bugs	Fix the source code
Testing private methods	Couples tests to implementation	Test through public API
Copy-pasting mock setup	Brittle, hard to maintain	Extract shared fixtures

When Things Go Wrong

Test won't fail (RED phase): The behavior already exists or the test is wrong. Investigate before proceeding.
Can't make test pass without large changes (GREEN phase): The test step is too big. Break it into smaller behaviors.
Refactoring breaks tests: The tests were testing implementation details. Rewrite the test to test behavior, then refactor.
After 2 failed attempts to fix: Clear context and start the cycle fresh with a better-scoped test.
