From Darkroom Engineering
Executes testing tasks: writes colocated test files, runs tests, and summarizes coverage. Includes TDD variant with rationalization counters and red flags.
Slash command
/darkroom:test
tester
Delegates to the Tester agent for test coverage and verification.
button.test.tsx)
If you catch yourself thinking any of the following, STOP: you are skipping testing.
| Rationalization | Why It's Wrong |
|---|---|
| "This is too simple to test" | Simple functions with edge cases are exactly what tests catch |
| "I'll add tests later" | Later never comes; untested code ships and breaks |
| "The types guarantee correctness" | Types check structure, not logic — add(a, b) returning a - b passes TypeScript |
| "It's just UI, tests don't help" | Interaction tests catch regressions that visual review misses |
| "Manual testing is enough" | Manual testing doesn't run in CI and doesn't prevent regressions |
| "Tests are passing immediately" | Tests that pass on first run without failing first may not be testing what you think — verify the test actually exercises the code path |
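The "types guarantee correctness" row can be made concrete. What follows is a minimal sketch, not part of the skill itself: `buggyAdd`, `add`, and the hand-rolled `check` helper are hypothetical names, and `check` merely stands in for whatever assertion API the project's test runner provides.

```typescript
// Hypothetical example: this compiles cleanly, yet the logic is wrong.
function buggyAdd(a: number, b: number): number {
  return a - b; // type-correct, behavior-wrong
}

function add(a: number, b: number): number {
  return a + b;
}

// Illustrative stand-in for a test runner assertion (not a real library API).
function check(name: string, actual: unknown, expected: unknown): boolean {
  const ok = actual === expected;
  console.log(`${ok ? "PASS" : "FAIL"}: ${name}`);
  return ok;
}

// The type checker accepts both functions; only a behavioral check tells them apart.
const buggyResult = check("buggyAdd(2, 3) is 5", buggyAdd(2, 3), 5); // fails: 2 - 3 is -1
const fixedResult = check("add(2, 3) is 5", add(2, 3), 5); // passes
```

Both functions share the signature `(number, number) => number`, so the compiler cannot distinguish them. The check on the return value is what exposes the bug.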
Return a summary:
A test-first discipline that produces tests which describe behavior through public interfaces, not implementation. Such tests survive refactors. Tests coupled to internals do not — they break the moment you rename a function and pass when behavior actually breaks.
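As an illustration of testing through a public interface, here is a small sketch built around a hypothetical `Cart` class (the class, its methods, and the SKUs are assumptions made for this example, not anything from the project). The test only touches `addItem()` and `total()`, so the private storage can be renamed or restructured without the test changing.

```typescript
// Hypothetical Cart: the public interface is addItem() and total().
class Cart {
  // Internal storage. Free to change shape without touching behavior-level tests.
  private lines: { sku: string; price: number; qty: number }[] = [];

  addItem(sku: string, price: number, qty: number = 1): void {
    for (const line of this.lines) {
      if (line.sku === sku) {
        line.qty += qty; // same SKU: merge into the existing line
        return;
      }
    }
    this.lines.push({ sku, price, qty });
  }

  total(): number {
    return this.lines.reduce((sum, line) => sum + line.price * line.qty, 0);
  }
}

// Behavior-level test: "adding the same item twice charges for two of them".
const cart = new Cart();
cart.addItem("mug", 10);
cart.addItem("mug", 10);
cart.addItem("pen", 2, 3);
const cartTotal = cart.total(); // 10 * 2 + 2 * 3 = 26
console.log(cartTotal === 26 ? "PASS" : "FAIL");
```

A test coupled to internals would instead assert on the contents of `lines`. It would break as soon as that field were renamed, and it would keep passing if `total()` started computing the wrong sum.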
Before exploring the codebase, follow ../context-doc/DOMAIN-AWARENESS.md. Test names and interface vocabulary should match the project's CONTEXT.md.
Do not write all tests first, then all implementation. That produces tests describing imagined behavior. They test the shape of things (data structures, function signatures) instead of user-facing capability. Such tests become insensitive to real changes: they pass when behavior breaks and fail when behavior is fine.
WRONG (horizontal):
RED: test1, test2, test3, test4, test5
GREEN: impl1, impl2, impl3, impl4, impl5
RIGHT (vertical):
RED→GREEN: test1 → impl1
RED→GREEN: test2 → impl2
...
Each cycle responds to what you learned from the previous one. You can only write the right test for behavior N after implementing behavior N−1.
Before writing any code:
Ask: "What should the public interface look like? Which behaviors are most important?"
Write one test confirming one thing about the system:
RED: test for first behavior → fails
GREEN: minimal code to pass → passes
This proves the path works end-to-end. Don't move on until it passes.
For each remaining behavior:
RED: next test → fails
GREEN: minimal code to pass → passes
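The incremental loop above might look like the following after two cycles. This is a sketch under stated assumptions: `slugify` and its two behaviors are hypothetical, and plain boolean comparisons stand in for the project's real test runner.

```typescript
// Cycle 1 RED:   "lowercases the input" fails because slugify does not exist yet.
// Cycle 1 GREEN: return input.toLowerCase()
// Cycle 2 RED:   "replaces spaces with hyphens" fails against the cycle 1 code.
// Cycle 2 GREEN: add the replace() call below, and nothing more.
function slugify(input: string): string {
  return input.toLowerCase().replace(/\s+/g, "-");
}

// One entry per behavior, added in the order the cycles were run.
const cycleResults: [string, boolean][] = [
  ["lowercases the input", slugify("Hello") === "hello"],
  ["replaces spaces with hyphens", slugify("Hello World") === "hello-world"],
];

for (const [name, passed] of cycleResults) {
  console.log(`${passed ? "PASS" : "FAIL"}: ${name}`);
}
```

Each test was written only after the previous behavior passed, so the second test could be phrased against code that already existed rather than against a guess.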
To run the red→green cycle autonomously across turns, set
/goal every planned behavior has a passing test and the full suite exits 0.
The evaluator reads the test output from the transcript after each turn, so don't phrase the condition as something only the runtime can prove (file flags, etc.).
Rules:
After all tests pass, look for cleanup:
CONTEXT.md
Never refactor while red. Get to green first.
For these, use /build (scaffold-then-test) or /fix (existing-bug pipeline) instead.
npx claudepluginhub darkroomengineering/cc-settings --plugin darkroom