From 97
Provides a checklist for writing and reviewing tests: naming tests/files, designing data/fixtures/mocks, choosing assertions. Use for unit/integration/E2E tests.
How this skill is triggered — by the user, by Claude, or both
Slash command
/97:testing-disciplineThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
**A good test pins the contract, uses concrete examples, names the scenario in domain language, and uses test data safe to show a customer.** Run the checklist in order when writing or reviewing a test.
A good test pins the contract, uses concrete examples, names the scenario in domain language, and uses test data safe to show a customer. Run the checklist in order when writing or reviewing a test.
Invoke when you're about to:
npm test, pytest)If you're not sure whether a change counts as "writing a test," invoke anyway — the checklist is short and skipping it produces tests that pass forever while protecting nothing.
Run every step in order when writing a new test. When reviewing, use the same checks to evaluate quality.
[3,3,3,3,3,3] against an input of [3,1,4,1,5,9]. The full postcondition is sorted and a permutation of the input — but expressing that as a generic checker is often more code than the function under test. Prefer a concrete example pair: input [3,1,4,1,5,9], expected [1,1,3,4,5,9]. "Adding to an empty collection" is not "now non-empty" — it's "now contains exactly one item, and that item is X." (Henney, 97/81.)Stack_pop_on_empty_throws reads better than test17. Then test the test: introduce a deliberate bug into the code under test on a private branch and verify the failure message tells you what went wrong. (Meszaros, 97/95.)"don't click that again, you moron" as a placeholder dialog, do not use four-letter words as fake stock tickers. Use boring, obviously-fake, professional data. (Begbie, 97/25.)These thoughts mean STOP — restart the checklist:
| Thought | Reality |
|---|---|
| "I'll just assert the function returns exactly -1." | The contract says "negative" — ±1 is an implementation incidental. The test will go red on a valid refactor and tell you nothing about the requirement. (97/80) |
| "Same length, all elements in range — that's enough for a sort test." | [3,3,3,3,3,3] satisfies that and is wrong. Use a concrete input/output pair so the only correct answer is the one in the assertion. (97/81) |
"I'll seed the database with band members and song titles — funnier than User1." | Cute test data ends up in customer screenshots, demos, and leaked source. Use boring, obviously-fake, professional data. (97/25) |
| "No time to write the test — we'll add it later." | "Later" rarely arrives, and shipping without verification is professionally irresponsible. The test is part of the change. (97/83) |
| "I know what this test means — I just wrote it." | You will not, in six months. Tests are read more than they are written. Name the scenario, structure as context/act/assert, hide scaffolding. (97/95) |
"test17 is a fine name — the body explains it." | Test names are scanned to verify coverage and to read failure reports. Encode the scenario and the entry point in the name. (97/95) |
| "The test setup is fifty lines, then a one-line assert — but it works." | Test pain is design pressure. If the setup dwarfs the body, reshape the production code. Do not mock harder. (GOOS/ListenToTestPain) |
"I'll assert that repo.save was called exactly three times." | Over-specifying mock interactions makes the test red on innocent refactors. Assert on the observable contract, not the call shape. (xUnit/FragileTest) |
"The test reads from /tmp/fixtures/users.json set up in conftest.py." | Mystery Guest. The test cannot be read in isolation. Build the fixture in the test or in a function named for what it returns. (xUnit/MysteryGuest) |
| "I'll branch on the return value and assert different things in each branch." | One test, two scenarios fighting for one name. Split into two tests, or use named parameterized cases. (xUnit/ConditionalTestLogic) |
A single well-written test is done when all of the following are true:
If any box is unchecked, the test is not done. Either finish, or delete it and start over.
| # | Principle | Author |
|---|---|---|
| 97/25 | Don't Be Cute with Your Test Data | Rod Begbie |
| 97/60 | News of the Weird: Testers Are Your Friends | Burk Hufnagel |
| 97/80 | Test for Required Behavior, Not Incidental Behavior | Kevlin Henney |
| 97/81 | Test Precisely and Concretely | Kevlin Henney |
| 97/82 | Test While You Sleep (and over Weekends) | Rajith Attapattu |
| 97/83 | Testing Is the Engineering Rigor of Software Development | Neal Ford |
| 97/92 | When Programmers and Testers Collaborate | Janet Gregory |
| 97/95 | Write Tests for People | Gerard Meszaros |
GOOS/ListenToTestPain | Listen to Test Pain | Steve Freeman & Nat Pryce |
xUnit/ObscureTest | Obscure Test | Gerard Meszaros |
xUnit/FragileTest | Fragile Test | Gerard Meszaros |
xUnit/MysteryGuest | Mystery Guest | Gerard Meszaros |
xUnit/ConditionalTestLogic | Conditional Test Logic | Gerard Meszaros |
See principles.md for the long-form distillations, citations, and source links.
npx claudepluginhub oribarilan/97 --plugin 97Enforces test quality principles including Arrange-Act-Assert structure, single behavior per test, and meaningful naming when writing or reviewing test code.
Guides effective test writing with AAA structure, testing pyramid, mock boundaries. Debugs flaky/brittle tests, chooses unit/integration/E2E boundaries.
Provides stack-agnostic testing strategies, the testing pyramid, test design heuristics, and patterns for fixing flaky tests and applying TDD.