Write and review unit tests in the style of Vladimir Khorikov's "Unit Testing Principles, Practices, and Patterns" (the classical/Detroit school). Use this skill whenever the user asks you to write, add, generate, scaffold, fix, refactor, audit, or review unit tests or a test suite — even if they never mention Khorikov or the book — and whenever you are about to write tests as part of a larger coding task. Also use it for questions like "what should I test here", "should I mock this", "why is this test brittle/flaky", "is this a good test", or "how should I structure this test". The skill enforces four rules above all: test OBSERVABLE BEHAVIOR not implementation details; maximize resistance to refactoring (no false failures); mock ONLY unmanaged out-of-process dependencies; and do NOT write tests for trivial code.
Slash command
/testing-canon:khorikov-unit-testing
This skill makes you write and review unit tests the way Vladimir Khorikov argues for in Unit Testing Principles, Practices, and Patterns. The book's central claim is that the goal of testing is to enable sustainable growth of a project — tests should let people refactor and add features without fear. A test that breaks every time the code is reorganized (even when behavior is unchanged) actively works against that goal. Bad tests can be worse than no tests at all.
Most LLM- and junior-written test suites fail in the same predictable ways: they mock every collaborator, assert on internal calls, pin down implementation details, test trivial getters and setters, re-implement the production algorithm inside the test, and chase a coverage number. This skill exists to stop exactly those habits.
The principles below are language- and framework-agnostic. Translate the example pseudocode to the user's actual stack (xUnit/JUnit/pytest/Jest/RSpec/etc.) and match their existing conventions.
A test should verify a unit of observable behavior, not a unit of code, and not how that behavior is implemented.
Observable behavior = something a client of the code actually cares about: a return value, a change to observable state, or a call to an out-of-process collaborator that the outside world can see (an email sent, a message published). Everything else — which private methods run, which internal helpers get called, in what order — is an implementation detail. Coupling tests to implementation details is the single largest source of brittle tests.
Ask of every assertion: "If I refactored the internals without changing what a client observes, would this assertion still pass?" If the honest answer is "no," the test is coupled to an implementation detail and must change.
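As a minimal sketch of that question in practice, consider a hypothetical PriceCalculator (the class, its 10% bulk-discount rule, and all names are assumptions made up for illustration). The first test passes only while the private helper keeps its current name and signature; the second checks what a client actually observes and would survive inlining or renaming the helper.

```python
from unittest.mock import patch


class PriceCalculator:
    """Hypothetical class used only for illustration."""

    def total(self, unit_price: int, quantity: int) -> int:
        subtotal = unit_price * quantity
        return subtotal - self._bulk_discount(subtotal, quantity)

    def _bulk_discount(self, subtotal: int, quantity: int) -> int:
        # Assumed business rule: 10% off for orders of 10 items or more.
        return subtotal // 10 if quantity >= 10 else 0


def test_brittle_asserts_on_private_helper():
    # Coupled to an implementation detail: inlining or renaming
    # _bulk_discount breaks this test even though behavior is unchanged.
    calc = PriceCalculator()
    with patch.object(PriceCalculator, "_bulk_discount", return_value=0) as helper:
        calc.total(unit_price=5, quantity=10)
    helper.assert_called_once_with(50, 10)


def test_order_of_ten_items_gets_ten_percent_off():
    # Checks only the observable result, with a hard-coded expected value.
    calc = PriceCalculator()

    total = calc.total(unit_price=5, quantity=10)

    assert total == 45
```

Both tests pass today, which is the problem: only the second one would still pass after a behavior-preserving refactoring.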
Every test is judged on four dimensions (the four pillars): protection against regressions, resistance to refactoring, fast feedback, and maintainability. See references/four-pillars-and-styles.md for the full treatment.
The trade-off that drives almost every decision: you cannot maximize all four. Resistance to refactoring is effectively non-negotiable (you either couple to implementation details or you don't), so in practice you trade protection against regressions against fast feedback. Test value ≈ the product of the first three — if any one is near zero, the test is near worthless.
When you write or review a test, you are implicitly buying regression protection and resistance to refactoring; never sacrifice resistance to refactoring to get a little more coverage.
Follow this order. Do not jump straight to "mock the dependencies and assert the calls" — that is the habit this skill is designed to break.
Step 1 — Decide whether this code should be unit-tested at all.
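Categorize the code under test (full detail in references/what-to-test.md): domain model and algorithms (unit-test these), trivial code (skip), controllers (cover with integration tests), or overcomplicated code (split it first, then test the extracted logic).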
If the user asks you to test trivial or overcomplicated code, say so and propose the better target (skip it, or extract-then-test) instead of mechanically generating tests.
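One way "extract-then-test" can look, sketched with hypothetical names (count_error_lines, report_errors, and the log format are assumptions, not taken from the book): the decision logic moves into a pure function that is cheap to unit-test, and the file-reading shell is left thin enough that it needs no unit test of its own.

```python
from pathlib import Path


def count_error_lines(lines: list[str]) -> int:
    """Extracted logic: pure, no I/O, cheap to unit-test."""
    return sum(1 for line in lines if line.startswith("ERROR"))


def report_errors(log_path: Path) -> str:
    # Humble shell: reads the file and delegates. Too thin to be worth a
    # unit test; an integration test is the better fit if it needs one.
    lines = log_path.read_text().splitlines()
    return f"{count_error_lines(lines)} error(s)"


def test_only_lines_starting_with_ERROR_are_counted():
    lines = ["ERROR disk full", "INFO started", "ERROR timeout", "WARN ERROR-ish"]

    count = count_error_lines(lines)

    assert count == 2
```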
Step 2 — Pick the test style. Prefer, in order (detail in
references/four-pillars-and-styles.md): output-based testing first, then state-based, then communication-based.
Step 3 — Decide test doubles deliberately. This is where most suites go wrong
(full detail in references/test-doubles.md). Two rules:
Step 4 — Structure the test (see "Test structure & naming" below): one Arrange-Act-Assert, one unit of behavior, no branching logic, a behavior-revealing name, and expected values hard-coded (never recomputed by re-running the production algorithm).
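The "hard-coded, never recomputed" part of Step 4 is easy to get wrong, so here is a small sketch. The shipping_cost function and its pricing rule are hypothetical, invented only to contrast a test that leaks the production formula with one that follows Arrange-Act-Assert and states the expected value outright.

```python
def shipping_cost(weight_kg: float) -> float:
    """Hypothetical production function: flat 5.00 plus 1.50 per kg."""
    return 5.00 + 1.50 * weight_kg


def test_leaky_recomputes_the_algorithm():
    # Anti-pattern: the expected value repeats the production formula,
    # so a wrong formula copied into the test would still pass.
    weight = 4
    expected = 5.00 + 1.50 * weight
    assert shipping_cost(weight) == expected


def test_four_kilo_parcel_costs_eleven():
    # Arrange
    weight_kg = 4

    # Act
    cost = shipping_cost(weight_kg)

    # Assert: expected value worked out by hand and hard-coded.
    assert cost == 11.00
```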
When asked to review, audit, or improve tests, score each test against the four
pillars and scan for the anti-patterns in references/anti-patterns-and-review.md.
Produce findings, not just a rewrite. For each problem test, report:
Highest-priority red flags to catch first:
if (testing)) → "code pollution"; remove it and find another seam.
AAA. Every test has three sections: Arrange (set up inputs and the system under test), Act (invoke the one operation being tested), Assert (verify the outcome). Separate them clearly.
No if/switch/loops over logic. A test should be a flat, obvious sequence. Branching means the test is trying to cover multiple cases; parameterize instead.
Naming. Do not use the rigid Method_Scenario_Result pattern (e.g. IsDeliveryValid_PastDate_ReturnsFalse). Name the test as a plain sentence describing the behavior to a non-programmer or domain expert:
Example 1:
Bad: Sum_TwoNumbers_ReturnsCorrectSum
Good: Sum_of_two_numbers
Example 2:
Bad: IsDeliveryValid_InvalidDate_ReturnsFalse
Good: Delivery_with_a_past_date_is_invalid
The name should describe what the system does in a scenario, not mention the method under test or the literal return value. Use underscores (or your stack's idiom) for readability; the production code's naming rules don't apply to test names.
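The naming and parameterization advice can be sketched together. The is_delivery_valid function and its two-day rule below are assumptions chosen to match the example names above. The sketch sticks to Python's standard-library unittest, where subTest over a table of cases plays the role of a parameterized test; in pytest the same table would go into @pytest.mark.parametrize. The loop walks a list of data rows and contains no branching on logic.

```python
import unittest
from datetime import date, timedelta


def is_delivery_valid(delivery_date: date, today: date) -> bool:
    """Hypothetical rule: deliveries must be at least two days out."""
    return delivery_date >= today + timedelta(days=2)


class DeliveryTests(unittest.TestCase):
    def test_delivery_with_a_past_date_is_invalid(self):
        today = date(2024, 3, 10)

        valid = is_delivery_valid(date(2024, 3, 9), today)

        self.assertFalse(valid)

    def test_delivery_is_valid_only_from_two_days_ahead(self):
        today = date(2024, 3, 10)
        # One flat table of cases instead of if/else inside the test.
        cases = [
            (date(2024, 3, 10), False),  # same day
            (date(2024, 3, 11), False),  # tomorrow
            (date(2024, 3, 12), True),   # the earliest valid date
        ]
        for delivery_date, expected in cases:
            with self.subTest(delivery_date=delivery_date):
                self.assertEqual(is_delivery_valid(delivery_date, today), expected)
```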
Readability over DRY. Some duplication in tests is fine and often better — tests are read far more than they're refactored. It's OK to extract object creation into test data builders / factory methods, but don't hide the meaning of a test behind shared helpers. Never extract the assertions into a shared method that obscures what each test actually checks.
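A small sketch of the split between what is safe to share and what should stay inline. Customer, its discount rule, and the a_customer factory are all hypothetical. The factory hides only the irrelevant setup, while each test still spells out the one input that matters and keeps its assertion in plain sight.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    """Hypothetical domain class for illustration."""
    name: str
    country: str
    loyalty_years: int

    def discount_percent(self) -> int:
        # Assumed rule: 5% discount from three years of loyalty.
        return 5 if self.loyalty_years >= 3 else 0


def a_customer(**overrides) -> Customer:
    # Factory method: supplies irrelevant defaults so each test
    # names only the field that matters to it.
    defaults = {"name": "Any Name", "country": "NZ", "loyalty_years": 0}
    return Customer(**{**defaults, **overrides})


def test_customer_of_three_years_gets_five_percent_discount():
    customer = a_customer(loyalty_years=3)

    discount = customer.discount_percent()

    assert discount == 5  # assertion stays inline and visible


def test_new_customer_gets_no_discount():
    customer = a_customer(loyalty_years=0)

    discount = customer.discount_percent()

    assert discount == 0
```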
Unit tests cover the domain model. Integration tests cover the controllers — the glue that orchestrates the domain model and out-of-process dependencies.
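To make the split concrete, here is a sketch with hypothetical User and UserController classes and an assumed message bus standing in for an unmanaged out-of-process dependency. The domain test needs no test doubles at all. The controller test uses the real domain object and mocks only the bus, because the outgoing message is the behavior the outside world can observe.

```python
from unittest.mock import Mock


class User:
    """Domain model: pure logic, no out-of-process calls."""

    def __init__(self, email: str):
        self.email = email

    def change_email(self, new_email: str) -> bool:
        if new_email == self.email:
            return False
        self.email = new_email
        return True


class UserController:
    """Glue between the domain model and the outside world."""

    def __init__(self, message_bus):
        self._message_bus = message_bus

    def change_email(self, user: User, new_email: str) -> None:
        if user.change_email(new_email):
            # Side effect visible to external systems.
            self._message_bus.send_email_changed(user.email)


def test_changing_email_updates_the_user():
    # Unit test: domain model only, no test doubles needed.
    user = User("old@example.com")

    changed = user.change_email("new@example.com")

    assert changed is True
    assert user.email == "new@example.com"


def test_changing_email_notifies_external_systems():
    # Integration-style test: real domain object, mock only at the edge.
    bus = Mock(spec=["send_email_changed"])
    controller = UserController(bus)
    user = User("old@example.com")

    controller.change_email(user, "new@example.com")

    bus.send_email_changed.assert_called_once_with("new@example.com")
```

A real integration test would also exercise any managed dependency (such as the application's own database) for real rather than through a mock; this sketch leaves that out to stay self-contained.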
Read the relevant file when you need depth on that topic:
references/four-pillars-and-styles.md: the four pillars in full, the trade-off and the test-value formula, the three styles of testing, and why code coverage is a poor target.
references/what-to-test.md: the four-quadrant code categorization, the Humble Object pattern, and how hexagonal / functional-core-imperative-shell architectures make code testable.
references/test-doubles.md: mocks vs. stubs, Command-Query Separation, managed vs. unmanaged dependencies, and exactly when mocking is justified.
references/anti-patterns-and-review.md: the full anti-pattern catalogue and a concrete rubric for reviewing an existing test suite.
../../EXAMPLES.md (repo root) has worked before/after examples for each rule.
This skill is the classical-school core; others in the repo extend it. Reach for them when the task shifts:
art-of-unit-testing: the fundamentals companion (good-test qualities, test organization, the same stub/mock-by-direction split in Osherove's vocabulary).
effective-software-testing: how to derive the test cases (partitions, boundaries, coverage as a guide) once you know what makes a test good.
xunit-test-patterns: the canonical 5-name Test Double taxonomy and the smell→pattern catalogue when a suite is a mess.
legacy-code-testing: get untested code under test first (characterization tests), then grow real behavior tests with these four principles.
agile-testing-quadrants: where unit tests fit in a whole team's test strategy.
context-driven-testing: the investigative mindset and "how much testing is enough for this context" judgment these rules assume.