From harness-claude
Validates test suite quality by introducing deliberate code mutations to expose weak assertions, missing edge cases, and dead test code.
Slash command
/harness-claude:harness-mutation-test
> Test quality validation through mutation testing. Introduces deliberate code mutations to verify that the test suite catches real bugs, exposing weak assertions, missing edge cases, and dead test code.
Detect the project's language and test framework. Mutation testing tools are language-specific:
Install and configure the mutation framework. Generate the configuration file:
Define the scope. Mutation testing is expensive. Scope it:
Set the mutation score threshold. Define the minimum acceptable score:
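As a rough sketch of how a score could be checked against thresholds, the snippet below follows the three-level scheme of Stryker's `thresholds` option (`high`, `low`, `break`). The helper name `classifyScore` and the verdict labels are hypothetical, and the numbers are illustrative rather than recommended values.

```typescript
// Threshold shape modeled on Stryker's `thresholds` option.
interface Thresholds {
  high: number;
  low: number;
  break: number | null; // null means "never fail the run"
}

type Verdict = 'ok' | 'warning' | 'danger' | 'fail';

// Hypothetical helper: maps a mutation score (0-100) to a verdict.
function classifyScore(score: number, t: Thresholds): Verdict {
  if (t.break !== null && score < t.break) return 'fail';
  if (score >= t.high) return 'ok';
  if (score >= t.low) return 'warning';
  return 'danger';
}

const thresholds: Thresholds = { high: 90, low: 80, break: 75 };

console.log(classifyScore(92, thresholds)); // ok
console.log(classifyScore(84, thresholds)); // warning
console.log(classifyScore(77, thresholds)); // danger
console.log(classifyScore(70, thresholds)); // fail
```

Keeping `break` below `low` leaves a band where the score is flagged but the build still passes, which can be useful while a legacy suite is being strengthened.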
Verify the test suite passes before mutating. Run the full test suite. All tests must pass. Mutation testing against a failing suite produces meaningless results.
Run mutant generation in dry-run mode. List the mutations that will be applied without executing tests. Review:
Review the mutant operators. Ensure the configured operators are relevant:
- + to -, * to / -- tests mathematical correctness
- < to <=, > to >= -- tests off-by-one handling
- == to !=, true to false -- tests branch coverage
- "hello" to "" -- tests string handling
Estimate execution time. Calculate:
Filter out equivalent mutants (where possible). Some mutations produce functionally identical code (e.g., changing the order of commutative operations). Configure the framework to skip known equivalent patterns to reduce noise.
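The execution-time estimate mentioned above can be approximated with simple arithmetic. The sketch below assumes the worst case, where every mutant triggers one full run of the test suite and work divides evenly across workers. The function name and the sample numbers are hypothetical.

```typescript
// Hypothetical worst-case estimate: each mutant costs one full suite run,
// and the runs are spread evenly over `concurrency` workers.
function estimateRunMinutes(
  mutantCount: number,
  suiteSeconds: number,
  concurrency: number,
): number {
  if (concurrency < 1) {
    throw new RangeError('concurrency must be at least 1');
  }
  const totalSeconds = (mutantCount * suiteSeconds) / concurrency;
  return totalSeconds / 60;
}

// 1200 mutants, a 6-second suite, 4 workers.
console.log(estimateRunMinutes(1200, 6, 4)); // 30
```

Treat the result as an upper bound: frameworks that run only the tests covering each mutant will usually finish sooner.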
Execute the mutation test run. Start the framework with the configured scope:
npx stryker run --concurrency 4
or
mvn org.pitest:pitest-maven:mutationCoverage
Monitor progress. Track:
Handle long-running mutations. If a single mutant takes longer than 3x the normal test timeout:
Collect results. After completion, the framework produces:
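One way to turn collected results into a score is sketched below. The status names follow those that appear in Stryker's reports, but the flat `MutantResult` record and the `summarize` helper are simplified, hypothetical shapes for illustration, not the framework's actual output format. The score counts killed and timed-out mutants as detected and excludes compile errors and ignored mutants.

```typescript
// Status names as they appear in Stryker reports (subset).
type MutantStatus =
  | 'Killed'
  | 'Survived'
  | 'NoCoverage'
  | 'Timeout'
  | 'CompileError'
  | 'Ignored';

// Hypothetical, flattened record for one mutant.
interface MutantResult {
  id: string;
  file: string;
  line: number;
  status: MutantStatus;
}

interface Summary {
  detected: number;
  undetected: number;
  score: number; // percentage, 0-100
  survivors: MutantResult[];
}

function summarize(results: MutantResult[]): Summary {
  const count = (s: MutantStatus): number =>
    results.filter((r) => r.status === s).length;
  const detected = count('Killed') + count('Timeout');
  const undetected = count('Survived') + count('NoCoverage');
  const valid = detected + undetected;
  // An empty run is reported as 0 here to avoid dividing by zero.
  const score = valid === 0 ? 0 : (detected / valid) * 100;
  const survivors = results.filter((r) => r.status === 'Survived');
  return { detected, undetected, score, survivors };
}

const sample: MutantResult[] = [
  { id: '1', file: 'a.ts', line: 3, status: 'Killed' },
  { id: '2', file: 'a.ts', line: 7, status: 'Killed' },
  { id: '3', file: 'a.ts', line: 9, status: 'Timeout' },
  { id: '4', file: 'b.ts', line: 23, status: 'Survived' },
  { id: '5', file: 'b.ts', line: 40, status: 'CompileError' },
];

console.log(summarize(sample).score); // 75
```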
Review the overall mutation score. Compare against the threshold:
Examine survived mutants. For each survived mutant, determine why the test suite did not catch it:
Weak assertion: the test checks the result only loosely (e.g., toBeTruthy() instead of toBe(42)). Fix: strengthen the assertion.
Prioritize improvements by business impact. Focus on survived mutants in:
Write targeted tests for the highest-priority survived mutants. For each:
Generate an improvement report. Summarize:
Run harness validate. Confirm the project passes all harness checks after test improvements.
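To illustrate the weak-assertion case from the analysis steps above, the following sketch compares a truthiness check with an exact-value check against a hand-written mutant. The functions `add` and `addMutant` are hypothetical stand-ins, and the two check helpers only mimic `toBeTruthy()` and `toBe(42)` so the snippet runs without a test framework.

```typescript
// Hypothetical function under test, plus a hand-written mutant of it.
type BinaryOp = (a: number, b: number) => number;
const add: BinaryOp = (a, b) => a + b;
const addMutant: BinaryOp = (a, b) => a - b; // arithmetic mutation: "+" became "-"

// Weak check, equivalent in spirit to expect(fn(40, 2)).toBeTruthy().
const weakCheckPasses = (fn: BinaryOp): boolean => Boolean(fn(40, 2));

// Specific check, equivalent in spirit to expect(fn(40, 2)).toBe(42).
const specificCheckPasses = (fn: BinaryOp): boolean => fn(40, 2) === 42;

// The weak check passes for both versions, so the mutant survives.
console.log(weakCheckPasses(add), weakCheckPasses(addMutant)); // true true

// The specific check fails for the mutant, so the mutant is killed.
console.log(specificCheckPasses(add), specificCheckPasses(addMutant)); // true false
```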
If a knowledge graph exists at .harness/graph/, refresh it after code changes to keep graph queries accurate:
harness scan [path]
- harness validate -- Run in ANALYZE phase after tests are improved. Confirms project health with strengthened tests.
- harness check-deps -- Run after CONFIGURE phase to verify mutation testing framework is in devDependencies.
- emit_interaction -- Used to present mutation score results and survived mutant analysis to the human for prioritization decisions.
- Weak assertions (toBeTruthy, toBeDefined, != null) and missing assertion patterns.
- Weak assertions (toBeTruthy, toBeDefined, expect(result) without matcher) are replaced with specific assertions
- harness validate passes after test improvements
CONFIGURE -- Stryker configuration:
// stryker.config.mjs
/** @type {import('@stryker-mutator/api/core').PartialStrykerOptions} */
export default {
  mutate: ['src/billing/**/*.ts', '!src/billing/**/*.test.ts'],
  testRunner: 'vitest',
  reporters: ['html', 'clear-text', 'progress'],
  coverageAnalysis: 'perTest',
  timeoutMS: 30000,
  concurrency: 4,
  thresholds: {
    high: 90,
    low: 80,
    break: 75,
  },
};
ANALYZE -- Survived mutant investigation:
Mutant #47: src/billing/calculate-discount.ts:23
Original: if (quantity >= 10) { discount = 0.15; }
Mutation: if (quantity > 10) { discount = 0.15; }
Status: SURVIVED
Analysis: No test checks the boundary condition where quantity is exactly 10.
The existing test uses quantity=20 (well above threshold) and quantity=5 (below).
Fix -- Add boundary test:
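// src/billing/calculate-discount.test.ts
import { expect, it } from 'vitest';
// Assumes calculateDiscount is a named export of the module under test.
import { calculateDiscount } from './calculate-discount';

// Boundary value: kills the ">=" to ">" mutant.
it('applies 15% discount when quantity is exactly 10', () => {
  const result = calculateDiscount({ quantity: 10, unitPrice: 100 });
  expect(result.discount).toBe(0.15);
  expect(result.total).toBe(850);
});

// Just below the boundary: no discount expected.
it('does not apply discount when quantity is 9', () => {
  const result = calculateDiscount({ quantity: 9, unitPrice: 100 });
  expect(result.discount).toBe(0);
  expect(result.total).toBe(900);
});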
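To see why only the boundary value exposes Mutant #47, the sketch below runs a hypothetical implementation and its mutant side by side. The real calculate-discount.ts is not shown in this document, so `makeCalculator` and the pricing formula are assumptions chosen to be consistent with the tests above.

```typescript
interface Order {
  quantity: number;
  unitPrice: number;
}

interface PricedOrder {
  discount: number;
  total: number;
}

// Hypothetical implementation, parameterized by the qualifying condition
// so the original and the mutant differ only in the comparison operator.
function makeCalculator(qualifies: (quantity: number) => boolean) {
  return ({ quantity, unitPrice }: Order): PricedOrder => {
    const discount = qualifies(quantity) ? 0.15 : 0;
    return { discount, total: quantity * unitPrice * (1 - discount) };
  };
}

const original = makeCalculator((q) => q >= 10);
const mutant = makeCalculator((q) => q > 10); // Mutant #47: ">=" became ">"

for (const quantity of [5, 10, 20]) {
  const a = original({ quantity, unitPrice: 100 });
  const b = mutant({ quantity, unitPrice: 100 });
  console.log(
    quantity,
    a.discount === b.discount ? 'same result' : 'mutant detected',
  );
}
// 5 same result
// 10 mutant detected
// 20 same result
```

The inputs 5 and 20 from the existing tests produce identical results for both versions, which is why the mutant survived until a test at exactly 10 was added.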
CONFIGURE -- Maven PIT plugin:
<!-- pom.xml -->
<plugin>
  <groupId>org.pitest</groupId>
  <artifactId>pitest-maven</artifactId>
  <version>1.15.3</version>
  <configuration>
    <targetClasses>
      <param>com.example.orders.*</param>
    </targetClasses>
    <targetTests>
      <param>com.example.orders.*Test</param>
    </targetTests>
    <mutators>
      <mutator>CONDITIONALS_BOUNDARY</mutator>
      <mutator>NEGATE_CONDITIONALS</mutator>
      <mutator>RETURN_VALS</mutator>
      <mutator>MATH</mutator>
    </mutators>
    <mutationThreshold>80</mutationThreshold>
    <timestampedReports>false</timestampedReports>
  </configuration>
</plugin>
EXECUTE:
mvn org.pitest:pitest-maven:mutationCoverage
# Report generated at target/pit-reports/index.html
| Rationalization | Why It Is Wrong |
|---|---|
| "We have 80% line coverage, so test quality is already good" | Line coverage measures execution, not verification. Mutation testing reveals missing assertions and weak assertions. |
| "The survived mutants are in non-critical utility code, so we can ignore them" | Every survived mutant must be either addressed with a test or explicitly justified as an equivalent mutant. |
| "I will write a test that targets the specific mutation to kill it" | No gaming the mutation score. Every new test must test a meaningful behavior, not just kill a specific mutant. |
| "The test suite has some failures, but we can still run mutation testing to see what we learn" | No mutation testing against a failing test suite. Mutations against broken tests produce garbage results. |
npx claudepluginhub intense-visions/harness-engineering --plugin harness-claude