From hatch3r
Enforces an 8-gate testability verification before commit/release: mandate-map coverage, real-deal-first ratio, AI eval coverage, mutation kill-rate, contract/property/determinism tests. Useful for features in mandate classes (parser, payment, RPC, state machine, UI, AI).
Slash command
/hatch3r:hatch3r-testability-verify
This skill defines what "done" means for any feature shipping test code or a feature in a mandate-map class (parser, payment, RPC, state machine, UI, AI feature). Run before declaring a feature complete. The 8 gates below mix automated checks (machine-checkable on every PR) with one release-cadence gate (mutation kill rate at release-cut). Skipping any gate = the feature is not done. Passing unit tests and reviewer approval alone do not satisfy this bar — a feature in a mandate class without its mandated test shape ships untested risk.
Inputs the skill expects:
Test directories: src/__tests__/, tests/, __tests__/, e2e/, test/, spec/.
Test-runner or coverage config: vitest.config.ts, jest.config.js, pyproject.toml, pom.xml, or .coveragerc.
Mutation-testing config: stryker.conf.json or pom.xml (when payment/auth/critical paths exist).
Contract artifacts (pacts/, Schemathesis report) when service boundaries exist.
Eval and prompt manifests (evals/manifest.yaml, prompts/manifest.yaml) when LLM features ship.
Outputs the skill produces: an 8-line verdict block written to the PR conversation, plus a JSON artifact at .audit-workspace/testability-verify-<sha>.json for downstream consumption by hatch3r-release.
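The JSON artifact output described above can be sketched as follows. The text only fixes the file location (.audit-workspace/testability-verify-<sha>.json); the payload keys `sha` and `verdicts`, and the helper names `artifact_path` and `write_artifact`, are assumptions made for illustration and are not the schema hatch3r-release actually consumes.

```python
import json
from pathlib import Path


def artifact_path(sha: str, root: Path = Path(".audit-workspace")) -> Path:
    """Build the artifact location named in the skill text."""
    return root / f"testability-verify-{sha}.json"


def write_artifact(sha: str, verdict_lines: list[str],
                   root: Path = Path(".audit-workspace")) -> Path:
    """Write the verdict lines to the artifact file and return its path."""
    path = artifact_path(sha, root)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Hypothetical payload shape: the real schema is not given in the text.
    payload = {"sha": sha, "verdicts": verdict_lines}
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return path
```

Passing `root` explicitly keeps the sketch usable from a temporary directory in tests without touching the real workspace.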
Before any work, scan the invocation for unresolved questions in scope, intent, acceptance criteria, target environment, or irreversibility. If any are found, ask the user via the platform-native question tool per agents/shared/user-question-protocol.md. Default path, not exception. Triggers for THIS skill: feature-surface class (parser vs payment vs RPC vs state machine vs UI vs AI), gate selection (coverage-threshold vs mandate-map vs AI-eval vs full), mock-justification budget (review all vs review new mocks only), mutation-test floor changes mid-cycle, and whether to block on Low-confidence findings.
Fan-out scales with task size; token cost never justifies serializing independent work (rules/hatch3r-fan-out-discipline.md P8 B2; agents/shared/efficiency-patterns.md). Emit sub_agents_spawned: { count, rationale } in your output.
This skill is the verification HARNESS — it declares HOW each testability gate is checked. The DISPATCHER that decides WHEN to run it is the CQ specialist agent:
agents/hatch3r-testability.md invokes this skill as the closing testability gate (CQ5) on PRs modifying test code or features in a mandate-map class. The agent contributes the review trigger and Phase-4 dispatch; this skill contributes the 8-gate procedure.
No duplication: the agent decides WHEN, this skill defines HOW.
rules/hatch3r-testing.md is present:
testdata/fuzz/; stryker.conf.json or pom.xml; pacts/ plus broker can-i-deploy gate; __snapshots__/.
(integration-tests-without-mocks) / (total-integration-tests) ≥ 0.80.
grep -rn "// MOCK:" <test-dir> plus framework-level helpers (vi.mock, jest.mock, unittest.mock.patch, mockito.when).
// MOCK: <reason> comment + reviewer-acknowledged justification linked to a tracking issue.
vitest.config.ts (or equivalent).
src/merge/ 90/80/90/90; src/content/ 85/70/85/85; src/adapters/customization.ts 85/75/85/85.
coverage/coverage-summary.json (Istanbul/v8) or coverage.xml (Cobertura).
stryker run --incremental, read reports/mutation/mutation.json → metrics.mutationScore).
mvn org.pitest:pitest-maven:mutationCoverage, read target/pit-reports/mutations.xml → mutationCoverage).
critical-labelled paths per qaskills.sh 2026.
fc.property(fc.<arb>, fn => { /* invariant */ })) or Hypothesis (@given(...)) test.
// invariant: comment above the test.
pact-broker can-i-deploy --pacticipant <svc> --version <sha> --to production).
schemathesis run --checks all <openapi.yaml>) executed against staging.
rules/hatch3r-contract-testing.md.
gh run list --status failure --created ">=$(date -d '30 days ago' +%Y-%m-%d)" --json conclusion,name,startedAt | jq '[.[] | select(.conclusion=="failure")] | length'.
test.skip / test.todo / @pytest.mark.skip in perpetuity.
All 8 gates pass = the feature is "done". Anything less = not done.
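The real-deal-first ratio above, (integration-tests-without-mocks) / (total-integration-tests) ≥ 0.80, can be sketched as a small check. This is a minimal illustration, not the skill's implementation: the `IntegrationTest` record, the test names, and the choice to fail an empty suite are all assumptions. How a test gets flagged as mocked (for example, from the `// MOCK:` grep) is left outside the sketch.

```python
from dataclasses import dataclass

RATIO_FLOOR = 0.80  # threshold stated in the skill text


@dataclass(frozen=True)
class IntegrationTest:
    """Hypothetical record for one integration test (not a hatch3r type)."""
    name: str
    uses_mock: bool


def real_deal_ratio(tests: list[IntegrationTest]) -> float:
    """Return (integration tests without mocks) / (total integration tests)."""
    if not tests:
        # Assumption: an empty suite reports 0.0 so the gate fails
        # instead of passing vacuously.
        return 0.0
    without_mocks = sum(1 for t in tests if not t.uses_mock)
    return without_mocks / len(tests)


def ratio_gate_passes(tests: list[IntegrationTest]) -> bool:
    """True when the ratio meets the 0.80 floor."""
    return real_deal_ratio(tests) >= RATIO_FLOOR


# Illustrative suite: four tests hit real dependencies, one uses a mock.
suite = [
    IntegrationTest("checkout_roundtrip", uses_mock=False),
    IntegrationTest("refund_roundtrip", uses_mock=False),
    IntegrationTest("ledger_sync", uses_mock=False),
    IntegrationTest("webhook_delivery", uses_mock=False),
    IntegrationTest("fraud_vendor_timeout", uses_mock=True),
]
```

With this suite the ratio is 4/5, which sits exactly on the floor and passes; adding one more mocked test drops it to 4/6 and fails the gate.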
.claude/rules/test-requirements.md).
can-i-deploy exit 0.
The orchestrator running this skill emits a single-line verdict per gate (GATE_N: PASS|FAIL <evidence-path>) and aggregates them. One FAIL on a required gate blocks the merge regardless of reviewer approval status.
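The verdict line shape and the blocking rule above can be sketched as follows. Only the line format (GATE_N: PASS|FAIL <evidence-path>) comes from the text; the `Verdict` type, the function names, and the `required` parameter are assumptions, since the text does not say which gates count as required for a given PR.

```python
import re
from typing import NamedTuple

# Line shape taken from the text: GATE_N: PASS|FAIL <evidence-path>
VERDICT_RE = re.compile(r"^GATE_(\d+): (PASS|FAIL) (\S+)$")


class Verdict(NamedTuple):
    gate: int
    passed: bool
    evidence_path: str


def format_verdict(v: Verdict) -> str:
    """Render one verdict as a single line."""
    status = "PASS" if v.passed else "FAIL"
    return f"GATE_{v.gate}: {status} {v.evidence_path}"


def parse_verdict(line: str) -> Verdict:
    """Parse one verdict line, rejecting anything that does not match."""
    match = VERDICT_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a verdict line: {line!r}")
    gate, status, path = match.groups()
    return Verdict(int(gate), status == "PASS", path)


def merge_blocked(lines: list[str], required: set[int]) -> bool:
    """True if any gate in `required` reported FAIL."""
    return any(
        (not v.passed) and v.gate in required
        for v in map(parse_verdict, lines)
    )
```

Keeping parsing strict means a malformed line raises instead of being silently counted as a pass.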
Failure escalation per agents/hatch3r-testability.md status mapping: Gate 1 fail (mandate-map class missing) → CRITICAL; Gate 4 fail (AI eval coverage <100%) → CRITICAL; Gate 7 fail on auth/payment → CRITICAL; Gates 2/3/5/6/8 → FINDINGS at High or Medium.
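The escalation mapping above can be sketched as a lookup. The gate numbers and the CRITICAL/FINDINGS split come from the text; the function name and the `touches_auth_or_payment` flag are hypothetical. Note one loud assumption: the text does not say what a Gate 7 failure outside auth/payment maps to, so this sketch treats it as FINDINGS. It also does not choose between High and Medium, which the text leaves open.

```python
CRITICAL_GATES = {1, 4}            # mandate-map class missing, AI eval coverage <100%
FINDINGS_GATES = {2, 3, 5, 6, 8}   # reported as FINDINGS at High or Medium


def escalation(gate: int, touches_auth_or_payment: bool = False) -> str:
    """Map a failed gate number to its escalation status."""
    if gate in CRITICAL_GATES:
        return "CRITICAL"
    if gate == 7:
        # ASSUMPTION: the source only covers Gate 7 on auth/payment.
        # Any other Gate 7 failure is treated as FINDINGS here.
        return "CRITICAL" if touches_auth_or_payment else "FINDINGS"
    if gate in FINDINGS_GATES:
        return "FINDINGS"
    raise ValueError(f"unknown gate number: {gate}")
```

For example, `escalation(7, touches_auth_or_payment=True)` returns "CRITICAL", while `escalation(5)` returns "FINDINGS".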
gh pr merge on protected branches.
rules/hatch3r-testing.md: per-feature test-class mandate map.
rules/hatch3r-ai-evals.md: AI feature eval coverage.
rules/hatch3r-contract-testing.md: Pact + Schemathesis boundaries.
.claude/rules/test-requirements.md: coverage thresholds per file class.
agents/shared/quality-charter.md §Testing depth: mock-justification budget.
stryker-mutator.io/docs/
qaskills.sh/blog/mutation-testing-stryker-guide
marktechpost.com/2026/04/18/a-coding-guide-for-property-based-testing-using-hypothesis-with-stateful-differential-and-metamorphic-test-design/
docs.pact.io/
schemathesis.readthedocs.io/
www.anthropic.com/engineering/demystifying-evals-for-ai-agents
suprmind.ai/hub/ai-hallucination-rates-and-benchmarks/
npx claudepluginhub hatch3r/hatch3r --plugin hatch3r