From quiver
Author comprehensive, behavior-focused tests via a 3-phase generate → critique → refine loop. Phase 1 produces a first-cut test suite covering happy paths, edge cases, error paths, and boundary conditions in the project's existing test framework. Phase 2 spawns an adversarial critique subagent that reviews the suite for missing cases, implementation-detail testing, hidden-failure mocks, redundancy, and determinism issues. Phase 3 applies the critique and runs the suite. Use when the user says "write tests", "add tests", "test this code", "comprehensive tests", "generate tests with critique", "test-craft this", "I need tests for X", or otherwise asks for high-quality test coverage on a specific file, module, or feature. Do NOT use for one-off assertions, ad-hoc debugging tests, simple smoke checks, or when the user says "just write a quick test" — those should be done inline without ceremony.
Slash command: /quiver:test-craft
Three phases. Do them in order. Do not skip phase 2.
A test file (or files) for the target code, validated by an adversarial second pass and refined accordingly. The deliverable is the test suite plus a short note to the user on the critique findings and how they were addressed.
Before starting, you need to know:
- The project's test framework and tooling, as declared in package.json, pyproject.toml, Gemfile, go.mod, etc.

If the target is ambiguous ("write tests" with no specifics), ask the user one batched question to pin it down before starting Phase 1.
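One way to take a first look at the manifest files named above is to check which of them exist at the project root. The sketch below is only an illustration: the `MANIFESTS` mapping and the `detect_ecosystems` helper are hypothetical names, and a manifest only indicates the ecosystem, not the exact test framework, which still has to be read from the manifest's contents or from existing tests.

```python
from pathlib import Path

# Hypothetical mapping from manifest file name to ecosystem (an assumption,
# not part of the skill); extend it for other build systems as needed.
MANIFESTS = {
    "package.json": "javascript",
    "pyproject.toml": "python",
    "Gemfile": "ruby",
    "go.mod": "go",
}


def detect_ecosystems(project_root):
    """Return the sorted ecosystems whose manifest file exists in project_root."""
    root = Path(project_root)
    found = {eco for name, eco in MANIFESTS.items() if (root / name).is_file()}
    return sorted(found)
```

If more than one ecosystem is reported (a monorepo, for example), that is a good reason to ask the user which part of the project the tests should target.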
Read the target file end to end, plus any types/contracts it exports. Do not just skim — you need to understand what behaviors the code promises.
For each public function/method/endpoint, write a list of behaviors to verify. Include happy paths, edge cases, error paths, and boundary conditions.
This list is the SPEC for the test suite. Refer back to it.
Read 1–2 existing test files in the repo. Match:
- File location (tests/ vs __tests__/ vs *_test.go next to source)
- File naming (test_foo.py vs foo.test.ts vs FooSpec.scala)
- Mocking library (unittest.mock, vi.mock, gomock, etc.)
- Assertion style (assert x == y vs expect(x).toBe(y) vs x.should == y)

Do not invent a new testing style if the project has an established one.
Cover every behavior on the list from 1b. Each test:
Run the test suite. Make sure the tests pass (or fail in a meaningful way if you're doing TDD-style red-first). Fix any issues: broken tests, broken imports, wrong fixtures.
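As a rough illustration of what a first-cut Phase 1 suite might look like, the sketch below tests a hypothetical `chunk` function (it is not part of any real project). It uses the standard-library `unittest` module only so that it runs without extra dependencies; in practice the suite should use whatever framework the project already has. Each test is named after the behavior it verifies and asserts on the returned value rather than on internals.

```python
import unittest


def chunk(items, size):
    """Hypothetical code under test: split items into lists of at most `size`."""
    if size < 1:
        raise ValueError("size must be >= 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


class ChunkBehaviorTests(unittest.TestCase):
    # Happy path
    def test_splits_items_into_groups_of_requested_size(self):
        self.assertEqual(chunk([1, 2, 3, 4, 5], 2), [[1, 2], [3, 4], [5]])

    # Edge case: empty input
    def test_empty_input_yields_no_chunks(self):
        self.assertEqual(chunk([], 3), [])

    # Boundary: length is an exact multiple of size
    def test_exact_multiple_has_no_short_trailing_chunk(self):
        self.assertEqual(chunk([1, 2, 3, 4], 2), [[1, 2], [3, 4]])

    # Boundary: size exceeds the input length
    def test_size_larger_than_input_returns_single_chunk(self):
        self.assertEqual(chunk([1, 2], 10), [[1, 2]])

    # Error path
    def test_non_positive_size_is_rejected(self):
        with self.assertRaises(ValueError):
            chunk([1], 0)
```

Running `python -m unittest` in the directory containing this file would execute the five tests.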
This is the load-bearing phase. Do NOT critique your own work — spawn a subagent. The subagent has not seen your reasoning during generation and will catch things you'd rationalize away.
Use the Agent tool with subagent_type: general-purpose and the following prompt:
You are an adversarial test reviewer. You are reviewing tests written by another
agent. Your job is to find every weakness in the test suite. Do not be
diplomatic — find real problems.
CONTEXT
The code under test is at: <absolute path(s)>
The tests are at: <absolute path(s)>
The framework is: <pytest|jest|vitest|...>
WHAT TO READ
1. Read the source file(s) end to end — understand the contract being tested.
2. Read the test file(s) end to end.
3. Read 1–2 sibling test files in the repo to understand local conventions.
WHAT TO LOOK FOR (be thorough)
- Tests that exercise implementation details rather than observable behavior.
Specifically: tests that check which internal helper was called when the
caller doesn't care; tests that mirror the structure of the implementation;
tests that break when refactoring without changing behavior.
- Missing edge cases: empty inputs, single-element, max-size, unicode/whitespace,
null/None/undefined, negative numbers, zero, off-by-one boundaries.
- Missing error paths: what happens with invalid input? Network failure?
Permission denied? Timeout? Concurrent access?
- Mocks that hide real failures: an HTTP client mocked to always succeed;
a database mock that doesn't enforce constraints; a time mock that doesn't
advance.
- Non-deterministic tests: depending on system time, random, network, file
system order, dict ordering (in older runtimes), or test execution order.
- Redundant tests: multiple tests asserting the same behavior with trivially
different inputs.
- Vague assertions: assertTrue, expect(x).toBeDefined(), checking only that
no exception was raised when a specific shape was expected.
- Missing negative cases: only happy paths covered; no tests for "this should
fail when X".
- Tests that don't actually run (skipped, commented out, broken imports).
- Tests with poor names that don't describe the behavior.
- Setup/teardown leaking state between tests.
- Documented behavior that has no test: read the docstrings/types
and check that every promise has a test.
OUTPUT
Return a markdown document with three sections:
## Critical (must fix before shipping)
- ...
## Major (should fix — risk of false confidence)
- ...
## Minor (nice to have)
- ...
For each issue, cite the specific test by name and explain WHY it's a problem
and WHAT to change. Do NOT edit any files. Reviewing only.
If the tests are actually good, say so explicitly under each section ("None
identified") rather than padding with weak nitpicks.
Save the subagent's response. This is your critique document.
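Because the prompt asks for three fixed headings, the saved critique can be split into severity buckets before refinement begins. The following is a minimal sketch under that assumption; `parse_critique` and `SEVERITIES` are hypothetical names, and a real subagent response may deviate from the requested layout, so treat the result as a convenience rather than a guarantee.

```python
import re

SEVERITIES = ("Critical", "Major", "Minor")


def parse_critique(markdown):
    """Group bullet items under the '## Critical/Major/Minor' headings."""
    sections = {name: [] for name in SEVERITIES}
    current = None
    for line in markdown.splitlines():
        heading = re.match(r"^##\s+(\w+)", line)
        if heading:
            name = heading.group(1)
            # Ignore any heading that is not one of the three severities.
            current = name if name in sections else None
            continue
        stripped = line.strip()
        if current is None or not stripped:
            continue
        if stripped.startswith("- "):
            item = stripped[2:].strip()
            # "None identified" marks an empty section, not an issue.
            if item.lower().rstrip(".") != "none identified":
                sections[current].append(item)
        elif sections[current]:
            # Wrapped continuation of the previous bullet.
            sections[current][-1] += " " + stripped
    return sections
```

Working through the "Critical" list first, then "Major", keeps the refinement pass ordered by risk.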
Read the critique. For each issue:
Apply the fixes:
Run the whole suite. It must pass. If it doesn't, fix the failures — don't ship a broken refinement.
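To make the refinement step concrete, here is a small sketch of one kind of fix a critique might call for: a test that depended on the wall clock and asserted only truthiness, rewritten to use a fixed time and exact expected values. The `is_expired` function and `FIXED_NOW` constant are hypothetical and exist only for this illustration.

```python
from datetime import datetime, timedelta, timezone


def is_expired(expires_at, now=None):
    """Hypothetical code under test: True once `expires_at` has been reached."""
    now = now or datetime.now(timezone.utc)
    return now >= expires_at


# Before (the kind of test a critique would flag): it reads the real clock
# and checks only truthiness.
#   assert is_expired(datetime.now(timezone.utc) - timedelta(seconds=1))

# After: the clock is passed in, so every run sees the same instant, and the
# boundary on each side of the deadline is pinned with an exact value.
FIXED_NOW = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)


def test_not_expired_one_second_before_deadline():
    assert is_expired(FIXED_NOW + timedelta(seconds=1), now=FIXED_NOW) is False


def test_expired_exactly_at_deadline():
    assert is_expired(FIXED_NOW, now=FIXED_NOW) is True


def test_expired_after_deadline():
    assert is_expired(FIXED_NOW - timedelta(seconds=1), now=FIXED_NOW) is True
```

Passing the time in as a parameter is one option among several; a project that already uses a time-freezing fixture should keep using that instead.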
If the project has a coverage tool wired up (pytest-cov, c8, jest --coverage, go test -cover, etc.), run it. Report coverage numbers but treat them as a lagging indicator — coverage is a floor, not a ceiling. The critique above is the real quality bar.
Tell the user:
Keep this report tight — one short section per item, not an essay.
npx claudepluginhub asrinivasan75/quiver --plugin quiver