From vibe-test
This skill should be used when the user says `/vibe-test:fix`. Diagnoses broken tests and test harnesses, routes repairs by confidence (high → auto-repair with rationale; medium → stage in `.vibe-test/pending/fixes/`; low / complex → defer to `superpowers:systematic-debugging`). Detects harness-level breaks (broken runner, missing binary, cherry-picked denominator) distinctly from test-logic breaks. Rolls back auto-written generated tests to pending when they are the source of CI breakage.
How this skill is triggered — by the user, by Claude, or both
Slash command
/vibe-test:fixThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Read [`../guide/SKILL.md`](../guide/SKILL.md) for your overall behavior (persona, experience-level adaptation, Pattern #13 composition rules, version resolution). Then follow this command end to end.
Read ../guide/SKILL.md for your overall behavior (persona, experience-level adaptation, Pattern #13 composition rules, version resolution). Then follow this command end to end.
Fix is the repair surface of Vibe Test. A builder who runs /vibe-test:fix is asking "something is broken — diagnose it honestly, propose the smallest correct repair, and stay out of my way when the problem is deeper than you can responsibly solve." The command distinguishes test-logic breaks (an assertion that doesn't match reality, a mock wired wrong, a fixture that drifted) from harness-level breaks (the runner pool timed out, the test binary isn't installed, the coverage denominator is cherry-picked). Harness breaks are reported separately — they're the WSYATM-class failures audit already named, but fix is where they get repaired.
Detect failures → diagnose each + assign confidence → route high / medium / low per confidence → hand deep-diagnosis cases to superpowers:systematic-debugging when present → offer rollback for auto-generated tests that are the actual source of breakage → emit three output views.
.vibe-test/state/last-run.json if present, or builder-provided output pasted into chat), ORsrc/coverage/runCoverage() and capture the child process exit + stderr, OR"I can run your tests to capture failures, or you can paste the last failing output. Either way works — which do you want?"
Wait for the builder's decision. Never fabricate a failure to fix.
--path <glob> is passed, narrow the fix scope. Read the matching scoped audit-state (audit-<hash>.json) if present — it carries the classification + prior gap context that improves diagnosis quality. Full-repo fixes read audit.json when available..vibe-test/state/accepted.json to identify auto-written tests. These become candidates for the F1 rollback hook if their failure is the trigger.../guide/SKILL.md. Persona opening/handoff lines, experience-level verbosity, Pattern #13 anchored + dynamic, session memory interfaces.../guide/references/data-contracts.md section "fix". You own writes to .vibe-test/pending/fixes/ (proposed staged repairs), .vibe-test/state/fix.json (state sidecar), and updates to accepted.json / rejected.json when a rollback moves an auto-written test back to pending. You read audit-state and last-run outputs.../guide/references/plays-well-with.md. Entries with applies_to: fix are superpowers:systematic-debugging (flagship deferral) and superpowers:verification-before-completion (post-repair verification co-invoke).../guide/references/friction-triggers.md /vibe-test:fix section.src/reporter/tier-adaptive-language.ts.../session-logger/SKILL.md. start('fix', project) at entry, end({sessionUUID, command: 'fix', outcome}) at exit.../friction-logger/SKILL.md. Entries when a rollback happens (signals that auto-write confidence mis-calibrated) or when a complex diagnosis is deferred.Import surface (reusing existing src/ modules — no new TypeScript modules introduced by fix):
src/coverage/runCoverage — to re-run tests and capture exit + stderr when no cached run-output exists.src/scanner/scan — for the framework / dependency detection that powers harness-break classification.src/state/project-state — readProjectState, scopeHash, projectStateSidecarPath for audit-state load.src/state/atomic-write — atomicWriteJson, atomicWrite for staged-fix writes.src/state/friction-log — per-rollback entries.src/state/session-log — sentinel + terminal.src/state/beacons — terminal beacon.src/reporter — createReportObject, renderBanner, renderMarkdown, renderJson, getLanguageKnobs.src/handoff — appendTestPlanSession (to record the fix session chronologically).src/generator/pending-dir-manager — listPending, writePendingIndex, getCurrentHeadHash (the fix's pending-dir layout mirrors generate's; fixes go under .vibe-test/pending/fixes/ while generated tests go under .vibe-test/pending/tests/ so the two never collide).Confidence assignment is SKILL reasoning — no deterministic fix-confidence.ts. Reason over the failure signature + the repair scope + how many downstream tests depend on the change.
Before any user-facing output:
shared.preferences.persona, shared.preferences.pacing, and plugins.vibe-test.testing_experience (fallback shared.technical_experience.level) from ~/.claude/profiles/builder.json.session-logger.start('fix', project_basename). Hold the returned sessionUUID in memory until Step 9.Run the blocking prereq described above. If the check fails, render a gentle block and halt. Do NOT proceed; do invoke session-logger.end({outcome: 'aborted'}).
Parse ../guide/references/plays-well-with.md via loadAnchoredRegistry(). Filter to entries whose applies_to includes fix. For each entry present in the agent's available-skills list:
superpowers:systematic-debugging — announce verbatim: "When a diagnosis is deeper than a one-shot repair, I'll hand off to superpowers:systematic-debugging so we reason through the failure systematically before committing code." This deferral is the flagship — low-confidence diagnoses go there.superpowers:verification-before-completion — announce verbatim: "After a repair lands, verification-before-completion owns the 'is it actually fixed?' check. I won't declare the repair complete without its sign-off."Surface at most ONE anchored complement announcement per invocation. Prefer systematic-debugging over verification-before-completion when both are present (debugging fires first in the flow; verification fires last).
Three input paths, in priority order:
FAIL lines naming the test file + assertion.Timeout of Xms exceeded (harness-level — broken runner).Cannot find module '<pkg>' (missing dep or typo).SyntaxError / ReferenceError at the top of the output (harness bootstrapping failure)..vibe-test/state/last-run.json if present. Writers of this file are gate + coverage; if the builder just ran either, the output is fresh.runCoverage({framework, cwd, adapterAccepted: true, actualSourceFiles, c8TestCommand}) if no output is available and the builder opted in. Do NOT run silently — always ask first ("Want me to run your tests now to capture the failures? Takes about {T}s.").For each failure found, build a Failure record in-SKILL:
{
id: 'fail-<counter>',
source: 'builder-pasted' | 'last-run' | 'live-run',
kind: 'test-logic' | 'harness-break',
subkind: 'broken_test_runner' | 'missing_test_binary' | 'cherry_picked_denominator' | 'assertion-mismatch' | 'mock-drift' | 'fixture-drift' | 'import-missing' | 'unknown',
test_file: '<path>' | null,
assertion_excerpt: '<one-line>' | null,
raw_output_excerpt: '<up to 20 lines>',
stack_top_frame: '<path:line>' | null,
}
Three distinct harness-level finding types — these must be surfaced separately from test-logic failures because the repair pattern is different (config edit / install command) and the severity is critical by default.
broken_test_runner — e.g., vitest forks-pool timeout signature, jest circular require, mocha global timeout. Pattern-match on the failure excerpt:
pool: 'threads' or pool: 'forks', poolOptions.forks.singleFork: true.--runInBand in the test script.--timeout 10000 at the suite level (not global).missing_test_binary — package.json script references e.g. jest/bin/jest.js or vitest but the package is not in dependencies / devDependencies. Cross-check against inventory.detection.allDependencies. Propose the minimal install command: npm install -D <pkg>@<inferred-version-or-latest>.cherry_picked_denominator — if the audit already flagged this, carry it through as a finding and point at the coverage SKILL: "This is a coverage-denominator break, not a test-logic break — /vibe-test:coverage owns the adaptation-prompt UX for that. I'll surface it here as a finding and let that command do the repair."Each harness-break finding goes into the ReportObject with:
category: 'harness-break'severity: 'critical' (harness breaks make every downstream number a lie)rationale naming the exact signature and the one-line config fixexample_pattern showing the minimal-diff fix tailored to the frameworkHarness breaks take precedence in the banner ordering — builders need to know when their tool is lying to them before they fix individual tests.
For each Failure with kind === 'test-logic', compose a diagnosis. SKILL reasoning — read the failing test + the code under test, form a hypothesis, and name it:
assertion-mismatch — the assertion expects X, the code produces Y. Diagnosis includes the concrete diff between X and Y.mock-drift — a mock returns shape A, but the code has since started consuming shape B. Diagnosis names the mock definition file + the consumer.fixture-drift — a test fixture references a schema field that has been renamed / removed. Diagnosis names the fixture + the affected schema.import-missing — the test imports a symbol that no longer exists at the import path. Diagnosis names the missing symbol.unknown — the failure doesn't fit a known pattern. This triggers Step 6 (defer).For each diagnosed failure, assign confidence using SKILL reasoning:
superpowers:systematic-debugging when present; otherwise surface the diagnosis + ask the builder.Route per confidence:
atomicWrite — never streaming partial edits.Action to the report: {kind: 'write', description: '<one-line diagnosis>', target: '<path>'}..vibe-test/pending/fixes/.vibe-test/pending/fixes/<mirror-of-target>.fix.md containing:
.vibe-test/pending/fixes/index.md (overwrite with full listing on each run).Action to the report: {kind: 'stage', description: '<one-line diagnosis>', target: '<staged path>'}."{N} fixes staged at
.vibe-test/pending/fixes/index.md. Accept-all / accept<path>/ reject<path>--reason "..." / come back later?"
superpowers:systematic-debugging is present → announce the deferral:
"Handing this one off to superpowers:systematic-debugging — the failure signature has more than one plausible cause and I'd rather reason through it properly than guess. Follow that skill's flow, then come back to
/vibe-test:fixonce you know the root cause." Add anAction:{kind: 'other', description: 'deferred to superpowers:systematic-debugging', target: '<failure id>'}.
"The failure in
{test_file}has more than one plausible cause —{hypotheses}. Which one matches your recent changes? I'll tailor the diagnosis from there."
Log friction_type: "complex_diagnosis_deferred" at confidence medium whenever a failure takes the defer path — the aggregate signal tells /evolve where fix's automatic diagnosis is weakest.
For every Failure whose test_file matches an entry in accepted.json AND whose file content starts with a header matching /^\/\/ Generated by Vibe Test v[\d.]+ on .+\. Confidence: HIGH\. Audit finding: #\S+\./m, the failing test IS an auto-written generated test. The proper move is not to auto-repair — the correct move is revert to pending for re-review:
<repoRoot>/.vibe-test/pending/tests/<mirror-of-target>.git rm-equivalent so diff shows cleanly).accepted.json."The failing test
{test_file}was auto-written at HIGH confidence — it's now at.vibe-test/pending/tests/{path}for re-review. I didn't try to repair it in place because auto-write confidence was miscalibrated for this case. Re-review, edit, or reject?"
friction_type: "artifact_rewritten" at confidence high — high confidence because this is a direct miscalibration signal for the generator.Action: {kind: 'revert', description: 'auto-written test reverted to pending after CI break', target: '<pending path>'}.The rollback hook is the integrity mechanism. It admits that the generator was wrong and gives the builder the steering wheel back without silent overwrites.
After all high-confidence repairs are applied AND after the builder has accepted any medium-tier staged fixes:
superpowers:verification-before-completion is present, announce the co-invoke:
"Handing off to superpowers:verification-before-completion to run the 'is it actually fixed?' check — I won't declare the repair complete without its sign-off."
Log the co-invoke as complements_invoked on the terminal session entry.
Build the ReportObject via createReportObject({command: 'fix', plugin_version, repo_root, scope, commit_hash}). Populate:
classification — carry forward from audit-state if present; otherwise null.score — carry forward the last coverage_snapshot from audit-state if present; otherwise null. Fix does not re-score.findings — one entry per Failure the SKILL surfaced. Harness breaks come first (severity: critical), test-logic breaks next (severity derived from diagnosis + tier).actions_taken — all the Action records from Steps 6 + 7.deferrals — the Pattern #13 matches from Step 2 (verbatim deferral_contract prose).handoff_artifacts — list of files written / modified / staged this run.next_step_hint — persona-adapted. Default: "Re-run your tests to confirm the repair held. /vibe-test:gate when you're ready for CI threshold check."Render three views in parallel:
renderMarkdown(report, {proseSlots}) → docs/vibe-test/fix-<ISO-date>.md.renderBanner(report, {columns, disableColors: !isTty}) → printed to chat.renderJson({report, repoRoot, skipValidation: true}) → .vibe-test/state/fix.json. (No fix-state.schema.json exists in v0.2 — validation is skipped; the report-object shape is stable enough for consumers.)session-logger.end({sessionUUID, command: 'fix', outcome: 'completed', key_decisions, complements_invoked, artifact_generated}) — terminal entry paired to Step 0.beacons.append(repoRoot, {command: 'fix', sessionUUID, outcome: 'completed', hint: '<N repaired, M staged, K deferred, R rolled back>'}) — Pattern #12.appendTestPlanSession(repoRoot, {command: 'fix', timestamp: nowIso, sessionUUID, classification: <one-line from audit>, generated_tests: [], rejected_with_reason: [], notes: <one-line summary>}) — chronological log entry.On any state-write failure: log a runtime_hook_failure friction entry and continue — the builder already saw the banner + markdown.
Persona-adapted handoff line per guide > "Handoff Language Rules":
| Persona | Handoff line |
|---|---|
professor | "Run your tests to confirm, then /vibe-test:gate when you're ready for CI. If anything else breaks, come back here." |
cohort | "Re-run tests, then /vibe-test:gate. Any new breakage, back to /vibe-test:fix." |
superdev | "Re-run. Then /vibe-test:gate." |
architect | "Verify repair via superpowers:verification-before-completion, then /vibe-test:gate for CI threshold." |
coach | "When you're ready, re-run the failing tests, then /vibe-test:gate for the CI check. I'm here if anything else breaks." |
null (default) | "Re-run tests; /vibe-test:gate when ready." |
Do NOT prescribe /clear between commands. Claude Code auto-compacts.
| Knob | first-time / beginner | intermediate | experienced |
|---|---|---|---|
| Diagnosis length | 3-4 sentences, plain-English, inline glosses for harness, fixture, mock | 2-3 sentences, technical terms first-use-glossed | 1 sentence, pure technical |
| Rollback prompt | Explanatory paragraph | One-line explanation + path | Path + confidence + revert/keep |
| Defer announcement | Full deferral prose | Compressed prose + path | Path + one-line reason |
JSON output is level-invariant.
| Trigger | friction_type | confidence |
|---|---|---|
| Rollback hook fires (auto-written test reverted) | artifact_rewritten | high |
| Builder rejects a staged fix with a reason | fix_rejected | medium |
| Complex diagnosis deferred to systematic-debugging | complex_diagnosis_deferred | medium |
| Builder declines a Pattern #13 complement offer | complement_rejected | high |
| Repeated same-failure runs across sessions (>=3) | recurring_failure | low |
When in doubt, don't log. False positives poison /evolve.
/vibe-test:generate owns authoring — fix only repairs existing tests./vibe-test:coverage./vibe-test:gate decides pass/fail against tier threshold.superpowers:systematic-debugging by design.Broken tests are the single biggest drag on a vibe-coded app's velocity. When the test harness itself is broken, the scoreboard lies — and fixing the wrong layer is worse than doing nothing. Fix draws a bright line between harness-level breaks and test-logic breaks, names them separately in every output, and reserves auto-repair only for cases where the edit is genuinely mechanical. Every other case either stages for review (so the builder can see what's about to change before it lands) or defers to a skill that specializes in methodical root-cause work.
The rollback hook exists because every generator eventually writes a test at HIGH confidence that turns out wrong in CI. Auto-generated tests get a different social contract than hand-written ones: when they break, the default is to revert to pending for re-review, not to patch in place. That contract is the integrity mechanism that lets the generator be aggressive about confidence without gaslighting the builder.
Pattern #13 deferrals keep fix honest about its own boundaries: superpowers:systematic-debugging owns deep-cause reasoning; superpowers:verification-before-completion owns the "is it actually fixed?" check. Fix coordinates; it doesn't pretend to own either of those disciplines.
npx claudepluginhub estevanhernandez-stack-ed/vibe-test --plugin vibe-testProvides UI/UX resources: 50+ styles, color palettes, font pairings, guidelines, charts for web/mobile across React, Next.js, Vue, Svelte, Tailwind, React Native, Flutter. Aids planning, building, reviewing interfaces.
Fetches up-to-date documentation from Context7 for libraries and frameworks like React, Next.js, Prisma. Use for setup questions, API references, and code examples.