From coding-debugger
Use for deep iterative debugging when memory lookup returns LIKELY_MATCH/WEAK_SIGNAL/NO_MATCH, a fix didn't hold, or the user asks for root cause analysis. Not for known fixes or trivial issues.
How this skill is triggered — by the user, by Claude, or both
Slash command
/coding-debugger:debug-loopThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
A 7-phase debugging loop: investigate via causal tree analysis, hypothesize root cause, implement targeted fix, verify with evidence, score against criteria, pressure-test via critique agent, and report with transparency markers. Iterates up to 5x on failures.
A 7-phase debugging loop: investigate via causal tree analysis, hypothesize root cause, implement targeted fix, verify with evidence, score against criteria, pressure-test via critique agent, and report with transparency markers. Iterates up to 5x on failures.
Before entering the loop, assess whether it's warranted. The trigger is the verdict category, not a numeric score — research shows LLM-assigned confidence scores are poorly calibrated for open-ended tasks (Tian et al., EMNLP 2023; 49-84% calibration error on open-ended generation).
KNOWN_FIX — apply the fix directly and verifyLIKELY_MATCH, WEAK_SIGNAL, or NO_MATCH, the user asks for deep investigation, the initial diagnosis feels superficial, or a previous fix attempt didn't holdGoal: Understand what's actually failing and why, not just what it looks like.
search MCP tool with the symptom. Note any related incidentsOutput: Causal tree (with confirmed and pruned branches), reproduction steps, evidence gathered, research performed
Goal: Commit to a specific, testable hypothesis before writing any fix.
Output: Hypothesis statement, evidence level, prediction test, related symptom predictions
Goal: Make the minimal change that addresses the hypothesized root cause.
Output: List of changes with rationale
Goal: Collect concrete evidence that the fix works.
Every verification step must produce evidence: command output, test results, or observable behavior. "It should work" is not evidence.
Output: Evidence for each verification step
Goal: Objective pass/fail assessment with evidence.
Score against these criteria:
| # | Criterion | Method | Pass Condition | Evidence Required |
|---|---|---|---|---|
| 1 | Symptom resolved | Reproduction steps | Symptom no longer occurs | Command output or test result |
| 2 | Tests pass | Test suite | All relevant tests pass | Test runner output |
| 3 | No regressions | Broader test suite | No new failures introduced | Test runner output |
| 4 | Root cause addressed | Code review | Fix targets root cause, not symptom | Diff + reasoning |
| 5 | Hypothesis confirmed | Prediction test | Prediction test passes | Test output |
All criteria must have evidence. No criterion marked PASS without proof.
If any criterion fails → enter iteration (Phase 6 rules apply) If all criteria pass → proceed to critique (Phase 6)
Output: Scorecard with pass/fail per criterion and evidence
Goal: Challenge the fix before the user relies on it.
The critique agent checks 5 things:
Goal: Clear, honest summary. No overclaiming.
Every item in the report gets one marker:
store MCP tool for future retrievaloutcome MCP tool.claude-code-debugger/debug-loop/scorecard.mdWhen any criterion fails or the critique is CHALLENGED, iterate:
Load references/convergence-rules.md for detailed rules and escalation templates.
Summary:
Write iteration state to .claude-code-debugger/debug-loop/state.json:
{
"symptom": "original symptom",
"iteration": 1,
"phase": "VERIFY",
"hypotheses": [
{
"iteration": 1,
"hypothesis": "description",
"evidence_level": "strong | moderate | weak",
"result": "confirmed | disproved | partial",
"evidence": "what was found"
}
],
"scorecard": [
{
"criterion": "symptom_resolved",
"result": "PASS | FAIL",
"evidence": "summary"
}
],
"critique_verdict": "APPROVED | CHALLENGED | pending",
"changes_made": ["file:change summary"]
}
Create the directory with mkdir -p .claude-code-debugger/debug-loop/ before writing.
MEMORY SEARCH → INVESTIGATE → HYPOTHESIZE → FIX → VERIFY → SCORE
↓
All pass? ──yes──→ CRITIQUE ──approved──→ REPORT
↓ ↓
no challenged
↓ ↓
ITERATE ←──────────────┘
(up to 5x)
| Anti-Pattern | What to Do Instead |
|---|---|
| Accepting the first explanation | Branch first — identify 2+ plausible causes before pursuing any |
| Fixing the symptom | Trace to root cause, fix there |
| "This should fix it" | Run the tests, show the output |
| Retrying the same approach | If it failed once with the same evidence, it'll fail again. Change the hypothesis |
| Declaring victory without evidence | Every claim needs a ✅/⚠️/❓ marker |
| Skipping research when stuck | If you don't know why something behaves this way, search for it |
| Hiding uncertainty | ⚠️ and ❓ are not failures — they're honest. Hiding them is the failure |
Guides creation, editing, and verification of skills for AI coding agents using test-driven development with subagent scenarios. Use when authoring or debugging skills.
npx claudepluginhub tyroneross/claude-code-debugger --plugin claude-code-debugger