Skill

Run — parallel experiment execution

Internal step of /god-of-debugger. Dispatches one hypothesis-runner subagent per hypothesis in parallel. Aggregates verdicts into survival table. Invoked after hypotheses written to session.

Popularity

Parent stars

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/god-of-debugger:run

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Orchestrate experiments from `debug`. One subagent per hypothesis (`hypothesis-runner`, or `bisect-runner` for bisect). Run **parallel**. Strict verdict back.

SKILL.md

105 lines · ~1k tokens

Stats

LanguageJavaScript

Parent stars1

MaintenanceExcellent

Last CommitApr 20, 2026

Actions

View Source View Plugin View on GitHub View README

Stats

Actions

Run — parallel experiment execution

Orchestrate experiments from debug. One subagent per hypothesis (hypothesis-runner, or bisect-runner for bisect). Run parallel. Strict verdict back.

Also maintain session cost_log. Record what happened.

Inputs

.god-of-debugger/current → active session_id.
.god-of-debugger/sessions/<session_id>.json → bug, repro, localization, hypotheses.

Missing either → STOP. Tell user run /god-of-debugger:repro + /god-of-debugger:debug first.

Pre-flight

Session has ≥2 hypotheses with experiment specs.
status == "open" (not closed, not repro_unstable).
Skip session.parked — need runners not in v0.1.

Orchestration

Every non-parked hypothesis → dispatch one subagent. Single message. Multiple Task tool calls. Parallel.
- experiment.kind == "bisect" → bisect-runner
- else → hypothesis-runner
Pass each subagent only: { session_id, bug_summary, repro: {command, hit_rate}, hypothesis, budget, repo_path }. Never other hypotheses. Never prior verdicts. Isolation = point.
Wait all verdicts. No early exit.
Verify each subagent wrote .god-of-debugger/experiments/<Hn>/preregistered.json, verdict.json, probe.diff (if edit), run.log. Missing → mark inconclusive, evidence: "runner returned incomplete artifacts".
Update session.hypotheses[i] with verdict, evidence, artifact_path, budget_consumed, retries, confidence, falsification_check.
Append per-run usage to session.cost_log.runs[] with hypothesis id, origin, model, tokens if available.
Write survivor set to session.survivors. Touch session.updated_at.

Output — two parts

Part 1 — strict JSON (for pipeline)

{
  "session_id": "<id>",
  "summary": [
    {
      "id": "H1",
      "origin": "primary",
      "axis": "concurrency",
      "verdict": "killed",
      "evidence": "<one line>"
    }
  ],
  "survivors": ["H3", "H4"],
  "inconclusive": [],
  "killed": ["H1", "H2", "H5"]
}

Part 2 — markdown table (for user, in CC terminal)

Render exactly this shape. Columns in order:

| ID | Origin | Axis | Verdict | Evidence (≤60 chars) |

Origin: prim | adv
Verdict: killed | survived | inconc
Order rows: killed first, then survived, then inconclusive.
Truncate evidence with … if >60 chars.

Then prose, branch on outcome:

0 survivors (all killed)

Round 1 missed. Don't silently regenerate.

"All killed. Cause in axis not covered. Run /god-of-debugger:debug again. Regenerate with ruled-out axes deprioritized. Must cover ≥2 axes not yet tried. Cap: 2 regen rounds."

Counter in session.regenerations (increment each). After 2, halt, hand to user.

Exactly 1 survivor

Likely root cause. State plainly.

"H_n survived. Origin: . Axis: . Evidence: . Run /god-of-debugger:promote to convert experiment to regression test before fix."

2+ survivors

Experiments not discriminating.

"N survived (H_i, H_j, ...). Design experiment that distinguishes them — their expected_if_true must differ. Loop back to /god-of-debugger:debug with survivor set as context."

Flaky repro mid-run

Same hypothesis flips verdict across retries (retries > 1 with differing evidence) → STOP aggregation. session.status = "repro_unstable". Tell user: "Repro went flaky during falsification. Harden via /god-of-debugger:repro before retry."

Rules

Never ship fix here. Hook blocks anyway if survivors != 1.
Never killed without quoted evidence from subagent.
inconclusive is first-class. Includes: timeout, crash, ambiguous output, budget exhaust. Never coerce to killed/survived.
Subagent crashes / times out → inconclusive, not survived.
Parallel dispatch non-negotiable. Sequential defeats design.
Preserve origin in every table. Users must see when adversarial is last standing.

Run — parallel experiment execution

Popularity

Invocation

Context Preview

SKILL.md

Run — parallel experiment execution

Popularity

Invocation

Context Preview

SKILL.md

Run — parallel experiment execution

Inputs

Pre-flight

Orchestration

Output — two parts

Part 1 — strict JSON (for pipeline)

Part 2 — markdown table (for user, in CC terminal)

0 survivors (all killed)

Exactly 1 survivor

2+ survivors

Flaky repro mid-run

Rules

Similar Skills

Run — parallel experiment execution

Inputs

Pre-flight

Orchestration

Output — two parts

Part 1 — strict JSON (for pipeline)

Part 2 — markdown table (for user, in CC terminal)

0 survivors (all killed)

Exactly 1 survivor

2+ survivors

Flaky repro mid-run

Rules

Similar Skills