Skill

adversarial-review

Run an adversarial Enthusiast→Adversary→Judge debate review on code. Automatically converges — no manual round control needed. Use when the user says 'adversarial review', 'debate review', 'run a review round', 'do a review round', 'review code with debate agents', 'i want an adversarial review', or '/autoimprove review'. Do NOT trigger on generic 'review' requests or PR reviews. Takes a file, diff, or PR as target.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/autoimprove:adversarial-review [file|diff|glob]

User invocable

Model invocable

Inline context

Default effort

Argument hint: [file|diff|glob]

Tool Access

This skill is limited to the following tools:

Read, Glob, Grep, Bash, Agent, TaskCreate, TaskUpdate

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

<SKILL-GUARD>

Supporting Files

heuristics/code-escape-categories.md
iterate-mode.md
ops-telemetry.md
output-format.md
references/output-schema.md
rubric-check.md
telemetry-spec.md

SKILL.md

503 lines · ~9.8k tokens (exceeds 5k compaction limit)

Stats

Language: Shell

Stars: 2

Maintenance: Excellent

Last Commit: Jun 13, 2026

MANDATORY CHAIN: E → A → J

This chain is not open to interpretation. Follow each numbered step exactly. No step may be skipped, reordered, or improvised.

Agents are loaded via subagent_type — do NOT inline their prompts.

Paradigm: Agents have read-only tools (Read, Grep, Glob) and Read the target themselves from a pinned path the orchestrator computes once per run — the orchestrator does NOT inline target text into prompts. Determinism comes from a sha256 pin captured at STEP 1 and re-verified at STEP 5; intra-run races are logged, not aborted.

STEP 0 — ADAPTIVE MODE DETECTION

Measure target size. Mode gates max rounds and prompt depth.

Diff target: git diff HEAD | wc -l (or --staged).
File/glob target: count lines via Read/Glob.

| Condition | MODE | MAX_ROUNDS |
|---|---|---|
| Target file has `.md` extension | `FULL` | 10 |
| Single file OR diff ≤ 150 lines | `LIGHTWEIGHT` | 3 |
| Multi-file OR diff > 150 lines | `FULL` | 10 |

.md override: If the target is a .md file, force FULL mode regardless of line count.

Log: "[AR] Mode: {MODE} ({N} lines, max_rounds: {MAX_ROUNDS})". If .md override: append " [spec-mode: .md override]".
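The mode table can be read as a small decision function. The sketch below is illustrative only and takes the table literally: it assumes a single file is LIGHTWEIGHT regardless of length and that only diffs are gated on the 150-line threshold. The function names and the `target_kind` values (borrowed from STEP 1) are assumptions, not part of the skill.

```python
from typing import Tuple

LIGHTWEIGHT_DIFF_LIMIT = 150  # threshold taken from the mode table


def detect_mode(target_kind: str, line_count: int, is_markdown: bool = False) -> Tuple[str, int]:
    """Return (MODE, MAX_ROUNDS). target_kind is "file", "diff", or "manifest" (multi-file)."""
    if is_markdown:
        return ("FULL", 10)  # .md override wins regardless of line count
    if target_kind == "manifest":
        return ("FULL", 10)  # multi-file target
    if target_kind == "diff" and line_count > LIGHTWEIGHT_DIFF_LIMIT:
        return ("FULL", 10)
    return ("LIGHTWEIGHT", 3)  # single file, or diff of at most 150 lines


def mode_log_line(mode: str, line_count: int, max_rounds: int, md_override: bool = False) -> str:
    """Build the STEP 0 log line, with the optional .md override suffix."""
    line = f"[AR] Mode: {mode} ({line_count} lines, max_rounds: {max_rounds})"
    return line + (" [spec-mode: .md override]" if md_override else "")
```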

STEP 1 — RESOLVE TARGET

Target is: file path, glob, or the literal string "diff".

Optional flags (strip all from the target before resolving the path):

--model <haiku|sonnet|opus> overrides the model tier for this run → MODEL_ARG. If absent, MODEL_ARG = null (the default resolves to opus; see STEP 2).
--passes <N> overrides the Enthusiast multi-pass count for this run → PASSES_ARG (integer ≥ 1). If absent, PASSES_ARG = null (resolves per mode; see STEP 2).
--iterate switches from single-pass review to the review-AND-fix outer loop (review → apply confirmed fixes to an isolated worktree → re-review → converge). --quality-critical pairs with it to pin the loop at the opus ceiling. If --iterate is present: Read skills/adversarial-review/iterate-mode.md and follow it instead of STEP 2–5 below — that file owns the outer-loop contract (agents stay read-only; the orchestrator applies fixes only to a dedicated worktree; never auto-merge). The single-pass flow below is the inner round it reuses.

Store RUN_ID early so this step can write fixture files into $RUN_DIR:

RUN_ID="$(date -u +%Y%m%d-%H%M%S)-$(echo "$TARGET_ARG" | tr -c 'a-z0-9' '-' | cut -c1-40)"
RUN_DIR="$HOME/.autoimprove/runs/$RUN_ID"
mkdir -p "$RUN_DIR"

Resolve the target into a single canonical path agents will Read:

| Argument | TARGET_KIND | Action |
|---|---|---|
| `"diff"` | `diff` | `git diff HEAD > $RUN_DIR/diff.patch` (fallback `--staged`). `TARGET_PATH = $RUN_DIR/diff.patch`. If empty → stop. |
| Single file path | `file` | `TARGET_PATH = <absolute path>`. Verify Readable. |
| Glob or multi-file list | `manifest` | Expand via Glob → list of absolute paths. Write to `$RUN_DIR/manifest.txt`, one path per line. `TARGET_PATH = $RUN_DIR/manifest.txt`. |

Pin canonical content (load-bearing — replaces RECEIPT_NONCE):

# Single-target sha for file/diff kinds; for manifest, hash each listed file, then hash the concatenated sha256sum output.
case "$TARGET_KIND" in
  file|diff) sha256sum "$TARGET_PATH" | awk '{print $1}' > "$RUN_DIR/target.sha256" ;;
  # IFS= and -r keep leading whitespace and backslashes in listed paths intact.
  manifest)  (while IFS= read -r p; do sha256sum "$p"; done < "$TARGET_PATH") | sha256sum | awk '{print $1}' > "$RUN_DIR/target.sha256" ;;
esac
TARGET_SHA="$(cat "$RUN_DIR/target.sha256")"
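For readers who prefer to see the pin outside of shell, here is a rough Python equivalent of the same idea. It is a sketch under assumptions: `pin_target` and `sha256_file` are hypothetical names, and the manifest branch imitates the `<hex>  <path>` line layout that `sha256sum` prints for ordinary paths, which has not been verified byte-for-byte against the shell pipeline.

```python
import hashlib
from pathlib import Path


def sha256_file(path: str) -> str:
    """Hex sha256 of a file's bytes, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def pin_target(target_path: str, target_kind: str) -> str:
    """Compute the pin: one hash for file/diff, a hash of per-file hash lines for manifest."""
    if target_kind in ("file", "diff"):
        return sha256_file(target_path)
    if target_kind == "manifest":
        outer = hashlib.sha256()
        for listed in Path(target_path).read_text().splitlines():
            if not listed:
                continue  # skip blank manifest lines (assumption)
            # Imitates one line of sha256sum output: "<hex>  <path>\n".
            outer.update(f"{sha256_file(listed)}  {listed}\n".encode())
        return outer.hexdigest()
    raise ValueError(f"unknown TARGET_KIND: {target_kind}")
```

Because the manifest pin hashes every listed file, changing any one of them changes the pin, which is what the STEP 5 re-check relies on.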

Compute ALLOWED_FILES (load-bearing — replaces CONTEXT_PAYLOAD membership derivation):

file kind → ALLOWED_FILES = [TARGET_PATH]
manifest kind → ALLOWED_FILES = <every path inside manifest.txt>
diff kind → ALLOWED_FILES = <paths from "+++ b/<path>" and "diff --git" headers in diff.patch>

Store ALLOWED_FILES for the membership gate in STEP 3.
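As an illustration of the diff case, the following sketch collects paths from `+++ b/<path>` and `diff --git` headers. It is not the skill's implementation: the function name is hypothetical, paths are stored without the `b/` prefix (an assumption), and a content line that happens to start with `+++ b/` would be misread.

```python
import re
from typing import List

# Matches "diff --git a/<old> b/<new>"; the greedy first group tolerates "/b/" inside paths.
_GIT_HEADER = re.compile(r"^diff --git a/(.+) b/(.+)$")


def allowed_files_from_diff(diff_text: str) -> List[str]:
    """Return de-duplicated paths named in the diff headers, in first-seen order."""
    seen = {}  # dict preserves insertion order and de-duplicates
    for line in diff_text.splitlines():
        if line.startswith("+++ b/"):
            seen.setdefault(line[len("+++ b/"):].rstrip("\t"), None)
            continue
        match = _GIT_HEADER.match(line)
        if match:
            seen.setdefault(match.group(2), None)
    return list(seen)
```

Deleted files show `+++ /dev/null`, so in this sketch they enter the list only through their `diff --git` header.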

Build CONTEXT_BRIEF (~500 tokens, no extra disk reads beyond the target):

code target: exported function/type signatures (first line each), imports, TODO/FIXME markers — derived from the first 500 lines of each path.
spec target: all ## headings + first sentence each; planned-work markers (Phase N, Future, TODO, Will add).

If TARGET_PATH unreadable or content empty → stop and inform user.

STEP 2 — INITIALIZE RUN

Write $RUN_DIR/meta.json:

{ "run_id": "<RUN_ID>", "target_path": "<TARGET_PATH>", "target_kind": "<TARGET_KIND>", "target_sha": "<TARGET_SHA>", "date": "<ISO>", "mode": "<MODE>", "rounds_planned": <N>, "rounds_completed": 0, "status": "running" }

Initialize state:

ROUND = 1; ROUNDS = []; PRIOR_JUDGE_OUTPUT = null; PRIOR_JUDGE_SUMMARY = null
ROUND_YIELDS = []; TARGET_TYPE = "code"; converged = false
AGENT_ENTHUSIAST = "autoimprove:enthusiast"
AGENT_ADVERSARY  = "autoimprove:adversary"
AGENT_JUDGE      = "autoimprove:judge"
MODEL_LADDER = ["haiku", "sonnet", "opus"]
# AR_MODEL — default opus (both variants); quality-first, configurable DOWN for cheap runs.
# Resolution order: MODEL_ARG (--model)  >  autoimprove.yaml `adversarial_review.default_model`  >  "opus".
AR_MODEL = MODEL_ARG ?? yaml("adversarial_review.default_model") ?? "opus"
ROUND_MODEL = AR_MODEL
# AR_PASSES — Enthusiast multi-pass (pass@k); passes are UNIONed to reduce catch variance.
# Resolution order: PASSES_ARG (--passes)  >  yaml("adversarial_review.passes")  >  default-by-mode (FULL → 2, LIGHTWEIGHT → 1).
AR_PASSES = PASSES_ARG ?? yaml("adversarial_review.passes") ?? (MODE == "FULL" ? 2 : 1)
# Nature-aware stop bookkeeping (STEP 3D): the run stops only for an ENUMERATED reason.
STOP_REASON = null            # one of: all-self-reference | zero-findings | anti-divergence-halt | max-rounds
RUNS_WORSENING = 0            # anti-divergence counter; halt at 2
PRIOR_ROUND_RESOLVED = 0      # substantive findings that became self-reference last round
PRIOR_SUBSTANTIVE = null      # substantive count of the prior round (for divergence comparison)
CONFIRMED_LOCATIONS = []   # (file, line) tuples from enthusiast/split rulings
FILE_FINDING_COUNTS = {}   # {filepath → confirmed count} updated after each Judge
REVIEWED_FILES = Set()     # subset of ALLOWED_FILES sent to Enthusiast in any prior round
# Task IDs: TASK_E_ID=null, TASK_A_ID=null, TASK_J_ID=null, TASK_PREV_J_ID=null
# Cost: RUN_START_NS=$(date +%s)000000000, SESSION_ID="${CLAUDE_SESSION_ID:-}", PROJECT_SLUG=""
# Agent IDs (per round): AR_ENTHUSIAST_AGENT_ID=null, AR_ADVERSARY_AGENT_ID=null, AR_JUDGE_AGENT_ID=null

Initialize run (Bash):

echo '{"schema_version": 1, "agents": []}' > "$RUN_DIR/agents.json"
date +%s000000000 > "$RUN_DIR/.run_start_ns"
echo "${CLAUDE_SESSION_ID:-}" > "$RUN_DIR/.session_id"
echo "$RUN_DIR" > "$HOME/.autoimprove/.current_run_dir"

Target Type Detection

If TARGET_PATH ends with .md AND (first 20 lines contain ## Implementation Plan / ## Spec / ## Design / ## Plan, OR path is in docs/superpowers/): set TARGET_TYPE = "spec"
Otherwise: TARGET_TYPE = "code"

AGENT_ENTHUSIAST = TARGET_TYPE == "spec" ? "autoimprove:enthusiast-spec" : "autoimprove:enthusiast"
AGENT_ADVERSARY  = TARGET_TYPE == "spec" ? "autoimprove:adversary-spec"  : "autoimprove:adversary"
AGENT_JUDGE      = TARGET_TYPE == "spec" ? "autoimprove:judge-spec"       : "autoimprove:judge"
# ROUND_MODEL is NOT set by TARGET_TYPE — it was already resolved from AR_MODEL above
# (default opus for both spec and code variants; configurable down via --model / yaml).
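The detection rule and the agent selection above can be sketched as two small functions. This is an assumption-laden illustration: "contain" is implemented as a substring test (so `## Spec` also matches `## Specification`), "path is in docs/superpowers/" is also a substring test, and both function names are hypothetical.

```python
from typing import Iterable, Tuple

SPEC_HEADINGS = ("## Implementation Plan", "## Spec", "## Design", "## Plan")


def detect_target_type(target_path: str, lines: Iterable[str]) -> str:
    """Return "spec" for .md targets that look like plans/specs, else "code"."""
    if not target_path.endswith(".md"):
        return "code"
    head = list(lines)[:20]  # only the first 20 lines are inspected
    has_heading = any(heading in line for line in head for heading in SPEC_HEADINGS)
    in_docs = "docs/superpowers/" in target_path
    return "spec" if (has_heading or in_docs) else "code"


def agents_for(target_type: str) -> Tuple[str, str, str]:
    """Return (enthusiast, adversary, judge) subagent names for the target type."""
    suffix = "-spec" if target_type == "spec" else ""
    return tuple(f"autoimprove:{role}{suffix}" for role in ("enthusiast", "adversary", "judge"))
```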

Create progress tasks (MANDATORY before dispatching any agent):

TASK_E_ID = TaskCreate({content: "🔍 Enthusiast: find strengths and risks",  status: "pending"}).id
TASK_A_ID = TaskCreate({content: "⚔️ Adversary: map safe zones + heuristics", status: "pending"}).id
TASK_J_ID = TaskCreate({content: "⚖️ Judge: deliver final verdict",            status: "pending", blocked_by: [TASK_A_ID]}).id

STEP 3 — DEBATE LOOP

Before entering the loop: Read skills/adversarial-review/ops-telemetry.md — contains all bash blocks for §Start Time, §Capture AgentId, and §Round Telemetry used in 3A/3B/3C.

Localization preference (pre-warm — resolve once, before dispatch): Read localization.primary_language and localization.allowed_languages from autoimprove.yaml → REPO_PRIMARY_LANGUAGE, ALLOWED_LANGUAGES (default ALLOWED_LANGUAGES to empty list [] when the key is absent or commented out). If primary_language is unset: ASK the user once before dispatching — "Primary language for this repo's docs/comments/specs? (blank = skip localization checks; optionally list other allowed languages)". If running non-interactively, set REPO_PRIMARY_LANGUAGE = "" (agents skip the check when it is empty). Unlike the code checklist, the localization mandate lives in every agent and applies to both code and spec targets — the <localization> block below is injected into ALL THREE dispatches (Enthusiast, Adversary, Judge).

Checklist load (code targets only): If TARGET_TYPE == "code", Read skills/adversarial-review/heuristics/code-escape-categories.md and store its body as CODE_CHECKLIST. If TARGET_TYPE == "spec", set CODE_CHECKLIST = "" — the code escape categories are NOT valid for prose/spec review (spec reviewers need their own taxonomy; see issue #130 track 2). The checklist is injected into the Enthusiast dispatch only (v1); it is NOT passed to the Adversary or Judge — injecting a find-checklist into the debunker/arbiter is an unmeasured prompt-wish gated on the gate-recall harness + a behavioral probe.

PARALLEL RULE: 3A (Enthusiast) and 3B (Adversary) are dispatched simultaneously in the same message turn. 3C (Judge) dispatches only after both complete.

TARGET-PATH CONTRACT: Pass the TARGET_PATH + TARGET_KIND + ALLOWED_FILES list in every Agent() prompt — the agent will Read from there. Never inline the target text. The agent is responsible for reading; the orchestrator is responsible for pinning, membership, and post-hoc verification.

Repeat 3A and 3B (in parallel) → 3C → 3D until converged = true or ROUND > MAX_ROUNDS.

STEP 3A — ENTHUSIAST (MANDATORY)

Compliance pre-check: If ROUND > MAX_ROUNDS, exit loop.

Per-round task tree (FULL mode, Round 2+):

if MODE == "FULL" AND ROUND > 1:
  TASK_E_ID = TaskCreate({content: "🔍 Round {ROUND}: Enthusiast", status: "pending", blocked_by: [TASK_PREV_J_ID]}).id
  TASK_A_ID = TaskCreate({content: "⚔️ Round {ROUND}: Adversary",  status: "pending"}).id
  TASK_J_ID = TaskCreate({content: "⚖️ Round {ROUND}: Judge",       status: "pending", blocked_by: [TASK_A_ID]}).id

Mark: TaskUpdate(TASK_E_ID, {status: "in_progress"}).

CONFIRMED_LOCATIONS (round > 1): extract (file, line) from prior rulings where winner ∈ {enthusiast, split}. Format: "src/foo.ts:42, src/bar.ts:17".

RELEVANT_FILES (round > 1, multi-file/manifest only):

RELEVANT_FILES = files where FILE_FINDING_COUNTS[file] > 0
              + files where file not in REVIEWED_FILES
if RELEVANT_FILES is empty: RELEVANT_FILES = ALLOWED_FILES
Log: "[AR] R{ROUND} file budget: {RELEVANT_FILES.length}/{ALLOWED_FILES.length} files"
REVIEWED_FILES.update(RELEVANT_FILES)

For R1, single-file, diff: RELEVANT_FILES = ALLOWED_FILES.

If RELEVANT_FILES differs from ALLOWED_FILES, write $RUN_DIR/round-{ROUND}-relevant.txt listing the active subset, one path per line, and pass it to the agent as RELEVANT_PATH. Otherwise RELEVANT_PATH = TARGET_PATH.
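The file budget described above might look like this in code. It is a minimal sketch: the function name and parameters are hypothetical, and it assumes REVIEWED_FILES is also updated in Round 1 (the text states the Round 1 selection but not the bookkeeping).

```python
from typing import Dict, List, Set


def relevant_files(
    round_no: int,
    allowed_files: List[str],
    file_finding_counts: Dict[str, int],
    reviewed_files: Set[str],
    multi_file: bool,
) -> List[str]:
    """Pick the files to send this round and record them in reviewed_files."""
    if round_no == 1 or not multi_file:
        selected = list(allowed_files)  # R1, single-file, diff: everything
    else:
        selected = [
            path for path in allowed_files
            if file_finding_counts.get(path, 0) > 0 or path not in reviewed_files
        ]
        if not selected:
            selected = list(allowed_files)  # empty budget falls back to the full set
    reviewed_files.update(selected)
    return selected
```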

Start time: Run §Start Time from ops-telemetry.md substituting {ROLE}=enthusiast.

Pass@k (Enthusiast multi-pass): The Enthusiast runs AR_PASSES times this round; passes are UNIONed (variance reduction). The Adversary (3B) is dispatched in the same turn and runs once, because it maps safe zones independently of Enthusiast findings. The Judge (3C) also runs once, after the union. Only the Enthusiast is repeated.

Dispatch all AR_PASSES Enthusiast passes AND 3B simultaneously (same message turn). For each pass in 1..AR_PASSES:

Agent(
  subagent_type: AGENT_ENTHUSIAST, model: ROUND_MODEL,
  prompt: "[AR Round {ROUND} — {MODE}] (pass {pass}/{AR_PASSES})

FIRST ACTION REQUIRED — non-negotiable: Your very first tool call MUST be `Read` against the path inside <target path=…>. Do not begin reasoning, do not emit any prose, do not emit any JSON until your Read tool call has returned the file contents. For kind=\"file\" Read the path directly. For kind=\"diff\" Read the path as a unified diff. For kind=\"manifest\" or kind=\"relevant_manifest\" Read the manifest first, then Read each listed file. If you skip the Read tool call, the orchestrator detects it (it counts your tool invocations) and discards your findings as confabulated. **Do not read any path outside the manifest/target — off-target reads cause the orchestrator to discard your findings and abort the run.**

<target path=\"{RELEVANT_PATH}\" kind=\"{TARGET_KIND_OR_relevant_manifest}\" sha256=\"{TARGET_SHA}\" />
<brief>{CONTEXT_BRIEF}</brief>
<if REPO_PRIMARY_LANGUAGE != \"\"><localization primary=\"{REPO_PRIMARY_LANGUAGE}\" allowed=\"{ALLOWED_LANGUAGES}\">Also apply your Localization Consistency mandate for the primary language above — each agent per its own role (Enthusiast flags deviations; Adversary maps legitimately-multilingual content as safe zones; Judge rules localization findings on merit). Deviations are content in a non-primary language: intra-file mixing or whole files. Exempt i18n/locale resources, quoted foreign text or errors, proper nouns, identifiers, and the allowed languages.</localization></if>
<if CODE_CHECKLIST != \"\">Before sign-off, scan the target for each escape category below; spend attention on the [semantic — PRIORITY] rows (the ceiling model still misses those). This list WIDENS coverage — it is not a fence: report defects outside it, and never invent a finding to 'cover' a category.
<checklist>{CODE_CHECKLIST}</checklist></if>
<if round > 1>BLOCKLIST (do not re-raise): {CONFIRMED_LOCATIONS}
Prior summary: {PRIOR_JUDGE_SUMMARY}
Find issues NOT in the blocklist only.</if>

After every required Read has returned, emit ONLY valid JSON per your schema. No preamble, no markdown fences."
)
ENTHUSIAST_PASS_RESULT[pass] = result

(TARGET_KIND_OR_relevant_manifest = "relevant_manifest" when RELEVANT_PATH != TARGET_PATH, else TARGET_KIND. The <checklist> block is present ONLY for code targets — CODE_CHECKLIST is empty for spec targets, so spec passes carry no code checklist. The Adversary and Judge prompts NEVER carry the checklist.)

Per-pass gates (run for EVERY pass before accepting it):

Capture agentId: Run §Capture AgentId from ops-telemetry.md for each pass, substituting {ROLE}=enthusiast and {AGENT_ID} with that pass's returned value (null/missing → run null-case block). The last pass's id occupies the round-telemetry slot; for cost accounting note that with AR_PASSES > 1 the round records one pass's id (full per-pass cost aggregation is a tracked follow-up).

Dispatch authenticity gate (MANDATORY — anti-hallucination): If ANY pass's Agent() call did not run a real subagent — i.e. the pass result has no agentId, OR the Agent tool is unavailable / errored / disabled in this environment — ABORT the entire AR run immediately. Manual self-review (the orchestrator role-playing the Enthusiast/Adversary/Judge inline) is FORBIDDEN: it produces hallucinated reviews against a fabricated codebase. Do NOT continue to 3B/3C/STEP 4. Write meta.json with status:"aborted", reason:"agent_dispatch_unavailable". Emit exactly: "[AR] ABORTED: real subagent dispatch unavailable — adversarial review must run as isolated subagents, never inline self-review. Findings would be hallucinated. Re-run in an environment where the Agent tool is available." Then stop.

Tool-use gate (MANDATORY — anti-confabulation, action-side): For each pass, verify the dispatched agent actually invoked Read on the pinned target instead of emitting findings against an imagined one. The agent JSONL records every tool invocation. Set AR_ENTHUSIAST_AGENT_ID to the pass's agentId before running:

PROJECT_SLUG=$(echo "$PWD" | sed 's|/|-|g')
JSONL="$HOME/.claude/projects/$PROJECT_SLUG/${CLAUDE_SESSION_ID:-}/subagents/agent-${AR_ENTHUSIAST_AGENT_ID}.jsonl"
if [ -f "$JSONL" ]; then
  ENTHUSIAST_TOOL_USES=$(jq -c '.message | (.content // []) | map(select(.type=="tool_use")) | length' "$JSONL" 2>/dev/null | awk '{s+=$1} END {print s+0}')
else
  ENTHUSIAST_TOOL_USES=0
fi

If ENTHUSIAST_TOOL_USES == 0 for ANY pass — that pass emitted a reply without ever calling Read (or any other tool) — ABORT the run: log enthusiast_no_tool_invocation with the pass's AR_ENTHUSIAST_AGENT_ID, write meta.json status:"aborted", reason:"enthusiast_no_tool_invocation", emit "[AR] ABORTED: Enthusiast emitted findings without invoking Read. Hallucination prevented.", stop. Do NOT continue to 3B/3C.
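The jq count in the gate can be mirrored in Python for testing or for environments without jq. This is a sketch that assumes the JSONL shape implied by the jq filter (one JSON object per line, with `message.content` as a list of typed items); the function name is hypothetical. Like the jq version, it counts any `tool_use` item and does not check that the tool was Read.

```python
import json
from pathlib import Path


def count_tool_uses(jsonl_path: str) -> int:
    """Count tool_use items across all records; a missing file counts as 0."""
    path = Path(jsonl_path)
    if not path.is_file():
        return 0
    total = 0
    for raw in path.read_text().splitlines():
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # tolerate blank or partial lines
        if not isinstance(record, dict):
            continue
        message = record.get("message")
        content = message.get("content") if isinstance(message, dict) else None
        if isinstance(content, list):  # string content carries no tool calls
            total += sum(
                1 for item in content
                if isinstance(item, dict) and item.get("type") == "tool_use"
            )
    return total
```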

Validate (per pass): Parse each pass's JSON. Invalid → re-prompt that pass once. Still invalid → log enthusiast_malformed_json for that pass and treat it as findings: [] (a single bad pass does NOT abort when AR_PASSES > 1 because the union absorbs it; if ALL passes are invalid, skip 3B/3C and go to 3D with findings: []). In Round 1, if a pass's output is ≤ 50 characters → re-prompt that pass once.

Union the passes: ENTHUSIAST_OUTPUT.findings = union of all valid passes' findings. Dedup two findings ONLY when they are the same defect, not merely co-located — distinct defects at the same line must NOT collapse. Match key, in order: (a) code — same enclosing symbol/declaration AND same type/category; (b) spec — same ## heading / labeled section AND same category; (c) fallback when no symbol/section anchor is available — (file, line) within ±5 lines AND same type/category. Same-line findings with different type are kept as separate findings. On a match keep the highest severity and merge evidence. Re-id the unioned set sequentially (F1..Fn) so downstream finding_id references are unique. Log "[AR] R{ROUND} pass@{AR_PASSES}: {per-pass counts} → {union count} unioned". Store as ENTHUSIAST_OUTPUT.
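The union rule is easier to check against a concrete sketch. Everything about the finding shape here is an assumption made for illustration (`file`, `line`, `symbol`, `section`, `type`, `severity`, `evidence` as a list, and an `id` field), as is the `low` severity level and the requirement that symbol matches also share a file.

```python
from typing import Dict, List

SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}


def same_defect(a: Dict, b: Dict, is_spec: bool = False) -> bool:
    """Apply the match key: anchor + category, else (file, line ±5) + category."""
    if a.get("type") != b.get("type"):
        return False  # co-located findings of different type stay separate
    anchor = "section" if is_spec else "symbol"
    if a.get(anchor) and b.get(anchor):
        return a.get("file") == b.get("file") and a[anchor] == b[anchor]
    # Fallback: at least one side has no symbol/section anchor.
    if a.get("file") != b.get("file") or a.get("line") is None or b.get("line") is None:
        return False
    return abs(a["line"] - b["line"]) <= 5


def union_findings(passes: List[List[Dict]], is_spec: bool = False) -> List[Dict]:
    """Union all passes, keep highest severity on a match, merge evidence, re-id F1..Fn."""
    merged: List[Dict] = []
    for findings in passes:
        for finding in findings:
            match = next((m for m in merged if same_defect(m, finding, is_spec)), None)
            if match is None:
                merged.append({**finding, "evidence": list(finding.get("evidence", []))})
                continue
            new_rank = SEVERITY_RANK.get(finding.get("severity"), -1)
            if new_rank > SEVERITY_RANK.get(match.get("severity"), -1):
                match["severity"] = finding["severity"]
            for item in finding.get("evidence", []):
                if item not in match["evidence"]:
                    match["evidence"].append(item)
    for index, finding in enumerate(merged, start=1):
        finding["id"] = f"F{index}"
    return merged
```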

Target-membership gate (MANDATORY — load-bearing anti-hallucination): For every unioned finding with a non-null file: assert file ∈ ALLOWED_FILES (exact path match, or for diff targets the b/<path> from the unified diff header). If ANY finding references a path not in ALLOWED_FILES, the Enthusiast reviewed a fabricated or off-target tree — discard all findings this round, log enthusiast_hallucinated_offtarget with the offending paths, write meta.json status:"aborted", reason:"offtarget_hallucination", emit "[AR] ABORTED: Enthusiast findings reference files not in the pinned target — hallucinated review discarded." and stop. Never pass off-target findings to the Adversary or Judge.
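A minimal sketch of the membership check follows. The function name and the finding shape are hypothetical, and it assumes ALLOWED_FILES stores diff paths without the `b/` prefix, so a finding that cites `b/<path>` is accepted only for diff targets.

```python
from typing import Dict, Iterable, List


def offtarget_paths(findings: Iterable[Dict], allowed_files: Iterable[str], is_diff: bool = False) -> List[str]:
    """Return the distinct finding paths that are not in ALLOWED_FILES (empty list = gate passes)."""
    allowed = set(allowed_files)
    offending: List[str] = []
    for finding in findings:
        path = finding.get("file")
        if path is None:
            continue  # file-less findings are exempt from the gate
        candidates = {path}
        if is_diff and path.startswith("b/"):
            candidates.add(path[2:])
        if not (candidates & allowed) and path not in offending:
            offending.append(path)
    return offending
```

A non-empty result corresponds to the abort branch: the offending paths are what would be logged with enthusiast_hallucinated_offtarget.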

Pre-adversary dedup: Spec targets: skip, set NOVEL_FINDINGS = ENTHUSIAST_OUTPUT.findings. Code targets: match (file, line) against CONFIRMED_LOCATIONS (±5 lines). Log dismissed duplicates. Replace ENTHUSIAST_OUTPUT.findings with NOVEL_FINDINGS.

Mark: TaskUpdate(TASK_E_ID, {content: "🔍 AR Round {ROUND}: Enthusiast ({NOVEL_FINDINGS.length} findings)", status: "completed"}).

If NOVEL_FINDINGS is empty: skip 3C, go to 3D.

STEP 3B — ADVERSARY (PARALLEL WITH 3A)

Compliance pre-check: If ROUND > MAX_ROUNDS, exit loop.

Mark: TaskUpdate(TASK_A_ID, {content: "⚔️ AR Round {ROUND}: Adversary — mapping safe zones", status: "in_progress"}).

Reuse RELEVANT_PATH, TARGET_KIND_OR_relevant_manifest, TARGET_SHA, CONTEXT_BRIEF from 3A (do not recompute).

Start time: Run §Start Time from ops-telemetry.md substituting {ROLE}=adversary.

Agent(
  subagent_type: AGENT_ADVERSARY, model: ROUND_MODEL,
  prompt: "[AR Round {ROUND} — {MODE}] Map safe zones and heuristics.

FIRST ACTION REQUIRED — non-negotiable: Your very first tool call MUST be `Read` against the path inside <target path=…>. Do not begin reasoning, do not emit any prose, do not emit any JSON until your Read tool call has returned the file contents. Same kind→action rules as the Enthusiast (file/diff/manifest/relevant_manifest). The orchestrator counts your tool invocations and discards findings from agents that emit JSON without ever calling Read. **Do not read any path outside the manifest/target.**

<target path=\"{RELEVANT_PATH}\" kind=\"{TARGET_KIND_OR_relevant_manifest}\" sha256=\"{TARGET_SHA}\" />
<brief>{CONTEXT_BRIEF}</brief>
<if REPO_PRIMARY_LANGUAGE != \"\"><localization primary=\"{REPO_PRIMARY_LANGUAGE}\" allowed=\"{ALLOWED_LANGUAGES}\">Also apply your Localization Consistency mandate for the primary language above — each agent per its own role (Enthusiast flags deviations; Adversary maps legitimately-multilingual content as safe zones; Judge rules localization findings on merit). Deviations are content in a non-primary language: intra-file mixing or whole files. Exempt i18n/locale resources, quoted foreign text or errors, proper nouns, identifiers, and the allowed languages.</localization></if>

After every required Read has returned, emit ONLY valid JSON per your schema. No preamble, no markdown fences."
)
ADVERSARY_RESULT = result

Capture agentId: Run §Capture AgentId from ops-telemetry.md substituting {ROLE}=adversary.

Tool-use gate (MANDATORY — anti-confabulation, action-side):

PROJECT_SLUG=$(echo "$PWD" | sed 's|/|-|g')
JSONL="$HOME/.claude/projects/$PROJECT_SLUG/${CLAUDE_SESSION_ID:-}/subagents/agent-${AR_ADVERSARY_AGENT_ID}.jsonl"
if [ -f "$JSONL" ]; then
  ADVERSARY_TOOL_USES=$(jq -c '.message | (.content // []) | map(select(.type=="tool_use")) | length' "$JSONL" 2>/dev/null | awk '{s+=$1} END {print s+0}')
else
  ADVERSARY_TOOL_USES=0
fi

If ADVERSARY_TOOL_USES == 0 — ABORT the run with reason:"adversary_no_tool_invocation", emit "[AR] ABORTED: Adversary emitted output without invoking Read. Hallucination prevented.", stop.

Validate: Parse JSON. Invalid → re-prompt once. Still invalid → log adversary_malformed_json, use {"heuristics":[],"safe_zones":[]}. Store as ADVERSARY_OUTPUT.

Mark: TaskUpdate(TASK_A_ID, {content: "⚔️ AR Round {ROUND}: Adversary ({h_count} heuristics, {z_count} safe zones)", status: "completed"}).

STEP 3C — JUDGE (MANDATORY after 3B)

Compliance pre-check: Both outputs must exist. Missing → log judge_skipped_missing_inputs, go to 3D.

Mark: TaskUpdate(TASK_J_ID, {content: "⚖️ AR Round {ROUND}: Judge — ruling on debate", status: "in_progress"}).

Reuse RELEVANT_PATH, TARGET_KIND_OR_relevant_manifest, TARGET_SHA, CONTEXT_BRIEF from 3A.

Start time: Run §Start Time from ops-telemetry.md substituting {ROLE}=judge.

Agent(
  subagent_type: AGENT_JUDGE, model: ROUND_MODEL,
  prompt: "[AR Round {ROUND} — {MODE}] Arbitrate.

FIRST ACTION REQUIRED — non-negotiable: Your very first tool call MUST be `Read` against the path inside <target path=…>. Do not begin reasoning, do not emit any prose, do not emit any JSON until your Read tool call has returned the file contents. Same kind→action rules as the Enthusiast (file/diff/manifest/relevant_manifest). For every Enthusiast finding you arbitrate, you MUST additionally Read the cited file and line you are ruling on (use the file path and line number from the finding). The orchestrator counts your tool invocations and discards rulings from a Judge that never called Read. **Do not read any path outside the manifest/target.**

<target path=\"{RELEVANT_PATH}\" kind=\"{TARGET_KIND_OR_relevant_manifest}\" sha256=\"{TARGET_SHA}\" />
<brief>{CONTEXT_BRIEF}</brief>
<if REPO_PRIMARY_LANGUAGE != \"\"><localization primary=\"{REPO_PRIMARY_LANGUAGE}\" allowed=\"{ALLOWED_LANGUAGES}\">Also apply your Localization Consistency mandate for the primary language above — each agent per its own role (Enthusiast flags deviations; Adversary maps legitimately-multilingual content as safe zones; Judge rules localization findings on merit). Deviations are content in a non-primary language: intra-file mixing or whole files. Exempt i18n/locale resources, quoted foreign text or errors, proper nouns, identifiers, and the allowed languages.</localization></if>
<findings>{ENTHUSIAST_OUTPUT}</findings>
<adversarial_context>{ADVERSARY_OUTPUT}</adversarial_context>
<if round > 1>Prior rulings: {PRIOR_JUDGE_OUTPUT}
Set convergence:true only if ALL (file,line,winner,final_severity) tuples match prior round.</if>
<if MODE == FULL>Set next_round_model='sonnet' if: security findings, critical/high multi-file, 0% debunk rate, or strong E/A disagreement. Otherwise 'haiku'.</if>

After every required Read has returned, emit ONLY valid JSON per your schema. No preamble, no markdown fences."
)
JUDGE_RESULT = result

Capture agentId: Run §Capture AgentId from ops-telemetry.md substituting {ROLE}=judge.

Tool-use gate (MANDATORY — anti-confabulation, action-side):

PROJECT_SLUG=$(echo "$PWD" | sed 's|/|-|g')
JSONL="$HOME/.claude/projects/$PROJECT_SLUG/${CLAUDE_SESSION_ID:-}/subagents/agent-${AR_JUDGE_AGENT_ID}.jsonl"
if [ -f "$JSONL" ]; then
  JUDGE_TOOL_USES=$(jq -c '.message | (.content // []) | map(select(.type=="tool_use")) | length' "$JSONL" 2>/dev/null | awk '{s+=$1} END {print s+0}')
else
  JUDGE_TOOL_USES=0
fi

If JUDGE_TOOL_USES == 0 — ABORT the run with reason:"judge_no_tool_invocation", emit "[AR] ABORTED: Judge emitted rulings without invoking Read. Hallucination prevented.", stop.

Validate: Parse JSON. Invalid → re-prompt once. Still invalid → log judge_malformed_json, mark all findings status: unresolved, exit loop. Check rulings count matches NOVEL_FINDINGS; log mismatch.

Target-membership gate (Judge): For every ruling with a non-null file: assert file ∈ ALLOWED_FILES. Off-target → discard all rulings + abort with judge_offtarget_hallucination, same handling as 3A's gate.

Count: confirmed_count = rulings where winner ∈ {enthusiast, split}; debunked_count = rulings where winner = adversary.

Mark: TaskUpdate(TASK_J_ID, {content: "⚖️ AR Round {ROUND}: Judge ({confirmed_count} confirmed, {debunked_count} debunked)", status: "completed"}).

TASK_PREV_J_ID = TASK_J_ID

Update state: Append confirmed (file, line) to CONFIRMED_LOCATIONS. Set PRIOR_JUDGE_OUTPUT, PRIOR_JUDGE_SUMMARY. Update FILE_FINDING_COUNTS for each enthusiast/split ruling.

Model escalation (FULL mode only): escalation only ever moves UP the ladder — never downgrade below the resolved AR_MODEL (default opus). When ROUND_MODEL == "opus" (the ceiling) every escalation path is a no-op.

Path A (anomaly): any *_malformed_json → escalate one tier toward opus: ROUND_MODEL = MODEL_LADDER[min(index(ROUND_MODEL)+1, 2)], escalated_this_round = true. (No-op if already opus.)
Path B (judge rec): use JUDGE_OUTPUT.next_round_model if Path A didn't escalate, but never below ROUND_MODEL (escalate-only; ignore a recommendation to drop tier).
3+ consecutive non-opus rounds → log "[COST WARNING] sub-opus model active 3 consecutive rounds."
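The two escalation paths can be sketched as one escalate-only function. This is illustrative: `escalate` and its `changed` return flag are hypothetical, and the text does not say whether Path B sets escalated_this_round, so the flag here only reports whether the tier moved.

```python
from typing import Optional, Tuple

MODEL_LADDER = ["haiku", "sonnet", "opus"]


def escalate(round_model: str, malformed_json: bool, judge_recommendation: Optional[str] = None) -> Tuple[str, bool]:
    """Return (next ROUND_MODEL, changed). The tier never moves down."""
    current = MODEL_LADDER.index(round_model)
    if malformed_json:
        # Path A: one tier up, capped at the top of the ladder.
        nxt = min(current + 1, len(MODEL_LADDER) - 1)
        return MODEL_LADDER[nxt], nxt > current
    if judge_recommendation in MODEL_LADDER:
        # Path B: follow the Judge only when it recommends a higher tier.
        if MODEL_LADDER.index(judge_recommendation) > current:
            return judge_recommendation, True
    return round_model, False
```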

Round telemetry: Run §Round Telemetry from ops-telemetry.md, substituting {ROUND}, {ROUND_MODEL}, {ENTHUSIAST_OUTPUT_JSON}, {ADVERSARY_OUTPUT_JSON}, {JUDGE_OUTPUT_JSON}, {ERRORS_JSON_ARRAY_OR_EMPTY_ARRAY}. Also append round-N.json contents to ROUNDS array in state.

STEP 3D — CONVERGENCE CHECK (nature-aware, ceiling-confirmed)

Continuation contract (LOAD-BEARING — anti-under-execution):

NEVER stop and ask the human mid-loop. After a round, continue automatically to the next round.
The loop stops ONLY when STOP_REASON is set to one of {all-self-reference, zero-findings, anti-divergence-halt, max-rounds}. "Seems good enough" and "ask the human" are NOT valid stop reasons.
Whether to continue is decided by the deterministic predicate below over the round's persisted nature counts, not by narrative judgment.
When the loop exits, record STOP_REASON in meta.json.

Append NOVEL_FINDINGS.length to ROUND_YIELDS.

Nature classification (the stop SIGNAL — replaces raw count/identity): Classify each of this round's confirmed rulings (winner ∈ {enthusiast, split}) by overlap with prior-round confirmed locations:

overlaps a prior confirmed location (code: same symbol/declaration, fallback ±5 lines; spec: same ## heading / labeled section), OR is a meta-finding pointing only at a mismatch a prior round introduced → self-reference / propagation
a fresh location → substantive / design

Count nature_substantive and nature_propagation and persist them so the continuation predicate is deterministic. Prefer this run's in-scope $RUN_DIR (from STEP 1); fall back to the disk pointer only if unset. ALWAYS verify basename == $RUN_ID before writing — a concurrent AR run clobbers the shared .current_run_dir mid-flight (#119); on mismatch, SKIP the write (don't pollute another run's dir) rather than hard-aborting the round:

RUN_DIR="${RUN_DIR:-$(cat "$HOME/.autoimprove/.current_run_dir" 2>/dev/null)}"
if [ "$(basename "$RUN_DIR")" = "$RUN_ID" ]; then
  printf '{"round":%s,"nature_substantive":%s,"nature_propagation":%s,"confirmed":%s}\n' \
    "$ROUND" "$NATURE_SUBSTANTIVE" "$NATURE_PROPAGATION" "$CONFIRMED_COUNT" \
    > "$RUN_DIR/round-${ROUND}-nature.json"
else
  echo "[AR] WARN: RUN_DIR basename != RUN_ID ($RUN_DIR) — skipping nature write (#119 guard)" >&2
fi
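The nature classification that feeds this JSON record could be sketched as follows. The ruling shape (`file`, `line`, `symbol`, `section`) and the `meta_only` flag for meta-findings are assumptions for illustration; how a real run detects a meta-finding is not specified here.

```python
from typing import Dict, List


def overlaps(a: Dict, b: Dict, is_spec: bool = False) -> bool:
    """True when two rulings point at the same location per the overlap rule."""
    if a.get("file") != b.get("file"):
        return False
    anchor = "section" if is_spec else "symbol"
    if a.get(anchor) and b.get(anchor):
        return a[anchor] == b[anchor]
    if is_spec:
        return False  # spec targets match on heading/section only
    if a.get("line") is None or b.get("line") is None:
        return False
    return abs(a["line"] - b["line"]) <= 5  # code fallback: within 5 lines


def classify_nature(confirmed: List[Dict], prior_confirmed: List[Dict], is_spec: bool = False) -> Dict[str, int]:
    """Count substantive vs self-reference/propagation rulings for this round."""
    substantive = propagation = 0
    for ruling in confirmed:
        if ruling.get("meta_only") or any(overlaps(ruling, prior, is_spec) for prior in prior_confirmed):
            propagation += 1
        else:
            substantive += 1
    return {
        "nature_substantive": substantive,
        "nature_propagation": propagation,
        "confirmed": len(confirmed),
    }
```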

Nature-aware stop predicate (ceiling-confirmed, LOAD-BEARING): all-self-reference at a sub-ceiling tier is a FALSE convergence, because a stronger model can still surface substantive findings and push the substantive count back above zero. Real convergence requires confirmation at the ceiling tier (opus).

if CONFIRMED_COUNT == 0 AND ROUND > 1:
    if ROUND_MODEL == "opus": STOP_REASON = "zero-findings"; converged = true
    else: ROUND_MODEL = "opus"; converged = false            # jump to ceiling, confirm
           Log "[AR] zero findings at {tier} — confirming at opus ceiling before stop"
elif NATURE_SUBSTANTIVE == 0 AND CONFIRMED_COUNT > 0:         # all self-reference/propagation
    if ROUND_MODEL == "opus": STOP_REASON = "all-self-reference"; converged = true
    else: ROUND_MODEL = "opus"; converged = false            # jump to ceiling, confirm
           Log "[AR] all-self-reference at {tier} — confirming at opus ceiling before stop"
else:                                                          # substantive territory remains
    converged = false
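Written as a pure function, the predicate above becomes easy to unit-test. This is a sketch; the function name and the tuple return shape are hypothetical.

```python
from typing import Optional, Tuple

CEILING = "opus"


def stop_predicate(confirmed_count: int, nature_substantive: int, round_no: int, round_model: str) -> Tuple[Optional[str], bool, str]:
    """Return (STOP_REASON, converged, ROUND_MODEL for the next round)."""
    if confirmed_count == 0 and round_no > 1:
        reason = "zero-findings"
    elif nature_substantive == 0 and confirmed_count > 0:
        reason = "all-self-reference"
    else:
        return (None, False, round_model)  # substantive territory remains
    if round_model == CEILING:
        return (reason, True, round_model)  # confirmed at the ceiling: stop
    return (None, False, CEILING)  # jump to the ceiling and confirm first
```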

Anti-divergence guard (carried from convergence work): A fix/round that opens more substantive territory than it closes loops forever. Track RUNS_WORSENING:

if PRIOR_SUBSTANTIVE != null AND NATURE_SUBSTANTIVE > PRIOR_SUBSTANTIVE: RUNS_WORSENING += 1
else: RUNS_WORSENING = 0
if RUNS_WORSENING >= 2: STOP_REASON = "anti-divergence-halt"; converged = true
    Log "[AR] HALT: substantive findings grew 2 rounds running — handing to human, not looping."
PRIOR_SUBSTANTIVE = NATURE_SUBSTANTIVE
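The guard is a two-variable state machine, sketched below. The `DivergenceState` class and `update_divergence` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DivergenceState:
    prior_substantive: Optional[int] = None  # PRIOR_SUBSTANTIVE
    runs_worsening: int = 0                  # RUNS_WORSENING


def update_divergence(state: DivergenceState, nature_substantive: int) -> bool:
    """Update the counters for one round; return True when the halt fires."""
    if state.prior_substantive is not None and nature_substantive > state.prior_substantive:
        state.runs_worsening += 1
    else:
        state.runs_worsening = 0  # any non-growing round resets the streak
    state.prior_substantive = nature_substantive
    return state.runs_worsening >= 2
```

With substantive counts of 2, 3, 5 across three rounds the halt fires on the third round; a flat round in between (2, 3, 3, 4) resets the streak.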

Deterministic identity check (round > 1, secondary corroboration): Extract (file, line, winner, final_severity) from this round's and prior round's rulings. Apply ±5-line pairwise clustering (representative = min line). For file: null: use (null, first-60-chars-of-resolution, winner, final_severity). If normalized sets identical AND ROUND_MODEL == "opus" → set STOP_REASON = "all-self-reference" if unset, converged = true. If Judge said convergence: true but the nature predicate says continue → log override, continue. Round 1 guard: convergence: true from Judge → override to false, log.

Near-convergence escalation (FULL mode only):

current_yield = ROUND_YIELDS[-1]; prev_yield = ROUND_YIELDS[-2] (if len≥2)
near_convergence = current_yield <= 2 AND prev_yield != null AND current_yield < prev_yield * 0.4
if NOT escalated_this_round AND (converged OR near_convergence):
  if ROUND_MODEL == "opus": converged = true (final stop)
  elif converged AND NOT near_convergence: skip escalation (stay converged)
  else:
    ROUND_MODEL = MODEL_LADDER[index(ROUND_MODEL) + 1]
    converged = false
    Log: "Round {N}: escalating to {next_model} (yield={current_yield})"
    Re-emit todos as pending for round {N+1}
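
The escalation rule can be sketched as follows. `MODEL_LADDER` is referenced above but not defined in this section, so the three-tier list here is an assumption for illustration, as are the function names. Logging and todo re-emission are left out.

```python
from typing import List, Tuple

# Assumed ordering for illustration only; the real ladder is defined elsewhere.
MODEL_LADDER = ["haiku", "sonnet", "opus"]


def is_near_convergence(round_yields: List[int]) -> bool:
    if len(round_yields) < 2:
        return False  # prev_yield is null, so the test cannot pass
    current, prev = round_yields[-1], round_yields[-2]
    return current <= 2 and current < prev * 0.4


def escalate(round_model: str, converged: bool, round_yields: List[int],
             escalated_this_round: bool = False) -> Tuple[str, bool]:
    """Return (next_model, converged). Hypothetical helper."""
    near = is_near_convergence(round_yields)
    if escalated_this_round or not (converged or near):
        return round_model, converged
    if round_model == "opus":
        return round_model, True  # final stop at the ceiling
    if converged and not near:
        return round_model, True  # skip escalation, stay converged
    next_model = MODEL_LADDER[MODEL_LADDER.index(round_model) + 1]
    return next_model, False
```

With yields of `[10, 2]` the last round found 2 findings, under 40% of the previous 10, so a sonnet round escalates to opus. With `[4, 2]` the drop is too shallow (2 is not below 1.6) and nothing changes.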

Rubric Check: Read skills/adversarial-review/rubric-check.md and follow all steps therein.

Round 2 Gate (after Round 1 only):

confirmed_count = rulings where winner ∈ {enthusiast, split}
medium_plus = confirmed where final_severity ∈ {medium, high, critical}
If confirmed_count < 3 AND medium_plus == 0 → set STOP_REASON = "all-self-reference" if unset, log "round2_skipped", exit loop, go to STEP 4
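
The gate reduces to two counts over the Round 1 rulings. A minimal sketch, assuming rulings are dicts with `winner` and `final_severity` keys; the function name `skip_round2` is hypothetical, and `"adversary"` and `"low"` in the usage note are assumed examples of values outside the sets listed above.

```python
from typing import Dict, List

CONFIRMING_WINNERS = {"enthusiast", "split"}
MEDIUM_PLUS = {"medium", "high", "critical"}


def skip_round2(rulings: List[Dict[str, str]]) -> bool:
    """True when Round 2 should be skipped. Hypothetical helper."""
    confirmed = [r for r in rulings if r["winner"] in CONFIRMING_WINNERS]
    medium_plus = [r for r in confirmed if r["final_severity"] in MEDIUM_PLUS]
    return len(confirmed) < 3 and len(medium_plus) == 0
```

Two confirmed low-severity findings skip Round 2; a single confirmed medium finding does not, and neither do three confirmed findings of any severity. A high-severity ruling won by `"adversary"` is not confirmed, so it does not count toward either threshold.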

Increment: ROUND += 1. If converged = true (a STOP_REASON is set) OR ROUND > MAX_ROUNDS → exit (if exiting only because ROUND > MAX_ROUNDS and STOP_REASON is still null, set STOP_REASON = "max-rounds"). Write STOP_REASON into meta.json. Otherwise → go to 3A. Per the continuation contract, NEVER exit for any reason other than a set STOP_REASON.

STEP 4 — FORMAT OUTPUT

Read skills/adversarial-review/output-format.md

Format and emit per §Human-Readable Report, §Structured JSON, §Self-Assessment, and §Quality Rubrics (only when STEP 3D rubric check ran).

STEP 5 — WRITE TELEMETRY + POST-RUN PIN CHECK

Post-run pin check (race detection — replaces RECEIPT_NONCE/CONTENT_PROOF): Recompute TARGET_SHA from the same paths used at STEP 1. If it differs from the stored target_sha:

Log "[AR] WARNING: target content changed during run (sha mismatch). Findings still reported; treat line numbers as approximate.".
Write meta.json field "target_sha_drift": true and "target_sha_final": "<new_sha>".
Do NOT abort — race is informational, not fatal. Consumers (/quality-gate, CI ingesters) decide whether to accept findings with target_sha_drift == true.
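
The pin check can be sketched as below. How TARGET_SHA is computed at STEP 1 is not shown in this section, so the hashing scheme here (sha256 over each path name and its bytes, in sorted path order) is an assumption, and the function names are hypothetical. The only requirement the text states is that STEP 5 hashes the same paths the same way as STEP 1.

```python
import hashlib
from pathlib import Path
from typing import Dict, Iterable


def compute_target_sha(paths: Iterable[str]) -> str:
    """Assumed scheme: sha256 over path names and file bytes, sorted by path."""
    digest = hashlib.sha256()
    for p in sorted(paths):
        digest.update(p.encode("utf-8"))
        digest.update(b"\0")
        digest.update(Path(p).read_bytes())
    return digest.hexdigest()


def post_run_pin_check(paths: Iterable[str], stored_sha: str, meta: Dict) -> Dict:
    """Record drift in meta instead of aborting. Hypothetical helper."""
    final_sha = compute_target_sha(paths)
    if final_sha != stored_sha:
        print("[AR] WARNING: target content changed during run (sha mismatch). "
              "Findings still reported; treat line numbers as approximate.")
        meta["target_sha_drift"] = True
        meta["target_sha_final"] = final_sha
    return meta
```

The function never raises on a mismatch, matching the rule that a race is informational; whether to trust the findings is left to the consumer reading `target_sha_drift`.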

Read skills/adversarial-review/telemetry-spec.md

Run §Cost Instrumentation Bash Block, then §Cleanup (close all todos).

COMPLIANCE RULES

Rule	Violation action
3A before 3B	Adversary dispatched without ENTHUSIAST_OUTPUT → abort, re-run from 3A
3B before 3C	Judge dispatched without ADVERSARY_OUTPUT → log error, use `{"heuristics":[],"safe_zones":[]}`
Each agent uses exact subagent_type	`AGENT_ENTHUSIAST` / `AGENT_ADVERSARY` / `AGENT_JUDGE` (resolved in Step 2)
Output validated before passing forward	Invalid → one re-prompt → fallback (never skip validation)
Convergence = deterministic check only	Judge self-report overridden when it disagrees
Round 1 convergence = always false	No exception
Real subagent dispatch required	No `agentId` / Agent tool unavailable → ABORT run. Inline self-review FORBIDDEN (hallucinates).
Findings must be grounded in pinned target	Any finding `file` ∉ `ALLOWED_FILES` → discard all findings, ABORT run (`offtarget_hallucination`). This is the load-bearing membership gate — it runs against Enthusiast output (3A) and Judge output (3C).
Target path contract	Every Agent() prompt must carry a `<target path="…" kind="…" sha256="…" />` tag. Never inline target text; never pass a variable name.
Each dispatched agent must invoke at least one tool	`tool_uses == 0` in the agent JSONL → ABORT run (`<role>_no_tool_invocation`). Action-side gate — catches agents that emit findings without calling Read on the pinned target. Sonnet treats "Read the file" as description; the FIRST ACTION REQUIRED imperative + this gate together force the actual invocation.
Localization check needs the resolved language	`<localization>` block injected into ALL three dispatches (both code and spec) ONLY when `REPO_PRIMARY_LANGUAGE != ""`. Empty → agents skip the check. Resolution: yaml `localization.primary_language` > ask user in pre-warm > empty/skip.
Checklist injection is code-only, Enthusiast-only	`CODE_CHECKLIST` injected ONLY when `TARGET_TYPE == "code"`, ONLY into the Enthusiast dispatch. Never into spec passes (spec needs its own taxonomy), never into Adversary/Judge (unmeasured prompt-wish).
Pass@k gates run per pass	Every Enthusiast pass passes the dispatch-authenticity + tool-use gates before its findings enter the union. A single malformed pass is absorbed by the union; ALL-malformed or any authenticity/tool-use failure → ABORT (`AR_PASSES > 1` never weakens a gate).
Loop stops only on an enumerated `STOP_REASON`	Exit only for `{all-self-reference, zero-findings, anti-divergence-halt, max-rounds}`. "Seems good enough" / "ask the human" are NOT valid stop reasons — NEVER stop mid-loop to ask the human (continuation contract).
Convergence confirmed at the ceiling tier	`all-self-reference` / `zero-findings` at a sub-opus tier is a FALSE convergence → jump `ROUND_MODEL` to opus for a confirmation round; declare converged only when opus itself reaches it.
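
The target-membership rule in the table above can be sketched as an all-or-nothing gate. This is an illustration under assumptions: findings are dicts with a `file` key, the exception and function names are hypothetical, and findings with `file` set to null are let through, which the table does not specify.

```python
from typing import Dict, Iterable, List, Set


class OffTargetHallucination(Exception):
    """Hypothetical error type carrying the offtarget_hallucination abort."""


def membership_gate(findings: Iterable[Dict], allowed_files: Set[str]) -> List[Dict]:
    findings = list(findings)
    offenders = sorted({
        f["file"] for f in findings
        # Assumption: file == None (no location) is not treated as off-target.
        if f.get("file") is not None and f["file"] not in allowed_files
    })
    if offenders:
        # One off-target file discards ALL findings and aborts the run.
        raise OffTargetHallucination(f"offtarget_hallucination: {offenders}")
    return findings
```

Raising instead of filtering is deliberate here: the rule discards every finding when any one is off-target, so a partially valid batch never reaches the next stage.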

Background execution: This skill executes E→A→J inline — never re-dispatches itself. Caller wanting non-blocking AR: Agent(run_in_background: true, prompt: "Invoke Skill('autoimprove:adversarial-review', args: '...')") — no subagent_type.

WHAT THIS SKILL NO LONGER DOES (post-#120 redesign)

No CONTEXT_PAYLOAD inline injection. Agents Read from TARGET_PATH. Orchestrators no longer paste tens of KB of target text into agent prompts.
No RECEIPT_NONCE / PAYLOAD_RECEIPT gate. The receipt protocol existed to prove the agent received an inline payload; with Read there is no payload to forge possession of. Dispatch-integrity is now proven by the target-membership gate (off-target reads are detected post-hoc).
No AR-PROBE / CONTENT_PROOF gate. Same reasoning — the probe verified the agent had the real bytes in-prompt; agents now read from a sha-pinned path directly.
No --map-mode flag. The map/hybrid mode was an alternative to inline injection; with Read the agent can navigate multi-file targets via Glob + Read directly. skills/adversarial-review/map-mode.md has been removed.
