Stress-test any artifact by spawning an Adversary agent (finds problems) then a Counterweight agent (calibrates findings). Use when the user says "adversarial review", "challenge this", "stress test this", "red team this", or wants a second opinion on a design or implementation.
Slash command
`/agent-workflow:adversarial-review [path/to/artifact or PR number]`
The argument is optional; if omitted, it is inferred from context.
Stress-test any artifact by spawning an **Adversary** (finds problems) then a **Counterweight** (calibrates findings). The user receives a structured, evidence-checked report and decides what to act on.
Optional path to artifact, PR number, or branch name. If omitted, infer from context.
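A minimal sketch of how the optional argument could be dispatched. The `gh` and `git` invocations are the ones this skill uses; the helper name `plan_fetch` and the classification order (all digits means a PR number, an existing path means a file, anything else is treated as a branch) are assumptions made for illustration.

```python
import os
from typing import Callable, List


def plan_fetch(arg: str, exists: Callable[[str], bool] = os.path.exists) -> List[List[str]]:
    """Return the commands (argv lists) needed to load the artifact named by `arg`.

    Hypothetical helper: the classification order below is an assumption,
    not something the skill specifies.
    """
    if arg.isdigit():
        # PR number: metadata first, then the diff.
        return [
            ["gh", "pr", "view", arg, "--json", "title,body,files"],
            ["gh", "pr", "diff", arg],
        ]
    if exists(arg):
        # File path: no command needed, the file is read directly.
        return []
    # Otherwise treat the argument as a branch name.
    return [["git", "diff", f"main...{arg}"]]
```

Passing `exists` as a parameter keeps the function testable without touching the filesystem.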
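If an argument was provided:
- Path: read the file at that path
- PR number: fetch the metadata with `gh pr view <number> --json title,body,files` and the diff with `gh pr diff <number>`
- Branch name: diff against main with `git diff main...<branch>`

If no argument was provided:
- Infer the artifact from context: recent changes (`git diff main...HEAD`), a recently written spec file, or an open PR on the current branch (`gh pr view`)

Hard constraints: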
Read the artifact and determine its type, scope, and altitude. Altitude is one of:
- design (specs, design docs, architecture)
- implementation (code diffs, configs, PRs)
- mixed (a spec with embedded code samples, or a PR that introduces a new design)

Type drives persona selection (Step 3). Altitude scopes the Adversary's focus (Step 3) and the report layout (Step 5).
Announce to the user:
Reviewing {artifact name} ({type}, {scope}, {altitude} altitude). Spawning {persona} adversary...
| Artifact type | Persona | Focus |
|---|---|---|
| Spec / design doc | Feasibility skeptic | Claims vs codebase reality, scope gaps, missing considerations |
| Code diff / PR | Integration adversary | Callers, cross-module boundaries, edge cases, test coverage |
| Config / infra | Blast radius analyst | What breaks if wrong, deploy ordering, rollback safety |
| API change | Consumer advocate | All consumers accounted for, backwards compat, migration path |
| Architecture doc | Scalability skeptic | Bottlenecks, failure cascades, operational blind spots |
Select the persona whose artifact type best matches the classification from Step 2. If the artifact spans multiple types, pick the dominant concern. For subsequent rounds, pick the next most relevant unused persona.
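The selection rule can be sketched as a lookup over the table above. This is an illustration only: the function name `select_persona` is hypothetical, and "next most relevant" is approximated here by table order, whereas the skill leaves that ranking to judgment.

```python
from typing import Iterable, Optional, Tuple

# Persona table from this section, in table order: (artifact type, persona, focus).
PERSONAS = [
    ("Spec / design doc", "Feasibility skeptic",
     "Claims vs codebase reality, scope gaps, missing considerations"),
    ("Code diff / PR", "Integration adversary",
     "Callers, cross-module boundaries, edge cases, test coverage"),
    ("Config / infra", "Blast radius analyst",
     "What breaks if wrong, deploy ordering, rollback safety"),
    ("API change", "Consumer advocate",
     "All consumers accounted for, backwards compat, migration path"),
    ("Architecture doc", "Scalability skeptic",
     "Bottlenecks, failure cascades, operational blind spots"),
]


def select_persona(artifact_type: str, used: Iterable[str] = ()) -> Optional[Tuple[str, str]]:
    """Pick (persona, focus) for a round, skipping personas used in earlier rounds.

    Assumption: after the direct match, remaining personas are tried in table order.
    """
    used_set = set(used)
    # Stable sort: the row matching the artifact type comes first, others keep their order.
    ordered = sorted(PERSONAS, key=lambda row: row[0] != artifact_type)
    for _, persona, focus in ordered:
        if persona not in used_set:
            return persona, focus
    return None  # every persona has already been used
```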
Spawn a background subagent using the Agent tool with the following prompt:
You are an adversarial reviewer — a {persona}. Your focus: {focus}.
Mandate: Find blind spots, flawed assumptions, and failure modes in the artifact under review. Do NOT propose fixes, rewrites, or alternatives. Only identify problems.
Altitude focus: This artifact is at the {altitude} level. Your main focus must match.
If during investigation you surface findings at a different altitude than the artifact, mark them with altitude: off in the output (see Output Format). Do not suppress them — they're useful as appendix material — but they must not crowd out the main-altitude findings.
Artifact under review:
{artifact_content}
Previously confirmed challenges (do not re-raise these):
{confirmed_challenges_summary}
Explorer context (prior investigation findings):
{explorer_output}
Evidence-Gating Rule: Every challenge MUST cite specific evidence: a file path, line range, grep result, or concrete scenario. If you cannot point to where the problem manifests, do not raise it.
Bad example: "Consider error handling."
Good example: "The function processOrder() at src/orders/process.py:45 does not handle the case where order.items is empty, which will raise an IndexError on line 52."
Precision Bias: Lean toward false negatives over false positives. Only flag findings you'd bet >70% on — better to miss a marginal issue than bury real ones in noise. A short, sharp list of high-confidence challenges is more useful than an exhaustive list with mixed confidence.
Investigation Strategy:
Output Format:
## Challenges
### 1. {short title}
- **Category:** flawed_assumption | failure_mode | missing_consideration | scope_concern
- **Severity:** blocking | important | minor
- **Altitude:** match | off
- **Evidence:** {file path, line range, grep result, or concrete scenario}
- **Description:** {what goes wrong and why — be specific, name files and functions}
Severity Guide:
Wait for the Adversary subagent to complete and read its output. If it finds no challenges (design is sound), skip the Counterweight step and report directly to the user with a summary of what was investigated and the clean finding.
After the Adversary completes, build a stripped version of its output by removing each challenge's Description: and Severity: fields. Keep title, category, altitude, and evidence. This is what the Counterweight sees.
Why strip: the description and severity are anchoring signals. If the Counterweight reads the Adversary's prose reasoning and severity rating first, it tends to defer to them, a known sycophancy/self-preference failure mode for LLM judges. Hiding them keeps the Counterweight's verdict and severity rating independent.
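One way the stripping step might look, assuming the Adversary followed its output format and kept each field on a single bullet line. The function name `strip_for_counterweight` is hypothetical.

```python
import re

# Fields hidden from the Counterweight; every other line passes through unchanged.
HIDDEN_FIELD = re.compile(r"^\s*-\s*\*\*(Description|Severity):\*\*")


def strip_for_counterweight(adversary_output: str) -> str:
    """Drop Description and Severity bullets, keeping title, category, altitude, evidence.

    Assumes each field occupies exactly one line, as in the Adversary's output format.
    """
    kept = [
        line
        for line in adversary_output.splitlines()
        if not HIDDEN_FIELD.match(line)
    ]
    return "\n".join(kept)
```

A multi-line description would leave its continuation lines behind, so a stricter version would need to parse each challenge block rather than filter line by line.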
Then spawn a second background subagent using the Agent tool with the prompt below.
You are a Counterweight — an impartial calibrator of adversarial review. Your loyalty is to accuracy, not to the artifact or the adversary.
{artifact_content}
{adversary_output_stripped}
Note: You are deliberately seeing only each challenge's title, category, altitude, and evidence — not the Adversary's reasoning prose or severity rating. This is to keep your verdict and severity independent. Do not ask to see the hidden fields.
For each challenge the Adversary raised, independently verify whether it is real:
- confirmed: evidence checks out, concern is real
- overblown: real concern but smaller than the framing implies (explain why)
- phantom: evidence doesn't support the claim (explain what you found instead)

During verification, you may discover issues the Adversary missed. Add them; the same evidence-gating rules apply. Every issue must cite specific files, lines, or scenarios. Tag each missed issue with `altitude: match` or `altitude: off` relative to the artifact's altitude ({altitude}).
Same as the Adversary: no vibes, no generalities. Every verdict must reference what you checked and what you found. "I verified and it looks fine" is NOT acceptable. "I read src/orders/process.py:45-60 and confirmed that order.items is validated as non-empty on line 38 before processOrder() is called" IS acceptable.
## Calibration Report
### Challenge 1: {title from Adversary}
- **Adversary altitude:** {match | off}
- **Verdict:** confirmed | overblown | phantom
- **Evidence check:** {what you verified and what you found}
- **Severity:** blocking | important | minor | dismiss
### Challenge 2: {title from Adversary}
...
## Missed Issues (if any)
### 1. {title}
- **Category:** flawed_assumption | failure_mode | missing_consideration | scope_concern
- **Severity:** blocking | important | minor
- **Altitude:** match | off
- **Evidence:** {file path, line range, grep result, or concrete scenario}
- **Description:** {what goes wrong and why}
After the Counterweight completes, merge both outputs using the rules below, draft the report body, then do a brief BLUF edit pass before presenting.
The main thread sees both the Adversary's original output (with severity + description) AND the Counterweight's report (with its own independent severity). Use both when synthesizing.
- Confirmed (verdict `confirmed`): present at Counterweight's severity. If Adversary's severity differed materially (e.g., blocking → minor or vice versa), note both in a one-liner.
- Downgraded (verdict `overblown`): present at Counterweight's severity, with both perspectives summarized so the reader sees what the Adversary saw and why it was smaller than framed.
- Dismissed (verdict `phantom`): list in the summary section with a one-line reason; user can override.
- Altitude: items tagged `altitude: match` go to the main Confirmed/Downgraded sections. Items tagged `altitude: off` go to the Appendix section regardless of severity. Off-altitude items are still useful, just not the main signal.

After drafting the report body, re-read it and write two lines for the top:
- Verdict: one sentence
- Recommendation: one sentence

A reader should be able to stop after these two lines if that's all they need.
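The merge rules can be sketched as two small functions. Everything named here (`Finding`, `report_section`, `severity_label`) is hypothetical, and two behaviours are assumptions the skill does not spell out: a phantom finding is dismissed even when it is off-altitude, and "materially" is read as a gap of two or more severity levels.

```python
from dataclasses import dataclass

SEVERITY_RANK = {"blocking": 3, "important": 2, "minor": 1, "dismiss": 0}


@dataclass
class Finding:
    title: str
    verdict: str       # confirmed | overblown | phantom
    altitude: str      # match | off
    cw_severity: str   # Counterweight's independent severity
    adv_severity: str  # Adversary's original severity


def report_section(f: Finding) -> str:
    """Return the report section a finding belongs in."""
    if f.verdict == "phantom":
        # Assumption: a phantom is dismissed even when it is off-altitude.
        return "Dismissed"
    if f.altitude == "off":
        # Off-altitude items go to the appendix regardless of severity.
        return "Appendix"
    return "Confirmed" if f.verdict == "confirmed" else "Downgraded"


def severity_label(f: Finding) -> str:
    """Severity line for a confirmed finding, noting the Adversary's when it differs materially."""
    # Assumption: "materially" means two or more levels apart (e.g. blocking vs minor).
    gap = abs(SEVERITY_RANK[f.cw_severity] - SEVERITY_RANK[f.adv_severity])
    if gap >= 2:
        return f"{f.cw_severity} (Adversary said {f.adv_severity})"
    return f.cw_severity
```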
Present the following to the user:
## Adversarial Review: {artifact name}
**Verdict:** {one sentence}
**Recommendation:** {one sentence}
---
**Artifact:** {type, scope, altitude, path/ref}
**Adversary persona:** {persona name} — {focus}
**Round:** {N} of 3
### Confirmed Challenges
#### 1. {title}
- **Category:** {category}
- **Severity:** {Counterweight's severity}{; if Adversary's severity differed materially, append " (Adversary said {X})"}
- **Evidence:** {evidence}
- **Description:** {Adversary's description}
### Downgraded Challenges
#### 1. {title}
- **Severity:** {Counterweight's severity} (Adversary said {X})
- **Adversary's case:** {summary of adversary's argument}
- **Counterweight's case:** {summary of counterweight's finding}
### Dismissed
- {title} — {one-line reason from Counterweight}
### Appendix: Off-Altitude Observations
{Findings at a different altitude from the artifact (e.g., implementation nits on a design doc).
Brief evidence only. Omit this section entirely if there are none.}
- {title} — {severity} — {one-line evidence}
### Detailed Assessment
{2-3 sentences expanding on the Verdict: artifact quality, signal-to-noise ratio of the review, and whether the artifact is safe to proceed with or needs revision}
### Next Steps
{N} confirmed, {M} downgraded, {K} dismissed{, {J} off-altitude observations if any}.
**Choose one:**
- **Act on findings** — address confirmed/downgraded challenges
- **Run another round** — fresh adversary with a different persona (round {N+1} of 3)
- **Dismiss and proceed** — artifact looks sound, move on
Round 3 special case: If this is round 3, replace "Run another round" with: "Maximum review depth reached (3/3). Act on findings or dismiss."
Wait for the user's choice before proceeding.
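A small sketch of how the choice list could be built per round, including the round 3 special case. The function name `next_step_options` is hypothetical; the option wording follows the report template.

```python
from typing import List

MAX_ROUNDS = 3


def next_step_options(round_number: int) -> List[str]:
    """Choices shown under "Choose one" for a given round (1-based)."""
    if not 1 <= round_number <= MAX_ROUNDS:
        raise ValueError(f"round must be between 1 and {MAX_ROUNDS}")
    if round_number < MAX_ROUNDS:
        middle = f"Run another round (round {round_number + 1} of {MAX_ROUNDS})"
    else:
        # Round 3: the "Run another round" entry is replaced, not removed.
        middle = "Maximum review depth reached (3/3). Act on findings or dismiss."
    return ["Act on findings", middle, "Dismiss and proceed"]
```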
If the user chooses "Run another round" from the Step 5 report, execute this sequence.
Spawn a background subagent to fill coverage gaps before the next adversary round.
Explorer spawn prompt:
You are a codebase explorer preparing context for the next round of adversarial review.
An adversarial review of the following artifact found these confirmed challenges: {confirmed_challenges_from_all_prior_rounds}
Do a focused deep-dive into the areas these challenges touch. Your goal is to surface context — facts that a fresh adversary would benefit from knowing. You are NOT an adversary. Do not raise challenges or opinions. Return facts.
## Exploration Context
### Area 1: {description}
- **Files examined:** {list}
- **Findings:** {factual observations — no opinions, no severity ratings}
### Area 2: {description}
...
After the Explorer completes, return to Step 3 with these modifications:
The new adversary gets its own fresh Counterweight pass (Step 4) and synthesis (Step 5) as normal. The synthesis report for round N includes challenges from all prior rounds (with their final verdicts) plus new findings from round N.
Hard cap: 3 rounds total. After round 3's synthesis, the report states: "Maximum review depth reached (3/3). Act on findings or dismiss." No option to run another round.
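For illustration, the cross-round bookkeeping might be held in a small state object. The class `ReviewState` and its method names are hypothetical; the three things it tracks (round count, personas already used, confirmed challenges from all prior rounds) are the ones the steps above rely on, and the "(none)" placeholder is an assumption.

```python
from dataclasses import dataclass, field
from typing import Iterable, List

MAX_ROUNDS = 3


@dataclass
class ReviewState:
    round_number: int = 0
    used_personas: List[str] = field(default_factory=list)
    confirmed: List[str] = field(default_factory=list)

    def start_round(self, persona: str) -> None:
        """Begin a new round, enforcing the hard cap and one use per persona."""
        if self.round_number >= MAX_ROUNDS:
            raise RuntimeError("Maximum review depth reached (3/3).")
        if persona in self.used_personas:
            raise ValueError(f"persona already used: {persona}")
        self.round_number += 1
        self.used_personas.append(persona)

    def record_confirmed(self, titles: Iterable[str]) -> None:
        """Add this round's confirmed challenges, skipping ones already recorded."""
        for title in titles:
            if title not in self.confirmed:
                self.confirmed.append(title)

    def confirmed_summary(self) -> str:
        """Text for the {confirmed_challenges_summary} slot in the Adversary prompt."""
        # Assumption: an empty list is rendered as "(none)".
        return "\n".join(f"- {t}" for t in self.confirmed) or "(none)"
```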
Track across rounds:
npx claudepluginhub cheapsteak/cheapsteak-agent-plugins --plugin agent-workflow