From adversarial-review
Cross-host adversarial red-team review of code, configs, and diffs. Routes the review to the agent that is NOT the host — Codex if you are running in Claude Code, Claude Opus if you are running in Codex. Cross-validates against your own independent analysis and returns unified security/robustness critics with severity ratings. Falls back to Gemini cascade, then degraded host-self with explicit warning.
How this skill is triggered — by the user, by Claude, or both
Slash command
/adversarial-review:coding-adversarial-reviewinheritThis skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Red-team review of code / configs / diffs. The host (you, the agent reading
Red-team review of code / configs / diffs. The host (you, the agent reading
this) must not review your own work — route the heavy critique to the
other agent. This SKILL.md is the same in Claude Code and Codex;
lib/call-external.sh detects which host you are and picks the opposite
partner.
Read tool, include the content.git diff (or git diff --staged).git diff main...HEAD./adversarial-review:adversarial-plan-review
instead and stop here./adversarial-review:prompt-optimize
instead and stop here.Use this template, replacing {CODE_TEXT} with the actual code/diff:
You are a red-team security and reliability analyst. Assume everything will
fail. Prove it with concrete exploit scenarios.
CODE:
{CODE_TEXT}
Review for:
1. SECURITY: injection, auth bypass, data exposure, OWASP top 10, secrets,
crypto misuse, deserialization, SSRF.
2. ROBUSTNESS: race conditions, failure cascades, resource exhaustion,
timeouts, error handling gaps, retry storms.
3. CORRECTNESS: off-by-one, type confusion, null/undef paths, locale/timezone,
floating-point, integer overflow.
4. CONCURRENCY: data races, deadlocks, ordering, cache coherence.
5. OBSERVABILITY: missing logs at failure points, secrets in logs, metric gaps.
6. SUPPLY CHAIN: pinned versions? lockfile? typo-squat risk?
7. BLAST RADIUS: who else does this break if deployed?
Output language: same as the input.
Sections: BLOCKERS / SHOULD FIX / NICE TO HAVE / VERDICT
Per finding: P0–P3 severity, evidence (file:line or quote), problem,
exploit/scenario, recommendation.
Verdict: SHIP / REVIEW_NEEDED / DO_NOT_MERGE
If the code is over ~6 kB, focus the diff on changed regions and provide necessary context — long prompts can stall the external backend.
Pipe the prompt into lib/call-external.sh (this script handles host detection,
routing, anti-recursion, Gemini fallback, and degraded mode):
PLUGIN_DIR="$HOME/Documents/Repos/coding-plugins/adversarial-review" # or wherever installed
echo "$PROMPT" | bash "$PLUGIN_DIR/lib/call-external.sh"
echo "exit=$?"
Capture:
0 external success, 2 degraded, 1 error/recursionNotes:
codex exec or claude -p directly — always go through
lib/call-external.sh. The script enforces anti-recursion via the
ADVERSARIAL_REVIEW_DEPTH env counter.1 (recursion), you are inside a partner-launched call; emit a
short note ("recursion guard tripped — parent already running review") and
stop. Do not produce a self-review.Without looking at the partner's output, walk the same checklist (security, robustness, correctness, concurrency, observability, supply chain, blast radius). This is your host-side draft.
Compare host-side findings with the partner's:
| Tag | Meaning |
|---|---|
[cross-validated] | both you and partner caught it (high confidence) |
[external-only] | only the partner caught it |
[host-only] | only you caught it |
On severity disagreements, take the higher of the two.
Format:
## Adversarial Code Review
- **Mode**: <external=codex | external=claude-opus | external=gemini-3.1-pro | gemini-3.1-flash-lite | gemini-2.5-pro | gemini-2.5-flash | DEGRADED>
- **Verdict**: SHIP | REVIEW_NEEDED | DO_NOT_MERGE
- **Findings**: N total — X P0, Y P1, Z P2, W P3
### Critics
#### [P0]: <title> [cross-validated | external-only | host-only]
- **Problem**: <what's wrong, why it matters>
- **Evidence**: <file:line, quote, or scenario>
- **Exploit / scenario**: <concrete failure mode>
- **Recommendation**: <specific fix>
[…repeat, highest severity first…]
### Recommended Patch
<minimal patch addressing P0/P1, in unified-diff form when practical;
prose otherwise>
### Key Risks if Merged As-Is
1. <risk> — <impact>
2. <risk> — <impact>
If lib/call-external.sh exited 2 (degraded mode), prepend this banner
verbatim to the top of the output, before the ## Adversarial Code Review
heading:
> ⚠️ **DEGRADED MODE** — no external partner reachable. Output below is
> single-perspective host self-review and violates the cross-host principle.
> Re-run after restoring access to Codex / Claude / Gemini for higher confidence.
references/host-detection.md — how lib/detect-host.sh decidesreferences/codex-integration.md — Codex CLI invocation, including the
forced_login_method = "chatgpt" gotcha for ChatGPT-account authreferences/claude-integration.md — claude -p --model opus --effort xhighreferences/fallback-chain.md — full external + Gemini cascade + degraded pathreferences/output-standards.md — P0–P3 schema, evidence requirementsnpx claudepluginhub robertoecf/adversarial-review --plugin adversarial-reviewGuides creation, editing, and verification of skills for AI coding agents using test-driven development with subagent scenarios. Use when authoring or debugging skills.