Multi-AI adversarial code & business review with STRIDE threat modeling, evidence tiering, and adversarial red team
npx claudepluginhub hajinj/ai-review-arena
Full AI development and business lifecycle orchestrator: Always-On routing, codebase analysis, MCP detection, static analysis integration, STRIDE threat modeling, multi-AI adversarial code and business review with external CLI models (Codex subagents with per-agent model config, and Gemini), evidence tiering, adversarial red team, business model benchmarking, 3-round multi-agent debate with Round 4 escalation, CSV batch review, auto-fix loop, test generation, fallback framework, cost estimation, commit/PR safety gate, and feedback-based routing.
Make AI models argue with each other before your code ships.
You ask one AI to review your code. It finds 12 issues. But which ones are real?
Arena's answer: make three AIs fight about it.
┌──────────────────────────────────────────────────────────┐
│                     SINGLE AI REVIEW                     │
│                                                          │
│   You ───► One AI ───► "12 issues found"                 │
│                                                          │
│   But... which are real? You have to check all 12.       │
└──────────────────────────────────────────────────────────┘
vs.
┌──────────────────────────────────────────────────────────┐
│                       ARENA REVIEW                       │
│                                                          │
│   You ───► Claude ───┐                                   │
│            Codex  ───┼──► They argue      ───► 5 real    │
│            Gemini ───┘    with each other      issues    │
│                                                          │
│   3 AIs independently review, then cross-examine         │
│   each other's findings. Fake issues get eliminated.     │
│   Real issues get confirmed with higher confidence.      │
└──────────────────────────────────────────────────────────┘
Three AI families review your code separately, then challenge each other in 3 rounds:
ROUND 1               ROUND 2              ROUND 3
Independent Review    Cross-Examination    Defense
──────────────────    ─────────────────    ───────
Claude: "I found      Codex: "Claude's     Claude: "No, look
a SQL injection       finding #3 is a      at line 42: user
at line 42"           false positive,      input goes directly
                      this input is        into the query
Codex: "I found       already sanitized"   without escaping.
a race condition                           Here's proof..."
at line 89"           Gemini: "Actually,
                      I agree with         ───► CONFIRMED
Gemini: "I found      Claude. The               confidence: 92%
unused imports        sanitization
at line 7"            misses Unicode"      ───► DISMISSED
                                                (false positive)
What survives this fight = what you should actually fix.
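One way to picture how cross-examination turns raw findings into a verdict is a simple vote tally. The sketch below is illustrative only: the `Finding` class, the `resolve` function, and the majority threshold are assumptions made for this example, not Arena's actual scoring. The agreement ratio it computes is not the confidence figure shown in the diagram, and the ballots on the unused-imports finding are invented.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    """A single issue raised in Round 1 (hypothetical structure)."""
    reviewer: str
    summary: str
    line: int
    # Ballots from the other reviewers: name -> True (agrees) or False (disputes).
    votes: dict[str, bool] = field(default_factory=dict)


def resolve(finding: Finding, threshold: float = 0.5) -> tuple[str, float]:
    """Tally ballots and return (status, agreement ratio)."""
    # Assumption: the reviewer who raised the finding counts as one vote in favor.
    ballots = {finding.reviewer: True, **finding.votes}
    agreement = sum(ballots.values()) / len(ballots)
    status = "CONFIRMED" if agreement > threshold else "DISMISSED"
    return status, agreement


# Codex disputes and Gemini agrees, as in the example debate.
sql_injection = Finding("Claude", "SQL injection", 42, {"Codex": False, "Gemini": True})
# Hypothetical ballots: both other reviewers dispute this one.
unused_imports = Finding("Gemini", "unused imports", 7, {"Claude": False, "Codex": False})

for f in (sql_injection, unused_imports):
    status, agreement = resolve(f)
    print(f"{f.summary} (line {f.line}): {status}, agreement {agreement:.0%}")
```

A plain majority vote is the simplest possible rule. A real debate would presumably also weigh the evidence each side presents, which a boolean ballot cannot capture.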
Arena isn't just a reviewer. It's a full lifecycle system that handles everything from "I have an idea" to "ship it."
"Build an OAuth login"
│
▼
┌─────────────────────────────────┐
│         ARENA PIPELINE          │
│                                 │
│  1. Analyze your codebase       │ ← learns your coding style
│  2. Research best practices     │ ← searches the web
│  3. Check compliance rules      │ ← platform guidelines
│  4. Debate implementation       │ ← AIs argue about HOW to build it
│  5. Build it                    │
│  6. Review with 3 AI teams      │ ← the fight described above
│  7. Auto-fix safe issues        │ ← fixes trivial things automatically
│  8. Generate tests              │ ← writes regression tests
│  9. Final report                │ ← pass/fail verification
│                                 │
└─────────────────────────────────┘
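The nine steps in the diagram can be read as an ordered chain of stages that pass a shared context along. The following is a minimal sketch under that assumption: the stage names mirror the diagram, but `make_placeholder_stage`, `run_pipeline`, and the context dictionary are hypothetical and do not reflect how Arena is implemented.

```python
from typing import Callable

Stage = Callable[[dict], dict]

# Stage names taken from the pipeline diagram; the functions behind them are placeholders.
STAGE_NAMES = [
    "analyze_codebase",
    "research_best_practices",
    "check_compliance",
    "debate_implementation",
    "build",
    "review",
    "auto_fix",
    "generate_tests",
    "final_report",
]


def make_placeholder_stage(name: str) -> Stage:
    """Build a stand-in stage that only records that it ran."""
    def stage(context: dict) -> dict:
        return {**context, "completed": context["completed"] + [name]}
    return stage


def run_pipeline(request: str, stages: list[Stage]) -> dict:
    """Run each stage in order, threading a shared context through them."""
    context = {"request": request, "completed": []}
    for stage in stages:
        context = stage(context)
    return context


result = run_pipeline(
    "Build an OAuth login",
    [make_placeholder_stage(name) for name in STAGE_NAMES],
)
print(result["completed"])
```

Each stage returns a new context instead of mutating the old one, so the output of an earlier step, such as the codebase analysis, stays available to every later step.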
And it works for three domains, not just code:
| | Code | Business | Documentation |
|---|---|---|---|
| Routes | A-F | G-I | J-K |
| Example | "Build OAuth" | "Write pitch deck" | "Review API docs" |
| Reviewers | 12 specialized agents | 10 specialized agents | 6 specialized agents |
| Special | Threat modeling, static analysis | Red team, quant validation | Code-doc drift detection |
You don't call Arena. Arena calls itself.
Every request you make to Claude Code gets routed through Arena automatically:
You say:                         Arena does:
─────────────────────────────    ─────────────────────────────
"Build a login page"           → Route A: Full lifecycle
"Fix this typo"                → Route F: Quick fix (instant)
"Review this PR"               → Route D: Multi-AI review
"Write a pitch deck"           → Route G: Business pipeline
"Are the docs accurate?"       → Route J: Doc review pipeline
"Refactor this module"         → Route E: Refactoring pipeline
"Research auth best practices" → Route B: Deep research
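The request-to-route examples (such as "Fix this typo" going to Route F) can be pictured as a small classifier. The source does not say how Arena decides on a route, so the sketch below is a deliberately naive keyword matcher: the route letters and labels come from the examples, while the `RULES` table, its keywords, their ordering, and the `route` function are all assumptions.

```python
import re

# Hypothetical keyword rules, checked in order; the first match wins.
# "docs" is checked before "review" so that a request like "Review API docs"
# lands in the documentation routes rather than the code review route.
RULES = [
    ("docs", "J", "Doc review pipeline"),
    ("pitch", "G", "Business pipeline"),
    ("refactor", "E", "Refactoring pipeline"),
    ("research", "B", "Deep research"),
    ("review", "D", "Multi-AI review"),
    ("typo", "F", "Quick fix"),
    ("build", "A", "Full lifecycle"),
]


def route(request: str):
    """Return (route letter, label) for the first matching keyword, or None."""
    words = set(re.findall(r"[a-z]+", request.lower()))
    for keyword, letter, label in RULES:
        if keyword in words:
            return letter, label
    return None


for text in ["Build a login page", "Fix this typo", "Are the docs accurate?"]:
    print(text, "->", route(text))
```

Matching on whole words instead of substrings keeps "research" from being mistaken for another keyword. A production router would more plausibly classify intent with a model than with a fixed word list.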