From debate-skills
Use this skill when orchestrating multi-agent adversarial code reviews using Claude Code agent teams. Activates when conducting thorough code reviews, setting up review debate teams, spawning specialized review personas, or synthesizing findings from multiple reviewers. Provides spawn prompts, debate protocol, and synthesis templates.
Slash command: /debate-skills:debating-code-reviews
Orchestrate a team of 4 specialized reviewers who independently analyze code, then debate each other's findings. Adversarial debate between diverse review perspectives catches significantly more bugs than single-pass review.
Single-reviewer code review suffers from blind spots: each reviewer has biases and preferred areas of focus. Adversarial debate forces findings to survive scrutiny.
Agent teams are experimental and disabled by default. Enable them by adding the following to your settings.json or shell environment:
    {
      "env": {
        "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1"
      }
    }
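If the settings file already holds other entries, the flag has to be merged in rather than pasted over them. A minimal sketch of that merge follows; the helper name `enable_agent_teams` and the sample `EXAMPLE_VAR` entry are hypothetical, and only the `env` key and the `CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS` variable come from the configuration above.

```python
import copy
import json

# The variable name is taken from the configuration snippet in this document.
FLAG = "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS"


def enable_agent_teams(settings: dict) -> dict:
    """Return a copy of `settings` with the agent-teams flag set under "env".

    Hypothetical helper: existing top-level keys and other env entries
    are preserved, and the input dictionary is not modified.
    """
    merged = copy.deepcopy(settings)
    merged.setdefault("env", {})[FLAG] = "1"
    return merged


if __name__ == "__main__":
    # Assumed pre-existing settings content, for illustration only.
    existing = {"env": {"EXAMPLE_VAR": "x"}}
    print(json.dumps(enable_agent_teams(existing), indent=2))
```

Reading and writing the actual file is left out on purpose, since the location of settings.json depends on the installation.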
Use the full debate protocol for changes that are high-risk, cross-module, or AI-authored.
See "When to Skip" at the bottom for lighter alternatives.
| Reviewer | Role | Primary Technique | Focus |
|---|---|---|---|
| The Tracer | Correctness | Execution path tracing | Logic errors, edge cases, state bugs |
| The Architect | Design | Pre-mortem analysis | Patterns, complexity, coupling |
| The Breaker | Adversarial Tester | Adversarial input + test review | Breaking inputs, test quality gaps |
| The Prosecutor | Devil's Advocate | Five Whys + assertion verification | False positives, overstated severity |
Each reviewer is available as a custom agent in ../../agents/ and is configured with Opus, read-only tools, and this skill preloaded. See ./references/reviewer-personas.md for detailed persona rationale and thinking styles.
Create an agent team with the 4 reviewer agents. They are pre-configured in this plugin's agents/ directory (tracer, architect, breaker, prosecutor).
Teammates don't inherit the lead's conversation history - they only get project context (CLAUDE.md, skills, MCP servers) plus the spawn prompt. Include enough detail about the code change in the prompt for reviewers to work independently.
Create an agent team to review [describe the code change - include file paths,
what the change does, and why].
Spawn 4 teammates using the tracer, architect, breaker, and prosecutor agents.
Require plan approval for the prosecutor (it must wait for Round 1 findings).
Create tasks with dependencies:
- Round 1 tasks (no dependencies, run in parallel):
  - "Tracer: review [files] for correctness" → assign to tracer
  - "Architect: review [files] for design" → assign to architect
  - "Breaker: review [files] for adversarial inputs and test quality" → assign to breaker
- Round 2 tasks (depend on all Round 1 tasks):
  - "Prosecutor: challenge Round 1 findings" → assign to prosecutor
  - "All reviewers: respond to challenges and each other's findings"
- Round 3 task (depends on Round 2):
  - "All reviewers: state final positions with confidence levels"
After Round 3, synthesize all findings into a single report.
Tracer, Architect, and Breaker review the code independently. All three run in parallel via the shared task list.
The Prosecutor is in plan approval mode during Round 1 - the lead won't approve its plan until Round 1 tasks are complete.
Each reviewer produces findings with:
Round 2 tasks automatically unblock when Round 1 completes (via task dependencies).
See ./references/debate-protocol.md for detailed messaging patterns.
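The round structure is a small dependency graph: a task becomes available once everything it depends on is done. The sketch below models that idea only. It is not the agent-teams task API, and the `Task` class, the `unblocked` function, and the task names are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Task:
    """A review task that may depend on other tasks (hypothetical model)."""
    name: str
    depends_on: list[str] = field(default_factory=list)
    done: bool = False


def unblocked(tasks: dict[str, Task]) -> list[str]:
    """Names of unfinished tasks whose dependencies are all done."""
    return [
        t.name
        for t in tasks.values()
        if not t.done and all(tasks[d].done for d in t.depends_on)
    ]


def build_debate_tasks() -> dict[str, Task]:
    """Three rounds: parallel reviews, challenge and response, final positions."""
    round1 = ["tracer-review", "architect-review", "breaker-review"]
    round2 = ["prosecutor-challenge", "respond-to-challenges"]
    tasks = [Task(name) for name in round1]
    # Every Round 2 task waits for all Round 1 tasks.
    tasks += [Task(name, depends_on=list(round1)) for name in round2]
    # The single Round 3 task waits for all of Round 2.
    tasks.append(Task("final-positions", depends_on=list(round2)))
    return {t.name: t for t in tasks}
```

With all tasks pending, `unblocked(build_debate_tasks())` lists only the three Round 1 reviews. The Round 2 tasks appear only after every Round 1 task is marked done, which mirrors the Prosecutor waiting on Round 1 findings.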
Each reviewer states final findings with confidence levels (HIGH / MEDIUM / LOW).
The Prosecutor provides final triage for each finding.
The lead produces the final report using the template in ./references/synthesis-template.md.
The report sections:
| Severity | Meaning | Action |
|---|---|---|
| BLOCKING | Incorrect behavior, data loss, or security issue | Must fix before merge |
| WARNING | Design issue, missing edge case, or test gap | Should fix, may defer with justification |
| NOTE | Minor improvement or observation | Consider, no action required |
| Verdict | When to Use |
|---|---|
| APPROVE | No BLOCKING findings, warnings are minor |
| APPROVE WITH CHANGES | No BLOCKING findings, but warnings worth addressing |
| REQUEST CHANGES | One or more confirmed BLOCKING findings |
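The severity and verdict tables together amount to a small decision rule. The sketch below is one possible reading of them and is not taken from the synthesis template. The `Finding` fields `confirmed` and `worth_addressing` are assumptions that stand in for the Prosecutor's triage and the lead's judgment about whether a warning is minor.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Finding:
    severity: str                   # "BLOCKING", "WARNING", or "NOTE"
    confirmed: bool = True          # assumed: survived the Prosecutor's challenge
    worth_addressing: bool = False  # assumed: lead judges the warning non-minor


def verdict(findings: list[Finding]) -> str:
    """Map a list of findings to one of the three verdicts."""
    # Any confirmed BLOCKING finding forces REQUEST CHANGES.
    if any(f.severity == "BLOCKING" and f.confirmed for f in findings):
        return "REQUEST CHANGES"
    # No blocking findings, but at least one warning worth addressing.
    if any(f.severity == "WARNING" and f.worth_addressing for f in findings):
        return "APPROVE WITH CHANGES"
    # Only minor warnings, notes, or nothing at all.
    return "APPROVE"
```

In this reading, a BLOCKING finding that did not survive the debate is treated as dropped rather than downgraded. A real synthesis might reclassify it as a WARNING instead.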
The full 4-reviewer debate is thorough but has a cost (time and tokens). Use lighter approaches for lower-risk changes:
| Change Type | Recommended Approach |
|---|---|
| High-risk, cross-module, AI-authored | Full debate (4 reviewers, 3 rounds) |
| Medium-risk, single module | 2 reviewers (Tracer + Breaker), no Prosecutor |
| Low-risk, small change | Single reviewer (Tracer for logic, Architect for design) |
| Documentation, config, formatting | Standard code review, no debate needed |
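The table above can be read as a lookup from change type to reviewer set. A minimal sketch follows; the risk labels (`"high"`, `"medium"`, `"low"`, `"docs"`) and the `focus` parameter are hypothetical names chosen for illustration, and the reviewer names match the agents listed earlier.

```python
from __future__ import annotations


def pick_reviewers(risk: str, focus: str = "logic") -> list[str]:
    """Return the reviewer agents suggested for a change of the given risk.

    `risk` is one of "high", "medium", "low", "docs" (hypothetical labels).
    `focus` only matters for low-risk changes: "logic" or "design".
    """
    if risk == "high":
        # Full debate: all four reviewers, three rounds.
        return ["tracer", "architect", "breaker", "prosecutor"]
    if risk == "medium":
        # Two reviewers, no Prosecutor.
        return ["tracer", "breaker"]
    if risk == "low":
        if focus == "logic":
            return ["tracer"]
        if focus == "design":
            return ["architect"]
        raise ValueError(f"unknown focus: {focus!r}")
    if risk == "docs":
        # Standard code review, no debate agents needed.
        return []
    raise ValueError(f"unknown risk level: {risk!r}")
```

Classifying a change as high, medium, or low risk remains a human judgment; the function only encodes the recommendation once that call has been made.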
- ../../agents/ (tracer, architect, breaker, prosecutor)
- ./references/reviewer-personas.md
- ./references/debate-protocol.md
- ./references/synthesis-template.md

npx claudepluginhub adbutterfield/coding-agent-plugins --plugin debate-skills