Multi-model consensus system — 3 LLMs cross-examine each other to catch blind spots on critical decisions
Challenge a proposed action with the neurologist corrigibility skeptic — tests counter-positions before committing
Run a full 3-phase cross-examination on a topic — independent analysis, cross-review, synthesis
Set up the 4-folder document system (inbox/vision/reflect/dao) in this project
Run the gains gate verification — infrastructure health checks that must pass before proceeding
Health check all three surgeons to verify they are reachable and operational
Full A/B test lifecycle with safety constraints and human veto windows
Rapid 3-surgeon verdict on whether a proposed fix or change is sound
HARD-GATE — multi-model review of architectural decisions before implementation proceeds
External-model cross-examination review with optional git context for code and architecture decisions
Confidence-weighted vote from multiple surgeons to validate claims and check assumptions
Admin access level
Server config contains admin-level keywords
Modifies files
Hook triggers on file write and edit operations
Uses power tools
Uses Bash, Write, or Edit tools
No model invocation
Executes directly as bash, bypassing the AI model
Three-model code review consensus — disagreement is the value, not consensus.
Three independent AI models cross-examine your code. They challenge each other's blind spots, hunt for what the others missed, and surface every disagreement instead of burying it. Built on five Constitutional Physics invariants and a four-phase operating protocol. Provider-agnostic across OpenAI, DeepSeek, Anthropic, Ollama, LM Studio, vLLM, and MLX.
Would you wing a complicated surgery with one surgeon?
Then why are you shipping code reviewed by one AI?
Every AI coding tool has the same flaw: one model, one perspective, one set of blind spots. Claude confabulates confidently where GPT hedges. GPT over-engineers where a local model stays lean. A single AI reviewer is a single point of failure — and you'd never accept that in a real operating room.
3-Surgeons puts three independent AI models on the same operating table. They don't just review — they cross-examine, challenge assumptions, and hunt for what the others missed. Your code ships only when all three agree it's ready.
| | Surgeon | Role | Default Model |
|---|---|---|---|
| 🔪 | Atlas (Head Surgeon / Judge) | Synthesizes findings, weighs evidence, decides, implements — never overrides without grounds | Claude (your IDE session) |
| 🩺 | Cardiologist (External Skeptic) | External-model perspective from a different training distribution. Surfaces what Atlas can't see | DeepSeek-chat (drop-in OpenAI also supported) |
| 🧠 | Neurologist (Local Devil's Advocate) | Runs locally for privacy and corrigibility. Forces counter-position before consensus locks in | DeepSeek-chat via local proxy (Qwen3-4B legacy) |
A single AI reviewer is a single point of failure. Three reviewers, hunting independently, force the truth into the open.
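The command list describes a cross-examination as independent analysis, cross-review, and synthesis. A minimal sketch of that flow follows, assuming each surgeon can be modeled as a function from source text to a list of findings. The `Reviewer` type, `CrossExamResult`, `cross_examine`, and the stub reviewers are hypothetical names for illustration, not the plugin's actual API, and the cross-review step is reduced here to a plain comparison of findings rather than a second round of model calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical reviewer type: takes source text, returns a list of findings.
Reviewer = Callable[[str], List[str]]


@dataclass
class CrossExamResult:
    independent: Dict[str, List[str]]  # phase 1: each surgeon's own findings
    missed: Dict[str, List[str]]       # phase 2: what each surgeon did not raise
    agreed: List[str]                  # phase 3: raised by every surgeon
    disputed: List[str]                # phase 3: raised by only some surgeons


def cross_examine(code: str, surgeons: Dict[str, Reviewer]) -> CrossExamResult:
    # Phase 1: every surgeon reviews the code without seeing the others.
    independent = {name: review(code) for name, review in surgeons.items()}

    # Phase 2: compare each surgeon's findings against the pooled findings.
    pooled = sorted({f for findings in independent.values() for f in findings})
    missed = {
        name: [f for f in pooled if f not in findings]
        for name, findings in independent.items()
    }

    # Phase 3: separate unanimous findings from disputed ones; nothing is dropped.
    agreed = [f for f in pooled if all(f in fs for fs in independent.values())]
    disputed = [f for f in pooled if f not in agreed]
    return CrossExamResult(independent, missed, agreed, disputed)


# Stub reviewers standing in for real model calls (assumed, for illustration).
surgeons = {
    "atlas": lambda code: ["unchecked None", "missing test"],
    "cardiologist": lambda code: ["unchecked None", "race condition"],
    "neurologist": lambda code: ["unchecked None"],
}

result = cross_examine("def f(x): return x.y", surgeons)
print("agreed:", result.agreed)
print("disputed:", result.disputed)
```

The sketch keeps disputed findings as a first-class output instead of filtering them out, which matches the stated goal of surfacing every disagreement.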
3-Surgeons is built on one belief: the bottleneck in AI-assisted coding is no longer speed — it's calibration. A confidently wrong answer ships faster than a careful right one. Three independent surgeons make confidence earnable — every claim survives cross-examination or it dies on the table.
| Scale | What it unlocks |
|---|---|
| 1 model | One opinion. Fast. Possibly wrong, but you wouldn't know. |
| 2 models | A check. Often agree. Disagreement = stop and look. |
| 3 models | Triangulation. Truth becomes recoverable. The blind spot of any one model is exposed by the other two. |
| 5+ models | A specialty board. Each surgeon brings a different training distribution. Convergence under independent attack is evidence, not opinion. |
| Continuous review | Every diff cross-examined. Every claim audited. Every blind spot named. Calibration compounds across the codebase. |
The protocol scales linearly with the number of surgeons. The architecture is provider-agnostic. The only ceiling is your tolerance for groupthink.
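One way to picture how a verdict could be combined across any number of surgeons is a confidence-weighted vote that also names the dissenters. The sketch below is an assumption-laden illustration: the `Opinion` record, the self-reported confidence in the range 0 to 1, and the 0.5 approval threshold are all hypothetical choices, not values taken from the plugin.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Opinion:
    surgeon: str
    approve: bool
    confidence: float  # assumed self-reported confidence, 0.0 to 1.0


def weighted_vote(opinions: List[Opinion], threshold: float = 0.5) -> dict:
    total = sum(o.confidence for o in opinions)
    if total == 0:
        raise ValueError("no opinions with non-zero confidence")

    # Share of total confidence that backs approval.
    score = sum(o.confidence for o in opinions if o.approve) / total
    approved = score >= threshold

    # Dissenters are kept in the result so disagreement stays visible.
    dissenters = [o.surgeon for o in opinions if o.approve != approved]
    return {
        "score": score,
        "approved": approved,
        "unanimous": not dissenters,
        "dissenters": dissenters,
    }


verdict = weighted_vote([
    Opinion("atlas", approve=True, confidence=1.0),
    Opinion("cardiologist", approve=True, confidence=0.5),
    Opinion("neurologist", approve=False, confidence=0.5),
])
print(verdict)
```

Because the function accepts a list of any length, the same code covers the 2-, 3-, and 5-surgeon rows of the table; a non-empty `dissenters` list is the signal to stop and look.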
Five principles that govern every surgical operation. These are invariants — no tool call, no config flag, no shortcut overrides them.
npx claudepluginhub supportersimulator/3-surgeons --plugin 3-surgeons