From llm-council
Run a question, plan, decision, code review, or PR through a 5-advisor LLM Council with anonymized peer review and chairman synthesis. Each advisor uses a distinct REASONING METHOD (inversion, decomposition, analogy, naive questioning, dependency graphing) — not just a different persona. Based on Andrej Karpathy's LLM Council + DMAD research (ICLR 2025). MANDATORY TRIGGERS: 'council this', 'run the council', 'pressure-test this', 'stress-test this', 'war room this', 'debate this', 'convene the council'. DO NOT TRIGGER on simple factual questions, casual yes/no, lookups, or tasks with one obvious right answer. Reserve for decisions with genuine stakes and tradeoffs — cost is 11 sub-agent calls.
How this skill is triggered — by the user, by Claude, or both
Slash command
/llm-council:llm-council
The summary Claude sees in its skill listing, used to decide when to auto-load this skill:
Run a question, plan, or artifact through 5 independent advisors using **distinct reasoning methods**, then anonymized peer review, then chairman synthesis. The user has explicitly invoked this skill or pre-approved via AskUserQuestion — do not re-confirm.
This skill exists because single-model responses suffer from sycophancy ("AI kiss-ass disease") and confirmation bias. A structured council with method diversity and anonymized peer review reduces both.
Good: Architecture decisions, scope/pricing/positioning tradeoffs, "should I X or Y" with real consequences, plan/PR pressure-testing, naming with stakes, hire-vs-build, migration strategies.
Bad: Factual lookups, "what's the syntax for X," summaries, simple yes/no, writing tasks, tasks with one obvious right answer.
If the question doesn't deserve the council, say so directly and answer normally instead of spawning 11 sub-agents.
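A minimal sketch of the trigger check and the cost arithmetic. The phrases come from the skill description; the substring-matching rule and all function names are assumptions for illustration. A phrase match only signals that the user asked for the council. Whether the question deserves one is still a judgment call.

```python
# Trigger phrases copied from the skill description.
# Case-insensitive substring matching is an assumption, not the skill's documented rule.
TRIGGERS = (
    "council this",
    "run the council",
    "pressure-test this",
    "stress-test this",
    "war room this",
    "debate this",
    "convene the council",
)

N_ADVISORS = 5
N_REVIEWERS = 5
N_CHAIRMEN = 1


def has_trigger(message: str) -> bool:
    """Return True if the message contains any mandatory trigger phrase."""
    lowered = message.lower()
    return any(phrase in lowered for phrase in TRIGGERS)


def subagent_calls() -> int:
    """Total sub-agent calls for one full council run."""
    return N_ADVISORS + N_REVIEWERS + N_CHAIRMEN


if __name__ == "__main__":
    print(has_trigger("Council this: monolith or microservices?"))  # True
    print(has_trigger("What's the syntax for an f-string?"))        # False
    print(subagent_calls())                                         # 11
```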
- #NNN → fetch via gh pr view and include diff + description
- /council for decisions with real stakes."
- CLAUDE.md (current dir + global), README.md, recent git log --oneline -10, any files the user referenced.

Reframe as neutral input:
QUESTION: [the core decision or thing being reviewed]
CONTEXT: [project context, constraints, recent changes]
WHAT'S AT STAKE: [why this matters — cost of getting it wrong]
Don't steer toward an answer. If genuinely ambiguous, ask ONE clarifying question before spawning agents.
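The three-field framing can be sketched as a small helper. The function name and signature are hypothetical. Only the field labels come from the template above.

```python
def frame_question(question: str, context: str, stakes: str) -> str:
    """Render the three-field neutral framing used as council input."""
    fields = {
        "QUESTION": question,
        "CONTEXT": context,
        "WHAT'S AT STAKE": stakes,
    }
    # One labelled line per field, in the template's order.
    return "\n".join(f"{label}: {value.strip()}" for label, value in fields.items())


if __name__ == "__main__":
    print(frame_question(
        "Migrate the job queue to Postgres or keep Redis?",
        "Two-person team, one production incident last quarter.",
        "A failed migration means duplicated or dropped jobs.",
    ))
```

The same framed string goes unchanged to every advisor, reviewer, and the chairman, so nobody works from a different reading of the question.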
Launch all 5 advisors simultaneously in a single message with 5 Agent tool calls. Sequential execution lets earlier responses bleed into later ones and defeats the purpose. Use subagent_type: "claude" and model: "opus" for the advisors. That is Opus 4.8, the max-tier model (set 2026-06-03 per the user; the opus alias resolves to the current top Opus, so it never goes stale). This replaced the original model: "sonnet" cost-discipline choice; the user is on Max 20x and prioritizes judgment quality over per-run cost.
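Inside Claude Code the fan-out is five Agent tool calls in one message. Outside it, the same independence property could be sketched with asyncio.gather. Here call_agent is a hypothetical stub, not a real client API, and it returns a placeholder string.

```python
import asyncio


async def call_agent(prompt: str, model: str = "opus") -> str:
    """Hypothetical stand-in for one sub-agent call; swap in a real client."""
    await asyncio.sleep(0)  # yields control, standing in for network I/O
    return f"[{model}] response to: {prompt[:40]}"


async def run_advisors(prompts: list[str]) -> list[str]:
    """Start every call before awaiting any, so no advisor sees another's output."""
    return await asyncio.gather(*(call_agent(p) for p in prompts))


if __name__ == "__main__":
    prompts = [f"advisor {i} prompt" for i in range(1, 6)]
    replies = asyncio.run(run_advisors(prompts))
    print(len(replies))  # 5
```

asyncio.gather returns results in the order the calls were passed in, so the replies stay aligned with the advisor list.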
Advisor prompt template:
You are [ADVISOR NAME] on an LLM Council reviewing a decision.
Your assigned reasoning method: [METHOD]
Your angle: [ANGLE]
The council was asked:
---
[framed question from Step 1]
---
Apply your reasoning method rigorously. Show your work using the method — don't just state opinions.
Rules:
- 150–300 words. No preamble. Straight into analysis.
- Name specific risks, opportunities, or issues. No vague concerns.
- If code: cite specific files, functions, patterns. If plan: cite specific steps, gaps, sequencing issues.
- Lean fully into your assigned angle. Do not hedge. Do not "balance." The synthesis comes later.
- End with your single strongest recommendation.
The five advisors (method-specific instructions):
| # | Name | Angle | Reasoning method instruction |
|---|---|---|---|
| 1 | The Contrarian | What will fail? | INVERSION: "Assume this shipped exactly as proposed — and failed. Work backward: what was the cause? What looked safe but broke under pressure? What's the failure mode nobody is discussing? Show your inversion chain." |
| 2 | First Principles Thinker | What are we actually solving? | DECOMPOSITION: "Break this into atomic claims and assumptions. List them. Challenge each: is this actually true? Is it necessary? What changes if this assumption is wrong? Identify the load-bearing assumptions." |
| 3 | The Expansionist | What upside are we missing? | ANALOGY: "What adjacent domain, product, or system solved a similar problem differently? What would someone with 10× ambition do here? Where is this thinking too small? Name specific analogues and what they'd suggest." |
| 4 | The Outsider | Zero context, fresh eyes | NAIVE QUESTIONING: "You have zero context about this project. Based purely on what's in front of you, list every point that requires insider knowledge. What's confusing? What jargon is unexplained? What would you ask if you just joined? If you can't follow the reasoning, say so." |
| 5 | The Executor | What do you do Monday morning? | DEPENDENCY GRAPHING: "Map the dependencies: what blocks what? What's the critical path? What's the first thing that must happen, and what can't start until it finishes? What takes 5 minutes but everyone will forget? Show the execution sequence." |
Natural tensions by design: Contrarian↔Expansionist (downside↔upside), First-Principles↔Executor (rethink↔ship), Outsider keeps everyone honest.
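The roster and the head of the advisor prompt can be sketched as data plus a format string. The Advisor class and advisor_prompt are hypothetical names. The method field holds only the method label, not the full instruction text from the table, and the rules section is left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advisor:
    name: str
    angle: str
    method: str  # method label only; the full instruction lives in the table


ADVISORS = [
    Advisor("The Contrarian", "What will fail?", "INVERSION"),
    Advisor("First Principles Thinker", "What are we actually solving?", "DECOMPOSITION"),
    Advisor("The Expansionist", "What upside are we missing?", "ANALOGY"),
    Advisor("The Outsider", "Zero context, fresh eyes", "NAIVE QUESTIONING"),
    Advisor("The Executor", "What do you do Monday morning?", "DEPENDENCY GRAPHING"),
]

PROMPT_HEAD = (
    "You are {name} on an LLM Council reviewing a decision.\n"
    "Your assigned reasoning method: {method}\n"
    "Your angle: {angle}\n"
    "The council was asked:\n---\n{framed}\n---\n"
)


def advisor_prompt(advisor: Advisor, framed: str) -> str:
    """Fill the head of the advisor template for one advisor."""
    return PROMPT_HEAD.format(
        name=advisor.name, method=advisor.method, angle=advisor.angle, framed=framed
    )


if __name__ == "__main__":
    print(advisor_prompt(ADVISORS[0], "QUESTION: ship the rewrite now or in Q3?"))
```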
Collect the 5 advisor responses. Randomize the labels so Advisor 1 is not always Response A: generate a shuffled mapping like {A → Outsider, B → Executor, C → Contrarian, D → Expansionist, E → First Principles}. Keep this mapping in your scratchpad for de-anonymization at Step 4.
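A minimal sketch of the shuffle, assuming the responses are held in a dict keyed by advisor name. The anonymize helper is hypothetical. It returns the letter-to-name key alongside the lettered responses, so the key can be set aside and handed only to the chairman.

```python
import random
import string
from typing import Optional


def anonymize(
    responses: dict[str, str], rng: Optional[random.Random] = None
) -> tuple[dict[str, str], dict[str, str]]:
    """Shuffle advisor names onto letters A, B, C, ... and return (key, labelled)."""
    rng = rng or random.Random()
    names = list(responses)
    rng.shuffle(names)  # in-place shuffle breaks the advisor-order/letter link
    key = dict(zip(string.ascii_uppercase, names))  # letter -> advisor name
    labelled = {letter: responses[name] for letter, name in key.items()}
    return key, labelled


if __name__ == "__main__":
    demo = {"Contrarian": "It fails when...", "Executor": "First, do...", "Outsider": "What is...?"}
    key, labelled = anonymize(demo)
    print(sorted(labelled))  # ['A', 'B', 'C']
```

Reviewers receive only labelled. De-anonymization at Step 4 is a lookup in key.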
Launch 5 new reviewer sub-agents in parallel (single message, 5 Agent tool calls, subagent_type: "claude", model: "opus" — Opus 4.8, same max-tier as the advisors). Each reviewer sees all 5 anonymized responses:
You are reviewing the outputs of an LLM Council. Five advisors independently answered:
---
[framed question]
---
**Response A:** [randomized advisor response]
**Response B:** [randomized advisor response]
**Response C:** [randomized advisor response]
**Response D:** [randomized advisor response]
**Response E:** [randomized advisor response]
Answer these three questions. Be specific. Reference responses by letter.
1. Which response is strongest? Why? (one sentence)
2. Which has the biggest blind spot? What is it missing? (one sentence)
3. What did ALL five responses miss that the council should consider? (This is the most valuable question — think hard.)
Under 150 words total. No preamble. Be direct.
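Assembling the reviewer input from the lettered responses can be sketched as below. Both helpers are hypothetical. The three review questions are omitted, and the word check counts whitespace-separated tokens, which is only one way to read "under 150 words".

```python
def reviewer_prompt(framed: str, labelled: dict[str, str]) -> str:
    """Assemble the reviewer input: framed question plus lettered responses."""
    parts = [
        "You are reviewing the outputs of an LLM Council. "
        "Five advisors independently answered:",
        "---",
        framed,
        "---",
    ]
    # Letters only: no advisor names or reasoning methods reach the reviewer.
    for letter in sorted(labelled):
        parts.append(f"**Response {letter}:** {labelled[letter]}")
    return "\n".join(parts)


def within_word_limit(review: str, limit: int = 150) -> bool:
    """Check the 'under 150 words' rule using whitespace-separated tokens."""
    return len(review.split()) < limit


if __name__ == "__main__":
    print(reviewer_prompt("QUESTION: ship now or in Q3?", {"B": "Wait.", "A": "Ship."}))
```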
Dispatch a single chairman agent with model: "opus" (Opus 4.8) — pin it explicitly so the chairman is never silently downgraded even if the parent session is on a lesser model. This agent gets EVERYTHING: the framed question, all 5 advisor responses (now de-anonymized with names and reasoning methods labelled), all 5 peer reviews, and the anonymization mapping. It runs in a fresh context, reading the critiques as input data — not as a continuation of any critic's conversation.
Chairman prompt:
You are the Chairman of an LLM Council. Five advisors independently analyzed this question, then peer-reviewed each other's responses anonymously. Your job is to produce the final verdict.
QUESTION:
[framed question]
ADVISOR RESPONSES (de-anonymized):
[Name + Reasoning Method + full response, for each of the 5]
PEER REVIEWS (with anonymization key, so you can attribute):
[All 5 peer reviews + the A→Name mapping]
Produce exactly this structure. Do not deviate.
## Council Verdict: [Topic — 5 words max]
### Where the council agrees
[Points multiple advisors converged on independently. These are high-confidence signals. Be specific — name which advisors agreed and on what.]
### Where the council clashes
[Genuine disagreements. Do not smooth these over. Present both sides and explain why reasonable advisors disagree.]
### Blind spots the council caught
[Things that only emerged in peer review — gaps individual advisors missed but reviewers flagged. Especially "what did ALL five miss" answers.]
### The recommendation
[A clear, actionable recommendation. Not "it depends." Not "consider both sides." A real answer. You may disagree with the majority if the dissent's reasoning is strongest — explain why if so.]
### The one thing to do first
[Single concrete next step. Not a list of ten things. One thing.]
### What you lose with this recommendation
[The cost of following this advice — what gets sacrificed, what the strongest dissent would warn about. Preserve dissent; don't bury it.]
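Because the chairman must produce exactly this structure, a cheap structural check before presenting the verdict can catch a drifted response. This is a sketch under the assumption that the verdict arrives as Markdown text. The verdict_problems helper is hypothetical and checks headings only, not the quality of what sits under them.

```python
REQUIRED_HEADINGS = [
    "### Where the council agrees",
    "### Where the council clashes",
    "### Blind spots the council caught",
    "### The recommendation",
    "### The one thing to do first",
    "### What you lose with this recommendation",
]


def verdict_problems(verdict: str) -> list[str]:
    """Return structural problems; an empty list means the shape is right."""
    lines = [line.strip() for line in verdict.splitlines() if line.strip()]
    problems = []
    if not lines or not lines[0].startswith("## Council Verdict:"):
        problems.append("first line must be the '## Council Verdict:' title")
    # Compare the level-3 headings, in order, against the required sequence.
    found = [line for line in lines if line.startswith("### ")]
    if found != REQUIRED_HEADINGS:
        problems.append("section headings missing, extra, or out of order")
    return problems


if __name__ == "__main__":
    draft = "## Council Verdict: Keep Redis\n### Where the council agrees\n..."
    print(verdict_problems(draft))
```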
Show the chairman's verdict directly. Do NOT add your own preamble like "here's what the council said." Just present the verdict. If the user wants to see individual advisor responses or peer reviews, offer to surface them — don't dump them by default.
model: "sonnet" in Steps 2 and 3.
npx claudepluginhub justsima/agentic-stack --plugin llm-council