From devils-advocate
Generate baseline devil's advocate evaluation. Produces concern catalogue with risk scores and scorecard. Requires setup to have been run first (devils_advocate.md and fact_repository.md must exist).
How this skill is triggered — by the user, by Claude, or both
Slash command
/devils-advocate:evaluate
The summary Claude sees in its skill listing, used to decide when to auto-load this skill
Generate concern catalogue and scorecard. Run after setup.
MANDATORY: Use TaskCreate/TaskUpdate per step (read context, generate catalogue, score, create v01). Mark in_progress/completed.
Prerequisites: devils_advocate.md and fact_repository.md must exist. Otherwise: tell user to run /devils-advocate:setup.
Read target document, devils_advocate.md, fact_repository.md in full.
Fibonacci scale (1, 2, 3, 5, 8):
Risk adjustment: review full set. Adjust where interactions amplify. Document: Risk: N (adjusted from L x I = M, reason: ...).
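The risk line can be sketched in Python. This is a minimal illustration only, assuming raw risk is likelihood times impact (as the `L x I = M` note implies); the name `risk_note` and its parameters are hypothetical and not part of the skill.

```python
FIBONACCI_SCALE = (1, 2, 3, 5, 8)  # allowed likelihood / impact values


def risk_note(likelihood, impact, adjusted=None, reason=""):
    """Return the risk line for one concern, documenting any adjustment."""
    for value in (likelihood, impact):
        if value not in FIBONACCI_SCALE:
            raise ValueError(f"{value} is not on the Fibonacci scale")
    raw = likelihood * impact  # assumption: raw risk = L x I
    if adjusted is None or adjusted == raw:
        return f"Risk: {raw}"
    # Adjusted risks keep the raw product and the reason visible.
    return (
        f"Risk: {adjusted} "
        f"(adjusted from {likelihood} x {impact} = {raw}, reason: {reason})"
    )


print(risk_note(5, 5))
print(risk_note(3, 5, adjusted=20, reason="amplified by concern #2"))
```

Keeping the raw product in the note makes every adjustment auditable when the full set is reviewed again.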
Concern template:
### N. "[Concern as the devil would phrase it]"
**Likelihood: N** | **Impact: N** | **Risk: N**
**Their take**: what devil thinks. Write as them.
**Reality**: factual counter. Reference fact_repository.md.
**Response**: how to address it.
Categories (persona-weighted):
No negative risk scores. Strengths go in "Reality" and "Response".
Score 0-100% per concern.
| Score | Devil's reaction |
|---|---|
| 95-100% | "I have no issue" |
| 80-94% | "Fine, but I noticed..." |
| 60-79% | "Doesn't fully answer" |
| 40-59% | "This is a problem" |
| 20-39% | "You're hiding something" |
| 0-19% | "Makes it worse" |
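The band lookup above can be expressed as a small Python helper. This is a sketch under assumptions: `devils_reaction` is a hypothetical name, and fractional scores (for example 94.5) are assigned to the band whose lower bound they meet, which the table does not specify.

```python
# (lower bound in percent, devil's reaction), highest band first
REACTION_BANDS = [
    (95, "I have no issue"),
    (80, "Fine, but I noticed..."),
    (60, "Doesn't fully answer"),
    (40, "This is a problem"),
    (20, "You're hiding something"),
    (0, "Makes it worse"),
]


def devils_reaction(score_percent):
    """Map a 0-100 score to the devil's reaction from the table."""
    if not 0 <= score_percent <= 100:
        raise ValueError("score must be between 0 and 100")
    for lower_bound, reaction in REACTION_BANDS:
        if score_percent >= lower_bound:
            return reaction


print(devils_reaction(85))
```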
Scorecard format (append to devils_advocate.md):
## Scorecard v01 ([document name] as-is)
| # | Concern | Risk | Score | Residual | Reasoning |
|---|---------|------|-------|----------|-----------|
| 1 | [name] | 25 | 85% | 3.75 | [specific text reference + quality assessment] |
Residual = risk x (1 - score)
Top gaps: top 5 residuals.
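A minimal Python sketch of the residual arithmetic and the top-gaps selection. The function names and the sample concerns are hypothetical, chosen only to illustrate the formula; the first row reuses the numbers from the scorecard example (risk 25, score 85%).

```python
def residual(risk, score_percent):
    """Residual risk: risk x (1 - score), with score given in percent."""
    return risk * (100 - score_percent) / 100


def top_gaps(concerns, limit=5):
    """Return (name, residual) pairs, largest residual first."""
    scored = [(name, residual(risk, score)) for name, risk, score in concerns]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:limit]


# Hypothetical concerns: (name, risk, score in percent)
concerns = [
    ("Concern A", 25, 85),
    ("Concern B", 40, 50),
    ("Concern C", 8, 20),
]

for name, value in top_gaps(concerns):
    print(f"{name}: {value}")
```

A high-risk concern with a middling score can outrank a low-risk concern with a poor score, which is why gaps are ranked by residual and not by score alone.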
Per high-residual concern, 2-4 options:
### Concern #N: [name] (residual: X)
**Option A**: [specific change]
- Expected effect: #N +15%, #12 -5%
**Option B**: [structural change]
- Expected effect: #N +20%
**Recommendation**: [which and why]
ASK: "How to run scoring?
claude -p subprocess. Faster, final scorecard only
Initial evaluations: in-session calibrates devil. Re-scoring iterations: standalone faster."
claude -p --model sonnet, parse scorecard, append to devils_advocate.md
MANDATORY: Scored document carries scorecard and residual in filename.
<name>_v01.md
---
## Document Scorecard (Devil's Advocate)
**Persona**: [devil role and key bias]
**Score**: [total residual risk] (lower = better, max [total absolute risk])
| # | Concern | Risk | Score | Residual | How addressed |
|---|---------|------|-------|----------|---------------|
| 1 | [name] | [risk] | [0-100%] | [residual] | [specific text] |
<name>_v01_<score>.md
report_v01.md -> report_v01_89.md
Original NOT modified.
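The naming rule can be sketched in Python. This is an illustration under assumptions: `scored_filename` is a hypothetical helper, and rounding the total residual to the nearest integer is a guess, since the source only shows an integer score in the filename.

```python
from pathlib import Path


def scored_filename(versioned_name, total_residual):
    """Build <name>_v01_<score>.md from <name>_v01.md and the total residual."""
    path = Path(versioned_name)
    score = round(total_residual)  # assumption: nearest-integer rounding
    return f"{path.stem}_{score}{path.suffix}"


print(scored_filename("report_v01.md", 89))
```

The helper only computes a name and touches no files, in line with the rule that the original document is not modified.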
Tell user: "Baseline complete. Score: [N] out of [max]. Run /devils-advocate:iterate to improve."
Baseline low (< 30% of max): "Rather impressive start. Devil's struggling - score [N] out of [max]. Gaps worth closing still."
Baseline high (> 70% of max): "Devil has a lot to say. Score [N] out of [max] - real work to do."
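The message choice can be sketched as follows. This assumes the low and high variants replace the default message instead of being added to it, which the source does not state; `baseline_message` is a hypothetical name.

```python
def baseline_message(score, max_score):
    """Pick the closing message from the score-to-max ratio."""
    if max_score <= 0:
        raise ValueError("max_score must be positive")
    ratio = score / max_score
    if ratio < 0.30:  # baseline low
        return (
            f"Rather impressive start. Devil's struggling - "
            f"score {score} out of {max_score}. Gaps worth closing still."
        )
    if ratio > 0.70:  # baseline high
        return (
            f"Devil has a lot to say. "
            f"Score {score} out of {max_score} - real work to do."
        )
    return (
        f"Baseline complete. Score: {score} out of {max_score}. "
        f"Run /devils-advocate:iterate to improve."
    )


print(baseline_message(89, 400))
```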
npx claudepluginhub stellarshenson/claude-code-plugins --plugin devils-advocate