From research-papers
Adjudicates disagreements across research paper collections by topic clusters, producing verdicts on errors, superseding claims, evidence hierarchies, and actionable replacement values.
Slash command
/research-papers:adjudicate
Systematically adjudicate disagreements across the paper collection. Not summaries — *judgments*.
This skill writes verdict documents and may acquire missing evidence through paper-process. It does not mutate propstore source branches directly.
$ARGUMENTS is --all: full collection sweep (discover topics, assign papers, produce all verdicts)
$ARGUMENTS is a topic name (e.g., "vowel formants"): produce a single verdict for that topic
$ARGUMENTS is a list of paper directories: adjudicate only the disagreements among those specific papers
ls -d papers/*/ | grep -v "papers/pngs" | wc -l
ls papers/*/notes.md | wc -l
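The two counts above are most useful when compared, since a paper directory without notes.md cannot contribute to a verdict. A minimal Python sketch of that comparison follows; the function name `inventory` and the returned keys are illustrative assumptions, and it only roughly mirrors the shell commands (it skips a directory named exactly `pngs` and any hidden directories).

```python
from pathlib import Path


def inventory(papers_root: str = "papers") -> dict:
    """Count paper directories and report which ones lack notes.md.

    Hypothetical helper, not part of the skill itself.
    """
    root = Path(papers_root)
    if not root.is_dir():
        return {"paper_dirs": 0, "with_notes": 0, "missing_notes": []}

    # Every immediate subdirectory is treated as a paper, except the image dump.
    dirs = sorted(
        p for p in root.iterdir()
        if p.is_dir() and p.name != "pngs" and not p.name.startswith(".")
    )
    missing = [p.name for p in dirs if not (p / "notes.md").is_file()]
    return {
        "paper_dirs": len(dirs),
        "with_notes": len(dirs) - len(missing),
        "missing_notes": missing,
    }


if __name__ == "__main__":
    print(inventory())
```

Any names listed under `missing_notes` are candidates for processing before adjudication starts.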
Read papers/index.md (if it exists) or sample papers/*/description.md files to understand what topics the collection covers.
Check for existing verdicts:
ls research/verdicts/*.md 2>/dev/null
For --all mode, identify natural topic clusters where papers make overlapping claims. Scan description.md files and notes.md cross-reference sections to find areas of disagreement.
Standard topic areas (adapt to collection):
For single-topic mode, skip this step — use the provided topic.
For each topic, identify the specific paper directories whose notes.md must be read. A paper can belong to multiple topics.
# Example: find papers relevant to "vowel formants"
grep -l "formant" papers/*/notes.md | head -30
Write the assignment to reports/paper-topic-assignment.md:
## Topic: [Name]
Papers to read:
- papers/Author_Year_Title/
- papers/Author_Year_Title/
[...]
Estimated scope: N papers, ~M lines of notes
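The keyword search and the assignment block above can be combined in one step. The following is a sketch under stated assumptions: `papers_mentioning` and `render_assignment` are hypothetical helper names, matching is a plain case-insensitive substring test, and the scope estimate is simply the summed line count of the matching notes.md files.

```python
from pathlib import Path
from typing import List, Tuple


def papers_mentioning(keyword: str, papers_root: str = "papers") -> List[Tuple[str, int]]:
    """Return (directory name, notes.md line count) for notes containing keyword."""
    hits = []
    for notes in sorted(Path(papers_root).glob("*/notes.md")):
        text = notes.read_text(encoding="utf-8", errors="replace")
        if keyword.lower() in text.lower():
            hits.append((notes.parent.name, len(text.splitlines())))
    return hits


def render_assignment(topic: str, hits: List[Tuple[str, int]]) -> str:
    """Format one topic block in the layout used by paper-topic-assignment.md."""
    lines = [f"## Topic: {topic}", "Papers to read:"]
    lines += [f"- papers/{name}/" for name, _ in hits]
    total = sum(count for _, count in hits)
    lines.append(f"Estimated scope: {len(hits)} papers, ~{total} lines of notes")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_assignment("Vowel formants", papers_mentioning("formant")))
```

A substring match over-selects (any passing mention counts), so the resulting list is a starting point to prune by hand, not a final assignment.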
For each topic, read ALL assigned notes.md files and render a verdict.
Apply this hierarchy by default. Override with explicit reasoning only.
Evidence hierarchy (higher beats lower):
Override permitted when:
Every finding of error gets one of these labels:
Write each verdict to research/verdicts/NN-topic-name.md:
# Verdict: [Topic]
## Papers Considered
[Exact folder names for traceability]
## Historical Timeline
[Who said what, when — chronological. The story of the field.]
## Findings by Category
### Wrong (methodology error or flawed reasoning)
[Each: paper, claim, what was wrong, evidence. Label: WRONG]
### Superseded (better data replaced it)
[Each: old paper/claim → new paper/claim, why new wins. Label: SUPERSEDED]
### Limited (correct but over-applied)
[Each: paper, claim, valid scope, where it breaks down. Label: LIMITED]
### Incomparable (different questions mistaken for disagreement)
[Each: the two papers, what each actually measured, why comparison is invalid. Label: INCOMPARABLE]
## What Subsumes What
[Broader theories encompassing narrower ones. The intellectual genealogy.]
## Genuinely Uncertain
[Active disagreements with no resolution. The honest "we don't know."]
## Best Current Understanding
[The verdict. For each sub-question: answer, evidence, confidence (high/medium/low).]
## Synthesizer Audit
[What the implementation currently uses vs what it should use.
Each entry: current value (file:line) + source paper → category (correct/WRONG/SUPERSEDED/LIMITED) → replacement value with source paper.
Include actual numbers ready to implement.]
## Open Questions
[What the collection can't answer. Gaps. Papers we'd need to acquire.]
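The NN-topic-name.md filename for a new verdict could be derived from the topic and the verdicts already on disk. This sketch assumes that NN is a zero-padded sequence number, that 00 stays reserved for the master synthesis, and that the topic is lowercased and hyphenated; the source does not spell out any of these rules, and both function names are hypothetical.

```python
import re
from typing import Iterable


def slugify(topic: str) -> str:
    """Lowercase the topic and collapse every non-alphanumeric run into one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")


def next_verdict_name(topic: str, existing: Iterable[str]) -> str:
    """Pick the next free NN prefix (assumed convention) and build the filename."""
    used = []
    for name in existing:
        match = re.match(r"(\d+)-", name)
        if match:  # files without a numeric prefix, e.g. notes-progress.md, are ignored
            used.append(int(match.group(1)))
    # Starting from 0 keeps 00 free for the master synthesis.
    number = max(used, default=0) + 1
    return f"{number:02d}-{slugify(topic)}.md"


if __name__ == "__main__":
    print(next_verdict_name("Vowel Formants", ["00-master-synthesis.md"]))
```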
Ruthless. If the evidence says a paper was wrong, say it plainly. No hedging, no "may have been superseded." Name names, cite evidence, render judgment.
"Peterson & Barney's F3 values for children were WRONG — Hillenbrand 1995 showed they were 174 Hz too high, likely due to spectrograph limitations."
Not: "Later work found somewhat different values."
Every Synthesizer Audit entry that recommends a change must include the actual replacement values. "Replace IY1 F1=270 with F1=342 per Hillenbrand 1995 Table III" — not just "consider updating."
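The replacement-value rule is easy to check mechanically before a verdict is written. Below is a minimal sketch, assuming an audit entry is held as a small record; the `AuditEntry` class, its field names, and the file location in the example are all hypothetical, and treating every category other than "correct" as a recommended change is an interpretation of the rule, not something the skill states.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AuditEntry:
    """One Synthesizer Audit row (illustrative structure only)."""
    parameter: str
    current_value: str
    location: str                      # file:line in the implementation
    category: str                      # correct / WRONG / SUPERSEDED / LIMITED
    replacement_value: Optional[str] = None
    replacement_source: Optional[str] = None

    def problems(self) -> List[str]:
        """List what is missing for an entry that recommends a change."""
        issues = []
        if self.category != "correct":  # assumed to mean "a change is recommended"
            if not self.replacement_value:
                issues.append("missing replacement value")
            if not self.replacement_source:
                issues.append("missing source paper for replacement")
        return issues


if __name__ == "__main__":
    entry = AuditEntry(
        parameter="IY1 F1",
        current_value="270",
        location="path/to/file:0",      # placeholder, not a real location
        category="SUPERSEDED",          # category chosen for illustration
        replacement_value="342",
        replacement_source="Hillenbrand 1995 Table III",
    )
    print(entry.problems())
```

An entry that only says "consider updating" would surface here as a non-empty problem list.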
If a critical missing paper would change the verdict, use the paper-process skill to acquire it:
Use the paper-process skill to retrieve and process: [citation or DOI]
If nested skill invocation is unavailable or unreliable on this platform, derive this skill's
installed directory from the injected <path>, then run:
uv run "<skill-dir>/../paper-process/scripts/emit_nested_process_fallback.py"
Read the FULL stdout and follow it exactly instead of opening paper-process/SKILL.md piecemeal.
A verdict rendered without key evidence is worse than a slower verdict.
Topics have soft dependencies. Process in waves:
Wave 1 — Foundations (parallel): Topics about fundamental models, baseline measurements, and architectural assumptions. No topic depends on another within this wave.
Wave 2 — Dynamics (parallel): Topics about time-varying phenomena (coarticulation, duration, prosody). May reference Wave 1 verdicts.
Wave 3 — Higher-level (parallel): Topics about speaker variation, emotion, style. May reference Wave 1 and 2 verdicts.
Wave 4 — Master synthesis (sequential): One pass reading all verdicts, producing research/verdicts/00-master-synthesis.md:
Create research/verdicts/notes-progress.md and update it after each verdict:
If Edit/Write fails with "file unexpectedly modified":
./relative, C:/forward/slashes, C:\back\slashes
You may be running alongside other agents. NEVER use git restore/checkout/reset/clean.
When done, reply ONLY:
Done - see research/verdicts/
Verdicts: [list of verdict files]
Master synthesis: research/verdicts/00-master-synthesis.md
Findings: X WRONG, Y SUPERSEDED, Z LIMITED, W INCOMPARABLE
Gaps: N papers flagged for acquisition
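The Findings line in the reply can be computed from the verdict files instead of counted by hand. This sketch assumes each finding carries a literal "Label: X" marker as in the verdict template; `tally_labels` and `findings_line` are hypothetical names. Template placeholder lines left in a file would be counted too, so they should be removed first.

```python
import re
from collections import Counter
from typing import Iterable

LABELS = ("WRONG", "SUPERSEDED", "LIMITED", "INCOMPARABLE")
_LABEL_RE = re.compile(r"\bLabel:\s*(WRONG|SUPERSEDED|LIMITED|INCOMPARABLE)\b")


def tally_labels(texts: Iterable[str]) -> Counter:
    """Count 'Label: X' markers across the given verdict texts."""
    counts = Counter({label: 0 for label in LABELS})
    for text in texts:
        counts.update(_LABEL_RE.findall(text))
    return counts


def findings_line(counts: Counter) -> str:
    """Format the counts in the order the reply template uses."""
    return "Findings: " + ", ".join(f"{counts[label]} {label}" for label in LABELS)


if __name__ == "__main__":
    from pathlib import Path

    # Reads whatever verdict files exist; prints all zeros if there are none.
    texts = [
        p.read_text(encoding="utf-8", errors="replace")
        for p in sorted(Path("research/verdicts").glob("*.md"))
    ]
    print(findings_line(tally_labels(texts)))
```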
Do NOT:
npx claudepluginhub ctoth/research-papers-plugin --plugin research-papers