Adjudicates between Red Team critiques and Blue Team defenses. Produces nuanced per-issue assessments — confirming, debating, or recommending deletion. Especially careful with Type B (acknowledged) issues. Use after Blue Team defense.
Provides an honest defense of the paper for each issue identified by the Red Team. Classifies issues into Type A (Red Team mistake), B (acknowledged), C (clerical), D (structural), E (visual evidence), F (feature-not-bug), G (other). Use after the Red Team summary has been compiled into a list.
Code audit agent. Reads replication code as a meticulous programmer looking for technical bugs independent of paper claims — panel-data operator misuse, merge integrity, missing-value propagation, forward-looking contamination, treatment-FE collinearity, silent duplication, staggered DiD heterogeneity, spatial autocorrelation. Use only with a replication-code directory. Report every bug you can demonstrate by tracing the code; do not pad the list to hit a count.
External-source hallucination audit. Catches critiques that invent specific factual details (years, geographic coverage, sample restrictions, frequency, etc.) about external papers/datasets cited by the text. Applies the Black Box Rule. Use after fact-checker, before final list compilation.
Produces the final, authoritative list of verified code issues from the Code Verifier's verdicts. Includes only CONFIRMED and OVERSTATED issues. Frames each as a single bold-labelled paragraph in sentence case for inclusion in the final report. Use after code-verifier.
Work in Progress. This plugin is still under active development by a university student. For now, it is an experiment on the utility of Claude Code for peer-reviewing academic papers. If you have any feedback or suggestions, reach me on X at @felpix_.
A Claude Code plugin that runs an adversarial multi-agent peer review on an academic paper PDF. Spawns a Red Team of specialized critics (Foundations-Critic, Empirical Auditor, Procedural Auditor, Collector, Omissions Auditor), a Blue Team that writes honest defenses, an Assessor that adjudicates between them, fact-checkers that verify quotes and citations against the PDF, and — in writer mode — an author-facing editor, proofreader, and copyeditor. Produces a structured Credibility Assessment / Bottom Line / Potential Issues / Future Research report.
The pipeline's effectiveness comes from the verification cascade: aggressive prompting drives careful reading, and downstream verifiers strip out the LLM hallucinations that aggressive prompting also produces.
Use Claude Code to add the marketplace and install the plugin.
/plugin marketplace add Felpix-Studios/peer-review
/plugin install peer-review@felpix-research
The plugin extracts PDFs to plain text once per run (so a 15-agent pipeline doesn't re-render the same pages 15 times as images). This requires Python 3.10+ on PATH as python and the pypdf package.
pip install pypdf reportlab bibtexparser
reportlab is optional — it's only used by the compile-code-to-pdf.py helper when you want to pack a replication-code directory into a single dense PDF for the code-audit agents. bibtexparser is optional — it's only used by the parse-bib.py helper if you supply a .bib file at run start (the citation-checker uses it to verify cited references against your bibliography). The plugin's regex .bib parser handles common entries without bibtexparser; install it if your bibliography uses unusual concatenations or escape sequences. Installing all three now avoids a second failure later.
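As a rough illustration of what a regex-based pass over a .bib file can recover, here is a minimal sketch. It is not the plugin's parse-bib.py: the function name `list_bib_entries` and the pattern are assumptions made for this example, and it only extracts each entry's type and citation key.

```python
import re

# Hypothetical pattern (not taken from parse-bib.py): matches the start of
# an entry such as "@article{smith2020," and captures the type and the key.
ENTRY_START = re.compile(r"@(\w+)\s*\{\s*([^,\s{}]+)\s*,")

# Block types that are not bibliography entries.
SKIPPED_TYPES = {"comment", "string", "preamble"}


def list_bib_entries(bib_text: str) -> list[tuple[str, str]]:
    """Return (entry_type, citation_key) pairs found in a .bib string."""
    entries = []
    for match in ENTRY_START.finditer(bib_text):
        entry_type = match.group(1).lower()
        if entry_type in SKIPPED_TYPES:
            continue
        entries.append((entry_type, match.group(2)))
    return entries


sample = """
@article{smith2020,
  title = {A Title},
  year = {2020}
}
@Book{doe_1999, title = {Another}}
"""
print(list_bib_entries(sample))
# [('article', 'smith2020'), ('book', 'doe_1999')]
```

A pattern like this copes with ordinary entries but knows nothing about `#` string concatenation or escaped braces inside field values, which is the kind of input where a full parser such as bibtexparser is the safer choice.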
Preflight check. If Python or pypdf is missing when you run /peer-review, the skill halts at preflight and prints the install command: nothing is extracted, no agent is invoked, no directory is created.
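The preflight idea can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the skill's actual check: the function name `preflight_problems` and the message wording are hypothetical. Only the Python 3.10 floor and the pypdf requirement come from the text above.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 10)  # version floor stated in the prerequisites


def preflight_problems(version_info=None, required_modules=("pypdf",)):
    """Return a list of human-readable problems; an empty list means go."""
    if version_info is None:
        version_info = sys.version_info
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, "
            f"found {version_info[0]}.{version_info[1]}"
        )
    for name in required_modules:
        # find_spec returns None when a top-level module cannot be found.
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing package '{name}': pip install {name}")
    return problems


issues = preflight_problems()
print("preflight ok" if not issues else "\n".join(issues))
```

Collecting every problem before reporting means a user who lacks both Python 3.10 and pypdf sees both messages in one pass.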
/peer-review
Point the skill at a PDF. The invocation accepts free-form prose; there are no flags to parse:
/peer-review path/to/paper.pdf
/peer-review my paper is named "working-paper.pdf"
Claude will then ask a short follow-up covering:
Accept the defaults for a straightforward base run.
The final report is written to ./peer-review-report.md (top of your cwd). If pandoc and a LaTeX distribution with xelatex are installed, a companion ./peer-review-report.pdf is produced in the same directory (all LaTeX intermediates are cleaned up automatically). Every intermediate stage file — plus a paper_text.txt plain-text dump of the PDF and a generated README.md mapping each NN_stage.txt to its contents — is cached under ./peer-review-output/<slug>/, a visible folder in your cwd. A partial run can be resumed by re-invoking the command: only stages whose cached file is missing will re-run.
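The resume behaviour described above amounts to skipping every stage whose cached file already exists. A minimal sketch, assuming a flat cache directory: the function name `stages_to_run` and the stage file names below are hypothetical, since the real NN_stage.txt names are generated by the plugin.

```python
import tempfile
from pathlib import Path


def stages_to_run(cache_dir: Path, stage_files: list[str]) -> list[str]:
    """Return the stage files that are not yet cached, in pipeline order."""
    return [name for name in stage_files if not (cache_dir / name).is_file()]


# Hypothetical stage names, used only for this demonstration.
STAGES = ["01_extract.txt", "02_red_team.txt", "03_blue_team.txt"]

with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp)
    # Simulate a partial run in which only the first stage finished.
    (cache / "01_extract.txt").write_text("cached output")
    print(stages_to_run(cache, STAGES))
    # ['02_red_team.txt', '03_blue_team.txt']
```

Under this scheme, deleting a single stage file from the cache folder would be enough to force that stage to re-run on the next invocation.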
| Tool | Required For | Install |
|---|---|---|
| Claude Code (with plugin support) | Everything | claude.ai/download |
| Python (>= 3.10) on PATH as python | PDF text extraction (every run) | python.org |
| pypdf | PDF text extraction (every run) | pip install pypdf |
| reportlab (optional) | compile-code-to-pdf.py helper for code-audit runs | pip install reportlab |
| bibtexparser (optional) | parse-bib.py helper for citation validation when a .bib is supplied | pip install bibtexparser |
| pandoc (optional) | PDF export of the final report | brew install pandoc (macOS) / apt install pandoc |
| LaTeX distribution with xelatex (optional) | PDF export of the final report | MacTeX / TeX Live / MiKTeX |
| Access to Claude Opus + Sonnet | 26 agents use Opus; 9 use Sonnet | Claude Code plan |
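To see whether the optional PDF export would be available on a given machine, a check along these lines may help. It is a sketch, not the plugin's code: the function name `pdf_export_command` is hypothetical, and the pandoc command line is an assumption about how such an export could be run, since the plugin's own invocation is not shown here.

```python
import shutil


def pdf_export_command(report_md="peer-review-report.md",
                       report_pdf="peer-review-report.pdf",
                       which=shutil.which):
    """Return a pandoc argv list, or None if pandoc or xelatex is absent."""
    if which("pandoc") is None or which("xelatex") is None:
        return None
    # --pdf-engine selects the LaTeX engine pandoc uses to produce the PDF.
    return ["pandoc", report_md, "-o", report_pdf, "--pdf-engine=xelatex"]


cmd = pdf_export_command()
print("PDF export available" if cmd else "PDF export skipped: pandoc or xelatex not found")
```

The returned list can be passed to `subprocess.run` as-is. The `which` parameter exists only so the lookup can be replaced in tests.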
The orchestrator drives a 30+ stage pipeline of subagents through a verification cascade designed to remove LLM hallucinations:
npx claudepluginhub felpix-studios/peer-review --plugin peer-review