By andyed
Audits AI-assisted research drafts for citation integrity, prose quality, figure consistency, and scientific rigor before committing. Verifies references against CrossRef and BibTeX to catch confabulated citations, stale claims, and misnumbered figures.
Generate a key-claims summary from notebooks with `## Key Claims` sections. Args: <notebooks-dir> [-o <path>]
Search arXiv by free text; surfaces published DOIs ready to verify. Args: "query" [--max=10] [--sort=relevance] [--id=2401.12345]
Audit recent arXiv papers for citation quality. Args: [count] [--cat=cs.AI]
Audit citations in a directory against a BibTeX file. Args: <dir> --bibtex=<path> [--json] [--verbose]
Verify figure caption numerics against summary.json sidecars. Args: <INDEX.md> [--json]
Verify figure caption numerics against their summary.json sidecars. For each figure section in an INDEX.md, walks the linked stats dump, extracts numerics from the caption prose, and matches them. Catches stale prose where a value drifted between figure regeneration and caption update. Sibling to citation-audit, prose-audit, and rigor-audit.
Review research artifacts for scientific rigor before they're committed or pushed. Catches framing errors (nulls as detection limits, p-values without effect+CI, undefined metrics), unsupported claims, and presentation problems. Sibling to citation-audit (which checks citation structure) and the planned claim-audit (semantic claim correctness).
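The caption check that figure-audit describes (pull numerics out of caption prose, look for each one in the stats sidecar) can be sketched in a few lines of Python. This is an illustration under assumptions, not the tool's code: the layout of `summary.json` is not documented here, so the sketch treats it as arbitrary nested JSON and accepts a caption number if any numeric leaf rounds to it at the caption's displayed precision. The names `load_summary`, `flatten_numbers`, and `stale_caption_numbers` are hypothetical.

```python
import json
import re

# Plain decimal numbers only. Thousands separators ("2,719"), Unicode
# minus signs, and figure numbers are NOT handled in this sketch.
NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def load_summary(path):
    """Read a summary.json sidecar (assumed to be ordinary JSON)."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def flatten_numbers(node):
    """Collect every numeric leaf from nested dicts and lists."""
    if isinstance(node, bool):  # bool is a subclass of int; skip it
        return []
    if isinstance(node, (int, float)):
        return [float(node)]
    if isinstance(node, dict):
        node = list(node.values())
    if isinstance(node, list):
        return [x for child in node for x in flatten_numbers(child)]
    return []


def stale_caption_numbers(caption, summary):
    """Return caption numerics that match no value in the sidecar."""
    values = flatten_numbers(summary)
    stale = []
    for text in NUMBER.findall(caption):
        # Compare at the precision the caption actually prints.
        decimals = len(text.split(".")[1]) if "." in text else 0
        target = float(text)
        if not any(round(v, decimals) == target for v in values):
            stale.append(text)
    return stale
```

Matching at the caption's own precision is a deliberate choice: a caption that prints three decimals should not be flagged because the sidecar stores five.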
Uses power tools
Uses Bash, Write, or Edit tools
Verify what AI writes about science — citations, claims, and cross-file consistency.
A README linked to PMID 12078741 as the foundational paper on Restricted Focus Viewers in vision science. The actual paper at that ID? "Determination of true ileal amino acid digestibility... in barley samples for growing-finishing pigs." The correct PMID was 12723780 — off by 645,039.
npx github:andyed/science-agent audit ./docs --bibtex=./refs.bib
Or verify a single DOI against CrossRef:
npx github:andyed/science-agent verify 10.1038/nn.2889
No install, no clone — runs straight from this repo.
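For readers curious what a single-DOI check involves, here is a minimal Python sketch of the comparison step. It is not Science Agent's implementation. It assumes the public CrossRef REST endpoint `https://api.crossref.org/works/<doi>`, whose JSON carries the title under `message.title` (a list) and authors under `message.author` (objects with a `family` field). The function names and the 0.85 similarity threshold are hypothetical choices for illustration.

```python
import json
import re
import urllib.parse
import urllib.request
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse punctuation so titles compare fairly."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def fetch_crossref(doi):
    """Fetch the CrossRef work record for a DOI (makes a network call)."""
    url = "https://api.crossref.org/works/" + urllib.parse.quote(doi)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)["message"]


def doi_matches_claim(record, claimed_title, claimed_surnames, threshold=0.85):
    """Compare a CrossRef-style record with what the prose claims.

    Returns a list of problems; an empty list means the DOI appears
    to point at the claimed paper, as far as this sketch can tell.
    """
    problems = []
    titles = record.get("title") or [""]
    score = SequenceMatcher(
        None, normalize(titles[0]), normalize(claimed_title)
    ).ratio()
    if score < threshold:  # threshold is an assumed cutoff, tune to taste
        problems.append("title mismatch (similarity %.2f)" % score)
    known = {normalize(a.get("family", "")) for a in record.get("author", [])}
    for surname in claimed_surnames:
        if normalize(surname) not in known:
            problems.append("author not in CrossRef record: " + surname)
    return problems
```

In use, you would pass `fetch_crossref(doi)` as `record` together with the title and surnames your prose attributes to that DOI. A resolvable DOI that returns a different title is exactly the failure in the PMID anecdote above.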
Want it on your $PATH instead of refetching each run? Install the CLI globally from GitHub (wires up the science-agent bin):
npm install -g github:andyed/science-agent
science-agent audit ./docs --bibtex=./refs.bib
Catches AI-confabulated academic citations before they ship. Verifies inline references against BibTeX and CrossRef:
| Pattern | How |
|---|---|
| Wrong title | Fuzzy title matching against BibTeX + CrossRef |
| Fabricated co-authors | CrossRef author list verification |
| Wrong DOI | CrossRef DOI resolution — checks that the DOI points to the claimed paper |
| Compound confabulation | CrossRef + title search detects merged citations |
| Ambiguous citation | Surname+year collision detection across BibTeX entries |
| Orphan citation | Inline reference with no BibTeX entry |
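The last two rows of the table, ambiguous and orphan citations, need no network access and are easy to picture. The Python sketch below is a simplified illustration, not the tool's parser: it assumes inline citations look like `(Surname, 2020)` or `(Surname et al., 2020)` and that the BibTeX file has already been reduced to dicts with hypothetical `key`, `surname`, and `year` fields.

```python
import re
from collections import defaultdict

# Assumed inline style: "(Surname, 2020)" or "(Surname et al., 2020)".
CITATION = re.compile(r"\(([A-Z][A-Za-z'-]+)(?: et al\.)?,? (\d{4})\)")


def index_bibtex(entries):
    """Group BibTeX keys by (first-author surname, year)."""
    index = defaultdict(list)
    for entry in entries:
        index[(entry["surname"].lower(), entry["year"])].append(entry["key"])
    return index


def classify_citations(prose, entries):
    """Label each inline citation as ok, ambiguous, or orphan."""
    index = index_bibtex(entries)
    findings = []
    for match in CITATION.finditer(prose):
        surname, year = match.group(1), match.group(2)
        keys = index.get((surname.lower(), year), [])
        if not keys:
            status = "orphan"      # no BibTeX entry at all
        elif len(keys) > 1:
            status = "ambiguous"   # surname+year collision
        else:
            status = "ok"
        findings.append((match.group(0), status, keys))
    return findings
```

Each finding is a tuple of the citation text, its status, and the candidate BibTeX keys, so an ambiguous hit shows which entries collide.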
Research notebooks produce numbers. Papers and READMEs cite those numbers. Over time, numbers drift — a notebook gets re-run with new data, but the prose still quotes the old value. Science Agent makes this auditable.
The idea: Each notebook declares its load-bearing results in a ## Key Claims table. Prose references them as [NB14:K3] (notebook 14, claim 3). Science Agent verifies every reference resolves to a real claim with a real value.
## Key Claims
| ID | Claim | Value | Verified |
|----|-------|-------|----------|
| K1 | Sample size after exclusions | N = 2,719 trials | 2026-04-09 |
| K2 | Main effect | ρ = −0.618, p = 0.0426 | 2026-04-09 |
Then in your paper or README:
The position × cognitive load correlation [NB14:K2] suggests...
Audit it:
# Generate aggregate from notebooks (one-time setup)
science-agent aggregate ./notebooks/ -o docs/notebook-key-claims.md
# Audit all claim references in prose
science-agent notebook-audit ./docs \
--aggregate=./docs/notebook-key-claims.md \
--notebooks=./notebooks/ \
--cross-repo=../downstream-repo
Detects:
[NB14:K3] cited in prose but K3 doesn't exist in NB14
If you don't use notebooks or don't need claim tracking, ignore this: audit and verify work standalone. If you do, see the full setup guide: docs/notebook-conventions.md
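The resolution step behind that check can be sketched as follows. This is a minimal Python illustration that assumes the `## Key Claims` table layout and `[NB14:K3]` reference syntax shown above. The function names are hypothetical, and the real notebook-audit command may parse notebooks differently.

```python
import re

# Matches references such as [NB14:K3]: notebook number, then claim ID.
CLAIM_REF = re.compile(r"\[NB(\d+):(K\d+)\]")


def parse_key_claims(markdown):
    """Return {claim_id: value} from the table under '## Key Claims'."""
    claims = {}
    in_section = False
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("## "):
            # Entering or leaving the Key Claims section.
            in_section = stripped == "## Key Claims"
            continue
        if not in_section or not stripped.startswith("|"):
            continue
        cells = [c.strip() for c in stripped.strip("|").split("|")]
        # Only rows whose first cell is a claim ID count; this skips
        # the header row and the |----| separator row.
        if len(cells) >= 3 and re.fullmatch(r"K\d+", cells[0]):
            claims[cells[0]] = cells[2]
    return claims


def unresolved_refs(prose, notebooks):
    """List claim references that point at a missing notebook or claim.

    `notebooks` maps a notebook number to its parsed claims dict.
    """
    missing = []
    for nb, claim in CLAIM_REF.findall(prose):
        if claim not in notebooks.get(int(nb), {}):
            missing.append("[NB%s:%s]" % (nb, claim))
    return missing
```

Feeding it the example table above as notebook 14 and a sentence citing `[NB14:K3]` would report that reference as unresolved, since the table only defines K1 and K2.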
Science Agent is a CLI tool — any assistant that can run shell commands can use it. No API keys, no plugins, no vendor lock-in.
Claude Code:
> check my citations against refs.bib
# Claude runs: npx github:andyed/science-agent audit ./docs --bibtex=./refs.bib
ChatGPT / Codex / GitHub Copilot in terminal:
> run science-agent to verify the citations in my paper
# GPT runs: npx github:andyed/science-agent audit ./paper --bibtex=./references.bib
Gemini Code Assist / Cursor / Windsurf / any terminal AI:
> audit my bibtex citations for confabulation
# Assistant runs: npx github:andyed/science-agent audit . --bibtex=./refs.bib
The pattern is the same everywhere: point at a directory of prose and a BibTeX file. The tool does the rest.
For deeper integration, see agent.md (Claude Code agent) or docs/github-actions.md (CI/CD).
For any agent that supports skills.sh — Claude Code, Cursor, Codex, and others — install the portable skill straight from GitHub:
npx skills add andyed/science-agent
This installs skills/science-agent/SKILL.md, which teaches the agent when and how to drive the CLI (audit, verify, search, arxiv-search, notebook-audit, figure-audit) plus the find→verify pattern. The CLI runs via npx github:andyed/science-agent, so there's nothing else to install.
Install the whole toolkit — agents, slash commands, CLI — in one step:
/plugin install andyed/science-agent
npx claudepluginhub andyed/science-agent --plugin science-agent