From paper-toolkit
Use when a research workspace has analysis outputs and the user wants to write, revise, check, compile, or review an academic paper with paper-toolkit. Drives deterministic CLI tools while keeping all writing, judgment, review, and revision decisions in the agent session.
Slash command: `/paper-toolkit:agentsociety-generate-paper`

`paper-toolkit` is a deterministic Python CLI (no LLM calls). This skill
directs Claude Code through writing, review, and revision while the CLI
handles all reproducible operations: workspace state, evidence DAG, figure
packing, BibTeX writing, LaTeX assembly, compile, and six checkers.
You own every judgment call (what to claim, which evidence supports it, review severity, revision class). The researcher remains the author of record; this skill assists them, it does not replace them.
NO PROSE PATCHES FOR EVIDENCE OR STRUCTURE FAILURES.
NO RELEASE CLAIM WITHOUT A FRESH `paper check all` AND `paper compile-once`.
If `paper check claim-coverage` or `paper check citations` is failing, fix
the evidence graph or `refs.bib` before touching section prose. If the
deterministic checks have not been re-run since the last edit, you cannot
claim the paper is ready. Skipping these steps is not a shortcut; it is a
lie about state.
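The "fresh run" rule can be checked mechanically. The following is a minimal sketch and not part of paper-toolkit: it assumes the latest compile record is a `run.json` file with a top-level boolean `ok` field (the only field this document names), and it treats the run as stale if any section `.tex` file was modified after it. The function name `run_is_fresh` and the way paths are passed in are hypothetical.

```python
import json
import os
from pathlib import Path


def run_is_fresh(run_json: Path, tex_files: list[Path]) -> bool:
    """Return True only if run.json reports ok and no .tex file is newer.

    Assumption: run.json has a top-level boolean "ok" field. The real
    schema is described in references/compile_run_schema.md.
    """
    if not run_json.exists():
        return False
    with run_json.open(encoding="utf-8") as fh:
        record = json.load(fh)
    if record.get("ok") is not True:
        return False
    run_mtime = os.path.getmtime(run_json)
    # Any edit made after the run invalidates it, however small the edit.
    return all(os.path.getmtime(tex) <= run_mtime for tex in tex_files)
```

A check like this only tells you that a re-run is needed; it does not replace running `paper compile-once` and `paper check all`.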
Each prompt and reference is a required read for the task it owns, not a "see also" link. Open them with the Read tool, do not infer their contents from this index.
| Task | MUST read before acting |
|---|---|
| Drafting any section | prompts/_writing_shared.md + prompts/writing_<section>.md |
| Picking an opening pattern | references/exemplar_patterns.md |
| Authoring a figure or table spec | references/figure_table_specs.md |
| Searching arXiv / CrossRef / OpenAlex | references/literature_search.md |
| Skeptical review | prompts/skeptical_review.md + references/review_rubric.md |
| Revision decision | prompts/revision_decision.md |
| Running a multi-section / multi-round project | prompts/_subagent_workflow.md |
| First time using a command | references/tool_catalog.md |
| Inspecting an envelope | references/envelope_schema.md |
| Reading a run.json | references/compile_run_schema.md |
| Reading a CheckReport | references/check_report_schema.md |
| Inspecting the evidence DAG | references/evidence_graph_schema.md |
The Red Flags section below lists the "I don't need to read it" thoughts that mean you do.
Pre-reads apply mid-flow, not just at skill trigger time. When you transition from drafting to review, or from one section to another, the Read tool must touch the relevant prompt and references again — your context drifts and the prompts encode the discipline the next step needs.
Two modes; the controller picks per task.
Direct mode (single-edit only): the controller (the Claude Code session) reads the discipline prompts and drives the toolkit itself. ONLY appropriate for a single-section copyedit or a deterministic-checker fix-up. NOT appropriate for drafting new sections, running skeptical review, or making revision decisions.
Subagent-driven mode (REQUIRED for drafting and review): the controller dispatches fresh subagents per role (drafter, spec-reviewer, skeptical-reviewer, revision-decider). The skeptical-reviewer and revision-decider in particular MUST be subagents — running them in the same session that drafted the prose is a self-review and counts as skipping the review pass.
When dispatching, the controller's prompt to the subagent MUST include
the relevant prompts/*.md + references/*.md paths as an explicit
"Required reads" block. Subagent sessions start with empty context and
will not find them otherwise. See prompts/_subagent_workflow.md for
payload templates.
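As an illustration of the "Required reads" block described above, here is a small sketch of how a controller might assemble a dispatch prompt. It is an assumption-laden example, not the payload format from `prompts/_subagent_workflow.md`: the task names, the `REQUIRED_READS` mapping, and `build_subagent_prompt` are hypothetical, and only the file paths come from this document.

```python
# Task name -> required reads. The paths are copied from this document;
# the task names themselves are hypothetical labels for illustration.
REQUIRED_READS = {
    "skeptical_review": [
        "prompts/skeptical_review.md",
        "references/review_rubric.md",
    ],
    "revision_decision": ["prompts/revision_decision.md"],
    "draft_intro": [
        "prompts/_writing_shared.md",
        "prompts/writing_intro.md",
    ],
}


def build_subagent_prompt(task: str, instructions: str) -> str:
    """Prefix the instructions with an explicit "Required reads" block."""
    # An unknown task raises KeyError on purpose: dispatching without
    # required reads should fail loudly rather than silently.
    paths = REQUIRED_READS[task]
    lines = ["Required reads (open each with the Read tool before acting):"]
    lines += [f"- {path}" for path in paths]
    return "\n".join(lines) + "\n\n" + instructions
```

Putting the block first means a subagent that starts with empty context sees the file paths before it sees the task.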
Iron Law of mode selection: do not switch modes mid-loop. If you started a section's draft → review → revise cycle in subagent-driven mode, finish it in subagent-driven mode. Do not let the controller "just fix it directly" partway through; doing so loses the audit trail.
A recommended sequence. You may reorder when a section is already mature, or when a review finding sends the work back to an earlier step.
```text
intake -> framing -> evidence DAG -> drafting
                                           |
release <- revision <- skeptical review <- compile + checks
```
| Step | Toolkit commands (run these — do not narrate them) |
|---|---|
| 1. Intake | paper init, paper scan |
| 2. Framing | (judgment — produces no toolkit artifact yet) |
| 3. Evidence DAG | paper evidence add-claim/add-evidence/add-citation/link/validate, paper evidence topo-order |
| 4. Drafting | paper template expand --section <name>, then edit paper/sections/<name>.tex |
| 5. Compile + checks | paper compose pack-figures, paper compose write-bib, paper compose assemble-latex, paper compile-once, paper check all |
| 6. Skeptical review | (LLM judgment; output to paper/reviews/skeptical-r<N>.md) |
| 7. Revision decision | (LLM judgment; output to paper/reviews/revision-r<N>.md) |
| 8. Release | (LLM judgment, communicate to user) |
Loop 4 → 5 → 6 → 7 until verdict is PASS with only minor issues
remaining, or until the researcher closes the loop.
These checks are the floor of review, not the ceiling. Run them and treat each finding as an input to the human-judgment review, not as a finding you re-derive.
| Command | Replaces what kind of judgment |
|---|---|
| paper check style | em-dash, "in this paper we propose", 25 AI-tone phrases (warning), "replacing the researcher" framing (error). Do not re-list these rules in prose; read the checker output. |
| paper check citations | every \cite, \citep, \citet, \citealt, \citealp, \citeauthor, \citeyear in sections vs. refs.bib. |
| paper check figures | unreferenced figures, redundant "Figure N." caption prefix, bad float placement, duplicate \label{} across sections. |
| paper check claim-coverage | orphan evidence, unsupported primary claims. |
| paper check word-count | sections outside the venue word range. |
| paper check logic-consistency | contradicting claims linked by a contradicts edge. |
| paper compile-once | LaTeX errors and warnings, with LatexError.file attributed to the source .tex. |
Map deterministic findings into the rubric using references/review_rubric.md
before spending tokens on Dimensions 1, 2, 4, 7 (which are pure judgment).
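To make concrete what a citation check of this kind compares, here is a rough sketch. It is not the implementation behind `paper check citations`; it is an assumption about how such a check could work, using regular expressions over the citation commands listed in the table. The names `CITE_RE`, `BIB_KEY_RE`, and `missing_cite_keys` are hypothetical.

```python
import re

# Citation commands named for `paper check citations` in this document.
# Optional star and optional [..] arguments are tolerated before the keys.
CITE_RE = re.compile(
    r"\\(?:citealt|citealp|citeauthor|citeyear|citep|citet|cite)\*?"
    r"(?:\[[^\]]*\])*\{([^}]*)\}"
)
# Entry keys in a BibTeX file, e.g. "@article{levy2021,".
BIB_KEY_RE = re.compile(r"@\w+\s*\{\s*([^,\s]+)\s*,")


def missing_cite_keys(tex_source: str, bib_source: str) -> set[str]:
    """Return keys cited in the LaTeX source but absent from the .bib text."""
    cited: set[str] = set()
    for group in CITE_RE.findall(tex_source):
        # One command may cite several comma-separated keys.
        cited.update(key.strip() for key in group.split(",") if key.strip())
    defined = set(BIB_KEY_RE.findall(bib_source))
    return cited - defined
```

A regex pass like this misses edge cases (commented-out citations, custom macros), which is one reason to read the checker's own report rather than re-derive its findings.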
| Excuse | Reality |
|---|---|
| "This section is short, I don't need an envelope after editing." | Every section edit must be followed by paper check style --section X and paper check word-count --section X. The envelope is the only record that the change was actually clean. |
| "The figure is obvious; I'll skip the evidence node." | A claim without an evidence node is invisible to paper check claim-coverage. The reviewer will catch it; the toolkit catches it first. |
| "The AI-tone warning is a false positive; I'll override." | Maybe. But override in the paper/venue.yaml (which is durable) or note it in the review file. Do not silently leave the warning unfixed and unjustified. |
| "The previous compile run is recent enough; I won't re-run." | If you edited a .tex file after r<N>, the previous run is stale. Re-run before claiming a verdict. |
| "I already read SKILL.md; I don't need to re-read the prompt." | SKILL.md is a router. The prompt files carry the discipline. Read them. |
| "The verb ladder is in my head; I don't need _writing_shared.md." | The ladder evolves with the venue config. Read the file each session. |
| "The review found one major issue; I'll batch the fix with three minor ones." | paper/reviews/revision-r<N>.md must record each fix separately for auditability. Batching is the enemy of monotonic progress. |
| "Citations are unused; I'll delete them from refs.bib." | An unused citation may belong to a deferred claim. Decide first, prune second; ask the researcher if unsure. |
| "I drafted this section, so I can also review it; I'll save a subagent dispatch." | Self-review is not review. Skeptical reviewer and revision decider MUST be dispatched as subagents distinct from the drafter — see prompts/_subagent_workflow.md. |
| "I read the prompt at the start of the skill; I don't need to re-Read it before review." | Each stage has its own pre-reads. Read them again when you move between drafting and review — context drift is real. |
- "… fatal, but it's really just wording."
- "… `paper check all`; my edits were small."

ALL of these mean: STOP. Return to the relevant step in the Workflow Shape table above.
Before reporting "the paper is ready" or "this section is done", confirm every box. If you cannot, you skipped a step.
- [ ] `paper compile-once` has been re-run since the last edit; latest `run.json` shows `ok: true`.
- [ ] `paper check all` has been re-run since the last edit; report has no errors (warnings are explicitly judged and recorded).
- [ ] … `strength: primary` has a supporting evidence node (verify with `paper check claim-coverage`).
- [ ] … `\cite{}` key in every section exists in `refs.bib` (verify with `paper check citations`).
- [ ] … `paper/figures/` and a `FigureArtifact` in `paper.json` (verify with `paper check figures`).
- [ ] … `paper/reviews/skeptical-r<N>.md` has verdict PASS with only minor issues, or the researcher has explicitly accepted remaining major issues with a note in `paper/reviews/revision-r<N>.md`.
- [ ] … (verify with `paper check word-count`).

Can't check all boxes? You skipped the deterministic floor. Return to step 5.
- … `state_summary`; decide next from artifacts + user intent + review notes.
- … `paper/`. Materials outside `paper/` are read-only inputs.
- … `major` to `fatal` for convenience or demote `fatal` to `major` to avoid a human gate.
- … `---`) in prose. The style checker enforces this.

```bash
paper init --title "..." --venue nature --workspace .
paper scan --workspace .

paper evidence add-claim --id c1 --label "..." --section intro --strength primary
paper evidence add-evidence --id e1 --label "..." --source-kind figure --source-ref fig1
paper evidence add-citation --id ref1 --cite-key levy2021 --label "..."
paper evidence link --src e1 --dst c1 --kind supports
paper evidence validate
paper evidence topo-order

# For each section the topo order suggests:
paper template expand --section intro --workspace .
# Read prompts/_writing_shared.md, then prompts/writing_intro.md.
# Edit paper/sections/intro.tex.
paper check style --section intro --workspace .
paper check word-count --section intro --workspace .

paper compose pack-figures --workspace .
paper compose write-bib --workspace .
paper compose assemble-latex --workspace .
paper compile-once --workspace .
paper check all --workspace .

# Read prompts/skeptical_review.md.
# Save review to paper/reviews/skeptical-r1.md.
# Read prompts/revision_decision.md.
# Save decisions to paper/reviews/revision-r1.md.
# Apply revisions and loop.
```
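The evidence commands above build a graph of claims, evidence nodes, and `supports` links. The toy model below sketches what "unsupported primary claims" and "orphan evidence" could mean on such a graph. It is an in-memory illustration under stated assumptions, not the toolkit's data model: the `EvidenceGraph` class and its methods are hypothetical, and the real schema is in `references/evidence_graph_schema.md`.

```python
from dataclasses import dataclass, field


@dataclass
class EvidenceGraph:
    """Toy stand-in for the evidence DAG (hypothetical structure)."""

    claims: dict[str, str] = field(default_factory=dict)  # claim id -> strength
    evidence: set[str] = field(default_factory=set)  # evidence node ids
    supports: set[tuple[str, str]] = field(default_factory=set)  # (src, dst)

    def add_claim(self, claim_id: str, strength: str) -> None:
        self.claims[claim_id] = strength

    def add_evidence(self, evidence_id: str) -> None:
        self.evidence.add(evidence_id)

    def link(self, src: str, dst: str) -> None:
        self.supports.add((src, dst))

    def unsupported_primary_claims(self) -> list[str]:
        # A claim counts as supported only if an evidence node links to it.
        supported = {dst for src, dst in self.supports if src in self.evidence}
        return sorted(
            cid
            for cid, strength in self.claims.items()
            if strength == "primary" and cid not in supported
        )

    def orphan_evidence(self) -> list[str]:
        # Evidence that supports nothing is invisible to the paper's argument.
        used = {src for src, _ in self.supports}
        return sorted(self.evidence - used)
```

For example, a graph with primary claims `c1` and `c2`, evidence `e1` and `e2`, and a single link `e1 -> c1` would report `c2` as unsupported and `e2` as an orphan.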
npx claudepluginhub yokumii/paper-toolkit --plugin paper-toolkit