Use this skill whenever the user wants to synthesize, review, or analyze papers in a Zotero collection — even if they don't say 'literature review'. Trigger on phrases like: 'review the papers in my X collection', 'analyze my Zotero collection on X', 'do a lit review on X', 'summarize the research on X from Zotero', 'what does the literature say about X', or any request to make sense of a body of papers. Runs a 6-stage pipeline: indexes the collection, derives research questions with user approval, delegates parallel Haiku workers for full-text extraction, aggregates into a cross-paper synthesis with themes/agreements/contradictions, runs two adversarial review passes, and delivers Markdown + PDF + Zotero notes per paper.
Slash command: /rlm-research-analyzer:rlm-research-analyzer
Before invoking this skill, ensure the following are in place:
| Requirement | Role | Required? |
|---|---|---|
| zotero-mcp MCP server | Provides all zotero_* tools (collection listing, metadata, full-text retrieval) | Required |
| Zotero desktop app running locally | zotero-mcp reads the local SQLite database | Required |
| GEMINI_API_KEY env var (or OpenAI equivalent) | Powers semantic search filtering when a focus question is provided | Optional — focus-question filtering is skipped if unavailable |
| Python packages: weasyprint, markdown | PDF generation in Stage 6 | Optional: Stage 6 skips PDF gracefully if absent; install with pip install weasyprint markdown |
If zotero-mcp is not configured, stop immediately and print:
RLM Research Analyzer requires the zotero-mcp MCP server.
See: https://github.com/ussoftwareassociation/zotero-mcp
$ARGUMENTS format: "Collection Name" or "Collection Name" focus question
To include a focus question, the collection name MUST be quoted.
Parse $ARGUMENTS as follows:
- <collection name>: the quoted string at the start of $ARGUMENTS. If no quotes are present, treat the entire $ARGUMENTS string as the collection name (no focus question).
- <focus question>: any text that follows the closing quote (optional).
- <collection-slug>: collection name lowercased, spaces replaced by hyphens, all characters that are not letters, digits, or hyphens removed. Example: "Neuro Papers (2024)" → neuro-papers-2024.
- <working_dir>: <project_root>/rlm-runs/<collection-slug>/ where <project_root> is the absolute path of the current working directory at the time the skill is invoked (i.e. the directory Claude Code was opened in). Use the Bash tool to resolve it: pwd. Example: if pwd returns /home/user/myproject, then <working_dir> is /home/user/myproject/rlm-runs/my-collection/.
Call zotero_get_collections to list all collections in the library.
Find the collection whose name matches <collection name> (case-insensitive). If no match is found, print:
Collection "<name>" not found. Available collections:
- <list all collection names>
Then stop.
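The argument parsing, slug, and case-insensitive lookup rules above can be sketched in Python. This is a minimal illustration, not the skill's own code: the function names are hypothetical, the regular expression assumes the quoted collection name opens the string, and "letters" is read as ASCII letters only.

```python
import re
from typing import List, Optional, Tuple


def parse_arguments(arguments: str) -> Tuple[str, Optional[str]]:
    """Split $ARGUMENTS into (collection name, focus question or None)."""
    text = arguments.strip()
    # Assumption: a quoted collection name must open the string.
    match = re.match(r'^"([^"]+)"\s*(.*)$', text, flags=re.DOTALL)
    if match is None:
        # No leading quoted name: the whole string is the collection name.
        return text, None
    focus = match.group(2).strip()
    return match.group(1).strip(), focus or None


def collection_slug(name: str) -> str:
    """Lowercase, spaces to hyphens, drop all but ASCII letters, digits, hyphens."""
    return re.sub(r"[^a-z0-9-]", "", name.lower().replace(" ", "-"))


def find_collection(name: str, available: List[str]) -> Optional[str]:
    """Case-insensitive exact match against the listed collection names."""
    wanted = name.casefold()
    for candidate in available:
        if candidate.casefold() == wanted:
            return candidate
    return None
```

With these definitions, `collection_slug("Neuro Papers (2024)")` yields `neuro-papers-2024`, matching the example in the slug rule.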
Call zotero_get_collection_items with the matched collection key. Retrieve metadata only: title, abstract, authors, year, tags, item key. Do NOT fetch full text at this stage.
If a focus question is provided:
Call zotero_semantic_search with the focus question.
Count the working item set. If count > 50 AND no focus question was provided, ask the user:
"This collection has N items. Synthesize all, or provide a focus question to filter first?" Wait for their response before continuing.
Create the run directory:
mkdir -p <working_dir>
Write <working_dir>/rlm_index_<collection-slug>.md with this structure:
# Index: <Collection Name>
Generated: <today's date>
Focus question: <focus question or "none">
Total items: N
## Items
### 1. <Title> (<Authors>, <Year>)
**Key:** ABC123
**Tags:** tag1, tag2
**Abstract:** <abstract text>
### 2. <Title> (<Authors>, <Year>)
**Key:** DEF456
**Tags:** tag3
**Abstract:** <abstract text>
## Excluded Items
| Key | Title | Reason |
|-----|-------|--------|
If no items were excluded from the working set, write None. under the ## Excluded Items heading instead of the empty table.
Total items must be the full collection count from step 3 (zotero_get_collection_items), before any exclusions from semantic filtering or manual removal. Items excluded from the working set are logged in the ## Excluded Items table but still counted in Total items.
Read <working_dir>/rlm_index_<collection-slug>.md.
Derive the research scope and present it to the user. Wait for approval before continuing.
Present exactly this structure (fill in the values):
## Research Scope
**Collection:** <name>
**Focus question:** <question or "open synthesis">
**Items to process:** N
**Key research questions:**
1. <question derived from focus question or inferred from collection title/abstracts>
2. <question>
(2–5 questions total)
Research questions must be: (a) answerable from the paper content, not rhetorical; (b) specific enough to produce a clear 'answered/unresolved' verdict; (c) aligned with the focus question if provided.
**Anticipated output sections:**
- Themes likely to emerge: <list based on abstracts>
- Source categories present: <empirical / review / theoretical / mixed methods>
**Items included:** N
**Items excluded:** N (reason: <semantic filter / manual>)
**Verification checklist (to be completed before delivery):**
- [ ] All items processed or skipped with reason logged
- [ ] Every major claim traceable to at least one named paper
- [ ] No single-source dependencies for key conclusions
- [ ] Research question status accounted for
Proceed with this scope? (yes / adjust)
If the user requests changes: apply their changes to the scope. Mutable fields are: research questions (add/remove/reword), items to include/exclude, and anticipated sections. Re-present the updated scope. Repeat until approved, up to 3 rounds (to avoid infinite loops). After 3 rounds without approval, ask: "Should I proceed with the current scope, or stop?"
Once approved: group the included items into batches of ~5 items each. Number batches from 1. If the last batch would contain only 1 item, merge it into the previous batch instead.
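The batching rule can be illustrated with a short Python sketch. It assumes "~5" means exactly five items per batch except for the final one, and `make_batches` is a hypothetical helper name.

```python
from typing import List, TypeVar

T = TypeVar("T")


def make_batches(items: List[T], size: int = 5) -> List[List[T]]:
    """Chunk items into batches of `size`; fold a trailing 1-item batch into the previous one."""
    batches = [items[i:i + size] for i in range(0, len(items), size)]
    if len(batches) > 1 and len(batches[-1]) == 1:
        last = batches.pop()
        batches[-1].extend(last)
    return batches
```

Under this reading, 11 items produce batches of 5 and 6, while 12 items produce 5, 5, and 2. Batch numbers are then the 1-based positions in the returned list.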
Write <working_dir>/rlm_plan_<collection-slug>.md:
# Plan: <Collection Name>
Approved: <date>
## Research Scope
**Collection:** <name>
**Focus question:** <question or "open synthesis">
**Items to process:** N
**Items included:** N
**Items excluded:** N (reason: <semantic filter / manual / none>)
**Anticipated output sections:**
- Themes likely to emerge: <list based on abstracts>
- Source categories present: <empirical / review / theoretical / mixed methods>
**Verification checklist:**
- [ ] All items processed or skipped with reason logged
- [ ] Every major claim traceable to at least one named paper
- [ ] No single-source dependencies for key conclusions
- [ ] Research question status accounted for
## Research Questions
1. <question> — status: pending
2. <question> — status: pending
## Batches
### Batch 1
#### Item 1
**Key:** ABC123
**Title:** Title here
**Authors:** Smith et al.
**Year:** 2023
**Abstract:** Full abstract text here.
#### Item 2
**Key:** DEF456
**Title:** Title here
**Authors:** Jones et al.
**Year:** 2024
**Abstract:** Full abstract text here.
### Batch 2
...
## Warnings
(populated during Stage 3 if any worker fails)
Before spawning workers, pre-compute one canonical short label per research question (≤8 words each). Use these exact labels in every worker prompt's Research question status lines. Example: if the question is 'What machine learning methods were used for early detection of Alzheimer's disease?', the label might be 'ML methods for Alzheimer's early detection'. Before spawning any workers, persist the labels into the plan file: for each research question line in ## Research Questions, append — label: <computed label> so the line reads 1. <question> — status: pending — label: <computed label>. Stage 4 reads these labels from the plan file.
Read references/worker-prompt.md from this skill's base directory. Use its content as the worker prompt template. Replace ALL placeholders with actual values (working_dir, batch number N, research questions and their labels, and all item fields sourced from the plan file) before spawning.
Spawn one Agent subagent per batch in parallel (all batches at once, in a single message with multiple Agent tool calls). Set each worker's description to "Batch N — process papers" and model to "haiku".
After all subagents return:
If a subagent's return value does not include Batch <N> complete (where N is the batch number), treat it as failed. Open the plan file and append under ## Warnings:
- Batch <N> failed or returned empty. Items skipped: <list titles>.
Read the plan file to retrieve: (a) the authoritative list of research questions and their canonical labels; (b) any warnings logged during Stage 3. Count the batches listed in the plan file. Verify that <working_dir>/slice_*.md files exist for each batch number. If any batch number is missing, treat it as failed and append a warning: Batch <N> — no slice file found. For each existing slice file, verify it contains at least one ### heading. If a file is empty or has no ### headings, treat it as failed: append a warning (Batch <N> — slice file exists but is empty) and exclude it from aggregation.
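A possible shape for this slice check, as a Python sketch. It assumes each batch writes a file named slice_<N>.md directly under the working directory; `check_slices` and its reason codes are hypothetical, and turning each reason into the corresponding plan-file warning is left to the caller.

```python
import re
from pathlib import Path
from typing import Dict


def check_slices(working_dir: Path, batch_count: int) -> Dict[int, str]:
    """Return {batch number: "missing" | "empty"} for batches to treat as failed."""
    failed: Dict[int, str] = {}
    for n in range(1, batch_count + 1):
        path = working_dir / f"slice_{n}.md"  # assumed file naming
        if not path.is_file():
            failed[n] = "missing"
            continue
        text = path.read_text(encoding="utf-8")
        # A usable slice has at least one line starting with "### ".
        if not re.search(r"^### ", text, flags=re.MULTILINE):
            failed[n] = "empty"
    return failed
```

Batches reported as "empty" are the ones to exclude from aggregation; batches absent from the result passed both checks.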
Read all <working_dir>/slice_*.md files.
Consolidate research question status across all slices using these rules:
If a status label in a slice cannot be matched to a canonical label, record the warning: Label mismatch in slice_<N>.md — could not match: "<unmatched label text>".
Write <working_dir>/rlm_answer_<collection-slug>.md with this exact structure:
# Synthesis: <Collection Name>
Focus question: <question or "open synthesis">
Date: <today's date>
## Per-Paper Summaries
[All per-paper summaries from all slice files, sorted by first author surname, then year. Preserve the exact format from the workers: ### heading, Key findings, Methods, Conclusions, Coverage. Omit the Research question status lines — those are consolidated below.]
## Cross-Paper Synthesis
### Themes
[3–7 themes, each supported by ≥2 papers. For each: 1–2 sentences and the papers supporting it, e.g. "(Smith 2023, Jones 2022)". If fewer than 3 themes meet the ≥2-paper threshold, note "Insufficient cross-paper convergence for N theme(s)" rather than including single-paper themes.]
### Agreements
[Findings where ≥2 papers converge. Cite the papers by author and year.]
### Contradictions
[Findings where papers studied comparable phenomena under similar conditions and reached opposing conclusions. Name the papers and the nature of the disagreement. Do not list differences explained by different populations, methods, or time frames — place those in Gaps & Open Questions instead. If no genuine contradictions are identified, write: "No genuine contradictions were identified. Apparent discrepancies are attributable to differences in [populations / methods / time frames / geographies] and are noted in Gaps & Open Questions."]
### Gaps & Open Questions
[Topics not addressed by any paper. Non-critical reviewer issues from Stage 5 will be appended here.]
## Research Questions Status
### Answered
- **<Q label>:** <summary of answer across papers> (Sources: Author Year, Author Year)
### Partially Answered
- **<Q label>:** <what was found and what remains unclear> (Source: Author Year only)
### Unresolved
- **<Q label>:** <why it remains unresolved>
### Superseded
- **<Q label>:** <what was superseded and by what finding> (Source: Author Year)
Omit any Research Questions Status subsection that has no questions assigned to it.
If any batches were skipped (i.e., ## Warnings in the plan file contains a line matching "failed or returned empty" or "no slice file found"), insert the following block immediately before the ## Cross-Paper Synthesis heading in the answer file:
> **Skipped items:** <list titles of skipped items from plan file warnings>
If no batches were skipped, do not add this block.
Read references/reviewer-prompt.md from this skill's base directory. Substitute <working_dir>, <collection-slug>, and <Collection Name> with actual values. Spawn one Agent subagent using the substituted prompt. Set description to "Review synthesis pass 1" and model to "sonnet".
After the reviewer returns, read <working_dir>/rlm_review_<collection-slug>.md. If the reviewer's return value does not match the pattern Review complete: N fatal, N major, N minor., treat as 0 issues found, log a warning to the plan file under ## Warnings, and continue to Stage 6.
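The return-value check could look like the following Python sketch. The name `parse_review_result` is hypothetical, and searching anywhere in the returned text (rather than requiring an exact full-string match) is an assumption.

```python
import re
from typing import Dict, Optional

# Pattern taken from the expected reviewer return: "Review complete: N fatal, N major, N minor."
_REVIEW_RE = re.compile(r"Review complete: (\d+) fatal, (\d+) major, (\d+) minor\.")


def parse_review_result(returned: str) -> Optional[Dict[str, int]]:
    """Return the three tallies, or None when the pattern is absent."""
    match = _REVIEW_RE.search(returned)
    if match is None:
        return None
    fatal, major, minor = (int(group) for group in match.groups())
    return {"fatal": fatal, "major": major, "minor": minor}
```

A `None` result corresponds to the fallback described above: treat the pass as 0 issues found, log a warning, and continue to Stage 6.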
For each FATAL or MAJOR issue: edit the answer file to fix it using the inline annotation in Part 2 of the review to locate the exact passage:
For each MINOR issue: append a bullet to the "Gaps & Open Questions" section of the answer file:
- [Reviewer note] <description of the minor issue>
If any FATAL or MAJOR issues were fixed: read the review file's ## Summary block and append the pass-1 tallies to ## Warnings in the plan file:
- Pass 1 review: N fatal, N major, N minor (all fixed before pass 2)
Spawn the reviewer subagent once more with the same already-substituted prompt (set description to "Review synthesis pass 2", model to "sonnet"). The reviewer will overwrite the existing review file. If it returns 0 fatal and 0 major issues, continue to Stage 6. If FATAL or MAJOR issues remain after the second pass, append them as bullets to "Gaps & Open Questions" with the prefix [Unresolved reviewer issue] and continue to Stage 6 — do not loop again.
Count the final tallies by reading the previously written files:
- Reviewed (full text): count the ### headings in all slice_*.md files whose heading line does NOT contain [abstract only] and does NOT contain [skipped
- Reviewed (abstract only): count the ### headings in all slice_*.md files whose heading line DOES contain [abstract only]
- Skipped (no content): count the ### headings in all slice_*.md files whose heading line contains [skipped
- Pass 1 review tallies: look for a line of the form - Pass 1 review: N fatal, N major, N minor in ## Warnings of the plan file. Extract those counts.
- Pass 2 review tallies: read the ## Summary block of the final review file. If the file does not exist or its Summary block is absent, record 'N/A'.
Post-process the answer file to add a Table of Contents and a Sources section:
a. Fetch full metadata for each working-set item.
Read item keys from the Batches section of the plan file. For each key, call zotero_get_item_metadata to retrieve: journal/publication name, volume, issue, pages, DOI, URL. These calls are independent and can be made in parallel. If a key's metadata call fails, use the fields already available from the plan file (title, authors, year) and omit missing fields.
b. Build the sorted reference list. Sort all working-set items alphabetically by first author surname, then by year for ties. Format each entry as:
Surname, F.I.; Surname2, F.I.; Surname3, F.I.; et al. (Year). Title. *Journal Name*, Volume(Issue), Pages. [DOI: xxxxx](https://doi.org/xxxxx)
Rules:
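- Separate authors with ;. If more than three authors, write the first three followed by et al.
- Italicize the journal name: *...*
- Format the DOI as [DOI: xxxxx](https://doi.org/xxxxx); if there is no DOI, use [Link](url) instead
c. Generate a Table of Contents.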
Scan the answer file for headings. Include all ## headings. For ### headings, include only those within ## Cross-Paper Synthesis and ## Research Questions Status — skip all ### headings inside ## Per-Paper Summaries. Build the ToC in this format:
## Table of Contents
- [Section Name](#anchor)
- [Subsection Name](#anchor)
Anchors follow GitHub-flavored Markdown: heading text lowercased, spaces replaced by hyphens, all characters that are not letters, digits, or hyphens removed.
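As a rough Python sketch of the anchor rule as stated here: the helper names are hypothetical, "letters" is read as ASCII letters only, and the two-space indent for ### entries is an assumption about the intended nesting of the ToC format.

```python
import re


def gfm_anchor(heading_text: str) -> str:
    """Lowercase, spaces to hyphens, drop all but ASCII letters, digits, hyphens."""
    slug = heading_text.strip().lower().replace(" ", "-")
    return re.sub(r"[^a-z0-9-]", "", slug)


def toc_line(level: int, heading_text: str) -> str:
    """Render one ToC bullet; level 3 headings are indented (assumed nesting)."""
    indent = "  " if level == 3 else ""
    return f"{indent}- [{heading_text}](#{gfm_anchor(heading_text)})"
```

Note that removing a character between two spaces leaves a double hyphen, so "Gaps & Open Questions" becomes `gaps--open-questions`.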
d. Update the answer file.
Insert the Table of Contents immediately after the Date: line (the third line of the file).
Append the following to the end of the file:
---
## Sources
Surname, F.I.; ... (Year). Title. *Journal*. [DOI: xxx](url)
(one entry per working-set item, alphabetical order)
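The reference format and sort order from step b can be sketched in Python as follows. The dictionary keys (`authors`, `year`, `title`, `journal`, `volume`, `issue`, `pages`, `doi`, `url`) are hypothetical names, not the actual fields returned by zotero_get_item_metadata, and authors are assumed to arrive already formatted as "Surname, F.I.".

```python
from typing import Any, Dict, List


def format_reference(item: Dict[str, Any]) -> str:
    """Format one Sources entry, omitting fields that are missing."""
    authors = item.get("authors", [])
    shown = list(authors[:3])
    if len(authors) > 3:
        shown.append("et al.")
    entry = f"{'; '.join(shown)} ({item['year']}). {item['title']}."
    if item.get("journal"):
        locator = f"*{item['journal']}*"
        if item.get("volume"):
            issue = f"({item['issue']})" if item.get("issue") else ""
            locator += f", {item['volume']}{issue}"
        if item.get("pages"):
            locator += f", {item['pages']}"
        entry += f" {locator}."
    if item.get("doi"):
        entry += f" [DOI: {item['doi']}](https://doi.org/{item['doi']})"
    elif item.get("url"):
        entry += f" [Link]({item['url']})"
    return entry


def sort_references(items: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Sort by first author surname, then by year for ties."""
    def key(item: Dict[str, Any]):
        authors = item.get("authors", [])
        surname = authors[0].split(",")[0] if authors else ""
        return (surname.casefold(), str(item.get("year", "")))
    return sorted(items, key=key)
```

When a metadata call fails and only title, authors, and year are available, the same function degrades to the short form `Authors (Year). Title.`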
Generate a PDF version of the answer file. Locate the skill's base directory from the "Base directory for this skill:" line in your system context. Then run:
python "<skill_base_dir>/scripts/make_pdf.py" \
--collection "<Collection Name>" \
--date "<today's date>" \
--input "<working_dir>/rlm_answer_<collection-slug>.md" \
--output "<working_dir>/rlm_answer_<collection-slug>.pdf"
If the script exits successfully, note the PDF path in the delivery output.
If it fails (e.g., weasyprint not installed), print:
[Warning] PDF generation skipped — weasyprint not available. Install with: pip install weasyprint markdown
and continue to step 4.
Write <working_dir>/rlm_provenance_<collection-slug>.md:
# Provenance: <Collection Name>
Completed: <today's date>
## Scope
- Focus question: <focus question or "open synthesis">
- Items in working set: N (of N total in collection)
- Batches processed: N
## Sources
- Total in collection: N
- Reviewed (full text): N
- Reviewed (abstract only): N
- Skipped (no content): N
- Excluded: N
## Excluded Items
| Title | Reason |
|-------|--------|
| <title> | <reason> |
## Verification
- Reviewer pass 1: N fatal, N major (all fixed before pass 2)
- Reviewer pass 2: N fatal, N major (fixed before delivery / "not run" if only one pass)
- Remaining minor issues: <if any minor issues were recorded: "N — see Gaps & Open Questions in answer file"; otherwise: "None">
## Research Files Consulted
- rlm_index_<collection-slug>.md
- rlm_plan_<collection-slug>.md
- (list each slice_*.md file that actually exists in `<working_dir>`, one per line)
- rlm_review_<collection-slug>.md
## Outputs
- rlm_answer_<collection-slug>.md
- rlm_answer_<collection-slug>.pdf (or "not generated — weasyprint unavailable")
If the Excluded Items table in the index has no data rows, write None. under the ## Excluded Items heading instead of a table.
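The three review counts under ## Sources come from the slice heading tallies described at the start of this stage. A minimal Python sketch of that counting follows, assuming the two markers never appear on the same heading line; `tally_slices` and its key names are hypothetical.

```python
from typing import Dict, Iterable


def tally_slices(slice_texts: Iterable[str]) -> Dict[str, int]:
    """Count ### headings by coverage marker across all slice file contents."""
    counts = {"full_text": 0, "abstract_only": 0, "skipped": 0}
    for text in slice_texts:
        for line in text.splitlines():
            if not line.startswith("### "):
                continue  # only level-3 headings identify a paper
            if "[skipped" in line:
                counts["skipped"] += 1
            elif "[abstract only]" in line:
                counts["abstract_only"] += 1
            else:
                counts["full_text"] += 1
    return counts
```

The argument is the text of each slice_*.md file, for example gathered with `Path.read_text` over a glob of the working directory.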
Create Zotero notes for each processed item:
Read the plan file to extract the (key, title) pairs in order.
Read all slice_*.md files into memory.
For each (key, title) pair:
a. Search the slice files for a ### heading whose text contains the first 40 characters of the title (case-insensitive). Take the first match.
b. If found, extract the full summary block: from that ### heading line up to (but not including) the next ### heading or end of file.
c. Skip items whose heading contains [skipped — no content available] — do not create a note for them.
d. Format the note as:
## RLM Summary — <Collection Name>
*Generated: <today's date>*
<full extracted summary block>
e. Call zotero_create_note with item_key=<key> and the formatted note content. If the call fails, log the failure but continue to the next item.
If zotero_create_note is not available (tool not found), skip this step silently and add: "[Zotero notes skipped — zotero_create_note unavailable]".
Print a delivery summary. Do NOT print the full answer file, as it can be very large:
Synthesis complete for "<Collection Name>"
- Items: N reviewed (N full text, N abstract only, N skipped), N excluded
- Themes: <list theme titles>
- Research questions: <one line per question — label: verdict>
- Reviewer: pass 1 (N fatal, N major fixed), pass 2 (N fatal, N major)
Print:
---
Answer file: <working_dir>/rlm_answer_<collection-slug>.md
PDF: <working_dir>/rlm_answer_<collection-slug>.pdf (or "skipped — weasyprint not available")
Provenance: <working_dir>/rlm_provenance_<collection-slug>.md
All run files: <working_dir>
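For the Zotero note step earlier in this stage, the heading lookup and block extraction (steps a to c) might be sketched in Python as below. The function names are hypothetical, and testing for the `[skipped` prefix rather than the full marker text is an assumption.

```python
from typing import Optional


def extract_summary_block(slice_text: str, title: str) -> Optional[str]:
    """Return the first ### block whose heading contains the title's first 40 characters."""
    needle = title[:40].casefold()
    if not needle:
        return None
    lines = slice_text.splitlines()
    for start, line in enumerate(lines):
        if line.startswith("### ") and needle in line.casefold():
            end = start + 1
            # The block runs up to the next ### heading or the end of the file.
            while end < len(lines) and not lines[end].startswith("### "):
                end += 1
            return "\n".join(lines[start:end]).rstrip()
    return None


def should_create_note(block: str) -> bool:
    """Skip blocks whose heading carries the skipped marker."""
    return "[skipped" not in block.splitlines()[0]
```

Calling `extract_summary_block` on each slice in turn and taking the first non-`None` result gives the "first match" behaviour described in step a.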
npx claudepluginhub sfoucher/rlm-research-analyzer --plugin rlm-research-analyzer