From medsci-project
Searches PubMed, Semantic Scholar, and bioRxiv/medRxiv with API-verified citations to prevent hallucinations. Generates BibTeX entries for medical research literature.
Slash command: /medsci-project:search-lit

Bundled files:
- references/parse_pubmed.py
- references/pubmed_eutils.sh
- references/snowball.py
- references/snowball_challenge/expected/snowball.bib
- references/snowball_challenge/fixture/DOI_10_0_seed1.backward.json
- references/snowball_challenge/fixture/DOI_10_0_seed1.forward.json
- references/snowball_challenge/fixture/DOI_10_0_seed1.similar.json
- references/snowball_challenge/fixture/library.bib
- references/snowball_challenge/problem.md
- references/snowball_challenge/verify.sh
- skill.yml

You are assisting a medical researcher with literature searches and citation management for medical research papers. Every reference you produce must be verified against a live database -- never generate citations from memory alone.
| Database | MCP Tool | Purpose |
|---|---|---|
| PubMed | mcp__claude_ai_PubMed__search_articles | Search by query, MeSH terms |
| PubMed | mcp__claude_ai_PubMed__get_article_metadata | Full metadata for a PMID |
| PubMed | mcp__claude_ai_PubMed__find_related_articles | Related articles for a PMID |
| PubMed | mcp__claude_ai_PubMed__lookup_article_by_citation | Verify a citation |
| PubMed | mcp__claude_ai_PubMed__convert_article_ids | Convert between PMID/DOI/PMCID |
| Semantic Scholar | mcp__claude_ai_Scholar_Gateway__semanticSearch | Semantic search across all fields |
| bioRxiv/medRxiv | mcp__claude_ai_bioRxiv__search_preprints | Search preprint servers |
| bioRxiv/medRxiv | mcp__claude_ai_bioRxiv__get_preprint | Full preprint metadata |
| CrossRef | WebFetch with https://api.crossref.org/works/{DOI} | DOI verification |
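A minimal sketch of the CrossRef check, assuming the usual CrossRef REST response shape (a `message` object whose `title` is a list and whose `issued.date-parts` holds the publication date). The helper name `crossref_matches` and the whitespace- and case-insensitive title comparison are illustrative assumptions, not part of the skill; fetching the JSON is left to WebFetch.

```python
def crossref_matches(payload: dict, expected_title: str, expected_year: int) -> bool:
    """Compare a decoded CrossRef /works/{DOI} response against expected metadata."""
    message = payload.get("message") or {}
    titles = message.get("title") or []
    date_parts = (message.get("issued") or {}).get("date-parts") or [[None]]
    year = date_parts[0][0] if date_parts[0] else None

    def norm(text: str) -> str:
        # Lowercase and collapse runs of whitespace before comparing titles.
        return " ".join(text.lower().split())

    title_ok = any(norm(t) == norm(expected_title) for t in titles)
    return title_ok and year == expected_year
```

A `False` result means the reference should stay unverified until it is checked by hand.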
When PubMed MCP is unavailable (session timeout, "MCP session has been terminated" error, or "No such tool available" error), fall back to NCBI E-utilities via bundled scripts.
Detection: If any mcp__claude_ai_PubMed__* call returns an error containing
"terminated", "not found", "not available", or "not connected", switch ALL subsequent
PubMed calls in this session to E-utilities. Do not retry MCP after a disconnect — it
will not recover within the same conversation.
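The detection rule can be sketched as a small piece of session state. The class name `PubMedRouter` and the case-insensitive match are assumptions made for illustration; the four marker strings are the ones listed above.

```python
FALLBACK_MARKERS = ("terminated", "not found", "not available", "not connected")


class PubMedRouter:
    """Tracks whether PubMed calls should go to MCP or to E-utilities."""

    def __init__(self) -> None:
        self.use_eutils = False

    def record_error(self, message: str) -> bool:
        """Inspect an MCP error message; return True once the fallback is active."""
        if any(marker in message.lower() for marker in FALLBACK_MARKERS):
            # Sticky switch: MCP is not retried for the rest of the session.
            self.use_eutils = True
        return self.use_eutils
```

Once `use_eutils` is set it is never cleared, which mirrors the instruction not to retry MCP after a disconnect.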
Scripts (in ${CLAUDE_SKILL_DIR}/references/):
- pubmed_eutils.sh: Bash wrapper for NCBI E-utilities API
- parse_pubmed.py: Python parser for E-utilities responses

Usage patterns:

```bash
EUTILS="${CLAUDE_SKILL_DIR}/references/pubmed_eutils.sh"
PARSER="${CLAUDE_SKILL_DIR}/references/parse_pubmed.py"

# Search PubMed (returns PMIDs)
bash "$EUTILS" search "diagnostic test accuracy meta-analysis radiology" 20 \
  | python3 "$PARSER" esearch

# Get article summaries as markdown table
bash "$EUTILS" fetch_json "16168343,16085191,31462531" \
  | python3 "$PARSER" esummary

# Get detailed metadata
bash "$EUTILS" fetch "16168343" \
  | python3 "$PARSER" efetch

# Generate BibTeX entries
bash "$EUTILS" fetch "16168343,16085191" \
  | python3 "$PARSER" bibtex

# Verify a citation by exact title
bash "$EUTILS" cite_lookup "Bivariate analysis of sensitivity and specificity" \
  | python3 "$PARSER" esearch

# Find related articles for a PMID
bash "$EUTILS" related "16168343" 10 \
  | python3 "$PARSER" esummary
```
Rate limiting: 3 requests/second without API key, 10/sec with NCBI_API_KEY. The script auto-sleeps 350ms between calls. For batch operations, keep calls sequential.
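For callers that issue E-utilities requests from Python rather than through the bundled script, the same limits can be honored with a small throttle. This is a sketch under stated assumptions: `min_interval_seconds` and `Throttle` are hypothetical names, and the injectable `clock` and `sleep` arguments exist only to make the behavior testable. Note that 1/3 s is about 333 ms, so the script's 350 ms pause stays just under the keyless limit.

```python
import os
import time


def min_interval_seconds(has_api_key: bool) -> float:
    """Smallest spacing between requests: 3 req/s without a key, 10 req/s with one."""
    return 1.0 / (10 if has_api_key else 3)


class Throttle:
    """Spaces sequential calls at least `interval` seconds apart."""

    def __init__(self, interval: float, clock=time.monotonic, sleep=time.sleep) -> None:
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self) -> float:
        """Sleep just long enough to honor the interval; return the time slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.interval - (now - self._last))
        if delay:
            self._sleep(delay)
        self._last = now + delay
        return delay


# Pick the interval from the environment, as the rate-limit note describes.
throttle = Throttle(min_interval_seconds(bool(os.environ.get("NCBI_API_KEY"))))
```

Calling `throttle.wait()` before each request keeps batch operations sequential and within the limit.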
E-utilities → MCP equivalence:
| MCP Tool | E-utilities Command | Parser Mode |
|---|---|---|
| search_articles | search <query> [retmax] | esearch |
| get_article_metadata | fetch <pmids> | efetch or bibtex |
| find_related_articles | related <pmid> [retmax] | esummary |
| lookup_article_by_citation | cite_lookup <title> | esearch → fetch |
| convert_article_ids | Not available (use CrossRef DOI lookup) | — |
(concept1 OR synonym1) AND (concept2 OR synonym2).

Gate: Wait for user approval before running searches.
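The Boolean pattern above can be assembled mechanically from concept groups. The function name `build_boolean_query` is hypothetical, and quoting multi-word terms as phrases is an assumption about how the query should be written, not something the skill specifies.

```python
def build_boolean_query(concepts: list[list[str]]) -> str:
    """OR the synonyms within each concept, then AND the concepts together."""
    groups = []
    for synonyms in concepts:
        # Assumption: wrap multi-word terms in double quotes so they search as phrases.
        terms = [f'"{term}"' if " " in term else term for term in synonyms]
        groups.append("(" + " OR ".join(terms) + ")")
    return " AND ".join(groups)
```

The resulting string is what would be shown to the user for approval before any search runs.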
- search_articles with the Boolean query.
- semanticSearch with natural language query.
- search_preprints if preprints are relevant.

| # | Title | Authors (first + last) | Year | Journal | PMID/DOI | Relevance |
|---|-------|----------------------|------|---------|----------|-----------|
| 1 | ... | Kim J, ... Lee S | 2024 | Radiology | 12345678 | High |
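A row in the results table can be rendered with a small formatter. Both function names are hypothetical, and escaping pipe characters in titles is an added assumption so that a title cannot break the table.

```python
def first_last_authors(authors: list[str]) -> str:
    """Abbreviate an author list to 'first, ... last' as in the results table."""
    if not authors:
        return ""
    if len(authors) == 1:
        return authors[0]
    if len(authors) == 2:
        return f"{authors[0]}, {authors[1]}"
    return f"{authors[0]}, ... {authors[-1]}"


def table_row(index: int, title: str, authors: list[str], year: int,
              journal: str, identifier: str, relevance: str) -> str:
    """Render one markdown row; pipes in the title are escaped."""
    cells = [str(index), title.replace("|", "\\|"), first_last_authors(authors),
             str(year), journal, identifier, relevance]
    return "| " + " | ".join(cells) + " |"
```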
Optional but recommended for systematic reviews and thorough background work (PRISMA item 7, "records identified through citation searching"). Expands a seed set along the citation graph instead of relying on Boolean recall alone.
Use the deterministic helper references/snowball.py (Semantic Scholar Graph API; nothing generated from memory):

```bash
# Expand seed DOIs/PMIDs in all directions, dedup against the existing pool,
# append verified candidates to references/library.bib
python3 references/snowball.py \
  --seed DOI:10.1148/radiol.2024123,PMID:38000001 \
  --direction all \
  --pool references/library.bib \
  --out references/library.bib
```
- --direction: backward (references the seeds cite), forward (papers citing the seeds), similar (S2 recommendations), or all (default).
- Deduplicates against references/library.bib by DOI and normalized title, and within the harvested set.
- Writes candidates with verified=false + verified_by=semantic_scholar. They are candidates, not confirmed citations; run /verify-refs (or Phase 4 verification) to confirm each against PubMed/CrossRef before citing.
- Appends to references/library.bib only. NEVER writes manuscript/_src/refs.bib (the script hard-refuses that path).
- Reports a count line: "Records identified through citation searching (snowballing): N raw (backward=…, forward=…, similar=…); after dedup against existing pool: M new candidates." Record M in the PRISMA flow's citation-searching box.

A deterministic, network-free challenge card (recorded fixtures + expected output + verify.sh) lives in references/snowball_challenge/.
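To illustrate deduplication by DOI and normalized title, here is a minimal sketch. It is not the code in references/snowball.py: the function names, the dict-based record shape, and the normalization rule (lowercase, non-alphanumerics collapsed to spaces) are all assumptions.

```python
import re


def normalize_title(title: str) -> str:
    """Lowercase and collapse every non-alphanumeric run to a single space."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def dedup_candidates(candidates: list[dict], pool: list[dict]) -> list[dict]:
    """Drop candidates already in the pool or repeated within the harvested set."""
    seen_dois = {e["doi"].lower() for e in pool if e.get("doi")}
    seen_titles = {normalize_title(e["title"]) for e in pool if e.get("title")}
    fresh = []
    for cand in candidates:
        doi = (cand.get("doi") or "").lower()
        title = normalize_title(cand.get("title") or "")
        if (doi and doi in seen_dois) or (title and title in seen_titles):
            continue
        # Remember this candidate so later duplicates in the same harvest are dropped.
        if doi:
            seen_dois.add(doi)
        if title:
            seen_titles.add(title)
        fresh.append(cand)
    return fresh
```

With this shape, N is `len(candidates)` and M is `len(dedup_candidates(candidates, pool))`.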
For each selected paper:
- Retrieve full metadata with get_article_metadata (PubMed) or get_preprint (bioRxiv).

| Paper | Design | N | Key Finding | Limitation | Relevance to Our Study |
|-------|--------|---|-------------|------------|----------------------|
This is the most critical part of the skill. Follow these rules without exception:
- Mark any reference that cannot be verified as [UNVERIFIED - NEEDS MANUAL CHECK].
- Fetch https://api.crossref.org/works/{DOI} to confirm the DOI resolves correctly.

For each reference (verified or not), generate a BibTeX entry with an explicit verified flag so downstream skills (/lit-sync, /verify-refs, /write-paper) can reason about trust without re-running verification:
```bibtex
@article{FirstAuthorLastName_Year_ShortKey,
  author       = {Last1, First1 and Last2, First2 and Last3, First3},
  title        = {Full Title As Retrieved From Database},
  journal      = {Journal Name},
  year         = {2024},
  volume       = {310},
  number       = {2},
  pages        = {e234567},
  doi          = {10.1001/jama.2024.12345},
  pmid         = {12345678},
  verified     = {true},
  verified_by  = {pubmed+crossref},
  verified_on  = {2026-04-24},
}
```
verified flag values (required on every entry):
| Value | Meaning | Downstream behavior |
|---|---|---|
| true | DOI or PMID confirmed via PubMed/CrossRef; title, authors, year all match | Safe to cite; /write-paper citekey-only gate passes |
| false | Parsed from text but API lookup failed or returned mismatch | /verify-refs flags as UNVERIFIED; manuscript MUST show [UNVERIFIED - NEEDS MANUAL CHECK] |
| manual | User explicitly added despite lookup failure | Treated as verified=false by /verify-refs but suppresses repeat warnings |
verified_by lists the data sources that confirmed the entry (e.g., pubmed,
crossref, semantic_scholar, or a combination). verified_on is the ISO date
of the most recent successful verification.
BibTeX key convention: FirstAuthorLastName_Year_OneWord (e.g., Kim_2024_Validation).
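The key convention can be applied with a short helper. The name `make_bibtex_key` is hypothetical, and folding accents to ASCII and dropping punctuation are assumptions added so that the key is safe in BibTeX; the skill only specifies the FirstAuthorLastName_Year_OneWord pattern.

```python
import re
import unicodedata


def make_bibtex_key(first_author_last_name: str, year: int, word: str) -> str:
    """Build a FirstAuthorLastName_Year_OneWord citation key."""

    def ascii_alnum(text: str) -> str:
        # Assumption: strip accents and any character that is not a letter or digit.
        folded = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
        return re.sub(r"[^A-Za-z0-9]", "", folded)

    return f"{ascii_alnum(first_author_last_name)}_{year}_{ascii_alnum(word)}"
```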
Append entries to references/library.bib (candidate pool for /lit-sync to import into Zotero). NEVER write to manuscript/_src/refs.bib; that is /lit-sync's sole-writer path per docs/artifact_contract.md.

Verified: 12 references (verified=true)
Unverified: 1 reference (verified=false) [NEEDS MANUAL CHECK]
Total: 13 references
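A summary in this form can be derived from the verified flags on the entries. This sketch assumes each entry is a dict with a string `verified` field, and it counts `manual` entries as unverified, following the table's note that /verify-refs treats them as verified=false; the function name is hypothetical.

```python
def verification_summary(entries: list[dict]) -> str:
    """Summarize how many entries carry verified=true versus anything else."""
    verified = sum(1 for e in entries if e.get("verified") == "true")
    unverified = len(entries) - verified  # includes "false" and "manual"

    def noun(n: int) -> str:
        return "reference" if n == 1 else "references"

    lines = [f"Verified: {verified} {noun(verified)} (verified=true)"]
    if unverified:
        lines.append(
            f"Unverified: {unverified} {noun(unverified)} (verified=false) [NEEDS MANUAL CHECK]"
        )
    lines.append(f"Total: {len(entries)} {noun(len(entries))}")
    return "\n".join(lines)
```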
If a Zotero MCP server is available, integrate search results with the user's library:
- zotero_add_by_doi for DOI-based import (auto-downloads OA PDFs).
- zotero_manage_collections to file into the relevant project collection.
- zotero_search_items to avoid adding papers already in the library.
- zotero_get_annotations to reference the user's prior reading notes.
- Record the outcome in references/zotero_collection.json so Zotero status is auditable rather than a hidden optional side effect.

Requires Zotero Desktop running with MCP server. Skip this phase if unavailable. If skipped, still write references/zotero_collection.json with status: "skipped" and the reason.
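One way to write the status file is sketched below. Only the `status` and `reason` fields come from the text; the `items` and `recorded_on` fields, and the function name, are hypothetical additions for illustration.

```python
import json
from datetime import date
from pathlib import Path


def write_zotero_status(path: str, status: str, reason: str = "", items=None) -> dict:
    """Write the Zotero phase outcome to a JSON file and return the record."""
    record = {
        "status": status,                           # e.g. "skipped"
        "reason": reason,
        "items": items or [],                       # hypothetical field
        "recorded_on": date.today().isoformat(),    # hypothetical field
    }
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(record, indent=2) + "\n", encoding="utf-8")
    return record
```

For a skipped phase the call would look like `write_zotero_status("references/zotero_collection.json", "skipped", "Zotero MCP server unavailable")`.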
After identifying relevant papers, retrieve full-text PDFs for detailed review. This is especially important for meta-analyses where data extraction requires full text.
Try sources in order of reliability:
Unpaywall API (highest quality OA links):
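```python
import os
import requests

# `doi` is the paper's DOI string, supplied by the caller.
# The default address is a placeholder; set UNPAYWALL_EMAIL to your own.
email = os.environ.get("UNPAYWALL_EMAIL", "[email protected]")
url = f"https://api.unpaywall.org/v2/{doi}?email={email}"
r = requests.get(url, timeout=30).json()
# best_oa_location is null for closed-access papers, so guard before .get().
best = r.get("best_oa_location") or {}
pdf_url = best.get("url_for_pdf")  # None when no OA PDF is known
```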
PubMed Central (PMC):
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC{id}/pdf/

OpenAlex API (additional OA discovery):

```python
url = f"https://api.openalex.org/works/https://doi.org/{doi}"
# Requires polite pool: add email in User-Agent header or mailto= param
r = requests.get(url, headers={"User-Agent": f"MyApp/1.0 (mailto:{email})"}).json()
oa_url = r.get("open_access", {}).get("oa_url")
```

CrossRef landing page: Follow https://api.crossref.org/works/{doi} → publisher link → scrape <meta name="citation_pdf_url"> tag
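The meta-tag scrape in the CrossRef step can be done with the standard library alone. This is a sketch: the class and function names are hypothetical, and it assumes the publisher page has already been downloaded as an HTML string.

```python
from html.parser import HTMLParser


class CitationPdfFinder(HTMLParser):
    """Captures the content of the first <meta name="citation_pdf_url"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.pdf_url = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag != "meta" or self.pdf_url is not None:
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "citation_pdf_url":
            self.pdf_url = attr.get("content")


def find_citation_pdf_url(html: str):
    """Return the citation_pdf_url value from a landing page, or None if absent."""
    finder = CitationPdfFinder()
    finder.feed(html)
    return finder.pdf_url
```

A `None` result means the page does not advertise a PDF this way and the next source should be tried.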
Some researchers use alternative access methods for paywalled content. Users are responsible for ensuring compliance with their institutional access policies.
If an environment variable (e.g., SCIHUB_BASE) is set, the skill may use it as an
alternative PDF source. No specific URLs are provided here — users configure this themselves.
Other options:
Always validate downloaded files before use:
```python
import os


def is_valid_pdf(filepath):
    """Check that a downloaded file is actually a PDF, not an HTML redirect."""
    if os.path.getsize(filepath) < 10240:  # < 10KB is likely a stub/redirect
        return False
    with open(filepath, 'rb') as f:
        header = f.read(5)
    return header == b'%PDF-'
```
Additional checks:
- Content-Type: application/pdf header before saving
- %PDF- check
- NCBI_API_KEY

When called during manuscript writing (especially by /write-paper Phase 7):
For supplying a manuscript's reference pool — typically invoked by /write-paper Step 7.3c (or
/self-review Phase 2.5c-2) when the reference adequacy gate finds the draft under target or a
named method uncited, but usable directly when building out an original-research bibliography.
This mode is deliberately broad: for an original-research article, return 25–40 verified candidates, not the ~10 a quick search settles on. Do not stop early unless the field is genuinely sparse — and if it is, say so explicitly rather than returning a thin list silently. Respect a narrower journal reference cap or user scope when one is given.
Structure the pool across six candidate categories so the gaps the adequacy gate cares about are all covered:
For each candidate, report: PMID/DOI, verification status, candidate category, the target manuscript section it belongs in, and a one-line why it is needed.
Boundary (unchanged): every entry is API-verified before inclusion, and BibTeX is appended only
to references/library.bib — the candidate pool for /lit-sync to import into Zotero. Never
write to manuscript/_src/refs.bib; that SSOT belongs to /lit-sync. This mode produces
candidates; it does not decide inclusion (the user does) and it does not insert references into the
manuscript bib.
For systematic reviews or comprehensive literature sections:
For quickly finding a single reference the user describes:
For expanding from a known paper:
- Use find_related_articles to get related papers.

For a structured, dedup-aware, PRISMA-countable expansion (backward + forward + similar) prefer Phase 2.5: Citation Searching with references/snowball.py, which appends verified candidates to references/library.bib and reports a citation-searching count.
Embase has no public API. Use Chrome browser automation (MCP) to search and export:
- Open embase.com; institutional SSO authenticates automatically. If cookie error (login?error#), clear Elsevier/Embase cookies and retry.
- Write the query with Emtree /exp + :ab,ti field tags. Uncheck "Map to preferred term in Emtree" when using explicit /exp terms.
- Exported records: each record is a run of consecutive rows ending at a blank row; each row has the form [FIELD_NAME, value1, value2, ...]; the AUTHOR NAMES row has multiple values (one per author).

PubMed → Embase query translation:
- [Mesh] → Emtree /exp
- [tiab] → :ab,ti
- [Title/Abstract] → :ab,ti
- Phrases take single quotes ('artificial ascites')

CrossRef unavailable for {N} references (rate-limited). Verified via PubMed instead.

… /analyze-stats or /check-reporting for that).
… /write-paper for that).

npx claudepluginhub aperivue/medsci-skills --plugin medsci-project