From sci-brain
Indexes a researcher's paper collection (Zotero library, PDF folder, or Google Scholar profile) into a structured knowledge base under .knowledge/.
Slash command: /sci-brain:researchstyle
Turn an existing paper collection into a structured knowledge base under `<project>/.knowledge/` (or an advisor KB). The output uses the same KB format as the `survey` and `download-ref` skills — project and advisor KBs can coexist cleanly.
Step 1 — Identify the researcher and source. First, ask whose papers to index:
"Whose papers should I index? (Give me a name, or leave blank for your own collection.)"
Then ask which source to use:
"Where are the papers?"
- (a) Zotero library
- (b) A PDF folder (give me the path)
- (c) Google Scholar profile (give me the URL)
Note: the Zotero option is only meaningful for indexing your own collection (it's your local DB). For another researcher, choose (b) or (c).
Step 2 — Index the collection.
Zotero:
Locate zotero.sqlite — check in order: ~/Zotero/, ~/Library/Application Support/Zotero/, ~/snap/zotero-snap/common/Zotero/. If not found, use find ~ -maxdepth 4 -name "zotero.sqlite" as fallback. If still not found, ask for the path.
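The lookup order above can be sketched in Python as follows. This is a minimal illustration, not part of the bundled script; the function name `find_zotero_db` and the `max_depth` parameter are hypothetical, and the fallback only approximates `find ~ -maxdepth 4`.

```python
import os
from pathlib import Path
from typing import Optional

# Candidate data directories, relative to the home directory, in the order listed above.
CANDIDATE_DIRS = (
    "Zotero",
    "Library/Application Support/Zotero",
    "snap/zotero-snap/common/Zotero",
)


def find_zotero_db(home: Path, max_depth: int = 4) -> Optional[Path]:
    """Return the first zotero.sqlite found, or None so the caller can ask for the path."""
    for rel in CANDIDATE_DIRS:
        candidate = home / rel / "zotero.sqlite"
        if candidate.is_file():
            return candidate

    # Fallback: bounded walk. A file inside a directory at depth d sits at depth d + 1.
    for root, dirs, files in os.walk(home):
        depth = len(Path(root).relative_to(home).parts)
        if "zotero.sqlite" in files:
            return Path(root) / "zotero.sqlite"
        if depth + 1 >= max_depth:
            dirs[:] = []  # prune: anything deeper would exceed max_depth
    return None
```

A caller would pass `Path.home()` and prompt the user only when the result is `None`.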
Run the bundled script:
python3 <skill-base-dir>/parse_zotero.py <path-to-zotero.sqlite> <output_dir>
The script handles: copying the DB to avoid locking, pivot queries to avoid cartesian products, author extraction, cite key deduplication, topic classification, and generating structured output.
Important — treat <output_dir> as a scratch directory, not the KB. The script writes legacy-format index files (a topic index and a .bib file) into <output_dir>. Pick a temp path (e.g., /tmp/zotero-export-$$/). Steps 3–6 are the authoritative writes — they read those intermediate files from <output_dir> as input data, then emit .raw/{arxiv,doi}/<id>.json into $KB and append to $(dirname $KB)/ref.bib. After Steps 3–6 finish, the contents of <output_dir> can be deleted.
Review the output — the script's topic classification uses keyword matching and may need manual adjustment. Check the topic distribution it prints and offer to re-classify if the user's field isn't well covered by the default patterns.
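To make the review step concrete, here is a rough sketch of keyword-based topic classification and the resulting distribution. The pattern table `EXAMPLE_PATTERNS`, the topic names, and both function names are hypothetical; the script's real patterns may be shaped differently.

```python
import re
from collections import Counter

# Hypothetical patterns for illustration only; adapt to the user's field.
EXAMPLE_PATTERNS = {
    "quantum-computing": r"\bqubits?\b|\bquantum (circuit|algorithm|error)",
    "tensor-networks": r"\btensor networks?\b|\bmatrix product states?\b",
}


def classify(title: str, abstract: str = "") -> str:
    """Return the first topic whose pattern matches the title or abstract."""
    text = f"{title} {abstract}".lower()
    for topic, pattern in EXAMPLE_PATTERNS.items():
        if re.search(pattern, text):
            return topic
    return "uncategorized"


def topic_distribution(papers):
    """Count papers per topic; a large 'uncategorized' share suggests the patterns need work."""
    return Counter(classify(p["title"], p.get("abstract", "")) for p in papers)
```

If most papers land in `uncategorized`, that is a signal to ask the user for domain keywords before continuing.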
For papers missing abstracts or DOIs, find the PDF via the itemAttachments table. PDFs are at <zotero-data-dir>/storage/<key>/<filename>.pdf. Read them to extract the abstract.
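A possible way to resolve those PDF paths is sketched below. It assumes the usual Zotero schema, where `itemAttachments` has `itemID`, `parentItemID`, `contentType`, and `path` columns, stored files use a `storage:<filename>` path, and the storage folder is named after the attachment item's `key` in `items`. Verify these assumptions against the actual database before relying on it; `pdf_paths` is a hypothetical helper name.

```python
import sqlite3
from pathlib import Path
from typing import List

# Assumed schema: see the note above. Only stored (not linked) PDF attachments are returned.
ATTACHMENT_QUERY = """
SELECT att.key, ia.path
FROM itemAttachments AS ia
JOIN items AS att ON att.itemID = ia.itemID
WHERE ia.parentItemID = ?
  AND ia.contentType = 'application/pdf'
  AND ia.path LIKE 'storage:%'
"""


def pdf_paths(conn: sqlite3.Connection, data_dir: Path, parent_item_id: int) -> List[Path]:
    """Map a parent item to <data_dir>/storage/<key>/<filename>.pdf for each stored PDF."""
    rows = conn.execute(ATTACHMENT_QUERY, (parent_item_id,)).fetchall()
    prefix = "storage:"
    return [data_dir / "storage" / key / path[len(prefix):] for key, path in rows]
```

Run this against a copy of the database rather than the live file, in line with the locking concern mentioned earlier.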
PDF folder:
pdfgrep -r -i "KEYWORD" <folder> (install via package manager if missing, e.g., apt install pdfgrep or brew install pdfgrep).
Google Scholar:
Note: Google Scholar actively blocks automated access — WebFetch may hit CAPTCHAs or rate limits. If scraping fails, suggest alternatives: export BibTeX manually from the Scholar profile page (Scholar → select all → export BibTeX), use ORCID or DBLP profiles instead (both have machine-friendly APIs), or switch to the PDF folder method with downloaded papers.
Processing tips:
parse_zotero.py for Zotero). Don't try to do it inline with shell commands; even for small libraries, a script is more reliable and easier to debug.
TOPIC_PATTERNS in the script or ask the user to provide keywords for their domain.
The KB target is decided by the caller:
# Standalone (indexes the user's own collection into the project KB):
KB=$(python3 skills/download-ref/helpers/resolve_kb.py)
# Invoked from /incarnate (indexes another researcher's collection into the advisor KB):
KB=$(python3 skills/download-ref/helpers/resolve_kb.py --advisor <slug>)
Ensure $KB/.raw/arxiv/ and $KB/.raw/doi/ exist.
.raw/ JSON per paper
For each indexed paper, write metadata to $KB/.raw/{arxiv,doi}/<id>.json in the exact shape fetch_metadata.py produces (top-level keys: title, authors, year, venue, abstract, externalIds, citationStyles, openAccessPdf). Use <safe-doi> (DOI with / → -) for DOI filenames.
For papers without a DOI or arXiv ID, skip — they don't fit the canonical KB; mention them to the user.
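The write-or-skip rule can be sketched as below. The helper names `raw_path` and `write_raw` are hypothetical, and two choices are assumptions rather than documented behavior: an arXiv ID takes precedence when a paper has both identifiers, and arXiv IDs are used as filenames unchanged. The required keys are the ones listed above.

```python
import json
from pathlib import Path
from typing import Optional

REQUIRED_KEYS = (
    "title", "authors", "year", "venue", "abstract",
    "externalIds", "citationStyles", "openAccessPdf",
)


def raw_path(kb: Path, arxiv_id: Optional[str] = None, doi: Optional[str] = None) -> Optional[Path]:
    """Return the .raw/ target for a paper, or None when it has neither identifier."""
    if arxiv_id:  # assumption: arXiv wins when both IDs are present
        return kb / ".raw" / "arxiv" / f"{arxiv_id}.json"
    if doi:
        safe_doi = doi.replace("/", "-")  # <safe-doi>: every "/" becomes "-"
        return kb / ".raw" / "doi" / f"{safe_doi}.json"
    return None


def write_raw(kb: Path, meta: dict, arxiv_id: Optional[str] = None,
              doi: Optional[str] = None) -> Optional[Path]:
    """Write one paper's metadata; return None for papers that must be skipped and reported."""
    missing = [k for k in REQUIRED_KEYS if k not in meta]
    if missing:
        raise ValueError(f"metadata is missing top-level keys: {missing}")
    target = raw_path(kb, arxiv_id, doi)
    if target is None:
        return None
    target.parent.mkdir(parents=True, exist_ok=True)  # ensures .raw/arxiv or .raw/doi exists
    target.write_text(json.dumps(meta, indent=2, ensure_ascii=False), encoding="utf-8")
    return target
```

Collecting the papers for which `write_raw` returns `None` gives the list to mention to the user.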
Per indexed paper:
# Ask the helper for a cite key and extract it from the JSON it prints.
KEY=$(python3 skills/download-ref/helpers/append_bibtex.py propose \
  --kb "$KB" --id "$ID" --type "$TYPE" | python3 -c 'import sys,json; print(json.load(sys.stdin)["proposed_key"])')
# Append the entry to ref.bib next to the KB; $KB is quoted so paths with spaces work.
python3 skills/download-ref/helpers/append_bibtex.py append \
  --kb "$KB" --id "$ID" --type "$TYPE" --key "$KEY" \
  --bib "$(dirname "$KB")/ref.bib"
Auto-accept the proposed key — per-paper confirmation is unworkable at 100+ papers.
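For a large collection, the propose-then-append pair can be driven from a loop. The sketch below only reuses the subcommands and flags shown above; the function names, the injectable `run` parameter (there to allow dry runs), and the assumption that the type values are `arxiv` and `doi` are all illustrative.

```python
import json
import subprocess
from typing import Dict, Iterable, List, Tuple

HELPER = "skills/download-ref/helpers/append_bibtex.py"


def propose_cmd(kb: str, paper_id: str, paper_type: str) -> List[str]:
    return ["python3", HELPER, "propose", "--kb", kb, "--id", paper_id, "--type", paper_type]


def append_cmd(kb: str, paper_id: str, paper_type: str, key: str, bib: str) -> List[str]:
    return ["python3", HELPER, "append", "--kb", kb, "--id", paper_id,
            "--type", paper_type, "--key", key, "--bib", bib]


def register_all(kb: str, bib: str, papers: Iterable[Tuple[str, str]],
                 run=subprocess.run) -> Dict[str, str]:
    """Auto-accept the proposed key for each (id, type) pair and append it to the bib file."""
    keys = {}
    for paper_id, paper_type in papers:
        proposed = run(propose_cmd(kb, paper_id, paper_type),
                       check=True, capture_output=True, text=True)
        key = json.loads(proposed.stdout)["proposed_key"]
        run(append_cmd(kb, paper_id, paper_type, key, bib), check=True)
        keys[paper_id] = key
    return keys
```

Passing argument lists instead of a shell string sidesteps quoting problems with unusual paths or IDs, and `check=True` stops the batch at the first failing paper.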
python3 skills/download-ref/helpers/index.py \
--kb "$KB" \
--title "<advisor-slug or 'project'> — researcher index" \
--source-note "Built by /researchstyle on $(date -u +%Y-%m-%d)."
Write or extend $KB/NOTES.md with:
Reference papers as [@<cite-key>]. If NOTES.md exists, extend rather than overwrite.
After Steps 3–6 complete, the KB is populated with metadata but PDFs aren't downloaded yet. Ask the user via AskUserQuestion:
"Index built. What next?"
- (a) Fetch PDFs for all refs: invokes download-ref --from-bib $(dirname $KB)/ref.bib --kb $KB (bulk mode)
- (b) Add specific refs by ID: invokes download-ref with explicit IDs (single-shot, per-ref cite-key confirmation)
- (c) Continue to /brainstorm-ideas: start brainstorming with the indexed literature loaded
- (d) Stop: leave the KB as-is
For (a) and (b), see skills/download-ref/SKILL.md. For (c), invoke /brainstorm-ideas in the current session.
npx claudepluginhub quantumbfs/sci-brain --plugin sci-brain