From bibliographer
Manage a tree of scientific experiments end to end as one provenance-tracked pipeline, raw→data→analysis→claims, with one folder per experiment holding raw lab/CRO deliverables, extracted data, re-derived analysis, grounded claims, reports, and internal summaries. Extract raw measurements out of CRO files (Excel .xlsx/.xls, GraphPad Prism .pzfx/.prism, Word/PDF/PowerPoint) into tidy deterministic data/ CSVs; re-derive analysis (EC50/Hill fits, stats, summaries, figures) from that data; assert grounded scientific claims (each a re-runnable pytest spec linking a statement to sha-pinned evidence with a strength); and index everything into a libkit store for semantic + full-text search, with claims and internal summaries the highest-value searchable content. Use this skill whenever the user wants to turn CRO spreadsheets / Prism / reports into clean data, (re)generate or audit an experiment's data or analysis, fit a dose-response, make or check a grounded claim, ask "what's the evidence for X," "which study has the Day-29 knockdown numbers," or "everything we ran with ASO 7," file a new CRO/lab delivery, scaffold a new experiment folder, keep a README/summary current, or trace a result back to the original measurements, even if they don't say "scientist." For a personal library of published academic papers (DOIs, arXiv, PMIDs, PDFs), use bibliographer instead.
Slash command: /bibliographer:scientist
Files: ROADMAP.md, SPEC.md, bin/sci, pyproject.toml, references/auditing.md, references/derive-claims.md, references/extract.md, references/naming.md, references/playbook.md, references/recipes.md, references/report-authoring.md, references/report.md, references/review-audit.md, references/search-index.md, references/vocab.example.yml, scientist/__init__.py, scientist/cli_utils.py, scientist/experiments/__init__.py, scientist/extraction/__init__.py, scientist/extraction/audit.py
Manages a tree of scientific experiments, one folder per experiment, as a single provenance-tracked pipeline:
raw/ → data/ → analysis/ → claims + README (each arrow records provenance)
raw/ = CRO/lab originals · data/ = tidy faithful CSVs (no computation) · analysis/ =
re-derivations (EC50/Hill fits, stats, summaries, figures) · claims = grounded scientific
assertions, each a re-runnable pytest spec · README.md = the human/agent summary. Everything is
indexed into a libkit store for semantic + full-text search, with claims and summaries the
highest-value searchable content.
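
Each arrow in the pipeline records provenance as an edge: an artifact plus its inputs, each pinned by path and sha256. The sketch below illustrates that idea and how an audit might detect drift. It assumes a plain in-memory list of edges; the function names (`sha256_of`, `make_edge`, `stale_edges`), the dict layout, and the toy file names are hypothetical and are not the skill's actual experiment.yml schema or the sci implementation.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex sha256 of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def make_edge(root: Path, artifact: str, inputs: list[str]) -> dict:
    """Record one generation step: an artifact plus its sha-pinned inputs."""
    return {
        "artifact": artifact,
        "inputs": [{"path": p, "sha256": sha256_of(root / p)} for p in inputs],
    }


def stale_edges(root: Path, edges: list[dict]) -> list[str]:
    """Artifacts whose recorded input hashes no longer match the files on disk."""
    stale = []
    for edge in edges:
        for inp in edge["inputs"]:
            f = root / inp["path"]
            if not f.exists() or sha256_of(f) != inp["sha256"]:
                stale.append(edge["artifact"])
                break
    return stale


def demo() -> tuple[list[str], list[str]]:
    """Build a toy experiment folder, then edit a raw file to trigger drift."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for d in ("raw", "data"):
            (root / d).mkdir()
        (root / "raw/plate1.txt").write_text("well,signal\nA1,0.42\n")
        (root / "data/plate1.csv").write_text("well,signal\nA1,0.42\n")
        edges = [
            make_edge(root, "data/plate1.csv", ["raw/plate1.txt"]),
            make_edge(root, "analysis/fit.csv", ["data/plate1.csv"]),
        ]
        before = stale_edges(root, edges)
        # Simulate a re-delivered raw file with a changed measurement.
        (root / "raw/plate1.txt").write_text("well,signal\nA1,0.43\n")
        after = stale_edges(root, edges)
        return before, after
```

In this sketch only the direct consumer of the changed file is flagged; a fuller audit would presumably also propagate staleness downstream along the DAG.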
The only caller is an LLM agent. The bundled tools exist to make a sprawling, heterogeneous data folder mechanical, repeatable, and auditable — and answerable ("which file has the lumbar-cord knockdown numbers," "what's the evidence for the dose-dependent gait effect," "is this summary still true").
Each phase's detail lives in references/ and is loaded only when you need it. Start here:
| You want to… | Read |
|---|---|
| Extract raw CRO files → tidy data/ CSVs, and audit that data/ is grounded in raw/ | references/extract.md |
| Re-derive analysis (fits/stats/figures) and author grounded scientific claims | references/derive-claims.md |
| Index / search / catalog the tree, file a delivery, scaffold a new experiment | references/search-index.md |
| Review provenance, audit staleness, structural check, trace a result raw→claims, enforce prose↔claims | references/review-audit.md |
| ↳ deep reference for the structural / staleness / semantic audit passes (sci check / sci audit / parallel-agent) | references/auditing.md |
| Author a human-facing report from grounded claims: sci report mechanics (cite [claim:<id>], embed grounded figures, audit + render) | references/report.md |
| ↳ when drafting the report prose: voice/structure, literature-sweep & disconfirming-evidence discipline, the generation brief, the fresh-context §3 + voice/tone reviews | references/report-authoring.md |
data/ naming convention + assay vocabulary: references/naming.md.
Private CRO vocabulary (your real vendor names): references/vocab.example.yml.
experiment.yml provenance, the extract.py/derive.py recipes, the claims tests, and the data/ CSVs are durable and git-diffable. The libkit store (embeddings, search index, experiment/file/claim cards) is rebuildable: wipe it, reindex, and you're whole. Never make the cache load-bearing for truth.
experiment.yml holds a unified provenance list. Every generation step, be it extract (data/…), derive (analysis/…), or review (README.md), appends an edge: an artifact plus its inputs (each path + sha256). So raw → data → analysis → README is one DAG in one place, and a single audit can walk it.
data/ is a strict, grounded superset of raw/ with no computation. Any mean/SEM/%-knockdown/fit belongs in analysis/, never in data/.
assert = the grounding/drift check. Running the claims captures provenance automatically and indexes each claim into libkit as searchable, grounded evidence (carrying its outcome + strength, so a contradicted or weak claim is never surfaced as fact).
The CLI is scripts/sci.py, a self-contained PEP-723 uv script (it declares its own
deps), so it runs with no install. The always-works form — use this in scripts and as
an agent — is uv run /path/to/skills/scientist/scripts/sci.py <command> [args]. To
get a real sci on your PATH instead of typing that absolute form, the skill ships a
launcher shim at bin/sci — add its bin/ to PATH (export PATH="/path/to/skills/scientist/bin:$PATH") or symlink it once (ln -s /path/to/skills/scientist/bin/sci ~/.local/bin/sci). The shim execs the script, whose
shebang resolves deps each run.
Data-tree root (--home / $SCIENTIST_HOME): you no longer have to export
SCIENTIST_HOME when running from inside the data checkout. When it is unset, sci (and
the grounding experiments accessor) infer the data-repo root by walking up from the
working directory to a checkout marker (.scientist/, or LAYOUT.md + program/). An
explicit --home or a set $SCIENTIST_HOME always wins; if no root is found the clear
"set SCIENTIST_HOME" error still fires. Literature-claim grounding likewise loads ~/.env
to find $BIBLIOGRAPHER_HOME if it isn't already set, so no source ~/.env is needed.
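
The root-resolution rule described above can be sketched roughly as follows. This is an illustration under stated assumptions, not sci's code: `resolve_home` and `is_checkout_root` are hypothetical names, the relative precedence of --home over $SCIENTIST_HOME is assumed (the text only says either one beats inference), and `.scientist/` and `program/` are treated as directories with `LAYOUT.md` as a file.

```python
import tempfile
from pathlib import Path
from typing import Mapping, Optional


def is_checkout_root(d: Path) -> bool:
    """True if the directory holds a checkout marker."""
    return (d / ".scientist").is_dir() or (
        (d / "LAYOUT.md").is_file() and (d / "program").is_dir()
    )


def resolve_home(cli_home: Optional[str], env: Mapping[str, str], cwd: Path) -> Path:
    """Assumed order: --home, then $SCIENTIST_HOME, then walk up from cwd."""
    if cli_home:
        return Path(cli_home)
    if env.get("SCIENTIST_HOME"):
        return Path(env["SCIENTIST_HOME"])
    for candidate in (cwd, *cwd.parents):
        if is_checkout_root(candidate):
            return candidate
    raise RuntimeError("no data-tree root found; set SCIENTIST_HOME")


def demo() -> tuple[bool, bool]:
    """Infer the root from a nested folder, then override it via the env."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp).resolve()
        (root / ".scientist").mkdir()
        deep = root / "exp-001" / "data"  # hypothetical experiment folder
        deep.mkdir(parents=True)
        inferred = resolve_home(None, {}, deep) == root
        overridden = (
            resolve_home(None, {"SCIENTIST_HOME": "/elsewhere"}, deep)
            == Path("/elsewhere")
        )
        return inferred, overridden
```

Passing the environment in as a mapping, rather than reading `os.environ` inside the function, keeps the rule easy to exercise on throwaway data.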
Read the repo-wide AGENTS.md first: improve-as-you-go, push rote work into code,
PR your changes back to the skills repo, contribute generic fixes upstream to libkit by PR
(libkit is the store substrate; this is how bibliographer drove several libkit features),
and verify changes on throwaway data. Per-phase maintenance notes live in each references/ file;
the open direction (finer-grained provenance, program-level traceability) is in
ROADMAP.md — claim↔prose enforcement, the reproduction audit, and the terminal
report phase (claims → report, see report.md) are shipped.
npx claudepluginhub emerose/skills --plugin scientist