Access HTAN (Human Tumor Atlas Network) data — query the portal database, download from Synapse and Gen3/CRDC, query metadata in BigQuery, and search HTAN publications on PubMed.
Slash command: /htan:htan
Tools for accessing data from the **Human Tumor Atlas Network (HTAN)**, an NCI Cancer Moonshot initiative constructing 3D atlases of human cancers from precancerous lesions to advanced disease.
On first use, check if the htan CLI is available by running htan --version. If it is not installed, guide the user through setup:
Create a venv in the user's project (not in the plugin directory):
uv venv && uv pip install "${CLAUDE_PLUGIN_ROOT}"
Or without uv:
python3 -m venv .venv && source .venv/bin/activate && pip install "${CLAUDE_PLUGIN_ROOT}"
Configure credentials (portal, Synapse, etc.):
uv run htan init
Allow htan commands — ask the user to add this to their project .claude/settings.json:
{
  "permissions": {
    "allow": [
      "Bash(uv run htan *)"
    ]
  }
}
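
If the project already has a .claude/settings.json, the rule needs to be merged in rather than pasted over existing entries. A minimal sketch of that merge follows. It assumes the settings file is plain JSON with the permissions.allow layout shown above; the helper name add_allow_rule and the pre-existing "Bash(git status)" rule are hypothetical and used only for illustration.

```python
import copy
import json

# The rule string is taken verbatim from the settings snippet above.
HTAN_RULE = "Bash(uv run htan *)"


def add_allow_rule(settings: dict, rule: str = HTAN_RULE) -> dict:
    """Return a copy of `settings` with `rule` present in permissions.allow.

    Hypothetical helper: existing rules are kept, and the rule is not
    duplicated if it is already listed.
    """
    merged = copy.deepcopy(settings)
    allow = merged.setdefault("permissions", {}).setdefault("allow", [])
    if rule not in allow:
        allow.append(rule)
    return merged


# Hypothetical existing settings with one unrelated rule.
existing = {"permissions": {"allow": ["Bash(git status)"]}}
updated = add_allow_rule(existing)
print(json.dumps(updated, indent=2))
```

The function returns a new dictionary instead of mutating its input, so the original settings can be compared or restored before anything is written back to disk.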
All htan commands are read-only and safe — credentials are read from local config files, never echoed.
ALWAYS prefix commands with uv run when a venv exists (e.g., uv run htan query portal tables). NEVER use source .venv/bin/activate && — uv run handles the venv automatically.
NEVER create a virtual environment or install packages inside the plugin cache directory. Venvs go in the user's working directory.
NEVER run htan_setup.py via Bash. It is an interactive wizard that will fail.
All commands use uv run htan .... NEVER use source .venv/bin/activate. Run any command with --help for full usage.
Portal queries are the fastest way to find HTAN files and get download coordinates. They use the portal's ClickHouse backend.
uv run htan query portal files --organ Breast --assay "scRNA-seq" --limit 20
uv run htan query portal sql "SELECT atlas_name, COUNT(*) as n FROM files GROUP BY atlas_name"
uv run htan query portal tables
uv run htan query portal describe files
uv run htan query portal demographics --atlas "HTAN OHSU"
uv run htan query portal diagnosis --organ Breast --limit 10
uv run htan query portal cases --atlas "HTAN MSK"
uv run htan query portal specimen --organ Colon
uv run htan query portal summary
uv run htan query portal manifest HTA9_1_19512 HTA9_1_19553 --output-dir ./manifests
SQL notes: Array columns (organType, Gender, Race, etc.) require arrayExists(). Use <> instead of !=. LIMIT is auto-applied. See references/clickhouse_portal.md for full schema.
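
The SQL notes above can be illustrated with a small query builder. This is a sketch under assumptions: it assumes organType is an array column on the files table (the notes list it as an array column but do not say which table), and the helper names sql_literal and array_contains are hypothetical. The generated string is meant to be passed to uv run htan query portal sql.

```python
def sql_literal(value: str) -> str:
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def array_contains(column: str, value: str) -> str:
    """Build an arrayExists() predicate for an array column."""
    return f"arrayExists(x -> x = {sql_literal(value)}, {column})"


# Array column filtered with arrayExists(); scalar inequality uses <>, not !=.
# No LIMIT clause is written here because the notes say one is auto-applied.
query = (
    "SELECT atlas_name, COUNT(*) AS n FROM files "
    f"WHERE {array_contains('organType', 'Breast')} "
    f"AND atlas_name <> {sql_literal('HTAN OHSU')} "
    "GROUP BY atlas_name"
)
print(query)
```

Check references/clickhouse_portal.md for the actual column names before running a query built this way.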
uv run htan pubs search --keyword "spatial transcriptomics" --max-results 5
uv run htan pubs search --author "Sorger PK"
uv run htan pubs fetch 12345678
uv run htan pubs fulltext "tumor microenvironment"
Query the HTAN Phase 1 data model — 1,071 attributes across 64 manifest components with controlled vocabularies.
uv run htan model components
uv run htan model attributes "scRNA-seq Level 1"
uv run htan model describe "File Format"
uv run htan model valid-values "File Format"
uv run htan model search "barcode"
uv run htan model required "Biospecimen"
uv run htan model deps "scRNA-seq Level 1"
uv run htan model fetch
See references/htan_data_model.md for the full component catalog and identifier patterns.
The htan files commands bridge file IDs to download coordinates using the HTAN portal's DRS mapping (~67,000 files).
uv run htan files lookup HTA9_1_19512
uv run htan files lookup HTA9_1_19512 --format json
uv run htan files update
uv run htan files stats
BigQuery covers deep clinical queries and assay-level metadata (cell counts, library methods, file sizes).
uv run htan query bq tables
uv run htan query bq tables --versioned
uv run htan query bq describe clinical_tier1_demographics
uv run htan query bq sql 'SELECT COUNT(*) FROM `isb-cgc-bq.HTAN.clinical_tier1_demographics_current`'
uv run htan query bq query "How many patients with breast cancer?"
See references/bigquery_tables.md for table schemas and query examples.
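
BigQuery table names are wrapped in backticks, and POSIX shells treat backticks inside a double-quoted string as command substitution, so quoting matters when the SQL is passed on the command line. The sketch below shows one way to sidestep the problem from Python: build the argument list directly, and use shlex.join only when a copy-pasteable shell line is needed. The helper name bq_sql_argv is hypothetical; the command and table name are the ones used above.

```python
import shlex

BQ_SQL = "SELECT COUNT(*) FROM `isb-cgc-bq.HTAN.clinical_tier1_demographics_current`"


def bq_sql_argv(sql: str) -> list:
    """Build the argument list for `uv run htan query bq sql <sql>`."""
    return ["uv", "run", "htan", "query", "bq", "sql", sql]


argv = bq_sql_argv(BQ_SQL)

# An argument list passed to subprocess.run(argv) never goes through a shell,
# so the backticks reach the CLI unchanged.
# For a line to paste into a terminal, shlex.join adds the quoting needed.
shell_line = shlex.join(argv)
print(shell_line)
```

shlex.join wraps the SQL argument in single quotes, which is the quoting a shell needs to leave the backticks alone.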
For downloads, use the native platform CLIs; they're simpler and clearer.
Synapse (open access):
uv run synapse get syn26535909
uv run synapse get syn26535909 --downloadLocation ./data
uv run synapse get -r syn12345678 # recursive folder download
Gen3/CRDC (controlled access):
gen3-client download-single --profile=htan --guid=<guid>
uv run htan config check # Check credential status for all services
Use uv run htan config check to see what's configured.
Credential storage:

- Portal: ~/.config/htan-skill/portal.json (or OS Keychain)
- Synapse: SYNAPSE_AUTH_TOKEN env var or ~/.synapseConfig
- Gen3: ~/.gen3/credentials.json (requires dbGaP authorization)
- BigQuery: Application Default Credentials (gcloud auth application-default login)

No-auth commands (always work, no setup needed):

- htan pubs ...: PubMed search
- htan model ...: data model queries
- htan files ...: file mapping lookups

| Data Level / Type | Access | Platform |
|---|---|---|
| Level 3-4, Auxiliary, Other | Open | Synapse (synapseId) |
| Level 1-2 sequencing with DRS URI | Controlled | Gen3 (drs_uri) |
| CODEX Level 1, specialized assays (EM, RPPA, slide-seq, mass spec) | Open | Synapse |
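
The access tiers in the table map onto the two download commands shown earlier. A minimal dispatch sketch follows. It assumes the caller already knows the access tier and the identifier (a Synapse ID for open data, a GUID for controlled data); the helper name download_argv and the GUID value in the example are hypothetical.

```python
def download_argv(access: str, identifier: str) -> list:
    """Return the download command for a file as an argument list.

    "open" maps to the Synapse CLI, "controlled" to gen3-client,
    using the command forms shown in this document.
    """
    if access == "open":
        return ["uv", "run", "synapse", "get", identifier]
    if access == "controlled":
        return [
            "gen3-client",
            "download-single",
            "--profile=htan",
            f"--guid={identifier}",
        ]
    raise ValueError(f"unknown access tier: {access!r}")


print(download_argv("open", "syn26535909"))
# "example-guid" is a placeholder, not a real identifier.
print(download_argv("controlled", "example-guid"))
```

Raising on an unrecognized tier keeps a typo from silently falling through to the wrong platform.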
Recommended: Portal → Download (2 steps)

1. Find files with synapseId and drs_uri: uv run htan query portal files --organ Breast --assay "scRNA-seq"
2. Download: uv run synapse get <synID> (open access) or gen3-client download-single --guid <guid> (controlled)

Alternative: BigQuery → Download (for complex clinical queries)

1. Find HTAN_Data_File_ID: uv run htan query bq sql "SELECT ..."
2. Get download coordinates: uv run htan query portal manifest <file_ids>
3. Download: uv run synapse get or gen3-client

| Data | Portal | BigQuery |
|---|---|---|
| File size | Not available | File_Size (INTEGER, bytes) in assay metadata tables |
| Cell counts | Not available | Cell_Total in scRNAseq tables |
| Library method | Not available | Library_Construction_Method in assay tables |
| Download coordinates | synapseId, DRS URI | entityId only |
Note: File_Size and entityId exist in all BigQuery assay metadata tables (scRNAseq, bulkRNAseq, imaging, scATACseq, electron_microscopy, bulkWES, etc.), not just scRNAseq.
Fallback rule: If a query needs file sizes, cell counts, or assay-level metadata not in the portal, use BigQuery assay metadata tables (e.g., scRNAseq_level3_metadata_current, imaging_level2_metadata_current). Then use portal or htan files lookup for download coordinates.
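
The fallback rule can be sketched as a small planning function. The field names come from the comparison table above; the function name plan_sources and the idea of returning an ordered list of backends are illustrative assumptions, not part of the htan CLI.

```python
# Fields the comparison table lists as available only in BigQuery.
BIGQUERY_ONLY = {"File_Size", "Cell_Total", "Library_Construction_Method"}
# Download coordinates that the portal provides.
PORTAL_COORDINATES = {"synapseId", "drs_uri"}


def plan_sources(fields) -> list:
    """Return the backends to query, in order, for the requested fields."""
    needed = set(fields)
    if needed & BIGQUERY_ONLY:
        plan = ["bigquery"]
        # Per the fallback rule, go back to the portal for download coordinates.
        if needed & PORTAL_COORDINATES:
            plan.append("portal")
        return plan
    return ["portal"]


print(plan_sources(["synapseId"]))
print(plan_sources(["Cell_Total", "drs_uri"]))
```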
HTAN Documentation: See references/htan_docs_manual.md for citing HTAN, dbGaP access, data levels, visualization tools.
Atlas Centers: See references/htan_atlases.md for the 14 atlas centers, cancer types, and grant numbers.
| Document | When to Read |
|---|---|
| references/clickhouse_portal.md | Writing portal SQL — schema, array columns, JSON extraction, common mistakes |
| references/bigquery_tables.md | Writing BigQuery SQL — table schemas, naming conventions, example queries |
| references/authentication_guide.md | Setting up credentials for Synapse, Gen3, BigQuery |
| references/htan_data_model.md | Looking up components, controlled vocabularies, identifier patterns |
| references/htan_atlases.md | Atlas centers, cancer types, grant numbers |
| references/htan_docs_manual.md | HTAN Manual site map, citing HTAN, dbGaP access, visualization tools |
npx claudepluginhub ncihtan/htan-claude --plugin htan