From histai-skillsets
Searches pathology cases, builds research cohorts, and exports WSI files with clinical and technical filters via the HistAI Datahub API. Use only when the user explicitly wants to download whole slide images.
Slash command: `/histai-skillsets:cohort_builder`
This skill provides secure access to the HistAI Datahub, allowing you to search for pathology cases, filter by clinical and technical criteria, buy cohorts, and export whole slide images (WSI) and metadata. This API is designed for building research cohorts and managing data access.
Read this before doing anything else.
CellDX has two completely independent workflows, each with its own dataset, cost model, and skill:
| Workflow | Skill | Dataset | What it costs | When to use |
|---|---|---|---|---|
| Buy WSIs (this skill) | cohort_builder | Whole Slide Images (220K+ slides instantly available, 1M+ via custom request) | Per-slide pricing: $5/H&E, $40/IHC (volume discounts apply) | User wants to download WSI files for external use, manual review, or their own pipeline |
| Train a model | ai_model_trainer | Pre-extracted feature vectors (~66K H&E slides only — IHC slides are not in the feature store) | GPU compute only ($/GPU-hour, no per-slide fee) | User wants to train a classifier on CellDX infrastructure |
If the user wants to train a model, use the ai_model_trainer skill instead; it reads pre-extracted features and does not require buying WSIs.

All requests require an API key.
export CELLDX_API_KEY="your-api-key"
Include the key in the X-API-KEY header:
X-API-KEY: $CELLDX_API_KEY
Users can generate API keys at Profile and settings → API keys on https://celldx.hist.ai.
Base URL: https://prod.celldx.net
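The authentication setup above can be sketched in Python as follows. This is a minimal sketch, assuming that https://prod.celldx.net is the base URL for every endpoint path in this document; `auth_headers` and `endpoint_url` are hypothetical helper names, not part of the API.

```python
import os

# Assumption: the URL listed above is the base for all /v1/... paths.
BASE_URL = "https://prod.celldx.net"


def auth_headers(env=os.environ):
    """Build the X-API-KEY header from the CELLDX_API_KEY environment variable."""
    api_key = env.get("CELLDX_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("CELLDX_API_KEY is not set")
    return {"X-API-KEY": api_key}


def endpoint_url(path):
    """Join the base URL with an endpoint path such as /v1/billing/balance."""
    return BASE_URL + "/" + path.lstrip("/")
```

Failing early on a missing key avoids sending a request that would only return 401 INVALID_API_KEY.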
The database contains pathology case records. Key fields available for filtering and retrieval include:
| Field | Type | Description |
|---|---|---|
| primary_diagnosis | string[] | Main diagnosis terms (e.g., "Invasive ductal carcinoma") |
| cancer_type | string[] | Specific cancer subtype |
| isCancer | boolean | true if malignant, false otherwise |
| organ | string[] | Affected organ (e.g., "Breast", "Lung") |
| organ_system | string[] | System context (e.g., "Gastrointestinal") |
| age | number | Patient age |
| gender | "m" \| "f" | Patient gender |
| icd10 | string | ICD-10 code |

| Field | Type | Description |
|---|---|---|
| stain_names | string[] | Stains present (e.g., "H&E", "ER", "Ki-67") |
| stain_types | string[] | Stain categories (e.g., "Routine", "IHC") |
| biopsy_regimen | string[] | Procedure type (e.g., "Resection", "Biopsy") |
| scanners | string[] | Scanner devices used |
| magnifications | number[] | Available optical magnifications (e.g., 20, 40) |

| Field | Description |
|---|---|
| macro_protocol | Macroscopic description of the specimen |
| micro_protocol | Microscopic findings and detailed pathology report |
| conclusion | Final diagnostic conclusion |
IHC results are stored as a map of marker names to results (e.g., HER2: "positive 3+"). Use ihc_markers or ihcStudiesContains for filtering.

`/v1/billing/topup/charge`: Instant top-up of credits.
Request Body:
{
"amountUsd": 1000
}
`/v1/datahub/cases`: List cases. Supports pagination parameters: page (default 0) and size (default 100).

`/v1/datahub/cases/search?page=0&size=20`: Search for cases using complex filters. Pagination is handled via the query parameters page and size.
Request Body (Filters):
{
"organ": ["Breast"],
"isCancer": true,
"stainName": ["H&E"],
"ageRange": [40, 60]
}
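As a rough illustration of how the search path and filter body fit together, the sketch below builds both without sending anything. `build_search_request` is a hypothetical helper, and the bounds check on page and size is an assumption rather than a documented rule.

```python
import json
from urllib.parse import urlencode

SEARCH_PATH = "/v1/datahub/cases/search"


def build_search_request(filters, page=0, size=20):
    """Return (path_with_query, json_body) for a case search."""
    # Assumption: the API expects a non-negative page and a positive size.
    if page < 0 or size < 1:
        raise ValueError("page must be >= 0 and size must be >= 1")
    query = urlencode({"page": page, "size": size})
    return f"{SEARCH_PATH}?{query}", json.dumps(filters)


path, body = build_search_request(
    {"organ": ["Breast"], "isCancer": True, "stainName": ["H&E"], "ageRange": [40, 60]}
)
print(path)  # /v1/datahub/cases/search?page=0&size=20
```

Pagination stays in the query string while the filters travel in the JSON body, matching the request shown above.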
When a user asks for a diagnosis that includes an organ name (e.g., "Prostate adenocarcinoma"), be aware that the primaryDiagnosis field might only contain the morphology (e.g., "Adenocarcinoma") while the localization is stored in the organ or organSystem fields.
Organ Variability:
The organ filter can contain both specific and general terms (e.g., "Left ovary", "Right ovary", "Ovary", and "Ovaries").
Best Practices:
- Use `/v1/datahub/cases/filters` or `/v1/datahub/cases/filters/page` to see the actual values and counts for the organ and primaryDiagnosis fields.
- Filter by organ (e.g., "Prostate") or organSystem (e.g., "Genitourinary") and combine with a broader search in primaryDiagnosis.

`/v1/datahub/cases/filters`: Get available filter values (facets) and counts matching the current criteria.
`/v1/datahub/cases/filters/page`: Paginate through large lists of filter values (e.g., list all primary diagnoses matching "carcinoma").

`/v1/datahub/cases/filters/global`: Retrieve global, unfiltered lists of available filter options (e.g., all organs in the system).
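Because organ values vary ("Left ovary", "Ovary", "Ovaries"), it can help to collect every facet value that matches a stem before searching. The sketch below assumes each facet entry is a dictionary with `value` and `count` keys; the actual response shape of the filter endpoints is not documented here, and the sample counts are invented for illustration.

```python
def matching_values(facets, term):
    """Return facet values containing `term` (case-insensitive), most frequent first."""
    needle = term.lower()
    hits = [f for f in facets if needle in f["value"].lower()]
    hits.sort(key=lambda f: f["count"], reverse=True)
    return [f["value"] for f in hits]


# Illustrative data only; real values and counts come from the filter endpoints.
organ_facets = [
    {"value": "Left ovary", "count": 12},
    {"value": "Ovary", "count": 40},
    {"value": "Ovaries", "count": 7},
    {"value": "Breast", "count": 90},
]
organs = matching_values(organ_facets, "ovar")
print(organs)  # ['Ovary', 'Left ovary', 'Ovaries']
```

The resulting list can be passed as the organ filter so that specific and general terms are all covered.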
`/v1/datahub/cohorts`: List all your cohorts.

`/v1/datahub/cohorts`: Create a new cohort.
Request Body:
{
"name": "My Breast Cancer Cohort",
"cases": [
{ "caseId": "case_123" },
{ "caseId": "case_456", "slideIds": ["slide_A", "slide_B"] }
]
}
Constraints:
- name: Max 200 chars, cannot be blank.
- cases: Must not be empty. No duplicate case IDs.
- slideIds: Optional. If omitted, all slides in the case are included/purchased. If provided, only the specified slides are included. Use this to reduce costs by purchasing only relevant stains (e.g., specific IHCs).

`/v1/datahub/cohorts/{cohortId}`: Get details for a specific cohort.
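The cohort creation constraints listed above can be checked on the client before the request is sent. This is a minimal sketch; `build_cohort_payload` is a hypothetical helper, and the server remains the authority on what it accepts.

```python
def build_cohort_payload(name, cases):
    """Validate and return a cohort creation body.

    `cases` is a list of (case_id, slide_ids_or_None) tuples.
    """
    if not name or not name.strip():
        raise ValueError("name cannot be blank")
    if len(name) > 200:
        raise ValueError("name exceeds 200 characters")
    if not cases:
        raise ValueError("cases must not be empty")
    seen = set()
    entries = []
    for case_id, slide_ids in cases:
        if case_id in seen:
            raise ValueError(f"duplicate caseId: {case_id}")
        seen.add(case_id)
        entry = {"caseId": case_id}
        if slide_ids:  # leaving slideIds out includes every slide in the case
            entry["slideIds"] = list(slide_ids)
        entries.append(entry)
    return {"name": name, "cases": entries}


payload = build_cohort_payload(
    "My Breast Cancer Cohort",
    [("case_123", None), ("case_456", ["slide_A", "slide_B"])],
)
```

Catching these problems locally avoids the 409 INVALID_NAME and DUPLICATE_CASES responses.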
`/v1/datahub/cohorts/{cohortId}/status`: Check the status of a cohort (PENDING, BUILDING, PAID, UNPAID, FAILED).

`/v1/datahub/cohorts/{cohortId}/slides`: Add slides/cases to an existing cohort.
Request Body:
[
{ "caseId": "case_789" }
]
`/v1/datahub/cohorts/{cohortId}/slides`: Remove slides/cases from a cohort.
[!IMPORTANT] Cohort Modification Rules: Cohorts can only be modified (adding or removing slides) while their status is UNPAID. Once a cohort is PAID, it is locked and cannot be changed.
[!CAUTION] Financial Risk & Mandatory Confirmation: Purchasing a cohort involves real financial transactions and is non-refundable once the status is PAID. AI Agents MUST explicitly confirm the total cost and cohort content with the user and receive a clear "YES" or "PROCEED" before calling this endpoint. Never assume the user wants to pay based on high-level instructions alone.
`/v1/datahub/cohorts/{cohortId}/pay`: Pay for a cohort to enable export. This endpoint is idempotent and requires an Idempotency-Key header.
Headers:
X-API-KEY: $CELLDX_API_KEY
Idempotency-Key: <unique-uuid>
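One way to enforce the confirmation rule and the Idempotency-Key requirement together is to refuse to build the payment request unless the user's reply is an explicit "YES" or "PROCEED". The sketch below only prepares the path and headers; `build_pay_request` is a hypothetical helper, and reading the reply case-insensitively is an assumption.

```python
import uuid

CONFIRMATIONS = {"YES", "PROCEED"}


def build_pay_request(cohort_id, api_key, user_reply, idempotency_key=None):
    """Return (path, headers) for the pay endpoint, or raise without confirmation."""
    if user_reply.strip().upper() not in CONFIRMATIONS:
        raise PermissionError("user has not explicitly confirmed the purchase")
    # Generate a fresh UUID unless the caller supplies a key to reuse.
    key = idempotency_key or str(uuid.uuid4())
    headers = {"X-API-KEY": api_key, "Idempotency-Key": key}
    return f"/v1/datahub/cohorts/{cohort_id}/pay", headers
```

Passing the same `idempotency_key` when retrying a failed call is the usual way to keep a retry from being treated as a second payment; whether this API behaves that way should be confirmed against its documentation.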
Export is available only for PAID cohorts.
`/v1/datahub/cohorts/{cohortId}/export/status`: Check if the export package is ready (PREPARING, READY, EXPIRED, FAILED).

`/v1/datahub/cohorts/{cohortId}/export/manifest`: Get the download URLs for the cohort manifest and metadata.

`/v1/datahub/cohorts/{cohortId}/export/files`: List available files for download in the cohort.

`/v1/datahub/cohorts/{cohortId}/export/files/{fileId}`: Get a signed download URL for a specific file (WSI).

`/v1/datahub/cohorts/{cohortId}/export/refresh`: Refresh expired download tokens/links.
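Waiting for an export to become READY is a polling loop over the export status values listed above. The sketch below takes the status lookup as a caller-supplied function, so it makes no assumption about the HTTP client or response format; the attempt limit and delay are arbitrary illustrative defaults.

```python
import time


def wait_for_export(get_status, max_attempts=30, delay_seconds=10.0, sleep=time.sleep):
    """Poll `get_status()` until it returns READY.

    `get_status` is a caller-supplied function returning one of
    PREPARING, READY, EXPIRED, or FAILED.
    """
    for attempt in range(max_attempts):
        status = get_status()
        if status == "READY":
            return status
        if status in ("EXPIRED", "FAILED"):
            raise RuntimeError(f"export ended in status {status}")
        if attempt < max_attempts - 1:
            sleep(delay_seconds)  # wait before asking again
    raise TimeoutError("export was not READY after the allowed attempts")
```

An EXPIRED result is surfaced as an error here so the caller can decide whether to use the refresh endpoint and poll again.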
`/v1/billing/balance`: Get current account balance and currency.

`/v1/billing/history`: Get transaction history.

`/v1/billing/payment-methods`: List saved payment methods.

`/v1/billing/topup/charge`: Add funds to account.
Request Body:
{
"amountUsd": 100
}
Use the following table to estimate cohort costs in advance. Pricing is per slide and depends on the stain type and total quantity within a single cohort.
| Quantity | H&E Slides | Other Slides (IHC, etc.) |
|---|---|---|
| 1–99 | $5.00 | $40.00 |
| 100–999 | $4.00 | $32.00 |
| 1000–9999 | $3.00 | $27.00 |
| 10000+ | $2.00 | $23.00 |
[!TIP] Volume Discounts: Discounts are calculated based on the total number of slides per cohort. To maximize savings, we recommend purchasing as many slides as possible within a single cohort rather than creating multiple smaller cohorts.
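The pricing table can be turned into a quick estimator for confirming costs with the user before payment. This sketch assumes the tier is chosen by the total slide count in the cohort and then applied to every slide in it (not charged marginally per tier); the actual invoice from the API is authoritative.

```python
# (minimum total slides in cohort, H&E price, other-stain price) from the table above
PRICE_TIERS = [
    (10000, 2.00, 23.00),
    (1000, 3.00, 27.00),
    (100, 4.00, 32.00),
    (1, 5.00, 40.00),
]


def estimate_cohort_cost(he_slides, other_slides):
    """Estimate the cohort price in USD under the tier assumption stated above."""
    total = he_slides + other_slides
    if he_slides < 0 or other_slides < 0 or total == 0:
        raise ValueError("slide counts must be non-negative and not both zero")
    for minimum, he_price, other_price in PRICE_TIERS:
        if total >= minimum:
            return he_slides * he_price + other_slides * other_price


print(estimate_cohort_cost(80, 30))  # 1280.0
print(estimate_cohort_cost(80, 0) + estimate_cohort_cost(0, 30))  # 1600.0
```

In this example a single cohort of 110 slides reaches the 100–999 tier, while the same slides split across two cohorts stay in the 1–99 tier and cost more.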
| Status | Code | Description |
|---|---|---|
| 400 | INVALID_PARAMETERS | Invalid request parameters or body. |
| 401 | INVALID_API_KEY | Missing or invalid API key. |
| 402 | STORAGE_OVERAGE | Storage limit exceeded. |
| 402 | SUBSCRIPTION_REQUIRED | Active subscription required. |
| 402 | INSUFFICIENT_FUNDS | Not enough balance to pay for cohort. |
| 403 | API_KEY_NOT_ALLOWED | Endpoint not allowed for API key auth. |
| 403 | NO_PERMISSION | User does not own the cohort. |
| 404 | NOT_FOUND | Cohort, Case, or Export not found. |
| 409 | INVALID_NAME | Cohort name is blank or too long. |
| 409 | DUPLICATE_CASES | Duplicate caseId in request. |
| 409 | COHORT_STATUS_INVALID | Action not allowed in current status (e.g., adding slides to PAID cohort). |
| 409 | COHORT_ALREADY_PAID | Cohort is already paid. |
| 409 | EXPORT_NOT_READY | Export job is still running. |
| 409 | COHORT_NOT_PAID | Export is available only for PAID cohorts. |
| 410 | EXPORT_WINDOW_EXPIRED | Export download window has expired (refresh required). |
| 429 | RATE_LIMITED | Too many requests. Retry after suggested time. |
| 500 | INTERNAL_ERROR | Server or database error. |
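A client can map the error codes in this table to a next step. The mapping below is one possible reading of the descriptions, not documented API behavior, and the action names are hypothetical labels.

```python
def suggested_action(status, code):
    """Suggest a next step for an error response, based on the table above."""
    if status == 429 or status >= 500:
        return "retry_later"
    if code == "EXPORT_NOT_READY":
        return "poll_export_status"
    if code == "EXPORT_WINDOW_EXPIRED":
        return "refresh_export"
    if code == "INSUFFICIENT_FUNDS":
        return "ask_user_to_top_up"
    if code == "COHORT_ALREADY_PAID":
        return "continue_to_export"
    # Remaining codes (bad parameters, auth, permissions, not found)
    # need a corrected request or user input rather than a retry.
    return "stop_and_report"
```

Note that "ask_user_to_top_up" keeps the user in the loop, in line with the confirmation rule for financial actions.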
1. Use `/v1/datahub/cases/filters/global` or `/v1/datahub/cases/filters` to analyze available options (diagnoses, stains, organs, etc.). Always do this first to ensure you use the exact spelling and terms present in the dataset (e.g., checking if it is "Invasive ductal carcinoma" or "Carcinoma, Ductal, Invasive").
2. Use `/v1/datahub/cases/search` to find cases meeting your research criteria (e.g., specific diagnosis + IHC markers).
3. Use `/v1/datahub/cohorts` to save these cases as a named cohort. Include slideIds in the request to purchase only relevant slides (e.g., only the HER2 slides found in search) rather than the entire case.
4. Call `/v1/datahub/cohorts/{id}/pay` to purchase the cohort data.
5. Poll `/export/status` until READY.
6. Fetch `/export/manifest` to inspect the file list.
7. Download files via `/export/files/{fileId}` or use the bulk archive URL from the manifest if available.

[!IMPORTANT] Download Window Information (WSI Slides Only)
- ⏰ You have 2 weeks from your purchase date to download your Whole Slide Images (WSI).
- 🔒 Download links expire every 24 hours for security, but you can refresh them.
- 📅 We cannot guarantee WSI file availability beyond the 2-week window.
- 💡 We recommend downloading all slide files as soon as they're ready.
Note: Cohort metadata remains available for retrieval at any time.
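The download window described above (2 weeks from purchase, links valid for 24 hours) can be tracked with simple date arithmetic. This is a sketch under the assumption that both periods are measured exactly from the timestamps given; the server's own expiry decisions take precedence, and `download_state` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

DOWNLOAD_WINDOW = timedelta(weeks=2)
LINK_LIFETIME = timedelta(hours=24)


def download_state(purchased_at, links_issued_at, now):
    """Classify where a cohort stands in its download window."""
    if now > purchased_at + DOWNLOAD_WINDOW:
        return "window_closed"
    if now > links_issued_at + LINK_LIFETIME:
        return "refresh_links"
    return "links_valid"


purchased = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(download_state(purchased, purchased, purchased + timedelta(days=3)))  # refresh_links
```

A "refresh_links" result corresponds to calling the export refresh endpoint before retrying downloads.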
[!NOTE] We'll provide you with a special file that contains direct links to all your slides. 1️⃣ Download the manifest file (it's a simple text file with all your slide links). 2️⃣ Use the links however you prefer.
The Datahub API provides instant access to 281,655 slides across 73,334 cases. However, our total inventory exceeds 1,000,000 slides.
If you cannot find the specific slides needed or if the volume is insufficient for the user's research: