From single-cell-ml
GPU-accelerated Python RCTD cell type deconvolution for spatial transcriptomics (Visium, Xenium, VisiumHD, MERFISH, Slide-seq). Use when running rctd-py, Python RCTD, GPU deconvolution, converting spacexr references to h5ad, submitting GPU SLURM jobs for RCTD, or needing a faster alternative to R spacexr. Covers CLI (`rctd run`), Python API (`run_rctd`, `Reference`, `RCTDConfig`), SBATCH GPU templates for FGCZ L40S/Blackwell nodes, reference preparation from Seurat/scanpy, result integration back to R/Seurat, and mode selection (doublet/multi/full). Use this skill even when the user just mentions "rctd" with Python or GPU context, h5ad deconvolution, or fast spatial annotation.
How this skill is triggered — by the user, by Claude, or both
Slash command
/single-cell-ml:rctd-pyThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Python reimplementation of RCTD (Cable et al., Nature Biotechnology 2022) with GPU acceleration via PyTorch. 4-41x faster than R spacexr with 99.7% concordance (100% with `sigma_override`).
Python reimplementation of RCTD (Cable et al., Nature Biotechnology 2022) with GPU acceleration via PyTorch. 4-41x faster than R spacexr with 99.7% concordance (100% with sigma_override).
For the R spacexr workflow, see the rctd-annotation skill instead.
# Install PyTorch with CUDA first (required for GPU)
uv pip install torch --index-url https://download.pytorch.org/whl/cu124
# Then install rctd-py
uv pip install rctd-py
# Verify installation
rctd info
On FGCZ, rctd-py can also be run directly from the local repo via uv run rctd (see SBATCH template below).
# Validate inputs (fast, no GPU needed)
rctd validate spatial.h5ad reference.h5ad --cell-type-col cell_type
# Run deconvolution
rctd run spatial.h5ad reference.h5ad \
--mode doublet \
--device cuda \
--umi-min 20 \
--output results_rctd.h5ad \
--json
Default output path is <input_stem>_rctd.h5ad if --output is omitted.
from rctd import Reference, run_rctd, RCTDConfig
import anndata
spatial = anndata.read_h5ad("spatial.h5ad")
ref = Reference(anndata.read_h5ad("reference.h5ad"), cell_type_col="cell_type")
config = RCTDConfig(device="cuda", UMI_min=20)
result = run_rctd(spatial, ref, mode="doublet", config=config, batch_size=5000)
# result is a DoubletResult with .weights, .spot_class, .first_type, etc.
| Mode | Best For | Description |
|---|---|---|
doublet | Xenium, MERFISH, Slide-seq | Assigns 1-2 cell types per spot. Default and recommended for most single-cell resolution data |
multi | Dense spatial (Xenium with high contamination) | Up to 4 cell types per spot via greedy forward selection |
full | Visium, VisiumHD | Unconstrained proportions across all K cell types. Best for multi-cell spots |
For most FGCZ Xenium projects, use doublet (default).
Download a public scRNA-seq reference directly in Python:
import gget
gget.setup("cellxgene") # one-time
ref = gget.cellxgene(
species="mus_musculus", tissue="ovary", disease="normal",
out="reference.h5ad",
)
# Remap Ensembl IDs to gene symbols (gget returns Ensembl by default)
ref.var_names = ref.var["feature_name"].values
ref.var_names_make_unique()
ref.write_h5ad("reference.h5ad")
# Ready for rctd-py — ensure .X has raw counts (gget returns raw by default)
library(anndataR)
ref_seurat <- qs2::qs_read("reference.qs2", nthreads = 16)
# Downsample to ~500 cells per type (recommended)
Idents(ref_seurat) <- "cell_type"
ref_down <- subset(ref_seurat, downsample = 500)
# Convert to AnnData — ensure raw counts in .X
adata <- as_anndata(ref_down, x_layer = "counts")
adata$obs$cell_type <- as.character(ref_down$cell_type)
write_h5ad(adata, "reference.h5ad")
When you have a pre-built spacexr Reference object (e.g., from /srv/GT/databases/RCTD_References/):
library(spacexr)
library(anndata)
ref <- readRDS("spacexr_reference.rds")
# Transpose: spacexr stores genes x cells, AnnData needs cells x genes
counts <- t(ref@counts)
# Fix cell type names: spacexr may store L2_3 IT instead of L2/3 IT
cell_types <- gsub("_(\\d)", "/\\1", as.character(ref@cell_types))
obs <- data.frame(
cell_type = cell_types,
nUMI = ref@nUMI,
row.names = names(ref@cell_types)
)
adata <- AnnData(X = counts, obs = obs)
write_h5ad(adata, "reference.h5ad")
Ensure .X contains raw integer counts (not normalized/log-transformed) and .obs has a cell type column:
import anndata
adata = anndata.read_h5ad("reference.h5ad")
assert "cell_type" in adata.obs.columns
# If counts are in a layer: adata.X = adata.layers["counts"]
Based on the working template at Internal_Dev/rctd_comparison/02_SBATCH_run_rctdpy.sh:
#!/bin/bash
#SBATCH --job-name=pXXXXX_rctdpy
#SBATCH --output=/srv/GT/analysis/pXXXXX/rctdpy_%j.log
#SBATCH --error=/srv/GT/analysis/pXXXXX/rctdpy_%j.err
#SBATCH --time=02:00:00
#SBATCH --mem-per-cpu=16G
#SBATCH --cpus-per-task=8
#SBATCH --partition=GPU
#SBATCH --gres=gpu:1
# GPU nodes: fgcz-r-023 (2x L40S, 48GB each), fgcz-c-056 (2x Blackwell, 96GB each)
# Optionally pin: #SBATCH --nodelist=fgcz-c-056
WORK_DIR="/srv/GT/analysis/pXXXXX"
echo "Started: $(date)"
echo "Node: $(hostname)"
nvidia-smi --query-gpu=name,memory.total --format=csv,noheader 2>/dev/null || true
# Run rctd-py from local repo (Dev/* modules unavailable on GPU nodes)
cd ~/git/rctd-py
uv run rctd run \
"$WORK_DIR/spatial.h5ad" \
"$WORK_DIR/reference.h5ad" \
--cell-type-col cell_type \
--mode doublet \
--device cuda \
--umi-min 20 \
--output "$WORK_DIR/results_rctd.h5ad" \
--json
echo "Completed: $(date)"
ls -lh "$WORK_DIR/results_rctd.h5ad"
Important: GPU nodes (fgcz-r-023, fgcz-c-056) do NOT have Dev/* modules. Use uv run rctd from the local repo checkout at ~/git/rctd-py, or install rctd-py into a conda environment. If you need R for reference conversion, run that step on a non-GPU node first (employee partition).
rctd run)| Flag | Default | Description |
|---|---|---|
--mode | doublet | full, doublet, or multi |
--output / -o | <stem>_rctd.h5ad | Output h5ad path |
--device | auto | auto, cpu, or cuda |
--batch-size | 10000 | GPU batch size (affects VRAM) |
--cell-type-col | cell_type | Reference obs column with cell type labels |
--sigma-override | None | Skip sigma estimation; use for exact R concordance |
--umi-min | 100 | Min UMI per pixel (use 20 for Xenium) |
--umi-max | 20000000 | Max UMI per pixel |
--gene-cutoff | 0.000125 | Bulk gene expression threshold |
--fc-cutoff | 0.5 | Bulk fold-change threshold |
--gene-cutoff-reg | 0.0002 | DE gene expression threshold |
--fc-cutoff-reg | 0.75 | DE fold-change threshold |
--max-multi-types | 4 | Max types per pixel (multi mode) |
--confidence-threshold | 5.0 | Singlet confidence |
--doublet-threshold | 20.0 | Doublet certainty |
--dtype | float64 | float32 or float64 (float32 saves VRAM) |
--cell-min | 25 | Min cells per type in reference |
--n-max-cells | 10000 | Max cells per type (downsampling) |
--json | off | Print JSON summary to stdout |
--quiet / -q | off | Suppress progress output |
| Available VRAM | Recommended --batch-size | Peak VRAM (K=45 types) |
|---|---|---|
| 96 GB (Blackwell) | 10000 (default) | ~4 GB |
| 48 GB (L40S) | 10000 (default) | ~4 GB |
| 24 GB | 10000 (default) | ~4 GB |
| 8-16 GB | 5000 | ~2 GB |
| < 8 GB | 2000 | ~1 GB |
For VisiumHD (>100k spots), the default auto batch sizing handles this automatically.
Results are written into a copy of the spatial AnnData:
| Slot | Content | Modes |
|---|---|---|
obs["rctd_dominant_type"] | Top cell type per pixel | all |
obs["rctd_spot_class"] | singlet / doublet_certain / doublet_uncertain / reject / filtered | doublet |
obs["rctd_first_type"] | Primary cell type | doublet |
obs["rctd_second_type"] | Secondary cell type | doublet |
obs["rctd_converged"] | IRWLS convergence flag | full |
obs["rctd_n_types"] | Number of assigned types | multi |
obsm["rctd_weights"] | Cell type weight matrix (N x K) | all |
obsm["rctd_weights_doublet"] | Top-2 type weights (N x 2) | doublet |
obsm["rctd_sub_weights"] | Selected type weights | multi |
uns["rctd_cell_type_names"] | Ordered list of K cell type names | all |
uns["rctd_mode"] | Mode used | all |
uns["rctd_config"] | Full configuration dict | all |
uns["rctd_version"] | rctd-py version | all |
Spot class is 0-indexed (unlike R spacexr which is 1-indexed):
| Code | Label | Meaning |
|---|---|---|
| 0 | reject | Low confidence, not enough UMI |
| 1 | singlet | Single cell type dominates |
| 2 | doublet_certain | Two types, high confidence |
| 3 | doublet_uncertain | Two types, lower confidence |
Filtered pixels (below UMI_min) have NaN weights and "filtered" labels.
import anndata
adata = anndata.read_h5ad("results_rctd.h5ad")
# Spot class distribution
print(adata.obs["rctd_spot_class"].value_counts())
# Top cell types among singlets
singlets = adata[adata.obs["rctd_spot_class"] == "singlet"]
print(singlets.obs["rctd_first_type"].value_counts().head(10))
# Weight matrix shape
print(adata.obsm["rctd_weights"].shape) # (N_pixels, K_types)
print(adata.uns["rctd_cell_type_names"]) # Column names for weight matrix
library(anndataR)
library(Seurat)
# Read rctd-py results
result_adata <- read_h5ad("results_rctd.h5ad")
result_obs <- as.data.frame(result_adata$obs)
# Load original Seurat object
spatial_obj <- qs2::qs_read("spatial.qs2", nthreads = 16)
# Match barcodes (may need normalization — see troubleshooting)
common <- intersect(colnames(spatial_obj), rownames(result_obs))
spatial_obj$rctd_first_type <- result_obs[common, "rctd_first_type"]
spatial_obj$rctd_spot_class <- result_obs[common, "rctd_spot_class"]
spatial_obj$rctd_dominant_type <- result_obs[common, "rctd_dominant_type"]
# Add weight matrix
weights <- as.matrix(result_adata$obsm[["rctd_weights"]])
ct_names <- result_adata$uns[["rctd_cell_type_names"]]
colnames(weights) <- ct_names
for (ct in ct_names) {
safe_name <- paste0("rctd.weight.", make.names(ct))
spatial_obj[[safe_name]] <- weights[match(colnames(spatial_obj), rownames(result_obs)), ct]
}
qs2::qs_save(spatial_obj, "spatial_rctd_annotated.qs2", nthreads = 16)
rctd validate before GPU jobs to catch issues (fast, no GPU needed)--umi-min 20/ in names — zarr/h5 storage may convert to _. Use consistent naming.X, downsample to ~500 cells/type, >=50% gene overlap with spatial panel--batch-size 5000 if you hit OOM errors--dtype float32 to halve VRAM usage with <5% weight MAEFor complete implementations with code examples, see:
references/workflows.md — 6 end-to-end workflows:
references/troubleshooting.md — Common issues:
~/git/rctd-py/pip install rctd-pyInternal_Dev/rctd_comparison//srv/GT/databases/RCTD_References/npx claudepluginhub cpanse/skills --plugin single-cell-mlSearches MemPalace before answering questions about past work, people, projects, or prior decisions. Returns verbatim stored content instead of guessing from model memory.
Guides Payload CMS config (payload.config.ts), collections, fields, hooks, access control, APIs. Debugs validation errors, security, relationships, queries, transactions, hook behavior.
Implements vector databases with Pinecone, Weaviate, Qdrant, Milvus, pgvector for semantic search, RAG, recommendations, and similarity systems. Optimizes embeddings, indexing, and hybrid search.