Skill

rctd-py

GPU-accelerated Python RCTD cell type deconvolution for spatial transcriptomics (Visium, Xenium, VisiumHD, MERFISH, Slide-seq). Use when running rctd-py, Python RCTD, GPU deconvolution, converting spacexr references to h5ad, submitting GPU SLURM jobs for RCTD, or needing a faster alternative to R spacexr. Covers CLI (`rctd run`), Python API (`run_rctd`, `Reference`, `RCTDConfig`), SBATCH GPU templates for FGCZ L40S/Blackwell nodes, reference preparation from Seurat/scanpy, result integration back to R/Seurat, and mode selection (doublet/multi/full). Use this skill even when the user just mentions "rctd" with Python or GPU context, h5ad deconvolution, or fast spatial annotation.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/single-cell-ml:rctd-py

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Python reimplementation of RCTD (Cable et al., Nature Biotechnology 2022) with GPU acceleration via PyTorch. 4-41x faster than R spacexr with 99.7% concordance (100% with `sigma_override`).

Supporting Files

references/troubleshooting.mdreferences/workflows.md

SKILL.md

351 lines · ~3.2k tokens

Stats

LanguagePython

Parent stars0

MaintenanceFair

Last CommitMay 18, 2026

Actions

View Source View Plugin View on GitHub View README

Stats

Actions

rctd-py: GPU-Accelerated Spatial Deconvolution

Python reimplementation of RCTD (Cable et al., Nature Biotechnology 2022) with GPU acceleration via PyTorch. 4-41x faster than R spacexr with 99.7% concordance (100% with sigma_override).

For the R spacexr workflow, see the rctd-annotation skill instead.

When to Use

Running cell type deconvolution on spatial transcriptomics data using Python/GPU
Processing large datasets (>50k cells) where R spacexr is too slow
Converting R spacexr references to h5ad for Python workflows
Submitting GPU SLURM jobs for RCTD deconvolution on FGCZ infrastructure
Integrating rctd-py results back into R/Seurat objects
Comparing rctd-py results with R spacexr for validation

Prerequisites

# Install PyTorch with CUDA first (required for GPU)
uv pip install torch --index-url https://download.pytorch.org/whl/cu124

# Then install rctd-py
uv pip install rctd-py

# Verify installation
rctd info

On FGCZ, rctd-py can also be run directly from the local repo via uv run rctd (see SBATCH template below).

Quick Start: CLI

# Validate inputs (fast, no GPU needed)
rctd validate spatial.h5ad reference.h5ad --cell-type-col cell_type

# Run deconvolution
rctd run spatial.h5ad reference.h5ad \
    --mode doublet \
    --device cuda \
    --umi-min 20 \
    --output results_rctd.h5ad \
    --json

Default output path is <input_stem>_rctd.h5ad if --output is omitted.

Quick Start: Python API

from rctd import Reference, run_rctd, RCTDConfig
import anndata

spatial = anndata.read_h5ad("spatial.h5ad")
ref = Reference(anndata.read_h5ad("reference.h5ad"), cell_type_col="cell_type")
config = RCTDConfig(device="cuda", UMI_min=20)

result = run_rctd(spatial, ref, mode="doublet", config=config, batch_size=5000)
# result is a DoubletResult with .weights, .spot_class, .first_type, etc.

Mode Selection

Mode	Best For	Description
`doublet`	Xenium, MERFISH, Slide-seq	Assigns 1-2 cell types per spot. Default and recommended for most single-cell resolution data
`multi`	Dense spatial (Xenium with high contamination)	Up to 4 cell types per spot via greedy forward selection
`full`	Visium, VisiumHD	Unconstrained proportions across all K cell types. Best for multi-cell spots

For most FGCZ Xenium projects, use doublet (default).

Reference Preparation

From CZ CELLxGENE (via gget)

Download a public scRNA-seq reference directly in Python:

import gget
gget.setup("cellxgene")  # one-time
ref = gget.cellxgene(
    species="mus_musculus", tissue="ovary", disease="normal",
    out="reference.h5ad",
)
# Remap Ensembl IDs to gene symbols (gget returns Ensembl by default)
ref.var_names = ref.var["feature_name"].values
ref.var_names_make_unique()
ref.write_h5ad("reference.h5ad")
# Ready for rctd-py — ensure .X has raw counts (gget returns raw by default)

From Seurat .qs2 / .rds

library(anndataR)
ref_seurat <- qs2::qs_read("reference.qs2", nthreads = 16)

# Downsample to ~500 cells per type (recommended)
Idents(ref_seurat) <- "cell_type"
ref_down <- subset(ref_seurat, downsample = 500)

# Convert to AnnData — ensure raw counts in .X
adata <- as_anndata(ref_down, x_layer = "counts")
adata$obs$cell_type <- as.character(ref_down$cell_type)

write_h5ad(adata, "reference.h5ad")

From spacexr Reference .rds

When you have a pre-built spacexr Reference object (e.g., from /srv/GT/databases/RCTD_References/):

library(spacexr)
library(anndata)

ref <- readRDS("spacexr_reference.rds")

# Transpose: spacexr stores genes x cells, AnnData needs cells x genes
counts <- t(ref@counts)

# Fix cell type names: spacexr may store L2_3 IT instead of L2/3 IT
cell_types <- gsub("_(\\d)", "/\\1", as.character(ref@cell_types))

obs <- data.frame(
  cell_type = cell_types,
  nUMI = ref@nUMI,
  row.names = names(ref@cell_types)
)

adata <- AnnData(X = counts, obs = obs)
write_h5ad(adata, "reference.h5ad")

From scanpy / existing h5ad

Ensure .X contains raw integer counts (not normalized/log-transformed) and .obs has a cell type column:

import anndata
adata = anndata.read_h5ad("reference.h5ad")
assert "cell_type" in adata.obs.columns
# If counts are in a layer: adata.X = adata.layers["counts"]

SBATCH GPU Template

Based on the working template at Internal_Dev/rctd_comparison/02_SBATCH_run_rctdpy.sh:

#!/bin/bash
#SBATCH --job-name=pXXXXX_rctdpy
#SBATCH --output=/srv/GT/analysis/pXXXXX/rctdpy_%j.log
#SBATCH --error=/srv/GT/analysis/pXXXXX/rctdpy_%j.err
#SBATCH --time=02:00:00
#SBATCH --mem-per-cpu=16G
#SBATCH --cpus-per-task=8
#SBATCH --partition=GPU
#SBATCH --gres=gpu:1

# GPU nodes: fgcz-r-023 (2x L40S, 48GB each), fgcz-c-056 (2x Blackwell, 96GB each)
# Optionally pin: #SBATCH --nodelist=fgcz-c-056

WORK_DIR="/srv/GT/analysis/pXXXXX"

echo "Started: $(date)"
echo "Node: $(hostname)"
nvidia-smi --query-gpu=name,memory.total --format=csv,noheader 2>/dev/null || true

# Run rctd-py from local repo (Dev/* modules unavailable on GPU nodes)
cd ~/git/rctd-py
uv run rctd run \
    "$WORK_DIR/spatial.h5ad" \
    "$WORK_DIR/reference.h5ad" \
    --cell-type-col cell_type \
    --mode doublet \
    --device cuda \
    --umi-min 20 \
    --output "$WORK_DIR/results_rctd.h5ad" \
    --json

echo "Completed: $(date)"
ls -lh "$WORK_DIR/results_rctd.h5ad"

Important: GPU nodes (fgcz-r-023, fgcz-c-056) do NOT have Dev/* modules. Use uv run rctd from the local repo checkout at ~/git/rctd-py, or install rctd-py into a conda environment. If you need R for reference conversion, run that step on a non-GPU node first (employee partition).

Parameter Reference

CLI Flags (`rctd run`)

Flag	Default	Description
`--mode`	`doublet`	`full`, `doublet`, or `multi`
`--output / -o`	`<stem>_rctd.h5ad`	Output h5ad path
`--device`	`auto`	`auto`, `cpu`, or `cuda`
`--batch-size`	`10000`	GPU batch size (affects VRAM)
`--cell-type-col`	`cell_type`	Reference obs column with cell type labels
`--sigma-override`	None	Skip sigma estimation; use for exact R concordance
`--umi-min`	`100`	Min UMI per pixel (use 20 for Xenium)
`--umi-max`	`20000000`	Max UMI per pixel
`--gene-cutoff`	`0.000125`	Bulk gene expression threshold
`--fc-cutoff`	`0.5`	Bulk fold-change threshold
`--gene-cutoff-reg`	`0.0002`	DE gene expression threshold
`--fc-cutoff-reg`	`0.75`	DE fold-change threshold
`--max-multi-types`	`4`	Max types per pixel (multi mode)
`--confidence-threshold`	`5.0`	Singlet confidence
`--doublet-threshold`	`20.0`	Doublet certainty
`--dtype`	`float64`	`float32` or `float64` (float32 saves VRAM)
`--cell-min`	`25`	Min cells per type in reference
`--n-max-cells`	`10000`	Max cells per type (downsampling)
`--json`	off	Print JSON summary to stdout
`--quiet / -q`	off	Suppress progress output

VRAM Guidance

Available VRAM	Recommended `--batch-size`	Peak VRAM (K=45 types)
96 GB (Blackwell)	10000 (default)	~4 GB
48 GB (L40S)	10000 (default)	~4 GB
24 GB	10000 (default)	~4 GB
8-16 GB	5000	~2 GB
< 8 GB	2000	~1 GB

For VisiumHD (>100k spots), the default auto batch sizing handles this automatically.

Result Interpretation

Output h5ad Structure

Results are written into a copy of the spatial AnnData:

Slot	Content	Modes
`obs["rctd_dominant_type"]`	Top cell type per pixel	all
`obs["rctd_spot_class"]`	singlet / doublet_certain / doublet_uncertain / reject / filtered	doublet
`obs["rctd_first_type"]`	Primary cell type	doublet
`obs["rctd_second_type"]`	Secondary cell type	doublet
`obs["rctd_converged"]`	IRWLS convergence flag	full
`obs["rctd_n_types"]`	Number of assigned types	multi
`obsm["rctd_weights"]`	Cell type weight matrix (N x K)	all
`obsm["rctd_weights_doublet"]`	Top-2 type weights (N x 2)	doublet
`obsm["rctd_sub_weights"]`	Selected type weights	multi
`uns["rctd_cell_type_names"]`	Ordered list of K cell type names	all
`uns["rctd_mode"]`	Mode used	all
`uns["rctd_config"]`	Full configuration dict	all
`uns["rctd_version"]`	rctd-py version	all

Spot Class Encoding

Spot class is 0-indexed (unlike R spacexr which is 1-indexed):

Code	Label	Meaning
0	reject	Low confidence, not enough UMI
1	singlet	Single cell type dominates
2	doublet_certain	Two types, high confidence
3	doublet_uncertain	Two types, lower confidence

Filtered pixels (below UMI_min) have NaN weights and "filtered" labels.

Quick Inspection

import anndata
adata = anndata.read_h5ad("results_rctd.h5ad")

# Spot class distribution
print(adata.obs["rctd_spot_class"].value_counts())

# Top cell types among singlets
singlets = adata[adata.obs["rctd_spot_class"] == "singlet"]
print(singlets.obs["rctd_first_type"].value_counts().head(10))

# Weight matrix shape
print(adata.obsm["rctd_weights"].shape)  # (N_pixels, K_types)
print(adata.uns["rctd_cell_type_names"])  # Column names for weight matrix

Integration Back to R/Seurat

library(anndataR)
library(Seurat)

# Read rctd-py results
result_adata <- read_h5ad("results_rctd.h5ad")
result_obs <- as.data.frame(result_adata$obs)

# Load original Seurat object
spatial_obj <- qs2::qs_read("spatial.qs2", nthreads = 16)

# Match barcodes (may need normalization — see troubleshooting)
common <- intersect(colnames(spatial_obj), rownames(result_obs))
spatial_obj$rctd_first_type <- result_obs[common, "rctd_first_type"]
spatial_obj$rctd_spot_class <- result_obs[common, "rctd_spot_class"]
spatial_obj$rctd_dominant_type <- result_obs[common, "rctd_dominant_type"]

# Add weight matrix
weights <- as.matrix(result_adata$obsm[["rctd_weights"]])
ct_names <- result_adata$uns[["rctd_cell_type_names"]]
colnames(weights) <- ct_names
for (ct in ct_names) {
  safe_name <- paste0("rctd.weight.", make.names(ct))
  spatial_obj[[safe_name]] <- weights[match(colnames(spatial_obj), rownames(result_obs)), ct]
}

qs2::qs_save(spatial_obj, "spatial_rctd_annotated.qs2", nthreads = 16)

Best Practices

Validate first: Run rctd validate before GPU jobs to catch issues (fast, no GPU needed)
UMI_min for Xenium: Default is 100 (matching R spacexr), but Xenium data typically needs --umi-min 20
Clean cell type names: Avoid / in names — zarr/h5 storage may convert to _. Use consistent naming
Reference quality: Ensure raw counts in .X, downsample to ~500 cells/type, >=50% gene overlap with spatial panel
GPU batch_size: Start with default (auto-sized). Reduce --batch-size 5000 if you hit OOM errors
float32 for large datasets: Use --dtype float32 to halve VRAM usage with <5% weight MAE
sigma_override: Only needed for exact R spacexr concordance testing, not for normal use

Detailed Workflows

For complete implementations with code examples, see:

references/workflows.md — 6 end-to-end workflows:
- Workflow 1: CLI quick run
- Workflow 2: Python API with custom config
- Workflow 3: Reference conversion from R Seurat
- Workflow 4: Multi-sample batch processing
- Workflow 5: Result comparison with R spacexr
- Workflow 6: Integration into Seurat objects
references/troubleshooting.md — Common issues:
- GPU detection, CUDA errors
- Memory / OOM issues
- Gene overlap, cell type naming
- FGCZ GPU node caveats

Resources

Local repo: ~/git/rctd-py/
PyPI: pip install rctd-py
GitHub: https://github.com/p-gueguen/rctd-py
spacexr paper: Cable et al., Nature Biotechnology 2022
Comparison scripts: Internal_Dev/rctd_comparison/
FGCZ RCTD references: /srv/GT/databases/RCTD_References/

rctd-py

Invocation

Context Preview

Supporting Files

SKILL.md

rctd-py

Invocation

Context Preview

Supporting Files

SKILL.md

rctd-py: GPU-Accelerated Spatial Deconvolution

When to Use

Prerequisites

Quick Start: CLI

Quick Start: Python API

Mode Selection

Reference Preparation

From CZ CELLxGENE (via gget)

From Seurat .qs2 / .rds

From spacexr Reference .rds

From scanpy / existing h5ad

SBATCH GPU Template

Parameter Reference

CLI Flags (rctd run)

VRAM Guidance

Result Interpretation

Output h5ad Structure

Spot Class Encoding

Quick Inspection

Integration Back to R/Seurat

Best Practices

Detailed Workflows

Resources

Similar Skills

rctd-py: GPU-Accelerated Spatial Deconvolution

When to Use

Prerequisites

Quick Start: CLI

Quick Start: Python API

Mode Selection

Reference Preparation

From CZ CELLxGENE (via gget)

From Seurat .qs2 / .rds

From spacexr Reference .rds

From scanpy / existing h5ad

SBATCH GPU Template

Parameter Reference

CLI Flags (rctd run)

VRAM Guidance

Result Interpretation

Output h5ad Structure

Spot Class Encoding

Quick Inspection

Integration Back to R/Seurat

Best Practices

Detailed Workflows

Resources

Similar Skills

CLI Flags (`rctd run`)

CLI Flags (`rctd run`)