From pytorch-skills
Migrate C++/CUDA PyTorch extension code to use the PyTorch stable ABI. Use when the user wants to make their project ABI stable, migrate to stable ABI, or use torch::stable APIs.
Slash command
/pytorch-skills:abi-stable
Migrate C++/CUDA PyTorch extension code to the stable ABI so that extensions built against one PyTorch version continue to work with future versions without recompilation.
PyTorch 2.11+ is required. Before making any changes, verify the installed version:
python3 -c "import torch; print(torch.__version__)"
If the version is below 2.11, stop and tell the user they need to upgrade PyTorch first.
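The version gate above can also be scripted. The following is a minimal sketch, not part of the skill itself: the helper names parse_major_minor and meets_minimum are hypothetical, and only the 2.11 threshold comes from the text. It compares versions numerically, because a plain string comparison would rank "2.9" above "2.11".

```python
import re

MIN_VERSION = (2, 11)  # minimum (major, minor) stated in the text above


def parse_major_minor(version):
    """Extract (major, minor) from strings such as '2.11.0+cu126'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version, minimum=MIN_VERSION):
    """Compare numerically, so that 2.9 sorts below 2.11."""
    return parse_major_minor(version) >= minimum


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        ok = meets_minimum(torch.__version__)
        print(torch.__version__, "OK" if ok else "upgrade PyTorch first")
```

Local build suffixes and nightly tags (for example "+cu126" or ".dev...") are ignored, since only the leading major and minor numbers are parsed.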
The stable ABI headers (torch/csrc/stable/) were introduced in PyTorch 2.7 but the
API surface needed for most migrations was not complete until 2.11.
Before starting a migration, read the project's reference documentation:
For a concrete before/after example of a real migration, see references/migration-example.md.
For the complete API mapping (old API -> stable ABI replacement), see references/api-mapping.md.
Find all .cpp, .cu, and .cuh files that include PyTorch headers like:
- <torch/extension.h>
- <ATen/ATen.h>
- <ATen/cuda/CUDAContext.h>
- <c10/cuda/CUDAGuard.h>
- <torch/library.h>

Remove old includes and replace with stable ABI headers:
// REMOVE these:
// #include <torch/extension.h>
// #include <ATen/ATen.h>
// #include <ATen/cuda/CUDAContext.h>
// #include <c10/cuda/CUDAGuard.h>
// #include <torch/library.h>
// ADD these (include only what you need):
#include <torch/csrc/stable/tensor.h>
#include <torch/csrc/stable/library.h>
#include <torch/csrc/stable/ops.h>
#include <torch/csrc/stable/accelerator.h>
#include <torch/headeronly/core/ScalarType.h>
#include <torch/headeronly/util/Exception.h>
#include <torch/headeronly/util/shim_utils.h>
// For CUDA stream access:
#include <torch/csrc/inductor/aoti_torch/c/shim.h>
#include <cuda_runtime.h>
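To locate the files that still pull in the legacy headers, a small scan along the following lines may help. This is a sketch under assumptions: the function names find_legacy_includes and scan_tree are hypothetical, and the header set contains only the five headers named above, which the text presents as examples rather than an exhaustive list.

```python
import re
from pathlib import Path

# Legacy headers named in the text above (examples, not an exhaustive list).
LEGACY_HEADERS = {
    "torch/extension.h",
    "ATen/ATen.h",
    "ATen/cuda/CUDAContext.h",
    "c10/cuda/CUDAGuard.h",
    "torch/library.h",
}
SOURCE_SUFFIXES = {".cpp", ".cu", ".cuh"}
# Matches active include directives only; lines starting with // are skipped.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]')


def find_legacy_includes(source):
    """Return (line_number, header) pairs for legacy includes in one file's text."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = INCLUDE_RE.match(line)
        if match and match.group(1) in LEGACY_HEADERS:
            hits.append((number, match.group(1)))
    return hits


def scan_tree(root):
    """Map each .cpp/.cu/.cuh file under root to its legacy includes."""
    report = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in SOURCE_SUFFIXES:
            hits = find_legacy_includes(path.read_text(errors="replace"))
            if hits:
                report[str(path)] = hits
    return report
```

Calling scan_tree(".") from the project root returns a dictionary keyed by file path, which gives a checklist of files to migrate.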
Apply the replacements described in references/api-mapping.md. The most common ones are:
| Old API | Stable ABI Replacement |
|---|---|
| at::Tensor / torch::Tensor | torch::stable::Tensor |
| TORCH_CHECK(...) | STD_TORCH_CHECK(...) |
| c10::cuda::CUDAGuard | torch::stable::accelerator::DeviceGuard |
| at::empty({...}, options) | torch::stable::new_empty(ref_tensor, {sizes}, dtype) |
| at::zeros({...}, options) | torch::stable::new_zeros(ref_tensor, {sizes}, dtype) |
| TORCH_LIBRARY_IMPL | STABLE_TORCH_LIBRARY_IMPL |
| m.impl("name", &func) | m.impl("name", TORCH_BOX(&func)) |
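Two rows of the table are pure macro renames, so they lend themselves to a mechanical pass. The sketch below is illustrative only: apply_name_swaps is a hypothetical helper, and it deliberately covers just TORCH_CHECK and TORCH_LIBRARY_IMPL. The other rows change types or arguments and are better handled by hand.

```python
import re

# Macro rows from the table whose replacement is a pure rename.
NAME_SWAPS = {
    "TORCH_CHECK": "STD_TORCH_CHECK",
    "TORCH_LIBRARY_IMPL": "STABLE_TORCH_LIBRARY_IMPL",
}

# Match a whole identifier that is followed by "(". The lookbehind keeps
# already migrated names (STD_TORCH_CHECK, STABLE_TORCH_LIBRARY_IMPL) intact,
# and the lookahead leaves longer names such as TORCH_CHECK_EQ alone.
_PATTERN = re.compile(
    r"(?<![A-Za-z0-9_])("
    + "|".join(sorted(NAME_SWAPS, key=len, reverse=True))
    + r")(?=\s*\()"
)


def apply_name_swaps(source):
    """Rename the macros only; arguments are left exactly as written."""
    return _PATTERN.sub(lambda m: NAME_SWAPS[m.group(1)], source)
```

Because only the macro name is matched, the arguments of each check pass through unchanged, and running the function twice yields the same result as running it once.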
In the project's setup.py (or CMakeLists.txt), add -DUSE_CUDA to both cxx
and nvcc compiler flags if using CUDA stream access via aoti_torch_get_current_cuda_stream.
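For a setup.py-based build, the flag handling might be sketched as follows. The helper stable_abi_compile_args and its parameters are hypothetical; only the -DUSE_CUDA flag and the cxx/nvcc split come from the text.

```python
def stable_abi_compile_args(uses_cuda_stream, base_cxx=(), base_nvcc=()):
    """Build per-compiler flag lists, adding -DUSE_CUDA to both when needed.

    uses_cuda_stream: True if the extension calls
    aoti_torch_get_current_cuda_stream.
    """
    cxx = list(base_cxx)
    nvcc = list(base_nvcc)
    if uses_cuda_stream:
        for flags in (cxx, nvcc):
            # Avoid adding the define twice if the project already sets it.
            if "-DUSE_CUDA" not in flags:
                flags.append("-DUSE_CUDA")
    return {"cxx": cxx, "nvcc": nvcc}
```

The returned dictionary has the shape that torch.utils.cpp_extension.CUDAExtension accepts for its extra_compile_args argument, so it can be passed through directly. CMake projects would add the same definition to both the C++ and CUDA compile options instead.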
Build the project and fix compilation errors iteratively. Common issues are documented in references/common-issues.md.
These are mistakes that AI models commonly make. Follow these strictly:

- Do NOT use torch::stable::new_empty_strided. This API does not exist.
  To create a strided tensor, create a contiguous tensor with transposed dimensions
  and then call torch::stable::transpose().
- Do NOT define a dummy _C module that can be accessed from Python.
- Do NOT declare aoti_torch_get_current_cuda_stream yourself. Include it from
  torch/csrc/inductor/aoti_torch/c/shim.h.
- Do NOT manually box kernels. Use TORCH_BOX(&func) in m.impl() calls.
- Do NOT change switch statements into if/else blocks. The stable ABI works fine with switch statements.
- When replacing TORCH_CHECK with STD_TORCH_CHECK, only replace the function
  name. Do not change the content/arguments of the check.
- When replacing tensor.data_ptr<T>(), use tensor.const_data_ptr<T>() for
  read-only access. Do not change the template type.
- For DeviceGuard, pass tensor.get_device_index() not tensor.device().
- For CUDA streams, get the device index from a tensor
  (tensor.get_device_index()), not from
  torch::stable::accelerator::getCurrentDeviceIndex().
- Scalar type enums use torch::headeronly::ScalarType::Float (not
  torch::kFloat32), torch::headeronly::ScalarType::Half (not torch::kFloat16),
  etc.
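Some of the rules above can be checked mechanically after a migration pass. The following is a rough sketch with a hypothetical lint function; it flags only three patterns named in the rules (new_empty_strided, torch::kFloat32, torch::kFloat16) and is no substitute for reading the migrated code.

```python
import re

# (pattern, message) pairs derived from the rules listed above.
RULES = [
    (
        re.compile(r"\bnew_empty_strided\b"),
        "new_empty_strided does not exist in the stable ABI; "
        "create a contiguous tensor and call torch::stable::transpose()",
    ),
    (
        re.compile(r"\btorch::kFloat32\b"),
        "use torch::headeronly::ScalarType::Float",
    ),
    (
        re.compile(r"\btorch::kFloat16\b"),
        "use torch::headeronly::ScalarType::Half",
    ),
]


def lint(source):
    """Return (line_number, message) pairs for each rule hit in the text."""
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        for pattern, message in RULES:
            if pattern.search(line):
                findings.append((number, message))
    return findings
```

An empty result means none of these three patterns were found; it does not confirm that the remaining rules were followed.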
npx claudepluginhub meta-pytorch/skills --plugin pytorch-skills