nchc-cluster-skills

Name: nchc-cluster-skills
Author: nchc-bio

Claude Code skills for TWCC / NCHC HPC cluster usage: SLURM job submission, debugging, GPU allocation, pricing, and post-job review

npx claudepluginhub nchc-bio/nchc-marketplace --plugin nchc-cluster-skills

Popularity

Stars: Med: 0 · Avg: 285

Installs: Med: 0 · Avg: 1

What's Inside

Skills (5)

arm64-pipeline

/arm64-pipeline

Use when setting up, porting, or debugging a job on any NCHC/TWCC partition whose compute nodes are aarch64 / ARM64 while the login node is x86_64. The current example is GB200 (gb200-rack1, gb200-rack2, gb200-dev, gb200-full; Grace CPU), and this skill will apply to any future ARM-based partition the cluster adds. Scope is strictly aarch64/arch-mismatch issues. Trigger on: user wants to run on an ARM / aarch64 / Grace / Grace Hopper node, "Exec format error", "cannot execute binary file", torch CUDA False on ARM, torchcodec on ARM, libavutil missing, FFmpeg on ARM, uv on ARM node, conda/virtualenv leakage to ARM node, "my pipeline works on login but fails on the ARM node". Do NOT trigger for: generic SLURM submission (use slurm-submission), non-interactive auth / HF token / wandb in batch (use slurm-submission), partition specs, preemption, QoS (use cluster-info), tmpfs / cgroup OOM / SIGKILL / hangs (use slurm-debug).
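An "Exec format error" on an ARM node typically means the binary was built for the login node's x86_64 architecture. The sketch below shows one way to check this before submitting a job, by reading the `e_machine` field of the ELF header. The helper names `elf_machine` and `binary_arch` are hypothetical and not part of the plugin; the two machine codes come from the ELF specification.

```python
import struct

# e_machine values defined by the ELF specification
ELF_MACHINES = {0x3E: "x86_64", 0xB7: "aarch64"}


def elf_machine(header: bytes) -> str:
    """Return the target architecture recorded in the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # EI_DATA (byte 5 of e_ident): 1 = little-endian, 2 = big-endian
    endian = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18 (after e_ident and e_type)
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    return ELF_MACHINES.get(machine, f"unknown(0x{machine:x})")


def binary_arch(path: str) -> str:
    """Hypothetical helper: report the architecture a binary was built for."""
    with open(path, "rb") as f:
        return elf_machine(f.read(20))
```

Comparing `binary_arch(sys.executable)` on the login node with the architecture of the target partition is one way to spot a virtualenv or conda interpreter that would fail on an aarch64 compute node.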

cluster-info

/cluster-info

Use when you need TWCC/NCHC cluster specs, partition info, QoS limits, pricing, or architecture details. Trigger on: sinfo, scontrol, partition, QoS, MinGPU, GPU type, SU billing, NTD cost, TWCC pricing, cluster identification, ARM vs x86, GB200 compatibility. This is the shared data layer: slurm-submission and slurm-debug both depend on it.
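As an illustration of the kind of partition lookup described above, here is a minimal sketch that parses the output of `sinfo -h -o "%P %a %G"` (partition, availability, generic resources). The function name `parse_sinfo` is hypothetical, and this is not necessarily how the skill gathers its data.

```python
def parse_sinfo(text: str) -> list[dict]:
    """Parse the output of: sinfo -h -o "%P %a %G" into one dict per line."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        partition, avail, gres = parts
        rows.append({
            # sinfo marks the default partition with a trailing "*"
            "partition": partition.rstrip("*"),
            "default": partition.endswith("*"),
            "avail": avail,
            "gres": gres,
        })
    return rows
```

The text can be obtained with `subprocess.run(["sinfo", "-h", "-o", "%P %a %G"], capture_output=True, text=True).stdout` on a host where `sinfo` is available.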

slurm-debug

/slurm-debug

Use when debugging SLURM job failures, hangs, crashes, or unexpected behavior on TWCC/NCHC HPC clusters. Trigger on: job hang, timeout, CUDA error, OOM, segfault, NCCL timeout, srun error, exit code, node drain, GPU utilization 0%, deadlock, job cancelled, ImportError in container, slow training, or any "my job isn't working" question. Do NOT trigger for: job submission (use slurm-submission), or partition/pricing queries (use cluster-info).
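Exit codes are often the first clue when a job dies. `sacct` reports `ExitCode` as `code:signal`, and a plain exit status above 128 conventionally means the process was killed by signal `status - 128`. The sketch below decodes both forms. The function name `decode_exit_code` is hypothetical and is not taken from the plugin.

```python
import signal


def decode_exit_code(exit_code: str) -> str:
    """Explain a sacct-style 'code:signal' ExitCode string."""
    code_str, _, sig_str = exit_code.partition(":")
    code, sig = int(code_str), int(sig_str or 0)
    # Shell convention: exit status 128+N means "killed by signal N"
    if sig == 0 and 128 < code < 128 + signal.NSIG:
        sig = code - 128
    if sig:
        try:
            name = signal.Signals(sig).name
        except ValueError:
            name = f"signal {sig}"
        return f"terminated by {name}"
    return "success" if code == 0 else f"exited with status {code}"
```

A SIGKILL result (for example from `0:9` or `137:0`) is consistent with an out-of-memory kill or a cancelled job, but it does not prove either; the scheduler and node logs are still needed to confirm the cause.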

slurm-submission

/slurm-submission

Use when submitting SLURM jobs, writing sbatch scripts, choosing GPU count, or reviewing completed jobs on TWCC/NCHC clusters. Trigger on: sbatch, job submission, GPU allocation, DDP, multi-GPU, wall time, job template, seff, post-job review, HuggingFace token in batch job, wandb login in batch, HF 401 in job, non-interactive auth. Do NOT trigger for: partition queries or pricing (use cluster-info), or job failures/hangs (use slurm-debug). If the target partition has aarch64/ARM compute nodes (currently any gb200-*, and any future ARM partition), ALSO load arm64-pipeline for aarch64-specific setup.
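To make the job-template idea concrete, here is a minimal sketch that renders a single-node sbatch script from a few parameters. The function name `render_sbatch` and the argument values in the usage note are hypothetical; a real TWCC/NCHC job may also need options that are not shown here, such as an account or project flag.

```python
def render_sbatch(job_name: str, partition: str, gpus: int,
                  wall_time: str, command: str) -> str:
    """Render a minimal single-node sbatch script as text."""
    if gpus < 1:
        raise ValueError("gpus must be at least 1")
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --partition={partition}",
        "#SBATCH --nodes=1",
        f"#SBATCH --gres=gpu:{gpus}",
        f"#SBATCH --time={wall_time}",    # HH:MM:SS wall-time limit
        "#SBATCH --output=%x-%j.out",     # %x = job name, %j = job ID
        "",
        "set -euo pipefail",
        f"srun {command}",
    ]
    return "\n".join(lines) + "\n"
```

For example, `render_sbatch("train", "gb200-dev", 2, "01:00:00", "python train.py")` produces a script that can be written to a file and passed to `sbatch`.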

verify-before-claiming

/verify-before-claiming

Use when the user asks to prove a bug claim by running code — "can you verify this?", "is that actually a bug?", "prove it", "run it and check". Agents often hallucinate bugs; this skill forces the claim through a falsification test — propose a patch, run HEAD and patch against the real entry point, compare outputs. No behavior change = claim invalidated. Do NOT trigger for general discussion or analysis where no bug claim is being tested.
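The falsification test described above can be sketched as a small comparison harness: run the current (HEAD) entry point and the patched one on the same inputs and record where their outputs differ. The function name `falsification_test` and the result keys are hypothetical; the actual skill runs real code from the repository rather than in-process callables.

```python
from typing import Any, Callable, Iterable


def falsification_test(head: Callable[[Any], Any],
                       patched: Callable[[Any], Any],
                       inputs: Iterable[Any]) -> dict:
    """Run HEAD and the patched entry point on the same inputs and compare."""
    differences = []
    for x in inputs:
        before, after = head(x), patched(x)
        if before != after:
            differences.append((x, before, after))
    # No behavior change on any input means the bug claim is not supported
    return {"claim_supported": bool(differences), "differences": differences}
```

If `differences` comes back empty, the patch changed nothing observable on the tested inputs, so the claimed bug is treated as invalidated for those inputs.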

Stats

Version: 1.4.1

Stars: 0

Maintenance: Excellent

License: MIT

Last Commit: May 15, 2026

Added: Apr 5, 2026

Available In

nchc-marketplace

README