Skill

nvidia-cuda-kernel-performance-review

Statically reviews CUDA C/C++ kernel sources against NVIDIA's performance guidance: memory coalescing, bank conflicts, warp divergence, occupancy, register pressure, and stream concurrency. Emits exact Nsight Compute/Systems commands for runtime confirmation.

C++

performance

Popularity

Stars

Forks

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/vanguard-frontier-agentic:nvidia-cuda-kernel-performance-review

User invocable

Model invocable

Inline context

Default effort

Tool Access

This skill is limited to the following tools:

ReadGrepGlob

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Static review of CUDA C/C++ kernels for memory coalescing, shared-memory bank conflicts, occupancy, register pressure, and stream concurrency against NVIDIA's official CUDA Programming and Best Practices Guides. This skill is doc-anchored: it grounds review findings in NVIDIA's published documentation rather than in a certification blueprint, because no NVIDIA certification currently covers thi...

Supporting Files

metadata.json

SKILL.md

37 lines · ~736 tokens

Stats

LanguagePython

Stars18

Forks2

MaintenanceExcellent

Last CommitJun 15, 2026

Actions

View Source View Plugin View on GitHub View README