From nemotron-speech
Routes NVIDIA Nemotron Speech (Riva) NIM tasks — deploys, runs, and tests ASR, TTS, and NMT NIMs on build.nvidia.com or self-hosted.
Slash command: `/nemotron-speech:nemotron-speech`
> **Note:** "Nemotron Speech" is the public-facing name for what NVIDIA documents today as **Riva** / **Riva NIM**. All commands, container images, gRPC APIs, Python imports, and documentation URLs still use **"Riva"** — the rename is brand-only. Do not rename commands, images, or doc URLs.
Agent: When walking the user through a multi-step workflow, announce each step before presenting it: Step N/M — Step Title (e.g., "Step 1/4 — Deploy the Container").
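The announcement format above can be sketched as a small helper. This is only an illustration: the function name `announce_step` is hypothetical, and the single title used comes from the example in the instruction.

```python
def announce_step(step: int, total: int, title: str) -> str:
    """Format a step announcement as 'Step N/M <dash> Step Title'."""
    if total < 1 or not 1 <= step <= total:
        raise ValueError(f"step {step} is outside 1..{total}")
    # \u2014 is the dash used in the announcement format quoted above.
    return f"Step {step}/{total} \u2014 {title.strip()}"


if __name__ == "__main__":
    print(announce_step(1, 4, "Deploy the Container"))
```

Validating the step number up front keeps an out-of-range announcement (for example "Step 5/4") from reaching the user.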
Single entry point for all NVIDIA Nemotron Speech (Riva) NIM workflows: ASR (speech-to-text), TTS (text-to-speech), and NMT (translation). Covers cloud-hosted inference via build.nvidia.com, self-hosted Docker deployment, client-protocol choice for ASR (gRPC, HTTP, WebSocket), custom NeMo model deployment via riva-build, ASR pipeline tuning (VAD, diarization, language models), and the prerequisite Docker / NGC / driver setup.
Use this skill for any Nemotron Speech / Riva NIM task — deployment, testing, custom model build, system requirements check, or model selection across ASR / TTS / NMT modalities.
Identify the user's task type, then load the corresponding reference file from references/. The reference files contain the detailed per-workflow content; this SKILL.md is a routing surface. Load only the reference relevant to the task at hand.
Prerequisites:

- Docker, NGC, and driver setup: see `references/setup.md`.
- `pip install -U nvidia-riva-client` and a valid `NVIDIA_API_KEY` from https://build.nvidia.com.
- Treat `NVIDIA_API_KEY` and `NGC_API_KEY` as secrets: never print, paste, commit, or log real key values. Prefer `--password-stdin` for Docker login, and store persistent keys in a credential manager or a `chmod 600` env file rather than in world-readable shell startup files.
- The host directory mounted at `/opt/nim/.cache` must be writable by the container user (the NIM container runs as `nvs:1000` internally), not just the host user. Run `sudo chown 1000:1000 $LOCAL_NIM_CACHE` after creating the directory so the container can write to it. Avoid world-writable modes, which let any local user replace cached model artifacts. Also avoid `-u "$(id -u):$(id -g)"` on `docker run`: `/opt/nim/workspace` inside the container isn't writable by arbitrary UIDs. If you see `I/O error Permission denied (os error 13)` during model download, the host directory ownership is the issue.

Routing by task:

- Prerequisite setup (Docker, NGC, driver): `references/setup.md`.
- System requirements and readiness checks: `references/deployment-readiness-checks.md`.
- Model selection: `references/model-selection.md`.
- ASR: `references/asr.md`.
- Custom NeMo model deployment (`.nemo` → RMIR → NIM): `references/asr-custom.md`.
- ASR pipeline tuning: `references/pipelines.md`.
- TTS: `references/tts.md`.
- NMT: `references/nmt.md`.

For per-release detail (current model catalog, container IDs, function IDs, voice lists, VRAM minimums, per-model feature support), fetch or open the canonical NVIDIA doc rather than relying on text in this SKILL.md or the references. Each reference file includes its own routing table to the relevant doc pages.
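The cache-directory requirement described above (owned by UID 1000, not world-writable) can be checked before starting a container. The following is a minimal sketch using only the Python standard library; the function name `cache_dir_problems` is hypothetical, and it assumes the host cache path is exported as `LOCAL_NIM_CACHE`, as in the `chown` command above.

```python
import os
import stat
from typing import List

NIM_UID = 1000  # container user stated in the text above (nvs:1000)


def cache_dir_problems(path: str, expected_uid: int = NIM_UID) -> List[str]:
    """Return a list of problems with a NIM cache directory (empty if none)."""
    if not os.path.isdir(path):
        return [f"{path} is not a directory"]
    info = os.stat(path)
    problems = []
    if info.st_uid != expected_uid:
        # Typical symptom at runtime: Permission denied (os error 13).
        problems.append(f"owner uid is {info.st_uid}, expected {expected_uid}")
    if info.st_mode & stat.S_IWOTH:
        # World-writable: any local user could replace cached artifacts.
        problems.append("directory is world-writable")
    return problems


if __name__ == "__main__":
    cache = os.environ.get("LOCAL_NIM_CACHE")
    if cache:
        for problem in cache_dir_problems(cache):
            print(f"{cache}: {problem}")
```

An empty result means the directory matches both conditions; it does not verify anything inside the container, such as `/opt/nim/workspace`.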
Top-level landing pages:
| Topic | URL |
|---|---|
| ASR support matrix | https://docs.nvidia.com/nim/speech/latest/reference/support-matrix/asr.html |
| TTS support matrix | https://docs.nvidia.com/nim/speech/latest/reference/support-matrix/tts.html |
| NMT support matrix | https://docs.nvidia.com/nim/speech/latest/reference/support-matrix/nmt.html |
| Prerequisites (driver / GPU / OS) | https://docs.nvidia.com/nim/speech/latest/get-started/prerequisites.html |
| ASR pipeline configuration | https://docs.nvidia.com/nim/speech/latest/asr/customization/pipeline-configuration.html |
| ASR runtime customization | https://docs.nvidia.com/nim/speech/latest/asr/customization/customization.html |
| Cloud function IDs (per model) | https://build.nvidia.com/<org>/<model>/api |
| NGC catalog | https://catalog.ngc.nvidia.com/orgs/nim/teams/nvidia/models |
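As an illustration of how a cloud function ID from the table above might be used, the sketch below builds gRPC metadata for the `nvidia-riva-client` package. Several things here are assumptions to verify against the model's API page on build.nvidia.com: the endpoint `grpc.nvcf.nvidia.com:443`, the `function-id` and `authorization` metadata keys, and the helper names `cloud_metadata` and `connect`, which are hypothetical. The function ID itself is per model and is deliberately left as a parameter.

```python
import os
from typing import List


def cloud_metadata(function_id: str, api_key: str) -> List[List[str]]:
    """Build the metadata pairs assumed for cloud-hosted inference."""
    if not function_id or not api_key:
        raise ValueError("function_id and api_key are both required")
    return [
        ["function-id", function_id],
        ["authorization", f"Bearer {api_key}"],
    ]


def connect(function_id: str, uri: str = "grpc.nvcf.nvidia.com:443"):
    """Create a riva.client.Auth for a cloud endpoint (assumed URI)."""
    # Imported lazily so the helper above works without the package installed.
    import riva.client  # provided by: pip install -U nvidia-riva-client

    # Read the key from the environment; never hard-code or log it.
    api_key = os.environ["NVIDIA_API_KEY"]
    return riva.client.Auth(
        uri=uri,
        use_ssl=True,
        metadata_args=cloud_metadata(function_id, api_key),
    )
```

Reading the key from the environment at call time keeps it out of source files and shell history, in line with the secret-handling guidance in this skill.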
"Deploy a Parakeet ASR NIM" → load references/asr.md, follow Option B (self-hosted), Steps 1–4.
"Synthesize speech with Magpie" → load references/tts.md, follow Option A (cloud) or Option B (self-hosted).
"Translate English to German" → load references/nmt.md, follow the 4-step flow.
"Convert my fine-tuned .nemo to a NIM" → load references/asr-custom.md for the 4-phase pipeline and references/pipelines.md for build-time config.
"Can my GPU run this?" → load references/deployment-readiness-checks.md and run the 6-step system check.
"Which Riva model should I use?" → load references/model-selection.md, apply the decision framework, then fetch the support matrix for the specific current model name.
Command-line tools (`riva-build`, `riva-deploy`, `riva_streaming_asr_client`), the Python client (`riva.client`), the gRPC namespace (`nvidia.riva.asr.*`), the container registry (`nvcr.io/nim/nvidia/*`), and all NVIDIA documentation URLs still use "Riva". Do not rename these in code, commands, or docs.

For task-specific runtime or modality issues, use the relevant reference file (`references/<task>.md`). Cross-cutting readiness checks:
- […] `references/deployment-readiness-checks.md` (system check + health check table)
- […] `references/deployment-readiness-checks.md`
- `docker pull` from `nvcr.io` returns 403 → `references/setup.md` (Step 5: Docker login)
- […] `references/asr-custom.md` (Phase 2 base image)
- […] `references/deployment-readiness-checks.md`, then verify on the support matrix
- […] `references/setup.md`)
- […] `NVIDIA_API_KEY` and internet access
- […] `riva.client`), gRPC services (`nvidia.riva.*`), and NVIDIA documentation URLs still use "Riva". Follow official docs and catalogs for naming; do not rename these in commands or code.
- `references/deployment-readiness-checks.md`
- `references/setup.md`
- `references/model-selection.md`
- `references/asr.md`, `references/tts.md`, or `references/nmt.md`
npx claudepluginhub nvidia-riva/nemotron-speech-skills --plugin nemotron-speech