Skill

dynamo-router-starter

Starts or patches Dynamo router modes (round-robin, KV, least-loaded, device-aware) and runs endpoint smoke checks. Useful for bring-up and mode comparison.

Python

backend

devops

Popularity

Shared by

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/nvidia-skills:dynamo-router-starter

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

<!--

Supporting Files

BENCHMARK.mdevals/evals.jsonreferences/router-modes.mdscripts/check_router_health.pyskill-card.mdskill.oms.sig

SKILL.md

175 lines · ~1.4k tokens

Stats

LanguagePython

Parent stars0

MaintenanceExcellent

Last CommitJun 6, 2026

Actions

View Source View Plugin View on GitHub View README

Dynamo Router Starter

Purpose

Make Dynamo routing feel easy by getting a baseline router mode running, enabling KV-aware routing when appropriate, and proving the endpoint works. Keep the user focused on exact commands and success signals, not router internals.

Prerequisites

Python 3.10+ with the dynamo package importable (python3 -m dynamo.frontend --help works).
For Kubernetes runs: kubectl configured with access to the target namespace and a deployed Dynamo recipe.
Network reachability to the frontend service (port-forward or direct).
A model already loaded into at least one worker (/v1/models returns at least one entry).

Required Inputs

Collect or infer:

local Python/CLI or Kubernetes recipe path
desired mode: round-robin, kv, least-loaded, device-aware-weighted, direct, or random
frontend port or Kubernetes frontend service
whether workers publish KV events; if not, use approximate KV mode
model name for smoke requests, if /v1/models cannot discover it

Instructions

1. Establish A Baseline

For local bring-up with already registered workers:

python3 -m dynamo.frontend --router-mode round-robin --http-port 8000

For Kubernetes, inspect the selected recipe deploy.yaml and locate the frontend service. If the recipe is not already deployed, use dynamo-recipe-runner first.

2. Enable KV Routing

For local frontend:

python3 -m dynamo.frontend --router-mode kv --http-port 8000

For Kubernetes, patch only the frontend service env:

envs:
  - name: DYN_ROUTER_MODE
    value: kv

If backend workers are not publishing KV cache events, set approximate mode instead of leaving the router waiting for events:

envs:
  - name: DYN_ROUTER_USE_KV_EVENTS
    value: "false"

3. Smoke Test

After port-forwarding the frontend service or starting local frontend, run:

python3 scripts/check_router_health.py \
  --base-url http://127.0.0.1:8000

This must verify /v1/models and, when a model is discoverable, one /v1/chat/completions request.

4. Compare Modes Carefully

When comparing round-robin vs KV routing:

use the same model, workers, prompt set, concurrency, and sampling settings
send repeated-prefix prompts if demonstrating KV reuse
label the result as a smoke comparison unless enough benchmark samples were collected
do not claim throughput improvement from a single chat request

If the endpoint is unhealthy or workers are missing, switch to dynamo-troubleshoot.

Available Scripts

Script	Purpose	Arguments
`scripts/check_router_health.py`	Smoke-test `/v1/models` and one chat completion against a Dynamo frontend	`--base-url`, `--retries`, `--timeout`

Invoke via the agentskills.io run_script() protocol:

run_script("scripts/check_router_health.py", args=["--base-url", "http://127.0.0.1:8000"])

Examples

Local KV-routed frontend on port 8000, then smoke-test it:

python3 -m dynamo.frontend --router-mode kv --http-port 8000 &
python3 scripts/check_router_health.py --base-url http://127.0.0.1:8000

Kubernetes-deployed frontend reachable via port-forward:

kubectl port-forward svc/qwen-vllm-disagg-frontend 8000:8000 -n dynamo-demo &
python3 scripts/check_router_health.py --base-url http://127.0.0.1:8000 --retries 3

Equivalent through the agent protocol:

run_script("scripts/check_router_health.py", args=["--base-url", "http://127.0.0.1:8000", "--retries", "3"])

Output Contract

Return:

mode selected and why
local command or Kubernetes env patch
frontend service or URL
smoke-test result
any limitation, such as approximate KV mode or missing worker KV events
next command to run for a fuller comparison

Limitations

Smoke test is one chat completion; it is not a benchmark. Use dynamo-benchmark for throughput/latency numbers.
KV-aware mode without worker KV-event publication degrades to approximate mode; this skill flags but does not fix the underlying worker config.
Mode comparisons require matched workloads; cross-mode latency claims need separate benchmark runs.

Troubleshooting

Symptom	Likely cause	Next step
`/v1/models` returns empty list	No worker registered with the frontend	Verify worker pods are Ready; confirm they connect to the same etcd/NATS
Smoke chat request times out	Frontend up, workers not serving	Switch to `dynamo-troubleshoot`; inspect worker logs
KV mode hangs	Workers do not publish KV cache events	Set `DYN_ROUTER_USE_KV_EVENTS=false` (approximate mode)
Connection refused on port-forward	Port-forward dropped or wrong service name	Re-run port-forward; verify the frontend service name matches the recipe

Benchmark

See BENCHMARK.md for the NVCARPS-EVAL performance report (auto-generated by the NVSkills CI pipeline). To refresh, re-run /nvskills-ci on an upstream PR touching this skill.

References

Read references/router-modes.md for the compact mode/env map.
Use scripts/check_router_health.py for endpoint smoke tests.

dynamo-router-starter

Popularity

Invocation

Context Preview

Supporting Files

SKILL.md

dynamo-router-starter

Popularity

Invocation

Context Preview

Supporting Files

SKILL.md

Dynamo Router Starter

Purpose

Prerequisites

Required Inputs

Instructions

1. Establish A Baseline

2. Enable KV Routing

3. Smoke Test

4. Compare Modes Carefully

Available Scripts

Examples

Output Contract

Limitations

Troubleshooting

Benchmark

References

Similar Skills

Dynamo Router Starter

Purpose

Prerequisites

Required Inputs

Instructions

1. Establish A Baseline

2. Enable KV Routing

3. Smoke Test

4. Compare Modes Carefully

Available Scripts

Examples

Output Contract

Limitations

Troubleshooting

Benchmark

References

Similar Skills