Skill

vllm-ascend

vLLM-Ascend serving toolchain. Use when installing vLLM on Ascend NPUs, running offline inference, launching a model as an OpenAI-compatible API server, tuning throughput/latency for a specific serving scenario, or contributing to the vllm-ascend project. Trigger whenever the user discusses vLLM deployment, vLLM errors, serving a model on Ascend, or wants to get inference running before evaluation.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/vllm-ascend:vllm-ascend install / run / serve / contribute

User invocable

Model invocable

Inline context

Default effort

Argument hint: install / run / serve / contribute

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Handles vLLM-Ascend installation, running, performance tuning, and contribution workflow.

Supporting Files

vllm-contribute.md
vllm-install.md
vllm-run.md

SKILL.md

40 lines · ~485 tokens

Stats

Language: Python

Parent stars: 0

Maintenance: Fair

Last Commit: Apr 29, 2026

vLLM-Ascend

Handles vLLM-Ascend installation, running, performance tuning, and contribution workflow.

Prerequisites

Before any vLLM task:

NPU hardware check — use /ascend to verify NPUs are healthy and free (npu-smi info)
Model on disk — use /model-download. Never pass an online model ID to vLLM; always use a local path.
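
Both prerequisites can be checked mechanically before launching anything. The following is a minimal sketch, assuming a bash-like shell; `require_local_model` is a hypothetical helper name (it is not part of vLLM or of the /ascend skill) that accepts only an existing local directory, and the commented usage lines show where `npu-smi info` fits.

```shell
# Hypothetical helper: succeed only when the argument is an existing local
# directory. An online model ID such as "org/model-name" is rejected unless a
# directory with that exact relative path happens to exist.
require_local_model() {
    local model_path="$1"
    if [ -d "$model_path" ]; then
        echo "local model directory: $model_path"
        return 0
    fi
    echo "not a local directory: $model_path" >&2
    return 1
}

# Usage (commented out so the snippet has no side effects when sourced):
# npu-smi info                                   # inspect NPU health and load first
# require_local_model /path/to/local/model || exit 1
```

Failing fast here avoids discovering a bad model path only after the engine has started to initialize.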

Task Specifics

Installation: vllm-install.md — version-pinning between vllm and vllm-ascend, editable install
Running & Tuning: vllm-run.md — scenario interview → offline validation → graph mode → online serving
Contributing: vllm-contribute.md — DCO signature requirement, PR description template
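
A DCO requirement is conventionally met with git's `-s` (`--signoff`) flag, which appends a `Signed-off-by:` trailer built from `user.name` and `user.email`. The sketch below assumes that convention applies here; `has_dco_signoff` is a hypothetical helper, and vllm-contribute.md remains the authority on the exact rules.

```shell
# Hypothetical helper: succeed when a commit message contains a
# "Signed-off-by: Name <email>" trailer line.
has_dco_signoff() {
    printf '%s\n' "$1" | grep -Eq '^Signed-off-by: .+ <.+@.+>$'
}

# Typical usage inside a repository (commented out to avoid side effects):
# git commit -s -m "Fix typo in docs"      # -s adds the Signed-off-by trailer
# has_dco_signoff "$(git log -1 --format=%B)" || echo "missing sign-off" >&2
```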

After Serving

Once the API server is up, use /aisbench to run accuracy or performance benchmarks against it.
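
Before pointing a benchmark at the server, it can help to confirm that it answers requests. This is a minimal sketch, assuming the server was started through vLLM's OpenAI-compatible entrypoint on its default port 8000 and that `curl` is installed; `models_endpoint` and `server_is_up` are hypothetical helper names.

```shell
# Build the URL of the OpenAI-compatible model listing route.
# The base URL defaults to http://localhost:8000; a trailing slash is removed.
models_endpoint() {
    local base_url="${1:-http://localhost:8000}"
    printf '%s/v1/models\n' "${base_url%/}"
}

# Succeed when the server answers the model listing route within 5 seconds.
server_is_up() {
    curl --silent --fail --max-time 5 "$(models_endpoint "${1:-}")" >/dev/null
}

# Usage (commented out; requires a running server):
# server_is_up || { echo "API server not reachable" >&2; exit 1; }
```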

Run Commands via Shell Script

All vLLM commands (offline inference, online serving) must be saved to a shell script and executed through it so output is captured in a timestamped log file. See the template in /ascend → "Common Requirement: Run via Shell Script with Log Output".
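
The referenced template lives in /ascend and is not reproduced here. As an illustration of the idea only, the sketch below wraps an arbitrary command so that stdout and stderr land in a timestamped log file; the `LOG_DIR` default, the file naming scheme, and the `run_logged` name are all assumptions.

```shell
#!/usr/bin/env bash
# Sketch of a logging wrapper; not the /ascend template itself.
# pipefail makes run_logged return the wrapped command's status, not tee's.
set -o pipefail

# Hypothetical log location; override by exporting LOG_DIR.
LOG_DIR="${LOG_DIR:-./logs}"

run_logged() {
    local name="$1"
    shift
    mkdir -p "$LOG_DIR"
    local log_file="$LOG_DIR/${name}_$(date +%Y%m%d_%H%M%S).log"
    echo "logging to $log_file"
    # Capture stdout and stderr in the log while still showing them live.
    "$@" 2>&1 | tee "$log_file"
}

# Example (commented out; replace the path with a local model directory):
# run_logged serve vllm serve /path/to/local/model
```

Keeping the command in a script also leaves a record of the exact flags used for each run, which makes later comparisons between logs easier.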

Core Tips

All toolkits (vllm, vllm-ascend) are installed in editable mode. Run pip show <package> to find the source directory before modifying or referencing them.
Before any debugging session, create a new git branch to isolate changes:
```
git checkout -b debug/TOPIC
```
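
For the first tip, the source directory of an editable install can be read from the `pip show` output. A minimal sketch, assuming pip 21.3 or newer, which reports an `Editable project location:` field for editable installs; `parse_editable_location` is a hypothetical helper name.

```shell
# Read `pip show` output on stdin and print the editable source directory.
# Prints nothing when the package is not installed in editable mode.
parse_editable_location() {
    sed -n 's/^Editable project location: //p'
}

# Usage (commented out; requires the package to be installed):
# pip show vllm-ascend | parse_editable_location
```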
