NVIDIA DGX Spark integration for Claude Code
npx claudepluginhub jeremyeder/dgx-agentskills
A Claude Code plugin for integrating the NVIDIA DGX Spark into AI development workflows. It provides local model serving, GPU monitoring, VM management, and hybrid local+cloud inference, all accessible through skills, commands, and MCP tools within Claude Code.
cp .env.example .env
# Edit .env with your Spark's hostname and SSH user
./deploy/install.sh
This rsyncs the project to your Spark, builds the MCP server Docker container, and starts it.
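The three steps described above (sync, build, start) can be pictured as a short command plan. The sketch below is illustrative only: it prints the commands rather than running them, and the rsync flags, the remote directory, and the use of Docker Compose are all assumptions, not a description of what `deploy/install.sh` actually does.

```python
import shlex


def deploy_plan(host: str, user: str, remote_dir: str = "dgx-agentskills") -> list[list[str]]:
    """Return the commands a deploy could run, without executing them.

    remote_dir is relative to the SSH user's home directory (assumed layout).
    """
    target = f"{user}@{host}"
    return [
        # 1. Copy the project to the Spark (flags are an assumption).
        ["rsync", "-az", "--exclude", ".git", "./", f"{target}:{remote_dir}/"],
        # 2 + 3. Build the MCP server image and start it (Compose is an assumption).
        ["ssh", target, f"cd {remote_dir} && docker compose up -d --build"],
    ]


# Dry run: show what would be executed for the values from .env.
for cmd in deploy_plan("your-spark.local", "jeder"):
    print(shlex.join(cmd))
```

Printing the plan first is a cheap way to check that `SPARK_HOST` and `SPARK_USER` are what you expect before anything touches the Spark.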
Update .mcp.json with your Spark's hostname:
{
"mcpServers": {
"dgx-spark": {
"type": "http",
"url": "http://YOUR-SPARK-HOSTNAME.local:3100/mcp"
}
}
}
Replace YOUR-SPARK-HOSTNAME with your Spark's actual hostname (e.g., jeder-spark). If using Tailscale for remote access, use the Tailscale hostname instead (e.g., http://jeder-spark:3100/mcp).
Claude Code reads this file to discover the MCP server. Without it, skills like /spark-status and all spark_* MCP tools will be unavailable.
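If you switch between LAN and Tailscale access, generating the file can be less error-prone than editing it by hand. This is a minimal sketch that reproduces the structure shown above; the function name `mcp_config` and its `tailscale` flag are hypothetical helpers, not part of the plugin.

```python
import json


def mcp_config(hostname: str, port: int = 3100, tailscale: bool = False) -> dict:
    """Build the .mcp.json structure for a given Spark hostname."""
    # On the LAN the examples use the mDNS name (<hostname>.local);
    # over Tailscale they use the bare machine name.
    host = hostname if tailscale else f"{hostname}.local"
    return {
        "mcpServers": {
            "dgx-spark": {"type": "http", "url": f"http://{host}:{port}/mcp"}
        }
    }


# Print the LAN variant; redirect the output to .mcp.json to use it.
print(json.dumps(mcp_config("jeder-spark"), indent=2))
```

Calling `mcp_config("jeder-spark", tailscale=True)` yields the Tailscale URL from the example above instead.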
# Add the marketplace (one-time)
claude plugin marketplace add jeremyeder/dgx-agentskills
# Install the plugin
claude plugin install dgx-spark@dgx-agentskills --scope user
Or from within a Claude Code session:
/plugin marketplace add jeremyeder/dgx-agentskills
/plugin install dgx-spark@dgx-agentskills
/spark-status
/spark-models pull qwen3.5:32b
/spark-models serve Qwen/Qwen3-Coder-Next --vllm
/spark-switch local
| Skill | Description |
|---|---|
| spark-setup | Reproducible provisioning from scratch or after factory reset |
| spark-models | Model lifecycle management across Ollama and vLLM |
| spark-hybrid | Configure Claude Code to use Spark as model backend |
| spark-vpn | Tailscale VPN setup for remote access |
| spark-vms | KVM/QEMU virtual machine management |
| Command | Description |
|---|---|
| /spark-status | Quick health check: system, GPU, models, VPN |
| /spark-models [action] [model] | List, pull, serve, stop, or recommend models |
| /spark-switch [mode] | Toggle between local, cloud, and hybrid backends |
| Tool | Description |
|---|---|
| spark_get_status | System overview: uptime, CPU, memory, disk |
| spark_gpu_utilization | GPU memory, compute %, temperature, power |
| spark_list_models | All models across Ollama and vLLM |
| spark_pull_model | Pull a model via Ollama |
| spark_start_model | Start a vLLM container with tool-calling support |
| spark_stop_model | Stop a model container |
| spark_list_containers | All Docker containers on Spark |
| spark_container_logs | Tail container logs |
| spark_vpn_status | Tailscale connection state and peers |
| spark_health_check | MCP server health with latency |
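
Claude Code invokes these tools for you, but it can help to see what a call looks like on the wire. MCP uses JSON-RPC 2.0, and a tool invocation is a `tools/call` request. The sketch below only builds the request body; it assumes the tool takes no arguments (the tools' input schemas are not documented here), and a real client must complete the MCP `initialize` handshake before sending it.

```python
import json
from itertools import count
from typing import Optional

# JSON-RPC request ids must be unique within a session.
_request_ids = count(1)


def tool_call_request(name: str, arguments: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request for an MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# Body a client would POST to the /mcp endpoint after initializing a session.
print(json.dumps(tool_call_request("spark_gpu_utilization"), indent=2))
```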
.env (repo root)

| Variable | Description | Default |
|---|---|---|
| SPARK_MCP_URL | MCP server URL | http://your-spark.local:3100 |
| SPARK_MCP_URL_TAILSCALE | MCP URL via Tailscale | http://your-spark:3100 |
| SPARK_HOST | Spark hostname for SSH | your-spark.local |
| SPARK_USER | SSH username | jeder |
| SPARK_VLLM_ENDPOINT | vLLM API endpoint | http://your-spark.local:8000 |
| SPARK_OLLAMA_ENDPOINT | Ollama API endpoint | http://your-spark.local:11434 |
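
To make the role of the defaults concrete, here is a small sketch of how a script could read the repo-root `.env` and fall back to the values in the table. The parser is deliberately simple (`KEY=VALUE` lines, `#` comments, optional quotes) and is an assumption about the file format, not the plugin's own loader.

```python
# Defaults copied from the table above.
DEFAULTS = {
    "SPARK_MCP_URL": "http://your-spark.local:3100",
    "SPARK_MCP_URL_TAILSCALE": "http://your-spark:3100",
    "SPARK_HOST": "your-spark.local",
    "SPARK_USER": "jeder",
    "SPARK_VLLM_ENDPOINT": "http://your-spark.local:8000",
    "SPARK_OLLAMA_ENDPOINT": "http://your-spark.local:11434",
}


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes if the value was quoted.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_settings(text: str) -> dict[str, str]:
    """Overlay values from the .env text on top of the documented defaults."""
    return {**DEFAULTS, **parse_env(text)}


settings = load_settings("# my Spark\nSPARK_HOST=jeder-spark.local\n")
print(settings["SPARK_HOST"], settings["SPARK_USER"])
```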
.env (deployed to ~/dgx-agentskills/.env)

| Variable | Description | Default |
|---|---|---|
| MCP_PORT | MCP server port | 3100 |
| OLLAMA_HOST | Ollama API address | localhost:11434 |
| VLLM_IMAGE | vLLM container image | nvcr.io/nvidia/vllm:latest |
| VLLM_PORT | vLLM serving port | 8000 |
| VLLM_GPU_MEMORY_UTILIZATION | GPU memory fraction for vLLM | 0.7 |
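
As a rough guide to what `VLLM_GPU_MEMORY_UTILIZATION` means in practice, the fraction can be turned into an absolute budget. The arithmetic below assumes a Spark with 128 GB of unified memory; check your own unit, and treat the result as an upper bound on what vLLM may reserve for weights, activations, and KV cache rather than a measured figure.

```python
def vllm_memory_budget_gb(total_gb: float, utilization: float = 0.7) -> float:
    """Memory vLLM may reserve, given the total and the configured fraction."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in the range (0, 1]")
    return total_gb * utilization


total = 128.0  # assumed unified memory size in GB
budget = vllm_memory_budget_gb(total)
print(f"vLLM budget: {budget:.1f} GB, left for everything else: {total - budget:.1f} GB")
```

With the default of 0.7, roughly 30% of memory stays available for Ollama, the MCP server container, and the operating system; raising the fraction trades that headroom for a larger model or longer context.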
Mac (Claude Code)
│
├── Plugin (skills, commands, hooks)
│   └── .mcp.json → HTTP → DGX Spark MCP Server (:3100)
│
└── Claude Code session
    └── ANTHROPIC_BASE_URL → DGX Spark vLLM (:8000)

DGX Spark (your-spark.local)
├── MCP Server (Docker container, port 3100)
│   ├── nvidia-smi (GPU metrics)
│   ├── docker CLI (container management)
│   ├── ollama CLI (model management)
│   └── tailscale CLI (VPN status)
├── Ollama (host, port 11434)
├── vLLM (Docker container, port 8000)
└── Tailscale (mesh VPN)
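
The diagram shows that the inference path is selected by `ANTHROPIC_BASE_URL`. The sketch below illustrates that idea for the local and cloud modes only: it assumes local mode points the variable at the vLLM port and cloud mode leaves it unset. How `/spark-switch` actually implements the toggle, and what hybrid mode sets, is not described here, so `backend_env` is a hypothetical helper.

```python
import os
from typing import Optional


def backend_env(
    mode: str,
    spark_host: str = "your-spark.local",
    vllm_port: int = 8000,
    base: Optional[dict] = None,
) -> dict:
    """Return a copy of the environment adjusted for the chosen backend."""
    env = dict(os.environ if base is None else base)
    if mode == "local":
        # Route the Claude Code session to vLLM on the Spark.
        env["ANTHROPIC_BASE_URL"] = f"http://{spark_host}:{vllm_port}"
    elif mode == "cloud":
        # Unset the override so the default cloud endpoint is used.
        env.pop("ANTHROPIC_BASE_URL", None)
    else:
        raise ValueError(f"unsupported mode in this sketch: {mode!r}")
    return env


print(backend_env("local", base={})["ANTHROPIC_BASE_URL"])
```

The returned dictionary can be passed as `env=` to `subprocess.run` when launching a session, which keeps the change scoped to that process instead of your whole shell.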
# Bootstrap dev environment
./scripts/setup-dev.sh
# Run tests
cd mcp-server && npm test
# Run linting
./scripts/lint.sh