From coreweave-pack
Deploys AI inference services on CoreWeave Kubernetes using Helm charts and Kustomize for GPU scaling and multi-model setups.
How this skill is triggered — by the user, by Claude, or both
Slash command
/coreweave-pack:coreweave-deploy-integrationThis skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
```yaml
# helm/values.yaml
replicaCount: 2
image:
repository: vllm/vllm-openai
tag: latest
gpu:
type: A100_PCIE_80GB
count: 1
memory: 48Gi
model:
name: meta-llama/Llama-3.1-8B-Instruct
autoscaling:
enabled: true
minReplicas: 1
maxReplicas: 5
targetConcurrency: 2
helm install my-inference ./helm -f values-prod.yaml
helm upgrade my-inference ./helm -f values-prod.yaml
k8s/
├── base/
│ ├── deployment.yaml
│ ├── service.yaml
│ └── kustomization.yaml
├── overlays/
│ ├── dev/
│ │ ├── gpu-patch.yaml # L40 GPU for dev
│ │ └── kustomization.yaml
│ └── prod/
│ ├── gpu-patch.yaml # A100/H100 for prod
│ ├── replicas-patch.yaml
│ └── kustomization.yaml
kubectl apply -k k8s/overlays/prod/
For event monitoring, see coreweave-webhooks-events.
npx claudepluginhub jeremylongshore/claude-code-plugins-plus-skills --plugin coreweave-packDeploys KServe InferenceService on CoreWeave Kubernetes for GPU ML model serving with vLLM, autoscaling, scale-to-zero, and A100 affinity.
Deploys vLLM OpenAI-compatible server to Kubernetes with GPU support, health probes, and services via YAML templates. Checks HF token secret and existing deployments before applying.
Creates, edits, and optimizes skills for Claude Code, including drafting, evaluating with test prompts, iterating on performance, and improving skill descriptions for better triggering accuracy.