From vanguard-frontier-agentic
Reviews NVIDIA GPU Operator deployments on Kubernetes for hardening: device plugin, MIG manager, NFD labels, time-sliced GPU config, securityContext, namespace tenancy, and admission policies.
How this skill is triggered — by the user, by Claude, or both
Slash command
/vanguard-frontier-agentic:nvidia-gpu-operator-kubernetes-hardeningThis skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Review NVIDIA GPU Operator deployments on Kubernetes against NVIDIA documentation and Kubernetes pod-security hardening: device plugin posture, MIG manager configuration, time-sliced GPU configuration, NFD label usage, container toolkit isolation, securityContext posture for GPU workloads, and admission policy enforcement of GPU resource requests.
Review NVIDIA GPU Operator deployments on Kubernetes against NVIDIA documentation and Kubernetes pod-security hardening: device plugin posture, MIG manager configuration, time-sliced GPU configuration, NFD label usage, container toolkit isolation, securityContext posture for GPU workloads, and admission policy enforcement of GPU resource requests.
kubectl -n gpu-operator get pods, kubectl get clusterpolicy, kubectl get nodes -L nvidia.com/gpu.product, MIG profile annotations) when the active client exposes it; otherwise fall back to NVIDIA GPU Operator documentation and sanitized manifests.privileged: true outside of the GPU Operator's own DaemonSets as a critical finding — privilege creep across tenant workloads.single strategy on multi-tenant clusters when mixed is required as a high finding — partition diversity is impossible.nvidia.com/gpu requests by namespace as a high finding for multi-tenant clusters.Return, at minimum:
npx claudepluginhub raishin/vanguard-frontier-agentic --plugin vanguard-frontier-agenticReviews day-2 operations of NVIDIA GPU fleets: DCGM exporter/diag posture, GPU telemetry into Prometheus/Grafana, MIG partitioning lifecycle, Xid error runbooks, fleet upgrades, and incident response for GPU-failure modes.
Audits Kubernetes clusters against OWASP Kubernetes Top 10 (2022) vulnerability classes using kubectl commands and kube-bench, with remediation guidance.
Enforces least-privilege RBAC and secure runtime configuration for Kubernetes Operators. Use when building, reviewing, or auditing Operator manifests, ClusterRoles, Roles, OLM bundles, or CRD definitions.