From Kubernetes Operator
Use when the user asks about Kubernetes, k8s, kubectl, Helm, GitOps, Flux, Argo CD, kustomize, manifests, pods, deployments, services, ingress, Gateway API, RBAC, NetworkPolicy, Pod Security, CRDs, operators, cluster operations, or pasted Kubernetes YAML/errors.
Slash command
/kubernetes-operator:kubernetes-operator
Help the user work with Kubernetes: answer questions accurately, write correct manifests, build kubectl commands, and debug systematically.
- Prefer kubectl explain <type>.<path> --recursive, kubectl api-resources, and kubectl version over memory: they are generated from the running server's OpenAPI and cannot be stale. If no cluster is reachable, say which Kubernetes version your answer assumes.
- Inspect with get, describe, logs, and events before proposing any mutation. When you do mutate, preview first: kubectl diff -f or --dry-run=server (server-side dry-run runs real admission/validation).
- Combining kubectl apply with edit/set/patch on the same object silently loses fields on the next apply (three-way merge against last-applied). If the cluster is GitOps-managed (Flux/Argo CD field managers visible in managedFields), hand-edits get reverted at the next reconcile; direct the change to the source repo instead.
- When unsure of a flag or field, check kubectl <cmd> --help or kubectl explain. Cite the exact command you verified.
- delete --grace-period=0 --force, drain, taints with NoExecute, and namespace deletion all have blast radius. State the consequence before giving the command, and prefer the gentler alternative (e.g. rollout restart over deleting pods, cordon before drain).

Work top-down: kubectl get (what's wrong) → kubectl describe (events tell you why) → kubectl logs (app's view) → deeper tools. Most answers are in describe events.
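The read-first, preview-before-mutate order described above can be sketched as two small shell helpers. The function names triage_pod and preview_apply are hypothetical (they are not part of kubectl), and the sketch assumes kubectl is on PATH and pointed at the intended context.

```shell
# Hypothetical helpers (not part of kubectl): read-only triage first,
# then a preview of a change before anything is applied.

# triage_pod NAMESPACE POD: walk get -> describe -> logs, top-down.
triage_pod() {
  ns="$1"; pod="$2"
  kubectl get pod "$pod" -n "$ns" -o wide     # what is wrong
  kubectl describe pod "$pod" -n "$ns"        # events usually say why
  kubectl logs "$pod" -n "$ns" --tail=50      # the app's own view
}

# preview_apply FILE: show the diff, then run a server-side dry-run.
# Neither command persists a change to the cluster.
preview_apply() {
  file="$1"
  # kubectl diff exits 1 when differences exist; "|| true" keeps a
  # "set -e" script going (it also hides real errors, which exit >1).
  kubectl diff -f "$file" || true
  kubectl apply -f "$file" --dry-run=server
}
```

A typical call would be triage_pod <ns> <pod> followed, only once the cause is understood, by preview_apply on the changed manifest.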
Pod Pending — it hasn't been scheduled. kubectl describe pod events show the filter that failed: insufficient CPU/memory (check kubectl top nodes and pod requests), unsatisfiable nodeSelector/affinity, untolerated taints (kubectl describe node | grep -A3 Taints), or an unbound PVC (kubectl get pvc — Pending PVC = StorageClass/provisioner issue, or WaitForFirstConsumer deadlock).
CrashLoopBackOff — container starts then dies; backoff doubles 10s→5min. Sequence:
1. kubectl logs <pod> --previous: the crashed container's output (current logs are the new attempt, often empty).
2. kubectl describe pod → Last State / exit code: 137 = OOMKilled (raise memory limit) or SIGKILL after grace period; 1/2 = app error; 126/127 = bad command/missing binary.
3. Liveness probe killing a slow-starting container: raise initialDelaySeconds, not more retries.
4. Missing ConfigMap or Secret reference: describe shows CreateContainerConfigError.

ImagePullBackOff / ErrImagePull: describe events contain the registry error verbatim: typo'd image/tag, missing imagePullSecrets (private registry: needs a kubernetes.io/dockerconfigjson secret referenced in the pod spec), or wrong architecture.
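The exit-code reading from the CrashLoopBackOff sequence can be captured in a small lookup. This is a minimal sketch: explain_exit_code is a hypothetical helper name, and it covers only the codes listed above.

```shell
# explain_exit_code CODE: map a container exit code to the likely cause,
# using only the codes discussed in the CrashLoopBackOff sequence.
explain_exit_code() {
  case "$1" in
    137)     echo "OOMKilled or SIGKILL after the grace period" ;;
    126|127) echo "bad command or missing binary" ;;
    1|2)     echo "application error" ;;
    *)       echo "not covered here; check logs --previous" ;;
  esac
}

# Against a live cluster the code could be read with (not run here):
#   kubectl get pod <pod> -o \
#     jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}'
```

For 137, the Reason field in describe output distinguishes an OOM kill from a SIGKILL sent after the grace period.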
Service not reachable — almost always selector/port mismatch or no ready endpoints:
1. kubectl get endpointslices -l kubernetes.io/service-name=<svc>: if the result is empty, the Service selector matches no Ready pods. Compare spec.selector to pod labels; check pod readiness (a Ready=False pod is removed from endpoints by design).
2. Trace the port chain: port → targetPort → containerPort. targetPort must match what the app actually listens on (test with kubectl exec <pod> -- wget -qO- localhost:<port>).
3. Check DNS from inside the cluster: kubectl run -it --rm dbg --image=busybox:1.36 --restart=Never -- nslookup <svc>.<ns>.svc.cluster.local.
4. Check for NetworkPolicies that block the traffic: kubectl get netpol -n <ns>.

Stuck Terminating: a finalizer isn't being cleared: kubectl get <obj> -o jsonpath='{.metadata.finalizers}'. Fix the controller responsible; patching the finalizer away is last resort (orphans external resources).
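For the Service checks above, the selector, port mapping, and endpoints can be gathered in one pass. The helper below is a sketch under assumptions: check_service is a hypothetical name, and it only reads from the cluster.

```shell
# check_service NAMESPACE SERVICE: read-only view of the pieces that
# must line up for a Service to route traffic.
check_service() {
  ns="$1"; svc="$2"
  # Selector the Service uses to pick pods.
  kubectl get svc "$svc" -n "$ns" -o jsonpath='{.spec.selector}{"\n"}'
  # port -> targetPort pairs declared on the Service.
  kubectl get svc "$svc" -n "$ns" \
    -o jsonpath='{range .spec.ports[*]}{.port}{" -> "}{.targetPort}{"\n"}{end}'
  # Backing endpoints; an empty list means no Ready pod matches the selector.
  kubectl get endpointslices -n "$ns" -l "kubernetes.io/service-name=$svc"
  # Pod labels, to compare against the selector by eye.
  kubectl get pods -n "$ns" --show-labels
}
```

The comparison of selector to labels is left to the reader on purpose; automating it would require parsing that this sketch does not attempt.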
OOMKilled / throttling — memory over limit = kill; CPU over limit = throttle (slow, not dead). kubectl top pod --containers vs limits. QoS matters under node pressure: BestEffort (no requests) is evicted first — always set requests.
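To illustrate why requests matter for eviction order, the sketch below derives the QoS class for a single-container pod from its requests and limits. qos_class is a hypothetical helper; it compares quantities as literal strings (so 500m and 0.5 would not be treated as equal) and ignores multi-container pods.

```shell
# qos_class CPU_REQ CPU_LIM MEM_REQ MEM_LIM
# Pass an empty string for any value that is not set.
qos_class() {
  cpu_req="$1"; cpu_lim="$2"; mem_req="$3"; mem_lim="$4"
  # An unset request defaults to the limit when a limit is present.
  [ -z "$cpu_req" ] && cpu_req="$cpu_lim"
  [ -z "$mem_req" ] && mem_req="$mem_lim"
  if [ -z "$cpu_req$cpu_lim$mem_req$mem_lim" ]; then
    echo "BestEffort"      # nothing set: first to be evicted
  elif [ -n "$cpu_lim" ] && [ -n "$mem_lim" ] &&
       [ "$cpu_req" = "$cpu_lim" ] && [ "$mem_req" = "$mem_lim" ]; then
    echo "Guaranteed"      # requests equal limits for CPU and memory
  else
    echo "Burstable"       # anything in between
  fi
}
```

For example, qos_class 100m "" 128Mi 256Mi describes the common pattern of requests plus a memory limit and no CPU limit, which lands in Burstable.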
Node NotReady — kubectl describe node: pressure conditions (Memory/Disk/PID), kubelet heartbeats (Lease objects), then node-level inspection via kubectl debug node/<name> -it --image=busybox (host filesystem at /host).
Apply this to every manifest you produce or review; each item prevents a real production failure mode:
- Use stable API versions (apps/v1, batch/v1, networking.k8s.io/v1); verify against kubectl api-resources if a cluster is available; beta APIs get removed on upgrades.
- Pin image tags or digests: :latest makes rollbacks and node cache behavior nondeterministic.
- Set requests (scheduling + QoS); set memory limits (OOM protection); CPU limits optional (throttling tradeoff).
- Labels: app.kubernetes.io/name|instance|component used by both selector and Service; selectors are immutable on Deployments, so choose once.
- securityContext: runAsNonRoot: true, allowPrivilegeEscalation: false, readOnlyRootFilesystem where possible, drop capabilities; needed for restricted Pod Security level (full pattern in references/security.md).
- Use a dedicated ServiceAccount, not the default SA (RBAC granted to it leaks to every pod); automountServiceAccountToken: false if the app doesn't use the Kubernetes API.
- Graceful shutdown: preStop sleep for connection draining; terminationGracePeriodSeconds sized to real shutdown time.

| File | Read when the task involves |
|---|---|
| references/kubectl.md | kubectl internals: apply's three-way merge, server-side apply flags, kubectl debug modes, JSONPath/custom-columns scripting, output formats, plugins, scripting conventions |
| references/workloads-scheduling.md | Pod lifecycle details (phases, conditions, probe tuning, termination), workload controller behavior (Deployment rollouts, StatefulSets, Jobs), scheduling (affinity, taints, topology spread, priority/preemption, PDBs, eviction) |
| references/networking-storage.md | Service types and DNS, EndpointSlices, Ingress vs Gateway API, NetworkPolicy semantics, PV/PVC lifecycle, StorageClasses, access modes, CSI, ConfigMaps/Secrets consumption details |
| references/api-machinery.md | API groups/versioning/deprecation, ObjectMeta semantics (resourceVersion, generation, finalizers, ownerReferences), watches, server-side apply field ownership, the apiserver request path (authn → RBAC → admission), RBAC objects and rules, authoring CRDs/operators |
| references/helm.md | Helm: chart anatomy, templating, values precedence, hooks, upgrade/rollback/stuck-release recovery, OCI registries, render debugging (helm template/diff), Helm under GitOps |
| references/security.md | Hardening: Pod Security Standards and the securityContext that passes restricted, dedicated ServiceAccounts, least-privilege RBAC YAML + audit commands, NetworkPolicy patterns (default-deny baseline), secrets hygiene (SOPS/sealed-secrets/ESO), image security, ResourceQuota/LimitRange, manifest security checklist |
| references/gitops.md | GitOps: Flux and Argo CD operations and diagnosis, repo layout, kustomize (bases/overlays, patches, configMapGenerator hash rollouts, images transformer), secrets-in-git options, progressive delivery, drift detection |
| references/versioning-and-sources.md | Official source baseline, refresh checklist, and version-sensitive answering rules |

scripts/k8s-context-check.sh: safe read-only helper for collecting local tool versions, current Kubernetes context, server/API discovery, and key kubectl explain probes before version-sensitive troubleshooting.

Read the relevant file before answering in-depth questions in its area: they contain field-level specifics (exact defaults, version notes, failure modes) that make the difference between a plausible answer and a correct one.
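To make the manifest checklist concrete, here is a sketch of a Deployment that follows its items, written to a temporary file from the shell. Every name, the image reference, and the resource numbers are placeholders chosen for illustration; the seccompProfile line is an addition beyond the items listed in the checklist, included because the restricted Pod Security level also requires it. The ServiceAccount named web is assumed to exist already.

```shell
# Write an illustrative manifest to a temp file. Names, image, and
# numbers are placeholders, not values taken from the skill itself.
manifest="$(mktemp)"
cat > "$manifest" <<'EOF'
apiVersion: apps/v1                       # stable API version
kind: Deployment
metadata:
  name: web
  labels:
    app.kubernetes.io/name: web
spec:
  replicas: 2
  selector:
    matchLabels:
      app.kubernetes.io/name: web         # immutable after creation
  template:
    metadata:
      labels:
        app.kubernetes.io/name: web
    spec:
      serviceAccountName: web             # dedicated SA (assumed to exist)
      automountServiceAccountToken: false
      terminationGracePeriodSeconds: 30   # size to real shutdown time
      containers:
        - name: web
          image: registry.example.com/web:1.4.2   # pinned tag (placeholder)
          resources:
            requests: { cpu: 100m, memory: 128Mi }
            limits: { memory: 256Mi }     # no CPU limit: avoids throttling
          securityContext:
            runAsNonRoot: true
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities: { drop: ["ALL"] }
            seccompProfile: { type: RuntimeDefault }
          lifecycle:
            preStop:
              exec: { command: ["sleep", "5"] }   # needs sleep in the image
EOF
echo "$manifest"
```

With a cluster available, kubectl apply -f "$manifest" --dry-run=server would validate the file through real admission without persisting anything.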
npx claudepluginhub glapsfun/cnative-slills --plugin kubernetes-operator