From skillry-cloud-and-infrastructure
Use when you need to review or author Kubernetes manifests or Helm charts for resource requests and limits, liveness/readiness/startup probes, pod security context, RBAC scope, network policy, HPA, secret handling, and common workload anti-patterns.
How this skill is triggered — by the user, by Claude, or both
Slash command
/skillry-cloud-and-infrastructure:333-kubernetes-manifest-review
The summary Claude sees in its skill listing, used to decide when to auto-load this skill
Conduct a structured review-and-author pass over Kubernetes manifests, Kustomize overlays, or Helm charts — covering resource requests/limits, liveness/readiness/startup probes, pod and container security context, RBAC least privilege, NetworkPolicy isolation, autoscaling (HPA), secret handling, and reliability anti-patterns. The goal is workloads that schedule predictably, fail safely, run unprivileged, and expose the minimum surface. Findings must come from the actual YAML and a server-side dry-run, never from assumptions, and nothing is applied to a live cluster without human approval.
- The repository contains k8s/, manifests/, deploy/, a Helm templates/ dir, or values.yaml.
- A Deployment, StatefulSet, DaemonSet, Service, Ingress, RBAC, or NetworkPolicy is being introduced.

find . -path "*/k8s/*" -o -name "*.yaml" -path "*manifests*" 2>/dev/null
grep -rln "^kind:" . --include="*.yaml" | head
grep -rhE "^kind:" . --include="*.yaml" | sort | uniq -c # inventory of object kinds
# Client-side validation only — no cluster mutation
kubectl apply --dry-run=client -f manifests/ 2>&1 | tail -20
# Stronger static schema validation if available
kubeconform -strict -summary manifests/ 2>/dev/null || echo "kubeconform not installed"
# Render Helm to inspect the real output
helm template ./chart -f chart/values.yaml > rendered.yaml 2>/dev/null
# Containers missing requests/limits schedule unpredictably and risk noisy-neighbor contention and OOM kills
grep -rL "resources:" manifests/ --include="*.yaml"
grep -rn "requests:\|limits:\|memory:\|cpu:" manifests/ --include="*.yaml"
grep -rn "livenessProbe\|readinessProbe\|startupProbe" manifests/ --include="*.yaml"
# Workloads with containers but no readinessProbe send traffic to unready pods
# Pod/container security context
grep -rn "runAsNonRoot\|runAsUser\|allowPrivilegeEscalation\|readOnlyRootFilesystem\|privileged\|capabilities" manifests/ --include="*.yaml"
# RBAC wildcards (over-permission)
grep -rn "verbs:\|resources:\|apiGroups:" manifests/ --include="*.yaml"
grep -rn "\"\*\"\|- '\*'\|- \"\*\"" manifests/ --include="*.yaml"
grep -rn "kind: ClusterRoleBinding" manifests/ --include="*.yaml" # prefer namespaced RoleBinding
grep -rln "kind: NetworkPolicy" manifests/ --include="*.yaml" || echo "no NetworkPolicy — default-allow traffic"
# Secrets must not be inline plaintext literals in manifests
grep -rn "kind: Secret" -A6 manifests/ --include="*.yaml"
grep -rn "stringData:\|data:" manifests/ --include="*.yaml"
grep -rln "kind: HorizontalPodAutoscaler" manifests/ --include="*.yaml"
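The last grep only reports whether a HorizontalPodAutoscaler exists. For comparison during review, a minimal autoscaling/v2 object might look like the sketch below. The target name api, the namespace prod, the replica bounds, and the 70% CPU target are illustrative assumptions, not values taken from any real chart.

```yaml
# Sketch only: names, replica bounds, and the CPU target are assumptions.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api              # hypothetical
  namespace: prod        # hypothetical
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api            # the workload being scaled
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # percent of the container's CPU request
```

CPU utilization is measured against the container's CPU request, so this only works when requests are set, which ties the autoscaling check back to the requests/limits check.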
- Every container sets requests and limits.
- Every container has a readinessProbe and a livenessProbe; slow starters use a startupProbe.
- The pod runs with runAsNonRoot: true and a non-zero runAsUser.
- Each container sets allowPrivilegeEscalation: false, readOnlyRootFilesystem: true, and drops ALL capabilities.
- No container sets privileged: true or mounts the host network/PID/filesystem without justification.
- RBAC uses a namespaced Role/RoleBinding where possible; no verbs/resources/apiGroups: ["*"] ClusterRoles.
- A NetworkPolicy restricts ingress/egress for sensitive workloads (default-deny baseline).
- Secrets are consumed via secretKeyRef/envFrom, not hardcoded plaintext in manifests.
- A HorizontalPodAutoscaler (or documented fixed replica count) governs scaling.
- Deployments set an explicit strategy (RollingUpdate with sane maxUnavailable/maxSurge).
- A PodDisruptionBudget protects multi-replica services during node drains.
- Images are pinned to a specific tag (not :latest) with imagePullPolicy consistent with the tag.

# Hardened Deployment container spec: review target
apiVersion: apps/v1
kind: Deployment
metadata: { name: api, namespace: prod }
spec:
  replicas: 3
  selector:
    matchLabels: { app: api }   # required by apps/v1; the label itself is an assumed example
  strategy:
    type: RollingUpdate
    rollingUpdate: { maxUnavailable: 0, maxSurge: 1 }
  template:
    metadata:
      labels: { app: api }      # must match spec.selector
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 65532
        seccompProfile: { type: RuntimeDefault }
      containers:
        - name: api
          image: registry.example.com/api:1.4.2  # pinned, not latest
          resources:
            requests: { cpu: "100m", memory: "128Mi" }
            limits: { cpu: "500m", memory: "256Mi" }
          readinessProbe:
            httpGet: { path: /healthz, port: 8080 }
            initialDelaySeconds: 5
          livenessProbe:
            httpGet: { path: /livez, port: 8080 }
            periodSeconds: 10
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities: { drop: ["ALL"] }
          envFrom:
            - secretRef: { name: api-secrets }  # not inline plaintext
---
# Default-deny NetworkPolicy baseline
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata: { name: default-deny, namespace: prod }
spec:
  podSelector: {}
  policyTypes: ["Ingress", "Egress"]
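A default-deny policy that covers Egress also blocks DNS lookups, so workloads under it usually need at least one explicit allow rule. A minimal sketch of such a companion policy follows. It assumes cluster DNS runs in kube-system with the label k8s-app: kube-dns, which is common but not universal, and the policy name is hypothetical.

```yaml
# Sketch only: assumes cluster DNS pods carry k8s-app=kube-dns in kube-system.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress   # hypothetical name
  namespace: prod
spec:
  podSelector: {}          # applies to every pod in the namespace
  policyTypes: ["Egress"]
  egress:
    - to:
        # namespaceSelector and podSelector in one entry are ANDed together
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - { protocol: UDP, port: 53 }
        - { protocol: TCP, port: 53 }
```

NetworkPolicies are additive, so this rule widens the baseline only for DNS; every other flow still needs its own allow.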
# Safe review commands — never mutate the cluster
kubectl apply --dry-run=client -f manifests/ # client-side only
kubeconform -strict -summary manifests/
helm template ./chart | kubeconform -strict -summary -
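When the review flags a wildcard or cluster-wide grant, the concrete fix usually takes the shape of a namespaced Role bound to a single ServiceAccount. A minimal sketch, in which the ServiceAccount api, the Role name, and the ConfigMap api-config are all hypothetical:

```yaml
# Sketch only: all object names below are hypothetical.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: api-config-reader
  namespace: prod
rules:
  - apiGroups: [""]                 # core API group
    resources: ["configmaps"]
    resourceNames: ["api-config"]   # restrict to one named object
    verbs: ["get"]                  # read-only, no wildcard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: api-config-reader
  namespace: prod
subjects:
  - kind: ServiceAccount
    name: api
    namespace: prod
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: api-config-reader
```

The grant is scoped three ways: one namespace, one resource name, one verb. Each of those is a dimension the wildcard grep can miss, so they are worth reading by hand.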
- runAsNonRoot absent and containers run as UID 0; privileged: true "to make it work".
- Wildcard ["*"] verbs/resources bound to a workload ServiceAccount.
- Plaintext credentials committed in a Secret manifest (base64 is encoding, not encryption).
- image: app:latest, which makes rollouts non-reproducible and breaks rollbacks.

Produce a structured report with:
- A findings table: file:line | issue | severity | concrete fix.

- Never run kubectl apply (server-side), kubectl delete, kubectl scale, kubectl rollout, kubectl drain/cordon, or helm install/upgrade/uninstall against a real cluster without explicit human approval.
- --dry-run=client, kubeconform, and helm template are safe to run unattended; they do not mutate the cluster.
- Base64 in a Secret is not encryption; redact discovered secret values to **** and flag for rotation.
- If a change touches a StatefulSet, PVC, or namespace, stop and require explicit human approval.

npx claudepluginhub fluxonlab/skillry --plugin skillry-cloud-and-infrastructure