From etcd-druid
Use when writing a new Druid Enhancement Proposal or reviewing an existing DEP draft for quality and completeness. Guides section-by-section authoring with best-practice heuristics, or scores existing drafts against a 25-dimension / 50-point rubric covering cross-repo impact (etcd-druid, etcd-backup-restore, etcd-wrapper) plus adjacent systems (VPA, Cluster Autoscaler, PDB), public-API discipline, terminology precision, controller boundaries, feature gates, breaking changes, CEL validations, metrics design, and Mermaid diagrams.
How this skill is triggered — by the user, by Claude, or both
Slash command
/etcd-druid:depThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
**EVERY DEP IS REVIEWED FROM THREE REPO PERSPECTIVES PLUS ADJACENT SYSTEMS.**
EVERY DEP IS REVIEWED FROM THREE REPO PERSPECTIVES PLUS ADJACENT SYSTEMS.
| Rationalization | Why it fails |
|---|---|
| "This only touches etcd-druid" | API changes flow to sidecars via StatefulSet env/args. Always check. |
| "The sidecar contract doesn't change" | Behavioral changes (timing, ordering) break contracts without API changes. |
| "We don't need a feature gate for this" | Any change that could break existing clusters needs a rollback path. New features default OFF in alpha. |
| "CEL validation is an implementation detail" | Missing CEL means invalid state reaches controllers. It's design. |
| "I'll surface this internal mechanism in the API" | The DEP is public; StatefulSet/leases/labels are internal. APIs outlive implementations. Hide internals behind feature gates, not CRD fields. |
| "I coined a clear new term, the operational test is obvious" | If you can't write a one-line test for "is X in this state right now?", reviewers can't either. |
| "There's no race condition between these two controllers" | Two controllers writing to the same primary object is a race until proven otherwise. Document the coordination signal. |
| "I'll mention this offline-discussion outcome in the DEP" | The DEP is public-facing. Move closed deliberation to a linked decision record. |
Determine the mode from user intent:
| User says | Mode |
|---|---|
| "write a DEP", "draft a proposal", "help me write DEP-07" | Guide |
| "review this DEP", "check my proposal", "score DEP-06" | Review |
Points to an existing file in docs/proposals/ | Review |
| Describes a new feature/enhancement idea | Guide |
Walk the author through writing a DEP section-by-section.
ls docs/proposals/*.md | tail -1
For each section of the template, prompt the author with heuristics from BEST-PRACTICES.md:
| Section | Key prompt |
|---|---|
| Summary | "Can a reader understand the change without reading the rest? Are you describing what changes (factual) instead of editorializing about what's wrong today?" |
| Terminology | "For each coined term, give a one-line operational test ('how do I check if X is in this state?'). Don't redefine upstream Kubernetes/etcd terms — link to upstream. Cover transient states." |
| Motivation | "What operational pain or incident motivates this? Link to real issues." |
| Goals | "Are these measurable? Can you write a test for each?" |
| Non-Goals | "What adjacent work are you explicitly excluding? Why?" |
| Use Cases | "Give 2-4 real scenarios with pre-conditions and expected outcomes. Does behavior differ for single-replica vs multi-replica?" |
| API Changes | "Show Go structs with kubebuilder markers AND YAML. Include status. Apply the implementation-swap test: would each field still make sense if we replaced StatefulSet with a custom controller?" |
| Lifecycle/Flow | "What are the state transitions? Which diagram type fits? Every decision arrow must be labelled." |
| Failure Scenarios | "What fails? Who detects? Is it retriable? Timeout/backoff? Where does the stuck state surface (condition / event / metric)? Does this failure block other work indefinitely?" |
| Public Surface vs. Internal Mechanics | "Which parts of this proposal are public contracts and which are internal mechanics? Acknowledge fragile internal references (label keys, status reason strings)." |
| Controller Boundaries | "If you propose a new controller, list every other reconciler touching the same primary object and define the coordination signal. Justify why a new controller vs. extending an existing one." |
| Feature Gate | "Does this need a gate? Default off in alpha; when do you flip the default? Feature gate vs. API field — why this choice? Mid-flight toggle behavior?" |
| Compatibility | "Do existing clusters continue working? Migration path? Adjacent systems (VPA, CA, PDB, CSI, self-hosted)?" |
| Metrics | "Naming follows etcddruid_. Include {namespace, name} labels. Don't encode feature names in metric names — use a label. For stateless controllers, where does start-time come from?" |
| Alternatives | "What else did you consider? Minimum 3, including at least one of: 'use existing K8s primitive', 'extend an existing controller', 'configuration option in existing API', 'inverse design'. Bullet-pointed rejection reasons." |
| Future Work | "What comes next? Can each item be its own DEP? (max 5)" |
| Decision Records | "Did any offline-discussion notes leak into the DEP body? Move them to a linked decision record (docs/decisions/ or GitHub discussion)." |
After the main sections, suggest diagrams based on content:
Run the 25-dimension rubric from BEST-PRACTICES.md against the draft:
Pre-flight self-review checklist (run before declaring "ready"):
Score an existing DEP against the full rubric.
Read the target file (user provides path or DEP number).
Apply all 25 dimensions from BEST-PRACTICES.md:
For each affected repo / system, check:
Format:
## DEP Review: DEP-NN — Title
### Score: XX/50 (Rating)
### Strengths
- ...
### Gaps (ordered by impact)
1. [Dimension N — Missing] Specific issue + suggested fix
2. [Dimension M — Weak] What's missing + example of what good looks like
### Cross-Repo Findings
- etcd-druid: ...
- etcd-backup-restore: ...
- etcd-wrapper: ...
- Adjacent systems: ... (VPA / CA / PDB / CSI / self-hosted as applicable)
### Diagram Recommendations
- Missing labels on: <arrow / node>
- Missing: [type] diagram for [content]
- Suggestion: [Mermaid code block]
### Voice & Style
- (Any editorializing language to make factual?)
- (Any unsupported claims about K8s / etcd behavior — needs upstream link?)
| After DEP is... | Invoke |
|---|---|
| Approved by maintainers | /etcd-druid:plan to create implementation plan |
| Involving API changes | /etcd-druid:api-change for CEL/two-commit details |
| Ready for implementation | /etcd-druid:implement with the plan |
| Thought | Why it fails |
|---|---|
| "I'll write the code first, DEP later" | DEP exists to validate design before wasting implementation time. |
| "Only etcd-druid is affected" | API changes flow to sidecars. Always check all three repos plus adjacent systems. |
| "The DEP is good enough at 22/50" | Below 32 = Needs Work. Gaps in design surface as bugs in code. |
| "Diagrams are optional" | They're not. One Mermaid diagram minimum is a hard requirement. Every decision arrow must be labelled. |
| "This implementation detail belongs in the public CRD" | If swapping the underlying mechanism would break the field, it's an internal detail. Hide it behind a feature gate. |
| "Two controllers can both write to this object — they'll cooperate" | Until proven otherwise, that's a race. Document the coordination signal or merge the work into one reconciler. |
| "My new term is self-explanatory" | If you can't write the one-line operational test, reviewers will rewrite the term for you. |
| "I'll inline the offline-discussion outcome here" | Decision records live in docs/decisions/ or linked discussions, not in the public DEP body. |
npx claudepluginhub seshachalam-yv/etcd-druid-skills --plugin etcd-druid-skillsGuides creation, editing, and verification of skills for AI coding agents using test-driven development with subagent scenarios. Use when authoring or debugging skills.