Help write or improve evaluation scenarios for the SDD Triad (spec-driven proposal system). Guides the user through use cases, stress tests, anti-pattern signals, and comparison tables — and checks the result for common weaknesses. Use when the user says "write SDD scenarios", "help me write scenarios", "create scenarios for the triad", "improve my scenarios", "check my scenarios", or "scenario review".
Slash command: `/sdd-triad:sdd-scenario-writer`
Help the user author or improve an evaluation scenarios document that will be consumed by the `sdd-evaluator` agent inside the SDD Triad loop. Well-designed scenarios produce specific, actionable feedback. Vague scenarios produce feedback the writer cannot act on.
The scenarios file is the evaluator's primary input. The evaluator reads the scenarios and a proposal, then tests the proposal against every scenario. The evaluator's feedback goes back to the writer (after sanitization by the orchestrator) to guide revisions.
The writer never sees the scenarios. This is the information barrier — the defining feature of the SDD Triad. Scenarios must be written so that an evaluator can produce specific, actionable feedback to the writer without naming any scenario.
Walk the user through each section. Push back on vague scenarios that would produce unusable feedback.
Named situations that a valid proposal must be able to handle. Each use case describes a plausible real-world situation in the domain.
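Every use case needs a short name, a specific situation, the questions the evaluator checks the proposal against, and references to the spec metrics the situation touches.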
Good use case:
Late Sweden return. Sweden dates slip one week later than planned. The post-Sweden custody period start date shifts accordingly. Questions: Does the schedule still satisfy the 7-day minimum for Period 2? Does back-to-school continuity survive? Is the actual-day count still at ceiling?
Bad use case:
Schedule works when things change. The schedule should handle changes. (No questions, no specifics, no metric references.)
When the user offers a vague use case, ask: "What specific questions should the evaluator check the proposal against? If the evaluator can't find clear answers in the proposal, what should the feedback say?"
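The difference between the good and bad use case can be partly checked mechanically. The following is a minimal sketch, assuming use cases are written as a single paragraph with a `Questions:` marker as in the good example above; the function name `extract_questions` is hypothetical and not part of the SDD Triad plugin.

```python
def extract_questions(use_case: str) -> list[str]:
    """Return the evaluator questions found after a 'Questions:' marker.

    Assumption: questions follow the marker and each one ends with '?'.
    An empty result suggests the use case is too vague to evaluate.
    """
    _, marker, tail = use_case.partition("Questions:")
    if not marker:
        return []
    # Drop the final fragment: it is whatever trails the last '?'.
    fragments = tail.split("?")[:-1]
    return [f.strip() + "?" for f in fragments if f.strip()]


good = (
    "Late Sweden return. Sweden dates slip one week later than planned. "
    "The post-Sweden custody period start date shifts accordingly. "
    "Questions: Does the schedule still satisfy the 7-day minimum for Period 2? "
    "Does back-to-school continuity survive? "
    "Is the actual-day count still at ceiling?"
)
bad = "Schedule works when things change. The schedule should handle changes."

print(len(extract_questions(good)))  # the good use case yields three questions
print(extract_questions(bad))        # the bad use case yields none
```

A use case that yields no questions is a cue to ask the follow-up above. A non-empty result says nothing about whether the questions are specific enough; that still needs judgment.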
Pass/fail structural conditions. Each stress test has an ID, a short name, the structural condition it checks, and an unambiguous, binary pass condition.
Good:
T3 No coordination-only roles. Every named role in the proposal owns a functional portfolio. Pass: no role's stated responsibilities are limited to routing, coordination, or hand-off work.
Bad:
T3 Roles are meaningful. Roles should do real work. (No pass condition — the evaluator cannot score this.)
Push the user to make pass conditions binary. If they say "the schedule should be resilient," ask: "Resilient to what specifically? What's the pass condition — what would a non-resilient schedule look like?"
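A stress test without a pass condition cannot be scored, and that much can be detected automatically. This sketch assumes the one-line shape used in the examples above (`<ID> <Name>. <Description> Pass: <condition>`); the `StressTest` class and `parse_stress_test` function are illustrative names, not part of the plugin.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed line shape, inferred from the examples: "T3 Name. Description Pass: condition"
STRESS_TEST = re.compile(r"^(?P<id>T\d+)\s+(?P<name>[^.]+)\.\s*(?P<body>.*)$")


@dataclass
class StressTest:
    id: str
    name: str
    description: str
    pass_condition: Optional[str]  # None means the evaluator cannot score it


def parse_stress_test(line: str) -> Optional[StressTest]:
    """Split a stress-test line into its parts; return None if it does not match."""
    match = STRESS_TEST.match(line.strip())
    if match is None:
        return None
    description, marker, condition = match.group("body").partition("Pass:")
    has_condition = bool(marker and condition.strip())
    return StressTest(
        id=match.group("id"),
        name=match.group("name").strip(),
        description=description.strip(),
        pass_condition=condition.strip() if has_condition else None,
    )


good = parse_stress_test(
    "T3 No coordination-only roles. Every named role in the proposal owns a "
    "functional portfolio. Pass: no role's stated responsibilities are limited "
    "to routing, coordination, or hand-off work."
)
bad = parse_stress_test("T3 Roles are meaningful. Roles should do real work.")

print(good.pass_condition is not None)  # True: a pass condition was found
print(bad.pass_condition is None)       # True: nothing to score against
```

Finding a `Pass:` clause only shows that a condition exists. Whether it is truly binary is still a question to put to the user.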
Observable symptoms that indicate the proposal is likely to fail in practice. Each signal names the observable symptom, the failure mode it indicates, and the spec metric it maps to.
These are early-warning indicators, not pass/fail tests. The evaluator uses them to flag risks. The orchestrator tracks whether risks are improving across rounds.
Good:
Pods meet but don't decide. Pod cadences are described but no named person holds accountability for pod-level decisions. Indicates: governance theater — meetings occur but decisions defer upward. Maps to: M-05 decision velocity.
Bad:
Things might not work. (No symptom, no failure mode, no metric reference.)
When scenarios represent distinct structural approaches (e.g., three different schedule shapes, three different org structures), include a scoring table that compares them on the metrics that matter.
| Metric | Scenario A | Scenario B | Scenario C |
|---|---|---|---|
| Actual days | 25 | 25 | 23 |
| Pre-event time | 10 days | 0 days | 10 days |
| Handoff count | 8 | 6 | 8 |
This helps the evaluator contextualize trade-offs and produce feedback that surfaces the real distinctions between approaches.
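If the metric values are already collected per scenario, a table in this shape can be generated rather than typed by hand. The sketch below is one possible helper; `comparison_table` and its argument layout are assumptions made for illustration.

```python
def comparison_table(metrics: dict[str, dict[str, str]], scenarios: list[str]) -> str:
    """Render a markdown scoring table with one row per metric.

    `metrics` maps a metric name to {scenario label: value}.
    Every metric must have a value for every scenario label.
    """
    header = "| Metric | " + " | ".join(f"Scenario {s}" for s in scenarios) + " |"
    divider = "|---|" + "---|" * len(scenarios)
    rows = [
        "| " + name + " | " + " | ".join(values[s] for s in scenarios) + " |"
        for name, values in metrics.items()
    ]
    return "\n".join([header, divider, *rows])


# Values copied from the example table above.
table = comparison_table(
    {
        "Actual days": {"A": "25", "B": "25", "C": "23"},
        "Pre-event time": {"A": "10 days", "B": "0 days", "C": "10 days"},
        "Handoff count": {"A": "8", "B": "6", "C": "8"},
    },
    scenarios=["A", "B", "C"],
)
print(table)
```

Keeping values as strings preserves units such as "10 days" exactly as the user stated them.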
Hard constraints, soft constraints, the proposal format, and static metrics belong in the spec, not in the scenarios. If the user starts putting them in the scenarios, flag it:
Tell the user: "That's a requirement, not a scenario. It belongs in the spec so the writer can see it and design to satisfy it. The scenarios test whether the proposal handles situations the writer wasn't explicitly told to optimize for."
Apply this test to every use case and stress test before finalizing:
Could an evaluator produce specific, actionable feedback to the writer about this scenario without naming the scenario?
If the only useful feedback would be "the scenario about X fails," the scenario is too abstract. Rewrite it to be concrete enough that the evaluator can describe the gap in terms of real-world behavior.
Wrong (requires naming the scenario): "UC-09 fails."
Right (describes the gap): "When two functions produce contradictory customer-facing materials, the proposal does not identify who resolves this or in what timeframe."
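One mechanical part of the feedback test is to draft the sentence the evaluator might send and check it for scenario identifiers. This is a rough sketch that assumes IDs look like `UC-09` or `T3`, as in the examples in this document; the function name `leaked_scenario_ids` is hypothetical.

```python
import re

# Assumed ID shapes: "UC-<digits>" for use cases, "T<digits>" for stress tests.
SCENARIO_ID = re.compile(r"\b(?:UC-\d+|T\d+)\b")


def leaked_scenario_ids(feedback: str) -> list[str]:
    """Return every scenario identifier that appears in a draft feedback sentence."""
    return SCENARIO_ID.findall(feedback)


wrong = "UC-09 fails."
right = (
    "When two functions produce contradictory customer-facing materials, "
    "the proposal does not identify who resolves this or in what timeframe."
)

print(leaked_scenario_ids(wrong))  # ['UC-09']
print(leaked_scenario_ids(right))  # []
```

An empty result only shows that no ID leaked. Feedback can still be too abstract to act on, so the question of whether it describes real-world behavior remains a manual check.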
Your job is not to transcribe what the user says — it is to draw out the failure modes, edge cases, and tensions they haven't articulated. The user knows what worries them; you know how to turn worries into testable scenarios.
Use these throughout the interview. They are not a script — deploy them when the conversation calls for them.
Failure mode probes — find what breaks:
Tension probes — find where constraints conflict:
Sensitivity probes — find what's fragile:
Accountability probes — find who owns what:
Completeness probes — find the gaps:
Anti-pattern probes — find the silent failures:
Conduct the interview in four phases. Each phase has a goal, opening questions, and follow-up patterns. Do not rush — stay in each phase until you have concrete, testable material.
Phase 1: Domain and spec orientation
Phase 2: Use case discovery
Phase 3: Stress tests and anti-patterns
Phase 4: Comparison and synthesis
After all four phases, run the quality checklist. Then write the scenarios.
Run this against every scenarios document before finalizing. Report each item as pass or fail with a note.
| # | Check | Pass condition |
|---|---|---|
| 1 | Every use case has questions | Each use case lists the questions the evaluator checks the proposal against |
| 2 | Every stress test has a pass condition | Each stress test has an unambiguous, binary pass condition |
| 3 | Anti-pattern signals name the failure mode | Each signal identifies what would go wrong and why it matters |
| 4 | Metric cross-references are present | Use cases and signals reference spec metric IDs (M-01, etc.) |
| 5 | No spec content in the scenarios | No hard constraints, soft constraints, proposal format, or static metrics |
| 6 | No duplication with the spec | Scenarios don't re-state requirements already in the spec |
| 7 | Feedback test passes | Every scenario can produce actionable writer feedback without naming the scenario |
| 8 | Use cases are concrete | Each describes a specific plausible situation, not an abstract category |
| 9 | Stress test pass conditions are binary | Each can be scored pass or fail — not "partially" or "it depends" |
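The checklist asks for each item to be reported as pass or fail with a note. A minimal sketch of that report format follows; the `CheckResult` class, the `format_report` function, and the sample notes are all hypothetical, and the pass/fail judgments themselves would still come from reviewing the scenarios document.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    number: int   # row number in the checklist table
    check: str    # checklist item name
    passed: bool
    note: str     # short explanation for the user


def format_report(results: list[CheckResult]) -> str:
    """Render one line per checklist item plus a summary line."""
    lines = []
    for r in results:
        status = "pass" if r.passed else "fail"
        lines.append(f"{r.number}. {r.check}: {status} ({r.note})")
    failed = sum(not r.passed for r in results)
    lines.append(f"{failed} of {len(results)} checks failed")
    return "\n".join(lines)


# Illustrative results only; the notes are invented for the example.
report = format_report([
    CheckResult(1, "Every use case has questions", True, "all use cases list questions"),
    CheckResult(2, "Every stress test has a pass condition", False, "one stress test has none"),
])
print(report)
```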
npx claudepluginhub ghelleks/sdd-triad --plugin sdd-triad