From compass
How to write Given/When/Then scenarios that double as the acceptance suite — scenario granularity, the qualities of a runnable scenario, and how depth scales by route. Triggers during Specify and Clarify.
Slash command: /compass:bdd-specification
BDD is strategy S1 — expressing acceptance criteria as Given/When/Then scenarios. It is the strong, shipped-on default way to satisfy guardrail G2: acceptance defined before it is built — stated, and checkable. Keep that relationship straight.
In Compass the BDD spec is not documentation that precedes the tests — it is
the tests, read at a different time. spec.feature.md is written once at
Specify and run as the acceptance check at Verify. It is also the one artifact
five roles read (see role-translation). Write it knowing all of that.
Good BDD scenarios do not appear fully formed. They emerge through a disciplined refinement chain: vague idea → concrete examples → acceptance criteria → at least one executable specification each.
Start with the vague idea. "Users should be able to reset their password." This is a wish, not a spec. It is the right starting point — not the ending point.
Generate concrete examples. Ask: what does that look like in the real world? A user with a valid email who asks for a reset link gets one. A user whose token has expired cannot use the old link. A user who has already reset once today hits a rate limit. Concrete examples ground abstract wishes in observable events.
Distil into acceptance criteria. Each example suggests a criterion: "Given a valid, unexpired reset token, the user can set a new password." At this step you are also applying the ubiquitous language — the shared vocabulary of the domain that every role (engineer, product owner, QA, marketer) uses consistently. Terms that are vague in conversation become precise in criteria. "Expired" gets a definition; "rate limit" gets a number.
Write at least one executable specification per criterion. The criterion becomes a Given/When/Then scenario. A criterion with no runnable scenario is a wish that never became a check — and guardrail G2 refuses that: acceptance must be stated and checkable. One runnable scenario per criterion is the minimum; multiple scenarios cover the edges.
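As a minimal sketch of a criterion becoming a runnable check, the expired-token example might look like the following in Python. Everything here is hypothetical and for illustration only: the `ResetToken` type, the `can_set_new_password` function, and the 30-minute lifetime are assumptions, not part of Compass. The point is the shape: each scenario function has one Given, one When, and one Then, and "expired" is pinned to a number.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed value: the criteria step requires "expired" to have a definition.
TOKEN_LIFETIME = timedelta(minutes=30)


@dataclass
class ResetToken:
    """Hypothetical stand-in for the system's reset token."""
    issued_at: datetime


def can_set_new_password(token: ResetToken, now: datetime) -> bool:
    """Return True only while the token is within its lifetime."""
    return now - token.issued_at <= TOKEN_LIFETIME


def scenario_expired_token_should_be_rejected() -> bool:
    # Given a reset token issued 31 minutes ago
    now = datetime(2024, 1, 1, 12, 0)
    token = ResetToken(issued_at=now - timedelta(minutes=31))
    # When the user submits a new password with that token
    accepted = can_set_new_password(token, now)
    # Then the reset is rejected
    return accepted is False


def scenario_valid_token_should_allow_reset() -> bool:
    # Given a reset token issued 5 minutes ago
    now = datetime(2024, 1, 1, 12, 0)
    token = ResetToken(issued_at=now - timedelta(minutes=5))
    # When the user submits a new password with that token
    accepted = can_set_new_password(token, now)
    # Then the new password is accepted
    return accepted is True
```

Both functions return `True` when the behaviour holds, so a runner can treat each as one pass/fail acceptance check.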
Scenario names carry the refinement: they state an outcome, not a step. "should" as a prefix disciplines this well — "should reject an expired token" names a required outcome, not a call-path.
Names an outcome:
Scenario: expired token should be rejected
Names a step:
Scenario: test token expiry check
The ubiquitous language is not optional. When "user," "subscriber," and "account" are used interchangeably in scenarios, they carry different implications to different readers. Pick the domain term and use it consistently; the spec is the contract between all five roles.
User stories ("As a [role], I want [feature], so that [outcome]") are
refused as a per-role spec format in Compass — see ADR-004 (one spec, many
lenses). The rationale: a user story format embeds a single role's
perspective into the spec, which means one role's spec and another role's
spec diverge. Compass uses one spec.feature.md that all five roles read
through their own lens (see role-translation), not five separate
role-scoped artifacts. The BDD scenario is the shared artifact; the
role-translation lens is how each role reads it. User stories as a format
are fine upstream of Compass (in a brief or a Jira ticket), but they
are not the spec, and they do not replace the scenario.
Scenario: <a behaviour, named as an outcome>
Given <the world is in this concrete, specific state>
When <exactly one triggering action happens>
Then <this observable, checkable outcome holds>
A scenario is a single behaviour with a single trigger. It is concrete enough that someone could execute it by hand and concrete enough that a test can assert it automatically — those are the same bar.
A second When means two scenarios. The trigger is singular.
If a scenario has more than one When, or branches in its
Then ("Then either X or Y"), or needs a paragraph of Given, those are
multiple behaviours wearing one name.
The vocabulary never changes. The depth does, and the route tells you how much.
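The single-trigger rule can be checked mechanically. The sketch below is a hypothetical lint, not a Compass tool: `lint_scenario` is an assumed name, and it looks only for two signals, a When count other than one and a Then that contains "either". An `And` or `But` line is counted as continuing the step before it, so `When ... And ...` registers as two triggers.

```python
import re

# Matches a step line and captures its keyword and the rest of the line.
STEP = re.compile(r"^\s*(Given|When|Then|And|But)\b\s*(.*)$")


def lint_scenario(text: str) -> list[str]:
    """Return granularity problems found in one scenario's text."""
    problems = []
    when_count = 0
    current = None  # keyword that the latest And/But continues
    for line in text.splitlines():
        match = STEP.match(line)
        if not match:
            continue  # scenario title, blank line, or free text
        keyword, body = match.groups()
        if keyword in ("And", "But"):
            keyword = current
        else:
            current = keyword
        if keyword == "When":
            when_count += 1
        if keyword == "Then" and re.search(r"\beither\b", body, re.IGNORECASE):
            problems.append("Then branches: " + body.strip())
    if when_count != 1:
        problems.append(f"expected exactly one When, found {when_count}")
    return problems
```

An empty list means the scenario passed these two checks. It does not mean the scenario is well written; judging whether a Given is concrete or an outcome is observable still takes a reader.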
Clarify is where the spec is verified as a spec, before anyone builds from it. Walk it for:
Record each ambiguity, its resolution, and who resolved it in
clarifications.md. Clarify may be light on Standard; it may be collapsed
on Express only because the one scenario was certified unambiguous; it is
skipped on Spike because the unknown is the point; it is never simply
absent where the route or a routing guardrail calls for it.
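One way to keep those three fields together is to treat each clarification as a small record and render it on write. This is only a sketch: the `Clarification` type, the `to_markdown` function, and the list-item layout are assumptions, since the actual layout of clarifications.md is not specified here.

```python
from dataclasses import dataclass


@dataclass
class Clarification:
    """One resolved ambiguity: what was unclear, the answer, and who gave it."""
    ambiguity: str
    resolution: str
    resolved_by: str


def to_markdown(entry: Clarification) -> str:
    """Render one entry as a Markdown list item (layout is an assumption)."""
    return (
        f"- Ambiguity: {entry.ambiguity}\n"
        f"  Resolution: {entry.resolution}\n"
        f"  Resolved by: {entry.resolved_by}\n"
    )
```

Making all three fields required means an entry cannot be recorded without naming who resolved it.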
Given the cache is warm, When flushCache() is called…. That tests the code's shape, not its behaviour. Scenarios outlive implementations.
A scenario with several Given lines and three When lines. Split it.
npx claudepluginhub jed72/compass --plugin compass