From agentops
Transforms an intent into acceptance-gated beads by first writing Gherkin scenarios, running them red, then deriving spec and beads. No runnable test, no bead.
Slash command
/agentops:behavior-first-planning
> **Quick Ref:** Turn an intent into beads that each carry a *runnable acceptance test* defining "done". The generative discipline behind the `bdd-foundry` workflow — extracted here so behavior-first planning survives independent of any orchestrator. Output: frozen Gherkin → executed-red tests → derived spec → an acceptance-gated bead DAG.
YOU MUST EXECUTE THIS DISCIPLINE. Do not just describe it.
Spec-first planning ships beads with no done-criteria: a title, a paragraph of "why", and nothing a machine can run to decide it is finished. The implementer then invents the bar, and "done" becomes a self-grade. Behavior-first planning inverts the order: define the behavior as an executable test before the design, so every bead is born with a runnable contract. The rule is absolute — no runnable acceptance test, no bead.
This is the successor to plain decomposition (plan): same DAG output, but each unit carries a failing test that has actually been run red, not prose. It is the planning-side mirror of the membrane (docs/architecture/control-loop-model.md): the bead's gate is deterministic ground truth, not an opinion.
Turn the intent into concrete, testable Gherkin scenarios — Given/When/Then — covering the happy path, edge cases, AND error/failure paths. Every clause must be specific enough to become a runnable test; "works correctly" is rejected.
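The "specific enough to become a runnable test" rule can be partly checked by machine. The following is a minimal sketch, not part of the skill itself: the deny-list `VAGUE_PHRASES`, the function `lint_scenario`, and the example scenarios (including the `GET /profile` endpoint) are all hypothetical names invented for illustration.

```python
import re

# Hypothetical deny-list: phrases that cannot be turned into an assertion.
VAGUE_PHRASES = ("works correctly", "works as expected", "behaves properly")

# Matches a Gherkin step line and captures its keyword and clause.
STEP_RE = re.compile(r"^\s*(Given|When|Then|And|But)\b\s*(.*)$")


def lint_scenario(text: str) -> list[str]:
    """Return the problems found; an empty list means the scenario passes this lint."""
    problems: list[str] = []
    seen: set[str] = set()
    for line in text.splitlines():
        match = STEP_RE.match(line)
        if match is None:
            continue  # Scenario headers, blank lines, comments
        keyword, clause = match.group(1), match.group(2).strip()
        seen.add(keyword)
        if any(phrase in clause.lower() for phrase in VAGUE_PHRASES):
            problems.append(f"vague clause: {keyword} {clause}")
    for required in ("Given", "When", "Then"):
        if required not in seen:
            problems.append(f"missing {required} step")
    return problems


# Illustrative scenarios (invented for this sketch).
GOOD = """
Scenario: reject an expired token
  Given a token that expired 1 second ago
  When the client calls GET /profile with that token
  Then the response status is 401
"""

BAD = """
Scenario: login
  When the user logs in
  Then it works correctly
"""

print(lint_scenario(GOOD))  # []
print(lint_scenario(BAD))   # ['vague clause: Then it works correctly', 'missing Given step']
```

A lint like this only catches the obvious cases; whether a clause is truly testable still has to be proven by writing the test and running it.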
- `behaviors.md` as the frozen definition of done. Freezing means: downstream phases derive from it; they do not silently add or drop scenarios.

The adversary applies this concrete checklist to every behavior that touches an input, a trust boundary, a mutation/write surface, a failure path, or external state, and emits the missing attack-vector scenario for each applicable class. These are the bypass classes a green test most often misses (cheap to add here, expensive at the gate):
- `--` / override vectors; the component that ACTS must be the one that validates.

Also read any repo-local gate-findings ledger (try `docs/gate/findings-ledger.md`) and apply its Standing Review Dimensions: those are real defects a gate already caught; do not let them recur. This is the ratchet: every gate finding permanently upgrades this checklist.
Turn each frozen scenario into a runnable test in the project's framework. The test IS the executable definition of done. The tests must be currently failing (red) because the feature is not built yet — and you must observe the red, not assert it:
- `acceptance-tests/` and an `acceptance-tests.md` index mapping scenario id → test name/path, including the one-line command that runs the whole suite.
- `cargo`/`pytest`/`go test` call (it arg-errors before any test runs); chain with `&&`.

This phase is the heart of the discipline. Skipping the executed red is how a plan silently regresses to spec-first.
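Observing the red means telling a genuine test failure apart from a run that never executed a test. The following is a minimal sketch for a pytest project, assuming pytest's documented exit codes; the helper names `classify_pytest_exit` and `observe_red` are hypothetical, and other frameworks (cargo, go test) need their own classification.

```python
import subprocess
import sys

# pytest's documented exit codes that matter for this check.
TESTS_FAILED = 1
USAGE_ERROR = 4
NO_TESTS_COLLECTED = 5


def classify_pytest_exit(code: int) -> str:
    """Map a pytest exit code to what was actually observed."""
    if code == 0:
        return "green"      # already passing: the test does not pin missing behavior
    if code == TESTS_FAILED:
        return "red"        # tests were collected, ran, and failed
    if code == USAGE_ERROR:
        return "arg-error"  # bad command line: nothing ran
    if code == NO_TESTS_COLLECTED:
        return "no-tests"   # the path or selector matched nothing
    return "other"          # interrupted (2) or internal error (3)


def observe_red(test_path: str) -> str:
    """Run one test path under pytest and classify the result (requires pytest)."""
    result = subprocess.run(
        [sys.executable, "-m", "pytest", "-q", test_path],
        capture_output=True,
        text=True,
    )
    return classify_pytest_exit(result.returncode)


print(classify_pytest_exit(1))  # red
print(classify_pytest_exit(4))  # arg-error
```

Only the `"red"` outcome counts as an executed red in this sketch. Note that pytest reports errors during collection (for example a failing module-level import) as an interrupted run, exit code 2, so such a test lands in `"other"` and needs repair before it can serve as a contract.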
Design the architecture/spec whose only job is to make the acceptance tests pass — derived from the behaviors + tests, not free-form. For each behavior, name the components/changes that satisfy it. A repaired or already-green test from Phase 2 gets its assertion fixed here. Keep it tight: a spec, not a monument. Write to spec.md.
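One way to keep the spec derived rather than free-form is to check that it names components for every frozen scenario and for nothing else. The following is a minimal sketch under assumed data shapes: the scenario ids, the component names, and the function `check_spec_coverage` are hypothetical, and the skill does not prescribe a machine-readable format for `spec.md`.

```python
def check_spec_coverage(
    frozen_ids: set[str], spec: dict[str, list[str]]
) -> dict[str, list[str]]:
    """Compare a spec against the frozen scenario ids.

    `spec` maps scenario id -> components/changes that satisfy it
    (an assumed shape, for illustration only).
    """
    return {
        # Frozen scenarios with no component named: the spec dropped them.
        "uncovered": sorted(s for s in frozen_ids if not spec.get(s)),
        # Spec entries for ids that were never frozen: the spec added scope.
        "unknown": sorted(s for s in spec if s not in frozen_ids),
    }


# Illustrative data (invented for this sketch).
FROZEN = {"S1", "S2", "S3"}
SPEC = {
    "S1": ["TokenValidator"],
    "S2": [],            # listed, but no component named
    "S4": ["AuditLog"],  # not a frozen scenario
}

print(check_spec_coverage(FROZEN, SPEC))
# {'uncovered': ['S2', 'S3'], 'unknown': ['S4']}
```

Both lists being empty is the condition for the spec to count as derived from the frozen behaviors in this sketch.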
Decompose the spec into a dependency-ordered DAG of beads. Every bead MUST carry:
- `scenario_ref`: the id of a real frozen scenario it delivers, and
- `acceptance_test`: an invocable command + test path (from `acceptance-tests/` or the project test tree) that defines done for that bead. Prose-only acceptance is rejected.

Then apply the mechanical gate (compute it, do not self-report):
- `acceptance_test` resolves (in list mode: `--list` / `--collect-only` / `-list` / `--count`) to exactly one unignored test. Not zero, not 2+, not an arg error.
- `scenario_ref` is a real frozen scenario id.

A bead failing any check is rejected, not written. Only the gate-passed set advances.
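The two checks above can be computed rather than self-reported. The following is a minimal sketch, not the `bdd-foundry` implementation: the `Bead` dataclass, the `gate` function, and the fake lister are hypothetical. In a real project `list_tests` would run the bead's command in the framework's list mode and return the unignored tests it matched; here it is injected as a callable so the gate logic stays framework-neutral.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Bead:
    bead_id: str
    scenario_ref: str
    acceptance_test: str  # invocable command + test path


def gate(
    beads: list[Bead],
    frozen_ids: set[str],
    list_tests: Callable[[str], list[str]],
) -> tuple[list[Bead], dict[str, list[str]]]:
    """Return (passed beads, rejection reasons keyed by bead id)."""
    passed: list[Bead] = []
    rejected: dict[str, list[str]] = {}
    for bead in beads:
        reasons: list[str] = []
        if bead.scenario_ref not in frozen_ids:
            reasons.append("scenario_ref is not a frozen scenario id")
        try:
            matches = list_tests(bead.acceptance_test)
        except Exception as exc:  # an arg error counts as a failed check
            reasons.append(f"acceptance_test failed to list: {exc}")
        else:
            if len(matches) != 1:
                reasons.append(
                    f"acceptance_test resolves to {len(matches)} tests, expected 1"
                )
        if reasons:
            rejected[bead.bead_id] = reasons
        else:
            passed.append(bead)
    return passed, rejected


# Fake list-mode results (invented for this sketch).
FAKE_LISTING = {
    "pytest acceptance-tests/test_auth.py::test_expired_token": ["test_expired_token"],
    "pytest acceptance-tests/test_auth.py": ["test_expired_token", "test_missing_token"],
    "pytest acceptance-tests/test_empty.py": [],
}


def fake_list_tests(command: str) -> list[str]:
    if command not in FAKE_LISTING:
        raise ValueError("arg error")
    return FAKE_LISTING[command]


BEADS = [
    Bead("B1", "S1", "pytest acceptance-tests/test_auth.py::test_expired_token"),
    Bead("B2", "S9", "pytest acceptance-tests/test_auth.py::test_expired_token"),
    Bead("B3", "S1", "pytest acceptance-tests/test_auth.py"),
    Bead("B4", "S1", "pytest acceptance-tests/test_empty.py"),
    Bead("B5", "S1", "pytest --no-such-flag"),
]

PASSED, REJECTED = gate(BEADS, {"S1"}, fake_list_tests)
print([b.bead_id for b in PASSED])  # ['B1']
print(sorted(REJECTED))             # ['B2', 'B3', 'B4', 'B5']
```

Injecting the lister keeps the gate deterministic and easy to test; the rejection reasons are collected per bead so a reviewer can see every failed check rather than only the first.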
Behavior-first planning is not done when the beads are drafted — it is done when an independent reviewer confirms them. Before writing anything to the tracker:
Use br from the main checkout (never a worktree — it forks the bead DB), each bead self-contained with an explicit ACCEPTANCE section, deps wired per the manifest, overlap-checked against existing open beads.
bdd-foundry (.claude/workflows/bdd-foundry.js) is the thin orchestrator over this discipline: it dispatches each phase as a black-box agent, gates on the mechanical (computed-in-JS) checks above, and writes to the tracker only on the cleared verdict. This skill is the discipline; the workflow is the deterministic harness that runs it conformantly (see docs/architecture/workflow-conformance-pattern.md). When invoked directly (no workflow), you run the four phases yourself and apply the same gates by hand.
- `operating-loop`: this is the full Gherkin → executed-red → acceptance-gated-DAG discipline, used when beads must be genuinely crank-ready.

npx claudepluginhub boshu2/agentops --plugin agentops