From beat
Validates implementation completeness against spec artifacts using independent subagents to eliminate context bias.
Slash command: /beat:verify
Verify implementation against change artifacts across five dimensions. Uses independent subagents to eliminate context bias.
<decision_boundary>
Use for:
- source: distill)
NOT for:
- /beat:design)
- /beat:plan)
- /beat:apply)
- /beat:archive)
Trigger examples:
</decision_boundary>
You MUST dispatch independent subagents for verification — NEVER verify implementation yourself in the main session. The main session has context bias from the conversation history.
Dispatch the verification subagent AND code-reviewer in parallel — they are independent checks.
If a subagent fails, proceed with findings from the other. If BOTH fail, report the failure — do NOT fall back to self-verification.
After presenting the combined report: you MUST record the outcome in the top-level
verification field of status.yaml (see step 6). If verification could not run at all
(both subagents failed), do NOT record — a failed run is not a verification outcome.
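The dispatch and fallback rules above can be sketched in Python as follows. This is a minimal illustration, not the skill's actual mechanism: `run_independent_checks`, `verify`, and `review` are hypothetical names, and the two callables merely stand in for the two subagent dispatches.

```python
from concurrent.futures import ThreadPoolExecutor


def run_independent_checks(verify, review):
    """Run both checks in parallel and apply the fallback rule.

    `verify` and `review` are hypothetical zero-argument callables standing in
    for the two subagent dispatches; each returns a report or raises on failure.
    """
    checks = {"verification": verify, "code_review": review}
    reports, failed = {}, []
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Submit both before waiting on either, so neither check sees the
        # other's report.
        futures = {name: pool.submit(fn) for name, fn in checks.items()}
        for name, future in futures.items():
            try:
                reports[name] = future.result()
            except Exception:
                failed.append(name)
    # One surviving report is enough to proceed. If none survived, the run
    # itself failed and nothing should be recorded in status.yaml.
    return {"reports": reports, "failed": failed, "record": bool(reports)}
```

When `record` is false, the only valid next step is to report the failure. The sketch has no branch that falls back to self-verification.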
| Thought | Reality |
|---|---|
| "The change is small, I can verify it myself" | Self-verification creates confirmation bias. You saw the implementation — you can't objectively verify it. |
| "I already reviewed the code during apply" | That's exactly why you need an independent verifier. Familiarity breeds blind spots. |
| "Running two subagents is overkill for this" | Code quality and spec alignment are independent dimensions. A single agent conflates them. |
| "I'll just run the tests, that's verification enough" | Tests verify behavior but not spec alignment, design adherence, or code quality. |
| "I'll dispatch them sequentially to save context" | They're independent — parallel dispatch is faster and prevents one report from biasing the other. |
| "The report is delivered, the status.yaml write is just bookkeeping" | The verification field is how archive knows verify ran. Skip it and archive warns "never verified" on a verified change. Ten seconds — write it. |
verification record in status.yaml)
digraph verify {
"Select change" [shape=box];
"Read artifacts +\ntesting context" [shape=box];
"Parallel dispatch" [shape=box, style=bold];
"Verification\nsubagent" [shape=box];
"Code-reviewer\nsubagent" [shape=box];
"tests available?" [shape=diamond];
"Run automated tests" [shape=box];
"Present combined report" [shape=box];
"Record verification\nin status.yaml" [shape=doublecircle];
"Select change" -> "Read artifacts +\ntesting context";
"Read artifacts +\ntesting context" -> "Parallel dispatch";
"Parallel dispatch" -> "Verification\nsubagent";
"Parallel dispatch" -> "Code-reviewer\nsubagent";
"Verification\nsubagent" -> "tests available?";
"Code-reviewer\nsubagent" -> "tests available?";
"tests available?" -> "Run automated tests" [label="yes"];
"tests available?" -> "Present combined report" [label="no"];
"Run automated tests" -> "Present combined report";
"Present combined report" -> "Record verification\nin status.yaml";
}
Input: Optionally specify a change name. If omitted, infer from context or prompt.
Steps
Select the change
If no name provided:
- beat/changes/ directories (excluding archive/)
Read all artifacts and determine testing context
Read from beat/changes/<name>/:
- status.yaml (schema: references/status-schema.md)
- features/*.feature (all Gherkin files, if gherkin status is done)
- proposal.md (if exists)
- design.md (if exists)
- tasks.md (if exists)
Read beat/config.yaml (if exists, schema: references/config-schema.md).
Determine drive mode:
- gherkin status is done → Gherkin-driven verification
- gherkin status is skipped → Proposal-driven verification
Determine testing context (three-layer priority: tag > source > config):
- Is testing.required set to false? If yes, skip test existence checks globally.
- Does status.yaml contain source: distill? If yes, Dimension 1 switches to accuracy mode (see below).
- Does status.yaml have gherkin.modified? If yes, collect the listed paths and their .feature.orig backup paths — the verification subagent needs them for semantic verification (Dimension 1B+).
Dispatch verification subagent AND code-reviewer in parallel
Launch BOTH agents simultaneously using a single message with two Agent tool calls:
Agent A — Verification subagent (subagent_type: Explore):
Read verification-subagent-prompt.md for the complete subagent prompt.
Provide ONLY:
- gherkin.modified with their .feature.orig backup paths (if any)
Agent B — Code quality review (subagent_type: general-purpose):
Read code-reviewer-prompt.md for the complete subagent prompt.
Provide:
This reviews: code quality, architecture, naming, error handling, test quality, security, and plan alignment. Its output is Dimension 4, classified in Beat's CRITICAL/WARNING/SUGGESTION vocabulary.
Fallback: If one agent fails, proceed with the other's findings. If BOTH fail, report failure — do NOT self-verify.
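The context gathered in step 2 and handed to the agents can be sketched as a small Python function. It is a rough illustration under stated assumptions: `build_testing_context` is a hypothetical helper, the nesting of `gherkin.status`, `gherkin.modified`, and `testing.required` is guessed from the field names used in this document rather than taken from the schema files, and the tag layer of the priority order is not modeled.

```python
def build_testing_context(status, config):
    """Derive drive mode and testing context from already-parsed YAML dicts.

    Assumed shapes (not confirmed by the schema references):
      status = {"source": ..., "gherkin": {"status": ..., "modified": [...]}}
      config = {"testing": {"required": ...}}
    """
    gherkin = status.get("gherkin") or {}
    drive_mode = {"done": "gherkin", "skipped": "proposal"}.get(gherkin.get("status"))

    testing = config.get("testing") or {}
    modified = gherkin.get("modified") or []

    return {
        "drive_mode": drive_mode,
        # Only an explicit `false` disables the test existence checks.
        "skip_test_existence_checks": testing.get("required") is False,
        # source: distill switches Dimension 1 to accuracy mode.
        "accuracy_mode": status.get("source") == "distill",
        # Assumption: each backup sits next to its feature file with ".orig" appended.
        "modified_features": [(path, path + ".orig") for path in modified],
    }
```

A `drive_mode` of `None` means the gherkin status was neither `done` nor `skipped`. The document does not say how to handle that case, so the sketch leaves the decision to the caller.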
Run automated tests if available
Detect and run the project's test suite:
- testing.behavior framework (or auto-detect)
- testing.e2e framework (or auto-detect). If beat/changes/<name>/features/ contains feature files, combine BDD feature paths: beat/features/ + beat/changes/<name>/features/
Present combined verification report
Combine both subagent reports:
Record the outcome in status.yaml
Read beat/changes/<name>/status.yaml (read before write — preserve existing fields), then set the top-level verification field per references/status-schema.md:
verification: { status: passed, critical: 0, date: YYYY-MM-DD }
- status: passed when zero CRITICAL findings; issues-found otherwise
- critical: the CRITICAL count from the combined report, including failing automated tests from step 4
- phase — verification outcome lives only in this field
This is the only file verify writes. /beat:archive uses it to warn when archiving an unverified change. Re-running verify after fixes overwrites the field.
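The read-before-write update could look roughly like this in Python. It is a sketch only: `record_verification` is a hypothetical helper that works on an already-parsed `status.yaml` mapping, and loading and saving the YAML file is left out.

```python
from datetime import date


def record_verification(status, critical, today=None):
    """Return a copy of the parsed status.yaml with `verification` set.

    `status` is the mapping read from status.yaml; all other fields are
    preserved untouched. `critical` is the CRITICAL count from the combined
    report, including failing automated tests.
    """
    updated = dict(status)  # read before write: keep every existing field
    updated["verification"] = {
        "status": "passed" if critical == 0 else "issues-found",
        "critical": critical,
        "date": (today or date.today()).isoformat(),  # YYYY-MM-DD
    }
    return updated
```

Calling it again after fixes replaces the previous `verification` value, which matches the overwrite behavior described above.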
Issue Classification
Dimension 5 is advisory — its findings classify as WARNING or SUGGESTION only, never CRITICAL. The user decides whether to act before archiving; living-doc drift never blocks the archive.
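The advisory rule for Dimension 5 can be expressed as a small guard. This is a hedged sketch: `classify` is a hypothetical function, and downgrading a proposed CRITICAL to WARNING is an assumption, since the text only says Dimension 5 findings are never CRITICAL.

```python
SEVERITIES = ("CRITICAL", "WARNING", "SUGGESTION")


def classify(dimension, proposed):
    """Return the severity to report for a finding in the given dimension."""
    if proposed not in SEVERITIES:
        raise ValueError(f"unknown severity: {proposed!r}")
    # Dimension 5 is advisory, so its findings never count as CRITICAL.
    # Assumption: a proposed CRITICAL is downgraded to WARNING, not dropped.
    if dimension == 5 and proposed == "CRITICAL":
        return "WARNING"
    return proposed
```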
Graceful Degradation
npx claudepluginhub kirkchen/beat --plugin beat