Skill

execute

Use when builder says 开始做 / 做吧 / 执行 / go / start / 跑起来 / 走起 / execute / execute-plan and an approved `.bdc/plans/*.md` exists; execute plan steps with verify evidence, scope checks, commits, and checkbox progress. Skip if no plan exists, pure Q&A, or a trivial one-off change.

Popularity

Stars

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/builders-dont-cry:execute

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

If builder says `别走流程`, `直接做`, `简单点`, or `不走地图`, do not force `/execute`. Acknowledge once and do the smallest direct action.

Supporting Files

references/progress-dashboard.mdreferences/tdd-cycle.mdreferences/test-fanout.md

SKILL.md

227 lines · ~5k tokens

Stats

LanguageShell

Stars4

MaintenanceExcellent

Last CommitMay 5, 2026

Actions

View Source View Plugin View on GitHub View README

Stats

Actions

/execute — Let's fucking go: execute plan steps with dispatch + TDD loop

Step 0: Escape Hatch

If builder says 别走流程, 直接做, 简单点, or 不走地图, do not force /execute. Acknowledge once and do the smallest direct action.

Exception: do not bypass safety gates for destructive changes, secret access, public upload, deploy, database migration, force push, or irreversible writes.

Step 0.5: Project Goal Check

If PROJECT_GOAL.md exists in the target root, read it before execution. If the approved plan conflicts with current_phase, exit_criteria, or out_of_scope, stop and ask whether to update the goal or return to /plan. Do not edit PROJECT_GOAL.md automatically.

Contract

Each invocation:

Find candidate plans via bash ${BDC_HOME:-$HOME/.bdc}/scripts/active-plan-resolver.sh --list — resolver scans repo .bdc/plans/ ∪ ${BDC_HOME:-$HOME/.bdc}/plans/, applies INDEX-status / mtime / unfinished-checkbox filters. If multiple filesystem candidates appear, resolve from current-session evidence before asking: explicit plan path/name in the prompt, the plan just created/reviewed/approved in this session, a current-plan cursor if present, current repo target root, or a clear project/topic match. If one candidate is uniquely supported, use it and state the reason. Ask builder only when no unique candidate remains after session-context resolution, or when the next action is destructive/irreversible.
Read the plan header + the first unchecked - [ ] Step
Look up <role> in ${BDC_HOME:-$HOME/.bdc}/standards/model-router.md and dispatch accordingly
Run the verify command to collect evidence
Pass → commit → change - [ ] to - [x] in the plan → move to the next Step
Fail → root cause (${BDC_HOME:-$HOME/.bdc}/standards/debugging.md) → retry once with the same role → if still failing, escalate to Opus or report to builder

Iron Law

NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE

Before claiming a Step is complete, in this message, you must have run the verify command and seen the output. These 6 excuses are all invalid:

Excuse	Reality
"Should pass now"	Run the command and check the output
"I'm confident"	Confidence ≠ evidence
"Linter passed"	Linter does not check compilation / tests
"Codex returned success"	Verify independently
"Just this once"	No exceptions
"Previous run passed"	Did you run it this time? If not, it hasn't passed

You Do Not Write Code

You (Claude) only do orchestration. Production code must be dispatched to Codex (by role):

Modifying .py .ts .tsx .js .jsx .go .rs .swift or other code files → dispatch
Writing or modifying tests → dispatch
Refactoring → dispatch

Exceptions (you handle directly):

.md .yaml .toml CLAUDE.md settings.json and other config / docs / behavior-rule files
Single-line typo fixes
Fallback when Codex fails 2 consecutive times on the same task

Per-Step Loop

For the test/implementation cycle inside a coding step, load references/tdd-cycle.md. For test fan-out decisions, load references/test-fanout.md. Do not create a standalone /tdd skill; TDD is part of /execute step execution. Because: splitting it out lets execution bypass RED/GREEN discipline.

Tag References

When a plan step carries a tag, load the matching reference before dispatch:

Tag	Reference
`[source-driven]`	`${BDC_HOME:-$HOME/.bdc}/standards/source-driven-development.md`
`[security-sensitive]`	`${BDC_HOME:-$HOME/.bdc}/standards/checklists/security.md`
`[interface]`	`${BDC_HOME:-$HOME/.bdc}/standards/interface-design.md`
`[adr]`	`${BDC_HOME:-$HOME/.bdc}/standards/checklists/adr-lite.md`
`[deprecation]`	`${BDC_HOME:-$HOME/.bdc}/standards/checklists/migration.md`
`[simplify-required]`	`${BDC_HOME:-$HOME/.bdc}/standards/checklists/simplification.md`
`[debug-required]`	`${BDC_HOME:-$HOME/.bdc}/standards/debugging.md`

Do not invent new tags during execution. Because: tags are plan-time contracts.

Review Obligations From Tags

[debug-required] creates a review obligation: require RED evidence before implementation, or record why RED is impossible.
test-engineer review obligation checks RED evidence, weak oracle risk, and whether the lowest useful test level was chosen.
[security-sensitive] creates a review obligation: run or schedule targeted /verify --deep security review with verify-security-auditor before /ship.
[interface] creates a review obligation: code review must check downstream contract compatibility in /verify.

1. Read next `- [ ] <action> ← verify: <cmd> — <role>` from plan
2. Look up <role> in ${BDC_HOME:-$HOME/.bdc}/standards/model-router.md to determine who to dispatch to
3. Output the **progress dashboard** — three required parts every step:
   (a) **TodoList block** — at /execute start TaskCreate one task per plan `- [ ]` step (subjects plain text, no markdown). Per-step TaskUpdate status pending→in_progress→completed. Subject template: `<step id> <step description> — <Model>/<Role> · <duration> · <result> [⚠ <exception>]`. `<Model>` MUST include version/tier; bare names rejected.
   (b) **chat header** above TodoList: `` `[N/M · X%]` · 当前 `<step id> <step description>` ``.
   (c) **▶ chat dispatch line** below TodoList: `` ▶ `<step id> <step description>` · `<Model>/<Role>` · 业务: <plain Chinese> ``.
   Full template + worked example: `references/progress-dashboard.md`.
4. **Stopwatch start**: run `date +%s` via Bash, echo the value in chat (`▶ <step> 起点 t=<epoch>`). This is the only reliable source of truth for `<duration>`.
5. **Pre-Codex setup** (v8 Bug 1+2 — required before dispatch):
   a. Write step packet:
        `bash ${BDC_HOME:-$HOME/.bdc}/scripts/step-packet.sh write --plan <plan> --step <combined_step_id>`

        **`<combined_step_id>` MUST be `F<NNN>.S<n>`** (zero-padded feature id + dot + step id),
        e.g. `F001.S1` / `F002.S3` — combined from the `### F00N: ...` Feature header and the
        `- [ ] S<n>` step within. Per-step expected_files / oracle_source / falsify tables are
        keyed by this form. **DO NOT** pass bare `S1` — `step-packet.sh` rejects bare
        step ids before packet write, because a bare id cannot map feature-scoped expected_files.
        See `standards/plan-management.md § Combined Step ID format` (v8 round-2 P0#2).
        (this writes ${BDC_HOME:-$HOME/.bdc}/state/step-<wid>.json with target_root + worktree_id + expected_files + oracle_source)
   b. Pre-Codex snapshot (R1 — Codex 写文件不走 Edit/Write hook,必须 /execute 自己拍快照):
        `bash ${BDC_HOME:-$HOME/.bdc}/scripts/scope-check.sh --phase pre --packet <packet>`
   c. Read oracle_source from plan if test step → inject into Codex prompt Context (truncated per v8 plan §B.4)

6. Dispatch (prompt MUST be passed via single-quoted heredoc — see model-router.md § Prompt Passing; never via `"<prompt>"` double-quoted arg).
   **All Codex commands MUST include `-C "$TARGET_ROOT"`** (v8 Bug 3 — read TARGET_ROOT from packet via `jq -r .target_root <packet>` or from plan via `step-packet.sh read --plan <plan>`):
     coding         → codex exec -m gpt-5.5 -c model_reasoning_effort=xhigh -C "$TARGET_ROOT" - <<'CODEX_EOF' … CODEX_EOF
       - **Test-code subtype**: if behavior is already chosen, oracle_source is present, and the task is only generating tests, route to Spark and follow `references/test-fanout.md`. Main orchestration owns behavior judgment; Spark writes test code only.
     chore / docs   → codex exec -C "$TARGET_ROOT" - <<'CODEX_EOF' … CODEX_EOF   (default Spark)
     research       → Agent tool model: "sonnet" subagent_type: "Explore"
     batch          → mcp__lmstudio__generate + json_schema
     review         → spawn feature-dev:code-architect model: opus
     orchestration  → handle directly
7. **Post-Codex checks** (v8 Bug 1+2 — Codex output not covered by hooks, must run explicitly):
   a. Post-Codex snapshot + scope check:
        `bash ${BDC_HOME:-$HOME/.bdc}/scripts/scope-check.sh --phase post --packet <packet>`
        (writes scope_check evidence event; for `atomic-plan-core-v1` with non-empty `expected_files`, exit 3 means out-of-scope files were touched → stop before verify, commit, or checkbox flip. Legacy/empty-scope packets remain advisory.)
   b. **Explicit typecheck/format** (DO NOT rely on PostToolUse hook — hook only catches direct Claude Edit/Write/MultiEdit, not Codex):
        - python: `ruff check $TARGET_ROOT 2>&1 || true; mypy $TARGET_ROOT 2>&1 || true`
        - typescript: `cd $TARGET_ROOT && npx tsc --noEmit 2>&1 || true`
        - swift: `swift build --package-path $TARGET_ROOT 2>&1 || true`
   c. Run the plan's verify command and capture output
   d. Write evidence: `bash ${BDC_HOME:-$HOME/.bdc}/scripts/evidence.sh write --event verify_finished --packet <packet> --field "exit_code=$?"`
7. **Stopwatch end**: run `date +%s` again, subtract the start epoch, format as `XmYs` (e.g. 138 → `2m18s`), TaskUpdate the subject's `<duration>` field with this value
8. exit 0 + expected output → fill `<result>` = `通过`, TaskUpdate subject + flip status to completed → next step
   otherwise → fill `<result>` = `失败`, root cause (`${BDC_HOME:-$HOME/.bdc}/standards/debugging.md`) → retry once with same role (mark `⚠ 重试 1 次`)
            → still failing → escalate to Opus (mark `⚠ 升级 Opus 4.7`) or report to builder
9. **Re-echo header**: after every TaskUpdate (status flip OR subject change), echo a fresh chat header line: `` `[N/M · X%]` · 当前 `<step id> <step description>` ``. No silent TaskUpdate — header echo is required pairing.
10. verify passes → atomic commit (codex's code + plan flip together via commit-commands:commit) → Edit plan to change `- [ ]` to `- [x]`
11. Next Step

**Note on commit scope**: before claiming the commit is atomic over both code and plan, runtime-check whether the plan file is actually tracked in the repo: `git ls-files --error-unmatch <plan-file> >/dev/null 2>&1`. If exit 0, the plan is tracked → atomic commit covers code + plan flip. If exit non-zero, the plan is not tracked (e.g. an unwhitelisted `.bdc/plans/` in some project repo, or codex wrote to `/tmp/`) → still flip the checkbox, but commit covers code changes alone and the wording must reflect that. As of 2026-04-27 `${BDC_HOME:-$HOME/.bdc}/` and most project repos whitelist `.bdc/plans/` so atomicity holds; do not assume it without the runtime check.

Codex Prompt Template

Wrap the prompt in a single-quoted heredoc + - arg. zsh leaves backticks / $vars / \ alone inside <<'…', so plan code blocks and YAML schemas survive intact.

v8 Bug 3 修订:Codex must run with -C "$TARGET_ROOT" (read from packet:TARGET_ROOT=$(jq -r .target_root <packet>)). This pins Codex to the plan's declared Target root instead of inheriting the BDC pwd.

TARGET_ROOT=$(jq -r .target_root ${BDC_HOME:-$HOME/.bdc}/state/step-${WORKTREE_ID}.json)

codex exec -m gpt-5.5 -c model_reasoning_effort=xhigh -C "$TARGET_ROOT" - <<'CODEX_EOF'
<task action description from plan>

Constraints:
- Files in scope: <inferred from plan's Per-step expected_files (target-root-relative)>
- Must pass: <plan's verify cmd>
- Run verify yourself after making changes
- If verify fails, root-cause and fix — maximum 2 rounds
- Do not modify files unrelated to this task. Because: `/execute` scope evidence treats unrelated writes as atomic-boundary failures.

Return: diff summary + verify output
CODEX_EOF

Fallback

Failure case	Response
`codex exec -m gpt-5.5` 429 rate limit	Switch to default Spark: `codex exec - <<'CODEX_EOF' … CODEX_EOF` (separate quota; same heredoc form, drop only `-m gpt-5.5`)
Codex fails 2 consecutive times on same task	Escalate to Opus / current Claude session takes over. `codex:rescue` is opt-in only — do not auto-fall back to it (it leaves `--model` unset and silently downgrades coding tasks to Spark).
Verify command itself is wrong (plan bug)	Stop and report to builder — do not fix the verify to fool yourself
MCP channel schema error	Switch to Bash `codex exec`

Done Definition (8 steps per Step — v8 Bug 2 修订)

Pre-Codex: write step packet + pre snapshot (step-packet.sh write + scope-check.sh --phase pre)
Dispatch Codex with -C "$TARGET_ROOT" to write code (or write docs yourself)
Post-Codex: snapshot + scope check (scope-check.sh --phase post). For atomic-plan-core-v1 with non-empty expected_files, exit 3 means out-of-scope files were touched → stop before verify, commit, or checkbox flip. Legacy/empty-scope packets remain advisory.
Explicit typecheck/format/lint (DO NOT trust hook auto-run for Codex output — hook only catches direct Claude Edit/Write/MultiEdit). Run ruff / tsc --noEmit / swift build etc. manually based on language.
plan's verify command exits 0 (write verify_finished evidence event)
git commit atomic (commit-commands:commit)
plan Edit - [ ] → - [x] + step-packet.sh clean to remove state file
Next Step

v8 Bug 2 explanation: PostToolUse hook on Edit/Write/MultiEdit (post-edit-combined.js) does NOT fire for codex exec file writes — Codex writes through its own process, not Claude Code's tool layer. Old wording "typecheck / lint passes (hook auto-runs)" was false; quality checks were silently skipped on every Codex-driven step. v8 makes them explicit.

Anti-patterns

Do not claim a Step is complete without running verify. Because: completion claims must be backed by fresh evidence.
Do not write .py .ts or other code files yourself (must dispatch to Codex). Because: /execute keeps Claude as orchestrator and Codex as code writer.
Do not escalate to Opus after only 1 Codex failure (retry first). Because: one retry catches transient prompt or environment misses without spending the judgment model.
Do not skip a plan Step because it "looks obviously fine". Because: unchecked steps break the plan's audit trail.
Do not combine multiple Steps into one commit. Because: combined commits violate atomicity and make rollback coarse.
Do not check off a Step when verify "looks like it passed" — must have exit 0 + expected output. Because: visual output without exit-code evidence is not a reliable gate.
Do not fan-out Spark agents across cycles in the same vertical slice. Because: parallel cycles turn TDD into horizontal slicing. Fan-out is allowed only for already-decided test cases with an external oracle_source. See references/test-fanout.md.
Do not modify a test's expectation after implementation exists to make it pass (anti-adaptation-loop). Because: adapting the oracle to the implementation destroys the RED signal. Fix the implementation or restart from a new externally sourced RED.
Do not skip the progress dashboard in Step 3. Because: builder needs live visual state, business-language dispatch, and current percentage in the same step window. Required parts: (a) TaskCreate registers all plan steps as TodoList tasks on session start (subject = plain text, NO markdown — **, backticks, _ show as literal chars and look broken). (b) Each step boundary triggers TaskUpdate to flip status (pending → in_progress → completed); fallback / escalation also TaskUpdate the subject to reflect the model that actually ran. (c) chat echoes header above the TodoList and ▶ dispatch line below — key fields wrapped in backticks for blue inline-code highlight. Subject format: <step> — <Model>/<Role> · <duration> · <result> [⚠ <exception>]. <Model> MUST include version/tier (Codex GPT-5.5, Opus 4.7 xhigh); bare Codex / Claude / Opus rejected. ▶ line uses plain Chinese business language, never [coding/effort] tags. builder reads three things: TodoList visual state (auto-rendered ✓/■/□ + strikethrough/bold), chat progress header, business-language ▶. Code internals (file paths, commit SHAs, diff stats, verify command strings) do NOT belong in any of these.
Iron Law A — header echo pairing: every TaskUpdate (status flip OR subject change) MUST be paired with a fresh chat header echo `[N/M · X%]` · 当前 `<step id> <step description>` in the SAME assistant turn. A silent TaskUpdate without paired header is a violation. Why: builder needs the percentage to live next to the visual TodoList state — if you only flip status without re-echoing the header, the screen progress goes stale.
Iron Law B — subject must carry full state: every in_progress or completed task subject MUST contain all 5 fields per the template: <step id> <step desc> — <Model>/<Role> · <duration> · <result> [⚠ <exception>]. <duration> for completed comes from real date +%s deltas (Step 4 + Step 7 of the per-step loop), never fabricated; absent stopwatch = absent duration field, but result + Model/Role still required. <result> (通过 / 失败) is fact-based off verify exit code. Empty subjects like ■ S3 wire health probe — Codex GPT-5.5/Coding (no duration, no result) are violations — they look like the broken screenshot builder showed on 2026-04-26.
Do not delegate coding role tasks to the codex:codex-rescue subagent. Because: that subagent is a thin forwarder for ad-hoc "find codex to rescue" requests; its spec explicitly leaves --model unset, so codex falls back to the config default (gpt-5.3-codex-spark) and silently downgrades any task tagged [coding / GPT-5.5 / xhigh]. For coding tasks, dispatch via direct Bash heredoc form (codex exec -m gpt-5.5 -c model_reasoning_effort=xhigh - <<'CODEX_EOF' … CODEX_EOF). Use codex:codex-rescue only when builder explicitly asks for it.
Do not pass the prompt as a double-quoted positional arg (codex exec ... "<prompt>"). Because: zsh expands backticks / $VAR / \ inside double quotes, so plan code blocks, verify commands, and YAML schemas get executed by the host shell before codex starts. Use single-quoted heredoc + - (see model-router.md § Prompt Passing). Failure modes seen 2026-04-26: 0-byte output, hung Bash call, prompt arriving at codex with backtick contents replaced by host-command output. Heredoc with 'CODEX_EOF' disables every form of expansion.

Plan files: .bdc/plans/<proj>-<MMDD>-<topic>.md
Format rules: ${BDC_HOME:-$HOME/.bdc}/standards/plan-management.md
Role dispatch: ${BDC_HOME:-$HOME/.bdc}/standards/model-router.md
Codex channels: ${BDC_HOME:-$HOME/.bdc}/standards/model-router.md § Codex three channels
Debug process: ${BDC_HOME:-$HOME/.bdc}/standards/debugging.md

Verification

Run the plan step's verify command fresh before claiming completion.
Run post-dispatch scope check before verify and commit.
Run git ls-files --error-unmatch <plan-file> before claiming code + plan checkbox are in the same commit.

Red Flags

A bare S1 is passed outside its feature block instead of F001.S1.
Codex writes files but no explicit typecheck or test command is run afterward.
A scope check reports out-of-scope files and execution continues anyway.

Common Rationalizations

Rationalization	Response
"Codex succeeded, so the step is done."	Do not. Incident: `standards/debugging.md` 2026-04-22 says Codex wrote code and tests but could not commit because sandbox `.git/` writes failed. Claude must verify and commit.
"Passing the prompt as a quoted arg is simpler."	Do not. Incident: `standards/debugging.md` 2026-04-26 records `codex exec "<prompt>"` losing backticked content through shell expansion. Use single-quoted heredoc.

execute

Popularity

Invocation

Context Preview

Supporting Files

SKILL.md

execute

Popularity

Invocation

Context Preview

Supporting Files

SKILL.md

/execute — Let's fucking go: execute plan steps with dispatch + TDD loop

Step 0: Escape Hatch

Step 0.5: Project Goal Check

Contract

Iron Law

You Do Not Write Code

Per-Step Loop

Tag References

Review Obligations From Tags

Codex Prompt Template

Fallback

Done Definition (8 steps per Step — v8 Bug 2 修订)

Anti-patterns

Related

Verification

Red Flags

Common Rationalizations

Similar Skills

/execute — Let's fucking go: execute plan steps with dispatch + TDD loop

Step 0: Escape Hatch

Step 0.5: Project Goal Check

Contract

Iron Law

You Do Not Write Code

Per-Step Loop

Tag References

Review Obligations From Tags

Codex Prompt Template

Fallback

Done Definition (8 steps per Step — v8 Bug 2 修订)

Anti-patterns

Related

Verification

Red Flags

Common Rationalizations

Similar Skills