From builders-dont-cry
Use when builder says 开始做 / 做吧 / 执行 / go / start / 跑起来 / 走起 / execute / execute-plan and an approved `.bdc/plans/*.md` exists; execute plan steps with verify evidence, scope checks, commits, and checkbox progress. Skip if no plan exists, pure Q&A, or a trivial one-off change.
How this skill is triggered — by the user, by Claude, or both
Slash command
/builders-dont-cry:executeThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
If builder says `别走流程`, `直接做`, `简单点`, or `不走地图`, do not force `/execute`. Acknowledge once and do the smallest direct action.
If builder says 别走流程, 直接做, 简单点, or 不走地图, do not force /execute. Acknowledge once and do the smallest direct action.
Exception: do not bypass safety gates for destructive changes, secret access, public upload, deploy, database migration, force push, or irreversible writes.
If PROJECT_GOAL.md exists in the target root, read it before execution. If the approved plan conflicts with current_phase, exit_criteria, or out_of_scope, stop and ask whether to update the goal or return to /plan. Do not edit PROJECT_GOAL.md automatically.
Each invocation:
bash ${BDC_HOME:-$HOME/.bdc}/scripts/active-plan-resolver.sh --list — resolver scans repo .bdc/plans/ ∪ ${BDC_HOME:-$HOME/.bdc}/plans/, applies INDEX-status / mtime / unfinished-checkbox filters. If multiple filesystem candidates appear, resolve from current-session evidence before asking: explicit plan path/name in the prompt, the plan just created/reviewed/approved in this session, a current-plan cursor if present, current repo target root, or a clear project/topic match. If one candidate is uniquely supported, use it and state the reason. Ask builder only when no unique candidate remains after session-context resolution, or when the next action is destructive/irreversible.- [ ] Step<role> in ${BDC_HOME:-$HOME/.bdc}/standards/model-router.md and dispatch accordingly- [ ] to - [x] in the plan → move to the next Step${BDC_HOME:-$HOME/.bdc}/standards/debugging.md) → retry once with the same role → if still failing, escalate to Opus or report to builderNO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION EVIDENCE
Before claiming a Step is complete, in this message, you must have run the verify command and seen the output. These 6 excuses are all invalid:
| Excuse | Reality |
|---|---|
| "Should pass now" | Run the command and check the output |
| "I'm confident" | Confidence ≠ evidence |
| "Linter passed" | Linter does not check compilation / tests |
| "Codex returned success" | Verify independently |
| "Just this once" | No exceptions |
| "Previous run passed" | Did you run it this time? If not, it hasn't passed |
You (Claude) only do orchestration. Production code must be dispatched to Codex (by role):
.py .ts .tsx .js .jsx .go .rs .swift or other code files → dispatchExceptions (you handle directly):
.md .yaml .toml CLAUDE.md settings.json and other config / docs / behavior-rule filesFor the test/implementation cycle inside a coding step, load references/tdd-cycle.md.
For test fan-out decisions, load references/test-fanout.md.
Do not create a standalone /tdd skill; TDD is part of /execute step execution. Because: splitting it out lets execution bypass RED/GREEN discipline.
When a plan step carries a tag, load the matching reference before dispatch:
| Tag | Reference |
|---|---|
[source-driven] | ${BDC_HOME:-$HOME/.bdc}/standards/source-driven-development.md |
[security-sensitive] | ${BDC_HOME:-$HOME/.bdc}/standards/checklists/security.md |
[interface] | ${BDC_HOME:-$HOME/.bdc}/standards/interface-design.md |
[adr] | ${BDC_HOME:-$HOME/.bdc}/standards/checklists/adr-lite.md |
[deprecation] | ${BDC_HOME:-$HOME/.bdc}/standards/checklists/migration.md |
[simplify-required] | ${BDC_HOME:-$HOME/.bdc}/standards/checklists/simplification.md |
[debug-required] | ${BDC_HOME:-$HOME/.bdc}/standards/debugging.md |
Do not invent new tags during execution. Because: tags are plan-time contracts.
[debug-required] creates a review obligation: require RED evidence before implementation, or record why RED is impossible.test-engineer review obligation checks RED evidence, weak oracle risk, and whether the lowest useful test level was chosen.[security-sensitive] creates a review obligation: run or schedule targeted /verify --deep security review with verify-security-auditor before /ship.[interface] creates a review obligation: code review must check downstream contract compatibility in /verify.1. Read next `- [ ] <action> ← verify: <cmd> — <role>` from plan
2. Look up <role> in ${BDC_HOME:-$HOME/.bdc}/standards/model-router.md to determine who to dispatch to
3. Output the **progress dashboard** — three required parts every step:
(a) **TodoList block** — at /execute start TaskCreate one task per plan `- [ ]` step (subjects plain text, no markdown). Per-step TaskUpdate status pending→in_progress→completed. Subject template: `<step id> <step description> — <Model>/<Role> · <duration> · <result> [⚠ <exception>]`. `<Model>` MUST include version/tier; bare names rejected.
(b) **chat header** above TodoList: `` `[N/M · X%]` · 当前 `<step id> <step description>` ``.
(c) **▶ chat dispatch line** below TodoList: `` ▶ `<step id> <step description>` · `<Model>/<Role>` · 业务: <plain Chinese> ``.
Full template + worked example: `references/progress-dashboard.md`.
4. **Stopwatch start**: run `date +%s` via Bash, echo the value in chat (`▶ <step> 起点 t=<epoch>`). This is the only reliable source of truth for `<duration>`.
5. **Pre-Codex setup** (v8 Bug 1+2 — required before dispatch):
a. Write step packet:
`bash ${BDC_HOME:-$HOME/.bdc}/scripts/step-packet.sh write --plan <plan> --step <combined_step_id>`
**`<combined_step_id>` MUST be `F<NNN>.S<n>`** (zero-padded feature id + dot + step id),
e.g. `F001.S1` / `F002.S3` — combined from the `### F00N: ...` Feature header and the
`- [ ] S<n>` step within. Per-step expected_files / oracle_source / falsify tables are
keyed by this form. **DO NOT** pass bare `S1` — `step-packet.sh` rejects bare
step ids before packet write, because a bare id cannot map feature-scoped expected_files.
See `standards/plan-management.md § Combined Step ID format` (v8 round-2 P0#2).
(this writes ${BDC_HOME:-$HOME/.bdc}/state/step-<wid>.json with target_root + worktree_id + expected_files + oracle_source)
b. Pre-Codex snapshot (R1 — Codex 写文件不走 Edit/Write hook,必须 /execute 自己拍快照):
`bash ${BDC_HOME:-$HOME/.bdc}/scripts/scope-check.sh --phase pre --packet <packet>`
c. Read oracle_source from plan if test step → inject into Codex prompt Context (truncated per v8 plan §B.4)
6. Dispatch (prompt MUST be passed via single-quoted heredoc — see model-router.md § Prompt Passing; never via `"<prompt>"` double-quoted arg).
**All Codex commands MUST include `-C "$TARGET_ROOT"`** (v8 Bug 3 — read TARGET_ROOT from packet via `jq -r .target_root <packet>` or from plan via `step-packet.sh read --plan <plan>`):
coding → codex exec -m gpt-5.5 -c model_reasoning_effort=xhigh -C "$TARGET_ROOT" - <<'CODEX_EOF' … CODEX_EOF
- **Test-code subtype**: if behavior is already chosen, oracle_source is present, and the task is only generating tests, route to Spark and follow `references/test-fanout.md`. Main orchestration owns behavior judgment; Spark writes test code only.
chore / docs → codex exec -C "$TARGET_ROOT" - <<'CODEX_EOF' … CODEX_EOF (default Spark)
research → Agent tool model: "sonnet" subagent_type: "Explore"
batch → mcp__lmstudio__generate + json_schema
review → spawn feature-dev:code-architect model: opus
orchestration → handle directly
7. **Post-Codex checks** (v8 Bug 1+2 — Codex output not covered by hooks, must run explicitly):
a. Post-Codex snapshot + scope check:
`bash ${BDC_HOME:-$HOME/.bdc}/scripts/scope-check.sh --phase post --packet <packet>`
(writes scope_check evidence event; for `atomic-plan-core-v1` with non-empty `expected_files`, exit 3 means out-of-scope files were touched → stop before verify, commit, or checkbox flip. Legacy/empty-scope packets remain advisory.)
b. **Explicit typecheck/format** (DO NOT rely on PostToolUse hook — hook only catches direct Claude Edit/Write/MultiEdit, not Codex):
- python: `ruff check $TARGET_ROOT 2>&1 || true; mypy $TARGET_ROOT 2>&1 || true`
- typescript: `cd $TARGET_ROOT && npx tsc --noEmit 2>&1 || true`
- swift: `swift build --package-path $TARGET_ROOT 2>&1 || true`
c. Run the plan's verify command and capture output
d. Write evidence: `bash ${BDC_HOME:-$HOME/.bdc}/scripts/evidence.sh write --event verify_finished --packet <packet> --field "exit_code=$?"`
7. **Stopwatch end**: run `date +%s` again, subtract the start epoch, format as `XmYs` (e.g. 138 → `2m18s`), TaskUpdate the subject's `<duration>` field with this value
8. exit 0 + expected output → fill `<result>` = `通过`, TaskUpdate subject + flip status to completed → next step
otherwise → fill `<result>` = `失败`, root cause (`${BDC_HOME:-$HOME/.bdc}/standards/debugging.md`) → retry once with same role (mark `⚠ 重试 1 次`)
→ still failing → escalate to Opus (mark `⚠ 升级 Opus 4.7`) or report to builder
9. **Re-echo header**: after every TaskUpdate (status flip OR subject change), echo a fresh chat header line: `` `[N/M · X%]` · 当前 `<step id> <step description>` ``. No silent TaskUpdate — header echo is required pairing.
10. verify passes → atomic commit (codex's code + plan flip together via commit-commands:commit) → Edit plan to change `- [ ]` to `- [x]`
11. Next Step
**Note on commit scope**: before claiming the commit is atomic over both code and plan, runtime-check whether the plan file is actually tracked in the repo: `git ls-files --error-unmatch <plan-file> >/dev/null 2>&1`. If exit 0, the plan is tracked → atomic commit covers code + plan flip. If exit non-zero, the plan is not tracked (e.g. an unwhitelisted `.bdc/plans/` in some project repo, or codex wrote to `/tmp/`) → still flip the checkbox, but commit covers code changes alone and the wording must reflect that. As of 2026-04-27 `${BDC_HOME:-$HOME/.bdc}/` and most project repos whitelist `.bdc/plans/` so atomicity holds; do not assume it without the runtime check.
Wrap the prompt in a single-quoted heredoc + - arg. zsh leaves backticks / $vars / \ alone inside <<'…', so plan code blocks and YAML schemas survive intact.
v8 Bug 3 修订:Codex must run with -C "$TARGET_ROOT" (read from packet:TARGET_ROOT=$(jq -r .target_root <packet>)). This pins Codex to the plan's declared Target root instead of inheriting the BDC pwd.
TARGET_ROOT=$(jq -r .target_root ${BDC_HOME:-$HOME/.bdc}/state/step-${WORKTREE_ID}.json)
codex exec -m gpt-5.5 -c model_reasoning_effort=xhigh -C "$TARGET_ROOT" - <<'CODEX_EOF'
<task action description from plan>
Constraints:
- Files in scope: <inferred from plan's Per-step expected_files (target-root-relative)>
- Must pass: <plan's verify cmd>
- Run verify yourself after making changes
- If verify fails, root-cause and fix — maximum 2 rounds
- Do not modify files unrelated to this task. Because: `/execute` scope evidence treats unrelated writes as atomic-boundary failures.
Return: diff summary + verify output
CODEX_EOF
| Failure case | Response |
|---|---|
codex exec -m gpt-5.5 429 rate limit | Switch to default Spark: codex exec - <<'CODEX_EOF' … CODEX_EOF (separate quota; same heredoc form, drop only -m gpt-5.5) |
| Codex fails 2 consecutive times on same task | Escalate to Opus / current Claude session takes over. codex:rescue is opt-in only — do not auto-fall back to it (it leaves --model unset and silently downgrades coding tasks to Spark). |
| Verify command itself is wrong (plan bug) | Stop and report to builder — do not fix the verify to fool yourself |
| MCP channel schema error | Switch to Bash codex exec |
step-packet.sh write + scope-check.sh --phase pre)-C "$TARGET_ROOT" to write code (or write docs yourself)scope-check.sh --phase post). For atomic-plan-core-v1 with non-empty expected_files, exit 3 means out-of-scope files were touched → stop before verify, commit, or checkbox flip. Legacy/empty-scope packets remain advisory.ruff / tsc --noEmit / swift build etc. manually based on language.verify_finished evidence event)git commit atomic (commit-commands:commit)- [ ] → - [x] + step-packet.sh clean to remove state filev8 Bug 2 explanation: PostToolUse hook on Edit/Write/MultiEdit (post-edit-combined.js) does NOT fire for codex exec file writes — Codex writes through its own process, not Claude Code's tool layer. Old wording "typecheck / lint passes (hook auto-runs)" was false; quality checks were silently skipped on every Codex-driven step. v8 makes them explicit.
.py .ts or other code files yourself (must dispatch to Codex). Because: /execute keeps Claude as orchestrator and Codex as code writer.references/test-fanout.md.**, backticks, _ show as literal chars and look broken). (b) Each step boundary triggers TaskUpdate to flip status (pending → in_progress → completed); fallback / escalation also TaskUpdate the subject to reflect the model that actually ran. (c) chat echoes header above the TodoList and ▶ dispatch line below — key fields wrapped in backticks for blue inline-code highlight. Subject format: <step> — <Model>/<Role> · <duration> · <result> [⚠ <exception>]. <Model> MUST include version/tier (Codex GPT-5.5, Opus 4.7 xhigh); bare Codex / Claude / Opus rejected. ▶ line uses plain Chinese business language, never [coding/effort] tags. builder reads three things: TodoList visual state (auto-rendered ✓/■/□ + strikethrough/bold), chat progress header, business-language ▶. Code internals (file paths, commit SHAs, diff stats, verify command strings) do NOT belong in any of these.`[N/M · X%]` · 当前 `<step id> <step description>` in the SAME assistant turn. A silent TaskUpdate without paired header is a violation. Why: builder needs the percentage to live next to the visual TodoList state — if you only flip status without re-echoing the header, the screen progress goes stale.in_progress or completed task subject MUST contain all 5 fields per the template: <step id> <step desc> — <Model>/<Role> · <duration> · <result> [⚠ <exception>]. <duration> for completed comes from real date +%s deltas (Step 4 + Step 7 of the per-step loop), never fabricated; absent stopwatch = absent duration field, but result + Model/Role still required. <result> (通过 / 失败) is fact-based off verify exit code. Empty subjects like ■ S3 wire health probe — Codex GPT-5.5/Coding (no duration, no result) are violations — they look like the broken screenshot builder showed on 2026-04-26.coding role tasks to the codex:codex-rescue subagent. Because: that subagent is a thin forwarder for ad-hoc "find codex to rescue" requests; its spec explicitly leaves --model unset, so codex falls back to the config default (gpt-5.3-codex-spark) and silently downgrades any task tagged [coding / GPT-5.5 / xhigh]. For coding tasks, dispatch via direct Bash heredoc form (codex exec -m gpt-5.5 -c model_reasoning_effort=xhigh - <<'CODEX_EOF' … CODEX_EOF). Use codex:codex-rescue only when builder explicitly asks for it.codex exec ... "<prompt>"). Because: zsh expands backticks / $VAR / \ inside double quotes, so plan code blocks, verify commands, and YAML schemas get executed by the host shell before codex starts. Use single-quoted heredoc + - (see model-router.md § Prompt Passing). Failure modes seen 2026-04-26: 0-byte output, hung Bash call, prompt arriving at codex with backtick contents replaced by host-command output. Heredoc with 'CODEX_EOF' disables every form of expansion..bdc/plans/<proj>-<MMDD>-<topic>.md${BDC_HOME:-$HOME/.bdc}/standards/plan-management.md${BDC_HOME:-$HOME/.bdc}/standards/model-router.md${BDC_HOME:-$HOME/.bdc}/standards/model-router.md § Codex three channels${BDC_HOME:-$HOME/.bdc}/standards/debugging.mdgit ls-files --error-unmatch <plan-file> before claiming code + plan checkbox are in the same commit.S1 is passed outside its feature block instead of F001.S1.| Rationalization | Response |
|---|---|
| "Codex succeeded, so the step is done." | Do not. Incident: standards/debugging.md 2026-04-22 says Codex wrote code and tests but could not commit because sandbox .git/ writes failed. Claude must verify and commit. |
| "Passing the prompt as a quoted arg is simpler." | Do not. Incident: standards/debugging.md 2026-04-26 records codex exec "<prompt>" losing backticked content through shell expansion. Use single-quoted heredoc. |
Creates, edits, and optimizes skills for Claude Code, including drafting, evaluating with test prompts, iterating on performance, and improving skill descriptions for better triggering accuracy.
npx claudepluginhub 0xmariowu/builders-dont-cry