From prd
Executes the next TaskMaster task using the implementation plan with CDD verification. Dispatches a subagent to implement, verifies subtasks with evidence, and loops until all tasks are complete.
How this skill is triggered — by the user, by Claude, or both
Slash command
/prd:execute-taskThis skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
The execution loop. Three sources converge:
The execution loop. Three sources converge:
docs/superpowers/plans/*.md produced by GENERATE.taskmaster/tasks/tasks.json with
dependencies and complexity scoresexecute-task is the single skill that runs the full build from "tasks are ready" to SHIP_CHECK_OK. It is autonomous — no AskUserQuestion inside the loop. Any gap that would require user input is surfaced through the recon escalation ladder (step 11) or the inbox (steps 4 and 8), never via a modal prompt.
This skill is invoked either:
prd-taskmaster orchestrator when current_phase is EXECUTE.On entry, confirm that:
.atlas-ai/state/pipeline.json exists and records phase: EXECUTE.taskmaster/tasks/tasks.json exists with at least one ready task.atlas-ai/customizations/system-prompt-template.md is present (may be
empty — absence is a setup bug, empty is fine)If any of the above are missing, report the gap and halt. Do NOT attempt to bootstrap the missing artifact from inside this loop — that is the orchestrator's job.
Each pass through this cycle moves exactly one TaskMaster task from pending
to done. Do the 13 steps in order. Do not skip.
Task-start SHA — at the very beginning of each iteration (before step 2), capture the current git HEAD:
task_start_sha=$(git rev-parse HEAD)Record
$task_start_shain the execute-log row for this iteration. It is the oracle of truth for every reachability sweep in step 9b below: "what modules did THIS task add?" isdiff $task_start_sha..HEAD. The oracle flow already issues per-task start commits; this surfaces the same value in the loop prose.
Heartbeat check: verify the execute-task heartbeat timer is running.
If missing, register one via CronCreate("execute-task-heartbeat", "* * * * *", "echo heartbeat").
Abort the iteration if the timer cannot be created — a missing heartbeat
means a missing stuck-session detector, and that is load-bearing.
Inbox reconciliation: read .atlas-ai/state/pipeline.json,
.taskmaster/tasks/tasks.json, and the current TodoWrite list.
Diff them. If the three are stale by more than 5 tasks (i.e. TodoWrite
says 10 done but tasks.json says 3 done), report the diff and halt — do
not paper over bookkeeping drift by silently reconciling.
Pick next task: run backend op next with the plugin's project-root
pointer. Use exactly this invocation:
python3 script.py next-task
Parse the JSON result.
done, run .atlas-ai/ship-check.py,
emit SHIP_CHECK_OK on success, exit the loop.Load plan step: search for the matching task ID in this priority order, halting only after all three fail:
docs/superpowers/plans/*.md (the superpowers GENERATE default output).taskmaster/docs/plan.md (the prd-taskmaster HANDOFF default output,
whose path is also recorded in
pipeline.json:phase_evidence.HANDOFF.plan_file_path)pipeline.json:phase_evidence.HANDOFF.plan_file_path (in case
a future handoff variant writes elsewhere)If none of the three contains the matching task ID, the task was
invented downstream of the plan — mark the task blocked, inbox the
parent orchestrator with message_type="blocker", and continue to the
next iteration.
(Codified 2026-06-04 — yesterday's ai-human-tasker run had its plan at
.taskmaster/docs/plan.md only, while this step previously read
docs/superpowers/plans/*.md exclusively. The controller silently
improvised; a cold-start successor would have hit the blocked path on
every task.)
Generate CDD card: convert the task's subtasks field into a
testing_plan. Each subtask becomes a verifiable check with a concrete
evidence path (file, command output, or test name). Write the card to
.atlas-ai/cdd/task-<id>.json. A task without subtasks is treated as a
single RED card.
Set in-progress: run backend op set-status from the current project
root:
python3 script.py set-status --id <N> --status in-progress
This flip is observable by watchers and anchors the iteration in TaskMaster itself.
Dispatch implementer subagent — NEVER in-session. The controller must:
.atlas-ai/customizations/system-prompt-template.md
into the subagent's system prompt. If the file is empty, inject nothing
and continue.1-4 fast — use the fast tier (Haiku-class)5-7 standard — use the standard tier (Sonnet-class)8-10 capable — use the capable tier (Opus-class)DONE,
DONE_WITH_CONCERNS, NEEDS_CONTEXT, or BLOCKED.Rationale: complexity-tiered dispatch keeps the dollars-per-task curve sensible. A complexity-2 boilerplate task does not need Opus; a complexity-9 architectural task should not be given to Haiku.
Route by status: the subagent's return status drives the next move.
expand
(python3 script.py expand --id <N>). If still
blocked, set status=blocked, inbox parent, halt this iteration.Do NOT invent new status values. The four above are the only terminal returns. Any other string from the subagent is a protocol violation and should be logged + treated as BLOCKED.
Triple verification — the plugin's core quality gate, per spec §11.4. Three independent checks must agree.
Hard exit-code gate (MANDATORY — bypasses agreement count). Before
invoking the three checkers, run .atlas-ai/ship-check.py --dry-run. If
it reports any non-zero Exit status N in evidence files, the task
FAILS regardless of how the agent narratives read. SHIP_CHECK_FAIL is
NOT a warning. Narrative claiming the exit code is "expected" or
"infrastructure noise" does NOT override this gate — write a separate
task-fix-N to address the underlying failure instead. There is NO
override path; Gate 5 is unfakable. (Codified 2026-06-04 after T12
in ai-human-tasker was marked DONE while pnpm test exited 1 with 11
failing tests.)
9b. Reachability sweep (MANDATORY for wired/live tasks). After the hard exit-code gate passes, run the reachability sweep for this task:
python3 script.py reachability-sweep \
--task <task_id> \
--start-commit <task_start_sha>
This command:
$task_start_sha and HEAD.WIRED, EXEMPT, ORPHAN, or ERROR..atlas-ai/cdd/task-<id>.json under the "reachability" key (atomic,
additive — existing card keys are preserved).The sweep exit code encodes the verdict:
exit 0 → WIRED or EXEMPT (pass; proceed to the three checkers).exit 1 → ORPHAN or ERROR (see step 10 for the auto-downgrade path).For spike/domain-model tasks the sweep returns EXEMPT automatically (no importer search is performed for those tiers).
Why sweep before the triple check? A green test on a module imported by nothing is not "done" — it is scaffolded. The triple check can pass for an ORPHAN module (all tests pass; doubt and validate agree). The reachability gate closes that gap:
donemeans the module is reachable from real production callsites, not just reachable from tests. Wire it or it ships as scaffold.
The three checks (run only if the hard gate AND the reachability sweep both pass):
/doubt skill — adversarial doubt sweep on the claimed completion./validate skill — deterministic validation pass (lint / tests / exit
codes).Opus subagent sanity pass — asks a fresh subagent "would
you merge this?" with the task spec + diff + evidence.3+ agree pass -> task passes. Disagreement -> halt this iteration, surface to inbox.
Mark done + propagate state — branch on the sweep verdict from step 9b:
WIRED or EXEMPT (sweep exit 0) → proceed normally:
a. Run backend op set-status for the parent task. Because the sweep
already wrote the reachability block into the CDD card, the
set-status CLI auto-reads it — no --reachability flag needed:
python3 script.py set-status --id <N> --status done
If you want to be explicit (e.g. for logging), you may pass:
--reachability WIRED or --reachability EXEMPT.
b. Subtask writeback: for each subtask S in task.subtasks whose
evidence file (per the CDD card from step 5) exists, run
python3 script.py set-status --id <N>.<S> --status done. Subtasks left
pending while the parent is done are a data-integrity violation
that breaks any tool computing progress from subtask state.
(Codified 2026-06-04 — yesterday's run left all 39 subtasks
pending despite 13/13 parent tasks done.)
c. Update .atlas-ai/state/pipeline.json per-task: call
mcp__plugin_prd_go__update_pipeline_task_status(task_id=<N>, status="done") if the MCP tool is available. If not, fall back to
atomic read-modify-write using the pattern in
mcp-server/pipeline.py:locked_update() — read, append <N> to
phase_evidence.EXECUTE.tasks_completed, write to temp, rename.
Never leave pipeline.json and tasks.json mutually inconsistent.
(Codified 2026-06-04 — yesterday's run promised this write in
SKILL.md but never executed it. pipeline.json froze at HANDOFF
transition through all 85 minutes of execution.)
ORPHAN or ERROR (sweep exit 1) → auto-downgrade to scaffold:
Do NOT mark the task done. Instead:
python3 script.py set-status --id <N> --status scaffold
Then:
execute-log.jsonl: "reachability_verdict": "ORPHAN" (or
"ERROR"), "auto_downgraded": true, and a plain-English note of
which modules are unwired (from the sweep's modules list in the
CDD card).scaffold, not done.title: "Wire <module> into <entrypoint>") and append it via
python3 script.py expand --id <N> or the MCP equivalent.Throughline: a green test on a module imported by nothing is not done — wire it or it ships as scaffold. The auto-downgrade ensures the task graph stays honest: Gate 6 will block the ship until every wired/live task's reachability block reads WIRED or EXEMPT. If all wired/live tasks auto-downgraded to scaffold, the ship check will block at Gate 2 ("not every task is done") and the developer must choose: wire the modules, re-tier them (spike/domain-model), or mark them explicitly exempt (
reachableVia: cli:...). There is no silent path to SHIP_CHECK_OK with an unwired module at a wired/live tier.
Check stepback triggers: if 15 minutes have passed with no task moving to done, OR 5 consecutive iterations have failed on the same task class, the recon escalation ladder is MANDATORY. Climb the ladder in this exact order, not out of order:
/stepback -> /research-before-coding -> /question -> pivot
/stepback — reassess the architectural assumption. Was the plan
wrong?/research-before-coding — feed the blocker into the Perplexity +
Context7 + GitHub pipeline for fresh external context./question — batch-research the unresolved unknowns in parallel.pivot — the plan step itself is unsound; kick the task back to the
plan author (inbox parent with message_type="plan_pivot_requested").The ladder is append-only — if /stepback surfaces a fix, apply it and
return to step 3. Only climb if the prior rung did not yield progress.
Render progress — show the execute progress panel: MCP
render_status(phase="EXECUTE") → print its rendered field; CLI
python3 script.py status --phase EXECUTE. Then emit the atlas-gamify
one-line score (tasks done / tasks total, complexity-weighted). This is the
human-visible progress signal and also feeds the dogfood debrief.
Loop: back to step 1 until SHIP_CHECK_OK or a halt condition fires.
The termination sequence is strict — three steps, in order, no shortcuts:
.atlas-ai/ship-check.py. If it does NOT exit 0, halt. Do NOT
emit any completion signal. Investigate the gate failure, fix, retry.Skill(skill: "sync") to refresh the memory
bank (session-context/CLAUDE-*.md, MEMORY.md, capability inventory).
This MUST happen BEFORE the SHIP_CHECK_OK token is printed.
Orchestrators tail-watch the token; if the memory bank is stale when
they react, successor sessions inherit a wrong picture of the world.
(Codified 2026-06-04 — yesterday's ai-human-tasker run shipped 15.6k
LOC while session-context/CLAUDE-activeContext.md still said
"Scaffold complete. No application code yet".)SHIP_CHECK_OK to stdout. This is the ONLY place in your
output where the token may appear — emit it nowhere else, to avoid
false-positive matches by log-watchers.The ship-check script is deterministic. Its gates are documented at the
top of ${CLAUDE_PLUGIN_ROOT}/skel/ship-check.py (copied to .atlas-ai/ship-check.py at setup):
pipeline.json current_phase == "EXECUTE"master.tasks[].status == "done"task-<id>.json or combined variant).taskmaster/docs/plan.md OR docs/superpowers/plans/*.mdExit status N line in any evidence fileGate 5 is the convergent must-do from the 2026-06-04 audit — a "PASS" label on a non-zero-exit test is structurally impossible after this script runs. There is no override path for Gate 5; it is the unfakable oracle.
Do not emit SHIP_CHECK_OK on a mere "DONE" keyword in a subagent reply.
Do not emit on "all tasks marked done" without the explicit ship-check.
Do not emit before /sync has been called.
These are the most common pressure points where the loop silently degrades from "verified" to "performative". If you catch yourself thinking any of them, stop and repair the gap.
Every iteration appends a structured row to
.atlas-ai/state/execute-log.jsonl. Field types are strict — text
narrative in a typed field is a logging bug, not compliance. The schema:
iteration (integer, or "FINAL" for the terminal marker)timestamp (ISO 8601 string)task_id (string)complexity (integer or human label)tier (string: "fast" | "standard" | "capable")subagent_status (string: "DONE" | "DONE_WITH_CONCERNS" | "NEEDS_CONTEXT" | "BLOCKED")retry_count (integer)triple_verify (string: "PASS" / "FAIL" plus free-text rationale)stepback_triggered (boolean, REQUIRED — true iff /stepback was
invoked this iteration). Putting narrative-text in this field is a
violation; use stepback_narrative instead.stepback_narrative (string, nullable — explanation when
stepback_triggered: true; null otherwise)ladder_rung (string, nullable — which rung was reached if escalated)gamify (string — atlas-gamify one-line score)The stepback fields were split (2026-06-04) after a FINAL iteration entry
wrote a paragraph of narrative into the boolean stepback field and was
treated as compliance with the stepback_mandatory rule. Boolean trigger
This log is the dogfood artifact — debrief tools consume it, the orchestrator greps it, and future runs read it for retrospective analysis.
/handoff — that direction is one-way.pivot, which inboxes the plan author. This skill does not mutate the
plan in place..atlas-ai/ship-check.py is the terminal gate. This
skill calls it; it does not reimplement the checks.This skill uses no explicit process termination. A halt condition reports the reason in the structured log and returns control to the caller (the user or the orchestrator). Never kill the shell — the caller owns the session lifecycle.
npx claudepluginhub anombyte93/prd-taskmaster --plugin prdExecutes one implementation task from the Claude Workflow board via an 11-step protocol. Use after cw-plan or manually for autonomous coding with verification and commit.
Executes an approved implementation plan task-by-task, dispatching one fresh subagent per task with an RTCO brief, then validates output through schema check, self-review, and plan-compliance review before marking complete.
Executes task DAGs via isolated subagents after /plan. Manages target dir, git init, state recovery, and task results in .tasker/.