From bholzer
Drive a task DAG to completion by dispatching a coder subagent then a validator subagent for each eligible task, in serial. Use this skill whenever the user asks to "run the tasks", "execute the DAG", "drain the task queue", "start the autonomous coding session", "work through the plan", or similar phrasings that mean "actually do the work the task files describe." Trigger on phrases like "run the tasks in plans/<slug>/", "execute the DAG at …", "kick off the autopilot for …", "start coding the tasks", "drain plans/<slug>/tasks/". This is the step *after* `task-specs` — it assumes the task files are already fleshed out with full bodies (acceptance criteria, files to touch, etc.) and walks the DAG until completion or until something needs human attention.
Slash command: /bholzer:run-tasks
Drive a task DAG to completion by dispatching subagents — one task at a time, serial. The orchestrator is intentionally as thin as possible: it does not read task bodies, does not understand what's being built, and does not run code itself. It does exactly three things — **pick the next eligible task, dispatch the coder, dispatch the validator** — and stops the moment anything looks wrong.
Use this skill for any request that boils down to "execute the task DAG." Common phrasings:
- "run the tasks in plans/<slug>/"
- "execute the DAG at …"
- "kick off the autopilot for …"
- "start coding the tasks"
- "drain plans/<slug>/tasks/"

If the user is asking to create the DAG, use task-dag. If they're asking to flesh out skeleton task files, use task-specs. If the task files are still one-line skeletons, refuse to run and tell the user to spec them first: there's nothing for the coder to execute.
The filesystem is the source of truth. Each task file's status: frontmatter field is the state machine. The orchestrator re-derives all state by reading the directory on every tick — it holds nothing important in its own context. Coder and validator are isolated subagents (their own context windows) dispatched via the Agent tool. The orchestrator never reads task bodies; only frontmatter. This makes it crash-recoverable (kill the session, re-run, it picks up where it left off) and keeps its context shallow even on long runs.
Default: the user passes a path like plans/<slug>/ or plans/<slug>/tasks/. Resolve to the tasks/ directory. If the user didn't pass a path, look in plans/ for the most recent tasks/ directory or ask which one.
Confirm the directory contains:
- index.md (DAG view)
- T-NNN-*.md task files

If the directory is empty or missing skeleton frontmatter, stop and tell the user: there's no DAG to drive.
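The path resolution and directory check can be sketched in a few lines of Python. This is an illustration only: the helper names `resolve_tasks_dir` and `check_tasks_dir` are hypothetical, and the check for skeleton frontmatter is left out.

```python
from pathlib import Path


def resolve_tasks_dir(user_path: str) -> Path:
    """Resolve plans/<slug>/ or plans/<slug>/tasks/ to the tasks directory."""
    path = Path(user_path)
    # Path drops a trailing slash, so both spellings of the argument work.
    tasks_dir = path if path.name == "tasks" else path / "tasks"
    if not tasks_dir.is_dir():
        raise FileNotFoundError(f"no tasks directory at {tasks_dir}")
    return tasks_dir


def check_tasks_dir(tasks_dir: Path) -> list[Path]:
    """Return the task files, or raise if there is no DAG to drive."""
    if not (tasks_dir / "index.md").is_file():
        raise ValueError(f"{tasks_dir} has no index.md")
    # index.md does not match the T-*.md pattern, so it is skipped here.
    task_files = sorted(tasks_dir.glob("T-*.md"))
    if not task_files:
        raise ValueError(f"{tasks_dir} contains no T-NNN-*.md task files")
    return task_files
```

Raising instead of returning an empty list makes the "stop and tell the user" branch hard to skip by accident.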
This skill kicks off an autonomous loop that will modify code, run tests, and write to memory. Before the first dispatch, tell the user what you're about to do in one or two sentences — e.g., "About to start the autopilot for plans/team-invitations/. I count 8 pending tasks, longest chain T-001 → T-003 → T-007. I'll stop and surface immediately on any failure or block."
Then proceed. Do not ask permission task-by-task — that defeats the point.
Repeat the following until the loop exits:
List every T-*.md file in the tasks directory (ignore index.md). For each, read just enough to parse the YAML frontmatter — id, depends_on, status. Do not read the body. This step happens fresh every tick; do not cache.
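As a rough sketch of "read just enough", the function below consumes lines only up to the closing `---` of the frontmatter and never touches the body. The function name `read_frontmatter` is hypothetical, and the sketch assumes unquoted values with `depends_on` written either inline (`[T-001, T-002]`) or as a block list; a real YAML parser would handle more cases.

```python
from typing import Iterable


def read_frontmatter(lines: Iterable[str]) -> dict:
    """Parse id, depends_on, and status; stop before the task body."""
    fields = {"id": None, "depends_on": [], "status": None}
    it = iter(lines)
    if next(it, "").strip() != "---":
        return fields  # no frontmatter block at all
    current_key = None
    for raw in it:
        stripped = raw.strip()
        if stripped == "---":
            break  # closing fence: the body starts here and is never read
        if stripped.startswith("- ") and current_key == "depends_on":
            # Block-list form: one dependency per "- T-NNN" line.
            fields["depends_on"].append(stripped[2:].strip())
            continue
        key, sep, value = stripped.partition(":")
        if not sep:
            continue
        current_key = key.strip()
        value = value.strip()
        if current_key == "depends_on":
            # Inline form: depends_on: [T-001, T-002]
            inner = value.strip("[]")
            fields["depends_on"] = [v.strip() for v in inner.split(",") if v.strip()]
        elif current_key in ("id", "status"):
            fields[current_key] = value
    return fields
```

Because the function takes any iterable of lines, an open file object can be passed directly, and calling it fresh on every tick keeps the no-caching rule.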
A task is eligible if:
- it has status: pending
- every task in its depends_on has status: done

If multiple tasks are eligible, pick the one with the lowest id (ascending T-NNN). Determinism makes the loop replayable.
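The eligibility rule and the tie-break can be expressed as one small function. This is a sketch under assumptions: tasks are plain dicts with `id`, `depends_on`, and `status` keys, ids look like `T-NNN`, and the name `pick_next` is hypothetical.

```python
from typing import Optional


def pick_next(tasks: list[dict]) -> Optional[dict]:
    """Return the eligible task with the lowest id, or None if none is eligible."""
    status_by_id = {t["id"]: t["status"] for t in tasks}
    eligible = [
        t for t in tasks
        if t["status"] == "pending"
        # A dependency that is missing from the directory counts as unmet.
        and all(status_by_id.get(dep) == "done" for dep in t["depends_on"])
    ]
    # Compare the numeric part so T-010 sorts after T-009 even without padding.
    return min(eligible, key=lambda t: int(t["id"].split("-")[1]), default=None)
```

Given the same directory contents, the function always returns the same task, which is what makes a re-run after a crash pick up at the same place.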
If no tasks are eligible:
- All tasks are done: exit with success and report total tasks completed.
- Any task is failed or blocked: exit with that state and surface which task and why.
- Anything else (tasks pending with unmet deps but no terminal state): something is wrong with the DAG. Surface and stop.

Flip the picked task's frontmatter from status: pending to status: in_progress using the Edit tool. Be exact: status: pending → status: in_progress on the one file.
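Both halves of this step can be sketched in Python: classifying why nothing is eligible, and making the exact one-line status edit. The skill itself uses the Edit tool for the flip; the code below is only an equivalent illustration, and the names `idle_exit_state` and `flip_status_line` and the label `dag_error` are hypothetical.

```python
def idle_exit_state(tasks: list[dict]) -> str:
    """Classify the loop exit when no task is eligible."""
    statuses = [t["status"] for t in tasks]
    if all(s == "done" for s in statuses):
        return "success"
    for terminal in ("failed", "blocked"):
        if terminal in statuses:
            return terminal
    # Pending tasks with unmet deps and nothing terminal: the DAG is broken.
    return "dag_error"


def flip_status_line(text: str, old: str, new: str) -> str:
    """Rewrite exactly one frontmatter line, status: old -> status: new."""
    lines = text.split("\n")
    if not lines or lines[0].strip() != "---":
        raise ValueError("file has no frontmatter")
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            break  # end of frontmatter; the body is never modified
        if lines[i].strip() == f"status: {old}":
            lines[i] = lines[i].replace(f"status: {old}", f"status: {new}", 1)
            return "\n".join(lines)
    raise ValueError(f"no 'status: {old}' line in frontmatter")
```

Refusing to write when the expected old status is absent guards against flipping a task that has already moved on.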
Dispatch the task-coder subagent via the Agent tool. The dispatch prompt should be minimal and stable across tasks:
> Execute the task specified at <absolute path to task file>. Read only that file; do not read sibling task files, the design doc, or the PRD. Follow its acceptance criteria. Return a status of done, failed, or blocked, plus one short sentence of context.
Wait for the subagent to return. (The Agent tool blocks.)
Three possible outcomes:
- done: proceed to validator dispatch (step 3e).
- failed: flip the task's status: to failed, exit the loop, surface the task ID and the coder's one-line reason to the user. Do not retry.
- blocked: flip the task's status: to blocked, exit the loop, surface the task ID and what the coder reported as the blocker.

If the coder returned done, dispatch the task-validator subagent via the Agent tool with a minimal stable prompt:
> Validate the task at <absolute path to task file>. Read the "Acceptance criteria" section, run every check, and return pass or fail with the failing commands' output if any.
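One way to keep the two dispatch prompts stable across tasks is to hold them as fixed templates in which only the path varies. The sketch below assumes that approach; the constant names and `build_prompt` are hypothetical, and the actual dispatch through the Agent tool is not shown.

```python
from pathlib import Path

# The wording mirrors the two prompts quoted in this document.
CODER_PROMPT = (
    "Execute the task specified at {path}. Read only that file; do not read "
    "sibling task files, the design doc, or the PRD. Follow its acceptance "
    "criteria. Return a status of done, failed, or blocked, plus one short "
    "sentence of context."
)
VALIDATOR_PROMPT = (
    'Validate the task at {path}. Read the "Acceptance criteria" section, run '
    "every check, and return pass or fail with the failing commands' output if any."
)


def build_prompt(template: str, task_file: Path) -> str:
    """Fill in the absolute path; nothing else in the prompt changes per task."""
    return template.format(path=task_file.resolve())
```

Since the orchestrator never reads task bodies, the path is the only task-specific information it is able to pass along.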
Wait for the return.
- pass: flip the task's status: to done. Continue the loop (back to 3a).
- fail: flip the task's status: to failed. Exit the loop. Surface the failing checks to the user.

When the loop exits, report concisely:

- whether every task is done, or the loop stopped on a specific task
- if it stopped: the task ID, its status (failed / blocked), and the one-line reason

Example success: "Completed 8/8 tasks for plans/team-invitations/. All done."
Example failure: "Stopped on T-004 (status: failed). Validator reported: acceptance check npm test src/services/invitations.test.ts exited 1. See plans/team-invitations/tasks/T-004-invitations-service.md for the spec."
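The two example messages suggest a very small formatter. The sketch below reproduces their shape; the function name `final_report` and the `stopped` dict with `id`, `status`, and `reason` keys are assumptions made for illustration.

```python
from typing import Optional


def final_report(plan: str, completed: int, total: int,
                 stopped: Optional[dict] = None) -> str:
    """Build the one-line summary shown to the user when the loop exits."""
    if stopped is None:
        return f"Completed {completed}/{total} tasks for {plan}. All done."
    # On a stop, lead with the task ID and status, then the one-line reason.
    return (
        f"Stopped on {stopped['id']} (status: {stopped['status']}). "
        f"{stopped['reason']}"
    )
```

For example, `final_report("plans/team-invitations/", 8, 8)` yields the success message quoted above.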
The rules below are load-bearing: break them and the architecture stops working.
- Edit nothing in a task file except its status: line.
- Never retry a task that is failed or blocked. Stop and surface; the human decides what to do next. Autonomous-with-a-tripwire beats autonomous-spinning-on-bad-ground-truth.

pending ──(dispatch coder)──> in_progress
in_progress ──(coder: done, validator: pass)──> done [continue loop]
in_progress ──(coder: failed)──────────────────> failed [stop, surface]
in_progress ──(coder: blocked)─────────────────> blocked [stop, surface]
in_progress ──(validator: fail)────────────────> failed [stop, surface]
Terminal states: done, failed, blocked. The orchestrator never moves a task out of a terminal state.
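The transitions above fit in a lookup table, which makes illegal moves fail loudly. The sketch below is one possible encoding: the event names (`dispatch_coder`, `coder_failed`, and so on) are labels invented here, and the coder and validator are stand-in callables, not real subagent dispatches.

```python
from typing import Callable

# (current status, event) -> next status, following the transition diagram.
TRANSITIONS = {
    ("pending", "dispatch_coder"): "in_progress",
    ("in_progress", "validator_pass"): "done",
    ("in_progress", "coder_failed"): "failed",
    ("in_progress", "coder_blocked"): "blocked",
    ("in_progress", "validator_fail"): "failed",
}
TERMINAL = {"done", "failed", "blocked"}


def advance(status: str, event: str) -> str:
    """Apply one event; terminal states and unknown events raise."""
    if status in TERMINAL:
        raise ValueError(f"{status} is terminal; the orchestrator never leaves it")
    try:
        return TRANSITIONS[(status, event)]
    except KeyError:
        raise ValueError(f"no transition from {status} on {event}") from None


def run_task(coder: Callable[[], str], validator: Callable[[], str]) -> str:
    """Drive one task from pending to a terminal state."""
    status = advance("pending", "dispatch_coder")
    outcome = coder()  # expected to return "done", "failed", or "blocked"
    if outcome != "done":
        return advance(status, f"coder_{outcome}")
    verdict = validator()  # runs only after the coder returned "done"
    return advance(status, f"validator_{verdict}")
```

Calling `run_task(lambda: "done", lambda: "fail")` returns `"failed"`, and the validator callable is never invoked when the coder reports `failed` or `blocked`.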
A good run looks like a tight log of dispatches and status flips. The orchestrator's job is to be boring and predictable. Aim for:
Things to avoid:
- Dispatching the validator before the coder has returned done; they are sequential, not concurrent
npx claudepluginhub bholzer/claude-bholzer --plugin bholzer