Skill

run-tasks

Drive a task DAG to completion by dispatching a coder subagent then a validator subagent for each eligible task, in serial. Use this skill whenever the user asks to "run the tasks", "execute the DAG", "drain the task queue", "start the autonomous coding session", "work through the plan", or similar phrasings that mean "actually do the work the task files describe." Trigger on phrases like "run the tasks in plans/<slug>/", "execute the DAG at …", "kick off the autopilot for …", "start coding the tasks", "drain plans/<slug>/tasks/". This is the step *after* `task-specs` — it assumes the task files are already fleshed out with full bodies (acceptance criteria, files to touch, etc.) and walks the DAG until completion or until something needs human attention.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/bholzer:run-tasks

User invocable

Model invocable

Inline context

Default effort

SKILL.md

143 lines · ~2.1k tokens

Stats

Parent stars: 0

Maintenance: Good

Last Commit: May 20, 2026


Run Tasks (Orchestrator)

Drive a task DAG to completion by dispatching subagents — one task at a time, serial. The orchestrator is intentionally as thin as possible: it does not read task bodies, does not understand what's being built, and does not run code itself. It does exactly three things — pick the next eligible task, dispatch the coder, dispatch the validator — and stops the moment anything looks wrong.

When this skill triggers

Use this skill for any request that boils down to "execute the task DAG." Common phrasings:

"Run the tasks in plans/<slug>/"
"Execute the DAG at plans/<slug>/tasks/"
"Drain the task queue"
"Start the autonomous coding session"
"Kick off the autopilot for …"

If the user is asking to create the DAG, use task-dag. If they're asking to flesh out skeleton task files, use task-specs. If the task files are still one-line skeletons, refuse to run and tell the user to spec them first — there's nothing for the coder to execute.

Architecture in one paragraph

The filesystem is the source of truth. Each task file's status: frontmatter field is the state machine. The orchestrator re-derives all state by reading the directory on every tick — it holds nothing important in its own context. Coder and validator are isolated subagents (their own context windows) dispatched via the Agent tool. The orchestrator never reads task bodies; only frontmatter. This makes it crash-recoverable (kill the session, re-run, it picks up where it left off) and keeps its context shallow even on long runs.

Workflow

1. Locate the tasks directory

Default: the user passes a path like plans/<slug>/ or plans/<slug>/tasks/. Resolve to the tasks/ directory. If the user didn't pass a path, look in plans/ for the most recent tasks/ directory or ask which one.

Confirm the directory contains:

An index.md (DAG view)
One or more T-NNN-*.md task files

If the directory is empty, or the task files are missing their skeleton frontmatter, stop and tell the user: there is no DAG to drive.
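The checks in this step can be sketched as a small pre-flight function. This is a minimal illustration, not part of the skill: the helper names `resolve_tasks_dir` and `check_tasks_dir` are hypothetical, and it assumes that frontmatter opens with a `---` line at the top of each task file.

```python
from __future__ import annotations

import re
from pathlib import Path

# Assumed reading of the T-NNN-*.md naming pattern.
TASK_FILE = re.compile(r"^T-\d+-.*\.md$")


def resolve_tasks_dir(path: Path) -> Path:
    """Accept either plans/<slug>/ or plans/<slug>/tasks/ and return the tasks/ directory."""
    return path if path.name == "tasks" else path / "tasks"


def check_tasks_dir(tasks_dir: Path) -> list[str]:
    """Return a list of problems; an empty list means the directory looks drivable."""
    if not tasks_dir.is_dir():
        return [f"{tasks_dir} is not a directory"]
    problems = []
    if not (tasks_dir / "index.md").is_file():
        problems.append("missing index.md")
    task_files = sorted(p for p in tasks_dir.iterdir() if TASK_FILE.match(p.name))
    if not task_files:
        problems.append("no T-NNN-*.md task files")
    for path in task_files:
        # Only the first line is inspected (assumption: frontmatter starts with '---').
        with path.open(encoding="utf-8") as fh:
            if fh.readline().strip() != "---":
                problems.append(f"{path.name}: no frontmatter")
    return problems
```

Any non-empty result is a reason to stop and tell the user rather than start the loop.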

2. Confirm before starting

This skill kicks off an autonomous loop that will modify code, run tests, and write to memory. Before the first dispatch, tell the user what you're about to do in one or two sentences — e.g., "About to start the autopilot for plans/team-invitations/. I count 8 pending tasks, longest chain T-001 → T-003 → T-007. I'll stop and surface immediately on any failure or block."

Then proceed. Do not ask permission task-by-task — that defeats the point.
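The numbers in that announcement (pending count, longest chain) can be derived from frontmatter alone. The sketch below is one way to do it, assuming the frontmatter has already been parsed into a mapping of task ID to `status` and `depends_on`; the function names are hypothetical, and the DAG is assumed to be acyclic.

```python
from __future__ import annotations

from functools import lru_cache


def longest_chain(deps: dict[str, list[str]]) -> list[str]:
    """Longest dependency chain, given task id -> prerequisite ids. Assumes no cycles."""

    @lru_cache(maxsize=None)
    def chain_to(task: str) -> tuple[str, ...]:
        # Longest chain among the prerequisites, extended by this task.
        best = max((chain_to(d) for d in deps.get(task, [])), key=len, default=())
        return best + (task,)

    return list(max((chain_to(t) for t in sorted(deps)), key=len, default=()))


def preflight_message(plan: str, state: dict[str, dict]) -> str:
    """Build the one-or-two-sentence announcement made before the first dispatch."""
    pending = sum(1 for t in state.values() if t["status"] == "pending")
    chain = longest_chain({tid: t["depends_on"] for tid, t in state.items()})
    return (
        f"About to start the autopilot for {plan}. I count {pending} pending tasks, "
        f"longest chain {' → '.join(chain)}. "
        "I'll stop and surface immediately on any failure or block."
    )
```

When two chains tie for longest, the one found first in ascending ID order is reported.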

3. Tick loop

Repeat the following until the loop exits:

3a. Re-read state from disk

List every T-*.md file in the tasks directory (ignore index.md). For each, read just enough to parse the YAML frontmatter — id, depends_on, status. Do not read the body. This step happens fresh every tick; do not cache.
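A frontmatter-only read can be sketched as follows. This is an illustration under stated assumptions rather than the skill's own code: frontmatter is delimited by `---` lines, fields are simple `key: value` pairs, and `depends_on` is written as an inline list such as `[T-001, T-002]`. A real implementation would more likely hand the frontmatter block to a YAML parser.

```python
from __future__ import annotations

from pathlib import Path


def read_frontmatter(path: Path) -> dict:
    """Parse only the frontmatter block, stopping at the closing '---' so the body is never read."""
    fields: dict = {}
    with path.open(encoding="utf-8") as fh:
        if fh.readline().strip() != "---":
            raise ValueError(f"{path.name}: no frontmatter")
        for line in fh:
            if line.strip() == "---":
                break  # end of frontmatter; the body stays unread
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
    # Assumed shape: depends_on is an inline list such as [T-001, T-002].
    raw = fields.get("depends_on", "[]").strip("[]")
    fields["depends_on"] = [d.strip() for d in raw.split(",") if d.strip()]
    return fields


def load_state(tasks_dir: Path) -> dict[str, dict]:
    """Fresh read on every call: map task id -> frontmatter. The glob skips index.md."""
    files = sorted(tasks_dir.glob("T-*.md"))
    return {fm["id"]: fm for fm in map(read_frontmatter, files)}
```

Calling `load_state` at the top of every tick, and never keeping its result across ticks, is what keeps the filesystem as the only source of truth.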

3b. Pick one eligible task

A task is eligible if:

status: pending
Every ID in depends_on has status: done

If multiple tasks are eligible, pick the one with the lowest id (ascending T-NNN). Determinism makes the loop replayable.

If no tasks are eligible:

If every task is done: exit with success — report total tasks completed.
If any task is failed or blocked: exit with that state — surface which task and why.
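Otherwise (unfinished tasks remain, but none is failed or blocked): something is wrong with the DAG, for example a dependency cycle, a depends_on ID that matches no task file, or a task left in_progress by an interrupted run. Surface and stop.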
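The selection rule and its three exit cases can be sketched as one pure function over the parsed frontmatter. The verdict strings (`run`, `success`, `stopped`, `dag_error`) and the function name are hypothetical labels chosen for this illustration, and task IDs are assumed to look like `T-001`.

```python
from __future__ import annotations


def pick_next(state: dict[str, dict]) -> tuple[str, str | None]:
    """Return (verdict, task_id). state maps task id -> {'status': ..., 'depends_on': [...]}."""

    def number(task_id: str) -> int:
        # Assumed ID shape: 'T-' followed by digits, e.g. T-007.
        return int(task_id.split("-")[1])

    def deps_done(task: dict) -> bool:
        # A dependency that matches no task counts as unmet.
        return all(state.get(dep, {}).get("status") == "done" for dep in task["depends_on"])

    eligible = sorted(
        (tid for tid, task in state.items() if task["status"] == "pending" and deps_done(task)),
        key=number,
    )
    if eligible:
        return "run", eligible[0]  # lowest ID wins, so the loop is replayable
    if all(task["status"] == "done" for task in state.values()):
        return "success", None
    stopped = sorted(
        (tid for tid, task in state.items() if task["status"] in ("failed", "blocked")),
        key=number,
    )
    if stopped:
        return "stopped", stopped[0]
    return "dag_error", None  # nothing runnable, nothing failed or blocked
```

For example, two tasks that depend on each other are never eligible and never fail, so the function reports `dag_error` instead of looping forever.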

3c. Mark in-progress and dispatch coder

Flip the picked task's frontmatter from status: pending to status: in_progress using the Edit tool. Be exact: change only that one line, in that one file.
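In the skill itself the Edit tool performs this swap. For readers who want to see what "exact" means, here is a minimal Python sketch of the same operation; `flip_status` is a hypothetical helper, and it assumes `---`-delimited frontmatter. It only inspects lines before the closing delimiter, so a `status:` string in the body is never matched or changed.

```python
from __future__ import annotations

from pathlib import Path


def flip_status(path: Path, old: str, new: str) -> None:
    """Rewrite the single 'status: old' line inside the frontmatter; leave the body untouched."""
    lines = path.read_text(encoding="utf-8").splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        raise ValueError(f"{path.name}: no frontmatter")
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            break  # closing delimiter: nothing past this point is inspected
        if line.strip() == f"status: {old}":
            lines[i] = line.replace(f"status: {old}", f"status: {new}", 1)
            path.write_text("".join(lines), encoding="utf-8")
            return
    # The expected status was not there: refuse to guess.
    raise ValueError(f"{path.name}: expected 'status: {old}' in frontmatter")
```

Raising when the expected old status is absent means a stale or hand-edited file stops the run instead of being silently overwritten.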

Dispatch the task-coder subagent via the Agent tool. The dispatch prompt should be minimal and stable across tasks:

Execute the task specified at <absolute path to task file>. Read only that file; do not read sibling task files, the design doc, or the PRD. Follow its acceptance criteria. Return a status of done, failed, or blocked, plus one short sentence of context.

Wait for the subagent to return. (The Agent tool blocks.)

3d. Handle the coder's return

Three possible outcomes:

done — proceed to validator dispatch (step 3e).
failed — flip the task's status: to failed, exit the loop, surface the task ID and the coder's one-line reason to the user. Do not retry.
blocked — flip the task's status: to blocked, exit the loop, surface the task ID and what the coder reported as the blocker.

3e. Dispatch validator (only on coder `done`)

Dispatch the task-validator subagent via the Agent tool with a minimal stable prompt:

Validate the task at <absolute path to task file>. Read the "Acceptance criteria" section, run every check, and return pass or fail with the failing commands' output if any.

Wait for the return.

pass — flip the task's status: to done. Continue the loop (back to 3a).
fail — flip the task's status: to failed. Exit the loop. Surface the failing checks to the user.
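Steps 3c through 3e can be sketched as a single function with the two dispatches and the status write injected as callables. This is an illustration only: `run_one` and its parameters are hypothetical, the callables stand in for the Agent tool and the Edit tool, and raising on an unrecognized coder status is an assumption the text does not spell out.

```python
from __future__ import annotations

from typing import Callable

Dispatch = Callable[[str], "tuple[str, str]"]  # task path -> (status, one-line context)


def run_one(
    task_path: str,
    dispatch_coder: Dispatch,
    dispatch_validator: Dispatch,
    set_status: Callable[[str, str], None],
) -> tuple[bool, str]:
    """Run coder then validator for one task. Returns (continue_loop, message)."""
    set_status(task_path, "in_progress")
    outcome, note = dispatch_coder(task_path)  # blocks until the coder returns
    if outcome in ("failed", "blocked"):
        set_status(task_path, outcome)
        return False, f"Stopped on {task_path} (status: {outcome}). Coder reported: {note}"
    if outcome != "done":
        # Assumption: an unknown status is treated as a reason to stop loudly.
        raise ValueError(f"unexpected coder status: {outcome!r}")

    # The validator is dispatched only after the coder has reported done.
    verdict, detail = dispatch_validator(task_path)
    if verdict == "pass":
        set_status(task_path, "done")
        return True, f"{task_path} done"
    # Anything other than pass is recorded as a failure; there is no retry.
    set_status(task_path, "failed")
    return False, f"Stopped on {task_path} (status: failed). Validator reported: {detail}"
```

Because every branch writes the new status before returning, the task file already reflects the outcome when the next tick re-reads the directory.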

4. Exit and surface

When the loop exits, report concisely:

The terminal state: all done, or stopped on a specific task
If stopped: which task, in what state (failed / blocked), and the one-line reason
A pointer to the task file for the user to read if they want details

Example success: "Completed 8/8 tasks for plans/team-invitations/. All done."

Example failure: "Stopped on T-004 (status: failed). Validator reported: acceptance check npm test src/services/invitations.test.ts exited 1. See plans/team-invitations/tasks/T-004-invitations-service.md for the spec."

Hard rules

These are load-bearing — break them and the architecture stops working.

Never read task bodies. Only frontmatter. The orchestrator must not know what the work is.
Never modify the body of a task file. Only the status: line.
Never run code, edit source files, or run tests yourself. Subagents do the work.
Never retry on failed or blocked. Stop and surface — the human decides what to do next. Autonomous-with-a-tripwire beats autonomous-spinning-on-bad-ground-truth.
Re-read the directory every tick. Do not cache task state in your own context — your context is finite and the filesystem is the source of truth.
Serial only. Dispatch one coder, wait, dispatch one validator, wait. No parallel dispatch.

State machine reference

pending ──(dispatch coder)──> in_progress
in_progress ──(coder: done, validator: pass)──> done       [continue loop]
in_progress ──(coder: failed)──────────────────> failed    [stop, surface]
in_progress ──(coder: blocked)─────────────────> blocked   [stop, surface]
in_progress ──(validator: fail)────────────────> failed    [stop, surface]

Terminal states: done, failed, blocked. The orchestrator never moves a task out of a terminal state.
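The same state machine can be written as a lookup table, which makes illegal transitions easy to reject. The event names below (`dispatch_coder`, `coder_failed`, and so on) are hypothetical labels for the arrows above, not identifiers from the skill.

```python
from __future__ import annotations

# (current status, event) -> next status, mirroring the arrows in the reference above.
TRANSITIONS = {
    ("pending", "dispatch_coder"): "in_progress",
    ("in_progress", "validator_pass"): "done",  # implies the coder already reported done
    ("in_progress", "coder_failed"): "failed",
    ("in_progress", "coder_blocked"): "blocked",
    ("in_progress", "validator_fail"): "failed",
}
TERMINAL = {"done", "failed", "blocked"}


def next_status(current: str, event: str) -> str:
    """Return the status after `event`, refusing to move a task out of a terminal state."""
    if current in TERMINAL:
        raise ValueError(f"{current} is terminal; the orchestrator never changes it")
    try:
        return TRANSITIONS[(current, event)]
    except KeyError:
        raise ValueError(f"no transition from {current} on {event}") from None
```

Moving a failed or blocked task back to pending is left to the human, so the table deliberately has no entry for it.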

Quality bar

A good run looks like a tight log of dispatches and status flips. The orchestrator's job is to be boring and predictable. Aim for:

Every tick begins with a fresh filesystem read
Every state transition is reflected in the task file before the next tick
Stops are loud and specific (task ID + reason), never silent

Things to avoid:

Reading task bodies "just to understand" — that's how the orchestrator's context bloats and how it starts second-guessing the subagents
Heuristics about how to fix a failure — that's the human's call
Implicit retries (e.g., "let me just try once more") — every retry has to be a human decision so the loop is auditable
Dispatching the validator before the coder reports done — they are sequential, not concurrent
