From kiln
Run the complete kiln pipeline using an agent team. Reads the PRD to determine team structure, then orchestrates specify → plan → tasks → implement → audit → PR.
Slash command: /kiln:build-prd
```text
$ARGUMENTS
```
You MUST consider the user input before proceeding (if not empty). The user input is the feature description.
Verify agent teams are available (NON-NEGOTIABLE).
Before anything else, check that TeamCreate is available as a tool. If it is NOT available, STOP immediately and tell the user:
"Agent teams are not enabled. /build-prd requires Claude Code agent teams to orchestrate the pipeline. To enable them, add this to your Claude Code settings or launch with the flag:
claude --enable-agent-teams
Or add "enableAgentTeams": true to .claude/settings.json. Then restart Claude Code and run /build-prd again."
Do NOT proceed with any other pre-flight steps if teams are unavailable. Do NOT attempt to run the pipeline in single-agent mode.
If no user input was provided, ask the user for a feature description.
Locate the PRD — check for a PRD in this order:
1. docs/features/*-<slug>/PRD.md
2. If docs/features/ contains exactly one feature PRD folder: read that feature PRD
3. docs/PRD.md (the product-level PRD)
4. If none of these exist: run /create-prd first.
Extract the feature scope, functional requirements, deliverables, and any named external dependencies.
For feature PRDs, also read docs/PRD.md for inherited product context (tech stack, users, constraints).
Read .specify/memory/constitution.md — note any constraints that affect team structure.
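The PRD lookup order in this step can be sketched as a small shell function. This is a minimal sketch, not the skill's actual implementation: `locate_prd` and its slug argument are hypothetical names, and it assumes the slug (when given) comes from the user input.

```shell
# Sketch of the PRD lookup order. locate_prd is a hypothetical helper.
# Prints the chosen PRD path, or returns 1 when none exists.
locate_prd() {
  local slug="$1" candidate count only

  # 1. A feature PRD whose folder name ends in -<slug>.
  if [ -n "$slug" ]; then
    for candidate in docs/features/*-"$slug"/PRD.md; do
      if [ -f "$candidate" ]; then
        printf '%s\n' "$candidate"
        return 0
      fi
    done
  fi

  # 2. Exactly one feature PRD folder: use that feature PRD.
  count=0
  for candidate in docs/features/*/PRD.md; do
    if [ -f "$candidate" ]; then
      count=$((count + 1))
      only="$candidate"
    fi
  done
  if [ "$count" -eq 1 ]; then
    printf '%s\n' "$only"
    return 0
  fi

  # 3. Fall back to the product-level PRD.
  if [ -f docs/PRD.md ]; then
    printf '%s\n' docs/PRD.md
    return 0
  fi

  # 4. Nothing found: the caller should point the user at /create-prd.
  return 1
}
```

A caller might use it as `PRD_PATH=$(locate_prd "$slug") || echo "No PRD found; run /create-prd first."`.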
Handle working directory and create branch:
The user's local checkout is their working copy. The pipeline branches from the current HEAD — not from main.
# Step A: Check for uncommitted changes
if ! git diff --quiet || ! git diff --cached --quiet; then
echo "You have uncommitted changes."
fi
# Also check untracked files
git status --short
If there are uncommitted changes or staged files: Commit them to the current branch first, then create the pipeline branch. Do NOT ask the user — just commit and proceed. These are typically the PRD and backlog files the pipeline needs.
git add -A
git commit -m "chore: commit working changes before pipeline branch"
If the working directory is clean: Proceed directly to creating the branch.
# Step B: Derive feature slug and create a fresh branch from current HEAD (FR-004)
# The feature slug MUST be derived from the PRD directory name (2-4 words, lowercase, hyphenated).
# Example: docs/features/2026-04-01-pipeline-workflow-polish/PRD.md → "pipeline-workflow-polish"
# Strip the date prefix (YYYY-MM-DD-) from the PRD directory name to get the slug.
PRD_DIR_NAME=$(basename "$(dirname "$PRD_PATH")")
FEATURE_SLUG=$(echo "$PRD_DIR_NAME" | sed 's/^[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}-//' | tr ' ' '-' | tr '[:upper:]' '[:lower:]')
BRANCH_NAME="build/${FEATURE_SLUG}-$(date +%Y%m%d)"
git checkout -b "$BRANCH_NAME"
The branch is created from wherever the user currently is — their current branch, current commit, current state. This preserves their working context. The pipeline's work happens on this new branch; the user's original branch is untouched.
Branch naming rule (FR-004): The branch MUST follow build/<feature-slug>-<YYYYMMDD> exactly. The feature slug is derived from the PRD directory name with date prefix stripped. Do NOT use arbitrary slugs — derive from the PRD path.
Spec directory naming rule (FR-005): The spec directory MUST be specs/<feature-slug>/ where <feature-slug> matches the branch name's feature portion (the part between build/ and the trailing -YYYYMMDD). No numeric prefixes. The specifier agent MUST use this exact directory name.
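The relationship between the branch name (FR-004) and the spec directory (FR-005) can be illustrated with plain parameter expansion. This is a sketch only; `spec_dir_from_branch` is a hypothetical helper name, not part of kiln.

```shell
# Derive the spec directory from a pipeline branch name.
# spec_dir_from_branch is a hypothetical helper used for illustration.
spec_dir_from_branch() {
  local branch="$1" slug
  # Drop the leading "build/".
  slug="${branch#build/}"
  # Drop the trailing "-YYYYMMDD" (a hyphen followed by eight digits).
  slug="${slug%-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]}"
  printf 'specs/%s/\n' "$slug"
}
```

For the branch build/pipeline-workflow-polish-20260401 this yields specs/pipeline-workflow-polish/, with no numeric prefix.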
PRD freeze: The PRD is frozen the moment you read it. Do NOT ask the user for confirmation — just proceed. Log a one-line message: "PRD frozen — starting pipeline on branch $BRANCH_NAME from $(git rev-parse --abbrev-ref @{-1})." If the user needs to change requirements mid-run, they can trigger a scope-change pause (see Step 4 in Monitor and Steer).
The pipeline always flows through these roles. This is the minimum — you MUST have at least one teammate per role:
Specifier — Runs /specify, then /plan, then /tasks in a single uninterrupted pass. All three commands MUST execute back-to-back without stopping. The specifier MUST NOT go idle between commands. Produces all spec artifacts and commits them. Always runs first.
Researcher — Resolves external dependencies referenced in the PRD. Clones starters to vendor/, documents findings in research.md. Runs after specifier if the PRD names external projects; skip this role if there are no external deps. PRD naming authority: The researcher MUST NOT rename, substitute, or "improve" directory names, file names, or identifiers that the PRD explicitly specifies. If the PRD says apps/electron, the researcher documents apps/electron — not apps/desktop, not apps/electron-app, not any "technology-agnostic" alternative. The PRD is the naming authority. If the researcher believes a PRD name is wrong, they must flag it to the team lead for resolution rather than silently substituting a different name.
Implementer — Runs /implement. Executes the task plan phase-by-phase, writes code matching contracts, marks tasks [X], commits per phase. Runs after specifier (and researcher if present).
QA Engineer — (Web/frontend projects only) Runs the qa-engineer agent. Unlike other auditors, the QA engineer is long-lived — it starts after the specifier finishes (so it knows what to test) and runs in parallel with implementers. It operates in two modes:
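- Checkpoint mode (during implementation): tests each newly completed flow and reports failures to the responsible implementer via SendMessage. The implementer fixes the issue and notifies QA for re-test. This creates a tight feedback loop that catches visual bugs while the implementer still has context.
- Final mode (after all implementers finish): runs /qa-pipeline which spins up a 4-agent QA team: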
qa-pass + build-prd labels
After /qa-pipeline, runs /qa-final as a quick green/red gate to confirm all E2E tests pass.
The audit-pr agent includes the QA report summary and issue links in the PR body.
The QA engineer tracks its checkpoint history in .kiln/qa/checkpoints.md so it doesn't re-test unchanged flows. It is a peer to implementers, not a gate after them.
QA snapshot guidance (FR-013): QA result snapshots and incremental test-result files MUST NOT be committed to the feature branch. They belong in .kiln/qa/ which is gitignored. The QA engineer should write all artifacts (screenshots, videos, reports) to .kiln/qa/ and never git add them.
Skip this role for CLI-only, API-only, or non-visual projects.
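The QA snapshot guidance (FR-013) depends on .kiln/qa/ being gitignored. A quick way to confirm that before QA writes artifacts is `git check-ignore`. The sketch below is illustrative: `qa_artifacts_ignored` and the probe file name are hypothetical, and the function must be run from the repository root.

```shell
# Returns 0 when files under .kiln/qa/ are ignored by git, 1 when they are not.
# qa_artifacts_ignored and probe.png are hypothetical names; the probe file
# does not need to exist for git check-ignore to evaluate the ignore rules.
qa_artifacts_ignored() {
  git check-ignore -q -- ".kiln/qa/probe.png"
}

# Example use (commented out so that sourcing this file has no side effects):
# qa_artifacts_ignored || echo "WARNING: .kiln/qa/ is not gitignored"
```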
Auditor — Runs after all implementers AND the QA engineer's final pass finish. Each auditor gets a fresh context (no implementation history polluting their judgment). Split auditors by concern so they can run in parallel:
- /audit — PRD→Spec→Code→Test verification
- .kiln/qa/latest/QA-REPORT.md summary in the PR body.
For simple features, one auditor can do all of these. For complex features, split them so each auditor starts with a clean context and a focused lens.
Retrospective — Messages all teammates for feedback, creates a GitHub issue with findings. Runs last, before shutdown.
Based on what you read in the PRD, decide where to add parallelism:
- qa-engineer that runs alongside implementers. It starts after the specifier (needs spec to know what to test) and runs checkpoint passes as implementers complete phases. Implementers message the QA engineer when a phase is done; QA tests it and sends feedback back. This is the only role that runs in parallel with implementers rather than after them.
Ask yourself:
- qa-engineer running alongside implementers.
ALL pipeline agents (specifier, researcher, implementers, QA engineer, auditors) MUST write a friction note to specs/<feature>/agent-notes/<agent-name>.md before completing their work and marking their task as done. This is a prerequisite for task completion — the retrospective agent reads these notes instead of polling live teammates.
Each agent's prompt already includes the friction notes section. When spawning agents, ensure the feature path is communicated so agents know where to write their notes. The specs/<feature>/agent-notes/ directory should be created by the first agent that writes a note.
After tasks.md is generated, check the task count per implementer. If any single implementer would own more than 20 tasks, split them into multiple implementers by component or phase. A single implementer doing 50+ tasks will take too long, delaying the auditor and causing the pipeline to bottleneck.
Example: If tasks.md has 63 tasks split across CLI (20), templates (27), and modules (16), spawn 3 implementers — not 1 or 2. The "keep under 6" guidance applies to non-implementer roles; add implementers as needed to keep each one under ~20 tasks.
Simple feature (single module, no external deps):
specifier → implementer → auditor → retrospective
3 teammates, fully serial.
Medium feature (CLI + templates, one external dep):
specifier → researcher ─┐
                        ├→ impl-cli ──┐→ audit-code ──┐
                        └→ impl-tmpl ─┘→ audit-tests ─┤→ retrospective
5 teammates, 2 implementers in parallel, 2 auditors in parallel.
Web frontend feature (UI + API, visual QA with feedback loop):
specifier ─┐
           ├→ impl-ui ───┐
           ├→ impl-api ──┤→ qa-engineer (final) ─┐→ audit-pr ─┐
           └→ qa-engineer ┘ (checkpoints ↔ impls)└→ audit-compliance ─┤→ retrospective
6 teammates. qa-engineer starts after specifier, runs checkpoint passes during implementation (sends feedback to impl-ui/impl-api, receives "fix ready" notifications back). After all implementers finish, QA switches to final mode for the full video report. Auditors depend on both implementers AND QA final pass.
Complex feature (CLI + 3 modules + external starter):
specifier → researcher ─┐
                        ├→ impl-core ──┐→ audit-compliance ─┐
                        ├→ impl-mod-a ─┤→ audit-tests ──────┤→ retrospective
                        └→ impl-mod-b ─┘→ audit-smoke ──────┘
7 teammates, 3 implementers in parallel, 3 auditors in parallel.
Each teammate should run the kiln commands (/specify, /plan, /tasks, /implement, /audit) — not reimplement their logic. Implementers running in parallel should each get a filtered view of tasks.md (only their component's tasks).
1. Use TeamCreate with a descriptive name (e.g., kiln-{feature})
2. Use TaskCreate to create ALL tasks. You MUST create every task listed in the Mandatory Tasks section below, plus any additional tasks from your PRD analysis.
3. Set dependencies via TaskCreate or TaskUpdate (see dependency rules below)
4. Assign each task's owner via TaskUpdate
Every pipeline run MUST include these tasks regardless of feature complexity. Do NOT skip any of them:
| # | Task | Owner | Depends On | Why Mandatory |
|---|---|---|---|---|
| 1 | Specify + plan + research + tasks | specifier | — | Produces all spec artifacts |
| N | Implementation (1+ tasks) | implementer(s) | specifier | Builds the feature |
| Q | Visual QA (checkpoints + final) | qa-engineer | specifier | (Web/frontend only) Feedback loop during impl + final video report. Create this task if the PRD has any visual/frontend component. The QA engineer does NOT wait for implementers to finish — it runs alongside them. |
| A | Audit + smoke test + create PR | auditor | all implementers + qa-engineer (if present) | Quality gate + deliverable |
| R | Retrospective | retrospective | ALL other tasks | Self-improvement — feeds back into the skill and template. ALWAYS the last task before shutdown. MUST NOT start until every other task is completed or explicitly cancelled. |
The retrospective task exists to make every pipeline run improve the next one. Skipping it means repeating the same friction forever.
Based on your PRD analysis, you may add:
- qa-engineer task for web/frontend projects (runs alongside implementers, not after)
Each implementer MUST have exactly one task that represents ALL of their work. The implementer MUST NOT mark this task as completed until every phase and every sub-task in tasks.md that they own is finished and committed. The auditor depends on these task completions, so premature completion signals cause the auditor to run against incomplete code.
Include this in every implementer's prompt:
Do NOT mark your task as `completed` via TaskUpdate until ALL of the following are true:
- Every task assigned to you in tasks.md is marked [X]
- Every phase you own has been committed
- You have no remaining uncommitted work
Your task completion is the signal that triggers the auditor. If you mark it done early, the auditor will audit incomplete code and produce invalid findings.
Wire dependencies following these rules:
addBlockedBy dependencies when creating the retrospective task via TaskCreate. Do NOT depend on only the auditor — that leaves a race condition where implementers may still be running.
Task 1: Specify (no deps) → owner: specifier
Task 2: Research (depends: 1) → owner: researcher
Task 3: Impl CLI (depends: 2) → owner: impl-cli
Task 4: Impl templates (depends: 2) → owner: impl-templates
Task 5: Audit + smoke + PR (depends: 3, 4) → owner: auditor
Task 6: Retrospective (depends: 1, 2, 3, 4, 5) → owner: retrospective ← depends on ALL tasks

With a QA engineer (web/frontend feature):
Task 1: Specify (no deps) → owner: specifier
Task 2: Impl UI (depends: 1) → owner: impl-ui
Task 3: Impl API (depends: 1) → owner: impl-api
Task 4: Visual QA (depends: 1) → owner: qa-engineer ← starts with implementers, NOT after
Task 5: Audit + PR (depends: 2, 3, 4) → owner: auditor ← waits for impls AND QA final pass
Task 6: Retrospective (depends: 1, 2, 3, 4, 5) → owner: retrospective
Note: Task 4 (QA) depends only on the specifier, so it unblocks at the same time as the implementers. The QA engineer runs checkpoint passes during implementation by communicating with implementers via SendMessage. It only marks its task as completed after its final pass (all flows tested, video exported). The auditor waits for this completion before starting.
The system automatically unblocks dependent tasks when their dependencies complete. The retrospective will not unblock until every single dependency is marked completed.
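Both examples give the retrospective a dependency on every other task id. Assuming tasks are numbered sequentially with the retrospective last (as in those examples), the dependency list can be generated rather than typed by hand. `retro_dependencies` is a hypothetical helper and is not a kiln or Claude Code command.

```shell
# Print the ids the retrospective must depend on: every id below its own.
# Assumes sequential task ids with the retrospective created last.
retro_dependencies() {
  local retro_id="$1" id=1 deps=""
  while [ "$id" -lt "$retro_id" ]; do
    # Append with ", " between ids, but no leading separator.
    deps="${deps:+$deps, }$id"
    id=$((id + 1))
  done
  printf '%s\n' "$deps"
}
```

Calling `retro_dependencies 6` prints `1, 2, 3, 4, 5`, matching the Task 6 lines in the examples.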
Before spawning any teammates, verify:
Spawn all teammates EXCEPT the retrospective agent. The retrospective is spawned later in Step 5 after all auditors complete. This keeps its context clean — an agent spawned at pipeline start accumulates idle notifications and peer DM summaries for the entire run, burning tokens on irrelevant context.
Spawn teammates using the Agent tool with:
- team_name set to the team name from Step 2
- name set to a descriptive name (e.g., specifier, impl-core, auditor)
- run_in_background: true
- mode: "bypassPermissions"
Each teammate's prompt should include:
- TaskUpdate to mark tasks in-progress when starting and completed when done
- SendMessage to notify dependent teammates when unblocked
- TaskList after completing each task to find the next available work
- ~/.claude/teams/{team-name}/config.json to discover other teammates by name
The specifier's prompt MUST include these exact instructions to prevent stalling between commands:
SPEC DIRECTORY NAMING (FR-005): The spec directory MUST be specs/<feature-slug>/ where <feature-slug>
matches the branch name's feature portion (the part between "build/" and the trailing "-YYYYMMDD").
No numeric prefixes. For example, if the branch is build/pipeline-workflow-polish-20260401,
the spec directory MUST be specs/pipeline-workflow-polish/. Do NOT use any other naming scheme.
You MUST run all three kiln commands in a single uninterrupted pass:
1. Run `/specify` with the feature description
2. IMMEDIATELY after specify completes, run `/plan` — do NOT stop, do NOT wait, do NOT go idle
3. IMMEDIATELY after plan completes, run `/tasks` — do NOT stop, do NOT wait, do NOT go idle
4. ONLY after all three are done: commit all artifacts, mark your task completed, and notify downstream teammates
Each slash command will report "completion" and suggest next steps — IGNORE those suggestions and proceed to the next command in this list. Your task is NOT complete until spec.md, plan.md, contracts/interfaces.md, and tasks.md all exist and are committed.
Why this is needed: Each /* skill ends by reporting completion and suggesting the next command. Without explicit chaining instructions, the specifier agent treats each skill completion as a stopping point and goes idle, requiring a manual nudge from the team lead to continue. This caused a ~10 minute stall in the 015 pipeline run.
The researcher's prompt MUST include these exact instructions:
When documenting findings in research.md:
- If the PRD explicitly names a directory, file, package, or identifier, you MUST use that exact name. Do NOT substitute a "better", "cleaner", or "technology-agnostic" name.
- Example: If the PRD says `apps/electron`, document `apps/electron` — not `apps/desktop`.
- If you believe a PRD name is incorrect or problematic, flag it to the team lead with your reasoning. Do NOT silently rename it in your research output.
- Verify every directory name, package name, and file path in your research.md against the PRD before committing. Any mismatch is a bug.
Why this is needed: In the 015 pipeline, the researcher substituted apps/desktop for the PRD's apps/electron, documenting it as "technology-agnostic naming." This cascaded into spec artifacts and module definitions, requiring two fixup commits (b142ba9, a8b35cc) and manual mid-pipeline renaming across all affected files.
The QA engineer's prompt MUST include these exact instructions:
You are the QA engineer for this pipeline. You run the `qa-engineer` agent definition.
## SKILLS
- `/qa-setup` — Run FIRST. Installs Playwright, scaffolds .kiln/qa/, generates test matrix and test stubs.
- `/qa-checkpoint` — During implementation. Tests new flows, sends feedback to implementers.
- `/qa-pipeline` — After ALL implementers finish. 4-agent team (e2e + chrome + ux + reporter in pipeline mode). Reporter routes findings to implementers for fixing.
- `/qa-final` — Quick gate after /qa-pipeline. Just runs playwright tests and confirms green.
## WORKFLOW
1. On startup: Run `/qa-setup`
2. If `/qa-setup` reports credential-dependent flows, message the team lead:
"QA CREDENTIALS NEEDED — [list flows]. Please ask the user to fill in .kiln/qa/.env.test."
Do NOT block — continue testing non-auth flows while waiting.
3. Watch for messages from implementers saying a phase is complete
4. When notified: Run `/qa-checkpoint`
5. When an implementer messages "fix ready": Run `/qa-checkpoint [flow-name]` to re-test
6. If team lead provides credentials: re-check .kiln/qa/.env.test and unblock auth flows
7. After ALL implementers are done: Run `/qa-pipeline` (4-agent team with fix routing)
8. After `/qa-pipeline` completes: Run `/qa-final` (quick green/red gate)
9. Mark your task as completed via TaskUpdate ONLY after `/qa-final` is green
10. Notify the auditor that QA is complete and report is ready
## CREDENTIALS
- NEVER hardcode or guess credentials — always load from .kiln/qa/.env.test
- NEVER log, screenshot, or expose credentials in video recordings
- If credentials aren't provided, mark affected flows as SKIPPED in the QA report — do NOT block the pipeline
## FEEDBACK RULES
- For each FAILURE: send actionable feedback directly to the responsible implementer via SendMessage:
- What you tested (user flow + steps)
- What went wrong (with screenshot path)
- Suggested fix direction
- Severity (Critical/Major/Minor)
- For each PASS: send brief confirmation to the implementer
- Re-test promptly when an implementer says "fix ready" — you're in their critical path
Do NOT mark your task as completed until the final pass is done and all artifacts are committed.
Your task completion is the signal that triggers the auditor — it needs your QA report and video links for the PR.
Why this is needed: A QA engineer that only runs after implementation misses the chance to catch visual bugs while the implementer still has context. By running checkpoint passes during implementation and sending feedback directly to implementers, bugs get fixed in the same phase they're introduced — not discovered hours later in a final audit.
When a QA engineer is on the team, add this to every implementer's prompt:
A QA engineer (qa-engineer) is testing your work as you build it. After completing each phase:
1. Commit your work
2. Send a message to qa-engineer: "Phase N complete — [list of user flows now testable]. Dev server runs on port [port]."
3. Continue to your next phase — do NOT wait for QA results
4. If qa-engineer sends you feedback about a failure:
a. Read the feedback carefully (it includes what they tested, what failed, and a suggested fix)
b. Fix the issue in your current phase if possible, or note it for a dedicated fix pass
c. After fixing, message qa-engineer: "Fix ready for [flow name] — please re-test"
5. QA feedback fixes are part of your work — do NOT mark your task as completed until QA issues in your scope are resolved
The auditor's prompt MUST include these exact instructions:
Before starting your audit, verify that ALL implementation AND QA are truly complete:
1. Run `TaskList` and check that every implementer task has status `completed` — not `in_progress`, not `pending`
2. If a qa-engineer task exists, verify it is also `completed` (QA final pass done, videos exported)
3. Read `tasks.md` and verify that every task assigned to implementers is marked `[X]`
4. If ANY implementer or qa-engineer task is still in progress or unchecked, do NOT begin auditing. Instead:
- Message the team lead: "Audit blocked — task {id} is not yet complete."
- Wait for the team lead to confirm all work is done before proceeding.
Do NOT audit a partially-complete implementation. Your audit findings are only valid against the final state of the code.
If a QA engineer ran, read `.kiln/qa/latest/QA-REPORT.md` and include its findings in your audit:
- Reference the QA pass/fail verdict
- Link video artifacts in the PR body (.kiln/qa/latest/videos/*.webm)
- Flag any remaining QA failures as blockers
Why this is needed: In the 015 pipeline, the auditor started at 20:05 and documented blockers (missing v2 template, missing Zero/Drizzle/Auth), but the implementer committed those exact fixes at 20:06 and 20:12. The auditor was working against incomplete implementation because it began as soon as its task dependency resolved — but the implementer had marked its coarse-grained task as "completed" before finishing all phases of work.
The auditor's prompt MUST also include:
Before creating the PR, reconcile blockers.md against the current code state:
1. Re-read every blocker in blockers.md
2. For each blocker, check if the code has been updated since it was documented:
- Run `git log --oneline` and check for commits that may have addressed the blocker
- Read the affected files to verify current state
3. If a blocker has been resolved by a later commit, update its status to "RESOLVED" with the commit hash
4. Update the compliance summary table to reflect the actual final state
5. Commit the updated blockers.md before creating the PR
The PR must reflect the FINAL state of the code, not a point-in-time snapshot from mid-implementation.
When creating the PR, always add the `build-prd` label:
gh pr create --label "build-prd" --title "[feature-name]: [short description]" --body "$(cat <<'PREOF'
## Summary
- [bullet points from audit findings]
## Compliance
- PRD coverage: X%
- Test coverage: X%
- Blockers: N (see specs/{feature}/blockers.md)
## QA Results
- Smoke test: PASS/FAIL
- Visual QA: PASS/FAIL/SKIPPED — [video count] recordings
- QA Report: .kiln/qa/latest/QA-REPORT.md
## Test plan
- [ ] Tests pass (`npm test`)
- [ ] Build succeeds (`npm run build`)
- [ ] Smoke test passes
- [ ] Visual QA passes (if applicable)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
PREOF
)"
Why this is needed: In the 015 pipeline, blockers.md cited B-001 (missing v2 template), B-002 (missing Zero/Drizzle/Auth), and B-003 (only 2 UI components) as critical gaps with 65% compliance. But the implementer fixed all three in later commits. The blockers.md was never updated, so the PR would have shipped with a stale 65% compliance figure when the actual number was higher.
Include these in every teammate prompt:
- Read .specify/memory/constitution.md before any code changes
- Update task status via TaskUpdate (in_progress → completed)
- Check TaskList for the next unblocked, unassigned task
- Follow specs/<feature>/contracts/interfaces.md exactly
- Mark tasks [X] in tasks.md IMMEDIATELY after completing each one
- Re-read tasks.md and contracts/interfaces.md before starting your next task — they may have changed.
Teammates go idle after every turn — this is normal. An idle teammate can still receive messages. If a teammate sends a message and then goes idle, that's the expected flow (they sent their message and are waiting for a response). Do NOT treat idle as an error or shutdown.
After spawning, you are the team lead. Your job is coordination, not implementation.
- SendMessage.
- TaskList — check periodically to see what's done and what's blocked.
- SendMessage with the teammate's name (not agentId) to communicate.
Monitor agent activity and detect stalled agents. A stalled agent wastes pipeline time and blocks downstream work.
Default timeout: 10 minutes (configurable per-project by adjusting this value).
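The timeout comparison itself is simple arithmetic on timestamps. The sketch below assumes the caller supplies the epoch time of the agent's most recent activity; for commits, `git log -1 --format=%ct` prints the committer timestamp of the latest commit. `is_stalled` is a hypothetical helper, and it covers only the time check, not how task updates or messages are observed.

```shell
# is_stalled LAST_ACTIVITY_EPOCH [NOW_EPOCH] [TIMEOUT_MINUTES]
# Exit 0 when more than TIMEOUT_MINUTES (default 10) have passed since the
# last observed activity, exit 1 otherwise.
is_stalled() {
  local last="$1" now="${2:-$(date +%s)}" timeout_min="${3:-10}"
  [ $(( now - last )) -gt $(( timeout_min * 60 )) ]
}

# Example use (commented out so that sourcing this file has no side effects):
# last_commit=$(git log -1 --format=%ct)
# is_stalled "$last_commit" && echo "No commits within the stall timeout"
```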
How to detect stalls:
- Check TaskList for in_progress tasks
- If a task has been in_progress for longer than the stall timeout with no commits, task updates, or messages from that agent, it is considered stalled
When a stall is detected:
When dispatching implementer agents that work on multi-phase task lists, enforce phase ordering:
Rule: Do NOT dispatch or unblock Phase N+1 agents until ALL tasks in Phase N are marked [X] in tasks.md.
How to enforce:
1. Read tasks.md
2. Verify that every Phase N task is marked [X]
3. If any Phase N task is still [ ], do NOT dispatch the next phase's agents
Why: Without enforcement, agents can race ahead and build on incomplete foundations, causing cascading failures. A Phase 2 agent that starts before Phase 1 is done may reference code that doesn't exist yet.
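The phase gate can be checked mechanically. The sketch below assumes tasks.md groups tasks under Markdown headings that start with "Phase N" and uses "- [ ]" / "- [X]" checkboxes; that layout is an assumption about the generated file, and `phase_complete` is a hypothetical helper.

```shell
# phase_complete TASKS_FILE PHASE_NUMBER
# Exit 0 only when the given phase has no unchecked "- [ ]" items.
# Note: a phase with no task lines at all also counts as complete.
phase_complete() {
  local file="$1" phase="$2" open
  open=$(awk -v phase="$phase" '
    # Any "Phase" heading switches tracking on or off for the wanted phase.
    /^#+ *Phase / { in_phase = ($0 ~ ("^#+ *Phase " phase "([^0-9]|$)")) }
    # Count unchecked boxes only while inside the wanted phase.
    in_phase && /^ *- \[ \]/ { n++ }
    END { print n + 0 }
  ' "$file")
  [ "$open" -eq 0 ]
}
```

A team lead could run `phase_complete specs/<feature>/tasks.md 1` before unblocking Phase 2 agents.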
When approximately 50% of implementer tasks in tasks.md are marked [X], spawn a short-lived audit-midpoint agent to catch structural issues early — before the full audit at the end. This prevents problems like the missing Dockerfile in the obsidian-mcp-mvp pipeline (issue #14) from being caught only at final audit.
How to trigger: While monitoring via TaskList, track the ratio of completed implementer tasks. When it crosses ~50%, spawn the midpoint auditor.
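As a rough illustration of the ratio check, the percentage of checked boxes can be read straight from tasks.md. This sketch counts every checkbox line in the file rather than only implementer-owned tasks, and assumes "- [ ]" / "- [X]" checkbox syntax; `completion_percent` is a hypothetical helper.

```shell
# completion_percent TASKS_FILE
# Prints the integer percentage of checkbox lines that are checked.
completion_percent() {
  local file="$1" total checked
  # grep -c exits 1 when the count is 0, so guard it for "set -e" callers.
  total=$(grep -c -E '^ *- \[[ xX]\]' "$file" || true)
  checked=$(grep -c -E '^ *- \[[xX]\]' "$file" || true)
  if [ "$total" -eq 0 ]; then
    echo 0
    return 0
  fi
  echo $(( checked * 100 / total ))
}
```

When the printed value reaches roughly 50, the midpoint auditor is due.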
audit-midpoint agent prompt (spawn with run_in_background: true):
You are a lightweight mid-pipeline auditor. Your job is to catch structural gaps EARLY, not to do a full compliance audit. Check these specific things:
1. **Deployment artifacts**: Read plan.md's "Deployment Readiness" section. For every artifact marked "Yes", verify the file exists or has a task assigned in tasks.md. Flag any missing artifacts.
2. **Contract compliance**: Spot-check 3-5 implemented functions against contracts/interfaces.md. Flag signature mismatches.
3. **Structural completeness**: Verify the project structure in plan.md matches what's been created so far. Flag missing directories or misnamed paths.
Report findings to the team lead via SendMessage. Do NOT fix anything — just report. Keep it brief: a bulleted list of gaps found (or "No structural gaps found").
After reporting, mark your task as completed. This is a one-shot check, not an ongoing role.
Create a task for this agent in Step 2 (e.g., "Mid-pipeline structural check") with dependencies on the specifier task only (it runs during implementation, not after). The final auditors do NOT depend on this task — it's advisory.
When all implementers have completed their tasks and before dispatching QA agents for the final pass, check if the project uses Docker:
# Check for Docker configuration in the project root
if [ -f "Dockerfile" ] || [ -f "docker-compose.yml" ] || [ -f "docker-compose.yaml" ] || [ -f "compose.yml" ] || [ -f "compose.yaml" ]; then
echo "Docker project detected — rebuilding containers before QA"
if [ -f "docker-compose.yml" ] || [ -f "docker-compose.yaml" ] || [ -f "compose.yml" ] || [ -f "compose.yaml" ]; then
docker compose build 2>&1 || echo "WARNING: Docker rebuild failed — QA may test stale containers"
else
docker build -t "$(basename "$PWD")" . 2>&1 || echo "WARNING: Docker rebuild failed — QA may test stale containers"
fi
fi
Rules:
- Dockerfile or docker-compose.yml (or compose.yml variants) exists in the project root
If the user changes scope, updates the PRD, or asks to modify requirements while implementers are already running:
- TaskList — no implementer or QA engineer should have tasks in in_progress state after acknowledging. If someone doesn't respond, send the pause message again.
- spec.md, plan.md, contracts/interfaces.md, and tasks.md to reflect the new scope. Commit the updated artifacts.
- /qa-setup to regenerate the test matrix.
Why this matters: Without an explicit pause, implementers work against stale spec artifacts and the QA engineer tests against outdated flows. The pause-update-resume cycle ensures all agents work from the same source of truth.
After the audit-pr agent creates the PR, and before spawning the retrospective, the team lead completes the issue lifecycle for this build:
Identify the PRD path used for this build (from Pre-Flight step 3).
Scan for matching issues:
PRD_PATH="<the PRD path used for this build>"
for issue_file in .kiln/issues/*.md; do
[ -f "$issue_file" ] || continue
# Check if status is prd-created
status=$(grep -m1 '^status:' "$issue_file" | sed 's/^status:[[:space:]]*//')
[ "$status" = "prd-created" ] || continue
# Check if prd field matches
prd_field=$(grep -m1 '^prd:' "$issue_file" | sed 's/^prd:[[:space:]]*//')
if [ "$prd_field" = "$PRD_PATH" ]; then
echo "MATCH: $issue_file"
fi
done
Update matching issues to status: completed:
- status: completed (replace the existing status line)
- completed_date: YYYY-MM-DD (today's date)
- pr: #<PR-number> (the PR number from audit-pr)
Archive completed issues (FR-008):
mkdir -p .kiln/issues/completed
# Move each updated issue file to completed/
mv "$issue_file" .kiln/issues/completed/
If the move fails for any file, log a warning and continue — do not block the pipeline.
Commit the issue updates if any files were changed:
git add .kiln/issues/ && git commit -m "chore: mark prd-created issues as completed after PR creation"
If no matching issues are found, skip this step silently.
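The field updates for a matching issue can be sketched with awk. This assumes each issue file has a single `status:` line (the same line the scan loop reads) and that placing `completed_date:` and `pr:` directly after it is acceptable; the text does not specify where those fields go. `mark_issue_completed` is a hypothetical helper.

```shell
# mark_issue_completed ISSUE_FILE PR_NUMBER [DATE]
# Rewrites the first "status:" line and adds completed_date and pr after it.
mark_issue_completed() {
  local file="$1" pr="$2" today="${3:-$(date +%Y-%m-%d)}" tmp
  tmp=$(mktemp) || return 1
  awk -v today="$today" -v pr="$pr" '
    !seen && /^status:/ {
      print "status: completed"
      print "completed_date: " today
      print "pr: #" pr
      seen = 1
      next
    }
    { print }
  ' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
  # Write back in place so the original file keeps its permissions.
  cat "$tmp" > "$file" && rm -f "$tmp"
}
```

Inside the scan loop, the `echo "MATCH: $issue_file"` line would be followed by a call such as `mark_issue_completed "$issue_file" "$PR_NUMBER"`, where `PR_NUMBER` is an assumed variable holding the number reported by audit-pr.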
⛔ STOP. DO NOT send ANY shutdown requests or run TeamDelete until the retrospective is COMPLETE. ⛔ This has been violated in past runs — the team lead shut down agents before the retrospective could collect feedback, losing all self-improvement data.
The retrospective teammate was NOT spawned in Step 3. Spawn it NOW, after all auditor tasks are completed. This gives it a clean context without accumulated idle notifications from the entire pipeline. Use the same Agent tool parameters as Step 3 (team_name, run_in_background, mode) with name: "retrospective". The retrospective task was already created in Step 2 with dependencies on all other tasks — it should unblock immediately since all prerequisites are complete.
Before the retrospective agent starts any work, it MUST run TaskList and verify that every non-retrospective task has status completed or cancelled. If ANY task is still pending or in_progress:
- TaskList after receiving a follow-up message from the team lead
Include these instructions verbatim in the retrospective teammate's prompt when spawning it in Step 5.
The retrospective teammate's job:
- specs/<feature>/agent-notes/ directory. Each pipeline agent writes a friction note before completing — these contain what was confusing, where agents got stuck, and what could be improved. This is the PRIMARY source of agent feedback, replacing live SendMessage polling of teammates.
- SendMessage asking for feedback. But prefer the written notes — they're more structured and don't depend on agent availability.
- specs/{feature}/agent-notes/ — agent friction notes (primary feedback source)
- specs/{feature}/blockers.md — documented blockers
- git log — commit flow and any fixup commits that indicate rework
- SendMessage history — look for misunderstandings, repeated clarifications, agents asking the same question twice, or agents doing work that conflicted with another agent
File: [path]
Current: "[exact text that caused the issue]"
Proposed: "[rewritten text that fixes it]"
Why: [one sentence explaining the improvement]
build-prd label:
gh issue create -R yoshisada/ai-repo-template --label "build-prd" --title "..." --body "..."
Containing:
TaskUpdate
Only proceed to Step 5.5 after the retrospective task is marked completed.
After the retrospective completes and before cleanup/PR creation, run /next to produce a continuance analysis. This gives the developer a prioritized list of what to work on after the pipeline finishes.
How to run: The team lead invokes /next directly (not --brief) to get the full analysis. Do NOT spawn a new teammate for this — the team lead runs the skill itself.
What it does:
- .kiln/logs/next-<timestamp>.md
- .kiln/issues/ for any untracked gaps
Include the continuance output in the final pipeline summary (Step 6). The "What's Next" section from /next should appear in the terminal output so the developer sees their next steps immediately.
If /next fails: Log a warning ("Continuance analysis failed — skipping") and proceed with cleanup and PR creation. The continuance step is advisory only — it MUST NOT block the pipeline.
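Because the continuance step is advisory, its failure handling can be pictured as a wrapper that never lets an error propagate. The following is only a sketch: `run_next` is a hypothetical stand-in for invoking /next, assumed to return the report path, and `warn` is any callable that records a warning.

```python
def run_continuance(run_next, warn):
    """Run the advisory step and never let its failure stop the pipeline.

    run_next: hypothetical callable standing in for invoking /next;
              assumed to return the report path on success.
    warn:     callable that records a warning message.
    """
    try:
        return ("Done", run_next())
    except Exception as exc:
        # Log and continue: the step is advisory only.
        warn(f"Continuance analysis failed, skipping ({exc})")
        return ("Skipped", "skipped")


def failing_next():
    raise RuntimeError("no report produced")


warnings = []
print(run_continuance(failing_next, warnings.append))
print(run_continuance(lambda: ".kiln/logs/next-<timestamp>.md", warnings.append))
```

The returned pair lines up with the Continuance row of the pipeline report: a status of Done or Skipped, plus either the report path or the word "skipped".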
BEFORE proceeding with ANY cleanup or shutdown:
1. Run TaskList RIGHT NOW
2. Find the retrospective task
3. Is its status "completed"?
- NO → STOP. Do NOT proceed. Go back to Step 5. Wait for retrospective to finish.
- YES → Continue to the shutdown protocol below.
If you skip this check, the retrospective data is LOST and the pipeline
cannot self-improve. This has happened before — do not let it happen again.
Verify retrospective ran: The retrospective task MUST show status completed in TaskList. If it does not, STOP HERE — go back to Step 5 and wait. Do NOT send any shutdown requests. Do NOT run TeamDelete. Do NOT proceed to the report.
Confirm each agent is finished before shutdown (NON-NEGOTIABLE):
For EACH teammate (including the retrospective agent), send a confirmation request BEFORE sending a shutdown request:
SendMessage("[agent-name]", "The pipeline is complete. Are you finished with all your work? Please confirm:
1. All your tasks are marked completed in TaskList
2. All your artifacts are committed
3. You have no pending messages to send
Reply 'READY TO SHUTDOWN' when confirmed.")
Wait for each agent to reply 'READY TO SHUTDOWN' before proceeding. If an agent says it's NOT finished:
NEVER shut down an agent that hasn't confirmed it's finished. An agent may have uncommitted work, pending messages, or in-progress analysis that would be lost.
NEVER shut down ANY agent before the retrospective is complete. The retrospective agent messages other teammates for feedback — if they're shut down, it can't collect their responses. All agents must remain alive until the retrospective agent confirms 'READY TO SHUTDOWN'.
Shut down teammates gracefully: Only AFTER every agent has confirmed 'READY TO SHUTDOWN', send each teammate SendMessage with message: {type: "shutdown_request"}.
Shutdown order:
Wait for all teammates to shut down before cleaning up.
Clean up: Use TeamDelete to remove the team and task directories.
Write pipeline log: Save the pipeline report to .kiln/logs/{feature-branch}-{timestamp}.md for audit trail.
Summarize the pipeline results:
## Pipeline Report: {feature branch name}
| Step | Status | Details |
|------|--------|---------|
| Specify | [Done/Failed] | {FR count, user story count} |
| Plan | [Done/Failed] | {artifact count} |
| Research | [Done/Skipped/Questions] | {deps resolved} |
| Tasks | [Done/Failed] | {phase count, task count} |
| Commit | [Done/Failed] | {commit hash} |
| Implementation | [Done/Failed] | {phases completed, tasks done} |
| Visual QA | [Pass/Fail/Skipped] | {flows tested, checkpoints run, issues found/fixed, video count, GitHub issues filed} |
| Audit | [Pass/Fail] | {compliance %, test quality, smoke result} |
| PR | [Created/Failed] | {PR URL} |
| Retrospective | [Done/Failed] | {issue URL} |
| Continuance | [Done/Skipped] | {report path or "skipped"} |
**Branch**: {branch name}
**PR**: {URL}
**Tests**: {count} passing, {coverage}% coverage
**Compliance**: {percentage}
**Blockers**: {count} — see specs/{feature}/blockers.md
**Smoke Test**: {PASS/FAIL}
**Visual QA**: {PASS/FAIL/SKIPPED} — {video count} recordings, {N} GitHub issues filed, see .kiln/qa/latest/QA-PASS-REPORT.md
**Retrospective**: {issue URL}
**What's Next**: {continuance report path or "skipped"}
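The gating that precedes this report (retrospective completed, then every agent replying 'READY TO SHUTDOWN', then shutdown requests) can be modelled as a pure function. This is a sketch under stated assumptions: the two mappings are hypothetical representations of TaskList results and SendMessage replies, and the function does not model the skill's own shutdown order.

```python
def shutdown_plan(tasks, confirmations, retro_name="retrospective"):
    """Return the agents that may receive a shutdown_request, or raise.

    tasks:         hypothetical mapping of task name -> status
    confirmations: hypothetical mapping of agent name -> last reply text
    """
    # Gate 1: the retrospective task must be completed first.
    if tasks.get(retro_name) != "completed":
        raise RuntimeError("retrospective not completed; do not shut down")

    # Gate 2: every agent must have replied READY TO SHUTDOWN.
    not_ready = sorted(
        agent
        for agent, reply in confirmations.items()
        if "READY TO SHUTDOWN" not in reply
    )
    if not_ready:
        raise RuntimeError(f"agents not confirmed: {', '.join(not_ready)}")

    # Alphabetical order is used only to make the result deterministic.
    return sorted(confirmations)


replies = {
    "retrospective": "READY TO SHUTDOWN",
    "auditor": "READY TO SHUTDOWN",
}
print(shutdown_plan({"retrospective": "completed"}, replies))
```

Raising instead of returning a partial list keeps the failure mode explicit: no agent receives a shutdown request unless both gates pass.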
The debugger agent is NOT part of the standard pipeline. It is spawned on-demand in the background when an issue can't be resolved by the agent that encountered it. The pipeline does not wait for or depend on the debugger — it runs alongside the pipeline as a background helper.
When NOT to spawn a debugger:
When to spawn a debugger:
| Failure Source | Trigger (team lead judgment call) | Debugger Gets |
|---|---|---|
| QA engineer | Implementer can't fix after 2 attempts | QA's failure report + implementer's failed fix attempts |
| Smoke tester | Non-obvious FAIL result | Smoke test output (command, stderr, exit code) |
| Test runner | Failing test with unclear root cause | Failing test name, file, error message |
| Auditor | Implementation gap that no one can fix | Blocker description from auditor |
| Build | Cryptic build failure | Build output |
| Implementer | Implementer explicitly reports being stuck | Implementer's description + what they've tried |
Spawn a debugger agent with:
- team_name: [team name]
- name: "debugger" (or "debugger-2" if one is already running)
- run_in_background: true
- mode: "bypassPermissions"
Prompt must include:
- The failure report (copy the exact message from the reporting agent)
- Which agent reported it
- The working directory and branch
- Any prior fix attempts (from the implementer or previous debugger runs)
- Instructions to use /debug-diagnose first, then /debug-fix
- Instructions to message the original reporter when fixed
- Instructions to message team lead if escalating
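The spawn parameters and prompt checklist above could be assembled along these lines. This is a sketch, not the skill's implementation: the helper names and context keys are hypothetical, and numbering beyond "debugger-2" is an assumption extrapolated from the naming pattern.

```python
# Hypothetical context keys mirroring the prompt checklist above.
REQUIRED_PROMPT_FIELDS = (
    "failure_report",
    "reporting_agent",
    "working_directory",
    "branch",
)


def debugger_spawn_params(team_name, running_debuggers):
    """Build the Agent parameters for a new debugger.

    running_debuggers: names of debugger agents already running.
    """
    count = len(running_debuggers)
    name = "debugger" if count == 0 else f"debugger-{count + 1}"
    return {
        "team_name": team_name,
        "name": name,
        "run_in_background": True,
        "mode": "bypassPermissions",
    }


def missing_prompt_fields(context):
    """List required prompt fields that are absent or empty."""
    return [field for field in REQUIRED_PROMPT_FIELDS if not context.get(field)]


print(debugger_spawn_params("example-team", ["debugger"]))
print(missing_prompt_fields({"failure_report": "build failed", "branch": "build/example"}))
```

Checking for missing fields before spawning avoids starting a debugger that has to ask the team lead for the failure report it should have received up front.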
Agent reports failure
│
├─ Team lead spawns debugger agent
│
├─ Debugger runs /debug-diagnose → classifies issue, selects technique
│
├─ Debugger runs /debug-fix → applies fix, verifies
│ │
│ ├─ PASS → debugger notifies reporter, reporter re-verifies
│ │ │
│ │ ├─ Reporter confirms fix → debugger marks done
│ │ └─ Reporter says still broken → debugger iterates
│ │
│ └─ FAIL → debugger iterates (max 3 attempts per technique, max 3 techniques)
│ │
│ └─ All strategies exhausted → debugger escalates to team lead
│ │
│ └─ Team lead escalates to USER with full debug report
│
└─ All debug artifacts logged in debug-log.md
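The iteration budget in the flow above (at most 3 attempts per technique across at most 3 techniques) can be expressed as a bounded loop. A minimal sketch: `try_fix` is a hypothetical callable that returns True once a fix is applied and the reporter confirms it, and the technique names are placeholders.

```python
MAX_TECHNIQUES = 3
MAX_ATTEMPTS_PER_TECHNIQUE = 3


def debug_loop(techniques, try_fix):
    """Iterate fixes within the budget.

    Returns ("fixed", attempts_used) or ("escalate", attempts_used).
    techniques: ordered technique names (hypothetical labels)
    try_fix:    hypothetical callable (technique, attempt) -> bool
    """
    attempts = 0
    for technique in techniques[:MAX_TECHNIQUES]:
        for attempt in range(1, MAX_ATTEMPTS_PER_TECHNIQUE + 1):
            attempts += 1
            if try_fix(technique, attempt):
                return ("fixed", attempts)
    # Every technique exhausted: hand off to the team lead.
    return ("escalate", attempts)


techniques = ["bisect", "trace-logging", "minimal-repro"]
print(debug_loop(techniques, lambda technique, attempt: False))
print(debug_loop(techniques, lambda t, a: t == "trace-logging" and a == 2))
```

With the full budget spent, the loop escalates after 3 x 3 = 9 attempts, which is the point at which the debugger reports to the team lead.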
The debugger is NOT pre-planned in Step 2. When you spawn one mid-pipeline:
- TaskCreate with description: "Debug: [issue summary]"
- TaskUpdate (so the retro captures its findings)
The debugger's task is completed when either:
If the debugger is still running when the pipeline reaches the retrospective gate, the retrospective waits for it (since it was added as a dependency). If the issue is non-blocking, you can cancel the debugger task instead to avoid holding up the pipeline.
If the debugger exhausts all strategies (9 attempts across 3 techniques), it sends a comprehensive escalation report to the team lead. The team lead then:
Only escalate to the user AFTER the debugger has tried. "I hit an error" → spawn debugger → debugger tries → THEN escalate if needed.
- /implement stops early, spawn a replacement to continue from where it left off
- specs/{feature}/blockers.md; pipeline continues
- debugger, debugger-2, etc.)