From kampus-pipeline
Transforms a triaged epic into a PRD-grade plan with product layer (problem, user stories, testing strategy) and engineering layer, creating GitHub sub-issues with user-story traces and dependency topology. Operates autonomously via gh api REST.
Slash command: /kampus-pipeline:plan-epic
You take a triaged epic (type:epic + status:triaged) and turn it into something
a fleet of write-code agents can execute without you in the loop: a PRD-grade plan
written into the epic body, a set of native GitHub sub-issues each carrying its own user-story
trace and acceptance criteria, and a pinned ## Dependencies section that says what gates what.
"PRD-grade" is the bar this skill exists to hold. A plan that lists only architecture and a task split is half a plan — it says how without ever saying who needs this and what changes for them. Your plan leads with the product layer (the problem, the solution from the user's view, the user stories, the testing strategy) and then lays down the engineering layer (approach, split rationale). The user stories are the spine: they are what you slice the children from, and every child traces back to one. See ADR 0046 for why.
You operate autonomously. The plan you write is read by write-code agents, not presented
to a human for sign-off — there is no interview, no propose-first, no approval gate. You
author the product layer from the brief + the existing product + codebase exploration + your own
product judgment; you do not ask the user questions (the human already approved the epic at
triage). When a user story hinges on a genuine product decision you can't ground from the
codebase, carve it as a type:decision child — never handwave it. Plan, split, link, done.
The epic body is append-down: the triaged original brief stays untouched at the top, and you write below it. You never rewrite-on-top an epic — its original content is the brief that grounds your plan, not noise to bury. (This is exactly the exception triage carves out for epics; the formats doc spells out why.)
gh api REST, never GraphQL
The kamp-us org runs a legacy Projects-classic integration that breaks GraphQL issue
queries. Every read and write goes through gh api. Native sub-issues have a REST
surface (below); use it. This is not a style preference — GraphQL calls error out on
this org.
Resolve the target repo once, up front. This skill is repo-agnostic — every gh api
call targets $REPO, not a hardcoded repo. Resolve it at the top of your run per the shared
contract's Target repo resolution
(../gh-issue-intake-formats.md): $CLAUDE_PIPELINE_REPO
if set, else the current repository. In phoenix this defaults to kamp-us/phoenix, so the
behavior is unchanged with no config (ADR 0062 §1).
REPO="${CLAUDE_PIPELINE_REPO:-$(gh repo view --json nameWithOwner -q .nameWithOwner)}"
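The resolution line above relies on the shell's `${VAR:-default}` expansion, which also treats an empty `$CLAUDE_PIPELINE_REPO` as unset. A minimal sketch of that behavior follows; `current_repo` is a hypothetical stand-in for the `gh repo view` call so the example runs offline, and the repo names are illustrative only.

```shell
# Hypothetical stand-in for `gh repo view --json nameWithOwner -q .nameWithOwner`.
current_repo() { echo "kamp-us/phoenix"; }

# Same expansion as the REPO= line: the override wins only when it is set and non-empty.
resolve_repo() { echo "${CLAUDE_PIPELINE_REPO:-$(current_repo)}"; }
```

With the variable unset or empty, `resolve_repo` falls back to the current repository; with `CLAUDE_PIPELINE_REPO=acme/widgets` it prints that value instead.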
You write three of the five shared formats; read them before you start:
../gh-issue-intake-formats.md.
- ## Dependencies grammar (format 1): the topology you pin at the bottom of
  the epic body: ### Phase N headings as the sequential spine, the list within a
  phase as a parallel group, requires: #N as a cross-boundary gating edge.
  Topology only: no retry budgets, no concurrency caps, no code flags; those are
  orchestrator concerns, not shared issue state. (The orchestrator itself is not this repo's
  job, per ADR 0046; the topology is the only dependency artifact phoenix keeps.)
- Sub-issue body (format 2): a **Stories:** line (the story numbers from your plan this child
  implements or unblocks), a **TDD:** flag, an optional **Containment:** marker, a
  ### What to build prose spec, and a ### Acceptance criteria checklist. Two hard invariants:
  every child carries ≥ 1 acceptance criterion, and every child traces to ≥ 1 user story
  (see the coverage invariant in Step 3).
- The **Containment:** marker, defined once in the formats contract's
  §The product-development cycle hook. plan-epic is the only writer of the marker: when the
  repo carries a product-development-cycle.md you stamp each child's containment from the
  cycle's policy; when it's absent the step no-ops (graceful absence, ADR 0062). See Step 3's
  Stamp the containment marker.
- Handoffs are written by write-code as children complete, but your plan should make the
  cross-task signal they'll carry predictable. The ## Dependencies graph is the spine those
  handoffs route along.
Read the formats doc tolerantly when reconciling an existing plan (re-plan, below) and write it canonically. Tolerant reading is the safety margin, not the target.
plan-epic and review-plan both mutate one epic's children (you supersede/unlink/close on
re-plan; the gate flips planned → triaged). Run concurrently they interleave and corrupt
the ledger (#264). Before you create, amend, supersede, unlink, or close any child — and
before the body PATCH in Step 5 — acquire the status:planning epic-lock; release it when
you finish (PASS-or-park), on every exit path including failure. This is the primary
serialization (ADR 0059); the Step 5
splice+recheck (#261) is the complementary backstop for its residual, not a replacement.
Acquire (one bash step, fails closed). Re-read the lock label; if it's held, back off and
stop. Otherwise POST it — and only treat the lock as acquired if that POST actually
succeeds. A failed acquire (the 422 returned when status:planning hasn't been created in the
repo — it's a canonical lock label, see ADR 0059
§Setup and the formats doc's status-label table — or any transient gh IO fault) must not
fall through to mutate: it backs off and exits 0, so a missing label or a flaky write never lets
you mutate unlocked. The back-off exit 0 is deliberate (a held lock or a setup gap is not a
plan-epic failure) — but it means a caller keying on exit status alone cannot tell "planned" from
"backed off, did nothing"; the echo is the signal, so a wrapper must read it (or re-run) rather
than treat exit 0 as "the epic was planned".
# acquire: defer to a lock already held; otherwise POST it — and proceed ONLY if the POST succeeds
HELD=$(gh api repos/$REPO/issues/<EPIC> --jq '[.labels[].name] | index("status:planning")')
# gh --jq prints "" (not "null") for a jq null, so test non-empty: index() is a numeric position when held, empty when absent.
if [ -n "$HELD" ]; then
echo "epic #<EPIC> is being planned by another run (status:planning held) — BACK OFF, do not mutate."
exit 0 # the held lock is the holder's, not ours — do NOT release it.
fi
if ! gh api repos/$REPO/issues/<EPIC>/labels -f "labels[]=status:planning" >/dev/null; then
echo "could not acquire status:planning on epic #<EPIC> (422 missing label? transient gh fault?) — BACK OFF, do not mutate."
exit 0 # FAILS CLOSED: the POST didn't land, so we DON'T hold the lock — never mutate unlocked.
fi
# Lock held. WE acquired it, so WE must release it — on EVERY terminal path (below).
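Because both back-off branches exit 0, a caller has to read the echoed text to know what happened. The sketch below shows one way a wrapper might do that; `classify_run` is a hypothetical helper, and it assumes the wrapper has captured the acquire step's exit status and stdout. It keys on the literal `BACK OFF` that both back-off messages above contain.

```shell
# classify_run STATUS OUTPUT: interpret one acquire attempt for a calling wrapper.
classify_run() {
  if [ "$1" -ne 0 ]; then
    echo "failed"
  elif printf '%s\n' "$2" | grep -q 'BACK OFF'; then
    echo "backed-off"      # exit 0, but nothing was mutated
  else
    echo "lock-acquired"   # the POST landed; mutations may proceed
  fi
}
```

A wrapper would treat only `lock-acquired` as permission to continue, and would re-run later on `backed-off` rather than report the epic as planned.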
Release is an explicit agent step, not a shell trap … EXIT. The acquire above runs in
one bash invocation; your subsequent mutations (re-plan supersede/unlink/close, the Step 3
child creates, the Step 5 body PATCH) run in separate later bash invocations — each its own
process. A trap … EXIT armed in the acquire shell fires the instant that shell exits, i.e.
before any mutation runs, releasing the lock immediately and giving you zero serialization.
So the release can't live in the acquire snippet; it is an action you take, deliberately, on
the way out — run this exact DELETE once you reach any terminal state (PASS/done, parked,
or a failure/abort mid-mutation):
# release: run on EVERY exit path AFTER a successful acquire (done, park, or fault mid-mutation).
# Do NOT fire-and-forget — a silently-failed DELETE LEAKS the lock and wedges the epic, the exact
# catastrophe this design prevents. A 404 is benign (label already gone — released, or never
# landed); ANY other failure means the lock may still be held, so surface it LOUDLY.
if ! relerr=$(gh api -X DELETE repos/$REPO/issues/<EPIC>/labels/status:planning 2>&1); then
case "$relerr" in
*"HTTP 404"*|*"Label does not exist"*) : ;; # already released / never acquired — nothing to free
*) echo "WARNING: failed to release status:planning on epic #<EPIC> — the epic-lock may be LEAKED (still held). Re-run this DELETE or clear the label by hand; until cleared, plan-epic/review-plan back off on this epic. ($relerr)" ;;
esac
fi
The release fires on every terminal path on purpose: you drive this as an LLM agent across
many bash calls, and an agent that aborts (or whose gh call throws) part-way through the
mutation must still issue the DELETE before it stops — a release that fires only on the clean
fall-through LEAKS the lock on the error/abort path (wedging the epic against every later
plan-epic/review-plan run until a human clears it — the exact catastrophe #264 warns about).
Only release a lock YOU acquired (the success branch above), never the held lock you backed
off from. A leaked lock is silent and only a human clears it.
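Since the acquire and the release run in separate bash invocations, "only release a lock you acquired" needs some state that outlives a single shell. One possible way to carry it, offered as an assumption rather than part of the skill's contract, is a per-epic marker file. The helper names and the marker path below are hypothetical.

```shell
# Hypothetical marker path; any per-epic location that survives between bash calls would do.
lock_marker() { echo "${TMPDIR:-/tmp}/plan-epic-$1-lock-owned"; }

# Call right after the acquire POST succeeds.
mark_lock_owned() { : > "$(lock_marker "$1")"; }

# Succeeds only when this run acquired the lock, so a backed-off run never releases.
owns_lock() { [ -f "$(lock_marker "$1")" ]; }

# Call after the release DELETE succeeds (or returns the benign 404).
clear_lock_owned() { rm -f "$(lock_marker "$1")"; }
```

On any terminal path the agent would run the release DELETE only when `owns_lock <EPIC>` succeeds, then call `clear_lock_owned <EPIC>`.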
POST .../labels is not compare-and-swap (no If-Match) — two runs that both read the
lock absent in the same window both acquire (the §7/#260 TOCTOU, over the whole child set).
So this is detect-and-serialize, not a mutex: it serializes the common concurrent
re-plan, and the residual co-acquire window is caught by Step 5's splice+recheck. Don't claim
a guarantee the label API can't give.
Read the epic body — the brief is the top section, the part above any plan you may
have written on a prior run. Then read enough of the codebase to plan against what's
actually there: the files and modules the brief names, the ADRs in .decisions/, the
patterns in .patterns/, related issues and PRs. A plan written without the codebase
is a wish list; a plan written from it is executable.
Resolve the brief's open questions yourself from the codebase and project
conventions — that is the planning work. The epic brief often ends with open
questions (triage leaves them for you). Answer them in the plan with a stated
rationale; don't punt them downstream as unscoped ambiguity. If a question is
genuinely a product or architecture fork that needs human-judgment — not something the
codebase settles — carve it as its own type:decision child rather than blocking the
whole plan on it. (This is the autonomous substitute for the interview a human-facing PRD
tool would run: you resolve what you can and turn the rest into decision work, you don't
stop to ask.)
In the bash below, <EPIC>/<CHILD> angle-bracket tokens are placeholders you hand-substitute with concrete issue numbers; $VARS (e.g. $CHILD_ID, $BODY) are live shell variables the commands set and read.
# the epic, its current body, its labels, and any children it already has
gh api repos/$REPO/issues/<EPIC> --jq '{number,title,labels:[.labels[].name],sub_issues_summary}'
# capture the body AND its revision marker from ONE GET — reading them in two calls lets a writer
# land between them, yielding an updated_at newer than the captured body (TOCTOU skew); the Step 5
# recheck would then either spuriously retry or trust a marker that doesn't match the captured body.
gh api repos/$REPO/issues/<EPIC> --jq '{body,updated_at}' > /tmp/plan-epic-<EPIC>-snap.json
jq -r '.body' /tmp/plan-epic-<EPIC>-snap.json > /tmp/plan-epic-<EPIC>-current.md
jq -r '.updated_at' /tmp/plan-epic-<EPIC>-snap.json > /tmp/plan-epic-<EPIC>-updated-at.txt
gh api "repos/$REPO/issues/<EPIC>/sub_issues?per_page=100" \
--jq '.[] | "#\(.number) [\(.state)] \(.title)"'
If the epic already has sub-issues or a plan section, you're re-planning — jump to Re-plan after you've drafted the new plan, and reconcile rather than blindly recreate.
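The re-plan test described above (existing sub-issues, or an existing plan section) can be expressed as a small check. This is a sketch under assumptions: `detect_replan` is a hypothetical helper, it expects the captured body file and a sub-issue count you have already obtained, and it recognizes a plan section only by the exact `## Plan (plan-epic)` heading.

```shell
# detect_replan BODY_FILE SUB_ISSUE_COUNT: print 1 for a re-plan, 0 for a first-time plan.
detect_replan() {
  if [ "$2" -gt 0 ] || grep -q '^## Plan (plan-epic)[[:space:]]*$' "$1"; then
    echo 1
  else
    echo 0
  fi
}
```

The result could seed the `REPLAN` variable that the Step 5 script reads, for example `REPLAN=$(detect_replan /tmp/plan-epic-<EPIC>-current.md "$COUNT")`, where `$COUNT` is an assumed variable holding the number of linked children.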
Below the untouched brief, write the plan a write-code fleet needs, as a
## Plan (plan-epic) section whose subsections lead with the product layer and then the
engineering layer. Write these exact ### headings, in this order, under
## Plan (plan-epic):
## Plan (plan-epic)
### Problem & who has it
### What changes
### User stories
### Goal / non-goals
### Resolved questions
### Approach
### Testing strategy
### Task-split rationale
(The ## Dependencies topology is a separate top-level section you pin in Step 5 — not part
of this plan block.) What each section holds:
Product layer: lead with this.
- User stories: number them under the ### User stories
  heading so every child's **Stories:** line can reference them by number. A numbered list,
  each: As a <actor>, I want <capability>, so that <benefit>. Be extensive: cover
  the happy path, edge cases, error states, and admin/moderation flows; thin generic stories
  produce thin tasks. Each story should be specific enough to demo. Actors include agents
  where the surface is agent-facing. These stories are what you slice the children from in
  Step 3, and every child traces back to one, so write them first and write them well.
  A story that depends on an unresolved product decision becomes a type:decision child (Step
  1); don't bury the fork inside a vague story.
Engineering layer: then this.
- Resolved questions: genuine product or architecture forks become type:decision children
  instead of answered here.
- Approach: no path/to/file.ts:42; paths and snippets go stale fast and the children read
  the live code anyway.
- Testing strategy: which patterns apply (e.g. .patterns/effect-testing.md); what makes a good
  test here (behavior, not implementation); prior art in the repo to follow. This is what sets
  each child's **TDD:** flag honestly, instead of guessing per child.
- Task-split rationale: makes the ## Dependencies topology legible, and it names which stories
  each slice carries.
Keep it grounded. No invented requirements, no aspirational scope the brief didn't ask for. The plan serves the children; if a paragraph doesn't change how a child gets built or what it delivers to a user, cut it.
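The exact headings and their order under `## Plan (plan-epic)` can be checked mechanically before the plan is written into the epic. A minimal sketch, assuming the drafted plan block sits alone in a file (so no other `###` headings, such as `### Phase N`, are present); `check_plan_headings` is a hypothetical helper.

```shell
# check_plan_headings PLAN_FILE: succeed only if the ### headings match the required list, in order.
check_plan_headings() {
  expected='### Problem & who has it
### What changes
### User stories
### Goal / non-goals
### Resolved questions
### Approach
### Testing strategy
### Task-split rationale'
  actual="$(grep '^### ' "$1")"
  [ "$actual" = "$expected" ]
}
```

A missing, extra, renamed, or reordered heading makes the function return non-zero, which is a cue to fix the draft before it reaches the epic body.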
Slice the plan into executable children. Each implementation child is a tracer-bullet
vertical slice: a thin path through every layer it touches (storage → service → fate →
UI → tests) that delivers one narrow-but-complete piece of user-visible value, demoable on
its own. Prefer many thin slices over few thick ones. A child a write-code agent can
pick up cold and finish in a PR or two, with an unambiguous "done".
type:decision and type:investigation children are the exception to "vertical slice" —
they produce a record (an ADR via /adr, or a diagnosis), not a layered code change. They
still trace to the stories or forks they unblock.
This is the discipline that makes the split PRD-derived rather than invented:
- Every child carries a **Stories:** line: the stories it implements,
  or (for decision/investigation/infra children) the stories it unblocks. A child that traces
  to no story is scope creep; cut it. The rare genuinely-pure-infra child that unblocks no
  single story carries the explicit marker **Stories:** none (pure infra — see What to build)
  and justifies itself in ### What to build; the line is never silently left blank.
Before you finish, run the coverage check both directions: list each story → the child(ren) covering it, and each child → its story(ies). An uncovered story or an untraceable child means the split isn't done.
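The two-direction coverage check can be sketched as a small filter. The input format here is an assumption made for illustration: one line per child, `<child-number> <comma-separated story numbers>`, with the literal `none` standing for the explicit pure-infra marker. `coverage_check` is a hypothetical helper.

```shell
# coverage_check STORY_COUNT: read "<child> <stories>" lines on stdin and report
# children with no trace and stories with no child. Exits non-zero on any gap.
coverage_check() {
  awk -v n="$1" '
    BEGIN { bad = 0 }
    NF == 0 { next }
    NF == 1 { print "untraced child #" $1; bad = 1; next }
    $2 == "none" { next }                       # explicit pure-infra marker
    {
      k = split($2, s, ",")
      for (i = 1; i <= k; i++) covered[s[i]] = 1
    }
    END {
      for (i = 1; i <= n; i++)
        if (!(i in covered)) { print "uncovered story " i; bad = 1 }
      exit bad
    }
  '
}
```

Silence and a zero exit status mean every story has at least one child and every child names its stories (or the explicit marker).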
Each child's body follows the sub-issue body format (format 2) exactly:
**Stories:** <REQUIRED — story numbers this child implements or unblocks; or `none (pure infra — see What to build)`>
**TDD:** yes | no
**Containment:** flag (default-off) | exempt (<reason>) | none (no cycle doc) ← stamped per the cycle-doc hook below; omit (or `none`) when there's no cycle doc
### What to build
<One or two paragraphs of concrete scope: what changes, where, why. Name the
modules/files. State what's out of scope if there's a tempting adjacent thing.>
### Acceptance criteria
- [ ] <observable, externally checkable criterion>
- [ ] <…>
(A child's ### What to build may name concrete files — it sits close to the code. Only the
epic-level Approach (Step 2) stays path-free, because it ages faster and the children read
the live code anyway.)
The invariants you must hold:
- Every child traces to a story. The **Stories:** line is required and never blank: it names
  the stories the child implements or unblocks (or, for the rare pure-infra child, the explicit
  none (pure infra — see What to build) marker). See the coverage invariant above.
- Every child carries ≥ 1 acceptance criterion. write-code can't know
  when to stop and review-code can't verify without it. If you can't write one,
  the child isn't specified yet.
- The **TDD:** flag is honest: yes for a behavior with a
  verifiable contract; no for config, docs, scaffolding, or an operational step. It's
  advice to write-code, not a gate.
- Containment follows the cycle doc. When the repo carries a product-development-cycle.md,
  every child carries a **Containment:** line; when it's absent the step no-ops and children
  carry none (or no line). The marker's grammar is defined once in the formats contract: you
  stamp it, you don't re-derive it.
- Children are ordered through the epic's ## Dependencies graph,
  not by reference between child bodies. (The story numbers point into the epic plan's
  ### User stories, which is shared context, not a sibling body.)
Create each child via REST, assembling its body from a temp file so multi-line markdown and backticks survive the shell:
BODY="$(cat /tmp/plan-epic-child.md)"
gh api repos/$REPO/issues \
-f title="<sharp single-unit title>" \
-f body="$BODY" \
--jq '{number,id}'
Capture both number and id from the create — Step 4 links by the id, so you won't need
to re-fetch it.
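Before issuing the create call, the format-2 invariants can be checked against the temp file. This is a sketch, not the formats contract's own validator: `validate_child_body` is a hypothetical helper, and it checks only a non-blank `**Stories:**` line, a `yes`/`no` `**TDD:**` flag, and at least one checklist item under `### Acceptance criteria`.

```shell
# validate_child_body BODY_FILE: print the first violated invariant and return 1, else return 0.
validate_child_body() {
  grep -Eq '^\*\*Stories:\*\* *[^[:space:]]' "$1" \
    || { echo "missing or blank Stories line"; return 1; }
  grep -Eq '^\*\*TDD:\*\* *(yes|no)[[:space:]]*$' "$1" \
    || { echo "TDD must be yes or no"; return 1; }
  awk '
    /^### Acceptance criteria/ { in_ac = 1; next }
    /^### /                    { in_ac = 0 }
    in_ac && /^- \[[ xX]\] /   { n++ }
    END { if (n > 0) exit 0; exit 1 }
  ' "$1" || { echo "no acceptance criterion"; return 1; }
}
```

For example, `validate_child_body /tmp/plan-epic-child.md` run before `BODY="$(cat ...)"` would stop an under-specified child from being created.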
Children get their own type from the work they are (type:feature, type:chore,
type:bug, type:decision, type:investigation) — not inherited from the epic — plus a
priority. Do not label children status:needs-triage: they were born from a triaged plan,
they don't re-enter triage. But they are not yet pickable either — they're born
status:planned, the pre-gate state. write-code keys on status:triaged, so a
status:planned child stays unpickable until the review-plan gate validates the ledger and
flips planned → status:triaged (per ADR
0047 — that flip is the whole enforcement
mechanism: an unverified-but-pickable child is unrepresentable). Apply status:planned + a
type:* + a p*:
gh api repos/$REPO/issues/<CHILD>/labels \
-f "labels[]=type:feature" -f "labels[]=p2" -f "labels[]=status:planned"
POST .../labels is additive — it appends to whatever the child already carries,
it doesn't replace the set (relevant if you re-apply labels to an existing child during
an amend).
(Type and priority are your call as planner, the same authority triage has — you're the one who understands the slice.)
Milestone is one more attribute applied at child creation, alongside the labels above —
but unlike type:*/p*/status:planned it is conditional and inherited, not your call
as planner. A child inherits the parent epic's milestone when the epic has one, so a
campaign milestone's burndown is complete by construction: if a "Search" epic is in the
"Search" milestone, every child it spawns belongs to "Search" too, and the milestone can
actually reach 100%. The milestone is the one optional intake dimension — read its
definition and the REST surface from the formats contract's milestone section
(../gh-issue-intake-formats.md, Milestone — the one
optional intake dimension); this is the inherit-logic that section says lives here and cites it.
Read the epic's milestone once, and only if it has one PATCH each created child onto it:
EPIC_MILESTONE=$(gh api repos/$REPO/issues/<EPIC> --jq '.milestone.number // empty')
if [ -n "$EPIC_MILESTONE" ]; then
gh api -X PATCH repos/$REPO/issues/<CHILD> -f milestone="$EPIC_MILESTONE"
fi
If the epic has no milestone, children stay unmilestoned — inheritance copies the epic's state, it never invents one. This skill never creates a milestone (creating/curating the set is a human roadmap act, ADR 0072 §3) and assigns a child only to the epic's existing milestone — never a guessed or fresh one. An unmilestoned epic yielding unmilestoned children is correct, not a gap to backfill (freeze-by-absence: deliberate absence is a signal, per the contract section).
The containment marker is the per-child cycle decision — the same kind of attribute you apply at child creation alongside the labels and milestone above, but one you derive from the repo's cycle policy rather than from the slice itself. Its grammar (the canonical values, the tolerant-read rule, who writes vs reads it) is defined once in the formats contract's §The product-development cycle hook — plan-epic is the only writer named there; read that section for the contract, don't re-derive it here. The why is ADR 0083 (agents own deployment / humans own release).
Consult the cycle-doc hook using the contract's one canonical probe — a content read of
the well-known repo-root product-development-cycle.md. Run it once per plan; absent ⇒ this
whole step no-ops (graceful absence, ADR
0062):
# the formats-contract canonical probe — absent ⇒ no marker stamped (children carry `none`)
if gh api "repos/$REPO/contents/product-development-cycle.md" --jq '.path' >/dev/null 2>&1; then
CYCLE_DOC=present
else
CYCLE_DOC=absent
fi
- Cycle doc present: classify each child and stamp its marker, the same slice-level call as its
  type:* and p*. Phoenix's cycle (per its product-development-cycle.md): a user-facing
  child ships dark, so it carries **Containment:** flag (default-off); an
  internal / refactor / infra / docs child has no user-facing surface to contain, so it
  carries **Containment:** exempt (<reason>) with the reason naming which (e.g. exempt (docs),
  exempt (internal refactor)). Stamp the line into the child body (the **Containment:** field
  in the format-2 template above), alongside **Stories:** / **TDD:**.
- Cycle doc absent: children carry none (no cycle doc) (or, equivalently, omit
  the line; a missing line reads as none per the contract's tolerant-read rule). No other
  behavior changes; the plan is well-formed exactly as it was before this dimension existed.
The judgment of user-facing-vs-exempt is yours; it's the same slice-level understanding that set the child's type and priority. write-code (ship dark) and review-code (verify the gating) read this marker downstream; they never write it.
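The stamping rule above maps the probe result plus your per-child classification onto one marker line. A minimal sketch, assuming Phoenix's policy as described here; `containment_line` and its argument vocabulary (`present`/`absent`, `user-facing`/`exempt`) are hypothetical names chosen for illustration.

```shell
# containment_line CYCLE_DOC CLASS [REASON]: print the **Containment:** line for one child.
containment_line() {
  if [ "$1" != present ]; then
    echo '**Containment:** none (no cycle doc)'   # graceful absence: nothing to stamp
    return 0
  fi
  case "$2" in
    user-facing) echo '**Containment:** flag (default-off)' ;;
    exempt)      echo "**Containment:** exempt ($3)" ;;
    *)           echo "unknown classification: $2" >&2; return 1 ;;
  esac
}
```

`containment_line "$CYCLE_DOC" exempt docs` would yield the line to place alongside `**Stories:**` and `**TDD:**` in the child body file.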
GitHub has a native sub-issues relationship — link each child to the epic so it
shows up in the epic's sub_issues_summary and the GitHub UI's sub-issue list. This
is the real parent/child edge, not just a ## Dependencies mention.
The endpoint takes the child's database id (.id), not its issue number:
# the child's database id (reuse the .id from the Step 3 create if you captured it)
CHILD_ID=$(gh api repos/$REPO/issues/<CHILD> --jq '.id')
gh api -X POST repos/$REPO/issues/<EPIC>/sub_issues \
-F sub_issue_id=$CHILD_ID \
--jq '.sub_issues_summary'
-F (not -f) so the id is sent as a number. Confirm the link landed:
gh api repos/$REPO/issues/<EPIC> --jq '.sub_issues_summary'
# total should equal the number of children you linked
gh api "repos/$REPO/issues/<EPIC>/sub_issues?per_page=100" \
--jq '.[] | "#\(.number) [\(.state)] \(.title)"'
The exact-equality check holds on the fresh-plan path, where every linked child is
still open. On the re-plan path — once you've closed superseded children — don't
rely on it: sub_issues_summary.total is known to undercount when children are a
mix of open and closed (a GitHub sub-issues caveat). There, the
GET .../sub_issues list above is the source of truth for what's actually linked.
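When the summary total can't be trusted, the listing itself can be compared against the children you expected to link. The sketch below assumes two local files: one child number per line for the expected set, and the `#N [state] title` lines printed by the list call above saved to a file. `unlinked_children` is a hypothetical helper.

```shell
# unlinked_children EXPECTED_FILE LISTING_FILE: print expected child numbers missing from the listing.
unlinked_children() {
  linked="$(mktemp)"
  # Pull the issue number out of each "#N [state] title" line.
  sed -n 's/^#\([0-9][0-9]*\) .*/\1/p' "$2" | sort -u > "$linked"
  # comm -23 keeps lines present only in the first (expected) input.
  sort -u "$1" | comm -23 - "$linked"
  rm -f "$linked"
}
```

Empty output means every expected child appears in the listing, whether it is open or closed.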
To unlink a child (you'll need this in re-plan when a child is superseded), the
endpoint is singular sub_issue (not sub_issues), and the id goes in the JSON
body via --input — -X DELETE … -F does not work here:
echo "{\"sub_issue_id\": $CHILD_ID}" \
| gh api -X DELETE repos/$REPO/issues/<EPIC>/sub_issue --input -
Unlinking does not close the child; it just removes the parent/child edge. Closing is a separate state change (the journal-note path in re-plan).
Now assemble and pin the full body: untouched brief + the PRD-grade plan from Step 2
+ the ## Dependencies section referencing the child numbers you just created.
The dependency grammar (format 1): ### Phase N headings are the sequential spine (every
issue in a phase closes before the next phase starts); the list within a phase is a parallel
group (no ordering between them); requires: #N on a child is a cross-boundary gating edge
for a dependency that doesn't fall on a phase boundary. Topology only — no retry budgets,
concurrency caps, or code flags (those are the out-of-repo orchestrator's, per ADR 0046).
Derive the topology from the task-split rationale: independent slices share a phase (parallel);
a slice that needs another's output sits in a later phase, or carries a requires: for a
single specific predecessor.
## Dependencies
### Phase 1
- #<a> — <label>
- #<b> — <label>
### Phase 2
- #<c> — <label> (requires: #<a>)
- #<d> — <label>
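To make the grammar concrete, here is a sketch of how a reader might flatten such a block into one record per child. `parse_deps` is a hypothetical helper, not part of the formats contract; it assumes the canonical shape shown above (a `### Phase N` heading, then `- #<number>` list items with an optional `requires: #<number>`).

```shell
# parse_deps: read a `## Dependencies` block on stdin and print "<phase> <issue> <requires>"
# per child, with "-" when the child carries no requires: edge.
parse_deps() {
  awk '
    /^### Phase [0-9]+/ { phase = $3; next }
    /^- #[0-9]+/ {
      issue = $2
      sub(/^#/, "", issue)
      req = "-"
      if (match($0, /requires: #[0-9]+/)) {
        # skip the 11 characters of "requires: #" to keep only the number
        req = substr($0, RSTART + 11, RLENGTH - 11)
      }
      print phase, issue, req
    }
  '
}
```

Each output row names the phase a child sits in and the single predecessor it waits on, which is all the topology the grammar carries.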
The epic body is load-bearing shared state — its ## Dependencies topology is what
write-code reads to decide what's pickable. A second plan-epic run, or a review-plan
child-flip, or a re-plan loop, can edit the same body concurrently; a blind whole-body
PATCH would silently clobber that edit (the lost-update this step exists to prevent — issue
#261, same last-write-wins family as the issue-claim race
../gh-issue-intake-formats.md §7 (issue #260) and the
SHA-bound verdict contract, ADR 0058
(issue #258)). GitHub's issue PATCH honors no If-Match/If-Unmodified-Since — there is no
native compare-and-swap — so the write is made safe by two layers, in order:
Layer 1 — surgical section replacement (collision avoidance). Don't reassemble the body
from your in-memory plan and overwrite the whole thing. Re-read the epic's current body
immediately before the write, replace only the section you changed (the ## Dependencies
block, and — when re-planning — the ## Plan (plan-epic) block), and leave every other byte of
the live body exactly as you just read it. A concurrent edit to a different part of the body
(the brief, a sibling's handoff note, a label-driven addition) then cannot collide with your
write at all — you preserved it verbatim because you never reconstructed it.
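The surgical replacement in Layer 1 can be illustrated offline on local files. This is a simplified sketch: `splice_deps` is a hypothetical helper, it assumes `## Dependencies` is the last section of the body, and it omits the heading-count guards and the re-plan handling of `## Plan (plan-epic)` that the full script performs.

```shell
# splice_deps LIVE_BODY NEW_DEPS: print the live body with only its ## Dependencies
# section replaced (or appended when absent); every earlier line is copied verbatim.
splice_deps() {
  if grep -q '^## Dependencies[[:space:]]*$' "$1"; then
    # keep everything above the heading, drop the old section
    awk '/^## Dependencies[[:space:]]*$/{exit} {print}' "$1"
  else
    cat "$1"
  fi
  cat "$2"
}
```

Because the output is built from the freshly read body rather than from an in-memory plan, an edit another writer made to the brief or to a handoff note is carried through untouched.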
Layer 2 — optimistic recheck (abort+retry on a same-section race). Two writers editing the
same section still race. So immediately before the PATCH, re-GET the epic's updated_at and
compare it to the marker captured in Step 1. If it moved, another writer touched the body
since you read it: abort, re-read the body from scratch, re-derive your section against the
fresh revision, and retry — never PATCH over a body you didn't just read.
The re-derive is the part that makes this honest, and it is your action, not the script's.
The block below is a skeleton you re-run per attempt, not a one-shot you launch once: a
## Dependencies block names concrete child numbers and phase topology, so when the recheck
fires (a racer added/closed a child between your reads) you must regenerate
/tmp/plan-epic-<EPIC>-deps.md — and on a re-plan /tmp/plan-epic-<EPIC>-plan.md — against the
freshly-read body (re-run Step 2's split + Step 5's section derivation) before you re-enter
the loop. The script cannot do this for you inside one bash invocation; it can only refuse to
proceed until you have. So the recheck branch stamps the fresh base and breaks out (it does
not silently continue onto a stale deps.md), and a guard at the top of each attempt aborts
loudly if deps.md was not regenerated since the base it splices onto was read — turning the
"re-derive" from a comment you might skip into a precondition the script enforces.
# /tmp/plan-epic-<EPIC>-deps.md = the new `## Dependencies` block. On a RE-PLAN, also
# /tmp/plan-epic-<EPIC>-plan.md = the new `## Plan (plan-epic)` block (set REPLAN=1).
# Give each block a trailing blank line so the next spliced heading stays separated.
# Landing is confirmed against the WHOLE `## Dependencies` block round-tripping byte-for-byte,
# NOT a single line: two concurrent runs on the SAME epic likely both emit a given `- #<child>`
# line (they share children), so a lone matching line can't tell our section from a racer's
# clobber (see step 6). deps.md is re-derived by YOU between attempts (the recheck breaks out and
# hands back; step 2) — the freshness guard (step 2.5) enforces it was, so each pass splices a block
# derived against the body it's splicing onto, never a stale one.
# A first-time plan has NO `## Dependencies` heading yet (Step 2 doesn't write one) — that case
# APPENDS the block to EOF; a re-plan has exactly one and SPLICES it in place. Zero headings on a
# re-plan, or more than one ever, is corruption: abort loudly (step 4).
#
# This block is a SKELETON you re-run per attempt, not a one-shot. When the recheck (step 2) fires
# it stamps the fresh base, BREAKS, and hands back to you to re-derive `deps.md` (+ `plan.md` on a
# re-plan) against `/tmp/plan-epic-<EPIC>-current.md` — then you re-invoke the block. The freshness
# guard (step 2.5) refuses to splice a `deps.md` older than the base it would splice onto, so a
# stale block can never re-clobber a racer's legitimate topology.
# Per attempt: re-read → recheck (verify unchanged) → freshness guard → anchor guard → splice/append → PATCH
# → re-verify our block landed. `landed=1` only after a pass confirms the round-trip; `patched=1`
# records that a PATCH was actually issued (so the terminal verdict can tell "raced every time,
# never wrote" from "wrote and lost"). The terminal check after the loop turns an
# exhausted-or-aborted run into a hard STOP rather than ambiguous output.
landed=0; patched=0
for attempt in 1 2 3; do
# 1. re-read the LIVE body + its revision marker from ONE GET (coherent — no TOCTOU skew)
gh api repos/$REPO/issues/<EPIC> --jq '{body,updated_at}' > /tmp/plan-epic-<EPIC>-live.json
jq -r '.body' /tmp/plan-epic-<EPIC>-live.json > /tmp/plan-epic-<EPIC>-live.md
NOW=$(jq -r '.updated_at' /tmp/plan-epic-<EPIC>-live.json)
WAS=$(cat /tmp/plan-epic-<EPIC>-updated-at.txt)
# 2. optimistic recheck — if the body moved since we last read it, stamp the fresh base and BREAK.
# Re-deriving the section is YOUR action (the script can't regenerate deps.md/plan.md inside one
# bash invocation): re-run Step 2's split + Step 5's derivation against the now-fresh
# `-current.md`, then re-invoke this block. The freshness guard (2.5) enforces that you did.
if [ "$NOW" != "$WAS" ]; then
echo "epic body changed since read ($WAS -> $NOW) — RE-DERIVE deps.md (+ plan.md on a re-plan)"
echo " against the fresh base, then re-invoke this block. (Not auto-retried: the re-derive is an agent step.)"
cp /tmp/plan-epic-<EPIC>-live.md /tmp/plan-epic-<EPIC>-current.md # fresh base to re-derive against
echo "$NOW" > /tmp/plan-epic-<EPIC>-updated-at.txt
break
fi
# 2.5. freshness guard — `deps.md` (and, on a re-plan, `plan.md`) MUST have been (re-)derived
# against the base this attempt is splicing onto. That base is `-current.md` (stamped from the
# live body the recheck above just confirmed unchanged), and your re-derive writes deps.md
# AFTER it — so deps.md must be newer than current.md (`-nt` = "newer than"). If it isn't, the
# re-derive precondition is unmet (you re-invoked without regenerating the block off the fresh
# base): a stale block that references the wrong child set. Abort loudly, don't write — this is
# what stops the `continue`-era footgun of re-splicing the originally-derived block (issue #261).
if ! [ /tmp/plan-epic-<EPIC>-deps.md -nt /tmp/plan-epic-<EPIC>-current.md ] \
|| { [ "${REPLAN:-0}" = 1 ] && ! [ /tmp/plan-epic-<EPIC>-plan.md -nt /tmp/plan-epic-<EPIC>-current.md ]; }; then
echo "ABORT: deps.md (or plan.md on a re-plan) is NOT newer than the base it splices onto (-current.md) —"
echo " you re-invoked without re-deriving. Re-run Step 2's split + Step 5's section derivation"
echo " against /tmp/plan-epic-<EPIC>-current.md, then re-invoke this block. Refusing to splice a stale block."
break
fi
# 3. anchor guard — the splice keys off the count of exact `## Dependencies` headings:
# 0 + first-time plan → no topology pinned yet (Step 2 omits it): APPEND to EOF (step 4a).
# 1 → re-plan with an existing section: SPLICE it in place (step 4b).
# 0 + re-plan, or >1 → corruption (heading drifted to `## Dependencies (phased)`, was
# deleted, or duplicated): a blind splice/append would orphan or
# double the section. Abort loudly, leave `landed=0`.
DEPS_HEADINGS=$(grep -c '^## Dependencies[[:space:]]*$' /tmp/plan-epic-<EPIC>-live.md)
if [ "$DEPS_HEADINGS" -gt 1 ] || { [ "$DEPS_HEADINGS" -eq 0 ] && [ "${REPLAN:-0}" = 1 ]; }; then
echo "ABORT: live body has $DEPS_HEADINGS exact '## Dependencies' headings (want 0 on a first-time plan, 1 on a re-plan) — refusing to splice; inspect by hand"
break
fi
# 3b. on a re-plan, the Plan splice (step 4) keys off `## Plan (plan-epic)` the same way deps keys
# off `## Dependencies` — and the same drift bites: 0 means the heading drifted (e.g. `## Plan`)
# and the awk would splice NOTHING (the re-planned plan silently dropped); >1 means it'd double.
# Want exactly 1 on a re-plan. (First-time plans don't splice the plan block, so skip the check.)
if [ "${REPLAN:-0}" = 1 ]; then
PLAN_HEADINGS=$(grep -c '^## Plan (plan-epic)[[:space:]]*$' /tmp/plan-epic-<EPIC>-live.md)
if [ "$PLAN_HEADINGS" -ne 1 ]; then
echo "ABORT: re-plan but live body has $PLAN_HEADINGS exact '## Plan (plan-epic)' headings (want exactly 1) — refusing to splice; inspect by hand"
break
fi
fi
# 4. surgical splice/append: write ONLY the changed section(s), keep every other byte verbatim.
if [ "$DEPS_HEADINGS" -eq 0 ]; then
# 4a. FIRST-TIME plan — no `## Dependencies` heading exists. Append the block to the END of a
# byte-for-byte copy of the live body (the brief + plan above are preserved untouched).
cp /tmp/plan-epic-<EPIC>-live.md /tmp/plan-epic-<EPIC>-body.md
cat /tmp/plan-epic-<EPIC>-deps.md >> /tmp/plan-epic-<EPIC>-body.md
else
# 4b. RE-PLAN — `## Dependencies` is the pinned LAST section: cut from its heading to EOF,
# append fresh deps. (DEPS_HEADINGS == 1 here.)
awk '/^## Dependencies[[:space:]]*$/{exit} {print}' /tmp/plan-epic-<EPIC>-live.md \
> /tmp/plan-epic-<EPIC>-body.md
cat /tmp/plan-epic-<EPIC>-deps.md >> /tmp/plan-epic-<EPIC>-body.md
fi
# On a RE-PLAN, `## Plan (plan-epic)` ALSO changed — splice it in place too: delete the inclusive
# `## Plan (plan-epic)`..next-`## ` range and re-insert the fresh plan block at that boundary.
if [ "${REPLAN:-0}" = 1 ]; then
awk -v plan="/tmp/plan-epic-<EPIC>-plan.md" '
/^## Plan \(plan-epic\)[[:space:]]*$/ { while ((getline l < plan) > 0) print l; skip=1; next }
skip && /^## / { skip=0 }
!skip { print }
' /tmp/plan-epic-<EPIC>-body.md > /tmp/plan-epic-<EPIC>-body.2.md \
&& mv /tmp/plan-epic-<EPIC>-body.2.md /tmp/plan-epic-<EPIC>-body.md
fi
BODY="$(cat /tmp/plan-epic-<EPIC>-body.md)"
# 5. extract THIS run's whole `## Dependencies` block (heading → EOF) from the body we're about
# to write — that exact multi-line block is what we'll confirm round-tripped, so a racer who
# happens to share a child number can't satisfy the check with one matching `- #` line.
awk '/^## Dependencies[[:space:]]*$/{f=1} f{print}' /tmp/plan-epic-<EPIC>-body.md \
> /tmp/plan-epic-<EPIC>-deps-expected.md
# 6. write, then re-confirm OUR WHOLE BLOCK landed: extract `## Dependencies`→EOF from the live
# post-write body and diff it against the block we just wrote. A racer's clobber differs
# somewhere in the block (different topology/labels/ordering), so an exact block match (not a
# heading or a single child line) is what tells our section from theirs. The residual window
# (below) means the PATCH is still last-write-wins; this is the honest after-the-fact check
# that hands the loser back to re-derive and re-invoke (it does not auto-retry).
gh api -X PATCH repos/$REPO/issues/<EPIC> -f body="$BODY" >/dev/null; patched=1
gh api repos/$REPO/issues/<EPIC> --jq '.body' \
| awk '/^## Dependencies[[:space:]]*$/{f=1} f{print}' > /tmp/plan-epic-<EPIC>-deps-live.md
if diff -q /tmp/plan-epic-<EPIC>-deps-expected.md /tmp/plan-epic-<EPIC>-deps-live.md >/dev/null; then
echo "epic body updated, our whole ## Dependencies block round-tripped"; landed=1; break
else
# A racer clobbered our write. Do NOT auto-re-splice the stale deps.md — that would
# silently re-clobber the racer's legitimate same-section topology change. Mirror the
# recheck-break (step 2): snapshot the racer's body as the FRESH base, then break to hand
# back to the agent to RE-DERIVE deps.md (and plan.md on a re-plan) against it before any
# re-splice. The freshness guard (step 2.5) then enforces the re-derive on the next invoke,
# so a stale block can never re-clobber.
echo "our ## Dependencies block is NOT the one in the post-write body — a racer clobbered it."
echo " Re-derive deps.md (and plan.md on a re-plan) against the refreshed"
echo " /tmp/plan-epic-<EPIC>-current.md, then re-invoke this block. Refusing to re-splice the stale block."
gh api repos/$REPO/issues/<EPIC> > /tmp/plan-epic-<EPIC>-snap.json # one snapshot, no TOCTOU between body+updated_at
jq -r '.body' /tmp/plan-epic-<EPIC>-snap.json > /tmp/plan-epic-<EPIC>-current.md # fresh base to re-derive against
jq -r '.updated_at' /tmp/plan-epic-<EPIC>-snap.json > /tmp/plan-epic-<EPIC>-updated-at.txt
break
fi
done
# Terminal verdict — the loop can exit several ways; only one is success. An orchestrator (and the
# next agent reading the transcript) must not mistake an exhausted-or-aborted run for a win. The two
# non-success modes differ: a run that NEVER issued a PATCH (raced + re-derived, or a guard aborted)
# left the body untouched; a run that DID PATCH but lost the round-trip left it possibly half-written.
if [ "$landed" != 1 ]; then
echo "plan-epic: could NOT land the ## Dependencies block — STOP and inspect, do not proceed."
if [ "$patched" = 1 ]; then
echo " (A PATCH was issued but our block didn't round-trip — a racer clobbered it. The epic body"
echo " may be half-written with someone else's topology; the topology this run derived is NOT pinned.)"
else
echo " (No PATCH was ever issued — either a guard aborted (corrupt/duplicated heading, or deps.md"
echo " not re-derived against the fresh base), or every attempt raced and handed back to re-derive."
echo " The epic body is untouched; the topology this run derived is NOT pinned.)"
fi
fi
Keep the brief byte-for-byte. With the surgical splice/append this is automatic: a first-time
plan appends the ## Dependencies block to a verbatim copy of the live body; a re-plan copies the
live body up to the ## Dependencies heading verbatim and re-appends the fresh block. Either way
the brief above the plan is untouched bytes from the live read. On a re-plan the
## Plan (plan-epic) block is itself re-spliced in place (step 4), and everything outside the two
changed sections is verbatim. Don't reflow the brief, don't "tidy" it, and don't reconstruct it
from memory: splice around it.
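A minimal local sketch of why the splice leaves the brief untouched, assuming a sample body in a temp directory (the file names and sample text here are illustrative, not the skill's real /tmp/plan-epic-<EPIC>-* files). It runs the step 4b cut-and-append on a body with one ## Dependencies heading and then compares the bytes above that heading before and after.

```shell
#!/usr/bin/env bash
set -eu
work=$(mktemp -d)

# Hypothetical live body: brief, plan, and an old Dependencies section.
printf '%s\n' \
  'Original brief:  odd   spacing kept.' \
  '' \
  '## Plan (plan-epic)' \
  'old plan text' \
  '' \
  '## Dependencies' \
  '- #11 (phase 1)' > "$work/live.md"

# Hypothetical freshly derived block.
printf '%s\n' '## Dependencies' '- #11 (phase 1)' '- #12 (phase 2, after #11)' > "$work/deps.md"

# Step 4b: keep everything above the heading verbatim, then append the fresh block.
awk '/^## Dependencies[[:space:]]*$/{exit} {print}' "$work/live.md" > "$work/body.md"
cat "$work/deps.md" >> "$work/body.md"

# The bytes above the heading must be identical in the live body and the new body.
awk '/^## Dependencies[[:space:]]*$/{exit} {print}' "$work/live.md" > "$work/live-prefix.md"
prefix_bytes=$(( $(wc -c < "$work/live-prefix.md") ))
if head -c "$prefix_bytes" "$work/body.md" | cmp -s - "$work/live-prefix.md"; then
  brief_intact=yes
else
  brief_intact=no
fi
echo "brief intact: $brief_intact"
```

The same comparison could be run against the real -live.md and -body.md files before the PATCH as a cheap extra check.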
Honest residual: this narrows the window, it is not a lock. The recheck (layer 2) only
detects a write that landed before the recheck's updated_at read; a writer who edits after
that read but before your PATCH lands is still lost-update territory, because the
PATCH itself is last-write-wins (GitHub offers no conditional write on issue bodies). Layer 1
shrinks the blast radius to same-section collisions; layer 2 narrows the same-section window;
the re-confirm in step 6 diffs the whole ## Dependencies block that round-tripped, not the
section heading or a single child line two racers might share, and so catches the loser after
the fact: the loop hands back to re-derive and re-invoke rather than failing silently. The
terminal verdict turns an exhausted-or-aborted run into a hard STOP rather than a silent
half-write. None of this is mutual exclusion: true single-writer safety on one epic would
need a designated single planner or a CAS the API doesn't provide (same honest framing as the
issue-claim semantics in ../gh-issue-intake-formats.md §7 and
the SHA-bound verdict contract, ADR 0058). Don't claim a "lock"; claim "no silent lost-update of
the topology," which is what the acceptance asks for.
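To illustrate why step 6 compares the whole block, here is a small offline sketch. The topology lines and file names are made up for the example. A hypothetical racer's block shares one child line with ours, so a single-line match would wrongly pass while the whole-block diff reports the clobber.

```shell
#!/usr/bin/env bash
set -eu
work=$(mktemp -d)

# Block this run wrote (hypothetical topology).
printf '%s\n' '## Dependencies' '- #21 (phase 1)' '- #22 (phase 2, after #21)' > "$work/expected.md"

# Body read back after the PATCH: a hypothetical racer's block shares child #21 but differs below.
printf '%s\n' 'brief' '' '## Dependencies' '- #21 (phase 1)' '- #30 (phase 2, after #21)' > "$work/post-write.md"

# Same extraction as step 6: heading through EOF.
awk '/^## Dependencies[[:space:]]*$/{f=1} f{print}' "$work/post-write.md" > "$work/live-block.md"

# A single shared child line would satisfy a naive check...
if grep -qxF -- '- #21 (phase 1)' "$work/live-block.md"; then line_check=pass; else line_check=fail; fi
# ...while the whole-block diff reports that our block is not the one that landed.
if diff -q "$work/expected.md" "$work/live-block.md" >/dev/null; then block_check=pass; else block_check=fail; fi

echo "single-line check: $line_check; whole-block check: $block_check"
```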
Sanity-check the result: the brief is still on top, the plan follows (product layer first),
the ### User stories section is present, the ## Dependencies numbers match the children that
exist and are linked, and every story maps to a child (the coverage invariant).
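One way to mechanize the "numbers match the children" part of this check is sketched below, under assumptions: the pinned block and the linked child numbers are already saved to local files (both are sample data here), and every `#N` in the block refers to a child. In a real run the linked numbers might come from the epic's sub-issues listing; treat that source as an assumption to verify.

```shell
#!/usr/bin/env bash
set -eu
work=$(mktemp -d)

# Hypothetical ## Dependencies block as pinned in the epic body.
printf '%s\n' '## Dependencies' '- #41 (phase 1)' '- #42 (phase 2, after #41)' \
  '- #43 (phase 2, after #41)' > "$work/deps.md"

# Hypothetical linked child numbers, one per line (assumed to be exported beforehand).
printf '%s\n' 41 42 44 > "$work/linked.txt"

# Every issue number mentioned in the block, de-duplicated and sorted.
grep -o '#[0-9][0-9]*' "$work/deps.md" | tr -d '#' | sort -u > "$work/pinned.txt"
sort -u "$work/linked.txt" > "$work/linked-sorted.txt"

# comm needs sorted input: -23 keeps lines only in pinned, -13 keeps lines only in linked.
pinned_not_linked=$(comm -23 "$work/pinned.txt" "$work/linked-sorted.txt" | paste -s -d ' ' -)
linked_not_pinned=$(comm -13 "$work/pinned.txt" "$work/linked-sorted.txt" | paste -s -d ' ' -)

echo "in ## Dependencies but not linked: ${pinned_not_linked:-none}"
echo "linked but missing from ## Dependencies: ${linked_not_pinned:-none}"
```

Either list being non-empty means the pinned topology and the linked children have drifted apart. The story-coverage invariant still needs its own check against the ### User stories section.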
When you're re-run on an epic that already has a plan and children (the brief changed, scope shifted, a child was closed), you rewrite the plan and the task split together — but you don't blow away history. Re-derive the user stories first (they may have grown or shifted), then judge each existing child individually against the new story set:
| Verdict | When | Action |
|---|---|---|
| Keep | The child is still a faithful slice of the new plan and still covers its story. | Leave it. If only its framing drifted, you may amend its body, but its identity stands. |
| Amend | The child's intent survives but its scope/criteria/stories moved. | PATCH its body to the new spec (preserve its acceptance-criteria + **Stories:** discipline). It stays linked, same number. |
| Supersede | The child no longer fits — the plan dropped it, merged it, or replaced it with a differently-shaped unit. | Close it with a journal note (below), unlink it, and create the replacement fresh if there is one. |
| Frozen | The child is already closed (its work merged). | Leave it untouched — it's history; the new plan builds on it. Never reopen or supersede a closed-done child. |
After reconciling, re-run the story-coverage check (Step 3) against the new story set: a newly-added story with no child needs one; an orphaned child needs a story or a cut.
A child created fresh during a re-plan — a Supersede replacement, or one filling a newly-added story — is born exactly like a first-plan child: it inherits the epic's milestone the same way (Step 3, Inherit the epic's milestone), conditional on the epic having one.
Closed-done children are history — never reopen or supersede them. A child that's
already closed because its work merged is part of the record. The new plan builds on
top of it; it doesn't pretend the work didn't happen. Only open children are
candidates for amend/supersede.
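The first cut of that triage, separating Frozen children from the ones that still need a verdict, can be sketched as a simple partition. This assumes a tab-separated listing of number, state, and state_reason has already been exported (the rows below are sample data). The keep/amend/supersede judgment itself stays with you.

```shell
#!/usr/bin/env bash
set -eu
work=$(mktemp -d)

# Hypothetical child listing: number <TAB> state <TAB> state_reason.
printf '%s\t%s\t%s\n' \
  61 closed completed \
  62 open '' \
  63 open '' > "$work/children.tsv"

# Closed children are Frozen history; only open ones go on to keep/amend/supersede judgment.
frozen=$(awk -F '\t' '$2 == "closed" { print "#" $1 }' "$work/children.tsv" | paste -s -d ' ' -)
candidates=$(awk -F '\t' '$2 == "open" { print "#" $1 }' "$work/children.tsv" | paste -s -d ' ' -)

echo "frozen (leave untouched): ${frozen:-none}"
echo "candidates for keep/amend/supersede: ${candidates:-none}"
```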
Every supersede is auditable. Before closing a superseded child, post a comment saying why and where the work went, so the trail is legible:
gh api repos/$REPO/issues/<CHILD>/comments \
-f body="Superseded by re-plan of #<EPIC>: <specific reason — e.g. 'scope merged into #<NEW>' or 'dropped, the brief no longer asks for X'>."
# unlink from the epic (singular sub_issue, id in the JSON body), then close not-planned
CHILD_ID=$(gh api repos/$REPO/issues/<CHILD> --jq '.id')
echo "{\"sub_issue_id\": $CHILD_ID}" | gh api -X DELETE repos/$REPO/issues/<EPIC>/sub_issue --input -
gh api -X PATCH repos/$REPO/issues/<CHILD> -f state=closed -f state_reason=not_planned
Per-child judgment is the default because it preserves the most history. But when the plan changed so much that mapping old children to new ones is a tangle — you can't cleanly say which old child maps to which new one — don't force a bad mapping. Full-supersede instead: close every open child with a journal note pointing at the new plan, then create the new split clean. Closed-done children stay as history untouched. A clean new split with honest journal notes beats a contorted keep/amend mapping that leaves children half-describing the old plan and half the new.
After reconciling children, rewrite the plan body and the ## Dependencies section
(Steps 2 and 5) to match the surviving + new children. The brief stays untouched on
top, as always. The re-plan write goes through the same guarded read-modify-write as
Step 5 — surgical section splice + optimistic updated_at recheck, never a blind
whole-body PATCH. Re-plan is exactly the concurrency hot-spot the guard exists for: a
re-plan loop racing a fresh plan-epic run or a review-plan child-flip is the
lost-update case in issue #261. Write the fresh ## Plan (plan-epic) block to
/tmp/plan-epic-<EPIC>-plan.md, the fresh ## Dependencies block to
/tmp/plan-epic-<EPIC>-deps.md, and run the Step 5 loop with REPLAN=1 so it splices
both sections into the freshly-read live body in place (the live body already has exactly one
## Dependencies heading and exactly one ## Plan (plan-epic) heading to splice against — the
loop's anchor guards abort if either drifted). When updated_at moved since your read, the loop
breaks and hands back so you re-derive both blocks against the fresh base before re-invoking it
— the re-derive is your step, not the script's, and the freshness guard enforces you did it. (On a
first-time plan, leave REPLAN unset — the live body has no ## Dependencies heading yet, so the
loop appends the new block to EOF instead of splicing, and the Plan-heading guard is skipped.)
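The re-plan Plan splice and its anchor guard can be tried offline. The sketch below runs the same awk as step 4 against a sample body in a temp directory (the content and file names are illustrative only) and shows that the old plan range is replaced while the brief and the ## Dependencies section pass through unchanged.

```shell
#!/usr/bin/env bash
set -eu
work=$(mktemp -d)

# Hypothetical body after the deps splice: brief, old plan, dependencies.
printf '%s\n' \
  'Original brief, untouched.' \
  '' \
  '## Plan (plan-epic)' \
  'old plan line' \
  '### User stories' \
  'old story' \
  '' \
  '## Dependencies' \
  '- #51 (phase 1)' > "$work/body.md"

# Hypothetical fresh plan block.
printf '%s\n' '## Plan (plan-epic)' 'new plan line' '### User stories' 'new story' '' > "$work/plan.md"

# Anchor guard: a re-plan wants exactly one exact heading to splice against.
# (grep -c exits non-zero on a zero count, hence the `|| true` under `set -e`.)
plan_headings=$(grep -c '^## Plan (plan-epic)[[:space:]]*$' "$work/body.md" || true)
if [ "$plan_headings" -ne 1 ]; then
  echo "ABORT: $plan_headings exact '## Plan (plan-epic)' headings"; exit 1
fi

# Splice: emit the fresh plan at the heading, skip the old range up to the next `## ` heading.
awk -v plan="$work/plan.md" '
  /^## Plan \(plan-epic\)[[:space:]]*$/ { while ((getline l < plan) > 0) print l; skip=1; next }
  skip && /^## / { skip=0 }
  !skip { print }
' "$work/body.md" > "$work/body.2.md"

cat "$work/body.2.md"
```

Note that `### User stories` does not end the skipped range: the `^## ` pattern requires a space after exactly two hashes, so only the next top-level section heading stops the skip.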
When validating against a scratch epic (a throwaway you created to test the skill, not a real backlog epic), tear it down afterwards so the real backlog stays clean. Ideally the children and the epic would be deleted outright, but issues can't be deleted via the public REST API, so the honest cleanup is: unlink and close every scratch child not-planned, close the scratch epic, and label them so they're unmistakably test debris.
# for each scratch child: unlink + close
CHILD_ID=$(gh api repos/$REPO/issues/<CHILD> --jq '.id')
echo "{\"sub_issue_id\": $CHILD_ID}" | gh api -X DELETE repos/$REPO/issues/<EPIC>/sub_issue --input -
gh api -X PATCH repos/$REPO/issues/<CHILD> -f state=closed -f state_reason=not_planned
# then close the scratch epic
gh api -X PATCH repos/$REPO/issues/<EPIC> -f state=closed -f state_reason=not_planned
(If you have repo-admin and the GraphQL deleteIssue mutation is available to you,
deleting outright is cleaner — but GraphQL is unreliable on this org, so closing is
the dependable path.) Never run dry-run validation against a real epic; spin up a
scratch one.
A single invocation takes one epic from triaged brief to executable ledger: read the epic +
codebase (Step 1), write the PRD-grade plan — product layer (problem / solution / user
stories / testing strategy) then engineering layer (Step 2), split into tracer-bullet
children that each trace to a story (Step 3), link them as native sub-issues (Step 4), and pin
the full body with its ## Dependencies topology (Step 5). Re-runs reconcile.
Acquire the status:planning epic-lock before you mutate (see §Acquire the epic-lock) and release
it when you finish: on success, park, or failure. A lock left held wedges the epic against
every later plan-epic/review-plan run until a human clears it.
Report back a short ledger: the epic, the story count, the children created (with the story each covers), and the phase topology. Don't narrate every REST call — the epic body and the linked sub-issues are the durable record.
This skill is one of a suite (report → triage → plan-epic → review-plan →
write-code → review-code → ship-it) that turns GitHub issues into an agent-operable
pipeline. The shared label semantics and the body/comment/dependency/story formats live in
../gh-issue-intake-formats.md; the decision to make plan-epic's output PRD-grade,
story-driven, coverage-enforced, and autonomous (with the personal PRD/orchestrator harness
deliberately kept out of the repo) is ADR 0046. Your input is a type:epic + status:triaged
issue from triage. Your output (the epic body's PRD-grade plan + ## Dependencies, and the
linked sub-issues with their story traces and acceptance criteria) is what write-code reads
to pick, sequence, and execute the work, once the review-plan gate has flipped each child
status:planned → status:triaged (ADR 0047).