From cadence
Use when reviewing a PR/MR diff for security, logic, performance, cross-file impact, test coverage, or spec compliance findings. NOT for writing PR descriptions, design reviews requiring business judgment, implementation work, release notes, or deep CVE/supply-chain audits.
How this skill is triggered — by the user, by Claude, or both
Slash command
/cadence:pr-reviewThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Review a PR/MR diff by dispatching independent role-based subagents in parallel, then publish findings as one sticky summary comment + per-finding inline comments. The main session never reviews — it ingests, dispatches, merges, emits, publishes.
Review a PR/MR diff by dispatching independent role-based subagents in parallel, then publish findings as one sticky summary comment + per-finding inline comments. The main session never reviews — it ingests, dispatches, merges, emits, publishes.
You MUST dispatch independent subagents — NEVER review the diff yourself in the main session. The main session accumulates context bias from prior conversation. Only an isolated subagent can deliver an unbiased finding.Dispatch in PARALLEL using a single message with multiple Agent tool calls. If one subagent fails, proceed with the rest BUT surface the failure in the sticky comment header (never silent). If ALL fail, report failure — do NOT fall back to self-review.
Publishing happens in the main session (post-merge) — not in subagents.
mode: local does NOT relax this gate. Local mode changes the output target (JSON to stdout instead of the PR/MR sticky/inline), not the reviewer. The 4-subagent parallel dispatch is exactly the property that makes local mode worth invoking from a supervisor session — without it the caller could just self-review.
| Thought | Reality |
|---|---|
| "The diff is small, I can review it myself" | Self-review is biased by what you saw in the conversation. Small ≠ unbiased. |
| "I already saw this code earlier" | That's exactly why you can't review it. Familiarity hides issues. |
| "Dispatching 3-4 subagents is overkill" | Each persona uses a different mental model. A single agent dilutes all of them. |
| "Sequential is fine, I'll save tokens" | Parallel is faster wall-clock and prevents one report from biasing the next. |
| "spec-auditor isn't needed, the spec is short" | If has_spec is true, dispatch. The check is whether spec exists, not whether it's verbose. |
| "I'll just check the obvious bug myself" | Even one self-checked finding contaminates the report — readers can't tell which findings are biased. |
| "1 subagent failed, just hide it" | Hiding partial review = pretending coverage existed. Surface it in sticky header. |
| "Prior finding line moved, mark fixed" | Line moving ≠ behaviour fixed. Require subagent verification, then hedge as Likely fixed. |
✅ Fixed without a verification note from subagent<decision_boundary>
Use for:
NOT for:
</decision_boundary>
digraph pr_review {
"Receive inputs" [shape=doublecircle];
"Resolve mode" [shape=box];
"Compute capability flags" [shape=box];
"Parallel dispatch" [shape=box, style=bold];
"security-reviewer" [shape=box];
"staff-engineer" [shape=box];
"sdet" [shape=box];
"has_spec?" [shape=diamond];
"spec-auditor" [shape=box];
"Skip spec-auditor" [shape=box];
"Collect + Dedup" [shape=box];
"Apply severity merge rule" [shape=box];
"Build sticky + inline" [shape=box];
"dry-run?" [shape=diamond];
"Print to console" [shape=box];
"Publish to PR" [shape=box];
"Done" [shape=doublecircle];
"Receive inputs" -> "Resolve mode";
"Resolve mode" -> "Compute capability flags";
"Compute capability flags" -> "Parallel dispatch";
"Parallel dispatch" -> "security-reviewer";
"Parallel dispatch" -> "staff-engineer";
"Parallel dispatch" -> "sdet";
"Parallel dispatch" -> "has_spec?";
"has_spec?" -> "spec-auditor" [label="yes"];
"has_spec?" -> "Skip spec-auditor" [label="no"];
"security-reviewer" -> "Collect + Dedup";
"staff-engineer" -> "Collect + Dedup";
"sdet" -> "Collect + Dedup";
"spec-auditor" -> "Collect + Dedup";
"Skip spec-auditor" -> "Collect + Dedup";
"Collect + Dedup" -> "Apply severity merge rule";
"Apply severity merge rule" -> "Build sticky + inline";
"Build sticky + inline" -> "dry-run?";
"dry-run?" -> "Print to console" [label="yes"];
"dry-run?" -> "Publish to PR" [label="no"];
"Apply severity merge rule" -> "Emit findings JSON" [label="mode=local"];
"Emit findings JSON" [shape=box];
"Emit findings JSON" -> "Done";
}
pr — <owner>/<repo>#<N> or a full PR/MR URL (GitHub PR, or GitLab /-/merge_requests/<iid>). Used for platform detection, sticky lookup, and publishing — see Platform. No implicit branch-based detection (too brittle).
Exception:
mode: localdoes not requirepr(no remote side effects). See Local Mode.
mode — auto (default) / full / incremental / local. See Mode Detection and Local Mode.
dry-run — true / false (default false). When true, builds sticky + inline payloads and prints to console; no remote API writes (GitHub/GitLab).
base — base git ref for diff scope (e.g. origin/main). Used by local mode only — replaces the sticky/PR-based base lookup. Ignored for other modes (they derive base from the PR or repo state).
last_sha — prior reviewed HEAD commit. Used by local mode only for incremental review. In non-local modes the last_sha lives in the sticky body; local mode has no sticky, so the caller (e.g. supervisor) must pass it explicitly.
spec — source of the spec / design doc. Sets has_spec flag.
spec: docs/specs/payment-v2.mdspec: https://confluence.example.com/payment-v2spec: this PR implements PCI DSS v4.0 logical isolationspec: design doc at X, acceptance criteria in Jira ABC-123If absent → spec-auditor not dispatched. Other subagents do not reference spec.
test direction — tells sdet how to evaluate test coverage. All sub-fields optional:
unit only / integration required / e2e required / no test neededMissing → sdet uses heuristic from diff nature.
context — free-form supplementary signals:
Context can adjust severity at merge time (see Severity Merge Rule).
repo rules — compact repo-specific review constraints sourced from committed project files, not from conversation memory. Use this when a repo has hard invariants, ADRs, testing conventions, glossary terms, module READMEs, or accepted design trade-offs that reviewers must respect.
Examples:
Repo rules are review inputs, not output metadata. Do not add a "rules source" / "review inputs" section to the sticky.
pr-review publishes to GitHub (via gh) or GitLab (via glab). Resolve $PLATFORM before Mode Detection — it selects every discovery and publish endpoint below. Both platforms use the same sticky/inline markers (HTML comments render in either markdown), so only the transport differs.
pr is a full URL → inspect host + path:
github.com, or path contains /pull/ → gh/-/merge_requests/ → glabpr is <owner>/<repo>#<N> shorthand (no host) → detect from git remote get-url origin: github.com → gh; any other host → glab.glab targets its configured host. Never hardcode a hostname.| GitHub | GitLab | |
|---|---|---|
| Repo / project | $OWNER/$REPO | $PROJECT = URL-encoded namespace/path (/ → %2F) |
| PR / MR number | $N | $IID (the MR number) |
| Inline diff anchors | n/a | $BASE / $START / $HEAD from glab api projects/$PROJECT/merge_requests/$IID → .diff_refs |
| Operation | GitHub (gh api) | GitLab (glab api) |
|---|---|---|
| Find sticky | repos/$OWNER/$REPO/issues/$N/comments | projects/$PROJECT/merge_requests/$IID/notes?per_page=100 |
| Create sticky | -X POST .../issues/$N/comments | -X POST .../merge_requests/$IID/notes |
| Update sticky in place | -X PATCH .../issues/comments/$ID | -X PUT .../merge_requests/$IID/notes/$ID |
| Delete stale sticky | -X DELETE .../issues/comments/$ID | -X DELETE .../merge_requests/$IID/notes/$ID |
| Commit status | -X POST .../statuses/$HEAD -f context=pr-review | -X POST .../statuses/$HEAD -f name=pr-review |
| Inline comment | one review: -X POST .../pulls/$N/reviews (batched) | one discussion per finding: -X POST .../merge_requests/$IID/discussions --input <json> |
GitLab specifics that bite (verified against a live MR):
PUT .../notes/$ID. Posting a fresh note every run is exactly what makes incremental detection fail (multiple stickies, no canonical last_sha).--input. Passing -f "position[...]" form flags is silently dropped and the note lands un-anchored as a plain DiscussionNote instead of a DiffNote.failed (GitHub uses failure); GitLab also has no delete-status API (a status is only superseded by a newer post on the same name).Resolve before dispatch. The mode controls diff scope and output sections.
mode: local short-circuits this whole section — no sticky lookup, no SHA reachability check, no noop case. Diff scope comes from the base (and optional last_sha) inputs; see Local Mode.
digraph mode {
"mode input" [shape=box];
"Has sticky?" [shape=diamond];
"last_sha reachable?" [shape=diamond];
"Same as HEAD?" [shape=diamond];
"incremental" [shape=box, style=bold];
"full" [shape=box, style=bold];
"local" [shape=box, style=bold];
"noop (report no-change)" [shape=box, style=bold];
"mode input" -> "Has sticky?" [label="auto"];
"mode input" -> "incremental" [label="incremental (forced)"];
"mode input" -> "full" [label="full (forced)"];
"mode input" -> "local" [label="local (forced)"];
"Has sticky?" -> "last_sha reachable?" [label="yes"];
"Has sticky?" -> "full" [label="no"];
"last_sha reachable?" -> "Same as HEAD?" [label="yes"];
"last_sha reachable?" -> "full" [label="no\n(force-push?)"];
"Same as HEAD?" -> "noop (report no-change)" [label="yes"];
"Same as HEAD?" -> "incremental" [label="no"];
}
Sticky discovery (use the platform's "Find sticky" endpoint from Platform):
# GitHub
gh api repos/$OWNER/$REPO/issues/$N/comments \
--jq '.[] | select(.body | contains("<!-- pr-review:sticky -->")) | {id, body}'
# GitLab
glab api "projects/$PROJECT/merge_requests/$IID/notes?per_page=100" \
| jq '[.[] | select(.body | contains("<!-- pr-review:sticky -->"))] | max_by(.id)'
When more than one sticky matches (legacy duplicates from older runs), the newest (highest id) is canonical — its body holds the authoritative last_sha. Publishing edits that one in place and deletes the rest (see Publishing).
Markers embedded in sticky body:
<!-- pr-review:sticky --> — locator<!-- pr-review:sha=<commit> --> — last reviewed HEADSHA reachability:
git cat-file -e <last_sha> 2>/dev/null && echo reachable || echo unreachable
If unreachable (force-push / squash-merge of older PR / branch rebased): fall back to full AND prepend to sticky body:
> ⚠️ Prior review base `<last_sha>` is not reachable (force-push?). This iteration is a full re-review.
last_sha == HEAD)When the sticky exists AND last_sha == HEAD: skip dispatch + publish. Print to console:
pr-review: nothing new since <last_sha>. Skipping. Use mode=full to force a re-review.
The sticky is already current; do not touch it.
Use when the caller is another skill or supervisor session that needs unbiased multi-role review of a diff but has no PR open yet (e.g. a supervisor session's verify phase doing pre-PR critique). The HARD-GATE still applies — local mode is about output target, not about who reviews.
Caveat — calling from the same dev session that wrote the code (author-as-reviewer bias): pr-review's 4-subagent dispatch is isolated by design — finding generation is robust even when called from the author's session. But the downstream
modify / wontfix / deferverdict on each finding is NOT covered by this isolation. If the same session that wrote the code also reasons about which findings to wontfix, author-narrative bias compounds — framing a diff as "bug-free" produces the strongest detection drop among framing conditions tested across 6 LLMs (Mitropoulos et al., Measuring and Exploiting Contextual Bias in LLM-Assisted Security Code Review, arXiv:2603.18740). Treat local-mode findings as advisory in dev sessions; do NOT auto-execute verdicts in main session. A proper dev-stage verdict loop needs a separate Deriver-pattern verdict-subagent (not built yet — seepr-babysit/SKILL.md§ 4.6 "When NOT to use" for the equivalent caveat on Wontfix Template).
mode: local (required to enter this mode)base: <ref> (required — e.g. origin/main)last_sha: <sha> (optional — if provided, runs incremental on <last_sha>..HEAD and still reads <base>...HEAD for cumulative context that subagents need for prior-finding verification)spec, test direction, context — same semantics as default modepr is NOT required and ignored if provided.
last_sha → full diff: git diff <base>...HEAD (three-dot — topic-only changes)last_sha → incremental: subagents see both <base>...HEAD and <last_sha>..HEAD; they report findings only inside the incremental window plus verification status for prior findings (caller must pass prior findings too — see below)The sticky normally carries prior findings between iterations. In local mode the caller owns that state and must pass to pr-review on each invocation:
prior_findings: array of objects with {id, slug, file, line, category, severity, justification, summary} — same shape as findings JSON output (see below)prior_fix_range: <first-fix-sha>^..<last-fix-sha> — the commits that addressed iter (N-1) findings, used by the threshold's drop signal (B)If last_sha is set but prior_findings is missing → ESCALATE to caller; do not fabricate.
Skip Publishing. Skip sticky/inline markdown construction. Emit one JSON document to stdout:
{
"mode": "local",
"base": "origin/main",
"head": "<HEAD sha>",
"last_sha": "<sha or null>",
"status": "PASSED | PASSED_WITH_NOTES | REVIEW_BEFORE_MERGE | BLOCKED | PARTIAL | NOOP",
"status_heading": "✅ pr-review: PASSED | 🟡 pr-review: PASSED WITH NOTES | 🟠 pr-review: REVIEW BEFORE MERGE | 🔴 pr-review: BLOCKED | ⚠️ pr-review: PARTIAL",
"open_counts": { "P0": 0, "P1": 0, "P2": 0, "Q": 0 },
"subagent_failures": [],
"next_action": "<one-line or null>",
"findings": [
{
"id": "F1",
"p_code": "P0 | P1 | P2 | Q",
"severity_emoji": "🚨 | ⚠️ | 💡 | ❓",
"slug": "kebab-case-slug",
"category": "Original [code name] from subagent",
"file": "path/to/file",
"line_start": 42,
"line_end": 42,
"confidence": "high | medium | low",
"blast": "Local | Module | Cross-service | Data layer",
"justification": "Reachable | Precedent | Asymmetric | Historical",
"failure_mode": "one-line",
"mitigation": "one-line",
"evidence": "verbatim diff line(s)",
"details": "optional multi-line",
"disposition": "open | likely_fixed | still_present | follow_up | wontfix | by_design",
"accepted_exception": null | { "kind": "follow_up | wontfix | by_design", "reason": "...", "issue": "#123 or null" },
"severity_adjustment": null | { "from": "💡 P2", "to": "⚠️ P1", "reason": "..." }
}
],
"accepted_exceptions": [
{ "finding_id": "F2", "kind": "follow_up | wontfix | by_design", "reason": "one-line", "issue": "#123 or null" }
],
"spec_gaps": [
{
"id": "F7",
"section": "spec section or decision id",
"title": "one-line",
"spec_quote": "verbatim",
"code_quote": "verbatim",
"questions": ["..."]
}
],
"prior_verifications": [
{
"prior_id": "F1",
"verification": "yes | unclear | no",
"note": "what evidence"
}
],
"checked_and_clean": [
{ "slug": "...", "evidence": "one-line" }
]
}
severity_adjustment: null when no adjustment; the merged severity is already reflected in p_code / severity_emoji. The adjustment field exists so callers can audit downgrades (same role as the sticky's ## ⚖️ Severity adjustments section).
prior_verifications is empty [] when last_sha is absent.
gh review or glab discussions)gh / glablast_sha == HEAD and caller still invokes, return findings: [] + a status: "noop")Compute before dispatch:
| Flag | Default | Set when | Effect |
|---|---|---|---|
has_spec | false | spec input present OR PR description has goal/requirement section | dispatch spec-auditor |
has_repo | true | repo access available (grep / index / LSP) | enable cross-file checks |
is_trivial | false | <50 LOC AND (docs-only OR pure rename OR pure type-only) | skip staff-engineer |
Before dispatch, build a compact context pack from durable inputs. This reduces false positives from reviewers missing accepted scope or repo rules, while preserving subagent isolation from conversation history.
Allowed sources:
spec, context, test direction, and repo rulesrepo rulesDo not include:
Subagents receive the compact context pack, but still emit findings only with quoted diff/spec evidence. Dispatcher uses the same pack for severity calibration, accepted-exception handling, and P0 conservatism.
Default dispatch (4 subagents in parallel via a single message):
| Subagent | Prompt file | When dispatched |
|---|---|---|
| security-reviewer | security-reviewer-prompt.md | always |
| staff-engineer | staff-engineer-prompt.md | always (skip if is_trivial) |
| sdet | sdet-prompt.md | always |
| spec-auditor | spec-auditor-prompt.md | only if has_spec |
Each subagent receives:
full mode; <last_sha>..HEAD in incremental mode)full / incremental)incremental mode (dispatcher MUST provide all three):
Checked & clean slugs for drift spot-checkprior_fix_range: <first-fix-sha>^..<last-fix-sha> — git range covering the commits that addressed iter (N-1) findings. Subagent uses this to apply drop signal (B) self-introduced surface. In single-commit-per-iter cases this collapses to <last_sha>..HEAD. If the dispatcher cannot determine the range (e.g. force-push, squash-merge of iter N-1 commits) → fall back to full mode and announce in sticky; do NOT invoke incremental mode without prior_fix_rangeThreshold inlining: the Finding Inclusion Threshold is inlined directly in each subagent prompt (security-reviewer-prompt.md / staff-engineer-prompt.md / sdet-prompt.md / spec-auditor-prompt.md). Dispatcher does NOT need to prepend threshold text — subagents apply it from their baked-in section. This avoids relying on dispatcher's "good behavior" to inject the gate on every invocation.
In incremental mode, each subagent ALSO emits for every prior finding within its scope:
Prior finding status: <id>
verification: yes | unclear | no
note: <one-line — what evidence supports the verification>
Mapping to display status (in ## 🔄 Changes since last review table):
verification | Display |
|---|---|
yes | ✅ Likely fixed <sha> — |
unclear | ⏸️ Untouched — <note: "file segment not in diff"> |
no | 🔄 Still present — <note: "evidence still observable at :"> |
Never emit ✅ Fixed without verification: yes. Default hedge is Likely fixed always — finality belongs to the human reviewer.
⚠️ Partial — <subagent> failed⚠️ Partial — N/4 subagents failed: <names>Each subagent emits findings in this shape:
[<category-id> <category-name>] <file>:<line_start>-<line_end>
Severity: 🚨 | ⚠️ | 💡 | ❓
Confidence: high | medium | low
Blast: Local | Module | Cross-service | Data layer
Justification: Reachable | Precedent | Asymmetric | Historical
Evidence: <verbatim diff line(s) — cite-or-drop rule>
Failure mode: <one-line — what breaks if shipped as-is>
Mitigation: <one-line — fix action; cite test path when test coverage is part of the fix>
Details: <optional — multi-line narrative, repro steps, code patch. Use only when Failure mode genuinely needs more than one line>
Notes: <optional — only if severity differs from default>
Field semantics:
Failure mode — concrete bug / breach / drift consequence. Forces severity calibration: if you cannot describe what goes wrong in one line, you do not have a finding.Mitigation — actionable fix. When the finding's resolution involves test coverage, name the test file and case (e.g. add assert in foo_test.py:42 'rejects empty input' case).Details — escape hatch for findings whose explanation cannot fit one line (e.g. multi-step race, cross-file impact chain). Keep Failure mode and Mitigation as one-liners regardless; put narrative here.Justification — required class declaring why the finding is worth emitting. See Finding Inclusion Threshold below. Findings that cannot commit to one of the four classes MUST NOT be emitted as standalone findings; batch into a Q-class hygiene followup instead.After findings, each subagent emits N/A categories: [<list>] declaring which of its owned categories were reviewed and clean. This distinguishes "checked, found nothing" from "skipped".
spec-auditor uses Spec quote: + Code quote: instead of single Evidence: — both must be verbatim quotes. Failure mode for spec findings = "what spec contract gets violated if shipped".
Drop rule: any finding without Evidence: (or both quotes for spec-auditor) is fabrication — discard before merge.
This gate is applied by each subagent inline before emitting a finding. Canonical definition lives in the subagent prompts, not here — see any of:
security-reviewer-prompt.md § Finding Inclusion Thresholdstaff-engineer-prompt.md § Finding Inclusion Thresholdsdet-prompt.md § Finding Inclusion Thresholdspec-auditor-prompt.md § Finding Inclusion ThresholdAll four contain the same Justification classes (Reachable / Precedent / Asymmetric / Historical), the same drop signals (A / B / C / D), and the same Asymmetric escape hatch. Per-prompt variations only add category-specific guidance (e.g. "most S1–S5 are Asymmetric" for security, "rare for T-class" for SDET).
Why duplicated across four prompts rather than referenced from one source: see Design note: prompt inlining.
Full vs incremental mode: full mode applies the threshold but drop signal (B) self-introduced surface never fires (no prior_fix_range on iter 1). Incremental mode applies all four signals.
Spec ambiguity rule (applies only to spec-auditor's C-class findings, kept in this SKILL.md as cross-cutting): if a candidate finding's mitigation offers "add a code comment" / "document the limitation in a comment" as an equal-weight resolution (phrasing "either X or document Y"), the finding is a Q-class spec gap addressed to the spec author, not P-class actionable. A comment-as-last-resort fallback ("do X; if impractical, document Y") keeps the finding actionable — the primary mitigation is what gets judged.
Apply in fixed order to each finding. Lower number wins on conflict.
confidence: low → demote to ❓ Question (terminal, no further escalation)blast: cross-service or blast: data-layer raises review attention and can escalate P2→P1 when the failure is factual. It does not automatically escalate to 🚨 Blocker.| Emoji | P-code | Label |
|---|---|---|
| 🚨 Blocker | P0 | must fix; blocks merge |
| ⚠️ Factual | P1 | should fix |
| 💡 Suggestion | P2 | consider |
| ❓ Question | Q | clarify; not a priority tier |
Severity ordering (for sort): P0 > P1 > P2. Q is orthogonal.
Downgrades (step 5 lowering a tier) MUST appear in the Severity adjustments section. Never silent. Never collapsed behind <details> — render as plain section when any adjustment exists.
file:line + same category → keep highest base severity, attribute to all reporting subagentsPR-published prose (sticky shape narrative, Failure mode / Mitigation / Details content, spec gap question body, verification notes, framing text around code refs) renders in the PR description's primary language. Everything else — markers, section titles, field labels, kebab-case slugs, P-codes, severity / justification / status tokens, the race meta tag — stays English.
Fallback when the PR description lacks substantive prose: linked issue body, then English.
Terminal / JSON output (mode=local JSON, dry-run console, noop message) stays English regardless of PR language — those go to callers, not the PR.
After subagent findings are merged, deduped, and severity-calibrated, produce three artifacts:
pr-review (GitHub context / GitLab name), visible in the PR/MR header / Checks area. Its target_url links to the sticky comment.Status tier (drives sticky heading and commit status, NOT any actual approve/request-changes review event):
| Condition | Sticky heading | Commit status |
|---|---|---|
| Any subagent failed | ⚠️ pr-review: PARTIAL | failure |
| Any unaccepted P0 | 🔴 pr-review: BLOCKED | failure |
| No unaccepted P0, any unaccepted P1 | 🟠 pr-review: REVIEW BEFORE MERGE | failure |
| Only unaccepted P2 / Q | 🟡 pr-review: PASSED WITH NOTES | success |
| Zero unaccepted findings | ✅ pr-review: PASSED | success |
| Zero unaccepted findings + accepted exception | 🟡 pr-review: PASSED WITH NOTES | success |
Accepted exceptions (follow-up, wontfix, by-design) do not count as open blockers after they are explicitly recorded in the sticky. The sticky status MUST be recalculated after every incremental review from current unaccepted findings plus accepted exceptions; never leave BLOCKED / REVIEW BEFORE MERGE visible after all P0/P1 findings have been accepted or verified likely fixed.
The skill does not submit APPROVE or REQUEST_CHANGES reviews. Status wording lives inside the sticky comment and commit status only. Auto-approve / auto-merge is forbidden.
Final P0 is reserved for findings that are all three:
Cross-service or Data layer blast may escalate severity only when the three P0 conditions hold. Otherwise prefer P1 for factual defects, P2 for optional hardening, or Q for spec/design decisions. Downgrades caused by accepted scope or design context must be visible in Accepted exceptions or Severity adjustments.
The first visible line of the sticky is always the status heading. Do not prepend bot attribution such as Automated review by pr-review skill. The reader should know pass/fail before reading details.
## <status-heading>
**Open**: <none | P0×N, P1×N, P2×N, Q×N — only non-zero> · **Reviewed HEAD**: `<HEAD>` · **Mode**: <full|incremental>
**Checked**: ✅ <N> clean
**Next action**: <one-line: optional for PASSED, required otherwise>
Examples:
## ✅ pr-review: PASSED
**Open**: none · **Reviewed HEAD**: `abc1234` · **Mode**: full
**Checked**: ✅ 11 clean
## 🟡 pr-review: PASSED WITH NOTES
**Open**: P2×1, Q×1 · **Reviewed HEAD**: `abc1234` · **Mode**: full
**Checked**: ✅ 11 clean
**Next action**: optional; no blocker
## 🟠 pr-review: REVIEW BEFORE MERGE
**Open**: P1×2 · **Reviewed HEAD**: `abc1234` · **Mode**: incremental
**Checked**: ✅ 11 clean
**Next action**: fix F2/F4 or explicitly defer
## 🔴 pr-review: BLOCKED
**Open**: P0×1, P1×2 · **Reviewed HEAD**: `abc1234` · **Mode**: incremental
**Checked**: ✅ 11 clean
**Next action**: fix F1 before merge
## ⚠️ pr-review: PARTIAL
**Open**: P1×2 · **Reviewed HEAD**: `abc1234` · **Mode**: full
**Checked**: ✅ 8 clean
**Next action**: rerun review; security-reviewer failed
Convert each subagent's [<code> <name>] to a kebab-case slug for output. Drop the subagent-owned code (S/E/T/C). Examples:
[E3 Conditional side effects] → state-consistency (use semantic slug, not the literal name when one is more reviewer-meaningful)[S3 Secret / credential] → secrets-handling[T1 Test coverage gaps] → missing-coverage[C4 Business rule alignment] → decision-conflictWhen semantic slug differs from the literal category name, prefer semantic. The slug is the navigation handle reviewers see; pick the term that conveys "what kind of problem" most directly.
<!-- pr-review:sticky -->
<!-- pr-review:version=2 -->
<!-- pr-review:sha=<HEAD> -->
<!-- pr-review:status=<PASSED|PASSED_WITH_NOTES|REVIEW_BEFORE_MERGE|BLOCKED|PARTIAL> -->
## <status-heading>
**Open**: <none | non-zero buckets> · **Reviewed HEAD**: `<HEAD>` · **Mode**: <full|incremental>
**Checked**: ✅ <N> clean
**Next action**: <one-line: optional only when PASSED>
> <one-line shape narrative — what's the issue cluster; render in PR description language. English example: "observability + state-consistency form two P1 clusters; security clean">
## 📋 Currently open (<N>)
- **<id>** <P-code> `<slug>` — <file>:<line>
- ...
## ↪ Accepted exceptions (<N>)
- **<id>** <P-code> `<slug>` — <follow-up #N | wontfix | by-design>: <one-line reason>
- ...
📍 **Inline comments**: <N> findings pinned to source lines (see the Files changed tab) — render this locator line in PR description language
## ⚖️ Severity adjustments
<rendered only when ≥1 adjustment exists; NOT inside <details>; see template below>
## 🔄 Last iteration changes (`<last_sha>..<HEAD>`)
<rendered only in incremental mode; ONLY this iter's verifications, not cumulative; see template below>
<details><summary>📊 Overview by category</summary>
| Category | P0 | P1 | P2 | Q | Files |
| -------- | --: | --: | --: | --: | ----------------------------- |
| `<slug>` | N | N | N | N | <file paths, comma-separated> |
</details>
<details><summary>❓ Spec gap questions (<N>)</summary>
<rendered only when spec-auditor emitted gap items>
</details>
<details><summary>✅ Checked & clean (<N>)</summary>
- `<slug>`: <one-line evidence — what specific patterns were verified clean, or which grep / file-read confirmed>
- ...
</details>
---
`pr-review` · reviewed `<base>..<HEAD>`<· last reviewed `<last_sha>` — incremental only>
Rules:
Open counts only unaccepted findings. Accepted exceptions appear in their own section and do not block PASSED WITH NOTES.Next action is mandatory for PARTIAL, BLOCKED, REVIEW BEFORE MERGE, and PASSED WITH NOTES; omit only for clean PASSED.📋 Currently open rendered flat (no <details>) when ≥1 finding is not yet Likely fixed; one bullet per finding, sorted P0→P1→P2→Q then by file path. Omit the section entirely when all findings are closed (avoid empty heading)↪ Accepted exceptions rendered flat when any finding is explicitly closed as follow-up / wontfix / by-design. Omit when empty.📊 Overview by category always in <details> (collapsed); rows omitted where P0/P1/P2/Q are all zero. Collapsed by default — summary line already conveys totals; the table is for drill-down only📍 Inline comments line shown when ≥1 P0/P1/P2 finding posted inline; omit otherwiseSeverity adjustments rendered flat (no <details>) when any adjustment exists — discipline requirement, never silent🔄 Last iteration changes rendered flat in incremental mode; shows ONLY this iter's verifications (<last_sha>..<HEAD>), never cumulative across older iterations. Audit trail for older iters lives in git history (commits + prior inline comment threads), not in the stickySpec gap questions always in <details> (collapsed) — verbose; secondary to actionable findingsChecked & clean always in <details> (collapsed) — count is the load-bearing signal; expand for trust calibration## ⚖️ Severity adjustments
| # | Category | Adjustment | Reason |
| ---- | -------- | -------------------------------------------------- | ------------------- |
| F<n> | `<slug>` | <original-emoji + P-code> → <final-emoji + P-code> | <reason — one line> |
## 🔄 Last iteration changes (`<last_sha>..<HEAD>`)
| Prior | Status |
| ------------------------------------ | --------------------------------------------- |
| F<n> <P-code> <slug> (<file>:<line>) | ✅ Likely fixed `<sha>` — <verification note> |
| F<n> <P-code> <slug> (<file>:<line>) | 🔄 Still present — <note> |
| F<n> <P-code> <slug> (<file>:<line>) | ⏸️ Untouched — <note> |
| F<n> <P-code> <slug> (<file>:<line>) | ↪ Follow-up #<N> — <accepted reason> |
| F<n> <P-code> <slug> (<file>:<line>) | 🚫 Wontfix — <accepted reason> |
| F<n> <P-code> <slug> (<file>:<line>) | 🧭 By-design — <accepted reason> |
| F<n> Q <slug> | ⏸️ Awaiting spec author |
Scope: only findings whose status changed (or was re-confirmed) in this iteration's <last_sha>..<HEAD> diff. Untouched findings carrying over from before <last_sha> belong in 📋 Currently open, not here. The table is the delta, not the inventory.
Status legend (hedged on purpose — line-moved ≠ behaviour-fixed):
✅ Likely fixed <sha> — subagent emitted verification: yes + note explaining what changed🔄 Still present — subagent emitted verification: no + note pointing to remaining evidence⏸️ Untouched — subagent emitted verification: unclear (file segment not in diff)↪ Follow-up #<N> — human / babysit disposition accepted a follow-up issue instead of fixing in this PR🚫 Wontfix — human / babysit disposition accepted that the finding will not be fixed in this PR🧭 By-design — human / babysit disposition accepted that the finding's premise conflicts with a repo rule, ADR, spec, or explicit PR design decisionFollow-up / wontfix / by-design rows MUST also appear under ↪ Accepted exceptions, and MUST be excluded from 📋 Currently open.
One per P0 / P1 / P2 finding emitted in this iteration. Anchored to the diff via the platform's inline endpoint — GitHub: one batched review (event=COMMENT); GitLab: one discussion per finding carrying a position (see Platform). Opening a new root comment for the same finding in a later iteration is allowed, but the root MUST keep the same F<n> finding ID, use this template, link to the sticky, and link to the previous thread when known.
<!-- pr-review:finding-root -->
<!-- pr-review:finding-id=F<n> -->
<!-- pr-review:status=<open|still_present|likely_fixed|follow_up|wontfix|by_design> -->
<!-- pr-review:head=<HEAD> -->
<!-- pr-review:sticky-url=<sticky-comment-url> -->
<!-- pr-review:previous-thread-url=<url-or-none> -->
**F<n> <P-code> `<slug>`** · <status-label>
**Sticky summary**: <sticky-comment-url>
**Iteration**: `<last_sha>..<HEAD>`<br>
**Previous thread**: <url — omit when none>
**Failure mode**: <one-line>
**Mitigation**: <one-line; cite test path when applicable>
<details><summary>Evidence</summary>
```diff
<verbatim diff line(s)>
```
</details>
<sub>blast: <Local|Module|Cross-service|Data layer> · <reversible|not reversible> · confidence: <high|medium|low> · justification: <Reachable|Precedent|Asymmetric|Historical></sub>
<!-- pr-review:justification=<Reachable|Precedent|Asymmetric|Historical|Hygiene> -->
The root markers are consumed by pr-babysit and by later incremental reviews. The justification HTML marker is consumed by pr-babysit's diminishing-returns gate to decide whether to keep looping or hand back to the user. Hygiene value is reserved for batched Q-class hygiene followups; never emit Hygiene on a P0/P1/P2 finding.
Status label values:
🆕 New🔄 Still present✅ Likely fixed↪ Follow-up🚫 Wontfix🧭 By-designreversible derivation:
reversible — code-only change, additive feature, refactor without state migrationnot reversible — destructive migration, breaking contract change, irreversible side effect (sent message, deleted data)<details>)### ❓ F<n> <spec-section-or-decision-id> — <one-line title>
`<Blast>` · spec-author confirm
**Spec quote**: <verbatim>
**Code quote**: <verbatim>
**Question for spec author**:
1. <numbered question>
2. ...
<closing line, in PR description language — e.g. "not blocking the PR; want to clarify X">
Q findings do not become inline comments — they're often cross-file conceptual questions, pinning to a line misleads.
| Drop | Why |
|---|---|
Capability: has_spec=Y · has_repo=Y · is_trivial=N | dispatch logic; reviewer doesn't care |
Subagents: security ✅ · staff-eng ✅ · ... | which bots ran is process metadata — UNLESS one failed (then surface) |
Source: <subagent> per finding | reviewer wants "what kind of issue" (already in category), not "who found it" |
<subagent>/ namespace prefix on slugs | leaks subagent identity; bare slug reads cleaner |
Checked & clean grouped under subagent headers | same — flat topic list |
Empty Severity adjustments section heading | render section only when content exists |
Runs after the findings merge/dedup step. Skipped entirely when dry-run: true (print sticky + inline payloads to console instead) or when mode: local (emit findings JSON to stdout — see Local Mode).
digraph publish {
"Findings merged" [shape=doublecircle];
"Build sticky body" [shape=box];
"Build inline payload template" [shape=box];
"Find sticky comment" [shape=box];
"Found?" [shape=diamond];
"Update sticky in place" [shape=box];
"Delete stale duplicate stickies" [shape=box];
"POST new sticky" [shape=box];
"Capture sticky URL" [shape=box];
"Publish commit status" [shape=box];
"Fill sticky URL in inline roots" [shape=box];
"Post inline comments" [shape=box];
"Done" [shape=doublecircle];
"Findings merged" -> "Build sticky body";
"Findings merged" -> "Build inline payload template";
"Build sticky body" -> "Find sticky comment";
"Find sticky comment" -> "Found?";
"Found?" -> "Update sticky in place" [label="yes"];
"Found?" -> "POST new sticky" [label="no"];
"Update sticky in place" -> "Delete stale duplicate stickies";
"Delete stale duplicate stickies" -> "Capture sticky URL";
"POST new sticky" -> "Capture sticky URL";
"Capture sticky URL" -> "Publish commit status";
"Capture sticky URL" -> "Fill sticky URL in inline roots";
"Build inline payload template" -> "Fill sticky URL in inline roots";
"Fill sticky URL in inline roots" -> "Post inline comments";
"Publish commit status" -> "Done";
"Post inline comments" -> "Done";
}
Pick endpoints by $PLATFORM (see Platform). The five steps are identical in shape; only the transport differs. sticky.md and its markers are byte-for-byte the same on both platforms:
<!-- pr-review:sticky -->
<!-- pr-review:version=2 -->
<!-- pr-review:sha=$HEAD -->
<!-- pr-review:status=$STATUS_TOKEN -->
STATUS_STATE (step 4): success for PASSED / PASSED_WITH_NOTES; the failure token for REVIEW_BEFORE_MERGE / BLOCKED / PARTIAL — GitHub failure, GitLab failed. STATUS_DESCRIPTION must stay short.
gh)# 1. find sticky id (may be empty)
STICKY_ID=$(gh api repos/$OWNER/$REPO/issues/$N/comments \
--jq '.[] | select(.body | contains("<!-- pr-review:sticky -->")) | .id' | head -1)
# 2. build sticky.md (markers above)
# 3. create when none exists, else PATCH in place — capture the permalink
if [ -z "$STICKY_ID" ]; then
STICKY_URL=$(gh api -X POST repos/$OWNER/$REPO/issues/$N/comments \
-F [email protected] --jq '.html_url')
else
STICKY_URL=$(gh api -X PATCH repos/$OWNER/$REPO/issues/comments/$STICKY_ID \
-F [email protected] --jq '.html_url')
fi
# 4. PR-header-visible commit status
gh api -X POST repos/$OWNER/$REPO/statuses/$HEAD \
-f state="$STATUS_STATE" -f context="pr-review" \
-f target_url="$STICKY_URL" -f description="$STATUS_DESCRIPTION"
# 5. inline — one batched review. Skip when none this iteration.
# inline-comments.json: [{"path": "...", "line": N, "side": "RIGHT", "body": "..."}, ...]
if [ "$(jq 'length' inline-comments.json)" -gt 0 ]; then
gh api -X POST repos/$OWNER/$REPO/pulls/$N/reviews \
-F event=COMMENT \
-F body="pr-review iteration · $STATUS_DESCRIPTION · $STICKY_URL" \
-F '[email protected]'
fi
glab)# 0. MR web url + inline diff anchors ($PROJECT = url-encoded namespace/path, $IID = MR number)
eval $(glab api "projects/$PROJECT/merge_requests/$IID" \
| jq -r '"MR_URL=\(.web_url) BASE=\(.diff_refs.base_sha) START=\(.diff_refs.start_sha) HEAD=\(.diff_refs.head_sha)"')
# 1. find sticky notes; newest matching id is canonical, older matches are stale duplicates
MATCHES=$(glab api "projects/$PROJECT/merge_requests/$IID/notes?per_page=100" \
| jq '[.[] | select(.body | contains("<!-- pr-review:sticky -->"))]')
STICKY_ID=$(echo "$MATCHES" | jq -r 'max_by(.id).id // empty')
# 2. build sticky.md (same markers as GitHub)
# 3. create when none exists, else PUT the newest in place + DELETE the stale duplicates (self-heal)
if [ -z "$STICKY_ID" ]; then
STICKY_ID=$(glab api -X POST "projects/$PROJECT/merge_requests/$IID/notes" \
--field [email protected] | jq -r '.id')
else
glab api -X PUT "projects/$PROJECT/merge_requests/$IID/notes/$STICKY_ID" \
--field [email protected] > /dev/null
echo "$MATCHES" | jq -r ".[] | select(.id != $STICKY_ID) | .id" | while read -r OLD; do
glab api -X DELETE "projects/$PROJECT/merge_requests/$IID/notes/$OLD"
done
fi
STICKY_URL="$MR_URL#note_$STICKY_ID" # note permalink = <mr-web-url>#note_<id>
# 4. commit status — `name=` (not `context=`), state token `failed` (not `failure`)
glab api -X POST "projects/$PROJECT/statuses/$HEAD" \
-f state="$STATUS_STATE" -f name="pr-review" \
-f target_url="$STICKY_URL" -f description="$STATUS_DESCRIPTION"
# 5. inline — ONE discussion per finding; position MUST travel as a JSON body via --input.
# `-f "position[...]"` flags are silently dropped → un-anchored DiscussionNote.
# discussion-<n>.json (old_path == new_path; for an added/context line set
# new_line and omit old_line):
# {"body":"<root markdown>","position":{"position_type":"text",
# "base_sha":"$BASE","start_sha":"$START","head_sha":"$HEAD",
# "old_path":"<file>","new_path":"<file>","new_line":<line>}}
for f in discussion-*.json; do
glab api -X POST "projects/$PROJECT/merge_requests/$IID/discussions" \
-H "Content-Type: application/json" --input "$f"
done
Do not delete or resolve old inline comments from prior iterations. Both platforms auto-mark a diff-anchored comment outdated when its line moves (GitHub collapses outdated threads; GitLab marks the discussion outdated), so stale roots fade on their own. Opening a new root for a repeated finding is allowed, but keep the same F<n> id, include Sticky summary, and include Previous thread when known. This is the chosen trade-off (vs. resolving threads via API) for operational simplicity. (Sticky duplicates are different — those the skill self-heals; see Publishing step 3.)
When dry-run: true:
state, context/name, target_url, description)inline-comments.json / GitLab discussion-*.json)gh / glab writesSubagents (security-reviewer / staff-engineer / sdet / spec-auditor) operate as isolated Agent dispatches with no shared loader. They cannot follow a "see references/X.md" pointer at dispatch time — references load via the main session, not the subagent's. This inverts skill-creator's default duplication-avoidance principle: any policy a subagent MUST apply is inlined verbatim into each subagent prompt, not stored once in references/ and pointed to.
Current inlined-duplicated content (intentional, not drift):
[window=..., damage=..., recovery=...] meta tag spec) — identical across staff-engineer-prompt.md + security-reviewer-prompt.md; pr-babysit's Gate B parser depends on identical syntaxCross-prompt sync is maintained via <!-- keep-in-sync: ... --> HTML comments at each duplicated section header. When editing one, grep for the keep-in-sync marker to find paired sections.
Do NOT refactor inlined content into a shared references/ file — the alternative regresses to the exact failure mode that motivated inlining. SKILL.md previously claimed "dispatcher prepends threshold at dispatch time" and that contract was never enforced (commit 328b73b8 fixed it by baking threshold into prompts; ⇒ this design note exists to prevent a future editor from re-introducing the same gap).
<details>⚠️ pr-review: PARTIAL; never silentpr-review (GitHub context / GitLab name) with target_url set to the sticky permalinkF-prefixed, never #-prefixed (F1, F2, …) — GitHub auto-links a bare #<digits> in a comment to the issue/PR of that number, so a finding labelled #7 renders in the sticky as a link to issue #7 (an unrelated issue that merely shares the number). The F prefix sidesteps the collision entirelyF<n>, include Sticky summary, and include Previous thread when known↪ Accepted exceptions and no longer count as open blockersLikely fixed, never bare Fixed; line-moved ≠ behaviour-fixednpx claudepluginhub kirkchen/cadence --plugin cadenceGuides creation, editing, and verification of skills for AI coding agents using test-driven development with subagent scenarios. Use when authoring or debugging skills.