From domain-chassis
Runs a structured retrospective of a completed work session and captures what specs, gates, and plans miss: what actually happened, where friction occurred, and what to change next time. Domain-agnostic methodology.
Slash command
/domain-chassis:aar
When to use
Use when the user asks to do an AAR, after-action review, retrospective, post-mortem, debrief a session or run, review what happened, capture lessons learned, or analyze outcomes of completed work.
Structured retrospective analysis of a completed work session. The AAR captures what specs, gates, and plans miss: what actually happened, where friction occurred, and what to change next time.
The AAR produces two artifacts: an evidence record (structured, factual, verifiable) and an interpretation document (narrative analysis, lessons, recommendations). These are separate files. A downstream consumer can read the evidence record without encountering interpretation, and vice versa. This separation is a chassis-level requirement — see ${CLAUDE_PLUGIN_ROOT}/foundation/EVIDENCE.md for the doctrine and provenance.
You (the main agent) drive the analysis. You sat through the session or have been primed with the project state. The AAR's value is in connecting observations to methodology decisions — identifying which skill, process, or configuration change would have changed the outcome. That's judgment work that requires your context.
Subagents gather mechanical data when you lack session context. When you participated in the work, your conversation history is the evidence base — subagents are skipped.
Before gathering data, identify which domain this AAR belongs to. Check the workspace root for a domain doctrine file:
- FORGE.md → Forge (Building)
- LAB.md → Lab (Operating)
- WORKSHOP.md → Workshop (Tooling)
- RESEARCH.md → Research (Investigating)

If no doctrine file is found, ask the operator which domain this work belongs to. Set DOMAIN to the resolved domain name (lowercase).
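The doctrine-file lookup can be sketched in Python. This is a minimal illustration rather than part of the skill: the helper name `resolve_domain` and the `None` return value (standing for "ask the operator") are assumptions. Only the four file names and their domains come from the list above.

```python
from pathlib import Path
from typing import Optional

# Doctrine file -> lowercase domain name, taken from the list above.
DOCTRINE_FILES = {
    "FORGE.md": "forge",
    "LAB.md": "lab",
    "WORKSHOP.md": "workshop",
    "RESEARCH.md": "research",
}


def resolve_domain(workspace_root: str) -> Optional[str]:
    """Return the lowercase domain name, or None if no doctrine file exists.

    Hypothetical helper: a None result means the operator must be asked.
    """
    root = Path(workspace_root)
    for filename, domain in DOCTRINE_FILES.items():
        if (root / filename).is_file():
            return domain
    return None
```

If more than one doctrine file is present, this sketch returns the first match in dictionary order. The source does not say how that case should be handled.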
The AAR persists to the domain's knowledge repo — the domain-specific plugin that holds doctrine, AARs, and operational knowledge. These repos are not uniformly named. Known mappings:
- forge-doctrine
- workshop-polish

For other domains, look for a repo or directory matching {domain}-doctrine or {domain}-* that contains an aar/ directory.
Resolution order:
1. aar/ directory and the domain doctrine file).
2. ./{repo-name}/ as a subdirectory of cwd using the known mappings above.
3. {domain}-* that contain an aar/ directory.

Set DOMAIN_REPO to the resolved path.
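Part of the repo resolution can be sketched in Python, under stated assumptions. The helper name `find_domain_repo` is hypothetical, the pairing of domains to the known repo names is inferred from the names themselves, and the `{domain}-*` search is assumed to run inside cwd. The first resolution step is incomplete in the text above, so it is deliberately left out.

```python
from pathlib import Path
from typing import Optional

# ASSUMPTION: the domain -> repo pairing is inferred from the repo names;
# the source lists only the repo names themselves.
KNOWN_REPOS = {"forge": "forge-doctrine", "workshop": "workshop-polish"}


def find_domain_repo(cwd: str, domain: str) -> Optional[Path]:
    """Return a directory that holds aar/ for the domain, or None."""
    base = Path(cwd)

    # Known mapping, tried as a subdirectory of cwd.
    known = KNOWN_REPOS.get(domain)
    if known and (base / known / "aar").is_dir():
        return base / known

    # Fallback: any {domain}-* directory that contains aar/.
    # ASSUMPTION: the search location is cwd; the source does not say.
    for candidate in sorted(base.glob(f"{domain}-*")):
        if (candidate / "aar").is_dir():
            return candidate

    return None
```

A `None` result corresponds to the unresolved case, where the AAR is written to the workspace root and the operator is reminded to file it.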
Before dispatching subagents, determine your context mode based on observable session signals:
If warm context: skip subagent dispatch. Your session context is the primary evidence base for Phase 2. Proceed directly to Phase 2.
If cold context: you must delegate data gathering to subagents using the Agent tool. Do not run git commands or scan artifacts yourself — dispatch subagents and wait for their reports. This keeps the main agent's context clean for the judgment-heavy Phase 2 work.
Dispatch both subagents in a single message (one Agent tool call each) so they run concurrently. Do NOT set run_in_background: true — use foreground mode so you block until both return. Do not proceed to Phase 2 until both subagent results are in your conversation. Their reports are your primary evidence base.
Subagent 1 — Git Evidence (via Agent tool):
Brief the subagent with the project directory and ask it to run these git commands and report back:
- `git log --oneline -20`: recent commits including the session's output
- `git log --format='%h %s' --since="24 hours ago"`: today's session commits
- `git diff HEAD~N..HEAD --stat`: files changed across the session (adjust N to span it)
- `git tag --sort=-creatordate | head -5`: tags produced

The subagent should return a structured summary of what it finds: commit hashes, file change stats, and any tags.
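As an illustration of what a structured summary could look like, the sketch below runs the session-commit query and splits each line into a hash and a subject. The function names `parse_log` and `session_commits` are hypothetical. The skill itself only asks the subagent to run the commands and report back.

```python
import subprocess
from typing import List, Tuple


def parse_log(output: str) -> List[Tuple[str, str]]:
    """Split `git log --format='%h %s'` output into (hash, subject) pairs."""
    commits = []
    for line in output.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        commit_hash, _, subject = line.partition(" ")
        commits.append((commit_hash, subject))
    return commits


def session_commits(repo_dir: str, since: str = "24 hours ago") -> List[Tuple[str, str]]:
    """Run git log inside repo_dir and return the parsed commits."""
    result = subprocess.run(
        ["git", "log", "--format=%h %s", f"--since={since}"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise if git exits with an error
    )
    return parse_log(result.stdout)
```

Keeping the parser separate from the subprocess call means it can be checked against a captured log without needing a repository.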
Subagent 2 — Artifact State (via Agent tool):
Brief the subagent with the workspace root and project directory. Ask it to scan for operational artifacts:
- gates/)
- .claude/ directory)

The subagent should report what exists and its current state. Do not assume a fixed set of artifacts; different domains and projects produce different artifacts.
If the user can provide a session log path, have one of the subagents read that too.
Work through each section using your available evidence — session context in warm mode, subagent reports in cold mode, or both if subagent data supplements your session context. Don't ask the user to repeat what's already been discussed.
Phase 2 gathers both evidence and interpretation. The separation into distinct artifacts happens in Phase 3, not here. Work through the sections naturally — just be aware of which sections produce evidence (factual, verifiable data) and which produce interpretation (analysis, conclusions, recommendations).
What was built, modified, or fixed. Commit hashes, file counts, test counts. Quantitative, not narrative. This section's content goes into the evidence record.
What was the estimate (scope, complexity, timeline)? What actually happened? Where did the estimate diverge and why? This is analysis — it interprets what the evidence means.
3-5 specific successes. "The spec was tight" is vague. "The spec had five requirements with exact field names and code locations, eliminating all ambiguity" is useful.
Friction, failures, unexpected issues. Categorize:
Distill into reusable, actionable principles. Each lesson should be something a practitioner can apply to their next session. No generic platitudes. Lessons are interpretation — they are the operator's conclusions drawn from the evidence, not the evidence itself.
Concrete next steps with clear ownership:
Each action item should be specific enough to become a task or commit. Action items are interpretation — they are recommendations derived from analysis, not factual records.
This section fires only when the AAR reviews a closed gate: one that went through gate-review (so there is a predicted verdict to score) and reached a lived outcome (CLEARED, with any errata / re-reviews). For AARs of non-gate work, or work with no gate-review verdict, skip it entirely; the AAR is unchanged. This condition keeps the measurement from being imposed on every retrospective the chassis serves across the four domains.
When it fires, emit two measurements, following the schema, requirement-key registry, classification taxonomy, and placement rationale in references/attribution-ledger.md:
- gate-review's predicted verdict from the gate's frontmatter review header (verdict: / confidence:) and score it against the lived gate-work outcome (CLEARED status, ## Gate Errata, re-review history). Record predicted-vs-lived and whether they agree (accurate / optimistic / pessimistic). Calibration reads observable fields, so it is self-authored without cold treatment.
- GP-* / GR-Qnn keys in the registry), classify it load-bearing / inert / absent-but-needed / indeterminate, and anchor every non-indeterminate classification to a checkable locus (a gate checkpoint ID, a gate-work event, a ## Gate Errata entry, an AAR finding). indeterminate is the honest verdict when the evidence does not support a classification; never manufacture one.

Write the attribution as a fenced ledger block exactly per the reference schema, so the derive-script can extract it deterministically:
<!-- ledger:begin gate=Q{n} date=YYYY-MM-DD -->
**Calibration.** Predicted: `pass` @ confidence N (reviewed ...Z). Lived: CLEARED ...Z, K errata, M re-reviews. Verdict-calibration: **accurate|optimistic|pessimistic** — ...
| requirement_key | requirement | source | classification | locus |
|-----------------|-------------|--------|----------------|-------|
| GR-Q02 | every method a positive artifact | gate-review Q2 | load-bearing | <checkable locus> |
<!-- ledger:end -->
Self-authored, made safe structurally (the A4 decision). The attribution is authored by you, the AAR agent — not handed to a cold pass — because the locus-anchoring makes each classification checkable rather than narratable, and indeterminate removes the pressure to manufacture a verdict. The heavier cold guard is reserved for the irreversible action: the prune decision at pruning-review time, the operator's call, downstream of any single AAR. Both measurements are interpretation and belong in the interpretation artifact (the .md), never the .evidence.md record.
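To show why the `ledger:begin` / `ledger:end` markers make extraction deterministic, here is a minimal Python sketch that pulls attribution rows out of an AAR file. It is not the logic of scripts/ledger.py, which the source does not show. The function name `extract_ledger_rows` and the choice to merge the marker attributes (`gate`, `date`) into each row are assumptions.

```python
import re
from typing import Dict, List

# Matches one ledger block and captures the marker attributes and the body.
LEDGER_RE = re.compile(
    r"<!-- ledger:begin (?P<attrs>[^>]*?)\s*-->(?P<body>.*?)<!-- ledger:end -->",
    re.DOTALL,
)


def extract_ledger_rows(markdown: str) -> List[Dict[str, str]]:
    """Return one dict per attribution table row, tagged with gate and date."""
    rows = []
    for match in LEDGER_RE.finditer(markdown):
        # "gate=Q3 date=2025-01-02" -> {"gate": "Q3", "date": "2025-01-02"}
        attrs = dict(pair.split("=", 1) for pair in match.group("attrs").split())
        header = None
        for line in match.group("body").splitlines():
            line = line.strip()
            if not line.startswith("|"):
                continue  # skip the calibration sentence and blank lines
            cells = [cell.strip() for cell in line.strip("|").split("|")]
            if header is None:
                header = cells  # first table line holds the column names
            elif set(cells[0]) <= set("-: "):
                continue  # the |---| separator row
            else:
                rows.append({**attrs, **dict(zip(header, cells))})
    return rows
```

Only lines that start with `|` inside the markers are read, so the narrative around the block can change freely without affecting the extracted rows.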
Write two files to the domain knowledge repo's aar directory. The evidence record captures factual data. The AAR captures interpretation that references the evidence record.
${DOMAIN_REPO}/aar/{YYYY-MM-DD}-{project}-{brief-description}.evidence.md
Template:
# Evidence: {Project} — {Brief Description}
**Date:** {YYYY-MM-DD}
**Domain:** {domain name}
**Project:** {project name}
**Scope:** {what was attempted — 1 sentence}
## Commits
| Hash | Message | Files changed |
|------|---------|---------------|
| {hash} | {message} | {count} |
## Artifacts Produced
| Artifact | Path | Type |
|----------|------|------|
| {name} | {path} | {file/tag/config/etc} |
## Timeline
| Event | Timestamp |
|-------|-----------|
| {event} | {ISO 8601} |
## Metrics
{quantitative data — test counts, file counts, version numbers, durations, or other measurable outcomes. Omit this section if no metrics are relevant.}
The evidence record contains only verifiable data. No analysis, no conclusions, no recommendations. Every entry is independently checkable against git history, file system state, or other external sources.
${DOMAIN_REPO}/aar/{YYYY-MM-DD}-{project}-{brief-description}.md
Template:
# AAR: {Project} — {Brief Description}
**Date:** {YYYY-MM-DD}
**Domain:** {domain name}
**Project:** {project name}
**Evidence:** [{evidence filename}]({evidence filename})
## Expectations vs Reality
{estimate vs actual, divergence analysis}
## What Went Well
{specific successes with reasoning}
## What Didn't Go Well
{friction and failures, categorized by type}
## Lessons Learned
{actionable principles}
## Action Items
- [ ] {action item 1}
- [ ] {action item 2}
## Operator Notes
{leave blank for operator input}
The AAR is interpretation. It analyzes, concludes, and recommends. It references the evidence record for factual grounding — a reader who wants to verify a claim follows the Evidence link. The AAR does not duplicate the commit table, artifact list, or timeline from the evidence record.
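The two artifacts share one base name and differ only in suffix. A small Python sketch of building the pair follows; the helper name `aar_paths` is hypothetical, and the path pattern is the one given above.

```python
from datetime import date
from pathlib import Path
from typing import Tuple


def aar_paths(domain_repo: str, day: date, project: str, description: str) -> Tuple[Path, Path]:
    """Return (evidence_path, interpretation_path) sharing one base name."""
    base = f"{day.isoformat()}-{project}-{description}"
    aar_dir = Path(domain_repo) / "aar"
    return aar_dir / f"{base}.evidence.md", aar_dir / f"{base}.md"
```

Because both files live in the same directory, the Evidence link in the AAR header can use the bare evidence filename (`evidence_path.name`) as a relative link.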
After writing the attribution block into the interpretation artifact, regenerate the cross-gate ledger snapshot so it reflects the new entry. The snapshot is a derived projection of the AAR tables — never hand-edited; regenerate it and it is correct by construction (so it cannot drift from its source):
${CLAUDE_PLUGIN_ROOT}/skills/aar/scripts/ledger.py derive ${DOMAIN_REPO}/aar/ --scope {domain} --out ${DOMAIN_REPO}/aar/ledger-{YYYY-MM-DD}.tsv
Commit the refreshed snapshot alongside the AAR (Phase 5). The snapshot is the source the operator-run pruning review reads (ledger.py prune-review <snapshot> --threshold N); it surfaces a requirement as a prune candidate once its inert streak meets the threshold. The prune decision — removing a requirement from the bar — applies cold scrutiny and is the operator's, taken across domains' snapshots, never on one gate's observation.
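The threshold rule can be illustrated with a short Python sketch. This is not the behavior of `ledger.py prune-review`, which the source does not show. It assumes that a "streak" means the run of consecutive `inert` classifications at the end of a requirement's chronological history, and that any other classification (including `indeterminate`) resets it. The names `inert_streak` and `prune_candidates` are hypothetical.

```python
from typing import Dict, List


def inert_streak(classifications: List[str]) -> int:
    """Count consecutive 'inert' entries at the end of a chronological list."""
    streak = 0
    for value in reversed(classifications):
        if value != "inert":
            break  # ASSUMPTION: any non-inert classification resets the streak
        streak += 1
    return streak


def prune_candidates(history: Dict[str, List[str]], threshold: int) -> List[str]:
    """Return requirement keys whose trailing inert streak meets the threshold."""
    return sorted(
        key for key, values in history.items() if inert_streak(values) >= threshold
    )
```

The sketch only surfaces candidates. Removing a requirement remains the operator's decision, taken across domains' snapshots.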
Check if any lessons or action items should propagate to:
Note proposed propagations at the end of the AAR under a "Propagation" section. Don't make changes automatically — flag them for the operator to review.
Persist both artifacts to the domain knowledge repo in a single commit:
cd ${DOMAIN_REPO}
git add aar/{filename}.evidence.md aar/{filename}.md
git commit -m "aar: {project} — {brief-description}"
git push
Both files are committed together — the evidence record and interpretation are a pair. For a gate-closure AAR, add the refreshed ledger snapshot to the same commit (git add aar/ledger-{YYYY-MM-DD}.tsv). If the domain knowledge repo could not be resolved in the earlier step and the AAR was written to the workspace root, remind the operator to file it.
- ${CLAUDE_PLUGIN_ROOT}/foundation/EVIDENCE.md placement rationale. Read when emitting the gate-closure calibration/attribution section.
- scripts/ledger.py: derives the cross-gate snapshot from AAR attribution tables (derive) and surfaces prune candidates from a snapshot (prune-review). Invoked via ${CLAUDE_PLUGIN_ROOT}/skills/aar/scripts/ledger.py.

npx claudepluginhub basher83/domain-chassis --plugin domain-chassis