From wholework
Detect documentation/implementation drift and auto-generate Issues (`/audit drift`), and detect structural fragility (`/audit fragility`). AI detects semantic drift between Steering Documents + Project Documents and the codebase implementation, and auto-generates Issues for code-side fixes. Where `/doc sync` proposes documentation-side fixes, `/audit` is the complementary skill that creates Issues for code-side fixes. Running without arguments executes both the drift and fragility perspectives in one integrated run.

Additional subcommands:

- `/audit stats` aggregates Issue metadata across the project and generates a project health diagnostic report (throughput / composition / First-try success / Backlog Health, etc.), providing a third lens on project health alongside drift and fragility detection.
- `/audit stats --retention` adds phase/verify and Icebox dwell metrics (median/p95/30-day threshold violations, verify-type breakdown, Icebox dwell, trigger candidates) and posts escalation-based retire-proposal comments (30/60/90 days for verify, 90/180 days for Icebox).
- `/audit recoveries` reads the cross-Issue orchestration recovery log (`docs/reports/orchestration-recoveries.md`) and files Issues for recurring patterns that exceed a frequency threshold.
- `/audit progress <XL-parent-issue-number>` displays a sub-issue progress snapshot (status breakdown, phase distribution, time estimate, 24h activity) for the specified XL parent issue.
- `/audit auto-session <session-id>` generates the data layer of a `/auto` session retrospective report from `.tmp/auto-events.jsonl`, filtered by `session_id` (SESSION_ID is generated by `/auto` at startup as PID-timestamp).
- `/audit auto-session --full <session-id>` additionally generates LLM drafts for all 4 narrative sections (What worked / Limits and gaps / Improvement candidates surfaced / Conclusion) with `[LLM draft — human review required]` markers; no Issues are auto-filed (human gate preserved).

Both `auto-session` modes also generate a Japanese-translated sibling file at `{report-path-without-ext}-ja.md` by default; pass `--no-ja` to skip.
Slash command: `/wholework:audit`
Parse ARGUMENTS and route to the appropriate subcommand.
If ARGUMENTS contains --help, Read ${CLAUDE_PLUGIN_ROOT}/modules/skill-help.md and follow the "Processing Steps" section to output help, then stop.
If ARGUMENTS is drift or starts with drift (including options like --dry-run, --limit N): execute the "drift subcommand" section and exit.
If ARGUMENTS is fragility or starts with fragility (including options like --dry-run, --limit N): execute the "fragility subcommand" section and exit.
If ARGUMENTS is stats or starts with stats (including options like --since DATE, --limit N, --no-save, --retention): execute the "stats Subcommand" section and exit.
If ARGUMENTS is recoveries or starts with recoveries (including options like --dry-run, --limit N, --threshold K): execute the "recoveries subcommand" section and exit.
If ARGUMENTS is progress or starts with progress (e.g., progress <issue-number>): execute the "progress Subcommand" section and exit.
If ARGUMENTS is auto-session or starts with auto-session (e.g., auto-session <session-id>, auto-session --full <session-id>, auto-session --since 24h, auto-session <id> --output <path>, auto-session <id> --no-ja): execute the "auto-session Subcommand" section and exit.
If ARGUMENTS is empty (no arguments), --dry-run, or starts with --limit: execute the "Integrated Execution (drift + fragility)" section and exit.
For any other ARGUMENTS: display "Usage: /audit [drift|fragility|stats|recoveries|progress <issue-number>|auto-session <session-id>] [--dry-run] [--limit N] [--since DATE] [--no-save] [--retention] [--threshold K] [--no-ja] (running /audit without arguments executes drift + fragility integrated run)" and exit.
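
The routing rules above can be sketched as a small dispatcher. This is a hypothetical illustration, not the skill's actual implementation: the function name `route` and the returned labels are assumptions, and matching on the first whitespace-separated token is a simplification of "is X or starts with X".

```python
# Hypothetical sketch of the ARGUMENTS routing described above.
SUBCOMMANDS = ("drift", "fragility", "stats", "recoveries", "progress", "auto-session")


def route(arguments: str) -> str:
    """Return a label for the section that should handle ARGUMENTS."""
    tokens = arguments.split()
    # --help wins over everything else, as in the first routing rule.
    if "--help" in tokens:
        return "help"
    # Empty input, --dry-run, or a leading --limit runs drift + fragility together.
    if not tokens or tokens[0] in ("--dry-run", "--limit"):
        return "integrated"
    # A leading subcommand name selects that subcommand's section.
    if tokens[0] in SUBCOMMANDS:
        return tokens[0]
    # Anything else falls through to the usage message.
    return "usage"
```

For example, `route("stats --since 2026-01-01")` selects the stats section, while `route("bogus")` falls through to the usage message.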
Detect semantic drift between Steering Documents + Project Documents and codebase implementation, and generate Issues for code-side fixes.
Note: Run `/doc sync --deep` first to normalize document-side drift, then run `/audit drift` to detect remaining semantic drift that requires code-side fixes. This keeps `/audit drift` focused on drift that cannot be resolved by updating documentation alone.
Parse the following options from ARGUMENTS:
- `--dry-run`: display the drift report only without generating Issues
- `--limit N`: limit Issue generation to N items (in descending severity order)

Read `${CLAUDE_PLUGIN_ROOT}/modules/codebase-analysis.md` and follow the "Processing Steps" section to execute cross-codebase analysis.
Then collect documents using the following procedure:
Load Steering Documents:
Read ${CLAUDE_PLUGIN_ROOT}/modules/detect-config-markers.md and follow the "Processing Steps" section. Retain SPEC_PATH and STEERING_DOCS_PATH for use in subsequent steps.
Search for $STEERING_DOCS_PATH/product.md, $STEERING_DOCS_PATH/tech.md, $STEERING_DOCS_PATH/structure.md with Glob and Read any that exist. If none exist, display "Steering Documents not found. Run /doc init." and exit with error.
Load Project Documents:
Following the document traversal pattern from /doc, dynamically detect type: project documents using this procedure:
- `type: project` pattern limited to `*.md` files, getting a list of candidate file paths
- `$SPEC_PATH/`, `node_modules/`, `.git/`, `.tmp/`

Fetch existing open Issues (for duplicate check):
gh issue list --state open --json number,title,body --limit 100
The retrieved issue list is used for duplicate checking in Step 3 (after drift detection).
Cross-reference the Steering Documents, Project Documents, and codebase analysis results collected in Step 1 to detect semantic drift.
Steering Documents categories (examples):
| Category | Detection method |
|---|---|
| tech.md Architecture Decisions vs actual code | Compare with Read + Grep pattern comparison (inconsistencies between documented architecture and actual code) |
| tech.md Key Dependencies vs actual dependencies | Extract actual deps from package.json/go.mod etc. with Grep → compare with tech.md table |
| tech.md Coding Conventions vs actual code | Detect naming convention violations / Forbidden Expressions usage with Grep |
| structure.md Directory Layout vs actual directory | Get actual directory listing with ls + Glob → diff against structure.md entries |
| structure.md Key Files vs actual files | Detect absent listed files and unlisted important files with Glob |
| product.md Non-Goals vs implementation | AI judgment to detect implemented features that violate Non-Goals |
| product.md Terms vs code terminology | Detect usages of different notation from defined terms with Grep |
Project Documents categories (examples):
| Category | Detection method |
|---|---|
| workflow.md skill list vs actual skills | Match Glob results of skills/*/SKILL.md against skill names/subcommands listed in workflow.md |
| workflow.md phase descriptions vs SKILL.md implementation | Compare phase role descriptions (routing, options, etc.) with actual behavior in SKILL.md via Read |
| workflow.md path references vs actual files | Extract path references (like skills/<name>/SKILL.md) with Grep → verify file existence with Glob |
| docs/environment-adaptation.md Layer 3 Domain Files table vs bundled Domain file frontmatter | (1) Glob ${CLAUDE_PLUGIN_ROOT}/skills/**/*.md and ${CLAUDE_PLUGIN_ROOT}/modules/**/*.md; for each file Read its frontmatter and collect files with type: domain → "actual Domain files". (2) Read docs/environment-adaptation.md → extract all rows from the "Domain Files (exhaustive)" table under Layer 3. (3) Report three drift sub-types: table-missing (a file has type: domain frontmatter but is not listed in the table); file-or-frontmatter-missing (a table row's file does not exist or lacks type: domain frontmatter); load_when-mismatch (the load_when column value in the table differs from the load_when: block in the file's frontmatter) |
| Capability guidance leaking into eager-loaded common modules | Run ${CLAUDE_PLUGIN_ROOT}/scripts/check-eager-load-capability.sh and include its output in the drift report. The script does the following: (1) Glob modules/{name}-adapter.md to enumerate capability names; (2) detect places where a capability name appears in section headings (table rows excluded) in the body of modules/verify-patterns.md and modules/verify-executor.md; (3) check whether the corresponding Domain file skills/*/{name}-guidance.md exists; (4) record an Issue candidate when the Domain file does not exist |
Severity scoring (AI judgment):
Assign severity to each detection result using these guidelines (not strict rule-based, AI judgment):
Semantically compare the detected drift against existing open Issues retrieved in Step 1.
Reference titles and bodies; if the content is similar to an existing Issue (pointing out the same drift), judge as duplicate and skip. Duplicate check is AI-judgment-based.
Display duplicates as "duplicate (existing Issue #N)" in the results report.
Display drift detection results in table format:
| No | Category | Severity | Description | Affected Files | Duplicate |
|----|---------|----------|-------------|---------------|-----------|
| 1 | tech.md Coding Conventions | high | ... | skills/foo/SKILL.md | - |
| 2 | workflow.md skill list | medium | ... | docs/workflow.md | - |
| 3 | structure.md Key Files | low | ... | docs/structure.md | existing #123 |
In --dry-run mode: display the table and exit (do not generate Issues).
In normal mode:
If --limit N is specified, select N items in descending severity order. Exclude duplicates ("existing #N") from the count.
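
A minimal sketch of this selection rule, assuming each finding is a dict with a `severity` of `high`, `medium`, or `low` (the values shown in the example table) and a `duplicate` field that is `None` for new findings. The field names and the function name are hypothetical.

```python
# Hypothetical sketch: pick up to N non-duplicate findings, highest severity first.
SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def select_for_generation(findings, limit=None):
    """Drop duplicates, order by descending severity, then apply --limit."""
    # Duplicates never count toward the limit, so filter them out first.
    candidates = [f for f in findings if f.get("duplicate") is None]
    # sorted() is stable, so findings of equal severity keep their report order.
    candidates = sorted(candidates, key=lambda f: SEVERITY_ORDER[f["severity"]])
    return candidates if limit is None else candidates[:limit]
```

Filtering before slicing matters: with `--limit 2`, a high-severity duplicate should not consume one of the two slots.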
Ask the user with AskUserQuestion (non-interactive mode: auto-resolve — automatically select "Generate all" for non-duplicate items up to --limit N; record the decision in an issue comment):
If "Cancel": display "Issue generation cancelled." and exit.
Generate Issues in /issue standard format for approved drift items.
Table row addition verify commands:
When the detected drift involves adding a row to a documentation table (e.g., adding a new entry to a | Name | Path | Role | table in docs/structure.md), generate a grep + section_contains pair from the start — do not generate grep alone:
<!-- verify: grep "{row-keyword}" "{target-file}" -->
<!-- verify: section_contains "{target-file}" "{section-heading}" "{row-keyword}" -->
- `{row-keyword}`: a keyword that uniquely identifies the new table row (e.g., the script name or module name being added)
- `{section-heading}`: the section heading that contains the table. Selection rule:
  - ### Scripts), use that heading
  - ## Key Files(Required))

Rationale: `grep` alone cannot verify that the keyword appears in the correct section. The `section_contains` command ensures the entry is placed in the expected table, not elsewhere in the file. This matches the guidance in `modules/verify-patterns.md` §5.
Each Issue body:
## Background
{Context where the drift was found, quoting the relevant Steering/Project Document section}
## Purpose
{Problem resolved by the fix}
## Acceptance Conditions
### Pre-merge (automated verification)
- [ ] <!-- verify: {verify command} --> {condition 1}
- [ ] {condition 2}
### Post-merge
- [ ] {verification items}
Verify command validity check (before creating Issues):
Before calling gh issue create, validate each <!-- verify: {verify command} --> in the generated Issue body:
- Command type: the command must be one defined in `modules/verify-executor.md` (e.g., `file_exists`, `file_contains`, `section_contains`, `grep`, `command`, `github_check`, `rubric`, etc.). If the command type is unknown, replace it with a valid known type or remove the verify comment.
- Empty arguments: no argument may be empty (e.g., `file_not_contains "path" ""`; the empty second argument is invalid). If an empty argument is found, fix the command with a meaningful value or remove the verify comment.

Fix any invalid verify commands before proceeding to Issue creation.
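
The two checks can be sketched as follows. This is an illustration under stated assumptions: `KNOWN_TYPES` only contains the command names mentioned in this document (the authoritative list lives in `modules/verify-executor.md`), and the function name and problem labels are hypothetical.

```python
import re
import shlex

# Assumption: only the command types named in this document; the real list
# is defined in modules/verify-executor.md and may be longer.
KNOWN_TYPES = {
    "file_exists", "file_contains", "file_not_contains", "section_contains",
    "grep", "command", "github_check", "rubric",
}

VERIFY_RE = re.compile(r"<!--\s*verify:\s*(.*?)\s*-->")


def find_invalid_verify_commands(issue_body: str):
    """Return (command, problem) pairs for verify comments that need fixing."""
    problems = []
    for command in VERIFY_RE.findall(issue_body):
        try:
            # shlex keeps an empty quoted string ("") as an empty argument.
            parts = shlex.split(command)
        except ValueError:
            problems.append((command, "unparseable"))
            continue
        if not parts or parts[0] not in KNOWN_TYPES:
            problems.append((command, "unknown command type"))
        elif any(arg == "" for arg in parts[1:]):
            problems.append((command, "empty argument"))
    return problems
```

An empty result means the Issue body can go to `gh issue create` unchanged; otherwise each reported command is fixed or its verify comment removed first.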
Label assignment:
After Issue generation, assign the following label:
- `audit/drift`: tracking label indicating the drift was detected by the audit skill

Do not assign the `triaged` label when creating Issues. The `triaged` label is assigned by the `/triage` skill after triage is actually executed; pre-assigning it causes the Issue to be skipped by the triage pipeline, leaving Type/Size/Priority/Value unset.
Type/Size assignment:
Set Type and Size from AI estimation of drift scope (update project fields via ${CLAUDE_PLUGIN_ROOT}/scripts/gh-graphql.sh).
After generation:
Display the list of generated Issue numbers and titles.
Then read ${CLAUDE_PLUGIN_ROOT}/modules/steering-hint.md and follow the "Processing Steps" section.
Aggregate Issue metadata across the project and generate a project health diagnostic report. This is a read-only tool — it generates new docs/stats/YYYY-MM-DD.md files only, and does not edit existing files or create Issues.
Parse the following options from ARGUMENTS:
- `--since DATE`: aggregation start date (default: 90 days before today; format: YYYY-MM-DD)
- `--limit N`: maximum number of Issues to fetch (default: 500)
- `--no-save`: skip saving to `docs/stats/`; output to stdout only
- `--retention`: enable retention analysis: compute phase/verify and Icebox dwell metrics, and post escalation-based retire-proposal comments

Fetch Issue list:
gh issue list --state all --json number,title,body,labels,createdAt,closedAt,state --limit {N}
Filter to Issues created on or after --since DATE. If --since is not specified, use 90 days before today as the default.
Fetch timeline items for each Issue (for reopen and phase label transition analysis):
For each Issue in the filtered list:
${CLAUDE_PLUGIN_ROOT}/scripts/gh-graphql.sh --query get-issue-timeline -F num={number} \
--jq '.data.repository.issue'
Extract the following from timelineItems:
- `ReopenedEvent` entries → mark the Issue as having reopen history
- `LabeledEvent` and `UnlabeledEvent` entries for `phase/*` labels → record the sequence of phase transitions in chronological order

Spec file existence check (for retrospective presence):
Use Glob to check whether $SPEC_PATH/issue-{number}-*.md exists for each Issue. Record existence as a boolean (do not read Spec content).
- phase/done AND has no reopen history
- phase/done (reopen history does not affect this)
- phase/verify back to phase/code

For each Issue in the filtered list, resolve Type, Size, and Priority from GitHub Projects fields (with label fallback) by calling the helper scripts:
${CLAUDE_PLUGIN_ROOT}/scripts/get-issue-type.sh {number} # -> Bug / Feature / Task (empty if unset)
${CLAUDE_PLUGIN_ROOT}/scripts/get-issue-size.sh {number} # -> XS / S / M / L / XL (exit 1 if unset)
${CLAUDE_PLUGIN_ROOT}/scripts/get-issue-priority.sh {number} # -> urgent / high / medium / low (exit 1 if unset)
Classify as "unset" when the script exits with a non-zero status or outputs an empty string. The gh-graphql.sh --cache flag used internally in each script deduplicates GraphQL requests for the same Issue.
Classify each Issue by checking whether its title or body contains any of the following keywords (case-sensitive partial match). Assign the first matching segment in order; if none match, assign "other".
| Segment | Keywords |
|---|---|
| ui/design | UI, デザイン, 画面, レイアウト, Figma, design |
| backend | API, サーバー, サーバ, DB, データベース, backend |
| infra | CI, CD, Docker, 環境, deploy, インフラ, runner |
| docs | ドキュメント, doc, README, CLAUDE.md, 文書 |
| test | テスト, test, bats, spec |
| other | (none of the above) |
This section is structured as an independent subsection to allow future replacement with LLM-based classification.
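
The keyword table can be read as an ordered, case-sensitive substring match. A minimal sketch, with the keyword lists copied from the table above; the function name `classify_segment` is hypothetical.

```python
# Segments in table order; the first segment with any matching keyword wins.
SEGMENTS = [
    ("ui/design", ["UI", "デザイン", "画面", "レイアウト", "Figma", "design"]),
    ("backend", ["API", "サーバー", "サーバ", "DB", "データベース", "backend"]),
    ("infra", ["CI", "CD", "Docker", "環境", "deploy", "インフラ", "runner"]),
    ("docs", ["ドキュメント", "doc", "README", "CLAUDE.md", "文書"]),
    ("test", ["テスト", "test", "bats", "spec"]),
]


def classify_segment(title: str, body: str) -> str:
    """Assign the first segment whose keyword occurs in the title or body."""
    text = f"{title}\n{body}"
    for segment, keywords in SEGMENTS:
        # Plain `in` is a case-sensitive partial match, as the rule requires.
        if any(keyword in text for keyword in keywords):
            return segment
    return "other"
```

Because matching is case-sensitive, a title such as "ui tweak" falls through to `other`, while "Fix API timeout" is classified as `backend`.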
Classify each Issue based on its labels:
- `audit/drift` label present → audit (drift)
- `audit/fragility` label present → audit (fragility)
- `retro/verify` label present → retrospective

Note: the `retro/verify` label may not yet exist in the repository. When the label is absent or no Issue has it, the "retrospective" category will show 0; this is expected behavior. Once the companion Issue adding `retro/verify` label assignment to `/verify` Step 13 is merged, retrospective-derived Issues will be separated automatically.
Filter definition for excluding Issues labeled `retro/verify` from the aggregation of all sub-items in Section 5 (Outcome).
Compute the following from filtered_issues:
- `retro_verify_count`: number of Issues in `filtered_issues` that have the `retro/verify` label
- `outcome_population`: `filtered_issues` minus Issues that have the `retro/verify` label

`retro/verify` Issues represent wholework infrastructure improvement proposals surfaced by the `/verify` phase and closed as "not planned" after upstream migration. They are not implementation failures and must not affect First-try success rate, Completed rate, Rework, or Phase regression calculations.
Note: Section 4 (Work Origin) uses the full filtered_issues as its population and is not affected by this filter. retro/verify Issues are already classified as the "retrospective" category in Section 4.
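
As a rough sketch, the filter amounts to one partition of `filtered_issues`. It assumes each Issue carries `labels` as a list of objects with a `name` key, which is the shape `gh issue list --json labels` returns; the function name is hypothetical.

```python
RETRO_LABEL = "retro/verify"


def split_outcome_population(filtered_issues):
    """Return (retro_verify_count, outcome_population) for Section 5."""
    def has_retro_label(issue):
        return any(label["name"] == RETRO_LABEL for label in issue["labels"])

    # Issues without the label form the population for all Outcome sub-items.
    outcome_population = [i for i in filtered_issues if not has_retro_label(i)]
    retro_verify_count = len(filtered_issues) - len(outcome_population)
    return retro_verify_count, outcome_population
```

Section 4 keeps using the unfiltered list, so only the Outcome calculations should consume `outcome_population`.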
Split the past 90 days into three 30-day windows (window 1: oldest, window 3: most recent). For each window, compute:
- triaged label

For each Issue that has (or had) a `phase/verify` label, compute the dwell time as:

- `phase/verify` LabeledEvent to today
- `phase/verify` LabeledEvent to the most recent `phase/verify` UnlabeledEvent

Compute the following aggregates across all Issues currently in `phase/verify`:
Scan each Issue currently labeled phase/verify (closed state) for unchecked (- [ ]) lines containing verify-type: observation. Count the number of such Issues.
Scan each Issue currently labeled phase/verify (closed state) for unchecked (- [ ]) lines containing verify-type: opportunistic. Count the number of such Issues.
Scan each Issue currently labeled phase/verify (closed state) for unchecked (- [ ]) lines containing verify-type: manual. Count the number of such Issues.
For each Issue currently in phase/verify, compute the dwell time from the most recent phase/verify LabeledEvent to today. Collect all Issues with dwell time ≥ 30 days as threshold violation candidates. Record the list with Issue number, title, and dwell days.
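
The dwell aggregates (median, p95, 30-day threshold violations) could be computed along these lines. The source does not say which percentile method to use, so the nearest-rank p95 here is an assumption, as are the function names.

```python
import math
from datetime import datetime, timezone
from statistics import median


def dwell_days(labeled_at: datetime, now: datetime) -> int:
    """Whole days between the phase/verify LabeledEvent and now."""
    return (now - labeled_at).days


def p95_nearest_rank(values):
    """95th percentile using the nearest-rank method (assumed, not specified)."""
    ordered = sorted(values)
    rank = math.ceil(0.95 * len(ordered))
    return ordered[rank - 1]


def retention_summary(dwell_by_issue, threshold_days=30):
    """Summarize dwell days keyed by Issue number; None when nothing is in phase/verify."""
    if not dwell_by_issue:
        return None
    days = list(dwell_by_issue.values())
    return {
        "median": median(days),
        "p95": p95_nearest_rank(days),
        # Issues at or above the threshold are violation candidates.
        "violations": sorted(n for n, d in dwell_by_issue.items() if d >= threshold_days),
    }
```

With dwell times of 5, 10, 31, and 45 days, the median is 20.5, the nearest-rank p95 is 45, and the last two Issues are reported as threshold violations.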
For Issues with Project Status=Icebox (fetched via ${CLAUDE_PLUGIN_ROOT}/scripts/gh-graphql.sh --query get-projects-with-fields), compute the dwell time as days from each Issue's createdAt to today.
Compute the following aggregates across all Icebox Issues:
For each Icebox Issue, scan the Issue body for re-evaluation trigger text (lines containing keywords such as 「再評価トリガー」, "re-evaluation trigger", or "trigger"). For each trigger found, apply heuristic judgment:
- #123), check whether that Issue is CLOSED via `gh issue view`.

Record Issues where at least one trigger heuristic evaluates to true as "trigger fire candidates".
Collect items meeting any of the following criteria to display in the Highlights section:
Highlights contain only auto-detected items. Do not include interpretation or inference in the report.
Generate a Markdown report containing all 6 sections below. Output to stdout.
List items that meet the auto-detection criteria from Step 2. If no items meet the criteria, output "No highlights detected."
Do not interpret or infer. Only enumerate items that meet the detection thresholds.
Display Created / Closed / Net / Open end for each of the three 30-day windows in table format.
| Window | Period | Created | Closed | Net | Open end |
|--------|--------|---------|--------|-----|----------|
| W1 (oldest) | YYYY-MM-DD – YYYY-MM-DD | N | N | N | N |
| W2 | ... | N | N | N | N |
| W3 (recent) | ... | N | N | N | N |
Display counts by Type, Size, and Priority. Also show the ratio change for the most recent 30-day window vs. the prior two windows combined.
Display distribution of audit (drift) / audit (fragility) / retrospective / manual. Include percentage of total.
If the retro/verify label does not exist, display "retrospective" as 0 with a note: "(retro/verify label not yet assigned — will be separated once companion Issue is merged)".
Display the exclusion note first:
Outcome 集計対象: N 件 (うち retro/verify ラベル付き M 件を除外)
where N = count(outcome_population) and M = retro_verify_count. Always display this note even when M = 0.
All sub-items below are computed using outcome_population (i.e., filtered_issues with retro/verify-labeled Issues excluded):
- phase/verify → phase/code regressions occurred most frequently
- outcome_population)

Display the following:

- triaged label)

Display the following (computed from the Step 2 metrics):

- verify-type: observation ACs
- verify-type: opportunistic ACs

If there are no Issues currently in phase/verify, display "No Issues currently in phase/verify."
Skip this entire section when --retention is not specified.
When --retention is specified, append the following after Section 7.
Display the following table with threshold warnings:
| Metric | Value | Threshold | Status |
|---|---|---|---|
| phase/verify dwell (median) | N days | > 14 days | OK / WARNING |
| phase/verify dwell (p95) | N days | > 30 days | OK / WARNING |
| Observation waiting | N | > 10 | OK / WARNING |
| Opportunistic waiting | N | > 10 | OK / WARNING |
| Manual waiting | N | > 5 | OK / WARNING |
| 30-day threshold violations | N | > 0 | OK / WARNING |
List 30-day threshold violation Issues (if any) with Issue number, title, and dwell days.
Display the following table with threshold warnings:
| Metric | Value | Threshold | Status |
|---|---|---|---|
| Icebox dwell (median) | N days | > 90 days | OK / WARNING |
| Icebox dwell (p95) | N days | > 180 days | OK / WARNING |
| Icebox count | N | — | — |
| Trigger fire candidates | N | > 0 | OK / NOTIFY |
List trigger fire candidate Issues (if any) with Issue number, title, and the matched trigger text.
For each Issue currently in phase/verify, compute dwell days and call:
${CLAUDE_PLUGIN_ROOT}/scripts/compute-escalation-level.sh verify <dwell_days>
Route by escalation level:
- stale-verify label via `gh issue edit --add-label stale-verify`

For each Icebox Issue, compute dwell days and call:
${CLAUDE_PLUGIN_ROOT}/scripts/compute-escalation-level.sh icebox <dwell_days>
Route by escalation level:
Duplicate prevention: before posting a comment, fetch existing comments via gh issue view --json comments and check for a comment containing the same escalation level marker (<!-- escalation-level: N -->). If found, skip that Issue — do not post a duplicate.
Comment format (include escalation level marker for duplicate prevention):
<!-- escalation-level: {N} -->
## phase/verify Retention Notice (Level {N})
This Issue has been in `phase/verify` for **{dwell_days} days**.
{Level-specific message}
For Icebox comments, use ## Icebox Retention Notice (Level {N}) as the heading.
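
The duplicate-prevention rule reduces to a marker lookup before posting. A minimal sketch for the phase/verify case, assuming comment bodies have already been fetched as strings (for example from `gh issue view --json comments`); the function names are hypothetical.

```python
def escalation_marker(level: int) -> str:
    """Hidden HTML comment that identifies a notice for a given level."""
    return f"<!-- escalation-level: {level} -->"


def should_post(existing_comment_bodies, level: int) -> bool:
    """Post only if no existing comment already carries this level's marker."""
    marker = escalation_marker(level)
    return not any(marker in body for body in existing_comment_bodies)


def build_verify_comment(level: int, dwell_days: int, message: str) -> str:
    """Render the phase/verify notice in the comment format shown above."""
    return "\n".join([
        escalation_marker(level),
        f"## phase/verify Retention Notice (Level {level})",
        f"This Issue has been in `phase/verify` for **{dwell_days} days**.",
        message,
    ])
```

Because the marker encodes the level, an Issue that already received a Level 2 notice is skipped for Level 2 but still eligible for a Level 3 notice later.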
If --no-save is specified: output to stdout only and exit.
If --no-save is not specified:
- YYYY-MM-DD format
- `mkdir -p docs/stats`
- `docs/stats/YYYY-MM-DD.md` (overwrite if the file already exists for the same date). When `--retention` is specified, the Sections 8 and 9 retention output is included in the saved file.

Then read `${CLAUDE_PLUGIN_ROOT}/modules/steering-hint.md` and follow the "Processing Steps" section.
Detect structural fragility based on project context (Steering Documents) and generate risk improvement Issues.
Parse the following options from ARGUMENTS (same system as drift):
- `--dry-run`: display the fragility report only without generating Issues
- `--limit N`: limit Issue generation to N items (in descending severity order)

Read `${CLAUDE_PLUGIN_ROOT}/modules/codebase-analysis.md` and follow the "Processing Steps" section to execute cross-codebase analysis.
Then collect documents using the following procedure:
Load Steering Documents:
Read ${CLAUDE_PLUGIN_ROOT}/modules/detect-config-markers.md and follow the "Processing Steps" section. Retain SPEC_PATH and STEERING_DOCS_PATH for use in subsequent steps.
Search for $STEERING_DOCS_PATH/product.md, $STEERING_DOCS_PATH/tech.md, $STEERING_DOCS_PATH/structure.md with Glob and Read any that exist. If none exist, display "Steering Documents not found. Run /doc init." and exit with error.
Load Project Documents:
Following the document traversal pattern from /doc, dynamically detect type: project documents using this procedure:
- `type: project` pattern limited to `*.md` files, getting a list of candidate file paths
- `$SPEC_PATH/`, `node_modules/`, `.git/`, `.tmp/`

Fetch existing open Issues (for duplicate check):
gh issue list --state open --json number,title,body --limit 100
The retrieved issue list is used for duplicate checking in Step 3 (after fragility detection).
Based on context collected in Step 1, detect structural fragility in the following 5 categories.
Detection categories (exhaustive):
| Category | Detection method |
|---|---|
| Missing tests for core modules | For modules positioned as core in product.md / structure.md, check for test files in tests/ with Glob. Detect modules without tests (test coverage gap detection) |
| Architecture Decisions violations | Read the Architecture Decisions section of tech.md and detect code patterns contradicting the documented design decisions with Grep + Read |
| Missing error handling for critical external deps | Identify call sites for dependencies deemed critical in tech.md Key Dependencies with Grep, and verify presence/absence of try/catch etc. error handling |
| Single point of failure | Identify files that many modules depend on from structure.md dependency relations, and verify presence/absence of corresponding tests/documentation |
| Scattered configuration | Detect cases where environment variables/config values are scattered across multiple files without an SSoT (Single Source of Truth) with Grep, and cross-reference with tech.md descriptions |
Severity scoring (AI judgment):
Use the same guidelines as drift:
Boundary with drift:
If the same location applies to both, prioritize drift and skip the fragility side.
Semantically compare the detected fragility against existing open Issues retrieved in Step 1.
Reference titles and bodies; if the content is similar to an existing Issue (pointing out the same fragility), judge as duplicate and skip. Duplicate check is AI-judgment-based.
Display duplicates as "duplicate (existing Issue #N)" in the results report.
Display fragility detection results in table format:
| No | Category | Severity | Description | Affected Files | Duplicate |
|----|---------|----------|-------------|---------------|-----------|
| 1 | Missing tests for core modules | high | ... | modules/foo.md | - |
| 2 | Architecture Decisions violations | medium | ... | skills/bar/SKILL.md | - |
| 3 | Scattered configuration | low | ... | scripts/setup.sh | existing #456 |
In --dry-run mode: display the table and exit (do not generate Issues).
In normal mode:
If --limit N is specified, select N items in descending severity order. Exclude duplicates from the count.
Ask the user with AskUserQuestion (non-interactive mode: auto-resolve — automatically select "Generate all" for non-duplicate items up to --limit N; record the decision in an issue comment):
If "Cancel": display "Issue generation cancelled." and exit.
Generate Issues in /issue standard format for approved fragility items.
Each Issue body:
## Background
{Context where the fragility was found, quoting the relevant Steering/Project Document section}
## Purpose
{Risk reduced by the improvement}
## Acceptance Conditions
### Pre-merge (automated verification)
- [ ] <!-- verify: {verify command} --> {condition 1}
- [ ] {condition 2}
### Post-merge
- [ ] {verification items}
Label assignment:
After Issue generation, assign the following label:
- `audit/fragility`: tracking label indicating the fragility was detected by the audit skill

Do not assign the `triaged` label when creating Issues. The `triaged` label is assigned by the `/triage` skill after triage is actually executed; pre-assigning it causes the Issue to be skipped by the triage pipeline, leaving Type/Size/Priority/Value unset.
Type/Size assignment:
Set Type and Size from AI estimation of fragility scope (update project fields via ${CLAUDE_PLUGIN_ROOT}/scripts/gh-graphql.sh).
After generation:
Display the list of generated Issue numbers and titles.
Then read ${CLAUDE_PLUGIN_ROOT}/modules/steering-hint.md and follow the "Processing Steps" section.
Read the cross-Issue orchestration recovery log and file Issues for patterns that exceed a frequency threshold. Mirrors the drift/fragility subcommand structure.
Parse the following options from ARGUMENTS:
- `--dry-run`: display candidates only without generating Issues
- `--limit N`: limit Issue generation to N items (in descending frequency order)
- `--threshold K`: minimum recurrence count to qualify as a candidate (default: 3)

Read the recovery log:
Read docs/reports/orchestration-recoveries.md. If the file does not exist, display "Recovery log not found: docs/reports/orchestration-recoveries.md. Recovery events are written by /auto Step 4a." and exit.
Fetch existing open Issues (for duplicate check):
gh issue list --state open --json number,title,body --limit 100
Write the JSON to .tmp/open-issues-recoveries.json.
Run the candidate detection script:
${CLAUDE_PLUGIN_ROOT}/scripts/collect-recovery-candidates.sh docs/reports/orchestration-recoveries.md --threshold ${K} --issues-json .tmp/open-issues-recoveries.json
Where ${K} is the value from --threshold (default: 3).
The script outputs <symptom-short>\t<count> lines for each qualifying candidate.
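
A sketch of how the tab-separated output might be consumed, assuming one `<symptom-short>\t<count>` pair per line and nothing else on stdout. The function name `parse_candidates` is hypothetical.

```python
def parse_candidates(script_output: str):
    """Parse '<symptom-short>\\t<count>' lines into (symptom, count) tuples."""
    candidates = []
    for line in script_output.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        # Split on the last tab so the count is always the final field.
        symptom, _, count = line.rpartition("\t")
        candidates.append((symptom, int(count)))
    # Descending frequency, matching the order --limit N applies.
    return sorted(candidates, key=lambda c: c[1], reverse=True)
```

Sorting at parse time means the later `--limit N` step can simply take the first N non-duplicate rows.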
For each candidate from Step 2, perform a semantic duplicate check against the open Issues retrieved in Step 1:
- `collect-recovery-candidates.sh` script already excludes exact substring matches

Display recovery candidate results in table format:
| No | Symptom | Occurrences | Duplicate |
|----|---------|-------------|-----------|
| 1 | gh-pr-list-head-glob | 4 | - |
| 2 | verify-timeout-exceeded | 3 | existing #311 |
Clean up temp file: rm -f .tmp/open-issues-recoveries.json
In --dry-run mode: display the table and exit (do not generate Issues).
In normal mode:
If --limit N is specified, select N items in descending frequency order. Exclude duplicates from the count.
Ask the user with AskUserQuestion (non-interactive mode: auto-resolve — automatically select "Generate all" for non-duplicate items up to --limit N; record the decision in an Issue comment):
If "Cancel": display "Issue generation cancelled." and exit.
Generate Issues for approved candidates.
Each Issue body:
## Background
Recurring orchestration recovery pattern detected by `/audit recoveries`:
- Symptom: {symptom-short}
- Occurrences: {count} (threshold: {K})
- Recent examples from `docs/reports/orchestration-recoveries.md`:
{quote 1-3 representative Diagnosis + Recovery Applied sections}
## Purpose
{Describe what structural fix would prevent recurrence of this recovery pattern}
## Acceptance Conditions
### Pre-merge (automated verification)
- [ ] <!-- verify: {verify command} --> {condition 1}
- [ ] {condition 2}
### Post-merge
- [ ] {verification items}
Label assignment:
After Issue generation, assign audit/fragility (recovery patterns are structural fragility by nature).
Do not assign the triaged label.
Type/Size assignment:
Set Type and Size from AI estimation of recovery pattern scope (update project fields via ${CLAUDE_PLUGIN_ROOT}/scripts/gh-graphql.sh).
Update log entries after filing:
For each filed Issue, update the corresponding log entries in docs/reports/orchestration-recoveries.md:
Find all entries where symptom-short matches and Improvement Candidate is 未起票. Replace - 未起票 with - 起票済み #NNN (where NNN is the new Issue number) using the Edit tool.
Commit the log update:
git add docs/reports/orchestration-recoveries.md
git commit -s -m "chore: update recovery log after /audit recoveries filing
Co-Authored-By: Claude Sonnet 4.6 <[email protected]>"
git push origin HEAD 2>/dev/null || git push origin main
After generation:
Display the list of generated Issue numbers and titles.
Then read ${CLAUDE_PLUGIN_ROOT}/modules/steering-hint.md and follow the "Processing Steps" section.
Display a progress snapshot of sub-issues under a specified XL parent issue.
Extract the parent issue number from ARGUMENTS (e.g., `progress 1000` → parent number 1000). If no issue number is provided, display "Usage: /audit progress <XL-parent-issue-number>" and exit.
Run:
"${CLAUDE_PLUGIN_ROOT}/scripts/get-sub-issue-progress.sh" <parent-number>
The script outputs JSON with the following structure:
{
"parent": { "number": 1000, "title": "..." },
"sub_issues": [
{
"number": 1001,
"title": "...",
"state": "OPEN",
"createdAt": "2026-06-01T00:00:00Z",
"closedAt": null,
"updatedAt": "2026-06-10T12:00:00Z",
"labels": [{ "name": "phase/code" }],
"blockedBy": [{ "number": 1005, "state": "OPEN" }]
}
]
}
If the script exits non-zero, display the error and exit.
For each sub-issue, classify status using the following priority order (first matching rule wins):
| Priority | Status | Classification rule |
|---|---|---|
| 1 | Done | state == "CLOSED" |
| 2 | Blocked | state == "OPEN" AND any entry in blockedBy has state == "OPEN" |
| 3 | Stale | state == "OPEN" AND labels contain stale-verify |
| 4 | In progress | state == "OPEN" AND labels contain any of: phase/code, phase/review, phase/verify, phase/spec |
| 5 | Pending | all other OPEN issues (including phase/issue, phase/ready, or no phase label) |
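The priority table above can be read as a first-match-wins function. The following is a minimal sketch, assuming each sub-issue is a dict shaped like the script's JSON output and that state is always either "OPEN" or "CLOSED"; the function name `classify_status` is hypothetical and not part of the plugin.

```python
# Labels that mark an OPEN sub-issue as "In progress" (rule 4 in the table).
IN_PROGRESS_LABELS = {"phase/code", "phase/review", "phase/verify", "phase/spec"}


def classify_status(sub_issue: dict) -> str:
    """Return the status of one sub-issue; the first matching rule wins."""
    # Rule 1: closed issues are Done regardless of labels or blockers.
    if sub_issue["state"] == "CLOSED":
        return "Done"
    labels = {label["name"] for label in sub_issue.get("labels", [])}
    # Rule 2: any blocker that is still OPEN makes the issue Blocked.
    if any(b["state"] == "OPEN" for b in sub_issue.get("blockedBy", [])):
        return "Blocked"
    # Rule 3: stale-verify outranks the phase labels.
    if "stale-verify" in labels:
        return "Stale"
    # Rule 4: an active phase label.
    if labels & IN_PROGRESS_LABELS:
        return "In progress"
    # Rule 5: everything else that is OPEN (phase/issue, phase/ready, no phase).
    return "Pending"
```

Because the checks run in priority order, an OPEN issue that carries phase/code and also has an OPEN blocker is reported as Blocked, not In progress.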
Status counts: Count sub-issues per status (Done / In progress / Blocked / Stale / Pending).
Phase distribution: For In progress and Blocked sub-issues, count by phase label (phase/issue, phase/spec, phase/ready, phase/code, phase/review, phase/verify). An issue with no phase label counts under "no phase".
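As an illustration of the phase count, here is a small sketch. It assumes the caller has already filtered the list down to In progress and Blocked sub-issues, and it counts an issue with several phase labels once under each label, which the text does not specify. The function name `phase_distribution` is hypothetical.

```python
from collections import Counter

PHASE_LABELS = [
    "phase/issue", "phase/spec", "phase/ready",
    "phase/code", "phase/review", "phase/verify",
]


def phase_distribution(sub_issues: list) -> dict:
    """Count sub-issues per phase label; unlabeled issues go under 'no phase'."""
    counts = Counter()
    for issue in sub_issues:
        names = {label["name"] for label in issue.get("labels", [])}
        phases = [p for p in PHASE_LABELS if p in names]
        if phases:
            counts.update(phases)  # assumption: one count per phase label present
        else:
            counts["no phase"] += 1
    # Keep a stable row order and leave out rows whose count is 0.
    return {key: counts[key] for key in PHASE_LABELS + ["no phase"] if counts[key] > 0}
```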
Time estimates:
- Median time per sub-issue: median of (closedAt - createdAt) in minutes across all Done sub-issues. If no Done sub-issues exist, display "N/A".
- Est. remaining: pending_count / max(in_progress_count, 1) × median, converted to hours. Display as a range (±20%): e.g., "12-18 hours wall-clock".
Recent 24h activity: Filter sub-issues where updatedAt is within the last 24 hours. Report the count.
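The time estimate and the 24h activity count can be sketched as below. This is an illustration under stated assumptions, not the skill's actual implementation: timestamps are ISO 8601 strings ending in "Z" as in the sample JSON, the estimate is converted from minutes to hours before the ±20% range is applied, and all function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from statistics import median


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2026-06-01T00:00:00Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def median_minutes(done_issues: list):
    """Median of (closedAt - createdAt) in minutes, or None when nothing is Done."""
    if not done_issues:
        return None  # the caller displays "N/A" in this case
    durations = [
        (parse_ts(i["closedAt"]) - parse_ts(i["createdAt"])).total_seconds() / 60
        for i in done_issues
    ]
    return median(durations)


def estimate_remaining_hours(pending_count: int, in_progress_count: int, median_min: float):
    """Return (low, high) hours: pending / max(in_progress, 1) x median, +/-20%."""
    base_hours = pending_count / max(in_progress_count, 1) * median_min / 60
    return base_hours * 0.8, base_hours * 1.2


def updated_within_24h(sub_issues: list, now: datetime) -> int:
    """Count sub-issues whose updatedAt falls within the 24 hours before `now`."""
    cutoff = now - timedelta(hours=24)
    return sum(1 for i in sub_issues if parse_ts(i["updatedAt"]) >= cutoff)
```

The `max(in_progress_count, 1)` guard treats a parent with nothing in progress as having one worker, so the estimate never divides by zero. `now` should be timezone-aware, for example `datetime.now(timezone.utc)`.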
Blocked relationships: For each Blocked sub-issue, list the OPEN blockedBy issue numbers.
Output the snapshot in the following format:
XL parent #<parent-number>: <parent-title>
Sub-issues: <total> total (created <earliest-createdAt date>)
Status breakdown:
✅ Done: <N> (<pct>%)
🔄 In progress: <N> (<pct>%) — #<num>, #<num>, ...
🟡 Blocked: <N> (<pct>%) — #<num> (by #<blocker>), ...
🟠 Stale: <N> (<pct>%) — #<num>, ...
⬜ Pending: <N> (<pct>%)
Phase distribution (in-progress + blocked):
phase/issue: <N>
phase/spec: <N>
phase/ready: <N>
phase/code: <N>
phase/review: <N>
phase/verify: <N>
no phase: <N>
Time estimates (based on completed sub-issues):
Median time per sub-issue: <N> min
Est. remaining: <low>-<high> hours wall-clock (<in_progress_count> concurrent, <pending_count> sub-issues remaining)
Recent activity (last 24h):
- <N> sub-issues updated
Omit rows with count 0 from Phase distribution. If no sub-issues exist under the parent, display "No sub-issues found for #<parent-number>." and exit.
Generate the data layer of a /auto session retrospective report from .tmp/auto-events.jsonl, filtered by session_id.
This subcommand covers the post-session time scale of the /audit 3-axis model:
| Command | Time scale | Use case |
|---|---|---|
| /audit stats | weeks/months | project health |
| /audit progress <XL> | hours | in-progress snapshot |
| /audit auto-session <id> | session post | post-session retrospective (this subcommand) |
/auto generates a SESSION_ID at startup using the format PID-timestamp (e.g., 12345-1718336400). This identifier is recorded in:
- .tmp/auto-session-current: pointer file read by run-auto-sub.sh to populate session_id in each emitted event
- .tmp/auto-session-${SESSION_ID}.json: metadata file recording session start time
Each event in .tmp/auto-events.jsonl includes a session_id field set to this value. The auto-session subcommand filters events by session_id to isolate exactly one session's activity, even when multiple sessions' events are mixed in the same log file.
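The session filter can be pictured as a pass over a JSON Lines file. The sketch below is an assumption about how such a filter could look, not the behavior of get-auto-session-report.sh: it takes an iterable of text lines, skips blank and unparseable lines (that skip is an assumption), and keeps only events whose session_id matches. The function name `events_for_session` is hypothetical.

```python
import json


def events_for_session(lines, session_id: str) -> list:
    """Return the events from a JSON Lines stream that belong to one session."""
    events = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # assumption: tolerate a truncated or corrupt line
        if isinstance(event, dict) and event.get("session_id") == session_id:
            events.append(event)
    return events
```

A typical call would pass an open file handle, for example `events_for_session(open(".tmp/auto-events.jsonl"), "12345-1718336400")`.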
.tmp/auto-events.jsonl is the primary data source (set via AUTO_EVENTS_LOG env var or default path). This file is populated by R1 (#630) and subsequent extensions. When R1-era event types (watchdog_kill, max_silent_window, token_usage, concurrent_commit_detected) are absent from the log, the corresponding Summary rows degrade gracefully to 0 or N/A.
The generated report (docs/reports/auto-session-{session-id}-{date}.md) contains the following sections:
--full for LLM-assisted draft)
Parse from ARGUMENTS (after the auto-session prefix):
- <session-id> (positional): generate report for this session; required unless --since is given; may appear before or after --full
- --full: enable full mode; after generating the data-layer report, generate LLM narrative drafts for all 4 sections (What worked / Limits and gaps / Improvement candidates surfaced / Conclusion) and insert them into the report with [LLM draft — human review required] markers
- --output <path>: override output file path (default: docs/reports/auto-session-<id>-<date>.md)
- --since <spec>: list mode; show distinct session_ids from the log, filtered to the specified time window (e.g., 24h, 2026-06-14); omit <session-id> in this mode
- --no-ja: skip Japanese sibling generation (Step 4). Default behavior generates a Japanese-translated sibling file at {report-path-without-ext}-ja.md alongside the English report.
"${CLAUDE_PLUGIN_ROOT}/scripts/get-auto-session-report.sh" <session-id> [--output <path>] [--no-github]
Pass --no-github only in contexts where GitHub API calls are unavailable (e.g., hermetic testing). In normal operation, omit --no-github to include live GitHub label/PR state in the report.
If no session_id is given (list mode with --since):
"${CLAUDE_PLUGIN_ROOT}/scripts/get-auto-session-report.sh" --since <spec>
Output the path of the generated report file. If in list mode, display the session list returned by the script.
If --full is not specified, stop here.
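The option rules for the auto-session subcommand (positional session ID, --full, --output, --since, --no-ja) can be expressed with Python's argparse. This is only an illustrative sketch: the skill itself parses ARGUMENTS in prose, and the function name `parse_auto_session_args` is hypothetical.

```python
import argparse


def parse_auto_session_args(argv: list) -> argparse.Namespace:
    """Parse the tokens that follow the auto-session prefix."""
    parser = argparse.ArgumentParser(prog="/audit auto-session")
    # Optional positional, so that list mode (--since) can omit it.
    parser.add_argument("session_id", nargs="?")
    parser.add_argument("--full", action="store_true")
    parser.add_argument("--output")
    parser.add_argument("--since")
    parser.add_argument("--no-ja", dest="no_ja", action="store_true")
    args = parser.parse_args(argv)
    # <session-id> is required unless --since is given.
    if args.session_id is None and args.since is None:
        parser.error("<session-id> is required unless --since is given")
    return args
```

Because argparse accepts options on either side of a positional argument, `--full 12345-1718336400` and `12345-1718336400 --full` parse to the same result.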
(--full mode only)
This step generates LLM drafts for the 4 narrative sections and inserts them into the report.
Run only when --full is present in ARGUMENTS.
- Run gh issue view <N> --json title,body,labels (provides richer context for narrative generation)
- Read ${CLAUDE_PLUGIN_ROOT}/skills/audit/auto-session-narrative-prompts.md to load the prompt templates and few-shot examples
- Run gh issue list --search "<keyword>" to check for existing open issues for each candidate
- Write the drafts to .tmp/narrative-draft-<session-id>.md using the Write tool, structured as:
### What worked
{draft content}
### Limits and gaps
{draft content}
### Improvement candidates surfaced
{draft content}
### Conclusion
{draft content}
"${CLAUDE_PLUGIN_ROOT}/scripts/get-auto-session-report.sh" <session-id> --narrative-draft .tmp/narrative-draft-<session-id>.md --output <report-path>
This replaces each "TBD — fill in after reviewing the session" placeholder with the draft content prefixed by > [LLM draft — human review required]
rm -f .tmp/narrative-draft-<session-id>.md
[LLM draft — human review required] markers in all 4 narrative sections. Review and edit before committing."
Note: No issues are filed automatically. The Improvement candidates section lists candidates for human review; filing is done manually via /issue or discarded. This preserves the human gate.
Note: The [LLM draft — human review required] marker is inserted as a blockquote prefix so the user can immediately identify LLM-generated content requiring review before the report is committed or shared.
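To make the marker insertion concrete, the sketch below replaces placeholders with marked drafts. It assumes the placeholders appear in the same order as the drafts and that the marker sits on its own blockquote line directly above the draft; the real script may match by section heading instead. The function name `insert_drafts` is hypothetical.

```python
PLACEHOLDER = "TBD — fill in after reviewing the session"
MARKER = "> [LLM draft — human review required]"


def insert_drafts(report: str, drafts: list) -> str:
    """Replace the i-th placeholder with the i-th draft, prefixed by the marker."""
    parts = report.split(PLACEHOLDER)
    out = [parts[0]]
    for index, tail in enumerate(parts[1:]):
        if index < len(drafts):
            out.append(f"{MARKER}\n{drafts[index]}")
        else:
            out.append(PLACEHOLDER)  # no draft available: keep the placeholder
        out.append(tail)
    return "".join(out)
```

Splitting on the placeholder, rather than calling `str.replace` repeatedly, means a draft that happens to contain the placeholder text is not replaced a second time.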
This step runs by default after Steps 1–3 complete (regardless of whether --full was specified). Skip entirely when --no-ja is present in ARGUMENTS.
Replace the .md of the report path with -ja.md (e.g., docs/reports/auto-session-<id>-<date>.md → docs/reports/auto-session-<id>-<date>-ja.md)
--full was set, otherwise post Step 2)
spawn-recovery-subagent.sh), Issue/PR references (#666), session IDs, ISO 8601 timestamps, SHA hashes, log markers like [LLM draft — human review required] (but translate the marker text content if it appears as Japanese in body: prefer [LLM ドラフト — レビュー必須] for the blockquote marker)
{sibling-path}."
Note: The Japanese sibling is generated unconditionally by default to support the project's user-facing Japanese convention (CLAUDE.md). Use --no-ja to opt out.
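The sibling path rule is a suffix swap. A minimal sketch, assuming the report path always ends in .md (the function name `ja_sibling_path` is hypothetical):

```python
def ja_sibling_path(report_path: str) -> str:
    """Return the Japanese sibling path: strip the .md extension, append -ja.md."""
    suffix = ".md"
    if not report_path.endswith(suffix):
        raise ValueError(f"expected a .md report path, got: {report_path}")
    return report_path[: -len(suffix)] + "-ja.md"
```

Only the trailing extension is replaced, so a ".md" that occurs earlier in the path is left alone.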
/audit (no arguments) sequentially executes both drift and fragility perspectives and displays detection results in an integrated table.
Parse the following options from ARGUMENTS (same system as drift/fragility):
- --dry-run: display the integrated report only without generating Issues
- --limit N: limit total Issue generation to N items (in descending severity order)
Execute Steps 1–3 from the "drift subcommand" (context collection, drift detection, duplicate check). Don't proceed to Issue generation at this step: --dry-run/--limit are applied at final output, so collect detection results only.
Execute Steps 1–3 from the "fragility subcommand" (context collection, fragility detection, duplicate check). Reuse the same Steering/Project Documents context from drift if available (skip re-fetching).
If fragility detection results overlap with drift detections (pointing out the same location), prioritize drift and skip the fragility side.
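One way to picture the overlap rule is shown below. The text does not define "the same location" precisely, so this sketch assumes two findings overlap when they share at least one affected file path; the `files` key and the function name `merge_findings` are hypothetical.

```python
def merge_findings(drift: list, fragility: list) -> list:
    """Combine both lenses, dropping fragility findings that overlap a drift finding."""
    # Assumption: "same location" means at least one shared affected file.
    drift_files = {path for finding in drift for path in finding["files"]}
    kept = [f for f in fragility if not drift_files.intersection(f["files"])]
    return drift + kept
```

A finer-grained notion of location (section or line range) would need a different key, but the priority stays the same: drift wins and the overlapping fragility item is skipped.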
Display drift and fragility detection results in an integrated table with a lens column:
| No | lens | Category | Severity | Description | Affected Files | Duplicate |
|----|------|---------|----------|-------------|---------------|-----------|
| 1 | drift | tech.md Coding Conventions | high | ... | skills/foo/SKILL.md | - |
| 2 | fragility | Missing tests for core modules | medium | ... | modules/bar.md | - |
| 3 | drift | workflow.md skill list | low | ... | docs/workflow.md | existing #789 |
In --dry-run mode: display the integrated table and exit (do not generate Issues).
In normal mode:
If --limit N is specified, select N items in descending severity order. Exclude duplicates from the count.
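The --limit selection can be sketched as a filter followed by a stable sort. The severity values (high, medium, low) come from the example table; the `severity` and `duplicate` keys and the function name `select_for_generation` are hypothetical stand-ins for however findings are held in memory.

```python
SEVERITY_RANK = {"high": 0, "medium": 1, "low": 2}


def select_for_generation(findings: list, limit=None) -> list:
    """Pick non-duplicate findings in descending severity, capped at `limit`."""
    # Duplicates never consume a slot in the limit.
    candidates = [f for f in findings if not f.get("duplicate")]
    # sorted() is stable, so findings of equal severity keep their table order.
    ordered = sorted(candidates, key=lambda f: SEVERITY_RANK[f["severity"]])
    return ordered if limit is None else ordered[:limit]
```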
Ask the user with AskUserQuestion (non-interactive mode: auto-resolve — automatically select "Generate all" for non-duplicate items up to --limit N; record the decision in an issue comment):
Generate Issues in /issue standard format for approved items. Apply the Issue body format, label assignment, and Type/Size assignment based on each item's lens:
For items with lens: drift:
Each Issue body:
## Background
{Context where the drift was found, quoting the relevant Steering/Project Document section}
## Purpose
{Problem resolved by the fix}
## Acceptance Conditions
### Pre-merge (automated verification)
- [ ] <!-- verify: {verify command} --> {condition 1}
- [ ] {condition 2}
### Post-merge
- [ ] {verification items}
Label assignment:
After Issue generation, assign the following label:
- audit/drift: tracking label indicating the drift was detected by the audit skill
Do not assign the triaged label when creating Issues. The triaged label is assigned by the /triage skill after triage is actually executed; pre-assigning it causes the Issue to be skipped by the triage pipeline, leaving Type/Size/Priority/Value unset.
Type/Size assignment:
Set Type and Size from AI estimation of drift scope (update project fields via ${CLAUDE_PLUGIN_ROOT}/scripts/gh-graphql.sh).
For items with lens: fragility:
Each Issue body:
## Background
{Context where the fragility was found, quoting the relevant Steering/Project Document section}
## Purpose
{Risk reduced by the improvement}
## Acceptance Conditions
### Pre-merge (automated verification)
- [ ] <!-- verify: {verify command} --> {condition 1}
- [ ] {condition 2}
### Post-merge
- [ ] {verification items}
Label assignment:
After Issue generation, assign the following label:
- audit/fragility: tracking label indicating the fragility was detected by the audit skill
Do not assign the triaged label when creating Issues. The triaged label is assigned by the /triage skill after triage is actually executed; pre-assigning it causes the Issue to be skipped by the triage pipeline, leaving Type/Size/Priority/Value unset.
Type/Size assignment:
Set Type and Size from AI estimation of fragility scope (update project fields via ${CLAUDE_PLUGIN_ROOT}/scripts/gh-graphql.sh).
Display the list of generated Issue numbers and titles grouped by lens.
Then read ${CLAUDE_PLUGIN_ROOT}/modules/steering-hint.md and follow the "Processing Steps" section.