From bug-audit
Audit a feature, release, or issue queue for bugs — functional AND UX. Trigger when someone says "bug audit", "audit this release", "what's broken", "QA this feature", "review before release", "check for issues", "is this ready to ship", or when Dan or a customer reports a problem and the team needs to triage scope. Also trigger proactively after any deploy where the 7-day soak window shows activity in #assistant-feedback. Covers four areas: functional bugs, UX bugs, test coverage gaps, and customer-surfaced classification. "It works" and "it feels right" are separate questions. Both must pass.
How this skill is triggered — by the user, by Claude, or both
Slash command
/bug-audit:bug-auditThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Structured bug audit covering functional correctness, UX quality, test coverage
Structured bug audit covering functional correctness, UX quality, test coverage gaps, and customer-surfaced classification. Run this before any release, after any incident, and any time "it doesn't work" arrives without specifics.
A feature can be functionally correct and still be broken from the user's perspective. Wrong loading state, missing error message, confusing empty state, broken flow requiring a workaround — these are bugs. They cause churn. They get reported to Dan. They end up in #assistant-feedback and count as customer-surfaced in the quality KPIs.
This audit treats UX failures identically to functional failures: classified, assigned, and tracked. "It works, it just looks wrong" is not a valid close state.
| Severity | Definition | Response |
|---|---|---|
| P1 | Customer-facing data loss, auth failure, complete feature outage | Immediate — page on-call |
| P2 | Feature degraded, workaround exists, affects multiple users | Same business day |
| P3 | Minor functional error, edge case, easy workaround | Within current sprint |
| P4 | UX polish issue, cosmetic, no functional impact | Backlog — prioritize by frequency |
Customer-surfaced: First reported by a customer or by Dan. Dan's reports are captured as customer-reported in the system — there is no separate category. Customer-surfaced bugs of any severity are tracked against the team's quality KPIs. Recurring customer-surfaced bugs in a quarter cap uplift eligibility regardless of other scores.
Internally detected: Caught by automated tests, CI, Datadog, or a team member before any customer saw it. This is the quality system working correctly. A bug caught internally is not a failure — it's a save.
Before auditing, confirm:
Feature / release: [name or Notion project URL]
Spec page: [Notion URL — where ACs and Cypress test plan live]
Environment: [prod / staging / both]
Audit trigger: [pre-release / post-deploy soak / customer report / scheduled]
Audit date: [date]
Audited by: [name]
If there is no spec page, the feature was built without acceptance criteria. Note this as a P2 process failure in the report — it means no baseline exists for what "correct" looks like.
For each acceptance criterion on the spec, verify in the target environment:
AC-[N]: [Description]
Status: PASS / FAIL / NOT TESTED
Evidence: [What was observed — be specific]
Bug ID: BUG-[YYYY-MM-DD]-[N] (if FAIL)
Severity: P1 / P2 / P3
Source: Customer-surfaced / Internal
What counts as a functional bug:
Audit UX failures separately from functional failures. Work through each category. Check against the spec's UX Considerations section if it exists. If no UX Considerations section exists, that is itself a gap — note it.
Loading and Async States
Empty States
Error States
Flow Continuity
Visual and Layout
Data Display
For each UX issue found:
UX-[N]: [Description — what the user experiences]
Category: Loading / Empty / Error / Flow / Visual / Data
Severity: P2 (blocks task completion) / P3 (degrades experience) / P4 (cosmetic)
Reproduction: [Steps to see it]
Expected: [What the user should see]
Actual: [What the user sees]
Test needed: Yes — TC description / No / Existing test needs update
Review the Cypress test plan from the spec. If no Cypress test plan exists, that is a P2 process failure — note it and reconstruct coverage from the ACs.
AC-[N]: [Description]
Test exists: Yes / No
Test file: cypress/e2e/[file].cy.js (if exists)
CI status: Passing / Failing / Flaky / Quarantined / Not run
Coverage gap: [What scenario is not tested]
Mandatory coverage checklist:
cy.wait(ms) in new tests — flag any found, these are bugsdata-testid — flag CSS class or text selectors@owner comment at the topA missing Cypress test for any P1 or P2 acceptance criterion is itself a P2 bug. The feature ships without a safety net.
## Bug Audit Report
Feature: [name]
Notion spec: [URL]
Audit date: [date]
Audited by: [name]
Environment: [prod / staging / both]
Trigger: [pre-release / post-deploy soak / customer report / scheduled]
## Summary
Total issues: N
P1: N (customer-surfaced: N / internal: N)
P2: N (customer-surfaced: N / internal: N)
P3: N
P4: N
UX bugs: N
Test gaps: N
Ship decision: HOLD / SHIP WITH KNOWN ISSUES / CLEAR TO SHIP
## Functional Bugs
[List using Step 2 format]
## UX Bugs
[List using Step 3 format]
## Test Coverage Gaps
[List using Step 4 format]
## Customer-Surfaced Issues
[Any P1/P2 first reported by a customer or Dan — these are KPI-tracked.
Dan's reports count as customer-surfaced in the quality scoring system.]
## Process Failures
[Missing spec, missing ACs, missing Cypress test plan, missing UX Considerations
section — note each one. These are systemic gaps, not one-time misses.]
HOLD — do not release:
.only() on the feature's testsSHIP WITH KNOWN ISSUES:
CLEAR TO SHIP:
If the same UX bug category appears in three or more consecutive audits (e.g., empty states consistently missing, loading states consistently wrong), it is a systemic gap in the spec template — not a one-time miss.
Escalate to Dan to update the feature-spec skill's UX Considerations
requirements so the pattern is caught at spec time, not at audit time.
skills/feature-spec/SKILL.mddocs/standards/testing/test-health.mdskills/wrench-ux-constitution/SKILL.mdProvides a checklist for code reviews covering functionality, security, performance, maintainability, tests, and quality. Use for pull requests, audits, team standards, and developer training.
npx claudepluginhub wrenchai/wrench-plugins --plugin bug-audit