From workflow-skills
Run a comprehensive, autonomous codebase review across six floor lenses (architecture, security, code quality, testing, operations, maintenance) plus lenses derived from the repo's own risk surface (agentic/LLM, canon drift, …). Findings are adversarially verified, deduplicated against the repo's risk registers and the cumulative findings ledger, scored mechanically, and persisted for trend analysis. Emits proposal-only plan stubs; elaboration and validation are delegated to feature-elaborate / feature-review.
How this skill is triggered — by the user, by Claude, or both
Slash command
/workflow-skills:codebase-reviewThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Run a full, autonomous codebase review. Produces a timestamped review report with verified, scored findings; an updated cumulative findings ledger; trend analysis against prior reviews; and proposal-only plan stubs routed to the repo's backlog.
Run a full, autonomous codebase review. Produces a timestamped review report with verified, scored findings; an updated cumulative findings ledger; trend analysis against prior reviews; and proposal-only plan stubs routed to the repo's backlog.
This skill is repo-agnostic: everything repo-specific is read at runtime — .claude/repo-conventions.yaml for layout, CLAUDE.md for the repo's own disciplines, and the repo's risk registers for what to review hardest. The conventions file is a claims file, not ground truth: Phase 0 verifies its load-bearing keys against the tree, and a stale conventions file is itself a finding. One deliberate exception: the output root docs/reviews/ (reports + ledger) is fixed by this skill's convention — it's tool-owned output, not repo structure, and a stable location is what makes prior-review discovery mechanical.
Companion files (read each when its phase needs it):
| File | Contents |
|---|---|
lenses.md | The six floor-lens charters, the derived-lens procedure + trigger table, and the per-lens report contract |
ledger.md | Findings-ledger schema, ID scheme, carry-forward/dedup rules, scoring rules, verification protocol |
report-format.md | discovery.md / synthesis.md / final-summary templates |
/codebase-review # full review: floor lenses + derived lenses
/codebase-review --lens <key> # single-lens pass (e.g. --lens canon-drift) — same ledger,
# dedup, and verification machinery at a fraction of the cost
/codebase-review --full # disable diff-scoping: full attention everywhere
Flags compose: --lens <key> --full runs the single lens with diff-scoping disabled (full attention within that lens).
Seven phases, end-to-end, without stopping for user input. On ambiguity: make the best judgment, apply it, document it in the synthesis Process notes. Never ask the user mid-run — that breaks one-shot and cloud runs; auto-detect and record assumptions instead.
Phase 0: Ground — conventions (verified), prior ledger, mechanical gates
Phase 1: Scope — inventory, diff-scope vs last reviewed SHA, derive the lens set
Phase 2: Lens reviews — parallel agents, one per lens; findings + coverage manifest each
Phase 3: Verify — adversarial verification of findings
Phase 4: Critique — completeness check: what did the review itself miss?
Phase 5: Synthesis — dedup (in-run + vs registers + vs ledger), score, trend
Phase 6: Emit — reports, ledger update, proposal-only plan stubs, summary
Read .claude/repo-conventions.yaml. Cache: package, backend.* (test_dir, api_main, config_module, events_module, migrations, patterns.*), frontend.*, docs.*, backlog_root, backlog.templates_dir, backlog.feature_size.goal_max_tasks, backlog.altitude_discipline. Pattern checks in the lenses fire only if declared and not false.
Then verify the load-bearing keys against the tree — do not trust them:
backlog_root resolves to a real directory containing epic/feature folders. If not, locate the actual backlog (Glob for **/tech-debt/features/*/proposal.md and epic README patterns), use reality for the rest of the run, and record a conventions-drift finding.ci: keys or comments) vs .github/workflows/ contents. The workflow files are the canonical step list wherever the two disagree.test_dir and test invocation vs what CI actually runs.If the conventions file is missing entirely: auto-detect (package manifests, workflow files, docs layout), record the detection in discovery.md, and suggest authoring the file in the final summary. Never block on it.
Read docs/reviews/ledger.json (schema in ledger.md). Three cases:
next_review directives (must-cover lenses, lightly-scrutinised areas).docs/reviews/<date>/ folders exist — backfill the ledger from the most recent synthesis.md first, per the backfill procedure in ledger.md. One-time migration; then proceed as case 1.Also read docs/reviews/monitor-ledger.yaml if present (schema in ledger.md). It is the routing layer over the findings ledger's monitor subset: the findings ledger records whether/when a monitor item recurs (monitor_trigger + carry-forward); the routing ledger records how it fires. An active entry carries a mechanism that keeps it monitored — acceptance | hook | schedule | review — validated in-repo by scripts/check-monitor-ledger.py (pre-commit + CI), so a monitor item cannot sit un-routed between reviews. (Graduating an item to its own dedicated PLAN feature instead closes it — see Phase 5 step 8; that is an exit from MONITOR, not an active mechanism.) It supplies the review-mechanism entries this run must re-open (Phase 5) and the active acceptance routings to reconcile against reality (an acceptance-guarded item whose fix has silently shipped is the drift this catches). Absent (no routing ledger yet) → the first run that produces or backfills monitor findings creates it in Phase 6.
Run the repo's own quality gates and capture results as ground truth for the lens agents (findings about things a gate already catches are noise; findings about things a gate claims to catch but doesn't are gold):
pre-commit run --all-files) if configuredTimebox each; on failure-to-run record SKIPPED with reason — a skipped gate is a coverage note, never a silent omission.
While doing this, capture the gate inventory for the operations lens's truthfulness audit: every pre-commit hook's files:/exclude: scope, the CI step list, the type-checker's configured scope, which lockfiles/requirements the dependency audit actually covers, and any env-gated tests that exist but never run in CI.
Build a structured inventory with Glob/Grep (never hardcode file lists). Scan: backend packages (per package), API layer, data layer + migrations, declared pattern directories (workflows/adapters/events), LLM/agent/prompt code (frameworks, providers, prompt templates, MCP servers, agent configs), config/settings, frontend dirs, tests per layer, infrastructure (containers, CI workflows, manifests), backlog (epics, feature counts, statuses), docs.
If the ledger has a prior sha: git diff --stat <sha>..HEAD to build a churn map.
Sampled in its coverage manifest.--full, or no prior SHA: no diff-scoping — full sweep. "Every 4th" is mechanical: this review's ordinal is len(ledger.reviews) + 1; ordinal divisible by 4 → full sweep.Diff-scoping allocates attention; it never silently excludes. Anything neither read nor sampled must appear as Not examined.
The six floor lenses always run (charters in lenses.md): Architecture & Structure, Security, Code Quality, Testing, Operations & Infrastructure, Maintenance & Lifecycle.
Then read the repo's risk registers and apply the trigger table in lenses.md to derive additional lenses:
docs/context/canonical.yaml or equivalent) and its flagged docs' deviation/gap tablesdocs/roadmap/raid.md), threat models (e.g. docs/security.md), ADR deviation tablesnext_review.must_cover — these are mandatory lenses this runlightly_scrutinised list — seed these areas into the relevant lens chartersRecord the final lens set and the rationale for each derived lens in discovery.md. The lens set is part of the review's contract: next review compares against it.
Launch one agent per lens, in parallel. Use the Workflow tool when available (per-agent results checkpoint as they complete, and the run is resumable after interruption); otherwise fan out with the Agent tool. Same prompts either way.
Each lens agent receives:
lenses.mdEach agent writes its own report to docs/reviews/YYYY-MM-DD/<lens-key>.md following the report contract in lenses.md, and returns to the orchestrator only a structured summary: per finding {id, title, severity, disposition, files}, plus its coverage manifest. Full report bodies never round-trip through the orchestrator's context.
Standing instructions inside every lens prompt (spelled out in lenses.md): evidence cites file + symbol; verify any count you assert; a hedged claim is either resolved or tagged UNVERIFIED; check the dispositions registers before writing a finding as new; apply the solution-quality-over-effort rubric (an effort-justified security/authz/contract choice is a finding at P1 minimum, flagged for human decision).
Adversarial verification by independent agents that did not author the finding:
The verifier re-reads the cited evidence at HEAD and actively tries to refute the finding. Verdicts: confirmed, refuted (excluded from scoring and the priority tables; retained in the ledger with status refuted for calibration), reprioritised (with the new priority), enriched (evidence corrected/extended).
Back-propagate every verdict into the lens report file — a lens report and the synthesis must never disagree on a finding's priority. Record the run's refutation and reprioritisation rates in the ledger (calibration trend across reviews).
One critic agent, after the lens summaries and coverage manifests are in. Inputs: all summaries + manifests, the discovery inventory, and the repo's own risk claims (threat model, canon manifest, RAID).
Charter:
Critic findings join the pool under the critique lens key and P0/P1s among them go through Phase 3 verification like any other. The not-examined list feeds both the synthesis Coverage section and the ledger's next_review.lightly_scrutinised.
ledger.md) — a finding matching an open prior finding is recurring (update last_seen, increment seen_count, keep its original ID); a finding matching a prior accepted/monitor item is auto-suppressed unless the evidence materially changed, in which case escalate with an explicit evidence-change note.ledger.md to each lens's verified finding counts. Record both the lens agent's raw score and the synthesis-computed final score; the scorecard shows the final, with a note where they differ. Overall = median of lens finals, adjustable ±1 with a documented reason.goal_max_tasks is a soft re-check signal, not a limit. Route a finding to an existing backlog feature when one already covers it. Keep the single review-quick-fixes bundle for QUICK_FIX findings.monitor is not done until it is either kept active under a mechanism that fires on its own, or graduated out. Active mechanisms: acceptance (a guard promoted into a milestone feature's Acceptance criteria — stays monitored until that feature ships), hook (a scoped pre-commit/CI tripwire), schedule (a /schedule poll), or review (time/drift — this skill re-opens it each run; record the mechanism it should graduate to as target_mechanism). Graduating an item to its own dedicated PLAN feature is the exit: the backlog then owns it, so its routing entry becomes closed (resolution = the feature), never an active feature mechanism. An un-routed monitor finding is the same defect as an un-homed PLAN finding — do not allow one. Then:
review-mechanism entry in the routing ledger: re-verify it against HEAD (the same resolution check as open findings), bump last_checked, and if its monitor_trigger has fired, escalate it to PLAN this run.resolved, not carried (the same drift that lets a closed feature read not-started). For acceptance entries, additionally confirm the pointer still resolves and still names the finding id.discovery.md and synthesis.md per report-format.md (lens reports were written in Phase 2 and corrected in Phase 3).docs/reviews/ledger.json per ledger.md: append this review's entry (date, SHA, lens set, scores, stats, rates, next_review block), add new findings, update carried ones. Never rewrite prior reviews' entries.docs/reviews/monitor-ledger.yaml (schema in ledger.md): new monitor findings → routed active entries (id = the finding's lens-local id, which joins the findings ledger as CR-<source_review>-<id>); graduated (to a feature) or resolved items → closed with a resolution. The join is 1:1 — every findings-ledger status: monitor entry has exactly one active routing entry and vice versa; an item that graduated or resolved is no longer status: monitor, so it is a closed entry, never active. Reconcile any divergence here (this step owns the join; scripts/check-monitor-ledger.py owns structural integrity + pointer resolution). The synthesis "Items to Monitor" section is a rendered snapshot (from check-monitor-ledger.py --report), never a second source of record.<tech-debt home>/<topic>/proposal.md using the repo's own proposal template (from backlog.templates_dir; minimal fallback skeleton in report-format.md if the repo has none). Frontmatter per the repo's conventions; body must cite Source: codebase-review YYYY-MM-DD and the finding IDs. Do not author design.md or tasks.md, do not invent decisions or task breakdowns — detail tracks confidence (altitude discipline). Elaboration is /feature-elaborate's job when the topic is scheduled; validation is /feature-review's job before implementation. This skill never inline-validates its own plans.report-format.md, ending with the recommended next-review date and its must-cover list (also persisted in the ledger — the recommendation must survive the session).--lens <key>)Runs Phases 0, 1 (scoped to the named lens), 2 (one agent), 3, 5, 6 with the full ledger/dedup/verification machinery. The scorecard re-bands only the reviewed lens; all others carry their prior score marked not re-reviewed. Use it for cheap between-baseline passes (e.g. --lens canon-drift after a planning push, --lens security after an incident).
The Phase 5 monitor-routing step runs in full regardless of the lens scope — it is lens-independent (re-opening review-mechanism entries and reconciling acceptance routings keeps the ledger's cadence honest even on a single-lens pass). Intentional: a cheap pass should still not let monitored items rot.
npx claudepluginhub ajmaher2-dev/claude-skills --plugin workflow-skillsGuides creation, editing, and verification of skills for AI coding agents using test-driven development with subagent scenarios. Use when authoring or debugging skills.