From squid
Sweeps codebase architecture periodically by reading ADRs, mapping module/dependency graphs, surfacing 5-10 smells with severity, and generating refactor proposals for /refactor. Triggered by '/architecture-review'.
How this skill is triggered — by the user, by Claude, or both
Slash command
/squid:architecture-reviewThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
The team's day-to-day pipeline (`/day`, `/night`) operates at task grain. Long-horizon codebase health — drift, unintended coupling, dead modules, layering violations — accumulates between tasks and never gets addressed unless someone deliberately looks. This skill is that deliberate look.
The team's day-to-day pipeline (/day, /night) operates at task grain. Long-horizon codebase health — drift, unintended coupling, dead modules, layering violations — accumulates between tasks and never gets addressed unless someone deliberately looks. This skill is that deliberate look.
The output is not a single PR. It's a prioritised backlog of refactor proposals, each shaped so /refactor can pick one up and turn it into a Tasks Plan.
You are the auditor — you delegate exploration to sub-agents, you read docs/adr/ to avoid re-proposing settled questions, and you produce a written report. You do NOT write code, do NOT start refactors, and do NOT decide priority for the team — you propose, the human prioritises.
$ARGUMENTS:
packages/backend/) → audit only that subtree.backend, frontend-web) → audit only that component.Smaller scopes produce sharper findings; whole-repo audits produce broader maps but blunter recommendations.
/refactor lands and you want to find the next one./refactor with the named target./review territory.Resolve $ARGUMENTS. If empty, ask the user one question via AskUserQuestion:
Whole repo, one component, or a specific subtree?
Echo the resolved scope back as a one-line confirmation. Don't block.
Read docs/adr/ end-to-end if it exists. Build a mental model of:
This is the single highest-value step. An architecture review that proposes "split the auth module" when ADR-0005 already explained why it's deliberately combined wastes everyone's time. If ADRs don't exist, note that — recommend adr.md as a prerequisite to capturing this review's outcome.
Also read, if present:
docs/glossary.md — to know whether the team already names concepts consistently.CLAUDE.md and any per-component CLAUDE.md — for stated rules.Spawn 2–3 Explore agents in parallel:
Agent(
subagent_type="Explore",
prompt="""Architecture mapping pass on {scope}.
Produce four artefacts:
(1) Module / package list with one-line responsibility per module (file:line for the canonical entry point).
(2) Dependency graph — who imports whom. Flag any cycles. Flag any "junk drawer" modules imported by everyone.
(3) Layering — name the de-facto layers (e.g., handlers → services → repositories → models) and list any layer-skipping imports.
(4) Public surface — what's exported / consumed by other components or external callers. Mark anything intended-internal that's leaked.
Be exhaustive on the dependency graph; missing an edge produces a wrong recommendation. Report as four sections."""
)
Agent(
subagent_type="Explore",
prompt="""Hot-spot scan on {scope}.
Find:
(1) Files with > 500 lines.
(2) Files / modules touched by > 30% of commits in the last 6 months — `git log --since='6 months ago' --name-only | sort | uniq -c | sort -rn | head -20`.
(3) Tests with > 100 setup lines or > 5 fixtures — usually a smell that the system-under-test is too coupled.
(4) `# TODO` / `# FIXME` / `# HACK` markers — count and cluster.
(5) Any `# type: ignore` / `eslint-disable` / `nolint` clusters — places where the team is fighting their own static analysis.
Report as five sections with file:line."""
)
When agents return, read the top-3 most-implicated files yourself. Don't rely on summaries on the load-bearing modules — you need firsthand familiarity to call findings credibly.
Walk these dimensions. For each, form 0 or more findings. Don't manufacture findings to pad the report — empty dimensions are fine and signal health.
| Dimension | What to look for |
|---|---|
| Cycles | Import cycles between modules / packages. Always a smell. |
| Layer violations | Lower layers (e.g., models) importing higher layers (e.g., HTTP handlers). |
| God modules | One module that everything imports; usually means responsibilities accreted. |
| Junk drawers | utils/, common/, helpers/ modules with > 10 unrelated functions. |
| Hidden coupling | Two modules that should be independent but share a hidden contract (a shape, a magic string, an env var). |
| Test smells | High setup-to-assert ratio; brittle tests on private internals; flaky tests. Usually rooted in a structural problem in the SUT. |
| API leakage | Internal types reaching public surface; public surface that should be private (e.g., a route that no caller uses). |
| Dead code | Unused exports, unreachable branches, modules with zero importers. |
| Drift from CLAUDE.md | Stated rules the codebase no longer follows. The rule is right; the code drifted. |
| ADR drift | Decisions ADRs describe that the code no longer reflects (silently superseded). Either update the ADR or fix the code. |
| Performance shape | Sync I/O in async handlers; N+1 patterns; in-memory aggregation that should be a query. |
| Observability gaps | Critical paths without logs / metrics / traces; logs that don't include correlation IDs. |
For each finding, assign:
Prioritise by severity / effort ratio, with confidence as a tiebreaker. Do not propose more than 10 findings. A 30-finding report gets ignored; a 7-finding report gets acted on. Cluster related findings into one if they share a root cause.
Path:
tracker/architecture-review-{YYYY-MM-DD}.md.architecture-review.Template:
# Architecture review — {scope}, {YYYY-MM-DD}
**Scope:** {whole repo / component / path}
**ADRs read:** {N} ({list, e.g. "0001–0007"})
**Findings:** {N total — by severity: S1×N, S2×N, S3×N, S4×N}
## Summary
{2–3 sentences. The headline shape: "Three module cycles between auth/ and core/, all introduced in the last 4 months, blocking the planned auth-extraction work. Tests on the affected modules have grown 2× in setup size. No ADR governs the boundary. Recommend ADR + extraction refactor."}
## Findings
### F1 — {one-line title} (S{1–4}, effort: {S/M/L}, confidence: {H/M/L})
**Where:** `path/to/a.py:42`, `path/to/b.py:117`
**Symptom:** {what is concretely wrong, observable}
**Root cause hypothesis:** {your best read on why}
**Proposed refactor:** {one paragraph; the shape of the fix, not the fix itself}
**ADR exposure:** {does an ADR govern this? if yes, cite. If no, an ADR should accompany the refactor.}
**Out of scope (explicit):** {what this finding is NOT recommending}
### F2 — ...
(Repeat for each finding, top to bottom by priority.)
## Not recommended
{Smells you found but are deliberately NOT proposing. State why — e.g., "F-skip-1: The `utils/strings.py` module has 12 unrelated functions and looks like a junk drawer. Skipping because every function is < 10 lines and the cost of splitting exceeds the reading cost."}
## Pre-existing decisions (from ADRs) you should NOT undo
{One-bullet-each summary of ADRs that explain *why* something looks weird but is intentional. This protects the next reader from re-litigating.}
## Suggested order of operations
1. F{best-first} — {why it goes first}
2. F{next} — {why}
...
Each finding above can be picked up by `/refactor F{N}` (paste the finding's body into the refactor goal). Don't tackle all of them at once — pick the top 1–3 and run them through the pipeline first.
Single block:
## Architecture review complete — {scope}
**Report:** {path or issue URL}
**Findings:** {N} ({severity breakdown})
**Top recommendation:** F1 — {title}
### Recommended next step
`/refactor F1` (or paste the F1 body into a fresh `/refactor` invocation). The first finding is the highest leverage by severity-over-effort.
### Open question (if applicable)
{Anything you found but couldn't conclude on without team input — e.g., "F4 hinges on whether the auth/billing boundary should be a hard-coded contract or a feature-flag-toggled split. Surfacing for team decision before refactoring."}
### ADR action
{One of:}
- "Recommend ADR-{NNNN} to record the resolution of F{X} when its refactor lands."
- "No ADRs in `docs/adr/` — recommend bootstrapping with [`adr.md`](../scaffold/specs/adr.md) before proceeding so this review's findings stay durable."
/refactor then plans. /night then executes. Skipping ahead to "we should do X" pre-commits the team.npx claudepluginhub iusztinpaul/squid --plugin squidIdentifies shallow modules and architectural friction in a codebase, using domain language from CONTEXT.md and ADR decisions, to propose refactoring opportunities for testability and AI-navigability.
Scans a codebase, interviews the developer, and produces a shareable architecture insights document. Use when assessing architecture, determining direction, or auditing drift.
Scans codebase for architectural friction: shallow modules, tight coupling, untestable seams. Produces an HTML report with before/after diagrams and refactoring candidates.