From Topia
Comprehensive project audit — security, dependencies, code quality, architecture, performance, infra, docs, and nexus analytics. Delegates to specialist skills and generates an 8-dimension health score.
Slash command: /topia:audit
Comprehensive project health audit across 8 dimensions (7 project + 1 nexus analytics). Delegates security scanning to guardian, dependency analysis to dependency-doctor, and code complexity to autopsy, then directly audits architecture, performance, infrastructure, and documentation. Applies framework-specific checks (React/Next.js, Node.js, Python, Go, Rust, React Native/Flutter) based on detected stack. Produces a consolidated health score and prioritized action plan saved to AUDIT-REPORT.md.
Modes:
- /topia audit — full 8-dimension project health audit
- /topia audit dx — DX Review Mode (Addy Osmani 8 principles, see below)
Calls:
- recon (L2): Phase 0 — project structure and stack discovery
- dependency-doctor (L3): Phase 1 — vulnerability scan and outdated dependency check
- guardian (L2): Phase 2 — security audit (OWASP Top 10, secrets, config)
- autopsy (L2): Phase 3 — code quality and complexity assessment
- architecture-mapper (L2): refresh architecture picture when docs/architecture/ is missing or stale
- improve-architecture (L2): Phase 3.5 — architecture sub-score (depth / leverage / locality across top modules)
- perf (L2): Phase 4 — performance regression check
- db (L2): Phase 5 — database health dimension (schema, migrations, indexes)
- journal (L3): record audit date, overall score, and verdict
- constraint-check (L3): audit HARD-GATE compliance across project skills
- sast (L3): Phase 2 — deep static analysis (Semgrep, Bandit, ESLint security rules)
- retro (L2): Phase 6 — engineering velocity and health dimension (topia:retro)
- browser-pilot (L3): DX Review Mode — real browser testing of docs, setup guides, error pages
Called by:
- build (L1): pre-implementation audit gate
- launch (L1): pre-launch health check
- /topia audit direct invocation
Call topia:recon for a full project map. Then use Read on:
- README.md, CLAUDE.md, CONTRIBUTING.md, .editorconfig (if they exist)
Determine:
- API/backend | frontend/SPA | fullstack | CLI tool | library | mobile | infra/IaC
Output before proceeding: Brief project profile, stack summary, and which Framework-Specific Checks will be applied.
For each critical module (entry points, auth, data layer, core business logic):
Why: Without this phase, the auditor pattern-matches against known vulnerability lists and hallucinates findings that don't exist in THIS specific codebase. The invariants + assumptions ground all later analysis in reality.
Delegate to dependency-doctor. The dependency-doctor report covers:
Pass the full dependency-doctor report through to the final audit.
Delegate to guardian. Request a full security scan covering:
- *, missing HTTP security headers)
- .gitignore coverage of sensitive files
Pass the full guardian report through to the final audit.
Delegate to autopsy for codebase health (complexity, coupling, hotspots, dead code, health score per module).
In addition, use Grep to find supplementary issues autopsy may not cover:
# console.log in production code
grep -r "console\.log" src/ --include="*.ts" --include="*.js" -l
# TypeScript any types
grep -r ": any" src/ --include="*.ts" -n
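# Empty catch blocks (closing brace on the line right after the catch)
# With -rn and -A 1, context lines are printed as "file-NUM-text", so match on that prefix
grep -rn -A 1 "catch.*{" src/ --include="*.ts" --include="*.js" | grep -E -- "-[0-9]+-[[:space:]]*[}]"
# Python print() in production (including indented calls)
grep -rlE "^[[:space:]]*print\(" . --include="*.py"
# Rust .unwrap() outside tests (matches inside #[cfg(test)] modules must be filtered by hand)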
grep -rn "\.unwrap()" src/ --include="*.rs"
Merge autopsy report + supplementary findings.
Use Read and Grep to evaluate structural health directly.
4.1 Project Structure
4.2 Design Patterns & Principles
// BAD — route handler directly coupled to database
app.get('/users/:id', async (req, res) => {
  const user = await db.query('SELECT * FROM users WHERE id = $1', [req.params.id]);
  res.json(user);
});
// GOOD — layered architecture
app.get('/users/:id', async (req, res) => {
  const user = await userService.getUser(req.params.id);
  res.json(user);
});
4.3 API Design (if applicable)
4.4 Database Patterns (if applicable)
// BAD — N+1
const users = await db.query('SELECT * FROM users');
for (const user of users) {
  user.posts = await db.query('SELECT * FROM posts WHERE user_id = $1', [user.id]);
}
// GOOD — single JOIN (FILTER + COALESCE return [] instead of [null] for users without posts)
const usersWithPosts = await db.query(`
  SELECT u.*, COALESCE(json_agg(p.*) FILTER (WHERE p.user_id IS NOT NULL), '[]') AS posts
  FROM users u LEFT JOIN posts p ON p.user_id = u.id
  GROUP BY u.id
`);
- LIMIT on user-facing queries
4.5 State Management (frontend only)
5.1 Build & Bundle (frontend)
// BAD — imports entire library
import _ from 'lodash';
// GOOD — tree-shakeable import
import get from 'lodash/get';
5.2 Runtime Performance
// BAD — a new RegExp object is created on every call
function validate(input: string) {
  return /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/.test(input);
}
// GOOD — create once at module level
const EMAIL_REGEX = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
function validate(input: string) { return EMAIL_REGEX.test(input); }
5.3 Database & I/O
- LIMIT on user-facing endpoints)
// BAD — sequential when independent
const users = await fetchUsers();
const products = await fetchProducts();
// GOOD — parallel
const [users, products] = await Promise.all([fetchUsers(), fetchProducts()]);
Use Glob and Read to check:
6.1 CI/CD Pipeline
- .github/workflows/, .gitlab-ci.yml, .circleci/, Jenkinsfile)
6.2 Environment Configuration
- .env.example exists with placeholder values (not real secrets)
// BAD — silently undefined
const port = process.env.PORT;
// GOOD — validate at startup
const port = process.env.PORT;
if (!port) throw new Error('PORT environment variable is required');
6.3 Containerization (if applicable)
- .dockerignore covers node_modules, .git, .env
6.4 Logging & Monitoring
- console.log)
- /health, /ready)
Use Glob and Read to check:
7.1 Project Documentation
7.2 Code Documentation
- CHANGELOG.md maintained
- LICENSE file present
Apply only if the framework was detected in Phase 0. Skip entirely if not relevant.
React / Next.js (detect: react or next in package.json)
- useEffect with missing dependencies (stale closures)
Node.js / Express / Fastify (detect: express, fastify, koa, @nestjs/core)
- SELECT * without pagination
Python (Django / Flask / FastAPI) (detect: django, flask, fastapi in requirements)
- permission_classes, DEBUG=True in production, missing CSRF middleware
- app.run(debug=True) without environment check
- def func(items=[]))
Go (detect: go.mod)
- file, _ := os.Open(filename))
- defer for resource cleanup (files, locks, connections)
Rust (detect: Cargo.toml)
- .unwrap() / .expect() in non-test production code (use ? operator)
- unsafe blocks without safety comments
Mobile (React Native / Flutter) (detect: react-native in package.json or pubspec.yaml)
- keyExtractor or getItemLayout
- React.memo on list item components
- const constructors, missing dispose() for controllers and streams
Goal: Surface insights about skill usage, chain patterns, and nexus health from accumulated metrics.
Data source: .topia/metrics/ directory (populated by hooks automatically).
- .topia/metrics/ exists. If not, emit INFO: "No metrics data yet — run a few build sessions first."
- .topia/metrics/skills.json — extract per-skill invocation counts, last used dates
- .topia/metrics/sessions.jsonl — extract session count, avg duration, avg tool calls
- .topia/metrics/chains.jsonl — extract most common skill chains
- .topia/metrics/routing-overrides.json (if exists) — list active routing overrides
- .topia/metrics/baseline.json (if exists) — A/B baseline for savings comparison
- node compiler/bin/topia.js analytics --json (or topia metrics --json) — extract tokenOverview, tokenTrend, savingsVsBaseline, toolTokenDistribution, expensiveSessions
- .topia/metrics/tools.json if present — per-tool invocation and estimated token totals
Compute and report:
- sessions.jsonl includes tokens):
- baseline.json present
- tools.json / toolTokenDistribution) — candidates to reduce Read/Skill/MCP usage
- context_peak) — correlate with pressure_level and primary skill
Propose routing overrides: If patterns suggest inefficiency (e.g., debug consistently called 3+ times in a chain for the same session), propose a new routing override for user approval.
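The "3+ times in a chain" heuristic can be sketched as below. This is only an illustration: the input shape (one array of skill names per session) and the function name `repeatedSkills` are assumptions, since the actual chains.jsonl schema is not shown here.

```typescript
// Hypothetical input shape: the skills invoked in one session, in call order.
// The real chains.jsonl schema may differ.
function repeatedSkills(chain: string[], threshold = 3): string[] {
  const counts: { [skill: string]: number } = {};
  for (const skill of chain) {
    counts[skill] = (counts[skill] || 0) + 1;
  }
  // Skills that reach the threshold are candidates for a routing override proposal.
  return Object.keys(counts).filter((skill) => (counts[skill] || 0) >= threshold);
}

const sessionChain = ["build", "debug", "fix", "debug", "test", "debug"];
console.log(repeatedSkills(sessionChain)); // [ 'debug' ]
```

Any override derived this way is still a proposal that goes to the user for approval, as stated above.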
Output as a section in the final audit report:
### Nexus Analytics
| Skill | Invocations | Last Used | Chains Containing |
|-------|-------------|-----------|-------------------|
| build | 47 | 2026-02-28| 34 |
| recon | 89 | 2026-02-28| 42 |
| ... | ... | ... | ... |
**Common Chains**:
1. build → recon → plan → test → fix → quality → verify (34x)
2. debug → recon → fix → verification (12x)
**Session Stats**: 23 sessions, avg 35min, avg 52 tool calls
**Token Stats**: avg peak ctx 94k (measured), ~4.2k est. tokens/session, 1.2 compactions/session, confidence mixed
**Baseline**: -18% vs without_topia (if baseline.json set)
**Unused Skills**: [list or "none"]
**Routing Overrides**: [count] active
Shortcut: /topia metrics invokes ONLY this phase, not the full 8-phase audit. CLI: topia metrics or topia analytics.
Triggered by /topia audit dx. Evaluates developer experience — how easy it is for a new contributor to understand, set up, use, and recover from mistakes in this project. Inspired by Addy Osmani's DX framework.
Measure: How many steps from git clone to running the project?
1. Read README.md — extract setup instructions
2. Count discrete steps (clone, install, config, build, run)
3. Check: are ALL commands copy-pasteable? (no placeholders without explanation)
4. Check: does `npm start` / `python main.py` / equivalent work immediately after install?
5. If browser-pilot available: attempt to follow the README steps literally
| Steps to Run | Score | Verdict |
|---|---|---|
| 1-2 commands | 10/10 | Excellent — "clone and go" |
| 3-4 commands | 7/10 | Good — reasonable setup |
| 5-7 commands | 4/10 | Fair — friction will lose contributors |
| 8+ commands | 2/10 | Poor — significant onboarding barrier |
| Cannot run from README | 0/10 | Broken — README is lying or incomplete |
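A rough sketch of the step count and score lookup, assuming setup instructions have already been extracted from the README as a shell snippet. The helper names and the counting heuristic (one step per command, with newlines and `&&` as separators) are illustrative assumptions, not the skill's actual logic.

```typescript
// Heuristic (assumption): count one step per command, skipping blank lines and comments.
function countSetupCommands(snippet: string): number {
  return snippet
    .split(/\n|&&/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && line.charAt(0) !== "#")
    .length;
}

// Maps a step count to the score bands in the table above.
// `null` stands for "cannot run from README".
function helloWorldScore(steps: number | null): number {
  if (steps === null) return 0;
  if (steps <= 2) return 10;
  if (steps <= 4) return 7;
  if (steps <= 7) return 4;
  return 2;
}

const readmeSnippet = [
  "git clone <repo-url>",
  "cd <repo> && npm install",
  "# start the dev server",
  "npm start",
].join("\n");

const steps = countSetupCommands(readmeSnippet);
console.log(steps, helloWorldScore(steps)); // 4 7
```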
Check for common setup traps:
- .env.example → new dev has no idea what env vars are needed
- .nvmrc or engines field)
- python-version in pyproject.toml)
- --help or usage message when running CLI with no args
Score: count friction points. 0 = 10/10, 1-2 = 7/10, 3-4 = 4/10, 5+ = 2/10.

Sample 5 error paths in the codebase (auth failure, invalid input, missing config, network error, permission denied). For each:
| Quality | Score |
|---|---|
| All 3 (what + why + how) | 10/10 |
| What + why, no how | 6/10 |
| What only | 3/10 |
| Generic errors | 1/10 |
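The quality table can be read as a lookup over three yes/no judgments per sampled error path. The sketch below assumes "how" means "how to fix it"; the `ErrorAssessment` shape and the function name are hypothetical.

```typescript
// Hypothetical record of one sampled error path.
interface ErrorAssessment {
  what: boolean; // says what failed
  why: boolean;  // says why it failed
  how: boolean;  // assumption: says how to fix it
}

function errorMessageScore(a: ErrorAssessment): number {
  if (a.what && a.why && a.how) return 10;
  if (a.what && a.why) return 6;
  // The table has no row for "what + how" without "why"; this sketch scores it as "what only".
  if (a.what) return 3;
  return 1; // generic error
}

console.log(errorMessageScore({ what: true, why: true, how: false })); // 6
```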
If project has a CLI entry point:
1. Run `<cli> --help` — capture output
2. Check: does it list all commands with descriptions?
3. Check: does each subcommand have `--help` with examples?
4. Check: is there a quickstart example in help output?
5. Check: are flags named predictably (--verbose not --v, --output not --o)?
Score: 2 points each for: command listing, subcommand help, examples, consistent naming, error on unknown flag.
If no CLI: score as N/A (does not count toward total).
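As a minimal sketch, the CLI score is two points per passed check, with `null` standing in for N/A. The field names below are illustrative labels for the five checks named in the scoring rule.

```typescript
// Hypothetical field names for the five checks listed in the scoring rule.
interface CliHelpChecks {
  commandListing: boolean;
  subcommandHelp: boolean;
  examples: boolean;
  consistentNaming: boolean;
  errorOnUnknownFlag: boolean;
}

// Returns null when the project has no CLI, so the caller can leave it out of the total.
function cliHelpScore(checks: CliHelpChecks | null): number | null {
  if (checks === null) return null;
  const results = [
    checks.commandListing,
    checks.subcommandHelp,
    checks.examples,
    checks.consistentNaming,
    checks.errorOnUnknownFlag,
  ];
  return results.filter((passed) => passed).length * 2;
}

console.log(cliHelpScore({
  commandListing: true,
  subcommandHelp: true,
  examples: false,
  consistentNaming: true,
  errorOnUnknownFlag: false,
})); // 6
```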
Can a developer find answers in under 60 seconds?
- Grep for ]( patterns and spot-check 5 links)
If browser-pilot available: navigate the docs site, time how long to find "how to authenticate" and "how to deploy".
For the top 10 exported functions / API endpoints:
- { data, error } or all throw — not mixed)
Score: consistency percentage across the 10 sampled APIs.
- d.ts files, JSDoc, type stubs)
- down, deploys have rollback, git-based workflows)
- --force required for dangerous ops)
- --dry-run exist for risky operations?
## DX Review: [Project Name]
| # | Principle | Score | Key Finding |
|---|-----------|-------|-------------|
| 1 | Time to Hello World | ?/10 | [steps count + blocker if any] |
| 2 | Setup Friction | ?/10 | [friction points found] |
| 3 | Error Messages | ?/10 | [quality level + worst example] |
| 4 | CLI Help | ?/10 | [coverage + gaps] |
| 5 | Doc Navigation | ?/10 | [findability + dead links] |
| 6 | API Consistency | ?/10 | [consistency % + violations] |
| 7 | Progressive Disclosure | ?/10 | [simple path exists? defaults?] |
| 8 | Recovery | ?/10 | [undo/rollback/dry-run support] |
| **Overall DX** | | **?/10** | **[verdict]** |
### Quick Wins (fix in <1 hour)
1. [specific, actionable improvement]
2. [specific, actionable improvement]
3. [specific, actionable improvement]
### Structural Improvements (plan needed)
1. [deeper change needed]
Grade thresholds: 9-10 Excellent DX, 7-8 Good DX, 5-6 Fair DX (losing contributors), 3-4 Poor DX (significant barrier), 1-2 Hostile DX.
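The text does not say how the eight principle scores combine into Overall DX. The sketch below assumes a plain mean over the scored principles, with N/A entries (`null`) excluded and the result rounded to the nearest integer before the grade lookup. Both function names are hypothetical.

```typescript
// Assumption (not stated in the source): Overall DX = rounded mean of the applicable scores.
function overallDx(scores: Array<number | null>): number | null {
  let total = 0;
  let count = 0;
  for (const score of scores) {
    if (score !== null) {
      total += score;
      count += 1;
    }
  }
  return count === 0 ? null : Math.round(total / count);
}

// Grade bands copied from the thresholds above.
function dxGrade(overall: number): string {
  if (overall >= 9) return "Excellent DX";
  if (overall >= 7) return "Good DX";
  if (overall >= 5) return "Fair DX (losing contributors)";
  if (overall >= 3) return "Poor DX (significant barrier)";
  return "Hostile DX";
}

// Principle 4 (CLI Help) is N/A for a project without a CLI.
const principleScores = [7, 4, 6, null, 8, 7, 5, 6];
const dx = overallDx(principleScores);
console.log(dx, dx === null ? "N/A" : dxGrade(dx));
```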
After all phases complete:
Use Write to save AUDIT-REPORT.md to the project root with the full findings from all phases.
Call topia:journal to record: audit date, overall health score, verdict, and CRITICAL count.
Each dimension score feeds into a weighted composite formula that produces a single comparable health score. Use this formula to compute Overall Health — not a simple average.
Overall = (Security × 0.25) + (Code Quality × 0.20) + (Architecture × 0.15)
+ (Dependencies × 0.15) + (Performance × 0.10) + (Infrastructure × 0.08)
+ (Documentation × 0.07)
Nexus Analytics (Phase 8) is advisory — it contributes 0 to the weighted score but informs the verdict narrative.
| Score Range | Grade | Verdict | Action |
|---|---|---|---|
| 90–100 | Excellent | PASS | Routine audit in 3 months |
| 75–89 | Good | PASS | Address MEDIUM items next sprint |
| 60–74 | Fair | WARNING | Fix HIGH items within 2 weeks |
| 40–59 | Poor | FAIL | Fix CRITICAL + HIGH within 1 week |
| 0–39 | Critical | FAIL | Emergency response — CRITICAL items block all new work |
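Security problems have an outsized blast radius, so security carries the largest weight. With dimension scores on a 0–10 scale and the weighted sum multiplied by 10 to reach the 0–100 range, a 3/10 security score with all other dimensions at 9/10 yields an overall 75 (the bottom of Good), compared with 81 under a simple average. The formula ensures security and code quality dominate the verdict. Scores are comparable across runs: if Overall moves from 68 → 74 after fixes, the project measurably improved.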
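A minimal sketch of the weighted composite. It assumes each dimension is scored 0–10 and that the weighted sum is multiplied by 10 to land on the 0–100 range used by the verdict table; that scaling is inferred, not stated. The names `DimensionScores`, `WEIGHTS`, `overallHealth`, and `gradeFor` are illustrative.

```typescript
type Dimension =
  | "security" | "codeQuality" | "architecture" | "dependencies"
  | "performance" | "infrastructure" | "documentation";

type DimensionScores = Record<Dimension, number>; // each scored 0–10

// Weights copied from the formula above; they sum to 1.00.
const WEIGHTS: DimensionScores = {
  security: 0.25,
  codeQuality: 0.20,
  architecture: 0.15,
  dependencies: 0.15,
  performance: 0.10,
  infrastructure: 0.08,
  documentation: 0.07,
};

// Assumption: the weighted 0–10 sum is scaled by 10 to match the 0–100 verdict table.
function overallHealth(scores: DimensionScores): number {
  let sum = 0;
  for (const dim of Object.keys(WEIGHTS) as Dimension[]) {
    sum += scores[dim] * WEIGHTS[dim];
  }
  return Math.round(sum * 100) / 10; // 0–100, one decimal place
}

// Grade bands copied from the score range table.
function gradeFor(overall: number): { grade: string; verdict: "PASS" | "WARNING" | "FAIL" } {
  if (overall >= 90) return { grade: "Excellent", verdict: "PASS" };
  if (overall >= 75) return { grade: "Good", verdict: "PASS" };
  if (overall >= 60) return { grade: "Fair", verdict: "WARNING" };
  if (overall >= 40) return { grade: "Poor", verdict: "FAIL" };
  return { grade: "Critical", verdict: "FAIL" };
}

const example: DimensionScores = {
  security: 6, codeQuality: 8, architecture: 7, dependencies: 8,
  performance: 9, infrastructure: 5, documentation: 6,
};
const overall = overallHealth(example);
console.log(overall, gradeFor(overall)); // 70.7, Fair / WARNING
```

Nexus Analytics is left out of `WEIGHTS` on purpose, since it contributes 0 to the weighted score.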
CRITICAL — Must fix immediately. Security vulnerabilities, data loss, broken builds.
HIGH — Should fix soon. Performance bottlenecks, CVEs, major code smells.
MEDIUM — Plan to fix. Code duplication, missing tests, outdated deps.
LOW — Nice to have. Style inconsistencies, minor refactors, doc gaps.
INFO — Observation only. Architecture notes, tech debt acknowledgment.
Apply confidence filtering: only report findings with >80% confidence. Consolidate similar issues (e.g., "12 functions missing error handling in src/services/" — not 12 separate findings). Adapt judgment to project type (a console.log in a CLI tool is fine; in a production API handler, it's not).
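One way to picture the confidence filter and consolidation step, as a sketch only. The `Finding` shape, the rule identifiers, and grouping by severity + rule + directory are assumptions made for illustration.

```typescript
type Severity = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW" | "INFO";

// Hypothetical finding record; the skill's internal representation is not documented here.
interface Finding {
  severity: Severity;
  rule: string;
  file: string;
  confidence: number; // 0–1
}

interface ConsolidatedFinding {
  severity: Severity;
  rule: string;
  directory: string;
  count: number;
}

function consolidate(findings: Finding[], minConfidence = 0.8): ConsolidatedFinding[] {
  const result: ConsolidatedFinding[] = [];
  for (const f of findings) {
    if (f.confidence <= minConfidence) continue; // report only >80% confidence
    const slash = f.file.lastIndexOf("/");
    const directory = slash >= 0 ? f.file.slice(0, slash + 1) : "./";
    // Linear search keeps the sketch simple; fine for the small lists an audit produces.
    let existing: ConsolidatedFinding | undefined;
    for (const group of result) {
      if (group.severity === f.severity && group.rule === f.rule && group.directory === directory) {
        existing = group;
      }
    }
    if (existing) {
      existing.count += 1;
    } else {
      result.push({ severity: f.severity, rule: f.rule, directory, count: 1 });
    }
  }
  return result;
}

const rawFindings: Finding[] = [
  { severity: "MEDIUM", rule: "missing-error-handling", file: "src/services/user.ts", confidence: 0.9 },
  { severity: "MEDIUM", rule: "missing-error-handling", file: "src/services/order.ts", confidence: 0.95 },
  { severity: "MEDIUM", rule: "missing-error-handling", file: "src/services/cart.ts", confidence: 0.85 },
  { severity: "MEDIUM", rule: "missing-error-handling", file: "src/services/misc.ts", confidence: 0.5 },
];
const report = consolidate(rawFindings);
console.log(report); // one MEDIUM entry for src/services/ with count 3
```

The project-type judgment (for example, tolerating console.log in a CLI tool) is deliberately not encoded here; it stays a reviewer decision.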
## Audit Report: [Project Name]
- **Verdict**: PASS | WARNING | FAIL
- **Overall Health**: [score]/10
- **Total Findings**: [n] (CRITICAL: [n], HIGH: [n], MEDIUM: [n], LOW: [n])
- **Framework Checks Applied**: [list]
### Health Score
| Dimension | Score | Notes |
|----------------|:--------:|--------------------|
| Security | ?/10 | [brief note] |
| Code Quality | ?/10 | [brief note] |
| Architecture | ?/10 | [brief note] |
| Performance | ?/10 | [brief note] |
| Dependencies | ?/10 | [brief note] |
| Infrastructure | ?/10 | [brief note] |
| Documentation | ?/10 | [brief note] |
| Nexus Analytics | ?/10 | [brief note] |
| **Overall** | **?/10** | **[verdict]** |
### Phase Breakdown
| Phase | Issues |
|----------------|--------|
| Dependencies | [n] |
| Security | [n] |
| Code Quality | [n] |
| Architecture | [n] |
| Performance | [n] |
| Infrastructure | [n] |
| Documentation | [n] |
| Nexus Analytics | [n] |
### Composite Score
- **Formula**: (Security×0.25) + (Code Quality×0.20) + (Architecture×0.15) + (Dependencies×0.15) + (Performance×0.10) + (Infrastructure×0.08) + (Documentation×0.07)
- **Weighted Score**: [computed value] → Grade: [Excellent/Good/Fair/Poor/Critical]
### Top Priority Actions
1. [action] — [file:line] — [why it matters]
### Positive Findings
- [at least 3 things the project does well]
### Follow-up Timeline
- FAIL → re-audit in 1-2 weeks after CRITICAL fixes
- WARNING → re-audit in 1 month
- PASS → routine audit in 3 months
Report saved to: AUDIT-REPORT.md
| Gate | Requires | If Missing |
|---|---|---|
| Discovery Gate | Phase 0 project profile completed before Phase 1 | Run recon and read config files first |
| Security Gate | guardian report received before assembling final report | Invoke topia:guardian — do not skip |
| Deps Gate | dependency-doctor report received before assembling final report | Invoke topia:dependency-doctor — do not skip |
| Report Gate | All 8 phases completed before writing AUDIT-REPORT.md | Complete all phases, note skipped ones |
| Artifact | Format | Location |
|---|---|---|
| Audit report | Markdown | AUDIT-REPORT.md (project root) |
| 8-dimension health score | Markdown table | AUDIT-REPORT.md + inline |
| Weighted composite score + grade | Markdown | inline + AUDIT-REPORT.md |
| Nexus analytics section | Markdown table | inline + AUDIT-REPORT.md |
| Journal entry | Text | .topia/adr/ (via topia:journal) |
| Failure Mode | Severity | Mitigation |
|---|---|---|
| Generating health scores from file name patterns instead of actual reads | CRITICAL | Phase 0 recon run is mandatory — never score without reading actual code |
| Skipping a phase because "there are no changes in that area" | HIGH | All 8 phases run for every audit — partial audits produce misleading scores |
| Health score inflation — no negative findings in any dimension | MEDIUM | CONSTRAINT: minimum 3 positive AND 3 improvement areas required |
| Dependency-doctor or guardian sub-call times out → skipped silently | MEDIUM | Mark phase as "incomplete — tool timeout" with N/A score, do not fabricate |
~8000-20000 tokens input, ~3000-6000 tokens output. Sonnet orchestrating; guardian (sonnet/opus) and autopsy (opus) are the expensive sub-calls. Full audit runs 4 sub-skills. Most thorough L2 skill — run on demand, not on every cycle.
npx claudepluginhub linenoize/topia --plugin topia