From rune
Performance regression gate that detects N+1 queries, sync-in-async, missing indexes, memory leaks, and bundle bloat before production. Ranks findings by cost impact hierarchy so fix priority maps to unit-cost reduction.
How this skill is triggered — by the user, by Claude, or both
Slash command
/rune:perf
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Performance regression gate. Analyzes code changes for patterns that cause measurable slowdowns — N+1 queries, sync operations in async handlers, unbounded DB queries, missing indexes, memory leaks, and bundle bloat. Not a profiler — a gate. Finds performance bugs with measurable/estimated impact before production, so developers fix them at the cheapest point in the cycle.
Triggers:
- /rune perf — manual invocation before commit
- cook (L1): Phase 5 quality gate
- review (L2): performance patterns detected in diff
- deploy (L2): pre-deploy regression check
- audit (L2): performance health dimension

Calls:
- scout (L2): find hotpath files and identify framework in use
- browser-pilot (L3): run Lighthouse / Core Web Vitals for frontend projects
- verification (L3): run benchmark scripts if configured (e.g. npm run bench)
- design (L2): when Lighthouse Accessibility BLOCK — design system may lack a11y foundation

Called by:
- cook (L1): Phase 5 quality gate before PR
- audit (L2): performance dimension delegation
- review (L2): performance patterns detected in diff
- deploy (L2): pre-deploy perf regression check
- adversary (L2): scalability stress test when bottleneck patterns detected in plan

References:
- references/cost-reference.md — Cost priority hierarchy, quick wins checklist, instance right-sizing, data transfer traps, serverless optimization, observability cost control, managed vs self-hosted matrix, unit economics tracking. Load when cost analysis or FinOps context detected.
- references/scalability-reference.md — Bottleneck identification flow, performance thresholds, API patterns (cursor pagination, rate limiting, circuit breaker, graceful shutdown), caching strategies, queue-based load leveling, concurrency patterns, K8s HPA, CDN headers, load testing. Load when scaling or infrastructure optimization context detected.
- ../deploy/references/observability.md — Instrumentation discipline (RED/USE metrics, percentiles-not-averages, cardinality bounds, structured logs, correlation IDs). Load when establishing the measurement basis for an optimization — you cannot optimize what you do not measure.

Measure before optimizing. A perf finding without a metric behind it is a guess.
Before recommending a fix, confirm the hotpath actually emits the signal that proves it is slow: RED latency as a histogram (read p95/p99 — averages hide the slow tail), bounded label cardinality, and a correlation ID to tie a slow trace to its logs. If the instrumentation does not exist yet, that absence is itself a finding (UNMEASURED_HOTPATH). See ../deploy/references/observability.md for the instrumentation contract.
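To illustrate why the tail matters more than the average, here is a minimal sketch using only the Python standard library. The latency samples are made up for illustration (90 fast requests, 10 slow ones) and `latency_summary` is a hypothetical helper, not part of the skill.

```python
import statistics

def latency_summary(samples_ms: list[float]) -> dict[str, float]:
    """Return mean, p95 and p99 for a list of request latencies in milliseconds."""
    # quantiles(n=100) returns the 99 cut points p1..p99; index 94 is p95, index 98 is p99
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {
        "mean": statistics.fmean(samples_ms),
        "p95": cuts[94],
        "p99": cuts[98],
    }

# Hypothetical data: 90 requests at 20 ms and 10 requests at 2000 ms.
samples = [20.0] * 90 + [2000.0] * 10
summary = latency_summary(samples)
print(summary)  # the mean is 218 ms while p95 and p99 are both 2000 ms
```

With this data the mean suggests a moderately slow endpoint, while the percentiles show that one request in ten takes two seconds.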
Determine what to analyze:
Invoke scout to identify the top 10 hotpath files (entry points, routes, DB access layers, render-heavy components).
Scan all in-scope files for:
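N+1 pattern — loop containing ORM call:

    # BAD: N+1
    for user in users:
        orders = Order.objects.filter(user=user)  # one query per user (N queries)

    # GOOD: prefetch (assumes the Order.user ForeignKey sets related_name='orders')
    users = User.objects.prefetch_related('orders').all()
    for user in users:
        orders = user.orders.all()  # served from the prefetch cache, no extra query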
Finding: N+1 DETECTED — [file:line] — loop over [collection] with [ORM call] inside — use prefetch/JOIN
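One way such a check could be approximated statically is sketched below with Python's `ast` module. This is an illustration, not how the skill actually detects the pattern: the `ORM_METHODS` set and the function name are assumptions, and matching on method names alone will produce false positives (for example `dict.get` inside a loop).

```python
import ast

# Assumption: these method names are treated as ORM calls; tune per project.
ORM_METHODS = {"filter", "get", "all", "exclude"}

def find_orm_calls_in_loops(source: str) -> list[tuple[int, str]]:
    """Return (line, method) for each ORM-style call nested inside a loop body."""
    findings = []
    tree = ast.parse(source)
    for loop in ast.walk(tree):
        if not isinstance(loop, (ast.For, ast.AsyncFor, ast.While)):
            continue
        # Inspect only the loop body, not the iterable expression in the header.
        for stmt in loop.body:
            for node in ast.walk(stmt):
                if (isinstance(node, ast.Call)
                        and isinstance(node.func, ast.Attribute)
                        and node.func.attr in ORM_METHODS):
                    findings.append((node.lineno, node.func.attr))
    # Nested loops visit the same call more than once, so de-duplicate.
    return sorted(set(findings))

SAMPLE = '''
users = User.objects.all()
for user in users:
    orders = Order.objects.filter(user=user)
'''
result = find_orm_calls_in_loops(SAMPLE)
print(result)  # [(4, 'filter')]
```

The call on line 2 of the sample is outside any loop, so only the `filter` call inside the loop body is reported.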
Unbounded query — no LIMIT/pagination:

    # BAD
    db.query("SELECT * FROM events")

    # GOOD (ORDER BY keeps pages stable; assumes an id column)
    db.query("SELECT * FROM events ORDER BY id LIMIT 100 OFFSET ?", [offset])
Finding: UNBOUNDED_QUERY — [file:line] — missing LIMIT on [table] — add pagination
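OFFSET still makes the database walk past every skipped row, so deep pages get slower as the table grows. The scalability reference lists cursor pagination as an alternative. A minimal sketch with the standard-library `sqlite3` module follows, assuming a hypothetical `events` table with an integer primary key `id`; `fetch_page` is an illustrative helper name.

```python
import sqlite3

def fetch_page(conn, after_id=0, page_size=100):
    """Keyset pagination: return up to page_size rows with id greater than after_id."""
    rows = conn.execute(
        "SELECT id, name FROM events WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, page_size),
    ).fetchall()
    # The last id seen becomes the cursor for the next page; None means no more rows.
    next_cursor = rows[-1][0] if rows else None
    return rows, next_cursor

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO events (name) VALUES (?)",
    [(f"event-{i}",) for i in range(250)],
)

cursor, pages = 0, []
while cursor is not None:
    rows, cursor = fetch_page(conn, after_id=cursor)
    if rows:
        pages.append(len(rows))
print(pages)  # [100, 100, 50]
```

Each page is located through the primary-key index, so the cost per page does not depend on how far into the table the caller has paged.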
SELECT * — fetching all columns when only some are needed:
Finding: SELECT_STAR — [file:line] — select only needed columns
Scan for synchronous operations in async contexts:
Blocking I/O in async handler:

    // BAD: blocks event loop
    async function handler(req) {
      const data = fs.readFileSync('./config.json')
    }

    // GOOD
    async function handler(req) {
      const data = await fs.promises.readFile('./config.json')
    }
Finding: SYNC_IN_ASYNC — [file:line] — [readFileSync|execSync|etc] in async function — blocks event loop
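The same failure mode exists in Python's asyncio. The sketch below is an illustration under assumptions: `read_config_blocking` is a hypothetical stand-in that sleeps instead of reading a file, and the tick counter is only there to make event-loop starvation visible. It requires Python 3.9+ for `asyncio.to_thread`.

```python
import asyncio
import time

def read_config_blocking() -> str:
    """Hypothetical stand-in for blocking I/O; sleeps instead of touching disk."""
    time.sleep(0.2)
    return "config"

async def count_ticks(stop: asyncio.Event) -> int:
    """Count how often the event loop gets to run while other work is in flight."""
    ticks = 0
    while not stop.is_set():
        await asyncio.sleep(0.01)
        ticks += 1
    return ticks

async def run(offload: bool) -> int:
    stop = asyncio.Event()
    ticker = asyncio.create_task(count_ticks(stop))
    await asyncio.sleep(0)  # let the ticker start
    if offload:
        await asyncio.to_thread(read_config_blocking)  # GOOD: loop stays free
    else:
        read_config_blocking()  # BAD: blocks the loop for the full 200 ms
    stop.set()
    return await ticker

blocked_ticks = asyncio.run(run(offload=False))
offloaded_ticks = asyncio.run(run(offload=True))
print(blocked_ticks, offloaded_ticks)
```

When the call runs directly in the coroutine, the ticker barely advances; when it is offloaded to a thread, the loop keeps serving other tasks for the whole duration.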
Missing await:

    // BAD: fire-and-forget
    async function save() {
      db.insert(record) // no await
    }
Finding: MISSING_AWAIT — [file:line] — unresolved Promise may cause race condition
Scan for:
Event listener without cleanup:

    // BAD: leak in React
    useEffect(() => {
      window.addEventListener('resize', handler)
      // missing return cleanup
    })

    // GOOD
    useEffect(() => {
      window.addEventListener('resize', handler)
      return () => window.removeEventListener('resize', handler)
    }, [])
Finding: MEMORY_LEAK — [file:line] — addEventListener without cleanup in useEffect
Growing collection without eviction:

    # BAD: unbounded cache
    cache = {}
    def get(key):
        if key not in cache:
            cache[key] = expensive_compute(key)
        return cache[key]
Finding: UNBOUNDED_CACHE — [file:line] — dict grows indefinitely — add LRU eviction or TTL
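A bounded counterpart to the snippet above can be sketched with the standard library's `functools.lru_cache`. Here `expensive_compute` is a hypothetical placeholder and the `maxsize` of 128 is an arbitrary choice for illustration. Note that `lru_cache` gives LRU eviction only; a TTL would need a different mechanism.

```python
from functools import lru_cache

def expensive_compute(key: int) -> int:
    """Hypothetical stand-in for the expensive call."""
    return key * key

@lru_cache(maxsize=128)  # least-recently-used entries are evicted beyond 128 keys
def get(key: int) -> int:
    return expensive_compute(key)

for k in range(1000):
    get(k)

info = get.cache_info()
print(info.currsize, info.maxsize)  # 128 128
```

After 1000 distinct keys the cache still holds only 128 entries, so memory use is capped regardless of how many keys the process sees over its lifetime.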
Closure-captured arrays:

    // BAD: closure retains entire `items` even though only `id` is needed
    function makeHandler(items) {
      return (e) => items.some(x => x.id === e.id); // items array retained
    }

    // GOOD: extract minimum data
    function makeHandler(items) {
      const ids = new Set(items.map(x => x.id));
      return (e) => ids.has(e.id);
    }
Finding: CLOSURE_RETAIN — [file:line] — closure captures full collection — extract needed fields only
Timer / interval without clear:
setInterval / setTimeout callback referencing component state without cleanup on unmount
Finding: LEAK_TIMER — [file:line] — interval not cleared — add clearInterval in cleanup
Cost framing — memory leak = cost driver:
Unbounded caches + forgotten listeners + closure-captured arrays produce a chain reaction in production: memory grows → process exceeds container limit → orchestrator kills + cold-starts → cold-start latency spike → autoscaler provisions more replicas → bill climbs 20-40% versus a leak-free baseline. The leak finding is also a COST finding. When tagging severity, escalate any leak in a long-running process (worker, daemon, queue consumer) to BLOCK even if heap profile is currently under threshold — the slope, not the absolute, determines OOM timing.
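To make "the slope, not the absolute" concrete, the sketch below estimates time to the container limit from a few heap samples using an ordinary least-squares slope. The sample values, the 512 MB limit, and the helper name `hours_until_limit` are assumptions for illustration; real heap growth is rarely this linear.

```python
from typing import Optional

def hours_until_limit(samples: list[tuple[float, float]], limit_mb: float) -> Optional[float]:
    """Estimate hours until heap reaches limit_mb, or None if heap is not growing.

    samples: (hours_since_start, heap_mb) pairs in chronological order.
    """
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None  # all samples taken at the same time: no slope to estimate
    slope = sum((t - mean_t) * (m - mean_m) for t, m in samples) / var_t  # MB per hour
    if slope <= 0:
        return None
    _, last_heap = samples[-1]
    # Project forward from the most recent sample at the fitted growth rate.
    return (limit_mb - last_heap) / slope

# Hypothetical worker: 230 MB now, well under a 512 MB limit, growing 10 MB/hour.
heap = [(0.0, 200.0), (1.0, 210.0), (2.0, 220.0), (3.0, 230.0)]
eta = hours_until_limit(heap, limit_mb=512.0)
print(eta)  # about 28.2 hours
```

The process looks healthy on an absolute basis, yet at this growth rate it would hit the limit in a little over a day.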
Cross-link to debug when a leak surfaces during diagnosis: debug finds the cause, perf quantifies the cost driver and recommends scope (LRU vs WeakRef vs explicit lifecycle).
If project type is frontend:

    // BAD: imports entire lodash
    import _ from 'lodash'

    // GOOD: per-method import (or a named import from the ES-module build, lodash-es)
    import debounce from 'lodash/debounce'

Finding: BUNDLE_BLOAT — [file:line] — default import of [library] prevents tree-shaking
If browser-pilot is available and project has a URL: invoke it for Lighthouse score.
Lighthouse Score Gates (apply to any project with a public URL):
Performance: ≥ 90 → PASS | 70–89 → WARN | < 70 → BLOCK
Accessibility: ≥ 95 → PASS | 80–94 → WARN | < 80 → BLOCK
Best Practices: ≥ 90 → PASS | < 90 → WARN
SEO: ≥ 80 → PASS | < 80 → WARN (public-facing pages only)
Core Web Vitals thresholds:
LCP (Largest Contentful Paint):
≤ 2.5s → PASS | 2.5–4s → WARN | > 4s → BLOCK
INP (Interaction to Next Paint, replaces FID):
≤ 200ms → PASS | 200–500ms → WARN | > 500ms → BLOCK
CLS (Cumulative Layout Shift):
≤ 0.1 → PASS | 0.1–0.25 → WARN | > 0.25 → BLOCK
Lighthouse Accessibility score < 80 = BLOCK regardless of other scores.
Accessibility regressions are legal liability and cannot be auto-fixed by the AI.
Do NOT downgrade this gate.
If no URL available (dev-only environment): log INFO: no URL for Lighthouse — run manually before deploy
If Lighthouse MCP not installed: log INFO: Lighthouse MCP not available — run lighthouse [url] --output json manually
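The score gates above can be expressed as a small lookup. This is a sketch under assumptions: the category keys and function names are illustrative, and treating the overall verdict as "worst category wins" is an assumption that happens to be consistent with the accessibility rule.

```python
# (pass threshold, block-below threshold, or None when the category never blocks)
SCORE_GATES = {
    "performance": (90, 70),
    "accessibility": (95, 80),
    "best_practices": (90, None),
    "seo": (80, None),
}

def score_verdict(category: str, score: float) -> str:
    """Map one Lighthouse category score (0-100) to PASS, WARN or BLOCK."""
    pass_at, block_below = SCORE_GATES[category]
    if score >= pass_at:
        return "PASS"
    if block_below is not None and score < block_below:
        return "BLOCK"
    return "WARN"

def overall_verdict(scores: dict[str, float]) -> str:
    """Assumption: the worst category verdict decides the overall result."""
    verdicts = {score_verdict(cat, s) for cat, s in scores.items()}
    for level in ("BLOCK", "WARN"):
        if level in verdicts:
            return level
    return "PASS"

example = {"performance": 99, "accessibility": 79, "best_practices": 100, "seo": 100}
print(overall_verdict(example))  # BLOCK
```

An accessibility score of 79 blocks the run even though every other category passes.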
React:
- useEffect without dependency array → runs every render
Node.js / Express:
- require() calls inside route handlers (should be top-level)
- […] crypto.subtle async API)
Python / Django:
- Missing select_related / prefetch_related on ForeignKey traversal
- len(queryset) instead of queryset.count() (loads all rows)
- […] bind=True retried without backoff
SQL:
If project has benchmark scripts (detected via package.json scripts, Makefile, or pytest-benchmark):
Invoke verification to run them.
If perf-baseline.json exists, compare the results against it.
If no benchmarks configured: log INFO: no benchmark scripts found — skipping
Emit structured report:
## Perf Report: [scope]
### BLOCK (must fix before merge)
- [FINDING_TYPE] [file:line] — [description] — estimated impact: [Xms|X% bundle|X queries]
### WARN (should fix)
- [FINDING_TYPE] [file:line] — [description] — estimated impact: [...]
### PASS
- DB query patterns: clean
- Async/sync violations: none
- [etc.]
### Lighthouse (if ran)
- Performance: [score] [PASS|WARN|BLOCK]
- Accessibility: [score] [PASS|WARN|BLOCK]
- Best Practices: [score] [PASS|WARN]
- SEO: [score] [PASS|WARN]
- LCP: [Xs] [PASS|WARN|BLOCK] | INP: [Xms] [PASS|WARN|BLOCK] | CLS: [X] [PASS|WARN|BLOCK]
### Verdict: PASS | WARN | BLOCK
For projects that call AI APIs (detected via imports of anthropic, openai, @anthropic-ai/sdk, @ai-sdk/core, langchain, llamaindex, or fastmcp), audit token usage patterns per operation type.
Scan for:
| Pattern | Finding | Impact |
|---|---|---|
| AI call inside a loop without batching | TOKEN_LOOP — [file:line] — AI call in loop over [collection] — batch or parallelize | Cost scales linearly with collection size |
| No token usage tracking | NO_TOKEN_TRACKING — [file:line] — AI response usage not captured — add cost logging | Invisible spend, no budget control |
| Expensive model for simple tasks | MODEL_MISMATCH — [file:line] — using [opus/gpt-4] for [classification/extraction] — use [haiku/gpt-4.1-mini] | 10-30x cost difference for same result |
| Missing max_tokens on open-ended prompts | UNBOUNDED_TOKENS — [file:line] — no max_tokens on [call] — add limit to prevent runaway cost | Single call can consume entire budget |
| Duplicate AI calls for same input | DUPLICATE_AI_CALL — [file:line] — same prompt sent to [provider] without caching — add response cache | Wasted tokens on redundant calls |
Per-Operation Cost Awareness:
When token tracking IS present, analyze the operation type breakdown:
Operation Type Avg Tokens Frequency Monthly Est.
─────────────────────────────────────────────────────────────
Chat (primary) 2,500 in/800 out high $X.XX
Background notes 500 in/200 out per-chat $X.XX
Summarization 1,500 in/300 out periodic $X.XX
Classification 200 in/50 out high $X.XX
─────────────────────────────────────────────────────────────
Total estimated monthly $X.XX
Report this under a ### AI Token Budget subsection in the Perf Report. Only include when AI API usage detected — skip entirely for non-AI projects.
Key insight: The most impactful optimization is often model selection per operation: using a cheaper model for background tasks (summarization, classification, metadata extraction) while reserving expensive models for primary user-facing interactions. If background calls account for 60% of spend, making them 10x cheaper cuts the total bill to 46% of the original (0.4 + 0.6/10), about a 2.2x overall saving.
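The blended effect of moving part of the workload to a cheaper model depends on how much spend stays on the expensive one. A minimal sketch of that arithmetic follows, assuming the moved share is measured as a fraction of current spend; the function name is illustrative.

```python
def overall_savings_factor(share_moved: float, reduction: float) -> float:
    """Factor by which total spend shrinks when share_moved of spend becomes reduction-x cheaper."""
    # Spend that stays put, plus the moved spend at its new, lower price.
    new_cost = (1 - share_moved) + share_moved / reduction
    return 1 / new_cost

factor = overall_savings_factor(share_moved=0.6, reduction=10)
print(round(factor, 2))  # 2.17
```

The untouched 40% of spend puts a ceiling on the gain: even an infinitely cheap background model would cap the overall factor at 2.5x in this example.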
Logging, tracing, and metrics infrastructure is often the second-largest cloud bill line after compute for production workloads. Most observability cost is self-inflicted: high-cardinality labels, unsampled traces, or 100% INFO log retention. Scan for the common patterns below:
Sampling discipline:
| Layer | Healthy default | Anti-pattern |
|---|---|---|
| INFO logs | 5-10% sample, 100% on warn/error | 100% retention on all levels |
| Distributed traces | 5-10% normal, 100% on errors or > p95 latency | Head-based 100% sampling |
| Metrics | Pre-aggregated at agent (no per-event metrics) | Per-event metrics emission |
| Debug logs | Off in prod; on-demand via runtime flag | Always-on debug retention |
Findings to emit:
| Pattern | Finding | Cost impact |
|---|---|---|
| Log line emits unique IDs (user_id, request_id, trace_id) AS A METRIC LABEL | HIGH_CARDINALITY_METRIC — [file:line] — label [name] is unbounded — move to log/trace, not metric | Metric stores explode 100x+ ingestion cost |
| No sampling on INFO logs | LOG_NO_SAMPLE — [file:line] — INFO logged at 100% — add sampling (10% INFO, 100% warn+) | 10x log volume = 10x bill |
| Trace sampling head-based at 100% | TRACE_NO_SAMPLE — [file:line] — head sampling 100% — switch to tail-based 5-10% with always-on for errors/slow | 20x trace volume |
| console.log in production hot path | LOG_HOT_PATH — [file:line] — log emitted per request in tight loop — gate behind LOG_LEVEL or remove | Logs dominate IOPS; throttles real traffic |
| Sentry / OTel SDK with no scrubbing config | OBS_NO_SCRUB — [file:line] — payloads emitted unscrubbed — risk PII + cost (large payloads) | Compliance + ingestion cost |
Cost framing: same project, same business logic; the observability cost line moves 5-20x depending on whether sampling is disciplined. Especially severe in serverless/edge runtimes where every invocation pays cold-start log ingestion overhead.
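As an illustration of the INFO sampling rule (10% INFO, 100% warn and above), here is a sketch built on the standard `logging` module. `SamplingFilter` and `ListHandler` are hypothetical names, and the counter-based "keep one in N" scheme is an assumption chosen to keep the example deterministic; production setups often sample randomly or by trace ID instead.

```python
import logging

class SamplingFilter(logging.Filter):
    """Keep every WARNING+ record and one in every keep_one_in lower-level records."""

    def __init__(self, keep_one_in: int = 10):
        super().__init__()
        self.keep_one_in = keep_one_in
        self._seen = 0  # not thread-safe; fine for a sketch

    def filter(self, record: logging.LogRecord) -> bool:
        if record.levelno >= logging.WARNING:
            return True
        self._seen += 1
        return self._seen % self.keep_one_in == 0

class ListHandler(logging.Handler):
    """Collects emitted records in memory so the effect is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)

logger = logging.getLogger("sampling_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
handler = ListHandler()
handler.addFilter(SamplingFilter(keep_one_in=10))
logger.addHandler(handler)

for i in range(100):
    logger.info("request %d handled", i)
logger.warning("upstream slow")

kept_info = sum(r.levelno == logging.INFO for r in handler.records)
kept_warn = sum(r.levelno == logging.WARNING for r in handler.records)
print(kept_info, kept_warn)  # 10 1
```

Attaching the filter to the handler rather than the logger means other handlers, such as a local debug sink, can still receive every record.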
Not every perf finding has the same cost impact. When the report has multiple findings competing for engineering attention, rank them by where they sit in this hierarchy — fixes higher up the tree deliver more $ per engineering hour than fixes lower down.
Tier 1 — Architecture choices (10x impact)
• Hot loop running on wrong runtime (serverless cold-start per call vs warm container)
• N+1 queries that fan out to upstream rate limit (compounds with retries)
• Single-region for global users (egress + latency cost)
Tier 2 — Data transfer (3-5x impact)
• Cross-AZ chatter (per-GB egress)
• Uncompressed responses (gzip = 70-90% reduction)
• Image format (WebP/AVIF = 25-50% reduction over JPEG/PNG)
• Log/trace volume (see Step 8.6)
Tier 3 — Compute right-sizing (2-3x impact)
• Oversized instance types
• Over-provisioned autoscaler floor
• Always-on workers for bursty traffic
Tier 4 — DB optimization (1.5-2x impact)
• Missing index on hot query
• N+1 within a single DB connection (cheaper than cross-service N+1)
• Connection pool sizing
• Query result caching
Tier 5 — Caching layer (1.2-1.5x impact)
• CDN cache hit rate
• Application-level memoization
• Read-through cache for hot reads
Reporting rule: when verdict is BLOCK or WARN, group findings by tier in the report. Operator should NOT optimize Tier 5 caching when a Tier 1 architecture mismatch is sitting unaddressed — the Tier 1 fix subsumes the Tier 5 gain.
Unit economics anchor: where possible, attach a unit-cost delta to each finding (e.g., LOG_NO_SAMPLE → ~$X/mo per 10k req based on operator's current observability vendor pricing). If unit-cost data unavailable, flag the finding with the tier-based multiplier band instead of leaving impact blank.
Cost-allocation precondition: a perf report's cost claims are only auditable if cloud resources have allocation tags (Environment, Team, Service). If deploy artifacts show untagged resources, raise a WARN COST_UNATTRIBUTED — resources lack allocation tags — anomaly detection blind; tag-first then optimize.
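The reporting rule (group findings by tier, highest-impact tier first) could be sketched as follows. The `Finding` shape and the tier assigned to each example finding are assumptions made for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Finding:
    code: str
    location: str  # "file:line"
    tier: int      # 1 (architecture) through 5 (caching)

def group_by_tier(findings: list[Finding]) -> dict[int, list[Finding]]:
    """Group findings by cost tier, ordered so Tier 1 comes first."""
    grouped = defaultdict(list)
    for finding in findings:
        grouped[finding.tier].append(finding)
    return dict(sorted(grouped.items()))

# Hypothetical findings; tier assignments are illustrative.
findings = [
    Finding("UNBOUNDED_CACHE", "src/cache.py:12", tier=5),
    Finding("LOG_NO_SAMPLE", "src/api/users.ts:88", tier=2),
    Finding("N+1_QUERY", "src/db/queries.ts:47", tier=4),
    Finding("TRACE_NO_SAMPLE", "src/otel.ts:9", tier=2),
]
grouped = group_by_tier(findings)
for tier, items in grouped.items():
    print(f"Tier {tier}: {[f.code for f in items]}")
```

Within a tier the original order is preserved, so a secondary sort (for example by estimated unit-cost delta) can be applied before grouping if that data is available.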
## Perf Report: src/api/users.ts, src/db/queries.ts
### BLOCK
- N+1_QUERY src/db/queries.ts:47 — loop over users with Order.find() inside — fix: use JOIN or prefetch — estimated: +200ms per 100 users
### WARN
- SYNC_IN_ASYNC src/api/users.ts:23 — readFileSync in async handler — fix: fs.promises.readFile
### PASS
- Memory leak patterns: clean
- Bundle analysis: N/A (backend project)
### Verdict: BLOCK
| Gate | Requires | If Missing |
|---|---|---|
| Scope Gate | File list or scout result before scanning | Invoke scout to identify hotpath files |
| Evidence Gate | file:line + estimated impact for every BLOCK finding | Downgrade to WARN or remove finding |
| Framework Gate | Framework detected before framework-specific checks | Fall back to generic patterns only |
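The Evidence Gate row can be read as a post-processing pass over findings. The sketch below takes the "downgrade to WARN" branch (removal is the other option the table allows); the `Finding` fields and function name are assumptions for illustration.

```python
from dataclasses import dataclass, replace
from typing import Optional

@dataclass(frozen=True)
class Finding:
    code: str
    location: Optional[str]  # "file:line", or None if unknown
    impact: Optional[str]    # e.g. "+200ms per 100 users", or None if not estimated
    severity: str            # "BLOCK" or "WARN"

def apply_evidence_gate(findings: list[Finding]) -> list[Finding]:
    """Downgrade any BLOCK finding that lacks a location or an impact estimate."""
    gated = []
    for finding in findings:
        if finding.severity == "BLOCK" and not (finding.location and finding.impact):
            finding = replace(finding, severity="WARN")
        gated.append(finding)
    return gated

raw = [
    Finding("N+1_QUERY", "src/db/queries.ts:47", "+200ms per 100 users", "BLOCK"),
    Finding("MEMORY_LEAK", "src/ui/Chart.tsx:30", None, "BLOCK"),
]
gated = apply_evidence_gate(raw)
print([(f.code, f.severity) for f in gated])
# [('N+1_QUERY', 'BLOCK'), ('MEMORY_LEAK', 'WARN')]
```

Only the finding with both a location and an impact estimate keeps its BLOCK severity.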
Known failure modes for this skill. Check these before declaring done.
| Failure Mode | Severity | Mitigation |
|---|---|---|
| BLOCK finding without impact estimate | HIGH | Every BLOCK needs "estimated impact: X" — evidence gate enforces this |
| False N+1 on intentional batched loops | MEDIUM | Check if loop has a batch_size limiter or is already prefetched upstream |
| Skipping framework checks because framework not detected | MEDIUM | If scout returns unknown framework, run generic checks + note in report |
| Calling browser-pilot on backend-only project | LOW | Check project type in Step 1 — browser-pilot only for frontend/fullstack |
| Reporting WARN as BLOCK (severity inflation) | MEDIUM | BLOCK = measurable regression on hot path; WARN = pattern that could be slow |
| Memory leak found, severity left as MEDIUM | HIGH | Leak in long-running process = cost driver (OOM restart loop → autoscaler spend). Escalate to BLOCK when process lifetime > 1 hour |
| High-cardinality label silently added as metric tag | HIGH | user_id / request_id / trace_id as METRIC label explodes ingestion cost 100x. Move to log/trace, never metric |
| Findings reported without tier ranking when multiple compete | MEDIUM | Cost Impact Hierarchy groups by tier (1-5). Tier 1 fix subsumes Tier 5 gain; operator otherwise optimizes wrong target |
| Cost claim made without cloud tag prerequisite | MEDIUM | Untagged resources = anomaly detection blind. Pair perf findings with WARN COST_UNATTRIBUTED when cloud tags missing |
| Artifact | Format | Location |
|---|---|---|
| Perf Report with verdict | Markdown (PASS/WARN/BLOCK) | inline |
| Per-finding details | Structured list (file:line + impact) | inline |
| Lighthouse scores (if ran) | Score table | inline |
| Framework-specific findings | Categorized list | inline |
~3000-8000 tokens input, ~500-1500 tokens output. Sonnet for pattern recognition.
Scope guardrail: perf investigates and reports only — it does not fix code. All fixes are delegated to fix (L2) after the report is reviewed.
npx claudepluginhub rune-kit/rune --plugin @rune/analytics