From performance-engineer
Profile a system or endpoint for performance bottlenecks — measure, identify, and prioritise optimisation targets.
Slash command: `/performance-engineer:performance-profile`
Profile performance for $ARGUMENTS.
Measure current performance. "It feels slow" is not a measurement. p95 = 2.3s on /api/search with 50 concurrent users is.
| Metric | How to measure | Record |
|---|---|---|
| p50 response time | APM, load test, or application timing | [ms] — typical user experience |
| p95 response time | APM or load test | [ms] — worst case for most users |
| p99 response time | APM or load test | [ms] — tail latency |
| Throughput | Requests per second at current load | [rps] |
| Error rate | Percentage of requests returning errors | [%] |
| Resource utilisation | CPU, memory, disk I/O, network during test | [%] |
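When raw per-request timings are available (for example, exported from a load test), the figures in the table above can be computed directly. The following is a minimal Python sketch rather than part of the skill itself: `percentile` and `summarise_baseline` are hypothetical helper names, and it assumes nearest-rank percentiles over latencies recorded in milliseconds.

```python
import math


def percentile(samples_ms, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples_ms)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def summarise_baseline(samples_ms, error_count, duration_s):
    """Summarise one test run; samples_ms holds one latency per request (assumed input shape)."""
    total = len(samples_ms)
    return {
        "p50_ms": percentile(samples_ms, 50),
        "p95_ms": percentile(samples_ms, 95),
        "p99_ms": percentile(samples_ms, 99),
        "throughput_rps": total / duration_s,
        "error_rate_pct": 100 * error_count / total,
    }
```

APM tools often interpolate between samples instead of using nearest rank, so small differences from dashboard values are expected.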
Rules:
Where does the total response time go? Break down the waterfall:
| Component | Time (ms) | % of total | Notes |
|---|---|---|---|
| Network | [ms] | [%] | DNS, TLS handshake, round trip |
| Server processing | [ms] | [%] | Application code execution |
| Database queries | [ms] | [%] | Total query time (may include multiple queries) |
| External API calls | [ms] | [%] | Third-party service latency |
| Serialisation | [ms] | [%] | JSON/XML encoding/decoding |
| Rendering (if frontend) | [ms] | [%] | Component rendering, DOM updates |
The component consuming the most time is the first optimisation target. Do not optimise a component that accounts for 5% of total time.
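The breakdown can be turned into percentages and a first target mechanically. This is a small sketch under the assumption that per-component timings have already been collected; the function name `timing_breakdown` and the numbers are illustrative only, not measurements.

```python
def timing_breakdown(component_ms):
    """Return rows of (component, ms, % of total), largest first."""
    total = sum(component_ms.values())
    if total <= 0:
        raise ValueError("total time must be positive")
    rows = [(name, ms, 100 * ms / total) for name, ms in component_ms.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Illustrative numbers only, not taken from a real system.
example = {
    "Network": 40,
    "Server processing": 60,
    "Database queries": 260,
    "External API calls": 30,
    "Serialisation": 10,
}

rows = timing_breakdown(example)
for name, ms, pct in rows:
    print(f"{name:<20} {ms:>5} ms {pct:5.1f}%")

first_target = rows[0][0]  # the component consuming the most time
```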
Database queries are the most common bottleneck. Check systematically:
| Problem | How to detect | Impact |
|---|---|---|
| N+1 queries | Query count per request (should be < 10). Enable query logging and count | Linear slowdown with data size |
| Missing indexes | EXPLAIN ANALYZE on slow queries. Sequential scans on large tables | Dramatic slowdown at scale |
| Full table scans | Query plan shows Seq Scan on tables with > 10K rows | O(n) instead of O(log n) |
| Lock contention | Check for long-running transactions, deadlocks in logs | Cascading delays under concurrency |
| Unnecessary queries | Queries that fetch data not used in the response | Wasted time and database load |
    # Look for ORM query calls that may sit inside loops (possible N+1)
    grep -rn "\.find\|\.get\|\.query\|\.select\|\.where\|\.include\|\.join" --include="*.ts" --include="*.py" --include="*.cs" .
    # Look for lazy- and eager-loading markers to spot missing eager loading
    grep -rn "lazy\|LazyLoad\|defer\|select_related\|prefetch_related" --include="*.ts" --include="*.py" --include="*.cs" .
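Counting queries per request, as the N+1 row suggests, can also be automated in a test. The sketch below uses Python's standard-library `sqlite3` module, whose `set_trace_callback` reports each statement SQLite executes. The schema and the helper names (`count_queries`, `titles_n_plus_one`, `titles_joined`) are hypothetical, and other databases or ORMs would need their own query-logging hook.

```python
import sqlite3


def count_queries(conn, work):
    """Run work(conn) and return (result, number of SQL statements SQLite executed)."""
    statements = []
    conn.set_trace_callback(statements.append)
    try:
        result = work(conn)
    finally:
        conn.set_trace_callback(None)
    return result, len(statements)


def make_demo_db():
    """Build a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
        INSERT INTO authors VALUES (1, 'A'), (2, 'B'), (3, 'C');
        INSERT INTO books VALUES (1, 1, 'a1'), (2, 2, 'b1'), (3, 3, 'c1');
        """
    )
    return conn


def titles_n_plus_one(conn):
    """One query for the parent rows, then one more query per parent row."""
    result = {}
    for author_id, name in conn.execute("SELECT id, name FROM authors").fetchall():
        rows = conn.execute(
            "SELECT title FROM books WHERE author_id = ?", (author_id,)
        ).fetchall()
        result[name] = [title for (title,) in rows]
    return result


def titles_joined(conn):
    """The same data fetched with a single JOIN."""
    result = {}
    query = (
        "SELECT a.name, b.title FROM authors a "
        "JOIN books b ON b.author_id = a.id ORDER BY a.id, b.id"
    )
    for name, title in conn.execute(query):
        result.setdefault(name, []).append(title)
    return result


conn = make_demo_db()
slow_result, slow_count = count_queries(conn, titles_n_plus_one)
fast_result, fast_count = count_queries(conn, titles_joined)
print(f"N+1 version: {slow_count} queries; JOIN version: {fast_count} query")
```

The query count of the first version grows with the number of parent rows, which is the linear slowdown the table describes; the JOIN version stays at one query regardless of data size.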
Third-party APIs and external services are latency you cannot control — but you can mitigate.
| Check | What to look for | Mitigation |
|---|---|---|
| Timeouts configured? | Every external call must have an explicit timeout | Set timeout to 3–5s for non-critical, 10–30s for critical |
| Circuit breaker? | Repeated failures should trip a circuit breaker, not keep retrying | Implement circuit breaker pattern |
| Parallel calls? | Sequential calls to independent services waste time | Use Promise.all, Task.WhenAll, asyncio.gather |
| Caching? | Stable data fetched on every request | Cache with appropriate TTL |
| Retry logic? | Retries without backoff cause thundering herd | Exponential backoff with jitter |
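Three of the mitigations in the table (explicit timeouts, parallel calls, and retries with exponential backoff and jitter) can be combined in a few lines. This is a sketch using `asyncio.wait_for` and `asyncio.gather`; `call_with_retry`, `fetch_all`, the default values, and the two stand-in services are assumptions for illustration. It retries only on timeouts and `ConnectionError`, and a production version would also want a circuit breaker around it.

```python
import asyncio
import random


async def call_with_retry(make_call, *, timeout_s=3.0, max_attempts=3,
                          base_delay_s=0.1, max_delay_s=2.0):
    """Await make_call() with a per-attempt timeout, retrying with backoff and full jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await asyncio.wait_for(make_call(), timeout=timeout_s)
        except (asyncio.TimeoutError, ConnectionError):
            if attempt == max_attempts:
                raise
            # Exponential backoff capped at max_delay_s. Sleeping a random time
            # in [0, cap] spreads retries out so clients do not retry in lockstep.
            cap = min(max_delay_s, base_delay_s * 2 ** (attempt - 1))
            await asyncio.sleep(random.uniform(0, cap))


async def fetch_all(calls, **retry_options):
    """Run independent calls concurrently instead of one after another."""
    return await asyncio.gather(
        *(call_with_retry(call, **retry_options) for call in calls)
    )


# Stand-ins for real services (hypothetical; replace with actual client calls).
attempts = {"flaky": 0}


async def stable_service():
    await asyncio.sleep(0.01)
    return "profile"


async def flaky_service():
    attempts["flaky"] += 1
    if attempts["flaky"] < 3:
        raise ConnectionError("simulated transient failure")
    return "recommendations"


results = asyncio.run(
    fetch_all([stable_service, flaky_service], timeout_s=1.0, base_delay_s=0.01)
)
print(results, attempts)
```

`asyncio.gather` returns results in the order the calls were passed, so the caller can unpack them positionally.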
Profile CPU-bound work:
| Problem | How to detect | Fix |
|---|---|---|
| O(n²) algorithms | Nested loops over collections, response time grows quadratically with data | Replace with O(n log n) or O(n) algorithm |
| Unnecessary serialisation | JSON.parse/stringify, deep clone on every request | Avoid redundant serialisation, use streaming |
| Redundant computation | Same calculation repeated across requests | Memoisation, caching computed values |
| Synchronous heavy work | CPU-bound work blocking the event loop or request thread | Move to background worker, use async processing |
| Expensive regular expressions | Complex regex on large strings (ReDoS risk) | Simplify regex, set input length limits |
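To illustrate the first row, the sketch below profiles a quadratic duplicate check against a linear replacement using Python's standard-library `cProfile` and `pstats`. The two functions and the `profile_call` helper are hypothetical examples, not code from any particular codebase.

```python
import cProfile
import io
import pstats


def has_duplicates_quadratic(items):
    """O(n^2): compares every pair of items."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_linear(items):
    """O(n): a set lookup replaces the inner loop (items must be hashable)."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False


def profile_call(func, *args, top=5):
    """Run func under cProfile; return (result, text report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


data = list(range(2000))  # no duplicates, so both versions scan everything
slow_result, slow_report = profile_call(has_duplicates_quadratic, data)
fast_result, fast_report = profile_call(has_duplicates_linear, data)
print(slow_report)
print(fast_report)
```

Comparing the cumulative time of the two reports at a few input sizes shows whether response time grows quadratically with data, which is the detection signal named in the table.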
Tools:
- Node.js: --prof, clinic.js, 0x (flame graphs)
- Python: cProfile, py-spy, scalene
- .NET: dotTrace, dotnet-trace, PerfView
Apply these systematic methods to ensure no resource or service is overlooked:
Under concurrency, contention creates bottlenecks that don't appear in single-request testing:
| Resource | Symptom | Check | Fix |
|---|---|---|---|
| Connection pool | Timeouts waiting for connection | Pool size vs concurrent requests | Increase pool size, reduce query time |
| Thread pool | Request queuing, rising latency under load | Thread count vs concurrent requests | Increase threads, move blocking I/O to async |
| Memory pressure | GC pauses, OOM errors under load | Memory usage trend during load test | Reduce allocation, increase memory, fix leaks |
| File descriptors | "Too many open files" errors | ulimit -n, open file count | Increase limits, close connections properly |
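For the connection pool row, a rough first estimate of "pool size vs concurrent requests" follows from Little's law: the average number of connections in use equals the request rate multiplied by the time each request holds a connection. The sketch below assumes steady-state traffic and one connection per request; `required_pool_size` and its `headroom` parameter are hypothetical names, and the result is a starting point to verify under load, not a guarantee.

```python
import math


def required_pool_size(requests_per_second, avg_hold_time_ms, headroom=1.0):
    """Estimate connections in use at steady state via Little's law (L = λW).

    avg_hold_time_ms is how long one request holds a connection, assumed to be
    measured during the load test. headroom > 1 adds slack for bursts.
    """
    if requests_per_second < 0 or avg_hold_time_ms < 0 or headroom < 1:
        raise ValueError("rate and time must be non-negative, headroom >= 1")
    in_use = requests_per_second * avg_hold_time_ms / 1000
    return max(1, math.ceil(in_use * headroom))


print(required_pool_size(200, 50))                # baseline estimate
print(required_pool_size(200, 50, headroom=1.5))  # with slack for bursts
print(required_pool_size(200, 25))                # after halving query time
```

The last call shows why the table lists "reduce query time" next to "increase pool size": shortening the hold time lowers the number of connections needed at the same throughput.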
| Metric | Target | How to measure |
|---|---|---|
| Largest Contentful Paint (LCP) | < 2.5s | Lighthouse, Web Vitals |
| Interaction to Next Paint (INP) | < 200ms | Lighthouse, Web Vitals |
| Cumulative Layout Shift (CLS) | < 0.1 | Lighthouse, Web Vitals |
| JavaScript bundle size | < 200KB gzipped | Bundlesize, webpack-bundle-analyzer |
| Image optimisation | WebP/AVIF, lazy loading, responsive sizes | Lighthouse audit |
| Render blocking resources | None in critical path | Lighthouse audit |
| Unnecessary re-renders | Minimal | React DevTools Profiler, Vue DevTools |
Rank optimisations by: impact (time saved) × frequency (requests affected) / effort (complexity to implement).
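The ranking formula can be applied as a simple score. In this sketch the `Optimisation` class, the units (milliseconds saved, affected requests per day, effort in person-days) and the candidate entries are all assumptions chosen for illustration; any consistent units work as long as every candidate uses the same ones.

```python
from dataclasses import dataclass


@dataclass
class Optimisation:
    name: str
    time_saved_ms: float      # impact per affected request
    requests_affected: float  # frequency, e.g. affected requests per day
    effort: float             # relative effort, e.g. person-days (must be > 0)

    @property
    def score(self):
        # impact x frequency / effort
        return self.time_saved_ms * self.requests_affected / self.effort


def rank(optimisations):
    """Highest score first."""
    return sorted(optimisations, key=lambda o: o.score, reverse=True)


# Made-up candidates, for illustration only.
candidates = [
    Optimisation("Rewrite report export", 3_000, 500, 10),
    Optimisation("Add index on search column", 800, 50_000, 1),
    Optimisation("Cache exchange-rate lookup", 120, 200_000, 2),
]

ranked = rank(candidates)
for item in ranked:
    print(f"{item.score:>14,.0f}  {item.name}")
```

Note how the largest per-request saving ranks last here because it affects few requests and costs the most effort.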
Rules:
# Performance Profile: [target]
## Baseline
| Metric | Value | Target | Status |
|---|---|---|---|
| p50 response | [ms] | < 200ms | PASS/FAIL |
| p95 response | [ms] | < 500ms | PASS/FAIL |
| p99 response | [ms] | < 1s | PASS/FAIL |
| Throughput | [rps] | [target] | PASS/FAIL |
| Error rate | [%] | < 0.1% | PASS/FAIL |
## Timing Breakdown
| Component | Time (ms) | % of total |
|---|---|---|
| [component] | [ms] | [%] |
## Bottlenecks Identified
| # | Component | Problem | Impact | Effort | Priority |
|---|---|---|---|---|---|
| 1 | [component] | [specific issue at file:line] | [time saved] | [complexity] | [High/Medium/Low] |
## Recommendations (ordered by priority)
1. **[Component — issue]** — [specific fix]. Expected improvement: [ms saved, % reduction]
2. **[Component — issue]** — [specific fix]. Expected improvement: [ms saved]
## Next Steps
- [ ] Implement recommendation #1
- [ ] Re-measure baseline after change
- [ ] Proceed to recommendation #2 only after verifying #1
- `/performance-engineer:load-test-plan`: design load tests to reproduce and measure the bottlenecks you've profiled.
- `/performance-engineer:capacity-plan`: feed profiling results into capacity planning to understand scaling limits.
npx claudepluginhub hpsgd/turtlestack --plugin performance-engineer