From harness-claude
Applies a measurement-first workflow: define metric, baseline, identify bottleneck, fix, and verify with statistical significance. Useful when optimizing performance without clear evidence.
Slash command
/harness-claude:perf-profiling-methodology
> Apply a systematic, measurement-first profiling workflow — define metric, establish baseline, identify bottleneck, implement fix, verify improvement with statistical significance — to avoid wasted optimization effort and ensure every change demonstrably improves performance.
Never optimize without a baseline. Before changing any code, measure the current state with the exact metric you want to improve. Record at least 5 measurements to establish a stable baseline (performance varies 10-30% between runs):
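```bash
# Example: run Lighthouse 5 times and compute the median LCP
for i in {1..5}; do
  npx lighthouse https://example.com --output=json --output-path=./run-$i.json --quiet
done
# Extract LCP (ms) from each run and take the median: the 3rd of 5 sorted values (requires jq)
jq '.audits["largest-contentful-paint"].numericValue' run-*.json | sort -n | sed -n '3p'
```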
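Once the per-run values are extracted, the baseline can be summarized as a median plus a rough measure of run-to-run spread. The following is a minimal sketch; `summarizeBaseline` is a hypothetical helper and the sample values are made up for illustration.

```javascript
// Summarize repeated measurements of one metric (e.g. LCP in milliseconds).
// `summarizeBaseline` is a hypothetical helper, not part of Lighthouse.
function summarizeBaseline(samples) {
  if (samples.length === 0) throw new Error('need at least one sample');
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  // Even-length inputs average the two middle values.
  const median =
    sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  const min = sorted[0];
  const max = sorted[sorted.length - 1];
  // Spread relative to the median: a quick check on run-to-run noise.
  const spreadPct = ((max - min) / median) * 100;
  return { median, min, max, spreadPct };
}

// Illustrative LCP values in ms (made up for this example).
const baseline = summarizeBaseline([2100, 2450, 2300, 2700, 2250]);
console.log(baseline);
```

If the spread is well above the 10-30% range mentioned earlier, it is probably worth stabilizing the test environment before trusting the baseline.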
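Follow the profiling workflow: define the metric, establish a baseline, identify the bottleneck, implement the fix, and verify the improvement with statistical significance.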
Read flame charts effectively. In Chrome DevTools Performance panel:
The optimization target is the function with the largest self time that is on the critical path. Deep call stacks with small self times are not the bottleneck — the leaf functions are.
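To make the self-time idea concrete, the sketch below computes self time as total time minus the time spent in children, then picks the node with the largest value. The node shape (`name`, `totalMs`, `children`) and the function names are assumptions for illustration; this is not the DevTools trace format, and it does not model whether a node is on the critical path.

```javascript
// Each node: { name, totalMs, children: [...] }. This shape is an assumption
// for illustration; it is not the DevTools trace format.
function selfTimes(node, out = []) {
  const childTotal = node.children.reduce((sum, c) => sum + c.totalMs, 0);
  out.push({ name: node.name, selfMs: node.totalMs - childTotal });
  node.children.forEach((child) => selfTimes(child, out));
  return out;
}

// Returns the node with the largest self time in the tree.
function largestSelfTime(root) {
  return selfTimes(root).reduce((best, n) => (n.selfMs > best.selfMs ? n : best));
}

// Made-up call tree: a deep stack where most time sits in one leaf.
const profile = {
  name: 'handleClick', totalMs: 120, children: [
    { name: 'render', totalMs: 110, children: [
      { name: 'formatRows', totalMs: 95, children: [] },
      { name: 'diffTree', totalMs: 10, children: [] },
    ] },
  ],
};
console.log(largestSelfTime(profile));
```

Here `handleClick` and `render` have large total times but small self times, so the candidate for optimization is `formatRows`. A fuller version would also aggregate repeated calls to the same function.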
Use CPU throttling in DevTools. Developer hardware is 5-10x faster than the median user device. Always profile with CPU throttling (4x) and network throttling (Fast 3G) enabled.
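The same throttling can be applied to automated runs. The fragment below is a sketch of Lighthouse CI collection settings, assuming a `lighthouserc.js` file; the URL, the run count, and the choice of `devtools` throttling are illustrative assumptions, and Lighthouse's default mobile settings may already apply a comparable CPU slowdown.

```javascript
// lighthouserc.js: collection settings sketch (values are assumptions)
module.exports = {
  ci: {
    collect: {
      url: ['https://example.com/'],
      numberOfRuns: 5, // repeat runs so a median can be taken
      settings: {
        throttlingMethod: 'devtools', // apply throttling in the browser rather than simulating it
        throttling: { cpuSlowdownMultiplier: 4 }, // 4x CPU slowdown
      },
    },
  },
};
```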
Set performance budgets and enforce in CI:
```javascript
// lighthouserc.js: Lighthouse CI configuration
module.exports = {
  ci: {
    assert: {
      assertions: {
        'largest-contentful-paint': ['error', { maxNumericValue: 2500 }],
        interactive: ['error', { maxNumericValue: 3500 }],
        'total-byte-weight': ['error', { maxNumericValue: 200000 }],
        'cumulative-layout-shift': ['error', { maxNumericValue: 0.1 }],
      },
    },
  },
};
```
Understand lab vs field data:
| Aspect | Lab (Lighthouse, WebPageTest) | Field (CrUX, RUM) |
|---|---|---|
| Conditions | Controlled, reproducible | Real user devices and networks |
| Device | Specified throttling | Actual user hardware |
| Coverage | Single page, one scenario | All pages, all users |
| Use for | Debugging, regression detection | Understanding real user experience |
| Limitation | Does not reflect real-world variance | Cannot reproduce specific scenarios |
Lab and field data should agree directionally. If Lighthouse shows LCP of 1.5s but CrUX shows 4.0s, the lab test is not representative of real user conditions (likely missing slow devices or slow networks).
Measure with statistical significance:
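```javascript
// Simple statistical validation for A/B performance tests.
// Uses a normal approximation, so it is reasonable for roughly 30+ samples per group.
const mean = (xs) => xs.reduce((sum, x) => sum + x, 0) / xs.length;
const variance = (xs) => {
  const m = mean(xs);
  return xs.reduce((sum, x) => sum + (x - m) ** 2, 0) / (xs.length - 1); // sample variance
};
// Two-sided critical values for the supported confidence levels
const CRITICAL_Z = { 0.9: 1.645, 0.95: 1.96, 0.99: 2.576 };

function isSignificant(controlSamples, treatmentSamples, confidenceLevel = 0.95) {
  const criticalValue = CRITICAL_Z[confidenceLevel];
  if (criticalValue === undefined) throw new Error(`Unsupported confidence level: ${confidenceLevel}`);
  // Standard error of the difference in means; the two groups may differ in size
  const stdErr = Math.sqrt(
    variance(controlSamples) / controlSamples.length +
      variance(treatmentSamples) / treatmentSamples.length
  );
  const tStat = (mean(controlSamples) - mean(treatmentSamples)) / stdErr;
  return Math.abs(tStat) > criticalValue;
}
```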
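A fixed critical value of 1.96 is a large-sample approximation. With only a handful of runs per variant, a test that accounts for sample size is safer. The sketch below swaps in Welch's t-test with a lookup table of two-sided 95% critical values; `welchTest` and its helper names are hypothetical, the sample data is made up, and each group needs at least two samples.

```javascript
// Two-sided 95% critical values of Student's t for df = 1..30.
const T_CRIT_95 = [
  12.706, 4.303, 3.182, 2.776, 2.571, 2.447, 2.365, 2.306, 2.262, 2.228,
  2.201, 2.179, 2.160, 2.145, 2.131, 2.120, 2.110, 2.101, 2.093, 2.086,
  2.080, 2.074, 2.069, 2.064, 2.060, 2.056, 2.052, 2.048, 2.045, 2.042,
];

const sampleMean = (xs) => xs.reduce((s, x) => s + x, 0) / xs.length;
const sampleVariance = (xs) => {
  const m = sampleMean(xs);
  return xs.reduce((s, x) => s + (x - m) ** 2, 0) / (xs.length - 1);
};

// Welch's t-test at 95% confidence; `welchTest` is a hypothetical name.
function welchTest(control, treatment) {
  const a = sampleVariance(control) / control.length;
  const b = sampleVariance(treatment) / treatment.length;
  const t = (sampleMean(control) - sampleMean(treatment)) / Math.sqrt(a + b);
  // Welch-Satterthwaite approximation of the degrees of freedom.
  const df = (a + b) ** 2 /
    (a ** 2 / (control.length - 1) + b ** 2 / (treatment.length - 1));
  // Round df down (conservative); above 30 fall back to the normal value 1.96.
  const dfFloor = Math.floor(df);
  const critical = dfFloor >= 1 && dfFloor <= 30 ? T_CRIT_95[dfFloor - 1] : 1.96;
  return { t, df, critical, significant: Math.abs(t) > critical };
}

// Made-up LCP samples in ms: five runs per variant.
const result = welchTest(
  [2500, 2600, 2550, 2450, 2650],
  [2300, 2350, 2250, 2400, 2200]
);
console.log(result);
```

With five runs per group the critical value is noticeably larger than 1.96, so small differences that would pass the large-sample check are correctly treated as noise.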
Optimize in this order (each level has 10x the impact of the one below):
Most wasted optimization effort happens when teams optimize at level 4 (micro-optimizing a function from 2ms to 1ms) while the architecture at level 1 adds 3 seconds of unnecessary latency.
Pinterest's performance team validates every optimization with an A/B test:
This protocol caught a "30% LCP improvement" that was actually a 2% improvement. The initial benchmark ran on a fast internal network; the A/B test revealed the improvement was much smaller for real users on cellular networks. The fix was still shipped because 2% improvement for all users was valuable, but expectations were correctly calibrated.
Shopify runs Lighthouse CI on every PR for every storefront route. Each route has a performance budget:
Any PR that regresses a metric beyond its budget fails the build. The PR author receives a comparison report showing exactly which metric regressed, by how much, and which files contributed to the regression (via source map analysis).
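The core of a per-route budget gate can be sketched in a few lines. This is an illustration of the idea only, not Shopify's actual tooling; the routes, metric names, limits, and the `findBudgetViolations` helper are all hypothetical.

```javascript
// Hypothetical per-route budgets; metric names and limits are assumptions.
const budgets = {
  '/': { lcpMs: 2500, cls: 0.1 },
  '/products': { lcpMs: 3000, cls: 0.1 },
};

// Returns one entry per metric that exceeds its budget.
function findBudgetViolations(measurements, routeBudgets) {
  const violations = [];
  for (const [route, metrics] of Object.entries(measurements)) {
    const budget = routeBudgets[route];
    if (!budget) continue; // routes without a budget are not checked
    for (const [metric, limit] of Object.entries(budget)) {
      const value = metrics[metric];
      if (value !== undefined && value > limit) {
        violations.push({ route, metric, value, limit, overBy: value - limit });
      }
    }
  }
  return violations;
}

// Made-up measurements for one PR build.
const measured = {
  '/': { lcpMs: 2300, cls: 0.05 },
  '/products': { lcpMs: 3400, cls: 0.08 },
};
const budgetViolations = findBudgetViolations(measured, budgets);
console.log(budgetViolations);
// A CI script would typically exit non-zero when budgetViolations.length > 0.
```

Reporting `overBy` alongside the limit gives the PR author the size of the regression, not just a pass or fail.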
Use a three-tier approach:
Optimizing without measuring first. "I bet the problem is the database" leads to weeks of database optimization when the real bottleneck is a 3MB uncompressed hero image. Always profile first, then optimize the actual bottleneck.
Testing on developer hardware without throttling. A MacBook Pro on gigabit fiber does not represent the median user on a mid-tier Android phone on 4G. Always enable CPU throttling (4x) and network throttling (Fast 3G) in DevTools, or use WebPageTest with a real Moto G4 device.
Single-run measurements. Performance varies 10-30% between runs due to background processes, network conditions, and GC timing. A single run showing 2.1s LCP could be 2.7s on the next run. Always take the median of 5+ runs.
Optimizing p50 when p95 is the real problem. Median latency looks great at 500ms, but p95 is 8 seconds, which means 1 in 20 page loads takes 8 seconds or longer. Track high percentiles (p95, p99) alongside the median, and do not rely on averages.
Micro-benchmarking in isolation. Optimizing a function from 1ms to 0.1ms is a 10x improvement that is completely irrelevant if the function is called once during a 3-second page load. Always measure the impact on the end-to-end metric, not the isolated function.
Premature optimization without profiling. Adding useMemo to every React component, will-change to every element, or code-splitting every route "just in case" adds complexity without measured benefit. Profile first, optimize only the measured bottleneck.
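For the p50-versus-p95 pitfall above, a short sketch shows how a healthy median can hide a slow tail. It uses the nearest-rank definition of a percentile, which is one of several common definitions; the `percentile` helper and the latency values are made up for illustration.

```javascript
// Nearest-rank percentile: the smallest value with at least p% of samples at or below it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error('need at least one sample');
  if (p <= 0 || p > 100) throw new Error('p must be in (0, 100]');
  const sorted = [...samples].sort((a, b) => a - b);
  // Multiply before dividing to avoid floating-point error in the rank.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Illustrative latencies in ms (made up): mostly fast, with a slow tail.
const latencies = [
  420, 450, 460, 470, 480, 490, 500, 500, 510, 520,
  530, 540, 550, 560, 580, 600, 650, 900, 6500, 8000,
];
console.log(percentile(latencies, 50)); // 520
console.log(percentile(latencies, 95)); // 6500
```

The median suggests the page is fast, while the 95th percentile shows that the slowest loads take more than ten times as long.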
npx claudepluginhub intense-visions/harness-engineering --plugin harness-claude