Profiling-driven code optimization plugin by donghyungko
npx claudepluginhub bluuewhale/auto-optimize
Profiling-driven code optimization workflow. Builds Regression Test and Benchmark Test infrastructure first, then runs an autonomous optimization loop to iteratively improve measurable metrics.
AutoResearch for Performance Engineering
Measure first. Reason deep. Reflect. Repeat.
Andrej Karpathy introduced the idea of autoresearch — closing the loop between hypothesis, experiment, and reflection so that an AI agent can drive an entire research cycle autonomously. auto-optimize applies that idea to performance engineering.
You define a numeric goal and a success threshold. The plugin builds regression and benchmark infrastructure, locks a baseline, then runs an autonomous loop: profile → reason → plan → apply → test → measure → reflect → repeat. Every iteration is a git commit. Every decision is reasoned, recorded, and fed back into the next cycle.
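The keep-or-revert decision at the heart of that loop can be sketched in a few lines. This is an illustration only, not the plugin's implementation: the `Trial` type, the `run` method, and the lower-is-better metric are all assumptions made for the example.

```java
import java.util.List;

/** Hypothetical sketch of a measure-then-decide loop; not the plugin's actual code. */
public class OptimizationLoopSketch {

    /** One candidate change: did the regression tests pass, and what did the benchmark measure? */
    static final class Trial {
        final String name;
        final boolean testsPassed;
        final double metric;

        Trial(String name, boolean testsPassed, double metric) {
            this.name = name;
            this.testsPassed = testsPassed;
            this.metric = metric;
        }
    }

    /**
     * Walks the trials in order and keeps a change only if its tests pass and
     * its metric (assumed lower-is-better, e.g. ns/op) beats the best so far.
     * Stops early once the success threshold has been reached.
     */
    static double run(double baseline, double target, List<Trial> trials) {
        double best = baseline;
        for (Trial t : trials) {
            if (best <= target) break;                // goal met: stop iterating
            boolean keep = t.testsPassed && t.metric < best;
            if (keep) best = t.metric;                // the change is kept
            // otherwise the change would be reverted and best stays unchanged
        }
        return best;
    }

    public static void main(String[] args) {
        List<Trial> trials = List.of(
            new Trial("A", true, 90.0),
            new Trial("B", false, 70.0),   // faster, but breaks tests: dropped
            new Trial("C", true, 95.0),    // slower than the current best: dropped
            new Trial("D", true, 78.0));
        System.out.println(run(100.0, 80.0, trials));  // prints 78.0
    }
}
```

In this sketch a failed test always outranks a benchmark win, which mirrors the order of the loop steps: test comes before measure.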
claude plugin marketplace add bluuewhale/auto-optimize
claude plugin install auto-optimize@auto-optimize
/auto-optimize The API is slow. I want to make it faster.
auto-optimize will ask a few clarifying questions — metric, scope, success target, and test commands — then take it from there. If benchmarks or regression tests are missing, it writes them before starting the loop.
Full write-up: HashSmith Part 3 — I Automated My Way to a 27% Faster Hash Table
HashSmith is an open-source high-performance hash table for the JVM — a SwissTable-style map built around SWAR probing and 8-byte control word groups. After two rounds of manual optimization, the author handed the profiler to auto-optimize.
One prompt. ~3 hours. No manual intervention.
/auto-optimize I want to optimize the get/put performance of the SwissMap implementation.
| Metric | Result |
|---|---|
| Experiments run | 5 |
| Optimizations landed | 3 |
| Dropped | 2 |
| Improvement vs baseline | 13–32% across all 8 benchmark scenarios |
The agent ran 5 experiments autonomously. Three compounding wins, in order:
1. Tombstone guard: the probe loop was carrying tombstone-handling logic on a path where tombstones essentially never exist in production. Splitting into two specialized loop bodies eliminated the dead weight. Put path: -19% to -45%.
2. ILP hoisting on the read path: emptyMask was being computed after the key-equality loop, creating a serial dependency. Moving it adjacent to eqMask let the CPU's out-of-order engine pipeline both SWAR operations in the same clock cycle. Get path: -11% to -36%.
3. A third, smaller improvement compounded on top of both.
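To make the ILP hoisting concrete, here is a minimal sketch of SWAR group matching on one 8-byte control word, with both masks derived before the key-equality loop. It is not HashSmith's code. The control-byte encoding (EMPTY = 0xFF, DELETED = 0x80, 7-bit fingerprints for full slots), the `probeGroup` helper, and its return codes are assumptions chosen for the example; only the names eqMask and emptyMask come from the write-up.

```java
/** Illustrative SWAR group matching on one 8-byte control word (assumed encoding). */
public class SwarGroupSketch {
    static final long LSB  = 0x0101010101010101L;  // 0x01 in every byte
    static final long MSB  = 0x8080808080808080L;  // 0x80 in every byte
    static final long LOW7 = 0x7F7F7F7F7F7F7F7FL;  // 0x7F in every byte

    // Assumed encoding for this sketch: EMPTY = 0xFF, DELETED = 0x80,
    // full slots hold a 7-bit fingerprint (high bit clear).

    /** Sets the high bit of each result byte whose control byte equals h2. */
    static long eqMask(long ctrl, int h2) {
        long x = ctrl ^ (LSB * (h2 & 0x7F));       // matching bytes become 0x00
        return ~(((x & LOW7) + LOW7) | x) & MSB;   // exact zero-byte test, no carries between bytes
    }

    /** Sets the high bit of each result byte whose control byte is EMPTY. */
    static long emptyMask(long ctrl) {
        return ctrl & (ctrl << 1) & MSB;           // requires bit 7 and bit 6 both set
    }

    /** Index (0..7) of the lowest marked byte in a non-zero mask. */
    static int firstIndex(long mask) {
        return Long.numberOfTrailingZeros(mask) >>> 3;
    }

    /**
     * Probes one group. Both masks are derived from ctrl up front, so neither
     * depends on the other and the CPU is free to overlap them.
     * Returns the matching slot index, -1 if an EMPTY byte proves the key
     * absent, or -2 if probing should continue with the next group.
     */
    static int probeGroup(long ctrl, int h2, long[] keys, long key) {
        long eq = eqMask(ctrl, h2);
        long empty = emptyMask(ctrl);   // hoisted: computed before the key-equality loop
        while (eq != 0) {
            int i = firstIndex(eq);
            if (keys[i] == key) return i;
            eq &= eq - 1;               // clear the lowest set bit
        }
        return empty != 0 ? -1 : -2;
    }

    public static void main(String[] args) {
        // Control bytes, lowest first: 0x12, 0x34, DELETED, 0x12, EMPTY, EMPTY, EMPTY, EMPTY
        long ctrl = 0xFFFFFFFF12803412L;
        long[] keys = {7L, 8L, 0L, 9L, 0L, 0L, 0L, 0L};
        System.out.println(probeGroup(ctrl, 0x12, keys, 9L));  // prints 3
        System.out.println(probeGroup(ctrl, 0x12, keys, 5L));  // prints -1
    }
}
```

Whether the two mask computations actually overlap depends on the JIT and the CPU; the sketch only removes the data dependency that would prevent it.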
None of these changes required the author to write a single line of code. The structured reasoning pipeline (Step-Back → CoT → Self-Consistency → Pre-mortem) found the tombstone fast path by asking "what is this loop doing that it doesn't need to do?", a question that wasn't visible in the disassembly alone.
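A tombstone guard of this kind might look roughly like the sketch below, which uses a deliberately simplified linear-probing table with one control byte per slot rather than SWAR groups. Everything here is an assumption made for illustration: the class, the `tombstones` counter, the control-byte values, and the hash function are hypothetical and are not taken from HashSmith.

```java
import java.util.Arrays;

/** Hypothetical sketch of a tombstone guard; not HashSmith's actual code. */
public class TombstoneGuardSketch {
    // Assumed control-byte values for this sketch only.
    static final byte EMPTY = (byte) 0xFF;
    static final byte DELETED = (byte) 0x80;  // tombstone left behind by remove()
    static final byte FULL = 0x01;

    final byte[] ctrl;
    final long[] keys;
    final int mask;
    int tombstones = 0;  // stays zero unless remove() has been called

    TombstoneGuardSketch(int capacityPow2) {
        ctrl = new byte[capacityPow2];
        keys = new long[capacityPow2];
        mask = capacityPow2 - 1;
        Arrays.fill(ctrl, EMPTY);
    }

    int home(long key) {
        return (int) ((key * 0x9E3779B97F4A7C15L) >>> 32) & mask;
    }

    /** Guard: choose the specialized loop once, outside the probe loop. */
    int findSlot(long key) {
        return tombstones == 0 ? findSlotFast(key) : findSlotGeneral(key);
    }

    /** Fast path: with no tombstones, every non-EMPTY slot is FULL. */
    int findSlotFast(long key) {
        int i = home(key);
        while (ctrl[i] != EMPTY) {
            if (keys[i] == key) return i;
            i = (i + 1) & mask;
        }
        return i;
    }

    /** General path: skips tombstones and remembers the first one for reuse. */
    int findSlotGeneral(long key) {
        int i = home(key);
        int firstDeleted = -1;
        while (ctrl[i] != EMPTY) {
            if (ctrl[i] == DELETED) {
                if (firstDeleted < 0) firstDeleted = i;
            } else if (keys[i] == key) {
                return i;
            }
            i = (i + 1) & mask;
        }
        return firstDeleted >= 0 ? firstDeleted : i;
    }

    // Callers must keep at least one EMPTY slot, or the probe loops never end.
    void put(long key) {
        int i = findSlot(key);
        if (ctrl[i] == DELETED) tombstones--;  // reusing a tombstone
        ctrl[i] = FULL;
        keys[i] = key;
    }

    boolean contains(long key) {
        int i = findSlot(key);
        return ctrl[i] == FULL && keys[i] == key;
    }

    boolean remove(long key) {
        int i = findSlot(key);
        if (ctrl[i] != FULL || keys[i] != key) return false;
        ctrl[i] = DELETED;
        tombstones++;
        return true;
    }

    public static void main(String[] args) {
        TombstoneGuardSketch m = new TombstoneGuardSketch(8);
        m.put(1); m.put(2); m.put(3);
        m.remove(2);                          // later probes now use the general loop
        System.out.println(m.contains(3));    // prints true
        m.put(2);                             // reuses the tombstone
        System.out.println(m.tombstones);     // prints 0
    }
}
```

The point of the split is that the fast loop carries no tombstone branch at all, so a workload that never deletes pays nothing for deletion support.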
Most optimization attempts fail silently.
auto-optimize enforces the discipline you know you should have but don't.
| Phase | What Happens | Output |
|---|---|---|
| 0. Gather | Collects goal, scope, metric direction, and numeric success criteria | experiment-plan.md |
| 1. Infra | Builds Regression Test and Benchmark Test scripts if missing (parallel sub-agents) | tests/ + bench/ |
| 1.5 Baseline | Locks noise-floor-validated baseline measurement and environment snapshot | baseline/ |
| 2. Loop | Profile → Disassemble → Reason (Opus) → Apply → Test → Benchmark → Reflect | iterations/ + leaderboard.md |
| 3. Report | Summarizes all iterations, best config, and recommended next steps | final-report.md |
Every iteration is a git commit — including reverts. The full experiment history is always recoverable.
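The README does not spell out how the baseline's noise floor is validated. One plausible reading, sketched below under stated assumptions, is to repeat the baseline benchmark, take the relative spread of the runs as the noise floor, and count a later gain only if it clearly exceeds that spread. The class, the 5% stability limit, and the factor-of-two margin are all hypothetical.

```java
/** Hypothetical noise-floor check; thresholds are illustrative assumptions. */
public class NoiseFloorSketch {
    static double mean(double[] xs) {
        double sum = 0;
        for (double x : xs) sum += x;
        return sum / xs.length;
    }

    /** Sample standard deviation (n - 1 in the denominator). */
    static double stddev(double[] xs) {
        double m = mean(xs);
        double ss = 0;
        for (double x : xs) ss += (x - m) * (x - m);
        return Math.sqrt(ss / (xs.length - 1));
    }

    /** Relative noise of repeated baseline runs (coefficient of variation). */
    static double noiseFloor(double[] baselineRuns) {
        return stddev(baselineRuns) / mean(baselineRuns);
    }

    /** A baseline is only worth locking if its own runs agree closely enough. */
    static boolean baselineIsStable(double[] baselineRuns, double maxNoise) {
        return noiseFloor(baselineRuns) <= maxNoise;
    }

    /** Lower-is-better metric: count a gain only if it clearly exceeds the noise. */
    static boolean isRealImprovement(double baselineMean, double candidate, double noise) {
        double gain = (baselineMean - candidate) / baselineMean;
        return gain > 2 * noise;   // the factor 2 is an arbitrary safety margin
    }

    public static void main(String[] args) {
        double[] runs = {100.0, 102.0, 98.0, 100.0};   // made-up ns/op samples
        double noise = noiseFloor(runs);
        System.out.println(baselineIsStable(runs, 0.05));            // prints true
        System.out.println(isRealImprovement(100.0, 99.0, noise));   // prints false
        System.out.println(isRealImprovement(100.0, 90.0, noise));   // prints true
    }
}
```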
Most AI coding tools apply changes and hope for the best. auto-optimize's inner loop is built differently — each iteration runs a structured reasoning pipeline powered by Claude Opus before a single line of code is touched.
Every iteration delegates planning to a dedicated Opus sub-agent that applies four reasoning techniques in sequence: Step-Back, CoT, Self-Consistency, and Pre-mortem.