Autonomous optimization loops for Claude Code
npx claudepluginhub hwuiwon/autotune
Edit, benchmark, keep improvements, revert regressions, and repeat until the budget runs out or the loop pauses.
Inspired by karpathy/autoresearch.
Try an idea, measure it, keep what works, repair what breaks, and pause with a reason when the loop gets stuck.
Works for any optimization target: test speed, bundle size, LLM training loss, build times, Lighthouse scores, inference latency — anything with a number you want to move.
┌─────────────────────────────────────────────────┐
│ Autotune Loop │
│ │
│ Edit code ─► Benchmark ─► Keep / Revert / Heal │
│ ▲ │ │
│ └──── Explain state ◄───────┘ │
│ until budget or pause │
└─────────────────────────────────────────────────┘
The Claude Code agent autonomously:
- […] (autotune.sh)
- […] autotune.jsonl
Infrastructure (domain-agnostic):
- init-experiment.sh: Initialize a session with metric name, unit, direction
- run-experiment.sh: Run benchmark, parse METRIC name=value lines, run correctness checks
- log-experiment.sh: Log results, auto-commit/revert, compute confidence scores, classify health
- dashboard.sh: Terminal dashboard with live monitoring and health state
- lib/health.py: Recovery controller and explainable health transitions
Agent (the brain):
- agents/autotune.md: Full autonomous loop with resume protocol, loop rules, and tool protocol
Skill (setup wizard):
- skills/autotune/SKILL.md: Gathers goal, writes session files, starts the loop
# Step 1: Add the marketplace
/plugin marketplace add hwuiwon/autotune
# Step 2: Install the plugin
/plugin install autotune@autotune
For local development/testing:
claude --plugin-dir /path/to/autotune
cd /path/to/your/project
# Interactive — agent asks what to optimize
claude --agent autotune
# Quick start with a goal
claude --agent autotune -p "Optimize test suite speed"
In a separate terminal:
bash "${CLAUDE_PLUGIN_ROOT:-/path/to/autotune}/bin/dashboard.sh" --watch
# Stop the loop (preserves all data)
autotune stop
# Explain the current state and next action
autotune explain
# Force the loop into repair mode
autotune repair
# Resume later
claude --agent autotune
# Clear everything and start fresh
autotune clear
Three files keep the session alive across restarts and context resets:
- autotune.jsonl: Append-only log of every experiment (metric, status, commit, description, ASI)
- autotune.md: Living document: objective, what's been tried, dead ends, key wins
- .autotune.state: Current operating mode, health state, streaks, and recovery budget
A fresh agent with no memory can read these files and continue exactly where the previous session left off.
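As a rough illustration of how a resuming session might use the append-only log, the sketch below parses JSON Lines records and picks the best kept result. The field names `metric`, `status`, `commit`, and `description` come from the list above; the status value `"keep"` and the helper names `parse_log` and `best_kept` are assumptions for illustration, not the plugin's actual schema or API.

```python
import json


def parse_log(lines):
    """Parse JSON Lines records, skipping blank or truncated lines.

    `lines` is any iterable of strings, e.g. an open file handle.
    """
    experiments = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            experiments.append(json.loads(line))
        except json.JSONDecodeError:
            # A partially written last line (e.g. after a crash) is ignored.
            continue
    return experiments


def best_kept(experiments, direction="lower"):
    """Return the best experiment whose status is "keep", or None.

    ASSUMPTION: "keep" is a hypothetical status value; the real log may
    use a different name for accepted experiments.
    """
    kept = [
        e for e in experiments
        if e.get("status") == "keep" and isinstance(e.get("metric"), (int, float))
    ]
    if not kept:
        return None
    pick = min if direction == "lower" else max
    return pick(kept, key=lambda e: e["metric"])
```

Because an open file iterates line by line, `parse_log(open("autotune.jsonl", encoding="utf-8"))` would work directly on the log. Skipping malformed lines lets a reader tolerate a partially written final record.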
After 3 or more experiments, autotune computes a confidence score using the Median Absolute Deviation (MAD) as a robust noise estimator:
confidence = |best_improvement| / MAD
| Score | Label | Meaning |
|---|---|---|
| >= 2.0 | High (green) | Improvement is likely real |
| 1.0-2.0 | Marginal (yellow) | Could be noise — consider re-running |
| < 1.0 | Within noise (red) | Improvement may not be real |
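The formula and thresholds above can be sketched in a few lines of Python. This is a minimal illustration, not the code in log-experiment.sh: it assumes the first run is the baseline, that `best_improvement` is the gap between the baseline and the best run, and that MAD is taken over all recorded metric values. The function names are hypothetical.

```python
from statistics import median


def mad(values):
    """Median Absolute Deviation: the median of |x - median(values)|."""
    center = median(values)
    return median([abs(v - center) for v in values])


def confidence(metrics, direction="lower"):
    """Return |best_improvement| / MAD, or None when it cannot be computed.

    ASSUMPTIONS: metrics[0] is the baseline run, and MAD is computed over
    every recorded value. The real tool may define both differently.
    """
    if len(metrics) < 3:
        return None  # the score is only computed after 3+ experiments
    baseline = metrics[0]
    best = min(metrics) if direction == "lower" else max(metrics)
    noise = mad(metrics)
    if noise == 0:
        return None  # no measurable noise, so the ratio is undefined
    return abs(baseline - best) / noise


def label(score):
    """Map a score to the labels in the table above."""
    if score is None:
        return "n/a"
    if score >= 2.0:
        return "High"
    if score >= 1.0:
        return "Marginal"
    return "Within noise"


runs = [100, 96, 98, 90, 102]  # illustrative metric values, lower is better
print(confidence(runs), label(confidence(runs)))
```

For these illustrative values the median is 98, the MAD is 2, and the best improvement is 10, so the score is 5.0 and the label is High. MAD is used instead of the standard deviation because a single outlier run barely moves it.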
Autotune is not a blind infinite loop anymore. It tracks loop health and switches strategies when the run stops being productive.
Health states:
- running / improving: normal optimization mode
- plateaued: repeated non-improving runs
- crashing: benchmark or checks failures are stacking up
- healing: a recovery playbook is active
- paused: recovery budget is exhausted and the loop stopped safely
Recovery playbooks are configured in priority order and can include:
- rebaseline
- shrink_scope
- diagnose
- pause
Use autotune explain to see the current health state, last decision reason, and suggested next action.
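To make the health states concrete, here is a minimal sketch of how streak counters could map to a state and how a playbook could be chosen in priority order. It is not the logic in lib/health.py: the thresholds `PLATEAU_AFTER` and `CRASH_AFTER`, the `LoopState` fields, and the playbook ordering (taken from the order listed above) are all assumptions for illustration.

```python
from dataclasses import dataclass

# ASSUMPTION: hypothetical thresholds; the real values are not documented here.
PLATEAU_AFTER = 5  # consecutive non-improving runs
CRASH_AFTER = 3    # consecutive benchmark or checks failures

# ASSUMPTION: priority order mirrors the order in which playbooks are listed.
PLAYBOOKS = ["rebaseline", "shrink_scope", "diagnose", "pause"]


@dataclass
class LoopState:
    no_improve_streak: int = 0
    failure_streak: int = 0
    playbook_active: bool = False
    recovery_budget: int = 3


def classify(state):
    """Return one of the health states described above."""
    if state.recovery_budget <= 0:
        return "paused"
    if state.playbook_active:
        return "healing"
    if state.failure_streak >= CRASH_AFTER:
        return "crashing"
    if state.no_improve_streak >= PLATEAU_AFTER:
        return "plateaued"
    return "running"


def next_playbook(already_tried):
    """Pick the highest-priority playbook that has not been tried yet."""
    for name in PLAYBOOKS:
        if name not in already_tried:
            return name
    return "pause"  # nothing left to try, so stop safely
```

Checking the exhausted budget first means a loop that has run out of recovery attempts always reports paused, whatever the streak counters say.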
An optional autotune.checks.sh script runs correctness checks after every passing benchmark:
#!/bin/bash
set -e
pnpm typecheck
pnpm test
pnpm lint
If checks fail, the result is logged as checks_failed and the changes are reverted.
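The benchmark-then-checks flow can be sketched as two small functions: one that extracts `METRIC name=value` lines from benchmark output, and one that turns the outcome into a log status. Only `checks_failed` and the `METRIC name=value` format come from this document; the exact line grammar accepted by the regular expression, the other status names (`crash`, `keep`, `discard`), and the function names are assumptions.

```python
import re

# ASSUMPTION: one metric per line, "METRIC <name>=<number>"; the real parser
# in run-experiment.sh may accept a different grammar.
METRIC_LINE = re.compile(
    r"^METRIC\s+([A-Za-z_][\w.-]*)=([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)$"
)


def parse_metrics(output):
    """Collect METRIC name=value pairs from benchmark output as floats."""
    metrics = {}
    for line in output.splitlines():
        match = METRIC_LINE.match(line.strip())
        if match:
            metrics[match.group(1)] = float(match.group(2))
    return metrics


def decide(benchmark_ok, checks_ok, value, best, direction="lower"):
    """Map one experiment's outcome to a status string.

    Checks only matter when the benchmark itself passed, matching the
    "after every passing benchmark" rule above.
    """
    if not benchmark_ok:
        return "crash"          # hypothetical status name
    if not checks_ok:
        return "checks_failed"  # status named in this document
    improved = value < best if direction == "lower" else value > best
    return "keep" if improved else "discard"  # hypothetical status names
```

For example, `parse_metrics("METRIC total_ms=1234.5")` yields `{"total_ms": 1234.5}`, and an improved metric with failing checks still maps to `checks_failed`, so the change would be reverted.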