Marketplace

autotune

Autonomous optimization loops for Claude Code

npx claudepluginhub hwuiwon/autotune

README

View full README on GitHub

1 Plugin

autotune

0·

Autonomous optimization loops — edit, benchmark, keep improvements, revert regressions, repeat forever

2mo

v0.3.0

hwuiwon

Stats

Plugins1

UpdatedMar 28, 2026

Links

View on GitHub View Marketplace JSON

autotune

Autonomous optimization loops for Claude Code. Inspired by karpathy/autoresearch.

Try an idea, measure it, keep what works, repair what breaks, and pause with a reason when the loop gets stuck.

Works for any optimization target: test speed, bundle size, LLM training loss, build times, Lighthouse scores, inference latency — anything with a number you want to move.

How It Works

┌─────────────────────────────────────────────────┐
│                 Autotune Loop                │
│                                                  │
│ Edit code ─► Benchmark ─► Keep / Revert / Heal  │
│      ▲                           │               │
│      └──── Explain state ◄───────┘               │
│         until budget or pause                    │
└─────────────────────────────────────────────────┘

The Claude Code agent autonomously:

Analyzes the codebase and picks an optimization to try
Makes a focused code change
Runs the benchmark (autotune.sh)
Compares against baseline and guardrails: improved? keep (auto-commit). Worse? discard (auto-revert)
Tracks loop health: improving, plateaued, crashing, healing, paused
Applies recovery playbooks with bounded retries, then pauses with an explicit reason if recovery is exhausted
Logs experiments, recovery actions, and decisions to autotune.jsonl

What's Included

Infrastructure (domain-agnostic):

init-experiment.sh — Initialize a session with metric name, unit, direction
run-experiment.sh — Run benchmark, parse METRIC name=value lines, run correctness checks
log-experiment.sh — Log results, auto-commit/revert, compute confidence scores, classify health
dashboard.sh — Terminal dashboard with live monitoring and health state
lib/health.py — Recovery controller and explainable health transitions
Hooks for auto-resume and benchmark enforcement

Agent (the brain):

agents/autotune.md — Full autonomous loop with resume protocol, loop rules, and tool protocol

Skill (setup wizard):

skills/autotune/SKILL.md — Gathers goal, writes session files, starts the loop

Quick Start

Install as Claude Code Plugin (recommended)

# Step 1: Add the marketplace
/plugin marketplace add hwuiwon/autotune

# Step 2: Install the plugin
/plugin install autotune@autotune

For local development/testing:

claude --plugin-dir /path/to/autotune

Run

cd /path/to/your/project

# Interactive — agent asks what to optimize
claude --agent autotune

# Quick start with a goal
claude --agent autotune -p "Optimize test suite speed"

Monitor

In a separate terminal:

bash "${CLAUDE_PLUGIN_ROOT:-/path/to/autotune}/bin/dashboard.sh" --watch

Stop / Resume

# Stop the loop (preserves all data)
autotune stop

# Explain the current state and next action
autotune explain

# Force the loop into repair mode
autotune repair

# Resume later
claude --agent autotune

# Clear everything and start fresh
autotune clear

Session Persistence

Two files keep the session alive across restarts and context resets:

autotune.jsonl — Append-only log of every experiment (metric, status, commit, description, ASI)
autotune.md — Living document: objective, what's been tried, dead ends, key wins
.autotune.state — Current operating mode, health state, streaks, and recovery budget

A fresh agent with no memory can read these two files and continue exactly where the previous session left off.

Confidence Scoring

After 3+ experiments, computes a confidence score using Median Absolute Deviation (MAD) as a robust noise estimator:

confidence = |best_improvement| / MAD

Score	Label	Meaning
>= 2.0	High (green)	Improvement is likely real
1.0-2.0	Marginal (yellow)	Could be noise — consider re-running
< 1.0	Within noise (red)	Improvement may not be real

Health And Recovery

Autotune is not a blind infinite loop anymore. It tracks loop health and switches strategies when the run stops being productive.

Health states:

running / improving — normal optimization mode
plateaued — repeated non-improving runs
crashing — benchmark or checks failures are stacking up
healing — a recovery playbook is active
paused — recovery budget is exhausted and the loop stopped safely

Recovery playbooks are configured in priority order and can include:

rebaseline
shrink_scope
diagnose
pause

Use autotune explain to see the current health state, last decision reason, and suggested next action.

Backpressure Checks

Optional autotune.checks.sh runs correctness checks after every passing benchmark:

#!/bin/bash
set -e
pnpm typecheck
pnpm test
pnpm lint

If checks fail, the result is logged as checks_failed and the changes are reverted.

autotune

README

1 Plugin

autotune

autotune

README

autotune

How It Works

What's Included

Quick Start

Install as Claude Code Plugin (recommended)

Run

Monitor

Stop / Resume

Session Persistence

Confidence Scoring

Health And Recovery

Backpressure Checks

1 Plugin

autotune

Related Marketplaces

nextjs

thedotmack

ruview

autotune

How It Works

What's Included

Quick Start

Install as Claude Code Plugin (recommended)

Run

Monitor

Stop / Resume

Session Persistence

Confidence Scoring

Health And Recovery

Backpressure Checks

Related Marketplaces

nextjs

thedotmack

ruview