From sage
Iterates autonomously to optimize a measurable metric (bundle size, test coverage, query time) by repeatedly modifying code, verifying, and keeping improvements.
Slash command: /sage:autoresearch
Files:
- examples/bundle-size/README.md
- examples/bundle-size/autoresearch.sh
- examples/bundle-size/brief.md
- examples/prose-readability/README.md
- examples/prose-readability/autoresearch.sh
- examples/prose-readability/brief.md
- examples/test-coverage/README.md
- examples/test-coverage/autoresearch.sh
- examples/test-coverage/brief.md
- references/crash-handling.md
- references/harness-conventions.md
- references/loop-protocol.md
- references/metric-design.md
- references/session-continuity.md
- references/stuck-recovery.md

Autonomous iteration toward a measurable outcome. The agent modifies code, commits, runs a verify command, keeps improvements, and reverts regressions, repeating until a target is hit, a budget is exhausted, or the user interrupts.
Core principles (from Karpathy's autoresearch pattern):
Before the loop can start, capture these (skip if already provided):
| Field | Required | Example |
|---|---|---|
| Goal | Yes | "Reduce bundle below 200KB" |
| Metric name | Yes | bundle_kb |
| Direction | Yes | lower or higher |
| Target | Optional | 200 |
| Verify command | Yes | pnpm build && measure.sh |
| Writable scope | Recommended | src/**/*.ts |
| Frozen scope | Recommended | package.json, *.lock |
| Per-run budget | Yes (default 120s) | 120 seconds |
| Max iterations | Optional | 100 |
| Termination | Auto | target if target given, else interrupt |
Present as a brief for user approval:
Sage: Autoresearch session configured.
Goal: [goal statement]
Metric: [name] ([direction]), target: [target or "none — runs until interrupted"]
Verify: [command]
Scope: writable [globs], frozen [globs]
Budget: [seconds]s per run, [max iterations or "unlimited"]
[A] Start — begin autonomous iteration
[R] Revise — change configuration
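The configuration fields above map naturally onto a small record type. The following is a minimal sketch, not the runtime's actual data model: the class name `Brief` and its attribute names are hypothetical, chosen to mirror the table. It shows the 120-second default and the "target if target given, else interrupt" termination rule.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Brief:
    """Hypothetical in-memory form of the brief; names mirror the table above."""

    goal: str
    metric: str
    direction: str                      # "lower" or "higher"
    verify: str                         # command that prints the METRIC line
    target: Optional[float] = None
    writable: List[str] = field(default_factory=list)
    frozen: List[str] = field(default_factory=list)
    per_run_seconds: int = 120          # default per-run budget from the table
    max_iterations: Optional[int] = None

    def __post_init__(self) -> None:
        # Reject anything other than the two documented directions.
        if self.direction not in ("lower", "higher"):
            raise ValueError("direction must be 'lower' or 'higher'")

    @property
    def termination(self) -> str:
        # "target" if a target is given, otherwise run until interrupted.
        return "target" if self.target is not None else "interrupt"
```

For example, `Brief(goal="Reduce bundle below 200KB", metric="bundle_kb", direction="lower", verify="pnpm build && measure.sh", target=200)` terminates on `target`, while the same brief without `target` runs until interrupted.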
Each iteration follows 8 phases. Read references/loop-protocol.md
for per-phase detail.
| # | Phase | Actor | What happens |
|---|---|---|---|
| 1 | REVIEW | agent | Read current state, recent history (last 20 iterations from JSONL) |
| 2 | IDEATE | agent | Propose ONE change, ≤1 sentence. If stuck, load references/stuck-recovery.md |
| 3 | MODIFY | agent | Make the change. Stay within writable scope. |
| 4 | COMMIT | runtime | git add -A && git commit on autoresearch/<slug> branch |
| 5 | VERIFY | runtime | Run verify command with wall-clock budget |
| 6 | DECIDE | runtime | Parse METRIC, compare to best → keep / discard / crash |
| 7 | LOG | runtime+agent | Append JSONL, rebuild TSV, agent updates living doc |
| 8 | REPEAT | runtime | Check termination → loop or exit |
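Phase 5 runs the verify command under a wall-clock budget. A rough sketch of how that could look with the standard library follows; the function name `run_verify` and its status strings are assumptions, not the runtime's real interface. It takes an argument list rather than a shell string, so a compound command such as `pnpm build && measure.sh` would need to be wrapped in a script first.

```python
import subprocess
import sys
from typing import List, Tuple


def run_verify(argv: List[str], budget_seconds: float) -> Tuple[str, str]:
    """Run the verify command; return (status, stdout).

    status is "ok", "failed" (non-zero exit), or "timeout" (budget exceeded).
    These labels are illustrative only.
    """
    try:
        proc = subprocess.run(
            argv,
            capture_output=True,
            text=True,
            timeout=budget_seconds,  # wall-clock limit; the child is killed on expiry
        )
    except subprocess.TimeoutExpired:
        return "timeout", ""
    if proc.returncode != 0:
        return "failed", proc.stdout
    return "ok", proc.stdout


if __name__ == "__main__":
    # Stand-in verify command that just prints a METRIC line.
    print(run_verify([sys.executable, "-c", "print('METRIC bundle_kb=187.4')"], 120))
```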
Decision rules (Phase 6):
- crash, reset to HEAD
- crash, reset
- crash, reset
- keep, advance branch
- discard, reset

The Python runtime at core/autoresearch/ handles deterministic phases (COMMIT, VERIFY, DECIDE, LOG, REPEAT). The agent handles creative phases (REVIEW, IDEATE, MODIFY).
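The keep / discard / crash decision in Phase 6 can be sketched as a pure function. The conditions used here are assumptions: a missing metric is treated as a crash, the first successful measurement is kept as the baseline, and a tie counts as no improvement. The real rules live in the runtime and in references/loop-protocol.md.

```python
from typing import Optional


def decide(metric: Optional[float], best: Optional[float], direction: str) -> str:
    """Return "crash", "keep", or "discard" for one iteration (illustrative rules)."""
    if metric is None:
        # Assumed: verify failed, timed out, or printed no parseable METRIC line.
        return "crash"
    if best is None:
        # Assumed: the first successful measurement becomes the baseline.
        return "keep"
    if direction == "lower":
        improved = metric < best
    else:
        improved = metric > best
    # Assumed: a tie is not an improvement, so the change is discarded.
    return "keep" if improved else "discard"
```

With `direction="lower"` and a best of 200, a run measuring 187.4 is kept and a run measuring 205 is discarded.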
Running the runtime:
python -m core.autoresearch run --brief .sage/work/<slug>/brief.md --project .
Harness contract: The verify command must print METRIC name=number
to stdout. See references/harness-conventions.md.
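To make the harness contract concrete, here is one way the `METRIC name=number` lines could be extracted from captured stdout. This is a sketch under assumptions: the exact grammar (allowed characters in names, number formats, whitespace) is defined in references/harness-conventions.md, and the regular expression below is only a plausible reading of it.

```python
import re
from typing import Dict

# Assumed grammar: "METRIC", spaces or tabs, an identifier-like name, "=",
# then a decimal number with optional sign and exponent, alone on the line.
METRIC_RE = re.compile(
    r"^METRIC[ \t]+([A-Za-z_][\w.-]*)=([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)[ \t]*$",
    re.MULTILINE,
)


def parse_metrics(stdout: str) -> Dict[str, float]:
    """Collect every METRIC line; a later line for the same name wins."""
    return {name: float(value) for name, value in METRIC_RE.findall(stdout)}
```

An empty result would correspond to the case where no METRIC line was printed.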
All state lives in .sage/work/<YYYYMMDD-slug>/:
| File | Role |
|---|---|
| brief.md | Configuration (goal, metric, scope, budget) |
| autoresearch.md | Living doc: ideas tried, wins, dead ends |
| autoresearch.jsonl | Structured log (one line per iteration) |
| results.tsv | Human-readable view (derived from JSONL) |
| runs/NNNN-*.log | Per-iteration stdout+stderr |
| .autoresearch-state.json | Crash recovery state (not committed) |
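The relationship between autoresearch.jsonl and results.tsv (append-only log, derived view) can be illustrated as follows. The record fields `iteration`, `status`, `metric`, and `idea` are hypothetical; the runtime's actual schema is not shown in this document.

```python
import csv
import json
from pathlib import Path

# Hypothetical column set; the real JSONL schema may differ.
FIELDS = ["iteration", "status", "metric", "idea"]


def append_iteration(jsonl_path: Path, record: dict) -> None:
    """Append one iteration as a single JSON line."""
    with jsonl_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def rebuild_tsv(jsonl_path: Path, tsv_path: Path) -> None:
    """Regenerate the TSV from scratch so it never drifts from the JSONL."""
    with jsonl_path.open(encoding="utf-8") as src:
        rows = [json.loads(line) for line in src if line.strip()]
    with tsv_path.open("w", encoding="utf-8", newline="") as out:
        writer = csv.DictWriter(
            out,
            fieldnames=FIELDS,
            delimiter="\t",
            lineterminator="\n",
            extrasaction="ignore",  # drop fields that are not TSV columns
        )
        writer.writeheader()
        writer.writerows(rows)
```

Rebuilding the whole TSV on every iteration keeps the JSONL as the single source of truth, at the cost of rewriting a small file each time.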
On resume (new session, context reset, platform switch):
- autoresearch.md for high-level context
- autoresearch.jsonl for recent history
- git log on the branch

See references/session-continuity.md for full protocol.
Session end: Store a structured summary in sage-memory:
Session start: Search sage-memory for priors on this repo + metric. Inject into IDEATE as "known-good starting points" and "known dead ends."
| Gate | When | Check |
|---|---|---|
| scope | After MODIFY | Changed files ⊆ writable, frozen untouched |
| pre-verify | After COMMIT | git status is clean |
| metric-parseable | After VERIFY | At least one METRIC line in stdout |
| budget | During VERIFY | Wall-clock ≤ per_run_seconds |
Gates are enforced by the runtime, not by prose. The agent cannot bypass them.
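As an illustration of the scope gate, the check "changed files ⊆ writable, frozen untouched" might look like the sketch below. It uses `fnmatch.fnmatchcase`, whose `*` also matches path separators, so its glob semantics are an assumption and may not match whatever pattern dialect the runtime actually uses. The function name `scope_violations` is hypothetical.

```python
from fnmatch import fnmatchcase
from typing import List, Tuple


def scope_violations(
    changed: List[str], writable: List[str], frozen: List[str]
) -> List[Tuple[str, str]]:
    """Return (path, reason) pairs; an empty list means the gate passes."""
    violations = []
    for path in changed:
        if any(fnmatchcase(path, pattern) for pattern in frozen):
            # Frozen files must never be touched, even if also writable.
            violations.append((path, "frozen"))
        elif not any(fnmatchcase(path, pattern) for pattern in writable):
            violations.append((path, "outside writable scope"))
    return violations
```

The list of changed paths could come from something like `git diff --name-only`, though how the runtime collects it is not specified here.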
- references/loop-protocol.md: per-phase inputs, outputs, failure modes
- references/metric-design.md: what makes a good metric
- references/harness-conventions.md: METRIC line contract
- references/stuck-recovery.md: escape local minima
- references/crash-handling.md: retry vs skip decision tree
- references/session-continuity.md: resume protocol