From autoresearch
Core methodology reference for the autoresearch keep/revert cycle. This skill provides the rules, decision logic, and error recovery patterns for the autonomous loop — not the top-level entry point (use /research for that). Applicable when the user asks about "autoresearch pattern", "keep/revert logic", "experiment decision criteria", "auto-tune hyperparameters", "run keep/revert optimization", or needs guidance on how the autonomous modify-evaluate-decide cycle works internally.
Slash command
/autoresearch:research-loop
Core knowledge for driving iterative experiment cycles that modify target code, evaluate against immutable metrics, and use git-based keep/revert decisions.
The autoresearch pattern automates the scientific method: hypothesize, experiment, measure, decide, repeat.
LOOP:
1. Check git state (current branch/commit)
2. Form hypothesis and modify target file(s)
3. git commit the change
4. Run evaluation command (with timeout)
5. Extract metrics from output
6. Log result to results.tsv (append-only, git-untracked)
7. IF metric improved -> keep commit (advance branch)
   ELSE -> git reset --hard HEAD~1 (revert)
8. GOTO 1
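Step 7 of the loop above can be sketched in Python as follows. This is a minimal illustration, not the plugin's actual implementation: the function names `is_improvement`, `git_revert`, and `keep_or_revert` are hypothetical, and the sketch assumes that a tie counts as "not improved" and is therefore reverted.

```python
import subprocess
from typing import Callable, Tuple


def is_improvement(metric: float, best: float, direction: str) -> bool:
    """Return True when `metric` strictly beats `best` (assumption: ties do not count)."""
    if direction == "lower":
        return metric < best
    if direction == "higher":
        return metric > best
    raise ValueError(f"metric_direction must be 'lower' or 'higher', got {direction!r}")


def git_revert() -> None:
    """Drop the experiment commit created in step 3 of the loop."""
    subprocess.run(["git", "reset", "--hard", "HEAD~1"], check=True)


def keep_or_revert(
    metric: float,
    best: float,
    direction: str,
    revert: Callable[[], None] = git_revert,
) -> Tuple[str, float]:
    """Decide keep vs. revert; return the status to log and the best metric so far."""
    if is_improvement(metric, best, direction):
        # The branch already points at the new commit, so keeping needs no git command.
        return "keep", metric
    revert()
    return "discard", best
```

Passing `revert` as a parameter keeps the decision logic testable without touching a real repository. Because results.tsv is git-untracked, the hard reset leaves the experiment log intact.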
Before starting, define these in an experiment-config.yaml at project root:
tag: "experiment-name" # Branch name: experiment/<tag>
target_files: # Files the agent may modify
  - "src/algorithm.py"
eval_command: "just test" # Command to run evaluation
metric_name: "score" # Name of the metric to optimize
metric_direction: "lower" # "lower" or "higher" is better
timeout_seconds: 120 # Max time per experiment run
results_file: "results.tsv" # Append-only log (git-untracked)
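One way to validate these settings before the loop starts is sketched below. The `ExperimentConfig` class and its `branch` property are hypothetical names introduced for illustration. The sketch assumes the YAML has already been parsed into a plain dict (for example with PyYAML's `yaml.safe_load`, which is not part of the standard library), and the validation rules are assumptions beyond what the config comments state.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ExperimentConfig:
    tag: str
    target_files: List[str]
    eval_command: str
    metric_name: str
    metric_direction: str
    timeout_seconds: int
    results_file: str

    def __post_init__(self) -> None:
        # Fail early on values the loop cannot act on (assumed rules).
        if self.metric_direction not in ("lower", "higher"):
            raise ValueError("metric_direction must be 'lower' or 'higher'")
        if not self.target_files:
            raise ValueError("target_files must list at least one file")
        if self.timeout_seconds <= 0:
            raise ValueError("timeout_seconds must be positive")

    @property
    def branch(self) -> str:
        """Branch name derived from the tag: experiment/<tag>."""
        return f"experiment/{self.tag}"


# The dict mirrors the example config above.
raw = {
    "tag": "experiment-name",
    "target_files": ["src/algorithm.py"],
    "eval_command": "just test",
    "metric_name": "score",
    "metric_direction": "lower",
    "timeout_seconds": 120,
    "results_file": "results.tsv",
}
config = ExperimentConfig(**raw)
```

Unpacking with `**raw` also rejects unknown or missing keys with a `TypeError`, which catches typos in the config file.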
Apply the simplicity criterion when deciding keep vs. revert:
Log every experiment to results.tsv (tab-separated, git-untracked):
commit	metric	status	description
a1b2c3d	0.950000	keep	baseline
b2c3d4e	0.932000	keep	increase learning rate
c3d4e5f	0.960000	discard	switch activation function
d4e5f6a	0.000000	crash	double model width (OOM)
Columns: commit hash (7 chars), metric value (6 decimals), status (keep/discard/crash), short description of what was tried.
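A minimal sketch of writing one row in this format follows. The helper names `format_row` and `append_result` are hypothetical, and writing the header line when the file is first created is an assumption.

```python
from pathlib import Path

HEADER = "commit\tmetric\tstatus\tdescription"
STATUSES = {"keep", "discard", "crash"}


def format_row(commit: str, metric: float, status: str, description: str) -> str:
    """Build one tab-separated row: 7-char hash, 6-decimal metric, status, description."""
    if status not in STATUSES:
        raise ValueError(f"status must be one of {sorted(STATUSES)}, got {status!r}")
    # Collapse tabs and newlines so the description cannot break the TSV layout.
    clean = " ".join(description.split())
    return "\t".join([commit[:7], f"{metric:.6f}", status, clean])


def append_result(path: str, commit: str, metric: float, status: str, description: str) -> None:
    """Append a row to the log, writing the header first if the file is new."""
    log = Path(path)
    is_new = not log.exists()
    # Mode "a" only ever adds lines, which keeps the log append-only.
    with log.open("a", encoding="utf-8") as fh:
        if is_new:
            fh.write(HEADER + "\n")
        fh.write(format_row(commit, metric, status, description) + "\n")
```

For example, `format_row("b2c3d4e5f6", 0.932, "keep", "increase learning rate")` yields the second data row of the table above.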
Timeout: if the evaluation runs longer than timeout_seconds, status = "crash", revert.
Each experiment should run as an independent subagent via the Agent tool to avoid context window exhaustion. The researcher agent handles one experiment per invocation. State persists externally in git history + results.tsv.
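The timeout rule can be sketched with the standard library's `subprocess.run`, which raises `TimeoutExpired` once the limit is exceeded. The `run_eval` helper is hypothetical. It assumes the evaluation prints the metric on its own line as `<metric_name>: <number>`, and it also treats a non-zero exit code or a missing metric line as a crash; neither assumption is stated in the source.

```python
import re
import subprocess
import sys
from typing import Optional, Sequence


def run_eval(argv: Sequence[str], metric_name: str, timeout_seconds: float) -> Optional[float]:
    """Run the evaluation and return the metric, or None if the run should be logged as "crash"."""
    try:
        proc = subprocess.run(
            list(argv), capture_output=True, text=True, timeout=timeout_seconds
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return None
    if proc.returncode != 0:
        return None
    # Assumed output format: a line such as "score: 0.9320".
    pattern = rf"^{re.escape(metric_name)}:\s*([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)\s*$"
    match = re.search(pattern, proc.stdout, re.MULTILINE)
    return float(match.group(1)) if match else None


# A run that sleeps past its limit is reported as a crash (None).
slow = [sys.executable, "-c", "import time; time.sleep(5)"]
timed_out = run_eval(slow, "score", timeout_seconds=0.5)
```

A string such as `eval_command: "just test"` can be turned into an argument list with `shlex.split` on POSIX systems. When `run_eval` returns `None`, the caller logs status "crash" and reverts the commit.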
For detailed decision logic and criteria:
references/decision-logic.md - Complete keep/revert decision tree with examples