Autonomous experiment loop plugin for Claude Code
npx claudepluginhub liljamesjohn-archive/claude-autoresearch
Autonomous experiment loop plugin for Claude Code — try ideas, measure results, keep what works, discard what doesn't, repeat forever.
Inspired by Karpathy's autoresearch and pi-autoresearch.
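You tell Claude what to optimize and how to measure it. Claude then runs an autonomous loop: try an idea, measure the result, keep the change if the metric improves, discard it otherwise, and repeat.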
Works for any optimization target: test speed, bundle size, algorithm performance, build times, API latency, Lighthouse scores.
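The keep-or-discard decision in that loop can be sketched in shell. This is a minimal illustration, not the plugin's actual code: the helper name `is_improvement` and its `lower`/`higher` direction argument are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: decide whether a new measurement beats the best so far.
# Usage: is_improvement <best> <candidate> <lower|higher>
# Exit status 0 means "keep the change", 1 means "discard it".
is_improvement() {
  # awk does the floating-point comparison that plain sh cannot.
  awk -v best="$1" -v cand="$2" -v dir="$3" 'BEGIN {
    if (dir == "lower") exit !(cand < best)
    exit !(cand > best)
  }'
}

# Example: a benchmark went from 12.4 s to 11.9 s, and lower is better.
if is_improvement 12.4 11.9 lower; then
  echo "keep"
else
  echo "discard"
fi
```

A strict comparison means a tie counts as "discard", so changes that add complexity without a measurable gain are not kept. Whether the plugin treats ties this way is an assumption.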
# Add the marketplace (first time only)
/plugin marketplace add liljamesjohn/claude-autoresearch
# Install the plugin
/plugin install autoresearch
Autoresearch requires a git worktree for safety. Start one, then invoke the skill:
# Start a worktree session (Claude Code creates the branch automatically)
claude -w autoresearch-fifo
# Inside the session, start the loop
/autoresearch optimize FIFO lot matching speed
Claude will ask about the benchmark command, metric, files in scope, and quality gates — then start looping. Your main checkout is completely untouched.
Meanwhile, in another terminal, you can keep working normally:
claude # your normal session on main
/autoresearch-status
/autoresearch resume
/autoresearch off
Or just press Ctrl+C at any time.
/autoresearch clear
The experiment loop is prompt-driven, not code-driven. The skill prompt (skills/autoresearch/SKILL.md) teaches Claude the full protocol. Claude uses its built-in tools (Bash, Edit, Read, Write) to execute each iteration.
Worktree isolation is mandatory. Every autoresearch session runs in a dedicated git worktree: a separate working directory with its own checked-out branch that shares the repository's history with your main checkout. Your main checkout, working files, .env, IDE configs, and in-progress work are never touched.
The plugin refuses to start unless it is running inside a worktree.
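One way such a check could work is to compare git's per-worktree directory with the directory shared by all worktrees. The sketch below is an assumption about the approach, not the plugin's code, and the function name `in_linked_worktree` is hypothetical. It relies only on `git rev-parse --git-dir` and `git rev-parse --git-common-dir`.

```shell
#!/bin/sh
# Sketch: succeed (exit 0) only when the current directory is inside a
# linked git worktree rather than the main checkout.
in_linked_worktree() {
  # Per-worktree git directory, e.g. <repo>/.git/worktrees/<name>.
  wt_git_dir=$(git rev-parse --git-dir 2>/dev/null) || return 1
  # Git directory shared by all worktrees, e.g. <repo>/.git.
  wt_common_dir=$(git rev-parse --git-common-dir 2>/dev/null) || return 1
  # Normalise both to absolute physical paths before comparing,
  # because git may print either one as a relative path.
  wt_git_dir=$(cd "$wt_git_dir" && pwd -P) || return 1
  wt_common_dir=$(cd "$wt_common_dir" && pwd -P) || return 1
  # In the main checkout the two paths are identical.
  [ "$wt_git_dir" != "$wt_common_dir" ]
}

if in_linked_worktree; then
  echo "linked worktree detected"
else
  echo "not in a linked worktree"
fi
```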
| Threat | Protection |
|---|---|
| Experiment damages user's code | Worktree isolation — main checkout is physically untouched |
| Loop runs forever | Max iteration cap (default 50) + Ctrl+C |
| Bad experiment breaks code | Git revert on discard — last good commit restored |
| Context fills up | SessionStart hook re-injects autoresearch.md after compaction |
| Agent stops unexpectedly | Stop hook keeps the loop running |
| Session files lost on revert | Protected files are backed up before git revert |
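The last row of the table, backing up protected files around a revert, can be illustrated with a small wrapper. This is a sketch under stated assumptions: the `PROTECTED_FILES` list and the function name `with_protected_files` are hypothetical, and the revert command itself is passed in by the caller rather than taken from the plugin.

```shell
#!/bin/sh
# Hypothetical list of files that must survive a revert. The names are taken
# from the session files in this README, but the exact set is an assumption.
PROTECTED_FILES="autoresearch.md autoresearch.jsonl autoresearch.ideas.md"

# Usage: with_protected_files <command> [args...]
# Runs the command, then restores the protected files to their prior content.
with_protected_files() {
  backup_dir=$(mktemp -d) || return 1
  # Copy each protected file that exists into the backup directory.
  for f in $PROTECTED_FILES; do
    if [ -f "$f" ]; then cp -p "$f" "$backup_dir/$f"; fi
  done
  # Run the destructive step and remember its exit status.
  "$@"
  status=$?
  # Put the protected files back, overwriting whatever the command left.
  for f in $PROTECTED_FILES; do
    if [ -f "$backup_dir/$f" ]; then cp -p "$backup_dir/$f" "$f"; fi
  done
  rm -rf "$backup_dir"
  return "$status"
}

# Example (not executed here; the revert command is an assumption):
#   with_protected_files git reset --hard HEAD
```

Restoring after the command rather than excluding files from it keeps the wrapper independent of which git command performs the revert.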
| File | Purpose | Committed? |
|---|---|---|
| autoresearch.md | Living session document: objective, metrics, what's been tried | Yes |
| autoresearch.sh | Benchmark script | Yes |
| autoresearch.checks.sh | Optional quality gate script | Yes |
| autoresearch.jsonl | Append-only experiment log | Yes |
| autoresearch.ideas.md | Ideas backlog for deferred optimizations | Yes |
| .claude/autoresearch-loop.local.md | Stop hook state file (active/iteration/max) | No (local) |
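To make the state file concrete, here is a sketch of how a hook script might read it and enforce the iteration cap. It assumes the file starts with a YAML-style frontmatter block delimited by `---` lines and that the keys are literally `active`, `iteration`, and `max`. The exact layout is not documented here, and the function names are hypothetical.

```shell
#!/bin/sh
# Sketch: print the value of one key from the frontmatter of a state file.
# Usage: frontmatter_get <file> <key>
frontmatter_get() {
  awk -v key="$2" '
    /^---[ \t]*$/ { fence++; next }   # count the --- delimiter lines
    fence == 1 {
      # Inside the frontmatter: split "key: value" at the first colon.
      i = index($0, ":")
      if (i > 0 && substr($0, 1, i - 1) == key) {
        val = substr($0, i + 1)
        gsub(/^[ \t]+|[ \t]+$/, "", val)
        print val
        exit
      }
    }
    fence >= 2 { exit }               # stop once the frontmatter has ended
  ' "$1"
}

# Exit status 0 means the loop may run another iteration.
# Usage: should_continue <state-file>
should_continue() {
  sc_active=$(frontmatter_get "$1" active)
  sc_iteration=$(frontmatter_get "$1" iteration)
  sc_max=$(frontmatter_get "$1" max)
  # Fall back to 0 iterations and the documented default cap of 50.
  [ "$sc_active" = "true" ] && [ "${sc_iteration:-0}" -lt "${sc_max:-50}" ]
}

# Example (not executed here):
#   should_continue .claude/autoresearch-loop.local.md && echo "keep looping"
```

How the real Stop hook reports its decision back to Claude Code is outside the scope of this sketch.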
claude-autoresearch/
├── .claude-plugin/
│   ├── plugin.json                   # Plugin manifest
│   └── marketplace.json              # Marketplace catalog
├── skills/
│   └── autoresearch/
│       └── SKILL.md                  # /autoresearch skill: setup + loop prompt
├── commands/
│   └── autoresearch-status.md        # /autoresearch-status: results summary
├── lib/
│   └── lib.sh                        # Shared utilities (frontmatter parsing, hook input)
├── hooks/
│   ├── hooks.json                    # Hook registrations (Stop, SessionStart, PreCompact)
│   └── scripts/
│       ├── stop-hook.sh              # Keeps the loop running (Ralph Wiggum pattern)
│       ├── session-start-compact.sh  # Re-injects context after compaction
│       ├── session-start-resume.sh   # Auto-resume on session start
│       ├── pre-compact.sh            # Preserves state before compaction
│       ├── revert-experiment.sh      # Git revert with protected files
│       └── log-experiment.sh         # Structured JSONL experiment logger
├── tests/                            # Unit + E2E test suites
├── LICENSE
└── README.md
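As an illustration of what a structured JSONL logger such as `log-experiment.sh` might do, the sketch below appends one JSON object per line to a log file. The field names (`iteration`, `metric`, `status`, `description`) and both function names are hypothetical. The real script's schema is not shown in this README.

```shell
#!/bin/sh
# Escape backslashes and double quotes so a string is safe inside JSON quotes.
# Control characters such as newlines are not handled in this sketch.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Usage: log_experiment <log-file> <iteration> <metric> <status> <description>
# <iteration> and <metric> are written as bare JSON numbers.
log_experiment() {
  printf '{"iteration":%s,"metric":%s,"status":"%s","description":"%s"}\n' \
    "$2" "$3" "$(json_escape "$4")" "$(json_escape "$5")" >> "$1"
}

# Example (not executed here):
#   log_experiment autoresearch.jsonl 7 11.9 kept "memoize lot lookup"
```

Appending with `>>` keeps earlier entries intact, which matches the append-only nature of the log.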