Autonomous codebase improvement loop for Claude Code, inspired by Karpathy's autoresearch.
Runs a tight improve-evaluate-iterate loop that converges on measurable improvements in code quality, test coverage, performance, and architecture, using multi-metric evaluation with diminishing-returns detection.
npx claudepluginhub dhanesh/autoresearch
# Add to your project's .claude/settings.json
{
"plugins": ["github:dhanesh/autoresearch#plugin"]
}
claude --plugin-dir /path/to/autoresearch/plugin
# From clone
bash install/install.sh
# From remote
curl -fsSL https://raw.githubusercontent.com/dhanesh/autoresearch/main/install/install.sh | bash
/autoresearch # Interactive — discover constraints from your codebase
/autoresearch src/ --profile quality # Quality-focused improvement on src/
/autoresearch --profile coverage # Maximize test coverage
/autoresearch --profile performance # Optimize performance
/autoresearch --max-iterations 10 # Limit iterations
/autoresearch --resume # Resume a previous run
/autoresearch --dry-run # Preview what would be evaluated
DISCOVER → BASELINE → LOOP → REPORT
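The four phases above, together with the diminishing-returns detection mentioned in the description, can be sketched as a small driver. This is a minimal illustration, not the plugin's actual `loop.ts`: the names `runLoop` and `hasDiminishingReturns`, the `window` and `minGain` thresholds, and the callback shapes are all hypothetical.

```typescript
// Hypothetical sketch of DISCOVER → BASELINE → LOOP → REPORT.
// None of these names are taken from the plugin's source.

type Phase = "DISCOVER" | "BASELINE" | "LOOP" | "REPORT";

interface LoopOptions {
  maxIterations: number; // mirrors the --max-iterations flag
  window: number;        // assumed: how many recent iterations to look back over
  minGain: number;       // assumed: minimum composite gain required over that window
}

interface LoopResult {
  phases: Phase[];
  scores: number[]; // composite score (0-100) at baseline and after each iteration
  stoppedBecause: "max-iterations" | "diminishing-returns";
}

// True when the score gained less than `minGain` over the last `window` iterations.
function hasDiminishingReturns(scores: number[], window: number, minGain: number): boolean {
  if (scores.length <= window) return false; // not enough history yet
  const recent = scores[scores.length - 1];
  const earlier = scores[scores.length - 1 - window];
  return recent - earlier < minGain;
}

function runLoop(
  evaluate: () => number, // returns the current composite score
  improve: () => void,    // applies one candidate change
  opts: LoopOptions,
): LoopResult {
  const phases: Phase[] = ["DISCOVER", "BASELINE"];
  const scores: number[] = [evaluate()]; // baseline measurement
  phases.push("LOOP");
  let stoppedBecause: LoopResult["stoppedBecause"] = "max-iterations";
  for (let i = 0; i < opts.maxIterations; i++) {
    improve();
    scores.push(evaluate());
    if (hasDiminishingReturns(scores, opts.window, opts.minGain)) {
      stoppedBecause = "diminishing-returns";
      break;
    }
  }
  phases.push("REPORT");
  return { phases, scores, stoppedBecause };
}
```

A real loop would presumably also discard changes that lower the score. That step is left out here to keep the stopping logic visible.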
| Axis | What it measures | Examples |
|---|---|---|
| Static Analysis | Lint warnings, type errors, complexity | ESLint, TSC, Biome, Ruff |
| Test Suite | Pass rate, coverage percentage | Jest, Vitest, pytest |
| LLM Rubric | Readability, architecture, maintainability | 4-dimension weighted rubric |
| Custom | User-defined metrics | Bundle size, benchmarks, custom scripts |
All scores are normalized to 0-100 and combined into a weighted composite.
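As a rough illustration of normalizing scores and combining them by weight, the sketch below computes a weighted mean over axis scores. The `AxisScore` shape, the function names, and the linear "lower is better" mapping are assumptions made for this example. The plugin's evaluators may normalize differently.

```typescript
// Hypothetical sketch: normalize raw metrics to 0-100 and combine them by weight.

interface AxisScore {
  name: string;
  score: number;  // expected to be on the 0-100 scale
  weight: number; // relative weight; the weights need not sum to 100
}

// Clamp a value into the 0-100 range.
function clamp100(value: number): number {
  return Math.min(100, Math.max(0, value));
}

// Assumed convention for "lower is better" metrics such as lint warnings:
// zero issues maps to 100, and `worst` or more issues maps to 0.
// `worst` must be greater than zero.
function normalizeLowerIsBetter(count: number, worst: number): number {
  return clamp100(100 * (1 - count / worst));
}

// Weighted mean of the axis scores.
function compositeScore(axes: AxisScore[]): number {
  const totalWeight = axes.reduce((sum, a) => sum + a.weight, 0);
  if (totalWeight <= 0) throw new Error("weights must sum to a positive number");
  const weighted = axes.reduce((sum, a) => sum + clamp100(a.score) * a.weight, 0);
  return weighted / totalWeight;
}
```

Dividing by the total weight keeps the composite on the 0-100 scale even if the weights do not add up to exactly 100.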
| Profile | Best for | Weights |
|---|---|---|
| quality | Reducing complexity, improving naming, strengthening types | lint 25%, types 20%, tests 25%, LLM 30% |
| performance | Bundle size, algorithmic complexity, hot paths | lint 15%, tests 20%, benchmark 35%, LLM 30% |
| coverage | Adding tests, covering edge cases, assertion quality | coverage 35%, tests 25%, lint 10%, LLM 30% |
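The profile weights in the table can be written as plain data and sanity-checked. The encoding below is only a sketch: the `PROFILES` constant, the metric key names, and the `invalidProfiles` helper are hypothetical, and the format of the real files under `profiles/` is not shown here. Only the percentages come from the table.

```typescript
// Hypothetical encoding of the profile table. The weights are the listed percentages.

type ProfileName = "quality" | "performance" | "coverage";
type Weights = Record<string, number>;

const PROFILES: Record<ProfileName, Weights> = {
  quality: { lint: 25, types: 20, tests: 25, llm: 30 },
  performance: { lint: 15, tests: 20, benchmark: 35, llm: 30 },
  coverage: { coverage: 35, tests: 25, lint: 10, llm: 30 },
};

// Returns the names of profiles whose weights do not total 100%.
function invalidProfiles(profiles: Record<string, Weights>): string[] {
  return Object.entries(profiles)
    .filter(([, weights]) => Object.values(weights).reduce((a, b) => a + b, 0) !== 100)
    .map(([name]) => name);
}
```

A check like this is a cheap guard when adding a custom profile, since a weight typo would otherwise silently skew the composite score.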
autoresearch/
├── plugin/ # Claude Code plugin (distributable)
│ ├── plugin.json # Plugin metadata
│ ├── commands/ # /autoresearch command
│ ├── skills/autoresearch/ # Overview skill
│ ├── hooks/ # SessionStart + PreCompact hooks
│ ├── lib/ # TypeScript reference implementations
│ ├── profiles/ # Preset evaluation profiles
│ └── README.md # Plugin documentation
├── install/ # Installation scripts
│ ├── install.sh # Multi-agent installer
│ └── uninstall.sh # Cleanup
├── src/ # Source (canonical)
│ ├── types.ts # Core types and defaults
│ ├── loop.ts # Loop state machine
│ ├── discovery.ts # Codebase introspection
│ ├── report.ts # Report generation
│ └── evaluators/ # Multi-axis evaluation engine
├── profiles/ # Preset profiles (canonical)
├── SKILL.md # Main skill definition
├── package.json # Project metadata
└── .manifold/ # Constraint manifold (design docs)
MIT