By Butanium
Orchestrate autonomous Claude-based research agents to investigate hypotheses, spawn experiment runners that execute Python scripts and analyze outputs with LLM judges, generate interactive Quarto HTML reports, send supervisor notifications, perform fresh-eyes reviews, and run batch evaluations.
Fresh-eyes review with limited context. Only reads files specified in ALLOWED_FILES directive.
Interactive research mode. Collaborate with the user to investigate questions, maintain hypotheses, and spawn scientists.
Autonomous research mode. Investigates questions, maintains hypotheses, spawns scientists and colleagues.
Run specific experiments and document results. Spawned by the research orchestrator.
How to send notifications to the human supervisor via ntfy.sh. Use when you need input, hit a blocker, or update them on your progress.
Cost and latency optimization for Anthropic API usage. Covers prompt caching, batch API, and when to combine them.
Standard experiment folder structure and templates. Reference for creating or validating experiment folders.
How to evaluate research samples using structured JSON output from claude -p. Covers criteria writing, the core judging pattern, and practical examples.
Core research principles for hypothesis-driven investigation. Shared across orchestrator, scientists, and colleagues.
STATUS: Work in progress - very experimental and fast-evolving codebase
Claude Code plugin for autonomous research orchestration.
Reports still tend to be sloppy (not enough red-teaming of the results, etc.), but they are slowly getting better.
A scaffolding system for hypothesis-driven research using Claude Code. The orchestrator agent acts as a PI — it maintains hypotheses, designs experiments, and delegates execution to specialized subagents (scientist, colleague, reviewer) that run with constrained permissions enforced by hooks.
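To make "constrained permissions enforced by hooks" concrete, here is a minimal sketch of a Claude Code PreToolUse hook that only lets a reviewer read allow-listed files. It relies on the documented hook contract (the tool call arrives as JSON on stdin, and exit code 2 blocks the call). Everything else is an assumption for illustration: the names `check_tool_call` and `READ_TOOLS`, and passing the allow-list through a colon-separated `ALLOWED_FILES` environment variable. The plugin's actual hooks may work quite differently.

```python
import json
import os
import sys

READ_TOOLS = {"Read"}  # tool names to restrict; an assumption for this sketch


def check_tool_call(payload, allowed_files):
    """Return (allowed, reason) for a PreToolUse hook payload."""
    if payload.get("tool_name") not in READ_TOOLS:
        return True, ""
    path = os.path.abspath(payload.get("tool_input", {}).get("file_path", ""))
    if path in {os.path.abspath(p) for p in allowed_files}:
        return True, ""
    return False, f"{path} is not listed in ALLOWED_FILES"


def main():
    # Claude Code passes the pending tool call as JSON on stdin.
    payload = json.load(sys.stdin)
    # Hypothetical: the allow-list comes from a colon-separated env variable.
    allowed = [p for p in os.environ.get("ALLOWED_FILES", "").split(":") if p]
    ok, reason = check_tool_call(payload, allowed)
    if not ok:
        print(reason, file=sys.stderr)
        sys.exit(2)  # exit code 2 blocks the tool call; stderr is shown to the agent
```

A hook script built this way would simply call `main()`; keeping the decision in `check_tool_call` makes the rule easy to test without a running agent.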
[...] RESEARCH_STATE.md, designs experiments, spawns subagents, synthesizes findings.
[...] RESEARCH_STATE.md, tools/, etc.).
[...] ALLOWED_FILES.
/research-principles: Core principles for hypothesis-driven investigation (shared across all roles).
/research-judging: How to set up and run the LLM judge pipeline for batch evaluation.
/experiment-structure: Standard experiment folder structure and templates.
/contact-supervisor: How to send notifications to the human supervisor via ntfy.sh.
/writing-guidelines: How to write up findings as an interactive Quarto report.
/supervisor-report: Process for writing and reviewing reports for the supervisor.
/efficient-api-usage: Cost and latency optimization (prompt caching, batch API).
For development (load directly without installation):
claude --plugin-dir /path/to/this/repo/plugins/clab
Note: --plugin-dir must be passed every time you run Claude. Changes to the plugin are reflected after restarting Claude.
For persistent install (via local marketplace):
Add this repo as a marketplace:
/plugin marketplace add /path/to/this/repo
Install the plugin:
/plugin install clab@claude-lab
To update after local changes, run /plugin marketplace update claude-lab then reinstall.
Tip for development: Enable auto-update on the marketplace (/plugin → Marketplaces → claude-lab → Enable auto-update) to automatically pick up changes at startup.
Local symlink install (workaround for GH #17688 — plugin frontmatter hooks don't fire):
The plugin system doesn't parse hooks from agent/skill frontmatter. This script symlinks agents, skills, and hooks into .claude/ so that they are loaded by the local agent loader, which handles hooks correctly.
# Run from your project directory (where .claude/ lives)
path/to/claude-lab/scripts/install-plugin-locally.sh path/to/claude-lab/plugins/clab
# Overwrite existing symlinks
path/to/claude-lab/scripts/install-plugin-locally.sh path/to/claude-lab/plugins/clab --force
# Uninstall
path/to/claude-lab/scripts/install-plugin-locally.sh path/to/claude-lab/plugins/clab --uninstall
Requires hook commands to use "$CLAUDE_PROJECT_DIR"/... paths (not ${CLAUDE_PLUGIN_ROOT}). Restart Claude Code after install.
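For readers who want to see what such an install amounts to, the following is a rough Python sketch of the symlinking idea, not the script itself. It assumes the plugin directory contains `agents/`, `skills/`, and `hooks/` subfolders and that each entry is linked individually into `.claude/`; the function name `link_plugin` and the exact `force` behavior are hypothetical.

```python
import os
from pathlib import Path

COMPONENTS = ("agents", "skills", "hooks")  # assumed subfolder names


def link_plugin(plugin_dir, project_dir, force=False):
    """Symlink each plugin entry into <project_dir>/.claude/<component>/."""
    plugin_dir = Path(plugin_dir).resolve()
    project_dir = Path(project_dir).resolve()
    created = []
    for component in COMPONENTS:
        source_dir = plugin_dir / component
        if not source_dir.is_dir():
            continue  # this plugin does not ship that component
        target_dir = project_dir / ".claude" / component
        target_dir.mkdir(parents=True, exist_ok=True)
        for source in sorted(source_dir.iterdir()):
            link = target_dir / source.name
            if link.is_symlink():
                if not force:
                    continue  # keep the existing link unless force is set
                link.unlink()
            elif link.exists():
                continue  # never overwrite a real file or folder
            os.symlink(source, link)
            created.append(link)
    return created
```

Because the entries are symlinks rather than copies, edits in the plugin checkout are visible in the project after Claude Code restarts.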
export CLAB_NTFY_TOPIC="your-ntfy-topic" # Required for notifications
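As a minimal sketch of what a supervisor notification could look like, the snippet below builds a request for ntfy.sh's standard HTTP publish API (a POST to `https://ntfy.sh/<topic>` with the message as the body and an optional `Title` header). It assumes the public ntfy.sh server; the helper names `build_ntfy_request` and `notify` are hypothetical, and the plugin's own skill may send notifications another way (for example with curl).

```python
import os
import urllib.request


def build_ntfy_request(message, title=None, topic=None):
    """Build (but do not send) a POST request that publishes to ntfy.sh.

    Hypothetical helper for illustration; not part of the plugin.
    """
    topic = topic or os.environ.get("CLAB_NTFY_TOPIC")
    if not topic:
        raise RuntimeError("CLAB_NTFY_TOPIC is not set")
    headers = {}
    if title:
        headers["Title"] = title  # ntfy reads the notification title from this header
    return urllib.request.Request(
        f"https://ntfy.sh/{topic}",
        data=message.encode("utf-8"),
        headers=headers,
        method="POST",
    )


def notify(message, title=None):
    """Send the notification and return the HTTP status code."""
    request = build_ntfy_request(message, title)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Splitting request construction from sending keeps the network call in one place, so the message format can be checked offline.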
Start a research session:
claude --dangerously-skip-permissions
Then invoke the orchestrator agent with your research question:
claude --agent orchestrator --dangerously-skip-permissions "Your research question here"
Skills are preloaded automatically via the agent's frontmatter — no manual /skill loading needed.
RESEARCH_STATE.md # Hypotheses, evidence, confidence levels
TECHNICAL_GUIDE.md # Project-specific technical knowledge
research_diary.md # Reflections, @clement mentions
scaffolding_notes.md # General autonomous research best practices
tools/ # Reusable utilities (orchestrator maintains)
experiments/ # One folder per experiment (config.yaml, report.md, outputs/)
sidequests/ # Interesting tangents for later
archive/ # Deprecated files (never delete, always archive)
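The layout above can be sketched as a small scaffolding helper. This is an illustration only: in practice the orchestrator presumably creates and maintains these files itself, the function names (`scaffold_workspace`, `new_experiment`, `archive_file`) are hypothetical, and files are created empty rather than from the plugin's templates.

```python
import shutil
from pathlib import Path

# Names taken from the layout above; contents are left empty in this sketch.
STATE_FILES = [
    "RESEARCH_STATE.md",
    "TECHNICAL_GUIDE.md",
    "research_diary.md",
    "scaffolding_notes.md",
]
DIRECTORIES = ["tools", "experiments", "sidequests", "archive"]


def scaffold_workspace(root):
    """Create the top-level files and folders if they do not exist yet."""
    root = Path(root)
    for name in DIRECTORIES:
        (root / name).mkdir(parents=True, exist_ok=True)
    for name in STATE_FILES:
        (root / name).touch(exist_ok=True)
    return root


def new_experiment(root, name):
    """Create experiments/<name>/ with config.yaml, report.md, and outputs/."""
    experiment = Path(root) / "experiments" / name
    (experiment / "outputs").mkdir(parents=True, exist_ok=True)
    (experiment / "config.yaml").touch(exist_ok=True)
    (experiment / "report.md").touch(exist_ok=True)
    return experiment


def archive_file(root, relative_path):
    """Move a deprecated file into archive/ instead of deleting it."""
    root = Path(root)
    target = root / "archive" / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(root / relative_path), str(target))
    return target
```

`archive_file` mirrors the "never delete, always archive" rule by preserving the file's relative path under `archive/`.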