By shawnroos
Your overnight research assistant. Obsessively analyzes codebases for tunable parameters, designs experiments, runs them in worktrees while you sleep, and delivers findings with recommendations.
Train a local LLM to handle lightweight research tasks. Your intern starts dumb (really dumb) but learns from every nerd run and earns responsibility through demonstrated competence. Requires a local LLM serving stack (Ollama, MLX-LM, llama.cpp, or vLLM).
Run a continuous self-improvement loop on a specific aspect of your codebase. The agent edits code, runs it, measures the result, keeps improvements, discards regressions, and repeats indefinitely. Like Karpathy's autoresearch but for any codebase feature. Use: /nerd-loop 'search relevance' or /nerd-loop 'api response time'
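The keep-or-discard cycle described above can be sketched as a simple hill-climbing loop. This is a minimal illustration, not the plugin's implementation: `improvement_loop`, `propose`, and `measure` are hypothetical names, and the "codebase" is reduced to a single number so the example stays self-contained.

```python
import random
from typing import Callable, Tuple


def improvement_loop(
    initial: float,
    propose: Callable[[float], float],
    measure: Callable[[float], float],
    iterations: int,
) -> Tuple[float, float]:
    """Keep a candidate only when it scores strictly better than the current best."""
    best = initial
    best_score = measure(best)
    for _ in range(iterations):
        candidate = propose(best)      # stands in for "edit the code"
        score = measure(candidate)     # stands in for "run it and measure"
        if score > best_score:         # keep improvements
            best, best_score = candidate, score
        # otherwise the candidate is discarded and the loop tries again
    return best, best_score


# Hypothetical usage: tune one parameter towards a target value of 3.0.
rng = random.Random(0)
tuned, tuned_score = improvement_loop(
    initial=0.0,
    propose=lambda x: x + rng.uniform(-1.0, 1.0),
    measure=lambda x: -(x - 3.0) ** 2,   # higher is better
    iterations=200,
)
```

The sketch stops after a fixed number of iterations; a real loop would presumably run until interrupted and would measure an actual metric such as search relevance or response time.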
Schedule nerd experiments to run at specific times (e.g., overnight). Uses macOS LaunchAgent for scheduling.
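For readers unfamiliar with LaunchAgents, the sketch below builds a property list that asks launchd to run a command at a fixed time each day. The dictionary keys are standard launchd keys; the label, command, and log paths are assumptions made for illustration, and the plist the plugin actually writes may differ.

```python
import plistlib
from typing import List


def build_launch_agent(label: str, command: List[str], hour: int, minute: int) -> bytes:
    """Return an XML plist that runs `command` daily at hour:minute."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute 0-59")
    job = {
        "Label": label,                       # unique job identifier
        "ProgramArguments": command,          # argv of the program to run
        "StartCalendarInterval": {"Hour": hour, "Minute": minute},
        "StandardOutPath": "/tmp/nerd-schedule.log",   # hypothetical log path
        "StandardErrorPath": "/tmp/nerd-schedule.err", # hypothetical log path
    }
    return plistlib.dumps(job)


# Hypothetical job: run a placeholder command at 02:30 every night.
plist_bytes = build_launch_agent(
    "com.example.nerd.overnight", ["/bin/echo", "run nerd"], hour=2, minute=30
)
```

Per-user agents of this kind are conventionally saved under `~/Library/LaunchAgents` and loaded with `launchctl`.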
One-time global setup for the nerd plugin. Detects hardware, installs the training variant (MLX for Apple Silicon, original for NVIDIA), runs calibration benchmarks, and saves a hardware profile. Only needs to run once per machine — projects auto-initialize on first /nerd run.
Check the status of the nerd queue, running experiments, and backlog. Shows progress, completed findings, and pending proposals.
Pre-flight validation agent that checks whether the lab is ready before experiments run. Verifies data access (WAL-mode, file permissions, exports), confirms config fields are actually wired in execution paths, scaffolds missing eval infrastructure (export scripts, test fixtures, datasets), and reports readiness. Use before experiment execution or before starting a nerd-loop to confirm the environment can produce valid results.
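One of the checks listed above, WAL mode plus file permissions, might look like the following. The description does not name the database engine, so this sketch assumes a SQLite file; `check_sqlite_ready` and the report fields are hypothetical.

```python
import os
import sqlite3
import tempfile


def check_sqlite_ready(path: str) -> dict:
    """Report whether a SQLite file exists, is readable, and uses WAL journaling."""
    report = {
        "exists": os.path.isfile(path),
        "readable": os.access(path, os.R_OK),
        "journal_mode": None,
    }
    if report["exists"] and report["readable"]:
        conn = sqlite3.connect(path)
        try:
            # PRAGMA journal_mode returns the current mode, e.g. "wal" or "delete".
            row = conn.execute("PRAGMA journal_mode").fetchone()
            report["journal_mode"] = str(row[0]).lower()
        finally:
            conn.close()
    report["ready"] = report["journal_mode"] == "wal"
    return report
```

The existence check runs first so that a missing path is reported as not ready rather than being silently created by `sqlite3.connect`.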
Analyzes nerd research findings, experiment reports, and backlog proposals to identify the best candidates for deep /nerd-loop continuous improvement. Looks for areas with high improvement potential, measurable metrics, and clear scope boundaries. Use after /nerd completes or when deciding what to loop on.
Scans codebases for tunable parameters, hardcoded thresholds, magic numbers, and empirical optimization opportunities. Use when nerd needs to discover what experiments to run.
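As a rough illustration of what scanning for magic numbers can mean, the sketch below walks a Python syntax tree and reports numeric literals other than a few trivial values. It is an assumption about the general approach, not the agent's actual method; `find_numeric_literals` and the `TRIVIAL` set are hypothetical.

```python
import ast
from typing import List, Tuple, Union

Number = Union[int, float]

# Values too common to be interesting tuning targets (an arbitrary choice).
TRIVIAL = {0, 1, 2}


def find_numeric_literals(source: str) -> List[Tuple[int, Number]]:
    """Return (line number, value) for each non-trivial numeric literal."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Constant):
            continue
        value = node.value
        # bool is a subclass of int, so exclude True/False explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        if value not in TRIVIAL:
            hits.append((node.lineno, value))
    return sorted(hits)


sample = "THRESHOLD = 0.75\nretries = 3\nenabled = True\nstart = 0\n"
candidates = find_numeric_literals(sample)
```

A scanner along these lines only finds candidates; deciding which ones are worth an experiment still needs context such as where the value is used.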
Maps the codebase for performance research: identifies hot paths, I/O boundaries, complex functions, and other areas of interest. Produces a structured area map that guides specialist agent dispatch.
Scans a scoped set of files for tunable parameters and clusters results into research themes. Used by /nerd-this for context-scoped experiment discovery.
Reference for identifying tunable parameters in codebases. Use when scanning for research targets — hardcoded thresholds, magic numbers, heuristic weights, prompt templates, pipeline budgets.
Reference for designing nerd experiments — competing theories, sweep harnesses, ground truth strategies, metric selection, and feasibility checks. Use when creating or reviewing experiment plans.
Canonical delegation protocol for the nerd intern. Reference this when delegating tasks to the local LLM in /nerd or /nerd-this orchestrators. Defines health checks, timeouts, confidence gating, shadow comparison, fallback, and logging.
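The health check, confidence gating, and fallback steps named above could be combined roughly as follows. This is a sketch under stated assumptions: `InternAnswer`, `delegate`, and the 0.8 threshold are hypothetical, a timeout is modelled as an exception raised by the intern call, and shadow comparison and logging are left out.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class InternAnswer:
    text: str
    confidence: float  # assumed to be in [0, 1]


def delegate(
    task: str,
    intern: Callable[[str], InternAnswer],
    fallback: Callable[[str], str],
    is_healthy: Callable[[], bool],
    min_confidence: float = 0.8,
) -> Tuple[str, str]:
    """Return (result, source); use the fallback unless the intern is healthy and confident."""
    if not is_healthy():
        return fallback(task), "fallback:unhealthy"
    try:
        answer = intern(task)
    except Exception:
        # Covers timeouts and any other failure of the local model call.
        return fallback(task), "fallback:error"
    if answer.confidence < min_confidence:
        return fallback(task), "fallback:low_confidence"
    return answer.text, "intern"
```

Returning the source alongside the result makes it easy to count how often the intern's answer was actually used, which is the kind of signal a competence-based promotion scheme would need.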
Reference for intern training data formats, benchmark structure, and evaluation protocol. Use when running aptitude tests, collecting training data, or evaluating intern performance.
Reference card for performance research — anti-pattern catalog, profiling tool reference, metric command templates, and measurability gate criteria. Use when writing performance experiment plans or analyzing performance findings.
Modifies files
Hook triggers on file write and edit operations
Uses power tools
Uses Bash, Write, or Edit tools
Tasty crustacey morsels for Claude by @shawnroos.
| Plugin | Description |
|---|---|
| nerd | Autonomous codebase research — discovers tunable parameters, runs experiments overnight, delivers findings |
claude plugin add-marketplace https://github.com/shawnroos/shawnroos-plugins.git
claude plugin install nerd
npx claudepluginhub shawnroos/shrimpshack --plugin nerd
Find and destroy zombie processes and repo slop spawned by Claude
Workflow-agnostic pulsed loop engine for Claude Code: runs the auto loop pattern (plan-loop -> seam -> work-loop, parallel fan-out, severity-based exit) as a durable, observable state machine. Ships named recipes (A1 Classic, A2 Parallel Theories+Judge, A4 Adversarial Pair, W Work-only) — pick a workflow topology at /auto start, or author your own. A disk-persisted per-unit ledger is the loop's source of truth; the engine is workflow-blind and drives any workflow through a thin adapter. Self-paces in-session via ScheduleWakeup; resume after a suspend is one command off the durable ledger.
Autonomous experiment loops on any codebase — one file, one metric, one loop. Based on Karpathy's autoresearch pattern.
Autonomous experiment loop that optimizes any file by a measurable metric. 5 slash commands, 8 evaluators, configurable loop intervals (10min to monthly).
Autonomous experimentation skill — your AI coding agent designs experiments, tests hypotheses, discards failures, keeps wins. Runs overnight while you sleep.
Autonomous, personalized research loops for Claude Code. Set a topic, walk away, come back to a quality-gated report adapted to your projects.
Research harness for optimizing code with the GEPA algorithm (LLM-driven genetic-Pareto search).
Autonomous research orchestration: agents for hypothesis-driven investigation, experiment running, fresh-eyes review, and batch evaluation.