nerd

Name: nerd
Author: shawnroos

Your overnight research assistant. Obsessively analyzes codebases for tunable parameters, designs experiments, runs them in worktrees while you sleep, and delivers findings with recommendations.

npx claudepluginhub shawnroos/nerd

Popularity

Stars

Med: 0·Avg: 285

Installs

Med: 0·Avg: 1

What's Inside

Slash Commands (7)

nerd-intern

/nerd-intern

Train a local LLM to handle lightweight research tasks. Your intern starts dumb (really dumb) but learns from every nerd run and earns responsibility through demonstrated competence. Requires a local LLM serving stack (Ollama, MLX-LM, llama.cpp, or vLLM).

nerd-loop

/nerd-loop

Run a continuous self-improvement loop on a specific aspect of your codebase. The agent edits code, runs it, measures the result, keeps improvements, discards regressions, and repeats indefinitely. Like Karpathy's autoresearch but for any codebase feature. Use: /nerd-loop 'search relevance' or /nerd-loop 'api response time'

nerd-schedule

/nerd-schedule

Schedule nerd experiments to run at specific times (e.g., overnight). Uses macOS LaunchAgent for scheduling.
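
As a rough illustration of what a LaunchAgent-based schedule could look like, the sketch below builds a launchd property list with Python's standard `plistlib`. The keys `Label`, `ProgramArguments`, and `StartCalendarInterval` are standard launchd keys; the label `com.example.nerd.nightly` and the script path are hypothetical placeholders, and nothing here is taken from the plugin's actual implementation.

```python
import plistlib
from typing import List


def build_launch_agent(label: str, program_args: List[str], hour: int, minute: int) -> bytes:
    """Return an XML property list that asks launchd to run a command daily."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute must be 0-59")
    agent = {
        "Label": label,                      # unique job identifier
        "ProgramArguments": program_args,    # argv of the command to run
        "StartCalendarInterval": {"Hour": hour, "Minute": minute},
    }
    return plistlib.dumps(agent)             # XML format by default


# Hypothetical values, for illustration only.
plist_bytes = build_launch_agent(
    "com.example.nerd.nightly",
    ["/usr/local/bin/run-nerd.sh"],
    hour=2,
    minute=0,
)
```

A file like this would normally be saved under `~/Library/LaunchAgents/` and loaded with `launchctl`.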

nerd-setup

/nerd-setup

One-time global setup for the nerd plugin. Detects hardware, installs the training variant (MLX for Apple Silicon, original for NVIDIA), runs calibration benchmarks, and saves a hardware profile. Only needs to run once per machine — projects auto-initialize on first /nerd run.

nerd-status

/nerd-status

Check the status of the nerd queue, running experiments, and backlog. Shows progress, completed findings, and pending proposals.

Agents (10)

context-scanner

/context-scanner

Scans a scoped set of files for tunable parameters and clusters results into research themes. Used by /nerd-this for context-scoped experiment discovery.

experiment-executor

/experiment-executor

Executes nerd experiment plans in isolated worktrees. Builds evaluation harnesses, runs parameter sweeps, captures results. Use when an experiment plan is ready and needs to be implemented and run.

intern-evaluator

/intern-evaluator

Runs aptitude tests and ongoing evaluation of the local LLM intern. Calls the intern endpoint with benchmark examples for each task type (parameter-detection, result-classification, context-extraction), scores against expected outputs, and returns structured results with accuracy per task and mode recommendations. Use during /nerd-intern setup or when re-evaluating intern capability.

lab-tech

/lab-tech

Pre-flight validation agent that checks whether the lab is ready before experiments run. Verifies data access (WAL-mode, file permissions, exports), confirms config fields are actually wired in execution paths, scaffolds missing eval infrastructure (export scripts, test fixtures, datasets), and reports readiness. Use before experiment execution or before starting a nerd-loop to confirm the environment can produce valid results.

loop-scout

/loop-scout

Analyzes nerd research findings, experiment reports, and backlog proposals to identify the best candidates for deep /nerd-loop continuous improvement. Looks for areas with high improvement potential, measurable metrics, and clear scope boundaries. Use after /nerd completes or when deciding what to loop on.

Skills (6)

codebase-analysis

/codebase-analysis

Reference for identifying tunable parameters in codebases. Use when scanning for research targets — hardcoded thresholds, magic numbers, heuristic weights, prompt templates, pipeline budgets.

experiment-planning

/experiment-planning

Reference for designing nerd experiments — competing theories, sweep harnesses, ground truth strategies, metric selection, and feasibility checks. Use when creating or reviewing experiment plans.

intern-delegation

/intern-delegation

Canonical delegation protocol for the nerd intern. Reference this when delegating tasks to the local LLM in /nerd or /nerd-this orchestrators. Defines health checks, timeouts, confidence gating, shadow comparison, fallback, and logging.

intern-training

/intern-training

Reference for intern training data formats, benchmark structure, and evaluation protocol. Use when running aptitude tests, collecting training data, or evaluating intern performance.

performance-analysis

/performance-analysis

Reference card for performance research — anti-pattern catalog, profiling tool reference, metric command templates, and measurability gate criteria. Use when writing performance experiment plans or analyzing performance findings.

Hooks (1)

Event Hooks

File writes

2 hooks across 2 events

Stats

Version: 0.1.0

Stars: 0

Maintenance: Excellent

Last Commit: Mar 16, 2026

Added: Mar 26, 2026

Safety Signals

Caution

Modifies files

Hook triggers on file write and edit operations

Uses power tools

Uses Bash, Write, or Edit tools

README

nerd

Your codebase has hundreds of hardcoded thresholds, magic numbers, and untested heuristics. You don't know which ones matter. The nerd does.

A Claude Code plugin that obsessively researches your codebase overnight — finding every tunable parameter, designing rigorous experiments with competing theories, running them in isolated worktrees, and delivering findings that tell you what to keep, what to change, and what to rearchitect. It remembers what it learned, so it never wastes time re-testing what it already proved.

Why

Most "optimization" is guessing. You tweak a threshold, eyeball the result, ship it. The nerd treats your codebase like a research problem:

It doesn't just test if a parameter is optimal. It generates competing theories about why it exists — is the parameter wrong? Is the model wrong? Is the feature unnecessary? Is the data the real bottleneck?
It runs experiments in parallel worktrees so your working branch stays clean.
It remembers everything via a persistent knowledge graph, so the second run builds on the first run's findings instead of rediscovering them.
It runs while you sleep. Schedule it tonight, review the findings tomorrow.

The most valuable findings aren't parameter tweaks. They're architectural discoveries that only emerge when you test competing explanations.

Install

claude plugin install nerd

Quick Start

/nerd-setup                        # One-time hardware calibration
/nerd                              # Let the nerd loose on your codebase
/nerd-loop "search relevance"      # Deep continuous iteration on one area
/nerd-schedule tonight             # Run experiments overnight

/nerd-setup runs once per machine. Projects auto-initialize on first /nerd run.

What It Actually Does

`/nerd` — Broad Research

Scans your codebase for every tunable parameter, designs experiments with competing theories, validates the lab environment, runs them in parallel, and delivers structured findings.

/nerd "search ranking"
  ├─ parameter-scanner    finds 12 tunable parameters
  ├─ plan-reviewer        generates 3 competing theories per experiment
  ├─ lab-tech             validates data access, config wiring, build cache
  ├─ experiment-executor  runs experiments in parallel worktrees
  ├─ report-compiler      evaluates which theories held up
  └─ loop-scout           recommends the best target for deep iteration
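
To make the "isolated worktrees" step concrete, here is a minimal sketch of how one experiment could get its own checkout using `git worktree`. The `git -C`, `worktree add -b`, and `worktree remove --force` invocations are standard git; the `nerd/<id>` branch naming, the sibling-directory layout, and the helper names are assumptions made for illustration, not the plugin's actual behavior.

```python
import subprocess
from pathlib import Path
from typing import List


def worktree_path(repo: Path, experiment_id: str) -> Path:
    """Hypothetical layout: a sibling directory next to the main checkout."""
    return repo.parent / f"{repo.name}-{experiment_id}"


def worktree_add_command(repo: Path, experiment_id: str) -> List[str]:
    """Build the git command that creates a new branch in its own worktree."""
    branch = f"nerd/{experiment_id}"  # assumed naming scheme
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch,
            str(worktree_path(repo, experiment_id))]


def worktree_remove_command(repo: Path, experiment_id: str) -> List[str]:
    """Build the git command that deletes the worktree after the experiment."""
    return ["git", "-C", str(repo), "worktree", "remove", "--force",
            str(worktree_path(repo, experiment_id))]


def run_in_worktree(repo: Path, experiment_id: str, command: List[str]) -> int:
    """Create the worktree, run one command inside it, then clean up."""
    subprocess.run(worktree_add_command(repo, experiment_id), check=True)
    try:
        return subprocess.run(command, cwd=worktree_path(repo, experiment_id)).returncode
    finally:
        subprocess.run(worktree_remove_command(repo, experiment_id), check=True)
```

Because each experiment edits files only inside its own worktree, several can run at once without touching the working branch.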

`/nerd-loop` — Deep Iteration

Karpathy's autoresearch pattern applied to your code. Reads the code, hypothesizes an improvement, makes the change, measures, keeps if better, reverts if not — and repeats until it hits a local maximum.

/nerd-loop "search relevance"
  ├─ Establishes baseline metric (e.g., nDCG@10)
  ├─ Loops: edit → test → measure → keep/discard
  ├─ Pivots strategy after 5 consecutive failures
  ├─ Escalates after another 5
  └─ Stops at local maximum (15 failures across 3 strategies)

It doesn't just sweep parameters — it rewrites algorithms, restructures logic, removes unnecessary code. Anything within the scoped files is fair game.
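
The keep/discard loop described above can be sketched as a simple hill climb. This is only an illustration under stated assumptions: each "strategy" is modeled as a function that proposes a new metric value (standing in for edit, test, measure), higher scores are better, and the loop moves to the next strategy after 5 consecutive failures. The names `improvement_loop` and `LoopResult` are hypothetical and the plugin's real control flow may differ.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LoopResult:
    best_score: float
    kept: int
    discarded: int
    strategies_used: int


def improvement_loop(
    baseline: float,
    strategies: List[Callable[[float], float]],
    failures_per_strategy: int = 5,
) -> LoopResult:
    """Keep a change only if the metric improves; otherwise discard it."""
    best = baseline
    kept = discarded = used = 0
    for propose in strategies:
        used += 1
        consecutive_failures = 0
        while consecutive_failures < failures_per_strategy:
            candidate = propose(best)      # stands in for edit -> test -> measure
            if candidate > best:
                best = candidate           # keep the improvement
                kept += 1
                consecutive_failures = 0
            else:
                discarded += 1             # revert the regression
                consecutive_failures += 1
    # All strategies exhausted: treat the current best as a local maximum.
    return LoopResult(best, kept, discarded, used)


# Toy strategies: two that plateau and one that never helps.
result = improvement_loop(
    0.0,
    [lambda s: min(s + 1.0, 5.0), lambda s: min(s + 2.0, 8.0), lambda s: s - 1.0],
)
```

With three strategies and the default threshold, the loop stops after 15 discarded attempts, matching the "15 failures across 3 strategies" stopping rule.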

`/nerd-this` — Context-Scoped Research

Research just what you're working on right now. Infers scope from your current branch, session files, and conversation topics, then groups findings into research themes.

/nerd-this auth flow
  ├─ Infers scope from git diff + session context
  ├─ Groups parameters into research themes
  └─ Runs the full experiment pipeline on selected themes

Competing Theories

This is the core insight. Most experiment tools ask "is this parameter optimal?" The nerd asks "what's actually going on?" by generating 3+ competing theories per experiment:

Theory Type	What It Tests
Parameter is wrong	A different value would improve the metric
Model is wrong	The mathematical model is inappropriate — try a different one entirely
Feature is unnecessary	Removing the feature causes no degradation
Data is the bottleneck	The parameter doesn't matter because the input data is the real problem
Architecture is the bottleneck	No parameter value can fix this — the architecture needs to change

Reports evaluate each theory as SUPPORTED / REFUTED / INCONCLUSIVE and recommend: KEEP, CHANGE, REMOVE, REARCHITECT, or INVESTIGATE.
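
One way to picture the report vocabulary is as three small enumerations plus a rule that turns per-theory verdicts into a recommendation. The enum values come from the table and sentence above; the priority order inside `recommend` is purely an assumption for illustration, since the source does not say how the plugin actually derives its recommendation.

```python
from enum import Enum
from typing import Dict


class TheoryType(Enum):
    PARAMETER_WRONG = "Parameter is wrong"
    MODEL_WRONG = "Model is wrong"
    FEATURE_UNNECESSARY = "Feature is unnecessary"
    DATA_BOTTLENECK = "Data is the bottleneck"
    ARCHITECTURE_BOTTLENECK = "Architecture is the bottleneck"


class Verdict(Enum):
    SUPPORTED = "SUPPORTED"
    REFUTED = "REFUTED"
    INCONCLUSIVE = "INCONCLUSIVE"


class Recommendation(Enum):
    KEEP = "KEEP"
    CHANGE = "CHANGE"
    REMOVE = "REMOVE"
    REARCHITECT = "REARCHITECT"
    INVESTIGATE = "INVESTIGATE"


def recommend(verdicts: Dict[TheoryType, Verdict]) -> Recommendation:
    """Hypothetical rule: the most structural supported theory wins."""
    supported = {t for t, v in verdicts.items() if v is Verdict.SUPPORTED}
    if TheoryType.ARCHITECTURE_BOTTLENECK in supported:
        return Recommendation.REARCHITECT
    if TheoryType.FEATURE_UNNECESSARY in supported:
        return Recommendation.REMOVE
    if supported & {TheoryType.PARAMETER_WRONG, TheoryType.MODEL_WRONG}:
        return Recommendation.CHANGE
    if verdicts and all(v is Verdict.REFUTED for v in verdicts.values()):
        return Recommendation.KEEP          # every alternative was ruled out
    return Recommendation.INVESTIGATE       # nothing conclusive yet
```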

Research DAG — The Nerd Remembers

Every theory, verdict, and finding is persisted in a JSON knowledge graph. The nerd gets smarter with every run:

Skips dead ends — won't re-test parameters linked to active REFUTED verdicts
Seeds from open threads — picks up unresolved theories from prior runs
Detects staleness — re-tests when your source files change significantly
Synthesizes patterns — surfaces cross-experiment insights when 3+ verdicts converge
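
The first three behaviors in the list above can be sketched against a small JSON document. The schema here (a flat `verdicts` list with `parameter`, `theory`, `verdict`, and `source_hash` fields), the parameter name `search.threshold`, and the helper names are all hypothetical. Staleness is also simplified: any change to the source text counts as stale, whereas the plugin is described as re-testing only on significant changes.

```python
import hashlib
import json
from typing import List


def source_hash(text: str) -> str:
    """Fingerprint of the source a verdict was based on."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def should_skip(graph: dict, parameter: str, current_source: str) -> bool:
    """Skip a parameter only if it has a REFUTED verdict on unchanged source."""
    current = source_hash(current_source)
    return any(
        v["parameter"] == parameter
        and v["verdict"] == "REFUTED"
        and v["source_hash"] == current
        for v in graph["verdicts"]
    )


def open_threads(graph: dict) -> List[str]:
    """Theories left unresolved by earlier runs, to seed the next run."""
    return [v["theory"] for v in graph["verdicts"] if v["verdict"] == "INCONCLUSIVE"]


# Hypothetical graph, round-tripped through JSON to mimic persistence.
original_source = "THRESHOLD = 0.7\n"
example_graph = json.loads(json.dumps({
    "verdicts": [
        {"parameter": "search.threshold", "theory": "Parameter is wrong",
         "verdict": "REFUTED", "source_hash": source_hash(original_source)},
        {"parameter": "search.threshold", "theory": "Data is the bottleneck",
         "verdict": "INCONCLUSIVE", "source_hash": source_hash(original_source)},
    ]
}))
```

Tying each verdict to a fingerprint of the code it was measured on is what lets a later run tell a genuine dead end from a result that the code has since outgrown.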
