understudy

Name: understudy
Author: understudylabs

Improve LLM apps and agents from real traces — reduce cost/latency, raise quality/reliability, capture traces, build evals, run local optimization (GEPA), compare models/providers, and route through Understudy.

npx claudepluginhub understudylabs/understudy-agent-tools --plugin understudy

What's Inside

Skills (22)

distill-classifier

/distill-classifier

Use when a developer wants to replace an expensive frontier model on a classification workload (binary, multi-class, multi-label, or structured extraction) with a fine-tuned open-weight student — "distill this classifier", "can a small model do this tagging job", "the frontier labels these for $X, make it cheaper", "consensus-label my data". Multi-teacher majority-vote labeling, failure-directed SFT data, and a four-way promote/shadow/collect/stop verdict.

ingest-traces

/ingest-traces

Use when a developer already has production LLM traces — a bucket of captures, provider log exports, or gateway capture files — and wants them turned into local, redacted eval sets, or profiled for cost first. "Ingest my traces", "turn these logs into an eval set", "where is my LLM spend going", "which calls could a local model take over".

install-plugin

/install-plugin

Use when a developer wants to install, enable, update, reinstall, or verify the Understudy skills as a Claude Code plugin — "install understudy", "add the understudy skills", "set up the plugin", "why can't you see the understudy skill". Runs the non-interactive `claude plugin` CLI (or shows the commands), then tells the developer the one activation step and whether a restart is needed.

design-simulated-environment

/design-simulated-environment

Use to build a simulated, seeded environment (AutomationBench / verifiers style) so any model can run a captured agentic workload end-to-end and be scored on final state — "simulate this workload's tools", "build a validator for these traces", "let a small model attempt the whole task", "score recall/precision against gold", or any handoff from understand-workload toward whole-case model comparison.

compare-trajectories

/compare-trajectories

Use when you need to know HOW two model runs differ behaviorally on the same tasks, not just THAT one scores higher — per-task trajectory diffing that classifies the gap as persistence/recovery, knowledge, or format/parsing. "why does the bigger model pass these", "is this gap RL-shaped", "diff these two trajectory runs", "where do the trajectories diverge", "what would distillation buy me". The behavioral complement to compare-model-sweep.

Stats

Version: 0.3.0
Language: TypeScript
Stars: 2
Maintenance: Excellent
License: MIT
Last Commit: Jun 17, 2026
Added: Jun 6, 2026

Available In

understudy-skills (2)

README

understudy-agent-tools

Public, MIT-licensed Understudy skill library and thin CLI.

This repo is the public skills surface for local-first AI workload evaluation, optimization planning, gateway handoff, and agent-led implementation. The CLI is thin TypeScript/Node: durable shortcuts, auth, artifact checks, and runtime wrappers that a coding agent can monitor.

The OSS MVP loop is local-first and skill-led:

capture evidence -> attach harness/environment
  -> confirm metric/validator/holdout -> rerun baseline
  -> optimize workload -> conservative claim packet

Registration is not required for that loop. Hosted gateway access is available after `understudy login`; browser, channel, daemon, and desktop-runtime commands remain outside this public CLI until intentionally extracted.

The hosted surface this CLI consumes is documented at docs.understudylabs.com — see open-source/agent-tools for how this repo fits the platform and open-source/cli for the command-level CLI reference. The skills here stay local-first; the docs site covers the hosted contracts behind them.

Shape

| Spine | Path | Purpose |
| --- | --- | --- |
| CLI | `src/` | Thin TypeScript shortcuts for auth, artifact checks, and durable runs. |
| Skills | `skills/` | MVP progressive-disclosure agent playbooks. |
| Docs | `docs/` | Public methodology and release-boundary notes. |
| Scripts | `scripts/` | Repo hygiene checks, not product CLI code. |
| Vendor | `vendor/` | Vendored or mirrored compatibility shims, with license metadata. |

The CLI should stay boring. Workflow judgment belongs in skills; durable shortcuts belong in TypeScript only when the agent needs reliable execution, auth injection, artifact writes, or a safety gate.

Install Locally

Fast first-run installer:

curl -fsSL https://raw.githubusercontent.com/UnderstudyLabs/understudy-agent-tools/main/install.sh | bash

This installs the CLI, installs or refreshes the Claude Code plugin when `claude` is available, then opens Claude Code in the current directory. In Claude Code, run:

/reload-plugins
/understudy:onboard

The installer intentionally does not download model weights, start MLX, install Pi, launch tmux/iTerm, or make frontier calls. Those belong inside the Claude Code skill flow, where the agent can explain the tradeoffs, ask consent, coach the user on opening their preferred terminal, and run the same commands itself when appropriate.

For non-interactive installs, add `--yes`:

curl -fsSL https://raw.githubusercontent.com/UnderstudyLabs/understudy-agent-tools/main/install.sh | bash -s -- --yes

The installer is resumable. It writes step markers under `~/.understudy/agent-tools/install-state`; after a failed run, use:

curl -fsSL https://raw.githubusercontent.com/UnderstudyLabs/understudy-agent-tools/main/install.sh | bash -s -- --resume

You can also jump directly to a step:

curl -fsSL https://raw.githubusercontent.com/UnderstudyLabs/understudy-agent-tools/main/install.sh | bash -s -- --from-step 2
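
The resume and jump-to-step behavior described above can be pictured as a small planning function. This is a minimal sketch, not the installer's actual logic. The step names, the `planSteps` helper, and the exact semantics of `--resume` (skip every step that already has a marker) are assumptions. The real installer keeps its markers as files under the install-state directory; here they are modelled as an in-memory set.

```typescript
// Hypothetical sketch of marker-based resumption; not the real install.sh logic.
interface Step {
  id: number;   // 1-based position, as in `--from-step 2`
  name: string; // illustrative label only
}

interface PlanOptions {
  resume?: boolean;  // assumed meaning of --resume: skip steps that already have a marker
  fromStep?: number; // assumed meaning of --from-step N: run step N and everything after it
}

// `completed` stands in for the marker files the installer writes after each step.
function planSteps(steps: Step[], completed: Set<number>, opts: PlanOptions = {}): Step[] {
  const { resume, fromStep } = opts;
  if (fromStep !== undefined) {
    // An explicit starting point wins over markers.
    return steps.filter((s) => s.id >= fromStep);
  }
  if (resume) {
    return steps.filter((s) => !completed.has(s.id));
  }
  return steps; // fresh run: every step, in order
}

// Illustrative step list (assumed names).
const exampleSteps: Step[] = [
  { id: 1, name: "install-cli" },
  { id: 2, name: "install-plugin" },
  { id: 3, name: "open-claude-code" },
];

// After a run that failed during step 2, only step 1 has a marker.
const remaining = planSteps(exampleSteps, new Set([1]), { resume: true });
console.log(remaining.map((s) => s.name)); // steps 2 and 3
```

Keeping the plan a pure function of the step list, the markers, and the flags makes the resume rule easy to check without touching the filesystem.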

Developer install from a clone:

npm install
npm run build
node dist/bin.js --help

After package publication:

npm install -g @understudylabs/understudy-agent-tools
understudy spine

No provider calls, uploads, model downloads, secret-value inspection, or hosted jobs run by default. After authentication, the CLI emits bounded product telemetry documented in `docs/telemetry.md`; disable it with `UNDERSTUDY_TELEMETRY=0`.
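
The telemetry rule in the paragraph above amounts to a two-condition gate. The sketch below is illustrative only: `telemetryEnabled` is a hypothetical helper, not a function from this CLI, and treating every value other than `0` as "enabled" is an assumption.

```typescript
// Hypothetical sketch; the CLI's real telemetry gate may differ.
type Env = Record<string, string | undefined>;

function telemetryEnabled(env: Env, authenticated: boolean): boolean {
  // README: telemetry is emitted only after authentication.
  if (!authenticated) return false;
  // README: UNDERSTUDY_TELEMETRY=0 disables it. Other values count as enabled (assumption).
  return env["UNDERSTUDY_TELEMETRY"] !== "0";
}

console.log(telemetryEnabled({ UNDERSTUDY_TELEMETRY: "0" }, true)); // false
console.log(telemetryEnabled({}, true));                            // true
console.log(telemetryEnabled({}, false));                           // false
```

In a Node process the first argument would typically be `process.env`.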

Install as a Claude Code plugin

The skills in `skills/` ship as a Claude Code plugin, declared in `.claude-plugin/` (`plugin.json` and `marketplace.json`). Installing it registers the public invocable skills, including the `understudy` orchestrator, onboarding, capture/eval, optimization, local model, distillation, RLM, and verifier-handoff workers.

From a clone of this repo:

claude plugin marketplace add /path/to/understudy-agent-tools
claude plugin install understudy@understudy-skills

Then run `/reload-plugins` in your Claude Code session to activate the plugin; no restart is required. The equivalent interactive flow is `/plugin marketplace add <path>` followed by `/plugin install understudy@understudy-skills`. The `install-plugin` skill automates this and reports whether the plugin is already installed.

After `/reload-plugins`, run `/understudy:onboard`. That is where the coding agent guides the first local model, terminal choice, Pi/tmux handoff, and any frontier comparison with explicit consent.
