gepa-anywhere

Name: gepa-anywhere
Author: evanfabry

By EvanFabry

Scaffold-into-any-repo GEPA prompt/artifact optimization. Five skills: gepa-init (lay out a multi-agent repo), evaluator-discovery (build + calibrate a grounded evaluator for an agent), gepa-scaffold (single-prompt quickstart), gepa-run (drive the exit-42 session-reflection loop), gepa-frontier (inspect Pareto + promote the winner).

npx claudepluginhub evanfabry/gepa-anywhere --plugin gepa-anywhere

Popularity

Stars: Med 0 · Avg 285

Installs: Med 0 · Avg 1

What's Inside

Skills (6)

evaluator-discovery

/evaluator-discovery

Build a trustworthy, anchored evaluator for an agent whose output you want to optimize, when no reliable metric/judge/gold exists yet. It learns exactly what the agent produces, gives the evaluator the SAME inputs the agent saw (building + testing those surfaces), drafts the evaluator as an EXTERNAL markdown rubric, has a DIVERSE expert panel (incl. an adversary) harden it, and calibrates it against a real anchor while hill-climbing self-consistency — then registers it so gepa can optimize the agent against it. Use whenever the user wants to optimize / improve / "make better" an agent's prompt or output and the way to MEASURE quality is missing or weak — e.g. "set up a judge for my extraction agent", "how do I score whether my summarizer is good", "optimize this prompt but I have no gold labels", "build an evaluator for <agent>", or BEFORE any `gepa run` whose metric is absent or untrustworthy. NOT for laying out the repo (that's gepa-init) or driving the optimization loop (that's gepa-run); this is only the build-the-metric step. The evaluator is the bottleneck on every optimization — do not skip building one well.

gepa-autorun

/gepa-autorun

Drive a GEPA optimization run to automatic completion, stopping when a stop-policy condition is met (budget exhausted, corroborated saturation, or too many invalid patches). Use when the user says "autorun gepa", "run until done", "keep optimizing automatically", or "stop when saturated", AND a `.gepa/config.yaml` exists. Wraps gepa-run with a decide_stop check between every iteration.
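The three stop conditions named above can be pictured as one pure check that runs between iterations. The following is a minimal sketch, not the plugin's actual `decide_stop`: the `RunState` fields, the default thresholds, and the reading of "corroborated saturation" as several consecutive iterations without improvement are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunState:
    """Hypothetical snapshot of a run; the field names are illustrative only."""
    metric_calls_used: int
    metric_call_budget: int
    iterations_without_gain: int
    invalid_patches: int


def decide_stop(state: RunState,
                saturation_patience: int = 3,
                max_invalid_patches: int = 5) -> Optional[str]:
    """Return a stop reason, or None to keep iterating.

    Thresholds are assumed defaults, not values taken from the plugin.
    """
    if state.metric_calls_used >= state.metric_call_budget:
        return "budget_exhausted"
    if state.invalid_patches >= max_invalid_patches:
        return "too_many_invalid_patches"
    # Assumption: saturation means no gain for several iterations in a row.
    if state.iterations_without_gain >= saturation_patience:
        return "saturated"
    return None
```

Keeping the check free of side effects means the same function can be called after every iteration and unit-tested without running an optimization.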

gepa-frontier

/gepa-frontier

Inspect a finished gepa-anywhere run's Pareto frontier and deliberately promote the winning candidate back onto the artifact. Use after `gepa run` reaches exit 0, when the user says "show the frontier", "which candidate won", "promote the winner", "apply the best prompt", or wants to compare candidates / check the holdout report before committing.
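For readers unfamiliar with the term, a Pareto frontier is the set of candidates that no other candidate beats on every score at once. The sketch below uses the textbook dominance test and assumes higher scores are better; the candidate names and the shape of the score table are hypothetical, and the plugin's own frontier bookkeeping may differ.

```python
from typing import Dict, List


def dominates(a: List[float], b: List[float]) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_frontier(scores: Dict[str, List[float]]) -> List[str]:
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, vec in scores.items()
        if not any(
            dominates(other, vec)
            for other_name, other in scores.items()
            if other_name != name
        )
    ]


# Hypothetical per-example scores for four candidates.
scores = {
    "seed": [0.5, 0.5],
    "a": [0.9, 0.4],
    "b": [0.6, 0.6],
    "c": [0.5, 0.4],
}
frontier = pareto_frontier(scores)  # "a" and "b" survive; "seed" and "c" are dominated
```

Because several candidates can sit on the frontier at once, promotion is a deliberate choice among them, which is why checking the holdout report before committing is worthwhile.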

gepa-init

/gepa-init

Lay out a repository to hill-climb (optimize) ARBITRARILY MANY agents with gepa, each with its own grounded evaluator — by hand-creating the `.gepa/agents/<name>/` directories + a registry convention. Use once per repo, before building evaluators or running optimizations, when the user says "I want to optimize several agents/prompts in this repo", "set up a multi-agent gepa project", "initialize gepa for many agents", or "add gepa to this project" AND more than one agent will be optimized. NOT for building the evaluator itself (that's evaluator-discovery) or running the loop (that's gepa-run). For a single prompt + a code metric, prefer `gepa scaffold` (the flat `.gepa/config.yaml` quickstart) — this skill is the multi-agent superset and produces a DIFFERENT, incompatible layout, so don't mix them in one repo.

gepa-run

/gepa-run

Drive a GEPA reflective-optimization loop on any text artifact in any repo, using THIS Claude Code session as the free (Max-billed) reflection LM. The Python CLI `gepa run` handles GEPA's optimization math + checkpointing; when it needs the reflection LM it writes a pending envelope and exits 42 — you read it, propose improved artifact text, write the response, and re-invoke. Use when the user says "run gepa", "optimize this prompt/instructions", "evolve the artifact", "improve the extraction prompt", AND a `.gepa/config.yaml` exists (or they want one — then use gepa-scaffold first).
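The suspend/resume protocol described above amounts to a small loop around the CLI's exit code. The sketch below is an illustration only: the `drive` helper and its `reflect` callback are hypothetical, and the envelope's location and format are not stated in this listing, so reading the envelope and writing the response are left to the callback.

```python
import subprocess
from typing import Callable, Sequence

EXIT_DONE = 0
EXIT_REFLECT = 42  # per the description: the CLI writes a pending envelope and exits 42


def drive(cmd: Sequence[str],
          reflect: Callable[[], None],
          run: Callable[..., "subprocess.CompletedProcess"] = subprocess.run,
          max_rounds: int = 100) -> int:
    """Re-invoke cmd until it exits 0; return the number of reflection rounds handled.

    reflect is a hypothetical callback that reads the pending envelope and
    writes the response; where those files live is not specified here.
    """
    for rounds in range(max_rounds):
        code = run(list(cmd)).returncode
        if code == EXIT_DONE:
            return rounds
        if code != EXIT_REFLECT:
            raise RuntimeError(f"{cmd[0]} failed with exit code {code}")
        reflect()
    raise RuntimeError(f"no completion after {max_rounds} reflection rounds")
```

A call might look like `drive(["gepa", "run", "--config", ".gepa/config.yaml"], reflect=my_reflect)`, where `my_reflect` is whatever proposes the improved artifact text. In the plugin itself the Claude Code session plays that role conversationally instead of a callback.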

Stats

Version: 0.1.0

Language: Python

Stars: 0

Maintenance: Excellent

Last Commit: Jun 9, 2026

Added: Jun 9, 2026

Available In

gepa-anywhere-local

README

gepa-anywhere

Scaffold-into-any-repo GEPA reflective optimization for any text artifact (a prompt, an instruction file, prompt fragments) — using the active Claude Code session as the reflection LM (free, Max-billed), with host-supplied rollout + metric hooks. A generalization of whiteboard/tools/whiteboard-gepa.

Status: M0–M4 built. Generic core (single-file and multi-component artifacts; command and subagent rollout; command and subagent/LLM-judge metric; session and api reflection) + scaffold + 3 skills. Validated end-to-end against two examples on the same core: examples/cramer (real host, M3) and examples/whiteboard (multi-component generality proof, runnable). See SPEC.md for the full design and the M0–M4 plan.

uv sync                                   # one-time, in this repo

# in any repo:
gepa scaffold                             # drops .gepa/{config.yaml, rollout.sh, metric.py, golden/}
#  ...point artifact.path at your file, implement the hooks, add a golden set...
gepa run --config .gepa/config.yaml       # session drives the exit-42 reflection loop
gepa state --config .gepa/config.yaml     # suspended | done | in-flight
gepa frontier --config .gepa/config.yaml  # inspect the Pareto frontier; --promote applies the winner

gepa and gepa-anywhere are the same entry point. As a Claude Code plugin, the three skills (gepa-scaffold, gepa-run, gepa-frontier) drive this conversationally — the session is the reflection LM, so no API key is needed for the heavy LLM work.

What's where

core/        generic loop — config, dataset/splits, candidate I/O, command/subagent hooks,
             ConfigDrivenGEPAAdapter, exit-42 suspend/resume, run lock, frontier, CLI
templates/   what `scaffold` drops into a host repo (mirrors core/scaffold.py constants)
skills/      gepa-scaffold | gepa-run | gepa-frontier
tests/       config/protocol/hook/adapter unit tests + a no-LLM end-to-end toy optimization

The generic core carries no host-domain strings (enforced by tests/test_nfr6_generic.py). The harness is reusable; the metric and the golden set are the host's work — that's where the quality of a run is decided.
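To make "the metric is the host's work" concrete, here is one shape a host-side scoring function could take for an extraction task. It is a sketch under stated assumptions: the `field_f1` name, the flat field-to-value record format, and the choice of F1 are all hypothetical, and the hook interface that the scaffolded `metric.py` must actually satisfy is not shown in this README.

```python
from typing import Mapping


def field_f1(predicted: Mapping[str, str], gold: Mapping[str, str]) -> float:
    """F1 over (field, value) pairs; 1.0 means exact agreement with the golden record."""
    pred_pairs = set(predicted.items())
    gold_pairs = set(gold.items())
    if not pred_pairs and not gold_pairs:
        return 1.0  # nothing expected, nothing produced
    hits = len(pred_pairs & gold_pairs)
    if hits == 0:
        return 0.0
    precision = hits / len(pred_pairs)
    recall = hits / len(gold_pairs)
    return 2 * precision * recall / (precision + recall)


# Hypothetical records: one of two fields matches the golden label.
score = field_f1(
    {"name": "Ada", "date": "2026-06-09"},
    {"name": "Ada", "date": "2026-06-10"},
)
```

A graded score like this gives the optimizer more signal than a pass/fail check, since a candidate that fixes one field out of several still registers as an improvement.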

Develop

uv run pytest -q          # full suite (the toy run exercises the real gepa.optimize loop)

First validation target (M3): scaffold into ~/trading/cramer to optimize its extraction prompt against a hand-labeled golden set (SPEC §9, M3).
