From optimize-anything
Guide for running, configuring, and interpreting `optimize-anything` and `gepa` optimization workflows. Use when asked how to optimize a prompt, artifact, config, or skill, or when troubleshooting evaluator feedback, budget, or score interpretation.
Slash command: `/optimize-anything:optimization-guide`
End-to-end guide for optimizing text artifacts with `optimize-anything` and `gepa`.
- Start with your current best version of the artifact. gepa evolves from here.
- Use `objective` if you have no seed, and let gepa bootstrap one from the description.
- Use `{"system_prompt": "...", "examples": "..."}` for multi-component artifacts (e.g., system_prompt + few-shot examples).
Use the generate-evaluator skill to create an evaluator matched to your objective. The evaluator is the most critical piece: gepa's optimization quality is bounded by your evaluator's feedback quality.
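To make the point about feedback quality concrete, here is a toy scorer that returns diagnostics alongside the number. It is a sketch only: the name `score_clarity`, the return shape (a score plus a diagnostics dict), and the 20-word threshold are assumptions for illustration, not the contract that optimize-anything or gepa expects. Use the generate-evaluator skill for a real evaluator.

```python
def score_clarity(candidate: str, max_words_per_sentence: int = 20) -> tuple[float, dict]:
    """Hypothetical scorer: fraction of sentences at or under a word limit."""
    # Naive sentence split: treat '!', '?' and '.' as sentence terminators.
    normalized = candidate.replace("!", ".").replace("?", ".")
    sentences = [s.strip() for s in normalized.split(".") if s.strip()]
    if not sentences:
        return 0.0, {"error": "candidate is empty"}

    long_sentences = [
        s for s in sentences if len(s.split()) > max_words_per_sentence
    ]
    score = 1.0 - len(long_sentences) / len(sentences)

    # The diagnostics say *why* the score is what it is, which is the kind
    # of information a reflection step can act on.
    feedback = {
        "sentence_count": len(sentences),
        "long_sentence_count": len(long_sentences),
        "longest_offender": max(long_sentences, key=len) if long_sentences else None,
    }
    return score, feedback
```

A bare number such as `0.5` gives the optimizer little to work with. Naming the offending sentence gives it something specific to rewrite.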
Single-task (no dataset) — optimize one artifact against one evaluator:
{"seed": "...", "evaluator_command": ["bash", "evaluators/eval.sh"]}
Multi-task (with dataset) — optimize across multiple examples for cross-task transfer:
result = optimize_anything(
    seed_candidate="...",
    evaluator=eval_fn,
    dataset=[{"input": "q1", "expected": "a1"}, ...],
)
Generalization (train + validation split) — ensure the artifact transfers to unseen examples:
result = optimize_anything(
    seed_candidate="...",
    evaluator=eval_fn,
    dataset=train_examples,
    valset=val_examples,
)
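One way to produce `train_examples` and `val_examples` from a single list is a seeded shuffle-and-split. The helper below is a minimal sketch; `split_dataset`, the 20% validation fraction, and the fixed seed are illustrative choices and not part of optimize-anything.

```python
import random


def split_dataset(examples, val_fraction=0.2, seed=0):
    """Shuffle a copy of `examples` and return (train, val) lists."""
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    if len(examples) < 2:
        raise ValueError("need at least two examples to split")

    shuffled = list(examples)  # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded for reproducible splits

    # Hold out at least one example, but never the whole dataset.
    n_val = min(len(shuffled) - 1, max(1, int(len(shuffled) * val_fraction)))
    return shuffled[n_val:], shuffled[:n_val]


examples = [{"input": f"q{i}", "expected": f"a{i}"} for i in range(10)]
train_examples, val_examples = split_dataset(examples)
```

Fixing the seed keeps the split stable across runs, so score differences reflect the artifact and not a reshuffled validation set.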
Use the budget subcommand for a starting point, then adjust:
| Seed length (chars) | Recommended budget | Rationale |
|---|---|---|
| < 100 | 50 | Short artifact, fewer mutations needed |
| 100-499 | 100 | Moderate exploration |
| 500-1999 | 200 | More search space to cover |
| 2000+ | 300 | Extensive exploration recommended |
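The table can be encoded as a small lookup for use in scripts. This sketch only mirrors the thresholds listed above; `recommended_budget` is a hypothetical helper, and the `budget` subcommand may apply different logic.

```python
def recommended_budget(seed: str) -> int:
    """Map seed length in characters to the starting budget from the table."""
    length = len(seed)
    if length < 100:
        return 50
    if length < 500:
        return 100
    if length < 2000:
        return 200
    return 300
```

Treat the result as a starting point and adjust after a first run.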
Configure options via GEPAConfig:
from gepa.optimize_anything import GEPAConfig, EngineConfig
config = GEPAConfig(
    engine=EngineConfig(
        max_metric_calls=150,  # Budget
        parallel=True,         # Parallel evaluation
        max_workers=8,         # Worker count
    ),
)
Via CLI:
optimize-anything optimize seed.txt --evaluator-command bash evaluators/eval.sh --budget 100 --objective "maximize clarity" -o result.txt
Via Python API:
from optimize_anything import optimize_anything, command_evaluator
from gepa.optimize_anything import GEPAConfig, EngineConfig
result = optimize_anything(
    seed_candidate=open("seed.txt").read(),
    evaluator=command_evaluator(["bash", "evaluators/eval.sh"]),
    objective="maximize clarity",
    config=GEPAConfig(engine=EngineConfig(max_metric_calls=100)),
)
print(result.best_candidate)
Use plateau-based early stopping to avoid wasting budget after convergence:
optimize-anything optimize seed.txt \
    --evaluator-command bash evaluators/eval.sh \
    --budget 120 \
    --early-stop \
    --early-stop-window 10 \
    --early-stop-threshold 0.005
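The exact plateau rule is not spelled out here, so the following is one plausible reading of a window and threshold: stop once the best score has improved by less than the threshold over the last `window` iterations. `plateau_stop_iteration` is a hypothetical helper, not the tool's implementation.

```python
def plateau_stop_iteration(scores, window=10, threshold=0.005):
    """Return the 1-based iteration where a plateau rule would stop, else None."""
    best = float("-inf")
    best_so_far = []  # running best score after each iteration
    for iteration, score in enumerate(scores, start=1):
        best = max(best, score)
        best_so_far.append(best)
        # Compare the current best with the best from `window` iterations ago.
        if iteration > window:
            improvement = best_so_far[-1] - best_so_far[-1 - window]
            if improvement < threshold:
                return iteration
    return None
```

Under this reading, a larger window or a smaller threshold makes the rule more patient, which suits evaluators whose scores fluctuate between calls.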
Notes:
- `--early-stop` is auto-enabled when `--budget` > 30.
- Tune `--early-stop-window` and `--early-stop-threshold` for noisier evaluators.
- The result reports `early_stopped` and `stopped_at_iteration` when a run exits early.
For cache reuse across runs, copy prior disk cache entries into a new run directory:
optimize-anything optimize seed.txt \
    --evaluator-command bash evaluators/eval.sh \
    --run-dir runs \
    --cache \
    --cache-from runs/run-20260303-120000
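Conceptually, seeding a new run from an old cache amounts to a directory copy done before any evaluation happens. The sketch below illustrates that idea with the standard library; `seed_cache_from` and the assumption that `fitness_cache/` sits directly under each run directory are illustrative, and the CLI's actual behavior may differ in detail.

```python
import shutil
from pathlib import Path


def seed_cache_from(previous_run: Path, new_run: Path) -> int:
    """Copy fitness_cache/ from a previous run into a new run directory.

    Returns the number of files present in the new cache afterwards.
    """
    source = previous_run / "fitness_cache"
    target = new_run / "fitness_cache"
    if not source.is_dir():
        raise FileNotFoundError(f"no fitness_cache/ found in {previous_run}")

    # dirs_exist_ok lets the copy merge into an existing cache directory
    # (requires Python 3.8+).
    shutil.copytree(source, target, dirs_exist_ok=True)
    return sum(1 for path in target.rglob("*") if path.is_file())
```

Cached entries are only useful if the evaluator is unchanged between runs; scores cached from a different evaluator would no longer match what it returns now.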
Notes:
- `--cache-from` requires `--cache` and `--run-dir`.
- `--cache-from` copies `fitness_cache/` from the previous run before optimization starts.
The result contains:
- `best_candidate`: the optimized artifact.
- `val_aggregate_scores`: score progression across iterations.
- `total_metric_calls`: how many evaluator invocations were used.
Signs of a good run:
- `total_metric_calls` < budget (converged early).
- Compare `best_candidate` against seed.txt or the in-memory seed to see targeted differences.
Signs of problems:
- No improvement over the seed: add richer feedback, increase `budget`, or refine `objective`.
Tips:
- Start with a `budget` of 20-50 to validate your evaluator on seed.txt and confirm that scores change meaningfully.
- Richer evaluator feedback improves gepa's reflection.
- Write a specific `objective` string, which is injected into gepa's reflection prompt, and specify constraints like token limits or format requirements.
- Use `background` for domain knowledge, constraints, or strategies such as "Target audience is non-technical users. Never use jargon."
- Increase `budget` if optimization results on seed.txt are poor.
- Set `evaluator_cwd` to an absolute project path (the directory containing seed.txt and evaluators/eval.sh) when evaluators/eval.sh or other evaluator commands use repo-relative files or scripts.
npx claudepluginhub asragab/optimize-anything
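For the comparison of `best_candidate` against the seed mentioned under the signs of a good run, a unified diff from the standard library is usually enough. `diff_against_seed` is a hypothetical helper; it assumes both values are plain strings.

```python
import difflib


def diff_against_seed(seed: str, best_candidate: str) -> str:
    """Return a unified diff between the seed and the optimized artifact."""
    diff_lines = difflib.unified_diff(
        seed.splitlines(),
        best_candidate.splitlines(),
        fromfile="seed.txt",
        tofile="best_candidate",
        lineterm="",  # lines were split without newlines, so add none back
    )
    return "\n".join(diff_lines)
```

An empty string means the optimizer returned the seed unchanged. A small, focused diff is consistent with targeted edits, while a wholesale rewrite is worth checking against the objective.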