From dspy-skills
Optimizes text artifacts — code, prompts, agent architectures, configs — via GEPA's evolutionary search API with evaluator-driven ASI feedback.
How this skill is triggered — by the user, by Claude, or both
Slash command
/dspy-skills:dspy-optimize-anythingThis skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Optimize any artifact representable as text — code, prompts, agent architectures, vector graphics, configurations — using a single declarative API powered by GEPA's reflective evolutionary search.
Optimize any artifact representable as text — code, prompts, agent architectures, vector graphics, configurations — using a single declarative API powered by GEPA's reflective evolutionary search.
| Input | Type | Description |
|---|---|---|
seed_candidate | str | dict[str, str] | None | Starting artifact text, or None for seedless mode |
evaluator | Callable | Returns score (higher=better), optionally with ASI dict |
dataset | list | None | Training examples (for multi-task and generalization modes) |
valset | list | None | Validation set (for generalization mode) |
objective | str | None | Natural language description of what to optimize for |
background | str | None | Domain knowledge and constraints |
config | GEPAConfig | None | Engine, reflection, and tracking settings |
| Output | Type | Description |
|---|---|---|
result.best_candidate | str | dict | Best optimized artifact |
pip install -U "gepa>=0.1.1,<0.2"
The evaluator scores a candidate and returns Actionable Side Information (ASI) — diagnostic feedback that guides the LLM proposer during reflection.
Simple evaluator (score only):
import gepa.optimize_anything as oa
from gepa.optimize_anything import EngineConfig, GEPAConfig
config = GEPAConfig(engine=EngineConfig(max_metric_calls=100))
def evaluate(candidate: str) -> float:
score, diagnostic = run_my_system(candidate)
oa.log(f"Error: {diagnostic}") # captured as ASI
return score
Rich evaluator (score + structured ASI):
def evaluate(candidate: str) -> tuple[float, dict]:
result = execute_code(candidate)
return result.score, {
"Error": result.stderr,
"Output": result.stdout,
"Runtime": f"{result.time_ms:.1f}ms",
}
ASI can include open-ended text, structured data, multi-objectives (via scores), or images (via gepa.Image) for vision-capable LLMs.
Mode 1 — Single-Task Search: Solve one hard problem. No dataset needed.
result = oa.optimize_anything(
seed_candidate="<your initial artifact>",
evaluator=evaluate,
config=config,
)
Mode 2 — Multi-Task Search: Solve a batch of related problems with cross-transfer.
result = oa.optimize_anything(
seed_candidate="<your initial artifact>",
evaluator=evaluate,
dataset=tasks,
config=config,
)
Mode 3 — Generalization: Build a skill/prompt/policy that transfers to unseen problems.
result = oa.optimize_anything(
seed_candidate="<your initial artifact>",
evaluator=evaluate,
dataset=train,
valset=val,
config=config,
)
Seedless mode: Describe what you need instead of providing a seed.
result = oa.optimize_anything(
evaluator=evaluate,
objective="Generate a Python function `reverse()` that reverses a string.",
config=config,
)
print(result.best_candidate)
import gepa.optimize_anything as oa
from gepa import Image
from gepa.optimize_anything import EngineConfig, GEPAConfig
import logging
logger = logging.getLogger(__name__)
# ---------- SVG optimization with VLM feedback ----------
GOAL = "a pelican riding a bicycle"
VLM = "vertex_ai/gemini-3-flash-preview"
VISUAL_ASPECTS = [
{"id": "overall", "criteria": f"Rate overall quality of this SVG ({GOAL}). SCORE: X/10"},
{"id": "anatomy", "criteria": "Rate pelican accuracy: beak, pouch, plumage. SCORE: X/10"},
{"id": "bicycle", "criteria": "Rate bicycle: wheels, frame, handlebars, pedals. SCORE: X/10"},
{"id": "composition", "criteria": "Rate how convincingly the pelican rides the bicycle. SCORE: X/10"},
]
def evaluate(candidate, example):
"""Render SVG, score with a VLM, return (score, ASI)."""
image = render_image(candidate["svg_code"]) # via cairosvg
score, feedback = get_vlm_score_feedback(VLM, image, example["criteria"])
return score, {
"RenderedSVG": Image(base64_data=image, media_type="image/png"),
"Feedback": feedback,
}
result = oa.optimize_anything(
seed_candidate={"svg_code": "<svg>...</svg>"},
evaluator=evaluate,
dataset=VISUAL_ASPECTS,
background=f"Optimize SVG source code depicting '{GOAL}'. "
"Improve anatomy, composition, and visual quality.",
config=GEPAConfig(engine=EngineConfig(max_metric_calls=100)),
)
logger.info(f"Best SVG:\n{result.best_candidate['svg_code']}")
# ---------- Code optimization (single-task) ----------
def evaluate_solver(candidate: str) -> tuple[float, dict]:
"""Evaluate a Python solver for a mathematical optimization problem."""
import subprocess, json
proc = subprocess.run(
["python", "-c", candidate],
capture_output=True, text=True, timeout=30,
)
if proc.returncode != 0:
oa.log(f"Runtime error: {proc.stderr}")
return 0.0, {"Error": proc.stderr}
try:
output = json.loads(proc.stdout)
return output["score"], {
"Output": output.get("solution"),
"Runtime": f"{output.get('time_ms', 0):.1f}ms",
}
except (json.JSONDecodeError, KeyError) as e:
oa.log(f"Parse error: {e}")
return 0.0, {"Error": str(e), "Stdout": proc.stdout}
result = oa.optimize_anything(
evaluator=evaluate_solver,
objective="Write a Python solver for the bin packing problem that "
"minimizes the number of bins. Output JSON with 'score' and 'solution'.",
background="Use first-fit-decreasing as a starting heuristic. "
"Higher score = fewer bins used.",
config=GEPAConfig(engine=EngineConfig(max_metric_calls=100)),
)
print(result.best_candidate)
# ---------- Agent architecture generalization ----------
def evaluate_agent(candidate: str, example: dict) -> tuple[float, dict]:
"""Run an agent architecture on a task and score it."""
exec_globals = {}
exec(candidate, exec_globals)
agent_fn = exec_globals.get("solve")
if agent_fn is None:
return 0.0, {"Error": "No `solve` function defined"}
try:
prediction = agent_fn(example["input"])
correct = prediction == example["expected"]
score = 1.0 if correct else 0.0
feedback = "Correct" if correct else (
f"Expected '{example['expected']}', got '{prediction}'"
)
return score, {"Prediction": prediction, "Feedback": feedback}
except Exception as e:
return 0.0, {"Error": str(e)}
result = oa.optimize_anything(
seed_candidate="def solve(input):\n return input",
evaluator=evaluate_agent,
dataset=train_tasks,
valset=val_tasks,
background="Discover a Python agent function `solve(input)` that "
"generalizes across unseen reasoning tasks.",
config=GEPAConfig(engine=EngineConfig(max_metric_calls=100)),
)
print(result.best_candidate)
optimize_anything complements DSPy's built-in optimizers. Use DSPy optimizers (GEPA, MIPROv2, BootstrapFewShot) for DSPy programs, and optimize_anything for arbitrary text artifacts outside DSPy:
import dspy
import gepa.optimize_anything as oa
from gepa.optimize_anything import EngineConfig, GEPAConfig
# DSPy program optimization (use dspy.GEPA)
optimizer = dspy.GEPA(
metric=gepa_metric,
reflection_lm=dspy.LM("openai/gpt-4o"),
auto="medium",
)
compiled = optimizer.compile(agent, trainset=trainset)
# Non-DSPy artifact optimization (use optimize_anything)
result = oa.optimize_anything(
seed_candidate=my_config_yaml,
evaluator=eval_config,
background="Optimize Kubernetes scheduling policy for cost.",
config=GEPAConfig(engine=EngineConfig(max_metric_calls=100)),
)
oa.log() — Route prints to the proposer as ASI instead of stdout(score, dict) tuples for multi-faceted diagnosticsobjective= when the solution space is large and unfamiliarbackground= to constrain the searchvalset when the artifact must transfer to unseen inputsgepa.Image to pass rendered outputs to vision-capable LLMsGEPAConfig(engine=EngineConfig(max_metric_calls=...))gepa package (pip install -U "gepa>=0.1.1,<0.2")valset for transfernpx claudepluginhub omidzamani/dspy-skills --plugin dspy-skillsOptimizes DSPy programs using the dspy.GEPA reflective/evolutionary optimizer for complex tasks with rich-feedback metrics. Requires a DSPy module, metric returning dspy.Prediction, trainset, and reflection LM.
Optimizes agentic systems with DSPy's GEPA optimizer using LLM reflection on execution traces and Pareto-based search. For multi-step agents needing textual feedback metrics.
Runs autonomous optimization loops to iteratively improve prompts, templates, configs, or code using four-way separation of main agent, eval agent, test runner, and deterministic eval.py judge. Invoke via /autoresearch or 'optimize this prompt'.