From autodialectics
Run benchmark suites and manage policy evolution — create challengers, compare against champions, promote or rollback policies.
How this skill is triggered — by the user, by Claude, or both
Slash command
/autodialectics:benchmarkThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Use this skill to benchmark policies and drive champion/challenger evolution.
Use this skill to benchmark policies and drive champion/challenger evolution.
autodialectics-mcp must be on PATH (pip install autodialectics).
benchmark(suite_dir?, policy_id?) — run the benchmark suite against a policy. Returns case-by-case results with scores and decisions.evolve_policy(use_gepa?) — analyze recent benchmark reports and create a challenger policy. Set use_gepa: false to skip the GEPA optimizer (simpler heuristic fallback).promote_policy(policy_id) — promote a challenger to champion if comparison rules allow.rollback_policy() — revert to the previous champion if the current one regresses.autodialectics benchmark
autodialectics evolve
autodialectics promote <policy_id>
autodialectics rollback
benchmark → evolve_policy → benchmark (with challenger) → compare → promote or rollback
evolve_policy returns no_reports, run benchmarks first to generate data.If the user passes a suite directory after /autodialectics:benchmark, use it as the benchmark suite path.
Creates, edits, and optimizes skills for Claude Code, including drafting, evaluating with test prompts, iterating on performance, and improving skill descriptions for better triggering accuracy.
npx claudepluginhub hmbown/plugins --plugin autodialectics