From agent-evaluation-lab
Use when designing, running, debugging, or hardening deterministic eval suites for agent skills, prompts, tool workflows, or MCP-backed cases.
Slash command
/agent-evaluation-lab:skill-evaluation-workbench
- A skill or prompt needs repeatable quality checks across models.
references/ area and case fixtures into scoped support dirs.
result, summary, trace, and workspace evidence.
references/workbench-suite-model.md
Guides creation, editing, and verification of skills for AI coding agents using test-driven development with subagent scenarios. Use when authoring or debugging skills.
npx claudepluginhub yeaight7/agent-powerups --plugin agent-evaluation-lab