Enhanced skill creator with evaluation, benchmarking, and description optimization
npx claudepluginhub yaniv-golan/skill-creator-plusCreate, test, evaluate, and iteratively improve Claude Code skills with built-in benchmarking, blind A/B comparison, and description optimization
The skill that builds skills. Write a draft, run evals against a baseline, review results in an interactive viewer, improve, and repeat — until your skill actually works. Based on Anthropic's official skill-creator plugin, with bug fixes and best practices baked in.
Official skill-creator | skill-creator-plus | |
|---|---|---|
| Best practices guide | — | 580-line Anthropic patterns reference |
| Eval viewer in Cowork | Silent fail on submit | Copyable JSON textarea (fixed) |
| Description optimizer | Requires separate ANTHROPIC_API_KEY | Uses your existing claude session |
| Benchmarking script | Silent empty results | Fixed directory handling |
| Skill type taxonomy | 3 broad categories | 3 + 9 Anthropic internal types |
| Pre-packaging checklist | — | Official checklist built in |
Anthropic ships a skill-creator plugin. It's good, but several parts are broken or missing:
ANTHROPIC_API_KEY most users don't have. This version uses claude -p, which just works with your existing session.See the CHANGELOG for the full list of fixes.
Install (Claude Code):
claude plugin marketplace add https://github.com/yaniv-golan/skill-creator-plus
claude plugin install skill-creator-plus@skill-creator-plus-marketplace
Then just ask:
/skill-creator-plus Create a skill that reviews pull requests for security issues
The skill takes it from there — intent capture, drafting, test cases, evaluation, and iteration.
Note: If you also have Anthropic's built-in
skill-creatorinstalled, Claude may pick that one instead. Either uninstall the built-in, or use/skill-creator-plusto invoke this version explicitly.
Captures your intent through structured questions, researches existing patterns, then writes a SKILL.md with metadata, instructions, and test cases.
Spawns parallel runs (with-skill and baseline) on test prompts. While runs execute, drafts quantitative assertions. Grades results via the grader agent and shows them in an interactive browser-based viewer.
Analyzes evaluation results, identifies weaknesses, and rewrites the skill. Each iteration is benchmarked against the previous version.
For rigorous validation, runs blind A/B comparison: the comparator agent scores two outputs without knowing which skill produced them, then the analyzer agent unblinds and explains the differences.
Generates trigger/non-trigger test queries, runs an optimization loop with train/test split, and selects the best-performing description.
— or install manually —
yaniv-golan/skill-creator-plus and click SyncFrom your terminal:
claude plugin marketplace add https://github.com/yaniv-golan/skill-creator-plus
claude plugin install skill-creator-plus@skill-creator-plus-marketplace
Claude Code marketplace entries for the plugin-safe Antigravity Awesome Skills library and its compatible editorial bundles.
Production-ready workflow orchestration with 84 marketplace plugins, 192 local specialized agents, and 156 local skills - optimized for granular installation and minimal token usage
Directory of popular Claude Code extensions including development tools, productivity plugins, and MCP integrations