By synergy321
Eval and iterate Claude skills · 4-flow split (umbrella router + iterate / trigger-tune / blind-test) · L1/L2/L3 scoring · read-only on target · skill-creator contract.
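Compare the quality of two outputs without knowing which skill produced which output.
Your job: take an eval prompt, execute it by following the skill's instructions, and record the entire process in full. The grader later reads this record to judge whether the run actually succeeded.
Your job is to read the transcript and output files the executor left behind, score each item, and give an overall quality judgment.
Your job: read the scoring data from every eval run in this round, identify patterns, and produce improvement suggestions. Dispatched in Iterate Step 4.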
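The blind comparison in the first agent role above (judging two outputs without knowing which skill produced which) can be illustrated with a minimal sketch. This is not the plugin's actual implementation: the function names `blind_pair` and `unblind`, the neutral labels `X`/`Y`, and the source names `skill_a`/`skill_b` are all hypothetical.

```python
import random


def blind_pair(output_a, output_b, seed=None):
    """Hide which skill produced which output behind neutral labels.

    Returns (blinded, key): `blinded` maps "X"/"Y" to output text and is
    what a comparator would see; `key` maps "X"/"Y" back to the source
    and stays hidden until the verdict is in. All names are hypothetical.
    """
    rng = random.Random(seed)
    entries = [("skill_a", output_a), ("skill_b", output_b)]
    rng.shuffle(entries)  # random order, so label position leaks nothing
    labels = ["X", "Y"]
    blinded = {label: text for label, (_, text) in zip(labels, entries)}
    key = {label: source for label, (source, _) in zip(labels, entries)}
    return blinded, key


def unblind(verdict_label, key):
    """Map the comparator's winning label back to the real source."""
    return key[verdict_label]
```

The comparator would receive only `blinded`; `key` is consulted after it has committed to a label.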
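End-to-end blind test for file-producing skills: checks whether the skill's output artifact (DESIGN.md / document / template / spec) can be used as a prompt by a downstream LLM, i.e. whether the output actually "works".
Use this skill when the user says:
- "blind-test this skill"
- "does the output work"
- "end-to-end test the skill"
- "test DESIGN.md downstream"
- "is the output file right for downstream use"
- "blind test"
Even if the user does not say "blind test", this skill should trigger whenever the request involves end-to-end validation of a file-producing skill (output artifact = downstream prompt).
Do not use for: L1/L2/L3 tiered scoring (→ iterate-skill) / description trigger rate (→ trigger-tune) / pure CLI-tool skills (no file output, so a blind test is meaningless).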
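Put the skill through a real run → score it (three gates: was something produced / was it done correctly / was it done well) → give improvement suggestions → produce a report card. Every eval is forced to run twice (once with the skill, once without), which forces an answer to the question "does this skill actually add anything?"
Use this skill when the user says:
- "run L1/L2/L3 scoring"
- "benchmark this skill"
- "check the skill's content quality"
- "iteration N"
- "next iteration"
- "improve the skill's content"
Even if the user does not say "iterate", this skill should trigger whenever the request involves tiered quality evaluation or multi-round content improvement of an existing skill.
Do not use for: description trigger-rate optimization (→ trigger-tune) / end-to-end output testing (→ blind-test) / creating a skill from scratch (→ skill-creator).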
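The paired "with skill / without skill" runs described for iterate-skill lend themselves to a simple per-eval delta. The sketch below is only an illustration under stated assumptions: the `PairedRun` record, the numeric score scale, and the `skill_lift` summary are hypothetical and are not taken from the plugin's report format.

```python
from dataclasses import dataclass


@dataclass
class PairedRun:
    """One eval run twice. Field names and score scale are hypothetical."""
    eval_id: str
    with_skill: float  # score of the run that used the skill
    baseline: float    # score of the run that did not


def skill_lift(runs):
    """Summarize how much the skill helped across paired runs."""
    if not runs:
        raise ValueError("need at least one paired run")
    deltas = [r.with_skill - r.baseline for r in runs]
    return {
        "mean_delta": sum(deltas) / len(deltas),
        "wins": sum(1 for d in deltas if d > 0),    # skill beat the baseline
        "losses": sum(1 for d in deltas if d < 0),  # skill did worse
        "ties": sum(1 for d in deltas if d == 0),
    }
```

A mean delta near zero with as many losses as wins would suggest the skill adds little over the unassisted baseline, which is the question the forced double run is meant to surface.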
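Front desk / triage: works out which aspect of an existing skill you want to test and routes you to one of three flows: iterate-skill / trigger-tune / blind-test.
Use this skill for intent recognition and routing when the user says:
- "help me improve a skill"
- "run an eval on it"
- "benchmark this skill"
- "is this skill's triggering accurate"
- "A/B compare two versions"
- "blind-test the output"
- "run an iteration"
Even if the user does not literally say "iterator", this skill should trigger whenever the request involves evaluating, improving, or testing an existing skill.
Do not use for: creating a new skill from scratch (→ skill-creator) / explaining skill concepts (just answer directly).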
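Optimize the trigger accuracy of a skill description: using the trigger annotations in the Iterate eval report, adjust the description, run a trigger test (20 prompts: should-trigger / should-NOT-trigger), and have the user confirm.
Use this skill when the user says:
- "the description triggers inaccurately"
- "trigger accuracy"
- "test it with 20 prompts"
- "optimize the trigger phrases"
- "too many false triggers"
- "it should have triggered but didn't"
Even if the user does not say "trigger", this skill should trigger whenever the request involves tuning the trigger rate of a skill description.
Do not use for: content quality scoring (→ iterate-skill) / end-to-end output testing (→ blind-test) / creating a skill from scratch (→ skill-creator).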
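The trigger test described for trigger-tune (a set of prompts labeled should-trigger or should-NOT-trigger) reduces to a small tally. The following is a minimal sketch, assuming each case is a hypothetical `(prompt, should_trigger, did_trigger)` tuple; the function name and the report keys are illustrative and not the plugin's own output format.

```python
def trigger_report(cases):
    """Score a labeled trigger test.

    `cases` is a list of (prompt, should_trigger, did_trigger) tuples.
    This tuple shape is an assumption made for illustration only.
    """
    if not cases:
        raise ValueError("need at least one labeled prompt")
    # Should have fired but did not.
    missed = [p for p, should, did in cases if should and not did]
    # Fired although it should have stayed quiet.
    false_triggers = [p for p, should, did in cases if not should and did]
    correct = len(cases) - len(missed) - len(false_triggers)
    return {
        "accuracy": correct / len(cases),
        "missed_triggers": missed,
        "false_triggers": false_triggers,
    }
```

Listing the missed and false-trigger prompts separately matters because they pull the description in opposite directions: missed triggers argue for broader phrasing, false triggers for tighter exclusions.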
Modifies files
Hook triggers on file write and edit operations
Uses power tools
Uses Bash, Write, or Edit tools
A Claude Code plugin that puts a skill to work, scores it across three levels, and hands back a concrete list of improvements — without ever touching the skill it is testing.
Writing a skill is easy. Knowing whether it actually helps is not. skill-iterator answers that question empirically:
The plugin splits by user intent — pick the one that matches what you want to know:
- /skill-iterator: umbrella router; start here if you are not sure which flow you need.
- /iterate-skill: L1/L2/L3 content scoring + improvement suggestions over multiple iterations.
- /trigger-tune: optimize a skill's description so it fires at the right time (trigger-rate tuning).
- /blind-test: end-to-end output test with anti-fixture-leak guards.

suggestions.json. Applying changes is a separate, deliberate step.

This repo is a Claude Code marketplace containing one plugin. From inside Claude Code:
# 1. add the marketplace
/plugin marketplace add synergy321/skill-iterator
# 2. install the plugin
/plugin install skill-iterator@skill-iterator-plugin
| plugin | skill-iterator |
| marketplace | skill-iterator-plugin |
Skills, agents, and the lint hook are auto-discovered — no extra configuration.
skill-creator, an authoring tool that is not bundled with this plugin. Without it you still get every finding; you just apply the edits with your own workflow.

run_loop.py drives the Anthropic SDK directly and reads ANTHROPIC_API_KEY from the environment; the agent-spawn path does not.

MIT © 2026 Travis Chong. Use it, fork it, ship it.
npx claudepluginhub synergy321/skill-iterator --plugin skill-iterator