ClaudePluginHub

Community directory for discovering and installing Claude Code plugins.

© 2025 ClaudePluginHub

Community Maintained · Not affiliated with Anthropic

silver-benchmark | silver-bullet

Skill

silver-benchmark

From silver-bullet

Runs repeatable benchmark and adversarial evaluation workflows across agents, models, providers, prompts, or implementation approaches. Defines task fixtures, scoring rubrics, and produces decision reports.

developer-tools

Popularity

Stars

5

Forks

3

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/silver-bullet:silver-benchmark <benchmark task> [--providers <list>] [--rounds N]

User invocable

Model invocable

Inline context

Default effort

Argument hint: <benchmark task> [--providers <list>] [--rounds N]

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

SB-owned benchmark workflow for repeatable evaluation. External providers may

SKILL.md

50 lines · ~468 tokens

Stats

Language: HTML

Stars: 5

Forks: 3

Maintenance: Excellent

Last Commit: Jun 18, 2026

Tags

adversarial-testing

agent-comparison

/silver:benchmark - Agent And Approach Evaluation

Silver Bullet (SB) owns this benchmark workflow for repeatable evaluation. External providers may enrich a run only when they are installed and requested; SB still owns the fixture, scoring, evidence, and final decision.

Output

Write or update .planning/BENCHMARK.md.

The report must include:

- task fixture, constraints, and accepted-answer rubric;
- compared agents/models/providers/approaches;
- cost, latency, tool use, evidence quality, and correctness observations;
- self-evaluation bias checks when applicable;
- winning decision or inconclusive result with next step;
- retained artifacts and cleanup notes.
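
As a rough illustration of the report requirements above, the sketch below assembles a report body and refuses to render when a required part is missing. The section titles, the `render_report` helper, and the heading layout are assumptions made for this example; the skill specifies what the report must contain, not how it is laid out.

```python
# Hypothetical section titles; the skill names the required content,
# not these exact headings.
REQUIRED_SECTIONS = [
    "Task fixture, constraints, and rubric",
    "Candidates compared",
    "Observations",
    "Decision",
    "Retained artifacts and cleanup",
]
# Bias checks are listed as "when applicable", so they are optional here.
OPTIONAL_SECTIONS = ["Self-evaluation bias checks"]


def render_report(sections: dict) -> str:
    """Render report text, failing loudly if a required section is absent."""
    missing = [title for title in REQUIRED_SECTIONS if title not in sections]
    if missing:
        raise ValueError(f"missing report sections: {missing}")

    lines = ["# Benchmark Report", ""]
    for title in REQUIRED_SECTIONS + OPTIONAL_SECTIONS:
        if title in sections:
            lines += [f"## {title}", "", sections[title].strip(), ""]
    return "\n".join(lines)
```

The returned string could then be written to .planning/BENCHMARK.md by whatever file-handling step the session uses.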

Process

1. Display SILVER BULLET > BENCHMARK.
2. Define a task fixture that can be repeated without hidden context.
3. Define scoring before running candidates: correctness, requirement coverage, safety, maintainability, verification quality, cost, and latency.
4. Run each candidate under the same constraints. If external providers are not available, benchmark local approaches or prompt variants and record the limitation.
5. Use an adversarial review pass for top candidates when the result will influence architecture, release, security, or high-cost work.
6. Invoke or apply silver:domain-audit --pack benchmark-eval.
7. Normalize findings into silver:review or silver:research when the benchmark drives implementation.
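
The "define scoring before running candidates" step can be sketched as a rubric that is fixed up front and then applied identically to every candidate. The criteria names come from the process list; the weights, the 0.0 to 1.0 score scale, the `margin` threshold, and both function names are illustrative assumptions, not part of the skill.

```python
# Criteria are taken from the process list; the weights are assumed
# for illustration and should be fixed before any candidate runs.
RUBRIC_WEIGHTS = {
    "correctness": 30,
    "requirement_coverage": 20,
    "safety": 15,
    "maintainability": 10,
    "verification_quality": 10,
    "cost": 10,
    "latency": 5,
}


def weighted_score(scores: dict) -> float:
    """Combine per-criterion scores (0.0 to 1.0, higher is better) into one number."""
    missing = sorted(set(RUBRIC_WEIGHTS) - set(scores))
    if missing:
        raise ValueError(f"unscored criteria: {missing}")
    total = sum(RUBRIC_WEIGHTS[name] * scores[name] for name in RUBRIC_WEIGHTS)
    return total / sum(RUBRIC_WEIGHTS.values())


def decide(results: dict, margin: float = 0.05) -> str:
    """Return the winning candidate name, or "inconclusive" if the lead is too small."""
    ranked = sorted(
        ((weighted_score(scores), name) for name, scores in results.items()),
        reverse=True,
    )
    if len(ranked) < 2:
        return "inconclusive"  # nothing to compare against
    (best, winner), (runner_up, _) = ranked[0], ranked[1]
    return winner if best - runner_up >= margin else "inconclusive"
```

Because the weights live in one place and are set before any run, a later session can rescore the same raw evidence and check that it reaches the same decision.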

Exit Gate

A benchmark result is valid only when the fixture, rubric, raw evidence, and decision rationale are sufficient for another session to reproduce the comparison.
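
A minimal presence check for the exit gate might look like the sketch below. The `BenchmarkRecord` type and its field names are hypothetical, and the check only confirms that each of the four items exists; whether they are sufficient for another session to reproduce the comparison still needs human or reviewer judgment.

```python
from dataclasses import dataclass, field


@dataclass
class BenchmarkRecord:
    """Hypothetical container for the four items the exit gate names."""
    fixture: str = ""
    rubric: str = ""
    raw_evidence: list = field(default_factory=list)
    decision_rationale: str = ""


def exit_gate_gaps(record: BenchmarkRecord) -> list:
    """List which exit-gate items are missing; an empty list means all are present."""
    gaps = []
    if not record.fixture.strip():
        gaps.append("fixture")
    if not record.rubric.strip():
        gaps.append("rubric")
    if not record.raw_evidence:
        gaps.append("raw evidence")
    if not record.decision_rationale.strip():
        gaps.append("decision rationale")
    return gaps
```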

$ npx claudepluginhub alo-exp/silver-bullet --plugin silver-bullet
