eval-designer

Name: eval-designer
Author: tavva

Design production-quality LLM evaluations for Langfuse

npx claudepluginhub tavva/ben-claude-plugins --plugin eval-designer

Popularity

Stars

Med: 0·Avg: 285

Installs

Med: 0·Avg: 1

What's Inside

Skills (1)

eval-design

/eval-design

This skill should be used when the user asks "what evals can we create", "how do I evaluate this", "design an eval", "create evals for", "how do I know if my LLM is working", "measure quality", or mentions evals, evaluation, scoring rubrics, golden datasets, LLM-as-judge, quality metrics, or judge prompts.

README

ben-claude-plugins

My plugins for Claude Code.

Installation

/plugin install <plugin-name>@ben-claude-plugins

Available Plugins

Plugin	Description
rodney	Browser automation via the rodney CLI for web scraping, frontend verification, and page interaction
eval-designer	Design production-quality LLM evaluations for Langfuse
langfuse	Query Langfuse LLM observability platform via the lf CLI
obsidian-agent-tools	CDP tools for Obsidian plugin development and testing
readme-generator	Generate excellent README files following best practices from https://github.com/matiassingers/awesome-readme
resend	Send emails and manage domains, API keys, and templates via the Resend CLI
sprite	Manage Sprites - persistent, isolated Linux microVMs for safe code execution

Licence

MIT

Similar Plugins

promptfoo-evals

21.4k·

Teaches AI coding agents to create promptfoo eval suites with deterministic assertions, provider configs, and best practices

3mo

v0.121.3

promptfoo

DeepEval

16.3k·

Skills for adding DeepEval evaluations, tracing, datasets, Confident AI reports, and iterative improvement loops to AI applications.

v1.0.0

confident-ai

langfuse-pack

2.2k·

Claude Code skill pack for Langfuse LLM observability (24 skills)

2mo

v1.0.0

jeremylongshore

evals-skills

1.4k·2·

Skills for building LLM evaluations: pipeline audit, error analysis, synthetic data generation, LLM-as-Judge design, evaluator validation, RAG evaluation, and annotation interfaces.

v0.2.0

hamelsmu

More by tavva

sprite

0·

Manage Sprites - persistent, isolated Linux microVMs for safe code execution

4mo

v1.0.0

tavva

readme-generator

0·

Generate excellent README files following best practices from awesome-readme

6mo

v0.1.0

tavva

obsidian-agent-tools

0·

Obsidian CLI tools for plugin development, testing, and vault automation

3mo

v0.2.0

tavva

rodney

0·

Browser automation via the rodney CLI for web scraping, frontend verification, and page interaction

4mo

v0.1.0

tavva

Stats

Version: 0.1.0

Language: Shell

Stars: 0

Maintenance: Good

Last Commit: Jan 13, 2026

Added: Jan 14, 2026

Available In

ben-claude-plugins

evaluation

langfuse