Marketplace

dev-plugins

Development plugins with eval harnesses — a reference implementation

npx claudepluginhub bailejl/dev-plugins

2 Plugins

frontend-dev

React component scaffolding, a11y audits, responsive checks, refactoring, and design system compliance

v0.1.0

bailejl

ai-readiness

Assess repo and git history for AI-coding assistant readiness — audits, code review, security, testing, architecture, and API design

v0.1.0

bailejl

Stats

Plugins: 2

Updated: Mar 26, 2026

dev-plugins

A Claude Code plugin marketplace for development tooling — with built-in evaluation harnesses for each plugin.

It is designed as a reference implementation that shows how to build Claude Code plugins with eval-driven development: each plugin has its own eval suite, graders, and fixtures in the repo.

Quick Demo

# 1. Install dependencies
npm install

# 2. Set your API key (used by the eval harness; note that > overwrites any existing .env)
echo "ANTHROPIC_API_KEY=your-key-here" > .env

# 3. Run evals for one plugin and view results
npm run eval:readiness
npx promptfoo view

How Evals Work

┌──────────┐    ┌───────────┐    ┌──────────────────┐    ┌─────────┐
│   Task   │───▶│  Trial    │───▶│     Graders      │───▶│ Outcome │
│ (test    │    │  (single  │    │ • deterministic  │    │ pass@k  │
│  case in │    │  prompt-  │    │ • llm-rubric     │    │ pass^k  │
│  suite)  │    │  foo run) │    │ • transcript     │    │ scores  │
└──────────┘    └───────────┘    └──────────────────┘    └─────────┘

See BASELINE.md for current eval metrics and docs/EVAL_TAXONOMY.md for how our eval concepts map to the Anthropic "Demystifying Evals" article.
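The diagram lists pass@k and pass^k as outcomes without defining them. As a minimal sketch, assuming each task is run n times and c of those trials pass: pass@k estimates the chance that at least one of k trials passes, and pass^k estimates the chance that all k trials pass. The function names below are illustrative, and the formulas are the commonly used estimators, not necessarily what the script in eval-infra/ implements.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate P(at least one of k trials passes), given c passes in n trials."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # C(n - c, k) / C(n, k) is the chance that a random size-k subset of the
    # n trials contains no passing trial; math.comb returns 0 when k > n - c.
    return 1.0 - comb(n - c, k) / comb(n, k)


def pass_pow_k(n: int, c: int, k: int) -> float:
    """Estimate P(all k trials pass) as (c / n) ** k (simple plug-in estimate)."""
    if not 0 <= c <= n or n == 0 or k < 1:
        raise ValueError("require 0 <= c <= n, n > 0, and k >= 1")
    return (c / n) ** k


if __name__ == "__main__":
    # Example: 5 trials of one task, 2 of them passed.
    print(round(pass_at_k(5, 2, 3), 3))   # 0.9
    print(round(pass_pow_k(5, 2, 3), 3))  # 0.064
```

The two metrics pull in opposite directions as k grows: pass@k rises toward 1 (one lucky trial is enough), while pass^k falls toward 0 (every trial must succeed), so pass^k is the stricter measure of consistency.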

Plugins

frontend-dev

React component scaffolding, accessibility audits, responsive design checks, component refactoring, and design system compliance.

Commands:

/frontend-dev:scaffold-component — Scaffold a React component with props, types, tests, and story
/frontend-dev:a11y-audit — WCAG 2.1 AA compliance audit using axe-core patterns
/frontend-dev:responsive-check — Responsive design audit (media queries, viewport, touch targets)
/frontend-dev:refactor — React component refactoring (decompose, extract hooks, reduce complexity)
/frontend-dev:design-system — Design system compliance (tokens vs hardcoded values)

ai-readiness

Assess a repository and its git history for AI-coding assistant readiness — comprehensive audits covering code quality, security, testing, architecture, git health, and API design.

Commands:

/ai-readiness:full-audit — 10-section comprehensive AI readiness audit
/ai-readiness:git-health — 71 git anti-patterns with DORA-based severity scoring
/ai-readiness:code-review — 7-category weighted code review and static analysis
/ai-readiness:architecture — 6-category architecture review with SOLID principles
/ai-readiness:security — 6-category security review (OWASP, auto-fail on critical)
/ai-readiness:testing — Test quality: patterns, desiderata, pyramid analysis
/ai-readiness:api-review — 7-category API design and contract review
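Several of these commands describe weighted category scoring, and the security review adds an auto-fail on critical findings. The sketch below shows one way such a score could be combined. The category names, weights, and 0 to 100 scale are hypothetical placeholders, not the plugin's actual rubric.

```python
from typing import Dict


def overall_score(
    scores: Dict[str, float],
    weights: Dict[str, float],
    critical_findings: int = 0,
) -> float:
    """Weighted average of per-category scores, forced to 0 on any critical finding."""
    if critical_findings > 0:
        # Auto-fail: a critical finding overrides every category score.
        return 0.0
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[name] * w for name, w in weights.items()) / total_weight


if __name__ == "__main__":
    # Hypothetical categories and weights, for illustration only.
    weights = {"secrets": 3.0, "input_validation": 2.0, "dependencies": 1.0}
    scores = {"secrets": 80.0, "input_validation": 90.0, "dependencies": 60.0}
    print(overall_score(scores, weights))                       # 80.0
    print(overall_score(scores, weights, critical_findings=1))  # 0.0
```

Normalizing by the sum of the weights means the weights do not need to add up to 1, which keeps the rubric easy to adjust.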

Project Structure

dev-plugins/
├── plugins/           # What ships to users (commands, skills, agents, hooks)
│   ├── frontend-dev/
│   └── ai-readiness/
├── evals/             # Per-plugin eval suites, graders, fixtures (stays in repo)
│   ├── frontend-dev/
│   └── ai-readiness/
├── eval-infra/        # Shared eval utilities, scripts, rubric templates
└── docs/              # Contributor and learner guides

Getting Started

# Install dependencies
npm install

# Set your Anthropic API key in .env (gitignored); note that > overwrites an existing .env
echo "ANTHROPIC_API_KEY=your-key-here" > .env

Run evals

# Single plugin
npm run eval:frontend
npm run eval:readiness

# All plugins
npm run eval:all

View results

# Interactive web viewer
npx promptfoo view

# Compute pass@k metrics
python eval-infra/scripts/compute-pass-at-k.py --results evals/ai-readiness/.promptfoo/output.json --k 1 3 5

See docs/GETTING_STARTED.md for detailed setup instructions.

Tooling

Tool	Purpose
Promptfoo	Eval harness + LLM grading
ESLint	Code-based grading (lint)
Prettier	Code-based grading (format)
axe-core	Accessibility assertion engine
Vite	Test fixture builds (frontend-dev)
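To make "code-based grading" concrete, here is a minimal sketch of a deterministic grader written as a Promptfoo Python assertion. It assumes Promptfoo's Python assertion hook, which calls a function named get_assert(output, context) and accepts a dict with pass, score, and reason keys. The required headings are hypothetical and not taken from this repo's graders.

```python
from typing import Any, Dict

# Hypothetical section names an audit report is expected to contain.
REQUIRED_HEADINGS = ("Summary", "Findings", "Recommendations")


def get_assert(output: str, context: Dict[str, Any]) -> Dict[str, Any]:
    """Pass only if every required heading appears in the model output."""
    lowered = output.lower()
    missing = [h for h in REQUIRED_HEADINGS if h.lower() not in lowered]
    # Partial credit: fraction of required headings that were found.
    score = 1.0 - len(missing) / len(REQUIRED_HEADINGS)
    if missing:
        reason = "missing sections: " + ", ".join(missing)
    else:
        reason = "all required sections present"
    return {"pass": not missing, "score": score, "reason": reason}
```

Because the check is a plain substring match, it returns the same result on every run, which is what separates a deterministic grader from an llm-rubric grader.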

Documentation

Getting Started — Setup and first eval run
Eval Philosophy — Principles of eval-driven development
Eval Taxonomy — Maps Anthropic article concepts to this repo
Writing Evals — How to write test suites
Grader Guide — Grader types and implementation patterns
Adding a Plugin — Step-by-step guide for new plugins

License

MIT
