Plugins listed here are tagged for this topic and auto-indexed from public GitHub repositories.
Plugins focused on test authoring, coverage analysis, mocking, and test automation across frameworks.
Commonly supported frameworks include Jest, Vitest, pytest, Playwright, Cypress, and Go testing. Filter by technology to find plugins for your test runner.
Several include agents or commands that analyze source files and generate corresponding unit tests, including edge cases and mock setups.
Some generate CI pipeline configurations for test execution. Plugins with hooks can run tests on file save. Check each plugin's component types for automation support.
Enforce test-driven development, structured debugging, and code review workflows in Claude Code. Plan features in granular tasks, execute them in isolated worktrees with parallel subagents, and verify completeness with automated checks before merging.
Equip AI coding agents with production engineering skills to handle full development lifecycles: refine ideas into specs, implement via TDD slices, run tests and debug, perform multi-axis code reviews, optimize performance and security, automate CI/CD, and execute ship checklists.
Iteratively create custom Claude Code skills from scratch, refine existing ones via drafting and description optimization, run test evaluations, and benchmark performance with quantitative metrics and variance analysis.
Automate technical debt reduction, dependency updates, and code refactoring by scanning for vulnerabilities and code smells, generating prioritized remediation plans, and leveraging AI-powered test automation and code review.
Generate production-ready stateful CLI harnesses for GUI applications from local paths or GitHub repos, implementing Click CLI with REPL/JSON support, pytest unit/E2E tests, and docs. List installed harnesses, refine coverage gaps, run tests to verify functionality, and validate against standards.
Delegate expert-level code reviews, security audits, penetration tests, QA automation, accessibility compliance checks, performance optimizations, chaos engineering, and compliance validations to specialized sub-agents across codebases, infrastructure, and systems.
Run PluginEval certification pipeline on Claude plugins or skills to compute quality scores, badges (Platinum/Gold/Silver/Bronze), dimension breakdowns, anti-patterns, and recommendations via static analysis and LLM judging across 10 criteria including triggering, orchestration, and output quality. Compare skills head-to-head or evaluate directories for actionable insights.
Enforce strict red-green-refactor TDD cycles: generate failing tests, implement minimal passing code, then refactor while keeping tests green. Includes AI-powered code review for security and quality.
Execute marketing and growth workflows including A/B testing setup, SEO diagnostics and optimization, analytics tracking audit, programmatic SEO strategy, and automated content/email campaign creation with brand voice analysis.
Implement a complete QA and testing workflow: set up A/B tests with hard gates, automate browser testing with Playwright/Puppeteer, enforce code review checklists and TDD, debug systematically, and fix failing tests using pytest patterns.
Centers Python backend development around async patterns, FastAPI, Django, and modern tooling, providing architectural guidance, testing strategies with pytest, and production best practices for scalable APIs and services.
Control a browser from the command line for AI-driven testing and automation — navigate pages, click elements, fill forms, capture screenshots, extract data, and automate Electron apps.
Write production-grade shell scripts with defensive programming patterns, unit tests via Bats, and static analysis via ShellCheck, enabling robust CI/CD pipelines and cross-platform automation.
Manage Python projects via structured tracks for features, bugs, and refactors: initialize context artifacts like product.md and tech-stack.md, create detailed specs and phased plans, implement tasks with a strict TDD workflow using pytest coverage and git commits, monitor status, revert commits, and validate artifacts for consistency.
Build and evaluate production-grade AI agents using LangGraph, RAG systems, MCP servers, and prompt engineering patterns—with behavioral testing and reliability monitoring.
Run end-to-end browser tests and automate web interactions using Playwright, allowing Claude to click elements, fill forms, take screenshots, and scrape content locally.
Run a full academic research pipeline from idea to paper — orchestrating multi-stage writing, reference fact-checking, claim auditing against source text, and collaborative progress tracking with state dashboards and user checkpoint approvals.
Scaffold new Claude Agent SDK apps in TypeScript or Python by interactively gathering requirements, installing dependencies, and configuring projects. Verify apps post-creation or changes for SDK best practices, code quality, security, type safety, documentation, and deployment readiness.
Evaluate and improve LLM applications by instrumenting agents, chatbots, and RAG pipelines with DeepEval tracing, generating test suites, running evaluations, and exporting traces to Confident AI for observability and iterative refinement.
Drive a real Chrome browser via CDP using coordinate clicks and screenshots — no CSS selectors needed. Automate, scrape, test, or interact with web pages by controlling the user's already-running Chrome.
Generate Playwright tests from specs, migrate Cypress/Selenium suites, run on BrowserStack, sync with TestRail, analyze coverage, and diagnose flaky failures — all within your existing workflow.
Automate web tasks with natural language: drive a local Playwright browser, capture screenshots and action logs, and visually self-verify results. Also generate reusable Python CLI tools from the workflows.
Conduct full product discovery cycles in your IDE: brainstorm ideas and experiments for new/existing products from PM/designer/engineer views, identify/prioritize assumptions and features, triage requests, generate interview scripts, summarize transcripts, and design metrics dashboards.
Generate PRDs, OKRs, outcome roadmaps, user stories, job stories, sprint plans, release notes, and stakeholder maps. Run pre-mortems for risk analysis, retrospectives for team feedback, prioritization frameworks, meeting summaries, and test scenarios with dummy data to manage agile product execution workflows.
Generate SQL queries from natural language descriptions using your database schema for PostgreSQL, MySQL, or BigQuery. Analyze CSV or Excel user data to produce cohort retention heatmaps, engagement trends, churn insights, and research recommendations. Evaluate A/B tests for statistical significance, confidence intervals, lift, and ship/extend/stop decisions with Python-powered reports.
Direct AI coding agents to create or update promptfoo evaluation suites with configs, prompts, tests, deterministic assertions, and provider setups following best practices. Streamline LLM eval coverage, regression debugging, and new eval matrix generation in JavaScript or Python projects using OpenAI or Anthropic models.
Implement Trail of Bits handbook security testing workflows: fuzz Rust, Python, C/C++, and Ruby code with AFL++, libFuzzer, cargo-fuzz, and Atheris; instrument with AddressSanitizer; run static analysis via Semgrep and CodeQL; generate coverage reports and dictionaries; and bypass obstacles for vulnerability detection.
Develop and troubleshoot Ginkgo/Gomega test suites in Go, enabling spec authoring, table-driven tests, filtering, parallel execution, flake handling, async timeouts, CI setup, and JSON-based failure diagnosis.
Automatically generate production-ready unit tests from source code files or snippets in JavaScript/TypeScript (using Jest, Vitest, or Mocha), Python (pytest), Java (JUnit 5), and Go. Auto-detects frameworks, covers happy paths, edge cases, boundaries, errors, and provides mocks for robust testing.
Build multi-language code graphs to map call graphs, attack surfaces, blast radius, taint propagation, privilege boundaries, and complexity hotspots for security audits. Visualize architecture with Mermaid diagrams, compare snapshots across git commits for evolution analysis, triage mutation testing survivors, generate crypto test vectors, diagram protocols, and project SARIF findings onto graphs.
Runs Mooncake pre-PR validation locally to catch typos, formatting issues, and build and test failures before pushing, using a single shell script that mirrors the CI pipeline's reproducible checks.
Automate complex browser workflows from natural language commands — navigate websites, extract data, fill forms, and run AI-powered UI tests, all without writing code.
Generate E2E test suites in JavaScript and automate testing for iOS/Android mobile apps on simulators/emulators using Appium, Detox, XCUITest, Espresso, or Maestro. Validate UI interactions, gestures, navigation, permissions, and platform behaviors.
Orchestrate autonomous multi-agent sprints to develop full features from specs.md: agents handle architecture, parallel implementation of Next.js frontends and Python/FastAPI backends, CI/CD setup, automated testing, UI QA, reviews, and iterative convergence with structured reports and git safety.
Backtest crypto and stock trading strategies on historical data to compute performance metrics like Sharpe and Sortino ratios, maximum drawdowns, equity curves, and optimize parameters via grid search.
Generate and execute end-to-end browser tests for full user workflows spanning frontend and backend using Playwright, Cypress, or Selenium. Create tests with page objects, scenarios, and assertions, then run them to validate complete user journeys in browsers.
Streamline end-to-end Obsidian plugin development and vault management: scaffold projects with TypeScript setups, implement UI views/events/data handling, optimize performance/security, establish local dev loops/CI/CD/release pipelines, migrate content, and troubleshoot errors using 24 specialized skills.
Configure and optimize mewt/muton mutation testing campaigns by scoping targets, tuning timeouts, and streamlining long-running tests for Rust, Go, TypeScript, and JavaScript codebases.
Automated bug hunting and red-team engagement platform for web, cloud, mobile, and enterprise targets. Runs recon, vulnerability scanning, exploit chaining, and report generation across 70+ attack classes with slash commands and auto-loaded skill sets.
Automate overnight software development by configuring Git hooks for TDD enforcement with tests and lints, then run Claude autonomously for 6-8 hours to build features that pass all checks by morning.
Design, execute, and analyze load, stress, spike, soak, and endurance tests on APIs, web apps, and databases using k6, Artillery, JMeter, Locust, and autocannon. Identify bottlenecks, review metrics, and verify SLAs to optimize performance.
Automate OWASP Top 10 vulnerability scans and penetration testing on JavaScript, Python, and Java codebases using Semgrep, ESLint-security, Bandit, and dependency audits. Delegate comprehensive security audits to a specialized agent covering injections, XSS, CSRF, authentication flaws, access control, and misconfigurations.
Detect UI visual regressions by capturing screenshots of components or pages with Playwright or Cypress, comparing against baselines across responsive breakpoints, generating pixel diffs, analyzing changes, and producing markdown reports with recommendations. Integrates with Percy, Chromatic, BackstopJS, and CI workflows.
Design structured workflow skills for Claude Code using multi-step phases, decision trees, subagent delegation, and progressive disclosure for pipelines, routing, and safety gates. Audit skills via 6-phase review detecting structural issues, pattern adherence, tool correctness, and anti-patterns.
Generate OpenAPI specs and Pact consumer contracts from API code, designs, or schemas to enable consumer-driven contract testing, documentation, code generation, verification tests, and CI/CD setup.
Evaluate machine learning models using metrics like accuracy, precision, recall, and F1-score to perform performance analysis, validation, model comparison, and optimization. Generate production-ready AI/ML code that includes validation, error handling, performance metrics, saved artifacts, and documentation.
Track regression tests across code releases by mapping git commits to pytest or Jest tests, tagging markers for suites, flagging coverage gaps, generating pass/fail reports with flaky detection, viewing history, and enforcing runs in CI/CD pipelines.
Analyze test coverage reports from Jest/nyc, pytest, Go test, and JaCoCo across JavaScript, Python, Go, and Java projects to identify untested code paths, branch gaps, low-coverage files, enforce thresholds, and generate detailed reports with targeted test recommendations.
Generate test reports by parsing JUnit XML, Jest JSON, pytest results, and coverage data into Markdown/HTML formats with metrics, failures, slowest tests, trends, and CI annotations. Aggregate results across frameworks for summaries and exports in HTML, PDF, or JSON.
Audit web pages and components for WCAG 2.1/2.2 accessibility compliance using axe-core, Playwright, Pa11y, Lighthouse, and more. Detect ARIA errors, keyboard navigation issues, color contrast violations, and screen reader incompatibilities, then generate markdown reports with prioritized fixes and code examples.
Validate API responses against OpenAPI and JSON schemas to ensure contract compliance, detect schema drift, and verify data integrity. Generate JSON Schema definitions, Ajv validators, Express middleware, tests, docs, and monitoring directly in Node.js, Python, or Java workflows.
Generate fast (under 5 minutes) smoke test suites in Jest-style JavaScript for critical paths like system health, authentication, and core features, then run them post-deployment via Playwright, curl, Bash scripts, or CI/CD to verify UI, APIs, and functionality.
Run interactive penetration tests on web apps and codebases: scan HTTP security headers for CSP/HSTS issues, audit npm/pip dependencies for vulnerabilities, analyze code for secrets/injections with bandit, get severity-prioritized findings, fix suggestions, and JSON reports.
Generate k6, Artillery, wrk, or Gatling scripts for API load, stress, and soak tests to validate performance under configurable loads. Run tests locally to measure response times, throughput, error rates, scalability, and identify bottlenecks.
Verify blockchain smart contracts match specifications from whitepapers, PDFs, Markdown, or URLs, detecting implementation gaps, undocumented behaviors, logic discrepancies, and security issues via structured audits and generating compliance reports.
Implement property-based testing strategies across multiple languages and smart contracts to verify invariants like serialization roundtrips, idempotence, parsing, validation, normalization, and algorithms for stronger coverage than example-based tests.
Build Vizro dashboards end-to-end: design layouts and visualizations, implement with Python and YAML, then automate end-to-end testing with Playwright.
Orchestrate multi-agent coding workflows with context-aware task decomposition, parallel subtask execution, automated code review, and TDD test generation.
Implement production-grade .NET patterns and practices for Akka.NET actor systems, ASP.NET Core, EF Core, Blazor, .NET Aspire, testing with xUnit/Playwright/TestContainers, CI/CD optimization, API design, and performance profiling via specialized Claude Code skills and agents.
Provision and manage isolated test environments using Docker Compose and Testcontainers for databases, caches, and queues such as PostgreSQL, MySQL, Redis, and DynamoDB. Generate docker-compose files, env vars, seed data scripts, startup scripts, and cleanup code to enable reliable, reproducible testing without local setup conflicts.
Integrate Stripe payment processing, subscription management, and billing queries via an MCP server, with guided API selection, Connect platform setup, and migration support. Debug errors, test with card numbers, and provision API keys through the Stripe CLI.
Execute 175 slash commands to automate git workflows like branching/PR creation/issue syncing with Linear, code quality reviews/refactors/fixes, test generation/setup/coverage, CI/CD pipelines, security/performance audits, documentation generation, project scaffolding/setup, and deployments across JS/TS/Python/Go/Rust/Svelte stacks.
Instruct Claude Code to automate browser tasks with Playwright: it auto-detects dev servers, launches a visible browser, runs E2E tests, fills forms, takes screenshots, validates responsive design and UX, handles login flows, checks links, and generates clean scripts in /tmp.
Automate development workflows by walking through code files line-by-line in VSCode or Vim, logging timestamped work sessions with file changes in daily Markdown, generating detailed issue specs staged in Git, engaging in adaptive Socratic quizzes for learning, and delegating UI validation tasks to a browser agent using Chrome DevTools.
Automate database testing workflows by generating test suites with data factories, transaction wrappers for automatic rollback, schema validation, assertions, cleanup, fixtures, migrations, integrity checks, and performance monitoring across PostgreSQL, MySQL, MongoDB, SQLite, and Redis using Prisma, Drizzle, Jest, and pytest.
Design and execute chaos engineering experiments that inject failures such as network latency, service crashes, and resource exhaustion into Kubernetes clusters and Docker containers, validating distributed system resilience and recovery using tools like Chaos Mesh and AWS FIS.
Generate and run load tests with k6, JMeter, or Artillery to validate web app and API performance under stress, spike, soak, and scalability scenarios. Detect bottlenecks, set thresholds, and integrate into CI/CD pipelines for automated validation.
Test load balancing strategies by validating traffic distribution, health checks, failover, session persistence, and SSL on live NGINX, HAProxy, AWS ALB/NLB, GCP, and Kubernetes Ingress setups. Generate Jest test suites to verify these behaviors across backends.
Fuzz test REST and GraphQL APIs using OpenAPI specs to detect crashes, vulnerabilities, edge cases, and unexpected behaviors with tools like Schemathesis, RESTler, and OWASP ZAP. Generate test suites, security reports, and reproducible payloads for input validation and security auditing.
Run integration test suites for APIs, databases, services, queues, and files using real Dockerized dependencies without mocks. Automates full workflow: environment setup, database seeding, service orchestration, test execution with coverage reporting, and teardown cleanup. Select suites and configure environments via CLI flags.
Generate realistic, relationally consistent test data and idempotent seed scripts by analyzing database schemas, respecting foreign keys, constraints, and data types with Faker libraries for dev/test environments across JS, Python, C#, Prisma, Node, and TypeScript.
Orchestrate complex test workflows across Jest, Vitest, pytest, Playwright, and Cypress with parallel execution, test sharding, dependency management, flaky-test retries, affected test selection, and result aggregation in GitHub Actions or GitLab CI. Generate optimized configs for CI/CD pipelines.
Run mutation testing on JavaScript, Python, Java, Go, C#, or Ruby codebases to evaluate test suite quality. Introduce code mutants with tools like Stryker, mutmut, PITest, or go-mutesting, check detection rates, identify coverage gaps, and generate reports with survival scores and improvement suggestions.
Perform consumer-driven contract testing with Pact (JavaScript, Python, JVM) and OpenAPI validation for REST, GraphQL, gRPC APIs to detect breaking changes, generate tests, and produce detailed reports integrable into CI/CD pipelines.
Automate performance regression detection in CI/CD pipelines by generating test suites, baselines, thresholds, reporting, and PR integrations. Statistically compare response times, throughput, and resource usage against baselines to validate builds and spot trends early.
Generate test doubles—mocks, stubs, spies, fakes—for unit testing by analyzing code dependencies. Produces implementations, fixtures, example tests, and rationale. Works across JavaScript (Jest, Vitest, Sinon), Python (pytest, unittest.mock), Go (gomock), and more frameworks.
Generate and run mock API servers from OpenAPI specifications to simulate stateful CRUD operations with realistic Faker.js data, latency delays, error conditions, and request recording—ideal for frontend testing and backend prototyping without a live server.
Generate and execute comprehensive test suites for REST and GraphQL APIs directly from OpenAPI specs, automating request generation, schema and response validation, CRUD coverage, auth handling, error and performance checks, and idempotency tests, with reporting in Jest, pytest, Supertest, or REST-assured.
Generate realistic test data for users, products, orders, technical fields, and custom schemas to populate fixtures, factories, seeds, edge cases, and databases in JS/TS/Python/Ruby apps using Faker.js, Fishery, pytest fixtures, and factory patterns.
Scan codebases for reflected, stored, and DOM-based XSS vulnerabilities across HTML, JavaScript, CSS, and URLs. Test WAF bypass techniques and CSP protections, then receive reports on risks with remediation suggestions via commands or natural language triggers.
Create and manage snapshot tests for UI components and data using Jest, Vitest, or pytest to catch regressions. Analyze test failures with intelligent diff reviews, selectively update snapshots for intentional changes, validate and organize snapshot files, then generate detailed analysis reports.
Enforce strict TDD workflow: write minimal failing tests first for complex logic or public APIs, verify red phase failure, implement green-passing code without internal mocks, then refactor safely. Supports unit and integration tests with Jest.
Integrate SerpApi into Python and Node.js/TypeScript apps to extract structured search data from Google, Bing, YouTube, Shopping, News, and Maps. Automate setup, auth, cost-free local testing with pytest/Vitest fixtures, Redis caching, rate limiting, proxy deployment to Vercel/GCP/Fly.io, security hardening, production checklists, SEO monitoring, and legacy migrations via 18 Claude Code skills.
Build and validate LLM evaluation pipelines: design judge prompts, calibrate against human labels, generate synthetic test data, audit pipeline trustworthiness, analyze failure modes, evaluate RAG systems, and collect human annotations via a browser UI.
Write expressive, correct Gomega assertions in Go tests using Expect/Ω notation, synchronous and asynchronous (Eventually/Consistently) patterns, the full built-in matcher catalog, composite matchers for nested structs, custom matcher authoring, and sub-libraries for HTTP, streaming I/O, goroutine leak detection, and performance measurement.
Enforces Test-Driven Development by detecting the test framework, installing a reporter, and outputting test results. Blocks file writes that contain forbidden terms and enforces semantic rules via bash guards and LLM prompt checks.
Configure VSCode extensions to test APIs with httpYac including auth scripts and CI/CD workflows, monitor multiple dev server ports like Vite and Next.js in real-time, and deploy static sites via SFTP to Nginx servers with secure setups.
Use natural language to automate browser interactions—navigate, fill forms, click elements, handle multi-tab sessions—and extract structured data from websites into JSON/CSV, all via Claude through the ActionBook MCP server running locally.
Automate comprehensive project management: audit health and permissions, generate architecture and user docs and roadmaps, handle git workflows, PRs, and releases, test UX, onboarding, and responsiveness via browser, consult multiple AI models, and post team updates with feedback triage.
Build, debug, test, deploy, and optimize browser-based Node.js apps and IDEs using StackBlitz SDK and WebContainers: boot in-browser environments, embed interactive playgrounds in docs, fix common errors, run Playwright CI tests, deploy to Vercel/Netlify with production headers, and tune performance/security.
Test web apps for cross-browser compatibility using Playwright locally across Chromium, Firefox, WebKit, and mobile viewports, or on real devices via BrowserStack, Sauce Labs, LambdaTest, or Kobiton. Run interactive tests, scan JS/CSS risks, and generate reports with browser matrices.
Audit local repositories for code health by analyzing complexity metrics, git churn, and test coverage gaps. Generate detailed reports with overviews, critical issues, warnings, prioritized refactor recommendations, and actionable steps to address technical debt hotspots.
Enforce TDD and SDD workflows with AI-driven agents that scaffold projects, generate tests, debug failures, reverse-engineer docs, and orchestrate batch task execution from design plans.
Delegate specialized AI agents to automate code reviews on git diffs, security audits for APIs and auth per OWASP, debugging of errors and incidents, test generation with Jest/pytest, performance profiling, and quality assurance across dev workflows.
Manage React Native Storybook stories using Component Story Format (CSF) — set up Storybook v10 in Expo, CLI, or Re.Pack projects, upgrade across major versions, and connect to a local Storybook instance for accessing UI components and documentation in Claude Code.
Verifies suspected security bugs by tracing data flows, assessing exploitability, and issuing TRUE POSITIVE or FALSE POSITIVE verdicts with evidence, then optionally generating proof-of-concept exploits.