By testland
Production-side QA per ISTQB-canonical shift-right ('a test approach to test a system continuously in production'): 3 skills (synthetic-monitor-author, prod-canary-validator, feature-flag-experiment-validator) and 1 agent (observability-to-test).
Coordinates a release that runs a canary deploy and a feature-flag A/B experiment simultaneously - audits the user-assignment overlap to detect canary cohort contamination of the experiment split, sequences the two validators (prod-canary-validator then feature-flag-experiment-validator), and reconciles their verdicts into a single promote/hold/rollback decision. Use when a team ships a canary deploy and an active A/B experiment at the same time and needs to confirm the two cohort splits are statistically independent before trusting either verdict.
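The overlap audit described above can be illustrated with a chi-square test of independence on a 2x2 table of user counts (canary vs. baseline deploy, crossed with control vs. treatment variant). This is a minimal sketch, not the plugin's actual implementation: the function name, the cohort labels, and the `alpha` default are assumptions made for illustration.

```python
import math


def cohort_independence(counts, alpha=0.01):
    """Chi-square test of independence for a 2x2 assignment table.

    `counts` maps (deploy, variant) -> number of users, with deploy in
    {"canary", "baseline"} and variant in {"control", "treatment"}.
    These labels are hypothetical, chosen for this sketch.
    Returns (chi2, p_value, independent).
    """
    a = counts[("canary", "control")]
    b = counts[("canary", "treatment")]
    c = counts[("baseline", "control")]
    d = counts[("baseline", "treatment")]
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every deploy cohort and variant needs at least one user")
    # Pearson chi-square statistic for a 2x2 table, no continuity correction.
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    # A small p-value suggests canary membership and variant assignment are linked.
    return chi2, p_value, p_value >= alpha


balanced = {
    ("canary", "control"): 500, ("canary", "treatment"): 500,
    ("baseline", "control"): 4500, ("baseline", "treatment"): 4500,
}
skewed = {
    ("canary", "control"): 700, ("canary", "treatment"): 300,
    ("baseline", "control"): 4300, ("baseline", "treatment"): 4700,
}
```

A failed independence check would argue for holding both verdicts until the assignment logic is fixed, since neither the canary comparison nor the experiment result can be read in isolation.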
Closes the loop between production observability signals and the test suite - reads a synthetic-monitor failure / Sentry error / Datadog incident / log alert, isolates the failing condition (input + state + system version), proposes the regression test that would have caught it (unit + integration + E2E layers per the test pyramid), and emits a PR adding the test plus the bug-repro package. Use after every production-side incident - converts "we caught it in prod" into "we'll catch it earlier next time."
Validates the statistical significance of an A/B / feature-flag experiment result - computes per-metric effect size + p-value (chi-square for proportions, Welch's t-test for continuous metrics), applies a multiple-comparison correction (Bonferroni / Benjamini-Hochberg) when more than one metric is tested, surfaces the practical-vs-statistical-significance distinction, and emits a ship/don't-ship verdict per metric. Use to keep PMs / engineers from "shipping the winning variant" based on under-powered or multiple-tested results - the rigorous version of "the variant looks better in the dashboard."
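To make the multiple-comparison step concrete, here is a small sketch of the two corrections named above, applied to a list of per-metric p-values. The function names and the example p-values are illustrative assumptions, not the skill's own code or data.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the false discovery rate.

    Returns a list of booleans (True = reject the null) in the input order.
    """
    m = len(p_values)
    # Indices of the p-values, smallest p-value first.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for four experiment metrics.
p_values = [0.001, 0.02, 0.03, 0.2]
bonf = bonferroni(p_values)
bh = benjamini_hochberg(p_values)
```

Bonferroni is the more conservative of the two: on these example values it keeps only the first metric, while Benjamini-Hochberg keeps the first three.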
Builds a canary-validation workflow that compares a canary deploy's metrics against the baseline (current main) - picks the metric set (error rate, p50/p95/p99 latency, business KPIs like checkout-completion), defines per-metric thresholds (absolute + relative-to-baseline), runs a statistical-comparison check (effect size + significance) over the canary's observation window, and emits a promote/rollback verdict. Use as the gate between canary deploy and full rollout - the deterministic version of "the on-call eyeballs the dashboard for 30 min."
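The per-metric threshold idea (an absolute limit plus a limit relative to baseline) might look roughly like the sketch below. It covers only lower-is-better metrics such as error rate and latency, and it leaves out the statistical-comparison step. The `Threshold` class, the metric names, and the limits are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    max_absolute: float           # canary value must not exceed this
    max_relative_increase: float  # allowed fractional rise over baseline (0.15 = +15%)


def evaluate_canary(baseline, canary, thresholds):
    """Return ("promote" | "rollback", breaches) for lower-is-better metrics."""
    breaches = []
    for metric, limit in thresholds.items():
        base, can = baseline[metric], canary[metric]
        if can > limit.max_absolute:
            breaches.append(f"{metric}: {can} exceeds absolute limit {limit.max_absolute}")
        # The relative check is skipped when the baseline is zero to avoid dividing by zero.
        if base > 0 and (can - base) / base > limit.max_relative_increase:
            breaches.append(
                f"{metric}: +{(can - base) / base:.0%} over baseline exceeds "
                f"+{limit.max_relative_increase:.0%}"
            )
    return ("rollback" if breaches else "promote"), breaches


# Hypothetical observation-window aggregates and limits.
baseline = {"error_rate": 0.010, "p95_latency_ms": 420}
canary = {"error_rate": 0.011, "p95_latency_ms": 520}
thresholds = {
    "error_rate": Threshold(max_absolute=0.02, max_relative_increase=0.25),
    "p95_latency_ms": Threshold(max_absolute=800, max_relative_increase=0.15),
}
verdict, breaches = evaluate_canary(baseline, canary, thresholds)
```

Combining both kinds of limit catches a canary that stays under the hard ceiling but still regresses noticeably against what users currently experience.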
Reads Real User Monitoring data (Datadog RUM, Sentry Performance, GA4 Core Web Vitals / CrUX) to identify high-traffic user journeys that have no synthetic monitor coverage: ranks journeys by session volume times business value, diffs the ranked list against existing synthetic monitors, and emits a prioritized gap list ready to feed into synthetic-monitor-author. Use when an observability stack has RUM instrumented but the team suspects synthetic coverage is sparse, biased toward low-traffic paths, or was never systematically derived from real usage data.
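The ranking-and-diff step can be sketched in a few lines. The record shape (`name`, `sessions`, `business_value`) and the sample journeys are assumptions for illustration; real RUM exports will differ by vendor.

```python
def coverage_gaps(journeys, monitored):
    """Rank journeys by sessions * business_value and drop the monitored ones.

    `journeys` is a list of dicts with keys name, sessions, business_value
    (a hypothetical shape); `monitored` is a set of journey names that
    already have a synthetic monitor. Returns [(name, score), ...],
    highest score first, with ties broken by name.
    """
    scored = [(j["sessions"] * j["business_value"], j["name"]) for j in journeys]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(name, score) for score, name in scored if name not in monitored]


# Hypothetical RUM summary and existing monitor inventory.
journeys = [
    {"name": "checkout", "sessions": 12000, "business_value": 5},
    {"name": "search", "sessions": 50000, "business_value": 1},
    {"name": "login", "sessions": 30000, "business_value": 3},
    {"name": "profile", "sessions": 2000, "business_value": 1},
]
monitored = {"login", "profile"}
gaps = coverage_gaps(journeys, monitored)
```

Here `gaps` lists `checkout` ahead of `search`: it has fewer sessions but a higher weighted score, which is the point of multiplying volume by business value.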
Drafts a synthetic monitor configuration for one critical user journey - picks the platform (Datadog Synthetics, Pingdom, Checkly, New Relic, etc.), authors the scripted-transaction body (Playwright-style for browser checks; HTTP-step for API checks), wires the cadence (typical 1-15 min), defines per-step assertions (DOM presence, API status, response shape) and aggregate alert thresholds (consecutive-failure count + on-call routing). Use when a critical journey needs continuous-in-production verification per ISTQB-canonical shift-right ("a test approach to test a system continuously in production").
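As a platform-neutral illustration of per-step assertions and a consecutive-failure alert threshold, the sketch below evaluates one API step and decides whether a run history should page on-call. It does not use any vendor's API; the function names, parameters, and defaults are hypothetical.

```python
def assert_step(status, body, expected_status=200, required_keys=()):
    """Check one API step: HTTP status plus presence of required response keys.

    Returns a list of failure messages; an empty list means the step passed.
    """
    failures = []
    if status != expected_status:
        failures.append(f"expected status {expected_status}, got {status}")
    missing = [key for key in required_keys if key not in body]
    if missing:
        failures.append(f"response missing keys: {', '.join(missing)}")
    return failures


def should_alert(run_results, consecutive_failures=3):
    """Alert only when the most recent runs form a long enough failure streak.

    `run_results` is ordered oldest first; True means the run passed.
    """
    streak = 0
    for passed in run_results:
        # A passing run resets the streak; a failing run extends it.
        streak = 0 if passed else streak + 1
    return streak >= consecutive_failures


# Hypothetical step outcome and run history.
step_failures = assert_step(200, {"order_id": "A1"}, required_keys=("order_id", "total"))
page_on_call = should_alert([True, False, False, False])
```

Requiring several failures in a row before alerting trades a few minutes of detection time for fewer pages caused by a single transient blip.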
Uses power tools
Uses Bash, Write, or Edit tools
A rigorously curated quality-engineering plugin marketplace for Claude Code. 77 plugins, 695 components, every one rating-gated before merge.
d6 floor · docs/REVIEWER_TRAINING.md · See Quality bar and docs/REVIEWER_CHECKLIST.md.
The marketplace ships three kinds of building block:
- Plugins (for example qa-api-testing, qa-load-testing). You install only the plugins your stack needs.
- Skills (for example great-expectations, oauth-flow-test-author). Claude loads a skill when your request matches its trigger; you can also ask for it by name.
- Agents (for example, schema-diff-reviewer reviews a migration diff and returns a findings table). An agent may preload one or more skills to do its work.

Installed components stay dormant until a matching task comes up, so adding a plugin doesn't add noise; it adds capability that activates on demand.
/plugin marketplace add testland/qa
/plugin install <plugin-name>@testland-qa
For example:
/plugin install qa-data-quality@testland-qa
/plugin marketplace add https://github.com/testland/qa
git clone https://github.com/testland/qa ~/.claude/marketplaces/testland-qa
Before you install: plugins run inside your Claude Code session and ship agent instructions and tool wrappers. Anthropic doesn't vet marketplace contents — review a plugin's components before installing it into a sensitive project. Every component here is rating-gated (see Quality bar), but you remain in control of what runs.
New to the marketplace? Install one or two plugins for your role rather than everything — components activate on demand, so a focused set keeps things sharp.
| If you're a… | Try first |
|---|---|
| Manual / exploratory tester | qa-manual-testing · qa-bdd · qa-bug-repro |
| Test automation engineer | qa-web-e2e · qa-api-testing · qa-unit-tests-js |
| Performance engineer | qa-load-testing · qa-chaos-resilience |
| Security tester | qa-sast · qa-secrets · qa-dast |
| Lead / manager / head of quality | qa-roles · qa-test-management · qa-process |
The full catalog is below; for versions and component counts see
CATALOG.md.
Once a plugin is installed, its skills and agents are available to Claude
Code — invoke them by describing the task in plain language. Example with
qa-data-quality:
/plugin install qa-data-quality@testland-qa
- The great-expectations skill scaffolds an ExpectationSuite + Checkpoint and wires the results into a CI gate.
- The schema-diff-reviewer agent returns a Critical / Warning / Info findings table covering breaking-vs-additive changes and downstream impact.

Each plugin's README.md lists its skills and agents and what each one does.
npx claudepluginhub testland/qa --plugin qa-shift-right

Visual regression testing: 7 skills (percy-visual-regression-testing, chromatic-visual-regression-testing, playwright-snapshots, storybook-visual-regression-testing, responsive-breakpoint-runner, visual-baseline-conventions, visual-baseline-gate) and 2 agents (visual-diff-classifier, visual-baseline-curator).
Contract testing for microservices: 5 skills (pact-contract-testing, openapi-contract-diff, graphql-schema-regression, protobuf-compat-checking, contract-compatibility-gate) and 2 agents (contract-drift-investigator, contract-test-scaffolder).
Flake triage: 2 skills (flaky-test-quarantine, flake-pattern-reference) and 5 agents (e2e-flake-bisector, parallel-isolation-checker, regression-bisector, ai-flake-detector, e2e-test-trend-reporter).
Bug reproduction workflow: 1 skill (bug-report-template) and 8 agents (bug-report-from-recording, bug-repro-builder, crash-stack-trace-analyzer, defect-clusterer, defect-trend-narrator, escape-defect-analyzer, failure-classifier, test-failure-debugger).
Data quality testing for analytical pipelines: 5 skills (dbt-testing, great-expectations, soda-checks, data-quality-gate, data-quality-conventions) and 2 agents (schema-diff-reviewer, data-anomaly-triager).