From ab-testing
Design a rigorous A/B experiment with hypothesis, metrics, guardrails, and a full experiment plan. Use when starting a new experiment or formalizing an existing idea.
Slash command
/ab-testing:design-experiment
Help the user design a rigorous A/B experiment. If they provide a description of what they want to test as `$ARGUMENTS`, use that as the starting point. Otherwise, ask what product change or feature they want to test.
Clarify the change — Understand what is being changed and why. Ask follow-up questions if the change is ambiguous.
Produce a structured experiment design document in Markdown with these sections:
Experiment Name: A short descriptive name
Hypothesis: Format as: "If [change], then [metric] will [direction] by [expected magnitude] because [mechanism]."
Primary Metric: The single metric that will decide ship/no-ship. Define it precisely (numerator, denominator, time window).
Secondary Metrics: Additional metrics to monitor for deeper understanding. List 2-5.
Guardrail Metrics: Metrics that must NOT degrade. Typical guardrails include:
Randomization Unit: User, session, device, etc. Explain why this unit is appropriate.
Target Population: Who is included/excluded from the experiment and why.
Traffic Allocation: Recommend a split (e.g., 50/50) or ramped rollout schedule. Justify the choice.
Expected Duration: Estimate based on the primary metric's baseline rate and the minimum detectable effect. If the user hasn't provided baseline numbers, ask for them or note what assumptions you're making.
Interaction Risks: Flag potential conflicts with other experiments or features.
Decision Criteria:
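As a rough illustration of the Expected Duration item above, the estimate can be sketched from the baseline rate and the minimum detectable effect. This is a minimal sketch, not the plugin's own method: it assumes a conversion-rate (proportion) primary metric, a two-sided two-proportion z-test, and equal-sized arms. The function names, the default `alpha` and `power`, and the traffic parameters are all hypothetical choices made for the example.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(baseline: float, relative_mde: float,
                        alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate users needed per arm (normal approximation).

    baseline     -- control conversion rate, e.g. 0.10 (assumed input)
    relative_mde -- minimum detectable effect as a relative lift, e.g. 0.05
    """
    p1 = baseline
    p2 = baseline * (1 + relative_mde)              # rate under the alternative
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)            # quantile for desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)        # sum of per-arm variances
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def duration_days(n_per_arm: int, daily_eligible_users: int,
                  traffic_fraction: float = 1.0, arms: int = 2) -> int:
    """Days needed to enroll all arms at the given traffic allocation."""
    daily_in_experiment = daily_eligible_users * traffic_fraction
    return math.ceil(n_per_arm * arms / daily_in_experiment)


if __name__ == "__main__":
    # Hypothetical inputs: 10% baseline, 5% relative lift, 20,000 users/day.
    n = sample_size_per_arm(baseline=0.10, relative_mde=0.05)
    days = duration_days(n, daily_eligible_users=20_000)
    print(f"{n} users per arm, about {days} days at full traffic")
```

In practice the result is often rounded up to whole weeks so that each day of the week is represented equally. If the user has not supplied a baseline rate or daily traffic, those inputs are the assumptions to call out in the design document.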
/ab-testing:sample-size)
npx claudepluginhub weisberg/agile_agentic_analytics --plugin ab-testing
Designs controlled experiments (A/B, multivariate, quasi) with hypothesis, success metrics, sample size, and statistical power. For validating features via /design-experiment or phrases like 'design experiment'.
Designs complete A/B test plans from hypotheses, including structured hypothesis, primary/guardrail metrics, variants, sample size, duration, success criteria, and risks.
A/B test design — produce an experiment spec with hypothesis, primary metric, MDE, sample size, run time, and decision rule. Also determines when NOT to A/B test and what to do instead. Use when asked to "design an A/B test", "should we test this", "experiment design", "how do we know if this works", "what's the sample size", or "set up an experiment".