From ab-testing
Calculate required sample size and experiment duration for an A/B test. Provide baseline rate and minimum detectable effect, or answer interactively.
Slash command
/ab-testing:sample-size
Calculate the required sample size for an A/B test. Parse any parameters from `$ARGUMENTS` (e.g., "baseline 5% mde 10% relative") or collect them interactively.
Gather these from the user (use defaults where noted):
| Parameter | Description | Default |
|---|---|---|
| Baseline rate | Current conversion rate OR baseline mean + std dev for continuous metrics | required |
| MDE | Minimum detectable effect (absolute or relative) | required |
| Significance level (alpha) | Type I error rate | 0.05 |
| Statistical power (1 - beta) | Probability of detecting a true effect | 0.80 |
| Number of variants | Including control | 2 |
| Sidedness | One-sided or two-sided test | Two-sided |
| Daily traffic | Visitors per day (optional, to estimate duration) | optional |
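As a rough illustration of how the parameters in the table above could be held and pre-filled from `$ARGUMENTS`, here is a minimal sketch. The `SampleSizeParams` class, the `parse_arguments` helper, and the accepted text format are all hypothetical; the sketch only handles conversion-rate inputs written as percentages, and it assumes the MDE is relative unless the word "absolute" appears.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SampleSizeParams:
    """Hypothetical container for the inputs listed in the parameter table."""
    baseline: Optional[float] = None      # required; stored as a fraction (0.05 = 5%)
    mde: Optional[float] = None           # required; stored as a fraction
    mde_is_relative: bool = True          # assumption: relative unless "absolute" is given
    alpha: float = 0.05                   # default from the table
    power: float = 0.80                   # default from the table
    num_variants: int = 2                 # default from the table (includes control)
    two_sided: bool = True                # default from the table
    daily_traffic: Optional[int] = None   # optional

    def missing(self) -> List[str]:
        """Names of required parameters that still need to be asked for."""
        return [name for name in ("baseline", "mde") if getattr(self, name) is None]


def parse_arguments(text: str) -> SampleSizeParams:
    """Pull 'baseline N%' and 'mde N%' out of free text; leave the rest at defaults."""
    params = SampleSizeParams()
    for key in ("baseline", "mde"):
        match = re.search(rf"\b{key}\s+(\d+(?:\.\d+)?)\s*%", text, re.IGNORECASE)
        if match:
            setattr(params, key, float(match.group(1)) / 100)
    if re.search(r"\babsolute\b", text, re.IGNORECASE):
        params.mde_is_relative = False
    return params
```

Calling `parse_arguments("baseline 5% mde 10% relative").missing()` returns an empty list, while an empty argument string leaves both required fields missing, which is the cue to collect them interactively.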
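For conversion rates (proportions), where p1 is the baseline rate and p2 is the rate with the MDE applied:
`n = (Z_alpha + Z_beta)^2 * (p1(1-p1) + p2(1-p2)) / (p2 - p1)^2`
For continuous metrics, where sigma is the standard deviation and delta is the absolute MDE:
`n = (Z_alpha + Z_beta)^2 * 2 * sigma^2 / delta^2`
In both formulas, n is the sample size per variant.
If daily traffic was provided:
`Duration = sample_size_per_variant * num_variants / daily_traffic`
Show sample size at multiple MDE levels to help the user choose: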
| Relative MDE | Absolute MDE | Sample Size per Variant | Duration (days) |
|---|---|---|---|
| 1% | ... | ... | ... |
| 2% | ... | ... | ... |
| 5% | ... | ... | ... |
| 10% | ... | ... | ... |
| 20% | ... | ... | ... |
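The formulas and the MDE table above can be sketched in plain Python as follows. This is an illustrative implementation under stated assumptions, not the plugin's own code: the function names are made up for this example, `Z_alpha` is taken as the normal quantile at `1 - alpha/2` for a two-sided test (`1 - alpha` for one-sided), `Z_beta` as the quantile at the target power, and results are rounded up to whole users and whole days. It uses only the standard library.

```python
import math
from statistics import NormalDist


def z_scores(alpha=0.05, power=0.80, two_sided=True):
    """Return (Z_alpha, Z_beta) from the standard normal distribution."""
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / 2) if two_sided else norm.inv_cdf(1 - alpha)
    z_beta = norm.inv_cdf(power)
    return z_alpha, z_beta


def n_per_variant_proportion(p1, p2, alpha=0.05, power=0.80, two_sided=True):
    """Sample size per variant for a difference between two conversion rates."""
    z_alpha, z_beta = z_scores(alpha, power, two_sided)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance_sum / (p2 - p1) ** 2)


def n_per_variant_continuous(sigma, delta, alpha=0.05, power=0.80, two_sided=True):
    """Sample size per variant for a difference in means (equal variances)."""
    z_alpha, z_beta = z_scores(alpha, power, two_sided)
    return math.ceil((z_alpha + z_beta) ** 2 * 2 * sigma ** 2 / delta ** 2)


def duration_days(n_per_variant, num_variants, daily_traffic):
    """Days needed to fill every variant, assuming all daily traffic enters the test."""
    return math.ceil(n_per_variant * num_variants / daily_traffic)


def mde_table(baseline, daily_traffic=None, num_variants=2,
              relative_mdes=(0.01, 0.02, 0.05, 0.10, 0.20), **kwargs):
    """Rows of (relative MDE, absolute MDE, n per variant, duration in days or None)."""
    rows = []
    for rel in relative_mdes:
        p2 = baseline * (1 + rel)
        n = n_per_variant_proportion(baseline, p2, **kwargs)
        days = duration_days(n, num_variants, daily_traffic) if daily_traffic else None
        rows.append((rel, p2 - baseline, n, days))
    return rows


if __name__ == "__main__":
    for rel, absolute, n, days in mde_table(0.05, daily_traffic=10_000):
        print(f"{rel:>5.0%} | {absolute:.4f} | {n:>9,} | {days}")
```

Because the required sample size scales with the inverse square of the effect, halving the MDE roughly quadruples the sample size, which is why showing several MDE levels side by side helps with the trade-off.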
Flag if applicable:
Generate a Python snippet using statsmodels.stats.power or manual calculation that reproduces the result:
from statsmodels.stats.power import NormalIndPower, TTestIndPower
# or manual calculation with scipy.stats.norm
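A fuller version of such a snippet might look like the sketch below. It is one possible shape rather than the skill's required output, and the wrapper function names are illustrative. It assumes `statsmodels` is installed, that the two groups are the same size (`ratio=1.0`), and that a one-sided test looks for an increase. Note that `proportion_effectsize` uses Cohen's h (an arcsine transform), so the proportion result can differ slightly from the manual formula.

```python
import math

import numpy as np
from statsmodels.stats.power import NormalIndPower, TTestIndPower
from statsmodels.stats.proportion import proportion_effectsize


def _ceil_scalar(value):
    """solve_power may return a float or a 1-element array; normalise, then round up."""
    return math.ceil(float(np.asarray(value).ravel()[0]))


def n_proportion_statsmodels(p1, p2, alpha=0.05, power=0.80, two_sided=True):
    """Sample size per variant for two proportions via a normal-approximation power solver."""
    effect_size = abs(proportion_effectsize(p2, p1))  # Cohen's h
    n = NormalIndPower().solve_power(
        effect_size=effect_size,
        alpha=alpha,
        power=power,
        ratio=1.0,
        alternative="two-sided" if two_sided else "larger",
    )
    return _ceil_scalar(n)


def n_continuous_statsmodels(sigma, delta, alpha=0.05, power=0.80, two_sided=True):
    """Sample size per variant for a difference in means via a two-sample t-test solver."""
    effect_size = abs(delta) / sigma  # Cohen's d
    n = TTestIndPower().solve_power(
        effect_size=effect_size,
        alpha=alpha,
        power=power,
        ratio=1.0,
        alternative="two-sided" if two_sided else "larger",
    )
    return _ceil_scalar(n)


if __name__ == "__main__":
    print("proportion:", n_proportion_statsmodels(0.05, 0.055))
    print("continuous:", n_continuous_statsmodels(sigma=10, delta=1))
```

Comparing these numbers with the manual calculation is a quick sanity check: the two approaches should agree to within a small percentage for typical conversion rates.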
npx claudepluginhub weisberg/agile_agentic_analytics --plugin ab-testing
Designs statistically rigorous A/B tests and interprets experiment results with ship/iterate/kill recommendations. Calculates sample size, run time, and flags design risks.
Analyzes A/B test results for statistical significance, sample size validation, confidence intervals, lift, guardrails, and ship/extend/stop recommendations. Handles CSV/Excel data via Python scripts.
Designs statistically rigorous A/B tests with hypothesis, sample size, duration, and results interpretation guide. Activates on experiment design or test setup requests.