From pm-data-analytics
Analyzes A/B test results for statistical significance, sample size validation, effect sizes, and ship/extend/stop recommendations.
Slash command
/pm-data-analytics:analyze-test <test results as data, screenshot, or description>
# /analyze-test -- A/B Test Analysis

Evaluate experiment results with statistical rigor and translate findings into a clear product decision: ship, extend, or stop.

## Invocation

/analyze-test Control: 4.2% conversion (n=5000), Variant: 4.8% conversion (n=5100)
/analyze-test [upload a CSV of test results]
/analyze-test [screenshot from your experimentation platform]

## Workflow

### Step 1: Accept Test Data

Accept in any format:
- Summary statistics (conversion rates, sample sizes per variant)
- Raw event data (CSV with user_id, variant, converted, timestamp)
- Screenshot from an experimentation platform (Optimizely, LaunchDarkly, etc.)
- Description of the experiment and results

### Step 2: Validate Test Design

Before analyzing results, check:
- Was sample size suffic...

Flag issues if found — results from a flawed test can be misleading.
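
The report template's Sample Size Check asks for the required sample per variant at 80% power. A minimal sketch of that check follows, using only the Python standard library and the textbook normal-approximation formula for a two-sided two-proportion z-test. The function name `required_sample_per_variant` is hypothetical, and the example treats the 4.2% to 4.8% difference from the invocation example as the minimum detectable effect purely for illustration; in practice the MDE should be fixed before the test runs.

```python
from math import ceil, sqrt
from statistics import NormalDist


def required_sample_per_variant(p_control, p_variant, alpha=0.05, power=0.80):
    """Approximate per-variant n for a two-sided two-proportion z-test.

    Hypothetical helper: normal approximation, equal allocation assumed.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for target power
    p_bar = (p_control + p_variant) / 2            # average rate under H0
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_control * (1 - p_control) + p_variant * (1 - p_variant))
    ) ** 2
    return ceil(numerator / (p_variant - p_control) ** 2)


# Assumption: the observed rates stand in for baseline and MDE.
required = required_sample_per_variant(0.042, 0.048)
actual = min(5000, 5100)  # the smaller arm limits power
verdict = "Sufficiently powered" if actual >= required else "Underpowered"
print(f"Required per variant: {required}, actual: {actual}, verdict: {verdict}")
```

Under these assumptions the required sample is several times larger than either arm in the example, so a non-significant result there would point toward "extend" rather than "stop".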
Apply the ab-test-analysis skill:
## A/B Test Analysis: [Test Name]
**Date**: [today]
**Test duration**: [X days/weeks]
**Total sample**: [N users]
### Results Summary
| Variant | Sample | Metric | Rate | 95% CI |
|---------|--------|--------|------|--------|
| Control | [n] | [metric] | [X%] | [X% - Y%] |
| Variant | [n] | [metric] | [X%] | [X% - Y%] |
### Statistical Analysis
- **Relative lift**: [+X%] ([CI range])
- **P-value**: [X]
- **Statistically significant**: [Yes/No] at 95% confidence
- **Minimum detectable effect**: [X%] (what the test was powered to detect)
### Sample Size Check
- **Required sample**: [N] per variant (for [X%] MDE at 80% power)
- **Actual sample**: [N] per variant
- **Verdict**: [Sufficiently powered / Underpowered / Overpowered]
### Decision
**Recommendation: [SHIP / EXTEND / STOP]**
[Clear explanation of why, considering both statistical and practical significance]
### Business Impact Estimate
If shipped to 100% of users:
- **Expected impact**: [metric change per month/quarter]
- **Revenue impact**: [if applicable]
- **Confidence**: [How certain we are about this estimate]
### Caveats
- [Any concerns about the test validity]
- [Segments where results differ]
- [Novelty effects or other biases to consider]
### Follow-Up
- [What to test next based on learnings]
- [Monitoring plan if shipping the variant]
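
The Results Summary and Statistical Analysis fields in the template above can be filled from summary statistics alone. The sketch below is one way to do it, assuming a pooled two-proportion z-test for the p-value and Wald (normal-approximation) intervals for the confidence intervals; the source does not say which test the ab-test-analysis skill uses, so these are illustrative choices. The function name `analyze_two_proportions` and the result keys are hypothetical, and the inputs are the numbers from the invocation example.

```python
from math import sqrt
from statistics import NormalDist


def analyze_two_proportions(p_c, n_c, p_v, n_v, confidence=0.95):
    """Hypothetical helper: z-test and Wald intervals for two conversion rates."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - (1 - confidence) / 2)

    def wald_ci(p, n):
        # Normal-approximation interval for a single proportion.
        se = sqrt(p * (1 - p) / n)
        return (p - z_crit * se, p + z_crit * se)

    # Pooled standard error is used for the hypothesis test (H0: equal rates).
    pooled = (p_c * n_c + p_v * n_v) / (n_c + n_v)
    se_pooled = sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_v))
    z = (p_v - p_c) / se_pooled
    p_value = 2 * (1 - norm.cdf(abs(z)))  # two-sided

    # Unpooled standard error is used for the interval on the difference.
    diff = p_v - p_c
    se_diff = sqrt(p_c * (1 - p_c) / n_c + p_v * (1 - p_v) / n_v)

    return {
        "control_ci": wald_ci(p_c, n_c),
        "variant_ci": wald_ci(p_v, n_v),
        "relative_lift": diff / p_c,
        "diff_ci": (diff - z_crit * se_diff, diff + z_crit * se_diff),
        "z": z,
        "p_value": p_value,
        "significant": p_value < (1 - confidence),
    }


result = analyze_two_proportions(0.042, 5000, 0.048, 5100)
print(f"Relative lift: {result['relative_lift']:+.1%}")
print(f"P-value: {result['p_value']:.3f}")
print(f"Statistically significant: {'Yes' if result['significant'] else 'No'}")
```

With these inputs the interval on the absolute difference spans zero, so the lift is not significant at 95% confidence even though the relative lift looks large. That is the situation where practical and statistical significance need to be weighed together in the Decision section.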
npx claudepluginhub gaoflyx-ux/rpg-0.1 --plugin pm-data-analytics

- /analyze-test: Analyzes A/B test results for statistical significance, sample size validation, effect sizes, and ship/extend/stop recommendations.
- /design-experiment: Plans an A/B test or experiment: restates hypothesis, computes sample size with power analysis, recommends duration, and flags risks like peeking and SRM.
- /design-experiment: Design statistically sound A/B test with clear hypothesis and success criteria.
- /experiment: Designs an A/B experiment from a hypothesis or design change, producing a document with structured hypothesis, variants, metrics, sample size, duration, user flows, and analysis plan.
- /experiment-design: Designs a rigorous A/B experiment plan for a feature-flagged change, producing hypothesis, metrics, sample size, and decision thresholds. Optionally creates the experiment in PostHog when connected.
- /ab-setup: Guides users through interactive A/B test planning: collects scope, element type, goal, and traffic level, then generates a structured test plan with hypothesis and design.