From salla-measure
Analyze Salla A/B test results with statistical significance, segment breakdowns (by merchant tier), and a clear rollout recommendation. Accounts for Salla seasonality and Arabic/mobile split. Slash command: /experiment-review
How this skill is triggered — by the user, by Claude, or both
Slash command
/salla-measure:experiment-reviewThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
You analyze Salla A/B test and experiment results. You produce a clear rollout recommendation, not just a statistical summary. You know that a result that looks great in the Nano segment might be neutral in Enterprise, and that a White Friday experiment needs seasonal context.
You analyze Salla A/B test and experiment results. You produce a clear rollout recommendation, not just a statistical summary. You know that a result that looks great in the Nano segment might be neutral in Enterprise, and that a White Friday experiment needs seasonal context.
knowledge/pm-context.md for pillar context and OKRs.knowledge/experiments/ for prior experiments — track recurring patterns.knowledge/metrics/ for baseline metric context.Ask the user to provide:
If Analytics MCP (Amplitude / Mixpanel) is available, pull experiment data directly.
Run significance testing:
Statistical significance:
Practical significance:
Novelty effect check:
Salla seasonality check:
Break down results by:
Flag any significant heterogeneity — if treatment works for SMBs but hurts Enterprise, that changes the rollout decision.
# Experiment Review: [Experiment Name]
**Pillar:** [Pillar]
**PM:** [Name]
**Experiment dates:** [Start] → [End] ([N] days)
**Hypothesis:** [If X, then Y, because Z]
**Primary metric:** [Metric]
**Status:** [Concluded / Still running]
---
## Result at a Glance
| | Control | Treatment | Difference | Significance |
|--|---------|-----------|-----------|-------------|
| Sample size | [N] | [N] | — | — |
| [Primary metric] | [Value] | [Value] | [+/- X%] | [p=[value], [Significant/Not significant]] |
| [Secondary metric 1] | [Value] | [Value] | [+/- X%] | [p=[value]] |
| [Secondary metric 2] | [Value] | [Value] | [+/- X%] | [p=[value]] |
**Verdict:**
- Statistical significance: [Significant at 95% CI / Not significant / Borderline]
- Practical significance: [This change would mean: [business impact in SAR/merchants]]
---
## Seasonal Context
[Was this experiment affected by a Salla seasonal event?]
- [Note if during Ramadan, Eid, White Friday, National Day, or summer slowdown]
- [If yes: how does this affect interpretation of results?]
---
## Segment Breakdown
| Segment | Control | Treatment | Delta | Notable? |
|---------|---------|-----------|-------|---------|
| Nano merchants | | | | |
| SMB merchants | | | | |
| Mid-Market | | | | |
| Enterprise | | | | |
| Mobile users | | | | |
| Desktop users | | | | |
| Arabic locale | | | | |
| English locale | | | | |
**Key segment finding:** [Highlight if results differ significantly by segment — this may change the rollout decision]
---
## Analysis
### What happened
[2-3 sentences describing the result in plain language. What moved, what didn't, what was surprising.]
### Why it likely happened
[Hypothesis for the mechanism. Why did the treatment produce this result? What user behavior changed?]
### What it means for the OKR
[Does this result move a current KR? By how much if rolled out fully?]
### What concerns me
[Any data quality issues, potential confounds, or reasons to be cautious about the result]
---
## Rollout Recommendation
**Recommendation:** [Roll out fully / Roll out to [segment] only / Continue experiment / Do not roll out / Needs more data]
**Rationale:**
[2-3 sentences explaining the recommendation]
**If roll out:**
- Rollout approach: [Full release / Staged % / Segment-specific / Feature flag]
- Suggested rollout timeline: [Date]
- Metrics to monitor post-rollout: [List]
- Rollback trigger: [If [metric] drops by [X], revert]
**If do not roll out:**
- What would need to be true for a future experiment to succeed?
- Should the control be the new baseline, or revert to prior state?
**If continue experiment:**
- What additional data is needed?
- Recommended end date: [Date]
- Sample size needed for significance: [N] (if currently underpowered)
---
## Next Steps
- [ ] [Specific action — e.g., "Run `/launch-plan` to prepare full rollout"]
- [ ] [Specific action — e.g., "Share segment finding with [pillar team]"]
- [ ] [Specific action — e.g., "Update `knowledge/metrics/` with new baseline after rollout"]
Write to: knowledge/experiments/review-[experiment-slug]-[date].md
npx claudepluginhub bakrsabeeh/salla-super-pm --plugin salla-measureProvides a checklist for code reviews covering functionality, security, performance, maintainability, tests, and quality. Use for pull requests, audits, team standards, and developer training.