Skill

flake-triage

Classifies a flaky WDIO/Appium spec as intermittent, deterministic-flake, or locator-broke by analyzing the last 10 BrowserStack runs. Use when the user names a spec and asks "is this flaky", "why is X failing on CI", "should we skip this test", or "triage this flake".

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/qa-mobile-pack:flake-triage

User invocable

Model invocable

Inline context

Default effort

Tool Access

This skill is limited to the following tools:

Read

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Decide whether a spec is genuinely flaky, broken, or hit by a regression. Input `$ARGUMENTS` is the spec name (file path or test title).

SKILL.md

48 lines · ~565 tokens

Stats

LanguageShell

Stars0

MaintenanceGood

Last CommitMay 6, 2026

Actions

View Source View Plugin View on GitHub View README

Stats

Actions

Flake Triage

Decide whether a spec is genuinely flaky, broken, or hit by a regression. Input $ARGUMENTS is the spec name (file path or test title).

Requires browserstack MCP to pull recent runs — see qa-mobile-pack/mcp/README.md. If unavailable, ask the user to paste the last 10 run summaries (status + failed step + timestamp).

Procedure

Fetch the last 10 runs of $ARGUMENTS via browserstack MCP, sorted newest-first. For each run capture: status (pass/fail), failed_step, timestamp, app_version or release tag.
Compute:
- pass_rate = passes / 10
- unique_failure_steps = count of distinct failed_step values across the failing runs
- recent_pass_rate = passes in runs 1–5 (most recent)
- older_pass_rate = passes in runs 6–10
Apply the decision tree in order — first match wins:
- recent_pass_rate < 0.2 AND older_pass_rate > 0.9 → locator-broke (regression after a release)
- pass_rate < 0.5 AND unique_failure_steps == 1 → deterministic-flake (same step every fail)
- pass_rate < 0.5 AND unique_failure_steps >= 2 → intermittent (different steps each fail)
- Otherwise → stable (do not classify as flake)
Pick the recommended action from this map:
- locator-broke → fix locator; check the latest release diff for the failing screen
- deterministic-flake → investigate timing or data setup at the failing step
- intermittent → file a ticket with the run links; consider retry policy; do not skip yet
- stable → no action

Output template

**Spec:** $ARGUMENTS
**Pass rate (last 10):** N/10  (recent 5: N/5, older 5: N/5)
**Unique failure steps:** N

**Classification:** `<intermittent | deterministic-flake | locator-broke | stable>`

**Recommended action:** <action>

**Evidence:**
- <run timestamp> — <status> — <failed_step or "-">
- ... (up to 10)

flake-triage

Invocation

Tool Access

Context Preview

SKILL.md

flake-triage

Invocation

Tool Access

Context Preview

SKILL.md

Flake Triage

Procedure

Output template

Similar Skills

Flake Triage

Procedure

Output template

Similar Skills