From vanguard-frontier-agentic
Triages flaky tests across any framework into root-cause categories (async races, shared state, environment coupling, etc.) and assigns remediation or quarantine paths.
How this skill is triggered — by the user, by Claude, or both
Slash command
/vanguard-frontier-agentic:test-flakiness-triageThis skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
This skill triages flaky tests — tests that pass and fail without a code change — into root-cause categories and assigns each a remediation or quarantine path. Flakiness is a process failure, not a test-by-test annoyance: once a suite flakes, engineers re-run red builds reflexively, a re-run becomes the default response to any failure, and the suite stops detecting real regressions. The skill i...
This skill triages flaky tests — tests that pass and fail without a code change — into root-cause categories and assigns each a remediation or quarantine path. Flakiness is a process failure, not a test-by-test annoyance: once a suite flakes, engineers re-run red builds reflexively, a re-run becomes the default response to any failure, and the suite stops detecting real regressions. The skill is framework-agnostic (Playwright, Cypress, Jest, JUnit, pytest, Go) and converts a pile of "sometimes fails" into categorized causes, a quarantine decision per test, and a policy that stops new flakes from entering the suite.
|| true, retry-the-whole-job) with no flaky tracking as HIGH — it converts flakiness from visible to invisible and unbounded.Date.now, timezone, or locale without injection/freezing as HIGH for environment coupling.Load these only when needed:
Return, at minimum:
npx claudepluginhub raishin/vanguard-frontier-agentic --plugin vanguard-frontier-agenticDiagnoses non-deterministic test failures and eliminates root causes (timing, shared state, concurrency, external dependency, randomness) instead of retrying or skipping.
Diagnoses and eliminates flaky or nondeterministic tests by classifying failure types (ordering, timing, resource, environment, external, concurrency) and isolating root causes with reproducible fixes.
Investigates a specific flaky test by retrieving its history, failure pattern, and category, then recommends fix, quarantine, or escalate. Best for DataDog CI users.