From qa-feature-flags
Workflow-driven skill that builds a flag-state coverage matrix from the project's flag inventory and risk register. Walks through: inventorying flags (grep for flag-evaluation calls), classifying each (boolean / multi-variant / kill-switch / experiment), choosing the coverage strategy (per-flag-isolation / pairwise / full / risk-driven per feature-flag-test-matrix-reference), generating the test matrix (PICT for pairwise; manual for risk-driven), and emitting test skeletons. Use when introducing flag-test coverage to a new codebase or when a flag-related incident exposes a coverage gap. Composes feature-flag-test-matrix-reference.
Slash command
/qa-feature-flags:flag-state-coverage-builder
Building a flag-state coverage matrix from scratch is hard because the combinatorics explode: n independent boolean flags alone have 2^n states. This skill walks through producing a realistic coverage matrix: not exhaustive, but sufficient.
The output is a coverage-matrix YAML file, per-cell test skeletons, and a list of gaps documented for follow-up.
Grep for SDK calls:
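# Generic (one --include per extension: grep globs do not support {a,b} braces, and the shell does not expand them inside quotes)
grep -rn 'isOn\|isEnabled\|variation\|getFeatureValue' --include='*.ts' --include='*.js' --include='*.py' --include='*.go' --include='*.java' .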
# Per-platform
grep -rn 'launchdarkly\|ld_client' . # LD
grep -rn 'unleash.isEnabled' . # Unleash
grep -rn 'flagsmith.get_' . # Flagsmith
grep -rn 'gbClient.\|growthbook' . # GrowthBook
Output: a flag inventory:
flags:
  - name: show-new-ui
    platform: launchdarkly
    type: boolean
    found_at:
      - src/components/Header.tsx:42
      - src/pages/Dashboard.tsx:88
  - name: checkout-experiment
    platform: launchdarkly
    type: multi-variant
    variants: [control, treatment-a, treatment-b]
    found_at:
      - src/pages/Checkout.tsx:120
  # ...
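The grep pass can also be done programmatically, for example when the inventory has to be regenerated in CI. The following is a minimal sketch, not part of the skill itself: `FlagLocation`, `FLAG_CALL`, and `scanSource` are hypothetical names, the regex assumes the flag key is passed as a string literal in the first argument, and it is case-insensitive (unlike the generic grep) so that typed variants such as `boolVariation` are also caught.

```typescript
// Hypothetical shape mirroring the name/found_at fields of the YAML inventory.
interface FlagLocation {
  name: string;
  foundAt: string[]; // "path:line" entries
}

// Matches calls such as isEnabled('flag-key') or client.variation("flag-key", ...).
// The method names are the ones the generic grep searches for.
const FLAG_CALL = /(?:isOn|isEnabled|variation|getFeatureValue)\(\s*['"]([^'"]+)['"]/gi;

// Scans one file's text and records every flag key found, with its location.
function scanSource(path: string, source: string, inventory: FlagLocation[]): void {
  const lines = source.split("\n");
  lines.forEach((line, index) => {
    FLAG_CALL.lastIndex = 0; // reset the global regex before each line
    let match: RegExpExecArray | null;
    while ((match = FLAG_CALL.exec(line)) !== null) {
      const name = match[1];
      let entry = inventory.filter((f) => f.name === name)[0];
      if (entry === undefined) {
        entry = { name, foundAt: [] };
        inventory.push(entry);
      }
      entry.foundAt.push(`${path}:${index + 1}`);
    }
  });
}
```

Keys that are built dynamically (constants, template strings) will not match and still need a manual pass.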
| Category | Signals | Coverage need |
|---|---|---|
| Kill-switch | Naming: `*-kill`, `disable-*`, `emergency-*` | Test on→off toggle latency |
| Experiment | Multi-variant, used in analytics | Per-variant test + assignment integrity |
| Permission-gated feature | Used with `if (flag && user.role === ...)` | Test per (flag, role) cell |
| UI tweak | Used in JSX/template; no business logic | Default + each variant; low risk |
| Migration | Naming: `use-new-*`, `migrate-to-*` | Test both paths to verify equivalence |
| Plan / tier gating | Used with subscription / plan check | Per (flag, plan) cell |
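Only some of these signals can be checked mechanically. As a rough first pass, the naming signals from the table can be applied to the inventory, leaving everything else for manual review. This is a sketch under that assumption; `FlagInfo`, `FlagCategory`, and `classifyFlag` are hypothetical names.

```typescript
type FlagCategory = "kill-switch" | "migration" | "experiment" | "unclassified";

interface FlagInfo {
  name: string;
  type: "boolean" | "multi-variant";
}

function classifyFlag(flag: FlagInfo): FlagCategory {
  // Naming signals from the table: *-kill, disable-*, emergency-*
  if (/-kill$|^disable-|^emergency-/.test(flag.name)) return "kill-switch";
  // Naming signals from the table: use-new-*, migrate-to-*
  if (/^use-new-|^migrate-to-/.test(flag.name)) return "migration";
  // Multi-variant is only one experiment signal; analytics usage needs a code check.
  if (flag.type === "multi-variant") return "experiment";
  // Permission-gated, plan-gated, and UI-tweak flags depend on call-site context,
  // which a name-based pass cannot see.
  return "unclassified";
}
```

Flags that come back as `unclassified` are the ones to inspect at their `found_at` locations.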
Per feature-flag-test-matrix-reference:
| Strategy | Apply to |
|---|---|
| Default-only smoke | UI tweaks (low risk) |
| Per-flag isolation | Migration flags |
| Pairwise | Permission-gated + plan-tier (interactions matter) |
| Full matrix | Kill-switches + flags with regulatory impact |
| Risk-driven | Catch-all for the rest |
For pairwise: use PICT (Microsoft):
# pict.txt
flag_a: on, off
flag_b: on, off
flag_c: control, treatment-a, treatment-b
user_segment: free, paid, enterprise
pict pict.txt > matrix.tsv
PICT emits a pairwise-covering matrix: roughly 9 to 12 tests here, instead of the 36 (2 × 2 × 3 × 3) that a full matrix needs.
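A generated or hand-edited matrix can be sanity-checked before tests are written from it. The sketch below computes the exhaustive size as the product of the domain sizes and lists any value pair that no row covers. `Model`, `Row`, `fullMatrixSize`, and `uncoveredPairs` are hypothetical names; the `model` constant restates the pict.txt parameters.

```typescript
type Model = { [param: string]: string[] };
type Row = { [param: string]: string };

// Size of the exhaustive matrix: the product of all domain sizes.
function fullMatrixSize(model: Model): number {
  return Object.keys(model).reduce((n, p) => n * model[p].length, 1);
}

// Returns every value pair that no row covers; an empty result means
// the rows form a pairwise-covering matrix for the model.
function uncoveredPairs(model: Model, rows: Row[]): string[] {
  const params = Object.keys(model);
  const missing: string[] = [];
  for (let i = 0; i < params.length; i++) {
    for (let j = i + 1; j < params.length; j++) {
      const a = params[i];
      const b = params[j];
      for (const va of model[a]) {
        for (const vb of model[b]) {
          const covered = rows.some((r) => r[a] === va && r[b] === vb);
          if (!covered) missing.push(`${a}=${va} & ${b}=${vb}`);
        }
      }
    }
  }
  return missing;
}

// Same parameters as the pict.txt model.
const model: Model = {
  flag_a: ["on", "off"],
  flag_b: ["on", "off"],
  flag_c: ["control", "treatment-a", "treatment-b"],
  user_segment: ["free", "paid", "enterprise"],
};
```

Any pairwise matrix for this model needs at least 9 rows, because `flag_c` and `user_segment` alone have 3 × 3 value pairs and each row covers only one of them.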
For risk-driven: combine with the risk register from qa-process/risk-matrix. Cells with high impact + high likelihood become required tests.
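The selection rule can be sketched as a simple partition. The register format of qa-process/risk-matrix is not shown here, so `RiskCell` and its three-level scale are assumptions, as are the names `Selection` and `selectByRisk`.

```typescript
type Level = "low" | "medium" | "high";

// Hypothetical register entry: one matrix cell with its risk rating.
interface RiskCell {
  cell: string; // e.g. "use-new-auth=on, plan=enterprise"
  impact: Level;
  likelihood: Level;
}

interface Selection {
  required: RiskCell[]; // must have a test
  gaps: RiskCell[]; // candidates for the documented-gaps list
}

function selectByRisk(cells: RiskCell[]): Selection {
  const isRequired = (c: RiskCell) => c.impact === "high" && c.likelihood === "high";
  return {
    required: cells.filter(isRequired),
    gaps: cells.filter((c) => !isRequired(c)),
  };
}
```

Cells that land in `gaps` are not silently dropped: each one either gets a test anyway or is written up with a reason.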
For each cell of the matrix, generate a test stub:
// tests/feature-flags/auth.test.ts
describe('auth flag matrix', () => {
  beforeEach(() => {
    td.update(td.flag('use-new-auth').booleanFlag().on(false));
  });

  test('free user, new auth off → old flow', () => {
    td.update(td.flag('use-new-auth').booleanFlag().on(false));
    expect(authFlow({ plan: 'free' })).toBe('old');
  });

  test('free user, new auth on → new flow', () => {
    td.update(td.flag('use-new-auth').booleanFlag().on(true));
    expect(authFlow({ plan: 'free' })).toBe('new');
  });

  test('paid user, new auth on → new flow', () => {
    td.update(td.flag('use-new-auth').booleanFlag().on(true));
    expect(authFlow({ plan: 'paid' })).toBe('new');
  });

  // ... per pairwise matrix
});
The platform-specific SDK setup comes from launchdarkly-testing or the matching sibling skill (unleash-testing, flagsmith-testing, growthbook-testing).
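Stub generation from a PICT matrix can be scripted. The sketch below assumes Jest-style `describe` and `test.todo`, and relies on PICT writing tab-separated output with the parameter names in the first row. `MatrixRow`, `parsePictTsv`, and `emitSkeleton` are hypothetical names.

```typescript
type MatrixRow = { [param: string]: string };

// Parses PICT output: a header row of parameter names, then one row per test case.
function parsePictTsv(tsv: string): MatrixRow[] {
  const lines = tsv.split(/\r?\n/).filter((l) => l.trim() !== "");
  if (lines.length === 0) return [];
  const header = lines[0].split("\t");
  return lines.slice(1).map((line) => {
    const values = line.split("\t");
    const row: MatrixRow = {};
    header.forEach((name, i) => {
      row[name] = values[i];
    });
    return row;
  });
}

// Emits one pending test per matrix row, labelled with the row's parameter values.
function emitSkeleton(suite: string, rows: MatrixRow[]): string {
  const tests = rows.map((row) => {
    const label = Object.keys(row)
      .map((k) => `${k}=${row[k]}`)
      .join(", ");
    return `  test.todo('${label}');`;
  });
  return [`describe('${suite}', () => {`].concat(tests, ["});"]).join("\n");
}
```

Pending tests show up in the runner's report, so unimplemented cells stay visible until someone fills them in. Parameter values containing a single quote would need escaping before this is used on real data.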
Add these regardless of matrix coverage:
test('kill-switch deactivation propagates within 30s', async () => {
  // await is harmless if the platform's update call is synchronous
  await td.update(td.flag('emergency-disable').booleanFlag().on(false));
  expect(featureActive()).toBe(true);
  await td.update(td.flag('emergency-disable').booleanFlag().on(true));
  // The SDK may have a polling delay; in test mode the update is instant, so this
  // checks the toggle path only. The 30s budget needs a real SDK connection.
  expect(featureActive()).toBe(false);
});

test('SDK fails → default returned', async () => {
  const brokenClient = simulateSDKFailure();
  // Whatever default the caller passes must come back unchanged.
  expect(await brokenClient.boolVariation('any-flag', user, false)).toBe(false);
  expect(await brokenClient.boolVariation('any-flag', user, true)).toBe(true);
});

test('user assignment sticky across sessions', async () => {
  // Same key evaluated twice; a fuller test would re-create the client between calls.
  const v1 = await client.variation('rollout', { key: 'user-1' });
  const v2 = await client.variation('rollout', { key: 'user-1' });
  expect(v1).toEqual(v2);
});
Emit a coverage doc:
# Flag-Test Coverage Matrix
## Covered cells
| Flag | Strategy | Cells | Test file |
|---|---|---|---|
| show-new-ui | per-flag isolation | 2 | tests/flags/show-new-ui.test.ts |
| checkout-experiment | pairwise (3 flags) | 9 | tests/flags/checkout-pairwise.test.ts |
| auth-migration | full matrix (2 flag states × 3 plans) | 6 | tests/flags/auth.test.ts |
## Documented gaps (deliberate)
| Cell | Reason | Mitigation |
|---|---|---|
| flag-x = on AND flag-y = on AND user.segment = `internal` | Low likelihood — internal users only see flag-y in beta | Manual verify on flag-y promotion |
| theme-tweak all 3 variants × all 5 segments | UI-only; default-on-each is sufficient | None |
| Anti-pattern | Why it fails | Fix |
|---|---|---|
| Build matrix without inventory | Flags missed silently | Always grep first |
| Pairwise on truly-independent flags | Wasted tests | Identify interactions; pair-test only interacting flags |
| Full matrix on 20+ flags | 2^20 (over a million) tests for boolean flags alone; infeasible | Pairwise or risk-driven |
| Gaps left undocumented | Future maintainers cannot tell deliberate omissions from oversights | Coverage doc listing each gap with its reason |
| One mega-test file for all flags | Failures opaque | One file per flag (or flag-pair) |
| Skip platform-specific override-mode | Tests pass against mock; prod-SDK-specific bugs hide | Use platform's TestData/bootstrap |
| Skip kill-switch test | "It worked in dev" | Always test |
| Coverage matrix not committed / no review | Drift unnoticed | Commit flag-coverage.yaml to the repo and review changes to it |
This skill produces: flag-coverage.yaml.
Related skills: feature-flag-test-matrix-reference, qa-process/risk-matrix, launchdarkly-testing, unleash-testing, flagsmith-testing, growthbook-testing, stale-flag-detector, flag-removal-runbook-author.
Install: npx claudepluginhub testland/qa --plugin qa-feature-flags