From qa-shift-left
Reads a PRD, design doc, or product brief and pulls out the non-functional requirements (performance, accessibility, security, internationalization, reliability, observability) as concrete, threshold-bound, testable assertions. Maps every NFR to its measurement source (Lighthouse, axe, OWASP ASVS, WCAG criterion, etc.) so the test suite knows what to assert against. Use after acceptance-criteria-extractor handles functional requirements.
How this skill is triggered — by the user, by Claude, or both
Slash command
/qa-shift-left:nfr-extractorThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
ISTQB defines **non-functional testing** as "A test type to evaluate
ISTQB defines non-functional testing as "A test type to evaluate that a component or system complies with non-functional requirements" (istqb-nft). The eight ISO/IEC 25010:2011 product-quality characteristics define the canonical NFR categories (cite by stable standard ID; iso.org is paywalled / Cloudflare-protected):
| ISO/IEC 25010 characteristic | Common shorthand | This skill covers |
|---|---|---|
| Functional Suitability | functional | (out - handled by acceptance-criteria-extractor) |
| Performance Efficiency | performance / perf | yes |
| Compatibility | compatibility | yes |
| Usability | a11y / UX | yes (a11y subset) |
| Reliability | reliability | yes |
| Security | security | yes |
| Maintainability | (developer-facing) | (out) |
| Portability | (developer-facing) | (out) |
Plus two practitioner-emergent NFR families that are increasingly treated as first-class:
| Family | Source |
|---|---|
| Internationalization (i18n) | W3C i18n best practices |
| Observability | DORA / SRE practitioner literature |
Acceptance-criteria-extractor handles functional requirements (Given/When/Then). This skill handles everything else - the threshold-and-measurement requirements that don't fit Gherkin's shape.
testability-reviewer would
flag for failing the Observable heuristic.Per family, look for the canonical signal phrases:
| Family | Signal phrases |
|---|---|
| Performance | "fast", "responsive", "within Xms", "p95", "throughput", "latency", "time to" |
| Accessibility | "WCAG", "screen reader", "keyboard navigation", "color contrast", "ARIA" |
| Security | "auth", "encrypt", "OWASP", "CSP", "rate limit", "secret" |
| Compatibility | "browser", "iOS", "Android", "version", "device" |
| Reliability | "uptime", "SLO", "SLA", "graceful degradation", "circuit breaker", "retry" |
| Internationalization | "locale", "RTL", "translation", "Unicode", "timezone" |
| Observability | "metric", "log", "trace", "alert", "dashboard" |
Each signal becomes a candidate NFR - but NOT every signal is testable as authored. Run each through Step 2.
Per ISTQB testability (istqb-testability) and the
testability-reviewer
heuristics, an NFR is testable only when it has:
| Untestable NFR | Threshold-bound rewrite |
|---|---|
| "The site should be fast" | "p95 page-load time on /dashboard is ≤2.5s, measured by Lighthouse CI under 4G throttling." |
| "Accessible to screen reader users" | "WCAG 2.2 AA conformance on /checkout; axe-core scan returns zero serious or critical violations." |
| "Secure against attacks" | "OWASP ASVS Level 2 controls 1-7; verified by npm audit --audit-level=high exit zero AND ZAP baseline scan zero alerts." |
| "Works on all browsers" | "Functional E2E tests pass on Chrome ≥120, Firefox ≥120, Safari ≥17 (latest two majors per WebRumStats data)." |
| "Highly available" | "Uptime ≥99.9% measured monthly via external prober; recovery from any single-AZ failure ≤5 min." |
The agent never fabricates a threshold. If the doc says "fast" without a number, the agent emits an explicit question ("What's the target p95? Common defaults: 2.5s for landing, 1.5s for cached, 200ms for API."), it does not invent 2.5s.
Each NFR must name how it's measured. Canonical sources by family:
| Metric | Source |
|---|---|
| LCP (Largest Contentful Paint) | Web Vitals (Google) - page-load perception |
| INP (Interaction to Next Paint) | Web Vitals - replaces FID since March 2024 |
| CLS (Cumulative Layout Shift) | Web Vitals |
| p95 / p99 latency | k6 / JMeter / Gatling / load-test runner |
| Throughput | Same load-test runners |
Web Vitals canonical thresholds: LCP ≤2.5s, INP ≤200ms, CLS ≤0.1.
| Standard | Source |
|---|---|
| WCAG 2.2 AA / AAA | https://www.w3.org/TR/WCAG22/ |
| ARIA practices | https://www.w3.org/WAI/ARIA/apg/ |
| Tools | axe-core, pa11y, Lighthouse a11y, IBM Equal Access, WAVE |
Per-criterion citations use the WCAG SC ID format
(WCAG 2.2 SC 1.4.3 Contrast (Minimum)) so the test report
back-references the standard.
| Standard | Source |
|---|---|
| OWASP Top 10 | https://owasp.org/www-project-top-ten/ |
| OWASP ASVS | https://owasp.org/www-project-application-security-verification-standard/ |
| CWE | https://cwe.mitre.org/ |
| Tools | ZAP / Burp / Snyk / Trivy / Semgrep |
| Source |
|---|
| MDN browser-compat-data (https://github.com/mdn/browser-compat-data) |
| Can I use (https://caniuse.com/) |
| Per-team analytics (real device share) |
| Concept | Source |
|---|---|
| SLO / SLA / SLI | Google SRE Workbook (free at sre.google/workbook) |
| Error budgets | Same |
| Chaos engineering principles | https://principlesofchaos.org/ |
| Source |
|---|
| W3C i18n Web FAQ (https://www.w3.org/International/) |
| Unicode CLDR (https://cldr.unicode.org/) |
| ICU MessageFormat |
# NFRs extracted from `<doc-id>`
**Source doc:** `<path or URL>`
**Date extracted:** YYYY-MM-DD
**NFRs found:** N
**Threshold gaps flagged:** M
## Performance
| ID | Requirement | Threshold | Measurement | Scope |
|-------|------------------------------------------------------------|-----------|-------------|----------------------|
| PRF-1 | LCP on /dashboard | ≤2.5s | Lighthouse CI; Web Vitals 75th-percentile field data | logged-in user, 4G, 1280x800 |
| PRF-2 | p95 latency on POST /api/orders | ≤200ms | k6 load test, 50 RPS for 5 min | production-like staging |
## Accessibility
| ID | Requirement | Threshold | Measurement | Scope |
|-------|------------------------------------------------------------|------------------|-------------|------------------|
| A11Y-1| WCAG 2.2 AA conformance | zero serious/critical violations | axe-core scan | every page in app sitemap |
## Security
| ID | Requirement | Threshold | Measurement | Scope |
|-------|------------------------------------------------------------|------------------|-------------|------------------------|
| SEC-1 | No high-severity dependency vulnerabilities | zero high/critical | npm audit --audit-level=high | every PR; production deps |
| SEC-2 | OWASP ASVS Level 2 baseline | pass | ZAP baseline scan + manual ASVS checklist | release readiness gate |
## Compatibility / i18n / Reliability / Observability
(similar tables, only included when the source doc raised them)
## Threshold gaps (HUMAN INPUT REQUIRED)
| Vague phrase | Where in doc | Suggested concretization |
|------------------------------------|-----------------------|----------------------------------------------------------------|
| "fast checkout" | line 12 | LCP ≤1.5s (typical for cached transactional flow)? |
| "supports global users" | line 28 | Locale list? RTL support? Timezone handling? Unicode in fields? |
The agent does NOT pick the answers — author must confirm.
Input:
"Checkout must feel snappy and work for users worldwide. We need to meet enterprise security requirements."
Output (excerpt):
**NFRs found:** 0 (all phrases are vague)
**Threshold gaps flagged:** 3
| Vague phrase | Suggested concretization |
|-------------------------------------------|-------------------------------------------------------------------------------------------|
| "feel snappy" | LCP ≤2.5s, INP ≤200ms (Web Vitals defaults)? |
| "work for users worldwide" | Specify: which locales? RTL? timezones? Unicode validation? CLDR-based number/date formatting? |
| "enterprise security requirements" | OWASP ASVS Level 2 (typical baseline) or Level 3 (strict)? SOC 2 Type II audit? FedRAMP? |
The output is a question list, not a fabricated set of thresholds. Author returns; agent re-extracts on next pass.
Input:
"Page load on /dashboard must be ≤1.5s p95 measured by Lighthouse CI on the GitHub Actions ubuntu-latest runner. Color contrast ratio ≥4.5:1 for normal text per WCAG 2.2 SC 1.4.3."
Output:
## Performance
| ID | Requirement | Threshold | Measurement | Scope |
|-------|------------------------------------------------------------|-----------|-------------|----------------|
| PRF-1 | Page load on /dashboard | ≤1.5s p95 | Lighthouse CI on GitHub Actions ubuntu-latest | /dashboard route |
## Accessibility
| ID | Requirement | Threshold | Measurement | Scope |
|-------|------------------------------------------------------------|-----------|-------------|----------------|
| A11Y-1| Color contrast ratio for normal text | ≥4.5:1 | WCAG 2.2 SC 1.4.3 | every page |
**Threshold gaps:** 0 — all NFRs are threshold-bound and measurement-cited.
| Anti-pattern | Why it fails | Fix |
|---|---|---|
| Inventing thresholds the doc didn't specify | Tests pass for fabricated baselines; real users see worse perf. | Flag as threshold gap; require author input. |
| One generic "performance" NFR for every page | Different surfaces have different budgets (cached vs non-cached). | One NFR per measurable scope (route × viewport × auth state). |
| Mapping every security NFR to "OWASP Top 10" | Top 10 is awareness-tier; ASVS is verifiable. Compliance maps to ASVS. | Specify ASVS Level (1/2/3) and which controls. |
| Treating "uptime SLA" as a test requirement | SLA is a runtime / monitoring concern, not a pre-deploy test. | Map to monitoring metric + alert rule, not to a CI gate. |
testability-reviewer - the
upstream agent that flags vague phrases for this skill to formalize.acceptance-criteria-extractor - sibling skill for functional requirements.npx claudepluginhub testland/qa --plugin qa-shift-leftSearches MemPalace before answering questions about past work, people, projects, or prior decisions. Returns verbatim stored content instead of guessing from model memory.
Guides Payload CMS config (payload.config.ts), collections, fields, hooks, access control, APIs. Debugs validation errors, security, relationships, queries, transactions, hook behavior.
Implements vector databases with Pinecone, Weaviate, Qdrant, Milvus, pgvector for semantic search, RAG, recommendations, and similarity systems. Optimizes embeddings, indexing, and hybrid search.