Build-an-X workflow that uses an LLM to map existing tests to spec sections. Given a spec doc and the test suite, the LLM identifies which tests cover which sections, surfaces uncovered sections (gaps), and recommends specific tests to add. The output is a coverage matrix keyed by spec ID. Use it as a follow-up to `ai-test-generator` (which generates tests for new ACs): this skill maps the existing landscape and finds what's missing.
Slash command: `/qa-ai-assisted:ai-spec-coverage-mapper`
Coverage tools (per [`lcov-analysis`](../../../qa-test-reporting/skills/lcov-analysis/SKILL.md))
report which lines are tested. They don't report which spec
sections are tested.
A spec section like "AC-1.4: Already-applied promo shows 'Already applied'" might be:
This skill uses an LLM to map tests ↔ spec sections semantically.
spec_path: "docs/specs/checkout.md"
test_globs:
  - "tests/checkout/**/*.spec.ts"
  - "features/checkout/*.feature"
ac_extraction:
  pattern: "AC-(\\d+\\.\\d+):"  # AC IDs in the spec
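As a minimal sketch of what the `ac_extraction.pattern` setting does, the snippet below applies the same regular expression (with the YAML escaping removed) to a spec string. The helper name `extract_acs` and the sample spec text are assumptions made for illustration; the real spec layout may differ.

```python
import re

# Same pattern as ac_extraction.pattern in the config, without YAML escaping.
AC_PATTERN = re.compile(r"AC-(\d+\.\d+):")


def extract_acs(spec_text: str) -> list[str]:
    """Return unique AC IDs in order of first appearance, e.g. 'AC-1.4'."""
    seen: dict[str, None] = {}
    for match in AC_PATTERN.finditer(spec_text):
        seen.setdefault(f"AC-{match.group(1)}", None)
    return list(seen)


# Hypothetical spec excerpt; only the "AC-x.y:" prefix matters to the pattern.
sample_spec = """\
AC-1.4: Already-applied promo shows 'Already applied'
AC-1.5: Promo applies before tax
AC-1.4: (repeated reference)
"""

print(extract_acs(sample_spec))  # ['AC-1.4', 'AC-1.5']
```

Deduplicating while preserving order keeps the later matrix in the same sequence as the spec, which makes the report easier to read alongside it.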
The LLM reads both the spec and the tests and outputs the mapping.
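# scripts/ai-coverage.py
import glob
import re
from pathlib import Path

import openai  # reads the API key from the OPENAI_API_KEY environment variable

# Values mirror the config above; inlined here to keep the script self-contained.
spec_path = "docs/specs/checkout.md"
test_globs = ["tests/checkout/**/*.spec.ts", "features/checkout/*.feature"]
ac_pattern = r"AC-(\d+\.\d+):"

spec_text = Path(spec_path).read_text(encoding="utf-8")
# Unique AC IDs in order of first appearance, e.g. "AC-1.4".
ac_list = list(dict.fromkeys(f"AC-{num}" for num in re.findall(ac_pattern, spec_text)))

# Concatenate every matching test file, labelled with its path so the LLM can cite it.
test_files = "\n\n".join(
    f"### {path}\n{Path(path).read_text(encoding='utf-8')}"
    for pattern in test_globs
    for path in sorted(glob.glob(pattern, recursive=True))
)

system_prompt = """
You map AC IDs to tests. For each AC ID, identify:
- which test files cover it
- which test names within those files cover it
- coverage tier: full | partial | none
If partial, explain what aspect is missing.
"""

response = openai.chat.completions.create(
    model='gpt-4',
    messages=[
        {'role': 'system', 'content': system_prompt},
        {'role': 'user', 'content': f"AC IDs: {', '.join(ac_list)}\n\nSpec:\n{spec_text}\n\nTests:\n{test_files}"},
    ],
)
print(response.choices[0].message.content)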
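One cheap sanity check on the model's answer, sketched here under assumptions: confirm that every AC ID extracted from the spec is at least mentioned in the output, since a skipped AC would otherwise be silently absent from the matrix. The function name `unmapped_acs` and the sample strings are hypothetical and not part of the skill itself.

```python
import re


def unmapped_acs(ac_ids: list[str], llm_output: str) -> list[str]:
    """Return the AC IDs that never appear anywhere in the model's answer."""
    mentioned = set(re.findall(r"AC-\d+\.\d+", llm_output))
    return [ac_id for ac_id in ac_ids if ac_id not in mentioned]


# Hypothetical model answer that skips AC-1.5 entirely.
answer = "AC-1.1: full (promo.spec.ts)\nAC-2.2: none"
print(unmapped_acs(["AC-1.1", "AC-1.5", "AC-2.2"], answer))  # ['AC-1.5']
```

A non-empty result only shows that the model omitted an AC; it says nothing about whether the mappings it did return are correct.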
## Spec → test coverage map
**Spec:** `docs/specs/checkout.md`
**ACs:** 12
**Tests inventoried:** 47
### Coverage matrix
| AC ID | Description | Coverage | Tests |
|----------|------------------------------------------|----------|-----------------------------------------------------|
| AC-1.1 | Valid promo applies discount | ✅ full | `promo.spec.ts > "applies WELCOME10"` |
| AC-1.2 | Expired promo shows error | ✅ full | `promo.spec.ts > "shows error for EXPIRED50"` |
| AC-1.3 | Invalid format shows "Code not found" | ✅ full | `promo.spec.ts > "rejects NOTREAL"` |
| AC-1.4 | Already-applied promo shows "Already applied" | ⚠ partial | `promo.spec.ts > "rejects duplicate"`. ⚠ Test asserts "Already used" — message drift from AC. |
| AC-1.5 | Promo applies before tax | ❌ none | |
| AC-2.1 | Stripe webhook delivery retried | ✅ full | `webhook.spec.ts > "retries on 500"` |
| AC-2.2 | Stripe webhook delivery DLQ after 3 fails | ❌ none | |
| ...
### Action items
| AC ID | Action |
|---------|-------------------------------------------------|
| AC-1.4 | Update test message assertion to match AC ("Already applied"). |
| AC-1.5 | Add test asserting promo applies before tax. |
| AC-2.2 | Add test for DLQ-after-3-fails behavior. |
### Coverage trend
(Compare with prior run)
- AC count: 12 (was 10)
- Full coverage: 9/12 = 75% (was 8/10 = 80%)
- 2 new ACs added in this PR; both uncovered.
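The trend figures in a report like the one above can be derived from the coverage matrix itself. The sketch below assumes the matrix keeps the four-column layout shown (AC ID, Description, Coverage, Tests) and that the Coverage cell contains one of the words `full`, `partial`, or `none`; the function names are hypothetical.

```python
import re

TIER_RE = re.compile(r"\b(full|partial|none)\b")


def tiers_from_matrix(markdown: str) -> dict[str, str]:
    """Map each AC ID in a Markdown coverage matrix to its coverage tier."""
    tiers: dict[str, str] = {}
    for line in markdown.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Skip the header, the separator row, and elided rows such as "| ...".
        if len(cells) >= 3 and re.fullmatch(r"AC-\d+\.\d+", cells[0]):
            match = TIER_RE.search(cells[2])
            if match:
                tiers[cells[0]] = match.group(1)
    return tiers


def full_coverage(tiers: dict[str, str]) -> tuple[int, int, int]:
    """Return (full_count, total, percent_full), with percent rounded."""
    total = len(tiers)
    full = sum(1 for tier in tiers.values() if tier == "full")
    return full, total, round(100 * full / total) if total else 0


matrix = """\
| AC ID | Description | Coverage | Tests |
|-------|-------------|----------|-------|
| AC-1.1 | Valid promo applies discount | ✅ full | `promo.spec.ts > "applies WELCOME10"` |
| AC-1.4 | Already-applied promo shows "Already applied" | ⚠ partial | `promo.spec.ts > "rejects duplicate"` |
| AC-1.5 | Promo applies before tax | ❌ none | |
"""

full, total, percent = full_coverage(tiers_from_matrix(matrix))
print(f"Full coverage: {full}/{total} = {percent}%")  # Full coverage: 1/3 = 33%
```

Running the same two functions on the previous report and on the current one gives the "was" and "now" numbers for the trend lines.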
Schedule weekly:
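on:
  schedule:
    - cron: '0 4 * * MON'
jobs:
  spec-coverage:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      issues: write  # needed by the issue-creation step
    steps:
      - uses: actions/checkout@v5
      - run: pip install openai
      # The script prints the report to stdout; redirect it to the file the next step reads.
      - run: python scripts/ai-coverage.py > spec-coverage-report.md
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}  # assumed secret name
      - uses: peter-evans/create-issue-from-file@v5
        with:
          title: 'Spec coverage report — week of ${{ github.event.repository.updated_at }}'
          content-filepath: spec-coverage-report.md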
LLMs may claim a test "covers" an AC when it doesn't. Verification: spot-check the claimed mappings, and cross-reference with `acceptance-test-from-criteria` if the team uses `@AC-X.Y` tags; those tags are the ground truth.

| Anti-pattern | Why it fails | Fix |
|---|---|---|
| Trusting LLM's "full coverage" claim without verification | Hallucination; tests don't actually cover the AC. | Spot-check + cross-reference (Step 5). |
| Running on the entire codebase repeatedly | Costly and slow. | Filter to changed ACs / tests since last run. |
| One-shot mapping; never updated | Drift; mapping stale. | Weekly cadence (Step 4). |
| No action items per gap | Coverage gaps surface but nothing happens. | Per-gap action item (Step 3 example). |
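The tag cross-reference used for verification can be sketched as follows, assuming tests carry `@AC-X.Y` tags in their source text. The function names, the dictionary shapes, and the sample file paths are hypothetical; in practice the LLM's claims would first have to be parsed out of its report.

```python
import re

TAG_RE = re.compile(r"@(AC-\d+\.\d+)")


def tagged_acs(test_sources: dict[str, str]) -> dict[str, set[str]]:
    """Map each AC ID to the set of test files that carry its @AC-X.Y tag."""
    by_ac: dict[str, set[str]] = {}
    for path, text in test_sources.items():
        for ac_id in TAG_RE.findall(text):
            by_ac.setdefault(ac_id, set()).add(path)
    return by_ac


def disputed_claims(
    llm_claims: dict[str, set[str]], tags: dict[str, set[str]]
) -> dict[str, set[str]]:
    """Return, per AC ID, the files the LLM cited that carry no matching tag."""
    disputed: dict[str, set[str]] = {}
    for ac_id, files in llm_claims.items():
        untagged = files - tags.get(ac_id, set())
        if untagged:
            disputed[ac_id] = untagged
    return disputed


# Hypothetical test sources: only the first file is tagged.
sources = {
    "features/checkout/promo.feature": "@AC-1.1\nScenario: applies WELCOME10\n",
    "features/checkout/webhook.feature": "Scenario: retries on 500\n",
}
claims = {
    "AC-1.1": {"features/checkout/promo.feature"},
    "AC-2.1": {"features/checkout/webhook.feature"},
}

print(disputed_claims(claims, tagged_acs(sources)))
# {'AC-2.1': {'features/checkout/webhook.feature'}}
```

A disputed claim is not necessarily wrong, since the test may simply be missing its tag, but it is a good candidate for the manual spot-check.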
- `ai-test-generator` - sister skill: generates tests for the gaps this skill identifies.
- `acceptance-test-from-criteria` - for tag-based AC traceability without an LLM.
- `coverage-debt-tracker` - line-coverage debt; complementary to spec coverage.

Install: `npx claudepluginhub testland/qa --plugin qa-ai-assisted`