From qa-test-reporting
Builds a per-PR coverage delta report from any pair of LCOV / Cobertura / JSON coverage outputs (current run + baseline from the merge target) - emits a per-file table with line% / branch% deltas, called-out new files, hidden drops (overall +0.1pp but one file -8pp), and a single-line PR-comment summary. Use when the team has coverage in CI but needs human-readable PR feedback that points at the specific file the reviewer should focus on, not just an aggregate number.
Slash command: /qa-test-reporting:coverage-diff-reporter
A whole-repo coverage gate is necessary but not sufficient. A drop of 0.4pp overall might hide a 12pp drop in one critical file. A new file at 35% line coverage might pass an aggregate gate but leave a regression-risk hot spot.
This skill builds a coverage diff report that shows the reviewer what to look at: a per-file table of line% / branch% deltas, called-out new files, hidden drops, and a single-line PR-comment summary.
This skill does not decide pass/fail - that's the gate's job (see lcov-analysis or cobertura-analysis Step 5). This skill just makes the diff legible.
Match the existing CI's reporter:
| Existing reporter | Use |
|---|---|
| LCOV .info | lcov-analysis parser |
| Cobertura XML | cobertura-analysis parser |
| Jest JSON / V8 coverage | Convert to LCOV first (jest --coverageReporters=lcov) |
| JaCoCo XML | Use jacoco-analysis, or convert to Cobertura |
| coverage.py | coverage xml → Cobertura, OR py2lcov → LCOV |
The parser writes the current run's parsed coverage to current.json. The same parser, run against the baseline, produces baseline.json.
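The parsed shape is not spelled out here, but the diff code further down reads `path`, `lh`, `lf`, `brh` and `brf` from each file entry. A minimal sketch of an LCOV parser producing that shape follows, assuming one dict per `SF:` record. The function name `parse_lcov` is hypothetical, and the real `scripts/parse_lcov.py` may differ.

```python
def parse_lcov(text):
    """Parse LCOV tracefile text into one summary dict per source file.

    Assumed output shape (inferred from the keys the diff code reads):
    {'path', 'lf', 'lh', 'brf', 'brh'} with integer counts.
    """
    files = []
    cur = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith('SF:'):
            # SF:<path> opens a new per-file record.
            cur = {'path': line[3:], 'lf': 0, 'lh': 0, 'brf': 0, 'brh': 0}
        elif line == 'end_of_record':
            if cur is not None:
                files.append(cur)
            cur = None
        elif cur is not None and ':' in line:
            key, _, value = line.partition(':')
            # LF/LH = lines found/hit, BRF/BRH = branches found/hit.
            if key in ('LF', 'LH', 'BRF', 'BRH'):
                cur[key.lower()] = int(value)
    return files
```

Dumping the returned list with `json.dumps` would give a file that can serve as current.json or baseline.json.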
Two patterns:
Pattern A: the main branch's last successful CI run uploads its coverage as an artifact, and PR jobs download it.
```yaml
- name: Restore baseline
  uses: dawidd6/action-download-artifact@v3
  with:
    workflow: coverage.yml
    branch: main
    name: coverage-baseline
    path: baseline/
- name: Parse current
  run: python scripts/parse_lcov.py coverage/lcov.info > current.json
- name: Parse baseline
  run: python scripts/parse_lcov.py baseline/lcov.info > baseline.json
- name: Generate diff
  run: python scripts/coverage_diff.py current.json baseline.json > diff.md
```
Pattern B: the PR job checks out main, runs tests + coverage, then checks out the PR head and runs them again. Slower (~2x runtime) but always fresh.
```yaml
- name: Checkout main
  run: git fetch origin main && git checkout origin/main
- name: Run tests on main
  run: npm test -- --coverage && cp coverage/lcov.info baseline.lcov
- name: Checkout PR head
  run: git checkout ${{ github.event.pull_request.head.sha }}
- name: Run tests on PR head
  run: npm test -- --coverage
```
Pattern A is the default. Pattern B is the fallback when artifact retention has expired or main coverage is non-deterministic.
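```python
# scripts/coverage_diff.py

# pct() and delta() are not shown in the original; these are minimal assumed versions.
def pct(hit, found):
    """Percentage hit/found to one decimal; 0/0 counts as 100.0 (assumed convention)."""
    return round(100.0 * hit / found, 1) if found else 100.0


def delta(now, then):
    """Percentage-point change, or None when there is no baseline value."""
    return None if then is None else round(now - then, 1)


def compute_diff(current, baseline):
    base_idx = {f['path']: f for f in baseline}
    rows = []
    for f in current:
        b = base_idx.get(f['path'])  # None means the file is new in this PR
        line_now = pct(f.get('lh', 0), f.get('lf', 0))
        branch_now = pct(f.get('brh', 0), f.get('brf', 0))
        line_then = pct(b.get('lh', 0), b.get('lf', 0)) if b else None
        branch_then = pct(b.get('brh', 0), b.get('brf', 0)) if b else None
        rows.append({
            'path': f['path'],
            'is_new': b is None,
            'line_now': line_now, 'line_delta': delta(line_now, line_then),
            'branch_now': branch_now, 'branch_delta': delta(branch_now, branch_then),
        })
    # Also catch deletions: files in baseline but not in current.
    current_paths = {f['path'] for f in current}  # built once, not per iteration
    for path, b in base_idx.items():
        if path not in current_paths:
            rows.append({
                'path': path, 'is_deleted': True, 'line_now': None,
                'line_then': pct(b.get('lh', 0), b.get('lf', 0)),
            })
    return rows
```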
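The per-file rows do not by themselves give the overall figure that the report header shows. One way to compute it, sketched here under the assumption that "overall" means total hit lines over total found lines (not an average of per-file percentages), is below. The names `overall_pct` and `overall_delta` are hypothetical.

```python
def overall_pct(files, hit_key='lh', found_key='lf'):
    """Aggregate coverage: total hit / total found across all files.

    Returns None when nothing was instrumented at all.
    """
    hit = sum(f.get(hit_key, 0) for f in files)
    found = sum(f.get(found_key, 0) for f in files)
    return 100.0 * hit / found if found else None


def overall_delta(current, baseline, hit_key='lh', found_key='lf'):
    """Percentage-point change of the aggregate, or None if either side is empty."""
    now = overall_pct(current, hit_key, found_key)
    then = overall_pct(baseline, hit_key, found_key)
    if now is None or then is None:
        return None
    return round(now - then, 1)


# Deleting a poorly covered file raises the aggregate with no new tests written.
baseline = [
    {'path': 'src/a.ts', 'lh': 90, 'lf': 100},
    {'path': 'src/legacy/old.ts', 'lh': 22, 'lf': 100},
]
current = [{'path': 'src/a.ts', 'lh': 90, 'lf': 100}]
print(overall_delta(current, baseline))
```

Passing `hit_key='brh', found_key='brf'` gives the branch aggregate from the same data. The sample at the bottom is why deleted files belong in the report: the aggregate moves even though no remaining file changed.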
Reviewers care most about big drops. Sort by line_delta ascending
(most-negative first), with new sub-threshold files at the top:
```python
def classify(row):
    if row.get('is_deleted'): return 'deleted'
    if row.get('is_new') and row['line_now'] < 80: return 'new_below_threshold'
    if row.get('is_new'): return 'new_ok'
    if row['line_delta'] is not None and row['line_delta'] <= -5: return 'regressed'
    if row['line_delta'] is not None and row['line_delta'] < 0: return 'declined'
    if row['line_delta'] is not None and row['line_delta'] > 0: return 'improved'
    return 'unchanged'
```
The thresholds (80% for new files, -5pp for regression) are tunable per repo.
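The ordering described above (new sub-threshold files first, then most-negative line delta first) is not shown as code. A minimal sketch follows, assuming each row carries its classification under a `category` key. That key and the function name `sort_for_review` are hypothetical.

```python
def sort_for_review(rows):
    """Order rows for the reviewer: new sub-threshold files, then biggest drops.

    Rows without a line_delta (new or deleted files) are treated as 0.0,
    so they land between the declines and the improvements.
    """
    def key(row):
        pinned = 0 if row.get('category') == 'new_below_threshold' else 1
        line_delta = row.get('line_delta')
        return (pinned, line_delta if line_delta is not None else 0.0)

    return sorted(rows, key=key)


rows = [
    {'path': 'src/orders/list.ts', 'category': 'improved', 'line_delta': 4.5},
    {'path': 'src/checkout/cart.ts', 'category': 'regressed', 'line_delta': -12.8},
    {'path': 'src/checkout/discount-stack.ts', 'category': 'new_below_threshold', 'line_delta': None},
    {'path': 'src/checkout/promo.ts', 'category': 'regressed', 'line_delta': -8.5},
]
ordered = [r['path'] for r in sort_for_review(rows)]
```

`sorted` is stable, so rows with equal keys keep their input order.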
```markdown
## Coverage diff — `<sha>` vs `main` `<base-sha>`

**Overall:** line 84.2% (-0.3pp) | branch 71.5% (-0.1pp)
**Files changed:** 7 (3 regressed, 1 new, 2 improved, 1 deleted)

### ⚠ Regressions (3)
| File | Line% | Branch% |
|---------------------------------------|-------------|-------------|
| `src/checkout/cart.ts` | 65.4 (-12.8 ⬇) | 50.0 (-25.0 ⬇) |
| `src/checkout/promo.ts` | 78.0 (-8.5 ⬇) | 60.0 (-15.0 ⬇) |

### 🆕 New files (1)
| File | Line% | Branch% |
|---------------------------------------|-------------|-------------|
| `src/checkout/discount-stack.ts` | 35.0 (NEW, below 80% threshold) | 25.0 |

### ✅ Improvements (2)
| File | Line% | Branch% |
|---------------------------------------|-------------|-------------|
| `src/orders/list.ts` | 92.0 (+4.5 ⬆) | 85.0 (+10.0 ⬆) |

### 🗑 Deleted (1)
| File | Was line% |
|---------------------------------------|-------------|
| `src/legacy/old-checkout.ts` | 22.0 |
```
The four-section split (Regressions / New / Improvements / Deleted) matches reviewer attention budget. Improvements get airtime - positive feedback prevents the gate from feeling adversarial.
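A section of the report above could be rendered along these lines. This is a sketch, not the skill's actual renderer: `fmt_cell` and `render_section` are hypothetical names, and the row keys follow the diff rows shown earlier.

```python
def fmt_cell(now, delta=None):
    """Format one table cell, e.g. '65.4 (-12.8 ⬇)' or '92.0 (+4.5 ⬆)'."""
    if now is None:
        return 'n/a'
    if delta is None:
        return f"{now:.1f}"  # no baseline to compare against
    if delta == 0:
        return f"{now:.1f} (0.0)"
    arrow = '⬆' if delta > 0 else '⬇'
    return f"{now:.1f} ({delta:+.1f} {arrow})"


def render_section(title, rows):
    """Render one '### title (count)' section as a markdown table."""
    lines = [
        f"### {title} ({len(rows)})",
        "| File | Line% | Branch% |",
        "|---|---|---|",
    ]
    for r in rows:
        line_cell = fmt_cell(r.get('line_now'), r.get('line_delta'))
        branch_cell = fmt_cell(r.get('branch_now'), r.get('branch_delta'))
        lines.append(f"| `{r['path']}` | {line_cell} | {branch_cell} |")
    return "\n".join(lines)


section = render_section("⚠ Regressions", [
    {'path': 'src/checkout/cart.ts', 'line_now': 65.4, 'line_delta': -12.8,
     'branch_now': 50.0, 'branch_delta': -25.0},
])
```

Taking the count in the heading from `len(rows)` keeps the heading and the table body from drifting apart.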
PR comment APIs render long markdown by default; the summary line sits at the top so the reviewer doesn't have to scroll:
📉 Coverage 84.2% (-0.3pp) — 3 files regressed, 1 new file below threshold. See full report below.
Or if all-clear:
✅ Coverage 84.5% (+0.2pp) — no regressions, 2 files improved.
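Both summary variants can come from one small function. The sketch below assumes the category counts have already been tallied. The name `summary_line` and its parameters are hypothetical, and the wording copies the two example lines above.

```python
SEP = " \u2014 "  # the dash used in the example summary lines


def _files(n):
    """'1 file' / '3 files'."""
    return f"{n} file" if n == 1 else f"{n} files"


def summary_line(line_pct, line_delta, regressed=0, new_below=0, improved=0):
    """Build the one-line PR-comment summary from overall figures and counts."""
    head = f"Coverage {line_pct:.1f}% ({line_delta:+.1f}pp)"
    problems = []
    if regressed:
        problems.append(f"{_files(regressed)} regressed")
    if new_below:
        problems.append(f"{new_below} new {'file' if new_below == 1 else 'files'} below threshold")
    if problems:
        return f"📉 {head}{SEP}{', '.join(problems)}. See full report below."
    tail = "no regressions"
    if improved:
        tail += f", {_files(improved)} improved"
    return f"✅ {head}{SEP}{tail}."


warning = summary_line(84.2, -0.3, regressed=3, new_below=1, improved=2)
all_clear = summary_line(84.5, 0.2, improved=2)
```

The warning variant is chosen whenever anything regressed or a new file is below threshold, even if the overall delta is positive, so a hidden drop still surfaces in the first line.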
```yaml
- name: Generate diff report
  run: python scripts/coverage_diff.py current.json baseline.json > diff.md
- name: Post / update PR comment
  uses: marocchino/sticky-pull-request-comment@v2
  with:
    header: coverage-diff
    path: diff.md
```
sticky-pull-request-comment uses the header to update the same
comment across pushes - the reviewer doesn't see N copies of the
report as the PR evolves.
| Anti-pattern | Why it fails | Fix |
|---|---|---|
| Posting only the aggregate (overall ± Xpp) | Hides which file regressed; reviewer can't act. | Per-file table sorted by drop (Steps 4-5). |
| One thread per push (new comment per commit) | PR conversation drowns in coverage churn; nobody reads. | Sticky comment updated in place (Step 7). |
| Showing every unchanged file | 500-row tables; the 3 regressions are buried. | Filter to only changed files; one summary line for unchanged count. |
| Adversarial framing ("FAIL: coverage dropped") | Reviewer associates coverage tool with friction; team disables. | Show improvements too (Step 5). Gate failures are the gate's job; this report is informational. |
| Using PR's merge-base coverage (re-runs main coverage) | Doubles CI cost; flake risk on the main re-run. | Cache main coverage as artifact (Step 2 Pattern A). |
| Hiding new files because they "don't have a baseline" | New files are exactly where regressions enter the codebase. | Always show new files; flag the sub-threshold ones explicitly (Step 4). |
| Ignoring deleted files | Coverage went up because high-coverage code was deleted; aggregate misleads. | Show deletions (Step 3); explain in summary if they cause aggregate movement. |
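The "showing every unchanged file" fix in the table above could be sketched as a filter that keeps changed rows and collapses the rest into a count. The names `is_changed` and `split_unchanged` are hypothetical, and the row keys follow the diff rows shown earlier.

```python
def is_changed(row):
    """A row is worth showing if the file is new, deleted, or any delta is non-zero."""
    if row.get('is_new') or row.get('is_deleted'):
        return True
    return any(row.get(k) not in (None, 0) for k in ('line_delta', 'branch_delta'))


def split_unchanged(rows):
    """Return (rows to render, number of unchanged files to summarize)."""
    changed = [r for r in rows if is_changed(r)]
    return changed, len(rows) - len(changed)


rows = [
    {'path': 'src/a.ts', 'is_new': False, 'line_delta': 0.0, 'branch_delta': 0.0},
    {'path': 'src/b.ts', 'is_new': False, 'line_delta': -8.5, 'branch_delta': 0.0},
    {'path': 'src/c.ts', 'is_new': True, 'line_delta': None, 'branch_delta': None},
    {'path': 'src/d.ts', 'is_deleted': True, 'line_now': None},
]
changed, unchanged_count = split_unchanged(rows)
footer = f"{unchanged_count} other file(s) unchanged."
```

A file whose line% held steady but whose branch% moved still counts as changed here; whether that is the desired behavior is a per-repo choice.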
- lcov-analysis - LCOV parser this skill consumes.
- cobertura-analysis - Cobertura parser this skill consumes.
- jest-coverage-analysis, jacoco-analysis, coverage-py-analysis - language-specific parsers; convert to LCOV / Cobertura before feeding this skill.
- unit-test-coverage-targeter - downstream skill that reads the same data to suggest which uncovered branches to target next.
npx claudepluginhub testland/qa --plugin qa-test-reporting