From xp-stack
Reviews and optimizes GitHub Actions workflows for performance (cache layering, matrix sharding with Vitest 2.x+/Playwright blob+merge), quality (gate separation, observability via STEP_SUMMARY), security (SHA pinning of third-party actions, OIDC instead of long-lived secrets, pull_request_target hardening, persist-credentials hardening), and avoids documented anti-patterns (parallel cache corruption, eval gates without baseline, duplicate CI runs on same SHA). Use when editing any file under .github/workflows/, when adding or removing CI jobs, when triaging slow CI, when reviewing third-party action updates, or when discussing supply chain security of GitHub Actions (SHA pinning, OIDC, third-party action updates).
> **For engines without skill loading (Cursor, Codex without MCP):** read this entire file and follow the instructions as if they were your own. You don't need to "invoke" anything; just comply. Cursor and Codex setups that have `xp-stack` installed via npm read this SKILL.md from `.cursor/rules/` or `.codex/skills/`.
You are the CI Auditor. Your mission is to run the 10-item pre-flight checklist on every changed workflow: SHA pinning, OIDC, pull_request_target risk, concurrency, efficient triggers, artifact v4, coverage in shards, bash hardening, calibrated gates, persist-credentials. Block the merge if any critical item fails.
Prescriptive skill for any edit in .github/workflows/*.yml. Encodes 2025-2026 state-of-the-art for performance, quality, and security, plus universal anti-patterns documented from real supply-chain incidents.
Auto-activated via the paths: .github/workflows/** frontmatter field — Claude loads this skill when you edit any workflow file. Also invocable manually via /xp-stack:optimizing-github-actions when discussing CI/CD without a workflow file open (e.g., SHA pinning, OIDC migration, runner cost analysis).
| Situation | Action | Sub-file to read |
|---|---|---|
| Editing any `.github/workflows/*.yml` | Run the 10-item pre-flight checklist below | none |
| Adding a third-party action | Validate SHA pin + check recent supply-chain incidents | references/security.md |
| Adding/changing test sharding | Vitest blob or independent reporter — depends | references/sharding.md |
| Adding `actions/upload-artifact` or `download-artifact` | Validate unique names per job (v4 breaking) | references/artifacts-v4.md |
| Triaging slow CI | Measure baseline BEFORE proposing optimization | references/caching.md + Iron Law |
| Touching expensive workflow (LLM, long build) | PR-only, no push trigger | Item 5b of checklist |
| Refactoring for reuse | Reusable workflow vs composite action | references/reuse-patterns.md |
| Configuring concurrency / triggers | PR cancel-in-progress true, deploy false | references/concurrency-and-triggers.md |
| Adding pipeline dashboard | STEP_SUMMARY + sticky comment | references/observability.md |
1. Before merging any change to .github/workflows/*.yml, run the 10-point pre-flight checklist below.
2. Before suggesting a "speed up CI" optimization, MEASURE BASELINE first via `gh run view --log` or the Actions UI. No guesses.
3. Before adding a third-party action, SHA-pin it. No exceptions for "trusted org".
Violating these three rules has caused real production incidents documented across the industry: Docker cache corruption costing 5min/job with no hit, duplicate CI runs on the same SHA wasting minutes, eval gates without baseline failing every PR.
Copy this section when reviewing an edit. Each item has a one-line rationale + reference. Items without OK in your review = fix before proposing the PR.
1. **SHA pinning of third-party actions.** Does every third-party action (`uses: org/action@...`) use a 40-char full-length SHA + `# vN.N.N` comment? Official GitHub actions (`actions/*`) may use a major version but should be marked `# OK: official`. Reason: tj-actions/changed-files (CVE-2025-30066, March 2025): the attacker rewrote 350+ tags to point at a malicious commit, exposing the ~23k repos that used the action. Git tags are mutable. → references/security.md
2. **OIDC for cloud auth.** Does the workflow authenticate to AWS/GCP/Azure? If yes, does it use `permissions: id-token: write` + `aws-actions/configure-aws-credentials` (or equivalent) instead of `${{ secrets.AWS_ACCESS_KEY_ID }}`? Reason: OIDC = ephemeral JWT, no duplicated credential, automatic rotation, granular control via trust policy. Long-lived secrets are only justifiable for services without OIDC support. → references/security.md
3. **pull_request_target hardening.** If the trigger is `pull_request_target`, does the workflow NOT check out the PR head (`ref: ${{ github.event.pull_request.head.sha }}`) without extreme sandboxing? Reason: classic RCE vector. `pull_request_target` runs with base repo secrets; checking out the PR head and running its code (npm install, build) gives an attacker with a fork a shell on a runner that holds secrets. → references/security.md
4. **Concurrency.** Do PR workflows have `concurrency.group: ${{ github.workflow }}-${{ github.ref }}` + `cancel-in-progress: true`? Do deploy workflows have `cancel-in-progress: false`? Reason: a PR cancels its previous run (savings + faster feedback); a deploy NEVER cancels an in-progress run (risk of inconsistent prod state). → references/concurrency-and-triggers.md
5. **Trigger efficiency** (two sub-items, both required):
   - **5a.** Is it `pull_request: [main, dev]` BUT NOT `push: [main, dev]`? A push trigger on dev runs CI twice when a feature PR merges into dev (the PR already tested the change, and the push generates a new event). Keep `push` only on main (covers direct hotfixes and serves as the deploy gate). Reason: universal duplication anti-pattern; the same SHA is paid for twice in minutes and money.
   - **5b.** Are expensive workflows (LLM, long build) PR-only, with no `push` trigger?
6. **Artifact naming (v4).** Does each `actions/upload-artifact@v4` have a UNIQUE name (including `${{ matrix.shard }}` when applicable)? Does the consolidator job use `pattern: ...-*` + `merge-multiple: true` or the merge sub-action? Reason: v4 broke compatibility with v3 (Dec 2023): artifacts are immutable, unique names are required, and there is no automatic merge. → references/artifacts-v4.md
7. **Coverage threshold in sharded tests.** Is the coverage threshold NOT being applied on individual shards? Each shard sees partial coverage, so a per-shard threshold fails spuriously. Apply the threshold only on the merge job (Vitest: `COVERAGE_SKIP_THRESHOLDS=1` on shards + `--merge-reports` + real threshold on merge). Reason: math. 1/3 of the files = 1/3 of the absolute coverage. → references/sharding.md
8. **Bash hardening.** Do long `run: |` scripts (>5 lines) start with `set -euo pipefail`? Do loops (`for ... in ...`) have explicit `set -e` or `|| exit` on critical commands? Reason: GHA only adds `-o pipefail` when `shell: bash` is set explicitly (the implicit default is `bash -e {0}`), and neither form sets `-u`; a silent failure inside a loop means one broken deploy doesn't stop the rest. → references/security.md (section "Bash hardening")
9. **Calibrated blocking gate.** Does any step that does `exit 1` on a numeric metric (eval, coverage, perf budget) have an empirical PoC confirming the metric DISCRIMINATES between baseline and experiment? If not, is it an informative gate (`echo "::warning::..." && exit 0`) with a sticky comment? Reason: a hard threshold without empirical data fails healthy PRs and trains devs to bypass it. Use informative mode until a baseline is established. → references/observability.md
10. **Checkout hardening.** Do `actions/checkout` calls in workflows that DON'T do a subsequent `git push` have `persist-credentials: false`? Reason: principle of least privilege. v6+ stores credentials in `$RUNNER_TEMP` (safer than the `.git/config` of v3-v5), but if the step doesn't need to authenticate to the Git remote, it shouldn't persist a credential at all. → references/security.md
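The last three checklist entries (bash hardening, calibrated gate, checkout hardening) can be sketched together in one read-only job. This is a minimal illustration, not the skill's own template: the workflow name, the `./scripts/run-eval.sh` script, and the provisional 0.70 threshold are hypothetical.

```yaml
# Sketch only: workflow name, script path, and threshold are assumptions.
name: eval-report
on:
  pull_request:
    branches: [main]

permissions:
  contents: read

jobs:
  eval:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4 # OK: official
        with:
          persist-credentials: false # no git push later, so no stored credential

      - name: Run eval and report (informative gate)
        shell: bash
        run: |
          set -euo pipefail # fail on errors, unset variables, and broken pipes
          # Hypothetical script that prints a single number such as 0.64
          score="$(./scripts/run-eval.sh)"
          echo "Eval score: ${score}" >> "$GITHUB_STEP_SUMMARY"
          # Warn instead of failing until a baseline shows the metric discriminates
          if awk -v s="$score" 'BEGIN { exit !(s < 0.70) }'; then
            echo "::warning::Eval score ${score} is below the provisional 0.70 threshold"
          fi
```

The step always exits 0, so the gate stays informative; switching it to blocking would mean replacing the warning with `exit 1` once the baseline exists.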
| Suite | Typical volume | Recommended shards | Reporter | Merge job? |
|---|---|---|---|---|
| Vitest unit | <500 files | 1 (no shard) | default | no |
| Vitest unit | 500-2000 files | 2-3 | blob (v2.0.0+) | yes — --merge-reports |
| Vitest unit | >2000 files | 3-4 | blob | yes |
| Playwright e2e | <50 specs | 1 | default | no |
| Playwright e2e | 50-200 specs | 2-3 | any (blob if you want a single HTML) | optional |
| Playwright e2e | >200 specs | 3-4 | blob | yes — merge-reports --reporter html |
Sweet spot: 2-3 shards. 4+ has diminishing return (setup overhead × N vs speedup). Sharding without blob reporter is also valid — each shard uploads a separate artifact on failure.
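As a rough sketch of the blob + merge row for Vitest, assuming an npm project with Vitest 2.x+ and three shards (the workflow and job names are illustrative, and coverage thresholds are left out on purpose so they would only be added to the merge step):

```yaml
# Sketch only: names and shard count are assumptions.
name: unit-tests
on:
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        shard: [1, 2, 3]
    steps:
      - uses: actions/checkout@v4 # OK: official
        with:
          persist-credentials: false
      - uses: actions/setup-node@v4 # OK: official
        with:
          node-version: 20
          cache: npm
      - run: npm ci
      - name: Run one shard with the blob reporter
        run: npx vitest run --reporter=blob --shard=${{ matrix.shard }}/3
      - name: Upload blob report
        if: ${{ !cancelled() }}
        uses: actions/upload-artifact@v4 # OK: official
        with:
          name: blob-report-${{ matrix.shard }} # unique name per shard
          path: .vitest-reports/*
          include-hidden-files: true # the report directory is a dot-directory
          retention-days: 1

  merge:
    needs: [test]
    if: ${{ !cancelled() }}
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4 # OK: official
        with:
          persist-credentials: false
      - uses: actions/setup-node@v4 # OK: official
        with:
          node-version: 20
          cache: npm
      - run: npm ci
      - name: Download all shard reports into one directory
        uses: actions/download-artifact@v4 # OK: official
        with:
          path: .vitest-reports
          pattern: blob-report-*
          merge-multiple: true
      - name: Merge reports
        run: npx vitest --merge-reports
```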
| Cloud / Service | OIDC support | Action |
|---|---|---|
| AWS | yes (official) | permissions: id-token: write + aws-actions/configure-aws-credentials |
| GCP | yes (official) | google-github-actions/auth with Workload Identity Federation |
| Azure | yes (official) | azure/login with client-id + tenant-id + subscription-id |
| HashiCorp Vault | yes (official) | hashicorp/vault-action with OIDC |
| GitHub Container Registry | not needed (built-in token) | `${{ secrets.GITHUB_TOKEN }}` is already ephemeral and scoped to the job (not OIDC, but not a long-lived secret either) |
| SaaS without OIDC (most third-party APIs) | no | Long-lived API key in secret — no alternative |
For mandatory long-lived secrets, mark with explicit comment: # OIDC not supported by <service>.
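A minimal sketch of both cases in one deploy job, assuming AWS as the cloud. The role ARN, region, secret name, and endpoint are hypothetical, and the SHA is left as a placeholder on purpose because pins go stale; resolve it before use.

```yaml
# Sketch only: role, region, secret name, and URL are assumptions.
name: deploy
on:
  push:
    branches: [main]

permissions:
  id-token: write # lets the job request an OIDC token
  contents: read

concurrency:
  group: deploy-${{ github.ref }}
  cancel-in-progress: false

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4 # OK: official
        with:
          persist-credentials: false

      # Replace <FULL_40_CHAR_SHA> with the commit SHA resolved for the tag you want.
      - uses: aws-actions/configure-aws-credentials@<FULL_40_CHAR_SHA> # vN.N.N
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-actions-deploy # hypothetical role
          aws-region: us-east-1 # hypothetical region

      - name: Confirm the assumed identity
        run: aws sts get-caller-identity

      - name: Call a SaaS API that has no OIDC support
        env:
          # OIDC not supported by <service>
          SAAS_API_KEY: ${{ secrets.SAAS_API_KEY }} # hypothetical secret, scoped to this step only
        run: |
          set -euo pipefail
          curl --fail --silent --show-error \
            -H "Authorization: Bearer ${SAAS_API_KEY}" \
            https://api.example.com/v1/deployments # hypothetical endpoint
```

Keeping the long-lived key in the step's `env:` rather than at workflow level means the other steps, including third-party actions, never see it.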
actions/cache@v4 vs setup-node native cache

| Case | Use | Reason |
|---|---|---|
| Cache node_modules (with heavy symlinks/postinstall) | actions/cache@v4 on node_modules | Skips entire npm ci in parallel jobs |
| Cache only ~/.npm (skips download but runs npm ci) | cache: npm in setup-node@v4 | Official default, simple |
| Cache Playwright browsers | actions/cache@v4 on ~/.cache/ms-playwright | setup-node doesn't cover |
| Cache build outputs (dist/) | actions/cache@v4 on build path | Same |
| Cache heavy Docker image (>500MB) | DON'T cache if hit rate < 50% | Documented anti-pattern (write contention + corruption in parallel) |
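The first three rows could look roughly like this in an e2e job. It is a sketch under assumptions: an npm project with a `package-lock.json`, Playwright installed as a dev dependency, and an illustrative cache key format.

```yaml
# Sketch only: job name and cache key format are assumptions.
name: e2e
on:
  pull_request:
    branches: [main]

jobs:
  e2e:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4 # OK: official
        with:
          persist-credentials: false
      - uses: actions/setup-node@v4 # OK: official
        with:
          node-version: 20
          cache: npm # caches ~/.npm only; npm ci still runs
      - run: npm ci

      - name: Cache Playwright browsers
        id: pw-cache
        uses: actions/cache@v4 # OK: official
        with:
          path: ~/.cache/ms-playwright
          key: playwright-${{ runner.os }}-${{ hashFiles('package-lock.json') }}

      - name: Install browsers and OS dependencies on cache miss
        if: steps.pw-cache.outputs.cache-hit != 'true'
        run: npx playwright install --with-deps

      - name: Install only OS dependencies on cache hit
        if: steps.pw-cache.outputs.cache-hit == 'true'
        run: npx playwright install-deps

      - run: npx playwright test
```

Keying on the lockfile hash means the browser cache is rebuilt whenever the Playwright version changes, which avoids a version mismatch between the package and the cached binaries.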
dorny/paths-filter vs native paths

| Case | Use | Reason |
|---|---|---|
| Whole workflow skips if nothing relevant changed | Native paths/paths-ignore on trigger | Simpler, native |
| Specific jobs run conditionally; others always run | dorny/paths-filter (or step-security/paths-filter) | Job-level with boolean flags |
| Required status check on PR + workflow sometimes skips | Avoid native paths (skip = skipped check = blocks merge) | Use paths-filter which always runs but skips internally |
Consider step-security/paths-filter as a hardened drop-in for dorny/paths-filter — aligned with post-tj-actions hardening.
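A sketch of the job-level pattern from the second and third rows: the workflow always runs, and a filter job decides which jobs do real work. The `web/**` path and job names are hypothetical, and the SHA is a placeholder to be resolved before use.

```yaml
# Sketch only: paths and job names are assumptions.
name: ci
on:
  pull_request:
    branches: [main]

jobs:
  changes:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: read # the filter lists changed files through the API on PR events
    outputs:
      frontend: ${{ steps.filter.outputs.frontend }}
    steps:
      - uses: actions/checkout@v4 # OK: official
        with:
          persist-credentials: false
      - id: filter
        # Replace <FULL_40_CHAR_SHA> with the commit SHA resolved for the tag you want.
        uses: dorny/paths-filter@<FULL_40_CHAR_SHA> # vN.N.N
        with:
          filters: |
            frontend:
              - 'web/**'

  frontend-tests:
    needs: changes
    if: needs.changes.outputs.frontend == 'true'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4 # OK: official
        with:
          persist-credentials: false
      - run: npm ci
      - run: npm test

  lint:
    # Always runs, regardless of which paths changed
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4 # OK: official
        with:
          persist-credentials: false
      - run: npm ci
      - run: npm run lint
```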
| Case | Use | Reason |
|---|---|---|
| Reuse across multiple jobs with possibly different runners | Reusable workflow (workflow_call) | Defines complete jobs with runners |
| Reuse of steps inside 1 job (e.g., setup-node + cache + npm ci) | Composite action | Lighter, no new runner |
| Steps need inherited secrets | Reusable workflow (supports secrets: inherit) | Composite has no native secrets mechanism |
| Granular logging per step | Reusable workflow | Composite is logged as ONE consolidated step |
| Deep nesting (up to 10 levels) | Composite action | Nested reusable workflows are supported but depth-limited, and every level adds separate jobs; composite nesting stays inside one job |
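To illustrate the reusable-workflow rows, here is a hedged sketch of a callee and its caller as two YAML documents. The file names, the `node-version` input, and the `API_TOKEN` secret are hypothetical.

```yaml
# File 1 (hypothetical path): .github/workflows/reusable-test.yml
name: reusable-test
on:
  workflow_call:
    inputs:
      node-version:
        type: string
        default: "20"

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4 # OK: official
        with:
          persist-credentials: false
      - uses: actions/setup-node@v4 # OK: official
        with:
          node-version: ${{ inputs.node-version }}
          cache: npm
      - run: npm ci
      - name: Run tests
        env:
          API_TOKEN: ${{ secrets.API_TOKEN }} # hypothetical secret, passed down by the caller
        run: npm test
---
# File 2 (hypothetical path): .github/workflows/ci.yml
name: ci
on:
  pull_request:
    branches: [main]

jobs:
  unit:
    uses: ./.github/workflows/reusable-test.yml
    with:
      node-version: "20"
    secrets: inherit # forwards the caller's secrets to the called workflow
```

A composite action would cover the checkout + setup-node + `npm ci` steps with less overhead, but it could not receive `secrets: inherit`; the secret would have to be passed as an explicit input.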
cancel-in-progress

| Workflow | group | cancel-in-progress |
|---|---|---|
| CI on PR | ${{ github.workflow }}-${{ github.ref }} | true |
| CI on push to main | ${{ github.workflow }}-${{ github.ref }} | false (deploy gate) |
| Deploy | deploy-${{ github.ref }} | false (NEVER cancel rollout) |
| Hybrid (PR + push) | ${{ github.workflow }}-${{ github.ref }} | ${{ github.event_name == 'pull_request' }} |
| Expensive post-merge workflows | ${{ github.workflow }}-${{ github.ref }} | true (cancels rebuild on consecutive pushes) |
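The hybrid row can be sketched as a single workflow that serves both PRs and pushes to main; the job content is illustrative.

```yaml
# Sketch only: job content is an assumption.
name: ci
on:
  pull_request:
    branches: [main, dev]
  push:
    branches: [main] # no push trigger on dev, so the same change is not tested twice

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  # true for PR events (cancel superseded runs), false for pushes to main (deploy gate)
  cancel-in-progress: ${{ github.event_name == 'pull_request' }}

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4 # OK: official
        with:
          persist-credentials: false
      - run: npm ci
      - run: npm test
```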
- `uses: org/action@vN` (tag) without SHA pin in a workflow with `secrets:` or `permissions: write` → see item 1.
- `${{ secrets.AWS_* }}` / `${{ secrets.GCP_* }}` / `${{ secrets.AZURE_* }}` when OIDC is supported → see item 2.
- `on: pull_request_target` + `actions/checkout` with `ref: head.*` → BLOCKING, see item 3.
- `pull_request: [dev]` + `push: [dev]` → see item 5a.
- Expensive workflow (LLM, long build) with a `push` trigger → see item 5b.
- `actions/upload-artifact@v4` without unique name in matrix → see item 6.
- `vitest run --coverage --shard=N/M` without `COVERAGE_SKIP_THRESHOLDS` (global threshold applied to partial shard) → see item 7.
- `run: |` with a `for` loop without `set -e` or `|| exit` → see item 8.
- `exit 1` on a numeric-metric gate without empirical baseline PoC → see item 9.
- `actions/checkout` without `persist-credentials: false` in a read-only workflow → see item 10.
- Cache `key` without measuring hit rate → see references/caching.md (parallel cache corruption section).
- `env:` at top level with `${{ secrets.X }}` when only 1-2 steps use it → secret scope too wide, move to step level.

| "I'll just..." | Reality | What to do |
|---|---|---|
| "...use @v4 instead of SHA pin, it's a trusted action" | tj-actions had 23k repos trusting it. A tag can be rewritten in seconds. | SHA pin takes 30s: `gh api repos/{org}/{repo}/git/refs/tags/{tag} --jq '.object.sha'` |
| "...put the secret in workflow env: to simplify" | Exposes it to all steps including third-party actions | Move it to the step that uses it: env: at step level |
| "...check out PR head in pull_request_target, more convenient" | Classic RCE: attacker via fork gets shell | Use pull_request + filter on head.repo.full_name if you need the head |
| "...also run CI on push to dev, just to make sure the merge is OK" | PR already tested the same SHA, double cost | Trust the PR. push only on main (deploy gate) |
| "...activate the eval gate at 0.70 already, it's a reasonable number" | Without PoC, any number is a guess. May fail 100% of PRs | Informative mode (::warning + exit 0) for N PRs until baseline exists |
| "...cache this big Docker image to save 30s" | If hit rate < 50%, save cost exceeds benefit. Cache can corrupt in parallel | Measure hit rate over 5-10 runs before keeping |
| "...persist-credentials by default, no one will inspect it" | Subsequent third-party step can leak via git push or inspection | persist-credentials: false on read-only checkouts |
DO NOT hardcode SHA pins in this skill — they age within days. Use the audit script to resolve tag → current SHA:
```bash
# Resolve tag to current SHA (monthly rotation)
gh api repos/actions/checkout/git/refs/tags/v5.0.0 --jq '.object.sha'

# Audit all workflows in the repo
bash "${CLAUDE_PLUGIN_ROOT}/skills/optimizing-github-actions/scripts/audit-action-pins.sh"
```
Useful gh commands for triage:
```bash
gh workflow list                                             # List workflows
gh run list --workflow=ci.yml --limit=10                     # Last 10 CI runs
gh run view <run-id> --log                                   # Logs of a run
gh run view <run-id> --json conclusion,jobs                  # Structured status, including per-job data
gh api /repos/{owner}/{repo}/actions/runs/<run-id>/timing    # Billable time and total run duration
```
Essential official docs:
GitHub Actions evolves quickly (actions/upload-artifact@v4 breaking change in Dec 2023, free ARM in Jan 2025, supply-chain incident in March 2025, pull_request_target change in Nov 2025). To keep this skill current:
- references/ file + bump the version footer.
- scripts/audit-action-pins.sh.

Sub-files (loaded on demand):
- `references/security.md`: SHA pinning, OIDC, pull_request_target, persist-credentials, secret scoping, bash hardening, tj-actions postmortem.
- `references/sharding.md`: Vitest blob+merge (v2.0.0+), Playwright shard with/without blob, fail-fast semantics, coverage in shards.
- `references/artifacts-v4.md`: Breaking changes vs v3, merge patterns, naming.
- `references/caching.md`: actions/cache@v4 vs setup-node native, restore-keys hierarchy, parallel cache corruption anti-pattern.
- `references/concurrency-and-triggers.md`: Concurrency groups, paths-filter (native vs dorny), avoiding SHA duplication.
- `references/observability.md`: STEP_SUMMARY, sticky comments, dashboards, calibrated gates.
- `references/reuse-patterns.md`: Reusable workflows vs composite actions, decision matrix.

Examples:
- `examples/good-workflow.yml`: Annotated template with SHA pins, OIDC, concurrency, correct sharding.
- `examples/bad-workflow.yml`: Same workflow with red flags marked (`# RED FLAG: ...`).

Scripts:
- `scripts/audit-action-pins.sh`: Iterates workflows, classifies `uses:` lines into PIN/MAJ/TAG/BAD/LOC, exits 1 if red flags are found.

Version: 0.2.0 (2026-04-26)
Last audited: 2026-04-26
npx claudepluginhub rnobre1/xp-stack --plugin xp-stack