Skill

debug-pipeline

Systematically debug CI/CD pipeline failures -- build errors, test failures, deployment issues, runner problems, trigger mismatches, permission errors, secret availability, flaky tests, and silent failures. Uses systematic debugging (reproduce, isolate, hypothesize, test, confirm, fix, verify) -- never guesses. Triggers on "CI is broken", "pipeline failing", "workflow not running", "debug my pipeline", "build failed", "deploy failed", "runner not picking up jobs", "workflow trigger not working", "secrets not available", "CI flaky", "pipeline keeps failing", "why did CI fail", "fix my workflow". Produces root cause analysis with concrete fix.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/cicd-expert:debug-pipeline

User invocable

Model invocable

Inline context

Default effort

Tool Access

This skill is limited to the following tools:

Agent, Read, Grep, Glob, Bash, TodoWrite, WebSearch, WebFetch, mcp__plugin_context7_context7__resolve-library-id, mcp__plugin_context7_context7__query-docs

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Dispatches the cicd-expert agent with a debug-workflow briefing.

SKILL.md

104 lines · ~1k tokens

Stats

Stars: 0
Maintenance: Good
Last Commit: Apr 16, 2026

Debug Pipeline

Dispatches the cicd-expert agent with a debug-workflow briefing.

Step 1: Gather failure context

Before dispatching, collect:

- Error message or symptom (from user or from reading logs)
- Which workflow file and job
- When it started failing (after which change?)
- Whether it is intermittent (flaky) or consistent
- Platform and runner type

Read the failing workflow file and any recent git changes to CI config.
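
The "recent git changes to CI config" part of this step can be sketched as a small helper. This is a minimal illustration, not part of the skill itself: the function names and the `.github/workflows` default are assumptions (the path applies to GitHub Actions only; other platforms keep their config elsewhere).

```python
import subprocess

# Assumption: GitHub Actions layout. Adjust for other CI platforms.
CI_PATHS = (".github/workflows",)


def ci_log_command(paths=CI_PATHS, limit=10):
    """Build the git command that lists recent commits touching CI config."""
    return ["git", "log", f"--max-count={limit}", "--oneline", "--", *paths]


def recent_ci_changes(repo_dir=".", paths=CI_PATHS, limit=10):
    """Return one line per recent commit that touched CI config, newest first.

    Returns an empty list if git exits with an error (for example, when
    repo_dir is not inside a git repository).
    """
    result = subprocess.run(
        ci_log_command(paths, limit),
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return result.stdout.splitlines()
```

The output of `recent_ci_changes()` is the kind of text that would go into the "Recent changes" field of the briefing.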

Step 2: Dispatch cicd-expert

Agent({
  description: "Debug CI/CD failure",
  subagent_type: "cicd-expert:cicd-expert",
  model: "opus",
  prompt: "<see briefing below>"
})

Briefing

ORIGINAL USER REQUEST: <verbatim>

WORKFLOW: debug

FAILURE CONTEXT:
- Error message/symptom: <from user or logs>
- Workflow file: <path>
- Failing job: <name if known>
- Intermittent or consistent: <if known>
- Recent changes: <git log of CI-relevant changes>
- Platform: <detected>
- Runner type: <detected>
- Working directory: <absolute path>

DELIVERABLES:
1. Systematic diagnosis:
   a. Reproduce -- confirm the failure exists and identify exact conditions
   b. Isolate -- which job, which step, which command fails?
   c. Check common failure categories:
      - Trigger mismatch (event type, branch filter, path filter)
      - Permission error (GITHUB_TOKEN scope, environment protection)
      - Secret unavailability (fork PR, missing environment, wrong scope level)
      - Runner issue (label mismatch, capacity, ARC scaling, ephemeral cleanup)
      - Dependency issue (cache miss, lockfile drift, registry outage)
      - Configuration syntax (YAML indentation, expression syntax, matrix)
      - Concurrency conflict (concurrent runs, resource contention)
      - Network issue (egress, DNS, private network)
   d. Hypothesize -- form 1-3 ranked hypotheses based on evidence
   e. Test -- check each hypothesis against the config and logs
   f. Confirm -- identify root cause with evidence

2. Root cause analysis:
   - What failed
   - Why it failed
   - When it started failing (if determinable)
   - What change caused it (if determinable)

3. Fix:
   - Exact configuration change (YAML diff)
   - Why this fixes it (with confidence grade)
   - How to verify the fix

4. Prevention:
   - What would have caught this earlier?
   - Any workflow scanning or validation to add?

CONSTRAINTS:
- NEVER guess the root cause -- follow the systematic path
- Read the actual workflow file and any referenced reusable workflows
- Check for common gotchas: YAML quoting, expression syntax, action version mismatches
- If the error involves a third-party action, check its documentation via context7 or WebFetch
- Document the root cause and fix clearly so the user can prevent recurrence

Proceed with your standard workflow (reference files first for prior similar failures, then read the pipeline config, then diagnose systematically).
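
As an illustration of how the FAILURE CONTEXT block of the briefing might be filled in from the fields gathered in Step 1, here is a minimal sketch. The `FailureContext` class and `render_failure_context` function are hypothetical names introduced for this example; the skill does not prescribe any particular implementation. Fields that could not be determined are rendered as "unknown" rather than guessed.

```python
from dataclasses import dataclass


@dataclass
class FailureContext:
    """Fields gathered in Step 1; anything not determined stays 'unknown'."""
    symptom: str
    workflow_file: str
    failing_job: str = "unknown"
    frequency: str = "unknown"  # "intermittent" or "consistent"
    recent_changes: str = "unknown"
    platform: str = "unknown"
    runner_type: str = "unknown"
    working_directory: str = "unknown"


def render_failure_context(ctx):
    """Render the FAILURE CONTEXT section in the briefing's field order."""
    lines = [
        "FAILURE CONTEXT:",
        f"- Error message/symptom: {ctx.symptom}",
        f"- Workflow file: {ctx.workflow_file}",
        f"- Failing job: {ctx.failing_job}",
        f"- Intermittent or consistent: {ctx.frequency}",
        f"- Recent changes: {ctx.recent_changes}",
        f"- Platform: {ctx.platform}",
        f"- Runner type: {ctx.runner_type}",
        f"- Working directory: {ctx.working_directory}",
    ]
    return "\n".join(lines)
```

Defaulting to "unknown" keeps the briefing honest about what was actually observed, which matches the "if known" qualifiers in the template.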

Step 3: Relay findings

Present root cause + fix. Offer to apply the fix directly.
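
One way to present the fix as the "exact configuration change (YAML diff)" the briefing asks for is a unified diff of the workflow file before and after the change. The sketch below uses Python's standard `difflib`; the `yaml_diff` helper and the sample workflow content are hypothetical, chosen to show a trigger-mismatch fix.

```python
import difflib


def yaml_diff(before, after, path):
    """Return a unified diff between two versions of a workflow file."""
    return "".join(
        difflib.unified_diff(
            before.splitlines(keepends=True),
            after.splitlines(keepends=True),
            fromfile=f"a/{path}",
            tofile=f"b/{path}",
        )
    )


# Hypothetical example: the branch filter names a branch that no longer exists.
before = "on:\n  push:\n    branches: [master]\n"
after = "on:\n  push:\n    branches: [main]\n"

print(yaml_diff(before, after, ".github/workflows/ci.yml"))
```

A diff in this form shows the user exactly which lines change before they accept the offer to apply the fix.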

Never do

- Guess the root cause without evidence
- Suggest "try this and see" without a hypothesis
- Skip reading the actual workflow file
- Add emojis
