From ai-tools
Track GitHub workflows, analyze failures, and automatically fix issues
Slash command
/ai-tools:github-workflow-doctor
This skill tracks a GitHub workflow run, analyzes failures, and attempts to fix issues automatically.
Files: README.md, preview.mjs, scripts/get-workflow-info.mjs, scripts/get-workflow-logs.mjs, scripts/list-running-workflows.mjs, scripts/rerun-workflow.mjs, scripts/wait-for-workflow.mjs, tests/get-workflow-info.test.mjs, tests/get-workflow-logs.test.mjs, tests/list-running-workflows.test.mjs, tests/rerun-workflow.test.mjs, tests/wait-for-workflow.test.mjs
This skill can be invoked in two ways:
If a run_id argument was provided, use it and proceed to Step 2.
Otherwise, query the repository for running workflows:
node ${CLAUDE_PLUGIN_ROOT}/skills/github-workflow-doctor/scripts/list-running-workflows.mjs
This returns a JSON array of running/queued workflows with details like:
If running workflows are found:
Present the list to the user and ask them to choose. Format each workflow in a two-line format:
⏳ Update UI; Move scripts to miniter-utility
container build #5475: Pull request by waynebrantley at 2:34 PM (2m 30s)
The format is:
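Judging only from the example above, the first line appears to hold a status icon and the run title, and the second line the workflow name, run number, trigger, actor, start time, and elapsed time. A minimal sketch of that layout follows. The field names (`status`, `title`, `workflowName`, `runNumber`, `event`, `actor`, `startedAt`, `elapsedSeconds`) and the `?` fallback icon are hypothetical, not the actual output of list-running-workflows.mjs.

```javascript
// Only the two icons that appear in this document are mapped; the status keys are assumptions.
const STATUS_ICONS = { in_progress: "⏳", failure: "❌" };

// Render a duration in seconds as "<minutes>m <seconds>s".
function formatDuration(totalSeconds) {
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return `${minutes}m ${seconds}s`;
}

// Build the two-line entry for one workflow run (hypothetical field names).
function formatWorkflow(run) {
  const icon = STATUS_ICONS[run.status] ?? "?";
  const line1 = `${icon} ${run.title}`;
  const line2 =
    `${run.workflowName} #${run.runNumber}: ${run.event} by ${run.actor} ` +
    `at ${run.startedAt} (${formatDuration(run.elapsedSeconds)})`;
  return `${line1}\n${line2}`;
}
```

The start time is passed in already formatted so the sketch does not depend on locale settings.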
If the user selects "Show all recent workflows", run:
node ${CLAUDE_PLUGIN_ROOT}/skills/github-workflow-doctor/scripts/list-running-workflows.mjs --all
Then present the expanded list and ask them to choose again.
If the user selects "Enter a specific run ID", ask them to provide the run ID.
If no running workflows are found:
Query for recent workflows including failed ones:
node ${CLAUDE_PLUGIN_ROOT}/skills/github-workflow-doctor/scripts/list-running-workflows.mjs --include-failed
If failed workflows are found, inform the user: "No running workflows, but found failed workflows you can fix."
Present the list (including failed workflows marked with ❌) and ask them to choose.
If there are still no workflows, ask the user: "What would you like to do?" Options:
Store the selected workflow run ID for use throughout the skill.
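The three invocations of list-running-workflows.mjs in this step differ only by flag. As a rough sketch, the argument list could be built as follows; the function name and the mode names (`running`, `all`, `failed`) are hypothetical labels introduced here, while the script path and flags are the ones shown above.

```javascript
// Map a (hypothetical) mode name to the flags documented in this step.
const LIST_FLAGS = new Map([
  ["running", []],
  ["all", ["--all"]],
  ["failed", ["--include-failed"]],
]);

// Return the argument list to pass to `node` for the given mode.
function listWorkflowsArgs(pluginRoot, mode = "running") {
  if (!LIST_FLAGS.has(mode)) {
    throw new Error(`unknown mode: ${mode}`);
  }
  const script =
    `${pluginRoot}/skills/github-workflow-doctor/scripts/list-running-workflows.mjs`;
  return [script, ...LIST_FLAGS.get(mode)];
}
```

In practice `pluginRoot` would come from the CLAUDE_PLUGIN_ROOT variable used in the commands above.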
Before tracking, check if the selected workflow is already completed:
node ${CLAUDE_PLUGIN_ROOT}/skills/github-workflow-doctor/scripts/get-workflow-info.mjs <run-id>
Check the status field in the JSON output:
If status === "completed":
Check the conclusion field:
If conclusion === "success":
conclusion === "failure":
If status === "in_progress" or status === "queued":
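The status check in this step amounts to a small decision table. The sketch below encodes it using the `status` and `conclusion` fields named above; the returned labels and the `ask-user` fallback for any other value are assumptions made for illustration.

```javascript
// Decide what to do next from the JSON printed by get-workflow-info.mjs.
// The return labels are hypothetical names, not values used by the skill.
function nextStepFor(info) {
  if (info.status === "completed") {
    if (info.conclusion === "success") return "report-success";
    if (info.conclusion === "failure") return "analyze-failure"; // Step 4
    return "ask-user"; // assumption: other conclusions are not covered here
  }
  if (info.status === "in_progress" || info.status === "queued") {
    return "track"; // Step 2
  }
  return "ask-user"; // assumption: unknown status
}
```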
Note: This step is only executed if the workflow is in_progress or queued (determined in Step 1.5)
Run the wait-for-workflow script to poll the workflow status:
node ${CLAUDE_PLUGIN_ROOT}/skills/github-workflow-doctor/scripts/wait-for-workflow.mjs <run-id>
This will output progress updates to stderr and final status as JSON to stdout.
Inform the user:
Note: This step is only executed if we tracked the workflow in Step 2
When the workflow completes, check the success field in the JSON output.
Report to the user:
End the skill here.
Proceed to Step 4 for failure analysis and fixing.
Run the get-workflow-logs script to fetch failure details:
node ${CLAUDE_PLUGIN_ROOT}/skills/github-workflow-doctor/scripts/get-workflow-logs.mjs <run-id>
This returns a JSON object with the following structure:
runId: the workflow run ID
failedJobs: array of failed jobs, each with:
  jobName: name of the failed job
  jobId: numeric job ID
  conclusion: "failure", "timed_out", or "cancelled"
  url: link to the job on GitHub
  failedSteps: array of failed steps with name, conclusion, number
logs: full workflow log output as a string
summary: e.g. "2 job(s) failed"
CRITICAL: Analyze this JSON directly. Do NOT generate inline node -e or bash -c commands to parse the logs. The data is already structured JSON, so read and interpret it in context. Inline scripts fail due to escaping issues (e.g. \! bash history expansion) and are unnecessary.
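To make the shape concrete, here is an illustrative object with the fields listed above, plus a small accessor. It is only meant to show the structure; it is not a script the skill should generate or run. Every value is made up, the type of `runId` is an assumption, and `listFailedSteps` is a hypothetical helper.

```javascript
// All values are invented for illustration; only the field names come from the list above.
const exampleResult = {
  runId: 1234567890, // assumption: may be a number or a string in the real output
  failedJobs: [
    {
      jobName: "build",
      jobId: 111,
      conclusion: "failure",
      url: "https://github.com/OWNER/REPO/actions/runs/1234567890/job/111",
      failedSteps: [{ name: "Run tests", conclusion: "failure", number: 4 }],
    },
  ],
  logs: "...full workflow log text...",
  summary: "1 job(s) failed",
};

// Flatten failedJobs[].failedSteps into readable "job / step" labels.
function listFailedSteps(result) {
  return result.failedJobs.flatMap((job) =>
    job.failedSteps.map(
      (step) => `${job.jobName} / step ${step.number}: ${step.name}`
    )
  );
}
```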
Analyze the JSON to identify:
Which jobs and steps failed (failedJobs[].failedSteps)
The error output (logs)
Before attempting code fixes, check the logs for transient infrastructure errors: failures caused by temporary platform issues, not by code bugs. These are resolved by re-running the workflow, not by changing code.
Known transient error patterns (check logs string for these):
blob unknown to registry: Docker registry consistency error
TLS handshake timeout: network timeout during TLS negotiation
rate limit or API rate limit exceeded: GitHub/registry rate limiting
Could not resolve host or Temporary failure in name resolution: DNS failures
502 Bad Gateway, 503 Service Unavailable, 504 Gateway Timeout: upstream server errors
unexpected EOF or connection reset by peer: connection dropped mid-transfer
i/o timeout: generic network I/O timeout
error pulling image combined with timeout/network errors: container image pull failures
Resource not accessible by integration: transient GitHub App permission errors
If a transient error is detected:
Inform the user that the failure appears to be a transient infrastructure error (<matched pattern>), not a code bug, and that the failed jobs are being re-run.
Re-run the workflow:
node ${CLAUDE_PLUGIN_ROOT}/skills/github-workflow-doctor/scripts/rerun-workflow.mjs <run-id>
If rerunTriggered is true:
If rerunTriggered is false: report the rerun failure message to the user and proceed to Step 5 (attempt code fix anyway)
If no transient error is detected:
Proceed to Step 5 as normal.
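The transient-error check above is essentially a substring match against the known patterns. The sketch below shows that matching logic for illustration only; the skill itself is told to read the logs in context rather than generate scripts. Case-insensitive matching, the function names, the returned labels, and the way "combined with timeout/network errors" is interpreted are all assumptions.

```javascript
// Patterns copied from the list above; the /i flag is an assumption.
const TRANSIENT_PATTERNS = [
  /blob unknown to registry/i,
  /TLS handshake timeout/i,
  /rate limit/i, // also covers "API rate limit exceeded"
  /Could not resolve host/i,
  /Temporary failure in name resolution/i,
  /502 Bad Gateway/i,
  /503 Service Unavailable/i,
  /504 Gateway Timeout/i,
  /unexpected EOF/i,
  /connection reset by peer/i,
  /i\/o timeout/i,
  /Resource not accessible by integration/i,
];

// Return the matched text of the first transient pattern, or null if none match.
function findTransientPattern(logs) {
  for (const pattern of TRANSIENT_PATTERNS) {
    const match = logs.match(pattern);
    if (match) return match[0];
  }
  // "error pulling image" counts only alongside a timeout or network error.
  if (/error pulling image/i.test(logs) && /(timeout|network)/i.test(logs)) {
    return "error pulling image";
  }
  return null;
}

// Transient match: re-run the workflow. Otherwise: proceed to the code-fix step.
function nextActionAfterAnalysis(logs) {
  return findTransientPattern(logs) ? "rerun" : "code-fix";
}
```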
Initialize attempt counter: Set attempt_count = 1 and max_attempts = 3
For each attempt (while attempt_count <= max_attempts):
Analyze the failure using the logs from Step 4
Identify the fix - determine what code changes are needed
Ask user if auto-fix seems uncertain:
Make the fix: Edit the necessary files
Commit the changes:
git add <changed-files>
git commit -m "fix: address workflow failure - <brief description>"
Check workflow trigger type:
node ${CLAUDE_PLUGIN_ROOT}/skills/github-workflow-doctor/scripts/get-workflow-info.mjs <original-run-id>
Check the event field in the JSON output.
Handle based on trigger type:
If event === "push":
git push
node ${CLAUDE_PLUGIN_ROOT}/skills/github-workflow-doctor/scripts/get-workflow-info.mjs --latest "<workflow-name>"
attempt_count
If event === "workflow_dispatch" or other:
git push
If fix fails again:
If attempt_count >= max_attempts:
Otherwise, increment attempt_count and continue the loop
If the workflow passes after a fix:
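The retry loop in this step can be sketched as below, using the same counter semantics (start at 1, stop after 3 attempts). `attemptFix` is a hypothetical callback standing in for "analyze, fix, commit, push, and wait for the new run", returning true when the workflow passes. What the skill does once the limit is reached is not shown in this sketch; it simply reports that the fix did not succeed.

```javascript
// Run up to maxAttempts fix attempts, stopping early when one passes.
function runFixLoop(attemptFix, maxAttempts = 3) {
  let attemptCount = 1;
  while (attemptCount <= maxAttempts) {
    const passed = attemptFix(attemptCount);
    if (passed) {
      return { fixed: true, attempts: attemptCount };
    }
    if (attemptCount >= maxAttempts) {
      break; // limit reached: stop retrying
    }
    attemptCount += 1; // try again with the next attempt number
  }
  return { fixed: false, attempts: attemptCount };
}
```

For example, `runFixLoop((n) => n === 2)` stops after the second attempt and returns `{ fixed: true, attempts: 2 }`.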
CRITICAL: Do NOT work around script failures.
If any script in this skill produces no output, fails, or returns unexpected results:
Do NOT fall back to running gh commands directly.
Example of what NOT to do:
❌ "The scripts didn't produce output. Let me check the runs directly with gh."
Instead:
✅ "The get-workflow-info.mjs script produced no output. This may indicate a bug in the skill. Would you like me to investigate or should we try a different approach?"
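One way to picture this rule: a script result is either usable JSON or a reason to stop and ask, never a cue to fall back to another tool. The helper below is a hypothetical sketch of that check; the function name, the `ask-user` label, and the assumption that every script prints JSON on stdout are illustrative.

```javascript
// Classify a script's stdout: parsed JSON on success, otherwise a reason to ask the user.
function interpretScriptOutput(scriptName, stdout) {
  const text = (stdout ?? "").trim();
  if (text === "") {
    return {
      ok: false,
      action: "ask-user",
      reason: `${scriptName} produced no output`,
    };
  }
  try {
    return { ok: true, data: JSON.parse(text) };
  } catch {
    return {
      ok: false,
      action: "ask-user",
      reason: `${scriptName} returned output that is not valid JSON`,
    };
  }
}
```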
If a script fails, ask the user how to proceed: "A skill script failed to produce output. What would you like to do?" Options:
Requires the gh CLI tool, which must be installed and authenticated.
npx claudepluginhub waynebrantley/aitools --plugin ai-tools