Generates Markdown runbooks for incident response, operational procedures, troubleshooting guides, and emergency protocols from system analysis. Outputs structured files with metadata, steps, decision trees, and escalation paths.
Slash command: /documentation-standards:create-runbook
Generate a runbook for operational procedures, incident response, or troubleshooting.
Extract from user input:
type: incident, operational, troubleshooting, emergency (default: operational)
Based on the topic and type, gather relevant information:
For incident runbooks:
For operational runbooks:
For troubleshooting runbooks:
For emergency runbooks:
runbook-creation skill for templates
type argument
Determine file location:
Priority order:
1. docs/runbooks/{type}/RB-{number}-{slug}.md
2. docs/operations/runbooks/RB-{number}-{slug}.md
3. runbooks/RB-{number}-{slug}.md
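The priority order above could be resolved roughly as follows. This is only a sketch under stated assumptions: the function name `resolve_runbook_path` is hypothetical, and it assumes a location "wins" when its base directory already exists in the repository, falling back to the first pattern when none do. The number is passed through unchanged because the numbering rules are not given here.

```python
from pathlib import Path


def resolve_runbook_path(repo_root, runbook_type, number, slug):
    """Return the repo-relative path for a new runbook.

    Assumption: a location is chosen when its base directory already exists;
    if none exist, the first (highest-priority) pattern is used.
    """
    filename = f"RB-{number}-{slug}.md"
    # (directory to probe, directory to write into), in priority order.
    candidates = [
        (Path("docs/runbooks"), Path("docs/runbooks") / runbook_type),
        (Path("docs/operations/runbooks"), Path("docs/operations/runbooks")),
        (Path("runbooks"), Path("runbooks")),
    ]
    root = Path(repo_root)
    for probe, target in candidates:
        if (root / probe).is_dir():
            return target / filename
    # Nothing exists yet: default to the highest-priority location.
    return candidates[0][1] / filename
```

For example, in a repository that only has a top-level `runbooks/` directory, `resolve_runbook_path(".", "incident", "001", "high-error-rate")` would return `runbooks/RB-001-high-error-rate.md` (the `001` value is illustrative).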
Numbering:
Generate content sections based on type:
All types include:
Type-specific sections:
| Type | Additional Sections |
|---|---|
| incident | Alert details, impact assessment, communication templates |
| operational | Rollback procedure, verification checklist |
| troubleshooting | Symptom/cause matrix, diagnostic commands |
| emergency | Immediate actions, notification list, recovery steps |
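The table above maps naturally onto a lookup keyed by runbook type. A minimal sketch, assuming hypothetical names (`ADDITIONAL_SECTIONS`, `additional_sections`); the section titles are copied from the table, and the sections shared by all types are not modelled here.

```python
# Type-specific sections, copied from the table above.
ADDITIONAL_SECTIONS = {
    "incident": ["Alert details", "impact assessment", "communication templates"],
    "operational": ["Rollback procedure", "verification checklist"],
    "troubleshooting": ["Symptom/cause matrix", "diagnostic commands"],
    "emergency": ["Immediate actions", "notification list", "recovery steps"],
}


def additional_sections(runbook_type="operational"):
    """Return the extra sections for a runbook type (default: operational)."""
    if runbook_type not in ADDITIONAL_SECTIONS:
        valid = ", ".join(ADDITIONAL_SECTIONS)
        raise ValueError(f"unknown runbook type {runbook_type!r}; expected one of: {valid}")
    # Return a copy so callers cannot mutate the shared table.
    return list(ADDITIONAL_SECTIONS[runbook_type])
```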
# Incident Runbook: {TOPIC}
| Property | Value |
|----------|-------|
| **ID** | RB-INC-{NUMBER} |
| **Alert** | [Alert name] |
| **Severity** | [SEV1/2/3/4] |
| **Service** | {SERVICE} |
| **Owner** | [Team] |
| **Last Updated** | {DATE} |
## Alert Details
[Alert trigger conditions]
## Immediate Actions (First 5 Minutes)
1. Acknowledge alert
2. Assess impact
3. Initial communication
## Diagnosis
[Decision tree and diagnostic steps]
## Resolution
[Step-by-step fix procedures]
## Verification
[How to confirm resolution]
## Communication
[Status update templates]
## Post-Incident
[Cleanup and follow-up tasks]
# Runbook: {TOPIC}
| Property | Value |
|----------|-------|
| **ID** | RB-OPS-{NUMBER} |
| **Category** | Operational |
| **Service** | {SERVICE} |
| **Owner** | [Team] |
| **Last Updated** | {DATE} |
| **Estimated Duration** | [Time] |
## Overview
[Purpose and when to use]
## Prerequisites
[Access, tools, knowledge needed]
## Procedure
### Step 1: [Name]
[Detailed instructions with commands]
### Step 2: [Name]
[Detailed instructions]
## Verification
[How to confirm success]
## Rollback
[How to undo if needed]
## Troubleshooting
[Common issues and fixes]
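Both templates mix two placeholder styles: curly-brace names such as {TOPIC} and bracketed prompts such as [Team]. One plausible reading is that the curly-brace names are filled from the command input while the bracketed prompts are left for the author. The sketch below assumes that reading; `fill_template` is a hypothetical helper, and the service name and ISO date format are illustrative choices.

```python
import re
from datetime import date


def fill_template(template, values):
    """Replace {NAME} placeholders that appear in `values`.

    Unknown {NAME} placeholders and [bracketed] prompts are left untouched.
    """
    def substitute(match):
        key = match.group(1)
        return str(values.get(key, match.group(0)))

    return re.sub(r"\{([A-Z_]+)\}", substitute, template)


header = "\n".join([
    "# Runbook: {TOPIC}",
    "| **ID** | RB-OPS-{NUMBER} |",
    "| **Service** | {SERVICE} |",
    "| **Owner** | [Team] |",
    "| **Last Updated** | {DATE} |",
])

filled = fill_template(header, {
    "TOPIC": "Database Failover",
    "SERVICE": "postgres",  # hypothetical service name
    "DATE": date.today().isoformat(),
})
```

Because `NUMBER` is not supplied in this call, `{NUMBER}` stays in the output as a visible reminder that it still needs a value.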
/create-runbook "database failover"
→ Creates operational runbook for database failover procedure
/create-runbook "high error rate" type=incident service="api-gateway"
→ Creates incident runbook for API gateway error rate alerts
/create-runbook "pod crash loop" type=troubleshooting service="order-service"
→ Creates troubleshooting guide for order service pod crashes
/create-runbook "security breach response" type=emergency
→ Creates emergency runbook for security incidents
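The invocations above suggest a quoted topic followed by optional `type=` and `service=` pairs. A minimal sketch of how such an argument string might be split, assuming shell-style quoting; `parse_runbook_args` is a hypothetical helper and not part of the skill itself.

```python
import shlex

VALID_TYPES = ("incident", "operational", "troubleshooting", "emergency")


def parse_runbook_args(arg_string):
    """Split a /create-runbook argument string into topic, type, and service."""
    options = {"type": "operational", "service": None}  # documented default type
    topic_parts = []
    for token in shlex.split(arg_string):
        key, sep, value = token.partition("=")
        if sep and key in options:
            options[key] = value
        else:
            # Anything that is not a known key=value pair belongs to the topic.
            topic_parts.append(token)
    if options["type"] not in VALID_TYPES:
        raise ValueError(f"unknown runbook type: {options['type']!r}")
    return {"topic": " ".join(topic_parts), **options}
```

With this sketch, `parse_runbook_args('"database failover"')` yields the topic `database failover` with the default type `operational` and no service.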
When generating runbook content:
After creating the runbook:
Generated runbook must:
npx claudepluginhub melodic-software/claude-code-plugins --plugin documentation-standards