From quality-attributes
Design systems that fail gracefully and recover automatically. Use when defining SLAs, designing for fault tolerance, or improving uptime.
Slash command
/quality-attributes:reliability-design
Build systems that anticipate failures, degrade gracefully, and recover automatically. Design for MTBF and MTTR trade-offs.
You are designing for reliability. The user faces uptime requirements, wants to reduce MTTR, or needs to design disaster recovery. Read their current SLAs and failure modes.
Based on Nygard's Release It! and Google's SRE practices:
Define SLA/SLO/SLI:
Map Failure Modes: For each critical component, ask: "What happens if this fails?" Example: database down → query service fails → frontend shows error.
Design Fault Isolation: Use bulkheads (thread pools per dependency), timeouts, and circuit breakers. Ensure one service failure doesn't bring down others.
Plan Recovery: For each failure mode, specify the recovery mechanism. Automated database replica failover? Service restart? Manual intervention?
Establish Monitoring: Instrument critical paths with metrics (request latency, success rate, queue depth). Alert when an SLI approaches its threshold.
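The failure-mode mapping step above can be sketched as a walk over a dependency graph. This is a minimal illustration, not part of the skill itself: the `DEPENDS_ON` map and the `impacted_by` helper are hypothetical names, and the three components simply mirror the database → query service → frontend example.

```python
from collections import deque

# Hypothetical dependency map: each component lists what it depends on.
DEPENDS_ON = {
    "frontend": ["query-service"],
    "query-service": ["database"],
    "database": [],
}


def impacted_by(failed, depends_on):
    """Return components that directly or transitively depend on `failed`."""
    # Invert the map so we can walk from a dependency to its dependents.
    dependents = {name: [] for name in depends_on}
    for name, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(name)

    impacted = []
    seen = {failed}
    queue = deque([failed])
    while queue:  # breadth-first: nearest dependents are reported first
        current = queue.popleft()
        for parent in dependents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                impacted.append(parent)
                queue.append(parent)
    return impacted


print(impacted_by("database", DEPENDS_ON))  # ['query-service', 'frontend']
```

Running the same function for every critical component gives a first-pass answer to "What happens if this fails?" that can then be refined by hand.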
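For the fault-isolation step, a circuit breaker can be sketched in a few lines. This is an illustrative, single-threaded sketch under stated assumptions rather than a production implementation: the class name, the `failure_threshold` and `reset_timeout` parameters, and their defaults are all hypothetical, and the wrapped call is assumed to enforce its own timeout.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the breaker is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_timeout = reset_timeout          # seconds to wait before a trial call
        self.clock = clock                          # injectable for testing
        self.failures = 0
        self.opened_at = None                       # None means the breaker is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Fail fast instead of piling more load onto a sick dependency.
                raise CircuitOpenError("dependency unavailable, failing fast")
            # Reset timeout elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the breaker
            raise
        # A success closes the breaker and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Using one breaker instance per downstream dependency keeps a failing dependency from consuming the caller's resources, which is the same isolation goal the bulkhead pattern serves with per-dependency thread pools.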
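The monitoring step can be illustrated with a rolling success-rate SLI that warns before the target is breached. This is a rough sketch: the `SuccessRateSLI` class, the `warn_margin` parameter, and the default values are assumptions chosen for illustration, not thresholds taken from the skill.

```python
from collections import deque


class SuccessRateSLI:
    """Tracks success rate over the most recent `window` requests."""

    def __init__(self, slo_target=0.999, warn_margin=0.0005, window=1000):
        self.slo_target = slo_target    # minimum acceptable success rate
        self.warn_margin = warn_margin  # how close to the target triggers a warning
        self.outcomes = deque(maxlen=window)  # old outcomes drop off automatically

    def record(self, success):
        self.outcomes.append(bool(success))

    def success_rate(self):
        if not self.outcomes:
            return 1.0  # no traffic yet: treat as healthy
        return sum(self.outcomes) / len(self.outcomes)

    def status(self):
        rate = self.success_rate()
        if rate < self.slo_target:
            return "breach"
        if rate < self.slo_target + self.warn_margin:
            return "warning"  # approaching the threshold: alert early
        return "ok"
```

In practice the `"warning"` state would page or notify before the `"breach"` state is reached, which leaves time to act while the service is still within its target.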
npx claudepluginhub sethdford/claude-skills --plugin architect-quality-attributes
Designs Service Level Objectives (SLOs) with SLIs, targets, alerting thresholds, and error budgets following Google SRE best practices. Use for defining reliability targets, calculating error budgets, or establishing service indicators.
Defines SLOs and error budgets for service reliability, enabling data-driven trade-offs between feature velocity and system stability.