From Claudient — DevOps & Infrastructure
Infrastructure capacity planning: forecast resource needs, cost projections, scaling recommendations
Slash command: /claudient-devops-infra:capacity-planner
- Planning infrastructure ahead of a product launch or traffic spike
Related skills: /incident-response; /aws-architect, /gcp-architect, or /azure-architect; /slo-architect

Build a capacity plan for [SERVICE or SYSTEM] for the next [3 / 6 / 12] months.
Current state:
- Service: [what it does]
- Traffic: [current requests/day or RPS]
- Infrastructure: [current compute — e.g., 3x t3.medium EC2, 2 Kubernetes pods, etc.]
- Database: [type, instance size, current storage used]
- Current monthly cloud cost: [$X]
- Current utilisation: [CPU: X%, Memory: X%, DB connections: X of Y]
Growth assumptions:
- Expected traffic growth: [X% per month / flat / specific event-driven spike]
- Expected data growth: [GB/month stored in database or object storage]
- Planned product launches: [any events that will cause sudden spikes]
Constraints:
- SLO: [availability target, latency SLO]
- Budget ceiling: [$X/month max]
- Cloud provider: [AWS / GCP / Azure]
- Existing commitments: [any reserved instances or savings plans already purchased]
Produce:
## 1. Capacity forecast
Projected resource needs at: [3 months out / 6 months / 12 months]
- Compute: current vs. needed
- Memory: current vs. needed
- Database: storage and IOPS growth
- Bandwidth / data transfer costs
- CDN or caching layer impact
## 2. Scaling triggers
At what metric threshold should we scale?
- CPU > X% sustained for Y minutes → scale out by Z replicas
- Memory > X% → vertical scale to next tier or add swap
- DB connections > X% of max → consider connection pooling (PgBouncer) or read replica
## 3. Cost projection
| Month | Compute | Database | Storage | Bandwidth | Total |
|---|---|---|---|---|---|
| Now | $X | $X | $X | $X | $X |
| +3mo | $X | $X | $X | $X | $X |
| +6mo | $X | $X | $X | $X | $X |
| +12mo | $X | $X | $X | $X | $X |
## 4. Scaling recommendations
Concrete actions in order:
1. [What to do now — immediate action]
2. [What to do in 30-60 days]
3. [What to plan for at 6 months]
## 5. Cost optimisation opportunities
Savings available without reducing capacity:
- Reserved instances / savings plans: $X/month saved if purchased now
- Rightsizing: [specific instances that are over-provisioned]
- Storage tiering: [any data that can move to cheaper storage]
- Caching: [what can be cached to reduce DB load and compute cost]
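The growth assumptions in this template become a forecast once the monthly rate is compounded over each horizon. A minimal Python sketch of that arithmetic, assuming steady compound growth with no launch spikes; the function names and the 500 RPS / 10% inputs are illustrative and not part of the skill:

```python
def project(current: float, monthly_growth_pct: float, months: int) -> float:
    """Compound a current value by a fixed monthly growth rate."""
    return current * (1 + monthly_growth_pct / 100) ** months


def forecast(current: float, monthly_growth_pct: float, horizons=(3, 6, 12)) -> dict:
    """Projected value at each horizon (in months), rounded to one decimal."""
    return {m: round(project(current, monthly_growth_pct, m), 1) for m in horizons}


if __name__ == "__main__":
    # Hypothetical inputs: 500 RPS today, growing 10% per month.
    for months, rps in forecast(500, 10).items():
        print(f"+{months}mo: {rps} RPS")
```

At 10% per month, traffic roughly triples within a year, so the 12-month row tends to dominate the plan. The same function applies to stored data (GB) if growth is expressed as a percentage; for a fixed GB/month figure, simple addition is enough.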
Build a scaling model for [SERVICE] based on traffic patterns.
Current traffic data:
- Average RPS (requests per second): [X]
- Peak RPS (highest observed): [X]
- Daily traffic pattern: [flat / morning peak / evening peak / bursty]
- Weekly pattern: [weekday-heavy / weekend-heavy / flat]
Service characteristics:
- Average request latency: [Xms at current load]
- CPU per request (approximate): [X% per pod per 100 RPS]
- Memory per request: [X MB working set per pod]
- Stateless or stateful: [stateless = easy to scale horizontally]
Scaling model output:
For each RPS level:
| RPS | Pods needed | CPU headroom | Latency estimate | Cost/month |
|---|---|---|---|---|
| [Current: X] | [Y pods] | [X% headroom] | [Xms] | $X |
| [2x growth] | | | | |
| [5x growth] | | | | |
| [10x growth] | | | | |
Horizontal scaling rules:
- Scale out when: CPU > [X]% for [Y] minutes OR RPS > [Z]
- Scale in when: CPU < [X]% for [Y] minutes AND RPS < [Z]
- Minimum pods: [N] (for availability during scaling events)
- Maximum pods: [N] (cost ceiling or account limit)
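The "Pods needed" and "CPU headroom" columns can be estimated from the "CPU per request" input. The sketch below assumes CPU scales linearly with RPS, which rarely holds exactly near saturation, and uses a hypothetical 70% target utilisation and a minimum of 2 pods; all names and numbers are illustrative:

```python
import math


def pods_needed(rps: float, cpu_pct_per_100_rps: float,
                target_util_pct: float = 70.0, min_pods: int = 2) -> int:
    """Pods required to serve `rps` while keeping average CPU at or below the target."""
    total_cpu_pct = rps / 100 * cpu_pct_per_100_rps  # CPU demand in "percent of one pod"
    return max(min_pods, math.ceil(total_cpu_pct / target_util_pct))


def cpu_headroom_pct(rps: float, cpu_pct_per_100_rps: float, pods: int) -> float:
    """Unused CPU per pod (percent) when the load is spread evenly across `pods`."""
    return round(100.0 - (rps / 100 * cpu_pct_per_100_rps) / pods, 1)


if __name__ == "__main__":
    # Hypothetical service: each pod spends 20% CPU per 100 RPS; 1,000 RPS today.
    base_rps, cpu_cost = 1000, 20
    for factor in (1, 2, 5, 10):
        rps = base_rps * factor
        n = pods_needed(rps, cpu_cost)
        print(f"{factor}x ({rps} RPS): {n} pods, {cpu_headroom_pct(rps, cpu_cost, n)}% headroom")
```

Measured numbers from a load test should replace the linear estimate wherever they are available.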
HPA (Horizontal Pod Autoscaler) config for Kubernetes:
```yaml
# Replace the bracketed placeholders before applying.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: [service-name]
  namespace: [namespace]
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: [service-name]
  minReplicas: [N]
  maxReplicas: [N]
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```
Generate the scaling model for my service.
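For context on how the utilisation targets in an HPA act: the Kubernetes documentation describes the autoscaler as computing ceil(currentReplicas × currentMetric / targetMetric) per metric, leaving the replica count unchanged when the ratio is within a tolerance (0.1 by default), and taking the largest result across metrics. The Python sketch below mirrors that rule for illustration only; it ignores stabilisation windows, pod readiness and missing metrics, and the function names are hypothetical:

```python
import math


def desired_replicas(current_replicas: int, current_util_pct: float,
                     target_util_pct: float, tolerance: float = 0.1) -> int:
    """Replica count suggested by a single utilisation metric."""
    ratio = current_util_pct / target_util_pct
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target: no change
    return math.ceil(current_replicas * ratio)


def hpa_decision(current_replicas: int, metrics, min_replicas: int, max_replicas: int) -> int:
    """Largest suggestion across (current_pct, target_pct) pairs, clamped to the bounds."""
    desired = max(desired_replicas(current_replicas, cur, tgt) for cur, tgt in metrics)
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # Hypothetical state: 4 pods, CPU at 90% vs. a 70% target, memory at 60% vs. 80%.
    print(hpa_decision(4, [(90, 70), (60, 80)], min_replicas=2, max_replicas=10))
```

Because the largest suggestion wins, a memory target can hold the replica count up even after CPU has dropped, which is worth keeping in mind when choosing the two thresholds.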
### Database capacity planning
Plan database capacity for [SERVICE] over [N] months.
Current state:
Growth inputs:
Database capacity plan output:
| Month | Data size | Index size | Total | Storage cost |
|---|---|---|---|---|
| Now | X GB | X GB | X GB | $X |
| +3mo | | | | |
| +6mo | | | | |
| +12mo | | | | |
Storage alert thresholds:
Current max connections for [instance type]: [N]
Current usage: [X connections, X% of max]
Connection pool recommendation:
If using PgBouncer or RDS Proxy:
Upgrade instance when:
Next instance tier: [current] → [recommended next] at [X months]
Cost delta: $X/month additional
Add a read replica when:
Read replica cost: $X/month (same instance type as primary)
Connection routing: [describe how to route reads vs. writes in application code]
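Two of the figures this plan asks for reduce to simple arithmetic: how long until storage reaches an alert threshold, and how much of the connection limit the application pools could consume. A minimal Python sketch, assuming linear data growth and a hypothetical 80% alert threshold (the template leaves the thresholds open); function names and inputs are illustrative:

```python
import math
from typing import Optional


def months_until_alert(used_gb: float, allocated_gb: float,
                       growth_gb_per_month: float,
                       alert_pct: float = 80.0) -> Optional[int]:
    """Whole months until usage reaches alert_pct of allocated storage.

    Returns 0 if usage is already at or past the threshold,
    and None if usage is not growing.
    """
    limit_gb = allocated_gb * alert_pct / 100
    if used_gb >= limit_gb:
        return 0
    if growth_gb_per_month <= 0:
        return None
    return math.ceil((limit_gb - used_gb) / growth_gb_per_month)


def connection_utilisation_pct(app_instances: int, pool_size: int,
                               max_connections: int) -> float:
    """Worst-case share of the database connection limit used by app-side pools."""
    return round(app_instances * pool_size / max_connections * 100, 1)


if __name__ == "__main__":
    # Hypothetical database: 120 GB used of 500 GB, growing 25 GB/month.
    print(months_until_alert(120, 500, 25))
    # Hypothetical app tier: 10 instances with 20 connections each, limit of 400.
    print(connection_utilisation_pct(10, 20, 400))
```

If the connection figure approaches 100% at the projected instance count, that is the point where a pooler such as PgBouncer or RDS Proxy, or a read replica, becomes relevant.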
### Launch capacity plan
Build a launch capacity plan for [PRODUCT / FEATURE / EVENT].
Launch details:
Current infrastructure:
Launch plan output:
| Scenario | Peak RPS | Pods needed | DB connections | Action required |
|---|---|---|---|---|
| Conservative | X | N | X | [no change / minor tweak] |
| Base case | X | N | X | [pre-scale to N pods] |
| Optimistic | X | N | X | [temporary vertical scale + pre-warm] |
- T-24h: pre-scale compute to [N] pods (don't wait for autoscaler)
- T-4h: warm CDN cache for all new pages
- T-0: post in #engineering and tag on-call with launch dashboard link
- T+1h: check error rates, latency, DB connections and compare to baseline
- T+24h: review actual traffic vs. forecast, resize if over-provisioned

Extra cost for [7 days of pre-scaled infrastructure]: $X
Rollback to normal provisioning after: [DATE] if traffic stabilises below [X] RPS
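The extra cost of pre-scaling is the number of additional units multiplied by the hourly price and the hours they run. A small Python sketch, assuming on-demand pricing with a placeholder hourly rate; the $0.05/hour figure is hypothetical and should be replaced with the current price for your instance type and region:

```python
def prescale_extra_cost(baseline_units: int, prescaled_units: int,
                        hourly_price_usd: float, days: float) -> float:
    """Additional spend from running extra pods or instances for `days` days."""
    extra_units = max(0, prescaled_units - baseline_units)
    return round(extra_units * hourly_price_usd * 24 * days, 2)


if __name__ == "__main__":
    # Hypothetical launch: scale from 2 to 6 instances for 7 days at $0.05/hour each.
    print(prescale_extra_cost(2, 6, 0.05, 7))
```

Only the units above the normal baseline are counted, so the result is the figure to compare against the budget ceiling for the launch window.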
### Cloud cost optimisation analysis
Analyse my cloud costs and find savings opportunities.
Current monthly bill: [$X total]
Breakdown:
Infrastructure inventory:
Analyse by category:
COMPUTE RIGHTSIZING:
RESERVED INSTANCES / SAVINGS PLANS:
STORAGE COST REDUCTION:
DATABASE OPTIMISATION:
Produce: ranked list of savings opportunities by monthly dollar impact, with implementation effort (low / medium / high).
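The ranked list requested here is a sort by monthly dollar impact, with implementation effort as a tie-breaker. A minimal Python sketch; the `Saving` type, the effort ordering and the sample entries are hypothetical and only illustrate the shape of the output:

```python
from dataclasses import dataclass

# Lower number = less effort; used only to break ties between equal savings.
EFFORT_ORDER = {"low": 0, "medium": 1, "high": 2}


@dataclass
class Saving:
    name: str
    monthly_saving_usd: float
    effort: str  # "low", "medium" or "high"


def rank(savings):
    """Largest monthly saving first; among equals, lowest effort first."""
    return sorted(savings, key=lambda s: (-s.monthly_saving_usd, EFFORT_ORDER[s.effort]))


if __name__ == "__main__":
    # Made-up opportunities for illustration only.
    candidates = [
        Saving("Storage tiering for old logs", 40.0, "low"),
        Saving("Savings plan for steady compute", 120.0, "medium"),
        Saving("Rightsize over-provisioned workers", 120.0, "low"),
    ]
    for s in rank(candidates):
        print(f"${s.monthly_saving_usd:.0f}/month  {s.effort:<6}  {s.name}")
```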
## Example
**User:** We're planning to open-source a tool and expect it to be featured on Product Hunt next month. Currently we have 2 t3.medium EC2 instances behind an ALB, serving 500 RPS at 45% CPU. We expect 5-10x traffic spike on launch day. Monthly bill is $220.
**Output (excerpt):**
```markdown
# Capacity Plan: Product Hunt Launch
## Current baseline
2x t3.medium (2 vCPU, 4GB RAM) — 500 RPS at 45% CPU.
Estimated max capacity: ~1,100 RPS before CPU saturation.
## Traffic scenarios
| Scenario | Peak RPS | Required instances | Headroom |
|---|---|---|---|
| Conservative (5x) | 2,500 | 5x t3.medium | 10% |
| Base case (7x) | 3,500 | 7x t3.medium | 10% |
| Optimistic (10x+) | 5,000+ | 10x t3.medium or 5x t3.xlarge | 10% |
## Pre-launch actions (by priority)
**Now (today):**
- Enable EC2 Auto Scaling with min=2, max=12, scale-out at CPU>60% for 3 minutes
- Set scale-in cooldown to 15 minutes (prevents thrashing after spike)
- Verify ALB health check is correctly configured and responds < 2s
**T-48h before launch:**
- Pre-scale to 6 instances manually (don't rely on autoscaler during the spike — it lags 3-5 minutes)
- Run a load test to 3,500 RPS using k6 or Locust — confirm latency stays < 200ms
**T-4h:**
- Warm CloudFront cache for all static assets on the new pages
- If serving any database-backed content: review cache-control headers
**Launch day cost:**
7 days of 6x t3.medium pre-scaled: ~$90 additional
If you hit 10x and scale to 10 instances: ~$150 additional for the week
Recommend: pre-scale, watch, scale down after 48-72h when traffic normalises.
**What to watch on launch day:**
- ALB RequestCount (not just RPS at your app — ALB is the leading indicator)
- Target group healthy host count (should stay at pre-scaled N throughout)
- DB connections: if each t3.medium app instance holds a pool of ~50 connections, 10 instances means ~500 connections; confirm the database's connection limit can absorb that
- If using RDS: check FreeableMemory and DatabaseConnections metrics
```
npx claudepluginhub claudient/claudient --plugin claudient-devops-infra