From argos
Load test design and capacity planning: k6/Locust/Gatling baseline, headroom %, autoscaling threshold, cost projection, bottleneck triage (DB/cache/CPU/network), seasonal forecast.
Slash command
/argos:capacity-planning
`agents/shared/severity-rubric.md` and `agents/shared/escalation-matrix.md` are treated as loaded by default (agents/coordination.md §11). This skill's output must use the Critical / High / Medium / Low + evidence format; speculative Critical findings are forbidden. Findings outside this skill's ownership are delegated to the relevant agent; if the decision-authority threshold is exceeded, user approval is mandatory.
What can the current system handle? Metrics to collect before testing:
| Metric | Tool | Window |
|---|---|---|
| RPS (peak / avg) | Prometheus `rate(http_requests_total[5m])` | 7 days |
| p50/p95/p99 latency | Prometheus histogram | 7 days |
| Error rate | Prometheus 5xx ratio | 7 days |
| CPU/memory utilization | Prometheus `node_*`, `container_*` | 7 days |
| DB QPS + connection pool | postgres_exporter | 7 days |
| Cache hit rate | redis_exporter / app metric | 7 days |
| Concurrent users | unique sessions in a 5-minute window | 7 days |
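As a rough illustration of how the baseline figures in the table might be summarized once the 7-day series has been exported, the sketch below computes peak, average, and nearest-rank percentiles from an array of samples. The helper names `percentile` and `summarize` and the sample values are hypothetical; in practice the numbers would come from the Prometheus sources listed above.

```javascript
// Nearest-rank percentile: the smallest value with at least p% of samples at or below it.
function percentile(sortedValues, p) {
  const rank = Math.ceil((p / 100) * sortedValues.length);
  return sortedValues[Math.max(rank, 1) - 1];
}

// Summarize a metric series (for example, per-5-minute RPS samples).
// `summarize` is an illustrative helper, not part of the skill.
function summarize(samples) {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  const sum = sorted.reduce((acc, v) => acc + v, 0);
  return {
    peak: sorted[sorted.length - 1],
    avg: sum / sorted.length,
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    p99: percentile(sorted, 99),
  };
}

// Hypothetical RPS samples.
const rpsSamples = [120, 310, 280, 850, 400, 330, 290, 305, 610, 150];
console.log(summarize(rpsSamples));
```

With only a handful of samples the high percentiles collapse onto the peak, so a real baseline would want the full 7-day series.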
Find the saturation point:
`idle in transaction` is rising.

| Tool | Suitable for |
|---|---|
| k6 | HTTP/HTTPS, gRPC, WebSocket; JS scripting; Grafana integration. Modern default. |
| Locust | Python; UI; distributed runner; for complex scenarios. |
| Gatling | Scala; high-throughput; strong reporting. |
| JMeter | Java; legacy; UI-driven design; complex SOAP/JMS. |
| wrk | C; micro-benchmark, single endpoint. |
| Artillery | Node.js; YAML scenarios; multi-protocol. |
Plugin preference: k6 (HTTP+WS+gRPC, modern, code-first).
// k6 example
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  // Note: k6 starts every scenario listed here in parallel unless `startTime` is set,
  // so keep only the scenario you want for a given run.
  scenarios: {
    // 1. Smoke: small load, is the system up?
    smoke: {
      executor: 'constant-vus',
      vus: 5, duration: '2m',
      tags: { test_type: 'smoke' },
    },
    // 2. Load: expected traffic
    load: {
      executor: 'ramping-vus',
      stages: [
        { duration: '5m', target: 100 },
        { duration: '20m', target: 100 },
        { duration: '5m', target: 0 },
      ],
      tags: { test_type: 'load' },
    },
    // 3. Stress: find the capacity limit
    stress: {
      executor: 'ramping-vus',
      stages: [
        { duration: '5m', target: 100 },
        { duration: '5m', target: 200 },
        { duration: '5m', target: 400 },
        { duration: '5m', target: 800 },
        { duration: '10m', target: 1500 },
        { duration: '5m', target: 0 },
      ],
      tags: { test_type: 'stress' },
    },
    // 4. Spike: sudden surge
    spike: {
      executor: 'ramping-vus',
      stages: [
        { duration: '10s', target: 50 },
        { duration: '30s', target: 1000 }, // sudden spike
        { duration: '3m', target: 50 },
        { duration: '10s', target: 0 },
      ],
      tags: { test_type: 'spike' },
    },
    // 5. Soak: long-running (memory leak / resource leak)
    soak: {
      executor: 'constant-vus',
      vus: 100, duration: '4h',
      tags: { test_type: 'soak' },
    },
  },
  thresholds: {
    // p99 latency budget applies only to requests tagged as the load scenario
    'http_req_duration{test_type:load}': ['p(99)<500'],
    'http_req_failed': ['rate<0.001'],
  },
};

// Each virtual user repeats this function for the duration of its scenario.
export default function () {
  const res = http.get('https://api.example.com/orders');
  check(res, { 'status 200': (r) => r.status === 200 });
  sleep(1);
}
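Because k6 runs every entry in `options.scenarios` concurrently unless a `startTime` is set, one possible way to execute a single test type per run is to pick the scenario by name. The sketch below is plain JavaScript so the selection logic can be checked outside k6; `pickScenario` and the `SCENARIO` variable name are hypothetical, and the two scenario objects are trimmed copies of the ones in the script above.

```javascript
// Hypothetical helper: keep only the scenario with the given name,
// so that one k6 run executes one test type instead of all of them at once.
function pickScenario(allScenarios, name) {
  if (!Object.prototype.hasOwnProperty.call(allScenarios, name)) {
    throw new Error(`unknown scenario: ${name}`);
  }
  return { [name]: allScenarios[name] };
}

// Trimmed-down copies of two scenarios from the script above.
const allScenarios = {
  smoke: { executor: 'constant-vus', vus: 5, duration: '2m' },
  soak: { executor: 'constant-vus', vus: 100, duration: '4h' },
};

// Inside a k6 script the name could be read from k6's `__ENV` object, e.g.
//   export const options = { scenarios: pickScenario(allScenarios, __ENV.SCENARIO || 'smoke') };
// and the run started with: k6 run -e SCENARIO=soak script.js
console.log(pickScenario(allScenarios, 'smoke'));
```

Failing fast on an unknown name avoids silently launching a run with no scenarios.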
RPS has hit its limit, latency is rising
↓
CPU >80%? ─── yes ──→ saturated; profiling (pprof, py-spy) → optimize the algorithm / scale out
│
no
↓
Memory >85%? ─── yes ──→ leak or buffer; heap snapshot
│
no
↓
DB pool full? ─── yes ──→ pool size, slow query (EXPLAIN), N+1, index
│
no
↓
Cache miss high? ─── yes ──→ TTL, key strategy, warm-up
│
no
↓
Network egress full? ─── yes ──→ NIC limit, payload size, compression
│
no
↓
Lock contention? ──→ profile lock wait (mutex profiling)
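The triage order above could be encoded as a small function that returns the first matching branch. This is only a sketch: the field names are hypothetical, the CPU and memory thresholds come from the tree, and the cutoffs for "cache miss high" and "network egress full" are assumptions, since the tree does not quantify them.

```javascript
// Walk the checks in the same order as the decision tree:
// CPU, memory, DB pool, cache, network, and finally lock contention.
function triage(m) {
  if (m.cpuPct > 80) return 'cpu: profile (pprof, py-spy), optimize the algorithm or scale out';
  if (m.memoryPct > 85) return 'memory: leak or buffer, take a heap snapshot';
  if (m.dbPoolInUse >= m.dbPoolSize) return 'db: pool size, slow query (EXPLAIN), N+1, index';
  // Assumed cutoff: treat a hit rate below 80% as "cache miss high".
  if (m.cacheHitRate < 0.8) return 'cache: TTL, key strategy, warm-up';
  // Assumed cutoff: treat 95% egress utilization as "full".
  if (m.egressUtilization >= 0.95) return 'network: NIC limit, payload size, compression';
  return 'locks: profile lock wait (mutex profiling)';
}

// Hypothetical snapshot taken while latency is rising.
const verdict = triage({
  cpuPct: 62,
  memoryPct: 70,
  dbPoolInUse: 100,
  dbPoolSize: 100,
  cacheHitRate: 0.91,
  egressUtilization: 0.4,
});
console.log(verdict);
```

Returning on the first match mirrors the tree, which stops at the first saturated resource rather than listing every one.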
peak_observed = highest p95 RPS over the last 7 days
target_capacity = peak_observed × (1 + headroom_pct)
provisioned = target_capacity / per_replica_rps × safety_factor
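The three formulas above can be worked through in a few lines. The sketch below assumes a peak of 850 RPS, a headroom of 0.4, a safety factor of 1.2, and a per-replica throughput of 100 RPS; all four inputs are illustrative, and rounding up to whole replicas is an added assumption since the formula itself does not specify rounding.

```javascript
// provisioned = target_capacity / per_replica_rps × safety_factor,
// rounded up because replicas come in whole units (an assumption of this sketch).
function requiredReplicas({ peakObserved, headroomPct, perReplicaRps, safetyFactor }) {
  const targetCapacity = peakObserved * (1 + headroomPct);
  const replicas = Math.ceil((targetCapacity / perReplicaRps) * safetyFactor);
  return { targetCapacity, replicas };
}

const plan = requiredReplicas({
  peakObserved: 850,   // RPS, illustrative
  headroomPct: 0.4,    // 40% headroom
  perReplicaRps: 100,  // hypothetical throughput of one replica, measured by a load test
  safetyFactor: 1.2,
});
console.log(plan);
```

Here the target capacity works out to about 1190 RPS, which at 100 RPS per replica and a 1.2 safety factor rounds up to 15 replicas.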
Typical values:
- headroom_pct = 0.30-0.50 (30-50% extra room).
- safety_factor = 1.2 (tolerates a replica failure).

spec:
  minReplicas: 3             # P&E (1 for headroom, 1 for failure tolerance)
  maxReplicas: 50
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60   # 60% target leaves headroom
    # custom metric:
    - type: Pods
      pods:
        metric: { name: rps_per_pod }
        target: { type: AverageValue, averageValue: '100' }
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 0     # fast scale-up
      policies:
        - { type: Percent, value: 100, periodSeconds: 15 }
    scaleDown:
      stabilizationWindowSeconds: 300   # slow scale-down (prevents flapping)
      policies:
        - { type: Percent, value: 10, periodSeconds: 60 }
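To see how targets like a 60% CPU utilization and 100 RPS per pod translate into replica counts, it may help to sketch the scaling rule described in the Kubernetes HorizontalPodAutoscaler documentation: desired = ceil(currentReplicas × currentMetric / targetMetric), evaluated per metric with the largest result winning. The function below is a simplified approximation that ignores stabilization windows and scaling policies; the current metric readings are hypothetical.

```javascript
// Simplified HPA replica calculation (no stabilization window, no rate policies).
function desiredReplicas(currentReplicas, metrics, { minReplicas, maxReplicas, tolerance = 0.1 }) {
  const proposals = metrics.map(({ current, target }) => {
    const ratio = current / target;
    // Within tolerance of the target: keep the current replica count.
    if (Math.abs(ratio - 1) <= tolerance) return currentReplicas;
    return Math.ceil(currentReplicas * ratio);
  });
  // With several metrics, the largest proposal is used, then clamped to the bounds.
  const wanted = Math.max(...proposals);
  return Math.min(maxReplicas, Math.max(minReplicas, wanted));
}

// Hypothetical reading: 6 pods at 90% average CPU (target 60) and 130 RPS per pod (target 100).
const next = desiredReplicas(
  6,
  [{ current: 90, target: 60 }, { current: 130, target: 100 }],
  { minReplicas: 3, maxReplicas: 50 },
);
console.log(next);
```

In this reading the CPU metric asks for 9 pods and the RPS metric for 8, so the sketch settles on 9.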
monthly_cost = replicas × instance_$/hour × 730 hours
             + database_class_$
             + load_balancer_$
             + egress_$ / month
             + storage_$ / month
Δ = projected - current
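The cost formula above is straightforward to turn into a small calculator. In the sketch below every price is a made-up placeholder in USD, and the replica counts (5 now, 8 projected) are only an example; real figures would come from the cloud provider's price list.

```javascript
const HOURS_PER_MONTH = 730;

// monthly_cost = replicas × instance price per hour × 730 + fixed monthly items.
function monthlyCost({ replicas, instancePerHour, database, loadBalancer, egress, storage }) {
  return replicas * instancePerHour * HOURS_PER_MONTH + database + loadBalancer + egress + storage;
}

// Placeholder prices (USD), not real quotes.
const fixed = { instancePerHour: 0.1, database: 300, loadBalancer: 25, egress: 60, storage: 40 };

const current = monthlyCost({ replicas: 5, ...fixed });
const projected = monthlyCost({ replicas: 8, ...fixed });
const delta = projected - current; // Δ = projected - current
console.log({ current, projected, delta });
```

Keeping the fixed items in one object makes it easy to vary only the replica count when comparing scenarios.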
docs/capacity/<service>.md

stabilizationWindowSeconds=0 on scale-down: causes flapping.

User: /capacity-plan api-svc
Agent (load-test-engineer):
1. Baseline, 7 days:
   - RPS peak 850 (16:00 UTC), avg 320
   - p99 latency 380ms peak, 180ms avg
   - CPU avg 42%, peak 71% (near saturation)
   - DB pool 60/100 at peak
   - Cache hit 91%
2. Saturation: at the 71% CPU peak, p99 jumps to 480ms → saturation point.
3. k6 stress test plan:
   - 100 → 1500 VU over 30 min
   - threshold: p99<500ms, error<0.001
4. Result:
   - At 1200 VU, CPU is saturated at 92%, p99 1.2s
   - DB pool 95/100, slow query `SELECT * FROM orders WHERE customer_id...`
     (no index, seq scan)
5. Bottleneck: DB index first, then CPU. After adding the index, it sustains 1800 VU.
6. Headroom 40%: peak 850 RPS × 1.4 = 1190 RPS target.
7. HPA: min 5 (warm before peak), max 30, CPU target 55%.
8. Cost: 5→8 replicas avg, peak 24. Δ +$420/month.
9. Seasonal: Q4 +70% target → max replicas 30→50 + add a DB read replica.
10. Drill: re-run the load test on 2026-11-01, ahead of Q4.
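The seasonal step in the example can be reproduced with simple arithmetic. The sketch below takes the 850 RPS peak, the +70% Q4 uplift, and the 40% headroom from the walkthrough; applying headroom on top of the seasonal peak is an assumption of this sketch, as the example does not spell out how the two combine, and `seasonalTarget` is a hypothetical helper.

```javascript
// Project a seasonal peak and the capacity target that would cover it.
// Values are rounded to whole RPS to avoid floating-point noise.
function seasonalTarget(peakRps, seasonalGrowthPct, headroomPct) {
  const seasonalPeak = Math.round(peakRps * (1 + seasonalGrowthPct));
  // Assumption: headroom is applied on top of the seasonal peak.
  const targetCapacity = Math.round(seasonalPeak * (1 + headroomPct));
  return { seasonalPeak, targetCapacity };
}

console.log(seasonalTarget(850, 0.7, 0.4));
```

Under these assumptions the Q4 peak is 1445 RPS and the capacity target is 2023 RPS, compared with 1190 RPS for the non-seasonal plan.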
# Capacity Plan: <service>
## Baseline (7 days)
| Metric | Peak | Avg | p95 | p99 |
## Saturation Point
- CPU/memory/DB/cache/network
## Test Results
- smoke / load / stress / spike / soak — pass/fail + max sustained
## Bottleneck
- Triage (the cause, not the resource)
## Capacity Plan
- Headroom %, safety factor, target replicas
## HPA Config
```yaml
# tuned spec
```
| Priority | Action | Owner | Due | Issue |
npx claudepluginhub resultakak/argos --plugin argos