From silver-bullet
Enforces stateless design, efficient data access, asynchronous processing, and caching patterns to ensure systems can handle 10x-100x load growth without redesign.
Agent reference: silver-bullet:agents/claude/scalability/skill
Every design, plan, and implementation MUST handle current load efficiently AND accommodate 10x growth without architectural changes. Design for the load you expect in 18 months, not the load you have today.
Why this matters: Systems that aren't designed to scale hit walls — and those walls always appear at the worst time (launch day, viral moment, enterprise customer onboarding). Retrofitting scalability is 10-100x more expensive than building it in.
When to invoke: During PLANNING (after /silver:context, before /silver:plan) and during REVIEW (as part of code review criteria). This skill applies to both new code and modifications to existing code.
Every service, function, and handler MUST be stateless unless there is an explicit, documented reason for state.
Test: Can you run 5 instances of this service behind a load balancer with no shared state? If no, fix it.
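The five-instance test above can be illustrated with a small sketch. It assumes the only state is a per-user counter and that it lives in an external store. `SessionStore`, `DictStore`, and `handle_visit` are hypothetical names, and `DictStore` is an in-memory stand-in for something like Redis so the example runs on its own.

```python
from typing import Dict, Optional, Protocol


class SessionStore(Protocol):
    """Hypothetical interface for an external store shared by all instances."""

    def get(self, key: str) -> Optional[int]: ...

    def set(self, key: str, value: int) -> None: ...


class DictStore:
    """In-memory stand-in for an external store (demo only, not for production)."""

    def __init__(self) -> None:
        self._data: Dict[str, int] = {}

    def get(self, key: str) -> Optional[int]:
        return self._data.get(key)

    def set(self, key: str, value: int) -> None:
        self._data[key] = value


def handle_visit(store: SessionStore, user_id: str) -> int:
    """Stateless handler: nothing is kept in module globals or on the instance."""
    count = (store.get(user_id) or 0) + 1
    store.set(user_id, count)
    return count


shared_store = DictStore()
# Each call could be served by a different instance; only the store remembers.
first = handle_visit(shared_store, "user-1")
second = handle_visit(shared_store, "user-1")
print(first, second)
```

The read-then-write in `handle_visit` is not atomic. With a real shared store, an atomic increment operation would likely be the safer choice under concurrent requests.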
Every database query and data access pattern MUST be designed for scale:
| Pattern | Requirement |
|---|---|
| Queries | Must use indexes. No full table scans on tables that will grow. |
| Pagination | Required for any list endpoint. No unbounded SELECT *. |
| N+1 queries | Forbidden. Use joins, batch loading, or dataloader patterns. |
| Write amplification | Minimize. Don't update entire records when one field changes. |
| Connection pooling | Required. Never open/close connections per request. |
| Read replicas | Design for eventual consistency where appropriate. |
Test: Run EXPLAIN on every query. If the plan shows a full table scan (for example, `Seq Scan` in PostgreSQL or `type: ALL` in MySQL) on a table with more than 10K rows, add an index.
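As a rough illustration of the EXPLAIN test and the pagination rule, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `orders` table, its columns, and the index name are invented for the example. SQLite's `EXPLAIN QUERY PLAN` wording differs from other databases, so treat the output as indicative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 50, float(i)) for i in range(1000)],
)

LOOKUP = "SELECT id, total FROM orders WHERE customer_id = ?"


def query_plan(sql: str, params: tuple) -> str:
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


plan_before = query_plan(LOOKUP, (7,))  # expected to report a table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(LOOKUP, (7,))  # expected to mention the new index


def fetch_page(after_id: int, limit: int = 20) -> list:
    """Keyset pagination: bounded result size, driven by the primary key."""
    return conn.execute(
        "SELECT id, total FROM orders WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, limit),
    ).fetchall()


print(plan_before)
print(plan_after)
print(len(fetch_page(0)))
```

Keyset pagination is used here instead of `OFFSET` because the cost of each page stays roughly constant as the table grows. That is a design preference for this sketch, not something the rules above mandate.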
Any operation that doesn't need an immediate response MUST be asynchronous:
Synchronous is acceptable for: auth checks, data reads <100ms, input validation.
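One way to picture the asynchronous rule is a request handler that only enqueues work and returns, while a worker does the slow part. The sketch below uses the standard library's `queue` and `threading` modules. `handle_signup`, `send_email`, and the queue size of 100 are assumptions for illustration, and a production system would more likely use a durable job queue than an in-process one.

```python
import queue
import threading

# Bounded queue: put_nowait raises queue.Full instead of growing without limit.
jobs = queue.Queue(maxsize=100)
sent = []


def send_email(address: str) -> None:
    """Placeholder for a slow side effect such as an SMTP call."""
    sent.append(address)


def worker() -> None:
    """Drain the queue until a None sentinel arrives."""
    while True:
        address = jobs.get()
        try:
            if address is None:
                return
            send_email(address)
        finally:
            jobs.task_done()


def handle_signup(address: str) -> dict:
    """Request path: enqueue and return immediately, never send inline."""
    jobs.put_nowait(address)
    return {"status": "accepted"}


thread = threading.Thread(target=worker, daemon=True)
thread.start()
response = handle_signup("user@example.com")
jobs.join()     # demo only: wait for the worker to finish the job
jobs.put(None)  # ask the worker to stop
thread.join()
print(response, sent)
```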
Every read-heavy path MUST have a caching strategy:
| Cache layer | TTL | Use when |
|---|---|---|
| HTTP cache (CDN, browser) | Minutes to hours | Static assets, API responses that change infrequently |
| Application cache (Redis) | Seconds to minutes | Computed results, session data, frequent queries |
| Database query cache | Seconds | Identical queries hitting the DB frequently |
| No cache | — | Write paths, real-time data, personalized content |
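For the application-cache row, a minimal in-process sketch of a TTL cache with explicit invalidation might look like the following. `TTLCache`, its method names, and the `max_entries` default are hypothetical. A shared cache such as Redis would replace this in a multi-instance deployment, and the clock is injectable only so expiry can be tested without sleeping.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Small bounded cache where every entry expires after ttl_seconds."""

    def __init__(
        self,
        ttl_seconds: float,
        max_entries: int = 1024,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self._ttl = ttl_seconds
        self._max_entries = max_entries
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}  # key -> (expiry, value)

    def get_or_compute(self, key: str, compute: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh hit
        value = compute()  # miss or expired: recompute
        if key not in self._entries and len(self._entries) >= self._max_entries:
            # Evict the entry that expires soonest to stay within the bound.
            soonest = min(self._entries, key=lambda k: self._entries[k][0])
            del self._entries[soonest]
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop an entry explicitly, for example after the underlying row changes."""
        self._entries.pop(key, None)


cache = TTLCache(ttl_seconds=30)
report = cache.get_or_compute("report:today", lambda: "expensive result")
print(report)
```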
Every cache MUST have:
Every resource consumer MUST have explicit limits:
| Resource | Limit | What happens at limit |
|---|---|---|
| HTTP request body | Max size (e.g., 10MB) | 413 Payload Too Large |
| Query results | Max rows (e.g., 1000) | Pagination required |
| Batch operations | Max batch size (e.g., 100) | Split into chunks |
| Concurrent connections | Pool size (e.g., 20) | Queue or reject |
| Background jobs | Max concurrent (e.g., 10) | Queue with backpressure |
| File uploads | Max size + count | Reject with clear error |
No unbounded anything. Every loop, query, queue, and buffer has a maximum.
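Two of the limits in the table, batch size and query rows, can be enforced with very little code. The sketch below takes the example values 100 and 1000 from the table. The helper names `chunked` and `clamp_page_size` are hypothetical.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

MAX_BATCH_SIZE = 100   # example value from the table above
MAX_PAGE_SIZE = 1000   # example value from the table above


def chunked(items: Sequence[T], size: int = MAX_BATCH_SIZE) -> Iterator[List[T]]:
    """Split a batch into chunks of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def clamp_page_size(requested: int) -> int:
    """Never trust a client-supplied limit: force it into [1, MAX_PAGE_SIZE]."""
    return max(1, min(requested, MAX_PAGE_SIZE))


chunk_sizes = [len(chunk) for chunk in chunked(list(range(250)))]
print(chunk_sizes)
print(clamp_page_size(5000))
```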
Architecture MUST support horizontal scaling:
Every user-facing operation MUST have a performance budget:
| Operation type | Budget |
|---|---|
| API response (P95) | <200ms |
| Page load (LCP) | <2.5s |
| Database query | <50ms |
| Background job start | <1s from event |
| Search | <500ms |
If an operation exceeds its budget, it MUST be optimized before shipping. "It works" is not the same as "it scales."
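A budget check of this kind could be automated roughly as follows. The sketch computes P95 with the nearest-rank method and compares it against three of the budgets from the table. The table only specifies P95 for API responses, so applying it to the database and search budgets is an assumption here, as are the names `p95`, `within_budget`, and `BUDGETS_MS`.

```python
from typing import Iterable

# Budgets in milliseconds, taken from the table above.
BUDGETS_MS = {"api_response": 200, "database_query": 50, "search": 500}


def p95(samples_ms: Iterable[float]) -> float:
    """95th percentile by the nearest-rank method."""
    ordered = sorted(samples_ms)
    if not ordered:
        raise ValueError("at least one sample is required")
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def within_budget(operation: str, samples_ms: Iterable[float]) -> bool:
    """True when the measured P95 is strictly under the operation's budget."""
    return p95(samples_ms) < BUDGETS_MS[operation]


api_ok = within_budget("api_response", [120] * 19 + [900])
db_ok = within_budget("database_query", [60] * 20)
print(api_ok, db_ok)
```

In the first call a single 900 ms outlier among twenty samples does not move the P95, which is why tail metrics such as P99 are sometimes tracked alongside it.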
Before finalizing any design or plan, run the Scalability Checklist:
If any item fails: redesign before proceeding to implementation.
As you write code:
Run EXPLAIN on new queries. Add indexes proactively.
Verify these as part of every code review:
If existing code violates these rules:
| Pattern | Problem | Fix |
|---|---|---|
| In-memory sessions | Breaks with multiple instances | External session store |
| Unbounded queries | Memory explosion at scale | Pagination + limits |
| Synchronous emails | Request blocked for seconds | Queue + async worker |
| No connection pooling | Connection exhaustion under load | Pool with limits |
| Cache without TTL | Stale data forever | TTL + invalidation strategy |
| SELECT * | Transfers unnecessary data | Select only needed columns |
| Fat payloads | Network bottleneck | Paginate, compress, or stream |
| Excuse | Reality |
|---|---|
| "We only have 100 users" | You'll have 10,000 before you know it. Design now. |
| "We can optimize later" | Optimization is cheap. Redesigning architecture is not. |
| "Premature optimization" | Scalability design ≠ micro-optimization. These are architectural. |
| "It's fast enough on my machine" | Your machine has 1 user. Production has thousands. |
| "We'll add caching when we need it" | By then you'll need it urgently. Design the strategy now. |
| "This is just an internal tool" | Internal tools scale with the company. Design accordingly. |
npx claudepluginhub alo-exp/silver-bullet --plugin silver-bullet