From silver-bullet
Enforces stateless design, efficient data access, async-first patterns, and caching strategies to handle 10x-100x growth without redesign. Used during planning and review of non-trivial changes.
Slash command: /silver-bullet:scalability
Every design, plan, and implementation MUST handle current load efficiently AND accommodate 10x growth without architectural changes. Design for the load you expect in 18 months, not the load you have today.
Why this matters: Systems that aren't designed to scale hit walls — and those walls always appear at the worst time (launch day, viral moment, enterprise customer onboarding). Retrofitting scalability is 10-100x more expensive than building it in.
When to invoke: During PLANNING (after /silver:context, before /silver:plan) and during REVIEW (as part of code review criteria). This skill applies to both new code and modifications to existing code.
Every service, function, and handler MUST be stateless unless there is an explicit, documented reason for state.
Test: Can you run 5 instances of this service behind a load balancer with no shared state? If no, fix it.
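A minimal sketch of what passing that test can look like, assuming per-user state lives in an injected external store. `SessionStore`, `FakeExternalStore`, and `VisitHandler` are hypothetical names for illustration; the dict-backed fake only stands in for something like Redis or a database.

```python
from typing import Dict, Optional, Protocol


class SessionStore(Protocol):
    """Hypothetical interface for an external store shared by all instances."""

    def get(self, key: str) -> Optional[int]: ...

    def set(self, key: str, value: int) -> None: ...


class FakeExternalStore:
    """In-process stand-in for the external store, for demonstration only."""

    def __init__(self) -> None:
        self._data: Dict[str, int] = {}

    def get(self, key: str) -> Optional[int]:
        return self._data.get(key)

    def set(self, key: str, value: int) -> None:
        self._data[key] = value


class VisitHandler:
    """Keeps no per-user state on the instance; it all lives in the store."""

    def __init__(self, store: SessionStore) -> None:
        self._store = store

    def handle(self, user_id: str) -> int:
        # Read-modify-write is not atomic here; a real store would use an
        # atomic increment instead.
        visits = (self._store.get(user_id) or 0) + 1
        self._store.set(user_id, visits)
        return visits
```

Two `VisitHandler` objects built on the same store behave like two instances behind a load balancer: a request served by either one sees the count written by the other.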
Every database query and data access pattern MUST be designed for scale:
| Pattern | Requirement |
|---|---|
| Queries | Must use indexes. No full table scans on tables that will grow. |
| Pagination | Required for any list endpoint. No unbounded SELECT *. |
| N+1 queries | Forbidden. Use joins, batch loading, or dataloader patterns. |
| Write amplification | Minimize. Don't update entire records when one field changes. |
| Connection pooling | Required. Never open/close connections per request. |
| Read replicas | Design for eventual consistency where appropriate. |
Test: Run an EXPLAIN on every query. If it says "full table scan" on a table with >10K rows, add an index.
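The EXPLAIN test can be automated. The sketch below uses SQLite's `EXPLAIN QUERY PLAN` through Python's standard `sqlite3` module; the `users` table, the `idx_users_email` index, and the helper names are assumptions for illustration, and other databases report scans in a different format.

```python
from __future__ import annotations

import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> list[str]:
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute(f"EXPLAIN QUERY PLAN {sql}", params).fetchall()
    return [row[3] for row in rows]  # columns: id, parent, notused, detail


def uses_full_scan(plan: list[str]) -> bool:
    """SQLite marks a full table scan with a detail line starting with SCAN."""
    return any(detail.startswith("SCAN") for detail in plan)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")

# Hypothetical lookup by a non-key column; selects only the needed columns.
lookup = "SELECT id, name FROM users WHERE email = ?"
before = query_plan(conn, lookup, ("a@example.com",))

# Adding an index on the filtered column turns the scan into an index search.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
after = query_plan(conn, lookup, ("a@example.com",))
```

Here `uses_full_scan(before)` is true and `uses_full_scan(after)` is false, so a check like this could run in a test suite for every new query.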
Any operation that doesn't need an immediate response MUST be asynchronous:
Synchronous is acceptable for: auth checks, data reads <100ms, input validation.
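A minimal sketch of the split between synchronous and asynchronous work, assuming a signup request that triggers a welcome email. The function names are hypothetical, `asyncio.sleep` stands in for the slow call, and the in-process `asyncio.Queue` stands in for a durable job queue.

```python
from __future__ import annotations

import asyncio


async def send_welcome_email(address: str, sent: list[str]) -> None:
    """Hypothetical slow side effect; the sleep stands in for the network call."""
    await asyncio.sleep(0.01)
    sent.append(address)


async def worker(queue: asyncio.Queue, sent: list[str]) -> None:
    """Drain jobs from the queue until the task is cancelled."""
    while True:
        address = await queue.get()
        try:
            await send_welcome_email(address, sent)
        finally:
            queue.task_done()


async def handle_signup(address: str, queue: asyncio.Queue) -> dict:
    """Request path: validate synchronously, enqueue the slow part, return."""
    if "@" not in address:
        return {"status": 400}
    queue.put_nowait(address)  # raises asyncio.QueueFull once the bound is hit
    return {"status": 202}


async def demo() -> tuple[dict, list[str]]:
    queue: asyncio.Queue = asyncio.Queue(maxsize=100)  # bounded on purpose
    sent: list[str] = []
    task = asyncio.create_task(worker(queue, sent))
    response = await handle_signup("new.user@example.com", queue)
    await queue.join()  # demo only: wait for the worker to finish the job
    task.cancel()
    await asyncio.gather(task, return_exceptions=True)
    return response, sent
```

The handler returns 202 before the email is sent. The worker picks the job up afterwards, so a slow mail server no longer adds latency to the request.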
Every read-heavy path MUST have a caching strategy:
| Cache layer | TTL | Use when |
|---|---|---|
| HTTP cache (CDN, browser) | Minutes to hours | Static assets, API responses that change infrequently |
| Application cache (Redis) | Seconds to minutes | Computed results, session data, frequent queries |
| Database query cache | Seconds | Identical queries hitting the DB frequently |
| No cache | — | Write paths, real-time data, personalized content |
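The application-cache row can be sketched as a small read-through cache with a TTL, explicit invalidation, and a size bound. `TTLCache` is a hypothetical in-process class for illustration; with multiple instances the same logic would sit in front of a shared store such as Redis.

```python
import time
from typing import Callable, Dict, Generic, Tuple, TypeVar

V = TypeVar("V")


class TTLCache(Generic[V]):
    """Read-through cache: entries expire after ttl_seconds and are bounded."""

    def __init__(
        self,
        ttl_seconds: float,
        max_entries: int = 1000,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self._ttl = ttl_seconds
        self._max_entries = max_entries
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, V]] = {}

    def get_or_load(self, key: str, loader: Callable[[], V]) -> V:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh hit
        value = loader()  # miss or expired: go to the source of truth
        if key not in self._entries and len(self._entries) >= self._max_entries:
            # Evict the entry closest to expiry so memory stays bounded.
            oldest = min(self._entries, key=lambda k: self._entries[k][0])
            del self._entries[oldest]
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Call from the write path so readers do not wait for the TTL."""
        self._entries.pop(key, None)
```

A read path would call `cache.get_or_load("user:42", load_user)` and the matching write path would call `cache.invalidate("user:42")`; both the key format and `load_user` are assumptions.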
Every cache MUST have:
Every resource consumer MUST have explicit limits:
| Resource | Limit | What happens at limit |
|---|---|---|
| HTTP request body | Max size (e.g., 10MB) | 413 Payload Too Large |
| Query results | Max rows (e.g., 1000) | Pagination required |
| Batch operations | Max batch size (e.g., 100) | Split into chunks |
| Concurrent connections | Pool size (e.g., 20) | Queue or reject |
| Background jobs | Max concurrent (e.g., 10) | Queue with backpressure |
| File uploads | Max size + count | Reject with clear error |
No unbounded anything. Every loop, query, queue, and buffer has a maximum.
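As one illustration of the batch-operations row, a caller can split arbitrary input into chunks that never exceed the limit. `chunked` and `MAX_BATCH_SIZE` are hypothetical names, and 100 is only the example value from the table.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

MAX_BATCH_SIZE = 100  # example limit from the table; tune per workload


def chunked(items: Sequence[T], max_size: int = MAX_BATCH_SIZE) -> Iterator[List[T]]:
    """Yield consecutive slices of items, each at most max_size long."""
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    for start in range(0, len(items), max_size):
        yield list(items[start:start + max_size])
```

Iterating with `for batch in chunked(rows): ...` keeps each downstream insert or API call bounded no matter how large `rows` grows.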
Architecture MUST support horizontal scaling:
Every user-facing operation MUST have a performance budget:
| Operation type | Budget |
|---|---|
| API response (P95) | <200ms |
| Page load (LCP) | <2.5s |
| Database query | <50ms |
| Background job start | <1s from event |
| Search | <500ms |
If an operation exceeds its budget, it MUST be optimized before shipping. "It works" is not the same as "it scales."
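A budget is only useful if it is checked. The sketch below computes a nearest-rank P95 from latency samples and compares it to the API budget from the table; the function names and the sample source (a load test or production metrics) are assumptions.

```python
import math
from typing import Sequence

API_P95_BUDGET_MS = 200.0  # from the budget table above


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def within_budget(samples: Sequence[float], budget_ms: float = API_P95_BUDGET_MS) -> bool:
    """True when the P95 latency is strictly under the budget."""
    return percentile(samples, 95) < budget_ms
```

With 100 samples, up to five slow outliers leave the P95 untouched, while a sixth pushes it over and fails the check.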
Before finalizing any design or plan, run the Scalability Checklist:
If any item fails: redesign before proceeding to implementation.
As you write code:
Run EXPLAIN on new queries. Add indexes proactively.
Verify these as part of every code review:
If existing code violates these rules:
| Pattern | Problem | Fix |
|---|---|---|
| In-memory sessions | Breaks with multiple instances | External session store |
| Unbounded queries | Memory explosion at scale | Pagination + limits |
| Synchronous emails | Request blocked for seconds | Queue + async worker |
| No connection pooling | Connection exhaustion under load | Pool with limits |
| Cache without TTL | Stale data forever | TTL + invalidation strategy |
| SELECT * | Transfers unnecessary data | Select only needed columns |
| Fat payloads | Network bottleneck | Paginate, compress, or stream |
| Excuse | Reality |
|---|---|
| "We only have 100 users" | You'll have 10,000 before you know it. Design now. |
| "We can optimize later" | Optimization is cheap. Redesigning architecture is not. |
| "Premature optimization" | Scalability design ≠ micro-optimization. These are architectural. |
| "It's fast enough on my machine" | Your machine has 1 user. Production has thousands. |
| "We'll add caching when we need it" | By then you'll need it urgently. Design the strategy now. |
| "This is just an internal tool" | Internal tools scale with the company. Design accordingly. |
npx claudepluginhub alo-exp/silver-bullet --plugin silver-bullet