From thinking-skills
Applies structural/margin-of-safety reasoning to capacity planning, timeouts, SLA commitments, and estimates under uncertainty. Sizes buffers to the cost of being wrong rather than optimizing to the edge.
Slash command
/thinking-skills:thinking-margin-of-safety
Margin of Safety, borrowed from Benjamin Graham's investment philosophy and structural engineering, is the practice of building in buffers to account for unknown unknowns. In a world of uncertainty, systems optimized to the edge are brittle. Robust systems have slack, reserves, and room for error.
Core Principle: Build in buffers. The world is uncertain. Systems without margin fail when stressed.
Decision flow:
Provisioning, setting a limit, or committing an estimate?
→ Is there real uncertainty AND a cost to under-provisioning? → yes → SIZE A BUFFER TO THE FAILURE COST
→ Are you optimizing to the edge to save a little? → yes → ADD SLACK unless the breach is cheap
→ Estimate could be 2x off? → factor that into the buffer
If cost(buffer) > probability(breach) × cost(breach), the buffer is the wrong call. Right-size, don't max out.
If the real question is when to stop searching, use thinking-bounded-rationality (set a good-enough threshold), not margin. Margin sizes the buffer on a number; bounded-rationality decides when to stop looking for the number.
When provisioning capacity, setting a limit, or committing to an estimate under uncertainty:
If you can eliminate the uncertainty by measuring or looking up the real number, do that instead — a measured value beats a padded guess. If the question is "when do I stop searching?", that's thinking-bounded-rationality, not margin.
What's your best guess for the requirement?
Estimate: Need 100 requests/second capacity
Estimate: Project will take 6 weeks
Estimate: Need 500GB storage for year 1
How confident are you, and what could you be missing?
## Uncertainty Analysis
| Factor | Your Estimate | Uncertainty | Possible Range |
|--------|---------------|-------------|----------------|
| Traffic | 100 RPS | ±50% | 50-150 RPS |
| Spike multiplier | 3x | ÷2 to ×2 | 1.5x-6x |
| Growth rate | 20%/year | ±50% | 10-30%/year |
| Unknown unknowns | - | +50-100% | - |
Different contexts need different margins:
| Context | Typical Margin | Rationale |
|---|---|---|
| Capacity planning | 2-3x | Traffic spikes unpredictable |
| Time estimation | 1.5-2x | Everything takes longer |
| Infrastructure | 2x headroom | Scaling takes time |
| SLA commitment | 1.5x buffer | Reputation at stake |
| New/unknown domain | 2-3x | High uncertainty |
| Well-understood domain | 1.3-1.5x | Lower uncertainty |
Base estimate: 100 RPS
Margin: 2x (moderate uncertainty, spikes possible)
Provision: 200 RPS capacity
Base estimate: 6 weeks
Margin: 1.5x (experienced team, some unknowns)
Commit: 9 weeks
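The two worked examples above are the same operation: multiply the base estimate by a margin chosen for the context. A minimal sketch follows, assuming the margin ranges from the table; the dictionary keys and the function names `with_margin` and `margin_band` are hypothetical names chosen for illustration.

```python
# Typical margin ranges copied from the table above. The keys are
# hypothetical labels invented for this sketch.
MARGIN_RANGES = {
    "capacity_planning": (2.0, 3.0),
    "time_estimation": (1.5, 2.0),
    "new_domain": (2.0, 3.0),
    "well_understood_domain": (1.3, 1.5),
}


def with_margin(base, margin):
    """Scale a base estimate by a margin multiplier (expected to be >= 1)."""
    if margin < 1.0:
        raise ValueError("a margin below 1.0 removes headroom instead of adding it")
    return base * margin


def margin_band(base, context):
    """Return the (low, high) commitment implied by the context's margin range."""
    low, high = MARGIN_RANGES[context]
    return with_margin(base, low), with_margin(base, high)


print(with_margin(100, 2.0))              # 200.0 RPS to provision
print(with_margin(6, 1.5))                # 9.0 weeks to commit
print(margin_band(6, "time_estimation"))  # (9.0, 12.0) weeks
```

Returning a band rather than a single number keeps the choice within the range visible, which is where the cost of being wrong should decide.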
Track actuals against estimates to calibrate future margins:
## Calibration Log
| Estimate | Margin Applied | Actual | Margin Accuracy |
|----------|----------------|--------|-----------------|
| 100 RPS | 2x (200) | 180 | Adequate |
| 6 weeks | 1.5x (9) | 10 weeks | Insufficient |
| 500 GB | 2x (1TB) | 400 GB | Excessive |
Insight: Time estimates need higher margin; storage was overprovisioned
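The calibration log can be scored mechanically. The sketch below is one possible rule, not a prescribed one: a margin is "Insufficient" if the actual exceeded what was provisioned, and "Excessive" if the actual used less than half of it. The 0.5 waste threshold and the name `classify_margin` are assumptions made for this example.

```python
def classify_margin(estimate, margin, actual, waste_threshold=0.5):
    """Label a past margin as Insufficient, Adequate, or Excessive.

    waste_threshold is an assumed cutoff: using less than this fraction
    of the provisioned amount counts as over-provisioning.
    """
    provisioned = estimate * margin
    if actual > provisioned:
        return "Insufficient"
    if actual < provisioned * waste_threshold:
        return "Excessive"
    return "Adequate"


# Rows from the calibration log: (label, estimate, margin applied, actual).
log = [
    ("100 RPS", 100, 2.0, 180),
    ("6 weeks", 6, 1.5, 10),
    ("500 GB", 500, 2.0, 400),
]

for label, estimate, margin, actual in log:
    needed = actual / estimate  # the margin that would have been just enough
    verdict = classify_margin(estimate, margin, actual)
    print(f"{label}: applied {margin}x, needed {needed:.2f}x -> {verdict}")
```

With these inputs the rule reproduces the three labels in the log. The "needed" ratio is the number to feed back into the next estimate of the same kind.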
## Capacity Planning with Margin
Base load: 1,000 RPS
Peak multiplier: 3x (historical)
Margin for unknowns: 1.5x
Margin for growth: 1.3x (6 months runway)
Required capacity: 1,000 × 3 × 1.5 × 1.3 = 5,850 RPS
Round up: 6,000 RPS
Rationale: Can handle 6x normal load, or 2x the historical peak (3,000 RPS), which covers growth on top of a peak
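The capacity calculation above compounds its factors: each multiplier applies on top of the previous one. A minimal sketch, assuming a hypothetical helper named `required_capacity` and rounding up to the nearest 1,000 RPS:

```python
import math


def required_capacity(base_rps, peak_multiplier, unknowns_margin,
                      growth_margin, round_to=1000):
    """Compound the factors, then round up to a provisioning increment."""
    raw = base_rps * peak_multiplier * unknowns_margin * growth_margin
    provisioned = math.ceil(raw / round_to) * round_to
    return raw, provisioned


raw, provisioned = required_capacity(1000, 3, 1.5, 1.3)
print(f"raw requirement: {raw:.0f} RPS")       # 5850
print(f"provisioned:     {provisioned} RPS")   # 6000

# Headroom relative to normal load and to the historical peak.
print(provisioned / 1000)        # 6.0x normal load
print(provisioned / (1000 * 3))  # 2.0x historical peak
```

Printing the headroom ratios next to the result is a cheap sanity check on the rationale attached to the number.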
## Project Estimation with Margin
Task estimates:
- Feature A: 2 weeks
- Feature B: 3 weeks
- Integration: 1 week
- Testing: 1 week
Base total: 7 weeks
Adjustments:
- Optimistic bias: +30%
- Unknowns: +20%
- Dependencies: +15%
Margin total: 1.65x (adjustments summed: 30% + 20% + 15%)
Commitment: 7 × 1.65 = 11.55 → 12 weeks
Rule of thumb: Hofstadter's Law - "It always takes longer than you expect,
even when you take into account Hofstadter's Law."
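The project estimate above adds its adjustments together rather than compounding them. The sketch below reproduces that arithmetic and, for comparison, shows what compounding the same percentages would give. The variable names are illustrative, and the compounded figure is an alternative shown for contrast, not something the estimate above uses.

```python
import math

# Task estimates in weeks, from the list above.
tasks = {"Feature A": 2, "Feature B": 3, "Integration": 1, "Testing": 1}

# Adjustments in percent, from the list above.
adjustments_pct = {"optimistic bias": 30, "unknowns": 20, "dependencies": 15}

base_weeks = sum(tasks.values())  # 7

# Additive: percentages are summed, then applied once (1.65x).
additive_multiplier = 1 + sum(adjustments_pct.values()) / 100
additive_commit = math.ceil(base_weeks * additive_multiplier)

# Compounded: each adjustment applies on top of the previous one (about 1.79x).
compound_multiplier = math.prod(1 + p / 100 for p in adjustments_pct.values())
compound_commit = math.ceil(base_weeks * compound_multiplier)

print(additive_commit)   # 12 weeks
print(compound_commit)   # 13 weeks
```

The two methods differ by a week here. Summing is the more lenient choice, so it is worth stating which one a commitment is based on.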
## Budget with Margin
Infrastructure estimate:
- Compute: $5,000/month
- Storage: $2,000/month
- Network: $1,000/month
Base: $8,000/month
Margin considerations:
- Traffic growth: +25%
- Unplanned incidents: +15%
- New features: +20%
Budget request: $8,000 × 1.6 = $12,800/month
Actual budget: $13,000/month (round up)
## Architectural Margin
Connection pool:
- Normal usage: 50 connections
- Peak: 100 connections
- Margin: 2x peak
- Configure: 200 connections
Queue depth:
- Normal processing: 1,000 messages
- Burst: 10,000 messages
- Margin: 2x burst
- Configure: 20,000 max depth
Timeout:
- P99 latency: 500ms
- Margin: 2x
- Set timeout: 1000ms
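The three architectural limits above follow one pattern: take the observed worst case and multiply it by a margin. A minimal sketch, assuming a hypothetical `Limits` container and `size_limits` helper; the field names are not tied to any particular connection pool or queue library.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    pool_size: int        # maximum connections in the pool
    max_queue_depth: int  # maximum queued messages
    timeout_ms: int       # request timeout in milliseconds


def size_limits(peak_connections, burst_messages, p99_latency_ms, margin=2.0):
    """Derive configured limits from observed worst cases and a margin."""
    return Limits(
        pool_size=math.ceil(peak_connections * margin),
        max_queue_depth=math.ceil(burst_messages * margin),
        timeout_ms=math.ceil(p99_latency_ms * margin),
    )


print(size_limits(peak_connections=100, burst_messages=10_000, p99_latency_ms=500))
# Limits(pool_size=200, max_queue_depth=20000, timeout_ms=1000)
```

Deriving the limits from measured inputs keeps the margin explicit, so the values can be recomputed when peak, burst, or P99 figures change.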
Almost never appropriate for:
Acceptable for:
Margin isn't free. Balance:
## Margin Cost-Benefit
High margin:
+ Handles unexpected loads
+ Reduces stress/heroics
+ Enables growth without emergency scaling
- Higher infrastructure cost
- Potentially wasted resources
Low margin:
+ Lower cost
+ Efficient resource use
- Risk of outages
- Constant firefighting
- Technical debt from quick fixes
Sweet spot: Margin where cost of buffer < probability of breach × cost of breach
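The sweet-spot rule is an expected-cost comparison. The sketch below is a simplification that treats the buffer as fully preventing the breach; the function name `buffer_is_worth_it` and all dollar figures and probabilities are hypothetical values chosen for illustration.

```python
def buffer_is_worth_it(buffer_cost, breach_probability, breach_cost):
    """Return True when the buffer costs less than the expected breach cost.

    Simplifying assumption: the buffer fully prevents the breach.
    """
    if not 0.0 <= breach_probability <= 1.0:
        raise ValueError("breach_probability must be between 0 and 1")
    return buffer_cost < breach_probability * breach_cost


# Hypothetical monthly figures.
# Expensive breach: 5% chance of a $100,000 outage vs. a $2,000 buffer.
print(buffer_is_worth_it(2_000, 0.05, 100_000))  # True

# Cheap breach: 5% chance of a $10,000 outage vs. the same $2,000 buffer.
print(buffer_is_worth_it(2_000, 0.05, 10_000))   # False
```

The same buffer is justified in one case and not the other, which is the point of sizing margin to the cost of failure rather than applying one multiplier everywhere.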
# Margin of Safety Analysis: [Context]
## Base Estimate
What: [What you're estimating]
Estimate: [Your point estimate]
Confidence: [How confident you are]
## Uncertainty Factors
| Factor | Impact | Probability | Adjustment |
|--------|--------|-------------|------------|
| [Factor 1] | +X% | Medium | |
| [Factor 2] | +Y% | Low | |
| Unknown unknowns | +Z% | - | |
## Margin Calculation
Base: [X]
Uncertainty multiplier: [1.X]
Context multiplier: [1.Y] (high/medium/low stakes)
Total margin: [X × all multipliers]
## Final Commitment/Design
With margin: [Final number]
Rationale: [Why this margin]
## Monitoring Plan
How will you know if margin is adequate/excessive?
- [Metric to track]
- [Threshold for concern]
- [Review cadence]
"The margin of safety is always dependent on the price paid."
In engineering: The margin needed depends on the cost of failure. Critical systems need more margin. Experiments can run leaner.
"Confronted with the challenge to distill the secret of sound investment into three words, we venture the motto, Margin of Safety."
In systems: When in doubt, build in margin. The cost of over-provisioning is usually much less than the cost of under-provisioning when things go wrong.
"The function of the margin of safety is, in essence, that of rendering unnecessary an accurate estimate of the future."
You don't need to predict perfectly if you have adequate margin. Margin is insurance against your own estimation errors.
npx claudepluginhub tjboudreaux/cc-thinking-skills --plugin thinking-skills