From thinking-skills
Maps interconnected services and traces root causes when debugging cross-service incidents where single-component fixes fail or behavior is emergent.
Slash command: /thinking-skills:thinking-systems
Systems thinking views a problem as part of an interconnected whole rather than isolated components. It focuses on relationships, feedback loops, delays, and emergent properties—behaviors that arise from interactions and can't be predicted from parts alone. Its proven payoff is cross-service/incident debugging, where "obvious" single-component fixes fail.
Core Principle: The behavior of a system cannot be understood by analyzing components in isolation. Look at connections, feedback, and emergence.
Problem spans multiple components? → yes → APPLY SYSTEMS THINKING
Fix in one place caused issue in another? → yes → APPLY SYSTEMS THINKING
Behavior seems "emergent" or unexpected? → yes → APPLY SYSTEMS THINKING
This is the core of the skill—apply it first.
Draw components, connections, and data/control flows:
┌─────────┐ ┌─────────┐ ┌─────────┐
│ Client │────▶│ API │────▶│ DB │
└─────────┘ └────┬────┘ └─────────┘
│
▼
┌─────────┐
│ Cache │
└─────────┘
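A map like the one above can also be kept as data, which makes it easy to ask "who is affected if this component degrades?". The following is a minimal sketch, assuming the edges shown in the diagram; `SYSTEM_MAP` and `upstream_of` are illustrative names, not part of the skill.

```python
from collections import deque

# Edges mirror the diagram: Client -> API -> DB, and API -> Cache.
SYSTEM_MAP = {
    "Client": ["API"],
    "API": ["DB", "Cache"],
    "DB": [],
    "Cache": [],
}


def upstream_of(component, system_map):
    """Return every component that directly or transitively calls `component`."""
    # Invert the edges so we can walk from a dependency back to its callers.
    callers = {name: [] for name in system_map}
    for caller, callees in system_map.items():
        for callee in callees:
            callers[callee].append(caller)

    seen, pending = set(), deque([component])
    while pending:
        current = pending.popleft()
        for caller in callers[current]:
            if caller not in seen:
                seen.add(caller)
                pending.append(caller)
    return seen
```

Calling `upstream_of("DB", SYSTEM_MAP)` returns the set containing `"API"` and `"Client"`: both are exposed to a slow database even though only the API talks to it directly.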
For each loop, determine:
Retry Storm Loop (Reinforcing - Dangerous):
Service slow → Clients retry → More load → Service slower → More retries
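To see why this loop is reinforcing, it can help to simulate it. The sketch below is a deliberately crude model, assuming every request that exceeds capacity is retried in the next time step; the function name and all parameters are hypothetical.

```python
def offered_load(base_rps, capacity_rps, max_retry_multiplier, steps):
    """Total requests per step when each rejected request is retried next step.

    Assumption: retries are capped at `max_retry_multiplier` times the base
    traffic, standing in for a per-client retry limit.
    """
    history = []
    retries = 0.0
    for _ in range(steps):
        total = base_rps + retries
        served = min(total, capacity_rps)
        failed = total - served
        retries = min(failed, base_rps * max_retry_multiplier)
        history.append(total)
    return history
```

With base traffic below capacity, for example `offered_load(80, 100, 3, 4)`, the load stays flat at 80. With base traffic just above capacity, `offered_load(120, 100, 3, 4)` climbs by 20 every step, because the retries themselves become load.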
Follow the symptom backward to find originating cause:
Symptom: High latency in Service C
→ Service C waiting on Service B
→ Service B waiting on Service A
→ Service A doing full table scan (ROOT CAUSE)
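The backward trace above amounts to following "waiting on" edges until there is nothing further upstream. A minimal sketch, assuming the wait relationships have already been collected (for example from traces) into a hypothetical `WAITING_ON` mapping:

```python
# Hypothetical data: each service maps to the service it is blocked on.
WAITING_ON = {
    "Service C": "Service B",
    "Service B": "Service A",
}


def trace_root(symptom_service, waiting_on):
    """Follow wait edges from the symptom back to the originating service."""
    chain = [symptom_service]
    seen = {symptom_service}
    while chain[-1] in waiting_on:
        upstream = waiting_on[chain[-1]]
        if upstream in seen:
            # A cycle means there is no single root; stop rather than loop forever.
            break
        chain.append(upstream)
        seen.add(upstream)
    return chain
```

`trace_root("Service C", WAITING_ON)` returns the chain C, B, A. The last entry is where to look for the actual cause (here, the full table scan); the code only locates the service, not the reason.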
What happens when components interact under stress?
One component fails → Dependent components overload → They fail
↓
← More traffic to remaining ←
Mitigation: Circuit breakers, bulkheads, graceful degradation
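As an illustration of the first mitigation, here is a minimal circuit breaker sketch. It is not a production implementation: the class name, thresholds, and the injectable `clock` are assumptions made for the example, and it is not thread-safe.

```python
import time


class CircuitBreaker:
    """Stop calling a failing dependency for a cool-down period."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Open: fail fast instead of adding load to the dependency.
                raise RuntimeError("circuit open: call rejected")
            # Cool-down elapsed: allow one trial call (half-open).
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        return result
```

The point for cascade prevention is the fail-fast branch: once the breaker opens, callers stop sending traffic to the struggling component, which breaks the loop in the diagram above.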
Cache expires → All requests hit backend simultaneously → Overload
Mitigation: Jittered expiration, cache warming, request coalescing
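Jittered expiration can be as small as one helper. The sketch below assumes a base TTL in seconds and spreads it by a configurable fraction; `jittered_ttl` and its parameters are illustrative names.

```python
import random


def jittered_ttl(base_ttl_s, jitter_fraction=0.1, rng=random):
    """Return the base TTL shifted by up to +/- jitter_fraction of itself."""
    spread = base_ttl_s * jitter_fraction
    return base_ttl_s + rng.uniform(-spread, spread)
```

With `base_ttl_s=300` and the default fraction, keys written at the same moment expire somewhere between 270 and 330 seconds later rather than all at once, which spreads the backend refill over time.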
Processing rate < Arrival rate → Queue grows → Memory pressure → OOM
Mitigation: Backpressure, rate limiting, queue bounds
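A bounded queue is the simplest form of backpressure: when it is full, the producer is told so immediately instead of the queue growing until memory runs out. A minimal sketch using the standard library's `queue.Queue`; the `try_enqueue` helper and the tiny `maxsize` are assumptions for the example.

```python
import queue


def try_enqueue(q, item):
    """Enqueue without blocking; return False when the queue is full."""
    try:
        q.put_nowait(item)
        return True
    except queue.Full:
        # The caller can now shed load, slow down, or return an error upstream.
        return False


work = queue.Queue(maxsize=2)
accepted = [try_enqueue(work, n) for n in range(4)]
# accepted is [True, True, False, False]; memory use stays bounded.
```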
Multiple processes → Same resource → Lock contention → Serialization
↓
Throughput collapses despite available CPU
Mitigation: Sharding, optimistic locking, resource isolation
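Optimistic locking replaces "hold a lock while working" with "check nothing changed when writing". The sketch below shows only the protocol, assuming a hypothetical in-memory `VersionedRecord`; in a real store the version check and the write must be a single atomic operation (for example a conditional update), which this toy class does not provide.

```python
class VersionedRecord:
    """Toy record whose writes succeed only against the version that was read."""

    def __init__(self, value):
        self.value = value
        self.version = 0

    def compare_and_set(self, expected_version, new_value):
        if self.version != expected_version:
            # Someone else wrote first; the caller should re-read and retry.
            return False
        self.value = new_value
        self.version += 1
        return True
```

Two writers that both read version 0 will see one `compare_and_set(0, ...)` succeed and the other fail, so neither had to wait on a lock while computing its update.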
Reinforcing (Positive) Loops: Amplify change
Technical Debt Loop:
Deadline pressure → Shortcuts → More bugs → More firefighting
↓
← Less time for quality ←
Balancing (Negative) Loops: Counteract change
Auto-scaling Loop:
Load increases → More instances spawn → Load per instance decreases
↓
← Fewer instances needed ←
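The balancing behaviour of the auto-scaling loop comes from sizing the fleet against a per-instance target. A minimal sketch, assuming a hypothetical target of requests per second per instance:

```python
import math


def instances_needed(load_rps, target_rps_per_instance, min_instances=1):
    """Smallest instance count that keeps per-instance load at or below target."""
    return max(min_instances, math.ceil(load_rps / target_rps_per_instance))
```

At 450 requests per second with a target of 100 per instance this gives 5 instances, or 90 per instance. If load falls back to 90, the same rule returns 1, which is the "fewer instances needed" half of the loop.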
Questions to identify loops:
Stocks: Accumulated quantities (users, technical debt, cache size)
Flows: Rates of change (registrations/day, bugs fixed/sprint)
┌─────────────────────────────────────┐
│ Inflow → [Stock] → Outflow │
│ │
│ New bugs → [Bug Backlog] → Fixes │
│ Requests → [Queue Depth] → Processed│
│ Hires → [Team Size] → Attrition │
└─────────────────────────────────────┘
Key insight: Stocks change slowly even when flows change quickly. Queue depth doesn't drop instantly when you add capacity.
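The key insight can be checked with a few lines of arithmetic. This sketch assumes constant inflow and outflow per time step; the function name and numbers are illustrative.

```python
def simulate_stock(initial, inflow_per_step, outflow_per_step, steps):
    """Return the stock level after each step, never dropping below zero."""
    stock = initial
    history = [stock]
    for _ in range(steps):
        stock = max(0, stock + inflow_per_step - outflow_per_step)
        history.append(stock)
    return history
```

Take a queue holding 1000 items with 100 arriving per step. Raising processing from 100 to 150 per step is a 50% capacity increase, yet `simulate_stock(1000, 100, 150, 20)` shows the backlog needs all 20 steps to reach zero, because only the net flow of 50 per step drains the stock.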
Time lags between cause and effect obscure relationships:
Code deployed → [Delay: Cache TTL] → Users see change
Feature shipped → [Delay: Adoption curve] → Metrics change
New hire starts → [Delay: Ramp-up] → Productivity impact
Danger: Acting before feedback arrives leads to overcorrection.
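Overcorrection is easy to reproduce in a toy controller that keeps adjusting based on stale measurements. The sketch below is an assumption-laden model, not a description of any real system: `gain` is how aggressively each step corrects, and `delay` is how many steps old the observed value is.

```python
def adjust_with_delay(target, gain, delay, steps):
    """Step toward `target` using a measurement that is `delay` steps old."""
    history = [0.0]
    for _ in range(steps):
        # Before enough history exists, the controller still sees the start value.
        observed = history[max(0, len(history) - 1 - delay)]
        history.append(history[-1] + gain * (target - observed))
    return history
```

With `delay=0` and `gain=0.5` the value approaches a target of 100 without ever exceeding it. With `delay=2` the same gain pushes the value past the target, because the controller keeps correcting for an error that has already been fixed.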
Small changes can have large effects (and vice versa):
Linear assumption: 2x traffic = 2x latency
Reality: Traffic crosses threshold → 10x latency (queue buildup)
Linear assumption: Adding an engineer adds one engineer's worth of capacity
Reality: Communication paths grow as n(n-1)/2, so overhead grows O(n²)
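Both nonlinearities can be put into numbers. The latency function below assumes a textbook M/M/1 queue (a single server with random arrivals and service times), which is a modelling assumption rather than anything the skill prescribes; the path count is the standard pairwise formula.

```python
def mm1_latency_s(arrival_rps, service_rps):
    """Mean time in an M/M/1 system: 1 / (service rate - arrival rate)."""
    if arrival_rps >= service_rps:
        return float("inf")  # the queue grows without bound
    return 1.0 / (service_rps - arrival_rps)


def communication_paths(team_size):
    """Number of distinct pairs in a team of `team_size` people."""
    return team_size * (team_size - 1) // 2
```

For a server that handles 100 requests per second, going from 50 to 95 requests per second is less than double the traffic, but mean latency under this model rises from 0.02 s to 0.2 s. Likewise, growing a team from 5 to 10 people takes the number of communication paths from 10 to 45.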
Behaviors that arise from interactions, not individual components:
┌──────────────────────────────────────────────────────────────┐
│ System: [Name] │
├──────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────┐ ┌─────────┐ │
│ │ Factor │──────(+)──────────────▶│ Factor │ │
│ │ A │ │ B │ │
│ └─────────┘ └────┬────┘ │
│ ▲ │ │
│ │ │ │
│ (-) (+) │
│ │ │ │
│ │ ┌─────────┐ │ │
│ └─────────│ Factor │◀─────────────┘ │
│ │ C │ │
│ └─────────┘ │
│ │
│ Legend: (+) = same direction, (-) = opposite direction │
│ Loop type: Reinforcing / Balancing │
└──────────────────────────────────────────────────────────────┘
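The loop type in a causal loop diagram follows from the link signs: an odd number of (-) links makes the loop balancing, an even number (including zero) makes it reinforcing. A minimal sketch of that rule, with `loop_type` as an illustrative name:

```python
def loop_type(link_signs):
    """Classify a closed loop from its link signs, given as '+' or '-' strings."""
    negatives = sum(1 for sign in link_signs if sign == "-")
    return "Balancing" if negatives % 2 == 1 else "Reinforcing"
```

For the template above, the links are A to B (+), B to C (+), and C to A (-), so `loop_type(["+", "+", "-"])` returns `"Balancing"`.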
Once you've located where to intervene, pick the highest-leverage point you can actually move:
| Leverage | Example | Impact |
|---|---|---|
| Parameters | Timeout values | Low |
| Buffer sizes | Queue limits | Low-Medium |
| Feedback loops | Add monitoring | Medium |
| Information flows | Make metrics visible | Medium-High |
| Rules | Change retry policy | High |
| Goals | Redefine SLOs | Very High |
| Paradigm | Rethink architecture | Transformational |
(See thinking-leverage-points for Meadows' full 12-level hierarchy.)
"We can't control systems or figure them out. But we can dance with them."
Systems resist simple fixes. Effective intervention requires understanding the whole, finding leverage points, and accepting that you're influencing, not controlling.
npx claudepluginhub tjboudreaux/cc-thinking-skills --plugin thinking-skills
Uses feedback loop analysis to diagnose why a system grows uncontrollably, oscillates, or resists change. Identifies dominant loops and delays.
Maps feedback loops, identifies system archetypes, and ranks interventions by Meadows' leverage hierarchy for complex problems with interconnected components.
Routes to the appropriate systems thinking tool based on your situation. Use for diagnosing system behaviors, feedback loops, and leverage points.