From systems-analysis
Use when something is broken or behaving unexpectedly and you need to understand why before fixing it. Triggers on "why is this happening", "help me debug", "it's not working", unexplained gaps between expected and observed behavior, a fix that didn't work, multiple simultaneous changes with unclear results.
Slash command
/systems-analysis:representing-and-intervening
You must model a system before predicting its behavior, and predict before intervening. Source: Ian Hacking, *Representing and Intervening*.
Not every question needs the full cycle. If the system is well-understood, the failure mode is familiar, and you can state your model and prediction in one sentence each — do that and act. The discipline is having a model before intervening (Hacking), not the ceremony around it. Scale the formality to the cost of being wrong.
| Phase | Action | Gate |
|---|---|---|
| Represent | State the model: components, relationships, assumptions. Then ask: what else could explain these observations? State an alternative model if one is plausible. If nothing credible competes, say why in one sentence — that sentence is the value. | — |
| Predict | What should we observe? Write it down. What part of your model are you least confident about? Target that first. | No intervention without written prediction |
| Intervene | Pick one test from the repertoire. If you have competing models, prefer the test that distinguishes them. If not, prefer the test that most directly falsifies your single model. Compare result to prediction. | One variable at a time |
| Observe | Record actual vs. predicted | — |
| Update | Prediction wrong? → See Update Decision | — |
Hard gate: No fix, bypass, or diagnostic action without first stating what you expect and why.
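As an illustration only, the hard gate can be sketched as a small record that refuses to run a test until a prediction has been written down. The `Experiment` class and its method names are hypothetical, not part of this skill or any library.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Experiment:
    """One pass through Represent -> Predict -> Intervene -> Observe (hypothetical helper)."""
    model: str                         # Represent: the stated model
    prediction: Optional[str] = None   # Predict: what we expect to observe
    observed: Optional[str] = None     # Observe: what actually happened

    def predict(self, expected: str) -> None:
        """Write the prediction down before touching the system."""
        self.prediction = expected

    def intervene(self, action: Callable[[], str]) -> bool:
        """Run one test; refuse if no prediction was recorded first."""
        if not self.prediction:
            raise RuntimeError("Hard gate: write a prediction before intervening")
        self.observed = action()
        # True means the observation matched the prediction.
        return self.observed == self.prediction


exp = Experiment(model="Cache returns stale entries because the TTL is never reset")
exp.predict("stale entry returned")
# The lambda stands in for a real probe of the system.
matched = exp.intervene(lambda: "fresh entry returned")
print(matched)  # False: the prediction failed, so go to the update decision
```

The point of the sketch is the ordering constraint: calling `intervene` before `predict` raises instead of producing a result that nothing was staked on.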
Before picking a test, enumerate what's available: run a script, write a spec, read logs, inspect generated queries, add instrumentation, run a benchmark. The first one you think of is rarely the most informative. Pick the one that most directly tests the prediction.
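One way to make "most informative" concrete, assuming you have two competing models, is to write down what each model predicts for each available test and keep only the tests where the predictions differ. The function name, model names, and outcomes below are hypothetical.

```python
def discriminating_tests(predictions):
    """Return the tests whose predicted outcome differs between models.

    predictions maps test name -> {model name -> predicted outcome}.
    """
    return [
        test
        for test, by_model in predictions.items()
        if len(set(by_model.values())) > 1
    ]


# Hypothetical example: two competing models for a slow page.
predictions = {
    "inspect generated queries": {
        "n_plus_one": "hundreds of queries",
        "slow_network": "a handful of queries",
    },
    "run a benchmark": {
        "n_plus_one": "slow",
        "slow_network": "slow",
    },
}

print(discriminating_tests(predictions))  # ['inspect generated queries']
```

Here the benchmark would confirm that the page is slow under either model, so it cannot separate them; inspecting the queries can.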
Prefer executable verification. If a claim can be checked by running code — a test, a query, a count, a timing measurement — do that instead of reasoning about it in prose. Computed results are more reliable than inferred ones.
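A minimal sketch of checking a claim by computing it rather than arguing for it. The claim here is "no duplicate ids remain after the import step"; the data and the `duplicate_ids` helper are made up for illustration.

```python
from collections import Counter


def duplicate_ids(records):
    """Return the ids that appear more than once, in sorted order."""
    counts = Counter(r["id"] for r in records)
    return sorted(i for i, n in counts.items() if n > 1)


# Hypothetical data standing in for the output of the import step.
records = [{"id": 1}, {"id": 2}, {"id": 2}, {"id": 3}]

prediction = []                      # the model says: no duplicates
observed = duplicate_ids(records)    # the computed result

if observed == prediction:
    print("prediction held")
else:
    print(f"prediction failed: {observed}")  # prints "prediction failed: [2]"
```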
Before acting on a diagnosis, ask whether you are solving the right problem or solving the wrong problem correctly. LLMs agree with user framing 88% of the time (Cheng et al., 2025): if the user says "I think it's a caching issue," the agent will debug caching rather than question the frame. Trace the assumption chain to its deepest dependency, then present the human with one pointed question, not the whole chain.
The human must be able to explain the diagnosis in their own words before acting on it. If they can't, the tool replaced their understanding rather than augmenting it. A diagnosis the human can't explain is a diagnosis they can't update when conditions change.
digraph update {
node [shape=box fontsize=11];
edge [fontsize=10];
"Prediction failed" -> "Was the model structurally wrong?";
"Was the model structurally wrong?" -> "Revise model\n(return to Represent)" [label="yes — missing\ncomponent or\nrelationship"];
"Was the model structurally wrong?" -> "Tune parameters\n(stay in Intervene)" [label="no — direction\nwas right,\nmagnitude wrong"];
}
Tune parameters = single-loop learning. Revise the model = double-loop learning (Argyris). Ashby's Law: if the model can't represent the system's variety, no parameter adjustment will fix it.
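The update decision can be roughed out in code for the narrow case where the prediction is a numeric change, such as a latency delta. This sketch assumes, as a simplification, that "direction right, magnitude wrong" means the observed change has the same sign as the predicted one; whether a model is structurally wrong usually takes judgment that a sign check cannot replace. The function name and the `tolerance` parameter are hypothetical.

```python
def update_decision(predicted_change: float,
                    observed_change: float,
                    tolerance: float = 0.1) -> str:
    """Classify the outcome of one intervention.

    tolerance is the relative error still counted as a match (assumed value).
    """
    if abs(observed_change - predicted_change) <= tolerance * abs(predicted_change):
        return "prediction held"
    # Same sign: the model pointed the right way but got the size wrong.
    if predicted_change * observed_change > 0:
        return "tune parameters (stay in Intervene)"
    # No effect or the opposite effect: a component or relationship is missing.
    return "revise model (return to Represent)"


# Predicted: adding an index cuts latency by 100 ms.
print(update_decision(-100, -95))  # prediction held
print(update_decision(-100, -40))  # tune parameters (stay in Intervene)
print(update_decision(-100, 20))   # revise model (return to Represent)
```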
Stop and return to Represent if you catch yourself:
| Thought | Reality |
|---|---|
| "Trying IS learning" | Predict first, then the result teaches. Without prediction, results are noise. |
| "The tool will figure it out" | No model in → no insight out. |
| "Close enough" | Wrong in a way you haven't identified yet. |
| "The model is implicit" | Implicit models can't be checked or updated. Write it down. |
| "Predicting is overhead" | 30 seconds to predict vs. hours of undirected intervention. |
| "Let me give you a checklist" | Checklist = intervention without representation. Model first. |
| "The user said it's X" | Agreement ≠ diagnosis. Check the frame. |
Representing and Intervening (R&I) is epistemic. It asks: how does this work, and what will happen if I change it?
npx claudepluginhub jackwillis/claude-plugins --plugin systems-analysis