Skill

representing-and-intervening

Use when something is broken or behaving unexpectedly and you need to understand why before fixing it. Triggers on "why is this happening", "help me debug", "it's not working", unexplained gaps between expected and observed behavior, a fix that didn't work, multiple simultaneous changes with unclear results.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/systems-analysis:representing-and-intervening

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

You must model a system before predicting its behavior, and predict before intervening. Source: Ian Hacking, *Representing and Intervening*.

Supporting Files

examples/design-decision.md
examples/flaky-test.md
examples/slow-api.md

SKILL.md

104 lines · ~1.9k tokens

Stats

Language: CSS

Parent stars: 1

Maintenance: Excellent

Last Commit: Apr 4, 2026

Representing and Intervening

You must model a system before predicting its behavior, and predict before intervening. Source: Ian Hacking, Representing and Intervening.

Proportionality

Not every question needs the full cycle. If the system is well-understood, the failure mode is familiar, and you can state your model and prediction in one sentence each — do that and act. The discipline is having a model before intervening (Hacking), not the ceremony around it. Scale the formality to the cost of being wrong.

Five Phases

Phase	Action	Gate
Represent	State the model: components, relationships, assumptions. Then ask: what else could explain these observations? State an alternative model if one is plausible. If nothing credible competes, say why in one sentence — that sentence is the value.	—
Predict	What should we observe? Write it down. What part of your model are you least confident about? Target that first.	No intervention without written prediction
Intervene	Pick one test from the repertoire. If you have competing models, prefer the test that distinguishes them. If not, prefer the test that most directly falsifies your single model. Compare result to prediction.	One variable at a time
Observe	Record actual vs. predicted	—
Update	Prediction wrong? → See Update Decision	—

Hard gate: No fix, bypass, or diagnostic action without first stating what you expect and why.
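
A minimal sketch of the hard gate in Python, assuming the model and prediction are kept as plain strings. The `Experiment` class and its method names are hypothetical, introduced only to illustrate the ordering; the skill itself prescribes no particular tooling.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Experiment:
    """One pass through Represent -> Predict -> Intervene -> Observe."""

    model: str                        # Represent: components, relationships, assumptions
    prediction: Optional[str] = None  # Predict: what we expect to observe
    observed: Optional[str] = None    # Observe: what actually happened

    def predict(self, expectation: str) -> None:
        """Write the prediction down before anything is run."""
        self.prediction = expectation

    def intervene(self, test: Callable[[], str]) -> str:
        """Run one test, but only once a prediction exists."""
        if not self.prediction:
            raise RuntimeError("No intervention without a written prediction")
        self.observed = test()
        return self.observed

    def matches(self) -> bool:
        """Compare actual vs. predicted (string equality is a simplification)."""
        return self.observed == self.prediction
```

Calling `intervene` before `predict` raises, which is the point: the prediction has to exist before the test runs. Real predictions usually need a looser comparison than string equality.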

Two Modes

Lightweight (default): Natural language model and predictions. Always start here.
Formal (opt-in): Tool-assisted (e.g., causal diagrams, logic engines). Only after lightweight model exists. Formal mode is for research, scientific, or regulatory contexts where the model needs to survive external scrutiny — not typical software engineering.

Intervention Repertoire

Before picking a test, enumerate what's available: run a script, write a spec, read logs, inspect generated queries, add instrumentation, run a benchmark. The first one you think of is rarely the most informative. Pick the one that most directly tests the prediction.

Prefer executable verification. If a claim can be checked by running code — a test, a query, a count, a timing measurement — do that instead of reasoning about it in prose. Computed results are more reliable than inferred ones.
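
As one illustration of executable verification, the sketch below counts the SQL statements an action issues. That count can separate an N+1 model from a missing-index model: the first predicts one statement plus one per parent row, the second predicts a constant number. It uses Python's standard `sqlite3` module and `Connection.set_trace_callback`. The `author`/`book` schema and all helper names are hypothetical, and a real application would more likely rely on its ORM's or driver's own query log.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory database (hypothetical schema, for illustration)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute(
        "CREATE TABLE book (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)"
    )
    conn.executemany("INSERT INTO author VALUES (?, ?)", [(1, "A"), (2, "B"), (3, "C")])
    conn.executemany(
        "INSERT INTO book VALUES (?, ?, ?)", [(1, 1, "x"), (2, 2, "y"), (3, 3, "z")]
    )
    conn.commit()
    return conn


def count_statements(conn: sqlite3.Connection, action) -> int:
    """Run action(conn) and return how many SQL statements SQLite traced."""
    seen = []
    conn.set_trace_callback(seen.append)
    try:
        action(conn)
    finally:
        conn.set_trace_callback(None)  # always remove the hook afterwards
    return len(seen)


def load_n_plus_one(conn: sqlite3.Connection) -> None:
    """One query for the authors, then one more query per author."""
    authors = conn.execute("SELECT id FROM author").fetchall()
    for (author_id,) in authors:
        conn.execute(
            "SELECT title FROM book WHERE author_id = ?", (author_id,)
        ).fetchall()


def load_joined(conn: sqlite3.Connection) -> None:
    """The same data fetched with a single joined query."""
    conn.execute(
        "SELECT a.id, b.title FROM author a JOIN book b ON b.author_id = a.id"
    ).fetchall()
```

Write the expected count down first, then compare it with `count_statements(conn, load_n_plus_one)`. A count that grows with the number of authors supports the N+1 model; a constant count points elsewhere.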

Problem Setting (Schön)

Before acting on a diagnosis, ask whether you are solving the right problem or solving the wrong problem correctly. LLMs agree with user framing 88% of the time (Cheng et al., 2025): if the user says "I think it's a caching issue," the agent will debug caching rather than question the frame. Trace the assumption chain to its deepest dependency, then present the human with one pointed question, not the whole chain.

Understanding, Not Just Receiving

The human must be able to explain the diagnosis in their own words before acting on it. If they can't, the tool replaced their understanding rather than augmenting it. A diagnosis the human can't explain is a diagnosis they can't update when conditions change.

Update Decision

digraph update {
  node [shape=box fontsize=11];
  edge [fontsize=10];
  "Prediction failed" -> "Was the model structurally wrong?";
  "Was the model structurally wrong?" -> "Revise model\n(return to Represent)" [label="yes — missing\ncomponent or\nrelationship"];
  "Was the model structurally wrong?" -> "Tune parameters\n(stay in Intervene)" [label="no — direction\nwas right,\nmagnitude wrong"];
}

Tune parameters = single-loop learning. Revise the model = double-loop learning (Argyris). Ashby's Law: if the model can't represent the system's variety, no parameter adjustment will fix it.
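
For predictions about a numeric change, the decision above can be sketched roughly as follows. Using the sign of the change as a proxy for "direction was right" is an assumption made for illustration, and `update_decision` is a hypothetical helper. A matching sign does not rule out a structural error, so treat a "tune" result as a prompt to check the model, not as a verdict.

```python
def update_decision(predicted_change: float, observed_change: float) -> str:
    """Classify a failed numeric prediction as parametric or structural.

    Both arguments are signed changes in the same metric, for example a
    change in latency in milliseconds after an intervention.
    """
    # Same sign and non-zero: the direction was right, only the magnitude was off.
    same_direction = predicted_change * observed_change > 0
    if same_direction:
        return "tune parameters (stay in Intervene)"
    # Opposite direction, or no effect at all: the model is missing something.
    return "revise model (return to Represent)"
```

For example, predicting a 200 ms drop and observing a 50 ms drop falls on the tuning branch, while observing no change or an increase sends you back to Represent.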

Red Flags

Stop and return to Represent if you catch yourself:

"Let me just try..." (intervening without predicting)
Reaching for a tool before the human has spoken
"Close enough" (skipping the observe/update cycle)
Multiple simultaneous changes (uninterpretable results)
"It partially worked, let's tune" (may be structural, not parametric)
Ranking fixes by probability without stating the model they assume
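
To avoid the "multiple simultaneous changes" flag, one option is to generate test configurations that each differ from the baseline in exactly one setting. This is a sketch assuming the system's settings fit in a flat dictionary; `single_variable_variants` and the setting names in the usage note are hypothetical.

```python
def single_variable_variants(baseline: dict, candidates: dict) -> list:
    """Return (changed_key, config) pairs, each changing one key of the baseline."""
    variants = []
    for key, value in candidates.items():
        if baseline.get(key) == value:
            continue  # identical to the baseline, nothing to learn from it
        variant = dict(baseline)  # copy so the baseline itself is never mutated
        variant[key] = value
        variants.append((key, variant))
    return variants
```

With a baseline of `{"pool_size": 5, "cache": True}` and candidates `{"pool_size": 20, "cache": False}`, this yields two configurations to run separately, so any difference in the result can be attributed to a single change.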

Rationalizations

Thought	Reality
"Trying IS learning"	Predict first, then the result teaches. Without prediction, results are noise.
"The tool will figure it out"	No model in → no insight out.
"Close enough"	Wrong in a way you haven't identified yet.
"The model is implicit"	Implicit models can't be checked or updated. Write it down.
"Predicting is overhead"	30 seconds to predict vs. hours of undirected intervention.
"Let me give you a checklist"	Checklist = intervention without representation. Model first.
"The user said it's X"	Agreement ≠ diagnosis. Check the frame.

Examples

Flaky integration test — competing models (insertion order vs. race condition), executable verification
API latency spike — competing models (N+1 vs. index), query counting to distinguish them
Design decision: queue vs. database — competing approaches (not debugging), observability as the differentiator

Arriving From Another Skill

From frame-problem: You've named assumptions and checked freshness. Carry the verified assumptions into your Represent phase — they're your starting constraints. Focus Represent on the parts the frame audit flagged as uncertain.
From causal-analysis: You have a DAG and identified causal relationships. Use them as your model in Represent rather than building from scratch. Your prediction should test the edges you're least confident about.
From requisite-variety: You've identified a variety gap or regulation failure. The regulation model is your starting Represent — now ask why the regulator fails, which is an R&I question.

Transition Signals

Model reveals a regulation problem (regulator can't match disturbance variety, "we keep adding rules") → suggest requisite-variety to the user.
Represent phase needs causal structure from observational data (confounders, selection bias) → suggest causal-analysis to the user.
Production is down, what broke? → suggest systematic-debugging to the user (forensic, not epistemic).
Update reveals structural revision — the model was wrong, not miscalibrated. If brainstorming is available, suggest it to the user to explore the problem space before committing to a new model.
Assumptions feel stale or unexamined — you have a model but haven't checked whether the world still matches it → suggest frame-problem to the user.
Model is solid, intervention plan is clear — if writing-plans is available, suggest it to the user. For multi-step fixes, suggest executing-plans or subagent-driven-development.
Prediction is clear and you need to encode it as a test — if what-to-test is available, suggest it to the user. R&I's prediction becomes the test's causal claim.

R&I is epistemic: how does this work, and what will happen if I change it?
