Skill

perrow-system-safety

Apply Charles Perrow's Normal Accident Theory to analyze complex systems for interactive complexity, tight coupling, and accident potential. Use when the user wants to review system architecture, diagnose incidents, assess risk in high-risk systems, or invokes /perrow-system-safety.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/research-skills:perrow-system-safety

User invocable

Model invocable

Inline context

Default effort

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Based on Charles Perrow's *Normal Accidents: Living with High-Risk Technologies* (1984). Apply these frameworks when analyzing systems for safety, diagnosing incidents, or reviewing architecture.

Supporting Files

evals/evals.json

SKILL.md

209 lines · ~3.1k tokens

Stats

Language: Python

Stars: 0

Maintenance: Excellent

Last Commit: May 25, 2026

Normal Accident System Safety

Based on Charles Perrow's Normal Accidents: Living with High-Risk Technologies (1984). Apply these frameworks when analyzing systems for safety, diagnosing incidents, or reviewing architecture.

Core Thesis

In systems with interactive complexity AND tight coupling, accidents are inevitable ("normal"). They arise from unanticipated interactions of multiple small failures, not from single catastrophic failures.

The Interaction/Coupling (I/C) Chart

Classify any system on two axes to assess accident potential:

                    INTERACTION
              Linear ◄─────────► Complex
         ┌─────────────────┬─────────────────┐
  Loose  │  Quadrant I     │  Quadrant II    │
         │  Low risk       │  Moderate risk  │
         │  e.g. Auto      │  e.g. University│
         │  manufacturing  │  research lab   │
COUPLING ├─────────────────┼─────────────────┤
  Tight  │  Quadrant III   │  Quadrant IV    │
         │  Moderate risk  │  HIGHEST RISK   │
         │  e.g. Hydro-    │  e.g. Nuclear   │
         │  electric dam   │  power, chemical│
         │                 │  plants,        │
         │                 │  aviation       │
         └─────────────────┴─────────────────┘

Quadrant IV systems (complex + tight) are where normal accidents occur. Most high-profile disasters live here.
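
The chart reduces to a two-key lookup. The sketch below is illustrative only: the function name `classify_quadrant` and the boolean inputs are assumptions, and the quadrant numbering follows the chart in this document.

```python
def classify_quadrant(complex_interactions: bool, tight_coupling: bool) -> tuple[str, str]:
    """Return (quadrant, risk label) for a system, using this document's chart."""
    # Keys are (interaction is complex?, coupling is tight?).
    chart = {
        (False, False): ("Quadrant I", "Low risk"),
        (True, False): ("Quadrant II", "Moderate risk"),
        (False, True): ("Quadrant III", "Moderate risk"),
        (True, True): ("Quadrant IV", "Highest risk"),
    }
    return chart[(complex_interactions, tight_coupling)]


# A system judged both interactively complex and tightly coupled.
print(classify_quadrant(complex_interactions=True, tight_coupling=True))
```

Treating each axis as a yes/no judgment is a simplification; the checklists in this document grade each axis in three bands.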

Interactive Complexity — Assessment Checklist

A system has high interactive complexity if it has:

Factor	Complex (score +1)	Linear (score 0)
Proximity	Components in close proximity, shorting possible	Spatially segregated subsystems
Common-mode connections	Shared connections that couple subsystems	Dedicated, independent connections
Interconnected subsystems	Tight interconnections between subsystems	Subsystems can operate independently
Substitutability	Limited ability to substitute components or roles	Easy to substitute supplies/equipment/personnel
Feedback loops	Multiple, unfamiliar feedback loops	Few feedback loops, all understood
Controls	Many control parameters that can interact	Few control parameters, each with a direct effect
Information	Indirect, inferred information	Direct, observable information
Understanding	Unfamiliar, unexpected sequences	Well-understood production sequences

Scoring: 0-3 = Linear; 4-6 = Moderately Complex; 7-8 = Highly Complex
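
A minimal sketch of the scoring rule above, assuming the reviewer has already decided which of the eight factors fall on the complex side. The factor identifiers and the function name `score_complexity` are hypothetical, chosen only to mirror the table rows.

```python
COMPLEXITY_FACTORS = (
    "proximity",
    "common_mode_connections",
    "interconnected_subsystems",
    "substitutability",
    "feedback_loops",
    "controls",
    "information",
    "understanding",
)


def score_complexity(complex_factors: set[str]) -> tuple[int, str]:
    """Count the factors judged 'complex' and map the count to a band."""
    unknown = complex_factors - set(COMPLEXITY_FACTORS)
    if unknown:
        raise ValueError(f"unknown factors: {sorted(unknown)}")
    score = len(complex_factors)
    if score <= 3:
        band = "Linear"
    elif score <= 6:
        band = "Moderately Complex"
    else:
        band = "Highly Complex"
    return score, band


# Hypothetical review of a service mesh: four factors judged complex.
judged_complex = {
    "common_mode_connections",
    "interconnected_subsystems",
    "feedback_loops",
    "information",
}
print(score_complexity(judged_complex))
```

The count treats every factor as equally weighted, which is how the checklist is written; a reviewer may still want to record why each factor was scored the way it was.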

Tight Coupling — Assessment Checklist

A system is tightly coupled if:

Factor	Tight (score +1)	Loose (score 0)
Timing	Delays not possible, must proceed on schedule	Can delay, hold, or pause processes
Sequence	Invariant sequence required	Multiple valid sequences possible
Methods	Only one way to achieve goal	Multiple methods available
Slack	Little slack in supplies, equipment, or personnel	Slack in resources is available
Buffers	Buffers and redundancies exist only where deliberately designed in	Fortuitous buffers and redundancies can absorb failures

Scoring: 0-2 = Loose; 3 = Moderate; 4-5 = Tight
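
The coupling checklist can be scored the same way. This is a sketch under the assumption that each factor is answered with a plain yes/no ("is this factor on the tight side?"); the factor keys and `score_coupling` are illustrative names, not part of the skill.

```python
COUPLING_FACTORS = ("timing", "sequence", "methods", "slack", "buffers")


def score_coupling(answers: dict[str, bool]) -> tuple[int, str]:
    """Sum the factors answered 'tight' (True) and map the total to a band."""
    missing = set(COUPLING_FACTORS) - set(answers)
    if missing:
        raise ValueError(f"missing answers for: {sorted(missing)}")
    score = sum(bool(answers[factor]) for factor in COUPLING_FACTORS)
    if score <= 2:
        band = "Loose"
    elif score == 3:
        band = "Moderate"
    else:
        band = "Tight"
    return score, band


# Hypothetical nightly batch pipeline: order is fixed and there is one
# method, but jobs can be delayed and queues provide slack.
answers = {
    "timing": False,
    "sequence": True,
    "methods": True,
    "slack": False,
    "buffers": False,
}
print(score_coupling(answers))
```

Requiring an answer for every factor is a deliberate choice here: an unanswered factor is treated as an incomplete assessment rather than silently scored as loose.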

DEPOSE Failure Analysis Framework

When diagnosing any failure or incident, systematically examine six components:

Design — Was the system designed with inadequate safety margins? Are safety features built in but flawed?
Equipment — Did hardware/components fail? Were redundancies present but defeated?
Procedures — Were established procedures followed? Were they adequate for the actual conditions?
Operators — Did human error contribute? (But: "human error" is often a symptom, not a cause)
Supplies & Materials — Were materials substandard? Were consumables adequate?
Environment — What environmental conditions contributed? (Weather, terrain, temperature, etc.)

DEPOSE Analysis Process

List all observed failures across all six DEPOSE categories
Identify which failures were independent vs interacting
Map the interaction sequence — which failures combined to produce the accident?
Determine if this was a system accident (unanticipated interactions among multiple failures) or a component failure accident (one or more failures linked in an anticipated sequence)
Ask: were there designed-in safety features that were themselves defeated by the interactions?
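
The process above can be sketched as a small data model. Everything here is an assumption made for illustration: the `Failure` and `Interaction` classes, the `anticipated` flag, and the example incident are hypothetical, and deciding whether an interaction was anticipated remains a human judgment.

```python
from dataclasses import dataclass

DEPOSE = ("design", "equipment", "procedures", "operators", "supplies", "environment")


@dataclass(frozen=True)
class Failure:
    name: str
    category: str  # one of DEPOSE


@dataclass(frozen=True)
class Interaction:
    first: str         # name of one failure
    second: str        # name of the failure it combined with
    anticipated: bool  # was this combination foreseen by designers or procedures?


def independent_failures(failures, interactions):
    """Step 2: failures that took part in no recorded interaction."""
    involved = {name for i in interactions for name in (i.first, i.second)}
    return [f.name for f in failures if f.name not in involved]


def classify_accident(failures, interactions):
    """Step 4: any unanticipated interaction marks a system accident."""
    names = {f.name for f in failures}
    for f in failures:
        if f.category not in DEPOSE:
            raise ValueError(f"unknown DEPOSE category: {f.category}")
    for i in interactions:
        if i.first not in names or i.second not in names:
            raise ValueError(f"interaction refers to unlisted failure: {i}")
    if any(not i.anticipated for i in interactions):
        return "system accident"
    return "component failure accident"


# Hypothetical incident used only to exercise the functions.
failures = [
    Failure("alarm threshold too wide", "design"),
    Failure("sensor drift", "equipment"),
    Failure("spare part out of stock", "supplies"),
]
interactions = [Interaction("sensor drift", "alarm threshold too wide", anticipated=False)]

print(independent_failures(failures, interactions))
print(classify_accident(failures, interactions))
```

Recording interactions separately from failures keeps step 1 (list everything) distinct from steps 2 and 3 (decide what combined with what).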

System Accident vs Component Failure

Characteristic	System Accident	Component Failure
Number of failures	Multiple, interacting in unanticipated ways	One or more, linked in an anticipated sequence
Predictability	Unanticipated sequences	Expected failure mode
Root cause	Interactive complexity + tight coupling	Design flaw, wear, operator error
Post-incident blame	Often misattributed to "operator error"	Correctly attributed to component
Prevention	Redesign for loose coupling / linearity	Redundancy, maintenance, training

Error-Inducing Systems

Some systems are structurally configured to induce errors and defeat attempts at error reduction. Key indicators:

Blame inversion: Victims are low-status, unorganized, or anonymous; perpetrators face no accountability
Regulatory fragmentation: International/multi-jurisdictional systems with weak enforcement (e.g., flags of convenience)
Production pressure dominance: Safety is subordinated to schedule/economic pressure at every level
Information asymmetry: Operators have poor information; surveillance is inadequate or absent
Authority concentration: Single-person authority with insufficient checks (the "captain" problem)
Safety feature defeat: Designed-in safety features are bypassed, degraded, or rendered ineffective by the system's own structure
Error normalization: Frequent small failures become invisible through familiarity ("the traditions of the sea")

Key insight: In error-inducing systems, improving any single component may be inconsequential because other components will be allowed to express more risk. Only wholesale reconfiguration can make the system error-avoiding.

System Type Classification

Perrow distinguishes three types of systems by how they transform inputs:

Type	Description	Risk Profile
Transformation	Changes the nature of materials (nuclear, chemical)	Highest risk — poorly understood dynamics, unobservable processes
Fabricating	Assembles components into products (manufacturing)	Lower risk — linear, well-understood sequences
Additive	Moves things without changing them (transport)	Variable — depends on complexity/coupling

Aviation is largely additive but becomes transformational at high speed/altitude ("exceeding the buffet boundary"), where it takes on the dangerous characteristics of transformation systems.

Production Pressure Analysis

Production pressures are a systemic force, not individual greed. Assess:

Schedule pressure: Are operators incentivized to meet timelines at the expense of safety?
Economic pressure: Does the organization profit from tighter coupling or reduced margins?
Regulatory capture: Does the industry influence its own regulators?
Risk homeostasis: Do safety improvements get consumed by increased risk-taking rather than increased safety? (e.g., better brakes → drivers go faster)
Blame patterns: Does the organization attribute systemic failures to individual "operator error"?
Insurance failure: Does insurance actually penalize risky behavior, or are costs passed to consumers?
International fragmentation: Can organizations evade regulation through flags of convenience, multiple jurisdictions?

Practical Application: System Design Review

When reviewing any system (software architecture, organizational process, physical plant):

Classify on I/C chart — Where does it fall? If Quadrant IV, it needs special attention.
Run DEPOSE checklist — Identify weak points across all six categories.
Look for linearization opportunities — Can subsystems be decoupled? Can sequences be made flexible?
Look for coupling reduction — Can buffers, slack, or delays be introduced?
Check for automation paradoxes — Does automation reduce workload but increase consequence of failure when humans must intervene?
Assess error-inducing features — Does the system's structure encourage risky behavior?
Evaluate redundancy independence — Are backup systems truly independent, or can a single event defeat multiple "redundant" systems?
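
The redundancy-independence check in the last step lends itself to a simple set comparison. The sketch below assumes each "redundant" system can be described by the set of things it depends on; the function name `shared_dependencies` and the example inventory are hypothetical.

```python
from itertools import combinations


def shared_dependencies(systems: dict[str, set[str]]) -> dict[tuple[str, str], set[str]]:
    """Map each pair of systems to the dependencies they have in common.

    A non-empty result means one event (loss of the shared dependency)
    could defeat both members of a supposedly redundant pair.
    """
    shared = {}
    for a, b in combinations(sorted(systems), 2):
        common = systems[a] & systems[b]
        if common:
            shared[(a, b)] = common
    return shared


# Hypothetical inventory: two database nodes intended as mutual backups.
inventory = {
    "primary_db": {"power_feed_A", "dns", "rack_12"},
    "replica_db": {"power_feed_B", "dns", "rack_12"},
}
for pair, common in shared_dependencies(inventory).items():
    print(pair, sorted(common))
```

The check is only as good as the dependency inventory: common-mode connections that nobody has written down will not appear, which is the situation the checklist warns about.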

Practical Application: Incident Diagnosis

When an incident occurs:

Don't stop at "operator error" — Ask what system conditions made the error likely or inevitable
Map all concurrent failures — Use DEPOSE to identify every contributing factor
Trace the interaction chain — How did independent failures combine in unexpected ways?
Identify defeated safety features — Were there safeguards that should have caught this but didn't?
Check for tight coupling indicators — Could the cascade have been halted? Were there buffers?
Look for organizational factors — Production pressures, training gaps, regulatory failures
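
Tracing the interaction chain can be sketched as a walk over a cause-and-effect graph. This assumes the investigator has already recorded "X contributed to Y" links; `trace_chains` and the event names are hypothetical and do not describe a real incident.

```python
def trace_chains(edges: dict[str, list[str]], outcome: str) -> list[list[str]]:
    """Return every causal path from a root event to `outcome`.

    `edges` maps an event to the events it contributed to. Root events
    are those that nothing else in the graph contributed to.
    """
    effects = {effect for targets in edges.values() for effect in targets}
    roots = sorted(event for event in edges if event not in effects)
    chains = []

    def walk(event, path):
        if event in path:  # guard against accidental cycles in the input
            return
        path = path + [event]
        if event == outcome:
            chains.append(path)
            return
        for nxt in edges.get(event, []):
            walk(nxt, path)

    for root in roots:
        walk(root, [])
    return chains


# Hypothetical outage: two independent failures converge on one overload.
edges = {
    "stale config pushed": ["cache stampede"],
    "autoscaler in cooldown": ["capacity shortfall"],
    "cache stampede": ["database overload"],
    "capacity shortfall": ["database overload"],
    "database overload": ["outage"],
}
for chain in trace_chains(edges, "outage"):
    print(" -> ".join(chain))
```

Several chains that share a late node, as in this example, point to the place where independent failures combined, which is where a buffer or a halt point would have been most useful.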

Policy Framework: System Classification for Action

Perrow proposes classifying high-risk systems along two dimensions: net catastrophic potential and cost of alternatives. This yields three action categories:

Category 1: Abandon (High Risk + Low Alternative Cost)

Systems where inevitable risks outweigh reasonable benefits:

Nuclear power (complex + tightly coupled transformation system, no organizational fix possible)
Nuclear weapons (catastrophic potential + arms race dynamics)

Category 2: Constrain & Improve (Moderate-High Risk + High Alternative Value)

Systems we cannot easily do without, but where risks should be reduced:

Marine transport (error-inducing system, needs wholesale restructuring)
Recombinant DNA (enormous potential benefits, but needs strict controls)

Category 3: Self-Correcting with Modest Effort (Lower Risk + Existing Self-Correction)

Systems that are partially self-correcting and can be further improved:

Chemical plants (complex and tightly coupled on the I/C chart above, but partially self-correcting)
Airliners and air traffic control (strong safety culture, self-reporting systems like ASRS)
Mining, fossil fuel plants, highway/automobile safety

Risk Assessment Critique

Perrow argues that conventional risk assessment is flawed because it:

Monetizes social goods and human life ($300,000 per life in 1984)
Treats 50 scattered highway deaths as equivalent to 50 deaths in one catastrophe
Ignores the difference between voluntary risks (skiing) and imposed risks (nuclear plants)
Fails to account for third-party and fourth-party victims (innocent bystanders + future generations)
Is dominated by industry-funded researchers with a vested interest

Key insight: "Ultimately, the issue is not risk, but power; the power to impose risks on the many for the benefit of the few."

Key Quotes for Context

"In tightly coupled systems, failures can cascade rapidly, and there is little slack or buffer to absorb them."

"Complex systems produce more unfamiliar sequences than are actually displayed in any given accident."

"Operator error is a convenient catch-all for mishaps whose real cause is uncertain, complex, or embarrassing to the system."

"There is no organizational structure that we would or should tolerate that could prevent [system accidents in nuclear power]."

"The issue is not risk, but power; the power to impose risks on the many for the benefit of the few."

References

Perrow, C. (1984). Normal Accidents: Living with High-Risk Technologies. Basic Books.
Case studies covered: Nuclear power (TMI, Fermi, Dresden), Petrochemical (Texas City, Flixborough; Bhopal occurred after the 1984 edition went to press and is discussed in the afterword to the 1999 edition), Aviation (DC-10, Orange County, ATC), Marine transport, Space (Mercury, Apollo), DNA research
