Skill

harm-modeling

Systematically enumerate the potential HARMS of an AI system — to users, third parties, vulnerable groups, and society — under normal use, misuse, and malfunction, then rank them and map mitigations. This is the AI-safety analog of threat modeling (which targets attackers). Use when designing or reviewing an AI feature for safety, not security.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/ai-safety:harm-modeling

User invocable

Model invocable

Supporting Files

reference.md

SKILL.md

55 lines · ~642 tokens

Stats

Parent stars: 1

Maintenance: Good

Last Commit: May 31, 2026

Goal

A harm model: who could be harmed, how, under what conditions, how badly, and what reduces it — the safety counterpart to a security threat model.

How this differs from threat modeling

Threat modeling (threat-modeling:stride) asks "how could an attacker compromise the system?" The actor is adversarial.
Harm modeling asks "how could the system harm people even with no attacker?", whether through normal use, foreseeable misuse, malfunction, bias, or over-reliance. Use both for a complete picture.

Steps

Define the system & context — purpose, users (including vulnerable populations: minors, patients, at-risk groups), deployment context, and the stakes of the decisions it influences.
Identify stakeholders — direct users, non-user subjects (people the output is about), bystanders/third parties, and society at large.
Enumerate harm categories (see reference.md): physical, psychological, financial, discrimination/unfairness, privacy/dignity, misinformation, manipulation/autonomy, societal/democratic, environmental, and dangerous-capability/misuse harms.
For each plausible harm, capture the condition under which it occurs: normal use, foreseeable misuse, malfunction/error (hallucination, failure), distribution shift, or feedback effects at scale. Note who is harmed and how severe and irreversible the harm is.
Rate each harm by severity × likelihood × affected population, weighting irreversible and vulnerable-group harms upward. Reuse threat-modeling:risk-rank scoring.
Map mitigations — design changes, guardrails, evals, human oversight, disclosures, usage policy, monitoring — and note residual harm.
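
The rating step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 1 to 5 scales, the 1.5 multipliers, and the names Harm, harm_score, and rank_harms are hypothetical and are not taken from threat-modeling:risk-rank, whose actual scoring scheme is not reproduced here.

```python
from dataclasses import dataclass

# Assumed multipliers for illustration only; tune them or replace them
# with whatever the threat-modeling:risk-rank scheme prescribes.
IRREVERSIBLE_WEIGHT = 1.5
VULNERABLE_WEIGHT = 1.5


@dataclass(frozen=True)
class Harm:
    stakeholder: str
    category: str
    condition: str
    severity: int      # assumed scale: 1 (minor) to 5 (catastrophic)
    likelihood: int    # assumed scale: 1 (rare) to 5 (frequent)
    population: int    # assumed scale: 1 (few people) to 5 (society-wide)
    irreversible: bool = False
    vulnerable_group: bool = False


def harm_score(harm: Harm) -> float:
    """Multiply the three ratings, then weight up the flagged harms."""
    for name in ("severity", "likelihood", "population"):
        value = getattr(harm, name)
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be in 1..5, got {value}")
    score = float(harm.severity * harm.likelihood * harm.population)
    if harm.irreversible:
        score *= IRREVERSIBLE_WEIGHT
    if harm.vulnerable_group:
        score *= VULNERABLE_WEIGHT
    return score


def rank_harms(harms):
    """Return harms ordered from highest to lowest score."""
    return sorted(harms, key=harm_score, reverse=True)


# Hypothetical entries, invented purely to exercise the functions.
example_harms = [
    Harm("direct users", "financial", "malfunction/error", 3, 3, 2),
    Harm("minors", "psychological", "normal use", 4, 2, 2,
         irreversible=True, vulnerable_group=True),
    Harm("third parties", "privacy/dignity", "foreseeable misuse", 3, 2, 3),
]
ranked = rank_harms(example_harms)
```

With these assumed weights, the harm to minors rises to the top of the ranking even though its raw product (16) is lower than the others (18), which is the intended effect of weighting irreversible and vulnerable-group harms upward.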

Output

A harm-model table with the columns stakeholder · harm category · condition · severity · likelihood · affected group · mitigation · residual, plus a top-harms summary and recommended safeguards. Use security-reporting for the writeup and security-diagramming to map harm pathways.
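
As a rough sketch of the output format, the table and the top-harms summary could be produced as follows. The column names come from the list above; the function names render_harm_table and top_harms, the severity × likelihood sort key, and the sample rows are all hypothetical and only show the shape of the data.

```python
COLUMNS = [
    "stakeholder", "harm category", "condition", "severity",
    "likelihood", "affected group", "mitigation", "residual",
]


def render_harm_table(rows):
    """Render rows as dot-separated lines, header first."""
    lines = [" · ".join(COLUMNS)]
    for row in rows:
        missing = [col for col in COLUMNS if col not in row]
        if missing:
            raise KeyError(f"row is missing columns: {missing}")
        lines.append(" · ".join(str(row[col]) for col in COLUMNS))
    return "\n".join(lines)


def top_harms(rows, n=3):
    """Pick the n rows with the highest severity x likelihood (assumed key)."""
    ordered = sorted(
        rows, key=lambda r: r["severity"] * r["likelihood"], reverse=True
    )
    return ordered[:n]


# Hypothetical rows, invented for illustration only.
rows = [
    {
        "stakeholder": "direct users",
        "harm category": "misinformation",
        "condition": "malfunction/error",
        "severity": 3,
        "likelihood": 4,
        "affected group": "all users",
        "mitigation": "source citations; hallucination evals",
        "residual": "medium",
    },
    {
        "stakeholder": "non-user subjects",
        "harm category": "privacy/dignity",
        "condition": "foreseeable misuse",
        "severity": 4,
        "likelihood": 2,
        "affected group": "people named in prompts",
        "mitigation": "usage policy; monitoring",
        "residual": "low",
    },
]

table = render_harm_table(rows)
summary = top_harms(rows, n=1)
```

Rejecting rows with missing columns keeps every harm tied to a mitigation and a residual-harm note, so no entry reaches the writeup half-specified.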

Notes

Always include foreseeable misuse and malfunction, not just intended use — most real-world AI harms come from those. Give extra weight to harms that are irreversible or fall on people who can't opt out.
