From mattpocock-skills
Guides writing and editing skills with a vocabulary and principles focused on predictability, invocation modes, and information hierarchy.
How this skill is triggered — by the user, by Claude, or both
Slash command
/mattpocock-skills:writing-great-skillsThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
A skill exists to wrangle determinism out of a stochastic system. **Predictability** — the agent taking the same _process_ every run, not producing the same output — is the root virtue; every lever below serves it.
A skill exists to wrangle determinism out of a stochastic system. Predictability — the agent taking the same process every run, not producing the same output — is the root virtue; every lever below serves it.
Bold terms are defined in GLOSSARY.md; look them up there for the full meaning.
Two choices, trading different costs:
disable-model-invocation, and write a model-facing description with rich trigger phrasing ("Use when the user wants…, mentions…").disable-model-invocation: true; the description becomes human-facing — a one-line summary, trigger lists stripped.Pick model-invocation only when the agent must reach the skill on its own, or another skill must. If it only ever fires by hand, make it user-invoked and pay no context load.
When user-invoked skills multiply past what you can remember, that piled-up cognitive load is cured by a router skill: one user-invoked skill that names the others and when to reach for each.
A model-invoked description does two jobs — state what the skill is, and list the branches that should trigger it. Every word increases context load, so a description earns even harder pruning than the body:
A skill is built from two content types — steps and reference — that mix freely: a skill can be all steps, all reference, or both. The core decision is which to use and where each sits on the information hierarchy, a ladder ranked by how immediately the agent needs the material:
SKILL.md, the primary tier: what the agent does, in order. Each step ends on a completion criterion, the condition that tells the agent the work is done. Make it checkable (can the agent tell done from not-done?) and, where it matters, exhaustive ("every modified model accounted for", not "produce a change list") — a vague criterion invites premature completion.SKILL.md, consulted on demand. Often a legitimately flat peer-set (every rule of a review on one rung) — a fine arrangement, not a smell. This skill is all reference.SKILL.md into a separate file, reached by a context pointer, loaded only when the pointer fires. (Spans disclosed reference — a sibling file like GLOSSARY.md, still part of the skill — through fully external reference that lives outside the skill system and any skill can point at.)A demanding completion criterion drives thorough legwork — the digging the agent does within the work — whether the skill has steps or not, since "every rule applied" binds flat reference just as "every step done" binds a sequence.
Push too little down and the top bloats; push too much and you hide material the agent actually needs. That tension is the whole decision.
Progressive disclosure is the move down the ladder — out of SKILL.md into a linked file — so the top stays legible. Mechanics: a linked .md file in the skill folder, named for what it holds (this skill discloses its full definitions to GLOSSARY.md). Some skills are used in more than one way, and each distinct way is a branch — different runs taking different paths through the skill. Branching is the cleanest disclosure test: inline what every branch needs, and push behind a pointer what only some branches reach. A context pointer's wording, not its target, decides when and how reliably the agent reaches the material.
Where the ladder decides how far down a piece sits, co-location decides what sits beside it once there: keep a concept's definition, rules, and caveats under one heading rather than scattered, so reading one part brings its neighbours with it.
Granularity is how finely you divide skills, and each cut spends one of the two loads, so split only when the cut earns it. Two cuts:
Keep each meaning in a single source of truth: one authoritative place, so changing the behaviour is a one-place edit.
Check every line for relevance: does it still bear on what the skill does?
Then hunt no-ops sentence by sentence, not just line by line: run the no-op test on each sentence in isolation, and when one fails, delete the whole sentence rather than trim words from it. Be aggressive — most prose that fails should go, not be rewritten.
A leading word is a compact concept already living in the model's pretraining that the agent thinks with while running the skill (e.g. lesson, fog of war, tracer bullets). Repeated throughout the text (though not necessarily - a strong leading word might only be needed once), it accumulates a distributed definition and anchors a whole region of behaviour in the fewest tokens, by recruiting priors the model already holds.
It serves predictability twice. In the body it anchors execution: the agent reaches for the same behaviour every time the word appears. In the description it anchors invocation: when the same word lives in your prompts, docs, and code, the agent links that shared language to the skill and fires it more reliably.
Hunt for opportunities to refactor skills to use leading words. A triad spelled out at three sites (duplication), a description spending a sentence to gesture at one idea — each is a passage begging to collapse into a single token. Examples include:
You win twice over: fewer tokens, and a sharper hook for the agent to hang its thinking on. Assume every skill is carrying restatements that leading words retire — go find them.
Use these to diagnose issues the user may be having with the skill.
npx claudepluginhub esonhugh/marketplace --plugin mattpocock-skillsUses test-driven development to create, edit, and verify skills for agent behavior. Write pressure scenarios, watch agents fail, then document rules to fix behavior.
Use when creating new skills, editing existing skills, or verifying skills work before deployment
Applies TDD RED-GREEN-REFACTOR cycle to create, edit, and verify Claude Code skills: test scenarios first, baseline failures, minimal docs, refactor loopholes.