Skill

Holdfast — Review

Review holdfast evidence and detect drift. Use when the user says "review evidence", "what's drifting", "how's my classifier doing", "check holdfast", "anything drifting", or asks about patterns in their pipeline or task quality.

Invocation

Slash command: /holdfast:review
User invocable
Model invocable
Inline context
Default effort

SKILL.md

87 lines · ~609 tokens

Stats

Language: Python
Stars: 2
Maintenance: Excellent
Last Commit: Apr 17, 2026

Holdfast — Review

You read accumulated evidence, run detection rules, and summarize what's working and what's drifting.

Core rule: frozen surfaces don't change. Evolvable surfaces improve only with evidence and approval.

Find contracts

find . -name "contract.yaml" -not -path "*/.holdfast/*"

For each contract, read:

cat {contract_dir}/contract.yaml
cat {contract_dir}/invariants.yaml
cat {contract_dir}/detection.yaml
ls {contract_dir}/.holdfast/evidence/
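
The discovery step above can also be sketched in Python. This is a minimal illustration, not part of holdfast: `find_contracts` and `evidence_files` are hypothetical helper names, and the `.json` extension for evidence files is an assumption based on the later reference to "evidence JSON files".

```python
from pathlib import Path


def find_contracts(root="."):
    """Return directories that hold a contract.yaml, skipping anything under .holdfast/."""
    root = Path(root)
    return sorted(
        p.parent
        for p in root.rglob("contract.yaml")
        # Mirror `-not -path "*/.holdfast/*"`: ignore matches inside a .holdfast directory.
        if ".holdfast" not in p.relative_to(root).parts
    )


def evidence_files(contract_dir):
    """List evidence files for one contract (assumed *.json); empty if none exist yet."""
    evidence_dir = Path(contract_dir) / ".holdfast" / "evidence"
    return sorted(evidence_dir.glob("*.json")) if evidence_dir.is_dir() else []
```

A contract with no evidence directory yields an empty list, which is a natural place to report "no runs recorded yet" instead of running detection.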

Run detection

Pipeline mode — Python API

If Python and holdfast are installed:

from holdfast import Contract, check_contract
contract = Contract.load("{contract_dir}")
alerts = check_contract(contract)

Claude mode — in-context

Read detection.yaml for the rules. Read all evidence JSON files. Then compute each rule. Show your work — list the values, show the math, then state whether the threshold was exceeded.

Failure rate

Take the most recent N runs (N = window)
Count runs where passed is false
Compute the failure rate: failure_count / total_count
Alert if the failure rate exceeds max_rate
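
The failure-rate steps can be sketched as follows. This assumes the runs are already loaded as dicts ordered oldest to newest, each with a boolean `passed` field; `failure_rate_alert` is a hypothetical name, not a holdfast function.

```python
def failure_rate_alert(runs, window, max_rate):
    """Return (rate, alert) for the most recent `window` runs."""
    if window <= 0:
        raise ValueError("window must be positive")
    recent = runs[-window:]  # most recent N runs (assumes oldest-to-newest order)
    if not recent:
        return 0.0, False  # no evidence yet, so nothing to alert on
    # Only an explicit `passed: false` counts as a failure.
    failures = sum(1 for run in recent if run.get("passed") is False)
    rate = failures / len(recent)
    return rate, rate > max_rate


runs = [{"passed": p} for p in (True, False, True, False, False)]
rate, alert = failure_rate_alert(runs, window=4, max_rate=0.5)
print(f"{rate:.2f} alert={alert}")  # 3 failures in the last 4 runs: 0.75 alert=True
```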

Variance

Take the most recent N runs (N = window)
Extract the target field value from each run (e.g., output.satisfaction)
Compute mean: sum(values) / count
Compute sample stddev: sqrt(sum((v - mean)^2 for v in values) / (count - 1)); this needs at least two values
Alert if stddev exceeds max_stddev
If group_by is set, bucket runs by that field and check each group separately
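
A minimal sketch of the variance rule, including the group_by case. It assumes runs are dicts ordered oldest to newest, that the target field is a dotted path such as `output.satisfaction`, and that the window is applied before bucketing (the steps above leave that order open). The names `get_field`, `sample_stddev`, and `variance_alerts` are hypothetical.

```python
import math
from collections import defaultdict


def get_field(run, dotted):
    """Look up a dotted path such as 'output.satisfaction' in a nested dict."""
    value = run
    for key in dotted.split("."):
        value = value[key]
    return value


def sample_stddev(values):
    """Sample standard deviation (n - 1 denominator); None with fewer than two values."""
    n = len(values)
    if n < 2:
        return None
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


def variance_alerts(runs, field, window, max_stddev, group_by=None):
    """Return {group: (stddev, alert)}; a single 'all' group when group_by is unset."""
    if window <= 0:
        raise ValueError("window must be positive")
    groups = defaultdict(list)
    for run in runs[-window:]:  # assumption: window first, then bucket
        key = get_field(run, group_by) if group_by else "all"
        groups[key].append(get_field(run, field))
    results = {}
    for key, values in groups.items():
        sd = sample_stddev(values)
        results[key] = (sd, sd is not None and sd > max_stddev)
    return results
```

Groups with a single run report a stddev of `None` and never alert, since the sample formula is undefined for one value.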

Drift

Take the first N runs as baseline (N = baseline_window)
Take the last M runs as recent (M = recent_window)
Compute mean of the target field for each window
Compute the shift: abs(recent_mean - baseline_mean)
Alert if the shift exceeds max_shift
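
The drift rule can be sketched the same way. This assumes the target-field values have already been extracted in oldest-to-newest order; `drift_alert` is a hypothetical name. Returning `None` when the two windows would overlap is a choice made for this sketch, not something the rule above specifies.

```python
def drift_alert(values, baseline_window, recent_window, max_shift):
    """Return (baseline_mean, recent_mean, shift, alert), or None if there is too little data."""
    if baseline_window <= 0 or recent_window <= 0:
        raise ValueError("windows must be positive")
    # Assumption: refuse to compare when the baseline and recent windows would overlap.
    if len(values) < baseline_window + recent_window:
        return None
    baseline = values[:baseline_window]  # first N runs
    recent = values[-recent_window:]     # last M runs
    baseline_mean = sum(baseline) / len(baseline)
    recent_mean = sum(recent) / len(recent)
    shift = abs(recent_mean - baseline_mean)
    return baseline_mean, recent_mean, shift, shift > max_shift


print(drift_alert([4, 4, 5, 3, 3], baseline_window=2, recent_window=2, max_shift=0.5))
# baseline mean 4.0, recent mean 3.0, shift 1.0, alert True
```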

Summarize findings

Be specific. Cite run IDs. Examples:

"Across 20 runs, 6 were rejected — all on refactoring tasks where the diff was too large."
"Satisfaction drifted from 4.2 to 3.1 over the last 15 runs. The approach change in v2 may be too aggressive."
"All runs pass. No pattern to address."

If there are patterns worth acting on, suggest the user run /holdfast:evolve.
