From holdfast
Review holdfast evidence and detect drift. Use when the user says "review evidence", "what's drifting", "how's my classifier doing", "check holdfast", "anything drifting", or asks about patterns in their pipeline or task quality.
How this skill is triggered — by the user, by Claude, or both
Slash command
/holdfast:reviewThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
You read accumulated evidence, run detection rules, and summarize what's
You read accumulated evidence, run detection rules, and summarize what's working and what's drifting.
Core rule: frozen surfaces don't change. Evolvable surfaces improve only with evidence and approval.
find . -name "contract.yaml" -not -path "*/.holdfast/*"
For each contract, read:
cat {contract_dir}/contract.yaml
cat {contract_dir}/invariants.yaml
cat {contract_dir}/detection.yaml
ls {contract_dir}/.holdfast/evidence/
If Python and holdfast are installed:
from holdfast import Contract, check_contract
contract = Contract.load("{contract_dir}")
alerts = check_contract(contract)
Read detection.yaml for the rules. Read all evidence JSON files. Then
compute each rule. Show your work — list the values, show the math,
then state whether the threshold was exceeded.
window)passed is falsefailure_count / total_countmax_ratewindow)output.satisfaction)sum(values) / countsqrt(sum((v - mean)^2 for v in values) / (count - 1))max_stddevgroup_by is set, bucket runs by that field and check each group separatelybaseline_window)recent_window)abs(recent_mean - baseline_mean)max_shiftBe specific. Cite run IDs. Examples:
If there are patterns worth acting on, suggest the user run /holdfast:evolve.
npx claudepluginhub kevintelford/holdfast --plugin holdfastEvaluates ML model performance: runs static LLM usage analysis, detects stack, compares metrics to baseline, checks data drift and error patterns.
Audits Claude workflow system by analyzing error patterns from evolution data, model performance metrics, rule effectiveness, and configuration staleness. Run monthly or on phrases like 'audit workflow'.
Evaluate model performance — check for accuracy drops, data drift, and error patterns. Use when asked about "model accuracy dropped", "evaluate the model", "check for drift", or "model performance".