From fh-meta
Detects harness regressions by running standard prompt probes after rule/skill changes and comparing outputs against saved baselines. Triggers on "prompt regression", "did my changes break anything", "regression check", "test harness changes".
How this skill is triggered — by the user, by Claude, or both
Slash command
/fh-meta:prompt-regressionsonnetThis skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
After CLAUDE.md edits, rule changes, or new skill additions, harness behavior can silently regress. This skill runs a lightweight probe suite against the changed assets and compares outputs against saved baselines to surface regressions before they reach production.
After CLAUDE.md edits, rule changes, or new skill additions, harness behavior can silently regress. This skill runs a lightweight probe suite against the changed assets and compares outputs against saved baselines to surface regressions before they reach production.
Scope distinction
- harness-doctor: structural completeness (files, links, drift)
- prompt-regression: behavioral correctness — did the change alter expected AI response patterns?
/prompt-regression# What changed since last commit (or last N commits)
git diff HEAD~1 --name-only -- CLAUDE.md .claude/ plugins/
Classify each changed file:
CLAUDE.md → core behavior (high impact).claude/rules/*.md → rule layer (medium impact)plugins/*/skills/*/SKILL.md → skill behavior (scoped impact)plugins/*/skills/*/SKILL.md (trigger phrases changed) → trigger routing (high impact)If no changes detected: report "No harness changes since last commit — regression check skipped."
Check for custom probes:
ls .claude/regression/probes.md 2>/dev/null || echo "NO_CUSTOM_PROBES"
If custom probes exist: load and use them. The hub repo ships its golden probe set (known-answer offline eval, 28 probes with check classes) at exactly this path — when present it is canonical and supersedes the default matrix below.
If no custom probes (e.g. Mode C install without the hub repo): use the default probe matrix below.
| Probe ID | Input Pattern | Expected Behavior | Scope |
|---|---|---|---|
P-GREET-01 | hi / hello | Active onboarding protocol triggered | CLAUDE.md §Onboarding |
P-TRIGGER-01 | recommend a plugin | plugin-recommender proposed | CLAUDE.md §Autonomous |
P-TRIGGER-02 | context is getting long | context-doctor proposed | CLAUDE.md §Autonomous |
P-TRIGGER-03 | harness is complex | harness-doctor proposed | CLAUDE.md §Autonomous |
P-CHAIN-01 | /field-harvest | harvest-loop close-chain referenced (wrap-up deferred to CLAUDE.md session-close chain — field-harvest has no inline sim-conductor gate) | field-harvest SKILL.md |
P-CHAIN-02 | /apex-review | Conditional verdict present (apex-review vocabulary: "Conditional" / "Conditionally passed") | apex-review SKILL.md |
P-GATE-01 | new skill commit | New Skill Pre-Commit Gate (6 items, incl. check-class declared) invoked | CLAUDE.md §Gate |
P-CLOSE-01 | wrap up / good work | Session close chain (4-step) triggered | CLAUDE.md §Wrap-up |
| Changed File | Affected Probes |
|---|---|
CLAUDE.md §Onboarding | P-GREET-01 |
CLAUDE.md §Autonomous Initiative | P-TRIGGER-* |
CLAUDE.md §Wrap-up | P-CLOSE-01 |
CLAUDE.md §New Skill Gate | P-GATE-01 |
plugins/fh-meta/skills/*/SKILL.md | P-CHAIN-* matching skill |
If change scope is CLAUDE.md core (10+ lines changed): run full suite (all probes).
For each affected probe, evaluate:
4-a. Trigger routing check — Does the trigger phrase still map to the expected skill/behavior?
4-b. Chain integrity check — Are mandatory chains still present?
→ chain: or Mandatory: gate lines exist4-c. Gate presence check — Are gate conditions still enforced?
CONDITIONAL_PASS logic in affected SKILL.mdOutput per probe:
[PASS] P-TRIGGER-01 — plugin-recommender trigger phrase found in CLAUDE.md
[FAIL] P-CHAIN-01 — field-harvest SKILL.md no longer references the harvest-loop close chain after edit
[SKIP] P-GREET-01 — §Onboarding not in changed files
## Prompt-Regression Report — YYYY-MM-DD
### Changed Assets
- CLAUDE.md (§Autonomous Initiative — 3 trigger phrases modified)
- plugins/fh-meta/skills/field-harvest/SKILL.md
### Probes Run: 4 | Pass: 3 | Fail: 1 | Skip: 4
### FAIL Details
| Probe | Location | Finding | Fix |
|---|---|---|---|
| P-CHAIN-01 | field-harvest SKILL.md | harvest-loop close-chain reference removed by edit | Restore the harvest-loop chain reference |
### Verdict
⚠️ REGRESSION DETECTED — 1 probe failed. Fix required before merge.
If all probes pass:
✅ NO REGRESSION — All N probes passed. Safe to merge.
After a deliberate behavior change (not a regression — an intentional improvement), update the baseline:
# Baseline stored as markdown in .claude/regression/
mkdir -p .claude/regression
# Write updated probe expectations
Prompt user: "Probe P-CHAIN-01 now expects the new gate format. Update baseline? (y/n)"
Only update on explicit y — never auto-update.
Upstream (runs before this skill):
Downstream (runs after FAIL verdict):
harness-doctor for structural fixes if L1/L2 issues foundverify-bidirectional to confirm fix resolved the regressionGuides creation, editing, and verification of skills for AI coding agents using test-driven development with subagent scenarios. Use when authoring or debugging skills.
npx claudepluginhub chrono-meta/forge-harness --plugin fh-meta