Skill

prompt-regression

Detects harness regressions by running standard prompt probes after rule/skill changes and comparing outputs against saved baselines. Triggers on "prompt regression", "did my changes break anything", "regression check", "test harness changes".

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/fh-meta:prompt-regression

User invocable

Model invocable

Inline context

Default effort

Configuration

Model: sonnet

Tool Access

This skill is limited to the following tools:

Read, Bash, Glob, Grep

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

After CLAUDE.md edits, rule changes, or new skill additions, harness behavior can silently regress. This skill runs a lightweight probe suite against the changed assets and compares outputs against saved baselines to surface regressions before they reach production.

SKILL.md

182 lines · ~1.7k tokens

Stats

Language: HTML

Parent stars: 2

Maintenance: Excellent

Last Commit: Jun 11, 2026


prompt-regression — Harness Regression Detection

Scope distinction

harness-doctor: structural completeness (files, links, drift)

prompt-regression: behavioral correctness — did the change alter expected AI response patterns?

Triggers

/prompt-regression
"prompt regression", "regression check", "regression test"
"did my rule change break anything", "test harness changes", "verify harness behavior", "make sure my edit didn't change behavior"
After significant CLAUDE.md edits or new skill commits

Execution Steps

Step 1. Identify Changed Assets

# What changed since last commit (or last N commits)
git diff HEAD~1 --name-only -- CLAUDE.md .claude/ plugins/

Classify each changed file:

CLAUDE.md → core behavior (high impact)
.claude/rules/*.md → rule layer (medium impact)
plugins/*/skills/*/SKILL.md → skill behavior (scoped impact)
plugins/*/skills/*/SKILL.md (trigger phrases changed) → trigger routing (high impact)

If no changes detected: report "No harness changes since last commit — regression check skipped."
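The classification above can be sketched as a small shell helper. This is a minimal sketch, not the skill's actual implementation: `classify_asset` and `classify_changes` are hypothetical names, and the sketch classifies by path only, so it cannot tell a trigger-phrase edit (high impact) from any other SKILL.md edit.

```shell
# Hypothetical helper: map one changed path to the impact tier listed in Step 1.
# Note: in a case pattern, * also matches "/", so nested paths match too.
classify_asset() {
  case "$1" in
    CLAUDE.md)                   echo "core behavior (high impact)" ;;
    .claude/rules/*.md)          echo "rule layer (medium impact)" ;;
    plugins/*/skills/*/SKILL.md) echo "skill behavior (scoped impact)" ;;
    *)                           echo "unclassified" ;;
  esac
}

# Classify every path read from stdin, one per line.
# Returns 1 when no paths were given, so the caller can report the skip message.
classify_changes() {
  local count=0 path
  while IFS= read -r path; do
    [ -n "$path" ] || continue
    printf '%s -> %s\n' "$path" "$(classify_asset "$path")"
    count=$((count + 1))
  done
  [ "$count" -gt 0 ]
}
```

A possible invocation pipes the Step 1 command into it: `git diff HEAD~1 --name-only -- CLAUDE.md .claude/ plugins/ | classify_changes`.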

Step 2. Load Probe Suite

Check for custom probes:

ls .claude/regression/probes.md 2>/dev/null || echo "NO_CUSTOM_PROBES"

If custom probes exist: load and use them. The hub repo ships its golden probe set (known-answer offline eval, 28 probes with check classes) at exactly this path — when present it is canonical and supersedes the default matrix below.

If no custom probes (e.g. Mode C install without the hub repo): use the default probe matrix below.
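The choice between the two probe sets could look like the following. It is a sketch under the assumption that the presence of the file alone decides the outcome; `probe_source` is a hypothetical name, and the optional root argument exists only to make the helper easy to try outside the repo.

```shell
# Hypothetical helper: report which probe set applies under a repo root.
# Prints "custom" when .claude/regression/probes.md exists, otherwise "default".
probe_source() {
  local root="${1:-.}"
  if [ -f "$root/.claude/regression/probes.md" ]; then
    echo "custom"
  else
    echo "default"
  fi
}
```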

Default Probe Matrix

| Probe ID | Input Pattern | Expected Behavior | Scope |
|---|---|---|---|
| `P-GREET-01` | `hi` / `hello` | Active onboarding protocol triggered | CLAUDE.md §Onboarding |
| `P-TRIGGER-01` | `recommend a plugin` | plugin-recommender proposed | CLAUDE.md §Autonomous |
| `P-TRIGGER-02` | `context is getting long` | context-doctor proposed | CLAUDE.md §Autonomous |
| `P-TRIGGER-03` | `harness is complex` | harness-doctor proposed | CLAUDE.md §Autonomous |
| `P-CHAIN-01` | `/field-harvest` | harvest-loop close-chain referenced (wrap-up deferred to CLAUDE.md session-close chain; field-harvest has no inline sim-conductor gate) | field-harvest SKILL.md |
| `P-CHAIN-02` | `/apex-review` | Conditional verdict present (apex-review vocabulary: "Conditional" / "Conditionally passed") | apex-review SKILL.md |
| `P-GATE-01` | new skill commit | New Skill Pre-Commit Gate (6 items, incl. check-class declared) invoked | CLAUDE.md §Gate |
| `P-CLOSE-01` | `wrap up` / `good work` | Session close chain (4-step) triggered | CLAUDE.md §Wrap-up |

Step 3. Map Changes → Affected Probes

| Changed File | Affected Probes |
|---|---|
| `CLAUDE.md` §Onboarding | P-GREET-01 |
| `CLAUDE.md` §Autonomous Initiative | P-TRIGGER-* |
| `CLAUDE.md` §Wrap-up | P-CLOSE-01 |
| `CLAUDE.md` §New Skill Gate | P-GATE-01 |
| `plugins/fh-meta/skills/*/SKILL.md` | P-CHAIN-* matching skill |

If the change touches CLAUDE.md core behavior (10 or more lines changed), run the full suite (all probes).
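The mapping and the full-suite threshold can be expressed roughly as below. All three function names are hypothetical. Counting "lines changed" as added plus deleted lines from `git diff --numstat` is an assumption, since the skill does not say how the 10-line threshold is measured.

```shell
# Hypothetical helper: probe IDs affected by a changed CLAUDE.md section,
# following the Step 3 mapping table.
probes_for_section() {
  case "$1" in
    Onboarding)              echo "P-GREET-01" ;;
    "Autonomous Initiative") echo "P-TRIGGER-01 P-TRIGGER-02 P-TRIGGER-03" ;;
    Wrap-up)                 echo "P-CLOSE-01" ;;
    "New Skill Gate")        echo "P-GATE-01" ;;
    *)                       echo "" ;;
  esac
}

# Full-suite rule: succeeds when the changed-line count is 10 or more.
needs_full_suite() {
  [ "$1" -ge 10 ]
}

# Assumption: changed lines = added + deleted lines in CLAUDE.md since HEAD~1.
claude_md_lines_changed() {
  git diff HEAD~1 --numstat -- CLAUDE.md | awk '{ n += $1 + $2 } END { print n + 0 }'
}
```

For example, `needs_full_suite "$(claude_md_lines_changed)" && echo "run all probes"` would select the whole matrix after a large CLAUDE.md edit.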

Step 4. Run Affected Probes

For each affected probe, evaluate:

4-a. Trigger routing check — Does the trigger phrase still map to the expected skill/behavior?

Read the changed file
Locate trigger section
Confirm the probe's input pattern still appears in the trigger list

4-b. Chain integrity check — Are mandatory chains still present?

For skill chains: confirm that `→ chain:` or `Mandatory:` gate lines exist
For session close chain: confirm all 4 steps in CLAUDE.md §Wrap-up

4-c. Gate presence check — Are gate conditions still enforced?

Pre-commit gate: confirm all 6 items present in CLAUDE.md (incl. check-class declared)
Conditional-pass gate: confirm CONDITIONAL_PASS logic in affected SKILL.md

Output per probe:

[PASS] P-TRIGGER-01 — plugin-recommender trigger phrase found in CLAUDE.md
[FAIL] P-CHAIN-01 — field-harvest SKILL.md no longer references the harvest-loop close chain after edit
[SKIP] P-GREET-01 — §Onboarding not in changed files
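Check 4-a lends itself to a plain text search. The sketch below assumes a probe passes when its input phrase still appears anywhere in the file, which is looser than locating the trigger section first. `check_trigger` is a hypothetical name, and the result lines reuse the `[PASS]` / `[FAIL]` tags from the examples above with a simplified message.

```shell
# Hypothetical helper for check 4-a: search a file for a probe's input phrase
# (fixed string, case-insensitive) and print one tagged result line.
check_trigger() {
  local probe_id="$1" phrase="$2" file="$3"
  if [ ! -f "$file" ]; then
    echo "[FAIL] $probe_id: $file not found"
  elif grep -qiF -- "$phrase" "$file"; then
    echo "[PASS] $probe_id: \"$phrase\" found in $file"
  else
    echo "[FAIL] $probe_id: \"$phrase\" no longer appears in $file"
  fi
}
```

A call such as `check_trigger P-TRIGGER-01 "recommend a plugin" CLAUDE.md` would then emit a single line for the report.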

Step 5. Regression Report

## Prompt-Regression Report — YYYY-MM-DD

### Changed Assets
- CLAUDE.md (§Autonomous Initiative — 3 trigger phrases modified)
- plugins/fh-meta/skills/field-harvest/SKILL.md

### Probes Run: 4  |  Pass: 3  |  Fail: 1  |  Skip: 4

### FAIL Details
| Probe | Location | Finding | Fix |
|---|---|---|---|
| P-CHAIN-01 | field-harvest SKILL.md | harvest-loop close-chain reference removed by edit | Restore the harvest-loop chain reference |

### Verdict
⚠️ REGRESSION DETECTED — 1 probe failed. Fix required before merge.

If all probes pass:

✅ NO REGRESSION — All N probes passed. Safe to merge.
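The counts line and the verdict can be derived mechanically from the per-probe output. A minimal sketch, assuming the result lines arrive on stdin with their `[PASS]` / `[FAIL]` / `[SKIP]` tags at the start of each line; `summarize_results` is a hypothetical name.

```shell
# Hypothetical helper: tally tagged result lines from stdin and print the
# counts line. "Probes Run" counts PASS and FAIL only; skipped probes are separate.
# Exit status is 0 when nothing failed and 1 otherwise.
summarize_results() {
  local pass=0 fail=0 skip=0 line
  while IFS= read -r line; do
    case "$line" in
      "[PASS]"*) pass=$((pass + 1)) ;;
      "[FAIL]"*) fail=$((fail + 1)) ;;
      "[SKIP]"*) skip=$((skip + 1)) ;;
    esac
  done
  echo "Probes Run: $((pass + fail)) | Pass: $pass | Fail: $fail | Skip: $skip"
  [ "$fail" -eq 0 ]
}
```

The exit status lets a caller choose between the two verdict lines without re-parsing the counts.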

Step 6. Baseline Update (on explicit user approval)

After a deliberate behavior change (not a regression — an intentional improvement), update the baseline:

# Baseline stored as markdown in .claude/regression/
mkdir -p .claude/regression
# Write updated probe expectations

Prompt user: "Probe P-CHAIN-01 now expects the new gate format. Update baseline? (y/n)"

Only update on explicit y — never auto-update.
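The approval gate might be sketched like this. `confirm_baseline_update` is a hypothetical name, and treating only a literal lowercase `y` as approval is an assumption based on the "explicit y" wording. Anything else, including empty input, leaves the baseline untouched.

```shell
# Hypothetical helper: ask before touching the baseline.
# Succeeds only when the answer read from stdin is exactly "y".
confirm_baseline_update() {
  local probe_id="$1" answer
  printf 'Probe %s now expects the new gate format. Update baseline? (y/n) ' "$probe_id" >&2
  IFS= read -r answer || answer=""
  [ "$answer" = "y" ]
}

# Possible use:
#   if confirm_baseline_update P-CHAIN-01; then
#     mkdir -p .claude/regression
#     # write updated probe expectations here
#   fi
```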

Done When

All affected probes are evaluated (PASS / FAIL / SKIP) — class: mandatory-pass
Regression report is output with clear PASS/FAIL verdict — class: mandatory-pass
If FAIL: specific file + line fix is recommended — class: judged, paired with verify-bidirectional (the fix recommendation is re-checked, not trusted as-is)
Baseline updated only on explicit user approval — class: mandatory-pass (HITL)

Chains

Upstream (runs before this skill):

Automatically triggered after significant harness commits (Step 1 detects via git diff)

Downstream (runs after FAIL verdict):

→ harness-doctor for structural fixes if L1/L2 issues found
→ verify-bidirectional to confirm fix resolved the regression
