From vp-git
Validates git rebase correctness using a five-layer pipeline (range-diff, sem, weave, jscpd, tests). Use when verifying a git rebase (especially stacked PRs with --update-refs), comparing branches post-rebase, or auditing conflict resolution correctness. Covers scenarios like: 'validate rebase', 'check rebase', 'did we lose anything', 'compare before and after rebase', 'duplicate test blocks', 'rebase validation', 'run range-diff', 'lost code during rebase', 'rebase artifacts', 'conflict resolution verification'.
Slash command: /vp-git:rebase-validate
Validate a git rebase for lost code, duplicate blocks, broken refs, and stale references using five complementary layers. No single tool catches all rebase regressions — the value is in layered defense.
git rebase --update-refs on a stacked branch set
If the user just finished a rebase and wants validation, run this sequence:
Step 0: Verify refs (are we comparing the right things?)
Step 1: git range-diff -s (commit overview — seconds)
Step 2: sem entity-level diff — MCP (mcp__sem__sem_diff) or CLI, if available
Step 3: weave preview target (merge readiness — if weave is available)
Step 4: jscpd + duplicate grep (token-level — catches what sem misses)
Step 5: project validation suite (functional oracle)
Always run Step 0 first. After Step 0 completes, Steps 1-4 can run as parallel agents. Step 5 is the final oracle. On agents without a parallel-Agent tool, run Steps 1-4 sequentially — they're independent.
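The ordering rule above can be sketched as a small Python driver. This is a minimal illustration, not part of the skill. `run_pipeline` and the step callables are hypothetical names, and each callable is assumed to wrap one of the commands described in this document.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

Step = Callable[[], str]


def run_pipeline(step0: Step, middle: Dict[str, Step], step5: Step) -> Dict[str, str]:
    """Run Step 0 alone, the independent middle steps concurrently, then Step 5 last."""
    # Step 0 gates everything: if it raises, no later step runs.
    results = {"step0": step0()}

    # Steps 1-4 do not depend on each other, so they can share a thread pool.
    with ThreadPoolExecutor(max_workers=max(1, len(middle))) as pool:
        futures = {name: pool.submit(fn) for name, fn in middle.items()}
        for name, future in futures.items():
            results[name] = future.result()

    # Step 5 is the final oracle and only starts once the others have finished.
    results["step5"] = step5()
    return results
```

On an agent without parallelism, calling the middle steps one after another gives the same result, since nothing in Steps 1-4 reads another step's output.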
Before ANY comparison, confirm refs point where expected. A broken ref makes every subsequent comparison meaningless.
# Confirm branch positions
git rev-parse --short <rebased-branch>
git rev-parse --short <original-branch>
# Confirm ancestry
git merge-base --is-ancestor <expected-base> <rebased-branch> && echo "OK" || echo "WRONG BASE"
Why this matters: git rebase --update-refs silently moves ALL branch refs
pointing to commits in the replayed range — including backup branches. If you
created a backup before rebase and it was in the replayed range, it got moved.
Safe backup strategy: use tags (not branches) for pre-rebase snapshots, or create backup branches pointing to commits OUTSIDE the range being replayed.
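One way to detect a silently moved backup is to snapshot all refs before the rebase and compare afterwards. The sketch below assumes the snapshots were captured with `git for-each-ref --format='%(refname) %(objectname)'`. The helper names `parse_refs` and `moved_refs` are hypothetical.

```python
from typing import Dict, List


def parse_refs(for_each_ref_output: str) -> Dict[str, str]:
    """Parse lines of '<refname> <sha>' into a {refname: sha} dict."""
    refs = {}
    for line in for_each_ref_output.splitlines():
        if line.strip():
            name, sha = line.split()
            refs[name] = sha
    return refs


def moved_refs(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    """Return refs present in both snapshots that now point to a different commit."""
    return sorted(
        name for name, sha in before.items()
        if name in after and after[name] != sha
    )
```

A backup branch appearing in the result means it no longer marks the pre-rebase state. A tag taken before the rebase should never appear there.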
Optional safety check — if no pre-rebase snapshot exists, git reflog can
show where the branch pointed before rebase:
git reflog <rebased-branch> | head -5
git range-diff -s (Commit-Level)
The single most important first check. Gives a complete overview in seconds.
# Using remote as "before" snapshot (most practical if pushed before rebase)
git range-diff -s origin/old-branch..origin/old-tip new-base..new-tip
# Using pre-rebase tags
git range-diff -s pre-rebase/branch..old-tip new-base..new-tip
# Immediately after rebase (reflog — only works for the tip branch)
# Assumes @{u} has not moved since the rebase (e.g. no force-push to upstream)
git range-diff @{u} @{1} @
| Symbol | Meaning | Action |
|---|---|---|
| = | Identical patch | No review needed |
| ! | Altered (conflict resolution changed content) | Review these |
| < | Dropped (only in old range) | Verify intentional |
| > | Added (only in new range) | Verify expected |
Use --creation-factor=90 for rebases with heavy conflict resolution — the
default (60) can misclassify heavily-resolved commits as drop+add instead of
altered. The algorithm builds a cost matrix via "diff of diffs" and solves
least-cost assignment — higher creation-factor values make it more forgiving
of large changes when pairing commits.
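The effect of the creation factor can be illustrated with a toy model. This is a sketch of the idea only, not git's implementation. It assumes each unpaired commit costs its patch size times creation_factor / 100, and it brute-forces the assignment, so it only works for a handful of commits. `pair_commits` and all sizes are invented for illustration.

```python
from itertools import permutations
from typing import List, Tuple


def pair_commits(old_sizes: List[int], new_sizes: List[int],
                 diff_of_diffs: List[List[int]],
                 creation_factor: int) -> List[Tuple[int, int]]:
    """Return (old_index, new_index) pairs chosen by least-cost assignment."""
    n, m = len(old_sizes), len(new_sizes)
    size = n + m  # pad with dummy rows/columns so any commit may stay unpaired

    def cost(i: int, j: int) -> float:
        if i < n and j < m:
            return diff_of_diffs[i][j]                   # pair old i with new j
        if i < n:
            return old_sizes[i] * creation_factor / 100  # old i counted as dropped
        if j < m:
            return new_sizes[j] * creation_factor / 100  # new j counted as added
        return 0                                         # dummy paired with dummy

    best = min(permutations(range(size)),
               key=lambda p: sum(cost(i, j) for i, j in enumerate(p)))
    return [(i, j) for i, j in enumerate(best) if i < n and j < m]
```

With one old and one new commit of size 100 and a diff-of-diffs of 150, a factor of 60 makes "drop plus add" cheaper (120 against 150), so the commits are not paired. A factor of 90 raises that cost to 180, so the pair is kept and reported as altered.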
Additional useful flags:
--left-only: suppress commits missing from the old range (shows only what was dropped or changed)
--right-only: suppress commits missing from the new range (shows only what was added or changed)
Note: range-diff ignores merge commits by default. This is correct for rebases (which linearize history) but matters if comparing branches that contain merges.
For a quick count:
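# Match only the status column, so symbols inside commit subjects are not counted
git range-diff -s ... | grep -cE '^ *[0-9-]+: +[^ ]+ = ' # identical
git range-diff -s ... | grep -cE '^ *[0-9-]+: +[^ ]+ ! ' # altered (review these)
git range-diff -s ... | grep -cE '^ *[0-9-]+: +[^ ]+ < ' # dropped
git range-diff -s ... | grep -cE '^ *[0-9-]+: +[^ ]+ > ' # added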
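The same tally can be done in one pass by parsing the status column. The sketch below assumes each line of `git range-diff -s` output has the shape shown in the git documentation examples (old index and hash, status symbol, new index and hash, subject). `count_range_diff` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed line shape, e.g. " 1:  c0debee = 2:  cab005e Add a helpful message".
# A missing side is written as "-:  -------".
LINE = re.compile(r"^\s*(?:-|\d+):\s+\S+\s+([=!<>])\s+(?:-|\d+):\s+\S+")

LABELS = {"=": "identical", "!": "altered", "<": "dropped", ">": "added"}


def count_range_diff(output: str) -> Counter:
    """Count commits per status, reading only the status column of each line."""
    counts = Counter()
    for line in output.splitlines():
        match = LINE.match(line)
        if match:
            counts[LABELS[match.group(1)]] += 1
    return counts
```

Because the pattern is anchored at the start of the line, a subject such as "map a => b" does not inflate the identical or added counts.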
Compares named code entities between branches. See
references/silent-behaviors.md for limitations (incl. the MCP/CLI field-name
difference). Prefer the most capable backend available, in the order below. On a
large diverged branch, scope to the suspect files (e.g. file_path on the MCP
tool, or a -- <path> pathspec on the CLI) — a whole-branch diff is the
900-entity noise trap.
Tier 1 — sem MCP (preferred when available). Call mcp__sem__sem_diff with
base_ref = old-branch and target_ref = new-branch (optional file_path to
scope). If the tool is not present in this session or the call errors, do not
retry — fall through to Tier 2. It returns JSON shaped:
{ "changes": [ { "change_type": "added|modified|deleted|renamed",
"entity_name": "...", "entity_type": "...", "file": "..." } ],
"total_changes": N }
For risk-ranking a large change set, mcp__sem__sem_impact (file_path +
entity_name, mode=all) surfaces the highest-fan-in changed entity to direct
attention first (see the Agent Delegation Pattern below).
Tier 2 — sem CLI (fallback). Emit an explicit signal when sem is absent so a silent no-op is never mistaken for "ran clean":
if command -v sem >/dev/null 2>&1; then
  sem diff --format json old-branch..new-branch
  # Alternative syntax if dotdot doesn't work:
  # sem diff --from old-branch --to new-branch --format json
else
  echo "sem CLI not installed: Tier 2 did NOT run; fall through to Tier 3" >&2
fi
As of sem 0.6.0: the CLI emits camelCase (changeType, filePath); the MCP
emits snake_case (change_type, file). The summarizer below handles both.
Tier 3 — git (NOT an entity-level check). If neither MCP nor CLI is
available, git diff --stat gives a file/line summary — but it cannot see a
function dropped from an otherwise-present file. Record Layer 2 as unavailable
(not passed) in your final summary and lean harder on Steps 4–5:
git diff --stat old-branch new-branch
Summarize by change type and file. MCP path: you already hold the structured
JSON — read the change_type/file counts directly; no shell needed. Apply the
same guards the script enforces below: if the result has no changes array, or a
record lacks both change_type and file, treat it as schema drift and do NOT
report zero. Then apply the zero-result check below. CLI path: pipe through
the script, which normalizes both field shapes and fails loud rather than
reporting a false zero:
sem diff --format json old-branch..new-branch | python3 -c "
import json, sys, collections

data = json.load(sys.stdin)
changes = data.get('changes')
if not isinstance(changes, list):
    sys.exit('Step 2: no usable changes[]; sem produced no entity diff. Do NOT read as zero changes')

# Accept both the MCP (snake_case) and CLI (camelCase) field names
ct = lambda c: c.get('change_type') or c.get('changeType')
fp = lambda c: c.get('file') or c.get('filePath')
if any(ct(c) is None or fp(c) is None for c in changes):
    print('  WARNING: records with unrecognized fields; possible sem schema drift, counts may be wrong', file=sys.stderr)

by_type = collections.Counter(ct(c) for c in changes)
by_file = collections.Counter(fp(c) for c in changes)
for t, n in by_type.most_common():
    print(f'  {t}: {n}')
print()
for f, n in by_file.most_common(15):
    print(f'  {n:3d}  {f}')
"
Focus on deleted entities (backup has, rebased doesn't) — with base_ref =
old-branch and target_ref = new-branch, a backup-only entity surfaces as
deleted. These are potential losses; filter out intentionally deleted files
before investigating.
A zero/empty result does not clear test files or large files. sem silently
yields no entities for describe()/it()/test() blocks (call expressions) and for
files over the 32KB tree-sitter cap (see references/silent-behaviors.md). When
the change touches test files or files >30KB, treat a sem zero as inconclusive
and defer to Step 4 (jscpd + grep) before declaring no losses.
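The zero-result rule can be made explicit in code. The following is a minimal sketch under stated assumptions: `sem_verdict` is a hypothetical helper, the test-file pattern is one possible naming convention, and the caller supplies the sizes of the files the rebase touched.

```python
import re
from typing import Dict, List

# Hypothetical naming convention for test files; adapt to the project.
TEST_FILE = re.compile(r"\.(spec|test)\.[cm]?[jt]sx?$")
SIZE_LIMIT = 30 * 1024  # the conservative 30KB threshold from the text


def sem_verdict(changes: List[dict], touched: Dict[str, int]) -> str:
    """Classify a sem result given {path: size_in_bytes} for the touched files."""
    deleted = [c for c in changes
               if (c.get("change_type") or c.get("changeType")) == "deleted"]
    if deleted:
        return "review-deleted"

    # Files where sem may silently yield no entities.
    blind = [path for path, size in touched.items()
             if TEST_FILE.search(path) or size > SIZE_LIMIT]
    if not changes and blind:
        return "inconclusive"
    return "no-losses-reported"
```

An "inconclusive" verdict is a prompt to run Step 4 before reporting anything. It is not a pass.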
weave preview (Merge Readiness)
After a successful rebase onto target, this should show 0 conflicts.
weave preview target-branch
If conflicts remain, the rebase didn't fully resolve all divergences. This is normal if target has commits not yet in the rebased branch.
Check availability first:
command -v weave >/dev/null 2>&1 || echo "weave not installed — skipping to Step 4"
This layer catches what sem/weave miss — they cannot parse ~40% of test files.
npx --yes jscpd . --ignore "node_modules/**,dist/**,.git/**" --min-lines 5 --min-tokens 50 --max-lines 5000 --reporters console
CRITICAL: --max-lines defaults to 1000 — files over 1000 lines are
silently skipped. Always pass --max-lines 5000.
Compare clone count against baseline (if available) to distinguish pre-existing from rebase-introduced duplicates.
# Adapt the find path to match the project's test directory layout
while IFS= read -r -d '' f; do
  dupes=$(grep -oE "(it|test)\(['\"\`][^'\"\`]*['\"\`]" "$f" | sort | uniq -d)
  [ -n "$dupes" ] && echo "DUPLICATE in $f:" && echo "$dupes"
done < <(find . -path ./node_modules -prune -o \( -name '*.spec.js' -o -name '*.test.js' \) -print0 2>/dev/null)
node:test silently runs both copies of duplicate it() blocks — no warning.
Cross-scope duplicates (same name in different describe() blocks) are usually
intentional. Same-scope duplicates are bugs.
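To help tell same-scope from cross-scope duplicates, a scan can report the line numbers of each repeated title. This is a rough sketch: `duplicate_test_names` is a hypothetical helper, it works line by line, and it does not handle escaped quotes inside titles.

```python
import re
from collections import defaultdict
from typing import Dict, List

# \b keeps calls such as submit( or commit( from matching as it(.
TEST_CALL = re.compile(r"""\b(?:it|test)\(\s*(['"`])(.*?)\1""")


def duplicate_test_names(source: str) -> Dict[str, List[int]]:
    """Map each repeated test title to the 1-based line numbers where it appears."""
    seen = defaultdict(list)
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in TEST_CALL.finditer(line):
            seen[match.group(2)].append(lineno)
    return {name: lines for name, lines in seen.items() if len(lines) > 1}
```

Line numbers that sit close together usually mean the copies share a describe() block, which is the case worth fixing. Widely separated ones are more likely intentional.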
If ast-grep is installed (brew install ast-grep), it can complement jscpd
with structural pattern matching — finding code that matches a known shape
rather than detecting unknown duplicates:
command -v sg >/dev/null 2>&1 || echo "ast-grep not installed — skipping"
# Find all test blocks (useful for manual review of test structure)
sg -p 'it($NAME, $$$BODY)' -l js .
# Find all describe blocks wrapping test blocks
sg -p 'describe($NAME, $$$BODY)' -l js .
ast-grep is a search tool, not a clone detector — it finds known patterns, while jscpd finds unknown duplicates. Use ast-grep when you know what rebase artifact to look for. No silent file-size thresholds (unlike jscpd).
The definitive oracle. Every other layer is advisory.
Run the project's full validation suite — lint, type checking, and tests. Adapt these commands to the project's setup:
# Example (adapt to your project)
npm run check # or: cargo check, go vet, etc.
npm test # or: cargo test, go test ./..., etc.
Duplicate test blocks pass both lint and tests — they are invisible to this layer. That's why Layer 4 exists.
If you haven't rebased yet, run these first to establish a baseline:
# Save baseline duplicate counts (adapt paths to project layout)
npx --yes jscpd . --ignore "node_modules/**,dist/**,.git/**" --min-lines 5 --min-tokens 50 --max-lines 5000 --reporters console > /tmp/jscpd-baseline.txt
# Save baseline duplicate test names (per-file, same approach as main check)
while IFS= read -r -d '' f; do
  dupes=$(grep -oE "(it|test)\(['\"\`][^'\"\`]*['\"\`]" "$f" | sort | uniq -d)
  [ -n "$dupes" ] && echo "$f: $dupes"
done < <(find . -path ./node_modules -prune -o \( -name '*.spec.js' -o -name '*.test.js' \) -print0 2>/dev/null) > /tmp/test-names-baseline.txt
# Tag all branch tips (for range-diff after rebase)
for b in branch1 branch2 branch3; do
  git tag "pre-rebase/$b" "$b"
done
Without baselines, every finding during verification requires manual investigation to determine if it's pre-existing or rebase-introduced.
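With a saved baseline, separating pre-existing findings from rebase-introduced ones reduces to a set difference. The sketch below assumes one finding per line in both the baseline and the post-rebase output. `new_findings` is a hypothetical helper name.

```python
from typing import List


def new_findings(baseline: str, current: str) -> List[str]:
    """Return findings present after the rebase but absent from the baseline."""
    known = {line.strip() for line in baseline.splitlines() if line.strip()}
    return [
        line.strip()
        for line in current.splitlines()
        if line.strip() and line.strip() not in known
    ]
```

An empty result means every duplicate seen after the rebase already existed before it. Anything returned is a candidate rebase artifact.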
This pattern is optimized for Claude Code's Agent tool. On other agents, treat the agent list as a sequential checklist — order doesn't matter beyond Step 0 first, Step 5 last.
For large rebases, delegate verification to parallel agents:
Agent 1: git range-diff -s (commit correspondence)
Agent 2: sem entity-level diff — mcp__sem__sem_diff if available, else sem CLI
Agent 3: weave preview target (merge readiness, if available)
Agent 4: jscpd + grep (duplicate detection)
Agent 5: project validation suite (lint + types + tests)
Launch all 5 immediately after rebase (in parallel where supported,
sequentially otherwise). Check range-diff and jscpd results first
(arrive in seconds, highest signal). The gap between "rebase complete"
and "verification results" is dangerous — kick off verification immediately,
decide nothing until results arrive.
For a large change set, have Agent 2 follow its diff with mcp__sem__sem_impact
on the modified entities and review the highest-fan-in one first — the entity
with the most transitive dependents is where a botched conflict resolution does
the most damage.
| Finding | Classification | Action |
|---|---|---|
| ! commit in range-diff | Conflict resolution | Review the diff-of-diff |
| deleted entity in sem (backup-only, base=old/target=new) | Potential loss | Check if intentionally deleted |
| Duplicate test block (same describe scope) | Rebase artifact | Delete the stale copy |
| Duplicate test name (cross describe scope) | Pre-existing pattern | Note for future cleanup |
| weave conflict | Unresolved divergence | Expected if target moved forward |
| lint/type error | Rebase artifact | Fix immediately |
See references/silent-behaviors.md for the silent exclusion thresholds that
make test files invisible to sem/weave/jscpd.
npx claudepluginhub voxpelli/vp-claude --plugin vp-git