From brand-extractor
Validate brand extractions by comparing replicated components against originals using a three-layer approach (pixel comparison, structural LLM analysis, token traceability). Implements Gate 5 of the validation pipeline. Use when comparing component replications to originals, performing visual regression testing on design tokens, or validating that extracted tokens accurately reproduce the source design.
Slash command
/brand-extractor:visual-validation
This skill teaches Claude how to perform Gate 5 visual replication validation — the ultimate test of extraction accuracy. If components built from extracted tokens look like the originals, the extraction is correct.
Layer 1 (pixel comparison) uses pixelmatch to compute pixel-level similarity between each original screenshot and its replica.
python scripts/pixel_compare.py --original ./components/original/ --replica ./components/replica/ --output ./comparison/
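To make the similarity score concrete, here is a minimal sketch of a pixel-level comparison. It is a simplified stand-in and not what `scripts/pixel_compare.py` does: pixelmatch uses a perceptual colour distance with anti-aliasing detection, whereas this sketch counts pixels whose RGB channels all fall within a fixed tolerance. The function name `pixel_similarity` and the `tolerance` parameter are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each 0-255
Image = List[List[Pixel]]      # rows of pixels


def pixel_similarity(original: Image, replica: Image, tolerance: int = 16) -> float:
    """Return the fraction of pixels whose channels all differ by <= tolerance.

    Simplified illustration only; pixelmatch itself uses a perceptual metric.
    """
    same_shape = len(original) == len(replica) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(original, replica)
    )
    if not same_shape:
        raise ValueError("images must have identical dimensions")

    total = sum(len(row) for row in original)
    if total == 0:
        raise ValueError("images must not be empty")

    matching = 0
    for row_a, row_b in zip(original, replica):
        for px_a, px_b in zip(row_a, row_b):
            # A pixel matches when every channel is within the tolerance.
            if all(abs(ca - cb) <= tolerance for ca, cb in zip(px_a, px_b)):
                matching += 1
    return matching / total
```

A score of 1.0 means every pixel matched within the tolerance. The thresholds in this skill are expressed on the same 0 to 1 (or 0% to 100%) scale.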
Thresholds per component:
| Component | Criterion ID | Threshold | Rationale |
|---|---|---|---|
| Navigation bar | V-PIX-01 | ≥85% | Complex multi-element, some layout variance acceptable |
| Hero section | V-PIX-02 | ≥80% | Content varies (images, animations), structure matters more |
| Button set | V-PIX-03 | ≥90% | Atomic element, should be near-perfect |
| Card component | V-PIX-04 | ≥85% | Common molecule, tests shadow + spacing + radius |
| Footer | V-PIX-05 | ≥80% | Layout-heavy, content may vary |
| Form elements | V-PIX-06 | ≥85% | Tests input styling, focus states, spacing |
Overall pass: the average similarity across all components must be ≥0.83 (83%).
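The threshold table and the overall pass rule can be sketched as a small check. This is an illustration, not the pipeline's own code: the function name `evaluate_layer_1` and the `passed` and `average_met` fields are hypothetical. It assumes per-component results are reported for remediation while the gate itself depends on the average, as the overall pass rule states.

```python
# Per-component thresholds from the table above, as fractions of 1.
THRESHOLDS = {
    "V-PIX-01": 0.85,  # Navigation bar
    "V-PIX-02": 0.80,  # Hero section
    "V-PIX-03": 0.90,  # Button set
    "V-PIX-04": 0.85,  # Card component
    "V-PIX-05": 0.80,  # Footer
    "V-PIX-06": 0.85,  # Form elements
}
OVERALL_THRESHOLD = 0.83


def evaluate_layer_1(scores: dict) -> dict:
    """Compare each similarity score to its threshold and average them."""
    if not scores:
        raise ValueError("at least one component score is required")
    components = {
        criterion_id: {
            "similarity": similarity,
            "threshold": THRESHOLDS[criterion_id],
            "passed": similarity >= THRESHOLDS[criterion_id],
        }
        for criterion_id, similarity in scores.items()
    }
    average = sum(scores.values()) / len(scores)
    return {
        "average_similarity": average,
        "threshold": OVERALL_THRESHOLD,
        "average_met": average >= OVERALL_THRESHOLD,
        "components": components,
    }
```

For example, a button set scoring 0.88 would be flagged as below its own 0.90 threshold even when the overall average still clears 0.83, which points remediation at that component.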
In Layer 2 (structural LLM analysis), you (Claude) visually inspect the original and replica screenshot pairs side by side and evaluate six structural criteria.
How to evaluate:
Criteria and scoring:
| ID | Criterion | What to Check | Score Values |
|---|---|---|---|
| V-STR-01 | Layout fidelity | Column count, alignment, stacking order, spatial arrangement | MATCH (1.0) / CLOSE (0.7) / DIVERGENT (0.3) / MISSING (0.0) |
| V-STR-02 | Colour accuracy | Background, text, accent colours visually match | Same scale |
| V-STR-03 | Typography match | Font, weight, size appear the same | Same scale |
| V-STR-04 | Spacing rhythm | Padding/margin feels consistent | Same scale |
| V-STR-05 | Component completeness | All sub-elements present in replica | Same scale |
| V-STR-06 | Brand impression | Does the replica "feel" like the same brand? | Same scale |
Evaluation guidelines:
Critical rule: V-STR-06 (Brand impression) must NOT be MISSING for any component. A MISSING here means the extraction fundamentally failed for that component.
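The scoring scale and the critical rule can be expressed as a short check. This is a sketch under assumptions: the helper names are hypothetical, and treating an absent V-STR-06 rating as MISSING is a choice made here, not something the skill states.

```python
# Score values from the criteria table above.
SCORE_VALUES = {"MATCH": 1.0, "CLOSE": 0.7, "DIVERGENT": 0.3, "MISSING": 0.0}
BRAND_IMPRESSION = "V-STR-06"


def score_component(ratings: dict) -> dict:
    """Map each criterion's label (e.g. "CLOSE") to its numeric score."""
    return {criterion: SCORE_VALUES[label] for criterion, label in ratings.items()}


def no_missing_brand(components: dict) -> bool:
    """True when no component is rated MISSING on V-STR-06.

    Assumption: a component with no V-STR-06 rating counts as MISSING.
    """
    return all(
        ratings.get(BRAND_IMPRESSION, "MISSING") != "MISSING"
        for ratings in components.values()
    )
```

Usage: `no_missing_brand({"nav": {"V-STR-06": "CLOSE"}})` returns `True`, while any component rated `"MISSING"` on V-STR-06 makes it return `False`.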
In Layer 3 (token traceability), trace every discrepancy found in Layers 1 and 2 back to a specific design token. This is what makes remediation targeted rather than "try again".
Traceability record format:
{
  "discrepancy": "Button border-radius in replica (4px) does not match original (8px)",
  "affected_component": "button_primary",
  "affected_token": "borderRadius.md",
  "current_value": "4px",
  "expected_value": "8px",
  "confidence": 0.9,
  "remediation": {
    "action": "UPDATE_TOKEN",
    "token_path": "borderRadius.md.$value",
    "new_value": "8px",
    "requires_re_replication": true
  }
}
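As an illustration of how such a record could drive a targeted fix, the sketch below applies an UPDATE_TOKEN remediation to a token tree. It assumes tokens are stored as nested dictionaries, that `token_path` is a dot-separated path, and that no key contains a dot. The function name `apply_update_token` is hypothetical.

```python
import copy


def apply_update_token(tokens: dict, record: dict) -> dict:
    """Return a copy of `tokens` with the record's UPDATE_TOKEN applied.

    Assumes nested-dict tokens and a dot-separated token_path.
    """
    remediation = record["remediation"]
    if remediation["action"] != "UPDATE_TOKEN":
        raise ValueError("this sketch only handles UPDATE_TOKEN")

    updated = copy.deepcopy(tokens)  # leave the caller's tokens untouched
    *parents, leaf = remediation["token_path"].split(".")
    node = updated
    for key in parents:
        node = node[key]  # raises KeyError if the path does not exist
    node[leaf] = remediation["new_value"]
    return updated
```

With the record above and a token tree of `{"borderRadius": {"md": {"$value": "4px"}}}`, the returned copy holds `"8px"` at `borderRadius.md.$value`. Because `requires_re_replication` is true, the component would then need to be rebuilt and compared again.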
Remediation actions:
- UPDATE_TOKEN: Change a token value
- ADD_TOKEN: Add a missing token
- RE_EXTRACT: Re-run extraction for a specific property
- RE_REPLICATE: Rebuild the component with current tokens (code fix, not token fix)

Priority assignment:
When Gate 5 fails, remediate and revalidate, repeating up to max_iterations times (default 3).
Circuit breaker: applies if max_iterations is reached and the gate is still failing.
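The retry behaviour described above (re-run up to max_iterations times, default 3, then trip a circuit breaker) can be sketched as a bounded loop. This is only an outline: `validate` and `remediate` are hypothetical callbacks, and the `circuit_breaker_tripped` field is an assumption, since this page does not say what happens once the breaker trips.

```python
from typing import Callable


def run_gate_5(
    validate: Callable[[int], dict],
    remediate: Callable[[dict], None],
    max_iterations: int = 3,
) -> dict:
    """Validate, remediating between attempts, for at most max_iterations runs."""
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")

    report: dict = {}
    for iteration in range(1, max_iterations + 1):
        report = validate(iteration)
        if report["verdict"] == "PASS":
            return report
        if iteration < max_iterations:
            remediate(report)  # apply fixes before the next attempt

    # All attempts failed: flag the report instead of retrying forever.
    report["circuit_breaker_tripped"] = True  # hypothetical field
    return report
```

Bounding the loop keeps a persistently failing extraction from cycling indefinitely. The final failing report is returned so a caller can decide how to escalate.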
Gate 5 report format:
{
  "gate": "GATE_5_VISUAL_REPLICATION",
  "iteration": 1,
  "verdict": "PASS" | "FAIL",
  "layer_1_pixel": {
    "average_similarity": 0.87,
    "threshold": 0.83,
    "components": { ... }
  },
  "layer_2_structural": {
    "components": {
      "nav": {
        "V-STR-01": "MATCH", "V-STR-02": "CLOSE",
        "V-STR-03": "MATCH", "V-STR-04": "CLOSE",
        "V-STR-05": "MATCH", "V-STR-06": "MATCH"
      }
    }
  },
  "layer_3_traceability": [ ... ],
  "pass_conditions": {
    "layer_1_avg_met": true,
    "layer_2_no_missing_brand": true,
    "layer_3_all_high_remediated": true
  },
  "next_action": "PROCEED" | "REMEDIATE_AND_REVALIDATE"
}
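One plausible way to derive `verdict` and `next_action` from `pass_conditions` is shown below. It assumes the gate passes only when all three conditions are true. The report lists the conditions but this page does not spell out how they combine, so treat that rule and the function name `decide` as assumptions.

```python
REQUIRED_CONDITIONS = (
    "layer_1_avg_met",
    "layer_2_no_missing_brand",
    "layer_3_all_high_remediated",
)


def decide(pass_conditions: dict) -> dict:
    """Derive verdict and next_action, assuming all conditions must hold."""
    # A condition that is absent from the report is treated as not met.
    passed = all(pass_conditions.get(name) is True for name in REQUIRED_CONDITIONS)
    return {
        "verdict": "PASS" if passed else "FAIL",
        "next_action": "PROCEED" if passed else "REMEDIATE_AND_REVALIDATE",
    }
```

Treating absent conditions as failures means an incomplete report cannot produce a PASS by accident.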
scripts/pixel_compare.py: runs pixelmatch on component pairs and outputs Layer 1 scores.
npx claudepluginhub imehr/imehr-marketplace --plugin brand-extractor