From claude-resources
Renders UI changes and judges layout from actual pixels, not CSS tokens. Diagnoses spacing, grouping, visual hierarchy, and alignment issues. Supports reference-match mode where a user-provided screenshot is the spec.
How this skill is triggered — by the user, by Claude, or both
Slash command
/claude-resources:design-iteration-wisdomThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
You cannot reliably judge a rendered layout from class names and design tokens.
You cannot reliably judge a rendered layout from class names and design tokens.
Whether spacing reads as "tight" or "loose" is a property of the rendered
composition — the emergent result of token → utility → responsive variant → flex/grid resolution → content (image aspect, wrapped text) → browser layout.
You only ever edit the top of that chain; the look only exists at the
bottom, after rendering. Reasoning from tokens alone produces confident,
wrong spacing — it can yield a too-tight AND a too-loose version from the same
rule, because the rule was about magnitude and the problem is about
relationships.
The fix is a loop: render it, look at the pixels, diagnose in real design terms by combining the project's design-system rules with general design knowledge, change code, look again. This skill is the judgment layer; the project's design-system skill is the source of allowed values. Use both.
But render-and-look only works if you look at the right thing. There are two different targets you can be judging against:
The loop (render → look) is identical for both, but the second has a failure mode this skill exists to stop and previously did not name: you render your change, look at your own output, and confirm it against your mental model of the request instead of against the reference's actual pixels — so a confident "looks right ✓" ships a result the user can see is wrong at a glance. You had the same bitmap they did. If they can spot the error instantly, the information was in the image; you compared it to the wrong thing. Reference-match mode has its own discipline — see "Matching a reference" below — and it is not optional whenever a reference exists.
Spacing encodes grouping. Bind tight, separate loose. Things that belong together get less space between them; things that are distinct get more. The contrast (ratio) between hierarchy levels is the design — not the absolute size.
For the full vocabulary (Gestalt grouping, type, alignment, visual weight, the symptom→cause→fix cheatsheet, and how to measure) read references/design-knowledge.md during the Diagnose step.
When the user gives a concrete visual reference, the reference is the spec. Success is not "does my result look good / look like a drawer / feel right" — it is "does my rendered result match the reference, element by element." The aesthetic-judgment half of this skill barely applies; this is reproduction, and it has its own discipline because it has its own failure mode (the one that created this section): rendering your change, looking at it, and confirming it against your own restatement of the request instead of against the reference.
Run these, in order — and treat skipping any one as the bug:
Read the reference as presence AND absence. A positional annotation is a complete statement, not a hint. A green line drawn along the bottom edge says "a border goes here" and, by every edge it does not cover, "no border anywhere else." Marking where something is also marks where it isn't. Before coding, write the spec out both ways: what the reference shows present and what it shows absent. The classic, exact miss that motivated this section: reading "border here" and silently adding borders (a full perimeter) the reference never drew.
Use the user's annotation legend verbatim. They often state it ("green = must have, red = must not have"; arrows = position; a box = the element in question). Apply that meaning literally. If a mark is genuinely ambiguous, ask one question — do not paper over it with a guess.
Verify by DIFF, never by self-judgment. Render your result at the same framing as the reference (same region, same zoom — crop to it, even with a sharp/Playwright crop), set the two images side by side, and walk every element: present-in-both, absent-in-both, present-in-one-only. Every "present in only one" is a defect — including elements present in yours but not the reference (the added border, the extra rule, the shifted corner). "Does this look right?" judged on your render alone is the trap; the only question is "does this match that?"
Do not author your own pass/fail checks from your interpretation. A checklist you wrote from your reading ("tab has no top border ✓") can only confirm your reading — it passes while the result is wrong, because the misread is upstream of the check. With a reference, the sole valid check is "matches the reference." If you delegate verification to a subagent, give it the reference image and tell it to diff against it; never hand it your paraphrased checklist (that just launders your misinterpretation through a second agent).
On "still wrong," re-derive from the reference — don't patch your model. When feedback says it's still off, the reflex is the smallest edit to your existing output (you remove the line across the tab but keep the perimeter you should never have added). Stop. Re-extract the spec from the reference from scratch, as if seeing it for the first time. Repeated feedback on the same element means your underlying model is wrong, not one step short — a patch on a wrong model stays wrong.
/l-design-system, or any *design-system* / *design-tokens* skill or doc).
Invoke/read it — it defines the allowed values. Never invent raw px/rem when
a token system exists. If none exists, use the values already present in the
codebase as the de-facto scale.package.json scripts and the project's
CLAUDE.md / README for the dev or serve command and port (e.g. pnpm dev,
npm run dev). Reuse an already-running server if one is up — don't spawn a
duplicate.Start the dev/preview server (or confirm one is running) and get the URL. For gated preview deployments, note any auth cookie/header the project documents.
Use /headless-browser to capture the target at the breakpoints that matter —
at minimum mobile (~390px), mid (~760px), and desktop (~1300px).
Responsive escalation rules can only be checked across widths; a single
screenshot will mislead you.
Capture the worst case, not the page top. Layout breaks hide in the extreme content: scroll to the item with the longest title / most wrapping text, the longest list, AND the shortest / near-empty group — at the narrowest viewport. A clean top-of-page screenshot routinely hides a row whose title collapses to one character per line three screens down, or a short section drowning in separator space. Screenshot the region in question with its worst-case data, not whatever renders first.
/verify-ui (computed styles) or a Playwright bounding-box
snippet (see references/design-knowledge.md → "Measuring rhythm"). Numbers turn
"does it look grouped" into arithmetic you can verify.rgb(214,211,209)) and
pure white (rgb(255,255,255)) look identical on a dark background: the eye
normalizes "light line on dark" to "white." So a screenshot does not settle
a color question — you will look at the muted line and call it white. The same
goes for any near-threshold value: a 1px vs 2px border, exact alignment, an
opacity, a token that resolves to almost-but-not-the-target. Never report a
color or exact value from looking. Sample it — and prefer the computed style
over pixels. For anything that is a real DOM element with the color set in CSS
(a border, a background, text), getComputedStyle(el).borderColor /
.backgroundColor is authoritative: it's the actual resolved value, and it
skips the entire fragile pixel harness below. Query the element first. Fall back
to rendered pixel RGB from a screenshot only when there is no single element
to ask — the "rule" is really a pseudo-element / box-shadow, or you're checking
an image, a gradient, or an antialiased edge. Either way, compare to the target
number. "It looks white ✓" is not a verification; "the computed border-color
is rgb(255,255,255) ✓" is. If the user insists a color/value is wrong and
you "can't see it," that is the tell that you are eyeballing something only
measurement resolves — stop looking and sample.
clip is CSS pixels, but the saved image is
device pixels — don't multiply clip coords by devicePixelRatio (you'll
sample the wrong place); map back by dividing the read coords. (b) A 1px line
sits at a sub-pixel y, so a 1px-tall strip routinely misses it — sample a
band (e.g. ±5px around the border) and take, per column, the pixel
furthest from the background color (on a dark bg that's the brightest
pixel; on a light bg the darkest — never hard-code "brightest" or a dark rule
on a light bg samples the background and reads as absent). (c) Sample a
text-free region — text glyphs in the foreground color contaminate the
furthest-from-bg pixel and read as a false line. When two probes disagree, the
naive one is wrong; trust the robust band-scan — but remember (a)–(c) only
arise because you fell back to pixels; a queryable element never needs them.Load references/design-knowledge.md and name the problem in design terms — grouping/contrast, hierarchy, alignment, weight, type measure — mapped onto the project's token scale. Bad: "the rows need more spacing." Good: "within-series row gaps ≈ between-series gaps → no grouping contrast; tighten rows and raise the between-series gap a couple of steps on the scale." The diagnosis is the synthesis of general principle + project value.
Apply the change using project tokens/components. Keep changes minimal and targeted to what the diagnosis identified.
Repeat Steps 2–5. Confirm the fix did what the diagnosis predicted and didn't break grouping at another breakpoint. Usually 2–4 rounds converge. In reference-match mode there is no "diagnosis" to confirm — re-crop and diff against the reference, and if the same element is still wrong, re-derive the spec from the reference rather than patching your model (see "Matching a reference" → the repeated-feedback rule).
Design has a taste component an AI approximates but doesn't own. When the computable invariants pass and the screenshots read cleanly at every width, stop and show the human the before/after screenshots for the final call. Don't loop indefinitely chasing a verdict only a person can give.
In reference-match mode the bar is different — not taste but fidelity. Stop when your render matches the reference element-for-element (presence and absence), and show the human your result next to their reference, not just a before/after of your own work, so the final check is the same direct comparison you just made.
Even without perfect perception you can check these against measured px:
If any fails, the layout will read as tight (low contrast), loose (binder too big), or chaotic (non-monotonic) — regardless of how "generous" the absolute values are.
Reference fidelity (when a reference exists — the strongest invariant): every element the reference marks present is present; every region it leaves unmarked is unchanged/absent; and nothing exists in your render that the reference does not show. An added border, rule, box, or shifted edge that the reference never contained is a defect even when it "looks fine" on its own — extra is wrong, not just missing. This check is interpretation-free, so it catches the misreads your own paraphrased checklists never will.
npx claudepluginhub takazudo/claude-resources --plugin claude-resourcesDesigns and iterates production-grade frontend interfaces through live browser iteration, UX audit, visual refinement, and design system work.
Beautifies web UIs using visual principles like hierarchy, restraint, whitespace, color application, and typography to achieve premium aesthetics like Stripe or Linear.
Designs, redesigns, critiques, audits, polishes frontend UIs for websites, dashboards, components, forms. Covers UX review, accessibility, performance, responsive design, theming, typography, layout, color, motion, micro-interactions.