From repo-visuals
Create hero visuals — animated GIF or static PNG — for GitHub repositories. Runs a structured discovery conversation (scan repo → recommend format → propose creative scenarios → agree on a brief), then designs bespoke HTML, previews it in the browser, and exports.
Slash command: /repo-visuals:repo-visuals
Turn a repo (GitHub URL, local folder, or free-text brief) into a hero visual that a viewer sees at the top of the README and instantly understands *what this project does and why it's interesting*. The hero may be an **animated GIF** or a **static PNG** — the skill recommends one based on the repo's identity, the user picks.
The skill's quality comes from the discovery dialog, not from templates. Every hero is bespoke.
- page.screenshot to PNG (with deviceScaleFactor: 2 for retina crispness).

Dev mode (author-only): Phase 6 — Evaluate exists but only runs in dev mode. Dev mode is for the skill's author iterating on the skill itself; it collects scorecard data and writes run logs under ./evaluations/ in the user's current working directory. Enable it only when the user explicitly says "dev mode" (or sets REPO_VISUALS_DEV=1). In every normal run — including Manual, Semi-auto, and Auto — the skill ends after Phase 5. Do not mention Phase 6 to end users.
Discovery runs before any HTML is written. Its job: go from a vague ask ("make a hero GIF for my repo") to a specific, committed creative brief.
Enter discovery mode whenever the user invokes the skill, unless they explicitly say "skip discovery, I have the brief ready."
Before anything else — before the scan summary, before any direction questions — ask the user which operating mode they want for this run. Use the AskUserQuestion tool so the choice is structured and visible, not buried in free text. Offer three modes; describe each in concrete terms (what gets asked, what gets decided silently) and include pros/cons so the user can pick with open eyes. Default recommendation: Semi-auto.
Note: this "operating mode" is a skill-level concept about how many questions the skill asks. It is distinct from the Claude Code harness's own auto-mode; the two do not replace each other.
Question shape:
AskUserQuestion({
  questions: [{
    header: "Run mode",
    question: "How involved do you want to be in this run? This affects how many decisions I ask you to make before we ship an artifact.",
    multiSelect: false,
    options: [
      {
        label: "Semi-auto (Recommended)",
        description: "I decide vibe, audience, scenario, dimensions, copy. You decide: output format (GIF/PNG/HTML) and one preview-and-iterate review before export. ~2 decisions. Pros: fast, keeps production-grade gate, keeps your taste in the loop on the things that matter most. Cons: you miss input on smaller creative calls."
      },
      {
        label: "Manual",
        description: "I ask you at every decision point — scenario pick, vibe confirmation, brief approval, preview iteration rounds, export ship-intent. I still make suggestions and recommendations at each step. Pros: max control, highest ceiling on quality. Cons: slow — 8–12 back-and-forths before an artifact."
      },
      {
        label: "Auto",
        description: "I make every decision and go straight to the exported artifact (GIF or PNG, my call). Pros: hands-off, 0 decisions, fastest path to a shippable draft. Cons: lower quality ceiling, more risk of missing your taste or the repo's real scope; expect to redirect after seeing the result."
      }
    ]
  }]
})
Rules that apply to every mode regardless of choice (craft non-negotiables):
After the user picks a mode, commit it to memory for the run (e.g. "Mode: Semi-auto") and reference it when deciding whether to ask or decide silently at each subsequent step. In Auto and Semi-auto, make decisions with a brief one-line note ("going with the Product-UI marketing archetype per §1.4c — amplication-shape repo") so the user can redirect if they disagree.
User may provide:
- gh repo view for stars/topics/languages/releases.

Gather as much as possible before asking the user anything. Goal: Claude should already have an opinion about what this repo is before the vibe conversation starts.
- package.json, Cargo.toml, pyproject.toml, go.mod, *.csproj, etc. — description, keywords, dependencies, entry points.
- assets/, docs/, .github/, any *.png / *.gif / *.svg near the root.
- LICENSE, CHANGELOG.md — maturity signal.
- ast-graph is available locally and the repo has supported languages: run scan + hotspots + symbol on top-level exports for structural inventory.
- src/tools/*, CLI command registrations, etc.). Do not estimate from the README. This number is used later: any hero that says "all", "every", "the whole", or shows a grid meant to represent scope must match this count — or explicitly frame itself as a sample ("10 of 30 shown", scroll affordance, "30+"). An undercount reads as a broken promise.

Summarize findings back to the user in ~6–10 bullets so they can correct misreadings early. Include the inventory count when one exists.
Ask these as a structured batch, with suggested defaults based on what the scan found so the user can just accept or redirect:
Many users will answer "I don't know" to vibe / hero moment / audience. That's normal. Don't force them to guess — extract intent sideways:
If they still can't articulate, offer Claude's best guess based on the scan and ask them to confirm or push back. Be willing to commit first, confirm second — a concrete wrong guess triggers better reactions than another open-ended question.
Before asking Q2, form an opinion from the scan. First consult craft/reference-gallery.md — try to place the target repo in one of the catalogued archetypes (Terminal-demo GIF / Product-UI marketing / Brand-first logo / Banner/promo graphic / Diagram-as-hero) and cite the match by name when recommending. If no clean match, fall back to the heuristics below.
The repo's identity tells you which format fits:
Lean static (PNG) when:
- assets/ / docs/ are already static and strong.

Lean animated (GIF) when:
Lean HTML-only when:
Present the recommendation forcefully: "Based on the scan, I'd go static — this repo's identity is [X], and motion would add noise not signal. Want me to proceed with static, or would you rather animated?" Then honor whatever they pick.
Offer 2–3 concrete scenarios (two if the viable directions are closely related, three if they diverge meaningfully), each with:
- htop")

Then Claude's recommendation: pick one and argue for it forcefully in 1–2 sentences. Frame it as something the user can still redirect.
Move to the build phase when all six are settled:
Write the brief back to the user in a compact block. Wait for "go" before writing any HTML, unless the mode is Auto, in which case proceed directly to Phase 2 with a one-line summary and no wait. In Semi-auto, show the brief and proceed after a short pause unless the user interjects. In Manual, wait explicitly for "go".
Default: 1200×675 (16:9). It's the safe pick when the repo gives no strong signal: GitHub's ~1000px README column renders it roughly 560px tall (comfortable but not dominant), and the skill's capture/eval scripts are calibrated around that width. Start here unless the repo wants something else.
Tailor when the repo's spirit points elsewhere. Don't force a 16:9 crop onto a project whose identity is clearly another shape. Signals to deviate:
How to decide: during the scan (§1.3), note the existing hero image's dimensions and the repo's format archetype (from craft/reference-gallery.md). If they point toward a non-16:9 shape and the brief's hero moment fits that shape, propose the tailored size in the direction-questions batch (§1.4) — show the user the default and the tailored option, cite the reason, let them pick. If nothing argues against 1200×675, don't introduce the choice; just use it.
Craft rules scale with whatever dimensions are chosen — the font-size floors, bottom-clearance minimums, and column-density rules later in this document are written around 1200×675 but are expressed as ratios (% of stage height, px relative to canvas width) so they port. When you deviate, recompute: e.g. at 1200×400 the body-text floor is still ~2.5% of stage height × retina headroom, not a fixed 17px.
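As a rough illustration of porting a rule to a new stage size, the sketch below rescales a pixel value tuned at the 1200×675 default to another stage height. The name scaleByStageHeight and the DEFAULT_STAGE_HEIGHT constant are illustrative assumptions, not part of the skill's scripts, and the sketch ignores the "retina headroom" factor mentioned above.

```javascript
// Hypothetical helper: port a px value tuned at the default stage height
// (675) to another stage height by keeping the same ratio.
const DEFAULT_STAGE_HEIGHT = 675;

function scaleByStageHeight(pxAtDefault, stageHeight) {
  return (pxAtDefault * stageHeight) / DEFAULT_STAGE_HEIGHT;
}

// 17px body text at 675 is ~2.5% of stage height; the same ratio at 400:
console.log(scaleByStageHeight(17, 400).toFixed(1)); // "10.1"
```

The same helper applies to clearance minimums written against the 675px stage, as long as the rule is height-relative rather than width-relative.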
The capture pipeline already supports arbitrary dimensions — scripts/screenshot.js and capture.js both take --width and --height flags. Pass whatever the brief locked in; don't hardcode 1200×675 in export commands.
Once the brief is locked and the user says "go", write the HTML in one file (index.html in a working directory for this repo's hero).
Default layout — always nest per-repo under a subdirectory named after the target repo (e.g. last30days-skill, my-cli):
<current-dir>/repo-visuals-work/<repo-name>/
  index.html   # the hero animation source
  assets/      # any images/SVGs the scenario needs
Use repo-visuals-work/<repo-name>/ so the scratch files stay obviously separate from the target repo and runs for different repos don't clobber each other. Never write directly into repo-visuals-work/ — always into the repo-named subfolder. If no obvious repo name, use a short slug derived from the brief.
Open the target repo's README once more right before writing HTML. Mirror its actual phrasing, headings, and technical terms in the animation's on-screen text. The hero should feel like it was written by the repo author, not for them.
Before writing any scene copy or layout, read:
- craft/headlines.md — headline patterns (imperative-plus-invariant, narrative arc), voice rules, anti-patterns.
- craft/templates/*.html — full working heroes from past runs, all upstream-merged (the maintainers of the target repos accepted them). Read end-to-end to see how a complete scene system is composed (stage + browser chrome + tool-body + rotating hero text + progress indicator + timeline scheduler). Don't copy verbatim — steal patterns. Match by archetype, not by preference ranking.

Every scene needs a headline. A demo without copy delivers no meaning.
index.html

Single self-contained file. No build step. Sections:
- @import), define CSS custom properties for the palette (--bg, --accent, --text, --muted, etc.) derived from the vibe/constraints.
- TIMELINE with named keys (e.g. scene2Start: 3500) so pacing is tweakable in one spot. Include loopPauseAt so the animation has a clean loop boundary.
- schedule(t, fn) wrapping setTimeout into an animTimers array, plus a runLoop() that clears and restarts. This is what the export pipeline calls to reset to t=0 deterministically.
- transform and opacity only — never top/left/width/height (janky, expensive).
- runLoop() is the hard reset if needed.
- Math.random() without a seed, no Date.now()-based logic. Everything must replay identically each capture.
- v1.9.2 turns obsolete the next Tuesday, a visible 6.4k stars turns obsolete the next weekend.
- v1.9.2, 2.0.0-rc3), language/runtime floors (Node 20+, Python 3.10+ — these bump every ~2yrs), star counts (6.4k stars), download counts (12M weekly), contributor counts (52 contributors), "last updated" / release dates, benchmark figures pulled from a specific build.
- <next lower round number>+ so upstream movement doesn't falsify it. Examples: 44 rules → "40+ rules", 6,421 stars → "6k+ stars", 127 contributors → "100+ contributors", 12.3M weekly downloads → "10M+ weekly downloads". Round down to the nearest meaningful milestone — never round up, and never pick a cap the number is about to cross (a 9k+ stars claim on a 9.1k repo is a landmine; use 9k+ only when the repo is comfortably past that for months, otherwise drop to 5k+).
- grid-template-rows: auto 1fr auto or any pattern that parks the bottom row flush against the stage edge — it reads as cramped / "glued-in" at export. Either (a) center the column content vertically (align-content: center on the grid) with explicit gaps, or (b) use grid-template-rows: auto auto 1fr with meaningful breathing room (≥32px equivalent) below the last content row. Default to option (a) unless the column genuinely has more content than vertical space.
- bottom: Xpx, which creates a large vertical gap and jams the footer against the edge.
- bottom: 18px / bottom: 24px reads as "jammed against the edge" at both HTML preview and export. Bottom-pinned footer-style elements want ≥40 px from the stage bottom edge, and the content immediately above them wants ≥32 px vertical gap so the footer doesn't kiss the preceding row. If the math says the bottom anchor has less than that, reduce the stage height (trim dead space) before nudging the footer down — clamped bottom clearance is a stage-sizing problem disguised as a positioning problem. Real incident: a Terminal.Gui hero shipped with bottom: 22px and the bottom line visibly hugged the edge; fix was stage height 675 → 580 plus bottom: 28px, not just "move it".
- .row, .item, .card, .stat are ubiquitous shortcuts — don't style them as plain .row { display: grid; grid-template-columns: 80px 1fr } at the top level, because every unrelated .row elsewhere in the stage (tree row, menu row, metric row) will silently inherit that grid and render broken. Scope the selector: .form .row, .sidebar .item, .metrics .card. Real incident: a Terminal.Gui mock had the tree pane's rows render as stacked two-cell grids (arrow on one line, label on another) because the form's .row grid leaked globally.
- index.html looks perfect in the browser, but the exported PNG/GIF is cropped or shifted because the browser's natural centering (body { display: grid; place-items: center }) + the stage's box-shadow + any body padding collectively move the stage away from the (0, 0) origin the capture pipeline samples from. Rules to prevent this:
- body { margin: 0; padding: 0 } with the stage at flow position) or wrap centering in @media (min-height: 800px) so headless Chrome at the capture viewport lands at origin. Never rely on place-items: center alone if the capture viewport equals the stage size exactly.
- box-shadow on the stage is for the HTML preview only. Shadows extend outside the 1200×675 box and get guillotined by the capture crop — fine, expected. But don't rely on the shadow as a visual boundary; the content inside the stage must read as complete without the shadow.
- top: -N or right: -N to create an "overlap" effect will be clipped in export. If you want a visual that bleeds past the stage edge, reposition it inside or fake the bleed with a gradient — do not let real content live outside.
- clip. Always pass clip: { x: 0, y: 0, width: 1200, height: 675 } to page.screenshot() rather than relying on viewport-equals-capture. This guarantees the crop, regardless of scroll offset or body margin that slipped in.
- index.html in the browser. If they don't match visually, the HTML is wrong (not the pipeline) — fix the layout, don't fix the pipeline.
- setViewport({ ..., deviceScaleFactor: 2 }) — Puppeteer's page.screenshot() respects it. GIF: some Chromium builds silently ignore deviceScaleFactor in the screencast API, so verify the first frame's dimensions and see §4.3g for the fallback (viewport at 2× dimensions + document.body.style.zoom = 2 + preview-media-query override).
- htmlhint/HTMLHint#1861 shipped a 1× 1200×200 marquee at 11.5 px body text; visual "looks small / fuzzy" feedback required a full retina re-render.
- shot-wrap with right: 24px), preserve its left-edge position, not its right margin — otherwise the left whitespace gap grows as the element narrows, creating an orphan composition. Real incident (Terminal.Gui hero, April 2026): initial layout had top: 56px on the title block and bottom: 26px on the footer — user called it out; later the demo shrank from 788×528 to 686×460 while right: 24 stayed fixed, so the left gap grew from 44 → 146 px. Fix combined (a) equalize top/bottom padding, (b) equalize left/right margins by trimming the stage width, (c) preserve the demo's left-edge position when resizing.
- '17 / Since stat was meant to say "released in 2017" — user asked what it meant; fix was Since 2017 or drop the stat. Apply this check during §3.2 preview review.
- ffmpeg -filter_complex "setpts=<ratio>*PTS" to speed-trim before embedding. A seamless inner = seamless outer.

First preview as soon as:
Don't polish before first preview. User's reaction on overall shape is more valuable than Claude's local polish loop.
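The round-down rule for embedded stats ("40+ rules", "6k+ stars") can be sketched as a small formatter. This is a minimal sketch: the name roundDownClaim and the choice to floor to one significant digit are assumptions inferred from the examples given earlier, and it does not make the judgment call about a cap the number has only just crossed.

```javascript
// Hypothetical helper: floor a count to one significant digit and append "+".
function roundDownClaim(count) {
  const n = Math.floor(count);
  if (n < 10) return String(n); // too small to round meaningfully
  const magnitude = 10 ** (String(n).length - 1);
  const floored = Math.floor(n / magnitude) * magnitude;
  if (floored >= 1e6) return `${floored / 1e6}M+`;
  if (floored >= 1e3) return `${floored / 1e3}k+`;
  return `${floored}+`;
}

console.log(roundDownClaim(44));       // "40+"
console.log(roundDownClaim(6421));     // "6k+"
console.log(roundDownClaim(127));      // "100+"
console.log(roundDownClaim(12300000)); // "10M+"
```

Because the helper only ever floors, it cannot round a figure up. Deciding whether "9k+" is safe for a 9.1k repo still needs a human call.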
Keep this phase conversational. The goal is to converge on a version the user loves before spending time on export.
After writing index.html, give the user the command to open it themselves (their browser, their timing):
- Windows: start repo-visuals-work/<repo-name>/index.html
- macOS: open repo-visuals-work/<repo-name>/index.html
- Linux: xdg-open repo-visuals-work/<repo-name>/index.html

Tell them to watch one full loop (the animation restarts automatically via runLoop()).
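For orientation, the automatic restart via runLoop() relies on the TIMELINE, schedule(t, fn), and animTimers pieces described in the build phase. A minimal sketch of how they might fit together follows. The scene1Start key, the timing values, the showScene callback, and the clearTimers helper are illustrative assumptions; only TIMELINE, scene2Start, loopPauseAt, schedule, animTimers, and runLoop are named in this document.

```javascript
// Sketch only: scene1Start and the timing values are made up for illustration.
const TIMELINE = {
  scene1Start: 0,
  scene2Start: 3500,
  loopPauseAt: 20000, // clean loop boundary
};

const animTimers = [];

// Wrap setTimeout so every pending callback can be cancelled on reset.
function schedule(t, fn) {
  animTimers.push(setTimeout(fn, t));
}

function clearTimers() {
  animTimers.forEach((id) => clearTimeout(id));
  animTimers.length = 0;
}

// Hard reset to t=0: cancel pending timers, then re-schedule every scene.
function runLoop(showScene = () => {}) {
  clearTimers();
  schedule(TIMELINE.scene1Start, () => showScene(1));
  schedule(TIMELINE.scene2Start, () => showScene(2));
  schedule(TIMELINE.loopPauseAt, () => runLoop(showScene));
}

// In the browser, expose runLoop so a capture script can call window.runLoop().
if (typeof window !== 'undefined') window.runLoop = runLoop;
```

Keeping every timer in one array is what makes the reset deterministic: a second runLoop() call cannot leave callbacks from the previous cycle pending.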
Keep first-preview questions focused on shape, not polish:
Don't ask about colors/fonts/spacing on the first round. Polish comes after shape is right.
frontend-design. If the user has gone back-and-forth on style (palette, type, overall aesthetic, visual language) for 3+ rounds without converging — or says something like "still not it" / "try a totally different direction" — stop tweaking in place. Invoke the frontend-design skill via the Skill tool to get a fresh, high-quality design pass. Pass it the brief, the repo scan summary, the current index.html, and a plain-language description of what's not working. Use its output as the new starting point, then return to this iteration loop. Don't invoke it for pacing, timing, or animation-logic feedback — only when the blocker is visual design quality.

GIF quantization causes specific failures that HTML preview hides: small text blurs, near-identical hues posterize, fine gradients band. If GIF was selected in §1.4, run the export pipeline once at ~70% polish so these surface before the final polish rounds. Check:
If problems appear, tune the HTML (bigger type, fewer palette neighbors, simpler gradients) and keep iterating on the preview.
Skip this section entirely if the output is not a GIF (MP4/WebM preserve HTML fidelity; static outputs are sampled directly).
Stop iterating when the user says "ship it" (or equivalent). Don't invite more rounds — excess polish is real cost. If Claude thinks there's still a clear improvement available, mention it once and let the user decide.
The user may not fully know what they want. Keep watching for mismatches between the scan/brief and the HTML behavior:
When you spot a mismatch, flag it proactively ("the README leans 'fast' but the current pacing is slower — intentional, or should we tighten?"). Deliver intent, not just instructions.
"Reduce" means reduce — directional verbs are load-bearing. When the user pairs a directional verb (reduce, smaller, tighten, cut, shrink, trim) with a relational clause ("so that padding A equals padding B", "to match X"), treat the directional verb as the binding constraint. If the easy execution of the relational clause would violate the direction — e.g., equalizing padding by growing the demo — stop and restate rather than execute. Real incident: user said "reduce the height so the padding Cross-platform to top equal MIT to bottom" — I executed as "equalize padding" and grew the demo from 749×502 to 788×528; user had to correct "I meant reduce the size, not increase it to bigger." The literal word was clear; the misread cost a capture-and-encode round trip.
Before you export anything, confirm — the exact gate depends on operating mode (§1.1a):
| Gate item | Manual | Semi-auto | Auto |
|---|---|---|---|
| User has seen the artifact running (open in browser or rendered screenshot) | required | required | skipped |
| At least one iteration round after first look | required | required | skipped |
| Scope match — hero claims ("all", "every", grid/list representing the repo) match real inventory from §1.3, or explicitly framed as sample ("10 of 30", "30+", scroll affordance) | required | required | required |
| User has explicitly said ship / go / export | required | required (after 1 preview) | not required |
The scope-match rule is non-negotiable in every mode — it's a craft check, not a human-involvement question. The human-in-the-loop items are what modes toggle.
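The gate table can also be read as data. The sketch below transcribes it and reports which required items are still unmet for a given mode. The GATE object, its key names, and unmetGateItems are hypothetical names for illustration; the skill itself applies the gate conversationally.

```javascript
// Gate items per mode, transcribed from the table above.
const GATE = {
  manual:      { seen: true,  iterated: true,  scopeMatch: true, shipSaid: true },
  'semi-auto': { seen: true,  iterated: true,  scopeMatch: true, shipSaid: true },
  auto:        { seen: false, iterated: false, scopeMatch: true, shipSaid: false },
};

// Returns the required gate items that `state` has not yet satisfied.
function unmetGateItems(mode, state) {
  const required = GATE[mode];
  if (!required) throw new Error(`Unknown mode: ${mode}`);
  return Object.keys(required).filter((item) => required[item] && !state[item]);
}

console.log(unmetGateItems('auto', {}));                   // [ 'scopeMatch' ]
console.log(unmetGateItems('auto', { scopeMatch: true })); // []
```

Note that scopeMatch is required in every row, which mirrors the rule that the scope check is never toggled by mode.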
Export is the last step, not a midpoint. A capture→encode→evaluate cycle feels productive but burns the user's patience on a draft that wasn't ready. When in doubt between "export this pass" and "one more iteration", prefer the iteration — except in Auto mode, where the point is to deliver a first draft fast and let the user redirect.
Check presence, list missing, ask the user before installing:
- puppeteer — via node --version and looking for node_modules/puppeteer. If missing: npm install puppeteer (~170MB Chromium download).
- ffmpeg — via ffmpeg -version. If missing:
- choco install ffmpeg, brew install ffmpeg, apt install ffmpeg) — needs admin.
- https://github.com/GyanD/codexffmpeg/releases/latest and extract ffmpeg.exe into the skill's bin/ dir. Do NOT add to PATH; call by absolute path.

Never install silently. Show the plan, ask, then act.
Ship at <skill-dir>/scripts/capture.js. Accepts CLI args:
node capture.js --html <path-to-index.html> --out <frames-dir> --duration 20700 --width 1200 --height 675
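The flags in that command could be parsed along the following lines. This is a sketch, not the contents of scripts/capture.js: the function name parseCaptureArgs, the 1200×675 defaults for --width/--height, and treating --html, --out, and --duration as required are all assumptions. It uses Node's built-in util.parseArgs and assumes a Node version recent enough to support its default option.

```javascript
const { parseArgs } = require('node:util');

// Sketch: parse the capture flags shown above. Defaults are assumptions.
function parseCaptureArgs(argv) {
  const { values } = parseArgs({
    args: argv,
    options: {
      html: { type: 'string' },
      out: { type: 'string' },
      duration: { type: 'string' },
      width: { type: 'string', default: '1200' },
      height: { type: 'string', default: '675' },
    },
  });
  if (!values.html || !values.out || !values.duration) {
    throw new Error('--html, --out and --duration are required');
  }
  return {
    html: values.html,
    out: values.out,
    durationMs: Number(values.duration),
    width: Number(values.width),
    height: Number(values.height),
  };
}
```

A script would typically call parseCaptureArgs(process.argv.slice(2)) and pass the resulting width and height straight to the viewport, so the brief's dimensions are never hardcoded.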
Based on the format decided in Phase 1.4c:
- repo-visuals-work/<repo-name>/index.html.

Use as-is unless there's a specific reason to deviate.
Capture (2× retina — mandatory for anything with text):
Preferred path (current Chromium honors deviceScaleFactor during screencast):
- headless: true — 'new' is deprecated).
- setViewport({ width: W, height: H, deviceScaleFactor: 2 }). Screencast frames emit at 2W × 2H.
- page.goto(file://...) with waitUntil: 'networkidle0'.
- await page.evaluateHandle('document.fonts.ready').
- await page.evaluate(() => window.runLoop()) to reset animation to t=0.
- Page.screencastFrame, save each frame as PNG. Verify the first saved frame is 2W × 2H — if it's W × H, deviceScaleFactor is being ignored and you need the fallback below.
- client.send('Page.startScreencast', { format: 'png', everyNthFrame: 1 }).

This is what scripts/capture.js --scale 2 (default) now does.
Fallback (if a future Chromium/Puppeteer silently ignores deviceScaleFactor in screencast): enlarge the viewport to 2× the target dimensions and apply zoom: 2 to the body so the stage renders at 2× density natively. Use this only if the preferred path produces 1× frames:
- setViewport({ width: W * 2, height: H * 2, deviceScaleFactor: 1 }).
- (0, 0):
await page.addStyleTag({ content: `
html, body { margin: 0 !important; padding: 0 !important; }
body { display: block !important; align-items: flex-start !important; justify-content: flex-start !important; }
`});
- await page.evaluate(() => { document.body.style.zoom = '2'; }).
- await page.evaluateHandle('document.fonts.ready').
- await page.evaluate(() => window.runLoop()) to reset the animation to t=0.
- page.target().createCDPSession().
- Page.screencastFrame — save each frame as PNG, record metadata.timestamp (seconds). Verify the first saved frame is 2W × 2H pixels; if it's still W × H the zoom/viewport step didn't apply and you're about to ship a 1× render.
- client.send('Page.startScreencast', { format: 'png', everyNthFrame: 1 }).
- elapsedMs > DURATION_MS (no extra padding — anything past TIMELINE.loopEnd is the start of the next cycle and will break the loop seam). DURATION_MS must equal TIMELINE.loopEnd exactly.

Why screencast, not screenshot-loop or virtual-time: real screencast records exactly what the compositor paints, including CSS transitions. Screenshot loops drift under load; virtual-time (Emulation.setVirtualTimePolicy) freezes the compositor and captures stale frames.
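Since frames arrive with a metadata.timestamp in seconds, one plausible way to hand them to the encoder is an ffmpeg concat-demuxer list (the frames.txt consumed by -f concat). The sketch below builds that list from per-frame timestamps. The function name buildConcatList, the { file, timestamp } record shape, and the 1/24 s fallback for the final frame are assumptions; the real capture script may lay out frames.txt differently.

```javascript
// Sketch: build an ffmpeg concat-demuxer list from captured frames.
// `frames` is assumed to be [{ file, timestamp }] with timestamp in seconds,
// as recorded from metadata.timestamp during capture.
function buildConcatList(frames, fallbackDuration = 1 / 24) {
  const lines = [];
  frames.forEach((frame, i) => {
    const next = frames[i + 1];
    // Each frame is shown until the next one arrived; the last frame has no
    // successor, so it gets the fallback duration.
    const duration = next ? next.timestamp - frame.timestamp : fallbackDuration;
    lines.push(`file '${frame.file}'`);
    lines.push(`duration ${duration.toFixed(6)}`);
  });
  return lines.join('\n') + '\n';
}
```

Deriving durations from real timestamps, instead of assuming a fixed frame interval, keeps the encoded pacing faithful to what the compositor actually painted.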
Encode (two-pass palette):
# Pass 1: palette built from 2× frames, lanczos-downscaled to target size
ffmpeg -y -f concat -safe 0 -i frames.txt \
-vf "fps=24,scale=<W>:<H>:flags=lanczos+accurate_rnd+full_chroma_int,palettegen=stats_mode=full:max_colors=256" \
palette.png
# Pass 2: apply palette with sharp-text dither, same lanczos downscale
ffmpeg -y -f concat -safe 0 -i frames.txt -i palette.png \
-lavfi "fps=24,scale=<W>:<H>:flags=lanczos+accurate_rnd+full_chroma_int [x]; [x][1:v] paletteuse=dither=bayer:bayer_scale=3:diff_mode=rectangle" \
-loop 0 hero.gif
- scale=<W>:<H>:flags=lanczos+accurate_rnd+full_chroma_int in both passes. Frames from capture.js --scale 2 are 2W × 2H; this downscales them to the target dimensions cleanly. Omitting the downscale ships a final GIF at doubled dimensions (acceptable on retina but bloats file size ~4×).
- stats_mode=full — analyzes every pixel, giving chrome text equal weight to motion regions. Use stats_mode=diff only if the hero is mostly-static with a small moving region and text sharpness isn't a concern.
- bayer:bayer_scale=3 — sharper than bayer_scale=5; try dither=none if text is still blurry and gradients are minimal (~30% file-size cost).
- fps=24 is a good default (20s × 24 = 480 frames). fps=30 looks buttery but bloats file size; drop to fps=20 or fps=18 if over the 10 MB target.

Size budget:
- fps=20 → drop to fps=15 → max_colors=192 → max_colors=128 → shorten loop. Apply in order; stop as soon as under budget.

For the static format, skip ffmpeg entirely — a single crisp screenshot is enough.
Capture via Puppeteer:
- headless: true).
- setViewport({ width, height, deviceScaleFactor: 2 }) — retina-crisp for README rendering at native size.
- page.goto(file://...) with waitUntil: 'networkidle0'.
- await page.evaluateHandle('document.fonts.ready').
- await page.screenshot({ path: 'hero.png', type: 'png', omitBackground: false }).

If the HTML has an animation that evolves over time, either:
- await page.evaluate(() => window.seekTo(<seconds>)) if the source HTML exposes such a hook.

Optional compression: run pngquant --quality=80-95 hero.png --output hero.png --force if the PNG is over ~500 KB. Static heroes rarely need it.
Size budget:
Ship script: node scripts/screenshot.js --html <path> --out hero.png --width 1200 --height 675 (thin wrapper around the steps above — see scripts/screenshot.js).
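For readers who want to see the static steps end to end, here is a minimal sketch. It is not the contents of scripts/screenshot.js: the function names screenshotOptions and captureStatic are hypothetical, Puppeteer is assumed to be installed already, and the explicit clip follows the capture rule stated earlier in this document.

```javascript
const path = require('node:path');
const { pathToFileURL } = require('node:url');

// Screenshot options with an explicit clip, so the crop never depends on
// scroll offset or a stray body margin.
function screenshotOptions({ out, width, height }) {
  return {
    path: out,
    type: 'png',
    omitBackground: false,
    clip: { x: 0, y: 0, width, height },
  };
}

async function captureStatic({ html, out, width, height }) {
  const puppeteer = require('puppeteer'); // assumed installed; loaded lazily
  const browser = await puppeteer.launch({ headless: true });
  try {
    const page = await browser.newPage();
    // deviceScaleFactor: 2 renders the PNG at 2x pixel density.
    await page.setViewport({ width, height, deviceScaleFactor: 2 });
    await page.goto(pathToFileURL(path.resolve(html)).href, { waitUntil: 'networkidle0' });
    await page.evaluateHandle('document.fonts.ready'); // wait for web fonts
    await page.screenshot(screenshotOptions({ out, width, height }));
  } finally {
    await browser.close();
  }
}
```

A call such as captureStatic({ html: 'index.html', out: 'hero.png', width: 1200, height: 675 }) would then write the PNG; width and height should come from the brief, not from constants.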
- repo-visuals-work/<repo-name>/hero.gif
- repo-visuals-work/<repo-name>/hero.png

Keep in the scratch dir until Phase 5 (Output) moves it.
Future formats Claude may extend this skill with — not part of the current export pipeline:
- libx264 -crf 18 or libvpx-vp9 -crf 32. Higher fidelity, smaller files. Note: GitHub renders .mp4 uploaded via issue/PR drag-and-drop but not .mp4 checked into the repo and linked in markdown.

Move hero.gif from the scratch dir into the target repo, update the README, and optionally open a PR.
Read the target repo to infer convention, then ask. Priority order when inferring:
- assets/ or images/ → follow it.
- docs/ with images → place at docs/hero.gif (or docs/<repo-name>-hero.gif if multiple visuals).
- .github/ with images → .github/hero.gif.
- assets/hero.gif and create the dir.

File name: default hero.gif. If the repo already has a hero.gif or keeps multiple visuals, prefer <repo-name>-hero.gif or <repo-name>-demo.gif.
Default to minimal — ship only what the repo needs. The mandatory artifact is the image file itself (hero.png / hero.gif) placed at the inferred path, plus the single-line ![...] embed in the target README. Nothing else by default.
Do not preemptively commit hero.html, a docs/images/README.md maintenance doc, capture scripts, frames, palettes, or any supporting artifact. They enlarge the PR, dilute the diff, and in many repos are noise the maintainer will ask you to remove.
Include supporting files only when one of these applies:
- SonarSource/sonarqube#3427 — hero.html + maintenance README was requested by the review bot and directly turned a 2-item review into a merged PR.)
- assets/ / docs/images/ / website/src/assets/img/ already contains design sources (SVGs, Figma exports, prior hero.html), match that convention on the way in.

If none of those apply, keep the PR to the image + one README line. Real incident: htmlhint/HTMLHint#1863 shipped with hero.html + a maintenance README.md bundled in — the maintainer's feedback was literally "GIF looks good — if you could kindly remove the HTML and MD I'll get it merged". Anything beyond the image risks being a scope imposition on someone else's repo.
When you do need to commit source (one of the three triggers above):
- <image-dir>/hero.html — the self-contained HTML source (just copy the scratch index.html verbatim).
- <image-dir>/README.md — short doc with: (a) table of embedded stats + where each is verifiable, (b) re-render snippet (Puppeteer for static, Puppeteer + ffmpeg for GIF).

Read the README first. Ask:
- right after the H1 title and tagline.

Alt text is informational, not decorative. If the image contains text, stats, or a named concept that a sighted viewer takes away, the alt text must convey the same. <repo-name> demo is almost never enough.
Pattern:
If the image is genuinely decorative (a brand flourish, a pattern), use alt="" explicitly so assistive technology skips it rather than announcing the filename.
Real incident: a review bot flagged  with "omits the stats visually presented in the image — screen-reader users will never see them." Fix was a one-line alt-text rewrite; flagging it up front avoids the round trip (SonarSource/sonarqube#3427).
Always use a relative path, such as ![alt text](assets/hero.gif) or <img src="assets/hero.gif">, never https://raw.githubusercontent.com/<owner>/<repo>/main/assets/hero.gif. Even if the existing README uses absolute raw.githubusercontent.com URLs for its current images, do not mirror that style for your new hero. Reasons:
- main don't render in forks or in the PR preview — the image stays broken until the PR merges, which hurts review quality and often triggers reviewer objections (has happened: htmlhint/HTMLHint#1861).

If the user explicitly asks for an absolute URL (e.g. embedding the hero on an external site that loads the README), use one — but the default for in-repo embeds is always relative.
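A quick pre-PR check for the relative-path rule might look like the sketch below. The name absoluteImageEmbeds and the regular expression are illustrative assumptions; the check only covers markdown image embeds, so HTML img tags would need a separate pass.

```javascript
// Sketch: list markdown image embeds whose target is an absolute URL.
const IMAGE_EMBED = /!\[[^\]]*\]\(([^)\s]+)[^)]*\)/g;

function absoluteImageEmbeds(markdown) {
  const offenders = [];
  for (const match of markdown.matchAll(IMAGE_EMBED)) {
    // match[1] is the image target; flag http(s):// and protocol-relative URLs.
    if (/^(https?:)?\/\//i.test(match[1])) offenders.push(match[1]);
  }
  return offenders;
}

console.log(absoluteImageEmbeds('![Hero](assets/hero.gif)')); // []
```

An empty result means every markdown embed in the README uses a path that also resolves in forks and PR previews.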
Branch name default: docs/add-hero-gif. Override if the repo has a branch-name convention (check recent PRs or CONTRIBUTING.md).
Commit message default: docs: add animated hero gif to README. Follow existing commit style (conventional commits, imperative, etc.).
Co-author footer: default OFF. Only add a Co-Authored-By: Claude … footer if the user explicitly opts in for this repo.
Detect auth and repo ownership:
- origin, open PR via gh pr create against the default branch.
- gh repo fork --clone=false, add the fork as a remote, push the branch there, then gh pr create --repo <upstream> --head <user>:<branch>.
- gh auth login as the right account. Never guess.

Commit identity: if the user specifies a different git account for this repo, set user.name / user.email on the local repo config only, not globally. Use <login>@users.noreply.github.com if email is unknown.
After the PR opens, report:
Then, as a final one-liner after the hand-off report, invite them to the showcase:
Optional: add this to SHOWCASE.md.
That's it. Do not expand on it unless the user asks. Do not block waiting for an answer. Do not re-surface it in later phases or future runs.
If the user doesn't want a PR, leave hero.gif at repo-visuals-work/<repo-name>/hero.gif and print the path. Don't modify the target repo.
If the user sounded genuinely pleased (e.g. "this looks great", "love it"), end with the same one-liner as §5.5:
Optional: add this to SHOWCASE.md — the hero file itself can go in the PR, no upstream merge needed.
Same rule — mention once, don't expand, don't block, don't re-surface.
Never let the showcase invite crowd the hero hand-off. The hero is the user's deliverable. The invite is a footnote on the last line, never in its own phase, never a question the user has to answer.
Gate: run this phase only when dev mode is active (user said "dev mode" this run, or REPO_VISUALS_DEV=1). In every other run — Manual, Semi-auto, Auto — stop after Phase 5. Do not ask user-rating questions, do not write run logs, do not mention this phase.
Score the final artifact, not the process. In dev mode, this phase always runs at the end.
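The gate can be expressed as a single predicate. A minimal sketch, assuming the caller tracks whether the user said "dev mode" in this run (the `userSaidDevMode` flag and the function name are hypothetical):

```javascript
// True only when dev mode was explicitly requested for this run.
// `userSaidDevMode` is a hypothetical flag supplied by the caller.
function isDevMode(userSaidDevMode, env = process.env) {
  return userSaidDevMode === true || env.REPO_VISUALS_DEV === "1";
}
```

Anything other than the exact value "1" leaves the phase disabled, which matches the default-off intent of the gate.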
Each criterion is rated by exactly one of: User, Claude (subjective, chat-blind), Code (deterministic script), or AI (Claude re-reads the final GIF/HTML with vision, blind to prior chat).
User-rated (3) — viewer-side truth.
| Criterion | What it measures | Signal for low score |
|---|---|---|
| Hero moment delivery | Does a cold viewer "get it" in ~10 seconds — both what the repo is and why they'd reach for it? | After one loop, viewer still can't state the repo's purpose or motivation |
| Visual impact | Does the artifact make the viewer want to try the repo? | Looks fine but feels generic; no pull |
| Ship-worthiness | Gut check: would the user paste this into the repo's README today, as-is? | User hesitates, wants "one more pass" |
Claude-rated (1) — repo-fit judgment.
| Criterion | What it measures | Signal for low score |
|---|---|---|
| Repo fidelity | Do on-screen text, terminology, and vibe feel like this specific repo's own voice? | Headlines read like generic marketing; terminology drifts from README |
Code-evaluated — scripts/evaluate.js runs automatically after export. Rows depend on format.
| Criterion | Applies to | Pass rule | Fail signal |
|---|---|---|---|
| File size | GIF + PNG | GIF: ≤ 10 MB target / ≤ 15 MB cap. PNG: ≤ 500 KB target / ≤ 1 MB cap. | Over target → 3; over cap → 1 |
| Dimensions | GIF + PNG | Matches spec (e.g. 1200×675). PNG at 2× device pixel scale is also acceptable. | Wrong size → 1 |
| Loop duration | GIF only | 15–25 s (hero default) | Outside band → 2 |
| Loop seam | GIF only | First-frame vs last-frame pixel diff under ~2% | Visible jump on loop → 2 |
| Palette size | GIF only | Palette ≤ 256, no visible banding on solid regions | Banding detected → 2 |
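The rows above could be implemented along these lines. This is a sketch under stated assumptions, not the actual scripts/evaluate.js: all function names are hypothetical, MB is taken as 1024 × 1024 bytes, a passing check is scored 5 (inferred from the example scorecard), and "pixel diff" is assumed to mean the share of pixels that differ between the first and last frame.

```javascript
const KB = 1024;
const MB = 1024 * KB; // assumption: binary megabytes

// Target and cap per format, from the File size row.
const SIZE_LIMITS = {
  gif: { target: 10 * MB, cap: 15 * MB },
  png: { target: 500 * KB, cap: 1 * MB },
};

// Within target -> 5 (assumed), over target -> 3, over cap -> 1.
function scoreFileSize(format, bytes) {
  const { target, cap } = SIZE_LIMITS[format];
  if (bytes > cap) return 1;
  if (bytes > target) return 3;
  return 5;
}

// 15-25 s band for GIF loops: inside -> 5 (assumed), outside -> 2.
function scoreLoopDuration(seconds) {
  return seconds >= 15 && seconds <= 25 ? 5 : 2;
}

// Fraction of pixels that differ between two same-sized RGBA buffers.
// A pixel counts as different if any channel differs by more than `tolerance`.
function pixelDiffRatio(first, last, tolerance = 0) {
  if (first.length !== last.length || first.length % 4 !== 0) {
    throw new Error("frames must be same-sized RGBA buffers");
  }
  let differing = 0;
  for (let i = 0; i < first.length; i += 4) {
    for (let c = 0; c < 4; c++) {
      if (Math.abs(first[i + c] - last[i + c]) > tolerance) {
        differing++;
        break; // count each pixel at most once
      }
    }
  }
  return differing / (first.length / 4);
}
```

A loop seam check would then compare `pixelDiffRatio(firstFrame, lastFrame)` against 0.02.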
AI-evaluated (4) — Claude re-opens the exported artifact with vision, blind to prior chat. Prompt in §6.3.
| Criterion | What it measures | Signal for low score |
|---|---|---|
| Legibility | Every headline readable at native render size, no sub-pixel smear | Any headline unreadable → 2 |
| Scene clarity | Each scene conveys one idea in its airtime | Two scenes blur together or one feels like filler → 3 |
| Voice match | Headlines match tone and terminology of the repo's README | Drift from repo's own language → 2 |
| Intent delivery | After one loop, can a cold viewer state why to reach for this repo? | Demos what without delivering why → 3 |
| Score | Label | Meaning |
|---|---|---|
| 1 | Poor | Falls apart on the criterion |
| 2 | Weak | Noticeably misses |
| 3 | OK | Gets there, nothing more |
| 4 | Strong | Clearly delivers |
| 5 | Excellent | Best-in-class for this repo |
Use the labels, not bare numbers. A "3" alone is noise; "3 / OK" is meaningful.
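One way to guarantee a score never appears without its label is to route every score through a single formatter. A minimal sketch; `formatScore` and `SCORE_LABELS` are hypothetical names:

```javascript
const SCORE_LABELS = { 1: "Poor", 2: "Weak", 3: "OK", 4: "Strong", 5: "Excellent" };

// Renders a score in the "N / Label" form used throughout the scorecard.
function formatScore(score) {
  const label = SCORE_LABELS[score];
  if (!label) throw new RangeError(`score must be an integer 1-5, got ${score}`);
  return `${score} / ${label}`;
}
```

For example, `formatScore(3)` yields `3 / OK`.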
Assemble in four steps:
Run code eval — node scripts/evaluate.js <path-to-hero.gif-or-png> → emits a JSON scorecard.
Run AI eval — extract 4–6 keyframes from the GIF first (ffmpeg -ss <t> -i hero.gif -update 1 -frames:v 1 frame.png at evenly spaced times), then re-read each frame with vision, blind to prior chat. Use this fixed prompt — anchored, evidence-required, and fed the repo's real inventory so factual drift is catchable:
You are evaluating a hero GIF for the repo <owner/repo>.

Repo ground truth (from §1.3 scan):
- README excerpt (first 40 lines): <excerpt>
- Real inventory count: <N> (e.g. "30 tools", "12 commands", "N/A"). Specific names: <list if applicable>.
- Stated hero moment (from §1.6 brief): <one sentence>

Rating protocol — read carefully:
- Default every score to 3 / OK. A 3 means "gets there, nothing more." Only move up with specific visual evidence from the frames; only move down with specific visual evidence of a problem.
- 4 / Strong requires one concrete observation from the frames that the criterion is clearly delivered (cite it in the note).
- 5 / Excellent requires two concrete observations AND that you cannot name a realistic improvement. If you can name one, cap at 4.
- Do not grade on effort, intent, or potential. Grade only what is visible in the frames.
- Compare on-screen claims against the repo ground truth. If the hero says "all" / "every" / shows a grid of N items but the repo has more, cap Intent delivery at 2 / Weak and note the undercount. An unverified claim is weaker than a verified one.
Rate each of Legibility, Scene clarity, Voice match, Intent delivery (1–5 Poor/Weak/OK/Strong/Excellent) with a one-sentence justification citing specific frame evidence. Do not reference any prior conversation — judge only what you see in the frames and read in the ground truth above.
At the end, for any score ≥ 4, answer: "what specific change would push this to 5?" If you have an answer, lower the score by one.
Fill Claude's repo-fidelity row with a one-sentence justification.
Ask the user for the 3 User rows via the AskUserQuestion tool — in Manual and Semi-auto only. In Auto mode, skip the User rows entirely; compute the overall average from Code + AI + Claude rows alone and flag in the run file that User ratings were not collected (so the evaluations index can weight it correctly).
Use four questions in one call: Hero moment delivery, Visual impact, Ship-worthiness, and a fourth free-text-via-Other for the one-line feedback. Structure each rating question with four labeled options matching the scale (omit 5 to fit the 4-option max; users can pick "Other" to enter 5/Excellent or a custom score). Example shape:
```javascript
AskUserQuestion({
  questions: [
    {
      header: "Hero moment",
      question: "Hero moment delivery — after one loop, would a cold viewer get *what* this repo is *and why* they'd reach for it?",
      options: [
        { label: "2 / Weak", description: "Noticeably misses" },
        { label: "3 / OK", description: "Gets there, nothing more" },
        { label: "4 / Strong", description: "Clearly delivers" },
        { label: "1 / Poor", description: "Falls apart" }
      ],
      multiSelect: false
    },
    { header: "Visual impact", question: "Visual impact — does it make you want to try the repo?", options: [/* same 4 */], multiSelect: false },
    { header: "Ship-worthiness", question: "Ship-worthiness — would you paste this into the README today, as-is?", options: [/* same 4 */], multiSelect: false },
    {
      header: "Feedback",
      question: "One line of free-text feedback — the single most useful signal for next time.",
      options: [
        { label: "Nothing to add", description: "Skip this round" },
        { label: "Add a comment", description: "Pick 'Other' to type your line" }
      ],
      multiSelect: false
    }
  ]
})
```
The "Other" escape hatch covers 5 / Excellent and any other custom response. Capture the returned labels + any annotations.notes or "Other" text into the scorecard.
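For the AI-eval step above, the evenly spaced keyframe times and the matching ffmpeg arguments can be computed up front. A sketch, assuming "evenly spaced" means interior points that skip the exact start and end of the loop; `keyframeTimes` and `ffmpegArgs` are hypothetical helpers, and the flags are the ones from the command quoted in that step.

```javascript
// Evenly spaced sample times strictly inside the loop (assumed interpretation).
function keyframeTimes(durationSeconds, count = 5) {
  const step = durationSeconds / (count + 1);
  return Array.from({ length: count }, (_, i) => Number((step * (i + 1)).toFixed(2)));
}

// Argument list for extracting one frame, mirroring the command in the text.
function ffmpegArgs(time, input, output) {
  return ["-ss", String(time), "-i", input, "-update", "1", "-frames:v", "1", output];
}
```

Passing the arguments as an array (for example to `child_process.execFile("ffmpeg", args)`) avoids shell-quoting issues with file paths.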
Display the completed table, grouped by rater, then compute an overall simple average. Keep the full table in the run file.
| Criterion | Rater | Score | Note |
|----------------------|--------|--------------|--------------------------------------------------------|
| Hero moment delivery | User | (ask user) | (ask user) |
| Visual impact | User | (ask user) | (ask user) |
| Ship-worthiness | User | (ask user) | (ask user) |
| Repo fidelity | Claude | 4 / Strong | Mirrors README phrasing; tagline could be tighter. |
| File size | Code | 5 / Excellent| 2.4 MB (10 MB target). |
| Dimensions | Code | 5 / Excellent| 1200×675 matches spec. |
| Loop duration | Code | 5 / Excellent| 20.1 s inside 15–25 s band. |
| Loop seam | Code | 4 / Strong | 1.3% first/last-frame diff (threshold 2%). |
| Palette size | Code | 5 / Excellent| 212 colors, no banding flagged. |
| Legibility | AI | 4 / Strong | All 7 headlines readable at native size. |
| Scene clarity | AI | 3 / OK | Cron and JWT scenes blur into each other. |
| Voice match | AI | 4 / Strong | Matches README's "one-off utility" framing. |
| Intent delivery | AI | 3 / OK | Shows *what* each tool does, not *why* a user needs it.|
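The overall simple average can be computed directly from the table rows. A minimal sketch, assuming uncollected rows carry `score: null` and that the average is rounded to one decimal (both are assumptions; `overallAverage` is a hypothetical name):

```javascript
// Each row: { criterion, rater, score }, where score is 1-5 or null when
// not collected (e.g. User rows in Auto mode).
function overallAverage(rows) {
  const scored = rows.filter((r) => typeof r.score === "number");
  if (scored.length === 0) return null;
  const sum = scored.reduce((acc, r) => acc + r.score, 0);
  return {
    average: Math.round((sum / scored.length) * 10) / 10,
    // Lets the run file flag that User ratings were skipped.
    userRatingsCollected: rows.some((r) => r.rater === "User" && typeof r.score === "number"),
  };
}
```

Applied to the ten non-User rows in the example table (scores summing to 42), this gives an average of 4.2 with `userRatingsCollected` set to false.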
Write all evaluation files under ./evaluations/ in the user's current working directory — NOT the plugin cache, which is wiped on /plugin update. Create the directory if it doesn't exist on first dev-mode run.
Tier 1 — curated aggregate (committed): ./evaluations/index.md
Tier 2 — raw per-run files (gitignored by default): ./evaluations/runs/<YYYY-MM-DD>-<slug>.md
- hero.gif / HTML if user opts to keep them

User can opt in to commit specific runs (typically OSS repos they're happy to publicize).
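The Tier 2 file name can be derived from the run date and the repo name. A sketch only: the slug rule (lowercase, non-alphanumerics collapsed to hyphens) and the use of the UTC date are assumptions, and both function names are hypothetical.

```javascript
// Assumed slug rule: lowercase, runs of non-alphanumerics become "-".
function slugify(name) {
  return name.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, "");
}

// Builds ./evaluations/runs/<YYYY-MM-DD>-<slug>.md relative to the working directory.
function runFilePath(repoName, date = new Date()) {
  const day = date.toISOString().slice(0, 10); // YYYY-MM-DD in UTC
  return `./evaluations/runs/${day}-${slugify(repoName)}.md`;
}
```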
repo-visuals-retro

A separate skill (not part of repo-visuals's runtime) for improving the skill itself. See ../repo-visuals-retro/SKILL.md for its own design. Invoked on-demand when you have enough samples to spot patterns — not automatically per run.
npx claudepluginhub livlign/claude-skills --plugin repo-visuals