Stages 4–5 of the video-essay pipeline. For each of the 3 angles: per-angle hook research, a short script, a pre-audio judge pass, then audio, images, covers, and captioned video. Use this skill when packages.md exists but the 3 shorts are not all assembled yet. Runs after ve-intake and before ve-publish-shorts.
Slash command: /video-essay:ve-produce
Takes `packages.md` (3 angles) and produces 3 fully-rendered, captioned shorts ready to publish. Per-angle pipeline: research → script → judge → audio → images → covers → video. Auto-advances from script judge pass to audio; **no user checkpoint inside Stage 5** (checkpoint lives at Stage 6 in `ve-publish-shorts`).
For each of A, B, C, run a focused hook-research pass scoped to: "what's the strongest 1.7-second hook for this specific angle?" Different question from the angle-thesis. Surfaces:
Run the deep-research thorough-mode flow per references/deep_research/SKILL.md, scoped to each angle's hook. Quote the angle's Hook concept: field verbatim in the per-agent prompts. Save each output to $EPISODES_DIR/<slug>/research_hook_{A,B,C}.md — three separate files. Never overwrite research_angles.md.
Parallelizable — launch 3 deep-research invocations concurrently if budget allows.
No checkpoint here. Auto-advance to Stage 5.
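Because there is no checkpoint between the stages, a cheap file check before advancing can catch a research pass that silently produced nothing. The following is a minimal sketch, assuming the three output paths named above; the function name check_hook_research is hypothetical and not part of the pipeline.

```shell
# Sketch: confirm all three hook-research files exist and are non-empty
# before auto-advancing to Stage 5.
# The file names come from the text above; check_hook_research is a
# hypothetical helper, not a pipeline command.
check_hook_research() (
  episode_dir="$1"            # e.g. "$EPISODES_DIR/<slug>"
  missing=0
  for angle in A B C; do
    file="$episode_dir/research_hook_${angle}.md"
    if [ ! -s "$file" ]; then
      echo "missing or empty: $file" >&2
      missing=1
    fi
  done
  exit "$missing"             # body runs in a subshell, so this only ends the function
)
```

A call such as check_hook_research "$EPISODES_DIR/<slug>" returns non-zero and names the offending file when any of the three outputs is absent.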
Run the sub-steps below for each of A, B, C; the angles can be processed in parallel with one another.
Read references/shorts_bible.md Part 2 Commandment 4 (Jenny Hoyos framework). Write $EPISODES_DIR/<slug>/shorts/{A,B,C}/script.md:
---
episode: <NNN>
slug: <slug>
angle: A
format: short
length_strategy: loop-maximized | watch-maximized
loop_technique: visual-match-cut | narrative-callback | audio-continuity | cliffhanger-reversal | open-question-close
target_seconds: 20 # loop-max: 11-25; watch-max: 40-60
target_words: 50 # ~150 wpm
---
# <core idea, working title>
[VISUAL: first frame — peak anomaly, sound-off legible. THIS IS THE THUMBNAIL.]
Hook line. Power word. ≤10 words. Sound-off legible.
[VISUAL: foreshadowing visual]
Foreshadow line. Plants the ending. ≤2 lines, ≤3 seconds combined.
[VISUAL: body beat 1]
Body paragraph 1. "But / therefore" causation. 5th-grade readability.
[VISUAL: body beat 2]
Body paragraph 2.
[VISUAL: payoff frame — same as first frame for visual-match-cut loop, OR ending that recontextualizes the opening]
Payoff line. NO trailing second. NO sign-off. Cut on the punchline.
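The frontmatter ties target_words to target_seconds through the ~150 wpm rate: 50 words × 60 / 150 = 20 seconds. A rough way to check a draft against that budget is sketched below. It assumes the frontmatter is delimited by --- lines and that [VISUAL: ...] cues and the # title are not spoken; the function names spoken_words and estimate_seconds are hypothetical.

```shell
# Sketch: estimate the spoken duration of a script.md at ~150 wpm.
# Assumptions: frontmatter sits between two '---' lines; '[VISUAL: ...]'
# cues and '#' headings are not narrated. Both function names are hypothetical.
spoken_words() {
  awk '
    /^---[ \t]*$/ { fences++; next }   # frontmatter delimiter
    fences == 1   { next }             # inside frontmatter: skip
    /^\[VISUAL:/  { next }             # visual cue: not spoken
    /^#/          { next }             # working title: not spoken
    { count += NF }                    # everything else is narration
    END { print count + 0 }
  ' "$1"
}

estimate_seconds() {
  # words * 60 / wpm, rounded down; wpm defaults to 150
  echo $(( $1 * 60 / ${2:-150} ))
}
```

For example, estimate_seconds "$(spoken_words script.md)" gives the approximate runtime to compare against target_seconds.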
Rules:
Run the full two-phase protocol from references/shorts_script_judge_rubric.md against each angle's script.md BEFORE kicking off audio or images. A Gate 10 coherence fail is free to fix here (rewrite); caught at Stage 6 it costs a full audio + image + ZapCap re-roll.
Per-angle — two sequential Agent calls:
Phase 0 (naive cold read). Agent receives ONLY the script path. No Bible, no rubric, no packages, no research. Prompt: "You are watching this video for the first time. Answer the five cold-reader questions from shorts_script_judge_rubric.md Phase 0. Do not read any other file." Save the 300–400 word report to /tmp/<slug>_<angle>_naive_read.md.
Phase A+B (rubric grading). Separate Agent. Receives: Bible, rubric, script, Phase 0 report, and the angle's packages.md entry. Images and video are not rendered yet, so the Phase 0 report is load-bearing evidence for Gate 10 and for hook-vs-body coherence in Phase B. Visual gates (1, 6, 7, 8, 9) are marked "pending render"; Stage 6 re-verifies them.
Decision:
No user checkpoint — the judge is authoritative on coherence. Only escalate on <5/10 (multiple fails) or if the rewrite loop doesn't converge in two iterations.
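The routing after a judge run can be sketched as a small function. This is an illustration only: the inputs (a pass/fail verdict, a score out of 10, and the number of rewrites already attempted) are assumed shapes, the rubric file remains the authority on what counts as a pass, and next_step is a hypothetical name.

```shell
# Sketch of the routing after a judge run (hypothetical helper).
#   $1 verdict: "pass" or "fail"
#   $2 score out of 10
#   $3 rewrites already attempted for this angle
next_step() {
  verdict="$1"; score="$2"; rewrites="$3"
  if [ "$verdict" = "pass" ]; then
    echo "audio"       # auto-advance, no user checkpoint
  elif [ "$score" -lt 5 ] || [ "$rewrites" -ge 2 ]; then
    echo "escalate"    # below 5/10, or the rewrite loop did not converge
  else
    echo "rewrite"     # fix the script, then re-run both judge phases
  fi
}
```

Under these assumptions, a failing 6/10 script on its first attempt routes to a rewrite, while the same score after two rewrites escalates to the user.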
ve-audio <slug> --format short --angle A
Reads shorts/A/script.md, writes shorts/A/audio.mp3 + shorts/A/audio_timings.json. Uses ElevenLabs by default (voice from script frontmatter or pipeline/config.toml).
ve-images <slug> --format short --angle A
Outputs into shorts/A/images/. Style preset stays the same as the rest of the channel (typically editorial_illustration).
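The two commands above are shown for angle A only. One way to cover all three angles is to generate the command lines in a loop, as in the sketch below. The ve-audio and ve-images invocations are copied from the text; the wrapper name angle_commands is hypothetical, and the function only prints the commands so they can be reviewed before running.

```shell
# Sketch: print the audio and image commands for every angle.
# angle_commands is a hypothetical wrapper; it prints, it does not execute.
angle_commands() {
  slug="$1"
  for angle in A B C; do
    echo "ve-audio $slug --format short --angle $angle"
    echo "ve-images $slug --format short --angle $angle"
  done
}
```

Piping the output to sh (angle_commands my-slug | sh) would run the six commands in order.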
The in-video first frame handles the in-feed scroll-stop. Covers handle the profile grid, Search, and browse surfaces. Different jobs, different images.
# Reels profile grid (3:4 tall, since Jan 2025)
ve-thumbnail <slug> \
--aspect 3:4 \
--brief "<cover_concept from packages.md>" \
--text "<2-5 word overlay derived from hook>" \
--text-position center-bottom --text-color yellow --text-stroke black \
--out shorts/A/covers/cover_reels.png
# YouTube Shorts mobile profile (9:16)
ve-thumbnail <slug> \
--aspect 9:16 \
--brief "<cover_concept>" \
--text "<overlay>" \
--text-position center-bottom --text-color yellow --text-stroke black \
--out shorts/A/covers/cover_shorts.png
# TikTok profile grid (9:16)
ve-thumbnail <slug> \
--aspect 9:16 \
--brief "<cover_concept>" \
--text "<overlay>" \
--text-position center-bottom --text-color yellow --text-stroke black \
--out shorts/A/covers/cover_tiktok.png
cover_shorts.png and cover_tiktok.png can often be the same render (both 9:16); produce both filenames so per-platform overrides work later.
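When the two 9:16 covers are meant to be identical, the second render can be skipped by copying the first. The sketch below assumes the covers directory layout described above; mirror_cover is a hypothetical helper, and it leaves an existing cover_tiktok.png alone so a per-platform override is never clobbered.

```shell
# Sketch: reuse the 9:16 Shorts render as the TikTok cover.
# mirror_cover is a hypothetical helper, not a pipeline command.
mirror_cover() {
  covers_dir="$1"                        # e.g. shorts/A/covers
  src="$covers_dir/cover_shorts.png"
  dst="$covers_dir/cover_tiktok.png"
  [ -f "$src" ] || { echo "missing $src" >&2; return 1; }
  [ -f "$dst" ] || cp "$src" "$dst"      # keep an existing override untouched
}
```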
Cover overlay text rules (per references/judge_rubric.md Scroll-Stop Visual Power):
ve-assemble <slug> --format short --angle A
Outputs shorts/A/video.mp4 (captioned) and shorts/A/video_nocaptions.mp4 (raw). Requires ZAPCAP_API_KEY — verified by ve-doctor. On missing key, ve-assemble hard-fails before ffmpeg burn (exit 2). On a caption error mid-run, video_nocaptions.mp4 is preserved; a re-run of the same command skips ffmpeg and retries captioning only.
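The failure modes described above can be anticipated from the shell before invoking ve-assemble. The following sketch uses two hypothetical helpers: require_zapcap_key mirrors the exit code 2 that the text attributes to a missing key, and assemble_state infers from the output files whether a re-run would only retry captioning. Neither inspects ve-assemble itself; both rely solely on the behavior stated above.

```shell
# Sketch: pre-flight checks around ve-assemble (both helpers are hypothetical).

# Return 2 when the caption key is absent, matching the documented exit code.
require_zapcap_key() {
  if [ -z "${ZAPCAP_API_KEY:-}" ]; then
    echo "ZAPCAP_API_KEY is not set; run ve-doctor" >&2
    return 2
  fi
}

# Classify an angle directory by which outputs already exist.
assemble_state() {
  dir="$1"                                   # e.g. shorts/A
  if [ -f "$dir/video.mp4" ]; then
    echo "done"
  elif [ -f "$dir/video_nocaptions.mp4" ]; then
    echo "retry-captions"                    # raw render kept; re-run retries captioning only
  else
    echo "not-started"
  fi
}
```

A guarded invocation could then read: require_zapcap_key && ve-assemble <slug> --format short --angle A.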
End state per angle: shorts/A/script.md, audio.mp3, audio_timings.json, images/, video.mp4, video_nocaptions.mp4, covers/cover_reels.png, covers/cover_shorts.png, covers/cover_tiktok.png.
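The end-state list translates directly into a file check. The sketch below is a hypothetical helper (missing_artifacts is not a pipeline command) that prints every artifact from the list above that is absent from one angle directory and returns non-zero if anything is missing.

```shell
# Sketch: list end-state artifacts still missing for one angle.
# missing_artifacts is a hypothetical helper; the paths come from the list above.
missing_artifacts() (
  angle_dir="$1"                 # e.g. "$EPISODES_DIR/<slug>/shorts/A"
  status=0
  for rel in script.md audio.mp3 audio_timings.json images \
             video.mp4 video_nocaptions.mp4 \
             covers/cover_reels.png covers/cover_shorts.png \
             covers/cover_tiktok.png; do
    if [ ! -e "$angle_dir/$rel" ]; then
      echo "$rel"
      status=1
    fi
  done
  exit "$status"                 # subshell body: ends the function only
)
```

Running it once per angle (A, B, C) gives a quick readiness report before moving on.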
When all 3 angles have shorts/<angle>/video.mp4 + 3 covers, hand off to ve-publish-shorts — it runs Stage 6 (visual gate) then Stage 7 (publish to YT/IG/TikTok).
npx claudepluginhub robertnowell/video-essay --plugin video-essay