From super-publisher
Creates scripted tutorial videos with narration, TTS, aligned subtitles, and MP4 rendering. Use for tutorials, explainers, demos, or short-form videos.
Slash command
/super-publisher:video-tutorial-maker
Use this skill to turn a topic, outline, script, or existing video into a finished tutorial video package.
Default assumption: produce a 16:9 landscape MP4 first. Create 9:16 or other platform variants only when requested or clearly useful.
1. Define the deliverable.
   - Default: 1920x1080 landscape video.
2. Write the video package.
3. Generate speech.
   - Prefer edge-tts when available because it is free and more natural than basic system voices.
   - Default voice: zh-CN-XiaoxiaoNeural; adjust voice or rate only if the user asks or the result is clearly poor.
4. Build subtitles from the same timeline as audio.
   - Handle both 00:00:01.234 and 00:00:01,234 timestamp formats.
   - NaN.
5. Render video.
6. Produce platform variants.
   - 1920x1080, SAR 1:1, DAR 16:9.
7. Verify before delivery.
   - Use ffprobe to confirm width, height, aspect ratio, streams, duration, and file size.

Use a project virtualenv if one exists:
.venv/bin/edge-tts --voice zh-CN-XiaoxiaoNeural --rate +0% \
--file scene-01.txt \
--write-media scene-01.mp3 \
--write-subtitles scene-01.vtt
If network or TTS fails, report the exact blocker and either retry after permission is granted or fall back to a clearly labeled system TTS draft.
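The subtitle step in the workflow above mentions two timestamp spellings: the dot form used by WebVTT and the comma form used by SRT. A minimal sketch of a parser that accepts both is shown here. The function names are hypothetical, and it assumes the hours field is always present and that anything unparsable should raise an error rather than flow into the timeline as a bad value.

```python
import re

# HH:MM:SS followed by "." (WebVTT) or "," (SRT) and three millisecond digits.
# Assumption: the hours field is always present, as in the examples above.
_TIMESTAMP = re.compile(r"(\d{2,}):([0-5]\d):([0-5]\d)[.,](\d{3})")


def parse_timestamp_ms(text: str) -> int:
    """Return the timestamp as whole milliseconds; raise on anything else."""
    match = _TIMESTAMP.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized timestamp: {text!r}")
    hours, minutes, seconds, millis = (int(part) for part in match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def format_srt_timestamp(total_ms: int) -> str:
    """Render whole milliseconds in the comma-separated SRT form."""
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"
```

Working in integer milliseconds keeps the subtitle arithmetic exact, which matters when cue times are later shifted by scene offsets.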
The most common drift bug is this:
scene video duration = raw audio duration + pause
concatenated audio = raw audio duration only
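This creates cumulative drift: every scene adds one pause of video that the audio track lacks, so the audio runs further ahead of the picture with each scene. Fix it by padding each scene's audio to the corresponding scene video duration: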
ffmpeg -y -i scene-01.mp3 \
-af "apad,atrim=0:SCENE_DURATION" \
-ar 24000 -ac 1 -c:a pcm_s16le scene-01-padded.wav
Concatenate scene-XX-padded.wav files, then mux that audio with the video.
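To make the drift arithmetic concrete, the sketch below computes where each scene starts on the video timeline versus on an unpadded audio concatenation, plus the padded duration each scene needs. The function name and the dictionary keys are hypothetical, and it assumes the same pause follows every scene.

```python
def scene_timeline(audio_ms, pause_ms):
    """Compare video and unpadded-audio start times for each scene.

    audio_ms: raw narration length of each scene, in milliseconds.
    pause_ms: pause appended to every scene's video (assumed constant).
    """
    rows = []
    video_start = 0
    raw_audio_start = 0
    for duration in audio_ms:
        rows.append({
            "video_start": video_start,
            "raw_audio_start": raw_audio_start,
            # How far the unpadded audio runs ahead of the picture.
            "drift": video_start - raw_audio_start,
            # Target length for the padded scene audio.
            "padded_duration": duration + pause_ms,
        })
        video_start += duration + pause_ms
        raw_audio_start += duration
    return rows


for row in scene_timeline([4000, 5500, 3000], pause_ms=500):
    print(row)
```

With these illustrative inputs the drift grows by one pause per scene (0, 500, then 1000 ms). Dividing `padded_duration` by 1000 gives the value to substitute for SCENE_DURATION in the padding command.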
When the source is 16:9 and content must not be cropped, build the 9:16 variant by centering the full frame over a blurred, darkened copy of itself:
ffmpeg -y -i input.mp4 \
-filter_complex "[0:v]split=2[bgsrc][fgsrc];\
[bgsrc]scale=1080:1920:force_original_aspect_ratio=increase,crop=1080:1920,\
gblur=sigma=32,eq=brightness=-0.12:saturation=0.75[bg];\
[fgsrc]scale=1080:-2[fg];\
[bg][fg]overlay=(W-w)/2:(H-h)/2,format=yuv420p,setsar=1[v]" \
-map "[v]" -map 0:a:0 -map '0:s?' \
-c:v libx264 -preset veryfast -crf 20 \
-c:a copy -c:s mov_text -movflags +faststart output-9x16.mp4
After conversion, verify:
ffprobe -v error \
-show_entries stream=index,codec_type,codec_name,width,height,sample_aspect_ratio,display_aspect_ratio,duration \
-show_entries format=duration,size \
-of default=noprint_wrappers=1 output-9x16.mp4
Expected video values for this 9:16 variant:
width=1080
height=1920
sample_aspect_ratio=1:1
display_aspect_ratio=9:16
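The comparison against the expected values can be automated. The sketch below takes the text printed by the ffprobe command above and reports any video-stream fields that differ. The helper names are hypothetical, and it assumes `index` is the first key printed for each stream, so a new `index=` line marks a new stream.

```python
EXPECTED_9X16 = {
    "width": "1080",
    "height": "1920",
    "sample_aspect_ratio": "1:1",
    "display_aspect_ratio": "9:16",
}


def split_streams(ffprobe_text):
    """Group key=value lines into one dict per stream.

    Assumption: each stream's block begins with an `index=` line. Format-level
    keys printed after the last stream end up in that last dict; this sketch
    ignores them because it only checks video geometry.
    """
    streams = []
    for line in ffprobe_text.splitlines():
        key, sep, value = line.strip().partition("=")
        if not sep:
            continue  # skip blank or malformed lines
        if key == "index":
            streams.append({})
        if streams:
            streams[-1][key] = value
    return streams


def video_mismatches(ffprobe_text, expected=EXPECTED_9X16):
    """Return {field: (expected, actual)} for the first video stream."""
    video = next(
        (s for s in split_streams(ffprobe_text) if s.get("codec_type") == "video"),
        None,
    )
    if video is None:
        return {"codec_type": ("video", None)}
    return {
        key: (want, video.get(key))
        for key, want in expected.items()
        if video.get(key) != want
    }
```

An empty result means the video stream matches the expected 9:16 values. Any entry in the result names a field to investigate before delivery.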
In the final answer, provide the absolute path to the finished MP4 and mention the verification actually run. If there are multiple outputs, identify the recommended one first, usually the 16:9 file.
npx claudepluginhub guanyang/super-publisher --plugin super-publisher