From video-editing
Turn an audio file into a music video by compositing a waveform / spectrum / circular-CQT animation over cover art or a background. Templates (waveform-bottom, spectrum-bars, circular-cqt-cover, etc.) are saved in preferences.json so the user can pick by name. Use when the user says "make a music video from this audio", "audio to video with waveform", "render this song with a visualizer", "spectrum animation over cover art".
Slash command
/video-editing:audio-to-music-video
Single-pass ffmpeg synthesis of a music video from an audio file plus a chosen template. The audio is pristine (stream-copied where the container allows) and the visual is built entirely with ffmpeg's `-filter_complex` graph — no external editor needed.
Templates are stored in preferences.json under audio_to_video_templates and selected by name. Each template defines: visualizer type, layout, colour palette, optional cover-art treatment, and resolution. Built-in defaults ship in this skill's prose; users add their own via onboard or by editing preferences.json directly.
| Name | Layout | Visualizer |
|---|---|---|
| waveform-bottom | full-frame blurred cover art bg + crisp cover art centered + linear waveform strip across bottom 15% | showwaves (mode=cline) |
| spectrum-bars | solid colour bg + cover art top-center + spectrum bars bottom 40% | showfreqs (mode=bar, ascale=log) |
| circular-cqt-cover | blurred cover bg + circular spectrum (showcqt) ring around centered cover | showcqt |
| vector-scope | dark bg + cover art top-left + Lissajous vector scope center | avectorscope |
| volume-meter | minimal: solid bg + cover art + horizontal volume meter | showvolume |
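Selecting a template by name could be sketched roughly as below. This is only an illustration: it assumes jq is installed, and `find_template` is a hypothetical helper name, not something the skill defines.

```shell
# Sketch: look up one template object by name in preferences.json.
# find_template is a hypothetical helper; jq is assumed to be available.
find_template() {
  local prefs="$1" name="$2"
  # -e: non-zero exit status when nothing matches; -c: compact one-line JSON.
  jq -ec --arg n "$name" \
    'first(.audio_to_video_templates[]? | select(.name == $n))' "$prefs"
}

# Usage (commented out so the sketch has no side effects):
# tpl=$(find_template "$PREFS" "$NAME") || tpl=$(find_template "$PREFS" waveform-bottom)
```

The fallback in the usage line mirrors the documented default of waveform-bottom when the requested name is not registered.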
A template entry:
{
  "name": "waveform-bottom",
  "resolution": "1920x1080",
  "fps": 30,
  "bg": { "type": "cover-blur", "blur": 40, "darken": 0.3 },
  "cover": { "size": "60%", "position": "center", "shadow": true },
  "viz": {
    "filter": "showwaves",
    "mode": "cline",
    "size": "1920x162",
    "color": "white",
    "position": "bottom",
    "opacity": 0.9
  }
}
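The viz size in the entry above follows from the layout: a strip across the bottom 15% of a 1080-pixel frame is 162 pixels tall. A minimal sketch of that arithmetic, assuming the resolution is always written as WIDTHxHEIGHT (`viz_size` is a hypothetical helper name):

```shell
# Sketch: derive the visualizer strip size from a resolution and a percentage
# of the frame height. viz_size is a hypothetical helper, not part of the skill.
viz_size() {
  local res="$1" pct="$2"
  local w="${res%x*}"   # text before the "x"
  local h="${res#*x}"   # text after the "x"
  printf '%sx%s\n' "$w" $(( h * pct / 100 ))
}

viz_size 1920x1080 15   # prints 1920x162 (waveform-bottom strip)
viz_size 1920x1080 40   # prints 1920x432 (spectrum-bars area)
```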
| Field | Default |
|---|---|
| Audio | required (.mp3, .wav, .flac, .m4a) |
| Cover art | optional path; if omitted and the audio has embedded cover art, extract it; else use bg.solid colour |
| Template | name from preferences.audio_to_video_templates; default waveform-bottom |
| Title / Artist | optional drawtext overlay |
| Output | <basename>.mv.mp4 |
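The default output name can be derived from the audio path with plain parameter expansion. A small sketch, where `mv_output_path` is a hypothetical helper name:

```shell
# Sketch: build the default <basename>.mv.mp4 output path next to the audio file.
# mv_output_path is a hypothetical helper, not part of the skill.
mv_output_path() {
  local audio="$1" dir base
  # Keep the directory part (if any) so the output lands beside the input.
  case "$audio" in
    */*) dir="${audio%/*}/" ;;
    *)   dir="" ;;
  esac
  base="${audio##*/}"
  # Strip only the final extension, and only when the file name has one.
  case "$base" in
    ?*.*) base="${base%.*}" ;;
  esac
  printf '%s%s.mv.mp4\n' "$dir" "$base"
}

mv_output_path "path/to/song.mp3"   # prints path/to/song.mv.mp4
```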
If cover art is not supplied, try to extract the embedded image from the audio file:
ffmpeg -hide_banner -y -i "$AUDIO" -an -vcodec copy "$WORK/cover.jpg" 2>/dev/null \
|| COVER_MISSING=1
If still missing, use a plain colour from bg.solid (or fall back to #111111).
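One possible way to realise the plain-colour fallback is ffmpeg's lavfi `color` source, passed as an input with `-f lavfi -i ...` in place of the looped cover image. The sketch below only builds the source string; `solid_bg_source` is a hypothetical helper, and whether the skill wires the fallback this way is an assumption.

```shell
# Sketch: build a lavfi "color" source string for the solid-background fallback.
# solid_bg_source is a hypothetical helper; the skill may do this differently.
solid_bg_source() {
  local hex="$1" size="$2" fps="$3"
  # Fall back to the documented default colour when none is configured.
  [ -n "$hex" ] || hex='#111111'
  # Drop a leading "#" and use ffmpeg's 0xRRGGBB colour notation.
  printf 'color=c=0x%s:s=%s:r=%s\n' "${hex#\#}" "$size" "$fps"
}

solid_bg_source '#111111' 1920x1080 30   # prints color=c=0x111111:s=1920x1080:r=30
```

Usage would look something like `ffmpeg -f lavfi -i "$(solid_bg_source "$BG_SOLID" 1920x1080 30)" -i "$AUDIO" ...`, where `BG_SOLID` is an assumed variable holding the bg.solid value.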
waveform-bottom:
WIDTH=1920; HEIGHT=1080; FPS=30
COVER_W=$((WIDTH * 60 / 100)) # 60% width centered
ffmpeg -hide_banner -y \
-loop 1 -i "$COVER" \
-i "$AUDIO" \
-filter_complex "
[0:v]scale=${WIDTH}:${HEIGHT}:force_original_aspect_ratio=increase,
crop=${WIDTH}:${HEIGHT},
boxblur=40:1,
eq=brightness=-0.3[bg];
[0:v]scale=${COVER_W}:-1[cov];
[bg][cov]overlay=x=(W-w)/2:y=(H-h)/2[bgcov];
[1:a]showwaves=mode=cline:s=${WIDTH}x162:colors=white:rate=${FPS}[wf];
[bgcov][wf]overlay=x=0:y=H-h-40:format=auto[v]
" \
-map "[v]" -map 1:a \
-c:v libx264 -crf 20 -preset medium -pix_fmt yuv420p \
-c:a aac -b:a 192k \
-shortest \
-movflags +faststart \
"$OUT"
For spectrum-bars, replace the visualizer chain:
[1:a]showfreqs=s=${WIDTH}x432:mode=bar:ascale=log:fscale=log:colors=cyan|magenta:rate=${FPS}[viz];
[bgcov][viz]overlay=x=0:y=H-h-60:format=auto[v]
For circular-cqt-cover, replace the visualizer chain:
[1:a]showcqt=s=${WIDTH}x${HEIGHT}:fps=${FPS}:bar_h=300:axis_h=0:cscheme=1|0.5|0|0|0.5|1[viz];
[bgcov][viz]overlay=0:0:format=auto:enable='gte(t,0)'[v]
showcqt is the most musical-looking visualizer — it maps frequency on a constant-Q logarithmic scale, so each octave gets equal screen real estate. Heavier on CPU than showwaves.
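Because only the visualizer chain changes between these templates, the choice can be driven by a `case` statement. A minimal sketch using the filter strings shown above; `viz_chain` is a hypothetical helper, the output label is normalised to `[viz]` for all three templates (the waveform example uses `[wf]`), and the fixed 162 and 432 heights assume a 1080-pixel frame.

```shell
# Sketch: print the visualizer filter chain for a named template.
# viz_chain is a hypothetical helper; filter options are copied from the
# examples in this document.
viz_chain() {
  local template="$1" width="$2" height="$3" fps="$4"
  case "$template" in
    waveform-bottom)
      printf '[1:a]showwaves=mode=cline:s=%sx162:colors=white:rate=%s[viz]\n' \
        "$width" "$fps" ;;
    spectrum-bars)
      printf '[1:a]showfreqs=s=%sx432:mode=bar:ascale=log:fscale=log:colors=cyan|magenta:rate=%s[viz]\n' \
        "$width" "$fps" ;;
    circular-cqt-cover)
      printf '[1:a]showcqt=s=%sx%s:fps=%s:bar_h=300:axis_h=0:cscheme=1|0.5|0|0|0.5|1[viz]\n' \
        "$width" "$height" "$fps" ;;
    *)
      echo "unknown template: $template" >&2
      return 1 ;;
  esac
}

viz_chain spectrum-bars 1920 1080 30
```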
Add a drawtext after the visualizer overlay:
[v]drawtext=fontfile='/abs/path/Inter-Bold.ttf':text='${TITLE}':fontsize=64:fontcolor=white:x=80:y=80:shadowcolor=black:shadowx=2:shadowy=2,
drawtext=fontfile='/abs/path/Inter.ttf':text='${ARTIST}':fontsize=36:fontcolor=white:x=80:y=160:shadowcolor=black:shadowx=2:shadowy=2[v]
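Titles that contain quotes, colons, or percent signs can break an inline `text='...'` value. One way around this, sketched below, is to write the text to a file and pass it through drawtext's `textfile` option with `expansion=none`. `title_filter` is a hypothetical helper, and the sketch assumes the text file path itself contains no characters that need filtergraph escaping.

```shell
# Sketch: emit a drawtext filter that reads the title from a file instead of
# embedding it in the filtergraph. title_filter is a hypothetical helper.
title_filter() {
  local title="$1" textfile="$2"
  # Store the raw title; no filtergraph escaping is applied to file contents.
  printf '%s' "$title" > "$textfile"
  # expansion=none stops drawtext from interpreting % sequences in the text.
  printf "drawtext=fontfile='/abs/path/Inter-Bold.ttf':textfile='%s':expansion=none:fontsize=64:fontcolor=white:x=80:y=80\n" \
    "$textfile"
}

# Usage (commented out so the sketch writes no files on its own):
# TITLE_FILTER=$(title_filter "$TITLE" "$WORK/title.txt")
```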
For stable colour and broad compatibility, default to libx264 + pix_fmt yuv420p. The visualizer filters generate full-range RGB — without the pix_fmt clamp some players show washed-out colours.
If the user has a render profile, offer to use it, but warn that the filtergraph already runs on CPU and a GPU encoder won't speed up rendering by much.
If the user wants to register their own (different colours, different cover crop, different visualizer):
jq '.audio_to_video_templates += [$tpl]' --argjson tpl "$NEW_TPL" "$PREFS" \
> "$PREFS.tmp" && mv "$PREFS.tmp" "$PREFS"
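The append above adds a second entry if a template with the same name is already registered. A hedged variant that replaces an existing entry by name might look like this; `register_template` is a hypothetical helper and jq is assumed to be installed.

```shell
# Sketch: add a template, replacing any existing entry with the same name.
# register_template is a hypothetical helper, not part of the skill.
register_template() {
  local prefs="$1" tpl="$2"
  jq --argjson tpl "$tpl" '
    .audio_to_video_templates =
      ((.audio_to_video_templates // [])
       | map(select(.name != $tpl.name)) + [$tpl])
  ' "$prefs" > "$prefs.tmp" && mv "$prefs.tmp" "$prefs"
}

# Usage (commented out so the sketch has no side effects):
# register_template "$PREFS" "$NEW_TPL"
```

Writing to a temporary file and moving it into place keeps preferences.json intact if jq fails on malformed input.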
onboard should be extended to walk new users through registering at least one template.
Audio : path/to/song.mp3 (4:32, 320 kbps)
Cover : embedded → extracted (1400x1400)
Template : waveform-bottom (1920x1080 @ 30fps)
Output : path/to/song.mv.mp4 (28 MB, 4:32)
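A report in this shape can be produced with `printf`. The sketch below shows the two pieces involved, formatting a duration in whole seconds as m:ss and printing a padded label; both function names are hypothetical, and obtaining the duration itself (for example from ffprobe) is left out.

```shell
# Sketch: helpers for the summary report. Both names are hypothetical.
fmt_duration() {
  local secs="$1"
  # Whole seconds to minutes:seconds with a zero-padded seconds field.
  printf '%d:%02d\n' $(( secs / 60 )) $(( secs % 60 ))
}

report_line() {
  # Left-align the label in a 9-character column so the colons line up.
  printf '%-9s: %s\n' "$1" "$2"
}

report_line Audio "path/to/song.mp3 ($(fmt_duration 272), 320 kbps)"
report_line Template "waveform-bottom (1920x1080 @ 30fps)"
```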
showcqt is ~5–10× slower than showwaves.
karaoke-video: combine them for a "make a karaoke video from this MP3" flow. Run audio-to-music-video to synthesise a base video, then karaoke-video to add the lyric overlay (skip the demucs step in that case, since the audio is already stem-friendly or already instrumental).
loop 1 -i $COVER input.
npx claudepluginhub danielrosehill/claude-code-plugins --plugin video-editing