video-recap-skills

Name: video-recap-skills
Author: worldwonderer


Automatically produce Chinese-narration video recaps from any video file using AI video understanding, narration script generation, scene cutting, TTS voiceover synthesis, and final assembly. Requires ffmpeg and a MiMo API key.

npx claudepluginhub worldwonderer/video-recap-skills

Popularity

Stars: 283 (top 5%; Med: 0 · Avg: 285)
Installs: Med: 0 · Avg: 1

What's Inside

Skills: 6

video-assemble

/video-assemble

Assemble a final recap video: mux narration audio over the source video, duck the original audio under the narration, render subtitles (SRT/ASS, optionally burned in), and loudness-normalize. Use as the last stage of the video-recap bundle. Consumes the source video + tts_meta.json (+ narration placement); produces recap_<name>.mp4 + subtitles.srt/.ass. Trigger words: 视频合成, 混音, 字幕, 压字幕, assemble video, mux, ducking, subtitles, 成片.

video-cut

/video-cut

Cut a long video down to selected source ranges (montage / clip assembly). Part of the video-recap bundle: in the orchestrated (two-pass) flow, it consumes clip_plan.json + the source video and produces edited_source.mp4; the agent then writes narration.json against the output timeline. When invoked standalone WITHOUT --no-narration-map, it also remaps an existing narration.json → narration_mapped.json (legacy single-pass path). Trigger words: 视频剪辑, 剪辑式解说, video cut, clip plan, 拼剪.
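The remapping on the legacy single-pass path comes down to translating source timestamps onto the cut timeline. The following is a minimal sketch, assuming the clip plan reduces to an ordered list of (start, end) source ranges in seconds. The actual clip_plan.json schema and the skill's own remapping logic are not shown in this listing, so the function name and data shape are hypothetical.

```python
from typing import Optional

# Hypothetical data shape: kept source ranges as (start, end) seconds,
# listed in the order they appear in the cut video.
Range = tuple[float, float]


def source_to_output(t: float, kept: list[Range]) -> Optional[float]:
    """Map a source timestamp to the cut timeline, or None if it was cut out."""
    offset = 0.0  # total duration of the clips placed before the current one
    for start, end in kept:
        if start <= t < end:
            return offset + (t - start)
        offset += end - start
    return None


kept = [(10.0, 20.0), (45.0, 50.0)]
print(source_to_output(12.5, kept))  # inside the first clip
print(source_to_output(46.0, kept))  # inside the second clip
print(source_to_output(30.0, kept))  # falls in removed material
```

Writing narration directly against the cut video, as the two-pass flow does, avoids this step entirely, which is why the orchestrated path needs no remapping.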

video-recap

/video-recap

Generate a Chinese-narration recap video from an input video, end to end. Use when the user gives a video file (.mp4 / .mov / .mkv / .webm) and asks to add narration, generate voiceover, dub, summarize, or produce a recap (short drama / TV series / film / documentary / popular science). Orchestrates the video-* skill bundle: understanding → (agent writes narration) → cut → voiceover → assemble. Trigger words: 视频解说, 视频旁白, 生成解说, 视频recap, video recap, voiceover, narration, auto-dub, recap.

video-script

/video-script

Write a timestamped Chinese narration script (commentary / voiceover) for an already-analyzed video, then lint/validate it. Use after video-understanding has produced agent_narration_brief.md + vlm_analysis.json, when you need to author the recap narration (style, anti-hallucination, character-count formula, density, hook/throughline). Input: the understanding index in work_dir. Output: narration.json (validated). Trigger words: 解说词, 写解说, 视频旁白, narration script, 写稿, 解说文案.

video-understanding

/video-understanding

Analyze a video into a structured understanding index: scene detection, ASR transcript, per-scene visual (VLM) analysis, silence windows, a fused timeline, and a narration-writing brief. Use to understand / index / summarize what happens in a video, or as the first stage of the video-recap bundle before writing narration. Input: a video file. Output: scenes.json, asr_result.json, vlm_analysis.json, silence_periods.json, timeline_fusion.json, agent_narration_brief.md. Trigger words: 视频理解, 视频分析, 视频索引, video understanding, analyze video, 看懂视频.

Stats

Version: 0.2.2
Released: Jun 17, 2026
Language: Python
Stars: 283
Forks: 49
Maintenance: Excellent
License: MIT
Last commit: Jun 18, 2026
Added: Jun 14, 2026


README

video-recap-skills


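Turn a video into a Chinese-narration recap with a single sentence. Ask for it in Claude Code and the pipeline starts; locally you need only ffmpeg and a Xiaomi MiMo API key. No GPU, no model downloads, and it runs on macOS, Linux, and Windows.

Demo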

https://github.com/user-attachments/assets/92698ec6-0d23-4f9f-8825-c3684ef57aff

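Besides the finished video, you can export a Jianying (剪映) draft in one step for manual polishing, with the source video, narration, BGM, and subtitles each on its own track.

What it is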

flowchart LR
    research["Background research"] --> understand
    video(["Video"]) --> understand["Understand<br/>scenes·ASR·VLM"] --> script["Script<br/>Agent"] --> voiceover["Voiceover<br/>MiMo TTS"] --> assemble["Assemble<br/>mixing·subtitles"] --> output(["Recap"])
    understand -. cut mode cuts first then narrates .-> cut["Cut<br/>edit first"] -.-> script
    classDef io fill:#eef6ff,stroke:#4f86c6,color:#1f2937;
    classDef opt fill:#f3f4f6,stroke:#9ca3af,color:#374151;
    class video,output io;
    class research,cut opt;

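Why use it

One key for the whole run. ASR, VLM, and TTS all go through Xiaomi MiMo; locally there is no dependency other than ffmpeg.
Research first, then write. Before the run, look up the plot and characters and save them to background_research.json, so the VLM can tell who is who.
Narration comes in blocks, and so does the original audio. Narration runs as continuous passages, each block voiced in a single TTS pass; the gaps between blocks play the best stretches of original audio back at full volume. The split is roughly 70/30.
Cut first, then narrate, so the visuals never drift. --edit-mode cut first cuts the long video into the final edit, then the narration is written against that edit, so the timelines align naturally. Before the script is finalized, an LLM review pass checks for hallucinations, the hook, and the throughline.
Keep editing in Jianying. Optionally export a multi-track Jianying draft with the source video, narration, BGM, and subtitles each on its own track. Core rendering relies only on ffmpeg, so the final video renders without Jianying installed.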
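The block layout described above can be pictured as intervals on the timeline: narration blocks, with the gaps between them left to the original audio. The following is a minimal sketch, assuming narration blocks are non-overlapping (start, end) pairs in seconds. The function names, the data shape, and the sample numbers are illustrative only and are not taken from the project's code.

```python
def original_audio_windows(blocks, duration):
    """Return the gaps around narration blocks, where original audio can play."""
    windows = []
    cursor = 0.0  # end of the last narration block seen so far
    for start, end in sorted(blocks):
        if start > cursor:
            windows.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < duration:
        windows.append((cursor, duration))
    return windows


def narration_share(blocks, duration):
    """Fraction of the timeline covered by narration blocks."""
    return sum(end - start for start, end in blocks) / duration


# Made-up example: two narration blocks in a 100-second video.
blocks = [(0.0, 40.0), (55.0, 85.0)]
print(original_audio_windows(blocks, 100.0))
print(narration_share(blocks, 100.0))
```

Keeping narration in a few long blocks rather than many short lines leaves fewer, longer windows, which is what lets a whole stretch of original dialogue play uninterrupted.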

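Installation

① Install the plugin. Tell Claude Code:

Install this plugin: https://github.com/worldwonderer/video-recap-skills

② Install ffmpeg (the pipeline itself needs no pip install; the scripts use only the standard library plus ffmpeg on PATH, Python 3.10+):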

brew install ffmpeg                        # macOS
sudo apt install ffmpeg                     # Debian/Ubuntu
choco install ffmpeg                        # Windows (or scoop / winget install ffmpeg)

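Subtitles are burned into the picture by default, which requires an ffmpeg built with libass (the subtitles filter); the packages above generally include it. If your ffmpeg was built without libass, the run fails immediately at startup with a hint (alternatively, add --no-burn-subtitles to output an MP4 without the black subtitle bar plus an external .srt subtitle file). Run python3 scripts/recap.py --doctor for a self-check.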
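To check for the subtitles filter by hand, one option is to scan the filter list that ffmpeg prints. The following is a minimal sketch and is not the project's --doctor implementation; the function names are hypothetical. It relies only on the standard `ffmpeg -filters` listing, where each row is a flags column followed by the filter name.

```python
import shutil
import subprocess


def has_subtitles_filter(filters_output: str) -> bool:
    """Check `ffmpeg -filters` output for a filter named exactly 'subtitles'."""
    for line in filters_output.splitlines():
        parts = line.split()
        # Filter rows look like: " ... subtitles  V->V  Render text subtitles ..."
        if len(parts) >= 2 and parts[1] == "subtitles":
            return True
    return False


def ffmpeg_can_burn_subtitles() -> bool:
    """Return True if ffmpeg is on PATH and lists the subtitles filter."""
    if shutil.which("ffmpeg") is None:
        return False
    result = subprocess.run(
        ["ffmpeg", "-hide_banner", "-filters"],
        capture_output=True, text=True, check=False,
    )
    return has_subtitles_filter(result.stdout)


if __name__ == "__main__":
    print("subtitles filter available:", ffmpeg_can_burn_subtitles())
```

Separating the parsing from the subprocess call keeps the check easy to test without ffmpeg installed.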

③ Configure the MiMo API key (one key drives ASR, VLM, and TTS; put it in an environment variable and never commit it to the repository):

export MIMO_API_KEY=your-mimo-key
# tp-* Token-Plan keys connect to a cluster automatically; optional values: cn | sgp | ams
export MIMO_TOKEN_PLAN_CLUSTER=cn

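Pay-as-you-go sk-* keys default to https://api.xiaomimimo.com/v1. Everything else has defaults; to configure keys/URLs separately or to change the model, voice, loudness, subtitles, and so on, see the configuration manual.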
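The key handling described above can be summarized as a small classification step. The following is a minimal sketch, assuming only what this README states: the two key prefixes, the three cluster names, and the pay-as-you-go default URL. The helper name and the returned dictionary are hypothetical, rejecting other prefixes is an assumption, and the Token-Plan cluster endpoints are not listed here, so the sketch does not guess them.

```python
import os

DEFAULT_PAYG_BASE_URL = "https://api.xiaomimimo.com/v1"  # stated in the README
VALID_CLUSTERS = {"cn", "sgp", "ams"}                      # stated in the README


def describe_mimo_config(env=os.environ):
    """Classify the configured key. Hypothetical helper, not the project's code."""
    key = env.get("MIMO_API_KEY", "")
    if not key:
        raise ValueError("MIMO_API_KEY is not set")
    if key.startswith("tp-"):
        cluster = env.get("MIMO_TOKEN_PLAN_CLUSTER", "")
        if cluster and cluster not in VALID_CLUSTERS:
            raise ValueError(
                f"unknown cluster {cluster!r}; expected one of {sorted(VALID_CLUSTERS)}"
            )
        # Cluster endpoints are not documented here, so no URL is guessed.
        return {"plan": "token-plan", "cluster": cluster or None, "base_url": None}
    if key.startswith("sk-"):
        return {"plan": "pay-as-you-go", "cluster": None,
                "base_url": DEFAULT_PAYG_BASE_URL}
    # Assumption: any other prefix is treated as a configuration error.
    raise ValueError("unrecognized key prefix; expected sk-* or tp-*")
```

Calling `describe_mimo_config()` with no arguments reads the real environment; passing a plain dict makes it easy to try different combinations.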

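Usage

Hand it a video along with a bit of background:

Narrate /path/to/video.mp4. This is episode 1 of 《庆余年》 (Joy of Life); the protagonist is Fan Xian (范闲).

It analyzes the video, writes narration informed by the background, and produces a subtitled recap_<name>.mp4. For other variations, one sentence is still enough:

Cut /path/to/long.mp4 into a narrated short of about ten minutes, with subtitles burned in.

Behind the scenes, an orchestrator chains the stages together and pauses midway so the Agent can write the narration (cut mode pauses twice: first to write clip_plan.json selecting the clips, then, once the edit has been cut, to write narration.json against that edit). Before the first run, check your environment: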

python3 skills/video-recap/scripts/recap.py --doctor

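Architecture

Skill	Responsibility	Input → Output (`work_dir` contract)
video-understanding	Scene detection · frame extraction · ASR (`mimo-v2.5-asr`) · VLM (`mimo-v2.5`) · timeline fusion · brief generation (`--consolidate` index on by default)	`video` → `scenes / asr_result / vlm_analysis / silence_periods / timeline_fusion / agent_narration_brief.md`
video-script	Writing rules (SKILL.md) + review (LLM judge) + lint/validation	`brief + index` → `narration.json`
video-cut	Clip plan → cut and splice the edit (cut mode cuts first and narrates after; narration is written on the edit's timeline, so no remapping is needed)	`clip_plan.json + video` → `edited_source.mp4`
video-voiceover	Synthesize narration audio (MiMo TTS, `mimo-v2.5-tts`)	`narration.json` → `tts_segments/ + tts_meta.json`
video-assemble	Mixing · ducking the original audio · subtitle rendering · multi-track timeline (optional Jianying export)	`video + tts_meta` → `recap_<name>.mp4 + subtitles.srt/.ass + timeline.json`
video-recap	Orchestrator + `--doctor`	`video` → `recap_<name>.mp4`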
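Because each stage communicates through files in work_dir, checking which stage has finished reduces to checking which files exist. The following is a minimal sketch built from the file names in the table above. Grouping them into a dictionary, the function name, and the assumption that every file sits directly in work_dir are illustrative and are not the project's own data structure.

```python
from pathlib import Path

# File names come from the table; this per-stage grouping is an assumption.
STAGE_OUTPUTS = {
    "video-understanding": [
        "scenes.json", "asr_result.json", "vlm_analysis.json",
        "silence_periods.json", "timeline_fusion.json",
        "agent_narration_brief.md",
    ],
    "video-script": ["narration.json"],
    "video-cut": ["edited_source.mp4"],
    "video-voiceover": ["tts_meta.json"],
}


def missing_outputs(work_dir, stage):
    """List the expected output files of a stage that are absent from work_dir."""
    root = Path(work_dir)
    return [name for name in STAGE_OUTPUTS[stage] if not (root / name).exists()]


if __name__ == "__main__":
    for stage in STAGE_OUTPUTS:
        print(stage, "missing:", missing_outputs("work_dir", stage))
```

A file-based contract like this lets any single skill run standalone, as long as the files it consumes are already in place.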

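Output

recap_<video>.mp4: the final video (fixed output name, overwritten in place on every run, so iterating on the narration refreshes the same file). subtitles.srt (subtitles are burned in by default, and subtitles.ass is produced as well; --no-burn-subtitles turns burning off)
work_dir/narration.json: the narration script (with narration_lint.json for timing diagnostics and narration_review.md for review comments)
work_dir/agent_narration_brief.md: the timing and scene brief for the Agent
work_dir/vlm_analysis.json · asr_result.json · silence_periods.json · timeline_fusion.json: understanding artifacts
work_dir/clip_plan.json · edited_source.mp4 · recap_phase.json: cut-mode artifacts (narration is written on the edit's timeline; recap_phase.json records cut/voiceover progress so a run can resume from a checkpoint)
work_dir/timeline.json · work_dir/assembly_manifest.json · tts_segments/ · tts_meta.json: multi-track timeline, render record, and TTS audio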
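For readers unfamiliar with the subtitles.srt output, the SubRip format is a numbered list of cues with `HH:MM:SS,mmm` timestamps. The following is a minimal sketch of writing that format, assuming cues are (start, end, text) tuples in seconds. The tuple shape and function names are hypothetical; how the project derives cues from its own files is not shown here.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues) -> str:
    """Render (start, end, text) cues as SRT text, numbering cues from 1."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    print(to_srt([(0.0, 2.5, "First line"), (3661.25, 3663.0, "Second line")]))
```

Working in integer milliseconds avoids rounding drift when splitting a timestamp into hours, minutes, and seconds.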

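Reference docs

Per-skill contracts: each skills/<skill>/SKILL.md (the writing rules live in video-script's SKILL.md)
Data structures · Configuration manual · Multi-track timeline / Jianying export
Background research guide · VLM prompt templates

Acknowledgements

linux.do
The Jianying draft export drew on the draft structure of pyJianYingDraft and capcut-mate (both Apache-2.0).

License

MIT; see LICENSE.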
