video-context

Claude Code plugin that lets Claude "watch" videos.

Claude can read images natively but not video. This plugin extracts scene-change frames and an audio transcript from any video (a local file, a public URL such as Loom, YouTube, Vimeo, or a raw mp4, or a private URL with auth) and feeds them to Claude as images and text.

Install

/plugin marketplace add https://github.com/vusallyv/video-context-plugin.git
/plugin install video-context@video-context-marketplace
/video-context:setup

/video-context:setup installs ffmpeg, yt-dlp, whisper-cpp, and the whisper model. On macOS you can skip it: extract.sh installs missing binaries via Homebrew on first use. The whisper model (~150MB), however, is fetched only by setup.

Private repo: works with your gh auth (or GITHUB_TOKEN).

Usage

Just paste a video link or path and ask Claude to analyze it:

Analyze this recording: https://www.loom.com/share/abc123
What does this video show? /Users/me/Downloads/bug-repro.mp4

Claude detects the video and runs the skill automatically.

Private URL with auth

Set VIDEO_AUTH_HEADER before invoking Claude (or tell Claude what it is in chat):

export VIDEO_AUTH_HEADER="Authorization: Bearer $TOKEN"

For Atlassian/Jira basic auth:

export VIDEO_AUTH_HEADER="Authorization: Basic $(echo -n "$EMAIL:$JIRA_API_TOKEN" | base64)"
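If you build Basic auth headers often, a small helper can make the encoding step less error-prone. The sketch below is an illustration only: `basic_auth_header` is a hypothetical function, not something the plugin ships. It uses `printf` instead of `echo -n` and strips newlines from the encoded value, because GNU `base64` wraps its output at 76 characters and a long API token would otherwise split the header across lines.

```shell
# basic_auth_header USER SECRET
# Hypothetical helper (not part of the plugin): prints an HTTP Basic
# Authorization header for the given credentials.
basic_auth_header() {
  # printf avoids the portability quirks of `echo -n`; tr removes the
  # line wraps that GNU base64 inserts into long encoded values.
  printf 'Authorization: Basic %s\n' \
    "$(printf '%s:%s' "$1" "$2" | base64 | tr -d '\n')"
}

# Example (EMAIL and JIRA_API_TOKEN must already be set in your shell):
# export VIDEO_AUTH_HEADER="$(basic_auth_header "$EMAIL" "$JIRA_API_TOKEN")"
```

The resulting value has the same shape as the export shown above, so it can be used the same way.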

Requirements

ffmpeg — required
yt-dlp — recommended (handles Loom/YouTube/Vimeo)
whisper-cpp + model — optional, for audio transcript

/video-context:setup installs everything on macOS (Homebrew) and Linux (apt/dnf/pacman/zypper; builds whisper-cpp from source). On macOS, extract.sh also auto-installs missing binaries on first use via brew. Windows: install manually.
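To see which tools are already present before running setup or installing by hand, a quick PATH check is enough. This is a minimal sketch: `missing_tools` is a hypothetical helper, and the binary names in the example are assumptions (the whisper-cpp executable in particular is named differently across packages and versions).

```shell
# missing_tools NAME...
# Hypothetical helper (not part of the plugin): prints each named
# command that cannot be found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example: list whatever still needs installing.
# The binary names here are assumptions; adjust them for your system.
# missing_tools ffmpeg yt-dlp whisper-cli
```

Empty output means every listed command was found.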

Tuning

| Env var | Default | Effect |
| --- | --- | --- |
| `SCENE_THRESHOLD` | `0.4` | Lower = more frames. `0.2` for screen recordings. |
| `MAX_FRAMES` | `20` | Hard cap; trimmed evenly across timeline. |
| `FRAME_WIDTH` | `1280` | Downscale to save tokens. |
| `WHISPER_MODEL` | `/opt/homebrew/share/whisper-cpp/ggml-base.en.bin` | Whisper model path. |
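As an example, the settings below might suit a screen recording, where changes between frames are subtle. The values are illustrative rather than recommendations from the plugin author. The second half shows a common shell idiom for falling back to the documented defaults when a variable is unset; it is an assumption about how such a script could read them, not a copy of extract.sh.

```shell
# Illustrative tuning for a screen recording. Export these before
# starting Claude so the plugin's script inherits them.
export SCENE_THRESHOLD=0.2   # lower threshold: more scene changes detected
export MAX_FRAMES=30         # raise the cap above the default of 20
export FRAME_WIDTH=960       # narrower frames cost fewer tokens

# Assumed idiom (not taken from extract.sh): use the environment value
# when set, otherwise fall back to the documented default.
threshold="${SCENE_THRESHOLD:-0.4}"
max_frames="${MAX_FRAMES:-20}"
width="${FRAME_WIDTH:-1280}"

printf 'threshold=%s max_frames=%s width=%s\n' \
  "$threshold" "$max_frames" "$width"
```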

How it works

1. Resolves the source: a local path is used as-is; URLs are fetched via yt-dlp (or curl with auth).
2. Runs ffmpeg scene detection to extract a frame wherever the scene-change score exceeds SCENE_THRESHOLD, plus the first and last frames.
3. Caps the result at MAX_FRAMES, sampled evenly across the timeline.
4. Extracts the audio and runs it through whisper-cpp to produce transcript.txt.
5. Prints the frame paths and transcript so Claude can Read each frame and the text.
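The frame-selection steps above can be sketched in a few lines of shell. This is an illustration under stated assumptions, not the contents of extract.sh: `scene_filter` and `pick_evenly` are hypothetical helpers, the file names are placeholders, and the handling of the first and last frames is left out. The ffmpeg options themselves (`select` with the `scene` score, `scale`, `-ar`, `-ac`) are standard.

```shell
# scene_filter THRESHOLD WIDTH
# Hypothetical helper: prints an ffmpeg -vf filter that keeps frames whose
# scene-change score exceeds THRESHOLD and scales them to WIDTH pixels wide
# (-2 keeps the aspect ratio and rounds the height to an even number).
scene_filter() {
  printf "select='gt(scene,%s)',scale=%s:-2\n" "$1" "$2"
}

# pick_evenly TOTAL KEEP
# Hypothetical helper: prints the 0-based indices of at most KEEP frames,
# spread evenly from the first extracted frame to the last.
pick_evenly() {
  total=$1
  keep=$2
  if [ "$total" -lt "$keep" ]; then
    keep=$total
  fi
  if [ "$keep" -le 0 ]; then
    return 0
  fi
  if [ "$keep" -eq 1 ]; then
    echo 0
    return 0
  fi
  i=0
  while [ "$i" -lt "$keep" ]; do
    # Integer interpolation between index 0 and index total-1.
    echo $(( i * (total - 1) / (keep - 1) ))
    i=$(( i + 1 ))
  done
}

# Example invocations (placeholder file names; not run here):
#   ffmpeg -i input.mp4 -vf "$(scene_filter 0.4 1280)" -vsync vfr frame_%04d.jpg
#   ffmpeg -i input.mp4 -vn -ar 16000 -ac 1 -c:a pcm_s16le audio.wav
# Newer ffmpeg releases spell -vsync vfr as -fps_mode vfr.
# The second command produces 16 kHz mono WAV, the input format
# whisper.cpp expects.
```

For instance, `pick_evenly 5 3` selects indices 0, 2, and 4, so the first and last extracted frames always survive the cap.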

License

MIT
