Enable Claude to analyze local video files or YouTube URLs with full multimodal perception: extract frames and audio via FFmpeg, detect scenes/motion/silence/transitions, transcribe speech using Whisper or Gemini, generate detailed visual descriptions, summaries, and answer questions interactively.
Own this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimOwn this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimBased on adoption, maintenance, documentation, and repository signals. Not a security audit or endorsement.
npx claudepluginhub jordanrendric/claude-video-vision --plugin claude-video-visionLet Claude watch a video. Downloads with yt-dlp, extracts auto-scaled frames with ffmpeg, pulls captions or falls back to Whisper, and hands frames + transcript to Claude so it can answer questions about the video.
Turn videos into a sequence of relevant still frames + transcript + a self-contained HTML report so Claude can view them as images, hear the audio, and write its analysis back into the report. Pass a local path, an http(s) URL, or pipe video bytes on stdin.
Turn any video into a section-by-section study-notes markdown file with embedded screenshots and a timestamped transcript.
Process videos using VideoDB Python SDK. Upload, search, edit, generate subtitles, transcode, capture, and stream — all via natural language.
Compose yt-dlp + ffmpeg + Whisper into a single command that hands an AI agent the raw materials to watch any social video — VIDEO + FRAMES + TRANSCRIPT, ready for an LLM to read frames as images and transcript as text.
AI-powered video processing toolkit - download videos, remove silence, trim/cut, extract audio, transcribe, generate descriptions, upload to YouTube and Bunny.net