By jain777
Give Claude smarter eyes for video. /see watches videos with scene-aware frame extraction and optional OCR; /skillgen scaffolds Claude Code skills from tutorial videos.
Watch a video (URL or local path) and answer questions about it. Use when the user shares a video URL, asks Claude to look at a video file, or asks about content in a video. Smart scene-aware frame extraction + optional OCR keep token cost low.
Watch a tutorial or demo video and scaffold a runnable Claude Code skill from it. Use when the user wants to turn a how-to video, screencast, or tool demo into an invocable skill or plugin.
Give Claude Code smarter eyes for video. Two skills, one plugin:
/claude-eyes:see — watch a video (URL or local file) and answer questions about it. Scene-aware frame extraction with perceptual deduplication keeps token cost low; optional OCR replaces expensive high-res frame inspection for text-heavy content; contact-sheet preview mode summarizes long videos in a single image./claude-eyes:skillgen — watch a tutorial or demo video and scaffold a runnable Claude Code skill from it.Designed to spend as few image tokens as possible on long or static video, while staying accurate:
| Capability | How claude-eyes does it |
|---|---|
| Frame sampling | Scene-change + minimum-cadence selection after mpdecimate perceptual dedup — one ffmpeg pass, no duplicate frames in static stretches |
| Per-frame timestamps | PTS-in-filename via -frame_pts 1, correct even when sampling is non-uniform |
| Long-video preview | One tile=4x6 contact sheet (~1k vision tokens) to orient before spending on full extraction |
| OCR | Optional tesseract pass; auto-on for text-related questions, so on-screen text is read cheaply instead of via high-res frames |
| Transcript input | --transcript accepts user-supplied VTT/SRT/JSON/TXT first; falls back to captions, then Whisper |
| Cache | sha256-keyed under ${CLAUDE_PLUGIN_DATA}/cache — re-runs skip download and extraction |
| Skill-from-video | /skillgen scaffolds a working SKILL.md + plugin.json from a tutorial |
/plugin install https://github.com/jain777/claude-eyes
Or from a local clone:
git clone https://github.com/jain777/claude-eyes
/plugin install ./claude-eyes
First-run preflight installs ffmpeg and yt-dlp on macOS via Homebrew; prints exact install commands on Linux/Windows. tesseract is optional (enables --ocr). A Whisper API key (Groq preferred, OpenAI fallback) is only needed if the video has no captions AND you didn't pass --transcript.
/claude-eyes:see/claude-eyes:see https://youtu.be/dQw4w9WgXcQ what happens at 0:30?
/claude-eyes:see ~/recording.mp4 --start 2:15 --end 2:45 summarize
/claude-eyes:see https://example.com/talk.mp4 --preview
/claude-eyes:see lecture.mp4 --transcript lecture.srt
/claude-eyes:see code-review.mp4 --ocr what does the function on screen do?
Flags: --transcript PATH, --profile {auto,tutorial,live}, --start T, --end T, --max-frames N, --resolution W, --ocr, --no-whisper, --whisper {groq,openai}, --preview, --no-cache, --out-dir DIR.
/claude-eyes:skillgen/claude-eyes:skillgen https://youtu.be/jq-tutorial jq-helper
/claude-eyes:skillgen ~/screencast.mp4 my-tool
Watches the video with the tutorial profile + OCR, then helps Claude synthesize and scaffold a runnable skill at ${CLAUDE_PLUGIN_DATA}/generated/<name>/ for review.
MIT
Own this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimOwn this plugin?
Verify ownership to unlock analytics, metadata editing, and a verified badge. GitHub access is read-only (username + org membership).
Sign in to claimBased on adoption, maintenance, documentation, and repository signals. Not a security audit or endorsement.
npx claudepluginhub jain777/claude-eyesJobClaw by whatNxtAI — open-source Agent Skills that run your entire job hunt: profile, search, fit scoring, tailored resumes, cover letters, recruiter triage, interview prep, and offer negotiation. Bring your own keys; most skills need none.
Ultra-compressed communication mode. Cuts ~75% of tokens while keeping full technical accuracy by speaking like a caveman.
Memory compression system for Claude Code - persist context across sessions
Multi-model consensus engine integrating OpenAI Codex CLI, Gemini CLI, and Claude CLI for collaborative code review and problem-solving.
Curate auto-memory, promote learnings to CLAUDE.md and rules, extract proven patterns into reusable skills.
Editorial "Web Designer" bundle for Claude Code from Antigravity Awesome Skills.
Core skills library for Claude Code: TDD, debugging, collaboration patterns, and proven techniques