By SSFSKIM
Transcribe YouTube videos/playlists into Markdown, translate them, and convert to HTML blogs with frame snapshots inline; browse the library in a local web UI or export it to Markdown for Obsidian; also ingest EPUB e-books into the same library and index their figures for a tutor agent (yt-dlp + ffmpeg + OpenAI/Gemini).
Serve a folder of generated `.html` blogs as an Obsidian-like web app: a
Build a tutor-facing index that maps every meaningful image in a book to **where
Turn `.epub` e-books into a folder tree of HTML pages that **mirrors the book's
This skill should be used when the user asks to "turn this YouTube video into a blog post", "make a full blog from a YouTube URL with images", "유튜브 영상을 블로그로 변환해줘", "video to blog", "embed slides into the transcript", or wants the transcript PLUS meaningful frame snapshots in an HTML page. Extracts frames by uniform sampling, deduplicates with perceptual hash, ranks with Gemini Flash against transcript context, and renders semantic HTML with clickable YouTube deep-links. For transcript-only output, use the `transcribe` skill instead.
Mirror a folder of generated `.html` blogs into a parallel Markdown tree so the
Transcribe and translate YouTube videos or playlists into clean Markdown transcripts.
transcription → translation: pull the spoken words out of a video, then localize them.
Video is great, but for learning and reference it is not information-dense. With so much information out there, you may want to cut the time you spend ingesting it, and an automated transcription → translation pipeline lets you extract the knowledge in any YouTube video efficiently.
New in v0.2.0: the `full-blog` skill produces a self-contained HTML page with the
transcript and ~10–25 meaningful frame snapshots embedded inline. Click any frame to
jump to that moment on YouTube. Translate the result with `translate` to ship localized
blog posts.
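The skill description says frames are extracted by uniform sampling and deduplicated with a perceptual hash before ranking. The sketch below illustrates that idea with the standard library only. It is not the plugin's code: the plugin lists `imagehash` and `Pillow` as requirements, whereas this example uses a simplified average hash over a small grayscale grid. The function names and the `max_distance` threshold are hypothetical.

```python
import math


def sample_timestamps(duration_s, interval_s):
    """Return uniformly spaced sample times covering [0, duration_s)."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    count = math.ceil(duration_s / interval_s)
    return [i * interval_s for i in range(count)]


def average_hash(gray_rows):
    """Simplified average hash: one bit per pixel, set when the pixel is above the mean.

    gray_rows is a small grid of grayscale values (a downscaled frame).
    This stands in for a real perceptual hash and is an assumption of this sketch.
    """
    pixels = [p for row in gray_rows for p in row]
    mean = sum(pixels) / len(pixels)
    value = 0
    for p in pixels:
        value = (value << 1) | (1 if p > mean else 0)
    return value


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def drop_near_duplicates(frames, max_distance):
    """Keep a (timestamp, hash) pair only if it differs enough from the last kept one."""
    kept = []
    for timestamp, frame_hash in frames:
        if kept and hamming(kept[-1][1], frame_hash) <= max_distance:
            continue  # visually close to the previous kept frame
        kept.append((timestamp, frame_hash))
    return kept
```

Comparing each frame only against the last kept frame suits slide-style talks, where duplicates arrive in runs. Comparing against every kept frame would also catch a slide that reappears later, at a higher cost.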
YouTube URL ──▶ transcribe ──▶ Markdown (original language) ──▶ translate ──▶ Markdown (your language)
Getting usable text out of YouTube is fiddly: captions are sometimes missing, come in the wrong
language, are riddled with rolling-duplicate lines, or are auto-translated rather than original.
And once you have a transcript, translating a long talk hits model output limits and mangles
formatting. yt-ribosome handles all of that for you, as natural-language skills inside Claude Code.
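Two of the problems named above, rolling-duplicate caption lines and model output limits on long transcripts, can be illustrated with a short sketch. This shows the general technique only and is not taken from yt-ribosome; the function names and the character budget are hypothetical.

```python
def collapse_rolling_lines(lines):
    """Collapse captions where each cue repeats or extends the previous one."""
    kept = []
    for raw in lines:
        line = " ".join(raw.split())  # normalize whitespace
        if not line:
            continue
        if kept and line == kept[-1]:
            continue  # exact repeat of the previous line
        if kept and line.startswith(kept[-1]):
            kept[-1] = line  # the cue grew: keep the longer version
            continue
        kept.append(line)
    return kept


def chunk_paragraphs(text, max_chars):
    """Split text on blank lines into chunks of at most max_chars where possible.

    A single paragraph longer than max_chars is kept whole rather than cut
    mid-sentence, so Markdown structure inside a paragraph is never split.
    """
    chunks, current = [], ""
    for para in text.split("\n\n"):
        candidate = para if not current else current + "\n\n" + para
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Joining the chunks back with blank lines reproduces the input, so each chunk can be translated on its own and reassembled in order. The prefix check in `collapse_rolling_lines` is deliberately naive and can merge two distinct lines when one happens to begin with the other.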
gpt-4o-transcribe). YouTube auto-captions are skipped by default; opt back in with `--allow-auto-captions`.
.srt too when timestamps exist.
`full-blog` produces a self-contained HTML page with the transcript and meaningful
frames embedded inline at the timestamp each frame was captured. Each `<figure>`
deep-links back to YouTube via `?t=` so readers can jump to the moment.
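As a rough illustration of the deep-link idea, the sketch below builds a YouTube URL carrying a `t` parameter and wraps a frame image in a `<figure>`. The function names, the exact markup, and the `watch?v=…&t=…s` URL shape are assumptions made for this example. The HTML that `full-blog` actually emits may differ.

```python
from html import escape
from urllib.parse import urlencode


def frame_deep_link(video_id, seconds):
    """Build a YouTube watch URL that starts playback at the given second."""
    query = urlencode({"v": video_id, "t": f"{int(seconds)}s"})
    return f"https://www.youtube.com/watch?{query}"


def figure_html(video_id, seconds, image_src, caption):
    """Wrap a frame image in a <figure> that links to its moment in the video."""
    href = escape(frame_deep_link(video_id, seconds), quote=True)
    return (
        f'<figure><a href="{href}"><img src="{escape(image_src, quote=True)}" '
        f'alt="{escape(caption, quote=True)}"></a>'
        f"<figcaption>{escape(caption)}</figcaption></figure>"
    )
```

Escaping the URL before placing it in `href` turns `&` into `&amp;`, which keeps the generated page valid HTML.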
# v0.2.0 — turn a YouTube URL into an HTML blog with images
python3 skills/full-blog/scripts/full_blog.py "<URL>" --out-dir blogs
# Then translate (HTML-aware):
python3 skills/translate/scripts/translate.py blogs/ --to Korean
Output structure:
blogs/
├── 01 - My Talk.html
├── 01 - My Talk/
│ ├── 00_03_12.jpg
│ └── 00_05_44.jpg
└── _run_summary.json
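The frame names in the tree above look like `HH_MM_SS` capture times. Assuming that reading is right (the source does not state the naming rule), a filename can be turned back into a second offset like this. The helper name is hypothetical.

```python
from pathlib import PurePath


def seconds_from_frame_name(name):
    """Convert a frame filename such as '00_03_12.jpg' into seconds.

    Assumes the stem is HH_MM_SS, which is inferred from the example
    listing and not confirmed by the plugin's documentation.
    """
    stem = PurePath(name).stem
    hours, minutes, seconds = (int(part) for part in stem.split("_"))
    return hours * 3600 + minutes * 60 + seconds
```

For example, `seconds_from_frame_name("00_03_12.jpg")` gives 192, the offset a deep link for that frame would point to.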
Cost target: ~$0.10 per 60-min video with `gemini-2.5-flash`, or ~$0.03 with `gemini-2.0-flash`.
| Skill | What it does | Say something like |
|---|---|---|
| `transcribe` | YouTube → original-language Markdown | "transcribe this video/playlist", "유튜브 트랜스크립트 받아줘" |
| `translate` | transcript/Markdown/HTML files → target language | "translate these to Korean", "한국어로 번역해줘" |
| `transcribe-and-translate` | both, end to end | "transcribe this and translate it to Korean" |
| `full-blog` | YouTube → HTML blog with embedded frame snapshots | "make a full blog from this video", "유튜브 영상을 블로그로 변환해줘" |
/plugin marketplace add SSFSKIM/yt-ribosome
/plugin install yt-ribosome@yt-ribosome
Or run it locally without installing:
claude --plugin-dir /path/to/yt-ribosome
| Requirement | Needed for |
|---|---|
| `yt-dlp` (recent) + `ffmpeg` | all transcription |
| `openai` + `OPENAI_API_KEY` | audio fallback and OpenAI translation |
| `google-genai` + `GEMINI_API_KEY` or `GOOGLE_API_KEY` | Gemini translation |
| `imagehash`, `Pillow` (`pip install imagehash Pillow`) | `full-blog`: perceptual-hash frame deduplication |
| `beautifulsoup4` (`pip install beautifulsoup4`) | `translate`: HTML support (v0.2.0) |
pip install -U yt-dlp openai google-genai # ffmpeg: brew install ffmpeg / apt install ffmpeg
API keys are read from the environment or a `.env` file in the current working directory. A Gemini
key starting with `AQ.` is auto-detected as a Vertex AI Express key; an `AIza…` key uses the
standard Gemini API.
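The key handling described above could look roughly like the sketch below. It is an assumption-laden illustration, not the plugin's implementation. The helper names and the returned labels are hypothetical, and the lookup order (environment before `.env`, `GEMINI_API_KEY` before `GOOGLE_API_KEY`) is a guess the source does not confirm.

```python
import os
from pathlib import Path


def parse_dotenv(text):
    """Parse simple KEY=VALUE lines, ignoring blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_gemini_key(env, dotenv):
    """Return the first Gemini key found, or None. Lookup order is assumed."""
    for name in ("GEMINI_API_KEY", "GOOGLE_API_KEY"):
        value = env.get(name) or dotenv.get(name)
        if value:
            return value
    return None


def classify_gemini_key(key):
    """Classify a key by the prefixes described in the text."""
    if key.startswith("AQ."):
        return "vertex-express"  # hypothetical label
    if key.startswith("AIza"):
        return "gemini-api"  # hypothetical label
    return "unknown"


if __name__ == "__main__":
    dotenv_path = Path(".env")
    dotenv = parse_dotenv(dotenv_path.read_text()) if dotenv_path.exists() else {}
    key = resolve_gemini_key(os.environ, dotenv)
    print(classify_gemini_key(key) if key else "no Gemini key found")
```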
In Claude Code, just ask:
"Transcribe https://youtube.com/watch?v=… and translate it to Korean."
npx claudepluginhub ssfskim/yt-ribosome --plugin yt-ribosome