This skill should be used when the user asks to "extract a transcript", "summarize this video", "make a doc from this transcript", or provides a YouTube URL, podcast transcript, or timestamped spoken-word content. Also triggers when the user pastes a raw transcript, mentions "here's a video from Dr. [name]", or drops a YouTube link without explicitly asking for extraction.
How this skill is triggered — by the user, by Claude, or both
Slash command
/obscura-scraper-crawler:extract-transcriptThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
Convert raw transcripts (YouTube, podcast, interview) into clean, structured Markdown summaries. Unlike study extraction, this is a reasoning task — the transcript is messy spoken-word content that needs to be understood, organized, and summarized, not just reformatted.
Convert raw transcripts (YouTube, podcast, interview) into clean, structured Markdown summaries. Unlike study extraction, this is a reasoning task — the transcript is messy spoken-word content that needs to be understood, organized, and summarized, not just reformatted.
The simplest invocation is just a YouTube URL:
/extract-transcript https://www.youtube.com/watch?v=XXXXX
This will:
scripts/get_transcript.pyIf the user provides a YouTube URL instead of pasted text, run:
python3 {SKILL_DIR}/scripts/get_transcript.py "<youtube-url>" -o /tmp/bundle.json
The JSON bundle contains: title, channel, channel_url, description, duration_seconds, upload_date, chapters, transcript_plain, transcript_timestamped, metadata_source.
Bundle schema (exact field names — important):
{
"title": "string",
"channel": "string",
"channel_url": "string",
"description": "string | null",
"duration_seconds": 123, // int
"upload_date": "YYYY-MM-DD | null",
"chapters": [ // null when the video has no chapters
{ "title": "string", "start_seconds": 0.0 }
],
"transcript_plain": "string", // newline-joined plain text
"transcript_timestamped": [
{ "text": "string", "start": 0.0, "duration": 0.0 } // seconds, float, 2dp
],
"metadata_source": "yt-dlp | fallback"
}
Pay attention to these specifically — they're the ones easiest to assume wrong:
start_seconds (a float), not start_time / start / timestamp. There is no end_time; compute the end of chapter i as the start_seconds of chapter i+1 (or duration_seconds for the last chapter).start and duration (not t / length). Add start + duration for the end of a segment.upload_date is a normalized YYYY-MM-DD string when metadata_source == "yt-dlp"; may be null on the fallback path.chapters is null (not []) when the video has no chapter markers. Test with bundle.get("chapters") or [] before iterating.Requires youtube-transcript-api. If the script fails with ModuleNotFoundError, install it: pip install youtube-transcript-api.
Prefer yt-dlp when available — it gives richer and more reliable metadata (especially chapters and full descriptions). Install with pip install yt-dlp or brew install yt-dlp. The fallback path (oembed + YouTube watch-page scrape) works without it but is brittle; any field it can't resolve is null.
How to use each field:
description — read it carefully. This is the single most valuable piece for disambiguating product/tool names the transcript garbles, finding resource links the speaker referenced, and confirming credentials. When in doubt about a term from the transcript, check the description first.chapters — when present, use chapter titles as your section scaffolding (H2 headings). They're the creator's own outline, which is almost always cleaner than what you'd infer.duration_seconds + upload_date — include both in the metadata header. Readers need to know how fresh the advice is and whether it's worth watching.transcript_timestamped — use this to embed timestamp anchors on key quotes. Format: https://www.youtube.com/watch?v=<id>&t=<seconds>s. Do this for 3–8 notable quotes, not every quote — this is the feature that makes the doc useful as a navigation index back to the video.transcript_plain — the primary content to summarize.If the user pastes the transcript directly, skip Step 0 and move to Step 1.
Most fields come straight from the bundle. You still need to infer:
channel (YouTube name) but not credentials. Check the description for "I'm a [role]" self-descriptions. If still unclear, say so explicitly in the doc ("no stated credentials").Read through the entire transcript. If chapters are available, use them as your section boundaries. Otherwise identify:
Decide: tutorial or argument mode. A tutorial ("here's how I built X," "step 1, step 2…") is best served by an up-front reproducible-steps block before the narrative. An argument-driven talk (a research summary, an opinion piece, an interview) is best served by the standard narrative template.
Use this exact structure. Fields marked [optional] are omitted when not applicable.
# [Video/Episode Title]
**Speaker:** [Name, credentials if known]
**Source:** [YouTube / podcast name / etc.]
**Published:** [upload_date — YYYY-MM-DD]
**Runtime:** [HH:MM:SS or ~N minutes]
**Topic:** [one-line description]
**Format:** [solo tutorial, interview, panel, etc.]
**Key references cited:** [studies, guidelines, tools, books mentioned — if any]
**Published paper:** [optional — if the video presents a specific published study, include author/journal/year/DOI and a relative-path link to the study MD if one exists locally]
> **Transcription uncertainties:** [optional — a single line or short list of proper nouns, product names, or speaker claims Claude couldn't fully verify. Include these so the user can correct them in one pass. Omit the block entirely if everything resolved cleanly.]
---
## Summary
[2–4 sentences capturing the core argument and conclusion.]
**Key takeaway:** [one sentence — the single thing someone should remember]
---
## [For tutorials only — optional up-front block]
### Steps to reproduce
1. [Short imperative step with command/link if applicable]
2. [...]
[When the video is a how-to build, put the extracted reproducible steps up front. Link each step to its timestamp in the video with `[time](url?t=Ns)`. Narrative sections below can then focus on *why*, not *what*.]
---
## [Section 1: from chapter title if available, or your own heading]
[Content organized into clear prose, not raw transcript. Clean up false starts,
filler words, repetition. Preserve the speaker's voice and key quotes.]
> Use blockquotes for particularly strong or quotable statements. ([1:22](https://www.youtube.com/watch?v=VIDEO_ID&t=82s))
[Embed `?t=Ns` timestamp anchors on 3–8 notable quotes across the doc — not every quote. Enough that a reader can jump back into the video at the moments that matter, not so many that it becomes noise.]
## [Section 2]
[etc.]
---
## Source
**Video/Episode:** "[full title]"
**URL:** [video URL]
**Channel:** [channel name linked to channel URL]
**Presenter/Guest:** [name, credentials, affiliation]
**Host:** [if interview format]
**Sponsor:** [if disclosed in the video]
**Resources mentioned:** [links pulled from the description and transcript — books, tools, prior videos, referenced channels]
If the user's project CLAUDE.md or user-level memory defines a personal context to evaluate transcripts against (a medical case, a research focus, a product decision), add a "Personal relevance" section before the Source block that connects the speaker's claims to that context. Be specific — cite numbers, open questions, and note where claims are testable vs. speculative. Skip this section when no project context applies.
Default: write the file to the directory this Claude Code session was started in — the primary working directory shown in your environment. This is the user's implicit "current workspace" and is almost always where they expect output to land.
Guard against dumping notes into a code repo. If the session started in a git repository that has no existing transcript-like docs (no transcripts/ folder, no sibling .md files that look like prior transcripts), ask the user where to save before writing. It's much cheaper to ask once than to write into the wrong place and pollute a tracked repo.
Filename: Lowercase kebab-case — <speaker>-<short-slug>.md.
<speaker> is the speaker's own name, not the channel brand. Tom Solid (channel "ICOR with Tom | AI Productivity") → tom-solid. Fall back to the channel handle only when no person is named.<short-slug> is a 3–6 word descriptive title, max ~60 chars total. Keep the full video title in the # H1 heading inside the file.-, drop middle initials. "Chris Raroque" → chris-raroque. "Mendoza-López" → mendoza-lopez.Examples: tom-solid-note-taking-apps.md, chris-raroque-custom-agent-v2.md.
Match an existing filename pattern in the target folder if one is obvious; the kebab default is the fallback.
If the speaker discusses a published paper:
extract-study so both live in the repo and link to each other.Also cross-reference related reference docs, letters, or other transcripts when the content overlaps substantively.
If the file was written into a git repository where transcripts belong, add and commit with a descriptive message. If it was written to a non-git directory, skip this step.
> blockquotes for particularly strong or memorable direct quotes. Anchor 3–8 of them to their video timestamps.extract-study)extract-book)Creates, edits, and optimizes skills for Claude Code, including drafting, evaluating with test prompts, iterating on performance, and improving skill descriptions for better triggering accuracy.
npx claudepluginhub noisemeldorg/skills --plugin extraction-skills