From social-media-creator
Extract key content from external sources (YouTube videos, PDFs, web articles) for video creation. Use when the user provides a YouTube link, PDF file, or article URL and wants to create video content from it. Supports Czech and English content. Outputs a structured content brief for downstream video skills.
Slash command: /social-media-creator:extract-content
Extract and structure content from YouTube videos, PDF documents, and web articles into a format ready for video creation.
| Source | How to Extract | Notes |
|---|---|---|
| YouTube video | Use youtubetotranscript.com via Chrome browser to extract transcript | Extract full spoken content, then title/description via WebFetch |
| PDF file | Use Read tool on the PDF file path, or use pdf skill for complex PDFs | Extract headings, key sections, main arguments |
| Web article | Use WebFetch on the article URL | Extract title, headings, key paragraphs, main takeaways |
DO NOT try to extract transcripts directly from youtube.com — it will NOT work (YouTube blocks automated transcript access, sandbox blocks direct API calls, and captions DOM is not reliably accessible).
ALWAYS use youtubetotranscript.com — this is the ONLY reliable method.
Extract the video ID from the YouTube URL:
- https://www.youtube.com/watch?v=ZuJryiwxjDw → video ID = ZuJryiwxjDw
- https://youtu.be/ZuJryiwxjDw → video ID = ZuJryiwxjDw
Navigate to the transcript page in Chrome browser:
https://youtubetotranscript.com/transcript?v={VIDEO_ID}
Extract the transcript from the #transcript element using Chrome browser JS:
// The transcript text is inside the #transcript element
// It can be large (50K+ chars), so extract in chunks if needed
const text = document.querySelector('#transcript')?.innerText || '';
// For large transcripts, chunk it:
const chunks = [];
const chunkSize = 12000;
for (let i = 0; i < text.length; i += chunkSize) {
  chunks.push(text.substring(i, i + chunkSize));
}
window.__transcript_chunks = chunks;
Read each chunk via separate JS calls:
window.__transcript_chunks[0] // first chunk
window.__transcript_chunks[1] // second chunk
// ... etc
Save the full transcript to a file in the post folder
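The URL handling and chunking in the steps above can be sketched as plain functions. This is a minimal sketch, not part of the skill itself: the helper names `extractVideoId`, `transcriptPageUrl`, and `chunkText` are hypothetical, and only the two URL shapes shown above are handled.

```javascript
// Hypothetical helpers; the names are illustrative and not defined by the skill.

// Pull the video ID out of the two URL shapes shown in the steps above.
function extractVideoId(youtubeUrl) {
  const url = new URL(youtubeUrl);
  if (url.hostname === "youtu.be") {
    // Short links carry the ID as the first path segment.
    return url.pathname.split("/")[1] || null;
  }
  // Standard watch links carry the ID in the "v" query parameter.
  return url.searchParams.get("v");
}

// Build the transcript page URL from the pattern given above.
function transcriptPageUrl(videoId) {
  return `https://youtubetotranscript.com/transcript?v=${encodeURIComponent(videoId)}`;
}

// Split text into fixed-size chunks, mirroring the in-browser loop.
function chunkText(text, chunkSize = 12000) {
  const chunks = [];
  for (let i = 0; i < text.length; i += chunkSize) {
    chunks.push(text.substring(i, i + chunkSize));
  }
  return chunks;
}

const videoId = extractVideoId("https://youtu.be/ZuJryiwxjDw");
console.log(videoId, transcriptPageUrl(videoId));
```

Joining the chunks back together in order with `chunks.join("")` reproduces the original transcript text, so nothing is lost by reading it in pieces.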
- Use the #transcript CSS selector specifically.
- Do NOT use get_page_text or read_page on this page: they return too much noise. Use javascript_tool with the #transcript selector.
After extracting raw content, structure it into a Content Brief:
# Content Brief
## Source
- Type: [youtube | pdf | article]
- URL/Path: [source location]
- Original Title: [title from source]
- Language: [cs | en]
## Key Points (ordered by importance)
1. [Most important takeaway]
2. [Second key point]
3. [Third key point]
...
## Detailed Summary
[2-3 paragraph summary of the full content]
## Quotable Lines
- "[Direct quote or paraphrase that works as a hook]"
- "[Another strong statement]"
## Suggested Video Angle
[1-2 sentences on the best angle for a social media video]
## Suggested Hook
[Opening line that grabs attention in the first 2 seconds]
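As an illustration of how the template above might be filled programmatically, here is a minimal sketch. The function `renderContentBrief` and the field names on its `brief` argument are assumptions made for this example; the skill itself only specifies the resulting markdown layout.

```javascript
// Hypothetical renderer; the field names on `brief` are illustrative assumptions.
function renderContentBrief(brief) {
  // Number the key points in the order given (most important first).
  const keyPoints = brief.keyPoints.map((point, i) => `${i + 1}. ${point}`);
  // Wrap each quotable line in double quotes, as in the template.
  const quotes = brief.quotes.map((quote) => `- "${quote}"`);
  return [
    "# Content Brief",
    "## Source",
    `- Type: ${brief.type}`,
    `- URL/Path: ${brief.location}`,
    `- Original Title: ${brief.title}`,
    `- Language: ${brief.language}`,
    "## Key Points (ordered by importance)",
    ...keyPoints,
    "## Detailed Summary",
    brief.summary,
    "## Quotable Lines",
    ...quotes,
    "## Suggested Video Angle",
    brief.angle,
    "## Suggested Hook",
    brief.hook,
  ].join("\n");
}

// Placeholder values only; real content comes from the extracted source.
console.log(
  renderContentBrief({
    type: "youtube",
    location: "https://www.youtube.com/watch?v=ZuJryiwxjDw",
    title: "If You Want More Money In 2026 Do This First",
    language: "en",
    keyPoints: ["[Most important takeaway]", "[Second key point]"],
    summary: "[2-3 paragraph summary of the full content]",
    quotes: ["[Direct quote or paraphrase that works as a hook]"],
    angle: "[Best angle for a social media video]",
    hook: "[Opening line]",
  })
);
```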
When extracting for a multi-lesson series, add:
## Series Breakdown
### Lesson 1: [Title]
- Key points: [...]
- Duration estimate: [X seconds]
### Lesson 2: [Title]
- Key points: [...]
- Duration estimate: [X seconds]
[... etc]
## Series Arc
- Opening lesson hook: [how to introduce the series]
- Progression: [how lessons build on each other]
- Finale: [how the series concludes with a strong CTA]
Extracted sources are cached at the root level of the outputs folder so they can be reused across multiple posts/series without re-extraction.
outputs/
├── _sources/ ← shared source cache
│ ├── yt-ZuJryiwxjDw/ ← YouTube video (keyed by video ID)
│ │ ├── transcript.txt ← raw transcript
│ │ ├── content-brief.md ← structured Content Brief
│ │ └── metadata.json ← title, URL, language, extraction date
│ ├── article-abc123/ ← article (keyed by URL slug/hash)
│ │ ├── raw-content.txt ← extracted article text
│ │ ├── content-brief.md
│ │ └── metadata.json
│ └── pdf-my-document/ ← PDF (keyed by filename slug)
│ ├── raw-content.txt
│ ├── content-brief.md
│ └── metadata.json
├── hormozi-prodavej-bohatum/ ← post folder (uses cached source)
├── hormozi-offers-lekce/ ← series folder (uses same cached source)
└── ...
| Source Type | Cache Folder Name | Key |
|---|---|---|
| YouTube | yt-{VIDEO_ID} | Video ID from URL (e.g., yt-ZuJryiwxjDw) |
| Article | article-{slug} | Slugified domain+path, max 60 chars (e.g., article-forbes-com-ai-trends-2026) |
| PDF | pdf-{filename-slug} | Slugified filename without extension (e.g., pdf-startup-handbook) |
{
  "type": "youtube",
  "source_url": "https://www.youtube.com/watch?v=ZuJryiwxjDw",
  "title": "If You Want More Money In 2026 Do This First",
  "language": "en",
  "extracted_at": "2026-02-28T14:30:00Z",
  "cache_key": "yt-ZuJryiwxjDw"
}
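A metadata object of this shape could be assembled as follows. This is a sketch under assumptions: `buildMetadata` is a hypothetical helper, and trimming milliseconds from the timestamp is done only to match the format of the example above.

```javascript
// Hypothetical helper; only the output field names come from the example above.
function buildMetadata({ type, sourceUrl, title, language, cacheKey, extractedAt = new Date() }) {
  return {
    type,
    source_url: sourceUrl,
    title,
    language,
    // toISOString() includes milliseconds; drop them to match "2026-02-28T14:30:00Z".
    extracted_at: extractedAt.toISOString().replace(/\.\d{3}Z$/, "Z"),
    cache_key: cacheKey,
  };
}

const metadata = buildMetadata({
  type: "youtube",
  sourceUrl: "https://www.youtube.com/watch?v=ZuJryiwxjDw",
  title: "If You Want More Money In 2026 Do This First",
  language: "en",
  cacheKey: "yt-ZuJryiwxjDw",
});
console.log(JSON.stringify(metadata, null, 2));
```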
ALWAYS check the cache before extracting from any source:
1. Check whether outputs/_sources/{cache-key}/ exists.
2. If it exists, read transcript.txt / raw-content.txt and content-brief.md and skip extraction.
3. If it does not exist, create it with mkdir -p outputs/_sources/{cache-key}/, then save:
- the raw content as transcript.txt (YouTube) or raw-content.txt (article/PDF)
- the brief as content-brief.md
- metadata.json with source info and extraction timestamp
# YouTube: extract video ID
VIDEO_ID="ZuJryiwxjDw"
CACHE_KEY="yt-${VIDEO_ID}"
# Article — slugify URL
URL="https://www.forbes.com/sites/ai-trends-2026"
CACHE_KEY="article-$(echo "$URL" | sed -E 's|^https?://||;s|^www\.||;s|[^a-zA-Z0-9]+|-|g;s|^-||;s|-$||' | cut -c1-60)"
# PDF — slugify filename
FILENAME="Startup Handbook.pdf"
CACHE_KEY="pdf-$(echo "${FILENAME%.*}" | tr '[:upper:]' '[:lower:]' | sed -E 's|[^a-z0-9]+|-|g;s|^-||;s|-$||')"
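The cache check itself could look roughly like this in Node.js. It is a sketch, not the skill's implementation: `loadFromCache` and `saveToCache` are hypothetical names, and treating a folder that lacks either file as a cache miss is an assumption of this example.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helpers; the file and folder names follow the cache layout described above.
function rawFileName(cacheKey) {
  // YouTube sources store a transcript; articles and PDFs store raw content.
  return cacheKey.startsWith("yt-") ? "transcript.txt" : "raw-content.txt";
}

// Return cached content, or null when extraction is still needed.
function loadFromCache(outputsDir, cacheKey) {
  const dir = path.join(outputsDir, "_sources", cacheKey);
  const rawPath = path.join(dir, rawFileName(cacheKey));
  const briefPath = path.join(dir, "content-brief.md");
  // Assumption: an incomplete folder counts as a cache miss.
  if (!fs.existsSync(rawPath) || !fs.existsSync(briefPath)) return null;
  return {
    raw: fs.readFileSync(rawPath, "utf8"),
    brief: fs.readFileSync(briefPath, "utf8"),
  };
}

// Create the cache folder (like mkdir -p) and write the three files.
function saveToCache(outputsDir, cacheKey, { raw, brief, metadata }) {
  const dir = path.join(outputsDir, "_sources", cacheKey);
  fs.mkdirSync(dir, { recursive: true });
  fs.writeFileSync(path.join(dir, rawFileName(cacheKey)), raw);
  fs.writeFileSync(path.join(dir, "content-brief.md"), brief);
  fs.writeFileSync(path.join(dir, "metadata.json"), JSON.stringify(metadata, null, 2));
  return dir;
}

// Usage: const cached = loadFromCache("outputs", "yt-ZuJryiwxjDw");
// A null result means the source still has to be extracted and then saved with saveToCache.
```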
If the user requests a different angle, focus, or series breakdown from the same source, you have two options:
- Save the new brief as content-brief-{angle-slug}.md alongside the original
- Check the source cache (outputs/_sources/{cache-key}/): if cached, load existing content
npx claudepluginhub michalvarys/claude-plugins --plugin social-media-creator