From lattifai-skills
Summarize a transcript, podcast, or long caption file into structured markdown (TL;DR, chapters with timestamps, quotes, entities). **Primary path uses this session's LLM directly — no API key, no model config.** Trigger on "summarize", "生成摘要", "总结", "TL;DR", "episode summary", "what was discussed", or when the user has a long caption and wants key points. CLI `lai summarize caption` is the secondary path for oversized transcripts / headless runs.
Slash command: `/lattifai-skills:lai-summarize`
> **Preferred model: Claude Sonnet** (cost-efficient for this agent-driven workload). This skill runs on whatever model is active in the parent session — any Claude model works; no hard switch.
This skill summarizes using the agent's own LLM capability — the model you are running right now. It does not call out to any external LLM service by default. Helper scripts turn the source into an agent-ready prompt and validate the finished summary, so the agent only has to write prose.
Primary path (agent-driven, default): prepare.py → agent writes summary.md → validate.py.
Secondary path: lai summarize caption (CLI with its own LLM backend) — only when the transcript is too large for the agent's context, or there is no agent in the loop.
Any caption format supported by lattifai-captions (SRT, VTT, ASS, JSON, Gemini markdown, plain text, …).
Optional meta.md beside the source enriches the summary. If meta.md defines chapters:, those titles and timestamps are hard constraints — no merge, split, rename, or reorder.
meta.md must wrap its YAML in --- frontmatter delimiters; otherwise validate.py cannot parse it and skips the hard constraints, emitting only a warning on stderr. Minimal template:
---
title: "Episode Title"
chapters:
- { title: "Intro", start: 0.0, end: 60.0 }
- { title: "Main Topic", start: 60.0, end: 420.0 }
---
Free-form notes below the frontmatter are fine.
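To illustrate why the `---` delimiters matter, here is a minimal sketch of how a frontmatter block could be separated from the notes below it. The function name `split_frontmatter` is hypothetical and this is not the actual `validate.py` logic; it only shows that without both delimiters there is nothing to hand to a YAML parser.

```python
from typing import Optional, Tuple


def split_frontmatter(text: str) -> Tuple[Optional[str], str]:
    """Return (yaml_text, body); yaml_text is None when delimiters are missing."""
    lines = text.splitlines()
    # The opening delimiter must be the very first line.
    if not lines or lines[0].strip() != "---":
        return None, text
    # Scan for the closing delimiter after the opening one.
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            yaml_text = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:])
            return yaml_text, body
    return None, text  # opening delimiter was never closed


sample = """---
title: "Episode Title"
chapters:
  - { title: "Intro", start: 0.0, end: 60.0 }
---
Free-form notes below the frontmatter are fine.
"""
yaml_text, body = split_frontmatter(sample)
```

With `pyyaml` installed, the returned `yaml_text` could then be passed to `yaml.safe_load` to obtain the `title` and `chapters` values.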
- `<source>`: caption/transcript file
- `--output <path>`: default `<source>.summary.md`
- `--meta <path>`: episode metadata (auto-detected beside source)
- `--lang <code>`: summary language (default: source language)
- `short` / `medium` / `long` ≈ 200–400 / 500–1000 / 1000–2000 words (default `medium`)

`<base>` = source media stem (e.g. `podcast` from `podcast.mp3`) or YouTube ID. Files all land in the current directory:
# 1. Build an agent-ready prompt input from the source + meta.md
python skills/lai-summarize/scripts/prepare.py podcast.aligned.json -o podcast.prompt_input.md
# 2. Agent reads <base>.prompt_input.md and writes <base>.summary.md per the schema below.
# 3. Validate frontmatter, chapters, and verbatim quotes
python skills/lai-summarize/scripts/validate.py podcast.aligned.json podcast.summary.md
prepare.py produces a transcript with [MM:SS] timestamps and speaker labels, plus a # Meta block that marks meta.md chapters as HARD CONSTRAINT when present.
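The `[MM:SS]` markers relate to the `start` values in seconds used by the chapter schema. A minimal sketch of that conversion follows; the helper names are hypothetical, and it assumes fractional seconds are truncated and that minutes simply run past 59 for long recordings. The actual `prepare.py` may behave differently.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as a [MM:SS] marker."""
    total = int(seconds)  # assumption: drop fractional seconds
    minutes, secs = divmod(total, 60)
    return f"[{minutes:02d}:{secs:02d}]"


def parse_timestamp(marker: str) -> int:
    """Convert a [MM:SS] marker back to whole seconds."""
    minutes, secs = marker.strip("[]").split(":")
    return int(minutes) * 60 + int(secs)


print(format_timestamp(10.0))      # [00:10]
print(format_timestamp(420.0))     # [07:00]
print(parse_timestamp("[07:00]"))  # 420
```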
---
title: "Episode Title"
seo_title: "SEO title (≤60 chars)"
seo_description: "One-sentence description (≤160 chars)"
tags: ["tag1", "tag2"] # 4–8 tags
chapters:
- { title: "Chapter Title", start: 10.0, end: 52.0 }
confidence: 0.85 # 0.0–1.0 self-assessment
source_quality: high # high | medium | low
---
TL;DR paragraph (2–4 sentences, active voice).
## [00:10] Chapter Title
Summary paragraph(s).
> *"Verbatim quote, ≤40 words."*
## Entities
- **Name** (Person|Concept|Organization): one-line context
`validate.py` checks:

- `seo_title` ≤ 60, `seo_description` ≤ 160
- `tags`: 4–8 entries
- `chapters`: 1–8, each with `title`/`start`/`end`, `start < end`
- chapters match `meta.md` verbatim when present
- `## [MM:SS] Title` headers match frontmatter chapters 1:1 in order
- each `> *"quote"*` appears verbatim in source transcript text
- `confidence` ∈ [0, 1]; `source_quality` ∈ {high, medium, low}

Requires `pyyaml` (stdlib `argparse` / `re` otherwise).
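As a rough illustration of the checks listed for `validate.py`, the sketch below applies most of them to an already-parsed frontmatter dict. It is not the real script: `check_summary` and its regular expressions are hypothetical, it assumes quotes are matched as exact substrings with no whitespace normalization, and it omits the `meta.md` comparison.

```python
import re

HEADER_RE = re.compile(r"^## \[(\d+):(\d{2})\] (.+)$", re.MULTILINE)
QUOTE_RE = re.compile(r'^> \*"(.+)"\*\s*$', re.MULTILINE)


def check_summary(front: dict, body: str, transcript: str) -> list:
    """Return a list of problems; an empty list means every check passed."""
    errors = []
    if len(front.get("seo_title", "")) > 60:
        errors.append("seo_title longer than 60 chars")
    if len(front.get("seo_description", "")) > 160:
        errors.append("seo_description longer than 160 chars")
    if not 4 <= len(front.get("tags", [])) <= 8:
        errors.append("tags must have 4-8 entries")

    chapters = front.get("chapters", [])
    if not 1 <= len(chapters) <= 8:
        errors.append("chapters must have 1-8 entries")
    for ch in chapters:
        if not {"title", "start", "end"} <= set(ch):
            errors.append(f"chapter missing title/start/end: {ch}")
        elif not ch["start"] < ch["end"]:
            errors.append(f"chapter start must precede end: {ch['title']}")

    # Body headers must mirror frontmatter chapters 1:1, in order.
    headers = [(int(m) * 60 + int(s), t.strip())
               for m, s, t in HEADER_RE.findall(body)]
    expected = [(int(ch["start"]), ch["title"])
                for ch in chapters if "start" in ch and "title" in ch]
    if headers != expected:
        errors.append("chapter headers do not match frontmatter chapters")

    # Every blockquote must be an exact substring of the transcript.
    for quote in QUOTE_RE.findall(body):
        if quote not in transcript:
            errors.append(f"quote not in source: {quote!r}")

    if not 0.0 <= front.get("confidence", -1) <= 1.0:
        errors.append("confidence must be in [0, 1]")
    if front.get("source_quality") not in {"high", "medium", "low"}:
        errors.append("source_quality must be high, medium, or low")
    return errors


front = {
    "seo_title": "SEO title",
    "seo_description": "One-sentence description",
    "tags": ["a", "b", "c", "d"],
    "chapters": [{"title": "Chapter Title", "start": 10.0, "end": 52.0}],
    "confidence": 0.85,
    "source_quality": "high",
}
body = ('TL;DR paragraph.\n\n## [00:10] Chapter Title\n\nSummary.\n\n'
        '> *"we shipped it on a Friday"*\n\n## Entities\n')
transcript = "[00:12] Host: we shipped it on a Friday and nothing broke"
problems = check_summary(front, body, transcript)
```

Returning a list of problems rather than raising on the first one lets the agent fix everything in a single pass before re-running validation.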
The CLI takes two positional args (input, output); there is no -o:
lai summarize caption podcast.aligned.json podcast.summary.md
Configure an LLM backend once (required for this path only):
lai config set summarization.llm.model_name gemini-3-flash-preview
# Gemini key: see /lai-transcribe
# Or OpenAI-compatible:
lai config set summarization.llm.model_name gpt-4o
lai config set OPENAI_API_KEY <your-key>
CLI options: summarization.lang=zh, summarization.length=short, summarization.output_format=json, meta=video.meta.md.
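For headless runs, the command line can be assembled programmatically. The sketch below assumes the `key=value` options are appended as separate tokens after the two positional arguments; that placement is an assumption, not something confirmed by the CLI's documentation, and `build_summarize_command` is a hypothetical helper.

```python
import shlex


def build_summarize_command(source: str, output: str, overrides: dict) -> list:
    """Build an argv list for `lai summarize caption` with key=value options."""
    cmd = ["lai", "summarize", "caption", source, output]
    # Assumption: each option is passed as one trailing key=value token.
    cmd += [f"{key}={value}" for key, value in overrides.items()]
    return cmd


cmd = build_summarize_command(
    "podcast.aligned.json",
    "podcast.summary.md",
    {"summarization.lang": "zh", "summarization.length": "short"},
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once an LLM backend has been configured.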
| Problem | Fix |
|---|---|
| `validate.py` flags quote not in source | Quote must be verbatim: rewrite or pick another line |
| Chapter count / timestamp drift from meta.md | Hard constraint — use meta.md values exactly |
| Transcript too large for the agent | Fall back to CLI (secondary path) |
| `meta.md` chapters malformed | Remove from `meta.md` and let the agent generate chapters |
- `/lai-transcribe`: produce the transcript first
- `/lai-align`: precise timestamps feed chapter boundaries
- `/lai-diarize`: speaker labels enable speaker-aware summaries
- `/lai-translate`: translate the summary

Install: `npx claudepluginhub lattifai/lattifai-skills --plugin lattifai-skills`