Transcribe video/audio to clean text by driving the media-transcribe CLI. Thin adapter — no transcription logic of its own.
AI-native, engine-pluggable media-to-text. Point it at a video/audio file or a folder and get clean plain-text transcripts. It runs as a standalone CLI and ships a thin Claude Code adapter, so AI-tool users get a one-command entry point.
A media-to-text tool that is friendly to AI-native entry points, engine-pluggable, and able to run standalone.
media-transcribe (the tool) is a Python CLI that does the real work: ffmpeg audio
extraction → ASR → transcript cleaning, batched and resumable. It is usable from any shell,
any AI tool, cron, or CI. This keeps the logic testable and reusable instead of trapped in one vendor's prompt format.
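The first stage of that pipeline, audio extraction, might look roughly like the sketch below. This is an illustration only, not the tool's actual code: the helper name `build_extract_command`, the 16 kHz mono WAV target, and the output file name are all assumptions. Only the ffmpeg options themselves (`-y`, `-i`, `-vn`, `-ac`, `-ar`) are standard.

```python
from pathlib import Path


def build_extract_command(media: Path, wav: Path, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg argv that extracts a mono WAV track from a media file.

    Hypothetical helper: media-transcribe's real extraction settings may differ.
    """
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it already exists
        "-i", str(media),         # input video or audio file
        "-vn",                    # drop any video stream
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # resample (16 kHz is an assumed ASR-friendly rate)
        str(wav),
    ]


cmd = build_extract_command(Path("lesson01.mp4"), Path("lesson01.wav"))
print(" ".join(cmd))
```

With ffmpeg on PATH, the returned list could be passed to `subprocess.run(cmd, check=True)`; building the argv separately keeps the step easy to test without touching real media.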
uv tool install git+https://github.com/ZONGHAOLISOTA/media-transcribe
# or: pipx install git+https://github.com/ZONGHAOLISOTA/media-transcribe
# or from a clone: pip install -e .
Requires ffmpeg on PATH (brew install ffmpeg).
v0.1 ships one fully-working engine, Qwen MLX (Apple Silicon). See docs/INSTALL_QWEN.md.
# a whole folder, Chinese, with a domain-vocabulary file
media-transcribe ./videos --lang zh --context-file vocab.txt
# a single file
media-transcribe lesson01.mp4
# preview without transcribing
media-transcribe ./videos --recursive --dry-run
Transcripts mirror the input tree under <input>/transcripts/ by default. Re-runs skip
files that already have a non-empty transcript; failures are recorded in
transcripts/.media-transcribe/run.json, so a re-run retries only what's missing.
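The mirroring and skip-on-re-run behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool's implementation: the function names, the `.txt` transcript extension, the set of media suffixes, and the recursive walk are all hypothetical.

```python
from pathlib import Path
import tempfile

# Assumed set of media extensions; the real tool may recognize others.
MEDIA_SUFFIXES = {".mp4", ".mov", ".mkv", ".mp3", ".wav", ".m4a"}


def transcript_path(input_root: Path, media: Path) -> Path:
    """Mirror the media file's relative path under <input_root>/transcripts/."""
    relative = media.relative_to(input_root)
    # The .txt extension is an assumption made for this sketch.
    return (input_root / "transcripts" / relative).with_suffix(".txt")


def pending_files(input_root: Path) -> list[Path]:
    """Return media files that still lack a non-empty transcript."""
    pending = []
    for media in sorted(input_root.rglob("*")):
        if not media.is_file() or media.suffix.lower() not in MEDIA_SUFFIXES:
            continue
        if media.relative_to(input_root).parts[0] == "transcripts":
            continue  # never treat the output tree as input
        target = transcript_path(input_root, media)
        if target.is_file() and target.stat().st_size > 0:
            continue  # a non-empty transcript exists, so a re-run skips it
        pending.append(media)
    return pending


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "unit1").mkdir()
    (root / "unit1" / "a.mp4").write_bytes(b"fake")
    (root / "unit1" / "b.mp4").write_bytes(b"fake")
    done = transcript_path(root, root / "unit1" / "a.mp4")
    done.parent.mkdir(parents=True)
    done.write_text("hello")  # simulate a finished transcript for a.mp4
    remaining = [p.name for p in pending_files(root)]
    print(remaining)
```

Checking for a non-empty file rather than mere existence means a transcript left empty by a crashed run is retried instead of silently skipped.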
To send output elsewhere, pass --out <dir> (or set output_dir in a config file). An
explicit output path is used as given — a relative path resolves against your current
directory, not the input folder.
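The output-path rule above can be expressed as a small function. This is a sketch only: `resolve_output_dir` and its parameters are hypothetical names, and the `cwd` argument exists purely to make the example deterministic.

```python
from pathlib import Path
from typing import Optional


def resolve_output_dir(
    input_dir: Path, out: Optional[str] = None, cwd: Optional[Path] = None
) -> Path:
    """Pick the transcript directory following the rule described above."""
    if out is None:
        return input_dir / "transcripts"  # default: inside the input folder
    out_path = Path(out)
    if out_path.is_absolute():
        return out_path                   # absolute paths are used as given
    base = cwd if cwd is not None else Path.cwd()
    return base / out_path                # relative: resolved against cwd, not input


print(resolve_output_dir(Path("/data/videos")))
print(resolve_output_dir(Path("/data/videos"), "out", cwd=Path("/home/me")))
```

In other words, running `media-transcribe /data/videos --out out` from `/home/me` would be expected to write under `/home/me/out`, not `/data/videos/out`.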
/plugin marketplace add ZONGHAOLISOTA/media-transcribe
/plugin install media-transcribe@media-transcribe
Then just ask: "transcribe the videos in ./lessons".
| Engine | Status | Notes |
|---|---|---|
| qwen-mlx | ✅ v0.1 | Qwen3-ASR via mlx-qwen3-asr, Apple Silicon |
| faster-whisper / whisper.cpp / API | 🔜 | interface + docs ready |
Add your own engine without forking the core — see docs/ENGINE_DEVELOPMENT.md.
No GUI, no Obsidian/knowledge-base building, no speaker diarization, no subtitle editor, no multi-platform installer.
pip install -e ".[dev]"
pytest
MIT © 2026 Zonghao Li