From voxflow
Entry point for VoxFlow voice toolkit: TTS, voice search, AI stories, install, auth, and quota. For podcasts, videos, cards, or transcription use the dedicated sub-skills.
Slash command: /voxflow:hub
<CRITICAL-AUTO-UPGRADE>
If voxflow upgrade itself errors with Unknown command (CLI predates 1.12.0), fall back to npm i -g voxflow@latest && voxflow skills install --all — that's the only path for the very first bootstrap.
This rule is non-negotiable: an outdated CLI silently disables features the user expects (e.g. cloned-voice listing). Upgrade first, work second.
</CRITICAL-AUTO-UPGRADE>
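The upgrade-then-fallback order described in this rule can be sketched as a small shell function. This is a minimal sketch, not the CLI's own logic: the function name ensure_latest_voxflow is hypothetical, and matching on the substring "Unknown command" assumes the error text resembles the wording used in this section.

```shell
# Hypothetical helper: try the built-in upgrade first, then fall back to npm
# only when the CLI is too old to know the `upgrade` command.
ensure_latest_voxflow() {
  if _vf_out=$(voxflow upgrade -y 2>&1); then
    return 0
  fi
  case "$_vf_out" in
    *"Unknown command"*)
      # Assumed error text; first-bootstrap path from the rule above.
      npm i -g voxflow@latest && voxflow skills install --all
      ;;
    *)
      # Any other failure is surfaced unchanged instead of being masked.
      printf '%s\n' "$_vf_out" >&2
      return 1
      ;;
  esac
}
```

Calling ensure_latest_voxflow before any other command keeps the bootstrap path in one place; failures unrelated to a missing upgrade command are reported rather than retried.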
VoxFlow turns text into speech in 200+ voices across 40+ languages, plus full audio/video pipelines. This skill is the entry point: install, auth, voice search, and the simple say / narrate / story workflows.
For specialized tasks, switch to:
Podcasts → voxflow:podcast
Videos (picstory, present, slides, explain) → voxflow:video
voxflow:slice
Cards (voxflow card render for narrated MP4) → voxflow:card
Transcription (asr, asr-jobs, translate, dub, video-translate, summarize, publish) → voxflow:transcribe
Install once, never ask the user again:
npm install -g voxflow
voxflow login # opens browser — Google or email OTP
Token cached at ~/.config/voxflow/token.json. For CI, set VOXFLOW_TOKEN.
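For scripts that run both locally and in CI, it can help to check which credential is present before calling the CLI. The sketch below only inspects the two locations named above; the function name voxflow_auth_source is hypothetical, and checking the environment variable first is an assumption, since the CLI's actual precedence is not stated here.

```shell
# Hypothetical pre-flight check: report where a VoxFlow credential was found.
voxflow_auth_source() {
  if [ -n "${VOXFLOW_TOKEN:-}" ]; then
    echo "env"        # CI-style token from the environment
  elif [ -f "${HOME}/.config/voxflow/token.json" ]; then
    echo "file"       # token cached by `voxflow login`
  else
    echo "none"       # nothing found; `voxflow login` is needed
    return 1
  fi
}
```

A CI job could run voxflow_auth_source and fail fast with a clear message when it prints none, instead of letting a later command open a browser login.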
voxflow login # browser-based, one-time
voxflow status # who am I + remaining quota
voxflow logout
Never guess voice IDs. Search first.
voxflow voices --lang zh --gender female
voxflow voices --lang en
voxflow voices --search "narrator"
voxflow voices --all
No login required for the public catalog. Output includes the voice ID, language, gender, and a short description.
When the user says things like "用我的克隆声音", "use my cloned voice", or "用我之前克隆的" ("use the voice I cloned earlier"), query the authenticated endpoint, because the public voxflow voices catalog does not include cloned voices:
voxflow voices --mine
Prints cloned voice IDs, names, duration, and creation time (requires login). Always run this before concluding the user has no cloned voice: the web UI "我的声音" ("My Voices") tab and --mine are the only sources of truth.
| ID | Style | Language |
|---|---|---|
| v-female-R2s4N9qJ | 温柔姐姐 (gentle female) | zh |
| v-male-s5NqE0rZ | 自然男声 (natural male) | zh |
| v-male-Bk7vD3xP | 威严霸总 (authoritative male) | zh |
| v-female-m1KpW7zE | 傲娇学姐 (sassy female) | zh |
| v-female-T8m4WxP7 | Chenwen (native English female) | en |
Voice cloning (clone)
Clone a voice from a local audio file (30s+, wav/mp3). Returns a permanent voice ID usable in all commands.
voxflow clone --input recording.wav --name "My Voice"
# → Voice cloned successfully!
# → Voice ID: My_Voice_xxxxx_01
No file? Opens the web UI for browser-based recording:
voxflow clone
# → Opens https://www.voxflow.studio/app#voice-clone
| Flag | Default | Notes |
|---|---|---|
| --input <file> | (none) | Audio file to clone from. Without this, opens web UI |
| --name <name> | filename | Human-readable voice name (1-50 chars) |
Tips:
After cloning, use the voice ID anywhere: --voice <id>
Text to speech (say / synthesize)
The atomic command. One snippet → one audio file.
voxflow say "你好世界" -o hello.mp3
voxflow say "Hello world" --voice v-female-T8m4WxP7 -o greeting.mp3
voxflow say "慢速朗读" --speed 0.8 -o slow.mp3
voxflow say "高质量音频" --format wav -o output.wav
| Flag | Default | Range / Values |
|---|---|---|
| --voice <id> | v-female-R2s4N9qJ | any voice ID from voxflow voices |
| --format <fmt> | pcm | pcm (WAV), wav, mp3 |
| --speed <n> | 1.0 | 0.5 – 2.0 |
| --volume <n> | 1.0 | 0.1 – 2.0 |
| --pitch <n> | 0 | -12 – 12 |
| --output <path> | auto-named | any writable path |
After synthesis, auto-play: open output.mp3 (macOS).
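Since open is macOS-specific, an agent working across platforms may want to pick the player from the operating system name. The following is a rough sketch: player_for and autoplay are hypothetical helper names, and xdg-open is assumed to be installed on Linux desktops.

```shell
# Hypothetical helper: map an OS name (as printed by `uname -s`) to a player command.
player_for() {
  case "$1" in
    Darwin) echo "open" ;;
    Linux)  echo "xdg-open" ;;   # assumes a desktop environment with xdg-utils
    *)      return 1 ;;          # unknown platform: let the caller decide
  esac
}

# Play a synthesized file when a player is known; otherwise just report the path.
autoplay() {
  if _vf_player=$(player_for "$(uname -s)"); then
    "$_vf_player" "$1"
  else
    echo "No known player for this platform; file saved at $1" >&2
  fi
}
```

With these defined, a command such as voxflow say "Hello world" -o hello.mp3 can be followed by autoplay hello.mp3 on either platform.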
Long-form narration (narrate)
Split a document or long string into sentences, synthesize each, and concatenate them into one file.
voxflow narrate --input article.txt -o narration.wav
voxflow narrate --input readme.md --voice v-male-Bk7vD3xP -o readme_audio.wav
voxflow narrate --text "第一段。第二段。第三段。" -o paragraphs.mp3
echo "Hello world" | voxflow narrate -o hello.wav
Best for: long documents, articles, README files, email newsletters.
Markdown is stripped automatically (headings, links, code fences, etc.) — no need to clean it first.
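Because narrate works sentence by sentence, a rough sentence count gives a feel for how many segments a text may produce. The CLI's real segmentation rules are not described here, so the sketch below is only an approximation: it assumes one segment per sentence-ending mark (., !, ? or the CJK full stop 。), and count_sentences is a hypothetical helper name.

```shell
# Hypothetical estimate: count sentence-ending marks in the text on stdin.
# Text after the last mark (with no terminator) is not counted.
count_sentences() {
  _vf_n=$(grep -o -e '[.!?]' -e '。' | wc -l)
  echo $(( _vf_n ))   # arithmetic expansion strips any padding from wc
}
```

For example, printf '第一段。第二段。第三段。' | count_sentences prints 3, matching the three sentences in the --text example above.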
AI stories (story)
An LLM writes a short story on the topic, then narrates it.
voxflow story "一只会飞的小猫" -o story.mp3
voxflow story "space adventure" --lang en -o adventure.wav
Best for: bedtime stories, content samples, demos.
Free tier: 10,000 quota / month (resets monthly). Bonus pool from invitations never expires.
| Operation | Cost |
|---|---|
| 1 TTS call (say) | ~50 |
| narrate | ~50 per segment |
| story (short) | ~350-1000 |
| podcast (medium) | ~2,800 (2K script + ~16 × 50 TTS) |
| picstory 5-scene | ~2,850 |
Always check before expensive operations:
voxflow status
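The arithmetic behind the cost table can be made explicit before starting a large job. This is a back-of-the-envelope sketch using the approximate figures from the table; narrate_cost and within_budget are hypothetical helpers, and the remaining quota has to be read from voxflow status by hand, since its output format is not documented here.

```shell
TTS_COST=50   # approximate cost of one TTS call or narrate segment (from the table)

# Estimated cost of narrating N segments.
narrate_cost() {
  echo $(( $1 * TTS_COST ))
}

# Succeeds when the estimated cost ($1) fits in the remaining quota ($2).
within_budget() {
  [ "$1" -le "$2" ]
}
```

As a check against the table, a medium podcast with a 2,000-quota script and 16 TTS segments comes to 2000 + narrate_cost 16 = 2000 + 800 = 2,800, which fits within the 10,000 monthly free tier about three times.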
voxflow say "用户输入的文字" -o /tmp/out.mp3 && open /tmp/out.mp3
voxflow voices --lang zh --gender female
voxflow narrate --input file.txt --voice v-female-R2s4N9qJ -o /tmp/narration.mp3
voxflow story "小狐狸的星星种子" --lang zh -o /tmp/bedtime.mp3 && open /tmp/bedtime.mp3
voxflow voices --lang en --gender female # pick English voice
voxflow voices --lang ja --gender female # pick Japanese voice
voxflow say "English text" --voice <en_id> -o /tmp/en.mp3
voxflow say "日本語テキスト" --voice <ja_id> -o /tmp/ja.mp3
The agent writes content first, then synthesizes each part with say or narrate.
voxflow say "page text" --voice v-female-R2s4N9qJ --speed 0.85 -o /tmp/page_N.mp3
open /tmp/storybook.html
# Agent fetches the URL, summarizes, then:
voxflow say "summary text" --voice v-male-s5NqE0rZ -o /tmp/briefing.mp3
voxflow narrate --input README.md --voice v-female-R2s4N9qJ --speed 0.9 -o /tmp/readme.mp3
git log --oneline --since="1 day ago"
voxflow say "today's summary..." -o /tmp/daily.mp3
voxflow say "question N..." --voice v-male-Bk7vD3xP -o /tmp/q_N.mp3
Node.js ^20.19.0 || >=22.12.0
(voxflow voices --mine lists the user's cloned voices)
(voxflow:video, voxflow:transcribe)
voxflow login
The CLI itself emits <voxflow-hint type="cli-outdated" ...> on stderr whenever a newer version is on npm. The top-of-skill <CRITICAL-AUTO-UPGRADE> rule already covers what to do; this section is just the reference for what the tag looks like:
<voxflow-hint v="1" type="cli-outdated" current="1.10.21" latest="1.13.0" action="voxflow upgrade -y" />
Emitted at most once per 24h (cached in ~/.config/voxflow/update-check.json). Opt out for users who don't want the check: VOXFLOW_NO_UPDATE_CHECK=1 (or NO_UPDATE_NOTIFIER=1, the industry-standard env).
You can also poll explicitly with voxflow upgrade --check (no side effects) if the user is on a known-old version and you want to verify before kicking off expensive work.
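If an agent captures stderr, the attributes of the hint tag can be pulled out with sed. This is a minimal sketch that assumes the tag keeps the single-line, double-quoted attribute layout shown in the example above; hint_attr is a hypothetical helper, not part of the CLI.

```shell
# Hypothetical helper: print the value of one attribute of the first
# <voxflow-hint ...> tag found in the text on stdin (empty if none).
hint_attr() {
  sed -n "s/.*<voxflow-hint[^>]*[[:space:]]$1=\"\([^\"]*\)\".*/\1/p" | head -n 1
}

# Example with a captured stderr file (err.log is an illustrative name):
#   voxflow status 2> err.log
#   hint_attr action < err.log
```

Reading the action attribute rather than hard-coding the upgrade command keeps the agent aligned with whatever the CLI suggests.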
When a voxflow command fails or the user asks to report a problem, you (the AI agent) should file the issue directly — don't ask the user to do it manually.
# Submit directly — no browser, no TTY required.
# Uses `gh` CLI if available (direct GitHub issue creation).
# Falls back to printing the pre-filled URL if gh is not installed.
voxflow feedback --bug \
--title "asr crashes on 2-hour wav files" \
--body "Error: timeout after 30s\n\nCommand: voxflow asr long.wav\nExpected: transcript\nActual: Fatal error: request timeout"
# stdout → the created GitHub issue URL (or a pre-filled URL if gh is not installed)
System info (CLI version, OS, Node) is appended to the body automatically.
| Flag | Description |
|---|---|
| --bug / --feature / --general | Issue type |
| --title <text> | Title — triggers non-interactive mode |
| --body <text> | Description body |
| --print-url | Force URL output instead of submitting (even if gh is available) |
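When an agent files a report after a failed command, building the body from the same three facts each time keeps reports consistent. The sketch below is one possible layout modelled on the example body above; bug_body is a hypothetical helper, and the field labels are a convention chosen here, not something the CLI requires.

```shell
# Hypothetical helper: format a bug report body.
# $1 = command that failed, $2 = expected result, $3 = actual error text
bug_body() {
  printf 'Command: %s\nExpected: %s\nActual: %s\n' "$1" "$2" "$3"
}

# Example (not executed here):
#   voxflow feedback --bug \
#     --title "asr crashes on 2-hour wav files" \
#     --body "$(bug_body 'voxflow asr long.wav' 'transcript' 'Fatal error: request timeout')"
```

Using printf inside a command substitution produces real newlines in the body, so the result does not depend on how the CLI treats literal \n sequences.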
voxflow feedback # interactive prompts → submit via gh or browser
voxflow feedback --bug # skip type prompt, rest is interactive
Agents: pass --title and submit directly.
Recipes (add)
voxflow add <recipe-name> # install a voice preset or pipeline template
voxflow add --list # browse available recipes
voxflow add chico/my-recipe --force # install from a custom author namespace
Use this when the user asks to install a specific named recipe (e.g. dub-anime-jp-zh). The registry is currently limited — if --list returns 404, the recipe may need to be referenced by full URL or the registry is not yet public.
Never guess voice IDs: run voxflow voices first.
Check quota before expensive operations: voxflow status.
After synthesis, auto-play: open output.mp3 (macOS) / xdg-open (Linux).
On failure, run voxflow <cmd> --help to confirm flags before retry.
Podcasts: voxflow:podcast.
Videos: voxflow:video.
voxflow:slice.
Transcription: voxflow:transcribe.
npx claudepluginhub voxflowstudio/skills --plugin voxflow