Skill

pocket-tts

From research-skills

Speaks text aloud using a local TTS server on macOS. Streams audio for low-latency playback via ffplay or afplay.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/research-skills:pocket-tts

Context Preview

The summary Claude sees in its skill listing — used to decide when to auto-load this skill

Local text-to-speech via pocket-tts server. Streams audio for low latency. **macOS only** (uses `afplay` as fallback).

SKILL.md

65 lines · ~592 tokens

Stats

Language: Shell

Stars: 5

Forks: 1

Maintenance: Good

Last Commit: May 30, 2026

Pocket TTS

Local text-to-speech via pocket-tts server. Streams audio for low latency. macOS only (uses afplay as fallback).

Prerequisites: pip install pocket-tts (the server) and brew install ffmpeg (provides ffplay)

Quick Reference

# Ensure server is running (do this first)
curl -s http://localhost:8321/health > /dev/null 2>&1 || {
  pocket-tts serve --voice ~/.config/pocket-tts/default-voice.wav --port 8321 > /dev/null 2>&1 &
  sleep 4
}

# Speak with streaming playback (audio starts immediately)
curl -s -X POST http://localhost:8321/tts -F "text=Hello world" -o - | ffplay -nodisp -autoexit -loglevel quiet -

# Or with temp file (if ffplay unavailable); the file is removed even if playback fails
curl -s -X POST http://localhost:8321/tts -F "text=Hello world" -o /tmp/speak.wav && afplay /tmp/speak.wav; rm -f /tmp/speak.wav
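
The two playback commands above can be folded into one helper that chooses a player at call time. This is only a sketch: the function names pick_player and speak are hypothetical and not part of pocket-tts. It uses curl's --form-string instead of -F so that text beginning with @ or < is sent literally rather than read from a file, and it assumes afplay can play the WAV the server returns once it is saved to disk.

```shell
# pick_player: print the first available player name, or "none".
pick_player() {
  if command -v ffplay > /dev/null 2>&1; then
    echo ffplay
  elif command -v afplay > /dev/null 2>&1; then
    echo afplay
  else
    echo none
  fi
}

# speak TEXT: hypothetical wrapper around the /tts endpoint shown above.
speak() {
  local text tmpdir rc
  text="$1"
  case "$(pick_player)" in
    ffplay)
      # Stream straight into ffplay so audio starts during generation.
      curl -s -X POST http://localhost:8321/tts --form-string "text=$text" -o - |
        ffplay -nodisp -autoexit -loglevel quiet -
      ;;
    afplay)
      # afplay needs a file; use a private temp directory and always clean up.
      tmpdir="$(mktemp -d)" || return 1
      curl -s -X POST http://localhost:8321/tts --form-string "text=$text" \
        -o "$tmpdir/speak.wav" && afplay "$tmpdir/speak.wav"
      rc=$?
      rm -rf "$tmpdir"
      return "$rc"
      ;;
    *)
      echo "speak: neither ffplay nor afplay found" >&2
      return 1
      ;;
  esac
}
```

With the server already running, speak "Hello world" behaves like the streaming command when ffplay is installed and falls back to the temp-file path otherwise.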

Architecture

Always use the server — it keeps the model and voice embedding warm in memory.

Port: 8321
Default voice: ~/.config/pocket-tts/default-voice.wav (loaded once at server start)
Streaming: /tts returns chunked WAV. Pipe to ffplay for immediate playback during generation.
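
Because the model and voice are loaded once at server start, a caller may prefer to poll the /health endpoint instead of sleeping for a fixed 4 seconds. The following is a minimal sketch of that idea. The names wait_until, server_healthy, and ensure_server are hypothetical, and the 15 one-second attempts are an arbitrary choice rather than a documented limit.

```shell
# wait_until TRIES DELAY CMD...: rerun CMD until it succeeds or TRIES run out.
wait_until() {
  local tries delay
  tries="$1"
  delay="$2"
  shift 2
  while [ "$tries" -gt 0 ]; do
    "$@" && return 0
    tries=$((tries - 1))
    sleep "$delay"
  done
  return 1
}

# server_healthy: same health probe as in the Quick Reference.
server_healthy() {
  curl -s http://localhost:8321/health > /dev/null 2>&1
}

# ensure_server: start the server only if it is not already answering,
# then wait until the health probe succeeds (assumed limit: 15 tries, 1s apart).
ensure_server() {
  server_healthy && return 0
  pocket-tts serve --voice "$HOME/.config/pocket-tts/default-voice.wav" \
    --port 8321 > /dev/null 2>&1 &
  wait_until 15 1 server_healthy
}
```

Calling ensure_server before each request returns immediately when the server is warm and only blocks while the server is still starting.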

Changing Voices

Per request (the server keeps the default voice warm but can generate with others):

curl -s -X POST http://localhost:8321/tts -F "text=Hello" -F "voice_url=jean" -o - | ffplay -nodisp -autoexit -loglevel quiet -

Built-in voices: alba, marius, javert, jean, fantine, cosette, eponine, azelma

Custom: pass any http://, https://, or hf:// URL as voice_url

To change the default, restart the server with a different --voice.
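
A small wrapper can catch a mistyped voice name before any request is sent. This sketch checks the argument against the built-in names and URL schemes listed above. The function names are hypothetical, and the server may well accept values this check rejects, so treat it as a convenience rather than a description of server-side validation.

```shell
# is_builtin_voice NAME: succeed if NAME is one of the built-in voices listed above.
is_builtin_voice() {
  case "$1" in
    alba|marius|javert|jean|fantine|cosette|eponine|azelma) return 0 ;;
    *) return 1 ;;
  esac
}

# is_voice_url VALUE: succeed if VALUE uses one of the supported URL schemes.
is_voice_url() {
  case "$1" in
    http://*|https://*|hf://*) return 0 ;;
    *) return 1 ;;
  esac
}

# speak_as VOICE TEXT: hypothetical helper for per-request voices.
speak_as() {
  local voice text
  voice="$1"
  text="$2"
  if ! is_builtin_voice "$voice" && ! is_voice_url "$voice"; then
    echo "speak_as: unknown voice '$voice'" >&2
    return 2
  fi
  # --form-string sends both values literally (no @file or <file handling).
  curl -s -X POST http://localhost:8321/tts \
    --form-string "text=$text" --form-string "voice_url=$voice" -o - |
    ffplay -nodisp -autoexit -loglevel quiet -
}
```

For example, speak_as fantine "Hello" uses a built-in voice for one request while the server's default voice stays loaded.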

Creating Custom Voices

# Extract 30s clip from source (pocket-tts truncates to 30s anyway)
ffmpeg -y -ss START_SECONDS -t 30 -i input.mp3 -ar 24000 -ac 1 ~/.config/pocket-tts/default-voice.wav
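
The extraction step can be wrapped in a function so the start offset and output path become arguments. This is a sketch under assumptions: make_voice is a hypothetical name, the start offset defaults to 0, and the output defaults to the default-voice path used elsewhere in this document.

```shell
# make_voice SOURCE [START_SECONDS] [OUTPUT]
# Cuts a 30s, 24 kHz, mono WAV clip from SOURCE, using the same ffmpeg flags as above.
make_voice() {
  local src start out
  src="$1"
  start="${2:-0}"
  out="${3:-$HOME/.config/pocket-tts/default-voice.wav}"
  if [ ! -f "$src" ]; then
    echo "make_voice: no such file: $src" >&2
    return 2
  fi
  # Create the target directory if it does not exist yet.
  mkdir -p "$(dirname "$out")" || return 1
  ffmpeg -y -ss "$start" -t 30 -i "$src" -ar 24000 -ac 1 "$out"
}
```

Since the default voice is loaded once at server start, overwriting default-voice.wav this way only takes effect after the server is restarted.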

Troubleshooting

Server not responding: Check whether the server process has died; if so, restart it with the pocket-tts serve command.

Slow first response: The server needs ~4s to load the model on first start.

No audio: Ensure ffplay (from ffmpeg) or afplay (macOS built-in) is available.
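
The three checks above can be run in one pass. The sketch below prints one status line per check. The name diagnose is hypothetical, and pgrep -f matches against full command lines, so the process check is only a rough signal.

```shell
# diagnose: print one status line each for process, health endpoint, and players.
diagnose() {
  local player
  # Rough check: is any process running with "pocket-tts serve" in its command line?
  if pgrep -f "pocket-tts serve" > /dev/null 2>&1; then
    echo "process: running"
  else
    echo "process: not found"
  fi
  # Give the health endpoint at most 2 seconds to answer.
  if curl -s --max-time 2 http://localhost:8321/health > /dev/null 2>&1; then
    echo "health: ok"
  else
    echo "health: no response"
  fi
  for player in ffplay afplay; do
    if command -v "$player" > /dev/null 2>&1; then
      echo "$player: found"
    else
      echo "$player: missing"
    fi
  done
}
```

A "process: running" line together with "health: no response" would suggest the server is still loading the model; both players missing explains silent output.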
