Skill

scream

Give Claude Code a voice. Text-to-speech for Claude's responses via Gemini Flash TTS (or macOS `say` offline). Use `/scream` to read the last response aloud, `/scream <text>` to speak arbitrary text, `/scream voice` to browse and pick from 30 named Gemini voices with character descriptions, `/scream auto on|off` to toggle a Stop-hook that auto-plays every response, `/scream test` to self-check the harness. Useful when the operator's eyes are tired, while multitasking, or for accessibility. Reads from `GEMINI_TTS_API_KEY` env var; never bakes secrets into the skill.

Invocation

How this skill is triggered — by the user, by Claude, or both

Slash command

/claude-must-scream:claude-must-scream

User invocable

Model invocable

Inline context

Default effort

Supporting Files

docs/hooks.md
docs/setup.md
docs/voices.md
examples/env.example
examples/settings.hook.example.json
scripts/scream.py

SKILL.md

150 lines · ~1.7k tokens

Stats

Language: Python
Stars: 1
Maintenance: Excellent
Last Commit: May 27, 2026


claude-must-scream

Give Claude a voice. A small CLI that pipes Claude's responses (or arbitrary text) through Gemini Flash TTS and plays the audio locally — so you can listen instead of read when your eyes are tired, or hear a long agent run wrap up while you're across the room.

The name is a nod to Harlan Ellison's I Have No Mouth, and I Must Scream. The AI in that story had no mouth and couldn't be silenced. The AI here gets a mouth and a 30-voice picker.

When to invoke

Trigger on:

"Read that aloud" / "speak that"
"TTS the last response"
"/scream"
"Play it back"
"I want to hear it" / "let me listen"
"Switch voice to X" / "change to a different voice"
Any time the operator asks for audio output of conversation content.

Do NOT invoke for:

Generating audio files for downstream use (this skill plays-and-discards). Use Gemini TTS directly for file output.
Voice input / dictation (this skill is output-only).
Long-form audiobook narration (cap is 1,200 chars; designed for short agent responses, not chapters).

Prereqs

macOS (uses afplay for playback and say as offline fallback). Linux support possible — patches welcome.
Python 3.9+ (stdlib only — no requests, no playwright, no installs).
One of:
- Gemini API key (primary, best quality), supplied either via the GEMINI_TTS_API_KEY env var or a key file at ~/.config/claude-tts/gemini-tts.key (0600). The env var wins when both are present; the key file is the fallback so sessions already running when the var was first exported still reach Gemini — a process's environment is frozen at launch. See docs/setup.md for how to mint one.
- macOS say (no setup; lower quality but works offline + free).
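The key lookup order described above (env var first, key file as fallback) can be sketched as follows. This is a minimal illustration, not the code in scripts/scream.py: the function name `resolve_api_key` and its parameters are hypothetical, and only the env var name and key-file path come from the text.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Path taken from the prereqs above; everything else here is illustrative.
KEY_FILE = Path.home() / ".config" / "claude-tts" / "gemini-tts.key"


def resolve_api_key(env: Mapping[str, str] = os.environ,
                    key_file: Path = KEY_FILE) -> Optional[str]:
    """Return the Gemini key, preferring the env var over the key file."""
    key = env.get("GEMINI_TTS_API_KEY", "").strip()
    if key:
        return key  # env var wins when both are present
    try:
        key = key_file.read_text(encoding="utf-8").strip()
    except OSError:
        return None  # no readable key file: caller can fall back to `say`
    return key or None
```

Returning None rather than raising lets the caller decide whether to fall back to the offline provider.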

Make the scream CLI callable:

ln -sf <skill-dir>/scripts/scream.py ~/.local/bin/scream
chmod +x <skill-dir>/scripts/scream.py

The verbs

# On-demand
scream speak --text "hello world"        # speak arbitrary text
scream last                              # speak the last assistant message in this session
echo "piped text" | scream speak         # stdin → TTS

# Voice management
scream voice list                        # show all 30 voices + character descriptions
scream voice pick                        # interactive numbered picker (writes config)
scream voice set Aoede                   # set default voice non-interactively
scream voice preview Fenrir              # synthesize a short test phrase in a voice

# Auto-mode (Stop hook reads every response)
scream auto on                           # toggle on (hook must be installed; see docs/hooks.md)
scream auto off                          # toggle off
scream auto status                       # show current state + config

# Self-check + control
scream test                              # env + API self-check
scream test --speak                      # also plays a test phrase
scream stop                              # interrupt current playback (kills afplay / say)

All speak and last invocations accept:

--voice <name> — override the configured default for one call
--provider gemini|say — override the configured provider
--max-chars N — override the length cap (default 1,200)
--dry-run — print what would be spoken without playing
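One way to wire the shared flags above onto both verbs is an argparse parent parser. This is a sketch under assumptions: the real scream.py may parse its arguments differently, `build_parser` is a hypothetical name, and only the `speak` and `last` verbs are modeled.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering only `speak` and `last`."""
    # Flags shared by both verbs, per the list above.
    shared = argparse.ArgumentParser(add_help=False)
    shared.add_argument("--voice")
    shared.add_argument("--provider", choices=["gemini", "say"])
    # None means "no per-call override": use the configured default.
    shared.add_argument("--max-chars", type=int, default=None)
    shared.add_argument("--dry-run", action="store_true")

    parser = argparse.ArgumentParser(prog="scream")
    sub = parser.add_subparsers(dest="verb", required=True)
    speak = sub.add_parser("speak", parents=[shared])
    speak.add_argument("--text")  # omitted when text arrives on stdin
    sub.add_parser("last", parents=[shared])
    return parser
```

Leaving the flag defaults at None keeps the precedence simple: a per-call flag beats config.json, which beats the built-in default.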

How Claude should use it

On /scream with no args, invoke scream last. The script auto-detects the current session's jsonl and reads the last assistant message.
On /scream <some text>, invoke scream speak --text "<text>". The script strips markdown automatically, so don't pre-process the text.
On /scream voice, invoke scream voice pick for interactive selection, or scream voice list if non-interactive output is wanted.
On /scream auto on|off, invoke scream auto on|off. If turning on and the Stop hook isn't installed yet, point the user at docs/hooks.md for the one-time settings.json snippet.
On any failure, run scream test to diagnose which layer (env var, API reachability, voice name, model availability) broke.
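The mapping from slash-command arguments to CLI invocations listed above can be summarized in a small dispatch function. The name `scream_argv` is hypothetical and this is only an illustration of the rules, not part of the skill.

```python
from typing import List


def scream_argv(slash_args: str) -> List[str]:
    """Map the text after `/scream` to a scream CLI argument list."""
    args = slash_args.strip()
    if not args:
        return ["scream", "last"]
    if args == "voice":
        return ["scream", "voice", "pick"]
    if args in ("auto on", "auto off"):
        return ["scream", "auto", args.split()[1]]
    if args == "test":
        return ["scream", "test"]
    # Anything else is arbitrary text; an argv list avoids shell quoting issues.
    return ["scream", "speak", "--text", args]
```

The resulting list can be passed to `subprocess.run` directly, without going through a shell.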

Markdown stripping

Code fences, links, table rows, bold/italic markers, headers, and bare URLs are all stripped before synthesis. Code fences become "(code block omitted)" so the cap-length budget doesn't get blown on a 500-line example. The synthesized text should sound like prose, not punctuation soup.
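A rough regex-based sketch of this kind of stripping is shown below. It is an approximation for illustration and not the rules used by scream.py; in particular, keeping a link's label while dropping its target is an assumption, and `strip_markdown` is a hypothetical name.

```python
import re

FENCE = "`" * 3  # triple backtick, built indirectly to keep this example readable


def strip_markdown(text: str) -> str:
    """Reduce markdown to speakable prose (illustrative rules only)."""
    # Replace fenced code blocks first so their contents never reach later rules.
    fenced = re.escape(FENCE) + r".*?" + re.escape(FENCE)
    text = re.sub(fenced, "(code block omitted)", text, flags=re.DOTALL)
    # Drop table rows (lines that start with a pipe).
    text = re.sub(r"^[ \t]*\|.*$", "", text, flags=re.MULTILINE)
    # Keep link text, drop the target: [label](url) becomes label (assumption).
    text = re.sub(r"\[([^\]]+)\]\([^)]*\)", r"\1", text)
    # Remove bare URLs.
    text = re.sub(r"https?://\S+", "", text)
    # Remove header markers at line start.
    text = re.sub(r"^#{1,6}\s+", "", text, flags=re.MULTILINE)
    # Remove bold/italic markers and inline-code backticks.
    text = re.sub(r"\*\*|__|\*|`", "", text)
    # Collapse runs of spaces and excess blank lines.
    text = re.sub(r"[ \t]+", " ", text)
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()
```

Handling the code fences first matters: it keeps a long example from consuming the length budget and from being mangled by the later rules.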

Length cap

Default 1,200 characters (over a minute of speech at a typical speaking rate, ~$0.001 per call at Gemini Flash TTS rates). The cap tries to land on a sentence boundary; if it can't, it appends "…". Override per-call with --max-chars. Override the default in ~/.config/claude-tts/config.json.
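The sentence-boundary behaviour can be sketched as below. This is one plausible reading of the rule, not the script's actual logic: `cap_text` is a hypothetical name, and treating ".", "!" and "?" as the only sentence enders is an assumption.

```python
def cap_text(text: str, max_chars: int = 1200) -> str:
    """Trim text to max_chars, preferring to end on a sentence boundary."""
    if len(text) <= max_chars:
        return text
    window = text[:max_chars]
    # Position of the last sentence-ending punctuation inside the window, if any.
    cut = max(window.rfind(ch) for ch in ".!?")
    if cut != -1:
        return window[:cut + 1]
    # No boundary found: hard cut, leaving room for the ellipsis marker.
    return text[:max_chars - 1].rstrip() + "…"
```

For example, `cap_text("Hello there. This is long text without end", 20)` returns `"Hello there."`, while `cap_text("abcdefghij", 5)` returns `"abcd…"`.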

Configuration

Two places:

~/.config/claude-tts/config.json — user prefs (voice, provider, max_chars, auto-mode flag, model). Written by scream voice set / scream auto on. Mode 0600. Not committed to the public repo.
GEMINI_TTS_API_KEY env var — your API key. The script reads it but never writes it anywhere. See docs/setup.md for the recommended pattern of storing the key in ~/.config/claude-tts/gemini-tts.key (0600) and exporting from your shell rc.
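A minimal sketch of reading and writing the prefs file with mode 0600 follows. The helper names and the example keys are illustrative assumptions; only the path and the 0600 mode come from the text.

```python
import json
import os
from pathlib import Path

CONFIG_PATH = Path.home() / ".config" / "claude-tts" / "config.json"


def save_config(prefs: dict, path: Path = CONFIG_PATH) -> None:
    """Write prefs as JSON, readable and writable by the owner only."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create with 0o600 so the file is never world-readable, even briefly.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(prefs, fh, indent=2)
    os.chmod(path, 0o600)  # tighten a pre-existing file with looser permissions


def load_config(path: Path = CONFIG_PATH) -> dict:
    """Return saved prefs, or an empty dict if nothing has been saved yet."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        return {}
```

Usage might look like `save_config({"voice": "Aoede", "provider": "gemini", "max_chars": 1200})`, where the key names are guesses at the schema rather than the documented one.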

Limits / non-goals

macOS-only for now. afplay + say are macOS binaries. Linux port would swap to aplay / paplay and espeak-ng / mimic3. Patches welcome.
No streaming. Synthesizes one chunk, plays it. Long responses feel laggy by design — the cap exists for a reason.
No transcription / STT. Output-only.
One playback at a time, unenforced. A second invocation while one is in flight is not queued; it starts a second afplay and the two overlap. Use scream stop first if that's not what you want.
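For the Linux port mentioned above, binary selection could be table-driven. This is a hypothetical sketch of that idea, not code from the skill: the tables mirror the binaries named in the text, while `first_available` and its parameters are invented for illustration.

```python
import shutil
import sys
from typing import Callable, Dict, List, Optional

# Candidate binaries per platform, in the order the text names them.
PLAYERS: Dict[str, List[str]] = {"darwin": ["afplay"], "linux": ["aplay", "paplay"]}
SPEAKERS: Dict[str, List[str]] = {"darwin": ["say"], "linux": ["espeak-ng", "mimic3"]}


def first_available(table: Dict[str, List[str]],
                    platform: str = sys.platform,
                    which: Callable[[str], Optional[str]] = shutil.which
                    ) -> Optional[str]:
    """Return the first candidate for this platform that is on PATH."""
    for name in table.get(platform, []):
        if which(name):
            return name
    return None  # unsupported platform or nothing installed
```

Injecting `which` keeps the lookup testable without the binaries being installed.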

Setup walkthrough

See docs/setup.md for the full one-time setup, including how to mint a dedicated GCP API key restricted to the TTS API.

Companion skills

update-config — if you want the Stop hook installed automatically. See docs/hooks.md for the snippet it should add.
