Local multimodal I/O bridge for Claude Code — voice and vision
npx claudepluginhub qte77/cc-senses-plugin
Local multimodal I/O bridge for Claude Code: TTS (/speak), STT (/listen), VLM (/see).
Status: Prototype — end-to-end voice for Claude Code. TTS output via PTY proxy, STT input module scaffolded (config, engine, mic, VAD, PTY injection). Not production-ready.
End-to-end voice plugin for Claude Code. TTS speaks Claude's responses aloud; STT, currently scaffolded, is designed to capture voice input via Moonshine or Vosk.
- Live TTS through the PTY wrapper (cc-tts-wrap claude)
- /speak skill: speak specific text or toggle auto-read mode
- NoMicrophoneError graceful degradation
- /listen skill: voice input and offline transcription (planned)

Session summary generated with three engines for comparison:
| Engine | Quality | File |
|---|---|---|
| espeak-ng | Robotic (rule-based) | assets/audio/cc-tts-summary-espeak-ng.wav |
| Piper (amy) | Natural (neural VITS, ~60MB) | assets/audio/cc-tts-summary-piper.wav |
| Kokoro (sarah) | Best local (82M params) | assets/audio/cc-tts-summary-kokoro.wav |
make setup_dev # install package + dev deps
make setup_tts # install espeak-ng + mpv (robotic, zero-config)
make setup_piper # install Piper (neural, good quality)
make setup_kokoro # install Kokoro (best local quality)
# Live TTS (PTY wrapper — real-time)
cc-tts-wrap claude
# On-demand TTS (CLI)
cc-tts "Hello from Claude Code"
# Batch auto-read (Stop hook — set in .cc-tts.toml)
# auto_read = true
Create .cc-voice.toml in the project root (the legacy .cc-tts.toml is also read):
engine = "auto" # "espeak" | "piper" | "kokoro" | "auto"
voice = "en_US-amy-medium" # engine-specific voice name
speed = 1.0
auto_read = false # enable Stop hook auto-read
max_chars = 2000
player = "auto" # "mpv" | "ffplay" | "aplay" | "auto"
[stt]
engine = "auto" # "moonshine" | "vosk" | "auto"
language = "en"
wake_word = "hey_claude"
mic_device = "default"
auto_listen = false
TTS env overrides: CC_TTS_ENGINE, CC_TTS_VOICE, CC_TTS_SPEED, CC_TTS_AUTO_READ, CC_TTS_PLAYER.
STT env overrides: CC_STT_ENGINE, CC_STT_LANGUAGE, CC_STT_WAKE_WORD, CC_STT_MIC_DEVICE, CC_STT_AUTO_LISTEN.
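One way the TTS overrides could be layered on top of the file config is sketched below. The variable names are the ones listed above; the helper apply_env_overrides, the mapping to config keys, and the rule for parsing booleans are hypothetical and may differ from what the plugin does.

```python
import os
from collections.abc import Mapping


def _parse_bool(value: str) -> bool:
    # Assumed convention: these strings count as true, anything else is false.
    return value.strip().lower() in {"1", "true", "yes", "on"}


# Environment variable -> (config key, parser). The key mapping is an assumption.
TTS_ENV = {
    "CC_TTS_ENGINE": ("engine", str),
    "CC_TTS_VOICE": ("voice", str),
    "CC_TTS_SPEED": ("speed", float),
    "CC_TTS_AUTO_READ": ("auto_read", _parse_bool),
    "CC_TTS_PLAYER": ("player", str),
}


def apply_env_overrides(config: dict, environ: Mapping[str, str] = os.environ) -> dict:
    """Return a copy of config with any set CC_TTS_* variables applied."""
    merged = dict(config)
    for var, (key, parse) in TTS_ENV.items():
        if var in environ:
            merged[key] = parse(environ[var])
    return merged
```

Passing the environment as a parameter keeps the function easy to exercise without touching the real process environment. The STT variables could be handled with a second table of the same shape.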
cc-tts-wrap claude
↓
PTY proxy (pty_proxy.py) ↔ claude (interactive)
↓
stream_filter.py → ANSI strip, code block skip, spinner suppress
↓
sentence_buffer.py → accumulate, flush on ". " / "? " / "! "
↓
speak.py → engine.synthesize() → player.play_audio()
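The filter and buffer stages of the pipeline above can be sketched roughly as follows. This is an illustration only: strip_ansi and SentenceBuffer are hypothetical names, the regular expression covers CSI escape sequences only, and code block skipping and spinner suppression are left out.

```python
import re

# Matches CSI escape sequences (colours, cursor movement); other escapes are not handled.
ANSI_CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def strip_ansi(text: str) -> str:
    """Remove CSI escape sequences from terminal output."""
    return ANSI_CSI.sub("", text)


class SentenceBuffer:
    """Accumulate streamed text and release complete sentences."""

    # Boundaries taken from the diagram: punctuation followed by a space.
    BOUNDARIES = (". ", "? ", "! ")

    def __init__(self) -> None:
        self._pending = ""

    def feed(self, chunk: str) -> list[str]:
        """Add a chunk and return every sentence completed so far."""
        self._pending += chunk
        sentences = []
        while True:
            # Positions of each boundary type that occurs in the pending text.
            cuts = [i for i in (self._pending.find(b) for b in self.BOUNDARIES) if i != -1]
            if not cuts:
                return sentences
            end = min(cuts) + 1  # keep the punctuation mark, drop the space
            sentences.append(self._pending[:end].strip())
            self._pending = self._pending[end + 1:]
```

Each returned sentence would then be handed to the synthesis step. Text without a trailing boundary stays in the buffer until a later chunk completes it, so a real implementation would also need a final flush when the session ends.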
make validate # lint + type check + test (quiet)
make quick_validate # lint + type check only
VERBOSE=1 make test # full pytest output
Install as Claude Code plugin for /speak, /listen skills and Stop hook auto-read:
claude plugin install cc-voice@local