From claude-voice
Use when the user wants to start a voice conversation with the Claude Desktop App, enable a hold-to-talk hotkey, use voice input in French / Spanish / German / Japanese / any non-English language, dictate prompts by speaking instead of typing, or invokes "/voice", "voice mode", "speak to Claude", "voice chat", "hands-free", "talk to Claude", or similar. Walks the user through choosing language and Whisper model size, then launches a standalone Python voice loop that records mic, runs local Whisper STT, and pastes the transcript into whatever window is focused (typically Claude Desktop). Fills the gap where official voice mode is English-only.
How this skill is triggered — by the user, by Claude, or both
Slash command
/claude-voice:voiceThe summary Claude sees in its skill listing — used to decide when to auto-load this skill
This skill gets the user from zero to a running voice-input loop in one go.
This skill gets the user from zero to a running voice-input loop in one go. It asks the user four quick setup questions (language, model size, hotkey, voice sensitivity), installs Python deps on first run, and launches the script as a detached background process.
Run these checks. If either fails, tell the user what to install and stop:
python -c "import sys; assert sys.version_info >= (3,10), f'need 3.10+, have {sys.version}'; print(sys.version)"
Needs Python 3.10+. If python is not found, try python3 (macOS/Linux) or py (Windows).
If missing: "Install Python 3.10 or newer from python.org, then restart Claude."
claude --version
Needs the Claude CLI. If missing: "Install the Claude CLI first — https://claude.com/claude-code"
Install into a plugin-local venv, not the user's system Python. This dodges
PEP 668 on modern Debian/Ubuntu/Fedora (where pip install --user is refused),
and keeps faster-whisper's heavy deps isolated so uninstall is just rm -rf .venv.
First check whether the venv already exists and has the deps:
# Windows
"${CLAUDE_PLUGIN_ROOT}/.venv/Scripts/python.exe" -c "import faster_whisper, sounddevice, soundfile, pynput, numpy, yaml" 2>&1
# macOS / Linux
"${CLAUDE_PLUGIN_ROOT}/.venv/bin/python" -c "import faster_whisper, sounddevice, soundfile, pynput, numpy, yaml" 2>&1
If the venv doesn't exist, create it (tell the user "Setting up voice mode (one-time, ~200 MB)..."):
# Windows
python -m venv "${CLAUDE_PLUGIN_ROOT}/.venv"
"${CLAUDE_PLUGIN_ROOT}/.venv/Scripts/python.exe" -m pip install --upgrade pip
"${CLAUDE_PLUGIN_ROOT}/.venv/Scripts/python.exe" -m pip install -r "${CLAUDE_PLUGIN_ROOT}/scripts/requirements.txt"
# macOS / Linux
python3 -m venv "${CLAUDE_PLUGIN_ROOT}/.venv"
"${CLAUDE_PLUGIN_ROOT}/.venv/bin/python" -m pip install --upgrade pip
"${CLAUDE_PLUGIN_ROOT}/.venv/bin/python" -m pip install -r "${CLAUDE_PLUGIN_ROOT}/scripts/requirements.txt"
If venv creation itself fails with "ensurepip is not available" on Debian/Ubuntu,
the user needs sudo apt install python3-venv. Surface that hint and stop.
If install itself fails, surface the error and stop.
Use AskUserQuestion with these FOUR questions (send all four in one call for a single round-trip — four is the AskUserQuestion tool maximum):
Question 1 — "What language will you be speaking?"
enfresdeitptjazhru, nl, ko, ar, hi)Question 2 — "Which Whisper model size?" Include a short description so the user can choose:
Question 3 — "Which hotkey should trigger recording?"
<f8> (Recommended — single key, no chord, rarely conflicts)<f9><ctrl>+<shift>+<space> (may clash with IMEs or Discord push-to-talk on some setups)<ctrl>+<shift>+<v><ctrl>+<alt>+<v>)Do NOT offer Right Alt as an option on Windows. Many keyboard layouts remap Right Alt to AltGr, which Windows emits as a synthetic Ctrl+Alt combo — pynput sees two keys and the hotkey never resolves.
Question 4 — "How do you usually speak when recording?"
This picks between the default transcription pipeline and the whisper_mode
preset, which boosts mic gain and relaxes VAD/Whisper thresholds so quiet
speech doesn't get dropped as silence.
--whisper-mode flag)--whisper-mode)Defaults if the user skips: small model, auto-detect language, <f8> hotkey,
normal speaking voice.
Build the arg list:
--language <code>--model <chosen_model>--hotkey "<chosen_hotkey>"--whisper-modeSpawn DETACHED so it survives this conversation:
Use the venv's python (created in Step 2), not the system python.
Windows: Use PowerShell Start-Process with an array argument list so paths with spaces or unicode (OneDrive, accented usernames) don't blow up the quoting:
powershell -NoProfile -Command "Start-Process -FilePath '${CLAUDE_PLUGIN_ROOT}/.venv/Scripts/python.exe' -WorkingDirectory '${CLAUDE_PLUGIN_ROOT}' -ArgumentList @('${CLAUDE_PLUGIN_ROOT}/scripts/claude_voice.py', <each-flag-as-its-own-quoted-element>)"
Example with --language fr --model small:
powershell -NoProfile -Command "Start-Process -FilePath '${CLAUDE_PLUGIN_ROOT}/.venv/Scripts/python.exe' -WorkingDirectory '${CLAUDE_PLUGIN_ROOT}' -ArgumentList @('${CLAUDE_PLUGIN_ROOT}/scripts/claude_voice.py', '--language', 'fr', '--model', 'small')"
macOS / Linux:
nohup "${CLAUDE_PLUGIN_ROOT}/.venv/bin/python" "${CLAUDE_PLUGIN_ROOT}/scripts/claude_voice.py" <flags> > /dev/null 2>&1 &
Tell them concisely:
Do NOT try to read the script's output — it runs independently.
Provides UI/UX resources: 50+ styles, color palettes, font pairings, guidelines, charts for web/mobile across React, Next.js, Vue, Svelte, Tailwind, React Native, Flutter. Aids planning, building, reviewing interfaces.
Searches MemPalace before answering questions about past work, people, projects, or prior decisions. Returns verbatim stored content instead of guessing from model memory.
npx claudepluginhub afterrealm/claude-voice --plugin claude-voice