cc-gc-stts

Talk to Claude Code, Gemini CLI, or Antigravity CLI (aka agy) and hear them talk back. This project adds seamless Speech-to-Text (STT) and Text-to-Speech (TTS) capabilities via a Model Context Protocol (MCP) server.

📰 Read the story: True voice mode for Claude Code

✨ Features

🎙️ Speech-to-Text (STT): Dictate your prompts instead of typing.
🔊 Text-to-Speech (TTS): Hear the model's responses read aloud.
🔄 Conversational Loop: Use the /stts command for a continuous voice-driven session.
🚀 Persistent Daemon: Fast startup using a reusable Chrome window.
🛠️ Cross-Platform: Works with Claude Code, Gemini CLI, and Antigravity CLI.
🕘 History: Recall past prompts and responses from a dropdown above each panel, or with Alt+↑ / Alt+↓.

🏗️ How It Works

stts uses a background daemon to manage a persistent Chrome/Chromium window:

MCP Server: Exposes stt and tts tools to the AI model. Talks to the daemon over plain HTTP — one short request per call, no polling, no per-call subprocess spawn.
Daemon: A local HTTP + WebSocket server on port 15986 that controls a Chrome instance in "app mode". Stores its profile under $TMPDIR/cc-gc-stts-user-data-dir.
Browser UI ↔ Daemon: A single persistent WebSocket at /ws carries every per-turn message. The daemon pushes a request frame the moment the model calls stt or tts; the page pushes back complete / cancel / close when the user is done.
Browser UI: Uses the native Web Speech API for recognition and synthesis. This costs nothing to use, but on Linux, Chrome routes recognition audio through Google's servers, so the pipeline is not fully offline.
Smart Auto-Advance: In the /stts voice loop, if you simply listen through the response without touching anything, the loop advances automatically the moment speech ends. Only if you press Stop or Play (or say "stop it" / "play it") does the page wait for a manual Got it! so you stay in control of replays.
Automatic Lifecycle: The daemon starts on demand and shuts down when the Chrome window is closed.
Port-collision aware: If port 15986 is held by a non-stts process, the launcher fails fast with a clear error instead of timing out.
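
The Smart Auto-Advance rule described above can be read as a small state machine. The sketch below is only an illustration of that rule, not the project's actual code: the names PlaybackEvent, PlaybackState, reduce, and run are hypothetical, and it assumes that an explicit Got it! always advances the loop, even when it was not required.

```typescript
// Hypothetical sketch of the auto-advance rule; none of these names come from the project.
type PlaybackEvent = "stop" | "play" | "speechEnd" | "gotIt";

interface PlaybackState {
  userIntervened: boolean; // true once Stop or Play (or "stop it" / "play it") was used
  advanced: boolean;       // true once the loop should move on to the next turn
}

const initialState: PlaybackState = { userIntervened: false, advanced: false };

function reduce(state: PlaybackState, event: PlaybackEvent): PlaybackState {
  if (state.advanced) return state; // the turn is already finished
  switch (event) {
    case "stop":
    case "play":
      // Any manual control hands the decision back to the user.
      return { ...state, userIntervened: true };
    case "speechEnd":
      // Auto-advance only if the user never touched the controls.
      return state.userIntervened ? state : { ...state, advanced: true };
    case "gotIt":
      // Assumption: an explicit acknowledgement always advances.
      return { ...state, advanced: true };
  }
}

// Folds a sequence of events into the final state for one TTS turn.
function run(events: PlaybackEvent[]): PlaybackState {
  return events.reduce(reduce, initialState);
}
```

Under these assumptions, run(["speechEnd"]) ends the turn on its own, while run(["stop", "speechEnd"]) keeps waiting until a "gotIt" event arrives.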

🚀 Quick Start

1. Build the project

npm install
npm run build

2. Install

Claude Code

claude plugins marketplace add https://github.com/sandipchitale/cc-gc-stts.git
claude plugin install stts

Gemini CLI

gemini extensions install --consent https://github.com/sandipchitale/cc-gc-stts.git

Antigravity CLI

agy plugin install --consent https://github.com/sandipchitale/cc-gc-stts.git

⌨️ Usage

Conversational Loop

Run the voice-driven loop where you speak, the model processes, and the response is read back to you:

Claude Code: /stts
Gemini CLI: /stts
Antigravity CLI: /stts

Direct Tool Usage

You can also ask the model to "use the stt tool" or "speak this using tts" directly in your prompts.

🗣️ Voice Commands & Shortcuts

Both STT and TTS modes support voice-activated commands for a hands-free experience.

Popular Commands

Command	Action
`send prompt`	Submits your dictated text
`cancel prompt`	Aborts the current recording
`new paragraph`	Inserts a line break
`got it`	(TTS mode) Acknowledges the response and continues — only required if you used Stop or Play during playback; otherwise the loop auto-advances
`stop it`	(TTS mode) Stops the current playback (after this, Got it! is required to advance)
`play it`	(TTS mode) Replays the response (after this, Got it! is required to advance)

Note: Many more punctuation and formatting commands are supported (e.g., insert comma, select all, undo it). Toggle the side panel to see the full list.
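
To make the command table concrete, here is a minimal sketch of how a recognized phrase might be mapped to an action. It covers only the three dictation commands listed above. The names VoiceAction, COMMANDS, and interpret are hypothetical, and the normalization step (lowercasing and dropping trailing punctuation) is an assumption, not something the project documents.

```typescript
// Hypothetical sketch: maps a recognized phrase to an action. Names are illustrative only.
type VoiceAction =
  | { kind: "submit" }
  | { kind: "cancel" }
  | { kind: "insert"; text: string }
  | { kind: "dictate"; text: string };

// A Map avoids accidental hits on Object.prototype keys such as "constructor".
const COMMANDS = new Map<string, VoiceAction>([
  ["send prompt", { kind: "submit" }],
  ["cancel prompt", { kind: "cancel" }],
  ["new paragraph", { kind: "insert", text: "\n" }],
]);

function interpret(phrase: string): VoiceAction {
  // Assumption: the recognizer may vary capitalization and append punctuation.
  const normalized = phrase.trim().toLowerCase().replace(/[.,!?]+$/, "");
  // Anything that is not a known command is treated as ordinary dictation.
  return COMMANDS.get(normalized) ?? { kind: "dictate", text: phrase.trim() };
}
```

With this sketch, interpret("Send prompt.") yields a submit action, and any phrase that is not a listed command falls through as dictated text.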

Keyboard Shortcuts:

Ctrl+R: Toggle recording/playback side panel.
Enter: Send prompt (Talk side).
Escape: Stop recording or close the commands panel.
Alt+↑ / Alt+↓: Cycle through prompt or response history when the textarea is focused.
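
The Alt+↑ / Alt+↓ behavior can be pictured as a cursor moving over a list of past entries. The sketch below is an illustration only: HistoryCursor and its methods are hypothetical names, and it assumes the cursor stops at the oldest entry rather than wrapping around, which the project may handle differently.

```typescript
// Hypothetical sketch of history navigation; not the project's actual implementation.
class HistoryCursor {
  private readonly entries: string[]; // oldest first
  private index: number;              // entries.length means "past the newest entry"

  constructor(entries: string[]) {
    this.entries = [...entries];
    this.index = this.entries.length;
  }

  // Alt+Up: step toward older entries, stopping at the oldest (assumed: no wraparound).
  older(): string | undefined {
    if (this.entries.length === 0) return undefined;
    this.index = Math.max(0, this.index - 1);
    return this.entries[this.index];
  }

  // Alt+Down: step toward newer entries; undefined means "back to the live draft".
  newer(): string | undefined {
    this.index = Math.min(this.entries.length, this.index + 1);
    return this.entries[this.index];
  }
}
```

Starting from new HistoryCursor(["first", "second"]), the first call to older() returns the most recent entry, and stepping past the newest entry with newer() returns undefined.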

🕘 Prompt & Response History

Each panel has a History bar above its textarea:
