From adk-streaming
Use this skill to build an ADK 2.0 voice-first agent — speech-in, speech-out using Gemini Live audio modalities. Triggers on: "ADK voice agent", "ADK speech to speech", "ADK audio streaming", "voice assistant ADK", "ADK microphone input", "ADK audio output", "ADK PCM audio". Generates an agent configured for AUDIO response modality with PCM ingestion, voice selection, and turn-detection callbacks.
Slash command
/adk-streaming:audio-streaming-agent
Build a voice-in / voice-out agent on Gemini Live via ADK 2.0.
from google.adk.agents import LlmAgent
from google.adk.streaming import StreamingMode

root_agent = LlmAgent(
    name="voice_assistant",
    model="gemini-2.5-flash-live",
    instruction=(
        "You are speaking with the user via voice. "
        "Keep responses under 30 seconds of speech. "
        "Don't read URLs or code; describe them in plain language."
    ),
    streaming_mode=StreamingMode.BIDI,
    response_modalities=["AUDIO"],
    voice_name="Aoede",
    input_audio_config={
        "sample_rate_hz": 16000,
        "encoding": "LINEAR16",
    },
    output_audio_config={
        "sample_rate_hz": 24000,
        "encoding": "LINEAR16",
    },
)
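As a rough sizing check on the audio settings above, the sketch below works out raw byte rates for 16-bit PCM. It assumes mono audio and that LINEAR16 means 2 bytes per sample; `pcm_bytes` is a hypothetical helper for illustration, not part of ADK.

```python
# Back-of-envelope sizing for uncompressed 16-bit PCM (assumed: 2 bytes per sample, mono).
BYTES_PER_SAMPLE = 2


def pcm_bytes(sample_rate_hz: int, seconds: float, channels: int = 1) -> int:
    """Return the raw byte count for `seconds` of 16-bit PCM audio."""
    return int(sample_rate_hz * seconds * channels * BYTES_PER_SAMPLE)


input_rate = pcm_bytes(16000, 1.0)   # bytes per second of 16 kHz microphone audio
output_rate = pcm_bytes(24000, 1.0)  # bytes per second of 24 kHz synthesized speech
max_reply = pcm_bytes(24000, 30.0)   # upper bound for a 30-second spoken reply

print(input_rate, output_rate, max_reply)  # 32000 48000 1440000
```

Under these assumptions, a reply at the 30-second limit set in the instruction is about 1.4 MB of raw output audio, which helps when sizing playback buffers.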
Common Gemini Live voices: Aoede, Puck, Charon, Kore, Fenrir. Test each — characteristics differ (warmth, pace, pitch).
// Assumes `ws` is an already-open WebSocket to the agent backend.
const audioCtx = new AudioContext({ sampleRate: 16000 });
navigator.mediaDevices.getUserMedia({ audio: true }).then(stream => {
  const source = audioCtx.createMediaStreamSource(stream);
  const processor = audioCtx.createScriptProcessor(4096, 1, 1);
  source.connect(processor);
  processor.connect(audioCtx.destination);
  processor.onaudioprocess = (e) => {
    const pcm = e.inputBuffer.getChannelData(0);
    // Clamp each float sample to [-1, 1] and scale to 16-bit signed PCM.
    const int16 = new Int16Array(pcm.map(s => Math.max(-1, Math.min(1, s)) * 0x7FFF));
    ws.send(int16.buffer);
  };
});
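On the receiving side, each binary message sent by the capture code above should hold 4096 16-bit samples (8192 bytes). The following is a minimal sketch of decoding such a frame with the Python standard library. It assumes the browser emits little-endian samples, which holds on common platforms but is not guaranteed by `Int16Array`. The names `decode_pcm16` and `frame_duration_ms` are hypothetical helpers, not ADK functions.

```python
import array
import sys


def decode_pcm16(frame: bytes) -> array.array:
    """Decode little-endian 16-bit PCM bytes into an array of signed ints.

    Raises ValueError if the frame length is not a multiple of 2 bytes.
    """
    samples = array.array("h")
    samples.frombytes(frame)
    if sys.byteorder == "big":
        # The wire format is assumed little-endian; swap on big-endian hosts.
        samples.byteswap()
    return samples


def frame_duration_ms(frame: bytes, sample_rate_hz: int = 16000) -> float:
    """Return the playback duration of a 16-bit mono PCM frame in milliseconds."""
    return len(frame) / 2 * 1000 / sample_rate_hz


# A 4096-sample frame at 16 kHz covers 256 ms of audio.
print(frame_duration_ms(bytes(8192)))  # 256.0
```

A 256 ms frame adds noticeable latency per chunk. A smaller buffer size in the capture code shortens it at the cost of more messages.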
Gemini Live handles VAD (voice activity detection) server-side. To override the defaults, pass a turn_detection setting:
turn_detection={
    "type": "server_vad",
    "threshold": 0.5,
    "silence_duration_ms": 700,
}
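To build intuition for how a threshold and a silence duration interact, here is a toy end-of-turn detector. It is an illustrative sketch only and not Gemini Live's actual VAD. It assumes each audio frame already has a speech score between 0 and 1, and `find_end_of_turn` is a hypothetical name.

```python
from typing import Iterable, Optional


def find_end_of_turn(
    scores: Iterable[float],
    frame_ms: int,
    threshold: float = 0.5,
    silence_duration_ms: int = 700,
) -> Optional[int]:
    """Return the index of the frame where the turn ends, or None.

    A frame counts as speech when its score is >= threshold. The turn ends
    once speech has started and the following frames stay below the
    threshold for at least silence_duration_ms in a row.
    """
    speech_started = False
    silent_ms = 0
    for i, score in enumerate(scores):
        if score >= threshold:
            speech_started = True
            silent_ms = 0  # any speech frame resets the silence timer
        elif speech_started:
            silent_ms += frame_ms
            if silent_ms >= silence_duration_ms:
                return i
    return None


# Two speech frames followed by seven 100 ms silent frames: turn ends at index 9.
print(find_end_of_turn([0.1, 0.9, 0.8] + [0.1] * 7, frame_ms=100))  # 9
```

In this toy model, a lower threshold treats more frames as speech, and a shorter silence duration ends turns sooner but risks cutting off users who pause mid-sentence.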
When the user starts speaking mid-response, ADK fires an interruption event:
from google.adk.callbacks import on_interruption

@on_interruption
async def handle_interrupt(ctx, event):
    print(f"User interrupted at {event.timestamp}")
    # Optional: log, reset agent state, etc.
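One common reaction to an interruption is to drop any audio that is queued but not yet played, so the agent stops talking promptly. The sketch below shows that idea with a hypothetical `PlaybackBuffer` class. It is not an ADK type, and how it would be wired into the interruption handler is left as an assumption.

```python
from collections import deque
from typing import Deque, Optional


class PlaybackBuffer:
    """Hypothetical FIFO queue of PCM chunks waiting to be played."""

    def __init__(self) -> None:
        self._chunks: Deque[bytes] = deque()

    def push(self, chunk: bytes) -> None:
        """Queue a chunk of agent audio for playback."""
        self._chunks.append(chunk)

    def pop(self) -> Optional[bytes]:
        """Return the next chunk to play, or None if the queue is empty."""
        return self._chunks.popleft() if self._chunks else None

    def flush(self) -> int:
        """Discard all queued audio and return the number of bytes dropped."""
        dropped = sum(len(chunk) for chunk in self._chunks)
        self._chunks.clear()
        return dropped


buffer = PlaybackBuffer()
buffer.push(b"\x00\x00" * 100)
buffer.push(b"\x00\x00" * 50)
dropped = buffer.flush()  # call this when an interruption is reported
print(dropped)  # 300
```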
gemini-live-bootstrap for general streaming setup
bidirectional-tool-streaming for tools that run during voice conversations
npx claudepluginhub healthcare-ai-consulting-llc/adk-2-toolkit --plugin adk-streaming