From azure-agent-skills
Guides Azure AI Speech development including STT/TTS, custom speech models, Voice Live agents, SSML/avatars, telephony/LLM integrations, troubleshooting, and deployment.
Slash command
/azure-agent-skills:azure-speech
This skill provides expert guidance for Azure AI Speech. Covers troubleshooting, best practices, decision making, limits & quotas, security, configuration, integrations & coding patterns, and deployment. It combines local quick-reference content with remote documentation fetching capabilities.
IMPORTANT for Agent: Use the Category Index below to locate relevant sections. For categories with line ranges (e.g., `L35-L120`), use `read_file` with the specified lines. For categories with file links (e.g., `[security.md](security.md)`), use `read_file` on the linked reference file.
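
As a rough illustration of the line-range lookup described above, the sketch below parses a range such as `L35-L120` and returns those lines from a local file. The helper names `parse_line_range` and `read_lines` are hypothetical stand-ins for whatever `read_file` tool the agent actually has, and treating the range as 1-indexed and inclusive is an assumption the source does not spell out.

```python
from __future__ import annotations

import re
from pathlib import Path

# Matches a range written the way the Category Index writes it, e.g. "L35-L120".
_RANGE = re.compile(r"^L(\d+)-L(\d+)$")


def parse_line_range(spec: str) -> tuple[int, int]:
    """Turn 'L35-L120' into (35, 120).

    Assumption: both ends are 1-indexed and inclusive.
    """
    match = _RANGE.match(spec.strip())
    if match is None:
        raise ValueError(f"not a line range: {spec!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if start < 1 or end < start:
        raise ValueError(f"invalid line range: {spec!r}")
    return start, end


def read_lines(path: str | Path, spec: str) -> str:
    """Hypothetical stand-in for read_file: return only the lines named by spec."""
    start, end = parse_line_range(spec)
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    # Convert the 1-indexed inclusive range to a 0-indexed slice.
    return "\n".join(lines[start - 1:end])
```

For example, `read_lines("SKILL.md", "L36-L44")` would return only the Troubleshooting section of a skill file laid out as in the index, where `SKILL.md` is an assumed file name.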
IMPORTANT for Agent: If `metadata.generated_at` is more than 3 months old, suggest the user pull the latest version from the repository. If `mcp_microsoftdocs` tools are not available, suggest the user install it: Installation Guide
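
The freshness rule above could be checked along these lines. This is a sketch under two assumptions the source does not state: that `metadata.generated_at` is an ISO 8601 date or datetime string, and that "3 months" can be approximated as 90 days. The function name `is_stale` is hypothetical.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone

# Assumption: "more than 3 months old" is approximated as more than 90 days.
MAX_AGE = timedelta(days=90)


def is_stale(generated_at: str, now: datetime | None = None) -> bool:
    """Return True when generated_at is older than MAX_AGE.

    Assumption: generated_at is an ISO 8601 string such as "2025-01-15"
    or "2025-01-15T08:30:00+00:00"; naive values are treated as UTC.
    """
    generated = datetime.fromisoformat(generated_at)
    if generated.tzinfo is None:
        generated = generated.replace(tzinfo=timezone.utc)
    current = now or datetime.now(timezone.utc)
    return current - generated > MAX_AGE
```

When `is_stale(...)` returns `True`, the agent would suggest pulling the latest version from the repository.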
This skill requires network access to fetch documentation content:
- `mcp_microsoftdocs:microsoft_docs_fetch` with query string `from=learn-agent-skill`. Returns Markdown.
- `fetch_webpage` with query string `from=learn-agent-skill&accept=text/markdown`. Returns Markdown.

| Category | Lines | Description |
|---|---|---|
| Troubleshooting | L36-L44 | Diagnosing and resolving Azure AI Speech issues: session/ID lookup, Foundry integration errors, SDK CRL/compatibility problems, container deployment failures, and common SDK runtime bugs. |
| Best Practices | L45-L61 | Best practices for collecting and labeling audio/video, training custom voices/avatars, tuning recognition (phrases/keywords), optimizing latency/memory, and handling Voice Live agent behavior. |
| Decision Making | L62-L79 | Guidance on planning large-scale speech workloads, choosing embedded/offline or personal voice options, and migrating between Speech/Voice REST APIs, models, and regions. |
| Limits & Quotas | L80-L89 | Speech service limits, quotas, and throttling, plus lifecycle, training, deployment, and usage constraints for custom/professional voice and short-audio speech-to-text APIs. |
| Security | L90-L103 | Securing Azure AI Speech: auth (Entra, RBAC), network isolation (VNet, Private Link, sovereign clouds), encryption/BYOK, BYOS storage, and consent/ID flows for personal and professional voice. |
| Configuration | L104-L136 | Configuring Azure AI Speech behavior: audio inputs, logging, storage, SSML, pronunciation, batch TTS/STT, Voice Live settings, containers, and SDK/CLI connection and tracing options. |
| Integrations & Coding Patterns | L137-L165 | Patterns and code to integrate Azure Speech/Voice Live with apps and telephony: SDK/REST usage, TTS/STT/translation, avatars, SSML, LLM/Foundry/Power Automate, and real-time agent flows. |
| Deployment | L166-L177 | Deploying and scaling Azure AI Speech: Docker/Kubernetes containers, on-prem STT/TTS, custom speech models/endpoints, language ID, and batch/long-form synthesis workflows. |
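
The two fetch options listed before the Category Index differ only in the query string appended to a documentation URL. A minimal sketch of building those URLs follows; the helper name `with_skill_params` is hypothetical, and the fetch itself is left to whichever tool is available.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_skill_params(url: str, markdown: bool = False) -> str:
    """Append from=learn-agent-skill (and optionally accept=text/markdown).

    markdown=False matches the query string given for
    mcp_microsoftdocs:microsoft_docs_fetch; markdown=True matches the one
    given for fetch_webpage. Existing query parameters are preserved.
    """
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query["from"] = "learn-agent-skill"
    if markdown:
        query["accept"] = "text/markdown"
    # safe="/" keeps "text/markdown" readable instead of "text%2Fmarkdown".
    return urlunsplit(parts._replace(query=urlencode(query, safe="/")))
```

Calling `with_skill_params(url, markdown=True)` on any URL from the topic tables produces the form intended for `fetch_webpage`.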

Troubleshooting

| Topic | URL |
|---|---|
| Retrieve Speech to text session and transcription IDs for support | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-get-speech-session-id |
| Resolve common Azure Speech in Foundry issues | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/known-issues |
| Resolve Azure AI Speech SDK CRL compatibility issues | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/migrate-to-sdk-1-48-2 |
| Troubleshoot Azure Speech containers deployment issues | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-faq |
| Diagnose and fix common Azure Speech SDK issues | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/troubleshooting |

Limits & Quotas

| Topic | URL |
|---|---|
| Text to speech FAQs including limits and behavior | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/faq-tts |
| Manage custom speech model and endpoint lifecycle | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-custom-speech-model-and-endpoint-lifecycle |
| Deploy professional voice models to custom endpoints | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/professional-voice-deploy-endpoint |
| Train professional voice models and understand duration | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/professional-voice-train-voice |
| Use Speech to text REST API for short audio | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/rest-speech-to-text-short |
| Apply Azure Speech quotas, limits, and throttling guidance | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-services-quotas-and-limits |

npx claudepluginhub microsoftdocs/agent-skills --plugin azure-agent-skills