Implements exponential backoff with jitter, queue throttling, and concurrency limits for AssemblyAI transcription and streaming APIs. Use for 429 retry logic and throughput management.
Slash command: /assemblyai-pack:assemblyai-rate-limits
Handle AssemblyAI rate limits with exponential backoff, queue-based throttling, and concurrency management. AssemblyAI auto-scales limits for paid users.
Prerequisite: the assemblyai package installed.

| Endpoint | Free | Pay-as-you-go |
|---|---|---|
| POST /v2/transcript | 5/min | Scales with usage |
| GET /v2/transcript/:id | No hard limit | No hard limit |
| POST /v2/upload | 5/min | Scales with usage |

| Metric | Free | Pay-as-you-go |
|---|---|---|
| New streams/min | 5 | 100 (auto-scales) |
| Concurrent streams | ~5 | Unlimited (auto-scales 10% every 60s at 70% usage) |
| Metric | Free | Paid |
|---|---|---|
| Requests/min | Limited | Scales with usage |
| Max audio input | 100 hours per request | 100 hours per request |
Note: AssemblyAI auto-scales paid limits. While utilization stays at or above 70%, the new-session rate limit increases by 10% every 60 seconds, with no ceiling.
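
To get a feel for how quickly the auto-scaling in the note compounds, here is a rough sketch. It assumes utilization stays at or above 70% for every 60-second window, which real traffic rarely does. `projectedLimit` is a hypothetical helper for illustration, not part of the AssemblyAI SDK.

```typescript
// Hypothetical helper: projects the new-session rate limit under sustained load.
// Assumption: one 10% increase is applied per full 60-second window in which
// utilization stayed at or above 70%.
function projectedLimit(baseLimit: number, elapsedSeconds: number): number {
  const windows = Math.floor(elapsedSeconds / 60);
  return baseLimit * Math.pow(1.1, windows);
}

// Starting from 100 new streams/min, after 5 minutes of sustained load:
console.log(projectedLimit(100, 300).toFixed(1)); // "161.1"
```

Because the growth is gradual, a sudden burst well above the current limit would still see rejections during the first few minutes, so client-side backoff remains useful on paid plans.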
import { AssemblyAI, type Transcript } from 'assemblyai';

const client = new AssemblyAI({
  apiKey: process.env.ASSEMBLYAI_API_KEY!,
});

async function transcribeWithBackoff(
  audioUrl: string,
  options: Record<string, any> = {},
  config = { maxRetries: 5, baseDelayMs: 1000, maxDelayMs: 30000 }
): Promise<Transcript> {
  for (let attempt = 0; attempt <= config.maxRetries; attempt++) {
    try {
      return await client.transcripts.transcribe({
        audio: audioUrl,
        ...options,
      });
    } catch (err: any) {
      if (attempt === config.maxRetries) throw err;
      const status = err.status ?? err.statusCode;
      // Only retry on 429 (rate limit) and 5xx (server errors)
      if (status && status !== 429 && (status < 500 || status >= 600)) throw err;
      const exponentialDelay = config.baseDelayMs * Math.pow(2, attempt);
      const jitter = Math.random() * config.baseDelayMs;
      const delay = Math.min(exponentialDelay + jitter, config.maxDelayMs);
      console.warn(`[${attempt + 1}/${config.maxRetries}] Retrying in ${delay.toFixed(0)}ms...`);
      await new Promise(r => setTimeout(r, delay));
    }
  }
  throw new Error('Unreachable');
}
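
The delay arithmetic inside `transcribeWithBackoff` can be pulled out into a pure function, which makes the schedule easy to inspect and unit-test. This is only a sketch: `backoffDelay`, `BackoffConfig`, and the injectable `random` parameter are illustrative names, not part of any SDK.

```typescript
interface BackoffConfig {
  baseDelayMs: number;
  maxDelayMs: number;
}

// Same formula as transcribeWithBackoff: base * 2^attempt plus up to one base
// interval of jitter, capped at maxDelayMs. `random` is injectable so tests
// can be deterministic; it defaults to Math.random.
function backoffDelay(
  attempt: number,
  config: BackoffConfig = { baseDelayMs: 1000, maxDelayMs: 30000 },
  random: () => number = Math.random
): number {
  const exponentialDelay = config.baseDelayMs * Math.pow(2, attempt);
  const jitter = random() * config.baseDelayMs;
  return Math.min(exponentialDelay + jitter, config.maxDelayMs);
}

// Print the schedule with jitter disabled to show the underlying curve.
for (let attempt = 0; attempt < 6; attempt++) {
  console.log(attempt, backoffDelay(attempt, undefined, () => 0));
}
// 0 1000, 1 2000, 2 4000, 3 8000, 4 16000, 5 30000 (capped)
```

With the default configuration the cap is reached on the sixth attempt, so raising `maxRetries` beyond 5 adds waits of at most 30 seconds each.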

import PQueue from 'p-queue';

// Limit concurrent jobs and the number of new jobs started per minute.
// The free tier allows 5 submissions/min (see the first table), so lower intervalCap to match.
const transcriptionQueue = new PQueue({
  concurrency: 5,    // Max 5 concurrent jobs
  interval: 60_000,  // Per-minute window
  intervalCap: 50,   // Max 50 new jobs per minute
});

async function queuedTranscribe(audioUrl: string): Promise<Transcript> {
  // p-queue types add() as Promise<T | void>, so narrow the result explicitly.
  return transcriptionQueue.add(() =>
    transcribeWithBackoff(audioUrl)
  ) as Promise<Transcript>;
}

// Process a batch of files
const audioUrls = [
  'https://example.com/audio1.mp3',
  'https://example.com/audio2.mp3',
  'https://example.com/audio3.mp3',
];

const results = await Promise.all(
  audioUrls.map(url => queuedTranscribe(url))
);
console.log(`Completed ${results.length} transcriptions`);
console.log(`Queue size: ${transcriptionQueue.size}, pending: ${transcriptionQueue.pending}`);
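
To make the `interval` / `intervalCap` idea concrete, the following is a simplified, dependency-free model of per-window capping. It is not how p-queue is implemented internally; `FixedWindowLimiter` is a hypothetical class, and the clock is passed in explicitly so the logic stays deterministic.

```typescript
// Simplified fixed-window limiter: allows at most `cap` job starts per
// `intervalMs` window. The caller supplies the current time in milliseconds.
class FixedWindowLimiter {
  private windowStart = 0;
  private used = 0;
  private readonly cap: number;
  private readonly intervalMs: number;

  constructor(cap: number, intervalMs: number) {
    this.cap = cap;
    this.intervalMs = intervalMs;
  }

  // Returns true if a job may start at time `now`, false if it must wait.
  tryAcquire(now: number): boolean {
    if (now - this.windowStart >= this.intervalMs) {
      // A new window has begun: reset the counter.
      this.windowStart = now;
      this.used = 0;
    }
    if (this.used >= this.cap) return false;
    this.used++;
    return true;
  }
}

// Free-tier style budget: 5 submissions per minute.
const limiter = new FixedWindowLimiter(5, 60_000);
const decisions = [0, 1, 2, 3, 4, 5].map(i => limiter.tryAcquire(1_000 + i));
console.log(decisions); // first five true, sixth false
```

A fixed window can admit up to twice the cap across a window boundary, which is one reason to keep the retry logic in place even when the queue is throttled.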

async function batchTranscribe(
  audioUrls: string[],
  onProgress?: (completed: number, total: number) => void
): Promise<Transcript[]> {
  const queue = new PQueue({ concurrency: 5 });
  let completed = 0;

  const promises = audioUrls.map(url =>
    queue.add(async () => {
      const transcript = await transcribeWithBackoff(url);
      completed++;
      onProgress?.(completed, audioUrls.length);
      return transcript;
    })
  );

  // p-queue types add() as Promise<T | void>, so narrow the result explicitly.
  return Promise.all(promises) as Promise<Transcript[]>;
}

// Usage
await batchTranscribe(
  audioUrls,
  (done, total) => console.log(`Progress: ${done}/${total}`)
);
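
`Promise.all` rejects the whole batch as soon as one job exhausts its retries. Where partial results are acceptable, one option is `Promise.allSettled` followed by a small partition step. The sketch below is independent of the SDK; `partitionSettled`, `runBatchTolerant`, and `BatchOutcome` are hypothetical names.

```typescript
interface BatchOutcome<T> {
  succeeded: T[];
  failed: unknown[]; // rejection reasons, in input order
}

// Splits Promise.allSettled output into fulfilled values and rejection reasons.
function partitionSettled<T>(settled: PromiseSettledResult<T>[]): BatchOutcome<T> {
  const outcome: BatchOutcome<T> = { succeeded: [], failed: [] };
  for (const result of settled) {
    if (result.status === 'fulfilled') outcome.succeeded.push(result.value);
    else outcome.failed.push(result.reason);
  }
  return outcome;
}

// Runs every job to completion and reports failures instead of throwing.
async function runBatchTolerant<T>(
  jobs: Array<() => Promise<T>>
): Promise<BatchOutcome<T>> {
  return partitionSettled(await Promise.allSettled(jobs.map(job => job())));
}
```

Each job here could be a closure such as `() => transcribeWithBackoff(url)`, optionally wrapped in the queue as in `batchTranscribe`.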

async function connectStreamingWithRetry(maxRetries = 3) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      const transcriber = client.streaming.transcriber({
        sampleRate: 16_000,
      });
      transcriber.on('error', (error) => {
        console.error('Streaming error:', error);
      });
      await transcriber.connect();
      return transcriber;
    } catch (err: any) {
      if (attempt === maxRetries) throw err;
      // WebSocket code 4008 = session limit
      const delay = Math.pow(2, attempt) * 2000;
      console.warn(`Stream connect failed. Retrying in ${delay}ms...`);
      await new Promise(r => setTimeout(r, delay));
    }
  }
  throw new Error('Unreachable');
}

| Scenario | Status | Strategy |
|---|---|---|
| Rate limited (async) | 429 | Exponential backoff, honor Retry-After header |
| Server error | 500-503 | Retry with backoff |
| Session limit (streaming) | WS 4008 | Wait and reconnect |
| Auth error | 401 | Do not retry, fix credentials |
| Invalid input | 400 | Do not retry, fix request |
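
The table's strategies can be sketched as a single decision function. The source does not say whether the AssemblyAI SDK exposes response headers on thrown errors, so this sketch takes the `Retry-After` value as a plain string that the caller obtains however it can. `parseRetryAfter`, `decideRetry`, and `RetryDecision` are hypothetical names.

```typescript
type RetryDecision =
  | { retry: false }
  | { retry: true; delayMs: number };

// Parses a Retry-After value: either delta-seconds ("120") or an HTTP date.
// Returns milliseconds to wait, or undefined if the value is missing or unparseable.
function parseRetryAfter(
  value: string | null | undefined,
  nowMs: number = Date.now()
): number | undefined {
  if (!value) return undefined;
  const trimmed = value.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return undefined;
  return Math.max(0, dateMs - nowMs);
}

// Maps an HTTP status to a retry decision. `fallbackDelayMs` is the delay the
// caller's backoff schedule would use when the server gives no hint.
function decideRetry(
  status: number,
  retryAfter: string | null | undefined,
  fallbackDelayMs: number,
  nowMs: number = Date.now()
): RetryDecision {
  if (status === 429) {
    return { retry: true, delayMs: parseRetryAfter(retryAfter, nowMs) ?? fallbackDelayMs };
  }
  if (status >= 500 && status < 600) return { retry: true, delayMs: fallbackDelayMs };
  return { retry: false }; // 400, 401, and other client errors
}

console.log(decideRetry(429, '2', 1000)); // { retry: true, delayMs: 2000 }
console.log(decideRetry(401, null, 1000)); // { retry: false }
```

Streaming close codes such as WS 4008 arrive over the WebSocket rather than as an HTTP status, so they would need a separate check in the reconnect loop.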
For security configuration, see assemblyai-security-basics.
npx claudepluginhub jeremylongshore/claude-code-plugins-plus-skills --plugin assemblyai-pack