AssemblyAI Rate Limits
Overview
Handle AssemblyAI rate limits with exponential backoff, queue-based throttling, and concurrency management. AssemblyAI auto-scales limits for paid users.
Prerequisites
- `assemblyai` package installed
- Understanding of async/await patterns
Rate Limit Tiers
Async Transcription API
| Endpoint | Free | Pay-as-you-go |
|---|---|---|
| POST /v2/transcript | 5/min | Scales with usage |
| GET /v2/transcript/:id | No hard limit | No hard limit |
| POST /v2/upload | 5/min | Scales with usage |
Streaming (WebSocket)
| Metric | Free | Pay-as-you-go |
|---|---|---|
| New streams/min | 5 | 100 (auto-scales) |
| Concurrent streams | ~5 | Unlimited (auto-scales 10% every 60s at 70% usage) |
LeMUR
| Metric | Free | Paid |
|---|---|---|
| Requests/min | Limited | Scales with usage |
| Max audio input | 100 hours per request | 100 hours per request |
Note: AssemblyAI auto-scales paid limits. At 70%+ utilization, the new session rate limit increases by 10% every 60 seconds with no ceiling cap.
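Because the 10% increases compound each interval, sustained high utilization grows the limit quickly. A hypothetical helper (the function name is illustrative; the 10%-per-60s figure comes from the note above) projects the limit after a period of sustained 70%+ utilization:

```typescript
// Project the session rate limit after N 60-second intervals of 70%+ utilization.
// Each interval compounds a 10% increase; result is floored to a whole limit.
function projectedLimit(currentLimit: number, intervalsAtHighUtilization: number): number {
  return Math.floor(currentLimit * Math.pow(1.1, intervalsAtHighUtilization));
}
```

For example, a limit of 100 new streams/min grows to roughly 161 after five minutes of sustained load.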
Instructions
Step 1: Exponential Backoff with Jitter
```typescript
import { AssemblyAI, type Transcript } from 'assemblyai';

const client = new AssemblyAI({
  apiKey: process.env.ASSEMBLYAI_API_KEY!,
});

async function transcribeWithBackoff(
  audioUrl: string,
  options: Record<string, any> = {},
  config = { maxRetries: 5, baseDelayMs: 1000, maxDelayMs: 30000 }
): Promise<Transcript> {
  for (let attempt = 0; attempt <= config.maxRetries; attempt++) {
    try {
      return await client.transcripts.transcribe({
        audio: audioUrl,
        ...options,
      });
    } catch (err: any) {
      if (attempt === config.maxRetries) throw err;
      const status = err.status ?? err.statusCode;
      // Only retry on 429 (rate limit) and 5xx (server errors)
      if (status && status !== 429 && (status < 500 || status >= 600)) throw err;
      const exponentialDelay = config.baseDelayMs * Math.pow(2, attempt);
      const jitter = Math.random() * config.baseDelayMs;
      const delay = Math.min(exponentialDelay + jitter, config.maxDelayMs);
      console.warn(`[${attempt + 1}/${config.maxRetries}] Retrying in ${delay.toFixed(0)}ms...`);
      await new Promise(r => setTimeout(r, delay));
    }
  }
  throw new Error('Unreachable');
}
```
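The backoff above ignores any `Retry-After` header the API may return with a 429. A small sketch of a delay calculator that prefers the server's hint when present (the helper name and header-parsing logic are assumptions for illustration, not part of the SDK):

```typescript
// Compute the retry delay: honor a Retry-After header (seconds, per RFC 9110)
// when present and numeric, otherwise fall back to exponential backoff with jitter.
function retryDelayMs(
  attempt: number,
  retryAfterHeader: string | undefined,
  baseDelayMs = 1000,
  maxDelayMs = 30000
): number {
  const retryAfterSec = Number(retryAfterHeader);
  if (retryAfterHeader && Number.isFinite(retryAfterSec)) {
    return Math.min(retryAfterSec * 1000, maxDelayMs);
  }
  const exponential = baseDelayMs * Math.pow(2, attempt);
  const jitter = Math.random() * baseDelayMs;
  return Math.min(exponential + jitter, maxDelayMs);
}
```

To use it, read the header off the failed response inside the `catch` block and pass it in place of the inline delay calculation.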
Step 2: Queue-Based Concurrency Control
```typescript
import PQueue from 'p-queue';

// Limit concurrency and throttle how many new jobs start per minute
const transcriptionQueue = new PQueue({
  concurrency: 5,    // Max 5 concurrent jobs
  interval: 60_000,  // Per-minute window
  intervalCap: 50,   // Max 50 new jobs per minute
});

async function queuedTranscribe(audioUrl: string): Promise<Transcript> {
  // p-queue types add() as Promise<T | void>; the cast narrows it here
  return transcriptionQueue.add(() => transcribeWithBackoff(audioUrl)) as Promise<Transcript>;
}

// Process a batch of files
const audioUrls = [
  'https://example.com/audio1.mp3',
  'https://example.com/audio2.mp3',
  'https://example.com/audio3.mp3',
];

const results = await Promise.all(audioUrls.map(url => queuedTranscribe(url)));
console.log(`Completed ${results.length} transcriptions`);
console.log(`Queue size: ${transcriptionQueue.size}, pending: ${transcriptionQueue.pending}`);
```
Step 3: Batch Processing with Progress
```typescript
async function batchTranscribe(
  audioUrls: string[],
  onProgress?: (completed: number, total: number) => void
): Promise<Transcript[]> {
  const queue = new PQueue({ concurrency: 5 });
  let completed = 0;

  const promises = audioUrls.map(url =>
    queue.add(async () => {
      const transcript = await transcribeWithBackoff(url);
      completed++;
      onProgress?.(completed, audioUrls.length);
      return transcript;
    }) as Promise<Transcript>
  );

  return Promise.all(promises);
}

// Usage
await batchTranscribe(
  urls,
  (done, total) => console.log(`Progress: ${done}/${total}`)
);
```
Step 4: Streaming Rate Limit Handling
```typescript
async function connectStreamingWithRetry(maxRetries = 3) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      const transcriber = client.streaming.createService({
        speech_model: 'nova-3',
        sample_rate: 16000,
      });

      transcriber.on('error', (error) => {
        console.error('Streaming error:', error);
      });

      await transcriber.connect();
      return transcriber;
    } catch (err: any) {
      if (attempt === maxRetries) throw err;
      // WebSocket close code 4008 = session/rate limit hit
      const delay = Math.pow(2, attempt) * 2000;
      console.warn(`Stream connect failed. Retrying in ${delay}ms...`);
      await new Promise(r => setTimeout(r, delay));
    }
  }
}
```
Output
- Automatic retry with exponential backoff and jitter
- Queue-based concurrency control with p-queue
- Batch transcription with progress reporting
- Streaming reconnection logic
Error Handling
| Scenario | Status | Strategy |
|---|---|---|
| Rate limited (async) | 429 | Exponential backoff; honor Retry-After header |
| Server error | 500-503 | Retry with backoff |
| Session limit (streaming) | WS 4008 | Wait and reconnect |
| Auth error | 401 | Do not retry; fix credentials |
| Invalid input | 400 | Do not retry; fix request |
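The HTTP rows of the table above reduce to a simple retry predicate, matching the check used in Step 1. A minimal sketch (the function name is illustrative):

```typescript
type RetryDecision = 'retry' | 'fail';

// Decide whether an HTTP status is worth retrying:
// 429 (rate limit) and 5xx (server errors) retry; everything else fails fast.
function classify(status: number): RetryDecision {
  if (status === 429) return 'retry';
  if (status >= 500 && status < 600) return 'retry';
  return 'fail';
}
```

Centralizing this decision keeps the backoff loop and any queue-level error handling consistent.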
Next Steps
For security configuration, see assemblyai-security-basics.