Reference architectures for scalable transcription systems: a synchronous API for short files, an async queue (BullMQ) for batch processing, WebSocket streaming for real-time transcription, and hybrid routing, plus an enterprise multi-region deployment.
Select a pattern by use case: Sync API for files under 60s, Async Queue for batch jobs and long files, Streaming for real-time transcription, or Hybrid for mixed workloads.
Call the transcription API directly from an Express endpoint and store the result in a database. Best for short audio where low latency matters.
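The synchronous flow above can be sketched as a request handler that validates input, calls the transcription API, stores the result, and responds in one round trip. This is a minimal sketch under stated assumptions: `Req` and `Res` are small structural stand-ins for the Express request and response objects, and `transcribe` and `saveTranscript` are hypothetical injected dependencies, not functions from any real SDK.

```typescript
// Structural stand-ins for the Express request/response objects, declared
// locally so the sketch type-checks without Express installed (assumption).
interface Req { body: { audioUrl?: unknown } }
interface Res {
  status(code: number): Res;
  json(payload: unknown): void;
}

// Hypothetical dependencies, injected so the handler is easy to test.
interface SyncDeps {
  transcribe(audioUrl: string): Promise<string>;
  saveTranscript(audioUrl: string, transcript: string): Promise<string>; // resolves to a record id
}

function makeSyncHandler(deps: SyncDeps) {
  return async function handleTranscribe(req: Req, res: Res): Promise<void> {
    const audioUrl = req.body.audioUrl;
    // Reject bad input before spending an API call.
    if (typeof audioUrl !== "string" || audioUrl.length === 0) {
      res.status(400).json({ error: "audioUrl is required" });
      return;
    }
    try {
      // Transcribe inline, persist, then answer in the same request.
      const transcript = await deps.transcribe(audioUrl);
      const id = await deps.saveTranscript(audioUrl, transcript);
      res.status(200).json({ id, transcript });
    } catch {
      // Upstream or storage failure: report it rather than hanging the request.
      res.status(502).json({ error: "transcription failed" });
    }
  };
}
```

In an Express app, the returned function would be registered on a POST route with JSON body parsing enabled; the route path and response shape are left to the deployment.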
Use BullMQ with Redis for job queuing. Configure workers with a concurrency of 10 and retry failed jobs for up to 3 attempts with exponential backoff. Notify clients when a job completes.
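The queue settings above could be expressed roughly as follows. The option names (`attempts`, `backoff.type`, `backoff.delay`, `concurrency`) follow BullMQ's job and worker options, but the types are declared locally so the sketch runs without the package. The 1000 ms base delay is an assumption, since the text does not give one, and `retryDelaysMs` is a hypothetical helper that only illustrates how exponential backoff doubles the wait between tries.

```typescript
// Local shapes mirroring the BullMQ options named in the text.
interface RetryOptions {
  attempts: number; // total tries, including the first
  backoff: { type: "exponential"; delay: number };
}
interface WorkerSettings { concurrency: number }

const BASE_DELAY_MS = 1000; // assumed base delay; not specified in the text

const jobOptions: RetryOptions = {
  attempts: 3,
  backoff: { type: "exponential", delay: BASE_DELAY_MS },
};
const workerSettings: WorkerSettings = { concurrency: 10 };

// Hypothetical helper: the wait before each retry, doubling every time.
function retryDelaysMs(opts: RetryOptions): number[] {
  const delays: number[] = [];
  // `attempts` counts the first try, so there are attempts - 1 retries.
  for (let retry = 1; retry < opts.attempts; retry++) {
    delays.push(opts.backoff.delay * 2 ** (retry - 1));
  }
  return delays;
}
```

With BullMQ installed, `jobOptions` would be passed as the third argument to `queue.add(name, data, jobOptions)`, and `workerSettings.concurrency` would go into the options of `new Worker(queueName, processor, { connection, concurrency })`.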
Create a WebSocket server that proxies audio between the client and the Deepgram Live API. Forward transcripts, including interim results, back to the client in real time.
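The transcript-forwarding half of the proxy can be sketched as a pure mapping from an upstream message to what the client receives. The field names used here (`type: "Results"`, `is_final`, `channel.alternatives[0].transcript`) reflect one reading of Deepgram's streaming response and should be checked against the current API reference. `ClientTranscript` and `toClientMessage` are hypothetical names introduced for illustration.

```typescript
// Assumed subset of an upstream streaming message; verify against the API docs.
interface LiveResult {
  type?: string;
  is_final?: boolean;
  channel?: { alternatives?: Array<{ transcript?: string }> };
}

// Hypothetical message shape sent on to the browser client.
interface ClientTranscript { kind: "interim" | "final"; text: string }

function toClientMessage(raw: string): ClientTranscript | null {
  let msg: LiveResult | null;
  try {
    msg = JSON.parse(raw) as LiveResult | null;
  } catch {
    return null; // not JSON: nothing to forward
  }
  // Only transcript results are forwarded; other message types are dropped.
  if (msg === null || typeof msg !== "object" || msg.type !== "Results") return null;
  const text = msg.channel?.alternatives?.[0]?.transcript ?? "";
  if (text.length === 0) return null; // skip empty results such as silence
  return { kind: msg.is_final ? "final" : "interim", text };
}
```

In the proxy itself, binary audio frames from the client would pass upstream unchanged, while each upstream text message would go through `toClientMessage` and be sent to the client only when the result is non-null. Labelling interim and final results lets the client overwrite provisional text once the final version arrives.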
Auto-select the pattern from the audio duration: sync for under 60s, async for over 300s. Allow an explicit mode override via a request parameter.
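The routing rule above might look like the sketch below. The two thresholds come from the text. The text does not say what happens between 60s and 300s, so the `midRangeDefault` parameter (defaulting to async) is an assumption added here to make the function total. `selectMode` is a hypothetical name.

```typescript
type Mode = "sync" | "async";

const SYNC_MAX_SECONDS = 60;   // below this, use the sync path
const ASYNC_MIN_SECONDS = 300; // above this, use the async queue

function selectMode(
  durationSeconds: number,
  override?: Mode,
  midRangeDefault: Mode = "async", // assumption: the text leaves 60-300s unassigned
): Mode {
  // An explicit request parameter always wins over auto-selection.
  if (override) return override;
  if (durationSeconds < SYNC_MAX_SECONDS) return "sync";
  if (durationSeconds > ASYNC_MIN_SECONDS) return "async";
  return midRangeDefault;
}
```

For example, `selectMode(30)` picks the sync path, `selectMode(600)` picks the queue, and `selectMode(600, "sync")` honors the caller's override even though the file is long.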
Deploy across multiple regions behind load balancing. Use a Redis cluster for cross-region coordination. Configure per-region worker pools with a concurrency of 20 and 5 retries.
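One way to picture the per-region setup is a small config table plus a selection function that prefers the caller's region and falls back to any healthy one. This is only a sketch: the region names are placeholders, and `Region`, `DEFAULT_POOL`, and `pickRegion` are hypothetical. How health is determined (load balancer checks, heartbeats in Redis, and so on) is outside what the text specifies.

```typescript
interface WorkerPool { concurrency: number; retries: number }
interface Region { name: string; healthy: boolean; pool: WorkerPool }

// Pool settings taken from the text: 20 concurrent jobs, 5 retries.
const DEFAULT_POOL: WorkerPool = { concurrency: 20, retries: 5 };

// Prefer the requested region; otherwise use the first healthy one in list order.
function pickRegion(regions: Region[], preferred: string): Region | null {
  let fallback: Region | null = null;
  for (const region of regions) {
    if (!region.healthy) continue;
    if (region.name === preferred) return region;
    if (fallback === null) fallback = region;
  }
  return fallback; // null when no region is healthy
}

// Placeholder region names for illustration only.
const exampleRegions: Region[] = [
  { name: "us-east", healthy: false, pool: DEFAULT_POOL },
  { name: "eu-west", healthy: true, pool: DEFAULT_POOL },
];
```

With `exampleRegions` as written, a request that prefers `us-east` would be routed to `eu-west` because the preferred region is marked unhealthy.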
See detailed implementation for advanced patterns.
| Issue | Cause | Solution |
|---|---|---|
| Timeout on large files | File too long for the sync pattern | Switch to async queue |
| WebSocket disconnect | Network issue | Auto-reconnect with backoff |
| Queue backlog | Worker overload | Scale workers, increase concurrency |
| Region unavailable | Regional outage | Fail over to a healthy region |
Pattern comparison:

| Pattern | Best For | Latency | Throughput |
|---|---|---|---|
| Sync API | Short files (<60s) | Low | Low |
| Async Queue | Batch processing | Medium | High |
| Streaming | Live transcription | Real-time | Medium |
| Hybrid | Mixed workloads | Varies | High |