Implement comprehensive observability for Deepgram integrations with Prometheus metrics, OpenTelemetry distributed tracing, structured JSON logging, Grafana dashboards, and AlertManager rules.
Define counters for requests (by status/model/type), audio processed, rate limit hits, and estimated cost. Add histograms for transcription latency. Add gauges for active connections.
Wrap Deepgram client to auto-record metrics on every transcription. Track success/error counts, latency, audio duration, and cost per model. Add OpenTelemetry span attributes.
Initialize NodeSDK with OTLP exporter. Set service name, version, and environment as resource attributes. Auto-instrument HTTP (excluding /health and /metrics paths).
Use Pino with JSON output, ISO timestamps, and component-specific child loggers (transcription, metrics, alerts). Include service metadata in every log line.
Build panels for request rate, P95 latency, audio processed per hour, error rate gauge, estimated daily cost, and active connections.
Alert on: error rate >5% (critical), P95 latency >30s (warning), rate limit hits >10/hr (warning), cost spike >2x yesterday (warning), zero requests for 15min (warning).
See detailed implementation for advanced patterns.
| Issue | Cause | Solution |
|---|---|---|
| Missing metrics | No instrumentation | Use instrumented client wrapper |
| High cardinality | Too many labels | Limit label values to known set |
| Alert storms | Wrong thresholds | Tune alert rules, add for duration |
| Metric gaps | Scrape failures | Check Prometheus targets |
| Pillar | Tool | Purpose |
|---|---|---|
| Metrics | Prometheus | Performance and usage tracking |
| Traces | OpenTelemetry | Request flow visibility |
| Logs | Pino (JSON) | Debugging and audit |
| Alerts | AlertManager | Incident notification |
| Metric | Type | Purpose |
|---|---|---|
deepgram_transcription_requests_total |
Counter | Request throughput |
deepgram_transcription_latency_seconds |
Histogram | Latency tracking |
deepgram_audio_processed_seconds_total |
Counter | Usage tracking |
deepgram_estimated_cost_dollars |
Counter | Budget monitoring |
deepgram_rate_limit_hits_total |
Counter | Throttling detection |