Set up comprehensive observability for OpenEvidence clinical AI integrations with Prometheus metrics, OpenTelemetry tracing, structured logging with PHI redaction, and Grafana dashboards.
| Metric | Type | Alert Threshold |
|---|---|---|
| `openevidence_requests_total` | Counter | N/A |
| `openevidence_request_duration_seconds` | Histogram | P95 > 15s |
| `openevidence_errors_total` | Counter | > 5% error rate |
| `openevidence_cache_hits_total` | Counter | < 50% hit rate |
| `openevidence_rate_limit_remaining` | Gauge | < 10% headroom |
| `openevidence_deepconsult_active` | Gauge | > 50 concurrent |
Register counters (requests, errors, cache hits/misses), histograms (request duration, DeepConsult duration), gauges (rate limit remaining), and summaries (confidence scores) with appropriate labels.
Wrap OpenEvidence client with metrics collection, cache hit/miss tracking, rate limit monitoring from response headers, and confidence score recording.
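The wrapping described above might look roughly like the following sketch. It is not the OpenEvidence SDK: the `ClinicalClient` interface, the response shape (`confidence`, `headers`), the `x-ratelimit-remaining` header name, the `endpoint` label, and the in-memory `Map` cache are all assumptions made for illustration. The metric interfaces mirror only the small subset of prom-client methods used here (`inc`, `observe`, `set`), so prom-client `Counter`, `Histogram`, `Gauge`, and `Summary` instances should be usable in their place.

```typescript
// Minimal metric shapes, mirroring the prom-client methods this sketch uses.
interface CounterLike { inc(labels: Record<string, string>): void; }
interface HistogramLike { observe(labels: Record<string, string>, value: number): void; }
interface GaugeLike { set(value: number): void; }
interface SummaryLike { observe(value: number): void; }

interface Metrics {
  requests: CounterLike;         // openevidence_requests_total
  errors: CounterLike;           // openevidence_errors_total
  cacheHits: CounterLike;        // openevidence_cache_hits_total
  cacheMisses: CounterLike;      // cache miss counter (metric name not given in the text)
  duration: HistogramLike;       // openevidence_request_duration_seconds
  rateLimitRemaining: GaugeLike; // openevidence_rate_limit_remaining
  confidence: SummaryLike;       // confidence score summary
}

// Hypothetical response and client shapes; the real client may differ.
interface QueryResponse {
  answer: string;
  confidence?: number;
  headers: Record<string, string>;
}
interface ClinicalClient { query(question: string): Promise<QueryResponse>; }

function instrumentClient(
  client: ClinicalClient,
  metrics: Metrics,
  cache: Map<string, QueryResponse> = new Map(),
): ClinicalClient {
  return {
    async query(question: string): Promise<QueryResponse> {
      const labels = { endpoint: 'query' };

      // Serve from cache when possible and count the hit.
      const cached = cache.get(question);
      if (cached) {
        metrics.cacheHits.inc(labels);
        return cached;
      }
      metrics.cacheMisses.inc(labels);
      metrics.requests.inc(labels);

      const startMs = Date.now();
      try {
        const response = await client.query(question);
        // Header name is an assumption; skip the gauge if it is absent or non-numeric.
        const remaining = Number(response.headers['x-ratelimit-remaining']);
        if (Number.isFinite(remaining)) metrics.rateLimitRemaining.set(remaining);
        if (response.confidence !== undefined) metrics.confidence.observe(response.confidence);
        cache.set(question, response);
        return response;
      } catch (err) {
        metrics.errors.inc(labels);
        throw err;
      } finally {
        // Record duration for both successful and failed upstream calls.
        metrics.duration.observe(labels, (Date.now() - startMs) / 1000);
      }
    },
  };
}
```

In this sketch only cache misses count as upstream requests, so the cache hit rate is hits / (hits + misses). Keying the cache on the raw question text is a simplification; a real integration would likely hash or normalize the key and add expiry.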
Initialize OpenTelemetry with Google Cloud Trace exporter, HTTP and Express instrumentations, and service metadata.
Use pino with PHI redaction (patient.*, patientId, mrn, *.ssn) and OpenEvidence-specific child logger for clinical queries and DeepConsult events.
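As a rough sketch, the redaction rules above map onto pino's `redact` option. The block below only builds the options object; it assumes pino is installed and that the object is passed to `pino(...)` as shown in the trailing comment. The `[REDACTED]` censor string, the `component` binding, and the example log fields are illustrative choices, not something the text specifies.

```typescript
// Options object for pino, kept separate so it can be checked without creating a logger.
const loggerOptions = {
  level: 'info',
  redact: {
    // Paths from the text above, written in pino's redact path syntax.
    paths: ['patient.*', 'patientId', 'mrn', '*.ssn'],
    censor: '[REDACTED]', // illustrative; pino's default censor is '[Redacted]'
  },
};

// Usage, assuming `import pino from 'pino'`:
//   const logger = pino(loggerOptions);
//   const oeLogger = logger.child({ component: 'openevidence' });
//   oeLogger.info({ patientId: 'p-123', queryType: 'clinical' }, 'clinical query');
//   // the patientId value is replaced by the censor string in the emitted log line
```

Wildcards in pino redact paths should match a single level, so PHI nested more deeply (for example `encounter.patient.ssn`) would need its own path.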
Create Prometheus alerts for: high error rate (>5% warning, >20% critical), high P95 latency (>15s), low cache hit rate (<50%), rate limit warning (<10 remaining), service down.
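To make the thresholds concrete, here is a small sketch of the arithmetic behind them. It is not Prometheus rule syntax: in a real deployment these conditions would be PromQL expressions evaluated over a time window. The `Snapshot` shape, the alert names, and the `warning` severity on the latency, cache, and rate limit alerts are assumptions; the service-down check is omitted because it depends on scrape state rather than on these values.

```typescript
// Hypothetical point-in-time view of the metrics over one evaluation window.
interface Snapshot {
  requests: number;
  errors: number;
  cacheHits: number;
  cacheMisses: number;
  p95LatencySeconds: number;
  rateLimitRemaining: number;
}

interface Alert { name: string; severity: 'warning' | 'critical'; }

function evaluateAlerts(s: Snapshot): Alert[] {
  const alerts: Alert[] = [];

  // Error rate: >5% warning, >20% critical. No traffic means no error-rate alert.
  const errorRate = s.requests > 0 ? s.errors / s.requests : 0;
  if (errorRate > 0.20) alerts.push({ name: 'HighErrorRate', severity: 'critical' });
  else if (errorRate > 0.05) alerts.push({ name: 'HighErrorRate', severity: 'warning' });

  // P95 latency above 15 seconds.
  if (s.p95LatencySeconds > 15) alerts.push({ name: 'HighP95Latency', severity: 'warning' });

  // Cache hit rate below 50%, computed as hits / (hits + misses).
  const lookups = s.cacheHits + s.cacheMisses;
  if (lookups > 0 && s.cacheHits / lookups < 0.5) {
    alerts.push({ name: 'LowCacheHitRate', severity: 'warning' });
  }

  // Fewer than 10 requests left in the current rate limit window.
  if (s.rateLimitRemaining < 10) alerts.push({ name: 'RateLimitLow', severity: 'warning' });

  return alerts;
}
```

All comparisons are strict, so a window sitting exactly at a threshold (for example a 5% error rate) does not fire.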
Create panels for request rate, error rate gauge, latency heatmap, cache hit rate timeseries, rate limit gauge, and confidence score distribution.
Metrics are exposed at the `/metrics` endpoint.

| Issue | Detection | Resolution |
|---|---|---|
| Metrics endpoint down | Prometheus scrape failure | Check /metrics route registration |
| Missing traces | No spans in Cloud Trace | Verify OpenTelemetry SDK initialization |
| PHI in logs | Audit review | Add patterns to pino redact config |
| Alert fatigue | Too many alerts | Adjust thresholds; add `for:` durations to alert rules |
```typescript
// Assumes an Express `router` and a prom-client `registry` are already defined.
router.get('/metrics', async (req, res) => {
  res.set('Content-Type', registry.contentType);
  res.send(await registry.metrics());
});
```
See detailed implementation for advanced patterns.