Optimize OpenEvidence clinical query performance for point-of-care response times. Covers query optimization, intelligent caching with TTL management, connection pooling, request batching, and performance monitoring.
| Metric | Target | Critical |
|---|---|---|
| Clinical Query P50 | < 3s | > 10s |
| Clinical Query P95 | < 8s | > 15s |
| Cache Hit Rate | > 70% | < 50% |
| DeepConsult Start | < 5s | > 15s |
Strip filler words from questions, limit patient context to the top 5 conditions and 10 medications, and set `maxCitations: 5` to reduce response time.
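The trimming rules above can be sketched as a small pre-processing function. This is only an illustration: the `ClinicalQuery` shape, the `optimizeQuery` name, and the filler-word list are assumptions, not part of the OpenEvidence API, and the sketch assumes conditions and medications arrive already ranked by relevance.

```typescript
// Hypothetical request shape; only maxCitations is named in the text above.
interface ClinicalQuery {
  question: string;
  conditions: string[];   // assumed to be ordered by relevance
  medications: string[];  // assumed to be ordered by relevance
  maxCitations?: number;
}

// Illustrative filler-word list (an assumption); tune it for real traffic.
const FILLER = /\b(please|kindly|basically|actually|just|really)\b/gi;

function optimizeQuery(q: ClinicalQuery): ClinicalQuery {
  const question = q.question
    .replace(FILLER, "")   // drop filler words
    .replace(/\s+/g, " ")  // collapse the whitespace left behind
    .trim();
  return {
    question,
    conditions: q.conditions.slice(0, 5),    // top 5 conditions
    medications: q.medications.slice(0, 10), // top 10 medications
    maxCitations: 5,                         // fewer citations, faster response
  };
}
```

Normalizing the question here also helps the cache, since equivalent questions are more likely to produce the same key.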
Build a `ClinicalQueryCache` that derives its keys from a SHA-256 hash of the query. Use a dynamic TTL: 5 minutes for stat/urgent queries, 24 hours for pharmacokinetics/mechanisms, 1 hour for guidelines, and a configurable default (1 hour) for everything else.
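A minimal sketch of such a cache follows. The `QueryKind` categories, the `ttlFor` helper, and the key normalization are assumptions made for illustration, and an in-memory `Map` stands in for whatever store (for example Redis) a real deployment would use.

```typescript
import { createHash } from "node:crypto";

// Hypothetical query categories matching the TTL tiers described above.
type QueryKind = "stat" | "urgent" | "pharmacokinetics" | "mechanism" | "guideline" | "other";

const MINUTE = 60 * 1000;
const HOUR = 60 * MINUTE;

function ttlFor(kind: QueryKind, defaultTtlMs: number = HOUR): number {
  switch (kind) {
    case "stat":
    case "urgent":
      return 5 * MINUTE;
    case "pharmacokinetics":
    case "mechanism":
      return 24 * HOUR;
    case "guideline":
      return HOUR;
    default:
      return defaultTtlMs;
  }
}

class ClinicalQueryCache<V> {
  private store = new Map<string, { value: V; expiresAt: number }>();
  private defaultTtlMs: number;

  constructor(defaultTtlMs: number = HOUR) {
    this.defaultTtlMs = defaultTtlMs;
  }

  // SHA-256 over a normalized question plus sorted context, so that
  // differences in case, spacing, or context order do not split the cache.
  static key(question: string, context: string[] = []): string {
    const normalized = question.trim().toLowerCase().replace(/\s+/g, " ");
    const payload = JSON.stringify([normalized, [...context].sort()]);
    return createHash("sha256").update(payload).digest("hex");
  }

  get(key: string, now: number = Date.now()): V | undefined {
    const hit = this.store.get(key);
    if (!hit) return undefined;
    if (hit.expiresAt <= now) {
      this.store.delete(key); // lazily evict expired entries
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: V, kind: QueryKind, now: number = Date.now()): void {
    this.store.set(key, { value, expiresAt: now + ttlFor(kind, this.defaultTtlMs) });
  }
}
```

Passing `now` explicitly keeps expiry deterministic in tests; production callers can rely on the `Date.now()` default.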
Configure an HTTPS agent with `keepAlive: true`, 10 max sockets, and 5 max free sockets. Pre-warm connections on startup.
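With Node's built-in `https` module, the agent settings above might look like the sketch below. The `preWarm` helper and the idea of warming with `HEAD` requests are assumptions; the URL to warm is left to the caller because the real API host is not given here.

```typescript
import * as https from "node:https";

// Values taken from the text: keep-alive on, 10 max sockets, 5 max free sockets.
const agent = new https.Agent({
  keepAlive: true,
  maxSockets: 10,
  maxFreeSockets: 5,
});

// Hypothetical helper: issue a few cheap requests at startup so that TCP and
// TLS handshakes are already done when the first clinical query arrives.
function preWarm(url: string, count: number = 2): Promise<void[]> {
  const one = () =>
    new Promise<void>((resolve) => {
      const req = https.request(url, { method: "HEAD", agent }, (res) => {
        res.resume();                       // drain so the socket returns to the pool
        res.on("end", () => resolve());
      });
      req.on("error", () => resolve());     // warming is best-effort
      req.end();
    });
  return Promise.all(Array.from({ length: count }, one));
}
```

Every later request has to pass the same `agent` for the pooled sockets to be reused.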
Use DataLoader to batch concurrent queries (at most 5 per batch, with a 50 ms scheduling window) so that bursts stay within rate limits.
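The DataLoader library exposes `maxBatchSize` and `batchScheduleFn` options for this purpose. To show the mechanics without that dependency, here is a hand-rolled sketch of the same idea; `QueryBatcher`, `chunk`, and the batch function signature are hypothetical names, and unlike DataLoader the sketch does no per-key caching or deduplication.

```typescript
type BatchFn<K, V> = (keys: K[]) => Promise<V[]>;

interface Pending<K, V> {
  key: K;
  resolve: (value: V) => void;
  reject: (reason: unknown) => void;
}

// Split a list into consecutive groups of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}

class QueryBatcher<K, V> {
  private queue: Pending<K, V>[] = [];
  private timer: ReturnType<typeof setTimeout> | null = null;
  private batchFn: BatchFn<K, V>;
  private maxBatchSize: number;
  private windowMs: number;

  constructor(batchFn: BatchFn<K, V>, maxBatchSize: number = 5, windowMs: number = 50) {
    this.batchFn = batchFn;
    this.maxBatchSize = maxBatchSize;
    this.windowMs = windowMs;
  }

  // Queue a key; the first key in an idle batcher starts the scheduling window.
  load(key: K): Promise<V> {
    return new Promise<V>((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
      if (this.timer === null) {
        this.timer = setTimeout(() => this.flush(), this.windowMs);
      }
    });
  }

  // Dispatch everything queued during the window, at most maxBatchSize per call.
  private flush(): void {
    const pending = this.queue;
    this.queue = [];
    this.timer = null;
    for (const group of chunk(pending, this.maxBatchSize)) {
      this.batchFn(group.map((p) => p.key)).then(
        (values) => group.forEach((p, i) => p.resolve(values[i])),
        (err) => group.forEach((p) => p.reject(err)),
      );
    }
  }
}
```

The batch function must return results in the same order as the keys it receives. Groups are dispatched concurrently in this sketch, so a strict rate limit may call for sending them one after another instead.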
Instrument queries with Prometheus histograms for latency, labeled by specialty, urgency, and cached status, plus cache hit and miss counters.
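In practice a Prometheus client library would supply the histogram and counter types. The dependency-free sketch below only illustrates what gets recorded: cumulative latency buckets per label set, plus hit and miss counts. `QueryMetrics`, `QueryLabels`, and the bucket bounds are assumptions; the bounds are chosen to bracket the 3 s to 15 s thresholds in the targets table.

```typescript
interface QueryLabels {
  specialty: string;
  urgency: string;
  cached: boolean;
}

// Bucket upper bounds in seconds (an assumption, not a documented default).
const BUCKETS_SECONDS = [0.5, 1, 3, 5, 8, 10, 15];

interface Series {
  bucketCounts: number[]; // cumulative: bucket i counts observations <= bound i
  sum: number;
  count: number;
}

class QueryMetrics {
  private series = new Map<string, Series>();
  private hits = 0;
  private misses = 0;

  private static id(labels: QueryLabels): string {
    return [labels.specialty, labels.urgency, String(labels.cached)].join("|");
  }

  // Record one query's latency and update the hit/miss counters.
  observe(labels: QueryLabels, seconds: number): void {
    const id = QueryMetrics.id(labels);
    const s: Series =
      this.series.get(id) ?? { bucketCounts: BUCKETS_SECONDS.map(() => 0), sum: 0, count: 0 };
    for (let i = 0; i < BUCKETS_SECONDS.length; i++) {
      if (seconds <= BUCKETS_SECONDS[i]) s.bucketCounts[i] += 1;
    }
    s.sum += seconds;
    s.count += 1;
    this.series.set(id, s);
    if (labels.cached) this.hits += 1;
    else this.misses += 1;
  }

  bucketCounts(labels: QueryLabels): number[] {
    const s = this.series.get(QueryMetrics.id(labels));
    return s ? [...s.bucketCounts] : BUCKETS_SECONDS.map(() => 0);
  }

  hitRate(): number {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}
```

Keeping the label set small (specialty, urgency, cached) limits the number of series, which matters because each label combination carries its own set of buckets.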
| Performance Issue | Detection | Resolution |
|---|---|---|
| High P95 latency | Metrics alert | Check cache hit rate, optimize queries |
| Low cache hit rate | Hit rate < 50% | Review TTL strategy, check key normalization |
| Connection timeouts | Timeout errors | Check keep-alive config, increase pool size |
| Memory pressure | Redis alerts | Implement LRU eviction, reduce TTLs |

Cache TTL guidelines:

| Query Type | TTL |
|---|---|
| Stat/urgent queries | 5 minutes |
| Pharmacokinetics/mechanisms | 24 hours |
| Treatment guidelines | 1 hour |
| Default clinical queries | Configurable (default 1 hour) |

See detailed implementation for advanced patterns.