Production architecture for AI-powered research and search with the Perplexity Sonar API. It covers model routing for cost/quality tradeoffs, citation extraction, multi-query research pipelines, and conversational search.
Application Layer (Research Agent, Fact Checker, Content Writer)
|
Search Router (sonar/sonar-pro/sonar-reasoning)
|
Citation Pipeline (Extract URLs, Validate, Store, Render)
|
Cache Layer (Query Hash -> Result, TTL by freshness need)
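The cache layer at the bottom of the diagram could look roughly like the following. This is a minimal in-memory sketch under stated assumptions, not the original implementation: the `SearchCache` class, the `Freshness` categories, and the TTL values are all hypothetical, and a production deployment would more likely use a shared store.

```typescript
import { createHash } from "node:crypto";

// Hypothetical freshness categories; the TTL values are placeholders to tune.
type Freshness = "realtime" | "daily" | "evergreen";

const TTL_MS: Record<Freshness, number> = {
  realtime: 60 * 1000,                // 1 minute
  daily: 24 * 60 * 60 * 1000,         // 1 day
  evergreen: 7 * 24 * 60 * 60 * 1000, // 1 week
};

interface CacheEntry<T> {
  value: T;
  expiresAt: number; // epoch milliseconds
}

class SearchCache<T> {
  private entries = new Map<string, CacheEntry<T>>();
  private now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Hash the model plus a normalized query so trivially different
  // spellings of the same question share one entry.
  static key(query: string, model: string): string {
    const normalized = query.trim().toLowerCase().replace(/\s+/g, " ");
    return createHash("sha256").update(`${model}\n${normalized}`).digest("hex");
  }

  get(query: string, model: string): T | undefined {
    const key = SearchCache.key(query, model);
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: evict lazily on read
      return undefined;
    }
    return entry.value;
  }

  set(query: string, model: string, value: T, freshness: Freshness): void {
    const key = SearchCache.key(query, model);
    this.entries.set(key, { value, expiresAt: this.now() + TTL_MS[freshness] });
  }
}
```

Including the model name in the key keeps a cheap `sonar` answer from being served where a `sonar-pro` answer was requested.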
Use an OpenAI-compatible client pointed at https://api.perplexity.ai. Route queries by depth: quick and standard queries go to sonar, deep queries to sonar-pro, and reasoning queries to sonar-reasoning.
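The routing rule above can be sketched as a small lookup table. This is a minimal illustration, not the original search service: `Depth`, `selectModel`, and `buildSearchRequest` are hypothetical names, and the sketch builds a plain `fetch` request against the OpenAI-style `/chat/completions` path instead of using an SDK so that it stays self-contained.

```typescript
// Hypothetical depth labels, taken from the routing rule in the text.
type Depth = "quick" | "standard" | "deep" | "reasoning";

const MODEL_BY_DEPTH: Record<Depth, string> = {
  quick: "sonar",
  standard: "sonar",
  deep: "sonar-pro",
  reasoning: "sonar-reasoning",
};

function selectModel(depth: Depth): string {
  return MODEL_BY_DEPTH[depth];
}

// Build the URL and fetch options for one search request.
// Assumption: the endpoint follows the OpenAI-compatible
// chat completions shape (model + messages).
function buildSearchRequest(query: string, depth: Depth, apiKey: string) {
  return {
    url: "https://api.perplexity.ai/chat/completions",
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model: selectModel(depth),
        messages: [{ role: "user", content: query }],
      }),
    },
  };
}
```

A caller would pass the result to `fetch(url, init)` and read the answer from the first choice of the JSON response. Keeping the mapping in one table makes it easy to adjust the cost/quality tradeoff without touching call sites.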
Parse the response text for numbered references of the form [N] followed by a URL, as well as bare inline URLs. Deduplicate the results and store each citation with its metadata.
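One way to implement the extraction step described above is with two regular-expression passes. This is a sketch under assumptions: the `Citation` shape and `extractCitations` name are hypothetical, the only metadata kept is the reference number, and the URL pattern is deliberately simple (it stops at whitespace, brackets, and quotes, then strips trailing punctuation).

```typescript
interface Citation {
  url: string;
  index?: number; // the N from a "[N] URL" reference, when present
}

// Drop sentence punctuation that got glued to the end of a URL.
function cleanUrl(raw: string): string {
  return raw.replace(/[.,;:!?]+$/, "");
}

function extractCitations(text: string): Citation[] {
  const seen = new Map<string, Citation>();
  // Fresh regex objects per call, because /g regexes carry lastIndex state.
  const numbered = /\[(\d+)\]:?\s*(https?:\/\/[^\s<>()\[\]"']+)/g;
  const inline = /https?:\/\/[^\s<>()\[\]"']+/g;

  let match: RegExpExecArray | null;

  // Pass 1: numbered references, so the index is kept when available.
  while ((match = numbered.exec(text)) !== null) {
    const url = cleanUrl(match[2]);
    if (!seen.has(url)) seen.set(url, { url, index: Number(match[1]) });
  }

  // Pass 2: any remaining bare URLs; duplicates from pass 1 are skipped.
  while ((match = inline.exec(text)) !== null) {
    const url = cleanUrl(match[0]);
    if (!seen.has(url)) seen.set(url, { url });
  }

  return Array.from(seen.values());
}
```

Running the numbered pass first means a URL that appears both as a numbered reference and inline keeps its reference number.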
Research runs in phases: get a broad overview with the fast model (sonar), identify subtopics, deep dive into each subtopic with sonar-pro, then deduplicate citations across all phases.
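The phases above can be sketched as a single function that takes the search call as a parameter. Everything named here is an assumption for illustration: `SearchFn`, `SearchResult`, `parseSubtopics`, and `research` are hypothetical, the prompts are placeholders, and subtopics are identified by simply reading bullet or numbered lines out of the overview text.

```typescript
interface SearchResult {
  text: string;
  citations: string[];
}

// Injected so the pipeline can run against the real API or a stub.
type SearchFn = (query: string, model: string) => Promise<SearchResult>;

interface ResearchReport {
  overview: string;
  sections: { subtopic: string; text: string }[];
  citations: string[];
}

// Phase 2 helper: pull subtopics out of bullet or numbered list lines.
function parseSubtopics(text: string, limit: number): string[] {
  const subtopics: string[] = [];
  for (const line of text.split("\n")) {
    const match = /^\s*(?:[-*]|\d+[.)])\s+(.+)$/.exec(line);
    if (match) subtopics.push(match[1].trim());
  }
  return subtopics.slice(0, limit);
}

async function research(
  topic: string,
  search: SearchFn,
  maxSubtopics = 3,
): Promise<ResearchReport> {
  // Phase 1: broad overview on the fast model.
  const overview = await search(
    `Give a short overview of ${topic}, then list its key subtopics as bullet points.`,
    "sonar",
  );

  // Phase 2: identify subtopics from the overview.
  const subtopics = parseSubtopics(overview.text, maxSubtopics);

  // Phase 3: deep dive into each subtopic, one at a time to stay
  // gentle on rate limits.
  const sections: { subtopic: string; text: string }[] = [];
  const allCitations = [...overview.citations];
  for (const subtopic of subtopics) {
    const result = await search(`${topic}: ${subtopic}`, "sonar-pro");
    sections.push({ subtopic, text: result.text });
    allCitations.push(...result.citations);
  }

  // Phase 4: deduplicate citations, preserving first-seen order.
  const citations = Array.from(new Set(allCitations));
  return { overview: overview.text, sections, citations };
}
```

Capping the number of subtopics bounds the number of sonar-pro calls, which is where most of the cost of a research run would come from.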
Maintain the message history so that follow-up questions can build on previous context.
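A conversational session along these lines might keep the history as an array of chat messages and resend it with every question. This is a minimal sketch, not the original session code: `SearchSession`, `CompleteFn`, and the `maxTurns` trimming policy are assumptions, and the actual API call is injected as a function.

```typescript
interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

// Injected: sends the messages to a chat model and returns the answer text.
type CompleteFn = (messages: Message[]) => Promise<string>;

class SearchSession {
  private history: Message[] = []; // alternating user/assistant messages
  private complete: CompleteFn;
  private systemPrompt: string | undefined;
  private maxTurns: number;

  constructor(complete: CompleteFn, systemPrompt?: string, maxTurns = 10) {
    this.complete = complete;
    this.systemPrompt = systemPrompt;
    this.maxTurns = maxTurns;
  }

  async ask(question: string): Promise<string> {
    const pending: Message[] = [...this.history, { role: "user", content: question }];
    const messages: Message[] = this.systemPrompt
      ? [{ role: "system", content: this.systemPrompt }, ...pending]
      : pending;

    const answer = await this.complete(messages);

    // Commit to history only after a successful call, then keep the
    // most recent turns (one turn = one user + one assistant message).
    const updated: Message[] = [...pending, { role: "assistant", content: answer }];
    this.history = updated.slice(-2 * this.maxTurns);
    return answer;
  }

  get turns(): number {
    return this.history.length / 2;
  }
}
```

Because the history is only updated after the call succeeds, a failed request does not leave an unanswered user message behind, and trimming in whole turns keeps the roles alternating.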
See the detailed implementation for the search service, citation extraction, multi-query research pipeline, conversational session, and fact-check service code.
| Issue | Cause | Solution |
|---|---|---|
| No citations returned | Using basic sonar for complex query | Upgrade to sonar-pro |
| Stale information | Outdated sources | Add recency preference in system prompt |
| High cost | Using sonar-pro for simple queries | Route simple queries to sonar |
| Rate limit | Too many concurrent searches | Add request queue with delays |
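For the rate limit row, a request queue with delays can be as small as a promise chain. The sketch below is one possible shape, assuming a single process and strictly serial requests; `RequestQueue`, `sleep`, and the interval value are hypothetical and not taken from the original implementation.

```typescript
const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

class RequestQueue {
  // Resolves once the previous task has settled and the delay has elapsed.
  private tail: Promise<unknown> = Promise.resolve();
  private minIntervalMs: number;

  constructor(minIntervalMs: number) {
    this.minIntervalMs = minIntervalMs;
  }

  enqueue<T>(task: () => Promise<T>): Promise<T> {
    // Start this task only after everything queued before it is done.
    const run = this.tail.then(() => task());

    // The next task waits for this one to settle (success or failure),
    // then for the configured delay. Errors are swallowed here only so
    // that the chain keeps going; the caller still sees them via `run`.
    this.tail = run
      .then(() => undefined, () => undefined)
      .then(() => sleep(this.minIntervalMs));

    return run;
  }
}
```

A caller would wrap each search in `queue.enqueue(() => search(query))`. A failed request rejects only its own promise and does not stall the requests queued behind it.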
const result = await factCheck("The Earth is approximately 4.5 billion years old");
console.log(result.verdict); // Accurate, with sources
console.log(result.sources); // Citation URLs