
Perplexity Reliability Patterns

v20260311
perplexity-reliability-patterns
Guides implementing Perplexity resilience with caching, model fallbacks, streaming timeouts, and citation checks to keep search‑augmented services fault tolerant under variable latency.

Overview

Production reliability patterns for Perplexity Sonar API integrations. Perplexity performs a live web search on every request, so response times vary with search complexity, unlike plain LLM inference with no retrieval step.

Prerequisites

  • Perplexity API key configured
  • Caching layer (Redis recommended)
  • Understanding of search-augmented generation latency

Instructions

Step 1: Cache Identical Queries

Perplexity's web search is expensive per call. Cache results for repeated queries within a time window.

import hashlib, json

class PerplexityCache:
    def __init__(self, redis_client, ttl=600):  # TTL in seconds (600 = 10 minutes)
        self.r = redis_client
        self.ttl = ttl

    def get_or_search(self, client, messages, model="sonar", **kwargs):
        """Return the completion as a dict, served from cache when available."""
        key = self._cache_key(messages, model, **kwargs)
        cached = self.r.get(key)
        if cached is not None:
            return json.loads(cached)
        result = client.chat.completions.create(
            model=model, messages=messages, **kwargs
        )
        payload = result.to_dict()
        self.r.setex(key, self.ttl, json.dumps(payload))
        return payload  # same type (dict) on cache hit and cache miss

    def _cache_key(self, messages, model, **kwargs):
        # sort_keys makes the key independent of keyword-argument order
        data = json.dumps({"m": messages, "model": model, **kwargs}, sort_keys=True)
        return f"pplx:{hashlib.sha256(data.encode()).hexdigest()}"
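
To try the caching pattern without a Redis server or an API key, a minimal sketch follows. `InMemoryTTLStore` is a hypothetical stand-in that mimics only the `get` and `setex` calls used above, and `make_cache_key` mirrors `_cache_key`; neither name comes from a Perplexity or Redis library.

```python
import hashlib
import json
import time


def make_cache_key(messages, model, **kwargs):
    """Build a deterministic key; sort_keys makes kwarg order irrelevant."""
    data = json.dumps({"m": messages, "model": model, **kwargs}, sort_keys=True)
    return f"pplx:{hashlib.sha256(data.encode()).hexdigest()}"


class InMemoryTTLStore:
    """Hypothetical Redis stand-in exposing only get() and setex()."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._data = {}      # key -> (expires_at, value)

    def setex(self, key, ttl, value):
        self._data[key] = (self._clock() + ttl, value)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: behave like a cache miss
            return None
        return value
```

Passing `InMemoryTTLStore()` as `redis_client` should be enough for local experiments, since the cache class only calls those two methods.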

Step 2: Model Tier Fallback

If sonar-pro times out or errors, fall back to sonar for a faster but shallower response.

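def resilient_search(client, messages, timeout=30):
    try:
        # First attempt: deeper search, longer timeout
        return client.chat.completions.create(
            model="sonar-pro", messages=messages, timeout=timeout
        )
    except Exception:
        # Broad catch keeps the example short; in production, catch only
        # timeout and server errors so auth or validation errors surface.
        return client.chat.completions.create(
            model="sonar", messages=messages, timeout=15
        )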
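
The two-tier fallback can be generalized to an ordered list of tiers. The sketch below assumes a caller-supplied `call(model, timeout)` function that wraps the real API request; `search_with_fallback` and its `tiers` parameter are illustrative names, not part of any SDK.

```python
from typing import Any, Callable, Sequence, Tuple


def search_with_fallback(
    call: Callable[[str, float], Any],
    tiers: Sequence[Tuple[str, float]] = (("sonar-pro", 30.0), ("sonar", 15.0)),
) -> Tuple[str, Any]:
    """Try each (model, timeout) tier in order; return (model_used, result)."""
    last_error = None
    for model, timeout in tiers:
        try:
            return model, call(model, timeout)
        except Exception as exc:  # narrow this to your SDK's error types
            last_error = exc
    # Every tier failed: surface the last underlying error as the cause.
    raise RuntimeError("all model tiers failed") from last_error
```

With the client from the steps above, `call` could be `lambda model, timeout: client.chat.completions.create(model=model, messages=messages, timeout=timeout)`. Returning the model name alongside the result lets callers log when the shallower tier served the answer.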

Step 3: Streaming with Timeout Protection

Perplexity streams can stall on complex searches. Set per-chunk timeouts.

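import time

def stream_with_timeout(client, messages, chunk_timeout=10):
    # The loop check below runs only after a chunk arrives, so it cannot
    # interrupt a fully stalled stream. Passing a request timeout as well
    # covers that case, assuming the client applies it to each network read.
    stream = client.chat.completions.create(
        model="sonar", messages=messages, stream=True, timeout=chunk_timeout
    )
    last_chunk = time.monotonic()
    full_response = ""
    citations = []

    for chunk in stream:
        # Detect chunks that arrived later than the allowed gap
        if time.monotonic() - last_chunk > chunk_timeout:
            raise TimeoutError("Stream stalled")
        last_chunk = time.monotonic()
        # Some chunks may carry no choices or no text
        delta = (chunk.choices[0].delta.content or "") if chunk.choices else ""
        full_response += delta
        if getattr(chunk, "citations", None):
            citations = chunk.citations
        yield delta

    # In a generator, this value reaches the caller as StopIteration.value
    return full_response, citations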
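
A check inside the `for` loop runs only once the next chunk has arrived, so by itself it cannot interrupt a stream that never delivers another chunk. One way to enforce a hard per-chunk deadline is to read the stream in a background thread and wait on a queue with a timeout. The sketch below uses only the standard library; `iter_with_chunk_timeout` is a hypothetical helper that wraps any iterable, including the stream object from the step above.

```python
import queue
import threading


def iter_with_chunk_timeout(source, chunk_timeout=10.0):
    """Yield items from `source`; raise TimeoutError if the next item is late."""
    q = queue.Queue()

    def pump():
        # Runs in a background thread so the consumer can wait with a deadline.
        try:
            for item in source:
                q.put(("item", item))
            q.put(("done", None))
        except Exception as exc:  # forward producer errors to the consumer
            q.put(("error", exc))

    threading.Thread(target=pump, daemon=True).start()

    while True:
        try:
            kind, payload = q.get(timeout=chunk_timeout)
        except queue.Empty:
            raise TimeoutError(f"no chunk within {chunk_timeout}s") from None
        if kind == "done":
            return
        if kind == "error":
            raise payload
        yield payload
```

After a timeout the background thread may still be blocked on the underlying read, so the caller should also close the stream or HTTP response if the client exposes a way to do so.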

Step 4: Citation Validation

Verify cited URLs are accessible before presenting to users.

import asyncio
import aiohttp

async def validate_citations(citations: list[str]) -> list[dict]:
    validated = []
    async with aiohttp.ClientSession() as session:
        for url in citations[:5]:  # limit to top 5
            try:
                async with session.head(url, timeout=aiohttp.ClientTimeout(total=5)) as r:
                    # Any status below 400 (2xx or 3xx) counts as reachable
                    validated.append({"url": url, "status": r.status, "valid": r.status < 400})
            except (aiohttp.ClientError, asyncio.TimeoutError):
                # Connection failure, invalid URL, or timeout: mark as unreachable
                validated.append({"url": url, "status": 0, "valid": False})
    return validated
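
Checking URLs one at a time means the worst case is the sum of the individual timeouts. A possible variation runs the checks concurrently with a cap on parallelism. In this sketch, `validate_concurrently` and the injected `check` coroutine are hypothetical names; in practice `check` would wrap the `session.head` call shown above and return the HTTP status code.

```python
import asyncio
from typing import Awaitable, Callable


async def validate_concurrently(
    urls: list[str],
    check: Callable[[str], Awaitable[int]],
    max_concurrency: int = 5,
) -> list[dict]:
    """Run `check` on all URLs concurrently, capped by a semaphore; keep input order."""
    sem = asyncio.Semaphore(max_concurrency)

    async def one(url: str) -> dict:
        async with sem:
            try:
                status = await check(url)
            except Exception:  # treat any checker failure as unreachable
                return {"url": url, "status": 0, "valid": False}
        return {"url": url, "status": status, "valid": status < 400}

    # gather returns results in the same order as the input URLs
    return list(await asyncio.gather(*(one(u) for u in urls)))
```

Injecting `check` keeps the concurrency logic testable without network access and leaves the choice of HTTP client to the caller.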

Error Handling

Issue                 | Cause                  | Solution
Slow responses (>15s) | Complex search query   | Use sonar instead of sonar-pro
Stream stalls         | Search taking too long | Per-chunk timeout detection
Stale results         | Cached data too old    | Reduce TTL for time-sensitive queries
Broken citation links | Source pages moved     | Validate URLs before displaying
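
For the stale-results row, one simple option is to pick the cache TTL per query. The sketch below is a heuristic under stated assumptions: the keyword list, the 60-second value, and the `choose_ttl` name are all illustrative and should be tuned against real traffic.

```python
import re

# Hypothetical keyword list; adjust it for your own query mix.
_TIME_SENSITIVE = re.compile(
    r"\b(today|tonight|now|latest|breaking|current|price|score|weather)\b",
    re.IGNORECASE,
)


def choose_ttl(query: str, default_ttl: int = 600, fresh_ttl: int = 60) -> int:
    """Return a shorter TTL (in seconds) when the query looks time-sensitive."""
    return fresh_ttl if _TIME_SENSITIVE.search(query) else default_ttl
```

The returned value can be passed as the TTL when writing a cache entry, so evergreen questions keep the 10-minute window while news-like queries expire sooner.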

Examples

Basic usage: Wrap every search call in PerplexityCache.get_or_search with the default 10-minute TTL and the sonar model.

Advanced scenario: For production, combine the patterns: try sonar-pro with fallback to sonar, stream with per-chunk timeouts, and validate citations before display, using a shorter TTL for time-sensitive queries.

Output

  • Configuration files or code changes applied to the project
  • Validation report confirming correct implementation
  • Summary of changes made and their rationale

Info

  • Category: Development
  • Name: perplexity-reliability-patterns
  • Version: v20260311
  • Size: 4.74KB
  • Updated At: 2026-03-12