v20260311
perplexity-reliability-patterns
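Guidance on improving the stability and fault tolerance of Perplexity's search-augmented service through caching, model fallback, stream timeout protection, and citation validation, to cope with latency swings and failed calls.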

Perplexity Reliability Patterns

Overview

Production reliability patterns for Perplexity Sonar API integrations. Because Perplexity performs a live web search for each request, response times vary with search complexity, unlike a plain LLM inference call that involves no retrieval.

Prerequisites

  • Perplexity API key configured
  • Caching layer (Redis recommended)
  • Understanding of search-augmented generation latency

Instructions

Step 1: Cache Identical Queries

Perplexity's web search is expensive per call. Cache results for repeated queries within a time window.

import hashlib
import json

class PerplexityCache:
    def __init__(self, redis_client, ttl=600):  # TTL in seconds (600 = 10 minutes)
        self.r = redis_client
        self.ttl = ttl

    def get_or_search(self, client, messages, model="sonar", **kwargs):
        """Return the response as a dict, served from cache when available."""
        key = self._cache_key(messages, model, **kwargs)
        cached = self.r.get(key)
        if cached:
            return json.loads(cached)
        result = client.chat.completions.create(
            model=model, messages=messages, **kwargs
        )
        # Serialize once so cache hits and misses return the same type (a dict)
        payload = result.to_dict()
        self.r.setex(key, self.ttl, json.dumps(payload))
        return payload

    def _cache_key(self, messages, model, **kwargs):
        # Nest kwargs under their own key so they cannot collide with "m" or "model"
        data = json.dumps({"m": messages, "model": model, "kwargs": kwargs}, sort_keys=True)
        return f"pplx:{hashlib.sha256(data.encode()).hexdigest()}"
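To exercise the cache without a running Redis server, an in-memory stand-in can be enough. The sketch below is a hypothetical helper (InMemoryTTLStore and its clock parameter are not part of the original skill). It assumes the cache only needs get and setex, the two calls PerplexityCache makes, and it mirrors the redis-py argument order for setex.

```python
import time


class InMemoryTTLStore:
    """Hypothetical stand-in exposing the two Redis calls used above: get and setex."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so tests can control time
        self._data = {}      # key -> (expires_at, value)

    def setex(self, key, ttl_seconds, value):
        # Same argument order as redis-py: name, time, value
        self._data[key] = (self._clock() + ttl_seconds, value)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazily evict expired entries
            return None
        return value
```

Passing InMemoryTTLStore() as redis_client should be enough for local tests. It is not shared across processes, so it is not a substitute for Redis in production.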

Step 2: Model Tier Fallback

If sonar-pro times out or errors, fall back to sonar for a faster but shallower response.

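def resilient_search(client, messages, timeout=30):
    """Try sonar-pro first; on any failure, retry once with the faster sonar model."""
    try:
        return client.chat.completions.create(
            model="sonar-pro", messages=messages, timeout=timeout
        )
    except Exception:
        # Broad catch: timeouts and API errors both trigger the fallback.
        # Narrow this to your client's exception types so that errors a fallback
        # cannot fix (such as an invalid API key) surface immediately.
        return client.chat.completions.create(
            model="sonar", messages=messages, timeout=15
        )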
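When callers need to know which model actually answered (for example, to label a shallower response), the fallback can be generalized to an ordered list of tiers. This is a minimal sketch under stated assumptions: search_with_tiers and its create parameter are hypothetical names, and create is expected to be a callable with the same keyword arguments as client.chat.completions.create.

```python
def search_with_tiers(create, messages, tiers=(("sonar-pro", 30), ("sonar", 15))):
    """Try each (model, timeout) tier in order; return (model_used, response)."""
    last_error = None
    for model, timeout in tiers:
        try:
            return model, create(model=model, messages=messages, timeout=timeout)
        except Exception as exc:  # broad on purpose in this sketch; narrow in real code
            last_error = exc
    # Every tier failed: surface the last underlying error as the cause
    raise RuntimeError("all model tiers failed") from last_error
```

A call such as search_with_tiers(client.chat.completions.create, messages) would then return both the model name and the response, so the UI can indicate when the fallback tier was used.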

Step 3: Streaming with Timeout Protection

Perplexity streams can stall on complex searches. Set per-chunk timeouts.

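import time

def stream_with_timeout(client, messages, chunk_timeout=10):
    """Yield text deltas; raise TimeoutError if the gap between chunks exceeds chunk_timeout."""
    stream = client.chat.completions.create(
        model="sonar", messages=messages, stream=True
    )
    last_chunk = time.monotonic()  # monotonic clock is unaffected by system clock changes
    full_response = ""
    citations = []

    # Note: this check runs only when a chunk arrives, so it flags a slow gap after
    # the fact. A stream that stops sending entirely also needs a client read timeout.
    for chunk in stream:
        if time.monotonic() - last_chunk > chunk_timeout:
            raise TimeoutError("Stream stalled")
        last_chunk = time.monotonic()
        if getattr(chunk, "citations", None):
            citations = chunk.citations
        if not chunk.choices:  # some chunks may carry no choices
            continue
        delta = chunk.choices[0].delta.content or ""
        full_response += delta
        yield delta

    # A generator's return value is only reachable via StopIteration.value
    return full_response, citations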
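A check placed inside the for loop only runs once the next chunk arrives, so it cannot interrupt a stream that has stopped sending altogether. One way to enforce the gap is to read the stream on a helper thread and wait on a queue with a timeout. The sketch below is an illustration, not the skill's own method: iter_with_chunk_timeout is a hypothetical name, and it works on any iterable, including the stream object returned with stream=True.

```python
import queue
import threading

_DONE = object()  # sentinel marking the normal end of the stream


def iter_with_chunk_timeout(source, chunk_timeout=10.0):
    """Yield items from source; raise TimeoutError if no item arrives within chunk_timeout."""
    q = queue.Queue()

    def pump():
        # Runs on a helper thread so the consumer never blocks on the source itself
        try:
            for item in source:
                q.put((True, item))
            q.put((True, _DONE))
        except Exception as exc:
            q.put((False, exc))  # forward source errors to the consumer

    threading.Thread(target=pump, daemon=True).start()

    while True:
        try:
            ok, payload = q.get(timeout=chunk_timeout)
        except queue.Empty:
            raise TimeoutError(f"no chunk within {chunk_timeout}s") from None
        if not ok:
            raise payload
        if payload is _DONE:
            return
        yield payload
```

After a timeout the helper thread may stay blocked on the source until the connection closes, so this should be paired with a client-side timeout or an explicit close of the stream.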

Step 4: Citation Validation

Verify cited URLs are accessible before presenting to users.

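import asyncio
import aiohttp

async def validate_citations(citations: list[str]) -> list[dict]:
    """HEAD-check up to five cited URLs and report which ones respond."""
    validated = []
    timeout = aiohttp.ClientTimeout(total=5)
    async with aiohttp.ClientSession() as session:
        for url in citations[:5]:  # limit to top 5
            try:
                async with session.head(url, timeout=timeout) as r:
                    # Any status below 400 (success or redirect) counts as reachable
                    validated.append({"url": url, "status": r.status, "valid": r.status < 400})
            except (aiohttp.ClientError, asyncio.TimeoutError):
                # Connection failure, invalid URL, or timeout: record status 0
                validated.append({"url": url, "status": 0, "valid": False})
    return validated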
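Checking URLs one at a time means the worst case is the sum of all timeouts. A concurrent variant with a cap on in-flight requests can be sketched as follows. The names validate_concurrently, check, and limit are hypothetical; check is assumed to be any coroutine function that returns an HTTP status code for a URL (for instance, a thin wrapper around session.head), which also keeps the sketch testable without network access.

```python
import asyncio


async def validate_concurrently(urls, check, limit=5):
    """Run check(url) for every URL with at most `limit` checks in flight; keep input order."""
    gate = asyncio.Semaphore(limit)

    async def guarded(url):
        async with gate:
            try:
                status = await check(url)
            except Exception:
                status = 0  # treat any failure as unreachable
            return {"url": url, "status": status, "valid": 0 < status < 400}

    # gather preserves the order of its arguments, so results line up with urls
    return await asyncio.gather(*(guarded(u) for u in urls))
```

Injecting check rather than hard-coding the HTTP call is a design choice for this sketch: the concurrency logic can be verified with a stub, and the real network call stays in one place.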

Error Handling

| Issue | Cause | Solution |
|---|---|---|
| Slow responses (>15s) | Complex search query | Use sonar instead of sonar-pro |
| Stream stalls | Search taking too long | Per-chunk timeout detection |
| Stale results | Cached data too old | Reduce TTL for time-sensitive queries |
| Broken citation links | Source pages moved | Validate URLs before displaying |
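The "Stale results" row suggests a shorter TTL for time-sensitive queries. One simple way to decide is a keyword heuristic, sketched below. The function name pick_ttl, the keyword list, and the 60-second value are all assumptions for illustration and would need tuning against real traffic.

```python
import re

# Hypothetical keyword list; adjust it to the queries your application actually sees
_TIME_SENSITIVE = re.compile(
    r"\b(today|latest|now|breaking|current|price|score|weather)\b",
    re.IGNORECASE,
)


def pick_ttl(query, default_ttl=600, fresh_ttl=60):
    """Return a shorter TTL (in seconds) when the query looks time-sensitive."""
    return fresh_ttl if _TIME_SENSITIVE.search(query) else default_ttl
```

The result could be passed as the ttl when writing to the cache, so that questions about current events expire quickly while reference-style questions keep the 10-minute default.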

Examples

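Basic usage: Wrap each request in PerplexityCache.get_or_search with the default sonar model and 10-minute TTL, so repeated queries are served from Redis instead of triggering a new web search.

Advanced scenario: Combine the patterns for production use. Call resilient_search to try sonar-pro with a fallback to sonar, stream long answers through stream_with_timeout, and run validate_citations before showing sources to users.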

Output

  • Configuration files or code changes applied to the project
  • Validation report confirming correct implementation
  • Summary of changes made and their rationale
Info
Category Programming and Development
Name perplexity-reliability-patterns
Version v20260311
Size 4.74KB
Updated 2026-03-12