
API Fault Tolerance and Reliability Pattern Implementation

v20260423
perplexity-reliability-patterns
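This guide addresses the reliability challenges of integrating external large language model (LLM) APIs, such as Perplexity Sonar, in production. It provides a complete set of fault-tolerance patterns, including circuit breakers, model-level fallback, streaming timeout detection, and caching, so that your application stays stable and highly available even when the external service suffers transient failures or performance fluctuations.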

Perplexity Reliability Patterns

Overview

Production reliability patterns for the Perplexity Sonar API. Perplexity performs a live web search for each request, which makes response times inherently variable. The key reliability challenges are that searches can stall, citation links can break, and model tiers differ in availability.

Prerequisites

  • Perplexity API key configured
  • Cache layer (Redis or in-memory)
  • Understanding of search latency variability

Instructions

Step 1: Model Tier Fallback

import OpenAI from "openai";

const perplexity = new OpenAI({
  apiKey: process.env.PERPLEXITY_API_KEY!,
  baseURL: "https://api.perplexity.ai",
});

async function resilientSearch(
  query: string,
  preferredModel: string = "sonar-pro"
) {
  const fallbackChain = [preferredModel, "sonar"];
  let lastError: Error | null = null;

  for (const model of fallbackChain) {
    try {
      const response = await perplexity.chat.completions.create({
        model,
        messages: [{ role: "user", content: query }],
        max_tokens: model === "sonar-pro" ? 2048 : 512,
      });

      if (model !== preferredModel) {
        console.warn(`[Reliability] Fell back from ${preferredModel} to ${model}`);
      }

      return {
        answer: response.choices[0].message.content || "",
        citations: (response as any).citations || [],
        model: response.model,
        fallback: model !== preferredModel,
      };
    } catch (err: any) {
      lastError = err;
      if (err.status === 401 || err.status === 402) throw err; // Don't retry auth/billing
      console.warn(`[Reliability] ${model} failed (${err.status || err.message}), trying next`);
    }
  }

  throw lastError || new Error("All models failed");
}
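
The fallback loop above can also be separated from the Perplexity client so that the control flow is testable without network access. The following is a minimal sketch under that assumption; `withFallback`, `Attempt`, and `isAuthOrBilling` are hypothetical names introduced for illustration, not part of the original code or of any SDK.

```typescript
// Hypothetical helper: names and the isFatal predicate are illustrative.
type Attempt<T> = { label: string; run: () => Promise<T> };

async function withFallback<T>(
  attempts: Attempt<T>[],
  isFatal: (err: unknown) => boolean
): Promise<{ value: T; label: string; fallback: boolean }> {
  let lastError: unknown = new Error("No attempts provided");

  for (let i = 0; i < attempts.length; i++) {
    const { label, run } = attempts[i];
    try {
      const value = await run();
      // fallback is true whenever anything other than the first attempt answered
      return { value, label, fallback: i > 0 };
    } catch (err) {
      lastError = err;
      // Fatal errors would fail on every tier, so stop instead of trying the next one
      if (isFatal(err)) throw err;
    }
  }
  throw lastError;
}

// Mirrors the Step 1 rule: 401 and 402 are never retried on another tier.
const isAuthOrBilling = (err: unknown): boolean => {
  const status = (err as { status?: number } | null)?.status;
  return status === 401 || status === 402;
};
```

With this shape, each tier becomes one `Attempt` whose `run` wraps the corresponding `perplexity.chat.completions.create` call, and the chain can be exercised in unit tests with stub functions.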

Step 2: Circuit Breaker

class CircuitBreaker {
  private failures = 0;
  private lastFailure = 0;
  private state: "closed" | "open" | "half-open" = "closed";

  constructor(
    private threshold: number = 5,
    private resetTimeMs: number = 60000
  ) {}

  async execute<T>(fn: () => Promise<T>, fallback: () => Promise<T>): Promise<T> {
    if (this.state === "open") {
      if (Date.now() - this.lastFailure > this.resetTimeMs) {
        this.state = "half-open";
      } else {
        console.warn("[CircuitBreaker] Open — using fallback");
        return fallback();
      }
    }

    try {
      const result = await fn();
      // Any success closes the circuit and clears the count, so only
      // consecutive failures can reach the threshold.
      this.state = "closed";
      this.failures = 0;
      return result;
    } catch (err) {
      this.failures++;
      this.lastFailure = Date.now();
      if (this.failures >= this.threshold) {
        this.state = "open";
        console.warn(`[CircuitBreaker] Opened after ${this.failures} failures`);
      }
      return fallback();
    }
  }

  get status() {
    return { state: this.state, failures: this.failures };
  }
}

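// Usage
// getCachedResult is assumed to be an application-defined cache lookup that
// resolves to the same shape resilientSearch returns.
const breaker = new CircuitBreaker(5, 60000);
const query = "explain quantum computing";
const cachedFallback = () => getCachedResult(query);

const result = await breaker.execute(
  () => resilientSearch(query, "sonar-pro"),
  cachedFallback
);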
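
State transitions that depend on elapsed time are awkward to test against the real clock. As a sketch only, the variant below takes a `now` function so the closed, open, and half-open transitions can be checked deterministically. `TestableBreaker` and the `now` parameter are assumptions added for illustration. This variant also re-opens immediately when the half-open trial call fails, and it resets the failure count on any success.

```typescript
// Compact breaker with an injectable clock. The `now` parameter is an
// assumption added for testability; it is not part of the class above.
type BreakerState = "closed" | "open" | "half-open";

class TestableBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = "closed";

  constructor(
    private threshold: number,
    private resetTimeMs: number,
    private now: () => number = Date.now
  ) {}

  async execute<T>(fn: () => Promise<T>, fallback: () => Promise<T>): Promise<T> {
    if (this.state === "open") {
      // Still cooling down: skip the real call entirely
      if (this.now() - this.openedAt < this.resetTimeMs) return fallback();
      this.state = "half-open"; // Cool-down elapsed: allow one trial call
    }
    try {
      const result = await fn();
      this.state = "closed";
      this.failures = 0;
      return result;
    } catch {
      this.failures++;
      // A failed trial call re-opens at once; otherwise wait for the threshold
      if (this.state === "half-open" || this.failures >= this.threshold) {
        this.state = "open";
        this.openedAt = this.now();
      }
      return fallback();
    }
  }

  get status() {
    return { state: this.state, failures: this.failures };
  }
}
```

In a test, `now` can be a closure over a mutable number that the test advances by hand, which avoids waiting out `resetTimeMs`.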

Step 3: Streaming with Timeout Protection

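async function* streamWithTimeout(
  query: string,
  model: string = "sonar",
  chunkTimeoutMs: number = 10000
): AsyncGenerator<{ type: "text" | "citations" | "timeout"; data: any }> {
  const stream = await perplexity.chat.completions.create({
    model,
    messages: [{ role: "user", content: query }],
    stream: true,
    max_tokens: 2048,
  });

  const iterator = stream[Symbol.asyncIterator]();

  while (true) {
    // Race the next chunk against a timer. Checking elapsed time only after a
    // chunk arrives would never fire when the stream goes completely silent.
    let timer: ReturnType<typeof setTimeout> | undefined;
    const stalled = new Promise<"stalled">((resolve) => {
      timer = setTimeout(() => resolve("stalled"), chunkTimeoutMs);
    });
    const next = await Promise.race([iterator.next(), stalled]);
    clearTimeout(timer);

    if (next === "stalled") {
      stream.controller.abort(); // Release the underlying HTTP connection
      yield { type: "timeout", data: `Stream stalled: no data for ${chunkTimeoutMs}ms` };
      return;
    }
    if (next.done) return;

    const chunk = next.value;
    const text = chunk.choices[0]?.delta?.content || "";
    if (text) yield { type: "text", data: text };

    const citations = (chunk as any).citations;
    if (citations) yield { type: "citations", data: citations };
  }
}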

// Usage
for await (const event of streamWithTimeout("explain quantum computing", "sonar-pro")) {
  if (event.type === "text") process.stdout.write(event.data);
  if (event.type === "citations") console.log("\nSources:", event.data);
  if (event.type === "timeout") console.error("\nStream timed out");
}
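
Stall detection does not have to be tied to the Perplexity client. A minimal sketch, assuming any `AsyncIterable` as input, is a wrapper that races each `next()` call against a timer. `withChunkTimeout` and the `Guarded` event shape are hypothetical names for this illustration.

```typescript
// Hypothetical helper: wraps any async iterable and reports a stall when no
// item arrives within idleMs. The event shape is an assumption for this sketch.
type Guarded<T> = { type: "item"; data: T } | { type: "timeout"; idleMs: number };

async function* withChunkTimeout<T>(
  source: AsyncIterable<T>,
  idleMs: number
): AsyncGenerator<Guarded<T>> {
  const iterator = source[Symbol.asyncIterator]();

  while (true) {
    let timer: ReturnType<typeof setTimeout> | undefined;
    const stalled = new Promise<"stalled">((resolve) => {
      timer = setTimeout(() => resolve("stalled"), idleMs);
    });
    // Whichever settles first wins: the next item or the idle timer
    const next = await Promise.race([iterator.next(), stalled]);
    clearTimeout(timer);

    if (next === "stalled") {
      // Ask the source to clean up, but do not wait for a hung producer
      void iterator.return?.()?.catch(() => {});
      yield { type: "timeout", idleMs };
      return;
    }
    if (next.done) return;
    yield { type: "item", data: next.value };
  }
}
```

Because the wrapper accepts any async iterable, the stall path can be exercised with a fake generator that stops producing, instead of waiting for a real search to hang.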

Step 4: Cache as Reliability Layer

import { LRUCache } from "lru-cache";
import { createHash } from "crypto";

const reliabilityCache = new LRUCache<string, any>({
  max: 500,
  ttl: 24 * 3600_000, // 24-hour stale cache for reliability
});

async function searchWithCacheFallback(query: string, model = "sonar") {
  const key = createHash("sha256").update(`${model}:${query}`).digest("hex");

  try {
    const response = await resilientSearch(query, model);
    // Update cache on success
    reliabilityCache.set(key, response);
    return { ...response, source: "live" };
  } catch {
    // Serve stale cache as last resort
    const cached = reliabilityCache.get(key);
    if (cached) {
      console.warn("[Reliability] Serving stale cached result");
      return { ...cached, source: "stale-cache" };
    }
    throw new Error("Perplexity unavailable and no cached result");
  }
}
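
When a stale result is served, callers often want to know how old it is so the UI can say so. The sketch below records a timestamp with each entry and returns the age alongside the value. `StampedCache` and `liveOrStale` are hypothetical names, and the plain `Map` is a simplification chosen to keep the example self-contained; it is not a replacement for the `lru-cache` setup above.

```typescript
// Minimal in-memory sketch. StampedCache and its method names are hypothetical.
type Stamped<T> = { value: T; cachedAt: number };

class StampedCache<T> {
  private entries = new Map<string, Stamped<T>>();

  constructor(private maxStaleMs: number, private now: () => number = Date.now) {}

  set(key: string, value: T): void {
    this.entries.set(key, { value, cachedAt: this.now() });
  }

  // Returns the value plus its age, or undefined when missing or too old
  get(key: string): { value: T; ageMs: number } | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    const ageMs = this.now() - entry.cachedAt;
    if (ageMs > this.maxStaleMs) {
      this.entries.delete(key);
      return undefined;
    }
    return { value: entry.value, ageMs };
  }
}

async function liveOrStale<T>(
  key: string,
  fetchLive: () => Promise<T>,
  cache: StampedCache<T>
): Promise<{ value: T; source: "live" | "stale-cache"; ageMs: number }> {
  try {
    const value = await fetchLive();
    cache.set(key, value); // Refresh the cache on every successful live call
    return { value, source: "live", ageMs: 0 };
  } catch (err) {
    const hit = cache.get(key);
    if (hit) return { ...hit, source: "stale-cache" };
    throw err; // Nothing cached: surface the original failure
  }
}
```

Rethrowing the original error when the cache is empty keeps the upstream status code available to callers, which the generic "unavailable" error in the block above discards.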

Step 5: Citation URL Validation

async function validateCitations(
  citations: string[],
  timeoutMs: number = 5000
): Promise<Array<{ url: string; status: number; valid: boolean }>> {
  const results = await Promise.allSettled(
    citations.slice(0, 5).map(async (url) => {
      const controller = new AbortController();
      const timeout = setTimeout(() => controller.abort(), timeoutMs);
      try {
        const response = await fetch(url, {
          method: "HEAD",
          signal: controller.signal,
          redirect: "follow",
        });
        return { url, status: response.status, valid: response.status < 400 };
      } catch {
        return { url, status: 0, valid: false };
      } finally {
        clearTimeout(timeout);
      }
    })
  );

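  // Keep the original URL when a check rejects, so the caller can still tell which link failed
  return results.map((r, i) =>
    r.status === "fulfilled" ? r.value : { url: citations[i], status: 0, valid: false }
  );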
}
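
Some servers reject `HEAD` requests with 405 or 501 even though the page exists, which would mark a working citation as broken. A hedged sketch of one way to handle this is to retry with `GET` in that case. `checkUrl` and `FetchLike` are hypothetical names; `FetchLike` is a narrow stand-in for the global `fetch` (available in Node 18+) so the logic can be run against a fake.

```typescript
// Hypothetical helper. FetchLike is a narrow stand-in for the global fetch so
// the logic can be exercised with a fake; pass `fetch` in real use.
type FetchLike = (
  url: string,
  init: { method: string; signal?: AbortSignal }
) => Promise<{ status: number }>;

async function checkUrl(
  url: string,
  fetchFn: FetchLike,
  timeoutMs: number = 5000
): Promise<{ url: string; status: number; valid: boolean }> {
  const attempt = async (method: "HEAD" | "GET"): Promise<number> => {
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), timeoutMs);
    try {
      const res = await fetchFn(url, { method, signal: controller.signal });
      return res.status;
    } catch {
      return 0; // Network error or timeout
    } finally {
      clearTimeout(timer);
    }
  };

  let status = await attempt("HEAD");
  // 405 and 501 mean the server refuses HEAD itself, not that the page is gone
  if (status === 405 || status === 501) status = await attempt("GET");
  return { url, status, valid: status > 0 && status < 400 };
}
```

The `GET` retry downloads the response body, so it costs more than `HEAD`; whether that trade-off is acceptable depends on how many citations are checked per answer.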

Error Handling

Issue                  | Cause                       | Solution
sonar-pro timeout >15s | Complex multi-source search | Fall back to sonar
Stream stalls          | Search hanging on source    | Per-chunk timeout detection
Broken citation links  | Source pages moved/deleted  | Validate URLs before displaying
All models failing     | Perplexity outage           | Serve stale cache, circuit breaker
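
The first row names a 15-second threshold, but the Step 1 code does not set a time limit of its own, so a slow `sonar-pro` call falls back only if the request itself fails. As a sketch, a generic deadline wrapper can turn a slow call into a rejection that the fallback loop then handles. `withDeadline` and `DeadlineError` are hypothetical names introduced here.

```typescript
// Hypothetical helper: rejects if `work` does not settle within `ms`.
class DeadlineError extends Error {
  constructor(public readonly ms: number) {
    super(`Deadline of ${ms}ms exceeded`);
    this.name = "DeadlineError";
  }
}

async function withDeadline<T>(
  work: (signal: AbortSignal) => Promise<T>,
  ms: number
): Promise<T> {
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;

  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      controller.abort(); // Lets cooperative work cancel its request
      reject(new DeadlineError(ms));
    }, ms);
  });

  try {
    return await Promise.race([work(controller.signal), deadline]);
  } finally {
    clearTimeout(timer); // Avoid a dangling timer once work settles first
  }
}
```

Inside `resilientSearch`, the `create` call could be wrapped as `withDeadline((signal) => perplexity.chat.completions.create({ ... }, { signal }), 15000)`. This assumes the installed `openai` SDK version accepts a `signal` request option, which is worth confirming against its documentation.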

Output

  • Model tier fallback chain
  • Circuit breaker preventing cascade failures
  • Streaming with stall detection
  • Cache as reliability layer (stale > unavailable)
  • Citation URL validation

Next Steps

For policy enforcement, see perplexity-policy-guardrails.

Info
Category Programming & Development
Name perplexity-reliability-patterns
Version v20260423
Size 7.34KB
Last updated 2026-04-28