Perplexity API Anti-Patterns and Pitfalls

v20260423

perplexity-known-pitfalls

A comprehensive guide to identifying and avoiding common integration mistakes and anti-patterns when using the Perplexity Sonar API. It covers critical best practices such as correctly handling citations, setting maximum token limits, utilizing recency filters, and optimizing API calls to ensure cost efficiency and data accuracy. Essential for code review and developer onboarding.

Perplexity Known Pitfalls

Overview

Real gotchas when integrating the Perplexity Sonar API. Perplexity exposes an OpenAI-compatible chat endpoint but performs a live web search for each request, which makes it fundamentally different from a standard LLM completion API. Most of these pitfalls come from treating it like a regular chatbot.

Prerequisites

Perplexity API key configured
Understanding of OpenAI-compatible chat API format

Pitfalls

1. Using It as a Generic Chatbot

Perplexity searches the web per request. Using it for tasks that don't need web search wastes money.

# BAD: general chatbot (wastes a search query)
response = call_perplexity("Write me a haiku about cats")
# Costs $0.005+ for something any LLM can do offline

# GOOD: leverage web search capability
response = call_perplexity(
    "What are the latest Next.js 15 features released this month?",
    search_recency_filter="month"
)
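
The Python snippets in this guide call call_perplexity without defining it. A minimal sketch of such a helper follows, assuming the OpenAI-style /chat/completions path under the base URL named in pitfall 10 and a PERPLEXITY_API_KEY environment variable. The helper names, defaults, and timeout are illustrative and not part of any official SDK.

```python
import json
import os
import urllib.request

# Assumption: OpenAI-style path under the base URL from pitfall 10.
API_URL = "https://api.perplexity.ai/chat/completions"


def build_payload(prompt, model="sonar", max_tokens=1024, search_recency_filter=None):
    """Build the JSON body for one Sonar request (hypothetical helper)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    # Only send the filter when the caller asked for one.
    if search_recency_filter is not None:
        payload["search_recency_filter"] = search_recency_filter
    return payload


def call_perplexity(prompt, **options):
    """POST the payload and return the parsed JSON response as a dict."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(prompt, **options)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['PERPLEXITY_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

This version returns a plain dict rather than an SDK object, so top-level fields can be read from it directly.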

2. Ignoring Citations

Perplexity returns [1], [2] markers in the answer text and the matching source URLs in a separate citations array. Ignoring them discards the main value of the API: answers that can be traced to their sources.

data = response.model_dump()  # or response.json() for raw HTTP
answer = data["choices"][0]["message"]["content"]
citations = data.get("citations", [])  # NOT in choices — top-level field

# BAD: displaying raw markers
print(answer)  # "According to [1], Node.js 22 adds..."

# GOOD: replace markers with links (marker [n] maps to citations[n - 1])
for i, url in enumerate(citations, 1):
    answer = answer.replace(f"[{i}]", f"[{i}]({url})")
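
The loop above assumes every marker has a matching URL. A slightly more defensive sketch does the substitution in one regex pass and leaves any marker without a citation untouched. The function name linkify_citations is hypothetical, and the Markdown link output is just one possible rendering.

```python
import re


def linkify_citations(answer, citations):
    """Turn [n] markers into Markdown links; leave unmatched markers alone."""

    def replace(match):
        index = int(match.group(1))
        if 1 <= index <= len(citations):
            return f"[{index}]({citations[index - 1]})"
        return match.group(0)  # no URL for this marker, keep it as-is

    return re.sub(r"\[(\d+)\]", replace, answer)


text = "According to [1], Node.js 22 adds a feature [3]."
print(linkify_citations(text, ["https://example.com/a"]))
# According to [1](https://example.com/a), Node.js 22 adds a feature [3].
```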

3. Using Wrong SDK Import

There is no @perplexity/sdk or perplexity Python package. Use the standard OpenAI client.

// BAD — this package doesn't exist
import { PerplexityClient } from "@perplexity/sdk";

// GOOD — use OpenAI client with Perplexity base URL
import OpenAI from "openai";
const client = new OpenAI({
  apiKey: process.env.PERPLEXITY_API_KEY,
  baseURL: "https://api.perplexity.ai",
});

4. Not Setting max_tokens

Without max_tokens, a response can run up to the model's output limit, so the output cost of each request is unpredictable.

// BAD: no token limit — output cost can spike
await client.chat.completions.create({
  model: "sonar-pro",  // $15/M output tokens!
  messages: [{ role: "user", content: "Tell me about AI" }],
});

// GOOD: always set max_tokens
await client.chat.completions.create({
  model: "sonar-pro",
  messages: [{ role: "user", content: "Tell me about AI" }],
  max_tokens: 1024,
});
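
Setting max_tokens also makes the worst-case output cost easy to compute. The arithmetic, sketched in Python with the $15/M sonar-pro output rate quoted above (the function name is illustrative; input tokens and any per-request search fees are not included):

```python
def max_output_cost_usd(max_tokens, usd_per_million_output_tokens):
    """Worst-case output-token cost of one request, in US dollars."""
    return max_tokens * usd_per_million_output_tokens / 1_000_000


# sonar-pro at $15 per million output tokens, capped at 1024 tokens
per_request = max_output_cost_usd(1024, 15.0)
print(f"{per_request:.5f}")           # 0.01536
print(f"{per_request * 10_000:.2f}")  # 153.60 for 10,000 requests
```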

5. No Recency Filter for Time-Sensitive Queries

Without search_recency_filter, Perplexity may cite outdated articles.

# BAD: may return articles from any time period
response = call_perplexity("current Bitcoin price")

# GOOD: constrain to recent results
response = call_perplexity(
    "current Bitcoin price",
    search_recency_filter="day"  # hour | day | week | month
)
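
One way to apply this consistently is to choose the filter from the query text. The sketch below is a rough heuristic only: the keyword lists are guesses for illustration, pick_recency_filter is a hypothetical name, and the returned values are limited to the four listed in the comment above.

```python
import re

# Ordered from most to least time-sensitive; keyword lists are illustrative guesses.
RECENCY_HINTS = [
    ("hour", ("right now", "breaking", "live")),
    ("day", ("today", "current", "price")),
    ("week", ("this week", "recent")),
    ("month", ("this month", "latest")),
]


def pick_recency_filter(query):
    """Return a search_recency_filter value, or None if the query looks timeless."""
    lowered = query.lower()
    for value, keywords in RECENCY_HINTS:
        # Whole-word match so "live" does not fire on "alive" or "delivered".
        if any(re.search(rf"\b{re.escape(word)}\b", lowered) for word in keywords):
            return value
    return None
```

Returning None lets the caller omit the parameter entirely for queries that are not time-sensitive.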

6. Sending Full Conversation History

Each message in the conversation may trigger new search queries. Sending 20 turns of history is expensive and slow.

# BAD: 20 turns of history = many search queries
messages = long_history + [{"role": "user", "content": "summarize"}]

# GOOD: summarize context, send focused query
messages = [
    {"role": "system", "content": "Answer based on web search."},
    {"role": "user", "content": f"Context: {summary}\nQuestion: {question}"}
]
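
When a summary is not available, a simpler option is to trim the history to the last few turns. A minimal sketch, assuming messages are OpenAI-style dicts with a role key; trim_history and its default of four messages are illustrative choices:

```python
def trim_history(messages, max_recent=4):
    """Keep system messages plus at most `max_recent` of the latest other messages."""
    system = [m for m in messages if m["role"] == "system"]
    others = [m for m in messages if m["role"] != "system"]
    recent = others[-max_recent:] if max_recent > 0 else []
    # Start the kept window on a user turn so user/assistant roles still alternate.
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]
    return system + recent
```

Dropping a leading assistant message means the result can be one message shorter than max_recent, which keeps the conversation well-formed at the cost of slightly less context.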

7. Using sonar-pro for Simple Queries

sonar-pro costs roughly 3x to 15x more than sonar, depending on the mix of input and output tokens. Using it for simple factual lookups wastes budget.

// BAD: sonar-pro for a trivial question
await client.chat.completions.create({
  model: "sonar-pro",  // $3 input + $15 output per M tokens
  messages: [{ role: "user", content: "What is the capital of France?" }],
});

// GOOD: match model to complexity
const model = isComplexQuery(query) ? "sonar-pro" : "sonar";
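
isComplexQuery is not defined in this guide. One possible heuristic, sketched in Python: treat a query as complex if it is long or contains analytical phrasing. The hint list and the 25-word threshold are arbitrary assumptions that should be tuned against real traffic.

```python
# Illustrative phrases that suggest multi-step or comparative reasoning.
COMPLEX_HINTS = ("compare", "analyze", "trade-off", "step by step", "pros and cons")


def is_complex_query(query, word_threshold=25):
    """Heuristic: long or analytical prompts count as complex."""
    lowered = query.lower()
    return len(lowered.split()) > word_threshold or any(
        hint in lowered for hint in COMPLEX_HINTS
    )


def pick_model(query):
    """Route simple lookups to sonar and everything else to sonar-pro."""
    return "sonar-pro" if is_complex_query(query) else "sonar"
```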

8. Mixing Allowlist and Denylist in Domain Filter

search_domain_filter supports either allowlist (include) or denylist (exclude with - prefix), but not both in the same request.

// BAD: mixing modes
search_domain_filter: ["python.org", "-reddit.com"]  // ERROR

// GOOD: pick one mode
search_domain_filter: ["python.org", "docs.python.org"]  // Allowlist
// OR
search_domain_filter: ["-reddit.com", "-quora.com"]  // Denylist
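
A small guard can catch mixed filters before the request is sent. This Python sketch relies only on the "-" prefix convention described above; validate_domain_filter is a hypothetical helper name.

```python
def validate_domain_filter(domains):
    """Raise ValueError if allowlist and denylist entries are mixed."""
    denied = [d for d in domains if d.startswith("-")]
    # Valid filters are either all denylist entries or none at all.
    if denied and len(denied) != len(domains):
        raise ValueError(
            "search_domain_filter mixes allowlist and denylist entries: "
            f"{domains!r}"
        )
    return domains
```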

9. Not Caching Search Results

Every uncached call performs a web search. At scale, duplicate queries burn budget.

// BAD: same query hits API every time
app.get("/search", async (req, res) => {
  const result = await client.chat.completions.create({ ... });
  res.json(result);
});

// GOOD: cache by query hash
import { createHash } from "node:crypto";
import { LRUCache } from "lru-cache";

const hash = (text) => createHash("sha256").update(text).digest("hex");
const cache = new LRUCache({ max: 1000, ttl: 3600_000 }); // 1,000 entries, 1-hour TTL
app.get("/search", async (req, res) => {
  const key = hash(req.query.q);
  if (cache.has(key)) return res.json(cache.get(key));
  const result = await client.chat.completions.create({ ... });
  cache.set(key, result);
  res.json(result);
});
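
The same idea in Python, as a standard-library sketch. QueryCache is a hypothetical class; it normalizes case and whitespace before hashing so near-identical queries share an entry, and it takes the fetch function as an argument so it does not depend on any particular client. Unlike the LRU cache above, it never evicts entries, so a production version would need a size bound.

```python
import hashlib
import time


class QueryCache:
    """Tiny in-memory TTL cache keyed by a hash of the normalized query."""

    def __init__(self, ttl_seconds=3600, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (expires_at, value)

    @staticmethod
    def key_for(query):
        # Collapse case and whitespace so "Hello  World" and "hello world" match.
        normalized = " ".join(query.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get_or_fetch(self, query, fetch):
        key = self.key_for(query)
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self.clock():
            return entry[1]  # fresh cache hit, no API call
        value = fetch(query)
        self._entries[key] = (self.clock() + self.ttl_seconds, value)
        return value
```

The TTL should follow how quickly the answers go stale: a one-hour TTL suits documentation lookups but not price queries.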

10. Wrong Base URL

The API is at api.perplexity.ai, not api.perplexity.com.

// BAD
baseURL: "https://api.perplexity.com"  // Wrong domain

// GOOD
baseURL: "https://api.perplexity.ai"   // Correct
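
Because a wrong domain typically surfaces only as a connection or authentication error at request time, it can help to check the configured URL at startup. A Python sketch using only the standard library; check_base_url is a hypothetical helper, and the expected host is the one stated above.

```python
from urllib.parse import urlparse

EXPECTED_HOST = "api.perplexity.ai"


def check_base_url(base_url):
    """Return base_url unchanged, or raise if it is not https on the expected host."""
    parsed = urlparse(base_url)
    if parsed.scheme != "https" or parsed.hostname != EXPECTED_HOST:
        raise ValueError(f"unexpected Perplexity base URL: {base_url!r}")
    return base_url
```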

Code Review Checklist

Uses openai package, not fake @perplexity/sdk
Base URL is https://api.perplexity.ai
max_tokens set on every request
Citations parsed from response.citations array
search_recency_filter used for time-sensitive queries
Caching implemented for repeated queries
Model routing: sonar for simple, sonar-pro for complex
Conversation history trimmed before sending
PII sanitized from queries
Domain filter uses only allowlist OR denylist, not both

Error Handling

Pitfall	Impact	Detection
No caching	3-5x cost overrun	Check cache hit rate metric
Wrong model	Budget waste	Grep for `sonar-pro` in simple query paths
No max_tokens	Unpredictable costs	Grep for `create()` calls without `max_tokens`
PII in queries	Privacy violation	Run sanitization check in CI
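
The PII row above does not say what a sanitization check looks like. As a starting point only, the Python sketch below redacts email addresses and phone-like digit runs before a query leaves the application. The patterns are deliberately simple assumptions: they miss names, addresses, and most identifiers, and the phone pattern can also match other long digit sequences.

```python
import re

# Simplified patterns for illustration; real PII detection needs much more than this.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact_pii(query):
    """Replace email addresses and phone-like digit runs with placeholders."""
    query = EMAIL.sub("[EMAIL]", query)
    return PHONE.sub("[PHONE]", query)


print(redact_pii("Contact jane.doe@example.com or +1 (555) 123-4567 about the order"))
# Contact [EMAIL] or [PHONE] about the order
```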

Output

Identified anti-patterns in existing code
Applied fixes for each pitfall
Code review checklist for ongoing quality

Info

Category Development

Size 6.84KB

Source jeremylongshore/claude-code-plugins-plus-skills

Updated At 2026-04-28
