Perplexity Deployment Integration for Cloud Platforms

v20260423

perplexity-deploy-integration

This guide explains how to deploy applications built on the Perplexity Sonar API to Vercel (edge functions), Google Cloud Run, and Docker containers. It covers storing the API key as a platform secret, caching responses in Redis, and exposing a health check endpoint for production use.

Perplexity Deployment Vercel CloudRun Docker API Serverless AI

Perplexity Deploy Integration

Overview

Deploy applications using Perplexity Sonar API to edge and server platforms. Perplexity's OpenAI-compatible endpoint at https://api.perplexity.ai/chat/completions works from any platform that can make HTTPS requests.
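
Because the endpoint only needs HTTPS, the SDK is optional. The following is a minimal sketch of building the request by hand for `fetch()`; the helper name `buildSonarRequest` and the `max_tokens` default are illustrative assumptions, not part of any Perplexity library.

```typescript
// Minimal sketch: build a Sonar chat-completions request for use with fetch().
// The helper name and the max_tokens default are illustrative assumptions.
const SONAR_URL = "https://api.perplexity.ai/chat/completions";

interface SonarRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

function buildSonarRequest(apiKey: string, query: string, model = "sonar"): SonarRequest {
  return {
    url: SONAR_URL,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: query }],
        max_tokens: 2048,
      }),
    },
  };
}

// Usage (any runtime with fetch):
//   const { url, init } = buildSonarRequest(apiKey, "What is Cloud Run?");
//   const data = await (await fetch(url, init)).json();
```

Keeping request construction separate from the network call makes the payload easy to inspect in tests without spending API credits.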

Prerequisites

Perplexity API key stored in PERPLEXITY_API_KEY
Platform CLI installed (vercel, gcloud, or docker)
Application tested locally
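
A missing key otherwise surfaces only as an authentication error on the first request. One way to catch it at startup is a small guard like the sketch below; `requireEnv` is a hypothetical helper, and the environment is passed in as a plain object so the function works in any runtime.

```typescript
// Hypothetical helper: fail fast at startup when a required variable is missing.
type Env = Record<string, string | undefined>;

function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage in a Node.js entry point:
//   const apiKey = requireEnv(process.env, "PERPLEXITY_API_KEY");
```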

Instructions

Step 1: Vercel Edge Function

// api/search.ts
import OpenAI from "openai";

export const config = { runtime: "edge" };

const perplexity = new OpenAI({
  apiKey: process.env.PERPLEXITY_API_KEY!,
  baseURL: "https://api.perplexity.ai",
});

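export default async function handler(req: Request) {
  // Only POST carries a JSON body
  if (req.method !== "POST") {
    return new Response("Method Not Allowed", { status: 405 });
  }

  const { query, model = "sonar", stream = false } = await req.json();

  // Reject missing or empty queries before spending an API call
  if (typeof query !== "string" || query.trim() === "") {
    return Response.json({ error: "query must be a non-empty string" }, { status: 400 });
  }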

  if (stream) {
    const response = await perplexity.chat.completions.create({
      model,
      messages: [{ role: "user", content: query }],
      stream: true,
      max_tokens: 2048,
    });

    // toReadableStream() emits newline-delimited JSON chunks, not SSE frames
    return new Response(response.toReadableStream(), {
      headers: { "Content-Type": "application/x-ndjson" },
    });
  }

  const response = await perplexity.chat.completions.create({
    model,
    messages: [{ role: "user", content: query }],
    max_tokens: 2048,
  });

  return Response.json({
    answer: response.choices[0].message.content,
    citations: (response as any).citations || [],
    model: response.model,
  });
}
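
The handler above forwards `model` straight from the request body, so a caller can select any model the key has access to. A possible guard is sketched below; `parseSearchRequest`, `ALLOWED_MODELS`, and `MAX_QUERY_LENGTH` are illustrative assumptions, and the allow-list only contains the two model names this guide mentions.

```typescript
// Sketch: validate the JSON body before calling the API.
// ALLOWED_MODELS and MAX_QUERY_LENGTH are illustrative assumptions.
const ALLOWED_MODELS = ["sonar", "sonar-pro"];
const MAX_QUERY_LENGTH = 2000;

type ParsedSearch =
  | { ok: true; query: string; model: string; stream: boolean }
  | { ok: false; error: string };

function parseSearchRequest(body: unknown): ParsedSearch {
  if (typeof body !== "object" || body === null) {
    return { ok: false, error: "Body must be a JSON object" };
  }
  const { query, model = "sonar", stream = false } = body as Record<string, unknown>;
  if (typeof query !== "string" || query.trim() === "") {
    return { ok: false, error: "query must be a non-empty string" };
  }
  if (query.length > MAX_QUERY_LENGTH) {
    return { ok: false, error: "query is too long" };
  }
  if (typeof model !== "string" || ALLOWED_MODELS.indexOf(model) === -1) {
    return { ok: false, error: "unsupported model" };
  }
  // Only an explicit boolean true enables streaming
  return { ok: true, query: query.trim(), model, stream: stream === true };
}
```

In the handler, a result with `ok: false` would map to a 400 response carrying `error`.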

set -euo pipefail
# Deploy to Vercel
vercel env add PERPLEXITY_API_KEY production
vercel deploy --prod
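
When `stream` is true, the body returned by `toReadableStream()` is a sequence of newline-separated JSON chunks rather than a single document. The sketch below shows one way a client might join the text pieces once the lines are available; it assumes each chunk follows the OpenAI-style shape with `choices[0].delta.content`, and `collectStreamText` is a hypothetical helper name.

```typescript
// Sketch: join the text deltas from newline-delimited JSON stream chunks.
// Assumes each line is one OpenAI-style chunk with choices[0].delta.content.
interface StreamChunk {
  choices?: { delta?: { content?: string | null } }[];
}

function collectStreamText(ndjson: string): string {
  let text = "";
  for (const rawLine of ndjson.split("\n")) {
    const line = rawLine.trim();
    if (line === "") continue; // skip blank separators
    const chunk = JSON.parse(line) as StreamChunk;
    const piece = chunk.choices?.[0]?.delta?.content;
    if (typeof piece === "string") text += piece;
  }
  return text;
}
```

A real client reading from the network would also need to buffer partial lines, since a network read can end in the middle of a JSON chunk.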

Step 2: Cloud Run with Redis Cache

// server.ts
import express from "express";
import OpenAI from "openai";
import { createClient } from "redis";
import { createHash } from "crypto";

const app = express();
app.use(express.json());

const perplexity = new OpenAI({
  apiKey: process.env.PERPLEXITY_API_KEY!,
  baseURL: "https://api.perplexity.ai",
});

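const redis = createClient({ url: process.env.REDIS_URL });
// Without an error listener, a dropped Redis connection crashes the process
redis.on("error", (err) => console.error("Redis error:", err));
await redis.connect();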

app.post("/api/search", async (req, res) => {
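  const { query, model = "sonar" } = req.body;

  // Reject missing or empty queries before hashing or calling the API
  if (typeof query !== "string" || query.trim() === "") {
    return res.status(400).json({ error: "query must be a non-empty string" });
  }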
  const cacheKey = `pplx:${createHash("sha256").update(`${model}:${query}`).digest("hex")}`;

  // Check cache first
  const cached = await redis.get(cacheKey);
  if (cached) {
    return res.json({ ...JSON.parse(cached), cached: true });
  }

  const response = await perplexity.chat.completions.create({
    model,
    messages: [{ role: "user", content: query }],
    max_tokens: 2048,
  });

  const result = {
    answer: response.choices[0].message.content,
    citations: (response as any).citations || [],
    model: response.model,
    tokens: response.usage?.total_tokens,
  };

  // Cache for 1 hour
  await redis.setEx(cacheKey, 3600, JSON.stringify(result));
  res.json(result);
});

// Cloud Run injects PORT; fall back to 8080 for local runs
app.listen(Number(process.env.PORT) || 8080);
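
For local testing without a Redis instance, the cache can be swapped for an in-memory stand-in. The class below is a sketch under that assumption: `MemoryTtlCache` is a hypothetical name, it mirrors only the `get` and `setEx` argument order used above, it is synchronous where the Redis client is asynchronous, and its contents are not shared across instances.

```typescript
// Sketch: in-memory TTL cache with the same get/setEx argument order as above.
// Intended for local testing only; entries live in a single process.
interface Entry {
  value: string;
  expiresAt: number; // epoch milliseconds
}

class MemoryTtlCache {
  private entries: Record<string, Entry> = Object.create(null);
  private now: () => number;

  // The clock is injectable so expiry can be tested without waiting
  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  get(key: string): string | null {
    const entry = this.entries[key];
    if (!entry) return null;
    if (entry.expiresAt <= this.now()) {
      delete this.entries[key]; // lazily evict expired entries
      return null;
    }
    return entry.value;
  }

  setEx(key: string, ttlSeconds: number, value: string): void {
    this.entries[key] = { value, expiresAt: this.now() + ttlSeconds * 1000 };
  }
}
```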

set -euo pipefail
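# Store the API key in Secret Manager ("-" reads the value from stdin)
printf '%s' "$PERPLEXITY_API_KEY" | gcloud secrets create perplexity-api-key --data-file=-
# Deploy to Cloud Run. server.ts also reads REDIS_URL, which must be supplied
# separately (for example with --set-env-vars or --set-secrets).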
gcloud run deploy perplexity-search \
  --source . \
  --set-secrets=PERPLEXITY_API_KEY=perplexity-api-key:latest \
  --port=8080 \
  --allow-unauthenticated

Step 3: Docker

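FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
# Install all dependencies; the build step needs devDependencies (e.g. TypeScript)
RUN npm ci
COPY . .
RUN npm run build
# Drop devDependencies from the final image
RUN npm prune --omit=dev

ENV NODE_ENV=production
EXPOSE 8080
CMD ["node", "dist/server.js"]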

Step 4: Vercel Configuration

{
  "functions": {
    "api/search.ts": {
      "maxDuration": 30
    }
  }
}

Step 5: Health Check

// Each probe issues a real, billable API request; keep probe intervals long
app.get("/health", async (req, res) => {
  const start = Date.now();
  try {
    await perplexity.chat.completions.create({
      model: "sonar",
      messages: [{ role: "user", content: "ping" }],
      max_tokens: 5,
    });
    res.json({ status: "healthy", latencyMs: Date.now() - start });
  } catch {
    res.status(503).json({ status: "unhealthy", latencyMs: Date.now() - start });
  }
});

Error Handling

Issue	Cause	Solution
Edge function timeout	sonar-pro takes >30s	Use sonar or increase maxDuration
Cache stale for news	TTL too long	Use `search_recency_filter` + shorter TTL
API key invalid after deploy	Wrong secret reference	Verify with `vercel env ls` or `gcloud secrets list`
Stream interrupted	Client disconnect	Handle abort signal gracefully
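
The stale-cache row can be made concrete by tying the TTL to the recency filter. The sketch below is one possible policy: the `Recency` values and every TTL number are illustrative assumptions, and `cachePolicy` is a hypothetical helper. It also folds the filter into the cache key material so filtered and unfiltered answers do not overwrite each other.

```typescript
// Sketch: pick a cache TTL from the recency filter so fresh-news queries
// expire sooner. The TTL values are illustrative assumptions, not recommendations.
type Recency = "hour" | "day" | "week" | "month";

const TTL_SECONDS: Record<Recency, number> = {
  hour: 300,
  day: 900,
  week: 3600,
  month: 3600,
};
const DEFAULT_TTL_SECONDS = 3600; // the 1-hour TTL used in Step 2

function cachePolicy(query: string, model: string, recency?: Recency) {
  const body: Record<string, unknown> = {
    model,
    messages: [{ role: "user", content: query }],
    max_tokens: 2048,
  };
  if (recency) body["search_recency_filter"] = recency;

  const ttlSeconds = recency ? TTL_SECONDS[recency] : DEFAULT_TTL_SECONDS;
  // Hash this string (as in Step 2) to build the Redis key
  const keyMaterial = `${model}:${recency ?? "any"}:${query}`;
  return { body, ttlSeconds, keyMaterial };
}
```

When passing `body` to the OpenAI SDK, a type cast may be needed because the SDK's TypeScript types do not declare `search_recency_filter`.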

Output

Deployed API endpoint serving Perplexity search
Cached responses with configurable TTL
Health check endpoint
Platform-specific secret management

Next Steps

For multi-environment setup, see perplexity-multi-env-setup.

Info

Category Artificial Intelligence

Name perplexity-deploy-integration

Version v20260423

Size 5.32KB

Source jeremylongshore/claude-code-plugins-plus-skills

Updated At 2026-04-28