Skills Engineering Figma API Incident Runbook

Figma API Incident Runbook

v20260423
figma-incident-runbook
A comprehensive runbook for rapid incident response to Figma REST API failures. It guides users through quick triage (checking status, authentication, rate limits), decision-making based on error codes (403, 429, 500), applying immediate mitigation strategies (token rotation, fallback modes), and completing thorough postmortem documentation.
Get Skill
76 downloads
Overview

Figma Incident Runbook

Overview

Rapid incident response procedures for Figma REST API integration failures. Covers triage, mitigation, and postmortem for the most common failure modes.

Prerequisites

  • Access to application logs and metrics
  • Figma PAT for health checks
  • Communication channel (Slack, PagerDuty)

Instructions

Step 1: Quick Triage (First 5 Minutes)

#!/bin/bash
echo "=== Figma Incident Triage ==="

# 1. Is Figma itself down?
echo -n "Figma Status: "
curl -s https://www.figmastatus.com/api/v2/status.json 2>/dev/null \
  | jq -r '.status.description // "Cannot reach status page"'

# 2. Is our token valid?
echo -n "Auth Check: "
HTTP_CODE=$(curl -s -o /dev/null -w "%{http_code}" \
  -H "X-Figma-Token: ${FIGMA_PAT}" \
  https://api.figma.com/v1/me)
echo "$HTTP_CODE"

# 3. Can we read a known file?
echo -n "File Access: "
curl -s -H "X-Figma-Token: ${FIGMA_PAT}" \
  "https://api.figma.com/v1/files/${FIGMA_FILE_KEY}?depth=1" \
  | jq -r '.name // "FAILED"'

# 4. Are we rate limited?
echo "Rate Limit Headers:"
curl -s -D - -o /dev/null \
  -H "X-Figma-Token: ${FIGMA_PAT}" \
  https://api.figma.com/v1/me 2>/dev/null \
  | grep -iE "(retry-after|rate-limit|figma)" || echo "No rate limit headers"

Step 2: Decision Tree

API returning errors?
├── 403 Forbidden
│   ├── Token expired (>90 days) → Rotate PAT immediately
│   ├── Wrong scopes → Regenerate with correct scopes
│   └── File not shared → Check file permissions
│
├── 429 Rate Limited
│   ├── Retry-After < 60s → Wait and retry automatically
│   ├── Retry-After > 300s → Reduce request volume
│   └── X-Figma-Rate-Limit-Type: low → Consider upgrading plan
│
├── 404 Not Found
│   ├── File deleted → Check with file owner
│   ├── Wrong file key → Verify FIGMA_FILE_KEY
│   └── API path wrong → Check endpoint documentation
│
├── 500/503 Server Error
│   ├── status.figma.com shows incident → Wait for resolution
│   ├── Intermittent → Retry with backoff
│   └── Persistent → Contact Figma support
│
└── Network Error (ECONNREFUSED, timeout)
    ├── DNS resolution failing → Check DNS config
    ├── Firewall blocking → Verify outbound HTTPS to api.figma.com
    └── TLS error → Check Node.js version (18+ required)

Step 3: Immediate Mitigation

For 403 (Token Expired):

# Generate new PAT in Figma Settings > Personal access tokens
# Then update your deployment:

# GitHub Actions
gh secret set FIGMA_PAT --body "figd_new-token-here"

# Cloud Run
echo -n "figd_new-token" | gcloud secrets versions add figma-pat --data-file=-
gcloud run services update my-service --update-secrets="FIGMA_PAT=figma-pat:latest"

# Fly.io
fly secrets set FIGMA_PAT=figd_new-token

For 429 (Rate Limited):

// Emergency: disable non-critical Figma calls
const EMERGENCY_MODE = process.env.FIGMA_EMERGENCY === 'true';

async function safeFigmaCall<T>(
  path: string,
  critical: boolean = false
): Promise<T | null> {
  if (EMERGENCY_MODE && !critical) {
    console.warn(`Figma call skipped (emergency mode): ${path}`);
    return null;
  }
  return figmaFetch(path);
}

For 500/503 (Figma Down):

// Serve cached data when Figma is unavailable
async function getTokensWithFallback() {
  try {
    return await extractTokensFromFigma();
  } catch (error) {
    console.warn('Figma unavailable, serving cached tokens');
    // Return last-known-good tokens from cache or file
    const cached = await readFile('output/tokens.json', 'utf-8');
    return JSON.parse(cached);
  }
}

Step 4: Communication

## Internal Notification (Slack)
**Figma Integration Alert**
- Status: INVESTIGATING / MITIGATED / RESOLVED
- Impact: [Design token sync paused / Asset export failing]
- Cause: [403 expired token / 429 rate limit / Figma outage]
- Action: [Rotating token / Reducing request rate / Waiting for Figma]
- ETA: [Next update in 15 min]

## External (if applicable)
Design system updates may be delayed due to a temporary issue
with our Figma integration. Cached data is being served.

Step 5: Postmortem Template

## Figma Incident Postmortem
**Date:** YYYY-MM-DD
**Duration:** X hours Y minutes
**Severity:** P1/P2/P3

### Summary
[One sentence: what happened and what was the impact]

### Timeline
- HH:MM UTC - First alert fired (describe alert)
- HH:MM UTC - On-call acknowledged
- HH:MM UTC - Root cause identified
- HH:MM UTC - Mitigation applied
- HH:MM UTC - Full resolution confirmed

### Root Cause
[Technical explanation, e.g., "PAT expired after 90 days without rotation"]

### Action Items
- [ ] Set up PAT rotation reminder at 80-day mark
- [ ] Add 403 alert to PagerDuty
- [ ] Implement cached fallback for token data

Output

  • Issue identified via triage script
  • Root cause determined from decision tree
  • Mitigation applied (token rotation, fallback mode, etc.)
  • Stakeholders notified
  • Postmortem documented

Error Handling

Issue Cause Solution
Can't reach status.figma.com Network issue Try from different network or mobile
Triage script fails PAT not set Set FIGMA_PAT before running
Fallback data stale Last cache too old Set up regular cache refresh
Alert not firing Missing metrics Verify Prometheus scrape config

Resources

Next Steps

For data handling, see figma-data-handling.

Info
Category Engineering
Name figma-incident-runbook
Version v20260423
Size 6.12KB
Updated At 2026-04-26
Language