Skills Development Clay Incident Response Runbook

Clay Incident Response Runbook

v20260423
clay-incident-runbook
A comprehensive runbook for diagnosing and resolving production incidents involving the Clay SaaS platform. It provides structured triage steps, severity definitions (P1-P4), and resolution guides for issues like credit exhaustion, webhook failures, and data flow outages, ensuring minimal downtime.
Get Skill
408 downloads
Overview

Clay Incident Runbook

Overview

Rapid response procedures for Clay-related production incidents. Clay is a hosted SaaS platform, so incidents fall into two categories: (1) Clay-side issues (platform outage, provider degradation) and (2) your-side issues (webhook misconfiguration, credit exhaustion, handler failures).

Severity Levels

Level Definition Response Time Examples
P1 Complete data flow stopped < 15 min Credits exhausted, webhook URL expired, Clay outage
P2 Degraded enrichment < 1 hour Low hit rates, slow processing, CRM sync errors
P3 Minor impact < 4 hours Single provider down, intermittent webhook failures
P4 No user impact Next business day Monitoring gaps, cost optimization needed

Instructions

Step 1: Quick Triage (2 Minutes)

#!/bin/bash
# clay-triage.sh — rapid diagnostic for Clay incidents
set -euo pipefail

echo "=== Clay Incident Triage ==="
echo "Time: $(date -u +%Y-%m-%dT%H:%M:%SZ)"
echo ""

# 1. Check Clay platform status
echo "--- Clay Platform Status ---"
curl -s -o /dev/null -w "clay.com: HTTP %{http_code}\n" https://www.clay.com
echo ""

# 2. Test webhook delivery
echo "--- Webhook Test ---"
if [ -n "${CLAY_WEBHOOK_URL:-}" ]; then
  WEBHOOK_CODE=$(curl -s -o /dev/null -w "%{http_code}" \
    -X POST "$CLAY_WEBHOOK_URL" \
    -H "Content-Type: application/json" \
    -d '{"_triage": true, "_ts": "'$(date -u +%s)'"}')
  echo "Webhook: HTTP $WEBHOOK_CODE"
  if [ "$WEBHOOK_CODE" = "200" ]; then echo "  -> Webhook OK"; fi
  if [ "$WEBHOOK_CODE" = "404" ]; then echo "  -> ISSUE: Webhook URL invalid/expired"; fi
  if [ "$WEBHOOK_CODE" = "429" ]; then echo "  -> ISSUE: Rate limited"; fi
else
  echo "CLAY_WEBHOOK_URL not set!"
fi
echo ""

# 3. Test Enterprise API (if applicable)
echo "--- Enterprise API Test ---"
if [ -n "${CLAY_API_KEY:-}" ]; then
  API_CODE=$(curl -s -o /dev/null -w "%{http_code}" \
    -X POST "https://api.clay.com/v1/people/enrich" \
    -H "Authorization: Bearer $CLAY_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{"email": "triage@test.com"}')
  echo "Enterprise API: HTTP $API_CODE"
  if [ "$API_CODE" = "401" ]; then echo "  -> ISSUE: API key invalid/expired"; fi
  if [ "$API_CODE" = "403" ]; then echo "  -> ISSUE: Not on Enterprise plan or key revoked"; fi
else
  echo "No Enterprise API key configured"
fi
echo ""

# 4. Check your callback endpoint
echo "--- Callback Endpoint ---"
CALLBACK_URL="${CLAY_CALLBACK_URL:-https://your-app.com/api/health}"
curl -s -o /dev/null -w "Callback: HTTP %{http_code}\n" "$CALLBACK_URL" || echo "Callback endpoint unreachable!"

Step 2: Decision Tree

Enrichment not running?
├── Check Clay UI: any red error cells?
│   ├── YES: Click cells, read error messages → go to Error Resolution
│   └── NO: Continue
├── Is auto-run enabled? (Table Settings > Auto-update)
│   ├── NO: Enable auto-update at table level, then column level
│   └── YES: Continue
├── Do you have credits remaining? (Settings > Plans & Billing)
│   ├── NO: Add credits or connect own provider API keys → P1
│   └── YES: Continue
├── Is the webhook accepting data? (Test with curl)
│   ├── NO: Re-create webhook (50K limit may be hit) → P1
│   └── YES: The issue is likely provider-side
└── Check individual enrichment providers in Clay Settings > Connections
    ├── Provider connection lost → Reconnect API key
    └── Provider rate limited → Wait or switch to different provider

Step 3: Common Incident Resolutions

P1: Credits Exhausted

1. Check: Settings > Plans & Billing > Credit balance
2. Immediate: Connect your own provider API keys (0 credits)
3. Short-term: Add credit pack or upgrade plan
4. Prevent: Set credit burn alerts (see clay-observability)

P1: Webhook URL Expired (50K Limit)

1. Check: Table > + Add > Webhooks — does existing webhook show "limit reached"?
2. Fix: Create new webhook on same table
3. Update: Change CLAY_WEBHOOK_URL in all deployment secrets
4. Verify: Send test payload to new webhook URL

P2: Low Enrichment Hit Rate

1. Check: Sample 10 rows — are input domains valid?
2. Check: Are providers connected? (Settings > Connections)
3. Fix: Pre-filter invalid rows, reconnect providers
4. Monitor: Track hit rate for next hour

P2: CRM Sync Failures

1. Check: HTTP API column errors (click red cells)
2. Common: CRM API key expired → regenerate and update in column config
3. Common: Field mapping changed → update column body JSON
4. Test: Run HTTP API column on single row manually

Step 4: Communication Template

## Clay Incident Notification

**Severity:** P[1/2/3]
**Time detected:** [UTC timestamp]
**Impact:** [What's not working]
**Affected:** [Teams/workflows affected]

**Current Status:** [Investigating / Mitigating / Resolved]

**Actions Taken:**
1. [Action 1]
2. [Action 2]

**Next Update:** [Time]

**Root Cause:** [If known]
**Resolution:** [Steps taken to fix]
**Prevention:** [What we'll do to prevent recurrence]

Step 5: Postmortem Template

Item Details
Incident Date [Date]
Duration [X hours]
Severity P[1/2/3]
Impact [Leads not enriched / CRM not updated / Credits exhausted]
Root Cause [e.g., Webhook hit 50K limit without monitoring]
Detection [How was it discovered? Alert / user report / manual check]
Resolution [Steps taken]
Credits Lost [Estimate of wasted credits, if any]
Prevention [Monitoring gaps to close, safeguards to add]

Error Handling

Issue Cause Solution
Can't access Clay dashboard Browser/network issue Try incognito, different browser, or mobile
Webhook test returns nothing Webhook URL malformed Re-copy full URL from Clay table
All providers returning empty Account-level issue Contact Clay support at community.clay.com
CRM pushing wrong data Column references changed Re-map HTTP API column body fields

Resources

Next Steps

For data handling and compliance, see clay-data-handling.

Info
Category Development
Name clay-incident-runbook
Version v20260423
Size 4.66KB
Updated At 2026-04-28
Language