Langfuse Incident Runbook

v20260311
langfuse-incident-runbook
Guides responders through diagnosing, resolving, and reviewing Langfuse platform incidents, covering quick triage, incident-type lookup tables, resolution steps, post-mortems, escalation contacts, and error handling for outages, auth failures, and missing traces.

Overview

Step-by-step procedures for responding to Langfuse-related incidents, from initial triage through resolution and post-incident review.

Prerequisites

  • Access to Langfuse dashboard
  • Application logs access
  • Metrics/monitoring dashboards
  • Escalation contacts

Instructions

Step 1: Initial Assessment (2 minutes)

Run the quick diagnosis script, which checks four things: the Langfuse status page, API connectivity, authentication, and application metrics.
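The runbook does not include the script itself, so the following is only a minimal sketch of the connectivity and auth checks. It assumes the public API exposes a health endpoint at /api/public/health and a key-scoped endpoint at /api/public/projects that accepts HTTP Basic auth (public key as username, secret key as password); confirm both against your Langfuse version. The function names and the result keys are hypothetical, and the status page and application metrics still need to be checked by hand or with your own tooling.

```python
import base64
import urllib.error
import urllib.request
from typing import Callable, Dict, Optional


def http_status(url: str, headers: Optional[Dict[str, str]] = None, timeout: float = 5.0) -> int:
    """Return the HTTP status of a GET request, or 0 if the host cannot be reached."""
    request = urllib.request.Request(url, headers=headers or {})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered, but with an error status
    except OSError:
        return 0  # DNS failure, refused connection, or timeout


def basic_auth_header(public_key: str, secret_key: str) -> Dict[str, str]:
    """Build a Basic auth header (assumed scheme: public key as user, secret key as password)."""
    token = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
    return {"Authorization": f"Basic {token}"}


def quick_diagnosis(
    host: str,
    public_key: str,
    secret_key: str,
    fetch: Callable[[str, Optional[Dict[str, str]]], int] = http_status,
) -> Dict[str, str]:
    """Check API connectivity and key validity; endpoint paths are assumptions."""
    health = fetch(f"{host}/api/public/health", None)
    auth = fetch(f"{host}/api/public/projects", basic_auth_header(public_key, secret_key))

    if health == 200:
        connectivity = "ok"
    elif health == 0:
        connectivity = "unreachable"
    else:
        connectivity = f"degraded (HTTP {health})"

    if auth == 200:
        auth_result = "ok"
    elif auth in (401, 403):
        auth_result = "rejected"
    else:
        auth_result = f"unknown (HTTP {auth})"

    return {"connectivity": connectivity, "auth": auth_result}
```

Call quick_diagnosis with your host URL and the keys the application is actually using. An "unreachable" connectivity result points toward the outage row in Step 2, while a "rejected" auth result points toward the 401/403 row.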

Step 2: Determine Incident Type

Symptom             | Likely Cause     | Action
--------------------|------------------|---------------------------------------------
No traces appearing | SDK not flushing | Check shutdown handlers, reduce batch size
401/403 errors      | Auth issue       | Verify keys match project, check rotation
High latency        | Rate limits      | Increase batching, implement circuit breaker
Missing data        | Partial failures | Ensure spans end in finally blocks
Complete outage     | Langfuse service | Enable fallback, queue locally
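The "implement circuit breaker" action for high latency can be sketched as follows. This is a generic illustration, not part of the Langfuse SDK: the class name, the failure threshold, and the cool-down period are all assumptions to tune for your workload. The idea is to stop calling the tracing backend after repeated failures, then allow a single trial call once the cool-down has elapsed.

```python
import time
from typing import Any, Callable, Optional


class CircuitBreaker:
    """Skips calls to a failing dependency until a cool-down period has passed."""

    def __init__(
        self,
        failure_threshold: int = 3,
        reset_after_s: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means the circuit is closed

    def allow(self) -> bool:
        """Return True if a call may be attempted right now."""
        if self._opened_at is None:
            return True
        # After the cool-down, let a trial call through (half-open state).
        return self._clock() - self._opened_at >= self.reset_after_s

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()  # open, or re-open after a failed trial

    def call(self, send: Callable[..., Any], *args: Any, **kwargs: Any) -> bool:
        """Run send() if allowed; return True only if it ran without raising."""
        if not self.allow():
            return False
        try:
            send(*args, **kwargs)
        except Exception:
            self.record_failure()
            return False
        self.record_success()
        return True
```

Wrapping the function that ships traces in breaker.call(...) keeps request handlers from blocking on a slow or rate-limited backend. Events skipped while the circuit is open are lost unless they are also queued locally.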

Step 3: Apply Resolution

Follow the resolution steps for the incident type identified in Step 2. For outages, activate graceful degradation mode so the application keeps serving requests while tracing is unavailable.
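One way to picture graceful degradation is a bounded local buffer that holds events while the backend is down and replays them after recovery. The sketch below is an assumption about how such a mode could work, not the runbook's actual implementation; FallbackQueue, its methods, and the buffer size are hypothetical, and send stands in for whatever function ships an event to Langfuse.

```python
from collections import deque
from typing import Any, Callable, Deque


class FallbackQueue:
    """Buffers events in memory when sending fails, and replays them later."""

    def __init__(self, send: Callable[[Any], None], max_events: int = 1000) -> None:
        self._send = send
        # A bounded deque drops the oldest event when full, capping memory use.
        self._buffer: Deque[Any] = deque(maxlen=max_events)

    @property
    def pending(self) -> int:
        return len(self._buffer)

    def emit(self, event: Any) -> bool:
        """Try to send an event; buffer it and return False if sending fails."""
        try:
            self._send(event)
            return True
        except Exception:
            self._buffer.append(event)
            return False

    def replay(self) -> int:
        """Resend buffered events in order; stop at the first failure."""
        sent = 0
        while self._buffer:
            try:
                self._send(self._buffer[0])
            except Exception:
                break  # backend still unhealthy; keep the rest for later
            self._buffer.popleft()
            sent += 1
        return sent
```

The application never sees an exception from emit, which is the point of degrading gracefully. The trade-off of an in-memory buffer is that queued events are lost on process restart; a disk-backed queue avoids that at the cost of more moving parts.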

Step 4: Post-Incident Review

Verify that traces are appearing again, confirm that error rates have returned to normal, and schedule a post-mortem for P1/P2 incidents.
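The three checks in this step can be expressed as a small function that lists what is still outstanding. This is only a sketch: the RecoverySnapshot fields are hypothetical, and treating "normalized" as within 10% of the pre-incident baseline is an assumption to replace with your own threshold.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RecoverySnapshot:
    recent_trace_count: int      # traces seen in the dashboard since the fix
    error_rate: float            # current error rate, e.g. 0.02 for 2%
    baseline_error_rate: float   # error rate before the incident
    severity: str                # "P1" to "P4"


def outstanding_items(snapshot: RecoverySnapshot, tolerance: float = 0.10) -> List[str]:
    """Return the post-incident checklist items that are not yet satisfied."""
    items: List[str] = []
    if snapshot.recent_trace_count == 0:
        items.append("traces not appearing yet")
    if snapshot.error_rate > snapshot.baseline_error_rate * (1 + tolerance):
        items.append("error rate above baseline")
    if snapshot.severity in ("P1", "P2"):
        items.append("schedule post-mortem")
    return items
```

An empty list means the incident can be closed without further follow-up.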

See detailed implementation for advanced patterns.

Output

  • Incident severity classified
  • Root cause identified
  • Resolution applied
  • Post-incident checklist completed

Error Handling

Severity | Description            | Response Time
---------|------------------------|--------------
P1       | Complete outage        | 15 min
P2       | Degraded, partial loss | 1 hour
P3       | Slow/delayed traces    | 4 hours
P4       | Minor issues           | 24 hours
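The severity table can be encoded so that a responder or alerting hook gets a concrete response deadline. The sketch below copies the response times from the table; the classify function and its boolean inputs are hypothetical simplifications of the descriptions.

```python
from datetime import datetime, timedelta, timezone

# Response times taken from the severity table.
RESPONSE_TIMES = {
    "P1": timedelta(minutes=15),
    "P2": timedelta(hours=1),
    "P3": timedelta(hours=4),
    "P4": timedelta(hours=24),
}


def classify(complete_outage: bool, partial_loss: bool, delayed_traces: bool) -> str:
    """Map observed symptoms to a severity, checking the most severe first."""
    if complete_outage:
        return "P1"
    if partial_loss:
        return "P2"
    if delayed_traces:
        return "P3"
    return "P4"


def response_deadline(severity: str, detected_at: datetime) -> datetime:
    """Return the time by which a response is due for the given severity."""
    return detected_at + RESPONSE_TIMES[severity]
```

For example, an incident classified as P2 and detected at 12:00 UTC has a response deadline of 13:00 UTC.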

Examples

Escalation Contacts

Level | Contact            | When
------|--------------------|-------------------------------
L1    | On-call engineer   | All incidents
L2    | Platform team lead | P1/P2 unresolved after 30 min
L3    | Langfuse support   | Service-side issues
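The escalation rules above are simple enough to encode directly, which can help when wiring them into a paging tool. This is an illustrative sketch: the function name and parameters are hypothetical, and it assumes "unresolved 30min" means 30 minutes or more since the incident was opened.

```python
from typing import List


def escalation_levels(severity: str, minutes_unresolved: int, service_side: bool) -> List[str]:
    """Return the escalation levels to notify, following the contacts table."""
    levels = ["L1"]  # the on-call engineer handles all incidents
    if severity in ("P1", "P2") and minutes_unresolved >= 30:
        levels.append("L2")  # platform team lead
    if service_side:
        levels.append("L3")  # Langfuse support
    return levels
```

A P1 that is still open after 45 minutes and traced to the Langfuse service itself would notify all three levels.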
