
Robust API Design Patterns for AI

v20260423
anth-reliability-patterns
This module provides implementations of core reliability patterns (circuit breaker, graceful degradation, idempotency, and timeout management) for integrating with the Claude API. It helps production applications handle transient failures, rate limits, and service outages by failing fast, serving fallbacks, and retrying safely instead of failing completely.

Anthropic Reliability Patterns

Overview

Production reliability patterns for Claude API: circuit breaker (prevent cascading failures), graceful degradation (serve fallbacks), idempotency (safe retries), and timeout management.

Circuit Breaker

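import time
from enum import Enum

class CircuitState(Enum):
    CLOSED = "closed"       # Normal operation
    OPEN = "open"           # Failing, reject requests
    HALF_OPEN = "half_open" # Testing recovery

class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""

class ClaudeCircuitBreaker:
    # Note: not thread-safe; guard call() with a lock if shared across threads
    def __init__(self, failure_threshold: int = 5, recovery_timeout: float = 60):
        self.state = CircuitState.CLOSED
        self.failures = 0  # Consecutive failures since the last success
        self.threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.last_failure_time = 0.0

    def call(self, func, *args, **kwargs):
        if self.state == CircuitState.OPEN:
            # time.monotonic() is unaffected by wall-clock adjustments
            if time.monotonic() - self.last_failure_time > self.recovery_timeout:
                self.state = CircuitState.HALF_OPEN  # Let one trial call through
            else:
                raise CircuitOpenError("Circuit breaker OPEN: Claude API unavailable")

        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            self.last_failure_time = time.monotonic()
            # A failed trial call reopens the circuit immediately
            if self.state == CircuitState.HALF_OPEN or self.failures >= self.threshold:
                self.state = CircuitState.OPEN
            raise

        # Any success closes the circuit and resets the consecutive-failure count,
        # so isolated errors spread over time do not eventually trip the breaker
        self.state = CircuitState.CLOSED
        self.failures = 0
        return result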

# Usage
import anthropic

client = anthropic.Anthropic()  # Reads ANTHROPIC_API_KEY from the environment
breaker = ClaudeCircuitBreaker(failure_threshold=5, recovery_timeout=60)

def safe_claude_call(prompt: str) -> str:
    try:
        return breaker.call(
            client.messages.create,
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}]
        ).content[0].text
    except Exception:
        # Covers both a rejected call (circuit open) and a failed API call
        return "AI assistant is temporarily unavailable."
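
The state transitions above are easier to verify with a deterministic clock than with real sleeps. The following is a minimal sketch for that purpose, not part of the skill itself: `FakeClock`, `TinyBreaker`, and `flaky` are hypothetical names, and the breaker is a condensed restatement of the pattern with an injectable `clock` argument (an assumption added purely for testing).

```python
class FakeClock:
    """Hypothetical manual clock so tests can advance time without sleeping."""
    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

class TinyBreaker:
    """Condensed circuit breaker with an injectable clock (illustrative only)."""
    def __init__(self, threshold, recovery_timeout, clock):
        self.threshold = threshold
        self.recovery_timeout = recovery_timeout
        self.clock = clock
        self.state = "closed"
        self.failures = 0
        self.opened_at = 0.0

    def call(self, func):
        if self.state == "open":
            if self.clock() - self.opened_at <= self.recovery_timeout:
                raise RuntimeError("circuit open")
            self.state = "half_open"  # Recovery window elapsed: allow one trial
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.state == "half_open" or self.failures >= self.threshold:
                self.state = "open"
                self.opened_at = self.clock()
            raise
        self.state = "closed"
        self.failures = 0
        return result

def flaky():
    raise ConnectionError("simulated outage")

demo_clock = FakeClock()
demo_breaker = TinyBreaker(threshold=2, recovery_timeout=30, clock=demo_clock)
states = []

for _ in range(2):  # Two failures reach the threshold and open the circuit
    try:
        demo_breaker.call(flaky)
    except ConnectionError:
        pass
states.append(demo_breaker.state)

try:  # While open, calls are rejected without invoking the function
    demo_breaker.call(lambda: "ok")
except RuntimeError:
    states.append("rejected")

demo_clock.now = 31.0  # Advance past the recovery timeout
states.append(demo_breaker.call(lambda: "ok"))  # Trial call succeeds
states.append(demo_breaker.state)
print(states)
```

Injecting the clock keeps the test fast and removes timing flakiness; the same idea applies to any breaker implementation that reads the time through a replaceable callable.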

Graceful Degradation

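import hashlib

import anthropic

client = anthropic.Anthropic()
cache: dict[str, str] = {}  # Stand-in for a shared cache; use Redis in production

def _cache_key(prompt: str) -> str:
    # hash() is randomized per process for str, so use a stable digest instead
    return "claude:" + hashlib.sha256(prompt.encode()).hexdigest()

def complete_with_fallback(prompt: str) -> str:
    """Try Sonnet → Haiku → cached response → static fallback."""
    # Model IDs change over time; check them against the current model list
    models = ["claude-sonnet-4-20250514", "claude-3-5-haiku-20241022"]

    for model in models:
        try:
            msg = client.messages.create(
                model=model,
                max_tokens=1024,
                messages=[{"role": "user", "content": prompt}]
            )
            text = msg.content[0].text
            cache[_cache_key(prompt)] = text  # Remember the last good answer
            return text
        except anthropic.RateLimitError:
            continue  # Try cheaper model
        except (anthropic.APIStatusError, anthropic.APIConnectionError):
            continue  # Server or network error: try next model

    # All models failed: return cached or static response
    cached = cache.get(_cache_key(prompt))
    if cached:
        return f"[Cached response] {cached}"

    return "Our AI assistant is temporarily unavailable. Please try again in a few minutes."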
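
The fallback above relies on a cache of earlier responses. One possible shape for that cache, as a sketch under stated assumptions: an in-process store keyed by a SHA-256 digest of the prompt, with a maximum age so very old answers are not served. `FallbackCache`, `max_age_s`, and the injectable `clock` are hypothetical names, not part of the Anthropic SDK, and a shared store such as Redis would replace the dict in production.

```python
import hashlib
import time

class FallbackCache:
    """Illustrative in-memory cache for last-known-good responses."""
    def __init__(self, max_age_s=3600.0, clock=time.monotonic):
        self.max_age_s = max_age_s
        self.clock = clock
        self._entries = {}  # key -> (stored_at, text)

    @staticmethod
    def key_for(prompt):
        # A digest is stable across processes, unlike the built-in hash() for str
        return "claude:" + hashlib.sha256(prompt.encode("utf-8")).hexdigest()

    def put(self, prompt, text):
        self._entries[self.key_for(prompt)] = (self.clock(), text)

    def get(self, prompt):
        entry = self._entries.get(self.key_for(prompt))
        if entry is None:
            return None
        stored_at, text = entry
        if self.clock() - stored_at > self.max_age_s:
            return None  # Too stale to serve, even as a fallback
        return text

# Usage with a hand-driven clock so the expiry is visible without waiting
now = [0.0]
fallback_cache = FallbackCache(max_age_s=60.0, clock=lambda: now[0])
fallback_cache.put("What is a circuit breaker?", "A pattern that fails fast.")
fresh = fallback_cache.get("What is a circuit breaker?")
now[0] = 120.0
stale = fallback_cache.get("What is a circuit breaker?")
print(fresh, stale)
```

Whether a stale answer is better than a static message depends on the product; the age limit makes that trade-off an explicit setting.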

Idempotent Requests

import hashlib
import json

import anthropic

class IdempotentClaude:
    def __init__(self):
        self.client = anthropic.Anthropic()
        self.cache = {}  # Use Redis in production

    def create_message(self, idempotency_key: str | None = None, **kwargs) -> str:
        # Generate deterministic key from request params if not provided
        if not idempotency_key:
            idempotency_key = hashlib.sha256(
                json.dumps(kwargs, sort_keys=True, default=str).encode()
            ).hexdigest()

        # Return cached result for duplicate requests
        if idempotency_key in self.cache:
            return self.cache[idempotency_key]

        msg = self.client.messages.create(**kwargs)
        result = msg.content[0].text
        self.cache[idempotency_key] = result
        return result
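
The key derivation above depends on serializing the request parameters canonically. The following sketch isolates that step to show that argument order and dict key order do not affect the key, while any change in a parameter does. `request_key` and the `"example-model"` placeholder are hypothetical names used only for illustration.

```python
import hashlib
import json

def request_key(**kwargs):
    # sort_keys=True orders keys at every nesting level, so logically equal
    # requests serialize to the same string regardless of construction order
    canonical = json.dumps(kwargs, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode()).hexdigest()

key_a = request_key(
    model="example-model",
    max_tokens=64,
    messages=[{"role": "user", "content": "Hi"}],
)
key_b = request_key(
    messages=[{"content": "Hi", "role": "user"}],
    max_tokens=64,
    model="example-model",
)
key_c = request_key(
    model="example-model",
    max_tokens=65,  # One parameter differs
    messages=[{"role": "user", "content": "Hi"}],
)
print(key_a == key_b, key_a == key_c)
```

Because identical parameters always map to the same key, a caller that intentionally wants a second, independent completion for the same prompt should pass an explicit `idempotency_key`.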

Timeout Configuration

import anthropic

# Layer timeouts for defense-in-depth
client = anthropic.Anthropic(
    timeout=60.0,      # SDK-level timeout, applied per attempt (SDK default is 10 minutes)
    max_retries=3,     # Auto-retry on connection errors, 408, 409, 429, and 5xx (SDK default is 2)
)

# Per-request timeout override
msg = client.messages.create(
    model="claude-3-5-haiku-20241022",
    max_tokens=64,
    messages=[{"role": "user", "content": "Quick question"}],
    timeout=10.0  # Override for fast operations
)
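
The SDK timeout bounds each attempt, so retries can stretch the total wait well beyond it. An application-level deadline can cap the whole call. The following is a minimal sketch using only the standard library; `call_with_deadline`, `deadline_s`, and the pool size are hypothetical choices, not part of the Anthropic SDK.

```python
import concurrent.futures
import time

# Small shared pool; the size is an arbitrary choice for this sketch
_executor = concurrent.futures.ThreadPoolExecutor(max_workers=4)

def call_with_deadline(func, deadline_s, *args, **kwargs):
    """Run func in a worker thread and stop waiting after deadline_s seconds."""
    future = _executor.submit(func, *args, **kwargs)
    try:
        return future.result(timeout=deadline_s)
    except concurrent.futures.TimeoutError:
        # cancel() only helps if the task has not started; a running thread
        # cannot be interrupted, so the SDK timeout must still be set
        future.cancel()
        raise TimeoutError(f"call exceeded {deadline_s}s deadline") from None

# Usage: a fast call returns normally, a slow one hits the deadline
fast_result = call_with_deadline(lambda: "done", 1.0)
try:
    call_with_deadline(time.sleep, 0.05, 0.3)
    timed_out = False
except TimeoutError:
    timed_out = True
print(fast_result, timed_out)
```

The caller stops waiting at the deadline, but the worker thread keeps running until the underlying call returns or its own timeout fires, which is why the two layers complement each other.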

Reliability Checklist

  • Circuit breaker prevents cascading failures
  • Graceful degradation serves fallback responses
  • Idempotency keys prevent duplicate processing
  • Timeouts configured at SDK and application level
  • Health check probes API connectivity
  • Retry logic uses exponential backoff (SDK default)
  • Rate limit headers monitored for pre-emptive throttling
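
The last checklist item, monitoring rate limit headers, can be sketched as a small pure function. It assumes the response headers are available as a mapping with lowercase keys; with the Python SDK they can be read from `client.messages.with_raw_response.create(...)`, whose result exposes `.headers`. `throttle_delay`, `min_remaining`, and the fixed one-second pause are hypothetical choices for illustration.

```python
from typing import Mapping

def throttle_delay(headers: Mapping[str, str], min_remaining: int = 5) -> float:
    """Return seconds to pause before the next request (0.0 means go ahead)."""
    # An explicit retry-after from the server takes priority
    retry_after = headers.get("retry-after")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # Non-numeric value: fall through to the remaining-count check

    # Slow down pre-emptively when few requests remain in the current window
    remaining = headers.get("anthropic-ratelimit-requests-remaining")
    if remaining is not None and remaining.isdigit() and int(remaining) < min_remaining:
        return 1.0  # Illustrative fixed pause; tune per workload

    return 0.0

# Usage with plain dicts standing in for real response headers
print(throttle_delay({"retry-after": "2"}))
print(throttle_delay({"anthropic-ratelimit-requests-remaining": "3"}))
print(throttle_delay({"anthropic-ratelimit-requests-remaining": "100"}))
```

Keeping the decision in a function that takes only a mapping makes it easy to unit test and to reuse for the token-based headers as well.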

Next Steps

For policy guardrails, see anth-policy-guardrails.

Info
Category Development
Name anth-reliability-patterns
Version v20260423
Size 5.19KB
Updated At 2026-04-28