Security practices for Langfuse LLM observability integrations. Langfuse captures prompts, completions, and metadata from LLM calls, so data privacy and access control are critical: traced data may contain PII.
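Langfuse uses separate public and secret keys. The public key only identifies the project and can appear in client-side code; the secret key must stay server-side and out of version control.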
import os
# Public key: safe for client-side (identifies project)
LANGFUSE_PUBLIC_KEY = os.environ["LANGFUSE_PUBLIC_KEY"]
# Secret key: NEVER expose (grants API access to the project's trace data)
LANGFUSE_SECRET_KEY = os.environ["LANGFUSE_SECRET_KEY"]
# Host (for self-hosted)
LANGFUSE_HOST = os.environ.get("LANGFUSE_HOST", "https://cloud.langfuse.com")
# Validate on startup (explicit raises still run under `python -O`, unlike assert)
if not LANGFUSE_SECRET_KEY:
    raise RuntimeError("LANGFUSE_SECRET_KEY required")
if LANGFUSE_SECRET_KEY.startswith("pk-"):
    raise RuntimeError("Using public key as secret key!")
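The startup check above can be extended to cover both keys. The following is a minimal sketch, assuming the `pk-` and `sk-` prefixes mentioned in this guide; `validate_langfuse_keys` is a hypothetical helper name, not part of the Langfuse SDK.

```python
def validate_langfuse_keys(public_key: str, secret_key: str) -> None:
    """Raise ValueError if either key is missing or the two look swapped.

    Assumes public keys start with "pk-" and secret keys with "sk-".
    Error messages deliberately never include the key values themselves.
    """
    if not public_key or not secret_key:
        raise ValueError("Both Langfuse keys are required")
    if not public_key.startswith("pk-"):
        raise ValueError("Public key should start with 'pk-'")
    if not secret_key.startswith("sk-"):
        raise ValueError("Secret key should start with 'sk-'")
```

Calling this once at startup, before constructing the client, fails fast on swapped or empty keys without echoing them into logs.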
Langfuse stores everything you send. Scrub PII before tracing.
import re
from langfuse import Langfuse
langfuse = Langfuse()
def scrub_pii(text: str) -> str:
    text = re.sub(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z]{2,}\b', '[EMAIL]', text, flags=re.IGNORECASE)
    text = re.sub(r'\b\d{3}[-.]?\d{3}[-.]?\d{4}\b', '[PHONE]', text)
    text = re.sub(r'\b\d{3}-\d{2}-\d{4}\b', '[SSN]', text)
    return text
def traced_llm_call(prompt: str, user_id: str):
    # trace()/update() as used here match the v2 Python SDK; check your SDK version
    trace = langfuse.trace(
        name="llm-call",
        user_id=user_id,  # OK only if this is an opaque ID, not an email or name
        input=scrub_pii(prompt),  # scrub before tracing
    )
    response = call_llm(prompt)  # send original to LLM; call_llm is your own client wrapper
    trace.update(output=scrub_pii(response))  # scrub output too
    return response
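Traces often carry metadata dictionaries and lists as well as plain strings, and those can hold PII too. The following is a rough sketch of scrubbing a nested payload before it is traced. `scrub_payload` and `scrub_text` are hypothetical helpers, and only the email pattern is applied here to keep the example short.

```python
import re
from typing import Any

# Email pattern only; a real scrubber would apply every pattern it needs.
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")


def scrub_text(text: str) -> str:
    """Replace email addresses in a single string."""
    return EMAIL_RE.sub("[EMAIL]", text)


def scrub_payload(value: Any) -> Any:
    """Recursively scrub strings inside dicts, lists, and tuples.

    Non-string leaves (numbers, booleans, None) pass through unchanged.
    Tuples come back as lists, which is fine for JSON-bound payloads.
    """
    if isinstance(value, str):
        return scrub_text(value)
    if isinstance(value, dict):
        return {key: scrub_payload(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [scrub_payload(item) for item in value]
    return value
```

For example, `scrub_payload({"note": "mail bob@example.com", "tags": ["a@b.io", 3]})` returns `{"note": "mail [EMAIL]", "tags": ["[EMAIL]", 3]}`. Dictionary keys are left untouched, so avoid putting user data in keys.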
# docker-compose.yml for self-hosted Langfuse
services:
  langfuse:
    image: langfuse/langfuse:latest  # pin a specific version in production
    environment:
      - AUTH_DISABLE_SIGNUP=true  # prevent open registration
      - AUTH_DOMAINS_WITH_SSO_ENFORCEMENT=company.com  # require SSO
      - LANGFUSE_DEFAULT_PROJECT_ROLE=VIEWER  # least privilege default
      - ENCRYPTION_KEY=${ENCRYPTION_KEY}  # key for encrypting sensitive data at rest
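The `ENCRYPTION_KEY` value has to come from somewhere. One way to produce it is sketched below, assuming your Langfuse version expects a 256-bit key encoded as 64 hexadecimal characters; confirm the required format in the self-hosting documentation. `generate_encryption_key` is a hypothetical helper name.

```python
import secrets


def generate_encryption_key() -> str:
    """Return 32 cryptographically random bytes as a 64-character hex string.

    Assumption: the deployment expects a 256-bit hex-encoded key.
    """
    return secrets.token_hex(32)


if __name__ == "__main__":
    # Print once, then store the value in a secrets manager, not in the compose file.
    print(generate_encryption_key())
```

Generate the key once and keep it stable across restarts, since data encrypted under one key cannot be read with another.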
Configure automatic cleanup of old traces to limit exposure.
# Self-hosted: set retention via environment (confirm this variable is supported by your Langfuse version)
# LANGFUSE_RETENTION_DAYS=90
# Cloud: use API to delete old traces
from datetime import datetime, timedelta, timezone
def cleanup_old_traces(langfuse: Langfuse, max_age_days: int = 90):
    # Timezone-aware UTC cutoff, so comparisons against API timestamps are unambiguous
    cutoff = datetime.now(timezone.utc) - timedelta(days=max_age_days)
    # Use Langfuse API to list and delete traces older than cutoff
    # Implement based on your Langfuse version's API
    print(f"Cleaning traces older than {cutoff.isoformat()}")
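The selection half of the cleanup can be written and tested without touching the Langfuse API. The sketch below assumes each trace has already been fetched into a plain dict with an `id` string and a timezone-aware `timestamp`. That record shape and the name `select_expired_trace_ids` are illustrative assumptions, and the actual list and delete calls are left to your SDK version.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Optional


def select_expired_trace_ids(
    traces: Iterable[dict],
    max_age_days: int = 90,
    now: Optional[datetime] = None,
) -> List[str]:
    """Return IDs of traces whose timestamp is older than the retention window.

    Each trace is assumed to be {"id": str, "timestamp": aware datetime}.
    `now` is injectable so the function can be tested deterministically.
    """
    current = now or datetime.now(timezone.utc)
    cutoff = current - timedelta(days=max_age_days)
    return [trace["id"] for trace in traces if trace["timestamp"] < cutoff]
```

Keeping selection separate from deletion makes it easy to do a dry run that logs the IDs before anything is removed.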
| Issue | Cause | Solution |
|---|---|---|
| PII in traces | Not scrubbing before trace | Apply PII scrubbing to inputs/outputs |
| Secret key leaked | Wrong key in client code | Validate key prefix (sk- vs pk-) |
| Unauthorized access | No SSO enforcement | Enable AUTH_DOMAINS_WITH_SSO_ENFORCEMENT |
| Data accumulation | No retention policy | Set LANGFUSE_RETENTION_DAYS |
import os
from langfuse import Langfuse
langfuse = Langfuse(
    public_key=os.environ["LANGFUSE_PUBLIC_KEY"],
    secret_key=os.environ["LANGFUSE_SECRET_KEY"],
    host=os.environ.get("LANGFUSE_HOST", "https://cloud.langfuse.com"),
)
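Secret keys also tend to leak through log lines and error messages. As one possible safeguard, the sketch below redacts anything that looks like a secret key before a message is logged. It assumes secret keys start with `sk-` followed by letters, digits, or hyphens. `redact_secret_keys` is a hypothetical helper, not a Langfuse feature.

```python
import re

# Word boundary keeps ordinary words such as "task-force" from matching.
SECRET_KEY_RE = re.compile(r"\bsk-[A-Za-z0-9-]+")


def redact_secret_keys(message: str) -> str:
    """Replace anything shaped like a secret key with a fixed placeholder."""
    return SECRET_KEY_RE.sub("sk-[REDACTED]", message)
```

For example, `redact_secret_keys("auth failed for sk-lf-1234abcd on host")` returns `"auth failed for sk-[REDACTED] on host"`. Treat this as a backstop only; a key that has reached a log should still be rotated.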