Production-ready architecture patterns for Mistral AI integrations. Covers layered project structure, client singleton, configuration management, error handling, service layer with caching, health checks, and prompt templates.
API Layer (Routes, Controllers, Middleware)
↓
Service Layer (Business Logic, Orchestration)
↓
Mistral Layer (Client, Prompts, Error Handling)
↓
Infrastructure Layer (Cache, Queue, Monitoring)
Organize the project into src/mistral/ (client, config, types, errors, prompts, handlers), src/services/ai/ (chat, rag, cache), src/api/ (routes, streaming), tests/ (unit, integration), and config/ (per-environment JSON).
Create a getMistralClient() factory that lazily initializes a single Mistral instance, configured from the environment. Include a resetMistralClient() function so tests can discard the cached instance.
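A minimal sketch of the lazy singleton, under stated assumptions: `MistralClientStub` is a hypothetical stand-in so the example runs without the SDK (a real project would construct the SDK's `Mistral` class there), and the `MISTRAL_API_KEY` variable name and the optional `env` parameter are illustrative choices, not something the source specifies.

```typescript
// Hypothetical stand-in for the SDK client so this sketch has no dependencies.
// In a real project this would be the Mistral class from the official SDK.
class MistralClientStub {
  options: { apiKey: string };
  constructor(options: { apiKey: string }) {
    this.options = options;
  }
}

type Env = Record<string, string | undefined>;

// Read process.env without requiring Node type definitions in this sketch.
function defaultEnv(): Env {
  const g = globalThis as unknown as { process?: { env?: Env } };
  return g.process?.env ?? {};
}

let instance: MistralClientStub | null = null;

// Create the client on first use, then keep returning the same instance.
function getMistralClient(env: Env = defaultEnv()): MistralClientStub {
  if (instance === null) {
    const apiKey = env.MISTRAL_API_KEY; // assumed variable name
    if (!apiKey) {
      throw new Error('MISTRAL_API_KEY is not set');
    }
    instance = new MistralClientStub({ apiKey });
  }
  return instance;
}

// Drop the cached instance so each test can start from a clean state.
function resetMistralClient(): void {
  instance = null;
}
```

Passing `env` explicitly keeps tests independent of the real process environment. Note that once an instance exists, later calls ignore their `env` argument until `resetMistralClient()` is called.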
Define the config schema with Zod's z.object(), covering apiKey, model (default mistral-small-latest), timeout (default 30s), maxRetries (default 3), and cache settings. Parse the values from environment variables.
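The shape of that configuration can be sketched as follows. To stay dependency-free, this example swaps the z.object() schema for hand-written validation that mirrors the same fields and defaults; in a real project the Zod schema would replace `loadMistralConfig`. The environment variable names, the cache fields, and the 300-second TTL are assumptions made for illustration.

```typescript
interface MistralConfig {
  apiKey: string;
  model: string;
  timeoutMs: number;
  maxRetries: number;
  cache: { enabled: boolean; ttlSeconds: number };
}

type Env = Record<string, string | undefined>;

// Parse a non-negative integer, using the fallback when the variable is unset.
function intFromEnv(env: Env, name: string, fallback: number, errors: string[]): number {
  const raw = env[name];
  if (raw === undefined || raw === '') return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value < 0) {
    errors.push(`${name} must be a non-negative integer, got "${raw}"`);
    return fallback;
  }
  return value;
}

// All variable names below are assumed, not taken from the source.
function loadMistralConfig(env: Env): MistralConfig {
  const errors: string[] = [];
  const apiKey = env.MISTRAL_API_KEY ?? '';
  if (apiKey === '') errors.push('MISTRAL_API_KEY is required');

  const config: MistralConfig = {
    apiKey,
    model: env.MISTRAL_MODEL || 'mistral-small-latest',
    timeoutMs: intFromEnv(env, 'MISTRAL_TIMEOUT_MS', 30000, errors),
    maxRetries: intFromEnv(env, 'MISTRAL_MAX_RETRIES', 3, errors),
    cache: {
      enabled: env.MISTRAL_CACHE_ENABLED !== 'false',
      ttlSeconds: intFromEnv(env, 'MISTRAL_CACHE_TTL_SECONDS', 300, errors),
    },
  };

  // Report every problem at once, the way a schema validator would.
  if (errors.length > 0) {
    throw new Error(`Invalid Mistral config: ${errors.join('; ')}`);
  }
  return config;
}
```

Collecting all errors before throwing means a misconfigured deployment reports every bad variable in one pass instead of one per restart.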
Create a MistralServiceError class with a code, a status, and a retryable flag. Wrap raw errors by HTTP status: 429 maps to RATE_LIMIT (retryable), 401 to AUTH_ERROR (not retryable), and 500 or above to SERVICE_ERROR (retryable).
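One way the error class and the status mapping might look. Two things here are assumptions rather than part of the source: the extra `UNKNOWN` code for statuses outside the three listed, and the guess that a raw error exposes its HTTP status as a numeric `statusCode` or `status` property.

```typescript
type MistralErrorCode = 'RATE_LIMIT' | 'AUTH_ERROR' | 'SERVICE_ERROR' | 'UNKNOWN';

class MistralServiceError extends Error {
  code: MistralErrorCode;
  status: number | undefined;
  retryable: boolean;

  constructor(
    message: string,
    code: MistralErrorCode,
    status: number | undefined,
    retryable: boolean,
  ) {
    super(message);
    this.name = 'MistralServiceError';
    this.code = code;
    this.status = status;
    this.retryable = retryable;
  }
}

// Assumption: the raw error carries its HTTP status as `statusCode` or `status`.
function extractStatus(err: unknown): number | undefined {
  if (typeof err !== 'object' || err === null) return undefined;
  const candidate = err as { statusCode?: unknown; status?: unknown };
  if (typeof candidate.statusCode === 'number') return candidate.statusCode;
  if (typeof candidate.status === 'number') return candidate.status;
  return undefined;
}

// Map a raw error onto the codes described above.
function wrapMistralError(err: unknown): MistralServiceError {
  if (err instanceof MistralServiceError) return err;
  const status = extractStatus(err);
  const message = err instanceof Error ? err.message : String(err);
  if (status === 429) return new MistralServiceError(message, 'RATE_LIMIT', status, true);
  if (status === 401) return new MistralServiceError(message, 'AUTH_ERROR', status, false);
  if (status !== undefined && status >= 500) {
    return new MistralServiceError(message, 'SERVICE_ERROR', status, true);
  }
  // Anything else is treated as non-retryable (an assumption of this sketch).
  return new MistralServiceError(message, 'UNKNOWN', status, false);
}
```

Callers can then branch on `retryable` alone instead of re-inspecting status codes at every call site.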
Build a ChatService with complete() and stream() methods. Check the cache first for deterministic requests (temperature=0). Wrap all calls with retry and error handling.
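The caching rule can be sketched in isolation. This example covers only complete(); stream(), retry, and error wrapping are left out to keep the focus on the cache check. The `ChatClient` interface is a hypothetical structural view of the client (just the one call used here), and the unbounded in-memory `Map` with a JSON-string key is an assumption; a production cache would likely need a TTL and eviction.

```typescript
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface ChatRequest {
  model: string;
  messages: ChatMessage[];
  temperature?: number;
}

// Hypothetical structural view of the client: only the call this sketch needs.
interface ChatClient<TResponse> {
  chat: { complete(request: ChatRequest): Promise<TResponse> };
}

class ChatService<TResponse> {
  private client: ChatClient<TResponse>;
  private cache = new Map<string, TResponse>();

  constructor(client: ChatClient<TResponse>) {
    this.client = client;
  }

  async complete(request: ChatRequest): Promise<TResponse> {
    // Only temperature=0 requests are treated as deterministic and cacheable.
    const cacheable = request.temperature === 0;
    const key = JSON.stringify([request.model, request.messages]);

    if (cacheable) {
      const cached = this.cache.get(key);
      if (cached !== undefined) return cached;
    }

    const response = await this.client.chat.complete(request);
    if (cacheable) this.cache.set(key, response);
    return response;
  }
}
```

Because the client is injected, the service can be unit-tested with a fake that counts calls, without touching the network.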
Implement a checkMistralHealth() function that calls client.models.list() and returns a status (healthy, degraded, or unhealthy) along with the measured latency.
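A sketch of such a health check follows. The source does not say what separates healthy from degraded, so the 2000 ms threshold is an assumption, as are the `ModelsClient` interface (a structural view exposing only models.list()), the `HealthResult` shape, and the injectable `now` clock added for testability.

```typescript
type HealthStatus = 'healthy' | 'degraded' | 'unhealthy';

interface HealthResult {
  status: HealthStatus;
  latencyMs: number;
  error?: string;
}

// Hypothetical structural view of the client: only models.list() is needed.
interface ModelsClient {
  models: { list(): Promise<unknown> };
}

// Assumption: a successful but slow response counts as "degraded".
const DEGRADED_THRESHOLD_MS = 2000;

async function checkMistralHealth(
  client: ModelsClient,
  now: () => number = Date.now,
): Promise<HealthResult> {
  const start = now();
  try {
    await client.models.list();
    const latencyMs = now() - start;
    const status: HealthStatus = latencyMs > DEGRADED_THRESHOLD_MS ? 'degraded' : 'healthy';
    return { status, latencyMs };
  } catch (err) {
    // Any failure of the probe call is reported as unhealthy, never thrown.
    return {
      status: 'unhealthy',
      latencyMs: now() - start,
      error: err instanceof Error ? err.message : String(err),
    };
  }
}
```

Returning a result object instead of throwing lets a health endpoint serialize the outcome directly, whatever the status.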
Define a reusable PromptTemplate interface with a system prompt and a user template function. Include templates for summarize, classify, and codeReview.
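The interface and the three templates could look roughly like this. The prompt wording, the input field names, and the `renderPrompt` helper are all illustrative assumptions; only the template names and the system-plus-user-function shape come from the description above.

```typescript
interface PromptTemplate<TInput> {
  system: string;
  user: (input: TInput) => string;
}

interface PromptMessage {
  role: 'system' | 'user';
  content: string;
}

// Prompt texts below are placeholders, not recommended wording.
const summarize: PromptTemplate<{ text: string; maxSentences: number }> = {
  system: 'You are a concise technical summarizer.',
  user: ({ text, maxSentences }) =>
    `Summarize the following text in at most ${maxSentences} sentences:\n\n${text}`,
};

const classify: PromptTemplate<{ text: string; labels: string[] }> = {
  system: 'You are a classifier. Reply with exactly one label from the list.',
  user: ({ text, labels }) => `Labels: ${labels.join(', ')}\n\nText: ${text}`,
};

const codeReview: PromptTemplate<{ code: string; language: string }> = {
  system: 'You are a careful code reviewer. List concrete issues and suggested fixes.',
  user: ({ code, language }) => `Review this ${language} code:\n\n${code}`,
};

// Hypothetical helper: turn a template plus input into chat messages.
function renderPrompt<TInput>(template: PromptTemplate<TInput>, input: TInput): PromptMessage[] {
  return [
    { role: 'system', content: template.system },
    { role: 'user', content: template.user(input) },
  ];
}
```

Making the template generic over its input type means a call such as `renderPrompt(classify, { text })` fails at compile time when `labels` is missing.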
| Issue | Cause | Resolution |
|---|---|---|
| Rate limit (429) | Too many requests | Retry with backoff, marked retryable |
| Auth error (401) | Invalid API key | Check credentials, not retryable |
| Server error (5xx) | Mistral service issue | Retry with backoff, marked retryable |
| Config validation | Missing/invalid env vars | Check Zod schema errors |
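The "retry with backoff" resolutions in the table can be sketched as a small helper. The source does not specify a backoff schedule, so the exponential doubling, the `RetryOptions` shape, and the injectable `sleep` are assumptions; `isRetryable` is where a retryable flag like the one described above would be consulted.

```typescript
interface RetryOptions {
  maxRetries: number; // retries after the first attempt
  baseDelayMs: number; // delay before the first retry
  isRetryable: (err: unknown) => boolean;
  sleep?: (ms: number) => Promise<void>; // injectable for tests
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => {
    setTimeout(resolve, ms);
  });

async function withRetry<T>(operation: () => Promise<T>, options: RetryOptions): Promise<T> {
  const sleep = options.sleep ?? defaultSleep;
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await operation();
    } catch (err) {
      // Give up on non-retryable errors or once the retry budget is spent.
      if (!options.isRetryable(err) || attempt >= options.maxRetries) throw err;
      // Exponential backoff: baseDelayMs, then 2x, 4x, and so on.
      await sleep(options.baseDelayMs * 2 ** attempt);
    }
  }
}
```

With `maxRetries: 3` and `baseDelayMs: 100`, a persistently failing retryable call is attempted four times, waiting 100, 200, and 400 ms between attempts. Adding random jitter to the delay is a common refinement that this sketch omits.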
```typescript
const client = getMistralClient();
const response = await client.chat.complete({
  model: 'mistral-small-latest',
  messages: [{ role: 'user', content: 'Hello' }],
});
```
See detailed implementation for advanced patterns.