groq-reference-architecture
jeremylongshore/claude-code-plugins-plus-skills
This guide details a production-ready reference architecture for applications built on Groq's LPU inference API. It covers routing requests to models based on latency, quality, or cost requirements, implementing streaming response pipelines, and establishing robust multi-provider fallback chains (e.g., Groq as primary with OpenAI as backup). Ideal for designing and reviewing complex, resilient LLM systems.
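The fallback-chain pattern mentioned above can be sketched as an ordered list of providers tried in sequence until one succeeds. This is a minimal illustration, not the plugin's actual implementation: the provider names and the `ProviderError` exception are placeholders, and a real system would wrap the Groq and OpenAI SDK clients (with timeouts and retry budgets) where the stub functions appear.

```python
class ProviderError(Exception):
    """Raised when a provider call fails and the chain should fall through."""

def call_with_fallback(providers, prompt):
    """Try each (name, call_fn) pair in order; return the first success.

    `providers` is an ordered list, e.g. Groq primary, OpenAI backup.
    Each call_fn takes a prompt and returns a reply string, raising
    ProviderError on failure (timeout, rate limit, outage).
    """
    errors = []
    for name, call_fn in providers:
        try:
            return name, call_fn(prompt)
        except ProviderError as exc:
            errors.append((name, str(exc)))
    # Every provider in the chain failed; surface the collected errors.
    raise RuntimeError(f"all providers failed: {errors}")

# Simulated providers for illustration only (hypothetical names).
def flaky_groq(prompt):
    raise ProviderError("rate limited")

def openai_backup(prompt):
    return f"echo: {prompt}"

provider, reply = call_with_fallback(
    [("groq", flaky_groq), ("openai", openai_backup)], "hello"
)
```

Here the simulated primary fails, so the chain falls through to the backup; in production the same loop would also feed routing decisions (e.g., ordering providers by measured latency or cost).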