together-reference-architecture
jeremylongshore/claude-code-plugins-plus-skills
A reference architecture for building scalable, cost-effective AI services on Together AI's OpenAI-compatible API. It covers the full service lifecycle: model routing that trades quality against cost per request, response caching in Redis, asynchronous batch processing, and fine-tuning pipeline management. It is suited to enterprise applications that need high availability across multiple open-source models.
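
To make the routing and caching ideas concrete, here is a minimal sketch rather than the skill's actual implementation. Everything named in it is an assumption for illustration: the model identifiers in `MODEL_TIERS` are placeholders, the routing heuristic (prompt length plus a `needs_reasoning` flag) is invented, a plain dict stands in for Redis, and `backend` is any callable that performs the real chat completion request.

```python
import hashlib
import json
from typing import Callable, Dict, List, MutableMapping

Message = Dict[str, str]

# Placeholder model identifiers (assumption): substitute real model names.
MODEL_TIERS = {
    "cheap": "example-org/small-instruct",
    "quality": "example-org/large-instruct",
}


def route_model(prompt: str, needs_reasoning: bool = False,
                long_prompt_chars: int = 2000) -> str:
    """Pick the larger model only when the request looks demanding.

    The heuristic is hypothetical: long prompts or an explicit
    needs_reasoning flag go to the quality tier, all else to the cheap tier.
    """
    if needs_reasoning or len(prompt) > long_prompt_chars:
        return MODEL_TIERS["quality"]
    return MODEL_TIERS["cheap"]


def cache_key(model: str, messages: List[Message]) -> str:
    """Build a deterministic key from the model name and the messages."""
    payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
    return "chat:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_completion(
    model: str,
    messages: List[Message],
    backend: Callable[[str, List[Message]], str],
    cache: MutableMapping[str, str],
) -> str:
    """Return a cached response if present, otherwise call the backend once."""
    key = cache_key(model, messages)
    hit = cache.get(key)
    if hit is not None:
        return hit
    result = backend(model, messages)
    cache[key] = result  # a Redis-backed cache would also set an expiry here
    return result
```

In use, a caller would pick a model with `route_model(prompt)` and pass it to `cached_completion` along with a backend function and a cache. Keeping the backend and cache as parameters means the same logic can be exercised with an in-memory dict in tests and a shared store in production, and identical requests reach the API only once.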