anth-performance-tuning
jeremylongshore/claude-code-plugins-plus-skills
This guide details advanced strategies for optimizing Claude API usage to reduce latency and cost. Techniques covered include prompt caching (significant cost reduction), strategic model selection (Haiku, Sonnet, Opus) based on task complexity, implementing streaming for perceived speed, and managing token limits. Use this guide when building high-performance, cost-efficient LLM applications that interact with large language models.