Manage access to Groq's ultra-fast LPU inference API through API key scoping and organization-level controls. Groq's per-token pricing is extremely low (orders of magnitude cheaper than GPU-based providers), but its speed makes runaway usage easy.
set -euo pipefail
# Key for the chatbot team (high RPM, small model)
curl -X POST https://api.groq.com/openai/v1/api-keys \
  -H "Authorization: Bearer $GROQ_ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "chatbot-prod",
    "allowed_models": ["llama-3.3-70b-versatile", "llama-3.1-8b-instant"],
    "requests_per_minute": 500,
    "tokens_per_minute": 100000
  }'
# Key for batch processing (lower RPM but higher token limit)
curl -X POST https://api.groq.com/openai/v1/api-keys \
  -H "Authorization: Bearer $GROQ_ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "batch-processor",
    "allowed_models": ["llama-3.1-8b-instant"],
    "requests_per_minute": 60,
    "tokens_per_minute": 500000
  }'
// groq-gateway.ts - Enforce model restrictions before forwarding
const TEAM_MODEL_ACCESS: Record<string, string[]> = {
  chatbot: ['llama-3.3-70b-versatile', 'llama-3.1-8b-instant'],
  analytics: ['llama-3.1-8b-instant'], // Cheapest model only
  research: ['llama-3.3-70b-versatile', 'mixtral-8x7b-32768', 'gemma2-9b-it'],
};

// Returns true only when the team exists and the model is on its allow-list.
function validateModelAccess(team: string, model: string): boolean {
  return TEAM_MODEL_ACCESS[team]?.includes(model) ?? false;
}
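A gateway usually needs more than a boolean: it has to tell the caller why a request was refused. The following is a minimal sketch of that step, not part of any Groq SDK. The `GatewayDecision` type and the `authorizeModel` function are hypothetical names introduced here, and the access map is passed in as a parameter so the check can be exercised in isolation.

```typescript
// Hypothetical decision type for the gateway; the names are illustrative only.
type GatewayDecision =
  | { allowed: true }
  | { allowed: false; status: 403; reason: string };

// Decide whether `team` may call `model`, given a team-to-models allow-list.
function authorizeModel(
  access: Record<string, string[]>,
  team: string,
  model: string,
): GatewayDecision {
  // Use an own-property check so inherited keys such as "constructor"
  // are never mistaken for a configured team.
  const allowedModels = Object.prototype.hasOwnProperty.call(access, team)
    ? access[team]
    : undefined;

  if (allowedModels === undefined) {
    return { allowed: false, status: 403, reason: `unknown team: ${team}` };
  }
  if (!allowedModels.includes(model)) {
    return {
      allowed: false,
      status: 403,
      reason: `model ${model} is not permitted for team ${team}`,
    };
  }
  return { allowed: true };
}

// Example: the analytics team asks for a model outside its allow-list.
const decision = authorizeModel(
  { analytics: ['llama-3.1-8b-instant'] },
  'analytics',
  'llama-3.3-70b-versatile',
);
console.log(decision);
```

Rejecting at the gateway keeps a disallowed request from ever reaching the upstream API, so it never counts against the key's rate limits.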
In the Groq Console > Organization > Billing:
set -euo pipefail
# Check usage across all API keys
curl https://api.groq.com/openai/v1/usage \
-H "Authorization: Bearer $GROQ_ADMIN_KEY" | \
jq '.usage_by_key[] | {key_name, requests_today, tokens_today, estimated_cost_usd}'
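The same usage data can drive a simple alert. The sketch below assumes the response contains a `usage_by_key` array with the four fields selected by the `jq` filter above; that shape is taken from the example, not from verified API documentation. `KeyUsage`, `keysOverBudget`, and the sample figures are hypothetical.

```typescript
// Assumed shape of one entry in `usage_by_key`, mirroring the jq filter above.
interface KeyUsage {
  key_name: string;
  requests_today: number;
  tokens_today: number;
  estimated_cost_usd: number;
}

// Return the names of keys whose estimated spend today exceeds the budget.
function keysOverBudget(usage: KeyUsage[], dailyBudgetUsd: number): string[] {
  return usage
    .filter((entry) => entry.estimated_cost_usd > dailyBudgetUsd)
    .map((entry) => entry.key_name);
}

// Made-up sample values, for illustration only.
const flagged = keysOverBudget(
  [
    { key_name: 'chatbot-prod', requests_today: 42000, tokens_today: 9000000, estimated_cost_usd: 7.5 },
    { key_name: 'batch-processor', requests_today: 800, tokens_today: 2500000, estimated_cost_usd: 1.2 },
  ],
  5,
);
console.log(flagged);
```

Running a check like this on a schedule catches a runaway key well before a monthly cap would.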
set -euo pipefail
# 1. Create replacement key with same permissions
# 2. Deploy new key to services
# 3. Monitor for 24h to confirm no traffic on old key
# 4. Delete old key
curl -X DELETE "https://api.groq.com/openai/v1/api-keys/OLD_KEY_ID" \
-H "Authorization: Bearer $GROQ_ADMIN_KEY"
| Issue | Cause | Solution |
|---|---|---|
| `429 rate_limit_exceeded` | RPM or TPM cap hit | Groq rate limits are strict; add exponential backoff |
| `401 invalid_api_key` | Key deleted or expired | Generate new key in Groq Console |
| `model_not_available` | Model not in key's allowed list | Create key with broader model access |
| Spending cap paused API | Monthly budget reached | Increase cap or wait for billing cycle |
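The exponential backoff suggested for `429 rate_limit_exceeded` can be sketched as follows. This is one common approach (capped exponential delay with full jitter), not a Groq-prescribed policy. The names `backoffDelayMs` and `withBackoff`, the default values, and the assumption that a failed call throws an object with a numeric `status` field are all hypothetical.

```typescript
// Upper bound on the delay before retry number `attempt` (0-based):
// baseMs * 2^attempt, never more than capMs.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 30000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Retry `call` while it fails with status 429, up to `maxRetries` retries.
// Assumption: failures are thrown as objects carrying a numeric `status`.
async function withBackoff<T>(
  call: () => Promise<T>,
  maxRetries = 5,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      const status = (err as { status?: number } | null)?.status;
      // Rethrow anything that is not a rate limit, or once retries run out.
      if (status !== 429 || attempt >= maxRetries) throw err;
      // Full jitter: wait a random time between 0 and the capped bound.
      await sleep(Math.random() * backoffDelayMs(attempt));
    }
  }
}
```

The jitter spreads retries from many clients over time, so they do not all hit the rate limiter again at the same instant.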
Basic usage: Apply Groq enterprise RBAC to a standard project setup with the default configuration options.
Advanced scenario: Customize Groq enterprise RBAC for production environments with multiple constraints and team-specific requirements.