Maximize Kubernetes cost savings through CAST AI: spot instance strategies, workload right-sizing, cluster hibernation, and savings tracking. Typical savings: 50-70% on cloud compute costs.
# Get savings breakdown
curl -s -H "X-API-Key: ${CASTAI_API_KEY}" \
"https://api.cast.ai/v1/kubernetes/clusters/${CASTAI_CLUSTER_ID}/savings" \
| jq '{
currentMonthlyCost: .currentMonthlyCost,
optimizedMonthlyCost: .optimizedMonthlyCost,
monthlySavings: .monthlySavings,
savingsPercentage: .savingsPercentage,
spotSavings: .spotSavings,
rightSizingSavings: .rightSizingSavings
}'
# Enable aggressive spot with diversity and fallbacks
curl -X PUT -H "X-API-Key: ${CASTAI_API_KEY}" \
-H "Content-Type: application/json" \
"https://api.cast.ai/v1/kubernetes/clusters/${CASTAI_CLUSTER_ID}/policies" \
-d '{
"enabled": true,
"spotInstances": {
"enabled": true,
"clouds": ["aws"],
"spotDiversityEnabled": true,
"spotDiversityPriceIncreaseLimitPercent": 20,
"spotBackups": {
"enabled": true,
"spotBackupRestoreRateSeconds": 600
}
}
}'
Spot allocation strategy by workload tier:
| Workload Type | Spot % | Rationale |
|---|---|---|
| Batch jobs, CI runners | 100% spot | Interruptible, restartable |
| Stateless APIs (behind LB) | 80% spot | Can handle brief interruptions |
| Stateful services, databases | 0% spot | Use on-demand or reserved |
| ML training | 80-100% spot | Checkpointing handles interrupts |
# Get resource waste analysis
curl -s -H "X-API-Key: ${CASTAI_API_KEY}" \
"https://api.cast.ai/v1/workload-autoscaling/clusters/${CASTAI_CLUSTER_ID}/workloads" \
| jq '[.items[] | select(.estimatedSavingsPercent > 20) | {
name: .workloadName,
namespace: .namespace,
wastedCpu: (.currentCpuRequest - .recommendedCpuRequest),
wastedMemory: (.currentMemoryRequest - .recommendedMemoryRequest),
savingsPercent: .estimatedSavingsPercent
}] | sort_by(-.savingsPercent) | .[0:10]'
# Hibernate non-production clusters during off-hours
# Scales nodes to zero, resume on demand
# Enable hibernation
curl -X POST -H "X-API-Key: ${CASTAI_API_KEY}" \
-H "Content-Type: application/json" \
"https://api.cast.ai/v1/kubernetes/clusters/${CASTAI_CLUSTER_ID}/hibernate" \
-d '{
"schedule": {
"enabled": true,
"hibernateAt": "20:00",
"wakeUpAt": "08:00",
"timezone": "America/New_York",
"weekdaysOnly": true
}
}'
interface CostReport {
cluster: string;
period: string;
currentCost: number;
optimizedCost: number;
savings: number;
spotPercent: number;
}
async function generateMonthlyCostReport(
clusterIds: string[]
): Promise<CostReport[]> {
const reports: CostReport[] = [];
for (const clusterId of clusterIds) {
const [cluster, savings, nodes] = await Promise.all([
castaiGet(`/v1/kubernetes/external-clusters/${clusterId}`),
castaiGet(`/v1/kubernetes/clusters/${clusterId}/savings`),
castaiGet(`/v1/kubernetes/external-clusters/${clusterId}/nodes`),
]);
const spotNodes = nodes.items.filter(
(n: { lifecycle: string }) => n.lifecycle === "spot"
).length;
reports.push({
cluster: cluster.name,
period: new Date().toISOString().slice(0, 7),
currentCost: savings.currentMonthlyCost,
optimizedCost: savings.optimizedMonthlyCost,
savings: savings.monthlySavings,
spotPercent:
nodes.items.length > 0
? (spotNodes / nodes.items.length) * 100
: 0,
});
}
return reports;
}
| Issue | Cause | Solution |
|---|---|---|
| Savings lower than expected | Too many on-demand constraints | Relax node template constraints |
| Spot interruptions too frequent | Single instance type | Enable spot diversity |
| Hibernation not triggering | Schedule timezone wrong | Use IANA timezone format |
| Right-sizing too aggressive | Low headroom | Increase memory headroom to 20% |
For architecture patterns, see castai-reference-architecture.