From castai-pack
Maximize Kubernetes cost savings with CAST AI spot strategies and right-sizing. Use when analyzing cloud spend, optimizing spot-to-on-demand ratios, or configuring CAST AI for maximum savings. Trigger with phrases like "cast ai cost", "cast ai savings", "cast ai spot strategy", "reduce kubernetes cost", "cast ai budget".
How this skill is triggered — by the user, by Claude, or both
Slash command
/castai-pack:castai-cost-tuningThis skill is limited to the following tools:
The summary Claude sees in its skill listing — used to decide when to auto-load this skill
Maximize Kubernetes cost savings through CAST AI: spot instance strategies, workload right-sizing, cluster hibernation, and savings tracking. Typical savings: 50-70% on cloud compute costs.
Maximize Kubernetes cost savings through CAST AI: spot instance strategies, workload right-sizing, cluster hibernation, and savings tracking. Typical savings: 50-70% on cloud compute costs.
# Get savings breakdown
curl -s -H "X-API-Key: ${CASTAI_API_KEY}" \
"https://api.cast.ai/v1/kubernetes/clusters/${CASTAI_CLUSTER_ID}/savings" \
| jq '{
currentMonthlyCost: .currentMonthlyCost,
optimizedMonthlyCost: .optimizedMonthlyCost,
monthlySavings: .monthlySavings,
savingsPercentage: .savingsPercentage,
spotSavings: .spotSavings,
rightSizingSavings: .rightSizingSavings
}'
# Enable aggressive spot with diversity and fallbacks
curl -X PUT -H "X-API-Key: ${CASTAI_API_KEY}" \
-H "Content-Type: application/json" \
"https://api.cast.ai/v1/kubernetes/clusters/${CASTAI_CLUSTER_ID}/policies" \
-d '{
"enabled": true,
"spotInstances": {
"enabled": true,
"clouds": ["aws"],
"spotDiversityEnabled": true,
"spotDiversityPriceIncreaseLimitPercent": 20,
"spotBackups": {
"enabled": true,
"spotBackupRestoreRateSeconds": 600
}
}
}'
Spot allocation strategy by workload tier:
| Workload Type | Spot % | Rationale |
|---|---|---|
| Batch jobs, CI runners | 100% spot | Interruptible, restartable |
| Stateless APIs (behind LB) | 80% spot | Can handle brief interruptions |
| Stateful services, databases | 0% spot | Use on-demand or reserved |
| ML training | 80-100% spot | Checkpointing handles interrupts |
# Get resource waste analysis
curl -s -H "X-API-Key: ${CASTAI_API_KEY}" \
"https://api.cast.ai/v1/workload-autoscaling/clusters/${CASTAI_CLUSTER_ID}/workloads" \
| jq '[.items[] | select(.estimatedSavingsPercent > 20) | {
name: .workloadName,
namespace: .namespace,
wastedCpu: (.currentCpuRequest - .recommendedCpuRequest),
wastedMemory: (.currentMemoryRequest - .recommendedMemoryRequest),
savingsPercent: .estimatedSavingsPercent
}] | sort_by(-.savingsPercent) | .[0:10]'
# Hibernate non-production clusters during off-hours
# Scales nodes to zero, resume on demand
# Enable hibernation
curl -X POST -H "X-API-Key: ${CASTAI_API_KEY}" \
-H "Content-Type: application/json" \
"https://api.cast.ai/v1/kubernetes/clusters/${CASTAI_CLUSTER_ID}/hibernate" \
-d '{
"schedule": {
"enabled": true,
"hibernateAt": "20:00",
"wakeUpAt": "08:00",
"timezone": "America/New_York",
"weekdaysOnly": true
}
}'
interface CostReport {
cluster: string;
period: string;
currentCost: number;
optimizedCost: number;
savings: number;
spotPercent: number;
}
async function generateMonthlyCostReport(
clusterIds: string[]
): Promise<CostReport[]> {
const reports: CostReport[] = [];
for (const clusterId of clusterIds) {
const [cluster, savings, nodes] = await Promise.all([
castaiGet(`/v1/kubernetes/external-clusters/${clusterId}`),
castaiGet(`/v1/kubernetes/clusters/${clusterId}/savings`),
castaiGet(`/v1/kubernetes/external-clusters/${clusterId}/nodes`),
]);
const spotNodes = nodes.items.filter(
(n: { lifecycle: string }) => n.lifecycle === "spot"
).length;
reports.push({
cluster: cluster.name,
period: new Date().toISOString().slice(0, 7),
currentCost: savings.currentMonthlyCost,
optimizedCost: savings.optimizedMonthlyCost,
savings: savings.monthlySavings,
spotPercent:
nodes.items.length > 0
? (spotNodes / nodes.items.length) * 100
: 0,
});
}
return reports;
}
| Issue | Cause | Solution |
|---|---|---|
| Savings lower than expected | Too many on-demand constraints | Relax node template constraints |
| Spot interruptions too frequent | Single instance type | Enable spot diversity |
| Hibernation not triggering | Schedule timezone wrong | Use IANA timezone format |
| Right-sizing too aggressive | Low headroom | Increase memory headroom to 20% |
For architecture patterns, see castai-reference-architecture.
npx claudepluginhub flight505/skill-forge --plugin castai-packCreates, edits, and optimizes skills for Claude Code, including drafting, evaluating with test prompts, iterating on performance, and improving skill descriptions for better triggering accuracy.