Kubernetes Cost Optimization — 7 Ways to Cut Your Cloud Bill



Kubernetes clusters are easy to overspend on. Most teams overprovision resources, leave idle workloads running, and never set up autoscaling properly. The result is a cloud bill that grows every month without a clear reason why.

This guide covers 7 practical ways to cut your Kubernetes costs. No theory — just steps you can apply to your cluster today.


Why Kubernetes Costs Get Out of Control

Kubernetes schedules pods based on resource requests, not actual usage. If your developers request more CPU and memory than their apps actually need, you’re paying for resources that sit idle.

Common reasons costs spiral:

  • Pods requesting too much CPU and memory
  • Nodes running at low utilization
  • No autoscaling configured
  • Dev and test clusters running 24/7
  • Unused persistent volumes and load balancers

Fix these and you can cut your cloud bill by 30-60%.


1. Set Resource Requests and Limits on Every Pod

This is the most important step. Without resource requests and limits, Kubernetes cannot schedule pods efficiently.

Resource requests tell the scheduler how much CPU and memory a pod needs. Limits cap how much it can consume.

Example:

resources:
  requests:
    memory: "256Mi"
    cpu: "250m"
  limits:
    memory: "512Mi"

Important: Set memory limits but be careful with CPU limits. Too low a CPU limit throttles your pod during peak demand and hurts performance. Start with no CPU limits, monitor actual usage, then set limits based on real data.
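Before tightening requests, look at what your pods actually consume. Assuming metrics-server is installed (so kubectl top works) and using a placeholder namespace, you can compare real usage against configured requests:

```shell
# Actual CPU/memory usage per pod (requires metrics-server)
kubectl top pods -n my-namespace

# What each pod currently requests, for comparison
kubectl get pods -n my-namespace \
  -o custom-columns='NAME:.metadata.name,CPU_REQ:.spec.containers[*].resources.requests.cpu,MEM_REQ:.spec.containers[*].resources.requests.memory'
```

If usage sits well below requests for weeks, that gap is money you can reclaim.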

Review your resource requests every quarter. What worked six months ago may be wasting money today.


2. Use the Horizontal Pod Autoscaler (HPA)

HPA automatically scales the number of pod replicas based on CPU or memory usage. Instead of running 10 replicas at all times, you run 2 at off-peak and scale up when traffic increases.

Set up HPA for a deployment:

kubectl autoscale deployment my-app \
  --cpu-percent=70 \
  --min=2 \
  --max=10

Check HPA status:

kubectl get hpa

HPA works best for stateless applications with variable traffic. If your app has pronounced peaks and quiet periods, HPA can save you significant compute costs.
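The same autoscaler can also be defined declaratively, which is easier to keep in version control. A manifest equivalent to the command above, using the autoscaling/v2 API:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70  # scale up when average CPU usage exceeds 70% of requests
```

Note that HPA computes utilization against the pod's CPU request, which is another reason to set requests accurately (tip 1).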


3. Enable Cluster Autoscaler

Cluster Autoscaler adds and removes nodes based on pod scheduling demand. When pods cannot be scheduled due to lack of resources, it adds a node. When nodes are underutilized, it removes them.

This prevents you from paying for idle nodes during low-traffic periods.

Most managed Kubernetes services support Cluster Autoscaler natively:

  • AWS EKS — enable via node group settings
  • Azure AKS — enable via --enable-cluster-autoscaler flag
  • GKE — enable via --enable-autoscaling flag
  • DigitalOcean Kubernetes — enable autoscaling in the node pool settings

Set a minimum and maximum node count that matches your traffic patterns. Avoid setting the minimum too high — that defeats the purpose.
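As a sketch, here is what enabling autoscaling looks like on GKE and AKS; cluster, resource group, and node pool names are placeholders:

```shell
# GKE: enable autoscaling on an existing node pool
gcloud container clusters update my-cluster \
  --enable-autoscaling \
  --min-nodes 1 --max-nodes 5 \
  --node-pool default-pool

# AKS: enable the cluster autoscaler on a node pool
az aks nodepool update \
  --resource-group my-rg \
  --cluster-name my-cluster \
  --name nodepool1 \
  --enable-cluster-autoscaler \
  --min-count 1 --max-count 5
```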


4. Use Spot or Preemptible Instances for Non-Critical Workloads

Spot Instances (AWS), Spot or Preemptible VMs (GCP), and Azure Spot VMs (which replaced Low-Priority VMs) offer discounts of 60-90% compared to on-demand pricing. The trade-off is that they can be reclaimed with little notice.

Use spot instances for:

  • Batch processing jobs
  • CI/CD build agents
  • Dev and test workloads
  • Non-critical background tasks

Do not use spot instances for:

  • Production databases
  • Stateful workloads with no redundancy
  • Workloads that cannot tolerate interruption

The right approach is a mixed node pool — on-demand nodes for critical workloads, spot nodes for everything else. Most cloud providers support this natively in Kubernetes node groups.
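To pin a workload onto spot capacity, schedule it against the label your provider puts on spot nodes. On AWS EKS, managed node groups label spot nodes eks.amazonaws.com/capacityType: SPOT (GKE uses cloud.google.com/gke-spot: "true"). A minimal deployment fragment, assuming an EKS cluster with a spot node group and a hypothetical workload name:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-worker   # hypothetical workload
spec:
  replicas: 3
  selector:
    matchLabels:
      app: batch-worker
  template:
    metadata:
      labels:
        app: batch-worker
    spec:
      nodeSelector:
        eks.amazonaws.com/capacityType: SPOT   # schedule only onto spot nodes
      containers:
      - name: worker
        image: registry.example.com/batch-worker:latest   # placeholder image
```

Critical workloads simply omit the nodeSelector and land on on-demand nodes.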


5. Delete Idle Resources

Idle resources are silent cost killers. They sit there doing nothing and charge you every hour.

Check for and delete:

Unused persistent volumes:

kubectl get pv | grep Released
kubectl delete pv <pv-name>

Unused services with load balancers:

kubectl get svc --all-namespaces | grep LoadBalancer

Each cloud load balancer costs money even with zero traffic. Delete any that are not actively in use.

Completed or failed jobs (kubectl get jobs shows a COMPLETIONS column, not a "Complete" status string, so grepping for one matches nothing; list the jobs and review them instead):

kubectl get jobs --all-namespaces
kubectl delete job <job-name> -n <namespace>
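Better still, let Kubernetes clean up after itself: the ttlSecondsAfterFinished field (batch/v1) deletes a Job automatically a fixed time after it finishes. A sketch with a hypothetical job:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: nightly-report   # hypothetical job name
spec:
  ttlSecondsAfterFinished: 3600   # delete this Job one hour after it completes or fails
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: report
        image: busybox
        command: ["sh", "-c", "echo done"]
```

Add this field to your job templates once and the monthly cleanup list gets shorter.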

Namespaces with no active workloads (list all namespaces, then the namespaces that actually contain pods; anything in the first list but missing from the second is empty):

kubectl get ns
kubectl get pods --all-namespaces -o jsonpath='{range .items[*]}{.metadata.namespace}{"\n"}{end}' | sort -u

Run this cleanup monthly. It takes 30 minutes and can save hundreds of dollars.


6. Turn Off Dev and Test Clusters After Hours

Dev and test clusters do not need to run 24/7. If your team works 8 hours a day, 5 days a week, you’re paying for 128 idle hours per week.

Schedule cluster shutdown for non-production environments:

  • Use a cron job to scale down node pools to zero after hours
  • Use cloud scheduler tools to stop clusters on weekends
  • Scale down to minimum nodes outside of business hours

On AWS EKS and Azure AKS you can scale a node group to zero:

# AWS EKS — scale node group to zero
aws eks update-nodegroup-config \
  --cluster-name my-cluster \
  --nodegroup-name dev-nodes \
  --scaling-config minSize=0,maxSize=3,desiredSize=0

A dev cluster running on a $48/month node pool costs $576/year. Running it only during business hours (roughly 40 of 168 hours a week) cuts that to somewhere around $140-$200/year, depending on how aggressive your schedule is.
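One simple way to automate this is a plain crontab on an admin host (or CI runner) with AWS CLI credentials. Times run in that host's timezone, and the cluster and node group names are placeholders:

```shell
# 7 PM Mon-Fri: scale the dev node group down to zero
0 19 * * 1-5 aws eks update-nodegroup-config --cluster-name my-cluster --nodegroup-name dev-nodes --scaling-config minSize=0,maxSize=3,desiredSize=0

# 8 AM Mon-Fri: bring it back up for the work day
0 8 * * 1-5 aws eks update-nodegroup-config --cluster-name my-cluster --nodegroup-name dev-nodes --scaling-config minSize=1,maxSize=3,desiredSize=2
```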


7. Right-Size Your Nodes

Most teams pick a node size once during initial setup and never revisit it. Over time, workloads change and the original node size no longer fits.

Use the Vertical Pod Autoscaler (VPA) to get recommendations on optimal resource requests:

# Install VPA (it ships in the kubernetes/autoscaler repo; there is no single release manifest)
git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-up.sh

# Create a VPA object in recommendation mode
cat <<EOF | kubectl apply -f -
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Off"
EOF

Set updateMode: "Off" first — this gives you recommendations without automatically changing anything. Review the recommendations before applying them.

kubectl describe vpa my-app-vpa

Use these recommendations to update your resource requests manually. Re-run this process every quarter.


Tools to Monitor Kubernetes Costs

Manual reviews only go so far. These tools give you continuous visibility:

  • Kubecost — open source, shows cost by namespace, deployment, and team
  • OpenCost — CNCF project, free, integrates with Prometheus
  • Lens — Kubernetes IDE with resource usage visibility
  • Cloud native tools — AWS Cost Explorer, Azure Cost Analysis, GCP Billing all support Kubernetes cluster cost breakdown

Start with Kubecost or OpenCost. Both are free and take under an hour to set up.


Choosing the Right Cloud Provider

Cost optimization also starts with your infrastructure choice. Managed Kubernetes services vary significantly in price:

  • DigitalOcean Kubernetes — simple pricing, no control plane fee, good for small to medium clusters. Starting from $12/month per node.
  • AWS EKS — $0.10/hour control plane fee plus EC2 node costs. More complex pricing but more features.
  • Azure AKS — free control plane, pay for nodes only. Good value for Azure-heavy teams.
  • GKE — one free zonal cluster, then $0.10/hour. Autopilot mode can reduce costs significantly.

For smaller teams or projects, DigitalOcean Kubernetes is often the most cost-effective starting point. Get started with DigitalOcean →


Summary

Tip                                    Effort    Potential saving
Set resource requests and limits       Low       20-30%
Enable HPA                             Low       10-20%
Enable Cluster Autoscaler              Medium    20-40%
Use spot instances                     Medium    40-70% on eligible nodes
Delete idle resources                  Low       5-15%
Turn off dev clusters after hours      Low       Up to 65% on dev costs
Right-size nodes with VPA              Medium    10-25%

Start with tips 1, 5, and 6. These have the lowest effort and produce immediate savings. Then work through the rest.

