In part one of our Managing Cloud Costs with Kubernetes article series, we covered five key questions your cloud management team should be asking to understand, align, and report on your cloud costs with Kubernetes. In part two of this series, we’re going to outline five best practices to optimize cloud costs within a Kubernetes container environment.
As a brief recap of part one, we established the importance of the following:
- Identifying common KPIs with a Cloud Center of Excellence (CCoE)
- Ensuring all teams are following the same governance practices (definitions, labeling, etc.)
- Establishing how your organization allocates costs for cloud services (by resource requests or by usage, and for shared services)
With common KPIs, labeling conventions, and rules for how cloud costs are charged back to the right owner, you can start to think about how to optimize your container environment for better performance and lower costs.
How to optimize Kubernetes cloud costs
There are five primary best practices to optimize your Kubernetes cloud costs:
- Pod rightsizing
- Node rightsizing (or virtual machine rightsizing)
- Autoscaling (horizontal pod autoscaling, vertical pod autoscaling, and cluster autoscaling)
- Rebalancing fragmented nodes
- Leveraging cloud discounts (reserved instances, spot, savings plans, etc.)
1. Pod rightsizing
A Kubernetes Pod is the smallest deployable unit in a Kubernetes container environment: a group of one or more containers that is scheduled and scaled on the cluster as a single block of resources. Below is a sample Kubernetes cluster to help visualize how all the components of a Kubernetes cluster come together.
When configuring your Kubernetes cluster, you can use resource requests and limits: developers control the amount of CPU and memory per pod or container by setting the resource request and limit fields in the configuration file. Kubernetes takes the declared values at face value, guaranteeing at least the request while allowing consumption up to the limit. You can read more details in Kubernetes’ documentation here.
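As a minimal sketch, a Pod spec with requests and limits might look like the following (the workload name, image, and values are illustrative, not recommendations):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-app            # hypothetical workload name
spec:
  containers:
  - name: web
    image: nginx:1.25      # illustrative image
    resources:
      requests:
        cpu: "250m"        # the scheduler guarantees at least a quarter of a core
        memory: "256Mi"
      limits:
        cpu: "500m"        # the container is throttled beyond half a core
        memory: "512Mi"    # the container is OOM-killed if it exceeds this
```

Requests drive scheduling (and often chargeback), while limits cap actual consumption, so tightening the gap between observed usage and these values is where rightsizing savings come from.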
To help reduce the cost of your Kubernetes cluster, ensure that you’re setting resource requests and limits that provide enough resources for optimal performance, but not so much that there’s waste. Examine your pod usage and application performance over time to determine if you can rightsize your pods by adjusting your requests and limits.
Kubernetes also offers a tool called the Vertical Pod Autoscaler (VPA). The VPA automatically allocates more or less CPU and memory to existing pods. You can learn more about how the VPA works here.
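Once the VPA is installed in a cluster, enabling it for a workload is a small manifest. A sketch, assuming a hypothetical Deployment named `web-app`:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app        # hypothetical Deployment to rightsize
  updatePolicy:
    updateMode: "Auto"   # "Off" records recommendations without applying them
```

Running in `"Off"` mode first is a common way to review the VPA’s recommendations before letting it adjust live pods.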
2. Node rightsizing
Similar to rightsizing your pods, it’s important to make sure you’re using the right size and type of node in your Kubernetes cluster for the workloads you’re running. As a simple example, let’s say you have a node with 10 CPUs and 10 GB of RAM that costs $100/month. You also have a workload that requires 4 CPUs and 4 GB of RAM to run. You can fit up to two pods of this workload in your node, but there isn’t enough room for a third. You would need to add an additional node, which increases wasted resource space in your nodes and drives up costs. To reduce wasted spend and resources for that particular workload, it would be better to use a node with 8 CPUs and 8 GB of RAM.
The simple takeaway here is similar to pod rightsizing—be sure to measure what your applications require and reduce the number and size of your nodes where possible.
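The arithmetic behind the example above can be sketched in a few lines (a simplified model that packs by CPU and memory only, ignoring system overhead and pricing; all figures are the illustrative ones from the text):

```python
import math

def nodes_needed(pods, pod_cpu, pod_mem, node_cpu, node_mem):
    """Identical nodes required, assuming pods pack only by CPU and memory."""
    per_node = min(node_cpu // pod_cpu, node_mem // pod_mem)  # pods that fit on one node
    return math.ceil(pods / per_node)

def wasted_cpu(pods, pod_cpu, pod_mem, node_cpu, node_mem):
    """CPUs paid for but never requested by any pod."""
    n = nodes_needed(pods, pod_cpu, pod_mem, node_cpu, node_mem)
    return n * node_cpu - pods * pod_cpu

# The example from the text: three pods, each requesting 4 CPUs and 4 GB of RAM.
print(wasted_cpu(3, 4, 4, 10, 10))  # 10-CPU nodes: 2 nodes, 8 idle CPUs
print(wasted_cpu(3, 4, 4, 8, 8))    # 8-CPU nodes: 2 nodes, only 4 idle CPUs
```

Both node sizes need two nodes for three pods, but the smaller node type cuts the idle capacity you pay for in half.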
However, in terms of performance, it’s important to keep in mind that if the number of pods per node becomes too large, operations might slow down and can even become unreliable. Because of this, managed Kubernetes services usually impose limits on the number of pods per node. Here’s a quick breakdown of the leading cloud providers’ pod-per-node limits:
- On Amazon Elastic Kubernetes Service (EKS), the maximum number of pods per node depends on the node type and ranges from four to 737.
- On Google Kubernetes Engine (GKE), the limit is 110 pods per node, regardless of the type of node.
- On Azure Kubernetes Service (AKS), the default limit is 30 pods per node but it can be increased up to 250.
Improve resource utilization and optimize costs with CloudHealth rightsizing recommendations
CloudHealth’s rightsizing functionality makes it easy to quickly identify underutilized infrastructure and get recommendations for downgrading or terminating assets. Recommendations are based on utilization and performance metrics (e.g. CPU, memory, etc.) that can be ingested into the platform via APIs, integration partners (e.g. Datadog), or the CloudHealth Agent. Once the metrics are available, you have the power to set performance thresholds specific to your business and you can take advantage of advanced filtering capabilities by dynamic business groupings, regions, and more.
3. Autoscaling
By rightsizing your pods and nodes, you can improve performance and reduce the cost of your Kubernetes clusters. However, it’s challenging to know the exact number of pods or nodes that best fit the services you’re running, and to adapt quickly when changes occur. To help with this challenge, Kubernetes offers autoscaling tools that manage the number of active pods and nodes for you:
- The Horizontal Pod Autoscaler (HPA) increases or decreases the number of pods you run based on observed pod CPU or memory utilization. To learn more about how this is done in practice, see the Kubernetes documentation on the Horizontal Pod Autoscaler here.
- The Cluster Autoscaler automatically increases or decreases the size of your Kubernetes cluster by adding or removing nodes, based on pod and node utilization metrics. You can learn more about the Cluster Autoscaler here.
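As a sketch, a CPU-driven HPA for a hypothetical Deployment named `web-app` looks like this (the replica range and target utilization are illustrative):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app        # hypothetical Deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # add pods when average CPU passes 70% of requests
```

Because utilization is measured against pod requests, the HPA works best when those requests have already been rightsized.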
Applications have different needs with different patterns. Ensuring your application not only scales up when necessary, but also scales down at the appropriate times, can save you significant costs.
4. Rebalancing fragmented Kubernetes nodes
With time, any active Kubernetes cluster goes through a recurring series of deployments and periodic scale-out, which translates to repeated pod/node additions and removals. This cycle generally introduces several inefficiencies in the clusters. A few of them can often be addressed by the three steps we’ve already covered—rightsizing pods, rightsizing nodes, and autoscaling. However, one of the most significant (but rarely addressed) problems is that of resource fragmentation in Kubernetes clusters, which requires special attention.
Because the Kubernetes scheduler can’t predict future pod sizes or node additions, inconsistencies accumulate over time in how pods are placed. Eventually, pods end up spread across nodes in such a way that no single node has enough free resources to fit a new pod, making it unschedulable even though the cluster as a whole has plenty of spare capacity. This forces a scale-up and creates a “pseudo” resource crunch that could be avoided by consolidating the fragments of available resources.
This can be achieved by identifying and migrating a certain set of pods across the nodes to consolidate available resources together. In large-scale clusters, it’s especially important to rebalance unoptimized Kubernetes clusters like this to avoid wasted resources and unnecessary cloud costs.
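To make the “pseudo” resource crunch concrete, here is a tiny sketch (the per-node free-capacity numbers are made up for illustration):

```python
def fits_somewhere(free, cpu, mem):
    """True if any single node has enough free CPU and memory for the pod."""
    return any(c >= cpu and m >= mem for c, m in free)

# Free capacity left on each node after existing pods: (CPU cores, GB RAM).
free = [(1, 2), (2, 1), (1, 1)]

# A new pod requesting 3 CPUs and 3 GB: the cluster has 4 CPUs and 4 GB free
# in total, but no single node can host it.
total_cpu = sum(c for c, _ in free)   # 4 CPUs free across the cluster
print(fits_somewhere(free, 3, 3))     # -> False: a scale-up is triggered anyway
```

Rebalancing amounts to migrating pods so that this scattered free capacity is consolidated onto fewer nodes, letting the pod schedule without adding a node.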
In sum, rebalancing Kubernetes clusters is about performing best practices one, two, and three (pod rightsizing, node rightsizing, and autoscaling) in an integrated way and on an ongoing basis.
However, if you have hundreds or thousands of pods, working out a migration plan for moving pods between nodes can be nearly impossible, especially with multiple resources (like CPU and memory) to balance at once.
As such, the CloudHealth team has devised a couple of algorithms that can automatically generate migration plans for Kubernetes clusters. See our complete article on how this works here: Kubernetes Cost Optimization: How to rebalance fragmented Kubernetes clusters to reduce cloud costs
5. Leveraging efficient purchasing options
Major cloud providers offer different resource purchasing options, with several discounted price options in exchange for modified service contract terms. These resource purchasing options apply to Kubernetes just as they would to non-containerized infrastructure. For example:
- On-demand Instances: Pay by the hour or second for the instances that you launch
- Savings Plans: Reduce your Amazon EC2 or Fargate node costs by committing to a consistent amount of usage, in USD per hour, for a one- or three-year term (AWS only)
- Reserved Instances: Receive a discounted price in return for committing to pay for resources for one or three years (Azure calls these “Reservations” and Google Cloud calls these “Committed Use Discounts”)
- Spot Instances: Request unused instances in exchange for a discounted rate compared to on-demand prices (Azure calls these “Azure Spot VMs” and Google calls these “Preemptible VMs”)
For an in-depth look into these options and more, read this article that compares discounts, commitments, and reservations between AWS, Microsoft Azure, and Google Cloud.
Spot instances lend themselves particularly well to containerized environments. Whichever cloud provider you choose, the purpose is the same: users can request unused resources from the provider “on the spot” and use them at a lower cost than on-demand prices.
Spot resources come with the caveat that an instance may be reclaimed at any moment if the cloud provider needs the capacity back for on-demand or reserved customers. Many critical applications are not designed for this, but spot instances are well-suited to workloads that can tolerate minor interruptions. Kubernetes helps manage the risk by continuously reconciling the declared configuration (including the desired number of running pods and nodes), rescheduling workloads when a spot instance is reclaimed.
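One common pattern is to dedicate a tainted node pool to spot capacity so that only workloads which explicitly tolerate interruption land there. A sketch, assuming the spot nodes carry a hypothetical `lifecycle=spot` taint and label (the real taint and label names vary by provider and setup):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-worker            # hypothetical interruption-tolerant workload
spec:
  replicas: 4
  selector:
    matchLabels:
      app: batch-worker
  template:
    metadata:
      labels:
        app: batch-worker
    spec:
      nodeSelector:
        lifecycle: spot         # assumed label on the spot node pool
      tolerations:
      - key: "lifecycle"
        operator: "Equal"
        value: "spot"
        effect: "NoSchedule"    # only pods with this toleration land on spot nodes
      containers:
      - name: worker
        image: batch-worker:latest   # illustrative image
```

If a spot node is reclaimed, the Deployment controller simply recreates the missing replicas on whatever capacity remains.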
How CloudHealth helps manage your cloud discount options for optimal cost savings
CloudHealth takes the hassle out of cloud discount management by providing the modeling, optimization, amortization, and recommendation capabilities needed to help you feel confident about your purchasing decisions. With CloudHealth, you can see how well your Kubernetes cluster is covered by Reserved Instances, AWS Savings Plans, and Spot Instances in order to achieve the maximum benefits and cost savings from your investment. Learn more in our solution briefs.
Containers are revolutionizing how applications are developed and deployed, and can provide numerous advantages for organizations looking to adopt the technology. However, optimizing your Kubernetes cloud costs can be difficult without visibility into your containerized environment and accountability for spend.
Some organizations try to develop container management tools in-house, but struggle to justify the total time, resources, and capital needed to undergo such a large project. CloudHealth is ready to help you get visibility into your clusters, optimize your container environments, and save money—all without sacrificing agility.
Enable your developers to focus on differentiating your organization while you stay optimized and in control. Learn how CloudHealth can help you manage your Kubernetes environments by reading our solution brief, or book a demo to see the CloudHealth Platform in action.
For more best practices on managing cloud costs with Kubernetes see our in-depth whitepaper: FinOps for Kubernetes: Unpacking Container Cost Allocation and Optimization