Amazon Web Services provides users with five AWS cost optimization “pillars” to help minimize costs. Although the pillars are practical and easy to understand, they may not go deep enough for some newer users into the whys and hows of cost optimization—particularly when it comes to maintaining the optimized state.
The five AWS cost optimization pillars will be familiar to experienced users of Amazon Web Services—rightsize, schedule on/off times, choose the right pricing model, optimize storage, and repeat. For newer adopters of the AWS Cloud, the reasons why it’s necessary to optimize AWS costs and then repeat the process won’t be so familiar. After all, in the cloud you only pay for what you use. Don’t you? Not quite.
When you launch services in the cloud (the AWS Cloud or any other), you pay for what you provision. This means that if you provision an instance with x vCPUs and y GiBs of memory, what you pay for is what you provision, even if you only use half of its capacity. It’s a similar story if you leave instances running when you’re not using them or assign more expensive storage to an instance than you need.
One reason this happens so often to new cloud users is that operating costs in on-premises IT infrastructures are mostly fixed. If you fire up a new server, it’s already paid for, so it doesn’t really matter if you don’t use its full capacity or leave it running (except for utility costs). In the cloud, where servers can be launched with the click of a mouse, it’s easy for costs to quickly spiral out of control.
So let’s go through the five AWS cost optimization pillars provided by Amazon Web Services one by one, elaborate a little more on what they mean, and explain why the repeat cycle is necessary to maintain the optimized state. We’ll also add a sixth pillar of our own and explain how automating the processes can help your business save time and money maintaining an optimized AWS Cloud.
Pillar #1 - rightsizing AWS instances
The reason for rightsizing AWS instances is so that their capacities match their workloads and so your business really does only pay for what it uses. For businesses new to AWS, the most common assets to rightsize are EC2 instances and RDS instances, and the way to optimize these instances is to monitor CPU utilization, network throughput, and disk I/O via Amazon CloudWatch.
What you have to remember is that AWS instances double in capacity as they go up in size—and halve in capacity as they go down in size—so you can only really downsize to a less expensive instance if utilization metrics are peaking at ~45% or less. If you find you already have too many instances running to monitor utilization manually, Amazon Web Services recommends using CloudHealth.
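The capacity-halving rule above can be sketched in a few lines of plain Python. This is an illustrative decision rule, not an AWS API; in practice the peak utilization figure would come from CloudWatch (for example, the Maximum statistic of `CPUUtilization` over a trailing window), and the 45% threshold is the rule of thumb stated above.

```python
# Sketch of the downsizing rule of thumb (illustrative, not an AWS API).
# Because instance capacity halves as you step down a size, an instance is
# only a safe downsize candidate if its peak utilization would still fit in
# half the capacity -- i.e. it peaks at roughly 45% or less.

DOWNSIZE_THRESHOLD = 45.0  # percent, per the rule of thumb above

def is_downsize_candidate(peak_cpu_percent: float,
                          threshold: float = DOWNSIZE_THRESHOLD) -> bool:
    """Return True if peak CPU utilization leaves room to halve capacity."""
    return peak_cpu_percent <= threshold

# In a real workflow the peak would come from CloudWatch metrics rather than
# a hard-coded number.
print(is_downsize_candidate(38.5))  # peaks below 45% -> True
print(is_downsize_candidate(72.0))  # peaks above 45% -> False
```

Memory matters too, as noted later in this article: an instance peaking at 40% CPU but 90% memory is not a good candidate for a straight size reduction.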
Pillar #2 - scheduling on/off times
Not every asset you deploy on AWS needs to be running continuously. Some EC2 instances are used for developing, testing, staging, and QA; and, when your developers go home, it doesn’t make sense to keep these assets running. You could ask developers to make sure they switch off non-production assets manually, but historically this hasn’t been the most reliable way to achieve AWS cost optimization.
As its second AWS cost optimization pillar, Amazon Web Services recommends scheduling on/off times for non-production assets—suggesting they can be switched off for as much as 70% of the week. If you use a service such as CloudHealth to monitor utilization metrics, you will likely find that you can apply more aggressive schedules (rather than 8.00 a.m. to 8.00 p.m. Monday to Friday) in order to achieve greater savings.
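The arithmetic behind these percentages is simple enough to verify. The sketch below computes what fraction of a 168-hour week an instance spends stopped under a given schedule; the schedules are illustrative examples, not AWS recommendations.

```python
# Rough arithmetic for pillar #2: what fraction of the week is an instance
# off under a simple daily schedule? A week is the 168-hour baseline.

HOURS_PER_WEEK = 7 * 24  # 168

def off_fraction(hours_on_per_day: float, days_on_per_week: int) -> float:
    """Fraction of the week an instance is stopped under a simple schedule."""
    hours_on = hours_on_per_day * days_on_per_week
    return 1 - hours_on / HOURS_PER_WEEK

# 8 a.m. to 8 p.m., Monday to Friday: on 12h x 5 days = 60h of 168h.
print(round(off_fraction(12, 5), 3))  # 0.643 -> off roughly 64% of the week
# A more aggressive 10-hour weekday schedule pushes past the 70% mark.
print(round(off_fraction(10, 5), 3))  # 0.702
```

This is why the article suggests tightening schedules beyond 8-to-8 weekdays: the standard business-hours schedule alone falls a little short of the 70% figure.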
Pillar #3 - reserved instances
Most businesses are aware that by committing to a level of service over one or three years, they’re able to take advantage of AWS Reserved Instances in order to achieve discounts of up to 75% compared to “On Demand” pricing. Reserved Instances can contribute significantly to AWS cost optimization for predictable workloads, but only if they’re properly managed.
Some businesses make the mistake of committing to a Reserved Instance and then failing to take full advantage of the savings because demand falls, or because an instance covered by a reservation is terminated and never replaced. CloudHealth’s “RI Optimizer” can help businesses overcome these potential AWS cost optimization issues by managing Reserved Instances throughout their lifecycles.
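A quick back-of-the-envelope calculation shows why an underused reservation erodes its discount: the reservation is paid for every hour, whether the covered instance runs or not. The rates below are hypothetical round numbers for illustration, not real AWS pricing.

```python
# Why an underused Reserved Instance loses money (illustrative rates only).
# The RI's effective hourly rate is paid for every hour of the term, so there
# is a utilization level below which On-Demand would have been cheaper.

def ri_break_even_utilization(on_demand_rate: float,
                              ri_effective_rate: float) -> float:
    """Utilization below which the RI costs more than paying On-Demand."""
    return ri_effective_rate / on_demand_rate

# Hypothetical rates: $0.10/h On-Demand vs. an effective $0.06/h under the RI
# (a 40% discount).
breakeven = ri_break_even_utilization(0.10, 0.06)
print(round(breakeven, 2))  # 0.6 -> below 60% usage, the RI loses money
```

This is the arithmetic that lifecycle management tools automate: tracking whether each reservation is still being consumed above its break-even point.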
Pillar #4 - optimize storage
Storage is often overlooked in AWS cost optimization procedures because so much attention is given to instances, yet substantial savings can be achieved by assigning the right storage type to assets and maintaining infrequently accessed data in lower-tier storage options—although, before transferring infrequently accessed data to cold storage, it’s important to be aware of retrieval costs.
With regard to storage types, AWS assigns General Purpose SSD storage to assets by default. If a high level of performance and availability isn’t necessary—for example, for non-production instances—it’s possible to halve storage costs by using Throughput Optimized HDD storage. Again, CloudHealth’s cloud management platform can provide guidance about appropriate storage tiers and types.
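The savings are easy to estimate once you know the per-GB rates. The prices below are illustrative round numbers in the spirit of published US-East EBS rates; check the current AWS price list before relying on them.

```python
# Illustrative EBS storage cost comparison. The per-GB-month prices are
# hypothetical round-number assumptions, not guaranteed AWS rates.

PRICE_PER_GB_MONTH = {
    "gp2 (General Purpose SSD)": 0.10,        # assumed rate
    "st1 (Throughput Optimized HDD)": 0.045,  # assumed rate
}

def monthly_cost(volume_gb: int, storage_type: str) -> float:
    """Estimated monthly cost of a volume at the assumed rates."""
    return volume_gb * PRICE_PER_GB_MONTH[storage_type]

# A 500 GB non-production volume:
print(round(monthly_cost(500, "gp2 (General Purpose SSD)"), 2))       # 50.0
print(round(monthly_cost(500, "st1 (Throughput Optimized HDD)"), 2))  # 22.5
```

At these assumed rates the HDD tier costs less than half as much, which is the scale of saving the paragraph above describes.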
Pillar #5 - repeat
AWS cost optimization is not a one-off exercise. As businesses expand their presence in the cloud, asset utilization increases or decreases, and Reserved Instances near the end of their lifecycles, it’s necessary to treat AWS cost optimization as an ongoing exercise. Amazon Web Services recommends constantly measuring, monitoring, and improving to ensure the full economic potential of the AWS Cloud is extracted, and suggests four ways in which this can be done:
- Define and enforce cost allocation tagging.
- Define metrics, set targets, and review.
- Train and incentivize teams to save costs.
- Assign the responsibility for AWS cost optimization to an individual or team.
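The first item on that list—defining and enforcing cost allocation tagging—lends itself to a simple automated audit. The sketch below uses illustrative in-memory records and a hypothetical tag policy; a real audit would pull resources and their tags via the AWS Resource Groups Tagging API or AWS Config.

```python
# Minimal sketch of cost-allocation tag enforcement. The data shapes and the
# required-tag set are hypothetical; a real audit would fetch live resources
# via the AWS Resource Groups Tagging API.

REQUIRED_TAGS = {"team", "environment", "cost-center"}  # hypothetical policy

def untagged(resources: list) -> list:
    """Return IDs of resources missing any required cost-allocation tag."""
    return [r["id"] for r in resources
            if not REQUIRED_TAGS <= set(r.get("tags", {}))]

inventory = [
    {"id": "i-0abc", "tags": {"team": "web", "environment": "prod",
                              "cost-center": "123"}},
    {"id": "i-0def", "tags": {"team": "data"}},  # missing two required tags
    {"id": "vol-0123", "tags": {}},              # entirely untagged volume
]
print(untagged(inventory))  # ['i-0def', 'vol-0123']
```

Untagged resources can’t be allocated to a team or cost center, which is why enforcement belongs in the repeat cycle rather than being a one-off cleanup.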
Bonus Pillar #6 - delete zombie assets
Zombie assets are assets that are no longer being used, or that were launched at the same time as (for example) an EC2 instance and then not deleted when the EC2 instance was terminated. Some businesses can find thousands of zombie assets in their inventories that are still being paid for even though they aren’t being used—and it isn’t always poor housekeeping that’s to blame.
Some assets are attached by default to an EC2 instance—for example, Elastic Block Storage—and unless the box on the AWS console is checked to delete the storage volume when the instance is terminated, the volume persists and keeps running up costs. Other zombie assets can include Elastic IP addresses, Elastic Load Balancers, aged snapshots, and components of instances that were activated when an instance failed to launch.
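Two of the most common zombie checks can be expressed as simple filters. The records below are illustrative; a real sweep would fetch live data with boto3 (`describe_volumes`, `describe_addresses`) and review each candidate before deleting anything.

```python
# Sketch of a zombie-asset sweep over illustrative records. A real sweep
# would call boto3's describe_volumes / describe_addresses and always review
# candidates before deletion.

def zombie_volumes(volumes: list) -> list:
    """EBS volumes in the 'available' state are attached to nothing."""
    return [v["id"] for v in volumes if v["state"] == "available"]

def zombie_addresses(addresses: list) -> list:
    """Elastic IPs with no association still incur a charge."""
    return [a["ip"] for a in addresses if a.get("association_id") is None]

vols = [{"id": "vol-0a", "state": "in-use"},
        {"id": "vol-0b", "state": "available"}]  # orphaned volume
ips = [{"ip": "3.3.3.3", "association_id": "eipassoc-1"},
       {"ip": "4.4.4.4"}]                        # unassociated Elastic IP

print(zombie_volumes(vols))   # ['vol-0b']
print(zombie_addresses(ips))  # ['4.4.4.4']
```

The same pattern extends to aged snapshots and idle load balancers: define what “unused” means for each asset class, then report everything that matches.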
Automating AWS cost optimization
It’s humanly impossible to monitor the AWS Cloud around the clock in order to identify assets suitable for rightsizing, scheduling, or terminating. Therefore, many businesses take advantage of policy-driven automation to keep an eye on their AWS Cloud and alert them to opportunities to optimize costs. Automating AWS cost optimization is a straightforward process.
You simply select a cloud management platform with automation capabilities (e.g., CloudHealth) and then apply the policies you want the platform to monitor. For example, if you have concerns that EC2 instances are overprovisioned, you apply a policy to be notified when CPU utilization falls below 45%. Even if the instance is using a high percentage of memory (which might make a size reduction impractical), you may be able to save money by moving the workload to a different instance family.
Additionally, creating a Cloud Financial Management practice can help with AWS cost optimization. Cloud Financial Management (CFM), also known as FinOps or Cloud Cost Management, is a function that helps align and develop financial goals, drive a cost-conscious culture, establish guardrails to meet financial targets, and gain greater business efficiencies.
It’s also possible to apply policies that notify you when on/off schedules could be more aggressive, when Reserved Instance utilization needs attention, or when data has not been accessed for x days—making it a suitable candidate for cold storage. Automating AWS cost optimization also works the other way around. You can apply policies to be notified when assets are over-utilized (and suitable for upgrading), when further Reserved Instance purchases could save you money, or when data retrieval costs exceed the savings achieved by moving data to cold storage.
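The cold-storage policy described above weighs two numbers against each other: the monthly saving from the colder tier versus the expected retrieval cost. The sketch below makes that trade-off explicit; every rate and threshold in it is an assumed, illustrative value, not an AWS price.

```python
# Sketch of a cold-storage candidacy policy. All rates and thresholds are
# assumed, illustrative values -- real figures come from the AWS price list
# and your own access patterns.

def move_to_cold_storage(days_since_access: int, size_gb: float,
                         monthly_saving_per_gb: float = 0.02,      # assumed
                         retrieval_cost_per_gb: float = 0.01,      # assumed
                         expected_retrievals_per_month: float = 0.1,
                         min_idle_days: int = 90) -> bool:
    """Candidate only if idle long enough AND savings beat retrieval costs."""
    if days_since_access < min_idle_days:
        return False
    saving = size_gb * monthly_saving_per_gb
    retrieval = size_gb * retrieval_cost_per_gb * expected_retrievals_per_month
    return saving > retrieval

print(move_to_cold_storage(120, 1000))  # long idle, cheap to keep cold -> True
print(move_to_cold_storage(30, 1000))   # accessed recently -> False
```

Running the check in both directions—flagging data that should move to cold storage, and data whose retrieval costs now exceed the savings—is exactly the two-way automation the paragraph above describes.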