The content in this blog is outdated, and given how quickly the cloud industry moves, we cannot reliably say it is still accurate.
In our last blog, we discussed the first two steps in our rightsizing recipe for success, which provide a framework to measure the usage and performance of your infrastructure. Now that you’ve consolidated your cloud data and evaluated it through our three lenses, you’re ready to analyze and optimize the environment.
Analyze – Don’t Boil the Ocean
Using your devised groupings, you can look at the metrics behind your instances based on the cost and performance factors that matter most for rightsizing. Amazon has a plethora of instance families to choose from and a variety of sizes within each family. Luckily, spinning up an instance doesn’t mean you are committed to that specific size. To align your instances with actual usage and reduce wasted costs from under-utilized instances, Amazon allows you to change an instance to a different size within its family. For an EBS-backed instance, that means stopping it, changing the instance type, and starting it again. As long as the new size still covers the workload’s real demand, performance doesn’t suffer - a luxury of rightsizing!
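As a rough illustration of what such a resize looks like in practice, here is a minimal sketch using boto3, the AWS SDK for Python. It assumes an EBS-backed instance and takes an already-created EC2 client as a parameter. The function name resize_instance and the instance ID in the usage comment are placeholders for illustration, not something from the original walkthrough.

```python
def resize_instance(ec2, instance_id, new_type):
    """Stop an EBS-backed instance, change its type, and start it again.

    `ec2` is expected to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    # The instance type can only be changed while the instance is stopped.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    # Switch to the new size, e.g. from c4.xlarge to c4.large.
    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": new_type},
    )

    # Bring the instance back up and wait until it is running.
    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])


# Usage (needs AWS credentials; the instance ID below is a placeholder):
# import boto3
# resize_instance(boto3.client("ec2"), "i-0123456789abcdef0", "c4.large")
```

Because the instance is stopped during the change, a resize like this is best scheduled for a maintenance window.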
Analyzing the daily minimum, maximum, and average utilization of three key performance metrics (CPU, memory, and disk) gives you a clear picture of your instances’ performance. For example, your instance could be dormant most of the time, with usage spikes of varying size during certain periods of the day. In a situation like that, it’s okay to keep a large instance type in order to handle the forecasted periodic spikes. The problem arises when servers are over-provisioned to cover every possible need or worst-case scenario. That mitigates risk, but from a financial perspective it’s extremely inefficient.
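To make the daily minimum, maximum, and average concrete, here is a small sketch in plain Python. It assumes you have already exported utilization samples as (timestamp, percent) pairs. The function name daily_stats and the sample values are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_stats(samples):
    """Group (timestamp, percent) samples by day.

    Returns {date: (minimum, maximum, average)} with days in ascending order.
    """
    by_day = defaultdict(list)
    for timestamp, percent in samples:
        by_day[timestamp.date()].append(percent)
    return {
        day: (min(values), max(values), mean(values))
        for day, values in sorted(by_day.items())
    }


# Made-up CPU samples: mostly idle, with one midday spike on the first day.
samples = [
    (datetime(2016, 5, 1, 3), 2.0),
    (datetime(2016, 5, 1, 12), 78.0),
    (datetime(2016, 5, 1, 20), 4.0),
    (datetime(2016, 5, 2, 12), 10.0),
]
for day, (low, high, avg) in daily_stats(samples).items():
    print(day, low, high, avg)
```

Looking at all three numbers together is what separates a dormant instance with a real midday spike from one that is simply idle all day.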
Once you know the CPU, memory, and disk utilization statistics for the instances used to develop our new app, you have the information you need to optimize cost and usage. Keep in mind that the business purpose (or function) of the instance should play into the decision-making process. For example, you shouldn’t downsize a memory-optimized instance in the R3 family strictly because its average CPU utilization hasn’t gone beyond 10% in several months. Its memory capacity needs to be factored into the decision, as it may be a server meant to archive tables. The same rule applies when analyzing different metrics in the C4, M3, I2, or G2 instance families: judge each instance by the metric its family was chosen for. With this game plan, you’re ready to take action and optimize your environment.
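One way to encode the idea that an instance’s function should drive the decision is sketched below. The mapping from family to key metrics is an assumption based on how AWS positions those families (C4 compute optimized, R3 memory optimized, I2 storage optimized, M3 general purpose). The names KEY_METRICS and downsize_candidate are hypothetical. The 20% threshold is the low-utilization figure used later in this post.

```python
# Assumed mapping: which metrics each instance family is typically bought for.
KEY_METRICS = {
    "c4": ("cpu",),
    "r3": ("memory",),
    "i2": ("disk",),
    "m3": ("cpu", "memory"),
}


def downsize_candidate(instance_type, avg_util, threshold=20.0):
    """Return True only if every key metric for the family is at or below threshold.

    `avg_util` maps metric name to average utilization in percent.
    """
    family = instance_type.split(".")[0]
    metrics = KEY_METRICS.get(family)
    if metrics is None:
        # Unknown family (for example GPU-based g2): leave for manual review.
        return False
    return all(avg_util[metric] <= threshold for metric in metrics)


usage = {"cpu": 8.0, "memory": 70.0, "disk": 30.0}
print(downsize_candidate("r3.2xlarge", usage))  # low CPU alone is not enough
print(downsize_candidate("c4.xlarge", usage))   # CPU is the metric that matters
```

The same utilization numbers lead to different answers depending on the family, which is the point of looking at function before rightsizing.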
Optimize – Take Action
Based on this analysis, you can apply the following optimizations:
During the analysis stage, you’ll likely find multiple instances that show minimal CPU activity or memory utilization over long periods of time. When you have an instance whose average utilization does not go above 5% and whose maximum doesn’t come close to 20% over several periods, you have found “zombie infrastructure” - servers that were launched and have gone unused while accumulating unnecessary cost. By isolating and terminating those instances, you can maximize cost efficiency.
If an instance has low utilization (roughly 20% or lower) for the metrics mentioned above, it’s likely underutilized. Downsizing some of those instances can yield significant savings while still giving users active servers to execute their daily tasks. For example, you can rightsize a c4.xlarge (4 vCPUs) into a c4.large (2 vCPUs) without hurting performance if the average CPU is 25% and peaks stay modest: the same load works out to about 50% on the smaller instance. You’re spending less money and still have enough processing power for the workload. Likewise, if you have several m4.4xlarges (64 GiB) with less than 40% of their memory utilized, that is under roughly 26 GiB in use, so you can rightsize those instances to an m4.2xlarge (32 GiB) to save on cost while still fitting the workload in memory.
When your instances are constantly hitting maximum utilization, they should be upgraded to increase performance and mitigate the risk of not being able to meet surges in demand. Although moving up a size within the instance family will increase costs for that instance, the benefit from greater performance is usually worth the investment.
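The three optimizations above can be sketched as a simple decision rule. The 5% and 20% figures come from the text. The 90% “constantly at maximum” cutoff, the 90% peak ceiling, and the assumption that the next size down has half the capacity are illustrative choices, as are the function names. The inputs are the average and peak utilization of one metric over your observation window.

```python
def projected_utilization(current_pct, current_capacity, target_capacity):
    """Utilization the same load would show on a different capacity."""
    return current_pct * current_capacity / target_capacity


def recommend(avg_pct, max_pct, zombie_avg=5.0, zombie_max=20.0,
              low_avg=20.0, high_avg=90.0, peak_ceiling=90.0):
    """Map one metric's average and peak utilization to a suggested action."""
    if avg_pct <= zombie_avg and max_pct < zombie_max:
        return "terminate"  # looks like zombie infrastructure
    if avg_pct >= high_avg:
        return "upsize"     # at or near capacity most of the time
    # Assumes the next size down has half the capacity, so peaks double.
    if avg_pct <= low_avg and projected_utilization(max_pct, 2, 1) <= peak_ceiling:
        return "downsize"
    return "keep"           # includes idle-but-spiky instances


# c4.xlarge (4 vCPUs) to c4.large (2 vCPUs) at 25% average CPU:
print(projected_utilization(25, 4, 2))    # 50.0
# m4.4xlarge (64 GiB) to m4.2xlarge (32 GiB) at 40% memory used:
print(projected_utilization(40, 64, 32))  # 80.0

print(recommend(2, 10))    # terminate
print(recommend(15, 40))   # downsize
print(recommend(15, 95))   # keep: low average, but peaks need the capacity
print(recommend(95, 100))  # upsize
```

Checking the projected peak before downsizing keeps the rule from shrinking an instance that is idle most of the day but genuinely needs its capacity during spikes.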
Your infrastructure is now rightsized! This is great before launch, but what happens when you unleash your app into the world? How do you track your infrastructure's performance and cost? Stay tuned for our next blog illustrating how governance can help maintain your newly optimized environment.