Common Culprits for Unexpected AWS, Azure, and GCP Service Cost Spikes


As a CloudHealth Technical Account Manager, I have the opportunity to work directly with a diverse set of customers across all three major public cloud providers - AWS, Azure, and GCP. From innovative startups to Fortune 500 enterprises, these organizations are continually caught off guard by unpredictable cloud service costs as they attempt to manage, optimize, and govern their cloud spend.

I work alongside these customers as a trusted advisor to provide visibility into their cloud environment and enable policies to identify cost fluctuations. In many cases, I see the same cloud services create unexpected cost spikes. So to help customers know what to look for, I collaborated with fellow Technical Account Managers to document a list of common cloud services across AWS, Azure, and GCP that contribute to unpredictable cloud service costs.

AWS Cloud Service Costs

AWS CloudWatch

AWS CloudWatch is a monitoring service that provides data for all your AWS cloud services and applications. With CloudWatch, you can track your Amazon EC2 instances, Amazon DynamoDB tables, Amazon RDS DB instances, and custom metrics generated from your applications.

As your team begins to scale CloudWatch, you may exceed the free tier allowances for custom metrics, alarms, and dashboards, at which point costs can rise rapidly and unexpectedly. Review this knowledge center article from AWS for tips on how to reduce these charges.

AWS NAT Gateway

Network Address Translation (NAT) Gateway is an AWS managed service that enables your private network to communicate with public networks like the internet. You can send and receive traffic from a single IP address without exposing host identities. One way I like to explain NAT Gateway is like an apartment building and concierge. If someone wants to send you a package, they can send it to the apartment building and the concierge routes it to your specific room without them knowing your room number.

When initially deploying NAT Gateway, your cloud costs can easily skyrocket, so it’s important to review how your NAT Gateways are configured and deployed. You’re charged per hour for each gateway and per gigabyte of data it processes, and less expensive options for routing some of that traffic are often available. For best practices, we recommend this article from AWS: How can I reduce data transfer charges for my NAT gateway?
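To make the two billing dimensions concrete, here is a minimal sketch of the arithmetic. The function name and both rates are assumptions for illustration only; actual rates vary by region, so check the NAT Gateway pricing page for yours.

```python
def nat_gateway_monthly_cost(hours, gb_processed, hourly_rate, per_gb_rate):
    """Estimate one gateway's charges: an hourly fee plus a per-GB processing fee."""
    hourly_charge = hours * hourly_rate
    data_charge = gb_processed * per_gb_rate
    return hourly_charge + data_charge


# Illustrative rates only (assumed, not quoted from a price list).
HOURLY_RATE = 0.045  # USD per gateway-hour
PER_GB_RATE = 0.045  # USD per GB processed

# Same gateway, same 730-hour month, very different traffic volumes.
quiet_month = nat_gateway_monthly_cost(730, 100, HOURLY_RATE, PER_GB_RATE)
busy_month = nat_gateway_monthly_cost(730, 20_000, HOURLY_RATE, PER_GB_RATE)

print(f"quiet month: ${quiet_month:,.2f}")
print(f"busy month:  ${busy_month:,.2f}")
```

Under these assumed rates, the hourly fee stays flat while the data processing fee grows with traffic, which is why a chatty workload behind a NAT Gateway tends to be where the surprise comes from.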

AWS Glue

AWS Glue is a managed extract, transform, and load (ETL) service that crawls the data assets in your AWS environment and stores that information in the AWS Glue Data Catalog. Glue is essentially an organized central repository of your organization’s AWS data.

Like other Amazon managed services, AWS Glue is valuable and easy to deploy, but each job can rack up costs. If you can reduce the number of Data Processing Units (DPUs) to run your ETL job, then you can significantly reduce costs. Learn more about AWS Glue pricing here.
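As a rough sketch of why DPU count matters, the snippet below assumes a job is billed in proportion to DPU-hours (DPUs multiplied by runtime). The function name and the rate are assumptions for illustration, and the sketch ignores any minimum billing duration.

```python
def glue_job_cost(dpus, runtime_seconds, rate_per_dpu_hour):
    """Estimate the cost of one ETL job run as DPU-hours times an hourly DPU rate."""
    dpu_hours = dpus * runtime_seconds / 3600
    return dpu_hours * rate_per_dpu_hour


RATE_PER_DPU_HOUR = 0.44  # USD, assumed for illustration

# An over-provisioned job: 10 DPUs finishing in 30 minutes.
baseline = glue_job_cost(10, 30 * 60, RATE_PER_DPU_HOUR)

# The same job with 4 DPUs, assuming it then takes 45 minutes.
right_sized = glue_job_cost(4, 45 * 60, RATE_PER_DPU_HOUR)

print(f"baseline:    ${baseline:.2f} per run")
print(f"right-sized: ${right_sized:.2f} per run")
```

Fewer DPUs usually means a longer runtime, so the saving only materializes when the job does not slow down proportionally. It is worth measuring a few runs before settling on a number.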

AWS SageMaker

AWS SageMaker is a fully managed machine learning (ML) service that eases the workload of developing high-quality ML models. It doesn’t cost anything for the first two months, as long as usage remains below a certain amount and number of hours. As more engineers within your business leverage AWS SageMaker, you can exceed the free tier quickly, causing cloud costs to rise.

After you exceed the free tier, you only pay for what you use, with billing for building, training, and deploying ML models charged by the second. Make sure you have visibility into AWS SageMaker usage and pricing to prevent exceeding budget.

AWS S3 Glacier

AWS S3 Glacier is an AWS storage class recommended for low-access, long-term storage. Although S3 Glacier is one of the more affordable storage classes that AWS offers, there are still some important factors to consider to ensure you aren’t surprised by unpredictable cloud costs.

For example, if data is archived to S3 Glacier and then deleted within 90 days of being uploaded, you’re charged a prorated early deletion fee. Learn more by checking out Amazon’s S3 Glacier FAQs.
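The early deletion fee can be approximated as the storage charge for the days remaining in the 90-day minimum. The sketch below is a simplified model under that assumption: the function name, the storage rate, and the 30-day month are all illustrative, not AWS's exact billing formula.

```python
def early_deletion_fee(size_gb, days_stored, rate_per_gb_month, minimum_days=90):
    """Approximate the prorated fee for deleting data before the minimum storage period."""
    remaining_days = max(0, minimum_days - days_stored)
    # Simplification: treat a month as 30 days when prorating.
    return size_gb * rate_per_gb_month * remaining_days / 30


RATE_PER_GB_MONTH = 0.004  # USD, assumed for illustration

# 5,000 GB deleted after 30 days: 60 days of storage are still owed.
fee_early = early_deletion_fee(5_000, 30, RATE_PER_GB_MONTH)

# The same data deleted after 120 days: past the minimum, so no fee.
fee_late = early_deletion_fee(5_000, 120, RATE_PER_GB_MONTH)

print(f"deleted at day 30:  ${fee_early:.2f}")
print(f"deleted at day 120: ${fee_late:.2f}")
```

A lifecycle rule that moves short-lived data into Glacier can trigger this fee repeatedly, so it helps to check how long objects typically live before archiving them.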

Azure Cloud Service Costs

Azure Monitor Log Analytics

Azure Monitor Log Analytics is the primary tool in the Azure portal to help you collect and analyze data generated by resources in your cloud and on-premises environments. With Log Analytics, you can understand how applications are performing and receive proactive issue notifications. But like all things of value, it comes at a price.

There are multiple factors to consider when enabling Azure Log Analytics that can impact your total cloud costs, including the region where your data is located, whether you choose pay-as-you-go or capacity reservations, the number of notifications and alerts you’d like to receive, and your desired data retention period. You can see Microsoft’s breakdown of Azure Monitor pricing and Log Analytics pricing here.

Azure Sentinel

Azure Sentinel provides intelligent security analytics across your enterprise and stores this data in your Azure Monitor Log Analytics workspace. Azure Sentinel offers two pricing models - Capacity Reservations or Pay-As-You-Go - and both are based on the volume of data ingested for analysis and stored.

If you choose Capacity Reservations, you have the opportunity to save costs, but only if you accurately estimate the volume of data you’ll ingest. Unpredictable costs come into play when you go above your reservation capacity and the additional data is charged at the Pay-As-You-Go rates. Without proper visibility and monitoring, Azure Sentinel costs can quickly get out of hand. You can learn more about Azure Sentinel pricing details here.
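The trade-off can be sketched with a small comparison, following the description above: a fixed daily reservation price, with any ingestion beyond the reserved volume billed at the Pay-As-You-Go rate. The function names and every price here are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def reservation_daily_cost(ingested_gb, reserved_gb, reservation_price, payg_rate_per_gb):
    """Daily cost with a capacity reservation; overage is billed at the pay-as-you-go rate."""
    overage_gb = max(0.0, ingested_gb - reserved_gb)
    return reservation_price + overage_gb * payg_rate_per_gb


def payg_daily_cost(ingested_gb, payg_rate_per_gb):
    """Daily cost with no reservation."""
    return ingested_gb * payg_rate_per_gb


# Hypothetical prices for illustration only.
RESERVED_GB = 100          # GB per day covered by the reservation
RESERVATION_PRICE = 200.0  # USD per day for that reservation
PAYG_RATE = 2.50           # USD per GB

for ingested in (60, 100, 180):
    reserved = reservation_daily_cost(ingested, RESERVED_GB, RESERVATION_PRICE, PAYG_RATE)
    payg = payg_daily_cost(ingested, PAYG_RATE)
    print(f"{ingested:>4} GB/day  reservation=${reserved:,.2f}  pay-as-you-go=${payg:,.2f}")
```

With these made-up numbers, the reservation loses money on a quiet day and still costs double its sticker price on a heavy one, which is the pattern to watch for in your own ingestion data.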

Azure Data Explorer (Kusto)

Azure Data Explorer, also known as Kusto, is a fully managed service for storing and running real-time analytics on big data. The information you can gather from Azure Data Explorer helps you monitor usage, processes, and service quality from your applications, websites, IoT devices, and more.

Depending on your region, the amount of data you’re processing, the frequency of your queries, and the types of retention policies you choose, costs can increase fast. You can estimate your cloud costs with the Azure Data Explorer pricing calculator, and we also encourage you to read more on the Azure blog about how to control costs in Azure Data Explorer using down-sampling and aggregation.

Google Cloud Service Costs

Google Firebase

Firebase is Google's platform for developers to build and improve mobile and web applications. With all its power and functionality, there are countless stories of runaway Firebase costs. Here are just two examples I encourage you to read so you know what to look out for:

Google BigQuery

Google BigQuery is a data warehouse that can process massive amounts of data in a short amount of time. As query usage and stored data grow, costs can run rampant. Costs also vary depending on how you query your data.

Google provides more information on how to control costs in their documentation, which you can use alongside the official BigQuery pricing page.
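To illustrate why the shape of a query matters, the sketch below assumes on-demand pricing, where a query is billed by the amount of data it scans. The function name and the price per TiB are assumptions for illustration, and the sketch ignores any free allowance or minimum charge.

```python
TIB = 1024 ** 4  # bytes in one tebibyte


def on_demand_query_cost(bytes_scanned, price_per_tib):
    """Estimate the cost of one query from the bytes it scans."""
    return bytes_scanned / TIB * price_per_tib


PRICE_PER_TIB = 6.25  # USD, assumed for illustration

# Selecting every column of a large table scans the whole table.
select_star = on_demand_query_cost(2 * TIB, PRICE_PER_TIB)

# Selecting only the needed columns scans a fraction of it.
two_columns = on_demand_query_cost(TIB // 4, PRICE_PER_TIB)

print(f"all columns: ${select_star:.4f} per run")
print(f"two columns: ${two_columns:.4f} per run")
```

The per-run difference looks small, but a dashboard or scheduled job that runs the wider query hundreds of times a day multiplies it quickly.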

How to Take Control of Unpredictable Cloud Service Costs

Although I’ve just provided a long list of cloud services that can lead to unpredictable cloud costs, this list isn’t necessarily exhaustive. There are other cloud services and nuances that can affect your cloud costs and your business’ bottom line.

To stay ahead of unpredictable cloud costs regardless of the source, it’s imperative to have visibility into all of your cloud services and resources - because you can’t control what you can’t see. Cloud service providers’ native monitoring tools are often insufficient because they don’t provide full visibility into multi-tenanted public cloud data centers. So you’ll want to establish visibility across your public cloud, hybrid cloud, and multicloud environments, as well as on-premises IT infrastructure.

Once you’ve established holistic visibility, you can see where you’ve wasted spend, start to optimize your environment, and eventually, automate remediation actions. When making changes, ensure your cloud management and finance teams are aligned with the broader business so there are no silos in business decisions and unexpected cloud costs due to miscommunication or misalignment. Each of these steps is part of building a mature cloud financial management practice.
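As one example of what a simple policy for catching cost fluctuations might look like, the sketch below flags any day whose cost exceeds a multiple of the trailing average. The function name, window, threshold, and cost figures are all hypothetical; this is not how any particular platform implements its policies.

```python
from statistics import mean


def find_cost_spikes(daily_costs, window=7, threshold=1.5):
    """Return (day_index, cost, baseline) for days costing more than threshold x the trailing average."""
    spikes = []
    for i in range(window, len(daily_costs)):
        baseline = mean(daily_costs[i - window:i])
        if baseline > 0 and daily_costs[i] > threshold * baseline:
            spikes.append((i, daily_costs[i], baseline))
    return spikes


# Hypothetical daily costs (USD) for a single service.
costs = [100, 102, 98, 101, 99, 103, 100, 104, 260, 105]

spikes = find_cost_spikes(costs)
for day, cost, baseline in spikes:
    print(f"day {day}: ${cost} vs trailing average ${baseline:.2f}")
```

Running a check like this per service, rather than on the total bill, makes it easier to trace a spike back to one of the services described earlier.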

Learn more about how our customers have tackled these challenges to take control of unpredictable cloud costs with our whitepaper: Building a Successful Cloud Financial Management Practice

Zack Siegert, Technical Account Manager

Zack is a member of the CloudHealth Customer Success Team and focuses on enabling customers to be successful in the cloud by providing insights into cloud management best practices, helping teams build a mature cloud strategy, and ensuring customers are maximizing their usage of the CloudHealth Platform.
