Cloud scalability, the ability to add and remove resources as you need them, has been one of the major factors driving businesses to the cloud. However, scaling up and down is not always as simple as it is made out to be. We investigate why cloud scalability tends to cause so many problems and suggest a solution.
Although the basic concept of cloud scalability is easy to understand, the problems for new cloud adopters start immediately because of the number of overlapping terms in use. On the surface, infrastructures (or their components) can be “scalable”, “elastic”, “auto-scalable”, “burstable”, or “right-sizable.” The terms sound similar, so what distinguishes one from another?
Cloud Scalability vs Cloud Elasticity
The first difference to address is cloud scalability vs cloud elasticity. In general usage, “cloud scalability” relates to the server space and resources used per online service or business application (i.e. the “application level”), whereas “cloud elasticity” relates to infrastructure as a whole (i.e. enabling the hypervisor to create instances or containers with the resources to meet overall demand).
Cloud scalability can depend on cloud elasticity when a load balancer is used to distribute application traffic across a number of servers (“horizontal scaling” or “scaling out”). Alternatively, cloud scalability can be achieved by over-provisioning the resources allocated to the application or by moving the application to a bigger instance (“vertical scaling” or “scaling up”).
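To make the idea of scaling out concrete, here is a minimal Python sketch of a load balancer spreading requests across servers. Round-robin is assumed purely for illustration (real load balancers offer several strategies), and the function name and server labels are hypothetical.

```python
from collections import Counter
from itertools import cycle


def distribute(requests, servers):
    """Assign each request to a server in round-robin order.

    Returns a list of (request, server) pairs.
    """
    if not servers:
        raise ValueError("at least one server is required")
    rotation = cycle(servers)
    return [(request, next(rotation)) for request in requests]


def load_per_server(assignments):
    """Count how many requests each server received."""
    return Counter(server for _, server in assignments)


if __name__ == "__main__":
    # Hypothetical server labels; scaling out from two to three servers
    # reduces the number of requests each one has to handle.
    print(load_per_server(distribute(range(6), ["server-a", "server-b"])))
    print(load_per_server(distribute(range(6), ["server-a", "server-b", "server-c"])))
```

Adding a server to the list is the "scale out" step: the same traffic is divided among more machines, so each one carries less of it.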
Auto-Scaling Can Be Both Horizontal and Vertical
Auto-scaling monitors the performance of applications and automatically adjusts the capacity to maintain steady, predictable performance and to ensure businesses only pay for the resources they use. Horizontal auto-scaling allows businesses to create rules to start or stop instances assigned to a resource when upper or lower thresholds are breached, whereas vertical auto-scaling allows businesses to create rules affecting the amount of CPU or RAM allocated to an existing instance.
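A horizontal auto-scaling rule of the kind described above can be sketched in a few lines of Python. This is an illustrative model only, not any provider's API: the `ScalingRule` fields, the use of average CPU as the metric, and the one-instance step size are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class ScalingRule:
    upper_threshold: float  # scale out when average CPU % rises above this
    lower_threshold: float  # scale in when average CPU % falls below this
    min_instances: int      # never run fewer instances than this
    max_instances: int      # never run more instances than this


def desired_instances(current: int, avg_cpu_percent: float, rule: ScalingRule) -> int:
    """Return the instance count the rule asks for, one step at a time."""
    if avg_cpu_percent > rule.upper_threshold:
        target = current + 1   # upper threshold breached: start an instance
    elif avg_cpu_percent < rule.lower_threshold:
        target = current - 1   # lower threshold breached: stop an instance
    else:
        target = current       # within bounds: leave capacity unchanged
    # Clamp the result to the configured minimum and maximum.
    return max(rule.min_instances, min(rule.max_instances, target))


if __name__ == "__main__":
    rule = ScalingRule(upper_threshold=70, lower_threshold=30,
                       min_instances=1, max_instances=4)
    print(desired_instances(2, 85, rule))  # busy: asks for one more instance
    print(desired_instances(2, 20, rule))  # idle: asks for one fewer instance
```

The minimum and maximum bounds matter as much as the thresholds: the minimum keeps the service available when traffic is quiet, and the maximum caps the bill if a rule misfires.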
Typically, auto-scaling itself is a free service offered by cloud service providers, but you will have to pay for the monitoring services it relies on (e.g. Amazon CloudWatch on AWS). Auto-scaling also has its own issues: horizontal auto-scaling doesn't always keep up with unexpected peaks in demand (instances can take up to five minutes each to load), while vertical auto-scaling usually requires downtime, so it is not an ideal way to maintain steady, predictable performance.
Burstable Cloud Scalability
To address the issue of unexpected peaks in demand, the leading cloud service providers offer burstable instances. These are typically small instances that businesses can run below their peak capacity, banking the capacity that is not being used. Then, should an unexpected peak in demand occur, the instance automatically draws against its banked capacity in order to obtain a “burst” of power when required.
It is advisable not to rely too heavily on burstable instances, as this type of cloud scalability only works when sufficient capacity has been banked. If the peak in demand is sustained, the instance's banked capacity can be exhausted quickly, leaving the service or application slow or unavailable.
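The banking behavior, and the way a sustained peak drains it, can be illustrated with a small Python simulation. This is a deliberately simplified model with made-up units, not any provider's actual credit accounting; `baseline`, `max_credits`, and the demand figures are assumptions chosen for the example.

```python
def simulate_burst(demand, baseline, max_credits, start_credits=0.0):
    """Simulate a burstable instance over a series of time steps.

    demand       -- capacity requested at each step
    baseline     -- capacity the instance can always deliver
    max_credits  -- upper limit on banked capacity
    Returns a list of (served, credits_remaining) tuples, one per step.
    """
    credits = start_credits
    history = []
    for requested in demand:
        if requested <= baseline:
            # Running below baseline: bank the unused capacity, up to the cap.
            credits = min(max_credits, credits + (baseline - requested))
            served = requested
        else:
            # Bursting: spend banked capacity, but never more than is available.
            extra = min(requested - baseline, credits)
            credits -= extra
            served = baseline + extra
        history.append((served, credits))
    return history


if __name__ == "__main__":
    # Two quiet steps bank capacity, then demand stays high for three steps.
    for served, credits in simulate_burst([10, 10, 50, 50, 50],
                                          baseline=20, max_credits=30):
        print(f"served={served:g} credits={credits:g}")
```

In this run the first high-demand step is only partly covered by the burst, and the following steps fall back to the baseline because nothing is left in the bank, which is the failure mode described above.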
Right-Sizing Cloud Scalability
The term “right-sizing” is most commonly used in relation to cloud cost optimization—typically reducing the size of over-provisioned resources in order to ensure businesses are not paying for services they are not using. However, “right-sizing” does not always have to mean “downsizing.” It can also mean increasing the capacity of resources allocated to a service or application to improve its performance.
In this respect, right-sizing closely resembles vertical scaling or vertical auto-scaling, as rules can be created to upsize or downsize a resource depending on demand. The difference between the two is that right-sizing is more often seen as a long-term measure, whereas vertical scaling (particularly vertical auto-scaling) more commonly addresses short-term fluctuations in demand.
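As a rough sketch of what a right-sizing rule might look like, the Python function below classifies a resource from its long-term CPU utilization. The thresholds and the choice of average and peak as the deciding statistics are assumptions for illustration; they are not taken from any particular cloud management platform.

```python
from statistics import mean


def rightsizing_recommendation(cpu_samples, low=20.0, high=80.0):
    """Recommend a size change from long-term CPU utilization samples (%).

    "downsize" -- even the peak stays below the low threshold
    "upsize"   -- the average sits above the high threshold
    "keep"     -- anything in between
    """
    average = mean(cpu_samples)
    peak = max(cpu_samples)
    if peak < low:
        return "downsize"   # over-provisioned: capacity is never needed
    if average > high:
        return "upsize"     # under-provisioned: consistently near the limit
    return "keep"


if __name__ == "__main__":
    # Hypothetical utilization histories sampled over a long period.
    print(rightsizing_recommendation([5, 10, 8]))
    print(rightsizing_recommendation([85, 90, 95]))
    print(rightsizing_recommendation([30, 50, 70]))
```

Using the peak rather than the average for the downsize check is a cautious choice: a resource that is idle on average but occasionally busy is left alone rather than shrunk.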
Managing the Many Elements of Scalability
In discussing the problems relating to cloud scalability, we have already identified three: slow responses to peaks in demand, downtime, and the exhaustion of banked capacity. A fourth problem is possibly the biggest of them all: managing the many elements of scalability.
To effectively manage the many elements of scalability across one cloud or multiple clouds, CloudHealth can be invaluable. CloudHealth identifies which resources are suitable for right-sizing (the long-term measure for optimizing cost and performance) and then allows you to take advantage of policy-driven automation in order to create alerts for resources in need of upsizing and downsizing. The process effectively results in the hands-free management of your scalable resources.
One of the major advantages CloudHealth has over other cloud management platforms is that it can help businesses better manage the many elements of scalability whether their assets are deployed in the cloud or in on-premises infrastructure. Data can be aggregated to give businesses greater visibility over their assets and enable them to make better-informed decisions.