While most servers spend the majority of their time well below peak usage, companies often pay for max usage 24/7.
Cloud providers let you scale usage up and down, but choosing the right schedule by hand is highly prone to human error.
Machine learning models can be used to predict server usage throughout the day and scale the servers to that predicted usage.
Depending on the number of servers, savings can be in the millions of dollars.
How big of a server do you need? Do you know? Enough to handle peak load, plus a little more headroom? How often is your server going to run at peak utilization? For two hours per day? Ten hours? If your server runs at peak load for only two hours per day, then you are paying for 22 hours of peak performance that you aren’t using. Multiply that inefficiency across many servers, and that’s a lot of money spent on compute power sitting idle.
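To put a rough number on that idle time, here is a minimal sketch in Python. The hourly rate is a made-up figure, and the calculation treats every non-peak hour as fully wasted, so it is an upper bound rather than a real bill: a smaller server during those hours would still cost something.

```python
def idle_fraction(peak_hours_per_day: float) -> float:
    """Fraction of each day that peak capacity is paid for but not needed."""
    if not 0 <= peak_hours_per_day <= 24:
        raise ValueError("peak_hours_per_day must be between 0 and 24")
    return (24 - peak_hours_per_day) / 24


def idle_cost_per_year(hourly_rate: float, peak_hours_per_day: float, servers: int = 1) -> float:
    """Yearly spend on hours when a peak-sized server is not at peak (upper bound)."""
    idle_hours = 24 - peak_hours_per_day
    return hourly_rate * idle_hours * 365 * servers


# Two peak hours per day, at a hypothetical rate of $1.00 per hour.
print(f"{idle_fraction(2):.0%}")
print(idle_cost_per_year(hourly_rate=1.0, peak_hours_per_day=2))
```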
Cloud Providers Make Scaling Up and Down Possible (with a Caveat)
If you’ve moved off-premises and are using a cloud provider such as AWS or Azure, it’s easy to reconfigure server sizes if you find that you need a bigger server or that you’re not fully utilizing the compute, as in the example above. You can also schedule these servers to resize at times when the workload is heavier. For example, you can schedule a server to scale up during nightly batch processes or during the day to handle customer transactions.
The ability to schedule is powerful, but it can be difficult to manage the specific needs of each server, especially when your enterprise uses many servers for a wide variety of purposes. The demands on a server can also change, perhaps without IT’s knowledge, so the system requires close monitoring. Managing server schedules becomes yet another task piled on top of all of IT’s other responsibilities. If only there were a solution that could recognize the needs of a server and create dynamic schedules accordingly, without any intervention from IT. This type of problem is a great fit for machine learning.
How Machine Learning Can Dynamically Scale Your Server Capacity (without the Guesswork)
Machine learning excels at taking data and creating rules. In this case, you could use a model to predict server utilization, and then use those predictions to dynamically create a schedule for each server.
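As a rough illustration of that predict-then-schedule idea, here is a minimal sketch in Python. It is not the model described in this post: a simple per-hour average stands in for a trained model, and the size tiers, thresholds, and headroom factor are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical size tiers: (max CPU utilization % the tier covers, size label).
SIZE_TIERS = [(30.0, "small"), (70.0, "medium"), (100.0, "large")]


def predict_hourly_utilization(history):
    """Predict utilization per hour of day as the mean of past observations.

    `history` is a list of (hour_of_day, cpu_percent) samples. A trained
    model would replace this averaging step.
    """
    by_hour = defaultdict(list)
    for hour, cpu in history:
        by_hour[hour].append(cpu)
    return {hour: mean(values) for hour, values in by_hour.items()}


def build_schedule(predicted, headroom=1.2):
    """Map each hour to the smallest tier covering predicted load plus headroom."""
    schedule = {}
    for hour, cpu in sorted(predicted.items()):
        target = min(cpu * headroom, 100.0)
        schedule[hour] = next(label for limit, label in SIZE_TIERS if target <= limit)
    return schedule


history = [(2, 10), (2, 14), (9, 50), (9, 54), (14, 85), (14, 90)]
print(build_schedule(predict_hourly_utilization(history)))
```

In this toy data set the quiet overnight hour maps to the small tier, the morning hour to medium, and the afternoon peak to large. Rerunning the same two steps on fresh utilization data is what makes the schedule dynamic.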
Server Optimization In Action
We previously built such an application for a client in the banking industry, leading to a 68% increase in efficiency and cost savings of $10,000 per year for a single server. If every one of the client’s other 2,000 servers saw similar savings, this method could save $20 million per year!
While the actual savings will depend on the number of servers employed and the efficiency at which they currently run, the cost benefits will be significant once the machine learning server optimization model is applied.
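The projection is simple multiplication, which a short Python sketch makes explicit. The `realization` parameter is a hypothetical knob, not something from the client engagement. It models the share of the per-server savings that the wider fleet actually achieves.

```python
def projected_annual_savings(per_server_savings, servers, realization=1.0):
    """Scale one server's yearly savings across a fleet.

    `realization` is a hypothetical factor between 0 and 1 for the share of
    the per-server savings that the rest of the fleet actually achieves.
    """
    if not 0 <= realization <= 1:
        raise ValueError("realization must be between 0 and 1")
    return per_server_savings * servers * realization


print(projected_annual_savings(10_000, 2_000))       # every server matches the pilot
print(projected_annual_savings(10_000, 2_000, 0.5))  # fleet averages half the pilot's savings
```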
If you’re interested in learning more about using machine learning to save money on your server usage, click here to contact us about our risk-free server optimization whiteboard session.
If the pandemic and our business applications have one thing in common, it’s the difficulty of preparing for the future. Just as we could not foresee the onset of the virus, we cannot always precisely determine the capacity required to run our applications effectively, no matter how much we plan.
When demand exceeds your application’s ability and capacity to run efficiently, it’s time to scale.
What is scalability?
Scalability is an application’s ability to increase or decrease overall support and performance in response to changes in demand. For example, how your company’s website responds to an increase in visitors depends on your application’s scalability. When met with this demand, you want to make sure your application can handle the increase so that it functions properly. Every application has capacity limits, and scaling is the act of raising them.
The question is: is scaling up or scaling out the right choice for your business?
What is horizontal scaling?
Horizontal scaling is a type of scaling in which additional resources, such as servers, nodes or instances, are added to a system in order to handle an increasing amount of workload or traffic. In horizontal scaling, multiple machines or servers are added to a network or cluster, working together to distribute the processing load, instead of relying on a single machine to handle all the tasks.
Horizontal scaling is commonly used in distributed computing environments, cloud computing platforms, and web applications, where demand and traffic can vary significantly over time. Horizontal scaling provides several benefits, including improved performance, increased availability, and better fault tolerance, as it can handle sudden spikes in traffic and balance the load across multiple resources.
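To see how adding machines spreads the work, here is a minimal Python sketch of round-robin distribution, one simple way a load balancer can assign requests. The server and request names are placeholders, and real load balancers typically also weigh health and current load.

```python
from collections import Counter
from itertools import cycle


def distribute(requests, servers):
    """Assign each request to the next server in turn (round-robin)."""
    if not servers:
        raise ValueError("at least one server is required")
    rotation = cycle(servers)
    return {request: next(rotation) for request in requests}


requests = [f"req-{i}" for i in range(6)]
two = Counter(distribute(requests, ["server-a", "server-b"]).values())
three = Counter(distribute(requests, ["server-a", "server-b", "server-c"]).values())
print(two)    # with two servers, each handles 3 of the 6 requests
print(three)  # adding a third server drops that to 2 each
```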
What is vertical scaling?
Vertical scaling is a type of scaling in which additional resources, such as CPU, memory, or storage, are added to a single machine or server to handle an increasing amount of workload or traffic. In vertical scaling, the capacity of a single machine is increased by adding more resources, such as increasing the number of cores or upgrading to a more powerful CPU.
Vertical scaling is commonly used in database systems, high-performance computing, and other applications where a single machine can handle the processing requirements of the workload. Vertical scaling provides several benefits, including improved performance and better resource utilization, as it allows for the efficient use of a single machine’s resources. However, vertical scaling has limitations, as the capacity of a single machine is finite and can become a bottleneck when the workload exceeds the machine’s capacity.
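The ceiling on vertical scaling can be sketched in a few lines of Python. The size catalog below is hypothetical and does not correspond to any provider’s real instance types.

```python
# Hypothetical instance sizes, smallest to largest: (label, vCPUs, memory in GiB).
SIZES = [("small", 2, 8), ("medium", 8, 32), ("large", 32, 128)]


def smallest_fit(vcpus_needed, memory_gib_needed):
    """Return the smallest size that meets both requirements, or None.

    None means the workload has outgrown the largest single machine,
    which is the ceiling of vertical scaling.
    """
    for label, vcpus, memory_gib in SIZES:
        if vcpus >= vcpus_needed and memory_gib >= memory_gib_needed:
            return label
    return None


print(smallest_fit(4, 16))    # fits on the medium size
print(smallest_fit(64, 256))  # no single machine is big enough
```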
What is vertical vs. horizontal scaling?
There are two different ways to scale: vertical scaling and horizontal scaling. Vertical scaling, also known as scaling up, is adding more power, or increasing the capacity of a single machine or server for better performance.
For example, you can scale up by adding resources such as CPU, RAM, or disk capacity to give your existing machine more processing power. In cloud terms, this translates into moving your application to a larger instance type. In the short term, vertical scaling creates a bigger, better machine for an application to run on. Additionally, vertical scaling keeps your data consistent, as it is stored on a single node or instance.
One caveat to scaling up, however, is that there are limits to the amount of hardware that can be added to a single machine. Relying on one machine also means that a hardware failure can take the whole application down. Vertical scaling is easy in the sense that changes are made only to the existing machine, but is easier better? Not necessarily.
Horizontal scaling, or scaling out, is when you add more machines or servers to your existing pool of resources. In cloud terms, this is often automated through Auto Scaling, where the cloud platform adjusts capacity to match demand. Rather than adding to a single machine as in scaling up, scaling out duplicates your current setup and spreads the work across separate resources.
Instead of changing the capacity of your existing server, you are decreasing the load on that server through additional, duplicate servers. More resources might appear more complex for your business, but scaling out pays off in the long run, especially for larger enterprises. Instead of worrying about upgrading hardware as with vertical scaling, horizontal scaling provides a more continuous and seamless upgrading process.
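The core calculation behind scaling out to match demand fits in a short Python sketch. The per-instance capacity of 200 requests per second is an assumed figure, and real auto scaling policies usually add cooldowns and upper limits on top of this.

```python
import math


def instances_needed(requests_per_second, capacity_per_instance, min_instances=1):
    """Number of identical instances required to serve the given load."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    return max(min_instances, math.ceil(requests_per_second / capacity_per_instance))


# Assume each instance can serve 200 requests per second.
for load in (50, 450, 1200):
    print(load, instances_needed(load, capacity_per_instance=200))
```

As demand rises from 50 to 1,200 requests per second, the pool grows from one instance to six, and it shrinks again when demand falls.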
Horizontal vs Vertical Scaling Pros and Cons
Which type of scaling is right for your business?
There are pros and cons to both horizontal and vertical scaling; however, horizontal scaling is currently trending due to its reliability and efficiency. Vertical scaling is simpler, while horizontal scaling may do more to optimize your business operations in the long run. Most commonly, businesses choose to scale out. Regardless of the environment a business operates in, scaling up typically requires downtime, which can disrupt a business’s operations.
There are several factors to consider when determining the scaling method that is right for you:
Flexibility: Horizontal scaling allows for flexibility because you can determine the configuration for your setup that optimizes cost and performance for your business needs. Costs are not optimized when scaling up, as you pay for the set price of the hardware.
Upgrades: With vertical scaling, hardware can only be upgraded to a limited extent. Horizontal scaling allows for continuous upgrades since you are not dependent on a single piece of equipment.
Redundancy: Another benefit of horizontal scaling is that a distributed cloud environment has no single point of failure. If one of your servers fails, the load balancer redirects the request to a different server. Vertical scaling, on the other hand, has a single point of failure, meaning that if the machine goes down, the application goes down with it. Scaling horizontally in the cloud removes that single point of failure.
Cost: While vertical scaling may come with a lower upfront cost compared to horizontal scaling, horizontal scaling optimizes cost over time.
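The redundancy point above can be sketched in Python. This is a simplified picture of what a load balancer does, with placeholder server names, and it assumes health checks have already determined which servers are up.

```python
def route(servers, healthy):
    """Return the first healthy server in the pool, or None if all are down.

    `servers` is an ordered list of server names; `healthy` is the set of
    names currently passing health checks.
    """
    for server in servers:
        if server in healthy:
            return server
    return None


pool = ["server-a", "server-b", "server-c"]
print(route(pool, healthy={"server-a", "server-b", "server-c"}))  # normal operation
print(route(pool, healthy={"server-b", "server-c"}))              # server-a down, traffic moves on
print(route(["server-a"], healthy=set()))                         # single machine down, nothing left
```

With three servers, losing one still leaves somewhere to send the request. With a single vertically scaled machine, the same failure leaves nothing to route to.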
Choosing a scaling method that meets your business needs may seem like a complicated choice, but it does not have to be. 2nd Watch is an AWS Premier Partner, a Microsoft Azure Gold Partner, and a Google Cloud Partner providing professional and managed cloud services to enterprises. Contact Us to take the next step in your cloud journey.