If you provide a web service or an application that handles multiple user requests at a time, chances are you have come across the terms load balancing and load balancer. Here we will explain what load balancing is, how it works, and how to make sure you distribute traffic with load balancers in a secure manner.
What is load balancing?
Load balancing is a technique used in computer networks and IT systems to distribute the load across multiple resources or servers. There are hardware-based load balancers and software-based ones. The former require a dedicated load balancing device.
A software-based load balancer runs on servers or on virtual machines in the cloud, optimizing the use of each resource's computing capacity. In this article we will focus on software-based balancers, with a particular emphasis on cloud load balancing.
What is load balancing in the cloud?
Cloud load balancing is a fully distributed, software-defined solution that balances user traffic across multiple backends to avoid congestion and ensure low latency. There are different types of load balancing depending on the scope of the traffic you are dealing with: global or regional.
Static vs. dynamic load balancing algorithms
We can distinguish between two types of load balancing algorithms: static and dynamic.
Static load balancing
Static load balancing algorithms follow fixed rules governing user traffic. They take the incoming application traffic and distribute it between the different servers without considering the current performance of each server.
Imagine a security check at the airport and a series of gates that divide the incoming passengers evenly between each security point. The system will work well if all security stations take the same amount of time to process passengers. If, however, one of the security points is slower, it will cause an imbalance that a static load balancing algorithm cannot account for.
Static load balancers distribute traffic evenly among all servers in the network. A static load balancer is therefore a good solution for a website with static content because traffic is usually very predictable for such websites.
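As a sketch, one common static policy, round robin, can be expressed in a few lines of Python. The backend addresses here are purely illustrative:

```python
from itertools import cycle

# Illustrative backend pool; a real deployment would use its own addresses.
BACKENDS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

def make_round_robin(pool):
    """Return a picker that hands out backends in a fixed rotation,
    ignoring each server's current load -- the static approach."""
    rotation = cycle(pool)
    return lambda request: next(rotation)

pick = make_round_robin(BACKENDS)
assignments = [pick(f"req-{i}") for i in range(6)]
# Six requests land evenly: each backend receives exactly two.
```

Note that the picker never looks at the request or at server health; this is exactly the limitation illustrated by the airport example above.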
Dynamic load balancing
Dynamic load balancing algorithms consider the current state and availability of all available servers and shift traffic from poorly performing servers to underutilized ones. This way traffic remains evenly distributed and uninterrupted even if individual servers are less efficient.
In the airport security check example, a dynamic load balancer would assign passengers to the least occupied security stations regardless of which door they came through. This way, if one station is slower at processing passengers, the other ones can pick up the workload.
A dynamic load balancer is therefore a good solution for websites and applications that experience spikes and drops in traffic that are difficult to predict. Dynamic load balancers make sure application availability remains high even if some of the servers are underperforming. They are, however, more difficult to configure.
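One widely used dynamic policy is least connections. A minimal Python sketch, assuming the balancer can observe each server's active connection count (the numbers below are made up):

```python
# Illustrative connection counts; a real balancer would refresh these
# from health checks and live metrics rather than a static dict.
active_connections = {"10.0.0.1": 12, "10.0.0.2": 3, "10.0.0.3": 7}

def route(request, conns):
    """Least-connections selection: send the request to the server
    currently handling the fewest connections, then record it."""
    backend = min(conns, key=conns.get)
    conns[backend] += 1
    return backend

first = route("req-1", active_connections)
# The least-loaded server ("10.0.0.2", with 3 connections) is chosen first.
```

Because the counts are updated on every decision, a slow server naturally stops accumulating new requests, which is the behavior the airport analogy describes.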
Application and network load balancers
The two main types of load balancing are application and network load balancers.
Application load balancing algorithms
The Application Load Balancer distributes HTTP and HTTPS traffic to backends, whether they are hosted internally or externally. It works at the application layer (Layer 7), and you can use it to route traffic based on the content of each request (e.g. the URL path or an HTTP header).
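The idea of content-based routing can be sketched in a few lines. This is a simplified illustration, not a real load balancer; the pool names and path prefixes are hypothetical:

```python
# Minimal sketch of Layer 7 (content-based) routing: inspect the URL
# path and pick a backend pool. Pool names and prefixes are illustrative.
ROUTES = {
    "/api/": "api-backend",
    "/static/": "cdn-backend",
}
DEFAULT_BACKEND = "web-backend"

def route_by_path(path):
    """Match the request path against known prefixes; fall back to a
    default pool when nothing matches."""
    for prefix, backend in ROUTES.items():
        if path.startswith(prefix):
            return backend
    return DEFAULT_BACKEND
```

A Layer 4 balancer, by contrast, never sees the path at all; it only sees IP addresses and ports.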
Network load balancing
The load that requires balancing is often network traffic or customer requests. Network load balancers operate at the transport layer (Layer 4) and route traffic based on IP protocol data. The purpose of network load balancing is to optimize performance and ensure reliability and availability of services.
In the process of load balancing, traffic gets distributed among multiple servers. For example, content delivery networks (CDNs) often contain a load balancer to be able to deliver content in a fast and reliable manner, regardless of user traffic.
DNS load balancing
Load balancing can also be performed at the domain level by configuring a domain in the Domain Name System (DNS). Through DNS load balancing, user requests to the domain are distributed across a group of servers.
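In its simplest form, DNS load balancing is just several A records for the same name; resolvers then rotate through the returned addresses. A hypothetical BIND-style zone fragment (domain and addresses are placeholders from the documentation ranges):

```
www.example.com.  300  IN  A  203.0.113.10
www.example.com.  300  IN  A  203.0.113.11
www.example.com.  300  IN  A  203.0.113.12
```

The short TTL (300 seconds) keeps clients from caching a single answer for too long, but note that plain DNS rotation cannot detect a failed server by itself.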
Cloud load balancing
A special case of load balancing is cloud load balancing. If your application is growing in popularity around the world, at some point you may be concerned about scaling it properly. You will also have to consider optimizing and securing the app while maintaining user satisfaction. This is where cloud load balancing comes into play.
How does load balancing work?
Let’s assume that one of the application’s users is located in New York, i.e. in the us-east4 region. For your application to operate most efficiently, it is a good idea to deploy a backend instance in this region and configure a load-balancing virtual IP address.
If you discover that other regions are also seeing an increase in users, create backend instances there as well. By doing this, you can be sure that there will be no changes to your virtual IP address or DNS service settings.
Automatic scaleup
Creating instances in different regions gives you some flexibility. What will happen if, for example, your users from India who are directed to installations in that region find them overloaded?
In this case, the instances will automatically scale up to respond to the additional load, and the users themselves will be redirected to backend instances in another region for the duration of this change. Once the instances in the original region have finished scaling, users will be seamlessly reassigned to them.
Cloud load balancing uses anycast virtual IP addresses, providing a single, global front-end virtual IP address. What is particularly important for users is the fact that cloud load balancing provides cross-region failover, fast auto-scaling and scaling to millions of queries per second. This is external load balancing at Layer 7.
Load balancing middleware
In a three-tier application, after the frontend you come across middleware and then data sources that you can interact with to fulfill the user’s request. If you want to incorporate load balancing algorithms, you will need an additional, fourth layer. That’s the internal load balancer, located between the frontend and the other internal layers.
Layer 4 internal load balancing is intended for TCP/UDP traffic on an RFC 1918 virtual IP address (i.e. a private address), and the client IP address is preserved. In this case, you get automatic health checks without a middle proxy server. Software-defined networking and data plane controls are used for load balancing.
How to use load balancing in the cloud?
Let’s take a look at the Cloud Load Balancing data model in practice in Google Cloud. Global anycast IPv4 or IPv6 virtual IP addresses are available for global HTTP(S) load balancing, associated with a forwarding rule that directs traffic to a target proxy that terminates the client session. In the case of HTTP(S), this is also where you install certificates.
Importantly, in this case we are not talking about a single device, but logic distributed throughout the entire infrastructure. The configured URL map provides Layer 7 routing and directs the client request to the appropriate backend service. Backend services can be managed groups of instances or groups of network endpoints for containerized workloads.
This is also where the capacity and health of the service are determined, and Cloud CDN allows content to be cached to improve performance. This is where you can configure firewall rules to control traffic to and from the backend.
The internal load balancing configuration works in the same way. We still have the forwarding rule, but here it points directly to the backend service. A forwarding rule has a virtual IP address, protocol, and up to five ports.
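As an illustration, an internal forwarding rule that points straight at a backend service might be created with gcloud roughly like this; the rule name, network, subnet, region, and backend service name are all placeholders:

```shell
# Hypothetical names: my-vpc, my-subnet, and be-internal are placeholders.
gcloud compute forwarding-rules create fr-internal \
    --load-balancing-scheme=INTERNAL \
    --network=my-vpc \
    --subnet=my-subnet \
    --region=us-east4 \
    --ip-protocol=TCP \
    --ports=80,443 \
    --backend-service=be-internal \
    --backend-service-region=us-east4
```

Note how, unlike the external case, there is no target proxy or URL map between the forwarding rule and the backend service.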
How to ensure security?
Google offers a set of comprehensive security best practices. What are they?
First, try to use SSL everywhere. With HTTP(S) and SSL proxy load balancing, you can use Google-managed certificates. In this case, Google is responsible for providing and managing the SSL certificate lifecycle.
Cloud Load Balancing also supports multiple SSL certificates, which is especially useful if you want to serve multiple domains from the same IP address and load balancing port. Global load balancing absorbs and disperses Layer 3 and Layer 4 volumetric attacks across Google’s global load balancing infrastructure.
In addition, with Cloud Armor you can protect against attacks from Layer 3 up to Layer 7, including application-level attacks. Using Identity-Aware Proxy and firewalls, you can authenticate and authorize access to your backends.
Running a classic load balancer for cloud applications
Let’s see how to configure an external application load balancer in a few steps using the backend of an instance group managed by Compute Engine.
SSL certificate resource configuration
Before you start creating a load balancer, be sure to configure the SSL certificate resource. Google recommends using certificates managed by Google.
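A Google-managed certificate can be created with a single gcloud command; the certificate name and domain below are placeholders:

```shell
# Domain is a placeholder; Google provisions and renews the certificate.
gcloud compute ssl-certificates create www-cert \
    --domains=www.example.com \
    --global
```

The certificate only becomes active once the domain's DNS points at the load balancer's IP address, so provisioning completes after the later DNS step.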
Permission configuration
Please note that you must have permissions to create Compute Engine instances, firewall rules, and reserved IP addresses in the project. The project owner and editor roles include these permissions. Alternatively, you can hold the following roles, depending on the task:
- Creating an instance – Instance Admin
- Adding and removing firewall rules – Security Admin
- Creating load balancer components – Network Admin
- Creating a project – Project Creator
Create a managed instance group
To properly configure a load balancer using the Compute Engine backend, your virtual machines must be in an instance group. A managed instance group provides virtual machines that run the backend servers of an external HTTP(S) load balancer.
Start by creating an instance template. Then create a managed instance group and select this template.
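Sketched with gcloud, these two steps might look as follows; the template and group names, machine type, image, and zone are example values:

```shell
# Illustrative names and values; adjust machine type, image, and zone.
gcloud compute instance-templates create lb-backend-template \
    --machine-type=e2-small \
    --image-family=debian-12 \
    --image-project=debian-cloud \
    --tags=allow-health-check

gcloud compute instance-groups managed create lb-backend-group \
    --template=lb-backend-template \
    --size=2 \
    --zone=us-east4-a
```

The `allow-health-check` network tag is an assumption here; it is referenced again by the firewall rule created in a later step.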
Add a named port to the instance group
In this step, you should define the HTTP service for your instance group and map the port name to the appropriate port number. The load balancing service then forwards traffic to the named port.
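Assuming the example group name from the previous step, the mapping can be set like this:

```shell
# Maps the name "http" to port 80 for the example instance group.
gcloud compute instance-groups set-named-ports lb-backend-group \
    --named-ports=http:80 \
    --zone=us-east4-a
```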
Create a firewall rule
An example is a fw-allow-health-check rule: an ingress rule that allows traffic from Google Cloud health checking systems to reach your backends.
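Such a rule could be created as follows; the network and target tag are example values, while the two source ranges are Google Cloud's documented health-check ranges:

```shell
# 130.211.0.0/22 and 35.191.0.0/16 are Google Cloud's documented
# health-check source ranges; the tag matches the backend instances.
gcloud compute firewall-rules create fw-allow-health-check \
    --network=default \
    --direction=INGRESS \
    --action=ALLOW \
    --rules=tcp:80 \
    --source-ranges=130.211.0.0/22,35.191.0.0/16 \
    --target-tags=allow-health-check
```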
Reserve an external IP address
Once your instances are up and running, the next step is to configure the global, static external IP address that your clients will use to connect to the load balancer.
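Reserving and then reading back the address might look like this; the address name is a placeholder:

```shell
gcloud compute addresses create lb-ipv4-1 \
    --ip-version=IPV4 \
    --global

# Print the reserved address, to be used later in the DNS A record.
gcloud compute addresses describe lb-ipv4-1 \
    --format="get(address)" \
    --global
```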
Configure the load balancer
We come to the seventh, very important step, i.e. configuration of the load balancer. To do this, you must go to the “Load balancing” subpage in the console and select the “Create load balancer” option.
If you have chosen the HTTP(S) protocol, go to the “Application Load Balancer (HTTP/S)” tab. In the following steps, remember to configure the frontend, the backend, and the host and path rules.
Domain connection to the load balancer
Once you have created your load balancer, note the IP address associated with it. To point a domain to the load balancer, create an A record using the domain registration service.
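Once the A record has propagated, you can check that the domain resolves to the load balancer's address; the domain here is a placeholder:

```shell
# Should print the load balancer's reserved IP once DNS has propagated.
dig +short www.example.com A
```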
If you add multiple domains to your SSL certificate, create an A record for each of them so that they all point to the IP address of the load balancer.
Testing the traffic sent to your instances
As we approach the end of this procedure, all that remains is to test the operation of the load balancer. Remember that FOTC experts can help you throughout the process. Help may be especially useful if you are new to the cloud environment. At the same time, you will learn how to configure your load balancers correctly to optimize costs.