Distribute the incoming traffic | Balancing load | Load Balancer
In publishing-a-website, we looked at a sample website running on a single Virtual Machine (VM) in the cloud. If your VM is terminated accidentally, people won’t be able to reach your website. To prevent this, you can use the “Termination Protection” feature. But things can still go wrong!
Let’s assume that you added more features to your website. It gathers information from users, writes to a database, and shows users various data. It’s not just a website anymore; it’s a web application now. As traffic increases, if the VM goes down, users can’t use your web application. To avoid a single point of failure, you decide to run it on 2 VMs.
You can just create an identical VM and run the web application on it. Now we just need to figure out a way to distribute incoming traffic across the VMs. This is what load balancers do. Examples of load balancers: Nginx, Elastic Load Balancer, AVI Networks. There are different algorithms, such as round-robin and hashing, to distribute incoming traffic across the machines.
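As a rough sketch, round-robin distribution can be as simple as cycling through the list of VMs (the server names here are made up for illustration):

```python
from itertools import cycle

# Hypothetical pool of identical VMs running the web application
servers = ["vm-1", "vm-2"]
pool = cycle(servers)

def route_request(request_id):
    """Round-robin: each request goes to the next server in the cycle."""
    server = next(pool)
    return f"request {request_id} -> {server}"

for i in range(4):
    print(route_request(i))
# requests alternate: vm-1, vm-2, vm-1, vm-2
```

A real load balancer does this at the connection or request level, but the core idea of taking turns is the same.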
Load balancers are agile. If a server goes down, the load balancer will not forward traffic to that machine. What if the server is up but the application is not? Load balancers can be configured with different types of health checks to verify whether the application is up or not.
As per Derek DeJonghe (the author of the “Load Balancing in the Cloud” report), load balancers help solve performance, economy, and availability problems.
We have 2 distinct types of load balancers. Load Balancer A can be configured as follows:
- all traffic forwarded to port 80 of the load balancer goes to the X group of servers
- all traffic forwarded to port 8080 goes to the Y group of servers
Load Balancer A is based on pure networking, which is why it’s called a Network Load Balancer (NLB). The NLB operates at the networking layer (Layer 4) of the OSI model. The blog post new-network-load-balancer-effortless-scaling-to-millions-of-requests-per-second/ shows how AWS’s NLB can handle millions of requests per second.
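The port-to-group mapping above can be sketched as a simple rule table (the group and server names are hypothetical):

```python
# Hypothetical Layer-4 style rule table: listener port -> target group
port_rules = {
    80:   ["x-server-1", "x-server-2"],  # X group of servers
    8080: ["y-server-1", "y-server-2"],  # Y group of servers
}

def pick_target_group_by_port(listener_port):
    """Forward based purely on which port the traffic arrived on."""
    return port_rules[listener_port]

print(pick_target_group_by_port(80))    # X group
print(pick_target_group_by_port(8080))  # Y group
```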
Load Balancer B can be configured as follows:
- if the incoming request is for test.jigarrathod.net, then forward it to the J-group of servers
- if the incoming request is for jigarrathod.net/payment, then forward it to the payment-group of servers
Load Balancer B is more in line with application usage and is known as an Application Load Balancer (ALB). The ALB operates at the application layer (Layer 7) of the OSI model.
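The two ALB rules above can be sketched as host- and path-based matching (the "default-group" fallback is my own addition, not part of the original rules):

```python
# Hypothetical Layer-7 rules: match on the Host header and the URL path
def pick_target_group_by_request(host, path):
    if host == "test.jigarrathod.net":
        return "J-group"
    if host == "jigarrathod.net" and path.startswith("/payment"):
        return "payment-group"
    return "default-group"  # fallback when no rule matches

print(pick_target_group_by_request("test.jigarrathod.net", "/"))    # J-group
print(pick_target_group_by_request("jigarrathod.net", "/payment"))  # payment-group
```

Because these rules need the Host header and the path, the load balancer must understand HTTP, which is exactly why this is a Layer-7 device.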
I have mentioned groups of servers several times. In the case of AWS, such a group is referred to as a Target Group.
The following diagram shows 2 different target groups:
- test — port 80 — http protocol
- test123 — port 80 — tcp protocol
While creating a load balancer:
- an Application Load Balancer would see target groups with the HTTP protocol
- a Network Load Balancer would see target groups with the TCP protocol
Load balancers can be configured to monitor the health of the servers. You can configure how often (the interval) you want to run the health check. It can run every 10 seconds, 30 seconds, a minute, 2 minutes, and so on.
Consider a simple health check that only looks at the HTTP response code. If the website is working fine, it returns a 200 HTTP response code; any other response code is considered a failed health check. Even though the server is working fine, it may fail a health check every once in a while. This is why you need to set a threshold!
If a health check on server-A has failed 3 or more times in a row, the server can be marked unhealthy. The load balancer will not forward traffic to such an unhealthy server. Here 3 is the threshold to mark a server as unhealthy and is referred to as the “Unhealthy threshold”.
Once the issue is fixed, server-A will start passing the health check. After 5 such successful health checks, it can be marked healthy again. The load balancer will then start forwarding traffic to it. Here 5 is the “Healthy threshold”.
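The threshold logic above can be sketched as a small counter per server (the thresholds of 3 and 5 match the example; the class and its names are my own):

```python
# Sketch of unhealthy/healthy threshold counting
UNHEALTHY_THRESHOLD = 3
HEALTHY_THRESHOLD = 5

class ServerHealth:
    def __init__(self):
        self.healthy = True
        self.consecutive_failures = 0
        self.consecutive_successes = 0

    def record_check(self, status_code):
        """Only a 200 response counts as a passing health check."""
        if status_code == 200:
            self.consecutive_successes += 1
            self.consecutive_failures = 0
            if not self.healthy and self.consecutive_successes >= HEALTHY_THRESHOLD:
                self.healthy = True
        else:
            self.consecutive_failures += 1
            self.consecutive_successes = 0
            if self.healthy and self.consecutive_failures >= UNHEALTHY_THRESHOLD:
                self.healthy = False

server_a = ServerHealth()
for code in [500, 500, 500]:   # 3 failed checks in a row -> unhealthy
    server_a.record_check(code)
print(server_a.healthy)        # False
for code in [200] * 5:         # 5 passing checks in a row -> healthy again
    server_a.record_check(code)
print(server_a.healthy)        # True
```

Note how a single stray failure resets the success counter, so a flaky server does not get marked healthy prematurely.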
How would a load balancer choose whom to forward traffic to?
The load balancer will choose a server from a target group based on an algorithm. Algorithms like round-robin just forward traffic whether a server is busy or not, whereas the least-connections algorithm makes its decision based on how many connections each server currently has. To learn more about other algorithms, check types-of-load-balancing-algorithms.
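The least-connections choice boils down to picking the server with the fewest active connections (the connection counts below are made up for illustration):

```python
# Hypothetical snapshot of active connection counts per server
active_connections = {"server-1": 12, "server-2": 3, "server-3": 7}

def least_connections(conns):
    """Pick the server that currently has the fewest connections."""
    return min(conns, key=conns.get)

print(least_connections(active_connections))  # server-2
```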
A user’s requests can land on different servers. Let’s say server-1 has served user-A’s request. Server-1 already has user-A’s information in memory. Wouldn’t it be great if user-A’s subsequent requests could also be served by server-1?
This can be achieved with sticky sessions, which are only supported by the Application Load Balancer.
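One common way to implement stickiness is with a cookie: the first response pins the user to a server, and later requests carrying that cookie go back to the same one. This is only a sketch of the idea; the cookie name "sticky-server" is made up (AWS manages its own cookie internally).

```python
import random

servers = ["server-1", "server-2"]

def route(cookies):
    """Return (chosen_server, updated_cookies) for one request."""
    # If the client already carries a valid sticky cookie, honor it
    if cookies.get("sticky-server") in servers:
        return cookies["sticky-server"], cookies
    # Otherwise pick a server and pin the client to it via the cookie
    chosen = random.choice(servers)
    return chosen, {**cookies, "sticky-server": chosen}

# First request: no cookie, so a server is chosen and pinned
server, cookies = route({})
# Subsequent requests with the cookie stick to the same server
assert route(cookies)[0] == server
```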
Can you add Load Balancer in DNS?
Yes. In the case of AWS, you would use an alias record. To get a higher-level idea of DNS, check out publishing-a-website.
There are a few more key differences between the ALB and the NLB (from the 2017 re:Invent talk https://youtu.be/z0FBGIT1Ub4?t=344):
NLB
- request headers do not get modified
ALB
- request headers are modified
- from the user’s end, connections terminate at the ALB
In this post, I covered
- Load Balancers
- Network Load Balancers (NLB)
- Application Load Balancer (ALB)
- Health checks
- Unhealthy | healthy threshold
- Some of the algorithms used by load balancers
- Sticky sessions
References:
- Nginx Load Balancing — load-balancing
- Load Balancing in the Cloud Report by Derek DeJonghe on O’Reilly Media, Inc.
- AVI networks — what-is-load-balancing
- AWS re:Invent 2014 — Elastic Load Balancing — https://youtu.be/K-YFw9-_NPE
- Elastic Load Balancing AWS 2013 — https://youtu.be/l5HSED9FiPI
- AWS re:Invent 2019 — Elastic Load Balancing for different workloads— https://youtu.be/HKh54BkaOK0
- Networking talk — https://youtu.be/gj4CD73Wmns
- Comparison of ELBs — Product_comparisons
Other notable resources
- Maglev A Fast and Reliable Software Network Load Balancer — pub44824
- Udemy Course — AWS Networking Masterclass — Amazon VPC and Hybrid Cloud by Neal Davis — aws-networking-amazon-vpc-aws-vpn-hybrid-cloud