Distribute the incoming traffic | Balancing load | Load Balancer
In publishing-a-website, we looked at a sample website running on a single Virtual Machine (VM) in the cloud. If your VM is terminated accidentally, people won’t be able to reach your website. To prevent this, you can use the “Termination Protection” feature. But things can still go wrong!
Let’s assume that you added more features to your website. It gathers information from users, writes to a database, and shows users various data. It’s not just a website anymore; it’s a web application now. As traffic increases, if the VM goes down, users can’t use your web application. To avoid a single point of failure, you decide to run it on 2 VMs.
You can just create an identical VM and run the web application on it. Now we just need to figure out a way to distribute incoming traffic across the VMs. This is what load balancers do. Examples of load balancers: Nginx, Elastic Load Balancer, AVI Networks. There are different algorithms, such as round-robin and hashing, to distribute incoming traffic across the machines.
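As a rough sketch, round-robin distribution can be as simple as cycling through the list of VMs (the server names here are made up for illustration):

```python
from itertools import cycle

# Hypothetical pool of identical VMs running the web application
servers = ["vm-1", "vm-2"]
pool = cycle(servers)

def route_request(request_id):
    """Round-robin: each request goes to the next server in the cycle."""
    server = next(pool)
    return f"request {request_id} -> {server}"

for i in range(4):
    print(route_request(i))
# requests alternate: vm-1, vm-2, vm-1, vm-2
```

A real load balancer does this at the connection or request level, but the core idea of taking turns is the same.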
Load balancers are agile. If a server goes down, the load balancer will not forward traffic to that machine. What if the server is up but the application is not? Load balancers can be configured with different types of health checks to verify whether the application is up or not.
As per Derek DeJonghe (the author of the “Load Balancing in the Cloud” report), load balancers help solve performance, economy, and availability problems.
We have 2 distinct types of load balancers. Load Balancer A can be configured as follows:
- all traffic forwarded to port 80 of the load balancer goes to the X group of servers
- all traffic forwarded to port 8080 goes to the Y group of servers
Load Balancer A is based on pure networking, which is why it’s called a Network Load Balancer (NLB). The NLB operates at the networking layer (Layer 4) of the OSI model. The blog post new-network-load-balancer-effortless-scaling-to-millions-of-requests-per-second/ shows how AWS’s NLB can handle millions of requests per second.
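The port-to-group mapping above can be sketched as a simple rule table (the group and server names are hypothetical):

```python
# Hypothetical Layer-4 style rule table: listener port -> target group
port_rules = {
    80:   ["x-server-1", "x-server-2"],  # X group of servers
    8080: ["y-server-1", "y-server-2"],  # Y group of servers
}

def pick_target_group_by_port(listener_port):
    """Forward based purely on which port the traffic arrived on."""
    return port_rules[listener_port]

print(pick_target_group_by_port(80))    # X group
print(pick_target_group_by_port(8080))  # Y group
```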
Load Balancer B can be configured as follows:
- if the incoming request is for test.jigarrathod.net, then forward it to the J-group of servers
- if the incoming request is for jigarrathod.net/payment, then forward it to the payment-group of servers
Load Balancer B is more in line with application usage and is known as an Application Load Balancer (ALB). The ALB operates at the application layer (Layer 7) of the OSI model.
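The two ALB rules above can be sketched as host- and path-based matching (the "default-group" fallback is my own addition, not part of the original rules):

```python
# Hypothetical Layer-7 rules: match on the Host header and the URL path
def pick_target_group_by_request(host, path):
    if host == "test.jigarrathod.net":
        return "J-group"
    if host == "jigarrathod.net" and path.startswith("/payment"):
        return "payment-group"
    return "default-group"  # fallback when no rule matches

print(pick_target_group_by_request("test.jigarrathod.net", "/"))    # J-group
print(pick_target_group_by_request("jigarrathod.net", "/payment"))  # payment-group
```

Because these rules need the Host header and the path, the load balancer must understand HTTP, which is exactly why this is a Layer-7 device.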
I have mentioned groups of servers several times. In the case of AWS, such a group is referred to as a Target Group.
The following diagram shows 2 different target groups:
- test — port 80 — http protocol
- test123 — port 80 — tcp protocol
While creating a load balancer:
- an Application Load Balancer would see target groups with the HTTP protocol
- a Network Load Balancer would see target groups with the TCP protocol
Load balancers can be configured to monitor the health of the servers. You can configure how often (the interval) you want to run the health check. It can run every 10 seconds, 30 seconds, a minute, 2 minutes, and so on.
Consider a simple health check that only looks at the HTTP response code. If the website is working fine, it returns a 200 HTTP response code; any other response code is considered a failed health check. Even though the server is working fine, it may fail a health check every once in a while. This is why you need to set a threshold!
If a health check on server-A has failed 3 or more times in a row, the server can be marked unhealthy. The load balancer will not forward traffic to such an unhealthy server. Here 3 is the threshold to mark a server as unhealthy and is referred to as the “Unhealthy threshold”.
Once the issue is fixed, server-A will start passing the health check. After 5 such successful health checks, it can be marked healthy again. The load balancer will then start forwarding traffic to it. Here 5 is the “Healthy threshold”.
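The threshold logic above can be sketched as a small counter per server (the thresholds of 3 and 5 match the example; the class and its names are my own):

```python
# Sketch of unhealthy/healthy threshold counting
UNHEALTHY_THRESHOLD = 3
HEALTHY_THRESHOLD = 5

class ServerHealth:
    def __init__(self):
        self.healthy = True
        self.consecutive_failures = 0
        self.consecutive_successes = 0

    def record_check(self, status_code):
        """Only a 200 response counts as a passing health check."""
        if status_code == 200:
            self.consecutive_successes += 1
            self.consecutive_failures = 0
            if not self.healthy and self.consecutive_successes >= HEALTHY_THRESHOLD:
                self.healthy = True
        else:
            self.consecutive_failures += 1
            self.consecutive_successes = 0
            if self.healthy and self.consecutive_failures >= UNHEALTHY_THRESHOLD:
                self.healthy = False

server_a = ServerHealth()
for code in [500, 500, 500]:   # 3 failed checks in a row -> unhealthy
    server_a.record_check(code)
print(server_a.healthy)        # False
for code in [200] * 5:         # 5 passing checks in a row -> healthy again
    server_a.record_check(code)
print(server_a.healthy)        # True
```

Note how a single stray failure resets the success counter, so a flaky server does not get marked healthy prematurely.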
How would a load balancer choose whom to forward traffic to?
The load balancer will choose a server from a target group based on an algorithm. Algorithms like round-robin just forward traffic whether a server is busy or not, whereas the least-connections algorithm makes its decision based on how many connections each server currently has. To learn more about other algorithms, check types-of-load-balancing-algorithms.
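The least-connections choice boils down to picking the server with the fewest active connections (the connection counts below are made up for illustration):

```python
# Hypothetical snapshot of active connection counts per server
active_connections = {"server-1": 12, "server-2": 3, "server-3": 7}

def least_connections(conns):
    """Pick the server that currently has the fewest connections."""
    return min(conns, key=conns.get)

print(least_connections(active_connections))  # server-2
```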
A user’s requests can land on different servers. Let’s say server-1 has served user-A’s request. Server-1 already has user-A’s information in memory. Wouldn’t it be great if user-A’s subsequent requests could also be served by server-1?
This can be achieved with sticky sessions, which are only supported by the Application Load Balancer.
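One common way to implement stickiness is with a cookie: the first response pins the user to a server, and later requests carrying that cookie go back to the same one. This is only a sketch of the idea; the cookie name "sticky-server" is made up (AWS manages its own cookie internally).

```python
import random

servers = ["server-1", "server-2"]

def route(cookies):
    """Return (chosen_server, updated_cookies) for one request."""
    # If the client already carries a valid sticky cookie, honor it
    if cookies.get("sticky-server") in servers:
        return cookies["sticky-server"], cookies
    # Otherwise pick a server and pin the client to it via the cookie
    chosen = random.choice(servers)
    return chosen, {**cookies, "sticky-server": chosen}

# First request: no cookie, so a server is chosen and pinned
server, cookies = route({})
# Subsequent requests with the cookie stick to the same server
assert route(cookies)[0] == server
```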
Can you add Load Balancer in DNS?
Yes. In the case of AWS, you would use an alias record. To get a higher-level idea of DNS, check out publishing-a-website.
There are a few more key differences between the ALB and the NLB (from the 2017 re:Invent talk https://youtu.be/z0FBGIT1Ub4?t=344):
NLB
- request headers do not get modified
ALB
- request headers are modified
- from the user’s end, connections terminate at the ALB
In this post, I covered
- Load Balancers
- Network Load Balancers (NLB)
- Application Load Balancer (ALB)
- Health checks
- Unhealthy | healthy threshold
- Some of the algorithms used by load balancers
- Sticky sessions
References:
- Nginx Load Balancing — load-balancing
- Load Balancing in the Cloud Report by Derek DeJonghe on O’Reilly Media, Inc.
- AVI networks — what-is-load-balancing
- AWS re:Invent 2014 — Elastic Load Balancing — https://youtu.be/K-YFw9-_NPE
- Elastic Load Balancing AWS 2013 — https://youtu.be/l5HSED9FiPI
- AWS re:Invent 2019 — Elastic Load Balancing for different workloads— https://youtu.be/HKh54BkaOK0
- Networking talk — https://youtu.be/gj4CD73Wmns
- Comparison of ELBs — Product_comparisons
Other notable resources
- Maglev A Fast and Reliable Software Network Load Balancer — pub44824
- Udemy Course — AWS Networking Masterclass — Amazon VPC and Hybrid Cloud by Neal Davis — aws-networking-amazon-vpc-aws-vpn-hybrid-cloud