Load Balancing Fundamentals: Algorithms & Setup

What Is Load Balancing?

A load balancer distributes incoming traffic across multiple backend servers. This provides:

High availability — If one server fails, traffic is routed to healthy servers.
Scalability — Add more backend servers to handle increased load.
Performance — Spread requests to prevent any single server from being overwhelmed.

Layer 4 vs Layer 7

Feature	Layer 4 (Transport)	Layer 7 (Application)
Operates on	TCP/UDP connections	HTTP requests
Speed	Faster (no content inspection)	Slower (must parse each request)
Routing decisions	IP + port only	URL, headers, cookies
SSL termination	Pass-through or terminate	Terminate and inspect
Use case	Generic TCP services	Web applications
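
To make the "Routing decisions" row concrete, here is a minimal Python sketch of what each layer can base its choice on. The pool names, the port 8443, the `/api/` and `/static/` prefixes, and the `api.example.com` host are all hypothetical; a real balancer expresses these rules in its own configuration language.

```python
from dataclasses import dataclass, field


@dataclass
class Connection:
    """What a Layer 4 balancer can see: addresses and ports only."""
    client_ip: str
    client_port: int
    dest_port: int


@dataclass
class HttpRequest:
    """What a Layer 7 balancer can see after parsing the request."""
    path: str
    headers: dict = field(default_factory=dict)


def route_l4(conn: Connection) -> str:
    """Pick a pool name from the destination port alone (hypothetical rule)."""
    return "api" if conn.dest_port == 8443 else "default"


def route_l7(req: HttpRequest) -> str:
    """Pick a pool name from the URL path, then from the Host header."""
    if req.path.startswith("/api/"):
        return "api"
    if req.path.startswith("/static/"):
        return "static"
    if req.headers.get("Host") == "api.example.com":  # hypothetical host
        return "api"
    return "default"
```

The Layer 4 function never touches request content, which is why it is cheaper but cannot send `/static/` traffic to a dedicated pool.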

Load Balancing Algorithms

Algorithm	How It Works	Best For
Round Robin	Sequential rotation	Equal-capacity servers
Weighted Round Robin	Rotation with capacity weights	Mixed-capacity servers
Least Connections	Sends to server with fewest active connections	Variable request duration
IP Hash	Hash of the client IP selects the server (stable while the pool is unchanged)	Session persistence
Random	Random selection	Large server pools
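
The selection rules in the table can be sketched in a few lines of Python. This is an illustration only: the server addresses are placeholders, and the weighted variant simply repeats each server `weight` times per cycle, whereas production balancers typically interleave picks more smoothly.

```python
import itertools
import random

SERVERS = ["10.0.0.1:8000", "10.0.0.2:8000", "10.0.0.3:8000"]


def round_robin(servers):
    """Return an iterator that yields servers in a fixed, repeating order."""
    return itertools.cycle(servers)


def weighted_round_robin(weights):
    """Return an iterator that yields each server in proportion to its weight.

    `weights` maps server -> positive integer weight.
    """
    expanded = [server for server, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)


def least_connections(active):
    """Return the server with the fewest active connections.

    `active` maps server -> current connection count.
    """
    return min(active, key=active.get)


def pick_random(servers, rng=random):
    """Return a uniformly random server."""
    return rng.choice(servers)


rr = round_robin(SERVERS)
first_four = [next(rr) for _ in range(4)]  # wraps back to the first server
busiest_avoided = least_connections({"a": 5, "b": 2, "c": 7})  # "b"
```

Least connections needs the balancer to track per-server state, while round robin and random need none, which is part of why they remain common defaults.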

Nginx as Load Balancer

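upstream backend {
    least_conn;                        # prefer the server with the fewest active connections
    server 10.0.0.1:8000 weight=3;     # higher weight: receives a larger share of traffic
    server 10.0.0.2:8000 weight=2;
    server 10.0.0.3:8000 backup;       # used only when the primary servers are unavailable
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;                 # forward to the upstream group above
        proxy_set_header Host $host;               # preserve the original Host header
        proxy_set_header X-Real-IP $remote_addr;   # pass the client address to the backend
    }
}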
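
The upstream block combines `least_conn` with `weight` and `backup`. One way to picture the combination is to pick the healthy primary with the lowest ratio of active connections to weight, and to fall back to backup servers only when no primary is healthy. The Python sketch below works under that assumption; it is a mental model, not nginx's actual implementation, and the `Upstream` class and `pick` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Upstream:
    addr: str
    weight: int = 1
    backup: bool = False
    healthy: bool = True
    active: int = 0  # connections currently open to this server


def pick(servers):
    """Choose the healthy server with the lowest active/weight ratio.

    Backup servers are considered only when no primary is healthy.
    """
    primaries = [s for s in servers if s.healthy and not s.backup]
    candidates = primaries or [s for s in servers if s.healthy and s.backup]
    if not candidates:
        raise RuntimeError("no healthy upstream servers")
    return min(candidates, key=lambda s: s.active / s.weight)


pool = [
    Upstream("10.0.0.1:8000", weight=3, active=3),  # ratio 1.0
    Upstream("10.0.0.2:8000", weight=2, active=1),  # ratio 0.5
    Upstream("10.0.0.3:8000", backup=True),
]
chosen = pick(pool)  # the second server, which has the lower ratio
```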

HAProxy Configuration

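frontend http_front
    mode http                      # Layer 7; HAProxy uses TCP mode unless this is set here or in defaults
    bind *:80
    default_backend http_back

backend http_back
    mode http
    balance leastconn              # fewest active connections wins
    option httpchk GET /health/    # probe each server over HTTP

    server web1 10.0.0.1:8000 check inter 5s fall 3 rise 2
    server web2 10.0.0.2:8000 check inter 5s fall 3 rise 2
    server web3 10.0.0.3:8000 check inter 5s fall 3 rise 2 backup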
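
The `option httpchk GET /health/` line assumes each backend answers on that path. As a rough sketch of what such an endpoint might look like, the Python below uses only the standard library and treats low disk space as unhealthy. The 100 MiB threshold, the `is_healthy` helper, and the `make_server` factory are hypothetical; a real application would usually expose this route through its own web framework.

```python
import shutil
from http.server import BaseHTTPRequestHandler, HTTPServer

MIN_FREE_BYTES = 100 * 1024 * 1024  # hypothetical threshold: 100 MiB


def is_healthy(path="/", min_free=MIN_FREE_BYTES):
    """Return True when the disk holding `path` has at least `min_free` bytes free."""
    return shutil.disk_usage(path).free >= min_free


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health/":
            self.send_error(404)
            return
        healthy = is_healthy()
        body = b"ok\n" if healthy else b"unhealthy\n"
        self.send_response(200 if healthy else 503)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep probe traffic out of the log


def make_server(port=8000):
    """Build the server; call .serve_forever() on the result to run it."""
    return HTTPServer(("0.0.0.0", port), HealthHandler)
```

Returning 503 rather than closing the connection lets the balancer distinguish "process is up but unhealthy" from "process is gone".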

Health Checks

Health checks prevent sending traffic to failed servers:

# HAProxy health check options:
check          # Enable health checking
inter 5s       # Check every 5 seconds
fall 3         # Mark down after 3 consecutive failures
rise 2         # Mark up after 2 consecutive successes
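
The `fall` and `rise` counters amount to a small state machine. The Python sketch below illustrates the idea with the same thresholds; the `HealthState` class is a hypothetical name, and HAProxy's real implementation has more states and options than shown here.

```python
from dataclasses import dataclass


@dataclass
class HealthState:
    fall: int = 3     # consecutive failures before marking the server down
    rise: int = 2     # consecutive successes before marking it up again
    up: bool = True
    streak: int = 0   # consecutive results that contradict the current state

    def record(self, ok: bool) -> bool:
        """Feed one check result and return the resulting up/down state."""
        if ok == self.up:
            self.streak = 0  # result agrees with the current state
            return self.up
        self.streak += 1
        threshold = self.rise if ok else self.fall
        if self.streak >= threshold:
            self.up = ok
            self.streak = 0
        return self.up


state = HealthState()
history = [state.record(r) for r in [False, False, False, True, True]]
# history == [True, True, False, False, True]
```

Requiring several consecutive results in each direction keeps a single dropped probe from flapping a server in and out of rotation.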

Types of health checks:

TCP — Can the load balancer open a connection? (Layer 4)
HTTP — Does /health/ return 200 OK? (Layer 7)
Custom — Application-specific checks (database connectivity, disk space).
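
The difference between a TCP check and an HTTP check can be sketched from the prober's side. The two Python functions below are illustrative helpers with hypothetical names, built on the standard library; they are not how any particular load balancer implements its probes.

```python
import http.client
import socket
import urllib.error
import urllib.request


def tcp_check(host: str, port: int, timeout: float = 2.0) -> bool:
    """Layer 4 check: succeed if a TCP connection can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_check(url: str, timeout: float = 2.0) -> bool:
    """Layer 7 check: succeed only if the URL answers 200 OK."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False
```

A server can pass `tcp_check` while failing `http_check`, for example when the process accepts connections but the application behind it is returning errors.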

Session Persistence

When a user's requests must keep going to the same backend (e.g., because sessions are stored on the server), there are three common approaches:

Cookie-based — Load balancer inserts a cookie identifying the backend server.
IP hash — Same source IP always routes to the same server.
Application-managed — Use shared session storage (Redis, database) and any balancing algorithm.
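
The cookie-based approach can be sketched as follows. This is a simplified Python illustration: the `SERVERID` cookie name, the server identifiers, and the `choose_backend` function are hypothetical, and real balancers usually sign or obfuscate the cookie value.

```python
import itertools

SERVERS = {"web1": "10.0.0.1:8000", "web2": "10.0.0.2:8000"}
COOKIE_NAME = "SERVERID"  # hypothetical cookie name
_rotation = itertools.cycle(SERVERS)


def choose_backend(cookies: dict) -> tuple[str, dict]:
    """Return (server_id, cookies_to_set) for one request.

    A request that already carries a valid cookie goes back to the same
    server. Otherwise the next server in rotation is chosen and a cookie
    is issued so that later requests stick to it.
    """
    server_id = cookies.get(COOKIE_NAME)
    if server_id in SERVERS:
        return server_id, {}
    server_id = next(_rotation)
    return server_id, {COOKIE_NAME: server_id}
```

A cookie naming a server that no longer exists is ignored, so the client is quietly reassigned instead of being sent to a dead backend.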

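Cookie-based persistence or shared session storage is preferred over IP hash. Behind CGNAT, many users share one public IP, so IP hash pins all of them to the same server and skews the load; a client whose IP changes (for example, when switching networks) also loses its session.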
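
The CGNAT problem is easy to see in a small Python sketch of IP hashing. The hash function and server list here are assumptions chosen for illustration (real balancers use their own hash schemes), and `203.0.113.7` is a documentation address standing in for a shared public IP.

```python
import hashlib
from collections import Counter

SERVERS = ["10.0.0.1:8000", "10.0.0.2:8000", "10.0.0.3:8000"]


def ip_hash(client_ip: str, servers=SERVERS) -> str:
    """Map a client IP to a server with a stable hash.

    Python's built-in hash() is salted per process for strings, so a
    digest is used to keep the mapping stable across restarts.
    """
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


# 1,000 users behind one shared public address all land on one server.
shared_ip_users = ["203.0.113.7"] * 1000
distribution = Counter(ip_hash(ip) for ip in shared_ip_users)
```

Here `distribution` contains a single server with a count of 1,000, while the other two servers receive nothing from this group of users.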
