Elastic Load Balancing — Traffic Distribution Layer

AWS's managed load balancing service — distributes incoming traffic across multiple targets, performs health checks, enables zero-downtime deployments, and acts as the intelligent entry point to your system. Not just traffic spreading — it's your traffic control layer.

⚡ ELB in 30 Seconds

Distributes traffic — routes incoming requests across multiple healthy targets (EC2, ECS, Lambda)
Health checks — automatically detects unhealthy targets and stops routing traffic to them
Three types — ALB (Layer 7, HTTP), NLB (Layer 4, TCP/UDP), GWLB (Layer 3, appliances)
Multi-AZ — spans Availability Zones for fault tolerance and high availability
Fully managed — AWS handles scaling, patching, and availability of the load balancer itself

Chapter One

What is Load Balancing

The Problem — Why Load Balancers Exist Introductory

You deploy your application on one server. Traffic grows. That single server handles 100 requests/second… then 1,000… then it crashes. Users see 503 errors. Your app is down. You can scale vertically (bigger server), but there's a ceiling — and a single point of failure remains.

👉 The fundamental problem: One server = one point of failure + limited capacity. You need multiple servers — but someone needs to decide which server handles each request. That's what a load balancer does.

What is a Load Balancer Introductory

A load balancer sits between clients and your servers. It receives all incoming requests and distributes them across multiple backend servers (called targets). No single target gets overwhelmed. If one target dies, the load balancer stops sending traffic to it — users never notice.

💥

Without Load Balancer

Single server handles everything
Server crashes → entire app goes down
Can't scale beyond one machine's capacity
Deployments require downtime
No health monitoring
No traffic distribution

⚖️

With Load Balancer

Traffic distributed across multiple targets
One target dies → others handle traffic seamlessly
Scale horizontally by adding more targets
Rolling deployments with zero downtime
Continuous health checks on all targets
Smart routing (path, host, headers)
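
The contrast above can be made concrete with a small simulation. This is only a sketch of round-robin distribution over healthy targets, not how ELB is implemented internally; the target names and the `distribute` helper are made up for illustration.

```python
from itertools import cycle


def distribute(requests, targets, healthy):
    """Assign each request to the next healthy target in round-robin order."""
    # Only targets currently marked healthy are eligible for traffic.
    pool = [t for t in targets if healthy.get(t, False)]
    if not pool:
        raise RuntimeError("no healthy targets available")
    rotation = cycle(pool)
    return {req: next(rotation) for req in requests}


# Hypothetical fleet: web-2 has failed its health checks.
assignments = distribute(
    requests=["r1", "r2", "r3", "r4"],
    targets=["web-1", "web-2", "web-3"],
    healthy={"web-1": True, "web-2": False, "web-3": True},
)
print(assignments)
```

The failed target simply drops out of the rotation, which is why clients do not notice a single target dying.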

Horizontal Scaling — The Key Concept Introductory

There are two ways to handle growth:

📐

Vertical Scaling (Scale Up)

Make one server bigger (more CPU, RAM)
Simple — no architecture change
Hard limit — eventually max out
Single point of failure remains
Downtime to resize

📊

Horizontal Scaling (Scale Out)

Add more servers of the same size
Virtually unlimited capacity
No single point of failure
Add/remove servers without downtime
Requires a load balancer to distribute traffic

Horizontal scaling is how modern cloud architectures work. Auto Scaling adds EC2 instances; the load balancer automatically includes them in traffic distribution. This is elastic computing — grow and shrink with demand.

Concept Diagram — How Load Balancing Works Introductory

Concept — Load balancer distributes traffic to healthy targets

AWS Diagram — ELB in a VPC Core

AWS Architecture — ELB distributes traffic across EC2 instances in multiple AZs

Architecture Diagram — Multi-AZ High Availability Core

High Availability — ELB ensures traffic flows only to healthy targets across AZs

Mental Model — The Traffic Controller Analogy Introductory

🚦

Without Load Balancer = One-Lane Road

All cars (requests) take the same single road
Traffic jam at peak hours (server overloaded)
Road closed for repair → no one gets through (downtime)
No alternative route
Accident = total shutdown

🛣️

With Load Balancer = Traffic Controller + Highway

Multiple lanes (targets) available
Traffic controller (ELB) directs cars to least-busy lane
One lane closed → controller re-routes to open lanes
New lanes added during rush hour (Auto Scaling)
Smart routing — trucks to one lane, cars to another (ALB rules)
Controller never sleeps (managed service, always available)

🧠 Key Insight

ELB is not just "traffic spreading" — it's your traffic control layer. It routes intelligently, monitors health continuously, enables zero-downtime deployments, and acts as the single entry point that decouples clients from your backend architecture.

What Makes ELB Special Core

AWS ELB is a fully managed service — you don't run load balancer software on EC2. AWS handles:

📈

Auto-Scaling

ELB itself scales automatically. Handles 10 req/s or 10M req/s — no pre-provisioning needed (ALB). AWS manages the underlying fleet.

🔄

Health Checks

Probes targets at a configurable interval (30 seconds by default). Once a target fails enough consecutive checks, new traffic goes only to healthy targets. The target rejoins automatically after it passes enough consecutive checks.

🌍

Multi-AZ

Spans multiple Availability Zones. If an entire AZ goes down, ELB routes 100% to remaining AZs. True fault tolerance.

ELB in the Request Flow Core

Understanding where ELB sits in the full AWS request flow:

① Route 53 → DNS resolution (domain → IP)
② CloudFront → Edge caching (optional, reduces latency)
③ ELB → Distributes traffic to targets
④ EC2 / ECS / Lambda → Processes the request
⑤ RDS / DynamoDB → Data layer

👉 ELB is the gateway between the internet and your compute layer. Everything flows through it.

Chapter Summary Introductory

Load balancing = distributing traffic across multiple targets to prevent overload
Horizontal scaling = adding more servers (not bigger servers) — requires a load balancer
Health checks = ELB monitors targets and stops routing to unhealthy ones
Multi-AZ = ELB spans AZs for fault tolerance — entire AZ failure is handled
Fully managed = AWS scales ELB itself; you don't manage load balancer infrastructure
ELB = traffic control layer — not just spreading, but intelligent routing + health + zero-downtime deployments

Chapter Two

ELB Types — ALB, NLB & GWLB

Three Load Balancers, Three Layers Introductory

AWS offers three types of Elastic Load Balancers — each operating at a different OSI layer and optimized for different workloads. Choosing the right one is a common exam question and a critical architecture decision.

👉 The key mental model: ALB = smart routing (HTTP/HTTPS), NLB = raw speed (TCP/UDP), GWLB = security appliances (Layer 3). Most web applications use ALB. High-performance systems use NLB. You rarely need GWLB unless you're running firewalls or IDS/IPS.

Overview — The Three Types Introductory

🌐

ALB — Application Load Balancer

Layer 7 (HTTP/HTTPS)
Content-based routing
Path, host, header, query string rules
WebSocket support
Lambda + EC2 + ECS targets
Best for: web apps, microservices, APIs

⚡

NLB — Network Load Balancer

Layer 4 (TCP/UDP/TLS)
Ultra-low latency (~100μs)
Millions of requests/sec
Static IP / Elastic IP support
Preserves source IP
Best for: gaming, IoT, financial, TCP services

🛡️

GWLB — Gateway Load Balancer

Layer 3 (IP packets)
Transparent to applications
GENEVE encapsulation
Inline traffic inspection
Third-party appliances
Best for: firewalls, IDS/IPS, DPI

Detailed Comparison Table Core

Feature	ALB	NLB	GWLB
OSI Layer	Layer 7	Layer 4	Layer 3
Protocols	HTTP, HTTPS, gRPC	TCP, UDP, TLS	IP (all protocols)
Latency	~ms	~μs (ultra-low)	~ms
Static IP	❌ (use DNS)	✅ (Elastic IP)	❌
Content Routing	✅ (path, host, headers)	❌ (port only)	❌
Security Groups	✅	✅ (added 2023)	❌
Lambda Targets	✅	❌	❌
Source IP Preserved	❌ (use X-Forwarded-For)	✅	✅
Throughput	High	Millions req/s	High

Concept Diagram — Layer Differences Introductory

Concept — Each ELB type operates at a different network layer

AWS Diagram — ALB vs NLB Architecture Core

AWS Architecture — ALB routes by content, NLB routes by connection

Architecture Diagram — When to Use Which Core

Decision Tree — Choose the right ELB type for your workload

ALB vs NLB — When to Choose Core

🌐

Choose ALB When

Web application with HTTP/HTTPS traffic
Microservices needing path-based routing
Multiple domains on one load balancer (host-based)
Need to route to Lambda functions
WebSocket connections needed
Authentication (OIDC/Cognito integration)
gRPC workloads
Redirect HTTP → HTTPS at LB level

⚡

Choose NLB When

Non-HTTP protocols (TCP, UDP, TLS)
Need static IP for whitelisting by partners
Extreme performance (millions req/s)
Ultra-low latency required (~100μs)
Need to preserve client source IP
Gaming servers, VoIP, IoT
PrivateLink (exposing services cross-account)
Long-lived TCP connections
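
The two checklists above can be condensed into a rough decision helper. This is a simplification for study purposes, assuming requirement keywords that are invented here (`static_ip`, `path_routing`, and so on); real designs weigh more factors than a keyword lookup.

```python
# Hypothetical requirement keywords, grouped by the layer that satisfies them.
LAYER7_NEEDS = {"path_routing", "host_routing", "lambda_targets",
                "oidc_auth", "http_redirect", "grpc", "websocket"}
LAYER4_NEEDS = {"static_ip", "udp", "tcp", "privatelink",
                "source_ip", "ultra_low_latency"}


def choose_load_balancer(needs):
    """Return 'ALB', 'NLB', 'GWLB', or 'NLB -> ALB' for a set of requirement keywords."""
    needs = set(needs)
    if "inline_appliance" in needs:
        return "GWLB"  # firewalls, IDS/IPS inserted transparently
    wants_layer7 = bool(needs & LAYER7_NEEDS)
    if "static_ip" in needs and wants_layer7:
        return "NLB -> ALB"  # static IP in front, content routing behind
    if needs & LAYER4_NEEDS:
        return "NLB"
    return "ALB"  # default for HTTP/HTTPS web workloads


print(choose_load_balancer({"path_routing", "oidc_auth"}))
print(choose_load_balancer({"udp", "source_ip"}))
print(choose_load_balancer({"static_ip", "host_routing"}))
```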

Gateway Load Balancer — Quick Overview Advanced

GWLB is the newest type (launched 2020) and the least commonly used. It's designed for a very specific use case: inserting third-party security appliances (firewalls, IDS/IPS) inline into your traffic flow without changing your application architecture.

🛡️

How GWLB Works

Traffic is transparently diverted to appliances
Uses GENEVE protocol (port 6081) for encapsulation
Appliances inspect traffic and return it
Original flow continues — app doesn't know
Scales fleet of appliances automatically
Deployed via Gateway Load Balancer Endpoint (GWLBE) in each VPC

⚠️

When NOT to use GWLB

Regular web applications → use ALB
Simple TCP forwarding → use NLB
If you don't run security appliances → skip GWLB
Not for direct application load balancing
Exam tip: rarely tested, but know it exists

GWLB — Exam Quick Reference

Layer	Layer 3 (IP packets) — transparent to applications
Protocol	GENEVE encapsulation (port 6081) — wraps original packets
Use with	Third-party firewalls (Palo Alto, Fortinet), IDS/IPS, DLP appliances
Deployment	GWLB → Target Group (appliance fleet) → returns cleaned traffic via GWLBE
Exam triggers	"inline inspection," "security appliance," "firewall," "IDS/IPS," "transparent"
NOT for	HTTP routing (use ALB), high-performance TCP (use NLB), direct LB

🧠 Key Insight

For most real-world scenarios: default to ALB for web apps. Switch to NLB when you see "static IP," "TCP/UDP," "extreme performance," or "source IP preservation." GWLB only appears for security appliance scenarios.

Chapter Summary Introductory

ALB (Layer 7) = HTTP/HTTPS smart routing — path, host, headers — web apps + microservices + Lambda
NLB (Layer 4) = TCP/UDP raw speed — static IP, ultra-low latency, millions req/s — gaming + IoT + financial
GWLB (Layer 3) = IP packet inspection — transparent inline security appliances — firewalls + IDS/IPS
Default choice = ALB for most web workloads; NLB when you need static IP or non-HTTP; GWLB only for security appliances
Exam keywords: "path-based" → ALB, "static IP" → NLB, "firewall appliance" → GWLB

Chapter Three

Application Load Balancer — Deep Dive

What Makes ALB Special Introductory

The Application Load Balancer is the most commonly used ELB type. It operates at Layer 7 (HTTP/HTTPS), which means it can read and understand the content of requests — URLs, headers, cookies, query strings — and make intelligent routing decisions based on that content.

👉 Key difference: NLB sees "a TCP connection on port 443." ALB sees "GET /api/users?page=2 with header Host: api.example.com and cookie session=abc123." ALB understands the request — that's what makes content-based routing possible.

ALB Components — Listener, Rules, Target Groups Core

Every ALB has three core components that work together:

👂

Listener

Checks for connections on a port + protocol
Typically: port 80 (HTTP) or 443 (HTTPS)
You can have multiple listeners
HTTPS listener needs an SSL certificate
Entry point for all requests

📋

Rules

Conditions that match incoming requests
Path: /api/*, /images/*, /admin/*
Host: api.example.com, www.example.com
Headers, query strings, HTTP method
Evaluated in priority order (1–50000)
Default rule = catch-all (required)

🎯

Target Group

Group of targets receiving traffic
Can be: EC2, ECS, Lambda, or IP
Has its own health check configuration
Defines port and protocol for targets
One rule → one target group action

Path-Based Routing Core

Path-based routing sends requests to different target groups based on the URL path. This is the most commonly used ALB routing feature — it lets one ALB serve multiple services:

GET /api/users → Target Group: API Service (ECS Fargate)
GET /web/dashboard → Target Group: Web Frontend (EC2)
GET /static/logo.png → Target Group: Static Assets (S3 via Lambda)
GET /admin/* → Target Group: Admin Service (separate EC2 fleet)

👉 One ALB, one domain, multiple services — path routing is microservice routing without a service mesh.
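
To illustrate how path rules are evaluated in priority order with a catch-all default, here is a minimal sketch. The target group names and priorities are invented, and Python's `fnmatchcase` stands in for ALB pattern matching (it also treats `[...]` as a character class, which ALB path patterns do not).

```python
from fnmatch import fnmatchcase

# (priority, path pattern, target group): hypothetical values for illustration.
RULES = [
    (10, "/api/*", "tg-api"),
    (20, "/web/*", "tg-web"),
    (30, "/static/*", "tg-static"),
    (40, "/admin/*", "tg-admin"),
]
DEFAULT_TARGET_GROUP = "tg-default"


def route(path, rules=RULES, default=DEFAULT_TARGET_GROUP):
    """Return the target group of the lowest-priority-number rule that matches."""
    for _priority, pattern, target_group in sorted(rules):
        if fnmatchcase(path, pattern):
            return target_group
    return default  # the default rule catches everything else


for path in ["/api/users", "/static/logo.png", "/unknown"]:
    print(path, "->", route(path))
```

The first matching rule wins, so more specific patterns should get lower priority numbers than broader ones.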

Host-Based Routing Core

Host-based routing routes based on the Host header — effectively serving multiple domains/subdomains from one ALB:

🏷️

Example: Multi-Tenant SaaS

api.example.com → API target group
www.example.com → Web target group
admin.example.com → Admin target group
*.customer.com → Customer-specific group
All served by ONE ALB

💡

Why This Matters

One ALB = one bill (cheaper than multiple LBs)
Wildcard support (*.example.com)
Combine with path rules for fine-grained control
Each domain can route to different backends
One SSL cert with SANs covers all domains

Concept Diagram — ALB Routing Flow Introductory

Concept — ALB inspects request content and routes to matching target group

AWS Diagram — ALB with Multiple Target Types Core

AWS Architecture — ALB routing to EC2, ECS, and Lambda targets

Advanced Routing — Headers, Query Strings, Methods Advanced

Beyond path and host, ALB can route on any HTTP attribute:

🔧

Advanced Condition Types

HTTP Header: X-Custom-Header = "beta" → TG-Beta
Query String: ?version=v2 → TG-V2
HTTP Method: POST /orders → TG-Write-Service
Source IP: 10.0.0.0/8 → TG-Internal
Combine conditions with AND logic

🎯

Advanced Actions

Forward: Send to target group (normal)
Redirect: HTTP→HTTPS, old→new URL
Fixed Response: Return 404/503 directly from ALB
Authenticate: OIDC/Cognito before forwarding
Weighted: 90% TG-Blue + 10% TG-Green
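
A weighted forward action can be pictured as a weighted random choice per request. The sketch below assumes that model and uses a seeded generator so the run is repeatable; it does not claim to reproduce ALB's internal selection logic.

```python
import random
from collections import Counter


def pick_target_group(weights, rng):
    """Choose one target group with probability proportional to its weight."""
    groups = list(weights)
    return rng.choices(groups, weights=[weights[g] for g in groups], k=1)[0]


rng = random.Random(42)  # fixed seed so the sketch is repeatable
weights = {"TG-Blue": 90, "TG-Green": 10}
counts = Counter(pick_target_group(weights, rng) for _ in range(10_000))
print(counts)
```

Shifting the weights step by step (90/10, then 50/50, then 0/100) is how a canary or blue-green rollout moves traffic without changing DNS.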

Architecture Diagram — ALB for Microservices Advanced

Production Pattern — Single ALB routing to microservices with weighted deployment

ALB Features — Deep List Advanced

✅

ALB Can Do

Path-based & host-based routing
HTTP → HTTPS redirect (built-in)
Fixed response (maintenance page from LB)
OIDC / Cognito authentication at LB level
Sticky sessions (cookie-based)
WebSocket and HTTP/2 support
gRPC support
Lambda targets (serverless backend)
Weighted target groups (canary/blue-green)
Access logs to S3
WAF integration (Web Application Firewall)
Slow start mode for new targets

❌

ALB Cannot Do

Static IP (use NLB or Global Accelerator)
Preserve source IP directly (use X-Forwarded-For)
Non-HTTP protocols (TCP, UDP, MQTT → NLB)
Ultra-low latency (μs) — ALB adds ~ms
PrivateLink / VPC endpoint service (→ NLB)
Inline IP packet inspection (→ GWLB)
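
Because the ALB terminates the client connection, targets read the client address from the X-Forwarded-For header. The helper below is a sketch that assumes the ALB appends the connecting address as the last entry and that it is the only trusted proxy in the path; `client_ip_from_xff` and its `trusted_proxy_hops` parameter are illustrative names.

```python
def client_ip_from_xff(header_value, trusted_proxy_hops=1):
    """Return the address appended by the outermost trusted proxy, or None.

    Entries further left may have been supplied by the client itself,
    so they should not be trusted for security decisions.
    """
    parts = [p.strip() for p in header_value.split(",") if p.strip()]
    if trusted_proxy_hops < 1 or len(parts) < trusted_proxy_hops:
        return None
    return parts[-trusted_proxy_hops]


print(client_ip_from_xff("203.0.113.7"))
print(client_ip_from_xff("10.1.1.1, 203.0.113.7"))
```

In the second call the leftmost value could be forged by the caller, which is why the sketch reads from the right.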

ALB Operational Details Advanced

⏱️

Idle Timeout

Default: 60 seconds
If no data sent/received for 60s → connection closed
WebSocket: increase toward 3600 seconds (the maximum is 4000)
File uploads: increase for large files
Short APIs: decrease to 30s (frees resources)
Target keep-alive should exceed ALB idle timeout

🐢

Slow Start Mode

New targets receive gradually increasing traffic
Duration: 30–900 seconds (configurable)
Protects cold-start instances from sudden full load
JVM warm-up, cache priming, lazy initialization
Traffic ramps linearly over configured duration
Disabled by default (set to 0)
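
The linear ramp described above can be written as a one-line formula. This sketch assumes the share of a target's normal traffic grows in proportion to elapsed time; the function name and parameters are illustrative.

```python
def slow_start_share(seconds_since_healthy, duration):
    """Fraction of its normal traffic share a new target receives during slow start."""
    if duration <= 0:
        return 1.0  # slow start disabled: full share immediately
    return min(1.0, max(0.0, seconds_since_healthy / duration))


for t in (0, 30, 60, 120):
    print(t, slow_start_share(t, duration=60))
```

With a 60-second duration, a target gets half its share at 30 seconds and its full share from 60 seconds onward.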

🔒

Deletion Protection

Prevents accidental ALB deletion
Must be explicitly disabled before deletion
Enable for all production load balancers
Terraform: enable_deletion_protection = true
Console: shows error if you try to delete

📏

ALB Limits

100 rules per ALB, not counting default rules (default quota, can request increase)
5 conditions per rule (AND logic)
Up to 5 actions per rule
100 target groups per ALB (default quota)
For large microservices: chain ALBs or use NLB→ALB
Multiple listeners on same ALB (e.g., 80 + 443)

🧠 Key Insight

ALB is your microservice router. One ALB replaces what used to require multiple load balancers or a dedicated API gateway. Path routing + host routing + weighted targets = a lightweight service mesh at the infrastructure level.

Chapter Summary Introductory

ALB = Layer 7 — reads HTTP content and routes based on path, host, headers, cookies, query strings
Components: Listener (port+protocol) → Rules (conditions) → Target Groups (destinations)
Path routing: /api/* → Service A, /web/* → Service B — one LB, multiple services
Host routing: api.example.com → different backend than www.example.com
Target types: EC2 instances, ECS tasks, Lambda functions, IP addresses — mix freely
Advanced: weighted target groups for canary deployments, OIDC auth, fixed responses, WAF
Limits: 100 rules per ALB (default quota), 5 conditions per rule, 100 target groups per ALB
Idle timeout: default 60s — increase for WebSocket (3600s) and large uploads
Slow start: ramp up traffic to new targets over 30-900s (protects cold starts)
Cannot: static IP, source IP preservation (use X-Forwarded-For), non-HTTP traffic

Chapter Four

Network Load Balancer — Deep Dive

What Makes NLB Different Introductory

The Network Load Balancer operates at Layer 4 (Transport layer) — it sees TCP/UDP connections, not HTTP content. It doesn't inspect payloads, doesn't read headers, doesn't understand URLs. It simply forwards connections at wire speed. This makes it extremely fast — latency measured in microseconds, not milliseconds.

👉 The key difference: ALB is like a concierge who reads your invitation and directs you to the right room. NLB is like a high-speed highway toll booth — it doesn't care what's in your car, it just routes you through as fast as physically possible.

NLB Core Characteristics Core

⚡

Performance

Millions of requests per second
Ultra-low latency: ~100 microseconds
Handles sudden traffic spikes without warm-up
No pre-provisioning needed
Designed for volatile, bursty workloads
Connection-level load balancing (not request-level)

📌

Static IP / Elastic IP

One static IP per AZ (automatically assigned)
Can attach Elastic IP per AZ (bring your own)
IP never changes — firewall-friendly
Partners can whitelist your IP
DNS resolves to these IPs (not rotating like ALB)
Critical for IP-based whitelisting scenarios

🔍

Source IP Preservation

Client's real IP reaches the target
No X-Forwarded-For needed (unlike ALB)
Target sees: source IP = client IP
Important for logging, geo-restriction, security
Works by default for instance targets

🚫

What NLB Cannot Do

No content-based routing (no path/host)
No HTTP header inspection
No Lambda targets
No built-in authentication (OIDC)
No WebSocket-specific handling
No WAF integration directly

NLB vs ALB — Protocol Handling Core

The fundamental difference in how they process traffic:

ALB sees: "GET /api/users HTTP/1.1\nHost: api.example.com\nAuth: Bearer token..."
ALB decides: path matches /api/* → forward to API target group

NLB sees: "TCP connection → destination port 443"
NLB decides: port 443 → forward to target group for port 443
(NLB doesn't open the packet — just routes the connection)
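
Connection-level balancing can be sketched as hashing the fields that identify a flow and using the result to pick a target. The real NLB flow hash is internal to AWS; the SHA-256 choice, the tuple layout, and the addresses below are assumptions made for illustration.

```python
import hashlib


def pick_target(flow, targets):
    """Map a flow tuple to one target so every packet of that flow lands together."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(targets)
    return targets[index]


targets = ["10.0.1.10", "10.0.2.10", "10.0.3.10"]
# (protocol, source IP, source port, destination IP, destination port)
flow = ("tcp", "203.0.113.7", 51514, "198.51.100.20", 443)

first = pick_target(flow, targets)
again = pick_target(flow, targets)
print(first, again)
```

No payload is read at any point: the decision depends only on connection metadata, which is what keeps Layer 4 forwarding fast.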

Concept Diagram — NLB Connection Flow Introductory

Concept — NLB forwards TCP/UDP connections at wire speed without inspection

AWS Diagram — NLB with Static IP Core

AWS Architecture — NLB with Elastic IPs for partner whitelisting

NLB Use Cases Core

🎮

Gaming

Millions of concurrent TCP/UDP connections
Ultra-low latency mandatory
Static IP for DNS pointing
Long-lived connections

💰

Financial / Trading

Microsecond latency requirements
Static IP for partner whitelisting
High-frequency trading connections
TLS passthrough

📡

IoT / Real-time

MQTT over TCP (IoT devices)
Millions of device connections
UDP for telemetry/streaming
Source IP for device identification

Architecture Diagram — NLB + PrivateLink Advanced

Advanced Pattern — NLB enables AWS PrivateLink for cross-account service exposure

TLS on NLB — Termination Options Advanced

🔐

TLS Termination at NLB

NLB decrypts TLS — sends plain TCP to targets
Attach ACM certificate to NLB
Offloads CPU-intensive crypto from targets
Targets don't need certificates
Use when: targets are in private subnet

🔄

TLS Passthrough

NLB forwards encrypted traffic as-is
Target handles TLS termination
NLB never sees plaintext data
End-to-end encryption preserved
Use when: compliance requires end-to-end TLS

NLB Operational Details Advanced

💰

Cross-Zone Load Balancing — Cost

NLB cross-zone is OFF by default
When enabled: inter-AZ data transfer charges apply
Unlike ALB (on by default, with no inter-AZ data transfer charge)
Enable when: uneven target distribution across AZs
Disable when: traffic is evenly distributed
Can be expensive for high-throughput workloads
Exam tip: know this cost difference vs ALB
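
The effect of cross-zone load balancing is easiest to see with arithmetic. The sketch below assumes each enabled AZ has at least one target and that clients spread evenly across the load balancer nodes; the AZ names and target counts are made up.

```python
def per_target_share(targets_per_az, cross_zone):
    """Return the fraction of total traffic each target in an AZ receives."""
    total_targets = sum(targets_per_az.values())
    az_count = len(targets_per_az)
    shares = {}
    for az, count in targets_per_az.items():
        if cross_zone:
            # Every node can reach every target: all targets share equally.
            shares[az] = 1 / total_targets
        else:
            # Each node gets 1/az_count of the traffic and keeps it in its own AZ.
            shares[az] = (1 / az_count) / count
    return shares


fleet = {"az-a": 2, "az-b": 8}
print(per_target_share(fleet, cross_zone=False))
print(per_target_share(fleet, cross_zone=True))
```

With cross-zone off, the two targets in az-a each carry 25% of the traffic while the eight in az-b carry 6.25% each; turning it on evens that out to 10% per target.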

📡

NLB UDP Behavior

UDP is connectionless — no TCP handshake
Does NOT support sticky sessions for UDP
NLB has no UDP health check probe
Health checks for UDP targets use TCP, HTTP, or HTTPS
Expose a TCP or HTTP health port alongside the UDP service
UDP flows tracked by 5-tuple (src/dst IP+port+proto)
Use for: DNS, game state, telemetry, VoIP

🧠 Key Insight

NLB is for when you need raw TCP/UDP performance, static IPs, or PrivateLink. If your question mentions "static IP," "millions of requests," "UDP," "TCP passthrough," or "PrivateLink" — the answer is NLB. For everything HTTP-related, use ALB instead.

Chapter Summary Introductory

NLB = Layer 4 — sees TCP/UDP connections, not HTTP content — routes by port only
Ultra-fast — ~100μs latency, millions of req/s, no warm-up needed
Static IP — one per AZ, supports Elastic IP — critical for firewall whitelisting
Source IP preserved — targets see real client IP (no X-Forwarded-For needed)
PrivateLink — an NLB backs the VPC Endpoint Service for interface endpoints (cross-account private access); an ALB cannot
TLS options — terminate at NLB (offload crypto) or passthrough (end-to-end encryption)
Cross-zone — OFF by default, inter-AZ data transfer costs when enabled (unlike ALB which is free)
UDP — connectionless, no sticky sessions, 5-tuple flow tracking
Use cases — gaming, financial trading, IoT, B2B APIs needing static IP, PrivateLink services

Chapter Five

Target Groups & Health Checks

What is a Target Group Introductory

A Target Group is the logical grouping of targets (instances, containers, Lambda functions, or IPs) that receive traffic from the load balancer. Think of it as a "destination pool" — the load balancer sends traffic to the group, and the group distributes it among its members.

👉 Mental model: Listener receives traffic → Rules evaluate → Matching rule forwards to Target Group → Target Group distributes to individual targets. The target group is the bridge between routing rules and actual compute.

Target Types Core

🖥️

Instance Targets

Register by instance ID
ELB routes to primary private IP
Port specified at registration
Used with: EC2 instances, ASG
Source IP: NLB preserves, ALB uses X-Forwarded-For
Most common target type

📦

IP Targets

Register by IP address
Can target IPs outside the VPC
On-premises servers via Direct Connect
Other VPCs (peered or Transit Gateway)
ECS tasks with awsvpc networking
Multiple ports on same instance

Lambda Targets (ALB only)

Register a Lambda function
ALB converts HTTP → Lambda event
Lambda response → HTTP response
Serverless backend behind ALB
One Lambda per target group
Great for lightweight APIs
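
The HTTP-to-event conversion can be illustrated with a small handler. This is a sketch assuming the standard ALB event and response fields with multi-value headers turned off; the sample event is hand-written and trimmed, not captured from a real load balancer.

```python
import json


def lambda_handler(event, context):
    """Handle an ALB-originated request and return an ALB-compatible response."""
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "statusDescription": "200 OK",
        "isBase64Encoded": False,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"hello, {name}", "path": event.get("path")}),
    }


# Trimmed, hand-written example of what the ALB passes to the function.
sample_event = {
    "httpMethod": "GET",
    "path": "/api/greet",
    "queryStringParameters": {"name": "ada"},
    "headers": {"host": "api.example.com"},
    "body": "",
    "isBase64Encoded": False,
}

response = lambda_handler(sample_event, None)
print(response["statusCode"], response["body"])
```

The ALB turns the returned dictionary back into an HTTP response, so the status code, headers, and body must all be present in that shape.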

⚓

ALB Targets (NLB only)

Register an ALB as NLB target
NLB (static IP) → ALB (content routing)
Best of both worlds combination
Static IP + path/host routing
Exam favorite: "static IP + HTTP routing"
Introduced in 2021

Health Checks — How They Work Core

Health checks are the heartbeat of load balancing. Without them, ELB would blindly send traffic to dead targets. Health checks continuously probe each registered target — if a target fails, ELB stops routing traffic to it. When it recovers, traffic resumes.

🏓

How It Works

ELB sends a probe (HTTP GET or TCP connect)
Target must respond within timeout
Success = healthy response (e.g., HTTP 200)
Failure = timeout or wrong status code
Consecutive failures = mark unhealthy
Consecutive successes = mark healthy again

⚙️

Configuration

Protocol: HTTP, HTTPS, TCP, gRPC
Path: /health, /status, /ping
Port: traffic port or custom
Interval: 5-300 sec (default 30)
Timeout: 2-120 sec (default 5)
Healthy threshold: 2-10 (default 5)
Unhealthy threshold: 2-10 (default 2)

✅

Success Criteria

HTTP: status code match (200-499)
TCP: connection established
gRPC: gRPC status code
Default: 200 OK
Can set: "200-299" or "200,302"
Path should be lightweight (/health)
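
The threshold behaviour can be modelled as a small state machine. The sketch below uses the default thresholds listed above (5 consecutive successes, 2 consecutive failures); the `TargetHealth` class is an illustration, not an AWS API.

```python
class TargetHealth:
    """Track one target's health state from a stream of probe results."""

    def __init__(self, healthy_threshold=5, unhealthy_threshold=2):
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.state = "initial"
        self._ok_streak = 0
        self._fail_streak = 0

    def record(self, success):
        """Record one probe result and return the resulting state."""
        if success:
            self._ok_streak += 1
            self._fail_streak = 0
            if self._ok_streak >= self.healthy_threshold:
                self.state = "healthy"
        else:
            self._fail_streak += 1
            self._ok_streak = 0
            if self._fail_streak >= self.unhealthy_threshold:
                self.state = "unhealthy"
        return self.state


target = TargetHealth()
probes = [True] * 5 + [False] * 2 + [True] * 5
history = [target.record(ok) for ok in probes]
print(history)
```

A single failed probe does not remove the target; it takes two in a row. With a 30-second interval that means roughly 60 seconds to detect a failure and about 150 seconds of clean probes before traffic returns.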

Concept Diagram — Health Check Lifecycle Introductory

Concept — Health check states: Initial → Healthy → Unhealthy → Healthy (recovery)

AWS Diagram — Target Group with Health Checks Core

AWS Architecture — ELB health checks detect failure and re-route traffic automatically

Deregistration Delay (Connection Draining) Advanced

When a target is removed (scaling down, deployment, or unhealthy), ELB doesn't kill existing connections immediately. Deregistration delay gives in-flight requests time to complete:

⏱️

How It Works

Target marked for deregistration
ELB stops sending NEW requests to it
Existing connections continue until complete
Or until deregistration delay expires
Default: 300 seconds (5 minutes)
Can set: 0–3600 seconds

💡

Best Practices

Short-lived requests (APIs): set to 30-60s
Long-lived (WebSocket, uploads): set to 300-3600s
Set to 0 for instant termination (testing)
Critical for zero-downtime deployments
Works with Auto Scaling scale-in
Works with rolling deployment updates
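
To reason about which delay to pick, it can help to model draining directly. This sketch assumes each in-flight request is described only by how many more seconds it needs; the `drain` helper is hypothetical.

```python
def drain(in_flight_seconds, deregistration_delay):
    """Summarise what happens to in-flight requests while a target drains."""
    completed = [s for s in in_flight_seconds if s <= deregistration_delay]
    cut_off = [s for s in in_flight_seconds if s > deregistration_delay]
    # Draining ends when the last request finishes or the delay expires.
    drain_time = deregistration_delay if cut_off else max(completed, default=0)
    return {"completed": len(completed), "cut_off": len(cut_off), "drain_time": drain_time}


print(drain([2, 10, 45], deregistration_delay=30))
print(drain([2, 10], deregistration_delay=30))
```

In the first call the 45-second request is cut off at the 30-second mark, which is the case a longer delay is meant to avoid for uploads and other long-lived connections.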

Architecture Diagram — Failover Behavior Core

Failover — When AZ-a targets all fail, cross-zone balancing routes 100% to AZ-b

Sticky Sessions Advanced

🍪

Duration-Based (LB Cookie)

ELB generates cookie: AWSALB
Duration: 1 second to 7 days
Same client → same target for duration
Simple — no app changes needed
Warning: uneven load if sessions are long

🎯

Application-Based Cookie

App generates custom cookie
ELB reads it to maintain affinity
Cookie name must NOT start with AWSALB
App controls session lifecycle
More flexible — app-aware routing

⚠️ Sticky Sessions + WebSocket Limitation

Cookie-based stickiness has no effect on an established WebSocket connection. A WebSocket is a persistent connection, so after the initial HTTP upgrade request there are no further HTTP requests to route; the connection itself stays on the same target for its lifetime. Stickiness only influences where a client's next new request or reconnect lands. If related clients must reach the same backend state, handle that at the application level (for example, connection ID routing) rather than relying on ELB stickiness.

🧠 Key Insight

Health checks are invisible to users but critical for reliability. They're how ELB achieves "zero-downtime" — failed targets are silently removed, recovered targets automatically rejoin. Combine with cross-zone balancing for true fault tolerance across AZs.

Chapter Summary Introductory

Target Group = logical pool of targets (EC2, IP, Lambda, or ALB) — destination for routing rules
Target types: instance (by ID), IP (cross-VPC/on-prem), Lambda (ALB only), ALB (NLB only)
Health checks = continuous probes every 5-300s (default 30s) — unhealthy targets removed from rotation automatically
Thresholds: unhealthy=2 failures, healthy=5 successes — tune for speed vs stability
Deregistration delay = 300s default — allows in-flight requests to complete before target removed
Cross-zone: ALB on by default (no inter-AZ charge), NLB off by default — enables failover across AZs
Sticky sessions: bind client to same target via cookie — use sparingly (causes uneven load)

Chapter Six

Security & Integration

Security Groups on ELB Core

Security groups control what traffic can reach your load balancer and what traffic your targets accept. The critical pattern is: allow public traffic to ELB, but restrict targets to only accept traffic FROM the ELB.

🛡️

ALB Security Group

Inbound: port 443 from 0.0.0.0/0 (internet)
Inbound: port 80 from 0.0.0.0/0 (redirect to HTTPS)
Outbound: port 8080 to target SG
ALB always has a security group
Can restrict by IP range (internal ALB)

🔒

Target (EC2) Security Group

Inbound: port 8080 from ALB security group
NOT from 0.0.0.0/0!
Only ELB can talk to targets
Targets are in private subnet
No direct internet access to instances

👉 The golden rule: Targets should ONLY allow inbound traffic from the ELB security group — never from 0.0.0.0/0. This ensures all traffic passes through the load balancer (health checks, routing, TLS) and targets aren't directly exposed.
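
The golden rule can be checked mechanically. The sketch below writes the two rule sets as plain dictionaries shaped like the EC2 API's `IpPermissions` structure and tests them locally; the security group IDs are placeholders, and nothing here calls AWS.

```python
ALB_SG = "sg-alb-example"  # placeholder ID, not a real security group

alb_ingress = [
    {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
     "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    {"IpProtocol": "tcp", "FromPort": 80, "ToPort": 80,
     "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
]
target_ingress = [
    {"IpProtocol": "tcp", "FromPort": 8080, "ToPort": 8080,
     "UserIdGroupPairs": [{"GroupId": ALB_SG}]},
]


def open_to_internet(permissions):
    """True if any rule admits traffic from 0.0.0.0/0."""
    return any(r.get("CidrIp") == "0.0.0.0/0"
               for p in permissions for r in p.get("IpRanges", []))


def only_reachable_from(permissions, group_id):
    """True if every rule admits traffic solely from the given security group."""
    for p in permissions:
        if p.get("IpRanges") or p.get("Ipv6Ranges"):
            return False
        sources = [pair["GroupId"] for pair in p.get("UserIdGroupPairs", [])]
        if sources != [group_id]:
            return False
    return True


print(open_to_internet(alb_ingress), open_to_internet(target_ingress))
print(only_reachable_from(target_ingress, ALB_SG))
```

A check like this could run in CI against exported rules to catch a target group that has been opened to the internet by mistake.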

NLB Security Differences Core

⚠️

NLB (Before 2023)

NLB did NOT have security groups
Traffic passed through to targets
Target SG had to allow client IPs directly
Because NLB preserves source IP
Made security rules complex

✅

NLB (After 2023)

NLB now supports security groups
Target SG can reference NLB SG
Same pattern as ALB now possible
Simplified security management
Exam: know both old and new behavior

Concept Diagram — Security Group Chain Introductory

Concept — Security groups create a trust chain: Internet → ELB SG → Target SG

TLS / SSL Termination Core

TLS termination means the load balancer decrypts HTTPS traffic and forwards plain HTTP to targets. This offloads CPU-intensive cryptography from your application servers.

🔐

TLS at ALB (Most Common)

Attach ACM certificate to ALB
ALB decrypts HTTPS → HTTP to targets
Targets on port 80/8080 (plain HTTP)
SNI support (multiple certs)
Free ACM certificates

🔄

End-to-End Encryption

ALB decrypts, then re-encrypts to targets
Target group protocol: HTTPS
Targets need their own certs
Required for: PCI compliance, zero-trust
More CPU on targets

📋

SSL Policies

Choose TLS version (1.2, 1.3)
Choose cipher suites
Recommended: TLSv1.2+ only
ELBSecurityPolicy-TLS13-1-2-2021-06
Disable old protocols (SSLv3, TLS 1.0)

AWS Diagram — TLS Termination Architecture Core

AWS Architecture — TLS termination at ALB with ACM certificate

Integration with AWS Services Core

🛡️

WAF (Web Application Firewall)

Attach AWS WAF to ALB
Block SQL injection, XSS
Rate limiting per IP
Geo-blocking
Custom rules & managed rule groups
ALB only (not NLB)

🏰

AWS Shield

Shield Standard: free, auto-enabled
Protects against L3/L4 DDoS
Shield Advanced: $3000/mo
L7 DDoS protection + support
SRT (Shield Response Team)
Cost protection during attacks

🔑

Cognito / OIDC Auth

ALB authenticates before forwarding
Cognito User Pool integration
Any OIDC provider (Google, Okta)
Auth at LB level — app doesn't need auth code
ALB only feature

📊

CloudWatch

Metrics: RequestCount, TargetResponseTime
HealthyHostCount, UnhealthyHostCount
HTTPCode_ELB_5XX (LB errors)
HTTPCode_Target_5XX (target errors)
Alarms for monitoring

📝

Access Logs

Log every request to S3
Client IP, latency, response code
Target chosen, health status
Log files published every 5 minutes
Free (only S3 storage cost)

🔍

Global Accelerator

Static IPs + AWS backbone routing
Works with ALB (static IP for ALB!)
Improves global latency
Instant failover between regions
Alternative to NLB for static IP

Architecture Diagram — Full Security Stack Advanced

Production Security — Full protection stack from edge to target

🧠 Key Insight

ELB is not just about load balancing — it's the natural point for TLS termination, authentication, WAF enforcement, and security group chaining. A properly configured ALB + WAF + Shield + SG chain provides defense-in-depth from DDoS to SQL injection to unauthorized access.

Chapter Summary Introductory

Security Groups: ALB allows 0.0.0.0/0 :443, targets allow only from ALB SG — never expose targets directly
NLB SGs: now supported (since 2023) — same pattern as ALB possible
TLS termination: ALB decrypts HTTPS → plain HTTP to targets — use ACM for free auto-renewing certs
WAF: attach to ALB for SQL injection, XSS, rate limiting, geo-blocking — ALB only
Shield: Standard (free, auto) protects L3/L4 DDoS; Advanced ($3K/mo) adds L7 + support team
Cognito/OIDC: ALB can authenticate users before forwarding — auth at infrastructure level
Access Logs: every request logged to S3 — client IP, latency, status, target chosen

Chapter Seven

Architecture Patterns

Pattern 1 — ALB + EC2 Auto Scaling (Classic Web App) Core

The most common AWS architecture pattern. ALB handles routing and health checks, Auto Scaling Group handles capacity. Together they create an elastic, self-healing web tier.

Pattern 1 — ALB + EC2 Auto Scaling Group (classic 3-tier web application)

✅

Why This Pattern Works

ALB handles routing + TLS + health checks
ASG handles capacity (scales up on load)
Unhealthy instances auto-replaced by ASG
Multi-AZ = AZ failure is survivable
Rolling deploys via ASG instance refresh
Cost-effective: scale down at night

🎯

Best For

Traditional web applications
Monolithic apps (before containerization)
WordPress / PHP / Java web apps
Predictable traffic patterns
Teams familiar with EC2
Exam tip: most common pattern tested

Pattern 2 — ALB + ECS Fargate (Modern Containers) Core

The modern standard for running microservices on AWS. ALB routes different paths to different ECS services — each service scales independently. No servers to manage (Fargate = serverless containers).

Pattern 2 — ALB + ECS Fargate microservices (production-grade)

Pattern 3 — NLB for High-Performance Systems Advanced

🎮

Gaming Architecture

NLB with Elastic IPs
UDP for game state (low latency)
TCP for chat/auth
Millions of concurrent players
Source IP preserved (anti-cheat)
Static IP for DNS (game clients)

💰

Financial Trading

NLB for microsecond latency
TLS passthrough (end-to-end audit)
Static IP for partner whitelisting
TCP connections for FIX protocol
Cross-zone disabled (data locality)
PrivateLink for cross-account access

Pattern 4 — NLB + ALB Combo (Static IP + HTTP Routing) Advanced

When you need both a static IP (NLB) and content-based routing (ALB) — the solution is chaining them. This is an exam favorite.

Pattern 4 — NLB → ALB chain gives static IP + Layer 7 routing

Pattern 5 — Multi-Tier Load Balancing Advanced

Pattern 5 — External ALB (public) + Internal ALB (private) for true multi-tier

Pattern Summary — Quick Reference Core

Pattern	LB Type	Best For
ALB + EC2 ASG	ALB	Traditional web apps, monoliths, WordPress
ALB + ECS Fargate	ALB	Microservices, containers, modern apps
NLB (high-perf)	NLB	Gaming, IoT, financial, PrivateLink
NLB → ALB chain	NLB + ALB	Static IP + HTTP routing (exam favorite)
Multi-tier (ext+int)	2× ALB	Enterprise apps, service isolation, compliance

🧠 Key Insight

The "right" pattern depends on your requirements: ALB + ASG for traditional apps, ALB + Fargate for modern microservices, NLB for non-HTTP or extreme performance, NLB→ALB for static IP + routing, and multi-tier ALBs for enterprise isolation. Know all five — they are frequently tested across AWS exams.

Chapter Summary Introductory

Pattern 1 (ALB + EC2 ASG): classic self-healing web tier — ALB routes, ASG scales, Multi-AZ for HA
Pattern 2 (ALB + ECS Fargate): modern microservices — path routing to independent services, no servers to manage
Pattern 3 (NLB high-perf): gaming/financial — static IP, microsecond latency, TCP/UDP, PrivateLink
Pattern 4 (NLB → ALB): best of both — static IP from NLB + Layer 7 routing from ALB (exam favorite!)
Pattern 5 (Multi-tier): external ALB (public) → web tier → internal ALB (private) → API services → DB