Elastic Load Balancing: Traffic Distribution Layer
AWS's managed load balancing service distributes incoming traffic across multiple targets, performs health checks, enables zero-downtime deployments, and acts as the intelligent entry point to your system. It is not just traffic spreading; it is your traffic control layer.
⚡ ELB in 30 Seconds
- Distributes traffic: routes incoming requests across multiple healthy targets (EC2, ECS, Lambda)
- Health checks: automatically detects unhealthy targets and stops routing traffic to them
- Three types: ALB (Layer 7, HTTP), NLB (Layer 4, TCP/UDP), GWLB (Layer 3, appliances)
- Multi-AZ: spans Availability Zones for fault tolerance and high availability
- Fully managed: AWS handles scaling, patching, and availability of the load balancer itself
What is Load Balancing
You deploy your application on one server. Traffic grows. That single server handles 100 requests/second... then 1,000... then it crashes. Users see 503 errors. Your app is down. You can scale vertically (bigger server), but there's a ceiling, and a single point of failure remains.
The fundamental problem: One server = one point of failure + limited capacity. You need multiple servers, but something has to decide which server handles each request. That's what a load balancer does.
A load balancer sits between clients and your servers. It receives all incoming requests and distributes them across multiple backend servers (called targets). No single target gets overwhelmed. If one target dies, the load balancer stops sending traffic to it, and most users never notice.
Without Load Balancer
- Single server handles everything
- Server crashes → entire app goes down
- Can't scale beyond one machine's capacity
- Deployments require downtime
- No health monitoring
- No traffic distribution
With Load Balancer
- Traffic distributed across multiple targets
- One target dies → others handle traffic seamlessly
- Scale horizontally by adding more targets
- Rolling deployments with zero downtime
- Continuous health checks on all targets
- Smart routing (path, host, headers)
There are two ways to handle growth:
Vertical Scaling (Scale Up)
- Make one server bigger (more CPU, RAM)
- Simple: no architecture change
- Hard limit: eventually max out
- Single point of failure remains
- Downtime to resize
Horizontal Scaling (Scale Out)
- Add more servers of the same size
- Virtually unlimited capacity
- No single point of failure
- Add/remove servers without downtime
- Requires a load balancer to distribute traffic
Horizontal scaling is how modern cloud architectures work. Auto Scaling adds EC2 instances; the load balancer automatically includes them in traffic distribution. This is elastic computing: grow and shrink with demand.
Without Load Balancer = One-Lane Road
- All cars (requests) take the same single road
- Traffic jam at peak hours (server overloaded)
- Road closed for repair → no one gets through (downtime)
- No alternative route
- Accident = total shutdown
With Load Balancer = Traffic Controller + Highway
- Multiple lanes (targets) available
- Traffic controller (ELB) directs cars to least-busy lane
- One lane closed → controller re-routes to open lanes
- New lanes added during rush hour (Auto Scaling)
- Smart routing: trucks to one lane, cars to another (ALB rules)
- Controller never sleeps (managed service, always available)
ELB is not just "traffic spreading"; it is your traffic control layer. It routes intelligently, monitors health continuously, enables zero-downtime deployments, and acts as the single entry point that decouples clients from your backend architecture.
AWS ELB is a fully managed service: you don't run load balancer software on EC2. AWS handles:
Auto-Scaling
ELB itself scales automatically. It handles 10 req/s or 10M req/s with no pre-provisioning needed (ALB). AWS manages the underlying fleet.
Health Checks
Probes targets at a configurable interval (30 seconds by default). Once a target fails enough consecutive checks, traffic is re-routed to the healthy ones. The target rejoins automatically when it recovers.
Multi-AZ
Spans multiple Availability Zones. If an entire AZ goes down, ELB routes 100% to remaining AZs. True fault tolerance.
Understanding where ELB sits in the full AWS request flow:
① Route 53: DNS resolution (domain → IP)
② CloudFront: edge caching (optional, reduces latency)
③ ELB: distributes traffic to targets
④ EC2 / ECS / Lambda: processes the request
⑤ RDS / DynamoDB: data layer
ELB is the gateway between the internet and your compute layer. Everything flows through it.
- Load balancing = distributing traffic across multiple targets to prevent overload
- Horizontal scaling = adding more servers (not bigger servers); requires a load balancer
- Health checks = ELB monitors targets and stops routing to unhealthy ones
- Multi-AZ = ELB spans AZs for fault tolerance; an entire AZ failure is handled
- Fully managed = AWS scales ELB itself; you don't manage load balancer infrastructure
- ELB = traffic control layer: not just spreading, but intelligent routing + health + zero-downtime deployments
ELB Types: ALB, NLB & GWLB
AWS offers three types of Elastic Load Balancers, each operating at a different OSI layer and optimized for different workloads. Choosing the right one is a common exam question and a critical architecture decision.
The key mental model: ALB = smart routing (HTTP/HTTPS), NLB = raw speed (TCP/UDP), GWLB = security appliances (Layer 3). Most web applications use ALB. High-performance systems use NLB. You rarely need GWLB unless you're running firewalls or IDS/IPS.
ALB: Application Load Balancer
- Layer 7 (HTTP/HTTPS)
- Content-based routing
- Path, host, header, query string rules
- WebSocket support
- Lambda + EC2 + ECS targets
- Best for: web apps, microservices, APIs
NLB: Network Load Balancer
- Layer 4 (TCP/UDP/TLS)
- Ultra-low latency (~100 μs)
- Millions of requests/sec
- Static IP / Elastic IP support
- Preserves source IP
- Best for: gaming, IoT, financial, TCP services
GWLB: Gateway Load Balancer
- Layer 3 (IP packets)
- Transparent to applications
- GENEVE encapsulation
- Inline traffic inspection
- Third-party appliances
- Best for: firewalls, IDS/IPS, DPI
| Feature | ALB | NLB | GWLB |
|---|---|---|---|
| OSI Layer | Layer 7 | Layer 4 | Layer 3 |
| Protocols | HTTP, HTTPS, gRPC | TCP, UDP, TLS | IP (all protocols) |
| Latency | ~ms | ~μs (ultra-low) | ~ms |
| Static IP | No (use DNS) | Yes (Elastic IP) | No |
| Content Routing | Yes (path, host, headers) | No (port only) | No |
| Security Groups | Yes | Yes (added 2023) | No |
| Lambda Targets | Yes | No | No |
| Source IP Preserved | No (use X-Forwarded-For) | Yes | Yes |
| Throughput | High | Millions req/s | High |
Choose ALB When
- Web application with HTTP/HTTPS traffic
- Microservices needing path-based routing
- Multiple domains on one load balancer (host-based)
- Need to route to Lambda functions
- WebSocket connections needed
- Authentication (OIDC/Cognito integration)
- gRPC workloads
- Redirect HTTP → HTTPS at LB level
Choose NLB When
- Non-HTTP protocols (TCP, UDP, TLS)
- Need static IP for whitelisting by partners
- Extreme performance (millions req/s)
- Ultra-low latency required (~100 μs)
- Need to preserve client source IP
- Gaming servers, VoIP, IoT
- PrivateLink (exposing services cross-account)
- Long-lived TCP connections
GWLB is the newest type (launched 2020) and the least commonly used. It's designed for a very specific use case: inserting third-party security appliances (firewalls, IDS/IPS) inline into your traffic flow without changing your application architecture.
How GWLB Works
- Traffic is transparently diverted to appliances
- Uses GENEVE protocol (port 6081) for encapsulation
- Appliances inspect traffic and return it
- Original flow continues; the app doesn't know
- Scales fleet of appliances automatically
- Deployed via Gateway Load Balancer Endpoint (GWLBE) in each VPC
When NOT to use GWLB
- Regular web applications → use ALB
- Simple TCP forwarding → use NLB
- If you don't run security appliances → skip GWLB
- Not for direct application load balancing
- Exam tip: rarely tested, but know it exists
GWLB: Exam Quick Reference
| Aspect | Details |
|---|---|
| Layer | Layer 3 (IP packets), transparent to applications |
| Protocol | GENEVE encapsulation (port 6081), wraps original packets |
| Use with | Third-party firewalls (Palo Alto, Fortinet), IDS/IPS, DLP appliances |
| Deployment | GWLB → Target Group (appliance fleet) → returns cleaned traffic via GWLBE |
| Exam triggers | "inline inspection," "security appliance," "firewall," "IDS/IPS," "transparent" |
| NOT for | HTTP routing (use ALB), high-performance TCP (use NLB), direct LB |
For most real-world scenarios: default to ALB for web apps. Switch to NLB when you see "static IP," "TCP/UDP," "extreme performance," or "source IP preservation." GWLB only appears for security appliance scenarios.
- ALB (Layer 7) = HTTP/HTTPS smart routing (path, host, headers) for web apps, microservices, and Lambda
- NLB (Layer 4) = TCP/UDP raw speed (static IP, ultra-low latency, millions req/s) for gaming, IoT, and financial workloads
- GWLB (Layer 3) = IP packet inspection via transparent inline security appliances (firewalls, IDS/IPS)
- Default choice = ALB for most web workloads; NLB when you need static IP or non-HTTP; GWLB only for security appliances
- Exam keywords: "path-based" → ALB, "static IP" → NLB, "firewall appliance" → GWLB
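The keyword triggers above can be written down as a small decision helper. This is only a study aid sketched under assumptions: the function name `choose_load_balancer` and the keyword sets are hypothetical and simply restate the rules of thumb from this section, not any AWS API.

```python
def choose_load_balancer(requirements: set[str]) -> str:
    """Map requirement keywords to an ELB type (illustrative heuristic only)."""
    gwlb_triggers = {"firewall appliance", "inline inspection", "ids/ips"}
    nlb_triggers = {"static ip", "udp", "tcp", "privatelink", "source ip preservation"}
    http_routing = {"path-based", "host-based", "lambda target"}

    # Security appliances are the only GWLB scenario in these notes.
    if requirements & gwlb_triggers:
        return "GWLB"
    if requirements & nlb_triggers:
        # Static IP plus Layer 7 routing is the NLB -> ALB chaining case.
        if "static ip" in requirements and requirements & http_routing:
            return "NLB -> ALB"
        return "NLB"
    # Everything else defaults to ALB.
    return "ALB"


if __name__ == "__main__":
    print(choose_load_balancer({"path-based"}))
    print(choose_load_balancer({"static ip", "path-based"}))
```

Real designs weigh more than keywords (cost, protocol mix, existing infrastructure), so treat the output as a first guess.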
Application Load Balancer: Deep Dive
The Application Load Balancer is the most commonly used ELB type. It operates at Layer 7 (HTTP/HTTPS), which means it can read and understand the content of requests (URLs, headers, cookies, query strings) and make intelligent routing decisions based on that content.
Key difference: NLB sees "a TCP connection on port 443." ALB sees "GET /api/users?page=2 with header Host: api.example.com and cookie session=abc123." ALB understands the request, and that is what makes content-based routing possible.
Every ALB has three core components that work together:
Listener
- Checks for connections on a port + protocol
- Typically: port 80 (HTTP) or 443 (HTTPS)
- You can have multiple listeners
- HTTPS listener needs an SSL certificate
- Entry point for all requests
Rules
- Conditions that match incoming requests
- Path: /api/*, /images/*, /admin/*
- Host: api.example.com, www.example.com
- Headers, query strings, HTTP method
- Evaluated in priority order (1-50000)
- Default rule = catch-all (required)
Target Group
- Group of targets receiving traffic
- Can be: EC2, ECS, Lambda, or IP
- Has its own health check configuration
- Defines port and protocol for targets
- Each rule's forward action sends to one target group (or several, when weighted)
Path-based routing sends requests to different target groups based on the URL path. This is the most commonly used ALB routing feature; it lets one ALB serve multiple services:
GET /api/users → Target Group: API Service (ECS Fargate)
GET /web/dashboard → Target Group: Web Frontend (EC2)
GET /static/logo.png → Target Group: Static Assets (S3 via Lambda)
GET /admin/* → Target Group: Admin Service (separate EC2 fleet)
One ALB, one domain, multiple services: path routing is microservice routing without a service mesh.
Host-based routing routes based on the Host header, effectively serving multiple domains/subdomains from one ALB:
Example: Multi-Tenant SaaS
- api.example.com → API target group
- www.example.com → Web target group
- admin.example.com → Admin target group
- *.customer.com → Customer-specific group
- All served by ONE ALB
Why This Matters
- One ALB = one bill (cheaper than multiple LBs)
- Wildcard support (*.example.com)
- Combine with path rules for fine-grained control
- Each domain can route to different backends
- One SSL cert with SANs covers all domains
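How priority-ordered rules with path and host conditions resolve to a target group can be modeled in a few lines. This is a minimal sketch, not ALB's implementation: the `Rule` class, `route` function, and target group names are hypothetical, and `fnmatchcase` only approximates ALB wildcard matching (it also accepts `[...]` character classes, which ALB patterns do not).

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Optional


@dataclass
class Rule:
    priority: int
    target_group: str
    path_pattern: Optional[str] = None
    host_pattern: Optional[str] = None

    def matches(self, host: str, path: str) -> bool:
        # All conditions present on a rule must match (AND logic).
        if self.path_pattern is not None and not fnmatchcase(path, self.path_pattern):
            return False
        if self.host_pattern is not None and not fnmatchcase(host.lower(), self.host_pattern):
            return False
        return True


def route(rules: list[Rule], host: str, path: str, default: str = "tg-default") -> str:
    # Lower priority number is evaluated first; the first match wins.
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.matches(host, path):
            return rule.target_group
    # The default rule is the catch-all.
    return default


RULES = [
    Rule(10, "tg-api", path_pattern="/api/*"),
    Rule(20, "tg-admin", host_pattern="admin.example.com"),
    Rule(30, "tg-customer", host_pattern="*.customer.com"),
]

if __name__ == "__main__":
    print(route(RULES, "www.example.com", "/api/users"))
    print(route(RULES, "acme.customer.com", "/home"))
```

Because evaluation is first-match by priority, a request to `admin.example.com/api/users` lands on `tg-api` here. Rule ordering matters as much as the conditions themselves.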
Beyond path and host, ALB can route on any HTTP attribute:
Advanced Condition Types
- HTTP Header: X-Custom-Header = "beta" → TG-Beta
- Query String: ?version=v2 → TG-V2
- HTTP Method: POST /orders → TG-Write-Service
- Source IP: 10.0.0.0/8 → TG-Internal
- Combine conditions with AND logic
Advanced Actions
- Forward: Send to target group (normal)
- Redirect: HTTP → HTTPS, old → new URL
- Fixed Response: Return 404/503 directly from ALB
- Authenticate: OIDC/Cognito before forwarding
- Weighted: 90% TG-Blue + 10% TG-Green
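The weighted forward action (90% blue, 10% green) amounts to a weighted random choice per request. The sketch below simulates that split; `pick_target_group` and the group names are hypothetical, and ALB's actual selection logic is internal to the service.

```python
import random


def pick_target_group(weights: dict[str, int], rng: random.Random) -> str:
    """Choose a target group with probability proportional to its weight."""
    groups = list(weights)
    return rng.choices(groups, weights=[weights[g] for g in groups], k=1)[0]


if __name__ == "__main__":
    rng = random.Random(7)  # seeded so the simulation is repeatable
    weights = {"tg-blue": 90, "tg-green": 10}
    picks = [pick_target_group(weights, rng) for _ in range(10_000)]
    print("green share:", picks.count("tg-green") / len(picks))
```

A canary rollout then becomes a matter of shifting the weights step by step (for example 90/10, 50/50, 0/100) while watching error metrics.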
ALB Can Do
- Path-based & host-based routing
- HTTP → HTTPS redirect (built-in)
- Fixed response (maintenance page from LB)
- OIDC / Cognito authentication at LB level
- Sticky sessions (cookie-based)
- WebSocket and HTTP/2 support
- gRPC support
- Lambda targets (serverless backend)
- Weighted target groups (canary/blue-green)
- Access logs to S3
- WAF integration (Web Application Firewall)
- Slow start mode for new targets
ALB Cannot Do
- Static IP (use NLB or Global Accelerator)
- Preserve source IP directly (use X-Forwarded-For)
- Non-HTTP protocols (TCP, UDP, MQTT → NLB)
- Ultra-low latency (μs): ALB adds ~ms
- PrivateLink / VPC endpoint service (→ NLB)
- Inline IP packet inspection (→ GWLB)
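Since ALB does not preserve the source IP, a target has to recover it from the X-Forwarded-For header. The sketch below assumes the default behavior in which each trusted proxy appends the address it saw, so the rightmost entries are the trustworthy ones. The function name `client_ip_from_xff` and the `trusted_hops` parameter are hypothetical.

```python
import ipaddress


def client_ip_from_xff(header_value: str, trusted_hops: int = 1) -> str:
    """Return the client IP recorded by the outermost trusted proxy.

    trusted_hops is the number of trusted proxies that appended to the header
    (1 for a single ALB). Entries further left may be client-supplied and
    should not be trusted.
    """
    entries = [part.strip() for part in header_value.split(",") if part.strip()]
    if trusted_hops < 1 or len(entries) < trusted_hops:
        raise ValueError("X-Forwarded-For has fewer entries than trusted hops")
    candidate = entries[-trusted_hops]
    # ip_address raises ValueError on anything that is not a valid IPv4/IPv6 address.
    return str(ipaddress.ip_address(candidate))


if __name__ == "__main__":
    # A spoofed leftmost value followed by the address the ALB actually saw.
    print(client_ip_from_xff("10.1.1.1, 198.51.100.23"))
```

Reading the leftmost entry instead is a common mistake, because a client can set that value to anything before the request reaches the load balancer.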
Idle Timeout
- Default: 60 seconds
- If no data sent/received for 60s → connection closed
- WebSocket: increase to 3600 seconds or more (maximum 4000)
- File uploads: increase for large files
- Short APIs: decrease to 30s (frees resources)
- Target keep-alive should exceed ALB idle timeout
Slow Start Mode
- New targets receive gradually increasing traffic
- Duration: 30-900 seconds (configurable)
- Protects cold-start instances from sudden full load
- JVM warm-up, cache priming, lazy initialization
- Traffic ramps linearly over configured duration
- Disabled by default (set to 0)
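The linear ramp described above can be expressed as a simple function of elapsed time. This is a sketch of the idea as stated in these notes, with a hypothetical function name; ALB's internal weighting may differ in detail.

```python
def slow_start_share(seconds_since_healthy: float, duration_seconds: int) -> float:
    """Fraction of a full traffic share a new target receives during slow start."""
    if duration_seconds == 0:
        return 1.0  # slow start disabled: full share immediately
    if not 30 <= duration_seconds <= 900:
        raise ValueError("slow start duration must be 0 or between 30 and 900 seconds")
    # Linear ramp from 0 to 1 over the configured duration, then capped.
    return min(1.0, max(0.0, seconds_since_healthy / duration_seconds))


if __name__ == "__main__":
    for t in (0, 30, 60, 120, 300):
        print(t, slow_start_share(t, duration_seconds=120))
```

With a 120-second duration, a JVM target would see half its eventual share at the 60-second mark, which gives caches and JIT compilation time to warm up.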
Deletion Protection
- Prevents accidental ALB deletion
- Must be explicitly disabled before deletion
- Enable for all production load balancers
- Terraform: enable_deletion_protection = true
- Console: shows error if you try to delete
ALB Limits
- 100 rules per ALB, not counting default rules (default quota, can request increase)
- 5 conditions per rule (AND logic)
- Up to 5 actions per rule
- 100 target groups per ALB (default quota)
- For large microservices: chain ALBs or use NLB → ALB
- Multiple listeners on same ALB (e.g., 80 + 443)
ALB is your microservice router. One ALB replaces what used to require multiple load balancers or a dedicated API gateway. Path routing + host routing + weighted targets = a lightweight service mesh at the infrastructure level.
- ALB = Layer 7: reads HTTP content and routes based on path, host, headers, cookies, query strings
- Components: Listener (port+protocol) → Rules (conditions) → Target Groups (destinations)
- Path routing: /api/* → Service A, /web/* → Service B (one LB, multiple services)
- Host routing: api.example.com → different backend than www.example.com
- Target types: EC2 instances, ECS tasks, Lambda functions, IP addresses; each target group holds one target type, but one ALB can mix target groups of different types
- Advanced: weighted target groups for canary deployments, OIDC auth, fixed responses, WAF
- Limits: 100 rules per ALB, 5 conditions per rule, 100 target groups per ALB (default quotas)
- Idle timeout: default 60s; increase for WebSocket (3600s) and large uploads
- Slow start: ramp up traffic to new targets over 30-900s (protects cold starts)
- Cannot: static IP, source IP preservation (use X-Forwarded-For), non-HTTP traffic
Network Load Balancer: Deep Dive
The Network Load Balancer operates at Layer 4 (Transport layer): it sees TCP/UDP connections, not HTTP content. It doesn't inspect payloads, doesn't read headers, doesn't understand URLs. It simply forwards connections at wire speed. This makes it extremely fast, with latency measured in microseconds, not milliseconds.
The key difference: ALB is like a concierge who reads your invitation and directs you to the right room. NLB is like a high-speed highway toll booth: it doesn't care what's in your car, it just routes you through as fast as physically possible.
Performance
- Millions of requests per second
- Ultra-low latency: ~100 microseconds
- Handles sudden traffic spikes without warm-up
- No pre-provisioning needed
- Designed for volatile, bursty workloads
- Connection-level load balancing (not request-level)
Static IP / Elastic IP
- One static IP per AZ (automatically assigned)
- Can attach Elastic IP per AZ (bring your own)
- IP never changes: firewall-friendly
- Partners can whitelist your IP
- DNS resolves to these IPs (not rotating like ALB)
- Critical for IP-based whitelisting scenarios
Source IP Preservation
- Client's real IP reaches the target
- No X-Forwarded-For needed (unlike ALB)
- Target sees: source IP = client IP
- Important for logging, geo-restriction, security
- Works by default for instance targets
What NLB Cannot Do
- No content-based routing (no path/host)
- No HTTP header inspection
- No Lambda targets
- No built-in authentication (OIDC)
- No WebSocket-specific handling
- No WAF integration directly
The fundamental difference in how they process traffic:
ALB sees: "GET /api/users HTTP/1.1\nHost: api.example.com\nAuth: Bearer token..."
ALB decides: path matches /api/* → forward to API target group
NLB sees: "TCP connection, destination port 443"
NLB decides: port 443 → forward to target group for port 443
(NLB doesn't open the packet; it just routes the connection)
Gaming
- Millions of concurrent TCP/UDP connections
- Ultra-low latency mandatory
- Static IP for DNS pointing
- Long-lived connections
Financial / Trading
- Microsecond latency requirements
- Static IP for partner whitelisting
- High-frequency trading connections
- TLS passthrough
IoT / Real-time
- MQTT over TCP (IoT devices)
- Millions of device connections
- UDP for telemetry/streaming
- Source IP for device identification
TLS Termination at NLB
- NLB decrypts TLS → sends plain TCP to targets
- Attach ACM certificate to NLB
- Offloads CPU-intensive crypto from targets
- Targets don't need certificates
- Use when: targets are in private subnet
TLS Passthrough
- NLB forwards encrypted traffic as-is
- Target handles TLS termination
- NLB never sees plaintext data
- End-to-end encryption preserved
- Use when: compliance requires end-to-end TLS
Cross-Zone Load Balancing: Cost
- NLB cross-zone is OFF by default
- When enabled: inter-AZ data transfer charges apply
- Unlike ALB, where cross-zone is on by default at the load balancer level and carries no inter-AZ charge
- Enable when: uneven target distribution across AZs
- Disable when: traffic is evenly distributed
- Can be expensive for high-throughput workloads
- Exam tip: know this cost difference vs ALB
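The effect of cross-zone balancing on uneven target counts is easiest to see with arithmetic. The sketch below assumes one load balancer node per AZ and clients spread evenly across those nodes. The function name and AZ labels are hypothetical.

```python
def per_target_share(targets_per_az: dict[str, int], cross_zone: bool) -> dict[str, float]:
    """Share of total traffic received by EACH target in every AZ."""
    active = {az: n for az, n in targets_per_az.items() if n > 0}
    total_targets = sum(active.values())
    if cross_zone:
        # Every node can reach every target, so all targets share equally.
        return {az: 1 / total_targets for az in active}
    # Each AZ's node gets an equal slice and splits it among local targets only.
    return {az: 1 / (len(active) * n) for az, n in active.items()}


if __name__ == "__main__":
    fleet = {"az-a": 2, "az-b": 8}
    print("cross-zone off:", per_target_share(fleet, cross_zone=False))
    print("cross-zone on: ", per_target_share(fleet, cross_zone=True))
```

With 2 targets in one AZ and 8 in the other, turning cross-zone off gives each target in the small AZ 25% of all traffic versus 6.25% in the large AZ; turning it on evens that out to 10% each, at the price of inter-AZ data transfer on NLB.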
NLB UDP Behavior
- UDP is connectionless: no TCP handshake
- Does NOT support sticky sessions for UDP
- Health checks for UDP target groups use TCP, HTTP, or HTTPS; there is no UDP probe
- Success = the TCP connection or HTTP response completes before the timeout
- The health check can run on a different port from the UDP traffic port
- UDP flows tracked by 5-tuple (src/dst IP+port+proto)
- Use for: DNS, game state, telemetry, VoIP
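Flow tracking by 5-tuple means every packet of one flow maps to the same target. A hash over the tuple illustrates the idea. This is a sketch only: NLB's actual flow hash is internal to AWS, and the function name and addresses below are made up for the example.

```python
import hashlib


def pick_target(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
                protocol: str, targets: list[str]) -> str:
    """Deterministically map a 5-tuple to one of the registered targets."""
    if not targets:
        raise ValueError("no registered targets")
    key = f"{protocol}|{src_ip}|{src_port}|{dst_ip}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an integer and reduce modulo the fleet size.
    index = int.from_bytes(digest[:8], "big") % len(targets)
    return targets[index]


if __name__ == "__main__":
    fleet = ["10.0.1.10", "10.0.1.11", "10.0.2.10"]
    flow = ("203.0.113.7", 50123, "198.51.100.5", 53, "udp")
    # The same flow always lands on the same target.
    print(pick_target(*flow, fleet), pick_target(*flow, fleet))
```

Note that a plain modulo reshuffles many flows when the target list changes, which is one reason real load balancers keep per-flow state rather than rehashing every packet.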
NLB is for when you need raw TCP/UDP performance, static IPs, or PrivateLink. If your question mentions "static IP," "millions of requests," "UDP," "TCP passthrough," or "PrivateLink," the answer is NLB. For everything HTTP-related, use ALB instead.
- NLB = Layer 4: sees TCP/UDP connections, not HTTP content; routes by port only
- Ultra-fast: ~100 μs latency, millions of req/s, no warm-up needed
- Static IP: one per AZ, supports Elastic IP; critical for firewall whitelisting
- Source IP preserved: targets see real client IP (no X-Forwarded-For needed)
- PrivateLink: a VPC Endpoint Service is backed by an NLB (or a GWLB for appliances), never an ALB directly (cross-account private access)
- TLS options: terminate at NLB (offload crypto) or passthrough (end-to-end encryption)
- Cross-zone: OFF by default; inter-AZ data transfer costs when enabled (unlike ALB, where it is on by default at no charge)
- UDP: connectionless, no sticky sessions, 5-tuple flow tracking
- Use cases: gaming, financial trading, IoT, B2B APIs needing static IP, PrivateLink services
Target Groups & Health Checks
A Target Group is the logical grouping of targets (instances, containers, Lambda functions, or IPs) that receive traffic from the load balancer. Think of it as a "destination pool": the load balancer sends traffic to the group, and the group distributes it among its members.
Mental model: Listener receives traffic → Rules evaluate → Matching rule forwards to Target Group → Target Group distributes to individual targets. The target group is the bridge between routing rules and actual compute.
Instance Targets
- Register by instance ID
- ELB routes to primary private IP
- Port specified at registration
- Used with: EC2 instances, ASG
- Source IP: NLB preserves, ALB uses X-Forwarded-For
- Most common target type
IP Targets
- Register by IP address
- Can target IPs outside the VPC
- On-premises servers via Direct Connect
- Other VPCs (peered or Transit Gateway)
- ECS tasks with awsvpc networking
- Multiple ports on same instance
Lambda Targets (ALB only)
- Register a Lambda function
- ALB converts HTTP → Lambda event
- Lambda response → HTTP response
- Serverless backend behind ALB
- One Lambda per target group
- Great for lightweight APIs
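The HTTP-to-event conversion above means a Lambda target receives a JSON event and must return a dict that ALB can turn back into an HTTP response. Here is a minimal handler sketch. The greeting logic is invented for illustration, and the sample event at the bottom is a trimmed stand-in for what ALB sends.

```python
import json


def lambda_handler(event, context):
    """Handle an ALB-originated request and return an ALB-compatible response."""
    # queryStringParameters can be missing or None when the URL has no query string.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "statusDescription": "200 OK",
        "isBase64Encoded": False,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"hello {name}", "path": event.get("path")}),
    }


if __name__ == "__main__":
    sample_event = {
        "httpMethod": "GET",
        "path": "/api/hello",
        "queryStringParameters": {"name": "alb"},
        "headers": {"host": "api.example.com"},
        "body": "",
        "isBase64Encoded": False,
    }
    print(lambda_handler(sample_event, None))
```

If the response is missing `statusCode` or is not a dict of this shape, ALB treats it as a bad response from the target and returns a 502 to the client.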
ALB Targets (NLB only)
- Register an ALB as NLB target
- NLB (static IP) → ALB (content routing)
- Best of both worlds combination
- Static IP + path/host routing
- Exam favorite: "static IP + HTTP routing"
- Introduced in 2021
Health checks are the heartbeat of load balancing. Without them, ELB would blindly send traffic to dead targets. Health checks continuously probe each registered target; if a target fails, ELB stops routing traffic to it. When it recovers, traffic resumes.
How It Works
- ELB sends a probe (HTTP GET or TCP connect)
- Target must respond within timeout
- Success = healthy response (e.g., HTTP 200)
- Failure = timeout or wrong status code
- Consecutive failures = mark unhealthy
- Consecutive successes = mark healthy again
Configuration
- Protocol: HTTP, HTTPS, TCP, gRPC
- Path: /health, /status, /ping
- Port: traffic port or custom
- Interval: 5-300 sec (default 30)
- Timeout: 2-120 sec (default 5)
- Healthy threshold: 2-10 (default 5)
- Unhealthy threshold: 2-10 (default 2)
Success Criteria
- HTTP: status code matches the configured success codes (values within 200-499 are allowed)
- TCP: connection established
- gRPC: gRPC status code
- Default: 200 OK
- Can set: "200-299" or "200,302"
- Path should be lightweight (/health)
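The consecutive-success and consecutive-failure thresholds form a small state machine per target. The sketch below models it with the defaults listed above (healthy threshold 5, unhealthy threshold 2). The `TargetHealth` class is hypothetical, and it simplifies by starting targets as healthy, whereas real targets begin in an initial state.

```python
from dataclasses import dataclass


@dataclass
class TargetHealth:
    healthy_threshold: int = 5
    unhealthy_threshold: int = 2
    healthy: bool = True
    _streak: int = 0  # consecutive probes that contradict the current state

    def record(self, probe_ok: bool) -> bool:
        """Feed one probe result and return the target's resulting health."""
        if probe_ok == self.healthy:
            self._streak = 0  # probe agrees with the current state, reset the streak
            return self.healthy
        self._streak += 1
        threshold = self.unhealthy_threshold if self.healthy else self.healthy_threshold
        if self._streak >= threshold:
            self.healthy = not self.healthy
            self._streak = 0
        return self.healthy


if __name__ == "__main__":
    target = TargetHealth()
    results = [False, False, True, True, True, True, True]
    print([target.record(ok) for ok in results])
```

With a 30-second interval, two failed probes mean roughly a minute before a dead target leaves rotation, and five successes mean about two and a half minutes before it returns. Lowering the interval or thresholds trades stability for faster reaction.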
When a target is removed (scaling down, deployment, or unhealthy), ELB doesn't kill existing connections immediately. Deregistration delay gives in-flight requests time to complete:
How It Works
- Target marked for deregistration
- ELB stops sending NEW requests to it
- Existing connections continue until complete
- Or until deregistration delay expires
- Default: 300 seconds (5 minutes)
- Can set: 0-3600 seconds
Best Practices
- Short-lived requests (APIs): set to 30-60s
- Long-lived (WebSocket, uploads): set to 300-3600s
- Set to 0 for instant termination (testing)
- Critical for zero-downtime deployments
- Works with Auto Scaling scale-in
- Works with rolling deployment updates
Duration-Based (LB Cookie)
- ELB generates cookie: AWSALB
- Duration: 1 second to 7 days
- Same client → same target for duration
- Simple โ no app changes needed
- Warning: uneven load if sessions are long
Application-Based Cookie
- App generates custom cookie
- ELB reads it to maintain affinity
- Cookie name must NOT start with AWSALB
- App controls session lifecycle
- More flexible: app-aware routing
⚠️ Sticky Sessions + WebSocket Limitation
Cookie-based stickiness adds little for WebSocket connections. A WebSocket is one persistent connection, so after the initial HTTP upgrade request there are no further HTTP requests for the load balancer to route, and the connection stays on the target that accepted it. The connection itself provides the affinity. If clients must reach a specific target after a reconnect, use connection ID routing at the application level rather than relying on ELB stickiness.
Health checks are invisible to users but critical for reliability. They're how ELB achieves "zero-downtime": failed targets are silently removed, recovered targets automatically rejoin. Combine with cross-zone balancing for true fault tolerance across AZs.
- Target Group = logical pool of targets (EC2, IP, Lambda, or ALB); the destination for routing rules
- Target types: instance (by ID), IP (cross-VPC/on-prem), Lambda (ALB only), ALB (NLB only)
- Health checks = continuous probes every 5-300s; unhealthy targets are removed from rotation automatically
- Thresholds: unhealthy=2 failures, healthy=5 successes; tune for speed vs stability
- Deregistration delay = 300s default; allows in-flight requests to complete before the target is removed
- Cross-zone: ALB on by default (free), NLB off by default; enables failover across AZs
- Sticky sessions: bind client to same target via cookie; use sparingly (causes uneven load)
Security & Integration
Security groups control what traffic can reach your load balancer and what traffic your targets accept. The critical pattern is: allow public traffic to ELB, but restrict targets to only accept traffic FROM the ELB.
ALB Security Group
- Inbound: port 443 from 0.0.0.0/0 (internet)
- Inbound: port 80 from 0.0.0.0/0 (redirect to HTTPS)
- Outbound: port 8080 to target SG
- ALB always has a security group
- Can restrict by IP range (internal ALB)
Target (EC2) Security Group
- Inbound: port 8080 from ALB security group
- NOT from 0.0.0.0/0!
- Only ELB can talk to targets
- Targets are in private subnet
- No direct internet access to instances
The golden rule: Targets should ONLY allow inbound traffic from the ELB security group, never from 0.0.0.0/0. This ensures all traffic passes through the load balancer (health checks, routing, TLS) and targets aren't directly exposed.
NLB (Before 2023)
- NLB did NOT have security groups
- Traffic passed through to targets
- Target SG had to allow client IPs directly
- Because NLB preserves source IP
- Made security rules complex
NLB (After 2023)
- NLB now supports security groups
- Target SG can reference NLB SG
- Same pattern as ALB now possible
- Simplified security management
- Exam: know both old and new behavior
TLS termination means the load balancer decrypts HTTPS traffic and forwards plain HTTP to targets. This offloads CPU-intensive cryptography from your application servers.
TLS at ALB (Most Common)
- Attach ACM certificate to ALB
- ALB decrypts HTTPS → HTTP to targets
- Targets on port 80/8080 (plain HTTP)
- SNI support (multiple certs)
- Free ACM certificates
End-to-End Encryption
- ALB decrypts, then re-encrypts to targets
- Target group protocol: HTTPS
- Targets need their own certs
- Required for: PCI compliance, zero-trust
- More CPU on targets
SSL Policies
- Choose TLS version (1.2, 1.3)
- Choose cipher suites
- Recommended: TLSv1.2+ only
- ELBSecurityPolicy-TLS13-1-2-2021-06
- Disable old protocols (SSLv3, TLS 1.0)
WAF (Web Application Firewall)
- Attach AWS WAF to ALB
- Block SQL injection, XSS
- Rate limiting per IP
- Geo-blocking
- Custom rules & managed rule groups
- ALB only (not NLB)
AWS Shield
- Shield Standard: free, auto-enabled
- Protects against L3/L4 DDoS
- Shield Advanced: $3000/mo
- L7 DDoS protection + support
- SRT (Shield Response Team)
- Cost protection during attacks
Cognito / OIDC Auth
- ALB authenticates before forwarding
- Cognito User Pool integration
- Any OIDC provider (Google, Okta)
- Auth at LB level: app doesn't need auth code
- ALB only feature
CloudWatch
- Metrics: RequestCount, TargetResponseTime
- HealthyHostCount, UnHealthyHostCount
- HTTPCode_ELB_5XX_Count (LB errors)
- HTTPCode_Target_5XX_Count (target errors)
- Alarms for monitoring
Access Logs
- Log every request to S3
- Client IP, latency, response code
- Target chosen, health status
- Delivered to S3 every 5 minutes (ALB)
- Free (only S3 storage cost)
Global Accelerator
- Static IPs + AWS backbone routing
- Works with ALB (static IP for ALB!)
- Improves global latency
- Instant failover between regions
- Alternative to NLB for static IP
ELB is not just about load balancing; it's the natural point for TLS termination, authentication, WAF enforcement, and security group chaining. A properly configured ALB + WAF + Shield + SG chain provides defense-in-depth from DDoS to SQL injection to unauthorized access.
- Security Groups: ALB allows 0.0.0.0/0 :443, targets allow only from ALB SG; never expose targets directly
- NLB SGs: now supported (since 2023); same pattern as ALB possible
- TLS termination: ALB decrypts HTTPS → plain HTTP to targets; use ACM for free auto-renewing certs
- WAF: attach to ALB for SQL injection, XSS, rate limiting, geo-blocking (ALB only)
- Shield: Standard (free, auto) protects L3/L4 DDoS; Advanced ($3K/mo) adds L7 + support team
- Cognito/OIDC: ALB can authenticate users before forwarding; auth at infrastructure level
- Access Logs: every request logged to S3 (client IP, latency, status, target chosen)
Architecture Patterns
The most common AWS architecture pattern. ALB handles routing and health checks, Auto Scaling Group handles capacity. Together they create an elastic, self-healing web tier.
Why This Pattern Works
- ALB handles routing + TLS + health checks
- ASG handles capacity (scales up on load)
- Unhealthy instances auto-replaced by ASG
- Multi-AZ = AZ failure is survivable
- Rolling deploys via ASG instance refresh
- Cost-effective: scale down at night
Best For
- Traditional web applications
- Monolithic apps (before containerization)
- WordPress / PHP / Java web apps
- Predictable traffic patterns
- Teams familiar with EC2
- Exam tip: most common pattern tested
The modern standard for running microservices on AWS. ALB routes different paths to different ECS services, and each service scales independently. No servers to manage (Fargate = serverless containers).
Gaming Architecture
- NLB with Elastic IPs
- UDP for game state (low latency)
- TCP for chat/auth
- Millions of concurrent players
- Source IP preserved (anti-cheat)
- Static IP for DNS (game clients)
Financial Trading
- NLB for microsecond latency
- TLS passthrough (end-to-end audit)
- Static IP for partner whitelisting
- TCP connections for FIX protocol
- Cross-zone disabled (data locality)
- PrivateLink for cross-account access
When you need both a static IP (NLB) and content-based routing (ALB), the solution is chaining them. This is an exam favorite.
| Pattern | LB Type | Best For |
|---|---|---|
| ALB + EC2 ASG | ALB | Traditional web apps, monoliths, WordPress |
| ALB + ECS Fargate | ALB | Microservices, containers, modern apps |
| NLB (high-perf) | NLB | Gaming, IoT, financial, PrivateLink |
| NLB → ALB chain | NLB + ALB | Static IP + HTTP routing (exam favorite) |
| Multi-tier (ext+int) | 2× ALB | Enterprise apps, service isolation, compliance |
The "right" pattern depends on your requirements: ALB + ASG for traditional apps, ALB + Fargate for modern microservices, NLB for non-HTTP or extreme performance, NLB → ALB for static IP + routing, and multi-tier ALBs for enterprise isolation. Know all five; they are frequently tested across AWS exams.
- Pattern 1 (ALB + EC2 ASG): classic self-healing web tier; ALB routes, ASG scales, Multi-AZ for HA
- Pattern 2 (ALB + ECS Fargate): modern microservices; path routing to independent services, no servers to manage
- Pattern 3 (NLB high-perf): gaming/financial; static IP, microsecond latency, TCP/UDP, PrivateLink
- Pattern 4 (NLB → ALB): best of both; static IP from NLB + Layer 7 routing from ALB (exam favorite!)
- Pattern 5 (Multi-tier): external ALB (public) → web tier → internal ALB (private) → API services → DB