Amazon VPC
Virtual Private Cloud
Your own private data centre inside AWS. Define subnets, control routing, layer security, and connect to the internet or on-premises, all in software with no hardware required.
⚡ VPC in 30 Seconds
- Logically isolated network section inside AWS with your own private IP address space
- You define subnets, route tables, gateways, and security rules entirely in software
- Public subnets face the internet; private subnets stay shielded behind NAT
- Two firewall layers: Security Groups (instance-level, stateful) + NACLs (subnet-level, stateless)
- Connect to on-premises via VPN or Direct Connect; to other VPCs via Peering or Transit Gateway
What is a VPC
Amazon VPC (Virtual Private Cloud) is a logically isolated section of the AWS cloud where you launch AWS resources in a virtual network that you define. You have full control over the IP address range, subnet layout, route tables, network gateways, and security settings, just like operating a traditional on-premises network, but entirely in software.
💡 Think of a VPC as a virtual data centre you own inside AWS: private, fenced off, and fully under your control.
Before VPC existed, all AWS resources ran in a flat, shared network called EC2-Classic. Today every AWS account automatically gets a Default VPC in each region so you can launch resources immediately. For production workloads, architects create Custom VPCs with a deliberate design that enforces security and network separation.
Before VPC: EC2-Classic Problems
- All customers shared the same flat network, with no isolation
- No subnet-level security or routing control
- Could not bring your own IP ranges or topology
- Hard to replicate on-premises network architecture
- No private connectivity model to corporate networks
What VPC Solves
- Complete logical isolation: your traffic never touches other accounts
- You own the IP ranges, subnets, and routing decisions
- Two-layer security: Security Groups + Network ACLs
- Public and private network zones in the same VPC
- Extend the corporate data centre into AWS via VPN / Direct Connect
VPC isolation works at the network layer. Even if two EC2 instances exist in the same AWS region, they cannot communicate unless they share the same VPC (or explicit peering is configured). This is enforced by the underlying AWS hypervisor network; it is not just a firewall rule.
Account Boundary
Each AWS account is its own security principal. Resources across accounts are isolated by default: no cross-account traffic without explicit IAM + VPC configuration.
Region Boundary
A VPC lives in one AWS region. Resources in us-east-1 and eu-west-1 are in separate VPCs and isolated unless you set up cross-region peering or Transit Gateway.
VPC Boundary
Within one account and region you can have multiple VPCs. They are isolated from each other by default. You choose if and how they communicate.
Every AWS account has a Default VPC automatically created in each region. It is designed for quick experimentation: you can launch an EC2 instance immediately without any network setup. Custom VPCs are created by architects for production workloads where you control every network decision.
| Attribute | Default VPC | Custom VPC |
|---|---|---|
| Created by | AWS (automatic) | You (on demand) |
| CIDR block | 172.31.0.0/16 (fixed) | You choose (e.g. 10.0.0.0/16) |
| Subnets | One public subnet per AZ (pre-created) | You define public + private subnets |
| Internet Gateway | Attached automatically | You attach IGW when needed |
| Route tables | Default route: 0.0.0.0/0 → IGW | You create and manage per subnet |
| Security Groups | Default SG: all inbound from same SG | You define rules per workload |
| DNS hostnames | Enabled by default | Configurable per VPC |
| Best for | Quick starts, learning, dev testing | Production, multi-tier apps, compliance |
💡 Best practice: Never run production workloads in the Default VPC. Create Custom VPCs with a deliberate IP plan. The Default VPC makes every subnet public, which is not safe for databases, internal services, or regulated workloads.
A VPC is like a private office building that only your company occupies, inside a large public commercial district (AWS region):
Building = VPC
- The outer walls = VPC boundary, isolating your network from others
- Building address range = CIDR block (e.g., 10.0.0.0/16)
- You control who enters, who exits, and which floors talk to each other
- Multiple buildings (multiple VPCs) can exist in the same city (region)
Lobby = Public Subnet
- Visible from the street (internet-reachable)
- Web servers and load balancers live here
- Has a front door = Internet Gateway (IGW)
- Anyone with the address can reach it (if security rules allow)
Server Room = Private Subnet
- No direct street access: reachable only from inside the building
- Databases and app servers live here
- Outbound internet traffic exits via a back service door (NAT Gateway)
- Visitors from the lobby must pass an internal security check (SG)
Security Desk = Security Group + NACL
- Security Group = guard at each person's desk (instance-level, stateful)
- NACL = checkpoint at each floor entrance (subnet-level, stateless)
- Both layers can be active simultaneously for defence in depth
- SGs remember connections; NACLs check both inbound and outbound independently
When you create a VPC there are several building blocks that work together. Each chapter deep-dives into one, but here is the map:
| Component | What It Does | Covered In |
|---|---|---|
| VPC | Logical network boundary that defines the IP address space with a CIDR block | Chapter 01 (this chapter) |
| Subnets | Subdivide the VPC into ranges per AZ. Public = internet-routable. Private = internal only. | Chapter 02 |
| Route Tables | Rules that decide where traffic is forwarded for each subnet | Chapter 03 |
| Internet Gateway (IGW) | Two-way gateway that allows public subnets to communicate with the internet | Chapter 03 |
| NAT Gateway | Outbound-only internet access for private subnet resources | Chapter 06 |
| Security Groups | Stateful, instance-level virtual firewall, inbound and outbound per resource | Chapter 04 |
| Network ACLs (NACL) | Stateless, subnet-level firewall: first line of defence at the subnet boundary | Chapter 04 |
| VPC Peering | Private, direct connection between two VPCs (same or different account/region) | Chapter 05 |
| VPC Endpoints | Private access to AWS services (S3, DynamoDB) without hitting the public internet | Chapter 06 |
Nearly every AWS compute, database, and integration service runs inside or connects to a VPC. Understanding VPC is foundational for everything else in AWS networking:
Compute
EC2 instances, ECS tasks, Lambda (VPC mode), and EKS worker nodes all sit inside VPC subnets. Without a VPC, you cannot run them privately.
Databases
RDS, Aurora, ElastiCache, and Redshift all require a VPC subnet group. They live in private subnets and are reachable only from within the VPC.
Load Balancing
ALB and NLB are deployed into VPC subnets. Public-facing load balancers sit in public subnets; internal load balancers in private subnets.
Security Services
AWS WAF, Shield, GuardDuty, and Inspector all operate at the VPC layer. VPC Flow Logs capture all traffic metadata for monitoring and audit.
Storage
S3 and DynamoDB can be accessed privately via VPC Endpoints, so traffic stays on the AWS backbone without hitting the public internet.
Hybrid Connectivity
AWS VPN and Direct Connect attach to a VPC via a Virtual Private Gateway, extending your corporate network into AWS seamlessly.
VPC is the foundation of all AWS networking. Every resource you launch lives inside a VPC. Getting the VPC design right (CIDR ranges, subnet splits, routing, and security layers) defines the security and scalability of your entire AWS architecture.
- VPC = your private network inside AWS: logically isolated, fully under your control, no extra hardware required.
- Isolation is enforced at the hypervisor level: two VPCs cannot communicate by default, even in the same account and region.
- Default VPC: auto-created with public subnets and an IGW; ideal for learning, not for production.
- Custom VPC: you define the CIDR, subnets, route tables, and security rules; mandatory for production workloads.
- Public subnets face the internet via an Internet Gateway. Private subnets are shielded, with internet access only through a NAT Gateway.
- Two security layers: Security Groups (stateful, instance-level) + Network ACLs (stateless, subnet-level). Use both for defence in depth.
- VPC is foundational: EC2, RDS, Lambda, ALB, ECS, EKS all live inside VPC subnets.
🎯 Chapter 01 Exam Tips
- VPC is region-scoped. Subnets are AZ-scoped. A VPC cannot span regions.
- Default VPC has all public subnets + IGW. Custom VPC starts with nothing; you build it.
- Two VPCs in the same account cannot communicate by default; you need Peering or Transit Gateway.
- Every VPC has exactly one primary CIDR block, plus up to 4 secondary CIDRs.
- VPC = logical isolation at the hypervisor level. It is NOT just a firewall; it's network-level separation.
Subnets & CIDR
CIDR (Classless Inter-Domain Routing) is the notation used to define a range of IP addresses. A CIDR block looks like 10.0.0.0/16: the number after the slash tells you how many bits are fixed (the network part), and the rest are free for hosts.
💡 Rule of thumb: the smaller the prefix number (e.g. /16), the bigger the range; the bigger the number (e.g. /28), the smaller the range and the fewer IPs available.
| CIDR Notation | Fixed Bits | Total IPs | Usable IPs (AWS) | Typical Use |
|---|---|---|---|---|
| 10.0.0.0/16 | 16 | 65,536 | 65,531 | Entire VPC: large address space |
| 10.0.1.0/24 | 24 | 256 | 251 | Single subnet: most common size |
| 10.0.1.0/26 | 26 | 64 | 59 | Small subnet: microservices, Lambda |
| 10.0.1.0/28 | 28 | 16 | 11 | Minimum subnet size (AWS allows down to /28) |
💡 AWS reserves 5 IPs in every subnet: the first 4 (network, router, DNS, future use) plus the last 1 (broadcast). A /24 gives you 256 total but only 251 usable.
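The arithmetic in the table above can be checked with Python's standard `ipaddress` module. A minimal sketch (the `aws_usable_ips` helper name is illustrative, not an AWS API):

```python
import ipaddress

def aws_usable_ips(cidr: str) -> int:
    """Total addresses in the block minus the 5 AWS reserves per subnet
    (network, VPC router, DNS, future use, broadcast)."""
    return ipaddress.ip_network(cidr).num_addresses - 5

for cidr in ["10.0.0.0/16", "10.0.1.0/24", "10.0.1.0/26", "10.0.1.0/28"]:
    net = ipaddress.ip_network(cidr)
    print(f"{cidr}: {net.num_addresses} total, {aws_usable_ips(cidr)} usable")
# 10.0.0.0/16: 65536 total, 65531 usable
# 10.0.1.0/24: 256 total, 251 usable
# 10.0.1.0/26: 64 total, 59 usable
# 10.0.1.0/28: 16 total, 11 usable
```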
When you create a VPC, you assign a primary CIDR block. This is the total IP address space for your VPC β you carve it up into subnets. Getting this right matters because you cannot shrink a CIDR after creation and changing it requires rebuilding the VPC.
CIDR Sizing Rules
- VPC CIDR must be between /16 (65,536 IPs) and /28 (16 IPs)
- Use RFC 1918 private ranges: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16
- Choose a /16 for production VPCs: room to add subnets and AZs later
- You can add up to 4 secondary CIDR blocks to a VPC later
- Avoid overlapping CIDRs if you plan VPC Peering or Direct Connect
Recommended Ranges
- 10.0.0.0/16: most common production choice
- 10.10.0.0/16: when you have multiple VPCs (avoid overlap)
- 172.31.0.0/16: used by the AWS Default VPC (avoid for custom VPCs)
- Reserve a unique /16 per environment: 10.0.x.x = dev, 10.1.x.x = staging, 10.2.x.x = prod
A subnet (sub-network) is a range of IP addresses carved out of your VPC's CIDR block. Every EC2 instance, RDS database, and Lambda function in VPC mode must be placed in a subnet. Subnets are the fundamental unit of network placement in AWS.
AZ-Scoped
A subnet lives in exactly one Availability Zone. It cannot span multiple AZs. To achieve HA, you create one subnet per AZ and launch resources in each.
Non-overlapping
Subnet CIDRs within a VPC cannot overlap. If the VPC is 10.0.0.0/16, subnets like 10.0.1.0/24 and 10.0.2.0/24 are valid, but 10.0.0.0/23 and 10.0.1.0/24 would conflict: the /23 already covers 10.0.1.x.
Route-Table Linked
Each subnet is associated with exactly one route table. The route table decides if traffic from that subnet can reach the internet, other subnets, or stays internal.
A common misconception: there is no "public" flag on a subnet. What makes a subnet public or private is purely its route table. A subnet whose route table has a route to an Internet Gateway (IGW) is public. A subnet with no IGW route is private.
| Attribute | Public Subnet | Private Subnet |
|---|---|---|
| Route to 0.0.0.0/0 | Points to Internet Gateway (IGW) | Points to NAT Gateway, or no internet route at all |
| Resources inside | Load balancers, bastion hosts, NAT Gateway | App servers, databases, internal services |
| Inbound from internet | Yes (if Security Group allows) | No: not directly reachable from the internet |
| Outbound to internet | Yes, direct via IGW | Yes, but via a NAT Gateway in a public subnet |
| Public IP assignment | Auto-assign enabled (recommended) | Disabled: private IPs only |
| DNS hostnames | Public DNS name assigned (e.g. ec2-xx.compute.amazonaws.com) | Private DNS name only |
| Typical resources | ALB, NLB, NAT GW, Bastion EC2 | EC2 app servers, RDS, ElastiCache, Lambda |
💡 Key insight: the route table entry 0.0.0.0/0 → igw-xxxx is what makes a subnet "public". Remove that route and the subnet becomes private instantly; add it back and it is public again. The subnet CIDR itself doesn't change.
In every subnet, AWS reserves the first 4 and last 1 IP address. For 10.0.1.0/24:
| IP Address | Reserved For |
|---|---|
| 10.0.1.0 | Network address: identifies the subnet itself |
| 10.0.1.1 | VPC router: your default gateway for the subnet |
| 10.0.1.2 | DNS server (VPC base + 2 rule) |
| 10.0.1.3 | Reserved by AWS for future use |
| 10.0.1.255 | Broadcast address (not used in a VPC but still reserved) |
💡 Exam trap: A /28 subnet has 16 IPs total. Minus 5 reserved = 11 usable. If you need 14 EC2 instances in one subnet, a /28 is NOT enough. Use /27 (32 IPs, 27 usable) instead.
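Both the reserved-address table and the /28-vs-/27 trap can be reproduced with the stdlib `ipaddress` module; iterating a network yields every address in order, so the AWS-reserved ones are simply the first four and the last:

```python
import ipaddress

subnet = ipaddress.ip_network("10.0.1.0/24")
hosts = list(subnet)                    # all 256 addresses, in order
reserved = hosts[:4] + [hosts[-1]]      # first 4 + last 1 are AWS-reserved
print([str(ip) for ip in reserved])
# ['10.0.1.0', '10.0.1.1', '10.0.1.2', '10.0.1.3', '10.0.1.255']

# Will 14 instances fit? Check /28 vs /27 against the 5-IP reserve.
need = 14
for prefix in (28, 27):
    n = ipaddress.ip_network(f"10.0.2.0/{prefix}")
    usable = n.num_addresses - 5
    print(f"/{prefix}: {usable} usable ->", "fits" if usable >= need else "too small")
# /28: 11 usable -> too small
# /27: 27 usable -> fits
```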
Do This
- Use /24 for most subnets: 251 IPs is enough for most workloads
- Create subnets in at least 2 AZs per tier for fault tolerance
- Make public subnets smaller (e.g., /26 or /27): fewer resources need public IPs
- Make private subnets larger: app servers, containers, and Lambda need IPs
- Leave gaps in your CIDR plan for future subnets (don't fill the /16 immediately)
- Use consistent naming: pub-a, pub-b, priv-a, priv-b
Avoid This
- Don't use /28 for app subnets: ECS/EKS and Lambda chew through IPs fast
- Don't deploy databases in public subnets, ever
- Don't use overlapping CIDRs across VPCs if you plan peering
- Don't rely on the Default VPC in production: you can't fully control it
- Don't skip adding subnets in a 3rd AZ for critical workloads
A well-planned VPC for a 3-AZ production deployment using 10.0.0.0/16:
| Subnet | CIDR | AZ | Type | Resources |
|---|---|---|---|---|
| public-a | 10.0.1.0/24 | us-east-1a | Public | ALB, NAT GW |
| public-b | 10.0.4.0/24 | us-east-1b | Public | ALB, NAT GW |
| public-c | 10.0.7.0/24 | us-east-1c | Public | ALB, NAT GW |
| app-a | 10.0.2.0/24 | us-east-1a | Private | EC2, ECS tasks |
| app-b | 10.0.5.0/24 | us-east-1b | Private | EC2, ECS tasks |
| app-c | 10.0.8.0/24 | us-east-1c | Private | EC2, ECS tasks |
| db-a | 10.0.3.0/24 | us-east-1a | Private | RDS, ElastiCache |
| db-b | 10.0.6.0/24 | us-east-1b | Private | RDS Standby |
| db-c | 10.0.9.0/24 | us-east-1c | Private | RDS Replica |
💡 9 subnets × 251 usable IPs = 2,259 IPs used out of 65,531 VPC IPs. You still have room to add Lambda subnets, management subnets, or secondary CIDR blocks; always plan with growth in mind.
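A plan like this can be generated rather than typed by hand: `ipaddress` can carve a /16 into /24 slices and hand them out tier by tier, AZ by AZ. The sketch below reproduces the same CIDR assignments (10.0.0.0/24 is deliberately skipped to leave a gap):

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")
azs = ["us-east-1a", "us-east-1b", "us-east-1c"]
tiers = ["public", "app", "db"]

slices = vpc.subnets(new_prefix=24)   # iterator over 10.0.0.0/24, 10.0.1.0/24, ...
next(slices)                          # skip 10.0.0.0/24 -- reserved gap for future use

plan = {}
for az in azs:                        # AZ outer, tier inner matches the table above
    for tier in tiers:
        plan[f"{tier}-{az[-1]}"] = next(slices)

for name, cidr in plan.items():
    print(f"{name:10s} {cidr}")
# public-a   10.0.1.0/24
# app-a      10.0.2.0/24
# db-a       10.0.3.0/24
# public-b   10.0.4.0/24  ... and so on through db-c 10.0.9.0/24
```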
Subnets are just CIDR slices of your VPC assigned to one AZ. What makes a subnet public or private is the route table, not a flag. AWS reserves 5 IPs per subnet. Plan your CIDR early, use /24 for most subnets, and spread every tier across at least 2 AZs.
- CIDR notation: 10.0.0.0/16 means 65,536 IPs with the first 16 bits fixed. Smaller prefix = more IPs.
- VPC CIDR: choose between /16 and /28. Use /16 for production to leave room for subnets and future expansion.
- Subnets are AZ-scoped: one subnet = one AZ. Spread across 2–3 AZs for fault tolerance.
- Public vs Private: the route table makes the difference. Route to IGW = public. No IGW route = private.
- AWS reserves 5 IPs per subnet: network, router, DNS, future use, broadcast. /28 = 11 usable, /24 = 251 usable.
- Use /24 subnets for most tiers. Make public subnets smaller; private app/db subnets larger.
- Plan CIDRs for peering: never use overlapping ranges across VPCs you might connect later.
🎯 Chapter 02 Exam Tips
- Subnet = one AZ only. You cannot stretch a subnet across AZs.
- AWS reserves 5 IPs per subnet. A /28 gives only 11 usable; a /24 gives 251.
- "Public subnet" just means its route table has a route to an IGW; there is no "public" flag.
- VPC max CIDR = /16 (65,536 IPs). Smallest allowed = /28 (16 IPs, 11 usable).
- Overlapping CIDRs between VPCs = cannot peer them. Plan ranges early.
Routing & Internet Access
A route table is a set of rules (called routes) that tells VPC network traffic where to go. Every subnet in a VPC is associated with exactly one route table. When traffic leaves a resource in a subnet, the VPC router consults the route table to decide the next hop.
💡 Think of a route table as a GPS for your network packets: "if the destination is 10.0.0.0/16, stay local; if it's 0.0.0.0/0, go to the internet gateway."
Route Table Structure
- Each row = one route (destination + target)
- Destination: CIDR range this rule applies to
- Target: where to send matching traffic (IGW, NAT, local, etc.)
- Most specific (longest prefix) wins when multiple rules match
Association
- Each subnet is associated with exactly one route table
- Multiple subnets can share the same route table
- A subnet not explicitly associated with a route table uses the main route table
- You can replace the main route table at any time
Local Route (always present)
- Every route table has a built-in local route
- Destination: your VPC CIDR (e.g. 10.0.0.0/16)
- Target: local
- This route cannot be deleted or modified; it's AWS-enforced
- Allows all resources within the VPC to communicate with each other
| Destination | Target | Meaning |
|---|---|---|
| 10.0.0.0/16 | local | All traffic within the VPC stays internal. Always present, cannot be removed. |
| 0.0.0.0/0 | igw-xxxx | All other traffic → Internet Gateway. Makes this a public subnet's route table. |
| 0.0.0.0/0 | nat-xxxx | All other traffic → NAT Gateway. Private subnet, outbound internet only. |
| 10.2.0.0/16 | pcx-xxxx | Traffic to a peered VPC range → VPC Peering connection. |
| 192.168.0.0/16 | vgw-xxxx | Traffic to an on-premises range → Virtual Private Gateway (VPN / Direct Connect). |
The local route is automatically created when you create a VPC and cannot be removed. It ensures all resources within the VPC can communicate with each other using private IP addresses, regardless of which subnet they are in.
What the Local Route Enables
- EC2 in subnet A → EC2 in subnet B (same VPC): works automatically
- EC2 → RDS in another subnet: works via the local route
- Lambda in a VPC → EKS node in another subnet: works
- No extra configuration needed for same-VPC communication
- Security Groups still control whether traffic is allowed
What Still Needs Configuration
- Internet access still requires IGW + route + public IP
- Cross-VPC communication requires VPC Peering or Transit Gateway
- On-premises access requires VPN or Direct Connect
- Security Groups must explicitly allow the traffic even within VPC
An Internet Gateway is a horizontally scaled, redundant, highly available VPC component that allows resources in your VPC to communicate with the internet. It serves two purposes: providing a target in route tables for internet-routable traffic, and performing NAT for instances with public IPs.
IGW Key Facts
- One IGW per VPC; you attach it to the VPC once
- Horizontally scaled: no bandwidth limit, no single point of failure
- Free to attach; you only pay for data transfer
- Performs 1:1 NAT between a private IP and a public/Elastic IP
- Both inbound AND outbound traffic flow through the IGW (bidirectional)
IGW Requirements for Internet Access
- IGW must be attached to the VPC
- Subnet's route table must have 0.0.0.0/0 → igw-xxxx
- EC2 instance must have a public IP or Elastic IP assigned
- Security Group must allow the inbound/outbound traffic
- NACL must not be blocking the traffic (the default NACL allows all)
💡 Missing any one of these = no internet. The most common mistake is a route table pointing to the IGW but no public IP on the EC2 instance (or vice versa: a public IP but no route to the IGW).
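Because every requirement must hold at once, it helps to think of the checklist as a conjunction and report whichever parts fail. A hypothetical helper (all field names are illustrative, not an AWS API):

```python
def internet_access_check(instance: dict):
    """Return (ok, failed_checks) for the five-part checklist above."""
    checks = {
        "IGW attached to VPC":     instance["vpc_has_igw"],
        "route 0.0.0.0/0 -> IGW":  instance["subnet_route_to_igw"],
        "public or Elastic IP":    instance["has_public_ip"],
        "SG allows the traffic":   instance["sg_allows"],
        "NACL allows the traffic": instance["nacl_allows"],
    }
    failed = [name for name, ok in checks.items() if not ok]
    return len(failed) == 0, failed

# The classic mistake: route to IGW exists, but the instance has no public IP
ec2 = {"vpc_has_igw": True, "subnet_route_to_igw": True,
       "has_public_ip": False, "sg_allows": True, "nacl_allows": True}
ok, failed = internet_access_check(ec2)
print(ok, failed)  # False ['public or Elastic IP']
```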
Public Route Table
- 10.0.0.0/16 → local (always present)
- 0.0.0.0/0 → igw-xxxx (the internet route)
- Associated with public subnets only
- Resources here need a public/Elastic IP to actually reach the internet
- IGW must be attached to the VPC first
Private Route Table
- 10.0.0.0/16 → local (always present)
- 0.0.0.0/0 → nat-xxxx (via NAT Gateway)
- Associated with private subnets
- Resources can initiate outbound internet connections (e.g. yum update)
- The internet cannot initiate inbound connections: NAT is one-directional
💡 No 0.0.0.0/0 route at all = a fully isolated subnet (no internet in or out). Use this for your most sensitive resources: isolated databases or compliance workloads that should never touch the internet.
| Rule | Detail |
|---|---|
| 1:1 association | Each subnet is associated with exactly one route table at any time |
| Many subnets, one table | Multiple subnets can share the same route table (common for private app subnets across AZs) |
| Main route table | Every VPC has a main route table. Subnets not explicitly associated default to it. |
| Explicit association | You create an explicit association to override the main route table for a specific subnet |
| Best practice | Never add internet routes to the main route table; use explicit associations for each tier |
For IPv6-enabled VPCs, AWS provides an Egress-Only Internet Gateway. Since IPv6 addresses in a VPC are globally unique and publicly routable (there is no NAT for IPv6), the egress-only IGW allows outbound-only IPv6 traffic, playing the same role that the NAT Gateway plays for IPv4 private subnets.
IPv4 + IPv6 Comparison
- IPv4 private → internet: NAT Gateway in a public subnet
- IPv6 → internet, outbound only: Egress-Only Internet Gateway
- IPv6 → internet, full bidirectional: regular Internet Gateway
- An egress-only IGW prevents unsolicited inbound IPv6 connections
When You Need It
- Dual-stack VPC (both IPv4 and IPv6 CIDRs)
- EC2 instances with IPv6 addresses that need outbound internet
- Exam: "IPv6 private subnet internet access" → Egress-Only IGW
| Symptom | Root Cause | Fix |
|---|---|---|
| EC2 in public subnet can't reach internet | No IGW attached to the VPC, no 0.0.0.0/0 → IGW route, or no public IP on the EC2 | Attach the IGW, add the route, assign an Elastic IP or enable auto-assign public IP |
| EC2 in private subnet can't reach internet (e.g. yum update fails) | No NAT Gateway, or the private route table is missing 0.0.0.0/0 → NAT | Create a NAT GW in a public subnet, add a route in the private RT to the NAT GW |
| Private EC2 can't reach RDS in the same VPC | The Security Group on RDS doesn't allow inbound from the EC2 SG/IP | Add an SG rule on RDS allowing inbound 3306/5432 from the EC2's SG |
| Subnet unexpectedly routes to the internet | Subnet is using the main route table, which has an IGW route | Create a separate private route table, explicitly associate the subnet, remove the IGW route from the main RT |
| Traffic to a peered VPC fails | Missing route to the peered VPC CIDR, or the peered VPC's RT is missing the return route | Add a route in both VPCs: a peer-cidr → pcx-xxxx entry |
Route tables are the traffic cops of your VPC. The local route handles all internal VPC traffic automatically. Add 0.0.0.0/0 → IGW for public subnets, 0.0.0.0/0 → NAT for private subnets, and leave isolated subnets with no default route. Routing + Security Groups together control connectivity.
- Route tables direct traffic: each subnet has exactly one, and multiple subnets can share a table.
- The local route (VPC-CIDR → local) is always present, cannot be deleted, and enables all intra-VPC communication.
- Public subnet = route table has 0.0.0.0/0 → IGW. The instance also needs a public or Elastic IP.
- Private subnet = route table has 0.0.0.0/0 → NAT GW. Outbound only; the internet cannot initiate inbound.
- Isolated subnet = no default route at all. Used for databases and sensitive workloads that need zero internet exposure.
- IGW is horizontally scaled, free to attach, and performs 1:1 NAT between Elastic IPs and private IPs.
- Longest prefix match: more specific routes take priority. 10.0.1.0/24 beats 10.0.0.0/16, which beats 0.0.0.0/0.
🎯 Chapter 03 Exam Tips
- Public subnet = route table has 0.0.0.0/0 → IGW. The instance ALSO needs a public/Elastic IP.
- Private subnet = route table has 0.0.0.0/0 → NAT GW. Outbound only; the internet cannot initiate inbound.
- The local route cannot be deleted. It is always present and keeps intra-VPC traffic local.
- If EC2 can't reach the internet, check: (1) IGW attached, (2) route to IGW, (3) public IP, (4) SG outbound, (5) NACL outbound.
- Each subnet = exactly one route table. One route table can serve multiple subnets.
Security: SG vs NACL
AWS VPC provides two independent firewall layers that work at different levels of your network. Used together they give defence in depth: if one is misconfigured, the other still protects you.
Security Groups: Instance-Level
- Attached to a network interface (EC2, RDS, ALB, Lambda, …)
- Stateful: return traffic is automatically allowed
- Allow rules only: no explicit deny
- Rules evaluated as a whole (all rules checked; most permissive wins)
- Default: all outbound allowed, all inbound denied
Network ACLs: Subnet-Level
- Attached to a subnet: applies to all traffic entering/leaving it
- Stateless: inbound and outbound rules are evaluated independently
- Allow and Deny rules: can explicitly block IPs/ranges
- Rules evaluated in number order; first match wins
- Default NACL: all traffic allowed in both directions
💡 Key exam distinction: Security Groups = stateful (like a bouncer who remembers you leaving, so lets you back in). NACLs = stateless (like a checkpoint that checks your ID both ways, every time).
A Security Group acts as a virtual firewall for your AWS resources. You attach one or more Security Groups to an EC2 instance (or any resource with a network interface), and all traffic to/from that resource is evaluated against the group's rules.
Stateful Behaviour
If you allow inbound HTTP (port 80) to an EC2, the response traffic (on ephemeral ports) automatically flows back: you do NOT need an outbound rule for the response. The SG tracks the connection state.
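The stateful behaviour described above can be sketched as connection tracking: the SG remembers flows it admitted, and the reverse direction of a tracked flow passes with no outbound rule at all. Purely a toy model, not how AWS implements it:

```python
# SG rule set: allow inbound HTTP only; NO outbound rules defined at all
inbound_allow_ports = {80}
tracked = set()                          # connections the SG has admitted

def handle_inbound(client, client_port, server_port):
    if server_port in inbound_allow_ports:
        tracked.add((client, client_port, server_port))  # remember the flow
        return True
    return False

def handle_outbound(dest, dest_port, server_port):
    # A response to a tracked connection passes even though no outbound
    # rule covers the client's ephemeral port -- that is "stateful".
    return (dest, dest_port, server_port) in tracked

handle_inbound("203.0.113.9", 54231, 80)           # browser -> EC2 :80, allowed
print(handle_outbound("203.0.113.9", 54231, 80))   # True -- stateful return path
print(handle_outbound("198.51.100.7", 40000, 80))  # False -- no tracked connection
```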
Rule Structure
- Type: SSH, HTTP, Custom TCP, …
- Protocol: TCP, UDP, ICMP
- Port range: a single port or a range
- Source/Dest: a CIDR, another SG ID, or a prefix list
SG-to-SG Rules
You can reference another Security Group as a source/destination. Instead of hardcoding IPs, the rule says "allow from all instances in SG-xyz": the most scalable approach for multi-tier apps.
Security Group Best Practices
- One SG per role: sg-web, sg-app, sg-db
- Reference SGs between tiers; don't use CIDRs for internal rules
- Allow only the ports your app actually needs
- Never use 0.0.0.0/0 for SSH/RDP; use your IP or a Bastion SG
- Use descriptive names and descriptions for every rule
- Multiple SGs on one resource = union of all rules (most permissive wins)
Common Mistakes
- Opening 0.0.0.0/0 port 22 (SSH) to the internet
- Using CIDR ranges for inter-service rules instead of SG references
- Forgetting that multiple SGs on one instance are unioned, not AND-ed
- Confusing stateful SGs with stateless NACLs: an SG doesn't need an outbound rule for responses
- Not cleaning up unused Security Groups (unused ≠ harmless: they count against your quotas)
| SG Name | Tier | Inbound Rules | Outbound Rules |
|---|---|---|---|
| sg-alb | Load Balancer | 443 from 0.0.0.0/0; 80 from 0.0.0.0/0 | 8080 to sg-app (or All to VPC CIDR) |
| sg-app | App Server | 8080 from sg-alb | 5432 to sg-db; 443 to 0.0.0.0/0 (API calls) |
| sg-db | Database | 5432 from sg-app | None needed (stateful: responses auto-allowed) |
| sg-bastion | Bastion | 22 from your-office-IP/32 | 22 to VPC CIDR (reach internal servers) |
A Network ACL (NACL) is an optional, stateless security layer for your subnets. Every subnet must have a NACL; if you don't create a custom one, AWS attaches the default NACL, which allows all traffic.
NACL Rule Numbering
- Rules are evaluated in ascending number order, lowest first
- The first matching rule wins; evaluation stops there
- Rule numbers range from 1 to 32766
- Rule * (asterisk) is the implicit deny-all at the end; it cannot be removed
- Leave gaps (100, 200, 300, …) so you can insert rules later
Stateless: Both Directions
- Every request needs an inbound rule AND an outbound rule
- HTTP responses from the server use ephemeral ports (1024–65535)
- You must allow outbound on the full ephemeral port range for responses
- This is the most common NACL mistake: people forget the outbound return rule
Inbound Rules
| # | Type | Port | Source | Action |
|---|---|---|---|---|
| 100 | HTTPS | 443 | 0.0.0.0/0 | ✅ ALLOW |
| 110 | HTTP | 80 | 0.0.0.0/0 | ✅ ALLOW |
| 120 | Custom TCP | 1024–65535 | 0.0.0.0/0 | ✅ ALLOW (responses) |
| * | All | All | 0.0.0.0/0 | 🚫 DENY |
Outbound Rules
| # | Type | Port | Dest | Action |
|---|---|---|---|---|
| 100 | Custom TCP | 1024–65535 | 0.0.0.0/0 | ✅ ALLOW (responses) |
| 110 | HTTPS | 443 | 0.0.0.0/0 | ✅ ALLOW |
| * | All | All | 0.0.0.0/0 | 🚫 DENY |
💡 Ephemeral port trap: when your EC2 responds to a browser request, the response goes to the browser's ephemeral port (e.g., 54231). Your NACL outbound rules must allow ports 1024–65535 or responses are silently dropped: this is stateless behaviour.
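The ordered, first-match-wins evaluation can be sketched in a few lines, using the inbound rule set from the table above (a toy model, not the AWS evaluator):

```python
inbound = [
    (100, "allow", range(443, 444)),     # rule 100: HTTPS
    (110, "allow", range(80, 81)),       # rule 110: HTTP
    (120, "allow", range(1024, 65536)),  # rule 120: ephemeral ports for responses
]

def evaluate(rules, port):
    """Check rules in ascending number order; the first match wins."""
    for num, action, ports in sorted(rules):
        if port in ports:
            return f"rule {num}: {action}"
    return "rule *: deny"                # implicit catch-all deny at the end

print(evaluate(inbound, 443))    # rule 100: allow
print(evaluate(inbound, 54231))  # rule 120: allow (a response on an ephemeral port)
print(evaluate(inbound, 22))     # rule *: deny  (SSH never matched an allow rule)
```

Delete the tuple for rule 120 and the check for port 54231 falls through to the implicit deny, which is exactly how forgotten return rules silently drop responses.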
| Feature | Security Group | Network ACL |
|---|---|---|
| Scope | Instance / ENI level | Subnet level: all resources in the subnet |
| Stateful? | ✅ Yes: return traffic auto-allowed | ❌ No: must explicitly allow both directions |
| Allow/Deny | Allow rules only (no explicit deny) | Both Allow and Deny rules |
| Rule evaluation | All rules evaluated; most permissive wins | Rules in number order; first match wins |
| Default behaviour | All inbound DENIED, all outbound allowed | Default NACL: all ALLOWED (custom: all DENIED) |
| Can block a specific IP? | ❌ No: you cannot deny a specific IP | ✅ Yes: add a DENY rule for that IP/range |
| Applies to | Specific resources you attach it to | All resources in the associated subnet |
| Rule count limit | 60 inbound + 60 outbound rules per SG (soft) | 20 inbound + 20 outbound rules per NACL (soft) |
| Use case | Primary traffic control per application tier | Extra layer: IP blacklisting, compliance controls |
Use Security Groups as your primary control (stateful, per-resource). Add NACLs as a second layer for subnet-wide IP blocking, compliance controls, or to explicitly deny traffic that SGs alone cannot. Together they provide defence in depth: traffic must pass both.
- Security Groups: stateful, instance-level, allow-only rules, all evaluated together (most permissive wins).
- NACLs: stateless, subnet-level, allow + deny rules, evaluated in number order (first match wins).
- Stateful vs stateless: SG return traffic is automatic. NACLs require explicit rules for both directions, including ephemeral ports.
- Default SG: all inbound denied, all outbound allowed. Default NACL: all allowed. Custom NACL: all denied until you add rules.
- Use SG-to-SG references for inter-tier rules: more scalable than CIDR-based rules.
- Use NACLs to deny known bad IPs, block port ranges, or add subnet-wide controls beyond what SGs offer.
- Both layers must allow the traffic: one misconfigured layer blocks everything even if the other allows it.
🎯 Chapter 04 Exam Tips
- SG = stateful (return traffic auto-allowed). NACL = stateless (must allow both directions explicitly).
- SGs have allow-only rules; NACLs have allow + deny. Only a NACL can explicitly block an IP.
- NACL outbound must allow ephemeral ports (1024–65535) for response traffic: the #1 NACL mistake.
- Multiple SGs on one instance = union (most permissive wins). NACL rules are evaluated in number order (first match wins).
- Default SG: all inbound denied. Default NACL: all traffic allowed. Custom NACL: all traffic denied until you add rules.
Connectivity
A VPC is isolated by default: nothing outside it can reach your resources unless you explicitly configure a connectivity path. AWS provides several mechanisms to connect your VPC to the internet, to other VPCs, or to on-premises environments.
VPC β VPC
- VPC Peering β direct 1:1 private link
- Transit Gateway β hub-and-spoke for many VPCs
- PrivateLink β expose a service privately to another VPC
VPC β On-Premises
- Site-to-Site VPN β encrypted tunnel over internet
- AWS Direct Connect β dedicated private circuit
- Client VPN β individual user access
VPC β AWS Services
- VPC Endpoints (Gateway) β S3, DynamoDB (free)
- VPC Endpoints (Interface) β other AWS services via PrivateLink
- Covered in Chapter 06
A VPC Peering connection is a networking connection between two VPCs that enables you to route traffic between them using private IPv4 or IPv6 addresses. The VPCs can be in the same account, different accounts, or different regions (inter-region peering).
How VPC Peering Works
- Requester VPC sends a peering request β accepter VPC accepts
- Add routes in both VPCs' route tables: peer-CIDR → pcx-xxxxx
- Update Security Groups to allow traffic from the peer CIDR
- Traffic stays on AWS backbone β never traverses the internet
- No bandwidth throttle, no single point of failure
- You pay no extra for same-region peering (data transfer still billed)
VPC Peering Limitations
- No transitive routing β AβB and BβC does NOT mean A can reach C
- No overlapping CIDRs β peered VPCs must have non-overlapping IP ranges
- 1:1 only β to connect 10 VPCs you need 45 peering connections (n*(n-1)/2)
- No cross-region security group references – inter-region peering requires CIDR-based rules
- No edge-to-edge routing (VPN, Direct Connect) through a peer
π Transitive routing trap (exam favourite): VPC-A peers with VPC-B, VPC-B peers with VPC-C. VPC-A cannot reach VPC-C through VPC-B – VPC Peering is not transitive. Use Transit Gateway for hub-and-spoke connectivity instead.
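The scaling problem behind the limitation list is easy to quantify: a full mesh of n VPCs needs n*(n-1)/2 peering connections, while a Transit Gateway needs only one attachment per VPC.

```python
# Full-mesh peering grows quadratically; TGW attachments grow linearly.
def full_mesh_peerings(n: int) -> int:
    """Number of 1:1 peering connections to fully connect n VPCs."""
    return n * (n - 1) // 2

for n in (3, 5, 10, 20):
    print(f"{n} VPCs: {full_mesh_peerings(n)} peerings vs {n} TGW attachments")
```

At 10 VPCs you are already managing 45 peering connections (and 45 pairs of route-table updates), which is why the hub-and-spoke model wins past a handful of VPCs.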
AWS Site-to-Site VPN creates an encrypted IPSec tunnel between your on-premises network and your AWS VPC over the public internet. It is the fastest and cheapest way to establish hybrid connectivity β setup takes minutes.
How Site-to-Site VPN Works
- AWS side: attach a Virtual Private Gateway (VGW) to your VPC
- On-premises side: configure your Customer Gateway (CGW) device (router/firewall)
- AWS creates two IPSec tunnels for redundancy (different AWS endpoints)
- Add route in the VPC route table: on-prem-CIDR → vgw-xxx
- IPSec encrypted – AES-256 or AES-128
VPN Performance & Cost
- Max 1.25 Gbps per tunnel; aggregating tunnels with ECMP for higher throughput requires a Transit Gateway (not supported on a VGW)
- Latency varies β goes over internet, so latency fluctuates
- Cost: ~$0.05/hr per VPN connection + data transfer
- Good for: backup DR connectivity, testing, low-bandwidth workloads
- Not for: high-bandwidth, latency-sensitive (use Direct Connect instead)
AWS Direct Connect provides a dedicated private physical network connection from your data centre to AWS. Traffic does not go over the internet β it flows through a private fiber link to an AWS Direct Connect location, then into your VPC.
Direct Connect Key Facts
- Speeds: 1 Gbps, 10 Gbps, 100 Gbps (dedicated) or 50 Mbpsβ10 Gbps (hosted)
- Consistent, low-latency connection β not over internet
- Reduced data transfer costs vs internet transfer
- Provisioning takes weeks to months (physical circuit installation)
- Connect via Virtual Interfaces (VIF): Private VIF β VPC, Public VIF β AWS public services
Direct Connect Risks
- Single physical link = single point of failure
- Use two DX connections in different locations for HA
- Always configure a VPN as backup to Direct Connect
- Traffic is NOT encrypted by default β add IPSec VPN over DX for encryption
- High setup cost and lead time
| Feature | Site-to-Site VPN | Direct Connect |
|---|---|---|
| Connection type | IPSec tunnel over internet | Dedicated private circuit |
| Speed | Up to 1.25 Gbps per tunnel | 1 / 10 / 100 Gbps |
| Latency | Variable (internet path) | Consistent, low latency |
| Encryption | Yes β IPSec AES-256 built-in | No β requires VPN over DX for encryption |
| Setup time | Minutes | Weeks to months |
| Cost | Low (~$0.05/hr + data) | High (port fee + data transfer) |
| Best for | Quick start, DR backup, small bandwidth | High throughput, consistent latency, large data |
| HA built-in? | 2 tunnels per connection (auto) | No β build your own redundancy |
AWS Transit Gateway acts as a central hub to interconnect multiple VPCs and on-premises networks. Instead of setting up n*(n-1)/2 peering connections for n VPCs, you attach each VPC to one Transit Gateway β and all attached VPCs can route to each other through the hub.
Transit Gateway Features
- Attach VPCs, VPNs, and Direct Connect gateways
- Supports transitive routing (unlike peering)
- Up to 5,000 attachments per TGW
- Route tables per TGW for network segmentation
- Cross-region peering between two TGWs
Multi-Account Sharing
- Share TGW across AWS accounts via Resource Access Manager (RAM)
- Centralise routing for entire AWS Organization
- A TGW is a regional resource – use TGW peering for cross-region connectivity
Cost
- ~$0.05/hr per attachment + $0.02/GB data
- More expensive than peering for simple 2-VPC setups
- Cost-effective when you have 4+ VPCs β saves on peering complexity
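A back-of-envelope monthly estimate makes the trade-off concrete. The rates below are the approximate figures quoted above (~$0.05/hr per attachment, ~$0.02/GB); treat them as assumptions and check current regional pricing.

```python
# Rough TGW monthly cost: attachment-hours plus data processing.
# Rates are the approximate figures from the text, not authoritative pricing.
HOURS_PER_MONTH = 730

def tgw_monthly_cost(attachments: int, gb_processed: float,
                     attach_hourly: float = 0.05, per_gb: float = 0.02) -> float:
    return attachments * attach_hourly * HOURS_PER_MONTH + gb_processed * per_gb

# Example: 4 VPC attachments pushing 500 GB/month through the hub
print(f"${tgw_monthly_cost(4, 500):.2f}/month")
```

Peering has no hourly fee, so for 2-3 VPCs it is cheaper; the TGW's fixed per-attachment cost buys you transitive routing and far fewer connections to manage.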
For VPC-to-VPC: use Peering for ≤3 VPCs (simple, free), Transit Gateway for 4+ (hub-and-spoke, transitive). For on-premises: VPN spins up in minutes over the internet; Direct Connect gives a dedicated fibre circuit. Use both for High Availability – DX as primary, VPN as failover.
- VPC Peering β private, 1:1 link between two VPCs. No transitive routing. Both route tables must be updated. No overlapping CIDRs.
- Transit Gateway β hub-and-spoke for many VPCs. Supports transitive routing. Use for 4+ VPCs or complex multi-account topologies.
- Site-to-Site VPN β IPSec over internet. Fast to set up (minutes), up to 1.25 Gbps, variable latency. Good as DR backup.
- Direct Connect β dedicated private fibre. Up to 100 Gbps, consistent low latency. Weeks to provision. No built-in encryption β add VPN over DX for security.
- HA pattern: Direct Connect (primary) + VPN (backup) = resilient hybrid connectivity.
- Cost rule: Peering is cheapest for few VPCs. TGW pays off at scale. Direct Connect requires port fee commitment.
- VPC Endpoints (for S3/DynamoDB/others without internet) β covered in Chapter 06.
π― Chapter 05 β Exam Tips
- VPC Peering is NOT transitive. AβB and BβC does NOT mean A can reach C. Use Transit Gateway for that.
- VPN = encrypted IPSec over internet (minutes to set up, ~1.25 Gbps). Direct Connect = private fibre (weeks, up to 100 Gbps).
- Direct Connect is NOT encrypted by default. Add VPN over DX if encryption is required.
- Transit Gateway supports transitive routing. Use for 4+ VPCs or hub-and-spoke topologies.
- HA hybrid = Direct Connect (primary) + Site-to-Site VPN (backup). Both connect via Virtual Private Gateway.
Advanced Networking
NAT Gateway (Deep Dive)
Managed, HA outbound internet for private subnets. Multi-AZ patterns, billing, vs NAT Instance.
VPC Endpoints
Access S3, DynamoDB, and 200+ AWS services privately β no internet, no NAT, no data transfer fees.
DNS in VPC
VPC resolver, Route 53 Resolver inbound/outbound, DNS hostnames, custom domains.
AWS PrivateLink
Expose your service to other VPCs without peering β NLB provider, ENI consumer, no CIDR conflicts.
A NAT Gateway (Network Address Translation) allows EC2 instances in private subnets to initiate outbound connections to the internet or other AWS services, while preventing the internet from initiating connections to those instances. It lives in a public subnet and uses an Elastic IP.
| Feature | NAT Gateway (AWS Managed) | NAT Instance (Self-Managed EC2) |
|---|---|---|
| Availability | Highly available within AZ β AWS managed | Single EC2 β single point of failure |
| Bandwidth | 5 Gbps baseline, auto-scales up to 100 Gbps | Depends on EC2 instance type |
| Maintenance | AWS handles patching, failover | You patch, monitor, and restart |
| Security Groups | Cannot attach SGs directly | Can attach SGs for fine-grained traffic control |
| Cost | ~$0.045/hr + $0.045/GB processed | EC2 cost only (cheaper at low volumes) |
| Port forwarding | Not supported | Supported |
| Recommendation | β Always use for production | Only for cost optimisation in dev/test |
NAT Gateway is AZ-Scoped
- A NAT Gateway lives in one AZ only
- If that AZ fails, private instances in other AZs lose internet via that NAT
- Best practice: Create one NAT Gateway per AZ
- Each AZ's private route table β points to its own NAT GW
- Costs more but eliminates cross-AZ data transfer charges and AZ dependency
NAT Gateway Cost Tips
- Billed per hour + per GB of data processed
- Avoid routing S3/DynamoDB traffic through NAT β use VPC Endpoints (free!)
- Use NAT Gateway only in AZs where you have running workloads
- Lambda, ECS tasks in VPC all need NAT GW for internet β can add up fast
- Monitor data processed via the CloudWatch metric BytesOutToDestination
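Since NAT Gateway bills per hour and per GB processed, a quick estimator shows how the "one per AZ" pattern adds up. Rates are the approximate figures from the table above (~$0.045 each); treat them as assumptions.

```python
# NAT Gateway monthly bill = hourly charge + per-GB processing charge.
# Rates are approximate figures from the text, not authoritative pricing.
HOURS_PER_MONTH = 730

def nat_monthly_cost(gateways: int, gb_processed: float,
                     hourly: float = 0.045, per_gb: float = 0.045) -> float:
    return gateways * hourly * HOURS_PER_MONTH + gb_processed * per_gb

# Example: three per-AZ gateways pushing 2 TB/month in total
print(f"${nat_monthly_cost(3, 2048):.2f}/month")
```

Note the per-GB component often dominates, which is exactly why moving S3/DynamoDB traffic onto free Gateway Endpoints matters.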
A VPC Endpoint allows you to connect your VPC directly to AWS services (S3, DynamoDB, SQS, SNS, etc.) without going through the internet, NAT Gateway, or IGW. Traffic stays on the AWS private backbone β more secure and often cheaper.
Gateway Endpoint
- Supported services: S3 and DynamoDB only
- Free β no hourly charge, no data transfer charge
- Works by adding a route in your route table: pl-xxxxx (prefix list) → vpce-xxxxx
- No ENI created – purely routing-based
- Regional β works across AZs automatically
- Applies a prefix list as destination in route table
Interface Endpoint (PrivateLink)
- Supported services: 200+ AWS services (SQS, SNS, API GW, CloudWatch, SSMβ¦)
- Costs ~$0.01/hr per AZ + $0.01/GB data
- Creates an ENI (Elastic Network Interface) in your subnet
- Gets a private DNS name β service calls resolve to your VPC IP
- Can apply Security Groups to the ENI for traffic control
- Works across accounts and VPCs via PrivateLink
π S3 via Gateway Endpoint = zero cost. If your private EC2 or Lambda is hammering S3 through a NAT Gateway, you're paying ~$0.045/GB needlessly. Add a Gateway Endpoint for S3 and the traffic bypasses NAT entirely β no code changes needed.
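The avoidable spend in the tip above is simple to compute: S3 traffic through NAT pays the per-GB processing rate (~$0.045/GB per the text, an assumption), while the same traffic via a Gateway Endpoint pays nothing.

```python
# Avoidable NAT cost for S3 traffic that could use a free Gateway Endpoint.
# The per-GB rate is the approximate figure from the text, not official pricing.
def nat_s3_monthly_waste(gb_per_month: float, per_gb: float = 0.045) -> float:
    return gb_per_month * per_gb

print(f"5 TB/month via NAT: ${nat_s3_monthly_waste(5 * 1024):.2f} avoidable")
```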
VPC Endpoints can have endpoint policies β JSON IAM-style policy documents that control which actions and resources can be accessed through the endpoint. This lets you restrict a private subnet to only access specific S3 buckets, preventing data exfiltration.
Example β Restrict to one S3 bucket
{
"Statement": [{
"Effect": "Allow",
"Principal": "*",
"Action": "s3:*",
"Resource": [
"arn:aws:s3:::my-company-bucket",
"arn:aws:s3:::my-company-bucket/*"
]
}]
}
Use Cases
- Prevent EC2 in private subnet from accessing competitor S3 buckets
- Restrict DynamoDB access to only your application's tables
- Compliance: ensure data only flows to approved destinations
- Combined with bucket policies that deny access unless via VPC endpoint
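The enforcement idea behind an endpoint policy can be sketched as a resource-ARN filter. This uses `fnmatch` as a stand-in for IAM's wildcard matching, so it is a simplification of the real policy evaluator, not a reimplementation of it.

```python
# Toy model of an endpoint policy: only resource ARNs matching an allowed
# pattern pass. fnmatch approximates IAM wildcards; real IAM evaluation
# also considers actions, principals, and conditions.
from fnmatch import fnmatch

ALLOWED = [
    "arn:aws:s3:::my-company-bucket",
    "arn:aws:s3:::my-company-bucket/*",
]

def endpoint_allows(resource_arn: str) -> bool:
    return any(fnmatch(resource_arn, pattern) for pattern in ALLOWED)

print(endpoint_allows("arn:aws:s3:::my-company-bucket/reports/q1.csv"))  # allowed
print(endpoint_allows("arn:aws:s3:::some-other-bucket/data.csv"))        # blocked
```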
Every VPC has a built-in DNS resolver available at the VPC base address + 2 (e.g., 10.0.0.2 for a 10.0.0.0/16 VPC). This resolver, called AmazonProvidedDNS, handles all DNS queries from your VPC resources.
VPC DNS Settings (Two Toggles)
- enableDnsSupport – enables the AmazonProvidedDNS resolver. Default: true. Disable this and DNS breaks for everything in the VPC.
- enableDnsHostnames – assigns public DNS hostnames (e.g., ec2-54-1-2-3.compute.amazonaws.com) to instances with public IPs. Default: false for custom VPCs, true for the Default VPC.
- Both must be enabled for VPC Endpoint private DNS to work.
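The "base address + 2" rule is mechanical, so the stdlib `ipaddress` module can compute the resolver address for any VPC CIDR:

```python
# The AmazonProvidedDNS resolver sits at the VPC network address + 2.
import ipaddress

def vpc_resolver(cidr: str) -> str:
    """Return the AmazonProvidedDNS address for a VPC CIDR."""
    return str(ipaddress.ip_network(cidr).network_address + 2)

print(vpc_resolver("10.0.0.0/16"))    # 10.0.0.2
print(vpc_resolver("172.31.0.0/16"))  # 172.31.0.2
```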
Route 53 Resolver
- Inbound Endpoints β on-premises DNS servers can resolve AWS resource names via your VPC
- Outbound Endpoints β your VPC can query on-premises DNS servers for internal domains
- Resolver Rules – forward specific domains (e.g., corp.internal) to on-premises DNS
- Used when merging on-premises and AWS DNS namespaces
| DNS Scenario | Resolution Flow |
|---|---|
| EC2 queries s3.amazonaws.com | VPC DNS (10.0.0.2) → public AWS DNS → S3 IP (or VPC Endpoint private IP if an endpoint exists) |
| EC2 queries rds-instance.xxx.rds.amazonaws.com | VPC DNS → AWS internal → private IP of RDS (10.0.3.x) |
| EC2 queries corp.internal (on-prem domain) | VPC DNS → Route 53 Outbound Resolver Endpoint → on-premises DNS server |
| On-prem server queries app.vpc.internal | On-prem DNS → Route 53 Inbound Resolver Endpoint → VPC DNS |
| Interface Endpoint private DNS enabled | Queries to sqs.us-east-1.amazonaws.com resolve to the ENI private IP – no internet exit |
AWS PrivateLink lets you expose a service running in your VPC to consumers in other VPCs β or other AWS accounts β without peering, without internet, and without CIDR conflicts. The consumer sees an ENI in their own VPC; they don't need to know anything about your network topology.
Service Provider Side
- Put your service behind a Network Load Balancer (NLB)
- Create a VPC Endpoint Service pointing to that NLB
- Control which AWS accounts / principals can connect
- Approve connection requests manually or automatically
Service Consumer Side
- Create an Interface VPC Endpoint in your VPC
- An ENI with a private IP is created in your subnet
- Access the service via the ENI private IP or private DNS name
- Traffic goes via AWS backbone β never public internet
PrivateLink Benefits
- No VPC peering, no overlapping CIDR concerns
- Service provider's VPC IP space is never exposed
- Works across regions, accounts, and even third-party SaaS
- AWS Marketplace services also use PrivateLink
Deploy one NAT Gateway per AZ for HA. Add Gateway Endpoints for S3 and DynamoDB to eliminate NAT costs for those services. Use Interface Endpoints (PrivateLink) for everything else that needs private access. DNS in VPC is automatic β Route 53 Resolver bridges AWS and on-premises namespaces.
VPC Flow Logs capture metadata about the IP traffic going to and from network interfaces in your VPC. They do NOT capture packet contents β only metadata like source/destination IP, ports, protocol, bytes, and whether the traffic was accepted or rejected.
What Flow Logs Capture
- Source & destination IP addresses
- Source & destination port numbers
- Protocol (TCP, UDP, ICMPβ¦)
- Packet & byte counts
- ACCEPT or REJECT action
- Timestamp & log status
Where You Can Enable Them
- VPC level β captures all ENIs in the VPC
- Subnet level β captures all ENIs in that subnet
- ENI level β captures traffic for one specific interface
- Can be enabled simultaneously at all three levels
- Flow Logs do NOT affect network throughput or latency
Where to Store Them
- CloudWatch Logs β real-time monitoring, metric filters, alarms
- Amazon S3 β long-term storage, cheaper, query with Athena
- Kinesis Data Firehose β stream to third-party systems
- Choose based on use case: debugging (CW) vs audit (S3) vs SIEM (Firehose)
| Field | Example | What It Tells You |
|---|---|---|
| srcaddr | 10.0.1.50 | Source IP – who sent the traffic |
| dstaddr | 10.0.2.20 | Destination IP – where it was going |
| srcport | 443 | Source port – what service sent it |
| dstport | 54231 | Destination port – ephemeral (response) or well-known (request) |
| protocol | 6 (TCP) | Transport protocol used |
| action | ACCEPT / REJECT | Was traffic allowed or blocked by SG/NACL? |
| bytes | 5240 | Total bytes transferred in this flow |
Flow Log Use Cases
- Debug "why can't EC2 reach RDS?" β look for REJECT on port 5432
- Detect unauthorised access attempts β filter REJECT + external IPs
- Identify top talkers β aggregate bytes per source IP
- Compliance audit β prove network traffic patterns for SOC2/HIPAA
- Security forensics β trace lateral movement after a breach
Flow Log Limitations
- Does NOT capture packet contents (payload) β only metadata
- DNS queries to AmazonProvidedDNS are not logged
- DHCP traffic is not logged
- Traffic to 169.254.169.254 (metadata service) is not logged
- Cannot be modified after creation β delete and recreate to change
- There is a delay (~10 min) before logs are available
π Debugging tip: If you see traffic ACCEPTED in Flow Logs but your app still can't connect β the problem is NOT the network (SG/NACL). Check the application layer (service not running, wrong port, DNS resolution). If you see REJECT β check SG and NACL rules for that port/IP combination.
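The default version-2 Flow Log record is a space-separated line with a fixed field order, so the debugging rule above can be automated with a small parser. The sample record values are made up for illustration.

```python
# Parse a default (version 2) Flow Log record and apply the triage rule:
# REJECT -> check SG/NACL; ACCEPT but app failing -> debug above the network.
FIELDS = ("version account_id interface_id srcaddr dstaddr srcport dstport "
          "protocol packets bytes start end action log_status").split()

def parse_flow_record(line: str) -> dict:
    """Map a space-separated v2 flow log line onto its named fields."""
    return dict(zip(FIELDS, line.split()))

rec = parse_flow_record(
    "2 123456789012 eni-0a1b2c3d 10.0.1.50 10.0.2.20 54231 5432 6 10 5240 "
    "1620000000 1620000060 REJECT OK")

hint = ("check SG and NACL rules for this port/IP"
        if rec["action"] == "REJECT"
        else "network is fine; debug the application layer")
print(f"{rec['srcaddr']} -> {rec['dstaddr']}:{rec['dstport']} {rec['action']} ({hint})")
```

A REJECT on port 5432, as in this sample, points straight at the "why can't EC2 reach RDS?" use case listed above.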
- NAT Gateway β managed, scales to 100 Gbps, AZ-scoped. Deploy one per AZ for HA. Billed per hour + GB.
- NAT Gateway vs NAT Instance: always prefer NAT GW for production β managed, no patching, auto-HA within AZ.
- Gateway Endpoint (S3 + DynamoDB) β free, route-based. Eliminates NAT costs for these two services. Add endpoint policy for security.
- Interface Endpoint (PrivateLink) β ENI per subnet, private DNS. Supports 200+ AWS services. Costs ~$0.01/hr per AZ.
- DNS: AmazonProvidedDNS at VPC base + 2. Enable enableDnsSupport + enableDnsHostnames for full DNS functionality. Route 53 Resolver endpoints bridge on-premises and AWS DNS.
- PrivateLink – NLB on provider side, ENI on consumer side. No peering, no CIDR conflicts, works cross-account and cross-region.
- VPC Flow Logs β captures ACCEPT/REJECT metadata per ENI. Store in CloudWatch (debug) or S3 (audit). Does NOT capture packet contents. Essential for security and troubleshooting.
π― Chapter 06 β Exam Tips
- NAT Gateway is AZ-scoped β one per AZ for HA. NAT Instance is self-managed EC2 (legacy).
- S3 + DynamoDB use Gateway Endpoints (free). Everything else uses Interface Endpoints (paid, ENI-based).
- If asked "how to access S3 privately without NAT" β answer is Gateway Endpoint.
- enableDnsSupport + enableDnsHostnames must both be true for VPC Endpoint private DNS to work.
- Flow Logs capture ACCEPT/REJECT metadata – NOT packet contents. Stored in CloudWatch or S3.
- PrivateLink = NLB (provider) + Interface Endpoint (consumer). No VPC peering needed. No CIDR overlap concern.
Architecture Patterns
This chapter brings everything together. These are the real-world VPC architecture blueprints used in production β from a simple 2-tier web app to a full HA multi-AZ enterprise design. Each pattern builds on the VPC primitives covered in earlier chapters.
Pattern 1 β 2-Tier
Web server in public subnet, database in private subnet. Simplest production-grade setup.
Pattern 2 β 3-Tier
ALB β App servers β Database. Clear separation: presentation, logic, data layers.
Pattern 3 β HA Multi-AZ
3-tier across 2+ AZs. Auto Scaling Group, Multi-AZ RDS, multi-AZ NAT GWs. Full resilience.
The simplest production-ready pattern: a web/app server in a public subnet talks directly to a database in a private subnet. Good for MVPs, internal tools, and single-AZ applications.
| Layer | Subnet | Resources | Access |
|---|---|---|---|
| Web/App | Public β 10.0.1.0/24 | EC2 web server with Elastic IP | Internet via IGW |
| Database | Private β 10.0.2.0/24 | RDS (MySQL/Postgres) | Web layer only (SG reference) |
2-Tier Pros
- Simple β fewer components to manage
- Cheap β no ALB, no NAT GW needed if EC2 is in public subnet
- Fast to set up for prototypes and internal apps
2-Tier Cons
- No HA β single EC2, single point of failure
- Cannot auto-scale without ALB
- EC2 directly faces internet β larger attack surface
The standard production pattern. An Application Load Balancer in the public subnet receives traffic and distributes it to app servers in a private subnet. App servers talk to a database in an isolated private subnet. Clear separation of concerns at every layer.
| Tier | Subnet | Resources | In/Out |
|---|---|---|---|
| Presentation | Public β 10.0.1.0/24 | ALB, NAT Gateway | Internet via IGW |
| Application | Private App β 10.0.2.0/24 | EC2 / ECS / EKS nodes | Inbound from ALB only; outbound via NAT |
| Data | Private DB β 10.0.3.0/24 | RDS, ElastiCache, OpenSearch | Inbound from app tier only; no internet route |
The full production HA pattern. Every tier is spread across at least 2 AZs. Auto Scaling Groups manage EC2 capacity automatically. RDS Multi-AZ handles synchronous replication and automatic failover. NAT Gateways are per-AZ so losing one AZ doesn't affect others.
HA Design Principles
- No single point of failure β every component must have a standby in another AZ
- ALB spans multiple AZs β routes to healthy targets automatically
- Auto Scaling Group manages EC2 across AZs β replaces unhealthy instances
- RDS Multi-AZ β synchronous standby, auto-failover in 60β120 sec
- One NAT GW per AZ β AZ failure doesn't kill other AZs' outbound internet
- S3 + Gateway Endpoint in both AZs' route tables
HA VPC Checklist
- β 2+ AZs in every tier (pub, app, db)
- β ALB configured with subnets in all AZs
- β ASG with min 2, spans both AZs
- β RDS Multi-AZ enabled
- β One NAT GW per AZ (not shared)
- β Route 53 health checks for failover DNS
- β VPC Flow Logs enabled for all subnets
- β CloudWatch alarms on NAT GW, ALB 5xx, RDS connections
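The subnet layout behind the checklist (three tiers across two AZs) can be planned programmatically by carving the VPC CIDR into equal blocks. The tier and AZ names below are illustrative.

```python
# Carve a /16 VPC into /24 subnets for a 3-tier x 2-AZ layout.
# Tier/AZ names are illustrative, not AWS identifiers.
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")
blocks = iter(vpc.subnets(new_prefix=24))  # each /24 = 256 addresses

plan = {}
for tier in ("public", "app", "db"):
    for az in ("a", "b"):
        plan[f"{tier}-{az}"] = next(blocks)

for name, net in plan.items():
    print(f"{name:10s} {net}")
```

Allocating sequentially like this leaves the rest of the /16 free for future tiers or AZs, which avoids the renumbering pain called out in the CIDR-planning advice elsewhere in this guide.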
| Requirement | Solution | VPC Component |
|---|---|---|
| Internet-facing web app | ALB in public subnet, EC2 in private | IGW + public subnet + ALB + NAT GW |
| Zero-downtime deployments | ALB + Auto Scaling Group target group | Multi-AZ private subnets + ASG |
| Database that survives AZ failure | RDS Multi-AZ or Aurora Global | DB subnet group spanning 2+ AZs |
| Private S3 access (no NAT cost) | Gateway VPC Endpoint | Route entry in private subnet RT |
| Access 200+ AWS services privately | Interface Endpoint (PrivateLink) | ENI in your subnet with private DNS |
| Connect to on-premises (fast setup) | Site-to-Site VPN | VGW + Customer Gateway + route |
| Connect to on-premises (high throughput) | Direct Connect + VPN failover | DX Gateway + VGW |
| Share services across 10+ VPCs | Transit Gateway | TGW + attachments per VPC |
| Block a known malicious IP range | NACL deny rule | NACL on public subnet |
| Audit all network traffic | VPC Flow Logs β S3 or CloudWatch | Enable on VPC, subnet, or ENI level |
Start with the 3-tier pattern for any production workload. Add Multi-AZ to every tier (ALB, ASG, RDS) for HA. Use one NAT Gateway per AZ β avoid the temptation to share one across AZs. Add S3 Gateway Endpoint immediately to eliminate NAT costs. Everything builds on the VPC fundamentals from Chapter 01.
- 2-Tier: EC2 in public + RDS in private. Simple, cheap. No HA. Good for dev/test or simple internal apps.
- 3-Tier: ALB (public) β EC2/ECS (private app) β RDS (isolated db). Clean separation. The standard production blueprint.
- HA Multi-AZ: every tier spans 2+ AZs. ALB is multi-AZ by default. ASG replaces unhealthy EC2. RDS Multi-AZ for automatic failover.
- One NAT GW per AZ β shared NAT is a single point of failure and incurs cross-AZ data transfer charges.
- S3 Gateway Endpoint should always be added β it's free and eliminates NAT costs for S3 traffic.
- Security Groups chain: sg-alb → sg-app → sg-db using SG references, not CIDRs.
- VPC Flow Logs on all subnets + CloudWatch alarms = the observability foundation for your network.
π― Chapter 07 β Exam Tips
- 3-tier is the default production architecture: ALB (public) β App (private) β DB (isolated).
- Multi-AZ = HA. Every tier must span 2+ AZs: ALB (automatic), EC2 (ASG), RDS (Multi-AZ).
- NAT Gateway per AZ β don't share across AZs (AZ failure + cross-AZ data transfer).
- Use SG-to-SG references (sg-app → sg-db), not CIDR ranges, for inter-tier rules.
- Always add the S3 Gateway Endpoint (free) to eliminate NAT costs for S3/DynamoDB traffic.
| Concept | Key Idea | Scope |
|---|---|---|
| VPC | Your private, isolated network inside AWS | Region |
| Subnet | AZ-level segmentation of VPC CIDR | Single AZ |
| CIDR | IP address range notation (e.g., 10.0.0.0/16 = 65,536 IPs) | VPC & Subnet |
| Internet Gateway | Bidirectional internet access for public subnets | VPC (one per VPC) |
| NAT Gateway | Outbound-only internet for private subnets | Single AZ |
| Route Table | Rules that decide where traffic goes | Subnet (each subnet uses exactly one RT; one RT can serve many subnets) |
| Local Route | Cannot be deleted β allows all intra-VPC traffic | Every route table |
| Security Group | Stateful firewall at instance/ENI level (allow only) | Instance |
| Network ACL | Stateless firewall at subnet boundary (allow + deny) | Subnet |
| VPC Peering | 1:1 private link between two VPCs (non-transitive) | Cross-account/region |
| Transit Gateway | Hub-and-spoke for many VPCs (transitive routing) | Region (peerable) |
| Site-to-Site VPN | Encrypted IPSec tunnel over internet to on-premises | VPC β On-prem |
| Direct Connect | Dedicated private fibre circuit (not encrypted by default) | VPC β On-prem |
| Gateway Endpoint | S3 + DynamoDB private access (free, route-based) | VPC |
| Interface Endpoint | 200+ services via PrivateLink ENI (paid) | Subnet (per-AZ ENI) |
| VPC Flow Logs | Network traffic metadata (ACCEPT/REJECT, not payload) | VPC / Subnet / ENI |
| Elastic IP | Static public IPv4 address you own | Instance / NAT GW |
| PrivateLink | Expose service to other VPCs via NLB (no peering needed) | Cross-account |
| Mistake | Why It's Wrong | Fix |
|---|---|---|
| Running production in Default VPC | All subnets are public β databases exposed to internet | Create Custom VPC with private subnets |
| Assuming subnet is public by default | Subnet is private until you add 0.0.0.0/0 β IGW route | Explicitly create public route table with IGW route |
| Forgetting route table association | New subnets inherit the main route table (may have IGW!) | Always explicitly associate private RT |
| Blocking ephemeral ports in NACL | Responses use ports 1024β65535; NACL is stateless | Allow 1024β65535 outbound in NACL |
| Opening 0.0.0.0/0 port 22 in SG | SSH open to entire internet = brute force attacks | Restrict to your IP or use SSM Session Manager |
| Sharing one NAT GW across AZs | AZ failure + cross-AZ data charges | Deploy one NAT GW per AZ |
| Routing S3 traffic through NAT GW | Unnecessary $0.045/GB cost | Add free S3 Gateway Endpoint |
| Overlapping CIDRs across VPCs | Cannot peer VPCs with overlapping ranges | Plan IP ranges before creating VPCs |
| Using CIDRs instead of SG references | Breaks when IPs change; doesn't scale | Reference SG IDs between tiers |
| No VPC Flow Logs enabled | Can't debug or audit network issues | Enable at VPC level β S3 for audit |
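The "overlapping CIDRs" mistake in the table above is cheap to catch before any VPC exists, using the stdlib `ipaddress` module:

```python
# Pre-flight check for the peering blocker: overlapping VPC CIDRs.
import ipaddress

def cidrs_overlap(a: str, b: str) -> bool:
    """True if the two CIDR blocks share any addresses (peering will fail)."""
    return ipaddress.ip_network(a).overlaps(ipaddress.ip_network(b))

print(cidrs_overlap("10.0.0.0/16", "10.0.0.0/24"))  # overlap -> cannot peer
print(cidrs_overlap("10.0.0.0/16", "10.1.0.0/16"))  # disjoint -> safe to peer
```

Running this across every planned VPC pair before creation is far cheaper than renumbering a live VPC later.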