LearningTree · AWS · Database

Amazon RDS: Managed Relational Database

Fully managed relational database — no patching, no backups to configure, no infrastructure to manage. RDS gives you MySQL, PostgreSQL, MariaDB, Oracle, or SQL Server with built-in high availability, automatic failover, and point-in-time recovery.

⚡ RDS in 30 Seconds

Managed SQL database — AWS handles patching, backups, and hardware
6 engines: MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, Aurora (separate service)
Multi-AZ = high availability (synchronous standby, auto-failover)
Read Replicas = read scaling (asynchronous, up to 15 replicas)
Runs inside your VPC: keep it in private subnets, not reachable from the public internet

Chapter One

What is Amazon RDS

The Problem — Running Databases Yourself Introductory

Running a relational database yourself on EC2 requires you to handle everything: installing the engine, configuring storage, setting up replication, scheduling backups, applying patches, monitoring for failures, and manually failing over when the primary goes down. That's weeks of work before you write a single line of application code.

👉 The core problem: Databases need constant operational care — backups, patches, failover, replication. Every hour spent on DB ops is an hour not spent on your product. RDS takes all of that away.

What is a Relational Database Introductory

A relational database stores data in structured tables with rows and columns. Tables are related to each other through foreign keys. You query data with SQL. This model works for almost all transactional applications: e-commerce orders, user accounts, financial records, inventory systems.

📊

Structured Data

Schema-defined. Every row in a table has the same columns. Strong data integrity with constraints (NOT NULL, UNIQUE, FOREIGN KEY).

🔗

Relationships

Tables join together. A users table relates to an orders table. Complex queries with JOINs, GROUP BY, transactions.

🔒

ACID Transactions

Atomicity, Consistency, Isolation, Durability. Either all changes commit, or none do. Critical for financial and medical data.
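A minimal sketch of atomicity, using Python's built-in sqlite3 module as a local stand-in for an RDS engine. The accounts table, its columns, and the transfer helper are made up for illustration; the same pattern would apply over a MySQL or PostgreSQL connection.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` between two accounts: both updates commit, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Credit first so the rollback is visible when the debit fails.
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False  # CHECK constraint failed; the credit above was rolled back

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

ok = transfer(conn, 1, 2, 30)        # valid transfer
failed = transfer(conn, 1, 2, 500)   # would overdraw account 1
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
```

The second transfer credits account 2 and then fails on the debit, so the whole transaction is undone and the balances stay as they were after the first transfer.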

What is Amazon RDS Introductory

Amazon RDS (Relational Database Service) is a fully managed service that runs a relational database engine on your behalf inside AWS. You choose the engine, instance size, and storage. AWS handles everything else — provisioning, patching, backups, monitoring, failover.

🛑

You Manage on EC2

Install database engine (MySQL, Postgres...)
Configure storage, IOPS, file system
Apply OS + DB patches manually
Set up replication manually
Schedule and test backups
Monitor and respond to failures
Implement failover scripts

✅

RDS Manages

Hardware provisioning and lifecycle
Database engine installation
Automated OS and DB patching
Synchronous replication (Multi-AZ)
Automated daily backups + transaction logs
Health monitoring + auto-restart
Automatic failover in <2 minutes

Supported Database Engines Core

RDS Supported Engines — Choose the right engine for your use case

MySQL

What is RDS Custom

Managed DB service with OS + engine access
Supports Oracle and Microsoft SQL Server only
SSH access, filesystem access, custom scripts
Toggle automation (pause RDS automation to apply custom changes)
AWS still manages backups, Multi-AZ, failover

🎯

When to Use RDS Custom

Legacy Oracle apps requiring custom OS-level configuration
SQL Server features not exposed by standard RDS
Custom patches or DB features RDS doesn't support
Migrating on-prem Oracle to AWS with minimal changes
Exam: “OS-level access + managed RDS” → RDS Custom

Concept Diagram — Managed vs Unmanaged Introductory

DIY (EC2) vs Managed (RDS) — What you own vs what AWS owns

AWS Architecture Diagram Core

RDS in AWS — Application talks to RDS endpoint inside VPC

VPC (10.0.0.0/16)

PUBLIC SUBNET

EC2 App

Web server

→

PRIVATE SUBNET

RDS Instance

MySQL / Postgres
Port 3306 / 5432

🔒 Security Group: only EC2 allowed
🔑 KMS encrypted at rest
🔗 Endpoint: mydb.xxx.us-east-1.rds.amazonaws.com
🔄 Automated daily backups (7 day retention)

Mental Model — The Managed Apartment Analogy Introductory

🏠

DIY (EC2 Database) = Owning a House

You choose and install the plumbing (DB engine)
You fix the boiler when it breaks (patches)
You call a plumber at 3am (on-call)
You organize your own insurance (backups)
Total control, total responsibility

🏢

RDS = Managed Apartment Building

Building manager handles plumbing (AWS manages engine)
Maintenance team patches issues (automated patching)
24/7 on-call building staff (AWS monitoring)
Fire insurance included (automated backups)
You just live there and focus on your work

Multi-AZ vs Read Replicas — Critical Distinction Core

This is the most important concept in RDS and the most common exam mistake. These two features solve completely different problems:

Feature	Purpose	Replication	Can Serve Reads?
Multi-AZ	🛡️ High Availability	Synchronous (zero data loss)	❌ No — standby only
Read Replica	📈 Read Scaling	Asynchronous (slight lag)	✅ Yes — read traffic

🧠 Key Insight

Multi-AZ = HA (failover protection). Read Replica = performance (scale reads). You can combine both: a Multi-AZ primary with read replicas for a production-grade, highly available, read-scalable database tier.
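As a rough sketch of combining both features, the helpers below build keyword arguments for boto3's create_db_instance and create_db_instance_read_replica calls. The identifiers, instance class, engine, and the provision helper are illustrative assumptions, and the RDS client is passed in rather than created here.

```python
def primary_params(identifier, subnet_group, security_group_id):
    """Keyword arguments for rds.create_db_instance: a Multi-AZ primary."""
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": "postgres",               # assumed engine for the example
        "DBInstanceClass": "db.m6g.large",  # assumed size
        "AllocatedStorage": 100,
        "StorageType": "gp3",
        "MultiAZ": True,                    # synchronous standby in another AZ
        "StorageEncrypted": True,
        "PubliclyAccessible": False,
        "DBSubnetGroupName": subnet_group,
        "VpcSecurityGroupIds": [security_group_id],
        "MasterUsername": "appadmin",
        "ManageMasterUserPassword": True,   # password kept in Secrets Manager
    }

def replica_params(source_identifier, index):
    """Keyword arguments for rds.create_db_instance_read_replica."""
    return {
        "DBInstanceIdentifier": f"{source_identifier}-replica-{index}",
        "SourceDBInstanceIdentifier": source_identifier,
        "DBInstanceClass": "db.m6g.large",
    }

def provision(rds_client, identifier, subnet_group, security_group_id, replicas=2):
    """Create the primary, then request the read replicas.

    A real script would wait for the primary to become available between
    the two steps (boto3 offers a 'db_instance_available' waiter).
    """
    rds_client.create_db_instance(
        **primary_params(identifier, subnet_group, security_group_id)
    )
    for i in range(1, replicas + 1):
        rds_client.create_db_instance_read_replica(**replica_params(identifier, i))
```

With boto3 installed and credentials configured, rds_client would typically be boto3.client("rds").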

Chapter Summary Introductory

 RDS = fully managed relational DB — AWS handles patching, backups, failover, replication
6 engines: MySQL, PostgreSQL, MariaDB, Oracle, SQL Server (+ Aurora separately)
Runs in your VPC: private subnet, security groups, not exposed to the public internet
Multi-AZ = high availability (synchronous standby, NOT readable)
Read Replicas = read scaling (asynchronous, readable, up to 15)
Mental model: managed apartment building — you use it, AWS maintains it
 

Chapter Two

Core Concepts — DB Instance, Storage & Endpoints

The DB Instance — The Core Unit Introductory

A DB instance is an isolated database environment running in AWS. It is the fundamental building block of RDS — think of it as a virtual database server. Each DB instance runs one database engine (MySQL, PostgreSQL, etc.) and can contain multiple databases.

🖥️

Instance Class

db.t3 — burstable (dev/test)
db.m6g — general purpose
db.r6g — memory-optimized
db.x2g — extra-large memory
Choose: vCPU + RAM

💾

Storage

gp2 — general SSD (3 IOPS/GB)
gp3 — new general SSD, independent IOPS
io1 / io2 — provisioned IOPS (I/O intensive)
magnetic — legacy, not recommended
Auto-scaling available (gp2/gp3/io1)

🔗

Endpoint

DNS hostname, not IP address
DNS stays the same after failover
mydb.xxx.us-east-1.rds.amazonaws.com
Port: 3306 (MySQL) / 5432 (PG)
Connection string stays stable

Storage Types — When to Use What Core

Type	IOPS	Best For	Cost
gp2 (SSD)	3 IOPS/GB, burst to 3000	Most workloads, small DBs	$
gp3 (SSD)	3000 baseline, up to 16,000	Most workloads, prefer over gp2	$ (about 20% cheaper than gp2)
io1 / io2	Up to 64,000 IOPS	High I/O: OLTP, ERP, financial	$$$$
Magnetic	~100 IOPS	Legacy only, avoid	$
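The gp2 row ties IOPS to volume size, which is easier to see with a little arithmetic. This sketch uses the 3 IOPS/GB figure and the 3000 gp3 baseline from the table; the 100 IOPS floor and 16,000 IOPS ceiling for gp2 are assumptions taken from general EBS gp2 behaviour, not from this page.

```python
GP2_IOPS_PER_GIB = 3      # from the table above
GP2_FLOOR = 100           # assumed minimum baseline
GP2_CEILING = 16_000      # assumed maximum baseline
GP3_BASELINE = 3_000      # from the table above

def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS a gp2 volume of the given size would get."""
    return min(max(GP2_FLOOR, GP2_IOPS_PER_GIB * size_gib), GP2_CEILING)

def gp2_size_to_match_gp3() -> int:
    """Smallest gp2 size (GiB) whose baseline reaches the gp3 baseline."""
    return -(-GP3_BASELINE // GP2_IOPS_PER_GIB)  # ceiling division
```

Under these assumptions a 200 GiB gp2 volume has a 600 IOPS baseline, while a gp3 volume of the same size starts at 3000, which is the practical reason to prefer gp3 for small databases.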

DB Instance Lifecycle Core

DB Instance components — what makes up an RDS instance

Key Configuration Options Core

📝

Parameter Groups

DB engine configuration settings
E.g., max_connections, innodb_buffer_pool_size
Default parameter group = engine defaults
Create custom group to tune performance
Changes may require reboot

📅

Maintenance Windows

Weekly window for minor version upgrades + patches
Default: 30-minute window during low-traffic hours
You choose the window (e.g., Sun 03:00–03:30 UTC)
Multi-AZ = failover to standby → near-zero downtime
Single-AZ = brief outage during patch

📈

Storage Auto-Scaling

Automatically expands storage when near limit
Set maximum storage threshold (required)
Trigger: free space <10% for 5+ minutes
Trigger: last scaling was 6+ hours ago
Increment: the greatest of 10 GiB, 10% of current size, or predicted growth over the next 7 hours
Max limit: 65,536 GiB (64 TiB)
No downtime • Supported: gp2, gp3, io1
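The trigger conditions listed above can be sketched as a small decision function. This is an illustrative model only, not how RDS implements it: the StorageState fields are hypothetical names, and the minimum increment is passed in as a parameter rather than hard-coded.

```python
from dataclasses import dataclass

@dataclass
class StorageState:
    allocated_gib: int
    free_gib: float
    minutes_below_threshold: int
    hours_since_last_scaling: float
    max_allocated_gib: int  # the maximum storage threshold you configured

def next_allocation(state: StorageState, min_increment_gib: int) -> int:
    """Return the allocation after one auto-scaling evaluation."""
    low_space = state.free_gib / state.allocated_gib < 0.10
    sustained = state.minutes_below_threshold >= 5
    cooled_down = state.hours_since_last_scaling >= 6
    has_headroom = state.allocated_gib < state.max_allocated_gib
    if not (low_space and sustained and cooled_down and has_headroom):
        return state.allocated_gib  # no scaling this time
    ten_percent = (state.allocated_gib + 9) // 10  # 10% rounded up
    increment = max(min_increment_gib, ten_percent)
    # Never grow past the configured maximum.
    return min(state.allocated_gib + increment, state.max_allocated_gib)
```

All four conditions must hold at once, which is why a volume that scaled two hours ago stays put even if it is nearly full.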

🖧️

DB Subnet Group

Collection of subnets in different AZs
RDS uses these for Multi-AZ placement
Must span at least 2 AZs
Best practice: private subnets only
Required for all RDS instances (even single-AZ)

🧠 Key Insight

Prefer gp3 over gp2 — it's cheaper and gives independent IOPS control. Always use a DB subnet group with private subnets. The DNS endpoint never changes — your app always uses the same connection string, even after failover.

Performance Insights — Wait Time Analysis Advanced

📊

What is Performance Insights

Visualises DB load by wait type (CPU, I/O, locks, network)
Shows SQL queries consuming the most resources
7-day retention free; up to 2 years with paid tier
Supports MySQL, PostgreSQL, MariaDB, Oracle, SQL Server
Enabled per DB instance — zero impact on performance

🔍

When to Use

Database is slow but you don't know why
Find which queries are causing lock waits
Identify CPU vs I/O bound workloads
Spot regression after schema change or deploy
Exam: “identify DB bottleneck / slow queries” → Performance Insights

Chapter Summary Introductory

 DB Instance = isolated DB environment (engine + compute + storage + endpoint)
Instance classes: db.t3 (burstable), db.m6g (general), db.r6g (memory-optimized)
Storage: gp3 (best default), io1 (high-IOPS workloads), gp2 (legacy)
Endpoint = stable DNS hostname — stays the same after failover
Parameter groups = DB engine tuning; subnet groups = VPC placement
Storage auto-scaling = expands automatically when <10% free (no downtime)
 

Chapter Three

High Availability — Multi-AZ

What is Multi-AZ Introductory

Multi-AZ is RDS's high availability feature. When enabled, RDS automatically provisions a standby replica in a different Availability Zone. Data is synchronously replicated to the standby. If the primary fails, RDS automatically fails over to the standby — no manual intervention, no data loss.

👉 Critical to understand: Multi-AZ standby is NOT a read replica. You cannot query the standby instance. It exists solely as a failover target. Its only job is to take over if the primary fails.

How Failover Works Core

1️⃣

Failure Detected

Primary instance fails (hardware, OS crash, AZ outage, maintenance). RDS health checks detect within ~30 seconds.

2️⃣

DNS Flipped

RDS updates the DNS endpoint to point to the standby. Total failover time: typically 1–2 minutes. Your app reconnects to same endpoint automatically.

3️⃣

Standby Promoted

Standby becomes the new primary. RDS automatically provisions a new standby in the other AZ to restore HA. Zero data loss (synchronous replication).

Diagram 1 — Multi-AZ Normal Operation Core

Multi-AZ — Synchronous replication to standby (standby not queryable)

Diagram 2 — Failover Flow Core

Multi-AZ Failover — Primary fails, DNS flips, standby promoted (<2 min)

AZ-a (us-east-1a)

Primary ❌ FAILED

Hardware failure / AZ outage

AZ-b (us-east-1b)

Standby → New Primary ✔

Promoted automatically
Same DNS endpoint

① Primary failure detected (~30s)
② RDS updates DNS record → standby IP
③ Application reconnects (same endpoint URL)
④ Standby is now primary — new standby provisioned in AZ-a
⏱️ Total downtime: typically 60–120 seconds
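Because the endpoint name is unchanged after failover, the application's side of the flow above is mostly "retry the same hostname until it answers". A minimal sketch with exponential backoff follows; the connect callable is a placeholder for whatever driver function you use, and the attempt count and delays are arbitrary choices.

```python
import time

def connect_with_retry(connect, endpoint, attempts=8,
                       base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call connect(endpoint) until it succeeds or attempts run out.

    `connect` is any callable that raises ConnectionError on failure.
    The endpoint is never changed: after failover the same DNS name
    resolves to the promoted standby.
    """
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return connect(endpoint)
        except ConnectionError:
            if attempt == attempts:
                raise  # give up and surface the last error
            sleep(delay)
            delay = min(delay * 2, max_delay)  # exponential backoff, capped
```

In practice it also helps if the client does not cache DNS lookups for long, otherwise it may keep dialing the old primary's address.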

What Triggers Failover Core

⚡

Automatic Failover Triggers

Primary instance failure (hardware, OS crash)
Network connectivity loss to primary
Storage failure on primary
AZ or data centre outage
Planned maintenance with reboot
You manually trigger (Reboot with failover)

💡

Multi-AZ Gotchas

Standby is in a different AZ, not different region
Standby has no endpoint of its own; apps always connect through the instance endpoint
Backups taken from standby (zero I/O impact on primary)
Both instances same class/storage (can't scale standby independently)
Extra cost: ~2× (two instances running)
Not available for all instance classes

Multi-AZ DB Cluster (New) Advanced

RDS also offers a Multi-AZ DB Cluster (a newer option) — one writer and two readable standbys across 3 AZs. Unlike classic Multi-AZ, the standbys can serve reads. Failover is faster (<35 seconds). Currently available for MySQL 8.0 and PostgreSQL 13+.

💻

Classic Multi-AZ (DB Instance)

1 primary + 1 standby
Standby NOT readable
Failover: ~60–120 seconds
Supported: all engines
Exam default assumption

🗃️

Multi-AZ DB Cluster

1 writer + 2 readable standbys
Standbys ARE readable
Failover: <35 seconds
MySQL 8 + PostgreSQL 13+ only
Higher cost, better read availability

📋 Classic vs Cluster — Comparison Table

Feature	Classic Multi-AZ	Multi-AZ Cluster
Standby count	1 (different AZ)	2 (different AZs)
Standby readable?	❌ No	✅ Yes
Failover time	60–120 seconds	<35 seconds
Supported engines	All engines	MySQL 8, PostgreSQL 13+

🧠 Key Insight

Multi-AZ = availability, not performance. The standby is invisible to your app — same endpoint, same experience. Failover is automatic and takes less than 2 minutes. For exam: “Multi-AZ” always means “HA”, not scaling. Standby = NOT queryable (unless using the newer Multi-AZ Cluster).

Chapter Summary Introductory

 Multi-AZ = high availability — primary + synchronous standby in different AZ
Standby is NOT queryable — exists only as a failover target
Automatic failover in 1–2 minutes — DNS updated, app reconnects to same endpoint
Triggers: hardware failure, AZ outage, network loss, planned maintenance reboot
Zero data loss: with synchronous replication, a write is acknowledged only after it reaches the standby
Multi-AZ Cluster: newer option (MySQL 8/PG 13+) — 2 readable standbys, <35s failover
 

Chapter Four

Scaling — Read Replicas

What is a Read Replica Introductory

A Read Replica is an asynchronous copy of your primary RDS instance that serves read-only queries. Applications send writes to the primary and reads to the replica(s). This offloads read traffic from the primary, improving overall database performance for read-heavy workloads.

👉 Core idea: Read Replicas solve performance, not availability. Reads scale horizontally — add more replicas. Writes still go to one primary. Replication is asynchronous — a tiny lag exists.

Multi-AZ vs Read Replica — Side by Side Core

🛡️

Multi-AZ — High Availability

Purpose: survive failures
Replication: synchronous (zero data loss)
Queryable: NO — standby only
Automatic failover: YES (<2 min)
Same region only
Cost: ~2× (two instances)

📈

Read Replica — Read Scaling

Purpose: handle more reads
Replication: asynchronous (slight lag)
Queryable: YES — read-only traffic
Automatic failover: NO — manual promotion
Same region, cross-region, cross-account
Cost: additional instance per replica

Read Replica Facts Core

🔢

Limits

Up to 15 replicas (MySQL, MariaDB, PostgreSQL)
Up to 5 replicas (Oracle, SQL Server)
Replicas of replicas (chaining) supported
Each replica has its own endpoint

🌎

Cross-Region

Create replica in a different AWS region
Replication over network (encrypted)
Disaster recovery: promote to primary if home region fails
Lower latency reads for global users
Additional cross-region data transfer cost

🎯

Promotion

Manually promote replica → standalone primary
Replication stops on promotion
Gets its own read+write endpoint
Use for: DR failover, migration, sharding into separate primaries
NOT automatic — requires manual action

Diagram 1 — Read Replica Architecture Core

Read Replicas — Writes to primary, reads distributed across replicas
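RDS does not split reads and writes for you; each replica has its own endpoint and the application chooses where to send a query. A minimal routing sketch, assuming a hypothetical ReadWriteRouter class and a deliberately simple "starts with SELECT" test for read queries:

```python
import itertools

class ReadWriteRouter:
    """Pick an endpoint per query: writes to primary, reads round-robin."""

    def __init__(self, primary_endpoint, replica_endpoints):
        self.primary = primary_endpoint
        # cycle() over an empty list would never yield, so guard for it.
        self._replicas = itertools.cycle(replica_endpoints) if replica_endpoints else None

    def endpoint_for(self, sql, require_fresh=False):
        """Return the endpoint a statement should be sent to.

        require_fresh=True forces the primary, for reads that cannot
        tolerate replication lag (e.g. read-your-own-write).
        """
        is_read = sql.lstrip().lower().startswith("select")
        if is_read and not require_fresh and self._replicas is not None:
            return next(self._replicas)
        return self.primary
```

The require_fresh flag is the design choice worth noting: because replication is asynchronous, some reads still belong on the primary.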

Read Replica Use Cases Core

✅

Good Use Cases

Read-heavy apps: news sites, e-commerce catalogues
Analytics queries: run heavy reports without impacting primary
Geographic distribution: replica in EU for EU users
DR strategy: cross-region replica for regional failover
Migration: promote replica to move DB to new region

❌

Not Suitable For

Applications that need guaranteed consistency (async lag)
Write scaling — all writes still go to one primary
Automatic failover — promotion is manual
Real-time synchronisation (tiny delay always exists)

🧠 Key Insight

Read Replicas = scale reads horizontally. Exam: “read-heavy workload” or “offload analytics” → Read Replica. “Automatic failover” → Multi-AZ (not Read Replica). Cross-region replica → disaster recovery + global low-latency reads.

Chapter Summary Introductory

 Read Replica = asynchronous copy for read-only traffic — offloads primary
Up to 15 replicas per primary (MySQL, MariaDB, PG); each has its own endpoint
Async replication = tiny lag — not suitable for apps needing instant consistency
Cross-region replicas — for DR and lower global read latency
Promotion = manual action — promotes replica to standalone read+write primary
Exam trap: Read Replica ≠ automatic failover (that's Multi-AZ)
 

Chapter Five

Backups & Snapshots

Two Types of Backups Introductory

RDS provides two backup mechanisms that complement each other: automated backups (enabled by default, point-in-time recovery) and manual snapshots (user-initiated, retained forever until deleted).

🔄

Automated Backups

Enabled by default on all RDS instances
Daily full backup during backup window
Continuous transaction log backups (every 5 min)
Retention: 1–35 days (default 7 days)
Stored in S3 (AWS-managed, not visible)
Deleted when DB instance is deleted
Enables point-in-time recovery (PITR)

📸

Manual Snapshots

User-initiated (CLI / Console / API)
Full backup of DB instance
Retained indefinitely (until you delete)
Stored in S3 (visible in console)
Survive DB instance deletion
Can copy across regions
Can share with other AWS accounts

Point-in-Time Recovery (PITR) Core

Point-in-Time Recovery lets you restore your database to any second within your retention period, up to the latest restorable time (typically within the last five minutes). RDS combines the daily snapshot with transaction logs to reconstruct the exact state of the database at your requested timestamp.

Point-in-Time Recovery — Restore to any second within retention window
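A sketch of preparing a PITR request: it checks the requested timestamp against the retention window and returns keyword arguments for boto3's restore_db_instance_to_point_in_time. The pitr_request helper and its window check are illustrative assumptions; RDS itself is the authority on the earliest and latest restorable times.

```python
from datetime import datetime, timedelta, timezone

def pitr_request(source_id, target_id, restore_time,
                 retention_days, latest_restorable, now=None):
    """Validate a restore timestamp and build the restore parameters."""
    now = now or datetime.now(timezone.utc)
    earliest = now - timedelta(days=retention_days)  # approximate window start
    if not (earliest <= restore_time <= latest_restorable):
        raise ValueError(
            f"restore_time must be between {earliest.isoformat()} "
            f"and {latest_restorable.isoformat()}"
        )
    return {
        "SourceDBInstanceIdentifier": source_id,
        "TargetDBInstanceIdentifier": target_id,  # restore creates a new instance
        "RestoreTime": restore_time,
    }
```

The result could be passed as rds.restore_db_instance_to_point_in_time(**params); the target identifier must be new, since a restore never overwrites the source.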

Backup Configuration Details Core

📅

Backup Window

30-minute window daily for full backup
Set during low-traffic hours
Brief I/O suspension possible (single-AZ)
Multi-AZ: backup from standby (zero I/O impact)
Can change anytime

⏰

Retention Period

Default: 7 days
Range: 1–35 days
Set to 0 = disable automated backups
Increase for longer PITR window
Automated backups deleted with DB

🌐

Cross-Region Backup

Replicate automated backups to another region
Additionally protected against regional disaster
Extra cost (storage + transfer)
Manual snapshot copy also supported
Share snapshot with another AWS account

Restore Behaviour Core

⚠️ Restore creates a NEW DB instance

Restoring a snapshot or PITR always creates a new RDS instance with a new endpoint
You must update your application's connection string to the new endpoint
Original DB instance continues running (if still alive)
Gives you a clean way to test restoration without impacting production
Restored instance uses default parameter group — reapply custom settings

🐌

Restore: Lazy Loading (S3)

Restored DB loads data from S3 lazily (on first access per block)
First queries against restored DB may be slower than usual
Data is fully loaded in the background over time
For production restores: pre-warm critical tables with full-table scans (e.g., SELECT *) before sending traffic
Exam: “first queries slow after snapshot restore” → lazy loading from S3

AWS Backup Integration Advanced

🗄️

AWS Backup Service

Central backup management across AWS services
Covers RDS, DynamoDB, EFS, EBS, EC2, Aurora
Set backup policies (frequency, retention, cross-region)
Compliance reporting (PITR, audit logs)
Useful for multi-service backup governance

💰

Snapshot Pricing

First snapshot = full DB size
Subsequent snapshots = incremental (changed blocks)
Storage: ~$0.095/GB-month
Free tier: backup storage up to DB size
Automated backups: free up to DB size

RDS Event Notifications (SNS) Advanced

🔔

Event Notifications

Subscribe to SNS topics for RDS events
Events: failover, backup started/completed, low storage, maintenance, deletion
Covers DB instances, parameter groups, snapshots, security groups
Near real-time alerts — typically within minutes
Chain to Lambda / SQS for automated response

🤖

Automation Examples

SNS → Lambda: auto-scale read replicas on high load alert
SNS → Slack/PagerDuty: alert on-call when failover occurs
SNS → Lambda: take manual snapshot before maintenance window
Exam: “alert when RDS failover happens” → RDS Event Notification + SNS
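The SNS → Lambda pattern above might look roughly like this. The handler pulls RDS event messages out of the standard SNS-to-Lambda envelope and keeps the ones that mention a failover. The message field names ("Source ID", "Event Message") are assumptions about the RDS notification payload, and the keyword match is a simplification; a production handler would more likely filter on event categories in the subscription itself.

```python
import json

def summarize_rds_events(sns_event):
    """Return failover-related alerts found in an SNS-to-Lambda event."""
    alerts = []
    for record in sns_event.get("Records", []):
        # SNS delivers the RDS event as a JSON string in Sns.Message.
        message = json.loads(record["Sns"]["Message"])
        text = message.get("Event Message", "")
        if "failover" in text.lower():
            alerts.append({"source": message.get("Source ID"), "message": text})
    return alerts

def handler(event, context):
    """Lambda entry point: log alerts and report how many were found."""
    alerts = summarize_rds_events(event)
    for alert in alerts:
        # Replace print with a call to your paging or chat integration.
        print(f"RDS failover on {alert['source']}: {alert['message']}")
    return {"alerts": len(alerts)}
```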

🧠 Key Insight

Automated backups = PITR within retention (max 35 days). Manual snapshots = forever until deleted, cross-region, cross-account. Restoring always creates a NEW instance. Multi-AZ backups from standby = zero performance impact on primary.

Chapter Summary Introductory

 Automated backups = daily full + continuous transaction logs — enables PITR (1–35 days)
Manual snapshots = user-initiated, retained forever, cross-region/account shareable
PITR = restore to any second within retention window — full backup + log replay
Restore = new instance — new endpoint, update connection string; first queries slow (lazy S3 loading)
Multi-AZ backup benefit: backup taken from standby — zero I/O impact on primary
Event Notifications: SNS alerts for failover, backup, low storage, maintenance events
Exam: “recover to specific time” → PITR; “cross-account snapshot” → manual snapshot copy
 

Chapter Six

Security & Networking

VPC & Subnet Groups Introductory

RDS always runs inside a VPC. A DB subnet group specifies the subnets (across at least 2 AZs) where RDS can place instances. Best practice: use private subnets only — no public internet access to your database.

RDS Security Layers — VPC + Subnet Group + Security Group + Encryption

VPC

AZ-a — Private Subnet

RDS Primary

KMS encrypted
Port 3306

AZ-b — Private Subnet

RDS Standby

Multi-AZ replica
Auto-failover

🔒 Security Group: inbound port 3306 from EC2 SG only — no public internet
🔑 KMS encryption at rest — all data, logs, snapshots encrypted
🌐 SSL/TLS in transit — encrypted connection between app and DB

Security Groups for RDS Core

✅

Correct Security Group Setup

RDS Security Group inbound: only from App SG
Rule: TCP port 3306 (MySQL) from EC2 security group ID
Never open to 0.0.0.0/0 (public internet)
Lambda in VPC → attach Lambda to same VPC, allow its SG
On-prem: allow VPN/DX CIDR range
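The "only from App SG" rule can be expressed as an IpPermissions entry for EC2's authorize_security_group_ingress call, referencing the app tier's security group instead of a CIDR range. The helper names and group IDs here are hypothetical.

```python
def rds_ingress_rule(app_security_group_id, port=3306):
    """One IpPermissions entry: allow the app security group on the DB port."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        # Reference the app tier's security group, not an IP range.
        "UserIdGroupPairs": [
            {"GroupId": app_security_group_id, "Description": "App tier to RDS"}
        ],
    }

def is_open_to_world(rule):
    """True if a rule allows 0.0.0.0/0, which an RDS group should never do."""
    return any(r.get("CidrIp") == "0.0.0.0/0" for r in rule.get("IpRanges", []))

# Usage with boto3 (not executed here):
#   ec2.authorize_security_group_ingress(
#       GroupId=rds_security_group_id,
#       IpPermissions=[rds_ingress_rule(app_security_group_id)],
#   )
```

Referencing a group ID means new app instances are allowed automatically as they join the app security group, with no IP bookkeeping.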

❌

Common Mistakes

Enabling public accessibility (RDS reachable from internet)
Opening port 3306 to 0.0.0.0/0
Forgetting outbound rules on app SG
Lambda outside VPC can't reach private RDS
Not using SSL — credentials transmitted in plaintext

Encryption Core

🔑

Encryption at Rest (KMS)

Enable at creation time — cannot add later
Uses AWS KMS (AES-256)
Encrypts: data files, backups, snapshots, replicas, logs
Read replicas inherit encryption from primary
To encrypt unencrypted DB: snapshot → copy with encryption → restore
Exam: “encrypt existing unencrypted RDS” → snapshot + copy method
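The snapshot → copy → restore method maps onto three RDS API calls. The sketch below lays them out as an ordered plan using boto3 method names; the identifier naming scheme and the run_plan helper are assumptions, and a real script would wait for each snapshot to become available before the next step (boto3 provides a 'db_snapshot_available' waiter).

```python
def encryption_migration_plan(db_id, kms_key_id):
    """Ordered (method, kwargs) steps to get an encrypted copy of a DB."""
    plain_snapshot = f"{db_id}-unencrypted-snap"
    encrypted_snapshot = f"{db_id}-encrypted-snap"
    return [
        # 1. Snapshot the unencrypted instance.
        ("create_db_snapshot", {
            "DBInstanceIdentifier": db_id,
            "DBSnapshotIdentifier": plain_snapshot,
        }),
        # 2. Copy the snapshot; supplying a KMS key encrypts the copy.
        ("copy_db_snapshot", {
            "SourceDBSnapshotIdentifier": plain_snapshot,
            "TargetDBSnapshotIdentifier": encrypted_snapshot,
            "KmsKeyId": kms_key_id,
        }),
        # 3. Restore a new, encrypted instance from the encrypted copy.
        ("restore_db_instance_from_db_snapshot", {
            "DBInstanceIdentifier": f"{db_id}-encrypted",
            "DBSnapshotIdentifier": encrypted_snapshot,
        }),
    ]

def run_plan(rds_client, plan):
    """Execute each step against an RDS client (waits omitted for brevity)."""
    for method, kwargs in plan:
        getattr(rds_client, method)(**kwargs)
```

The restored instance gets a new endpoint, so the application's connection string has to be switched over as the final step.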

🌐

Encryption in Transit (SSL/TLS)

Download AWS RDS certificate bundle
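MySQL: pass the CA bundle when connecting, e.g. --ssl-ca=rds-ca.pem
Enforce SSL: set parameter require_secure_transport=1 (MySQL) or rds.force_ssl=1 (PostgreSQL)
PostgreSQL: sslmode=require or sslmode=verify-full in a libpq connection string (ssl=true with JDBC)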
Oracle / SQL Server: native SSL

IAM Authentication Advanced

👤

IAM DB Authentication

Authenticate to RDS using IAM token (no password)
Token generated via generate-db-auth-token API
Valid for 15 minutes
Supported: MySQL, PostgreSQL, and MariaDB (check engine-version requirements)
Not supported: Oracle, SQL Server
No credentials stored in app code — use IAM role
Good for: EC2, Lambda, ECS accessing RDS
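Since a token is only valid for 15 minutes, an application usually caches it and refreshes shortly before expiry. A minimal sketch, assuming a hypothetical AuthTokenCache class; the token generator is injected so the cache itself needs no AWS access.

```python
import time

class AuthTokenCache:
    """Cache an IAM auth token and regenerate it shortly before it expires."""

    def __init__(self, generate_token, lifetime_seconds=15 * 60,
                 refresh_margin=60, clock=time.monotonic):
        self._generate = generate_token
        self._lifetime = lifetime_seconds
        self._margin = refresh_margin
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def get(self):
        """Return a token that still has at least `refresh_margin` seconds left."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            self._token = self._generate()
            self._expires_at = now + self._lifetime
        return self._token

# With boto3, the generator could be (hostname and user are placeholders):
#   rds = boto3.client("rds")
#   cache = AuthTokenCache(lambda: rds.generate_db_auth_token(
#       DBHostname="mydb.xxx.us-east-1.rds.amazonaws.com",
#       Port=5432, DBUsername="app_user"))
```

The token is then used as the password when opening a new connection, and the connection must use SSL.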

🗝️

Secrets Manager (Recommended)

Store DB credentials in Secrets Manager
Automatic rotation (every 30/60/90 days)
Native RDS integration — rotates without downtime
App reads secret via SDK — never hardcodes password
Exam: “rotate DB credentials automatically” → Secrets Manager
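On the application side, "reads secret via SDK" usually means fetching a JSON document and turning it into connection settings. The sketch below does the parsing step only; the key names (host, port, username, password, dbname) are assumptions about how an RDS-managed secret is laid out, and connection_kwargs is a hypothetical helper.

```python
import json

def connection_kwargs(secret_string):
    """Turn a Secrets Manager SecretString into DB driver keyword arguments."""
    secret = json.loads(secret_string)
    return {
        "host": secret["host"],
        "port": int(secret["port"]),
        "user": secret["username"],
        "password": secret["password"],
        "database": secret.get("dbname"),  # may be absent in some secrets
    }

# Fetching the secret with boto3 (not executed here; the name is a placeholder):
#   sm = boto3.client("secretsmanager")
#   secret_string = sm.get_secret_value(SecretId="prod/orders-db")["SecretString"]
#   conn = driver.connect(**connection_kwargs(secret_string))
```

Fetching the secret at connect time, rather than once at deploy time, is what lets rotation happen without redeploying the app.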

🧠 Key Insight

Always-on: encryption at rest (enable at creation), SSL in transit, private subnet, SG allowing only app. Exam: “encrypt existing RDS” → snapshot + encrypted copy + restore. “Rotate DB credentials” → Secrets Manager.

Chapter Summary Introductory

 VPC + private subnet = RDS never exposed to internet; DB subnet group spans 2+ AZs
Security Groups = restrict inbound to app SG only (never 0.0.0.0/0)
KMS encryption at rest = must enable at creation; covers data, logs, snapshots, replicas
SSL/TLS in transit = download RDS CA cert, enable in connection string
IAM auth = token-based, no passwords; Secrets Manager = auto-rotating credentials
Exam trap: can't enable encryption on existing DB — must snapshot → copy encrypted → restore
 

Chapter Seven

Architecture Patterns

Pattern 1 — Web Application (EC2 + Multi-AZ RDS) Core

Pattern 1 — Classic 3-tier web app: ALB + EC2 + Multi-AZ RDS

VPC

ALB

Public
subnet

→

PRIVATE SUBNET — App Tier

EC2

AZ-a

EC2

AZ-b

→

PRIVATE SUBNET — DB Tier

Primary

AZ-a

Standby

AZ-b

Multi-AZ RDS • Automated backups • KMS encrypted • Security Group: EC2 → RDS only

Pattern 2 — Read-Heavy App (Primary + Read Replicas) Core

Pattern 2 — Read-heavy: write to primary, read from replicas (e.g., news site, catalogue)

VPC

App Tier

App Server

Writes → Primary
Reads → Replicas

→

PRIMARY (R/W)

Primary

Writes only
AZ-a

async ————▶

async ————▶

READ REPLICAS (R)

Replica 1

AZ-b

Replica 2

AZ-c

Use case: reads up to 3× faster • Analytics on replica • No impact on primary from heavy reads

Pattern 3 — Serverless API (Lambda + RDS Proxy + RDS) Advanced

Each concurrent Lambda execution environment opens its own DB connection, and connections are not shared between environments. At scale, this exhausts the database's max_connections limit. RDS Proxy sits between Lambda and RDS, pooling connections and reusing them efficiently.

Pattern 3 — Serverless: Lambda → RDS Proxy → RDS (avoids connection exhaustion)

Lambda ×1000

Each invocation
needs DB access

→

RDS Proxy

Connection
pooling & reuse
IAM auth

→

RDS Instance

Max connections
protected

RDS Proxy benefits: connection pooling • reduces failover time (by up to 66%) • IAM auth enforcement • Secrets Manager integration
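To illustrate the idea behind connection pooling (this is a conceptual stand-in, not how RDS Proxy is implemented), the sketch below lets any number of callers share a fixed set of connections. ConnectionPool and the connect callable are hypothetical.

```python
import queue

class ConnectionPool:
    """Share a fixed number of connections among many short-lived callers."""

    def __init__(self, connect, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())  # open all connections up front

    def run(self, work):
        """Borrow a connection, run work(conn), and always return it."""
        conn = self._idle.get()  # blocks if every connection is in use
        try:
            return work(conn)
        finally:
            self._idle.put(conn)
```

A thousand invocations going through a pool of five open five database connections in total, instead of up to a thousand.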

Pattern 4 — Full Production Setup Advanced

Pattern 4 — Production: Multi-AZ + Read Replicas + ElastiCache + Secrets Manager

Production VPC

ALB

Public

→

EC2 / ECS

App tier
Multi-AZ

→

ElastiCache

Cache layer
Redis / Memcached

cache miss ↓

RDS Primary

Multi-AZ
+2 Replicas

🔑 Secrets Manager: automatic credential rotation • 🔒 KMS at rest • 🌐 SSL in transit
🔄 Multi-AZ standby • 📸 Daily snapshots (cross-region) • 📈 2 read replicas for analytics
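The "cache miss ↓" arrow in this pattern is the cache-aside approach: check the cache first, fall back to the database on a miss, then populate the cache. A minimal sketch with a plain dict standing in for ElastiCache and a callable standing in for the RDS query; both are assumptions for illustration.

```python
class CacheAside:
    """Read-through helper: cache first, database on a miss."""

    def __init__(self, cache, load_from_db):
        self.cache = cache              # dict here; Redis/Memcached in practice
        self.load_from_db = load_from_db
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        value = self.load_from_db(key)  # only misses reach the database
        self.cache[key] = value
        return value

    def invalidate(self, key):
        """Drop a key after a write so the next read reloads it."""
        self.cache.pop(key, None)
```

A real deployment would also set a TTL on cached entries so stale data ages out even if an invalidation is missed.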

Pattern 5 — Blue/Green Deployments (Zero-Downtime Changes) Advanced

RDS Blue/Green Deployments create a synchronized staging environment (green) that mirrors production (blue). You test schema changes safely, then switch production traffic to green, typically in under a minute, with only a brief interruption to writes.

🔵

Blue (Production)

Current production DB
Live traffic serving users
Changes are tested in green first, never directly here
Becomes old environment after switchover

🟢

Green (Staging)

Synchronized copy of blue
Apply schema changes & patches safely
Test application against new schema
Kept in sync via binlog replication

🔄

Switchover

Switchover on demand (typically under a minute)
DNS flipped — prod now points to green
Old blue retained for rollback
No data loss; writes pause briefly during the switch

🎯 Blue/Green Use Cases

Major version upgrades (e.g., MySQL 5.7 → 8.0) with zero downtime
Schema changes: adding columns, changing indexes
Testing DB engine parameter changes safely before applying to production
Exam: “zero-downtime major version upgrade” → RDS Blue/Green Deployments

Decision Guide — RDS vs Other DB Options Core

You Need...	Use	Why
Managed relational DB (MySQL / PG)	RDS	Patching, backups, Multi-AZ managed
Maximum relational performance	Aurora	5× MySQL / 3× PG performance, auto-scales
Key-value / document store	DynamoDB	Serverless, single-digit ms, unlimited scale
In-memory caching (reduce DB load)	ElastiCache	Redis / Memcached, microsecond latency
Lambda + RDS (connection pooling)	RDS Proxy	Prevents connection exhaustion, IAM auth
Real-time analytics without ETL pipelines	Zero-ETL → Redshift	Near real-time RDS → Redshift, no pipelines needed
OS-level DB access (Oracle / SQL Server)	RDS Custom	Managed + SSH/filesystem access for legacy migrations
Full DB control on EC2	EC2 + DB	Custom configs RDS doesn't support (rare)

Exam Cheatsheet Core

🎯 Exam Keywords → RDS Answer

“automatic failover DB” → Multi-AZ (NOT read replica)
“read-heavy, offload reads” → Read Replica
“recover to specific time” → PITR (automated backups)
“encrypt existing unencrypted RDS” → snapshot → copy with encryption → restore
“Lambda + RDS connection exhaustion” → RDS Proxy
“faster failover with Lambda/RDS” → RDS Proxy (cuts failover time by up to 66%)
“rotate DB credentials automatically” → Secrets Manager
“cross-region disaster recovery DB” → Cross-region read replica
“DB not publicly accessible” → private subnet + security group
“Multi-AZ standby queryable?” → NO (classic Multi-AZ) / YES (Multi-AZ Cluster)
“zero-downtime major version upgrade” → Blue/Green Deployments
“real-time RDS analytics, no ETL” → Zero-ETL integration → Redshift
“OS-level access Oracle/SQL Server managed” → RDS Custom
“identify slow queries / DB bottleneck” → Performance Insights
“alert when DB failover / backup happens” → RDS Event Notifications + SNS
“first queries slow after restore” → lazy S3 loading; pre-warm with full-table scans on critical tables

🧠 Final Insight

RDS is your production relational database foundation: Multi-AZ for HA, Read Replicas for scale, PITR for safety, Secrets Manager for credentials, RDS Proxy for serverless (up to 66% faster failover). Use Blue/Green Deployments for low-downtime upgrades, Performance Insights to diagnose slow queries, Zero-ETL for real-time analytics to Redshift, and RDS Custom for Oracle/SQL Server when you need OS-level access.