Amazon RDS

Amazon RDS —
Managed Relational Database

Fully managed relational database: no patching, no backups to configure, no infrastructure to manage. RDS gives you MySQL, PostgreSQL, MariaDB, Oracle, or SQL Server with built-in high availability, automatic failover, and point-in-time recovery.

⚡ RDS in 30 Seconds

  • Managed SQL database — AWS handles patching, backups, and hardware
  • 6 engines: MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, Aurora (separate service)
  • Multi-AZ = high availability (synchronous standby, auto-failover)
  • Read Replicas = read scaling (asynchronous, up to 15 replicas)
  • Runs inside your VPC: keep it in private subnets, off the public internet
01
Chapter One

What is Amazon RDS

The Problem — Running Databases Yourself Introductory

Running a relational database yourself on EC2 requires you to handle everything: installing the engine, configuring storage, setting up replication, scheduling backups, applying patches, monitoring for failures, and manually failing over when the primary goes down. That's weeks of work before you write a single line of application code.

👉 The core problem: Databases need constant operational care — backups, patches, failover, replication. Every hour spent on DB ops is an hour not spent on your product. RDS takes all of that away.

What is a Relational Database Introductory

A relational database stores data in structured tables with rows and columns. Tables are related to each other through foreign keys. You query data with SQL. This model works for almost all transactional applications: e-commerce orders, user accounts, financial records, inventory systems.

📊

Structured Data

Schema-defined. Every row in a table has the same columns. Strong data integrity with constraints (NOT NULL, UNIQUE, FOREIGN KEY).

🔗

Relationships

Tables join together. A users table relates to an orders table. Complex queries with JOINs, GROUP BY, transactions.

🔒

ACID Transactions

Atomicity, Consistency, Isolation, Durability. Either all changes commit, or none do. Critical for financial and medical data.
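The three properties above (constraints, joins, all-or-nothing transactions) can be seen in a few lines. This is a minimal sketch that uses Python's built-in SQLite as a stand-in for any relational engine; the `users` and `orders` tables and the `place_orders` helper are hypothetical names chosen for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL/PostgreSQL; schema names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE
    );
    CREATE TABLE orders (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        total   REAL NOT NULL
    );
""")


def place_orders(user_id, totals):
    """Insert several orders atomically; return False if any row is rejected."""
    try:
        with conn:  # commits on success, rolls back everything on an exception
            for total in totals:
                conn.execute(
                    "INSERT INTO orders (user_id, total) VALUES (?, ?)",
                    (user_id, total),
                )
        return True
    except sqlite3.IntegrityError:
        return False


with conn:
    conn.execute("INSERT INTO users (id, email) VALUES (1, 'a@example.com')")

committed = place_orders(1, [19.99, 5.00])   # both rows commit
rolled_back = place_orders(1, [7.50, None])  # NOT NULL fails: neither row is kept
orphan = place_orders(999, [1.00])           # FOREIGN KEY fails: no such user

# JOIN + GROUP BY across the two related tables.
rows = conn.execute(
    "SELECT u.email, COUNT(o.id) FROM users u "
    "JOIN orders o ON o.user_id = u.id GROUP BY u.id"
).fetchall()
```

Only the first call leaves rows behind: the second call's valid 7.50 row is discarded together with the invalid one, which is the atomicity guarantee in practice.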

What is Amazon RDS Introductory

Amazon RDS (Relational Database Service) is a fully managed service that runs a relational database engine on your behalf inside AWS. You choose the engine, instance size, and storage. AWS handles everything else — provisioning, patching, backups, monitoring, failover.
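As a rough illustration of "you choose the engine, instance size, and storage", the sketch below builds the keyword arguments that boto3's `rds.create_db_instance` call accepts. It only assembles a dictionary and makes no AWS call. The helper name, the `appadmin` username and the specific defaults (gp3, 7-day retention) are assumptions for illustration, not recommendations from this page.

```python
def build_db_instance_params(identifier, engine, instance_class, storage_gib):
    """Return kwargs shaped for boto3's rds.create_db_instance (no call is made)."""
    supported = {"mysql", "postgres", "mariadb"}  # subset kept small for the sketch
    if engine not in supported:
        raise ValueError(f"unsupported engine for this sketch: {engine}")
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": engine,
        "DBInstanceClass": instance_class,   # e.g. "db.t3.micro"
        "AllocatedStorage": storage_gib,     # GiB
        "StorageType": "gp3",
        "MultiAZ": True,                     # synchronous standby in another AZ
        "StorageEncrypted": True,            # must be chosen at creation time
        "PubliclyAccessible": False,         # keep the instance private
        "BackupRetentionPeriod": 7,          # days of automated backups
        "MasterUsername": "appadmin",        # hypothetical username
        "ManageMasterUserPassword": True,    # password kept in Secrets Manager
    }


params = build_db_instance_params("mydb", "postgres", "db.t3.micro", 20)
```

With boto3 installed and credentials configured, the dictionary could be passed as `boto3.client("rds").create_db_instance(**params)`; everything not listed here is left to RDS defaults.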

🛑

You Manage on EC2

  • Install database engine (MySQL, Postgres...)
  • Configure storage, IOPS, file system
  • Apply OS + DB patches manually
  • Set up replication manually
  • Schedule and test backups
  • Monitor and respond to failures
  • Implement failover scripts

RDS Manages

  • Hardware provisioning and lifecycle
  • Database engine installation
  • Automated OS and DB patching
  • Synchronous replication (Multi-AZ)
  • Automated daily backups + transaction logs
  • Health monitoring + auto-restart
  • Automatic failover in <2 minutes
Supported Database Engines Core
RDS Supported Engines — Choose the right engine for your use case
MySQL
MySQL
Most popular
open-source
PostgreSQL
PostgreSQL
Advanced features
JSON support
MariaDB
MariaDB
MySQL fork
open-source
Oracle
Oracle
Enterprise
bring-your-license
SQL Server
SQL Server
Microsoft
.NET stack

⚠️ Aurora is a separate, AWS-optimized engine that is MySQL/PostgreSQL compatible; AWS claims up to 5× the throughput of standard MySQL and up to 3× that of standard PostgreSQL. Covered in the Aurora page.

RDS Custom — Managed DB with OS-Level Access Advanced
🔧

What is RDS Custom

  • Managed DB service with OS + engine access
  • Supports Oracle and Microsoft SQL Server only
  • SSH access, filesystem access, custom scripts
  • Toggle automation (pause RDS automation to apply custom changes)
  • AWS still manages backups, Multi-AZ, failover
🎯

When to Use RDS Custom

  • Legacy Oracle apps requiring custom OS-level configuration
  • SQL Server features not exposed by standard RDS
  • Custom patches or DB features RDS doesn't support
  • Migrating on-prem Oracle to AWS with minimal changes
  • Exam: “OS-level access + managed RDS” → RDS Custom
Concept Diagram — Managed vs Unmanaged Introductory
DIY (EC2) vs Managed (RDS) — What you own vs what AWS owns
DIY ON EC2: YOU OWN ALL
  • 🔧 Application Code
  • 🔧 DB Engine (install + configure)
  • 🔧 OS Patches + Security
  • 🔧 Backups + Replication + Failover
  • Hardware & Infrastructure
RDS: YOU OWN THE TOP ONLY
  • ✅ Application Code ← Your job
  • ✅ DB Engine ← AWS manages
  • ✅ OS Patches ← AWS manages
  • ✅ Backups + Replication ← AWS manages
  • Hardware ← AWS manages
AWS Architecture Diagram Core
RDS in AWS — Application talks to RDS endpoint inside VPC
VPC (10.0.0.0/16)
  • PUBLIC SUBNET: EC2 App (web server)
  • PRIVATE SUBNET: RDS Instance (MySQL / Postgres), port 3306 / 5432
🔒 Security Group: only EC2 allowed
🔑 KMS encrypted at rest
🔗 Endpoint: mydb.xxx.us-east-1.rds.amazonaws.com
🔄 Automated daily backups (7 day retention)
Mental Model — The Managed Apartment Analogy Introductory
🏠

DIY (EC2 Database) = Owning a House

  • You choose and install the plumbing (DB engine)
  • You fix the boiler when it breaks (patches)
  • You call a plumber at 3am (on-call)
  • You organize your own insurance (backups)
  • Total control, total responsibility
🏢

RDS = Managed Apartment Building

  • Building manager handles plumbing (AWS manages engine)
  • Maintenance team patches issues (automated patching)
  • 24/7 on-call building staff (AWS monitoring)
  • Fire insurance included (automated backups)
  • You just live there and focus on your work
Multi-AZ vs Read Replicas — Critical Distinction Core

This is the most important concept in RDS and the most common exam mistake. These two features solve completely different problems:

Feature | Purpose | Replication | Can Serve Reads?
Multi-AZ | 🛡️ High Availability | Synchronous (zero data loss) | ❌ No (standby only)
Read Replica | 📈 Read Scaling | Asynchronous (slight lag) | ✅ Yes (read traffic)
🧠 Key Insight

Multi-AZ = HA (failover protection). Read Replica = performance (scale reads). You can combine both: a Multi-AZ primary with read replicas for a production-grade, highly available, read-scalable database tier.

Chapter Summary Introductory
  • RDS = fully managed relational DB — AWS handles patching, backups, failover, replication
  • 6 engines: MySQL, PostgreSQL, MariaDB, Oracle, SQL Server (+ Aurora separately)
  • Runs in your VPC: private subnets and security groups keep it off the public internet
  • Multi-AZ = high availability (synchronous standby, NOT readable)
  • Read Replicas = read scaling (asynchronous, readable, up to 15)
  • Mental model: managed apartment building — you use it, AWS maintains it
02
Chapter Two

Core Concepts — DB Instance, Storage & Endpoints

The DB Instance — The Core Unit Introductory

A DB instance is an isolated database environment running in AWS. It is the fundamental building block of RDS — think of it as a virtual database server. Each DB instance runs one database engine (MySQL, PostgreSQL, etc.) and can contain multiple databases.

🖥️

Instance Class

  • db.t3 — burstable (dev/test)
  • db.m6g — general purpose
  • db.r6g — memory-optimized
  • db.x2g — extra-large memory
  • Choose: vCPU + RAM
💾

Storage

  • gp2 — general SSD (3 IOPS/GB)
  • gp3 — new general SSD, independent IOPS
  • io1 / io2 — provisioned IOPS (I/O intensive)
  • magnetic — legacy, not recommended
  • Auto-scaling available (gp2/gp3/io1)
🔗

Endpoint

  • DNS hostname, not IP address
  • DNS stays the same after failover
  • mydb.xxx.us-east-1.rds.amazonaws.com
  • Port: 3306 (MySQL) / 5432 (PG)
  • Connection string stays stable
Storage Types — When to Use What Core
Type | IOPS | Best For | Cost
gp2 (SSD) | 3 IOPS/GB, burst to 3,000 | Most workloads, small DBs | $
gp3 (SSD) | 3,000 baseline, up to 16,000 | Most workloads; prefer over gp2 | $ (about 20% cheaper than gp2)
io1 / io2 | Up to 64,000 IOPS | High I/O: OLTP, ERP, financial | $$$$
Magnetic | ~100 IOPS | Legacy only, avoid | $
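The gp2 row ties IOPS to volume size, which is easy to misjudge for small databases. The sketch below works the arithmetic. The 3 IOPS/GB rate and the 3,000 burst figure come from the table; the 100 IOPS floor and 16,000 ceiling are assumptions taken from the general EBS gp2 limits, so they are exposed as parameters.

```python
def gp2_baseline_iops(size_gib, floor=100, cap=16_000):
    """Baseline gp2 IOPS: 3 per GiB, clamped to an assumed floor and cap."""
    return min(cap, max(floor, 3 * size_gib))


def gp2_can_burst(size_gib, burst_iops=3_000):
    """Bursting only matters while the baseline is below the burst level."""
    return gp2_baseline_iops(size_gib) < burst_iops


small = gp2_baseline_iops(20)       # tiny volume sits on the floor
medium = gp2_baseline_iops(100)     # 3 IOPS x 100 GiB
large = gp2_baseline_iops(10_000)   # clamped to the cap
```

A 100 GiB gp2 volume therefore has a 300 IOPS baseline, a tenth of the 3,000 IOPS that gp3 provides at any size, which is the practical reason behind the "prefer gp3" advice.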
DB Instance Lifecycle Core
DB Instance components — what makes up an RDS instance
DB INSTANCE
  • DB Engine (MySQL/PG...)
  • Parameter Group
  • Storage (gp3/io1)
  • Subnet Group (VPC)
  • DNS Endpoint (stable)
Connected to the instance: Application (EC2 / Lambda) • Automated Backups (S3)
Key Configuration Options Core
📝

Parameter Groups

  • DB engine configuration settings
  • E.g., max_connections, innodb_buffer_pool_size
  • Default parameter group = engine defaults
  • Create custom group to tune performance
  • Changes may require reboot
📅

Maintenance Windows

  • Weekly window for minor version upgrades + patches
  • Default: 30-minute window during low-traffic hours
  • You choose the window (e.g., Sun 03:00–03:30 UTC)
  • Multi-AZ = failover to standby → near-zero downtime
  • Single-AZ = brief outage during patch
📈

Storage Auto-Scaling

  • Automatically expands storage when near limit
  • Set maximum storage threshold (required)
  • Trigger: free space <10% for 5+ minutes, and
  • last scaling was 6+ hours ago (both must hold)
  • Increment: the greatest of 10 GiB, 10% of current allocated storage, or forecast 7-hour growth
  • Max limit: 65,536 GiB (64 TiB)
  • No downtime • Supported: gp2, gp3, io1
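The trigger conditions in this card combine with AND, which a small decision function makes explicit. This is a sketch of the rules as listed here, not RDS's actual implementation. The `StorageState` fields are hypothetical names, and the minimum increment is passed in by the caller rather than hard-coded.

```python
from dataclasses import dataclass


@dataclass
class StorageState:
    allocated_gib: int
    free_gib: float
    minutes_below_threshold: int
    hours_since_last_scaling: float
    max_allocated_gib: int  # the maximum storage threshold you configured


def should_scale(s):
    """True only when every trigger condition holds at the same time."""
    low_space = s.free_gib * 10 < s.allocated_gib  # free space under 10%
    return (
        low_space
        and s.minutes_below_threshold >= 5
        and s.hours_since_last_scaling >= 6
        and s.allocated_gib < s.max_allocated_gib
    )


def next_allocation(s, min_increment_gib):
    """Grow by 10% of current size or the minimum increment, whichever is larger."""
    if not should_scale(s):
        return s.allocated_gib
    ten_percent = (s.allocated_gib + 9) // 10  # ceiling of 10%
    increment = max(min_increment_gib, ten_percent)
    return min(s.max_allocated_gib, s.allocated_gib + increment)


ready = StorageState(100, 8, 10, 12, 200)
too_soon = StorageState(100, 8, 10, 2, 200)
near_max = StorageState(195, 5, 10, 12, 200)
```

The `too_soon` case shows why a rapidly filling volume can still run out of space: a second expansion is refused until the cooldown has passed.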
🖧️

DB Subnet Group

  • Collection of subnets in different AZs
  • RDS uses these for Multi-AZ placement
  • Must span at least 2 AZs
  • Best practice: private subnets only
  • Required for all RDS instances (even single-AZ)
🧠 Key Insight

Prefer gp3 over gp2 — it's cheaper and gives independent IOPS control. Always use a DB subnet group with private subnets. The DNS endpoint never changes — your app always uses the same connection string, even after failover.

Performance Insights — Wait Time Analysis Advanced
📊

What is Performance Insights

  • Visualises DB load by wait type (CPU, I/O, locks, network)
  • Shows SQL queries consuming the most resources
  • 7-day retention free; up to 2 years with paid tier
  • Supports MySQL, PostgreSQL, MariaDB, Oracle, SQL Server
  • Enabled per DB instance, with negligible performance overhead
🔍

When to Use

  • Database is slow but you don't know why
  • Find which queries are causing lock waits
  • Identify CPU vs I/O bound workloads
  • Spot regression after schema change or deploy
  • Exam: “identify DB bottleneck / slow queries” → Performance Insights
Chapter Summary Introductory
  • DB Instance = isolated DB environment (engine + compute + storage + endpoint)
  • Instance classes: db.t3 (burstable), db.m6g (general), db.r6g (memory-optimized)
  • Storage: gp3 (best default), io1 (high-IOPS workloads), gp2 (legacy)
  • Endpoint = stable DNS hostname — stays the same after failover
  • Parameter groups = DB engine tuning; subnet groups = VPC placement
  • Storage auto-scaling = expands automatically when <10% free (no downtime)
03
Chapter Three

High Availability — Multi-AZ

What is Multi-AZ Introductory

Multi-AZ is RDS's high availability feature. When enabled, RDS automatically provisions a standby replica in a different Availability Zone. Data is synchronously replicated to the standby. If the primary fails, RDS automatically fails over to the standby — no manual intervention, no data loss.

👉 Critical to understand: Multi-AZ standby is NOT a read replica. You cannot query the standby instance. It exists solely as a failover target. Its only job is to take over if the primary fails.

How Failover Works Core
1️⃣

Failure Detected

Primary instance fails (hardware, OS crash, AZ outage, maintenance). RDS health checks detect within ~30 seconds.

2️⃣

DNS Flipped

RDS updates the DNS endpoint to point to the standby. Total failover time: typically 1–2 minutes. Existing connections drop, and your app reconnects to the same endpoint once its client retries and DNS resolves to the new primary.

3️⃣

Standby Promoted

Standby becomes the new primary. RDS automatically provisions a new standby in the other AZ to restore HA. Zero data loss (synchronous replication).

Diagram 1 — Multi-AZ Normal Operation Core
Multi-AZ — Synchronous replication to standby (standby not queryable)
Application (EC2 / Lambda) → DNS Endpoint (mydb.xxx.rds...)
  • AZ-a (us-east-1a): PRIMARY ✔ Read + Write
  • sync replication from primary to standby
  • AZ-b (us-east-1b): STANDBY ❌ NOT queryable (cannot read from standby)
Diagram 2 — Failover Flow Core
Multi-AZ Failover — Primary fails, DNS flips, standby promoted (<2 min)
AZ-a (us-east-1a): Primary ❌ FAILED (hardware failure / AZ outage)
AZ-b (us-east-1b): Standby → New Primary ✔ (promoted automatically, same DNS endpoint)
① Primary failure detected (~30s)
② RDS updates DNS record → standby IP
③ Application reconnects (same endpoint URL)
④ Standby is now primary — new standby provisioned in AZ-a
⏱️ Total downtime: typically 60–120 seconds
What Triggers Failover Core

Automatic Failover Triggers

  • Primary instance failure (hardware, OS crash)
  • Network connectivity loss to primary
  • Storage failure on primary
  • AZ or data centre outage
  • Planned maintenance with reboot
  • You manually trigger (Reboot with failover)
💡

Multi-AZ Gotchas

  • Standby is in a different AZ, not different region
  • The standby has no endpoint of its own; apps only ever use the instance endpoint
  • Backups taken from standby (zero I/O impact on primary)
  • Both instances same class/storage (can't scale standby independently)
  • Extra cost: ~2× (two instances running)
  • Not available for all instance classes
Multi-AZ DB Cluster (New) Advanced

RDS also offers a Multi-AZ DB Cluster (a newer option) — one writer and two readable standbys across 3 AZs. Unlike classic Multi-AZ, the standbys can serve reads. Failover is faster (<35 seconds). Currently available for MySQL 8.0 and PostgreSQL 13+.

💻

Classic Multi-AZ (DB Instance)

  • 1 primary + 1 standby
  • Standby NOT readable
  • Failover: ~60–120 seconds
  • Supported: all engines
  • Exam default assumption
🗃️

Multi-AZ DB Cluster

  • 1 writer + 2 readable standbys
  • Standbys ARE readable
  • Failover: <35 seconds
  • MySQL 8 + PostgreSQL 13+ only
  • Higher cost, better read availability

📋 Classic vs Cluster — Comparison Table

Feature | Classic Multi-AZ | Multi-AZ Cluster
Standby count | 1 (different AZ) | 2 (different AZs)
Standby readable? | ❌ No | ✅ Yes
Failover time | 60–120 seconds | <35 seconds
Supported engines | All engines | MySQL 8, PostgreSQL 13+
🧠 Key Insight

Multi-AZ = availability, not performance. The standby is invisible to your app — same endpoint, same experience. Failover is automatic and takes less than 2 minutes. For exam: “Multi-AZ” always means “HA”, not scaling. Standby = NOT queryable (unless using the newer Multi-AZ Cluster).

Chapter Summary Introductory
  • Multi-AZ = high availability — primary + synchronous standby in different AZ
  • Standby is NOT queryable — exists only as a failover target
  • Automatic failover in 1–2 minutes — DNS updated, app reconnects to same endpoint
  • Triggers: hardware failure, AZ outage, network loss, planned maintenance reboot
  • Zero data loss: synchronous replication means a write is acknowledged only after it is also committed on the standby
  • Multi-AZ Cluster: newer option (MySQL 8/PG 13+) — 2 readable standbys, <35s failover
04
Chapter Four

Scaling — Read Replicas

What is a Read Replica Introductory

A Read Replica is an asynchronous copy of your primary RDS instance that serves read-only queries. Applications send writes to the primary and reads to the replica(s). This offloads read traffic from the primary, improving overall database performance for read-heavy workloads.

👉 Core idea: Read Replicas solve performance, not availability. Reads scale horizontally — add more replicas. Writes still go to one primary. Replication is asynchronous — a tiny lag exists.
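RDS does not split reads and writes for you; with read replicas the application (or its driver) decides which endpoint each statement goes to. The sketch below shows the idea with a round-robin router. The class name and the endpoint strings are hypothetical, and the `SELECT` prefix check is a deliberate simplification.

```python
import itertools


class ReadWriteRouter:
    """Send plain reads to replicas in rotation and everything else to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas) if replicas else None

    def endpoint_for(self, sql, in_transaction=False):
        is_read = sql.lstrip().upper().startswith("SELECT")
        # Reads inside a transaction stay on the primary so they see its own writes.
        if is_read and not in_transaction and self._replicas is not None:
            return next(self._replicas)
        return self.primary


router = ReadWriteRouter(
    primary="mydb.example-primary",                       # hypothetical endpoints
    replicas=["mydb-ro-1.example", "mydb-ro-2.example"],
)
targets = [
    router.endpoint_for("SELECT * FROM articles"),
    router.endpoint_for("SELECT * FROM articles"),
    router.endpoint_for("INSERT INTO articles (title) VALUES ('x')"),
    router.endpoint_for("SELECT * FROM cart", in_transaction=True),
]
```

Because replication is asynchronous, a read routed to a replica immediately after a write may not see that write yet. Reads that must be current (and statements such as `SELECT ... FOR UPDATE`) belong on the primary.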

Multi-AZ vs Read Replica — Side by Side Core
🛡️

Multi-AZ — High Availability

  • Purpose: survive failures
  • Replication: synchronous (zero data loss)
  • Queryable: NO — standby only
  • Automatic failover: YES (<2 min)
  • Same region only
  • Cost: ~2× (two instances)
📈

Read Replica — Read Scaling

  • Purpose: handle more reads
  • Replication: asynchronous (slight lag)
  • Queryable: YES — read-only traffic
  • Automatic failover: NO — manual promotion
  • Same region, cross-region, cross-account
  • Cost: additional instance per replica
Read Replica Facts Core
🔢

Limits

  • Up to 15 replicas (MySQL, MariaDB)
  • Up to 15 replicas (PostgreSQL)
  • Up to 5 replicas (Oracle, SQL Server)
  • Replicas of replicas (chaining) supported
  • Each replica has its own endpoint
🌎

Cross-Region

  • Create replica in a different AWS region
  • Replication over network (encrypted)
  • Disaster recovery: promote to primary if home region fails
  • Lower latency reads for global users
  • Additional cross-region data transfer cost
🎯

Promotion

  • Manually promote replica → standalone primary
  • Replication stops on promotion
  • Gets its own read+write endpoint
  • Use for: DR failover, migration, splitting a database into shards
  • NOT automatic — requires manual action
Diagram 1 — Read Replica Architecture Core
Read Replicas — Writes to primary, reads distributed across replicas
Application: writes → primary, reads → replicas
  • Primary (Read + Write, AZ-a): all writes land here
  • Async replication from the primary to each replica
  • Replica 1 (read-only, AZ-b, same region)
  • Replica 2 (read-only, AZ-c)
  • Replica 3 (read-only, eu-west-1, cross-region): Promote → Primary (manual, for DR)
Read Replica Use Cases Core

Good Use Cases

  • Read-heavy apps: news sites, e-commerce catalogues
  • Analytics queries: run heavy reports without impacting primary
  • Geographic distribution: replica in EU for EU users
  • DR strategy: cross-region replica for regional failover
  • Migration: promote replica to move DB to new region

Not Suitable For

  • Applications that need guaranteed consistency (async lag)
  • Write scaling — all writes still go to one primary
  • Automatic failover — promotion is manual
  • Real-time synchronisation (tiny delay always exists)
🧠 Key Insight

Read Replicas = scale reads horizontally. Exam: “read-heavy workload” or “offload analytics” → Read Replica. “Automatic failover” → Multi-AZ (not Read Replica). Cross-region replica → disaster recovery + global low-latency reads.

Chapter Summary Introductory
  • Read Replica = asynchronous copy for read-only traffic — offloads primary
  • Up to 15 replicas per primary (MySQL, PG), each with its own endpoint
  • Async replication = tiny lag — not suitable for apps needing instant consistency
  • Cross-region replicas — for DR and lower global read latency
  • Promotion = manual action — promotes replica to standalone read+write primary
  • Exam trap: Read Replica ≠ automatic failover (that's Multi-AZ)
05
Chapter Five

Backups & Snapshots

Two Types of Backups Introductory

RDS provides two backup mechanisms that complement each other: automated backups (enabled by default, point-in-time recovery) and manual snapshots (user-initiated, retained forever until deleted).

🔄

Automated Backups

  • Enabled by default on all RDS instances
  • Daily full backup during backup window
  • Continuous transaction log backups (every 5 min)
  • Retention: 1–35 days (default 7 days)
  • Stored in S3 (AWS-managed, not visible)
  • Deleted when DB instance is deleted (unless you choose to retain them)
  • Enables point-in-time recovery (PITR)
📸

Manual Snapshots

  • User-initiated (CLI / Console / API)
  • Full backup of DB instance
  • Retained indefinitely (until you delete)
  • Stored in S3 (visible in console)
  • Survive DB instance deletion
  • Can copy across regions
  • Can share with other AWS accounts
Point-in-Time Recovery (PITR) Core

Point-in-Time Recovery lets you restore your database to any second within your retention period, up to the latest restorable time (typically within the last five minutes). RDS combines the daily snapshot with transaction logs to reconstruct the exact state of the database at your requested timestamp.

Point-in-Time Recovery — Restore to any second within retention window
RETENTION WINDOW (up to 35 days): Day 0 → Day 7 (full backup) → Day 14 (full backup) → 🎯 Restore to this point (Day 19, 14:32:07) → Today
Restore = full backup + replay txn logs → ✔ New DB instance restored
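The restore logic described above (pick a full backup, then replay logs up to the target) can be sketched as a small planning function. This models the idea only; RDS does this internally, and the function name, arguments and sample timestamps are assumptions for illustration.

```python
from datetime import datetime, timedelta


def plan_pitr(target, now, retention_days, backups):
    """Return (base_backup_time, log_span_to_replay) for a restore target."""
    if not 1 <= retention_days <= 35:
        raise ValueError("retention must be 1-35 days for PITR")
    if target > now or target < now - timedelta(days=retention_days):
        raise ValueError("target is outside the retention window")
    candidates = [b for b in backups if b <= target]
    if not candidates:
        raise ValueError("no full backup precedes the target")
    base = max(candidates)       # newest full backup at or before the target
    return base, target - base   # transaction logs cover the remainder


now = datetime(2024, 1, 20, 12, 0, 0)
# One full backup per day at 03:00, for the previous seven days (sample data).
daily_backups = [datetime(2024, 1, d, 3, 0, 0) for d in range(14, 21)]
base, replay = plan_pitr(datetime(2024, 1, 19, 14, 32, 7), now, 7, daily_backups)
```

Here the restore starts from the 03:00 backup on 19 January and replays a little over eleven and a half hours of transaction logs.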
Backup Configuration Details Core
📅

Backup Window

  • 30-minute window daily for full backup
  • Set during low-traffic hours
  • Brief I/O suspension possible (single-AZ)
  • Multi-AZ: backup from standby (zero I/O impact)
  • Can change anytime

Retention Period

  • Default: 7 days
  • Range: 1–35 days
  • Set to 0 = disable automated backups
  • Increase for longer PITR window
  • Automated backups deleted with DB
🌐

Cross-Region Backup

  • Replicate automated backups to another region
  • Additionally protected against regional disaster
  • Extra cost (storage + transfer)
  • Manual snapshot copy also supported
  • Share snapshot with another AWS account
Restore Behaviour Core

⚠️ Restore creates a NEW DB instance

  • Restoring a snapshot or PITR always creates a new RDS instance with a new endpoint
  • You must update your application's connection string to the new endpoint
  • Original DB instance continues running (if still alive)
  • Gives you a clean way to test restoration without impacting production
  • Restored instance uses default parameter group — reapply custom settings
🐌

Restore: Lazy Loading (S3)

  • Restored DB loads data from S3 lazily (on first access per block)
  • First queries against restored DB may be slower than usual
  • Data is fully loaded in the background over time
  • For production restores: warm the data first by reading the tables you need quickly (for example, a full-table scan such as SELECT *)
  • Exam: “first queries slow after snapshot restore” → lazy loading from S3
AWS Backup Integration Advanced
🗄️

AWS Backup Service

  • Central backup management across AWS services
  • Covers RDS, DynamoDB, EFS, EBS, EC2, Aurora
  • Set backup policies (frequency, retention, cross-region)
  • Compliance reporting (PITR, audit logs)
  • Useful for multi-service backup governance
💰

Snapshot Pricing

  • First snapshot = full DB size
  • Subsequent snapshots = incremental (changed blocks)
  • Storage: ~$0.095/GB-month (varies by region)
  • Backup storage is free up to 100% of your total provisioned DB storage in the region
  • You pay only for backup storage beyond that allowance
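The pricing rule in this card reduces to one line of arithmetic: only backup storage beyond the size of the database is billed. A sketch follows, with the per-GB rate passed in because it differs by region; the function name is hypothetical.

```python
def monthly_backup_cost(provisioned_gib, backup_gib, price_per_gib_month):
    """Bill only the backup storage that exceeds the provisioned DB storage."""
    billable_gib = max(0.0, backup_gib - provisioned_gib)
    return round(billable_gib * price_per_gib_month, 2)


within_allowance = monthly_backup_cost(100, 80, 0.095)    # nothing billable
over_allowance = monthly_backup_cost(100, 300, 0.095)     # 200 GiB billable
```

With the rate quoted above, 300 GiB of backups against a 100 GiB database would cost about $19 a month, while backups smaller than the database cost nothing.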
RDS Event Notifications (SNS) Advanced
🔔

Event Notifications

  • Subscribe to SNS topics for RDS events
  • Events: failover, backup started/completed, low storage, maintenance, deletion
  • Covers DB instances, parameter groups, snapshots, security groups
  • Near real-time alerts — typically within minutes
  • Chain to Lambda / SQS for automated response
🤖

Automation Examples

  • SNS → Lambda: auto-scale read replicas on high load alert
  • SNS → Slack/PagerDuty: alert on-call when failover occurs
  • SNS → Lambda: take manual snapshot before maintenance window
  • Exam: “alert when RDS failover happens” → RDS Event Notification + SNS
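The "SNS → Lambda" examples can be sketched as a handler that filters failover events out of an SNS delivery. The outer `Records[].Sns.Message` shape is the standard SNS-to-Lambda envelope. The inner keys `Event Message` and `Source ID` are an assumption about the RDS notification payload and should be checked against a real message before relying on them.

```python
import json


def handler(event, context=None):
    """Return one alert dict per RDS failover event found in the SNS batch."""
    alerts = []
    for record in event.get("Records", []):
        message = json.loads(record["Sns"]["Message"])  # SNS delivers a JSON string
        text = message.get("Event Message", "")          # assumed key name
        if "failover" in text.lower():
            alerts.append({
                "source": message.get("Source ID", "unknown"),  # assumed key name
                "message": text,
            })
    return alerts


# Hand-built sample event for local testing (not captured from AWS).
sample_event = {
    "Records": [
        {"Sns": {"Message": json.dumps({
            "Source ID": "mydb",
            "Event Message": "Multi-AZ instance failover completed",
        })}},
        {"Sns": {"Message": json.dumps({
            "Source ID": "mydb",
            "Event Message": "Backing up DB instance",
        })}},
    ]
}
alerts = handler(sample_event)
```

In a real deployment the returned alerts would be forwarded to a pager or chat webhook; that step is omitted here.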
🧠 Key Insight

Automated backups = PITR within retention (max 35 days). Manual snapshots = forever until deleted, cross-region, cross-account. Restoring always creates a NEW instance. Multi-AZ backups from standby = zero performance impact on primary.

Chapter Summary Introductory
  • Automated backups = daily full + continuous transaction logs — enables PITR (1–35 days)
  • Manual snapshots = user-initiated, retained forever, cross-region/account shareable
  • PITR = restore to any second within retention window — full backup + log replay
  • Restore = new instance — new endpoint, update connection string; first queries slow (lazy S3 loading)
  • Multi-AZ backup benefit: backup taken from standby — zero I/O impact on primary
  • Event Notifications: SNS alerts for failover, backup, low storage, maintenance events
  • Exam: “recover to specific time” → PITR; “cross-account snapshot” → manual snapshot copy
06
Chapter Six

Security & Networking

VPC & Subnet Groups Introductory

RDS always runs inside a VPC. A DB subnet group specifies the subnets (across at least 2 AZs) where RDS can place instances. Best practice: use private subnets only — no public internet access to your database.

RDS Security Layers — VPC + Subnet Group + Security Group + Encryption
VPC
  • AZ-a (Private Subnet): RDS Primary, KMS encrypted, port 3306
  • AZ-b (Private Subnet): RDS Standby, Multi-AZ replica, auto-failover
🔒 Security Group: inbound port 3306 from EC2 SG only — no public internet
🔑 KMS encryption at rest — all data, logs, snapshots encrypted
🌐 SSL/TLS in transit — encrypted connection between app and DB
Security Groups for RDS Core

Correct Security Group Setup

  • RDS Security Group inbound: only from App SG
  • Rule: TCP port 3306 (MySQL) from EC2 security group ID
  • Never open to 0.0.0.0/0 (public internet)
  • Lambda in VPC → attach Lambda to same VPC, allow its SG
  • On-prem: allow VPN/DX CIDR range

Common Mistakes

  • Enabling public accessibility (RDS reachable from internet)
  • Opening port 3306 to 0.0.0.0/0
  • Forgetting outbound rules on app SG
  • Lambda outside VPC can't reach private RDS
  • Not using SSL — credentials transmitted in plaintext
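The correct setup and the most common mistake can both be expressed as data. The sketch below builds an inbound rule in the shape that the EC2 `authorize_security_group_ingress` API expects for its `IpPermissions` list, plus a check that flags rules open to the whole internet. The helper names and the security group ID are hypothetical.

```python
def rds_ingress_rule(app_sg_id, port=3306):
    """One IpPermissions entry that admits only the app tier's security group."""
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "UserIdGroupPairs": [{"GroupId": app_sg_id}],  # source is an SG, not a CIDR
    }


def open_to_internet(permission):
    """True if the rule admits any IPv4 or IPv6 address."""
    v4 = any(r.get("CidrIp") == "0.0.0.0/0"
             for r in permission.get("IpRanges", []))
    v6 = any(r.get("CidrIpv6") == "::/0"
             for r in permission.get("Ipv6Ranges", []))
    return v4 or v6


good_rule = rds_ingress_rule("sg-0123456789abcdef0")  # hypothetical SG ID
bad_rule = {
    "IpProtocol": "tcp", "FromPort": 3306, "ToPort": 3306,
    "IpRanges": [{"CidrIp": "0.0.0.0/0"}],
}
```

Referencing the app security group by ID means new app servers gain access automatically when they join that group, without any CIDR bookkeeping.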
Encryption Core
🔑

Encryption at Rest (KMS)

  • Enable at creation time — cannot add later
  • Uses AWS KMS (AES-256)
  • Encrypts: data files, backups, snapshots, replicas, logs
  • Read replicas inherit encryption from primary
  • To encrypt unencrypted DB: snapshot → copy with encryption → restore
  • Exam: “encrypt existing unencrypted RDS” → snapshot + copy method
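The snapshot → encrypted copy → restore sequence maps onto three RDS API operations. The sketch below lists them in order, with the parameter names used by the boto3 RDS client; it builds a plan and calls nothing. The snapshot and instance naming scheme and the KMS alias are assumptions for illustration.

```python
def encrypt_existing_db_plan(source_id, kms_key_id):
    """Ordered (operation, params) steps to get an encrypted copy of a DB."""
    plain_snap = f"{source_id}-plain-snap"        # hypothetical naming scheme
    encrypted_snap = f"{source_id}-encrypted-snap"
    return [
        ("create_db_snapshot", {
            "DBInstanceIdentifier": source_id,
            "DBSnapshotIdentifier": plain_snap,
        }),
        ("copy_db_snapshot", {
            "SourceDBSnapshotIdentifier": plain_snap,
            "TargetDBSnapshotIdentifier": encrypted_snap,
            "KmsKeyId": kms_key_id,               # supplying a key encrypts the copy
        }),
        ("restore_db_instance_from_db_snapshot", {
            "DBInstanceIdentifier": f"{source_id}-encrypted",
            "DBSnapshotIdentifier": encrypted_snap,
        }),
    ]


plan = encrypt_existing_db_plan("mydb", "alias/my-rds-key")  # hypothetical key alias
```

Each step has to finish before the next begins (snapshots take time to become available), and the restored instance has a new endpoint, so the application must be repointed at the end.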
🌐

Encryption in Transit (SSL/TLS)

  • Download AWS RDS certificate bundle
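  • MySQL client: pass the CA bundle with --ssl-ca=rds-ca.pem
  • Enforce SSL: set parameter require_secure_transport=1 (MySQL) or rds.force_ssl=1 (PostgreSQL)
  • PostgreSQL: sslmode=require (or verify-full) in a libpq connection string; ssl=true is the JDBC form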
  • Oracle / SQL Server: native SSL
IAM Authentication Advanced
👤

IAM DB Authentication

  • Authenticate to RDS using IAM token (no password)
  • Token generated via generate-db-auth-token API
  • Valid for 15 minutes
  • Supported: MySQL, MariaDB, and PostgreSQL
  • Not supported: Oracle, SQL Server
  • No credentials stored in app code — use IAM role
  • Good for: EC2, Lambda, ECS accessing RDS
🗝️

Secrets Manager (Recommended)

  • Store DB credentials in Secrets Manager
  • Automatic rotation (every 30/60/90 days)
  • Native RDS integration — rotates without downtime
  • App reads secret via SDK — never hardcodes password
  • Exam: “rotate DB credentials automatically” → Secrets Manager
🧠 Key Insight

Always-on: encryption at rest (enable at creation), SSL in transit, private subnet, SG allowing only app. Exam: “encrypt existing RDS” → snapshot + encrypted copy + restore. “Rotate DB credentials” → Secrets Manager.

Chapter Summary Introductory
  • VPC + private subnet = RDS never exposed to internet; DB subnet group spans 2+ AZs
  • Security Groups = restrict inbound to app SG only (never 0.0.0.0/0)
  • KMS encryption at rest = must enable at creation; covers data, logs, snapshots, replicas
  • SSL/TLS in transit = download RDS CA cert, enable in connection string
  • IAM auth = token-based, no passwords; Secrets Manager = auto-rotating credentials
  • Exam trap: can't enable encryption on existing DB — must snapshot → copy encrypted → restore
07
Chapter Seven

Architecture Patterns

Pattern 1 — Web Application (EC2 + Multi-AZ RDS) Core
Pattern 1 — Classic 3-tier web app: ALB + EC2 + Multi-AZ RDS
VPC
  • Public subnet: ALB
  • PRIVATE SUBNET (App Tier): EC2 in AZ-a, EC2 in AZ-b
  • PRIVATE SUBNET (DB Tier): RDS Primary in AZ-a, Standby in AZ-b
Multi-AZ RDS • Automated backups • KMS encrypted • Security Group: EC2 → RDS only
Pattern 2 — Read-Heavy App (Primary + Read Replicas) Core
Pattern 2 — Read-heavy: write to primary, read from replicas (e.g., news site, catalogue)
VPC
  • App Tier: EC2 App Server (writes → primary, reads → replicas)
  • PRIMARY (R/W): RDS Primary in AZ-a, receives all writes
  • Async replication from the primary to each replica
  • READ REPLICAS (R): Replica 1 in AZ-b, Replica 2 in AZ-c
Use case: read traffic spread across replicas • Analytics on replica • No impact on primary from heavy reads
Pattern 3 — Serverless API (Lambda + RDS Proxy + RDS) Advanced

Lambda functions cannot share a connection pool: each concurrent execution environment opens its own DB connection, and every scale-out adds more. At scale, this exhausts the database's connection limit (max_connections). RDS Proxy sits between Lambda and RDS, pooling connections and reusing them efficiently.
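A quick back-of-the-envelope check shows when this becomes a problem. The sketch below compares peak Lambda concurrency with the database's connection limit. The function names and the sample numbers are illustrative assumptions; the real limit depends on the engine and instance class.

```python
def peak_connections(concurrent_envs, conns_per_env=1):
    """Each concurrent Lambda environment holds its own connection(s)."""
    return concurrent_envs * conns_per_env


def exceeds_limit(concurrent_envs, max_connections, conns_per_env=1, reserved=0):
    """True if direct connections would exhaust the database's limit.

    `reserved` leaves headroom for admin and monitoring sessions.
    """
    usable = max_connections - reserved
    return peak_connections(concurrent_envs, conns_per_env) > usable


burst = exceeds_limit(concurrent_envs=1000, max_connections=150)
steady = exceeds_limit(concurrent_envs=100, max_connections=150, reserved=10)
```

When the check comes back `True`, the options are to cap Lambda concurrency, move to a larger instance, or put RDS Proxy in front so that many client connections share a smaller set of database connections.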

Pattern 3 — Serverless: Lambda → RDS Proxy → RDS (avoids connection exhaustion)
Lambda ×1000 (each invocation needs DB access) → RDS Proxy (connection pooling & reuse, IAM auth) → RDS Instance (max connections protected)
RDS Proxy benefits: connection pooling • reduces failover time (up to 66% faster) • IAM auth enforcement • Secrets Manager integration
Pattern 4 — Full Production Setup Advanced
Pattern 4 — Production: Multi-AZ + Read Replicas + ElastiCache + Secrets Manager
Production VPC
  • ALB (public)
  • EC2 / ECS app tier (Multi-AZ)
  • ElastiCache cache layer (Redis / Memcached)
  • On cache miss ↓
  • RDS Primary (Multi-AZ, +2 replicas)
🔑 Secrets Manager: automatic credential rotation • 🔒 KMS at rest • 🌐 SSL in transit
🔄 Multi-AZ standby • 📸 Daily snapshots (cross-region) • 📈 2 read replicas for analytics
Pattern 5 — Blue/Green Deployments (Zero-Downtime Changes) Advanced

RDS Blue/Green Deployments create a synchronized staging environment (green) that mirrors production (blue). You test schema changes safely, then switch production traffic to green, typically in under a minute, with only a brief interruption while writes are paused.

🔵

Blue (Production)

  • Current production DB
  • Live traffic serving users
  • Changes are not tested here; they go to green first
  • Becomes old environment after switchover
🟢

Green (Staging)

  • Synchronized copy of blue
  • Apply schema changes & patches safely
  • Test application against new schema
  • Kept in sync via logical replication (binlog for MySQL/MariaDB)
🔄

Switchover

  • Single-action switchover (typically under a minute)
  • DNS flipped: prod now points to green
  • Old blue retained for rollback
  • No data loss; writes pause briefly during the switch

🎯 Blue/Green Use Cases

  • Major version upgrades (e.g., MySQL 5.7 → 8.0) with minimal downtime
  • Schema changes: adding columns, changing indexes
  • Testing DB engine parameter changes safely before applying to production
  • Exam: “zero-downtime major version upgrade” → RDS Blue/Green Deployments
Decision Guide — RDS vs Other DB Options Core
You Need... | Use | Why
Managed relational DB (MySQL / PG) | RDS | Patching, backups, Multi-AZ managed
Maximum relational performance | Aurora | 5× MySQL / 3× PG performance, auto-scales
Key-value / document store | DynamoDB | Serverless, single-digit ms, unlimited scale
In-memory caching (reduce DB load) | ElastiCache | Redis / Memcached, microsecond latency
Lambda + RDS (connection pooling) | RDS Proxy | Prevents connection exhaustion, IAM auth
Real-time analytics without ETL pipelines | Zero-ETL → Redshift | Near real-time RDS → Redshift, no pipelines needed
OS-level DB access (Oracle / SQL Server) | RDS Custom | Managed + SSH/filesystem access for legacy migrations
Full DB control on EC2 | EC2 + DB | Custom configs RDS doesn't support (rare)
Exam Cheatsheet Core

🎯 Exam Keywords → RDS Answer

  • “automatic failover DB” → Multi-AZ (NOT read replica)
  • “read-heavy, offload reads” → Read Replica
  • “recover to specific time” → PITR (automated backups)
  • “encrypt existing unencrypted RDS” → snapshot → copy with encryption → restore
  • “Lambda + RDS connection exhaustion” → RDS Proxy
  • “faster failover with Lambda/RDS” → RDS Proxy (cuts failover time ~66%)
  • “rotate DB credentials automatically” → Secrets Manager
  • “cross-region disaster recovery DB” → Cross-region read replica
  • “DB not publicly accessible” → private subnet + security group
  • “Multi-AZ standby queryable?” → NO (classic Multi-AZ) / YES (Multi-AZ Cluster)
  • “zero-downtime major version upgrade” → Blue/Green Deployments
  • “real-time RDS analytics, no ETL” → Zero-ETL integration → Redshift
  • “OS-level access Oracle/SQL Server managed” → RDS Custom
  • “identify slow queries / DB bottleneck” → Performance Insights
  • “alert when DB failover / backup happens” → RDS Event Notifications + SNS
  • “first queries slow after restore” → lazy S3 loading; warm hot tables with a full scan
🧠 Final Insight

RDS is your production relational database foundation: Multi-AZ for HA, Read Replicas for scale, PITR for safety, Secrets Manager for credentials, RDS Proxy for serverless (66% faster failover). Use Blue/Green Deployments for zero-downtime upgrades, Performance Insights to diagnose slow queries, Zero-ETL for real-time analytics to Redshift, and RDS Custom for Oracle/SQL Server when you need OS-level access.