Other AWS Database Services: The Right Database for Every Problem

AWS has a database for every data model. Beyond RDS, Aurora, DynamoDB, and ElastiCache, there is a purpose-built service for graphs, analytics, time-series data, immutable ledgers, JSON documents, and wide-column stores. Learning when to choose each is the real skill.

🗺️ The AWS Database Universe

Type	Service	Best For
Relational	RDS / Aurora	SQL, ACID, OLTP
NoSQL / Key-Value	DynamoDB	Serverless, ms latency, any scale
In-Memory Cache	ElastiCache	Sub-ms reads, reduce DB load
Graph	Neptune	Relationships, social, fraud
Analytics / OLAP	Redshift	Data warehouse, petabyte analytics
Time-Series	Timestream	IoT, metrics, time-based data
Ledger	QLDB	Immutable audit trail, financial records
Document	DocumentDB	JSON, MongoDB-compatible
Wide-Column	Keyspaces	Cassandra-compatible, wide-column
In-Memory (Durable)	MemoryDB for Redis	Redis speed + full durability
Relational + OS Access	RDS Custom	Oracle/SQL Server with SSH access
Relational (On-Prem)	RDS on VMware	Hybrid cloud, data residency

Graph Database

Amazon Neptune — Graph Database

What is Neptune & Why Graphs (Introductory)

Traditional databases store data in tables or documents. But some problems are fundamentally about relationships: who is connected to whom, how are entities related, what path connects A to B? Amazon Neptune is a fully managed graph database that makes relationship queries fast and natural.

👉 Mental model: Think of Neptune as a database made of nodes (things) and edges (relationships). Instead of a table of users and a join to friends, you simply traverse connections. “Find all friends of friends who live in NYC” is a single graph traversal — not a multi-join SQL query.
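To make the traversal idea concrete, here is a minimal in-memory sketch of a "friends of friends who live in NYC" query. It is plain Python, not Neptune, Gremlin, or openCypher; the `FRIENDS` and `CITY` data and both function names are hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical toy graph: node -> set of FRIEND_OF neighbours.
FRIENDS = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice", "erin"},
    "dave": {"bob"},
    "erin": {"carol"},
}
# Hypothetical node property (city of residence).
CITY = {"alice": "NYC", "bob": "Boston", "carol": "NYC", "dave": "NYC", "erin": "Austin"}


def within_hops(graph, start, max_hops):
    """Breadth-first traversal: map each reachable node to its hop distance."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return distance


def friends_of_friends_in(graph, cities, start, city):
    """Nodes exactly two hops from start whose city property matches."""
    hops = within_hops(graph, start, 2)
    return sorted(n for n, d in hops.items() if d == 2 and cities.get(n) == city)


print(friends_of_friends_in(FRIENDS, CITY, "alice", "NYC"))  # ['dave']
```

The traversal only touches nodes that are actually connected to the start node, which is the property that makes graph queries attractive for highly connected data.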

🧠

Core Concept

Nodes: Alice, Bob, Product, Post
Edges: FRIEND_OF, BOUGHT, LIKES
Properties: attributes on nodes/edges
Queries traverse relationships, not rows
Optimized for highly connected data

✅

Use Cases

Social networks (friends, followers)
Fraud detection (unusual connections)
Recommendation engines
Knowledge graphs
Network topology, IT infrastructure

📌

Query Languages

Gremlin: property graph traversal
SPARQL: RDF / semantic web
openCypher: graph pattern matching
Neptune supports all three
Pick based on data model

Graph model — nodes and edges represent entities and relationships

✅

Choose Neptune When

Data is fundamentally about relationships
Multi-hop traversals (friend of friend)
Fraud detection (unusual connection patterns)
Recommendation: “users who bought X also bought Y”
Knowledge graphs, ontologies

📌

Don't Use Neptune When

Data is tabular — use RDS/Aurora
Simple key-value lookups — DynamoDB
Relationships are shallow (SQL JOIN is fine)
Analytics/reporting — Redshift

Neptune vs RDS & Global Database (Core)

⚡

Neptune vs RDS for Relationships

SQL can do friend-of-friend with multiple JOINs
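Each extra hop adds another self-JOIN, so SQL cost grows quickly with traversal depth
Neptune follows edges directly, so cost tracks the connections visited rather than total table size
For deep multi-hop queries, a graph database is typically much faster than chained JOINs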
Use RDS when relationships are shallow (1–2 JOINs)

🌐

Neptune Global Database

Replicate graph data across multiple regions
Replication lag <1 second
Active-active reads across regions
Single writer region (primary)
Use for global social / recommendation apps

🧠 Key Insight

Neptune is for relationship-first data. When your most important queries are about how things are connected, Neptune is the right choice. Exam: “social network”, “fraud detection”, “recommendation engine” → Neptune.

Data Warehouse / Analytics

Amazon Redshift — Data Warehouse

OLTP vs OLAP: The Critical Distinction (Introductory)

OLTP (Online Transaction Processing) = what RDS does: fast, small, frequent transactions. OLAP (Online Analytical Processing) = what Redshift does: complex analytical queries across massive datasets. Never run OLTP on Redshift or analytics on RDS.

Aspect	OLTP (RDS/Aurora)	OLAP (Redshift)
Queries	Simple, fast (ms)	Complex, slow (seconds–min)
Data volume	GB–TB	TB–PB
Operations	INSERT/UPDATE/DELETE	SELECT, GROUP BY, SUM, AVG
Storage	Row-based	Columnar (faster aggregations)
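The row-based versus columnar distinction in the last table row can be sketched in a few lines. This is an illustration of the two layouts only, not how RDS or Redshift store data internally; the sample `rows` and both function names are hypothetical.

```python
# Hypothetical sales records.
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 50.0},
]


def get_order(order_id):
    """Row-based layout (OLTP style): fetch one complete record."""
    return next(r for r in rows if r["order_id"] == order_id)


# Columnar layout (OLAP style): one list per column.
columns = {name: [r[name] for r in rows] for name in rows[0]}


def total_by_region():
    """Aggregate while reading only the two columns the query needs."""
    totals = {}
    for region, amount in zip(columns["region"], columns["amount"]):
        totals[region] = totals.get(region, 0.0) + amount
    return totals


print(get_order(2))
print(total_by_region())
```

The aggregation never reads `order_id`, which is the intuition behind "columnar = faster aggregations": a wide table with hundreds of columns only pays for the columns a query touches.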

📊

What is Redshift

Fully managed data warehouse
Columnar storage for fast aggregations
Massively parallel processing (MPP)
Petabyte-scale analytics
SQL interface (PostgreSQL-compatible)

✅

Use Cases

Business intelligence dashboards
Sales / revenue analytics
Log analysis at scale
ETL pipeline destination
RDS → Redshift (Zero-ETL)

⚡

Key Features

Redshift Serverless: no cluster to manage
Spectrum: query S3 data directly
Zero-ETL: near real-time from RDS/Aurora
ML: train models with SQL
Up to 16 PB storage

Redshift in a typical analytics pipeline: RDS / Aurora (OLTP source, Zero-ETL) → Redshift (data warehouse, columnar, TB–PB) → BI tools (QuickSight, Tableau, Looker)

Distribution Styles & Sort Keys (Core)

📦

Distribution Styles

AUTO: Redshift chooses automatically (default)
EVEN: Round-robin across nodes — large tables without clear join key
KEY: Same key value → same node — optimizes joins on that column
ALL: Full copy on every node — small dimension tables
🎯 Exam: “optimize join performance” → KEY distribution on join column
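The three explicit styles above can be sketched as row-placement functions. This is a simplified model under stated assumptions: the node count, the CRC32 hash, and all function names are hypothetical choices for the sketch and are not Redshift's actual placement algorithm.

```python
import zlib

NODES = 3  # hypothetical cluster size


def node_for_key(value, nodes=NODES):
    """KEY style: a stable hash of the distribution key picks the node."""
    return zlib.crc32(str(value).encode()) % nodes


def distribute_even(rows, nodes=NODES):
    """EVEN style: round-robin placement, ignoring row contents."""
    placement = [[] for _ in range(nodes)]
    for i, row in enumerate(rows):
        placement[i % nodes].append(row)
    return placement


def distribute_key(rows, key, nodes=NODES):
    """KEY style: rows with the same key value land on the same node."""
    placement = [[] for _ in range(nodes)]
    for row in rows:
        placement[node_for_key(row[key], nodes)].append(row)
    return placement


def distribute_all(rows, nodes=NODES):
    """ALL style: every node holds a full copy of the table."""
    return [list(rows) for _ in range(nodes)]


def join_is_local(left, right, key):
    """True if every left row finds its join partner on its own node."""
    for left_rows, right_rows in zip(left, right):
        right_keys = {r[key] for r in right_rows}
        if any(row[key] not in right_keys for row in left_rows):
            return False
    return True


orders = [{"order_id": i, "customer_id": i % 4} for i in range(8)]
customers = [{"customer_id": c} for c in range(4)]

orders_by_node = distribute_key(orders, "customer_id")
customers_by_node = distribute_key(customers, "customer_id")
print(join_is_local(orders_by_node, customers_by_node, "customer_id"))  # True
```

When both tables are distributed on the join column, matching rows are co-located and the join needs no data movement between nodes, which is why KEY distribution is the usual answer to "optimize join performance".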

📊

Sort Keys & Concurrency Scaling

Sort Key: data physically sorted on disk
Put frequently-filtered columns first (e.g., date, region)
Compound: columns in defined order — most common
Interleaved: equal weight per column — rare, high-maintenance
Concurrency Scaling: auto-adds transient clusters during query spikes; pay per second used
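Why a sort key helps range filters can be sketched with per-block min/max metadata (often called zone maps). The block size, data, and function names below are hypothetical, and the model is deliberately much simpler than a real columnar engine.

```python
def build_blocks(values, block_size):
    """Split values into fixed-size blocks and record each block's min and max."""
    blocks = []
    for i in range(0, len(values), block_size):
        chunk = values[i:i + block_size]
        blocks.append({"min": min(chunk), "max": max(chunk), "values": chunk})
    return blocks


def range_scan(blocks, low, high):
    """Return matching values and how many blocks had to be read."""
    scanned = 0
    hits = []
    for block in blocks:
        if block["max"] < low or block["min"] > high:
            continue  # block cannot contain a match, skip it
        scanned += 1
        hits.extend(v for v in block["values"] if low <= v <= high)
    return hits, scanned


# Hypothetical "day number" column, once sorted and once interleaved.
sorted_days = list(range(1, 101))
unsorted_days = [d for i in range(10) for d in range(i + 1, 101, 10)]

sorted_hits, sorted_scanned = range_scan(build_blocks(sorted_days, 10), 42, 47)
unsorted_hits, unsorted_scanned = range_scan(build_blocks(unsorted_days, 10), 42, 47)

print(sorted_scanned, unsorted_scanned)  # 1 10
```

Both scans return the same six values, but the sorted layout reads one block while the interleaved layout reads all ten. Putting the most frequently filtered column first in the sort key is what makes this kind of skipping effective.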

🧠 Key Insight

Redshift is for analytics, not transactions. Exam: “data warehouse”, “petabyte analytics”, “business intelligence”, “OLAP” → Redshift. RDS → Zero-ETL → Redshift for near-real-time analytics without pipelines.

Time-Series Database

Amazon Timestream — Time-Series Database

What is Time-Series Data (Introductory)

Time-series data is measurements recorded at regular intervals over time — IoT sensor readings, CPU metrics, stock prices. Key pattern: data is always appended (never updated), queries are always time-ranged, and recent data is accessed far more than old data.

⏱️

What is Timestream

Fully managed time-series database
Serverless — scales automatically
Purpose-built for timestamped data
10× faster and 1/10 cost vs relational
Built-in time-series SQL functions

✅

Use Cases

IoT sensor data (temp, pressure)
DevOps metrics (CPU, memory, latency)
Application performance monitoring
Financial data (tick data, prices)
Industrial equipment monitoring

💡

Key Features

Tiered storage: memory store (recent) → magnetic store (older), auto-tiered
SQL-like queries with time functions
Retention policy: auto-expire old data
Integrates with Grafana, QuickSight
IoT Core / Kinesis integration

Timestream storage: recent data in the memory store, older data auto-tiered to the magnetic store

Memory Store, Magnetic Store & Scheduled Queries (Core)

🧲

Memory vs Magnetic Store

Memory store: recent data, high-throughput writes, low-latency reads (~1ms)
Magnetic store: older data, lower cost, slower queries
Configure retention policy to move data automatically
Memory retention: hours to days (configurable)
Magnetic retention: days to years (configurable)
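The retention behaviour above can be sketched as a simple age check. This is an illustrative model, not the Timestream API: the function name is hypothetical, and the sketch assumes both retention periods are measured from the record's own timestamp.

```python
from datetime import datetime, timedelta, timezone


def storage_tier(record_time, now, memory_retention, magnetic_retention):
    """Classify a record by age (assumption: both periods count from its timestamp)."""
    age = now - record_time
    if age <= memory_retention:
        return "memory"
    if age <= magnetic_retention:
        return "magnetic"
    return "expired"


# Hypothetical policy: 12 hours in memory, 7 days in total.
now = datetime(2024, 1, 10, tzinfo=timezone.utc)
memory_retention = timedelta(hours=12)
magnetic_retention = timedelta(days=7)

for age in (timedelta(hours=2), timedelta(days=3), timedelta(days=30)):
    print(age, storage_tier(now - age, now, memory_retention, magnetic_retention))
```

A short memory retention keeps the expensive tier small, while the magnetic retention decides how far back historical queries can reach.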

⏰

Scheduled Queries

Run aggregations (hourly, daily, weekly) automatically
Write results to a new derived table
Use for downsampling high-frequency IoT data
Example: 1-sec sensor reads → hourly averages
Reduces query cost and improves dashboard speed
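The downsampling that a scheduled query performs can be sketched in plain Python. This is not Timestream SQL; the function name and the synthetic readings are hypothetical, and the sketch only shows the 1-second to hourly-average reduction described above.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def hourly_averages(readings):
    """Group (timestamp, value) pairs by hour and average each group."""
    buckets = defaultdict(list)
    for timestamp, value in readings:
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(buckets.items())}


# Hypothetical sensor: two hours of 1-second readings.
start = datetime(2024, 1, 1, tzinfo=timezone.utc)
readings = [
    (start + timedelta(seconds=s), 20.0 if s < 3600 else 30.0)
    for s in range(7200)
]

derived = hourly_averages(readings)
print(len(readings), "raw points ->", len(derived), "hourly points")
```

A dashboard that reads the derived table scans 2 points instead of 7,200, which is where the cost and speed benefit comes from.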

🧠 Key Insight

Timestream is for append-heavy, time-ranged data. Recent = hot, old = cold. The auto-tiering matches the natural access pattern perfectly. Exam: “IoT sensor data”, “metrics”, “time-series” → Timestream.

Ledger Database

Amazon QLDB — Ledger Database

What is an Immutable Ledger (Introductory)

Amazon QLDB (Quantum Ledger Database) is a fully managed ledger database that provides a transparent, immutable, cryptographically verifiable transaction log. Every change is permanently recorded — nothing can be deleted or altered, and you can prove it mathematically.

👉 Mental model: QLDB is a traditional database where every row has a full history and that history is cryptographically proven. Show a regulator not just what the data is today, but every state it's ever been in — and prove it hasn't been tampered with.

📜

Core Features

Immutable journal: nothing can be deleted
Cryptographic hashes: SHA-256 chain
Full history: every version of every record
PartiQL (SQL-like) query language
Serverless, fully managed
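The "SHA-256 chain" idea can be sketched as an append-only list in which each entry's hash covers the previous entry's hash. This is a simplified illustration, not QLDB's actual journal format or API; `GENESIS` and all function names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical starting hash for an empty journal


def entry_hash(previous_hash, document):
    """Hash the previous entry's hash together with this document."""
    payload = json.dumps(document, sort_keys=True).encode()
    return hashlib.sha256(previous_hash.encode() + payload).hexdigest()


def append(journal, document):
    """Append-only write: each entry is chained to the one before it."""
    previous = journal[-1]["hash"] if journal else GENESIS
    journal.append({"document": document, "hash": entry_hash(previous, document)})


def verify(journal):
    """Recompute every hash; any edited entry breaks the chain."""
    previous = GENESIS
    for entry in journal:
        if entry["hash"] != entry_hash(previous, entry["document"]):
            return False
        previous = entry["hash"]
    return True


journal = []
append(journal, {"account": "A-1", "balance": 100})
append(journal, {"account": "A-1", "balance": 70})
print(verify(journal))  # True
```

Changing any earlier document (for example, setting the first balance to a different value) makes `verify` return `False`, because every later hash depends on it. That dependency is what "cryptographically verifiable" means in practice.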

✅

Use Cases

Financial ledgers (debit/credit history)
Supply chain tracking
Regulatory audit trails
Medical records history
Insurance claims processing

💡

QLDB vs Blockchain

QLDB: centralized (AWS-managed)
Blockchain: decentralized (no single owner)
QLDB: faster, simpler, single-owner trust
Use QLDB when: you own and trust the data
Exam: “immutable, audit trail” → QLDB

QLDB journal — every change appended, cryptographically chained, cannot be altered

QLDB vs Managed Blockchain: Exam Trap (Core)

Feature	QLDB	Managed Blockchain
Trust model	Centralized (AWS-managed)	Decentralized (multiple parties)
Immutability	Cryptographic hash verification	Consensus across nodes
Ownership	Single owner controls the ledger	No single owner — trustless
Performance	Faster, simpler	Slower (consensus overhead)
🎯 Exam trigger	“audit trail, ledger, immutable history”	“blockchain, multiple parties, trustless”

🧠 Key Insight

QLDB is for when you need proof that data hasn't been tampered with. Exam: “immutable audit trail”, “financial ledger”, “verify data history” → QLDB. “Decentralized blockchain” → Amazon Managed Blockchain (not QLDB).

Document Database

Amazon DocumentDB — Document Database

What is DocumentDB (Introductory)

Amazon DocumentDB is a fully managed document database that is MongoDB-compatible. It stores data as JSON-like documents with flexible schemas where each document can have different fields. It uses Aurora's shared distributed storage model under the hood.

📄

Document Model

Data stored as JSON documents
Each document can have different schema
Rich query language (filter, project, aggregate)
Collections = tables; Documents = rows
Documents nested up to 100 levels
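The flexible-schema and filter/project ideas above can be sketched with ordinary Python dictionaries. This is not the MongoDB or DocumentDB query language; the `products` collection and the `get_path` and `find` helpers are hypothetical.

```python
# Hypothetical collection: documents in one collection with different fields.
products = [
    {"_id": 1, "type": "laptop", "brand": "Acme", "specs": {"ram_gb": 16, "cpu": "8-core"}},
    {"_id": 2, "type": "shirt", "brand": "Acme", "sizes": ["S", "M", "L"]},
    {"_id": 3, "type": "laptop", "brand": "Globex", "specs": {"ram_gb": 32}},
]


def get_path(document, dotted_path):
    """Read a nested field such as 'specs.ram_gb'; return None if absent."""
    current = document
    for part in dotted_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def find(collection, predicate, fields):
    """Filter documents, then project only the requested fields."""
    return [
        {field: get_path(doc, field) for field in fields}
        for doc in collection
        if predicate(doc)
    ]


big_laptops = find(
    products,
    lambda d: d.get("type") == "laptop" and (get_path(d, "specs.ram_gb") or 0) >= 16,
    ["brand", "specs.ram_gb"],
)
print(big_laptops)
```

The shirt document has no `specs` field at all, and the query still works. That tolerance for missing or extra fields is the practical meaning of "each document can have a different schema".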

✅

Use Cases

Content management systems
User profiles (varied attributes)
Product catalogs (different specs)
Mobile app backends
MongoDB → AWS migration

⚡

Why DocumentDB

MongoDB-compatible: minimal code changes
Storage auto-grows to 64 TiB
6 copies across 3 AZs (Aurora-style)
Fully managed, no Mongo ops
Up to 15 read replicas

Aspect	DocumentDB	DynamoDB
Model	JSON documents, rich queries	Key-value, access-pattern queries
Query flexibility	Higher (MongoDB query language)	Key-based only
Scale	Very large	Unlimited (serverless)
Compatible	MongoDB drivers/apps	DynamoDB SDK only

👉 Important: DocumentDB is wire-protocol compatible with MongoDB, so most existing MongoDB drivers and applications work with minimal code changes. Not every MongoDB feature is supported, so check compatibility before migrating. It is not a fork of the MongoDB open-source code. Under the hood it uses an Aurora-style distributed storage engine, which gives you Aurora-class durability (6 copies across 3 AZs) with MongoDB API compatibility.

🧠 Key Insight

DocumentDB is for JSON-document workloads needing MongoDB compatibility or richer queries than DynamoDB. Exam: “MongoDB-compatible”, “JSON documents”, “flexible schema” → DocumentDB. For unlimited serverless scale, DynamoDB is better.

Wide-Column Database

Amazon Keyspaces — Wide-Column Database

What is Keyspaces (Introductory)

Amazon Keyspaces is a fully managed, serverless wide-column database compatible with Apache Cassandra. Run Cassandra workloads on AWS without managing Cassandra clusters — automatic scaling, no capacity planning, pay per use.

🗂️

Wide-Column Model

Rows identified by partition key
Each row can have different columns
Optimized for write-heavy workloads
CQL (Cassandra Query Language)
Designed for high throughput at scale
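The wide-column model can be sketched as a dictionary of partitions, each holding rows with their own set of columns. This is an illustration only, not CQL or the Keyspaces API. The sketch also assumes a clustering key (here `event_time`) that orders rows inside a partition, as in the Cassandra data model; the table contents and function names are hypothetical.

```python
# partition key -> {clustering key -> columns}
table = {}


def write(device_id, event_time, **columns):
    """Writes are cheap upserts addressed by partition key and clustering key."""
    table.setdefault(device_id, {})[event_time] = columns


def read_partition(device_id, start, end):
    """Read one partition's rows in clustering-key order within a range."""
    partition = table.get(device_id, {})
    return [(t, partition[t]) for t in sorted(partition) if start <= t <= end]


# Rows in the same table carry different columns.
write("pump-1", 100, pressure=2.1)
write("pump-1", 160, pressure=2.3, vibration=0.04)
write("pump-2", 100, temperature=71.5)

print(read_partition("pump-1", 0, 150))
```

Every read names a partition key up front, so a query only touches one partition's data. That access pattern is what lets this model sustain high write and read throughput at scale.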

✅

Use Cases

Industrial equipment data
High-velocity write workloads
Cassandra → AWS migration
Time-series at massive scale
Event logging, clickstreams

⚡

Why Keyspaces

Cassandra-compatible: works with existing CQL drivers and tools
Serverless, no Cassandra ops
Auto-scales read/write capacity
Single-digit ms latency
Multi-Region replication

Keyspaces Capacity Modes (Core)

💳

On-Demand (Pay per Request)

Pay only for reads/writes you use
No capacity planning required
Best for unpredictable or spiky workloads
Scales instantly to any traffic level

📏

Provisioned (RCU / WCU)

Set read/write capacity units upfront
Lower cost for predictable, steady workloads
Auto-scaling adjusts capacity automatically
Similar model to DynamoDB provisioned mode

🧠 Key Insight

Keyspaces is for Cassandra workloads on AWS without managing Cassandra clusters. Exam: “Cassandra-compatible”, “wide-column”, “migrate Cassandra” → Keyspaces. For general-purpose NoSQL at massive scale, DynamoDB is usually the better choice.

In-Memory Database (Durable)

Amazon MemoryDB for Redis

Redis Speed + Database Durability (Introductory)

Amazon MemoryDB for Redis is a fully managed, Redis-compatible, durable in-memory database. Unlike ElastiCache (a cache where data loss is acceptable on failure), MemoryDB persists every write to a Multi-AZ transaction log before acknowledging it — giving you Redis speed with true database durability.

👉 Mental model: MemoryDB = “Redis with durability”. ElastiCache is a cache in front of your database. MemoryDB is the database, built for real-time applications that need microsecond reads and single-digit-millisecond writes and cannot afford data loss.
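The difference between a cache and a durable in-memory store can be sketched with two small classes. This is a conceptual model, not the Redis protocol or the MemoryDB implementation; the class names are hypothetical, and a Python list stands in for the Multi-AZ transaction log.

```python
class CacheOnly:
    """Cache model: data lives only in memory."""

    def __init__(self):
        self.memory = {}

    def set(self, key, value):
        self.memory[key] = value
        return "OK"

    def crash_and_restart(self):
        self.memory = {}  # nothing to recover from


class DurableStore:
    """Durable model: log the write first, then acknowledge it."""

    def __init__(self):
        self.log = []  # stand-in for the Multi-AZ transaction log
        self.memory = {}

    def set(self, key, value):
        self.log.append((key, value))  # persist before acknowledging
        self.memory[key] = value
        return "OK"

    def crash_and_restart(self):
        self.memory = {}
        for key, value in self.log:  # replay the log to rebuild state
            self.memory[key] = value


cache, store = CacheOnly(), DurableStore()
for db in (cache, store):
    db.set("cart:42", "3 items")
    db.crash_and_restart()

print(cache.memory.get("cart:42"), store.memory.get("cart:42"))  # None 3 items
```

Because the acknowledgement happens only after the log append, an acknowledged write can always be replayed. That ordering is the source of the "no data loss on failover" guarantee, and it is also why writes are slower than reads.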

Feature	ElastiCache for Redis	MemoryDB for Redis
Purpose	Cache (sits in front of DB)	Primary database
Durability	Optional snapshots only	Always — Multi-AZ transaction log
Data loss on failover	Possible	None
Use case	Session cache, leaderboards	Real-time apps, persistent microservice state

🧠 Key Insight

Exam: “durable Redis”, “persistent in-memory”, “Redis speed + no data loss” → MemoryDB. If the question says “cache”, choose ElastiCache. If it says “primary database” or “durability” with Redis → MemoryDB.

Managed Relational + OS Access

Amazon RDS Custom

RDS with OS-Level Access (Introductory)

Amazon RDS Custom gives you the automation of RDS (backups, monitoring, patching) while also allowing SSH / OS-level access to the underlying EC2 instance. Use it when you need to install third-party agents, apply custom patches, or satisfy compliance tools that require direct OS access.

🔧

What is RDS Custom

RDS automation + SSH access to EC2
Supports Oracle and SQL Server only
Install custom patches, OS-level agents
AWS still manages backups & monitoring
You own OS configuration

✅

Choose RDS Custom When

Oracle or SQL Server workload
Compliance requires OS-level monitoring agent
Custom OS patches / third-party tools
Legacy enterprise apps with OS dependencies

⛔

Don't Use When

Standard RDS meets your needs (use RDS — cheaper)
MySQL / PostgreSQL workload (not supported)
You don't need OS access (adds management overhead)

🧠 Key Insight

Exam: “Oracle/SQL Server + OS-level access”, “custom OS patches on RDS”, “third-party agent on RDS host” → RDS Custom. For everything else, use standard RDS.

Hybrid / On-Premises Relational

Amazon RDS on VMware

RDS in Your Own Data Centre (Introductory)

Amazon RDS on VMware lets you deploy RDS-managed databases in your on-premises VMware environment using the same RDS APIs and console you use in the cloud. It bridges hybrid architectures — manage on-prem databases the same way you manage cloud RDS.

🏢

What & Why

RDS deployed inside your VMware infrastructure on-prem
Same RDS console, APIs, and automation
Automated backups, patching, monitoring on-prem
Supports MySQL, PostgreSQL, SQL Server, Oracle
Good for data residency requirements

✅

Use Cases

Hybrid cloud — on-prem + AWS
Regulatory / data residency (data must stay on-prem)
Gradual migration to AWS RDS
Consistent tooling across cloud and on-prem

🧠 Key Insight

RDS on VMware is rarely a primary exam topic but appears in hybrid architecture questions. Exam: “RDS in on-premises VMware”, “data residency, manage on-prem databases with RDS APIs” → RDS on VMware.

Decision Guide

Decision Guide — Choosing the Right AWS Database

The Full AWS Database Decision Table (Core)

Requirement / Keyword	Choose	Reason
SQL, ACID, relational	RDS / Aurora	Traditional relational workloads
Serverless NoSQL, ms latency, any scale	DynamoDB	Unlimited scale, access-pattern design
Sub-ms caching, session storage	ElastiCache	Redis or Memcached, in-memory
Social network, fraud detection, relationships	Neptune	Graph traversal, nodes + edges
Analytics, OLAP, data warehouse, BI	Redshift	Columnar, petabyte-scale analytics
IoT, metrics, time-series data	Timestream	Purpose-built for time-stamped data
Immutable audit trail, financial ledger	QLDB	Cryptographically verifiable history
MongoDB-compatible, JSON documents	DocumentDB	Flexible JSON schema, rich queries
Cassandra-compatible, wide-column	Keyspaces	Managed Cassandra, high-write throughput
Durable Redis, persistent in-memory, Redis + no data loss	MemoryDB for Redis	Redis speed + Multi-AZ transaction log persistence
Oracle/SQL Server + OS-level access, custom patches	RDS Custom	RDS automation + SSH to EC2 host
RDS on-premises, VMware, data residency	RDS on VMware	Hybrid cloud, on-prem RDS management

Exam Cheatsheet: Keywords to Service (Core)

🎯 Exam Keywords → Service Answer

“social network, relationships, graph traversal” → Neptune
“fraud detection, recommendation engine” → Neptune
“deep multi-hop traversals, friend-of-friend” → Neptune (SQL JOINs don't scale)
“global graph, multi-region graph replication” → Neptune Global Database
“data warehouse, OLAP, petabyte analytics, BI” → Redshift
“RDS to analytics without ETL pipeline” → Redshift Zero-ETL
“query S3 data with SQL” → Redshift Spectrum
“optimize Redshift join performance” → KEY distribution style on join column
“Redshift handle query spike, many concurrent BI users” → Concurrency Scaling
“IoT sensor, time-series, metrics ingestion” → Timestream
“downsample IoT data, hourly aggregates” → Timestream Scheduled Queries
“immutable audit trail, ledger, history verification” → QLDB
“decentralized blockchain” → Amazon Managed Blockchain (NOT QLDB)
“MongoDB-compatible, JSON documents” → DocumentDB
“migrate MongoDB to AWS” → DocumentDB (wire-protocol compatible)
“Cassandra-compatible, wide-column” → Keyspaces
“migrate Cassandra to AWS” → Keyspaces
“durable Redis, persistent in-memory, Redis + durability” → MemoryDB for Redis
“Redis as primary database, no data loss on failover” → MemoryDB (not ElastiCache)
“Oracle/SQL Server + OS access, custom OS patch, SSH to RDS” → RDS Custom
“RDS on-premises, VMware, manage on-prem databases like RDS” → RDS on VMware
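As a study aid, the keyword list above can be turned into a small lookup. This is a toy sketch: the `KEYWORD_TO_SERVICE` table is a hypothetical subset of the cheatsheet, the matching is naive substring search, and real architecture decisions need more than keyword spotting.

```python
# Hypothetical subset of the cheatsheet: trigger phrase -> service.
KEYWORD_TO_SERVICE = {
    "graph": "Neptune",
    "social network": "Neptune",
    "fraud detection": "Neptune",
    "data warehouse": "Redshift",
    "olap": "Redshift",
    "time-series": "Timestream",
    "iot": "Timestream",
    "audit trail": "QLDB",
    "ledger": "QLDB",
    "blockchain": "Amazon Managed Blockchain",
    "mongodb": "DocumentDB",
    "cassandra": "Keyspaces",
    "wide-column": "Keyspaces",
    "durable redis": "MemoryDB for Redis",
    "os-level access": "RDS Custom",
    "vmware": "RDS on VMware",
}


def suggest(requirement):
    """Return the services whose trigger phrases appear in the requirement."""
    text = requirement.lower()
    return sorted({svc for phrase, svc in KEYWORD_TO_SERVICE.items() if phrase in text})


print(suggest("Migrate a Cassandra cluster to AWS"))
print(suggest("Immutable audit trail for a financial ledger"))
print(suggest("Durable Redis as the primary database"))
```

If a requirement matches several services, treat that as a cue to reread the question for the distinguishing detail, such as "cache" versus "primary database" or "single owner" versus "multiple parties".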

🧠 Final Insight

The real AWS database skill is not learning every feature — it is choosing the right tool for the problem. Know the one-line definition and exam keyword for each service. Most common traps: OLAP ≠ OLTP (Redshift vs RDS); QLDB ≠ Managed Blockchain; Neptune for relationships, not just “big data”.