Other AWS Database Services —
The Right Database for Every Problem
AWS has a database for every data model. Beyond RDS, Aurora, DynamoDB, and ElastiCache, there is a purpose-built service for graphs, analytics, time-series data, immutable ledgers, JSON documents, and wide-column stores. Learning when to choose each is the real skill.
🗺️ The AWS Database Universe
| Type | Service | Best For |
|---|---|---|
| Relational | RDS / Aurora | SQL, ACID, OLTP |
| NoSQL / Key-Value | DynamoDB | Serverless, ms latency, any scale |
| In-Memory Cache | ElastiCache | Sub-ms reads, reduce DB load |
| Graph | Neptune | Relationships, social, fraud |
| Analytics / OLAP | Redshift | Data warehouse, petabyte analytics |
| Time-Series | Timestream | IoT, metrics, time-based data |
| Ledger | QLDB | Immutable audit trail, financial records |
| Document | DocumentDB | JSON, MongoDB-compatible |
| Wide-Column | Keyspaces | Cassandra-compatible, wide-column |
| In-Memory (Durable) | MemoryDB for Redis | Redis speed + full durability |
| Relational + OS Access | RDS Custom | Oracle/SQL Server with SSH access |
| Relational (On-Prem) | RDS on VMware | Hybrid cloud, data residency |
Amazon Neptune — Graph Database
Traditional databases store data in tables or documents. But some problems are fundamentally about relationships: who is connected to whom, how are entities related, what path connects A to B? Amazon Neptune is a fully managed graph database that makes relationship queries fast and natural.
👉 Mental model: Think of Neptune as a database made of nodes (things) and edges (relationships). Instead of a table of users and a join to friends, you simply traverse connections. “Find all friends of friends who live in NYC” is a single graph traversal — not a multi-join SQL query.
Core Concept
- Nodes: Alice, Bob, Product, Post
- Edges: FRIEND_OF, BOUGHT, LIKES
- Properties: attributes on nodes/edges
- Queries traverse relationships, not rows
- Optimized for highly connected data
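The node/edge/property model above can be sketched in plain Python, without Neptune itself. This is only an illustration: the sample people and the helper names `neighbors` and `friends_of_friends` are hypothetical, and a real Neptune query would be written in Gremlin, openCypher, or SPARQL instead.

```python
# Illustrative in-memory property graph (not Neptune's API).
# Nodes carry properties; edges are (source, label, target) triples.
nodes = {
    "alice": {"city": "NYC"},
    "bob": {"city": "Boston"},
    "carol": {"city": "NYC"},
    "dave": {"city": "NYC"},
}
edges = [
    ("alice", "FRIEND_OF", "bob"),
    ("bob", "FRIEND_OF", "carol"),
    ("bob", "FRIEND_OF", "dave"),
    ("alice", "FRIEND_OF", "dave"),
]

def neighbors(node, label):
    """Follow outgoing edges with the given label from one node.
    A linear scan for brevity; a graph engine would index adjacency."""
    return {dst for src, lbl, dst in edges if src == node and lbl == label}

def friends_of_friends(person, city):
    """Two-hop traversal: friends of friends living in `city`,
    excluding the person and their direct friends."""
    direct = neighbors(person, "FRIEND_OF")
    second_hop = set()
    for friend in direct:
        second_hop |= neighbors(friend, "FRIEND_OF")
    candidates = second_hop - direct - {person}
    return sorted(n for n in candidates if nodes[n]["city"] == city)

print(friends_of_friends("alice", "NYC"))
```

The query is expressed as "start at a node, follow edges, filter on properties", which is the shape graph query languages are built around.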
Use Cases
- Social networks (friends, followers)
- Fraud detection (unusual connections)
- Recommendation engines
- Knowledge graphs
- Network topology, IT infrastructure
Query Languages
- Gremlin: property graph traversal
- SPARQL: RDF / semantic web
- openCypher: graph pattern matching
- Neptune supports all three
- Pick based on data model
Choose Neptune When
- Data is fundamentally about relationships
- Multi-hop traversals (friend of friend)
- Fraud detection (unusual connection patterns)
- Recommendation: “users who bought X also bought Y”
- Knowledge graphs, ontologies
Don't Use Neptune When
- Data is tabular — use RDS/Aurora
- Simple key-value lookups — DynamoDB
- Relationships are shallow (SQL JOIN is fine)
- Analytics/reporting — Redshift
Neptune vs RDS for Relationships
- SQL can do friend-of-friend with multiple JOINs
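- SQL cost grows with every hop: each extra hop adds another self-JOIN over the relationship table
- Neptune traversal cost depends on the edges actually visited, not on the total size of the dataset
- For deep, multi-hop relationship queries, a graph database is usually much faster and simpler to write
- Use RDS when relationships are shallow (1–2 JOINs)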
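To make the SQL side of this comparison concrete, here is a rough sketch using Python's built-in `sqlite3` module. The `friend` table, its columns, and the helper `people_at_hops` are hypothetical; the point is only that every additional hop becomes one more self-join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE friend (src TEXT, dst TEXT)")
conn.executemany(
    "INSERT INTO friend VALUES (?, ?)",
    [("alice", "bob"), ("bob", "carol"), ("carol", "dave"), ("dave", "erin")],
)

def people_at_hops(start, hops):
    """Build one self-join per extra hop and return the people reached."""
    joins = "".join(
        f" JOIN friend f{i} ON f{i}.src = f{i - 1}.dst" for i in range(2, hops + 1)
    )
    sql = (
        f"SELECT DISTINCT f{hops}.dst FROM friend f1{joins} "
        "WHERE f1.src = ? ORDER BY 1"
    )
    return [row[0] for row in conn.execute(sql, (start,))]

two_hops = people_at_hops("alice", 2)    # friend of friend: one JOIN
three_hops = people_at_hops("alice", 3)  # one more hop, one more JOIN
print(two_hops, three_hops)
```

For one or two hops this is perfectly reasonable SQL. The query text and the join work both grow as the hop count increases, which is where a graph traversal tends to become the better fit.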
Neptune Global Database
- Replicate graph data across multiple regions
- Replication lag <1 second
- Active-active reads across regions
- Single writer region (primary)
- Use for global social / recommendation apps
Neptune is for relationship-first data. When your most important queries are about how things are connected, Neptune is the right choice. Exam: “social network”, “fraud detection”, “recommendation engine” → Neptune.
Amazon Redshift — Data Warehouse
OLTP (Online Transaction Processing) = what RDS does: fast, small, frequent transactions. OLAP (Online Analytical Processing) = what Redshift does: complex analytical queries across massive datasets. Avoid running OLTP workloads on Redshift or heavy analytics on your transactional RDS database.
| Aspect | OLTP (RDS/Aurora) | OLAP (Redshift) |
|---|---|---|
| Queries | Simple, fast (ms) | Complex, slow (seconds–min) |
| Data volume | GB–TB | TB–PB |
| Operations | INSERT/UPDATE/DELETE | SELECT, GROUP BY, SUM, AVG |
| Storage | Row-based | Columnar (faster aggregations) |
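The row-based versus columnar distinction in the last table row can be illustrated with a small Python sketch. The data and function names are made up for illustration, and real engines add compression and on-disk layout that this ignores.

```python
# The same three orders stored two ways (illustrative data).
row_store = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 50.0},
]
column_store = {
    "order_id": [1, 2, 3],
    "region": ["EU", "US", "EU"],
    "amount": [120.0, 80.0, 50.0],
}

def total_row_store(rows):
    """Aggregation must walk every row record to pull out one field."""
    return sum(row["amount"] for row in rows)

def total_column_store(columns):
    """Aggregation reads only the single column it needs."""
    return sum(columns["amount"])

def fetch_order_row_store(rows, order_id):
    """OLTP-style lookup: the whole record is already stored together."""
    return next(row for row in rows if row["order_id"] == order_id)

print(total_row_store(row_store), total_column_store(column_store))
```

Both layouts give the same answer. The columnar layout simply reads less data for aggregations, while the row layout keeps a single record together for transactional reads and writes.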
What is Redshift
- Fully managed data warehouse
- Columnar storage for fast aggregations
- Massively parallel processing (MPP)
- Petabyte-scale analytics
- SQL interface (PostgreSQL-based dialect, not fully PostgreSQL-compatible)
Use Cases
- Business intelligence dashboards
- Sales / revenue analytics
- Log analysis at scale
- ETL pipeline destination
- RDS → Redshift (Zero-ETL)
Key Features
- Redshift Serverless: no cluster to manage
- Spectrum: query S3 data directly
- Zero-ETL: near real-time from RDS/Aurora
- ML: train models with SQL
- Up to 16 PB storage
Distribution Styles
- AUTO: Redshift chooses automatically (default)
- EVEN: Round-robin across nodes — large tables without clear join key
- KEY: Same key value → same node — optimizes joins on that column
- ALL: Full copy on every node — small dimension tables
- 🎯 Exam: “optimize join performance” → KEY distribution on join column
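A minimal sketch of how the EVEN, KEY, and ALL styles place rows on nodes. The node count, the sample `orders` rows, and the use of `zlib.crc32` as the hash are all assumptions for illustration; Redshift's actual hashing and slice layout are internal to the service.

```python
import zlib

NODES = 3  # hypothetical cluster size

def distribute_even(rows):
    """EVEN: round-robin, ignoring row contents."""
    placement = {n: [] for n in range(NODES)}
    for i, row in enumerate(rows):
        placement[i % NODES].append(row)
    return placement

def distribute_key(rows, key):
    """KEY: hash the distribution key so equal values share a node."""
    placement = {n: [] for n in range(NODES)}
    for row in rows:
        node = zlib.crc32(str(row[key]).encode()) % NODES
        placement[node].append(row)
    return placement

def distribute_all(rows):
    """ALL: every node holds a full copy of the table."""
    return {n: list(rows) for n in range(NODES)}

orders = [
    {"customer_id": c, "amount": a}
    for c, a in [(1, 10), (2, 20), (1, 30), (3, 40)]
]
by_key = distribute_key(orders, "customer_id")
```

With KEY distribution, all rows for one `customer_id` sit on the same node, so a join on that column can be done locally instead of shuffling rows between nodes.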
Sort Keys & Concurrency Scaling
- Sort Key: data physically sorted on disk
- Put frequently-filtered columns first (e.g., date, region)
- Compound: columns in defined order — most common
- Interleaved: equal weight per column — rare, high-maintenance
- Concurrency Scaling: auto-adds transient clusters during query spikes; pay per second used
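Why sorting on a frequently filtered column helps can be sketched as follows. The block size, the day numbers, and the name `blocks_to_scan` are hypothetical; the per-block min/max metadata is similar in spirit to Redshift's zone maps, not a reproduction of them.

```python
# Rows sorted by the sort key (a day number), then split into fixed-size blocks.
BLOCK_SIZE = 4
days = sorted([3, 1, 7, 5, 12, 9, 10, 2, 15, 14, 8, 6])
blocks = [days[i:i + BLOCK_SIZE] for i in range(0, len(days), BLOCK_SIZE)]

# Per-block (min, max) metadata; valid because each block is sorted.
zone_map = [(block[0], block[-1]) for block in blocks]

def blocks_to_scan(lo, hi):
    """Return indexes of blocks whose [min, max] range overlaps [lo, hi]."""
    return [
        i for i, (bmin, bmax) in enumerate(zone_map)
        if bmax >= lo and bmin <= hi
    ]

print(blocks_to_scan(9, 12))
```

Because the data is sorted, a range filter on the sort key only overlaps a few blocks, and the rest can be skipped without being read.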
Redshift is for analytics, not transactions. Exam: “data warehouse”, “petabyte analytics”, “business intelligence”, “OLAP” → Redshift. RDS → Zero-ETL → Redshift for near-real-time analytics without pipelines.
Amazon Timestream — Time-Series Database
Time-series data is a sequence of timestamped measurements, such as IoT sensor readings, CPU metrics, or stock prices. Key pattern: data is almost always appended rather than updated, queries are usually time-ranged, and recent data is accessed far more often than old data.
What is Timestream
- Fully managed time-series database
- Serverless — scales automatically
- Purpose-built for timestamped data
- AWS claims up to 1,000× faster queries at as little as 1/10 the cost of relational databases
- Built-in time-series SQL functions
Use Cases
- IoT sensor data (temp, pressure)
- DevOps metrics (CPU, memory, latency)
- Application performance monitoring
- Financial data (tick data, prices)
- Industrial equipment monitoring
Key Features
- Tiered storage: memory store (recent data) → magnetic store (historical data), moved automatically
- SQL-like queries with time functions
- Retention policy: auto-expire old data
- Integrates with Grafana, QuickSight
- IoT Core / Kinesis integration
Memory vs Magnetic Store
- Memory store: recent data, high-throughput writes, low-latency reads (~1ms)
- Magnetic store: older data, lower cost, slower queries
- Configure retention policy to move data automatically
- Memory retention: hours to days (configurable)
- Magnetic retention: days to years (configurable)
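The tiering rules above can be sketched as a simple age check. The retention values and the function `storage_tier` are hypothetical, and this assumes both retention periods are measured from the record's timestamp; Timestream applies these rules internally, so this is not an API you call.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention settings for one table.
MEMORY_RETENTION = timedelta(hours=24)
MAGNETIC_RETENTION = timedelta(days=365)

def storage_tier(record_time, now):
    """Classify a record by age: memory, magnetic, or expired."""
    age = now - record_time
    if age <= MEMORY_RETENTION:
        return "memory"
    if age <= MAGNETIC_RETENTION:
        return "magnetic"
    return "expired"

now = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(storage_tier(now - timedelta(hours=1), now))
```

A short memory retention keeps cost down, while a long magnetic retention keeps history queryable at lower cost.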
Scheduled Queries
- Run aggregations (hourly, daily, weekly) automatically
- Write results to a new derived table
- Use for downsampling high-frequency IoT data
- Example: 1-sec sensor reads → hourly averages
- Reduces query cost and improves dashboard speed
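The downsampling idea (1-second reads turned into hourly averages) looks roughly like this in plain Python. In Timestream the aggregation would be a SQL scheduled query writing to a derived table; the function `hourly_averages` and the synthetic readings here are assumptions for illustration only.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone
from statistics import mean

def hourly_averages(readings):
    """Group (timestamp, value) readings by hour and average each bucket."""
    buckets = defaultdict(list)
    for ts, value in readings:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    return {hour: mean(values) for hour, values in sorted(buckets.items())}

start = datetime(2024, 1, 1, 0, 0, 0, tzinfo=timezone.utc)
# Two hours of one-per-second readings: 20.0 in the first hour, 22.0 in the second.
readings = [
    (start + timedelta(seconds=s), 20.0 if s < 3600 else 22.0)
    for s in range(7200)
]
derived = hourly_averages(readings)
print(len(readings), "raw points ->", len(derived), "hourly points")
```

Dashboards that only need hourly trends can then read the small derived dataset instead of scanning every raw point.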
Timestream is for append-heavy, time-ranged data. Recent = hot, old = cold. The auto-tiering matches the natural access pattern perfectly. Exam: “IoT sensor data”, “metrics”, “time-series” → Timestream.
Amazon QLDB — Ledger Database
Amazon QLDB (Quantum Ledger Database) is a fully managed ledger database that provides a transparent, immutable, cryptographically verifiable transaction log. Every change is permanently recorded — nothing can be deleted or altered, and you can prove it mathematically.
👉 Mental model: QLDB is a traditional database where every row has a full history and that history is cryptographically proven. Show a regulator not just what the data is today, but every state it's ever been in — and prove it hasn't been tampered with.
Core Features
- Immutable journal: nothing can be deleted
- Cryptographic hashes: SHA-256 chain
- Full history: every version of every record
- PartiQL (SQL-like) query language
- Serverless, fully managed
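The idea behind an immutable, hash-chained journal can be sketched with Python's `hashlib`. This is a simplified linear chain under assumed names (`Journal`, `entry_hash`); it is not QLDB's actual journal format or verification API, which uses digests and proofs provided by the service.

```python
import hashlib
import json

def entry_hash(prev_hash, document):
    """SHA-256 over the previous hash plus a canonical JSON encoding."""
    payload = prev_hash + json.dumps(document, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

class Journal:
    """Append-only list of (document, hash) entries."""
    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []

    def append(self, document):
        prev = self.entries[-1][1] if self.entries else self.GENESIS
        self.entries.append((document, entry_hash(prev, document)))

    def verify(self):
        """Recompute the chain; an edited entry no longer matches its hash."""
        prev = self.GENESIS
        for document, stored in self.entries:
            if entry_hash(prev, document) != stored:
                return False
            prev = stored
        return True

journal = Journal()
journal.append({"account": "A", "balance": 100})
journal.append({"account": "A", "balance": 70})
ok_before = journal.verify()

# Simulate tampering with history while keeping the old stored hash.
journal.entries[0] = ({"account": "A", "balance": 999}, journal.entries[0][1])
ok_after = journal.verify()
print(ok_before, ok_after)
```

Each hash depends on the one before it, so changing any historical record is detectable by anyone who recomputes the chain.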
Use Cases
- Financial ledgers (debit/credit history)
- Supply chain tracking
- Regulatory audit trails
- Medical records history
- Insurance claims processing
QLDB vs Blockchain
- QLDB: centralized (AWS-managed)
- Blockchain: decentralized (no single owner)
- QLDB: faster, simpler, single-owner trust
- Use QLDB when: you own and trust the data
- Exam: “immutable, audit trail” → QLDB
| Feature | QLDB | Managed Blockchain |
|---|---|---|
| Trust model | Centralized (AWS-managed) | Decentralized (multiple parties) |
| Immutability | Cryptographic hash verification | Consensus across nodes |
| Ownership | Single owner controls the ledger | No single owner (trustless) |
| Performance | Faster, simpler | Slower (consensus overhead) |
| 🎯 Exam trigger | “audit trail, ledger, immutable history” | “blockchain, multiple parties, trustless” |
QLDB is for when you need proof that data hasn't been tampered with. Exam: “immutable audit trail”, “financial ledger”, “verify data history” → QLDB. “Decentralized blockchain” → Amazon Managed Blockchain (not QLDB).
Amazon DocumentDB — Document Database
Amazon DocumentDB is a fully managed document database that is MongoDB-compatible. It stores data as JSON-like documents with flexible schemas where each document can have different fields. It uses Aurora's shared distributed storage model under the hood.
Document Model
- Data stored as JSON documents
- Each document can have different schema
- Rich query language (filter, project, aggregate)
- Collections = tables; Documents = rows
- Documents nested up to 100 levels
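A small sketch of the document model in plain Python: documents in one collection carry different fields, and a query filters on a nested field and projects a subset. The catalog data and the helpers `get_path` and `find` are hypothetical; against DocumentDB you would use a MongoDB driver and its query language instead.

```python
# A "collection" of documents with different fields (hypothetical catalog).
products = [
    {"_id": 1, "name": "Laptop", "specs": {"ram_gb": 16, "cpu": "8-core"}},
    {"_id": 2, "name": "T-shirt", "size": "M", "color": "blue"},
    {"_id": 3, "name": "Tablet", "specs": {"ram_gb": 8}},
]

def get_path(document, dotted_path):
    """Resolve a dotted path such as 'specs.ram_gb'; None if absent."""
    value = document
    for part in dotted_path.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value

def find(collection, path, minimum, fields):
    """Filter on a nested numeric field, then project selected fields."""
    matches = []
    for doc in collection:
        value = get_path(doc, path)
        if value is not None and value >= minimum:
            matches.append({f: doc[f] for f in fields if f in doc})
    return matches

print(find(products, "specs.ram_gb", 8, ["name"]))
```

The T-shirt document has no `specs` field at all and is simply skipped, which is the flexible-schema behavior the list above describes.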
Use Cases
- Content management systems
- User profiles (varied attributes)
- Product catalogs (different specs)
- Mobile app backends
- MongoDB → AWS migration
Why DocumentDB
- MongoDB-compatible: minimal code changes
- Storage auto-grows to 64 TiB
- 6 copies across 3 AZs (Aurora-style)
- Fully managed, no Mongo ops
- Up to 15 read replicas
| Aspect | DocumentDB | DynamoDB |
|---|---|---|
| Model | JSON documents, rich queries | Key-value, access-pattern queries |
| Query flexibility | Higher (MongoDB query language) | Key-based only |
| Scale | Very large | Unlimited (serverless) |
| Compatible | MongoDB drivers/apps | DynamoDB SDK only |
👉 Important: DocumentDB is wire-protocol compatible with MongoDB, so most existing MongoDB drivers and applications work with little or no code change. However, it is not a fork of MongoDB open-source code, and not every MongoDB feature or operator is supported, so check compatibility before migrating. Under the hood it uses Aurora's distributed storage engine, meaning you get Aurora-class durability (6 copies across 3 AZs) with MongoDB API compatibility.
DocumentDB is for JSON-document workloads needing MongoDB compatibility or richer queries than DynamoDB. Exam: “MongoDB-compatible”, “JSON documents”, “flexible schema” → DocumentDB. For unlimited serverless scale, DynamoDB is better.
Amazon Keyspaces — Wide-Column Database
Amazon Keyspaces is a fully managed, serverless wide-column database compatible with Apache Cassandra. Run Cassandra workloads on AWS without managing Cassandra clusters — automatic scaling, no capacity planning, pay per use.
Wide-Column Model
- Rows identified by partition key
- Each row can have different columns
- Optimized for write-heavy workloads
- CQL (Cassandra Query Language)
- Designed for high throughput at scale
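The wide-column layout can be pictured as a map from partition key to rows, where rows may carry different columns. This is a toy in-memory sketch with hypothetical names (`table`, `insert`, `select`); real access to Keyspaces goes through CQL and a Cassandra driver.

```python
from collections import defaultdict

# partition key -> list of rows; rows in one table may carry different columns.
table = defaultdict(list)

def insert(partition_key, **columns):
    """Append a row to its partition (writes are simple appends here)."""
    table[partition_key].append(columns)

def select(partition_key):
    """Read one partition; other partitions are never touched."""
    return table.get(partition_key, [])

insert("sensor-1", ts=1, temperature=21.5)
insert("sensor-1", ts=2, temperature=21.7, humidity=40)
insert("sensor-2", ts=1, vibration=0.02)

print(select("sensor-1"))
```

Because every read and write names its partition key, work spreads across partitions, which is what makes the model suit high-volume writes.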
Use Cases
- Industrial equipment data
- High-velocity write workloads
- Cassandra → AWS migration
- Time-series at massive scale
- Event logging, clickstreams
Why Keyspaces
- Cassandra-compatible: works with existing CQL drivers and tools (not every Cassandra feature is supported)
- Serverless, no Cassandra ops
- Auto-scales read/write capacity
- Single-digit ms latency
- Multi-Region replication
On-Demand (Pay per Request)
- Pay only for reads/writes you use
- No capacity planning required
- Best for unpredictable or spiky workloads
- Scales instantly to any traffic level
Provisioned (RCU / WCU)
- Set read/write capacity units upfront
- Lower cost for predictable, steady workloads
- Auto-scaling adjusts capacity automatically
- Similar model to DynamoDB provisioned mode
Keyspaces is for Cassandra workloads on AWS without managing Cassandra clusters. Exam: “Cassandra-compatible”, “wide-column”, “migrate Cassandra” → Keyspaces. For general-purpose NoSQL at massive scale, DynamoDB is usually the better choice.
Amazon MemoryDB for Redis
Amazon MemoryDB for Redis is a fully managed, Redis-compatible, durable in-memory database. Unlike ElastiCache (a cache where data loss is acceptable on failure), MemoryDB persists every write to a Multi-AZ transaction log before acknowledging it, giving you Redis speed with true database durability.
👉 Mental model: MemoryDB = “Redis with durability”. ElastiCache is a cache in front of your database. MemoryDB is the database itself, built for real-time applications that need microsecond reads and single-digit-millisecond writes and cannot afford data loss.
| Feature | ElastiCache for Redis | MemoryDB for Redis |
|---|---|---|
| Purpose | Cache (sits in front of DB) | Primary database |
| Durability | Optional snapshots only | Always (Multi-AZ transaction log) |
| Data loss on failover | Possible | None |
| Use case | Session cache, leaderboards | Real-time apps, persistent microservice state |
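The durability difference in the table can be sketched as "log first, then acknowledge". The classes `CacheNode` and `DurableNode` and the in-memory list standing in for the Multi-AZ transaction log are assumptions for illustration. The cache side is deliberately simplified to a single node with no replica; neither class reflects the real internals of ElastiCache or MemoryDB.

```python
class CacheNode:
    """Cache-style node: data lives only in this node's memory."""

    def __init__(self):
        self.memory = {}

    def set(self, key, value):
        self.memory[key] = value
        return "OK"


class DurableNode:
    """Durable-style node: append to a shared log, then apply, then ack."""

    def __init__(self, log):
        self.log = log          # stands in for a Multi-AZ transaction log
        self.memory = {}
        for key, value in log:  # a new node rebuilds state by replaying the log
            self.memory[key] = value

    def set(self, key, value):
        self.log.append((key, value))  # persist first
        self.memory[key] = value       # then apply in memory
        return "OK"                    # only now acknowledge


shared_log = []
primary = DurableNode(shared_log)
primary.set("cart:42", "3 items")
replacement = DurableNode(shared_log)  # simulated failover to a new node

cache = CacheNode()
cache.set("cart:42", "3 items")
cache_after_failover = CacheNode()     # a fresh cache node starts empty

print(replacement.memory, cache_after_failover.memory)
```

The acknowledged write survives on the durable side because it was in the log before the client saw "OK". On the cache side the application must be able to rebuild the value from its primary database.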
Exam: “durable Redis”, “persistent in-memory”, “Redis speed + no data loss” → MemoryDB. If the question says “cache”, choose ElastiCache. If it says “primary database” or “durability” with Redis → MemoryDB.
Amazon RDS Custom
Amazon RDS Custom gives you the automation of RDS (backups, monitoring, patching) while also allowing OS-level access to the underlying EC2 instance (SSH for Oracle on Linux, RDP for SQL Server on Windows). Use it when you need to install third-party agents, apply custom patches, or satisfy compliance tools that require direct OS access.
What is RDS Custom
- RDS automation + SSH access to EC2
- Supports Oracle and SQL Server only
- Install custom patches, OS-level agents
- AWS still manages backups & monitoring
- You own OS configuration
Choose RDS Custom When
- Oracle or SQL Server workload
- Compliance requires OS-level monitoring agent
- Custom OS patches / third-party tools
- Legacy enterprise apps with OS dependencies
Don't Use When
- Standard RDS meets your needs (use standard RDS, it is cheaper)
- MySQL / PostgreSQL workload (not supported)
- You don't need OS access (adds management overhead)
Exam: “Oracle/SQL Server + OS-level access”, “custom OS patches on RDS”, “third-party agent on RDS host” → RDS Custom. For everything else, use standard RDS.
Amazon RDS on VMware
Amazon RDS on VMware lets you deploy RDS-managed databases in your on-premises VMware environment using the same RDS APIs and console you use in the cloud. It bridges hybrid architectures: you manage on-prem databases the same way you manage cloud RDS.
What & Why
- RDS deployed inside your VMware infrastructure on-prem
- Same RDS console, APIs, and automation
- Automated backups, patching, monitoring on-prem
- Supports MySQL, PostgreSQL, SQL Server, Oracle
- Good for data residency requirements
Use Cases
- Hybrid cloud: on-prem + AWS
- Regulatory / data residency (data must stay on-prem)
- Gradual migration to AWS RDS
- Consistent tooling across cloud and on-prem
RDS on VMware is rarely a primary exam topic but appears in hybrid architecture questions. Exam: “RDS in on-premises VMware”, “data residency, manage on-prem databases with RDS APIs” → RDS on VMware.
Decision Guide — Choosing the Right AWS Database
| Requirement / Keyword | Choose | Reason |
|---|---|---|
| SQL, ACID, relational | RDS / Aurora | Traditional relational workloads |
| Serverless NoSQL, ms latency, any scale | DynamoDB | Unlimited scale, access-pattern design |
| Sub-ms caching, session storage | ElastiCache | Redis or Memcached, in-memory |
| Social network, fraud detection, relationships | Neptune | Graph traversal, nodes + edges |
| Analytics, OLAP, data warehouse, BI | Redshift | Columnar, petabyte-scale analytics |
| IoT, metrics, time-series data | Timestream | Purpose-built for time-stamped data |
| Immutable audit trail, financial ledger | QLDB | Cryptographically verifiable history |
| MongoDB-compatible, JSON documents | DocumentDB | Flexible JSON schema, rich queries |
| Cassandra-compatible, wide-column | Keyspaces | Managed Cassandra, high-write throughput |
| Durable Redis, persistent in-memory, Redis + no data loss | MemoryDB for Redis | Redis speed + Multi-AZ transaction log persistence |
| Oracle/SQL Server + OS-level access, custom patches | RDS Custom | RDS automation + SSH to EC2 host |
| RDS on-premises, VMware, data residency | RDS on VMware | Hybrid cloud, on-prem RDS management |
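As a study aid, the decision table above can be turned into a tiny keyword lookup. The `KEYWORDS` mapping is a hand-picked subset of the table, and `suggest_services` is a hypothetical helper using naive substring matching, so treat it as a flash-card sketch rather than a selection tool.

```python
# Keyword -> service pairs taken from the decision table above.
KEYWORDS = {
    "graph": "Neptune",
    "fraud detection": "Neptune",
    "data warehouse": "Redshift",
    "olap": "Redshift",
    "time-series": "Timestream",
    "iot": "Timestream",
    "audit trail": "QLDB",
    "ledger": "QLDB",
    "mongodb": "DocumentDB",
    "cassandra": "Keyspaces",
    "durable redis": "MemoryDB for Redis",
    "os-level access": "RDS Custom",
    "vmware": "RDS on VMware",
}

def suggest_services(requirement):
    """Return services whose keywords appear in the requirement text."""
    text = requirement.lower()
    found = []
    for keyword, service in KEYWORDS.items():
        if keyword in text and service not in found:
            found.append(service)
    return found

print(suggest_services("Immutable audit trail for a financial ledger"))
```

A requirement that matches nothing returns an empty list, which in practice usually means the general-purpose options (RDS / Aurora or DynamoDB) are the place to start.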
🎯 Exam Keywords → Service Answer
- “social network, relationships, graph traversal” → Neptune
- “fraud detection, recommendation engine” → Neptune
- “deep multi-hop traversals, friend-of-friend” → Neptune (SQL JOINs don't scale)
- “global graph, multi-region graph replication” → Neptune Global Database
- “data warehouse, OLAP, petabyte analytics, BI” → Redshift
- “RDS to analytics without ETL pipeline” → Redshift Zero-ETL
- “query S3 data with SQL” → Redshift Spectrum
- “optimize Redshift join performance” → KEY distribution style on join column
- “Redshift handle query spike, many concurrent BI users” → Concurrency Scaling
- “IoT sensor, time-series, metrics ingestion” → Timestream
- “downsample IoT data, hourly aggregates” → Timestream Scheduled Queries
- “immutable audit trail, ledger, history verification” → QLDB
- “decentralized blockchain” → Amazon Managed Blockchain (NOT QLDB)
- “MongoDB-compatible, JSON documents” → DocumentDB
- “migrate MongoDB to AWS” → DocumentDB (wire-protocol compatible)
- “Cassandra-compatible, wide-column” → Keyspaces
- “migrate Cassandra to AWS” → Keyspaces
- “durable Redis, persistent in-memory, Redis + durability” → MemoryDB for Redis
- “Redis as primary database, no data loss on failover” → MemoryDB (not ElastiCache)
- “Oracle/SQL Server + OS access, custom OS patch, SSH to RDS” → RDS Custom
- “RDS on-premises, VMware, manage on-prem databases like RDS” → RDS on VMware
The real AWS database skill is not learning every feature — it is choosing the right tool for the problem. Know the one-line definition and exam keyword for each service. Most common traps: OLAP ≠ OLTP (Redshift vs RDS); QLDB ≠ Managed Blockchain; Neptune for relationships, not just “big data”.