Amazon ElastiCache —
In-Memory Caching at Scale

ElastiCache is not just a faster database — it is a layer that protects your database. By storing frequently accessed data in memory, ElastiCache reduces database load, cuts latency from milliseconds to microseconds, and lets your application scale without rewriting queries. Choose Redis for rich features or Memcached for simplicity.

⚡ ElastiCache in 30 Seconds

Fully managed in-memory caching — Redis or Memcached
Sub-millisecond latency — 100× faster than disk-based databases
Sits between your app and database — reduces DB load, speeds up reads
Cache hit = instant response from memory; Cache miss = fetch from database
Use for: session storage, leaderboards, real-time analytics, API response caching
Redis: rich data types, persistence, pub/sub; Memcached: simpler, multi-threaded

Chapter One

What is Amazon ElastiCache

The Problem — Databases Are Slow Introductory

Every time your application reads data from a database, it makes a network round-trip and waits while the database finds the data on disk and returns it. Even with fast SSDs, this takes 1–10 milliseconds per query. When thousands of users make the same queries, your database becomes a bottleneck. Scaling the database is expensive, and much of that work is redundant: the same data is fetched over and over.

👉 The core problem: Your database is doing the same work repeatedly. If 1,000 users request the same product page, the database runs the same query 1,000 times. That's wasted compute, wasted latency, and unnecessary load on a system that's hard to scale horizontally.

What is Caching Introductory

Caching is storing frequently accessed data in a fast, temporary location so you don't have to fetch it from the slower primary source every time. Think of it like keeping your most-used tools on your desk instead of walking to the garage each time you need them.

💾

Database (Slow)

Data stored on disk (SSD/HDD)
Latency: 1–10+ ms
Durable — survives restarts
Handles complex queries
Expensive to scale horizontally

⚡

Cache (Fast)

Data stored in RAM (memory)
Latency: <1 ms (microseconds)
Volatile — data can be lost
Key-value lookups only
Cheap to scale horizontally

🤝

Together

Cache handles hot data
Database handles cold data
Cache absorbs repetitive load
Database handles complex queries
Best of both worlds

Memory vs Disk — Why Caching is Fast Core

The speed difference between RAM and disk is not incremental — it is orders of magnitude. Understanding this is key to understanding why caching works.

Storage Type	Latency	Analogy
L1 CPU Cache	~1 nanosecond	Blink of an eye
RAM (Memory)	~100 nanoseconds	One heartbeat
SSD	~100 microseconds	Walking to the kitchen
HDD	~10 milliseconds	Walking to the store
Network (same region)	~1 millisecond	Driving across town

RAM is ~1,000× faster than SSD, and ~100,000× faster than HDD. This is why caching works.

What is Amazon ElastiCache Introductory

Amazon ElastiCache is a fully managed in-memory caching service. You choose an engine — Redis or Memcached — and AWS handles provisioning, patching, scaling, and failover. ElastiCache runs inside your VPC, sitting between your application and your database.

✅

What ElastiCache Provides

Managed Redis or Memcached clusters
Automatic failover (Multi-AZ for Redis)
Backup and restore (Redis)
Encryption at rest and in transit
CloudWatch metrics and alarms
VPC isolation, Security Groups

📌

What You Still Manage

Cache invalidation logic
Application-side caching strategy
Key design and TTL policy
Choosing Redis vs Memcached
Sizing nodes appropriately
Deciding what data to cache

🧠 Mental Model — Cache as a Protective Layer Introductory

Most People Think (Incomplete):

“Cache = faster database. I put data in cache, reads are faster.”

✨ Better Mental Model:

Cache = a protective layer that absorbs load and reduces latency. It sits in front of your database, answering repeated questions so the database doesn't have to. The database handles complex queries and writes; the cache handles hot reads. The cache is not a replacement — it is a shield.

🛑

Without Cache

Every read hits the database
Database CPU spikes on popular data
Response times degrade under load
Scale = bigger database (expensive)
DB becomes single point of failure

✅

With Cache

Hot reads served from memory
Database handles only cache misses
Consistent sub-ms response times
Scale = add cache nodes (cheap)
Database load reduced 80–90%

💡

Key Insight

Cache hit ratio is everything
90% hit ratio = 10× less DB load
Cache what's read often, changes rarely
Don't cache complex joins — cache results
Invalidation is the hard problem

Concept Diagram — Cache Hit vs Cache Miss Core

Cache flow — hit returns instantly from memory; miss fetches from database then caches

AWS Architecture Diagram — ElastiCache in a Typical Stack Core

ElastiCache sits between your application and database — the most common architecture

Application

EC2 / Lambda
Check cache first

→

ElastiCache

Redis / Memcached
Sub-ms latency
In-memory

→

RDS / Aurora

Source of truth
Cache miss fallback

Cache hit: <1ms • Cache miss: fetch from DB (1–10ms) • Typical hit ratio: 80–95% • DB load reduced proportionally

When to Use ElastiCache Core

✅

Use ElastiCache When

Read-heavy workload (80%+ reads)
Same data requested repeatedly (hot keys)
Need sub-millisecond latency
Database is a bottleneck or expensive to scale
Session storage for stateless app servers
Real-time leaderboards, analytics, counters
API response caching

📌

Don't Use Cache Alone When

Write-heavy workload (cache doesn't help writes)
Data changes constantly (cache always stale)
Every request is unique (no cache hits)
Strong consistency required (cache = eventual)
Complex joins/aggregations (cache key-value only)
Small dataset that fits in DB memory anyway

ElastiCache Engine Comparison — Redis vs Memcached Core

Feature	Redis	Memcached
Data types	Strings, lists, sets, hashes, sorted sets	Strings only
Persistence	Yes (snapshots + AOF)	No
Replication	Yes (primary + replicas)	No
Multi-AZ failover	Yes (automatic)	No
Pub/Sub	Yes	No
Lua scripting	Yes	No
Multi-threaded	Single-threaded (mostly)	Yes

Rule of thumb: Choose Redis unless you specifically need Memcached's multi-threading or have a legacy Memcached dependency.

🧠 Key Insight

ElastiCache is not about making your database faster — it's about reducing the work your database has to do. A 90% cache hit ratio means your database handles 10× less load. The cache absorbs repetitive reads; the database focuses on writes and complex queries. This is how you scale read-heavy applications without scaling your database.

Chapter Summary Introductory

 Caching = storing frequently accessed data in fast memory to avoid slow disk/network access
ElastiCache = fully managed Redis or Memcached clusters in AWS
Memory is ~1,000× faster than SSD — that's why caching works
Cache hit = data found in cache, return instantly (<1ms)
Cache miss = data not in cache, fetch from DB, store in cache, return
Mental model: cache = protective layer that absorbs load from your database
Redis vs Memcached: Redis has richer features; Memcached is simpler and multi-threaded
 

Chapter Two

Core Concepts — TTL, Eviction & Cache Behavior

Cache Hit vs Cache Miss — The Foundation Introductory

Every cache interaction has one of two outcomes: a hit (data found) or a miss (data not found). Understanding this is fundamental to everything else in caching.

✔️

Cache Hit

Application requests data from cache
Data exists in cache
Return immediately from memory
Latency: <1 millisecond
Database is not touched
This is what you want — maximize hit ratio

✗

Cache Miss

Application requests data from cache
Data not found in cache
Fetch from database (slow)
Store result in cache for next time
Latency: 1–10+ milliseconds
First request for any data is always a miss

📊 Cache Hit Ratio

Hit Ratio = Cache Hits / Total Requests. A 90% hit ratio means 90% of requests are served from cache (fast) and only 10% go to the database (slow). The higher your hit ratio, the less load on your database. Target: 80–95% for most workloads.
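
The formula above can be turned into a quick back-of-the-envelope calculation. This is a minimal sketch: the function names and the latency figures (0.5 ms for the cache, 5 ms for the database) are illustrative assumptions, not measurements.

```python
def hit_ratio(hits: int, total_requests: int) -> float:
    """Fraction of requests served from the cache."""
    if total_requests == 0:
        return 0.0
    return hits / total_requests


def average_read_latency_ms(ratio: float, cache_ms: float, db_ms: float) -> float:
    """Expected latency per read: a hit costs cache_ms, a miss costs cache_ms + db_ms."""
    return ratio * cache_ms + (1 - ratio) * (cache_ms + db_ms)


# Assumed workload: 10,000 reads, 9,000 of them served from the cache.
ratio = hit_ratio(hits=9_000, total_requests=10_000)
db_queries = round(10_000 * (1 - ratio))  # reads that still reach the database
print(ratio, db_queries, average_read_latency_ms(ratio, cache_ms=0.5, db_ms=5.0))
```

With these assumed numbers, only one read in ten reaches the database, which is where the "10× less DB load" rule of thumb comes from.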

TTL — Time to Live Core

TTL (Time to Live) is how long a cached item remains valid before it expires and is automatically removed. TTL is your primary tool for balancing freshness vs performance.

⏱️

Short TTL (seconds)

Data refreshes frequently
More cache misses
Higher database load
More consistent with source
Use for: rapidly changing data

⏰

Long TTL (hours/days)

Data stays cached longer
More cache hits
Lower database load
May serve stale data
Use for: static/rarely changing data

♾️

No TTL (infinite)

Data never expires automatically
Must manually invalidate
Risk of serving stale data forever
Use sparingly
Use for: truly immutable data

Data Type	Suggested TTL	Rationale
User session	15–30 minutes	Match session timeout
Product catalog	1–24 hours	Rarely changes, high read volume
User profile	5–15 minutes	Balance freshness vs load
Stock price	1–5 seconds	Changes constantly
Static config	24+ hours	Almost never changes
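
To make the expiry behaviour concrete, here is a minimal in-process sketch of a TTL cache. It is not how ElastiCache is implemented; in Redis you would normally attach a TTL with `SET ... EX` or `EXPIRE`. The `TTLCache` class, the injectable `clock`, and the `SUGGESTED_TTL` mapping are hypothetical names used only for illustration.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Toy in-process cache where every entry expires after its TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: Dict[str, Tuple[Any, float]] = {}  # key -> (value, expires_at)

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._items[key] = (value, self._clock() + ttl_seconds)

    def get(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None:
            return None  # miss: never cached
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]  # miss: expired, dropped lazily on access
            return None
        return value


# Example TTLs in seconds, picked from the ranges in the table above.
SUGGESTED_TTL = {"session": 30 * 60, "product": 24 * 3600, "stock_price": 5}
```

Passing a fake clock (for example `TTLCache(clock=lambda: now[0])`) makes it easy to test expiry without sleeping.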

Eviction Policies — What Happens When Cache is Full Core

Cache memory is finite. When it fills up, the cache must evict (remove) existing items to make room for new ones. The eviction policy determines which items get removed.

💡

LRU — Least Recently Used

Evicts items that haven't been accessed recently
Most common policy; ElastiCache for Redis defaults to volatile-lru
Keeps hot data, removes cold data
Good for: general-purpose caching
Exam: “remove least recently accessed” → LRU

🔢

LFU — Least Frequently Used

Evicts items accessed the fewest times
Keeps frequently accessed data longer
Better for skewed access patterns
Good for: data with long-term popularity
Exam: “remove least frequently accessed” → LFU

volatile-lru

LRU among keys with TTL set. Keys without TTL are never evicted.

allkeys-lru

LRU among all keys. Most common choice for pure cache use cases.

noeviction

Returns an error on writes when memory is full; reads still work. Use when silently losing data is unacceptable.
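
The LRU idea can be sketched in a few lines with an ordered dictionary. This is an exact LRU for illustration only; Redis itself uses an approximated, sampling-based LRU, and the `LRUCache` class and its `evictions` counter are hypothetical names.

```python
from collections import OrderedDict
from typing import Any, Optional


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items: "OrderedDict[str, Any]" = OrderedDict()
        self.evictions = 0

    def get(self, key: str) -> Optional[Any]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def set(self, key: str, value: Any) -> None:
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry
            self.evictions += 1


cache = LRUCache(capacity=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")       # "a" is now more recent than "b"
cache.set("c", 3)    # cache is full, so "b" is evicted
```

The `evictions` counter plays the same role as the Evictions metric: a steadily rising value suggests the cache is too small for its working set.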

Concept Diagram — TTL & Eviction Lifecycle Core

Cache item lifecycle — from creation to expiration or eviction

Cache Invalidation — The Hard Problem Advanced

“There are only two hard things in Computer Science: cache invalidation and naming things.” — Phil Karlton. When data changes in your database, the cached copy becomes stale. You have two options:

⏱️

TTL-Based Expiration

Let cache expire naturally via TTL
Simple — no extra code
Data may be stale up to TTL duration
Good for: data where staleness is acceptable
Example: product catalog (24h TTL)

🗑️

Explicit Invalidation

Delete/update cache when data changes
More complex — need to track dependencies
Data always fresh
Good for: data where consistency matters
Example: user profile (invalidate on update)

👉 Common pattern: Use TTL as a safety net, plus explicit invalidation for critical data. TTL ensures stale data eventually disappears even if you forget to invalidate. Explicit invalidation ensures important updates are immediately visible.

🧠 Key Insight

TTL is your balance between freshness and performance. Short TTL = fresher data but more database load. Long TTL = better performance but risk of stale data. Eviction policies decide what to remove when memory is full. An LRU (least recently used) policy is the most common choice and the right one for most use cases.

Chapter Summary Introductory

 Cache hit = data found, return instantly; cache miss = fetch from DB, then cache
Hit ratio = hits / total requests — target 80–95%
TTL = time-to-live; controls how long data stays in cache before auto-expiring
Short TTL = fresher data, more DB load; Long TTL = better performance, risk of staleness
Eviction = removing items when cache is full; LRU (least recently used) is most common
Invalidation = TTL-based (simple) or explicit delete (complex but fresh)
 

Chapter Three

Redis vs Memcached — Choosing the Right Engine

The Decision That Matters Introductory

ElastiCache supports two engines: Redis and Memcached. Both are in-memory key-value stores, but they have fundamentally different architectures and capabilities. The right choice depends on your use case.

🎯 Quick Decision Rule

Choose Redis unless you have a specific reason to use Memcached. Redis has more features, better availability, and covers 95% of use cases. Memcached is for legacy systems or very specific multi-threaded scaling needs.

Redis — Feature-Rich In-Memory Database Core

✅

Redis Strengths

Rich data types: strings, lists, sets, sorted sets, hashes, streams
Persistence: RDB snapshots + AOF (append-only file)
Replication: primary-replica architecture
Multi-AZ failover: automatic with replicas
Pub/Sub: real-time messaging
Lua scripting: atomic operations
Transactions: MULTI/EXEC for atomic batches
Cluster mode: horizontal sharding

💡

Redis Use Cases

Session storage (with persistence)
Leaderboards (sorted sets)
Real-time analytics (counters, HyperLogLog)
Message queues (lists, streams)
Pub/Sub notifications
Rate limiting
Geospatial queries
Caching with advanced data structures

Redis Data Types — Beyond Simple Key-Value Core

String

Simple key-value. Max 512 MB. Supports atomic increment/decrement.

SET user:123 "Alice"

List

Ordered collection. Push/pop from both ends. Good for queues.

LPUSH queue task1

Set

Unordered unique values. Set operations (union, intersect).

SADD tags "aws"

Sorted Set

Set with scores. Auto-sorted by score. Perfect for leaderboards.

ZADD leaderboard 100 "player1"

Hash

Field-value pairs under one key. Like a mini-object.

HSET user:123 name "Alice"

Stream

Append-only log. Consumer groups. Event sourcing.

XADD events * action "click"

Redis Transactions — Atomic Batches Advanced

🔄

MULTI / EXEC

MULTI — begin transaction block
Queue commands inside the block
EXEC — execute all at once (atomic)
DISCARD — cancel queued commands
Queued commands run back to back with no other client's commands in between; if EXEC is never sent (for example, the connection drops), none run
⚠️ Not a true rollback — if one command fails, others still execute

👁️

WATCH — Optimistic Locking

WATCH key — monitor key for changes
If key changes before EXEC, transaction aborts
Use for: read-modify-write patterns
Example: decrement inventory + add order
Retry on abort for optimistic concurrency
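
The read-modify-write-retry loop behind WATCH can be sketched without a Redis server. This is a pure-Python stand-in, not the Redis API: `VersionedStore`, `write_if_unchanged`, and `reserve_item` are hypothetical names, and the per-key version number plays the role that WATCH plays in Redis.

```python
from typing import Dict, Tuple


class VersionedStore:
    """Toy key-value store that tracks a version per key (stand-in for WATCH)."""

    def __init__(self) -> None:
        self._data: Dict[str, int] = {}
        self._version: Dict[str, int] = {}

    def read(self, key: str) -> Tuple[int, int]:
        """Return (value, version); missing keys read as 0 at version 0."""
        return self._data.get(key, 0), self._version.get(key, 0)

    def write(self, key: str, value: int) -> None:
        self._data[key] = value
        self._version[key] = self._version.get(key, 0) + 1

    def write_if_unchanged(self, key: str, value: int, seen_version: int) -> bool:
        """Commit only if nobody wrote the key since it was read (like EXEC after WATCH)."""
        if self._version.get(key, 0) != seen_version:
            return False  # someone else changed the key: abort
        self.write(key, value)
        return True


def reserve_item(store: VersionedStore, key: str, max_retries: int = 5) -> bool:
    """Decrement stock with a read-modify-write loop that retries on conflict."""
    for _ in range(max_retries):
        stock, version = store.read(key)
        if stock <= 0:
            return False  # sold out
        if store.write_if_unchanged(key, stock - 1, version):
            return True
    return False  # gave up after repeated conflicts
```

Capping the retries keeps a heavily contended key from spinning forever; the caller decides what to do when the reservation fails.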

Memcached — Simple, Distributed Cache Core

📦

Memcached Strengths

Multi-threaded: scales with CPU cores
Simple: just key-value strings
Distributed by design: client-side sharding
Low memory overhead: no persistence metadata
Mature: battle-tested for decades
Easy horizontal scaling: add nodes, update client

⚠️

Memcached Limitations

Strings only: no complex data types
No persistence: data lost on restart
No replication: no built-in HA
No failover: node loss = data loss
No pub/sub: just get/set
No transactions: no atomic batches

Comparison Table — Redis vs Memcached Core

Feature	Redis	Memcached
Data types	Strings, lists, sets, hashes, sorted sets, streams	Strings only
Persistence	Yes (RDB + AOF)	No
Replication	Yes (primary-replica)	No
Multi-AZ failover	Yes (automatic)	No
Pub/Sub	Yes	No
Lua scripting	Yes	No
Cluster mode (sharding)	Yes (server-side)	Client-side only
Multi-threaded	Mostly single-threaded	Yes
Max item size	512 MB	1 MB (default)
Backup/Restore	Yes	No

Concept Diagram — Architecture Differences Core

Redis (replicated, persistent) vs Memcached (distributed, ephemeral)

Decision Flowchart Core

✔️

Choose Redis When

Need complex data types (lists, sets, sorted sets)
Need persistence — data must survive restarts
Need high availability — Multi-AZ failover
Need pub/sub for real-time messaging
Building leaderboards (sorted sets)
Need atomic operations or Lua scripting
Default choice for most use cases

📦

Choose Memcached When

Simple key-value strings are enough
Multi-threaded performance is critical
Have legacy Memcached dependency
Data loss is acceptable (pure cache)
Need lowest memory overhead
Horizontal scaling via client-side sharding
Cost-sensitive, simpler architecture

🧠 Key Insight

Redis is the default choice for 95% of caching needs. It has everything Memcached has, plus persistence, replication, failover, and rich data types. Choose Memcached only if you specifically need its multi-threaded architecture or have a legacy dependency. Memcached node failure = data loss for that shard — no recovery. Exam tip: Redis for complex data structures and HA; Memcached for simple, ephemeral cache.

Chapter Summary Introductory

 Redis: rich data types, persistence, replication, Multi-AZ failover, pub/sub, Lua, transactions — choose by default
Memcached: strings only, no persistence, no replication, multi-threaded — node failure = data lost permanently
Redis data types: strings, lists, sets, sorted sets (leaderboards), hashes, streams
Redis persistence: RDB (snapshots) + AOF (append-only file) — survives restarts
Redis MULTI/EXEC: atomic batches; use WATCH for optimistic locking
Memcached nodes: independent, no failover — node failure = data loss for that shard
Exam: “leaderboards” → Redis sorted sets; “session with persistence” → Redis; “simple cache” → either
 

Chapter Four

Caching Strategies — Cache-Aside, Write-Through & More

Why Strategy Matters Introductory

A caching strategy defines when and how data moves between your application, cache, and database. The wrong strategy leads to stale data, cache misses, or wasted resources. The right strategy matches your access patterns.

👉 The core question: When data is written to the database, how does the cache stay in sync? And when data is read, who is responsible for populating the cache? Your answers determine your caching strategy.

Cache-Aside (Lazy Loading) — The Most Common Strategy Core

Cache-Aside is the most widely used caching pattern. The application is responsible for managing the cache — checking it on reads and updating it on misses. The cache and database are independent; neither knows about the other.

✅

How Cache-Aside Works

Read: App checks cache first
Hit: Return cached data
Miss: Fetch from DB, store in cache, return
Write: Write to DB, invalidate/delete from cache
Cache is populated on demand (lazy)
Only accessed data is cached

💡

Pros & Cons

✔ Only caches data that's actually used
✔ Cache failure doesn't break the app (fallback to DB)
✔ Simple to implement
✗ First request is always a miss (cold start)
✗ Stale data possible if DB updated externally
✗ App must manage cache logic

Cache-Aside pattern — application manages cache reads and invalidation

Write-Through — Always Keep Cache in Sync Core

Write-Through writes data to the cache and database at the same time. The cache is always up-to-date, eliminating stale data, but every write pays the latency of both systems.

📝

How Write-Through Works

Write: App writes to cache AND database together
Read: Always read from cache (always populated)
Cache is never stale for writes through the app
Often combined with cache-aside for reads

💡

Pros & Cons

✔ Cache is always consistent with DB
✔ No stale data from app writes
✗ Write latency = cache + DB (slower)
✗ Cache may hold data never read
✗ More complex — must handle failures
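
A minimal sketch of the write-through flow, again with dictionaries standing in for the cache and the database. `WriteThroughStore` is a hypothetical name, and writing to the database before the cache is one reasonable ordering rather than the only one.

```python
from typing import Dict, Optional


class WriteThroughStore:
    """Keeps a cache and a database (both plain dicts here) in step on every write."""

    def __init__(self) -> None:
        self.database: Dict[str, str] = {}
        self.cache: Dict[str, str] = {}

    def write(self, key: str, value: str) -> None:
        # Database first: if this step raises, the cache is left untouched,
        # so it never holds a value the database does not have.
        self.database[key] = value
        self.cache[key] = value

    def read(self, key: str) -> Optional[str]:
        # Cache-aside fallback for keys that were written outside this store.
        if key in self.cache:
            return self.cache[key]
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value
        return value
```

The fallback in `read` reflects the note above that write-through is often combined with cache-aside for reads.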

Write-Behind (Write-Back) — Async Writes for Performance Advanced

Write-Behind writes to cache immediately, then asynchronously writes to the database later. This optimizes write latency but risks data loss if the cache fails before flushing to DB.

⚡

How Write-Behind Works

Write: App writes to cache (fast)
Background: Cache flushes to DB asynchronously
Writes are batched for efficiency
App sees fast response; DB updated later

⚠️

Risks

✗ Data loss if cache fails before flush
✗ Complex failure handling
✗ Eventual consistency only
Use only when write speed matters more than durability
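
The data-loss window is easiest to see in code. In this sketch, `WriteBehindStore` is a hypothetical name, dictionaries stand in for the cache and database, and `flush()` is called by hand where a real system would run it on a background worker.

```python
from collections import deque
from typing import Deque, Dict, Tuple


class WriteBehindStore:
    """Acknowledges writes from the cache and persists them to the database later."""

    def __init__(self) -> None:
        self.cache: Dict[str, int] = {}
        self.database: Dict[str, int] = {}
        self._pending: Deque[Tuple[str, int]] = deque()

    def write(self, key: str, value: int) -> None:
        self.cache[key] = value             # fast path: memory only
        self._pending.append((key, value))  # remember what still needs persisting

    def flush(self) -> int:
        """Persist queued writes in order and return how many were written."""
        flushed = 0
        while self._pending:
            key, value = self._pending.popleft()
            self.database[key] = value
            flushed += 1
        return flushed


store = WriteBehindStore()
store.write("views:home", 1)
store.write("views:home", 2)
# Until flush() runs, the database has not seen either write.
unflushed = "views:home" not in store.database
flushed_count = store.flush()
```

If the process died before `flush()`, both writes would be gone, which is exactly the risk listed above.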

Read-Through — Cache Manages Reads Core

Read-Through is similar to cache-aside, but the cache itself fetches from the database on a miss (not the application). The app only talks to the cache.

👉 Read-Through vs Cache-Aside: In cache-aside, the app fetches from DB on miss and writes to cache. In read-through, the cache layer handles this automatically. Read-through requires cache infrastructure that supports it (like DAX for DynamoDB).

Strategy Comparison Table Core

Strategy	Read	Write	Best For
Cache-Aside	App checks cache, misses fetch from DB	Write to DB, invalidate cache	General purpose, read-heavy
Write-Through	Always from cache	Write to cache + DB together	Consistency critical
Write-Behind	Always from cache	Write to cache, async to DB	Write-heavy, latency-sensitive
Read-Through	Cache fetches from DB on miss	Varies (often write-through)	Simplified app logic

🧠 Key Insight

Cache-Aside is the default strategy for ElastiCache — simple, flexible, and works with any database. Use Write-Through when you need strong consistency. Write-Behind is risky but fast. Pre-warm your cache after deployments to avoid a cold-start burst on your database. Exam tip: “reduce DB load for reads” = cache-aside; “cache always in sync” = write-through; “DynamoDB caching” = DAX (read-through/write-through).

Cache Warming — Avoiding Cold Start Core

On first deployment (or after a cache flush), every request is a cache miss. This creates a cold-start burst of database queries. Cache warming pre-populates the cache before traffic arrives.

🔥

Warming Strategies

Pre-load script: run before deployment, load hot keys from DB
Gradual traffic shift: route small % first, cache builds up
Off-peak warming: pre-warm during low traffic windows
Background refresh: re-populate before TTL expires (refresh-ahead)

💡

Why It Matters

Cold cache = all requests hit DB (danger zone)
Especially critical for RDS — can cause DB overload
Warming reduces cold-start latency spikes
Plan warming as part of deployment runbook
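
A pre-load script can be as small as the sketch below. `warm_cache` and its parameters are hypothetical; `load_from_db` stands in for whatever query your application would normally run on a cache miss, and the list of hot keys is assumed to come from logs or analytics.

```python
from typing import Callable, Dict, Iterable


def warm_cache(
    cache: Dict[str, str],
    hot_keys: Iterable[str],
    load_from_db: Callable[[str], str],
) -> int:
    """Load each hot key from the database into the cache; return how many were loaded."""
    loaded = 0
    for key in hot_keys:
        if key not in cache:          # skip keys that are already warm
            cache[key] = load_from_db(key)
            loaded += 1
    return loaded
```

Because already-cached keys are skipped, the script is safe to re-run, for example once before a deployment and again during an off-peak window.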

Chapter Summary Introductory

 Cache-Aside: app manages cache; read from cache, miss fetches from DB, write invalidates cache — most common
Write-Through: write to cache + DB together; cache always consistent; higher write latency
Write-Behind: write to cache, async to DB; fast writes but risk of data loss
Read-Through: cache fetches from DB on miss (app only talks to cache)
Cache Warming: pre-populate cache before traffic; avoids cold-start DB burst
Default choice: cache-aside for ElastiCache + RDS/Aurora
Exam: “lazy loading” = cache-aside; “DAX” = read-through/write-through for DynamoDB
 

Chapter Five

Scaling & Performance

Scaling ElastiCache — Vertical vs Horizontal Introductory

As your cache workload grows, you have two scaling options: vertical (bigger nodes) or horizontal (more nodes). The right choice depends on your engine (Redis vs Memcached) and workload pattern.

⬆️

Vertical Scaling (Scale Up)

Use larger node types (more RAM, CPU)
Simple — no code changes
Limited by largest available instance
Resize: Redis can scale up online with brief interruptions; Memcached requires creating a new cluster
Good for: moderate growth, single-node workloads

➡️

Horizontal Scaling (Scale Out)

Add more nodes to distribute load
Data is sharded across nodes
Near-linear scalability
More complex — need cluster mode (Redis) or client sharding (Memcached)
Good for: large datasets, high throughput

Redis Cluster Mode — Sharding for Scale Core

Redis Cluster Mode enables horizontal scaling by partitioning data across multiple shards. Each shard has a primary node and optional replicas for high availability.

🗃️

Cluster Mode Enabled

Data sharded across 1–500 shards
Each shard: 1 primary + up to 5 replicas
Cluster-wide limit: 500 nodes total (for example, 83 shards with 5 replicas each, or 500 shards with no replicas)
Keys distributed by hash slots (16,384 total)
Automatic rebalancing when adding shards
Multi-AZ with automatic failover per shard

📦

Cluster Mode Disabled

Single shard (no sharding)
1 primary + up to 5 replicas
Simpler — all data on one node
Limited by single node's memory
Replicas for read scaling and HA
Good for: smaller datasets, simpler ops

Redis Cluster Mode — data sharded across multiple shards, each with primary + replicas
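
The mapping from key to hash slot can be reproduced in a few lines. Redis Cluster hashes the key with CRC16 (the XMODEM variant) and takes the result modulo 16,384; if the key contains a non-empty `{tag}`, only the tag is hashed, so related keys land on the same shard. The function name `key_slot` is an illustrative choice.

```python
import binascii

HASH_SLOTS = 16384


def key_slot(key: str) -> int:
    """Map a key to a hash slot: CRC16 of the key (or its hash tag) modulo 16384."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end > start + 1:               # non-empty {tag}: hash only the tag
            data = data[start + 1:end]
    # crc_hqx with an initial value of 0 is the CRC-16/XMODEM variant.
    return binascii.crc_hqx(data, 0) % HASH_SLOTS


# Keys sharing a hash tag map to the same slot, so multi-key commands can use them together.
same_shard = key_slot("{user:123}:profile") == key_slot("{user:123}:orders")
```

Each shard owns a range of the 16,384 slots, so adding a shard means moving some slots (and the keys in them) rather than rehashing everything.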

Memcached Scaling — Client-Side Sharding Core

Memcached doesn't have built-in clustering. Scaling is done via client-side sharding — your application (or client library) hashes keys to determine which node to use.

🖥️

How It Works

Application hashes the key
Hash determines which node stores/retrieves
Nodes are independent (no communication)
Add nodes = update client config + rehash

⚠️

Limitations

Node failure = data loss for that shard
No automatic failover
Adding nodes causes cache misses (rehashing)
No replication
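
The rehashing cost listed above is easy to measure. This sketch assumes the simplest possible scheme, hash modulo node count, with hypothetical node names; real Memcached clients often use consistent hashing precisely to reduce the number of keys that move.

```python
import hashlib
from typing import List


def node_for(key: str, nodes: List[str]) -> str:
    """Pick a node by hashing the key and taking it modulo the node count."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


keys = [f"user:{i}" for i in range(1000)]
three_nodes = ["cache-1", "cache-2", "cache-3"]     # hypothetical node names
four_nodes = three_nodes + ["cache-4"]

# Count the keys whose owning node changes when a fourth node is added.
moved = sum(node_for(k, three_nodes) != node_for(k, four_nodes) for k in keys)
print(f"{moved} of {len(keys)} keys map to a different node after adding one")
```

With modulo hashing, going from three to four nodes remaps roughly three quarters of the keys, and every remapped key is a cache miss until it is fetched again.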

Redis Replication — Read Scaling & HA Core

Read Replicas in Redis serve two purposes: offload read traffic (scale reads) and provide failover targets (high availability).

🔄

Read Replicas

Up to 5 replicas per shard
Async replication from primary
Route reads to replicas → reduce primary load
Slightly stale (milliseconds lag)
Use for: read-heavy workloads

🛡️

Multi-AZ Failover

Replicas in different AZs
Primary fails → replica promoted automatically
DNS endpoint updated → app reconnects
Failover time: ~1 minute
Use for: production workloads

Redis Global Datastore — Multi-Region Replication Advanced

Redis Global Datastore replicates your Redis cluster across multiple AWS regions. It provides a fast, globally distributed cache with <1 second replication lag and lets you promote a secondary region to primary if the primary region fails.

🌐

How Global Datastore Works

Active-passive: writes go to primary region only
Secondary regions replicate for reads only
Typical replication lag: <1 second
Up to 2 secondary regions
Failover: promote a secondary region to primary (you initiate the promotion)

🛡️

Use Cases

Disaster recovery across regions
Global read scaling (low-latency reads near users)
Cross-region session replication
Exam: “multi-region Redis” → Global Datastore
Note: writes still single-region (not active-active)

Redis Backup & Restore Core

📸

Automated Snapshots

Daily backups in configured backup window
Retention: 1–35 days
Stored in S3 (AWS-managed)
Can add load while the snapshot is taken; back up from a replica where possible
Enabled by default for Redis clusters

💾

Manual Snapshots & Restore

On-demand, retained indefinitely
Restore creates a new cluster (no in-place restore)
Supports cross-region copy for DR
Memcached: no backup/restore support
Exam: “Redis backup” = manual or automated snapshots

ElastiCache Serverless Core

ElastiCache Serverless (launched in late 2023) removes capacity planning entirely. There are no node types to choose and no cluster to size: it scales automatically with demand and bills for data stored and for the compute your requests consume (ElastiCache Processing Units, ECPUs).

⚡

Serverless vs Provisioned

Serverless: no capacity planning, auto-scales, pay for data stored and ECPUs consumed
Provisioned: choose node types, manual scaling, pay per node-hour
Serverless: good for unpredictable workloads
Provisioned: good for predictable, high-throughput workloads
Exam: “serverless Redis” → ElastiCache Serverless

💡

When to Use

New applications with unknown traffic
Spiky / unpredictable workloads
Dev/test environments
Cost optimization for low-traffic periods
Not yet available in every region

Performance Tuning Advanced

Connection Pooling

Reuse connections instead of creating new ones. Each connection uses memory. Pool size = expected concurrent requests.

Redis Pipelining

Send multiple commands without waiting for responses. Reduces N round-trips to 1. Example: 100 commands = 1 RTT instead of 100.

Key Design

Prefix keys by entity type. Keep keys short. Avoid large values (>1MB). Use hashes for related fields.

📊 CloudWatch Metrics to Monitor

CPUUtilization — Redis is single-threaded; high CPU = bottleneck
EngineCPUUtilization — CPU used by Redis engine specifically
CacheHitRate — target 80%+; low = bad key design or wrong TTL
Evictions — items removed due to memory pressure; high = need more memory
CurrConnections — current client connections; too high = need connection pooling
ReplicationLag — delay between primary and replicas; high = replication issues

🧠 Key Insight

Redis Cluster Mode is for horizontal scaling — shard data across multiple primaries. Replicas are for read scaling and HA within each shard. Global Datastore extends Redis across regions for DR and global reads. Serverless removes capacity planning entirely. Memcached scales via client-side sharding but has no failover, no backup, and no HA.

Chapter Summary Introductory

Vertical scaling: bigger nodes; simple but limited; resizing can briefly interrupt service
Horizontal scaling: more nodes; data sharded; near-linear scalability
Redis Cluster Mode: 1–500 shards, 16,384 hash slots, automatic rebalancing
Redis Replicas: up to 5 per shard; read scaling + Multi-AZ failover
Global Datastore: cross-region Redis replication; active-passive; <1s lag
Backup/Restore: automated daily + manual snapshots; restore creates new cluster
ElastiCache Serverless: no capacity planning; auto-scales; pay for data stored and ECPUs consumed
Memcached sharding: client-side only; no replication; node failure = data loss
Key metrics: CPUUtilization, CacheHitRate, Evictions, ReplicationLag
 

Chapter Six

Security & Networking

ElastiCache Security Model Introductory

ElastiCache runs inside your VPC — it has no public endpoint by default. Security is layered: network isolation (VPC/subnets), access control (Security Groups), encryption (in-transit and at-rest), and authentication (Redis AUTH / IAM).

👉 Key principle: ElastiCache is a VPC-only service. There is no public internet access. Your application must be in the same VPC (or connected via VPC peering/Transit Gateway) to reach ElastiCache.

VPC & Subnet Placement Core

🖧

VPC Configuration

ElastiCache cluster lives in your VPC
Choose subnet group (collection of subnets)
Subnets should be private (no internet gateway)
For Multi-AZ, use subnets in different AZs
No public IP — access via private IP only

🛡️

Best Practices

Place in private subnets only
Use dedicated subnets for ElastiCache
CIDR should have enough IPs for nodes
Spread across 2+ AZs for HA (Redis replicas)
Same VPC as your application servers

Security Groups — Network Access Control Core

Security Groups control which traffic can reach your ElastiCache cluster. Only allow inbound connections from your application's security group on the cache port.

Security Group configuration — only allow traffic from app tier SG

App Servers

SG: app-sg
Outbound: all

→

ElastiCache

SG: cache-sg
Inbound: 6379 from app-sg

Redis port: 6379 • Memcached port: 11211 • Only allow from app SG, not 0.0.0.0/0

Rule	Type	Port	Source
✔ Good	Inbound	6379	sg-app-servers (app tier SG)
✔ Good	Inbound	6379	10.0.0.0/16 (VPC CIDR)
✗ Bad	Inbound	6379	0.0.0.0/0 (never do this)

Encryption — At Rest & In Transit Core

🔒

Encryption at Rest

Data encrypted on disk (backups, snapshots)
Uses AWS KMS keys
Choose AWS-managed key or customer-managed CMK
Must enable at cluster creation — cannot add later
Supported: Redis only (not Memcached)

🔐

Encryption in Transit

Data encrypted between app and cache (TLS)
Prevents eavesdropping on network
Enable at cluster creation; Redis 7.0.5+ clusters can also turn it on later
Adds TLS overhead (roughly 10–20% latency; measure for your workload)
Supported: Redis, and Memcached 1.6.12+

👉 Exam tip: At-rest encryption is Redis-only, and Memcached versions before 1.6.12 have no in-transit encryption either, so a question that mentions encryption requirements usually points to Redis. At-rest encryption must be enabled at cluster creation; you cannot add it to an existing cluster.

Authentication — Redis AUTH & IAM Core

🔑

Redis AUTH Token

Password-based authentication
Set AUTH token at cluster creation
Client must provide token to connect
Requires in-transit encryption enabled
Token: 16–128 characters
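
As a rough sketch of how these settings fit together, the function below assembles keyword arguments in the shape boto3's `create_replication_group` accepts. The function name and all identifiers passed to it are hypothetical, and sizing parameters such as node type and replica count are deliberately left out. It only builds and validates a dictionary; it does not call AWS.

```python
def secure_replication_group_kwargs(group_id, subnet_group, cache_sg_id,
                                    auth_token, kms_key_id=None):
    """Assemble encryption and AUTH settings for a Redis replication group."""
    # The AUTH token must be 16 to 128 characters long.
    if not 16 <= len(auth_token) <= 128:
        raise ValueError("AUTH token must be 16-128 characters")

    kwargs = {
        "ReplicationGroupId": group_id,
        "ReplicationGroupDescription": "Encrypted Redis cache",
        "Engine": "redis",
        "CacheSubnetGroupName": subnet_group,   # private subnets
        "SecurityGroupIds": [cache_sg_id],
        "AtRestEncryptionEnabled": True,        # must be set at creation
        # AUTH requires in-transit encryption, so the two are set together.
        "TransitEncryptionEnabled": True,
        "AuthToken": auth_token,
    }
    if kms_key_id is not None:
        # Customer-managed key; omit to fall back to the AWS-managed key.
        kwargs["KmsKeyId"] = kms_key_id
    return kwargs
```

Keeping the token out of source code (for example, loading it from a secrets store at deploy time) is a sensible companion to this configuration.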

👤

IAM Authentication (Redis 7+)

Authenticate using IAM users/roles
No static passwords to manage
Integration with IAM policies
Requires Redis 7.0+ and encryption
Best for: Lambda, ECS, modern apps

Concept Diagram — ElastiCache Security Layers Core

Security layers — VPC, Security Groups, Encryption, Authentication

🧠 Key Insight

ElastiCache is VPC-only, with no public access. Security layers: VPC isolation, private subnets, Security Groups (allow only app tier), encryption (at-rest + in-transit for Redis), and authentication (AUTH token or IAM). Memcached has no at-rest encryption and supports in-transit encryption only from version 1.6.12. Exam tip: “encrypt cache data” = Redis with encryption at-rest + in-transit; “cache in private subnet” = correct architecture.

Chapter Summary Introductory

 VPC-only: ElastiCache has no public endpoint; must be in VPC
Private subnets: place cache nodes in private subnets, spread across AZs
Security Groups: allow inbound only from app tier SG on port 6379 (Redis) or 11211 (Memcached)
Encryption at-rest: KMS-managed; Redis only; enable at creation
Encryption in-transit: TLS; Redis and Memcached 1.6.12+; roughly 10-20% latency overhead, depending on workload
Authentication: Redis AUTH token or IAM (Redis 7+); Memcached has no auth
 

Chapter Seven

Architecture Patterns

Pattern 1 — RDS/Aurora + ElastiCache (Read-Heavy Apps) Core

The most common pattern: ElastiCache sits in front of your relational database, caching frequently-read data to reduce load and latency.

Pattern 1 — Cache-aside pattern with RDS/Aurora backend

Application

Check cache first

→

ElastiCache

Redis
Session + query cache

→

RDS / Aurora

Source of truth
Complex queries

Hit ratio 80-95% • DB load reduced 5-10× • Latency: cache <1ms, DB 1-10ms

When to Use

Read-heavy workload (80%+ reads)
Same data queried repeatedly
Database is a bottleneck
Need sub-ms response times

Implementation

Cache-aside (lazy loading) strategy
TTL based on data freshness needs
Invalidate on writes
Key pattern: entity:id
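
The implementation notes above can be sketched as follows. This is a minimal illustration under stated assumptions: `FakeCache` is a hypothetical in-memory stand-in that mimics the `get`/`setex`/`delete` methods of a Redis client, the `db` argument is a plain dictionary standing in for RDS/Aurora, and the `product` entity name is made up for the example.

```python
import json
import time


class FakeCache:
    """In-memory stand-in for the get/setex/delete subset of a Redis client."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        value, expires_at = self._data.get(key, (None, 0.0))
        if value is not None and time.monotonic() >= expires_at:
            del self._data[key]          # entry outlived its TTL
            return None
        return value

    def setex(self, key, ttl_seconds, value):
        self._data[key] = (value, time.monotonic() + ttl_seconds)

    def delete(self, key):
        self._data.pop(key, None)


def get_product(cache, db, product_id, ttl_seconds=300):
    """Cache-aside read: try the cache, fall back to the database on a miss."""
    key = f"product:{product_id}"        # key pattern entity:id
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)        # cache hit
    row = db[product_id]                 # cache miss: read the source of truth
    cache.setex(key, ttl_seconds, json.dumps(row))
    return row


def update_product(cache, db, product_id, row):
    """Write to the database first, then invalidate the cached copy."""
    db[product_id] = row
    cache.delete(f"product:{product_id}")
```

Invalidating on write (rather than updating the cached value) keeps the write path simple: the next read repopulates the entry from the database. The TTL acts as a safety net for any writes that bypass `update_product`.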

Pattern 2 — DynamoDB + DAX (Serverless Caching) Core

DAX (DynamoDB Accelerator) is a purpose-built cache for DynamoDB. Unlike ElastiCache, DAX is API-compatible with DynamoDB: swap in the DAX client, point it at the DAX cluster endpoint, and reads are cached without any cache logic in your application.

Pattern 2 — DAX provides transparent caching for DynamoDB

Lambda

Uses DAX SDK
Same DynamoDB API

→

DAX

Microsecond reads
Write-through cache

→

DynamoDB

Source of truth
Cache miss fallback

Microsecond latency • Read-through + write-through • Eventually consistent only

👉 DAX vs ElastiCache for DynamoDB: Use DAX for DynamoDB — it's purpose-built, API-compatible, and handles cache management automatically. Use ElastiCache (Redis) when you need features DAX doesn't have (pub/sub, complex data types) or for non-DynamoDB data. DAX consistency note: Write-through only updates items already in cache — it does not proactively load uncached items. DAX also supports eventually consistent reads only; bypass DAX for strongly consistent reads.

Pattern 3 — Session Storage Core

Store user sessions in ElastiCache to enable stateless application servers. Any server can handle any request because session data is centralized.

Pattern 3 — Centralized session storage enables stateless app tier

ALB

No sticky sessions
Round-robin

→

EC2

→

Redis

Session store
TTL = session timeout

Stateless servers • Horizontal scaling • No sticky sessions needed • Session TTL = 15-30 min

Benefits

Servers are stateless — easy to scale
No sticky sessions — better load distribution
Server failure doesn't lose sessions
TTL auto-expires old sessions

Implementation

Use Redis with replicas and Multi-AZ failover (for HA)
Key: session:<session_id>
Value: serialized session data (JSON)
TTL: match session timeout (e.g., 30 min)
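
A session store along these lines might look like the sketch below. The function names are hypothetical, and `client` is assumed to be any object exposing `get`, `setex`, and `expire` with the same argument order as the redis-py client. Refreshing the TTL on every read (a sliding timeout) is one possible design choice, not something the pattern requires.

```python
import json
import secrets

SESSION_TTL_SECONDS = 30 * 60  # match the application's session timeout


def new_session_id() -> str:
    """Generate an unguessable session identifier."""
    return secrets.token_urlsafe(32)


def session_key(session_id: str) -> str:
    return f"session:{session_id}"


def save_session(client, session_id, data, ttl=SESSION_TTL_SECONDS):
    """Store the session as JSON and let the cache expire it automatically."""
    client.setex(session_key(session_id), ttl, json.dumps(data))


def load_session(client, session_id, ttl=SESSION_TTL_SECONDS):
    """Return the session dict, or None if it is missing or has expired."""
    key = session_key(session_id)
    raw = client.get(key)
    if raw is None:
        return None
    client.expire(key, ttl)  # sliding timeout: activity extends the session
    return json.loads(raw)
```

Because every app server reads and writes the same keys, any server behind the load balancer can serve any request.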

Pattern 4 — Leaderboards with Redis Sorted Sets Core

Redis sorted sets are perfect for leaderboards. Each member has a score; Redis keeps them sorted automatically. O(log N) insert/update, O(log N + M) for range queries.

🏆

How It Works

ZADD leaderboard 1500 "player1" (add or update a score)
ZREVRANK leaderboard "player1" (0-based rank, highest score first)
ZREVRANGE leaderboard 0 9 WITHSCORES (top 10, highest score first)
ZRANK and ZRANGE use ascending order (lowest score first)
Millions of players, instant rank lookup
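
The same commands can be wrapped in a few Python helpers. This is a sketch, assuming `client` behaves like a redis-py client (its `zadd`, `zrevrange`, and `zrevrank` methods); the helper names and the board name are hypothetical.

```python
def record_score(client, board, player, score):
    """Add the player or overwrite their score (ZADD)."""
    client.zadd(board, {player: score})


def top_players(client, board, count=10):
    """Return the highest-scoring players first (ZREVRANGE ... WITHSCORES)."""
    return client.zrevrange(board, 0, count - 1, withscores=True)


def player_rank(client, board, player):
    """Return a 1-based rank where 1 is the top score, or None if unranked."""
    rank = client.zrevrank(board, player)  # 0-based, highest score first
    return None if rank is None else rank + 1
```

The descending variants matter here: the ascending `zrange`/`zrank` calls would return the lowest scores first, which is rarely what a leaderboard wants.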

🎮

Use Cases

Gaming leaderboards
Top sellers / trending products
Real-time analytics dashboards
Voting / rating systems
Activity feeds (sorted by time)

Pattern 5 — Real-Time Pub/Sub Advanced

Redis Pub/Sub enables real-time messaging between services. Publishers send to channels; subscribers receive instantly. No message persistence (fire-and-forget).

How Pub/Sub Works

SUBSCRIBE notifications — listen to channel
PUBLISH notifications "new order" — send message
All subscribers receive simultaneously
No message queue — missed = lost
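
The fire-and-forget behavior is easy to see in a toy model. The `Broker` class below is a hypothetical in-process illustration of the delivery semantics, not the Redis API: a message reaches only the subscribers registered at the moment it is published, and nothing is stored for later.

```python
from collections import defaultdict


class Broker:
    """Toy model of fire-and-forget pub/sub delivery."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, channel, callback):
        """Register a callback for messages published from now on."""
        self._subscribers[channel].append(callback)

    def publish(self, channel, message):
        """Deliver to current subscribers and return how many received it."""
        receivers = self._subscribers[channel]
        for callback in receivers:
            callback(message)
        # Nothing is queued: with zero receivers the message is simply gone.
        return len(receivers)
```

A subscriber that connects after a publish never sees that message, which is why durable workflows are better served by a queue such as SQS.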

Use Cases

Real-time notifications
Chat applications
Live updates (sports scores, stock prices)
Cache invalidation broadcast

Decision Guide — ElastiCache vs Alternatives Core

Requirement	Solution	Why
Cache for RDS/Aurora	ElastiCache Redis	General-purpose, flexible
Cache for DynamoDB	DAX	API-compatible, microseconds
Session storage	ElastiCache Redis	Persistence, TTL, HA
Leaderboards	ElastiCache Redis	Sorted sets
Simple cache, multi-threaded	ElastiCache Memcached	CPU efficiency
Message queue (durable)	SQS / SNS	Redis Pub/Sub is fire-and-forget

Exam Cheatsheet Core

🎯 Exam Keywords → ElastiCache Answer

“sub-millisecond latency” → ElastiCache (Redis or Memcached)
“reduce database load” → ElastiCache cache-aside pattern
“session storage, stateless servers” → ElastiCache Redis
“leaderboard, ranking” → Redis sorted sets
“DynamoDB caching, microsecond” → DAX (not ElastiCache)
“cache with persistence” → Redis (Memcached has no persistence)
“cache with Multi-AZ failover” → Redis (Memcached has no HA)
“encrypt cache data” → Redis with encryption at-rest + in-transit
“simple cache, multi-threaded” → Memcached
“Memcached node failure” → cached data on that node is lost; there is no replication or data recovery
“lazy loading” → cache-aside strategy
“cache always in sync” → write-through strategy
“cold start, first request miss” → cache warming / preloading
“multi-region Redis” → Global Datastore (active-passive, <1s lag)
“serverless Redis” → ElastiCache Serverless (no capacity planning)
“Redis backup” → automated snapshots + manual; restore creates new cluster
“atomic multi-key Redis operation” → MULTI/EXEC transaction
“LRU eviction” → least recently used (default)
“TTL expiration” → time-to-live controls cache freshness
“pub/sub messaging” → Redis (not durable; use SQS for durable)

🧠 Final Insight

ElastiCache is the AWS caching layer for reducing database load and latency. Redis is the default choice — it has persistence, HA, and rich data types. Use Memcached only for simple, ephemeral caching with multi-threaded needs. Use DAX for DynamoDB specifically. The cache is not a replacement for your database — it's a protective shield that absorbs repetitive read load.