Amazon ElastiCache

Amazon ElastiCache: In-Memory Caching at Scale

ElastiCache is not just a faster database — it is a layer that protects your database. By storing frequently accessed data in memory, ElastiCache reduces database load, cuts latency from milliseconds to microseconds, and lets your application scale without rewriting queries. Choose Redis for rich features or Memcached for simplicity.

⚡ ElastiCache in 30 Seconds

  • Fully managed in-memory caching — Redis or Memcached
  • Sub-millisecond latency: typically 10–100× faster than a disk-backed database query
  • Sits between your app and database — reduces DB load, speeds up reads
  • Cache hit = instant response from memory; Cache miss = fetch from database
  • Use for: session storage, leaderboards, real-time analytics, API response caching
  • Redis: rich data types, persistence, pub/sub; Memcached: simpler, multi-threaded
01
Chapter One

What is Amazon ElastiCache

The Problem — Databases Are Slow Introductory

Every time your application reads data from a database, it makes a network round-trip, waits for the database to find the data on disk, and returns it. Even with fast SSDs, this takes 1–10 milliseconds per query. When you have thousands of users making the same queries, your database becomes a bottleneck. Scaling the database is expensive. And much of that work is redundant — the same data is being fetched over and over.

👉 The core problem: Your database is doing the same work repeatedly. If 1,000 users request the same product page, the database runs the same query 1,000 times. That's wasted compute, wasted latency, and unnecessary load on a system that's hard to scale horizontally.

What is Caching Introductory

Caching is storing frequently accessed data in a fast, temporary location so you don't have to fetch it from the slower primary source every time. Think of it like keeping your most-used tools on your desk instead of walking to the garage each time you need them.

💾

Database (Slow)

  • Data stored on disk (SSD/HDD)
  • Latency: 1–10+ ms
  • Durable — survives restarts
  • Handles complex queries
  • Expensive to scale horizontally

Cache (Fast)

  • Data stored in RAM (memory)
  • Latency: <1 ms (microseconds)
  • Volatile — data can be lost
  • Key-value lookups only
  • Cheap to scale horizontally
🤝

Together

  • Cache handles hot data
  • Database handles cold data
  • Cache absorbs repetitive load
  • Database handles complex queries
  • Best of both worlds
Memory vs Disk — Why Caching is Fast Core

The speed difference between RAM and disk is not incremental — it is orders of magnitude. Understanding this is key to understanding why caching works.

Storage Type | Latency | Analogy
L1 CPU Cache | ~1 nanosecond | Blink of an eye
RAM (Memory) | ~100 nanoseconds | One heartbeat
SSD | ~100 microseconds | Walking to the kitchen
HDD | ~10 milliseconds | Walking to the store
Network (same region) | ~1 millisecond | Driving across town

RAM is ~1,000× faster than SSD, and ~100,000× faster than HDD. This is why caching works.

What is Amazon ElastiCache Introductory

Amazon ElastiCache is a fully managed in-memory caching service. You choose an engine — Redis or Memcached — and AWS handles provisioning, patching, scaling, and failover. ElastiCache runs inside your VPC, sitting between your application and your database.

What ElastiCache Provides

  • Managed Redis or Memcached clusters
  • Automatic failover (Multi-AZ for Redis)
  • Backup and restore (Redis)
  • Encryption at rest and in transit
  • CloudWatch metrics and alarms
  • VPC isolation, Security Groups
📌

What You Still Manage

  • Cache invalidation logic
  • Application-side caching strategy
  • Key design and TTL policy
  • Choosing Redis vs Memcached
  • Sizing nodes appropriately
  • Deciding what data to cache
🧠 Mental Model — Cache as a Protective Layer Introductory

Most People Think (Incomplete):

“Cache = faster database. I put data in cache, reads are faster.”

✨ Better Mental Model:

Cache = a protective layer that absorbs load and reduces latency. It sits in front of your database, answering repeated questions so the database doesn't have to. The database handles complex queries and writes; the cache handles hot reads. The cache is not a replacement — it is a shield.

🛑

Without Cache

  • Every read hits the database
  • Database CPU spikes on popular data
  • Response times degrade under load
  • Scale = bigger database (expensive)
  • DB becomes single point of failure

With Cache

  • Hot reads served from memory
  • Database handles only cache misses
  • Consistent sub-ms response times
  • Scale = add cache nodes (cheap)
  • Database load reduced 80–90%
💡

Key Insight

  • Cache hit ratio is everything
  • 90% hit ratio = 10× less DB load
  • Cache what's read often, changes rarely
  • Don't cache complex joins — cache results
  • Invalidation is the hard problem
Concept Diagram — Cache Hit vs Cache Miss Core
Cache flow — hit returns instantly from memory; miss fetches from database then caches
Diagram: the application (EC2 / Lambda) sends GET product:123 to ElastiCache (Redis / Memcached, in-memory). Cache hit: the data is in the cache and is returned immediately from memory (<1 ms). Cache miss: the data is not in the cache, so it is fetched from the database (RDS / DynamoDB, disk, 1–10 ms), stored in the cache, and returned to the app.
AWS Architecture Diagram — ElastiCache in a Typical Stack Core
ElastiCache sits between your application and database — the most common architecture
Diagram: Application (EC2 / Lambda, checks the cache first) → ElastiCache (Redis / Memcached, in-memory, sub-ms latency) → RDS / Aurora (source of truth, cache-miss fallback).
Cache hit: <1ms • Cache miss: fetch from DB (1–10ms) • Typical hit ratio: 80–95% • DB load reduced proportionally
When to Use ElastiCache Core

Use ElastiCache When

  • Read-heavy workload (80%+ reads)
  • Same data requested repeatedly (hot keys)
  • Need sub-millisecond latency
  • Database is a bottleneck or expensive to scale
  • Session storage for stateless app servers
  • Real-time leaderboards, analytics, counters
  • API response caching
📌

Don't Use Cache Alone When

  • Write-heavy workload (cache doesn't help writes)
  • Data changes constantly (cache always stale)
  • Every request is unique (no cache hits)
  • Strong consistency required (cache = eventual)
  • Complex joins/aggregations (cache key-value only)
  • Small dataset that fits in DB memory anyway
ElastiCache Engine Comparison — Redis vs Memcached Core
Feature | Redis | Memcached
Data types | Strings, lists, sets, hashes, sorted sets | Strings only
Persistence | Yes (snapshots + AOF) | No
Replication | Yes (primary + replicas) | No
Multi-AZ failover | Yes (automatic) | No
Pub/Sub | Yes | No
Lua scripting | Yes | No
Multi-threaded | Single-threaded (mostly) | Yes

Rule of thumb: Choose Redis unless you specifically need Memcached's multi-threading or have a legacy Memcached dependency.

🧠 Key Insight

ElastiCache is not about making your database faster — it's about reducing the work your database has to do. A 90% cache hit ratio means your database handles 10× less load. The cache absorbs repetitive reads; the database focuses on writes and complex queries. This is how you scale read-heavy applications without scaling your database.

Chapter Summary Introductory
  • Caching = storing frequently accessed data in fast memory to avoid slow disk/network access
  • ElastiCache = fully managed Redis or Memcached clusters in AWS
  • Memory is ~1,000× faster than SSD — that's why caching works
  • Cache hit = data found in cache, return instantly (<1ms)
  • Cache miss = data not in cache, fetch from DB, store in cache, return
  • Mental model: cache = protective layer that absorbs load from your database
  • Redis vs Memcached: Redis has richer features; Memcached is simpler and multi-threaded
02
Chapter Two

Core Concepts — TTL, Eviction & Cache Behavior

Cache Hit vs Cache Miss — The Foundation Introductory

Every cache interaction has one of two outcomes: a hit (data found) or a miss (data not found). Understanding this is fundamental to everything else in caching.

✔️

Cache Hit

  • Application requests data from cache
  • Data exists in cache
  • Return immediately from memory
  • Latency: <1 millisecond
  • Database is not touched
  • This is what you want — maximize hit ratio

Cache Miss

  • Application requests data from cache
  • Data not found in cache
  • Fetch from database (slow)
  • Store result in cache for next time
  • Latency: 1–10+ milliseconds
  • First request for any data is always a miss

📊 Cache Hit Ratio

Hit Ratio = Cache Hits / Total Requests. A 90% hit ratio means 90% of requests are served from cache (fast) and only 10% go to the database (slow). The higher your hit ratio, the less load on your database. Target: 80–95% for most workloads.
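
The hit-ratio arithmetic above can be sketched in a few lines. The latencies used here (0.5 ms for a hit, 5 ms for a miss) are illustrative assumptions picked from the ranges quoted in this chapter, not measurements, and the function names are made up for this example.

```python
def hit_ratio(hits: int, total_requests: int) -> float:
    """Hit ratio = cache hits / total requests."""
    if total_requests == 0:
        return 0.0
    return hits / total_requests


def db_reads(total_requests: int, ratio: float) -> int:
    """Reads that still reach the database (the cache misses)."""
    return round(total_requests * (1.0 - ratio))


def average_latency_ms(ratio: float, hit_ms: float = 0.5, miss_ms: float = 5.0) -> float:
    """Expected read latency, weighting hit and miss latency by the hit ratio.

    hit_ms and miss_ms are assumed example values, not benchmarks.
    """
    return ratio * hit_ms + (1.0 - ratio) * miss_ms


ratio = hit_ratio(hits=9_000, total_requests=10_000)
print(ratio)                                  # 0.9
print(db_reads(10_000, ratio))                # 1000 reads reach the database
print(round(average_latency_ms(ratio), 2))    # 0.95
```

With these assumed numbers, a 90% hit ratio sends one tenth of the reads to the database, and average read latency falls from 5 ms to roughly 1 ms.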

TTL — Time to Live Core

TTL (Time to Live) is how long a cached item remains valid before it expires and is automatically removed. TTL is your primary tool for balancing freshness vs performance.

⏱️

Short TTL (seconds)

  • Data refreshes frequently
  • More cache misses
  • Higher database load
  • More consistent with source
  • Use for: rapidly changing data

Long TTL (hours/days)

  • Data stays cached longer
  • More cache hits
  • Lower database load
  • May serve stale data
  • Use for: static/rarely changing data
♾️

No TTL (infinite)

  • Data never expires automatically
  • Must manually invalidate
  • Risk of serving stale data forever
  • Use sparingly
  • Use for: truly immutable data
Data Type | Suggested TTL | Rationale
User session | 15–30 minutes | Match session timeout
Product catalog | 1–24 hours | Rarely changes, high read volume
User profile | 5–15 minutes | Balance freshness vs load
Stock price | 1–5 seconds | Changes constantly
Static config | 24+ hours | Almost never changes
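
To make TTL behavior concrete, here is a minimal in-process sketch of a cache whose entries expire. It is a teaching toy, not how ElastiCache is implemented: the `TTLCache` class is hypothetical, it only checks expiry lazily when a key is read, and the clock is injectable so the example can run without sleeping.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Toy cache where every entry carries its own expiry time."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so expiry can be shown without sleeping
        self._items: Dict[str, Tuple[Any, float]] = {}

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._items[key] = (value, self._clock() + ttl_seconds)

    def get(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None:
            return None  # miss: never cached
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired: drop it and report a miss
            return None
        return value


now = [0.0]  # fake clock, in seconds
cache = TTLCache(clock=lambda: now[0])
cache.set("session:42", {"user": "alice"}, ttl_seconds=1800)  # 30-minute session

now[0] = 1799.0
print(cache.get("session:42"))  # {'user': 'alice'}
now[0] = 1800.0
print(cache.get("session:42"))  # None
```

Once the TTL elapses the read turns into a miss, which is the point at which the application would go back to the database and re-cache the value.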
Eviction Policies — What Happens When Cache is Full Core

Cache memory is finite. When it fills up, the cache must evict (remove) existing items to make room for new ones. The eviction policy determines which items get removed.

💡

LRU — Least Recently Used

  • Evicts items that haven't been accessed recently
  • Most common policy; ElastiCache's default parameter group uses volatile-lru (open-source Redis defaults to noeviction)
  • Keeps hot data, removes cold data
  • Good for: general-purpose caching
  • Exam: “remove least recently accessed” → LRU
🔢

LFU — Least Frequently Used

  • Evicts items accessed the fewest times
  • Keeps frequently accessed data longer
  • Better for skewed access patterns
  • Good for: data with long-term popularity
  • Exam: “remove least frequently accessed” → LFU

volatile-lru

LRU among keys with TTL set. Keys without TTL are never evicted.

allkeys-lru

LRU among all keys. Most common choice for pure cache use cases.

noeviction

Returns an error on writes when memory is full. Use when silently dropping data is unacceptable.
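
The LRU idea can be sketched with an ordered dictionary. This is an exact LRU written for illustration; Redis itself approximates LRU by sampling keys, so treat the hypothetical `LRUCache` class below as a model of the policy rather than of the engine.

```python
from collections import OrderedDict
from typing import Any, Optional


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items: "OrderedDict[str, Any]" = OrderedDict()

    def get(self, key: str) -> Optional[Any]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # a read makes the key most recently used
        return self._items[key]

    def set(self, key: str, value: Any) -> None:
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used key


cache = LRUCache(capacity=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")         # "a" is now more recent than "b"
cache.set("c", 3)      # cache is full, so "b" is evicted
print(cache.get("b"))  # None
print(cache.get("a"))  # 1
```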

Concept Diagram — TTL & Eviction Lifecycle Core
Cache item lifecycle — from creation to expiration or eviction
Diagram: 1. Write (SET key value, TTL = 300s; item created) → 2. Active (GET key returns a hit while the TTL counts down) → 3a. TTL expires (TTL = 0, item auto-removed) or 3b. Evicted (memory full, LRU/LFU removes the item) → 4. Miss (GET key returns null; fetch from DB and re-cache) → 5. Write with a new TTL; the cycle repeats.
Cache Invalidation — The Hard Problem Advanced

“There are only two hard things in Computer Science: cache invalidation and naming things.” — Phil Karlton. When data changes in your database, the cached copy becomes stale. You have two options:

⏱️

TTL-Based Expiration

  • Let cache expire naturally via TTL
  • Simple — no extra code
  • Data may be stale up to TTL duration
  • Good for: data where staleness is acceptable
  • Example: product catalog (24h TTL)
🗑️

Explicit Invalidation

  • Delete/update cache when data changes
  • More complex — need to track dependencies
  • Data always fresh
  • Good for: data where consistency matters
  • Example: user profile (invalidate on update)

👉 Common pattern: Use TTL as a safety net, plus explicit invalidation for critical data. TTL ensures stale data eventually disappears even if you forget to invalidate. Explicit invalidation ensures important updates are immediately visible.

🧠 Key Insight

TTL is your balance between freshness and performance. Short TTL = fresher data but more database load. Long TTL = better performance but risk of stale data. Eviction policies decide what to remove when memory is full. LRU (least recently used) is the right choice for most use cases: allkeys-lru for a pure cache, or volatile-lru (the ElastiCache default) when keys without a TTL must never be evicted.

Chapter Summary Introductory
  • Cache hit = data found, return instantly; cache miss = fetch from DB, then cache
  • Hit ratio = hits / total requests — target 80–95%
  • TTL = time-to-live; controls how long data stays in cache before auto-expiring
  • Short TTL = fresher data, more DB load; Long TTL = better performance, risk of staleness
  • Eviction = removing items when cache is full; LRU (least recently used) is most common
  • Invalidation = TTL-based (simple) or explicit delete (complex but fresh)
03
Chapter Three

Redis vs Memcached — Choosing the Right Engine

The Decision That Matters Introductory

ElastiCache supports two engines: Redis and Memcached. Both are in-memory key-value stores, but they have fundamentally different architectures and capabilities. The right choice depends on your use case.

🎯 Quick Decision Rule

Choose Redis unless you have a specific reason to use Memcached. Redis has more features, better availability, and covers 95% of use cases. Memcached is for legacy systems or very specific multi-threaded scaling needs.

Redis — Feature-Rich In-Memory Database Core

Redis Strengths

  • Rich data types: strings, lists, sets, sorted sets, hashes, streams
  • Persistence: RDB snapshots + AOF (append-only file)
  • Replication: primary-replica architecture
  • Multi-AZ failover: automatic with replicas
  • Pub/Sub: real-time messaging
  • Lua scripting: atomic operations
  • Transactions: MULTI/EXEC for atomic batches
  • Cluster mode: horizontal sharding
💡

Redis Use Cases

  • Session storage (with persistence)
  • Leaderboards (sorted sets)
  • Real-time analytics (counters, HyperLogLog)
  • Message queues (lists, streams)
  • Pub/Sub notifications
  • Rate limiting
  • Geospatial queries
  • Caching with advanced data structures
Redis Data Types — Beyond Simple Key-Value Core

String

Simple key-value. Max 512 MB. Supports atomic increment/decrement.

SET user:123 "Alice"

List

Ordered collection. Push/pop from both ends. Good for queues.

LPUSH queue task1

Set

Unordered unique values. Set operations (union, intersect).

SADD tags "aws"

Sorted Set

Set with scores. Auto-sorted by score. Perfect for leaderboards.

ZADD leaderboard 100 "player1"

Hash

Field-value pairs under one key. Like a mini-object.

HSET user:123 name "Alice"

Stream

Append-only log. Consumer groups. Event sourcing.

XADD events * action "click"

Redis Transactions — Atomic Batches Advanced
🔄

MULTI / EXEC

  • MULTI — begin transaction block
  • Queue commands inside the block
  • EXEC — execute all at once (atomic)
  • DISCARD — cancel queued commands
  • Queued commands run back to back with no other client's commands interleaved; if the connection drops before EXEC, none run
  • ⚠️ Not a true rollback — if one command fails, others still execute
👁️

WATCH — Optimistic Locking

  • WATCH key — monitor key for changes
  • If key changes before EXEC, transaction aborts
  • Use for: read-modify-write patterns
  • Example: decrement inventory + add order
  • Retry on abort for optimistic concurrency
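
The optimistic-locking idea behind WATCH can be sketched without a Redis server. The `VersionedStore` class below is a hypothetical in-memory stand-in, not a Redis API: `read` plays the role of WATCH plus GET, and `commit_if_unchanged` plays the role of EXEC, which aborts if the key changed in between.

```python
import threading
from typing import Dict, Tuple


class VersionedStore:
    """Hypothetical in-memory key space; each key maps to (value, version)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._data: Dict[str, Tuple[int, int]] = {}

    def set(self, key: str, value: int) -> None:
        with self._lock:
            _, version = self._data.get(key, (0, 0))
            self._data[key] = (value, version + 1)

    def read(self, key: str) -> Tuple[int, int]:
        """Return (value, version); stands in for WATCH followed by GET."""
        with self._lock:
            return self._data.get(key, (0, 0))

    def commit_if_unchanged(self, key: str, seen_version: int, new_value: int) -> bool:
        """Apply the write only if the key was not modified; stands in for EXEC."""
        with self._lock:
            _, version = self._data.get(key, (0, 0))
            if version != seen_version:
                return False  # someone else changed the key: abort
            self._data[key] = (new_value, version + 1)
            return True


def reserve_item(store: VersionedStore, key: str, max_retries: int = 5) -> bool:
    """Decrement stock by one, retrying when a concurrent writer wins the race."""
    for _ in range(max_retries):
        stock, version = store.read(key)
        if stock <= 0:
            return False  # sold out
        if store.commit_if_unchanged(key, version, stock - 1):
            return True
    return False  # gave up after repeated conflicts


store = VersionedStore()
store.set("inventory:sku-1", 2)
print(reserve_item(store, "inventory:sku-1"))  # True
print(reserve_item(store, "inventory:sku-1"))  # True
print(reserve_item(store, "inventory:sku-1"))  # False
```

No lock is held between reading and writing; conflicts are detected at commit time and handled by retrying, which is what makes the approach optimistic.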
Memcached — Simple, Distributed Cache Core
📦

Memcached Strengths

  • Multi-threaded: scales with CPU cores
  • Simple: just key-value strings
  • Distributed by design: client-side sharding
  • Low memory overhead: no persistence metadata
  • Mature: battle-tested for decades
  • Easy horizontal scaling: add nodes, update client
⚠️

Memcached Limitations

  • Strings only: no complex data types
  • No persistence: data lost on restart
  • No replication: no built-in HA
  • No failover: node loss = data loss
  • No pub/sub: just get/set
  • No transactions: no atomic batches
Comparison Table — Redis vs Memcached Core
Feature | Redis | Memcached
Data types | Strings, lists, sets, hashes, sorted sets, streams | Strings only
Persistence | Yes (RDB + AOF) | No
Replication | Yes (primary-replica) | No
Multi-AZ failover | Yes (automatic) | No
Pub/Sub | Yes | No
Lua scripting | Yes | No
Cluster mode (sharding) | Yes (server-side) | Client-side only
Multi-threaded | Mostly single-threaded | Yes
Max item size | 512 MB | 1 MB (default)
Backup/Restore | Yes | No
Concept Diagram — Architecture Differences Core
Redis (replicated, persistent) vs Memcached (distributed, ephemeral)
Diagram: Redis (replicated): one primary (read/write, persistence) with Replica 1 and Replica 2 (read only); Multi-AZ failover, persistence, and backups; if the primary fails, a replica is promoted automatically. Memcached (distributed): Node 1 (keys A-F), Node 2 (keys G-M), Node 3 (keys N-Z), Node 4 (failed); no replication, no persistence, no failover; if a node fails, its data is lost and must be re-fetched from the DB.
Decision Flowchart Core
✔️

Choose Redis When

  • Need complex data types (lists, sets, sorted sets)
  • Need persistence — data must survive restarts
  • Need high availability — Multi-AZ failover
  • Need pub/sub for real-time messaging
  • Building leaderboards (sorted sets)
  • Need atomic operations or Lua scripting
  • Default choice for most use cases
📦

Choose Memcached When

  • Simple key-value strings are enough
  • Multi-threaded performance is critical
  • Have legacy Memcached dependency
  • Data loss is acceptable (pure cache)
  • Need lowest memory overhead
  • Horizontal scaling via client-side sharding
  • Cost-sensitive, simpler architecture
🧠 Key Insight

Redis is the default choice for 95% of caching needs. It covers the same key-value caching that Memcached offers and adds persistence, replication, failover, and rich data types. Choose Memcached only if you specifically need its multi-threaded architecture or have a legacy dependency. A Memcached node failure means data loss for that shard, with no recovery. Exam tip: Redis for complex data structures and HA; Memcached for simple, ephemeral cache.

Chapter Summary Introductory
  • Redis: rich data types, persistence, replication, Multi-AZ failover, pub/sub, Lua, transactions — choose by default
  • Memcached: strings only, no persistence, no replication, multi-threaded — node failure = data lost permanently
  • Redis data types: strings, lists, sets, sorted sets (leaderboards), hashes, streams
  • Redis persistence: RDB (snapshots) + AOF (append-only file) — survives restarts
  • Redis MULTI/EXEC: atomic batches; use WATCH for optimistic locking
  • Memcached nodes: independent, no failover — node failure = data loss for that shard
  • Exam: “leaderboards” → Redis sorted sets; “session with persistence” → Redis; “simple cache” → either
04
Chapter Four

Caching Strategies — Cache-Aside, Write-Through & More

Why Strategy Matters Introductory

A caching strategy defines when and how data moves between your application, cache, and database. The wrong strategy leads to stale data, cache misses, or wasted resources. The right strategy matches your access patterns.

👉 The core question: When data is written to the database, how does the cache stay in sync? And when data is read, who is responsible for populating the cache? Your answers determine your caching strategy.

Cache-Aside (Lazy Loading) — The Most Common Strategy Core

Cache-Aside is the most widely used caching pattern. The application is responsible for managing the cache — checking it on reads and updating it on misses. The cache and database are independent; neither knows about the other.

How Cache-Aside Works

  • Read: App checks cache first
  • Hit: Return cached data
  • Miss: Fetch from DB, store in cache, return
  • Write: Write to DB, invalidate/delete from cache
  • Cache is populated on demand (lazy)
  • Only accessed data is cached
💡

Pros & Cons

  • ✔ Only caches data that's actually used
  • ✔ Cache failure doesn't break the app (fallback to DB)
  • ✔ Simple to implement
  • ✗ First request is always a miss (cold start)
  • ✗ Stale data possible if DB updated externally
  • ✗ App must manage cache logic
Cache-Aside pattern — application manages cache reads and invalidation
Diagram: Read flow: (1) the app sends GET user:1 to the cache and a hit is returned directly; (2) on a miss the app fetches from the database and (3) stores the result in the cache. Write flow: (1) the app writes the data to the database first, then (2) deletes the key from the cache. Summary: most common pattern; app controls cache logic; caches only what is used; resilient to cache failure; first request is a cache miss; possible stale-data window. Best for read-heavy workloads with ElastiCache in front of any database.
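
The cache-aside read and write flows can be sketched as follows. This is a minimal illustration, not a production client: `InMemoryCache` is a hypothetical stand-in that exposes the `get` / `setex` / `delete` calls a redis-py client also provides, the `db` dict stands in for the database, and the 300-second TTL is an assumed value.

```python
import json
from typing import Any, Dict, Optional

USER_TTL_SECONDS = 300  # assumed TTL; a safety net in case an invalidation is missed


class InMemoryCache:
    """Hypothetical stand-in with redis-py style get/setex/delete (TTL is ignored)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def setex(self, key: str, ttl_seconds: int, value: str) -> None:
        self._data[key] = value  # a real cache would also expire the key after ttl_seconds

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


def get_user(cache: InMemoryCache, db: Dict[int, Dict[str, Any]],
             user_id: int) -> Optional[Dict[str, Any]]:
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)  # hit: the database is not touched
    row = db.get(user_id)          # miss: fall back to the source of truth
    if row is not None:
        cache.setex(key, USER_TTL_SECONDS, json.dumps(row))  # populate for next time
    return row


def update_user(cache: InMemoryCache, db: Dict[int, Dict[str, Any]],
                user_id: int, fields: Dict[str, Any]) -> None:
    db[user_id] = {**db.get(user_id, {}), **fields}  # 1. write the database first
    cache.delete(f"user:{user_id}")                  # 2. then invalidate the cached copy


cache = InMemoryCache()
db = {1: {"name": "Alice"}}
get_user(cache, db, 1)                         # miss: loads from db and caches
get_user(cache, db, 1)                         # hit: served from the cache
update_user(cache, db, 1, {"name": "Alicia"})  # write invalidates the key
print(get_user(cache, db, 1))                  # {'name': 'Alicia'}
```

Deleting the key on write, rather than updating it in place, keeps the write path simple: the next read repopulates the cache from the database.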
Write-Through — Always Keep Cache in Sync Core

Write-Through writes data to the cache and database at the same time. The cache stays up to date for every write that goes through the application, which eliminates stale reads of that data, but every write pays the latency of both systems.

📝

How Write-Through Works

  • Write: App writes to cache AND database together
  • Read: Always read from cache (always populated)
  • Cache is never stale for writes through the app
  • Often combined with cache-aside for reads
💡

Pros & Cons

  • ✔ Cache is always consistent with DB
  • ✔ No stale data from app writes
  • ✗ Write latency = cache + DB (slower)
  • ✗ Cache may hold data never read
  • ✗ More complex — must handle failures
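
A write-through wrapper can be sketched like this. Both stores are plain dicts standing in for ElastiCache and the database (an assumption made so the example runs anywhere), and the hypothetical `WriteThroughStore` class leaves out the failure handling a real implementation needs when one of the two writes fails.

```python
from typing import Dict, Optional


class WriteThroughStore:
    """Every write goes to the database and the cache in the same call."""

    def __init__(self, cache: Dict[str, str], db: Dict[str, str]) -> None:
        self.cache = cache  # dict standing in for the cache
        self.db = db        # dict standing in for the database

    def write(self, key: str, value: str) -> None:
        self.db[key] = value     # durable write first
        self.cache[key] = value  # then refresh the cache so reads see the new value

    def read(self, key: str) -> Optional[str]:
        value = self.cache.get(key)
        if value is None:
            # Cache-aside fallback for keys that were evicted or never written through.
            value = self.db.get(key)
            if value is not None:
                self.cache[key] = value
        return value


store = WriteThroughStore(cache={}, db={"legacy": "old"})
store.write("user:1", "Alice")
print(store.cache["user:1"], store.db["user:1"])  # Alice Alice
print(store.read("legacy"))                       # old
```

The `read` method shows why write-through is often paired with cache-aside: data that predates the cache, or was evicted, still needs a lazy-loading path.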
Write-Behind (Write-Back) — Async Writes for Performance Advanced

Write-Behind writes to cache immediately, then asynchronously writes to the database later. This optimizes write latency but risks data loss if the cache fails before flushing to DB.

How Write-Behind Works

  • Write: App writes to cache (fast)
  • Background: Cache flushes to DB asynchronously
  • Writes are batched for efficiency
  • App sees fast response; DB updated later
⚠️

Risks

  • ✗ Data loss if cache fails before flush
  • ✗ Complex failure handling
  • ✗ Eventual consistency only
  • Use only when: write speed > durability
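
The write-behind flow can be sketched as a cache plus a queue of pending writes. The hypothetical `WriteBehindStore` class below uses dicts for both stores and flushes only when `flush()` is called; a real system would call it from a background worker or timer, and anything still pending when the cache dies is lost.

```python
from collections import OrderedDict
from typing import Dict


class WriteBehindStore:
    """Writes land in the cache immediately and reach the database on flush()."""

    def __init__(self, db: Dict[str, str]) -> None:
        self.cache: Dict[str, str] = {}
        self.db = db  # dict standing in for the database
        self._dirty: "OrderedDict[str, str]" = OrderedDict()  # key -> latest unflushed value

    def write(self, key: str, value: str) -> None:
        self.cache[key] = value
        self._dirty[key] = value  # repeated writes to one key collapse into one entry

    def flush(self) -> int:
        """Write all pending changes to the database as one batch; return how many."""
        batch = list(self._dirty.items())
        for key, value in batch:
            self.db[key] = value
        self._dirty.clear()
        return len(batch)


db: Dict[str, str] = {}
store = WriteBehindStore(db)
store.write("counter:views", "1")
store.write("counter:views", "2")
store.write("counter:likes", "1")
print(db)             # {} because nothing has been flushed yet
print(store.flush())  # 2
print(db)             # {'counter:views': '2', 'counter:likes': '1'}
```

Three application writes became two database writes, which is where the batching gain comes from; the empty `db` before the flush is the window in which data can be lost.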
Read-Through — Cache Manages Reads Core

Read-Through is similar to cache-aside, but the cache itself fetches from the database on a miss (not the application). The app only talks to the cache.

👉 Read-Through vs Cache-Aside: In cache-aside, the app fetches from DB on miss and writes to cache. In read-through, the cache layer handles this automatically. Read-through requires cache infrastructure that supports it (like DAX for DynamoDB).

Strategy Comparison Table Core
Strategy | Read | Write | Best For
Cache-Aside | App checks cache, misses fetch from DB | Write to DB, invalidate cache | General purpose, read-heavy
Write-Through | Always from cache | Write to cache + DB together | Consistency critical
Write-Behind | Always from cache | Write to cache, async to DB | Write-heavy, latency-sensitive
Read-Through | Cache fetches from DB on miss | Varies (often write-through) | Simplified app logic
🧠 Key Insight

Cache-Aside is the default strategy for ElastiCache — simple, flexible, and works with any database. Use Write-Through when you need strong consistency. Write-Behind is risky but fast. Pre-warm your cache after deployments to avoid a cold-start burst on your database. Exam tip: “reduce DB load for reads” = cache-aside; “cache always in sync” = write-through; “DynamoDB caching” = DAX (read-through/write-through).

Cache Warming — Avoiding Cold Start Core

On first deployment (or after a cache flush), every request is a cache miss. This creates a cold-start burst of database queries. Cache warming pre-populates the cache before traffic arrives.

🔥

Warming Strategies

  • Pre-load script: run before deployment, load hot keys from DB
  • Gradual traffic shift: route small % first, cache builds up
  • Off-peak warming: pre-warm during low traffic windows
  • Background refresh: re-populate before TTL expires (refresh-ahead)
💡

Why It Matters

  • Cold cache = all requests hit DB (danger zone)
  • Especially critical for RDS — can cause DB overload
  • Warming reduces cold-start latency spikes
  • Plan warming as part of deployment runbook
Chapter Summary Introductory
  • Cache-Aside: app manages cache; read from cache, miss fetches from DB, write invalidates cache — most common
  • Write-Through: write to cache + DB together; cache always consistent; higher write latency
  • Write-Behind: write to cache, async to DB; fast writes but risk of data loss
  • Read-Through: cache fetches from DB on miss (app only talks to cache)
  • Cache Warming: pre-populate cache before traffic; avoids cold-start DB burst
  • Default choice: cache-aside for ElastiCache + RDS/Aurora
  • Exam: “lazy loading” = cache-aside; “DAX” = read-through/write-through for DynamoDB
05
Chapter Five

Scaling & Performance

Scaling ElastiCache — Vertical vs Horizontal Introductory

As your cache workload grows, you have two scaling options: vertical (bigger nodes) or horizontal (more nodes). The right choice depends on your engine (Redis vs Memcached) and workload pattern.

⬆️

Vertical Scaling (Scale Up)

  • Use larger node types (more RAM, CPU)
  • Simple — no code changes
  • Limited by largest available instance
  • Redis: online resize with a brief interruption; Memcached: create a new cluster with the new node type
  • Good for: moderate growth, single-node workloads
➡️

Horizontal Scaling (Scale Out)

  • Add more nodes to distribute load
  • Data is sharded across nodes
  • Near-linear scalability
  • More complex — need cluster mode (Redis) or client sharding (Memcached)
  • Good for: large datasets, high throughput
Redis Cluster Mode — Sharding for Scale Core

Redis Cluster Mode enables horizontal scaling by partitioning data across multiple shards. Each shard has a primary node and optional replicas for high availability.

🗃️

Cluster Mode Enabled

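  • Data sharded across 1–500 shards (Redis 5.0.6 or later)
  • Each shard: 1 primary + up to 5 replicas
  • Total: up to 500 nodes per cluster, e.g. 500 shards with no replicas, or 83 shards with 1 primary + 5 replicas each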
  • Keys distributed by hash slots (16,384 total)
  • Automatic rebalancing when adding shards
  • Multi-AZ with automatic failover per shard
📦

Cluster Mode Disabled

  • Single shard (no sharding)
  • 1 primary + up to 5 replicas
  • Simpler — all data on one node
  • Limited by single node's memory
  • Replicas for read scaling and HA
  • Good for: smaller datasets, simpler ops
Redis Cluster Mode — data sharded across multiple shards, each with primary + replicas
Diagram: Shard 1 (slots 0–5460): primary in AZ-a, replicas in AZ-b and AZ-c. Shard 2 (slots 5461–10922): primary in AZ-b, replicas in AZ-a and AZ-c. Shard 3 (slots 10923–16383): primary in AZ-c, replicas in AZ-a and AZ-b. 3 shards × 3 nodes each = 9 nodes, Multi-AZ HA, 16,384 hash slots distributed. Keys are placed by hash slot, not by prefix, so keys such as user:* or product:* spread across all shards.
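
How a key maps to one of the 16,384 hash slots can be sketched in a few lines. Redis Cluster computes CRC16 of the key modulo 16384, and if the key contains a non-empty `{...}` hash tag only the tag is hashed. The sketch assumes Python's `binascii.crc_hqx` with an initial value of 0 matches the CRC16 variant Redis uses; `key_slot` is a name made up for this example.

```python
import binascii

HASH_SLOTS = 16384


def key_slot(key: bytes) -> int:
    """Map a key to a hash slot: CRC16(key) mod 16384, honoring {hash tags}."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end > start + 1:  # non-empty tag: hash only the text inside the braces
            key = key[start + 1:end]
    return binascii.crc_hqx(key, 0) % HASH_SLOTS


print(key_slot(b"foo"))  # 12182
# Keys that share a hash tag land in the same slot, so they live on the same shard.
print(key_slot(b"{user:1}:profile") == key_slot(b"{user:1}:cart"))  # True
```

Hash tags matter in cluster mode because multi-key commands and transactions only work when every key involved hashes to the same slot.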
Memcached Scaling — Client-Side Sharding Core

Memcached doesn't have built-in clustering. Scaling is done via client-side sharding — your application (or client library) hashes keys to determine which node to use.

🖥️

How It Works

  • Application hashes the key
  • Hash determines which node stores/retrieves
  • Nodes are independent (no communication)
  • Add nodes = update client config + rehash
⚠️

Limitations

  • Node failure = data loss for that shard
  • No automatic failover
  • Adding nodes causes cache misses (rehashing)
  • No replication
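
The rehashing cost of adding a node can be estimated with a short experiment. The sketch below uses plain modulo hashing and hypothetical node names; many Memcached client libraries use consistent hashing instead, precisely to shrink the share of keys that move.

```python
import hashlib
from typing import List


def node_for(key: str, nodes: List[str]) -> str:
    """Pick a node with simple modulo hashing: hash(key) mod node count."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def fraction_remapped(keys: List[str], before: List[str], after: List[str]) -> float:
    """Share of keys whose owning node changes when the node list changes."""
    moved = sum(1 for k in keys if node_for(k, before) != node_for(k, after))
    return moved / len(keys)


keys = [f"product:{i}" for i in range(10_000)]
three_nodes = ["cache-1", "cache-2", "cache-3"]  # hypothetical node names
four_nodes = three_nodes + ["cache-4"]

share = fraction_remapped(keys, three_nodes, four_nodes)
print(f"{share:.0%} of keys now map to a different node")
```

With modulo hashing, going from 3 to 4 nodes is expected to move roughly three quarters of the keys, and each moved key is a cache miss until it is fetched from the database again.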
Redis Replication — Read Scaling & HA Core

Read Replicas in Redis serve two purposes: offload read traffic (scale reads) and provide failover targets (high availability).

🔄

Read Replicas

  • Up to 5 replicas per shard
  • Async replication from primary
  • Route reads to replicas → reduce primary load
  • Slightly stale (milliseconds lag)
  • Use for: read-heavy workloads
🛡️

Multi-AZ Failover

  • Replicas in different AZs
  • Primary fails → replica promoted automatically
  • DNS endpoint updated → app reconnects
  • Failover time: ~1 minute
  • Use for: production workloads
Redis Global Datastore — Multi-Region Replication Advanced

Redis Global Datastore replicates your Redis cluster across multiple AWS regions. It provides a fast, globally distributed cache with typically <1 second replication lag, and if the primary region fails you can promote a secondary region to take over.

🌐

How Global Datastore Works

  • Active-passive: writes go to primary region only
  • Secondary regions replicate for reads only
  • Typical replication lag: <1 second
  • Up to 2 secondary regions
  • Failover: promote a secondary to primary (you initiate it; cross-region promotion is not automatic)
🛡️

Use Cases

  • Disaster recovery across regions
  • Global read scaling (low-latency reads near users)
  • Cross-region session replication
  • Exam: “multi-region Redis” → Global Datastore
  • Note: writes still single-region (not active-active)
Redis Backup & Restore Core
📸

Automated Snapshots

  • Daily backups in configured backup window
  • Retention: 1–35 days
  • Stored in S3 (AWS-managed)
  • Backups can add CPU and memory overhead; prefer taking them from a replica
  • Enabled when the backup retention period is set above 0
💾

Manual Snapshots & Restore

  • On-demand, retained indefinitely
  • Restore creates a new cluster (no in-place restore)
  • Supports cross-region copy for DR
  • Memcached: no backup/restore support
  • Exam: “Redis backup” = manual or automated snapshots
ElastiCache Serverless Core

ElastiCache Serverless (launched in late 2023) removes capacity planning. There are no node types to choose and no cluster sizing; it scales automatically based on demand and charges for data stored plus the ElastiCache Processing Units (ECPUs) your requests consume.

Serverless vs Provisioned

  • Serverless: no capacity planning, auto-scales, pay for data stored and ECPUs consumed
  • Provisioned: choose node types, manual scaling, pay per node-hour
  • Serverless: good for unpredictable workloads
  • Provisioned: good for predictable, high-throughput workloads
  • Exam: “serverless Redis” → ElastiCache Serverless
💡

When to Use

  • New applications with unknown traffic
  • Spiky / unpredictable workloads
  • Dev/test environments
  • Cost optimization for low-traffic periods
  • Not yet available in every region; check regional availability
Performance Tuning Advanced

Connection Pooling

Reuse connections instead of creating new ones. Each connection uses memory. Pool size = expected concurrent requests.

Redis Pipelining

Send multiple commands without waiting for responses. Reduces N round-trips to 1. Example: 100 commands = 1 RTT instead of 100.

Key Design

Prefix keys by entity type. Keep keys short. Avoid large values (>1MB). Use hashes for related fields.
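
The round-trip saving from pipelining can be estimated with simple arithmetic. The model below is deliberately crude, and the 1 ms round trip and 0.01 ms of server time per command are assumed example values, not measurements.

```python
def total_time_ms(commands: int, rtt_ms: float, per_command_ms: float,
                  batch_size: int = 1) -> float:
    """Estimated wall time: one network round trip per batch plus server time per command."""
    batches = -(-commands // batch_size)  # ceiling division
    return batches * rtt_ms + commands * per_command_ms


# 100 commands sent one at a time: 100 round trips.
print(total_time_ms(100, rtt_ms=1.0, per_command_ms=0.01))                  # 101.0
# The same 100 commands in one pipeline: 1 round trip.
print(total_time_ms(100, rtt_ms=1.0, per_command_ms=0.01, batch_size=100))  # 2.0
```

Under these assumptions the network round trips dominate, which is why batching commands helps far more than shaving server-side time.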

📊 CloudWatch Metrics to Monitor

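  • CPUUtilization: CPU for the whole host; on multi-core nodes it can look low even when Redis is saturated
  • EngineCPUUtilization: CPU used by the Redis engine thread; the better bottleneck signal because Redis is mostly single-threaded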
  • CacheHitRate — target 80%+; low = bad key design or wrong TTL
  • Evictions — items removed due to memory pressure; high = need more memory
  • CurrConnections — current client connections; too high = need connection pooling
  • ReplicationLag — delay between primary and replicas; high = replication issues
🧠 Key Insight

Redis Cluster Mode is for horizontal scaling — shard data across multiple primaries. Replicas are for read scaling and HA within each shard. Global Datastore extends Redis across regions for DR and global reads. Serverless removes capacity planning entirely. Memcached scales via client-side sharding but has no failover, no backup, and no HA.

Chapter Summary Introductory
  • Vertical scaling: bigger nodes; simple but limited; resizing can briefly interrupt connections (Memcached needs a new cluster)
  • Horizontal scaling: more nodes; data sharded; near-linear scalability
  • Redis Cluster Mode: 1–500 shards, 16,384 hash slots, automatic rebalancing
  • Redis Replicas: up to 5 per shard; read scaling + Multi-AZ failover
  • Global Datastore: cross-region Redis replication; active-passive; <1s lag
  • Backup/Restore: automated daily + manual snapshots; restore creates new cluster
  • ElastiCache Serverless: no capacity planning; auto-scales; pay for data stored and ECPUs consumed
  • Memcached sharding: client-side only; no replication; node failure = data loss
  • Key metrics: CPUUtilization, CacheHitRate, Evictions, ReplicationLag
06
Chapter Six

Security & Networking

ElastiCache Security Model Introductory

ElastiCache runs inside your VPC — it has no public endpoint by default. Security is layered: network isolation (VPC/subnets), access control (Security Groups), encryption (in-transit and at-rest), and authentication (Redis AUTH / IAM).

👉 Key principle: ElastiCache is a VPC-only service. There is no public internet access. Your application must be in the same VPC (or connected via VPC peering/Transit Gateway) to reach ElastiCache.

VPC & Subnet Placement Core
🖧

VPC Configuration

  • ElastiCache cluster lives in your VPC
  • Choose subnet group (collection of subnets)
  • Subnets should be private (no internet gateway)
  • For Multi-AZ, use subnets in different AZs
  • No public IP — access via private IP only
🛡️

Best Practices

  • Place in private subnets only
  • Use dedicated subnets for ElastiCache
  • CIDR should have enough IPs for nodes
  • Spread across 2+ AZs for HA (Redis replicas)
  • Same VPC as your application servers
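
To check that a subnet's CIDR has enough addresses for the planned nodes, the standard-library `ipaddress` module is enough. AWS reserves five addresses in every subnet; the helper names and the headroom of two spare addresses are assumptions for illustration.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # AWS reserves the first four addresses and the last one

def usable_ips(cidr: str) -> int:
    """Addresses left for cache nodes and other resources in a subnet."""
    network = ipaddress.ip_network(cidr, strict=True)
    return max(network.num_addresses - AWS_RESERVED_PER_SUBNET, 0)

def fits(cidr: str, planned_nodes: int, headroom: int = 2) -> bool:
    """True if the subnet can hold the planned nodes plus spare addresses
    for node replacement or scaling (the headroom value is an assumption)."""
    return usable_ips(cidr) >= planned_nodes + headroom
```

A /28 subnet has 16 addresses, so 11 remain usable after the AWS reservation. Six nodes fit with headroom; ten do not.
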
Security Groups — Network Access Control Core

Security Groups control which traffic can reach your ElastiCache cluster. Only allow inbound connections from your application's security group on the cache port.

Security Group configuration — only allow traffic from app tier SG
App Servers (SG: app-sg, outbound: all) → ElastiCache (SG: cache-sg, inbound: 6379 from app-sg)
Redis port: 6379 • Memcached port: 11211 • Only allow from app SG, not 0.0.0.0/0
Rule | Type | Port | Source
✔ Good | Inbound | 6379 | sg-app-servers (app tier SG)
✔ Good | Inbound | 6379 | 10.0.0.0/16 (VPC CIDR)
✗ Bad | Inbound | 6379 | 0.0.0.0/0 (never do this)
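
A rule review like the one in this table can be automated. The sketch below is a pure-Python check, not an AWS API call: the `IngressRule` type, the `audit_rule` function, and the 10.0.0.0/16 VPC range are hypothetical. It also makes one choice of its own: the VPC-wide CIDR, which the table accepts, is reported separately as "broad" because it admits every host in the VPC rather than only the app tier.

```python
import ipaddress
from dataclasses import dataclass

@dataclass(frozen=True)
class IngressRule:
    port: int
    source: str  # either a security group ID ("sg-...") or a CIDR block

def audit_rule(rule: IngressRule, vpc_cidr: str = "10.0.0.0/16") -> str:
    """Classify an inbound rule as 'good', 'broad', or 'bad'."""
    if rule.source.startswith("sg-"):
        return "good"   # scoped to members of another security group
    source = ipaddress.ip_network(rule.source)
    if source.prefixlen == 0:
        return "bad"    # 0.0.0.0/0 or ::/0 matches any address
    if source.version == 4 and source.subnet_of(ipaddress.ip_network(vpc_cidr)):
        return "broad"  # inside the VPC, but wider than the app tier
    return "bad"        # a range outside the VPC
```

Running it over the three rows of the table yields "good", "broad", and "bad" in that order.
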
Encryption — At Rest & In Transit Core
🔒

Encryption at Rest

  • Data encrypted on disk (backups, snapshots)
  • Uses AWS KMS keys
  • Choose AWS-managed key or customer-managed CMK
  • Must enable at cluster creation — cannot add later
  • Supported: Redis only (not Memcached)
🔐

Encryption in Transit

  • Data encrypted between app and cache (TLS)
  • Prevents eavesdropping on network
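  • Enable at cluster creation; existing clusters on Redis 7.0+ can also turn it on later
  • Adds TLS overhead (~10–20% latency is a rough estimate; measure with your own workload)
  • Supported: Redis, and Memcached 1.6.12 or later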

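👉 Exam tip: Encryption at rest is a Redis feature; node-based Memcached clusters do not support it. In-transit encryption (TLS) is available for Redis and for Memcached 1.6.12 or later, but exam questions that mention encryption requirements usually point to Redis. Enable at-rest encryption at cluster creation, because it cannot be added to an existing cluster.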

Authentication — Redis AUTH & IAM Core
🔑

Redis AUTH Token

  • Password-based authentication
  • Set AUTH token at cluster creation
  • Client must provide token to connect
  • Requires in-transit encryption enabled
  • Token: 16–128 characters
👤

IAM Authentication (Redis 7+)

  • Authenticate using IAM users/roles
  • No static passwords to manage
  • Integration with IAM policies
  • Requires Redis 7.0+ and encryption
  • Best for: Lambda, ECS, modern apps
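
For the AUTH token option, a token within the 16–128 character range can be generated with Python's `secrets` module. This is a sketch under assumptions: the function names and the default length of 64 are illustrative, and the alphabet is restricted to letters and digits to stay conservative, since the service applies its own rules about which special characters are allowed.

```python
import secrets
import string

MIN_LEN, MAX_LEN = 16, 128                       # length limits for the AUTH token
ALPHABET = string.ascii_letters + string.digits  # alphanumeric only (conservative assumption)

def generate_auth_token(length: int = 64) -> str:
    """Create a random token using a cryptographically secure generator."""
    if not MIN_LEN <= length <= MAX_LEN:
        raise ValueError(f"length must be between {MIN_LEN} and {MAX_LEN}")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def is_valid_length(token: str) -> bool:
    """Check only the length rule; the service may enforce further character rules."""
    return MIN_LEN <= len(token) <= MAX_LEN
```

Store the generated token in a secrets manager rather than in source code or plain-text configuration.
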
Concept Diagram — ElastiCache Security Layers Core
Security layers — VPC, Security Groups, Encryption, Authentication
VPC (network isolation) → private subnet: App Server (EC2 / Lambda, SG: app-sg) → TLS + AUTH → ElastiCache Redis (SG: cache-sg; 🔒 encryption at rest • 🔐 TLS in transit • 🔑 AUTH)
Security layers: 1. VPC isolation • 2. Private subnet • 3. Security Group • 4. Encryption + AUTH
🧠 Key Insight

ElastiCache is VPC-only: no public access. Security layers: VPC isolation, private subnets, Security Groups (allow only app tier), encryption (at-rest + in-transit for Redis), and authentication (AUTH token or IAM). Memcached has no at-rest encryption, and TLS in transit requires Memcached 1.6.12 or later. Exam tip: “encrypt cache data” = Redis with encryption at-rest + in-transit; “cache in private subnet” = correct architecture.

Chapter Summary Introductory
  • VPC-only: ElastiCache has no public endpoint; must be in VPC
  • Private subnets: place cache nodes in private subnets, spread across AZs
  • Security Groups: allow inbound only from app tier SG on port 6379 (Redis) or 11211 (Memcached)
  • Encryption at-rest: KMS-managed; Redis only; enable at creation
  • Encryption in-transit: TLS; Redis and Memcached 1.6.12+; adds some latency overhead
  • Authentication: Redis AUTH token or IAM (Redis 7+); Memcached has no auth
07
Chapter Seven

Architecture Patterns

Pattern 1 — RDS/Aurora + ElastiCache (Read-Heavy Apps) Core

The most common pattern: ElastiCache sits in front of your relational database, caching frequently-read data to reduce load and latency.

Pattern 1 — Cache-aside pattern with RDS/Aurora backend
Application (check cache first) → ElastiCache Redis (session + query cache) → RDS / Aurora (source of truth, complex queries)
Hit ratio 80-95% • DB load reduced 5-10× • Latency: cache <1ms, DB 1-10ms

When to Use

  • Read-heavy workload (80%+ reads)
  • Same data queried repeatedly
  • Database is a bottleneck
  • Need sub-ms response times

Implementation

  • Cache-aside (lazy loading) strategy
  • TTL based on data freshness needs
  • Invalidate on writes
  • Key pattern: entity:id
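
The implementation points above can be sketched as follows. To keep the example runnable without a cache server, `TTLCache` is an in-memory stand-in for a Redis client and a plain dict plays the database; both, along with the class names and the 300-second TTL, are assumptions for illustration.

```python
import time

class TTLCache:
    """In-memory stand-in for a Redis client: get, set with TTL, delete."""
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None or entry[1] <= self._clock():
            self._data.pop(key, None)  # drop expired entries lazily
            return None
        return entry[0]

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, self._clock() + ttl_seconds)

    def delete(self, key):
        self._data.pop(key, None)

class ProductRepository:
    """Cache-aside: read the cache first, fall back to the database on a miss."""
    def __init__(self, cache, database, ttl_seconds=300):
        self.cache, self.database, self.ttl = cache, database, ttl_seconds
        self.db_reads = 0  # counts how often the database was actually queried

    def get(self, product_id):
        key = f"product:{product_id}"              # entity:id key pattern
        cached = self.cache.get(key)
        if cached is not None:
            return cached                           # cache hit
        self.db_reads += 1
        value = self.database[product_id]           # cache miss: read the source of truth
        self.cache.set(key, value, self.ttl)        # populate the cache for later readers
        return value

    def update(self, product_id, value):
        self.database[product_id] = value           # write to the database first
        self.cache.delete(f"product:{product_id}")  # then invalidate the stale entry
```

Two consecutive `get(1)` calls reach the database once; after `update(1, ...)` the next read misses, reloads the new value, and caches it again. With a real Redis client the same flow would map to GET, SET with an expiry, and DEL.
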
Pattern 2 — DynamoDB + DAX (Serverless Caching) Core

DAX (DynamoDB Accelerator) is a purpose-built cache for DynamoDB. Unlike ElastiCache, DAX is API-compatible with DynamoDB: swap in the DAX SDK client, point it at the DAX cluster endpoint, and reads are cached automatically with no cache logic in your code.

Pattern 2 — DAX provides transparent caching for DynamoDB
Lambda (uses DAX SDK, same DynamoDB API) → DAX (microsecond reads, write-through cache) → DynamoDB (source of truth, cache miss fallback)
Microsecond latency • Read-through + write-through • Cached reads are eventually consistent

👉 DAX vs ElastiCache for DynamoDB: Use DAX for DynamoDB because it is purpose-built, API-compatible, and handles cache management automatically. Use ElastiCache (Redis) when you need features DAX doesn't have (pub/sub, complex data types) or for non-DynamoDB data. DAX consistency note: Write-through only updates items already in cache; it does not proactively load uncached items. DAX serves only eventually consistent reads from its cache; strongly consistent reads are passed through to DynamoDB and the results are not cached.

Pattern 3 — Session Storage Core

Store user sessions in ElastiCache to enable stateless application servers. Any server can handle any request because session data is centralized.

Pattern 3 — Centralized session storage enables stateless app tier
ALB (no sticky sessions, round-robin) → EC2 instances (×4) → ElastiCache Redis (session store, TTL = session timeout)
Stateless servers • Horizontal scaling • No sticky sessions needed • Session TTL = 15-30 min

Benefits

  • Servers are stateless — easy to scale
  • No sticky sessions — better load distribution
  • Server failure doesn't lose sessions
  • TTL auto-expires old sessions

Implementation

  • Use Redis with replicas and Multi-AZ failover (for HA)
  • Key: session:<session_id>
  • Value: serialized session data (JSON)
  • TTL: match session timeout (e.g., 30 min)
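
A minimal sketch of this session layout. `SessionStore` is an in-memory stand-in for Redis so the example runs on its own; the class, its method names, and the sliding-expiry behaviour (each read renews the TTL) are assumptions rather than something the pattern requires.

```python
import json
import secrets
import time

SESSION_TTL_SECONDS = 30 * 60  # match the application's session timeout

class SessionStore:
    """In-memory stand-in for Redis keys of the form session:<session_id>."""
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (json_string, expires_at)

    def create(self, session_data: dict) -> str:
        session_id = secrets.token_hex(16)  # unguessable random identifier
        self._save(session_id, session_data)
        return session_id

    def load(self, session_id: str):
        key = f"session:{session_id}"
        entry = self._data.get(key)
        if entry is None or entry[1] <= self._clock():
            self._data.pop(key, None)   # expired or unknown: treat as logged out
            return None
        data = json.loads(entry[0])
        self._save(session_id, data)    # sliding expiry: each access renews the TTL
        return data

    def _save(self, session_id: str, session_data: dict) -> None:
        expires_at = self._clock() + SESSION_TTL_SECONDS
        self._data[f"session:{session_id}"] = (json.dumps(session_data), expires_at)
```

Because every application server talks to the same store, any server can load a session created by another. With Redis the TTL is enforced by the server, so expired sessions disappear without application code.
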
Pattern 4 — Leaderboards with Redis Sorted Sets Core

Redis sorted sets are perfect for leaderboards. Each member has a score; Redis keeps them sorted automatically. O(log N) insert/update, O(log N + M) for range queries.

🏆

How It Works

  • ZADD leaderboard 1500 "player1" — add/update score
  • ZREVRANK leaderboard "player1" — get rank from the top (0-based; ZRANK counts from the lowest score)
  • ZREVRANGE leaderboard 0 9 WITHSCORES — top 10, highest score first
  • ZRANGE returns ascending order (lowest score first)
  • Millions of players, instant rank lookup
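
The leaderboard semantics can be illustrated with a small in-memory stand-in for a sorted set. The `Leaderboard` class and its method names are hypothetical, and it re-sorts on every query, so it does not have the O(log N) cost of a real sorted set; it only mirrors what ZADD, ZREVRANGE, and ZREVRANK return.

```python
class Leaderboard:
    """In-memory stand-in for a Redis sorted set, highest score first."""
    def __init__(self):
        self._scores = {}  # member -> score

    def add(self, member: str, score: float) -> None:
        """Like ZADD: insert the member or overwrite its score."""
        self._scores[member] = score

    def _ranked(self):
        # Score descending; ties broken by member name descending,
        # mirroring the reverse-lexicographic tie order of ZREVRANGE.
        return sorted(self._scores.items(), key=lambda kv: (kv[1], kv[0]), reverse=True)

    def top(self, n: int):
        """Like ZREVRANGE key 0 n-1 WITHSCORES."""
        return self._ranked()[:n]

    def rank(self, member: str):
        """Like ZREVRANK: 0-based position from the top, or None if absent."""
        for position, (name, _) in enumerate(self._ranked()):
            if name == member:
                return position
        return None
```

After adding player1 (1500), player2 (2100), and player3 (1800), `top(2)` lists player2 then player3, and player1 has rank 2. Raising player1's score to 2500 moves it to rank 0 with no explicit re-sort by the caller.
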
🎮

Use Cases

  • Gaming leaderboards
  • Top sellers / trending products
  • Real-time analytics dashboards
  • Voting / rating systems
  • Activity feeds (sorted by time)
Pattern 5 — Real-Time Pub/Sub Advanced

Redis Pub/Sub enables real-time messaging between services. Publishers send to channels; subscribers receive instantly. No message persistence (fire-and-forget).

How Pub/Sub Works

  • SUBSCRIBE notifications — listen to channel
  • PUBLISH notifications "new order" — send message
  • All subscribers receive simultaneously
  • No message queue — missed = lost
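
The fire-and-forget behaviour is easy to see in a toy model. The `PubSub` class below is an in-memory stand-in, not a Redis client; its names are assumptions. Like Redis PUBLISH, `publish` returns the number of subscribers that received the message.

```python
from collections import defaultdict

class PubSub:
    """In-memory stand-in for Redis Pub/Sub: no storage, no replay."""
    def __init__(self):
        self._subscribers = defaultdict(list)  # channel -> list of callbacks

    def subscribe(self, channel: str, callback) -> None:
        self._subscribers[channel].append(callback)

    def publish(self, channel: str, message: str) -> int:
        """Deliver to whoever is subscribed right now; return how many received it."""
        receivers = self._subscribers.get(channel, [])
        for callback in receivers:
            callback(message)
        return len(receivers)

bus = PubSub()
inbox = []
lost = bus.publish("notifications", "order 1")       # nobody listening yet: message is gone
bus.subscribe("notifications", inbox.append)
delivered = bus.publish("notifications", "order 2")  # reaches the one subscriber
```

The first message is never seen by the late subscriber, which is why durable workflows belong in a queue such as SQS rather than in Pub/Sub.
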

Use Cases

  • Real-time notifications
  • Chat applications
  • Live updates (sports scores, stock prices)
  • Cache invalidation broadcast
Decision Guide — ElastiCache vs Alternatives Core
Requirement | Solution | Why
Cache for RDS/Aurora | ElastiCache Redis | General-purpose, flexible
Cache for DynamoDB | DAX | API-compatible, microseconds
Session storage | ElastiCache Redis | Persistence, TTL, HA
Leaderboards | ElastiCache Redis | Sorted sets
Simple cache, multi-threaded | ElastiCache Memcached | CPU efficiency
Message queue (durable) | SQS / SNS | Redis Pub/Sub is fire-and-forget
Exam Cheatsheet Core

🎯 Exam Keywords → ElastiCache Answer

  • “sub-millisecond latency” → ElastiCache (Redis or Memcached)
  • “reduce database load” → ElastiCache cache-aside pattern
  • “session storage, stateless servers” → ElastiCache Redis
  • “leaderboard, ranking” → Redis sorted sets
  • “DynamoDB caching, microsecond” → DAX (not ElastiCache)
  • “cache with persistence” → Redis (Memcached has no persistence)
  • “cache with Multi-AZ failover” → Redis (Memcached has no HA)
  • “encrypt cache data” → Redis with encryption at-rest + in-transit
  • “simple cache, multi-threaded” → Memcached
  • “Memcached node failure” → data lost permanently; no automatic recovery
  • “lazy loading” → cache-aside strategy
  • “cache always in sync” → write-through strategy
  • “cold start, first request miss” → cache warming / preloading
  • “multi-region Redis” → Global Datastore (active-passive, <1s lag)
  • “serverless Redis” → ElastiCache Serverless (no capacity planning)
  • “Redis backup” → automated snapshots + manual; restore creates new cluster
  • “atomic multi-key Redis operation” → MULTI/EXEC transaction
  • “LRU eviction” → least recently used (ElastiCache Redis default policy: volatile-lru, which evicts only keys that have a TTL)
  • “TTL expiration” → time-to-live controls cache freshness
  • “pub/sub messaging” → Redis (not durable; use SQS for durable)
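
For the "LRU eviction" keyword, a compact sketch of the idea: when the cache is full, the entry that was used longest ago is removed. The `LRUCache` class is an illustrative assumption and implements exact LRU, whereas Redis approximates LRU by sampling keys.

```python
from collections import OrderedDict

class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key when full."""
    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest use first, most recent use last
        self.evictions = 0

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)   # mark as most recently used
        return self._items[key]

    def set(self, key, value) -> None:
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry
            self.evictions += 1
```

With capacity 2, setting `a` and `b`, reading `a`, then setting `c` evicts `b`, because `a` was touched more recently.
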
🧠 Final Insight

ElastiCache is the AWS caching layer for reducing database load and latency. Redis is the default choice — it has persistence, HA, and rich data types. Use Memcached only for simple, ephemeral caching with multi-threaded needs. Use DAX for DynamoDB specifically. The cache is not a replacement for your database — it's a protective shield that absorbs repetitive read load.