Amazon Kinesis
LearningTree Β· AWS Β· Analytics

Amazon Kinesis β€”
Real-Time Streaming at Any Scale

Capture, process, and deliver streaming data in real time β€” from IoT sensors and clickstreams to application logs and financial transactions. Kinesis is the real-time counterpart to batch analytics with Athena and Glue.

01
Chapter One Β· Analytics

What is Amazon Kinesis?

Amazon Kinesis is a managed platform for real-time streaming data. It collects, processes, and delivers data records continuously β€” in milliseconds to seconds β€” rather than waiting for batch intervals of minutes or hours. Kinesis is an umbrella brand with multiple services, each solving a different streaming problem.

Batch vs Streaming β€” The Core Distinction Introductory
AspectBatch (Athena / Glue)Streaming (Kinesis)
Data deliveryCollected over hours, processed togetherProcessed record-by-record as it arrives
LatencyMinutes to hoursMilliseconds to seconds
Use casesReports, historical analysis, ad-hoc queriesReal-time dashboards, alerting, event processing
Data modelFiles in S3 (CSV, Parquet)Continuous stream of records (ordered sequence)
ScalingQuery-time (Athena scales per query)Capacity-based (shards provisioned ahead)
Cost modelPay per query / per TB scannedPay per shard-hour + per GB ingested
🧠 Mental Model β€” The River

Think of Kinesis as a river system. Data flows continuously from sources (streams). You can dip a bucket in the river at any point to read data (consumers). The river doesn't stop β€” it just keeps flowing. Batch analytics (Athena) is like analysing a lake after the river has filled it. Streaming analytics (Kinesis) is like analysing the water as it flows past you.

The Kinesis Family β€” Four Services Core
πŸ”΄

Kinesis Data Streams (KDS)

  • Core streaming service β€” capture and store data records
  • Real-time ingestion with ordered shards
  • YOU write consumer code (Lambda, KCL, custom)
  • Retention: 24 hours β†’ 365 days
  • Most flexible β€” full control over processing
πŸ”₯

Kinesis Data Firehose

  • Fully managed delivery to destinations
  • Auto-batches and delivers to S3, Redshift, OpenSearch, Splunk
  • No consumer code needed β€” zero administration
  • Near-real-time (60-second minimum buffer)
  • Built-in format conversion (JSON β†’ Parquet)
πŸ“Š

Kinesis Data Analytics

  • Run SQL or Apache Flink on streaming data
  • Real-time aggregations, windowed queries
  • Input from KDS or Firehose
  • Output to KDS, Firehose, Lambda
  • Now called "Managed Service for Apache Flink"
πŸ“Ή

Kinesis Video Streams

  • Capture and stream video from devices
  • Cameras, drones, dashcams, IoT video
  • Integrates with Rekognition for ML
  • Different use case β€” not covered here
Kinesis Data Streams vs Firehose β€” The Key Decision Core
DimensionKinesis Data StreamsKinesis Data Firehose
AdministrationYou manage shards, consumersFully managed β€” zero administration
Latency~200ms (real-time)60 seconds minimum (near-real-time)
Consumer codeRequired (Lambda, KCL, custom)Not needed β€” built-in delivery
DestinationsAnything (you write the code)S3, Redshift, OpenSearch, Splunk, HTTP
Data retention24h – 365 days (replay capable)None β€” delivers once and forgets
Replayβœ… Re-read from any point in retention window❌ No replay β€” fire and forget
ScalingManual shard splitting/merging (or on-demand mode)Auto-scales automatically
Format conversionNot built-inJSON β†’ Parquet/ORC built-in
Best forCustom real-time processing, multiple consumersSimple delivery to S3/analytics stores
Architecture Overview Core
Amazon Kinesis β€” Streaming Architecture Overview
PRODUCERS IoT sensors Click streams App logs Transactions Social feeds Game events DATA STREAMS Shards Β· Ordered ~200ms latency FIREHOSE Auto-batches 60s buffer β†’ deliver Lambda Real-time process KCL App / Flink Custom consumers S3 (Parquet) Redshift OpenSearch Alerts / DynamoDB Analytics β†’ Athena / QuickSight Data Streams = real-time custom processing Β· Firehose = managed delivery to storage
Producers β†’ Kinesis (Data Streams for real-time, Firehose for delivery) β†’ Consumers / Destinations
Real-World Use Cases Core
πŸ“Š

Real-Time Dashboards

  • Live metrics from web applications
  • Gaming leaderboards updated per second
  • Trading floor price displays
  • Ad impression counting
🚨

Anomaly Detection

  • Fraud detection on transactions
  • Security threat detection
  • DDoS traffic spike alerting
  • Healthcare vitals monitoring
πŸ“‹

Log & Event Pipelines

  • Application log centralisation
  • Clickstream collection β†’ S3 for analytics
  • IoT sensor data ingestion
  • Microservice event streaming
🎯 Exam Insight
  • "Real-time data ingestion" or "streaming" β†’ Kinesis Data Streams
  • "Deliver streaming data to S3 automatically" β†’ Kinesis Firehose
  • "Near-real-time, zero administration delivery" β†’ Firehose (not Data Streams)
  • "Custom real-time processing with multiple consumers" β†’ Data Streams + Lambda/KCL
  • "Replay streaming data" β†’ Data Streams (Firehose cannot replay)
  • "Convert JSON to Parquet during delivery" β†’ Firehose (built-in format conversion)
  • Data Streams = you manage consumers. Firehose = fully managed delivery.
Chapter 01 β€” Key Takeaway

Kinesis is AWS's real-time streaming platform. Data Streams gives you full control with ordered shards and custom consumers (~200ms latency). Firehose is zero-admin delivery to S3/Redshift/OpenSearch (60s buffer). Use Data Streams when you need real-time processing, replay, or multiple consumers. Use Firehose when you just need to land streaming data in a destination with no code.

02
Chapter Two Β· Analytics

Kinesis Data Streams β€” Deep Dive

Kinesis Data Streams (KDS) is the core real-time ingestion service. It captures data records into an ordered, durable log partitioned across shards. You control capacity, retention, and processing logic. It's the foundation for any custom real-time architecture on AWS.

Core Concepts β€” Shards, Records, Partition Keys Core
ConceptWhat It IsKey Detail
StreamA named channel for data recordsContains 1+ shards; you choose shard count
ShardA unit of capacity β€” ordered sequence of records1 MB/s write, 2 MB/s read per shard
RecordA single data blob (up to 1 MB)Contains: partition key + sequence number + data blob
Partition KeyDetermines which shard a record goes to (hashed)Same key = same shard = guaranteed ordering
Sequence NumberUnique ID assigned by Kinesis per record per shardMonotonically increasing within a shard
RetentionHow long records stay in the streamDefault 24h, extendable to 7 days (or 365 days)
Shard Model β€” Capacity Planning Core
🧠 Mental Model β€” Lanes on a Highway

Think of a Kinesis stream as a highway. Each shard is one lane. More lanes (shards) = more cars (records) can flow simultaneously. Each lane has a speed limit: 1 MB/s in, 2 MB/s out. If you need more throughput, add more lanes. Records with the same partition key always use the same lane (ordering preserved).

πŸ“₯

Write Capacity (per shard)

  • 1 MB/second ingestion
  • 1,000 records/second
  • Each record max 1 MB
  • Exceed β†’ ProvisionedThroughputExceededException
  • Fix: add more shards or use on-demand mode
πŸ“€

Read Capacity (per shard)

  • 2 MB/second shared across all consumers
  • Or 2 MB/second per consumer with Enhanced Fan-Out
  • 5 GetRecords calls/second per shard (shared mode)
  • Multiple consumers compete for read bandwidth
  • Fix: Enhanced Fan-Out (dedicated throughput per consumer)
Capacity Modes β€” Provisioned vs On-Demand Core
ModeHow It WorksCostBest For
ProvisionedYou specify exact shard count; manually scale$0.015/shard-hour + $0.014/million PUTPredictable, steady traffic with known throughput
On-DemandAuto-scales shards based on traffic (up to 200 MB/s)$0.04/GB ingested + $0.04/GB retrievedUnpredictable, bursty, or new workloads
πŸ’‘ On-Demand vs Provisioned Decision

On-Demand is simpler (no shard math) but ~2Γ— more expensive at steady loads. Choose Provisioned when you know your throughput and want to optimise cost. Choose On-Demand when traffic is unpredictable, you're starting a new stream, or you want zero capacity planning.

Partition Keys & Ordering Core

The partition key is the most important design decision in Kinesis. It determines which shard receives each record:

βœ…

Good Partition Key Design

  • High cardinality (many unique values)
  • Evenly distributed across shards
  • Example: user_id, device_id, session_id
  • Same key = same shard = ordering preserved per entity
❌

Bad Partition Key Design

  • Low cardinality (few unique values)
  • Creates "hot shards" β€” one shard overloaded
  • Example: country (only ~200 values, US gets 80% of traffic)
  • Hot shard β†’ throttling even with capacity available
Enhanced Fan-Out β€” Dedicated Consumer Throughput Deep
AspectShared (Classic)Enhanced Fan-Out (EFO)
Read throughput2 MB/s shared across ALL consumers per shard2 MB/s dedicated PER consumer per shard
DeliveryConsumer polls (pull-based)Push-based (SubscribeToShard API)
Latency~200ms (polling interval)~70ms (push, lower latency)
CostIncluded in base shard priceAdditional $0.015/consumer-shard-hour + $0.013/GB
Best for1–2 consumers, cost-sensitive3+ consumers, latency-sensitive, isolation needed
Producers β€” How to Put Data In Core
Producer MethodWhen to UseKey Detail
AWS SDK (PutRecord / PutRecords)Simple applications, low-to-medium throughputPutRecords batches up to 500 records per call
Kinesis Producer Library (KPL)High-throughput production workloadsAuto-batches, retries, aggregates small records
Kinesis AgentStream log files from EC2 instancesDaemon that tails files β†’ writes to stream
CloudWatch Logs subscriptionStream CloudWatch Logs to KinesisBuilt-in integration, no code needed
IoT Core rule actionIoT devices β†’ Kinesis directlyRule routes MQTT messages to stream
Shard Diagram β€” How Data Flows Core
Kinesis Data Streams β€” Shards, Partition Keys & Ordering
PRODUCERS user_001 β†’ key=A user_002 β†’ key=B user_001 β†’ key=A user_003 β†’ key=C user_002 β†’ key=B user_001 β†’ key=A MD5 HASH SHARD-1 key=A: rec1 β†’ rec3 β†’ rec6 (ordered) SHARD-2 key=B: rec2 β†’ rec5 (ordered) SHARD-3 key=C: rec4 (ordered) CONSUMERS Lambda KCL App Flink 1 worker per shard (KCL pattern) Same partition key β†’ same shard β†’ guaranteed order within that key
Partition keys are hashed to determine shard placement β€” records with the same key are always ordered within one shard
🎯 Exam Insight
  • "Ordered processing per user/device" β†’ use user_id/device_id as partition key (same key = same shard = ordered)
  • "ProvisionedThroughputExceededException" β†’ hot shard or insufficient shards. Fix: better partition key or add shards
  • "1 MB/s write, 2 MB/s read per shard" β€” memorise these limits
  • "Multiple consumers need dedicated throughput" β†’ Enhanced Fan-Out
  • "On-demand mode" β†’ auto-scales, no shard math, ~2Γ— cost of provisioned for steady traffic
  • "Replay data from 3 days ago" β†’ extend retention to 7 days (or 365 for long-term)
  • "High-throughput producer" β†’ Kinesis Producer Library (KPL) with aggregation + batching
Chapter 02 β€” Key Takeaway

Kinesis Data Streams is built on shards β€” each shard handles 1 MB/s in and 2 MB/s out. Partition keys determine which shard receives each record, guaranteeing order per key. Choose provisioned mode for cost at steady load, on-demand for simplicity with bursty traffic. Enhanced Fan-Out gives each consumer dedicated 2 MB/s per shard with push-based delivery at ~70ms latency.

03
Chapter Three Β· Analytics

Kinesis Data Firehose β€” Managed Delivery

Kinesis Data Firehose is the zero-administration delivery pipe. You point it at a source and a destination β€” it handles batching, compression, format conversion, and delivery automatically. No shards to manage, no consumer code, no capacity planning. Near-real-time with a 60-second minimum buffer.

What Firehose Does Automatically Core
πŸ“¦

Batching

  • Buffers records by time (60–900s) or size (1–128 MB)
  • Delivers one file per batch to S3
  • Reduces number of S3 objects dramatically
  • Configurable buffer hints
πŸ”„

Format Conversion

  • JSON β†’ Parquet (most common)
  • JSON β†’ ORC
  • Uses Glue Catalog schema for conversion
  • Zero-code columnar conversion
πŸ—œοΈ

Compression

  • GZIP, Snappy, ZIP, or uncompressed
  • Applied after format conversion
  • Reduces storage cost in S3
  • Parquet has built-in Snappy
Supported Destinations Core
DestinationUse CaseKey Detail
Amazon S3Data lake landing, archive, analytics with AthenaMost common. Supports Parquet conversion. Partition by date.
Amazon RedshiftLoad streaming data into data warehouseFirehose first writes to S3, then COPY's into Redshift
Amazon OpenSearchReal-time log search and dashboardsDelivers JSON; good for Kibana/Dashboards
SplunkEnterprise log management (SIEM)HEC (HTTP Event Collector) endpoint
HTTP EndpointCustom API, Datadog, New Relic, Sumo LogicAny HTTPS endpoint that accepts batched POSTs
Kinesis Data StreamsFan-out: KDS β†’ Firehose β†’ S3 (common pattern)Firehose consumes from an existing KDS stream
Firehose Flow Diagram Core
Kinesis Firehose β€” Delivery Pipeline
SOURCE Direct PUT or KDS stream Ξ» TRANSFORM (optional) enrich / filter BUFFER 60–900 sec 1–128 MB CONVERT JSON β†’ Parquet + compress S3 Parquet files partitioned BACKUP Failed records β†’ S3 backup Source β†’ (Lambda) β†’ Buffer β†’ Convert β†’ Deliver β†’ Backup failures
Firehose handles the entire pipeline: optional transform, buffering, format conversion, delivery, and error backup
Lambda Transformation β€” Enrichment in Flight Deep

Firehose can invoke a Lambda function on each batch before delivery. The Lambda receives records, transforms them (add fields, filter, parse), and returns them to Firehose for delivery. Common uses:

  • Parse raw log lines into structured JSON
  • Add enrichment data (e.g. GeoIP lookup on IP addresses)
  • Filter out unwanted records (reduce storage cost)
  • Normalize timestamps across time zones
Firehose + Glue Catalog β€” Format Conversion Core
πŸ’‘ The Parquet Pipeline (Most Common Pattern)

Configure Firehose with a Glue Catalog table schema β†’ Firehose converts incoming JSON records to Parquet automatically. The output files land in S3 as Parquet, immediately queryable by Athena. This is the lowest-effort way to build a streaming data lake.

Key Firehose Limits Core
LimitValueNote
Minimum buffer interval60 secondsCannot deliver faster than 1 min (not true real-time)
Maximum record size1 MBSame as Data Streams
Maximum buffer size128 MBDelivers when buffer fills OR interval expires (whichever first)
Lambda transform timeout5 minutes maxLambda receives up to 6 MB of records per invocation
Retry on failureUp to 24 hours (S3), varies per destinationUndeliverable records β†’ S3 backup bucket
🎯 Exam Insight
  • "Deliver streaming data to S3 with no code" β†’ Kinesis Firehose
  • "Convert JSON to Parquet during delivery" β†’ Firehose + Glue Catalog schema
  • "60-second minimum latency" β†’ Firehose is NEAR-real-time, not true real-time
  • "Transform records in flight before delivery" β†’ Firehose + Lambda transformation
  • "Firehose delivers to Redshift" β†’ actually writes to S3 first, then COPY into Redshift
  • "Failed delivery records" β†’ automatically backed up to an S3 error bucket
  • "No shards, no capacity planning" β†’ Firehose auto-scales (unlike Data Streams provisioned)
  • "Firehose vs Data Streams for replay?" β†’ Only Data Streams supports replay. Firehose delivers once.
Chapter 03 β€” Key Takeaway

Kinesis Firehose is the fully-managed delivery service β€” zero shards, zero consumer code, auto-scaling. It buffers records (60s–15min), optionally transforms via Lambda, converts to Parquet using Glue Catalog, compresses, and delivers to S3/Redshift/OpenSearch. The 60-second minimum buffer means it's near-real-time, not true real-time. Use it when you just want data to land in a destination with minimal effort.

04
Chapter Four Β· Analytics

Consumers & Processing

Consumers read records from Kinesis Data Streams and process them. AWS offers multiple consumer options β€” from serverless (Lambda) to fully custom (KCL applications). The choice depends on latency requirements, processing complexity, and operational preference.

Consumer Options Comparison Core
ConsumerHow It WorksBest ForScaling
AWS LambdaEvent source mapping polls shards, invokes function per batchServerless, simple transforms, <15 min processing1 Lambda invocation per shard (parallel = shard count)
KCL (Kinesis Client Library)Java/Python library manages shard leases, checkpoints on EC2/ECSLong-running, complex processing, custom checkpointing1 worker per shard; library handles rebalancing
Managed Apache FlinkStream processing with SQL or Java/Python (ex-Kinesis Data Analytics)Windowed aggregations, joins, complex event processingManaged scaling based on throughput
Kinesis FirehoseConsume from KDS and deliver to S3/Redshift/OpenSearchSimple delivery to storage (no custom logic)Auto-scales, no management
SDK (GetRecords)Direct API polling in custom applicationSimple scripts, testing, one-off readsManual β€” you manage everything
Lambda as Consumer β€” The Serverless Pattern Core
βœ…

Advantages

  • Zero infrastructure β€” no EC2, no containers
  • Auto-polls shards, manages checkpoints
  • Scales to 1 concurrent invocation per shard
  • Built-in retry with bisect-on-error
  • Sends failed batches to DLQ (SQS or SNS)
⚠️

Limitations

  • 15-minute max execution time
  • One invocation per shard (parallelization factor configurable up to 10)
  • Cold starts add latency
  • Memory limited to 10 GB
  • Not ideal for stateful processing
KCL β€” Custom Consumer at Scale Deep

The Kinesis Client Library (KCL) is a Java library (with Python/Node wrappers) that handles shard management for you:

  • Lease management β€” assigns shards to workers via DynamoDB lease table
  • Checkpointing β€” tracks last processed sequence number per shard
  • Rebalancing β€” when workers are added/removed, shards are redistributed
  • Shard splits/merges β€” KCL handles resharding transparently
πŸ’‘ KCL Rule of Thumb

Number of KCL workers should equal number of shards for maximum parallelism. More workers than shards = some workers idle. Fewer workers = some workers process multiple shards (fine, just slower).

Error Handling Strategies Core
StrategyHow It WorksUse Case
Retry entire batchKinesis retries the failed batch (default for Lambda)Transient errors that resolve on retry
Bisect on errorSplit failed batch in half, retry each halfIdentify and isolate the poisonous record
Skip & continueMove checkpoint past the bad recordNon-critical records where data loss is acceptable
Dead-letter queueSend failed records to SQS DLQ after max retriesPreserve failed records for later investigation
Max retry attemptsFail after N retries, move to next batchPrevent infinite blocking on bad records
🎯 Exam Insight
  • "Serverless consumer for Kinesis" β†’ Lambda with event source mapping
  • "1 Lambda per shard" β€” default parallelism (configurable up to 10 with parallelization factor)
  • "Checkpoint progress" β†’ KCL uses DynamoDB table for lease/checkpoint tracking
  • "Poison pill record blocking the stream" β†’ enable bisect-on-error + DLQ
  • "Real-time SQL on streaming data" β†’ Managed Apache Flink (formerly Kinesis Data Analytics)
  • "Fan-out: KDS β†’ multiple destinations" β†’ KDS β†’ Lambda + KDS β†’ Firehose (parallel consumers)
Chapter 04 β€” Key Takeaway

Lambda is the simplest consumer (serverless, auto-checkpoint, DLQ support) but limited to 15 min and 1 invocation per shard. KCL gives full control with DynamoDB-based checkpointing for long-running EC2/ECS apps. Flink is for complex stream SQL (windowed aggregations, joins). Always configure error handling β€” bisect-on-error + DLQ prevents one bad record from blocking your entire stream.

05
Chapter Five Β· Analytics

Architecture Patterns

Kinesis rarely operates alone. These are the four most common production patterns β€” from simple log delivery to real-time analytics pipelines combining multiple AWS services.

Pattern 1 β€” Streaming Data Lake (Most Common) Core

The most popular Kinesis pattern: stream data, convert to Parquet, land in S3, query with Athena.

Streaming Data Lake β€” Firehose β†’ S3 β†’ Athena
PRODUCERS App / IoT / Logs FIREHOSE β†’ Parquet S3 Data Lake GLUE Catalog ATHENA SQL BI QuickSight Producers β†’ Firehose (Parquet) β†’ S3 β†’ Glue Catalog β†’ Athena β†’ QuickSight
Pattern 2 β€” Real-Time Processing + Archive Core

Stream to KDS for real-time processing (Lambda) while simultaneously archiving to S3 via Firehose. Best of both worlds.

⚑

Real-Time Path

  • KDS β†’ Lambda β†’ DynamoDB / alerts / API
  • Sub-second processing
  • Fraud detection, live metrics, notifications
πŸ’Ύ

Archive Path (same stream)

  • KDS β†’ Firehose β†’ S3 (Parquet)
  • Near-real-time archival for analytics
  • Athena queries on historical data
Pattern 3 β€” Fan-Out to Multiple Consumers Core

One KDS stream β†’ multiple independent consumers, each doing different work. With Enhanced Fan-Out, each gets dedicated 2 MB/s per shard.

ConsumerPurpose
Lambda #1Real-time anomaly detection β†’ SNS alerts
Lambda #2Update DynamoDB counters (live dashboard)
FirehoseArchive to S3 data lake
FlinkWindowed aggregations β†’ Redshift
Pattern 4 β€” Kinesis vs SQS vs EventBridge Core
RequirementBest ServiceWhy
Real-time streaming, ordering, replayKinesis Data StreamsOrdered shards, retention, replay from any point
Simple work queue, one consumer per messageSQSSimpler, per-message, no ordering needed
Event routing by content, many targetsEventBridgeRules route events to specific targets by content
High-throughput ingestion (100K+ records/sec)Kinesis Data StreamsDesigned for high-volume streaming ingestion
Deliver raw logs to S3 without codeKinesis FirehoseManaged delivery, format conversion, zero ops
React to AWS service eventsEventBridgeNative integration with 180+ services
🎯 Exam Insight
  • "Stream data to S3 as Parquet for Athena" β†’ Firehose + Glue Catalog (Pattern 1)
  • "Process in real-time AND archive" β†’ KDS with Lambda consumer + Firehose consumer (Pattern 2)
  • "Multiple consumers, independent processing" β†’ KDS with Enhanced Fan-Out (Pattern 3)
  • "Kinesis vs SQS" β†’ Kinesis = ordered stream, replay, high throughput. SQS = per-message queue, simpler
  • "Kinesis vs EventBridge" β†’ Kinesis = high-volume ingestion. EventBridge = event routing by content
Chapter 05 β€” Key Takeaway

The most common Kinesis pattern is Firehose β†’ S3 (Parquet) β†’ Athena for streaming data lakes. For real-time + archive, use KDS with parallel consumers (Lambda for real-time, Firehose for S3). Enhanced Fan-Out enables multiple isolated consumers on the same stream. Choose Kinesis over SQS when you need ordering, replay, or high-throughput streaming. Choose SQS for simple work queues.

06
Chapter Six Β· Analytics

Cost & Best Practices

Kinesis costs depend heavily on which service you use and how you configure it. Data Streams charges per shard-hour. Firehose charges per GB ingested. Understanding the cost model is key to building affordable streaming architectures.

Pricing Model Core
ServicePricing DimensionApproximate Cost
KDS ProvisionedPer shard-hour$0.015/shard-hour + $0.014/million PUT payloads (25KB each)
KDS On-DemandPer GB ingested + retrieved$0.04/GB in + $0.04/GB out
KDS Extended RetentionPer shard-hour (beyond 24h)$0.023/shard-hour (7-day) or $0.046 (365-day)
KDS Enhanced Fan-OutPer consumer-shard-hour + per GB$0.015/consumer-shard-hour + $0.013/GB retrieved
FirehosePer GB ingested$0.029/GB (first 500 TB/month)
Firehose format conversionPer GB converted$0.018/GB (JSON β†’ Parquet)
Cost Optimization Checklist Core
#OptimizationImpactHow
1Right-size shards30–70% savingsMonitor IncomingBytes metric; reduce shards if utilisation <50%
2Use Firehose instead of KDS + Lambda for simple deliverySimpler + cheaper for S3 archivalFirehose eliminates Lambda cost, shard management
3Aggregate small records with KPLUp to 25Γ— fewer PUTsKPL aggregates multiple sub-25KB records into one PUT payload
4Use On-Demand only if truly burstyProvisioned is 2Γ— cheaper at steady loadSwitch to provisioned once you know your throughput pattern
5Minimise retentionAvoid extended retention chargesKeep 24h default unless replay is actually needed
6Avoid Enhanced Fan-Out unless needed$0 extra cost without itUse shared mode for 1–2 consumers; EFO only for 3+
Production Best Practices Core
βœ…

Data Streams Best Practices

  • Use high-cardinality partition keys (avoid hot shards)
  • Monitor WriteProvisionedThroughputExceeded alarm
  • Use KPL for high-throughput producers (aggregation + batching)
  • Enable enhanced monitoring for per-shard metrics
  • Set up Lambda DLQ for poisonous record handling
  • Use on-demand mode for new/unpredictable workloads
βœ…

Firehose Best Practices

  • Enable Parquet conversion with Glue Catalog schema
  • Set buffer interval to 60s for lowest latency delivery
  • Configure S3 error output bucket for failed records
  • Use Lambda transform only when needed (adds cost)
  • Partition output by date: year/month/day/hour/
  • Enable CloudWatch delivery metrics for monitoring
Common Mistakes Core
❌

Using KDS + Lambda When Firehose Suffices

If you just need to deliver data to S3/Redshift/OpenSearch with no custom logic, Firehose is simpler and cheaper. Adding KDS + Lambda for simple delivery adds unnecessary complexity and cost.

❌

Low-Cardinality Partition Keys

Using country or status as partition key creates hot shards. Use user_id, device_id, or random UUIDs for even distribution.

❌

No Error Handling on Lambda Consumer

Without bisect-on-error and DLQ, one bad record blocks the entire shard indefinitely. Always configure max retry attempts + DLQ for Lambda event source mappings.

❌

Over-Provisioning Shards

Starting with 100 shards "just in case" wastes $108/month even with zero traffic. Start small, use CloudWatch alarms to trigger scale-out, or use on-demand mode initially.

Full Page Summary β€” Amazon Kinesis Core
πŸ“‹ Amazon Kinesis β€” Complete Recap
  • What: Managed real-time streaming platform β€” Data Streams (custom processing), Firehose (managed delivery), Flink (stream SQL).
  • Data Streams: Shard-based, ordered by partition key, 1 MB/s in + 2 MB/s out per shard. Retention 24h–365 days. Supports replay.
  • Firehose: Zero-admin delivery to S3/Redshift/OpenSearch. 60-second buffer minimum. Built-in JSONβ†’Parquet conversion. No replay.
  • Consumers: Lambda (serverless, 1/shard), KCL (custom, DynamoDB checkpoints), Flink (windowed SQL), Firehose (delivery).
  • Key Decision: Data Streams when you need real-time processing, replay, or multiple consumers. Firehose when you just need delivery with no code.
  • vs SQS: Kinesis = ordered stream, high throughput, replay. SQS = per-message queue, simpler.
  • Cost: KDS = $0.015/shard-hour (provisioned). Firehose = $0.029/GB ingested. Right-size shards, use KPL aggregation, minimise retention.
πŸ‘‰ Final Takeaway β€” Kinesis in One Thought

Kinesis is the real-time half of the AWS analytics stack. Where Athena and Glue process data at rest in S3, Kinesis processes data in motion. Data Streams gives you full control over real-time processing with shards and custom consumers. Firehose gives you zero-ops delivery to storage. Together with Athena, they form the complete analytics pipeline: stream β†’ store β†’ query.