AWS Developer Associate (DVA-C02) — Exam-Aligned Study Guide

Structured by Official Exam Guide Domains and Tasks
Reference: AWS DVA-C02 Exam Guide PDF
Last updated: March 2026

Exam Blueprint

Domain	Weight	Questions (~65 total)
1. Development with AWS Services	32%	~21
2. Security	26%	~17
3. Deployment	24%	~16
4. Troubleshooting and Optimization	18%	~11

DOMAIN 1 — Development with AWS Services (32%)

Task 1.1: Develop Code for Applications Hosted on AWS

1.1.1 Architectural Patterns

  Loosely Coupled                         Tightly Coupled
  ┌─────┐  SQS  ┌─────┐                 ┌─────┐───▶┌─────┐
  │ Svc │──────▶│ Svc │  Resilient       │ Svc │    │ Svc │  Fragile
  │  A  │       │  B  │                  │  A  │◀───│  B  │
  └─────┘       └─────┘                 └─────┘    └─────┘

  Fan-out Pattern:                Event-Driven:
  SNS ──▶ SQS-1 ──▶ Svc-A        S3 Event ──▶ Lambda
      ──▶ SQS-2 ──▶ Svc-B        DDB Stream ──▶ Lambda
      ──▶ SQS-3 ──▶ Svc-C        EventBridge ──▶ Step Functions

Service Selection for Decoupling:

Pattern	Service	When to Use
Queue-based	SQS	Point-to-point, async processing
Pub/Sub	SNS	One-to-many broadcast
Fan-out	SNS + SQS	Broadcast + independent parallel processing
Event bus	EventBridge	Cross-account, SaaS integration, rule-based
Streaming	Kinesis Data Streams	Real-time, ordered, high-volume data ingestion
Orchestration	Step Functions	Complex multi-step workflows with state
Choreography	EventBridge	Loosely coupled event-driven microservices

1.1.2 AWS SDK and API Essentials

Retry and Exponential Backoff:

All AWS SDKs implement automatic retries with exponential backoff
ThrottlingException, ProvisionedThroughputExceededException trigger auto-retry
Custom formula: base * 2^attempt with jitter (add randomness to avoid thundering herd)
Always cap maximum backoff to prevent infinite waits
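
The capped formula with jitter can be sketched in a few lines. This is a minimal illustration, not what any SDK does internally; `backoff_delay`, `call_with_retries`, and the default `base`/`cap` values are assumptions chosen for the example.

```python
import random
import time


def backoff_delay(attempt, base=0.1, cap=20.0, rng=random.random):
    """Return a sleep time in seconds for a zero-based retry attempt."""
    # Exponential growth, capped so the wait never grows without bound.
    ceiling = min(cap, base * (2 ** attempt))
    # Full jitter: pick a random point below the ceiling to spread retries out.
    return rng() * ceiling


def call_with_retries(operation, is_retryable, max_attempts=5, sleep=time.sleep):
    """Call operation(), retrying retryable errors with capped, jittered backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception as exc:
            # Give up on non-retryable errors or once attempts are exhausted.
            if not is_retryable(exc) or attempt == max_attempts - 1:
                raise
            sleep(backoff_delay(attempt))
```

Passing `sleep` and `rng` as parameters keeps the sketch easy to test without real waiting.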

Pagination:

Most List* / Describe* APIs return paginated results
Use NextToken / Marker to fetch subsequent pages
SDKs provide built-in paginators (e.g., client.get_paginator('list_objects_v2').paginate() in Python boto3)
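
What a paginator does under the hood can be sketched as a token loop. `iter_items` and `fetch_page` are hypothetical names; `fetch_page` stands in for any API call that accepts and returns a `NextToken`, and the `Items` key is an assumption (real APIs name their result lists differently).

```python
def iter_items(fetch_page, items_key="Items"):
    """Yield every item across all pages of a NextToken-style API."""
    token = None
    while True:
        # Only send NextToken when the previous page supplied one.
        kwargs = {"NextToken": token} if token else {}
        page = fetch_page(**kwargs)
        yield from page.get(items_key, [])
        token = page.get("NextToken")
        if not token:
            break  # no token means this was the last page
```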

Waiters:

SDK utility to poll until a resource reaches a desired state
Example: ec2.get_waiter('instance_running').wait(InstanceIds=[...])

Idempotency:

Use ClientToken / IdempotencyToken to prevent duplicate operations
SQS FIFO: MessageDeduplicationId provides 5-minute dedup window

1.1.3 Amazon API Gateway

Endpoint Types:

Type	Description	Use Case
Edge-optimized	Routed through CloudFront edge locations	Global clients (default)
Regional	Served from the API region	Same-region clients, custom CDN
Private	Accessible only from within a VPC	Internal microservices

Integration Types:

Integration	Request Transform	Response Transform	Notes
Lambda Proxy	No (raw pass)	No (Lambda formats)	Lambda MUST return `{statusCode, headers, body}`
Lambda Custom	Mapping template	Mapping template	Use for SOAP-to-REST, XML-to-JSON
HTTP Proxy	No	No	Pass-through to HTTP endpoint
HTTP Custom	Mapping template	Mapping template	Transform before/after HTTP backend
AWS Service	Mapping template	Mapping template	Direct integration (SQS, DynamoDB, S3)

Stages and Stage Variables:

Deploy to named stages: /dev, /staging, /prod
Stage variables act as environment variables for API Gateway
Reference Lambda alias via stage variable: ${stageVariables.lambdaAlias}
Canary deployments on stages: route percentage of traffic to canary

Caching:

Enable per stage; TTL 0-3600s (default 300s)
Invalidate: Cache-Control: max-age=0 header (requires execute-api:InvalidateCache)
Metrics: CacheHitCount, CacheMissCount (only visible when caching enabled)

Throttling:

Account-level: 10,000 requests/second (soft limit)
Stage/method-level throttling via Usage Plans
API Keys for identification (NOT authentication), paired with Usage Plans
Returns 429 Too Many Requests when throttled

CORS:

Lambda Proxy: Return CORS headers FROM the Lambda function itself
Lambda Custom: Configure CORS in API Gateway console
Required headers: Access-Control-Allow-Origin, Access-Control-Allow-Methods, Access-Control-Allow-Headers
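
For Lambda Proxy integration, the response shape and the CORS headers both come from the function. A minimal sketch, assuming a hypothetical front-end origin (`https://example.com`) and an illustrative payload:

```python
import json

ALLOWED_ORIGIN = "https://example.com"  # hypothetical front-end origin


def handler(event, context):
    """Lambda proxy integration handler that sets CORS headers itself."""
    payload = {"message": "hello", "path": event.get("path", "/")}
    return {
        "statusCode": 200,
        "headers": {
            "Content-Type": "application/json",
            "Access-Control-Allow-Origin": ALLOWED_ORIGIN,
            "Access-Control-Allow-Methods": "GET,POST,OPTIONS",
            "Access-Control-Allow-Headers": "Content-Type,Authorization",
        },
        # body must be a string; a malformed response is a common cause of 502.
        "body": json.dumps(payload),
    }
```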

Error Codes:

Code	Meaning	Root Cause
400	Bad Request	Malformed request syntax
403	Forbidden	WAF blocked, IAM denied, resource policy denied
429	Too Many Requests	Throttle limit exceeded
502	Bad Gateway	Lambda returned invalid response format
503	Service Unavailable	Temporary backend issue
504	Gateway Timeout	Backend exceeded the integration timeout (29 seconds by default)

1.1.4 Messaging and Event Services

Amazon SQS:

Feature	Standard Queue	FIFO Queue
Throughput	Unlimited	300 TPS (3,000 with batching)
Delivery	At-least-once	Exactly-once
Ordering	Best-effort	Strict FIFO per Message Group ID
Deduplication	None	Content-based or MessageDeduplicationId
Queue name	Any	Must end with `.fifo`
Retention	1 min - 14 days (default 4)	Same
Max message size	256 KB	256 KB
Visibility timeout	0s - 12h (default 30s)	Same

Visibility Timeout: Set >= max processing time; for Lambda, set >= 6x Lambda timeout
Dead-Letter Queue (DLQ): After maxReceiveCount failures, message sent to DLQ
Long Polling: WaitTimeSeconds > 0 (max 20s) reduces empty responses and cost
Short Polling: Returns immediately, may return empty, more API calls
Extended Client Library (Java and Python): For messages > 256 KB (payload stored in S3, up to 2 GB)

Amazon SNS:

Pub/Sub: topic to subscribers (Lambda, SQS, HTTP/S, email, SMS)
Message Filtering: Subscription filter policy so subscribers get only matching messages
Fan-out: SNS to multiple SQS queues for parallel independent processing
FIFO Topics: Pair with SQS FIFO for ordered fan-out
Message attributes: Key-value metadata attached to messages

Amazon EventBridge:

Serverless event bus for application events
Rules: Match event patterns and route to targets (Lambda, SQS, Step Functions)
Schema Registry: Auto-discover and version event schemas
Archive and Replay: Store events and replay them for debugging/recovery
Cross-account: Send/receive events across AWS accounts
Scheduler: Cron, rate-based, and one-time schedules via EventBridge Scheduler (EventBridge itself is the successor to CloudWatch Events)

Amazon Kinesis Data Streams:

Feature	Detail
Retention	24 hours default, max 365 days
Ordering	Per shard (by partition key)
Consumers	Standard (shared) or Enhanced (dedicated)
Throughput per shard	1 MB/s in, 2 MB/s out (standard)
Enhanced fan-out	2 MB/s per consumer per shard
Resharding	Split hot shards, merge cold shards

PutRecord + SequenceNumberForOrdering = strict order within shard
PutRecords (batch) does NOT guarantee cross-record order
ProvisionedThroughputExceededException: retry with exponential backoff, or increase the number of shards

1.1.5 AWS Step Functions

Workflow Types:

Feature	Standard	Express
Duration	Up to 1 year	Up to 5 minutes
Execution model	Exactly-once	At-least-once (async) or at-most-once (sync)
Pricing	Per state transition	Per execution + duration + memory
History	Full (25,000 events max)	Sent to CloudWatch Logs
Use case	Long-running, auditable	High-volume, short-lived (IoT)

State Types:

State	Purpose
Task	Do work (Lambda, ECS, Batch, DynamoDB, SNS, SQS)
Choice	Conditional branching (if/else)
Wait	Delay by seconds or until a timestamp
Parallel	Run branches concurrently
Map	Iterate over an array (dynamic parallelism)
Pass	Pass-through / inject fixed data (debugging)
Succeed	Terminal success state
Fail	Terminal failure state (no retry from Fail)

Input/Output Processing:

  Raw Input
      |
  InputPath ---- Filter what the state sees (e.g., "$.order")
      |
  Parameters --- Reshape input, add static values
      |
  [  STATE  ] -- Does work, produces RESULT
      |
  ResultSelector -- Filter/transform raw result
      |
  ResultPath ----- WHERE to place result relative to input
      |               "$.taskResult" = input.taskResult = result
      |               "$"            = result REPLACES entire input
      |               null           = result DISCARDED, input unchanged
  OutputPath ----- Final filter for next state
      |
  Output to Next State

Exam key: ResultPath is the one that COMBINES input + result.
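
The three ResultPath cases can be mimicked with a small function. This is a simplified model for study purposes, assuming only plain dotted paths such as `$.taskResult`; `apply_result_path` is a hypothetical helper, not part of any AWS SDK.

```python
import copy


def apply_result_path(state_input, result, result_path):
    """Mimic how ResultPath combines a task result with the state input."""
    if result_path is None:
        # null: discard the result, pass the input through unchanged.
        return copy.deepcopy(state_input)
    if result_path == "$":
        # "$": the result replaces the entire input.
        return copy.deepcopy(result)
    if not result_path.startswith("$."):
        raise ValueError("expected a path like '$.field'")
    output = copy.deepcopy(state_input)
    node = output
    *parents, leaf = result_path[2:].split(".")
    for key in parents:
        # Create intermediate objects when the path does not exist yet.
        node = node.setdefault(key, {})
    node[leaf] = copy.deepcopy(result)
    return output
```

With input `{"order": {"id": 7}}` and result `{"ok": True}`, the path `$.taskResult` keeps the order and adds the result beside it.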

Error Handling:

Retry: ErrorEquals, IntervalSeconds, MaxAttempts, BackoffRate
Catch: ErrorEquals, Next (fallback state), ResultPath (preserve error info)
Flow: Error -> Retry (up to MaxAttempts) -> Catch -> Next (fallback state)
Predefined errors: States.ALL, States.Timeout, States.TaskFailed, States.Permissions
Retry and Catch defined in state machine JSON, NOT application code
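
A Task state with Retry and Catch might look like the sketch below, written as a Python dict so the retry timing can be computed alongside it. The function ARN, account ID, and state names (`NotifyFailure`, `Done`) are hypothetical; `retry_waits` is an illustrative helper.

```python
import json

task_state = {
    "Type": "Task",
    # Hypothetical function ARN for illustration only.
    "Resource": "arn:aws:lambda:us-east-1:111122223333:function:ProcessOrder",
    "Retry": [{
        "ErrorEquals": ["States.Timeout", "States.TaskFailed"],
        "IntervalSeconds": 2,
        "MaxAttempts": 3,
        "BackoffRate": 2.0,
    }],
    "Catch": [{
        "ErrorEquals": ["States.ALL"],
        "ResultPath": "$.error",  # keep the original input, add error details
        "Next": "NotifyFailure",
    }],
    "Next": "Done",
}


def retry_waits(retrier):
    """Seconds waited before each retry attempt for one Retry entry."""
    return [retrier["IntervalSeconds"] * retrier["BackoffRate"] ** n
            for n in range(retrier["MaxAttempts"])]


print(json.dumps(task_state, indent=2))
print(retry_waits(task_state["Retry"][0]))
```

Only after the three retries are exhausted does the Catch entry route execution to the fallback state.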

Task 1.2: Develop Code for AWS Lambda

1.2.1 Lambda Invocation Types

Type	Behavior	Error Handling	Sources
Synchronous	Caller waits for response	Caller handles errors	API Gateway, ALB, SDK `Invoke()`
Asynchronous	Returns 202 immediately, queues internally	Auto-retry 2x, then DLQ/destination	S3, SNS, EventBridge, CloudFormation
Poll-based (ESM)	Lambda service polls source	DLQ on source queue (not Lambda)	SQS, Kinesis, DynamoDB Streams

Event Source Mapping (ESM) Details:

Source	Batch Size	Failure Handling
SQS	Up to 10 (FIFO) or 10,000 (standard; above 10 needs a batching window)	`ReportBatchItemFailures` for partial batch retry
Kinesis	Up to 10,000	`BisectBatchOnFunctionError`, on-failure destination
DynamoDB Streams	Up to 10,000	`BisectBatchOnFunctionError`, on-failure destination

SQS: DLQ configured on the SQS queue, NOT on Lambda
Kinesis/DDB: MaximumRetryAttempts, MaximumRecordAgeInSeconds
Parallelization factor: Process multiple batches per shard concurrently
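
With `ReportBatchItemFailures` enabled on the event source mapping, the handler returns the IDs of only the messages that failed. A minimal sketch; `process` and its negative-amount rule are hypothetical business logic.

```python
import json


def process(body):
    """Hypothetical business logic; raises on a bad message."""
    if body.get("amount", 0) < 0:
        raise ValueError("negative amount")


def handler(event, context):
    """SQS-triggered handler that reports only the failed messages."""
    failures = []
    for record in event["Records"]:
        try:
            process(json.loads(record["body"]))
        except Exception:
            # Only this message becomes visible again for another attempt.
            failures.append({"itemIdentifier": record["messageId"]})
    # An empty list tells Lambda the whole batch succeeded.
    return {"batchItemFailures": failures}
```

Without this response type, one bad message would cause the entire batch to be retried.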

1.2.2 Concurrency Model

  Account Concurrency Pool (Default: 1,000)
  +-----------------------------------------+
  |  Reserved (Fn-A): 400  (guaranteed)     |
  |  Reserved (Fn-B): 200  (guaranteed)     |
  |  Unreserved Pool: 400  (shared by rest) |
  |  AWS keeps minimum 100 unreserved!      |
  +-----------------------------------------+

  Formula: concurrent_executions = invocations/sec x avg_duration_sec

Concurrency Type	Behavior
Unreserved	Shared pool across all functions (default)
Reserved	Guarantees AND caps capacity for a function
Provisioned	Pre-initializes execution environments (eliminates cold starts)

Setting reserved concurrency to 0 = function completely disabled
Throttled: synchronous returns 429; async auto-retries then DLQ
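
The concurrency formula and the pool arithmetic from the diagram can be checked with a short calculation. Both helper names are hypothetical, and the 100-unit floor is taken from the diagram above.

```python
import math


def required_concurrency(invocations_per_sec, avg_duration_sec):
    """Concurrent executions needed to sustain a steady request rate."""
    return math.ceil(invocations_per_sec * avg_duration_sec)


def unreserved_pool(account_limit, reserved):
    """Concurrency left for functions that have no reservation."""
    remaining = account_limit - sum(reserved.values())
    if remaining < 100:
        # Lambda keeps at least 100 units unreserved for other functions.
        raise ValueError("reservations must leave at least 100 unreserved")
    return remaining


print(required_concurrency(200, 0.5))                         # 200 req/s at 500 ms each
print(unreserved_pool(1000, {"Fn-A": 400, "Fn-B": 200}))      # pool from the diagram
```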

1.2.3 Execution Lifecycle

  COLD START:  Download code -> Start runtime -> Run INIT code -> Run handler
  WARM START:  INIT skipped -> Run handler directly

  Optimization: Put expensive setup OUTSIDE the handler
  - DB connections, SDK clients, cached data persist across warm invocations
  - /tmp directory persists too (512 MB free, up to 10 GB)
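
The INIT-versus-handler split is easy to see in code. In this sketch `load_config` is a stand-in for expensive setup (an SDK client, a DB connection); the names and returned values are illustrative only.

```python
import time

INIT_COUNT = 0


def load_config():
    """Stand-in for expensive setup such as creating an SDK client."""
    global INIT_COUNT
    INIT_COUNT += 1
    return {"table": "orders", "loaded_at": time.time()}


# Module scope: runs once per execution environment, during the INIT phase.
CONFIG = load_config()


def handler(event, context):
    # Warm invocations reuse CONFIG instead of rebuilding it on every call.
    return {"table": CONFIG["table"], "init_count": INIT_COUNT}
```

Calling `handler` repeatedly in the same environment leaves `init_count` at 1, which is the behavior the optimization relies on.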

1.2.4 Lambda Configuration Limits

Setting	Detail
Memory	128 MB - 10,240 MB (CPU scales proportionally)
Timeout	Max 15 minutes (900 seconds)
Ephemeral storage	/tmp: 512 MB (free) up to 10 GB
Deployment package	50 MB zipped, 250 MB unzipped (incl. layers)
Layers	Max 5 per function; extract to `/opt/`
Env variables	Max 4 KB total size
vCPU	Cannot set directly (controlled by memory setting)
1 full vCPU	At 1,769 MB memory

1.2.5 Lambda Networking (VPC)

  DEFAULT (no VPC):  Lambda --> Internet --> AWS APIs  (Private RDS not accessible)
  WITH VPC CONFIG:   Lambda --ENI--> Private Subnet --> RDS
                     For internet: Lambda --> NAT GW --> IGW --> Internet
                     For AWS APIs: Use VPC Endpoints (no NAT GW needed)

Lambda creates ENIs (Elastic Network Interfaces), NOT Elastic IPs
VPC config used to add noticeable cold start latency (largely removed by the 2019 shared-ENI networking change); still attach to a VPC only when needed

1.2.6 Lambda Layers and Deployment

Layers: shared code/libraries across functions (extract to /opt/)
Max 5 layers per function; 250 MB total unzipped (function + all layers)
CloudFormation ZipFile: Inline source code (Node.js and Python only); NOT a zip file path

1.2.7 Destinations vs DLQ

Feature	Destinations	DLQ (Dead Letter Queue)
Event types	Success AND failure	Failure only
Targets	SQS, SNS, Lambda, EventBridge	SQS or SNS only
Scope	Async invocations; on-failure also for event source mappings	Async invocations only
Recommendation	Preferred (more flexible)	Legacy (still supported)

1.2.8 Aliases and Versions

Version: Immutable snapshot of function code + config
Alias: Pointer to a version (e.g., PROD points to v5)
Weighted alias: Route traffic between two versions (canary/linear)
$LATEST: Mutable, always latest code; cannot be referenced by alias weights
CodeDeploy integrates with aliases for automated traffic shifting

1.2.9 Lambda at the Edge

Feature	CloudFront Functions	Lambda@Edge
Runtime	JavaScript only	Node.js, Python
Execution location	218+ edge locations	Regional edge caches
Max duration	Less than 1 ms	5s (viewer), 30s (origin)
Max memory	2 MB	128 MB (viewer), up to 10,240 MB (origin)
Network/file access	No	Yes
Use case	Header manipulation, URL rewrites	Auth, A/B testing, origin selection

1.2.10 Lambda Function URLs

A dedicated HTTPS endpoint for a Lambda function — no API Gateway required.

Auth Types:

Auth Type	Who Can Invoke	Use Case
`AWS_IAM`	Only callers with valid IAM credentials	Internal services, cross-account invocations
`NONE`	Anyone on the internet (public)	Webhooks, third-party callbacks, public APIs

Function URL format: https://<url-id>.lambda-url.<region>.on.aws
Supports streaming response (response payload streamed as it’s generated)
CORS configurable directly on the Function URL (no API Gateway needed)
Resource-based policy controls access (even with NONE auth, you can restrict by IP/account)

Lambda Function URL vs API Gateway:

Feature	Lambda Function URL	API Gateway
Cost	Free (pay only for Lambda)	Per-request + data transfer charges
Setup effort	Minimal (one click / one line)	More setup (stages, methods, resources)
Auth options	`AWS_IAM` or `NONE`	IAM, Cognito, Lambda Authorizer, API Keys
Caching	No	Yes (built-in)
Throttling / Usage Plans	No	Yes
Request/response transform	No	Yes (mapping templates)
Custom domain	No (use CloudFront in front)	Yes (native custom domains)
WAF integration	No	Yes
Webhook from third-party	Best choice (simplest)	Works but more overhead

Webhook Pattern (exam favorite):

  Third-Party Platform (e.g., Stripe, GitHub)
      |
      | HTTPS POST with signature in headers
      | (platform signs with a secret key)
      v
  Lambda Function URL (AuthType: NONE)
      |
      | Step 1: Extract signature from headers
      | Step 2: Recompute signature using shared secret
      | Step 3: Compare signatures
      |    Match --> Execute domain logic
      |    No match --> Return 403 (reject)
      |
  (Custom validation IN the Lambda code itself)
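
The three validation steps can be sketched with the standard library. This assumes an HMAC-SHA256 hex signature sent in a hypothetical `x-signature` header and a secret supplied through a hypothetical `WEBHOOK_SECRET` environment variable; real platforms each define their own header name and signing scheme.

```python
import base64
import hashlib
import hmac
import json
import os

# Assumption: the shared secret is injected through an environment variable.
SECRET = os.environ.get("WEBHOOK_SECRET", "dev-only-secret").encode()
SIGNATURE_HEADER = "x-signature"  # hypothetical header name


def sign(raw_body: bytes) -> str:
    """Recompute the signature the sender should have produced."""
    return hmac.new(SECRET, raw_body, hashlib.sha256).hexdigest()


def handler(event, context):
    headers = {k.lower(): v for k, v in (event.get("headers") or {}).items()}
    body = event.get("body") or ""
    # Binary payloads arrive base64-encoded; sign the original bytes.
    raw = base64.b64decode(body) if event.get("isBase64Encoded") else body.encode()
    received = headers.get(SIGNATURE_HEADER, "")
    # Constant-time comparison avoids leaking how much of the value matched.
    if not hmac.compare_digest(sign(raw).encode(), received.encode()):
        return {"statusCode": 403, "body": json.dumps({"error": "invalid signature"})}
    return {"statusCode": 200, "body": json.dumps({"received": True})}
```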

Exam Trap — Function URL vs API Gateway for Webhooks:

Question Pattern	Answer
Third-party webhook + public HTTPS + least effort	Lambda Function URL (`NONE`) + custom validation
Webhook + signature in headers + validate before processing	Lambda Function URL (`NONE`) + validate in code
Third-party webhook + API Gateway + Lambda Authorizer	Works but NOT least effort (extra components)
Function URL with `AWS_IAM` for third-party webhook	WRONG (third-party cannot sign with AWS Sig V4)
Function URL with `CodeSigningConfigArn` condition	WRONG (code signing = deployment packages, not requests)

Real exam example: A third-party platform sends webhook requests signed with a secret key in headers. Need a public HTTPS endpoint processed by Lambda with least development effort:

Correct: Create Lambda Function URL with AuthType: NONE + resource-based policy allowing public invoke + custom signature validation inside the Lambda function

Wrong: API Gateway + Lambda Authorizer (works but MORE effort — two components instead of one)

Wrong: Function URL with AWS_IAM (third-party platform cannot create AWS Sig V4 signatures)

Wrong: CodeSigningConfigArn condition (that validates deployment packages, not incoming HTTP requests)

Key Distinctions to Memorize:

Term	What It Does	Exam Confusion
`FunctionUrlAuthType`	Controls who can call the Function URL (NONE or IAM)	Auth for HTTP callers
`CodeSigningConfig`	Validates deployment package integrity (code trust)	Auth for code deployments
Lambda Authorizer	Custom auth logic as a separate Lambda function	API Gateway only
Cognito Authorizer	JWT validation from Cognito User Pool	API Gateway only

Task 1.3: Use Data Stores in Application Development

1.3.1 Amazon DynamoDB

Core Concepts:

Partition Key (PK): Determines data distribution; must be high cardinality
Sort Key (SK): Optional; enables range queries within a partition
Item size: Max 400 KB

Capacity Modes:

Mode	Billing	Best For
On-Demand	Pay per request	Unpredictable traffic, new tables
Provisioned	Set RCU/WCU (auto-scaling OK)	Predictable traffic, cost optimization

RCU / WCU Calculations:

  READ (RCU):
  1 RCU = 1 strongly consistent read/sec for item <= 4 KB
        = 2 eventually consistent reads/sec for item <= 4 KB

  RCU = (reads/sec x ceil(item_KB / 4)) / consistency_factor
    Strongly consistent:   factor = 1
    Eventually consistent: factor = 2  (HALF cost)
    Transactional:         factor = 0.5 (DOUBLE cost)

  WRITE (WCU):
  1 WCU = 1 write/sec for item <= 1 KB

  WCU = writes/sec x ceil(item_KB / 1)
  Transactional writes: multiply by 2

Example: 150 eventually consistent reads/sec, 3.5 KB items: RCU = 150 x ceil(3.5/4) / 2 = 150 x 1 / 2 = 75 RCU
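
The same arithmetic as a pair of helper functions, useful for checking practice questions. `rcu` and `wcu` are hypothetical names and the mode strings are assumptions of this sketch.

```python
import math


def rcu(reads_per_sec, item_kb, mode="strong"):
    """Read capacity units for a given read rate and item size."""
    units_per_read = math.ceil(item_kb / 4)  # reads are billed in 4 KB blocks
    factor = {"strong": 1, "eventual": 2, "transactional": 0.5}[mode]
    return math.ceil(reads_per_sec * units_per_read / factor)


def wcu(writes_per_sec, item_kb, transactional=False):
    """Write capacity units for a given write rate and item size."""
    units = writes_per_sec * math.ceil(item_kb / 1)  # writes are billed in 1 KB blocks
    return units * 2 if transactional else units


print(rcu(150, 3.5, "eventual"))  # the worked example above
```

Round the item size up before multiplying: a 6 KB strongly consistent read costs 2 RCU, not 1.5.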

Indexes:

Feature	LSI (Local Secondary)	GSI (Global Secondary)
Partition key	Same as base table	Different from base table
Sort key	Different from base table	Optional; may differ from base table
Creation time	At table creation ONLY	Anytime
Capacity	Shares table RCU/WCU	Has its OWN RCU/WCU
Consistency	Strong or Eventually	Eventually ONLY
Max per table	5	20
Size limit	10 GB per partition key	No limit

ProvisionedThroughputExceededException on writes? Check if GSI WCU is less than base table WCU.

Query vs Scan:

Operation	What It Reads	Cost	Use When
Query	Items matching PK (+ SK)	Efficient	You know the partition key
Scan	Entire table	Expensive	Need all data (avoid if possible)

Scan applies FilterExpression AFTER reading (still consumes full RCU)
Optimize Scan: parallel scan with rate limiting; set Limit to control page size

DynamoDB Streams:

Captures item-level changes: INSERT, MODIFY, DELETE
StreamViewType: KEYS_ONLY, NEW_IMAGE, OLD_IMAGE, NEW_AND_OLD_IMAGES
24-hour retention (records not processed within 24 hours are lost)
Lambda polls streams via Event Source Mapping (synchronous invocation)

Transactions:

API	Behavior	Cost	Notes
TransactWriteItems	All-or-nothing (ACID)	2x WCU	Up to 100 items, 4 MB total
TransactGetItems	Consistent read of multiple items	2x RCU	Up to 100 items, 4 MB total
BatchWriteItem	Best-effort (NOT atomic)	1x WCU	Up to 25 items, no UpdateItem
BatchGetItem	Best-effort	1x RCU	Up to 100 items, 16 MB max

Batch Operations — Partial Results and UnprocessedKeys (exam favorite!):

BatchGetItem returns partial results (UnprocessedKeys) when:

Response size exceeds 16 MB limit
Table’s provisioned throughput is exceeded
More than 1 MB per partition is requested
Internal processing failure occurs

BatchWriteItem returns partial results (UnprocessedItems) when:

Table’s provisioned throughput is exceeded
Internal processing failure occurs

How to Handle UnprocessedKeys / UnprocessedItems:

Approach	Reliable?	Why
Exponential backoff with jitter (randomized delay)	YES	Reduces request frequency, avoids thundering herd
Use AWS SDK (built-in retry + exponential backoff)	YES	SDK handles retry logic automatically
Immediately retry the batch request	NO	Still throttled; high chance of failing again
Increase RCUs / enable Auto Scaling	Partial	Helps with throughput but partial results can still occur due to size limits
Create a GSI	NO	GSI doesn’t change BatchGetItem behavior

Exam example: Python script uses BatchGetItem, frequently gets UnprocessedKeys. Most reliable handling?

Correct: Exponential backoff with randomized delay between retries

Correct: Use AWS SDK (has built-in automatic retry + exponential backoff)

Wrong: Immediate retry (still throttled), increase RCUs (doesn’t fix size-limit partials), GSI (irrelevant to batch ops)
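
A manual retry loop for UnprocessedKeys might look like this sketch. It assumes `client` is a boto3-style DynamoDB client exposing `batch_get_item(RequestItems=...)`; the function name, attempt count, and delay values are illustrative choices.

```python
import random
import time


def batch_get_all(client, request_items, max_attempts=6, base=0.05, cap=5.0,
                  sleep=time.sleep):
    """Fetch all requested items, retrying UnprocessedKeys with jittered backoff."""
    results = {}
    pending = request_items
    for attempt in range(max_attempts):
        response = client.batch_get_item(RequestItems=pending)
        for table, items in response.get("Responses", {}).items():
            results.setdefault(table, []).extend(items)
        # Resubmit only the keys DynamoDB did not process.
        pending = response.get("UnprocessedKeys") or {}
        if not pending:
            return results
        if attempt < max_attempts - 1:
            sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
    raise RuntimeError("UnprocessedKeys remained after all attempts")
```

The key point for the exam is that the loop waits before resubmitting and sends only the unprocessed keys, not the whole batch again.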

Batch Limits Quick Reference:

API	Max Items	Max Size	Partial Result Key
BatchGetItem	100	16 MB	`UnprocessedKeys`
BatchWriteItem	25	16 MB	`UnprocessedItems`
TransactGetItems	100	4 MB	All-or-nothing (no partial)
TransactWriteItems	100	4 MB	All-or-nothing (no partial)

Optimistic Locking: Use version number attribute with ConditionExpression.
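
A version-checked update can be sketched as the parameters for an UpdateItem call. The attribute names (`status`, `version`) and the helper name are assumptions for illustration; the dict uses the low-level client value format.

```python
def versioned_update_params(table, key, new_status, expected_version):
    """Build UpdateItem parameters that succeed only if the version is unchanged."""
    return {
        "TableName": table,
        "Key": key,
        # Write the new value and bump the version in the same request.
        "UpdateExpression": "SET #s = :status, #v = :next",
        # Reject the write if another writer already changed the version.
        "ConditionExpression": "#v = :expected",
        "ExpressionAttributeNames": {"#s": "status", "#v": "version"},
        "ExpressionAttributeValues": {
            ":status": {"S": new_status},
            ":expected": {"N": str(expected_version)},
            ":next": {"N": str(expected_version + 1)},
        },
    }
```

Passed to a DynamoDB client as `update_item(**params)`, a stale version fails with ConditionalCheckFailedException, and the caller re-reads the item before retrying.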

TTL: Auto-delete expired items (no WCU cost); deletion is not immediate (typically within 48 hours of expiry), so filter out expired items on read.

DAX vs ElastiCache:

Feature	DAX	ElastiCache
Purpose	DynamoDB-specific cache	General-purpose cache
API	Same as DynamoDB (drop-in)	Custom cache logic in your app
Consistency	Eventually consistent only	You control
Data types	DynamoDB items/queries	Any (aggregated, computed, sessions)
Best for	Caching DynamoDB reads	Computed results, multi-source cache

1.3.2 Amazon S3

Server-Side Encryption:

Type	Key Management	Header
SSE-S3	AWS managed	`x-amz-server-side-encryption: AES256`
SSE-KMS	KMS key	`x-amz-server-side-encryption: aws:kms` + optional key ID
SSE-C	Customer key	3 headers: algorithm, key (base64), key MD5

SSE-KMS: Each operation calls KMS API and counts against KMS quota
Enforce encryption: Bucket policy deny s3:PutObject without encryption header

Storage Classes:

Class	Access Pattern	Retrieval Fee	Min Duration
S3 Standard	Frequent	None	None
S3 Intelligent-Tiering	Unknown / changing	None	None
S3 Standard-IA	Infrequent, rapid access	Per GB	30 days
S3 One Zone-IA	Infrequent, single AZ OK	Per GB	30 days
S3 Glacier Instant	Quarterly, millisecond access	Per GB	90 days
S3 Glacier Flexible	1-2x/year, mins-hours	Per GB	90 days
S3 Glacier Deep Archive	1x/year, 12-48 hours	Per GB	180 days

Key Features:

Presigned URLs: Temporary access to private objects (upload or download)
Event Notifications: Targets Lambda, SQS, SNS, EventBridge
Versioning: Protects against accidental deletion (delete markers)
MFA Delete: Requires MFA for permanent version deletion
Lifecycle Rules: Transition between storage classes or expire objects

CORS:

Configure on the target bucket (the one being accessed cross-origin)
API Gateway with Lambda Proxy integration: Return CORS headers from the Lambda function
API Gateway with non-proxy integration: Enable CORS in the API Gateway console

1.3.3 Amazon ElastiCache

Redis vs Memcached:

Feature	Redis	Memcached
Replication	Multi-AZ with auto-failover	No replication
Persistence	AOF / RDB snapshots	No persistence
Data types	Strings, lists, sets, hashes	Simple key-value only
Pub/Sub	Yes	No
Threading	Single-threaded	Multi-threaded
Use case	HA, persistence, complex data	Simple caching, max throughput

Caching Strategies:

Strategy	How It Works	Pros	Cons
Lazy Loading	Cache miss then fetch DB then cache it	Only caches needed data	Stale data, cache-miss penalty
Write-Through	Write to cache AND DB simultaneously	Always fresh	Write penalty, caches all data
Write-Behind	Write to cache then async write to DB	Fast writes	Data loss risk

Best practice: Write-Through + TTL = fresh data + automatic cleanup of unused entries.
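
Lazy loading, write-through, and TTL can be combined in one small class. This is an in-memory sketch in which a dict stands in for Redis or Memcached; `LazyCache`, `loader`, and `writer` are hypothetical names for the cache wrapper and the database read/write callables.

```python
import time


class LazyCache:
    """In-memory stand-in for a cache with lazy loading, write-through, and TTL."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry and entry[1] > self._clock():
            return entry[0]  # cache hit, still within TTL
        value = self._loader(key)  # cache miss: read from the database
        self._store[key] = (value, self._clock() + self._ttl)
        return value

    def put(self, key, value, writer):
        # Write-through: update the database and the cache together.
        writer(key, value)
        self._store[key] = (value, self._clock() + self._ttl)
```

The TTL bounds how stale a lazily loaded entry can get and lets unused write-through entries expire on their own.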

1.3.4 Amazon OpenSearch

Full-text search, log analytics, real-time dashboards
Common pattern: DynamoDB Streams to Lambda to OpenSearch
Use when DynamoDB cannot meet search requirements (full-text, fuzzy matching)

DOMAIN 2 — Security (26%)

Task 2.1: Implement Authentication and Authorization

2.1.1 IAM Core Concepts

Policy Evaluation Logic:

  1. All requests DENIED by default
  2. Evaluate all applicable policies
  3. Explicit DENY always wins (overrides any Allow)
  4. Explicit ALLOW grants access (if no Deny)
  5. If no Allow found then implicit deny
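
The evaluation order can be modeled with a deliberately simplified function. It ignores conditions, principals, SCPs, and permissions boundaries, and it matches wildcards case-sensitively (real IAM action matching is case-insensitive); `evaluate` is a hypothetical helper for study purposes only.

```python
from fnmatch import fnmatchcase


def evaluate(statements, action, resource):
    """Return 'Allow' or 'Deny' for one request against a list of statements."""
    decision = "Deny"  # implicit deny unless something allows the request
    for stmt in statements:
        if not any(fnmatchcase(action, pattern) for pattern in stmt["Action"]):
            continue
        if not any(fnmatchcase(resource, pattern) for pattern in stmt["Resource"]):
            continue
        if stmt["Effect"] == "Deny":
            return "Deny"  # explicit deny always wins
        decision = "Allow"
    return decision
```

An Allow on `s3:*` combined with a Deny on `s3:DeleteObject` therefore permits reads but blocks deletes, regardless of statement order.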

Policy Types:

Policy Type	Scope	Notes
Service Control Policy	Organization / OU / Account	Caps the maximum permissions available; grants nothing by itself
Permissions Boundary	IAM user / role	Sets max permissions for entity
Identity-based	IAM user / role / group	Inline or managed policies
Resource-based	S3, SQS, Lambda, KMS, etc.	Cross-account without AssumeRole
Session policy	STS session	Limits assumed role permissions

Key Distinctions:

Users: Long-term credentials (access keys); for humans or CI/CD
Roles: Temporary credentials via STS; for services, cross-account, federation
Instance Profile: Wrapper around IAM role for EC2 instances
Always prefer roles over access keys

2.1.2 STS (Security Token Service)

API	Use Case
`AssumeRole`	Cross-account access, role switching
`AssumeRoleWithWebIdentity`	OIDC federation (Google, Facebook, Cognito)
`AssumeRoleWithSAML`	SAML 2.0 federation (Active Directory)
`GetSessionToken`	MFA-protected API calls for IAM users (`AssumeRole` also accepts MFA)
`GetFederationToken`	Proxy apps issuing temp credentials
`DecodeAuthorizationMessage`	Decode `UnauthorizedOperation` error details

Cross-Account Access Pattern:

  PRODUCTION ACCOUNT                     DEVELOPMENT ACCOUNT
  1. Create IAM Role with               3. Create IAM Policy allowing
     Trust Policy: Dev Account               sts:AssumeRole on Prod Role ARN
  2. Attach permissions to Role          4. Attach to Dev IAM users/roles

Role created in account WITH the resource. Trust policy specifies WHO can assume it.
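
The two policy documents in this pattern can be sketched as follows. The account IDs and role name are hypothetical placeholders; the trust policy belongs on the role in the Production account, the caller policy on the Development identities.

```python
import json

DEV_ACCOUNT_ID = "111122223333"   # hypothetical
PROD_ACCOUNT_ID = "444455556666"  # hypothetical
ROLE_NAME = "ProdReadOnly"        # hypothetical

# Steps 1-2 (Production): trust policy stating WHO may assume the role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": f"arn:aws:iam::{DEV_ACCOUNT_ID}:root"},
        "Action": "sts:AssumeRole",
    }],
}

# Steps 3-4 (Development): identity policy allowing the AssumeRole call.
caller_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": "sts:AssumeRole",
        "Resource": f"arn:aws:iam::{PROD_ACCOUNT_ID}:role/{ROLE_NAME}",
    }],
}

print(json.dumps(trust_policy, indent=2))
print(json.dumps(caller_policy, indent=2))
```

Both sides must allow the call: the trust policy alone is not enough if the Development identity lacks `sts:AssumeRole` on the role ARN.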

2.1.3 Amazon Cognito

User Pools (Authentication — “Who are you?”):

User directory: sign-up, sign-in, password policies
Federated sign-in: social providers (Google, Facebook, Apple) plus SAML and OIDC identity providers
MFA, adaptive authentication, account recovery
Returns JWTs: ID Token (user identity claims), Access Token (API access scopes), Refresh Token
Hosted UI with customizable branding
Directly integrates with API Gateway as a native Cognito Authorizer
Cannot grant AWS service credentials directly
Token stored client-side (e.g., browser local storage) and sent in Authorization header

Identity Pools / Federated Identities (Authorization — “What can you access?”):

Exchanges tokens (from User Pool, Google, Facebook, SAML) for temporary AWS credentials (via STS)
Maps authenticated and unauthenticated users to IAM roles
Supports guest / unauthenticated access
Returns Cognito ID (CognitoIdentityId) — a unique identifier for the user
The Cognito ID is then used to obtain temporary, limited-privilege AWS credentials
Does NOT directly integrate with API Gateway as an authorizer
Use when your app needs to call AWS services directly (S3, DynamoDB) from client-side

Identity Pool with External IdP — Full Flow:

  FLOW 3: External IdP + Identity Pool (mobile apps, federated access)

  ┌──────┐  SDK    ┌───────────┐ OAuth/OIDC  ┌───────────────┐  returns  ┌──────────┐
  │Mobile│──────>  │ Identity  │   token     │   Amazon      │ Cognito   │ Cognito  │
  │ App  │  login  │ Provider  │────────────>│   Cognito     │───ID────> │ ID       │
  └──────┘         │(Google,   │             │  (Identity    │           │(unique   │
                   │ Facebook, │             │   Pool)       │           │ user ID) │
                   │ SAML...)  │             └───────────────┘           └─────┬────┘
                   └───────────┘                                              │
                                                                    GetCredentialsForIdentity
                                                                              │
                                                                              v
                                                                    ┌──────────────────┐
                                                                    │ Temp AWS Creds    │
                                                                    │ (AccessKeyId,     │
                                                                    │  SecretAccessKey, │
                                                                    │  SessionToken)    │
                                                                    └──────────────────┘
                                                                              │
                                                                    S3, DynamoDB, SNS...

What Each “Cognito ___” Term Means (exam loves these distractors):

Term	What It Is	Real?
Cognito ID	Unique user identifier returned by Identity Pool	YES — correct answer
Cognito Key Pair	Not a real Cognito concept	NO — distractor
Cognito SDK	Development toolkit to interact with Cognito	Exists but not a return value
Cognito API	API interface for Cognito service	Exists but not a return value

Exam example: Mobile app authenticates with IdP using provider’s SDK, passes OAuth/OIDC token to Cognito. What is returned to provide temporary AWS credentials?

Answer: Cognito ID — Identity Pool returns a Cognito ID, which is then used to get temporary, limited-privilege AWS credentials

The Cognito ID uniquely identifies the user across all federated identity providers

The Two Flows — Critical to Understand:

  FLOW 1: User Pool + API Gateway (most common exam scenario)
  ┌──────┐ sign-in  ┌───────────┐  JWT    ┌───────────┐ Cognito   ┌──────────┐
  │ User │─────────>│ User Pool │──token──>│ Browser / │ Authorizer│ API GW   │
  │      │          │           │         │ App       │──────────>│ (validates│
  └──────┘          └───────────┘         └───────────┘  header:  │  JWT)     │
                                                        Authorization └──────────┘
  - API Gateway validates JWT natively (no Lambda needed)
  - Set token source = name of header (usually "Authorization")
  - Create authorizer in API GW console using User Pool ID

  FLOW 2: User Pool + Identity Pool + AWS Services
  ┌──────┐ sign-in  ┌───────────┐  JWT   ┌───────────────┐  STS   ┌──────────┐
  │ User │─────────>│ User Pool │──token─>│ Identity Pool │──────>│ Temp AWS │
  │      │          │           │        │               │       │ Creds    │
  └──────┘          └───────────┘        └───────────────┘       └──────────┘
                                                                       │
                                                              S3, DynamoDB, etc.
  - Identity Pool exchanges JWT for IAM credentials
  - App calls AWS services DIRECTLY (not through API Gateway)
  - Use when client needs S3.putObject, DynamoDB.getItem, etc.

Exam Trap — User Pool vs Identity Pool for API Gateway:

Question Pattern	Answer
“JWT authorizer for API Gateway”	User Pool (NOT Identity Pool)
“ReactJS app + Cognito + JWT in local storage + API Gateway”	User Pool + Cognito Authorizer
“Token source header for API Gateway authorizer”	Set header on User Pool authorizer
“App needs to call S3/DynamoDB directly from browser”	Identity Pool (for AWS creds)
“Guest access to AWS resources”	Identity Pool
“User Pool or Identity Pool for API Gateway?”	Always User Pool

Real exam example: A ReactJS app on S3 uses Cognito SDK for sign-up/sign-in, stores JWT in local storage, and uses JWT to authorize API Gateway calls. The correct steps are:

Create a Cognito User Pool (for sign-up/sign-in and JWT issuance)

On API Gateway console, create an authorizer using the Cognito User Pool ID

On that authorizer, set the token source to the name of the request header that carries the JWT (e.g., Authorization)

Identity Pool is NOT needed here because the app only calls API Gateway (not AWS services directly).

Complete Decision Guide:

Scenario	Service
User sign-up / sign-in / user directory	User Pool
JWT tokens for API Gateway authorization	User Pool (Cognito Authorizer)
Token source header for API Gateway	User Pool authorizer config
Temporary AWS credentials (S3, DynamoDB from client)	Identity Pool
Guest / unauthenticated access to AWS resources	Identity Pool
Social login + access S3 directly from app	User Pool + Identity Pool
Social login + access API Gateway	User Pool (Cognito Authorizer)
Cross-device sync (single user key-value)	Cognito Sync
Multi-user real-time shared data	AppSync (NOT Cognito Sync)

2.1.4 API Gateway Authentication

Method	How It Works	Use When
IAM (AWS_IAM)	Sig V4 signed requests	AWS users/roles, cross-account
Cognito User Pool Authorizer	Validates JWT from User Pool natively	User pool-authenticated clients
Lambda Authorizer (TOKEN)	Custom auth logic on bearer token	Custom/3rd-party auth, Identity Pool tokens
Lambda Authorizer (REQUEST)	Custom auth on headers, query params	Multiple identity sources
API Keys	Identification only (NOT authentication!)	Usage tracking, throttling, quotas

Identity Pool + API Gateway (when needed):

Identity Pool does NOT have a native API Gateway authorizer
If you must use Identity Pool tokens with API Gateway, use a Lambda Authorizer to validate
But the standard pattern is: User Pool JWT + Cognito Authorizer (simpler, no Lambda needed)

Resource Policies: JSON policies on the API itself for cross-account access or IP restrictions.

Task 2.2: Implement Encryption Using AWS Services

2.2.1 AWS KMS (Key Management Service)

Key Types:

Type	Management	Rotation	Use Case
AWS managed key (`aws/s3`)	AWS	Auto every year	Default for AWS services
Customer managed key (CMK)	You	Optional (enable auto-rotate)	Custom control, policies
AWS owned key	AWS	Varies	Internal AWS use

Envelope Encryption (for data larger than 4 KB):

  ENCRYPT:
  1. Call GenerateDataKey -> returns plaintext key + encrypted key
  2. Encrypt data with the plaintext key (client-side)
  3. DELETE plaintext key from memory
  4. Store encrypted data + encrypted key together

  DECRYPT:
  1. Send encrypted key to KMS (Decrypt API) -> returns plaintext key
  2. Decrypt data with the plaintext key
  3. DELETE plaintext key from memory

KMS can only directly encrypt/decrypt up to 4 KB
For larger data you MUST use envelope encryption
GenerateDataKey vs GenerateDataKeyWithoutPlaintext (encrypted key only, for later use)
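
The encrypt and decrypt steps above can be sketched as follows. Two loud assumptions: `FakeKms` is an in-memory stand-in that only mirrors the shape of the boto3 `generate_data_key` and `decrypt` calls, and `toy_cipher` is an insecure placeholder for a real cipher such as AES-GCM (the standard library has none). Treat it as an illustration of the flow, not as usable cryptography.

```python
import hashlib
import os


class FakeKms:
    """In-memory stand-in for a KMS client (assumption: mirrors boto3 call shapes)."""

    def __init__(self):
        self._master = os.urandom(32)  # never leaves "KMS"

    def _wrap(self, data: bytes) -> bytes:
        return bytes(a ^ b for a, b in zip(data, self._master))

    def generate_data_key(self, KeyId, KeySpec):
        plaintext = os.urandom(32)
        return {"Plaintext": plaintext, "CiphertextBlob": self._wrap(plaintext)}

    def decrypt(self, CiphertextBlob):
        return {"Plaintext": self._wrap(CiphertextBlob)}


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR with a SHA-256 keystream. NOT secure; stands in for AES-GCM."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def envelope_encrypt(kms, key_id: str, data: bytes) -> dict:
    resp = kms.generate_data_key(KeyId=key_id, KeySpec="AES_256")  # step 1
    ciphertext = toy_cipher(resp["Plaintext"], data)               # step 2
    del resp["Plaintext"]                                          # step 3 (best effort in Python)
    return {"ciphertext": ciphertext, "encrypted_key": resp["CiphertextBlob"]}  # step 4


def envelope_decrypt(kms, blob: dict) -> bytes:
    key = kms.decrypt(CiphertextBlob=blob["encrypted_key"])["Plaintext"]  # step 1
    try:
        return toy_cipher(key, blob["ciphertext"])                        # step 2
    finally:
        del key                                                           # step 3


kms = FakeKms()
blob = envelope_encrypt(kms, "alias/example", b"x" * 10_000)  # well over 4 KB
print(envelope_decrypt(kms, blob) == b"x" * 10_000)
```

Only the 32-byte data key ever goes to KMS, which is how the 4 KB limit is sidestepped.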

KMS Key Policies:

Every KMS key must have a key policy (resource-based)
Default key policy: gives the AWS account (root principal) full access and lets IAM policies grant access to the key
Cross-account: Key policy must allow external account AND external account needs IAM permissions

KMS API Quota:

5,500 - 30,000 requests/sec per region
SSE-KMS on S3: each upload/download calls KMS and can hit quota
Fix: request quota increase, use S3 Bucket Keys, or switch to SSE-S3

2.2.2 KMS vs CloudHSM

Feature	KMS	CloudHSM
HSM tenancy	Multi-tenant	Single-tenant (dedicated)
Key management	AWS manages HSMs	You manage keys and HSMs
FIPS 140 validation	Level 3 (Level 2 before 2023; older exam material still cites Level 2)	Level 3
Integration	100+ AWS services	Custom key store for KMS
Use case	Most encryption needs	Regulatory compliance, full control

2.2.3 Encryption at Rest and in Transit

Service	At Rest	In Transit
S3	SSE-S3, SSE-KMS, SSE-C	HTTPS (enforce via bucket policy)
DynamoDB	AWS owned key (default) or KMS CMK	TLS (automatic)
EBS	KMS-encrypted volumes	Encrypted between EC2 and EBS
RDS	KMS encryption at creation	SSL/TLS certificates
SQS	SSE-SQS or SSE-KMS	HTTPS
Kinesis	KMS server-side encryption	TLS

2.2.4 ACM (AWS Certificate Manager)

Free public SSL/TLS certificates for AWS services
Auto-renewal for ACM-issued certificates
Integrates with: ALB, CloudFront, API Gateway
Cannot use ACM certificates directly on EC2

Task 2.3: Manage Sensitive Data in Application Code

2.3.1 Parameter Store vs Secrets Manager

Feature	Parameter Store	Secrets Manager
Rotation	Manual (custom Lambda)	Built-in auto-rotation
Cost	Free (standard tier)	$0.40/secret/month
Max size	4 KB (std), 8 KB (adv)	64 KB
RDS integration	No native rotation	Native rotation for RDS
Cross-region	No	Yes (replica secrets)
Encryption	SecureString with KMS	Always encrypted with KMS
Best for	Config values, feature flags	Credentials needing rotation

Parameter Store Tiers:

Tier	Max Size	Cost	Features
Standard	4 KB	Free	No policies
Advanced	8 KB	Paid	Expiration, NoChangeNotification

CloudFormation Dynamic References:

{{resolve:ssm:paramName:version}} for plaintext parameters
{{resolve:ssm-secure:paramName:version}} for SecureString parameters
{{resolve:secretsmanager:secretId:SecretString:jsonKey}} for Secrets Manager

2.3.2 Best Practices for Sensitive Data

Never store credentials in code, Git, or CloudFormation parameters in plaintext
Lambda: Use environment variables encrypted with KMS for sensitive values
ECS: Reference Secrets Manager or Parameter Store in task definitions
Use IAM roles instead of access keys whenever possible
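
A minimal sketch of fetching credentials at runtime instead of hardcoding them. The client is passed in so the example runs without AWS access; `FakeSecretsClient`, the secret name `prod/db`, and the secret's JSON layout are assumptions for illustration. In real code the client would typically be `boto3.client("secretsmanager")`, whose `get_secret_value` call has the same shape.

```python
import json


def load_db_credentials(secrets_client, secret_id: str) -> dict:
    """Fetch a JSON secret at runtime rather than storing it in code."""
    response = secrets_client.get_secret_value(SecretId=secret_id)
    return json.loads(response["SecretString"])


class FakeSecretsClient:
    """Stand-in so the sketch runs without AWS access."""

    def get_secret_value(self, SecretId):
        return {"SecretString": json.dumps({"username": "app", "password": "example-only"})}


creds = load_db_credentials(FakeSecretsClient(), "prod/db")
print(creds["username"])
```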

DOMAIN 3 — Deployment (24%)

Task 3.1: Prepare Application Artifacts for Deployment

3.1.1 Lambda Packaging

Method	Details
Zip deployment	50 MB compressed, 250 MB unzipped (incl. layers)
Container image	Up to 10 GB; must implement Lambda Runtime API
Inline (ZipFile)	CloudFormation only; Node.js and Python; source code only
Layers	Shared libraries; max 5 per function; extract to `/opt/`
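
As a rough pre-deployment check, the zip limits in the table can be tested locally. This sketch builds a package in memory and compares it against the 50 MB and 250 MB figures; it ignores layers, which also count toward the unzipped limit, and the file contents are placeholders.

```python
import io
import zipfile

ZIP_LIMIT_BYTES = 50 * 1024 * 1024        # compressed limit for zip deployment
UNZIPPED_LIMIT_BYTES = 250 * 1024 * 1024  # unzipped limit (layers not counted here)


def build_package(files: dict) -> bytes:
    """Zip {archive_path: bytes} in memory, as a deployment package would be."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, content in files.items():
            archive.writestr(path, content)
    return buffer.getvalue()


def check_package(package: bytes) -> dict:
    """Report compressed and unzipped sizes against the zip deployment limits."""
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        unzipped = sum(info.file_size for info in archive.infolist())
    return {
        "zip_ok": len(package) <= ZIP_LIMIT_BYTES,
        "unzipped_ok": unzipped <= UNZIPPED_LIMIT_BYTES,
        "unzipped_bytes": unzipped,
    }


source = b"def handler(event, context):\n    return 'ok'\n"
report = check_package(build_package({"app.py": source}))
print(report)
```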

3.1.2 Container Images (ECS/EKS)

Dockerfile -> docker build -> push to Amazon ECR
ECR: Managed container registry with image scanning
ECS Task Definition references ECR image URI
Multi-stage builds reduce image size (build + runtime stages)

3.1.3 Elastic Beanstalk Source Bundle

ZIP or WAR containing application code
.ebextensions/*.config for custom resources and settings (YAML)
env.yaml for environment manifest
Dockerrun.aws.json for multi-container Docker
cron.yaml for periodic worker tasks
Procfile to define processes

Task 3.2: Test Applications in Development Environments

3.2.1 SAM Local Testing

  sam local invoke            Invoke Lambda locally with event payload
  sam local start-api         Start local API Gateway + Lambda
  sam local start-lambda      Start Lambda endpoint for SDK testing
  sam local generate-event    Generate sample event payloads

Requires Docker for local Lambda simulation

3.2.2 CDK + SAM Local Testing (Exam Favorite!)

The exact two-step flow:

  Step 1: cdk synth StackName
          |
          v
  Generates CloudFormation template in cdk.out/
  (e.g., cdk.out/MyStack.template.json)

  Step 2: sam local invoke -t cdk.out/MyStack.template.json MyFunctionLogicalId
          |
          v
  SAM reads the synthesized template, finds the Lambda,
  spins up Docker container, invokes locally

Full command reference for CDK + SAM local testing:

Step	Command	Purpose
1	`cdk synth` (optionally followed by the stack name)	Generate CloudFormation template to `cdk.out/`
2a	`sam local invoke -t cdk.out/MyStack.template.json FunctionId`	Invoke a specific Lambda locally
2b	`sam local start-api -t cdk.out/MyStack.template.json`	Start local API Gateway + Lambda
2c	`sam local start-lambda -t cdk.out/MyStack.template.json`	Start local Lambda endpoint for SDK testing

Exam Trap — What is NOT needed for local testing:

Command / Action	Needed for Local Testing?	Why
`cdk synth`	YES	Generates the template SAM needs
`sam local invoke -t ...`	YES	Invokes the function locally using template
`cdk bootstrap`	NO	Sets up deployment infra (S3 bucket), not local
`sam package`	NO	Uploads code to S3 for deployment, not local
`sam deploy`	NO	Deploys to AWS, not local
`cdk deploy`	NO	Deploys to AWS, not local

Exam example: CDK app with L2 constructs, SAM and CDK configured locally. What TWO steps to test Lambda locally?

1. Run cdk synth to generate the CloudFormation template (specify the stack name)

2. Run sam local invoke with -t pointing to the synthesized template and the function logical ID

Wrong: sam package (that’s for S3 upload/deployment), cdk bootstrap (that’s for deployment prep)
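
The function logical ID in a synthesized template usually carries a generated suffix, so it can help to look it up rather than guess. A small sketch, assuming the template is standard CloudFormation JSON; the two logical IDs in the sample are invented, and in practice the JSON would be read from the file under cdk.out/.

```python
import json


def lambda_logical_ids(template: dict) -> list:
    """Return logical IDs of Lambda functions in a CloudFormation template."""
    return [
        logical_id
        for logical_id, resource in template.get("Resources", {}).items()
        if resource.get("Type") == "AWS::Lambda::Function"
    ]


# Tiny hand-written template standing in for cdk.out/MyStack.template.json.
sample = json.loads("""
{
  "Resources": {
    "MyFunction3BAA72D1": {"Type": "AWS::Lambda::Function", "Properties": {}},
    "MyBucketF68F3FF0": {"Type": "AWS::S3::Bucket"}
  }
}
""")
print(lambda_logical_ids(sample))
```

The ID it returns is the value to pass as the last argument of sam local invoke.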

Task 3.3: Automate Deployment Testing

3.3.1 CodeBuild

buildspec.yml structure:

version: 0.2
env:
  variables:
    KEY: "value"
  parameter-store:
    DB_PASS: /app/db-password
  secrets-manager:
    SECRET: my-secret:key
phases:
  install:
    runtime-versions:
      nodejs: 18
    commands:
      - npm install
  pre_build:
    commands:
      - npm test
  build:
    commands:
      - npm run build
  post_build:
    commands:
      - echo "Build complete"
artifacts:
  files:
    - '**/*'
  base-directory: dist
cache:
  paths:
    - node_modules/**/*

Build commands and settings are defined in buildspec.yml (NOT appspec.yml, which belongs to CodeDeploy)
Artifacts stored in S3
Environment variables from Parameter Store and Secrets Manager
VPC support for accessing private resources during build

Task 3.4: Deploy Code Using AWS CI/CD Services

3.4.1 CodePipeline

  Source --> Build --> [Test] --> [Manual Approval] --> Deploy

Source: CodeCommit, GitHub, S3, ECR
Build: CodeBuild
Deploy: CodeDeploy, CloudFormation, ECS, Elastic Beanstalk, S3

Manual Approval:

Pipeline pauses until approved, rejected, or times out
Default timeout: 7 days
SNS notification to approvers
IAM permission: codepipeline:PutApprovalResult
Use cases: production gate, compliance sign-off, change management

3.4.2 AWS CodeDeploy

Deployment Matrix:

Platform	In-Place	Blue/Green	Agent Required?
EC2	Yes	Yes	YES (must be installed and running)
On-Premises	Yes	No	YES (must be installed and running)
Lambda	No	Yes (always)	NO (managed by AWS)
ECS	No	Yes (always)	NO (managed by AWS)

CodeDeploy Agent Details:

Must be installed and running on EC2 and On-Premises instances only
Agent polls CodeDeploy for deployment instructions
Install at scale via SSM Run Command
NOT required for Lambda or ECS (AWS manages natively)
DownloadBundle error: Check agent is running AND instance IAM role has S3 permissions

Traffic Shifting Strategies (Lambda / ECS):

Strategy	Behavior
Canary	Shift X% to the new version, wait, then shift the remainder (e.g., Canary10Percent5Minutes)
Linear	Equal increments over time (e.g., Linear10PercentEvery1Minute)
AllAtOnce	100% immediately
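
To make the difference concrete, the sketch below models each strategy as a list of (minute, percent on new version) pairs. It is a simplified model that assumes the first shift happens at minute 0; the function names are illustrative and not part of any AWS API.

```python
def canary_schedule(first_percent: int, wait_minutes: int) -> list:
    """Canary: shift first_percent at once, then the remainder after the wait."""
    return [(0, first_percent), (wait_minutes, 100)]


def linear_schedule(step_percent: int, every_minutes: int) -> list:
    """Linear: shift step_percent at each interval until 100% is reached."""
    schedule = []
    shifted = 0
    minute = 0
    while shifted < 100:
        shifted = min(100, shifted + step_percent)
        schedule.append((minute, shifted))
        minute += every_minutes
    return schedule


print(canary_schedule(10, 5))   # models Canary10Percent5Minutes
print(linear_schedule(10, 1))   # models Linear10PercentEvery1Minute
```

Canary has exactly two steps regardless of the percentage, while linear has as many steps as it takes to reach 100%.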

AppSpec File:

Platform	Format	Key Sections
EC2	YAML	`files` (source to dest), `hooks` (lifecycle events)
Lambda	YAML/JSON	`version`, `resources` (function, alias, versions)
ECS	YAML/JSON	`version`, `resources` (task def, container, port)

EC2/On-Prem Lifecycle Hooks (in order):

#	Hook	Managed By	Can You Script It?
1	`ApplicationStop`	User (AppSpec)	Yes — run scripts to stop current app
2	`DownloadBundle`	Agent only	NO — cannot configure in AppSpec
3	`BeforeInstall`	User (AppSpec)	Yes — pre-install tasks (backup, decrypt)
4	`Install`	Agent only	NO — agent copies files per AppSpec
5	`AfterInstall`	User (AppSpec)	Yes — post-install (chmod, config)
6	`ApplicationStart`	User (AppSpec)	Yes — start/restart your application
7	`ValidateService`	User (AppSpec)	Yes — health checks, smoke tests

Exam key: DownloadBundle and Install are agent-managed only. You cannot write scripts for them in AppSpec. Any answer saying “configure DownloadBundle in AppSpec” is wrong.
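
The rule can be expressed as a quick check on the hooks section of an AppSpec, represented here as a plain dict. The hook names come from the table above; the function itself is a hypothetical helper, not something CodeDeploy provides.

```python
SCRIPTABLE_HOOKS = [
    "ApplicationStop",
    "BeforeInstall",
    "AfterInstall",
    "ApplicationStart",
    "ValidateService",
]
AGENT_ONLY_HOOKS = ["DownloadBundle", "Install"]


def check_hooks(hooks: dict) -> list:
    """Return problems found in the hooks section of an EC2/on-prem AppSpec."""
    problems = []
    for name in hooks:
        if name in AGENT_ONLY_HOOKS:
            problems.append(f"{name} is agent-managed and cannot be scripted")
        elif name not in SCRIPTABLE_HOOKS:
            problems.append(f"{name} is not a hook listed in the table above")
    return problems


print(check_hooks({"DownloadBundle": [], "AfterInstall": []}))
```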

DownloadBundle Failures — Deep Dive:

Error Message	Root Cause	Fix
`UnknownError: not opened for reading`	EC2 instance IAM profile lacks S3 read permissions	Add `s3:Get*`, `s3:List*` to instance profile
`DownloadBundle` timeout	Agent cannot reach S3 (network/VPC issue)	Check security groups, NACLs, VPC endpoints
`DownloadBundle` with access denied	S3 bucket policy denies the instance role	Update bucket policy OR instance profile
Agent not found / deployment stuck	CodeDeploy agent not installed or not running	Install agent via SSM Run Command; start service

Exam traps around DownloadBundle:

S3 versioning is NOT required for CodeDeploy to download bundles
DownloadBundle works in all regions (not region-restricted)
You cannot configure DownloadBundle in the AppSpec file (it’s agent-managed)
The most common cause is missing IAM permissions on the EC2 instance profile

Rollback:

Automatic rollback deploys last known good revision as a new deployment (new ID)
Does NOT restore previous deployment; creates a new one
Triggers: deployment failure or CloudWatch alarm breach

3.4.3 CloudFormation

Template Sections:

Section	Required?	Purpose
AWSTemplateFormatVersion	No	Template version (2010-09-09)
Description	No	Template description
Parameters	No	Input values at deploy time
Mappings	No	Static key-value lookup tables
Conditions	No	Conditional resource creation
Transform	No	Macros (SAM: `AWS::Serverless-2016-10-31`)
Resources	YES	AWS resources (ONLY required section)
Outputs	No	Export values for cross-stack references

Key Intrinsic Functions:

Function	Purpose
`Ref`	Reference parameter or resource logical ID
`Fn::GetAtt`	Get attribute of a resource
`Fn::Join`	Concatenate strings with delimiter
`Fn::Sub`	String substitution with variables
`Fn::Select`	Select item from list by index
`Fn::ImportValue`	Import exported output from another stack
`Fn::FindInMap`	Lookup value in Mappings section

Cross-Stack References:

Stack A: Outputs with Export: Name: "VPC-ID"
Stack B: Fn::ImportValue: "VPC-ID"

Change Sets: Preview changes before executing. Shows adds, modifies, replaces.

Drift Detection: Identifies resources changed outside CloudFormation.

Helper Scripts (EC2):

Script	Purpose
`cfn-init`	Install packages, create files, start services
`cfn-signal`	Signal CloudFormation that instance is ready
`cfn-hup`	Daemon detecting metadata changes, re-runs cfn-init
`cfn-get-metadata`	Retrieve metadata from template

3.4.4 AWS SAM

CloudFormation extension for serverless; requires Transform: AWS::Serverless-2016-10-31

SAM Resource Types:

SAM Resource	Equivalent
`AWS::Serverless::Function`	Lambda + IAM Role + API GW + Events
`AWS::Serverless::Api`	API Gateway RestApi
`AWS::Serverless::HttpApi`	API Gateway HttpApi (v2)
`AWS::Serverless::SimpleTable`	DynamoDB Table
`AWS::Serverless::LayerVersion`	Lambda Layer

SAM CLI Commands:

Command	Purpose
`sam init`	Scaffold project (not needed if project exists)
`sam build`	Install dependencies, prepare artifacts
`sam deploy`	Package (zip + S3) + Deploy via CloudFormation (combined!)
`sam local invoke`	Local Lambda testing
`sam local start-api`	Local API Gateway

sam deploy = sam package + deploy combined
CloudFormation CLI requires TWO commands: aws cloudformation package, then aws cloudformation deploy

SAM Policy Templates: Predefined IAM policies like DynamoDBCrudPolicy, S3ReadPolicy, SQSPollerPolicy.

3.4.5 AWS CDK

Infrastructure as code (Python, TypeScript, Java, C#, Go)
Compiles to CloudFormation via cdk synth

Command	Purpose
`cdk bootstrap`	FIRST command in new account/region (creates S3 for assets)
`cdk synth`	Generate CloudFormation template
`cdk deploy`	Deploy stack
`cdk diff`	Preview changes
`cdk destroy`	Delete stack

NoSuchBucket error means run cdk bootstrap
Constructs: L1 (raw CFN), L2 (opinionated defaults), L3 (patterns)

3.4.6 Elastic Beanstalk Deployments

Strategy	Downtime	Capacity	Rollback	Cost
All-at-once	YES	Reduced	Manual redeploy	Lowest
Rolling	No	Reduced	Manual redeploy	Low
Rolling + Batch	No	Full	Manual redeploy	Medium
Immutable	No	Full	Terminate new ASG	Higher
Traffic Splitting	No	Full	Reroute traffic	Higher
Blue/Green	No	Full	CNAME swap	Highest

Blue/Green: Best for major platform changes
Immutable: Best for quick rollback without DNS complexity

3.4.7 CodeArtifact

Managed artifact repository for Maven, npm, pip, NuGet
Upstream repos: npmjs.com, Maven Central, PyPI
Cross-account sharing via resource policies

DOMAIN 4 — Troubleshooting and Optimization (18%)

Task 4.1: Assist in Root Cause Analysis

4.1.1 CloudWatch Logs

Log Groups: Container for log streams (one per application/service)
Log Streams: Sequence of events from a single source
Metric Filters: Extract metrics from log data (count ERROR occurrences)
Insights: Query and analyze log data with purpose-built query language
Subscription Filters: Real-time processing to Lambda, Kinesis, Firehose
Retention: Configure per log group (1 day to 10 years; default forever)

Lambda Logging:

Execution role needs: logs:CreateLogGroup, logs:CreateLogStream, logs:PutLogEvents
Managed policy: AWSLambdaBasicExecutionRole

4.1.2 CloudWatch Metrics and Alarms

Custom Metrics:

PutMetricData API to publish custom metrics
High-resolution: 1-second granularity (standard is 60 seconds)
Embedded Metric Format (EMF): Structured log auto-extracted as metric
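
An EMF record is an ordinary JSON log line with an `_aws` metadata block describing which top-level fields are metrics and dimensions. A minimal sketch of building one; the namespace, metric name, and dimension values are made up for illustration.

```python
import json
import time


def emf_record(namespace: str, metric: str, value: float, unit: str, dimensions: dict) -> str:
    """Build one Embedded Metric Format log line as a JSON string."""
    record = {
        "_aws": {
            "Timestamp": int(time.time() * 1000),  # milliseconds since epoch
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    "Dimensions": [list(dimensions.keys())],
                    "Metrics": [{"Name": metric, "Unit": unit}],
                }
            ],
        },
        metric: value,
    }
    record.update(dimensions)  # dimension values live at the top level
    return json.dumps(record)


line = emf_record("MyApp", "Latency", 42, "Milliseconds", {"Service": "checkout"})
print(line)
```

Printing such a line from a Lambda function is enough; no PutMetricData call is made by the code.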

Alarms:

State	Meaning
OK	Metric within threshold
ALARM	Metric breached threshold
INSUFFICIENT_DATA	Not enough data to evaluate

Actions: SNS, Auto Scaling, EC2, Lambda
Composite alarms: AND/OR logic across multiple alarms

4.1.3 CloudTrail

Records API calls (management events by default)
Data events (S3, Lambda) are opt-in with additional cost
Use for: auditing, compliance (“who did what when”)
NOT for distributed tracing (use X-Ray)

4.1.4 Common Error Reference

Error	Root Cause	Fix
API Gateway 502	Lambda wrong response format	Return `{statusCode, headers, body}`
API Gateway 504	Backend exceeded 29s timeout	Optimize Lambda or use async
Lambda 429	Concurrency limit reached	Reserved concurrency + backoff
DynamoDB `ProvisionedThroughputExceededException`	Hot partition or low capacity	Better PK + exponential backoff
Lambda `AccessDeniedException`	Execution role missing permissions	Update execution role IAM policy
EC2 `UnauthorizedOperation`	IAM policy missing a required permission	Decode the message with `sts:DecodeAuthorizationMessage`, then add the missing permission
CDK `NoSuchBucket`	CDK not bootstrapped	Run `cdk bootstrap`
Lambda `Unable to import module`	Missing deps in package	Install locally + zip + upload
CodeBuild `RequestError timeout`	Missing proxy config	Add proxy to `buildspec.yml`
CodeDeploy `DownloadBundle` / `not opened for reading`	EC2 IAM profile missing S3 permissions	Add S3 read perms to instance profile
CodeDeploy `DownloadBundle` stuck	Agent not running or network blocked	Check agent + SG + NACLs + VPC endpoints
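
For the 502 row, a sketch of a handler that returns the response shape a Lambda proxy integration expects. The payload is a placeholder; the point is that `body` is a JSON string rather than a dict.

```python
import json


def lambda_handler(event, context):
    """Return the response shape API Gateway proxy integration expects."""
    payload = {"message": "ok"}
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),  # body must be a string, not a dict
    }


print(lambda_handler({}, None))
```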

Task 4.2: Instrument Code for Observability

4.2.1 AWS X-Ray

Core Concepts:

Concept	Description
Trace	End-to-end journey of a single request
Segment	Work done by a single service (auto-generated)
Subsegment	Granular downstream calls within a segment
Annotation	Key-value pair, indexed for search/filter
Metadata	Key-value pair, NOT indexed; for debug data
Service Map	Visual graph of distributed application
Sampling	Rules controlling how many traces are recorded

Annotations vs Metadata:

Feature	Annotations	Metadata
Indexed	YES (searchable)	NO
Value types	Strings, numbers, booleans	Any type (objects, arrays)
Use for	Filtering by user_id, env	Debug data you do not search
API	`putAnnotation(key, value)`	`putMetadata(key, value)`

Subsegment Namespaces:

Namespace	Type of Call
`aws`	AWS SDK call (DynamoDB, S3, SQS)
`remote`	External HTTP API call
(none)	Custom subsegment (your own code logic)

Trace Analysis APIs:

GetTraceSummaries to filter traces by annotations and get trace IDs
BatchGetTraces to get full traces by ID (no filtering)

Integration by Service:

Service	How to Enable
Lambda	Enable checkbox (TracingConfig: Active); automatic
ECS / Fargate	X-Ray daemon as sidecar container (UDP port 2000!)
Elastic Beanstalk	`.ebextensions/xray.config`
EC2	Install X-Ray daemon + instance role with `AWSXRayDaemonWriteAccess` policy
API Gateway	Enable tracing in stage settings

“Install X-Ray daemon on Lambda” is WRONG. Just enable the checkbox. X-Ray daemon on ECS uses UDP port 2000 (not TCP).

Sampling Rules:

Reservoir: Fixed traces per second (guaranteed)
Rate: Percentage of additional traces beyond reservoir
Default: 1 request/sec + 5% of additional requests
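
The reservoir and rate combine arithmetically, which the sketch below estimates for a steady request rate. It is a simplified per-second model under the default rule; the function name is illustrative.

```python
def expected_traces_per_second(requests_per_second: float,
                               reservoir: float = 1.0,
                               rate: float = 0.05) -> float:
    """Estimate sampled traces/sec: reservoir first, then a fixed rate of the rest."""
    guaranteed = min(requests_per_second, reservoir)
    additional = max(0.0, requests_per_second - reservoir) * rate
    return guaranteed + additional


# At 100 requests/sec: 1 from the reservoir plus 5% of the other 99.
print(round(expected_traces_per_second(100), 2))
```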

X-Ray IAM Policies:

Role	Policy Needed
X-Ray daemon (EC2/ECS)	`AWSXRayDaemonWriteAccess`
Lambda with X-Ray	Automatic (just enable)
View traces in console	`AWSXrayReadOnlyAccess`
Full access	`AWSXrayFullAccess`

4.2.2 CloudWatch Contributor Insights

Identify top contributors (e.g., top IPs causing errors)
DynamoDB: Identify most accessed items (hot keys)

Task 4.3: Optimize Applications Using AWS Services

4.3.1 Lambda Optimization

Optimization	Approach
Reduce cold starts	Provisioned concurrency; keep functions warm
Faster execution	Increase memory (= more CPU); optimize code
Reduce package size	Use layers; remove unused deps; container images
Reuse connections	Initialize SDK clients OUTSIDE the handler
Caching	Store data in /tmp; use global variables
Async processing	Decouple with SQS/SNS; return early
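
The "initialize OUTSIDE the handler" row is easiest to see with a counter. In this sketch `create_client` is a hypothetical stand-in for creating an SDK client or database connection; calling the handler twice simulates two warm invocations in the same execution environment.

```python
INIT_COUNT = 0


def create_client():
    """Hypothetical expensive setup, standing in for creating an SDK client."""
    global INIT_COUNT
    INIT_COUNT += 1
    return {"name": "fake-client"}


# Runs once per execution environment (cold start), not once per invocation.
CLIENT = create_client()


def lambda_handler(event, context):
    # Reuses the module-level client on every warm invocation.
    return {"client": CLIENT["name"], "inits": INIT_COUNT}


first = lambda_handler({}, None)
second = lambda_handler({}, None)
print(first["inits"], second["inits"])
```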

4.3.2 DynamoDB Optimization

Goal	Approach
Fix hot partitions	High-cardinality PK; add random suffix if needed
Read optimization	DAX for microsecond reads; ElastiCache for computed
Write optimization	Batch writes; on-demand for spiky traffic
Cost optimization	Eventually consistent reads (half cost); TTL for cleanup
Query optimization	Use Query not Scan; project only needed attributes
GSI throttle prevention	Ensure GSI WCU >= base table WCU
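
The random-suffix approach to hot partitions (often called write sharding) can be sketched as below. The `#` separator and the shard count are assumptions; the trade-off is that reads must query every shard key and merge the results.

```python
import random


def sharded_key(base_key: str, shard_count: int) -> str:
    """Append a random suffix so writes spread across shard_count partition keys."""
    return f"{base_key}#{random.randrange(shard_count)}"


def all_shard_keys(base_key: str, shard_count: int) -> list:
    """Keys a reader must query (and merge) to see every item for base_key."""
    return [f"{base_key}#{n}" for n in range(shard_count)]


print(sharded_key("2026-03-01", 4))
print(all_shard_keys("2026-03-01", 4))
```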

4.3.3 S3 Performance

Optimization	Detail
Upload speed	Multipart upload (recommended for files larger than 100 MB)
Long-distance transfers	S3 Transfer Acceleration (routes uploads and downloads through CloudFront edge locations)
Byte-range fetches	Download only portions of an object
S3 Select	Query data in place using SQL (filter before download)
Request rate	3,500 PUT + 5,500 GET per prefix per second
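
Byte-range fetches come down to computing HTTP Range header values, which can then be requested in parallel. A minimal sketch of the arithmetic; the chunk size is arbitrary, and each value would be passed as the `Range` argument of a GET request for the object.

```python
def byte_ranges(object_size: int, chunk_size: int) -> list:
    """Split an object into HTTP Range header values, one per chunk."""
    ranges = []
    for start in range(0, object_size, chunk_size):
        end = min(start + chunk_size, object_size) - 1  # Range ends are inclusive
        ranges.append(f"bytes={start}-{end}")
    return ranges


print(byte_ranges(10, 4))
```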

4.3.4 Caching Decision Tree

  DynamoDB reads with known keys?
    YES --> DAX

  Computed/aggregated data or multi-source?
    YES --> ElastiCache (Redis/Memcached)

  API responses?
    YES --> API Gateway caching

  Static content?
    YES --> CloudFront

4.3.5 CloudFront

Cache Update Strategies (exam favorite!):

Strategy	Immediate?	Cost	How It Works
Versioned file names	YES	FREE	Change URL (e.g., `img_v2.jpg`); new URL = cache miss = fresh content
Invalidation	YES	Costs $$	Removes objects from edge caches; first 1,000 paths free/month, then $0.005/path
Wait for TTL expiration	NO	Free	Objects served stale until TTL expires
Disable/re-enable distribution	NO	Free	Does NOT clear cache; causes downtime

Exam trap: “Fast AND cost-efficient” = versioned file names (not invalidation!)

Invalidation is fast but NOT cost-efficient for thousands of objects

Versioned file names are both fast AND free

Waiting for expiration is free but NOT fast

Disabling distribution doesn’t clear cache and causes downtime
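
One common way to produce versioned file names is to derive the version from the file content, so the URL changes exactly when the content does. This is a sketch of that variant (the guide's `img_v2.jpg` style with a manual counter works equally well); the 8-character hash length is an arbitrary choice.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(path: str, content: bytes) -> str:
    """Insert a short content hash before the extension, e.g. img.<hash>.jpg."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


old = versioned_name("static/img.jpg", b"old image bytes")
new = versioned_name("static/img.jpg", b"new image bytes")
print(old)
print(new)
```

Because the new name is a different cache key, CloudFront fetches it from the origin on first request and no invalidation is needed.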

Other CloudFront Features:

CloudFront Functions: Lightweight edge logic (viewer request/response), JavaScript only, < 1ms
Lambda@Edge: Heavier processing (viewer and origin events), Node.js/Python, up to 30s on origin events (5s on viewer events)
CloudFront-Viewer-Country header for geo-routing
Origin Access Control (OAC): Restrict S3 access to CloudFront only (replaces OAI)
S3 Transfer Acceleration: Uses CloudFront edge for faster S3 uploads (different from CloudFront distribution)

APPENDIX A — Service Limits Quick Reference

Service	Key Limits
Lambda timeout	15 minutes (900 seconds)
Lambda memory	128 MB - 10,240 MB
Lambda concurrent	1,000 (soft limit, per account per region)
Lambda package	50 MB zip, 250 MB unzipped, 5 layers, 10 GB container
Lambda /tmp	512 MB free, up to 10 GB
API Gateway timeout	29 seconds integration timeout
API Gateway TPS	10,000 requests/second (account level)
API Gateway cache TTL	0 - 3600 seconds (default 300)
DynamoDB item size	400 KB max
DynamoDB RCU	1 RCU = 1 strongly consistent read/sec of up to 4 KB, or 2 eventually consistent reads/sec of up to 4 KB each
DynamoDB WCU	1 WCU = 1 KB/sec
DynamoDB LSI	5 per table (creation only)
DynamoDB GSI	20 per table (anytime)
DynamoDB Streams	24-hour retention
SQS Standard TPS	Unlimited
SQS FIFO TPS	300 (3,000 with batching)
SQS message size	256 KB (up to 2 GB with Extended Client)
SQS retention	1 min - 14 days (default 4 days)
SQS visibility	0 - 12 hours (default 30 seconds)
Kinesis retention	24 hours - 365 days
KMS direct encrypt	4 KB max
KMS API quota	5,500 - 30,000 req/sec per region
Cognito Sync	1 MB per dataset, 20 datasets per identity
Step Functions events	25,000 execution history events (standard)
CloudFormation	500 resources per stack
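
Capacity questions on the exam usually reduce to the RCU and WCU rows above. A small calculator sketch, assuming item sizes are rounded up to the next 4 KB (reads) or 1 KB (writes) and that eventually consistent reads cost half:

```python
import math


def read_capacity_units(item_kb: float, reads_per_second: int,
                        strongly_consistent: bool = True) -> int:
    """RCUs needed: 4 KB per unit; eventually consistent reads cost half."""
    units = math.ceil(item_kb / 4) * reads_per_second
    return units if strongly_consistent else math.ceil(units / 2)


def write_capacity_units(item_kb: float, writes_per_second: int) -> int:
    """WCUs needed: 1 KB per unit, rounded up per item."""
    return math.ceil(item_kb) * writes_per_second


# 10 KB items, 10 reads/sec: ceil(10 / 4) = 3 units per read.
print(read_capacity_units(10, 10))          # strongly consistent
print(read_capacity_units(10, 10, False))   # eventually consistent
print(write_capacity_units(2.5, 5))         # ceil(2.5) = 3 units per write
```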

APPENDIX B — Exam Strategy

Key Phrases and Their Answers:

Phrase in Question	Likely Answer Direction
“Least operational effort”	Managed service, built-in feature
“Most secure”	IAM roles, least privilege, encryption
“Cost-effective”	Query over Scan, binpack, on-demand, reserved
“Without code changes”	ALB OIDC, CloudFront, API caching
“Near real-time”	DynamoDB Streams, Kinesis, EventBridge
“Exactly-once”	SQS FIFO (NOT Standard)
“Cross-account”	AssumeRole + trust policy
“MFA protected”	GetSessionToken
“Ordering guaranteed”	SQS FIFO, Kinesis (per shard)
“Decouple”	SQS (NOT Kinesis unless streaming needed)

Service Confusion Matrix:

Scenario	DO NOT Pick	DO Pick
Simple decoupling	Kinesis	SQS
Serverless deploy	Raw CloudFormation	SAM
Coordinate Lambdas	Direct invoke	Step Functions
Feature flags	Lambda + SSM	AppConfig
Distributed tracing	CloudWatch Logs	X-Ray
API call auditing	X-Ray	CloudTrail
npm / Maven repository	ECR	CodeArtifact
Cross-device sync (1 user)	AppSync	Cognito Sync
Multi-user real-time	Cognito Sync	AppSync
Infra as code in Python	CloudFormation YAML	CDK
DB credential rotation	Parameter Store	Secrets Manager
Container image registry	S3	ECR
Third-party webhook (least effort)	API GW + Lambda Auth	Lambda Function URL (AuthType NONE)
JWT auth for API Gateway	Identity Pool	User Pool (Cognito Authorizer)
Public HTTPS for Lambda	API Gateway (overkill)	Lambda Function URL

When Stuck Between Two Answers:

Does it follow least privilege?
Does it use a managed/native service?
Does it address the root cause (not symptoms)?
Is it the simplest path meeting ALL requirements?

APPENDIX C — Additional Service Cards

AWS AppSync: Managed GraphQL; real-time subscriptions via WebSocket; offline support with conflict resolution.

AWS AppConfig: Feature flags and dynamic configuration; gradual rollout without deployment; preferred over Lambda + Parameter Store.

Amazon ECS Task Placement:

Strategy	Behavior	Use Case
binpack	Pack tightly (fewest instances)	Cost optimization
spread	Distribute across AZs or instances	High availability
random	Random placement	Testing

ECS Roles:

Role	Who Uses It	For What
Execution Role	ECS Agent	Pull ECR images, push CloudWatch logs
Task Role	Your container	Call AWS APIs (S3, DynamoDB, SNS, etc.)

EC2 Instance Metadata:

http://169.254.169.254/latest/meta-data/ for instance details
http://169.254.169.254/latest/user-data/ for launch scripts
IMDSv2 (recommended): Requires session token
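
The IMDSv2 session-token flow is two requests: a PUT that obtains a token, then a GET that presents it. The sketch below only builds the two requests so it runs anywhere; sending them works only from inside an EC2 instance. The 21600-second TTL and the `instance-id` path are example values.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254/latest"


def token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """Step 1: PUT request that asks IMDS for a session token."""
    req = urllib.request.Request(f"{IMDS_BASE}/api/token", method="PUT")
    req.add_header("X-aws-ec2-metadata-token-ttl-seconds", str(ttl_seconds))
    return req


def metadata_request(path: str, token: str) -> urllib.request.Request:
    """Step 2: GET request that presents the session token."""
    req = urllib.request.Request(f"{IMDS_BASE}/meta-data/{path}", method="GET")
    req.add_header("X-aws-ec2-metadata-token", token)
    return req


step1 = token_request()
step2 = metadata_request("instance-id", "token-from-step-1")
print(step1.get_method(), step1.full_url)
print(step2.get_method(), step2.full_url)
```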

ALB: Layer 7; OIDC auth on HTTPS:443 (no code changes); X-Forwarded-For for client IP; Lambda targets supported. NLB is Layer 4 with no OIDC and no Lambda targets.

SQS Extended Client Library: Java SDK only. Messages up to 2 GB via S3 storage. Not available via CLI, console, or other SDKs.

AWS DVA-C02 Exam-Aligned Study Guide | March 2026