AWS Developer Associate (DVA-C02): Exam-Aligned Study Guide
Structured by Official Exam Guide Domains and Tasks
Reference: AWS DVA-C02 Exam Guide PDF
Last updated: March 2026
Exam Blueprint
| Domain | Weight | Questions (~65 total) |
|---|---|---|
| 1. Development with AWS Services | 32% | ~21 |
| 2. Security | 26% | ~17 |
| 3. Deployment | 24% | ~16 |
| 4. Troubleshooting and Optimization | 18% | ~11 |
DOMAIN 1: Development with AWS Services (32%)
Task 1.1: Develop Code for Applications Hosted on AWS
1.1.1 Architectural Patterns
Loosely Coupled (resilient):
  Svc A --> SQS --> Svc B
Tightly Coupled (fragile):
  Svc A <--> Svc B (direct calls)
Fan-out Pattern:
  SNS --> SQS-1 --> Svc-A
      --> SQS-2 --> Svc-B
      --> SQS-3 --> Svc-C
Event-Driven:
  S3 Event --> Lambda
  DDB Stream --> Lambda
  EventBridge --> Step Functions
Service Selection for Decoupling:
| Pattern | Service | When to Use |
|---|---|---|
| Queue-based | SQS | Point-to-point, async processing |
| Pub/Sub | SNS | One-to-many broadcast |
| Fan-out | SNS + SQS | Broadcast + independent parallel processing |
| Event bus | EventBridge | Cross-account, SaaS integration, rule-based |
| Streaming | Kinesis Data Streams | Real-time, ordered, high-volume data ingestion |
| Orchestration | Step Functions | Complex multi-step workflows with state |
| Choreography | EventBridge | Loosely coupled event-driven microservices |
1.1.2 AWS SDK and API Essentials
Retry and Exponential Backoff:
- All AWS SDKs implement automatic retries with exponential backoff
- `ThrottlingException` and `ProvisionedThroughputExceededException` trigger auto-retry
- Custom formula: `base * 2^attempt` with jitter (add randomness to avoid thundering herd)
- Always cap maximum backoff to prevent infinite waits
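The custom formula above can be sketched as follows. This is a minimal illustration, not what any SDK does internally: the names `backoff_delay` and `call_with_retries`, the 0.1 s base, and the 20 s cap are assumptions chosen for the example, and "full jitter" (a random delay between zero and the capped exponential value) is one common jitter variant.

```python
import random
import time


def backoff_delay(attempt, base=0.1, cap=20.0, rng=random.random):
    """Return a full-jitter delay in seconds for a zero-based retry attempt."""
    ceiling = min(cap, base * (2 ** attempt))  # exponential growth, capped
    return rng() * ceiling                     # random point in [0, ceiling)


def call_with_retries(operation, is_retryable, max_attempts=5, sleep=time.sleep):
    """Call operation(); on a retryable error, wait with backoff and try again."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception as exc:
            # Give up on non-retryable errors or when attempts are exhausted.
            if not is_retryable(exc) or attempt == max_attempts - 1:
                raise
            sleep(backoff_delay(attempt))
```

Passing `sleep` and `rng` as parameters keeps the sketch easy to test without real waiting.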
Pagination:
- Most `List*` / `Describe*` APIs return paginated results
- Use `NextToken` / `Marker` to fetch subsequent pages
- SDKs provide built-in paginators (e.g., `get_paginator(...).paginate()` on boto3 clients, `.pages()` on boto3 resource collections)
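The manual `NextToken` loop that paginators hide looks roughly like this. The sketch assumes a response shape with `Items` and `NextToken` keys; real APIs name their result list differently, and `fake_list_things` is a hypothetical stand-in for an SDK call so the example runs offline.

```python
def iterate_pages(list_call, **params):
    """Yield every item from a NextToken-paginated call."""
    token = None
    while True:
        kwargs = dict(params)
        if token:
            kwargs["NextToken"] = token   # ask for the page after the last one
        page = list_call(**kwargs)
        yield from page.get("Items", [])
        token = page.get("NextToken")
        if not token:                     # no token means this was the last page
            break


def fake_list_things(NextToken=None, PageSize=2):
    """Hypothetical List* API: returns PageSize items per call."""
    data = ["a", "b", "c", "d", "e"]
    start = int(NextToken or 0)
    end = start + PageSize
    page = {"Items": data[start:end]}
    if end < len(data):
        page["NextToken"] = str(end)
    return page


all_items = list(iterate_pages(fake_list_things, PageSize=2))
```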
Waiters:
- SDK utility to poll until a resource reaches a desired state
- Example: `ec2.get_waiter('instance_running').wait(InstanceIds=[...])`
Idempotency:
- Use `ClientToken` / `IdempotencyToken` to prevent duplicate operations
- SQS FIFO: `MessageDeduplicationId` provides a 5-minute deduplication window
1.1.3 Amazon API Gateway
Endpoint Types:
| Type | Description | Use Case |
|---|---|---|
| Edge-optimized | Routed through CloudFront edge locations | Global clients (default) |
| Regional | Served from the API region | Same-region clients, custom CDN |
| Private | Accessible only from within a VPC | Internal microservices |
Integration Types:
| Integration | Request Transform | Response Transform | Notes |
|---|---|---|---|
| Lambda Proxy | No (raw pass) | No (Lambda formats) | Lambda MUST return {statusCode, headers, body} |
| Lambda Custom | Mapping template | Mapping template | Use for SOAP-to-REST, XML-to-JSON |
| HTTP Proxy | No | No | Pass-through to HTTP endpoint |
| HTTP Custom | Mapping template | Mapping template | Transform before/after HTTP backend |
| AWS Service | Mapping template | Mapping template | Direct integration (SQS, DynamoDB, S3) |
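For the Lambda Proxy row, a minimal handler returning the required shape might look like this. The greeting logic and the `name` query parameter are hypothetical; the point is that `body` must be a string and the status code and headers come from the function itself.

```python
import json


def handler(event, context):
    """Return a response in the shape Lambda proxy integration expects."""
    # queryStringParameters is None when the request has no query string.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {
            "Content-Type": "application/json",
            # With proxy integration, CORS headers are set by the function.
            "Access-Control-Allow-Origin": "*",
        },
        "body": json.dumps({"message": f"hello {name}"}),  # must be a string
    }
```

Returning a bare dict or list without `statusCode` and a string `body` is the usual cause of the 502 listed under Error Codes.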
Stages and Stage Variables:
- Deploy to named stages: `/dev`, `/staging`, `/prod`
- Stage variables act as environment variables for API Gateway
- Reference Lambda alias via stage variable: `${stageVariables.lambdaAlias}`
- Canary deployments on stages: route a percentage of traffic to the canary
Caching:
- Enable per stage; TTL 0-3600s (default 300s)
- Invalidate: `Cache-Control: max-age=0` header (requires `execute-api:InvalidateCache`)
- Metrics: `CacheHitCount`, `CacheMissCount` (only visible when caching enabled)
Throttling:
- Account-level: 10,000 requests/second (soft limit)
- Stage/method-level throttling via Usage Plans
- API Keys for identification (NOT authentication), paired with Usage Plans
- Returns `429 Too Many Requests` when throttled
CORS:
- Lambda Proxy: Return CORS headers FROM the Lambda function itself
- Lambda Custom: Configure CORS in API Gateway console
- Required headers: `Access-Control-Allow-Origin`, `Access-Control-Allow-Methods`, `Access-Control-Allow-Headers`
Error Codes:
| Code | Meaning | Root Cause |
|---|---|---|
| 400 | Bad Request | Malformed request syntax |
| 403 | Forbidden | WAF blocked, IAM denied, resource policy denied |
| 429 | Too Many Requests | Throttle limit exceeded |
| 502 | Bad Gateway | Lambda returned invalid response format |
| 503 | Service Unavailable | Temporary backend issue |
| 504 | Gateway Timeout | Backend exceeded 29-second integration timeout |
1.1.4 Messaging and Event Services
Amazon SQS:
| Feature | Standard Queue | FIFO Queue |
|---|---|---|
| Throughput | Unlimited | 300 TPS (3,000 with batching) |
| Delivery | At-least-once | Exactly-once |
| Ordering | Best-effort | Strict FIFO per Message Group ID |
| Deduplication | None | Content-based or MessageDeduplicationId |
| Queue name | Any | Must end with .fifo |
| Retention | 1 min - 14 days (default 4) | Same |
| Max message size | 256 KB | 256 KB |
| Visibility timeout | 0s - 12h (default 30s) | Same |
- Visibility Timeout: Set >= max processing time; for Lambda, set >= 6x Lambda timeout
- Dead-Letter Queue (DLQ): After `maxReceiveCount` failures, message sent to DLQ
- Long Polling: `WaitTimeSeconds` > 0 (max 20s) reduces empty responses and cost
- Short Polling: Returns immediately, may return empty, more API calls
- Extended Client Library (Java only): For messages > 256 KB (up to 2 GB via S3)
Amazon SNS:
- Pub/Sub: topic to subscribers (Lambda, SQS, HTTP/S, email, SMS)
- Message Filtering: Subscription filter policy so subscribers get only matching messages
- Fan-out: SNS to multiple SQS queues for parallel independent processing
- FIFO Topics: Pair with SQS FIFO for ordered fan-out
- Message attributes: Key-value metadata attached to messages
Amazon EventBridge:
- Serverless event bus for application events
- Rules: Match event patterns and route to targets (Lambda, SQS, Step Functions)
- Schema Registry: Auto-discover and version event schemas
- Archive and Replay: Store events and replay them for debugging/recovery
- Cross-account: Send/receive events across AWS accounts
- Scheduler: Cron and rate-based scheduling (replaces CloudWatch Events)
Amazon Kinesis Data Streams:
| Feature | Detail |
|---|---|
| Retention | 24 hours default, max 365 days |
| Ordering | Per shard (by partition key) |
| Consumers | Standard (shared) or Enhanced (dedicated) |
| Throughput per shard | 1 MB/s in, 2 MB/s out (standard) |
| Enhanced fan-out | 2 MB/s per consumer per shard |
| Resharding | Split hot shards, merge cold shards |
- `PutRecord` + `SequenceNumberForOrdering` = strict order within shard
- `PutRecords` (batch) does NOT guarantee cross-record order
- `ProvisionedThroughputExceededException`: use exponential backoff or increase shards
1.1.5 AWS Step Functions
Workflow Types:
| Feature | Standard | Express |
|---|---|---|
| Duration | Up to 1 year | Up to 5 minutes |
| Execution model | Exactly-once | At-least-once (async) or sync |
| Pricing | Per state transition | Per execution + duration + memory |
| History | Full (25,000 events max) | Sent to CloudWatch Logs |
| Use case | Long-running, auditable | High-volume, short-lived (IoT) |
State Types:
| State | Purpose |
|---|---|
| Task | Do work (Lambda, ECS, Batch, DynamoDB, SNS, SQS) |
| Choice | Conditional branching (if/else) |
| Wait | Delay by seconds or until a timestamp |
| Parallel | Run branches concurrently |
| Map | Iterate over an array (dynamic parallelism) |
| Pass | Pass-through / inject fixed data (debugging) |
| Succeed | Terminal success state |
| Fail | Terminal failure state (no retry from Fail) |
Input/Output Processing:
Raw Input
|
InputPath ---- Filter what the state sees (e.g., "$.order")
|
Parameters --- Reshape input, add static values
|
[ STATE ] -- Does work, produces RESULT
|
ResultSelector -- Filter/transform raw result
|
ResultPath ----- WHERE to place result relative to input
| "$.taskResult" = input.taskResult = result
| "$" = result REPLACES entire input
| null = result DISCARDED, input unchanged
OutputPath ----- Final filter for next state
|
Output to Next State
Exam key: `ResultPath` is the one that COMBINES input + result.
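The three `ResultPath` cases in the diagram can be modeled in a few lines. This is a simplified simulation for study purposes, not the Step Functions implementation: `apply_result_path` is a hypothetical helper and it only understands `null`, `"$"`, and dotted paths such as `"$.taskResult"`.

```python
import copy


def apply_result_path(state_input, result, result_path):
    """Combine a state's input and result the way ResultPath describes."""
    if result_path is None:           # null: discard result, keep input
        return copy.deepcopy(state_input)
    if result_path == "$":            # "$": result replaces the whole input
        return copy.deepcopy(result)
    if not result_path.startswith("$."):
        raise ValueError("this sketch supports only null, '$', or '$.a.b' paths")
    output = copy.deepcopy(state_input)
    node = output
    keys = result_path[2:].split(".")
    for key in keys[:-1]:             # walk or create intermediate objects
        node = node.setdefault(key, {})
    node[keys[-1]] = copy.deepcopy(result)  # place result inside the input
    return output


order = {"order": {"id": 42}}
task_result = {"charged": True}
combined = apply_result_path(order, task_result, "$.taskResult")
```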
Error Handling:
- Retry: `ErrorEquals`, `IntervalSeconds`, `MaxAttempts`, `BackoffRate`
- Catch: `ErrorEquals`, `Next` (fallback state), `ResultPath` (preserve error info)
- Flow: Error, then Retry (up to MaxAttempts), then Catch, then Next state
- Predefined errors: `States.ALL`, `States.Timeout`, `States.TaskFailed`, `States.Permissions`
- Retry and Catch are defined in state machine JSON, NOT application code
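A sketch of how these fields sit in a Task state, written as a Python dict that mirrors the state machine JSON. The state name `NotifyFailure` and the specific numbers are hypothetical. The helper shows how the wait before each retry grows: the first wait is `IntervalSeconds`, and each later wait is multiplied by `BackoffRate`.

```python
task_state = {
    "Type": "Task",
    "Resource": "arn:aws:states:::lambda:invoke",
    "Retry": [
        {
            "ErrorEquals": ["States.TaskFailed"],
            "IntervalSeconds": 2,
            "MaxAttempts": 3,
            "BackoffRate": 2.0,
        }
    ],
    "Catch": [
        {
            "ErrorEquals": ["States.ALL"],
            "Next": "NotifyFailure",     # hypothetical fallback state
            "ResultPath": "$.error",     # keep the input, add error details
        }
    ],
    "End": True,
}


def retry_waits(retrier):
    """Seconds waited before each retry attempt for one Retry entry."""
    interval = retrier["IntervalSeconds"]
    rate = retrier["BackoffRate"]
    return [interval * rate ** n for n in range(retrier["MaxAttempts"])]


waits = retry_waits(task_state["Retry"][0])
```

Only after these retries are exhausted does the matching Catch entry route execution to its `Next` state.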
Task 1.2: Develop Code for AWS Lambda
1.2.1 Lambda Invocation Types
| Type | Behavior | Error Handling | Sources |
|---|---|---|---|
| Synchronous | Caller waits for response | Caller handles errors | API Gateway, ALB, SDK Invoke() |
| Asynchronous | Returns 202 immediately, queues internally | Auto-retry 2x, then DLQ/destination | S3, SNS, EventBridge, CloudFormation |
| Poll-based (ESM) | Lambda service polls source | DLQ on source queue (not Lambda) | SQS, Kinesis, DynamoDB Streams |
Event Source Mapping (ESM) Details:
| Source | Batch Size | Failure Handling |
|---|---|---|
| SQS | 1-10 (standard queues: up to 10,000 with a batch window) | ReportBatchItemFailures for partial batch retry |
| Kinesis | Up to 10,000 | BisectBatchOnFunctionError, on-failure destination |
| DynamoDB Streams | Up to 10,000 | BisectBatchOnFunctionError, on-failure destination |
- SQS: DLQ configured on the SQS queue, NOT on Lambda
- Kinesis/DDB: `MaximumRetryAttempts`, `MaximumRecordAgeInSeconds`
- Parallelization factor: Process multiple batches per shard concurrently
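With `ReportBatchItemFailures` enabled on an SQS event source mapping, the handler reports only the failed message IDs so the successful ones are not retried. A minimal sketch follows; `process` and its `order_id` check are hypothetical business logic.

```python
import json


def process(body):
    """Hypothetical business logic: reject messages without an order_id."""
    if "order_id" not in body:
        raise ValueError("missing order_id")


def handler(event, context):
    """Process an SQS batch and report only the records that failed."""
    failures = []
    for record in event["Records"]:
        try:
            process(json.loads(record["body"]))
        except Exception:
            # Only this message becomes visible again for retry.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

An empty `batchItemFailures` list tells Lambda the whole batch succeeded; raising an exception instead would return every message in the batch to the queue.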
1.2.2 Concurrency Model
Account Concurrency Pool (Default: 1,000)
+-----------------------------------------+
| Reserved (Fn-A): 400 (guaranteed) |
| Reserved (Fn-B): 200 (guaranteed) |
| Unreserved Pool: 400 (shared by rest) |
| AWS keeps minimum 100 unreserved! |
+-----------------------------------------+
Formula: concurrent_executions = invocations/sec x avg_duration_sec
| Concurrency Type | Behavior |
|---|---|
| Unreserved | Shared pool across all functions (default) |
| Reserved | Guarantees AND caps capacity for a function |
| Provisioned | Pre-initializes execution environments (eliminates cold starts) |
- Setting reserved concurrency to 0 = function completely disabled
- Throttled: synchronous returns `429`; async auto-retries then DLQ
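The concurrency formula and the pool arithmetic from the diagram can be checked with a short calculation. The function names are hypothetical, and the limits are the default figures quoted in this section; actual account limits vary.

```python
import math

ACCOUNT_LIMIT = 1000   # default account concurrency quoted above
MIN_UNRESERVED = 100   # capacity that must stay unreserved


def required_concurrency(invocations_per_sec, avg_duration_sec):
    """concurrent_executions = invocations/sec x avg_duration_sec."""
    return math.ceil(invocations_per_sec * avg_duration_sec)


def unreserved_pool(reserved, account_limit=ACCOUNT_LIMIT):
    """Concurrency left for functions without a reservation."""
    return account_limit - sum(reserved.values())


def can_reserve(reserved, extra, account_limit=ACCOUNT_LIMIT):
    """True if reserving `extra` more still leaves the minimum unreserved."""
    return unreserved_pool(reserved, account_limit) - extra >= MIN_UNRESERVED


reserved = {"Fn-A": 400, "Fn-B": 200}
needed = required_concurrency(200, 0.5)   # 200 req/s at 500 ms each
remaining = unreserved_pool(reserved)
```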
1.2.3 Execution Lifecycle
COLD START: Download code -> Start runtime -> Run INIT code -> Run handler
WARM START: INIT skipped -> Run handler directly
Optimization: Put expensive setup OUTSIDE the handler
- DB connections, SDK clients, cached data persist across warm invocations
- /tmp directory persists too (512 MB free, up to 10 GB)
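A small sketch of the "setup outside the handler" pattern. `_load_config` stands in for any expensive initialization (a DB connection, an SDK client); the counter exists only to show that the module-level code runs once per execution environment, not once per invocation.

```python
import time

_INIT_COUNT = 0


def _load_config():
    """Hypothetical expensive setup, e.g. opening a DB connection."""
    global _INIT_COUNT
    _INIT_COUNT += 1
    return {"loaded_at": time.time()}


# Module level: runs during INIT on a cold start and is reused while warm.
CONFIG = _load_config()


def handler(event, context):
    """Reuse CONFIG instead of rebuilding it on every invocation."""
    return {"init_count": _INIT_COUNT, "loaded_at": CONFIG["loaded_at"]}
```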
1.2.4 Lambda Configuration Limits
| Setting | Detail |
|---|---|
| Memory | 128 MB - 10,240 MB (CPU scales proportionally) |
| Timeout | Max 15 minutes (900 seconds) |
| Ephemeral storage | /tmp: 512 MB (free) up to 10 GB |
| Deployment package | 50 MB zipped, 250 MB unzipped (incl. layers) |
| Layers | Max 5 per function; extract to /opt/ |
| Env variables | Max 4 KB total size |
| vCPU | Cannot set directly (controlled by memory setting) |
| 1 full vCPU | At 1,769 MB memory |
1.2.5 Lambda Networking (VPC)
DEFAULT (no VPC): Lambda --> Internet --> AWS APIs (Private RDS not accessible)
WITH VPC CONFIG: Lambda --ENI--> Private Subnet --> RDS
For internet: Lambda --> NAT GW --> IGW --> Internet
For AWS APIs: Use VPC Endpoints (no NAT GW needed)
- Lambda creates ENIs (Elastic Network Interfaces), NOT Elastic IPs
- VPC config adds cold start latency; only use when needed
1.2.6 Lambda Layers and Deployment
- Layers: shared code/libraries across functions (extract to `/opt/`)
- Max 5 layers per function; 250 MB total unzipped (function + all layers)
- CloudFormation `ZipFile`: Inline source code (Node.js and Python only); NOT a zip file path
1.2.7 Destinations vs DLQ
| Feature | Destinations | DLQ (Dead Letter Queue) |
|---|---|---|
| Event types | Success AND failure | Failure only |
| Targets | SQS, SNS, Lambda, EventBridge | SQS or SNS only |
| Scope | Async invocations only | Async invocations only |
| Recommendation | Preferred (more flexible) | Legacy (still supported) |
1.2.8 Aliases and Versions
- Version: Immutable snapshot of function code + config
- Alias: Pointer to a version (e.g., `PROD` points to v5)
- Weighted alias: Route traffic between two versions (canary/linear)
- `$LATEST`: Mutable, always latest code; cannot be referenced by alias weights
- CodeDeploy integrates with aliases for automated traffic shifting
1.2.9 Lambda at the Edge
| Feature | CloudFront Functions | Lambda@Edge |
|---|---|---|
| Runtime | JavaScript only | Node.js, Python |
| Execution location | 218+ edge locations | Regional edge caches |
| Max duration | Less than 1 ms | 5s (viewer), 30s (origin) |
| Max memory | 2 MB | 128-3,008 MB |
| Network/file access | No | Yes |
| Use case | Header manipulation, URL rewrites | Auth, A/B testing, origin selection |
1.2.10 Lambda Function URLs
A dedicated HTTPS endpoint for a Lambda function; no API Gateway required.
Auth Types:
| Auth Type | Who Can Invoke | Use Case |
|---|---|---|
| `AWS_IAM` | Only callers with valid IAM credentials | Internal services, cross-account invocations |
| `NONE` | Anyone on the internet (public) | Webhooks, third-party callbacks, public APIs |
- Function URL format: `https://<url-id>.lambda-url.<region>.on.aws`
- Supports streaming response (response payload streamed as it's generated)
- CORS configurable directly on the Function URL (no API Gateway needed)
- Resource-based policy controls access (even with `NONE` auth, you can restrict by IP/account)
Lambda Function URL vs API Gateway:
| Feature | Lambda Function URL | API Gateway |
|---|---|---|
| Cost | Free (pay only for Lambda) | Per-request + data transfer charges |
| Setup effort | Minimal (one click / one line) | More setup (stages, methods, resources) |
| Auth options | AWS_IAM or NONE | IAM, Cognito, Lambda Authorizer, API Keys |
| Caching | No | Yes (built-in) |
| Throttling / Usage Plans | No | Yes |
| Request/response transform | No | Yes (mapping templates) |
| Custom domain | No (use CloudFront in front) | Yes (native custom domains) |
| WAF integration | No | Yes |
| Webhook from third-party | Best choice (simplest) | Works but more overhead |
Webhook Pattern (exam favorite):
Third-Party Platform (e.g., Stripe, GitHub)
|
| HTTPS POST with signature in headers
| (platform signs with a secret key)
v
Lambda Function URL (AuthType: NONE)
|
| Step 1: Extract signature from headers
| Step 2: Recompute signature using shared secret
| Step 3: Compare signatures
| Match --> Execute domain logic
| No match --> Return 403 (reject)
|
(Custom validation IN the Lambda code itself)
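The three validation steps can be sketched with the standard library. Everything platform-specific here is an assumption: the header name `x-signature`, the HMAC-SHA256 hex encoding, and the hard-coded secret are placeholders. Real platforms document their own header, encoding, and any timestamp prefix, and a real function would load the secret from a secret store. The sketch also ignores base64-encoded bodies.

```python
import hashlib
import hmac

SHARED_SECRET = b"example-secret"   # hypothetical; do not hard-code in practice
SIGNATURE_HEADER = "x-signature"    # hypothetical header name


def expected_signature(body, secret=SHARED_SECRET):
    """Step 2: recompute the signature over the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def handler(event, context):
    # Step 1: extract the signature (header names compared case-insensitively).
    headers = {k.lower(): v for k, v in (event.get("headers") or {}).items()}
    received = headers.get(SIGNATURE_HEADER, "")
    body = (event.get("body") or "").encode("utf-8")

    # Step 3: constant-time comparison to avoid timing leaks.
    expected = expected_signature(body)
    if not hmac.compare_digest(received.encode("utf-8"), expected.encode("utf-8")):
        return {"statusCode": 403, "body": "invalid signature"}

    # Signature matched: run domain logic here.
    return {"statusCode": 200, "body": "accepted"}
```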
Exam Trap: Function URL vs API Gateway for Webhooks:
| Question Pattern | Answer |
|---|---|
| Third-party webhook + public HTTPS + least effort | Lambda Function URL (NONE) + custom validation |
| Webhook + signature in headers + validate before processing | Lambda Function URL (NONE) + validate in code |
| Third-party webhook + API Gateway + Lambda Authorizer | Works but NOT least effort (extra components) |
| Function URL with `AWS_IAM` for third-party webhook | WRONG (third-party cannot sign with AWS Sig V4) |
| Function URL with `CodeSigningConfigArn` condition | WRONG (code signing = deployment packages, not requests) |
Real exam example: A third-party platform sends webhook requests signed with a secret key in headers. You need a public HTTPS endpoint processed by Lambda with the least development effort:
- Correct: Create Lambda Function URL with `AuthType: NONE` + resource-based policy allowing public invoke + custom signature validation inside the Lambda function
- Wrong: API Gateway + Lambda Authorizer (works but MORE effort: two components instead of one)
- Wrong: Function URL with `AWS_IAM` (third-party platform cannot create AWS Sig V4 signatures)
- Wrong: `CodeSigningConfigArn` condition (that validates deployment packages, not incoming HTTP requests)
Key Distinctions to Memorize:
| Term | What It Does | Exam Confusion |
|---|---|---|
| `FunctionUrlAuthType` | Controls who can call the Function URL (NONE or IAM) | Auth for HTTP callers |
| `CodeSigningConfig` | Validates deployment package integrity (code trust) | Auth for code deployments |
| Lambda Authorizer | Custom auth logic as a separate Lambda function | API Gateway only |
| Cognito Authorizer | JWT validation from Cognito User Pool | API Gateway only |
Task 1.3: Use Data Stores in Application Development
1.3.1 Amazon DynamoDB
Core Concepts:
- Partition Key (PK): Determines data distribution; must be high cardinality
- Sort Key (SK): Optional; enables range queries within a partition
- Item size: Max 400 KB
Capacity Modes:
| Mode | Billing | Best For |
|---|---|---|
| On-Demand | Pay per request | Unpredictable traffic, new tables |
| Provisioned | Set RCU/WCU (auto-scaling OK) | Predictable traffic, cost optimization |
RCU / WCU Calculations:
READ (RCU):
1 RCU = 1 strongly consistent read/sec for item <= 4 KB
= 2 eventually consistent reads/sec for item <= 4 KB
RCU = (reads/sec x ceil(item_KB / 4)) / consistency_factor
Strongly consistent: factor = 1
Eventually consistent: factor = 2 (HALF cost)
Transactional: factor = 0.5 (DOUBLE cost)
WRITE (WCU):
1 WCU = 1 write/sec for item <= 1 KB
WCU = writes/sec x ceil(item_KB / 1)
Transactional writes: multiply by 2
Example: 150 eventually consistent reads/sec, 3.5 KB items:
RCU = 150 x ceil(3.5/4) / 2 = 150 x 1 / 2 = 75 RCU
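The formulas above translate directly into a small calculator, useful for checking practice questions. The function names and the `mode` strings are hypothetical; the factors are the ones listed in this section.

```python
import math

# consistency_factor from the formula: divide by 1, 2, or 0.5
CONSISTENCY_FACTOR = {"strong": 1, "eventual": 2, "transactional": 0.5}


def rcu(reads_per_sec, item_kb, mode="strong"):
    """Read capacity units: item size rounds UP to the next 4 KB."""
    units_per_read = math.ceil(item_kb / 4)
    return math.ceil(reads_per_sec * units_per_read / CONSISTENCY_FACTOR[mode])


def wcu(writes_per_sec, item_kb, transactional=False):
    """Write capacity units: item size rounds UP to the next 1 KB."""
    units_per_write = math.ceil(item_kb / 1)
    return writes_per_sec * units_per_write * (2 if transactional else 1)


example = rcu(150, 3.5, "eventual")   # the worked example above
```

Rounding happens per item before multiplying, which is why a 4.1 KB item costs twice as much to read as a 4 KB item.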
Indexes:
| Feature | LSI (Local Secondary) | GSI (Global Secondary) |
|---|---|---|
| Partition key | Same as base table | Different from base table |
| Sort key | Different from base table | Different from base table |
| Creation time | At table creation ONLY | Anytime |
| Capacity | Shares table RCU/WCU | Has its OWN RCU/WCU |
| Consistency | Strong or Eventually | Eventually ONLY |
| Max per table | 5 | 20 |
| Size limit | 10 GB per partition key | No limit |
`ProvisionedThroughputExceededException` on writes? Check if GSI WCU is less than base table WCU.
Query vs Scan:
| Operation | What It Reads | Cost | Use When |
|---|---|---|---|
| Query | Items matching PK (+ SK) | Efficient | You know the partition key |
| Scan | Entire table | Expensive | Need all data (avoid if possible) |
- Scan applies `FilterExpression` AFTER reading (still consumes full RCU)
- Optimize Scan: parallel scan with rate limiting; set `Limit` to control page size
DynamoDB Streams:
- Captures item-level changes: INSERT, MODIFY, DELETE
- StreamViewType: `KEYS_ONLY`, `NEW_IMAGE`, `OLD_IMAGE`, `NEW_AND_OLD_IMAGES`
- 24-hour retention (Lambda must run within 24h or data loss)
- Lambda polls streams via Event Source Mapping (synchronous invocation)
Transactions:
| API | Behavior | Cost | Notes |
|---|---|---|---|
| TransactWriteItems | All-or-nothing (ACID) | 2x WCU | Up to 100 items, 4 MB total |
| TransactGetItems | Consistent read of multiple items | 2x RCU | Up to 100 items, 4 MB total |
| BatchWriteItem | Best-effort (NOT atomic) | 1x WCU | Up to 25 items, no UpdateItem |
| BatchGetItem | Best-effort | 1x RCU | Up to 100 items, 16 MB max |
Batch Operations: Partial Results and UnprocessedKeys (exam favorite!):
BatchGetItem returns partial results (UnprocessedKeys) when:
- Response size exceeds 16 MB limit
- Table's provisioned throughput is exceeded
- More than 1 MB per partition is requested
- Internal processing failure occurs
BatchWriteItem returns partial results (UnprocessedItems) when:
- Table's provisioned throughput is exceeded
- Internal processing failure occurs
How to Handle UnprocessedKeys / UnprocessedItems:
| Approach | Reliable? | Why |
|---|---|---|
| Exponential backoff with jitter (randomized delay) | YES | Reduces request frequency, avoids thundering herd |
| Use AWS SDK (built-in retry + exponential backoff) | YES | SDK handles retry logic automatically |
| Immediately retry the batch request | NO | Still throttled; high chance of failing again |
| Increase RCUs / enable Auto Scaling | Partial | Helps with throughput but partial results can still occur due to size limits |
| Create a GSI | NO | GSI doesn't change BatchGetItem behavior |
Exam example: Python script uses `BatchGetItem`, frequently gets `UnprocessedKeys`. Most reliable handling?
- Exponential backoff with randomized delay between retries
- Use AWS SDK (has built-in automatic retry + exponential backoff)
Wrong: Immediate retry (still throttled), increase RCUs (doesn't fix size-limit partials), GSI (irrelevant to batch ops)
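A sketch of the backoff-with-jitter approach for `UnprocessedKeys`. The `batch_get` argument is assumed to be a callable with the same keyword and response shape as a `BatchGetItem` call (`RequestItems` in, `Responses` and `UnprocessedKeys` out); the helper name, attempt count, and delay values are illustrative assumptions.

```python
import random
import time


def batch_get_all(batch_get, request_items, max_attempts=5,
                  base=0.05, cap=2.0, sleep=time.sleep):
    """Call batch_get until no UnprocessedKeys remain, backing off between tries."""
    results = {}
    pending = request_items
    for attempt in range(max_attempts):
        if attempt:
            # Randomized, capped exponential delay before each retry.
            sleep(random.uniform(0, min(cap, base * 2 ** (attempt - 1))))
        response = batch_get(RequestItems=pending)
        for table, items in response.get("Responses", {}).items():
            results.setdefault(table, []).extend(items)
        # Resubmit only what was left over, not the whole original batch.
        pending = response.get("UnprocessedKeys") or {}
        if not pending:
            return results
    raise RuntimeError("unprocessed keys remain after retries")
```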
Batch Limits Quick Reference:
| API | Max Items | Max Size | Partial Result Key |
|---|---|---|---|
| BatchGetItem | 100 | 16 MB | UnprocessedKeys |
| BatchWriteItem | 25 | 16 MB | UnprocessedItems |
| TransactGetItems | 100 | 4 MB | All-or-nothing (no partial) |
| TransactWriteItems | 100 | 4 MB | All-or-nothing (no partial) |
Optimistic Locking: Use version number attribute with ConditionExpression.
TTL: Auto-delete expired items (no WCU cost); eventually consistent (up to 48h delay).
DAX vs ElastiCache:
| Feature | DAX | ElastiCache |
|---|---|---|
| Purpose | DynamoDB-specific cache | General-purpose cache |
| API | Same as DynamoDB (drop-in) | Custom cache logic in your app |
| Consistency | Eventually consistent only | You control |
| Data types | DynamoDB items/queries | Any (aggregated, computed, sessions) |
| Best for | Caching DynamoDB reads | Computed results, multi-source cache |
1.3.2 Amazon S3
Server-Side Encryption:
| Type | Key Management | Header |
|---|---|---|
| SSE-S3 | AWS managed | x-amz-server-side-encryption: AES256 |
| SSE-KMS | KMS key | x-amz-server-side-encryption: aws:kms + optional key ID |
| SSE-C | Customer key | 3 headers: algorithm, key (base64), key MD5 |
- SSE-KMS: Each operation calls KMS API and counts against KMS quota
- Enforce encryption: Bucket policy deny `s3:PutObject` without encryption header
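One way to express such a deny rule, shown as a Python dict rendered to JSON. The bucket name is a placeholder, and the `Null` condition (deny when the encryption header is absent) is one of several condition styles used for this; treat it as a sketch to adapt rather than a drop-in policy.

```python
import json

BUCKET = "example-bucket"  # hypothetical bucket name

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyUploadsWithoutEncryptionHeader",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": f"arn:aws:s3:::{BUCKET}/*",
            # "Null": "true" matches requests where the header is missing.
            "Condition": {"Null": {"s3:x-amz-server-side-encryption": "true"}},
        }
    ],
}

policy_json = json.dumps(policy, indent=2)
```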
Storage Classes:
| Class | Access Pattern | Retrieval Fee | Min Duration |
|---|---|---|---|
| S3 Standard | Frequent | None | None |
| S3 Intelligent-Tiering | Unknown / changing | None | None |
| S3 Standard-IA | Infrequent, rapid access | Per GB | 30 days |
| S3 One Zone-IA | Infrequent, single AZ OK | Per GB | 30 days |
| S3 Glacier Instant | Quarterly, millisecond access | Per GB | 90 days |
| S3 Glacier Flexible | 1-2x/year, mins-hours | Per GB | 90 days |
| S3 Glacier Deep Archive | 1x/year, 12-48 hours | Per GB | 180 days |
Key Features:
- Presigned URLs: Temporary access to private objects (upload or download)
- Event Notifications: Targets Lambda, SQS, SNS, EventBridge
- Versioning: Protects against accidental deletion (delete markers)
- MFA Delete: Requires MFA for permanent version deletion
- Lifecycle Rules: Transition between storage classes or expire objects
CORS:
- Configure on the target bucket (the one being accessed cross-origin)
- Lambda Proxy integration: Return CORS headers from Lambda function
- Non-proxy integration: Enable CORS in API Gateway console
1.3.3 Amazon ElastiCache
Redis vs Memcached:
| Feature | Redis | Memcached |
|---|---|---|
| Replication | Multi-AZ with auto-failover | No replication |
| Persistence | AOF / RDB snapshots | No persistence |
| Data types | Strings, lists, sets, hashes | Simple key-value only |
| Pub/Sub | Yes | No |
| Threading | Single-threaded | Multi-threaded |
| Use case | HA, persistence, complex data | Simple caching, max throughput |
Caching Strategies:
| Strategy | How It Works | Pros | Cons |
|---|---|---|---|
| Lazy Loading | Cache miss then fetch DB then cache it | Only caches needed data | Stale data, cache-miss penalty |
| Write-Through | Write to cache AND DB simultaneously | Always fresh | Write penalty, caches all data |
| Write-Behind | Write to cache then async write to DB | Fast writes | Data loss risk |
Best practice: Write-Through + TTL = fresh data + automatic cleanup of unused entries.
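Lazy loading and write-through with a TTL can be illustrated with an in-memory stand-in. `TtlCache` is a hypothetical toy class, not an ElastiCache client, and `db` is a plain dict playing the role of the database.

```python
import time


class TtlCache:
    """Toy in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:   # expired: treat as a miss
            del self._entries[key]
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (value, self._clock() + self._ttl)


def read_lazy(cache, db, key):
    """Lazy loading: on a miss, fetch from the DB and populate the cache."""
    value = cache.get(key)
    if value is None:
        value = db[key]
        cache.set(key, value)
    return value


def write_through(cache, db, key, value):
    """Write-through: update the DB and the cache together."""
    db[key] = value
    cache.set(key, value)
```

A write that bypasses `write_through` leaves the cached value stale until the TTL expires, which is the lazy-loading drawback listed in the table.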
1.3.4 Amazon OpenSearch
- Full-text search, log analytics, real-time dashboards
- Common pattern: DynamoDB Streams to Lambda to OpenSearch
- Use when DynamoDB cannot meet search requirements (full-text, fuzzy matching)
DOMAIN 2: Security (26%)
Task 2.1: Implement Authentication and Authorization
2.1.1 IAM Core Concepts
Policy Evaluation Logic:
1. All requests DENIED by default
2. Evaluate all applicable policies
3. Explicit DENY always wins (overrides any Allow)
4. Explicit ALLOW grants access (if no Deny)
5. If no Allow found then implicit deny
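The five steps reduce to a short loop. This is a deliberately simplified model for memorizing the order of precedence: it matches on action names only and ignores resources, conditions, permissions boundaries, and SCPs. The `evaluate` function and the statement shape are hypothetical.

```python
def evaluate(statements, action):
    """Return 'ExplicitDeny', 'Allow', or 'ImplicitDeny' for an action."""
    decision = "ImplicitDeny"                 # step 1: denied by default
    for statement in statements:              # step 2: check every policy
        actions = statement["Action"]
        if action not in actions and "*" not in actions:
            continue
        if statement["Effect"] == "Deny":     # step 3: explicit deny wins
            return "ExplicitDeny"
        decision = "Allow"                    # step 4: allow, unless denied later
    return decision                           # step 5: implicit deny if no allow


statements = [
    {"Effect": "Allow", "Action": ["s3:GetObject", "s3:PutObject"]},
    {"Effect": "Deny", "Action": ["s3:PutObject"]},
]
```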
Policy Types:
| Policy Type | Scope | Notes |
|---|---|---|
| Service Control Policy | Organization / OU / Account | Sets maximum permissions boundary |
| Permissions Boundary | IAM user / role | Sets max permissions for entity |
| Identity-based | IAM user / role / group | Inline or managed policies |
| Resource-based | S3, SQS, Lambda, KMS, etc. | Cross-account without AssumeRole |
| Session policy | STS session | Limits assumed role permissions |
Key Distinctions:
- Users: Long-term credentials (access keys); for humans or CI/CD
- Roles: Temporary credentials via STS; for services, cross-account, federation
- Instance Profile: Wrapper around IAM role for EC2 instances
- Always prefer roles over access keys
2.1.2 STS (Security Token Service)
| API | Use Case |
|---|---|
| `AssumeRole` | Cross-account access, role switching |
| `AssumeRoleWithWebIdentity` | OIDC federation (Google, Facebook, Cognito) |
| `AssumeRoleWithSAML` | SAML 2.0 federation (Active Directory) |
| `GetSessionToken` | MFA-protected API calls for IAM users (`AssumeRole` can also require MFA) |
| `GetFederationToken` | Proxy apps issuing temp credentials |
| `DecodeAuthorizationMessage` | Decode UnauthorizedOperation error details |
Cross-Account Access Pattern:
PRODUCTION ACCOUNT
1. Create IAM Role with Trust Policy: Dev Account
2. Attach permissions to Role
DEVELOPMENT ACCOUNT
3. Create IAM Policy allowing sts:AssumeRole on Prod Role ARN
4. Attach to Dev IAM users/roles
Role created in account WITH the resource. Trust policy specifies WHO can assume it.
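The two policy documents in this pattern, sketched as Python dicts. The account IDs and the role name `ProdReadOnly` are placeholders invented for the example.

```python
import json

DEV_ACCOUNT_ID = "111122223333"    # hypothetical
PROD_ACCOUNT_ID = "444455556666"   # hypothetical
ROLE_NAME = "ProdReadOnly"         # hypothetical

# Step 1 (production account): trust policy on the role says WHO may assume it.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{DEV_ACCOUNT_ID}:root"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# Step 3 (development account): identity policy lets dev principals call AssumeRole.
assume_role_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": f"arn:aws:iam::{PROD_ACCOUNT_ID}:role/{ROLE_NAME}",
        }
    ],
}

documents = json.dumps([trust_policy, assume_role_policy], indent=2)
```

Both sides are needed: the trust policy alone does not grant the dev user permission to call `sts:AssumeRole`, and the identity policy alone does not make the role trust the dev account.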
2.1.3 Amazon Cognito
User Pools (Authentication: "Who are you?"):
- User directory: sign-up, sign-in, password policies
- Social login: Google, Facebook, Apple, SAML, OIDC
- MFA, adaptive authentication, account recovery
- Returns JWTs: ID Token (user identity claims), Access Token (API access scopes), Refresh Token
- Hosted UI with customizable branding
- Directly integrates with API Gateway as a native Cognito Authorizer
- Cannot grant AWS service credentials directly
- Token stored client-side (e.g., browser local storage) and sent in `Authorization` header
Identity Pools / Federated Identities (Authorization: "What can you access?"):
- Exchanges tokens (from User Pool, Google, Facebook, SAML) for temporary AWS credentials (via STS)
- Maps authenticated and unauthenticated users to IAM roles
- Supports guest / unauthenticated access
- Returns `Cognito ID` (`CognitoIdentityId`): a unique identifier for the user
- The Cognito ID is then used to obtain temporary, limited-privilege AWS credentials
- Does NOT directly integrate with API Gateway as an authorizer
- Use when your app needs to call AWS services directly (S3, DynamoDB) from client-side
Identity Pool with External IdP (Full Flow):
FLOW 3: External IdP + Identity Pool (mobile apps, federated access)
Mobile App
  | login via provider SDK
  v
Identity Provider (Google, Facebook, SAML...)
  | OAuth/OIDC token
  v
Amazon Cognito (Identity Pool)
  | returns Cognito ID (unique user ID)
  v
GetCredentialsForIdentity
  |
  v
Temp AWS Creds (AccessKeyId, SecretAccessKey, SessionToken)
  |
  v
S3, DynamoDB, SNS...
What Each "Cognito ___" Term Means (exam loves these distractors):
| Term | What It Is | Real? |
|---|---|---|
| Cognito ID | Unique user identifier returned by Identity Pool | YES (correct answer) |
| Cognito Key Pair | Not a real Cognito concept | NO (distractor) |
| Cognito SDK | Development toolkit to interact with Cognito | Exists but not a return value |
| Cognito API | API interface for Cognito service | Exists but not a return value |
Exam example: Mobile app authenticates with IdP using provider's SDK, passes OAuth/OIDC token to Cognito. What is returned to provide temporary AWS credentials?
- Answer: Cognito ID. The Identity Pool returns a Cognito ID, which is then used to get temporary, limited-privilege AWS credentials
- The Cognito ID uniquely identifies the user across all federated identity providers
The Two Flows β Critical to Understand:
FLOW 1: User Pool + API Gateway (most common exam scenario)
User --(sign-in)--> User Pool --(JWT token)--> Browser / App --(Authorization header)--> API GW Cognito Authorizer (validates JWT)
- API Gateway validates JWT natively (no Lambda needed)
- Set token source = name of header (usually "Authorization")
- Create authorizer in API GW console using User Pool ID
FLOW 2: User Pool + Identity Pool + AWS Services
User --(sign-in)--> User Pool --(JWT token)--> Identity Pool --(STS)--> Temp AWS Creds --> S3, DynamoDB, etc.
- Identity Pool exchanges JWT for IAM credentials
- App calls AWS services DIRECTLY (not through API Gateway)
- Use when client needs S3.putObject, DynamoDB.getItem, etc.
Exam Trap: User Pool vs Identity Pool for API Gateway:
| Question Pattern | Answer |
|---|---|
| "JWT authorizer for API Gateway" | User Pool (NOT Identity Pool) |
| "ReactJS app + Cognito + JWT in local storage + API Gateway" | User Pool + Cognito Authorizer |
| "Token source header for API Gateway authorizer" | Set header on User Pool authorizer |
| "App needs to call S3/DynamoDB directly from browser" | Identity Pool (for AWS creds) |
| "Guest access to AWS resources" | Identity Pool |
| "User Pool or Identity Pool for API Gateway?" | Always User Pool |
Real exam example: A ReactJS app on S3 uses Cognito SDK for sign-up/sign-in, stores JWT in local storage, and uses JWT to authorize API Gateway calls. The correct steps are:
- Create a Cognito User Pool (for sign-up/sign-in and JWT issuance)
- On API Gateway console, create an authorizer using the Cognito User Pool ID
- Set the header name (e.g., `Authorization`) as the token source pointing to the User Pool authorizer
Identity Pool is NOT needed here because the app only calls API Gateway (not AWS services directly).
Complete Decision Guide:
| Scenario | Service |
|---|---|
| User sign-up / sign-in / user directory | User Pool |
| JWT tokens for API Gateway authorization | User Pool (Cognito Authorizer) |
| Token source header for API Gateway | User Pool authorizer config |
| Temporary AWS credentials (S3, DynamoDB from client) | Identity Pool |
| Guest / unauthenticated access to AWS resources | Identity Pool |
| Social login + access S3 directly from app | User Pool + Identity Pool |
| Social login + access API Gateway | User Pool (Cognito Authorizer) |
| Cross-device sync (single user key-value) | Cognito Sync |
| Multi-user real-time shared data | AppSync (NOT Cognito Sync) |
2.1.4 API Gateway Authentication
| Method | How It Works | Use When |
|---|---|---|
| IAM (AWS_IAM) | Sig V4 signed requests | AWS users/roles, cross-account |
| Cognito User Pool Authorizer | Validates JWT from User Pool natively | User pool-authenticated clients |
| Lambda Authorizer (TOKEN) | Custom auth logic on bearer token | Custom/3rd-party auth, Identity Pool tokens |
| Lambda Authorizer (REQUEST) | Custom auth on headers, query params | Multiple identity sources |
| API Keys | Identification only (NOT authentication!) | Usage tracking, throttling, quotas |
Identity Pool + API Gateway (when needed):
- Identity Pool does NOT have a native API Gateway authorizer
- If you must use Identity Pool tokens with API Gateway, use a Lambda Authorizer to validate
- But the standard pattern is: User Pool JWT + Cognito Authorizer (simpler, no Lambda needed)
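For the cases where a Lambda Authorizer is unavoidable, the function receives the bearer token and must return an IAM policy for the invoked method. The following is a minimal sketch of a TOKEN authorizer, not a production implementation: `is_valid_token`, the accepted token value, and the `demo-user` principal are hypothetical placeholders, and real code would verify the token's signature and claims.

```python
def is_valid_token(token: str) -> bool:
    """Hypothetical check; replace with real signature and claims validation."""
    return token == "Bearer allow-me"


def handler(event, context):
    # A TOKEN authorizer receives the token in 'authorizationToken'
    # and the ARN of the invoked method in 'methodArn'.
    token = event.get("authorizationToken", "")
    effect = "Allow" if is_valid_token(token) else "Deny"
    return {
        "principalId": "demo-user",  # hypothetical principal identifier
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": effect,
                    "Resource": event["methodArn"],
                }
            ],
        },
    }
```

API Gateway evaluates the returned policy, so a Deny statement results in a 403 for the caller.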
Resource Policies: JSON policies on the API itself for cross-account access or IP restrictions.
Task 2.2: Implement Encryption Using AWS Services
2.2.1 AWS KMS (Key Management Service)
Key Types:
| Type | Management | Rotation | Use Case |
|---|---|---|---|
| AWS managed key (aws/s3) | AWS | Auto every year | Default for AWS services |
| Customer managed key (CMK) | You | Optional (enable auto-rotate) | Custom control, policies |
| AWS owned key | AWS | Varies | Internal AWS use |
Envelope Encryption (for data larger than 4 KB):
ENCRYPT:
1. Call GenerateDataKey -> returns plaintext key + encrypted key
2. Encrypt data with the plaintext key (client-side)
3. DELETE plaintext key from memory
4. Store encrypted data + encrypted key together
DECRYPT:
1. Send encrypted key to KMS (Decrypt API) -> returns plaintext key
2. Decrypt data with the plaintext key
3. DELETE plaintext key from memory
- KMS can only directly encrypt/decrypt up to 4 KB
- For larger data you MUST use envelope encryption
- GenerateDataKey vs GenerateDataKeyWithoutPlaintext (encrypted key only, for later use)
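The envelope encryption steps can be sketched in Python as follows. This is an illustration under assumptions, not a reference implementation: `kms` is expected to behave like a boto3 KMS client (`generate_data_key` and `decrypt`), the local cipher is passed in by the caller (in practice an authenticated cipher such as AES-GCM from a crypto library), and the function names are hypothetical.

```python
from typing import Callable, Dict

# A cipher is any function (key, data) -> bytes supplied by the caller.
Cipher = Callable[[bytes, bytes], bytes]


def envelope_encrypt(kms, key_id: str, data: bytes, encrypt: Cipher) -> Dict[str, bytes]:
    # 1. Ask KMS for a data key: a plaintext copy plus a copy encrypted under key_id.
    resp = kms.generate_data_key(KeyId=key_id, KeySpec="AES_256")
    # 2. Encrypt the payload locally with the plaintext data key.
    ciphertext = encrypt(resp["Plaintext"], data)
    encrypted_key = resp["CiphertextBlob"]
    # 3. Drop our reference to the plaintext key (Python cannot securely wipe memory).
    del resp
    # 4. Store the ciphertext together with the encrypted data key.
    return {"ciphertext": ciphertext, "encrypted_key": encrypted_key}


def envelope_decrypt(kms, envelope: Dict[str, bytes], decrypt: Cipher) -> bytes:
    # 1. Send only the small encrypted data key to KMS.
    plaintext_key = kms.decrypt(CiphertextBlob=envelope["encrypted_key"])["Plaintext"]
    # 2. Decrypt the payload locally, then drop the plaintext key reference.
    data = decrypt(plaintext_key, envelope["ciphertext"])
    del plaintext_key
    return data
```

Only the data key ever travels to KMS, which is why the 4 KB limit on direct encryption does not constrain the payload size.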
KMS Key Policies:
- Every KMS key must have a key policy (resource-based)
- Default: Gives the account root user full access
- Cross-account: Key policy must allow external account AND external account needs IAM permissions
KMS API Quota:
- 5,500 - 30,000 requests/sec per region
- SSE-KMS on S3: each upload/download calls KMS and can hit quota
- Fix: request quota increase, use S3 Bucket Keys, or switch to SSE-S3
2.2.2 KMS vs CloudHSM
| Feature | KMS | CloudHSM |
|---|---|---|
| HSM tenancy | Multi-tenant | Single-tenant (dedicated) |
| Key management | AWS manages HSMs | You manage keys and HSMs |
| FIPS 140 validation | Security Level 3 (since 2023; older material cites Level 2) | Security Level 3 |
| Integration | 100+ AWS services | Custom key store for KMS |
| Use case | Most encryption needs | Regulatory compliance, full control |
2.2.3 Encryption at Rest and in Transit
| Service | At Rest | In Transit |
|---|---|---|
| S3 | SSE-S3, SSE-KMS, SSE-C | HTTPS (enforce via bucket policy) |
| DynamoDB | AWS owned key (default) or KMS CMK | TLS (automatic) |
| EBS | KMS-encrypted volumes | Encrypted between EC2 and EBS |
| RDS | KMS encryption at creation | SSL/TLS certificates |
| SQS | SSE-SQS or SSE-KMS | HTTPS |
| Kinesis | KMS server-side encryption | TLS |
2.2.4 ACM (AWS Certificate Manager)
- Free public SSL/TLS certificates for AWS services
- Auto-renewal for ACM-issued certificates
- Integrates with: ALB, CloudFront, API Gateway
- Cannot use ACM certificates directly on EC2
Task 2.3: Manage Sensitive Data in Application Code
2.3.1 Parameter Store vs Secrets Manager
| Feature | Parameter Store | Secrets Manager |
|---|---|---|
| Rotation | Manual (custom Lambda) | Built-in auto-rotation |
| Cost | Free (standard tier) | $0.40/secret/month |
| Max size | 4 KB (std), 8 KB (adv) | 64 KB |
| RDS integration | No native rotation | Native rotation for RDS |
| Cross-region | No | Yes (replica secrets) |
| Encryption | SecureString with KMS | Always encrypted with KMS |
| Best for | Config values, feature flags | Credentials needing rotation |
Parameter Store Tiers:
| Tier | Max Size | Cost | Features |
|---|---|---|---|
| Standard | 4 KB | Free | No policies |
| Advanced | 8 KB | Paid | Expiration, NoChangeNotification |
CloudFormation Dynamic References:
- {{resolve:ssm:paramName:version}} for plaintext parameters
- {{resolve:ssm-secure:paramName:version}} for SecureString parameters
- {{resolve:secretsmanager:secretId:key}} for Secrets Manager
2.3.2 Best Practices for Sensitive Data
- Never store credentials in code, Git, or CloudFormation parameters in plaintext
- Lambda: Use environment variables encrypted with KMS for sensitive values
- ECS: Reference Secrets Manager or Parameter Store in task definitions
- Use IAM roles instead of access keys whenever possible
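As an illustration of keeping credentials out of code, the sketch below reads a secret at runtime and caches it at module level so warm invocations skip the network call. It assumes `client` behaves like a boto3 Secrets Manager client and that the secret is stored as a JSON string; the function name and secret ID are hypothetical.

```python
import json

# Module-level cache: survives across warm invocations of the same container.
_cache = {}


def get_secret(client, secret_id: str) -> dict:
    """Fetch a JSON secret once and reuse it on later calls."""
    if secret_id not in _cache:
        resp = client.get_secret_value(SecretId=secret_id)
        _cache[secret_id] = json.loads(resp["SecretString"])
    return _cache[secret_id]
```

One trade-off of this design: a cached value is not refreshed after rotation until the cache is cleared or the container is recycled, so long-lived processes may want an expiry.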
DOMAIN 3 - Deployment (24%)
Task 3.1: Prepare Application Artifacts for Deployment
3.1.1 Lambda Packaging
| Method | Details |
|---|---|
| Zip deployment | 50 MB compressed, 250 MB unzipped (incl. layers) |
| Container image | Up to 10 GB; must implement Lambda Runtime API |
| Inline (ZipFile) | CloudFormation only; Node.js and Python; source code only |
| Layers | Shared libraries; max 5 per function; extract to /opt/ |
3.1.2 Container Images (ECS/EKS)
- Dockerfile -> docker build -> push to Amazon ECR
- ECR: Managed container registry with image scanning
- ECS Task Definition references ECR image URI
- Multi-stage builds reduce image size (build + runtime stages)
3.1.3 Elastic Beanstalk Source Bundle
- ZIP or WAR containing application code
- .ebextensions/*.config for custom resources and settings (YAML)
- env.yaml for environment manifest
- Dockerrun.aws.json for multi-container Docker
- cron.yaml for periodic worker tasks
- Procfile to define processes
Task 3.2: Test Applications in Development Environments
3.2.1 SAM Local Testing
sam local invoke Invoke Lambda locally with event payload
sam local start-api Start local API Gateway + Lambda
sam local start-lambda Start Lambda endpoint for SDK testing
sam local generate-event Generate sample event payloads
- Requires Docker for local Lambda simulation
3.2.2 CDK + SAM Local Testing (Exam Favorite!)
The exact two-step flow:
Step 1: cdk synth --stack StackName
|
v
Generates CloudFormation template in cdk.out/
(e.g., cdk.out/MyStack.template.json)
Step 2: sam local invoke -t cdk.out/MyStack.template.json MyFunctionLogicalId
|
v
SAM reads the synthesized template, finds the Lambda,
spins up Docker container, invokes locally
Full command reference for CDK + SAM local testing:
| Step | Command | Purpose |
|---|---|---|
| 1 | cdk synth (with optional --stack StackName) | Generate CloudFormation template to cdk.out/ |
| 2a | sam local invoke -t cdk.out/template.json FunctionId | Invoke a specific Lambda locally |
| 2b | sam local start-api -t cdk.out/template.json | Start local API Gateway + Lambda |
| 2c | sam local start-lambda -t cdk.out/template.json | Start local Lambda endpoint for SDK testing |
Exam Trap - What is NOT needed for local testing:
| Command / Action | Needed for Local Testing? | Why |
|---|---|---|
| cdk synth | YES | Generates the template SAM needs |
| sam local invoke -t ... | YES | Invokes the function locally using template |
| cdk bootstrap | NO | Sets up deployment infra (S3 bucket), not local |
| sam package | NO | Uploads code to S3 for deployment, not local |
| sam deploy | NO | Deploys to AWS, not local |
| cdk deploy | NO | Deploys to AWS, not local |
Exam example: CDK app with L2 constructs, SAM and CDK configured locally. What TWO steps to test Lambda locally?
- Run cdk synth to generate CloudFormation template (specify stack name)
- Run sam local invoke with -t pointing to the synthesized template and the function logical ID
Wrong: sam package (that's for S3 upload/deployment), cdk bootstrap (that's for deployment prep)
Task 3.3: Automate Deployment Testing
3.3.1 CodeBuild
buildspec.yml structure:
version: 0.2
env:
  variables:
    KEY: "value"
  parameter-store:
    DB_PASS: /app/db-password
  secrets-manager:
    SECRET: my-secret:key
phases:
  install:
    runtime-versions:
      nodejs: 18
    commands:
      - npm install
  pre_build:
    commands:
      - npm test
  build:
    commands:
      - npm run build
  post_build:
    commands:
      - echo "Build complete"
artifacts:
  files:
    - '**/*'
  base-directory: dist
cache:
  paths:
    - node_modules/**/*
- Build projects defined in buildspec.yml (NOT appspec.yml)
- Artifacts stored in S3
- Environment variables from Parameter Store and Secrets Manager
- VPC support for accessing private resources during build
Task 3.4: Deploy Code Using AWS CI/CD Services
3.4.1 CodePipeline
Source --> Build --> [Test] --> [Manual Approval] --> Deploy
- Source: CodeCommit, GitHub, S3, ECR
- Build: CodeBuild
- Deploy: CodeDeploy, CloudFormation, ECS, Elastic Beanstalk, S3
Manual Approval:
- Pipeline pauses until approved, rejected, or times out
- Default timeout: 7 days
- SNS notification to approvers
- IAM permission: codepipeline:PutApprovalResult
- Use cases: production gate, compliance sign-off, change management
3.4.2 AWS CodeDeploy
Deployment Matrix:
| Platform | In-Place | Blue/Green | Agent Required? |
|---|---|---|---|
| EC2 | Yes | Yes | YES (must be installed and running) |
| On-Premises | Yes | No | YES (must be installed and running) |
| Lambda | No | Yes (always) | NO (managed by AWS) |
| ECS | No | Yes (always) | NO (managed by AWS) |
CodeDeploy Agent Details:
- Must be installed and running on EC2 and On-Premises instances only
- Agent polls CodeDeploy for deployment instructions
- Install at scale via SSM Run Command
- NOT required for Lambda or ECS (AWS manages natively)
- DownloadBundle error: Check agent is running AND instance IAM role has S3 permissions
Traffic Shifting Strategies (Lambda / ECS):
| Strategy | Behavior |
|---|---|
| Canary | X% to new then wait then remaining (e.g., Canary10Percent5Minutes) |
| Linear | Equal increments over time (e.g., Linear10PercentEvery1Minute) |
| AllAtOnce | 100% immediately |
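To make the difference between canary and linear shifting concrete, the sketch below computes the traffic percentage on the new version over time. It is a simplified model with hypothetical function names, assuming the first increment is applied at minute 0; it does not call CodeDeploy.

```python
def canary(percent: int, wait_minutes: int):
    """Return (minute, percent_on_new_version) steps for a canary shift."""
    return [(0, percent), (wait_minutes, 100)]


def linear(percent: int, interval_minutes: int):
    """Return steps that add `percent` every `interval_minutes` until 100%."""
    steps, shifted, minute = [], 0, 0
    while shifted < 100:
        shifted = min(100, shifted + percent)
        steps.append((minute, shifted))
        minute += interval_minutes
    return steps


# Canary10Percent5Minutes: 10% first, everything else after 5 minutes.
print(canary(10, 5))
# Linear10PercentEvery1Minute: ten equal increments, one per minute.
print(linear(10, 1))
```

Canary has exactly two steps regardless of the percentage, while linear has as many steps as it takes to reach 100%.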
AppSpec File:
| Platform | Format | Key Sections |
|---|---|---|
| EC2 | YAML | files (source to dest), hooks (lifecycle events) |
| Lambda | YAML/JSON | version, resources (function, alias, versions) |
| ECS | YAML/JSON | version, resources (task def, container, port) |
EC2/On-Prem Lifecycle Hooks (in order):
| # | Hook | Managed By | Can You Script It? |
|---|---|---|---|
| 1 | ApplicationStop | User (AppSpec) | Yes - run scripts to stop current app |
| 2 | DownloadBundle | Agent only | NO - cannot configure in AppSpec |
| 3 | BeforeInstall | User (AppSpec) | Yes - pre-install tasks (backup, decrypt) |
| 4 | Install | Agent only | NO - agent copies files per AppSpec |
| 5 | AfterInstall | User (AppSpec) | Yes - post-install (chmod, config) |
| 6 | ApplicationStart | User (AppSpec) | Yes - start/restart your application |
| 7 | ValidateService | User (AppSpec) | Yes - health checks, smoke tests |
Exam key: DownloadBundle and Install are agent-managed only. You cannot write scripts for them in AppSpec. Any answer saying "configure DownloadBundle in AppSpec" is wrong.
DownloadBundle Failures - Deep Dive:
| Error Message | Root Cause | Fix |
|---|---|---|
| UnknownError: not opened for reading | EC2 instance IAM profile lacks S3 read permissions | Add s3:Get*, s3:List* to instance profile |
| DownloadBundle timeout | Agent cannot reach S3 (network/VPC issue) | Check security groups, NACLs, VPC endpoints |
| DownloadBundle with access denied | S3 bucket policy denies the instance role | Update bucket policy OR instance profile |
| Agent not found / deployment stuck | CodeDeploy agent not installed or not running | Install agent via SSM Run Command; start service |
Exam traps around DownloadBundle:
- S3 versioning is NOT required for CodeDeploy to download bundles
- DownloadBundle works in all regions (not region-restricted)
- You cannot configure DownloadBundle in the AppSpec file (it's agent-managed)
- The most common cause is missing IAM permissions on the EC2 instance profile
Rollback:
- Automatic rollback deploys last known good revision as a new deployment (new ID)
- Does NOT restore previous deployment; creates a new one
- Triggers: deployment failure or CloudWatch alarm breach
3.4.3 CloudFormation
Template Sections:
| Section | Required? | Purpose |
|---|---|---|
| AWSTemplateFormatVersion | No | Template version (2010-09-09) |
| Description | No | Template description |
| Parameters | No | Input values at deploy time |
| Mappings | No | Static key-value lookup tables |
| Conditions | No | Conditional resource creation |
| Transform | No | Macros (SAM: AWS::Serverless-2016-10-31) |
| Resources | YES | AWS resources (ONLY required section) |
| Outputs | No | Export values for cross-stack references |
Key Intrinsic Functions:
| Function | Purpose |
|---|---|
| Ref | Reference parameter or resource logical ID |
| Fn::GetAtt | Get attribute of a resource |
| Fn::Join | Concatenate strings with delimiter |
| Fn::Sub | String substitution with variables |
| Fn::Select | Select item from list by index |
| Fn::ImportValue | Import exported output from another stack |
| Fn::FindInMap | Lookup value in Mappings section |
Cross-Stack References:
- Stack A: Outputs with Export: Name: "VPC-ID"
- Stack B: Fn::ImportValue: "VPC-ID"
Change Sets: Preview changes before executing. Shows adds, modifies, replaces.
Drift Detection: Identifies resources changed outside CloudFormation.
Helper Scripts (EC2):
| Script | Purpose |
|---|---|
| cfn-init | Install packages, create files, start services |
| cfn-signal | Signal CloudFormation that instance is ready |
| cfn-hup | Daemon detecting metadata changes, re-runs cfn-init |
| cfn-get-metadata | Retrieve metadata from template |
3.4.4 AWS SAM
- CloudFormation extension for serverless; requires Transform: AWS::Serverless-2016-10-31
SAM Resource Types:
| SAM Resource | Equivalent |
|---|---|
| AWS::Serverless::Function | Lambda + IAM Role + API GW + Events |
| AWS::Serverless::Api | API Gateway RestApi |
| AWS::Serverless::HttpApi | API Gateway HttpApi (v2) |
| AWS::Serverless::SimpleTable | DynamoDB Table |
| AWS::Serverless::LayerVersion | Lambda Layer |
SAM CLI Commands:
| Command | Purpose |
|---|---|
| sam init | Scaffold project (not needed if project exists) |
| sam build | Install dependencies, prepare artifacts |
| sam deploy | Package (zip + S3) + Deploy via CloudFormation (combined!) |
| sam local invoke | Local Lambda testing |
| sam local start-api | Local API Gateway |
- sam deploy = sam package + deploy, combined in one command
- CloudFormation CLI requires TWO commands: aws cloudformation package, then aws cloudformation deploy
SAM Policy Templates: Predefined IAM policies like DynamoDBCrudPolicy, S3ReadPolicy, SQSPollerPolicy.
3.4.5 AWS CDK
- Infrastructure as code (Python, TypeScript, Java, C#, Go)
- Compiles to CloudFormation via cdk synth
| Command | Purpose |
|---|---|
| cdk bootstrap | FIRST command in new account/region (creates S3 for assets) |
| cdk synth | Generate CloudFormation template |
| cdk deploy | Deploy stack |
| cdk diff | Preview changes |
| cdk destroy | Delete stack |
- NoSuchBucket error means run cdk bootstrap
- Constructs: L1 (raw CFN), L2 (opinionated defaults), L3 (patterns)
3.4.6 Elastic Beanstalk Deployments
| Strategy | Downtime | Capacity | Rollback | Cost |
|---|---|---|---|---|
| All-at-once | YES | Reduced | Manual redeploy | Lowest |
| Rolling | No | Reduced | Manual redeploy | Low |
| Rolling + Batch | No | Full | Manual redeploy | Medium |
| Immutable | No | Full | Terminate new ASG | Higher |
| Traffic Splitting | No | Full | Reroute traffic | Higher |
| Blue/Green | No | Full | CNAME swap | Highest |
- Blue/Green: Best for major platform changes
- Immutable: Best for quick rollback without DNS complexity
3.4.7 CodeArtifact
- Managed artifact repository for Maven, npm, pip, NuGet
- Upstream repos: npmjs.com, Maven Central, PyPI
- Cross-account sharing via resource policies
DOMAIN 4 - Troubleshooting and Optimization (18%)
Task 4.1: Assist in Root Cause Analysis
4.1.1 CloudWatch Logs
- Log Groups: Container for log streams (one per application/service)
- Log Streams: Sequence of events from a single source
- Metric Filters: Extract metrics from log data (count ERROR occurrences)
- Insights: Query and analyze log data with purpose-built query language
- Subscription Filters: Real-time processing to Lambda, Kinesis, Firehose
- Retention: Configure per log group (1 day to 10 years; default forever)
Lambda Logging:
- Execution role needs: logs:CreateLogGroup, logs:CreateLogStream, logs:PutLogEvents
- Managed policy: AWSLambdaBasicExecutionRole
4.1.2 CloudWatch Metrics and Alarms
Custom Metrics:
- PutMetricData API to publish custom metrics
- High-resolution: 1-second granularity (standard is 60 seconds)
- Embedded Metric Format (EMF): Structured log auto-extracted as metric
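An Embedded Metric Format record is an ordinary JSON log line with an `_aws` metadata block that tells CloudWatch which fields to extract as metrics. The sketch below builds one such line; the helper name, namespace, and the single `Service` dimension are illustrative assumptions.

```python
import json
import time


def emf_record(namespace, service, metric, value, unit="Milliseconds"):
    """Build one Embedded Metric Format log line as a JSON string."""
    return json.dumps({
        "_aws": {
            "Timestamp": int(time.time() * 1000),  # epoch milliseconds
            "CloudWatchMetrics": [{
                "Namespace": namespace,
                "Dimensions": [["Service"]],  # names of dimension fields below
                "Metrics": [{"Name": metric, "Unit": unit}],
            }],
        },
        "Service": service,  # dimension value
        metric: value,       # metric value, keyed by the metric name
    })


# In Lambda, printing the line to stdout is enough for it to reach CloudWatch Logs.
print(emf_record("DemoApp", "checkout", "Latency", 42))
```

Compared with PutMetricData, this avoids a synchronous API call per data point because the metric travels with the log stream.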
Alarms:
| State | Meaning |
|---|---|
| OK | Metric within threshold |
| ALARM | Metric breached threshold |
| INSUFFICIENT_DATA | Not enough data to evaluate |
- Actions: SNS, Auto Scaling, EC2, Lambda
- Composite alarms: AND/OR logic across multiple alarms
4.1.3 CloudTrail
- Records API calls (management events by default)
- Data events (S3, Lambda) are opt-in with additional cost
- Use for: auditing, compliance ("who did what when")
- NOT for distributed tracing (use X-Ray)
4.1.4 Common Error Reference
| Error | Root Cause | Fix |
|---|---|---|
| API Gateway 502 | Lambda wrong response format | Return {statusCode, headers, body} |
| API Gateway 504 | Backend exceeded 29s timeout | Optimize Lambda or use async |
| Lambda 429 | Concurrency limit reached | Reserved concurrency + backoff |
| DynamoDB ProvisionedThroughputExceededException | Hot partition or low capacity | Better PK + exponential backoff |
| Lambda AccessDeniedException | Execution role missing permissions | Update execution role IAM policy |
| EC2 UnauthorizedOperation | IAM policy missing the required permission | Decode the encoded failure message with sts:DecodeAuthorizationMessage, then grant the missing action |
| CDK NoSuchBucket | CDK not bootstrapped | Run cdk bootstrap |
| Lambda Unable to import module | Missing deps in package | Install locally + zip + upload |
| CodeBuild RequestError timeout | Missing proxy config | Add proxy to buildspec.yml |
| CodeDeploy DownloadBundle / not opened for reading | EC2 IAM profile missing S3 permissions | Add S3 read perms to instance profile |
| CodeDeploy DownloadBundle stuck | Agent not running or network blocked | Check agent + SG + NACLs + VPC endpoints |
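The API Gateway 502 case is usually a handler returning a bare object instead of the proxy response shape. A minimal sketch of a correctly shaped response follows; the helper name `proxy_response` and the payload are hypothetical.

```python
import json


def proxy_response(status_code: int, payload, headers=None):
    """Build the response shape expected by a Lambda proxy integration."""
    return {
        "statusCode": status_code,  # must be an integer
        "headers": {"Content-Type": "application/json", **(headers or {})},
        "body": json.dumps(payload),  # must be a string, not a dict
    }


def handler(event, context):
    return proxy_response(200, {"message": "ok"})
```

Returning the payload dict directly, or leaving `body` as an object, is the typical cause of the malformed-response 502.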
Task 4.2: Instrument Code for Observability
4.2.1 AWS X-Ray
Core Concepts:
| Concept | Description |
|---|---|
| Trace | End-to-end journey of a single request |
| Segment | Work done by a single service (auto-generated) |
| Subsegment | Granular downstream calls within a segment |
| Annotation | Key-value pair, indexed for search/filter |
| Metadata | Key-value pair, NOT indexed; for debug data |
| Service Map | Visual graph of distributed application |
| Sampling | Rules controlling how many traces are recorded |
Annotations vs Metadata:
| Feature | Annotations | Metadata |
|---|---|---|
| Indexed | YES (searchable) | NO |
| Value types | Strings, numbers, booleans | Any type (objects, arrays) |
| Use for | Filtering by user_id, env | Debug data you do not search |
| API | putAnnotation(key, value) | putMetadata(key, value) |
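To show where annotations and metadata end up, the sketch below builds a raw segment document of the kind an X-Ray SDK sends to the daemon. It is an illustration only: the helper names are hypothetical, the field layout follows the segment document format as commonly documented, and real applications would normally call the SDK's annotation and metadata methods instead.

```python
import os
import time


def new_trace_id(now=None):
    """Trace ID: version 1, epoch seconds as 8 hex digits, 24 random hex digits."""
    epoch = int(now if now is not None else time.time())
    return f"1-{epoch:08x}-{os.urandom(12).hex()}"


def build_segment(name, annotations, metadata):
    # Annotations are indexed, so only scalar values are accepted.
    for key, value in annotations.items():
        if not isinstance(value, (str, int, float, bool)):
            raise TypeError(f"annotation {key!r} must be a string, number, or boolean")
    start = time.time()
    return {
        "name": name,
        "id": os.urandom(8).hex(),  # 64-bit segment ID as 16 hex digits
        "trace_id": new_trace_id(start),
        "start_time": start,
        "end_time": time.time(),
        "annotations": annotations,  # searchable in filter expressions
        "metadata": metadata,        # not indexed; any JSON-serializable data
    }
```

Anything you expect to filter on (user ID, environment) belongs in `annotations`; bulky debug payloads belong in `metadata`.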
Subsegment Namespaces:
| Namespace | Type of Call |
|---|---|
| aws | AWS SDK call (DynamoDB, S3, SQS) |
| remote | External HTTP API call |
| (none) | Custom subsegment (your own code logic) |
Trace Analysis APIs:
- GetTraceSummaries to filter traces by annotations and get trace IDs
- BatchGetTraces to get full traces by ID (no filtering)
Integration by Service:
| Service | How to Enable |
|---|---|
| Lambda | Enable checkbox (TracingConfig: Active); automatic |
| ECS / Fargate | X-Ray daemon as sidecar container (UDP port 2000!) |
| Elastic Beanstalk | .ebextensions/xray.config |
| EC2 | Install X-Ray daemon + AWSXRayDaemonWriteAccess role |
| API Gateway | Enable tracing in stage settings |
"Install X-Ray daemon on Lambda" is WRONG. Just enable the checkbox. X-Ray daemon on ECS uses UDP port 2000 (not TCP).
Sampling Rules:
- Reservoir: Fixed traces per second (guaranteed)
- Rate: Percentage of additional traces beyond reservoir
- Default: 1 request/sec + 5% of additional requests
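The interaction between reservoir and rate can be modelled in a few lines. This is a simplified single-process sketch with a hypothetical `Sampler` class, not the X-Ray SDK's implementation; the random source is injectable so the behaviour can be checked deterministically.

```python
import random


class Sampler:
    """Simplified reservoir + fixed-rate sampler (single process only)."""

    def __init__(self, reservoir=1, rate=0.05, rng=random.random):
        self.reservoir = reservoir  # guaranteed traces per second
        self.rate = rate            # fraction of additional requests to trace
        self.rng = rng
        self._second = None
        self._used = 0

    def should_sample(self, now: float) -> bool:
        second = int(now)
        if second != self._second:  # a new second refills the reservoir
            self._second, self._used = second, 0
        if self._used < self.reservoir:  # reservoir traces come first
            self._used += 1
            return True
        return self.rng() < self.rate  # then a percentage of the rest
```

With the defaults, the first request in each second is always traced and roughly 5% of the remaining requests in that second are traced.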
X-Ray IAM Policies:
| Role | Policy Needed |
|---|---|
| X-Ray daemon (EC2/ECS) | AWSXRayDaemonWriteAccess |
| Lambda with X-Ray | Automatic (just enable) |
| View traces in console | AWSXrayReadOnlyAccess |
| Full access | AWSXrayFullAccess |
4.2.2 CloudWatch Contributor Insights
- Identify top contributors (e.g., top IPs causing errors)
- DynamoDB: Identify most accessed items (hot keys)
Task 4.3: Optimize Applications Using AWS Services
4.3.1 Lambda Optimization
| Optimization | Approach |
|---|---|
| Reduce cold starts | Provisioned concurrency; keep functions warm |
| Faster execution | Increase memory (= more CPU); optimize code |
| Reduce package size | Use layers; remove unused deps; container images |
| Reuse connections | Initialize SDK clients OUTSIDE the handler |
| Caching | Store data in /tmp; use global variables |
| Async processing | Decouple with SQS/SNS; return early |
4.3.2 DynamoDB Optimization
| Goal | Approach |
|---|---|
| Fix hot partitions | High-cardinality PK; add random suffix if needed |
| Read optimization | DAX for microsecond reads; ElastiCache for computed |
| Write optimization | Batch writes; on-demand for spiky traffic |
| Cost optimization | Eventually consistent reads (half cost); TTL for cleanup |
| Query optimization | Use Query not Scan; project only needed attributes |
| GSI throttle prevention | Ensure GSI WCU >= base table WCU |
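The "random suffix" fix for hot partitions is often called write sharding. The sketch below shows the key construction only, with hypothetical function names and a `#` separator chosen for illustration; the read path has to query every shard and merge the results.

```python
import random


def sharded_pk(base: str, shards: int = 10, rng=random.randrange) -> str:
    """Write path: spread one hot key across `shards` partition key values."""
    return f"{base}#{rng(shards)}"


def all_shard_pks(base: str, shards: int = 10):
    """Read path: every partition key that must be queried and merged."""
    return [f"{base}#{i}" for i in range(shards)]


# Example: writes for one busy date land on one of ten partition keys.
print(sharded_pk("2026-03-01"))
print(all_shard_pks("2026-03-01", shards=3))
```

The trade-off is extra read work: more shards spread writes better but multiply the number of Query calls needed to read a logical key.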
4.3.3 S3 Performance
| Optimization | Detail |
|---|---|
| Upload speed | Multipart upload (recommended for files larger than 100 MB) |
| Long-distance uploads/downloads | S3 Transfer Acceleration (routes traffic through CloudFront edge locations) |
| Byte-range fetches | Download only portions of an object |
| S3 Select | Query data in place using SQL (filter before download) |
| Request rate | 3,500 PUT + 5,500 GET per prefix per second |
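Byte-range fetches work by sending a standard HTTP Range header per chunk, which allows the chunks to be downloaded in parallel. The sketch below only computes the header values; the function name is hypothetical, and each value would be passed as the `Range` argument of a GetObject call (for example boto3's `get_object`).

```python
def byte_ranges(total_size: int, chunk_size: int):
    """Return HTTP Range header values that cover bytes 0..total_size-1."""
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size must be > 0")
    return [
        # Range end offsets are inclusive, hence the -1.
        f"bytes={start}-{min(start + chunk_size, total_size) - 1}"
        for start in range(0, total_size, chunk_size)
    ]


print(byte_ranges(10, 4))
```

For a 10-byte object and 4-byte chunks this yields three ranges, the last one shorter than the others.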
4.3.4 Caching Decision Tree
DynamoDB reads with known keys?
YES --> DAX
Computed/aggregated data or multi-source?
YES --> ElastiCache (Redis/Memcached)
API responses?
YES --> API Gateway caching
Static content?
YES --> CloudFront
4.3.5 CloudFront
Cache Update Strategies (exam favorite!):
| Strategy | Immediate? | Cost | How It Works |
|---|---|---|---|
| Versioned file names | YES | FREE | Change URL (e.g., img_v2.jpg); new URL = cache miss = fresh content |
| Invalidation | YES | Costs $$ | Removes objects from edge caches; first 1,000 paths free/month, then $0.005/path |
| Wait for TTL expiration | NO | Free | Objects served stale until TTL expires |
| Disable/re-enable distribution | NO | Free | Does NOT clear cache; causes downtime |
Exam trap: "Fast AND cost-efficient" = versioned file names (not invalidation!)
- Invalidation is fast but NOT cost-efficient for thousands of objects
- Versioned file names are both fast AND free
- Waiting for expiration is free but NOT fast
- Disabling distribution doesn't clear cache and causes downtime
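One common way to produce versioned file names is to embed a short content hash in the name at build time, so the URL changes exactly when the content changes. A minimal sketch follows; the helper name and the 8-character digest length are arbitrary choices for illustration.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(path: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short SHA-256 content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


# New content produces a new object key, so CloudFront sees a cache miss.
print(versioned_name("img/logo.jpg", b"first version"))
print(versioned_name("img/logo.jpg", b"second version"))
```

Pages that reference the asset must be updated to the new name, which is why this is usually done by the build tooling rather than by hand.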
Other CloudFront Features:
- CloudFront Functions: Lightweight edge logic (viewer request/response), JavaScript only, < 1ms
- Lambda@Edge: Heavier processing (viewer and origin request/response), Node.js/Python, up to 5s on viewer triggers and 30s on origin triggers
- CloudFront-Viewer-Country header for geo-routing
- Origin Access Control (OAC): Restrict S3 access to CloudFront only (replaces OAI)
- S3 Transfer Acceleration: Uses CloudFront edge for faster S3 uploads (different from CloudFront distribution)
APPENDIX A - Service Limits Quick Reference
| Service | Key Limits |
|---|---|
| Lambda timeout | 15 minutes (900 seconds) |
| Lambda memory | 128 MB - 10,240 MB |
| Lambda concurrent | 1,000 (soft limit, per account per region) |
| Lambda package | 50 MB zip, 250 MB unzipped, 5 layers, 10 GB container |
| Lambda /tmp | 512 MB free, up to 10 GB |
| API Gateway timeout | 29 seconds integration timeout |
| API Gateway TPS | 10,000 requests/second (account level) |
| API Gateway cache TTL | 0 - 3600 seconds (default 300) |
| DynamoDB item size | 400 KB max |
| DynamoDB RCU | 1 RCU = 1 strongly consistent read/sec up to 4 KB, or 2 eventually consistent reads/sec up to 4 KB each |
| DynamoDB WCU | 1 WCU = 1 KB/sec |
| DynamoDB LSI | 5 per table (creation only) |
| DynamoDB GSI | 20 per table (anytime) |
| DynamoDB Streams | 24-hour retention |
| SQS Standard TPS | Unlimited |
| SQS FIFO TPS | 300 (3,000 with batching) |
| SQS message size | 256 KB (up to 2 GB with Extended Client) |
| SQS retention | 1 min - 14 days (default 4 days) |
| SQS visibility | 0 - 12 hours (default 30 seconds) |
| Kinesis retention | 24 hours - 365 days |
| KMS direct encrypt | 4 KB max |
| KMS API quota | 5,500 - 30,000 req/sec per region |
| Cognito Sync | 1 MB per dataset, 20 datasets per identity |
| Step Functions events | 25,000 execution history events (standard) |
| CloudFormation | 500 resources per stack |
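The DynamoDB RCU and WCU rows turn into a short calculation: round the item size up to the unit size, multiply by the request rate, and halve reads when they are eventually consistent. The function names below are hypothetical, and transactional requests (which cost double) are left out.

```python
import math


def read_capacity_units(item_bytes: int, reads_per_sec: int, strongly_consistent: bool = True) -> int:
    """RCUs needed: items are billed in 4 KB blocks, rounded up."""
    units_per_read = math.ceil(item_bytes / 4096)
    total = units_per_read * reads_per_sec
    # Eventually consistent reads cost half as much.
    return total if strongly_consistent else math.ceil(total / 2)


def write_capacity_units(item_bytes: int, writes_per_sec: int) -> int:
    """WCUs needed: items are billed in 1 KB blocks, rounded up."""
    return math.ceil(item_bytes / 1024) * writes_per_sec


# 6 KB items, 10 reads/sec: 2 blocks per read.
print(read_capacity_units(6 * 1024, 10))         # 20
print(read_capacity_units(6 * 1024, 10, False))  # 10
# 2.5 KB items, 10 writes/sec: 3 blocks per write.
print(write_capacity_units(2560, 10))            # 30
```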
APPENDIX B - Exam Strategy
Key Phrases and Their Answers:
| Phrase in Question | Likely Answer Direction |
|---|---|
| "Least operational effort" | Managed service, built-in feature |
| "Most secure" | IAM roles, least privilege, encryption |
| "Cost-effective" | Query over Scan, binpack, on-demand, reserved |
| "Without code changes" | ALB OIDC, CloudFront, API caching |
| "Near real-time" | DynamoDB Streams, Kinesis, EventBridge |
| "Exactly-once" | SQS FIFO (NOT Standard) |
| "Cross-account" | AssumeRole + trust policy |
| "MFA protected" | GetSessionToken |
| "Ordering guaranteed" | SQS FIFO, Kinesis (per shard) |
| "Decouple" | SQS (NOT Kinesis unless streaming needed) |
Service Confusion Matrix:
| Scenario | DO NOT Pick | DO Pick |
|---|---|---|
| Simple decoupling | Kinesis | SQS |
| Serverless deploy | Raw CloudFormation | SAM |
| Coordinate Lambdas | Direct invoke | Step Functions |
| Feature flags | Lambda + SSM | AppConfig |
| Distributed tracing | CloudWatch Logs | X-Ray |
| API call auditing | X-Ray | CloudTrail |
| npm / Maven repository | ECR | CodeArtifact |
| Cross-device sync (1 user) | AppSync | Cognito Sync |
| Multi-user real-time | Cognito Sync | AppSync |
| Infra as code in Python | CloudFormation YAML | CDK |
| DB credential rotation | Parameter Store | Secrets Manager |
| Container image registry | S3 | ECR |
| Third-party webhook (least effort) | API GW + Lambda Auth | Lambda Function URL (AuthType NONE) |
| JWT auth for API Gateway | Identity Pool | User Pool (Cognito Authorizer) |
| Public HTTPS for Lambda | API Gateway (overkill) | Lambda Function URL |
When Stuck Between Two Answers:
- Does it follow least privilege?
- Does it use a managed/native service?
- Does it address the root cause (not symptoms)?
- Is it the simplest path meeting ALL requirements?
APPENDIX C - Additional Service Cards
AWS AppSync: Managed GraphQL; real-time subscriptions via WebSocket; offline support with conflict resolution.
AWS AppConfig: Feature flags and dynamic configuration; gradual rollout without deployment; preferred over Lambda + Parameter Store.
Amazon ECS Task Placement:
| Strategy | Behavior | Use Case |
|---|---|---|
| binpack | Pack tightly (fewest instances) | Cost optimization |
| spread | Distribute across AZs or instances | High availability |
| random | Random placement | Testing |
ECS Roles:
| Role | Who Uses It | For What |
|---|---|---|
| Execution Role | ECS Agent | Pull ECR images, push CloudWatch logs |
| Task Role | Your container | Call AWS APIs (S3, DynamoDB, SNS, etc.) |
EC2 Instance Metadata:
- http://169.254.169.254/latest/meta-data/ for instance details
- http://169.254.169.254/latest/user-data/ for launch scripts
- IMDSv2 (recommended): Requires session token
ALB: Layer 7; OIDC auth on HTTPS:443 (no code changes); X-Forwarded-For for client IP; Lambda targets supported. NLB is Layer 4 with no OIDC and no Lambda targets.
SQS Extended Client Library: Java SDK only. Messages up to 2 GB via S3 storage. Not available via CLI, console, or other SDKs.