Amazon API Gateway —
Managed API Control Plane

Your front door to backend services — create, publish, secure, and monitor APIs at any scale. Routes requests to Lambda, ECS, EC2, or any HTTP endpoint. Not just a proxy — it's your API control plane with auth, throttling, caching, and observability built in.

⚡ API Gateway in 30 Seconds

Fully managed API layer — create REST, HTTP, and WebSocket APIs without managing servers
Routes requests to Lambda, ALB, EC2, or any HTTP endpoint (backend integration)
Built-in security — IAM, Cognito, Lambda authorizers, API keys, resource policies
Throttling & caching — protect backends from overload, reduce latency with response cache
Two main types — REST API (full features) vs HTTP API (faster, cheaper, simpler)

Chapter One

What is API Gateway

The Problem — Why API Gateway Exists Introductory

You build a backend service — a Lambda function, an ECS container, or EC2 instances running your app. Now you need to expose it to the outside world. You need URL routing, authentication, rate limiting, CORS headers, request validation, monitoring, and versioning. Do you build all that yourself?

👉 The fundamental problem: Every API needs the same cross-cutting concerns — auth, throttling, logging, CORS, versioning. Building this per-service is wasteful. API Gateway provides all of it as a managed infrastructure layer in front of your backends.

What is an API Introductory

An API (Application Programming Interface) is a contract between systems. Your frontend says "give me the user's orders" — it calls GET /api/orders. The backend processes the request and returns structured data (JSON). The API defines the what (endpoints, methods, data format) — not the how (implementation details).

❌

Without API Gateway

Each backend exposes its own endpoint directly
Auth logic duplicated in every service
No centralized rate limiting
CORS, logging, versioning — per-service
Client needs to know each service's address
No single place to monitor all API calls

✅

With API Gateway

Single entry point for all APIs
Auth handled once at the gateway level
Global rate limiting and throttling
CORS, caching, logging — centralized
Client talks to one URL — gateway routes internally
CloudWatch metrics for every endpoint

What is Amazon API Gateway Introductory

Amazon API Gateway is a fully managed service that lets you create, publish, maintain, monitor, and secure APIs at any scale. It acts as the "front door" for applications to access data, business logic, or functionality from your backend services — Lambda functions, ECS containers, EC2 instances, or any HTTP endpoint.

🚪

Front Door

Single entry point for all API consumers. Clients call the gateway URL — the gateway decides which backend handles each request.

🛡️

Control Plane

Authentication, authorization, throttling, caching, request validation, CORS — all managed at the gateway before reaching your code.

📊

Observability

Every request logged to CloudWatch. Latency, 4xx/5xx errors, integration latency — all visible per-endpoint without code changes.

Diagram 1 — Concept: API Gateway as Front Door Introductory

API Gateway sits between clients and backends — single entry point for all API traffic

Diagram 2 — AWS Architecture: API Gateway in the Stack Core

AWS Architecture — Route 53 → CloudFront → API Gateway → Backend services

Mental Model — The Hotel Concierge Introductory

🏨

Without Concierge (Direct Access)

Guests find the restaurant themselves
No one checks if they're allowed in
Kitchen overwhelmed with too many orders
No record of who ordered what
Each room service handles its own billing

🛎️

With Concierge (API Gateway)

All requests go through the concierge desk
Concierge checks room key (authentication)
Limits orders per guest per hour (throttling)
Routes to the right kitchen/service (routing)
Logs everything (monitoring)
Caches common requests (response cache)

Diagram 3 — Where API Gateway fits in the full request flow Core

Complete Request Flow — User → DNS → Edge → API Gateway → Compute → Data

🧠 Key Insight

API Gateway is not just an HTTP proxy — it's a managed API control plane. It controls access, secures endpoints, applies policies (throttling, caching), and provides complete observability — all without a single line of infrastructure code.

What Makes API Gateway Special Core

⚡

Fully Managed

No servers, no patching, no scaling decisions. AWS handles availability, scaling to millions of requests, and security patches automatically.

🔌

Any Backend

Route to Lambda (serverless), ALB (containers), EC2 (VMs), or any public HTTP endpoint. Mix backends behind one API.

💰

Pay-Per-Request

No minimum fees. Pay only for API calls received + data transfer. $1/million requests (HTTP API) or $3.50/million (REST API).

Chapter Summary Introductory

API Gateway = managed front door for all your API traffic — single entry point for clients
Cross-cutting concerns — auth, throttling, caching, CORS, logging handled at gateway level, not per-service
Backend-agnostic — routes to Lambda, ALB/ECS, EC2, or any HTTP endpoint
Fully managed — no servers, auto-scales to millions of requests, pay-per-call
Position in stack — Route 53 → CloudFront → API Gateway → Compute → Data
Mental model = hotel concierge: checks credentials, routes requests, limits traffic, logs everything

Chapter Two

Core Concepts — REST vs HTTP APIs

Two API Types — REST API vs HTTP API Introductory

Amazon API Gateway offers two main API types for request/response APIs. Choosing the right one matters because it affects features, performance, and cost. Most new projects should start with HTTP API unless they need REST API-specific features.

👉 Simple rule: HTTP API = faster + cheaper + simpler. REST API = more features (caching, request validation, WAF, usage plans). Default to HTTP API unless you specifically need a REST API feature.

🌐

REST API (v1)

Full-featured API management
Request/response transformation
Built-in response caching
Request validation (models)
Usage plans + API keys (quota management)
WAF integration
Edge-optimized, Regional, or Private endpoints
$3.50 per million requests
Higher latency (~10-30ms overhead)

⚡

HTTP API (v2)

Lightweight, low-latency proxy
Automatic deployments
Native OIDC / OAuth 2.0 support
No request transformation (proxy only)
No built-in caching (use CloudFront)
No WAF (add CloudFront in front)
Regional endpoint only
$1.00 per million requests (71% cheaper)
Lower latency (~5-10ms overhead)

Comparison Table Core

Feature	REST API	HTTP API
Cost (per 1M requests)	$3.50	$1.00 ✅
Latency overhead	10-30ms	5-10ms ✅
Lambda integration	✅ Proxy + Custom	✅ Proxy only
HTTP backend	✅	✅
Response caching	✅ Built-in	❌ (use CloudFront)
Request validation	✅	❌
Request transformation	✅ (VTL templates)	❌
Usage plans / API keys	✅	❌
WAF integration	✅	❌ (via CloudFront)
OAuth 2.0 / OIDC (native)	❌ (use Cognito/Lambda)	✅ Built-in
Edge-optimized endpoint	✅	❌ (Regional only)
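
As a quick worked example of the cost row in the table above, the sketch below computes monthly request charges at the flat list prices quoted there. It assumes a single pricing tier and ignores data transfer, caching, and any free-tier allowance; the function name and the 50M request volume are illustrative.

```python
# Flat per-million request prices quoted in the table above (USD).
PRICE_PER_MILLION = {"REST": 3.50, "HTTP": 1.00}


def monthly_request_cost(api_type: str, requests: int) -> float:
    """Return the request charge in USD, assuming one flat pricing tier."""
    return PRICE_PER_MILLION[api_type] * requests / 1_000_000


# Illustrative volume: 50 million requests per month.
rest = monthly_request_cost("REST", 50_000_000)
http = monthly_request_cost("HTTP", 50_000_000)
savings_pct = (rest - http) / rest * 100
print(f"REST ${rest:.2f}  HTTP ${http:.2f}  savings {savings_pct:.0f}%")
```

At this volume the difference is $125 per month, which is where the "71% cheaper" figure comes from.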

Resources, Methods & Routes Core

An API is defined by its resources (URL paths) and methods (HTTP verbs). Together they form routes — the specific combinations that trigger backend integrations.

📂

Resources (Paths)

/users
/users/{id}
/orders
/orders/{orderId}/items
Hierarchical URL structure
Path parameters: {id}

🔧

Methods (Verbs)

GET — read data
POST — create new
PUT — replace
PATCH — partial update
DELETE — remove
OPTIONS — CORS preflight

🛤️

Routes (Combined)

GET /users → list users
POST /users → create user
GET /users/{id} → get one
DELETE /users/{id} → delete
Each route → one integration
Route = method + resource
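
To make "route = method + resource" concrete, here is a minimal sketch of how a gateway could match an incoming request against path templates such as /users/{id}. It is a conceptual model only, not how API Gateway is implemented; the route table and integration names are hypothetical.

```python
import re

# Hypothetical route table: (method, path template) -> integration name.
ROUTES = {
    ("GET", "/users"): "listUsers",
    ("POST", "/users"): "createUser",
    ("GET", "/users/{id}"): "getUser",
    ("DELETE", "/users/{id}"): "deleteUser",
}


def template_to_regex(template: str) -> re.Pattern:
    """Turn '/users/{id}' into a regex with one named group per path parameter."""
    parts = re.split(r"(\{\w+\})", template)
    pattern = "".join(
        f"(?P<{p[1:-1]}>[^/]+)" if p.startswith("{") else re.escape(p)
        for p in parts
    )
    return re.compile(f"^{pattern}$")


def match_route(method: str, path: str):
    """Return (integration, path_params) for the first matching route, else None."""
    for (route_method, template), integration in ROUTES.items():
        if route_method != method:
            continue
        m = template_to_regex(template).match(path)
        if m:
            return integration, m.groupdict()
    return None


print(match_route("GET", "/users/123"))
print(match_route("PATCH", "/users/123"))
```

The same path can map to different integrations depending on the verb, and an unmatched method/path pair yields no route at all.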

Stages & Deployments Core

Stages are named references to a deployment of your API — like environments. Each stage has its own URL, variables, and settings.

🚀

Stages

dev → https://abc123.execute-api.us-east-1.amazonaws.com/dev/
staging → .../staging/
prod → .../prod/
Each stage = separate URL + config
Stage variables (like env vars for API)
Different throttling/caching per stage

📦

Deployments

Snapshot of your API configuration
Must deploy to make changes live
REST API: manual deploy required
HTTP API: auto-deploy option (default on)
Can roll back to previous deployment
Canary deployments (REST API only)

Diagram 1 — Concept: Resources, Methods & Stages Introductory

API Structure — Resources + Methods form Routes, deployed to Stages

Diagram 2 — AWS: REST API vs HTTP API Architecture Core

REST API has full pipeline (transform + cache + validate); HTTP API is a lightweight pass-through

Diagram 3 — Endpoint Types Core

Endpoint Types — Edge-optimized (REST only) vs Regional vs Private

Stage Variables & Canary Deployments Advanced

🏷️

Stage Variables

Key-value pairs defined per stage (dev/prod)
Use in: integration URLs, Lambda aliases, VTL templates
Example: Lambda alias = ${stageVariables.lambdaAlias}
dev stage → alias=dev | prod stage → alias=prod
Same API, different Lambda versions per environment
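
The substitution idea behind stage variables can be sketched in a few lines. This is only a model of the behavior described above: API Gateway performs the replacement itself, and the variable table and the ordersFunction name are made up for illustration.

```python
import re

# Hypothetical per-stage variables, mirroring the dev/prod example above.
STAGE_VARIABLES = {
    "dev": {"lambdaAlias": "dev"},
    "prod": {"lambdaAlias": "prod"},
}


def resolve(template: str, stage: str) -> str:
    """Replace each ${stageVariables.name} with the value for the given stage."""
    variables = STAGE_VARIABLES[stage]
    return re.sub(
        r"\$\{stageVariables\.(\w+)\}",
        lambda m: variables[m.group(1)],
        template,
    )


# Illustrative function name qualified by a stage-dependent alias.
target = "ordersFunction:${stageVariables.lambdaAlias}"
print(resolve(target, "dev"))
print(resolve(target, "prod"))
```

One API definition therefore points at a different Lambda alias in each stage without being edited.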

🐦

Canary Deployments (REST API)

Deploy new version to canary (e.g., 5% traffic)
Monitor errors + latency on canary
All good → promote to full stage
Issues → rollback to previous deployment
Catch issues before full production release
REST API only (HTTP API uses auto-deploy)

Payload Limits Core

Limit	Value	Workaround for Larger
REST API payload	10 MB	S3 presigned URL for uploads
HTTP API payload	10 MB	S3 presigned URL for uploads
WebSocket message	128 KB	Chunk messages or use S3
Integration timeout	29 seconds	Async: SQS + Lambda (or Step Functions)

Custom domain pricing: $0.50/month per domain per region (in addition to API Gateway call charges).

🧠 Key Insight

Default to HTTP API for new projects — 71% cheaper, lower latency, simpler. Only choose REST API when you need: built-in caching, request validation, VTL transformation, WAF, usage plans, or edge-optimized endpoints.

Chapter Summary Introductory

REST API = full-featured ($3.50/1M, higher latency) — caching, transformation, validation, WAF, usage plans
HTTP API = lightweight proxy ($1.00/1M, lower latency) — cheap, fast, OIDC native, auto-deploy
Resources = URL paths (/users, /orders/{id}); Methods = HTTP verbs (GET, POST, DELETE)
Stages = environments (dev/staging/prod) — each with its own URL and settings
Endpoint types: Edge-optimized (CloudFront built-in), Regional (direct), Private (VPC only)
Exam tip: "cheapest API" → HTTP API; "caching at gateway" → REST API; "internal only" → Private

Chapter Three

Integrations — Lambda, ALB, HTTP

What is an Integration Introductory

An integration defines how API Gateway connects to your backend. When a request hits a route, the integration determines where the request goes and how it's passed along. Think of integrations as the "wires" connecting API Gateway to your compute layer.

👉 Key concept: Each route (e.g., GET /users) has exactly one integration — the backend that handles it. API Gateway supports Lambda, HTTP endpoints, AWS services, mock responses, and VPC links.

Integration Types Core

Lambda Proxy (Most Common)

API Gateway passes entire request to Lambda
Lambda receives: headers, body, path params, query string
Lambda returns: statusCode, headers, body
No VTL transformation needed
Works with both REST API and HTTP API
Default choice for serverless APIs

🌐

HTTP Proxy

Forwards request to any HTTP endpoint
ALB, EC2, on-premises servers, external APIs
Passes request as-is to backend URL
Backend must return proper HTTP response
Use VPC Link for private ALB/ECS
Best for: containerized services behind ALB

☁️

AWS Service Integration

Call AWS services directly (no Lambda needed)
Examples: DynamoDB PutItem, SQS SendMessage, Step Functions
API Gateway signs the request with IAM
Use VTL templates to transform request
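REST API: any AWS service action; HTTP API: a limited set of first-class integrations (e.g., SQS, Kinesis, Step Functions, EventBridge)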
Best for: simple CRUD without compute

🎭

Mock Integration

Returns a fixed response without calling any backend
Define the response in API Gateway itself
No Lambda/compute cost
Use for: testing, health checks, OPTIONS (CORS)
REST and WebSocket APIs (not HTTP API)
Best for: API prototyping + CORS preflight

Lambda Proxy — Deep Dive Core

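The Lambda Proxy integration is the most popular choice. API Gateway passes the full HTTP request as an event object to your Lambda function. Your function processes it and returns a response object. The event below uses the REST API shape (payload format 1.0); HTTP APIs default to payload format 2.0, which names some fields differently (for example, requestContext.http.method instead of httpMethod).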

// Lambda receives this event:
{ "httpMethod": "GET", "path": "/users/123",
  "headers": { "Authorization": "Bearer ..." },
  "pathParameters": { "id": "123" },
  "queryStringParameters": { "fields": "name,email" } }

// Lambda must return:
{ "statusCode": 200,
  "headers": { "Content-Type": "application/json" },
  "body": "{\"id\":\"123\",\"name\":\"Alice\"}" }
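
A handler that consumes the event above and produces that response shape might look like the following. It is a minimal sketch assuming the payload format 1.0 fields shown in the example; the USERS dictionary is a hypothetical stand-in for a real data store such as DynamoDB.

```python
import json

# Hypothetical in-memory data standing in for a real data store.
USERS = {"123": {"id": "123", "name": "Alice", "email": "alice@example.com"}}


def handler(event, context):
    """Handle GET /users/{id} for a Lambda proxy integration (payload format 1.0)."""
    # pathParameters can be null when the route has no parameters.
    user_id = (event.get("pathParameters") or {}).get("id")
    user = USERS.get(user_id)
    if user is None:
        return {
            "statusCode": 404,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"message": "user not found"}),
        }

    # Optional ?fields=name,email filter, as in the event example above.
    query = event.get("queryStringParameters") or {}
    if "fields" in query:
        wanted = query["fields"].split(",")
        user = {k: v for k, v in user.items() if k in wanted}

    # The body must be a string, so the JSON payload is serialized here.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(user),
    }
```

Note that the function, not the gateway, picks the status code and headers; that is what "proxy" means here.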

VPC Link — Private Integrations Advanced

By default, API Gateway connects to public endpoints. To reach resources in a private VPC (ALB, ECS, EC2 in private subnets), you need a VPC Link, which creates a private connection from API Gateway into your VPC. REST API VPC links use AWS PrivateLink to reach an NLB; HTTP API VPC links place ENIs in your subnets.

🔗

VPC Link for REST API

Connects to NLB in your VPC
NLB forwards to private targets
Pattern: API GW → VPC Link → NLB → ECS/EC2
Static IP for backend (NLB provides EIP)

🔗

VPC Link for HTTP API

Connects to ALB, NLB, or Cloud Map
More flexible — ALB for content routing
Pattern: API GW → VPC Link → ALB → ECS
Simpler setup than REST API VPC Link

Diagram 1 — Concept: Integration Types Introductory

Four integration types — Lambda proxy, HTTP proxy, AWS service, and Mock

Diagram 2 — AWS: API Gateway + Lambda (Serverless Pattern) Core

Most common pattern — API Gateway + Lambda + DynamoDB (fully serverless)

Diagram 3 — AWS: API Gateway + VPC Link + ALB (Containers) Advanced

Container pattern — API Gateway → VPC Link → ALB → ECS Fargate (private subnet)

🧠 Key Insight

Lambda Proxy is the default choice for serverless APIs — zero servers, auto-scales, pay-per-invocation. For containers, use HTTP API + VPC Link + ALB to reach ECS/EC2 in private subnets. Both patterns keep your backends completely hidden from the internet.

Chapter Summary Introductory

Integration = the "wire" connecting a route to a backend — each route has exactly one
Lambda Proxy = most common — full request passed as event, Lambda returns response object
HTTP Proxy = forwards to ALB/EC2/external URL — use for containers and legacy systems
AWS Service = call DynamoDB/SQS/Step Functions directly without Lambda (broadest on REST API; HTTP API supports a limited set of services)
Mock = return fixed response without any backend — testing, CORS, health checks
VPC Link = private connection to ALB/NLB in VPC — keeps containers off the internet
Serverless pattern: API GW → Lambda → DynamoDB (zero servers, pay-per-use)
Container pattern: API GW → VPC Link → ALB → ECS Fargate (private subnet)

Chapter Four

Request & Response Flow

How a Request Flows Through API Gateway Introductory

When a client calls your API, the request passes through multiple stages before reaching your backend — and the response passes through stages on the way back. Understanding this flow is critical for debugging and for the exam.

👉 The full pipeline (REST API): Client → Method Request (validate + auth) → Integration Request (transform) → Backend → Integration Response (transform) → Method Response (status codes) → Client. HTTP API skips the transformation stages.

REST API — Full Request Pipeline Core

➡️

Method Request (Inbound)

First stage — validates incoming request
Checks authorization (IAM, Cognito, Lambda authorizer)
Validates request parameters and body (if configured)
Applies API key requirement
Rejects bad requests BEFORE hitting backend
Saves you Lambda invocations and cost

🔄

Integration Request (Transform)

Transforms request before sending to backend
VTL (Velocity Template Language) mapping templates
Reshape headers, body, query params
Add hardcoded values or stage variables
Map path params to request body
REST API only — HTTP API skips this

⬅️

Integration Response (Return)

Receives raw response from backend
Maps backend response to API response
Select response based on status code pattern
Transform response body with VTL template
Map headers from backend to client
REST API only

📤

Method Response (Final)

Defines what the client actually sees
HTTP status codes (200, 400, 500)
Response headers to return
Response models (JSON schema)
CORS headers set here
Final output shape for the client
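
The four stages above can be modeled as four small functions. This is purely a conceptual sketch: in a real REST API these steps are configured with models and VTL mapping templates rather than written in Python, and the field names (name, userName, status) are invented for the example.

```python
import json


def method_request(request):
    """Stage 1: reject requests that fail validation before the backend is called."""
    if "name" not in request.get("body", {}):
        return {"statusCode": 400, "body": json.dumps({"message": "Invalid request body"})}
    return None  # None means the request passed validation


def integration_request(request):
    """Stage 2: reshape the client request into the form the backend expects."""
    return {"userName": request["body"]["name"], "source": "api-gateway"}


def backend(payload):
    """Stand-in for a Lambda function or HTTP service."""
    return {"status": "CREATED", "record": {"userName": payload["userName"]}}


def integration_response(raw):
    """Stage 3: map the backend result to a status code and client-facing body."""
    status = 201 if raw["status"] == "CREATED" else 500
    return status, {"name": raw["record"]["userName"]}


def method_response(status, body):
    """Stage 4: fix the final shape (status, headers, serialized body) the client sees."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),
    }


def handle(request):
    """Run a request through all four stages in order."""
    rejection = method_request(request)
    if rejection is not None:
        return rejection  # the backend is never invoked
    raw = backend(integration_request(request))
    return method_response(*integration_response(raw))


print(handle({"body": {"name": "Alice"}}))
print(handle({"body": {}}))
```

The second call never reaches backend(), which is the cost-saving effect of validating in the Method Request stage.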

Proxy Integration (Simple Mode) Core

In proxy mode (the only mode for HTTP API, and the Lambda proxy option on REST API), API Gateway skips all transformation: it passes the entire request to the backend and returns the response as-is. The backend controls everything.

⚡

Lambda Proxy (Recommended)

Full request passed as event object
Lambda controls status code, headers, body
No VTL templates needed
Most flexible — all logic in code
Default for HTTP API
The usual choice for new serverless APIs

🔧

Non-Proxy / Custom (REST API)

API Gateway transforms request before backend
Uses VTL mapping templates
Maps response back to client format
Useful when: backend expects different format
Legacy system integration
Rarely needed for new projects

Diagram 1 — Concept: REST API Request Pipeline Core

REST API Full Pipeline — 4 stages of processing (request and response)

Diagram 2 — AWS: Request Validation Saves Cost Core

Request validation rejects bad requests at the gateway — saves Lambda invocations

Diagram 3 — CORS Flow Advanced

CORS Preflight — Browser sends OPTIONS before actual request

🧠 Key Insight

For most APIs, use Lambda Proxy (pass-through) — let your code handle everything. Use the REST API full pipeline only when you need gateway-level request validation, response reshaping, or when integrating with legacy backends that expect different formats.

Chapter Summary Introductory

REST API pipeline: Method Request → Integration Request → Backend → Integration Response → Method Response
HTTP API / Proxy: Client → Route → Backend → Client (no transformation)
Request validation (REST API) rejects invalid requests before they hit Lambda — saves cost
VTL templates transform request/response format — REST API only, rarely needed
CORS: HTTP API = one-click toggle; REST API = manual OPTIONS method + headers
Default choice: Lambda Proxy (the usual pick for new APIs), where the backend controls everything

Chapter Five

Security & Authorization

Why Security at the Gateway Introductory

API Gateway is the first line of defense for your APIs. By handling authentication and authorization at the gateway level, you prevent unauthorized requests from ever reaching your backend — saving compute cost and reducing attack surface.

👉 Critical exam concept: API Gateway offers four access-control mechanisms: IAM, Cognito User Pools, Lambda Authorizers, and API Keys. The first three authorize callers; API keys only identify them. Know when to use each, because this is heavily tested across AWS exams.

Authorization Methods Compared Core

🔒

IAM Authorization

Request signed with AWS SigV4
Caller must have IAM role/user with permissions
Best for: AWS service-to-service calls
Lambda → API GW, EC2 → API GW
No external users (AWS accounts only)
Fine-grained: per-method IAM policies
Cross-account: resource policies

👥

Cognito User Pools

Users authenticate with Cognito, get JWT token
Client passes JWT in Authorization header
API Gateway validates token automatically
Best for: web/mobile apps with user sign-up
Managed user directory (email/password, social login)
No custom code needed for auth
REST API: native | HTTP API: JWT authorizer

Lambda Authorizer (Custom)

Your Lambda function decides allow/deny
Receives token or request parameters
Returns IAM policy (allow/deny + context)
Best for: custom auth logic, third-party tokens
OAuth tokens from non-Cognito providers
Can cache auth results (5min default)
Two types: Token-based, Request-based
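
A token-based Lambda authorizer for a REST API can be sketched as below. The event fields (authorizationToken, methodArn) and the response shape (principalId, policyDocument, context) follow the REST API authorizer contract; the VALID_TOKENS table is a hypothetical placeholder for real token verification.

```python
# Hypothetical token table; a real authorizer would verify a signed token
# or call an identity provider instead.
VALID_TOKENS = {"Bearer secret-token-123": "user-42"}


def build_policy(principal_id, effect, resource, context=None):
    """Build the response shape a REST API Lambda authorizer returns."""
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": effect,
                    "Resource": resource,
                }
            ],
        },
        # Extra key/value pairs are passed on to the backend integration.
        "context": context or {},
    }


def handler(event, context):
    """Token-based authorizer: allow known tokens, deny everything else."""
    token = event.get("authorizationToken", "")
    method_arn = event["methodArn"]
    user_id = VALID_TOKENS.get(token)
    if user_id is None:
        return build_policy("anonymous", "Deny", method_arn)
    return build_policy(user_id, "Allow", method_arn, {"userId": user_id})
```

Because results are cached per token, a policy scoped to one methodArn can be reused for a different method within the TTL, so many authorizers return a broader resource pattern.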

🔑

API Keys (NOT Auth!)

NOT for authentication — for identification/throttling
x-api-key header in request
Tied to usage plans (quota + throttle)
Best for: tracking API consumers, rate limiting
Partner APIs: each partner gets a key
REST API only (not HTTP API)
Combine WITH another auth method

Decision Flowchart — Which Auth Method Core

Scenario	Auth Method	Why
AWS service calling your API	IAM	Already has IAM role, SigV4 signing
Cross-account access	IAM + Resource Policy	Resource policy allows other account
Web/mobile app with user login	Cognito	Managed user pool, JWT tokens
Social login (Google, Facebook)	Cognito (federated)	Cognito handles federation
Third-party OAuth/OIDC tokens	Lambda Authorizer	Custom validation logic
Custom auth (database lookup, SAML)	Lambda Authorizer	Full control in your code
Track partners + rate limit per key	API Key + Usage Plan	Identification, not authentication

Diagram 1 — Concept: Auth Methods at the Gateway Introductory

Four authorization methods — each intercepts requests before they reach the backend

Diagram 2 — AWS: Cognito + API Gateway Flow Core

Most common web/mobile pattern — Cognito issues JWT, API Gateway validates it

Diagram 3 — Lambda Authorizer Flow Advanced

Lambda Authorizer — custom function validates token and returns IAM policy

HTTP API JWT Authorizer (Built-in) Core

🗝️

How It Works

HTTP API validates JWTs natively — no Lambda needed
Works with Cognito User Pools or any OIDC provider
Checks: issuer (iss), audience (aud), expiry (exp)
Sets $context.authorizer.claims for backend
Faster + cheaper than Lambda authorizer
Configure: issuer URL + audience claim
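
The three claim checks listed above (iss, aud, exp) are simple comparisons once a token has been decoded. The sketch below shows only those comparisons on an already-decoded claim set; signature verification, which the gateway also performs, is deliberately left out, and the function name and parameters are illustrative.

```python
import time


def claims_are_valid(claims, issuer, audiences, now=None):
    """Check iss, aud, and exp on an already-decoded claim set.

    Signature verification is deliberately out of scope for this sketch.
    """
    now = time.time() if now is None else now
    # The issuer must match the configured issuer URL exactly.
    if claims.get("iss") != issuer:
        return False
    # aud may be a single string or a list; one match is enough.
    aud = claims.get("aud")
    token_audiences = aud if isinstance(aud, list) else [aud]
    if not any(a in audiences for a in token_audiences):
        return False
    # The token must not be expired.
    exp = claims.get("exp")
    return isinstance(exp, (int, float)) and exp > now
```

A call such as claims_are_valid(claims, "https://issuer.example.com", {"my-app"}) mirrors the two settings you configure on the authorizer: issuer URL and audience.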

⚡

Lambda Authorizer Caching Impact

First request: Lambda invoked — adds 50–500ms latency
Cached requests (within TTL): ~10ms — policy reused
TTL default: 300 seconds (5 minutes)
TTL = 0: every request hits Lambda (expensive!)
TTL too high: auth changes slow to propagate
Best practice: TTL = 300–900s for most apps

Resource Policies Advanced

📜

What Resource Policies Do

JSON policy attached to the API itself
Controls WHO can invoke the API
Allow/deny by: IP, VPC, AWS account
Works with IAM auth (additive)
Required for cross-account access
Required for Private API (VPC access)

🛡️

Common Use Cases

Allow only specific IP ranges (corporate)
Allow specific VPC endpoint (Private API)
Allow specific AWS account (cross-account)
Deny all except whitelisted sources
Combine with IAM for defense-in-depth
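
As an illustration of the "allow only specific IP ranges" case, the sketch below builds a resource policy document as a Python dictionary. It follows the common allow-all-then-deny-outside-range pattern; the helper name is illustrative and 203.0.113.0/24 is a documentation-only placeholder range, so substitute your own CIDRs and review the policy before attaching it to an API.

```python
import json


def ip_allowlist_policy(allowed_cidrs):
    """Build a resource policy that denies invocations from outside the given CIDRs."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Baseline: anyone may invoke the API...
                "Effect": "Allow",
                "Principal": "*",
                "Action": "execute-api:Invoke",
                "Resource": "execute-api:/*",
            },
            {
                # ...unless the caller's IP is outside the allowed ranges.
                "Effect": "Deny",
                "Principal": "*",
                "Action": "execute-api:Invoke",
                "Resource": "execute-api:/*",
                "Condition": {"NotIpAddress": {"aws:SourceIp": list(allowed_cidrs)}},
            },
        ],
    }


# 203.0.113.0/24 is a documentation-only range used as a placeholder.
policy = ip_allowlist_policy(["203.0.113.0/24"])
print(json.dumps(policy, indent=2))
```

An explicit Deny always wins over an Allow, which is why the second statement is what actually enforces the restriction.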

🧠 Key Insight

Cognito for user-facing apps (managed, no code); IAM for service-to-service (SigV4); Lambda Authorizer for custom/third-party tokens. API Keys are NOT auth — they're for identification and throttling only. This distinction is tested heavily across AWS exams.

Chapter Summary Introductory

IAM = AWS service-to-service auth (SigV4 signing) — internal AWS calls
Cognito = user pools with JWT tokens — web/mobile apps, managed user directory
Lambda Authorizer = custom auth logic — third-party tokens, legacy systems, SAML
API Keys = NOT auth! — identification + throttling via usage plans (REST API only)
Resource Policies = IP/VPC/account restrictions on the API itself — cross-account, Private API
Caching: Lambda Authorizer results cached 0-3600s (default 300s) to reduce invocations
Exam keywords: "user login" → Cognito, "AWS calls" → IAM, "custom token" → Lambda Auth

Chapter Six

Throttling, Caching & Monitoring

Throttling — Protecting Your Backends Core

API Gateway includes built-in rate limiting that protects your backend from traffic spikes. Without throttling, a sudden burst of requests could overwhelm Lambda concurrency, ECS tasks, or database connections.

👉 Default limits: 10,000 requests/second steady-state + 5,000 burst across ALL APIs in an account/region. Per-method throttling and usage plans let you set granular limits.

🚦

Account-Level Throttling

10,000 requests/second (steady-state)
5,000 burst capacity (token bucket)
Shared across ALL APIs in account + region
Exceeding returns 429 Too Many Requests
Can request increase via AWS support
Exam tip: know these defaults!
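
The token bucket mentioned above is easy to model: tokens refill at the steady-state rate, the bucket holds at most the burst size, and each request spends one token. The class below is a minimal single-process sketch with explicit timestamps, not API Gateway's actual implementation; the small rate and burst values in the usage lines are chosen only to keep the output readable.

```python
class TokenBucket:
    """Token bucket: `rate` tokens refill per second, up to `burst` tokens."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0             # timestamp of the previous request

    def allow(self, now):
        """Return True if a request at time `now` (seconds) is admitted."""
        elapsed = now - self.last
        self.last = now
        # Refill for the elapsed time, never exceeding the burst capacity.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def status_for(bucket, now):
    """Map the admission decision to the HTTP status a client would see."""
    return 200 if bucket.allow(now) else 429


bucket = TokenBucket(rate=2, burst=3)
print([status_for(bucket, 0.0) for _ in range(4)])
print([status_for(bucket, 1.0) for _ in range(3)])
```

With the account defaults the same logic applies at rate=10,000 and burst=5,000: short spikes are absorbed by the bucket, and sustained traffic above the rate is rejected with 429.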

🎯

Per-Method / Per-Client Throttling

Set rate + burst per API method
Usage plans: per-client quotas (daily/weekly/monthly)
API Key identifies the client
Example: Free tier = 1000 calls/day
Premium = 100,000 calls/day
REST API only (usage plans)

Usage Plans & API Keys Core

📊

Usage Plan

Defines throttle (rate + burst) and quota (requests per day/week/month) for a group of API consumers.

🔑

API Key

Identifies the caller. Tied to a usage plan. Sent via x-api-key header. NOT for auth — for tracking + limiting.

💰

Monetization Pattern

Free tier (100/day) vs Basic ($10/mo, 10K/day) vs Pro ($100/mo, unlimited). Each tier = one usage plan.

Response Caching Core

API Gateway caching stores backend responses and returns them directly for identical subsequent requests — reducing backend calls, latency, and cost. REST API only.

⚡

How Caching Works

Cache per-stage (dev/prod cached separately)
TTL: 0–3600 seconds (default 300s)
Cache size: 0.5 GB to 237 GB
Cache key = resource path, plus any headers, query strings, or path parameters you mark as cache keys
Cache hit → return immediately (no backend call)
Cache miss → call backend, store response
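
The hit/miss behavior above amounts to a TTL cache keyed on the request. The sketch below models it with an in-memory dictionary and explicit timestamps; the class name, the key layout (path plus sorted parameters), and the fetch callback are assumptions made for illustration, not the gateway's internals.

```python
class ResponseCache:
    """Per-stage response cache keyed by path and sorted request parameters."""

    def __init__(self, ttl_seconds=300):
        self.ttl = ttl_seconds
        self.entries = {}  # key -> (expires_at, response)

    @staticmethod
    def key(path, params):
        """Build a hashable key that ignores parameter ordering."""
        return path, tuple(sorted(params.items()))

    def get_or_fetch(self, path, params, fetch, now):
        """Return (response, hit). `fetch` is called only on a miss or expiry."""
        k = self.key(path, params)
        entry = self.entries.get(k)
        if entry is not None and entry[0] > now:
            return entry[1], True  # cache hit: backend not called
        response = fetch(path, params)  # cache miss: call the backend
        self.entries[k] = (now + self.ttl, response)
        return response, False
```

Calling get_or_fetch twice within the TTL invokes the backend once; after the TTL elapses the next call is a miss and refreshes the entry.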

⚠️

Caching Gotchas

Charged per hour ($0.02–$3.80/hr by size)
NOT available on HTTP API (use CloudFront instead)
Invalidate with Cache-Control: max-age=0
Client can bypass with header (if allowed)
Enabled per stage; TTL and on/off can be overridden per method
Don't cache user-specific responses!

Monitoring — CloudWatch Integration Core

📈

Key Metrics

Count — total API calls
Latency — time to respond
IntegrationLatency — backend time
4XXError — client errors
5XXError — server errors
CacheHitCount/MissCount

📝

Logging

Execution logs (full request/response)
Access logs (Apache-format, custom)
Enable per-stage
Send to CloudWatch Logs
X-Ray tracing for distributed tracing
Debug issues end-to-end

🔔

Alarms

5XX rate > 1% → alert
Latency p99 > 3s → alert
4XX spike → possible attack
429 count → throttle hitting
Integration timeout → backend issue
SNS notifications

Diagram 1 — Concept: Throttling Layers Core

Throttling layers — Account limit → Stage limit → Method limit → Usage plan (per-client)

Diagram 2 — AWS: Caching Flow Core

Response Caching — cache hit returns immediately, cache miss calls backend and stores result

Diagram 3 — Monitoring Dashboard Core

CloudWatch Metrics — What to monitor for production APIs

🧠 Key Insight

Throttling protects backends from overload (429 errors). Caching reduces cost and latency (REST API only — use CloudFront for HTTP API). CloudWatch gives you Count, Latency, Errors per-endpoint. X-Ray traces requests end-to-end across services.

Chapter Summary Introductory

Throttling: 10K/s account-level, per-stage, per-method — excess returns 429
Usage Plans: per-client quotas (daily/monthly) + API keys for identification (REST API only)
Caching: 0.5–237 GB, TTL 0–3600s, per-stage — REST API only (HTTP API: use CloudFront)
CloudWatch Metrics: Count, Latency, IntegrationLatency, 4XX, 5XX, CacheHit/Miss
Logging: execution logs (debug) + access logs (analytics) — per-stage enablement
X-Ray: distributed tracing across API GW → Lambda → DynamoDB

Chapter Seven

Architecture Patterns

Pattern 1 — Serverless REST API Core

The most common pattern for new projects. Zero servers, zero infrastructure management, pay-per-use. API Gateway + Lambda + DynamoDB handles everything from auth to data storage.

Pattern 1 — Full serverless API (most common for startups and new projects)

Pattern 2 — API Gateway + Microservices (ECS/Fargate) Core

🐳

When to Use

Services need long-running processes
Existing Docker containers
Large request/response payloads
Need consistent latency (no cold starts)
Team prefers containers over Lambda

🏗️

Architecture

API GW (HTTP API) → VPC Link → ALB
ALB routes to ECS Fargate services
Each microservice = own container + target group
Private subnet — no direct internet access
API GW handles CORS, auth, throttling

Pattern 3 — Public API for Partners Advanced

🌐

Architecture

REST API (for usage plans + API keys)
Usage plans: Free, Basic, Premium tiers
Each partner gets API key + quota
Response caching for public data
WAF for DDoS protection
Custom domain: api.yourcompany.com

💰

Monetization

Free: 100 calls/day, 1 req/s
Basic ($29/mo): 10K calls/day, 10 req/s
Pro ($199/mo): 100K calls/day, 100 req/s
Enterprise: custom limits + SLA
Track usage via CloudWatch
Billing via AWS Marketplace (optional)

Pattern 4 — WebSocket API (Real-time) Advanced

🔌

WebSocket APIs

Persistent two-way connections
Routes: $connect, $disconnect, $default, custom
Each route → one integration (typically Lambda)
Connection ID for targeting specific clients
1M concurrent connections supported
Use for: chat, notifications, live dashboards

💬

Use Cases

Real-time chat applications
Live stock tickers / sports scores
IoT device communication
Multiplayer game state sync
Collaborative editing (like Google Docs)
Push notifications to connected clients

Pattern 5 — API Gateway + CloudFront (Edge Enhancement) Advanced

🌍

Why Add CloudFront in Front

Edge caching — cache GET responses at 450+ PoPs globally
WAF at edge — block attacks before hitting API Gateway
Custom domain + TLS termination at edge
Shield Standard (DDoS) included free with CloudFront
Solves HTTP API limitation: no built-in cache or WAF

🏗️

When to Use This Pattern

HTTP API + need response caching
HTTP API + need WAF protection
Need custom domain with managed TLS
Global users — reduce latency via edge
Want DDoS protection without upgrading to Shield Advanced
Pattern: Route 53 → CloudFront → API GW → Lambda

WebSocket API — Deep Dive Advanced

📡

Route Types

$connect — client opens connection (accept/reject)
$disconnect — client leaves (cleanup)
$default — unmatched messages
Custom: join, message, leave
Each route → one integration (Lambda is typical; HTTP and AWS service integrations also work)

🔌

Connection Management

@connections API to push to clients
POST to connectionId → message sent
DELETE connectionId → disconnect client
Store connectionIds in DynamoDB
Max 1M concurrent connections
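
Pushing a message to stored connection IDs can be sketched as follows. The function assumes a client shaped like boto3.client("apigatewaymanagementapi", endpoint_url=...), whose post_to_connection call and GoneException are part of boto3; the broadcast helper itself and the idea of returning stale IDs for cleanup are illustrative choices.

```python
import json


def broadcast(client, connection_ids, message):
    """Send `message` to every connection; return the IDs that no longer exist.

    `client` is expected to behave like a boto3 "apigatewaymanagementapi"
    client created with the WebSocket API's endpoint URL.
    """
    stale = []
    payload = json.dumps(message).encode("utf-8")
    for connection_id in connection_ids:
        try:
            client.post_to_connection(ConnectionId=connection_id, Data=payload)
        except client.exceptions.GoneException:
            # The client disconnected; remember the ID so it can be removed
            # from the connection table (for example, in DynamoDB).
            stale.append(connection_id)
    return stale
```

In a Lambda function you would load the connection IDs from your table, call broadcast, and delete whatever IDs it returns.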

⏱️

Limits & Pricing

Idle timeout: 10 minutes
Max connection: 2 hours
Message size: 128 KB max
Pricing: ~$1/1M messages + $0.25/1M connect-minutes
Send keepalive pings before idle timeout

Diagram 2 — Pattern Decision Tree Core

Which API Gateway pattern? — Decision flowchart

Diagram 3 — Full Production Architecture Advanced

Production-grade serverless API — all pieces together

🧠 Key Insight

Default to HTTP API + Lambda for new projects (cheapest, fastest, simplest). Use REST API only when you need caching, WAF, API keys, or request validation at the gateway. For containers, use HTTP API + VPC Link + ALB. For real-time, use WebSocket API.

Chapter Summary Introductory

Pattern 1 (Serverless): HTTP API + Lambda + DynamoDB — zero servers, ~$2.50/month for 1M requests
Pattern 2 (Containers): HTTP API + VPC Link + ALB + ECS Fargate — long-running, consistent latency
Pattern 3 (Public API): REST API + Usage Plans + API Keys — monetization, per-client quotas
Pattern 4 (Real-time): WebSocket API + Lambda — chat, notifications, live data
Decision: Real-time → WebSocket | Need cache/WAF/keys → REST | Default → HTTP API
Production stack: Route 53 + WAF + API GW + Cognito + Lambda + DynamoDB + CloudWatch + X-Ray