Security Services
Tools for identity, access management, and protecting workloads in AWS.
What is IAM?
AWS Identity and Access Management (IAM) is the global service that controls who (users, roles, applications) can perform what actions on which AWS resources. It is the foundation of AWS security.
How IAM Works
- Identities: Users, roles, or federated identities make requests.
- Policies: JSON documents that define permissions, attached to identities or resources.
- Evaluation: The default is an implicit deny. An explicit Deny always overrides any Allow.
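The evaluation rule above can be sketched in a few lines. This is a simplified model, not the real IAM engine: it ignores conditions, principals, boundaries, and SCPs, and the bucket name and the `evaluate` helper are hypothetical.

```python
from fnmatch import fnmatchcase

def evaluate(statements, action, resource):
    """Return "Allow" or "Deny" for one request (simplified model)."""
    allowed = False
    for stmt in statements:
        matches = (
            any(fnmatchcase(action, pattern) for pattern in stmt["Action"])
            and any(fnmatchcase(resource, pattern) for pattern in stmt["Resource"])
        )
        if not matches:
            continue
        if stmt["Effect"] == "Deny":
            return "Deny"      # an explicit deny wins immediately
        allowed = True         # remember the allow, keep scanning for denies
    return "Allow" if allowed else "Deny"  # nothing matched: implicit deny

# Hypothetical policy: allow all S3 actions on one bucket, but deny deletes.
policy = [
    {"Effect": "Allow", "Action": ["s3:*"], "Resource": ["arn:aws:s3:::demo-bucket/*"]},
    {"Effect": "Deny", "Action": ["s3:DeleteObject"], "Resource": ["arn:aws:s3:::demo-bucket/*"]},
]
```

With this policy, a GetObject request is allowed, a DeleteObject request is denied by the explicit Deny, and any EC2 action falls through to the implicit deny.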
Core Components
- Users: Permanent identities for humans or applications. Credentials: password, access keys, MFA.
- Groups: Collections of users for shared policy management. Groups cannot contain other groups.
- Roles: Temporary identities assumed by AWS services or external accounts via STS. Best practice for applications: no long-lived keys.
- Policies:
  - AWS Managed: Pre-built (e.g., AmazonS3ReadOnlyAccess)
  - Customer Managed: Your own reusable policies
  - Inline: Embedded directly in a user, group, or role
Key Features
- MFA: Adds a second authentication factor for console/API access. Strongly recommended for root and privileged accounts.
- Identity Federation: SAML 2.0 for enterprise SSO (Active Directory), OIDC for web identity (Google, GitHub Actions).
- ABAC: Tag-driven permissions, e.g., allow access to resources tagged env=dev by principals tagged team=devs.
- Cross-Account Access: A role in Account A trusts Account B; principals in Account B assume it via sts:AssumeRole.
- Permission Boundaries: Ceiling on the maximum permissions a user or role can ever have.
- IAM Access Analyzer: Identifies resources shared publicly or cross-account for security audits.
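As an illustration of the ABAC item above, a policy document can be built as a plain Python dictionary and serialised to JSON. The chosen actions are an assumption for the example; the tag keys and values (env=dev, team=devs) come from the text.

```python
import json

# Sketch of an ABAC policy: allow the actions only when the resource is
# tagged env=dev and the calling principal is tagged team=devs.
abac_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["ec2:StartInstances", "ec2:StopInstances"],  # assumed actions
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/env": "dev",
                    "aws:PrincipalTag/team": "devs",
                }
            },
        }
    ],
}

policy_json = json.dumps(abac_policy, indent=2)
```

Because the permission hangs on tags rather than resource ARNs, newly created resources are covered as soon as they carry the right tag.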
Best Practices
- Apply least privilege: grant only the minimum required permissions.
- Never use the root account for daily tasks. Lock it with MFA.
- Use roles for applications and services, not long-lived access keys.
- Rotate and audit access keys regularly.
Related Identity Services
- Amazon Cognito: User directory (User Pools) for end-user authentication, plus Identity Pools for temporary AWS credentials. Recommended for managing application users.
- AWS Directory Service: Managed Microsoft AD, AD Connector (proxy to on-premises AD), Simple AD.
Compute Services
Scalable compute resources for running applications, containers, and serverless workloads.
What is EC2?
Amazon EC2 provides resizable virtual servers in the cloud. It is the foundational service for running custom applications without managing physical hardware. You control the OS, patching, and configuration.
Architecture
Instances run on the Nitro hypervisor, which offloads networking, storage, and security to dedicated hardware for improved performance and isolation. Instances launch inside a VPC subnet from an AMI.
- AMIs: Pre-configured templates with OS + software. Regional, shareable across accounts.
- Instance Metadata: Accessible at http://169.254.169.254/latest/meta-data/ (instance ID, IP, role credentials).
Instance Families
- General Purpose (T, M): Balanced CPU, memory, and network for web apps and app servers. T-series is burstable.
- Compute Optimized (C): High vCPU ratio for batch processing, gaming, and HPC.
- Memory Optimized (R, X): Large memory for in-memory databases and real-time analytics.
- Storage Optimized (I, D): High-throughput local storage for NoSQL and data warehouses.
- GPU (G, P): NVIDIA GPUs for ML training, video transcoding, and inference.
Storage Options
- EBS: Persistent network-attached storage. Survives stop/start. Supports snapshots and live resizing.
- Instance Store: Local ephemeral SSD with very high IOPS, but data is lost on stop or termination. Use for temp data or caches only.
- EFS / S3: Shared file system (EFS) or object storage (S3) accessible over the network.
Purchase Options
- On-Demand: Pay per second with no commitment; suits dev/test and unpredictable workloads.
- Reserved Instances: 1- or 3-year commitment for significant savings on predictable workloads.
- Spot Instances: Spare capacity at a deep discount; interruptible, so ideal for batch and fault-tolerant jobs.
- Savings Plans: Flexible commitment by compute spend; Compute Savings Plans apply across EC2, Lambda, and Fargate.
Scaling & High Availability
- Use Auto Scaling Groups (ASG) to automatically launch/terminate instances across multiple AZs.
- Use Elastic Load Balancer to distribute traffic and perform health checks.
What is Lambda?
Lambda runs your code in response to events without any server management. AWS handles execution, scaling, and availability. Functions are stateless and terminate after execution.
How It Works
- Triggers: API Gateway, S3, DynamoDB Streams, SQS, EventBridge, SNS, and more.
- Execution Environment: Isolated container with configurable memory (128 MB to 10 GB). CPU scales proportionally.
- Concurrency: Scales automatically to thousands of concurrent executions. Use Provisioned Concurrency to eliminate cold starts.
- Runtimes: Java, Python, Node.js, Go, Ruby, .NET, and custom runtimes via Lambda Layers.
Key Limits & Features
- Max execution timeout: 15 minutes
- Ephemeral /tmp storage up to 10 GB
- Lambda Layers: Share common dependencies (libraries, JARs) across functions
- Dead Letter Queues: SQS or SNS for failed async invocations
- Destinations: Route success/failure results to SQS, SNS, EventBridge, or another Lambda
Best Practices
- Initialise SDK clients and DB connections outside the handler to take advantage of container reuse.
- Use environment variables for configuration; never hardcode secrets.
- Instrument with X-Ray for profiling and tracing.
- Keep functions small and single-purpose.
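The first two practices can be shown with a minimal handler sketch. The `TABLE_NAME` variable and the `_create_client` stand-in are hypothetical; in a real function the stand-in would be an SDK client or database connection.

```python
import json
import os

# Module-level code runs once per execution environment (cold start),
# not once per invocation.
TABLE_NAME = os.environ.get("TABLE_NAME", "demo-table")  # hypothetical variable name
INIT_COUNT = 0

def _create_client():
    """Stand-in for an expensive SDK client or DB connection (hypothetical)."""
    global INIT_COUNT
    INIT_COUNT += 1
    return {"table": TABLE_NAME}

CLIENT = _create_client()

def handler(event, context):
    """Entry point: reuses CLIENT on warm invocations instead of rebuilding it."""
    return {
        "statusCode": 200,
        "body": json.dumps({"table": CLIENT["table"], "echo": event.get("id")}),
    }
```

Calling `handler` repeatedly in the same process leaves `INIT_COUNT` at 1, which mirrors what happens when a warm execution environment serves several invocations.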
Amazon ECS (Elastic Container Service)
ECS is a fully managed container orchestration service for Docker containers. You define tasks (container + resource specs) and services (desired count, scaling, load balancer integration).
- Task Definition: Blueprint specifying image, CPU, memory, environment variables, and networking mode.
- Service: Maintains desired number of task copies. Integrates with ALB for traffic distribution.
- Launch Types: EC2 (manage instances yourself) or Fargate (serverless).
AWS Fargate
Fargate removes the need to provision EC2 instances for containers. Specify vCPU and memory per task, and AWS provisions the underlying compute for you.
- No EC2 instances to patch, scale, or maintain.
- Works with both ECS and EKS.
- Ideal for microservices, batch jobs, and variable-load workloads.
What is ELB?
ELB distributes incoming traffic across multiple targets (EC2, containers, Lambda, IPs) across Availability Zones for fault tolerance and scale.
ELB Types
- Application Load Balancer (ALB): HTTP/HTTPS, Layer 7. Path and host-based routing, WebSockets, gRPC. Best for web apps and microservices.
- Network Load Balancer (NLB): TCP/UDP/TLS, Layer 4. Ultra-low latency, static IP, millions of requests per second. Best for high-performance workloads.
- Gateway Load Balancer (GWLB): For scaling third-party virtual appliances (firewalls, IDS/IPS).
Key Features
- Health Checks: Automatically removes unhealthy targets from rotation.
- SSL/TLS Termination: Offload TLS using ACM certificates at the load balancer.
- Sticky Sessions: Route a user to the same target using session cookies.
- Cross-Zone Load Balancing: Distribute traffic evenly across all AZs regardless of target count per AZ.
What are ASGs?
ASGs automatically adjust EC2 instance count in response to demand, ensuring the right capacity is available without over- or under-provisioning.
Scaling Policies
- Target Tracking: Maintain a specific metric value (e.g., CPU at 50%). Simplest and most common.
- Step Scaling: Add/remove fixed capacity in steps based on CloudWatch alarms.
- Scheduled Scaling: Pre-planned scaling for predictable load patterns.
- Predictive Scaling: ML-based demand forecasting for proactive scale-out.
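To make target tracking concrete, here is a rough proportional model. It is an assumption for illustration only: the real service derives its adjustments from CloudWatch alarms and is more conservative on scale-in, and `target_tracking_desired` is a hypothetical helper.

```python
import math

def target_tracking_desired(current_capacity, metric_value, target_value,
                            min_size, max_size):
    """Capacity that would bring the average metric back to the target,
    clamped to the group's min/max bounds (simplified proportional model)."""
    if current_capacity <= 0 or target_value <= 0:
        raise ValueError("capacity and target must be positive")
    proportional = math.ceil(current_capacity * metric_value / target_value)
    return max(min_size, min(max_size, proportional))
```

For example, 4 instances averaging 75% CPU against a 50% target suggests 6 instances, since the same load spread over 6 instances averages 50%.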
Key Concepts
- Launch Template: Defines instance config (AMI, type, key pair, security groups).
- Min / Max / Desired Capacity: Bounds and target counts for the group.
- Warm Pools: Pre-initialised instances kept in a stopped state to reduce scale-out latency.
What is Elastic Beanstalk?
Elastic Beanstalk is a PaaS that automates deployment, scaling, and management of web applications. Upload code; Beanstalk provisions EC2, load balancers, auto scaling, and monitoring automatically.
- Supports Java, .NET, Python, Node.js, PHP, Ruby, Go, Docker.
- You retain full control over underlying AWS resources.
- Supports rolling, blue/green, and immutable deployment strategies.
- Ideal for developers who want fast deployment without deep infrastructure expertise.
Monitoring & Management
Observe, audit, and govern your AWS environment at scale.
What is CloudWatch?
CloudWatch is the observability hub for AWS, collecting metrics, logs, and events from virtually every AWS service and your own applications.
Core Capabilities
- Metrics: Time-series data such as CPU, memory (via agent), error rates, and latency.
- Alarms: Trigger SNS notifications, ASG scaling, or Lambda when a metric breaches a threshold.
- Logs: Collect from EC2, Lambda, ECS, and any application via the CloudWatch Agent or SDK.
- Logs Insights: Interactive SQL-like query engine for log data.
- Dashboards: Real-time visualisations across services and accounts.
- Synthetics: Scripted canary monitors simulating user interactions.
- Container Insights: Performance monitoring for ECS and EKS.
Best Practices
- Install the CloudWatch Agent on EC2 for memory and disk metrics (not available by default).
- Use Metric Math to derive composite metrics (e.g., error rate = errors ÷ requests).
- Use EMF (Embedded Metric Format) in Lambda for high-volume structured metrics.
- Configure log retention policies to prevent unbounded log growth.
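The EMF practice above amounts to printing a JSON log line with an `_aws` metadata block. A minimal sketch, where the namespace, dimension, and metric names are assumptions and `emf_record` is a hypothetical helper:

```python
import json
import time

def emf_record(namespace, dimensions, metric_name, value, unit="Count"):
    """Build one EMF log line; writing it to stdout in Lambda emits the metric."""
    record = {
        "_aws": {
            "Timestamp": int(time.time() * 1000),  # epoch milliseconds
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    "Dimensions": [list(dimensions)],  # one set of dimension names
                    "Metrics": [{"Name": metric_name, "Unit": unit}],
                }
            ],
        },
        metric_name: value,  # the metric value lives at the top level
    }
    record.update(dimensions)  # dimension values live at the top level too
    return json.dumps(record)

line = emf_record("DemoApp", {"Service": "checkout"}, "Latency", 42.5, "Milliseconds")
```

Because the metric travels inside the log stream, no separate PutMetricData call is needed per data point, which is what makes the format suitable for high-volume metrics.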
What is CloudTrail?
CloudTrail records AWS API calls in your account: who made each call, when, from where, and what was affected. It is the primary tool for auditing, compliance, and security investigation.
Key Concepts
- Event History: Last 90 days of management events; free and always available.
- Trails: Persistent logging to S3 (and optionally CloudWatch Logs) for long-term retention.
- Management Events: Control plane actions such as creating buckets, launching EC2 instances, and modifying IAM policies.
- Data Events: Resource-level operations such as S3 object reads/writes and Lambda invocations. High volume, opt-in.
- Insights: Automatically detects unusual API activity (e.g., a spike in DeleteBucket calls).
Best Practices
- Enable a multi-region trail to capture all account-wide activity.
- Enable log file validation to detect tampered logs.
- Protect the trail S3 bucket: enable versioning and restrict delete permissions.
- Alert on root account usage via a CloudWatch Logs metric filter.
What is EventBridge?
EventBridge is a serverless event bus connecting AWS services, SaaS applications, and custom microservices via events.
- Event Bus: Default (AWS services), custom (your events), partner (SaaS integrations).
- Rules: Filter events by pattern and route them to targets such as Lambda, SQS, SNS, and Step Functions.
- Scheduled Rules: Cron- or rate-based triggers replacing CloudWatch Scheduled Events.
- Schema Registry: Auto-discover event schemas and generate code bindings.
- Pipes: Point-to-point connections from a source (SQS, DynamoDB Streams) to a target with optional Lambda enrichment.
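How a rule filters events by pattern can be sketched with a small matcher. This is a simplified model that handles only exact-value matching; the real service also supports prefix, numeric, and other operators. The `matches` helper and the sample event are illustrative.

```python
def matches(pattern, event):
    """True if every field in the pattern matches the event (exact values only)."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: descend into the matching nested event object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:  # a pattern leaf is a list of allowed values
            return False
    return True

pattern = {"source": ["aws.ec2"], "detail": {"state": ["stopped", "terminated"]}}
event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "stopped"},
}
```

Fields absent from the pattern are ignored, so a pattern only needs to name the attributes it cares about.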
Storage Services
Scalable, durable storage for every use case: object, block, and shared file systems.
What is S3?
Amazon S3 is object storage: store any amount of data, access it from anywhere, and build upon it with versioning, lifecycle policies, event notifications, and analytics.
Core Concepts
- Buckets: Globally unique containers. Region-specific, globally accessible.
- Objects: Any file up to 5 TB, identified by a key.
- Durability: 99.999999999% (11 nines).
- Versioning: Preserve and restore every version of an object.
Storage Classes
- Standard: General-purpose, frequent access.
- Standard-IA: Infrequent access, lower cost, retrieval fee applies.
- One Zone-IA: Single AZ, lower cost; for non-critical, reproducible data.
- Intelligent-Tiering: Auto-moves objects between access tiers based on usage.
- Glacier Instant Retrieval: Archive with millisecond retrieval.
- Glacier Flexible Retrieval: Minutes to hours retrieval; for backups and archives.
- Glacier Deep Archive: 12-48 hour retrieval; for long-term compliance storage.
Security
- Block Public Access: Enable at account and bucket level to prevent accidental exposure.
- Bucket Policies: Resource-based policies for cross-account and fine-grained access.
- Encryption: SSE-S3 (AWS-managed keys), SSE-KMS (KMS keys), SSE-C (customer-provided keys), or client-side.
- MFA Delete: Requires MFA to delete objects or disable versioning.
Advanced Features
- Lifecycle Policies: Auto-transition or expire objects, e.g., move to Glacier after 90 days.
- Replication (CRR / SRR): Cross-Region or Same-Region replication for DR, compliance, and data localisation.
- Event Notifications: Trigger Lambda, SQS, or SNS on object create/delete.
- S3 Select: Query inside objects with SQL, with no need to download entire files.
- Object Lock: WORM (write-once-read-many) compliance mode.
- Transfer Acceleration: Upload via CloudFront edge locations for faster global transfers.
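The lifecycle example above (move to Glacier after 90 days) can be written as a configuration dictionary in the shape the S3 lifecycle API expects, as far as I understand it. The rule ID, prefix, and 365-day expiry are assumptions, and `storage_class_at` is a hypothetical helper that only illustrates the rule's effect.

```python
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-then-expire-logs",   # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},       # hypothetical key prefix
            "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
            "Expiration": {"Days": 365},         # assumed retention period
        }
    ]
}

def storage_class_at(age_days, rule):
    """Storage class an object would be in at a given age, or None once expired."""
    if age_days >= rule["Expiration"]["Days"]:
        return None
    current = "STANDARD"
    for transition in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current
```

In practice lifecycle actions run asynchronously, so the day boundaries are approximate rather than exact.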
What is EBS?
EBS provides persistent block storage volumes for EC2, like a network-attached hard drive. Volumes persist independently of instances and can be snapshotted for backup or AMI creation.
Volume Types
- gp3 (General Purpose SSD): 3,000 IOPS and 125 MB/s baseline, independent of size. Cost-effective default choice.
- io2 Block Express (Provisioned IOPS SSD): Up to 256,000 IOPS for critical databases.
- st1 (Throughput HDD): High sequential throughput for big data, log processing, and data warehouses.
- sc1 (Cold HDD): Lowest cost, for infrequently accessed sequential data.
Key Features
- Resize volumes without downtime: increase capacity and change type on the fly.
- Snapshots: Incremental backups stored in S3; can be copied across regions and shared across accounts.
- Encryption: AES-256 at rest and in transit, backed by KMS.
- Multi-Attach (io2): Attach one volume to up to 16 instances in the same AZ for clustered databases.
What is EFS?
EFS is a fully managed NFS file system mountable by thousands of EC2 instances simultaneously across multiple AZs. It scales automatically, and you pay only for what you use.
- Performance Modes: General Purpose (low latency) or Max I/O (high throughput, parallel workloads).
- Storage Tiers: Standard and Infrequent Access (IA); lifecycle management moves files automatically.
- Ideal for shared application data, home directories, CMS platforms, and container persistent volumes.
Networking Services
Build isolated networks, distribute traffic globally, and connect to on-premises.
What is VPC?
VPC is your logically isolated virtual network within AWS. You define IP ranges, subnets, routing, and security, modelling your own network in the cloud.
Core Components
- Subnets: Public (internet-accessible) and private (internal only) across Availability Zones.
- Internet Gateway: Enables internet access for resources in public subnets.
- NAT Gateway: Outbound-only internet access for private subnet instances.
- Route Tables: Control traffic routing within VPC and to external destinations.
- Security Groups: Stateful firewall at the instance/ENI level; return traffic for tracked connections is allowed automatically.
- NACLs: Stateless firewall at the subnet level; needs explicit rules for both inbound and outbound traffic.
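Planning the IP ranges and subnets described above is plain CIDR arithmetic, which the standard `ipaddress` module can sketch. The 10.0.0.0/16 range, the /24 subnet size, and the AZ labels are assumptions for illustration.

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")   # assumed VPC CIDR
azs = ["az-a", "az-b", "az-c"]               # placeholder AZ labels
blocks = vpc.subnets(new_prefix=24)          # generator of consecutive /24 networks

# Give every AZ one public and one private subnet.
plan = {}
for az in azs:
    plan[az] = {"public": next(blocks), "private": next(blocks)}

# AWS reserves 5 addresses in every subnet, so a /24 offers 251 usable IPs.
usable_per_subnet = plan["az-a"]["public"].num_addresses - 5
```

Whether a subnet is actually public depends on its route table pointing at an Internet Gateway; the CIDR plan only reserves the address space.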
Connectivity Options
- VPC Peering: Private connectivity between VPCs (cross-account, cross-region). Non-transitive.
- Transit Gateway: Hub-and-spoke model connecting multiple VPCs and on-premises. Supports transitive routing.
- Site-to-Site VPN: Encrypted IPsec tunnel from on-premises to AWS over the internet.
- Direct Connect: Dedicated private circuit from on-premises, with consistent bandwidth and lower latency than VPN.
- VPC Endpoints: Private connectivity to AWS services without internet traversal. Gateway (S3, DynamoDB) and Interface (most others).
- PrivateLink: Expose services privately across VPCs via Interface Endpoints, with no peering needed.
What is Route 53?
Route 53 is AWS's highly available and scalable DNS service. It translates domain names to IPs and routes users to the best endpoint globally.
Routing Policies
- Simple: Standard single-resource DNS resolution.
- Failover: Active-passive; routes to the secondary when the primary health check fails.
- Weighted: Splits traffic by percentage; useful for canary deployments.
- Latency-Based: Routes to the region with the lowest latency for the user.
- Geolocation: Route by user's country or continent.
- Geoproximity: Route by geographic proximity with optional bias shifting.
- Multi-Value Answer: Return multiple healthy records for basic client-side load balancing.
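Weighted routing for a canary can be simulated as a weighted random choice, where each record is returned with probability weight / sum of weights. The hostnames and the 90/10 split are assumptions, and `pick_weighted` is a hypothetical helper.

```python
import random
from collections import Counter

def pick_weighted(records, rng):
    """Choose one record with probability weight / sum(weights)."""
    names = [name for name, _ in records]
    weights = [weight for _, weight in records]
    return rng.choices(names, weights=weights, k=1)[0]

records = [("stable.example.com", 90), ("canary.example.com", 10)]
rng = random.Random(7)  # seeded so the simulation is repeatable
counts = Counter(pick_weighted(records, rng) for _ in range(10_000))
```

Over many lookups the canary receives roughly a tenth of the answers; actual client traffic also depends on resolver caching and record TTLs.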
Other Features
- Health Checks: Monitor endpoint health and trigger failover automatically.
- Private Hosted Zones: Resolve domain names within a VPC for internal service discovery.
- DNSSEC: Sign DNS records to protect against cache poisoning.
- Resolver: Inbound/outbound endpoints for hybrid DNS resolution between VPC and on-premises.
What is CloudFront?
CloudFront is AWS's CDN, distributing web content from 450+ edge locations globally for reduced latency and reduced load on origin servers.
Key Concepts
- Origins: S3, ALB, API Gateway, EC2, or any HTTP server.
- Cache Behaviours: Rules controlling caching, header forwarding, and origin selection per path.
- Invalidations: Purge cached content before the TTL expires, typically after a deployment.
Security Features
- HTTPS: Free SSL/TLS via ACM. Enforce HTTPS-only.
- AWS WAF: Block SQLi, XSS, and rate-based attacks at the edge.
- Signed URLs / Cookies: Restrict content to authorised users with time-limited tokens.
- Origin Access Control (OAC): Restrict S3 origins so only CloudFront can fetch them.
Performance Features
- Lambda@Edge / CloudFront Functions: Run lightweight code at the edge for A/B testing, URL rewrites, and auth.
- Origin Shield: Extra caching layer that reduces origin load by up to 90%.
Database Services
Managed relational, NoSQL, in-memory, and analytical databases.
What is RDS?
RDS manages relational databases in the cloud, automating provisioning, patching, backups, scaling, and failover. Supports PostgreSQL, MySQL, MariaDB, Oracle, SQL Server, and Aurora.
Deployment Options
- Single-AZ: One instance, for dev/test environments.
- Multi-AZ: Synchronous standby replica in a second AZ, automatic failover in 1-2 minutes. Provides high availability; the standby is not readable.
- Read Replicas: Asynchronous replicas for horizontal read scaling, in the same region or cross-region; can be promoted to standalone instances.
Key Features
- Automated Backups: Daily snapshots + transaction logs; point-in-time restore within a retention period of up to 35 days.
- Encryption: At rest (KMS) and in transit (TLS). At-rest encryption must be enabled at creation.
- Performance Insights: Visual dashboard showing database load by SQL statement.
- RDS Proxy: Connection pooling that absorbs connection bursts without overwhelming the database.
What is Aurora?
Aurora is AWS's cloud-native relational database: MySQL and PostgreSQL compatible, but architecturally redesigned for higher performance and availability.
Architecture
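- Storage: 6 copies across 3 AZs; tolerates the loss of 2 copies without affecting writes and 3 without affecting reads.
- Auto-Scaling Storage: Grows automatically from 10 GB to 128 TB.
- Up to 15 read replicas sharing the same storage, with replica lag typically in the millisecond range; failover to a replica typically completes in well under a minute, faster than RDS Multi-AZ.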
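The storage tolerances above follow from quorum arithmetic. A minimal sketch, assuming a write quorum of 4 and a read quorum of 3 out of 6 copies (quorum sizes inferred from the stated tolerances; the function names are hypothetical):

```python
COPIES = 6
WRITE_QUORUM = 4   # copies that must acknowledge a write (assumed)
READ_QUORUM = 3    # copies that must be reachable to read (assumed)

def can_write(failed_copies):
    """Writes succeed while at least WRITE_QUORUM copies remain."""
    return COPIES - failed_copies >= WRITE_QUORUM

def can_read(failed_copies):
    """Reads succeed while at least READ_QUORUM copies remain."""
    return COPIES - failed_copies >= READ_QUORUM
```

With two copies per AZ, losing an entire AZ removes 2 copies and leaves writes available; losing an AZ plus one more copy still leaves reads available.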
Advanced Features
- Aurora Serverless v2: Auto-scales compute in seconds based on load; pay only for what you use.
- Global Database: Read replicas in up to 5 regions with sub-1-second replication lag, for near-zero RPO disaster recovery.
- Backtrack: (MySQL) Rewind to an earlier point in time, within the configured backtrack window, without restoring from backup.
- Clone: Instant copy-on-write database cloning with no upfront data duplication.
What is DynamoDB?
DynamoDB is a fully managed, serverless NoSQL database delivering single-digit millisecond performance at any scale. Supports key-value and document data models.
Core Concepts
- Primary Key: Partition key alone, or composite partition + sort key.
- On-Demand Mode: Scales instantly with no capacity planning; you pay per request.
- Provisioned Mode: Define read/write capacity units with auto-scaling; more cost-effective for predictable loads.
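Sizing Provisioned Mode comes down to capacity-unit arithmetic. The sketch below uses the standard unit definitions, which are not stated in the text above: one read capacity unit covers one strongly consistent read per second of up to 4 KB, an eventually consistent read costs half, and one write capacity unit covers one write per second of up to 1 KB. The helper names are hypothetical.

```python
import math

def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """RCUs needed: item size is rounded up to 4 KB blocks per read."""
    units_per_read = math.ceil(item_size_bytes / 4096)
    total = units_per_read * reads_per_second
    return total if strongly_consistent else math.ceil(total / 2)

def write_capacity_units(item_size_bytes, writes_per_second):
    """WCUs needed: item size is rounded up to 1 KB blocks per write."""
    return math.ceil(item_size_bytes / 1024) * writes_per_second
```

For example, 100 strongly consistent reads per second of 6,000-byte items need 200 RCUs, because each read spans two 4 KB blocks.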
Advanced Features
- Global Tables: Multi-region multi-master replication; active-active, sub-1-second lag.
- DynamoDB Streams: Ordered log of item changes; trigger Lambda for change data capture.
- DAX: In-memory cache reducing read latency from milliseconds to microseconds.
- GSI / LSI: Secondary indexes for querying on non-primary key attributes.
- TTL: Auto-expire and delete items by timestamp.
- Transactions: ACID transactions across multiple items and tables.
What is ElastiCache?
Amazon ElastiCache is a fully managed in-memory caching service. It dramatically speeds up read-heavy apps by serving frequently accessed data from memory.
Redis vs Memcached
- Redis: Rich data structures, pub/sub, replication, cluster mode, persistence, Lua scripting. Use for sessions, leaderboards, real-time analytics, and queues.
- Memcached: Simple key-value cache with multi-threading. Use when you only need basic caching and horizontal scaling without persistence.
Redis Key Features
- Up to 5 read replicas per shard, cluster mode for sharding across up to 500 nodes.
- Automatic failover with Multi-AZ.
- Persistence: RDB snapshots + AOF logs.
- Global Datastore: Cross-region replication for DR and low-latency global reads.
Analytics Services
Process, query, and visualise data at scale, from ad-hoc queries to data warehouses and streaming.
What is Athena?
Amazon Athena is a serverless interactive query service: run standard SQL directly against data in S3. Pay only for data scanned.
- Supports CSV, JSON, Parquet, ORC, Avro, and compressed formats.
- Integrates with AWS Glue Data Catalog for schema management.
- Use columnar formats (Parquet/ORC) and partitioned data to minimise scanned bytes.
- Ideal for ad-hoc analysis, log investigation (VPC Flow Logs, CloudTrail), and ETL validation.
- Federated Query: Query RDS, DynamoDB, Redshift, and on-premises data alongside S3 in one statement.
What is Redshift?
Fully managed, petabyte-scale cloud data warehouse optimised for OLAP. Columnar storage and MPP (massively parallel processing) enable fast complex queries across large datasets.
Architecture
- Leader Node: Parses queries and coordinates parallel execution across compute nodes.
- Compute Nodes: Execute query fragments in parallel, return results to leader node.
- Columnar Storage: Data stored by column, which dramatically reduces I/O for queries touching few columns.
Key Features
- Redshift Serverless: Auto-scales capacity per workload; pay only for queries run.
- Redshift Spectrum: Query S3 data directly from Redshift, combining in-cluster and S3 data.
- Data Sharing: Share live data across clusters without copying; cross-account sharing is supported.
- Automatic Table Optimisation: Tunes sort keys and distribution styles based on query patterns.
What is Kinesis?
A family of services for real-time streaming data ingestion, processing, and analytics.
- Kinesis Data Streams: Capture streaming data with shard-based scaling. Consumers: Lambda, custom apps, Kinesis Analytics.
- Kinesis Data Firehose: Serverless delivery to S3, Redshift, OpenSearch, or Splunk, with optional Lambda transformation. No consumer code required.
- Managed Apache Flink: SQL or Java/Scala Flink jobs for real-time stream processing.
- Amazon MSK: Fully managed Apache Kafka for teams already invested in Kafka.
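Shard-based scaling in Kinesis Data Streams rests on hashing: each record's partition key is MD5-hashed into a 128-bit key, and each shard owns a range of that key space. The sketch below assumes the shards split the range evenly, which need not hold after resharding; `shard_for` is a hypothetical helper.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit hash key

def shard_for(partition_key, shard_count):
    """Index of the shard whose hash-key range contains the key's MD5 hash,
    assuming shard_count shards that split the hash space evenly."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    return hash_key * shard_count // HASH_SPACE
```

Records with the same partition key always land on the same shard, which preserves their order but can create a hot shard if one key dominates the traffic.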
What is EMR?
Amazon EMR is a managed big data platform running Apache Spark, Hadoop, Hive, HBase, Flink, and Presto on elastic clusters.
- Ideal for large-scale data processing, ML pipelines, and analytics.
- EMR Serverless: Submit Spark or Hive jobs without managing clusters; workers auto-scale per job.
- Use S3 as the primary data lake: decouple compute from storage to scale each independently.
- Integrates with Glue Data Catalog, Lake Formation, and Athena for a full analytics ecosystem.