AWS CloudFormation: Infrastructure as Code
Define your entire AWS infrastructure in declarative templates. Version-controlled, repeatable, and automated, from a single EC2 instance to a multi-region architecture.
CloudFormation in 30 Seconds
- Write your infrastructure in YAML/JSON templates: EC2, VPC, RDS, IAM, and most other AWS services
- CloudFormation creates and manages resources as a stack, a single unit
- Update a stack by modifying the template; CloudFormation works out the diff
- Delete a stack and its resources are cleaned up automatically (no orphaned resources)
- Free to use: you pay only for the AWS resources created
What is CloudFormation
AWS CloudFormation is an Infrastructure as Code (IaC) service that lets you define AWS resources in a text file (a template) and have CloudFormation create, update, and delete those resources for you.
Think of CloudFormation as a blueprint that AWS reads to automatically build your infrastructure.
Instead of clicking through the AWS Console or writing imperative scripts, you declare what you want, and CloudFormation handles the how: dependency ordering, parallel creation, and rollback on failure.
Manual Infrastructure
- Click-through console = undocumented
- Hard to reproduce across environments
- Drift between dev/staging/prod
- No audit trail of changes
- Cleanup is error-prone (orphaned resources)
CloudFormation Solves
- Infrastructure defined in code (YAML/JSON)
- Identical environments with one template
- Version-controlled in Git
- Full audit trail via CloudTrail
- Delete the stack and all its resources are cleaned up
| Concept | What It Is | Analogy |
|---|---|---|
| Template | A YAML/JSON file describing AWS resources | The blueprint / recipe |
| Stack | A running instance of a template: the actual deployed resources | The built house |
| Change Set | A preview of what will change before you apply an update | A renovation proposal |
| Stack Set | Deploy one template across multiple accounts/regions | Build the same house in many cities |
| Drift Detection | Detect if actual resources differ from the template | Check if someone modified the house without the blueprint |
| Approach | How It Works | Example |
|---|---|---|
| Imperative (scripts) | You describe how, step by step | AWS CLI: aws ec2 create-security-group ..., then aws ec2 authorize-security-group-ingress ..., then aws ec2 run-instances ... |
| Declarative (CloudFormation) | You describe what: the desired state | Template says: "I want an EC2 instance in a VPC with this SG." CloudFormation works out the order and creates everything |
Declarative wins because CloudFormation handles dependency resolution, parallelism, error handling, and rollback. You don't write "create VPC, then subnet, then route table, then EC2"; you declare all of them and CloudFormation figures out the order.
| Feature | CloudFormation | Terraform |
|---|---|---|
| Provider | AWS-native (first-party) | HashiCorp (third-party, multi-cloud) |
| Language | YAML / JSON | HCL (HashiCorp Configuration Language) |
| State management | Managed by AWS (no state file to maintain) | You manage the state file (commonly S3, with DynamoDB for locking) |
| Rollback | Automatic on failure | No automatic rollback; you handle failures yourself |
| Drift detection | Built-in | terraform plan shows drift |
| Multi-cloud | AWS only | AWS, Azure, GCP, and 3,000+ providers |
| Cost | Free | Free (OSS) / Paid (Terraform Cloud) |
| Best for | Pure AWS shops, AWS exam context | Multi-cloud, large teams with diverse infra |
CloudFormation turns infrastructure into code: declarative, version-controlled, and automatically provisioned. You describe what you want; CloudFormation handles how to build it.
Templates & Stacks
A CloudFormation template is a YAML (or JSON) file with a defined structure. Here are the key sections; only Resources is required:
| Section | Required? | Purpose |
|---|---|---|
| AWSTemplateFormatVersion | No | Template format version (always "2010-09-09"; it hasn't changed) |
| Description | No | Human-readable description of the template |
| Parameters | No | Input values provided at deploy time (instance type, env name, etc.) |
| Mappings | No | Static key-value lookups (e.g., AMI IDs per region) |
| Conditions | No | Conditional resource creation (e.g., only create in prod) |
| Resources | Yes | The AWS resources to create (EC2, S3, VPC, etc.) |
| Outputs | No | Values exported after stack creation (URLs, IDs, ARNs) |
Simple CloudFormation Template

    AWSTemplateFormatVersion: "2010-09-09"
    Description: A simple web server stack
    Parameters:
      InstanceType:
        Type: String
        Default: t3.micro
        AllowedValues: [t3.micro, t3.small, t3.medium]
    Resources:
      WebServerSG:
        Type: AWS::EC2::SecurityGroup
        Properties:
          GroupDescription: Allow HTTP
          SecurityGroupIngress:
            - IpProtocol: tcp
              FromPort: 80
              ToPort: 80
              CidrIp: 0.0.0.0/0
      WebServer:
        Type: AWS::EC2::Instance
        Properties:
          InstanceType: !Ref InstanceType
          ImageId: ami-0abcdef1234567890
          SecurityGroupIds:
            - !Ref WebServerSG
          UserData:
            Fn::Base64: |
              #!/bin/bash
              yum install -y httpd
              systemctl start httpd
    Outputs:
      PublicIP:
        Value: !GetAtt WebServer.PublicIp
        Description: Public IP of the web server

Key things to notice:
- !Ref InstanceType: references the parameter value provided at deploy time
- !Ref WebServerSG: references another resource (CloudFormation resolves the ID)
- !GetAtt WebServer.PublicIp: gets an attribute from a created resource
- CloudFormation creates the Security Group before the EC2 instance (it resolves the dependency automatically)
The Resources section is the heart of every template. Each resource has a logical name, a type, and properties:
Logical Name
Your chosen name within the template (e.g., WebServer). Used for !Ref and cross-references. Must be unique in the template.
Type
The AWS resource type (e.g., AWS::EC2::Instance, AWS::S3::Bucket). CloudFormation supports 800+ resource types.
Properties
Configuration for the resource: instance type, VPC ID, tags, etc. Each resource type has its own set of required and optional properties.
CloudFormation provides built-in functions for dynamic values, references, and logic within templates:
| Function | Short Form | What It Does | Example |
|---|---|---|---|
| Ref | !Ref | Returns the value of a parameter or the ID of a resource | !Ref MyBucket → bucket name |
| Fn::GetAtt | !GetAtt | Gets a specific attribute from a resource | !GetAtt MyELB.DNSName |
| Fn::Sub | !Sub | String substitution with variables | !Sub "arn:aws:s3:::${BucketName}" |
| Fn::Join | !Join | Joins values with a delimiter | !Join ["-", [prod, app, sg]] → prod-app-sg |
| Fn::Select | !Select | Select an item from a list by index | !Select [0, !GetAZs ""] → first AZ |
| Fn::Split | !Split | Split a string into a list | !Split [",", "a,b,c"] → [a, b, c] |
| Fn::ImportValue | !ImportValue | Import an exported output from another stack | !ImportValue NetworkStack-VpcId |
| Fn::FindInMap | !FindInMap | Look up a value in the Mappings section | !FindInMap [RegionMap, !Ref "AWS::Region", AMI] |
| Condition Functions | !If, !Equals, !And, !Or, !Not | Conditional logic in templates | !If [IsProd, m5.large, t3.micro] |
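To see several of these functions side by side, here is a minimal sketch of one small template. The logical names (AppBucket, AppSubnet), the Project parameter, and the CIDR value are hypothetical and chosen only for illustration.

```yaml
AWSTemplateFormatVersion: "2010-09-09"
Parameters:
  Project:                      # hypothetical parameter used for naming
    Type: String
    Default: demo
  VpcId:
    Type: AWS::EC2::VPC::Id
Resources:
  AppBucket:
    Type: AWS::S3::Bucket
  AppSubnet:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VpcId
      CidrBlock: 10.0.1.0/24    # assumed to fall inside the chosen VPC's range
      # First Availability Zone of the region the stack is deployed in
      AvailabilityZone: !Select [0, !GetAZs ""]
      Tags:
        - Key: Name
          Value: !Join ["-", [!Ref Project, app, subnet]]
Outputs:
  BucketArn:
    # ${AppBucket} is replaced with what !Ref AppBucket returns (the bucket name)
    Value: !Sub "arn:aws:s3:::${AppBucket}"
  BucketDomain:
    Value: !GetAtt AppBucket.DomainName
```

Inside !Sub, a ${LogicalName} placeholder behaves like !Ref, which often makes !Sub easier to read than a long !Join.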
CloudFormation provides built-in pseudo parameters that resolve at deploy time; you don't need to define them:
| Pseudo Parameter | Returns | Common Use |
|---|---|---|
| AWS::AccountId | The AWS account ID (e.g., 123456789012) | Building ARNs, policies |
| AWS::Region | The region the stack is deployed in | Region-specific AMI lookups |
| AWS::StackName | Name of the current stack | Tagging resources, naming conventions |
| AWS::StackId | Full ARN of the stack | Unique identifiers |
| AWS::NoValue | Removes a property when used with !If | Conditional property inclusion |
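As a rough illustration, the fragment below uses three of these pseudo parameters without declaring any of them. The LogBucket resource and the "logs-" naming prefix are assumptions made for the example.

```yaml
Resources:
  LogBucket:
    Type: AWS::S3::Bucket
    Properties:
      # Account ID plus region gives a name that differs per account and region
      BucketName: !Sub "logs-${AWS::AccountId}-${AWS::Region}"
      Tags:
        - Key: Stack
          Value: !Ref "AWS::StackName"
Outputs:
  StackArn:
    Value: !Ref "AWS::StackId"
```

Deploying the same template in another account or region would produce a different bucket name with no template changes.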
A stack is a single unit of deployment: the collection of AWS resources created from one template. Stacks are the operational unit you interact with.
Stack Properties
- Has a unique name within a region
- Tracks all resources it created
- Has a lifecycle: CREATE → UPDATE → DELETE
- Emits events for every resource operation
- Supports tags (propagated to resources)
Stack Operations
- Create: provision all resources from template
- Update: modify resources (CloudFormation computes the diff)
- Delete: tear down all resources (clean up)
- Detect Drift: find manual changes made outside CloudFormation
During creation, CloudFormation:
- Validates the template syntax and resource types
- Builds a dependency graph: knows VPC must exist before subnet, subnet before EC2
- Creates resources in parallel where no dependency exists (faster)
- Rolls back all changes if any resource fails to create (all-or-nothing)
- Emits events for each resource (CREATE_IN_PROGRESS → CREATE_COMPLETE or CREATE_FAILED)
When you update a stack, CloudFormation determines how each resource is affected. The impact depends on the property you changed:
| Update Type | What Happens | Downtime? | Example |
|---|---|---|---|
| No interruption | Resource updated in-place, stays running | None | Changing EC2 tags, SG rules, Lambda code |
| Some interruption | Resource updated in-place, brief disruption | Brief | Changing EC2 instance type (requires stop/start) |
| Replacement | Resource deleted and recreated (new physical ID) | Yes | Changing EC2 AMI, RDS engine, VPC CIDR |
Always use Change Sets before updating production stacks. A Change Set shows you exactly which resources will be modified, replaced, or deleted before you commit. Never blindly update a production stack.
Create Rollback
- If any resource fails during creation
- All previously created resources are deleted
- Stack status: ROLLBACK_COMPLETE
- You must delete the stack and fix the template
Update Rollback
- If any resource fails during update
- All changes are reverted to the previous state
- Stack status: UPDATE_ROLLBACK_COMPLETE
- Stack remains operational at the old configuration
When you delete a stack, all resources are deleted by default. But you can override this with a DeletionPolicy:
| DeletionPolicy | Behaviour | Use For |
|---|---|---|
| Delete (default) | Resource is deleted when stack is deleted | Ephemeral resources (dev, test) |
| Retain | Resource is kept even after stack deletion | Databases, S3 buckets with data, critical resources |
| Snapshot | Creates a snapshot before deleting (supported types include EBS volumes, RDS, Redshift) | Databases where you want a backup before cleanup |
Always set DeletionPolicy: Retain on production databases and S3 buckets with important data. Without it, deleting the stack can delete your data permanently.
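A minimal sketch of how these policies look in a template. The resource names and sizes are hypothetical. UpdateReplacePolicy is not covered in the text above; it is included here on the assumption that you also want protection when an update replaces the resource.

```yaml
Resources:
  AppDatabase:
    Type: AWS::RDS::DBInstance
    DeletionPolicy: Snapshot          # take a final snapshot when the stack is deleted
    UpdateReplacePolicy: Snapshot     # same protection if an update replaces the instance
    Properties:
      Engine: mysql
      DBInstanceClass: db.t3.micro
      AllocatedStorage: "20"
      MasterUsername: admin
      ManageMasterUserPassword: true  # let RDS keep the password in Secrets Manager
  DataBucket:
    Type: AWS::S3::Bucket
    DeletionPolicy: Retain            # bucket and its objects survive stack deletion
    UpdateReplacePolicy: Retain
```

DeletionPolicy sits at the resource level, next to Type and Properties, not inside Properties.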
Templates define what to build (Resources, Parameters, Outputs). Stacks are the live deployment, supporting create and update (with change sets and rollback on failure), delete, and drift detection. Use DeletionPolicy to protect critical data.
Template Deep Dive
Parameters make templates reusable by accepting input values at deploy time. Instead of hardcoding an instance type or environment name, you declare a parameter and let the user choose.

    Parameters:
      Environment:
        Type: String
        Default: dev
        AllowedValues: [dev, staging, prod]
        Description: Deployment environment
      InstanceType:
        Type: String
        Default: t3.micro
        AllowedValues: [t3.micro, t3.small, t3.medium, t3.large]
      KeyPairName:
        Type: AWS::EC2::KeyPair::KeyName
        Description: Select an existing EC2 key pair

| Property | Purpose | Example |
|---|---|---|
| Type | Data type of the parameter | String, Number, CommaDelimitedList, AWS::EC2::VPC::Id |
| Default | Value used if none provided | t3.micro |
| AllowedValues | Restrict to a list of valid options | [dev, staging, prod] |
| AllowedPattern | Regex validation | "[a-zA-Z0-9]*" |
| MinLength / MaxLength | String length constraints | MinLength: 1 |
| MinValue / MaxValue | Numeric range constraints | MinValue: 1, MaxValue: 100 |
| NoEcho | Mask the value (for passwords) | NoEcho: true |
| ConstraintDescription | Custom error message on validation fail | "Must be a valid environment name" |
AWS-Specific Parameter Types
- AWS::EC2::VPC::Id: dropdown of VPCs
- AWS::EC2::Subnet::Id: dropdown of subnets
- AWS::EC2::KeyPair::KeyName: dropdown of key pairs
- AWS::EC2::SecurityGroup::Id: dropdown of SGs
- AWS::SSM::Parameter::Value<String>: value read from SSM Parameter Store
Best Practices
- Use AllowedValues to prevent invalid inputs
- Use Default so dev deploys need zero input
- Use NoEcho: true for passwords/secrets
- Use SSM parameter types to read live values from Parameter Store
- Limit parameters; too many make templates hard to use
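The sketch below shows one way the SSM parameter type and a validated string parameter might be combined. The AppName parameter and its pattern are hypothetical; the Default path is the public Parameter Store entry AWS publishes for the latest Amazon Linux 2023 AMI, used here as an assumption about which image you want.

```yaml
Parameters:
  LatestAmiId:
    # The parameter value is a Parameter Store key; CloudFormation resolves it at deploy time
    Type: AWS::SSM::Parameter::Value<AWS::EC2::Image::Id>
    Default: /aws/service/ami-amazon-linux-latest/al2023-ami-kernel-default-x86_64
  AppName:                        # hypothetical parameter
    Type: String
    Default: demo-app
    AllowedPattern: "[a-z][a-z0-9-]*"
    ConstraintDescription: Must start with a lowercase letter and contain only lowercase letters, digits, and hyphens
Resources:
  WebServer:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: t3.micro
      ImageId: !Ref LatestAmiId   # resolves to the AMI ID, not the Parameter Store key
      Tags:
        - Key: Name
          Value: !Ref AppName
```

Because both parameters have defaults, a dev deployment needs no input, and no AMI ID is hardcoded in the template.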
Mappings are static lookup tables in the template: hardcoded key-value pairs used to select values based on conditions like region or environment. Unlike parameters, mappings are fixed at template authoring time.

    Mappings:
      RegionAMI:
        us-east-1:
          HVM64: ami-0abcdef1111111111
        eu-west-1:
          HVM64: ami-0abcdef2222222222
        ap-southeast-1:
          HVM64: ami-0abcdef3333333333
    Resources:
      WebServer:
        Type: AWS::EC2::Instance
        Properties:
          ImageId: !FindInMap [RegionAMI, !Ref "AWS::Region", HVM64]

The !FindInMap function takes three arguments: map name, first-level key, second-level key. This pattern is most commonly used for region-specific AMI IDs.
Conditions let you create resources or set properties only when certain criteria are met, typically based on parameter values.

    Conditions:
      IsProd: !Equals [!Ref Environment, prod]
      CreateReadReplica: !And
        - !Equals [!Ref Environment, prod]
        - !Equals [!Ref EnableReplica, "true"]
    Resources:
      ProdAlarm:
        Type: AWS::CloudWatch::Alarm
        Condition: IsProd  # Only created if Environment=prod
        Properties:
          AlarmName: HighCPU
          # ...

| Condition Function | Purpose | Example |
|---|---|---|
| !Equals | True if two values are equal | !Equals [!Ref Env, prod] |
| !If | Returns one of two values based on a condition | !If [IsProd, m5.large, t3.micro] |
| !Not | Negates a condition | !Not [!Equals [!Ref Env, dev]] |
| !And | True if ALL conditions are true | !And [Condition: IsProd, Condition: HasDB] |
| !Or | True if ANY condition is true | !Or [Condition: IsProd, Condition: IsStaging] |
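Conditions can also switch individual property values, not just whole resources. The following is a small sketch under assumed names: the HasKeyPair condition and the empty-string default for KeyPairName are illustrative choices, and the AMI ID is a placeholder.

```yaml
Parameters:
  Environment:
    Type: String
    Default: dev
    AllowedValues: [dev, staging, prod]
  KeyPairName:
    Type: String
    Default: ""                        # empty means "no key pair"
Conditions:
  IsProd: !Equals [!Ref Environment, prod]
  HasKeyPair: !Not [!Equals [!Ref KeyPairName, ""]]
Resources:
  WebServer:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: ami-0abcdef1234567890   # placeholder AMI ID
      # Larger instance in prod, small one everywhere else
      InstanceType: !If [IsProd, m5.large, t3.micro]
      # AWS::NoValue drops the KeyName property entirely when no key pair is given
      KeyName: !If [HasKeyPair, !Ref KeyPairName, !Ref "AWS::NoValue"]
```

Using AWS::NoValue avoids passing an empty string to a property that would reject it.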
CloudFormation automatically resolves dependencies when you use !Ref or !GetAtt. But sometimes you need to explicitly declare a dependency that CloudFormation cannot infer:
Automatic (No DependsOn Needed)
- !Ref WebServerSG in an EC2 resource: CloudFormation knows to create the SG first
- !GetAtt MyDB.Endpoint.Address: DB created before the resource referencing it
- Any !Ref or !GetAtt creates an implicit dependency
Explicit DependsOn Required
- Resource depends on another but doesn't reference it
- Example: EC2 instance needs an IGW attachment to exist (but doesn't !Ref it)
- RDS instance depends on a VPC gateway attachment
- Lambda depends on a log group being created first

    Resources:
      IGWAttachment:
        Type: AWS::EC2::VPCGatewayAttachment
        Properties:
          VpcId: !Ref MyVPC
          InternetGatewayId: !Ref MyIGW
      WebServer:
        Type: AWS::EC2::Instance
        DependsOn: IGWAttachment  # Explicit: wait for IGW attachment
        Properties:
          SubnetId: !Ref PublicSubnet
          # ...

Outputs export values from a stack, making them visible in the console and importable by other stacks. This is how you build multi-stack architectures.

    # Stack A: Network Stack
    Outputs:
      VpcId:
        Value: !Ref MyVPC
        Export:
          Name: NetworkStack-VpcId  # Must be unique within the account and region
      SubnetId:
        Value: !Ref PublicSubnet
        Export:
          Name: NetworkStack-SubnetId

    # Stack B: App Stack (imports from Stack A)
    Resources:
      AppServer:
        Type: AWS::EC2::Instance
        Properties:
          SubnetId: !ImportValue NetworkStack-SubnetId

Nested stacks let you compose templates from reusable child templates. A parent template references child templates stored in S3. Each child becomes a resource of type AWS::CloudFormation::Stack.
When to Use Nested Stacks
- Reuse common components (VPC, SG, IAM roles)
- Break large templates into manageable pieces
- Share a standard VPC template across teams
- Overcome the 500-resource limit per stack
Nested vs Cross-Stack
- Nested: Parent manages child lifecycle (coupled)
- Cross-Stack: Independent stacks linked via exports (decoupled)
- Use nested for tightly coupled components
- Use cross-stack for shared infrastructure (VPC, DNS)
| Feature | Nested Stacks | Cross-Stack References |
|---|---|---|
| Lifecycle | Child managed by parent (create/update/delete together) | Independent stacks, independent lifecycles |
| Coupling | Tight: child is a resource in the parent | Loose: only linked via exported values |
| Reuse pattern | Template reuse (same child template in many parents) | Value sharing (one stack exports, another imports) |
| Update impact | Updating parent can trigger child updates | Updating exporter doesn't touch importer (but an export in use can't be changed or removed) |
| Deletion order | Parent handles child deletion automatically | Must delete importers before exporters |
| Best for | Component reuse (standard VPC module) | Shared infra across teams/services |
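For the nested side of this comparison, a parent template might look roughly like the sketch below. The S3 URL, the VpcCidr parameter, and the PublicSubnetId output are hypothetical; they assume a child template that declares them.

```yaml
Resources:
  NetworkStack:
    Type: AWS::CloudFormation::Stack
    Properties:
      # Hypothetical S3 location of the child template
      TemplateURL: https://example-bucket.s3.amazonaws.com/templates/vpc.yaml
      Parameters:
        VpcCidr: 10.0.0.0/16           # assumes the child declares a VpcCidr parameter
  AppServer:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: ami-0abcdef1234567890   # placeholder AMI ID
      InstanceType: t3.micro
      # Child outputs are read as Outputs.<OutputName>; assumes the child defines PublicSubnetId
      SubnetId: !GetAtt NetworkStack.Outputs.PublicSubnetId
```

The !GetAtt reference also creates the implicit dependency, so the child stack finishes before the instance is created.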
Parameters make templates reusable, Mappings provide static lookups (AMI per region), Conditions enable environment-specific resources, DependsOn handles non-obvious ordering, and Outputs + ImportValue enable multi-stack architectures. Use nested stacks for component reuse; use cross-stack references for shared infrastructure.
Stack Operations
A Change Set is a preview of proposed changes before they are applied to a stack. It shows exactly which resources will be added, modified, or replaced, without actually making any changes.
What Change Sets Show
- Which resources will be Added
- Which will be Modified (in-place)
- Which will be Replaced (deleted + recreated)
- Which will be Removed
- Whether a modification requires replacement (a warning sign for possible data loss)
What Change Sets Don't Show
- Whether the update will succeed (permissions, limits)
- Downstream impact on dependents
- Cost impact of changes
- Application-level impact (app downtime)
- Still validate your changes in staging first!
Always use Change Sets in production. Direct update-stack applies immediately with no preview. Change Sets give you a safety net: review, approve, then execute (or discard).
A Stack Policy is a JSON document that protects specific resources from unintended updates. Once applied, CloudFormation denies updates to protected resources unless you explicitly override.
How Stack Policies Work
- Attached to a stack (one policy per stack)
- Default: all updates allowed (no policy)
- Once applied: cannot be removed, only replaced
- Once a policy is set, updates are denied by default; explicitly allow what may be updated
- Override temporarily during specific updates
Common Use Case
- Protect production RDS from accidental replacement
- Protect S3 buckets with critical data
- Allow updates to EC2/Lambda but block database changes
- Prevent accidental deletion of IAM roles
Drift occurs when the actual state of a resource differs from what CloudFormation expects (the template). Someone manually changed a Security Group rule, resized an instance, or modified a bucket policy outside of CloudFormation.
| Drift Status | Meaning |
|---|---|
| IN_SYNC | Resource (or stack) matches the template: no drift |
| MODIFIED | Resource-level status: one or more properties differ from the template |
| DRIFTED | Stack-level status: at least one resource in the stack has drifted |
| NOT_CHECKED | Drift detection hasn't been run on this resource |
| DELETED | Resource was deleted outside of CloudFormation |
Stack Sets let you deploy a single template across multiple AWS accounts and regions with one operation. Essential for organisations using AWS Organizations.
Deployment Targets
- Specific account IDs
- All accounts in an OU
- All accounts in the org
- Specific regions per account
Operation Settings
- Max concurrent: how many accounts at once
- Failure tolerance: how many can fail before stopping
- Region order: sequential or parallel
- Automatic deployment for new accounts in OU
Common Use Cases
- Security baseline across all accounts
- Enable CloudTrail/Config everywhere
- Deploy IAM roles/policies org-wide
- Compliance guardrails (Control Tower uses Stack Sets)
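Stack Sets can themselves be declared in a template through the AWS::CloudFormation::StackSet resource type. The sketch below maps the deployment targets and operation settings listed above onto that resource. It assumes service-managed permissions through AWS Organizations; the OU ID, regions, and template URL are hypothetical.

```yaml
Resources:
  SecurityBaseline:
    Type: AWS::CloudFormation::StackSet
    Properties:
      StackSetName: security-baseline
      PermissionModel: SERVICE_MANAGED      # relies on AWS Organizations trusted access
      AutoDeployment:
        Enabled: true                       # also deploy to accounts that join the OU later
        RetainStacksOnAccountRemoval: false
      OperationPreferences:
        MaxConcurrentCount: 2               # accounts deployed at the same time
        FailureToleranceCount: 1            # failures allowed before the operation stops
        RegionConcurrencyType: SEQUENTIAL   # finish one region before starting the next
      StackInstancesGroup:
        - DeploymentTargets:
            OrganizationalUnitIds:
              - ou-abcd-11111111            # hypothetical OU ID
          Regions:
            - us-east-1
            - eu-west-1
      # Hypothetical S3 location of the baseline template
      TemplateURL: https://example-bucket.s3.amazonaws.com/templates/baseline.yaml
```

A low failure tolerance with sequential regions is a cautious starting point; raising concurrency speeds up rollout at the cost of a larger blast radius.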
Termination protection prevents accidental stack deletion. When enabled, any delete-stack call is rejected until protection is explicitly disabled.
Enable For
- Production stacks
- Stacks with databases or persistent data
- Shared infrastructure stacks (VPC, IAM)
- Any stack where accidental deletion = disaster
How to Enable
- At creation: --enable-termination-protection
- After creation: update-termination-protection
- Console: Stack settings → Termination protection
- Disable first, then delete (deliberate two-step)
CloudFormation can import existing AWS resources (created manually or by other tools) into a stack without recreating them. This lets you bring unmanaged resources under IaC control.
| Step | Action |
|---|---|
| 1 | Add the resource to your template with the correct type and properties |
| 2 | Add a DeletionPolicy: Retain (required for import) |
| 3 | Run create-change-set --change-set-type IMPORT with the resource identifier |
| 4 | Review the change set and execute |
| 5 | Resource is now managed by CloudFormation |
Resource import is powerful for brownfield environments: you don't need to tear down and rebuild existing infrastructure. Import it, and CloudFormation manages it going forward.
| Operation | Purpose | Key Behaviour |
|---|---|---|
| Create Stack | Deploy new infrastructure | All-or-nothing; auto-rollback on failure |
| Update Stack | Modify existing resources | Use Change Sets first; rollback on failure |
| Delete Stack | Tear down all resources | Respects DeletionPolicy (Retain/Snapshot) |
| Change Set | Preview changes before applying | Safe: makes zero changes until executed |
| Drift Detection | Find manual/out-of-band changes | Compares template vs actual state |
| Stack Sets | Deploy across accounts/regions | Single template, many stack instances |
| Import | Bring existing resources into a stack | No recreation; requires DeletionPolicy: Retain |
| Termination Protection | Prevent accidental deletion | Must be explicitly disabled before delete |
Change Sets preview updates safely. Stack Policies protect critical resources. Drift Detection catches manual changes. Stack Sets deploy across multi-account/multi-region. Termination Protection prevents accidental deletion. Resource Import brings existing infrastructure under IaC control.
Advanced Features
cfn-init is CloudFormation's built-in configuration management tool. It runs on an EC2 instance at launch time and reads configuration instructions from the template's AWS::CloudFormation::Init metadata: installing packages, writing files, starting services, and more.
packages
Install software via yum, apt, pip, or rpm. E.g., install Apache, Node.js, or Python packages.
files
Write config files to disk with specific content, permissions, and ownership. E.g., write /etc/httpd/conf/httpd.conf.
services
Enable and start system services. Restart them when specific files or packages change. E.g., auto-restart httpd on config change.
commands
Run arbitrary shell commands in order. E.g., run migrations, set permissions, download artifacts from S3.
groups & users
Create Linux groups and users on the instance. E.g., create an appuser with specific UID/GID.
sources
Download and extract archives (tar, zip) from URLs. E.g., pull app bundle from S3 and extract to /var/www.
| Feature | User Data (bash script) | cfn-init |
|---|---|---|
| Format | Shell script (imperative) | Declarative metadata in template |
| Idempotent? | No: runs once, you manage idempotency | Largely: desired-state config for packages, files, and services (commands still need their own guards) |
| Service restarts | Manual handling | Auto-restart on file/package changes |
| Config sets | No concept | Group configs into ordered sets |
| Error handling | Check exit codes manually | Reports success/failure to CloudFormation |
| Best for | Simple installs, quick scripts | Complex multi-step server configuration |
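To make the cfn-init column concrete, here is a minimal sketch of AWS::CloudFormation::Init metadata that installs Apache, writes a page, and keeps the service running. It assumes an Amazon Linux style AMI (the ID is a placeholder) and that cfn-init is invoked from the instance's user data, which this fragment omits.

```yaml
Resources:
  WebServer:
    Type: AWS::EC2::Instance
    Metadata:
      AWS::CloudFormation::Init:
        config:
          packages:
            yum:
              httpd: []                      # empty list means the latest available version
          files:
            /var/www/html/index.html:
              content: |
                <h1>Hello from cfn-init</h1>
              mode: "000644"
              owner: root
              group: root
          services:
            sysvinit:
              httpd:
                enabled: true                # start on boot
                ensureRunning: true
                files:
                  - /var/www/html/index.html # restart httpd if cfn-init changes this file
    Properties:
      ImageId: ami-0abcdef1234567890         # placeholder; assumes an Amazon Linux AMI
      InstanceType: t3.micro
```

The metadata only describes the desired state; nothing happens on the instance until cfn-init is run against this resource.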
cfn-signal tells CloudFormation whether an instance successfully completed configuration. Without it, CloudFormation marks an EC2 instance as CREATE_COMPLETE the moment it launches, even if cfn-init hasn't finished or has failed.
CreationPolicy
- Attached to the EC2 or ASG resource in the template
- Tells CloudFormation: "wait for N signals within timeout"
- If signals are not received → CREATE_FAILED → rollback
- For ASGs: wait for MinSuccessfulInstancesPercent signals
WaitCondition (Legacy)
- Older mechanism; same idea as CreationPolicy
- Uses a separate WaitConditionHandle resource
- Can receive signals from external sources (not just the instance)
- Prefer CreationPolicy for EC2/ASG; use WaitCondition for external coordination
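A common way to wire these together is sketched below: a CreationPolicy on the instance, with user data that runs cfn-init and then reports its exit code through cfn-signal. The /opt/aws/bin paths assume an Amazon Linux AMI with the helper scripts preinstalled, the AMI ID is a placeholder, and the resource is assumed to carry AWS::CloudFormation::Init metadata (omitted here).

```yaml
Resources:
  WebServer:
    Type: AWS::EC2::Instance
    CreationPolicy:
      ResourceSignal:
        Count: 1                         # wait for one success signal
        Timeout: PT10M                   # ISO 8601 duration: up to 10 minutes
    Properties:
      ImageId: ami-0abcdef1234567890     # placeholder; assumes the cfn helper scripts are installed
      InstanceType: t3.micro
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash -x
          # Apply the AWS::CloudFormation::Init metadata attached to this resource
          /opt/aws/bin/cfn-init -v --stack ${AWS::StackName} --resource WebServer --region ${AWS::Region}
          # Report cfn-init's exit code; a non-zero code marks the resource CREATE_FAILED
          /opt/aws/bin/cfn-signal -e $? --stack ${AWS::StackName} --resource WebServer --region ${AWS::Region}
```

If no signal arrives within the timeout, the instance is treated as failed and the stack rolls back, which is usually preferable to a "complete" stack with a broken server.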
Custom Resources let you extend CloudFormation beyond its built-in resource types. They invoke a Lambda function (or SNS topic) during stack create/update/delete, allowing you to do anything that AWS APIs can do.
Common Use Cases
- Populate an S3 bucket at stack creation (upload initial data)
- Create DNS records in Route 53 hosted zones
- Empty an S3 bucket before deletion (required, because CFN can't delete non-empty buckets)
- Call external APIs (Datadog, PagerDuty, Slack notifications)
- Look up the latest AMI ID dynamically
- Provision resources in services not yet supported by CFN
Gotchas
- Lambda must send a response to the S3 presigned URL, or CFN hangs
- Handle all three events: Create, Update, Delete
- If Lambda fails without sending a response, the stack is stuck until the timeout (1 hour by default)
- Always include error handling and logging
- Prefer Custom::* type over AWS::CloudFormation::CustomResource
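The fragment below sketches the "empty a bucket before deletion" use case with an inline Lambda function. It is not a complete template: EmptyBucketRole and DataBucket are assumed to be defined elsewhere in the same template, and the handler only deletes current object versions, so a versioned bucket would need more work. The cfnresponse module is the helper CloudFormation makes available to inline (ZipFile) Python functions.

```yaml
Resources:
  EmptyBucketFunction:
    Type: AWS::Lambda::Function
    Properties:
      Runtime: python3.12
      Handler: index.handler
      Timeout: 60
      Role: !GetAtt EmptyBucketRole.Arn     # assumes an IAM role with S3 delete permissions
      Code:
        ZipFile: |
          import boto3
          import cfnresponse

          def handler(event, context):
              try:
                  # Only act on Delete; Create and Update just report success
                  if event["RequestType"] == "Delete":
                      bucket = event["ResourceProperties"]["BucketName"]
                      boto3.resource("s3").Bucket(bucket).objects.all().delete()
                  cfnresponse.send(event, context, cfnresponse.SUCCESS, {})
              except Exception as exc:
                  print(f"Cleanup failed: {exc}")
                  # Always respond, otherwise the stack waits for the timeout
                  cfnresponse.send(event, context, cfnresponse.FAILED, {})
  EmptyBucketOnDelete:
    Type: Custom::EmptyBucket
    Properties:
      ServiceToken: !GetAtt EmptyBucketFunction.Arn
      BucketName: !Ref DataBucket           # assumes a DataBucket resource in this template
```

Because the custom resource references the bucket, CloudFormation deletes it before the bucket on stack deletion, so the cleanup runs while the bucket still exists.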
| Script | Purpose | When to Use |
|---|---|---|
| cfn-init | Read metadata and configure the instance (packages, files, services) | Complex instance configuration instead of User Data scripts |
| cfn-signal | Signal CloudFormation that configuration is complete (or failed) | With CreationPolicy, so CFN waits for app readiness |
| cfn-hup | Daemon that detects metadata changes and re-runs cfn-init | When you want instances to auto-update on stack update |
| cfn-get-metadata | Retrieve metadata from a resource in the template | Debugging: inspect what metadata a resource has |
cfn-hup is the key to updating running instances. Without it, updating the Init metadata in a template only affects new instances. cfn-hup runs as a daemon, polls for metadata changes, and re-runs cfn-init, so existing instances pick up updates.
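cfn-hup is itself usually configured through cfn-init metadata. The sketch below follows the widely used two-file pattern: a main config plus a hook that re-runs cfn-init when the resource's metadata changes. The 5-minute interval and the placeholder AMI ID are assumptions, and the paths assume an Amazon Linux AMI.

```yaml
Resources:
  WebServer:
    Type: AWS::EC2::Instance
    Metadata:
      AWS::CloudFormation::Init:
        config:
          files:
            /etc/cfn/cfn-hup.conf:
              content: !Sub |
                [main]
                stack=${AWS::StackId}
                region=${AWS::Region}
                interval=5
              mode: "000400"
              owner: root
              group: root
            /etc/cfn/hooks.d/cfn-auto-reloader.conf:
              content: !Sub |
                [cfn-auto-reloader-hook]
                triggers=post.update
                path=Resources.WebServer.Metadata.AWS::CloudFormation::Init
                action=/opt/aws/bin/cfn-init -v --stack ${AWS::StackName} --resource WebServer --region ${AWS::Region}
                runas=root
          services:
            sysvinit:
              cfn-hup:
                enabled: true
                ensureRunning: true
                files:                       # restart cfn-hup if its own config changes
                  - /etc/cfn/cfn-hup.conf
                  - /etc/cfn/hooks.d/cfn-auto-reloader.conf
    Properties:
      ImageId: ami-0abcdef1234567890         # placeholder; assumes an Amazon Linux AMI
      InstanceType: t3.micro
```

The interval is in minutes, so a stack update to the Init metadata reaches running instances within roughly that window.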
CloudFormation Registry
- Catalogue of resource types, modules, and hooks
- Includes AWS resources + third-party resources
- E.g., Datadog::Monitors::Monitor, MongoDB::Atlas::Cluster
- Register your own custom types with schemas and handlers
- Versioned, discoverable, and type-safe
CloudFormation Modules
- Reusable template fragments registered in the Registry
- Package common patterns (e.g., "standard VPC" module)
- Consumed as resource types whose names end in ::MODULE
- Hides complexity: users specify simple properties
- Versioned and shareable across the organisation
cfn-init provides declarative instance configuration (packages, files, services); cfn-signal + CreationPolicy ensure CloudFormation waits for app readiness. Custom Resources extend CloudFormation via Lambda to handle any API call. cfn-hup keeps running instances in sync with template changes. The Registry enables third-party resources and reusable Modules.
Patterns & Best Practices
In production, you should never put everything in one giant stack. Split your infrastructure into layered, loosely coupled stacks with independent lifecycles.
Why Split Stacks
- Independent lifecycles: update app without touching VPC
- Blast radius: a failed app stack update won't break networking
- Team ownership: network team owns VPC stack, app team owns app stack
- Reuse: same VPC stack for dev/staging/prod
- Limits: avoid the 500-resource-per-stack limit
Anti-Patterns to Avoid
- God stack: everything in one template (fragile, slow updates)
- Circular exports: Stack A imports from B which imports from A
- Hardcoded IDs: use !ImportValue or SSM parameters instead
- No DeletionPolicy on databases: stack deletion = data loss
- No Change Sets in production: blind updates are dangerous
CloudFormation integrates naturally into CI/CD pipelines for automated infrastructure deployment.
Template Hygiene
- Use YAML over JSON (more readable, supports comments)
- Use Parameters; never hardcode account IDs, AMIs, or env names
- Use Mappings for region-specific values (AMI per region)
- Use Conditions for env-specific resources (alarms only in prod)
- Validate with cfn-lint before deploying
- Store templates in Git for full change history
Safety & Protection
- Enable Termination Protection on all prod stacks
- Set DeletionPolicy: Retain on databases and S3 buckets
- Use Stack Policies to protect critical resources from update
- Always use Change Sets before updating production
- Run Drift Detection regularly (weekly or in CI)
- Tag all stacks: Environment, Team, CostCenter
| Mistake | Why It's Bad | Fix |
|---|---|---|
| Everything in one stack | Slow updates, large blast radius, 500-resource limit | Split into network/security/data/app stacks |
| No DeletionPolicy on RDS | Delete stack = delete database = data gone | DeletionPolicy: Snapshot or Retain |
| Direct update without Change Set | Surprise replacements can delete data | Always create + review Change Set first |
| Hardcoded values | Template only works in one region/account | Use Parameters, Mappings, and !Ref AWS::Region |
| No cfn-signal on EC2 | CFN marks instance complete before app starts | Add CreationPolicy + cfn-signal |
| Manual changes after deployment | Drift: template no longer matches reality | All changes through CloudFormation; run drift detection |
| Not using SSM for secrets | Secrets in plain text Parameters | Use Secrets Manager or SSM SecureString dynamic references (AWS::SSM::Parameter::Value types don't read SecureString values) |
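As an illustration of the last fix in the table, the sketch below generates a database password in Secrets Manager and feeds it to RDS through dynamic references, so no secret is ever passed as a parameter. Resource names, the username, and sizes are hypothetical.

```yaml
Resources:
  DbSecret:
    Type: AWS::SecretsManager::Secret
    Properties:
      GenerateSecretString:
        SecretStringTemplate: '{"username": "appadmin"}'
        GenerateStringKey: password      # the generated value is stored under this JSON key
        PasswordLength: 24
        ExcludeCharacters: '"@/\'
  AppDatabase:
    Type: AWS::RDS::DBInstance
    DeletionPolicy: Snapshot
    Properties:
      Engine: mysql
      DBInstanceClass: db.t3.micro
      AllocatedStorage: "20"
      # Resolved at deploy time; the secret value never appears in the template
      MasterUsername: !Sub "{{resolve:secretsmanager:${DbSecret}:SecretString:username}}"
      MasterUserPassword: !Sub "{{resolve:secretsmanager:${DbSecret}:SecretString:password}}"
```

The !Sub wrapper only inserts the secret's ARN; CloudFormation resolves the {{resolve:...}} expression itself during deployment.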
| Limit | Value | Workaround |
|---|---|---|
| Resources per stack | 500 | Split into multiple stacks (nested or cross-stack) |
| Parameters per template | 200 | Use SSM parameters or Mappings instead |
| Outputs per template | 200 | Export only what's needed; use SSM for others |
| Template body size (inline) | 51,200 bytes | Upload to S3 (up to 1 MB from S3) |
| Stacks per account per region | 2,000 | Request limit increase or consolidate |
| Exports per account per region | 200 | Use SSM parameters for sharing beyond 200 |
CloudFormation: Complete Summary
- Templates: YAML/JSON files declaring infrastructure. Sections: Parameters, Mappings, Conditions, Resources (required), Outputs.
- Stacks: Live deployment of a template. Lifecycle: Create → Update → Delete. Auto-rollback on failure.
- Intrinsic Functions: !Ref, !GetAtt, !Sub, !FindInMap, !If, !ImportValue for dynamic templates.
- Change Sets: Preview updates before applying. Shows adds, modifies, replacements, removals.
- Cross-Stack References: Export/ImportValue for decoupled multi-stack architectures.
- Stack Sets: Deploy one template across multiple accounts and regions (org-wide governance).
- Drift Detection: Detect manual changes that diverge from the template.
- cfn-init + cfn-signal: Declarative instance config + readiness signalling to CloudFormation.
- Custom Resources: Lambda-backed extension for anything CloudFormation doesn't natively support.
- Best Practices: Split stacks, use Change Sets, enable termination protection, DeletionPolicy on data, CI/CD pipeline integration.
CloudFormation is the backbone of AWS infrastructure automation. Split infrastructure into layered stacks, protect production with Change Sets + Termination Protection + DeletionPolicy, integrate into CI/CD pipelines, and treat infrastructure with the same discipline as application code: versioned, reviewed, and tested.