Management & Governance
Visibility, Compliance & Operational Control
CloudWatch, CloudTrail, AWS Config, Systems Manager, Organizations, Control Tower: the services that monitor, audit, enforce compliance, and automate operations across your entire AWS environment.
Why Management & Governance?
Management & Governance services give you the eyes, ears, and guardrails for your AWS environment, so you can monitor performance, audit every API call, enforce compliance, and automate operations at scale.
As your AWS footprint grows from a single account to hundreds of accounts running thousands of resources, you need a layered management stack. Without it, you're flying blind: unable to diagnose failures, prove compliance, or control costs.
Monitor & Observe
- Collect metrics, logs, and traces from every resource
- Set alarms and dashboards for real-time visibility
- Detect anomalies before they become outages
- Services: CloudWatch, X-Ray
Audit & Comply
- Record every API call across accounts
- Track resource configuration changes over time
- Evaluate resources against compliance rules
- Services: CloudTrail, AWS Config
Operate & Automate
- Patch, inventory, and configure fleets of instances
- Automate runbooks and incident response
- Manage parameters and secrets centrally
- Services: Systems Manager
Govern & Control
- Organise accounts into OUs with policies
- Set up landing zones with guardrails
- Track costs, set budgets, get alerts
- Services: Organizations, Control Tower, Cost Explorer
Reactive Monitoring
- Metrics + alarms: detect and respond to issues
- CloudWatch Alarms trigger SNS or Auto Scaling
- Log analysis after incidents occur
- Good start, but you're always behind
Proactive Compliance
- Rules define what "good" looks like
- Continuous evaluation catches drift soon after it happens
- Auto-remediation fixes issues before humans notice
- Config Rules + SSM Automation
Preventive Governance
- SCPs and guardrails block bad actions before they happen
- Landing zones enforce account structure
- Budget alerts prevent cost surprises
- Organizations + Control Tower
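To make the preventive layer concrete, here is a minimal sketch of a service control policy expressed as a Python dictionary. The approved-region list, the `Sid`, and the helper `is_blocked` are illustrative assumptions. The helper is a deliberately simplified stand-in that understands only this one condition; it does not reproduce how AWS evaluates policies in full.

```python
import json

# Hypothetical list of approved regions; adjust to your own policy.
APPROVED_REGIONS = ["eu-west-1", "eu-central-1"]

# SCP that denies every action requested outside the approved regions.
# Real-world versions usually exempt global services (IAM, Organizations, ...);
# that exemption is omitted here to keep the sketch short.
REGION_GUARDRAIL_SCP = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyOutsideApprovedRegions",
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
            "Condition": {
                "StringNotEquals": {"aws:RequestedRegion": APPROVED_REGIONS}
            },
        }
    ],
}


def is_blocked(policy: dict, requested_region: str) -> bool:
    """Simplified check: does any Deny statement match this region?

    Only understands the single StringNotEquals / aws:RequestedRegion
    condition used above.
    """
    for statement in policy["Statement"]:
        if statement["Effect"] != "Deny":
            continue
        allowed = statement["Condition"]["StringNotEquals"]["aws:RequestedRegion"]
        if requested_region not in allowed:
            return True
    return False


if __name__ == "__main__":
    print(json.dumps(REGION_GUARDRAIL_SCP, indent=2))
    print(is_blocked(REGION_GUARDRAIL_SCP, "us-east-1"))  # True
    print(is_blocked(REGION_GUARDRAIL_SCP, "eu-west-1"))  # False
```

The JSON form of this dictionary is what you would attach to an OU or account through Organizations. Because the statement is a Deny, the request is rejected before the action runs, which is the difference between this layer and the detective layers above it.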
Services & Spectrum
| Pillar | Services | Primary Data | Scope | Best For |
|---|---|---|---|---|
| Monitoring | CloudWatch | Metrics, Logs, Traces | Per-resource / application | Performance monitoring, alarming, dashboards |
| Auditing | CloudTrail | API call events | Per-account / org-wide | Security audit, forensics, compliance evidence |
| Compliance | AWS Config | Configuration snapshots | Per-resource / multi-account | Drift detection, rule evaluation, remediation |
| Operations | Systems Manager | Inventory, commands, patches | Fleet-wide (hybrid) | Patch management, automation, parameter store |
| Governance | Organizations, Control Tower | Policies, guardrails | Org-wide | Multi-account strategy, preventive controls |
| Cost Control | Cost Explorer, Budgets | Billing data | Per-account / org-wide | Cost visibility, budget alerts, right-sizing |
Each management service builds on the one before it. CloudWatch tells you what is happening now. CloudTrail tells you what happened. Config tells you if it's correct. Systems Manager lets you fix it. Organizations + Control Tower prevent it from going wrong in the first place.
Decision Guide
Use CloudWatch when you need to:
- Monitor CPU, memory, disk, custom metrics
- Set alarms for threshold breaches
- Centralise application and infrastructure logs
- Build operational dashboards
- Detect anomalies in metric patterns
Use CloudTrail when you need to:
- Audit who made an API call and when
- Investigate security incidents forensically
- Meet regulatory audit requirements
- React to specific API events via EventBridge
- Query historical API activity with CloudTrail Lake
Use AWS Config when you need to:
- Check if resources comply with internal rules
- Detect configuration drift over time
- Auto-remediate non-compliant resources
- View resource configuration timeline
- Run advanced queries across resources
Use Systems Manager when you need to:
- Patch EC2 or on-prem instances at scale
- Run commands across fleets without SSH
- Store config values and secrets (Parameter Store)
- Automate operational runbooks
- Manage hybrid (cloud + on-prem) environments
Use Organizations and Control Tower when you need to:
- Manage multiple AWS accounts centrally
- Apply guardrails (preventive + detective)
- Consolidate billing across accounts
- Provision new accounts with best practices
- Enforce SCPs to limit account-level actions
| Service | Pillar | Data Collected | Retention | Scope | Billing |
|---|---|---|---|---|---|
| CloudWatch | Monitoring | Metrics, Logs, Traces | 15 months (metrics) / configurable (logs) | Per-resource | Per metric, log GB, alarm |
| CloudTrail | Auditing | Management & data events | 90 days (Event history) / as long as you keep it in S3 (trail) | Per-account / org | First copy of management events free / data events billed per event |
| AWS Config | Compliance | Configuration items | Configurable, 30 days to 7 years (7 years default) | Per-resource / multi-account | Per config item + rule eval |
| Systems Manager | Operations | Inventory, commands, patches | 30 days (run history) | Fleet / hybrid | Free (most features) / advanced tiers |
| Organizations | Governance | SCPs, OU structure | N/A | Org-wide | Free |
| Control Tower | Governance | Guardrails, landing zone | N/A | Org-wide | Free (underlying services billed) |
| Cost Explorer | Cost Control | Billing & usage data | Up to 13 months by default | Per-account / org | Free (API calls billed) |
Architecture Patterns
Most production environments combine management services in complementary layers. Here are the three canonical patterns:
Pattern 1: Observe → Alert → Heal
- Metric crosses threshold → alarm fires
- SNS triggers Lambda for triage
- SSM Automation runs remediation runbook
- Best for: self-healing infrastructure
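The first step of this pattern, an alarm firing after a sustained threshold breach, can be sketched as follows. The keys in `alarm_params` follow the parameter names of boto3's CloudWatch `put_metric_alarm` call. The instance ID, SNS topic ARN, and threshold are placeholders, and `alarm_state` is a simplified simulation that ignores missing-data handling and "M out of N" evaluation.

```python
from typing import Sequence

# Keyword arguments in the shape accepted by boto3's
# cloudwatch.put_metric_alarm(); all values are placeholders.
alarm_params = {
    "AlarmName": "high-cpu-example",
    "Namespace": "AWS/EC2",
    "MetricName": "CPUUtilization",
    "Dimensions": [{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    "Statistic": "Average",
    "Period": 300,            # seconds per datapoint
    "EvaluationPeriods": 3,   # consecutive breaching periods required
    "Threshold": 80.0,
    "ComparisonOperator": "GreaterThanThreshold",
    "AlarmActions": ["arn:aws:sns:eu-west-1:111122223333:ops-alerts"],
}


def alarm_state(datapoints: Sequence[float], threshold: float, periods: int) -> str:
    """Return "ALARM" if the last `periods` datapoints all exceed the threshold."""
    if len(datapoints) < periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-periods:]
    return "ALARM" if all(value > threshold for value in recent) else "OK"


if __name__ == "__main__":
    cpu = [42.0, 85.5, 91.2, 88.7]  # made-up CPU percentages
    print(alarm_state(cpu, alarm_params["Threshold"], alarm_params["EvaluationPeriods"]))
```

Requiring several consecutive breaching periods rather than one keeps a brief spike from triggering the remediation runbook.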
Pattern 2: Audit → Detect → Respond
- Sensitive API call recorded by CloudTrail
- EventBridge rule catches event in real-time
- Lambda evaluates and blocks / notifies
- Best for: security incident response
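As a rough illustration of the detection step, the dictionary below is an EventBridge event pattern that matches S3 `DeleteBucket` calls delivered via CloudTrail. The choice of API call and the sample event are assumptions for the example. The `matches` function is a simplified local matcher that handles only exact-value lists and nested objects; real EventBridge patterns support more operators.

```python
# Event pattern for an EventBridge rule that matches S3 DeleteBucket calls
# recorded by CloudTrail. The choice of API call is illustrative.
DELETE_BUCKET_PATTERN = {
    "source": ["aws.s3"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["s3.amazonaws.com"],
        "eventName": ["DeleteBucket"],
    },
}

# Trimmed-down, made-up event in the general shape EventBridge delivers.
sample_event = {
    "source": "aws.s3",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {
        "eventSource": "s3.amazonaws.com",
        "eventName": "DeleteBucket",
        "userIdentity": {"arn": "arn:aws:iam::111122223333:user/example"},
    },
}


def matches(pattern: dict, event: dict) -> bool:
    """Simplified matcher: nested dicts recurse, lists mean "any of these values"."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


if __name__ == "__main__":
    print(matches(DELETE_BUCKET_PATTERN, sample_event))  # True
```

Fields that the pattern does not mention, such as `userIdentity`, are ignored by the match but remain available to the Lambda function that decides whether to block or notify.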
Pattern 3: Evaluate → Remediate → Report
- Config Rule detects resource drift
- Auto-remediation via SSM fixes the issue
- Compliance status reported to dashboard
- Best for: continuous compliance enforcement
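The evaluate and report steps can be sketched with a small custom-rule check. This example assumes a rule that requires EBS volumes to be encrypted and assumes the configuration item exposes that setting under `configuration.encrypted`. The sample items are made up. The returned dictionary uses the field names of an evaluation entry as submitted to AWS Config by a custom rule.

```python
from collections import Counter


def evaluate_volume(configuration_item: dict) -> dict:
    """Return an evaluation for one EBS volume: compliant only if encrypted."""
    encrypted = configuration_item.get("configuration", {}).get("encrypted", False)
    return {
        "ComplianceResourceType": configuration_item["resourceType"],
        "ComplianceResourceId": configuration_item["resourceId"],
        "ComplianceType": "COMPLIANT" if encrypted else "NON_COMPLIANT",
        "OrderingTimestamp": configuration_item["configurationItemCaptureTime"],
    }


def compliance_summary(evaluations: list) -> dict:
    """Count evaluations per compliance type, e.g. for a dashboard."""
    return dict(Counter(item["ComplianceType"] for item in evaluations))


# Made-up configuration items for two volumes.
sample_items = [
    {
        "resourceType": "AWS::EC2::Volume",
        "resourceId": "vol-0aaaaaaaaaaaaaaaa",
        "configurationItemCaptureTime": "2024-01-01T00:00:00.000Z",
        "configuration": {"encrypted": True},
    },
    {
        "resourceType": "AWS::EC2::Volume",
        "resourceId": "vol-0bbbbbbbbbbbbbbbb",
        "configurationItemCaptureTime": "2024-01-01T00:05:00.000Z",
        "configuration": {"encrypted": False},
    },
]

if __name__ == "__main__":
    results = [evaluate_volume(item) for item in sample_items]
    print(compliance_summary(results))
```

The check runs only after the configuration item has been recorded, so it reports on the change and does not stop it. The non-compliant results are what an SSM Automation remediation would then act on.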
Exam Insights
| If the question says… | Think… |
|---|---|
| "Monitor CPU utilisation" or "set alarm" | CloudWatch |
| "Who deleted the S3 bucket?" or "API call history" | CloudTrail |
| "Is this resource compliant?" or "configuration drift" | AWS Config |
| "Patch instances" or "run command at scale" | Systems Manager |
| "Multi-account strategy" or "SCP" | Organizations |
| "Landing zone" or "guardrails" | Control Tower |
| "Cost anomaly" or "budget alert" | Cost Explorer + Budgets |
| "Configuration timeline" or "resource history" | AWS Config |
| "Centralise logs from multiple accounts" | CloudWatch + CloudTrail (org trail) |
| "Auto-remediate non-compliant resource" | AWS Config + SSM Automation |
| "Store database password securely" | SSM Parameter Store (SecureString) |
| "Best-practice recommendations" | Trusted Advisor |
| Trap | Reality |
|---|---|
| "CloudTrail records performance metrics" | NO. CloudTrail records API calls. CloudWatch records metrics and logs. |
| "AWS Config can block non-compliant changes" | Config detects and remediates AFTER the change. Use SCPs to prevent changes. |
| "Systems Manager requires SSH access" | SSM Agent makes outbound HTTPS connections: no SSH, no bastion, no open inbound ports. |
| "CloudWatch Logs retention is automatic" | Logs are kept forever by default. You must set retention policy explicitly. |
| "Organizations SCPs affect the management account" | SCPs never restrict users or roles in the management account; they apply only to member accounts. |
| "Control Tower replaces Organizations" | Control Tower builds on top of Organizations, adding guardrails and landing zones. |
| "Config Rules prevent resource creation" | Config Rules are detective, not preventive. They evaluate after creation. |
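The log-retention trap above is easy to act on. CloudWatch Logs accepts only specific retention values, so a small helper can round a required minimum up to the nearest accepted one. This is a sketch under assumptions: the list of accepted values is believed current but should be checked against the API reference, and the log group name is hypothetical. The resulting dictionary uses the parameter names of boto3's CloudWatch Logs `put_retention_policy` call.

```python
import bisect

# Retention periods in days accepted by CloudWatch Logs.
# Assumption: believed current; verify against the API reference before use.
ALLOWED_RETENTION_DAYS = [
    1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545,
    731, 1096, 1827, 2192, 2557, 2922, 3288, 3653,
]


def nearest_allowed_retention(min_days: int) -> int:
    """Smallest accepted retention value that is >= min_days."""
    index = bisect.bisect_left(ALLOWED_RETENTION_DAYS, min_days)
    if index == len(ALLOWED_RETENTION_DAYS):
        raise ValueError(f"{min_days} days exceeds the longest fixed retention period")
    return ALLOWED_RETENTION_DAYS[index]


def retention_request(log_group: str, min_days: int) -> dict:
    """Build keyword arguments in the shape of logs.put_retention_policy()."""
    return {
        "logGroupName": log_group,
        "retentionInDays": nearest_allowed_retention(min_days),
    }


if __name__ == "__main__":
    # Hypothetical log group that must keep at least 100 days of logs.
    print(retention_request("/app/example", 100))
```

Rounding up instead of down means the stored period never falls short of the stated requirement.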
- Management = visibility + control. You can't operate what you can't see, audit, or enforce rules on.
- Core services: CloudWatch (monitor), CloudTrail (audit), Config (comply), Systems Manager (operate), Organizations (govern).
- CloudWatch + CloudTrail work together: metrics show WHAT happened, trail shows WHO did it.
- Config + SSM Automation is the compliance loop: detect drift, auto-remediate, report compliance.
- Organizations + Control Tower for enterprise multi-account: SCPs prevent, guardrails protect, landing zones standardise.
Layer your management stack: CloudWatch observes, CloudTrail audits, Config evaluates, Systems Manager operates, and Organizations governs. Each service fills a gap the others cannot, so use them together for complete operational control.