Amazon SNS
Simple Notification Service
A fully managed publish-subscribe messaging service that lets one event reach many consumers simultaneously, decoupling your systems and enabling event-driven architectures at scale.
SNS in 30 Seconds
- Managed pub/sub service: one message published, many subscribers receive it simultaneously
- Decouples producers from consumers: they don't need to know about each other
- Supports SQS, Lambda, HTTP/S, Email, and SMS as subscriber targets
- Fan-out architecture: SNS + SQS is the standard reliable multi-consumer pattern
- Push-based: SNS delivers to subscribers immediately (not pull like SQS)
What is Amazon SNS
Imagine an e-commerce platform where a customer places an order. The Order Service needs to:
- Send a confirmation email to the customer
- Update the inventory system
- Notify the shipping service to prepare a package
- Trigger a fraud detection check
Tightly Coupled (Bad)
- Order Service calls each system directly
- If Email Service is down → order fails
- Adding a new service requires changing Order Service code
- All services must be available simultaneously
- One slow consumer slows everything
Loosely Coupled with SNS (Good)
- Order Service publishes one event to SNS
- SNS fans out to all subscribers independently
- Each system works independently
- New systems subscribe without touching Order Service
- One slow subscriber doesn't affect others
SNS is like a public address system in a building. One announcement is made from a single microphone, and every loudspeaker in every room plays it at the same time. The announcer doesn't know who is listening, and the listeners don't know who is speaking.
Real-world analogies that work well:
Radio Broadcast
One station broadcasts a signal. Many radios tuned to that frequency receive it. The station doesn't care how many listeners exist.
Newsletter
You publish one newsletter. All subscribers receive the same content. Subscribers can opt in or out without affecting the publisher.
Push Notification
An app sends one alert. All users with notifications enabled receive it simultaneously, regardless of how many there are.
Amazon SNS (Simple Notification Service) is a fully managed publish-subscribe (pub/sub) messaging service. It enables you to decouple applications by:
- Publishers sending messages to a central Topic
- SNS immediately delivering that message to all Subscribers
- Subscribers receiving the same message in parallel and independently
The core formula is simple: Publisher → Topic → Subscribers.
SNS solves three fundamental distributed systems problems:
Decoupling
Producers and consumers don't know about each other. The Order Service doesn't need to know that Email, Lambda, and SQS exist; it just publishes to a topic.
Scalable Communication
Adding a new consumer doesn't require changes to the producer. Subscribe a new service to the topic and it instantly starts receiving all future events.
Event-Driven Architecture
Systems react to events rather than being directly called. This is the foundation of modern microservice and serverless architectures on AWS.
| Concern | Direct API Calls (Synchronous) | SNS (Asynchronous Pub/Sub) |
|---|---|---|
| Adding new consumer | Modify publisher code | Subscribe to topic; no code change |
| Consumer goes down | Publisher fails too | Other consumers unaffected |
| Scaling consumers | Publisher must coordinate | Each consumer scales independently |
| Latency sensitivity | Publisher waits for all calls | Publisher returns instantly after publish |
| Architecture complexity | Grows with each new consumer (N×N) | Always 1×N: one publisher, N subscribers |
When an exam question describes a scenario where one event needs to trigger multiple actions simultaneously (order placed → email + inventory + shipping), the answer pattern is SNS fan-out. SNS is the "broadcasting" service; SQS is the "buffering queue" service. They are complementary, not competing.
SNS is the broadcast backbone of event-driven AWS architecture: one message reaches many consumers instantly, without the publisher knowing who is listening.
Core Messaging Concepts
Every SNS interaction involves four elements. Understanding these clearly is the foundation for everything else on this page.
Topic
The central communication channel. A named endpoint that publishers write to and subscribers listen on. Think of it as a named event stream such as order-placed, user-signup, or payment-failed.
- Has a unique ARN identifier
- Can have up to 12.5 million subscriptions
- Two types: Standard and FIFO
Publisher
Any application or service that sends a message to a topic. The publisher has no knowledge of who the subscribers are or how many exist.
- EC2 instance, Lambda function, microservice
- AWS services (CloudWatch, CodePipeline, S3)
- Calls the Publish API with topic ARN + message
Subscriber
Any endpoint that registers interest in a topic. When a message arrives, SNS pushes it to every active subscriber in parallel.
- SQS queue, Lambda function, HTTP/S endpoint
- Email address, SMS phone number
- Amazon Kinesis Data Firehose
Message
The payload sent by the publisher. Up to 256 KB of text (JSON, XML, plain text). Can carry attributes (metadata) for filtering without touching the message body.
- Message body: up to 256 KB
- Message attributes: key-value metadata pairs
- Message ID: unique identifier assigned by SNS
The 256 KB limit applies to the entire publish request: message body + attributes + metadata combined. Multi-byte characters (emoji, CJK) take more bytes than their character count suggests.
What counts toward 256 KB
- Message body (JSON / text / XML)
- Message attribute keys and values
- Topic ARN + request metadata
Character encoding
- All messages are UTF-8
- ASCII: 1 byte per character
- Emoji / CJK: 3-4 bytes each
- 256 KB = 262,144 bytes, so roughly 262,000 ASCII characters
If you exceed 256 KB
- Store payload in S3, send S3 URI in SNS
- Compress with gzip before publishing
- Split into multiple sequenced messages
- Best practice: keep SNS messages under 10 KB
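The size rules above can be checked before publishing. The following is a minimal sketch, assuming the billable size is approximated as the UTF-8 length of the body plus attribute names and values; the function names are hypothetical and the real service may count some request metadata differently.

```python
from typing import Dict

SNS_MAX_BYTES = 256 * 1024  # 262,144 bytes


def estimated_publish_size(body: str, attributes: Dict[str, str]) -> int:
    """Approximate size in bytes: UTF-8 body plus attribute names and values."""
    size = len(body.encode("utf-8"))
    for name, value in attributes.items():
        size += len(name.encode("utf-8")) + len(value.encode("utf-8"))
    return size


def fits_in_sns(body: str, attributes: Dict[str, str]) -> bool:
    """True when the estimate is within the 256 KB publish limit."""
    return estimated_publish_size(body, attributes) <= SNS_MAX_BYTES


ascii_bytes = estimated_publish_size("hello", {})          # 1 byte per character
cjk_bytes = estimated_publish_size("\u6ce8\u6587", {})      # two CJK characters
emoji_bytes = estimated_publish_size("\U0001F680", {})      # one emoji
```

When `fits_in_sns` returns False, the options listed above apply: store the payload in S3 and publish the URI, compress, or split the message.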
Fan-out is the defining capability of SNS: one message in → N deliveries out. Each subscriber receives a full, independent copy of the message and processes it in parallel.
Fan-out solves a classic distributed systems challenge: how do you notify many systems about the same event without the event source needing to know about all of them?
One Publish Call
The publisher makes a single API call to SNS. No loops, no parallel threads, no knowledge of downstream systems.
Parallel Delivery
SNS delivers to all subscribers simultaneously. Subscriber A doesn't wait for Subscriber B to finish processing.
Independent Processing
Each subscriber processes at its own pace, in its own way. A Lambda function and an SQS queue behave entirely differently, and that's fine.
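To make the fan-out idea concrete, here is a toy in-memory model. It is only a sketch: `InMemoryTopic` is a hypothetical class, it delivers sequentially rather than in parallel, and it has none of the retry or durability behaviour of the real service.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class InMemoryTopic:
    """Toy stand-in for an SNS topic: it only models the fan-out idea."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._subscribers: Dict[str, Callable[[str], None]] = {}

    def subscribe(self, subscriber_id: str, handler: Callable[[str], None]) -> None:
        self._subscribers[subscriber_id] = handler

    def publish(self, message: str) -> int:
        """Hand a copy of the message to every subscriber; return the delivery count."""
        delivered = 0
        for handler in self._subscribers.values():
            try:
                handler(message)
                delivered += 1
            except Exception:
                # One failing subscriber must not block the others.
                continue
        return delivered


def broken_email_handler(_message: str) -> None:
    raise RuntimeError("email service is down")


inbox: Dict[str, List[str]] = defaultdict(list)
topic = InMemoryTopic("order-placed")
topic.subscribe("inventory", lambda m: inbox["inventory"].append(m))
topic.subscribe("shipping", lambda m: inbox["shipping"].append(m))
topic.subscribe("email", broken_email_handler)

# A single publish call; the publisher never references the subscribers.
delivered = topic.publish('{"orderId": "123"}')
```

The publisher makes one call and two of the three subscribers receive the message even though the third one fails, which is the isolation property described above.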
SNS offers two topic types. Choose based on whether message order and deduplication matter for your workload:
| Feature | Standard Topic | FIFO Topic |
|---|---|---|
| Message ordering | Best-effort (not guaranteed) | Strict order guaranteed |
| Deduplication | Possible duplicates | Exactly-once delivery |
| Throughput | Near-unlimited | 300 msg/s (3,000 with batching) |
| Subscriber types | SQS, Lambda, HTTP, Email, SMS | SQS FIFO queues only |
| Use case | Most event-driven workloads | Financial transactions, stock tickers, strict sequencing |
Throughput
- 300 msg/sec: default without batching
- 3,000 msg/sec: with batching (10 messages per batch × 300 TPS)
- Can request limit increase via AWS Support
- Each batch counts as one API transaction
Hard Restrictions
- Subscriber types: SQS FIFO only; no Lambda, HTTP, Email, SMS
- Subscription filter policies are supported on FIFO topics, so filtering is not a restriction here
- No cross-account: SQS FIFO must be in same AWS account
- Message deduplication ID required (or enable content-based deduplication)
- Messages with same group ID are strictly ordered
| Aspect | Detail |
|---|---|
| Message group ID | Required: defines the ordering scope. Messages with the same ID are processed in FIFO order. |
| Deduplication window | 5 minutes: a repeated deduplication ID within 5 minutes is silently dropped |
| Content-based dedup | Enable on the topic to avoid managing dedup IDs; SNS hashes the message body |
| When NOT to use FIFO | Lambda/HTTP/Email subscribers, throughput > 3,000 msg/sec |
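The deduplication rules in the table can be modelled in a few lines. This is a sketch under assumptions: `DedupWindow` is a hypothetical class, time is passed in explicitly for clarity, and the content-based ID is taken to be a SHA-256 hash of the message body.

```python
import hashlib
from typing import Dict

DEDUP_WINDOW_SECONDS = 5 * 60


class DedupWindow:
    """Toy model of a 5-minute deduplication window keyed by a dedup ID."""

    def __init__(self) -> None:
        self._seen: Dict[str, float] = {}

    @staticmethod
    def content_based_id(body: str) -> str:
        # Content-based deduplication derives the ID from a hash of the body.
        return hashlib.sha256(body.encode("utf-8")).hexdigest()

    def accept(self, dedup_id: str, now: float) -> bool:
        """Return True if the message should be delivered, False for a duplicate."""
        first_seen = self._seen.get(dedup_id)
        if first_seen is not None and now - first_seen < DEDUP_WINDOW_SECONDS:
            return False
        self._seen[dedup_id] = now
        return True


window = DedupWindow()
order_id = DedupWindow.content_based_id('{"orderId": "123"}')
first = window.accept(order_id, now=0.0)       # new message
repeat = window.accept(order_id, now=120.0)    # same ID two minutes later
later = window.accept(order_id, now=300.0)     # window has elapsed
```

A retry that arrives inside the window is dropped, while the same body published after the window is treated as a new message.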
By default, every subscriber receives every message published to a topic. Filter policies let subscribers declare which messages they care about (based on message attributes) without the publisher needing to route differently.
Without Filters
- All subscribers get all messages
- Each subscriber must ignore irrelevant messages in code
- Wastes Lambda invocations and SQS receives
- Fine for homogeneous consumers
With Filter Policies
- Subscriber only receives matching messages
- Filter evaluated by SNS; no code needed
- Example: "event-type": ["order-placed"]
- Reduces cost and processing overhead
Filter policies are JSON objects you attach to a subscription. SNS evaluates them server-side before delivering. By default they are matched against message attributes, so the publisher must include the relevant attributes for filtering to work.
Match by string value
{"event-type": ["order.placed", "order.updated"]}
Array = OR logic. Delivers if the attribute equals any listed value.
Numeric comparison
{"price": [{"numeric": [">", 100]}]}
Supports >, <, >=, <=, =. Also supports ranges: [">", 0, "<=", 500].
AND logic (multi-key)
{"event-type": ["order.placed"], "region": ["NA", "EU"], "amount": [{"numeric": [">=", 500]}]}
Multiple keys = AND. All conditions must match.
Prefix & exists checks
{"tier": [{"prefix": "gold"}]}
{"priority": [{"exists": true}]}
Prefix: match strings starting with the value. Exists: require the attribute to be present.
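Publishing with attributes (required for attribute-based filtering): by default, filter policies are evaluated against message attributes, which the publisher must set. To filter on the message body instead, set the subscription's FilterPolicyScope to MessageBody. If a subscription has an attribute-based filter policy and a message carries no attributes, that subscriber does not receive the message; only subscriptions without a filter policy receive everything.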
A common exam scenario: "You have one SNS topic with multiple subscribers. Each subscriber should only receive messages relevant to them, without code changes." The answer is SNS Subscription Filter Policies: set a filter on each subscription based on message attributes, and SNS handles the routing server-side.
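The matching rules for filter policies (OR within an array, AND across keys, plus numeric, prefix, and exists operators) can be sketched as a small local evaluator. This is an illustration only: it covers just the operators shown in this section, the function names are hypothetical, and it is not how SNS itself is implemented.

```python
import operator
from typing import Any, Dict, List

_NUMERIC_OPS = {
    ">": operator.gt, ">=": operator.ge,
    "<": operator.lt, "<=": operator.le, "=": operator.eq,
}


def _numeric_match(conditions: List[Any], value: Any) -> bool:
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    # Conditions come in (operator, operand) pairs, e.g. [">", 0, "<=", 500].
    pairs = zip(conditions[0::2], conditions[1::2])
    return all(_NUMERIC_OPS[op](value, operand) for op, operand in pairs)


def _rule_match(rule: Any, attributes: Dict[str, Any], key: str) -> bool:
    if isinstance(rule, dict):
        if "exists" in rule:
            return (key in attributes) == rule["exists"]
        if key not in attributes:
            return False
        if "prefix" in rule:
            return str(attributes[key]).startswith(rule["prefix"])
        if "numeric" in rule:
            return _numeric_match(rule["numeric"], attributes[key])
        return False
    # Plain value: exact match against the attribute.
    return key in attributes and attributes[key] == rule


def matches(policy: Dict[str, List[Any]], attributes: Dict[str, Any]) -> bool:
    """Keys are ANDed together; the rules listed under one key are ORed."""
    return all(
        any(_rule_match(rule, attributes, key) for rule in rules)
        for key, rules in policy.items()
    )


policy = {
    "event-type": ["order.placed"],
    "region": ["NA", "EU"],
    "amount": [{"numeric": [">=", 500]}],
}
large_eu_order = matches(policy, {"event-type": "order.placed", "region": "EU", "amount": 750})
small_eu_order = matches(policy, {"event-type": "order.placed", "region": "EU", "amount": 100})
```

An evaluator like this is handy for unit-testing a policy before attaching it to a subscription.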
SNS topics, publishers, subscribers, and filter policies are the four core building blocks; master these and every architecture pattern follows naturally.
How SNS Works in AWS Architecture
In any AWS architecture, SNS sits as the event broadcast layer between event producers (applications, AWS services, Lambda functions) and event consumers (queues, functions, endpoints). The flow is always the same:
Application or AWS Service → publishes event → SNS Topic → SNS pushes to all subscribers in parallel
SQS Queue
Most common subscriber. SNS pushes the message into the queue where workers poll and process at their own pace. Enables durable, buffered processing.
Lambda Function
SNS triggers Lambda directly. Ideal for real-time processing, transformation, or routing. Lambda scales automatically with the message volume.
HTTP / HTTPS Endpoint
SNS sends an HTTP POST to any publicly reachable URL. Useful for webhook-style integrations with third-party systems or custom services.
Email / Email-JSON
SNS sends an email to a subscribed address. Requires manual confirmation by the recipient. Used for CloudWatch alarms and operational alerts.
SMS
Delivers text messages to mobile numbers. Used for critical alerts, OTP notifications, and on-call paging. Supports transactional and promotional tiers.
Kinesis Firehose
SNS fans events directly into a Firehose delivery stream. Used for real-time analytics pipelines: events flow to S3, Redshift, or OpenSearch automatically.
SNS isn't just for application code. Many AWS services can publish directly to an SNS topic without any custom code:
| AWS Service | What It Publishes | Common Use |
|---|---|---|
| CloudWatch Alarms | Alarm state changes (OK / ALARM / INSUFFICIENT_DATA) | Notify on-call team, trigger auto-remediation Lambda |
| S3 Event Notifications | Object created, deleted, or restored | Trigger processing pipeline when file arrives in S3 |
| CodePipeline / CodeBuild | Pipeline state changes, build pass/fail | Notify dev team of deployment success or failure |
| AWS Health | Service disruptions, maintenance events | Proactive alerts to ops teams |
| Auto Scaling | Scale-out / scale-in lifecycle events | Warm up new instances before traffic hits them |
| RDS | Database events (failover, backup, restore) | Alert on database failover events |
SNS delivery is near-instant but not zero. Latency varies by subscriber type; plan your architecture around realistic expectations:
| Subscriber Type | Typical Latency | Notes |
|---|---|---|
| SQS (same region) | 30-100 ms | Fastest: queue is co-located in same region |
| Lambda (same region) | 50-200 ms | Add cold-start time (~200 ms) if function is not warm |
| HTTP/S endpoint | 100-500 ms | Depends on endpoint response time; SNS waits for 200 OK |
| Cross-region SQS | 200-800 ms | Data transfer across AWS regions adds overhead |
| Email | 2-10 seconds | Email provider and spam filter delays |
| SMS | 2-15 seconds | Carrier network delays; varies by country |
SNS offers no delivery latency SLA; it is best-effort with high throughput. Do not use SNS for hard real-time requirements (<10 ms). For near-real-time (100-200 ms), use SQS + long-polling consumers.
SMS and Email subscriptions have significant regional and operational constraints that matter in production:
| Region | SMS Supported | Sender ID | Notes |
|---|---|---|---|
| United States | Yes | No (10DLC required) | Must register 10-digit long code (10DLC) for A2P messaging |
| Canada | Yes | No | Registration required for A2P; short codes available |
| Europe | Yes | Yes (varies) | Sender ID registration varies per country; some require approval |
| India | Yes | No | DLT (Distributed Ledger Technology) registration mandatory |
| Australia | Yes | Yes (pre-register) | Sender IDs require pre-registration with carriers |
| China | Limited | N/A | Requires local ICP providers; SNS direct SMS not reliable |
Email Subscription Limits
- Requires manual confirmation by recipient (double opt-in)
- Not suitable for high-volume automated sends
- No attachments; plain text or HTML only
- For scale: use Lambda subscriber → Amazon SES instead
SMS Best Practices
- Test in target countries before production
- Use Transactional type for OTP / alerts (higher priority)
- Use Promotional type for marketing (lower cost)
- Set monthly SMS spend limits to avoid surprise bills
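As an illustration of the Transactional versus Promotional choice, the sketch below builds the arguments for a direct-to-phone Publish call. The helper name and phone number are hypothetical; `AWS.SNS.SMS.SMSType` is the message attribute SNS uses to select the SMS type. No AWS call is made here.

```python
from typing import Any, Dict


def build_sms_publish_kwargs(phone_number: str, text: str, transactional: bool = True) -> Dict[str, Any]:
    """Build keyword arguments for an SNS Publish call that sends one SMS."""
    sms_type = "Transactional" if transactional else "Promotional"
    return {
        "PhoneNumber": phone_number,  # E.164 format
        "Message": text,
        "MessageAttributes": {
            "AWS.SNS.SMS.SMSType": {"DataType": "String", "StringValue": sms_type},
        },
    }


otp_request = build_sms_publish_kwargs("+15555550100", "Your code is 123456")
promo_request = build_sms_publish_kwargs("+15555550100", "Sale ends Friday", transactional=False)

# With boto3 installed and credentials configured, a request would be sent as:
#   boto3.client("sns").publish(**otp_request)
```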
In a microservices architecture, each service owns a narrow slice of business logic. Without SNS, services communicate by calling each other directly, creating a tight mesh of dependencies. With SNS:
Without SNS: Direct Calls
- Order API calls Inventory API, Email API, Shipping API directly
- 3 services to keep track of from 1 producer
- Add a 4th consumer → update Order API code, redeploy
- If any downstream service is slow → Order API is slow
With SNS: Event Bus
- Order API publishes OrderPlaced event to SNS topic
- Inventory, Email, Shipping subscribe independently
- Add a 4th consumer → just subscribe to the topic; zero code change in Order API
- Each consumer processes asynchronously at its own speed
Key architecture trigger words: When you see "one event should trigger multiple independent actions", "services should not be tightly coupled", or "adding a new consumer without changing existing code", the answer is SNS topic with multiple subscribers. This pattern maps directly to real-world microservice design.
SNS acts as the broadcast backbone of AWS event-driven systems: AWS services, applications, and microservices all publish to topics, and any number of consumers receive events without coupling.
SNS + SQS Fan-Out Architecture
SNS delivers messages immediately and does not store them. If a subscriber is temporarily unavailable, that message is gone. For durable, reliable processing, pair SNS with SQS.
SNS: Broadcasts
- Push-based: delivers instantly to all subscribers
- No storage: if subscriber misses the message, it's gone
- Best for fire-and-forget notifications (email, HTTP webhooks)
- Not ideal for workloads that need retries or buffering
SQS: Buffers
- Pull-based: consumers poll at their own pace
- Stores messages for up to 14 days
- Handles backpressure: consumers can slow down without losing messages
- Built-in retry with visibility timeout and DLQ support
The SNS + SQS fan-out is the most important SNS architecture pattern. It combines SNS's broadcasting with SQS's durability:
- One producer publishes a single event to an SNS topic
- SNS fans out to multiple SQS queues simultaneously
- Each SQS queue is consumed by an independent service or worker
- Each consumer processes at its own pace, retries independently, scales independently
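One detail the steps above rely on is that each SQS queue must allow the topic to send to it. The following is a sketch of that queue policy, assuming a hypothetical account ID, topic name, and queue naming scheme; it only builds the JSON and makes no AWS calls.

```python
import json
from typing import Any, Dict

ACCOUNT_ID = "123456789012"  # hypothetical account
TOPIC_ARN = f"arn:aws:sns:us-east-1:{ACCOUNT_ID}:order-placed"  # hypothetical topic


def queue_policy_for_topic(queue_arn: str, topic_arn: str) -> Dict[str, Any]:
    """Queue policy that lets exactly one SNS topic send messages to the queue."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "AllowSnsTopicToSend",
            "Effect": "Allow",
            "Principal": {"Service": "sns.amazonaws.com"},
            "Action": "sqs:SendMessage",
            "Resource": queue_arn,
            "Condition": {"ArnEquals": {"aws:SourceArn": topic_arn}},
        }],
    }


# One queue per consumer; each would be subscribed to the same topic.
queue_policies = {
    name: json.dumps(
        queue_policy_for_topic(f"arn:aws:sqs:us-east-1:{ACCOUNT_ID}:{name}-queue", TOPIC_ARN)
    )
    for name in ("inventory", "shipping", "analytics")
}

# With boto3, each policy would be applied and the queue subscribed roughly as:
#   sqs.set_queue_attributes(QueueUrl=queue_url, Attributes={"Policy": queue_policies[name]})
#   sns.subscribe(TopicArn=TOPIC_ARN, Protocol="sqs", Endpoint=queue_arn)
```

Scoping the policy with `aws:SourceArn` keeps other topics from writing into a consumer's queue.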
Decoupling
The Order Service (producer) has zero knowledge of Inventory, Shipping, or Analytics. They are completely independent. Remove any one of them and the producer keeps working unchanged.
Independent Scaling
The Inventory worker can scale to 10 instances while Shipping uses 2 and Analytics uses 1. Each scales based on its own queue depth, completely independent of the others.
Retry Isolation
If the Shipping worker fails for 10 minutes, messages accumulate in the Shipping SQS queue. Inventory and Analytics are completely unaffected. When Shipping recovers, it drains its queue.
Durability via SQS
Even if SNS delivers and the worker is down, messages are safely held in SQS for up to 14 days. Nothing is lost. SNS alone would drop the message; SQS is the durability layer.
| Concern | Direct Service Calls | SNS + SQS Fan-Out |
|---|---|---|
| Adding a new consumer | Modify producer + redeploy | Subscribe new SQS queue, zero producer change |
| Consumer is slow | Producer waits / times out | Queue buffers messages, consumer catches up |
| Consumer crashes | Messages lost | Messages stay in SQS queue, retried automatically |
| Consumer scaling | Must coordinate with producer | Each service scales via its own queue depth |
| One consumer fails | Can cascade to other consumers | Other consumers completely unaffected |
| Audit / replay | Hard: real-time only | Messages retained in queue for inspection or replay |
| Scenario | Use SNS Alone | Use SNS + SQS |
|---|---|---|
| Email / SMS alerts | Yes: fire and forget is fine | Not needed |
| Lambda real-time processing | Yes: Lambda handles retries itself | SQS adds buffering if needed |
| Multiple microservices, each processing independently | Risky: no durability | Yes: each service gets its own queue |
| Consumer may be slow / unavailable | Messages dropped | Yes: SQS buffers the backlog |
| Audit trail needed | No: SNS doesn't retain messages | Yes: SQS retains for up to 14 days |
The SNS + SQS fan-out pattern is one of the most frequently tested AWS architecture patterns. The trigger will usually be: "multiple services must each receive and independently process the same event", "a consumer can be unavailable without losing messages", or "services should scale independently". The answer is always SNS topic → multiple SQS queues → independent consumers.
SNS + SQS fan-out is the gold standard for reliable, scalable, multi-consumer event processing: SNS broadcasts, SQS buffers, and each consumer is fully independent.
SNS vs SQS vs EventBridge
SNS, SQS, and EventBridge are all messaging services, but they solve different problems. Many developers (and exam questions) confuse them because they all involve sending and receiving messages between services. The key is understanding what each one was designed for.
These three services are not competitors; they are complementary layers. Production architectures routinely use all three together.
SNS
"I need to broadcast one event to many consumers at the same time."
Push-based fan-out. One publish → many parallel deliveries.
SQS
"I need to queue work so a consumer can process it reliably at its own pace."
Pull-based buffer. One message → one consumer. Durable storage.
EventBridge
"I need to route events based on content, with advanced filtering and scheduling."
Event router with rules, cross-account delivery, and SaaS integration.
| Feature | SNS | SQS | EventBridge |
|---|---|---|---|
| Pattern | Publish / Subscribe | Message Queue | Event Bus / Router |
| Delivery model | Push: SNS pushes to subscribers | Pull: consumers poll the queue | Push: EventBridge pushes to targets |
| Consumers per message | Many: all subscribers receive a copy | One: a single consumer receives each message | Many: multiple rules can match and route |
| Message retention | No: delivers immediately or fails | Yes: up to 14 days | No: event bus is transient |
| Filtering | Subscription filter policies (on message attributes or, optionally, the message body) | No filtering: consumer receives all messages | Rich content-based filtering on event body |
| Ordering | Standard: best-effort · FIFO: strict | Standard: best-effort · FIFO: strict | No ordering guarantee |
| Throughput | Near-unlimited (Standard) | Near-unlimited (Standard) | 10,000 events/sec default (soft limit) |
| Retry / DLQ | Retry per subscriber · DLQ supported | Built-in retry with visibility timeout + DLQ | Retry with exponential backoff · DLQ supported |
| SaaS / cross-account | Cross-account via topic policy; no SaaS sources | Cross-account via queue policy; no SaaS sources | Yes: Salesforce, Zendesk, GitHub, cross-account buses |
| Scheduling | No | No | Yes: cron and rate-based scheduled rules |
| Best for | Broadcasting events to multiple systems simultaneously | Reliable, buffered work queue for a single consumer | Event routing, filtering, scheduling, SaaS integration |
Choose SNS when…
- One event must reach multiple consumers at once
- You need simple broadcast semantics
- Combining with SQS queues for fan-out
- Sending CloudWatch alarm emails or SMS alerts
- Triggering multiple Lambda functions from one event
Choose SQS when…
- Each unit of work should be handled by a single consumer
- Consumer may be slow, unavailable, or need retries
- You need message buffering and backpressure handling
- Task queues: image processing, order fulfilment, batch jobs
- You need messages retained for up to 14 days
Choose EventBridge when…
- You need content-based routing: route by event fields
- Integrating with SaaS providers (Salesforce, GitHub, Stripe)
- Cross-account or cross-region event delivery
- Scheduled tasks (cron jobs, periodic triggers)
- Complex event filtering without custom code
In production architectures, these three services are often used in the same pipeline. The table below clears up the most common points of confusion:
| Confusion | Clarification |
|---|---|
| SNS vs SQS: which is the queue? | SQS is the queue. SNS has no queue; it delivers immediately or not at all. SNS is the broadcaster, SQS is the buffer. |
| SNS vs EventBridge: both fan out, what's the difference? | SNS does simple fan-out to multiple subscribers. EventBridge adds content-based routing, SaaS integration, scheduling, and cross-account buses. Use EventBridge when you need routing logic; use SNS when you need broadcast. |
| Can SNS and EventBridge both trigger Lambda? | Yes. For simple pub/sub use SNS. For complex routing rules, SaaS events, or scheduled triggers use EventBridge. |
| Do I need SNS if I have EventBridge? | Often yes. An EventBridge rule can route to only a handful of targets (five per rule), so SNS is the better fit when one event must fan out simultaneously to many subscribers. |
| Who retains messages: SNS or SQS? | SQS retains messages (up to 14 days). SNS does not retain; if delivery fails and retries are exhausted, the message is gone unless you configured a DLQ. |
- One event → multiple systems in parallel → SNS fan-out (+ SQS for durability)
- Work queue, one processor per task, retry needed → SQS
- Route events from SaaS / across accounts / by content fields → EventBridge
- Scheduled recurring job → EventBridge scheduled rule → Lambda
- Message must survive consumer downtime → SQS (not SNS alone)
SNS broadcasts, SQS buffers, EventBridge routes: they are not alternatives but layers of an event-driven architecture that work best together.
Real-World Use Cases
SNS is not an exotic service; it shows up in almost every production AWS architecture. These are the patterns you will encounter most often:
Scenario
An e-commerce platform needs to notify multiple systems the moment a customer places an order: confirmation email, inventory update, shipping preparation, and fraud detection all at once.
Architecture
- Order Service publishes OrderPlaced to SNS topic
- SNS → SQS (Inventory) → Inventory worker updates stock
- SNS → SQS (Shipping) → Shipping worker creates fulfilment job
- SNS → Lambda → Fraud score calculated in real-time
- SNS → Email → Confirmation sent to customer
Scenario
Infrastructure monitoring: when a CloudWatch alarm fires (high CPU, low disk, error rate spike), multiple teams and automated systems need to react simultaneously.
Architecture
- CloudWatch Alarm → SNS topic ops-alerts
- SNS → Email → on-call engineer paged
- SNS → Lambda → Auto-remediation script runs
- SNS → SQS → Audit log queue for compliance
- SNS → HTTP → PagerDuty / Slack webhook
Scenario
A user management service handles sign-up, profile updates, and account deletions. Downstream services (marketing, billing, analytics) each need to react to these events independently.
Architecture
- User Service publishes events to user-events SNS topic
- Subscription filter: event-type = "signup" → Marketing SQS
- Subscription filter: event-type = "deleted" → Billing Lambda
- No filter → Analytics SQS receives all events
- New downstream service? Just subscribe: zero changes to User Service
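The user-events architecture hinges on the publisher attaching an event-type attribute. A minimal sketch of the two pieces follows, assuming a hypothetical topic ARN and helper name; it builds the Publish arguments and the marketing filter policy without calling AWS.

```python
import json
from typing import Any, Dict

USER_EVENTS_TOPIC = "arn:aws:sns:us-east-1:123456789012:user-events"  # hypothetical ARN


def build_user_event_publish_kwargs(topic_arn: str, event_type: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Publish arguments with the event type exposed as a message attribute."""
    return {
        "TopicArn": topic_arn,
        "Message": json.dumps(payload),
        "MessageAttributes": {
            "event-type": {"DataType": "String", "StringValue": event_type},
        },
    }


signup_event = build_user_event_publish_kwargs(USER_EVENTS_TOPIC, "signup", {"userId": "u-1"})

# Filter policy the Marketing queue's subscription would carry.
marketing_filter = json.dumps({"event-type": ["signup"]})

# With boto3 this would look roughly like:
#   sns.publish(**signup_event)
#   sns.subscribe(TopicArn=USER_EVENTS_TOPIC, Protocol="sqs", Endpoint=marketing_queue_arn,
#                 Attributes={"FilterPolicy": marketing_filter})
```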
CodePipeline → SNS
Pipeline state changes (SUCCESS, FAILED, STOPPED) publish to an SNS topic automatically; no custom code needed.
SNS → Email
Dev team receives a deployment success or failure email immediately after each pipeline run.
SNS → Lambda
On failure: Lambda posts to Slack, creates a Jira ticket, or triggers a rollback, all driven by the same SNS event.
Scenario
GuardDuty detects suspicious activity (e.g., root account login from unusual IP). This security event must reach multiple systems simultaneously for rapid response.
Architecture
- GuardDuty → EventBridge rule → SNS topic security-alerts
- SNS → SMS → Security team paged immediately
- SNS → Lambda → Automatically disable the IAM user
- SNS → SQS → Incident response queue for ticketing system
- SNS → Email → CISO notification
SNS appears in every category of AWS architecture, from e-commerce and DevOps to security and serverless pipelines, always solving the same core problem: broadcasting one event to many consumers simultaneously.
Security & Reliability
IAM Policies
Control which IAM users, roles, and services can call sns:Publish, sns:Subscribe, and sns:CreateTopic. Applied to the caller (publisher/subscriber).
- Attach to EC2 instance roles, Lambda execution roles
- Least-privilege: only allow sns:Publish on specific topic ARN
- Deny publish from untrusted accounts
Topic Policies (Resource Policies)
Attached directly to the SNS topic; controls who can access the topic from outside. Essential for cross-account publishing.
- Allow other AWS accounts to publish to your topic
- Allow specific AWS services (S3, CloudWatch) to publish
- Deny all except specific principals
| Scenario | Use IAM Policy | Use Topic Policy |
|---|---|---|
| IAM user/role in same account publishes | Yes: attach to role | Not required (optional) |
| Another AWS account publishes to your topic | Not sufficient alone | Yes: must allow in topic policy |
| AWS service (S3, CloudWatch) publishes | Not required | Yes: trust the service principal |
| Restrict which topics a role can publish to | Yes: specify ARN in IAM | Not the right tool |
Scenario: Account A (producer) needs to publish to an SNS topic owned by Account B. IAM policy alone is insufficient; you need both.
Account B: Topic Policy (owner)
{
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "AWS": "arn:aws:iam::ACCOUNT-A:root"
    },
    "Action": "sns:Publish",
    "Resource": "arn:aws:sns:us-east-1:ACCOUNT-B:MyTopic"
  }]
}
Account A: IAM Role Policy (producer)
{
  "Statement": [{
    "Effect": "Allow",
    "Action": "sns:Publish",
    "Resource": "arn:aws:sns:us-east-1:ACCOUNT-B:MyTopic"
  }]
}
KMS encryption adds a step: if the topic uses SSE with a custom CMK, Account B's KMS key policy must also grant Account A kms:Decrypt and kms:GenerateDataKey; otherwise publish will fail with an access denied error on the KMS key, not SNS.
Encryption at Rest (SSE)
SNS supports Server-Side Encryption using AWS KMS. Messages are encrypted when stored in SNS (briefly, during delivery processing).
- Enable on topic creation or update
- Uses AWS managed key (aws/sns) or your own CMK
- Required for compliance (PCI DSS, HIPAA workloads)
Encryption in Transit
All communication with SNS endpoints is over HTTPS/TLS by default. You can enforce HTTPS-only access via a topic policy condition.
- Add condition aws:SecureTransport: true in topic policy
- Denies any HTTP (unencrypted) publish attempts
- Best practice for all production topics
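One common way to express the HTTPS-only rule is a Deny statement that fires when `aws:SecureTransport` is false. The sketch below builds such a statement; the helper name, Sid, and topic ARN are hypothetical, and the statement would be merged into the topic's existing policy rather than replacing it.

```python
from typing import Any, Dict


def deny_insecure_publish_statement(topic_arn: str) -> Dict[str, Any]:
    """Topic policy statement that rejects Publish calls not made over TLS."""
    return {
        "Sid": "DenyPublishWithoutTLS",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "sns:Publish",
        "Resource": topic_arn,
        # The condition key is false for requests that did not use HTTPS.
        "Condition": {"Bool": {"aws:SecureTransport": "false"}},
    }


tls_statement = deny_insecure_publish_statement(
    "arn:aws:sns:us-east-1:123456789012:ops-alerts"  # hypothetical topic ARN
)
```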
By default, SNS API calls route over the public internet (even with TLS). EC2 instances in private subnets without a NAT Gateway cannot reach SNS. The solution is a VPC Interface Endpoint (PrivateLink).
Benefits
- Traffic stays entirely within the AWS network and never touches the internet
- No Internet Gateway or NAT Gateway required
- Lower latency: no internet hop
- Endpoint policy can restrict which topics can be published to from the VPC
Setup Steps
- Create interface VPC endpoint: service com.amazonaws.{region}.sns
- Enable DNS resolution on the endpoint (uses existing SNS SDK URLs)
- Attach security group allowing outbound HTTPS (443) from compute
- Optionally add endpoint policy to restrict sns:Publish to specific topic ARNs
- Cost: ~$0.01/hr per AZ + data processing
Scenario: "EC2 instances in a private subnet need to publish to SNS. The company requires traffic to stay within AWS, with no internet exposure." Answer: Create a VPC Interface Endpoint for SNS. This is the same PrivateLink (interface endpoint) pattern used for services such as SSM; S3 gateway endpoints solve a similar problem but work through route tables instead.
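The setup steps for the interface endpoint map onto a single EC2 API call. This sketch builds the arguments for it, using placeholder VPC, subnet, and security group IDs that are purely illustrative; nothing is created when the code runs.

```python
from typing import Any, Dict, List


def build_sns_endpoint_kwargs(
    region: str, vpc_id: str, subnet_ids: List[str], security_group_ids: List[str]
) -> Dict[str, Any]:
    """Arguments for creating an interface VPC endpoint for SNS."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.sns",
        "SubnetIds": subnet_ids,
        "SecurityGroupIds": security_group_ids,
        # Private DNS lets existing SDK code keep using the normal SNS hostname.
        "PrivateDnsEnabled": True,
    }


endpoint_request = build_sns_endpoint_kwargs(
    region="us-east-1",
    vpc_id="vpc-0example",              # placeholder IDs
    subnet_ids=["subnet-0example"],
    security_group_ids=["sg-0example"],
)

# With boto3 this would be passed as:
#   boto3.client("ec2").create_vpc_endpoint(**endpoint_request)
```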
SNS retries failed deliveries automatically, but the behavior depends on the subscriber type:
| Subscriber Type | Retry Behaviour | DLQ Support |
|---|---|---|
| SQS | Up to 100,015 attempts over about 23 days (very reliable; SQS is durable) | Yes |
| Lambda | Up to 100,015 attempts over about 23 days | Yes |
| HTTP / HTTPS | 50 attempts over about 6 hours by default, with backoff; adjustable via a delivery policy | Yes |
| Email | Best-effort: limited retries, no DLQ | No |
| SMS | Best-effort: limited retries based on carrier | No |
Dead-Letter Queues (DLQ) on SNS subscriptions capture messages that could not be delivered after all retries. Always configure a DLQ on SQS and Lambda subscriptions in production so no event is silently dropped.
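A DLQ is attached to a subscription through its RedrivePolicy attribute. The following sketch builds the arguments for that call, assuming hypothetical subscription and queue ARNs; the DLQ itself is an ordinary SQS queue whose policy must allow SNS to send to it.

```python
import json
from typing import Dict


def build_redrive_kwargs(subscription_arn: str, dlq_arn: str) -> Dict[str, str]:
    """Arguments that point a subscription's failed deliveries at an SQS DLQ."""
    return {
        "SubscriptionArn": subscription_arn,
        "AttributeName": "RedrivePolicy",
        "AttributeValue": json.dumps({"deadLetterTargetArn": dlq_arn}),
    }


redrive_request = build_redrive_kwargs(
    # Hypothetical ARNs for illustration only.
    subscription_arn="arn:aws:sns:us-east-1:123456789012:order-placed:11111111-2222-3333-4444-555555555555",
    dlq_arn="arn:aws:sqs:us-east-1:123456789012:order-placed-dlq",
)

# With boto3 this would be applied as:
#   boto3.client("sns").set_subscription_attributes(**redrive_request)
```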
SNS publishes these metrics to CloudWatch automatically. Set alarms on NumberOfNotificationsFailed as your primary health signal:
| Metric | What It Measures | Recommended Action |
|---|---|---|
| NumberOfNotificationsDelivered | Messages successfully delivered to subscribers | Baseline: watch for unexpected drops |
| NumberOfNotificationsFailed | Delivery failures after all retries exhausted | Alarm if > 0 sustained for 5 min |
| NumberOfNotificationsFilteredOut | Messages blocked by subscription filter policies | Monitor for unexpected filtering spikes |
| NumberOfMessagesPublished | Total messages published to the topic | Baseline for capacity planning |
| PublishSize | Size of published messages (bytes) | Alarm if frequently near 256 KB limit |
Recommended Alarm
- Metric: NumberOfNotificationsFailed
- Statistic: Sum
- Period: 5 minutes
- Threshold: > 0
- Action: SNS → email ops team
Transient single failures are normal. Alarm on sustained failures only.
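The recommended alarm translates fairly directly into CloudWatch alarm settings. This is a sketch with hypothetical topic names and ARNs; the choice of one evaluation period and of treating missing data as not breaching are assumptions, and you may want more evaluation periods to ignore transient failures.

```python
from typing import Any, Dict


def build_failed_delivery_alarm(topic_name: str, ops_topic_arn: str) -> Dict[str, Any]:
    """CloudWatch alarm settings for failed SNS deliveries on one topic."""
    return {
        "AlarmName": f"{topic_name}-delivery-failures",
        "Namespace": "AWS/SNS",
        "MetricName": "NumberOfNotificationsFailed",
        "Dimensions": [{"Name": "TopicName", "Value": topic_name}],
        "Statistic": "Sum",
        "Period": 300,               # 5 minutes, in seconds
        "EvaluationPeriods": 1,      # assumption: raise this to require sustained failures
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",  # assumption: no data means no failures
        "AlarmActions": [ops_topic_arn],
    }


alarm_request = build_failed_delivery_alarm(
    "order-placed",
    "arn:aws:sns:us-east-1:123456789012:ops-alerts",  # hypothetical ops topic
)

# With boto3 this would be created as:
#   boto3.client("cloudwatch").put_metric_alarm(**alarm_request)
```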
Enable Delivery Status Logging
- Activate in SNS topic settings per subscriber protocol
- HTTP/S → logs request/response body
- Lambda → logs invocation results
- SQS → logs delivery attempts and failures
- Logs go to CloudWatch Logs; enable in topic configuration
What SNS Does NOT Do
- Does not persist messages after delivery (unlike SQS)
- Does not allow consumers to replay past messages
- Does not guarantee delivery to unavailable subscribers (without SQS)
- Does not support message visibility or polling
Design Patterns to Compensate
- Pair with SQS for durable buffering of every event
- Configure DLQ on subscriptions to catch delivery failures
- Use Kinesis Firehose subscriber to archive all events to S3
- Enable CloudWatch metrics on SNS to monitor delivery failures
| Best Practice | Why It Matters |
|---|---|
| Enable SSE (KMS encryption) on topics with sensitive data | Protects messages at rest; required for regulated industries |
| Enforce HTTPS-only via topic policy (aws:SecureTransport) | Prevents plaintext message interception in transit |
| Use least-privilege IAM for publishers | Limits blast radius if a service is compromised |
| Configure DLQ on all SQS and Lambda subscriptions | Prevents silent message loss on delivery failure |
| Use topic policies for cross-account access (not IAM alone) | IAM alone is insufficient for cross-account publish |
| Monitor NumberOfNotificationsFailed in CloudWatch | Alerts you to delivery failures before they become data-loss events |
- Cross-account SNS publish → requires a topic policy allowing the external account, not just IAM
- Messages silently dropped? → configure DLQ on the subscription
- Compliance / encryption at rest → enable SSE with KMS on the topic
- Consumer down, messages lost from SNS? → add SQS queue between SNS and the consumer
Pricing & Cost Optimization
| Component | Price | Notes |
|---|---|---|
| API requests | $0.50 / 1M requests | Publish, Subscribe, Unsubscribe all count |
| Data transfer out (to internet) | Varies (~$0.09/GB) | SQS/Lambda delivery within AWS = free |
| SMS (USA) | ~$0.00645 / message | Transactional tier; promotional is cheaper |
| Email / Email-JSON | $2.00 / 100,000 notifications | Direct SNS email; use SES for high volume |
| FIFO topics | Priced separately from Standard | Billed per publish and per delivered message, plus payload size |
| Workload | Messages/Day | Approx. Monthly Cost |
|---|---|---|
| Small app notifications | 10,000 | ~$0.15 |
| Medium e-commerce (3 subscribers) | 1,000,000 | ~$45 (3M deliveries/day) |
| Large IoT telemetry (2 subscribers) | 50,000,000 | ~$1,500 |
| SMS alerts (US, 10K/day) | 10,000 SMS | ~$1,900 SMS + ~$0.15 API |
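The arithmetic behind these estimates can be sketched as follows. It assumes a 30-day month and, like the table, treats each delivery as one billed unit at the request rate, which is a simplification; the function names are hypothetical and real bills depend on the current price list.

```python
REQUEST_PRICE_PER_MILLION = 0.50   # USD, from the pricing table above
SMS_PRICE_PER_MESSAGE = 0.00645    # USD, US transactional rate from the table


def monthly_request_cost(messages_per_day: int, billed_units_per_message: int = 1, days: int = 30) -> float:
    """Estimated monthly cost for request-priced traffic, rounded to cents."""
    billed = messages_per_day * billed_units_per_message * days
    return round(billed / 1_000_000 * REQUEST_PRICE_PER_MILLION, 2)


def monthly_sms_cost(sms_per_day: int, days: int = 30) -> float:
    """Estimated monthly SMS charge, rounded to cents."""
    return round(sms_per_day * days * SMS_PRICE_PER_MESSAGE, 2)


small_app = monthly_request_cost(10_000)
medium_ecommerce = monthly_request_cost(1_000_000, billed_units_per_message=3)
large_iot = monthly_request_cost(50_000_000, billed_units_per_message=2)
sms_alerts = monthly_sms_cost(10_000)
```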
Batch Publishes
Use PublishBatch API (up to 10 messages per call). Reduces API requests, and therefore cost, by up to 90% for high-volume publishers.
Use Filter Policies
Fewer deliveries = fewer Lambda invocations and SQS receives. Filter evaluation is free; only matching deliveries cost money.
Minimize Cross-Region
Publish in same region as subscribers. Cross-region data transfer adds egress charges on top of SNS API costs.
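Batching mostly comes down to grouping messages into chunks of at most 10 entries, each with an ID that is unique within its batch. A minimal sketch follows, with a hypothetical helper name and example payloads; it prepares the entries only and does not call AWS.

```python
from typing import Dict, List

MAX_BATCH_SIZE = 10  # PublishBatch accepts up to 10 entries per call


def to_publish_batches(messages: List[str], batch_size: int = MAX_BATCH_SIZE) -> List[List[Dict[str, str]]]:
    """Split message bodies into lists of batch entries."""
    batches = []
    for start in range(0, len(messages), batch_size):
        chunk = messages[start:start + batch_size]
        entries = [
            {"Id": str(start + offset), "Message": body}
            for offset, body in enumerate(chunk)
        ]
        batches.append(entries)
    return batches


batches = to_publish_batches([f"event-{i}" for i in range(25)])
api_calls_saved = 25 - len(batches)

# With boto3 each batch would be sent as:
#   sns.publish_batch(TopicArn=topic_arn, PublishBatchRequestEntries=entries)
```

Here 25 messages go out in 3 calls instead of 25; callers should still inspect the batch response for per-entry failures.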
- SNS is a managed pub/sub service: one publisher, many subscribers, push-based fan-out
- Core building blocks: Topic, Publisher, Subscriber, Message (256 KB), Filter Policies with JSON attribute matching
- FIFO topics: strict ordering, 300-3,000 msg/sec, SQS FIFO subscribers only
- SNS + SQS fan-out is the gold-standard pattern: SNS broadcasts, SQS adds durability and retry
- SNS vs SQS vs EventBridge: broadcast vs buffer vs route; they complement each other
- Use cases: order pipelines, CloudWatch alarms, microservice events, CI/CD, security alerts
- Reliability: retries per subscriber type, DLQ, CloudWatch metrics (NumberOfNotificationsFailed), delivery logging
- Security: IAM + topic policies, cross-account requires both, SSE (KMS), VPC Endpoint for private subnets
- Cost: $0.50/1M API requests; batch publishes + filter policies = key optimizations