Other Storage Services
Glacier & FSx
High-level overview of Amazon S3 Glacier (cheapest archive storage) and Amazon FSx (managed enterprise file systems), including when to choose them over S3, EBS, or EFS.
Amazon S3 Glacier
Amazon S3 Glacier is archive storage, the cheapest way to store data in AWS when you rarely need to access it. Think of it as a deep-freeze warehouse: incredibly cheap to store boxes, but it takes time and costs money to retrieve one.
Think of S3 Glacier as: A cold storage warehouse. You pay a fraction of a cent per GB per month to store data indefinitely, but (outside Instant Retrieval) you wait minutes to hours to get it back.
S3 Glacier is not a separate service; it is a set of S3 storage classes. Your data still lives in an S3 bucket; you just choose a Glacier storage class (or lifecycle rules move it there automatically). The S3 API works the same way, with one exception: objects in Flexible Retrieval and Deep Archive must be restored before they can be read. The main differences are cost and retrieval time.
| Storage Class | Cost (us-east-1) | Retrieval Options | First Byte Latency | Best For |
|---|---|---|---|---|
| S3 Glacier Instant Retrieval | ~$0.004/GB/month | Milliseconds (like Standard) | Milliseconds | Medical images, news media: accessed about once a quarter but needing instant access. Minimum billable object size: 128 KB. |
| S3 Glacier Flexible Retrieval | ~$0.0036/GB/month | Expedited (1-5 min), Standard (3-5 hr), Bulk (5-12 hr) | Minutes to hours | Backup archives, compliance data: accessed 1-2× per year |
| S3 Glacier Deep Archive | ~$0.00099/GB/month | Standard (12 hr), Bulk (48 hr) | 12-48 hours | Regulatory compliance (7+ years), disaster recovery copies |
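As a rough illustration of how the table can drive a choice, the Python sketch below picks the cheapest Glacier class whose fastest retrieval option fits an acceptable wait. The prices and times are the approximate us-east-1 figures from the table; `GlacierClass` and `cheapest_class` are hypothetical names invented for this example, not part of any AWS SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GlacierClass:
    name: str
    usd_per_gb_month: float         # approximate us-east-1 price from the table
    fastest_retrieval_hours: float  # worst-case wait of the fastest retrieval option


# Figures copied from the table above; treat them as approximate.
CLASSES = [
    GlacierClass("S3 Glacier Instant Retrieval", 0.004, 0.0),
    GlacierClass("S3 Glacier Flexible Retrieval", 0.0036, 5 / 60),  # Expedited: up to 5 min
    GlacierClass("S3 Glacier Deep Archive", 0.00099, 12.0),         # Standard: up to 12 hr
]


def cheapest_class(max_wait_hours: float) -> Optional[GlacierClass]:
    """Return the cheapest class whose fastest retrieval fits the acceptable wait."""
    candidates = [c for c in CLASSES if c.fastest_retrieval_hours <= max_wait_hours]
    if not candidates:
        return None
    return min(candidates, key=lambda c: c.usd_per_gb_month)
```

For example, `cheapest_class(12)` selects Deep Archive, `cheapest_class(1)` selects Flexible Retrieval, and `cheapest_class(0)` selects Instant Retrieval. The sketch compares storage price only; it deliberately ignores retrieval fees and minimum-size or minimum-duration charges.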
S3 Standard = Your Desk
- Files right in front of you: instant access
- Expensive real estate (you're paying for prime location)
- Use for files you touch daily/weekly
Glacier = Off-Site Warehouse
- Dirt cheap to store boxes: a fraction of a cent per GB/month
- Takes time to retrieve (truck has to drive to your office)
- Use for files you touch rarely (1×/year or less)
Lifecycle Rules (Most Common)
S3 lifecycle policy: "Move objects to Glacier Flexible Retrieval after 90 days, Deep Archive after 365 days." Fully automatic.
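The quoted policy could be expressed roughly as follows. This is a minimal sketch that only builds the configuration document as a Python dict; the rule ID and the `archive_lifecycle_rule` helper are hypothetical names, and `GLACIER` is the API value for the Flexible Retrieval class.

```python
import json


def archive_lifecycle_rule(prefix: str = "") -> dict:
    """Build a lifecycle configuration: Flexible Retrieval at 90 days, Deep Archive at 365."""
    return {
        "Rules": [
            {
                "ID": "archive-after-90-days",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},   # empty prefix applies to the whole bucket
                "Transitions": [
                    {"Days": 90, "StorageClass": "GLACIER"},        # Flexible Retrieval
                    {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    }


if __name__ == "__main__":
    # Print the document so it can be inspected before it is applied.
    print(json.dumps(archive_lifecycle_rule("logs/"), indent=2))
```

Assuming boto3 is installed and credentials are configured, a dict of this shape is what the S3 client's `put_bucket_lifecycle_configuration` call accepts as its `LifecycleConfiguration` argument. The `Days` values count from object creation.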
Direct Upload
Upload directly to a Glacier storage class: PUT with x-amz-storage-class: GLACIER. For data you know is archival from day one.
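A direct upload might look like the sketch below. It assumes `s3_client` behaves like boto3's S3 client, whose `put_object` call takes a `StorageClass` argument that is sent as the `x-amz-storage-class` header. The `upload_as_archive` wrapper and its validation set are illustrative choices, not an AWS-provided helper.

```python
from typing import Any

# API names of the three Glacier storage classes.
GLACIER_CLASSES = {"GLACIER_IR", "GLACIER", "DEEP_ARCHIVE"}


def upload_as_archive(s3_client: Any, bucket: str, key: str, body: bytes,
                      storage_class: str = "GLACIER") -> Any:
    """Upload an object straight into a Glacier storage class."""
    if storage_class not in GLACIER_CLASSES:
        raise ValueError(f"not a Glacier storage class: {storage_class}")
    return s3_client.put_object(
        Bucket=bucket, Key=key, Body=body, StorageClass=storage_class
    )
```

With boto3, a call could read `upload_as_archive(boto3.client("s3"), "example-bucket", "backups/2020.tar", data, "DEEP_ARCHIVE")`, where the bucket and key are placeholders.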
S3 Intelligent-Tiering
Let S3 automatically move objects between tiers (including Archive and Deep Archive access tiers) based on access patterns.
For Glacier Flexible Retrieval, you choose retrieval speed at restore time; faster is more expensive:
| Retrieval Tier | Time | Cost (per GB) | Use When |
|---|---|---|---|
| Expedited | 1-5 minutes | $0.03/GB | Urgent: need a specific file NOW (rare emergency) |
| Standard | 3-5 hours | $0.01/GB | Normal retrieval: can wait a few hours |
| Bulk | 5-12 hours | Free | Large batch restores: cheapest option, used for petabyte-scale recoveries |
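The retrieval-tier table can be turned into a small selection rule: take the cheapest tier whose worst-case time still meets the deadline. The sketch below does that and builds the body of a restore request. The function names are hypothetical; the `Days` and `GlacierJobParameters.Tier` fields follow the shape of the S3 `RestoreObject` request as I understand it.

```python
# Worst-case restore times for Glacier Flexible Retrieval, from the table above.
# Listed cheapest first, so the first tier that meets the deadline wins.
TIERS_CHEAPEST_FIRST = [
    ("Bulk", 12.0),         # 5-12 hours
    ("Standard", 5.0),      # 3-5 hours
    ("Expedited", 5 / 60),  # 1-5 minutes
]


def pick_tier(deadline_hours: float) -> str:
    """Return the cheapest tier whose worst-case time fits the deadline."""
    for tier, worst_case_hours in TIERS_CHEAPEST_FIRST:
        if worst_case_hours <= deadline_hours:
            return tier
    raise ValueError("no Flexible Retrieval tier is fast enough")


def restore_request(deadline_hours: float, keep_days: int = 7) -> dict:
    """Build the RestoreRequest body for a restore call."""
    return {
        "Days": keep_days,  # how long the restored copy stays readable
        "GlacierJobParameters": {"Tier": pick_tier(deadline_hours)},
    }
```

With boto3, the result would be passed as `s3.restore_object(Bucket=..., Key=..., RestoreRequest=restore_request(6))`. A six-hour deadline selects the Standard tier, while a full day selects Bulk.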
Exam tip: "Cheapest storage with 12-hour retrieval acceptable" → S3 Glacier Deep Archive. "Archive but need millisecond access once per quarter" → S3 Glacier Instant Retrieval. Know the retrieval times; they are exam favorites.
Use S3 Glacier When
- Compliance/regulatory archives (HIPAA, SOX; 7+ year retention)
- Database backups older than 90 days
- Media archives (raw video footage, old photography)
- Disaster recovery copies you hope to never use
- Scientific data sets rarely reanalyzed
- Log archives beyond your active analysis window
Do NOT Use Glacier When
- You need instant access to all data (use S3 Standard)
- Retrieval time matters for user experience
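- Data is accessed weekly or more (S3 Standard-IA or Standard is cheaper overall)
- Objects are small (<128 KB): Glacier Instant Retrieval bills a 128 KB minimum per object, and Flexible Retrieval and Deep Archive add 40 KB of billed metadata per object
- Objects are short-lived: 90-day (Instant, Flexible) or 180-day (Deep Archive) minimum storage charges apply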
| Feature | S3 Standard | S3 Glacier Flexible | S3 Glacier Deep Archive |
|---|---|---|---|
| Storage Cost | $0.023/GB/mo | $0.0036/GB/mo (84% cheaper) | $0.00099/GB/mo (96% cheaper) |
| Retrieval Time | Milliseconds | 1 min to 12 hours | 12 to 48 hours |
| Retrieval Cost | Free (just GET request) | Free (Bulk) to $0.03/GB (Expedited) | $0.0025/GB (Bulk) to $0.02/GB (Standard) |
| Min Storage Duration | None | 90 days | 180 days |
| Per-Object Metadata Overhead (billed) | None | 40 KB | 40 KB |
| Request Cost (retrieval) | $0.0004/1K GET requests | $10/1K (Expedited), $0.05/1K (Standard), free (Bulk) | $0.10/1K (Standard), $0.025/1K (Bulk) |
| Access Pattern | Any frequency | 1-2× per year | Once per year or less |
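The storage-cost and minimum-duration rows can be checked with a few lines of arithmetic. The sketch below uses the approximate us-east-1 storage prices from the table and reproduces the 84% and 96% savings figures. It covers storage charges only and ignores request and retrieval fees; the function names are made up for this example.

```python
# Approximate us-east-1 storage prices (USD per GB-month) and minimum
# storage durations (days), copied from the comparison table above.
PRICE = {"STANDARD": 0.023, "GLACIER": 0.0036, "DEEP_ARCHIVE": 0.00099}
MIN_DAYS = {"STANDARD": 0, "GLACIER": 90, "DEEP_ARCHIVE": 180}


def monthly_cost(gb: float, storage_class: str) -> float:
    """Storage-only cost for one month; ignores requests and retrieval."""
    return gb * PRICE[storage_class]


def savings_vs_standard(storage_class: str) -> int:
    """Whole-percent storage saving relative to S3 Standard."""
    return round(100 * (1 - PRICE[storage_class] / PRICE["STANDARD"]))


def billed_days(days_stored: int, storage_class: str) -> int:
    """Days charged for an object, applying the minimum storage duration."""
    return max(days_stored, MIN_DAYS[storage_class])
```

For 10,000 GB, this works out to roughly $230 per month in Standard, $36 in Flexible Retrieval, and $9.90 in Deep Archive. `billed_days(30, "GLACIER")` returns 90, which is why archiving short-lived data rarely pays off.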
S3 Glacier = cheapest storage in AWS for data you rarely access. Three classes: Instant (ms), Flexible (min-hrs), Deep Archive (12-48 hrs). Data moves there via lifecycle rules automatically. Know the retrieval times for the exam.
- Not a separate service: it is a set of S3 storage classes. Same buckets, same API, different pricing.
- Three classes: Glacier Instant (ms access, $0.004/GB), Glacier Flexible (min-hrs, $0.0036/GB), Deep Archive (12-48 hrs, $0.00099/GB).
- Lifecycle rules: automate moves from Standard → IA → Glacier → Deep Archive based on object age (days since creation), not on last access. Access-based tiering is what S3 Intelligent-Tiering does.
- Retrieval tiers (Flexible): Expedited (1-5 min, $0.03/GB), Standard (3-5 hr, $0.01/GB), Bulk (5-12 hr, free).
- Minimum storage charges: 90 days (Flexible), 180 days (Deep Archive). Don't archive short-lived data.
- Exam pattern: "Cheapest + 12 hr retrieval OK" → Deep Archive. "Archive + instant access quarterly" → Glacier Instant.
Amazon FSx
Amazon FSx provides fully managed third-party file systems. Unlike EFS (which is NFS only), FSx gives you enterprise-grade file systems: Windows File Server (SMB), Lustre (HPC), NetApp ONTAP (multi-protocol), and OpenZFS, all fully managed by AWS. You get the performance and features of these file systems without building and maintaining the infrastructure yourself.
Think of FSx as: AWS saying "You want a Windows file server? A Lustre cluster? A NetApp NAS? We'll run it for you, fully managed, in your VPC."
Each FSx variant solves a different problem. The key exam skill is knowing which FSx to choose based on the workload description.
FSx for Windows File Server
- Protocol: SMB (Server Message Block)
- OS: Windows + Linux (via SMB)
- AD Integration: Native Active Directory
- Features: DFS namespaces, shadow copies, quotas
- Use case: Windows file shares, .NET apps, SQL Server, home dirs
- Performance: Up to 2 GB/s, millions of IOPS
FSx for Lustre
- Protocol: Lustre (POSIX-compliant, Linux)
- OS: Linux only
- Performance: Sub-millisecond latency, 100s GB/s throughput
- S3 integration: Transparently linked to S3 bucket
- Use case: HPC, ML training, video rendering, genomics
- Key: Fastest file system on AWS, built for compute-heavy bursts
FSx for NetApp ONTAP
- Protocol: NFS, SMB, iSCSI (multi-protocol)
- OS: Linux, Windows, macOS
- Features: Data dedup, compression, snapshots, clones, tiering
- Use case: Enterprise NAS, VMware migration, multi-protocol
- Key: Most feature-rich; NetApp without the hardware
FSx for OpenZFS
- Protocol: NFS (v3, v4, v4.1, v4.2)
- OS: Linux, Windows (via NFS), macOS
- Features: Snapshots, clones, compression, up to 1M IOPS
- Use case: Linux workloads migrating from on-prem ZFS
- Key: Drop-in replacement for self-managed ZFS/NFS servers
FSx Lustre: Deployment Types
- Scratch: Temporary, no data replication, highest burst throughput. Cheapest. Data lost if server fails. Use for short-term HPC jobs where data is in S3 anyway.
- Persistent: Data replicated within AZ, consistent performance. Use for long-running workloads, ML training pipelines, production analytics.
FSx Windows: High Availability
- Single-AZ: Standard deployment. Use for dev/test or non-critical workloads. Cheaper.
- Multi-AZ: Active/standby across two AZs. Automatic failover (<60 seconds). ~50% more expensive. Use for production.
FSx pricing model: FSx uses provisioned capacity. You choose a storage size (e.g., 2 TB) and pay for it whether used or not. This is different from EFS's elastic pay-per-GB-used model. FSx is better for predictable capacity; EFS is better for unpredictable/spiky storage needs. Only EFS supports Lambda mounting; no FSx variant works with Lambda.
Quick decision: "Windows/SMB/AD" → FSx Windows. "HPC/ML/fastest throughput" → FSx Lustre. "Multi-protocol enterprise NAS" → FSx ONTAP. "ZFS migration / Linux NFS with high IOPS" → FSx OpenZFS. "Shared Linux NFS, simple" → EFS (not FSx).
FSx for Lustre has a unique feature: it can be transparently linked to an S3 bucket. Data in S3 appears as files in the Lustre filesystem. Compute instances read from Lustre at extreme speed (100s GB/s), while the data source remains S3.
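To make the S3 link concrete, here is a hedged sketch of the arguments one might pass when creating a scratch Lustre file system that imports from a bucket. The parameter names reflect my understanding of the FSx `CreateFileSystem` API and should be verified against the current reference. The bucket name, subnet ID, and the `lustre_linked_to_s3` helper are placeholders.

```python
def lustre_linked_to_s3(bucket: str, subnet_id: str, capacity_gib: int = 1200) -> dict:
    """Build arguments for an FSx for Lustre scratch file system linked to S3."""
    return {
        "FileSystemType": "LUSTRE",
        "StorageCapacity": capacity_gib,     # provisioned size in GiB
        "SubnetIds": [subnet_id],            # Lustre runs in a single AZ
        "LustreConfiguration": {
            "DeploymentType": "SCRATCH_2",   # temporary; the data of record stays in S3
            "ImportPath": f"s3://{bucket}",  # bucket whose objects appear as files
        },
    }
```

Assuming boto3 and suitable permissions, the call would be along the lines of `boto3.client("fsx").create_file_system(**lustre_linked_to_s3("example-dataset-bucket", "subnet-0123456789abcdef0"))`. A scratch deployment is used here because the source data remains in S3; a persistent deployment would suit longer-running workloads.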
Use FSx When
- Windows/SMB required → FSx for Windows (not EFS)
- HPC / ML needing extreme throughput → FSx Lustre (100s GB/s)
- Multi-protocol (NFS+SMB+iSCSI) → FSx ONTAP
- Migrating NetApp/ZFS from on-premises → FSx ONTAP or OpenZFS
- Need sub-millisecond latency on shared filesystem → FSx Lustre or OpenZFS
- Active Directory integration → FSx Windows
Do NOT Use FSx When
- Simple shared Linux NFS → use EFS (simpler, elastic, cheaper for basic use)
- Object storage / data lake → use S3
- Single-instance block storage → use EBS
- Archive/cold storage → use S3 Glacier
- Serverless (Lambda) storage → use EFS (FSx not supported with Lambda)
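The two lists above amount to a short decision procedure. One possible encoding, as a sketch: the flag names and the order of the checks are my own assumptions (most restrictive requirement first), not an official AWS rule.

```python
def choose_file_storage(*, needs_lambda: bool = False, multi_protocol: bool = False,
                        needs_smb_or_ad: bool = False, hpc_throughput: bool = False,
                        zfs_migration: bool = False) -> str:
    """Map workload traits to a file storage service, most restrictive first."""
    if needs_lambda:
        return "EFS"                          # per the text, only EFS mounts in Lambda
    if multi_protocol:
        return "FSx for NetApp ONTAP"         # NFS + SMB + iSCSI
    if needs_smb_or_ad:
        return "FSx for Windows File Server"  # SMB with native Active Directory
    if hpc_throughput:
        return "FSx for Lustre"               # HPC / ML throughput
    if zfs_migration:
        return "FSx for OpenZFS"              # replaces self-managed ZFS/NFS servers
    return "EFS"                              # simple shared Linux NFS
```

For instance, `choose_file_storage(needs_smb_or_ad=True)` returns "FSx for Windows File Server", and calling it with no flags falls through to EFS, matching the "simple shared Linux NFS" case.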
| Feature | EFS | FSx Windows | FSx Lustre | FSx ONTAP |
|---|---|---|---|---|
| Protocol | NFS v4.1 | SMB | Lustre/POSIX | NFS+SMB+iSCSI |
| OS | Linux only | Windows (+Linux SMB) | Linux only | All |
| Performance | Good (ms latency) | Good (ms) | Extreme (sub-ms, 100s GB/s) | Good (ms) |
| Capacity | Elastic | Provisioned | Provisioned | Provisioned (auto-tiering) |
| S3 Link | No | No | Yes (transparent) | No (FlexCache) |
| Multi-AZ | Yes (default) | Optional | No (single AZ) | Yes |
| Lambda | Yes | No | No | No |
| Best For | Shared Linux, containers | Windows apps, AD | HPC, ML, rendering | Enterprise NAS |
| Pricing | Pay per GB used | Per GB provisioned | Per GB provisioned | Per GB provisioned |
| Scenario | Answer | Why |
|---|---|---|
| "Windows file shares with Active Directory for 500 users" | FSx for Windows | SMB + native AD integration. EFS doesn't support Windows/SMB. |
| "ML training: 200 GPU instances need to read 50 TB dataset from S3 fast" | FSx for Lustre | Linked to S3, with reads at 100s of GB/s. Lustre = fastest file system on AWS. |
| "Migrate on-prem NetApp to AWS, need NFS + SMB access" | FSx for NetApp ONTAP | Multi-protocol, same NetApp features (dedup, snapshots, FlexClone). |
| "Simple shared NFS for 10 EC2 Linux instances behind ALB" | EFS (not FSx) | Simple use case. EFS is simpler, elastic, supports Lambda. No need for FSx. |
| "Shared storage for Lambda functions" | EFS | Only EFS supports Lambda mounting. No FSx variant works with Lambda. |
| "Linux workloads migrating from on-prem ZFS servers" | FSx for OpenZFS | Drop-in replacement. Same ZFS features (snapshots, clones, compression). |
FSx = managed enterprise file systems. Four variants: Windows (SMB/AD), Lustre (HPC/fastest), ONTAP (multi-protocol NAS), OpenZFS (ZFS migration). If the question says "Windows" → FSx Windows. "HPC/ML throughput" → FSx Lustre. "Simple shared Linux" → EFS. "Lambda" → EFS only.
- FSx for Windows: SMB protocol, native Active Directory, DFS. For Windows workloads, .NET, SQL Server, SharePoint.
- FSx for Lustre: fastest file system on AWS (sub-ms, 100s GB/s). Links to S3. For HPC, ML training, genomics, rendering.
- FSx for NetApp ONTAP: multi-protocol (NFS+SMB+iSCSI), dedup, compression, snapshots, auto-tiering to a lower-cost capacity pool tier. Enterprise NAS.
- FSx for OpenZFS: NFS, snapshots, clones, up to 1M IOPS. Drop-in for self-managed ZFS/NFS servers.
- FSx vs EFS: EFS = simple elastic NFS for Linux/Lambda. FSx = specialized enterprise file systems. Don't use FSx for simple NFS.
- Exam key: "Windows/SMB" → FSx Windows. "Fastest/HPC" → Lustre. "Multi-protocol" → ONTAP. "Lambda" → EFS only.