S3 Glacier FSx

Other Storage Services
Glacier & FSx

High-level overview of Amazon S3 Glacier (cheapest archive storage) and Amazon FSx (managed enterprise file systems): when to choose them over S3, EBS, or EFS.

01
Service One

Amazon S3 Glacier

What is S3 Glacier Introductory

Amazon S3 Glacier is archive storage: the cheapest way to store data in AWS when you rarely need to access it. Think of it as a deep-freeze warehouse: incredibly cheap to store boxes, but it takes time and costs money to retrieve one.

👉 Think of S3 Glacier as: A cold storage warehouse. Storage costs a fraction of a cent per GB per month, but you wait minutes to hours to get your data back.

S3 Glacier is not a separate service; it's a set of S3 storage classes. Your data still lives in an S3 bucket; you just choose a Glacier storage class (or lifecycle rules move it there automatically). The S3 API works the same, except that objects in Flexible Retrieval and Deep Archive must be restored before you can read them. The difference is cost and retrieval time.

The Three Glacier Storage Classes Core
| Storage Class | Cost (us-east-1) | Retrieval Options | First Byte Latency | Best For |
|---|---|---|---|---|
| S3 Glacier Instant Retrieval | ~$0.004/GB/month | Milliseconds (like Standard) | Milliseconds | Medical images, news media: accessed 1×/quarter but need instant access. Min object: 128 KB. |
| S3 Glacier Flexible Retrieval | ~$0.0036/GB/month | Expedited (1-5 min), Standard (3-5 hr), Bulk (5-12 hr) | Minutes to hours | Backup archives, compliance data: accessed 1-2×/year |
| S3 Glacier Deep Archive | ~$0.00099/GB/month | Standard (12 hr), Bulk (48 hr) | 12-48 hours | Regulatory compliance (7+ years), disaster recovery copies |
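
One way to read the table is as a lookup: pick the cheapest class whose fastest retrieval option still meets your deadline. The sketch below encodes the approximate prices and the fastest-retrieval upper bounds from the table. `GlacierClass` and `cheapest_class` are hypothetical names used only for illustration; the strings `GLACIER_IR`, `GLACIER`, and `DEEP_ARCHIVE` are the storage-class values the S3 API uses for the three classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlacierClass:
    name: str                       # S3 storage-class value
    usd_per_gb_month: float         # approximate us-east-1 price from the table
    fastest_retrieval_hours: float  # upper bound of the quickest retrieval option


CLASSES = [
    GlacierClass("GLACIER_IR", 0.004, 0.0),       # millisecond access
    GlacierClass("GLACIER", 0.0036, 5 / 60),      # Expedited: up to 5 minutes
    GlacierClass("DEEP_ARCHIVE", 0.00099, 12.0),  # Standard: up to 12 hours
]


def cheapest_class(max_wait_hours: float) -> str:
    """Return the cheapest class that can deliver data within max_wait_hours."""
    candidates = [c for c in CLASSES if c.fastest_retrieval_hours <= max_wait_hours]
    return min(candidates, key=lambda c: c.usd_per_gb_month).name


print(cheapest_class(0))   # needs instant access
print(cheapest_class(12))  # can wait half a day
```

This ignores retrieval fees, which grow with the faster tiers, so treat it as a first filter rather than a full cost model.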
Mental Model: The Trade-Off Introductory
🏠

S3 Standard = Your Desk

  • Files right in front of you: instant access
  • Expensive real estate (you're paying for prime location)
  • Use for files you touch daily/weekly
🏭

Glacier = Off-Site Warehouse

  • Dirt cheap to store boxes: a fraction of a cent per GB/month
  • Takes time to retrieve (truck has to drive to your office)
  • Use for files you touch rarely (1×/year or less)
How Data Gets to Glacier Core
🔄

Lifecycle Rules (Most Common)

S3 lifecycle policy: "Move objects to Glacier Flexible Retrieval after 90 days, Deep Archive after 365 days." Fully automatic.
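
The quoted rule maps onto an S3 lifecycle configuration roughly as follows. This is a minimal sketch that only builds the configuration document. The rule ID, the `build_lifecycle_config` helper, and the bucket name in the commented call are made-up names; the commented line shows how the result would be passed to boto3's `put_bucket_lifecycle_configuration`.

```python
import json


def build_lifecycle_config(prefix: str = "") -> dict:
    """Build a lifecycle configuration: Flexible Retrieval at 90 days, Deep Archive at 365."""
    return {
        "Rules": [
            {
                "ID": "archive-old-objects",    # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},   # empty prefix = whole bucket
                "Transitions": [
                    {"Days": 90, "StorageClass": "GLACIER"},        # Glacier Flexible Retrieval
                    {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},  # Glacier Deep Archive
                ],
            }
        ]
    }


config = build_lifecycle_config(prefix="logs/")
print(json.dumps(config, indent=2))

# With boto3 installed and credentials configured (assumed bucket name):
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-archive-bucket", LifecycleConfiguration=config)
```

The `Days` values count from object creation, so a rule like this archives by age regardless of how often the object is read.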

📤

Direct Upload

Upload directly to a Glacier storage class: PUT with x-amz-storage-class: GLACIER. For data you know is archival from day one.
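
In an SDK, the `x-amz-storage-class` header is usually set through a `StorageClass` parameter. The sketch below only assembles the arguments for a boto3 `put_object` call. The `archive_put_params` helper, bucket name, and key are hypothetical.

```python
ARCHIVE_CLASSES = {"GLACIER_IR", "GLACIER", "DEEP_ARCHIVE"}


def archive_put_params(bucket: str, key: str, body: bytes,
                       storage_class: str = "GLACIER") -> dict:
    """Assemble put_object arguments that write straight into a Glacier class."""
    if storage_class not in ARCHIVE_CLASSES:
        raise ValueError(f"not a Glacier storage class: {storage_class}")
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "StorageClass": storage_class,  # sent as x-amz-storage-class
    }


params = archive_put_params("my-archive-bucket", "backups/2020.tar", b"...")
print(params["StorageClass"])

# With boto3 installed and credentials configured:
# boto3.client("s3").put_object(**params)
```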

📋

S3 Intelligent-Tiering

Let S3 automatically move objects between tiers (including Archive and Deep Archive access tiers) based on access patterns.
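
The Archive and Deep Archive access tiers are opt-in per bucket. A minimal sketch of the configuration document follows. The configuration ID and `build_tiering_config` helper are hypothetical, and the minimum day counts checked here (90 and 180) reflect my understanding of the service limits, so verify them against the current S3 documentation.

```python
MIN_DAYS = {"ARCHIVE_ACCESS": 90, "DEEP_ARCHIVE_ACCESS": 180}  # assumed service minimums


def build_tiering_config(archive_days: int = 90, deep_archive_days: int = 180) -> dict:
    """Build an Intelligent-Tiering configuration that enables both archive tiers."""
    if archive_days < MIN_DAYS["ARCHIVE_ACCESS"]:
        raise ValueError("Archive Access tier needs at least 90 days without access")
    if deep_archive_days < MIN_DAYS["DEEP_ARCHIVE_ACCESS"]:
        raise ValueError("Deep Archive Access tier needs at least 180 days without access")
    return {
        "Id": "archive-cold-objects",  # hypothetical configuration name
        "Status": "Enabled",
        "Tierings": [
            {"Days": archive_days, "AccessTier": "ARCHIVE_ACCESS"},
            {"Days": deep_archive_days, "AccessTier": "DEEP_ARCHIVE_ACCESS"},
        ],
    }


config = build_tiering_config()
print(config["Tierings"])

# With boto3 installed and credentials configured (assumed bucket name):
# boto3.client("s3").put_bucket_intelligent_tiering_configuration(
#     Bucket="my-archive-bucket", Id=config["Id"],
#     IntelligentTieringConfiguration=config)
```

Unlike a lifecycle rule, the day counts here measure time since last access, which is why Intelligent-Tiering suits data with unpredictable access patterns.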

Retrieval Speed vs Cost Core

For Glacier Flexible Retrieval, you choose retrieval speed at restore time. Faster is more expensive:

| Retrieval Tier | Time | Cost (per GB) | Use When |
|---|---|---|---|
| Expedited | 1-5 minutes | $0.03/GB | Urgent: need a specific file NOW (rare emergency) |
| Standard | 3-5 hours | $0.01/GB | Normal retrieval: can wait a few hours |
| Bulk | 5-12 hours | $0.0025/GB | Large batch restores: cheapest, used for petabyte-scale recoveries |
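
To make the trade-off concrete, the sketch below builds the restore request for a chosen tier and estimates the per-GB retrieval charge using the rates in the table above (approximate figures, excluding per-request fees). `restore_request` and `retrieval_cost_usd` are hypothetical helper names; the dictionary shape follows boto3's `restore_object` `RestoreRequest` parameter.

```python
RETRIEVAL_USD_PER_GB = {"Expedited": 0.03, "Standard": 0.01, "Bulk": 0.0025}


def restore_request(tier: str, days: int = 7) -> dict:
    """Build a RestoreRequest: keep the restored copy readable for `days` days."""
    if tier not in RETRIEVAL_USD_PER_GB:
        raise ValueError(f"unknown retrieval tier: {tier}")
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}


def retrieval_cost_usd(size_gb: float, tier: str) -> float:
    """Estimate the data-retrieval charge for restoring size_gb at the given tier."""
    return size_gb * RETRIEVAL_USD_PER_GB[tier]


for tier in RETRIEVAL_USD_PER_GB:
    print(tier, round(retrieval_cost_usd(1000, tier), 2))  # cost to restore 1 TB

# With boto3 installed and credentials configured (assumed bucket and key):
# boto3.client("s3").restore_object(
#     Bucket="my-archive-bucket", Key="backups/2020.tar",
#     RestoreRequest=restore_request("Bulk"))
```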

👉 Exam tip: "Cheapest storage with 12-hour retrieval acceptable" → S3 Glacier Deep Archive. "Archive but need millisecond access once per quarter" → S3 Glacier Instant Retrieval. Know the retrieval times; they're exam favorites.

Concept Diagram Introductory
S3 Storage Classes: Cost vs Access Speed Spectrum
[Diagram: storage gets cheaper from top to bottom; access gets faster from bottom to top]
  • S3 Standard: $0.023/GB, milliseconds
  • S3 Standard-IA: $0.0125/GB, milliseconds
  • Glacier Instant: $0.004/GB, milliseconds
  • Glacier Flexible: $0.0036/GB, minutes to hours
  • Deep Archive: $0.00099/GB, 12-48 hours
AWS Architecture Diagram Core
S3 Lifecycle: Data flows from hot → warm → cold → archive automatically
[Diagram: App/user PUT → S3 Standard (day 0-30, active data, $0.023/GB) → Standard-IA (day 30-90, infrequent access, $0.0125/GB) → Glacier Flexible (day 90-365, archive, $0.0036/GB) → Deep Archive (day 365+, $0.00099/GB). The S3 lifecycle policy makes each move fully automatically. A restore request copies the object back to Standard temporarily.]
When to Use / When NOT to Use Core
✅

Use S3 Glacier When

  • Compliance/regulatory archives (HIPAA, SOX: 7+ year retention)
  • Database backups older than 90 days
  • Media archives (raw video footage, old photography)
  • Disaster recovery copies you hope to never use
  • Scientific data sets rarely reanalyzed
  • Log archives beyond your active analysis window
🚫

Do NOT Use Glacier When

  • You need instant access to all data (use S3 Standard)
  • Retrieval time matters for user experience
  • Data is accessed weekly or more (S3 Standard-IA or Standard is cheaper overall)
  • Many small objects: Glacier Instant Retrieval bills a 128 KB minimum per object, and Flexible Retrieval and Deep Archive add about 40 KB of billed metadata per object
  • Short-lived objects: 90/180 day minimum storage charges apply
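
The minimum-duration charge is easiest to see with numbers. The sketch below bills an object for at least the minimum number of days even if it is deleted earlier. It treats a month as 30 days for simplicity, uses the approximate prices quoted in this section, and `billable_days` and `storage_charge_usd` are hypothetical helper names.

```python
MIN_STORAGE_DAYS = {"GLACIER": 90, "DEEP_ARCHIVE": 180}


def billable_days(storage_class: str, days_stored: int) -> int:
    """Days you pay for: actual days stored, but never less than the class minimum."""
    return max(days_stored, MIN_STORAGE_DAYS.get(storage_class, 0))


def storage_charge_usd(size_gb: float, usd_per_gb_month: float,
                       storage_class: str, days_stored: int) -> float:
    """Approximate storage charge, assuming 30-day months."""
    months = billable_days(storage_class, days_stored) / 30
    return size_gb * usd_per_gb_month * months


# 100 GB kept in Deep Archive for only 30 days is still billed for 180 days.
print(billable_days("DEEP_ARCHIVE", 30))
print(round(storage_charge_usd(100, 0.00099, "DEEP_ARCHIVE", 30), 3))
```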
Comparison with Other Storage Core
| Feature | S3 Standard | S3 Glacier Flexible | S3 Glacier Deep Archive |
|---|---|---|---|
| Storage Cost | $0.023/GB/mo | $0.0036/GB/mo (84% cheaper) | $0.00099/GB/mo (96% cheaper) |
| Retrieval Time | Milliseconds | 1 min to 12 hours | 12 to 48 hours |
| Retrieval Cost | Free (just GET request) | $0.01 to $0.03/GB | $0.02 to $0.05/GB |
| Min Storage Duration | None | 90 days | 180 days |
| Per-Object Metadata Overhead (billed) | None | 40 KB | 40 KB |
| Request Cost (retrieval) | $0.0004/1K requests | $0.05/100 (Expedited), $0.0004/1K (Standard/Bulk) | $0.05/100 (Standard), $0.025/1K (Bulk) |
| Access Pattern | Any frequency | 1-2× per year | Once per year or less |
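
The "84% cheaper" and "96% cheaper" figures follow directly from the per-GB prices. A short sketch that reproduces them and compares the monthly bill for a given archive size, using the approximate prices from the table (`monthly_cost_usd` and `savings_vs_standard_pct` are hypothetical helper names):

```python
USD_PER_GB_MONTH = {"STANDARD": 0.023, "GLACIER": 0.0036, "DEEP_ARCHIVE": 0.00099}


def monthly_cost_usd(size_gb: float, storage_class: str) -> float:
    """Storage-only monthly cost; ignores retrieval and request fees."""
    return size_gb * USD_PER_GB_MONTH[storage_class]


def savings_vs_standard_pct(storage_class: str) -> int:
    """Percentage saved on storage compared with S3 Standard, rounded to a whole number."""
    ratio = USD_PER_GB_MONTH[storage_class] / USD_PER_GB_MONTH["STANDARD"]
    return round(100 * (1 - ratio))


for cls in USD_PER_GB_MONTH:
    # 10 TB archive, taking 1 TB as 1,000 GB
    print(cls, round(monthly_cost_usd(10_000, cls), 2), f"{savings_vs_standard_pct(cls)}%")
```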
👉 Key Takeaway

S3 Glacier = cheapest storage in AWS for data you rarely access. Three classes: Instant (ms), Flexible (min-hrs), Deep Archive (12-48 hrs). Data moves there via lifecycle rules automatically. Know the retrieval times for the exam.

📋 S3 Glacier: Summary
  • Not a separate service: it's a set of S3 storage classes. Same buckets, same API, different pricing.
  • Three classes: Glacier Instant (ms access, $0.004/GB), Glacier Flexible (min-hrs, $0.0036/GB), Deep Archive (12-48 hrs, $0.00099/GB).
  • Lifecycle rules: automate moves from Standard → IA → Glacier → Deep Archive based on object age (days since creation), not last access.
  • Retrieval tiers (Flexible): Expedited (1-5 min, $0.03/GB), Standard (3-5 hr, $0.01/GB), Bulk (5-12 hr, $0.0025/GB).
  • Minimum storage charges: 90 days (Flexible), 180 days (Deep Archive). Don't archive short-lived data.
  • Exam pattern: "Cheapest + 12 hr retrieval OK" → Deep Archive. "Archive + instant access quarterly" → Glacier Instant.
02
Service Two

Amazon FSx

What is Amazon FSx Introductory

Amazon FSx provides fully managed third-party file systems. Unlike EFS (which is NFS only), FSx gives you enterprise-grade file systems, all fully managed by AWS: Windows File Server (SMB), Lustre (HPC), NetApp ONTAP (multi-protocol), and OpenZFS. You get the performance and features of these file systems without building and maintaining the infrastructure yourself.

👉 Think of FSx as: AWS saying "you want a Windows file server? Lustre cluster? NetApp NAS? We'll run it for you, fully managed, in your VPC"

Each FSx variant solves a different problem. The key exam skill is knowing which FSx to choose based on the workload description.

The Four FSx File Systems Core
FSx Windows

FSx for Windows File Server

  • Protocol: SMB (Server Message Block)
  • OS: Windows + Linux (via SMB)
  • AD Integration: Native Active Directory
  • Features: DFS namespaces, shadow copies, quotas
  • Use case: Windows file shares, .NET apps, SQL Server, home dirs
  • Performance: Up to 2 GB/s, hundreds of thousands of IOPS
FSx Lustre

FSx for Lustre

  • Protocol: Lustre (POSIX-compliant, Linux)
  • OS: Linux only
  • Performance: Sub-millisecond latency, 100s GB/s throughput
  • S3 integration: Transparently linked to S3 bucket
  • Use case: HPC, ML training, video rendering, genomics
  • Key: Fastest file system on AWS, built for compute-heavy bursts
FSx ONTAP

FSx for NetApp ONTAP

  • Protocol: NFS, SMB, iSCSI (multi-protocol)
  • OS: Linux, Windows, macOS
  • Features: Data dedup, compression, snapshots, clones, tiering
  • Use case: Enterprise NAS, VMware migration, multi-protocol
  • Key: Most feature-rich option (NetApp without the hardware)
FSx OpenZFS

FSx for OpenZFS

  • Protocol: NFS (v3, v4, v4.1, v4.2)
  • OS: Linux, Windows (via NFS), macOS
  • Features: Snapshots, clones, compression, up to 1M IOPS
  • Use case: Linux workloads migrating from on-prem ZFS
  • Key: Drop-in replacement for self-managed ZFS/NFS servers
Key Details for Exam Core
📂

FSx Lustre: Deployment Types

  • Scratch: Temporary, no data replication, highest burst throughput. Cheapest. Data lost if server fails. Use for short-term HPC jobs where data is in S3 anyway.
  • Persistent: Data replicated within AZ, consistent performance. Use for long-running workloads, ML training pipelines, production analytics.
🏢

FSx Windows: High Availability

  • Single-AZ: Standard deployment. Use for dev/test or non-critical workloads. Cheaper.
  • Multi-AZ: Active/standby across two AZs. Automatic failover (<60 seconds). ~50% more expensive. Use for production.

👉 FSx pricing model: FSx uses provisioned capacity. You choose a storage size (e.g., 2 TB) and pay for it whether used or not. This is different from EFS's elastic pay-per-GB-used model. FSx is better for predictable capacity; EFS is better for unpredictable/spiky storage needs. Only EFS supports Lambda mounting; no FSx variant works with Lambda.
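
The provisioned-versus-elastic distinction comes down to which quantity is multiplied by the rate. The sketch below shows the arithmetic with clearly hypothetical per-GB rates (they are placeholders, not AWS prices) and computes the utilization at which the two models cost the same. All function names are made up for illustration.

```python
def provisioned_monthly_cost(provisioned_gb: float, usd_per_gb: float) -> float:
    """Provisioned model: pay for the size you chose, used or not."""
    return provisioned_gb * usd_per_gb


def elastic_monthly_cost(used_gb: float, usd_per_gb: float) -> float:
    """Elastic model: pay only for the GB actually stored."""
    return used_gb * usd_per_gb


def break_even_utilization(provisioned_rate: float, elastic_rate: float) -> float:
    """Fraction of provisioned capacity in use at which both models cost the same."""
    return provisioned_rate / elastic_rate


# Hypothetical rates for illustration only; look up current pricing before deciding.
PROVISIONED_RATE = 0.10  # USD per provisioned GB-month (placeholder)
ELASTIC_RATE = 0.25      # USD per used GB-month (placeholder)

print(provisioned_monthly_cost(2000, PROVISIONED_RATE))  # 2 TB provisioned
print(elastic_monthly_cost(500, ELASTIC_RATE))           # 500 GB actually used
print(break_even_utilization(PROVISIONED_RATE, ELASTIC_RATE))
```

Below the break-even utilization the elastic model is cheaper; above it, provisioned capacity wins, which is the "predictable capacity" case described above.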

Mental Model: The FSx Family Introductory

👉 Quick decision: "Windows/SMB/AD" → FSx Windows. "HPC/ML/fastest throughput" → FSx Lustre. "Multi-protocol enterprise NAS" → FSx ONTAP. "ZFS migration / Linux NFS with high IOPS" → FSx OpenZFS. "Shared Linux NFS, simple" → EFS (not FSx).
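
The quick-decision rules can be written as a small function, which also makes the precedence explicit. This is only a study aid under the rules stated above. `choose_file_system` and its keyword flags are hypothetical, and the ordering (Lambda first, then multi-protocol, then Windows, Lustre, OpenZFS) is one reasonable reading of those rules.

```python
def choose_file_system(*, lambda_access: bool = False,
                       multi_protocol: bool = False,
                       windows_smb_ad: bool = False,
                       hpc_or_ml: bool = False,
                       zfs_migration: bool = False) -> str:
    """Map workload keywords to the file system the exam usually expects."""
    if lambda_access:
        return "EFS"              # only EFS mounts in Lambda
    if multi_protocol:
        return "FSx for NetApp ONTAP"
    if windows_smb_ad:
        return "FSx for Windows File Server"
    if hpc_or_ml:
        return "FSx for Lustre"
    if zfs_migration:
        return "FSx for OpenZFS"
    return "EFS"                  # simple shared Linux NFS


print(choose_file_system(windows_smb_ad=True))
print(choose_file_system(hpc_or_ml=True))
print(choose_file_system())
```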

Concept Diagram Introductory
FSx Family: Which file system for which workload
[Diagram: Amazon FSx managed file systems and the workloads each one serves]
  • FSx Windows: SMB · Active Directory · Windows file shares → .NET, SQL, SharePoint. Workloads: AD users, Group Policy, SMB file shares.
  • FSx Lustre: POSIX · 100s GB/s · S3-linked · HPC → ML, genomics, rendering. Workloads: HPC / ML / batch, 100s of EC2 instances reading data from S3.
  • FSx ONTAP: NFS+SMB+iSCSI · Dedup · Snapshots → Enterprise NAS, VMware. Workloads: enterprise / hybrid, mixed OS environments, on-prem NetApp migration.
  • FSx OpenZFS: NFS · 1M IOPS · Snapshots · Clones → ZFS migration. Workloads: self-managed NFS → AWS, low-latency Linux apps.
FSx for Lustre + S3 Integration Core

FSx for Lustre has a unique feature: it can be transparently linked to an S3 bucket. Data in S3 appears as files in the Lustre filesystem. Compute instances read from Lustre at extreme speed (100s GB/s), while the data source remains S3.
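
As a rough illustration of the link, the sketch below assembles the arguments for a boto3 `create_file_system` call that creates a scratch Lustre file system importing from a bucket. The helper name, bucket, and subnet ID are hypothetical. As I understand the API, `ImportPath` applies to the scratch and first-generation persistent deployment types, while newer persistent file systems attach buckets through data repository associations, so check the FSx documentation for your deployment type.

```python
def lustre_linked_to_s3(bucket: str, subnet_id: str, capacity_gib: int = 1200) -> dict:
    """Assemble create_file_system arguments for a scratch Lustre file system backed by S3."""
    return {
        "FileSystemType": "LUSTRE",
        "StorageCapacity": capacity_gib,       # GiB of provisioned storage
        "SubnetIds": [subnet_id],
        "LustreConfiguration": {
            "DeploymentType": "SCRATCH_2",     # temporary, no replication
            "ImportPath": f"s3://{bucket}",    # objects appear as files
        },
    }


params = lustre_linked_to_s3("my-training-data", "subnet-0123456789abcdef0")
print(params["LustreConfiguration"]["ImportPath"])

# With boto3 installed and credentials configured:
# boto3.client("fsx").create_file_system(**params)
```

A scratch deployment fits the pattern described above because the durable copy of the data stays in S3.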

FSx Lustre + S3: HPC/ML Training Architecture
[Diagram: S3 bucket (training data, TB/PB; durable, cheap storage) → linked → FSx Lustre (sub-ms latency, 100s GB/s throughput; S3 data appears as files) → GPU training instances (GPU-1, GPU-2, ... GPU-N) → S3 output (model artifacts). Instances read the S3-backed data through Lustre at 100s GB/s; no manual data copy needed.]
When to Use / When NOT to Use Core
✅

Use FSx When

  • Windows/SMB required: FSx for Windows (not EFS)
  • HPC / ML needing extreme throughput: FSx Lustre (100s GB/s)
  • Multi-protocol (NFS+SMB+iSCSI): FSx ONTAP
  • Migrating NetApp/ZFS from on-premises: FSx ONTAP or OpenZFS
  • Need sub-millisecond latency on shared filesystem: FSx Lustre or OpenZFS
  • Active Directory integration: FSx Windows
🚫

Do NOT Use FSx When

  • Simple shared Linux NFS: use EFS (simpler, elastic, cheaper for basic use)
  • Object storage / data lake: use S3
  • Single-instance block storage: use EBS
  • Archive/cold storage: use S3 Glacier
  • Serverless (Lambda) storage: use EFS (FSx not supported with Lambda)
Full Comparison: EFS vs FSx Family Core
| Feature | EFS | FSx Windows | FSx Lustre | FSx ONTAP |
|---|---|---|---|---|
| Protocol | NFS v4.1 | SMB | Lustre/POSIX | NFS+SMB+iSCSI |
| OS | Linux only | Windows (+Linux SMB) | Linux only | All |
| Performance | Good (ms latency) | Good (ms) | Extreme (sub-ms, 100s GB/s) | Good (ms) |
| Capacity | Elastic | Provisioned | Provisioned | Provisioned (auto-tiering) |
| S3 Link | No | No | Yes (transparent) | No (FlexCache) |
| Multi-AZ | Yes (default) | Optional | No (single AZ) | Yes |
| Lambda | Yes | No | No | No |
| Best For | Shared Linux, containers | Windows apps, AD | HPC, ML, rendering | Enterprise NAS |
| Pricing | Pay per GB used | Per GB provisioned | Per GB provisioned | Per GB provisioned |
Architecture Diagram: Enterprise Hybrid with FSx In-Depth
FSx ONTAP: Multi-protocol access from Windows + Linux + On-Premises
[Diagram: inside the VPC, FSx ONTAP (NFS + SMB + iSCSI; dedup, tiering, snapshots) serves a Windows EC2 instance (SMB mount), a Linux EC2 instance (NFS mount), and an ECS task (iSCSI). An on-premises data center (NetApp / NFS clients) connects over DX / VPN. Capacity tiering auto-tiers cold data to S3 (cheapest).]
Exam Scenarios In-Depth
| Scenario | Answer | Why |
|---|---|---|
| "Windows file shares with Active Directory for 500 users" | FSx for Windows | SMB + native AD integration. EFS doesn't support Windows/SMB. |
| "ML training: 200 GPU instances need to read 50 TB dataset from S3 fast" | FSx for Lustre | Linked to S3 → 100s GB/s reads. Lustre = fastest file system on AWS. |
| "Migrate on-prem NetApp to AWS, need NFS + SMB access" | FSx for NetApp ONTAP | Multi-protocol, same NetApp features (dedup, snapshots, FlexClone). |
| "Simple shared NFS for 10 EC2 Linux instances behind ALB" | EFS (not FSx) | Simple use case. EFS is simpler, elastic, supports Lambda. No need for FSx. |
| "Shared storage for Lambda functions" | EFS | Only EFS supports Lambda mounting. No FSx variant works with Lambda. |
| "Linux workloads migrating from on-prem ZFS servers" | FSx for OpenZFS | Drop-in replacement. Same ZFS features (snapshots, clones, compression). |
👉 Key Takeaway

FSx = managed enterprise file systems. Four variants: Windows (SMB/AD), Lustre (HPC/fastest), ONTAP (multi-protocol NAS), OpenZFS (ZFS migration). If the question says "Windows" → FSx Windows. "HPC/ML throughput" → FSx Lustre. "Simple shared Linux" → EFS. "Lambda" → EFS only.

📋 Amazon FSx: Summary
  • FSx for Windows: SMB protocol, native Active Directory, DFS. For Windows workloads, .NET, SQL Server, SharePoint.
  • FSx for Lustre: fastest file system on AWS (sub-ms, 100s GB/s). Links to S3. For HPC, ML training, genomics, rendering.
  • FSx for NetApp ONTAP: multi-protocol (NFS+SMB+iSCSI), dedup, compression, snapshots, auto-tiering to S3. Enterprise NAS.
  • FSx for OpenZFS: NFS, snapshots, clones, up to 1M IOPS. Drop-in for self-managed ZFS/NFS servers.
  • FSx vs EFS: EFS = simple elastic NFS for Linux/Lambda. FSx = specialized enterprise file systems. Don't use FSx for simple NFS.
  • Exam key: "Windows/SMB" → FSx Windows. "Fastest/HPC" → Lustre. "Multi-protocol" → ONTAP. "Lambda" → EFS only.