Post on 14-Apr-2017
© 2015, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Timothy Harder
harder@amazon.com
October 2015
Amazon Elastic File System
STG306
What to expect from the session (STG306)
Advanced level: 300
• Motivations for creating the world’s first cloud scale NAS
• How to set up and administer a highly scalable file system
• Awareness of security mechanisms available to your file systems
• View of Amazon EFS performance model
Agenda
1. Overview of Amazon EFS
2. Amazon EFS technical concepts
3. Walk through experience of creating a file system
4. Guest presenter – ClearSky Data
5. Discuss file system security mechanisms
6. Review the Amazon EFS performance model
7. Explore the Amazon EFS regional availability and durability model
8. Q&A
What if you never had to worry about file system space again?
Overview of Amazon EFS
The AWS storage portfolio
• Amazon S3: object storage; data presented as buckets of objects
• Amazon EFS: file storage (analogous to NAS); data presented as a file system
• Amazon Elastic Block Store (EBS): block storage (analogous to SAN); data presented as disk volumes
• Amazon Glacier: archival storage; data presented as vaults/archives of objects
What is Amazon EFS?
• Fully managed file system for Amazon EC2 instances
• Provides standard file system semantics
• Works with standard operating system APIs
• Sharable across thousands of instances
• Elastically grows to petabyte scale
• Delivers performance for a wide variety of workloads
• Highly available and durable
• NFS v4–based
Why did we build Amazon Elastic File System?
• Compute + Storage + File system + Multi-AZ replication + 24x7 + Management = Hard, expensive, wobbly.
• Prepackaged appliances = Easy, but...
We built Amazon EFS so that you do not need to manage
the discrete infrastructure elements for your file systems
We focused on changing the game
1. Amazon EFS is simple
2. Amazon EFS is elastic
3. Amazon EFS is scalable
Amazon EFS is simple
Fully managed
- No hardware, network, file layer
- Create a scalable file system in seconds!
Seamless integration with existing tools and apps
- NFS v4—widespread, open
- Standard file system semantics
- Works with standard OS file system APIs
Simple pricing = simple forecasting
Amazon EFS is elastic
File systems grow and shrink automatically
as you add and remove files
No need to provision storage capacity or
performance
You pay only for the storage space you use,
with no minimum fee
Amazon EFS is scalable
File systems can grow to petabyte scale
Throughput and IOPS scale automatically as file systems grow
Consistent low latencies regardless of file system size
Support for thousands of concurrent NFS connections
Why does this matter?
… to app owners and developers?
• Easy to move existing code, applications, and tools used today with existing NFS servers to the AWS cloud
• Simple shared file storage solution for new cloud-native applications
… to your business?
• Predictable pricing with no up-front investment
• Increased agility
• Spend less time managing file storage and more time focusing on your business
… to IT administrators?
• Eliminates need to manage and maintain file system storage at scale
Diving in
What is a file system?
The primary resource in Amazon EFS
Where you store files and directories
Can create multiple file systems per account
How to access a file system from an instance
You “mount” a file system on an Amazon EC2 instance (standard command) — the file system appears like a local set of directories and files
An NFSv4 client is standard on Linux distributions
mount -t nfs4 [file system DNS name]:/ /[user’s target directory]
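As a minimal sketch of the mount command above, the helper below assembles the same arguments so a provisioning script could run them. The function name `build_mount_command` and the DNS name and directory in the example are hypothetical placeholders, not values from the talk.

```python
import shlex


def build_mount_command(dns_name: str, target_dir: str) -> list[str]:
    """Return the argument list for mounting an NFSv4 file system at target_dir."""
    if not target_dir.startswith("/"):
        raise ValueError("target_dir must be an absolute path")
    # "<dns_name>:/" selects the root of the remote file system.
    return ["mount", "-t", "nfs4", f"{dns_name}:/", target_dir]


# Hypothetical placeholder values, not a real file system.
argv = build_mount_command("fs-example.efs.example.com", "/mnt/efs")
print(shlex.join(argv))
```

The list form can be passed to `subprocess.run(argv, check=True)` on an instance with an NFSv4 client installed; mounting normally requires root privileges.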
What is a mount target?
To access your file system from instances in an Amazon VPC, you create mount targets in the VPC
A mount target is an NFSv4 endpoint in your VPC
A mount target has an IP address and a DNS name you use in your mount command
[Diagram: a region with three Availability Zones; EC2 instances in a VPC and a mount target]
How does it all fit together?
[Diagram: a region with three Availability Zones; EC2 instances in a VPC connected to the customer’s file system]
There are three ways to set up and
manage a file system
AWS Management Console
AWS Command Line Interface (CLI)
AWS Software Development Kit (SDK)
The AWS Management Console, CLI, and SDK each
allow you to perform a variety of management tasks
Create a file system
Create and manage mount targets
Tag a file system
Delete a file system
View details on file systems in your AWS account
Setting up and mounting a file system takes
under a minute
1. Create a file system
2. Create a mount target in each Availability Zone from
which you want to access the file system
3. Enable the NFS client on your instances
4. Run the mount command
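Steps 1 and 2 above can be sketched with the SDK. The function below assumes its `efs` argument behaves like the client returned by `boto3.client("efs")` and uses that client's `create_file_system`, `describe_file_systems`, and `create_mount_target` calls. The function name, the polling loop, and all IDs are illustrative assumptions rather than anything shown in the session.

```python
import time


def create_file_system_with_targets(efs, creation_token, subnet_ids,
                                    security_group_id, poll_seconds=2.0):
    """Create a file system, wait until it is available, then create one
    mount target per subnet. `efs` is assumed to mirror boto3.client("efs")."""
    fs_id = efs.create_file_system(CreationToken=creation_token)["FileSystemId"]

    # Mount targets can only be created once the file system is available.
    while True:
        described = efs.describe_file_systems(FileSystemId=fs_id)
        if described["FileSystems"][0]["LifeCycleState"] == "available":
            break
        time.sleep(poll_seconds)

    target_ids = []
    for subnet_id in subnet_ids:  # one subnet per Availability Zone
        response = efs.create_mount_target(
            FileSystemId=fs_id,
            SubnetId=subnet_id,
            SecurityGroups=[security_group_id],
        )
        target_ids.append(response["MountTargetId"])
    return fs_id, target_ids
```

With boto3 installed and credentials configured, the call would look like `create_file_system_with_targets(boto3.client("efs"), "my-token", subnet_ids, security_group_id)`. Steps 3 and 4 then run on each instance.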
It takes 35 seconds or so.
Multi-exabyte file system
available for use
Don’t worry—We only bill for
the space you use
Securing your file system
Control surfaces for Amazon EFS security
Control network traffic to and from file systems (mount
targets) by using VPC security groups and network ACLs
Control file and directory access by using standard
Linux/Windows directory-/file-level permissions
Control administrative access (API access) to file systems
by using AWS Identity and Access Management (IAM)
Only EC2 instances in the VPC you specify can access
your Amazon EFS file system
[Diagram: three VPCs containing EC2 instances, and the customer’s file system]
Security groups control which instances in your VPC
can connect to your mount targets
[Diagram: the security group on the customer’s file system permits inbound traffic from “sg-allowed”; instances are shown in security groups “sg-allowed” and “sg-not-allowed”]
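The rule on the slide can be sketched as data. The dictionary below follows the `IpPermissions` shape accepted by the EC2 AuthorizeSecurityGroupIngress API, with TCP port 2049 as the standard NFS port. The helper names and the `may_connect` check are a simplified illustration, not how security groups are evaluated internally.

```python
NFS_PORT = 2049  # standard TCP port for NFSv4


def nfs_ingress_from_group(source_group_id: str) -> dict:
    """Build an inbound rule permitting NFS traffic from one security group."""
    return {
        "IpProtocol": "tcp",
        "FromPort": NFS_PORT,
        "ToPort": NFS_PORT,
        "UserIdGroupPairs": [{"GroupId": source_group_id}],
    }


def may_connect(instance_group_ids, rules) -> bool:
    """Toy check: is any of the instance's groups named as a rule source?"""
    allowed = {pair["GroupId"]
               for rule in rules
               for pair in rule["UserIdGroupPairs"]}
    return any(group_id in allowed for group_id in instance_group_ids)


# Group names taken from the slide; they are labels, not real group IDs.
rules = [nfs_ingress_from_group("sg-allowed")]
print(may_connect(["sg-allowed"], rules))
print(may_connect(["sg-not-allowed"], rules))
```

An instance in "sg-allowed" passes the check and one in "sg-not-allowed" does not, which mirrors the behaviour the slide describes for mount targets.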
Amazon EFS supports user-level file and
directory access permissions
Set file/directory permissions to specify read-write-execute
permissions for users and groups
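Because these are ordinary POSIX permission bits, they can be set and inspected with standard library calls. This sketch uses a temporary directory as a stand-in for a mounted file system; the file name and the chosen mode are illustrative assumptions.

```python
import os
import stat
import tempfile

# A temporary directory stands in for a mounted file system here.
with tempfile.TemporaryDirectory() as shared_dir:
    path = os.path.join(shared_dir, "report.txt")
    with open(path, "w") as handle:
        handle.write("hello\n")

    # Owner: read/write. Group: read-only. Others: no access.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP)

    mode = os.stat(path).st_mode
    mode_string = stat.filemode(mode)
    print(mode_string)
```

On a Linux instance the same calls apply unchanged to paths under the mount point.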
Use IAM policies to control who can use the
administrative APIs to create, manage, and
delete file systems
Amazon EFS supports action-level and
resource-level permissions
Integration with AWS IAM provides administrative
security
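As a hedged illustration of action-level and resource-level permissions, the sketch below builds an IAM policy document that allows a few Amazon EFS API actions on a single file system. The function name, the selection of actions, and the region, account ID, and file system ID are placeholders chosen for the example.

```python
import json


def efs_admin_policy(region: str, account_id: str, file_system_id: str) -> dict:
    """Build a policy allowing selected EFS actions on one file system."""
    arn = (f"arn:aws:elasticfilesystem:{region}:{account_id}:"
           f"file-system/{file_system_id}")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Action-level control: only these API calls are permitted.
                "Action": [
                    "elasticfilesystem:DescribeFileSystems",
                    "elasticfilesystem:DescribeMountTargets",
                    "elasticfilesystem:CreateMountTarget",
                ],
                # Resource-level control: only this file system.
                "Resource": arn,
            }
        ],
    }


# Placeholder identifiers, not real resources.
policy = efs_admin_policy("us-west-2", "111122223333", "fs-12345678")
print(json.dumps(policy, indent=2))
```

The printed JSON is the form that would be attached to an IAM user, group, or role.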
FAQ on adjacent security topics
Does EFS support ACLs?
Does EFS support or need NIS/NIS+?
Does EFS support Kerberized authentication?
Does EFS support encryption?
Does EFS support Windows? … Stay tuned.
Using AWS for an Enterprise Storage Service
Laz Vekiarides, CTO & Co-Founder
October 2015
What does 1PB of data look like?
Today With ClearSky
The ClearSky Global Storage Network
• Metro-based fully managed service
• SLA-guaranteed for enterprise workloads
• Complete lifecycle management
• Primary, backup, recovery
Metro coverage:
“Always within 2ms of the Customer”
[Map legend: Tier 1 locations and Tier 2 locations]
Next to your apps In your metro area Regional
The ClearSky solution:
A hybrid cloud storage offering
[Diagram labels: Enterprise data center; Enterprise apps; ClearSky Edge Appliance; ClearSky POPs: distributed & optimized storage; Leveraging AWS; Edge cache; Data services; Edge; Metro POP; ClearSky metro cache; N x Metro E; Customer SAN; iSCSI/NFS/Fiber Channel; VPC; EFS]
Hybrid cloud mobility
[Diagram labels: Edge cache; Data services; ClearSky metro cache; 2x 1GbE; Edge; Metro POP]
Automatic and optimized data migration to
AWS enables workload portability to EC2
Large and distributed network protects users
from latency issues
Customer use cases
• Managed/cloud service provider
• Data centers in Philadelphia & Las Vegas
• Using EC2 for cloud workloads
• 1PB+ storage, currently running on EqualLogic, Nimble
• Heavy users of VMware, SQL
• Chose ClearSky for workload portability, cloud economics & scale
• Xtium will no longer need to replicate data cross-country
• Boston-based biopharma
• 100TB storage, currently on Dell Compellent
• Running full range of enterprise apps on ClearSky: VMware, SQL, Oracle
• Using EC2 for DR
• Chose ClearSky for simplicity and cost effectiveness
• Momenta will no longer need a secondary site
Thank You
www.clearskydata.com
Thank you!
Also see:
CMP404 – Cloud Rendering at Disney
Animation Studios
CMP405 – Containerizing Video. Sony
Electronics
Amazon EFS performance model
Amazon EFS aggregate performance is based on a
throughput bursting model that scales as a file
system grows
As a file system gets larger, it
needs access to more
throughput
Many file workloads are spiky,
with peak throughput well above
average levels
Amazon EFS scalable bursting model is designed to
make performance available when you need it
Throughput bursting model based on earning
and spending “bursting credits”
Accumulate up to
12 hours of
continuous bursting
Earn credits at a “baseline rate” of 0.05 MiB/s per GiB stored
Spend credits by reading/writing at up to:
• 100 MiB/s for file systems <1TiB
• 100 MiB/s per TiB for file systems >1TiB
• All file systems can drive sustained baseline throughput
(i.e., 50 MiB/s per TiB stored)
• File systems with a positive bursting credit balance are
able to “burst” to higher levels
• New file systems start with a full credit balance
Bursting model examples
File system size and read/write throughput
A 1 TiB EFS file system can:
• Drive up to 50 MiB/s continuously, or
• Burst to 100 MiB/s for up to 12 hours each day*
A 10 TiB EFS file system can:
• Drive up to 500 MiB/s continuously, or
• Burst to 1 GiB/s for up to 12 hours each day*
A 100 GiB EFS file system can:
• Drive up to 5 MiB/s continuously, or
• Burst to 100 MiB/s for up to 72 minutes each day*
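The examples above follow from simple arithmetic, sketched here under two assumptions. First, the slide's rounded baseline of 50 MiB/s per TiB is used, with 100 GiB treated as 0.1 TiB. Second, daily burst time is taken as the credits earned over 24 hours at the baseline rate divided by the burst rate. The function names are illustrative, and the 12-hour credit accumulation limit is not modelled.

```python
BASELINE_MIB_S_PER_TIB = 50.0  # slide's rounded form of 0.05 MiB/s per GiB
BURST_FLOOR_MIB_S = 100.0      # burst rate for file systems up to 1 TiB
BURST_MIB_S_PER_TIB = 100.0    # burst rate per TiB for larger file systems
MINUTES_PER_DAY = 24 * 60


def baseline_mib_s(size_tib: float) -> float:
    """Sustained throughput, which is also the rate at which credits are earned."""
    return BASELINE_MIB_S_PER_TIB * size_tib


def burst_mib_s(size_tib: float) -> float:
    """Peak throughput available while the credit balance is positive."""
    return max(BURST_FLOOR_MIB_S, BURST_MIB_S_PER_TIB * size_tib)


def burst_minutes_per_day(size_tib: float) -> float:
    """Minutes of bursting that one day of earned credits can pay for."""
    return MINUTES_PER_DAY * baseline_mib_s(size_tib) / burst_mib_s(size_tib)


for size_tib in (1.0, 10.0, 0.1):
    print(size_tib,
          round(baseline_mib_s(size_tib), 2),
          round(burst_mib_s(size_tib), 2),
          round(burst_minutes_per_day(size_tib), 2))
```

Under these assumptions the sketch gives 720 minutes (12 hours) for the 1 TiB and 10 TiB cases and 72 minutes for the 100 GiB case. The 10 TiB burst rate comes out as 1000 MiB/s, which the slide rounds to 1 GiB/s.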
Amazon EFS is designed for a wide spectrum of use cases
[Diagram: use cases arranged along a spectrum from high throughput / parallel IO to low latency / serial IO: genomics, big data, scale-out jobs, home directories, CMS, web serving, software builds, metadata-intensive jobs. Annotations: “We started here”, “Ready now”, “We are actively working here now”]
Regional availability and
durability
In what regions can I use Amazon EFS?
US-West-2 (Oregon)
US-East-1 (Northern Virginia)
EU-West-1 (Ireland)
Data is stored in multiple Availability Zones for
high availability and durability
Every file system object (directory, file, and link) is redundantly stored across multiple Availability Zones in a region
[Diagram: Amazon EFS spanning Availability Zones 1, 2, and 3 in a region]
Data can be accessed from any Availability Zone in
the region while maintaining full consistency
Your EC2 instances can connect to your EFS file system from any Availability Zone in a region
All reads are fully consistent in all Availability Zones. That is, a read in one Availability Zone is guaranteed to have the latest data, even if the data is being written in another Availability Zone.
[Diagram: within a VPC spanning three Availability Zones in a region, one EC2 instance writes while an EC2 instance in another Availability Zone reads]
Wrapping up
TCO - 1TB example
User managed
$1.00 / GB
• Storage
• Compute
• Inter Availability Zone
• m3.xlarge x 3 x 3 TB EBS GP2 + inter-AZ replication
Appliance
$4.00 / GB
• Per-hour charge
• Storage
• Inter Availability Zone
• m3.xlarge x 3 x TB EBS GP2 + inter-AZ replication
On-Premises AFA
$0.60 / GB
• Raw to usable
• Cost of funds
• Utilization
• Colocation
• Storage only, mirrored two-site configuration
Amazon EFS
$0.30 / GB
• Elastic
• Simple
• Predictable
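To make the 1 TB comparison concrete, the sketch below multiplies each per-GB figure from the slide by the capacity. It assumes 1 TB = 1000 GB, and it leaves the billing period open because the slide does not state one. The dictionary and function names are illustrative.

```python
# Per-GB figures as listed on the slide.
PRICE_PER_GB = {
    "User managed": 1.00,
    "Appliance": 4.00,
    "On-Premises AFA": 0.60,
    "Amazon EFS": 0.30,
}
GB_PER_TB = 1000  # assumption: decimal terabytes


def cost_by_option(size_tb: float) -> dict:
    """Return the cost of storing size_tb under each option, in dollars."""
    return {name: round(price * size_tb * GB_PER_TB, 2)
            for name, price in PRICE_PER_GB.items()}


for name, dollars in cost_by_option(1).items():
    print(f"{name}: ${dollars:,.2f}")
```

Under that assumption the four options come to $1,000, $4,000, $600, and $300 respectively for 1 TB.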
What to do next?
Learn more at aws.amazon.com/efs
Request an invite for our Preview
Thank you!
Remember to complete
your evaluations!