Scale and Availability Considerations for Cluster File Systems
David Noy, Symantec Corporation
© 2011 Storage Networking Industry Association. All Rights Reserved.
SNIA Legal Notice
The material contained in this tutorial is copyrighted by the SNIA unless otherwise noted. Member companies and individual members may use this material in presentations and literature under the following conditions:
Any slide or slides used must be reproduced in their entirety without modification.
The SNIA must be acknowledged as the source of any material used in the body of any document containing material from these presentations.
This presentation is a project of the SNIA Education Committee.
Neither the author nor the presenter is an attorney and nothing in this presentation is intended to be, or should be construed as, legal advice or an opinion of counsel. If you need legal advice or a legal opinion please contact your attorney.
The information presented herein represents the author's personal opinion and current understanding of the relevant issues involved. The author, the presenter, and the SNIA do not assume any responsibility or liability for damages arising out of any reliance on or use of this information.
NO WARRANTIES, EXPRESS OR IMPLIED. USE AT YOUR OWN RISK.
Abstract
Scale and Availability Considerations for Cluster File Systems
This session will appeal to server administrators looking to improve the availability of their mission-critical applications using a cross-platform, low-cost tool that works in heterogeneous environments. We will learn how you can use a Cluster File System to improve performance and availability for your application environment.
Simultaneous Access to File System

With a cluster file system, all servers that are part of the cluster can safely access the file system simultaneously.

Without a clustered file system, a traditional file system can only be mounted on one server at any given time; otherwise data corruption will occur.

[Diagram: Veritas Cluster File System HA sharing one file system across all servers, vs. four servers each mounting its own file system (File System 1-4)]
How is a CFS different from NAS?

Network Attached Storage (NAS)
• Uses TCP/IP to share a file system over the Local Area Network
• Higher latency and overhead
• A file system is mounted via a network-based file system protocol (CIFS on Windows, NFS on Unix)

Cluster File System
• Looks and feels like a local file system, but shared across multiple nodes
• Uses a Storage Area Network to share the data, and dedicated network interfaces to share locking information
• Tightly coupled with clustering to create redundancy
• With a Cluster File System the application can run on the same node as the file system to get the best performance
Typical Performance Decay at Scale

• Bottlenecks as a result of legacy technology
• Performance per node can degrade with more nodes
• Linear scalability generally infeasible (results vary based on app)

[Chart: performance and cost vs. available server capacity, with performance decaying as capacity grows]
Expected Scalability for a CFS

[Chart: total throughput to the CFS in MB/sec (0-2500) vs. number of nodes (1-16)]
Legacy Meta-Data Management

• File system transactions are done primarily by one master node per file system
• That node becomes the bottleneck in transaction-intensive workloads
• The amount of transactions performed locally can be improved, but the master node is still the bottleneck
• Acceptable performance for some workloads, not for others

[Diagram: one CFS node handling metadata and data while the remaining nodes handle data only]
Distributed Meta-Data Management

• All nodes in the cluster can perform transactions – no need to manually balance the load
• Not directly visible to end users
• Performance tests show linear scalability
• Scale-out becomes complex beyond 64 nodes

[Diagram: servers with one log per node, each handling metadata and data for the CFS over the SAN]
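One simple way to picture distributed meta-data ownership is hashing each file's identifier onto a node, so transaction load spreads without manual balancing. This is an illustrative sketch only; the actual placement algorithm used by the product is not described in this tutorial, and real systems typically use delegation or range-based ownership.

```python
# Toy sketch: spread metadata ownership across nodes by hashing the
# inode number. NOT the product's algorithm - purely illustrative.
def owner(inode: int, nodes: int) -> int:
    # Each inode is "owned" by exactly one node, which performs its
    # transactions; ownership is spread evenly with no manual balancing.
    return inode % nodes

# With 4 nodes, consecutive inodes land on rotating owners.
print([owner(i, 4) for i in range(8)])  # [0, 1, 2, 3, 0, 1, 2, 3]
```

The point of the sketch is only that no single node owns all metadata, so no node becomes the transaction bottleneck described on the previous slide.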
Without Range Locking: Exclusive Locking

• Node 1 – writer holding an exclusive lock on file foo.bar
• Node 2 – writer waiting for the lock to be released
• The lock is held by node 1; other nodes do not have access until the lock is released
• Typical workload: appending writes, e.g. a log file
Range Locking

• Node 1 and node 2 can both write to file foo.bar at the same time
• A lock covers only a byte range; the range held by node 1 is inaccessible to other nodes
• The rest of the file is available for read or write by any node
• Typical workload: appending writes, e.g. data ingest
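The concept can be demonstrated on a single machine with POSIX byte-range locks (fcntl), standing in for the CFS's cluster-wide range locks. This is an illustration of the idea, not the product's lock manager: two local processes play the two nodes, and a lock on bytes 0-99 blocks an overlapping lock from the other process but not a lock on the disjoint range 100-199 (POSIX-only; uses os.fork).

```python
import fcntl, os, tempfile

# A scratch file stands in for the shared file foo.bar.
path = tempfile.NamedTemporaryFile(delete=False).name
with open(path, "wb") as f:
    f.write(b"\0" * 200)

node1 = open(path, "r+b")
# "Node 1": exclusive lock on bytes 0-99 only (a byte-range lock).
fcntl.lockf(node1, fcntl.LOCK_EX, 100, 0)

pid = os.fork()
if pid == 0:                      # child process acts as "node 2"
    node2 = open(path, "r+b")
    try:
        # Overlapping range: must fail while node 1 holds it.
        fcntl.lockf(node2, fcntl.LOCK_EX | fcntl.LOCK_NB, 100, 0)
        os._exit(1)               # unexpectedly succeeded
    except OSError:
        pass
    # Disjoint range 100-199: granted immediately.
    fcntl.lockf(node2, fcntl.LOCK_EX | fcntl.LOCK_NB, 100, 100)
    os._exit(0)

ok = os.WEXITSTATUS(os.waitpid(pid, 0)[1]) == 0
os.remove(path)
print("disjoint ranges lockable concurrently:", ok)
```

In a real CFS the same granularity is enforced cluster-wide by the distributed lock manager over the dedicated interconnect, rather than by the local kernel.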
Other Scalability Considerations

• Distributed meta-data management
• Distributed lock management
• Minimize intra-cluster messaging
• Cache optimizations
• Reduce or eliminate fragmentation
Availability: I/O Fencing

[Diagram: nodes A and B, both connected to the coordinator disks and the data disks]
I/O Fencing: Split Brain…

[Diagram: nodes A and B lose contact with each other (split brain) while both can still reach the coordinator disks and data disks]
I/O Fencing: Split Brain Resolved

[Diagram: node A wins the race for the coordinator disks (Y) and keeps access to the data disks; node B loses (N) and is fenced off]
Why Fencing?

• During a split brain, two sets of nodes can each believe they own the storage; uncoordinated writes from both sides cause data corruption
• Need to restrict writes to current and verified nodes

SCSI-3 Based Fencing
• Uses SCSI-3 Persistent Reservation (PR) disks for I/O fencing
• Maximum data protection

Proxy Based Fencing
• Non-SCSI-3 fencing using IP-based proxy servers
• Suited to virtualized environments
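The arbitration that resolves a split brain can be sketched abstractly. This is an illustrative model, not the actual SCSI-3 PR command sequence: with an odd number of coordinator disks, each partition races to eject the other side's registration, and the side that wins on a majority of disks survives while the loser panics before it can write to the data disks.

```python
# Abstract model of coordinator-disk arbitration during split brain.
# Assumes three coordinator disks; the real SCSI-3 PR protocol is
# more involved (registration keys, preempt-and-abort, etc.).
def survives(won_race_on_disk):
    # won_race_on_disk[i] is True if this partition ejected the peer's
    # registration from coordinator disk i before the peer ejected ours.
    majority = len(won_race_on_disk) // 2 + 1
    return sum(won_race_on_disk) >= majority

# Node A wins the race on disks 1 and 3; node B wins only disk 2.
node_a = survives([True, False, True])    # majority -> keeps running
node_b = survives([False, True, False])   # minority -> fences itself off
print(node_a, node_b)
```

An odd number of coordinator disks guarantees exactly one partition can win a majority, so at most one side ever writes to the data disks.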
Other Availability Considerations

• Robust fencing mechanism
• Tight integration between application clustering and the storage layer (file system, volume manager, multi-pathing)
• Quick recovery of CFS objects (lock recovery, meta-data redistribution)
Cluster File System Use Cases

Faster Failovers
• Dramatically reduce application failover times
• When appropriate, eliminate the cost and complexity of parallel databases
• Typical applications: databases, messaging applications, CRM applications

Performance for Parallel Apps
• Linear scalability and performance for parallel applications
• Ideal for grid compute
• Typical applications: ETL applications, grid applications, parallel databases

Improve Storage Utilization
• Pool storage between servers, eliminating storage islands
• Eliminate multiple copies of data
Accelerate Application Recovery

For applications that need maximum uptime, classic clustering requires time.

Recovery steps with classic clustering:
• Detect failure
• Un-mount file system
• Deport disk group
• Import disk group
• Mount file system
• Start application
• Clients reconnect

[Diagram: client fails over from the failed active server to the passive server; the file system is re-mounted on the new active server]

Typical applications: databases, messaging applications, CRM applications
Accelerate Application Recovery: Failover as Fast as Application Restart

For applications that need maximum uptime, Veritas Cluster File System keeps the file system mounted on every node, so the storage hand-off steps (un-mount, deport, import, mount) drop out of the recovery path:
• Detect failure
• Start application
• Clients reconnect

[Diagram: client fails over from the failed server to the active server; both nodes run Veritas Storage Foundation and Veritas Cluster Server on a shared Veritas Cluster File System]

Typical applications: databases, messaging applications, CRM applications
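The difference between the two recovery paths can be made concrete with illustrative step durations. The numbers below are assumptions for the sketch, not measured values from any product:

```python
# Assumed, illustrative durations (seconds) for each classic recovery step.
CLASSIC_STEPS = {
    "detect failure": 10,
    "un-mount file system": 30,
    "deport disk group": 60,
    "import disk group": 60,
    "mount file system": 30,
    "start application": 60,
    "clients reconnect": 10,
}
# With a CFS the file system is already mounted on the standby node, so
# only these steps remain on the failover critical path.
CFS_STEPS = {k: CLASSIC_STEPS[k]
             for k in ("detect failure", "start application", "clients reconnect")}

classic, cfs = sum(CLASSIC_STEPS.values()), sum(CFS_STEPS.values())
print(f"classic: {classic}s, with CFS: {cfs}s")  # classic: 260s, with CFS: 80s
```

Whatever the actual per-step timings, the structural point holds: CFS removes the storage hand-off steps entirely rather than merely speeding them up.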
Case Study: Reduced Downtime Cost

Before: ground control system using CRM and databases with traditional high availability (SAP and Oracle with HA); 6 × 30-minute failovers per year
• Downtime cost = €1,000,000/plane/day
• # Failovers/yr = 6 failures @ 30 mins each
• # Planes = 240
• Total downtime/year = 3 hours
• @ €1,000,000/plane/day × 3 hrs × 240 planes
• Downtime cost = €30,000,000

After: ground control system with fast failover using CFS HA; < 1 minute downtime per failure
• Downtime cost = €1,000,000/plane/day
• # Failovers/yr = 6 failures @ 1 min each
• # Planes = 240
• Total downtime/year = 6 minutes
• @ €1,000,000/plane/day × 6 min × 240 planes
• Downtime cost = €1,000,000
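The slide's arithmetic can be verified directly: downtime cost scales with fleet size and with the fraction of a day the system is unavailable.

```python
def downtime_cost(cost_per_plane_per_day, planes, minutes_down_per_year):
    # Cost = daily rate x fleet size x fraction of a day lost per year.
    return cost_per_plane_per_day * planes * minutes_down_per_year / (24 * 60)

before = downtime_cost(1_000_000, 240, 6 * 30)  # 6 failovers x 30 min each
after = downtime_cost(1_000_000, 240, 6 * 1)    # 6 failovers x 1 min each
print(f"€{before:,.0f} vs €{after:,.0f}")       # €30,000,000 vs €1,000,000
```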
Case Study: Large Casino

Messaging services FT cluster
• Using messaging services to run the entire casino gaming floor
• Experienced a server outage: storage failover was very quick, but the failover server required 20 minutes to rebuild the message queue
• The entire casino floor came to a complete stop

Total savings
• With CFS, a hot-standby server could recover in seconds instead of 20 minutes

[Diagram: messaging servers running Veritas Storage Foundation share a CFS over the SAN; the client fails over from the failed server to the active server]
Case Study: Reduce Billing Cycle

Processing chain: OSS, Customer Care, ETL, Billing

Without CFS – drawbacks
• Time required to process customer billing included 12 hours of copy time
• For billing systems, time is money
• Redundant copies of data at each server mean 2x the storage requirements

With CFS – benefits
• CFS eliminates copy time, so one process can start when another completes
• Single copy of data shared among servers
• 12-hour reduction in the billing cycle
Case Study: Storage Consolidation

Traditional application architecture (traditional file system)
• Islands of storage zoned to each server: 4 × 500 GB islands = 2 TB, each with its own spare
• Storage over-provisioned due to unknown storage growth needs
• When storage is filled, new storage must be provisioned

Shared storage architecture (shared cluster file system)
• Storage is accessible to all nodes
• Reduce upfront over-provisioning
• All nodes share common free space: a single 1.5 TB pool serves the same servers with 25% less storage, and can grow
• Minimize idle server and storage resources

[Diagram: production and post-processing servers each zoned to a private island plus spare, vs. the same servers sharing one pool]
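The utilization figure on the slide follows directly from the two capacities:

```python
islands_gb = 4 * 500   # four per-server islands of 500 GB each = 2 TB
pool_gb = 1500         # one shared 1.5 TB pool serving the same servers
saving = 1 - pool_gb / islands_gb
print(f"{saving:.0%} less storage")  # 25% less storage
```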
Availability and Scale for Clustered NAS

• High availability of NFS and CIFS service
• Distributed NFS load balancing for performance
• Scale servers and storage independently; more servers give linear performance
• Flexible storage growth
• Stretch your NFS/CIFS cluster (up to 100 km active/active)
• Choose your platform (Solaris, Solaris x86, AIX, RHEL5)
• Integrated with Dynamic Storage Tiering, Dynamic Multi-Pathing, and Thin Provisioning Reclamation
• Better price/performance compared to similar NAS appliances
• …

[Diagram: clustered NAS nodes coordinating locks across the cluster]
Q&A / Feedback
Please send any questions or comments on this presentation to SNIA: [email protected]
Many thanks to the following individuals for their contributions to this tutorial.
- SNIA Education Committee
Karthik Ramamurthy
David Noy