IBM Spectrum Scale
Software defined storage for cloud, big data & analytics, and NAS solutions
A Smarter Cloud for a Smarter Planet
Joe Krotz
IBM CTS – Cloud and Storage Systems
August 2015
SOURCE: *2014 IBM Institute for Business Value Study on Infrastructure Matters; Gartner IT Metrics
The top two challenges organizations face with IT infrastructure are storage related: Data Management and Cost Efficiency
Traditional storage models are being disrupted by the explosion of data
Data Explosion
• 2.5 billion gigabytes of data per day
• 90% of data created in the last two years
Data Innovation
• 30% lower TCO with Flash
• 50% lower storage management cost and Hybrid delivery with Software Defined Storage
Data Economics
• 0.4% overall IT budget growth in 2015
• 670% more data in 5 years for storage administrators
Software Defined Infrastructure
IBM.com/systems/storage/spectrum/
IBM Spectrum Scale – History and evolution
Research in video streaming started in 1993, commercialized in 1994
1998: GPFS (General Parallel File System), first called GPFS
• HPC
2002: General file serving
• Standards: Portable Operating System Interface (POSIX) semantics
• Large block
• Directory and small file performance
• Data management
• Virtual Tape Server (VTS)
• Linux® clusters (multiple architectures)
• IBM AIX® loose clusters
2005: GPFS 2.1-2.3
• HPC: research, visualization, digital media, seismic, weather exploration, life sciences
• 32 bit / 64 bit
• Inter-op (IBM AIX & Linux)
• GPFS Multicluster
• GPFS over wide area networks (WAN)
• Large scale clusters, thousands of nodes
2006: GPFS 3.1-3.2
• Ease of administration
• Multiple networks / RDMA
• Distributed Token Management
• Windows 2008
• Multiple NSD servers
• NFS v4 support
• Small file performance
• Information lifecycle management (ILM): storage pools, filesets, policy engine
2009: GPFS 3.3
• Restricted admin functions
• Improved installation
• New license model
• Improved snapshot and backup
• Improved ILM policy engine
2010: GPFS 3.4
• Enhanced Windows cluster support: homogeneous Windows Server
• Performance and scaling improvements
• Enhanced migration and diagnostics support
2012: GPFS 3.5
• Active File Management
• GPFS Native RAID
• GPFS Shared Nothing Cluster (GPFS-FPO)
• GPFS Storage Server
2014: GPFS 4.1 (code name Elastic Storage)
• Encryption
• Performance (LROC, AFM)
• Usability (Network Monitor, NFS migration, FPO)
• Elastic Storage on Linux on System z
• Cloud Service on IBM SoftLayer
• Elastic Storage Server
2015: GPFS 4.1.1 (IBM Spectrum Scale)
• Enhanced client experience
• New protocols: NFSv4, SMB/CIFS, improved OpenStack Swift
• Async DR
GPFS 4.1+: part of IBM Spectrum Storage, Software Defined Storage
Spectrum Scale proven at over 3,000 customers worldwide
• Climate and weather modeling with 16 petabytes online and 12 petabytes archived on tape
• Four-time champion Infiniti Red Bull Racing does real-time race analytics
• Wind turbine design analysis done in hours instead of weeks
• Private cloud for digital media enables global collaboration for film production
• Personalized cancer treatment for over 65,000 patients
• R&D environment for natural language tools
IBM Spectrum Scale: Data Security, Performance and Usability
Security
• Native encryption and secure erase
• Disaster Recovery
Scalability and Snapshots
• Point in time copies
Performance
• Flash acceleration and local read only cache
• Fully integrated ILM
Usability
• New configuration guidance
• Installation toolkit quickly installs IBM Spectrum Scale software for Client, Server, FPO, and protocol nodes
• Tight integration with IBM Spectrum Control for system health monitoring
Native Encryption and Secure Erase
• Native: encryption is built into the “Advanced” product
• Protects data from security breaches, unauthorized access, and being lost, stolen or improperly discarded
• Cryptographic erase for fast, simple and secure file deletion
• Complies with NIST SP 800-131A and is FIPS 140-2 certified
• Supports HIPAA, Sarbanes-Oxley, EU and national data privacy law compliance
Native Encryption and Secure Erase
Encryption of data at rest
• Files are encrypted before they are stored on disk
• Keys are never written to disk
• No data leakage in case disks are stolen or improperly decommissioned
Secure deletion
• Ability to destroy arbitrarily large subsets of a file system
• No “digital shredding”, no overwriting: secure deletion is a cryptographic operation
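The idea that secure deletion can be a cryptographic operation, with no overwriting, can be illustrated with a small sketch. This is a toy under stated assumptions, not how Spectrum Scale implements encryption: the keystream built from SHA-256 is for illustration only and is not a vetted cipher, and the in-memory `keys` dictionary is a hypothetical stand-in for whatever key management the product uses.

```python
import hashlib
import secrets


def keystream(key, length):
    """Toy keystream: SHA-256 over key + counter. Illustration only, not a vetted cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_crypt(key, data):
    """Encrypt or decrypt by XOR with the keystream (the same operation both ways)."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


class ToyEncryptedStore:
    """Hypothetical store: ciphertext goes to 'disk', keys stay in memory only."""

    def __init__(self):
        self.on_disk = {}  # file name -> ciphertext
        self.keys = {}     # file name -> key (assumed key store, never written to disk)

    def write(self, name, plaintext):
        key = secrets.token_bytes(32)
        self.keys[name] = key
        self.on_disk[name] = xor_crypt(key, plaintext)

    def read(self, name):
        return xor_crypt(self.keys[name], self.on_disk[name])

    def secure_delete(self, name):
        # No overwrite of the ciphertext: destroying the key is the deletion.
        del self.keys[name]


store = ToyEncryptedStore()
store.write("report.txt", b"confidential contents")
assert store.read("report.txt") == b"confidential contents"
store.secure_delete("report.txt")
# The ciphertext is still present, but without the key it can no longer be read.
```

Because deletion only touches the key, its cost does not grow with the amount of data being erased, which is what makes destroying arbitrarily large subsets of a file system practical.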
Data protection: Disaster recovery
The challenge: How do I recover data after a major disastrous event that could not be anticipated?
• Force majeure, e.g. earthquake or hurricane
• Accidents, e.g. fire or flood
• Administrator mistake
The solution
• IBM Spectrum Scale lets you mirror your data at a secondary site
• Set your Recovery Point Objective (RPO) at say 15 mins, 30 mins, 1 hour, etc.
• If the primary site fails, data requests are automatically redirected to the secondary site
• Asynchronous updates accommodate unreliable networks
Scalability and Snapshots
IBM Spectrum Scale can create snapshots at the file system, fileset, and file level. Each Spectrum Scale file system can hold multiple snapshots of any of these types at the same time.
The snapshot function allows a backup or mirror program to run concurrently with user updates and still obtain a consistent copy of the file system as of the time that the snapshot was created.
Snapshot capacity
IBM Spectrum Scale V4.1 can retain 256 Global snapshots and 256 Snapshots of each Independent Fileset.
IBM Spectrum Scale V4.1 can have 10,000 dependent filesets and 1,000 independent filesets.
Scalability
• Maximum number of files per file system: 2^64 (9 quintillion)
• Maximum file system size: 2^99 bytes
• Maximum number of nodes: 16,384
IBM Spectrum Scale is designed to meet the needs of data-intensive applications such as engineering design, digital media, data mining, relational databases, financial analytics, seismic data processing, scientific research and scalable file serving. The solution scales up to more than a billion petabytes of data and hundreds of GB/s throughput.
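As a quick check of the capacity claim, the maximum file system size of 2^99 bytes can be converted to petabytes. This is a small arithmetic sketch; the use of decimal petabytes (10^15 bytes) is an assumption, since the slides do not say which unit convention they use.

```python
MAX_FS_BYTES = 2 ** 99   # maximum file system size quoted on the slide
PETABYTE = 10 ** 15      # decimal petabyte (assumed convention)
BILLION = 10 ** 9


def bytes_to_petabytes(n_bytes):
    """Whole petabytes contained in n_bytes (integer division)."""
    return n_bytes // PETABYTE


max_fs_pb = bytes_to_petabytes(MAX_FS_BYTES)
print(f"Maximum file system size: {max_fs_pb:,} PB")
print("More than a billion petabytes:", max_fs_pb > BILLION)
```

The comparison holds with a wide margin under either the decimal or the binary (2^50 bytes) definition of a petabyte.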
Flash Local Read Only Cache (LROC)
Diagram labels: Clients, Flash LROC SSDs, Spectrum Scale
• Inexpensive SSDs or Flash placed directly in Client nodes
• Accelerates I/O performance up to 6x by reducing the amount of time CPUs wait for data
• Also decreases the overall load on the network, benefitting performance across the board
• Improves application performance while maintaining all the manageability benefits of shared storage
• Cache consistency ensured by standard tokens
• Data is protected by checksum and verified on read
• Spectrum Scale handles the flash cache automatically so data is transparently available to your application with very low latency and no code changes
LROC Flash Cache Example Speed Up
• Two consumer grade 200 GB SSDs cache a storage system of forty-eight 300 GB 10K RPM SAS disks
• Initially, with all data coming from the disk storage system, the client reads data from the SAS disks at ~5,000 IOPS
• As more data is cached in Flash, client performance increases to ~32,000 IOPS while reducing the load on the disk subsystem by more than 95%
• Result: ~5,000 IOPS (10K RPM SAS drives) versus ~32,000 IOPS (Flash SSD), a speed-up of ~6x
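The figures in this example can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted on the slide; the variable names are illustrative.

```python
# Figures quoted in the LROC example
disk_iops = 5_000        # reads served from 10K RPM SAS disks
flash_iops = 32_000      # reads once the working set is cached in flash

cache_gb = 2 * 200       # two 200 GB SSDs
disk_gb = 48 * 300       # forty-eight 300 GB SAS disks

speedup = flash_iops / disk_iops
cache_fraction = cache_gb / disk_gb

print(f"Speed-up: {speedup:.1f}x")
print(f"Cache size relative to disk capacity: {cache_fraction:.1%}")
```

The speed-up works out to 6.4x, consistent with the "~6x" on the slide, and the flash cache is under 3% of the raw disk capacity it fronts.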
Performance: Using IBM Spectrum Scale with FlashSystem
FlashSystem is data center optimized to deliver extreme performance, flexible capacity and total system protection
Diagram configurations: FlashSystem as cache; FlashSystem for metadata storage; FlashSystem as storage tier; Spectrum Scale cluster with primary storage
Diagram labels: IBM FlashSystem, HDD storage, All files, Hot files, All other files, Data and metadata, Data, Metadata
Collaboration with Active File Management (AFM)
• AFM makes the global namespace truly global by automatically managing asynchronous synchronization of data
• Only the modified contents are synchronized from the primary to the remote site
• Local caching: cached data access performs much better than WAN access
• Latencies are improved
• WAN link usage is reduced
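The effect of synchronizing only modified contents can be sketched as a comparison between the two sites. This is an illustration under assumptions: the slides do not describe how AFM tracks changes, so the content digests, the dictionaries, and the function name here are hypothetical.

```python
def paths_to_sync(primary, remote):
    """Paths whose content differs at the remote site or is missing there.

    Both arguments map a file path to a content digest (hypothetical bookkeeping).
    """
    return sorted(path for path, digest in primary.items() if remote.get(path) != digest)


primary_site = {
    "/proj/scene1.mov": "a1",
    "/proj/scene2.mov": "b7",   # edited at the primary site
    "/proj/notes.txt": "c3",    # new file, not yet at the remote site
}
remote_site = {
    "/proj/scene1.mov": "a1",
    "/proj/scene2.mov": "b2",
}

print(paths_to_sync(primary_site, remote_site))
```

Only the edited and the new file are selected; the unchanged file never crosses the WAN link, which is where the reduced link usage comes from.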
3 Deployment Options
• Software Only: software licenses in Express, Standard or Advanced Editions
• IBM Elastic Storage Server: IBM Spectrum Scale SW, GUI, GNR, drives, services
• Managed Service: IBM high performance services for data (Spectrum Scale + SoftLayer Cloud)
Spectrum Scale Parallel Architecture
• Clients use data; Network Shared Disk (NSD) servers serve shared data
• All NSD servers export to all clients in active-active mode
• Spectrum Scale stripes files across NSD servers and NSDs in units of the file-system block size
• Each NSD client communicates with all the servers
• File-system load is spread evenly across all the servers and storage, with no hot spots
• Easy to scale file-system capacity and performance while keeping the architecture balanced
• The NSD client does real-time parallel I/O to all the NSD servers and storage volumes/NSDs
Diagram label: File stored in blocks
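Striping a file across NSDs in block-size units can be sketched as follows. The plain round-robin placement, the NSD names, and the function names are assumptions made for illustration; the slides state only that files are striped in units of the file-system block size, not the exact placement rule.

```python
def stripe_layout(file_size, block_size, nsds):
    """Assign each block of a file to an NSD in round-robin order (assumed placement)."""
    n_blocks = -(-file_size // block_size)  # ceiling division
    return [(block, nsds[block % len(nsds)]) for block in range(n_blocks)]


def blocks_per_nsd(layout):
    """Count how many blocks landed on each NSD."""
    counts = {}
    for _, nsd in layout:
        counts[nsd] = counts.get(nsd, 0) + 1
    return counts


MIB = 1024 * 1024
layout = stripe_layout(file_size=10 * MIB, block_size=1 * MIB,
                       nsds=["nsd1", "nsd2", "nsd3", "nsd4"])
print(blocks_per_nsd(layout))
```

With ten blocks over four NSDs, no NSD holds more than one block above any other, which is the even load distribution the slide describes. A client reading the file can issue requests to all four NSDs in parallel.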
Spectrum Scale Cluster Models
Diagram labels: Application Nodes, NSD Servers, Storage, Storage Network, TCP/IP or InfiniBand Network, TCP/IP or InfiniBand RDMA Network
IBM Elastic Storage Server
IBM Spectrum Scale bundled solution
Model GL6: 2 servers, 6 enclosures, 28U, 348 NL-SAS, 2 SSD; 2, 4, or 6 TB drives; 12+ GB/sec
Delivers Extreme Data Integrity and Space Efficiency
• 2- and 3-fault-tolerant erasure codes
• Up to 2 PB per rack
• End-to-end checksum
• Protection against lost writes
• Disk Hospital
• Proactively detect, diagnose and resolve disk issues
Breakthrough Performance
• High performance with less hardware
• De-clustered RAID reduces app load during rebuilds
• Up to 3x lower overhead to applications
• Built-in SSDs and NVRAM for write performance
• Fastest rebuild times using De-clustered RAID
• Graphical disk failure management
Lowers TCO
• 3 Years Maintenance and Support
• General Purpose Servers
• Off-the-shelf SBODs
• Standardized in-band SES management
• Standard Linux
• Modular Upgrades
• Faster than alternatives today and tomorrow!
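The "up to 2 PB per rack" figure can be related to the Model GL6 drive count with simple arithmetic. The sketch below computes raw capacity only, using the 348 NL-SAS drives and the 2, 4, or 6 TB drive sizes from the slide; usable capacity after erasure-code overhead would be lower and is not estimated here. The function name is illustrative.

```python
NL_SAS_DRIVES = 348            # Model GL6 drive count from the slide
DRIVE_SIZES_TB = (2, 4, 6)     # drive options from the slide


def raw_capacity_tb(drive_count, drive_size_tb):
    """Raw capacity in TB, before any erasure-code or spare-space overhead."""
    return drive_count * drive_size_tb


for size_tb in DRIVE_SIZES_TB:
    raw_tb = raw_capacity_tb(NL_SAS_DRIVES, size_tb)
    print(f"{size_tb} TB drives: {raw_tb} TB raw ({raw_tb / 1000:.2f} PB)")
```

With 6 TB drives the raw total is just over 2 PB, in line with the per-rack figure quoted on the slide.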
Spectrum Scale Use Cases
Single software defined storage solution across all these application types
• NAS (File): NFS, SMB/CIFS, POSIX
• Big Data & Analytics: Hadoop Connector
• Cloud: Cinder (Block), Swift (Object)
Spectrum Scale shared storage
• Single Name Space
• Linear capacity & performance scale out
• Enterprise class storage using standard hardware
IBM Spectrum Scale benefits over other NAS solutions
Better performance
• Eliminate hotspots with massively parallel access to files
• Sequential I/O with ES greater than 400 GB/s
• Throughput advantage for parallel streaming workloads, e.g. Tech Computing and Analytics
• More Storage. More Files. Hyper Scale.
Simplified Management
• Easier management with one global namespace instead of managing islands of NAS arrays, e.g. no need to copy data between compute clusters
• Integrated policy driven automation
• Fewer storage administrators required
Lower Cost
• Optimizes storage tiers including flash, disk and tape
• Increased efficiency and more efficient provisioning due to parallelization and striping technology
• Remove duplicate copies of data, e.g. run analytics on one copy of data without having to set up a separate silo
Data Access: IBM Spectrum Scale protocol support
• The IBM Spectrum Scale Protocol Node allows access to data stored in a Spectrum Scale filesystem, using additional access methods and protocols.
• The Protocol Node functions are clustered and can support transparent failover for the NFS, Swift, and SMB protocols.
• Multiprotocol data access from other systems using the following protocols:
• NFS v3 and v4
• SMB 2 and SMB 3.0 mandatory features / CIFS for Windows support
• SMB support is delivered by Samba 4.2.
• 3,000 active connections per node / 20K max
• OpenStack Swift and S3 API support for object storage.
Data Access: Protocol Support
Diagram labels: Users (NFS, SMB/CIFS, POSIX, OpenStack Swift, OpenStack Cinder); Administrator (command line interface); external TCP/IP or IB network; protocol nodes (PN1, PN2, … PNn); management nodes; authentication services (Keystone); IBM Spectrum Scale cluster TCP/IP or IB network; Spectrum Scale cluster nodes; Network Shared Disks (NSD1, NSD2, … NSDn); physical storage (flash, disk, tape); Elastic Storage Server
New GUI coming in October!
Spectrum Scale: Drop-in Replacement for HDFS
Adding Analytics without adding a dedicated Analytics infrastructure
• Hadoop connector
• Supports IBM BigInsights Analytics and open source Apache Hadoop
• Existing infrastructure can do Hadoop-based Analytics
• No need to purchase a dedicated Analytics infrastructure, lowering CAPEX and OPEX
• No need to move data in and out of an Analytics dedicated silo
• Software defined infrastructure for multi-tenancy
• Enterprise-class protection and efficiency
‒ Full data lifecycle management
‒ Policy based tiering from flash to disk to tape
• Reduce cost, simplify management
Diagram labels: Compute Cluster, HDFS, Spectrum Scale
IBM Spectrum Scale: Hybrid storage for Hadoop Applications
Diagram labels: Hadoop Application, Hadoop File System API, Spectrum Scale Hadoop Connector, Spectrum Scale client, Shared Storage Pools (Disk, Flash), Shared Nothing Cluster Pool
• Exploit locality for the files stored in the local storage
• Access shared storage through the same connector
• Storage is completely transparent to the application
• Scale storage independent of compute nodes
• The IBM Spectrum Scale Hadoop connector has been extended to support shared storage that includes SAN Based storage, shared nothing cluster configurations, and integrated solutions like ESS.
• Full Hadoop interfaces for Map/Reduce analytics processing.
• No transfer or ingest required as the data is already there
Spectrum Scale OpenStack Cinder Driver
• OpenStack Havana release includes a Cinder driver
‒ Giving architects access to the features and capabilities of the industry’s leading enterprise scale-out software defined storage
• With OpenStack on Spectrum Scale, all nodes see all data
‒ Copying data between services, like Glance to Cinder, is minimized or eliminated
‒ Speeding instance creation and conserving storage space
• Rich set of data management and information lifecycle features
‒ Efficient file clones
‒ Policy based automation optimizing data placement for locality or performance tier
‒ Backup
• Industrial strength reliability, minimizing risk
• Cinder driver provides resilient block storage, minimal data copying between services, speedy instance creation and efficient space utilization
IBM Spectrum Scale and OpenStack Swift
• Consolidate File and Object under a single shared storage infrastructure.
• The new IBM Spectrum Scale Protocol Node lets you share the storage infrastructure for both Files and Objects
• Running your object store on IBM Spectrum Scale provides these key features:
• POSIX/NFS/SMB/Object in single storage cluster with a single authentication scheme
• Extra layers of data protection through Snapshots, Backup, and/or Disaster Recovery
• Integrated ILM tiering to move cold objects to low cost tier and off premise
• Encryption of data at rest and Secure Erase
• Additional data protection with the ESS solution
Diagram labels: IBM Spectrum Scale; access through NFS, SMB, POSIX, Swift, HDFS, Cinder, Glance, Manila; storage tiers SSD, Fast Disk, Slow Disk, Tape
OpenStack and IBM Spectrum Scale help clients manage data at scale
Business need (Business): I need virtually unlimited storage
IBM Spectrum Scale: An open & scalable cloud platform
Business need (Operations): I need a flexible infrastructure that supports both object and file based storage
IBM Spectrum Scale: A single data plane that supports Cinder, Glance, Swift, Manila as well as NFS, et al.
Business need (Operations): I need to minimize the time it takes to perform common storage management tasks
IBM Spectrum Scale: A fully automated policy based data placement and migration tool
Business need (Collaboration): I need to share data between people, departments and sites with low latency
IBM Spectrum Scale: Sharing with a variety of WAN caching modes
Data Center and Point of Presence
Map legend: New Data Centers in 2014, Network Point of Presence
• 100,000+ servers
• 21,000 customers
• 20,000,000 active domains
• IPv4/IPv6 dual stack
• Global DNS
• Global DDoS mitigation
• Global Internet exchanges & peering
Infrastructure solution with a common management interface and API across a unified architecture
Mix and match bare metal servers, virtual server instances, and hosted private clouds
Full integration with all IBM storage portfolio offerings
Full OpenStack, RESTful API, SmartCloud Storage and IBM Storage Integration Server integration
Seamless scaling for Cloud and large deployments. This includes Public, Private and Hybrid solutions
Bare metal with your own stack
Dedicated virtualized environment
Shared virtual environment
Dedicated virtualized environment
Triple Network Architecture
Automation & Support
Delivers Outstanding Performance & Price
Flexibility to Deliver Dynamic/ Hybrid Capability
What is LTFS?
1) Open format for data which is written to tape
• Developed and disclosed by IBM
• Describes the format of data and metadata stored on tape
• Metadata is based on an XML schema
• Applicable to LTO5, LTO6 and TS1140
• Requires tape partitioning
2) File system support (code) to read and write tapes in LTFS format
• Externalizes the LTO5 / LTO6 / TS1140 tape as a file system
• Enables standard applications to write/read LTFS tapes
• Supports update, edit, and delete of files on LTFS tape
• Supports partial recall
• Available on Linux, Mac OS X and Windows
• Makes tape look and work like any removable media (e.g., USB drive, removable disk)
LTFS Mount point is the library
Cartridges are subdirectories
LTFS mounts cartridges into drive to service file access requests
Easy usage, no ISV required
Caching of tape indices in memory
For searching and displaying tape contents without needing a mount
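Because the library appears as a mount point with one subdirectory per cartridge, ordinary file system calls are enough to browse it. The sketch below simulates that layout in a temporary directory; the cartridge names, file name, and helper functions are hypothetical, and a real library mount path would depend on the installation.

```python
import tempfile
from pathlib import Path


def list_cartridges(library_mount):
    """Cartridges show up as subdirectories of the library mount point."""
    return sorted(p.name for p in Path(library_mount).iterdir() if p.is_dir())


def list_files(library_mount, cartridge):
    """Files on a cartridge are plain directory entries under its subdirectory."""
    return sorted(p.name for p in (Path(library_mount) / cartridge).iterdir() if p.is_file())


# Simulate a library mount point with two cartridges (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    library = Path(tmp)
    for cartridge in ("TAPE001", "TAPE002"):
        (library / cartridge).mkdir()
    (library / "TAPE001" / "survey.dat").write_bytes(b"example")

    print(list_cartridges(library))
    print(list_files(library, "TAPE001"))
```

No tape-specific API appears in the code, which is the point of the "no ISV required" bullet: any application that can walk a directory tree can work with the library.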
Manage the full data life cycle cost-effectively through policy driven ILM
Life cycle stages: Data ingestion or creation, Data processing, Access, Archival
Single Global Name Space across all tiers
High Performance Tier: Flash, SSD, SAS
• Parallel Access
• Provide highest performance for most demanding applications
High volume storage
• Lower costs by allocating the right tier of storage to the right need
Archival storage with low cost disk or tape
• Integration with Spectrum Protect and Spectrum Archive
• Policy based Archival and remote Disaster Recovery
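Policy driven placement across tiers can be pictured as a rule that maps a file attribute to a tier. The sketch below is a plain Python illustration, not the Spectrum Scale policy language: the age thresholds, tier names, and function name are all hypothetical choices made for the example.

```python
from datetime import datetime, timedelta

# (maximum time since last access, tier); thresholds are hypothetical
TIER_RULES = [
    (timedelta(days=7), "flash"),
    (timedelta(days=90), "disk"),
]
ARCHIVE_TIER = "tape"


def choose_tier(last_access, now):
    """Pick the first tier whose age limit covers the file; otherwise archive it."""
    age = now - last_access
    for limit, tier in TIER_RULES:
        if age <= limit:
            return tier
    return ARCHIVE_TIER


now = datetime(2015, 8, 1)
for days_idle in (1, 30, 400):
    tier = choose_tier(now - timedelta(days=days_idle), now)
    print(f"idle {days_idle} days -> {tier}")
```

Because every tier sits under one global name space, a file keeps the same path as a rule like this moves it from flash to disk to tape.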
The Solution: IBM Spectrum Scale brings it all together
Global Name Space
IBM Spectrum Scale replaces SAN-based file systems
• Replaces NTFS, EXT4, JFS2 and other POSIX file systems
• Used by over 200 of the top 500 supercomputers
• No file transfers required between different OS
• Can be used with everything from databases to video streaming
• For x86, POWER and z System servers
• Secure with data-at-rest encryption
IBM Spectrum Scale replaces HDFS and NAS file storage
• Full Hadoop interfaces for Map/Reduce analytics processing
• No transfer or ingest required as the data is already there
• Fully protected with Backup Software
• File-level access support for NFS, CIFS, FTP, SCP and HTTPS
• Supports Enterprise File Sync-and-Share via OwnCloud or Funambol
IBM Spectrum Scale offers Object access
• Object-level access based on OpenStack Swift driver and Amazon S3 APIs
IBM Spectrum Scale supports all media and integrates seamlessly with LTFS
• Spans flash, disk and tape media with a file system view that
For more information:
Websites:
http://www.ibm.com/systems/storage/spectrum/scale/
http://www.ibm.com/cloud-computing/us/en/
Product Pages:
http://www-03.ibm.com/systems/storage/flash/
http://www-03.ibm.com/systems/storage/spectrum/ess/
http://www-03.ibm.com/systems/storage/tape/ltfs/
IBM RedBooks
https://www.redbooks.ibm.com/
Thank you!