Transcript of "R&D Activities on Storage in CERN-IT’s FIO group"
Helge Meinhard / CERN-IT, HEPiX Fall 2009, LBNL, 27 October 2009

Page 1: R&D Activities on Storage in CERN-IT’s FIO group

CERN IT Department
CH-1211 Genève 23, Switzerland
www.cern.ch/it

R&D Activities on Storage in CERN-IT’s FIO group

Helge Meinhard / CERN-IT
HEPiX Fall 2009, LBNL
27 October 2009

Page 2: Outline

• Follow-up of two presentations at the Umeå meeting:
  – iSCSI technology (Andras Horvath)
  – Lustre evaluation project (Arne Wiebalck)

Page 3: iSCSI - Motivation

• Three approaches:
  – Possible replacement for rather expensive setups with Fibre Channel SANs (used e.g. for physics databases with Oracle RAC, and for backup infrastructure) or proprietary high-end NAS appliances
    • Potential cost saving
  – Possible replacement for bulk disk servers (CASTOR)
    • Potential gain in availability, reliability and flexibility
  – Possible use for applications for which small disk servers have been used in the past
    • Potential gain in flexibility, cost saving
• Focus is on functionality, robustness and large-scale deployment rather than on ultimate performance

Page 4: iSCSI terminology

• iSCSI is a set of protocols for block-level access to storage
  – Similar to FC
  – Unlike NAS (e.g. NFS)
• "Target": storage unit listening for block-level requests
  – Appliances available on the market
  – Do-it-yourself: put a software stack on a storage node, e.g. our storage-in-a-box nodes
• "Initiator": unit sending block-level requests (e.g. read, write) to the target
  – Most modern operating systems feature an iSCSI initiator stack: Linux (RHEL 4, RHEL 5), Windows
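
Not on the original slides: a minimal sketch of the initiator side on Linux with the standard open-iscsi tools (iscsi-initiator-utils); the portal address and IQN are made up for illustration.

    # discover the targets offered by a portal (hypothetical address)
    iscsiadm -m discovery -t sendtargets -p 192.0.2.10
    # log in to one of the discovered targets (hypothetical IQN)
    iscsiadm -m node -T iqn.2009-10.ch.cern:storage01.disk1 -p 192.0.2.10 --login
    # the exported LUN then shows up as an ordinary block device, e.g. /dev/sdb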

Page 5: Hardware used

• Initiators: a number of different servers, including
  – Dell M610 blades
  – Storage-in-a-box servers
  – All running SLC5
• Targets:
  – Dell EqualLogic PS5000E (12 drives, 2 controllers with 3 GigE ports each)
  – Dell EqualLogic PS6500E (48 drives, 2 controllers with 4 GigE ports each)
  – Infortrend A12E-G2121 (12 drives, 1 controller with 2 GigE ports)
  – Storage-in-a-box: various models with multiple GigE or 10GigE interfaces, running Linux
• Network (if required): private, HP ProCurve 3500 and 6600

Page 6: Target stacks under Linux

• Red Hat Enterprise Linux 5 comes with tgtd
  – Single-threaded
  – Does not scale well
• Tests with IET (iSCSI Enterprise Target)
  – Multi-threaded
  – No performance limitation in our tests
  – Required a newer kernel to work out of the box (Fedora and Ubuntu Server worked for us)
• In the context of the collaboration between CERN and CASPUR, work is going on to understand the steps needed to backport IET to RHEL 5
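
For illustration (not from the slides): with IET, a do-it-yourself target on a storage-in-a-box node comes down to a few lines of configuration; the IQN and backing device below are hypothetical.

    # /etc/ietd.conf (IET) -- export one local disk as an iSCSI LUN
    Target iqn.2009-10.ch.cern:storage01.disk1
        Lun 0 Path=/dev/sdb,Type=blockio
    # then start the target daemon: service iscsi-target start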

Page 7: Performance comparison

• 8 kB random I/O test with Oracle’s Orion tool
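
As a rough illustration of this kind of test (the exact parameters are not on the slides): Orion takes the LUNs to exercise from a <testname>.lun file and can be asked for small random I/O only.

    # mytest.lun lists the block devices to test, one per line (e.g. /dev/sdb)
    ./orion -run advanced -testname mytest -num_disks 12 \
            -type rand -size_small 8 -matrix row    # 8 kB random I/O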

Page 8: Performance measurement

• 1 server, 3 storage-in-a-box servers as targets
  – Each target exporting 14 JBOD disks over 10GigE

Page 9: Almost production status…

• Two storage-in-a-box servers with hardware RAID 5, running SLC5 and tgtd on GigE
  – Initiator provides multipathing and software RAID 1 (sketched below)
  – Used for some grid services
  – No issues
• Two Infortrend boxes (JBOD configuration)
  – Again, the initiator provides multipathing and software RAID 1
  – Used as backend storage for the Lustre MDT (see next part)
• Tools for setup, configuration and monitoring are in place
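
A minimal sketch of the initiator-side layering used here, assuming the two target volumes already show up as multipath devices mpatha and mpathb (device names are illustrative):

    # aggregate the redundant iSCSI paths with the device mapper
    service multipathd start
    multipath -ll                      # verify the resulting mpath devices
    # mirror the two targets with Linux software RAID 1
    mdadm --create /dev/md0 --level=1 --raid-devices=2 \
          /dev/mapper/mpatha /dev/mapper/mpathb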

Page 10: Being worked on

• Large deployment of EqualLogic 'Sumos' (48 drives of 1 TB each, dual controllers, 4 GigE ports per controller): 24 systems, 48 front-end nodes
• Experience encouraging, but there are issues:
  – Controllers don’t support DHCP; manual configuration required
  – Buggy firmware
  – Problems with batteries on controllers
  – Support not yet fully integrated into Dell structures
• Remarkable stability
  – We have failed all network and server components that can fail; the boxes kept running
• Remarkable performance

Page 11: EqualLogic performance

• 16 servers, 8 'Sumos', 1 GigE per server, measured with iozone
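
For reference, a typical iozone throughput run of this kind looks as follows (file sizes and paths are illustrative, not the exact parameters of this measurement):

    # sequential write (-i 0) and read (-i 1), 2 threads, 8 GB per file
    iozone -i 0 -i 1 -r 1024k -s 8g -t 2 \
           -F /mnt/eql/f1 /mnt/eql/f2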

Page 12: Appliances vs. home-made

• Appliances
  – Stable
  – Performant
  – Highly functional (EqualLogic: snapshots, relocation without server involvement, automatic load balancing, …)
• Home-made with storage-in-a-box servers
  – Inexpensive
  – Complete control over configuration
  – Can run things other than the target software stack
  – Can select the function at software install time (iSCSI target vs. classical disk server with rfiod or xrootd)

Page 13: Ideas (partly started testing)

• Two storage-in-a-box servers as a highly redundant setup
  – Running target and initiator stacks at the same time
  – Mounting half the disks locally, half on the other machine
  – A heartbeat detects failures and (e.g. by resetting an IP alias) moves functionality to one or the other box
• Several storage-in-a-box servers as targets (see the sketch below)
  – Exporting disks either as JBOD or as RAID
  – Front-end server creates a software RAID (e.g. RAID 6) over volumes from all storage-in-a-box servers
  – Any one (or two, with software RAID 6) storage-in-a-box server can fail entirely; the data remain available
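
A sketch of the second idea, assuming the front-end server has logged in to six targets whose volumes appear as /dev/sdc through /dev/sdh (names illustrative):

    # software RAID 6 across volumes from six storage-in-a-box targets;
    # any two complete servers can then fail without data loss
    mdadm --create /dev/md10 --level=6 --raid-devices=6 \
          /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh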

Page 14: Lustre Evaluation Project

• Tasks and goals
  – Evaluate Lustre as a candidate for storage consolidation
    • Home directories
    • Project space
    • Analysis space
    • HSM
  – Reduce the service catalogue
    • Increase overlap between service teams
    • Integrate with CERN fabric management tools

Page 15: Areas of interest (1/2)

• Installation
  – Quattorized installation of Lustre instances
  – Client RPMs for SLC5
• Backup
  – LVM-based snapshots for metadata (sketched below)
  – Tested with TSM, set up for the PPS instance
  – Changelogs feature of v2.0 not yet usable
• Strong authentication
  – v2.0: early adoption, full Kerberos support in Q1/2011
  – Tested and used by other sites (not by us yet)
• Fault tolerance
  – Lustre comes with built-in failover
  – PPS MDS iSCSI setup
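
The LVM-based metadata backup (second bullet above) amounts to a sequence like the following; this is a sketch, with hypothetical volume group and mount point names, assuming the MDT lives on an LVM volume and the server has the ldiskfs module loaded.

    # freeze a consistent view of the MDT and back it up with TSM
    lvcreate --snapshot --size 10G --name mdt-snap /dev/vg_mdt/mdt
    mount -t ldiskfs -o ro /dev/vg_mdt/mdt-snap /mnt/mdt-snap
    dsmc incremental /mnt/mdt-snap     # TSM backup-archive client
    umount /mnt/mdt-snap
    lvremove -f /dev/vg_mdt/mdt-snap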

Page 16: FT: MDS PPS Setup

[Diagram: MDS blades (Dell PowerEdge M600, 16 GB) accessing the MDT on Dell EqualLogic iSCSI arrays (16x 500 GB SATA) over a private iSCSI network, with OSS and client (CLT) nodes alongside]

• Fully redundant against component failure
  – iSCSI for shared storage
  – Linux device mapper + md for mirroring (sketched below)
  – Quattorized
  – Needs testing
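
Roughly, the shared-storage part of this setup corresponds to the following (a sketch with hypothetical device and filesystem names, Lustre 1.8-style options):

    # on the active MDS: mirror the two iSCSI arrays, then format the MDT
    mdadm --create /dev/md0 --level=1 --raid-devices=2 \
          /dev/mapper/eql-a /dev/mapper/eql-b
    mkfs.lustre --fsname=pps --mgs --mdt /dev/md0
    mount -t lustre /dev/md0 /mnt/mdt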

Page 17: Areas of interest (2/2)

• Special performance & optimization
  – Small files: "numbers dropped from slides"
  – Postmark benchmark (not done yet)
• HSM interface
  – Active development, driven by CEA
  – Access to Lustre HSM code (to be tested with TSM/CASTOR)
• Life cycle management (LCM) & tools
  – Support for day-to-day operations?
  – Limited support for setup, monitoring and management

Page 18: Findings and Thoughts

• No strong authentication as of now
  – Foreseen for Q1/2011
• Strong client/server coupling
  – Recovery
• Very powerful users
  – Striping, pools (see the sketch below)
• Missing support for life cycle management
  – No user-transparent data migration
  – Lustre/kernel upgrades difficult
• Moving targets on the roadmap
  – v2.0 not yet stable enough for testing
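
"Very powerful users" refers to controls like the following, which any client user can apply per file or directory (Lustre 1.8 syntax; paths illustrative):

    # stripe new files in this directory across 4 OSTs with 1 MB stripes
    lfs setstripe -c 4 -s 1m /lustre/scratch/mydir
    lfs getstripe /lustre/scratch/mydir    # inspect the resulting layout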

Page 19: Summary

• Some desirable features not there (yet)
  – Wish list communicated to Sun
  – Sun interested in our evaluation
• Some more tests still to be done
  – Kerberos, small files, HSM
• Documentation