
A scalable infrastructure for CMS data analysis based on OpenStack Cloud and GlusterFS
Datacenter Indirection Infrastructure (DII) for High Energy Physics (HEP)

1 Salman Toor, 2 Lirim Osmani, 1 Paula Eerola, 1 Oscar Kraemer, 1 Tomas Lindén, 2 Sasu Tarkoma and 3 John White
1 Helsinki Institute of Physics (HIP), CMS Program; 2 Computer Science Department, University of Helsinki; 3 Helsinki Institute of Physics (HIP), Technology Program
{Salman.Toor, Paula.Eerola, Carl.Kraemer, Tomas.Linden}@helsinki.fi, {Lirim.Osmani, Sasu.Tarkoma}@cs.helsinki.fi, [email protected]

Introduction

The aim of this project is to design and implement a scalable and resilient infrastructure for CERN High Energy Physics (HEP) data analysis. The infrastructure is built from OpenStack components structured as a private cloud on top of the Gluster File System. Our test results show that the adopted approach provides a scalable and resilient solution for managing resources.

Architecture

- Computational resources
  - Dell PowerEdge servers, 2 quad-core Intel Xeon CPUs each
  - 32 GB (8 x 4 GB) RAM
  - 4 x 10 GbE Broadcom 57718 network interfaces
- Gluster File System servers
  - Dell PowerEdge servers
  - 512 GB LUN attached to each server
  - FCoE (Fibre Channel over Ethernet) protocol

System Components

- OpenStack Cloud (Grizzly release)
- Gluster File System
- Advanced Resource Connector (ARC) middleware
- CERN Virtual Machine File System (CVMFS)
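As an illustration of how the ARC middleware exposes the site to users, the sketch below submits a minimal test job with the standard ARC client tools. This is only a sketch: the xRSL job description, the cluster endpoint, and the executable name are hypothetical examples, not values taken from the poster.

```python
# Minimal sketch: submitting a test job through the ARC client tools.
# The xRSL description, cluster endpoint and file names are hypothetical;
# "run.sh" is assumed to exist in the working directory.
import subprocess
import tempfile

XRSL = """&
(executable = "run.sh")
(jobName = "cms-test-job")
(stdout = "stdout.txt")
(stderr = "stderr.txt")
(cpuTime = "60 minutes")
"""

def submit_test_job(cluster="example-ce.hip.fi"):
    """Write a minimal xRSL job description and submit it with arcsub."""
    with tempfile.NamedTemporaryFile("w", suffix=".xrsl", delete=False) as f:
        f.write(XRSL)
        jobfile = f.name
    # arcsub prints the job ID on success; arcstat/arcget can then be used
    # to track and retrieve the job.
    subprocess.run(["arcsub", "-c", cluster, jobfile], check=True)

if __name__ == "__main__":
    submit_test_job()
```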

Live Migration

- Experiments with different kinds of instances:
  - a minimal Ubuntu m1.small VM took 6 s
  - a fully configured worker-node VM took 43 s
- Random failures were experienced when a large number of live-migration requests were issued (a command-level sketch of such a stress test is shown below)
- About 4% performance loss, evaluated with the HEPSPEC2006 benchmark
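The poster does not show how the migration requests were issued; the sketch below is one plausible way to drive many live migrations in quick succession with the Grizzly-era nova CLI, assuming a shared GlusterFS instance store. Instance names and the target hypervisor are hypothetical.

```python
# Minimal sketch: firing many live-migration requests in quick succession,
# roughly the kind of stress test under which random failures were observed.
# Instance names and the target compute host are hypothetical; OpenStack
# credentials are taken from the usual OS_* environment variables.
import subprocess

INSTANCES = [f"wn-{i:02d}" for i in range(1, 11)]   # hypothetical VM names
TARGET_HOST = "compute-05"                          # hypothetical hypervisor

def live_migrate_all():
    """Request a live migration for each instance without waiting in between."""
    for name in INSTANCES:
        # With a shared GlusterFS instance store, plain live-migration is
        # sufficient; --block-migrate would only be needed for local disks.
        subprocess.run(["nova", "live-migration", name, TARGET_HOST], check=False)

if __name__ == "__main__":
    live_migrate_all()
```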

System Stability

- Burst-mode VM boot requests were tested against both local-disk and GlusterFS-based setups (a sketch of such a burst is shown below)
- GlusterFS gave a uniform boot response across the burst
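A minimal sketch of issuing such a burst of boot requests with the nova CLI is given below; the flavor, image name, and instance count are hypothetical placeholders rather than values reported on the poster.

```python
# Minimal sketch: booting a burst of identical VMs in a single request using
# the nova CLI's --min-count/--max-count options. Flavor, image and count are
# hypothetical; credentials come from the OS_* environment variables.
import subprocess

def boot_burst(count=20, flavor="m1.small", image="ubuntu-12.04", prefix="burst"):
    """Ask nova to start `count` instances at once and return immediately."""
    subprocess.run(
        ["nova", "boot",
         "--flavor", flavor,
         "--image", image,
         "--min-count", str(count),
         "--max-count", str(count),
         prefix],
        check=True,
    )

if __name__ == "__main__":
    boot_burst()
```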

GlusterFS IO

GlusterFS is used in two roles (see the volume sketch below):
- inside the cloud, to provide a shared area for the virtual machines (Brick 1 & 2, 2 TB)
- outside the cloud, for the Glance and Nova repositories (Brick 3 & 4, 1 TB)
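The poster does not list the exact volume layout; the sketch below shows one plausible way to create and mount a two-brick replicated GlusterFS volume of this kind with the standard gluster CLI. Host names, brick paths, and the volume name are hypothetical.

```python
# Minimal sketch: creating and mounting a two-brick replicated GlusterFS
# volume for the shared VM area. Host names, brick paths and the volume name
# are hypothetical; run with root privileges on the storage/compute nodes.
import subprocess

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

def create_shared_volume():
    # One replicated volume spanning the two GlusterFS servers (Brick 1 & 2).
    run(["gluster", "volume", "create", "vm-shared", "replica", "2",
         "gluster1:/export/brick1", "gluster2:/export/brick2"])
    run(["gluster", "volume", "start", "vm-shared"])

def mount_on_client(mountpoint="/mnt/vm-shared"):
    # The native FUSE client mounts the volume on each compute/worker node.
    run(["mkdir", "-p", mountpoint])
    run(["mount", "-t", "glusterfs", "gluster1:/vm-shared", mountpoint])

if __name__ == "__main__":
    create_shared_volume()
    mount_on_client()
```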

                Brick-1            Brick-2            Brick-3            Brick-4
Days            20                 20                 20                 20
Total Reads     40 GB              41 GB              169 GB             831 GB
Total Writes    38 GB              36 GB              364 GB             1017 GB

Average - Maximum Latency (milliseconds) / No. of Calls (in millions)
Read            0.05-11.6 / 29.8   0.05-10.8 / 31.0   0.29-217.1 / 0.6   0.38-2386 / 0.32
Write           0.08-16.8 / 1.8    0.07-14.0 / 1.9    1.66-5151 / 10.8   1.06-7281 / 7.8
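The poster does not say how these per-brick latency and call-count figures were collected; GlusterFS's built-in profiler is one way to obtain numbers of this shape, as in the sketch below (the volume name is hypothetical).

```python
# Minimal sketch: collecting per-brick latency and call counts with the
# GlusterFS volume profiler, which reports min/avg/max latency and the number
# of calls per file operation for each brick. The volume name is hypothetical.
import subprocess

def profile_volume(volume="vm-shared"):
    """Enable profiling, then print the cumulative per-brick statistics."""
    subprocess.run(["gluster", "volume", "profile", volume, "start"], check=True)
    # ... let the workload run for the measurement period ...
    result = subprocess.run(["gluster", "volume", "profile", volume, "info"],
                            check=True, capture_output=True, text=True)
    print(result.stdout)

if __name__ == "__main__":
    profile_volume()
```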

Type            Nodes   Cores   Memory (GB)

Virtual Machines (QCOW2 images)
WN              25      4       14
CE              1       4       14
GlusterFS       2       4       14

Real Machines
Controller      1       8       32
Compute         19      8       32
GlusterFS       2       8       32

Local and GlusterFS based storage

        Root (local)   Ephemeral disk   GlusterFS shared mount point
WN      10 GB          45 GB            900 GB
CE      10 GB          -                900 GB
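The worker-node shape above (4 cores, 14 GB RAM, 10 GB root disk, 45 GB ephemeral disk) maps naturally onto a custom OpenStack flavor; a minimal sketch of defining one with the Grizzly-era nova CLI is shown below. The flavor name and the use of an auto-generated ID are hypothetical choices, not taken from the poster.

```python
# Minimal sketch: defining a custom nova flavor matching the worker-node
# shape in the table above (4 vCPUs, 14 GB RAM, 10 GB root, 45 GB ephemeral).
# The flavor name is hypothetical; nova expects RAM in MB and disks in GB.
import subprocess

def create_wn_flavor(name="wn.cms", ram_mb=14 * 1024, root_gb=10,
                     ephemeral_gb=45, vcpus=4):
    subprocess.run(
        ["nova", "flavor-create",
         name, "auto",            # "auto" lets nova pick the flavor ID
         str(ram_mb), str(root_gb), str(vcpus),
         "--ephemeral", str(ephemeral_gb)],
        check=True,
    )

if __name__ == "__main__":
    create_wn_flavor()
```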

Performance Analysis

- The evaluation is based on the CMS Dashboard together with the CSC and NDGF accounting and monitoring systems
- More than 65k jobs have been processed, including CMS production and analysis jobs
- Example of a specific user: 400 analysis jobs were run with 74 days of walltime and 85% CPU efficiency (see the sketch below)
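For clarity, CPU efficiency here is the ratio of consumed CPU time to walltime; the small sketch below works through that arithmetic for the quoted user. The derived CPU time is an implication of the quoted numbers, not a figure stated on the poster.

```python
# Minimal sketch: CPU efficiency = CPU time / walltime. With 74 walltime days
# at 85% efficiency, the implied CPU time is about 63 days.
def cpu_time_days(walltime_days, cpu_efficiency):
    return walltime_days * cpu_efficiency

if __name__ == "__main__":
    walltime = 74.0      # days of walltime reported for the example user
    efficiency = 0.85    # 85% CPU efficiency
    print(f"Implied CPU time: {cpu_time_days(walltime, efficiency):.1f} days")
```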

Conclusion

We have demonstrated:
- more flexible system and user management through the virtualized environment;
- efficient addition and removal of virtual resources;
- scalability with an acceptable performance loss;
- a seamless view of our site through the ARC middleware.

Acknowledgements

- This project is funded by the Academy of Finland (AoF).
- Thanks to Ulf Tigerstedt (CSC) for help with the HEPSPEC tests.
- Thanks to the CMS collaboration for sending production jobs to process.