Petabyte Scale Data Challenge - Worldwide LHC Computing Grid
ASGC / Jason Shih
Computex, Jun 2nd, 2010
Outline
- Objectives & Milestones
- WLCG experiments and the ASGC Tier-1 Center
- Petabyte Scale Challenge
- Storage Management System
- System Architecture, Configuration and Performance
Objectives
- Build a sustainable research and collaboration infrastructure
- Support e-Science research in data-intensive sciences and applications that require cross-disciplinary, distributed collaboration
ASGC Milestones
- Operational since the deployment of LCG0 in 2002
- ASGC CA established in 2005 (IGTF accreditation in the same year)
- Tier-1 Center responsibility started in 2005
- The federated Taiwan Tier-2 center (Taiwan Analysis Facility, TAF) is also collocated at ASGC
- Representative of the EGEE e-Science Asia Federation since joining EGEE in 2004
- Providing Asia Pacific Regional Operation Center (APROC) services to the region-wide WLCG/EGEE production infrastructure since 2005
- Initiated the Avian Flu Drug Discovery Project in collaboration with EGEE in 2006
- EUAsiaGrid Project started in April 2008
LHC First Beam - Computing at the Petascale
- ATLAS: general purpose, pp, heavy ions
- CMS: general purpose, pp, heavy ions
- ALICE: heavy ions, pp
- LHCb: B-physics, CP violation
Size of the LHC Detectors
[Figure: ATLAS and CMS detectors compared with CERN Bldg. 40]
UNESCO Information Preservation debate, April 2007 - Jamie Shiers @ CERN
http://www.damtp.cam.ac.uk/user/gr/public/bb_history.html
Standard Cosmology
- Good model from 0.01 sec after the Big Bang
- Supported by considerable observational evidence
Elementary Particle Physics
- From the Standard Model into the unknown: towards energies of 1 TeV and beyond - the Terascale
Towards Quantum Gravity
- From the unknown into the unknown...
[Figure axes: time vs. energy, density, temperature]
WLCG Timeline
- First beam in the LHC, Sep. 10, 2008
- Severe incident after 3 weeks of operation (3.5 TeV)
ASGC - Introduction
- Large Hadron Collider (LHC)
- Avian Flu Drug Discovery; Grid Application Platform
- A Worldwide Grid Infrastructure
- Asia Pacific Regional Operation Center
- >250 sites, 48 countries; >68,000 CPUs; >25 PetaBytes; >10,000 users; >200 VOs; >150,000 jobs/day
- Best Demo Award at EGEE'07
- Lightweight Problem Solving Framework
- Most reliable T1 (98.83%); very highly performing and most stable site in CCRC08
- Max CERN/T1-ASGC point-to-point inbound: 9.3 Gbps
Collaborating e-Infrastructures
- "Production" = reliable, sustainable, with commitments to quality of service
- TWGRID
- EUAsiaGrid
- Potential for linking ~80 countries
WLCG Computing Model - The Tier Structure
- Tier-0 (CERN): data recording, initial data reconstruction, data distribution
- Tier-1 (11 countries): permanent storage, re-processing, analysis
- Tier-2 (~130 centres): simulation, end-user analysis
EGEE'07, Budapest, 1-5 October 2007 (Enabling Grids for E-sciencE, EGEE-II INFSO-RI-031688)
Disciplines: Archeology, Astronomy, Astrophysics, Civil Protection, Comp. Chemistry, Earth Sciences, Finance, Fusion, Geophysics, High Energy Physics, Life Sciences, Multimedia, Material Sciences, ...
Why Petabyte? Challenges
Why Petabyte?
- Experiment computing models
- Comparison with conventional data management
Challenges
- Performance: LAN and WAN activities
  - Sufficient bandwidth to and from the CPU farm
  - Eliminate uplink bottlenecks (switch tiers)
- Fast response to critical events
- Fabric infrastructure & service-level agreements
- Scalability and manageability
  - Robust DB engine (Oracle RAC)
  - Knowledge base and adequate administration (training)
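The uplink-bottleneck point can be made concrete with a quick oversubscription calculation. A minimal sketch; the port counts and link speeds below are illustrative assumptions, not ASGC's actual figures:

```python
def oversubscription(edge_ports, edge_gbps, uplink_ports, uplink_gbps):
    """Ratio of aggregate edge bandwidth to aggregate uplink bandwidth.
    A ratio well above 1 means the uplink becomes the bottleneck when
    many disk servers or worker nodes transfer at line rate."""
    return (edge_ports * edge_gbps) / (uplink_ports * uplink_gbps)

# Hypothetical rack: 48 GbE-attached nodes behind 2 x 10G uplinks
print(oversubscription(48, 1, 2, 10))   # 2.4 -> uplink-limited
# Adding two more 10G uplinks brings the ratio close to non-blocking
print(oversubscription(48, 1, 4, 10))   # 1.2
```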
Tier Model and Data Management Components
WLCG Experiment Computing Model

ATLAS T1 Data Flow
[Diagram: flows between Tier-0, the T1 tape system, disk buffer and disk storage, other Tier-1s, and each Tier-2; per-stream rates below]

Stream   File size     Rate      Files/day  Bandwidth  Volume
RAW      1.6 GB/file   0.02 Hz   1.7K       32 MB/s    2.7 TB/day
ESD2     0.5 GB/file   0.02 Hz   1.7K       10 MB/s    0.8 TB/day
AOD2     10 MB/file    0.2 Hz    17K        2 MB/s     0.16 TB/day
AODm2    500 MB/file   0.004 Hz  0.34K      2 MB/s     0.16 TB/day

Aggregates and inter-site flows:
- From Tier-0 (RAW + ESD2 + AODm2): 0.044 Hz, 3.74K files/day, 44 MB/s, 3.66 TB/day
- Within the T1 CPU farm (RAW, ESD 2x, AODm 10x): 1 Hz, 85K files/day, 720 MB/s
- Exchange with other Tier-1s: ESD2 at 0.02 Hz, 10 MB/s, 0.8 TB/day; AODm2 at 0.036 Hz, 3.1K files/day, 18 MB/s, 1.44 TB/day
- To each Tier-2: ESD1 at 0.02 Hz, 10 MB/s, 0.8 TB/day; AODm1/AODm2 at 0.04 Hz, 3.4K files/day, 20 MB/s, 1.6 TB/day
Plus simulation and analysis data flows.
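The per-stream rates in the ATLAS data flow are internally consistent: bandwidth is file size times file-creation rate, and the daily figures follow from 86,400 seconds per day. A small sketch reproducing the RAW stream (decimal units, as on the slide):

```python
def stream_rates(file_size_gb, rate_hz):
    """Derive files/day, bandwidth (MB/s) and daily volume (TB/day)
    from a file size and an average file-creation rate."""
    files_per_day = rate_hz * 86_400
    mb_per_s = file_size_gb * 1_000 * rate_hz
    tb_per_day = mb_per_s * 86_400 / 1_000_000
    return files_per_day, mb_per_s, tb_per_day

# RAW: 1.6 GB/file at 0.02 Hz -> ~1.7K files/day, 32 MB/s, ~2.7 TB/day
files, mbps, tbday = stream_rates(1.6, 0.02)
print(f"{files:.0f} files/day, {mbps:.0f} MB/s, {tbday:.2f} TB/day")
```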
WLCG Tier-1 - Defined Minimum Levels of Service
- The defined response time refers to the maximum delay before taking action.
- Mean time to repair the service is also crucial, but it is covered indirectly through the required availability targets.
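Availability targets translate directly into a monthly downtime budget, which is what the response-time and repair-time requirements ultimately protect. A quick illustration using the 98.83% reliability figure reported for ASGC:

```python
def downtime_budget_hours(availability, period_hours=720):
    """Maximum accumulated downtime (hours) consistent with an
    availability target over a period (default: a 30-day month)."""
    return (1 - availability) * period_hours

# 98.83% availability leaves roughly 8.4 hours/month of downtime
print(round(downtime_budget_hours(0.9883), 1))
```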
WLCG MoU & ASGC Resource Level - Pledged Resources and Projection
[Chart: CPU (KSI2k) and disk/tape (TB) pledges vs. installed capacity, 2005-2010]

Year      CPU (HEP2k6)  Disk (PB)  Tape (PB)
End 2009  29.5K         2.6        2.4
MoU 2009  20K           3.0        3.0
MoU 2010  28K           3.5        3.5
Data Management System
CASTOR v1 (CERN Advanced STORage)
- Satisfactorily served tens of thousands of requests/day per TB of disk cache
- Limitations: 1M files in cache; tape movement API not flexible
CASTOR v2
- DB-centric architecture
- Scheduling feature
- GSI and Kerberos
- Resource management and resource handling
CASTOR Configurations - Current Infrastructure
- Shared core services serving ATLAS and CMS
- Services: Stager, NS, DLF, Repack, and LSF
- DB clusters: two clusters (SRM and NS); 5 service DBs split into the two clusters; 5 Oracle instances
- Total capacity: 0.63 PB and 0.7 PB for CMS and ATLAS respectively
- Current usage: 63% and 44% for CMS and ATLAS
CASTOR Configurations (cont.) - Disk Cache
Disk pools & servers
- Performance (IOPS): with 0.5 kB IO size, 76.4K read and 54K write; both decrease slightly (~9%) when the IO size is increased to 4 kB
- 80 disk servers (6 more online by the end of the 3rd week of Oct.)
- Total capacity: 1.67 PB (0.3 PB allocated dynamically)
- Current usage: 0.79 PB (~58% usage)
- 14 disk pools (8 for ATLAS, 3 for CMS, and another three for bio, SAM, and dynamic allocation)
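The ~58% figure is consistent with measuring usage against only the statically allocated capacity (total minus the 0.3 PB dynamic share). A one-line check under that assumption:

```python
total_pb, dynamic_pb, used_pb = 1.67, 0.30, 0.79

usage_vs_total = used_pb / total_pb                   # ~47% against the raw total
usage_vs_static = used_pb / (total_pb - dynamic_pb)   # ~58%, matching the slide
print(f"{usage_vs_total:.0%} vs {usage_vs_static:.0%}")
```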
Disk Pool Configuration - T1 MSS (CASTOR)
[Chart: installed vs. free capacity (TB) and number of disk servers per pool: atlasGROUPDISK, atlasHotDisk, atlasMCDISK, atlasMCTAPE, atlasPrdD0T1, atlasPrdD1T0, atlasScratchDisk, atlasStage, biomedD1T0, cmsLTD0T1, cmsPrdD1T0, cmsWANOUT, dteamD0T0, Standby]
Distribution of Free Capacity - Per Disk Server vs. per Pool
[Chart: free capacity (TB) per disk pool, 0-250 TB scale, same pools as above]
Storage Server Generation - Drives vs. Total Capacity
[Chart: total capacity (TB) per storage generation vs. number of RAID subsystems]
CASTOR Configurations (cont.) - Core Service Overview

Service Type  OS Level         Release   Remark
Core          SLC 4.7/x86-64   2.1.7-19  Stager/NS/DLF
SRM           SLC 4.7/x86-64   2.7-18    3 head nodes
Disk Svr.     SLC 4.7/x86-64   2.1.7-19  80 in Q3 2009 (20+ in Q4)
Tape Svr.     SLC 4.7/32 + 64  2.1.8-8   x86-64 OS deployed
CASTOR Configurations (cont.) - CMS Disk Cache: Current Resource Level

Space Token / Disk Pool  Capacity/Job Limit  Disk Servers  Tape Pool/Capacity
cmsLTD0T1                278 TB/488          9             *
cmsPrdD1T0               284 TB/1560         13            -
cmsWanOut                72 TB/220           4             -

* Depends on tape family.
CASTOR Configurations (cont.) - ATLAS Disk Cache: Current Resource Level

Space Token       Cap/Job Limit  Disk Servers  Tape Pool/Cap.
atlasMCDISK       163 TB/790     8             -
atlasMCTAPE       38 TB/80       2             atlasMCtp/39 TB
atlasPrdD1T0      278 TB/810     15            -
atlasPrdD0T1      61 TB/210      3             atlasPrdtp/105 TB
atlasScratchDisk  28 TB/80       1             -
atlasHotDisk      2 TB/40        2             -
atlasGROUPDISK    19 TB/40       1             -
Total             950 TB/1835    46            -
IDC Collocation
- Facility installation completed on Mar 27th
- Tape system delayed until after Apr 9th: realignment, RMA for faulty parts
Storage Farm
- ~110 RAID subsystems deployed since 2003
- Supporting both the Tier-1 and Tier-2 storage fabric
- DAS connections to front-end blade servers
- Flexible switching of front-end servers based on performance requirements
- 4-8 Gb Fibre Channel connectivity
CASTOR Configurations (cont.) - Tape Pools

Tape Pool       Capacity (TB)/Usage  Drive Dedication  LTO3/4 Mixed
atlasMCtp       8.98/40%             N                 Y
atlasPrdtp      101/65%              N                 Y
cmsCSA08cruzet  15.6/46%             N                 N
cmsCSA08reco    5/0%                 N                 N
cmsCSAtp        639/99%              N                 Y
cmsLTtp         34.4/44%             N                 N
dteamTest       3.5/1%               N                 N
MSS Monitoring Services
- Standard Nagios probes: NRPE + customized plugins
- SMS to OSE/SM for all types of critical alarms
- Availability metrics
- Tape metrics (SLS): throughput, capacity & scheduler state per VO and disk pool
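A minimal sketch of the kind of customized NRPE plugin mentioned above: it compares a tape-throughput reading against warning/critical thresholds and returns the standard Nagios exit codes. The metric source and threshold values are hypothetical, not ASGC's actual SLA numbers:

```python
import sys

OK, WARNING, CRITICAL = 0, 1, 2   # standard Nagios plugin exit codes

def check_tape_throughput(mb_per_s, warn=50.0, crit=20.0):
    """Return (exit_code, status line) for a tape-throughput check.
    Throughput below `crit` is CRITICAL, below `warn` is WARNING;
    the thresholds are illustrative defaults."""
    perf = f"throughput={mb_per_s:.1f}MB/s;{warn};{crit}"
    if mb_per_s < crit:
        return CRITICAL, f"TAPE CRITICAL - {perf}"
    if mb_per_s < warn:
        return WARNING, f"TAPE WARNING - {perf}"
    return OK, f"TAPE OK - {perf}"

# NRPE would invoke this script with the measured throughput as an argument
if __name__ == "__main__" and len(sys.argv) > 1:
    code, line = check_tape_throughput(float(sys.argv[1]))
    print(line)
    sys.exit(code)
```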
MSS Tape System - Expansion/Upgrade Planning
- Before the incident: 8 LTO3 + 4 LTO4 drives; 720 TB with LTO3, 530 TB with LTO4
- May 2009: two LTO3 drives; MES of 6 LTO4 drives at the end of May; capacity: 1.3 PB (old, LTO3/4 mixed) + 0.8 PB (LTO4)
- New S54 model introduced mid-2009: 2K slots with the tier model; requires an ALMS upgrade and an enhanced gripper
- MES Q3 2009: 18 LTO4 drives; HA implementation resumes in Q4
Expansion Planning
2008
- 0.5 PB expansion of the tape system in Q2
- Met the MoU target in mid-Nov.
- 1.3 MSI2k per rack based on the recent E5450 processor
2009 Q1
- 150 SMP/QC blade servers
- RAID subsystems considering 2 TB per drive: 42 TB net capacity per chassis, 0.75 PB in total
2009 Q3-4
- 18 LTO4 drives - mid-Oct.
- 330 Xeon QC (SMP, Intel 5450) blade servers
- 2nd phase tape MES: 5 LTO4 drives + HA; 3rd phase tape MES: 6 LTO4 drives
- ETA for the 0.8 PB expansion delivery: mid-Nov.
Computing/Storage System Infrastructure
[Diagram: data center network and rack layout; recoverable components listed below]
- CASTOR2 disk servers and tape servers
- Core services: CE, RB, DPM, PX, BDII, etc.
- 4 x GE (SX) uplinks to the ASGC distribution switch in Rack #49 (links to Tier-1 servers)
- Blade centers: 64 x IBM HS20 blades (WN), 142 x IBM HS21 blades (WN), 20 x Quanta blades (WN)
- DC power: SMR 48V/100A with battery banks #1-#4
- 2 x GE (LX) to the 4F M160 (links to HK and JP Tier-2s)
- 2 x GE (LX) to the 4F Taipei GigaPoP 7609 (links to TW Tier-2s)

Data Center - C3 Archive Room
[Photo: ASGC CASTOR2 disk farm]
Throughput of the WLCG Experiments
- Throughput defined as job efficiency x number of running jobs
- Characteristics of the 4 LHC experiments show that the inefficiency is due to poor coding
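The throughput metric above is simply efficiency times occupancy, which is why code inefficiency costs as much as missing hardware. A trivial sketch with illustrative numbers (the 50% efficiency is hypothetical, not a measured value):

```python
def experiment_throughput(job_efficiency, running_jobs):
    """Effective throughput in 'useful job slots': the fraction of CPU
    time spent on useful work times the number of occupied slots."""
    return job_efficiency * running_jobs

# 10,000 running jobs at 50% efficiency deliver only 5,000 useful slots,
# so halving the inefficiency is worth as much as doubling the farm
print(experiment_throughput(0.5, 10_000))  # 5000.0
```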
Reliability from Different Perspectives
Summary
- Deployed a highly scalable data management system and a performance-driven storage infrastructure
  - Eliminating possible complexity of the SRM abstraction layer
  - Resource utilization, provisioning and optimization
- From proof of concept to production, the challenges remain: Data Challenge, Service Challenge, CCRC08, STEP09, etc.
- The motivation appears clear for medical, climate, and cosmological applications
- Operation-wide:
  - Robust database setup
  - Knowledge base for fabric infrastructure operations
  - Fast enough event processing and documentation
- Consider use cases beyond WLCG data management: commonality with many other disciplines in the EGEE infrastructure; actively participate in e-Science collaboration within the region