
Summary Session I

René Brun

27 May 2005, ACAT05


R. Brun, ACAT05 DESY, Zeuthen 2

Outline

• Data Analysis, Data Acquisition and Tools: 6
• GRID Deployment: 4
• Applications on the GRID: 5
• High Speed Computing: 4

19 presentations


Data Analysis, Acquisition, Tools

• Evolution of the BaBar configuration database design
• DAQ software for SND detector
• Interactive Analysis environment of unified accelerator libraries
• DaqProVis, a toolkit for acquisition, analysis, visualisation
• The Graphics Editor in ROOT
• Parallel interactive and batch HEP data analysis with PROOF


Evolution of the Configuration Database Design

Andrei Salnikov, SLAC
For BaBar Computing Group

ACAT05 – DESY, Zeuthen


BaBar database migration

• BaBar was using Objectivity/DB ODBMS for many of its databases

• About two years ago the event store started migrating from Objectivity to ROOT, which was a success and an improvement

• No reason to keep pricey Objectivity only because of “secondary” databases

• Migration effort started in 2004 for conditions, configuration, prompt reconstruction, and ambient databases


Configuration database API

• Main problem of the old database – the API exposed too much of the implementation technology:
  • Persistent objects, handles, class names, etc.
• API has to change but we don’t want to make the same mistakes again (new mistakes are more interesting)
• Pure transient-level abstract API independent of any specific implementation technology
• Always make abstract APIs to avoid problems in the future (this may be hard and need a few iterations)
• Client code should be free from any specific database implementation details
• Early prototyping could answer a lot of questions, but five years of experience count too
• Use different implementations for clients with different requirements
• Implementation would benefit from features currently missing in C++: reflection, introspection (or from a completely new language)
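The "abstract API, several implementations" point above can be illustrated with a minimal Python sketch. It is not the BaBar API: the names `ConfigStore`, `MemoryConfigStore`, `JsonTextConfigStore` and `trigger_threshold` are hypothetical, and the JSON-text store is only a stand-in for a real persistent backend.

```python
import json
from abc import ABC, abstractmethod


class ConfigStore(ABC):
    """Transient-level interface: clients see keys and plain dicts only."""

    @abstractmethod
    def get(self, key: str) -> dict:
        """Return the configuration record stored under key."""

    @abstractmethod
    def put(self, key: str, value: dict) -> None:
        """Store a configuration record under key."""


class MemoryConfigStore(ConfigStore):
    """Hypothetical backend for tests: everything lives in a dict."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return dict(self._data[key])

    def put(self, key, value):
        self._data[key] = dict(value)


class JsonTextConfigStore(ConfigStore):
    """Hypothetical backend that keeps one serialized JSON document,
    standing in for a file or a database row."""

    def __init__(self):
        self._text = "{}"

    def get(self, key):
        return json.loads(self._text)[key]

    def put(self, key, value):
        data = json.loads(self._text)
        data[key] = value
        self._text = json.dumps(data)


def trigger_threshold(store: ConfigStore) -> float:
    """Client code: depends on the abstract interface, not on a backend."""
    return store.get("trigger")["threshold"]
```

Because `trigger_threshold` only touches `ConfigStore`, either backend can be swapped in without changing client code, which is the property the slide asks for.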


DAQ software for SND detector

Budker Institute of Nuclear Physics, Novosibirsk
M. Achasov, A. Bogdanchikov, A. Kim, A. Korol


Main data flow

[Diagram: Readout and events building → Events packing → Events filtering → Storage. Link labels on the slide: 1 kHz / 4 KB, 1 kHz / 4 KB, 1 kHz / 1 KB, 100 Hz / 1 KB.]

Expected rates:
• Event fragments: 4 MB/s are read from IO processors over Ethernet;
• Event building: 4 MB/s;
• Event packing: 1 MB/s;
• Events filtering (90% screening): 100 KB/s.
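The expected rates follow from event rate times event size. A small sketch of that arithmetic, assuming decimal units (1 MB = 1000 KB) and assuming the pairing of rate and size labels with stages shown in `STAGES`, which is read off the figures on the slide rather than stated explicitly:

```python
def throughput_kb_per_s(rate_hz: float, event_size_kb: float) -> float:
    """Data rate of a stage: events per second times KB per event."""
    return rate_hz * event_size_kb


# (stage, event rate in Hz, event size in KB); the pairing is an assumption.
STAGES = [
    ("event building", 1000, 4),
    ("event packing", 1000, 1),
    ("event filtering", 100, 1),
]


def stage_throughputs(stages=STAGES) -> dict:
    """Map each stage name to its output data rate in KB/s."""
    return {name: throughput_kb_per_s(rate, size) for name, rate, size in stages}


def rejection_fraction(rate_in_hz: float, rate_out_hz: float) -> float:
    """Fraction of events dropped by a filter."""
    return 1.0 - rate_out_hz / rate_in_hz
```

With these inputs, event building gives 4000 KB/s (4 MB/s), packing 1000 KB/s (1 MB/s) and filtering 100 KB/s, and going from 1 kHz to 100 Hz corresponds to the quoted 90% screening.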


DAQ architecture

[Diagram; recoverable block labels: Detector, Front-end electronics (KLUKVA and CAMAC crates, with multiplicity labels "x 12" and "x 16"), Readout & Event Building, Buffer, TLT computers, Filtered events, Calibration process, Backup, Off-line, Visualization, Database, System support.]


Interactive Analysis Environment of Unified Accelerator Libraries

V. Fine, N. Malitsky, R. Talman


Abstract

Unified Accelerator Libraries (UAL, http://www.ual.bnl.gov) software is an open accelerator simulation environment addressing a broad spectrum of accelerator tasks ranging from online-oriented efficient models to full-scale realistic beam dynamics studies. The paper introduces a new package integrating UAL simulation algorithms with the Qt-based Graphical User Interface and an open collection of analysis and visualization components. The primary user application is implemented as an interactive and configurable Accelerator Physics Player whose extensibility is provided by a plug-in architecture. Its interface to data analysis and visualization modules is based on the Qt layer (http://root.bnl.gov) developed and supported by the STAR experiment. The present version embodies the ROOT (http://root.cern.ch) data analysis framework and the Coin 3D (http://www.coin3d.org) graphics library.


Accelerator Physics Player

• An open collection of algorithms
• An open collection of viewers

    // Create the player and attach the shell
    UAL::USPAS::BasicPlayer* player = new UAL::USPAS::BasicPlayer();
    player->setShell(&shell);
    // Make the player the main widget and start the Qt event loop
    qApp.setMainWidget(player);
    player->show();
    qApp.exec();


Examples of the Accelerator-Specific Viewers

• Bunch 2D Distributions (based on ROOT TH2F)
• Turn-By-Turn BPM data (based on ROOT TH2F or TGraph)
• Twiss plots (based on ROOT TGraph)
• Bunch 3D Distributions (based on COIN 3D)


Parallel Interactive and Batch HEP-Data Analysis with PROOF

Maarten Ballintijn*, Marek Biskup**, Rene Brun**, Philippe Canal***,

Derek Feichtinger****, Gerardo Ganis**, Guenter Kickinger**, Andreas Peters**,

Fons Rademakers**

* - MIT ** - CERN *** - FNAL **** - PSI


ROOT Analysis Model

Standard model:
• Files analyzed on a local computer
• Remote data accessed via remote fileserver (rootd/xrootd)

[Diagram: Client reading a local file, and remote files (dcache, Castor, RFIO, Chirp) through a rootd/xrootd server.]


PROOF Basic Architecture

Single-cluster mode:
• The Master divides the work among the slaves
• After the processing finishes, it merges the results (histograms, scatter plots)
• And returns the result to the Client

[Diagram: Client, Master, Slaves and Files; commands and scripts go from the Client to the cluster, histograms and plots come back.]
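The divide, process and merge cycle described above can be sketched in a few lines of Python. This is an illustration only, not PROOF code: threads stand in for the slaves, a `Counter` stands in for a histogram, and the static equal-size split is an assumption made for simplicity (a real system may balance the work dynamically). All function names are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_work(n_entries, n_workers):
    """Return contiguous (start, stop) ranges that cover all entries."""
    base, extra = divmod(n_entries, n_workers)
    ranges, start = [], 0
    for i in range(n_workers):
        stop = start + base + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def fill_histogram(values, start, stop, bin_width):
    """Worker: histogram one slice of the data (bin index -> count)."""
    hist = Counter()
    for v in values[start:stop]:
        hist[int(v // bin_width)] += 1
    return hist


def run_query(values, n_workers=4, bin_width=10):
    """Master: divide the work, run the workers, merge partial results."""
    ranges = split_work(len(values), n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = pool.map(
            lambda r: fill_histogram(values, r[0], r[1], bin_width), ranges
        )
        merged = Counter()
        for partial in partials:
            merged.update(partial)  # adds counts bin by bin
    return merged
```

The merged result is identical to what a single worker would produce over the whole data set, which is why the client does not need to know how many workers took part.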


PROOF and Selectors

No user’s control of the entries loop!

Many Trees are being processed

Initialize each slave

The code is shipped to each slave and SlaveBegin(), Init(), Process(), SlaveTerminate() are executed there

The same code works also without PROOF.
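A minimal sketch of the selector idea, in Python rather than ROOT's C++: the user writes per-entry logic, and the framework owns the loop, so the same class runs locally or chunk by chunk. The class and function names (`Selector`, `PtSum`, `run_local`, `run_distributed`) and the `"pt"` field are hypothetical, and the hooks are a simplification of the ones named on the slide.

```python
class Selector:
    """Base class: the framework, not the user, loops over the entries."""

    def begin(self):
        """Called once before the first entry (per worker)."""

    def process(self, entry):
        """Called once per entry."""
        raise NotImplementedError

    def terminate(self):
        """Called once after the last entry; returns the partial result."""
        return {}


class PtSum(Selector):
    """Example user code: count and sum entries with pt above 1.0."""

    def begin(self):
        self.n = 0
        self.total = 0.0

    def process(self, entry):
        if entry["pt"] > 1.0:
            self.n += 1
            self.total += entry["pt"]

    def terminate(self):
        return {"n": self.n, "total": self.total}


def run_local(selector, entries):
    """Framework-owned loop for a single process."""
    selector.begin()
    for entry in entries:
        selector.process(entry)
    return selector.terminate()


def run_distributed(selector_cls, chunks):
    """One selector instance per chunk; partial results merged by summing."""
    partials = [run_local(selector_cls(), chunk) for chunk in chunks]
    return {key: sum(p[key] for p in partials) for key in partials[0]}
```

Since `PtSum` never sees the loop, `run_local` and `run_distributed` give the same answer for the same entries, mirroring the statement that the same code also works without PROOF.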


Analysis session snapshot

What we are implementing:

Monday at 10h15, ROOT session on my laptop:
• AQ1: a 1 s query produces a local histogram
• AQ2: a 10 min query submitted to PROOF1
• AQ3->AQ7: short queries
• AQ8: a 10 h query submitted to PROOF2

Monday at 16h25, ROOT session on my laptop:
• BQ1: browse results of AQ2
• BQ2: browse temporary results of AQ8
• BQ3->BQ6: submit four 10 min queries to PROOF1

Wednesday at 8h40, session on any web browser:
• CQ1: browse results of AQ8, BQ3->BQ6


ROOT Graphics Editor
by Ilka Antcheva

ROOT graphics editor can be:

• Embedded – connected only with the canvas in the application window

• Global – has own application window and can be connected to any created canvas in a ROOT session.


Focus on Users

• Novices (for a short time)
  • Theoretical understanding, no practical experience with ROOT
  • Impatient with learning concepts; patient with performing tasks
• Advanced beginners (many people remain at this level)
  • Focus on a few tasks and learn more on a need-to-do basis
  • Perform several given tasks well
• Competent performers (fewer than the previous class)
  • Know and perform complex tasks that require coordinated actions
  • Interested in solving problems and tracking down errors
• Experts (identified by others)
  • Ability to find solutions in complex functionality
  • Interested in theories behind the design
  • Interested in interacting with other expert systems


DaqProVis
M. Morhac

• DaqProVis, a toolkit for acquisition, interactive analysis, processing and visualization of multidimensional data
• Basic features
  • DaqProVis is well suited for interactive analysis of multiparameter data from small and medium sized experiments in nuclear physics.
  • The data acquisition part of the system allows one to acquire multiparameter events either directly from the experiment or from a list file, i.e., the system can work either in on-line or off-line acquisition mode.
  • In on-line acquisition mode, events can be taken directly from CAMAC crates or from a VME system that cooperates with DaqProVis in the client-server working mode.
  • In off-line acquisition mode the system can analyze event data even from big experiments, e.g. from Gammasphere.
  • The event data can also be read from another DaqProVis system. The capability of DaqProVis to work simultaneously in both the client and the server working mode makes it possible to build remote as well as distributed nuclear data acquisition, processing and visualization systems, and thus to create multilevel configurations.


DaqProVis (Visualisation)


DaqProVis (continued)

• DaqProVis and ROOT teams are already cooperating.
• Agreement during the workshop to extend this cooperation


GRID deployment

• Towards the operation of the Italian Tier-1 for CMS: Lessons learned from the CMS Data Challenge
• GRID technology in production at DESY
• Grid middleware Configuration at the KIPT CMS Linux Cluster
• Storage resources management and access at Tier1 CNAF


Towards the operations of the Italian Tier-1 for CMS: lessons learned from the CMS Data Challenge

D. Bonacorsi (on behalf of INFN-CNAF Tier-1 staff and the CMS experiment)

ACAT 2005
X Int. Work. on Advanced Computing & Analysis Techniques in Physics Research

May 22nd-27th, 2005 - DESY, Zeuthen, Germany


DC04 outcome (grand-summary + focus on INFN T1)

• reconstruction/data-transfer/analysis may run at 25 Hz
• automatic registration and distribution of data, key role of the TMDB
  • was the embryonic PhEDEx!
• support a (reasonable) variety of different data transfer tools and set-up
  • Tier-1’s: different performances, related to operational choices
  • SRB, LCG Replica Manager and SRM investigated: see CHEP04 talk
  • INFN T1: good performance of LCG-2 chain (PIC T1 also)
• register all data and metadata (POOL) to a world-readable catalogue
  • RLS: good as a global file catalogue, bad as a global metadata catalogue
• analyze the reconstructed data at the Tier-1’s as data arrive
  • LCG components: dedicated BDII+RB; UIs, CEs+WNs at CNAF and PIC
  • real-time analysis at Tier-2’s was demonstrated to be possible
  • ~15k jobs submitted
  • time window between reco data availability and start of analysis jobs can be reasonably low (i.e. 20 mins)
• reduce number of files (i.e. increase <#events>/<#files>)
  • more efficient use of bandwidth
  • reduce overhead of commands
• address scalability of MSS systems (!)


Learn from DC04 lessons…

• Some general considerations may apply:
  • although a DC is experiment-specific, maybe its conclusions are not
  • an “experiment-specific” problem is better addressed if conceived as a “shared” one in a shared Tier-1
  • an experiment DC just provides hints, real work gives insight

Crucial role of the experiments at the Tier-1:
• find weaknesses of CASTOR MSS system in particular operating conditions
• stress-test new LSF farm with official production jobs by CMS
• testing DNS-based load-balancing by serving data for production and/or analysis from CMS disk-servers
• test new components, newly installed/upgraded Grid tools, etc…
• find bottleneck and scalability problems in DB services
• give feedback on monitoring and accounting activities
• …


PhEDEx at INFN

• INFN-CNAF is a T1 ‘node’ in PhEDEx
  • CMS DC04 experience was crucial to start up PhEDEx in INFN
  • CNAF node operational since the beginning
• First phase (Q3/4 2004):
  • Agent code development + focus on operations: T0→T1 transfers
  • >1 TB/day T0→T1 demonstrated feasible
  • … but the aim is not to achieve peaks, but to sustain them in normal operations
• Second phase (Q1 2005):
  • PhEDEx deployment in INFN to Tier-n, n>1:
    • “distributed” topology scenario
    • Tier-n agents run at remote sites, not at the T1: know-how required, T1 support
    • already operational at Legnaro, Pisa, Bari, Bologna
• Third phase (Q>1 2005): many issues, e.g. stability of service, dynamic routing, coupling PhEDEx to CMS official production system, PhEDEx involvement in SC3-phaseII, etc…

An example: data flow to T2’s in daily operations (here: a test with ~2000 files, 90 GB, with no optimization)
• ~450 Mbps CNAF T1 → LNL-T2
• ~205 Mbps CNAF T1 → Pisa-T2
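The slide mixes daily volumes (TB/day) and link rates (Mbps). A short conversion sketch helps compare them; it assumes decimal units (1 TB = 10^12 bytes, 1 Mbps = 10^6 bit/s) and steady rates, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def tb_per_day_to_mbps(tb_per_day: float) -> float:
    """Sustained rate in Mbps needed to move a given volume per day."""
    return tb_per_day * 1e12 * 8 / SECONDS_PER_DAY / 1e6


def mbps_to_tb_per_day(mbps: float) -> float:
    """Volume in TB moved per day at a sustained rate."""
    return mbps * 1e6 / 8 * SECONDS_PER_DAY / 1e12


def transfer_time_s(size_gb: float, mbps: float) -> float:
    """Seconds needed to move size_gb at a steady rate."""
    return size_gb * 1e9 * 8 / (mbps * 1e6)
```

Under these assumptions 1 TB/day corresponds to roughly 93 Mbps sustained, and a steady 450 Mbps would amount to about 4.9 TB/day. A 90 GB sample at a steady 450 Mbps would need about 1600 s; the slide does not state how long the actual test ran.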


Storage resources management and access at TIER1 CNAF

ACAT 2005
May 22-27 2005

DESY Zeuthen, Germany

Ricci Pier Paolo, Lore Giuseppe, Vagnoni Vincenzo on behalf of INFN TIER1 Staff

[email protected]


TIER1 INFN CNAF Storage

[Diagram of the storage infrastructure; recoverable labels:]

• Linux SL 3.0 clients (100-1000 nodes), WAN or TIER1 LAN
• Access: NFS-RFIO-GridFTP oth...; NFS, RFIO
• STK180 with 100 LTO-1 (10 Tbyte native)
• STK L5500 robot (5500 slots), 6 IBM LTO-2, 2 (4) STK 9940B drives
• W2003 Server with LEGATO Networker (Backup)
• CASTOR HSM servers, H.A.
• PROCOM 3600 FC NAS2, 9000 Gbyte
• PROCOM 3600 FC NAS3, 4700 Gbyte
• NAS1, NAS4: 3ware IDE SAS, 1800+3200 Gbyte
• AXUS BROWIE, about 2200 GByte, 2 FC interfaces
• 2 Gadzoox Slingshot 4218 18 port FC Switch
• STK BladeStore, about 25000 GByte, 4 FC interfaces
• Infortrend 4 x 3200 GByte SATA A16F-R1A2-M1
• Diskservers with Qlogic FC HBA 2340
• IBM FastT900 (DS 4500), 3/4 x 50000 GByte, 4 FC interfaces
• 2 Brocade Silkworm 3900 32 port FC Switch
• Infortrend 5 x 6400 GByte SATA A16F-R1211-M2 + JBOD

Totals: HSM (400 TB), NAS (20 TB), SAN 1 (200 TB), SAN 2 (40 TB)


CASTOR HSM

[Diagram of the CASTOR HSM setup; recoverable labels:]

• STK L5500, 2000+3500 mixed slots
• 6 drives LTO2 (20-30 MB/s)
• 2 drives 9940B (25-30 MB/s)
• 1300 LTO2 (200 GB native)
• 650 9940B (200 GB native)
• Sun Blade v100 with 2 internal IDE disks with software RAID-0, running ACSLS 7.0, OS Solaris 9.0
• 1 CASTOR (CERN) Central Services server, RH AS3.0
• 8 tapeservers, Linux RH AS3.0, HBA Qlogic 2300
• 6 stagers with diskserver, RH AS3.0, 15 TB local staging area
• 1 ORACLE 9i rel 2 DB server, RH AS 3.0
• 8 or more rfio diskservers, RH AS 3.0, min 20 TB staging area
• SAN 1, SAN 2, WAN or TIER1 LAN
• Legend: point-to-point FC 2 Gb/s connections; full-redundancy FC 2 Gb/s connections (dual controller HW and Qlogic SANsurfer Path Failover SW)

EXPERIMENT          Staging area (TB)   Tape pool (TB native)
ALICE               8                   12
ATLAS               6                   20
CMS                 2                   15
LHCb                18                  30
BABAR, AMS + oth    2                   4


DISK access (2)

We have different protocols in production for accessing the disk storage. In our diskservers and Grid SE front-ends we currently have:

1. NFS on local filesystem. ADV: easy client implementation, compatibility, and possibility of failover (RH 3.0). DIS: performance scales badly with a high number of accesses (1 client 30 MB/s, 100 clients 15 MB/s throughput).

2. RFIO on local filesystem. ADV: good performance, compatibility with Grid tools, and possibility of failover. DIS: no scalability of front-ends for the single filesystem, no possibility of load-balancing.

3. Grid SE Gridftp/rfio over GPFS (CMS, CDF). ADV: separation of the GPFS servers (accessing the disks) from the SE GPFS clients; load balancing and HA on the GPFS servers, and possibility to implement the same on the Grid SE services (see next slide). DIS: GPFS layer requirements on OS and certified hardware for support.

4. Xrootd (BABAR). ADV: good performance. DIS: no possibility of load-balancing for the single filesystem backends, not grid compliant (at present...).

NOTE: IBM GPFS 2.2 is a CLUSTERED FILESYSTEM, so many front-ends (i.e. gridftp or rfio servers) can access the SAME filesystem simultaneously. It also allows bigger filesystem sizes (we use 8-12 TB).


Generic Benchmark (here shown for 1 GB files)

                          WRITE (MB/s)                 READ (MB/s)
# of simultaneous
client processes      1    5   10   50  120        1    5   10   50  120

GPFS 2.3.0-1
  native            114  160  151  147  147       85  301  301  305  305
  NFS               102  171  171  159  158      114  320  366  322  292
  RFIO               79  171  158  166  166       79  320  301  320  321

Lustre 1.4.1
  native            102  512  512  488  478       73  366  640  453  403
  RFIO               93  301  320  284  281       68  269  269  314  349

• Numbers are reproducible with small fluctuations
• Lustre tests with NFS export not yet performed


Grid Technology in Production at DESY

Andreas Gellrich*

DESY

ACAT 2005

24 May 2005

*http://www.desy.de/~gellrich/


Grid @ DESY

• With the HERA-II luminosity upgrade, the demand for MC production rapidly increased while the outside collaborators moved their computing resources towards LCG
• The ILC group plans the usage of Grids for their computing needs
• The LQCD group develops a Data Grid to exchange data
• DESY considers a participation in LHC experiments
• EGEE and D-GRID
• dCache is a DESY / FNAL development
• Since spring 2004 an LCG-2 Grid infrastructure has been in operation


Grid Infrastructure @ DESY …

• DESY installed (SL3.04, Quattor, yaim) and operates a complete, independent Grid infrastructure which provides generic (non-experiment-specific) Grid services to all experiments and groups

• The DESY Production Grid is based on LCG-2_4_0 and includes:
  • Resource Broker (RB), Information Index (BDII), Proxy (PXY)
  • Replica Location Services (RLS)
  • In total 24 + 17 WNs (48 + 34 = 82 CPUs)
  • dCache-based SE with access to the entire DESY data space

• VO management for the HERA experiments (‘hone’, ‘herab’, ‘hermes’, ‘szeu’), LQCD (‘ildg’), ILC (‘ilc’, ‘calice’), Astro-particle Physics (‘baikal’, ‘icecube’)

• Certification services for DESY users in cooperation with GridKa


Grid Middleware Configuration at the KIPT CMS Linux Cluster

S. Zub, L. Levchuk, P. Sorokin, D. Soroka

Kharkov Institute of Physics & Technology, 61108 Kharkov, Ukraine

http://www.kipt.kharkov.ua/[email protected]


What is our specificity?

Small PC-farm (KCC)

Small scientific group of 4 physicists, combining their work with system administration

CMS tasks orientation

No commercial software installed

Self-security providing

Narrow bandwidth communication channel

Limited traffic


Summary

• An enormous data flow expected in the LHC experiments forces the HEP community to resort to the Grid technology
• The KCC is a specialized PC farm constructed at the NSC KIPT for computer simulations within the CMS physics program and preparation to the CMS data analysis
• Further development of the KCC is planned with considerable increase of its capacities and deeper integration into the LHC Grid (LCG) structures
• Configuration of the LCG middleware can be troublesome (especially at small farms with poor internet connection), since this software is neither universal nor “complete”, and one has to resort to special tips
• Scripts are developed that facilitate the installation procedure at a small PC farm with a narrow internet bandwidth


Applications on the Grid

• The CMS analysis chain in a distributed environment
• Monte Carlo Mass production for ZEUS on the Grid
• Metadata services on the Grid
• Performance comparison of the LCG2 and gLite File Catalogues
• Data Grids for Lattice QCD


The CMS analysis chain in a distributed environment

ACAT 2005
DESY, Zeuthen, Germany, 22nd – 27th May, 2005

Nicola De Filippis
on behalf of the CMS collaboration


The CMS analysis tools

Overview:
• Data management
  • Data Transfer service: PHEDEX
  • Data Validation stuff: ValidationTools
  • Data Publication service: RefDB/PubDB
• Analysis Strategy
  • Distributed Software installation: XCMSI
  • Analysis job submission tool: CRAB
• Job Monitoring
  • System monitoring: BOSS
  • Application job monitoring: JAM


The end-user analysis workflow

The user provides:
• Dataset (runs, #event, ..)
• private code

Workflow:
• CRAB discovers data and the sites hosting them by querying RefDB/PubDB
• CRAB prepares, splits and submits jobs to the Resource Broker
• The RB sends jobs to sites hosting the data, provided the CMS software was installed
• CRAB automatically retrieves the output files of the job

[Diagram: UI with CRAB (job submission tool), DataSet Catalogue (PubDB/RefDB), Workload Management System with Resource Broker (RB), Computing Element, Storage Element, Worker node, XCMSI.]
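Two steps of this workflow, splitting a request into jobs and restricting submission to sites that both host the data and have the software installed, can be sketched as follows. This is not CRAB code: the function names, the dictionary-based catalogue and the job record layout are hypothetical and chosen only to illustrate the logic.

```python
def split_jobs(total_events, events_per_job, first_event=0):
    """Split an event range into jobs of at most events_per_job events."""
    if events_per_job <= 0:
        raise ValueError("events_per_job must be positive")
    jobs = []
    start = first_event
    end = first_event + total_events
    while start < end:
        n = min(events_per_job, end - start)
        jobs.append({"first_event": start, "n_events": n})
        start += n
    return jobs


def sites_for_dataset(catalogue, dataset, sites_with_software):
    """Keep the sites that host the dataset and have the software installed.

    catalogue maps a dataset name to the list of sites hosting it
    (a stand-in for a catalogue query).
    """
    return [s for s in catalogue.get(dataset, []) if s in sites_with_software]
```

For example, 2500 events with 1000 events per job give three jobs, the last one holding the remaining 500 events.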


Conclusions

• A first working CMS prototype for Distributed User Analysis is available and used by real users
• Phedex, PubDB, ValidationTools, XCMSI, CRAB, BOSS, JAM are under development, deployment and in production at many sites
• CMS is using Grid infrastructure for physics analyses and Monte Carlo production
  • tens of users, 10 million of analysed data, 10000 jobs submitted
• CMS is designing a new architecture for the analysis workflow


Metadata Services on the GRID

Nuno Santos

ACAT’05 May 25th, 2005


Metadata on the GRID

• Metadata is data about data
• Metadata on the GRID
  • Mainly information about files
  • Other information necessary for running jobs
  • Usually living on DBs
• Need simple interface for Metadata access
  • Advantages
    • Easier to use by clients - no SQL, only metadata concepts
    • Common interface – clients don’t have to reinvent the wheel
  • Must be integrated in the File Catalogue
  • Also suitable for storing information about other resources
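The "no SQL, only metadata concepts" idea can be illustrated with a tiny wrapper around Python's standard `sqlite3` module. This is a sketch, not the ARDA interface: the class `MetadataCatalogue`, its methods and the table layout are hypothetical, and values are stored as text for simplicity.

```python
import sqlite3


class MetadataCatalogue:
    """Clients set, read and search attributes; SQL stays inside the class."""

    def __init__(self, path=":memory:"):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS meta ("
            "entry TEXT, attr TEXT, val TEXT, PRIMARY KEY (entry, attr))"
        )

    def set_attr(self, entry, attr, value):
        """Attach (or overwrite) one attribute of an entry, e.g. a file."""
        self._db.execute(
            "INSERT OR REPLACE INTO meta (entry, attr, val) VALUES (?, ?, ?)",
            (entry, attr, str(value)),
        )
        self._db.commit()

    def get_attrs(self, entry):
        """Return all attributes of an entry as a dict of strings."""
        rows = self._db.execute(
            "SELECT attr, val FROM meta WHERE entry = ?", (entry,)
        )
        return dict(rows.fetchall())

    def find(self, attr, value):
        """Return the entries whose attribute equals value, sorted by name."""
        rows = self._db.execute(
            "SELECT entry FROM meta WHERE attr = ? AND val = ? ORDER BY entry",
            (attr, str(value)),
        )
        return [row[0] for row in rows]
```

A client would call `cat.set_attr("/grid/run1.root", "run", 1)` and later `cat.find("run", 1)` without ever writing a query, and the backend could be replaced without touching that client code.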


ARDA Implementation

• Backends
  • Currently: Oracle, PostgreSQL, SQLite
• Two frontends
  • TCP Streaming
    • Chosen for performance
  • SOAP
    • Formal requirement of EGEE
    • Compare SOAP with TCP Streaming
• Also implemented as standalone Python library
  • Data stored on filesystem

[Diagram: clients connect to the Metadata Server (MDServer) over SOAP or TCP Streaming, with PostgreSQL, Oracle and SQLite backends; a client in a Python interpreter uses the Metadata Python API on the filesystem.]


SOAP Toolkits performance

• Test communication performance
  • No work done on the backend
  • Switched 100Mbits LAN
• Language comparison
  • TCP-S with similar performance in all languages
  • SOAP performance varies strongly with toolkit
• Protocols comparison
  • Keepalive improves performance significantly
  • On Java and Python, SOAP is several times slower than TCP-S

[Bar chart: execution time [s] (axis 0 to 25) for 1000 pings; groups: C++ (gSOAP), Java (Axis), Python (ZSI); series: TCP-S no KA, TCP-S KA, gSOAP no KA, gSOAP KA.]


High Speed Computing

• InfiniBand
• Analysis of SCTP and TCP based communication in high-speed cluster
• The apeNEXT Project
• Optimisation of Lattice QCD codes for the Opteron processor


A. Heiss, U. Schwickerath

InfiniBand – Experiences at Forschungszentrum Karlsruhe

Forschungszentrum Karlsruhe in der Helmholtz-Gemeinschaft

Credits: Inge Bischoff-Gauss, Marc García Martí, Bruno Hoeft, Carsten Urbach

• InfiniBand overview
• Hardware setup at IWR
• HPC applications: MPI performance, lattice QCD, LM
• HTC applications: rfio, xrootd


Lattice QCD Benchmark: GE vs. InfiniBand

• Memory and communication intensive application
• Benchmark by C. Urbach
• See also CHEP04 talk given by A. Heiss

Significant speedup by using InfiniBand

Thanks to Carsten Urbach, FU Berlin and DESY Zeuthen


RFIO/IB Point-to-Point file transfers (64bit)

RFIO/IB: see ACAT03, NIM A 534 (2004) 130-134

Notes

• PCI-X and PCI-Express throughput
• solid: file transfers cache -> /dev/null
• dashed: network+protocol only
• best results with PCI-Express:
  • > 800 MB/s raw transfer speed
  • > 400 MB/s file transfer speed

Disclaimer on PPC64: Not an official IBM Product. Technology Prototype. (see also slide 5 and 6)


Xrootd and InfiniBand

First preliminary results

Notes:

IPoIB notes:
• Dual Opteron V20z
• Mellanox Gold drivers
• SM on InfiniCon 9100
• same nodes as for GE

Native IB notes:
• proof of concept version
• based on Mellanox VAPI
• using IB_SEND
• dedicated send/recv buffers
• same nodes as above

10GE notes:
• IBM xseries 345 nodes
• Xeon 32bit, single CPU
• 1 and 2 GB RAM
• 2.66GHz clock speed
• Intel PRO/10GbE LR cards
• used for long distance tests


TCP vs. SCTP in high-speed cluster environment

Miklos Kozlovszky
Budapest University of Technology and Economics (BUTE)


TCP vs. SCTP

TCP                               | SCTP
Byte stream oriented              | Message oriented
3 way handshake connection init   | 4 way handshake connection init (cookie)
Old (more than 20 years)          | Quite new (2000-)
                                  | Multihoming
                                  | Path-mtu discovery

Both:
• IPv4 & IPv6 compatible
• Reliable
• Connection oriented
• Offers acknowledged, error free, non-duplicated transfer
• Almost same Flow and Congestion Control
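The practical consequence of "byte stream oriented" versus "message oriented" is that an application on a byte stream has to mark message boundaries itself, whereas a message-oriented transport delivers them intact. A minimal sketch of such framing follows, assuming a 4-byte length prefix (one common convention, not something taken from the talk). The helper names are hypothetical, and `socket.socketpair()` stands in for an established TCP connection so that the example runs without a network.

```python
import socket
import struct


def send_message(sock, payload):
    """Prefix each message with its length as a 4-byte big-endian integer."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exactly(sock, size):
    """Read exactly `size` bytes; a stream may deliver them in several pieces."""
    chunks = []
    while size > 0:
        chunk = sock.recv(size)
        if not chunk:
            raise ConnectionError("stream closed in the middle of a message")
        chunks.append(chunk)
        size -= len(chunk)
    return b"".join(chunks)


def recv_message(sock):
    """Read one length-prefixed message and return its payload."""
    (length,) = struct.unpack("!I", recv_exactly(sock, 4))
    return recv_exactly(sock, length)


# A connected pair of stream sockets, used here in place of a TCP connection.
left, right = socket.socketpair()
for text in (b"first", b"second message", b""):
    send_message(left, text)
received = [recv_message(right) for _ in range(3)]
print(received)
left.close()
right.close()
```

With a message-oriented transport the length prefix and the `recv_exactly` loop would not be needed, since each send arrives as one unit at the receiver.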


Summary

• SCTP inherited all the “good features” of TCP
• SCTP wants to behave like a next-generation TCP
• It is more secure than TCP, and has many attractive features (e.g. multihoming)
• Theoretically it can work better than TCP, but for now TCP is faster (SCTP implementations are still “poor”)
• Well standardized, and can be useful for clusters


My Impressions


Concerns

• Only a small fraction of the Session I talks corresponds to the original spirit of AIHEP/ACAT Session I.

• In particular, many of the GRID talks about deployment and infrastructure should be given at CHEP, not here.

• The large LHC collaborations have their own ACAT a few times a year.

• The huge experiment software frameworks do not encourage cross-experiment discussions or tools.

• For the next ACAT, the key people involved in the big experiments should work together to encourage more talks or reviews.


Positive aspects

• ACAT continues to be a good opportunity to meet with other cultures. Innovation may come from small groups or non-HENP fields.

• Contacts (even sporadic) with Session III or plenary talks are very beneficial, in particular to young people.


The Captain of Kopenick

• Question to the audience:
• Is Friedrich Wilhelm Voigt (Captain of Kopenick) an ancestor of Voigt, the father of the Voigt function?