Transcript of: "Data Challenge Needs", RWL Jones, ATLAS UK Physics Meeting, 05/09/2001 (32 pages).

Page 1:

05/09/2001 ATLAS UK Physics Meeting

Data Challenge Needs

RWL Jones

Page 2:

Data challenges

• Goal: validate our computing model and our software
• How? Iterate on a set of DCs of increasing complexity:
  – start with data which looks like real data
  – run the filtering and reconstruction chain
  – store the output data in our database
  – run the analysis
  – produce physics results
• To understand our computing model: performance, bottlenecks, etc.
• To check and validate our software

Page 3:

But:

• Today we do not have ‘real data’, so we first need to produce ‘simulated data’:
  – Physics event generation
  – Simulation
  – Pile-up
  – Detector response
• These steps, plus reconstruction and analysis, will be part of the first Data Challenges

Page 4:

ATLAS Kits

• Each DC will have an associated kit

• Current kit is 1.3.0, for tests; it will be replaced for DC0; testing 2.0.3 in October

• Default kit excludes compilers, but an ‘all-in’ kit exists – more intrusive in the OS but usable with more OS versions

• So far, no Grid/Globus tools included

Page 5:

ATLAS kit 1.3.0

• Tar file with the ATLAS software, to be sent to remote GRID sites and used in DC0
• Main requirements:
  – NO AFS
  – NO root privileges to install the software
  – possibility to COMPILE the code
  – NOT too big a tar file
  – should run on the Linux platform

Page 6:

First version of the ATLAS kit

It installs:
• SRT (Software Release Tools) version 0.3.2
• a subset of ATLAS release 1.3.1: main ATLAS application code + Makefiles: DiceMain, DicePytMain, AtreconMain
  (Dice = G3-based ATLAS simulation program; Atrecon = ATLAS reconstruction program)
• ATLAS packages and libraries needed for compilation
• CLHEP version 1.6.0.0

Page 7:

It requires:
• Linux OS (at the moment tested on RedHat 6.1, 6.2, 7.1); Mandrake has problems
• CERNLIB 2000 installed: if you need CERNLIB 2000, use kit1; if you are on RedHat 7.1 and need compilers, use kit2

It provides:
• all instructions to install / compile / run in a README file
• example jobs to run full simulation and reconstruction, plus example datacards (DICE, Atrecon)
• some scripts to set environment variables, to compile and to run

Page 8:

It can be downloaded from: http://pcatl0a.mi.infn.it/~resconi/kit/atlas_kit.html
• ATLAS_kit.tar.gz (~90 MB); then execute: gtar -xvzf ATLAS_kit.tar.gz
• it will unpack into a directory /ATLAS of ~500 MB
• it has been installed and tested on GRID machines and on non-ATLAS machines
• sites involved in the first tests of the ATLAS kit: Milan, Rome, Glasgow, Lund; providing feedback…
• it lacks a verification kit and analysis tools: WORK IN PROGRESS

Page 9:

What about Globus?

• Not needed in current kit (but needed if you want to be part of the DataGrid Test Bed)

• Will be needed for DC1 (?!) – should be in the Kit if so

• If installing now, take the version from the GridPP website – the Hey CD-ROM is out of date (RPMs will be available)

Page 10:

DC0: start: 1 November 2001; end: 12 December 2001

• 'continuity' test through the software chain
• aim is primarily to check the state of readiness for Data Challenge 1
• 100k Z+jet events, or similar – several times
• check that the software works:
  – issues to be checked include:
    • G3 simulation on a PC farm
    • 'pile-up' handling
    • what trigger simulation is to be run (ATRIG?)
    • reconstruction running
  – data must be written to / read from the database

Page 11:

DC1: start: 1 February 2002; end: 30 July 2002

• scope increases significantly beyond DC0
  – several samples of up to 10^7 events
  – should involve CERN and outside-CERN sites
  – as a goal, be able to run:
    • O(1000) PCs
    • 10^7 events
    • simulation, pile-up and reconstruction
    • in 10-20 days (see the back-of-envelope sketch below)
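As a rough illustration (not taken from the slides), the goal above implies a processing budget of only about 90-170 seconds per event per PC. A minimal Python sketch of that arithmetic, using the 10-20 day window and the O(1000) PCs quoted on the slide:

# Back-of-envelope check of the DC1 goal: 10^7 events on O(1000) PCs in 10-20 days.
# Illustrative only; the per-event budget below is derived, not quoted on the slides.
N_EVENTS = 1e7     # target sample size
N_PCS = 1000       # "O(1000) PCs"

for days in (10, 20):
    wall_seconds = days * 24 * 3600
    budget = wall_seconds * N_PCS / N_EVENTS   # seconds of processing per event
    print(f"{days} days -> ~{budget:.0f} s per event per PC")
# 10 days -> ~86 s per event; 20 days -> ~173 s per event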

Page 12:

Aims of DC1(1)

• Provide a sample of 10^7 events for HLT studies
  – improve previous statistics by a factor of 10
• Study the performance of Athena and of the algorithms for use in the HLT
• HLT TDR due for the end of 2002

Page 13:

Aims of DC1(2)

• Try out running 'reconstruction' and 'analysis' on a large scale
  – learn about our data model
  – I/O performance
  – bottlenecks
• Note: simulation and pile-up will play an important role

Page 14:

Aims of DC1(3)

• Understand our ‘distributed’ computing model
  – GRID
    • use of GRID tools
  – data management
    • database technologies
    • N events with different technologies
  – distributed analysis
    • access to data

Page 15:

Aims of DC1(4)

• Provide samples of physics events to check and extend some of the Physics TDR studies
  – data generated will be mainly ‘standard model’
• Check Geant3 versus Geant4
  – understand how to do the comparison
  – understand the ‘same’ geometry

Page 16:

DC2: start: January 2003; end: September 2003

• scope depends on the ‘success’ of DC0/1
• goals:
  – use of the ‘Test-Bed’
  – 10^8 events, complexity at ~50% of the 2006-7 system
  – Geant4 should play a major role
  – ‘hidden’ new physics
  – test of calibration procedures
  – extensive use of GRID middleware
• Do we want to add part or all of:
  – DAQ
  – Lvl1, Lvl2, Event Filter

Page 17:

The ATLAS Data Challenges Project: Structure and Organisation

[Organisation chart] Boxes shown: ATLAS Data Challenges; CSG; DC Overview Board; DC Execution Board; DC Definition Committee (DC2); Work Plan Definition across several work packages (WPs); RTAG; Reports; Reviews; NCB (resource matters); other computing and Grid projects (DataGrid project, TIERs).

Page 18:

Event generation

• The type of events has to be defined
• Several event generators will probably be used
  – for each of them we have to define the version (in particular Pythia)
  – robust?
• Event type and event generators have to be defined by:
  – the HLT group (for HLT events)
  – the physics community
• Depending on the output, we can use the following frameworks:
  – ATGEN/GENZ for ZEBRA output format
  – Athena for output in the OO-db (HepMC)
    • a Zebra-to-HepMC converter already exists

Page 19:

Simulation: Geant3 or Geant4?

• DC0 and DC1 will still rely on Geant3 – the G4 version is not ready
• We urgently need Geant4 experience:
  – the geometry has to be defined (same as G3, for validation)
  – use standard events for validation
  – the ‘physics’ is improved
• For Geant3 simulation: the “Slug/Dice” or “Atlsim” framework; in both cases the output will be Zebra
• For Geant4 simulation: probably use the FADS/Goofy framework; output will be ‘hits collections’ in the OO-db

Page 20:

Pile-up

• Add “N” ‘minimum bias events’ to the ‘physics event’
  – N depends on the luminosity; suggested values:
    • 2-3 at L = 10^33
    • 6 at L = 2 x 10^33
    • 24 at L = 10^34
  – N depends on the detector:
    • in the calorimeter, NC is ~10 times bigger than for the other detectors
  – matching events for different pile-up in different detectors is a real headache!
• The ‘minimum bias’ events should be generated first; they will then be picked up randomly when the merging is done (see the sketch below)
  – this will be a high-I/O operation
  – efficiency is technology dependent (sequential or random-access files)
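A minimal Python sketch of the random pick-up step described above, assuming a pre-generated pool of minimum-bias events; the names (pileup, merge_hits, minbias_pool) are illustrative, not the ATLAS code:

import random

# Illustrative sketch: overlay N pre-generated minimum-bias events on each
# physics event, with N chosen from the luminosity as suggested on the slide.
N_MINBIAS = {1e33: 3, 2e33: 6, 1e34: 24}   # 2-3 at 1e33, 6 at 2e33, 24 at 1e34

def pileup(physics_event, minbias_pool, luminosity, rng=random.Random(0)):
    n = N_MINBIAS[luminosity]
    overlays = [rng.choice(minbias_pool) for _ in range(n)]   # random pick-up
    return merge_hits(physics_event, overlays)

def merge_hits(event, overlays):
    # Placeholder: a real implementation would sum the detector hits per
    # sub-detector, which is the high-I/O step noted on the slide.
    return {"signal": event, "pileup": overlays}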

Page 21:

Reconstruction

• Reconstruction
  – run in the Athena framework
  – input should be from the OO-db
  – output in the OO-db:
    • ESD
    • AOD
    • TAG
  – Atrecon could be a back-up possibility (to be decided)

Page 22:

Data management

• Many ‘pieces’ of infrastructure still to be decided
  – everything related to the OO-db (Objy and/or ORACLE)
    • tools for creation, replication, distribution
  – what do we do with ROOT I/O?
    • which fraction of the events will be done with ROOT I/O?
  – thousands of files will be produced and need “bookkeeping” and a “catalog” (a minimal sketch follows below):
    • where is the “HepMC” truth data?
    • where is the corresponding “simulated” or AOD data?
    • selection and filtering?
    • correlation between different pieces of information?
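As an illustration of the bookkeeping and catalog questions above (not an ATLAS tool), a minimal Python sketch of a file catalog that records, per dataset and processing stage, where the “HepMC” truth, simulated or AOD files live; all field names are assumptions:

from dataclasses import dataclass, field

@dataclass
class CatalogEntry:
    dataset: str      # e.g. "DC0 Z+jet"
    stage: str        # "HepMC truth", "simulated", "ESD", "AOD", "TAG", ...
    run_number: int
    site: str         # where the file is stored
    path: str         # location of the file at that site

@dataclass
class FileCatalog:
    entries: list = field(default_factory=list)

    def add(self, entry: CatalogEntry):
        self.entries.append(entry)

    def find(self, dataset, stage):
        # Answers questions like: where is the AOD data for this dataset?
        return [e for e in self.entries if e.dataset == dataset and e.stage == stage]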

Page 23:

Data management

• Several technologies will be evaluated, so we will have to duplicate some data
  – same data in ZEBRA and in the OO-db
  – same data in ZEBRA FZ and in ZEBRA random-access (for pile-up)
  – we need to quantify this overhead
• We also have to “realize” that the performance will depend on the technology
  – sequential versus random-access files

Page 24:

DC0 planning

• For DC0, probably in the September software week, decide on the strategy to be adopted:
  – software to be used:
    • Dice geometry
    • reconstruction adapted to this geometry
    • database
  – infrastructure:
    • Gilbert hopes (hmmm) that we will have ‘tools’ in place for:
      – automatic job submission
      – data catalog and bookkeeping
      – allocation of “run numbers” and of “random numbers” (bookkeeping); a sketch follows below
  – the ‘validation’ of components must be done now
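A minimal Python sketch of the run-number and random-seed allocation mentioned above, assuming a single central allocator that hands out unique, reproducible (run number, seed) pairs and logs them for bookkeeping; purely illustrative, not an ATLAS tool:

# Toy allocator: each production job gets a unique run number and a seed derived
# from it, so jobs can be re-run reproducibly and no two jobs share a random stream.
class RunAllocator:
    def __init__(self, first_run=1, seed_base=12345):
        self.next_run = first_run
        self.seed_base = seed_base
        self.log = []                          # bookkeeping of what was handed out

    def allocate(self, site, dataset):
        run = self.next_run
        self.next_run += 1
        seed = self.seed_base + 2 * run + 1    # distinct odd seed per run (illustrative)
        self.log.append({"run": run, "seed": seed, "site": site, "dataset": dataset})
        return run, seed

alloc = RunAllocator()
print(alloc.allocate("Glasgow", "DC0 Z+jet"))   # -> (1, 12348)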

Page 25:

Currently available software, June-Sept 2001

[Software-chain diagram] Recoverable content:
• Particle-level simulation: atgen-b (Fortran, HP only); Py5.7 / Jetset74 + code dedicated to B-physics; Lujets -> GENZ bank; output in ZEBRA
• Detector simulation: Dice (slug + Geant3, Fortran); produces GENZ + KINE banks; output in ZEBRA
• Reconstruction: Atrecon (Fortran, C++); reads GENZ + KINE banks
• ATHENA:
  – fast detector simulation: Atlfast++; reads GENZ, converts to HepMC, produces Ntuples
  – reconstruction: C++; reads GENZ + KINE, converts to HepMC, produces Ntuples
• Geometry labels in the diagram: “New geom”, “TDR geom”

Page 26:

Simulation software to be available Nov-Dec 2001

[Software-chain diagram] Recoverable content:
• Particle-level simulation (ATHENA): GeneratorModules (C++, Linux); Py6 + code dedicated to B-physics; PYJETS -> HepMC; EvtGen (BaBar package) later
• Detector simulation: Dice (slug + Geant3, Fortran); produces GENZ + KINE banks; output in ZEBRA; HepMC input marked “??”
• Reconstruction (ATHENA): C++; reads GENZ + KINE, converts to HepMC, produces Ntuples
• Fast detector simulation (ATHENA): Atlfast++; reads HepMC, produces Ntuples

Page 27:

Analysis

– Analysis tools evaluation should be part of the DC

– Required for test of the Event Data Model

– Essential for tests of Computing Models

– Output for HLT studies will be only a few hundred events

– ‘Physics events’ would be more appropriate for this study

– ATLAS Kit must include analysis tools

Page 28:

Storage and CPU issues in DC1

• Testing storage technology will inflate the data volume per event (easiest to re-simulate)
• Testing software chains will inflate the CPU usage per event
• The size of the events with pile-up depends on the luminosity:
  – 4 MB per event @ L = 2 x 10^33
  – 15 MB per event @ L = 10^34
• The time to do the pile-up also depends on the luminosity:
  – 55 s (HP) per event @ L = 2 x 10^33
  – 200 s (HP) per event @ L = 10^34
• A back-of-envelope scaling to the full 10^7-event sample is sketched below
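Purely as an illustration (the totals are not quoted on the slides), a short Python scaling of the per-event figures above to the DC1 sample of 10^7 events, giving order-of-magnitude storage and pile-up CPU requirements:

# Scale the quoted per-event numbers to the 10^7-event DC1 sample.
# Illustrative arithmetic only; the totals are derived here, not taken from the slides.
N_EVENTS = 1e7

scenarios = {
    "L = 2 x 10^33": {"size_MB": 4,  "pileup_s": 55},
    "L = 10^34":     {"size_MB": 15, "pileup_s": 200},
}

for lumi, per_event in scenarios.items():
    total_TB = per_event["size_MB"] * N_EVENTS / 1e6            # MB -> TB
    total_cpu_days = per_event["pileup_s"] * N_EVENTS / 86400   # single-CPU days
    print(f"{lumi}: ~{total_TB:.0f} TB, ~{total_cpu_days:.0f} CPU-days of pile-up")
# -> roughly 40 TB / 6366 CPU-days at 2 x 10^33, and 150 TB / 23148 CPU-days at 10^34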

Page 29:

Issues for DC1

– Manpower is the most precious resource; coherent generation will be a significant job for each participating site

– Do we have enough hardware resources in terms of CPU, disk space, tapes, data servers…? Looks OK
  • entry requirement for generation: O(100) CPUs (NCB) – clouds
• What will we do with the data generated during the DC?
  – keep it on CASTOR? tapes?
• How will we exchange the data?
  – do we want to have all the information at CERN? everywhere?

• What are the networking requirements?

Page 30:

ATLAS interim manpower request from GridPP

• Requested another post for DC co-ordination and management tools, running the DCs, and Grid integration and verification. Looking at the declared manpower, this is insufficient in the pre-Grid era!

• Further post for Replication, Catalogue and MSS integration for ATLAS

Page 31:

Interim ATLAS request

• Grid-aware resource discovery and job submission for ATLAS; essential that all this be programmatic by DC2. Overlap with LHCb?

• Should add to this post(s) for verification activities, which are a large part of the work

• Should also ask for manpower for verification packages

Page 32:

Joint meeting with LHCb

• Common project on Grid-based tools for experiment code installation

• Event selection and data discovery tools (GANGA is an LHCb-proposed prototype layer between Gaudi/Athena and the datastores and catalogues)