05/09/2001 ATLAS UK Physics Meeting
Data Challenge Needs
RWL Jones
05/09/2001 ATLAS UK Physics Meeting
Data challenges
• Goal: validate our computing model and our software
• How? Iterate on a set of DCs of increasing complexity:
  – start with data which looks like real data
  – run the filtering and reconstruction chain
  – store the output data into our database
  – run the analysis
  – produce physics results
• To understand our computing model: performance, bottlenecks, etc.
• To check and validate our software
05/09/2001 ATLAS UK Physics Meeting
But:
• Today we don’t have ‘real data’, so we need to produce ‘simulated data’ first:
Physics Event generation
Simulation
Pile-up
Detector response
Plus reconstruction and analysis; all of these will be part of the first Data Challenges (the chain is sketched schematically below)
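To make the chain concrete, here is a minimal Python sketch of the production steps listed above; every stage function is a toy placeholder standing in for the real programs (generators, Dice, Atrecon/Athena) described later in this talk, not actual ATLAS code.

    # Schematic of the DC production chain; each stage is a toy placeholder
    # for the real program (Pythia, Dice, Atrecon/Athena, ...) named later.
    def generate(n_events):
        return [{"id": i} for i in range(n_events)]    # physics event generation

    def simulate(events):
        return [dict(ev, hits=[]) for ev in events]    # detector simulation

    def add_pileup(events):
        return [dict(ev, pileup=[]) for ev in events]  # overlay minimum-bias events

    def reconstruct(events):
        return [dict(ev, tracks=[]) for ev in events]  # reconstruction

    events = generate(100)
    for stage in (simulate, add_pileup, reconstruct):
        events = stage(events)  # in a real DC each stage writes to / reads from the database
    print("analysed", len(events), "events")           # run the analysis, produce results

The point of the Data Challenges is precisely to exercise each of these arrows at scale, including the database reads and writes between the stages.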
05/09/2001 ATLAS UK Physics Meeting
ATLAS Kits
• Each DC will have an associated kit
• Current kit 1.3.0 for tests, will be replaced for DC0; testing 2.0.3 in October
• Default kit excludes compilers, but an ‘all-in’ kit exists – more intrusive in the OS, but usable on more OS versions
• So far, no Grid/Globus tools included
05/09/2001 ATLAS UK Physics Meeting
ATLAS kit 1.3.0
• Tar file with ATLAS software to be sent to remote GRID sites and used in DC0
• Main requirements:
  – NO AFS
  – NO root privileges to install the software
  – possibility to COMPILE the code
  – NOT too big a tar file
  – should run on Linux platforms
05/09/2001 ATLAS UK Physics Meeting
First version of the ATLAS kit
It installs:
• SRT (Software Release Tools) version 0.3.2
• a subset of ATLAS release 1.3.1: the main ATLAS application code + Makefiles (DiceMain, DicePytMain, AtreconMain; Dice = Geant3-based ATLAS simulation program, Atrecon = ATLAS reconstruction program)
• ATLAS packages and libraries needed for compilation
• CLHEP version 1.6.0.0
05/09/2001 ATLAS UK Physics Meeting
It requires:
• Linux OS (at the moment tested on RedHat 6.1, 6.2, 7.1); Mandrake has problems
• CERNLIB 2000 installed; if you need CERNLIB 2000, use kit1; if you are on RedHat 7.1, you need the compilers in kit2
It provides:
• all instructions to install / compile / run in a README file
• example jobs to run full simulation and reconstruction, plus example datacards (DICE, Atrecon)
• some scripts to set environment variables, to compile, and to run
05/09/2001 ATLAS UK Physics Meeting
It can be downloaded from http://pcatl0a.mi.infn.it/~resconi/kit/atlas_kit.html as ATLAS_kit.tar.gz (~90 MB); then execute: gtar -xvzf ATLAS_kit.tar.gz, which unpacks a directory ATLAS/ of ~500 MB (a sketch of automating this step follows). It has been installed and tested on GRID machines and non-ATLAS machines; the sites involved in the first tests of the ATLAS kit (Milan, Rome, Glasgow, Lund) are providing feedback. It still lacks a verification kit and analysis tools – WORK IN PROGRESS.
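As a convenience, a minimal Python sketch of the fetch-and-unpack step just described; the direct tarball URL is an assumption (the slide gives only the HTML download page), and whether the server still serves the file is not guaranteed.

    import tarfile
    import urllib.request

    # Assumed direct link; the slide only gives the HTML download page.
    KIT_URL = "http://pcatl0a.mi.infn.it/~resconi/kit/ATLAS_kit.tar.gz"

    def fetch_and_unpack(url=KIT_URL, dest="."):
        tarball, _ = urllib.request.urlretrieve(url)   # downloads the ~90 MB tar file
        with tarfile.open(tarball, "r:gz") as tf:      # same effect as gtar -xvzf
            tf.extractall(dest)                        # creates the ~500 MB ATLAS/ directory

    if __name__ == "__main__":
        fetch_and_unpack()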
05/09/2001 ATLAS UK Physics Meeting
What about Globus?
• Not needed in current kit (but needed if you want to be part of the DataGrid Test Bed)
• Will be needed for DC1 (?!) – should be in the Kit if so
• If installing now, take the version from the GridPP website – the Hey CD-ROM is out of date (RPMs will be available)
05/09/2001 ATLAS UK Physics Meeting
DC0: start: 1 November 2001; end: 12 December 2001
• ‘continuity’ test through the software chain
• aim is primarily to check the state of readiness for Data Challenge 1
• 100k Z+jet events, or similar – several times
• software works:
  – issues to be checked include:
    • G3 simulation on PC farm
    • ‘pile-up’ handling
    • what trigger simulation is to be run (ATRIG?)
    • reconstruction running
  – data must be written to / read from the database
05/09/2001 ATLAS UK Physics Meeting
DC1: start: 1 February 2002; end: 30 July 2002
• scope increases significantly beyond DC0
  – several samples of up to 10^7 events
  – should involve CERN & outside-CERN sites
  – as a goal, be able to run:
    • O(1000) PCs
    • 10^7 events
    • simulation, pile-up, and reconstruction
    • in 10-20 days (a back-of-the-envelope check follows)
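A quick back-of-the-envelope check of what the goal above implies; the event count and farm size are from the slide, and the per-event budget is simple arithmetic assuming full farm efficiency.

    # CPU budget per event implied by "10^7 events on O(1000) PCs in 10-20 days".
    N_EVENTS = 10**7
    N_CPUS = 1000
    for days in (10, 20):
        budget_s = N_CPUS * days * 24 * 3600 / N_EVENTS
        print(f"{days} days -> ~{budget_s:.0f} s/event for simulation + pile-up + reconstruction")
    # -> ~86 s/event for 10 days, ~173 s/event for 20 days

Given the pile-up timings quoted later in this talk (55-200 s per event on an HP), this budget is tight, which is exactly why the exercise is worth running.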
05/09/2001 ATLAS UK Physics Meeting
Aims of DC1(1)
• Provide a sample of 10^7 events for HLT studies
  – improves the previous statistics by a factor of 10
• Study the performance of Athena and of the algorithms for use in the HLT
• HLT TDR due for the end of 2002
05/09/2001 ATLAS UK Physics Meeting
Aims of DC1(2)
• Try out running 'reconstruction' and 'analysis' on a large scale.
  – learn about our data model, I/O performance, bottlenecks
• Note: simulation and pile-up will play an important role
05/09/2001 ATLAS UK Physics Meeting
Aims of DC1(3)
• Understand our ‘distributed’ computing model
  – GRID: use of GRID tools
  – data management: database technologies – N events with different technologies
  – distributed analysis: access to data
05/09/2001 ATLAS UK Physics Meeting
Aims of DC1(4)
• Provide samples of physics events to check and extend some of the Physics TDR studies
  – data generated will be mainly ‘Standard Model’
  – checking Geant3 versus Geant4: understand how to do the comparison, and understand what the ‘same’ geometry means
05/09/2001 ATLAS UK Physics Meeting
DC2: start: January 2003; end: September 2003
• scope depends on the ‘success’ of DC0/1
• goals:
  – use of the ‘Test-Bed’
  – 10^8 events, complexity at ~50% of the 2006-07 system
  – Geant4 should play a major role
  – ‘hidden’ new physics
  – test of calibration procedures
  – extensive use of GRID middleware
• Do we want to add part or all of:
  – DAQ
  – Lvl1, Lvl2, Event Filter?
05/09/2001 ATLAS UK Physics Meeting
The ATLAS Data Challenges Project – Structure and Organisation
[Organogram: the ATLAS Data Challenges sit under the CSG, with a DC Overview Board, a DC Execution Board, and a DC Definition Committee (for DC2); the Work Plan Definition feeds a set of Work Packages (WPs) and an RTAG; reports and reviews go via the NCB, which handles resource matters; links run to other computing and Grid projects, the DataGrid Project, and the TIERs.]
05/09/2001 ATLAS UK Physics Meeting
Event generation
• The type of events has to be defined
• Several event generators will probably be used
  – for each of them we have to define the version, in particular for Pythia – is it robust?
• Event types & event generators have to be defined by:
  – the HLT group (for HLT events)
  – the physics community
• Depending on the output we can use the following frameworks:
  – ATGEN/GENZ for ZEBRA output format
  – Athena for output in the OO-db (HepMC)
    • a ZEBRA-to-HepMC converter already exists (the two paths are sketched below)
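To illustrate the two output paths, a purely schematic Python sketch; GenzEvent, HepMCEvent, and genz_to_hepmc are invented stand-ins, not the real ATGEN, Athena, or converter interfaces.

    from dataclasses import dataclass, field

    @dataclass
    class GenzEvent:                       # stand-in for a GENZ bank in ZEBRA
        particles: list = field(default_factory=list)

    @dataclass
    class HepMCEvent:                      # stand-in for a HepMC event record
        vertices: list = field(default_factory=list)

    def genz_to_hepmc(ev):
        # Stand-in for the existing ZEBRA-to-HepMC converter: map bank
        # contents onto the HepMC event structure.
        return HepMCEvent(vertices=list(ev.particles))

    def route_output(ev, target):
        if target == "zebra":              # ATGEN/GENZ path: keep GENZ banks
            return ev
        if target == "hepmc":              # Athena path: HepMC into the OO-db
            return genz_to_hepmc(ev)
        raise ValueError(f"unknown output format: {target}")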
05/09/2001 ATLAS UK Physics Meeting
Simulation: Geant3 or Geant4?
• DC0 and DC1 will still rely on Geant3 – the G4 version is not ready
• We urgently need Geant4 experience:
  – the geometry has to be defined (same as G3, for validation)
  – use standard events for validation
  – the ‘physics’ is improved
• For Geant3 simulation: the ‘Slug/Dice’ or ‘Atlsim’ framework; in both cases the output will be ZEBRA
• For Geant4 simulation: probably the FADS/Goofy framework; output will be ‘hits collections’ in the OO-db
05/09/2001 ATLAS UK Physics Meeting
Pile-up
• Add to the ‘physics event’ N ‘minimum bias’ events
  – N depends on the luminosity; suggested:
    • 2-3 at L = 10^33
    • 6 at L = 2 x 10^33
    • 24 at L = 10^34
  – N depends on the detector: in the calorimeter, NC is ~10 times bigger than for the other detectors
  – matching events for different pile-up in different detectors is a real headache!
• The ‘minimum bias’ events should be generated first; they will then be picked up randomly when the merging is done (see the sketch below)
  – this will be a high-I/O operation
  – efficiency is technology dependent (sequential or random-access files)
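A minimal sketch of the random pick-up merging just described; the event representation (a list of hits) and the merge step are hypothetical simplifications, and in reality the min-bias pool lives in files whose access pattern (sequential vs random) is what makes this a high-I/O operation.

    import random

    # Suggested multiplicities from the slide, keyed by luminosity in cm^-2 s^-1
    # (3 is taken as the upper end of the suggested 2-3 at 10^33).
    PILEUP_N = {1e33: 3, 2e33: 6, 1e34: 24}

    def merge_pileup(physics_event, minbias_pool, luminosity, rng=random):
        """Overlay N randomly picked minimum-bias events on one physics event."""
        n = PILEUP_N[luminosity]
        picks = rng.choices(minbias_pool, k=n)   # random pick-up, with replacement
        merged = list(physics_event)
        for mb in picks:
            merged.extend(mb)                    # naive overlay of hit lists
        return merged

    minbias_pool = [[("hit", i)] for i in range(1000)]          # toy pre-generated pool
    print(len(merge_pileup([("hit", -1)], minbias_pool, 1e34)))  # -> 25 (1 + 24)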
05/09/2001 ATLAS UK Physics Meeting
Reconstruction
• Run in the Athena framework
• Input should be from the OO-db
• Output in the OO-db:
  – ESD
  – AOD
  – TAG (the three tiers are illustrated schematically below)
• Atrecon could be a back-up possibility – to be decided
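As a schematic illustration of the three tiers, a toy Python sketch; the classes and fields are invented, only the ESD/AOD/TAG names come from the slide (the convention being that each tier is a smaller summary of the previous one).

    from dataclasses import dataclass

    @dataclass
    class ESD:                 # Event Summary Data: detailed reconstruction output
        tracks: list
        clusters: list

    @dataclass
    class AOD:                 # Analysis Object Data: physics objects derived from the ESD
        jets: list

    @dataclass
    class TAG:                 # compact event-level metadata for fast selection
        run: int
        event: int
        n_jets: int

    def derive(esd, run, event):
        aod = AOD(jets=esd.clusters)                           # toy derivation
        tag = TAG(run=run, event=event, n_jets=len(aod.jets))
        return aod, tag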
05/09/2001 ATLAS UK Physics Meeting
Data management
• Many ‘pieces’ of infrastructure are still to be decided
  – everything related to the OO-db (Objectivity and/or ORACLE)
    • tools for creation, replication, distribution
  – what do we do with ROOT I/O? which fraction of the events will be done with ROOT I/O?
  – thousands of files will be produced and will need ‘bookkeeping’ and a ‘catalog’ (a minimal sketch follows):
    • where is the ‘HepMC’ truth data?
    • where is the corresponding ‘simulated’ or AOD data?
    • selection and filtering?
    • correlation between different pieces of information?
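A minimal sketch of the kind of bookkeeping catalog these questions call for, linking truth, simulated, and AOD files per dataset; the schema is invented for illustration and says nothing about the Objectivity/ORACLE/ROOT choice above.

    import sqlite3

    # Invented schema: one row per produced file, so "where is the truth /
    # simulated / AOD data for this sample?" becomes a simple query.
    def make_catalog(path=":memory:"):
        db = sqlite3.connect(path)
        db.execute("CREATE TABLE files ("
                   "dataset TEXT, run INTEGER, tier TEXT, "  # tier: hepmc/sim/aod
                   "site TEXT, filename TEXT)")
        return db

    def register(db, dataset, run, tier, site, filename):
        db.execute("INSERT INTO files VALUES (?,?,?,?,?)",
                   (dataset, run, tier, site, filename))

    def locate(db, dataset, tier):
        return db.execute("SELECT site, filename FROM files "
                          "WHERE dataset=? AND tier=?", (dataset, tier)).fetchall()

    db = make_catalog()
    register(db, "zjet_100k", 1, "hepmc", "CERN", "zjet.run1.hepmc")
    register(db, "zjet_100k", 1, "aod", "RAL", "zjet.run1.aod")
    print(locate(db, "zjet_100k", "aod"))   # -> [('RAL', 'zjet.run1.aod')]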
05/09/2001 ATLAS UK Physics Meeting
Data management (continued)
• Several technologies will be evaluated, so we will have to duplicate some data:
  – the same data in ZEBRA & in the OO-db
  – the same data in ZEBRA FZ and in ZEBRA random-access format (for pile-up)
  – we need to quantify this overhead (a rough estimate follows)
• We also have to realize that performance will depend on the technology – sequential versus random-access files
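A rough, purely illustrative way to put a first number on that overhead; the 4 MB/event figure is taken from the storage slide later in this talk, and the assumption that each extra format costs one full extra copy is mine.

    # Illustrative duplication overhead for a 10^7-event sample kept in
    # several formats; "one full copy per format" is a simplification.
    EVENT_SIZE_MB = 4                       # per event at L = 2 x 10^33 (later slide)
    N_EVENTS = 10**7
    formats = ["ZEBRA FZ", "ZEBRA random-access", "OO-db"]

    one_copy_tb = EVENT_SIZE_MB * N_EVENTS / 1e6
    print(f"one copy: {one_copy_tb:.0f} TB; "
          f"{len(formats)} formats: {one_copy_tb * len(formats):.0f} TB")
    # -> one copy: 40 TB; 3 formats: 120 TB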
05/09/2001 ATLAS UK Physics Meeting
DC0 planning
• For DC0, probably in the September software week, decide on the strategy to be adopted:
  – software to be used:
    • Dice geometry
    • reconstruction adapted to this geometry
    • database
  – infrastructure: Gilbert hopes (hmmm) that we will have ‘tools’ in place for:
    • automatic job submission
    • data catalog and bookkeeping
    • allocation of ‘run numbers’ and of ‘random numbers’ (bookkeeping) – sketched below
  – the ‘validation’ of components must be done now
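A minimal sketch of run-number and seed allocation of the sort asked for above; the scheme (sequential run numbers, each job's seed derived deterministically from its run number, everything recorded for bookkeeping) is an assumed recipe, not the actual tool.

    import itertools

    _run_counter = itertools.count(start=1)
    _bookkeeping = {}                       # run number -> what was allocated

    def allocate_run(dataset, seed_base=12345):
        """Hand out a unique run number and a reproducible random seed."""
        run = next(_run_counter)
        seed = seed_base + 2 * run + 1      # odd and unique per run (assumed recipe)
        _bookkeeping[run] = {"dataset": dataset, "seed": seed}
        return run, seed

    run, seed = allocate_run("zjet_100k")
    print(run, seed, _bookkeeping[run])     # any job can be re-run from this record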
05/09/2001 ATLAS UK Physics Meeting
Currently available software, June-September 2001:
• Particle-level simulation: atgen-b (Fortran, HP only) – Pythia 5.7 / JETSET 7.4 plus code dedicated to B-physics; LUJETS -> GENZ bank, written to ZEBRA
• Detector simulation: Dice (SLUG + Geant3, Fortran) – produces GENZ + KINE banks, written to ZEBRA
• Reconstruction: Atrecon (Fortran, C++) – reads GENZ + KINE
• In ATHENA:
  – fast detector simulation: Atlfast++ – reads GENZ, converts to HepMC, produces ntuples
  – reconstruction (C++, new geometry and TDR geometry) – reads GENZ + KINE, converts to HepMC, produces ntuples
05/09/2001 ATLAS UK Physics Meeting
Simulation software to be available November-December 2001:
• Particle-level simulation in ATHENA: GeneratorModules (C++, Linux) – Pythia 6 plus code dedicated to B-physics; PYJETS -> HepMC; the EvtGen BaBar package (later)
• Detector simulation: Dice (SLUG + Geant3, Fortran) – produces GENZ + KINE banks, written to ZEBRA (HepMC input??)
• In ATHENA:
  – fast detector simulation: Atlfast++ – reads HepMC, produces ntuples
  – reconstruction (C++) – reads GENZ + KINE, converts to HepMC, produces ntuples
05/09/2001 ATLAS UK Physics Meeting
Analysis
– Analysis tools evaluation should be part of the DC
– Required for test of the Event Data Model
– Essential for tests of Computing Models
– Output for HLT studies will be only a few hundred events
– ‘Physics events’ would be more appropriate for this study
– ATLAS Kit must include analysis tools
05/09/2001 ATLAS UK Physics Meeting
Storage and CPU issues in DC1
• Testing storage technology will inflate the data volume per event (easiest to re-simulate)
• Testing software chains will inflate the CPU usage per event
• The size of the events with pile-up depends on the luminosity:
  – 4 MB per event @ L = 2 x 10^33
  – 15 MB per event @ L = 10^34
• The time to do the pile-up also depends on the luminosity:
  – 55 s (HP) per event @ L = 2 x 10^33
  – 200 s (HP) per event @ L = 10^34
(totals for the full DC1 sample are worked out below)
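Scaling these per-event figures to the full 10^7-event DC1 sample makes the stakes concrete; the sample size and per-event numbers are from the slides, the totals are arithmetic.

    # Totals implied by the per-event figures above for 10^7 events.
    N_EVENTS = 10**7
    for lumi, size_mb, pileup_s in [("2 x 10^33", 4, 55), ("10^34", 15, 200)]:
        volume_tb = size_mb * N_EVENTS / 1e6
        days_on_1000 = pileup_s * N_EVENTS / 1000 / 86400
        print(f"L = {lumi}: {volume_tb:.0f} TB, pile-up ~{days_on_1000:.1f} days on 1000 CPUs")
    # -> 40 TB and ~6.4 days at 2 x 10^33; 150 TB and ~23.1 days at 10^34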
05/09/2001 ATLAS UK Physics Meeting
Issues for DC1
– Manpower is the most precious resource; coherent generation will be a significant job for each participating site
– Do we have enough hardware resources in terms of CPU, disk space, tapes, data servers…? Looks OK
• Entry-requirement for generation O(100) CPUs (NCB) – clouds
• What will we do with the data generated during the DC?– Keep it on CASTOR? Tapes?
• How will we exchange the data?
  – Do we want to have all the information at CERN? Everywhere?
• What are the networking requirements?
05/09/2001 ATLAS UK Physics Meeting
ATLAS interim manpower request from GridPP
• Requested another post for DC co-ordination and management tools, running DCs, and Grid integration and verification. Looking at the declared manpower, this is insufficient even in the pre-Grid era!
• Further post for Replication, Catalogue and MSS integration for ATLAS
05/09/2001 ATLAS UK Physics Meeting
Interim ATLAS request
• Grid-aware resource discovery and job submission for ATLAS; it is essential that all of this be programmatic by DC2. Overlap with LHCb?
• We should add to this a post (or posts) for verification activities, which are a large part of the work
• Should also ask for manpower for verification packages
05/09/2001 ATLAS UK Physics Meeting
Joint meeting with LHCb
• Common project on Grid-based tools for experiment code installation
• Event selection and data discovery tools (GANGA is an LHCb-proposed prototype layer between Gaudi/Athena and the datastores and catalogues)