INFSO-RI-508833
Enabling Grids for E-sciencE
www.eu-egee.org
UNOSAT and Geant4: Experiences of their merge in the LCG Environment
Patricia Méndez Lorenzo
CERN (IT-GD) / CNAF
1st IEEE International Conference on e-Science and Grid Computing
Dec, 5-8 Melbourne (Australia)
Melbourne 5th-8th December Patricia Méndez Lorenzo
Outlook
• Scope of the talk
• Geant4 Experiences
• UNOSAT Experiences
• Summary
This talk assumes you are familiar with the LCG/EGEE infrastructure and architecture
Scope of this talk
• LCG/EGEE provides a Grid infrastructure to the high-energy physics experiments at CERN
• HOWEVER... a significant number of other communities are contacting us to use the Grid
• LCG/EGEE foresaw supporting these new groups
• The question is: how to begin and what to do to get involved
• The scope of this talk is to show how new projects are gridified
First Example: Geant4
What is Geant4 (GEometry ANd Tracking)?
• Generic toolkit for Monte Carlo simulation of particle interactions with matter (e.g. in detectors)
• Application domains:
  • High-energy physics: ATLAS, CMS and LHCb (LHC), BaBar (SLAC), etc.
  • Space radiation: ESA
  • Medical physics: proton therapy, brachytherapy, etc.
• Object-oriented (C++) project, modular and extensible. Significantly improved with respect to its predecessor, Geant3, not only in software structure but mainly in physics coverage
• The electromagnetic physics of Geant4, and even more so its hadronic physics, are complex fields. It is fundamental to test their models over the widest possible range of particles, materials and energies
Geant4 Requirements
• Geant4 implements the physics models that the experiments will have to deal with
• Electromagnetic and hadronic physics are fundamental features that Geant4 must simulate properly. However, they are extremely CPU-demanding, depending on the number of events and the energy:
  • 1 event at 1 GeV ~ 0.03 sec (2.4 GHz)
  • 1 event at 300 GeV ~ 9-10 sec
• Geant4 wants to use the LCG environment to validate the software it provides to its users, twice per year
  • Two large productions per year
  • Goal during the software validation: compare some shower observables between two different Geant4 versions and check for statistically significant changes
  • Small productions (a few thousand jobs) throughout the year
Geant4 Parameters and Strategy
In order to test the software it provides, Geant4 has to run over a large range of parameters:
• 8 different particles
• 23 beam energies
• 5 physics lists (physics models)
• 7 simplified detectors
All combinations are formed; these are the independent jobs to be run in each validation of the Geant4 software
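The combination step can be sketched in a few lines of Python. This is only an illustration: the counts come from the slide, while the labels (particle0, energy0, ...) are placeholders, and it assumes that the full cross product is run, which an actual production may not do.

```python
from itertools import product

# Counts taken from the slide; the labels themselves are placeholders.
particles = [f"particle{i}" for i in range(8)]
energies = [f"energy{i}" for i in range(23)]
physics_lists = [f"physlist{i}" for i in range(5)]
detectors = [f"detector{i}" for i in range(7)]

# Every (particle, energy, physics list, detector) tuple is one independent job.
jobs = list(product(particles, energies, physics_lists, detectors))
print(len(jobs))  # 8 * 23 * 5 * 7 = 6440

# Jobs that share one physics list, assuming the full cross product is run.
jobs_per_physics_list = len(jobs) // len(physics_lists)
print(jobs_per_physics_list)  # 8 * 23 * 7 = 1288
```

Under these assumptions one validation amounts to a few thousand independent jobs, which is why automatic job generation and submission matter here.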
Geant4 Procedure
• Two productions have already been validated in LCG
• Comparison of two different software versions
• Generation of samples for each version
  • First time: separately
  • Second time: both versions together in each job
• Analysis procedure outside the Grid
• Third production ongoing at this moment
Geant4 Production in LCG
Stages:
1. Software installation:
  • Installation of the Geant4 packages (with all the required additional external packages: PI, AIDA, etc.)
  • Software provided via a tar file
  • Installation through jobs using specific LCG tools
  • Fundamental requirement for the sites: a shared area between WNs and a precise definition of the software installation area
2. Events production:
  • Jobs sent in bunches of about 1000 jobs (remember all the possible parameter combinations), defined by each physics list
  • 5000 events were produced in each job
3. Analysis:
  • Statistical tests to compare the two G4 versions
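A rough estimate of the CPU time per job follows from the per-event timings quoted earlier in the talk (about 0.03 sec at 1 GeV and 9-10 sec at 300 GeV on a 2.4 GHz machine) and the 5000 events per job. The sketch below assumes that the time scales linearly with the number of events and ignores any start-up or I/O overhead; the function name is illustrative.

```python
EVENTS_PER_JOB = 5000  # from the slide


def job_cpu_hours(seconds_per_event, events=EVENTS_PER_JOB):
    """CPU time of one job in hours, assuming linear scaling with event count."""
    return seconds_per_event * events / 3600.0


low_energy = job_cpu_hours(0.03)   # 1 GeV events
high_low = job_cpu_hours(9.0)      # 300 GeV events, lower bound
high_high = job_cpu_hours(10.0)    # 300 GeV events, upper bound

print(f"1 GeV job:   about {low_energy * 60:.1f} minutes")
print(f"300 GeV job: about {high_low:.1f} to {high_high:.1f} hours")
```

Under these assumptions the highest-energy jobs need on the order of half a day of CPU, which is consistent with restricting the production to long queues.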
Geant4 Production in LCG
General Characteristics:
• VO:
  • 1st Production: dteam (6 certificates, one as dteamsgm)
  • 2nd Production: alice (2 certificates, one as alicesgm)
  • 3rd Production: first time as Geant4! The group is following all the necessary steps to become a VO
• Sites and middleware operating system:
  • 1st Production: RedHat 7.3
  • 2nd and 3rd Productions: Scientific Linux
• Resources:
  • 1st Production: own RB+BDII+UI: lxb2006 at CERN
  • 2nd and 3rd Productions: lxplus resources and 2 BDII
• All output:
  • 1st Production: about 30 GB stored at CERN (lxn1183)
  • 2nd Production: comparable quantity stored at CERN (lxn1180)
  • 3rd Production: outputs retrieved to an AFS area at CERN provided for this purpose
Framework developed for Geant4
Generation of a general framework consisting of 3 major tools:
• Tool for general and automatic job submission
• Tool for event generation at all sites where the software has been installed
• Tool for data analysis (not needed during the 2nd Production)
First Part: Tool for job submission
• Copy and registration of the Geant4 package
  • A file containing the TURL is created and passed to the WN
• Tracking of the candidate sites able to accept Geant4 jobs
• Selection of long queues only
• Automatic building of the .jdl files for each long queue
  • Built from those proposed by the user, adding the name of the queue to which the job is submitted
• Software installation tools are used to perform the installation
• Submission of these files to each queue
Framework developed for Geant4
Software Installation tool (tool submitted in the first step to all sites to install the software)
• First step:
  • The tar file is copied from the SE at CERN to the WN
  • It is untarred and copied to the VO_DTEAM_SW_DIR area
• Second step:
  • Some Geant4 tests are performed to validate the installation
  • If they succeed, a tag is published in the Information System
• Results:
  • The software installation was attempted at 63 sites
  • 1st Production: 28 sites
  • 2nd Production: 35 sites
  • 3rd Production: transition phase, configuring the G4 sites
• Main problems:
  • Sites had submission problems
  • Sites had not defined the VO_<VO_NAME>_SW_DIR area or had no shared area among WNs
Framework developed for Geant4
Second Step: Tool for the production
Strategy:
• Only long queues are used to run the production
• All outputs (hbook files) are stored at CERN
Methodology:
• Geant4 provides its own code to perform the event production
• A Python script for each combination of particle type, energy, physics list and calorimeter is created by the framework from one template provided by Geant4
• One jdl is generated per job, containing the code provided by Geant4 (the same for all jobs) plus the script generated by the framework, which changes for each job
• All jdl files are submitted to all sites holding the Geant4 installation
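The methodology above can be illustrated with a minimal sketch. It is not the framework's actual code: the script template, the executable name run_geant4.sh, the file names and the example parameter values are all hypothetical, and the JDL skeleton only uses a handful of common attributes (Executable, Arguments, sandboxes and a Requirements expression pinning the job to one queue).

```python
from string import Template

# Hypothetical stand-in for the per-job script template provided by Geant4.
SCRIPT_TEMPLATE = Template(
    "particle = '$particle'\n"
    "energy_gev = $energy\n"
    "physics_list = '$physics_list'\n"
    "calorimeter = '$calorimeter'\n"
)

# JDL skeleton; the executable and file names are placeholders.
JDL_TEMPLATE = Template(
    'Executable = "run_geant4.sh";\n'
    'Arguments = "$script";\n'
    'StdOutput = "std.out";\n'
    'StdError = "std.err";\n'
    'InputSandbox = {"run_geant4.sh", "$script"};\n'
    'OutputSandbox = {"std.out", "std.err"};\n'
    'Requirements = other.GlueCEUniqueID == "$queue";\n'
)


def make_job(particle, energy, physics_list, calorimeter, queue):
    """Return (script name, script text, jdl text) for one parameter combination."""
    script_name = f"{particle}_{energy}GeV_{physics_list}_{calorimeter}.py"
    script = SCRIPT_TEMPLATE.substitute(
        particle=particle,
        energy=energy,
        physics_list=physics_list,
        calorimeter=calorimeter,
    )
    jdl = JDL_TEMPLATE.substitute(script=script_name, queue=queue)
    return script_name, script, jdl


# Example values are illustrative only.
script_name, script, jdl = make_job(
    "pi-", 300, "QGSP", "FeSci", "ce.example.org:2119/jobmanager-pbs-long"
)
print(script_name)
print(jdl)
```

Keeping the Geant4 code identical across jobs and varying only the small generated script means that a single template pair is enough to produce every jdl of a production.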
Framework developed for Geant4
Results (First and Second Production):
• An hbook file containing 5000 events is created if the production succeeds
• The file name is created by the framework and contains the particle type, the energy, the physics list and the calorimeter (important for the later comparison)
• The hbook file is copied and registered to a disk at CERN
• During the 2nd production, a tar file containing different files was created when the job succeeded. This file was retrieved to the AFS area provided for this purpose, and copied and registered to the Grid
• Around 4508 jobs (two physics lists for both Geant4 versions) were run in less than 2 weeks at 28 sites with an efficiency of about 87%
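Encoding the job parameters in the output file name is what makes the later comparison straightforward: files from the two Geant4 versions can be paired by name alone. The sketch below shows one possible scheme. The double-underscore separator, the field order and the example names are assumptions, not the framework's real convention.

```python
from typing import NamedTuple


class JobKey(NamedTuple):
    particle: str
    energy_gev: int
    physics_list: str
    calorimeter: str


SUFFIX = ".hbook"
SEP = "__"  # hypothetical separator; particle names such as "pi-" may contain "-"


def output_name(key):
    """Build an output file name that carries all four job parameters."""
    return SEP.join(
        [key.particle, f"{key.energy_gev}GeV", key.physics_list, key.calorimeter]
    ) + SUFFIX


def parse_output_name(name):
    """Recover the job parameters from a file name built by output_name."""
    particle, energy, physics_list, calorimeter = name[: -len(SUFFIX)].split(SEP)
    return JobKey(particle, int(energy[: -len("GeV")]), physics_list, calorimeter)


def match_outputs(old_names, new_names):
    """Pair files from two software versions that share the same job parameters."""
    new_by_key = {parse_output_name(n): n for n in new_names}
    pairs = []
    for name in old_names:
        key = parse_output_name(name)
        if key in new_by_key:
            pairs.append((key, name, new_by_key[key]))
    return pairs


key = JobKey("pi-", 300, "QGSP", "FeSci")
name = output_name(key)
pairs = match_outputs([name, "e-__1GeV__LHEP__CuLAr.hbook"], [name])
print(name)
print(pairs)
```

Jobs that failed in one of the two versions simply produce no pair, so the comparison step only sees parameter combinations with output from both.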
Update of the Framework
• This framework covered the Geant4 requirements for its first production
  • It is not suitable for larger productions
  • It is difficult to handle the output and visualize the results
• A new complete tool has been developed for large productions
  • Flexible enough to be used by any VO and any user application
  • Most of the improvements relate to output handling
• Documentation: “LCG2 User Guide” http://grid-deployment.web.cern.ch/grid-deployment/cgi-bin/index.cgi?var=eis/docs
• Download: http://goc.grid.sinica.edu.tw/gocwiki/User_tools
Update of the Framework
The new framework consists mainly of two tools:
• Tool to perform the automatic job submission
• Tool to retrieve and handle the corresponding output
1. Automatic job submission
Overview: Given a user's jdl, this tool performs the following actions:
• It lists all sites able to run the jdl provided by the user
• It automatically creates a jdl file based on the one provided by the user
• It submits the newly created jdl containing the user application(s)
• Moreover, it creates a subdirectory (defined by the user) containing a list of the sites where the jobs have been submitted, the corresponding jdls and the job IDs
Update of the Framework
Additional Features:
• The user can define the queues where the jobs are submitted. These queues are checked to see whether they meet the job requirements
• Requested LFN files can be included. The corresponding TURLs are looked up and included in a file passed to the WN in the InputSandbox
2. Retrieval and handling of the outputs
• The 2nd tool checks the status of the jobs from the job IDs included in the directory given by the user
• It provides the following output:
  The job run in ramses.dcic.ups.es:2119/jobmanager-torque-dteam is in status: Scheduled
  The job run in grid01.phy.ncu.edu.tw:2119/jobmanager-torque-dteam is in status: running
  The job run in scaic10.scai.frauhofer.de:2119/jobmanager-torque-dteam is in status: over
• The user is asked whether to retrieve the output to the destination chosen beforehand
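The status report shown above is regular enough to be post-processed by a script. The following sketch parses lines of that form and summarizes them. It assumes that the line format is exactly as on the slide and that the status "over" marks a job whose output is ready to be retrieved; both are assumptions, and the function name is illustrative.

```python
import re
from collections import Counter

# Matches lines such as:
#   The job run in <host>:<port>/<jobmanager-queue> is in status: <status>
STATUS_LINE = re.compile(
    r"^The job run in (?P<queue>\S+) is in status: (?P<status>\S+)$"
)


def parse_status_report(text):
    """Return a list of (queue, lower-cased status) pairs found in the report."""
    jobs = []
    for line in text.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match:
            jobs.append((match["queue"], match["status"].lower()))
    return jobs


report = """\
The job run in ramses.dcic.ups.es:2119/jobmanager-torque-dteam is in status: Scheduled
The job run in grid01.phy.ncu.edu.tw:2119/jobmanager-torque-dteam is in status: running
The job run in scaic10.scai.frauhofer.de:2119/jobmanager-torque-dteam is in status: over
"""

jobs = parse_status_report(report)
counts = Counter(status for _, status in jobs)
# Assumption: "over" means the job has finished and its output can be fetched.
finished = [queue for queue, status in jobs if status == "over"]

print(counts)
print(finished)
```

A summary like this gives a quick view of how many jobs are still scheduled or running before deciding whether to retrieve any output.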
Update of the Framework
Additional Features:
• It is possible to visualize the outputs on the web
• An HTML report is provided showing the files selected by the user
New Geant4 Production: DIANE
Results obtained for another community: ITU
Second Example: UNOSAT
Satellite imagery based web mapping service
Objectives
• Easy access to quality geoinformation services
• Organize the demand for geoinformation
• Ensure cost-effective and timely products
Core Services
• Humanitarian Mapping
• Image Processing
Second Example: UNOSAT
[Diagram: data suppliers, ground station, UNOSAT Central Unit, WWW, user]
Relief Projects of UNOSAT
Case Study: Indian Ocean Tsunami Relief and Development
• 29th Dec 2004: first map distributed online to field users
• 14th Jan 2005: imagery bank online:
  • 100 tsunami-related maps (pre and post)
  • 670 raw satellite images
• January: 200,000 tsunami maps downloaded in total
• UNOSAT has a huge amount of data to store
• CERN has provided a good amount of storage space for this purpose
• The collaboration with the Grid began in summer 2005
• Running jobs and storing data in LCG/EGEE can certainly help UNOSAT achieve its goals
UNOSAT and LCG/EGEE
• In summer 2005 we provided a complete infrastructure at CERN for UNOSAT
  • UNOSAT Virtual Organization (VO)
  • 3.5 TB in CASTOR
  • Computing Elements, Resource Brokers
  • Collaboration with the ARDA group
  • AFS area of 5 GB
• We have run some UNOSAT tests (image compression) inside the Grid environment (quite successful)
• The framework developed for Geant4 has been adapted to UNOSAT needs
ARDA Support
[Diagram: ARDA application, LFC, Oracle DB, CASTOR, SRM; labels: metadata (x, y, z), LFN, PFN]
Summary
Just one message:
• We are already involving different communities in the Grid
• There is a huge field of applications for the Grid
• We have created different frameworks to gridify new projects in a short time
• Thanks to the ARDA developers we have covered many of the needs of each community
• One of the EGEE purposes (involving different communities in the Grid) is already a reality