INFSO-RI-508833
Enabling Grids for E-sciencE
www.eu-egee.org

UNOSAT and Geant4: Experiences of their merge in the LCG Environment
Patricia Méndez Lorenzo
CERN (IT-GD) / CNAF
1st IEEE International Conference on e-Science and Grid Computing
5-8 December, Melbourne (Australia)



Page 2

Outlook

• Scope of the talk
• Geant4 Experiences
• UNOSAT Experiences
• Summary

This talk assumes you are familiar with the LCG/EGEE infrastructure and architecture

Page 3

Scope of this talk

• LCG/EGEE provides a Grid infrastructure to the high-energy physics experiments at CERN
• HOWEVER... a significant number of other communities are contacting us to use the Grid
• LCG/EGEE foresees supporting these new groups
• The question is: how to begin, and what to do to get involved
• The scope of this talk is to show how the gridification of new projects is performed

Page 4

First Example: Geant4

What is Geant4 (GEometry ANd Tracking)?

• Generic toolkit for Monte Carlo simulation of particle interactions with matter (i.e. detectors)
• Application domains:
  - High-Energy Physics: ATLAS, CMS and LHCb (LHC), BaBar (SLAC), etc.
  - Space Radiation: ESA
  - Medical Physics: proton therapy, brachytherapy, etc.
• Object-oriented (C++) project, modular and extensible. Significantly improved with respect to its predecessor, Geant3, not only in software structure but mainly in physics coverage
• The electromagnetic physics of Geant4 and, even more so, its hadronic physics are complex fields. It is fundamental to test their models over the widest possible range of particles, materials and energies

Page 5

Geant4 Requirements

• Geant4 works with the physics models that the experiments will have to deal with
• Electromagnetic and hadronic physics are fundamental features that Geant4 must simulate properly. However, they are extremely CPU demanding. The CPU time depends on the number of events and on the energy:
  - 1 event of 1 GeV ~ 0.03 sec (2.4 GHz)
  - 1 event of 300 GeV ~ 9-10 sec
• Geant4 wants to use the LCG environment to validate the software it provides to its users twice per year
  - Two large productions per year
  - Goal during the software validation: compare some shower observables between the two different Geant4 versions and check for statistically significant changes
  - Small productions (a few thousand jobs) throughout the year

Page 6

Geant4 Parameters and Strategy

• In order to test the software it provides, Geant4 has to run over a large range of parameters:
  - 8 different particles
  - 23 beam energies
  - 5 physics lists (physics models)
  - 7 simplified detectors
• All combinations are made; these are the independent jobs to be run in each validation of the Geant4 software
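As a rough illustration of how the parameter space above turns into a job list, the sketch below enumerates every combination with `itertools.product`. The labels are placeholders (the talk gives only the counts, not the actual particles, energies, physics lists or detectors), so only the totals are meaningful.

```python
from itertools import product

# Placeholder labels: the talk gives only the counts, not the actual values.
particles = [f"particle_{i}" for i in range(8)]
energies = [f"energy_{i}" for i in range(23)]
physics_lists = [f"physics_list_{i}" for i in range(5)]
detectors = [f"detector_{i}" for i in range(7)]

# Each tuple is one independent job configuration.
jobs = list(product(particles, energies, physics_lists, detectors))
print(len(jobs))  # 8 * 23 * 5 * 7 = 6440

# Number of combinations that share a single physics list.
jobs_per_physics_list = len(jobs) // len(physics_lists)
print(jobs_per_physics_list)  # 8 * 23 * 7 = 1288
```

With these counts, one Geant4 version gives 6440 combinations, or 1288 per physics list.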

Geant4 Procedure

• Two productions have already been validated in LCG
• Comparison of two different software versions
• Generation of samples for each version:
  - First time: separately
  - Second time: both versions together in each job
• Analysis procedure outside the Grid
• Third production ongoing at this moment

Page 7

Geant4 Production in LCG

Stages:

1. Software installation:
  - Installation of the Geant4 packages (with all the required external additional packages: PI, AIDA, etc.)
  - Software provided via a tar file
  - Installation through jobs using specific LCG tools
  - Fundamental request for the sites: a shared area between WNs and a clearly defined software installation area
2. Events production:
  - Jobs sent in bunches of about 1000 jobs (remember all the possible parameter combinations), each bunch defined by a physics list
  - 5000 events were produced in each job
3. Analysis:
  - Statistical tests to compare the two G4 versions
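A back-of-the-envelope estimate of the CPU time per job can be made by combining the 5000 events per job with the per-event timings quoted in the requirements (about 0.03 s at 1 GeV and 9-10 s at 300 GeV on a 2.4 GHz CPU). The sketch assumes that time scales linearly with the number of events and ignores start-up and I/O overhead; both are simplifications, not statements from the talk.

```python
EVENTS_PER_JOB = 5000

# Per-event CPU times quoted in the talk, in seconds (low, high), on a 2.4 GHz CPU.
SECONDS_PER_EVENT = {
    "1 GeV": (0.03, 0.03),
    "300 GeV": (9.0, 10.0),
}


def job_cpu_hours(seconds_per_event, events=EVENTS_PER_JOB):
    """CPU hours for one job, assuming time grows linearly with the event count."""
    return seconds_per_event * events / 3600.0


for energy, (low, high) in SECONDS_PER_EVENT.items():
    print(f"{energy}: {job_cpu_hours(low):.2f} to {job_cpu_hours(high):.2f} CPU hours per job")
```

Under these assumptions a 1 GeV job needs a few minutes, while a 300 GeV job needs roughly 12.5 to 14 CPU hours.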

Page 8

Geant4 Production in LCG

General Characteristics:

• VO:
  - 1st Production: dteam (6 certificates, one as dteamsgm)
  - 2nd Production: alice (2 certificates, one as alicesgm)
  - 3rd Production: first time as Geant4! The group is following all the necessary steps to become a VO
• Sites and middleware operating system:
  - 1st Production: RedHat 7.3
  - 2nd and 3rd Productions: Scientific Linux
• Resources:
  - 1st Production: own RB+BDII+UI: lxb2006 at CERN
  - 2nd and 3rd Productions: lxplus resources and 2 BDII
• All output:
  - 1st Production: about 30 GB stored at CERN (lxn1183)
  - 2nd Production: comparable quantity stored at CERN (lxn1180)
  - 3rd Production: outputs retrieved to an AFS area at CERN provided for this purpose

Page 9

Framework developed for Geant4

A general framework was built, consisting of 3 major tools:
  - Tool for general and automatic job submission
  - Tool for event generation at all sites where the software has been installed
  - Tool for data analysis (not needed during the 2nd Production)

First Part: Tool for job submission

• Copy and registration of the Geant4 package
  - A file containing the TURL is created and passed to the WN
• Tracking of the candidate sites able to accept Geant4 jobs
• Selection of long queues only
• Automatic building of the .jdl files for each long queue
  - Built from the base file proposed by the user, adding the name of the queue where the job is to be submitted
• Software installation tools are used to perform the installation
• Submission of these files to each queue
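The per-queue .jdl generation described above could look roughly like the following sketch. It is not the framework's actual code: the base JDL, the script name `run_geant4.sh` and the queue names are invented for illustration, and pinning a job to a queue through a `Requirements` expression on `other.GlueCEUniqueID` is assumed here as the usual LCG-2 way of doing it. The sketch also assumes the user's base JDL does not already contain a `Requirements` line.

```python
def jdl_for_queue(base_jdl: str, queue: str) -> str:
    """Return a copy of the user's base JDL pinned to one queue (CE)."""
    requirement = f'Requirements = other.GlueCEUniqueID == "{queue}";'
    return base_jdl.rstrip() + "\n" + requirement + "\n"


# Hypothetical base JDL supplied by the user.
BASE_JDL = """Executable = "run_geant4.sh";
StdOutput = "std.out";
StdError = "std.err";
InputSandbox = {"run_geant4.sh"};
OutputSandbox = {"std.out", "std.err"};
"""

# Hypothetical long queues selected from the information system.
LONG_QUEUES = [
    "ce01.example.org:2119/jobmanager-pbs-long",
    "ce02.example.org:2119/jobmanager-lcgpbs-long",
]

# One JDL text per queue, ready to be written to disk and submitted.
jdl_files = {queue: jdl_for_queue(BASE_JDL, queue) for queue in LONG_QUEUES}
print(jdl_files[LONG_QUEUES[0]])
```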

Page 10

Framework developed for Geant4

Software Installation tool (tool submitted in the first step to all sites to install the software)

• First step:
  - The tar file is copied from the SE at CERN to the WN
  - It is untarred and copied to the VO_DTEAM_SW_DIR area
• Second step:
  - Some Geant4 tests are performed to validate the installation
  - If they succeed, a tag is published in the Information System

Results:
• Software installation was attempted at 63 sites
  - 1st Production: 28 sites
  - 2nd Production: 35 sites
  - 3rd Production: transition phase, configuring the G4 sites

Main Problems:
• Sites had submission problems
• Sites had not defined the VO_<VO_NAME>_SW_DIR area or had no shared area among WNs
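The first installation step, together with the main failure mode reported above (an undefined software area), can be sketched as follows. This is an illustrative stand-in rather than the LCG installation tool itself: the function name and its arguments are hypothetical, only the `VO_<VO_NAME>_SW_DIR` variable comes from the talk, and the validation tests and tag publication are left out.

```python
import os
import tarfile


def install_software(tar_path, vo_name, environ=None):
    """Unpack a trusted software tarball into the VO software area and return that directory."""
    environ = os.environ if environ is None else environ
    variable = f"VO_{vo_name.upper()}_SW_DIR"
    sw_dir = environ.get(variable)
    # Failure mode seen at several sites: the area is simply not defined.
    if not sw_dir:
        raise RuntimeError(f"{variable} is not defined on this node")
    if not os.path.isdir(sw_dir):
        raise RuntimeError(f"{variable} points to a missing directory: {sw_dir}")
    with tarfile.open(tar_path) as archive:
        archive.extractall(sw_dir)
    return sw_dir
```

A job wrapper would call something like `install_software("geant4.tar", "dteam")` on the worker node (the file name is made up) and run the validation tests only if no exception is raised.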

Page 11

Framework developed for Geant4

Second Step: Tool for the Production

Strategy:
• Only long queues are used to run the production
• All outputs (hbook files) are stored at CERN

Methodology:
• Geant4 provides its own code to perform the event production
• A Python script for each combination of particle type, energy, physics list and calorimeter is created by the framework from one template provided by Geant4
• One jdl is generated per job, containing the code provided by Geant4 (the same for all jobs) plus the script generated by the framework, which changes for each job
• All jdl files are submitted to all sites holding the Geant4 installation
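Generating one small Python script per job from a single template might be done along these lines. The template text and the parameter values are invented stand-ins (the real template came from Geant4 and is not shown in the talk); `string.Template` is simply one convenient standard-library way to fill it in.

```python
from string import Template

# Hypothetical stand-in for the template provided by Geant4.
SCRIPT_TEMPLATE = Template(
    "particle = '$particle'\n"
    "energy_gev = $energy\n"
    "physics_list = '$physics_list'\n"
    "calorimeter = '$calorimeter'\n"
)


def render_job_script(particle, energy, physics_list, calorimeter):
    """Fill the template with the parameters of a single job."""
    return SCRIPT_TEMPLATE.substitute(
        particle=particle,
        energy=energy,
        physics_list=physics_list,
        calorimeter=calorimeter,
    )


# Example with made-up parameter values.
script = render_job_script("pi-", 10, "physics_list_0", "detector_0")
print(script)
```

Each rendered script would then be shipped with the common Geant4 code in the job's jdl.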

Page 12

Framework developed for Geant4

Results (First and Second Production):
• An hbook file containing 5000 events is created if the production succeeds
• The file name is built by the framework and contains the particle type, the energy, the physics list and the calorimeter (important for performing the comparison later)
• The hbook file is copied and registered to a disk at CERN
• During the 2nd production, a tar file containing several files was expected whenever the job succeeded. This file was retrieved to the AFS area provided for this purpose, and copied and registered to the Grid
• Around 4508 jobs (two physics lists for both Geant4 versions) were run in less than 2 weeks at 28 sites with an efficiency of about 87%
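Encoding the job parameters in the output file name, as described above, makes it possible to pair up files from the two Geant4 versions later. The naming scheme below is an assumption made for illustration (the talk does not give the real format); it uses an underscore separator and therefore rejects fields that themselves contain an underscore.

```python
SEPARATOR = "_"


def output_name(particle, energy_gev, physics_list, calorimeter, ext="hbook"):
    """Build a file name that carries all four job parameters."""
    fields = [str(particle), f"{energy_gev}GeV", str(physics_list), str(calorimeter)]
    for field in fields:
        if SEPARATOR in field:
            raise ValueError(f"field {field!r} must not contain {SEPARATOR!r}")
    return SEPARATOR.join(fields) + "." + ext


def parse_output_name(name):
    """Recover the job parameters from a name built by output_name."""
    stem, _, _ext = name.rpartition(".")
    particle, energy, physics_list, calorimeter = stem.split(SEPARATOR)
    return {
        "particle": particle,
        "energy_gev": float(energy[: -len("GeV")]),
        "physics_list": physics_list,
        "calorimeter": calorimeter,
    }


# Example with made-up parameter values.
name = output_name("e-", 300, "list0", "calo0")
print(name, parse_output_name(name))
```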

Page 13

Update of the Framework

• This framework covered the Geant4 requirements for its first production
  - It is not suitable for larger productions
  - It is difficult to handle the output and visualize the results
• A new, complete tool has been developed for large productions
  - Flexible enough to be used by any VO and any user application
  - Most of the improvements concern output handling
• Documentation: “LCG2 User Guide” http://grid-deployment.web.cern.ch/grid-deployment/cgi-bin/index.cgi?var=eis/docs
• Download: http://goc.grid.sinica.edu.tw/gocwiki/User_tools

Page 14

Update of the Framework

The new framework consists mainly of two tools:
  - Tool to perform the automatic job submission
  - Tool to retrieve and handle the corresponding output

1. Automatic job submission

Overview: given a user's jdl, this tool performs the following actions:
• It lists all sites able to run the jdl provided by the user
• It automatically creates a jdl file based on the one provided by the user
• It submits the newly created jdl containing the user application(s)
• Moreover, it creates a subdirectory (defined by the user) containing a list of the sites where the jobs have been submitted, the corresponding jdls and the job IDs

Page 15

Update of the Framework

Additional Features:
• The user can define the queues where the jobs are submitted. These queues are checked to see whether they fit the job requirements
• Requested LFN files can be included. The corresponding TURLs are looked up and included in a file passed to the WN in the InputSandbox

2. Retrieval and handling of the outputs

• The 2nd tool checks the status of the jobs from the job IDs stored in the directory given by the user
• It provides the following output:

The job run in ramses.dcic.ups.es:2119/jobmanager-torque-dteam is in status: Scheduled
The job run in grid01.phy.ncu.edu.tw:2119/jobmanager-torque-dteam is in status: running
The job run in scaic10.scai.frauhofer.de:2119/jobmanager-torque-dteam is in status: over

• The user is asked whether to retrieve the output to the destination previously chosen
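Status lines in the format shown above are easy to post-process. The sketch below groups the queues by reported status with a regular expression; it assumes the line format is exactly "The job run in <queue> is in status: <status>" and lower-cases the status, since the sample mixes "Scheduled" and "running". It is a helper written for illustration, not part of the framework.

```python
import re
from collections import defaultdict

STATUS_LINE = re.compile(r"^The job run in (?P<ce>\S+) is in status: (?P<status>\S+)$")


def group_by_status(report):
    """Map each (lower-cased) status to the list of queues reporting it."""
    groups = defaultdict(list)
    for line in report.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match:  # silently skip lines in any other format
            groups[match["status"].lower()].append(match["ce"])
    return dict(groups)


REPORT = """\
The job run in ramses.dcic.ups.es:2119/jobmanager-torque-dteam is in status: Scheduled
The job run in grid01.phy.ncu.edu.tw:2119/jobmanager-torque-dteam is in status: running
The job run in scaic10.scai.frauhofer.de:2119/jobmanager-torque-dteam is in status: over
"""

for status, queues in group_by_status(REPORT).items():
    print(status, queues)
```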

Page 16

Update of the Framework

Additional Features:

• It is possible to visualize the outputs on the web
• An HTML report is provided, showing the files chosen by the user

Page 17

New Geant4 Production: DIANE

Results obtained for another community: ITU

Page 18

Second Example: UNOSAT

Satellite-imagery-based web mapping service

Objectives:
• Easy access to quality geoinformation services
• Organize the demand for geoinformation
• Ensure cost-effective and timely products

Core Services:
• Humanitarian Mapping
• Image Processing

Page 19

Second Example: UNOSAT

[Diagram: data suppliers, ground station, UNOSAT central unit, WWW, user]

Page 20

Relief Projects of UNOSAT

Case Study: Indian Ocean Tsunami Relief and Development
• 29th Dec 2004: first map distributed online to field users
• 14th Jan 2005: imagery bank online:
  - 100 tsunami-related maps (pre and post)
  - 670 raw satellite images
• January: 200,000 tsunami maps downloaded in total

• UNOSAT has a huge amount of data to store
• CERN has provided a good amount of space for this purpose
• The collaboration with the GRID began in summer 2005
• Running jobs and storing data in LCG/EGEE can certainly assist UNOSAT in its purposes

Page 21

UNOSAT and LCG/EGEE

• In summer 2005 we provided a whole structure at CERN for UNOSAT:
  - UNOSAT Virtual Organization (VO)
  - 3.5 TB in CASTOR
  - Computing Elements, Resource Brokers
  - Collaboration with the ARDA group
  - AFS area of 5 GB
• We have run some UNOSAT tests (image compression) inside the GRID environment (quite successful)
• The framework developed for Geant4 has been adapted to UNOSAT needs

Page 22

ARDA Support

[Diagram: ARDA application, Oracle DB, LFC, CASTOR and SRM; labels: metadata (x,y,z), LFN, PFN]
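One plausible reading of the diagram labels is a two-step lookup: image metadata (x, y, z) resolves to a logical file name (LFN), and the file catalogue resolves the LFN to a physical file name (PFN) on storage. The sketch below only illustrates that idea with in-memory dictionaries; the catalogue contents, paths and host names are invented, and it does not call any real ARDA, LFC or SRM interface.

```python
# Hypothetical catalogue contents, for illustration only.
METADATA_TO_LFN = {
    (10, 20, 1): "/grid/unosat/images/tile_10_20_1.tif",
}
LFN_TO_PFN = {
    "/grid/unosat/images/tile_10_20_1.tif":
        "srm://se.example.org/castor/example.org/unosat/tile_10_20_1.tif",
}


def locate_image(x, y, z):
    """Resolve metadata to an LFN, then the LFN to a PFN; return None if either step fails."""
    lfn = METADATA_TO_LFN.get((x, y, z))
    if lfn is None:
        return None
    return LFN_TO_PFN.get(lfn)


print(locate_image(10, 20, 1))
print(locate_image(0, 0, 0))
```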

Page 23

Summary

Just one message:
• We are already involving different communities in the GRID
• The GRID has a huge field of applications
• We have created different frameworks to gridify new projects in a short time
• Thanks to the ARDA developers, we have covered many of the needs of each community
• One of the EGEE purposes (involving different communities in the GRID) is already a reality