
Using The EDG Testbed

The European DataGrid Project Team

http://www.eu-datagrid.org

EDG Use Cases Tutorial - n° 2

Summary

Use Cases High Energy Physics

Earth Observation

Biomedical Applications

EDG Use Cases Tutorial - n° 3

EDG Application Areas

High Energy Physics

Biomedical Applications

Earth Observation Science Applications

EDG Use Cases Tutorial - n° 4

High Energy Physics

4 experiments on the LHC: ALICE, ATLAS, CMS, LHCb

~6-8 PB/year, ~10^8 events/year

~10^3 batch and interactive users

EDG Use Cases Tutorial - n° 5

Europe: 267 institutes, 4603 users. Elsewhere: 208 institutes, 1632 users.

CERN’s Network in the World

EDG Use Cases Tutorial - n° 6

Data Flow in LHC

RAW Data

DAQ

Trigger

Reconstruction

Event Summary Data (ESD) Reconstruction Tags

RAW Tags Conditions / Calibration Data

Physics Generator

Detector Simulation

Generator Data

RAWmc Data

Monte Carlo

Reconstruction

Event Summary Data (ESD) Reconstruction Tags

RAWmc Tags Conditions / Calibration Data

EDG Use Cases Tutorial - n° 7

LHCb EDG Integration

LHCb

LHCb distributed computing environment

Integration of DataGrid middleware Authentication

Job submission to DataGrid

Monitoring and control

Data replication

Resource scheduling – use of CERN MSS

EDG Use Cases Tutorial - n° 8

LHCb

LHC collider experiment

10^9 events × 1 MB = 1 PB

Need a distributed model

Create, distribute and keep track of data automatically
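A quick check of the arithmetic above:

```python
# Back-of-envelope check of the LHCb data volume quoted above:
# 10^9 events at ~1 MB each add up to 1 PB per year.
EVENTS_PER_YEAR = 10**9
BYTES_PER_EVENT = 10**6           # ~1 MB per event

total_bytes = EVENTS_PER_YEAR * BYTES_PER_EVENT
petabytes = total_bytes / 10**15  # 1 PB = 10^15 bytes (decimal)

print(f"{petabytes:.1f} PB per year")  # → 1.0 PB per year
```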

EDG Use Cases Tutorial - n° 9

LHCb distributed computing environment:

Submit jobs remotely via Web

Execute on farm

Data quality check

Update bookkeeping database

Transfer data to mass store

EDG Use Cases Tutorial - n° 10

Submit jobs remotely via Web

Execute on farm

Monitor performance of farm via Web

Update bookkeeping database

Transfer data to CASTOR (and HPSS, RAL Datastore)

Data quality check 'online'

Online histogram production using GRID pipes

EDG components used: User Interface, WMS, Information Services, Replica Management, Metadata Catalog

LHCb Environment using EDG Middleware

EDG Use Cases Tutorial - n° 11

1. Authentication

Issue grid-proxy-init to create a valid proxy from your user certificate.

EDG Use Cases Tutorial - n° 12

2. Job Submission

dg-job-submit /home/evh/sicb/sicb/bbincl1600061.jdl -o /home/evh/logsub

bbincl1600061.jdl:

Executable = "script_prod";
Arguments = "1600061,v235r4dst,v233r2";
StdOutput = "file1600061.output";
StdError = "file1600061.err";
InputSandbox = {"/home/evhtbed/scripts/x509up_u149",
  "/home/evhtbed/sicb/mcsend",
  "/home/evhtbed/sicb/fsize",
  "/home/evhtbed/sicb/cdispose.class",
  "/home/evhtbed/v235r4dst.tar.gz",
  "/home/evhtbed/sicb/sicb/bbincl1600061.sh",
  "/home/evhtbed/script_prod",
  "/home/evhtbed/sicb/sicb1600061.dat",
  "/home/evhtbed/sicb/sicb1600062.dat",
  "/home/evhtbed/sicb/sicb1600063.dat",
  "/home/evhtbed/v233r2.tar.gz"};
OutputSandbox = {"job1600061.txt",
  "D1600063",
  "file1600061.output",
  "file1600061.err",
  "job1600062.txt",
  "job1600063.txt"};
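The JDL above is a flat list of attribute = value; pairs. As an illustration, a small helper can render such attributes from a Python dict; the attribute names below are taken from the slide, but the helper itself is hypothetical, not an EDG tool:

```python
def to_jdl(attrs: dict) -> str:
    """Render a dict as JDL text: strings are quoted, lists become
    brace-enclosed sets, everything else is emitted verbatim."""
    def fmt(v):
        if isinstance(v, str):
            return f'"{v}"'
        if isinstance(v, (list, tuple)):
            return "{" + ",".join(fmt(x) for x in v) + "}"
        return str(v)
    return "\n".join(f"{k} = {fmt(v)};" for k, v in attrs.items())

jdl = to_jdl({
    "Executable": "script_prod",
    "Arguments": "1600061,v235r4dst,v233r2",
    "StdOutput": "file1600061.output",
    "StdError": "file1600061.err",
    "OutputSandbox": ["job1600061.txt", "file1600061.output"],
})
print(jdl)
```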

EDG Use Cases Tutorial - n° 13

3. Monitoring and Control

dg-job-status: query the current status of a submitted job

dg-job-cancel: cancel a submitted job

dg-job-get-output: retrieve the output sandbox of a finished job
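Each of these commands takes the job identifier returned by dg-job-submit. A minimal sketch of driving them from a script; the wrapper and the placeholder job ID are illustrative, not part of the EDG tools:

```python
import shlex

# The three EDG job-control commands listed above.
DG_ACTIONS = {"status", "cancel", "get-output"}

def dg_command(action: str, job_id: str) -> list:
    """Build the argv for one of the dg-job-* commands."""
    if action not in DG_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    return [f"dg-job-{action}", job_id]

# On a real EDG User Interface one would run, e.g.:
#   subprocess.run(dg_command("status", job_id), check=True)
# The job ID below is a made-up placeholder, not a real EDG job ID.
argv = dg_command("status", "https://example-rb:9000/JOBID")
print(" ".join(shlex.quote(a) for a in argv))
```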

EDG Use Cases Tutorial - n° 14-17

3. Monitoring and Control (screenshot slides; images not reproduced)

EDG Use Cases Tutorial - n° 18

Data replication (built up over several slides):

A job running on a Compute Element writes its data to local disk.

The data are copied from the local disk to a Storage Element with globus-url-copy, and into the mass store with rfcp.

The stored file is registered in the replica catalog (at NIKHEF) with register-local-file, and the catalog is published.

From there, another Storage Element can obtain a copy with replica-get, or the data can be fetched directly with globus-url-copy.

EDG Use Cases Tutorial - n° 19-23

4. Publish data on storage element

Copy data file to storage element:

globus-url-copy file:///${chemin}/L69999 \
  gsiftp://lxshare0219.cern.ch/flatfiles/SE1/lhcb/L69999

Register stored data in the catalog:

/opt/globus/bin/globus-job-run lxshare0219.cern.ch \
  /bin/bash -c "export GDMP_CONFIG_FILE=/opt/edg/lhcb/etc/gdmp.conf; \
  /opt/edg/bin/gdmp_register_local_file -d /flatfiles/SE1/lhcb"

Publish catalog:

/opt/globus/bin/globus-job-run lxshare0219.cern.ch \
  /bin/bash -c "export GDMP_CONFIG_FILE=/opt/edg/lhcb/etc/gdmp.conf; \
  /opt/edg/bin/gdmp_publish_catalogue -n"
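The three steps above are often scripted together. A minimal sketch that assembles the same command strings (host, paths, and config file as on the slide; the Python wrapper itself is illustrative, not part of the EDG tools):

```python
# Sketch: assemble the copy / register / publish steps shown above as
# shell command strings. Host, paths and config file come from the slide.
SE_HOST = "lxshare0219.cern.ch"
GDMP_CONF = "/opt/edg/lhcb/etc/gdmp.conf"
SE_DIR = "/flatfiles/SE1/lhcb"

def remote_gdmp(cmd: str) -> str:
    """Wrap a gdmp command so it runs on the Storage Element via globus-job-run."""
    return (f"/opt/globus/bin/globus-job-run {SE_HOST} /bin/bash -c "
            f'"export GDMP_CONFIG_FILE={GDMP_CONF}; {cmd}"')

# ${chemin} is the local directory variable used on the slide.
copy = ("globus-url-copy file:///${chemin}/L69999 "
        f"gsiftp://{SE_HOST}{SE_DIR}/L69999")
register = remote_gdmp(f"/opt/edg/bin/gdmp_register_local_file -d {SE_DIR}")
publish = remote_gdmp("/opt/edg/bin/gdmp_publish_catalogue -n")

for step in (copy, register, publish):
    print(step)
```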

EDG Use Cases Tutorial - n° 24

The ALICE Event

EDG Use Cases Tutorial - n° 25

The ALICE Event (cont'd)

## ----- Job Description for Aliroot -----
## author: [email protected]

Executable = "/bin/sh";
StdOutput = "aliroot.out";
StdError = "aliroot.err";
InputSandbox = {"start_aliroot.sh","rootrc","grun.C","Config.C"};
OutputSandbox = {"aliroot.err","aliroot.out","galice.root"};
RetryCount = 7;
Arguments = "start_aliroot.sh 3.02.04 3.07.01";
Requirements = Member(other.RunTimeEnvironment,"ALICE-3.07.01");

start_aliroot.sh:

#!/bin/sh

mv rootrc $HOME/.rootrc

echo "ALICE_ROOT_DIR is set to: $ALICE_ROOT_DIR"

export ROOTSYS=$ALICE_ROOT_DIR/root/$1

export PATH=$PATH:$ROOTSYS/bin

export LD_LIBRARY_PATH=$ROOTSYS/lib:$LD_LIBRARY_PATH

export ALICE=$ALICE_ROOT_DIR/aliroot

export ALICE_LEVEL=$2

export ALICE_ROOT=$ALICE/$ALICE_LEVEL

export ALICE_TARGET=`uname`

export LD_LIBRARY_PATH=$ALICE_ROOT/lib/tgt_$ALICE_TARGET:$LD_LIBRARY_PATH

export PATH=$PATH:$ALICE_ROOT/bin/tgt_$ALICE_TARGET:$ALICE_ROOT/share

export MANPATH=$MANPATH:$ALICE_ROOT/man

$ALICE_ROOT/bin/tgt_$ALICE_TARGET/aliroot -q -b grun.C

EDG Use Cases Tutorial - n° 26

Earth Observation Application

Processing of raw GOME data to ozone profiles with OPERA (KNMI)

Validation of the GOME ozone profiles against ground-based LIDAR measurements (IPSL)

Raw satellite data come from the GOME instrument (ESA)

Two different jobs are executed on the testbed, using data provided via the sandbox model

Visualization of the results

EDG Use Cases Tutorial - n° 27

OPERA application (KNMI)

Ozone profiles can be calculated from the wave spectra measured by the GOME instrument on the ERS satellite. ESA provides these spectra as level 1 data, which are then processed with OPERA to produce ozone profiles, a level 2 product. The algorithm and software (OPERA) were developed by KNMI.

GOME takes ~30,000 usable measurements for ozone profile retrieval per day.

The calculation of one profile takes ~2 min on an 800 MHz PIII.

One day of profiles therefore takes about 40 days on a single computer.
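The 40-day figure follows directly from the two numbers above:

```python
# Check of the throughput estimate above: ~30,000 GOME measurements
# per day at ~2 minutes of CPU per ozone profile (800 MHz PIII).
measurements_per_day = 30_000
minutes_per_profile = 2

total_minutes = measurements_per_day * minutes_per_profile
days_on_one_cpu = total_minutes / (60 * 24)

print(f"{days_on_one_cpu:.1f} days")  # ≈ 41.7 days, i.e. the ~40 days quoted
```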

EDG Use Cases Tutorial - n° 28

Validation application (IPSL)

Profiles produced by OPERA are validated by IPSL using ground-based LIDAR measurements.

Since the LIDAR data are in situ, a pre-selection of the global GOME data has to be performed to create a dataset that is geographically and temporally coincident with the LIDAR measurements.
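A minimal sketch of such a coincidence filter, with an assumed record layout and illustrative thresholds (the real selection criteria are not given on the slide):

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Measurement:
    lat: float       # degrees
    lon: float       # degrees
    time: datetime

def coincident(gome: Measurement, lidar: Measurement,
               max_deg: float = 2.0,
               max_dt: timedelta = timedelta(hours=12)) -> bool:
    """True if a GOME measurement lies within the (assumed) spatial and
    temporal coincidence window of a LIDAR measurement."""
    return (abs(gome.lat - lidar.lat) <= max_deg
            and abs(gome.lon - lidar.lon) <= max_deg
            and abs(gome.time - lidar.time) <= max_dt)

# Toy data: one LIDAR station measurement and two GOME measurements.
lidar = Measurement(48.7, 2.2, datetime(2000, 10, 2, 22, 0))
gome_data = [
    Measurement(49.1, 1.8, datetime(2000, 10, 2, 11, 30)),    # coincident
    Measurement(10.0, 120.0, datetime(2000, 10, 2, 11, 30)),  # far away
]
selected = [m for m in gome_data if coincident(m, lidar)]
print(len(selected))  # → 1
```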

The main function of the program is to perform statistical operations, such as computing the bias between the GOME and LIDAR data at different altitudes and its standard deviation.
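These statistics can be sketched as follows, using toy profiles (the real program and its data formats are not shown on the slide):

```python
import statistics

def bias_and_std(gome_profile, lidar_profile):
    """Per-altitude differences between two ozone profiles:
    returns (mean bias, standard deviation of the differences)."""
    diffs = [g - l for g, l in zip(gome_profile, lidar_profile)]
    return statistics.mean(diffs), statistics.stdev(diffs)

# Toy profiles (arbitrary units), one value per altitude level:
gome  = [5.0, 6.1, 7.2, 6.8]
lidar = [4.8, 6.0, 7.5, 6.6]
b, s = bias_and_std(gome, lidar)
print(f"bias={b:.3f}  std={s:.3f}")
```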

The output of the validation program is two plots, generated with xmgr.

EDG Use Cases Tutorial - n° 29

Used JDL file

Executable = "o3gome-lidar_xmgr.final";
StdOutput = "appli.out";
StdError = "appli.err";
InputSandbox = {"/home/leroy/DEMO_190202/o3gome-lidar_xmgr.final",
  "/home/leroy/DEMO_190202/obs20001019.dat",
  "/home/leroy/DEMO_190202/obs20001002.dat",
  "/home/leroy/DEMO_190202/obs20001003.dat",
  "/home/leroy/DEMO_190202/obs20001004.dat",
  "/home/leroy/DEMO_190202/obs20001005.dat",
  "/home/leroy/DEMO_190202/obs20001006.dat",
  "/home/leroy/DEMO_190202/select_coinc.exe",
  "/home/leroy/DEMO_190202/data_process_demoxmgr",
  "/home/leroy/DEMO_190202/oho30010.gol"};
OutputSandbox = {"out_proc.dat","profil_gome.dat","profil_lidar.dat","appli.out","appli.err"};
Requirements = other.OpSys == "RH 6.2";
RetryCount = 10;
Rank = other.MaxCpuTime;

The profiles produced by OPERA are validated by IPSL using ground-based LIDAR measurements.

One month of data (GOME and LIDAR) is used for the comparison between the different measurements.

The result is visualized using xmgr.

EDG Use Cases Tutorial - n° 30

Validation Output

Figure 1: Estimation of the bias between GOME and LIDAR using one month of data.

Figure 2: Example of two profiles: comparison between a GOME profile and a LIDAR profile for 2 October 2000.

EDG Use Cases Tutorial - n° 31

World-Wide Ozone Distribution Mapping

Need for systematic and global mapping of the ozone distribution

Large amounts of information about atmospheric gases, stored in terabytes of data (GOME, SCIAMACHY)

Scientific community: need for a collaborative environment to study problems such as ozone depletion

GRID

EDG Use Cases Tutorial - n° 32

Example of Application Description

Compute global ozone mapping from 1997-98 (GOME instrument)

1. GOME instrument (data source)
2. Generate 1..n LFNs
3. Build JDL script (IDL program, list of LFNs)
4. Submit job (JDL script to the WMS / GRID)
5. View results (5110 × 700 KB)

1 year = 5110 data files; 1 data file = 15 MB (raw); 67 GB of data to process; 5110 jobs to run
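The workflow above fans one job out per data file. A sketch of that fan-out, with illustrative file names and JDL attributes (not the actual production setup):

```python
# Sketch: one job per raw GOME data file, 5110 files for one year
# per the slide. LFN naming and JDL attributes are illustrative.
N_FILES = 5110

def job_jdl(lfn: str) -> str:
    """Minimal JDL text for one ozone-mapping job over a single LFN."""
    return (f'Executable = "ozone_map.sh";\n'
            f'Arguments = "{lfn}";\n'
            f'StdOutput = "{lfn}.out";\n'
            f'StdError = "{lfn}.err";\n')

lfns = [f"lfn:gome-1997-file{i:04d}" for i in range(N_FILES)]
jdls = [job_jdl(lfn) for lfn in lfns]
print(len(jdls))  # → 5110 JDL scripts, one per data file
```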

EDG Use Cases Tutorial - n° 33

Further Information

High Energy Physics

http://datagrid-wp8.web.cern.ch/DataGrid-WP8/

Bio-Informatics

http://marianne.in2p3.fr/datagrid/wp10/index.html

Earth Observation

http://styx.esrin.esa.it/grid/