EDG Applications The European DataGrid Project Team .


EDG Applications

The European DataGrid Project Team

http://www.eu-datagrid.org


EDG Application Areas

High Energy Physics

Biomedical Applications

Earth Observation Science Applications


High Energy Physics

4 experiments on LHC, including CMS, ATLAS and LHCb

~6-8 PetaBytes / year

~10^8 events/year

~10^3 batch and interactive users


CERN's Network in the World

Europe: 267 institutes, 4603 users

Elsewhere: 208 institutes, 1632 users


Data Flow in LHC

[Diagram, real-data chain labels: DAQ, Trigger, RAW Data, Reconstruction, Event Summary Data (ESD), Reconstruction Tags, RAW Tags, Conditions / Calibration Data]

[Diagram, Monte Carlo chain labels: Physics Generator, Generator Data, Detector Simulation, RAWmc Data, Reconstruction, Event Summary Data (ESD), Reconstruction Tags, RAWmc Tags, Conditions / Calibration Data]


Example: CMS Monte Carlo Production


CMS jobs description

CMKIN : MC Generation of the proton-proton interaction for a physics channel (dataset)

CMSIM: Detailed simulation of the CMS detector, processing the data produced during the CMKIN step

[Diagram: the CMKIN job writes its output data to a Grid Storage Element; the CMSIM job reads from a Grid Storage Element and writes its output data to a Grid Storage Element]

         size/event    time*/event
CMKIN    ~ 0.05 MB     ~ 0.4-0.5 sec
CMSIM    ~ 1.8 MB      ~ 6 min

* PIII 1GHz 512MB 46.8 SI95
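The per-event figures quoted for CMKIN and CMSIM scale linearly with the number of events, so they can be turned into a rough budget for a production run. The following is a minimal Python sketch, assuming decimal units (1 GB = 1000 MB) and the reference CPU from the footnote. The function name `production_estimate` and the 250,000-event run size are illustrative choices, not part of the CMS tools.

```python
def production_estimate(n_events, mb_per_event, sec_per_event):
    """Return (total output size in GB, total CPU time in hours)."""
    total_gb = n_events * mb_per_event / 1000.0      # decimal GB
    total_hours = n_events * sec_per_event / 3600.0  # CPU-hours on the reference machine
    return total_gb, total_hours


# Per-event figures from the slide (CMKIN time taken as the midpoint of 0.4-0.5 s).
STEPS = {
    "CMKIN": {"mb_per_event": 0.05, "sec_per_event": 0.45},
    "CMSIM": {"mb_per_event": 1.8, "sec_per_event": 6 * 60},
}

if __name__ == "__main__":
    for name, cost in STEPS.items():
        gb, hours = production_estimate(250_000, **cost)
        print(f"{name}: {gb:.1f} GB, {hours:.1f} CPU-hours")
```

With these inputs the CMSIM step comes to roughly 450 GB and 25,000 CPU-hours, against about 12.5 GB and 31 CPU-hours for CMKIN, so the detector simulation dominates both storage and compute.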


CMS production components interfaced to EDG middleware

Production is managed from the EDG User Interface with IMPALA/BOSS

CMS Virtual Organization server at NIKHEF (Amsterdam)

[Diagram labels: CMS, EDG, RefDB, parameters, UI (IMPALA/BOSS), BOSS DB, JDL, Workload Management System, CE (CMS software), SE. Legend: Push data or info; Pull info.]
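In the diagram, jobs reach the Workload Management System as JDL (Job Description Language) text. The sketch below shows how a production script might render such a description from Python. It is an illustration only, not how IMPALA/BOSS generates its jobs. The attribute names (`Executable`, `Arguments`, `StdOutput`, `StdError`, `InputSandbox`, `OutputSandbox`) are the commonly used JDL ones as far as I recall them; the function `make_jdl` and all file names are hypothetical.

```python
def make_jdl(executable, arguments, input_files, output_files):
    """Render a small JDL job description as text (illustrative only)."""

    def quoted_list(items):
        # JDL lists are written as {"a", "b"}.
        return "{" + ", ".join('"%s"' % item for item in items) + "}"

    outputs = ["job.out", "job.err"] + list(output_files)
    lines = [
        'Executable = "%s";' % executable,
        'Arguments = "%s";' % arguments,
        'StdOutput = "job.out";',
        'StdError = "job.err";',
        "InputSandbox = %s;" % quoted_list(input_files),
        "OutputSandbox = %s;" % quoted_list(outputs),
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical CMKIN wrapper script and output file.
    print(make_jdl("run_cmkin.sh", "--events 125", ["run_cmkin.sh"], ["cmkin.ntpl"]))
```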


CMS production components interfaced to EDG middleware

CMKIN jobs running on all EDG Testbed sites with CMS software installed

CMSIM jobs running on CE close to the input data

Produced data: scripts for batch replication to a dedicated SE

[Diagram labels: CMS, EDG, RefDB, parameters, UI (IMPALA/BOSS), BOSS DB, JDL, Workload Management System, Replica Manager, data registration, input data location, CE (CMS software), WN, SE. Legend: Push data or info; Pull info.]
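Running CMSIM jobs on a CE close to the input data amounts to intersecting two pieces of information: where the replicas of the input file are, and which SEs each CE considers close. In EDG this matchmaking is done by the middleware; the sketch below only illustrates the idea. The dictionaries, the function `ces_close_to_input` and all host names are hypothetical.

```python
def ces_close_to_input(lfn, replicas, close_ses, ces_with_software):
    """Return the CEs that have CMS software and a close SE holding `lfn`."""
    holding = set(replicas.get(lfn, ()))
    return sorted(
        ce for ce in ces_with_software
        if holding & set(close_ses.get(ce, ()))
    )


# Hypothetical replica catalogue: logical file name -> SEs holding a copy.
replicas = {"lfn:cmkin_run42.ntpl": ["se1.example.org", "se3.example.org"]}

# Hypothetical site information: CE -> SEs considered close to it.
close_ses = {
    "ce1.example.org": ["se1.example.org"],
    "ce2.example.org": ["se2.example.org"],
    "ce3.example.org": ["se3.example.org"],
}

# CEs where the CMS software is installed.
ces_with_software = ["ce1.example.org", "ce2.example.org"]

if __name__ == "__main__":
    print(ces_close_to_input("lfn:cmkin_run42.ntpl", replicas, close_ses, ces_with_software))
```

Here only `ce1.example.org` qualifies: `ce2` has the software but no close replica, and `ce3` is close to a replica but lacks the software.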


CMS production components interfaced to EDG middleware

Job monitoring and bookkeeping: BOSS DBs, EDG Logging & Bookkeeping service

[Diagram labels: CMS, EDG, RefDB, parameters, UI (IMPALA/BOSS), BOSS DB, JDL, Workload Management System, Replica Manager, data registration, input data location, job output filtering, runtime monitoring, CE (CMS software), WN, SE. Legend: Push data or info; Pull info.]
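The "job output filtering" and "runtime monitoring" labels describe pulling progress information out of a job's output and recording it in a bookkeeping database. The following sketch shows that pattern with a regular expression and an in-memory SQLite table. It is not the BOSS implementation: the log line format, the table layout and the function names are all assumptions made for illustration.

```python
import re
import sqlite3

# Hypothetical log format; real job output will differ.
EVENT_LINE = re.compile(r"^processed event (\d+)$")


def open_bookkeeping_db():
    """Create an in-memory table with one row per job."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE job_status (job_id TEXT PRIMARY KEY, last_event INTEGER)")
    return db


def filter_job_output(job_id, lines, db):
    """Record the last processed event number found in `lines` for `job_id`."""
    last_event = None
    for line in lines:
        match = EVENT_LINE.match(line.strip())
        if match:
            last_event = int(match.group(1))
    db.execute(
        "INSERT OR REPLACE INTO job_status (job_id, last_event) VALUES (?, ?)",
        (job_id, last_event),
    )
    return last_event
```

A monitoring client could then query `job_status` while the job is still running, which is the kind of runtime view the diagram suggests.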


CMS use of the system (Statistics)

[Plot residue: labels CEs and SEs; axes Nb. of evts versus time]

Events Production within EDG is part of the Official CMS production

http://cmsdoc.cern.ch/cms/production/www/html/general/


Summary of CMS work and plans for use of EDG middleware

RESULTS

We can distribute and run CMS s/w in the EDG environment

We have generated ~250K events for physics with ~10000 jobs in a 3-week period

OBSERVATIONS and PLANNING for the future

We were able to quickly add new sites to provide extra resources

There was a fast turnaround in bug fixing and installing new software

The stress test was labor intensive (since the software was still under development)

Release EDG 2.0 should fix the major problems and allow for enhanced scalability, and we look forward to evaluating it and using it in our Data Challenge work


EDG EO challenge: Processing / validation of 1y of GOME data

Level 1: Raw satellite data from the GOME instrument (~75 GB, ~5000 orbits/y)

ESA (IT), KNMI (NL): Processing of raw GOME data to ozone profiles. 2 alternative algorithms, ~28000 profiles/day

Level 2 (example of 1 day total O3)

IPSL (FR): Validate some of the GOME ozone profiles (~10^6/y) coincident in space and time with ground-based measurements. Visualization & analysis

LIDAR data (7 stations, 2.5 MB per month)

DataGrid environment


EO WebMap Portal


Processing Sequence

[Diagram components: Web Portal, EO Product Catalogue, EO Product Archive, EO Grid Engine, EO Replica Catalogue, EDG User Interface, EDG Resource Broker, EDG Computing Element, EDG Storage Element]

1. Search Level-1 catalogue

2. Retrieve Level-2 products

3. Level-2 products already registered in RC?

Yes? 4. Return available Level-2 products

No? 5. Perform GRID processing on-the-fly

6. Transfer Level-1 data from Archive to the Grid

7. Register Level-1 data

8. Submit jobs to process Level-1 data

9. Process Level-1 data

10. Transfer Level-2 data to SE

11. Register Level-2 data

12. Return new Level-2 products
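The decision at step 3 makes the replica catalogue act like a cache: Level-2 products that are already registered are returned directly, and only missing ones trigger Grid processing. The sketch below captures that control flow in Python. The catalogue is modelled as a plain dictionary and `process_level1` is a stand-in callable; both are assumptions for illustration, not the portal's actual interfaces.

```python
def get_level2_products(orbit_ids, replica_catalogue, process_level1):
    """Return Level-2 products for the orbits, processing missing ones on demand."""
    products = {}
    for orbit in orbit_ids:
        if orbit not in replica_catalogue:           # step 3: already registered?
            # steps 5-11: process the Level-1 data and register the result
            replica_catalogue[orbit] = process_level1(orbit)
        products[orbit] = replica_catalogue[orbit]   # step 4 or step 12
    return products


def fake_grid_processing(orbit):
    """Stand-in for steps 5-11; a real run would submit Grid jobs."""
    return "lfn:level2/" + orbit


if __name__ == "__main__":
    catalogue = {"orbit-1": "lfn:level2/orbit-1"}  # hypothetical logical file names
    print(get_level2_products(["orbit-1", "orbit-2"], catalogue, fake_grid_processing))
```

After the call, `orbit-2` is registered in the catalogue as well, so a second request for the same orbit would skip the processing branch.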


GOME Ozone Profile Validation

Goals of the DataGrid application: validate satellite data with all available ground-based data in an easy way:

Comparison of ozone profiles provided by satellite with lidar data in different locations and times (see the web portal)

Statistical comparison and analysis in order to improve algorithms

[Figure labels: ERS/GOME satellite; OZONE LAYER; 50 km; 10 km; Lidar at the Haute Provence Observatory]


Validation Processing Sequence

Satellite data validation

[Diagram with numbered steps 1 to 4; labels:]

GRID Portal

Queries and data information retrieval from the Lidar metadata catalogue (Lidar data catalogue)

Queries and data information retrieval from the GOME Level 2 orbit or pixel metadata catalogues (Level 2 Catalogue)

Submission of the job in the GRID (Computing Element; Storage Elements with Lidar data; Storage Elements with GOME L2 data; Lidar site)

When completed, comparison between lidar and satellite ozone profiles
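Validation only compares satellite and lidar profiles that are coincident in space and time, so the catalogue queries effectively perform a join on distance and time difference. The sketch below shows one way to express that selection. The 500 km and 12 hour thresholds, the record layout and the function names are assumptions for illustration; the slides do not state the actual coincidence criteria.

```python
import math


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points on a sphere of radius 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def coincident(gome, lidar, max_km=500.0, max_hours=12.0):
    """Return (gome id, lidar id) pairs that are close in both space and time.

    Each record is a dict with keys "id", "lat", "lon" (degrees) and
    "hour" (hours since a common reference time).
    """
    pairs = []
    for g in gome:
        for l in lidar:
            near = great_circle_km(g["lat"], g["lon"], l["lat"], l["lon"]) <= max_km
            soon = abs(g["hour"] - l["hour"]) <= max_hours
            if near and soon:
                pairs.append((g["id"], l["id"]))
    return pairs
```

The nested loop is fine for a handful of lidar stations; for a full year of satellite pixels, filtering on time first would keep the number of distance computations small.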


Validation Output

Figure 1: Estimation of the bias between GOME and lidar using one month of data.

Figure 2: Example of 2 profiles: comparison between GOME profile and lidar profile for 2 October 2000.
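A bias between two sets of profiles is often summarised as the mean relative difference at each altitude level over all coincident pairs. The sketch below computes that quantity. It assumes both profiles in a pair are already sampled on the same altitude grid, and the formula is a common choice offered as an assumption, not necessarily the one behind Figure 1.

```python
def mean_relative_bias(pairs):
    """Mean of (gome - lidar) / lidar per altitude level, in percent.

    `pairs` is a list of (gome_profile, lidar_profile) tuples; each profile is
    a list of ozone values on the same altitude grid.
    """
    n_levels = len(pairs[0][0])
    bias = []
    for level in range(n_levels):
        diffs = [
            100.0 * (g[level] - l[level]) / l[level]
            for g, l in pairs
        ]
        bias.append(sum(diffs) / len(diffs))
    return bias
```

For example, two pairs whose lowest-level values differ from the lidar by +10% and -10% give a zero mean bias at that level, even though each individual profile disagrees.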


Perspectives for Biomedical Applications

Grids open new perspectives in large scale genomics analysis
  Complete genome annotation
  Cross-genomes analysis
  Data mining on distributed databases
  Pipelining of huge automatic bio-informatics analysis

Medical image processing
  Large databases processing
  Anatomy and physiology modeling
  Epidemiological studies


Biomedical Applications

Bio-informatics
  Phylogenetics: BBE Lyon (T. Sylvestre)
  Search for primers: Centrale Paris (K. Kurata)
  Statistical genetics: CNG Evry (N. Margetic)
  Bio-informatics web portal: IBCP (C. Blanchet)
  Parasitology: LBP Clermont, Univ B. Pascal (N. Jacq)
  Data-mining on DNA chips: Karolinska (R. Médina, R. Martinez)
  Geometrical protein comparison: Univ. Padova (C. Ferrari)

Medical imaging
  MR image simulation: CREATIS (H. Benoit-Cattin)
  Medical data and metadata management: CREATIS (J. Montagnat)
  Mammographies analysis: ERIC/Lyon 2 (S. Miguet, T. Tweed)
  Simulation platform for PET/SPECT based on Geant4: GATE collaboration (L. Maigne)

Legend: Applications deployed; Applications tested on EDG; Applications under preparation


Medical Imaging

[Diagram: medical images and metadata (LFN, image, patient, hospital, ...)]

1. query

2. visualisation

3. similarity search

4. scores

5. best results visualisation


Graphic layer

Job Monitoring

Grid File Browsing

File registration and retrieval


Graphical Interfaces

Image registration: Local files; Grid files; Metadata

Image retrieval: Query over metadata; Query result


Image Registration

LFN image patient hospital ...

Imager

SE


Similarity search

Similarity computation

Results visualization

[Screenshot labels: Job monitoring; Ranked list of images; Source image; Most similar images; Low score images]
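The similarity search reduces to scoring every candidate image against the source image and sorting by score, which yields the ranked list with the most similar images first and low-score images last. The sketch below uses negative mean squared difference as the score. That measure, the flat-list image representation and the function names are assumptions for illustration; the slides do not name the similarity measure actually used.

```python
def similarity(a, b):
    """Negative mean squared difference: higher means more similar."""
    return -sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def rank_by_similarity(source, candidates):
    """Return (score, name) pairs sorted from most to least similar.

    `source` is a flat list of pixel intensities; `candidates` maps an image
    name to a list of the same length.
    """
    scored = [(similarity(source, pixels), name) for name, pixels in candidates.items()]
    return sorted(scored, reverse=True)
```

Each score depends only on the source and one candidate, so on a grid the scoring step can be split into independent jobs and the sort done once the scores are collected.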


Future: Interfacing medical data with the Grid

[Architecture diagram labels:]

Medical (trusted) site; Grid middleware

Medical server: core; Client 1 interface; Client 2 interface; grid-server interface; RS interface; RC interface; Metadata interface; header blanking; encryption

Imager; Master File; Replica

Storage Element; Storage Element / MSS; Replica Catalog; Replication Service

File metadata: ACL, size, checksum, ...

Application metadata: ACL, encryption key, sensitive metadata, ...
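Header blanking means emptying patient-identifying fields before an image leaves the trusted medical site, while file metadata such as size and checksum travels with the file on the Grid. The sketch below illustrates both steps with the Python standard library. The field names in `SENSITIVE_FIELDS` and the function names are hypothetical, SHA-256 is an arbitrary choice of checksum, and encryption is left out because the standard library has no symmetric cipher.

```python
import hashlib

# Hypothetical header field names; a real image format defines its own.
SENSITIVE_FIELDS = {"patient_name", "patient_id", "birth_date"}


def blank_header(header):
    """Return a copy of the image header with sensitive fields emptied."""
    return {
        key: ("" if key in SENSITIVE_FIELDS else value)
        for key, value in header.items()
    }


def file_metadata(payload):
    """Size and SHA-256 checksum of the image bytes."""
    return {"size": len(payload), "checksum": hashlib.sha256(payload).hexdigest()}
```

The original header is left untouched, so the trusted site keeps the full record while only the blanked copy is registered on the Grid.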


Parallel Processing

Magnetic Resonance Images simulation using the grid

3 levels of parallelism:

Parallel isochromat computations

Multi-slice MRI computation

Parallel magnetization kernel

[Diagram labels: Virtual object; MRI sequence; Magnetisation computation kernel; Reconstruction algorithm; MRI Image]
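Of the three levels of parallelism, the multi-slice level is the easiest to picture: each slice can be simulated independently and the results gathered in slice order. The sketch below shows that pattern with a thread pool standing in for Grid jobs. The kernel is a toy placeholder with no physics in it, and the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate_slice(slice_index, n_isochromats=4):
    """Toy stand-in for the magnetisation kernel: one value per isochromat."""
    return [slice_index * 10 + i for i in range(n_isochromats)]


def simulate_volume(n_slices, max_workers=4):
    """Simulate every slice independently and return results in slice order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so slice k ends up at position k.
        return list(pool.map(simulate_slice, range(n_slices)))


if __name__ == "__main__":
    print(simulate_volume(3))
```

The same structure applies one level down: the loop over isochromats inside `simulate_slice` is itself independent per element and could be distributed in the same way.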


Summary

Use Cases

High Energy Physics

Earth Observation

Biomedical Applications


Further Information

High Energy Physics

http://datagrid-wp8.web.cern.ch/DataGrid-WP8/

Bio-Informatics

http://marianne.in2p3.fr/datagrid/wp10/index.html

Earth Observation

http://styx.esrin.esa.it/grid/
