ATLAS Distributed Analysis

David Adams, BNL

March 18, 2004

ATLAS Software Workshop, Grid session

ATLAS Distributed Analysis USATLAS Grid March 18, 2004 2

Contents

Definitions

Architecture

AJDL

Analysis service

Catalog services

Strategy

ARDA

More information

Definitions

Analysis (not necessarily distributed)
• Supports the manipulation and extraction of summary data (e.g. histograms) from any type of event data
  – AOD, ESD, …
• Supports user-level production of event data
  – e.g. MC generation, simulation and reconstruction

Distributed analysis
• Extends the extraction and production support to include distributed users, data and processing
• Natural extension of non-distributed analysis
• Easily invoked from any ATLAS analysis environment
  – including Python, ROOT, command line
  – easily ported to any future environment (e.g. JAS)

Architecture

[Diagram: layered architecture, top to bottom]
• Client tools: ROOT cmd line client, GANGA cmd line client, GANGA GUI, GANGA Task Editor, GANGA Job Submission, GANGA Job Monitor
• High-level service interfaces (AJDL)
• High-level services: DIAL, GANGA, ATPROD, DIRAC and ARDA analysis services; catalog services; Dataset Splitter; Dataset Merger
• Middleware service interfaces
• Middleware services: CE, WMS, File Catalog, etc.

AJDL

Acronym: Analysis Job Definition Language

Used to define interface for high-level services

Components include:
• Application – executable to process data
• Task – user configuration of application
• Dataset – describes input and output data
• Job – app, task and input dataset → output dataset
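
The relationship between the four components can be sketched in Python as below. This is only an illustration: the slides name the components and their roles, so every class field, the file names and the example values here are assumptions rather than the actual AJDL schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Application:
    """Executable used to process data (fields are hypothetical)."""
    name: str
    version: str


@dataclass
class Task:
    """User configuration of an application (fields are hypothetical)."""
    application_name: str
    files: List[str] = field(default_factory=list)


@dataclass
class Dataset:
    """Describes input or output data (fields are hypothetical)."""
    dataset_id: str
    files: List[str] = field(default_factory=list)


@dataclass
class Job:
    """Ties an application, task and input dataset to an output dataset."""
    application: Application
    task: Task
    input_dataset: Dataset
    output_dataset: Optional[Dataset] = None  # unset until the job has run


# Illustrative values only; none of these names come from the slides.
app = Application(name="athena", version="unspecified")
task = Task(application_name=app.name, files=["my_analysis.py"])
job = Job(application=app, task=task,
          input_dataset=Dataset("ds-in", ["a.root"]))
```

Keeping the output dataset as an optional field of the job reflects the arrow on the slide: the job is what turns the other three components into an output dataset.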

AJDL (cont)

Components must be extensible
• Use types
  – E.g. HistogramDataset, EventDataset, AtlasEventDataset
• Generic interface
  – For use by (shared) generic high-level services
• Experiment-specific interface
  – Used by application

Nature of components
• Persistent representation of data (e.g. XML)
• Classes to interpret this data (C++, Python, Java, …)
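
One way to picture typed, extensible components with an XML persistent form is the Python sketch below. The type names EventDataset and AtlasEventDataset are taken from the slide; the attributes, the XML element and attribute names, and the inheritance layout are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


class Dataset:
    """Generic interface: what a shared high-level service would rely on."""

    def __init__(self, dataset_id, files=()):
        self.dataset_id = dataset_id
        self.files = list(files)

    def to_xml(self):
        # The element is named after the concrete type so that a reader
        # can choose the matching class when interpreting the XML.
        elem = ET.Element(type(self).__name__, {"id": self.dataset_id})
        for name in self.files:
            ET.SubElement(elem, "file", {"name": name})
        return elem


class EventDataset(Dataset):
    """Adds event bookkeeping to the generic dataset (assumed attribute)."""

    def __init__(self, dataset_id, files=(), event_count=0):
        super().__init__(dataset_id, files)
        self.event_count = event_count

    def to_xml(self):
        elem = super().to_xml()
        elem.set("events", str(self.event_count))
        return elem


class AtlasEventDataset(EventDataset):
    """Experiment-specific interface, used by the application."""

    def __init__(self, dataset_id, files=(), event_count=0, content="AOD"):
        super().__init__(dataset_id, files, event_count)
        self.content = content  # e.g. AOD or ESD

    def to_xml(self):
        elem = super().to_xml()
        elem.set("content", self.content)
        return elem


ds = AtlasEventDataset("ds-001", ["events.root"], event_count=1000)
xml_text = ET.tostring(ds.to_xml(), encoding="unicode")
restored = ET.fromstring(xml_text)  # parse the persistent form back
```

A generic service only needs the `Dataset` methods, while an application can rely on the extra attributes of the experiment-specific subclass.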

Analysis service

Example scenario for processing a high-level job
• Input is application, task, dataset and job configuration
• Map input virtual dataset to concrete representation
• Split into sub-datasets
• Create sub-job for each sub-dataset
• Stage files for each sub-job
• Locate and possibly install application
• Build (e.g. compile) task
• Run sub-jobs
• Gather and merge results (output datasets)
• Output is dataset and job performance description
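
The split, run and merge core of this scenario can be sketched in a few lines of Python. The sketch runs in a single process and leaves out dataset mapping, file staging, application installation and task building. All function names, the dictionary-based task and the stand-in application are hypothetical and are not part of any ADA or DIAL interface.

```python
from collections import Counter


def split_dataset(files, max_files_per_job):
    """Split a concrete dataset (a list of file names) into sub-datasets."""
    return [files[i:i + max_files_per_job]
            for i in range(0, len(files), max_files_per_job)]


def merge_results(results):
    """Merge per-sub-job histograms (bin -> count) by summing bin contents."""
    merged = Counter()
    for histogram in results:
        merged.update(histogram)
    return dict(merged)


def run_job(application, task, files, max_files_per_job=2):
    """Process a high-level job: split, run one sub-job per sub-dataset, merge."""
    sub_datasets = split_dataset(files, max_files_per_job)
    # Sub-jobs run sequentially here; a real service would dispatch them
    # to distributed resources after staging files and building the task.
    results = [application(task, sub) for sub in sub_datasets]
    performance = {"sub_jobs": len(sub_datasets)}
    return merge_results(results), performance


def count_files_by_prefix(task, files):
    """Stand-in application: histogram of file names by leading characters."""
    width = task["prefix_width"]
    return Counter(name[:width] for name in files)


output, performance = run_job(
    count_files_by_prefix,
    {"prefix_width": 3},
    ["aod_1", "aod_2", "esd_1", "aod_3", "esd_2"],
)
```

Here the five input files are split into three sub-datasets, and the merged output is a single histogram, mirroring the "output is dataset and job performance description" step.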

[Diagram: sequence of interactions among AnalysisFramework (e.g. ROOT), the ADA/DIAL user interface and AnalysisService. Objects shown: Application (e.g. athena; exe, pkgs), Task (scripts, code), Dataset 1, Dataset 2, Job 1, Job 2, Result. Labelled steps: 1. Locate, 2. select, 3. Create or select, 4. select, 5. submit(app,tsk,ds), 6. splitDataset, 7. create, 9. create, 10. gather.]

Catalog services

Repositories
• Store AJDL components indexed by ID

Selection (metadata) catalogs
• Help user to select input data, task, …

VDC – Virtual Dataset Catalog
• Prescriptions for creating datasets
  – Application, task, input dataset

DRC – Dataset Replica Catalog
• Mapping between virtual and concrete datasets

Job catalog
• Detailed provenance for concrete datasets
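
How the VDC and DRC might work together can be sketched with two in-memory catalogs in Python. The class and method names, the ID strings and the lookup order (prefer an existing replica, otherwise fall back to the prescription) are all assumptions for illustration. The slides say the catalogs are to be implemented as tables in AMI, which this sketch does not attempt.

```python
class VirtualDatasetCatalog:
    """Maps a dataset ID to the prescription that creates it."""

    def __init__(self):
        self._prescriptions = {}

    def register(self, dataset_id, application_id, task_id, input_dataset_id):
        self._prescriptions[dataset_id] = (
            application_id, task_id, input_dataset_id)

    def prescription(self, dataset_id):
        return self._prescriptions[dataset_id]


class DatasetReplicaCatalog:
    """Maps a virtual dataset ID to the IDs of its concrete replicas."""

    def __init__(self):
        self._replicas = {}

    def add_replica(self, virtual_id, concrete_id):
        self._replicas.setdefault(virtual_id, []).append(concrete_id)

    def replicas(self, virtual_id):
        return list(self._replicas.get(virtual_id, []))


def resolve(dataset_id, vdc, drc):
    """Return ("replica", id) if a concrete copy exists,
    otherwise ("create", prescription). The policy is an assumption."""
    found = drc.replicas(dataset_id)
    if found:
        return ("replica", found[0])
    return ("create", vdc.prescription(dataset_id))


vdc = VirtualDatasetCatalog()
drc = DatasetReplicaCatalog()
vdc.register("ds-out", "app-1", "task-1", "ds-in")

before = resolve("ds-out", vdc, drc)   # no replica yet: must be created
drc.add_replica("ds-out", "ds-out@site-a")
after = resolve("ds-out", vdc, drc)    # a concrete replica now exists
```

A job catalog would then record which job produced each concrete replica, giving the provenance mentioned above.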

Strategy

Define AJDL
• Components, nature, interfaces

Implement catalogs
• Tables in AMI
• Programmatic interface
  – (C++ with Python binding)

Analysis services
• Start with existing services or analogs
  – DIAL, ATCOM, Capone, GANGA, …
• Different implementations for different strategies
• At least one using ARDA middleware

Strategy (cont)

User interface
• Programmatic interface to high-level services and AJDL components
  – C++, Python and eventually Java bindings
• GANGA will provide Python binding and use it to deliver a GUI
  – Extensible design: client tools plug into Python bus

Middleware
• Whatever works to begin
• ARDA services will be used in that context
  – Like to see better integration with other middleware efforts

Strategy (cont)

Web service infrastructure
• Short term: use independent persistent services
• Mid-term: follow ARDA strategy
  – GAS – grid access service
• Long term: follow standards such as WSRF
  – Dataset becomes a resource?

ARDA

ARDA begins April 1

Two areas in LCG:
• Middleware development (1st report delivered)
• Integration team

Other participants
• Implementation team(s) from each experiment
  – Use ARDA middleware to provide analysis system
• Tool providers: POOL, SEAL, ROOT, GANGA
• Users in each experiment to try out implementations
• Regional centers deploy services and analysis systems
• GAG to advise

More information

ADA home page:
• http://www.usatlas.bnl.gov/ADA
• This page has links to other projects