David Adams ATLAS ATLAS Distributed Analysis Plans David Adams BNL December 2, 2003 ATLAS software...
-
Upload
rafe-gardner -
Category
Documents
-
view
217 -
download
0
Transcript of David Adams ATLAS ATLAS Distributed Analysis Plans David Adams BNL December 2, 2003 ATLAS software...
![Page 1: David Adams ATLAS ATLAS Distributed Analysis Plans David Adams BNL December 2, 2003 ATLAS software workshop CERN.](https://reader036.fdocuments.us/reader036/viewer/2022081516/56649ed15503460f94bdfde9/html5/thumbnails/1.jpg)
David Adams
ATLAS
ATLAS Distributed Analysis Plans
David AdamsBNL
December 2, 2003
ATLAS software workshopCERN
![Page 2: David Adams ATLAS ATLAS Distributed Analysis Plans David Adams BNL December 2, 2003 ATLAS software workshop CERN.](https://reader036.fdocuments.us/reader036/viewer/2022081516/56649ed15503460f94bdfde9/html5/thumbnails/2.jpg)
ADA Plans ATLAS SW – Grid session December 2, 2003 2
David Adams
ATLAS
Contents
DAC mandate
Scope
Strategy
Scenario for first release
Plans for the first release
GANGA status
DIAL status
Deliverables for the first release
Conclusions
![Page 3: David Adams ATLAS ATLAS Distributed Analysis Plans David Adams BNL December 2, 2003 ATLAS software workshop CERN.](https://reader036.fdocuments.us/reader036/viewer/2022081516/56649ed15503460f94bdfde9/html5/thumbnails/3.jpg)
ADA Plans ATLAS SW – Grid session December 2, 2003 3
David Adams
ATLAS
DAC MandateDistributed Analysis Coordinator
• Is responsible for coordinating the development of software tools for distributed analysis and their integration into the ATLAS software environment
• Start with the analysis of existing tools such as GANGA, DIAL, AtCom…
• Provide users with transparent access to metadata of different sorts as well as to event data in all stages of processing
• Participate actively in the definition of LCG projects such as ARDA
• Is a member of relevant LCG committees and working groups
![Page 4: David Adams ATLAS ATLAS Distributed Analysis Plans David Adams BNL December 2, 2003 ATLAS software workshop CERN.](https://reader036.fdocuments.us/reader036/viewer/2022081516/56649ed15503460f94bdfde9/html5/thumbnails/4.jpg)
ADA Plans ATLAS SW – Grid session December 2, 2003 4
David Adams
ATLAS
ScopeAnalysis (not necessarily distributed)
• Supports the manipulation and extraction of summary data (e.g. histograms) from any type of event data
– AOD, ESD, …
• Supports user-level production of event data– e.g. MC generation, simulation and reconstruction
Distributed analysis• Extends the extraction and production support to
include distributed processing and distributed data• Natural extension of non-distributed analysis• Easily invoked from any ATLAS analysis environment
– including Python, ROOT, command line– easily ported to any future environment (e.g. JAS)
![Page 5: David Adams ATLAS ATLAS Distributed Analysis Plans David Adams BNL December 2, 2003 ATLAS software workshop CERN.](https://reader036.fdocuments.us/reader036/viewer/2022081516/56649ed15503460f94bdfde9/html5/thumbnails/5.jpg)
ADA Plans ATLAS SW – Grid session December 2, 2003 5
David Adams
ATLAS
StrategyImplement DA as a collection of grid services
• As described in ARDA document
• Use ARDA components where possible
• Add missing and ATLAS-specific pieces
Provide clients for ATLAS analysis environments• Python, ROOT, command line
Regular releases• Perhaps for each SW week and ATLAS X.0
• Provide useful tool
• Demonstrate functionality
• Expand functionality with each release
![Page 6: David Adams ATLAS ATLAS Distributed Analysis Plans David Adams BNL December 2, 2003 ATLAS software workshop CERN.](https://reader036.fdocuments.us/reader036/viewer/2022081516/56649ed15503460f94bdfde9/html5/thumbnails/6.jpg)
ADA Plans ATLAS SW – Grid session December 2, 2003 6
David Adams
ATLAS
Strategy (cont)Look to common projects for most of the pieces
• ARDA, GANGA, DIAL, …
• Share as much as possible with ATLAS production– Also distributed
– Similar interfaces and code for bulk and user-level production
• ADA (ATLAS distributed analysis) must identify these pieces and tie them together
Deployment• ADA services must be deployed at relevant sites
• Provide testing and monitoring of these services
• Work with facilities to deploy and maintain– Also to develop facility-specific features
![Page 7: David Adams ATLAS ATLAS Distributed Analysis Plans David Adams BNL December 2, 2003 ATLAS software workshop CERN.](https://reader036.fdocuments.us/reader036/viewer/2022081516/56649ed15503460f94bdfde9/html5/thumbnails/7.jpg)
ADA Plans ATLAS SW – Grid session December 2, 2003 7
David Adams
ATLAS
Scenario for first releaseHere is a scenario for user interaction with the first release of ADA
• Authenticate – Proxy from authentication service
• Choose application– E.g. PAW to process DC1 ntuples– Or Athena to process DC2 AOD– Also Athena reconstruction?
• Define task– Analysis: provide code to define and fill histograms– Production: athena job options, maybe code– Perhaps select starting point from provenance catalog
• Select input dataset– From dataset metadata catalog service
![Page 8: David Adams ATLAS ATLAS Distributed Analysis Plans David Adams BNL December 2, 2003 ATLAS software workshop CERN.](https://reader036.fdocuments.us/reader036/viewer/2022081516/56649ed15503460f94bdfde9/html5/thumbnails/8.jpg)
ADA Plans ATLAS SW – Grid session December 2, 2003 8
David Adams
ATLAS
Scenario for first release (cont)• Create job configuration
– Response time, role, …
• Locate processing service
• Submit job– Application, task, dataset, configuration
• While job is running– Query service for status and partial results
– Examine partial results (e.g. histograms)
– Kill job if results are bad
• When job is finished– Examine complete result
– Modify task or select new dataset and repeat
![Page 9: David Adams ATLAS ATLAS Distributed Analysis Plans David Adams BNL December 2, 2003 ATLAS software workshop CERN.](https://reader036.fdocuments.us/reader036/viewer/2022081516/56649ed15503460f94bdfde9/html5/thumbnails/9.jpg)
ADA Plans ATLAS SW – Grid session December 2, 2003 9
David Adams
ATLAS
Plan for first releaseSchedule
• Implement and deploy in advance of March 2004 software workshop
• Provide starting point for discussion at that meeting
Building blocks• Code and developers in GANGA and DIAL
– Following sections summarize current status
• LCG project following from ARDA– Just starting; so don’t wait but
– Stay closely coupled to that project
• Open to contributions (especially effort) from others
![Page 10: David Adams ATLAS ATLAS Distributed Analysis Plans David Adams BNL December 2, 2003 ATLAS software workshop CERN.](https://reader036.fdocuments.us/reader036/viewer/2022081516/56649ed15503460f94bdfde9/html5/thumbnails/10.jpg)
ADA Plans ATLAS SW – Grid session December 2, 2003 10
David Adams
ATLAS
Ganga: status update (1)Work since September software week has focused on refactoring, to create a system that is more modular and more flexible
• In short-term (next 1-2 months), changes will mainly affect developers
• In longer term (in time for DC2) will see significant gains for users: improvements in functionality, ease of use and stability
Have introduced PyBus software bus, developed by W. Lavrijsen with contributions from K. Harrison
• Allows association between module and logical name to be made at run time
• Makes system more configurable: supports ATLAS/LHCb customizations and user add-ons
Moving to XML-based job description• Mechanics have been worked out, but still defining details of XML
schema• Aim to have job description consistent with DIAL (and others?)
![Page 11: David Adams ATLAS ATLAS Distributed Analysis Plans David Adams BNL December 2, 2003 ATLAS software workshop CERN.](https://reader036.fdocuments.us/reader036/viewer/2022081516/56649ed15503460f94bdfde9/html5/thumbnails/11.jpg)
ADA Plans ATLAS SW – Grid session December 2, 2003 11
David Adams
ATLAS
Ganga: status update (2)Job-options editor (JOE) is evolving to become a more powerful, standalone component, which will be loaded by Ganga
• Assist user in the creation/modification of Gaudi/Athena job options by presenting the user with a hierarchical view of available options files and helping the user with value entry
• In process of creating Job Options Information Resource (JOIR) database
– JOIR database of job options will facilitate validation by providing valid ranges, valid option choices, and option descriptions
– Considering suggestions from LHCb for improving automated job-option extraction
![Page 12: David Adams ATLAS ATLAS Distributed Analysis Plans David Adams BNL December 2, 2003 ATLAS software workshop CERN.](https://reader036.fdocuments.us/reader036/viewer/2022081516/56649ed15503460f94bdfde9/html5/thumbnails/12.jpg)
ADA Plans ATLAS SW – Grid session December 2, 2003 12
David Adams
ATLAS
Ganga: job definition and submission to LCG
Application
SelectApplication
PrepareSandbox
PrepareAlgFlowOptions
and DLLs
AlgorithmFlowEdit
AlgorithmFlow
AlgParamOptions
SelectDatasets
EditAlgParamOptions
DatasetOptions
AlgFlowOptions
SandboxDLLs
JobOptionsFileCatalogue slice
Submit Job
Metadatacatalogue
AlgOptionscatalogue
DLLs
Filecatalogue
![Page 13: David Adams ATLAS ATLAS Distributed Analysis Plans David Adams BNL December 2, 2003 ATLAS software workshop CERN.](https://reader036.fdocuments.us/reader036/viewer/2022081516/56649ed15503460f94bdfde9/html5/thumbnails/13.jpg)
ADA Plans ATLAS SW – Grid session December 2, 2003 13
David Adams
ATLAS
Ganga: future plans
Plans well defined up to March 2004• Work towards Ganga/DIAL integration within ADA
• Enable job submission to LCG
• Release improved version of JOE
• Include interface to Pacman 3 for package installation– Informal Pacman workshop pencilled in for January 2004
• More tentatively, looking at possibilities for interfacing to Atlantis for displaying event data
Request for GridPP funding beyond December 2004 requires ATLAS/LHCb work plan for Ganga up to September 2007
• Need to ensure ATLAS priorities are taken into account
![Page 14: David Adams ATLAS ATLAS Distributed Analysis Plans David Adams BNL December 2, 2003 ATLAS software workshop CERN.](https://reader036.fdocuments.us/reader036/viewer/2022081516/56649ed15503460f94bdfde9/html5/thumbnails/14.jpg)
ADA Plans ATLAS SW – Grid session December 2, 2003 14
David Adams
ATLAS
DIAL statusRelease 0.60
• Made in November
• Has application to process combined ntuple datasets with PAW
• Command line and ROOT clients
• Processing can be done by instantiating a private scheduler or by contacting a persistent web service
• Dataset catalogs have been implemented– DSC – dataset selection catalog
– DRC – dataset replica catalog
– Datasets created for all DC1 combined ntuples
![Page 15: David Adams ATLAS ATLAS Distributed Analysis Plans David Adams BNL December 2, 2003 ATLAS software workshop CERN.](https://reader036.fdocuments.us/reader036/viewer/2022081516/56649ed15503460f94bdfde9/html5/thumbnails/15.jpg)
ADA Plans ATLAS SW – Grid session December 2, 2003 15
David Adams
ATLAS
DIAL status (cont)High-level JDL
• DIAL envisions a hierarchy of schedulers
• Interface to these schedulers constitutes a high-level JDL (job definition language)
– Job submission, monitoring and gathering of results
– See figure
• Would like to standardize this JDL so schedulers can be shared between projects and experiments
– See figure
![Page 16: David Adams ATLAS ATLAS Distributed Analysis Plans David Adams BNL December 2, 2003 ATLAS software workshop CERN.](https://reader036.fdocuments.us/reader036/viewer/2022081516/56649ed15503460f94bdfde9/html5/thumbnails/16.jpg)
ADA Plans ATLAS SW – Grid session December 2, 2003 16
David Adams
ATLAS
UserAnalysis
Job 1
Job 2
Application Task
Dataset 1
Scheduler
1. Create or locate
2. select 3. Create or select
4. select
5. submit(app,tsk,ds)
6. splitDataset
Dataset 2
7. create
e.g. ROOT
e.g. athena
Result9. fill
10. gather
Result 9. fill
Result CodeComponents of DIAL
high-level JDL
![Page 17: David Adams ATLAS ATLAS Distributed Analysis Plans David Adams BNL December 2, 2003 ATLAS software workshop CERN.](https://reader036.fdocuments.us/reader036/viewer/2022081516/56649ed15503460f94bdfde9/html5/thumbnails/17.jpg)
ADA Plans ATLAS SW – Grid session December 2, 2003 17
David Adams
ATLAS
DIAL status: sharing via JDL
P I/S E A L
G A N G A
R O O T
J A S
C o m m a nd line
H ighle v e lJ D L
P R O O FG A N G A -LC G
C o nd o r-G
G C E /C him e raS T A RJ D A P (J A S )
D IA L-inte ra c tiv e
An a ly s is e n v iro n m e n ts S ch e d u le r s
G r idse rv ice s
A T LA S p ro d u c tio n
P lu g -inclie n ts
P o rta l/s w itc h
![Page 18: David Adams ATLAS ATLAS Distributed Analysis Plans David Adams BNL December 2, 2003 ATLAS software workshop CERN.](https://reader036.fdocuments.us/reader036/viewer/2022081516/56649ed15503460f94bdfde9/html5/thumbnails/18.jpg)
ADA Plans ATLAS SW – Grid session December 2, 2003 18
David Adams
ATLAS
Deliverables for first releaseComments
• Goal is to support the scenario outlined earlier
• Build on current GANGA and DIAL implementations and plans
• Emergence of ARDA project may change plans
• Coordination with ATLAS production may also lead to changes
• Add more tasks if more ideas and effort are found
![Page 19: David Adams ATLAS ATLAS Distributed Analysis Plans David Adams BNL December 2, 2003 ATLAS software workshop CERN.](https://reader036.fdocuments.us/reader036/viewer/2022081516/56649ed15503460f94bdfde9/html5/thumbnails/19.jpg)
ADA Plans ATLAS SW – Grid session December 2, 2003 19
David Adams
ATLAS
Deliverables for first release (cont)Authentication service
• GSI based
• Support both EDG and US certificates
High-level JDL• Start from current DIAL interface
• Incorporate ideas from PPDG, ARDA, …– If available in time
• This defines the interface (WSDL) for the following analysis and production services
![Page 20: David Adams ATLAS ATLAS Distributed Analysis Plans David Adams BNL December 2, 2003 ATLAS software workshop CERN.](https://reader036.fdocuments.us/reader036/viewer/2022081516/56649ed15503460f94bdfde9/html5/thumbnails/20.jpg)
ADA Plans ATLAS SW – Grid session December 2, 2003 20
David Adams
ATLAS
Deliverables for first release (cont)Interactive analysis service
• Build on existing DIAL scheduler service– Add authentication
– Deploy as web or grid service
• Client schedulers– Keep command line and ROOT clients
– Add Python (GANGA) client
> Possibly with associated GUI
• Application/task/dataset– Keep PAW with fortran task to fill histograms from HBOOK
combined ntuples
– Add ROOT with C++ task to fill from ROOT ntuples?
– Add athena with C++ task to fill from AOD?
![Page 21: David Adams ATLAS ATLAS Distributed Analysis Plans David Adams BNL December 2, 2003 ATLAS software workshop CERN.](https://reader036.fdocuments.us/reader036/viewer/2022081516/56649ed15503460f94bdfde9/html5/thumbnails/21.jpg)
ADA Plans ATLAS SW – Grid session December 2, 2003 21
David Adams
ATLAS
Deliverables for first release (cont)User-level batch production service?
• Start from GANGA LCG submission service– Add high-level JDL– Requires GANGA to support client-server
• Other candidates for production services:– GCE/Chimera– DIAL– New ATLAS production model– Switch to choose between these
• Supported production tasks– Reconstruction– Simulation?– Event generation?– Fill histograms from AOD?
![Page 22: David Adams ATLAS ATLAS Distributed Analysis Plans David Adams BNL December 2, 2003 ATLAS software workshop CERN.](https://reader036.fdocuments.us/reader036/viewer/2022081516/56649ed15503460f94bdfde9/html5/thumbnails/22.jpg)
ADA Plans ATLAS SW – Grid session December 2, 2003 22
David Adams
ATLAS
Deliverables for first release (cont)Dataset and file catalog services
• Functionality:– Means for users to select an input dataset
– Means for production to register output dataset
– Means for system (e.g. DIAL scheduler) to turn dataset specification into accessible physical files
• Start from AMI and DIAL
• Need file catalog and replication services– Magda, RLS1, RLS2, …
![Page 23: David Adams ATLAS ATLAS Distributed Analysis Plans David Adams BNL December 2, 2003 ATLAS software workshop CERN.](https://reader036.fdocuments.us/reader036/viewer/2022081516/56649ed15503460f94bdfde9/html5/thumbnails/23.jpg)
ADA Plans ATLAS SW – Grid session December 2, 2003 23
David Adams
ATLAS
ConclusionsDistributed analysis is a new project for ATLAS
Philosophy• Tightly integrate with non-distributed analysis
• Be neutraluse client-server mechanism to support different analysis environments and different processing systems
• Be flexiblecapabilities (and hence demands) will change as technology evolves
• Be responsive to evolving user requirements
• Build on existing ideas and projects including GANGA, DIAL, ATLAS production and ARDA
![Page 24: David Adams ATLAS ATLAS Distributed Analysis Plans David Adams BNL December 2, 2003 ATLAS software workshop CERN.](https://reader036.fdocuments.us/reader036/viewer/2022081516/56649ed15503460f94bdfde9/html5/thumbnails/24.jpg)
ADA Plans ATLAS SW – Grid session December 2, 2003 24
David Adams
ATLAS
Conclusions (cont)Plan of action
• Define interface (high-level JDL)
• Quickly implement services for analysis, user-level production and dataset catalogs
• Expose to users, learn lessons and re-implement
• Repeat
More information• Web site coming soon
• Mail to [email protected]