Post on 24-Dec-2015
The GANGA Interface
for ATLAS/LHCb
• Project overview
• Design details
• Component descriptions
• Interfaces
• Refactorisation plans
• ARDA
Roger W L Jones (Lancaster University)
For the GANGA Team
22 September 2003, GridPP8 meeting, Bristol
The Project
Ganga-related information regularly updated on web site
http://ganga.web.cern.ch/ganga
• Ganga is being developed as a joint project between the ATLAS and LHCb experiments
• The project began in the UK, supported by GridPP, with important collaborations with US colleagues
Current main contributors are:
• Developers: K.Harrison, A.Soroko, C.L.Tan (GridPP funded)
• Technical input and consultation: W.T.L.P.Lavrijsen, J.Martyniak, P.Mato, C.E.Tull
• GridPP coordination: N.Brook, R.W.L.Jones, G.N.Patrick
Motivation and Background
• ATLAS and LHCb develop applications within a (complex but powerful) common framework: Gaudi/Athena
• Both collaborations aim to exploit the potential of the Grid for large-scale, data-intensive distributed computing
• Simplify management of analysis and production jobs for end-user physicists by developing tools for accessing Grid services with built-in knowledge of how Gaudi/Athena works: Gaudi/Athena and Grid Alliance (GANGA)
• Also aid job creation, submission, management and archival in non-Grid contexts
• Generic components, especially those for interfacing to the Grid, can be used in other experiments
Milestones
[Figure: diagram with boxes labelled Gaudi/Athena application, GANGA GUI, JobOptions, Algorithms, Collective & Resource Grid Services, Histograms, Monitoring, Results]
• Summer 2001: First ideas for Ganga
• Spring 2002: Work on Ganga started, with strong support from GridPP
• Spring 2003: GANGA1 released for user evaluation
• Autumn 2003: Refactorisation for GANGA2; 3-week workshop, BNL
• Summer 2004: Use of GANGA2 in Data Challenges and Computing Model tests
Ganga Deployment
• Ganga has been installed and used at a number of sites: Birmingham, BNL, Cambridge, CERN, Imperial, LBNL, Oxford
• Ganga interfaces to several Grid implementations: EDG; Trillium/US-ATLAS; NorduGrid under test
• Ganga interfaces to local batch systems: LSF, PBS
• Ganga has been used to run a variety of LHCb and ATLAS applications:
  - LHCb analysis (DaVinci)
  - ATLAS reconstruction
  - ATLAS full and fast simulation
• Ganga has been used to run BaBar applications (see talk by J.Martyniak)
• A Ganga tutorial was given at BNL (US-ATLAS Computing and Physics meeting), with some 50 participants, all of whom successfully used Ganga to submit jobs (at the same time)
Related Activities
• AthAsk:
  - Creating a Gaudi/Athena job is in itself complicated; AthAsk wraps the complexity, incorporating knowledge of the component applications
• DIAL:
  - DIAL focuses on the requirements for interactive Grid analysis
  - DIAL complements Ganga (focus on non-interactive processing), and work has started on a Ganga-DIAL interface to combine the best of both
• Chimera:
  - Providing a Chimera-driven production and analysis workflow system for ATLAS
• Automated installation and packaging with CMT & pacman
  - Needed for many-site code maintenance and to distribute user code and run-time environment
  - Looking at ways to make use of pacman from Ganga
• AtCom:
  - Interim production tool 02/03
  - GridPP development effort
  - GANGA test bed
General Design Characteristics
• The user interacts with a single application covering all stages of a job's lifetime
• The design is modular
• Although ATLAS and LHCb use the same framework and the same software management tool, there are significant differences in what is expected from the software, and Ganga must have the flexibility to cope
• Ganga provides a set of tools to manipulate jobs and data. Tools are accessible from the CLI (or from other scripts) or from the GUI
• Ganga allows access both to local resources (e.g., LSF batch system) and to the Grid
• Should follow, and contribute to, developments in the LHC Computing Grid
• Implementation is in Python
Value Added
• Single point of entry for configuring and running different types of ATLAS and LHCb jobs, uniform approach
• It helps with job definition and configuration
  - Common task and user-defined templates
  - Application-specific job options, or user-supplied job-options file
  - Editing of job-option values, guided to meaningful values for some applications
• Simple, flexible procedure for splitting and cloning jobs
  - Accepts user-provided script for splitting/cloning
• Bookkeeping and stored job settings
• Persistent job representation for archive & exchange
• Automatic monitoring, job status query on local and distributed systems
• Extensible framework for job-related operations
  - Allows integration of new services and assimilation of contributions from users
Software Bus Design
• User has access to functionality of Ganga components through GUI and CLI, layered one over the other above a Software Bus
• Software Bus itself is implemented as a Python module
• Components used by Ganga fall into 3 categories:
  - Ganga components of general applicability, or core components (to right in diagram)
  - Ganga components providing specialised functionality (to left in diagram)
  - External components (at bottom in diagram)
[Figure: Software Bus diagram, with GUI and CLI layered above the Software Bus. Component labels: Job Definition, Job Registry, Job Handling, File Transfer, Python Native, PyROOT, GaudiPython, PyCMT, PyAMI, PyMagda, BaBar Job Definition and Splitting, Gaudi/Athena Job Options Editor, Gaudi/Athena Job Definition]
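The slide says only that the Software Bus is a Python module through which the GUI and CLI reach components. A minimal sketch of that idea follows; the names SoftwareBus, register, lookup and the stand-in JobRegistry are hypothetical and are not taken from Ganga's code.

```python
"""Sketch of a software bus: components register under a name and
front ends (CLI, GUI) reach them only through the bus."""


class SoftwareBus:
    """Hypothetical bus; not Ganga's actual implementation."""

    def __init__(self):
        self._components = {}

    def register(self, name, component):
        """Attach a component to the bus under a unique name."""
        if name in self._components:
            raise KeyError(f"component already registered: {name}")
        self._components[name] = component

    def lookup(self, name):
        """Return the component registered under `name`."""
        return self._components[name]

    def names(self):
        """List registered component names in sorted order."""
        return sorted(self._components)


class JobRegistry:
    """Stand-in core component that just remembers job names."""

    def __init__(self):
        self.jobs = []

    def add(self, job_name):
        self.jobs.append(job_name)


bus = SoftwareBus()
bus.register("job_registry", JobRegistry())

# A CLI or GUI layer would call components via the bus, never directly.
bus.lookup("job_registry").add("atlfastTest")
```

Routing every call through one lookup point is what lets core, specialised and external components be swapped without touching the front ends.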
Generic Components (1)
• Components may have uses outside ATLAS and LHCb
• Core component provides classes for job definition, where a job is characterised in terms of: name, workflow, required resources, status
  - Workflow is represented as a sequence of elements (executables, parameters, input/output files, etc) for which associated actions are implicitly defined
  - Required resources are specified using a generic syntax
  - Future workflow will merge with DIAL and Chimera
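One way to picture a job characterised by name, workflow, required resources and status is the sketch below. The class and field names (WorkflowElement, Job, kind, value) and the resource key are illustrative assumptions, not Ganga's real classes.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class WorkflowElement:
    """One step of a workflow; `kind` names the implied action."""
    kind: str   # e.g. "executable", "parameter", "input_file", "output_file"
    value: str


@dataclass
class Job:
    """Hypothetical job record with the four attributes from the slide."""
    name: str
    workflow: List[WorkflowElement] = field(default_factory=list)
    resources: Dict[str, int] = field(default_factory=dict)
    status: str = "new"


job = Job(
    name="atlfastTest",
    workflow=[
        WorkflowElement("executable", "athena.exe"),
        WorkflowElement("output_file", "atlfast.ntup"),
    ],
    resources={"cpu_minutes": 60},  # illustrative value only
)
```

Keeping the workflow as an ordered list of typed elements means a later stage can decide what each element implies (run it, stage it in, copy it back) without the job definition knowing about any particular batch system.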
Generic Components (2)
• Job-registry component allows for storage and recovery of job information, and allows for job objects to be serialized
  - Multi-threaded environment based on Python threading module
  - Serialisation of objects (user jobs) is implemented with the Python pickle module
• Script-generation component translates a job's workflow into the set of instructions to be executed when the job is run
• Job-submission component submits workflow script to target batch system, creates JDL file and translates resource requests
  - EDG, Trillium/US-ATLAS, LSF, PBS
  - Can submit, monitor, and get output from Grid jobs
• File-transfer component handles transfer between sites of input & output files, adds commands to workflow script on submission
• Job-monitoring component performs queries of job status
  - Should move to R-GMA
  - Local/batch job monitoring problematic; move to job pushing info to specified location, integrate with NetLogger for Grid
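The registry's combination of pickle serialisation and the threading module might look roughly like this. It is a sketch under assumptions: the class name matches the JobsRegistry label in the deck, but its methods (store, recover) and the in-memory storage are invented for illustration.

```python
import pickle
import threading


class JobsRegistry:
    """Hypothetical registry: jobs are kept as pickled bytes, and a lock
    guards the table so several threads can store and recover safely."""

    def __init__(self):
        self._pickled = {}
        self._lock = threading.Lock()

    def store(self, name, job):
        """Serialise a job object and keep it under `name`."""
        data = pickle.dumps(job)
        with self._lock:
            self._pickled[name] = data

    def recover(self, name):
        """Rebuild the job object stored under `name`."""
        with self._lock:
            data = self._pickled[name]
        return pickle.loads(data)


registry = JobsRegistry()

# Several threads (e.g. GUI and monitoring) may write at the same time.
workers = [
    threading.Thread(target=registry.store, args=(f"job{i}", {"id": i}))
    for i in range(4)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

restored = registry.recover("job3")
```

The same pickled bytes could equally be written to a file, which is what makes the representation usable for archive and exchange.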
Experiment-Specific Components
• GaudiApplicationHandler
  - Can access Configuration DB for some Gaudi applications, using the xmlrpclib module
  - Ganga can create user-customized Job Options files using this DB
  - Intelligent Job Options editor exists for some applications
  - Specialised application handlers exist for ATLAS fast simulation and for LHCb analysis
• Components incorporate knowledge of the experiments' Gaudi/Athena framework
  - Gaudi job definition component adds to workflow elements in general-purpose job-definition component, e.g. dealing with configuration management; also provides workflow templates covering common tasks
• Other components provide for job splitting, and output collection
  - Job splitting may have generic aspects; will be investigated in collaboration with DIAL
  - More work needed on job merging
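To illustrate the XML-RPC access pattern, here is a self-contained sketch that starts a throwaway local server and queries it. In Python 3 the xmlrpclib module is available as xmlrpc.client. The function name get_job_options, the option names and their values are all assumptions made for the example; the real Configuration DB interface is not described in the slides.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

# Stand-in for the configuration database (illustrative contents only).
OPTIONS_DB = {"Atlfast": {"EvtMax": 100, "OutputFile": "atlfast.ntup"}}


def get_job_options(application):
    """Hypothetical remote call: return the options for an application."""
    return OPTIONS_DB.get(application, {})


# Port 0 lets the operating system pick a free port.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(get_job_options)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: what an application handler would do.
host, port = server.server_address
proxy = xmlrpc.client.ServerProxy(f"http://{host}:{port}/")
options = proxy.get_job_options("Atlfast")

server.shutdown()
server.server_close()
```

The returned dictionary is the kind of data from which a user-customised job-options file could then be written.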
Job handling: splitting a job (lots of potential reuse)
[Figure: job-splitting diagram. Labels: Template Job, Repository, Splitting script 1, Splitting script 2, Splitting script …, Job Handling module, Subjob 1, Subjob 2, Subjob 3, Subjob 4, Subjob …; two arrows labelled "Selects or creates"]
External Components
Additional functionality is obtained using components developed outside of Ganga:
• Modules of Python standard library
• Non-Python components for which appropriate interface has been written
  - Gaudi framework itself (GaudiPython)
  - Analysis package, ROOT (PyROOT)
  - Configuration management tool (CMT)
  - ATLAS metadata interface, AMI (PyAMI)
  - ATLAS manager for Grid data, Magda (PyMagda)
Implementation of Components (1)
[Figure: UML class diagram. Classes: Job, JobAttributes, Requirements, Credentials, Application, Parameter, Executable, JobHandler, JobsRegistry, JobsCatalog, GaudiApplicationHandler, linked by associations with multiplicities such as 1, 0… and 1…. Groupings: job definition component, job registry component, job handling component, and the specialised Gaudi/Athena job definition component]
Implementation of Components (2)
[Figure: class hierarchies.
Job handling component: JobHandler, with LocalJobHandler, LSFJobHandler, PBSJobHandler, GridJobHandler, AnotherJobHandler.
Application specific components: ApplicationHandler, with GaudiApplicationHandler, BaBarApplicationHandler, DaVinchiApplicationHandler, AtlfastApplicationHandler]
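The handler hierarchy suggests one base class with a subclass per backend. The sketch below borrows the class names from the diagram, but the method submit_command, the lookup helper handler_for and the command lines are assumptions; real LSF or PBS submissions normally carry extra options.

```python
from abc import ABC, abstractmethod


class JobHandler(ABC):
    """Base class: each backend says how a job script is launched."""

    @abstractmethod
    def submit_command(self, script):
        """Return the argument list that would submit `script`."""


class LocalJobHandler(JobHandler):
    def submit_command(self, script):
        return ["sh", script]


class LSFJobHandler(JobHandler):
    def submit_command(self, script):
        return ["bsub", script]  # simplest form, no queue or resource options


class PBSJobHandler(JobHandler):
    def submit_command(self, script):
        return ["qsub", script]  # simplest form, no queue or resource options


# Adding a backend means adding one subclass and one table entry.
HANDLERS = {"Local": LocalJobHandler, "LSF": LSFJobHandler, "PBS": PBSJobHandler}


def handler_for(backend):
    """Instantiate the handler registered for `backend`."""
    return HANDLERS[backend]()


command = handler_for("PBS").submit_command("job.sh")
```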
Interfacing to the Grid
[Figure: the Job class, JobsRegistry class and Job Handler class use Grid services through the EDG UI]
• Job submission: dg-job-list-match, dg-job-submit, dg-job-cancel
• Job monitoring: dg-job-status, dg-job-get-logging-info, R-GMA
• Security service: grid-proxy-init, MyProxy
• Data management service: edg-replica-manager, dg-job-get-output, globus-url-copy
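A Python class can reach these services by wrapping the command-line tools. The sketch below shows the pattern only: the class EDGCommandWrapper and its methods are hypothetical, and the assumption that each command takes a single JDL path or job identifier should be checked against the EDG UI documentation. The command runner is injectable so the example runs without Grid software installed.

```python
import subprocess


def run_command(argv):
    """Run a command and return its standard output as text."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


class EDGCommandWrapper:
    """Hypothetical thin wrapper over the EDG UI commands named above."""

    def __init__(self, runner=run_command):
        self._run = runner

    def submit(self, jdl_path):
        return self._run(["dg-job-submit", jdl_path])

    def status(self, job_id):
        return self._run(["dg-job-status", job_id])

    def get_output(self, job_id):
        return self._run(["dg-job-get-output", job_id])

    def cancel(self, job_id):
        return self._run(["dg-job-cancel", job_id])


# Demonstration with a fake runner that only records what would be executed.
calls = []


def fake_runner(argv):
    calls.append(argv)
    return "ok"


grid = EDGCommandWrapper(runner=fake_runner)
grid.submit("atlfast.jdl")
grid.status("job-0001")
```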
Interfaces: CLI
atlasSetup = GangaCommand("source /afs/cern.ch/user/h/harrison/public/atlasSetup.sh")
atlfast = GangaCMTApplication("TestRelease", "TestRelease-00-00-15", "athena.exe",
                              "run/AtlasfastOptions.txt")
atlfastOutput = GangaOutputFile("atlfast.ntup")
workStep1 = GangaWorkStep([atlasSetup, atlfast, atlfastOutput])
workFlow = GangaWorkFlow([workStep1])
lsfJob = GangaLSFJob("atlfastTest", workFlow)
lsfJob.build()
lsfJob.run()

At the moment the CLI is based on low-level tools; a higher-level set of commands is under development
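The script-generation step, which turns a workflow into the instructions executed when the job runs, can be sketched as follows. This is not how Ganga's build step works internally (the slides do not say); the function generate_script, the step kinds and the way the application line is assembled are assumptions for illustration.

```python
def generate_script(steps):
    """Translate (kind, value) workflow steps into shell script text."""
    lines = ["#!/bin/bash", "set -e"]
    for kind, value in steps:
        if kind in ("command", "application"):
            # Commands and applications are run as given.
            lines.append(value)
        elif kind == "output":
            # Fail the job if an expected output file is missing.
            lines.append(f'test -f "{value}"')
        else:
            raise ValueError(f"unknown workflow step kind: {kind}")
    return "\n".join(lines) + "\n"


# Steps mirroring the setup command, application and output file above.
steps = [
    ("command", "source /afs/cern.ch/user/h/harrison/public/atlasSetup.sh"),
    ("application", "athena.exe run/AtlasfastOptions.txt"),
    ("output", "atlfast.ntup"),
]
script = generate_script(steps)
```

A job handler would then hand the generated text to the chosen backend, for example by writing it to a file and submitting that file to LSF.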
Interfaces: GUI
• GUI has been implemented using wxPython extension module
• Layered on CLI
• All job configuration data are represented in a hierarchical structure accessible via tree control; most important job parameters are brought to the top of the tree
• "User view" provides easy access to the top-level parameters
• All job parameters defined by the user can be edited via GUI dialogs
• A help system has been implemented, using HTML browser classes from wxPython
• All implemented tools are available through the GUI, but some require a more elaborate interface, e.g., Job Options browser/editor
• Python shell is embedded into the GUI and allows user to configure interface from the command line
Basic GUI
[Screenshot: basic Ganga GUI with labelled regions: job tree, Python interpreter, toolbar, main panel]
Job creation
Job-parameters panel
Job-options editor: sequences
Job-options editor: options
Job submission
Job position depends on monitoring info
Examination of job output
Job splitting
• A user or "third party" splitting script is required
• GUI displays script descriptions, where these exist, to guide user choices
• If the "split" function of the script accepts a parameter, it is interpreted as the number of subjobs and can be entered in the job-splitting dialogue
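A splitting script of the kind described might look like the sketch below. The job representation (a dictionary with "name" and "events"), the two-argument signature of split, and the helper accepts_subjob_count are assumptions; the slides state only that a "split" function exists and that its parameter, if any, is read as the number of subjobs.

```python
import copy
import inspect


def split(job, n_subjobs):
    """Divide the job's events as evenly as possible among n_subjobs."""
    base, extra = divmod(job["events"], n_subjobs)
    subjobs = []
    first = 0
    for i in range(n_subjobs):
        count = base + (1 if i < extra else 0)  # early subjobs absorb the remainder
        sub = copy.deepcopy(job)
        sub["name"] = f"{job['name']}_{i + 1}"
        sub["first_event"] = first
        sub["events"] = count
        subjobs.append(sub)
        first += count
    return subjobs


def accepts_subjob_count(split_function):
    """True if the function takes a parameter besides the job itself,
    which is when a dialogue would need to ask for a subjob count."""
    return len(inspect.signature(split_function).parameters) > 1


template = {"name": "atlfastTest", "events": 10}
subjobs = split(template, 3)
```

Because each subjob is a deep copy of the template, later edits to one subjob leave the template and its siblings untouched.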
Ganga Help
Ganga Refactorisation
Scheme can be broken down as follows:
• Definition of job options
  - The user retrieves a set of job options from a database (or other standard location), and is then able to make modifications using an intelligent job-options editor; the result is a job-options template
• Definition of dataset
  - The user selects the input dataset, obtaining information on the available datasets from a catalogue
• Definition of execution strategy
  - The strategy might be selected from a database, or the user might provide a new strategy definition
• Creation of job-collection description
  - An XML description of the job, or collection of jobs, to be submitted is created on the basis of the previously defined job-options template, input dataset and execution strategy
• Definition of job requirements
  - Some standard requirements may be defined for each experiment
  - The user may also specify requirements, for example imposing that jobs be submitted to a particular cluster
  - Additional requirements may be derived from the job-collection description
• Job submission
  - A dispatcher determines where to submit jobs, on the basis of the job-collection description and the job requirements, and invokes a service that returns the appropriate submission procedure
• Installation of software and components
  - Experiment-specific software required to run the user job is installed as necessary on the remote client
  - Any (Ganga) components required to interpret the job-collection description are also installed
• Job execution
  - Agents supervise the execution and validation of jobs on the batch nodes
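The job-collection step combines a job-options template, an input dataset and an execution strategy into one XML description. A minimal sketch with the standard library follows; the element names (jobCollection, job, jobOptions, inputs, file) and the round-robin strategy are invented for the example, since the slides do not define a schema.

```python
import xml.etree.ElementTree as ET


def build_job_collection(options_template, dataset_files, n_jobs):
    """Describe n_jobs jobs, dealing the dataset files out round-robin."""
    root = ET.Element("jobCollection")
    for i in range(n_jobs):
        job = ET.SubElement(root, "job", name=f"job{i + 1}")
        ET.SubElement(job, "jobOptions").text = options_template
        inputs = ET.SubElement(job, "inputs")
        # Strategy (assumed): every n_jobs-th file goes to the same job.
        for path in dataset_files[i::n_jobs]:
            ET.SubElement(inputs, "file").text = path
    return root


collection = build_job_collection("AtlasfastOptions.txt", ["f1", "f2", "f3"], 2)
xml_text = ET.tostring(collection, encoding="unicode")
```

A dispatcher could parse the same document back with ET.fromstring to derive requirements, such as the number of input files per job, before choosing where to submit.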
Future plans
[Figure: Refactorisation of Ganga, with submission on remote client. Regions: Local Client, Remote Client, Execution node. Boxes: Database of Standard Job Options, Job-Options Knowledge Base, Job-Options Editor, Job-Options Template; Dataset Catalogue, Dataset Selection, Dataset; Strategy Database (Splitter Algorithms), Strategy Selection; Job Factory (Machinery for Generating XML Descriptions of Multiple Jobs), Job Collection (XML Description); Database of Job Requirements, User Requirements, Derived Requirements, Job Requirements (JDL, Classads, LSF Resources, etc); Dispatcher, Scheduler Proxy, Scheduler Service (NorduGrid, Local, DIAL, DIRAC, Other; LSF, PBS, EDG, USG); Remote-Client Scheduler, Grid/Batch-System Scheduler; Agent (Runs/Validates Job); Software Cache, Component Cache, Software/Component Server]
Motivation
• Ease integration of external components
• Facilitate multi-person, distributed development
• Increase customizability/flexibility
• Allow GANGA components to be used externally more easily
Use of Components Outside of Ganga
• Ganga complies with recent requirements for grid services domain decomposition as described in the Architectural Roadmap towards Distributed Analysis (ARDA) document.
• Some Ganga components provide native services (API, UI)
• The majority of components just present a uniform interface to the existing grid middleware services (e.g., Data Management, Job Monitoring)
Service (ARDA) | Provided by Ganga (native or middleware interface)
API and User Interface | PyBus, Ganga GUI, CLI
Authentication, Authorisation and Auditing | JobHandlers package
Workload and Data Management Systems | JobHandlers package
File and Metadata Catalogues | File and PyMagda modules
Information service | Software server
Grid and Job Monitoring services | JobHandlers package
Storage and Computing elements | File module
Package Manager and Job provenance service | PyCMT, Packman modules
Future Plans
• GANGA prototype has had enthusiastic and demanding early adopters
  - A new release will provide an interim production version
• A refactorisation is now underway
  - Stricter adherence to the component model
  - Compliance with the draft LCG distributed analysis services model
  - Installation tools need to be interfaced, at least for user analysis code
  - Add new job handlers
  - Web-based variant GUI (or thin remote client) should be considered. Security issues need to be addressed in this case.
  - Exploit Grid Monitoring Architecture
• Components should be capable of wide reuse
• GANGA can deal with the ARDA framework
• Software installation for analysis jobs a priority
• Metadata query/selection/browsing and design a high priority