LCG Applications Area, 13th GridPP Meeting, Durham, 4-6 July 2005, Pere Mato/CERN


LCG Applications Area

13th GridPP Meeting, Durham, 4-6 July 2005

Pere Mato/CERN


Outline

Applications area scope and organization
Requirements and Architecture
Applications Area projects
– SPI
– ROOT
– POOL
– Simulation
Summary


Application Area Focus

Deliver the common physics applications software

Organized to ensure focus on real experiment needs
– Experiment-driven requirements and monitoring
– Architects in management and execution
– Open information flow and decision making
– Participation of LHC experiment and external developers
– Frequent releases enabling iterative feedback

Success defined by experiment validation
– Integration, evaluation, successful deployment


Domain Decomposition

[Diagram: software domains and the components within each. Labels recovered from the figure:]
Core: PluginMgr, Dictionary, MathLibs, I/O, Interpreter, GUI, 2D Graphics, 3D Graphics, Geometry, Histograms, Fitters, NTuple, Physics
Foundation Utilities, OS binding
Simulation: Framework, Engines, Generators
Data Management: Persistency, FileCatalog, DataBase, Collections, Conditions
Distributed Analysis: Batch, Interactive
Experiment Frameworks: Simulation Program, Reconstruction Program, Analysis Program; Event, Detector, Calibration, Algorithms


Principal Architecture Requirements

Long lifetime: support technology evolution
C++ today; support language evolution
Seamless distributed operation and usability off-network
Component modularity, public interfaces
Interchangeability of implementations
Integration into coherent framework and experiment software
Design for end-user’s convenience more than the developer’s
Re-use existing implementations
Software quality at least as good as any LHC experiment
Meet performance, quality requirements of trigger/DAQ software
Platforms: Linux/gcc, MacOSX/gcc, Windows/vc++


Applications Area Organization

[Organization chart. Labels recovered from the figure:]
AA Manager
Architects Forum: Alice, Atlas, CMS, LHCb (arrows labelled Chairs, Decisions)
Application Area Meeting
PEB, LHCC, SC2: Workplans, Quarterly Reports, Reviews, Resources
LCG AA Projects: PROJECT A (WP1, WP2), PROJECT B (WP1, WP2), PROJECT D (WP1, WP2, WP3), . . .
External Collaborations: Fluka, EGEE, Geant4


Current AA Projects

SPI – Software process infrastructure (A. Aimar)
– Software and development services: external libraries, Savannah, software distribution, support for build, test, QA, etc.

ROOT – Core Libraries and Services (R. Brun)
– Foundation class libraries, math libraries, framework services, dictionaries, scripting, GUI, graphics, etc.

POOL – Persistency Framework (D. Duellmann)
– Storage manager, file catalogs, event collections, relational access layer, conditions database, etc.

SIMU – Simulation project (G. Cosmo)
– Simulation framework, physics validation studies, MC event generators, participation in Geant4, Fluka.


Changes in AA for LCG Phase 2

1. SEAL and ROOT projects merge
2. Some redefinition of SPI role
3. Some adaptations of POOL required
4. PI discontinued and existing libraries absorbed by client projects
5. SIMULATION project basically unchanged


Rationale for the SEAL & ROOT merge

Optimization of resources
– Avoid duplication of developments

Better “coherency” vis-à-vis our clients, the LHC experiments

ROOT activity fully integrated in the LCG organization
– Planning, milestones, reviews, resources, etc.

Ease long-term maintenance and evolution of a single set of software products
– Thinking ahead to the post-LCG era


SEAL and ROOT Merge

Internal AA review in April supported the merge
– “Ensure that the best part of the two projects is taken forward”

Details of the merge are being discussed following a process defined by the AF
– Breakdown into a number of topics
– Proposals discussed with the experiments
– Public presentations
– Final decisions by the AF

Current status
– Dictionary plans approved
– MathCore and Vector libraries proposals approved
– First development release of ROOT including these new libraries


SEAL + ROOT Migration

Adiabatic changes towards experiments
– Experiments need to see the libraries they use today evolve gradually from their current usage towards a single set

Details are being planned in the AA Phase 2 plan document
– Currently in draft status

[Timeline diagram:]
now: SEAL Libraries and ROOT Libraries, 2 deliverables
~August 2005: SEAL Libraries and ROOT Libraries, 1 deliverable but some duplication still
~January 2006: new ROOT Libraries, 1 deliverable and no duplication


Applications Area Projects


Software Process Infrastructure (SPI)

The AA projects share a single development infrastructure provided by the SPI project
– Crucial for fostering homogeneity and avoiding duplication
– Users are the LCG AA projects, the LHC experiments and other external projects
– Provides a number of “roles” and “services”

Roles
– Software librarian
– QA and Policies
– Testing frameworks
– Configuration Management

Services
– External Software service
– Software distribution service
– Savannah portal
– Documentation and web site


Software Development Roles

[Diagram: roles within a project and in SPI, with their tasks]

Developer
– Design software
– Write code
– Write unit tests
– Run unit tests
– Debug
– Write documentation

Release Manager
– Integrate contributions
– Build release on few platforms
– Run “all” unit tests
– Run system and regression tests
– Prepare release notes
– Document

Librarian
– Prepare configuration
– Build releases on “all” platforms
– Run and check “all” tests
– Produce documentation
– Produce distribution kits
– Announce release
– Ensures uniformity

External Libraries
– Maintain installations
– Automate installations

Software Distribution
– Maintain distributions
– Install and support distribution tools

QA Manager

Web services
– SPI Web
– Workbook
– Savannah maintenance

Infrastructure support
– Provide and maintain CVS, build and test servers
– OS, compiler support


External Libraries service

Install and upgrade, on the LCG platforms, all external software needed by the AA projects and the LHC experiments

~80 packages following the same structure
– <package>/<version>/<platform>

Platforms
– RH73, SLC3, Win32, OSX,…

500 installations on 100 GB

Ongoing work to automate as much as possible

http://spi.cern.ch/extsoft
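The `<package>/<version>/<platform>` layout can be sketched in a few lines of Python. This is only an illustration of the naming convention: the root directory, package name, version and platform tag used below are invented placeholders, not values taken from the SPI installation area.

```python
from pathlib import PurePosixPath


def install_dir(root, package, version, platform):
    """Return <root>/<package>/<version>/<platform> as a path object."""
    for part in (package, version, platform):
        # Each level must be a single, non-empty directory name.
        if not part or "/" in part:
            raise ValueError(f"invalid path component: {part!r}")
    return PurePosixPath(root) / package / version / platform


def parse_install_dir(root, path):
    """Split an installation path back into package, version and platform."""
    rel = PurePosixPath(path).relative_to(PurePosixPath(root))
    if len(rel.parts) != 3:
        raise ValueError(f"expected <package>/<version>/<platform>, got {rel}")
    package, version, platform = rel.parts
    return {"package": package, "version": version, "platform": platform}


# Hypothetical values, for illustration only.
example = install_dir("/opt/external", "examplepkg", "1.2.3", "exampleplatform")
```

Because every package follows the same three-level structure, one pair of helpers like this is enough to locate or enumerate any installation, which is what makes automating the service practical.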


Savannah Project Portal

Using the open source Savannah tool to provide “portals” for all software projects

Current activities
– User support
– Functionality enhancements
– Maintenance of the service and bug fixes
– Work/Merge with Savannah open source

http://savannah.cern.ch/


Software Distribution Service

Internal (AA) and external software

Handling the complex use cases and requirements from users
– Binary distributions (supported/compatible platforms)
– Source distributions (non-supported platforms)
– Distributions for developers
– Run-time distributions (batch farms)
– Remote central installations

Towards a single source of information concerning dependencies and configuration
– Tool neutral XML configuration file

Starting to use Pacman
– Pacman caches (binary and source) are becoming available
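To illustrate what a tool-neutral dependency description makes possible, here is a minimal Python sketch that reads package dependencies from XML and derives an installation order. The XML element and attribute names (`package`, `depends`, `on`) and the package names are assumptions made up for this example; the slides do not show the actual SPI schema. It requires Python 3.9 or later for `graphlib`.

```python
import xml.etree.ElementTree as ET
from graphlib import TopologicalSorter

# Hypothetical configuration; the real SPI schema is not shown in the slides.
CONFIG_XML = """
<packages>
  <package name="app" version="1.0">
    <depends on="corelib"/>
    <depends on="iolib"/>
  </package>
  <package name="iolib" version="2.1">
    <depends on="corelib"/>
  </package>
  <package name="corelib" version="0.9"/>
</packages>
"""


def load_dependencies(xml_text):
    """Return {package name: set of package names it depends on}."""
    root = ET.fromstring(xml_text)
    deps = {}
    for pkg in root.findall("package"):
        deps[pkg.get("name")] = {d.get("on") for d in pkg.findall("depends")}
    return deps


def install_order(deps):
    """Order packages so that each one comes after everything it depends on."""
    return list(TopologicalSorter(deps).static_order())


order = install_order(load_dependencies(CONFIG_XML))
```

Any distribution tool (Pacman or another) could be driven from the same description, since the ordering logic depends only on the XML content and not on the tool.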


Testing and Quality Assurance

Testing is an integral part of the software development process in the AA

Testing frameworks
– CppUnit, QmTest

Test support
– Policies, Tools, Documentation

Quality Assurance
– Codewizard for EGEE
– QA reports
– Test coverage


Core libraries and services (ROOT)

Provides basic functionality needed by any application

Evolution of the current ROOT+SEAL projects
– Adiabatic changes towards experiments

Current work packages
– BASE: Foundation and system classes, documentation and releases
– DICT: Reflection system, meta classes, CINT and Python interpreters
– I/O: Basic I/O, Trees, queries
– PROOF: parallel ROOT facility, xrootd
– MATH: Mathematical libraries, histogramming, fitting
– GUI: Graphical User interfaces and Object editors
– GRAPHICS: 2-D and 3-D graphics
– GEOM: Geometry system


Dictionary

Adding reflection/introspection capabilities to C++
– Essential for I/O, distribution, interpreters, GUI, etc.

Towards a “single” reflection system
– Updating ROOT/CINT to use the reflection system Reflex (SEAL)
– Dictionary generation from C++ class descriptions (.h files)

CINT: C++ interpreter
– Re-engineering the interpreter to handle byte-code generation and interpretation uniformly

PyROOT: interface of any C++ class to the Python interpreter
– Generic Python binding to any C++ class having a dictionary
– Completing support for the C++ language constructs (namespaces, templates, etc.)
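Why a dictionary is essential for I/O can be shown with a small analogy in Python, where introspection is built into the language. The sketch below is not the Reflex or CINT API; it only illustrates the idea that once a description of a class's members exists, one generic routine can stream any class. The `Track` class and the function names are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class Track:
    """Hypothetical example class standing in for a C++ data class."""
    px: float
    py: float
    charge: int


def describe(cls):
    """Build a minimal 'dictionary' entry: class name and (member, type) pairs."""
    members = [(f.name, getattr(f.type, "__name__", str(f.type))) for f in fields(cls)]
    return {"class": cls.__name__, "members": members}


def to_record(obj):
    """Generic writer: works for any class that has a description."""
    return {f.name: getattr(obj, f.name) for f in fields(obj)}


def from_record(cls, record):
    """Generic reader: rebuild an object from its stored members."""
    return cls(**{f.name: record[f.name] for f in fields(cls)})


restored = from_record(Track, to_record(Track(1.0, 2.0, -1)))
```

In C++ this information is not available at run time, which is why it has to be generated from the header files and loaded as a dictionary before generic I/O, interpreters or Python bindings can work.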


Math Libraries

Development of a common and complete set of Mathematical functionality
– Available for all programs (from interactive analysis to real-time software triggers)

Developing the MathCore library (from the SEAL project)
– Mathematical functions, utilities, numerical algorithms, physics vectors and random number generators
– Standalone library, license free.

MathMore library: larger set of utilities and functions
– Wrappers on top of GSL (GNU Scientific Library)

Linear Algebra
– Evaluation of alternative solutions, recommendations.

Minimization and Fitting
– Re-engineering the current design in ROOT
– New C++ Minuit
– Incorporating RooFit package from BaBar
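As an example of the kind of functionality meant by "physics vectors", here is a minimal four-vector sketch in Python, in natural units (c = 1). The class and method names are chosen for this illustration and are not the MathCore interface.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LorentzVector:
    """Energy-momentum four-vector (px, py, pz, e) in natural units."""
    px: float
    py: float
    pz: float
    e: float

    def __add__(self, other):
        return LorentzVector(self.px + other.px, self.py + other.py,
                             self.pz + other.pz, self.e + other.e)

    def mass2(self):
        """Invariant mass squared: E^2 - |p|^2."""
        return self.e ** 2 - (self.px ** 2 + self.py ** 2 + self.pz ** 2)

    def mass(self):
        """Invariant mass; negative if mass2 is negative (a convention chosen here)."""
        m2 = self.mass2()
        return math.sqrt(m2) if m2 >= 0 else -math.sqrt(-m2)

    def pt(self):
        """Transverse momentum."""
        return math.hypot(self.px, self.py)


# Two back-to-back massless particles; their sum is at rest.
pair = LorentzVector(3.0, 4.0, 0.0, 5.0) + LorentzVector(-3.0, -4.0, 0.0, 5.0)
```

The invariant mass of a pair of decay products is computed this way in almost every analysis, which is the argument for providing such classes once, in a common library.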


I/O

Development of the core I/O system, ROOT TTree and TTree queries

Functionality developed in SEAL and POOL being integrated

New optimizations in speed

New C++ constructs
– Better STL support, virtual inheritance, typedefs, etc.

Support for bitmap indices (improved selection performance)
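The reason bitmap indices speed up selections can be seen in a toy version. The sketch below is a generic bitmap index written in Python, not the ROOT implementation: each distinct value of a column gets one integer whose bit i is set when row i holds that value, so a combined selection becomes a bitwise operation instead of a scan over the rows. The column contents are invented.

```python
def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is set when values[i] equals it."""
    index = {}
    for row, value in enumerate(values):
        index[value] = index.get(value, 0) | (1 << row)
    return index


def rows_from_bitmap(bitmap):
    """Return the row numbers whose bits are set in the bitmap."""
    rows = []
    row = 0
    while bitmap:
        if bitmap & 1:
            rows.append(row)
        bitmap >>= 1
        row += 1
    return rows


# Hypothetical columns of a small event table.
detector = ["barrel", "endcap", "barrel", "endcap", "barrel"]
charge = [1, -1, -1, 1, 1]

by_detector = build_bitmap_index(detector)
by_charge = build_bitmap_index(charge)

# "detector == barrel AND charge == 1" is a single bitwise AND of two bitmaps.
selected = rows_from_bitmap(by_detector["barrel"] & by_charge[1])
```

Bitmap indices work best for columns with few distinct values; production implementations also compress the bitmaps, which this sketch does not attempt.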


PROOF

The work on the Parallel ROOT Facility (PROOF) is accelerating with several new developments
– From short blocking queries to long asynchronous queries in a stateless client mode.
– Later reconnection of the client from a different session possible.
– Reusing the xrootd file server infrastructure.
– “zero-config” cluster setup by using the Apple Bonjour protocol
– Improving usability and friendliness

Extensions to be able to run PROOF on the Grid
– Interfacing to the middleware to access file catalogs, job schedulers and storage elements

Aiming for a demo with all the new features by September


GUI and Graphics

ROOT provides several implementations of the abstract interface TVirtualX used for the GUI and 2-D graphics
– X11, native Win32, TrollTech Qt

Set of high level widgets
– canvas manager, object browser, TTree viewer

Object editors and GUI designer and code generator

2D Graphics
– Basic graphics like: lines, text, polygons, etc.
– Generation of various output formats: ps, eps, pdf, gif, jpeg, bmp
– Image processing classes

3D Graphics
– Abstract interface to a 3-D viewer
– Implementations: X3D and powerful OpenGL taking advantage of modern graphics cards


Persistency Frameworks (POOL)

The LCG persistency framework project consists of two parts
– Common project with CERN IT and strong experiment involvement

POOL
– Hybrid object persistency integrating object streaming (using ROOT I/O for event data) with Relational Database technology (for meta data)
– Established baseline for three LHC experiments
– Successfully integrated into the software frameworks of ATLAS, CMS and LHCb
– Successfully deployed in three large scale data challenges

Conditions Database (COOL)
– Store, manage and distribute time varying data (detector conditions)
– Conditions DB was moved into the scope of the LCG project
» To consolidate different independent developments and integrate with other LCG components (SEAL, POOL)
– Storage of complex objects via POOL into ROOT I/O and RDBMS backend


POOL Components

Storage Manager
– Streams transient C++ objects to/from disk
– Resolves a logical object reference to a physical object

File Catalog
– Maintains consistent lists of accessible files together with their unique identifiers (FileID), which appear in the object representation in the persistent space
– Resolves a logical file reference (FileID) to a physical file

Collections
– Provides the tools to manage potentially large sets of objects stored via POOL

[Diagram: POOL API and its components]
Storage Service: ROOT I/O Storage Svc, Relational Storage Svc
FileCatalog: XML Catalog, MySQL Catalog, Grid Replica Catalog
Collections: Explicit Collection, Implicit Collection
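The two resolution steps described above (FileID to physical file, then object reference to object) can be sketched as follows. This is a simplified Python model, not the POOL C++ API: the class names, the reference layout (file identifier, container name, entry number) and the in-memory stand-in for files are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectRef:
    """Hypothetical persistent reference: which file, which container, which entry."""
    file_id: str
    container: str
    entry: int


class FileCatalog:
    """Maps unique file identifiers (FileID) to physical file names (PFN)."""

    def __init__(self):
        self._pfn_by_id = {}

    def register(self, file_id, pfn):
        self._pfn_by_id[file_id] = pfn

    def lookup_pfn(self, file_id):
        try:
            return self._pfn_by_id[file_id]
        except KeyError:
            raise LookupError(f"unknown FileID {file_id!r}") from None


class StorageManager:
    """Resolves an ObjectRef to an object, using the catalog to find the file."""

    def __init__(self, catalog, files):
        # 'files' stands in for data on disk: {pfn: {container: [objects]}}.
        self._catalog = catalog
        self._files = files

    def read(self, ref):
        pfn = self._catalog.lookup_pfn(ref.file_id)
        return self._files[pfn][ref.container][ref.entry]


catalog = FileCatalog()
catalog.register("id-0001", "/data/run1.root")  # invented identifier and path
storage = StorageManager(catalog, {"/data/run1.root": {"Events": ["evt0", "evt1"]}})
```

Because references carry only the FileID, a file can be moved or replicated by updating the catalog entry, without rewriting any object that points into it.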


POOL deployment in the Grid

Coupling to Grid services
– In 2004 based on the EDG-RLS service using Oracle Application Server + DB
» Connects POOL to all LCG files
» Local Replica Catalog (LRC) for GUID <-> PFN mapping for all local files
» Replica Metadata Catalog (RMC) for file level meta-data and GUID <-> LFN
» Replica Location Index (RLI) to find files at remote sites (not deployed in LCG)
» Resulted in a single centralized catalog (scalability and availability concerns)
– New file catalog implementations of the POOL interface released recently (version 2.1)
» LFCCatalog (lfc), GliteCatalog (glite, Fireman), GTCatalog (globus toolkit)

But Grid-decoupled modes also required by production use-cases
– XML based Catalog
» typically used as local file by a single user/process at a time
» no need for network
» supports R/O operations via http; tested up to 50K entries
– Native MySQL Catalog
» Shared catalog e.g. in a production LAN
» handles multiple users and jobs (multi-threaded); tested up to 1M entries


Relational Abstraction Layer (RAL)

C++ API for SQL-free, technology neutral access to relational data
– Inserting, deleting, updating and retrieving rows

Support for bulk operations, client side caching and SQL variable binding

RAL enforces “best practices” in database programming

Strong links to providers of database services

[Diagram: RAL components: AttributeList, RelationalAccess, OracleAccess, SQLiteAccess, ODBCAccess, AuthenticationService]

Available backends
– Oracle
– SQLite
– MySQL
– Authentication services
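Two of the practices named above, bulk operations and variable binding, can be illustrated with Python's standard `sqlite3` module. This is not RAL, which is a C++ API that hides the SQL altogether; the sketch only shows what those two practices look like at the database level. The table and column names are invented.

```python
import sqlite3


def open_store():
    """Create an in-memory database with a small example table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE runs (run INTEGER PRIMARY KEY, detector TEXT, events INTEGER)"
    )
    return conn


def insert_rows(conn, rows):
    """Bulk insert: one prepared statement executed for many rows, in one transaction."""
    with conn:
        conn.executemany(
            "INSERT INTO runs (run, detector, events) VALUES (?, ?, ?)", rows
        )


def events_for(conn, detector):
    """Variable binding: the value is passed separately, never pasted into the SQL text."""
    cur = conn.execute(
        "SELECT run, events FROM runs WHERE detector = ? ORDER BY run", (detector,)
    )
    return cur.fetchall()


conn = open_store()
insert_rows(conn, [(1, "pixel", 100), (2, "muon", 200), (3, "pixel", 300)])
pixel_runs = events_for(conn, "pixel")
```

Binding lets the server reuse the parsed statement and avoids quoting errors, and bulk execution cuts the number of client-server round trips; an abstraction layer can apply both by default so that individual developers do not have to remember them.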


Conditions Objects for LCG (COOL)

Conditions data: non-event data that vary with time and exist in several versions

Produced both online (slow control) and offline (calibrations)

COOL implementation based on Oracle/MySQL using RAL

Payload defined by end-users
– AttributeList
– External reference (POOL token)

Production release
– Version 1.2
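The defining property of conditions data (each value is valid for a time interval, and several versions may coexist) can be modelled in a few lines. The sketch below is a simplified Python model, not the COOL API: the class name, the use of a "tag" string to select a version, and the rule that a payload stays valid until the next stored one begins are assumptions made for this illustration.

```python
import bisect


class ConditionsFolder:
    """Stores payloads by version tag and start of validity."""

    def __init__(self):
        # tag -> (sorted list of 'since' times, payloads in the same order)
        self._by_tag = {}

    def store(self, tag, since, payload):
        """Record a payload valid from 'since' until the next stored payload."""
        sinces, payloads = self._by_tag.setdefault(tag, ([], []))
        pos = bisect.bisect_left(sinces, since)
        if pos < len(sinces) and sinces[pos] == since:
            payloads[pos] = payload  # same start time: replace
        else:
            sinces.insert(pos, since)
            payloads.insert(pos, payload)

    def find(self, tag, time):
        """Return the payload of the given version that is valid at 'time'."""
        sinces, payloads = self._by_tag[tag]
        pos = bisect.bisect_right(sinces, time) - 1
        if pos < 0:
            raise LookupError(f"no payload valid at time {time} for tag {tag!r}")
        return payloads[pos]


folder = ConditionsFolder()
folder.store("online", 0, {"hv": 1500})     # invented values
folder.store("online", 100, {"hv": 1520})
folder.store("reprocessed", 0, {"hv": 1510})
```

A reconstruction job then asks for the payload matching the event time and the version it was configured with, for example `folder.find("online", 150)`.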


Simulation Project

Simulation framework
– Interface to multiple simulation engines (Geant4, Fluka) and geometry models exchange

Geant4 team participating
– Aligned with and responding to needs from LHC experiments, physics validation, simulation framework

Fluka team participating
– Framework integration, physics validation

Garfield team participating
– Garfield package integration and support in LCG - SPI

Simulation physics validation
– Assess adequacy of simulation and physics environment for LHC and provide the feedback to drive needed improvements

Generator services
– MC generator libraries; common event files; validation/test suite; development when needed (HepMC, etc.)


Simulation Framework

Provide flexible infrastructure for the development, validation and usage of Monte Carlo simulation applications

Work packages
– GDML: Geometry description markup language
» GDML writers and readers exist for Geant4 and ROOT
– Geant4 Geometry persistency
» Saving/retrieving Geant4 geometries with ROOT I/O
– FLUGG: Calling Geant4 geometry from FLUKA
» example application exists (ATLAS Pixel)
– Python interface to Geant4
» Provide Python bindings to G4 classes (SEAL PyLCGDict)
» Steering Geant4 applications from Python scripts
– Monte Carlo truth handling


Physics Validation

Compare detector simulation engines (Geant4, FLUKA) with experimental data to understand suitability for the LHC experiments
– Collaboration and coordination between physics groups of LHC experiments and developers of simulation codes

Ongoing work
– Re-visit requirements for the simulation packages by evaluating the impact of simulation uncertainties on physics observables
– Validation of electromagnetic physics using test beam data and simple benchmarks (thin target setups)
– Validation of hadronic physics (calorimetry, inner detectors, background radiation)


Geant4 and Fluka

Contribution to the support/maintenance and development of Geant4 to serve the needs of the LHC experiments
– Leading role in the development/maintenance of the geometry, field and transportation modules
– Release management and creation of release distributions
– Participation in the development of electromagnetic physics packages
– Participation in the development of hadronic physics packages
– System integration, development of an acceptance test suite, etc.

The Fluka team participates as externals to the simulation project
– Very beneficial for the Physics Validation and Framework sub-projects


Garfield

Garfield is a specialized program for simulation of gaseous detectors
– E.g. two- and three-dimensional wire chambers, TPCs, multi-wire counters, etc.

Ongoing work
– Interfacing to Maxwell and FEMLAB packages for electric field maps
– Diffusion modeling for strongly converging and diverging fields
– Electron transport properties in arbitrary gas mixtures obtained with the Magboltz program
– Ionization simulation using the Heed program


Generator Services

The goal is to guarantee MC generator support for the LHC experiments

The project collaborates with the MC generator authors to provide validated code for the theoretical and experimental communities at LHC

Ongoing work
– Generator library (GENSER). Central code repository and installations for most popular MC generators
– Contribution to the definition of standards for generator interfaces
– Database of “certified” MC event files to be used for benchmarks, comparisons and combinations
– Functional validation of MC generators


Concluding


AA Validation Highlights

POOL successfully used in large scale production in ATLAS, CMS, LHCb data challenges in 2004
– ~400TB of POOL data produced
– Objective of a quickly-developed persistency hybrid leveraging ROOT I/O and RDBMSes has been fulfilled

Geant4 firmly established as baseline simulation in successful ATLAS, CMS, LHCb production
– EM & hadronic physics validated
– Highly stable: 1 G4-related crash per O(1M) events

SEAL components underpin POOL’s success, in particular the dictionary system
– Now entering a second generation with Reflex

SPI’s Savannah project portal and external software service are accepted standards inside and outside the project


LCG Phase 2 Planning

Started to plan the second phase of the Applications Area
– The major change for this new phase is the merge of the ROOT and SEAL projects

Internal AA Review: “Evolution plan technically reasonable and supported by all experiments”

Technical details of the plan are being discussed (one topic at a time) by the projects and experiments and approved by the Architects Forum

The planning document will be finished during this month


Conclusion

AA is consolidating a number of key products

Establishing the level of long-term support that is required for the products that are essential for the experiments
– Minimizing duplication
– Re-using software and infrastructure across projects
– Easing maintenance of AA software at the end of the LCG

Stressing development in the area of Physics Analysis (local, distributed, etc.)