US CMS + US ATLAS ITR (pre-)proposal “Globally Enabled Analysis Communities”


Page 1: US CMS + US ATLAS ITR (pre-)proposal  “Globally Enabled Analysis Communities”

Lothar A T Bauerdick, Fermilab

US CMS + US ATLAS ITR (pre-)proposal “Globally Enabled Analysis Communities”

Sound Bites:
• “Dynamic workspaces”
• “Private Grids to support scientific analysis communities”
• “Build autonomous communities operating within global collaborations”
• “Empower small groups of scientists to profit from and contribute to international big science”
• “Democratization of Science via new technologies”

What is it? Well, we will have to define it in the proposal, due in March 2003

Page 2: US CMS + US ATLAS ITR (pre-)proposal  “Globally Enabled Analysis Communities”

Berkeley Workshop Nov 2002 -- The Global Picture

[Diagram: Development of A Science Grid Infrastructure (L.Robertson)]

Operation of base grid infrastructure:
• ~ 4-5 core centres (including the LCG Tier 1s)
• information service, catalogues, ..
• coordination, operations centre, ..
• call centre, user support, training, ..

Other labels in the diagram:
• other grid nodes for physics, biology, medicine, ….
• end-to-end middleware; end-to-end bio middleware; end-to-end HEP middleware
• hardening/reworking of basic middleware prototyped by current projects
• advanced middleware, requirements defined by current projects
• middleware engineering
• middleware R&D (CS)
• application/grid interface
• LHC applications, bio applications, other science applications

Page 3: US CMS + US ATLAS ITR (pre-)proposal  “Globally Enabled Analysis Communities”

… and the “missing pieces”

• Transition to Production Level Grids (middleware support, error recovery, robustness, 24x7, monitoring and system usage optimization, strategy and policy for resource allocation, authentication and authorization, simulation of grid operations, tools for optimizing distributed systems)
• Globally Enabled Analysis Communities (WG2)
• Enabling Global Collaboration (a medium ITR?)

Page 4: US CMS + US ATLAS ITR (pre-)proposal  “Globally Enabled Analysis Communities”

The goal:

Provide individual physicists and groups of scientists capabilities from the desktop that allow them:
• To participate as an equal in one or more “Analysis Communities”
• To have full representation in the Global Experiment Enterprise
• To receive on demand whatever resources and information they need to explore their science interest while respecting the collaboration-wide priorities and needs.

Page 5: US CMS + US ATLAS ITR (pre-)proposal  “Globally Enabled Analysis Communities”

Assumptions of the ITR pre-proposal

• The Project will provide some of the missing capabilities for ATLAS and CMS data analysis systems.
• The Project will be managed as part of the US CMS and US ATLAS S&C projects.
• There is an existing robust, fully functional core Grid Infrastructure for work flow management, data management (data, meta-data, provenance), and security management.
• The Project will deploy incremental capabilities for experiment use throughout its lifetime.
• Up-front experiment-wide management and oversight will ensure appropriateness and buy-in.

Page 6: US CMS + US ATLAS ITR (pre-)proposal  “Globally Enabled Analysis Communities”

Constraints of the ITR pre-proposal

• It may actually not get accepted...
• Five-year development and deployment program; can’t start before end of 2003
• The total possible funding is $15M over 5 years (including all overheads, that is “only” some ten FTE!)
• Many (>12?) institutions/universities want to participate in the project
• Development teams include Computer Scientists (Information Technologists?) and Physicists
• Funding comes from the Computer Science department in the NSF, and reviewers are mainly Computer Scientists.
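The funding constraint is easier to judge as arithmetic. The sketch below uses only the figures quoted on the slide ($15M, 5 years, some ten FTE, more than 12 institutions). The per-FTE cost it computes is a derived number, not one stated in the pre-proposal, and reading “>12?” as exactly 12 is an assumption made for illustration.

```python
# Figures quoted on the slide; everything derived from them is illustrative.
TOTAL_FUNDING_USD = 15_000_000  # "$15M over 5 years", including all overheads
DURATION_YEARS = 5
APPROX_FTE = 10                 # "some ten FTE"
MIN_INSTITUTIONS = 12           # ">12?" read as a lower bound (assumption)


def annual_budget(total_usd: float, years: int) -> float:
    """Average funding available per year."""
    return total_usd / years


def implied_cost_per_fte_year(total_usd: float, years: int, fte: float) -> float:
    """Fully loaded cost per FTE-year implied by the slide's own numbers."""
    return annual_budget(total_usd, years) / fte


def fte_per_institution(fte: float, institutions: int) -> float:
    """Average effort per institution if the FTEs were spread evenly."""
    return fte / institutions


if __name__ == "__main__":
    print("Budget per year (USD):", annual_budget(TOTAL_FUNDING_USD, DURATION_YEARS))
    print("Implied cost per FTE-year (USD):",
          implied_cost_per_fte_year(TOTAL_FUNDING_USD, DURATION_YEARS, APPROX_FTE))
    print("FTE per institution:", fte_per_institution(APPROX_FTE, MIN_INSTITUTIONS))
```

With these inputs the budget averages $3M per year, about $300k per FTE-year, and less than one FTE per institution if twelve or more institutions take part.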

Page 7: US CMS + US ATLAS ITR (pre-)proposal  “Globally Enabled Analysis Communities”

Physics Analysis in CMS

The Experiment controls and maintains the global enterprise:
• Hardware: Computers, Storage (permanent and temporary)
• Software Packages: physics, framework, data management, build and distribution mechanisms; base infrastructure (operating systems, compilers, network, grid)
• Event and Physics Data and Datasets
• Schema which define: meta-data, provenance, ancillary information (run, luminosity, trigger, Monte-Carlo parameters, calibration etc.)
• Organization, Policy and Practice

Analysis Groups - Communities - are of 1 to many individuals

Each community is part of the Enterprise:
• Is assigned or shares the total Computation and Storage
• Can access and modify software, data, schema (meta-data)
• Is subject to the overall organization and management

Each community has local (private) control of:
• Use of outside resources, e.g. local institution computing centers
• Special versions of software, datasets, schema, compilers
• Organization, policy and practice

We must be able to reliably and consistently move resources & information in both directions between the Global Collaboration and the Analysis Communities.

Communities can share among themselves.
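The enterprise/community relationship described above can be pictured as a small data model. This is a minimal sketch, assuming hypothetical class and field names (`Enterprise`, `Community`, `software_overrides`, and so on); none of them come from the pre-proposal, which leaves the actual design to be defined.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Enterprise:
    """What the experiment controls globally (hypothetical fields)."""
    cpu_slots: int
    storage_tb: float
    software: Dict[str, str]   # package name -> enterprise-wide version
    schema: Dict[str, str]     # field name -> type


@dataclass
class Community:
    """An analysis group of one to many individuals."""
    name: str
    members: List[str]
    cpu_share: int                 # slots assigned from the enterprise
    outside_cpu_slots: int = 0     # e.g. a local institution computing center
    software_overrides: Dict[str, str] = field(default_factory=dict)
    schema_overrides: Dict[str, str] = field(default_factory=dict)

    def total_cpu(self) -> int:
        """Enterprise share plus privately controlled outside resources."""
        return self.cpu_share + self.outside_cpu_slots

    def effective_software(self, enterprise: Enterprise) -> Dict[str, str]:
        """Enterprise versions, with the community's special versions on top."""
        return {**enterprise.software, **self.software_overrides}


if __name__ == "__main__":
    cms = Enterprise(cpu_slots=1000, storage_tb=500.0,
                     software={"framework": "1.0", "compiler": "3.2"},
                     schema={"run": "int"})
    group = Community(name="example-group", members=["physicist-a"],
                      cpu_share=50, outside_cpu_slots=20,
                      software_overrides={"framework": "1.1-private"})
    print(group.total_cpu())
    print(group.effective_software(cms))
```

The overlay in `effective_software` is one possible reading of "special versions of software": the community inherits everything from the enterprise and replaces only what it controls privately.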

Page 8: US CMS + US ATLAS ITR (pre-)proposal  “Globally Enabled Analysis Communities”

Environment for CMS (LHC) Distributed Analysis on the Grid

• Dynamic Workspaces - provide capability for individual and community to request and receive expanded, contracted or otherwise modified resources, while maintaining the integrity and policies of the Global Enterprise.
• Private Grids - provide capability for individual and community to request, control and use a heterogeneous mix of Enterprise wide and community specific software, data, meta-data, resources.
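As an illustration of the dynamic workspace idea, the sketch below grants a request to expand or contract a community's allocation only if the enterprise-wide total stays within capacity. The function name and the single-number capacity policy are assumptions made for this example; the real policies would be defined by the collaboration.

```python
from typing import Dict


def resize_workspace(allocations: Dict[str, int], capacity: int,
                     community: str, requested: int) -> bool:
    """Try to set `community`'s allocation to `requested` slots.

    Returns True and updates `allocations` if the enterprise capacity is
    respected; returns False and leaves `allocations` unchanged otherwise.
    """
    if requested < 0:
        raise ValueError("requested allocation must be non-negative")
    # Slots already promised to every other community.
    others = sum(slots for name, slots in allocations.items() if name != community)
    if others + requested > capacity:
        return False
    allocations[community] = requested
    return True


if __name__ == "__main__":
    allocations = {"group-a": 400, "group-b": 300}
    print(resize_workspace(allocations, 1000, "group-b", 500))  # expansion request
    print(resize_workspace(allocations, 1000, "group-a", 600))  # would exceed capacity
    print(allocations)
```

A contraction always succeeds under this policy, which frees slots for later requests from other communities.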

Page 9: US CMS + US ATLAS ITR (pre-)proposal  “Globally Enabled Analysis Communities”

Information flow

[Diagram: the Global Community and a private grid/analysis community, each made up of physicists, software, compute, storage and schema/information elements.]

Legend: X - physicist; So - software; C - compute; St - storage; Sc - schema/information; (symbol not preserved) - desktop

Page 10: US CMS + US ATLAS ITR (pre-)proposal  “Globally Enabled Analysis Communities”

Technologies to be Developed

The CS and IT part of the proposal!

Page 11: US CMS + US ATLAS ITR (pre-)proposal  “Globally Enabled Analysis Communities”

Infrastructure to support Private “Community Grids”

• Meta-data to describe and manage private grids.
• Tools for information and data transfer and communication between communities
• Synchronization and validation tools between the community grids and the global enterprise.
• Application/user interfaces for management, administration and operation of set of private grids within an enterprise.
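One way to picture a validation tool between a community grid and the global enterprise is a comparison of two manifests. This is a minimal sketch, assuming each side publishes a hypothetical manifest that maps item names (software packages, datasets) to a version or checksum string; the manifest format is invented for illustration.

```python
from typing import Dict, List


def compare_manifests(enterprise: Dict[str, str],
                      community: Dict[str, str]) -> Dict[str, List[str]]:
    """Report how a community grid differs from the enterprise.

    missing  - items the enterprise has but the community lacks
    private  - items that exist only in the community
    diverged - items present on both sides with different versions
    """
    missing = sorted(name for name in enterprise if name not in community)
    private = sorted(name for name in community if name not in enterprise)
    diverged = sorted(name for name in enterprise
                      if name in community and enterprise[name] != community[name])
    return {"missing": missing, "private": private, "diverged": diverged}


if __name__ == "__main__":
    enterprise = {"framework": "1.0", "geometry": "v7", "dataset-x": "abc123"}
    community = {"framework": "1.1-private", "geometry": "v7", "my-ntuple": "f00d"}
    print(compare_manifests(enterprise, community))
```

A report with empty `missing` and `diverged` lists would mean the community can be validated against the enterprise without further synchronization.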

Page 12: US CMS + US ATLAS ITR (pre-)proposal  “Globally Enabled Analysis Communities”

Infrastructure to support dynamic workspace capabilities

• Rapid response reconfiguration and administration tools.
• Enterprise wide integrity and validation tools across all private grids.
• Application/user interfaces for the distributed request and control of extensions and contractions of private grids.

Page 13: US CMS + US ATLAS ITR (pre-)proposal  “Globally Enabled Analysis Communities”

De-centralized, multi-tiered schema evolution and synchronization

• Mechanisms to support parallel evolution and resynchronization of decentralized heterogeneous local and enterprise schema.
• Application interfaces for definition, modification of and access to local and enterprise schema.
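Parallel evolution followed by resynchronization resembles a three-way merge. The sketch below is one possible mechanism, not the proposal's design: schemas are modeled as plain dictionaries from field name to type, and `base` is assumed to be the last version both sides agreed on.

```python
from typing import Dict, List, Tuple

Schema = Dict[str, str]  # field name -> type (a deliberately simple model)


def merge_schemas(base: Schema, local: Schema,
                  enterprise: Schema) -> Tuple[Schema, List[str]]:
    """Three-way merge of a local and an enterprise schema.

    A change made on only one side is accepted. A field changed differently
    on both sides is reported as a conflict and keeps its base definition.
    """
    merged: Schema = {}
    conflicts: List[str] = []
    for name in sorted(set(base) | set(local) | set(enterprise)):
        b, l, e = base.get(name), local.get(name), enterprise.get(name)
        if l == e:
            value = l            # both sides agree (including both deleted)
        elif l == b:
            value = e            # only the enterprise changed this field
        elif e == b:
            value = l            # only the local community changed this field
        else:
            conflicts.append(name)
            value = b            # unresolved: fall back to the base definition
        if value is not None:    # None means the field was deleted
            merged[name] = value
    return merged, conflicts


if __name__ == "__main__":
    base = {"run": "int", "lumi": "float"}
    local = {"run": "int", "lumi": "float", "trigger": "str"}  # community adds a field
    enterprise = {"run": "long", "lumi": "float"}              # enterprise widens a type
    print(merge_schemas(base, local, enterprise))
```

Conflicts are returned rather than resolved automatically, so that a person or a policy layer can decide whether the local or the enterprise definition wins.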

Page 14: US CMS + US ATLAS ITR (pre-)proposal  “Globally Enabled Analysis Communities”

HEP specific developments:

User Interfaces:
• Describe, modify, control and access all physics analysis data and meta-data
• Control and manage analysis processes
• Request and use private community grids

Integration with physics codes:
• Definition of architecture into which ITR deliverables will fit
• Integration activities to include, test and validate ITR deliverables interactively.

Page 15: US CMS + US ATLAS ITR (pre-)proposal  “Globally Enabled Analysis Communities”

Architecture and Work Plan

Architecture is very important for developing the work plan --> Proposal!!
• Blueprint RTAG continuation with respect to Grid interfaces?
• Computing model, work loads, data flows?

• We need to discuss and agree on a system architecture and components, end-to-end from the physicist user to the underlying grid infrastructure assumed to be in place;
• We need to have a project work plan (concrete set of tasks) in place with deliverables to make people realize this is a project to deliver working production systems for the experiments to use as the project proceeds;
• We need to negotiate and agree on some deliverables/components from other projects, e.g. from LCG Applications Area; and to understand the time line in terms of what capabilities would be available when;
• We need to explicitly identify up front who is going to do test cases of running analyses with these tools to provide feedback to the development effort --> link to data challenges

Page 16: US CMS + US ATLAS ITR (pre-)proposal  “Globally Enabled Analysis Communities”

LCG Applications Area Blueprint:

Blueprint Scope 4) Physics data management:
• … Provide data management services meeting the scalability requirements of the LHC experiments, including integration with large-scale storage management and the Grid…
• Schema evolution, propagation, merging

Relevance to grid interfaces and capabilities? It might for example be useful to look at extending it

[Diagram: LCG Application Domain. Component labels: Event Generation, Detector Simulation, Reconstruction, Analysis, Calibration, Algorithms, Engine, Geometry, Event Model, EvtGen, Persistency, StoreMgr, FileCatalog, NTuple, Fitter, Core Services, Dictionary, Whiteboard, PluginMgr, Scheduler, Monitor, Scripting, Grid Services, Interactive Services, Modeler, GUI, Foundation and Utility Libraries. External packages: ROOT, GEANT4, FLUKA, MySQL, DataGrid, Python, Qt, ...]

Page 17: US CMS + US ATLAS ITR (pre-)proposal  “Globally Enabled Analysis Communities”

This ITR should be of direct benefit for CMS

• Each analysis group/physicist will be able to perform local analyses which can be reliably and quickly validated and trusted by the collaboration on request. The experiment will be able to demonstrate and compare methods and results reliably and improve the turnaround time to physics publications.
• The experiment will be able to quickly respond to and decide upon new requests from analysis groups/physicists for resources, with minimal perturbation to the rest of the collaboration.
• The experiment will have an established infrastructure for evolution and extension for its long life.
• There will be a lowering of the intellectual cost barrier for new physicists and researchers to contribute. We will enable small groups to perform reliable exploratory analyses on their own. There will be an increased potential for individual or small community analyses and discovery.
• Individuals and groups will be assured they are using a well defined set of software and data.

Page 18: US CMS + US ATLAS ITR (pre-)proposal  “Globally Enabled Analysis Communities”

This may all “seem easy” but ask any physicist doing analysis on a large experiment today…

And the LHC is 10 times larger in all dimensions.