Transcript of: Computing Sciences, NERSC High Energy and Nuclear Physics Computing Group, Craig E. Tull (HENPCG/NERSC/LBNL). Presented at the 2005 Science Colloquium Series, DOE, August 23, 2005, and at the 2006 Director’s Review of Computing, LBNL, September 22, 2006. 31 pages.

Page 1:

Computing Sciences

NERSC High Energy and Nuclear Physics Computing Group

Craig E. Tull

HCG/NERSC/LBNL

2005 Science Colloquium Series, DOE - August 23, 2005

NERSC High Energy and Nuclear Physics Computing Group

Craig E. Tull

HENPCG/NERSC/LBNL

2006 Director’s Review of Computing, LBNL - September 22, 2006

Page 2:

Computing Sciences

HENP Computing Group

Group Leader: Craig Tull
1 Senior Scientist, 4 Scientists, 6 Engineers, 1 Postdoc

Embedded software professionals w/ science backgrounds
Provide computing systems & support for science collaborations

Mostly software focus, but with a track record for integrated software and hardware systems

Scientists focus on science and on requirements for the software rather than on detailed design or implementation. Non-scientific code is left to computing professionals with the expertise and time to apply a software process.

Management and Leadership Roles
Software and Technology Development
Software Engineering Best Practices and Support

Page 3:

Computing Sciences

HENP Environment

Large, distributed collaborations are the norm
~2000 scientists, from ~150 institutions in ~50 countries
Scientists require equal access to data and resources

Very long time duration of projects & software
Detectors take 5-10 years to design and build
Detectors have an operational lifetime of 5-20 years
10 to 30 year project lifetimes

• Strong focus on robust, maintainable software supporting graceful evolution of components & protocols

Commodity computing (Intel, Linux)
Polite parallelism / Partitioning of calculations
Data Intensive (100's TB => 1,000's TB)
The World is Networked
Scientists are developers and not just users
Many skill levels from Wizard to Neophyte
Issues of scaling are sociological as well as technical

Page 4:

Computing Sciences

HENPC Group Role Within a Project

Software professionals with scientific background as full collaborators

Establishment of a software development process
Adapt CSE methodologies and processes to HENP environment
• Object Oriented, Architectural
• Aspects of Unified Software Development Process (USDP) and Extreme Programming (XP)

Design and implementation of software development infrastructure
Code repository, release build, test, and distribution systems

Design and implementation of major software components
Control frameworks
Online experiment control
Data persistency frameworks
Physics toolkits

Training and mentoring
Tutorials, code guidelines, requirement/design/code reviews, etc.

Re-engineering of existing designs
Provide expertise to improve robustness, performance, maintainability

Page 5:

Computing Sciences

HENPC Group Role across Projects

Promote a common culture
Best practices, open source, architecture, code reuse
Develop and integrate tools that support these best practices

Explore and integrate new technologies
Object Oriented Database Systems
CORBA-based distributed systems
GRID integration
C++, Java, Python
J2EE, JBoss, JMX

Generate an Institutional knowledge base
User Requirements
Architectural decomposition
• Components

Leverage coupling between NERSC and Physical Sciences at LBNL

Page 6:

Computing Sciences

HENPC Projects (2004-2006)

Projects: ATLAS, BaBar, Daya Bay, GUPFS, IceCube, Majorana, PPDG, SNAP, SNF

Staff: Calafiura, Day, Gaponenko, Kushner, Lavrijsen, Leggett, Marino, McGinnis, Mokhtarani, Patton, Quarrie, Tull, Woudstra

[Slide shows a matrix mapping each staff member to the projects they work on.]

Page 7:

Computing Sciences

Experiments & Projects (2004-2006)

ATLAS: LHC accelerator at CERN, Geneva
Software lead, Chief Architect, Core software & Athena/Gaudi framework

BaBar: PEP-II collider at SLAC, Stanford
Conditions Database

IceCube: Neutrino detector at the South Pole
Chief Architect, Experiment Control, Build Environment, Offline Framework

Majorana: Neutrinoless double beta decay, LDRD
GEANT4 build system, GEANT4 Geometry Database

SNAP: Supernova satellite
Java Simulation Framework

GUPFS: Global Unified Parallel File System
Management and Scientific Liaison

SNF: Supernova Factory, telescope, Hawaii
Lisp-based Observation Scheduling Software

PPDG: Particle Physics Data Grid
Replica catalogs technical survey, Security & VO roles

Page 8:

Computing Sciences

The ATLAS Experiment: A large HEP project

Page 9:

Computing Sciences

LHC

• √s = 14 TeV (7 times higher than Tevatron/Fermilab) → search for new massive particles up to m ~ 5 TeV

• L_design = 10^34 cm^-2 s^-1 (>10^2 higher than Tevatron/Fermilab) → search for rare processes with small σ (N = Lσ)

ALICE: heavy ions
ATLAS and CMS: pp, general purpose
LHCb: pp, B-physics

27 km ring used for the e+e- LEP machine in 1989-2000
Start: Summer 2007
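To make the N = Lσ relation on this slide concrete, here is a rough back-of-the-envelope estimate; it assumes the conventional ~10^7 seconds of beam time per year, a figure not stated on the slide.

```latex
\[
  L_{\mathrm{int}} = L_{\mathrm{design}} \times t
                   = 10^{34}\,\mathrm{cm^{-2}\,s^{-1}} \times 10^{7}\,\mathrm{s}
                   = 10^{41}\,\mathrm{cm^{-2}} \approx 100\,\mathrm{fb^{-1}}\ \text{per year}
\]
\[
  N = L_{\mathrm{int}}\,\sigma:\qquad
  \sigma = 1\,\mathrm{pb} = 10^{-36}\,\mathrm{cm^{2}}
  \;\Rightarrow\;
  N \approx 10^{41}\,\mathrm{cm^{-2}} \times 10^{-36}\,\mathrm{cm^{2}} = 10^{5}\ \text{events per year}
\]
```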

Page 10:

Computing Sciences

ATLAS

Length: ~46 m; Radius: ~12 m; Weight: ~7000 tons; ~10^8 electronic channels; ~3000 km of cables

• Tracking (|η| < 2.5, B = 2 T): Si pixels and strips; Transition Radiation Detector (e/π separation)

• Calorimetry (|η| < 5): EM: Pb-LAr; HAD: Fe/scintillator (central), Cu/W-LAr (fwd)

• Muon Spectrometer (|η| < 2.7): air-core toroids with muon chambers

Page 11:

Computing Sciences

ATLAS Collaboration

34 Countries, 151 Institutions, 1770 Scientific Authors

Albany, Alberta, NIKHEF Amsterdam, Ankara, LAPP Annecy, Argonne NL, Arizona, UT Arlington, Athens, NTU Athens, Baku, IFAE Barcelona, Belgrade, Bergen, Berkeley LBL and UC, Bern, Birmingham, Bonn, Boston, Brandeis, Bratislava/SAS Kosice, Brookhaven NL, Bucharest, Cambridge, Carleton, Casablanca/Rabat, CERN, Chinese Cluster, Chicago, Clermont-Ferrand, Columbia, NBI Copenhagen, Cosenza, INP Cracow, FPNT Cracow, Dortmund, JINR Dubna, Duke, Frascati, Freiburg, Geneva, Genoa, Glasgow, LPSC Grenoble, Technion Haifa, Hampton, Harvard, Heidelberg, Hiroshima, Hiroshima IT, Indiana, Innsbruck, Iowa SU, Irvine UC, Istanbul Bogazici, KEK, Kobe, Kyoto, Kyoto UE, Lancaster, Lecce, Lisbon LIP, Liverpool, Ljubljana, QMW London, RHBNC London, UC London, Lund, UA Madrid, Mainz, Manchester, Mannheim, CPPM Marseille, Massachusetts, MIT, Melbourne, Michigan, Michigan SU, Milano, Minsk NAS, Minsk NCPHEP, Montreal, FIAN Moscow, ITEP Moscow, MEPhI Moscow, MSU Moscow, Munich LMU, MPI Munich, Nagasaki IAS, Naples, Naruto UE, New Mexico, Nijmegen, BINP Novosibirsk, Ohio SU, Okayama, Oklahoma, LAL Orsay, Oslo, Oxford, Paris VI and VII, Pavia, Pennsylvania, Pisa, Pittsburgh, CAS Prague, CU Prague, TU Prague, IHEP Protvino, Ritsumeikan, UFRJ Rio de Janeiro, Rochester, Rome I, Rome II, Rome III, Rutherford Appleton Laboratory, DAPNIA Saclay, Santa Cruz UC, Sheffield, Shinshu, Siegen, Simon Fraser Burnaby, Southern Methodist Dallas, NPI Petersburg, Stockholm, KTH Stockholm, Stony Brook, Sydney, AS Taipei, Tbilisi, Tel Aviv, Thessaloniki, Tokyo ICEPP, Tokyo MU, Tokyo UAT, Toronto, TRIUMF, Tsukuba, Tufts, Udine, Uppsala, Urbana UI, Valencia, UBC Vancouver, Victoria, Washington, Weizmann Rehovot, Wisconsin, Wuppertal, Yale, Yerevan

Page 12:

Computing Sciences

ATLAS Computing Characteristics

Large, complex detector: ~10^8 channels

Long lifetime: project started in 1992, first data in 2007, last data 2027?

320 MB/sec raw data rate, ~3 PB/year

Large, geographically dispersed collaboration: 1770 people, 151 institutions, 34 countries
Most are, or will become, software developers
• Programming abilities range from Wizard to Neophyte

Scale and complexity reflected in software
~1000 packages, ~8000 C++ classes, ~5M lines of code
~70% of the code is algorithmic (written by physicists)
~30% infrastructure, framework (written by sw engineers)
• HENPC responsible for a large portion of this software

Provide robustness but plan for evolution
Requires enabling technologies
Requires management & coherency
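As a quick consistency check on the data-rate and yearly-volume figures above (assuming roughly 10^7 seconds of data taking per year, which the slide does not state):

```latex
\[
  320\,\mathrm{MB/s} \times 10^{7}\,\mathrm{s/year}
  \approx 3.2 \times 10^{15}\,\mathrm{bytes/year}
  \approx 3\,\mathrm{PB/year}
\]
```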

Page 13:

Computing Sciences

Software Methodology

Object-Oriented, using C++ as the programming language
Some wrapped FORTRAN and Java
Python as interactive & configuration language

Heavy use of components behind abstract interfaces
Support multiple implementations
Robustness & evolution

Lightweight development process
Emphasis on automation and feedback rather than a very formal process
• A previous attempt at developing a software system had failed due to a too-rigorous software process decoupled from developers
Make it easy for developers to do the “right thing”
Some requirements/design reviews
Sub-system “functionality” reviews
• 2 weeks each
• Focus on client viewpoint
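The "components behind abstract interfaces" point above is what enables multiple implementations to coexist. The following is a minimal, self-contained C++ sketch of that pattern; the service name and field values are invented for illustration and are not actual ATLAS code.

```cpp
// Illustrative only: a minimal sketch of the "components behind abstract
// interfaces" idea, not actual ATLAS/Gaudi code. Names are hypothetical.
#include <iostream>
#include <memory>

// Abstract interface: clients depend only on this.
class IMagneticFieldSvc {
public:
    virtual ~IMagneticFieldSvc() = default;
    virtual double fieldAt(double x, double y, double z) const = 0;  // Tesla
};

// Implementation 1: trivial constant field, useful for fast tests.
class ConstantFieldSvc : public IMagneticFieldSvc {
public:
    double fieldAt(double, double, double) const override { return 2.0; }
};

// Implementation 2: stand-in for a detailed field map read from a database.
class MappedFieldSvc : public IMagneticFieldSvc {
public:
    double fieldAt(double x, double y, double z) const override {
        // A real service would interpolate a measured field map here.
        return 2.0 - 1.0e-6 * (x * x + y * y + z * z);
    }
};

// Client algorithm coded purely against the interface; either
// implementation can be swapped in without touching this code.
void reconstructTrack(const IMagneticFieldSvc& field) {
    std::cout << "B at origin: " << field.fieldAt(0, 0, 0) << " T\n";
}

int main() {
    std::unique_ptr<IMagneticFieldSvc> svc = std::make_unique<ConstantFieldSvc>();
    reconstructTrack(*svc);    // fast, simple implementation
    MappedFieldSvc mapped;
    reconstructTrack(mapped);  // detailed implementation, same client code
}
```

Because the client depends only on the interface, an implementation can be replaced (for example when a persistency or field-map technology is retired) without touching client code, which is the robustness and graceful evolution emphasized earlier.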

Page 14:

Computing Sciences

Athena/Gaudi Component Model

[Component diagram showing: ApplicationManager, EventLoopMgr, Algorithms, Sequencer, Auditors, Converters, StoreGateSvc, PersistencyService, MessageService, JobOptionsService, Particle Properties Service, Histogram Service, Other Services, the Event Store, Detector Store, and Histogram Store, and Data Files.]

Page 15:

Computing Sciences

Athena Components

Algorithms
Provide basic per-event processing
Share a common interface (state machine)
Sequencer is a type of Algorithm that sequences/filters other Algorithms

Tools
More specialized but more flexible than Algorithms

Services
E.g. Particle Properties, Random Numbers, Histogramming

Data Stores (blackboards)
Data registered by one Algorithm/Tool can be retrieved by another
Multiple stores handle different lifetimes (per event, per job, etc.)
Stores accessed via Services (e.g. StoreGateSvc)

Converters
Transform data from one representation to another
• e.g. transient/persistent

Properties
Adjustable parameters of components
Can be modified at run-time to configure the job
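To make the component model concrete, here is a minimal, self-contained C++ sketch of the Algorithm / blackboard-store / property pattern described above. It is illustrative only: it is not the real Athena/Gaudi API, and all class, method, and key names are simplified stand-ins.

```cpp
// Minimal sketch of the Algorithm / blackboard / property pattern.
// NOT the real Athena/Gaudi API; names are simplified stand-ins.
#include <any>
#include <iostream>
#include <map>
#include <string>
#include <vector>

// Blackboard store: one component records data, another retrieves it by key.
class EventStore {
public:
    template <typename T>
    void record(const std::string& key, T obj) { data_[key] = std::move(obj); }
    template <typename T>
    const T* retrieve(const std::string& key) const {
        auto it = data_.find(key);
        return it == data_.end() ? nullptr : std::any_cast<T>(&it->second);
    }
    void clear() { data_.clear(); }  // wiped at the end of each event
private:
    std::map<std::string, std::any> data_;
};

// Common Algorithm interface: the framework drives initialize/execute/finalize.
class Algorithm {
public:
    virtual ~Algorithm() = default;
    virtual void initialize() {}
    virtual void execute(EventStore& store) = 0;
    virtual void finalize() {}
};

// Producer: records its output on the blackboard.
class TrackFinder : public Algorithm {
public:
    double ptCut = 1.0;  // a "property", configurable before the job runs
    void execute(EventStore& store) override {
        std::vector<double> trackPt = {0.5, 2.3, 7.1};
        std::vector<double> selected;
        for (double pt : trackPt)
            if (pt > ptCut) selected.push_back(pt);
        store.record("SelectedTracks", selected);
    }
};

// Consumer: retrieves the producer's output without knowing who made it.
class TrackCounter : public Algorithm {
public:
    void execute(EventStore& store) override {
        if (auto* tracks = store.retrieve<std::vector<double>>("SelectedTracks"))
            std::cout << "tracks passing cut: " << tracks->size() << "\n";
    }
};

int main() {
    EventStore store;
    TrackFinder finder;
    finder.ptCut = 2.0;  // run-time configuration, as job options would do
    TrackCounter counter;
    std::vector<Algorithm*> sequence = {&finder, &counter};
    for (int event = 0; event < 3; ++event) {   // trivial event loop
        for (auto* alg : sequence) alg->execute(store);
        store.clear();
    }
}
```

The point of the blackboard is visible in the sketch: TrackCounter never references TrackFinder directly, so either side can be replaced or re-sequenced through configuration alone.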

Page 16:

Computing Sciences

ATLAS Computing Organization

Page 17:

Computing Sciences

HENPC Group within ATLAS

David Quarrie: Software Project Lead

Paolo Calafiura: Chief Architect, Core Services Group Lead

Wim Lavrijsen: Python configuration and support tools

Charles Leggett: Athena/Gaudi framework support

Martin Woudstra: Integration of Athena with the production system

Page 18:

Computing Sciences

FY06-07 Major Activities

Computing support for ATLAS Detector Commissioning
Electronics & detector calibrations
Cosmic ray tests

Computing System Commissioning (CSC)
Commissioning of the computing and software system itself

High Level Trigger Large Scale Tests
Offline software components used in HLT
• Athena framework
• Tracking & calorimetry algorithms

Page 19:

Computing Sciences

ATLAS (some highlights)

Management & Leadership Roles
ATLAS Software Project Lead: David Quarrie
ATLAS Chief Architect: Paolo Calafiura
• Previously D. Quarrie
US ATLAS Core Software Manager: Paolo Calafiura
• Previously D. Quarrie, C. Tull

Software and Technology Development
Athena Control & Analysis Framework
PyROOT: Introspection-driven ROOT Python interface
StoreGate: Object Transient Store

Software Engineering Best Practices and Support
Nightly Build & Release Campaign
Dozens of tutorials and extensive documentation
ASK (Athena Startup Kit): Robust GUI for build system
Hephaestus: Low-overhead Memory Profiler

Page 20:

Computing Sciences

IceCube

Management & Leadership Roles
Software Architect: Simon Patton
Experiment Control: Simon Patton

Software and Technology Development
Ice Tray: Component-based Analysis Framework
JBoss/JMX Control Architecture
• Hierarchical State Machine
• Web Portal interface

Software Engineering Best Practices and Support
BFD (Baseline File Development): UML-based develop, build, release system
Tutorials & Developer Support

Page 21:

Computing Sciences

IceCube Computing

System Architecture and Development (Simon Patton)
Strong, coherent vision for all IceCube software
Laying out "best practices" to follow to ensure good code
Development environment and tools supporting best practices

Experiment Control (Simon Patton, Chris Day, Akbar Mokhtarani)
Layered State Machine control of components of the data flow
Uses J2EE, JBoss/JMX
Hierarchical State Machine
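IceCube's actual experiment control is implemented in Java on JBoss/JMX; the plain C++ sketch below (component names invented) only illustrates the layered, hierarchical state-machine idea, in which a parent controller drives the same transition through all of its child components.

```cpp
// Illustrative sketch of a layered (hierarchical) state machine for
// experiment control. Not IceCube code; the real system is Java/JBoss/JMX.
#include <iostream>
#include <memory>
#include <string>
#include <vector>

enum class State { Idle, Configured, Running };

// One controllable component of the data flow (e.g. DAQ, event builder).
class Component {
public:
    explicit Component(std::string name) : name_(std::move(name)) {}
    virtual ~Component() = default;
    virtual void configure() { setState(State::Configured); }
    virtual void start()     { setState(State::Running); }
    virtual void stop()      { setState(State::Configured); }
    State state() const { return state_; }
protected:
    void setState(State s) {
        state_ = s;
        std::cout << name_ << " -> state " << static_cast<int>(s) << "\n";
    }
private:
    std::string name_;
    State state_ = State::Idle;
};

// A layer in the hierarchy: forwards each transition to its children,
// so the whole subsystem moves through the state machine coherently.
class Controller : public Component {
public:
    using Component::Component;
    void add(std::unique_ptr<Component> child) { children_.push_back(std::move(child)); }
    void configure() override {
        for (auto& c : children_) c->configure();
        Component::configure();
    }
    void start() override {
        for (auto& c : children_) c->start();
        Component::start();
    }
    void stop() override {
        Component::stop();
        for (auto& c : children_) c->stop();
    }
private:
    std::vector<std::unique_ptr<Component>> children_;
};

int main() {
    Controller run("RunControl");
    run.add(std::make_unique<Component>("DAQ"));
    run.add(std::make_unique<Component>("EventBuilder"));
    run.configure();  // children first, then the controller itself
    run.start();
    run.stop();
}
```

In the real system each layer would also report its state upward and refuse illegal transitions; that bookkeeping is omitted here.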

Page 22:

Computing Sciences

SNAP

Management & Leadership Roles
Computing Framework Architect: Gary Kushner
• SNAP Collaboration Systems Manager Group
• Represent SNAP computing w/ Bill Carithers

Software and Technology Development
Computing Framework & Monte Carlo (Java-based)
• Simulate the Universe being observed
• Simulate the Instrumentation and Detection Process
• Simulate the Extraction of Cosmological Parameters from Mission Data

Software Engineering Best Practices and Support
Ant-based build environment
Shrink-wrapped deployment package (out-of-box experience)
Redesign/implementation of physics codes: up to 15x speedup

Page 23:

Computing Sciences

Supernova Mission Simulator

With our Monte Carlo:
— Have simulated SNAP with detector characteristics and observing program

— Have simulated other potential experiments including ground-based instruments

— Use state-of-the-art SNe models that simulate SNe population drift with redshift

— Included systematic effects and calibration errors

— Can generate error ellipses for cosmological parameters

— Can optimize SNAP

Page 24:

Computing Sciences

High Level CS Architecture

[Architecture diagram showing: GUI and Command Line front ends, Run Manager, Run Files, Catalog Access Toolkit, Data Manipulation Toolkit, Data Visualization Toolkit, Data Set Toolkit, Image Toolkit, Data Sets, Output Data Set, Image Repository, and Run Time Repository.]

Page 25:

Computing Sciences

BaBar

Management & Leadership Roles
Previously Chief Architect: David Quarrie
Previously Database Head: Simon Patton

Software and Technology Development
Conditions DataBase (CDB): Igor Gaponenko & Akbar Mokhtarani
Historically:
• Object Oriented Database (Objectivity)
• Offline & Online software & General applications

Software Engineering Best Practices and Support
Refactoring all BaBar database applications to newer persistency technologies (ROOT I/O & MySQL)
Expert-level support for distributed database management & tools
Consultation on Database Software

Page 26:

Computing Sciences

CDB Concepts: Scope & Ownership Diagram

[Diagram relating the CDB concepts DATABASE, ORIGIN, P-LAYOUT (2-D space), PARTITION, VIEW, FOLDER, CONFIGURATION, PHYSICAL CONDITION (2-D space), REVISION, VISIBLE INTERVAL, ORIGINAL INTERVAL, USER DATA OBJECT, and MIR, with relationships labelled "uses", "owns", and "provides scope for".]

Page 27:

Computing Sciences

Majorana

Software and Technology Development
Centralized MySQL database for MaGe (GEANT4) material properties and geometry
• Schema, API, implementation, and support

Software Engineering Best Practices and Support
Unified software management to ensure code integrity across participating institutions
Incremental build and test system
General application and development support
• Geant4, Root, CLHEP, Dawn, etc.
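As an illustration of what "schema, API, implementation" can look like from the client side, here is a small C++ sketch using the standard MySQL C API. The host, credentials, table, and column names are invented for this example and are not the actual MaGe schema.

```cpp
// Hedged sketch: a hypothetical client pulling a material property from a
// centralized MySQL database via the standard MySQL C API. The table and
// column names ("mage_materials", "density_g_cm3") are invented.
#include <cstdio>
#include <mysql/mysql.h>

int main() {
    MYSQL* conn = mysql_init(nullptr);
    if (!mysql_real_connect(conn, "dbhost.example.org", "mage_reader", "secret",
                            "mage", 3306, nullptr, 0)) {
        std::fprintf(stderr, "connect failed: %s\n", mysql_error(conn));
        return 1;
    }
    // Look up the density of a detector material by name.
    if (mysql_query(conn,
            "SELECT density_g_cm3 FROM mage_materials WHERE name = 'enrGe'") == 0) {
        MYSQL_RES* res = mysql_store_result(conn);
        while (MYSQL_ROW row = mysql_fetch_row(res))
            std::printf("enriched Ge density: %s g/cm^3\n", row[0]);
        mysql_free_result(res);
    }
    mysql_close(conn);
    return 0;
}
```

In practice a framework like MaGe would wrap such queries behind its own API so that Geant4 geometry code never touches SQL directly.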

Page 28:

Computing Sciences

“MaGe” Simulation Package.

Framework uses powerful object-oriented and abstraction capabilities of C++ and STL for flexibility

[Diagram: a common MaGe core (Geant4/ROOT, event generators, common geometries, physics processes) shared by Gerda-related and Majorana-related detector geometries and output formats.]

Page 29:

Computing Sciences

MaGe Activities

Previous Activities:

Characterization of radioactive backgrounds in conceptual design for NuSAG proposal.

Interpretation of results from highly segmented detector at MSU.

TUNL-FEL Run

Charge Distribution in 0νββ decay

Current Activities:

Full characterization of Majorana reference design and optimization of segmentation scheme.

Neutron background from muons in rock

Alpha contamination on surfaces.

Pulse-generation.

Gerda


Posters at Neutrino04, TAUP05

Page 30:

Computing Sciences

Challenges & Opportunities

Many other science disciplines are growing in size, distribution, and data volume
HENP lessons learned should be leveraged
Non-HENP techniques of interest to HENP

New HENP experiments/projects: SNAP, Majorana, Daya Bay, Gretina

"Old" experiments/projects: BaBar, ATLAS, IceCube

Lack of any base funding for HENPC:
Problems with long-term stability & predictability
Great difficulty jump-starting new projects
Common-use software, libraries & tools are a "volunteer" effort

Page 31:

Computing Sciences

Summary of HENPC Group

A small group with involvement in many projects
Has had a major impact in the field

Leaders in use of Object Oriented programming and software process

Leaders in use of Object Oriented Databases
Control Frameworks in use by many experiments

• Not just those we’ve worked directly on

Demonstrated ability to leverage our expertise and empower large, dispersed, software teams

Demonstrated ability to design and implement large scale scientific computing systems

Keeping abreast of modern computing techniques and tools, but delivering concrete, robust software for production use.