CHEP 2003 General Summary

Torre Wenaus, BNL/CERN

CHEP 2003, UC San Diego, La Jolla

March 28, 2003

CHEP 2003 Summary, March 28 2003 Slide 4

Torre Wenaus, BNL/CERN

Themes and observations

- Lesson from the past: Make it simple (R. Brun)
  - No more complex than necessary
  - Users want consolidation, ease of use, and stability
  - Must consider also needs of the future; longer view of maintainability and evolution
    - In the interests of long term stability
- OO and C++ is the accepted paradigm
  - No major OO/C++ migration or usage angst at this conference, it is done and accepted
  - Offline and online: “Triumph of C++ for HEP DAQ confirmed” (DAQ summary)
- Now we are hearing reports on Nth generation C++ software
  - L. Sexton Kennedy, CDF: Every component has been rewritten at least once. Implementations have now stabilized such that every new arrival doesn’t start by discarding and rewriting software
  - “Many more talks about redesign than about design” (Data management summary)
- And on the maturation and emergence of tools as broad standards, after years of development and refinement
  - e.g. Geant4, ROOT I/O

Themes and observations

- The tyranny of Moore’s Law
  - Wolbers: it is not a substitute for more efficient & faster code, smaller data size
  - It works against thinking before doing
  - Optimize wherever possible
- Addressing the digital divide in networking (H. Newman)
  - HEP is obligated as a community to work on this
  - A world problem in which our field can have visible impact
- Farm challenges
  - Don’t underestimate farm installation and operations (R. Divia)
  - Big issues are power, cooling, space! (S. Wolbers)
  - Watts/$ steadily rising (R. Mount)
- Tape-disk random access performance gap in analysis is receding as an issue, but disk-memory gap is hardly being addressed (R. Mount)

Rising trends

- ROOT
  - For analysis, I/O, and much else
  - Now fully supported at CERN: EP/SFT section
  - Close interaction with experiments on new developments
    - Run II, RHIC, ALICE, LCG, BaBar, …
    - Foreign classes, PROOF, geometry, grid integration, …
  - Mentioned in 47+ talks at this conference
- Open source databases (MySQL, Postgres, …)
  - Metadata, distributed computing, conditions, …
  - Empowering software: easy and potent
  - MySQL mentioned in 37 talks! Postgres in 8, Oracle in 27
- Online-offline continuum
  - Similar Linux farm environments, attainable time budgets
  - Same framework, maybe same algorithms, in HLT as in offline (V. Boisvert, ATLAS)
  - Stringent performance/robustness requirements on software

Rising trends

- Common projects
  - Joint projects one of the CDF/D0 successes (Wolbers)
    - But hard to align running experiments with LHC
  - LHC Computing Grid project
  - Grid projects in general
  - Laudable but difficult; increasingly forced by the circumstances
    - Resource constraints and increasing scale and complexity make go-it-alone N times too costly
    - cf. comments in online/DAQ context by G. Dubois-Felsmann today: somewhat less success in online, where it is even harder than offline, but possible LHC inroads
  - Related is software reuse…
    - Respect what we know about long software development timescales

Rene’s time to develop plot [figure not reproduced; annotated “LCG?”]

LCG must effectively re-use and leverage existing software, or fail

LCG?

This is the approach taken: cf. POOL, SEAL talks. Time will tell! cf. next CHEP

Rising trends – The Grid

- The central importance of distributed computing to future (increasingly, present) HENP is long known
- ‘The Grid’ as the means to that is now established
  - Major, broad successes in funding and in attracting collaboration with CS
    - F. Berman, Grid 2003: “HEP has set a model for integration, focus, coordination”
  - Progress in applying Grid software and infrastructure to real problems
    - Batch production
- Clearly the chosen path; success to be proven, but has promise and broad commitment

The Grid

- F. Berman, Grids on the horizon:
  - Must be useful, usable, stable; supported
  - More cooperative than competitive [Not always the case today!]
  - Applications are key to success
    - Not a “Field of Dreams” “build it and they will come” R&D field any more
  - Grid killer app: a focus on data. Good match to us
  - Still a long way to go

The Grid

- Miron Livny:
  - Benefit to science: democratization of computing
  - Still very manpower intensive: when the support team goes on holiday, so does the Grid (CMS testbed in Dec)
  - Best practice middleware requires
    - True collaboration, “open minds” (cf. Berman)
    - Testing, deployment/adoption, evaluation metrics, robustness, professional support, longevity, responsiveness to show stoppers, …
  - Much to do and improve but important progress
    - E.g. VDT as standard middleware suite

Receding trends

- Objectivity and ODBMS in general
  - “Jury still out” at CHEP 2000 (P. Sphicas), but now clear
  - Objectivity dropped or being phased out by LHC experiments, COMPASS, BaBar event store
  - In PHENIX “becoming a liability” (compiler issues); augmented with RDBMSs
  - Not due to technical failure but a mix of technical problems, commercial concerns, manpower costs, availability of an alternative
  - Its replacements are not other ODBMSes but files (often ROOT) + RDBMS (MySQL, Oracle, Postgres…) for metadata
- Magnetic tape (apart from archival)
  - PASTA: “unlimited” multi-PB disk caches technically possible but true cost is unclear (reliability, manageability)
  - File system access under urgent investigation
  - “tapes as random access device no longer a viable option”: large disk caches needed for LHC analysis

Receding trends

- Commercial software? No…
  - Some in decline (Objy, LHC++), but new prospects opening (IBM, Sun, MS, …) in Grid
  - Open source now has an important commercial element we derive great benefit from (even post-.com crash)
    - Red Hat, MySQL, Qt, …

Underrepresented

- Collaborative tools
  - Was represented this week, but only lightly
  - Vital for distributed collaboration on software development and physics analysis
  - H. Newman: need culture of collaboration
    - Distributed and remote collaboration should be the norm
  - Not solely, or even predominantly?, a matter of tool development in the community
    - How is the exponential commercial side evolving, and how can we leverage it?
    - What is the evolutionary path, strategy, role for community-developed tools such as VRVS?
  - Why is the user experience often poor?
    - Poor physical facilities/configurations, instabilities, heterogeneous tools/protocols, support issues, …
    - Current experience sometimes competes unsuccessfully with the telephone, despite all the shortcomings

Concerns

- Data analysis as “the last wheel of the car” (R. Brun)
  - Clear message from current generation (e.g. Run 2, BaBar): don’t leave data analysis systems and infrastructure too late; it will lead to problems
  - Vastly more true when we are talking about doing globally distributed analysis, for the first time
    - With unprecedented volume and complexity, e.g. Terabyte scale at the LHC
    - Making distributed analysis both very difficult and mandatory
  - We cannot bootstrap ourselves into a global analysis system; it will take long incremental work, so we had better be working in a coordinated & effective way now
  - R. Brun: Will not converge on one system; will be multiple competing systems, and that will not be bad [hopefully a small number]

Concerns

- Are we doing enough to ensure senior people can contribute directly to physics analysis?
  - How do we interpret the fact (R. Brun) that PAW usage is still rising?
    - Has everyone bought the C++/OO paradigm shift?
    - Are we developing and/or providing the right tools?
  - Is there enough engagement of senior physicists in the (limited) exploratory work being done on future physics analysis environments?
    - Almost certainly no, and may be difficult to attract their attention unless/until attractive prototypes can be turned loose on them

Conclusions (2)

- Grids and networking are making great strides
  - HENP is a successful and valued partner with CS
    - We provide a community focused on challenging large-scale deployments in real research settings
  - But Murphy’s Law is a potent adversary today; far from robust transparency, and much much more to do
- Global collaborative computing must become a successful norm for us
  - Down to the global researcher at the home institute
  - Rich leadership potential for our field
- Important new common endeavours like the Grid and LCG have much invested in their success… will be interesting to measure the degree of success at next CHEP