Frank Porter System Manager: Juan Barayoga hep.caltech.edu/~babar/doe2004/compSlides.pdf · 2004. 7....


Experimental Computing

Frank Porter

System Manager: Juan Barayoga

1 Frank Porter, Caltech DoE Review, July 21, 2004


HEP Experimental Computing System Description (I)

CPU farm
— Linux on dual-Intel-CPU rack-mounted units, 1-2 GByte memory each
— Currently 122 CPUs in farm
— “PBS” batch system for resource allocation
— 100BaseT network connection to each unit
— KVM switches for local keyboard/mouse/video

File servers
— Also Linux/Intel-based
— IDE RAID 5 technology
— 7.9 TByte capacity on five servers
— Gbit ethernet
— NFS, AFS, Samba file serving software


HEP Experimental Computing System Description (II)

Interactive servers
— Desktop mixture of Linux and Windows on Intel
— Central interactive Linux servers (4 dual-CPU), legacy AIX servers
— Recent purchase of eight high performance desktops (including local RAID 0 serial ATA) for heavy interactive analysis

Other CPU services
— NT domain servers (primary and secondary)
— Web servers (Linux and NT)
— Objectivity (for BaBar)

Tape drives
— DLT tape library on fileserver
— DLT drive on NT
— Mostly used for backups now


HEP Experimental Computing System Description (III)

Network
— 100BaseT capability available everywhere, 2 subnets for security, capacity
— HEP gigabit ethernet switches, plus CITNET 2000
— Wireless 11 Mbps, maintained by Caltech
— WAN supported by Caltech

Printers (principally 2 color and 2 B&W)

VGA projectors (conference rooms, plus roamer)

UPS for critical services (network, file servers, mail server, web server)


Caltech is a major site for BaBar Monte Carlo Production

Two jobs on each of 40 dual-PIII nodes; four jobs (hyperthreading mode) on each of 20 dual Xeon nodes. Each job has 512 MB RAM available.

Allocations
– Signal modes (large variety)
– Generics: B⁰B̄⁰, B⁺B⁻, cc̄, uds, τ⁺τ⁻, μ⁺μ⁻

Alex Samuel runs MC production at Caltech
– Checks 1–2 times/day
– Requests new allocation every 2–3 weeks
– Consecutive allocations run overlapped, so no dead time except when the queue is drained to upgrade conditions, background triggers, or software

Currently, third most productive BaBar site for SP6.
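The per-node job counts quoted on this slide imply a total number of concurrent production jobs and an aggregate memory budget. A minimal sketch of that arithmetic follows. The names `NodeClass`, `MC_NODE_CLASSES` and `farm_capacity` are hypothetical, introduced only for illustration; the figures are the ones quoted on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeClass:
    """One class of farm node (hypothetical helper, not part of any BaBar tool)."""
    name: str
    nodes: int          # number of machines of this class
    jobs_per_node: int  # concurrent Monte Carlo jobs run on each machine


# Figures as quoted on the slide.
MC_NODE_CLASSES = [
    NodeClass("dual-PIII", nodes=40, jobs_per_node=2),
    NodeClass("dual-Xeon (hyperthreading)", nodes=20, jobs_per_node=4),
]
RAM_PER_JOB_MB = 512


def farm_capacity(classes, ram_per_job_mb):
    """Return (total concurrent jobs, total RAM in MB available to those jobs)."""
    total_jobs = sum(c.nodes * c.jobs_per_node for c in classes)
    return total_jobs, total_jobs * ram_per_job_mb


if __name__ == "__main__":
    jobs, ram_mb = farm_capacity(MC_NODE_CLASSES, RAM_PER_JOB_MB)
    print(f"{jobs} concurrent jobs, {ram_mb / 1024:.0f} GB RAM in total")
```

With these numbers each node class contributes 80 job slots, for 160 concurrent jobs and 80 GB of job memory overall.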


Caltech BaBar Monte Carlo Production


Caltech BaBar Monte Carlo Production, Weekly Stats

Stats from BaBar Monte Carlo production sites in past week (July 16, 2004):

Total events produced in SP6: 54.12 M

Site        Runs Done  Runs Failed  Failure Rate (%)  Events (M)  Machines  Events/Machine (M)  CPU Eff. (%)  Site Eff. (%)
uvic2       3305       7            0.2114            7.248       78        0.0929              73.7          127.7
utd         2231       134          5.6660            7.018       62        0.1132              96.7          76.7
caltech     3447       18           0.5195            6.428       56        0.1148              83.7          127.2
osu         2358       8            0.3381            4.196       27        0.1554              97.2          91.9
cu-boulder  2072       34           1.6144            4.087       87        0.0470              95.1          39.8
albany      1037       1594         60.5853           4.028       32        0.1259              95.0          97.6
tud         2293       1589         40.9325           4.016       33        0.1217              79.6          130.5
ccin2p3     228        6            2.5641            3.608       372       0.0097              67.0          11.2
utenn       1431       97           6.3482            3.456       40        0.0864              98.4          90.5
infn        1867       165          8.1201            3.22        39        0.0826              71.9          117.0
fzk         1175       297          20.1766           2.762       343       0.0081              80.2          4.6
uk-spgrid   947        10           1.0449            2.196       34        0.0646              79.5          79.4
uvic        376        127          25.2485           1.144       20        0.0572              90.3          84.9
slac        258        0            0.0000            0.432       32        0.0135              97.0          7.4
westgrid    738        1            0.1353            0.148       71        0.0021              79.0          1.8
uofl        0          3            100.0000          0.133       12        0.0111              91.7          5.8
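The derived columns in the weekly table can be cross-checked from the raw counts. The sketch below assumes, from the numbers themselves rather than from any definition given in the slides, that the failure rate is failed / (done + failed) and that events per machine is events (in millions) divided by machines. `SiteStats` and the helper functions are hypothetical names used only for this illustration. The CPU and site efficiency columns are left out because the slides do not say how they are defined.

```python
from typing import NamedTuple


class SiteStats(NamedTuple):
    site: str
    runs_done: int
    runs_failed: int
    events_m: float  # events produced, in millions
    machines: int


# Raw columns copied from the weekly table (July 16, 2004).
ROWS = [
    SiteStats("uvic2", 3305, 7, 7.248, 78),
    SiteStats("utd", 2231, 134, 7.018, 62),
    SiteStats("caltech", 3447, 18, 6.428, 56),
    SiteStats("osu", 2358, 8, 4.196, 27),
    SiteStats("cu-boulder", 2072, 34, 4.087, 87),
    SiteStats("albany", 1037, 1594, 4.028, 32),
    SiteStats("tud", 2293, 1589, 4.016, 33),
    SiteStats("ccin2p3", 228, 6, 3.608, 372),
    SiteStats("utenn", 1431, 97, 3.456, 40),
    SiteStats("infn", 1867, 165, 3.22, 39),
    SiteStats("fzk", 1175, 297, 2.762, 343),
    SiteStats("uk-spgrid", 947, 10, 2.196, 34),
    SiteStats("uvic", 376, 127, 1.144, 20),
    SiteStats("slac", 258, 0, 0.432, 32),
    SiteStats("westgrid", 738, 1, 0.148, 71),
    SiteStats("uofl", 0, 3, 0.133, 12),
]


def failure_rate_pct(s):
    """Failed runs as a percentage of all attempted runs (assumed definition)."""
    attempted = s.runs_done + s.runs_failed
    return 100.0 * s.runs_failed / attempted if attempted else 0.0


def events_per_machine(s):
    """Millions of events per machine (assumed definition)."""
    return s.events_m / s.machines


def rank_by_events(rows, site):
    """1-based rank of `site` when rows are sorted by events, descending."""
    ordered = sorted(rows, key=lambda s: s.events_m, reverse=True)
    return [s.site for s in ordered].index(site) + 1


if __name__ == "__main__":
    caltech = next(s for s in ROWS if s.site == "caltech")
    print(f"total events: {sum(s.events_m for s in ROWS):.2f} M")
    print(f"caltech failure rate: {failure_rate_pct(caltech):.4f} %")
    print(f"caltech events/machine: {events_per_machine(caltech):.4f} M")
    print(f"caltech rank by events: {rank_by_events(ROWS, 'caltech')}")
```

Under these assumed definitions the recomputed values agree with the table: the per-site events sum to the quoted 54.12 M, and Caltech ranks third by events produced.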


BaBar Physics Analysis at Caltech

Use of Caltech computing for BaBar data analysis is increasing

– Off-loads SLAC computing

– Using both batch queues and interactive analysis

Set up for full CM2-based user analysis

– Have new releases

– Can import datasets from SLAC

– Can compile/debug/run user code; building and running are generally faster than at SLAC

Used for large (100s of GB) physics dataset storage (e.g., ntuples) and analysis


MINOS Looming Large

Pre-data MINOS uses farm for occasional Monte Carlo runs, some analysis.

Beam expected to start in December. Computing model remains somewhat unclear relative to Caltech, but likely will mean something like:

– Some reprocessing of data

– Monte Carlo production, possibly substantial

– Physics analysis, requiring efficient interactive access to data

A worry: FNAL is not going with Red Hat Enterprise Linux.


Continuous Evolution

Farm continues to grow to match needs.

— In 2004 installed additional 20 dual-CPU machines with 2.8 GHz Intel Xeon processors. Currently configuring order for another 20. Haven’t hit any scaling limit yet.

— Blade servers: opted against so far, will continue to watch.

— Rack space, power/heat load issues. Recently added electrical capacity, will have to do more.

— Cost/unit approximately constant, performance/unit increases.

Disk space continues to grow to match needs.

— Cost/unit approximately constant, performance/unit increases.

Replaced DQS batch system with PBS.

No remaining reliance on AIX for services.


Continuous Evolution (II)

Disk space and network replacing tapes; tapes required mostly for backups.

Recently upgraded UPS; still need a bit more.

Additional switches (Cisco 2948G, 2970) recently purchased; currently have four.


Caltech Support

Caltech’s ITS (Information Technology Services) provides a variety of services benefiting HEP

Site-wide software license agreements
— These agreements have improved considerably over time, and are now quite flexible in permitting desired uses.
— Autocad, Visio, Pro Engineer
— Maple, Mathematica, Matlab
— Microsoft: OS, Office, Visual Studio, Project
— Norton antivirus (NAV)
— PCTeX
— SSH, WinSCP
— Adobe Acrobat
— New: Red Hat Enterprise Linux


Caltech Support – Networking

Networking support

— Campus backbone, and equipment/maintenance for WAN connection provided by Caltech

— Caltech provides ISP (cable modem, PPP, ISDN) service

— Caltech monitors security alerts

— Major Caltech campus-wide network upgrade

— Uniform wireless 802.11b (partial) implementation


Operations

New farm and CPU-servers were brought up with help from BaBar, and largely restricted to BaBar as “guinea pigs.” Now supporting all of the groups.

However, use continues to be dominated by BaBar with heavy Monte Carlo and analysis demands.

Decision to go uniformly with Linux was largely the result of personnel concerns. [Phase-out of AIX is essentially complete.]

Juan Barayoga is system manager.

Additional help from part-time students, physicists.


Comments on Caltech HEP computing budget

Computing requirements are increasing rapidly, at same time power/dollar increases.

Operating budget request is made up of two pieces:

— Maintenance and system administration, almost entirely salary.

— Equipment budget. Actually supersedes most former “maintenance” expense, with continuous “upgrade” – replacement more effective than repair.

Proposal is to maintain current level of funding.
