Computing for Belle

Transcript of Computing for Belle

Page 1: Computing for Belle

Computing for Belle

CHEP2004, September 27, 2004

Nobu Katayama, KEK

Page 2: Computing for Belle

Outline

Belle in general
Software
Computing
Production
Networking and collaboration issues
Super KEKB / Belle / Belle computing upgrades

Page 3: Computing for Belle

Belle detector

Page 4: Computing for Belle

Integrated luminosity

May 1999: first collision
July 2001: 30 fb-1
Oct. 2002: 100 fb-1
July 2003: 150 fb-1
July 2004: 287 fb-1
July 2005: 550 fb-1
Super KEKB: 1~10 ab-1/year!

>1 TB raw data/day; more than 20% is hadronic events
~1 fb-1/day!

Page 5: Computing for Belle

Continuous Injection

No need to stop the run: always at ~maximum currents and therefore maximum luminosity.

[CERN Courier, Jan/Feb 2004] ~30% more integrated luminosity for both KEKB and PEP-II

[Figure: HER and LER currents and luminosity over 24 hours, comparing continuous injection (new) with normal injection (old)]

~1 fb-1/day! (~1x10^6 BBbar)

Page 6: Computing for Belle

Summary of b→sqq CPV

2.4σ deviation from “sin2φ1” (A: consistent with 0)

Something new in the loop??

We need a lot more data

Page 7: Computing for Belle

The Belle Collaboration

Aomori U., BINP, Chiba U., Chonnam Nat’l U., Chuo U., U. of Cincinnati, Ewha Womans U., Frankfurt U., Gyeongsang Nat’l U., U. of Hawaii, Hiroshima Tech., IHEP Beijing

IHEP Moscow, IHEP Vienna, ITEP, Kanagawa U., KEK, Korea U., Krakow Inst. of Nucl. Phys., Kyoto U., Kyungpook Nat’l U., U. of Lausanne, Jozef Stefan Inst.

U. of Melbourne, Nagoya U., Nara Women’s U., National Central U., Nat’l Kaohsiung Normal U., Nat’l Lien-Ho Inst. of Tech., Nat’l Taiwan U., Nihon Dental College, Niigata U., Osaka U., Osaka City U., Panjab U., Peking U., Princeton U., Riken-BNL, Saga U., USTC

Seoul National U., Shinshu U., Sungkyunkwan U., U. of Sydney, Tata Institute, Toho U., Tohoku U., Tohoku Gakuin U., U. of Tokyo, Tokyo Inst. of Tech., Tokyo Metropolitan U., Tokyo U. of A and T., Toyama Nat’l College, U. of Tsukuba, Utkal U., VPI, Yonsei U.

13 countries, institutes, ~400 members

Page 8: Computing for Belle

Collaborating institutions

Major labs/universities from Russia, China, India
Major universities from Japan, Korea, Taiwan, Australia…
Universities from the US and Europe

KEK dominates in one sense: 30~40 staff work on Belle exclusively, and most of the construction and operating costs are paid by KEK.

Universities dominate in another sense: young students stay at KEK, help operations, and do physics analysis.

Human resource issue: we are always lacking manpower.

Page 9: Computing for Belle

Software

Page 10: Computing for Belle

Core Software

OS/C++: Solaris 7 on Sparc and RedHat 6/7/9 on PCs; gcc 2.95.3/3.0.4/3.2.2/3.3 (the code also compiles with SunCC).

No commercial software except for the batch queuing system and the hierarchical storage management system.

QQ, EvtGen, GEANT3, CERNLIB (2001/2003), CLHEP (~1.5), PostgreSQL 7.

Legacy FORTRAN code: GSIM/GEANT3 and old calibration/reconstruction code.

I/O: home-grown stream I/O package + zlib. It is the only data format for all stages (from DAQ to final user analysis skim files). Index files (pointers to events in data files) are used for final physics analysis.
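To illustrate the index-file idea, here is a minimal sketch (not the actual Belle stream I/O package; the index format and record layout are assumptions): an index file lists (data file, byte offset) pairs for selected events, and the reader seeks to each offset and inflates the zlib-compressed record.

// Minimal sketch of index-file based event access (illustration only, not
// the real Belle I/O code; record and index layouts are assumed).
#include <fstream>
#include <string>
#include <vector>
#include <cstdio>
#include <zlib.h>

struct IndexEntry {          // one selected event
  std::string file;          // data file that contains the event
  long        offset;        // byte offset of its compressed record
};

static std::vector<IndexEntry> read_index(const char* idxname) {
  std::vector<IndexEntry> v;
  std::ifstream in(idxname);
  IndexEntry e;
  while (in >> e.file >> e.offset) v.push_back(e);
  return v;
}

// Seek to one record, read its (compressed, uncompressed) sizes and inflate it.
static std::vector<char> read_event(const IndexEntry& e) {
  std::ifstream in(e.file.c_str(), std::ios::binary);
  in.seekg(e.offset);
  unsigned int clen = 0, ulen = 0;             // assumed record header
  in.read(reinterpret_cast<char*>(&clen), sizeof(clen));
  in.read(reinterpret_cast<char*>(&ulen), sizeof(ulen));
  std::vector<char> cbuf(clen), ubuf(ulen);
  in.read(&cbuf[0], clen);
  uLongf dlen = ulen;
  uncompress(reinterpret_cast<Bytef*>(&ubuf[0]), &dlen,
             reinterpret_cast<const Bytef*>(&cbuf[0]), clen);
  return ubuf;
}

int main() {
  std::vector<IndexEntry> idx = read_index("hadronB.index");   // hypothetical name
  std::printf("%lu events in index\n", (unsigned long)idx.size());
  if (!idx.empty()) read_event(idx[0]);
  return 0;
}

With such pointers, a physics skim only needs a small index file while the bulk data files stay where they are.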

Page 11: Computing for Belle

Framework (BASF)

Event parallelism on SMP machines (1995~), using fork (to cope with legacy Fortran common blocks).

Event parallelism on multiple compute servers (dbasf, 2001~; V2, 2004). Users' code and the reconstruction code are dynamically loaded. It is the only framework for all processing stages (from DAQ to final analysis).
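As a rough illustration of the fork-based scheme (this is not the actual BASF code; the event assignment policy is invented), each worker runs in its own process, so any Fortran COMMON blocks are private to it:

// Minimal sketch of fork()-based event parallelism (illustration only).
// Each worker is a separate process, so legacy Fortran COMMON blocks inside
// the reconstruction modules are not shared between events processed in parallel.
#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>

// Stand-in for a dynamically loaded user/reconstruction module.
static void process_event(int evt) { std::printf("pid %d  event %d\n", (int)getpid(), evt); }

int main() {
  const int nWorkers = 4;                       // e.g. one per CPU of the SMP box
  const int nEvents  = 100;
  for (int w = 0; w < nWorkers; ++w) {
    if (fork() == 0) {                          // child process
      for (int evt = w; evt < nEvents; evt += nWorkers)
        process_event(evt);                     // simple round-robin event assignment
      _exit(0);
    }
  }
  int status;
  while (wait(&status) > 0) {}                  // parent waits for all workers
  return 0;
}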

Page 12: Computing for Belle

Data access methods

The original design of our framework, BASF, allowed an appropriate I/O package to be loaded at run time.

Our (single) I/O package grew to handle more and more kinds of I/O (disk, tape, etc.), was extended to deal with special situations, and became spaghetti-like code.

This summer we were again faced with extending it, to handle the new HSM and for tests with new software.

We finally rewrote our big I/O package as small pieces, making it possible to simply add one derived class for each I/O method.

Basf_io class, with derived classes Basf_io_disk, Basf_io_tape, Basf_io_zfserv, Basf_io_srb, …

IO objects are dynamically loaded upon request
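A rough sketch of what such a split can look like (illustrative only; the real Belle class, factory and library names are not shown in the talk):

// One abstract base class, one derived class per I/O method, each living in
// a shared library that is dlopen'ed on request (illustration only).
#include <dlfcn.h>
#include <cstdio>
#include <string>

class Basf_io {                          // abstract I/O interface
public:
  virtual ~Basf_io() {}
  virtual bool open(const std::string& name) = 0;
  virtual int  read_event(char* buf, int maxlen) = 0;
  virtual void close() = 0;
};

// Each shared library (e.g. basf_io_disk.so, basf_io_srb.so) is assumed to
// export a factory function with this name and signature.
typedef Basf_io* (*io_factory_t)();

static Basf_io* load_io(const std::string& method) {
  std::string lib = "basf_io_" + method + ".so";    // hypothetical naming scheme
  void* handle = dlopen(lib.c_str(), RTLD_NOW);
  if (!handle) { std::fprintf(stderr, "%s\n", dlerror()); return 0; }
  io_factory_t create =
      reinterpret_cast<io_factory_t>(dlsym(handle, "create_io"));
  return create ? create() : 0;
}

// Usage sketch:  Basf_io* io = load_io("disk");  io->open("run123.dat"); ...

A new access method (an SRB-backed one, for example) can then be added by shipping one more shared library, without touching the framework itself.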

Page 13: Computing for Belle

Reconstruction software

30~40 people have contributed over the last several years. For many parts of the reconstruction software we have only one package; there is very little competition.

Good and bad

Identify weak points and ask someone to improve them

Mostly organized within the sub-detector groups (physics motivated, though).

There is a systematic effort to improve the tracking software, but progress is very slow: for example, it took one year to bring the tracking systematic error down from 2% to less than 1%. A small z bias remains for forward/backward or positive/negative charged tracks.

When the problem is solved we will reprocess all data again

Page 14: Computing for Belle

Analysis software

Several to tens of people have contributed:
Kinematical and vertex fitter, flavor tagging, vertexing, particle ID (likelihood), event shape, likelihood/Fisher analysis.

People tend to use the standard packages, but the system is not well organized or documented. We have started a task force (consisting of young Belle members).

Page 15: Computing for Belle

PostgreSQL database system

The only database system Belle uses, other than simple UNIX files and directories. A few years ago we were afraid that nobody else would use PostgreSQL, but it is now widely used and well maintained.

One master, several copies at KEK, many copies at institutions and on personal PCs.

~120,000 records (4.3 GB on disk); the IP (interaction point) profile is the largest and most popular.

It is working quite well, although consistency among the many database copies is a problem.
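For illustration, fetching one such record with the standard PostgreSQL C client library (libpq) could look like the sketch below; the host, database, table and column names are invented, since the actual Belle schema is not described here.

// Sketch only: reading an IP-profile record for one run via libpq.
// Connection string, table and columns are hypothetical.
#include <libpq-fe.h>
#include <cstdio>

int main() {
  PGconn* conn = PQconnectdb("host=belledb dbname=belle_const");   // assumed
  if (PQstatus(conn) != CONNECTION_OK) {
    std::fprintf(stderr, "%s", PQerrorMessage(conn));
    return 1;
  }
  // Assume one row per (experiment, run) holding the fitted IP position.
  PGresult* res = PQexec(conn,
      "SELECT ip_x, ip_y, ip_z FROM ip_profile WHERE exp=37 AND run=1000");
  if (PQresultStatus(res) == PGRES_TUPLES_OK && PQntuples(res) > 0)
    std::printf("IP = (%s, %s, %s)\n",
                PQgetvalue(res, 0, 0), PQgetvalue(res, 0, 1), PQgetvalue(res, 0, 2));
  PQclear(res);
  PQfinish(conn);
  return 0;
}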

Page 16: Computing for Belle

BelleG4 for (Super) Belle

We have (finally) started building a Geant4 version of the Belle detector simulation program. Our plan is to build the Super Belle detector simulation code first, then write the reconstruction code. We hope to increase the number of people in Belle who can write Geant4 code, so that in a year or so we can write the Belle G4 simulation code and compare G4, G3 and real data. Some of the detector code is in F77 and we must rewrite it.

Page 17: Computing for Belle

Computing

Page 18: Computing for Belle

Computing equipment budgets

Rental system
Four- and five-year contracts (20% budget reduction!)
• 1997-2000 (2.5B yen; ~18M euro for 4 years)
• 2001-2005 (2.5B yen; ~18M euro for 5 years)
A new acquisition process is starting for 2006/1-

Belle-purchased systems
KEK Belle operating budget: 2M euro/year; of that, 0.4~1M euro/year goes to computing
• Tapes (0.2M euro), PCs (0.4M euro), etc.
Sometimes we get a bonus(!)
• So far, in five years, we have received about 2M euro in total

Other institutions
0~0.3M euro/year/institution; on average, very little money is allocated

Page 19: Computing for Belle

Rental system (2001-2005)

[Chart: total cost of the 2001-2005 rental system over five years (M euro), broken down into DAQ interface, group servers, user terminals, tape system, disk systems, Sparc servers, PC servers, network and support; individual items range from 0.4 to 4.2M euro, totalling about 19M euro]

Page 20: Computing for Belle

Sparc and Intel CPUs

Belle's reference platform: Solaris 2.7; everyone has an account.
9 workgroup servers (500 MHz, 4 CPUs) and 38 compute servers (500 MHz, 4 CPUs) with the LSF batch system.
40 tape drives (2 each on 20 servers) with fast access to the disk servers.
20 user workstations with DAT, DLT and AIT drives.
Maintained by Fujitsu SE/CEs under the big rental contract.

Compute servers (@KEK, Linux RH 6.2/7.2/9)
User terminals (@KEK, to log onto the group servers): 120 PCs (~50 Win2000 + X window software, ~70 Linux)
User analysis PCs (@KEK, unmanaged)
Compute/file servers at universities: a few to a few hundred at each institution, used for generic MC production as well as physics analyses at each institution (the tau analysis center @ Nagoya U., for example)

Page 21: Computing for Belle

PC farm of several generations

Dell 36 PCs (Pentium III ~0.5 GHz)
Compaq 60 PCs (Pentium III 0.7 GHz x 4): 168 GHz
Fujitsu 127 PCs (Pentium III 1.26 GHz x 2): 320 GHz
Appro 113 PCs (Athlon 1.67 GHz x 2): 377 GHz
NEC 84 PCs (Xeon 2.8 GHz x 2): 470 GHz
Fujitsu 120 PCs (Xeon 3.2 GHz x 2): 768 GHz
Dell 150 PCs (Xeon 3.4 GHz x 2): 1020 GHz (coming soon!)

[Chart: integrated luminosity and total number of GHz vs. time, June 1999 to June 2004]

Page 22: Computing for Belle

Disk servers @KEK

8 TB NFS file servers, mounted on all UltraSparc servers via GbE.
4.5 TB staging disk for the HSM, mounted on all UltraSparc/Intel PC servers.
~15 TB local data disks on PCs; generic MC files are stored there and used remotely.
Inexpensive IDE RAID disk servers:
160 GB x (7+1) x 16 = 18 TB @ 100K euro (12/2002)
250 GB x (7+1) x 16 = 28 TB @ 110K euro (3/2003)
320 GB x (6+1+S) x 32 = 56 TB @ 200K euro (11/2003)
400 GB x (6+1+S) x 40 = 96 TB @ 250K euro (11/2004)
• With a tape library in the back end

Page 23: Computing for Belle

Tape libraries

Direct-access DTF2 tape library:
40 drives (24 MB/s each) on 20 UltraSparc servers; 4 drives for data taking (one is used at a time). The tape library can hold 2500 tapes (500 TB). We store raw data and DST using
• the Belle tape I/O package (allocate, mount, unmount, free),
• a perl script (files-to-tape database), and
• LSF (for exclusive use of the tapes).

HSM back end with three DTF2 tape libraries:
Three 40 TB tape libraries are the back end of the 4.5 TB disks. Files are written and read as if they were all on disk.
• When the library becomes full, we move tapes out of it, and human operators re-insert tapes when users request them by e-mail.
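In pseudo-API form, a job's use of a drive follows the steps named above (the function names below are stand-ins, not the real Belle tape I/O interface; LSF is assumed to have granted the job exclusive use of one drive, and the tape volume comes from the files-to-tape database):

// Stub sketch of the direct tape access workflow (illustration only).
#include <cstdio>
#include <string>

static int  tape_allocate()                              { std::printf("allocate drive\n"); return 1; }
static void tape_mount(int d, const std::string& vol)    { std::printf("mount %s on drive %d\n", vol.c_str(), d); }
static void tape_read_file(int d, const std::string& f)  { std::printf("read %s on drive %d\n", f.c_str(), d); }
static void tape_unmount(int d)                          { std::printf("unmount drive %d\n", d); }
static void tape_free(int d)                             { std::printf("free drive %d\n", d); }

int main() {
  int drive = tape_allocate();                 // reserve a drive for this job
  tape_mount(drive, "DTF2-01234");             // volume name from the tape database (hypothetical)
  tape_read_file(drive, "e000037r001000.raw"); // file name is illustrative
  tape_unmount(drive);
  tape_free(drive);                            // release the drive for the next LSF job
  return 0;
}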

Page 24: Computing for Belle

Data size so far

Raw data: 400 TB written since Jan. 2001 for 230 fb-1 of data, on 2000 tapes.
DST data: 700 TB written since Jan. 2001 for 230 fb-1 of data, on 4000 tapes, compressed with zlib.
MDST data (four-vectors, vertices and PID): 15 TB for 287 fb-1 of hadronic events (BBbar and continuum), compressed with zlib; two-photon: add 9 TB for 287 fb-1.

[Chart: DTF(2) tape usage, number of tapes (0-8000) vs. year, 2001-2004]
DTF (40 GB) tapes from 1999-2000; DTF2 (200 GB) tapes from 2001-2004.
Total ~1 PB on DTF(2) tapes, plus 200+ TB on SAIT.

Page 25: Computing for Belle

Mass storage strategy

Development of the next generation of DTF drives was canceled. SONY's new mass storage system will use SAIT drive technology (metal tape, helical scan).

We decided to test it: we purchased a 500 TB tape library and installed it as the back end (HSM) of newly acquired inexpensive IDE-based RAID systems and PC file servers.

We are moving from direct tape access to a hierarchical storage system. We have learned that automatic file migration is quite convenient, but we need a lot of capacity so that we do not need operators to mount tapes.

Page 26: Computing for Belle

Compact, inexpensive HSM

The front-end disk system consists of 8 dual-Xeon PC servers with two SCSI channels each, each channel connecting one 16 x 320 GB IDE-disk RAID system.
• Total capacity is 56 TB (1.75 TB x (6+1+S) x 2 x 2 x 8)

The back-end tape system consists of a SONY SAIT PetaSite tape library in a three-rack-wide space: the main unit (four drives) plus two cassette consoles, with a total capacity of 500 TB (1000 tapes).

They are connected by a 16-port 2 Gbit FC switch. The system was installed in Nov. 2003 ~ Feb. 2004 and is working well.

We keep all new data on this system. There have been lots of disk failures (and even power failures), but the system is surviving.

Please see my poster

Page 27: Computing for Belle

PetaServe: HSM software

Commercial software from SONY. We have been using it since 1999 on Solaris.

It now runs on Linux (RH7.3 based). It uses the Data Management API of XFS, developed by SGI.

When the drive and the file server are connected via the FC switch, the data on disk can be written directly to tape. The minimum size for migration, and the size of the first part of a file that remains on disk, can be set by users. The backup system, PetaBack, works intelligently with PetaServe: in the next version, if a file has been shadowed (one copy on tape, one copy on disk), it will not be backed up, saving space and time.

So far it is working extremely well, but:
Files must be distributed among the disk partitions by ourselves (now 32 partitions; in three months many more, as there is a 2 TB limit…).
The disk is a file system and not a cache, so the HSM disk cannot serve as an effective scratch disk.
• There is a mechanism for nightly garbage collection if users delete the files by themselves.
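One simple way to do such manual placement is to hash the file name over the available partitions, as in the sketch below (illustration only; the partition paths and the policy actually used by Belle are assumptions):

// Spread files over the HSM disk partitions by hashing the file name,
// so no single partition fills up first (illustrative sketch).
#include <sstream>
#include <string>
#include <cstdio>

// FNV-1a string hash
static unsigned int fnv1a(const std::string& s) {
  unsigned int h = 2166136261u;
  for (std::string::size_type i = 0; i < s.size(); ++i) {
    h ^= static_cast<unsigned char>(s[i]);
    h *= 16777619u;
  }
  return h;
}

// Map a file name to one of npart partitions, e.g. /hsm/part7/<file> (paths are made up).
static std::string placement(const std::string& file, unsigned int npart) {
  std::ostringstream os;
  os << "/hsm/part" << (fnv1a(file) % npart) << "/" << file;
  return os.str();
}

int main() {
  std::printf("%s\n", placement("e000037r001000.mdst", 32).c_str());
  return 0;
}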

Page 28: Computing for Belle

Extending the HSM system

The front-end disk system will include ten more dual-Xeon PC servers with two SCSI channels each, each channel connecting one 16 x 400 GB RAID system.
• Total capacity is 150 TB (1.7 TB x (6+1+S) x 2 x 2 x 8 + 2.4 TB x (6+1+S) x 2 x 2 x 10)

The back-end tape system will include four more cassette consoles and eight more drives;
• total capacity is 1.29 PB (2500 tapes).

A 32-port 2 Gbit FC switch will be added (32+16 ports, interconnected with 4 ports). Belle hopes to migrate all DTF2 data into this library by the end of 2005.

20 TB/day! Data must be moved from the old tape library to the new tape library.

Page 29: Computing for Belle

Floor space required

[Floor plan comparing the DTF2 library systems with the new SAIT library system:
DTF2 library (~500 TB): ~141 m2; BHSM (~150 TB): ~31 m2; backup (~13 TB): ~16 m2]

Capacity and space required:
DTF2: 663 TB, 130 m2
SAIT: 1.29 PB, 20 m2

Page 30: Computing for Belle

Use of LSF

We have been using LSF since 1999 and started using LSF on the PCs in 2003/3. Of the ~1400 CPUs (as of 2004/11), ~1000 are under LSF and are used for DST production, generic MC generation, calibration, signal MC generation and user physics analyses.

DST production uses its own distributed computing (dbasf), and its child nodes do not run under LSF. All other jobs share the CPUs and are dispatched by LSF. For users we use fair-share scheduling.

We will be evaluating new features of LSF 6.0, such as reporting and multi-cluster support, hoping to use it as a collaboration-wide job management tool.

Page 31: Computing for Belle

Rental system plan for Belle

We have started the lengthy process of computer acquisition for 2006 (-2010):
Compute servers totalling 100,000 SPEC CINT2000 rate (or whatever…)
10 PB storage system, with extensions possible
Network fast enough to read/write data at 2-10 GB/s (2 for DST production, 10 for physics analysis)
A user-friendly and efficient batch system that can be used collaboration-wide
Hope to make it a Grid-capable system

Interoperate with systems at remote institutions which are involved in LHC grid computing

Careful balancing of lease and yearly purchases

Page 32: Computing for Belle

Production

Page 33: Computing for Belle

Online production and KEKB

As we take data, we write the raw data directly to tape (DTF2); at the same time we run the DST production code (event reconstruction) on 85 dual Athlon 2000+ PC servers.

See Itoh-san's talk on the RFARM on the 29th in the Online Computing session.

Using the results of the event reconstruction, we send feedback to the KEKB accelerator group, such as the location and size of the collision region, so that they can tune the beams to maximize the instantaneous luminosity while keeping them colliding at the center. We also monitor the BBbar cross section very precisely, and we can change the machine beam energies by 1 MeV or so to maximize the number of BBbars produced.

The resulting DST are written on temporary disks and skimmed for detector calibration

Once detector calibration is finished, we run the production again and make DST/mini-DST for final physics analysis

[Plots: vertex z position and luminosity, showing the effect of changing the “RF phase” by an amount corresponding to 0.25 mm]

Page 34: Computing for Belle

Online DST production system

[Diagram: the online DST production system spans the E-hut, control room, server room and Computer Center, connected by 1000Base-SX and 100Base-TX. Components: the Belle event builder, DTF2 tape drives, 85 dual Athlon 1.6 GHz nodes (170 CPUs, all running Linux) behind 3Com 4400 switches, an input distributor, an output collector, a control node, and disk server/DNS/NIS nodes (Dell CX600, Planex FMG-226SX switch, dual PIII 1.3 GHz and dual Xeon 2.8 GHz servers).]

Page 35: Computing for Belle

Online data processing

Runs 39-53 (Sept. 18, 2004): 211 pb-1 accumulated; 9,029,058 events accepted at a rate of 360.30 Hz in a run time of 25,060 s. Peak luminosity was 9x10^33; we started running on Sept. 9.

Level 3 software trigger: recorded 4,710,231 events (52%), using 186.6 GB.

DTF2 tape library; RFARM: 82 dual Athlon 2000+ servers. Total CPU time 1,812,887 s; L4 output 3,980,994 events, of which 893,416 are hadrons; DST 381 GB (95 KB/event), written to the new SAIT HSM.

Skims: pair 115,569 (5,711 written), Bhabha 475,680 (15,855), tight hadron 687,308 (45,819).

More than 2M BBbar events!

Page 36: Computing for Belle

DST production

Reprocessing strategy
Goal: 3 months to reprocess all data using all KEK compute servers
• Often we have to wait for constants
• Often we have to restart due to bad constants
• Efficiency: 50~70%

History
2002: major software updates
• 2002/7: reprocessed all data taken until then (78 fb-1 in three months)
2003: no major software updates
• 2003/7: reprocessed data taken since 2002/10 (~60 fb-1 in three months)
2004: SVD2 was installed in summer 2003
• The software is all new and was being tested until mid May
• 2004/7: reprocessed data taken since 2003/10 (~130 fb-1 in three months)

Page 37: Computing for Belle

[Chart: elapsed time (actual hours) of the reprocessing chain (setup, database update, ToF (RFARM), dE/dx) compared with the Belle incoming data rate (1 fb-1/day)]

Page 38: Computing for Belle

Comparison with previous system

[Chart: luminosity processed per day (pb-1) vs. day, comparing the new (2004) system with the old (2003) one]

See Adachi/Ronga poster

Using 1.12 THz of total CPU we achieved >5 fb-1/day

Page 39: Computing for Belle

Generic MC production

Mainly used for physics background studies. 400 GHz of Pentium III produces ~2.5 fb-1/day; 80~100 GB/fb-1 of data in the compressed format. No intermediate hits (GEANT3 hits/raw) are kept.

When a new release of the library comes, we try to produce a new generic MC sample.

For every real data-taking run, we try to generate 3 times as many events as in the real run, taking run dependence into account: detector backgrounds are taken from random-trigger events of the run being simulated.

At KEK, if we use all CPUs, we can keep up with the raw data taking (x3 MC production).

We ask remote institutions to generate most of the MC events. We have generated more than 2x10^9 events so far using qq.

100 TB for 3 x 300 fb-1 of real data; we would like to keep it on disk.

[Chart: millions of MC events produced since Apr. 2004]

Page 40: Computing for Belle

Random number generator

Pseudo-random numbers can be difficult to manage when we must generate billions of events in many hundreds of thousands of jobs (each event consumes about one thousand random numbers in one millisecond). Encoding the time of day as the seed may be one solution, but there is always a danger that one sequence is the same as another. We tested a random-number generator board that uses thermal noise.

The board can generate up to 8M real random numbers per second. We will put random-number servers on several PC servers so that we can run many event-generator jobs in parallel.
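A client of such a random-number server might look like the following sketch (the protocol, host and port are hypothetical; the talk does not describe the actual service): each event-generator job asks the server for a block of hardware-generated numbers instead of deriving seeds from the wall clock.

// Sketch of a client for a random-number server (assumed protocol).
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdio>
#include <cstring>
#include <vector>

static std::vector<unsigned int> fetch_randoms(const char* host, int port, int n) {
  std::vector<unsigned int> buf(n);
  int fd = socket(AF_INET, SOCK_STREAM, 0);
  if (fd < 0) return buf;
  sockaddr_in addr;
  std::memset(&addr, 0, sizeof(addr));
  addr.sin_family = AF_INET;
  addr.sin_port   = htons(port);
  inet_pton(AF_INET, host, &addr.sin_addr);
  if (connect(fd, (sockaddr*)&addr, sizeof(addr)) == 0) {
    unsigned int want = htonl(n);                 // assumed request format: count of numbers
    write(fd, &want, sizeof(want));
    size_t total = n * sizeof(unsigned int), got = 0;
    char* p = reinterpret_cast<char*>(&buf[0]);
    while (got < total) {                         // read the full block
      ssize_t r = read(fd, p + got, total - got);
      if (r <= 0) break;
      got += r;
    }
  }
  close(fd);
  return buf;
}

int main() {
  // e.g. fetch 1000 numbers to seed generator jobs; the server address is hypothetical
  std::vector<unsigned int> seeds = fetch_randoms("192.168.1.10", 9999, 1000);
  std::printf("fetched %lu numbers\n", (unsigned long)seeds.size());
  return 0;
}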

Page 41: Computing for Belle

Network/data transfer

Page 42: Computing for Belle

Networks and data transfer

The KEKB computer system has a fast internal network for NFS.

We have added Belle-bought PCs, now more than 500 PCs and file servers. We have connected Super-SINET dedicated 1 Gbps lines to four universities. We have also requested that this network be connected to outside KEK for tests of Grid computing.

Finally, a new Cisco 6509 has been added to separate the above three networks.

A firewall and login servers make data transfer miserable (100 Mbps max). DAT tapes are used to copy compressed hadron files and MC generated by outside institutions. Dedicated GbE networks to a few institutions are now being added, and a total of 10 Gbit to/from KEK is being added. The network to most collaborators is still slow.

Page 43: Computing for Belle

Data transfer to universities

[Diagram: data flows from the Belle detector in Tsukuba Hall and the KEK Computing Research Center to Tohoku U., Osaka U., Nagoya U., U. of Tokyo and TIT, and on to Australia, the U.S., Korea, Taiwan, etc.; transfer rates range from 170 GB/day and 400 GB/day (~45 Mbps) up to ~1 TB/day (~100 Mbps), over NFS and a 10 Gbps backbone]

We use Super-SINET, APAN and other international academic networks as the backbone of the experiment.

Page 44: Computing for Belle

Grid Plan for Belle

Belle has not made a commitment to Grid technology (we hope to learn more here).
Belle@KEK has not, but Belle groups at remote institutions might have been involved, in particular when they are also involved in one of the LHC experiments.

Parallel/distributed computing aspect

Nice, but we do have our own solution; event-level (trivial) parallelism works. However, as we accumulate more data, we hope to have more parallelism (several tens to hundreds to thousands of CPUs), so we may adopt a standard solution.

Parallel/distributed file system: yes, we need something better than many UNIX files on many nodes. For each “run” we create ~1000 files; we have had O(10,000) runs so far.

Collaboration-wide issues

WAN aspect
• Network connections are quite different from one institution to another

Authentication
• We use DCE to log in
• We are testing VPN with OTP
• A CA-based system would be nice

Resource management issues
• We want to submit jobs to remote institutions if CPU is available

Page 45: Computing for Belle

Belle's attempts

We have split the stream I/O package so that we can connect to any (Grid) file management package. We have started working with remote institutions (Australia, Tohoku, Taiwan, Korea).

See the presentation by G. Molony on the 30th in the Distributed Computing session.

SRB (Storage Resource Broker, by San Diego Univ.): we have constructed test beds connecting Australian institutes and Tohoku University using GSI authentication. See the presentation by Y. Iida on the 29th in the Fabric session.

Gfarm (AIST): we are using the Gfarm cluster at AIST to generate generic MC events. See the presentation by O. Tatebe on the 27th in the Fabric session.

Page 46: Computing for Belle

Human resources

KEKB computer system + network: supported by the computer center (1 researcher, 3~4 system engineers + 1 hardware engineer, 2~3 operators).

PC farms and tape handling: 1 KEK/Belle researcher, 2 Belle support staff (they help with production as well).

DST/MC production management: 2 KEK/Belle researchers, plus 1 post-doc or student at a time from the collaborating institutions.

Library/constants database: 2 KEK/Belle researchers + the sub-detector groups.

We are barely surviving, and we are proud of what we have been able to do: make the data usable for analyses.

Page 47: Computing for Belle

Super KEKB, S-Belle and Belle computing upgrades

Page 48: Computing for Belle

More questions in flavor physics

The Standard Model of particle physics cannot really explain the origin of the universe.
It does not have enough CP violation; models beyond the SM have many CP-violating phases.
There is no principle to explain the flavor structure in models beyond the Standard Model.
Why is Vcb >> Vub?
What about the lepton sector? Is there a relationship between the quark and neutrino mixing matrices?

We know that the Standard Model (incl. the Kobayashi-Maskawa mechanism) is the effective low-energy description of Nature. However, most likely New Physics lies in the O(1) TeV region.
• LHC will start within a few years and (hopefully) discover new elementary particles.
Flavor-changing neutral currents (FCNC) are suppressed.
• New Physics without a suppression mechanism is excluded up to 10^3 TeV: the New Physics flavor problem.
• Different mechanisms give different flavor structures in B decays (tau and charm as well).

A luminosity upgrade will be quite effective and essential.

Page 49: Computing for Belle

Roadmap of B physics

[Roadmap figure, as a function of time or integrated luminosity:
Discovery of CPV in B decays: Yes!! (2001)
Precise tests of the SM and searches for NP (now, 280 fb-1): sin2φ1, CPV in B→ππ, φ3, Vub, Vcb, b→sγ, b→sll, new states, etc.
Study of NP effects in B and τ decays: anomalous CPV in b→sss
Identification of the SUSY-breaking mechanism (if NP = SUSY; NP discovered at LHC, 2010?)
Concurrent programs: Tevatron (m~100 GeV), LHC (m~1 TeV); KEKB (10^34), SuperKEKB (10^35)]

Page 50: Computing for Belle

Projected luminosity

[Plot: projected integrated luminosity under the SuperKEKB “Oide scenario”, reaching 50 ab-1]

Page 51: Computing for Belle

UT measurements

Two cases at 5 ab-1 (sin2φ1 to 0.013, φ2 to 2.9°, φ3 to 4°, |Vub| to 5.8%):

[Unitarity-triangle plots for the two cases, “New Physics!” and “SM only”, estimated from current results]

Page 52: Computing for Belle

With more and more data

We now have more than 250M B0 on tape. We can not only observe very rare decays but also measure the time-dependent asymmetry of the rare decay modes and determine the CP phases.

[Plots: signal samples of 2716 events and 68 events (enlarged)]

Page 53: Computing for Belle

New CPV phase in b→s

[Plot: time-dependent CP asymmetries in φKs vs. J/ψKs at 5 ab-1; ΔS = Sf – SJ/ψKs. For ΔS = 0.24 (the summer 2003 b→s world average), a 6.2σ effect]

Page 54: Computing for Belle

SuperKEKB: schematics

[Schematic of SuperKEKB: an 8 GeV e- beam at 4.1 A and a 3.5 GeV e+ beam at 9.6 A colliding in Super Belle, with superconducting “Super Quad” final-focus magnets; L ∝ I ξy / βy*, with gain factors of roughly 4 x 3 / 0.5 ≈ 24]

Page 55: Computing for Belle

Detector upgrade: baseline design

[Detector schematic with the baseline upgrade items:
μ / KL detection: scintillator strip/tile
Tracking + dE/dx: small cell / fast gas + larger radius
CsI(Tl), 16X0; pure CsI (endcap)
Si vertex detector: 2 pixel layers + 3 layers of DSSD
SC solenoid: 1.5 T (lower?)
PID: “TOP” + RICH
R&D in progress]

Page 56: Computing for Belle

Computing for Super KEKB

For (even) 10^35 luminosity:
DAQ: 5 kHz, 100 KB/event → 500 MB/s
Physics rate: BBbar @ 100 Hz
10^15 bytes/year: 1 PB/year
800 4 GHz CPUs to keep up with data taking
2000 4 GHz, 4-CPU PC servers
10+ PB storage system (what media?)
300 TB - 1 PB of MDST/year on online data disk

Costing >50M euro?

Page 57: Computing for Belle

Conclusions

Belle has accumulated 274M BBbar events so far (July 2004); the data are fully processed and everyone is enjoying doing physics (tight competition with BaBar).
A lot of computing resources have been added to the KEKB computer system to handle the flood of data.
The management team remains small (fewer than 10 KEK staff, not full time, two of whom are group leaders of sub-detectors, plus fewer than 5 SE/CEs); in particular, the PC farms and the new HSM are managed by a few people.
We are looking for a Grid solution for us at KEK and for the rest of the Belle collaborators.

Page 58: Computing for Belle

Belle jargon

DST: output files of good reconstructed events
MDST: mini-DST, summarized events for physics analysis
(Re)production/reprocess: the event reconstruction process, (re)running all reconstruction code on all raw data; reprocess: process ALL events using a new version of the software
Skim (calibration and physics): pairs, gamma-gamma, Bhabha, Hadron, Hadron, J/ψ, single/double/end-point lepton, b→sγ, hh…
Generic MC (production): a lot of JETSET c, u, d, s pair / QQ / EvtGen generic b→c decay MC events for background study

Page 59: Computing for Belle

Belle Software Library

CVS (no remote check-in/out); check-ins are done by authorized persons.

A few releases per year. Usually it takes a few weeks to settle down after a major release, as we have no strict verification, confirmation or regression procedure so far; it has been left to the developers to check the “new” version of the code. We are now trying to establish a procedure to compare against old versions. All data are reprocessed and all generic MC are regenerated with a new major release of the software (at most once per year, though).

Reprocess all data before the summer conferences:
In April, we have a version with improved reconstruction software.
Do reconstruction of all data in three months.
Tune for physics analysis and MC production.
Final version before October for physics publications using this version of the data.
It takes about 6 months to generate the generic MC samples.

Library cycle (2002~2004): 20020405, 0416, 0424, 0703, 1003, 20030807 (SVD1) (+), 20040727 (SVD2)

Page 60: Computing for Belle

Overview – Main changes

Based on the RFARM

Adapted to off-line production

Dedicated database server

Input from tape

Output to HSM

⇒ Stable, light, fast

[Diagram: a typical DBASF cluster: an input node (bcs03) reading from tape, a master node, several basf worker processes, zfserv, and an output node (bcs202) writing to the HSM]

Page 61: Computing for Belle

Overview – CPU power

Farm 1: bcs101-bcs130, 30 nodes, 4 x 0.766 GHz, 92 GHz; bcs131-bcs160, 30 nodes, 4 x 0.766 GHz, 92 GHz (15% of the total)
Farm 2: bcs201-bcs242, 42 nodes, 2 x 1.7 GHz, 143 GHz; bcs243-bcs284, 42 nodes, 2 x 1.7 GHz, 143 GHz; bcs285-bcs326, 42 nodes, 2 x 1.7 GHz, 143 GHz (34% of the total)
Farm 3: bcs801-bcs840, 40 nodes, 2 x 3.2 GHz, 256 GHz; bcs841-bcs880, 40 nodes, 2 x 3.2 GHz, 256 GHz; bcs881-bcs900, 20 nodes, 2 x 3.2 GHz, 128 GHz (51% of the total)
Total: 286 nodes, 1252 GHz

Page 62: Computing for Belle

Belle PC farms

Have been added as we take data:
’99/06: 16 x 4-CPU 500 MHz Pentium III
’00/04: 20 x 4-CPU 550 MHz Pentium III
’00/10: 20 x 2-CPU 800 MHz Pentium III
’00/10: 20 x 2-CPU 933 MHz Pentium III
’01/03: 60 x 4-CPU 700 MHz Pentium III
’02/01: 127 x 2-CPU 1.26 GHz Pentium III
’02/04: 40 x 700 MHz mobile Pentium III
’02/12: 113 x 2-CPU Athlon 2000+
’03/03: 84 x 2-CPU 2.8 GHz Xeon
’04/03: 120 x 2-CPU 3.2 GHz Xeon
’04/11: 150 x 2-CPU 3.4 GHz Xeon (800)…

We are barely catching up… We must get a few to 20 TFLOPS in the coming years as we take more data.

[Charts: computing resources (total GHz) and integrated luminosity vs. time, June 1999 to June 2004]

Page 63: Computing for Belle

Summary and prospects

Major challenges

Database access (very heavy load at job start)

Input rate (tape servers are getting old…)

Network stability (shared disks, output,…)

Exp   Runs   Time [hour]   Time online [hour]   Luminosity [1/fb]   Rate [1/fb/day]   Rate online [1/fb/day]
33     565        413.5                321.2                20.5              1.19                    1.53
35     489        254.4                137.3                18.8              1.78                    3.29
31    1078        140.0                136.8                20.0              3.43                    3.51
37    1276        507.4                349.1                66.4              3.14                    4.57
All   3408       1349.0                944.3               125.8              2.24                    3.20

Old system: 1.12 1/fb/day (online)

Page 64: Computing for Belle

Production performance (CPU hours)

Farm 8 operational!

> 5.2 fb-1/day