BaBar and the GRID
Roger Barlow for Fergus Wilson
GridPP 13
5th July 2005, Durham
Outline
- Personnel
- Current BaBar Computing Model: Monte Carlo, Data Reconstruction, User Analyses
- Projections of required resources
- BaBar GRID effort and planning: Monte Carlo, User Analysis
BaBar GRID Personnel (2.5 FTEs)
- James Werner, Manchester (GridPP funded)
- Giuliano Castelli, RAL (GridPP funded)
- Chris Brew, RAL (50% GRID)
- Roger Barlow, Manchester (BaBar GRID PI)
- Fergus Wilson, RAL
We do not have an infinite number of monkeys… our goals are therefore constrained.
BaBar Computing Model – Monte Carlo
- Monte Carlo is generated at ~25 sites around the world.
- Database-driven production.
- ~20 kBytes per event; ~10 seconds per event.
- 2.8 billion events generated last year; 99.5% efficient.
- Need 100-150 million events per week.
- MC datasets (ROOT files) are merged and sent to SLAC.
- MC datasets are distributed from SLAC to any Tier 1/2/3 that wants them.
BaBar Computing Model – Data
- 10 MBytes/sec to tape at SLAC.
- Reconstructed at Padova (1.5 fb-1/day).
- "Skimmed" into datasets at Karlsruhe.
- Skimmed datasets (ROOT files) sent to SLAC.
- Datasets are distributed from SLAC to any Tier 1/2/3 that wants them.
- An analysis can be run on a laptop.
BaBar Computing Model – User Analysis
- Location of datasets provided by a mySQL/Oracle database.
- Data/Monte Carlo datasets accessed via the Xrootd file server (load-balancing, fault-tolerant, disk or tape interface).
- Conditions accessed from a proprietary Objectivity database.
[Diagram: user code at a Tier 1/2/3 site reads files through Xrootd and Objectivity, with dataset locations looked up in mySQL.]
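The lookup-then-read flow on this slide can be sketched in a few lines of Python. Everything concrete below is invented for illustration: the catalogue contents (a dict standing in for the mySQL/Oracle database), the redirector hostname, and the path layout; only the shape of the flow (query the catalogue, then build root:// URLs for Xrootd) comes from the slide.

```python
# Stand-in for the mySQL/Oracle catalogue: dataset name -> logical file names.
# Dataset and file names here are hypothetical.
DATASET_CATALOGUE = {
    "SP-1005-R14": ["sp/1005/run1.root", "sp/1005/run2.root"],
}

XROOTD_HOST = "xrootd.example.ac.uk"  # hypothetical Xrootd redirector

def locate(dataset):
    """Return xrootd URLs for every file in the dataset, or [] if unknown."""
    files = DATASET_CATALOGUE.get(dataset, [])
    return ["root://%s//store/%s" % (XROOTD_HOST, f) for f in files]

urls = locate("SP-1005-R14")
```

A user job would then open each `root://` URL through the Xrootd client, which handles the load-balancing and disk/tape staging mentioned above.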
Current Status at RAL Tier 1
- RAL imports data and Monte Carlo every night.
- RAL has the full data and Monte Carlo for 4 of the 15 Analysis Working Groups.
- All disk and tape are full. Importing has stopped. We will have to delete our backups of the data.
- Moving to a disk/tape staging system, but unlikely to keep up with demand.
- CPU underused at the moment.
BaBar Projections
Bottom-up planning driven by luminosity:
- Double dataset by 2006 (500 fb-1)
- Quadruple dataset by 2008 (1000 fb-1)

GridPP Allocation versus Luminosity:

                     2004   2005   2006   2007   2008
  Disk (TB)            75    101     68    112    267
  Tape (TB)            70     43     84     70    121
  Luminosity (fb-1)   250    375    500    750   1000
BaBar Monte Carlo on the GRID
- We have already produced 30 million Monte Carlo events on the GRID at Bristol/RAL/Manchester/RHUL (2004, using globus).
- Now using LCG at RAL:
  - Software is installed via an RPM at sites (provided by the BaBar Italian GRID groups).
  - Job submission/control from RAL.
  - 1.2 million events per week during June 2005. This is 7.5% of BaBar weekly production (during a slow period).
- Will aim to soak up 25% of our Tier 1 allocation with SP, as requested by GridPP. Should do 3-6 million events per week at RAL.
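The figures on this slide can be cross-checked with a back-of-envelope calculation. The ~10 seconds per event comes from the Monte Carlo slide earlier; note the implied ~16 million events/week of total production reflects the slow period mentioned above, against the stated need of 100-150 million per week.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600          # 604800

ral_events_per_week = 1.2e6               # June 2005 figure from the slide
fraction_of_babar = 0.075                 # "7.5% of BaBar weekly production"

# Implied total BaBar weekly production during that (slow) period:
babar_weekly = ral_events_per_week / fraction_of_babar      # ~16 million

# CPUs kept busy full-time at ~10 s/event to sustain RAL's rate:
cpus_busy = ral_events_per_week * 10 / SECONDS_PER_WEEK     # ~20 CPUs
```

So the June 2005 RAL contribution corresponds to only about twenty continuously busy CPUs, which is consistent with the earlier remark that CPU at RAL is underused.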
BaBar Monte Carlo on the GRID – Tier 2
- We are merging the QMUL, Birmingham and Bristol BaBar farms: 240 slow (866 MHz) CPUs.
- We will set up regional Objectivity servers that can be accessed over the WAN. This means Objectivity is not needed at every Tier site.
- We need a large, stable Tier 2 if we are to roll this out beyond RAL. We do not have the manpower to develop the MC and manage lots of small sites.
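As a rough upper bound on what the merged farm could contribute, take the nominal ~10 seconds per event from the Monte Carlo slide. This is an idealised ceiling only: the 866 MHz CPUs are slower than the machines that figure assumes, so real throughput would be noticeably lower.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600   # 604800

cpus = 240                 # merged QMUL/Birmingham/Bristol farm
seconds_per_event = 10     # nominal figure from the Monte Carlo slide

# Idealised ceiling on events/week if every CPU ran flat out:
ceiling = cpus * SECONDS_PER_WEEK // seconds_per_event
```

Even this optimistic ceiling of ~14.5 million events/week is a modest fraction of the 100-150 million events/week BaBar needs, which is why a large, stable Tier 2 matters more than many small sites.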
BaBar GRID Data Analysis
We now have a standard generic initialisation script for all GRID sites. It:
- Sets up the BaBar environment.
- Sets up xrootd/Objectivity.
- Identifies what software releases are available.
- Identifies what conditions are available.
- Identifies what collections of datasets are available.
- Identifies whether the site is set up and/or validated for Monte Carlo production.
BaBar GRID Data Analysis
Prototype Job Submission System (EasyGrid):
- Interfaces to the mySQL database to identify required datasets and allocates them to jobs.
- Submits jobs.
- Resubmits jobs when they fail.
- Resubmits jobs when they fail again.
- Monitors progress.
- Retrieves output (usually ROOT files).
We have analysed 60 million events this way with jobs submitted from Manchester to RAL.
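The resubmit-on-failure behaviour above is, at its core, a bounded retry loop. A minimal sketch, with `submit` as a stand-in for whatever actually drives LCG job submission (the real EasyGrid internals are not shown on this slide, so everything here is illustrative):

```python
def run_with_retries(job, submit, max_attempts=3):
    """Submit a job, resubmitting on failure up to max_attempts times.

    `submit(job)` is any callable returning True on success.
    Returns the attempt number that succeeded, or None if we gave up.
    """
    for attempt in range(1, max_attempts + 1):
        if submit(job):
            return attempt
    return None

# Example backend: fails the first `fail_first` submissions, then succeeds.
class Flaky:
    def __init__(self, fail_first):
        self.calls = 0
        self.fail_first = fail_first
    def __call__(self, job):
        self.calls += 1
        return self.calls > self.fail_first

attempts = run_with_retries("skim-job-42", Flaky(2))  # succeeds on 3rd try
```

In practice the retry policy also needs to distinguish transient grid failures (worth resubmitting) from genuinely broken jobs, which is part of what makes a production system harder than this sketch.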
BaBar GRID Data Analysis
Data Analysis works if you know that the data exists at a particular site. Datasets are not static:
- MC is always being generated.
- Billions of events. Millions of files.
- Thousands (currently 36,000) collections of datasets (arranged by processing release and physics process).
The challenge will be to:
- Interrogate sites about their available data.
- Allocate jobs according to available data and site resources.
- Monitor it all.
First step: shortly, the local mySQL database that identifies the locally available datasets will also know about the availability of datasets at every other site. It can then form the backend of an RLS (Replica Location Service).
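Once the local database knows which sites hold which datasets, the job-allocation problem reduces to a replica lookup: find the sites that already have everything a job needs. A sketch, assuming invented dataset and site names (the real catalogue schema is not described here):

```python
# Stand-in for the replica catalogue: dataset name -> sites holding a copy.
# Names are hypothetical.
REPLICAS = {
    "SP-1005-R14": {"RAL", "SLAC", "Karlsruhe"},
    "data-R16a":   {"SLAC"},
}

def sites_with(datasets):
    """Sites holding *all* requested datasets: candidate job targets."""
    sets = [REPLICAS.get(d, set()) for d in datasets]
    return set.intersection(*sets) if sets else set()

candidates = sites_with(["SP-1005-R14", "data-R16a"])
```

The scheduler would then weigh these candidate sites against their free CPU and queue state, which is the "allocate jobs according to available data and site resources" step above.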
Conclusion
- We are already doing Monte Carlo production on the GRID.
  - We have met all our deliverables.
  - We will start major production at RAL.
  - We need some large Tier 2 sites if this is to go anywhere in the UK.
- We are already doing Data Analysis on the GRID.
  - We have met all our deliverables.
  - Concentrate on sites with BaBar infrastructure and local datasets. Provide WAN-accessible servers.
  - We have a prototype data analysis GRID interface.
  - Still many GRID issues to be tackled before letting normal people near it.
- BUT… the GRID still has to prove it can provide a production-quality service on the timescale of running experiments.