BaBar and the GRID
Roger Barlow for Fergus Wilson
GridPP 13
5th July 2005, Durham
Outline
- Personnel
- Current BaBar Computing Model: Monte Carlo, Data Reconstruction, User Analyses
- Projections of required resources
- BaBar GRID effort and planning: Monte Carlo, User Analysis
BaBar GRID Personnel (2.5 FTEs)
- James Werner, Manchester (GridPP funded)
- Giuliano Castelli, RAL (GridPP funded)
- Chris Brew, RAL (50% GRID)
- Roger Barlow, Manchester (BaBar GRID PI)
- Fergus Wilson, RAL
We do not have an infinite number of monkeys… our goals are therefore constrained.
BaBar Computing Model – Monte Carlo
- Monte Carlo is generated at ~25 sites around the world.
- Database-driven production.
- ~20 kBytes per event; ~10 seconds per event.
- 2.8 billion events generated last year; 99.5% efficient.
- Need 100-150 million events per week.
- MC datasets (ROOT files) are merged and sent to SLAC.
- MC datasets are distributed from SLAC to any Tier 1/2/3 that wants them.
BaBar Computing Model – Data
- 10 MBytes/sec to tape at SLAC.
- Reconstructed at Padova (1.5 fb-1/day).
- "Skimmed" into datasets at Karlsruhe.
- Skimmed datasets (ROOT files) sent to SLAC.
- Datasets are distributed from SLAC to any Tier 1/2/3 that wants them.
- An analysis can be run on a laptop.
BaBar Computing Model – User Analysis
- Location of datasets provided by a mySQL/Oracle database.
- Data/Monte Carlo datasets accessed via the Xrootd file server (load-balancing, fault-tolerant, disk or tape interface).
- Conditions accessed from a proprietary Objectivity database.
[Diagram: user code at a Tier 1/2/3 site reads files through Xrootd and Objectivity, with dataset locations looked up in mySQL.]
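The lookup-then-read flow on this slide can be sketched in a few lines of Python. Everything concrete below is invented for illustration: the catalogue contents (a dict standing in for the mySQL/Oracle database), the redirector hostname, and the path layout; only the shape of the flow (query the catalogue, then build root:// URLs for Xrootd) comes from the slide.

```python
# Stand-in for the mySQL/Oracle catalogue: dataset name -> logical file names.
# Dataset and file names here are hypothetical.
DATASET_CATALOGUE = {
    "SP-1005-R14": ["sp/1005/run1.root", "sp/1005/run2.root"],
}

XROOTD_HOST = "xrootd.example.ac.uk"  # hypothetical Xrootd redirector

def locate(dataset):
    """Return xrootd URLs for every file in the dataset, or [] if unknown."""
    files = DATASET_CATALOGUE.get(dataset, [])
    return ["root://%s//store/%s" % (XROOTD_HOST, f) for f in files]

urls = locate("SP-1005-R14")
```

A user job would then open each `root://` URL through the Xrootd client, which handles the load-balancing and disk/tape staging mentioned above.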
Current Status at RAL Tier 1
- RAL imports data and Monte Carlo every night.
- RAL has the full data and Monte Carlo for 4 of the 15 Analysis Working Groups.
- All disk and tape are full. Importing has stopped. We will have to delete our backups of the data.
- Moving to a disk/tape staging system, but unlikely to keep up with demand.
- CPU underused at the moment.
BaBar Projections
Bottom-up planning driven by luminosity:
- Double dataset by 2006 (500 fb-1)
- Quadruple dataset by 2008 (1000 fb-1)

GridPP Allocation versus Luminosity:

                     2004   2005   2006   2007   2008
  Disk (TB)            75    101     68    112    267
  Tape (TB)            70     43     84     70    121
  Luminosity (fb-1)   250    375    500    750   1000
BaBar Monte Carlo on the GRID
- We have already produced 30 million Monte Carlo events on the GRID at Bristol/RAL/Manchester/RHUL (2004, using globus).
- Now using LCG at RAL:
  - Software is installed via an RPM at sites (provided by the BaBar Italian GRID groups).
  - Job submission/control from RAL.
  - 1.2 million events per week during June 2005. This is 7.5% of BaBar weekly production (during a slow period).
- Will aim to soak up 25% of our Tier 1 allocation with SP, as requested by GridPP. Should do 3-6 million events per week at RAL.
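The figures on this slide can be cross-checked with a back-of-envelope calculation. The ~10 seconds per event comes from the Monte Carlo slide earlier; note the implied ~16 million events/week of total production reflects the slow period mentioned above, against the stated need of 100-150 million per week.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600          # 604800

ral_events_per_week = 1.2e6               # June 2005 figure from the slide
fraction_of_babar = 0.075                 # "7.5% of BaBar weekly production"

# Implied total BaBar weekly production during that (slow) period:
babar_weekly = ral_events_per_week / fraction_of_babar      # ~16 million

# CPUs kept busy full-time at ~10 s/event to sustain RAL's rate:
cpus_busy = ral_events_per_week * 10 / SECONDS_PER_WEEK     # ~20 CPUs
```

So the June 2005 RAL contribution corresponds to only about twenty continuously busy CPUs, which is consistent with the earlier remark that CPU at RAL is underused.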
BaBar Monte Carlo on the GRID – Tier 2
- We are merging the QMUL, Birmingham and Bristol BaBar farms: 240 slow (866 MHz) CPUs.
- We will set up regional Objectivity servers that can be accessed over the WAN. This means Objectivity is not needed at every Tier site.
- We need a large, stable Tier 2 if we are to roll this out beyond RAL. We do not have the manpower to develop the MC and manage lots of small sites.
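As a rough upper bound on what the merged farm could contribute, take the nominal ~10 seconds per event from the Monte Carlo slide. This is an idealised ceiling only: the 866 MHz CPUs are slower than the machines that figure assumes, so real throughput would be noticeably lower.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600   # 604800

cpus = 240                 # merged QMUL/Birmingham/Bristol farm
seconds_per_event = 10     # nominal figure from the Monte Carlo slide

# Idealised ceiling on events/week if every CPU ran flat out:
ceiling = cpus * SECONDS_PER_WEEK // seconds_per_event
```

Even this optimistic ceiling of ~14.5 million events/week is a modest fraction of the 100-150 million events/week BaBar needs, which is why a large, stable Tier 2 matters more than many small sites.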
BaBar GRID Data Analysis
We now have a standard generic initialisation script for all GRID sites. It:
- Sets up the BaBar environment.
- Sets up xrootd/Objectivity.
- Identifies what software releases are available.
- Identifies what conditions are available.
- Identifies what collections of datasets are available.
- Identifies whether the site is set up and/or validated for Monte Carlo production.
BaBar GRID Data Analysis
Prototype Job Submission System (EasyGrid):
- Interfaces to the mySQL database to identify required datasets and allocates them to jobs.
- Submits jobs.
- Resubmits jobs when they fail.
- Resubmits jobs when they fail again.
- Monitors progress.
- Retrieves output (usually ROOT files).
We have analysed 60 million events this way with jobs submitted from Manchester to RAL.
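The resubmit-on-failure behaviour above is, at its core, a bounded retry loop. A minimal sketch, with `submit` as a stand-in for whatever actually drives LCG job submission (the real EasyGrid internals are not shown on this slide, so everything here is illustrative):

```python
def run_with_retries(job, submit, max_attempts=3):
    """Submit a job, resubmitting on failure up to max_attempts times.

    `submit(job)` is any callable returning True on success.
    Returns the attempt number that succeeded, or None if we gave up.
    """
    for attempt in range(1, max_attempts + 1):
        if submit(job):
            return attempt
    return None

# Example backend: fails the first `fail_first` submissions, then succeeds.
class Flaky:
    def __init__(self, fail_first):
        self.calls = 0
        self.fail_first = fail_first
    def __call__(self, job):
        self.calls += 1
        return self.calls > self.fail_first

attempts = run_with_retries("skim-job-42", Flaky(2))  # succeeds on 3rd try
```

In practice the retry policy also needs to distinguish transient grid failures (worth resubmitting) from genuinely broken jobs, which is part of what makes a production system harder than this sketch.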
BaBar GRID Data Analysis
Data Analysis works if you know that the data exists at a particular site. Datasets are not static:
- MC is always being generated.
- Billions of events. Millions of files.
- Thousands (currently 36,000) collections of datasets (arranged by processing release and physics process).
The challenge will be to:
- Interrogate sites about their available data.
- Allocate jobs according to available data and site resources.
- Monitor it all.
First step: shortly, the local mySQL database that identifies the locally available datasets will also know about the availability of datasets at every other site. It can then form the backend of an RLS (Replica Location Service).
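Once the local database knows which sites hold which datasets, the job-allocation problem reduces to a replica lookup: find the sites that already have everything a job needs. A sketch, assuming invented dataset and site names (the real catalogue schema is not described here):

```python
# Stand-in for the replica catalogue: dataset name -> sites holding a copy.
# Names are hypothetical.
REPLICAS = {
    "SP-1005-R14": {"RAL", "SLAC", "Karlsruhe"},
    "data-R16a":   {"SLAC"},
}

def sites_with(datasets):
    """Sites holding *all* requested datasets: candidate job targets."""
    sets = [REPLICAS.get(d, set()) for d in datasets]
    return set.intersection(*sets) if sets else set()

candidates = sites_with(["SP-1005-R14", "data-R16a"])
```

The scheduler would then weigh these candidate sites against their free CPU and queue state, which is the "allocate jobs according to available data and site resources" step above.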
Conclusion
- We are already doing Monte Carlo production on the GRID.
  - We have met all our deliverables.
  - We will start major production at RAL.
  - We need some large Tier 2 sites if this is to go anywhere in the UK.
- We are already doing Data Analysis on the GRID.
  - We have met all our deliverables.
  - Concentrate on sites with BaBar infrastructure and local datasets. Provide WAN-accessible servers.
  - We have a prototype data analysis GRID interface.
  - Still many GRID issues to be tackled before letting normal people near it.
- BUT… the GRID still has to prove it can provide a production-quality service on the timescale of running experiments.