Central Reconstruction System on the RHIC Linux Farm in Brookhaven Laboratory
HEPIX - BNL
October 19, 2004
Tomasz Wlodek - BNL
Background
Brookhaven National Laboratory (BNL) is a multi-disciplinary research laboratory funded by the US government.
BNL is the site of the Relativistic Heavy Ion Collider (RHIC) and four of its experiments: Brahms, Phenix, Phobos and Star.
The RHIC Computing Facility (RCF) was formed in the mid-1990s to address the computing needs of the RHIC experiments.
Background (cont.)
The data collected by the RHIC experiments is written to tape in the HPSS mass storage facility, to be reprocessed at a later time.
In order to automate the process of data reconstruction, a home-grown batch system (“Old CRS”) was developed.
“Old CRS” manages data staging to and from the HPSS system, schedules jobs and monitors their execution.
As the farm has grown, the “Old CRS” system no longer scales well and needs to be replaced.
RCF facility
[Photos: server racks; HPSS storage]
The RCF Farm Batch Systems - present
[Diagram: Data Analysis farm (LSF batch system) and Reconstruction farm (CRS system), separated by a “Berlin wall”]
New CRS - requirements
Stage input files from HPSS, Unix (and in the future – Grid)
Stage output files to HPSS, Unix (and in the future – Grid)
Capable of running mass production, high degree of automation
Error diagnostics
Bookkeeping
Condor as the scheduler of the new CRS system
Condor comes with Dagman, a meta-scheduler which makes it possible to build “graphs” of interdependent batch jobs.
Dagman lets us construct jobs which consist of several subjobs, so that data staging operations and data reconstruction are performed separately …
… which in turn allows us to optimize the staging of data tapes to minimize the number of tape mounts.
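Why separate staging subjobs help with tape mounts can be illustrated with a small sketch. It is not taken from the talk: it assumes a single tape drive that holds one tape at a time, and the request format (file name, tape ID) and the function names are hypothetical.

```python
from collections import defaultdict


def group_requests_by_tape(requests):
    """Group (file_name, tape_id) staging requests by tape (hypothetical format)."""
    by_tape = defaultdict(list)
    for file_name, tape_id in requests:
        by_tape[tape_id].append(file_name)
    return dict(by_tape)


def count_mounts(tape_sequence):
    """Count mounts, assuming one drive: a mount happens whenever the tape changes."""
    mounts = 0
    previous = None
    for tape_id in tape_sequence:
        if tape_id != previous:
            mounts += 1
            previous = tape_id
    return mounts


# Four requests that alternate between two tapes.
requests = [("a.dat", "T1"), ("b.dat", "T2"), ("c.dat", "T1"), ("d.dat", "T2")]

# Served in arrival order, the tape changes on every request.
naive = count_mounts(tape for _, tape in requests)

# Served tape by tape, each tape is mounted once.
grouped = group_requests_by_tape(requests)
optimized = count_mounts(tape for tape, files in grouped.items() for _ in files)
```

Because staging requests are ordinary subjobs in the graph, they can be reordered in this way before any reconstruction job starts.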
New CRS batch system
[Diagram components: CRS job, HPSS interface, MySQL, Logbook server, HPSS]
Anatomy of a CRS job
[Diagram: several parent jobs (1 per input file) feeding one main job]
Each CRS job consists of several subjobs. Parent jobs (one per input file) are responsible for locating input data and, if necessary, staging them from tape to the HPSS disk cache. The main job is responsible for the actual data reconstruction and is executed if and only if all parent jobs completed successfully.
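The parent/main structure maps directly onto a Dagman input file. The sketch below generates one using the standard `JOB` and `PARENT ... CHILD` statements. The job names and the `.sub` submit-file names are hypothetical, and the talk does not show the DAG files CRS actually produces.

```python
def build_dag(job_name, input_files):
    """Return Dagman input text: one staging parent per input file, one main job."""
    if not input_files:
        raise ValueError("a CRS job needs at least one input file")
    lines = []
    parents = []
    for index, _ in enumerate(input_files):
        parent = f"{job_name}_stage{index}"  # hypothetical naming scheme
        parents.append(parent)
        lines.append(f"JOB {parent} {parent}.sub")
    main = f"{job_name}_main"
    lines.append(f"JOB {main} {main}.sub")
    # Dagman starts the child only after every listed parent has succeeded.
    lines.append(f"PARENT {' '.join(parents)} CHILD {main}")
    return "\n".join(lines) + "\n"
```

For example, `build_dag("reco", ["a.raw", "b.raw"])` yields two staging jobs and one main job that depends on both.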
Lifecycle of the Main Job
Did all parent jobs complete successfully?
yes: Import input files from NFS or HPSS to local disk
Run user’s executable
Export output files
Check exit code, check if all required data files are present
no: Perform error diagnostics and recovery. Update job databases.
At all stages of execution, keep track of the production status and update the job/files databases.
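The main-job steps can be sketched as a single function. This is an illustration under stated assumptions, not the CRS implementation: plain file copies stand in for the NFS/HPSS transfers, and `record` is a hypothetical callback standing in for the job/files database updates.

```python
import os
import shutil
import subprocess
import sys
import tempfile


def run_main_job(parents_ok, inputs, command, required_outputs,
                 work_dir, export_dir, record):
    """Run one main job; return True on success. `record` logs each status change."""
    if not parents_ok:
        # The main job runs only if every parent (staging) job succeeded.
        record("failed: parent job error")
        return False
    record("importing")
    for path in inputs:
        shutil.copy(path, work_dir)  # stand-in for the NFS/HPSS import
    record("running")
    result = subprocess.run(command, cwd=work_dir)
    record("exporting")
    for name in required_outputs:
        source = os.path.join(work_dir, name)
        if os.path.exists(source):
            shutil.copy(source, export_dir)  # stand-in for the export step
    # Check the exit code and that every required file was exported.
    missing = [name for name in required_outputs
               if not os.path.exists(os.path.join(export_dir, name))]
    if result.returncode != 0 or missing:
        record(f"failed: exit code {result.returncode}, missing {missing}")
        return False
    record("done")
    return True
```

Passing the status recorder in as a callback keeps the bookkeeping visible at every stage, as the slide requires, without tying the sketch to a particular database.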
Lifecycle of the HPSS interface subjob
Check if HPSS is available (yes / no)
Submit “stage file” request
Stage successful?
no: Notify the system about the error. Perform error diagnostics. (For example: can resubmitting the request help?)
yes: Notify CRS system. Update databases.
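The staging subjob, including the “can resubmitting help?” question, can be sketched as a bounded retry loop. All callbacks are hypothetical stand-ins for the real HPSS and CRS interfaces, and treating an unavailable HPSS as an immediate error is an assumption, since the flowchart residue does not show where that branch leads.

```python
def stage_file(path, hpss_available, submit_stage_request, notify, max_attempts=3):
    """Try to stage one file; report each outcome through `notify(path, status)`."""
    if not hpss_available():
        # Assumption: an unavailable HPSS is reported as an error right away.
        notify(path, "error: HPSS unavailable")
        return False
    for attempt in range(1, max_attempts + 1):
        if submit_stage_request(path):
            notify(path, "staged")
            return True
        # Simplest possible diagnostic: assume resubmitting might help.
        notify(path, f"error: stage attempt {attempt} failed")
    return False
```

A real diagnostic step would distinguish transient failures from permanent ones (for example a missing file) before resubmitting. The retry limit here only keeps the loop bounded.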
Logbook manager
Each job should provide its own logfile, for monitoring and debugging purposes.
It is easy to keep a logfile of an individual job running on a single machine.
CRS jobs consist of several subjobs which run independently, on different machines at different times.
In order to synchronize the record keeping, a dedicated logbook manager is needed, which is responsible for compiling reports from individual subjobs into one human-readable logfile.
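The core task of a logbook manager, merging reports that arrive from different machines at different times, can be sketched as a sort by timestamp. The report tuple layout and the output format are hypothetical, and the sketch assumes ISO 8601 timestamps so that string order matches time order.

```python
def compile_logbook(reports):
    """Merge (timestamp, host, subjob, message) reports into one readable log."""
    lines = []
    # ISO 8601 timestamps sort correctly as plain strings.
    for timestamp, host, subjob, message in sorted(reports):
        lines.append(f"{timestamp} [{subjob}@{host}] {message}")
    return "\n".join(lines)
```

Reports from a staging subjob and the main job then appear interleaved in the order the events happened, whichever node produced them.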
CRS Database
In order to keep track of the production, CRS is interfaced to a MySQL database.
The database stores information about the status of each job and subjob, the status of data files, HPSS staging requests, open data transfer links and statistics of completed jobs.
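A minimal sketch of such bookkeeping tables follows. The talk does not give the actual CRS schema, so every table and column name here is hypothetical, and the sketch uses Python's built-in `sqlite3` as a stand-in for MySQL so that it runs without a server.

```python
import sqlite3

# Hypothetical schema covering jobs, subjobs, data files and staging requests.
SCHEMA = """
CREATE TABLE jobs (
    job_id INTEGER PRIMARY KEY,
    name   TEXT NOT NULL,
    status TEXT NOT NULL
);
CREATE TABLE subjobs (
    subjob_id INTEGER PRIMARY KEY,
    job_id    INTEGER NOT NULL REFERENCES jobs(job_id),
    kind      TEXT NOT NULL,   -- 'stage' or 'main'
    status    TEXT NOT NULL
);
CREATE TABLE files (
    file_id INTEGER PRIMARY KEY,
    job_id  INTEGER NOT NULL REFERENCES jobs(job_id),
    path    TEXT NOT NULL,
    status  TEXT NOT NULL
);
CREATE TABLE stage_requests (
    request_id INTEGER PRIMARY KEY,
    file_id    INTEGER NOT NULL REFERENCES files(file_id),
    status     TEXT NOT NULL
);
"""


def open_database():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def job_summary(conn):
    """Return a {status: count} mapping over all jobs."""
    return dict(conn.execute("SELECT status, COUNT(*) FROM jobs GROUP BY status"))
```

A query like `job_summary` is the kind of statistic a control panel could display.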
CRS control panel
The RCF Farm – future.
Reconstruction farm
(Condor system)
Analysis farm
(Condor system)
General Remarks
Condor as batch system for HEP
Condor has several nice features which make it very useful as a batch system (for example, DAG).
Submitting a very large number of Condor jobs (O(10k)) from a single node can put a very high load on the submit machine, leading to a potential Condor meltdown.
This is not a problem when many users submit jobs from many machines, but for centralized services (data reconstruction, MC production, …) this can become a very difficult issue.
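The talk does not say how this load problem was handled. One common mitigation for a central submitter is to cap how many jobs sit in the queue at once, sketched below. The callbacks `submit`, `queued_count` and `wait` are hypothetical stand-ins for whatever interface the batch system offers.

```python
def submit_throttled(jobs, submit, queued_count, max_queued, wait):
    """Submit all jobs while keeping at most `max_queued` in the queue at once."""
    pending = list(jobs)
    submitted = 0
    while pending:
        free = max_queued - queued_count()
        if free <= 0:
            wait()  # e.g. sleep until some queued jobs have finished
            continue
        batch, pending = pending[:free], pending[free:]
        for job in batch:
            submit(job)
            submitted += 1
    return submitted
```

The cap trades throughput on the submit side for a bounded load on the submit machine. The right value of `max_queued` would have to be found by measurement.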
Status of Condor – cntd.
People from the Condor team were extremely helpful with solving Condor problems when they were on site (at BNL).
Remote support (by e-mail) is slow.
Management barrier
So far HEP experiments were relatively small (approx. 500 people) by business standards.
Everybody knows everybody.
This is going to change: the next generation of experiments will have thousands of physicists.
Management barrier – cntd.
In such big communities it will become increasingly hard to introduce new software products.
Users have an (understandable) tendency to be conservative, and they do not want any changes in the environment in which they work.
Convincing users to switch to a new product and aiding them in the transition will become a much more complex task than inventing the product!