Central Reconstruction System on the RHIC Linux Farm in Brookhaven Laboratory

HEPIX - BNL

October 19, 2004

Tomasz Wlodek - BNL

Background

Brookhaven National Laboratory (BNL) is a multi-disciplinary research laboratory funded by the US government.

BNL is the site of the Relativistic Heavy Ion Collider (RHIC) and four of its experiments: Brahms, Phenix, Phobos and Star.

The RHIC Computing Facility (RCF) was formed in the mid-90s to address the computing needs of the RHIC experiments.

Background (cont.)

The data collected by the RHIC experiments is written to tapes in the HPSS mass storage facility, to be reprocessed at a later time.

In order to automate the process of data reconstruction, a home-grown batch system (“Old CRS”) was developed.

“Old CRS” manages data staging from and to the HPSS system, schedules jobs and monitors their execution.

Due to the growth of the size of the farm, the “old CRS” system does not scale well and needs to be replaced.

RCF facility

Server racks

HPSS storage

The RCF Farm Batch Systems - present

Data Analysis farm

(LSF batch system)

Reconstruction farm

(CRS system)

Berlin wall

New CRS - requirements

Stage input files from HPSS, Unix (and in the future, Grid)

Stage output files to HPSS, Unix (and in the future, Grid)

Capable of running mass production, high degree of automation

Error diagnostics

Bookkeeping

Condor as the scheduler of the new CRS system

Condor comes with Dagman, a meta-scheduler which allows building “graphs” of interdependent batch jobs.

Dagman allows us to construct jobs which consist of several subjobs, which perform data staging operations and data reconstruction separately …

… which in turn allows us to optimize the staging of data tapes to minimize the number of tape mounts.
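The tape-mount point can be illustrated with a small sketch. It assumes that the tape holding each file is known in advance and that tapes are read one at a time; the file names, tape labels and the functions `order_by_tape` and `count_mounts` are hypothetical and not part of CRS.

```python
from collections import defaultdict

def order_by_tape(requests):
    """Group (file_name, tape_id) stage requests so each tape is mounted once.

    Returns a list of (tape_id, [file_name, ...]) in order of first appearance.
    """
    by_tape = defaultdict(list)
    for file_name, tape_id in requests:
        by_tape[tape_id].append(file_name)
    # dicts keep insertion order, so tapes come out in first-seen order
    return list(by_tape.items())

def count_mounts(tape_sequence):
    """Count mounts when tapes are read in the given order, one at a time."""
    mounts = 0
    current = None
    for tape_id in tape_sequence:
        if tape_id != current:
            mounts += 1
            current = tape_id
    return mounts

# Hypothetical requests: files alternate between two tapes.
requests = [
    ("run1.dat", "T01"),
    ("run2.dat", "T02"),
    ("run3.dat", "T01"),
    ("run4.dat", "T02"),
]

naive = count_mounts([tape for _, tape in requests])
grouped = order_by_tape(requests)
optimized = count_mounts([tape for tape, files in grouped for _ in files])
```

Serving the requests in submission order mounts a tape for every file here, while the grouped order mounts each tape once. Because staging is a separate subjob, such reordering does not touch the reconstruction step.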

New CRS batch system

CRS job

HPSS interface

MySQL

Logbook server

HPSS

Anatomy of a CRS job

Parent job (1 per input file)


Main job


Each CRS job consists of several subjobs. Parent jobs (one per input file) are responsible for locating input data and, if necessary, staging them from tapes to the HPSS disk cache. The main job is responsible for the actual data reconstruction and is executed if and only if all parent jobs completed successfully.
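A job of this shape maps naturally onto a Dagman input file, in which `JOB` lines declare nodes and a `PARENT ... CHILD ...` line makes the child wait until every listed parent has succeeded. The sketch below generates such a file. The node names, the submit-file names and the helper `build_dag` are hypothetical and are not taken from the actual CRS code.

```python
def build_dag(job_name, input_files):
    """Return the text of a Dagman input file for one CRS-like job.

    One staging (parent) node is created per input file, plus one main node
    that depends on all of them. Submit-file names are placeholders.
    """
    lines = []
    parents = []
    for index, _ in enumerate(input_files):
        parent = f"{job_name}_stage{index}"
        parents.append(parent)
        lines.append(f"JOB {parent} {parent}.sub")
    main = f"{job_name}_main"
    lines.append(f"JOB {main} {main}.sub")
    if parents:
        # The main node is released only after every parent node succeeds.
        lines.append(f"PARENT {' '.join(parents)} CHILD {main}")
    return "\n".join(lines) + "\n"

dag_text = build_dag("reco", ["a.dat", "b.dat"])
print(dag_text)
```

The resulting text would be written to a file and handed to `condor_submit_dag`; each `.sub` file would be an ordinary Condor submit description.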

……..

Lifecycle of the Main Job

Import input files from NFS or HPSS to local disk

Run user’s executable

Export output files

Check exit code, check if all required data files are present

Did all parent jobs complete successfully?

Perform error diagnostics and recovery.

Update job databases.

At all stages of execution keep track of the production status and update the job/files databases.

yes

no
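One plausible reading of this flowchart is sketched below. The ordering of the steps (in particular, checking the exit code before exporting) is an assumption, and every callable passed in, as well as the status strings, is a hypothetical stand-in for the real CRS components.

```python
def run_main_job(parents_ok, import_inputs, run_executable,
                 outputs_present, export_outputs, record):
    """Walk through the main-job steps and return 'done' or 'failed'.

    parents_ok      -- list of booleans, one per parent (staging) job
    import_inputs   -- copies input files to local disk
    run_executable  -- runs the user's executable, returns its exit code
    outputs_present -- returns True if all required data files exist
    export_outputs  -- ships output files to their destination
    record          -- called with a status string at every stage
    """
    if not all(parents_ok):
        record("parents_failed")      # error diagnostics would start here
        return "failed"
    record("importing")
    import_inputs()
    record("running")
    exit_code = run_executable()
    if exit_code != 0 or not outputs_present():
        record("error_diagnostics")
        return "failed"
    record("exporting")
    export_outputs()
    record("done")
    return "done"

# Usage with trivial stand-ins; the list plays the role of the job database.
history = []
status = run_main_job([True, True], lambda: None, lambda: 0,
                      lambda: True, lambda: None, history.append)
```

Passing `record` as a callable mirrors the note that the production status is tracked at all stages of execution.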

Lifecycle of the HPSS interface subjob

Check if HPSS is available

Submit “stage file” request

Stage successful?

Notify the system about the error. Perform error diagnostics.

(For example: can resubmitting the request help?)

Notify CRS system. Update databases.

no / yes

no / yes
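The staging subjob can be sketched as a bounded retry loop. This assumes that "can resubmitting the request help?" is answered by simply trying again a fixed number of times; the real diagnostics are not described here. The function `stage_file`, its callables and the message strings are all hypothetical.

```python
def stage_file(path, hpss_available, submit_stage, notify, max_attempts=3):
    """Try to stage one file; return True on success, False otherwise.

    hpss_available -- returns True if HPSS can be reached
    submit_stage   -- submits a "stage file" request, returns True on success
    notify         -- called with (path, message) to inform the system
    """
    if not hpss_available():
        notify(path, "hpss_unavailable")
        return False
    for attempt in range(1, max_attempts + 1):
        if submit_stage(path):
            notify(path, "staged")
            return True
        # Assumption: a failed request is worth resubmitting a few times.
        notify(path, f"stage_failed_attempt_{attempt}")
    return False

# Usage: a stand-in request that fails twice and then succeeds.
outcomes = iter([False, False, True])
messages = []
ok = stage_file("/hpss/run1.dat", lambda: True,
                lambda path: next(outcomes),
                lambda path, msg: messages.append(msg))
```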

Logbook manager

Each job should provide its own logfile, for monitoring and debugging purposes.

It is easy to keep a logfile of an individual job running on a single machine.

CRS jobs consist of several subjobs which run independently, on different machines at different times.

In order to synchronize the record keeping, a dedicated logbook manager is needed, which is responsible for compiling reports from individual subjobs into one human-readable logfile.
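As a minimal sketch of the compiling step, assume each subjob sends its reports as (timestamp, host, subjob, message) records. The record layout, the host names and the function `compile_logbook` are hypothetical and say nothing about how the real logbook server works.

```python
def compile_logbook(reports):
    """Merge subjob reports into one chronologically ordered log text.

    reports -- iterable of (timestamp, host, subjob, message) tuples, where
               timestamp is a number such as seconds since job submission
    """
    lines = []
    # Tuples sort by timestamp first, so reports from different machines
    # are interleaved in time order.
    for timestamp, host, subjob, message in sorted(reports):
        lines.append(f"[{timestamp}] {host} {subjob}: {message}")
    return "\n".join(lines)

# Reports arrive out of order from different machines.
reports = [
    (30, "node07", "main", "reconstruction started"),
    (5, "node02", "stage0", "stage request submitted"),
    (12, "node02", "stage0", "file staged"),
]
logbook = compile_logbook(reports)
print(logbook)
```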

CRS Database

In order to keep track of the production, CRS is interfaced to a MySQL database.

The database stores information about each job and subjob status, status of data files, HPSS staging requests, open data transfer links and statistics of completed jobs.
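To make the job and subjob bookkeeping concrete, here is a small sketch. It uses Python's built-in `sqlite3` module as a stand-in for MySQL so that it runs without a server, and the table and column names are invented for illustration; the real CRS schema is not shown in these slides.

```python
import sqlite3

# Hypothetical two-table layout: one row per job, one row per subjob.
SCHEMA = """
CREATE TABLE jobs (
    job_id INTEGER PRIMARY KEY,
    status TEXT NOT NULL
);
CREATE TABLE subjobs (
    subjob_id INTEGER PRIMARY KEY,
    job_id    INTEGER NOT NULL REFERENCES jobs(job_id),
    kind      TEXT NOT NULL,   -- 'stage' (parent) or 'main'
    status    TEXT NOT NULL    -- e.g. 'queued', 'done', 'failed'
);
"""

def open_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def parents_done(conn, job_id):
    """Return True if every 'stage' subjob of the job is marked 'done'."""
    row = conn.execute(
        "SELECT COUNT(*) FROM subjobs "
        "WHERE job_id = ? AND kind = 'stage' AND status != 'done'",
        (job_id,),
    ).fetchone()
    return row[0] == 0

conn = open_db()
conn.execute("INSERT INTO jobs (job_id, status) VALUES (1, 'queued')")
conn.executemany(
    "INSERT INTO subjobs (job_id, kind, status) VALUES (?, ?, ?)",
    [(1, "stage", "done"), (1, "stage", "queued"), (1, "main", "queued")],
)
before = parents_done(conn, 1)   # one staging subjob is still queued
conn.execute(
    "UPDATE subjobs SET status = 'done' WHERE job_id = 1 AND kind = 'stage'"
)
after = parents_done(conn, 1)
```

Tables for data files, staging requests and transfer links would follow the same pattern and are left out to keep the sketch short.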

CRS control panel

The RCF Farm - future

Reconstruction farm

(Condor system)

Analysis farm

(Condor system)

General Remarks

Condor as batch system for HEP

Condor has several nice features which make it very useful as a batch system (for example DAG).

Submitting a very large number of Condor jobs (O(10k)) from a single node can put a very high load on the submit machine, leading to a potential Condor meltdown.

This is not a problem when many users submit jobs from many machines, but for centralized services (data reconstruction, MC production, …) this can become a very difficult issue.

Status of Condor - cntd.

People from the Condor team were extremely helpful with solving Condor problems, when they were on site (at BNL).

Remote support (by e-mail) is slow.

Management barrier

So far HEP experiments have been relatively small (approx. 500 people) by business standards.

Everybody knows everybody.

This is going to change: the next generation of experiments will have thousands of physicists.

Management barrier - cntd.

In such big communities it will become increasingly hard to introduce new software products.

Users have an (understandable) tendency to be conservative and they do not want any changes in the environment in which they work.

Convincing users to switch to a new product and aiding them in the transition will become a much more complex task than inventing the product!
