D. Spiga, L. Servoli, L. Faina (INFN & University of Perugia)

CRAB WorkFlow:

CRAB: CMS Remote Analysis Builder
• A CMS-specific tool, written in Python and developed within INFN, to open the Grid to the masses!
• It is aimed at allowing CMS users to access all the data produced and available, using Grid middleware.
• It should hide the Grid complexity from the CMS user as much as possible.
• It has to be installed on the User Interface (UI), the user access point to the Grid.

User tasks:

Users must develop their analysis code in an interactive environment with the program for CMS Reconstruction (ORCA), and choose a dataset to analyze.

CRAB main functionalities:

Input data discovery: the Computing Elements (CEs) of the sites storing the data are found by querying the central (RefDB) and local (PubDB) databases.
Packaging of user code: creation of a tgz archive with the user code, containing bin, lib and data.
Job creation: the main steps are:
- Wrapper (sh) of the ORCA executable or script: sets up the running environment on the remote resource (WN); performs sanity checks on the WN; handles access to local catalogs and the output;
- Job Description Language (JDL) file creation: the site locations (CE names) are passed to the Resource Broker (RB) as requirements to drive the resource matchmaking (see the sketch after this list);
- Job splitting according to user requests.
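As an illustration of the JDL-creation step, the following is a minimal sketch in Python (not the actual CRAB code: file names, sandbox contents and the Glue attribute used in the Requirements are assumptions) of how the CE list found during data discovery can be turned into matchmaking requirements for the RB:

```python
# Minimal sketch, not the actual CRAB implementation: build a JDL file whose
# Requirements drive the Resource Broker matchmaking towards the CEs that
# host the requested dataset (CE names and Glue attribute are illustrative).
def make_jdl(executable, input_sandbox, ce_list, jdl_path="crab.jdl"):
    requirements = " || ".join(
        'other.GlueCEInfoHostName == "%s"' % ce for ce in ce_list)
    jdl = (
        'Executable    = "%s";\n' % executable +
        'StdOutput     = "crab.out";\n'
        'StdError      = "crab.err";\n'
        'InputSandbox  = {%s};\n' % ", ".join('"%s"' % f for f in input_sandbox) +
        'OutputSandbox = {"crab.out", "crab.err"};\n'
        'Requirements  = %s;\n' % requirements)
    open(jdl_path, "w").write(jdl)
    return jdl_path

# e.g. for two sites publishing the dataset in PubDB:
# make_jdl("crab_wrapper.sh", ["crab_wrapper.sh", "user_code.tgz"],
#          ["ce01.pg.infn.it", "ce.cnaf.infn.it"])
# the resulting file is then submitted with the WMS command:  edg-job-submit crab.jdl
```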

Schematic: at submission time, check-status time and output-retrieval time, CRAB writes to a UDP server that updates a MySQL DB, which feeds a dynamical web page.

CRAB Monitoring:

Technical Implementation:

At three different points of its workflow (submission, check status, output retrieval), CRAB sends UDP packets containing the relevant information to a UDP server, which processes the data and fills a MySQL database. To satisfy the request for a “real-time” monitoring, some of the information stored in the database is also shown on a web page which is automatically updated.
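The sketch below illustrates this mechanism in Python (a minimal example, not the actual CRAB code: host, port, payload format and field names are assumptions):

```python
# Minimal sketch of the UDP-based reporting described above (not the actual
# CRAB code; server address and field names are illustrative assumptions).
import socket

MON_ADDR = ("monitoring.example.infn.it", 9876)   # hypothetical monitoring server

def report(event, **fields):
    """Called at submission, status-check and output-retrieval time:
    send one monitoring record as a fire-and-forget UDP datagram."""
    payload = ";".join(["event=%s" % event] +
                       ["%s=%s" % kv for kv in sorted(fields.items())])
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(payload.encode(), MON_ADDR)
    sock.close()

def serve(port=9876):
    """Receiving side: decode each datagram and fill the database
    (MySQL in the real tool; here the INSERT is only sketched)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    srv.bind(("", port))
    while True:
        data, addr = srv.recvfrom(4096)
        record = dict(item.split("=", 1) for item in data.decode().split(";"))
        # with an open MySQL connection `db` one would insert, e.g.:
        # db.cursor().execute("INSERT INTO jobs (event, user, dataset, ce) "
        #                     "VALUES (%s, %s, %s, %s)",
        #                     (record.get("event"), record.get("user"),
        #                      record.get("dataset"), record.get("ce")))
        print(record)

# e.g. called by CRAB at submission time:
# report("submission", user="someuser", dataset="some_dataset",
#        ce="ce01.pg.infn.it", njobs=50)
```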

To monitor and analyse the use of CRAB, a monitoring tool has been developed to collect data in order to:
- show, in real time, information such as:
  * rate of CRAB-job submission
  * dataset and Computing Element usage
  * provenance of jobs
- answer, with an off-line analysis, questions like (see the query sketch after this list):
  * How efficient is the service?
  * How many users are using CRAB (and how)?
  * Which patterns of data access are emerging (which data are used and where)?
  * Which are the failures/problems of the service?
  * How can the user support be improved?
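Such off-line questions translate into simple aggregations over the monitoring database. The query below is only a sketch, assuming a table `jobs` with `submission_time` and `grid_failure` columns (not the actual schema of the tool):

```python
# Sketch of an off-line analysis query (table and column names are assumed):
# weekly success rate, i.e. the fraction of jobs that did not fail for
# Grid/infrastructure reasons, together with the weekly submission count.
WEEKLY_SUCCESS_RATE = """
    SELECT YEARWEEK(submission_time)        AS week,
           SUM(grid_failure = 0) / COUNT(*) AS success_rate,
           COUNT(*)                         AS submitted_jobs
    FROM jobs
    GROUP BY week
    ORDER BY week
"""

# with e.g. the MySQLdb module and an open connection `db`:
# cur = db.cursor()
# cur.execute(WEEKLY_SUCCESS_RATE)
# for week, rate, njobs in cur.fetchall():
#     print(week, rate, njobs)
```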

Left and right histograms show the CE and dataset/owner usage. Each bar represents the total number of jobs and is divided into three categories:
- jobs with ORCA exit code equal to 0 (green)
- jobs with ORCA exit status different from 0 (yellow)
- jobs that could not run due to Grid problems (red)

On-line web page

The role of the web pages is to show automatically updated quantities: mainly how many jobs are submitted, where the jobs run, which input data are requested and which User Interfaces have been used. All data can be shown for different time intervals.

The number of jobs submitted each month.

By analysing the data collected by the monitoring tool it is possible to understand in depth the behaviour of the system. The study of the time evolution of several quantities allows conclusions to be drawn on the use and the performance of the service.

From July 2005 to January 2006 about 400,000 CRAB jobs were submitted to the Grid. The above histogram shows the weekly submission rate for the LCG (dashed blue) and OSG (green) grid infrastructures.

Time integral of the number of different Computing Elements where CRAB jobs have run.

This plot shows the increase in the number of sites that store and make data available for CMS distributed analysis.

Time integral of the number of different User Interfaces that have used CRAB.

This plot shows the diffusion of the tool among users. A constant increase is evident.

References

1. CRAB project: http://cmsdoc.cern.ch/cms/ccs/wm/www/Crab
2. The CMS experiment: http://cmsdoc.cern.ch
3. LCG Project: http://lcg.web.cern.ch/LCG and “LCG Technical Design Report”, CERN-TDR-01, CERN-LHCC-2005-024, June 2005
4. OSG Project: http://www.opensciencegrid.org
5. ORCA project: http://cmsdoc.cern.ch/orca
6. PubDB project: http://cmsdoc.cern.ch/swdev/viewcvs/viewcvs.cgi/OCTOPUS/PubDB
7. “Job Description Language HowTo”, December 17, 2001, available at http://server11.infn.it/workload-grid/docs/DataGrid-01-TEN-0102-02-Document.pdf

Acknowledgements

We wish to thank Hassen Riahi and the CRAB team, who provided support during the development and deployment of the monitoring tool.

Weekly success rate for CRAB jobs. The quantity plotted is the ratio between the jobs that do not fail for infrastructure reasons (green & yellow) and the total number of jobs.

Weekly request rate for different datasets.

This plot gives an indication of how many datasets (currently about 390) are requested by the users.

http://cmsgridweb.pg.infn.it/crab/crabmon.php

Monitoring of job status: check the status of jobs on user demand.
Output retrieval and handling of user output: copy to the UI, to a generic Storage Element (SE) or to any host with a gsiftp server (e.g. CASTOR).
Job resubmission: if a job suffers a Grid failure (Aborted or Cancelled status), it can be submitted again (a minimal sketch of this logic follows).
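A minimal sketch of the resubmission logic, assuming hypothetical helper functions (`get_status`, `submit`) rather than CRAB's actual interface:

```python
# Sketch only: jobs whose Grid status marks an infrastructure failure are
# sent back to the WMS; user (ORCA) failures are left alone.
GRID_FAILURES = ("Aborted", "Cancelled")

def resubmit_failed(jobs, get_status, submit):
    """Re-submit every job that suffered a Grid failure; return those jobs."""
    resubmitted = []
    for job in jobs:
        if get_status(job) in GRID_FAILURES:
            submit(job)                # e.g. wraps edg-job-submit on the UI
            resubmitted.append(job)
    return resubmitted
```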

Off-line analysis

CRAB usage and jobs flow monitoring

Job submission to the Grid: via the Workload Management System (WMS) command edg-job-submit.