Internship report by Žygimantas Gatelis Supervisor: Giovanni Franzoni
1
DEVELOPMENT OF A SERVICE TO MANAGE REQUESTS OF DATA AND MONTE CARLO SAMPLES OF DEVELOPMENT AND VALIDATION AT CMS
Internship report by Žygimantas Gatelis
Supervisor: Giovanni Franzoni
Home institution: Vilnius University
2
Project
Project name: RelVal Machine
RelVal: request samples dedicated to testing and validating either novel event reconstruction updates or updates to the calibration constants and alignment parameters.
RelVal Machine: a web application to manage requests of data and Monte Carlo samples for development and validation in CMSSW.
3
Problem
How do users perform tests now?
Users currently rely on the runTheMatrix.py tool for tests, validation, and submission for production.
runTheMatrix.py: a Python tool that stores the configuration of common sample production requests in .py files and simplifies running a particular workflow.
cmsRun: a command that performs both the simulation of the CMS detector and the reconstruction, then returns the results in .root file format.
cmsDriver.py: a tool that generates the complex configuration for cmsRun. It works as an interface to cmsRun.
When runTheMatrix.py runs, it creates a cmsDriver.py command, which in turn generates the configuration for cmsRun.
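To make this chain more concrete, here is a minimal Python sketch of how a set of flag/value pairs could be assembled into a cmsDriver.py command line. The helper name build_cmsdriver_command, the step name and the flag values are assumptions chosen for illustration; this is not the actual runTheMatrix.py code.

```python
import shlex

def build_cmsdriver_command(step_name, options):
    """Assemble a cmsDriver.py command line from (flag, value) pairs.

    Hypothetical helper for illustration; a value of None marks a flag
    that takes no argument.
    """
    parts = ["cmsDriver.py", step_name]
    for flag, value in options:
        parts.append(flag)
        if value is not None:
            parts.append(str(value))
    # Quote each token so the result can be pasted into a shell safely.
    return " ".join(shlex.quote(part) for part in parts)

# Illustrative values only; real requests define their own flags.
command = build_cmsdriver_command(
    "step1",
    [("-s", "GEN,SIM"), ("-n", 10), ("--no_exec", None)],
)
print(command)
```

Keeping the options as data rather than as a ready-made string is what makes it possible to store, edit and clone them later.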
5
Problem
runTheMatrix.py drawbacks
Only a limited number of requests is available.
If a user wants to submit a custom request, he or she has to insert the new request into the runTheMatrix.py code by hand.
If a custom request is not pushed to git, it will be lost after the next code checkout.
It has only a command-line interface.
6
Purpose of RelVal Machine
RelVal Machine stores requests in a database
All users can add new requests or edit existing ones. Users will not lose their requests because they are stored in the database; they can reuse them later or clone them with some changes. RelVal Machine solves the request bookkeeping problem: users can keep track of their own requests, clone requests made by someone else, and customize them.
Using RelVal Machine, users can perform tests with the existing infrastructure
RelVal Machine integrates with the runTheMatrix.py tool in order to perform tests. Using the existing infrastructure is a big plus: we know that it works and that it generates a correct configuration for cmsRun.
Automated submission of RelVal samples for release validation
There are about 100 bi-weekly RelVal samples that should be submitted in order to perform release validation.
7
Technologies
RelVal Machine architecture
8
Technologies
For the database we chose Oracle DB.
Oracle DB compared with CouchDB:
Support from CERN DBA (they take care of DB backups)
PdmV people have more experience with CouchDB; McM is built on CouchDB
Much faster
Easy to make schema changes (because there is no schema)
Wildcard search is available without external frameworks
Has a nice admin interface, so admins can manage data in the database without a custom interface
9
Technologies
Flask, a Python web framework, was chosen as the main server-side framework.
We chose between Flask and CherryPy:
In general, both frameworks are similar.
Both are lightweight Python server-side frameworks.
The only plus for CherryPy is that it has been used in McM.
Flask has a bigger community and better documentation.
Flask has many plugins that make common tasks easier.
Flask is faster.
10
Technologies
Client-side (frontend) technologies
AngularJS (v1.2.12): a JavaScript framework for creating dynamic web applications, also used in McM.
Bootstrap v3.1.0: a CSS framework, used to make the application look attractive.
A few other JavaScript frameworks to make life easier.
11
What I have done
Created the relational database schema, implemented it, and programmed all required database operations.
Simplified database schema:
Full database schema: https://twiki.cern.ch/twiki/pub/CMS/PdmVRelValMachine/relval_04-22.png
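As a rough illustration of the store, clone and search operations on requests, the following Python sketch uses the standard-library sqlite3 module as a stand-in for Oracle DB. The table and column names (requests, title, command) and the sample titles are assumptions made for this example; they are not taken from the real RelVal Machine schema linked above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory stand-in for the Oracle DB
conn.execute(
    "CREATE TABLE requests ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " title TEXT NOT NULL,"
    " command TEXT NOT NULL)"
)

def add_request(title, command):
    """Insert a request and return its new id."""
    cur = conn.execute(
        "INSERT INTO requests (title, command) VALUES (?, ?)", (title, command)
    )
    conn.commit()
    return cur.lastrowid

def clone_request(request_id, new_title):
    """Copy an existing request under a new title and return the copy's id."""
    row = conn.execute(
        "SELECT command FROM requests WHERE id = ?", (request_id,)
    ).fetchone()
    if row is None:
        raise KeyError(request_id)
    return add_request(new_title, row[0])

def search_by_title(pattern):
    """Wildcard search on the title; '%' matches any run of characters."""
    rows = conn.execute(
        "SELECT id, title FROM requests WHERE title LIKE ? ORDER BY id", (pattern,)
    )
    return rows.fetchall()

original = add_request("TTbar validation", "placeholder command")
copy = clone_request(original, "TTbar validation (clone)")
print(search_by_title("TTbar%"))
```

The parameterized queries (the ? placeholders) keep user-supplied titles out of the SQL text, which matters for a web application that accepts free-form input.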
13
What I have done
Created web pages for data representation with search and sort capability
15
What I have done
Created web pages for inserting, editing and cloning existing data
17
What I have done
RelVal Machine is deployed on a development machine behind CERN SSO (Single Sign-On).
Only users who have a CERN account can access the application.
The application is accessible from outside the CERN network.
Application link: https://cms-pdmv-dev.cern.ch/relvalmachine/
18
What I have done
Integration to existing CMS infrastructure
During integration development, changes were made to the existing runTheMatrix.py tool and pushed to the git repository of the newest CMSSW release.
Programmed the integration with runTheMatrix.py on the RelVal Machine side.
For communication between RelVal Machine and runTheMatrix.py I use a JSON file in which the request information is stored.
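A JSON hand-off of this kind can be sketched in a few lines of Python. The field names below (title, steps, parameters, flag, value) and the file name request.json are hypothetical, since the actual file layout is not given in this report; the sketch only shows the write-then-read round trip.

```python
import json
import os
import tempfile

# Hypothetical request record; the real field names are not given in this report.
request = {
    "title": "example request",
    "steps": [
        {"name": "step1", "parameters": [{"flag": "-n", "value": "10"}]},
    ],
}

def write_request(path, data):
    """Serialize the request so another tool can read it."""
    with open(path, "w") as handle:
        json.dump(data, handle, indent=2)

def read_request(path):
    """Load the request back, as the consuming tool would."""
    with open(path) as handle:
        return json.load(handle)

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "request.json")
    write_request(path, request)
    loaded = read_request(path)

print(loaded == request)
```

A plain file keeps the two sides loosely coupled: neither tool needs to import code from the other, only to agree on the file layout.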
19
What I have done
User requests that take a relatively long time are handled asynchronously.
The user gets a fast response saying that the task has been submitted. The service then runs the runTheMatrix.py command, which takes from 2 to 20 minutes, on a separate thread using an external machine.
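The submit-then-run-in-background pattern can be sketched with Python's standard threading and subprocess modules. This is a simplified illustration under stated assumptions: the names submit, run_task and tasks are hypothetical, a short placeholder command stands in for the real runTheMatrix.py invocation, and the hand-off to an external machine is not modeled.

```python
import subprocess
import sys
import threading
import uuid

tasks = {}  # task id -> {"status": ..., "output": ...}
tasks_lock = threading.Lock()

def run_task(task_id, command):
    """Worker: run the long command and record how it ended."""
    result = subprocess.run(command, capture_output=True, text=True)
    with tasks_lock:
        tasks[task_id]["status"] = "done" if result.returncode == 0 else "failed"
        tasks[task_id]["output"] = result.stdout

def submit(command):
    """Register the task, start the worker thread, and return at once."""
    task_id = uuid.uuid4().hex
    with tasks_lock:
        tasks[task_id] = {"status": "submitted", "output": ""}
    worker = threading.Thread(target=run_task, args=(task_id, command))
    worker.start()
    return task_id, worker

# Stand-in for the real runTheMatrix.py invocation, which is not shown here.
placeholder = [sys.executable, "-c", "print('test finished')"]
task_id, worker = submit(placeholder)
print("submitted", task_id)   # the caller gets this answer immediately
worker.join()                 # only so the demo can show the final state
print(tasks[task_id]["status"])
```

In a web service the join call would be omitted; the client would instead ask for the task status later using the returned task id.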
20
What I have done
Documentation for RelVal Machine
Created a deployment descriptor: detailed instructions on how to deploy the application on a plain virtual machine.
Created documentation of the RelVal Machine application for future developers who will improve it.
The code is stored in a GitHub repository under the PdmV account and is accessible to all.
21
Future improvements
Things that should be improved by future developers
Implement request submission for production.
Search by any field (for now, users can search only by title).
Batch submission for local tests and production.
Better client-side validation.
Improve the step creation page: allow the user to paste a whole command, not only enter flag/value pairs.
The application now has a convenient infrastructure, so it should be easy for future developers to change and refactor the existing code.
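As a starting point for the pasted-command improvement, a whole command line could be split into flag/value pairs roughly as follows. This is a hedged Python sketch, not existing RelVal Machine code: parse_command is a hypothetical helper, the sample command is illustrative, and forms such as --flag=value or values that begin with a dash are not handled.

```python
import shlex

def parse_command(text):
    """Split a pasted command into its leading words and (flag, value) pairs.

    A token starting with '-' is treated as a flag; the next token is its
    value unless that token is itself a flag.
    """
    tokens = shlex.split(text)
    positional, pairs = [], []
    index = 0
    while index < len(tokens):
        token = tokens[index]
        if token.startswith("-"):
            has_value = index + 1 < len(tokens) and not tokens[index + 1].startswith("-")
            pairs.append((token, tokens[index + 1] if has_value else None))
            index += 2 if has_value else 1
        else:
            positional.append(token)
            index += 1
    return positional, pairs

positional, pairs = parse_command("cmsDriver.py step1 -s GEN,SIM -n 10 --no_exec")
print(positional)
print(pairs)
```

Using shlex.split rather than str.split keeps quoted values with spaces together as a single token.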
22
Questions