Software Process
Source: laser.inf.ethz.ch/2013/material/mato/Mato-Lecture4.pdf

Lecture 4 Software Process 1 LASER Summer School on Software Engineering 2013, Elba Island, Italy Pere Mato/CERN Tuesday, September 10, 13


Lecture 4

Software Process


LASER Summer School on Software Engineering 2013, Elba Island, Italy
Pere Mato/CERN

Tuesday, September 10, 13

Page 2: Software Processlaser.inf.ethz.ch/2013/material/mato/Mato-Lecture4.pdfSoftware Process Tools A number of tools are used to support the software process of the HEP community There is

Challenges for HEP
✤ HEP software is larger and more complex than ever
✤ Complex software requires teams: multi-site, multi-disciplinary, multi-layered
✤ HEP software demands long-term maintenance
✤ Testing and validation are vital to ensure quality
✤ Documentation needs are greater
✤ Managing change to the system is critical to capture bug fixes
✤ The abundance of computer configurations (hardware/software, OS) means addressing cross-platform issues


Really Large Teams
✤ CMSSW is the software project for the CMS experiment
✤ About 1000 people have contributed to the code
  ✤ Many have left the collaboration (e.g. PhD students)
  ✤ Maintenance is a big challenge
  ✤ Similar numbers for ATLAS, and smaller for ALICE and LHCb


Software Process Tools
✤ A number of tools are used to support the software process of the HEP community
  ✤ There is no uniformity, but it is encouraging to see some areas of convergence
✤ Categories:
  ✤ Communication and Documentation
    ✤ E.g. Wiki, Web CMS, issue trackers (Savannah, JIRA), Doxygen
  ✤ Revision Control
    ✤ Mainly SVN, moving to Git
  ✤ Build Management
    ✤ Often overlooked, big challenge for the HEP scale ==> see next
  ✤ Testing
    ✤ Test drivers (QMTest, CTest), statistical validation tools, dashboards
  ✤ Release process
    ✤ Big challenge for the HEP computing infrastructure


Build Management
✤ Once we have a design and have started to code, the next step is building the software
  ✤ At the very bottom we need to compile a number of source files and link them into a number of libraries, modules and executables
  ✤ Other artifacts may also be produced as part of the build process (configuration files, metadata, documentation, etc.)
✤ The challenges of building the software
  ✤ Scale
  ✤ Configurability
  ✤ Reproducibility
  ✤ Multi-platform support
  ✤ Make it simple for end-user scientists
  ✤ Automation tools


Scale
✤ For example, the ATLAS code currently contains ~2200 packages with 4 million C++ and 1.4 million Python lines written by hundreds of developers
  ✤ Maintaining hand-written makefiles for each package is unworkable
  ✤ A physicist re-building the full code base is not possible (time, space, ...)
  ✤ Proliferation of branches (integration, experimental, production, patch, analysis, migration, etc.)
✤ Mitigation
  ✤ Build instructions expressed at a higher level than Makefiles (build tools)
  ✤ Builds for various branches and platforms (OS + compiler combinations) made available centrally (by the continuous build and integration test system)
  ✤ Thanks to shared libraries (plugins) the physicist does not need to re-build the complete codebase
  ✤ Easy to build on top of a given release (very special concept)

ATLAS in numbers
Number of branches: 50
Number of platforms: 20
Led up to 379 stable releases in 2011
Build farm: 50 8-core nodes
Full rebuild time: 10 hours


Typical Physicist’s Workflow
✤ Select the version/branch of the experiment software on which to base the development or application execution
  ✤ Typically a single tag name is sufficient
✤ Check out from the code repository the package or packages to be worked on; make changes with a text editor
✤ Build these packages in the user workspace against the pre-built versions of the software
  ✤ Only the modified packages and the ones depending on them should be re-built
  ✤ It pays to have well-architected software that minimizes dependencies
✤ Set up a running environment that uses the packages in the user workspace in preference to the pre-built ones
  ✤ Tools to set variables such as PATH, LD_LIBRARY_PATH, etc.
✤ Run the application with the adequate configuration parameters
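
The partial-build and environment steps of this workflow can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the package names, the dependency table and the directory paths are hypothetical and are not taken from any experiment's tooling. The sketch shows (a) how to find the packages that must be re-built when one package is modified and (b) how a search path can be composed so that the user workspace takes precedence over the pre-built release.

```python
from collections import deque

# Hypothetical dependency table: package -> packages it uses directly.
USES = {
    "Kernel": [],
    "Event": ["Kernel"],
    "Tracking": ["Event"],
    "Calo": ["Event"],
    "Analysis": ["Tracking", "Calo"],
}


def packages_to_rebuild(modified, uses=USES):
    """Return the modified packages plus every package that depends on them."""
    # Invert the table: package -> packages that use it directly.
    used_by = {name: [] for name in uses}
    for name, deps in uses.items():
        for dep in deps:
            used_by[dep].append(name)

    # Breadth-first walk over the "used by" relation.
    rebuild = set(modified)
    queue = deque(modified)
    while queue:
        current = queue.popleft()
        for client in used_by[current]:
            if client not in rebuild:
                rebuild.add(client)
                queue.append(client)
    return rebuild


def search_path(workspace_dir, release_dir, existing=""):
    """Compose a PATH-like string in which the workspace wins over the release."""
    parts = [workspace_dir, release_dir]
    if existing:
        parts.append(existing)
    return ":".join(parts)


if __name__ == "__main__":
    print(sorted(packages_to_rebuild(["Tracking"])))
    # Hypothetical directories, for illustration only.
    print(search_path("/home/user/ws/lib", "/opt/release/v1r0/lib", "/usr/lib"))
```

The fewer clients a package has, the smaller the set returned by `packages_to_rebuild`, which is one way to see why minimizing dependencies keeps partial builds cheap.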


Typical Physicist’s Workflow (2)
✤ Once changes and new code have been tested locally, the physicist may want to run them against the Big Data
  ✤ We need to package his/her modified set of packages and attach them to the jobs to be submitted to the Grid/Cloud
  ✤ Challenges associated with this will be discussed in another lecture
✤ Often, the new or modified code should be incorporated into the experiment software to be used by other colleagues
  ✤ He/she typically commits the code into the central code repository and creates tags/labels to be used by the test and validation system
  ✤ Obviously some lightweight rules need to be obeyed to avoid chaos (see later)


Platform Support
✤ Multi-platform support is one of the main requirements for the HEP software
  ✤ To be able to achieve a very long software lifetime and to be opportunistic about resources
  ✤ In general you obtain better quality software by using different OSs, different compilers, etc. Some software bugs only show up on specific compilers
  ✤ Platform-dependent code is concentrated in very few places
✤ Three main platforms are supported
  ✤ Linux (mainly Scientific Linux) with gcc, icc, clang compilers
    ✤ Scientific Linux with gcc is the main ‘production’ platform
  ✤ Mac OS X with clang and gcc
    ✤ Used mainly by scientists to perform final analysis, software development
  ✤ Windows with Visual C++
    ✤ Not for everything, but for some common packages (e.g. ROOT, Geant4)


Configuration/Reproducibility
✤ When building a package/project for your own consumption on a single system, what you want is that it works
  ✤ You appreciate that configure/make automatically resolves all needed dependencies, using heuristics and best guesses based on what it finds on your system
✤ When building the software of the experiment, which will then be distributed to the world-wide Grid/Cloud, you should not allow any guesses
  ✤ The build should be absolutely reproducible on any other system with different packages installed
  ✤ All package dependencies and their locations should be explicitly specified
✤ Reproducibility of physics results is a very strong requirement
  ✤ Full control of the version of the code used to obtain the physics results
  ✤ Running on different systems (with the same OS + compiler), the same experiment software version should give the same results


Make it Simple
✤ The idea is to simplify the specification of what needs to be built and how, using a high-level language
  ✤ E.g. CMT requirements file
✤ Take care of the default build rules
  ✤ With customizations
✤ Take care of package dependencies


package MyPackage
version v1r0

# Structure, i.e. directories to process.
branches cmt doc src

# Used packages.
use GaudiAlg v*

# Component library building rule
library MyPackage ../src/*.cpp

# define component library link options
apply_pattern component_library library=MyPackage

[Figure: Example: CMT configuration and build system. Diagram elements: CMT requirements (what to build, how to build, package dependencies); code in the SVN repository; makefile; building tools (compilers, linkers, IDEs); libraries & executables]


Many Tools Available
✤ There are many tools to help you build the software
  ✤ E.g. the list from Wikipedia
✤ Some of them are very platform/language specific
  ✤ E.g. Ant (Java), MSBuild (Windows)
✤ Some of them do not scale to very large projects
✤ There are not so many tools with truly multi-platform and multi-language support and good scalability

[Figure: some of the tools listed in Wikipedia]


The GNU build system (autotools)
✤ Automake helps to create portable Makefiles, which are in turn processed with the make utility
  ✤ It takes its input as Makefile.am and turns it into Makefile.in, which is used by the configure script to generate the output Makefile
  ✤ It also performs automatic dependency tracking
✤ The nice thing is that it is very standard in Unix environments
✤ The bad thing is that it works only on Unix (unless using Cygwin on Windows)


Main issues with plain ‘make’
✤ A configuration step to customize the build for a specific platform and system is always required
  ✤ Need to write (generate) a ‘configure’ script to deal with this customization
✤ Keeping track of dependencies on other packages
  ✤ The required package versions need to be located in the system and properly selected
  ✤ It is better to encapsulate the knowledge of how to ‘use’ a package from another
✤ Scale and complexity
  ✤ Maintaining many 100s of hand-crafted Makefiles is clearly unworkable
  ✤ Hierarchical/recursive Makefiles can help, but the build performance clearly suffers a bit
  ✤ Multi-platform support certainly adds complexity to the Makefiles


Proliferation of Homegrown Tools
✤ LHC experiments started by developing/adopting specific tools
  ✤ The main emphasis was to make life easy for developers/scientists
  ✤ Take care automatically of dependencies and versioning rules
  ✤ No tool is supporting (re)building a set of packages in the user workspace against a full pre-built version of the software (partial builds)
✤ Two examples:
  ✤ SCRAM
    ✤ SCRAM (Source Configuration, Release, And Management tool) is the CMS build program. It is responsible for building framework applications and also for making sure that all the necessary shared libraries are available
  ✤ CMT
    ✤ CMT (Configuration Management) is based around the notion of package
    ✤ Provides a set of tools for automating the configuration and building of packages
✤ Both require high-level specifications and end up generating Makefiles


https://twiki.cern.ch/twiki/bin/view/CMSPublic/SWGuideScram

http://www.cmtsite.net


CMake
✤ Many HEP projects and experiments are migrating away from homegrown solutions to tools such as CMake
  ✤ Probably does not have all the bells and whistles, but is more mainstream
✤ CMake is a cross-platform free software program for managing the build process of software using a compiler-independent method
✤ CMake generates native makefiles and workspaces that can be used in the compiler environment of your choice
  ✤ We found that it is important to use the native build environment on each system (e.g. systems like Cygwin are problematic in the end)
✤ This is clearly an opportunity to standardize and increase commonality between HEP experiments


Why CMake?
✤ CMake generates native build environments
  ✤ UNIX/Linux -> Makefiles, Eclipse, Ninja
  ✤ Windows -> Visual Studio, nmake
  ✤ Apple -> Xcode
✤ Open-source, cross-platform, pretty well documented, etc.
✤ Can cope with complex, large build environments
✤ Flexible & extensible
  ✤ Scripting language (Turing-complete)
  ✤ Modules for finding/configuring software
  ✤ Extensible to new platforms and languages
  ✤ Create custom targets/commands
  ✤ Run external programs


CMake simple example
✤ The CMakeLists.txt file describes, by means of commands, what needs to be built in terms of libraries, executables, etc.
✤ The how is taken care of by CMake, with very good defaults and plenty of variables
✤ cmake + make to build the project


#---Setup the project--------------------------------------------------------
cmake_minimum_required(VERSION 2.6 FATAL_ERROR)
project(N01)

#---Find Geant4 package, no UI and Vis drivers activated---------------------
find_package(Geant4 REQUIRED)

#---Setup Geant4 include directories and compile definitions-----------------
include(${Geant4_USE_FILE})

#---Locate sources and headers for this project------------------------------
include_directories(${PROJECT_SOURCE_DIR}/include ${Geant4_INCLUDE_DIR})
file(GLOB sources ${PROJECT_SOURCE_DIR}/src/*.cc)
file(GLOB headers ${PROJECT_SOURCE_DIR}/include/*.hh)

#---Add the executable, and link it to the Geant4 libraries------------------
add_executable(exampleN01 exampleN01.cc ${sources} ${headers})
target_link_libraries(exampleN01 ${Geant4_LIBRARIES} )


CMake Performance
✤ Used ROOT to test the usability, scalability and performance
  ✤ ROOT uses a collection of ‘modular’ Makefiles
  ✤ Tested on SL (5-6), Ubuntu (11-12), Mac (10.6, 10.7, 10.8), Windows (vc9, vc10, no Cygwin required!)
✤ Comparison between Module.mk and CMake
  ✤ Same code version, same machine, ‘Release’ build
  ✤ “make -j8” vs. “make -j8” vs. “ninja”


                  Module.mk    CMake (make)    CMake (ninja) (2)
ROOT (noop)       2.2’’        2.3’’           0.2’’
ROOT (TH2.cxx)    5’’ (1)      9’’             9’’
ROOT (full)       6’38’’       7’32’’          7’33’’

(1) dependent libraries not re-done
(2) http://martine.github.com/ninja/
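
To make the comparison easier to read, the timings in the table can be converted to seconds and expressed relative to Module.mk. The short Python sketch below does only that; the helper names (`to_seconds`, `ratios`) are hypothetical, and the numbers are copied from the table above.

```python
import re


def to_seconds(timing):
    """Convert a timing such as 6'38'' or 2.2'' (minutes' seconds'') to seconds."""
    text = timing.replace("’", "'")  # accept typographic quotes as well
    match = re.fullmatch(r"(?:(\d+)')?(\d+(?:\.\d+)?)''", text)
    if match is None:
        raise ValueError(f"unrecognised timing: {timing!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))


# Rows of the table: (Module.mk, CMake with make, CMake with ninja).
TIMINGS = {
    "noop": ("2.2''", "2.3''", "0.2''"),
    "TH2.cxx": ("5''", "9''", "9''"),
    "full": ("6'38''", "7'32''", "7'33''"),
}


def ratios(row):
    """Return (CMake+make, CMake+ninja) build times relative to Module.mk."""
    module_mk, cmake_make, cmake_ninja = (to_seconds(value) for value in row)
    return cmake_make / module_mk, cmake_ninja / module_mk


if __name__ == "__main__":
    for name, row in TIMINGS.items():
        with_make, with_ninja = ratios(row)
        print(f"{name:8s} make: {with_make:.2f}x  ninja: {with_ninja:.2f}x")
```

For the full build the ratio is 452 s / 398 s, about 1.14, while for the no-op build ninja takes roughly one eleventh of the Module.mk time. Note (1) still applies to the TH2.cxx row, so that row does not compare like with like.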


Using high-level macros


############################################################################
# CMakeLists.txt file for building ROOT hist/hist package
############################################################################

set(libname Hist)

ROOT_USE_PACKAGE(core)
ROOT_USE_PACKAGE(math)
ROOT_USE_PACKAGE(graf2d)
ROOT_USE_PACKAGE(io/io)

ROOT_GENERATE_DICTIONARY(G__${libname} *.h Math/*.h LINKDEF LinkDef.h)

ROOT_GENERATE_ROOTMAP(${libname} LINKDEF LinkDef.h DEPENDENCIES Matrix MathCore)

ROOT_LINKER_LIBRARY(${libname} *.cxx G__${libname}.cxx DEPENDENCIES Matrix MathCore)

ROOT_INSTALL_HEADERS()

In this example package, we build a dictionary and a rootmap file using a LinkDef.h file; the dictionary is then used, together with all the .cxx files, to build a library. The headers will be installed.


Module.mk
✤ Module.mk and CMakeLists.txt provide the same functionality
  ✤ Comparing for the hist/hist package
  ✤ 53 lines vs. 9 lines
✤ CMakeLists.txt is much higher-level
  ✤ In principle it should be easier to maintain


# Module.mk for hist module
# Copyright (c) 2000 Rene Brun and Fons Rademakers
#
# Author: Fons Rademakers, 29/2/2000

MODNAME := hist
MODDIR := $(ROOT_SRCDIR)/hist/$(MODNAME)
MODDIRS := $(MODDIR)/src
MODDIRI := $(MODDIR)/inc

HISTDIR := $(MODDIR)
HISTDIRS := $(HISTDIR)/src
HISTDIRI := $(HISTDIR)/inc

##### libHist #####
HISTL := $(MODDIRI)/LinkDef.h
HISTDS := $(call stripsrc,$(MODDIRS)/G__Hist.cxx)
HISTDO := $(HISTDS:.cxx=.o)
HISTDH := $(HISTDS:.cxx=.h)

HISTH := $(filter-out $(MODDIRI)/LinkDef%,$(wildcard $(MODDIRI)/*.h))
HISTHMAT := $(filter-out $(MODDIRI)/Math/LinkDef%,$(wildcard $(MODDIRI)/Math/*.h))
#HISTHMAT += mathcore/inc/Math/WrappedFunction.h
HISTHH := $(HISTH) $(HISTHMAT)

HISTS := $(filter-out $(MODDIRS)/G__%,$(wildcard $(MODDIRS)/*.cxx))
HISTO := $(call stripsrc,$(HISTS:.cxx=.o))

HISTDEP := $(HISTO:.o=.d) $(HISTDO:.o=.d)

HISTLIB := $(LPATH)/libHist.$(SOEXT)
HISTMAP := $(HISTLIB:.$(SOEXT)=.rootmap)

# used in the main Makefile
ALLHDRS += $(patsubst $(MODDIRI)/%.h,include/%.h,$(HISTHH))
#ALLHDRS += $(patsubst $(MODDIRI)/Math/%.h,include/Math/%.h,$(HISTHH))
ALLLIBS += $(HISTLIB)
ALLMAPS += $(HISTMAP)

# include all dependency files
INCLUDEFILES += $(HISTDEP)

##### local rules #####
.PHONY: all-$(MODNAME) clean-$(MODNAME) distclean-$(MODNAME)

include/Math/%.h: $(HISTDIRI)/Math/%.h
	@(if [ ! -d "include/Math" ]; then \
	   mkdir -p include/Math; \
	fi)
	cp $< $@

include/%.h: $(HISTDIRI)/%.h
	cp $< $@

$(HISTLIB): $(HISTO) $(HISTDO) $(ORDER_) $(MAINLIBS) $(HISTLIBDEP)
	@$(MAKELIB) $(PLATFORM) $(LD) "$(LDFLAGS)" \
	   "$(SOFLAGS)" libHist.$(SOEXT) $@ "$(HISTO) $(HISTDO)" \
	   "$(HISTLIBEXTRA)"

$(HISTDS): $(HISTHH) $(HISTL) $(ROOTCINTTMPDEP)
	$(MAKEDIR)
	@echo "Generating dictionary $@..."
	$(ROOTCINTTMP) -f $@ -c $(HISTHH) $(HISTL)

$(HISTMAP): $(RLIBMAP) $(MAKEFILEDEP) $(HISTL)
	$(RLIBMAP) -o $@ -l $(HISTLIB) \
	   -d $(HISTLIBDEPM) -c $(HISTL)

all-$(MODNAME): $(HISTLIB) $(HISTMAP)

clean-$(MODNAME):
	@rm -f $(HISTO) $(HISTDO)

clean:: clean-$(MODNAME)

distclean-$(MODNAME): clean-$(MODNAME)
	@rm -f $(HISTDEP) $(HISTDS) $(HISTDH) $(HISTLIB) $(HISTMAP)

distclean:: distclean-$(MODNAME)

# Optimize dictionary with stl containers.
$(HISTDO): NOOPT = $(OPT)


Using the CMake tool set
✤ CMake comes with a nice set of companion tools to support the software process
✤ CTest
  ✤ CTest provides special commands in the CMakeLists.txt file to create tests. CTest can then be used to execute the tests, and optionally upload their results to a dashboard server
✤ CDash
  ✤ CDash aggregates, analyzes and displays the results of software testing processes
✤ CPack
  ✤ CPack is a powerful, easy to use, multi-platform software packaging tool


CDash Dashboard
http://cdash.cern.ch/index.php?project=ROOT
[Figure: screenshot of the CDash dashboard for ROOT, annotated with: “group” sections; version + platforms; selecting the date; changes from previous build on the same group]


Example: Geant4 Software Process
[Figure: process diagram with the following elements]
✤ Developers check in code and create tags
✤ SVN repository
✤ Tag DB
✤ Code checkout
✤ Distributed nodes use CMake/CTest to build and run tests
✤ Results posted on Web Server
✤ Developers review results
✤ Supervisor accepts/rejects tags


Main Player: The Developer
✤ Developers are responsible for making sure their code runs correctly and does not negatively affect other project functionality
✤ The developer typically:
  ✤ Checks out, modifies and successfully builds the code (with CMake)
  ✤ Develops and runs unit tests to exercise his/her code in isolation on his/her preferred platform (with CMake/CTest)
  ✤ Commits code to SVN (Git) and creates new “Tags”
  ✤ Inspects the results of running the new code for all Integration Tests and Examples on all supported platforms (with CDash)
  ✤ Takes corrective actions ASAP: rejecting “Tags”, or committing new code with the fixes


Doing Agile Development
✤ Without knowing it and/or formalizing it, we have been doing Agile development
✤ New ‘features’ or ‘functionalities’ are being designed, implemented and tested continuously
✤ Basically the ‘trunk’ revision of each project is always ‘working’


Software Development Cycles
✤ At individual Project/Package Level
  ✤ Continuous integration and testing (in hours, daily)
    ✤ unit tests, regression tests, etc.
  ✤ Monthly reference tags
    ✤ system tests, validation tests, performance tests, etc.
  ✤ 1-2 major releases a year, plus patch releases
    ✤ full validation, full documentation, long term support
✤ At Full Stack Level
  ✤ Continuous integration and testing (daily)
  ✤ Releasing full configurations on demand
  ✤ Extensive validation (very expensive in CPU)


Configuring and Building the Complete Software Stack


Two Sub-Stacks
✤ The complete software stack is divided into two sub-stacks
  ✤ One experiment-independent part, including the external libraries and common projects such as ROOT, Monte Carlo generators, Geant4 and common frameworks
  ✤ One experiment-specific part, with the rest of the projects and packages specific to each experiment
✤ The experiment-independent part is controlled by a single version number that fixes the version of all the other individual packages
  ✤ LCG_XX (LHC Computing Grid configuration number)
✤ Releases/builds of the common part are made on demand of the experiments
  ✤ About 10 releases a year

[Figure: layered software stack with the labels: Applications; Event, DetDesc., Calib.; Experiment Framework; Simulation, Analysis, Data Mngmt.; Core Libraries; non-HEP specific software packages]


LCG Configurations
✤ Configuring, building and deploying external libraries (~100) and MC Generators (~40) for all the supported platforms (~10) used by LHC experiments
  ✤ Releasing full configurations. Content, versions and platforms discussed/agreed with experiments
  ✤ This is a central service that has been working successfully for the last N years
✤ Current implementation
  ✤ [LCG Externals] Set of script-lets (download, configure, build, install, ...) driven by the CMT tool (configuration management) and using the package’s native build commands
  ✤ [MC Generators] Set of scripts with very limited configurability and dependency handling
✤ Moving to a new system based on CMake to overcome a number of difficulties


Main Issues to Improve
✤ MC Generators are more and more dependent on Externals (e.g. Python, Boost, Fastjet, GSL, HepMC, ...)
  ✤ Both sets should be released together
✤ Unification of the scripts and procedures between Generators and Externals
  ✤ Aiming to save resources and time
✤ Better control of package dependencies
  ✤ A binary package depends on its source version and all dependent package versions
✤ Speedup
  ✤ The CMT tool is not particularly efficient for multi-core systems
✤ Automation
  ✤ Aiming to produce new complete LCG releases with a single click
✤ Managing obsolescence
  ✤ Difficult to know if removing old versions from AFS/CernVM-FS is safe
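
The statement that a binary package depends on its source version and on all dependent package versions can be made concrete with a small sketch. The Python code below is not the LCG implementation; it assumes a hypothetical table of packages (all versions except the Boost one are invented for illustration) and derives an identifier for each binary from its own name and version plus the identifiers of its dependencies, so that a version change anywhere below a package changes that package's identifier too.

```python
import hashlib

# Hypothetical configuration: package -> (version, direct dependencies).
PACKAGES = {
    "Python": ("2.7.3", []),
    "Boost": ("1.50.0", ["Python"]),
    "HepMC": ("2.06.08", []),
    "agile": ("1.4.0", ["HepMC", "Boost", "Python"]),
}


def binary_id(name, packages=PACKAGES, _cache=None):
    """Short hash of a package's version and of the ids of its dependencies.

    Assumes the dependency graph has no cycles.
    """
    cache = {} if _cache is None else _cache
    if name in cache:
        return cache[name]
    version, deps = packages[name]
    digest = hashlib.sha1()
    digest.update(f"{name}-{version}".encode())
    for dep in sorted(deps):  # sorted so the result does not depend on list order
        digest.update(binary_id(dep, packages, cache).encode())
    cache[name] = digest.hexdigest()[:8]
    return cache[name]


if __name__ == "__main__":
    for package in PACKAGES:
        print(package, binary_id(package))
```

With such an identifier, two binaries of the same package version built against different dependency versions are distinguishable, which is the kind of control the bullet above asks for.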


CMake ExternalProject Module
✤ CMake comes with a standard module, ExternalProject, that creates custom targets to drive the download, update/patch, configure, build, install and test steps of an external package
  ✤ Fairly easy to add additional custom steps such as the creation of source and binary tarfiles, installation of logfiles, etc.
  ✤ Implemented a wrapper to inject all these extra features
✤ CMake generates a Makefile that in the end drives the whole build process
  ✤ make -jN works like a dream!


Example
✤ A few lines are sufficient to describe the steps required for a given package
  ✤ Dependencies on other packages are explicit
  ✤ Variables such as ${XXX_home} point to the installation of package XXX


#---agile------------------------------------------------------------------------

LCGPackage_Add(
    agile
    URL http://www.hepforge.org/archive/agile/AGILe-${agile_native_version}.tar.bz2
    CONFIGURE_COMMAND ./configure --prefix=<INSTALL_DIR>
                      --with-hepmc=${HepMC_home}
                      --with-boost-incpath=${Boost_home_include}
                      --with-lcgtag=${LCG_platform}
                      PYTHON=${Python_home}/bin/python
                      LD_LIBRARY_PATH=${Python_home}/lib:$ENV{LD_LIBRARY_PATH}
                      SWIG=${swig_home}/bin/swig
    BUILD_COMMAND make all LD_LIBRARY_PATH=${Python_home}/lib:$ENV{LD_LIBRARY_PATH}
    INSTALL_COMMAND make install LD_LIBRARY_PATH=${Python_home}/lib:$ENV{LD_LIBRARY_PATH}
    BUILD_IN_SOURCE 1
    DEPENDS HepMC Boost Python swig
)


Package Dependencies
✤ From the dependencies we can generate dependency graphs
  ✤ Useful for documentation
  ✤ Full package dependency versions for binary compatibility
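
As a rough illustration of what can be derived from explicit dependency declarations, the Python sketch below computes a valid build order and emits a Graphviz DOT description of the graph. It uses the standard-library `graphlib` module (Python 3.9 or later). The `agile` entry mirrors the DEPENDS list of the LCGPackage_Add example; the dependencies listed for Boost and swig are assumptions made for the sake of the example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Package -> direct dependencies. Only the "agile" entry comes from the
# LCGPackage_Add example; the Boost and swig entries are assumed.
DEPENDS = {
    "agile": ["HepMC", "Boost", "Python", "swig"],
    "Boost": ["Python"],
    "swig": ["Python"],
    "HepMC": [],
    "Python": [],
}


def build_order(depends):
    """Return the packages in an order where dependencies come first."""
    return list(TopologicalSorter(depends).static_order())


def to_dot(depends):
    """Render the dependency graph in Graphviz DOT syntax."""
    lines = ["digraph packages {"]
    for package in sorted(depends):
        for dep in sorted(depends[package]):
            lines.append(f'    "{package}" -> "{dep}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_order(DEPENDS))
    print(to_dot(DEPENDS))
```

The DOT text can be passed to a tool such as Graphviz to draw the graph for documentation purposes.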


Defining the Configuration
✤ A single file lists all the packages and their required versions

# Application Area Projects
LCG_AA_project(COOL COOL_2_8_17)
LCG_AA_project(CORAL CORAL_2_3_26)
LCG_AA_project(RELAX RELAX_1_3_0k)
LCG_AA_project(ROOT 5.34.05)
LCG_AA_project(LCGCMT LCGCMT_${heptools_version})

# Externals
LCG_external_package(4suite 1.0.2p1 )
LCG_external_package(AIDA 3.2.1 )
LCG_external_package(blas 20110419 )
LCG_external_package(Boost 1.50.0 )

...

# Generators
LCG_external_package(starlight r43 MCGenerators/starlight )
LCG_external_package(herwig 6.520 MCGenerators/herwig )
LCG_external_package(herwig 6.520.2 MCGenerators/herwig )
LCG_external_package(crmc v3400 MCGenerators/crmc )
LCG_external_package(cython 0.19 MCGenerators/cython )
LCG_external_package(yaml_cpp 0.3.0 MCGenerators/yaml_cpp )
LCG_external_package(yoda 1.0.0 MCGenerators/yoda )
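
Because the configuration is a single, regular file, it is easy to inspect with a script. The Python sketch below is an illustrative reader, not part of the LCG tooling: the regular expression and the `versions` helper are assumptions, and only a few entries from the listing above are reproduced. It collects the versions declared for each package, which also shows that one package (here herwig) may be listed with more than one version.

```python
import re
from collections import defaultdict

# A few entries copied from the configuration listing above.
CONFIG = """\
# Application Area Projects
LCG_AA_project(ROOT 5.34.05)
# Externals
LCG_external_package(Boost 1.50.0 )
# Generators
LCG_external_package(herwig 6.520 MCGenerators/herwig )
LCG_external_package(herwig 6.520.2 MCGenerators/herwig )
"""

# Matches "LCG_AA_project(<name> <version>" or "LCG_external_package(<name> <version>".
ENTRY = re.compile(
    r"^\s*LCG_(?:AA_project|external_package)\(\s*([^\s()]+)\s+([^\s()]+)"
)


def versions(text):
    """Map each package name to the list of versions declared for it."""
    found = defaultdict(list)
    for line in text.splitlines():
        match = ENTRY.match(line)
        if match:
            found[match.group(1)].append(match.group(2))
    return dict(found)


if __name__ == "__main__":
    for name, declared in versions(CONFIG).items():
        print(name, declared)
```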


Build instructions are fairly simple

- get or setup cmake
- checkout lcgcmake package from SVN
- setup C/C++/Fortran compilers
- create workspace area
- configure with cmake
- build with make

http://sftweb.cern.ch/spi/HowtoBuildWithCMake

Building complete LCG configurations (~140 projects) can be done in 1-2 hours (on a powerful build server)


Quality Assurance


QA Toolbox

✤ Continuous integration
✤ Multi-platform: compilers, versions, architectures
✤ Unit + functionality test suites
✤ Regression tests on performance, results
✤ Nightly builds + tests + automatic reports
✤ Memory: valgrind, massif, igprof...
✤ Performance: callgrind, shark, gprof...
✤ Test coverage: gcov
✤ Coding rule checkers
✤ Automation + strong software management


Memory QA
✤ C++ inherent: memory errors
  ✤ use after delete
  ✤ uninitialized pointers
  ✤ buffer overruns
  ✤ memory leaks
✤ Amplified by novice coders, non-defined ownership rules, ...
✤ Tools at hand:
  ✤ Dynamic analysis tools: e.g. valgrind
  ✤ Static analysis tools: e.g. Coverity, Clang Static Analyzer


Dynamic Analysis Tools

✤ Tools such as valgrind are very good at finding dynamic memory problems, although they tend to be painfully slow for large programs
  ✤ They check the normal execution of the program
✤ The main open issues are:
  ✤ Those almost impossible code paths!
  ✤ Problems that cannot be reproduced
  ✤ Too many code paths for humans
  ✤ Condition matrices
  ✤ “You know what I mean” coding


Static Analysis Tools

✤ Check what could be run, not what is run
  ✤ Catches errors in exceptional cases that a dynamic checker will never get
✤ Using Coverity at CERN
✤ Shocking initial results with ROOT (2.6 MLOC)
  ✤ 4300 defect reports for 7 developers! Would this work?
✤ Amazing relevance
  ✤ 12% false positives, 23% intentional, thus about 2/3 of the reports led to source changes
  ✤ Despite an already well-filled QA toolbox
✤ Relevant reports stimulate motivated developers!
  ✤ The 4300 reports were resolved by 7 developers in 6 weeks
  ✤ Much better release at the end


http://www.coverity.com
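The figures quoted on this slide can be checked with a few lines of arithmetic. The sketch below uses only the numbers from the slide; the function name `triage_summary` is hypothetical, and applying both percentages to the full 4300 reports is an assumption about how they were measured.

```python
def triage_summary(total_reports, false_positive_pct, intentional_pct,
                   developers, weeks):
    """Derive the share of actionable reports and the handling rate.

    Assumes both percentages refer to the full set of reports.
    """
    # Reports that are neither false positives nor intentional code.
    actionable_pct = 100 - false_positive_pct - intentional_pct
    actionable_reports = round(total_reports * actionable_pct / 100)
    # Average number of reports each developer handled per week.
    per_developer_week = total_reports / (developers * weeks)
    return actionable_pct, actionable_reports, per_developer_week


if __name__ == "__main__":
    pct, reports, rate = triage_summary(4300, 12, 23, developers=7, weeks=6)
    print(pct, reports, round(rate))
```

With the slide's inputs, 65% of the reports (about two thirds, or 2795 of 4300) led to source changes, and each developer handled roughly 100 reports per week.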


Initial Numbers for the Experiments

[Figure: bar chart of initial defect counts per project (ALICE, ATLAS, CMS, LHCb, ROOT); vertical axis from 0 to 8000; legend: High Impact, Defects]


Benefits of Static Analysis

✤ Based on this successful recent experience, we are convinced that static analysis can really bring better quality to scientific software
✤ Other tools (open source) are being investigated
  ✤ Clang/LLVM based
✤ Essential for the next major endeavor of the HEP community
  ✤ Concurrent programming (multi-threaded applications and thread-safe libraries)
  ✤ See a later lecture


Physics Validation

✤ Validation concerns building the right model
  ✤ It is used to determine that a model is an accurate representation of the real system
✤ Every element of the experiment software is challenged with a thorough validation process
  ✤ Simulation models (particle generators, particle interaction with matter, etc.)
  ✤ Reconstruction and calibration algorithms
  ✤ Statistical methods
  ✤ Memory and time consumption
✤ Experiments spend a huge amount of resources (people, computing) on this validation
  ✤ Large Monte Carlo samples are produced to validate a number of standard physics processes (candles)
  ✤ Humans are required to judge the quality of the physics (statistical distributions)
✤ The process is repeated until model accuracy is judged to be acceptable
  ✤ Every major software version (2-3 times a year) is thoroughly validated
  ✤ Tools have been developed to assist physicists in this task


Physics Validation

✤ Many plots such as this one are produced and need to be compared (statistically)
  ✤ For many ‘observables’ and different versions of the software
✤ Often human intervention is required
  ✤ The importance of proper user interfaces (e.g. Web tools)
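The slides do not say which statistical comparison is used. As one common choice, the sketch below computes a chi-square statistic between the binned distribution from a reference software version and the one from a new version, and flags large differences for human inspection. The function names and the threshold of 2.0 on the reduced chi-square are assumptions made for illustration, and the histograms are taken to be unweighted event counts with identical binning.

```python
def chi2_two_histograms(ref, new):
    """Chi-square comparison of two unweighted histograms with equal binning.

    Uses X^2 = 1/(N*M) * sum_i (M*r_i - N*n_i)^2 / (r_i + n_i), where N and M
    are the total counts of `ref` and `new`, so only the shapes are compared.
    Returns (chi2, ndf).
    """
    if len(ref) != len(new):
        raise ValueError("histograms must have the same number of bins")
    n_ref, n_new = sum(ref), sum(new)
    if n_ref == 0 or n_new == 0:
        raise ValueError("histograms must not be empty")

    chi2 = 0.0
    used_bins = 0
    for r, n in zip(ref, new):
        if r + n == 0:
            continue  # bins empty in both histograms carry no information
        chi2 += (n_new * r - n_ref * n) ** 2 / (r + n)
        used_bins += 1
    chi2 /= n_ref * n_new
    # One degree of freedom is lost to the normalisation.
    return chi2, used_bins - 1


def needs_human_review(ref, new, max_reduced_chi2=2.0):
    """Flag a pair of histograms whose reduced chi-square exceeds a threshold.

    The default threshold is a hypothetical value, not taken from the slides.
    """
    chi2, ndf = chi2_two_histograms(ref, new)
    return ndf > 0 and chi2 / ndf > max_reduced_chi2


if __name__ == "__main__":
    reference = [10, 20, 30]
    same_shape = [20, 40, 60]   # twice the statistics, identical shape
    shifted = [30, 20, 10]
    print(needs_human_review(reference, same_shape))
    print(needs_human_review(reference, shifted))
```

An automatic check of this kind can only pre-sort the many plots; as the slide notes, a physicist still has to judge the flagged distributions.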


Take Away Messages

✤ The scale of HEP software is large
  ✤ In terms of LOC, #modules, #packages, #configuration parameters, etc.
  ✤ #developers (most of them part-time)
✤ One of its distinctive features is the ability to build only parts of the software stack
  ✤ Thanks to the widespread use of shared libraries, modules, plugin mechanisms and solid architectural designs that minimize dependencies
✤ We cannot ‘enforce’ a very strict and complex software process on our community
  ✤ Keep it simple but not simpler
  ✤ A minimal set of rules is enforced (effectiveness and pragmatism)
✤ Tools play a very important role in supporting the software process
  ✤ Easy to use, accessible and, if possible, open source
