Analysing Software Dependencies With Ignominy

Lucas Taylor, Lassi A. Tuura
Northeastern University, Boston

CHEP 2001, September 2001
http://iguana.cern.ch

Motivation

IGUANA is largely an integrator for CMS: we need to have a good grasp of the external software before its inclusion into our system
  - By and large we are not seeking to select one product…
  - but are trying to merge the strengths of several packages into a very good physics analysis environment…
  - and are seeking to provide feedback to component authors

We are interested in, among others:
  - How much of the external package we would use
  - Its impact on our physical software structure
  - How well it fits in with the philosophy of CMS software and other imports: in design and architecture, usage patterns, GUI, …
  - What other software it depends on
  - Commitment required, possibility of varying how much we use

Examples

See http://iguana.cern.ch/2_4_3/dependencies.html

Ignominy

ignominy: dishonour, disgrace, shame; infamy; the condition of being in disgrace, etc. (Oxford English Dictionary)

ignominy: a suite of Perl and shell scripts plus a number of configuration files (IGUANA)

Model
  - Examines and reports on direct and transitive source and binary dependencies
  - Creates reports of the collected results
      - As a set of web pages
      - Numerically
      - Graphically
      - As tables

[Diagram labels: Source Code, Build Products, User-defined logical dependencies (+), Dependency Database, Metrics, Graphs, Tables]

Dependency Analysis

Ignominy scans…
  - Make dependency data produced by the compilers (*.d files)
  - Source code for #includes (resolved against the ones actually seen)
  - Shared library dependencies (“ldd” output)
  - Defined and required symbols (“nm” output)

And maps…
  - Source code and binaries into packages
  - #include dependencies into package dependencies
  - Unresolved/defined symbols into package dependencies

And warns…
  - About problems and ambiguities (e.g. multiply defined symbols or dependent shared libraries not found)

Produces a simple text file database for the different dependencies: source only, binaries only, combined, forward and reverse, by package, by domain, …
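The step of mapping #include dependencies into package dependencies can be illustrated with a short sketch. This is not Ignominy's implementation (which is a suite of Perl and shell scripts); it is a minimal Python illustration under stated assumptions: the rule in `package_of` (the first two path components name the package) and all example paths are hypothetical.

```python
from collections import defaultdict


def package_of(path):
    """Hypothetical rule: the first two path components name the package."""
    return "/".join(path.split("/")[:2])


def package_dependencies(includes):
    """Turn per-file #include data into package-level edges.

    `includes` maps a source file to the headers it includes (already
    resolved to real paths). The result maps (from_package, to_package)
    to the set of headers that cause the edge.
    """
    edges = defaultdict(set)
    for source, headers in includes.items():
        src_pkg = package_of(source)
        for header in headers:
            dst_pkg = package_of(header)
            if dst_pkg != src_pkg:  # includes within a package are not edges
                edges[(src_pkg, dst_pkg)].add(header)
    return dict(edges)


# Example paths are invented for illustration only.
includes = {
    "Demo/AppDriver/src/Driver.cc": [
        "Demo/AppDriver/interface/Driver.h",
        "Demo/TwigBrowser/interface/TwigModel.h",
        "Demo/TwigBrowser/interface/TwigRep.h",
    ],
}
deps = package_dependencies(includes)
for (src, dst), headers in sorted(deps.items()):
    print(src, "->", dst, "via", ", ".join(sorted(headers)))
```

Keeping the headers behind each edge makes it possible to report not only that one package depends on another, but also through which files.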

Caveats

Ignominy analyses only static dependencies, not dynamic ones
  - Indirect calls through pointers, virtual function calls
  - State dependencies: data reads and writes, thread synchronisation, …

The analysis of external software is heuristic; exact information from the build system helps considerably
  - Difficulties are posed by copied code (copy and paste or merged libraries) and defaults dependent on link order (“dummies” that are supposed to be overridden by the client)
  - Most headaches so far with FORTRAN code

Ignominy must guess software structure when in doubt
  - Based on project-defined heuristic search rules, usually works fine
  - In the face of an ambiguity Ignominy warns and assumes the worst
      - Multiply defined symbol: dependency on all definitions
      - Multiple header matches: dependency on all (but correct with compiler-generated dependency data!)
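The "warn and assume the worst" rule for multiply defined symbols might look like the following. This is a hedged Python sketch, not the tool's code: the function name, the input layout (package name to set of symbols, as could be collected from "nm" output) and the example names are all assumptions.

```python
from collections import defaultdict


def symbol_dependencies(defined, required):
    """Map required symbols to the packages that define them.

    `defined` and `required` map a package name to a set of symbol names.
    When several packages define the same symbol, warn and assume the
    worst: the requiring package depends on all of them.
    """
    providers = defaultdict(set)
    for package, symbols in defined.items():
        for symbol in symbols:
            providers[symbol].add(package)

    edges = defaultdict(set)
    warnings = []
    for package, symbols in required.items():
        for symbol in sorted(symbols):
            owners = providers.get(symbol, set())
            if package in owners:
                continue  # satisfied inside the package itself
            if len(owners) > 1:
                warnings.append(f"{symbol} multiply defined in {sorted(owners)}")
            edges[package].update(owners)
    return dict(edges), warnings


# Package and symbol names are invented for the example.
defined = {"A": {"init"}, "B": {"init", "draw"}, "C": {"run"}}
required = {"C": {"init", "draw"}}
edges, warnings = symbol_dependencies(defined, required)
print(edges)
print(warnings)
```

Here package C ends up depending on both A and B, because `init` cannot be attributed to a single definition.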

Single Package Dependencies

Cmscan/IgCmscan
Testing level: 5
Outgoing edges: 6
  - from includes: 6 (145 files)
  - from symbols: 4 (636 symbols)
Incoming edges: 1
  - from includes: 1 (1 file)
  - from symbols: 1 (1 symbol)

Domain Test Plan

Package Impact Diagram

“Used-by” dependencies
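One way to read a "used-by" view is as reverse reachability: everything that could be affected by a change to a given package. The sketch below assumes that reading; the function and package names are hypothetical and not taken from Ignominy.

```python
def used_by(dependencies, package):
    """Return every package that depends on `package`, directly or not.

    `dependencies` maps a package to the packages it uses directly.
    """
    # Invert the "uses" edges into "used-by" edges.
    reverse = {}
    for user, used in dependencies.items():
        for target in used:
            reverse.setdefault(target, set()).add(user)

    impacted, stack = set(), [package]
    while stack:
        current = stack.pop()
        for user in reverse.get(current, ()):
            if user not in impacted:
                impacted.add(user)
                stack.append(user)
    impacted.discard(package)  # a cycle can lead back to the start
    return impacted


# Invented package names.
dependencies = {
    "App": {"Browser"},
    "Browser": {"Model"},
    "Tools": {"Model"},
    "Model": set(),
}
print(sorted(used_by(dependencies, "Model")))  # ['App', 'Browser', 'Tools']
```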

An Extra Dependency

Bad dependency in prototype code; it was found to result from bad class placement

1 IgSoReaderAppDriver IgQtTwigBrowser via IgQtTwigModel.h
1 IgSoReaderAppDriver IgQtTwigBrowser via IgQtTwigRep.h

Static vs. Logical

Logical dependencies from packages used through “Interfaces”

Discovering Forms of Modularity

A fairly good tool for discovering “philosophical structure”
  - IGUANA and Geant4 mostly use direct abstract interfaces
      - The interfaces normally generate “correct” functional dependencies: interface definitions are in packages that obviously imply the function
  - “Plug in one implementation of this interface”
      - Some use in Lizard/AIDA and ROOT
  - All interfaces bundled into “interface” (or framework) packages
      - Used by Lizard/AIDA and ROOT
  - Explicit dynamic loading to solve modularity issues
      - Used extensively by ROOT
  - Fall back on scripts or commands evaluated at run-time
      - Some use in Geant4
      - Used quite a bit in ROOT

Analysis of Anaphe

Distribution of tools and utilities for LHC era physics
  - Combination of commercial, free and HEP software
  - Claims to be a toolkit

Appears to live up to its toolkit claims
  - Good work on modularity
  - Clean design is evident in many places
  - Dependency diagrams often split naturally into functional units

Analysis of ATLAS

Torture-test exercise for the tool
  - Large release size (~50% F77, ~50% mainly C++ but also C, Java)
  - Near the limit of Ignominy’s ability to discover software structure
  - Pictures below illustrate analysis difficulties

Visible (and known) problems
  - Many cleanly designed packages shadowed by a cycle with very unpleasant effects on the overall structure
  - A number of places show poor packaging and/or lack of abstract interfaces

[Figure panels: Known by build system; Misconfigured analysis (1.3.2); Improved analysis (1.3.7)]

Analysis of CMS/ORCA

Large C++ project
  - Deliberately fast development shows in places
  - Good design in key parts has helped

Recognised problems
  - Especially with the length of the release sequence
  - Clean-up/restructuring necessary soon
      - To some extent starting already

[Figure caption: ORCA Visualisation needs most of the rest]

Analysis of CMS/COBRA, IGUANA

COBRA
  - CMS reconstruction, analysis and simulation framework
  - Recently successfully split off from ORCA
  - Quite a few small packages
  - Has helped with modularity
      - Some issues with partitioning: some small cycles, certain package groups appear quite frequently

IGUANA
  - Generic data analysis environment with CMS focus
  - Many fairly small packages with targeted purpose (similar to Anaphe)
  - Project focus as an integrator and glue provider is fairly evident
  - We too have some rats' nests to clean up, but at least they are small…
  - Has had the advantage of considerable monitoring!

Analysis of Geant4

Fairly large C++ project
  - Very fine-grained (and multi-level) package structuring
  - Seems quite clean from the preliminary analysis

Fine package subdivision helps in many ways but makes analysis and code understanding more complicated

[Figure annotations: One subsystem seems strongly coupled and needs attention; Need to study the use of the internal command system]

Analysis of ROOT

ROOT developers have done a formidable job of breaking binary (shared library) dependencies, but…
  - It makes dubious use of its internal scripting facility
  - For example: by static analysis, nothing seems to use the postscript package directly (no incoming dependencies), but there is this code:

    void TPad::Print (const char *filename, Option_t *option) {
        […]
        TVirtualPS *psave = gVirtualPS;
        if (gROOT->LoadClass("TPostScript","Postscript")) return;
        gROOT->ProcessLineFast("new TPostScript()");
        gVirtualPS->Open(psname,pstype);
        gVirtualPS->SetBit(kPrintingPS);
        […]
    }

Taking these and global objects into account makes the dependency diagrams very different, and casts doubt on the usefulness of binary-only dependency diagrams for ROOT
  - Sign of fast growth? Need a “next evolutionary step”?
  - So “coherent” that replacing parts could get painful…

Analysis of ROOT…

[Figure panels: Binary only; Binary + Source + Logical = Real]
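The idea that binary, source and user-defined logical dependencies add up to the real picture can be sketched as a union of edge sets. The code below is only an illustration: the package names and edges are invented, not measured ROOT data, and `combine` and `incoming` are hypothetical helpers.

```python
def combine(*edge_sets):
    """Union several {(from_package, to_package), ...} edge sets."""
    combined = set()
    for edges in edge_sets:
        combined |= set(edges)
    return combined


def incoming(edges, package):
    """Packages with an edge into `package`."""
    return {src for src, dst in edges if dst == package}


# Illustrative edges only.
binary = {("gpad", "base")}
source = {("gpad", "base"), ("postscript", "base")}
logical = {("gpad", "postscript")}  # e.g. a class loaded by name at run-time

real = combine(binary, source, logical)
print(incoming(binary, "postscript"))  # set(): looks unused
print(incoming(real, "postscript"))    # {'gpad'}
```

With only the binary edges the postscript package appears to have no users; adding the hand-declared logical edge makes the run-time dependency visible.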

Package Metrics

Project     Release  Packages  Avg. # of direct  Cycles (packages  # of    ACD*  CCD*   NCCD*  Size
                               dependencies      involved)         levels
Anaphe      3.6.1    31        2.6               --                8       5.4   167    1.3    630/170k
ATLAS       1.3.2    230       6.3               2 (92)            96      70    16211  10     1350k
ATLAS       1.3.7    236       7.0               2 (92)            97      77    18263  11     1350k
CMS/ORCA    4.6.0    199       7.4               7 (22)            35      24    4815   3.6    420k
CMS/COBRA   5.2.0    87        6.7               4 (10)            19      15    1312   2.7    180k
CMS/IGUANA  2.4.2    35        3.9               --                6       5.0   174    1.2    150/38k
Geant4      4.3.2    108       7.0               3 (12)            21      16    1765   2.8    680k
ROOT        2.25/05  30        6.4               1 (19)            22      19    580    4.7    660k

*) John Lakos, Large-Scale C++ Software Design

  - Size = total amount of source code (roughly; not normalised across projects!)
  - ACD = average component dependency (~ libraries linked in)
  - CCD = sum of single-package component dependencies over whole release
      - Indicates testing/integration cost
  - NCCD = measure of CCD compared to a balanced binary tree
      - A good toolkit’s NCCD will be close to 1.0
      - < 1.0: structure is flatter than a binary tree (= independent packages)
      - > 1.0: structure is more strongly coupled (vertical or cyclic)
      - Aim: minimise NCCD for given software/functionality
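The metrics can be reproduced from a package graph with a few lines of code. The sketch below assumes Lakos' definitions: a package's component dependency counts the package itself plus everything it needs transitively, CCD sums this over all packages, ACD is CCD divided by the number of packages, and a balanced binary tree of N packages has CCD (N+1)·log2(N+1) − N. The function names and the example graph are hypothetical.

```python
import math


def all_packages(graph):
    """Every package named in the graph, as a key or as a dependency."""
    return set(graph) | {dep for deps in graph.values() for dep in deps}


def closure(graph, start):
    """`start` plus everything it depends on, directly or transitively."""
    seen, stack = {start}, [start]
    while stack:
        for dep in graph.get(stack.pop(), ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def balanced_tree_ccd(n):
    """CCD of a balanced binary tree of n components (Lakos)."""
    return (n + 1) * math.log2(n + 1) - n


def metrics(graph):
    packages = all_packages(graph)
    n = len(packages)
    ccd = sum(len(closure(graph, p)) for p in packages)
    return {"CCD": ccd, "ACD": ccd / n, "NCCD": ccd / balanced_tree_ccd(n)}


# A perfectly balanced binary tree of 7 invented packages: NCCD is 1.0.
tree = {"a": ["b", "c"], "b": ["d", "e"], "c": ["f", "g"]}
print(metrics(tree))

# Cross-check against the table: Anaphe has CCD 167 over 31 packages.
print(round(167 / balanced_tree_ccd(31), 1))  # 1.3
```

The same formula reproduces the other NCCD values in the table from their CCD and package counts, for example 580 over 30 packages for ROOT gives about 4.7.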

Metrics: NCCD vs Cycles

[Scatter plot: NCCD (0 to 12) against fraction of packages in cycles (0% to 70%). Points: ATLAS, ORCA, Anaphe, IGUANA, COBRA, G4, ROOT. Marked region: Toolkits & Frameworks]
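The fraction of packages in cycles can be computed by grouping packages that can reach each other. The sketch below is an assumption about how such a number could be obtained, not a description of Ignominy's own method; the graph and names are invented. For scale, the metrics table lists 19 of ROOT's 30 packages in a cycle, roughly 63%.

```python
def closure(graph, start):
    """`start` plus everything it depends on, directly or transitively."""
    seen, stack = {start}, [start]
    while stack:
        for dep in graph.get(stack.pop(), ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def find_cycles(graph):
    """Group packages into cycles: sets of two or more packages that
    all depend on each other, directly or transitively."""
    packages = set(graph) | {d for deps in graph.values() for d in deps}
    reach = {p: closure(graph, p) for p in packages}
    cycles, assigned = [], set()
    for p in sorted(packages):
        if p in assigned:
            continue
        group = {q for q in reach[p] if p in reach[q]}
        if len(group) > 1:
            cycles.append(group)
            assigned |= group
    return cycles, len(packages)


# Invented example: A, B and C form a cycle; D and E do not take part.
graph = {"A": ["B"], "B": ["C"], "C": ["A"], "D": ["A"], "E": []}
cycles, total = find_cycles(graph)
fraction = sum(len(c) for c in cycles) / total
print(cycles, fraction)
```

The pairwise reachability approach is quadratic in the number of packages, which is acceptable for a few hundred packages; a strongly connected components algorithm would scale better.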

Metrics: NCCD vs Size

[Scatter plot: NCCD (0 to 12) against size in k-lines of source [files] (0 to 1600). Points: ATLAS, ORCA, Anaphe, IGUANA, COBRA, G4, ROOT. Marked region: Toolkits & Frameworks]

Metrics: NCCD vs ACD

[Scatter plot: NCCD (0 to 12) against average component dependencies as a fraction of packages (0% to 70%). Points: ATLAS, ORCA, Anaphe, IGUANA, COBRA, G4, ROOT. Marked region: Toolkits & Frameworks]

Metrics: NCCD vs AID

[Scatter plot: NCCD (0 to 12) against average immediate dependencies as a fraction of packages (0% to 25%). Points: ATLAS, ORCA, Anaphe, IGUANA, COBRA, G4, ROOT. Marked region: Toolkits & Frameworks]

Metrics: Packages vs Size

[Scatter plot: number of packages (0 to 250) against size, own code only (0 to 1600). Points: ATLAS, ORCA, Anaphe, IGUANA, COBRA, G4, ROOT. Marked region: Toolkits & Frameworks]

Metrics: Packages vs Size

[Scatter plot: number of packages (0 to 250) against size, all code (0 to 1600). Points: ATLAS, ORCA, Anaphe, IGUANA, COBRA, G4, ROOT. Marked region: Toolkits & Frameworks]

Summary

Ignominy is a rather simple tool, and as such tremendously helpful in keeping a project on track
  - Especially for keeping external software in check
  - Also for giving hard facts about the project itself

It provides tools to study a software system structure
  - It should not be used blindly, results must be understood and interpreted correctly; a human is certainly required!
  - We find it valuable: output is now a part of our release documentation

It doesn’t do everything, but what it does, it seeks to do well
  - Feedback, suggestions for improvements etc. would be most welcome!
  - Planning to add support for Java

Available for free at http://iguana.cern.ch/
  - See the IGUANA distributions (latest = 2.4.3 recommended)
  - For questions please mail [email protected] or iguana-interest@cern.ch