Tracing Software Build Processes to Uncover License Compliance Inconsistencies: A Retrospective

Page 1: Tracing Software Build Processes to Uncover License Compliance Inconsistencies: A Retrospective

Tracing Software Build Processes to Uncover License Compliance Inconsistencies

Shane McIntosh

Sander van der Burg

Eelco Dolstra

Julius Davies

Daniel M. Germán

Armijn Hemel

Tjaldur Software Governance Solutions

@shane_mcintosh

Page 3: Tracing Software Build Processes to Uncover License Compliance Inconsistencies: A Retrospective

What is a build system?

Source code → Deliverable

Page 4: Tracing Software Build Processes to Uncover License Compliance Inconsistencies: A Retrospective

Build systems describe how sources are translated into deliverables

[Figure: file types in a build, from sources to deliverables: .tex, .c, .cc, .o, .dvi, .a, .exe, .pdf, .deb]
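To make the idea of "translating sources into deliverables" concrete, here is a minimal sketch that models build rules as a mapping from each target to its prerequisites and derives a valid build order. The file names and the `RULES` table are hypothetical and are not taken from the talk.

```python
from graphlib import TopologicalSorter

# Hypothetical rules: each target maps to the files it is built from.
RULES = {
    "main.o": ["main.c"],
    "util.o": ["util.c"],
    "libutil.a": ["util.o"],
    "app.exe": ["main.o", "libutil.a"],
}


def build_order(rules):
    """Return the targets in an order where prerequisites come first."""
    # TopologicalSorter treats each dict value as the predecessors of its key.
    order = TopologicalSorter(rules).static_order()
    # Plain sources (main.c, util.c) have no rule, so they are filtered out.
    return [node for node in order if node in rules]


if __name__ == "__main__":
    print(build_order(RULES))
```

A real build system would also compare timestamps or checksums to skip up-to-date targets; this sketch only covers the ordering.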

Page 10: Tracing Software Build Processes to Uncover License Compliance Inconsistencies: A Retrospective

Continuous Integration: enabled by the build system

Commit → Build → Test → Report

Commit 9719cf0 (.c and .mk files)

Report: Commit 9719cf0 was successfully integrated
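The commit, build, test, report cycle can be sketched as a loop over stages that stops at the first failure. This is only an illustration: the `integrate` function and the stage commands are assumptions, and a real CI service would call the project's actual build and test commands.

```python
import subprocess
import sys


def integrate(commit, stages):
    """Run each (name, command) stage in order and report the outcome."""
    for name, command in stages:
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            return f"Commit {commit} failed at stage: {name}"
    return f"Commit {commit} was successfully integrated"


# Placeholder commands that always succeed; substitute real build and test
# invocations (for example a make target) in practice.
STAGES = [
    ("Build", [sys.executable, "-c", "pass"]),
    ("Test", [sys.executable, "-c", "pass"]),
]

if __name__ == "__main__":
    print(integrate("9719cf0", STAGES))
```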

Page 12: Tracing Software Build Processes to Uncover License Compliance Inconsistencies: A Retrospective

“There [is] no such thing as a free lunch”

S. McIntosh, B. Adams, T. H. D. Nguyen, Y. Kamei, A. E. Hassan. An Empirical Study of Build Maintenance Effort [ICSE 2011]

Up to 27% of source changes are accompanied by build changes
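A co-change figure of this kind can be computed from version history by counting the source-changing commits that also touch a build file. The sketch below is an assumption about how such a ratio might be measured, not the cited study's method; the suffix lists and the sample commits are hypothetical.

```python
# Hypothetical classification of files by name; the cited study may differ.
SOURCE_SUFFIXES = (".c", ".cc", ".h")
BUILD_SUFFIXES = (".mk",)
BUILD_NAMES = ("Makefile",)


def is_build_file(path):
    """Treat Makefiles and .mk fragments as build files."""
    name = path.rsplit("/", 1)[-1]
    return name in BUILD_NAMES or name.endswith(BUILD_SUFFIXES)


def co_change_ratio(commits):
    """Fraction of source-changing commits that also change a build file."""
    source_commits = [
        files for files in commits
        if any(path.endswith(SOURCE_SUFFIXES) for path in files)
    ]
    if not source_commits:
        return 0.0
    both = [
        files for files in source_commits
        if any(is_build_file(path) for path in files)
    ]
    return len(both) / len(source_commits)


if __name__ == "__main__":
    history = [["a.c"], ["b.c", "rules.mk"], ["Makefile"], ["c.cc", "src/Makefile"]]
    print(co_change_ratio(history))
```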

Page 20: Tracing Software Build Processes to Uncover License Compliance Inconsistencies: A Retrospective

Maintenance overhead: source-build co-change (.c and .mk?), build technology and maintenance, build logic cloning

Execution overhead: build hotspot detection, powerful hotspot indicators

Build systems also contain useful information!

Page 22: Tracing Software Build Processes to Uncover License Compliance Inconsistencies: A Retrospective

Reusable components are released under different license terms

Apache Public License

Page 23: Tracing Software Build Processes to Uncover License Compliance Inconsistencies: A Retrospective

Failure to comply with license terms can lead to costly legal issues

Page 30: Tracing Software Build Processes to Uncover License Compliance Inconsistencies: A Retrospective

Ensuring license compliance with reused components

Which source files are enabled?

Which components are used?

How are they combined? (Static link or dynamic link)

The build system can answer these questions!

Page 32: Tracing Software Build Processes to Uncover License Compliance Inconsistencies: A Retrospective

We use system tracing to discover build dependencies

[Figure: the build process makes open(), read(), write(), and close() calls to the OS kernel, and the calls are recorded in a trace log]
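As an illustration of mining such a trace log, the sketch below extracts successful open() calls and labels each as a read or a write. The transcript does not name the tracer or its log format, so the line layout here (process ID, call, return value, loosely modeled on what tools such as strace print) is an assumption, as are the function and variable names.

```python
import re

# Assumed line layout: PID open("path", FLAGS[, mode]) = return value
OPEN_CALL = re.compile(
    r'^(?P<pid>\d+)\s+open\("(?P<path>[^"]+)",\s*(?P<flags>[A-Z_|]+)[^)]*\)'
    r'\s*=\s*(?P<ret>-?\d+)'
)


def parse_opens(lines):
    """Return (pid, mode, path) for every open() call that succeeded."""
    events = []
    for line in lines:
        match = OPEN_CALL.match(line)
        if match is None or int(match["ret"]) < 0:
            continue  # not an open() call, or the call failed
        flags = match["flags"].split("|")
        # Anything that is not read-only is treated as a write here.
        mode = "read" if "O_RDONLY" in flags else "write"
        events.append((int(match["pid"]), mode, match["path"]))
    return events


SAMPLE_LOG = [
    '101 open("patchelf.cc", O_RDONLY) = 3',
    '101 read(3, "...", 4096) = 4096',
    '101 open("missing.h", O_RDONLY) = -1 ENOENT (No such file or directory)',
    '101 open("patchelf.o", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 4',
]

if __name__ == "__main__":
    for event in parse_opens(SAMPLE_LOG):
        print(event)
```

Files a process reads become its inputs and files it writes become its outputs, which is the raw material for a dependency graph.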

Page 34: Tracing Software Build Processes to Uncover License Compliance Inconsistencies: A Retrospective

We mine build traces to construct a concrete build dependency graph

[Figure: build dependency graph mined from the trace log. Files: patchelf.cc, elf.h, patchelf.o, libstdc++, patchelf, /usr/bin/patchelf. Tasks: g++ (twice), install]
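One way to turn read and write events into a dependency graph is sketched below: a task depends on the files it reads, and a file depends on the task that writes it. The node names come from the slide, but the edges in `EVENTS` are an assumption about how those nodes connect, and the function names are hypothetical.

```python
from collections import defaultdict

# Assumed (task, mode, path) events for the patchelf example.
EVENTS = [
    ("g++ (compile)", "read", "patchelf.cc"),
    ("g++ (compile)", "read", "elf.h"),
    ("g++ (compile)", "write", "patchelf.o"),
    ("g++ (link)", "read", "patchelf.o"),
    ("g++ (link)", "read", "libstdc++"),
    ("g++ (link)", "write", "patchelf"),
    ("install", "read", "patchelf"),
    ("install", "write", "/usr/bin/patchelf"),
]


def build_graph(events):
    """Map each node to the set of nodes it directly depends on."""
    deps = defaultdict(set)
    for task, mode, path in events:
        if mode == "read":
            deps[task].add(path)   # the task consumes this file
        else:
            deps[path].add(task)   # the file is produced by this task
    return deps


def inputs_of(deps, target):
    """Return every file and task that target transitively depends on."""
    seen, stack = set(), [target]
    while stack:
        for dep in deps.get(stack.pop(), ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


if __name__ == "__main__":
    graph = build_graph(EVENTS)
    print(sorted(inputs_of(graph, "/usr/bin/patchelf")))
```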

Page 37: Tracing Software Build Processes to Uncover License Compliance Inconsistencies: A Retrospective

Annotate build graph nodes with license information using Ninka

[Figure: the patchelf build graph (patchelf.cc, elf.h, patchelf.o, libstdc++, patchelf, /usr/bin/patchelf; tasks g++, g++, install) with license annotations on its nodes]

Inconsistency introduced!
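The check itself can be sketched as a graph walk that collects the license of every input reaching a deliverable and flags those outside an allowed set. Everything below is hypothetical: the edges, the placeholder license names, and the allowed set are invented for illustration, the labels are hard-coded rather than produced by Ninka, and real license compatibility needs legal judgment rather than a set lookup.

```python
# Hypothetical dependency edges (task nodes omitted for brevity).
DEPS = {
    "patchelf": {"patchelf.o", "libstdc++"},
    "patchelf.o": {"patchelf.cc", "elf.h"},
}

# Placeholder license labels; in the talk such labels come from Ninka.
LICENSES = {
    "patchelf.cc": "License-A",
    "elf.h": "License-B",
    "libstdc++": "License-A",
}


def source_licenses(deps, licenses, target):
    """Collect the license of every labeled node that reaches target."""
    found, seen, stack = {}, set(), [target]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if node in licenses:
            found[node] = licenses[node]
        stack.extend(deps.get(node, ()))
    return found


def inconsistencies(deps, licenses, target, allowed):
    """List inputs whose license is not in the allowed set for target."""
    used = source_licenses(deps, licenses, target)
    return sorted(path for path, lic in used.items() if lic not in allowed)


if __name__ == "__main__":
    print(inconsistencies(DEPS, LICENSES, "patchelf", {"License-A"}))
```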

Page 41: Tracing Software Build Processes to Uncover License Compliance Inconsistencies: A Retrospective

Empirical study

(RQ1) Accuracy

(RQ2) Practicality

Page 51: Tracing Software Build Processes to Uncover License Compliance Inconsistencies: A Retrospective

Measuring the accuracy of our CBDG (concrete build dependency graph) approach

Included files: delete a file, then execute the build. Broken means true positive; clean means false positive.

Excluded files: repeat the delete-and-build step. Broken means false negative; clean means true negative.
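The four outcomes above map onto precision and recall in the usual way. The sketch below classifies each delete-and-rebuild experiment and computes both metrics; the sample outcomes are made up and do not reproduce the study's data.

```python
from collections import Counter


def classify(included, build_breaks):
    """Label one experiment: was the file in the graph, and did the build break?"""
    if included:
        return "TP" if build_breaks else "FP"
    return "FN" if build_breaks else "TN"


def precision_recall(outcomes):
    """Compute (precision, recall) from (included, build_breaks) pairs."""
    counts = Counter(classify(inc, broke) for inc, broke in outcomes)
    tp, fp, fn = counts["TP"], counts["FP"], counts["FN"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up outcomes: 3 TP, 1 FP, 2 TN, 1 FN.
SAMPLE = (
    [(True, True)] * 3
    + [(True, False)]
    + [(False, False)] * 2
    + [(False, True)]
)

if __name__ == "__main__":
    print(precision_recall(SAMPLE))
```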

Page 52: Tracing Software Build Processes to Uncover License Compliance Inconsistencies: A Retrospective

Empirical study

(RQ1) Accuracy

Precision: 88%-100%

Recall: 98%-100%

(RQ2) Practicality

Page 54: Tracing Software Build Processes to Uncover License Compliance Inconsistencies: A Retrospective

Bugs filed using our approach on multi-licensed packages

FFmpeg: license was updated within 3 days

CUPS: offending files were removed within 2 days

Page 55: Tracing Software Build Processes to Uncover License Compliance Inconsistencies: A Retrospective

Empirical study

(RQ1) Accuracy

Precision: 88%-100%

Recall: 98%-100%

(RQ2) Practicality

Prompted quick code changes in two systems
