Tracing Software Build Processes to Uncover License Compliance Inconsistencies: A Retrospective
Tracing Software Build Processes to Uncover License Compliance Inconsistencies
Shane McIntosh
Sander van der Burg
Eelco Dolstra
Julius Davies
Daniel M. Germán
Armijn Hemel
Tjaldur Software Governance Solutions
@shane_mcintosh
What is a build system?
[Figure: source code is translated into deliverables; file types shown: .tex, .c, .cc, .o, .dvi, .a, .exe, .deb]
Build systems describe how sources are translated into deliverables
Continuous Integration: Enabled by the build system
[Figure: Commit 9719cf0 (.c, .mk) → Build → Test → Report: "Commit 9719cf0 was successfully integrated"]
“There [is] no such thing as a free lunch”
An Empirical Study of Build Maintenance Effort
S. McIntosh, B. Adams, T. H. D. Nguyen, Y. Kamei, A. E. Hassan [ICSE 2011]
Up to 27% of source changes are accompanied by build changes
Maintenance overhead: source-build co-change (.c → .mk?), build technology and maintenance, build logic cloning
Execution overhead: build hotspot detection, powerful hotspot indicators
Build systems also contain useful information!
Reusable components are released under different license terms
Example: Apache License
Failure to comply with license terms can lead to costly legal issues
Ensuring license compliance with reused components
Which source files are enabled?
Which components are used?
How are they combined? (Static link or dynamic link)
The build system can answer these questions!
We use system tracing to discover build dependencies
[Figure: the build process issues open(), read(), write(), and close() calls to the OS kernel; the calls are recorded in a trace log]
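To make the tracing step concrete, here is a minimal sketch of pulling file accesses out of a trace log. It assumes strace-style lines of the form `PID open("path", FLAGS) = FD`; the slides do not name the tracer or its log format, so the pattern, the `parse_open_calls` helper, and the sample log are all hypothetical.

```python
import re

# Assumed strace-style line: PID open("path", FLAGS[, MODE]) = FD
OPEN_RE = re.compile(
    r'^(\d+)\s+open\("([^"]+)",\s*([A-Z_|]+)(?:,\s*\d+)?\)\s*=\s*(-?\d+)'
)

def parse_open_calls(lines):
    """Return (pid, path, mode) for every successful open() in the log."""
    events = []
    for line in lines:
        match = OPEN_RE.match(line)
        if not match:
            continue  # not an open() call in the assumed format
        pid, path, flags, fd = match.groups()
        if int(fd) < 0:
            continue  # the open() failed, so no dependency is recorded
        flag_set = set(flags.split("|"))
        mode = "write" if flag_set & {"O_WRONLY", "O_RDWR"} else "read"
        events.append((int(pid), path, mode))
    return events

# Made-up sample log, for illustration only
sample_log = [
    '101 open("patchelf.cc", O_RDONLY) = 3',
    '101 open("elf.h", O_RDONLY) = 4',
    '101 open("patchelf.o", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 5',
    '101 open("missing.h", O_RDONLY) = -1 ENOENT (No such file or directory)',
]
events = parse_open_calls(sample_log)
```

Files opened for reading become candidate inputs of the process, and files opened for writing become its outputs.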
We mine build traces to construct a concrete build dependency graph
[Figure: trace log → build graph: patchelf.cc and elf.h → (g++) → patchelf.o; patchelf.o and libstdc++ → (g++) → patchelf → (install) → /usr/bin/patchelf]
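One way to picture the mining step: every file a process writes depends on every file that process read. The sketch below applies that rule to hand-written per-process summaries that mirror the patchelf figure. The tuple layout and the names `build_graph` and `transitive_inputs` are assumptions for illustration, not the authors' implementation.

```python
from collections import defaultdict

# Hypothetical per-process summaries: (tool, files read, files written)
trace = [
    ("g++", ["patchelf.cc", "elf.h"], ["patchelf.o"]),
    ("g++", ["patchelf.o", "libstdc++"], ["patchelf"]),
    ("install", ["patchelf"], ["/usr/bin/patchelf"]),
]

def build_graph(trace):
    """Map each output file to the set of files it was directly derived from."""
    deps = defaultdict(set)
    for _tool, reads, writes in trace:
        for out in writes:
            deps[out].update(reads)
    return deps

def transitive_inputs(deps, target):
    """Collect every file reachable from target along dependency edges."""
    seen, stack = set(), [target]
    while stack:
        node = stack.pop()
        for dep in deps.get(node, ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen

graph = build_graph(trace)
sources = transitive_inputs(graph, "/usr/bin/patchelf")
```

Here `sources` holds everything that contributed to the installed binary, which is the set a license audit would need to inspect.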
Annotate build graph nodes with license information using Ninka
[Figure: the patchelf build graph (patchelf.cc, elf.h, patchelf.o, libstdc++, patchelf, /usr/bin/patchelf) with a license annotation on each node]
Inconsistency introduced!
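A rough sketch of how annotated nodes could be checked for inconsistencies is given below. The license labels (`License-A`, `License-B`) are placeholders, not the real licenses of these files, and the `accepts` policy table is hypothetical. In practice the labels would come from a scanner such as Ninka, whose interface is not shown here.

```python
# Hypothetical dependency edges: output -> direct inputs
deps = {
    "/usr/bin/patchelf": {"patchelf"},
    "patchelf": {"patchelf.o", "libstdc++"},
    "patchelf.o": {"patchelf.cc", "elf.h"},
}

# Placeholder license labels (not the files' real licenses)
licenses = {
    "patchelf.cc": "License-A",
    "elf.h": "License-B",
    "libstdc++": "License-A",
}

# Hypothetical policy: input licenses assumed acceptable per declared license
accepts = {"License-A": {"License-A"}}

def leaf_inputs(deps, target):
    """Files that feed into target and are not themselves build outputs."""
    leaves, seen, stack = set(), set(), [target]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        children = deps.get(node)
        if children:
            stack.extend(children)
        elif node != target:
            leaves.add(node)
    return leaves

def find_inconsistencies(deps, licenses, accepts, target, declared):
    """List inputs whose license is not accepted under the declared license."""
    allowed = accepts.get(declared, set())
    return sorted(
        path
        for path in leaf_inputs(deps, target)
        if licenses.get(path, "UNKNOWN") not in allowed
    )

issues = find_inconsistencies(
    deps, licenses, accepts, "/usr/bin/patchelf", "License-A"
)
```

Under these assumed labels, `issues` contains `elf.h`, the one input whose license falls outside the assumed policy.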
Empirical study
(RQ1) Accuracy
(RQ2) Practicality
Measuring the accuracy of our CBDG (concrete build dependency graph) approach
Included files: delete a .c file, then execute the build. Broken means true positive; clean means false positive.
Excluded files: delete a .c file, then execute the build. Broken means false negative; clean means true negative.
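The delete-and-rebuild procedure maps onto the usual confusion-matrix definitions. The sketch below classifies each experiment by whether the graph marked the file as included and whether the build broke after deletion, then computes precision and recall. The outcome list is made up for illustration and is not the study's data.

```python
def classify(predicted_included, build_broke):
    """Label one delete-and-rebuild experiment as TP, FP, FN, or TN."""
    if predicted_included:
        return "TP" if build_broke else "FP"
    return "FN" if build_broke else "TN"

def precision_recall(outcomes):
    """Compute precision and recall from (predicted_included, build_broke) pairs."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for predicted_included, build_broke in outcomes:
        counts[classify(predicted_included, build_broke)] += 1
    tp, fp, fn = counts["TP"], counts["FP"], counts["FN"]
    precision = tp / (tp + fp) if tp + fp else float("nan")
    recall = tp / (tp + fn) if tp + fn else float("nan")
    return precision, recall, counts

# Made-up outcomes: (graph says included?, build broke after deletion?)
outcomes = [
    (True, True),
    (True, True),
    (True, False),
    (False, False),
    (False, True),
]
precision, recall, counts = precision_recall(outcomes)
```

Precision is TP / (TP + FP) and recall is TP / (TP + FN), so true negatives affect neither metric.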
Empirical study
(RQ1) Accuracy
Precision: 88%-100%
Recall: 98%-100%
(RQ2) Practicality
Bugs filed using our approach on multi-licensed packages
FFmpeg: License was updated within 3 days
CUPS: Offending files were removed within 2 days
Prompted quick code changes in two systems