Sattose 2011
Uploaded by bogdan-vasilescu
2/8
/ department of mathematics and computer science
Aggregation techniques for software metrics
Better understand aggregation techniques for software metrics.
[Figure: Household income in Ilocos, the Philippines (1998). Density plot; x-axis: Income (0 to 2500000), y-axis: Density (0e+00 to 5e-06)]
[Figure: Source lines of code - freecol-0.9.4. Density plot; x-axis: SLOC per class (0 to 3000), y-axis: Density (0.000 to 0.004)]
Traditional: mean, sum, median, standard deviation, variance, skewness, kurtosis.
Inequality indices: Gini, Theil, Atkinson, Hoover, Kolm.
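As a rough illustration of the two families of aggregation techniques listed above, the following sketch computes a few of them for a list of per-class SLOC values. It uses textbook formulas and only the Python standard library; it is not the toolchain from the slides (which relies on R), and the Atkinson parameter `eps`, the Kolm parameter `alpha` and the sample data are assumptions chosen for the example.

```python
import math
from statistics import mean, pstdev

def skewness(xs):
    # Third standardized moment (population version).
    m, s = mean(xs), pstdev(xs)
    return sum((x - m) ** 3 for x in xs) / (len(xs) * s ** 3)

def kurtosis(xs):
    # Fourth standardized moment (population version, not excess kurtosis).
    m, s = mean(xs), pstdev(xs)
    return sum((x - m) ** 4 for x in xs) / (len(xs) * s ** 4)

def gini(xs):
    # Mean absolute difference over all ordered pairs, divided by twice the mean.
    n, m = len(xs), mean(xs)
    return sum(abs(a - b) for a in xs for b in xs) / (2 * n * n * m)

def theil(xs):
    # Theil T index; zero values contribute nothing (0 * log 0 is taken as 0).
    m = mean(xs)
    return sum((x / m) * math.log(x / m) for x in xs if x > 0) / len(xs)

def hoover(xs):
    # Share of the total that would have to be redistributed to reach equality.
    m = mean(xs)
    return sum(abs(x - m) for x in xs) / (2 * sum(xs))

def atkinson(xs, eps=0.5):
    # Atkinson index for an assumed inequality aversion eps (eps != 1).
    m = mean(xs)
    return 1 - (sum(x ** (1 - eps) for x in xs) / len(xs)) ** (1 / (1 - eps)) / m

def kolm(xs, alpha=0.01):
    # Kolm index for an assumed parameter alpha > 0.
    # A large alpha combined with large values can overflow math.exp.
    m = mean(xs)
    return math.log(sum(math.exp(alpha * (m - x)) for x in xs) / len(xs)) / alpha

if __name__ == "__main__":
    sloc_per_class = [12, 40, 55, 80, 130, 900]  # made-up sample data
    for f in (mean, skewness, kurtosis, gini, theil, hoover, atkinson, kolm):
        print(f.__name__, round(f(sloc_per_class), 4))
```

All of the inequality indices above return 0 for a perfectly even distribution, which gives a quick sanity check on any implementation.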
3/8
Correlation study
Aggregate SLOC from class to package level.
Study statistical correlation between pairs of aggregation techniques.
Not enough to measure.
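The study design above can be sketched as follows: group class-level SLOC by package, apply two aggregation techniques to each package, and compute Kendall's rank correlation between the two resulting series. This is a minimal standard-library sketch, not the authors' R scripts. The class names and SLOC values are hypothetical, and the function computes the simple tau-a variant, which does not correct for ties.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean

def aggregate_by_package(class_sloc, technique):
    """Aggregate class-level SLOC to package level with the given technique."""
    groups = defaultdict(list)
    for qualified_name, sloc in class_sloc.items():
        # Everything before the last dot is treated as the package name.
        package = qualified_name.rpartition(".")[0]
        groups[package].append(sloc)
    return {pkg: technique(values) for pkg, values in groups.items()}

def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(xs)
    return (concordant - discordant) / (n * (n - 1) / 2)

if __name__ == "__main__":
    # Hypothetical class-level measurements for three packages.
    class_sloc = {
        "a.X": 10, "a.Y": 30,
        "b.Z": 25,
        "c.U": 1, "c.V": 2, "c.W": 3,
    }
    by_mean = aggregate_by_package(class_sloc, mean)
    by_sum = aggregate_by_package(class_sloc, sum)
    packages = sorted(by_mean)
    tau = kendall_tau_a([by_mean[p] for p in packages],
                        [by_sum[p] for p in packages])
    print("Kendall tau-a (mean vs. sum):", tau)
```

Repeating this for every pair of aggregation techniques and every system in a corpus yields one correlation coefficient per pair and system, which is the kind of data a distribution plot of coefficients would summarize.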
4/8
Available datasets
Qualitas Corpus 20101126, r + e:
- r (recent): the most recent versions from 106 systems.
- e (evolution): all available versions from 13 systems (≥ 10 versions available), 414 versions in total.
5/8
Tooling
Developed and available tooling to analyze the corpus:
- Extract metrics: SLOCCount, Understand (still not generic enough)
- Compute inequality indices, perform statistical analyses: R (highly scriptable)
- Put everything together: Python toolchain (easily extendable)
[Figure: Kendall correlation: Atkinson - skewness (SLOC); y-axis: Kendall correlation coefficient (-1.0 to 1.0)]
[Figure: Kendall correlation: Gini - Theil (SLOC); y-axis: Kendall correlation coefficient (-1.0 to 1.0)]
[Figure: Kendall correlation: mean - kurtosis (SLOC); y-axis: Kendall correlation coefficient (-1.0 to 1.0)]
6/8
Sample results - shape
[Figure: jfreechart : Atkinson - skewness (SLOC). Scatter plot; x-axis: Atkinson (SLOC) (0.0 to 0.5), y-axis: skewness (SLOC) (-2 to 4)]
[Figure: jfreechart : Gini - Theil (SLOC). Scatter plot; x-axis: Gini (SLOC) (0.0 to 0.8), y-axis: Theil (SLOC) (0.0 to 1.5)]
[Figure: jfreechart : mean - kurtosis (SLOC). Scatter plot; x-axis: mean (SLOC) (0 to 300), y-axis: kurtosis (SLOC) (5 to 20)]
7/8
Sample results - evolution
[Figure: hibernate - Kendall(Gini(SLOC), Theil(SLOC)) (86 releases); x-axis: releases from 0.8.1 to 3.6.0-beta4, y-axis: correlation coefficient Gini(SLOC) - Theil(SLOC) (-1.0 to 1.0)]
[Figure: hibernate - Kendall(Atkinson(SLOC), Kolm(SLOC)) (86 releases); x-axis: releases from 0.8.1 to 3.6.0-beta4, y-axis: correlation coefficient Atkinson(SLOC) - Kolm(SLOC) (-1.0 to 1.0)]