Sattose 2011

Aggregation of software metrics

Bogdan Vasilescu, Alexander Serebrenik

April 7, 2011

description

I used these slides during my talk at SaTTOSE 2011 in Koblenz, Germany.

Transcript of Sattose 2011

Page 1: Sattose 2011

Aggregation of software metrics

Bogdan Vasilescu

Alexander Serebrenik

April 7, 2011

Page 2: Sattose 2011

2/8

/ department of mathematics and computer science

Aggregation techniques for software metrics

Better understand aggregation techniques for software metrics.

[Figure: density plot "Household income in Ilocos, the Philippines (1998)". x-axis: Income (0 to 2,500,000); y-axis: Density (0 to 5e-06).]

[Figure: density plot "Source lines of code - freecol-0.9.4". x-axis: SLOC per class (0 to 3000); y-axis: Density (0 to 0.004).]

Traditional: mean, sum, median, standard deviation, variance, skewness, kurtosis.

Inequality indices: Gini, Theil, Atkinson, Hoover, Kolm.
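
As a rough illustration of the inequality indices listed above, the following Python sketch computes Gini, Theil, Atkinson and Hoover for a list of class-level SLOC values. The function names, the sample data and the default Atkinson parameter (epsilon = 1) are assumptions made for this example and do not come from the talk. The formulas are the standard econometric definitions and assume strictly positive values.

```python
import math


def gini(values):
    """Gini index: 0 when all values are equal, approaching 1 as one value dominates."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    # Rank-weighted form, equivalent to the mean-absolute-difference definition.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def theil(values):
    """Theil index: mean of (x / mean) * ln(x / mean); requires x > 0."""
    n = len(values)
    mean = sum(values) / n
    return sum((x / mean) * math.log(x / mean) for x in values) / n


def atkinson(values, epsilon=1.0):
    """Atkinson index; epsilon is the inequality-aversion parameter (assumed 1 by default)."""
    n = len(values)
    mean = sum(values) / n
    if epsilon == 1.0:
        # For epsilon = 1 the index is 1 - geometric mean / arithmetic mean.
        geometric_mean = math.exp(sum(math.log(x) for x in values) / n)
        return 1.0 - geometric_mean / mean
    power_mean = (sum(x ** (1.0 - epsilon) for x in values) / n) ** (1.0 / (1.0 - epsilon))
    return 1.0 - power_mean / mean


def hoover(values):
    """Hoover index: share of the total that would have to be redistributed for equality."""
    total = sum(values)
    mean = total / len(values)
    return sum(abs(x - mean) for x in values) / (2.0 * total)


# Hypothetical SLOC-per-class values for one package (made-up data).
sloc_per_class = [12, 40, 55, 80, 130, 900]
for name, index in [("Gini", gini), ("Theil", theil),
                    ("Atkinson", atkinson), ("Hoover", hoover)]:
    print(f"{name}: {index(sloc_per_class):.3f}")
```

The Kolm index is left out because it depends on an additional parameter that the slides do not specify.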

Page 4: Sattose 2011

Correlation study

Aggregate SLOC from class to package level.

Study statistical correlation between pairs of aggregation techniques.

Not enough to measure.
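
The study design above can be sketched in Python as follows: class-level SLOC is aggregated per package with two techniques (mean and Gini here), and the two resulting series are compared with a Kendall rank correlation. The package names and SLOC values are invented for illustration, and the tau-b variant is an assumption chosen because it handles ties; the slides do not state which variant was used.

```python
import math
from itertools import combinations


def kendall_tau_b(xs, ys):
    """Kendall tau-b rank correlation between two equally long sequences."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # pairs tied in both variables do not contribute
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    # Note: the denominator is zero if one of the sequences is constant.
    denominator = math.sqrt((concordant + discordant + tied_x_only)
                            * (concordant + discordant + tied_y_only))
    return (concordant - discordant) / denominator


def mean(values):
    return sum(values) / len(values)


def gini(values):
    """Gini index in its rank-weighted form; assumes a positive total."""
    xs = sorted(values)
    n = len(xs)
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * sum(xs)) - (n + 1.0) / n


# Hypothetical class-level SLOC, grouped by package (made-up data).
sloc_by_package = {
    "core": [120, 45, 300, 80],
    "ui": [60, 65, 70],
    "io": [500, 20, 35, 15, 10],
    "util": [30, 25],
}

# Aggregate from class level to package level with each technique.
means = [mean(sloc) for sloc in sloc_by_package.values()]
ginis = [gini(sloc) for sloc in sloc_by_package.values()]

# One correlation coefficient per system for this pair of techniques.
tau = kendall_tau_b(means, ginis)
print(f"Kendall tau-b (mean vs. Gini): {tau:.3f}")
```

Repeating this for every pair of aggregation techniques and every system gives one coefficient per pair and system, which is the kind of data summarized in the correlation plots.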

Page 5: Sattose 2011

Available datasets

Qualitas Corpus 20101126 r+e.

- r (recent): the most recent versions of 106 systems.

- e (evolution): all available versions of 13 systems (≥ 10 versions available), 414 versions in total.

Page 6: Sattose 2011

Tooling

Developed and available tooling to analyze the corpus:

- Extract metrics: SLOCCount, Understand (still not generic enough)

- Compute inequality indices, perform statistical analyses: R (highly scriptable)

- Put everything together: Python toolchain (easily extendable)

[Figure: three panels, each showing Kendall correlation coefficients on a y-axis from -1.0 to 1.0: "Kendall correlation: Atkinson - skewness (SLOC)", "Kendall correlation: Gini - Theil (SLOC)", "Kendall correlation: mean - kurtosis (SLOC)".]

Page 7: Sattose 2011

Sample results - shape

[Figure: "jfreechart : Atkinson - skewness (SLOC)". x-axis: Atkinson (SLOC), 0.0 to 0.5; y-axis: skewness (SLOC), -2 to 4.]

[Figure: "jfreechart : Gini - Theil (SLOC)". x-axis: Gini (SLOC), 0.0 to 0.8; y-axis: Theil (SLOC), 0.0 to 1.5.]

[Figure: "jfreechart : mean - kurtosis (SLOC)". x-axis: mean (SLOC), 0 to 300; y-axis: kurtosis (SLOC), 5 to 20.]

Page 8: Sattose 2011

Sample results - evolution

[Figure: "hibernate - Kendall(Gini(SLOC), Theil(SLOC)) (86 releases)". x-axis: releases from 0.8.1 to 3.6.0-beta4; y-axis: Cor. coeff. Gini(SLOC) - Theil(SLOC), -1.0 to 1.0.]

[Figure: "hibernate - Kendall(Atkinson(SLOC), Kolm(SLOC)) (86 releases)". x-axis: releases from 0.8.1 to 3.6.0-beta4; y-axis: Cor. coeff. Atkinson(SLOC) - Kolm(SLOC), -1.0 to 1.0.]