Metrics - You can't control the unfamiliar
icsm-2011
![Page 1: Metrics - You can't control the unfamiliar](https://reader034.fdocuments.us/reader034/viewer/2022051818/54b4a8b84a79595d688b4653/html5/thumbnails/1.jpg)
/ W&I / MDSE PAGE 0 5-10-2011
Metrics are usually computed at a low level: classes, methods, …
![Page 2: Metrics - You can't control the unfamiliar](https://reader034.fdocuments.us/reader034/viewer/2022051818/54b4a8b84a79595d688b4653/html5/thumbnails/2.jpg)
A multitude of data values obscures the general picture of system maintainability
![Page 3: Metrics - You can't control the unfamiliar](https://reader034.fdocuments.us/reader034/viewer/2022051818/54b4a8b84a79595d688b4653/html5/thumbnails/3.jpg)
That we are actually interested in!
![Page 4: Metrics - You can't control the unfamiliar](https://reader034.fdocuments.us/reader034/viewer/2022051818/54b4a8b84a79595d688b4653/html5/thumbnails/4.jpg)
You Can't Control the Unfamiliar: A Study on the Relations Between Aggregation Techniques for Software Metrics
Bogdan Vasilescu
Alexander Serebrenik
Mark van den Brand
![Page 5: Metrics - You can't control the unfamiliar](https://reader034.fdocuments.us/reader034/viewer/2022051818/54b4a8b84a79595d688b4653/html5/thumbnails/5.jpg)
Two kinds of aggregation
• Same artifact, different metrics
• Same metrics, different artifacts
![Page 6: Metrics - You can't control the unfamiliar](https://reader034.fdocuments.us/reader034/viewer/2022051818/54b4a8b84a79595d688b4653/html5/thumbnails/6.jpg)
Various techniques can be found in the literature
Same metrics, different artifacts
• Traditional: mean, median, sum, …
• Econometric inequality indices: Gini, Theil, Hoover, Kolm, Atkinson
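As a rough illustration of the two families listed above, the sketch below aggregates the class-level SLOC of a single package with traditional statistics and with two inequality indices. The sample values and function names are invented for illustration; the formulas are the standard textbook definitions of the Gini and Hoover indices, not code taken from the study.

```python
from statistics import mean, median

def gini(values):
    """Gini index: mean absolute difference over all pairs, divided by twice the mean."""
    n = len(values)
    mu = mean(values)
    total = sum(abs(x - y) for x in values for y in values)
    return total / (2 * n * n * mu)

def hoover(values):
    """Hoover index: share of the total that would have to move to reach equality."""
    mu = mean(values)
    return sum(abs(x - mu) for x in values) / (2 * sum(values))

# Hypothetical SLOC of the classes in one package (illustrative numbers only).
class_sloc = [10, 20, 30, 40, 400]

aggregates = {
    "mean": mean(class_sloc),
    "median": median(class_sloc),
    "sum": sum(class_sloc),
    "gini": gini(class_sloc),
    "hoover": hoover(class_sloc),
}
for name, value in aggregates.items():
    print(f"{name}: {value:.3f}")
```

Each technique reduces the same list of class values to one number per package, which is what makes the package-level results comparable across techniques.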
![Page 7: Metrics - You can't control the unfamiliar](https://reader034.fdocuments.us/reader034/viewer/2022051818/54b4a8b84a79595d688b4653/html5/thumbnails/7.jpg)
Which aggregation technique should we use?
![Page 8: Metrics - You can't control the unfamiliar](https://reader034.fdocuments.us/reader034/viewer/2022051818/54b4a8b84a79595d688b4653/html5/thumbnails/8.jpg)
Questions
1. Which aggregation techniques agree, and to what extent?
2. What is the nature of the relation between the various aggregation techniques?
3. How does the correlation coefficient change as the systems evolve?
![Page 9: Metrics - You can't control the unfamiliar](https://reader034.fdocuments.us/reader034/viewer/2022051818/54b4a8b84a79595d688b4653/html5/thumbnails/9.jpg)
Qualitas Corpus 20101126
• Qualitas Corpus 20101126r, 106 systems
• FitJava v1.1, 2 packages, 2240 SLOC
• NetBeans v6.9.1, 3373 packages, 1890536 SLOC
![Page 10: Metrics - You can't control the unfamiliar](https://reader034.fdocuments.us/reader034/viewer/2022051818/54b4a8b84a79595d688b4653/html5/thumbnails/10.jpg)
1) Agreement between different techniques
• Aggregation: class SLOC → package
• Agreement: techniques agree if they rank the packages similarly
• We use a rank-based correlation coefficient: Kendall’s τ
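To make "rank the packages similarly" concrete, here is a minimal sketch of Kendall's rank correlation in its tie-corrected tau-b form. Which variant the study used is not stated on the slide, so tau-b is an assumption, and the per-package values below are hypothetical.

```python
from itertools import combinations
from math import sqrt

def kendall_tau_b(xs, ys):
    """Kendall's tau-b: (concordant - discordant) pairs, corrected for ties."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both rankings: excluded from both denominator factors
        elif dx == 0:
            ties_x += 1
        elif dy == 0:
            ties_y += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denom = sqrt((concordant + discordant + ties_x) * (concordant + discordant + ties_y))
    return (concordant - discordant) / denom if denom else float("nan")

# Hypothetical aggregates for four packages (same package order in each list).
gini_vals = [0.10, 0.30, 0.25, 0.60]
theil_vals = [0.02, 0.15, 0.10, 0.70]
sum_vals = [900, 120, 4000, 300]

print("Gini vs Theil:", kendall_tau_b(gini_vals, theil_vals))
print("Gini vs sum:  ", kendall_tau_b(gini_vals, sum_vals))
```

A value near 1 means two techniques order the packages almost identically, a value near -1 means they order them in reverse, and a value near 0 means the rankings are unrelated.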
![Page 11: Metrics - You can't control the unfamiliar](https://reader034.fdocuments.us/reader034/viewer/2022051818/54b4a8b84a79595d688b4653/html5/thumbnails/11.jpg)
1) Agreement: different inequality indices?
• Gini, Theil, Hoover, Atkinson – agree
• aggregates obtained convey the same information
• Kolm does not!
![Page 12: Metrics - You can't control the unfamiliar](https://reader034.fdocuments.us/reader034/viewer/2022051818/54b4a8b84a79595d688b4653/html5/thumbnails/12.jpg)
1) Agreement: traditional and inequality indices?
• mean
− Kolm: strong (0.8) and statistically significant (92%)
• median, standard deviation, and variance
• sum
− does not correlate with any other aggregation technique
![Page 13: Metrics - You can't control the unfamiliar](https://reader034.fdocuments.us/reader034/viewer/2022051818/54b4a8b84a79595d688b4653/html5/thumbnails/13.jpg)
2) Nature of the relation: Typical patterns
• Theil is known to be more sensitive to the rich
• Theil increases faster when Gini increases
• Linear relation with a “fat” head
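The pattern above can be reproduced on toy data. The sketch below uses the standard definitions of the Gini and Theil indices and lets a single class grow while the rest of the package stays fixed. The SLOC values are hypothetical and are only meant to show the trend, not to mirror the study's plots.

```python
from math import log
from statistics import mean

def gini(values):
    """Gini index via the mean absolute difference over all pairs."""
    n, mu = len(values), mean(values)
    return sum(abs(x - y) for x in values for y in values) / (2 * n * n * mu)

def theil(values):
    """Theil index: mean of (x/mu) * ln(x/mu); requires strictly positive values."""
    mu = mean(values)
    return sum((x / mu) * log(x / mu) for x in values) / len(values)

# Nine small classes plus one class that keeps growing (hypothetical SLOC values).
for big in (10, 100, 1000, 10000):
    sloc = [10] * 9 + [big]
    print(f"largest class = {big:>5}: Gini = {gini(sloc):.3f}, Theil = {theil(sloc):.3f}")
```

With n values, Gini is bounded by (n - 1)/n while Theil can grow up to ln(n), so Theil has more room to react once one very large class dominates the package.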
![Page 14: Metrics - You can't control the unfamiliar](https://reader034.fdocuments.us/reader034/viewer/2022051818/54b4a8b84a79595d688b4653/html5/thumbnails/14.jpg)
Which aggregation technique? (1)
• Theil, Hoover, Gini and Atkinson agree
• Any can be chosen from the correlation point of view
• Some might be “better” in each specific case
• easy to interpret: Gini [0,1]
• provide additional insights: Theil (explanation)
• negative values: Gini, Hoover
− affects the domain!
• sensitive to high values: Theil, Atkinson
• deviations from uniformity: Gini, Hoover
![Page 15: Metrics - You can't control the unfamiliar](https://reader034.fdocuments.us/reader034/viewer/2022051818/54b4a8b84a79595d688b4653/html5/thumbnails/15.jpg)
Which aggregation technique? (2)
• Kolm and mean agree
• Kolm is reliable for skewed distributions
− better alternative (“by no means”)
• Not in the paper:
− agreement observed for NOC
− but not for DIT!
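For readers unfamiliar with the Kolm index, a minimal sketch of its usual definition follows. The parameter `alpha` and its default of 1.0 are assumptions (the slides do not state which value was used), and the sample values are hypothetical.

```python
from math import exp, log
from statistics import mean

def kolm(values, alpha=1.0):
    """Kolm index: (1/alpha) * ln( mean( exp(alpha * (mu - x)) ) ), for alpha > 0."""
    mu = mean(values)
    exponents = [alpha * (mu - x) for x in values]
    # Log-sum-exp shift so exp() never receives a large positive argument.
    shift = max(exponents)
    total = sum(exp(e - shift) for e in exponents)
    return (shift + log(total / len(values))) / alpha

# Hypothetical metric values for the classes of one package.
sloc = [10, 20, 30, 40, 400]
shifted = [x + 1000 for x in sloc]

print("Kolm:", kolm(sloc))
print("Kolm after adding 1000 to every class:", kolm(shifted))
print("Kolm with negative values:", kolm([-5, 0, 5]))
```

Unlike Gini or Hoover, this index does not divide by the mean: adding a constant to every value leaves it unchanged, and negative inputs pose no problem for the formula.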
![Page 16: Metrics - You can't control the unfamiliar](https://reader034.fdocuments.us/reader034/viewer/2022051818/54b4a8b84a79595d688b4653/html5/thumbnails/16.jpg)
Conclusions