Know thy tools
![Page 2: Know thy tools](https://reader034.fdocuments.us/reader034/viewer/2022042714/554a0f82b4c90507558b4b7c/html5/thumbnails/2.jpg)
2
Know thy tools
Stop treating data miners as black boxes.
Looking inside is (1) fun, (2) easy, (3) needed.
![Page 3: Know thy tools](https://reader034.fdocuments.us/reader034/viewer/2022042714/554a0f82b4c90507558b4b7c/html5/thumbnails/3.jpg)
3
INFOGAIN (the Fayyad and Irani MDL discretizer) in 55 lines
https://raw.githubusercontent.com/timm/axe/master/old/ediv.py
Input: [ (1,X), (2,X), (3,X), (4,X), (11,Y), (12,Y), (13,Y), (14,Y) ]
Output: 1, 11
E = Σ –p*log2(p)
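A minimal sketch of the idea on this slide, not the linked ediv.py itself: sort the (number, class) pairs, pick the cut that minimizes the expected entropy E of the two halves, and recurse while the Fayyad and Irani MDL test accepts the split. The function names (`entropy`, `best_cut`, `mdl_accepts`, `ediv`) and the choice to report each bin by its lowest value are assumptions made for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """E = sum of -p*log2(p) over the class frequencies in labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_cut(pairs):
    """Return (index, expected entropy) of the lowest-entropy split, or None."""
    n = len(pairs)
    labels = [y for _, y in pairs]
    best = None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # never cut between two identical numeric values
        left, right = labels[:i], labels[i:]
        e = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if best is None or e < best[1]:
            best = (i, e)
    return best

def mdl_accepts(pairs, i, e_split):
    """Fayyad and Irani MDL stopping rule: is the gain worth the extra cut?"""
    labels = [y for _, y in pairs]
    left, right = labels[:i], labels[i:]
    n = len(labels)
    k, k1, k2 = len(set(labels)), len(set(left)), len(set(right))
    e, e1, e2 = entropy(labels), entropy(left), entropy(right)
    gain = e - e_split
    delta = math.log2(3 ** k - 2) - (k * e - k1 * e1 - k2 * e2)
    return gain > (math.log2(n - 1) + delta) / n

def ediv(pairs):
    """Return the lowest value of each bin found by recursive splitting."""
    def recurse(part):
        cut = best_cut(part)
        if cut and mdl_accepts(part, *cut):
            i, _ = cut
            return recurse(part[:i]) + recurse(part[i:])
        return [part[0][0]]
    return recurse(sorted(pairs))

data = [(1, "X"), (2, "X"), (3, "X"), (4, "X"),
        (11, "Y"), (12, "Y"), (13, "Y"), (14, "Y")]
print(ediv(data))  # two bins, starting at 1 and at 11
```

On the slide's input, the only split that survives the MDL test is the one between 4 and 11, so the bins start at 1 and 11.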
![Page 6: Know thy tools](https://reader034.fdocuments.us/reader034/viewer/2022042714/554a0f82b4c90507558b4b7c/html5/thumbnails/6.jpg)
6
It doesn't matter what you do but does matter who does it!
Martin Shepperd, Brunel University, West London, UK
http://crest.cs.ucl.ac.uk/?id=3695
![Page 7: Know thy tools](https://reader034.fdocuments.us/reader034/viewer/2022042714/554a0f82b4c90507558b4b7c/html5/thumbnails/7.jpg)
7
Systematic Review
• Conducted by Tracy Hall and David Bowes
  – T. Hall, S. Beecham, D. Bowes, D. Gray, and S. Counsell. “A systematic literature review on fault prediction performance in software engineering”, accepted for publication in TSE (download from BURA).
• Located 208 relevant primary studies
• Due to reporting requirements, used 18 studies that contain 194 results
  – binary classifiers, confusion matrix, context details
![Page 8: Know thy tools](https://reader034.fdocuments.us/reader034/viewer/2022042714/554a0f82b4c90507558b4b7c/html5/thumbnails/8.jpg)
8
Matthews correlation coefficient
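The slide body did not survive extraction, so here is a small sketch of the standard definition, computed from the four cells of a binary confusion matrix: MCC = (TP*TN – FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The function name `mcc` and the choice to return 0 when the denominator is zero are assumptions made for illustration; they are not taken from the talk.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect prediction).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # assumed convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom

print(mcc(tp=4, tn=4, fp=0, fn=0))  # perfect classifier
print(mcc(tp=2, tn=2, fp=2, fn=2))  # chance-level classifier
```

Unlike accuracy, MCC uses all four cells, which makes results from studies with very different class balances easier to compare.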
![Page 9: Know thy tools](https://reader034.fdocuments.us/reader034/viewer/2022042714/554a0f82b4c90507558b4b7c/html5/thumbnails/9.jpg)
9
(iv) Research Group
![Page 10: Know thy tools](https://reader034.fdocuments.us/reader034/viewer/2022042714/554a0f82b4c90507558b4b7c/html5/thumbnails/10.jpg)
10
ANOVA Results
| Factor | % of variance |
|---|---|
| Author group | 61% |
| Metric family | 3% |
| Author/metric | 9% |
| Everything else | 8% (but not significant) |
| Residuals | 19% |
![Page 11: Know thy tools](https://reader034.fdocuments.us/reader034/viewer/2022042714/554a0f82b4c90507558b4b7c/html5/thumbnails/11.jpg)
11
Final word
We cannot ignore the fact that the main determinant of a validation study result is which research group undertakes it.