1 The Relationship of Cyclomatic Complexity, Essential Complexity and Error Rates Mike Chapman and...
-
Upload
briana-watts -
Category
Documents
-
view
213 -
download
0
Transcript of 1 The Relationship of Cyclomatic Complexity, Essential Complexity and Error Rates Mike Chapman and...
1
The Relationship of Cyclomatic Complexity, Essential Complexity and
Error Rates
Mike Chapman and Dan [email protected]
2
Hypothesis
Essential complexity is a better predictor of error than cyclomatic complexity.
3
• Cyclomatic Complexity, or v(G), measures the number of linearly independent paths through a given program.
• Essential complexity is the measure of the degree to which a module contains unstructured constructs.
• Module design complexity, iv(G), is a measure of the module’s decision structure as it relates to calls to other modules.
Complexity Definitions
4
Lines of Code
• McCabe finds the open bracket and starts counting until it finds the close bracket. This metric was used to show LOC error prediction in this study.
5
The Dataset• 312K executable lines of C code• 8 years of problem reports (life of project)• 1652 Software problem reports whose fix included
a change to at least one software module.• 2078 modules changed at least once due to a
problem report• 4221 total module changes as the result of
problem reports• 12,094 modules as identified by McCabe IQ
6
What Was Eliminated
• A CSCI that had recently been redesigned using C++
• 1007 modules that no longer exist in the baseline or were ambiguous in their naming
• A CSCI that consisted of multiple directories of essentially the same code
• Leaving 11,494 modules with 3154 errors
7
Number of Modules
Number of Modules vs. Lines of Code
1
10
100
1000
1 10 100 1000 10000
Number of Modules vs. Cyclomatic Complexity
1
10
100
1000
10000
1 10 100 1000
Number of Modules vs. Essential Complexity
1
10
100
1000
10000
1 10 100 1000
Number of Modules vs. Design Complexity
1
10
100
1000
10000
1 10 100 1000
8
Error Rate
Error Rate vs. Lines of Code
0
5
10
15
20
25
30
1 10 100 1000 10000
Error Rate vs. Cyclomatic Complexity
0
5
10
15
20
25
30
1 10 100 1000
Error Rate vs. Essential Complexity
0
5
10
15
20
25
30
1 10 100 1000
Error Rate vs. Design Complexity
0
5
10
15
20
25
30
1 10 100 1000
9
Error Prediction
• What does it mean to say that a metric is a good predictor of error(s)?
• Or that one is better than another?
• How does one select a threshold?
10
A Closer Look at v(G)Error Rate vs. Cyclomatic Complexity
0
1
2
3
4
0 10 20 30 40
11
A Closer Look at ev(G)
Error Rate vs. Essential Complexity
0
1
2
3
4
0 10 20 30 40
12
Thresholds
• Compare the error rate among modules above a threshold with that of those below
Error Rate Above/Below v(G) Threshold
0
1
2
3
4
5
6
7
0 5 10 15 20 25 30
Error Rate Above/Below ev(G) Threshold
0
1
2
3
4
5
6
7
0 5 10 15 20 25 30
13
Probability of Error
Probability of Error vs. v(G)
0
0.2
0.4
0.6
0.8
1
1 10 100 1000
Probability of Error vs. ev(G)
0
0.2
0.4
0.6
0.8
1
1 10 100 1000
Probability of Error vs. Lines of Code
0
0.2
0.4
0.6
0.8
1
1 10 100 1000 10000
Probability of Error vs. iv(G)
0
0.2
0.4
0.6
0.8
1
1 10 100 1000
14
A Closer Look at v(G)
Probability of Error vs. v(G)
0
0.2
0.4
0.6
0.8
1
0 10 20 30 40
15
A Closer Look at ev(G)
Probability of Error vs. ev(G)
0
0.2
0.4
0.6
0.8
1
0 10 20 30 40
16
Thresholds
• Compare the error probability for modules above a threshold with those below
Pr. of Error Above/Below v(G) Threshold
0
0.5
1
1.5
2
2.5
3
3.5
0 5 10 15 20 25 30
Pr. of Error Above/Below ev(G) Threshold
0
0.5
1
1.5
2
2.5
3
3.5
0 5 10 15 20 25 30
17
Conclusions
• Both cyclomatic and essential complexities indicate errors
• Cyclomatic is a better indicator
• The McCabe thresholds (10 and 4) aren’t bad, but 20 and 10 look better
18
Future Work
• This analysis should be carried out for additional programs.
• We have proposed to investigate the relation between software structures and errors
• IV&V Facility Metrics Data Program will provide metrics and error data to the research community