1 The Relationship of Cyclomatic Complexity, Essential Complexity and Error Rates Mike Chapman and...

18
1 The Relationship of Cyclomatic Complexity, Essential Complexity and Error Rates Mike Chapman and Dan Solomon [email protected] [email protected]

Transcript of 1 The Relationship of Cyclomatic Complexity, Essential Complexity and Error Rates Mike Chapman and...

Page 1: 1 The Relationship of Cyclomatic Complexity, Essential Complexity and Error Rates Mike Chapman and Dan Solomon chapman@ivv.nasa.gov solomon@ivv.nasa.gov.

1

The Relationship of Cyclomatic Complexity, Essential Complexity and

Error Rates

Mike Chapman and Dan [email protected]

[email protected]

Page 2: 1 The Relationship of Cyclomatic Complexity, Essential Complexity and Error Rates Mike Chapman and Dan Solomon chapman@ivv.nasa.gov solomon@ivv.nasa.gov.

2

Hypothesis

Essential complexity is a better predictor of error than cyclomatic complexity.

Page 3: 1 The Relationship of Cyclomatic Complexity, Essential Complexity and Error Rates Mike Chapman and Dan Solomon chapman@ivv.nasa.gov solomon@ivv.nasa.gov.

3

• Cyclomatic Complexity, or v(G), measures the number of linearly independent paths through a given program.

• Essential complexity is the measure of the degree to which a module contains unstructured constructs.

• Module design complexity, iv(G), is a measure of the module’s decision structure as it relates to calls to other modules.

Complexity Definitions

Page 4: 1 The Relationship of Cyclomatic Complexity, Essential Complexity and Error Rates Mike Chapman and Dan Solomon chapman@ivv.nasa.gov solomon@ivv.nasa.gov.

4

Lines of Code

• McCabe finds the open bracket and starts counting until it finds the close bracket. This metric was used to show LOC error prediction in this study.

Page 5: 1 The Relationship of Cyclomatic Complexity, Essential Complexity and Error Rates Mike Chapman and Dan Solomon chapman@ivv.nasa.gov solomon@ivv.nasa.gov.

5

The Dataset• 312K executable lines of C code• 8 years of problem reports (life of project)• 1652 Software problem reports whose fix included

a change to at least one software module.• 2078 modules changed at least once due to a

problem report• 4221 total module changes as the result of

problem reports• 12,094 modules as identified by McCabe IQ

Page 6: 1 The Relationship of Cyclomatic Complexity, Essential Complexity and Error Rates Mike Chapman and Dan Solomon chapman@ivv.nasa.gov solomon@ivv.nasa.gov.

6

What Was Eliminated

• A CSCI that had recently been redesigned using C++

• 1007 modules that no longer exist in the baseline or were ambiguous in their naming

• A CSCI that consisted of multiple directories of essentially the same code

• Leaving 11,494 modules with 3154 errors

Page 7: 1 The Relationship of Cyclomatic Complexity, Essential Complexity and Error Rates Mike Chapman and Dan Solomon chapman@ivv.nasa.gov solomon@ivv.nasa.gov.

7

Number of Modules

Number of Modules vs. Lines of Code

1

10

100

1000

1 10 100 1000 10000

Number of Modules vs. Cyclomatic Complexity

1

10

100

1000

10000

1 10 100 1000

Number of Modules vs. Essential Complexity

1

10

100

1000

10000

1 10 100 1000

Number of Modules vs. Design Complexity

1

10

100

1000

10000

1 10 100 1000

Page 8: 1 The Relationship of Cyclomatic Complexity, Essential Complexity and Error Rates Mike Chapman and Dan Solomon chapman@ivv.nasa.gov solomon@ivv.nasa.gov.

8

Error Rate

Error Rate vs. Lines of Code

0

5

10

15

20

25

30

1 10 100 1000 10000

Error Rate vs. Cyclomatic Complexity

0

5

10

15

20

25

30

1 10 100 1000

Error Rate vs. Essential Complexity

0

5

10

15

20

25

30

1 10 100 1000

Error Rate vs. Design Complexity

0

5

10

15

20

25

30

1 10 100 1000

Page 9: 1 The Relationship of Cyclomatic Complexity, Essential Complexity and Error Rates Mike Chapman and Dan Solomon chapman@ivv.nasa.gov solomon@ivv.nasa.gov.

9

Error Prediction

• What does it mean to say that a metric is a good predictor of error(s)?

• Or that one is better than another?

• How does one select a threshold?

Page 10: 1 The Relationship of Cyclomatic Complexity, Essential Complexity and Error Rates Mike Chapman and Dan Solomon chapman@ivv.nasa.gov solomon@ivv.nasa.gov.

10

A Closer Look at v(G)Error Rate vs. Cyclomatic Complexity

0

1

2

3

4

0 10 20 30 40

Page 11: 1 The Relationship of Cyclomatic Complexity, Essential Complexity and Error Rates Mike Chapman and Dan Solomon chapman@ivv.nasa.gov solomon@ivv.nasa.gov.

11

A Closer Look at ev(G)

Error Rate vs. Essential Complexity

0

1

2

3

4

0 10 20 30 40

Page 12: 1 The Relationship of Cyclomatic Complexity, Essential Complexity and Error Rates Mike Chapman and Dan Solomon chapman@ivv.nasa.gov solomon@ivv.nasa.gov.

12

Thresholds

• Compare the error rate among modules above a threshold with that of those below

Error Rate Above/Below v(G) Threshold

0

1

2

3

4

5

6

7

0 5 10 15 20 25 30

Error Rate Above/Below ev(G) Threshold

0

1

2

3

4

5

6

7

0 5 10 15 20 25 30

Page 13: 1 The Relationship of Cyclomatic Complexity, Essential Complexity and Error Rates Mike Chapman and Dan Solomon chapman@ivv.nasa.gov solomon@ivv.nasa.gov.

13

Probability of Error

Probability of Error vs. v(G)

0

0.2

0.4

0.6

0.8

1

1 10 100 1000

Probability of Error vs. ev(G)

0

0.2

0.4

0.6

0.8

1

1 10 100 1000

Probability of Error vs. Lines of Code

0

0.2

0.4

0.6

0.8

1

1 10 100 1000 10000

Probability of Error vs. iv(G)

0

0.2

0.4

0.6

0.8

1

1 10 100 1000

Page 14: 1 The Relationship of Cyclomatic Complexity, Essential Complexity and Error Rates Mike Chapman and Dan Solomon chapman@ivv.nasa.gov solomon@ivv.nasa.gov.

14

A Closer Look at v(G)

Probability of Error vs. v(G)

0

0.2

0.4

0.6

0.8

1

0 10 20 30 40

Page 15: 1 The Relationship of Cyclomatic Complexity, Essential Complexity and Error Rates Mike Chapman and Dan Solomon chapman@ivv.nasa.gov solomon@ivv.nasa.gov.

15

A Closer Look at ev(G)

Probability of Error vs. ev(G)

0

0.2

0.4

0.6

0.8

1

0 10 20 30 40

Page 16: 1 The Relationship of Cyclomatic Complexity, Essential Complexity and Error Rates Mike Chapman and Dan Solomon chapman@ivv.nasa.gov solomon@ivv.nasa.gov.

16

Thresholds

• Compare the error probability for modules above a threshold with those below

Pr. of Error Above/Below v(G) Threshold

0

0.5

1

1.5

2

2.5

3

3.5

0 5 10 15 20 25 30

Pr. of Error Above/Below ev(G) Threshold

0

0.5

1

1.5

2

2.5

3

3.5

0 5 10 15 20 25 30

Page 17: 1 The Relationship of Cyclomatic Complexity, Essential Complexity and Error Rates Mike Chapman and Dan Solomon chapman@ivv.nasa.gov solomon@ivv.nasa.gov.

17

Conclusions

• Both cyclomatic and essential complexities indicate errors

• Cyclomatic is a better indicator

• The McCabe thresholds (10 and 4) aren’t bad, but 20 and 10 look better

Page 18: 1 The Relationship of Cyclomatic Complexity, Essential Complexity and Error Rates Mike Chapman and Dan Solomon chapman@ivv.nasa.gov solomon@ivv.nasa.gov.

18

Future Work

• This analysis should be carried out for additional programs.

• We have proposed to investigate the relation between software structures and errors

• IV&V Facility Metrics Data Program will provide metrics and error data to the research community