House of Cards: Code Smells in Open-source C# Repositories

Tushar Sharma, Marios Fragkoulis, and Diomidis Spinellis

Funded by the SENECA project under the Marie Skłodowska-Curie Actions

Code Smells

…certain structures in the code that suggest (sometimes they scream for) the possibility of refactoring.

- Kent Beck

http://www.tusharma.in/smells/

A Taxonomy of Software Smells

Identified gap

• Existing mining studies on smells lack
  • scale (number of subject systems analyzed), and
  • breadth (number of smells detected)

• Existing studies are performed solely on Java subject systems

Overview of the study

Research questions

1,988 open-source repositories

Designite

19 design smells and 11 implementation smells

Results

Detected design smells

Abstraction smells:
• Duplicate Abstraction
• Imperative Abstraction
• Multifaceted Abstraction
• Unnecessary Abstraction
• Unutilized Abstraction

Encapsulation smells:
• Deficient Encapsulation
• Unexploited Encapsulation

Modularization smells:
• Broken Modularization
• Cyclically-dependent Modularization
• Hub-like Modularization
• Insufficient Modularization

Hierarchy smells:
• Broken Hierarchy
• Cyclic Hierarchy
• Deep Hierarchy
• Missing Hierarchy
• Multipath Hierarchy
• Rebellious Hierarchy
• Unfactored Hierarchy
• Wide Hierarchy

Detected implementation smells
• Complex Conditional
• Complex Method
• Duplicate Code
• Empty Catch Block
• Long Identifier
• Long Method
• Long Parameter List
• Long Statement
• Magic Number
• Missing Default
• Virtual Method Call from Constructor

Mining GitHub C# repositories

• Repositories: 1,988
• Number of types: 436,832
• Number of methods: 2,265,971
• Lines of code (C#): 49,303,314
• Median LOC: 4,391

RQ1. Frequency of smells

RQ1. What is the distribution of design and implementation smells in C# code?

Abstraction smells:
• Unutilized Abstraction: 90,786
• Duplicate Abstraction: 73,992
• Unnecessary Abstraction: 44,583
• Imperative Abstraction: 11,790
• Multifaceted Abstraction: 1,236

Encapsulation smells:
• Deficient Encapsulation: 30,214
• Unexploited Encapsulation: 6,964

Modularization smells:
• Cyclically-dependent Modularization: 52,436
• Insufficient Modularization: 26,429
• Broken Modularization: 15,624
• Hub-like Modularization: 676

Hierarchy smells:
• Unfactored Hierarchy: 20,962
• Broken Hierarchy: 20,332
• Rebellious Hierarchy: 11,794
• Cyclic Hierarchy: 4,342
• Wide Hierarchy: 3,140
• Missing Hierarchy: 2,598
• Multipath Hierarchy: 1,454
• Deep Hierarchy: 179
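To see how the design smell instances split across the four categories, the counts above can be tallied as in the following sketch. The numbers are transcribed from this slide and the grouping follows the "Detected design smells" list; the names `DESIGN_SMELLS` and `category_totals` are illustrative and not part of the study's tooling.

```python
# Instance counts transcribed from the RQ1 slide, grouped by the
# categories given in the "Detected design smells" list.
DESIGN_SMELLS = {
    "Abstraction": {
        "Unutilized Abstraction": 90_786,
        "Duplicate Abstraction": 73_992,
        "Unnecessary Abstraction": 44_583,
        "Imperative Abstraction": 11_790,
        "Multifaceted Abstraction": 1_236,
    },
    "Encapsulation": {
        "Deficient Encapsulation": 30_214,
        "Unexploited Encapsulation": 6_964,
    },
    "Modularization": {
        "Cyclically-dependent Modularization": 52_436,
        "Insufficient Modularization": 26_429,
        "Broken Modularization": 15_624,
        "Hub-like Modularization": 676,
    },
    "Hierarchy": {
        "Unfactored Hierarchy": 20_962,
        "Broken Hierarchy": 20_332,
        "Rebellious Hierarchy": 11_794,
        "Cyclic Hierarchy": 4_342,
        "Wide Hierarchy": 3_140,
        "Missing Hierarchy": 2_598,
        "Multipath Hierarchy": 1_454,
        "Deep Hierarchy": 179,
    },
}


def category_totals(smells):
    """Sum the instance counts of every smell within each category."""
    return {category: sum(counts.values()) for category, counts in smells.items()}


totals = category_totals(DESIGN_SMELLS)
grand_total = sum(totals.values())

# Print categories from most to least frequent, with their share of the total.
for category, total in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{category:15s} {total:>8,d} {total / grand_total:6.1%}")
```

Under this tally, abstraction smells make up just over half of the counted design smell instances.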

Implementation smells:
• Magic Number: 2,993,353
• Long Statement: 462,491
• Complex Method: 95,244
• Long Parameter List: 79,899
• Missing Default: 23,497
• Complex Conditional: 21,643
• Duplicate Code: 17,921
• Long Method: 17,521
• Empty Catch Block: 14,560
• Long Identifier: 7,741
• Virtual Method Call from Constructor: 4,545

On average, one magic number smell per 16 lines of code!
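The "one per 16 lines" figure can be checked against the totals reported earlier in the deck. The sketch below simply divides the total C# lines of code by the number of Magic Number instances; the constant and function names are illustrative.

```python
# Totals as reported in the "Mining GitHub C# repositories" and RQ1 slides.
TOTAL_LOC = 49_303_314
MAGIC_NUMBER_INSTANCES = 2_993_353


def lines_per_instance(total_loc, instances):
    """Average number of lines of code per detected smell instance."""
    return total_loc / instances


ratio = lines_per_instance(TOTAL_LOC, MAGIC_NUMBER_INSTANCES)
print(f"One magic number smell per {ratio:.1f} lines of code")
```

The ratio comes out slightly above 16, which is consistent with the rounded figure on the slide.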

RQ2. Inter-category co-occurrence

RQ2. What is the relationship between the occurrence of design smells and implementation smells?

[Figure: implementation smells (y-axis, 0 to 12) plotted against design smells (x-axis, 0 to 20), with a legend of counts]

Co-occurrence between smell instances: ρ = 0.78 (p-value < 2.2e-16)

Co-occurrence between smell types: ρ = 0.80 (p-value < 2.2e-16)

The results emphasize the need to pay attention to smells at all granularities.
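The slide does not say which coefficient ρ denotes. Assuming it is Spearman's rank correlation, a value of this kind could be computed from per-repository smell counts roughly as follows. This is a minimal pure-Python sketch; the per-repository counts at the bottom are hypothetical and only illustrate the call.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def spearman_rho(xs, ys):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical per-repository counts (not data from the study).
design_counts = [3, 10, 25, 40, 120]
implementation_counts = [20, 35, 90, 80, 400]
print(round(spearman_rho(design_counts, implementation_counts), 2))
```

A rank-based coefficient is a natural fit for count data like this because it does not assume a linear relationship or normally distributed values.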

[Figure: implementation smells (y-axis, 0 to 1000) plotted against design smells (x-axis, 0 to 300)]

RQ3. Intra-category co-occurrence

RQ3. Is the principle of coexistence applicable to smells in C# projects?

\[
C(s_1, s_2) = \frac{n_1 \ast n_2}{N}
\]

RQ4. Smell density and project size

RQ4. Does smell density depend on the size of the C# repository?

[Figure: design smells density (y-axis, 0 to 100) plotted against LOC (x-axis, 0 to 200,000)]

[Figure: implementation smells density (y-axis, 0 to 500) plotted against LOC (x-axis, 0 to 200,000)]

ρ = -0.25 (p-value < 2.2e-16) and ρ = 0.27 (p-value < 2.2e-16), respectively

Each C# class that you work with has, on average, approximately
• 2 design smells (113 * 14.7 / 1000 ≈ 1.66), and
• 6 implementation smells (113 * 55.8 / 1000 ≈ 6.3)
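The arithmetic behind these two estimates can be reproduced as below. The sketch assumes that 113 is the mean number of lines per type (total LOC divided by the number of types reported earlier) and that 14.7 and 55.8 are smell densities per 1,000 lines of code; the slide does not spell out either assumption, and the names used are illustrative.

```python
# Totals as reported in the "Mining GitHub C# repositories" slide.
TOTAL_LOC = 49_303_314
TOTAL_TYPES = 436_832

# Densities quoted on the slide; assumed here to be smells per 1,000 LOC.
DESIGN_DENSITY_PER_KLOC = 14.7
IMPLEMENTATION_DENSITY_PER_KLOC = 55.8


def smells_per_type(loc_per_type, density_per_kloc):
    """Expected smell count for a type of the given size."""
    return loc_per_type * density_per_kloc / 1000


# Assumption: the slide's 113 is the rounded mean LOC per type.
loc_per_type = round(TOTAL_LOC / TOTAL_TYPES)

design = smells_per_type(loc_per_type, DESIGN_DENSITY_PER_KLOC)
implementation = smells_per_type(loc_per_type, IMPLEMENTATION_DENSITY_PER_KLOC)
print(f"LOC per type: {loc_per_type}")
print(f"Design smells per type: {design:.2f}")
print(f"Implementation smells per type: {implementation:.2f}")
```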