ICSME14 - On the Impact of Refactoring Operations on Code Quality Metrics
On the Impact of Refactoring Operations on Code Quality
Metrics
ICSME 2014
Victoria, BC, Canada
Oscar Chaparro
Gabriele Bavota
Andrian Marcus
Massimiliano Di Penta
Software refactoring
Any change to the code that improves its internal structure without affecting its external behavior.
Refactoring has side effects
Improving some metrics can come at the expense of other metrics.
Developers do not know the effect of refactoring on code metrics upfront!
RIPE Refactoring Impact PrEdiction
RIPE includes a set of 89 simple, independent, and reusable prediction functions
It tells you the specific change in the metrics before applying a specific refactoring
It allows you to decide between refactoring alternatives
RIPE under the hood
RIPE is a 12 × 11 table of prediction functions: rows are the 12 refactoring operations (RO), columns are the 11 code metrics (M), and each cell f_{i,j} predicts the impact of operation RO_i on metric M_j (not every cell is defined).

f_{RO,M}(Code, R) = m_p

Given the current code and the parameters R of a planned refactoring, the function returns m_p, the predicted post-refactoring value of the metric.

The prediction functions are:
- Heuristic-based
- Defined based on:
  - Fowler's definitions
  - Our experience
  - A study of common cases
- Built assuming refactoring independence
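The table-of-functions design can be sketched as follows. This is a minimal illustration, not RIPE's actual code: the function names and the two prediction formulas are simplified assumptions.

```python
# Minimal sketch of RIPE's table-of-functions idea. All names and the two
# formulas below are illustrative assumptions, not RIPE's actual predictors.

def loc_after_extract_method(loc, extracted_stmts):
    # Assumed model: the extracted statements leave the source method,
    # replaced by a single call statement.
    return loc - extracted_stmts + 1

def nom_after_extract_class(nom, moved_methods):
    # Assumed model: the moved methods leave the source class.
    return nom - moved_methods

# One prediction function per (refactoring operation, metric) cell.
PREDICTORS = {
    ("EM", "LOC"): loc_after_extract_method,
    ("EC", "NOM"): nom_after_extract_class,
}

def predict(ro, metric, current_value, **refactoring_params):
    """Predicted metric value *before* actually applying the refactoring."""
    return PREDICTORS[(ro, metric)](current_value, **refactoring_params)

print(predict("EM", "LOC", 40, extracted_stmts=10))  # 31
print(predict("EC", "NOM", 12, moved_methods=4))     # 8
```

The lookup-table shape is what makes the functions independent and reusable: each cell can be evaluated on its own, before any refactoring is applied.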
Refactoring operations in RIPE
Refactoring Operation                        Category
Extract Method (EM)                          Composing Methods
Inline Method (IM)                           Composing Methods
Replace Method w. Method Obj. (RMMO)         Composing Methods
Pull Up Field (PUF)                          Dealing with Generalization
Pull Up Method (PUM)                         Dealing with Generalization
Push Down Field (PDF)                        Dealing with Generalization
Push Down Method (PDM)                       Dealing with Generalization
Replace Delegation with Inheritance (RDI)    Dealing with Generalization
Replace Inheritance with Delegation (RID)    Dealing with Generalization
Extract Class (EC)                           Moving Features
Move Field (MF)                              Moving Features
Move Method (MM)                             Moving Features
Code metrics in RIPE
Code Metric                              Code Property
Response for a Class (RFC)               Coupling
Coupling Between Objects (CBO)           Coupling
Data Abstraction Coupling (DAC)          Coupling
Message Passing Coupling (MPC)           Coupling
Lines of Code (LOC)                      Size
Number of Methods (NOM)                  Size
McCabe's Cyclomatic Number (CYCLO)       Complexity
Lack of Cohesion of Methods 2 (LCOM2)    Cohesion
Lack of Cohesion of Methods 5 (LCOM5)    Cohesion
Number of Children (NOC)                 Inheritance
Depth of Inheritance Tree (DIT)          Inheritance
Example of how RIPE works
Predicting how an Extract Class (EC) refactoring changes the Coupling Between Objects (CBO) metric.

Before the refactoring: CBO(SourceClass) = 5

CBO change after EC – source class: two of the couplings move to the extracted class, and the source gains one coupling to the new class:
CBO(SourceClass) = 5 - 2 + 1 = 4

CBO change after EC – target class: the extracted class takes on the two moved couplings plus one coupling to the source:
CBO(TargetClass) = 2 + 1 = 3
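The slide's arithmetic can be written out as a pair of tiny functions. This is a sketch: the coupling counts come from the slide's example, and the model (moved members carry their couplings; one coupling is added between source and target) is inferred from that arithmetic.

```python
# Sketch of the Extract Class (EC) effect on CBO, following the slide's
# example. Model inferred from the arithmetic: the source class loses the
# couplings that move with the extracted members but gains one coupling to
# the new target class; the target gains the moved couplings plus one
# back-coupling to the source.

def cbo_source_after_ec(cbo_before, moved_couplings):
    return cbo_before - moved_couplings + 1

def cbo_target_after_ec(moved_couplings):
    return moved_couplings + 1

print(cbo_source_after_ec(5, 2))  # 4, as on the slide
print(cbo_target_after_ec(2))     # 3, as on the slide
```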
Evaluation goal and process
RQ: What is RIPE’s accuracy in estimating the impact of refactoring operations on code metrics?
Definition: what is accuracy? how is it computed?
Process: how to evaluate accuracy?
Data: code and refactorings
Evaluation metrics
‐ Accuracy: ratio of perfect predictions over all predictions (level of perfection)
‐ Deviation: gap between the predicted and the actual metric value (level of imperfection)
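Both measures can be computed directly from paired predicted/actual values. This is a sketch based on the slide's definitions; the exact relative-deviation formula is an assumption, since the slides do not give one.

```python
# Accuracy: fraction of perfect predictions. Deviation: relative gap between
# the predicted and actual metric values (exact formula assumed).

def accuracy(predicted, actual):
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)

def avg_deviation(predicted, actual):
    devs = [abs(p - a) / a for p, a in zip(predicted, actual) if a != 0]
    return sum(devs) / len(devs)

pred = [4, 3, 10, 7]   # hypothetical predicted metric values
act  = [4, 3, 12, 8]   # hypothetical actual post-refactoring values
print(accuracy(pred, act))  # 0.5: two of the four predictions are perfect
```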
Evaluation process
1. Measure the metrics of the code
2. Predict the metric changes with RIPE
3. Apply (seeded) or extract (existing) refactorings
4. Measure the metrics again
5. Compare the actual metric values against the predictions
Input: a list of refactorings
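The steps above can be sketched as a loop. The helper names (`measure`, `apply_refactoring`, `predict`) are placeholders for the study's actual measurement and refactoring tooling.

```python
# Sketch of the evaluation loop. `measure`, `apply_refactoring`, and
# `predict` are placeholders, not the study's actual tools.

def evaluate(code, refactorings, measure, apply_refactoring, predict):
    results = []
    for ro in refactorings:
        before = measure(code)               # 1. measure metrics of code
        predicted = predict(ro, before)      # 2. predict changes with RIPE
        code = apply_refactoring(code, ro)   # 3. apply/extract refactoring
        after = measure(code)                # 4. measure metrics again
        results.append((predicted, after))   # 5. compare actual vs predicted
    return results
```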
Seeded refactorings
- Goal: have a uniform distribution of refactorings
- Procedure: two PhD students identified and applied the refactorings
- Projects: ArgoUML and aTunes
Existing refactorings
- Goal: validate the approach on everyday changes
- Procedure: a tool that mined the versioning logs
- Projects: 13 open-source systems
Software projects
Results summary
Dataset                            Accuracy   Deviation (Med)   Deviation (Avg)
Seeded refactorings                68%        0%                12%
Existing refactorings              22%        14%               41%
Seeded and existing refactorings   38%        5%                31%
Metric analysis - seeded refactorings
[Bar chart: prediction accuracy per code metric. DIT, NOC, NOM: ~90%; RFC, CBO, LOC: ~60%.]
- High accuracy for DIT, NOC, NOM: coarse granularity and no ambiguity on how refactorings impact these metrics
- Lower accuracy for RFC, CBO, LOC: EC and RMMO were difficult to predict, since specific changes are assumed but not needed
- Low deviation: 15% avg and 2% median
Refactoring analysis - seeded refactorings
[Bar chart: prediction accuracy per refactoring operation. PDM, MF, RID: ~80%; RMMO, EC: ~55%.]
Accuracy
- High accuracy for PDM, MF, RID: there are not many refactoring alternatives
- Why are they not 100% accurate? RIPE's implementation is conservative
- Lower accuracy for RMMO, EC: there are many implementation alternatives, and deviation is higher (56% avg and 12% median)
- Low deviation for PDM, MF, RID: 5% avg and 0% median
Conclusions
Some metrics are coarse, linear, and easy to interpret (e.g., NOM); others are fine-grained and less intuitive (e.g., LCOM5)
RMMO and EC refactorings are difficult to predict, as they have many implementation alternatives in practice
RIPE's evaluation showed good prediction performance: 38% perfect predictions and low deviation (31% avg & 5% median)
It is possible to predict the specific change of metrics resulting from refactoring through RIPE
Future work
Improve our prediction functions and include more metrics and refactorings
Conduct more studies to understand changes in code quality metrics and properties in practice
Move toward predicting the metric changes of composite refactorings and recommending this kind of refactoring
RIPE in refactoring decision making
[Diagram: a Blob class containing Method 1, Method 2, Method 3, Method 4, … can be split via Extract Class into a source class Cs and a target class Ct in several ways; in the alternative where Ct receives only Method 2, Ct has low cohesion. RIPE's predictions help decide between such alternatives.]
Deviation analysis
‐ Some metric predictions have high accuracy and high deviation (e.g., NOM or DAC)
‐ What is the meaning of deviation in practice?
  ⁻ Some metrics are coarse, linear, and easy to interpret (e.g., NOM)
  ⁻ Others are fine-grained and less intuitive (e.g., LCOM5)

Metric   Avg deviation   Actual metric deviation
NOM      63%             4
LCOM5    27%             0.104

For NOM, only 4 methods are being "mispredicted"; for LCOM5, the number of involved code elements can be high (field accesses range from 9 to 30)
(ArgoUML results)
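The NOM row can be reproduced with small numbers (the class size here is illustrative, not from the dataset): an absolute error of 4 methods yields a large relative deviation whenever the actual NOM is small.

```python
# Illustrative only: a small absolute error produces a large relative
# deviation when the metric's actual value is small.
actual_nom, predicted_nom = 6, 10                 # hypothetical class
absolute_error = abs(predicted_nom - actual_nom)  # 4 methods "mispredicted"
relative_deviation = absolute_error / actual_nom  # 4/6, roughly 67%
print(absolute_error, round(relative_deviation, 2))  # 4 0.67
```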