Toward Reverse Engineering of VBA Based Excel Spreadsheets Applications
Department of Electrical Engineering and Information Technologies
University of Naples “Federico II”, Italy
Domenico Amalfitano
Nicola Amatucci
Vincenzo De Simone
Anna Rita Fasolino
Porfirio Tramontana
2nd Workshop on Software Engineering Methods in Spreadsheets
Florence, Italy 18th May, 2015
Context and Motivations
Context
◦ Reverse Engineering of Excel Spreadsheet Applications
Motivation
◦ Propose techniques and tools to support the comprehension of VBA based Spreadsheet applications
Why reverse engineering spreadsheet applications?
Spreadsheets are widely adopted
◦ for different purposes: calculation, storage, collaboration, etc.
◦ in different domains: business, automotive, engineering, science, medical, etc.
Need for their comprehension in different scenarios
◦ individual comprehension
◦ knowledge transfer
◦ re-documentation
◦ maintenance and evolution
◦ migration towards different architectures
Excel Spreadsheet Applications Comprehension Issues
Poor or absent internal and external documentation
No clear distinction between different layers
◦ Data, Business Logic, Presentation
Spreadsheet applications can be complex
◦ Data spread over different sheets
◦ Data dependencies through formulas
◦ Use of VBA code
  • Enhanced user interface (User Forms, shapes, controls)
  • User defined functionalities (Macros) and functions
  • Handling of events related to default or user defined elements
  • Direct and indirect dependencies through VBA code
Proposed Reverse Engineering Approach
Data Model Reverse Engineering
◦ performed to reconstruct a conceptual model of the data stored in the spreadsheet application.
User Interface and Business Logic Reverse Engineering
◦ performed to comprehend both the structure and the behavior of the User Interface (UI) and the functionalities provided by the application.
Data Model Reverse Engineering
We propose a heuristic-based approach
◦ based on our experience in an industrial domain [1,2,3]
◦ a process made of seven sequential steps to infer, with gradual refinements, a UML class diagram of the considered spreadsheet application
◦ in each step, one or more heuristic rules are executed.
Heuristic rules
◦ adapted from rules defined in the literature or
◦ defined by us, exploiting some formatting properties typical of spreadsheet applications and analyzing the cell contents
1. Amalfitano, D.; Fasolino, A.R.; Maggio, V.; Tramontana, P.; Di Mare, G.; Ferrara, F.; Scala, S., “Migrating legacy spreadsheets-based systems to Web MVC architecture: An industrial case study” - CSMR-WCRE 2014
2. Amalfitano, D.; Fasolino, A.R.; Maggio, V.; Tramontana, P.; De Simone, V., “Reverse Engineering of Data Models from Legacy Spreadsheets-Based Systems: An Industrial Case Study” - SEBD 2014
3. Amalfitano, D.; Fasolino, A.R.; Tramontana, P.; De Simone, V.; Di Mare, G.; Scala, S., “Information Extraction from Legacy Spreadsheet-based Information System - An Experience in the Automotive Context” - DATA 2014
Application of heuristics - Examples
For example, applying one of the rules to the worksheet Sheet1, two separate areas are identified. In this case we introduce a class for each area and a composition relation between these classes and the one related to the sheet.
Sheet1 is composed of Sheet1_Area1 and Sheet1_Area2
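The slide does not say which rule finds the areas. As a minimal sketch, assume an area is a block of non-empty cells that touch each other, so that two blocks separated by an empty row or column become two areas. The function names and the sample cell positions below are hypothetical and serve only to illustrate the idea.

```python
from collections import deque


def find_areas(cells):
    """Group non-empty cells into areas of edge-adjacent cells.

    `cells` is a set of (row, col) positions of non-empty cells.
    Returns a list of areas, each one a sorted list of positions.
    """
    remaining = set(cells)
    areas = []
    while remaining:
        start = min(remaining)  # top-left remaining cell, for a stable order
        remaining.remove(start)
        queue = deque([start])
        area = [start]
        while queue:
            r, c = queue.popleft()
            for nb in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if nb in remaining:
                    remaining.remove(nb)
                    queue.append(nb)
                    area.append(nb)
        areas.append(sorted(area))
    return areas


def classes_for_sheet(sheet_name, cells):
    """Return the class names and the composition relations for one sheet."""
    areas = find_areas(cells)
    area_classes = [f"{sheet_name}_Area{i}" for i in range(1, len(areas) + 1)]
    compositions = [(sheet_name, name) for name in area_classes]
    return [sheet_name] + area_classes, compositions


# Hypothetical sheet: two 2x2 blocks separated by an empty column (column 3).
sheet1 = {(1, 1), (1, 2), (2, 1), (2, 2), (1, 4), (1, 5), (2, 4), (2, 5)}
classes, compositions = classes_for_sheet("Sheet1", sheet1)
print(classes)
print(compositions)
```

Each tuple in `compositions` reads as "the first class is composed of the second one", matching the relation between Sheet1 and its two areas.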
Application of heuristics - Examples
In this other example, we applied another rule that exploits the identification of merged cells to extract different classes from the area under analysis. As the figure shows, we were able to identify two different classes in Area1 of Sheet1 and an association between them.
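The exact rule is not spelled out on the slide. A minimal sketch, assuming that each merged header cell spanning several columns yields one class, that the labels below it become its attributes, and that neighbouring classes in the same area are associated. The function name, the input layout and the sample labels are all hypothetical.

```python
def classes_from_merged_headers(area_name, merged_headers, column_labels):
    """Derive classes and associations from the merged header cells of an area.

    merged_headers: list of (title, first_col, last_col), one per merged cell.
    column_labels:  dict mapping a column index to the label found just below
                    the merged header row.
    """
    classes = {}
    for title, first_col, last_col in merged_headers:
        attributes = [
            column_labels[col]
            for col in range(first_col, last_col + 1)
            if col in column_labels
        ]
        classes[f"{area_name}_{title}"] = attributes
    names = list(classes)
    # Assumption: classes coming from adjacent merged cells are associated.
    associations = [(names[i], names[i + 1]) for i in range(len(names) - 1)]
    return classes, associations


# Hypothetical area: "Customer" merged over columns 1-2, "Order" over 3-5.
headers = [("Customer", 1, 2), ("Order", 3, 5)]
labels = {1: "Name", 2: "City", 3: "Date", 4: "Item", 5: "Price"}
classes, associations = classes_from_merged_headers("Sheet1_Area1", headers, labels)
print(classes)
print(associations)
```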
Business Logic Reverse Engineering
Definition of a model that takes into account the main elements of VBA-based spreadsheet applications and their relationships
Introduction of different views of these applications on the basis of the model we presented
Development of a tool to support the comprehension of this kind of application, providing extraction and visualization features
Modeling VBA based Spreadsheet Applications - 1
We provided different views of a spreadsheet application: here we reported the main elements composing the application and their composition and generalization relationships.
The graphical elements of a spreadsheet application are reported on the left side, the code elements on the right side.
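The diagram itself is not part of this transcript, so the following is only a sketch of how such a model could be written down, with generalization expressed as inheritance and composition as list-valued fields. All class names, field names and the sample workbook name are assumptions, not the authors' metamodel.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Element:
    """Root of the generalization hierarchy (hypothetical name)."""
    name: str


@dataclass
class GraphicalElement(Element):
    """Elements the user sees: sheets, forms, shapes, controls."""


@dataclass
class CodeElement(Element):
    """Elements made of VBA code: modules and procedures."""


@dataclass
class Procedure(CodeElement):
    pass


@dataclass
class Module(CodeElement):
    procedures: List[Procedure] = field(default_factory=list)  # composition


@dataclass
class Shape(GraphicalElement):
    pass


@dataclass
class Worksheet(GraphicalElement):
    shapes: List[Shape] = field(default_factory=list)  # composition


@dataclass
class UserForm(GraphicalElement):
    controls: List[GraphicalElement] = field(default_factory=list)


@dataclass
class Application(Element):
    worksheets: List[Worksheet] = field(default_factory=list)
    user_forms: List[UserForm] = field(default_factory=list)
    modules: List[Module] = field(default_factory=list)


# Hypothetical application with one sheet, one form and one module.
app = Application(
    "Budget.xlsm",
    worksheets=[Worksheet("Sheet1", shapes=[Shape("Button1")])],
    user_forms=[UserForm("UserForm1")],
    modules=[Module("Module1", procedures=[Procedure("LoadData")])],
)
print(app)
```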
Modeling VBA based Spreadsheet Applications - 2
In this other view we reported the main dependencies between the elements composing these applications. In particular we considered
• call dependencies between procedures
• write/read dependencies between procedures and cells
• open/hide and unload dependencies between procedures and user forms
We also reported which procedure handles the events of a given element.
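How EXACT extracts these dependencies is not described here. As a rough sketch, the three kinds of dependency can be approximated by scanning VBA source text line by line with regular expressions. The function name and the sample macro are hypothetical, and the heuristics are deliberately naive: a `Range("...")` on the left of `=` counts as a write (so a comparison inside an `If` would be misread), and anything after an apostrophe is dropped as a comment.

```python
import re

PROC_RE = re.compile(
    r"^\s*(?:Public\s+|Private\s+)?(?:Sub|Function)\s+(\w+)", re.IGNORECASE
)
END_RE = re.compile(r"^\s*End\s+(?:Sub|Function)\b", re.IGNORECASE)
RANGE_RE = re.compile(r'Range\("([^"]+)"\)', re.IGNORECASE)
FORM_RE = re.compile(r"\b(\w+)\.(Show|Hide)\b|\bUnload\s+(\w+)", re.IGNORECASE)


def extract_dependencies(source):
    """Return (procedure, kind, target) tuples found in VBA source text."""
    lines = source.splitlines()
    proc_names = set()
    for line in lines:
        m = PROC_RE.match(line)
        if m:
            proc_names.add(m.group(1))

    deps = []
    current = None  # name of the procedure being scanned
    for line in lines:
        m = PROC_RE.match(line)
        if m:
            current = m.group(1)
            continue
        if END_RE.match(line):
            current = None
            continue
        if current is None:
            continue
        code = line.split("'", 1)[0]  # naive comment stripping
        # Cells: left of "=" is a write, everything else is a read.
        lhs, eq, rhs = code.partition("=")
        if eq:
            deps += [(current, "write", a) for a in RANGE_RE.findall(lhs)]
            deps += [(current, "read", a) for a in RANGE_RE.findall(rhs)]
        else:
            deps += [(current, "read", a) for a in RANGE_RE.findall(code)]
        # User forms: Show opens, Hide hides, Unload unloads.
        for m in FORM_RE.finditer(code):
            if m.group(3):
                deps.append((current, "unload", m.group(3)))
            else:
                kind = "open" if m.group(2).lower() == "show" else "hide"
                deps.append((current, kind, m.group(1)))
        # Calls: any other known procedure name outside string literals.
        bare = re.sub(r'"[^"]*"', '""', code)
        for word in re.findall(r"\b\w+\b", bare):
            if word in proc_names and word != current:
                deps.append((current, "call", word))
    return deps


SAMPLE = '''
Sub LoadData()
    Range("B1").Value = Range("A1").Value
    Call Refresh
End Sub

Sub Refresh()
    UserForm1.Show
    Unload UserForm1
End Sub
'''

deps = extract_dependencies(SAMPLE)
for dep in deps:
    print(dep)
```

A real extractor would need a proper VBA parser to handle `Cells(...)`, named ranges, line continuations and qualified references. The sketch only shows the shape of the extracted dependency triples.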
EXACT – EXcel Application Comprehension Tool
Extraction of data from the spreadsheet application
Abstraction of the extracted data according to the proposed model
Generation of different views
EXACT – EXcel Application Comprehension Tool – provided views
Structural Views
◦ Elements composing the application and their relationships
◦ Details of an element, shown by clicking on it
User Functionalities Views
◦ List of all the user-defined functionalities present
Cell Dependencies Views
◦ List of potential dependencies between cells through formulas, validation rules and VBA code
Report & Metrics Views
◦ Metrics about the complexity of the application (worksheets, shapes, userforms, modules, procedures, LOCs)
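The code-related metrics (modules, procedures, LOCs) can be illustrated with a small sketch that works on the text of the VBA modules. It assumes that LOC means non-blank, non-comment lines, which is one common convention and not necessarily the one used by EXACT. The function names and the sample modules are hypothetical.

```python
import re

PROC_RE = re.compile(
    r"^(?:Public\s+|Private\s+)?(?:Sub|Function)\s+\w+", re.IGNORECASE
)


def module_metrics(source):
    """Count procedures and lines of code in one VBA module."""
    procedures = 0
    loc = 0
    for line in source.splitlines():
        stripped = line.strip()
        # Skip blank lines and full-line comments (' or Rem).
        if not stripped or stripped.startswith("'"):
            continue
        if stripped.lower().startswith("rem "):
            continue
        loc += 1
        if PROC_RE.match(stripped):
            procedures += 1
    return {"procedures": procedures, "loc": loc}


def application_metrics(modules):
    """Aggregate the metrics over a dict of module name -> source text."""
    per_module = {name: module_metrics(src) for name, src in modules.items()}
    return {
        "modules": len(modules),
        "procedures": sum(m["procedures"] for m in per_module.values()),
        "loc": sum(m["loc"] for m in per_module.values()),
    }


# Hypothetical modules.
modules = {
    "Module1": "' helper macros\nSub Hello()\n    MsgBox \"Hi\"\nEnd Sub\n",
    "Module2": "Function Twice(x)\n\n    Twice = 2 * x\nEnd Function\n",
}
metrics = application_metrics(modules)
print(metrics)
```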
Visualization features provided by EXACT - 1
In this view, the main structure of the application is shown.
Visualization features provided by EXACT - 2
By clicking on an element, further details on the element and a view showing its dependencies are reported.
Visualization features provided by EXACT – User Functionalities Views
This view shows all the events defined on an element (Workbook, Active Worksheet and UserForms). In this example all the events related to the selected User Form are reported.
Visualization features provided by EXACT – User Functionalities Views
By clicking on an event (in this case UserForm_Initialize) a new view appears showing the potential dependencies of the event handler.
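VBA names event handlers after the object and the event, joined by an underscore: `UserForm_Initialize` in a form module, `Workbook_Open` in the workbook module, `Worksheet_Change` in a sheet module, or `<control name>_Click` for a control. A minimal sketch of listing the handlers of a module from its procedure names follows. The function name, the control name `OkButton` and the sample procedure names are hypothetical, and name matching is case-sensitive here although VBA itself is not.

```python
def event_handlers(module_kind, procedure_names, control_names=()):
    """List (owner, event) pairs for procedures that look like event handlers.

    module_kind:   "Workbook", "Worksheet" or "UserForm".
    control_names: names of the controls placed on the sheet or form.
    """
    # Longest owner first, so "btn_ok" wins over "btn" for "btn_ok_Click".
    owners = sorted({module_kind, *control_names}, key=len, reverse=True)
    handlers = []
    for name in procedure_names:
        for owner in owners:
            prefix = owner + "_"
            if name.startswith(prefix) and len(name) > len(prefix):
                handlers.append((owner, name[len(prefix):]))
                break
    return handlers


# Hypothetical form module with one control and one plain helper procedure.
handlers = event_handlers(
    "UserForm",
    ["UserForm_Initialize", "OkButton_Click", "FormatTotals"],
    control_names=["OkButton"],
)
print(handlers)
```

`FormatTotals` is left out because it does not start with the name of the form or of one of its controls.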
Future Work
Evaluation of the tool in real business and industrial contexts to support professionals in executing comprehension tasks on VBA spreadsheet applications
Extension of the model to take into account other Excel features
Improvement of the features and the views provided by EXACT
Thanks for your attention
Questions?
Further Information:
http://reverse.dieti.unina.it
@REvERSE_UNINA