Use case of source code clones detection
Analysis of reused code between two FLOSS projects using FLOSS tools
Luis Cañas-Díaz
lcanas@libresoft.es
LinuxTag 2012, Berlin, May 23rd, 2012
Luis Cañas-Díaz Use case of source code clones detection
© 2012 Bitergia
Some rights reserved. This presentation is distributed under the
“Attribution-ShareAlike 3.0” license, by Creative Commons, available at
http://creativecommons.org/licenses/by-sa/3.0/
GSyC/LibreSoft
Research group at Universidad Rey Juan Carlos
About 20 people, including students
Focus on FLOSS (free, libre, open source software)
One of the main research lines:
understanding FLOSS development
quantitative, empirical approach
based on data retrieval from FLOSS development repositories
Participating in several R&D projects
Bitergia: a spin-off
Company starting operations in June 2012
Building on the experience of LibreSoft
Offering professional products and services
Focused on:
Metrics about software development (including community metrics)
Specialized support for development forges (including metrics for projects)
“How to understand risks associated to open source communities” by Daniel Izquierdo on Saturday
http://bitergia.com
Introduction
Provincial Council of A Coruña
gisEIEL and gvSIG-EIEL, both with similar features
Introduction: gisEIEL and gvSIG-EIEL
gisEIEL is the geographic information system used by the technical staff of the Provincial Council of A Coruña and the municipalities
The gvSIG-EIEL stack includes three gvSIG extensions that provide functionality for working with the EIEL (Survey on Infrastructure and Local Facilities)
Introduction: project A and project B
gisEIEL = project A
gvSIG-EIEL = project B
Introduction: the history
gisEIEL (project A):
created in 2000 and funded by the Provincial Council of A Coruña
released in 2004 as FLOSS, based on gvSIG 1.0
Introduction: the history
gvSIG-EIEL (project B):
years later, the Provincial Council of Pontevedra funded the creation of a similar application (instead of using project A)
project B was released with very similar functionality
Introduction: our client
Our client was in charge of maintaining project A
Interested in:
finding out whether a merge is feasible
the amount of reused code in B
how the code is being reused
licensing and copyright issues
studying the functionality
Methodology
Data analysed is publicly available (replicability)
Done with FLOSS tools
Methodology
Retrieval of the source code to be analysed
Selection of tools to get information from source code
Processing of the raw data
Identification of relevant information
Methodology: sources
Project A: Snapshot downloaded from 1 SVN repository
Project B: Snapshots downloaded from 6 Git and 2 SVN repositories
No feedback from developers
Methodology: CCFinder and Cloc
CCFinder
http://www.ccfinder.net/
CCFinder matches similar parts of the code
Works at token level
Must be carefully configured
Cloc
http://cloc.sourceforge.net
Calculates the SLOC
Support for 86 programming languages
Methodology: Ninka and grep
Ninka
http://ninka.turingmachine.org/
Lightweight license identification tool for source code
Grep
Well-known command-line tool in the UNIX environment
Searches text strings using regular expressions
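A grep-style scan for copyright notices can be sketched in a few lines of Python. The pattern `COPYRIGHT_RE`, the helper name `copyright_lines` and the sample header are assumptions made for illustration; the slides do not show the expressions that were actually used.

```python
import re

# Hypothetical pattern: the word "copyright", an optional (C) or © mark, then a year.
COPYRIGHT_RE = re.compile(r"copyright\s+(\(c\)|©)?\s*\d{4}", re.IGNORECASE)


def copyright_lines(source: str) -> list:
    """Return the stripped lines of `source` that look like copyright notices."""
    return [line.strip() for line in source.splitlines() if COPYRIGHT_RE.search(line)]


# Made-up file header, reusing the notice quoted later in the results.
sample = """\
/*
 * Copyright (C) 2009 Deputacion de A Coruna
 */
public class ExportMapTo {}
"""

found = copyright_lines(sample)
missing = copyright_lines("public class ExportTo {}")
print(found)
print(missing)
```

A file for which the scan returns an empty list would be reported as having no copyright notice, which is the situation the results flag for the cloned files in B.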
Methodology: Process the raw data from CCFinder
clone id  file id.tokens  file id.tokens
16359     476.1119-1177   2093.644-702
16359     476.1119-1177   2093.749-807
16359     476.1119-1177   2093.889-947
16359     476.1119-1177   2093.1034-1092
16359     476.1119-1177   2093.1181-1239
1207      476.1259-1310   2093.1324-1375
36        476.37-149      2094.37-149
1831      476.260-326     2094.221-287
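The clone pairs listed above can be loaded into a small structure before any aggregation. The sketch below assumes each row holds exactly the three whitespace-separated fields shown (a clone id, then two `file_id.start-end` token ranges); the names `Fragment`, `parse_fragment`, `parse_pair` and `by_file_pair` are illustrative and are not part of CCFinder.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Fragment(NamedTuple):
    file_id: int
    start: int  # assumed: offset of the first token of the fragment
    end: int    # assumed: offset of the last token of the fragment


def parse_fragment(text: str) -> Fragment:
    """Turn '476.1119-1177' into Fragment(476, 1119, 1177)."""
    file_id, token_range = text.split(".", 1)
    start, end = token_range.split("-", 1)
    return Fragment(int(file_id), int(start), int(end))


def parse_pair(line: str) -> Tuple[int, Fragment, Fragment]:
    """Parse one row: clone id followed by the two matching fragments."""
    clone_id, left, right = line.split()
    return int(clone_id), parse_fragment(left), parse_fragment(right)


# Rows copied from the slide.
ROWS = """\
16359 476.1119-1177 2093.644-702
16359 476.1119-1177 2093.749-807
16359 476.1119-1177 2093.889-947
16359 476.1119-1177 2093.1034-1092
16359 476.1119-1177 2093.1181-1239
1207 476.1259-1310 2093.1324-1375
36 476.37-149 2094.37-149
1831 476.260-326 2094.221-287
"""

pairs: List[Tuple[int, Fragment, Fragment]] = [parse_pair(row) for row in ROWS.splitlines()]

# Group clone ids by the pair of files they connect.
by_file_pair: Dict[Tuple[int, int], List[int]] = defaultdict(list)
for clone_id, left, right in pairs:
    by_file_pair[(left.file_id, right.file_id)].append(clone_id)

for files, clone_ids in by_file_pair.items():
    print(files, len(clone_ids))
```

Grouping by file pair is one possible first step towards the file-by-file results; how token ranges were mapped to SLOC is not described in the slides.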
Methodology: How much code in common?
Results: file by file
One of the files in project A:
File name ExportMapTo.java
Cloned files 3
SLOC 569
License GPLv2
Copyright Copyright (C) 2009 Deputacion de A Coruna
Results: file by file
For one file in A we got the clones below in B
Have a look at the license and copyright!
File name              %     SLOC  License  Copyright
ExportSeveralTo.java   43 %  244   None     None
StopEditingToShp.java  28 %  159   None     None
ExportTo.java          47 %  267   None     None
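The percentages in this table appear to be the SLOC figure of each file in B divided by the 569 SLOC of ExportMapTo.java in A. That reading is an assumption, but it reproduces the values shown, as this short sketch checks (the function name `clone_percentage` is made up for the example):

```python
ORIGINAL_SLOC = 569  # SLOC of ExportMapTo.java in project A

# SLOC column of the table, keyed by the file name in project B.
cloned_sloc = {
    "ExportSeveralTo.java": 244,
    "StopEditingToShp.java": 159,
    "ExportTo.java": 267,
}


def clone_percentage(cloned: int, original: int) -> int:
    """Share of the original file's SLOC, rounded to a whole percent."""
    return round(100 * cloned / original)


percentages = {
    name: clone_percentage(sloc, ORIGINAL_SLOC) for name, sloc in cloned_sloc.items()
}

for name, pct in percentages.items():
    print(f"{name}: {pct} %")
```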
Results: A project vs. B project (1/4)
Module of project A          SLOC   Similar SLOC  %
appgvSIG                     48279  483           1
EIEL-Autenticacion           1062   0             0
EIEL-DescargaMunicipiosBD    3142   0             0
EIEL-extCAD                  21423  13068         61
EIEL-Formularios-Alfanumer   27224  0             0
EIEL-GeneracionScriptsInBDT  3992   0             0
EIEL-GestionDeLeyendasImpr   980    0             0
EIEL-GestionDeMapasGisEIEL   936    0             0
EIEL-GestionPermisos         776    0             0
EIEL-GestionUsuarios         1517   0             0
Results: A project vs. B project (2/4)
Module of project A  SLOC   Similar SLOC  %
EIEL-GisEIEL         22906  687           3
EIEL-Informes        935    0             0
EIEL-Utilidades      1146   23            2
EIEL-Validaciones    3487   0             0
extJDBC              3600   36            1
extOracleSpatial     9034   90            1
fwAndami             13886  0             0
libCorePlugin        3510   35            1
libCq CMS for java   26617  0             0
libFMap              41159  0             0
Results: A project vs. B project (3/4)
Results: A project vs. B project (4/4)
6 % of A’s code was reused by project B (14K out of 319K SLOC)
Results: project B vs. project A (1/3)
Module of project B  SLOC   Similar SLOC  %
extDBConnection      1648   0             0
ELLE                 3459   35            1
OpenCADTools         36974  15899         43
NavTable             5685   57            1
exteieltable         8311   83            1
extvalidation        1160   0             0
exteielutils         1711   0             0
exteielforms         8185   82            1
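The module table above already shows where the reuse concentrates. As a rough sketch, the following Python recomputes each percentage from the two SLOC columns and the share of similar SLOC that falls in OpenCADTools. It covers only the modules listed on this slide, so the totals are partial, and the variable names are chosen for the example.

```python
# (SLOC, similar SLOC) per module of project B, copied from the table.
modules_b = {
    "extDBConnection": (1648, 0),
    "ELLE": (3459, 35),
    "OpenCADTools": (36974, 15899),
    "NavTable": (5685, 57),
    "exteieltable": (8311, 83),
    "extvalidation": (1160, 0),
    "exteielutils": (1711, 0),
    "exteielforms": (8185, 82),
}

# Percentage of each module's SLOC that is similar to code in project A.
percent = {
    name: round(100 * similar / sloc) for name, (sloc, similar) in modules_b.items()
}

# Similar SLOC summed over the listed modules, and OpenCADTools' part of it.
total_similar = sum(similar for _, similar in modules_b.values())
opencad_share = modules_b["OpenCADTools"][1] / total_similar

print(percent)
print(total_similar)
print(f"{opencad_share:.1%}")
```

Under these assumptions the listed modules add up to about 16K similar SLOC, nearly all of it in OpenCADTools.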
Results: project B vs. project A (2/3)
Results: B project vs. A project (3/3)
B reused around 20 % of its code from A (16K out of 80K SLOC)
Final conclusions (1/3)
20 % of the code in project B was reused from A
6 % of A’s code is reused in project B
Final conclusions (2/3)
most of the code reused by B is part of a single module (OpenCADTools). This module reused 43 % of its code from a module of A called EIEL-extCAD
Final conclusions (3/3)
91 % of the files reused by B did not name the original copyright holder
early versions of A reused code from the gvSIG project and did not name the original copyright holder either (fixed in the latest versions of A)
Your time!
any questions?
Thank you! / ¡Gracias!
contact me at lcanas@libresoft.es