Automatically Spotting Cross-language Relations
![Page 1: Automatically Spotting Cross-language Relations](https://reader033.fdocuments.us/reader033/viewer/2022052522/554e9086b4c90526358b4dc8/html5/thumbnails/1.jpg)
Spotting automatically
cross-language relations
Federico Tomassetti (me)
Giuseppe Rizzo
Marco Torchiano
![Page 2: Automatically Spotting Cross-language Relations](https://reader033.fdocuments.us/reader033/viewer/2022052522/554e9086b4c90526358b4dc8/html5/thumbnails/2.jpg)
data.sql
CREATE TABLE Persons (
    ID int,
    FirstName varchar(255),
    LastName varchar(255),
    City varchar(255)
);
Person.java
String query = "select ID, FirstName, LastName, " +
               "City " +
               "from " + dbName + ".Persons";
try {
    ...
    while (rs.next()) {
        int id = rs.getInt("ID");
        String firstName = rs.getString("FirstName");
        String lastName = rs.getString("LastName");
        String city = rs.getString("City");
    }
} catch (SQLException e) {
    ......
}
![Page 3: Automatically Spotting Cross-language Relations](https://reader033.fdocuments.us/reader033/viewer/2022052522/554e9086b4c90526358b4dc8/html5/thumbnails/3.jpg)
data.sql
CREATE TABLE Persons (
    ID int,
    FirstName varchar(255),
    LastName varchar(255),
    City varchar(255)
);
Person.java
String query = "select ID, FirstName, LastName, " +
               "City " +
               "from " + dbName + ".Persons";
try {
    ...
    while (rs.next()) {
        int id = rs.getInt("ID");
        String firstName = rs.getString("FirstName");
        String lastName = rs.getString("LastName");
        String city = rs.getString("City");
    }
} catch (SQLException e) {
    (Hopefully it does not happen)
}
![Page 4: Automatically Spotting Cross-language Relations](https://reader033.fdocuments.us/reader033/viewer/2022052522/554e9086b4c90526358b4dc8/html5/thumbnails/4.jpg)
…the overall system works, sometimes
![Page 5: Automatically Spotting Cross-language Relations](https://reader033.fdocuments.us/reader033/viewer/2022052522/554e9086b4c90526358b4dc8/html5/thumbnails/5.jpg)
If we could automatically identify
cross-language relations, we could:
• Recognize them
• Support refactoring
• Validate them
• Navigate them
Recognize them: I become aware that this ID is
related to something else
![Page 6: Automatically Spotting Cross-language Relations](https://reader033.fdocuments.us/reader033/viewer/2022052522/554e9086b4c90526358b4dc8/html5/thumbnails/6.jpg)
Support refactoring: if I change one, the others are
updated
![Page 7: Automatically Spotting Cross-language Relations](https://reader033.fdocuments.us/reader033/viewer/2022052522/554e9086b4c90526358b4dc8/html5/thumbnails/7.jpg)
Validate them: see broken relations as errors
![Page 8: Automatically Spotting Cross-language Relations](https://reader033.fdocuments.us/reader033/viewer/2022052522/554e9086b4c90526358b4dc8/html5/thumbnails/8.jpg)
Navigate them: click to see the other side of
the relation
![Page 9: Automatically Spotting Cross-language Relations](https://reader033.fdocuments.us/reader033/viewer/2022052522/554e9086b4c90526358b4dc8/html5/thumbnails/9.jpg)
![Page 10: Automatically Spotting Cross-language Relations](https://reader033.fdocuments.us/reader033/viewer/2022052522/554e9086b4c90526358b4dc8/html5/thumbnails/10.jpg)
CodeModels
ASTs
![Page 11: Automatically Spotting Cross-language Relations](https://reader033.fdocuments.us/reader033/viewer/2022052522/554e9086b4c90526358b4dc8/html5/thumbnails/11.jpg)
Embedded AST (image taken from the paper)
![Page 12: Automatically Spotting Cross-language Relations](https://reader033.fdocuments.us/reader033/viewer/2022052522/554e9086b4c90526358b4dc8/html5/thumbnails/12.jpg)
<ul id="types">
  <li ng-repeat="t in types" ng-class="{'selected': t.id == type}">
    <a ng-href="#/{{t.id}}">{{t.title}}</a>
  </li>
</ul>
var types = [
  { id: 'sliding-puzzle', title: 'Sliding puzzle' },
  { id: 'word-search-puzzle', title: 'Word search puzzle' }
];
index.html
app.js
app.controller('slidingAdvancedCtrl', function($scope) {
  $scope.puzzles = [
    { src: './img/misko.jpg', title: 'Miško Hevery', rows: 4, cols: 4 },
    { src: './img/igor.jpg', title: 'Igor Minár', rows: 3, cols: 3 },
    { src: './img/vojta.jpg', title: 'Vojta Jína', rows: 4, cols: 3 }
  ];
});
<div ng-repeat="puzzle in puzzles">
  <h2>{{puzzle.title}}</h2>
  …
</div>
![Page 13: Automatically Spotting Cross-language Relations](https://reader033.fdocuments.us/reader033/viewer/2022052522/554e9086b4c90526358b4dc8/html5/thumbnails/13.jpg)
![Page 14: Automatically Spotting Cross-language Relations](https://reader033.fdocuments.us/reader033/viewer/2022052522/554e9086b4c90526358b4dc8/html5/thumbnails/14.jpg)
Context of a node:
all the descendants
+
the siblings and their descendants
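The definition of context above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Node` class, its fields, and the function names are hypothetical and are not taken from the CodeModels code base.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass(eq=False)
class Node:
    """A minimal AST node (hypothetical structure, for illustration only)."""
    value: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, child: "Node") -> "Node":
        """Attach a child node and return it."""
        child.parent = self
        self.children.append(child)
        return child


def descendants(node: Node) -> Iterator[Node]:
    """Yield every descendant of the node, depth first."""
    for child in node.children:
        yield child
        yield from descendants(child)


def context(node: Node) -> List[Node]:
    """Return all the descendants plus the siblings and their descendants."""
    result = list(descendants(node))
    if node.parent is not None:
        for sibling in node.parent.children:
            if sibling is not node:
                result.append(sibling)
                result.extend(descendants(sibling))
    return result
```

Calling `context(n)` on a node in the middle of a tree returns its own subtree (without the node itself) followed by each sibling and that sibling's subtree.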
![Page 15: Automatically Spotting Cross-language Relations](https://reader033.fdocuments.us/reader033/viewer/2022052522/554e9086b4c90526358b4dc8/html5/thumbnails/15.jpg)
![Page 16: Automatically Spotting Cross-language Relations](https://reader033.fdocuments.us/reader033/viewer/2022052522/554e9086b4c90526358b4dc8/html5/thumbnails/16.jpg)
Some metrics we use:
• Number of shared values
• Min and max number of different values
• Tversky Index
$TV(X, Y) = \frac{|X \cap Y|}{|X \cap Y| + \alpha |X - Y| + \beta |Y - X|}$
• Jaro, Jaccard, tf-idf and others
How to compare contexts:
1) Take all the values in the context (IDs, strings,
numbers)
+
2) Employ different metrics
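As a rough sketch of two of the metrics listed on this slide, the functions below treat each context as a set of values and compute the number of shared values and the Tversky index. The function names, the default weights, and the choice to return 0.0 for two empty contexts are assumptions made for illustration; the slides do not state the values of α and β that were used.

```python
from typing import Set


def shared_values(x: Set[str], y: Set[str]) -> int:
    """Number of values that appear in both contexts."""
    return len(x & y)


def tversky(x: Set[str], y: Set[str], alpha: float = 1.0, beta: float = 1.0) -> float:
    """Tversky index: |X∩Y| / (|X∩Y| + alpha*|X-Y| + beta*|Y-X|).

    Returns 0.0 when the denominator is zero (an assumed convention).
    """
    shared = len(x & y)
    denominator = shared + alpha * len(x - y) + beta * len(y - x)
    return shared / denominator if denominator else 0.0


def jaccard(x: Set[str], y: Set[str]) -> float:
    """Jaccard index, which is the Tversky index with alpha = beta = 1."""
    return tversky(x, y, alpha=1.0, beta=1.0)
```

With α = β = 1 the index reduces to Jaccard, and with α = β = 0.5 it reduces to the Dice coefficient, so a single function covers several set-based similarities.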
![Page 17: Automatically Spotting Cross-language Relations](https://reader033.fdocuments.us/reader033/viewer/2022052522/554e9086b4c90526358b4dc8/html5/thumbnails/17.jpg)
How to combine those metrics:
Random Tree tells us
We built a golden set of 1200 candidate relations
(around 140 real relations; the others just share the same ID)
We train it with the golden set
Random Tree finds the best way to combine those
metrics to decide whether a pair is related or not
Output of Random Tree:
a rule to decide whether two nodes with the same ID are
connected
![Page 18: Automatically Spotting Cross-language Relations](https://reader033.fdocuments.us/reader033/viewer/2022052522/554e9086b4c90526358b4dc8/html5/thumbnails/18.jpg)
How to evaluate it?
10-fold cross-validation
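For readers unfamiliar with the procedure, k-fold cross-validation can be sketched as follows: the labelled set is shuffled and split into k folds, and each fold is used once for testing while the others are used for training. The function name and the `seed` parameter are hypothetical, and this sketch does not stratify by class; the slides do not say whether the original evaluation did.

```python
import random
from typing import Iterator, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def k_fold_splits(
    items: Sequence[T], k: int = 10, seed: int = 0
) -> Iterator[Tuple[List[T], List[T]]]:
    """Yield (train, test) pairs so that each item is tested exactly once."""
    if not 2 <= k <= len(items):
        raise ValueError("k must be between 2 and the number of items")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Fold i takes every k-th item starting at position i.
    folds = [shuffled[i::k] for i in range(k)]
    for i, test_fold in enumerate(folds):
        train = [item for j, fold in enumerate(folds) if j != i for item in fold]
        yield train, test_fold
```

Applied to a set of 1200 candidates with k = 10, each round would train on 1080 candidates and test on the remaining 120.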
![Page 19: Automatically Spotting Cross-language Relations](https://reader033.fdocuments.us/reader033/viewer/2022052522/554e9086b4c90526358b4dc8/html5/thumbnails/19.jpg)
What now?
Code available at:
https://github.com/orgs/CrossLanguageProject
• We want to build a larger golden set
• We want to integrate support in editors
What we have
• A tool that automatically spots cross-language relations
with precision and recall > 90% (on a first in-house
dataset)
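As a reminder of what the two figures measure, here is a minimal sketch of how precision and recall could be computed from predicted and actual labels for candidate pairs. The function name and the convention of returning 0.0 when a denominator is zero are assumptions, not part of the presented tool.

```python
from typing import Sequence, Tuple


def precision_recall(
    predicted: Sequence[bool], actual: Sequence[bool]
) -> Tuple[float, float]:
    """Compute (precision, recall) where True means "the pair is related"."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    false_neg = sum(1 for p, a in pairs if not p and a)
    # Precision: share of reported relations that are real.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    # Recall: share of real relations that were reported.
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall
```

Because real relations are a small minority of the candidates, reporting both values is more informative than plain accuracy.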
![Page 20: Automatically Spotting Cross-language Relations](https://reader033.fdocuments.us/reader033/viewer/2022052522/554e9086b4c90526358b4dc8/html5/thumbnails/20.jpg)
Code available at:
https://github.com/orgs/CrossLanguageProject
www.slideshare.net/FTomassetti
Spotting Automatically
Cross-Language Relations
Federico Tomassetti, Giuseppe Rizzo, Marco Torchiano
CSMR 2014, Antwerpen, Belgium
Preprint at:
http://www.di.unito.it/~rizzo/publications/Tomassetti_Rizzo-CSMRWCRE2014.pdf