Graph-Based Code Completion

Graph-Based Pattern-Oriented, Context-Sensitive Source Code Completion

Nguyen, T.T. ; Nguyen, H.A. ; Tamrawi, A. ; Nguyen, H.V. ; Al-Kofahi, J. ; Nguyen, T.N.

Presented By: Mohammad Masudur Rahman

Contents

Code Completion
Thesis Statement
Motivating Example
Terminologies
Methodology
Empirical Evaluation & Results
My Observation & Future Thoughts

Code Completion

Built-in feature of all modern IDEs
Speeds up development
Allows longer identifier names, which aid program comprehension
Less overhead for developers
Mostly supports single variables and methods from API packages
Template-based support for control structures, event handling and others

Thesis Statement

Novel approach with graph-based code completion

Graph-based feature extraction, searching and ranking of API usage patterns, and matching them against the editing context of the current code.

Empirical evaluation shows correctness and usefulness: 95% precision, 92% recall and 93% F-score over 24 real-world systems

Motivating Example (Single-line)

Fig 1: Current State of Code Completion (Eclipse 3.6)

Motivating Example (Multi-line)

Fig 2: SWT Usage Example

Motivating Example (Query)

Fig 3: SWT Query Example

Terminologies

GRAPACC
API Usage Pattern
Groum Based Model
Context-Sensitive Weight

GRAPACC

Graph-Based Pattern-Oriented Context-Sensitive Code Completion

API Usage Pattern

Fig 4: SWT API Usage

Groum Based Model

Fig 5: Groum Conversion

Context-Sensitive Weight

wf(q) = 1 / (d + 1)

wf(q) = context-sensitive weight of feature q

q = a feature of query Q

d = distance to the closest token in the Groum model

Methodology

Query Processing and Feature Extraction
Pattern Managing, Searching and Ranking
Pattern Oriented Code Completion

Query Processing and Feature Extraction

Tokenizing
Partial Parsing
Groum Building
Feature Extracting and Weighting

Tokenizing, Partial Parsing

Lexical analysis
Preserves keywords related to control structures; the rest are removed but saved elsewhere

Eclipse Java parser (PPA tool) returns an AST (Abstract Syntax Tree)
Unresolved nodes are assigned ‘Unknown Type’

Groum Building

Groum is built from the AST
Unresolved nodes are discarded but still considered as tokens

The query is converted to the following Groum

Fig 6: Groum of Query

Feature Extraction & Weighting

Groum nodes are mapped to the tokens from the tokenization step

Features are extracted from the Groum as paths of length L <= 3
Three factors contribute to a feature's weight:
Structure-based factor (size)
Structure-based factor (centrality)
User-based factor

ws(q) = size-based weight for feature q of query Q: ws(q) = 1 + size(q), 1 <= size(q) <= 3

wc(q) = centrality-based weight for feature q of query Q: wc(q) = n / s, where n = number of neighbors and s = size

wf(q) = user-based weight for feature q of query Q: wf(q) = 1 / (d + 1), where d = distance between the focus node and the closest token of the feature's path in the Groum model

w(q) = total weight for feature q of query Q
ws(q) = size-based weight for feature q of query Q
wc(q) = centrality-based weight for feature q of query Q
wf(q) = user-based weight for feature q of query Q
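
The three weighting factors can be sketched in a few lines of Python. This is a minimal illustration, not GraPacc's implementation: the toy Groum, the helper names, the definition of a "neighbor" (any node adjacent to the path but not on it) and the use of token positions to measure d are all assumptions made for the example. How the factors combine into w(q) is not stated on the slide, so the sketch stops at the individual factors.

```python
# Hypothetical toy Groum for an SWT-like snippet: node -> successor nodes.
GROUM = {
    "Display.new": ["Shell.new"],
    "Shell.new": ["Shell.open"],
    "Shell.open": [],
}
# Hypothetical token index of each node in the query, and the editing position.
TOKEN_POS = {"Display.new": 0, "Shell.new": 1, "Shell.open": 2}
FOCUS_POS = 3


def extract_paths(groum, max_len=3):
    """Enumerate directed paths with 1..max_len nodes (the features)."""
    paths = []

    def extend(path):
        paths.append(tuple(path))
        if len(path) < max_len:
            for nxt in groum.get(path[-1], []):
                if nxt not in path:
                    extend(path + [nxt])

    for node in groum:
        extend([node])
    return paths


def size_weight(path):
    """ws(q) = 1 + size(q)."""
    return 1 + len(path)


def centrality_weight(path, groum):
    """wc(q) = n / s, with n counted as nodes adjacent to the path (assumption)."""
    on_path = set(path)
    neighbors = set()
    for src, targets in groum.items():
        for dst in targets:
            if src in on_path and dst not in on_path:
                neighbors.add(dst)
            if dst in on_path and src not in on_path:
                neighbors.add(src)
    return len(neighbors) / len(path)


def focus_weight(path, token_pos, focus_pos):
    """wf(q) = 1 / (d + 1), d = distance from the focus to the closest token."""
    d = min(abs(token_pos[node] - focus_pos) for node in path)
    return 1 / (d + 1)


for q in extract_paths(GROUM):
    print(q, size_weight(q), centrality_weight(q, GROUM),
          focus_weight(q, TOKEN_POS, FOCUS_POS))
```

Features closer to the editing position get a larger wf(q), which is what makes the weighting context-sensitive.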

Pattern Managing, Searching and Ranking

Pr(P) = popularity of pattern P, i.e. the frequency of P

Weight of feature p in pattern P, computed using inverse indexing

Np,P = number of occurrences of feature p in P, NP = total number of features in P

Np = number of patterns containing p, N = total number of patterns in the database
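
The four counts listed here are the usual ingredients of a tf-idf style weight. The slide's own formula did not survive, so the sketch below assumes the textbook form (Np,P / NP) * log(N / Np); the function and parameter names are hypothetical, and the paper may scale the terms differently.

```python
import math


def pattern_feature_weight(n_p_in_P, n_P, n_p, n_total):
    """Assumed tf-idf style weight of feature p in pattern P.

    n_p_in_P: occurrences of p in P          (Np,P)
    n_P:      total number of features in P  (NP)
    n_p:      number of patterns containing p (Np)
    n_total:  number of patterns in the database (N)
    """
    term_frequency = n_p_in_P / n_P
    inverse_pattern_frequency = math.log(n_total / n_p)
    return term_frequency * inverse_pattern_frequency


# A feature that is rare across the database weighs more than a common one.
print(pattern_feature_weight(2, 10, 5, 100))
print(pattern_feature_weight(2, 10, 50, 100))
```

Under this assumption, a feature that appears in every pattern gets weight 0, since it does not help to tell patterns apart.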

Pattern Managing, Searching and Ranking

For each feature p, L(p) is the list of patterns from which p can be extracted

p denotes a pattern feature, q a query feature
If sim(p, q) > δ, then p is added to F, the set of mapped features for q
For each p ∈ F, the top n ranked patterns from L(p) are added to C, the candidate patterns for relevance computation

For each P in C, compute fit(P, Q)
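
The search steps can be sketched as follows. This is only an illustration under stated assumptions: the inverted index is a plain dictionary whose lists L(p) are already ranked best-first, `sim` is passed in as a function, and the names `find_candidates`, `index`, `delta` and `top_n` are hypothetical. The final fit(P, Q) computation is left out because its formula is not given on the slide.

```python
def find_candidates(query_features, index, sim, delta, top_n):
    """Collect candidate patterns C for a query.

    index maps each pattern feature p to L(p), the patterns containing p,
    assumed to be sorted from best to worst.
    """
    candidates = []
    for q in query_features:
        # F: pattern features similar enough to the query feature q.
        mapped = [p for p in index if sim(p, q) > delta]
        for p in mapped:
            # Take the top n ranked patterns from L(p).
            for pattern in index[p][:top_n]:
                if pattern not in candidates:
                    candidates.append(pattern)
    return candidates


def exact_sim(p, q):
    """Toy similarity used only for this example: 1.0 on exact match."""
    return 1.0 if p == q else 0.0


toy_index = {
    "Shell.open": ["P1", "P2", "P3"],
    "Display.new": ["P2"],
}
print(find_candidates(["Shell.open"], toy_index, exact_sim, 0.5, 2))
```

Limiting each list to its top n entries keeps C small, so the more expensive relevance computation runs on only a few patterns.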

Feature Similarity

sim(p, q) is a name-based similarity between two features. A feature is a collection of labels, each of the form X.Y.Z, where X = package name, Y = class name and Z = method name

Name-based Similarity (nsim)

wsim(X, X’) is the word-based similarity of X and X’
X and X’ are broken down into two sequences of words, L(X) and L(X’)
Similarity is computed as Lo / Lm

Lo is the length of the LCS (longest common subsequence); Lm is the average length of the two sequences
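
A minimal sketch of the word-based similarity, assuming identifiers are broken into words at camelCase boundaries (the slide does not say how the names are split, so `split_words` is an assumption, and the function names are hypothetical):

```python
import re


def split_words(name):
    """Split an identifier into lowercase words at camelCase boundaries."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [part.lower() for part in parts]


def lcs_length(a, b):
    """Length of the longest common subsequence of two word sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def wsim(x, y):
    """Word-based similarity Lo / Lm."""
    words_x, words_y = split_words(x), split_words(y)
    if not words_x or not words_y:
        return 0.0
    lo = lcs_length(words_x, words_y)          # Lo: length of the LCS
    lm = (len(words_x) + len(words_y)) / 2     # Lm: average sequence length
    return lo / lm


print(wsim("getFileName", "getName"))
print(wsim("FileInputStream", "FileOutputStream"))
```

For example, getFileName and getName split into (get, file, name) and (get, name); the LCS has length 2 and the average length is 2.5, giving a similarity of 0.8.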

Pattern Matching (Relevance)

Pattern Matching

SM(P, Q) = total weight of the matched feature pairs

fit(P, Q) = relevance degree between P and Q

Pr(P) = popularity of pattern P

Pattern Oriented Code Completion

The matched pattern is selected and its nodes are matched to the corresponding nodes in the Groum

The missing nodes are filled in with code

Empirical Evaluation

Precision, recall, F-score
APIs used as libraries: java.io, java.util
28 real-world open-source systems: 4 for training, 24 for testing
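
For reference, the three metrics relate as in the sketch below. It uses the standard set-based definitions and the harmonic-mean F-score; how the study itself counts suggested and expected code elements is not described on the slide, so the function names and the toy inputs are assumptions.

```python
def precision_recall(suggested, expected):
    """Standard set-based precision and recall."""
    suggested, expected = set(suggested), set(expected)
    correct = len(suggested & expected)
    precision = correct / len(suggested) if suggested else 0.0
    recall = correct / len(expected) if expected else 0.0
    return precision, recall


def f_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example: 4 suggested elements, 3 expected, 2 in common.
print(precision_recall(["a", "b", "c", "d"], ["a", "b", "e"]))

# The reported 95% precision and 92% recall are consistent with a 93% F-score.
print(round(f_score(0.95, 0.92), 2))
```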

Empirical Evaluation

My Observation

Planning to use semantic web technology
Data and control dependency relationships can be improved using semantic relationships such as conceptual similarity

Pattern matching is complex and error-prone; a semantic score can be beneficial

Thanks

Questions??
