Api specification based function search engine using natural language query-Seminar Conducted by me

By Sanif S S Reg No:10007399 S7 IT


This is the seminar I have conducted as a part of my syllabus.



Overview

Understanding the terms
Objectives
In detail
  o Keyword Retrieval
  o Variable Retrieval
  o API Specification Mining
  o Function Retrieval
  o Code Generation
Experiment
Conclusion
References


API Specification-Based Function Search Engine Using Natural Language Query

API: Application Programming Interface. An API is a set of commands, functions, and protocols that programmers can use when building software for a specific operating system. APIs are often exposed as header files (for example, in C and C++). Examples:

o Java APIs
o ODBC for Microsoft Windows

API specification: a description of the classes and methods inside the API.

Each method (or function) and its use is briefly described in the API specification.

Function search engine: as the name suggests, a search engine for all the methods in the API.

Natural language query: a query that uses a complete sentence or question to begin a search.

Ex:
o “What is the capital of India?”
o “How to make pizza?”

Taken together, the title means a search engine that finds the functions/methods in an application programming interface (API) using simple natural language queries.

The paper also suggests a way to automatically generate function calls based on the search.


Programmers nearly always use existing functions while developing their applications.

The functions have grown more numerous and more diverse.

The problem is that programmers need to know which functions they want and how to call those functions.

The solution:
o This paper presents two novel approaches to address these problems.
o The first is an approach to find the right functions based on the API specification.
o The second is an approach to automatically generate code for the function call.


There are two main objectives in this paper:
o Retrieving functions, and
o Generating code for function calls.

Two different forms of query correspond to these objectives:
o The first is the “function search query”, which requests a search for functions.
o The second is the “function call query”, which requests generated code for function calls.


Fig: Function Search Model (diagram with the components Function Search Query, Keyword Retrieval, API Document, Mining, Function Retrieval, Function Description, Function Call Query, Variable Retrieval, Code Generation, and Function Call)

Keyword retrieval is the process of extracting keywords from a function search query.

Mining is the process of extracting content from the API specification to support function retrieval.

Function retrieval is the process of finding suitable functions by matching the keywords extracted from a function search query against the descriptions of functions in the API specification.

Variable retrieval is the process of extracting variables from a function call query.

Code generation is the process of generating code for a function call based on the variables extracted from the function call query.


There are several methods to identify keywords in a natural language sequence.

Some methods identify a keyword as a single word, while others identify a keyword phrase.

This paper introduces four natural language processing techniques to extract keywords: POS tagging, POS filtering, stemming, and synonym generation.


Fig: Keyword Retrieval Process (stages: NL sentence, POS tagging, POS filter, stemming, synonym generation, keywords; intermediate labels: word/POS, main word, original word)

POS tagging (part-of-speech tagging) is the technique of marking up each word in a natural language sentence (NL sentence) with its part of speech.

POS filtering is the technique of removing stopwords such as prepositions, pronouns, conjunctions, and interjections.

Stemming is the technique of reducing inflected (or sometimes derived) words to their root form. (Ex: “return” is the root form of the words “returns”, “returning”, and “returned”.)

Synonym generation is the technique of identifying synonyms of the retrieved keywords.


For the natural language query “Gets an element in the collection”, the following results are obtained at each of the above stages.

o POS Tagging: Gets/VB an/DT element/NN in/IN the/DT collection/NN.

o POS Filtering: Gets element collection.

o Stemming: Get element collection.

o Synonym Generation: Get-have/return element-object/component collection-list/set.

NOTE:

VB - Verb
DT - Determiner
NN - Noun
IN - Preposition
JJ - Adjective
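The four stages can be sketched in a few lines of Python. This is only an illustrative toy, not the tool used in the paper: the lookup tables POS_TAGS, STOP_TAGS, and SYNONYMS, the suffix-stripping stemmer, and all function names are assumptions that cover just the example query. A real system would use a trained tagger, a proper stemmer, and a thesaurus.

```python
# Toy keyword retrieval: POS tagging -> POS filtering -> stemming -> synonyms.
# All lookup tables are hypothetical and only cover the example query.

POS_TAGS = {"gets": "VB", "an": "DT", "element": "NN",
            "in": "IN", "the": "DT", "collection": "NN"}
STOP_TAGS = {"DT", "IN"}  # determiners and prepositions are filtered out
SYNONYMS = {"get": ["have", "return"],
            "element": ["object", "component"],
            "collection": ["list", "set"]}


def pos_tag(sentence):
    """Pair each word with its tag from the toy lexicon (NN if unknown)."""
    words = sentence.rstrip(".").split()
    return [(w, POS_TAGS.get(w.lower(), "NN")) for w in words]


def pos_filter(tagged):
    """Drop stopwords, identified here by their POS tag."""
    return [w for w, tag in tagged if tag not in STOP_TAGS]


def stem(word):
    """Rough suffix stripping, enough for gets/returns/returning/returned."""
    w = word.lower()
    for suffix in ("ing", "ed", "s"):
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def retrieve_keywords(sentence):
    """Run all four stages and map each stemmed keyword to its synonyms."""
    stems = [stem(w) for w in pos_filter(pos_tag(sentence))]
    return {s: SYNONYMS.get(s, []) for s in stems}


if __name__ == "__main__":
    print(retrieve_keywords("Gets an element in the collection"))
```

Each stage is a separate function so that any one of them can be swapped for a real NLP component without touching the others.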


A function call query contains two kinds of objects: words and variables.

Many words may be related to each variable in the query.

Each word in the query is relevant to at most one variable.

The words that are relevant to a variable are called the features of that variable.


Every relation between words and a variable is represented by a “variable retrieval rule” derived from a corresponding syntactic rule.

Ex: Some variable retrieval rules
o Root(sf V) -> VB(wf W) NP(sf V)
o NP(sf {v1, v2}) -> NP(vf v1) PP(vf v2)
o NP(sf V ∪ {v}) -> NP(sf V) PP(vf v)
o NP(sf V1 ∪ V2) -> NP(sf V1) PP(sf V2)
o PP(vf v[W1 W2]) -> IN(wf W1) NP(wf v[W2])
o PP(sf V) -> IN(wf W) NP(sf V)
o NP(wf W1 W2) -> NN(wf W1) NN(wf W2)
o NP(vf v[W1 W2]) -> NN(wf W1) NN(vf v[W2])
o NP(vf v[W1 W2 W3]) -> DT(wf W1) VBN(wf W2) NN(vf v[W3])


In Fig. 3, the natural language query “Insert element e in a set at index k” is parsed into a tree structure using the Stanford Parser.

The final result is:
o e[element]
o a[a set]
o k[at index]

Fig. 3: Parsing tree for the function call query


This subsection focuses on mining the Java API specification.

In the Java API specification, several kinds of content related to a function can be mined to support the function retrieval process and the code generation process.

They are:
o function specification
o functionality description
o parameter features


Function specification: structured data that describes the usage of a function. The information that can be extracted from it includes the function name, function scope, return type, list of parameters, and so on.

Functionality description: unstructured data in the form of natural language that describes the functionality of the function. To extract information from it, the keyword retrieval method (presented on an earlier slide) is used.

Parameter features: unstructured data in the form of natural language that describes features of the parameters in the function specification. The necessary information is extracted using natural language processing techniques.


Example: The function add() is described in the Java API specification for ArrayList as follows.

Function specification: public void add(int index, Object element).

Functionality description: “Inserts the specified element at the specified position in this list”.

Parameter features: “index - index at which the specified element is to be inserted” and “element - element to be inserted”.
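As a rough illustration of mining the structured part, the sketch below parses a simple signature such as the add() example into scope, return type, name, and parameters, and splits a parameter feature line into a name and its description. The FunctionSpec record, the SIGNATURE pattern, and both function names are hypothetical. The pattern only handles simple signatures of this shape, and the paper's own mining method may differ.

```python
import re
from dataclasses import dataclass


@dataclass
class FunctionSpec:
    scope: str
    return_type: str
    name: str
    parameters: list  # list of (type, name) pairs


# Hypothetical pattern for simple signatures: "<scope> <return type> <name>(<params>)".
SIGNATURE = re.compile(r"^\s*(\w+)\s+([\w<>\[\]]+)\s+(\w+)\s*\((.*)\)\s*\.?\s*$")


def parse_specification(signature):
    """Extract scope, return type, name, and parameters from a signature string."""
    match = SIGNATURE.match(signature)
    if match is None:
        raise ValueError(f"unsupported signature: {signature!r}")
    scope, return_type, name, raw_params = match.groups()
    parameters = []
    for raw in raw_params.split(","):
        raw = raw.strip()
        if raw:
            param_type, param_name = raw.split()
            parameters.append((param_type, param_name))
    return FunctionSpec(scope, return_type, name, parameters)


def parse_parameter_feature(text):
    """Split 'name - description' into a (name, description) pair."""
    name, _, description = text.partition(" - ")
    return name.strip(), description.strip()


if __name__ == "__main__":
    print(parse_specification("public void add(int index,Object element)"))
    print(parse_parameter_feature("element - element to be inserted"))
```

The functionality description and the parameter descriptions would then be passed through keyword retrieval, since they are free text rather than structured data.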


There are three stages in the process of retrieving functions.

Stage 1: extracting the functions related to the user’s query based on some constraints.

Stage 2: refining the result of the previous stage by removing irrelevant functions.

Stage 3: ranking the remaining relevant functions in descending order of how well they fit the query.
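The three stages can be sketched as follows. The slides do not state the actual constraints or ranking measure, so this example assumes simple keyword overlap: stage 1 keeps functions sharing at least one keyword with the query, stage 2 drops those covering less than a threshold share of the query, and stage 3 sorts by the number of matched keywords. INDEX, min_share, and retrieve_functions are hypothetical names, and the index contents are made up for illustration.

```python
# Hypothetical mined index: function name -> keywords from its functionality description.
INDEX = {
    "ArrayList.add": {"insert", "specified", "element", "position", "list"},
    "ArrayList.get": {"return", "element", "specified", "position", "list"},
    "ArrayList.clear": {"remove", "all", "element", "list"},
    "String.length": {"return", "length", "string"},
}


def retrieve_functions(query_keywords, index, min_share=0.5):
    """Return function names ranked by keyword overlap with the query."""
    query = set(query_keywords)
    # Stage 1: collect every function sharing at least one keyword with the query.
    candidates = {name: words & query
                  for name, words in index.items() if words & query}
    # Stage 2: drop candidates covering too small a share of the query
    # (the threshold is an assumption, not taken from the paper).
    refined = {name: hits for name, hits in candidates.items()
               if len(hits) / len(query) >= min_share}
    # Stage 3: rank by number of matched keywords, best first; ties broken by name.
    return sorted(refined, key=lambda name: (-len(refined[name]), name))


if __name__ == "__main__":
    print(retrieve_functions(["insert", "element", "position"], INDEX))
```

For the query keywords insert, element, and position, ArrayList.clear matches only one of three keywords and is removed in stage 2, while ArrayList.add matches all three and ranks first.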


The standard syntax of a function call statement is object.callName(arg1, arg2, …, argk).

To generate code for a function call, we map the user’s query to the corresponding function call based on its function definition.

Two steps:
i. identifying a certain variable vj as the object o, and
ii. mapping the remaining variables to the corresponding arguments arg1, arg2, …, argk.


In the first step, the function retrieval method is used to identify a set of functions related to the user’s query.

However, to use this method, the “function call query” must first be converted to a “function search query” by removing all variables from it.

The variable whose type contains at least one function related to the new query is the desired object o.

In the second step, all other variables are set as parameters.

For example, given the query “inserts an element <e:Object> in a collection <a:ArrayList>”, the variable a with type ArrayList contains the function add, which is related to the new query “inserts an element in a collection”, so a.add(?) is a suitable function call.
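A minimal sketch of these two steps, under stated assumptions: variables are written as <name:Type> in the query, TYPE_FUNCTIONS is a made-up table standing in for the mined API specification, and a function counts as "related" when it shares at least two keywords with the stripped query. None of these details come from the paper. They only make the example runnable.

```python
import re

# Hypothetical mined data: type name -> {function name: description keywords}.
TYPE_FUNCTIONS = {
    "ArrayList": {"add": {"insert", "element", "list", "collection"}},
    "Object": {"toString": {"return", "string", "representation"}},
}

VARIABLE = re.compile(r"<(\w+):(\w+)>")


def split_query(call_query):
    """Return (function search query, [(variable, type), ...])."""
    variables = VARIABLE.findall(call_query)
    # Removing the variables turns the function call query into a search query.
    search_query = " ".join(VARIABLE.sub("", call_query).split())
    return search_query, variables


def generate_call(call_query):
    """Pick the object variable and pass the other variables as arguments."""
    search_query, variables = split_query(call_query)
    # Crude keyword extraction: lowercase words with trailing "s" characters stripped.
    keywords = {w.lower().rstrip("s") for w in search_query.split()}
    for name, type_name in variables:
        for function, description in TYPE_FUNCTIONS.get(type_name, {}).items():
            if len(keywords & description) >= 2:  # assumed relatedness threshold
                # Step 1: this variable is the object o.
                # Step 2: the remaining variables become the arguments.
                args = [v for v, _ in variables if v != name]
                return f"{name}.{function}({', '.join(args)})"
    return None


if __name__ == "__main__":
    query = "inserts an element <e:Object> in a collection <a:ArrayList>"
    print(split_query(query)[0])
    print(generate_call(query))
```

Here the remaining variable e is passed directly as the argument. Matching arguments to specific parameter positions would additionally need the parameter features mined from the specification.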


A. User Study

In the first user study, ten common search tasks were designed and assigned to the participants.

Then, each participant used FSE (the function search engine proposed in the paper) and some other search engines to complete these tasks.

Three search engines were given to the users for the study: FSE, Krugle, and Koder.


In the second user study, the participants submitted over 100 requests to generate code for function calls.

Then, they checked the degree of fitness between the obtained results and their requests to calculate the accuracy of FSE.

There are four degrees of fitness:

o Highly Relevant: the top result in the set of returned solutions fits the user’s request exactly.
o Somewhat Relevant: the desired result is in the result set but not in the first position.
o Somewhat Irrelevant: the result set contains the function with the correct name but the wrong parameters.
o Highly Irrelevant: the lowest level.
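The four degrees can be read as a simple decision procedure over a ranked result list. The sketch below is one possible reading, not the evaluation script used in the study: the representation of a result as a (function name, parameter types) pair and the name fitness_degree are assumptions made for illustration.

```python
def fitness_degree(results, wanted_name, wanted_params):
    """Classify a ranked result list against the function the user wanted.

    results: ranked list of (function name, tuple of parameter types) pairs.
    """
    wanted = (wanted_name, tuple(wanted_params))
    if results and results[0] == wanted:
        return "Highly Relevant"        # exact fit in the first position
    if wanted in results:
        return "Somewhat Relevant"      # exact fit, but not in the first position
    if any(name == wanted_name for name, _ in results):
        return "Somewhat Irrelevant"    # correct name, wrong parameters
    return "Highly Irrelevant"          # nothing usable was returned


if __name__ == "__main__":
    ranked = [("add", ("int", "Object")), ("add", ("Object",)), ("get", ("int",))]
    print(fitness_degree(ranked, "add", ["Object"]))
```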


B. Results

[Figure: bar chart comparing Krugle, Koder, and FSE for User 1, User 2, and User 3; vertical axis from 0 to 0.6]

In this figure:

o 92%: correct functions that were relevant to the user’s request.
o 71%: the correct function was in the first position of the solution set.
o 7%: no proper function was found.


This paper proposes an efficient function search approach based on the API specification.

It also presents a novel function call generation method that generates source code to invoke functions, based on variable features extracted from the user’s query.

Finally, the authors implemented FSE, a function search engine that helps programmers quickly examine different functions that might be appropriate for a problem, obtain more information about particular functions, and automatically generate code for function calls to see how to use a function.
