AN IDE-BASED CONTEXT-AWARE META SEARCH ENGINE
Mohammad Masudur Rahman, Shamima Yeasmin, and Chanchal K. Roy
Department of Computer Science
University of Saskatchewan
20th Working Conference on Reverse Engineering (WCRE 2013), Koblenz, Germany
SOFTWARE MAINTENANCE, BUGS & EXCEPTIONS
Very Common Event!!
EXCEPTION HANDLING: IDE SUPPORT
EXCEPTION HANDLING: DEVELOPERS (NOVICE & EXPERT)
EXCEPTION HANDLING: WEB SEARCH
Class can not access a member of class java.util.HashMap$HashIterator with modifiers "public final"
IDE-BASED WEB SEARCH
About 80% of effort goes to software maintenance
Bug fixing: error and exception handling
Developers spend about 19% of their time in web search
Traditional web search
  Does not consider the context of the search (no ties between IDE and web browser)
  Requires context switching and is distracting
  Time consuming
  Often not very productive
IDE-based context-aware search addresses those issues.
EXISTING RELATED WORKS
Cordeiro et al. (RSSE 2012): Context-based recommendation system
Ponzanelli et al. (ICSE 2013): Seahawk
Poshyvanyk et al. (IWICSS 2007): COTS (Google Desktop) into Eclipse IDE
Brandt et al. (SIGCHI 2010): Integrating Google web search into IDE
MOTIVATION EXPERIMENTS
Search Query          Common Results  Google Only  Yahoo Only  Bing Only
Content Only          32              09           16          18
Content and Context   47              09           11          10
83 exceptions
Solutions found for at most 58 exceptions.
THE KEY IDEA !! META SEARCH ENGINE
PROPOSED IDE-BASED META SEARCH MODEL
Distinguished Features
Meta search engine: captures data from multiple search engines
More precise context: both stack trace and associated code as exception context
Popularity and confidence of result links
Complete web browsing experience within the IDE
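A minimal sketch of how results from several providers might be merged into one pool, keeping each link's position per provider. The engine names, URLs, and the `merge_results` helper are hypothetical placeholders, not the tool's actual implementation.

```python
from collections import defaultdict


def merge_results(results_by_engine):
    """Merge ranked URL lists from several engines into one record per URL.

    results_by_engine maps an engine name to its ranked list of URLs.
    Returns a dict: url -> {engine name: 1-based rank in that engine's list}.
    """
    merged = defaultdict(dict)
    for engine, urls in results_by_engine.items():
        for rank, url in enumerate(urls, start=1):
            # setdefault keeps the first (best) rank if an engine repeats a URL.
            merged[url].setdefault(engine, rank)
    return dict(merged)


# Hypothetical result lists; the URLs are placeholders.
sample = {
    "google": ["http://a.example/1", "http://b.example/2"],
    "bing": ["http://b.example/2", "http://c.example/3"],
    "yahoo": ["http://b.example/2"],
}
merged = merge_results(sample)

# Links returned by every engine in the sample.
common = [url for url, ranks in merged.items() if len(ranks) == len(sample)]
```

Keeping the per-provider rank alongside each URL means position-based and popularity-based scores can be computed later without querying the providers again.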
PROPOSED METRICS & SCORES
Title-to-Title Matching Score (Stitle): cosine similarity measure
Stack Trace Matching Score (Sst): SimHash-based similarity measure
Code Context Matching Score (Scc): SimHash-based similarity measure
StackOverflow Vote Score (Sso): sum of the differences between up votes and down votes over all posts in the link
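The two similarity measures named above can be sketched as follows. This is an illustration under assumptions, not the tool's implementation: the tokenizer, the use of raw term counts, the MD5-based 64-bit token hash, and the mapping from Hamming distance to a similarity in [0, 1] are all choices made here for the example.

```python
import hashlib
import math
import re
from collections import Counter

BITS = 64  # fingerprint width assumed for this sketch


def tokens(text):
    """Lower-case tokens; dots and '$' are kept so Java class names stay whole."""
    return re.findall(r"[a-z0-9_$.]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between term-count vectors of two strings."""
    va, vb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def simhash(text):
    """64-bit SimHash fingerprint built from per-token hashes."""
    weights = [0] * BITS
    for tok, count in Counter(tokens(text)).items():
        digest = hashlib.md5(tok.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        for i in range(BITS):
            weights[i] += count if (h >> i) & 1 else -count
    return sum(1 << i for i, w in enumerate(weights) if w > 0)


def simhash_similarity(a, b):
    """1.0 for identical fingerprints, lower as the Hamming distance grows."""
    distance = bin(simhash(a) ^ simhash(b)).count("1")
    return 1.0 - distance / BITS
```

SimHash suits long inputs such as stack traces and code fragments because near-duplicate texts yield fingerprints that differ in only a few bits, so comparison reduces to a Hamming distance.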
PROPOSED METRICS & SCORES
Top Ten Score (Stt): position of the result link in the top 10 of each provider
Page Rank Score (Spr): relative popularity among all links in the corpus, using the PageRank algorithm
Site Traffic Rank Score (Sstr): Alexa and Compete rank of each link
Search Engine Weight (Ssew): relative reliability or importance of each search engine, based on experiments with 75 programming queries against the search engines
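As an illustration of the Page Rank Score, the following is a generic power-iteration sketch of PageRank over a small link graph. The damping factor of 0.85, the iteration count, and the handling of pages without outgoing links are conventional assumptions; the slides do not state which variant or parameters the tool uses.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank.

    links maps each page to the list of pages it links to.
    Returns a dict page -> rank; ranks sum to 1.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical three-page corpus: "a" and "b" link to "c", "c" links back to "a".
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
```

In this toy graph "c" ends up most popular because two pages link to it, and "b" least popular because nothing links to it.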
METRICS NORMALIZATION
Normalization applied to Sst, Scc, Sso, Stt, Spr and Sstr
Avoids bias toward any particular aspect
\[ S_{i,\mathrm{normalized}} = \frac{S_i - \min(S_i)}{\max(S_i) - \min(S_i)} \]
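Min-max normalization of one metric across all candidate links can be sketched as below. Returning zeros when every score is equal is an assumption made here to avoid division by zero; the slides do not cover that case.

```python
def min_max_normalize(scores):
    """Rescale a list of raw scores to the range [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Assumption: with no spread, treat every score as 0.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


# Hypothetical raw vote scores for three links.
normalized = min_max_normalize([2, 4, 6])
```

After this step the lowest-scoring link maps to 0 and the highest to 1, so metrics with very different raw ranges become comparable.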
FINAL SCORE COMPONENTS
Content Relevance: Scnt = Stitle
Context Relevance: Scxt = (Sst + Scc)/2
Link Popularity: Spop = (Sso + Spr + Sstr)/3
Search Engine Confidence: Sser = Ssew x Stt
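The four components can be computed directly from the individual metrics, as in this sketch. The function and parameter names are hypothetical, and the unweighted sum in `total_score` is an assumption: the slides list the components but not how they are combined or weighted.

```python
def score_components(s_title, s_st, s_cc, s_so, s_pr, s_str, s_sew, s_tt):
    """Compute the four score components from (normalized) metric values."""
    return {
        "Scnt": s_title,                     # content relevance
        "Scxt": (s_st + s_cc) / 2,           # context relevance
        "Spop": (s_so + s_pr + s_str) / 3,   # link popularity
        "Sser": s_sew * s_tt,                # search engine confidence
    }


def total_score(components):
    """Assumption: an unweighted sum of the components."""
    return sum(components.values())


# Hypothetical metric values for a single result link.
components = score_components(
    s_title=0.8, s_st=0.6, s_cc=0.4,
    s_so=0.3, s_pr=0.6, s_str=0.9,
    s_sew=0.5, s_tt=1.0,
)
final = total_score(components)
```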
EXPERIMENT OVERVIEW
25 Exceptions collected from Eclipse IDE workspaces.
Related to Eclipse plug-in framework and Java Application Development
Solutions chosen from exhaustive web search with cross validations by peers
Recommended results manually validated.
EXPERIMENTAL RESULTS
Score                    Top10  Rank10  Top20  Rank20
Scnt                     10     3.60    16     8.63
Scnt, Scxt               11     3.00    16     7.43
Scnt, Spop               13     4.69    18     8.11
Scnt, Sser               23     4.39    23     4.39
Scnt, Scxt, Spop         13     4.07    18     7.61
Scnt, Scxt, Sser         24     4.45    24     4.45
Scnt, Scxt, Sser, Spop   23     4.26    24     4.54
Top10: No. of test cases solved when the top 10 results are considered
Rank10: Average rank of solutions when the top 10 results are considered
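The two measures can be computed from the rank at which a correct solution appears for each test case, as sketched below. The sample ranks are invented for illustration, and averaging the rank only over solved cases is an assumption about how the average was taken.

```python
def top_k_summary(solution_ranks, k):
    """Return (cases solved within top k, average rank of those solutions).

    solution_ranks holds, per test case, the 1-based rank of the correct
    solution, or None when no solution was recommended at all.
    """
    hits = [r for r in solution_ranks if r is not None and r <= k]
    average = sum(hits) / len(hits) if hits else None
    return len(hits), average


# Hypothetical ranks for five test cases.
ranks = [1, 3, None, 12, 8]
top10 = top_k_summary(ranks, 10)
top20 = top_k_summary(ranks, 20)
```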
USER STUDY
Five interesting exception test cases.
Five CS graduate research students as participants.
Top 10 results from SurfClipse presented to the participants in random order, to avoid the bias of choosing top-rated solutions.
64.28% agreement found.
USER STUDY RESULTS
Question ID  ANSR  ANSM  Agreement
Q1           2.8   2.0   71.43%
Q2           4.6   2.8   60.87%
Q3           4.6   2.4   52.17%
Q4           4.2   3.0   71.43%
Q5           5.8   3.8   65.52%
Overall      4.4   2.8   64.28%
ANSR: Avg. no. of solutions recommended by the participants.
ANSM: Avg. no. of solutions matched with those recommended by our approach.
Agreement: % of agreement between solutions.
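The reported figures are consistent with Agreement = ANSM / ANSR per question, and with the overall 64.28% being the mean of the five per-question percentages. That reading is inferred from the numbers and is not stated in the slides. A short check:

```python
# (ANSR, ANSM) per question, as reported in the table.
reported = {
    "Q1": (2.8, 2.0),
    "Q2": (4.6, 2.8),
    "Q3": (4.6, 2.4),
    "Q4": (4.2, 3.0),
    "Q5": (5.8, 3.8),
}

# Per-question agreement as a percentage, assuming Agreement = ANSM / ANSR.
agreement = {q: 100.0 * ansm / ansr for q, (ansr, ansm) in reported.items()}

# Overall agreement, assuming it is the mean of the per-question values.
overall = sum(agreement.values()) / len(agreement)

for question, value in agreement.items():
    print(f"{question}: {value:.2f}%")
print(f"Overall: {overall:.2f}%")
```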
THREATS TO VALIDITY
Search is not real time yet.
Different aspects need different weights.
LATEST UPDATES
A distributed model for IDE-based web search: client-server architecture, remotely hosted web service
Parallel processing in computation
Two modes of operation: proactive and interactive
Granular refinement of metrics and assigning relative weights (i.e., importance)
Complete IDE-based web search solution.
CONCLUSION & FUTURE WORKS
A novel IDE-based search with meta search capabilities
Exploits existing search service providers
Considers content, context, popularity and search engine confidence of a result.
Recommends the correct solution for 24 (96%) out of 25 test cases.
64.28% agreement in user study.
Needs more extended experiments and user studies.
Metrics need to be fine-tuned and made more granular.
THANK YOU !!!