Which Feature Location Technique is Better? Emily Hill, Alberto Bacchelli, Dave Binkley, Bogdan Dit, Dawn Lawrie, Rocco Oliveto


Page 1

Which Feature Location Technique is Better?

Emily Hill, Alberto Bacchelli, Dave Binkley, Bogdan Dit, Dawn Lawrie, Rocco Oliveto

Page 2

Motivation: Differentiating FLTs

[Figure: two result sets, labeled "Totally unrelated" and "In vicinity", each with Precision = 0.20]

Page 3

Example

• Developer works down ranked list

• At each item can explore or not

• When exploring structure, can bail at any time

Page 4

Proposed Approach: Rank Topology

• Use evaluation measures that consider the likelihood of a developer finding fix locations

• Use textual information to approximate the developer’s interest in (i.e., likelihood of) following a “trail” in the structural topology, starting from the ranked list

• Rank topology = inverse of the total number of hops (rank position plus structural hops in the topology)

Page 5

Example

• Developer works down ranked list

• At each item can explore or not

• 3rd rank result + 4 structural hops = 7 total hops

• Rank topology metric = 1 / 7
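The arithmetic in this example can be sketched in code. The snippet below is a minimal illustration, not the authors' implementation: the function names, the adjacency-dictionary graph format, the choice of the smallest total over all ranks, and the value 0.0 for an unreachable fix are all assumptions made for the sketch.

```python
from collections import deque


def structural_hops(graph, start, targets):
    """Fewest structural edges from `start` to any node in `targets` (BFS).

    `graph` maps a method name to the methods it is structurally linked to.
    Returns None when no target is reachable.
    """
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node in targets:
            return dist
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def rank_topology(ranked, graph, fixes):
    """Inverse of (rank + structural hops) for the cheapest path to a fix.

    Assumption: the developer's cost is the smallest total over all ranks;
    0.0 is returned when no fix location is reachable from the list.
    """
    best = None
    for rank, method in enumerate(ranked, start=1):
        hops = structural_hops(graph, method, fixes)
        if hops is not None:
            total = rank + hops
            best = total if best is None else min(best, total)
    return 0.0 if best is None else 1.0 / best


# Hypothetical data mirroring the slide: 3rd-ranked result, 4 structural hops.
ranked = ["a", "b", "c"]
graph = {"c": ["d"], "d": ["e"], "e": ["f"], "f": ["g"]}
print(rank_topology(ranked, graph, {"g"}))  # 1 / 7
```

A fix location that appears directly in the ranked list costs zero structural hops, so the metric then reduces to the reciprocal of its rank.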

Page 6

How “smart” is the user?

• No discrimination: explores everything

• Semi-intelligent: only follows a structural hop if the next method exhibits textual clues

  – Rank topology uses VSM cosine similarity (tf-idf)

  – Structural edge added only if both methods score above the median for the query

  – Supported by user studies of information foraging theory [Lawrance et al., TSE 2013]

• Omniscient: makes no wrong choices, exploring only those ranks and structural hops that lead to a bug
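The semi-intelligent user model can be sketched as an edge filter. This is a minimal illustration under stated assumptions, not the authors' implementation: methods are represented as token lists, the tf-idf weighting uses a smoothed idf of log(N/df) + 1, and the names `tfidf_vectors`, `cosine`, and `semi_intelligent_edges` are hypothetical.

```python
import math
from collections import Counter
from statistics import median


def tfidf_vectors(docs):
    """Build tf-idf vectors for `docs` (name -> list of tokens).

    Returns (vectors, idf). The idf smoothing is an assumption of this sketch.
    """
    n = len(docs)
    df = Counter(term for tokens in docs.values() for term in set(tokens))
    idf = {term: math.log(n / count) + 1.0 for term, count in df.items()}
    vectors = {
        name: {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for name, tokens in docs.items()
    }
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semi_intelligent_edges(methods, edges, query):
    """Keep a structural edge only if both endpoints score above the median."""
    vectors, idf = tfidf_vectors(methods)
    # Query terms unseen in the corpus get weight 0.
    q = {term: tf * idf.get(term, 0.0) for term, tf in Counter(query).items()}
    scores = {name: cosine(q, vec) for name, vec in vectors.items()}
    cutoff = median(scores.values())
    return [(a, b) for a, b in edges if scores[a] > cutoff and scores[b] > cutoff]


# Hypothetical toy corpus: four methods and three structural edges.
methods = {
    "saveFile": ["save", "file", "write"],
    "openFile": ["open", "file", "read"],
    "drawIcon": ["draw", "icon"],
    "logError": ["log", "error"],
}
edges = [("saveFile", "openFile"), ("saveFile", "drawIcon"), ("drawIcon", "logError")]
print(semi_intelligent_edges(methods, edges, ["save", "file"]))
```

In this toy corpus only the edge between `saveFile` and `openFile` survives, because the other two methods share no terms with the query and fall below the median score. The no-discrimination user would correspond to keeping every edge.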

Page 7

Preliminary Study: Distinguish QLM from Random

The ranked lists of results all have the same bug fixes at exactly the same ranks

Page 8

Conclusion

• Rank topology differentiates between randomly ordered lists and a state-of-the-art IR technique (QLM) with relevant results at the exact same ranks

• Future work

  – How well does rank topology mimic developer behavior in practice?

  – How closely can/should we model user behavior?

• Our question: Does the research community need to revise how we evaluate FLTs?

Page 9

Preliminary Study

• Effect of program structure on the rank topology metric for each JabRef bug used in the case study.