Date: 2013/9/25 Author: Mikhail Ageev , Dmitry Lagun , Eugene Agichtein Source : SIGIR’13
Improving Search Result Summaries by Using Searcher Behavior Data
Date: 2013/9/25
Author: Mikhail Ageev, Dmitry Lagun, Eugene Agichtein
Source: SIGIR’13
Advisor: Jia-ling Koh
Speaker: Chen-Yu Huang
Outline
• Introduction
• Behavior-biased snippet generation
  • Text-based snippet generation
  • Inferring relevant text fragments from search behavior
• Experiment
  • Data collection and experiment setup
  • Result
• Conclusion
Introduction
• Improve snippet generation for informational queries
  • Include the desired information directly
  • Provide sufficient information for the user to distinguish
• An ideal snippet would include the text fragment on which the user focused their attention
Introduction
• Present a new approach to improving result summaries by incorporating post-click searcher behavior data.
• The goal is not to replace the existing text-based snippet generation approaches, but rather to add further evidence.
• Following the literature on snippet quality, snippets must satisfy:
  • Representativeness
  • Readability
  • Judgeability
Behavior-biased snippet generation
• Generate candidate fragments (use a strong text-only baseline)
• Infer the document fragments of interest to the user (based on user examination data)
• Generate the final behavior-biased snippet (BeBS)
Behavior-biased snippet generation
• Make three key assumptions:
  • Primarily target informational queries
  • Assume that document visits can be grouped by query intent
  • Assume that user interactions on landing pages can be collected by a search engine or a third party
Text-based snippet generation
• For a given query, first select all sentences that contain at least one query term.
• Extend the method of Metzler and Kanungo by adding additional features.
• Train a Gradient Boosting Regression Tree (GBRT) model on a subset of training query-URL pairs to predict snippet fragment scores.
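The first step, selecting candidate sentences that match at least one query term, can be sketched as follows. This is a minimal illustration only: the tokenizer, the naive sentence splitter, and the function names are assumptions made for the example, not the method used in the paper, and the GBRT scoring step is not shown.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(document):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", document)
    return [p.strip() for p in parts if p.strip()]


def candidate_sentences(query, document):
    """Keep only the sentences that contain at least one query term."""
    query_terms = set(tokenize(query))
    return [s for s in split_sentences(document)
            if query_terms & set(tokenize(s))]


doc = ("The Eiffel Tower is in Paris. It was completed in 1889. "
       "Many tourists visit France every year.")
print(candidate_sentences("eiffel tower height", doc))
```

Each surviving sentence would then be turned into a feature vector and scored by the trained model.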
Inferring relevant text fragments from search behavior
• Collect searcher interactions on web pages
  • Adapt the publicly available EMU toolbar for the Firefox browser, which can collect mouse cursor movements.
  • After the HTML page is rendered in the browser, the toolbar modifies the document's DOM tree so that each word is wrapped in its own DOM element.
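The word-wrapping idea can be illustrated offline with Python's standard `html.parser`. This is only an analogue of what the toolbar does inside the browser: the `WordWrapper` class, the `wrap_words` helper, and the `w<N>` span ids are hypothetical names chosen for the sketch, and comments, doctypes, and script or style contents are not treated specially.

```python
import re
from html import escape
from html.parser import HTMLParser


class WordWrapper(HTMLParser):
    """Re-emit HTML with every word of a text node wrapped in its own <span>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # pieces of the rewritten document
        self.word_count = 0  # running index used for the span ids

    def handle_starttag(self, tag, attrs):
        # Copy the start tag exactly as it appeared in the input.
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Split on whitespace but keep the whitespace, so spacing is preserved.
        for piece in re.split(r"(\s+)", data):
            if piece and not piece.isspace():
                self.out.append(
                    f'<span id="w{self.word_count}">{escape(piece)}</span>')
                self.word_count += 1
            else:
                self.out.append(piece)


def wrap_words(html_text):
    """Return html_text with each word wrapped in a numbered <span>."""
    parser = WordWrapper()
    parser.feed(html_text)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)


print(wrap_words("<p>Hello <b>big</b> world</p>"))
```

Once each word has its own element, the browser can report a bounding box per word, which is what lets cursor positions be mapped back to text.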
Behavior-biased snippet generation
• For each page visit we know:
  • the searcher’s intent
  • the search engine query that the user issued
  • the URL
  • the contents of the document
  • the bounding box of each word in the HTML text
  • the log of behavior actions (mouse cursor coordinates, mouse clicks, scrolling, and the answer to the question that the user found)
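Given word bounding boxes and a cursor log, one behavior signal that could be derived is how long the cursor rested over each word. The sketch below is a hypothetical example of such a signal: the `WordBox` type, the `(timestamp_ms, x, y)` log format, and the rule that a sample's position holds until the next sample are all assumptions, not the feature set used in the paper.

```python
from dataclasses import dataclass


@dataclass
class WordBox:
    """Bounding box of one word in page coordinates (hypothetical layout)."""
    word: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x, y):
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def cursor_dwell_ms(boxes, cursor_log):
    """Total time in ms that the cursor spent over each word box.

    cursor_log is a time-ordered list of (timestamp_ms, x, y) samples; each
    sample's position is assumed to hold until the next sample arrives.
    """
    dwell = [0.0] * len(boxes)
    for (t0, x, y), (t1, _, _) in zip(cursor_log, cursor_log[1:]):
        for i, box in enumerate(boxes):
            if box.contains(x, y):
                dwell[i] += t1 - t0
    return dwell


boxes = [WordBox("search", 0, 0, 50, 20), WordBox("snippet", 60, 0, 120, 20)]
log = [(0, 10, 10), (100, 70, 10), (400, 200, 200), (500, 200, 200)]
print(cursor_dwell_ms(boxes, log))  # [100.0, 300.0]
```

Per-word dwell times can be summed over the words of a fragment to obtain a fragment-level value.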
Inferring relevant text fragments from search behavior
• The behavior features focus on capturing user behavior associated with focused attention; they do not contain any document or query information.
Inferring relevant text fragments from search behavior
• Train the GBRT model on the training set of labeled document fragments.
• Training set labels (after stemming and stopword removal), for visits where the answer that the user submitted is correct:
  • The answer shares words with the fragment: label 1
  • The answer shares no words with the fragment: label 0
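The labeling rule can be sketched as below. The stopword list and the suffix-stripping `crude_stem` function are deliberately tiny stand-ins (assumptions) for whatever stopword list and stemmer were actually used, and returning `None` for incorrect answers reflects one reading of the rule, namely that only visits with a correct answer produce training examples.

```python
import re

# Tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "of", "in", "is", "was", "to", "and"}


def crude_stem(token):
    """Very rough suffix stripping, standing in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def normalize(text):
    """Lowercase, tokenize, drop stopwords, and stem; returns a set of stems."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {crude_stem(t) for t in tokens if t not in STOPWORDS}


def label_fragment(answer, fragment, answer_is_correct):
    """Return 1 or 0, or None when the visit yields no training example."""
    if not answer_is_correct:
        return None
    return 1 if normalize(answer) & normalize(fragment) else 0


print(label_fragment("completed in 1889", "The tower was completed in 1889.", True))
print(label_fragment("1889", "Many tourists visit Paris.", True))
```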
Behavior-biased snippet generation system
• Combine the text-based score TextScore(f) of a candidate fragment with the behavior-based score BScore(f), inferred from the examination data.
  • TextScore(f) takes values in the range [1, 5]
  • BScore(f) takes values in the range [0, 1]
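The transcript gives the two score ranges but not the combination formula, so the sketch below shows only one plausible form: a linear interpolation controlled by a weight λ (written `lam`), after rescaling TextScore from [1, 5] to [0, 1]. Both the interpolation and the rescaling are assumptions made for illustration.

```python
def rescale_text_score(text_score):
    """Map TextScore from [1, 5] onto [0, 1] (assumed normalization)."""
    return (text_score - 1.0) / 4.0


def combined_score(text_score, b_score, lam):
    """Assumed linear interpolation of the two scores.

    lam = 0 uses only the text-based score; lam = 1 uses only the
    behavior-based score.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    return (1.0 - lam) * rescale_text_score(text_score) + lam * b_score


# Hypothetical fragments: name -> (TextScore, BScore)
fragments = {"f1": (4.0, 0.25), "f2": (3.0, 0.75)}

best_text_only = max(fragments, key=lambda f: combined_score(*fragments[f], lam=0.0))
best_mixed = max(fragments, key=lambda f: combined_score(*fragments[f], lam=0.5))
print(best_text_only, best_mixed)  # f1 f2
```

In this toy example, increasing `lam` changes which fragment is selected, which is the kind of effect a weight on behavior evidence is meant to have.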
Data collection and experimental setup
• The participants were recruited through Amazon Mechanical Turk.
• The participants played a search contest “game” consisting of 12 search questions to solve.
  • 98 users
  • 1175 search sessions
  • 2289 page visits
  • 508 distinct URLs
  • 707 different query-URL pairs
Result
• Prediction of fragment interestingness
  • 10-fold cross validation (disjoint)
  • ROUGE
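The slide names ROUGE without saying which variant was used. As an illustration, the sketch below computes unigram recall (ROUGE-1 recall) between a candidate fragment and a reference text; the choice of variant, the tokenizer, and the function name are assumptions.

```python
import re
from collections import Counter


def rouge_1_recall(candidate, reference):
    """Fraction of reference unigrams that also appear in the candidate.

    Counts are clipped, so a repeated reference word is matched at most as
    many times as it occurs in the candidate.
    """
    cand = Counter(re.findall(r"[a-z0-9]+", candidate.lower()))
    ref = Counter(re.findall(r"[a-z0-9]+", reference.lower()))
    if not ref:
        return 0.0
    overlap = sum(min(count, cand[token]) for token, count in ref.items())
    return overlap / sum(ref.values())


print(rouge_1_recall("The tower was completed in 1889", "completed in 1889"))  # 1.0
print(rouge_1_recall("Many tourists visit", "completed in 1889"))  # 0.0
```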
Result
• Text-based baseline
Result
• BeBS system
• λ affects two characteristics of the algorithm:
  • Snippet coverage: the ratio of the snippets produced by the behavior-biased algorithm
  • Snippet quality: judgeability, readability, representativeness, via manual assessments
Result
• BeBS system
Result
• Behavior features
Conclusion
• Presented a novel way to generate behavior-biased search snippets.
• The method primarily targets informational queries; an important consideration is how to generalize the approach to a wider class of queries.
• The approach is not necessarily limited to desktop computers with a mouse.