Query trends CS 349 Presentation December 2nd, 2008 Catherine Grevet.


Query trends

CS 349 Presentation

December 2nd, 2008

Catherine Grevet


Introduction to query trend analysis

What is a query trend?

Analysis of large query volumes over time

Queries are organized into similar topics


How do queries change over time?

Hourly Analysis of a Very Large Topically Categorized Web Query Log [Beitzel et al. 2004]

More queries are issued during peak hours than during non-peak hours

Some topics are more popular during specific times of the day while others are constant throughout the day.

Absolute change vs. Percentage change
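The distinction between absolute and percentage change can be made concrete with a small sketch. The topic names and hourly counts below are invented for illustration and are not taken from Beitzel et al.; the point is only that a topic with a small absolute change can still show a large percentage change.

```python
def absolute_change(before: int, after: int) -> int:
    """Difference in raw query counts between two time slots."""
    return after - before


def percentage_change(before: int, after: int) -> float:
    """Change relative to the earlier count, in percent."""
    if before == 0:
        raise ValueError("percentage change is undefined for a zero baseline")
    return 100.0 * (after - before) / before


# Hypothetical query counts per topic: (off-peak hour, peak hour).
counts = {
    "entertainment": (10_000, 12_000),
    "personal finance": (500, 1_000),
}

changes = {
    topic: (absolute_change(before, after), percentage_change(before, after))
    for topic, (before, after) in counts.items()
}

for topic, (abs_change, pct_change) in changes.items():
    print(f"{topic}: {abs_change:+d} queries, {pct_change:+.1f}%")
```

In this made-up example the larger topic gains more queries in absolute terms, while the smaller topic doubles, which is why the two measures can rank topics differently.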


Google Trends

Started around 2006

Search Volume Index and News Reference Volume

Inaccuracies: data sampling issues and approximations used to compute the trends

Hot Trends shows hourly popular searches: searches that deviate the most from their historic trends

Only shows the top 100

Displays trends in relative mode

http://www.google.com/trends
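One way to read "searches that deviate the most from their historic trends" is as a ranking by how far the current volume sits from a historical baseline. The sketch below uses a simple z-score for that; this is an assumption made for illustration, not Google's actual method, and the function names and sample volumes are hypothetical.

```python
from statistics import mean, pstdev


def deviation_score(history, current):
    """How many standard deviations `current` lies above the historic mean."""
    baseline = mean(history)
    spread = pstdev(history)
    if spread == 0:
        # Flat history: any difference counts as an unbounded deviation.
        return 0.0 if current == baseline else float("inf")
    return (current - baseline) / spread


def hottest(history_by_query, current_by_query, top_n=100):
    """Return the queries whose current volume deviates most from history."""
    scores = {
        query: deviation_score(history, current_by_query.get(query, 0))
        for query, history in history_by_query.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical hourly volumes for two queries.
history = {
    "weather": [100, 102, 98, 100],
    "election results": [5, 7, 6, 6],
}
current = {"weather": 101, "election results": 60}

print(hottest(history, current, top_n=1))
```

Under this scoring, a steadily popular query such as the hypothetical "weather" ranks low even though its raw volume is higher, because it barely moves relative to its own history.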


Example of trend analysis for public health

Study by Eysenbach in 2006

Query trends used for syndromic surveillance to signal an outbreak

Google AdSense ads used to track geographic locations and number of hits for flu-related queries

Results: search engine clicks were a better and timelier predictor of flu outbreaks than physicians' reports

“Infodemiology”: track health demand and supply trends
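A "timelier predictor" can be checked by correlating click counts with case counts at different time lags. The sketch below is a minimal illustration of that idea with a hand-written Pearson correlation; the weekly numbers are made up and are not data from the Eysenbach study, and the function names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(clicks, cases, lag):
    """Correlate clicks in week t with cases in week t + lag."""
    if lag == 0:
        return pearson(clicks, cases)
    return pearson(clicks[:-lag], cases[lag:])


# Invented weekly series: cases roughly follow clicks one week later.
weekly_clicks = [10, 20, 40, 30, 15, 10]
weekly_cases = [5, 11, 19, 41, 29, 16]

same_week = lagged_correlation(weekly_clicks, weekly_cases, lag=0)
next_week = lagged_correlation(weekly_clicks, weekly_cases, lag=1)
print(f"same week: {same_week:.2f}, one week ahead: {next_week:.2f}")
```

If the correlation is higher at a positive lag than at lag zero, as in this toy series, the clicks carry information about cases before those cases are reported.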


Google as a health monitor

Google Flu Trends started on google.org in November 2008

http://www.google.org/flutrends/


Wellesley trends

Wellesley.edu uses a Google Mini search to search the Wellesley domain

Outputs monthly results for 100 top queries

Data: 100 top queries from November 2006 to September 2008

Goal: Make sense of the data to improve the wellesley.edu website

How: Wellesley trends!
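Making sense of monthly top-query lists can start with grouping queries into topics and summing their counts per month. The sketch below assumes a hypothetical data layout (month mapped to a list of query and count pairs) and a hand-made category table; neither the layout nor the sample queries come from the actual Wellesley data.

```python
from collections import defaultdict


def category_trends(monthly_top_queries, category_of):
    """Sum query counts per category for each month.

    monthly_top_queries: {"YYYY-MM": [(query, count), ...]}
    category_of: {lowercased query: category name}, assigned by hand
    """
    trends = defaultdict(dict)
    for month, rows in monthly_top_queries.items():
        totals = defaultdict(int)
        for query, count in rows:
            category = category_of.get(query.lower(), "uncategorized")
            totals[category] += count
        for category, total in totals.items():
            trends[category][month] = total
    # Sort months so each category reads as a time series.
    return {cat: dict(sorted(months.items())) for cat, months in trends.items()}


# Hypothetical monthly top-query data and manual categories.
monthly = {
    "2008-08": [("registrar", 120), ("Housing", 300)],
    "2008-09": [("registrar", 450), ("course browser", 600)],
}
categories = {
    "registrar": "academics",
    "course browser": "academics",
    "housing": "student life",
}

trends = category_trends(monthly, categories)
print(trends)
```

The category table is the manual step in this sketch: every new query that appears in a monthly list has to be assigned a topic by a person before it contributes to a trend.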


Wellesley trends


Concerns

Privacy

Needs human labor for categorizing

Inaccuracies and data sampling issues


To solve the human categorizing problem

Goal: understand the intent of the user through classification

Automatic categorization into informational, navigational and transactional groups [Jansen 2008]

Manually derived the characteristics and then ran the algorithm on search engine logs

This categorization was 74% accurate
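The approach of deriving characteristics by hand and then applying them automatically can be sketched as a small rule-based classifier. The keyword lists, the site names, and the rule order below are hypothetical stand-ins chosen for illustration; they are not the characteristics derived in the cited study, and the accuracy printed at the end applies only to this toy sample.

```python
import re

# Hypothetical characteristics, written by hand for this sketch.
TRANSACTIONAL_TERMS = {"download", "buy", "order", "lyrics", "games", "software"}
KNOWN_SITES = {"google", "wellesley", "ebay"}
URL_PATTERN = re.compile(r"(www\.|https?://|\.(com|org|edu|net)\b)")


def classify_intent(query: str) -> str:
    """Label a query as navigational, transactional, or informational."""
    q = query.lower().strip()
    words = q.split()
    # Navigational: looks like a URL, or a very short query naming a site.
    if URL_PATTERN.search(q) or (
        len(words) <= 2 and any(w in KNOWN_SITES for w in words)
    ):
        return "navigational"
    # Transactional: contains a term that suggests obtaining something.
    if any(w in TRANSACTIONAL_TERMS for w in words):
        return "transactional"
    # Informational is the fallback category.
    return "informational"


def accuracy(labeled_queries) -> float:
    """Fraction of (query, human_label) pairs the classifier gets right."""
    correct = sum(classify_intent(q) == label for q, label in labeled_queries)
    return correct / len(labeled_queries)


# Tiny hand-labeled sample, invented for this example.
sample = [
    ("www.wellesley.edu", "navigational"),
    ("download firefox", "transactional"),
    ("how do vaccines work", "informational"),
    ("cheap flights", "transactional"),  # missed by the rules above
]
print(accuracy(sample))
```

The misclassified last query shows the main limit of hand-written rules: intent that is not signalled by a listed term falls through to the informational default.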


Conclusions


Analyzing query trends can be particularly useful for the next generation of search engines.

Understanding these trends can also have important implications for health, politics…

For domain-specific search engines (such as Wellesley mini search), the queries will be mainly related to the website. By using automatic categorization and analysis of historical trends, the homepage of the website could adapt to the popular queries.


References


S. Beitzel et al., Hourly Analysis of a Very Large Topically Categorized Web Query Log, SIGIR 2004

G. Eysenbach, Infodemiology: Tracking Flu-Related Searches on the Web for Syndromic Surveillance, AMIA 2006 Symposium Proceedings, 2006

B. Jansen, Determining the User Intent of Web Search Engine Queries, 2007

http://www.google.com/trends

Thanks to Claire Lorenz for Wellesley mini search data