TwitterSearch: A Comparison of Microblog Search and Web Search
Jaime Teevan, Daniel Ramage, Meredith Ringel Morris
Microsoft Research, Stanford University, Microsoft Research
WSDM '11
March 11, 2011
Presented by Somin Kim
Outline
Introduction
Why People Search Twitter
How People Search Twitter
What People Find on Twitter
Design Implications
Conclusion
Introduction (1/3)
Social networking service users share messages with their network of friends and with the general public
One of the most popular social networking services is Twitter
Introduction (2/3)
People use microblogging services like Twitter to share information
Tweets:
– News about the person posting
– Commentary on links
– Directed discussion
– The poster’s current mood
– Location information
– Any other content …
Introduction (3/3)
People also use microblogging services to find information
– Some status updates are questions directed to the user’s social connections
– Twitter provides a search interface to access public tweets
– Bing and Google have begun to provide online search of Twitter posts

Different properties of content on microblogs and on the Web

Microblogging content               Web content
• Short                             • Rich
• Generated frequently              • Generated more slowly
• Do not change after being posted  • Evolve after creation
Outline
Introduction
Why People Search Twitter
– Timely Information
– Social Information
– Topical Information
How People Search Twitter
What People Find on Twitter
Design Implications
Conclusion
Why People Search Twitter (1/4)
We conducted a freeform questionnaire and a structured interview with some Twitter users at Microsoft
– “When you search Twitter, what kind of information are you looking for?”

Behavior of Twitter users at Microsoft
– The median length of Twitter use was 1-2 years
– 83% read tweets at least once per day
– 59% wrote tweets at least once per day
– The number of people followed was higher than typical
    The mean was 370.4 and the median was 159.5
– The most popular applications were TweetDeck, Twitter, and Seesmic
– 87% had searched Twitter posts
Why People Search Twitter (2/4)
Timely Information
Current events
– To keep up with what was happening
– To understand trends
– Ex.
    Information related to news (technology news, trends)
    Topics gaining popularity (currently trending topics)
    Summaries of events colleagues were attending (event hashtags)

Real-time information
– Ex.
    Regional/local information (police incidents, weather, etc.)
    Reports of traffic
    Status of online services
Why People Search Twitter (3/4)
Social Information
Information related to other Twitter users
– To find individuals with specific interests
– To discover what particular individuals were saying
– To understand the context of tweets that others wrote
– To look for replies

General picture of people’s overall opinions on a particular topic
– To learn the community buzz
– Ex.
    Movie reviews
    Marketing campaigns
    Upcoming Microsoft event or product
Why People Search Twitter (4/4)
Topical Information
Information related to specific topics
– It closely matched traditional Web search motivations
– It contained themes related to timely and social information

Information previously encountered
– But it is difficult to re-find using Twitter search
Outline
Introduction
Why People Search Twitter
How People Search Twitter
– Collecting Twitter and Web Queries
– Queries Issued
– Temporal Aspects of Search Behavior
– Common Cross-Corpus Queries
What People Find on Twitter
Design Implications
Conclusion
How People Search Twitter (1/10)
Collecting Twitter and Web Queries
Query data issued to the Twitter search engine and to Web search engines
– Sampled from the Web browser logs of the Bing toolbar
– Drawn from millions of users and including hundreds of millions of page visits
– Collected during November 11-24, 2009

The browser logs contain query URLs associated with multiple search engines
– It is possible to extract the queries issued to each engine from the URLs
– It is possible to associate the queries with user IDs and timestamps
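The URL-based extraction described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the host names, the query-parameter names in `ENGINES`, the `(user_id, timestamp, url)` log layout, and the lowercasing of queries are all assumptions made for the example.

```python
from urllib.parse import urlparse, parse_qs

# Assumed mapping from host to (engine label, query parameter name).
ENGINES = {
    "search.twitter.com": ("twitter", "q"),
    "www.bing.com": ("bing", "q"),
    "www.google.com": ("google", "q"),
    "search.yahoo.com": ("yahoo", "p"),
}


def extract_query(url):
    """Return (engine, query) for a recognized search URL, else None."""
    parts = urlparse(url)
    engine = ENGINES.get(parts.netloc.lower())
    if engine is None:
        return None
    name, param = engine
    # parse_qs percent-decodes values and drops blank ones by default.
    values = parse_qs(parts.query).get(param)
    if not values or not values[0].strip():
        return None
    # Lowercasing is an assumed normalization step.
    return name, values[0].strip().lower()


def parse_log(rows):
    """Yield (user_id, timestamp, engine, query) for rows that are searches.

    rows: iterable of (user_id, timestamp, url) tuples (hypothetical layout).
    """
    for user_id, timestamp, url in rows:
        hit = extract_query(url)
        if hit is not None:
            engine, query = hit
            yield user_id, timestamp, engine, query
```

Keeping the engine table as data means a new engine can be added without touching the parsing logic.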
How People Search Twitter (2/10)
Collecting Twitter and Web Queries
Twitter queries are from the Twitter search engine, and Web queries are from Bing, Google, and Yahoo

Some queries were treated as the same query instance
– Occurred within fifteen minutes of each other
– In the same window
– To the same Web search engine
– With no other queries intervening
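A rough sketch of collapsing repeated queries into one query instance under the rules above. The event layout is hypothetical, the browser-window condition is left out because the example data has no window field, and measuring the fifteen minutes from the immediately preceding query (rather than from the first query of the run) is an assumption.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=15)


def collapse_instances(events):
    """Drop repeats so each remaining event is one query instance.

    events: list of (user_id, timestamp, engine, query) tuples.
    An event is a repeat when the same user's immediately preceding event
    has the same engine and query text and is at most 15 minutes older.
    """
    last_seen = {}  # user_id -> (timestamp, engine, query) of previous event
    kept = []
    for user_id, ts, engine, query in sorted(events, key=lambda e: (e[0], e[1])):
        prev = last_seen.get(user_id)
        is_repeat = (
            prev is not None
            and prev[1] == engine
            and prev[2] == query
            and ts - prev[0] <= WINDOW
        )
        if not is_repeat:
            kept.append((user_id, ts, engine, query))
        # Any query, repeat or not, becomes the new "previous" event,
        # which enforces the "no other queries intervening" condition.
        last_seen[user_id] = (ts, engine, query)
    return kept
```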
How People Search Twitter (3/10)
Queries Issued
Key differences between Twitter and Web query strings
How People Search Twitter (4/10)
Queries Issued
Web queries are navigational
Twitter queries are temporally based or popular internet memes
How People Search Twitter (5/10)
Queries Issued
Celebrity Queries
– Celebrities were a popular topic among both Twitter and Web searches
– Celebrity queries issued to Twitter
    Significantly more likely to be a celebrity name
    Motivated by a desire for timely information
– Celebrity queries issued to Web
    Much more likely to include a celebrity name and additional content
    Motivated by a desire to learn more about a particular aspect of some person

                                        Twitter queries   Web queries
Celebrity name                          15.22%            3.11%
Celebrity name and additional content   6.51%             14.86%
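The two celebrity categories above can be told apart with a simple lookup. The sketch below is only illustrative: the `CELEBRITIES` set is a hypothetical placeholder, and the source does not say how celebrity names were actually identified.

```python
# Hypothetical placeholder names; a real study would need a curated list.
CELEBRITIES = {"jane doe", "john roe"}


def celebrity_category(query, names=CELEBRITIES):
    """Classify a query as 'name', 'name+content', or 'other'."""
    q = " ".join(query.lower().split())  # normalize case and whitespace
    if q in names:
        return "name"
    padded = f" {q} "
    # Match the name on word boundaries, with extra terms around it.
    if any(f" {name} " in padded for name in names):
        return "name+content"
    return "other"


def category_shares(queries, names=CELEBRITIES):
    """Return the percentage of queries falling in each category."""
    counts = {"name": 0, "name+content": 0, "other": 0}
    for q in queries:
        counts[celebrity_category(q, names)] += 1
    total = len(queries) or 1
    return {k: 100.0 * v / total for k, v in counts.items()}
```

Running `category_shares` separately over the Twitter and Web query sets would give figures comparable in form to the table above.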
How People Search Twitter (6/10)
Queries Issued
Specialized syntax
– @
    As a Twitter convention, @ is used to refer to a user’s alias
    In Twitter queries, the @ symbol usually appears at the beginning
    In Web queries, the @ symbol usually appears as part of an e-mail address
    @ is more common in the body of tweets (36%) than in Twitter queries
– #
    As a Twitter convention, # is used in hashtags, adopted to self-tag posts
    In Twitter queries, the # symbol usually appears at the start of a word
    In Web queries, the # symbol usually represents the term “number”
    Many hashtags are compound words
– The words in Twitter queries (7.31 characters) are on average longer than Web query words (6.10 characters)
– @ and # can improve search success by reducing ambiguity
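One way to separate the Twitter-style uses of @ and # from the Web-style uses described above is with two regular expressions. This is a sketch under assumptions: the patterns are illustrative rather than the authors' definitions, and counting the @ or # character as part of a word's length is a choice made for the example.

```python
import re

# '@alias' not preceded by a word character, so e-mail addresses are skipped.
MENTION = re.compile(r"(?<![\w@])@\w+")
# '#' at the start of a word and followed by a letter, so '#1' is skipped.
HASHTAG = re.compile(r"(?<!\w)#[A-Za-z_]\w*")


def query_features(query):
    """Return simple syntax and length features for one query string."""
    words = query.split()
    return {
        "has_mention": bool(MENTION.search(query)),
        "has_hashtag": bool(HASHTAG.search(query)),
        "n_words": len(words),
        "mean_word_len": sum(len(w) for w in words) / len(words) if words else 0.0,
    }
```

Because compound hashtags such as `#followfriday` count as a single word, they push the average word length of Twitter queries up, which is consistent with the comparison above.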
How People Search Twitter (7/10)
Queries Issued
Query popularity
– In general, Twitter queries are more consistent
    Searchers often issue queries on Twitter by clicking rather than typing
    – Trending topics (listed by the Twitter search box)
    The use of hashtags encourages query convergence
    – Popular Twitter queries are much more likely to contain a hashtag
    – 50.73% of the 50 most popular Twitter queries contain a hashtag
    Timely queries are popular
    – Only a limited set of topics are current at any one time
    Celebrity names are popular queries
    – 24.92% of the 50 most popular Twitter queries contain a celebrity name
How People Search Twitter (8/10)
Temporal Aspects of Search Behavior
Search sessions
– Twitter session behavior often appears to involve monitoring of tweets
– Overlapping but non-duplicate queries are more common with Web search

Re-finding
– More repeat queries overall on Twitter than on the Web
    In Web search, repeat queries often lead to re-finding
    For Twitter, people use repeat queries to monitor topics over time
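The repeat-query comparison above can be made concrete with a small measure. The sketch below assumes one reasonable definition, the fraction of query instances whose text the same user has issued before; the source does not state the exact definition used.

```python
from collections import defaultdict


def repeat_rate(events):
    """Fraction of query instances that repeat an earlier query by the same user.

    events: iterable of (user_id, query) pairs in time order, already
    collapsed into query instances.
    """
    seen = defaultdict(set)  # user_id -> queries issued so far
    total = repeats = 0
    for user_id, query in events:
        total += 1
        if query in seen[user_id]:
            repeats += 1
        else:
            seen[user_id].add(query)
    return repeats / total if total else 0.0
```

Computing this once over the Twitter instances and once over the Web instances gives the two rates to compare.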
How People Search Twitter (9/10)
Common Cross-Corpus Queries
Common queries are the queries issued to both services by the same individual
How People Search Twitter (10/10)
Common Cross-Corpus Queries
Queries
– Common queries were usually a succinct representation of the common need

Temporal aspects
– Common queries were more likely to be issued on the Web first
– Web search sessions were more likely to include related queries
– People issued common queries with roughly the same frequency on Twitter and Web
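A minimal sketch of finding common cross-corpus queries and checking which service each was issued to first. The event layout and the `'twitter'` / `'web'` corpus labels are assumptions for the example, and ties on timestamp are arbitrarily attributed to Twitter.

```python
def common_queries(events):
    """Find queries a user issued to both services and where each came first.

    events: iterable of (user_id, timestamp, corpus, query) tuples, where
    corpus is 'twitter' or 'web' and timestamps are comparable values.
    Returns {(user_id, query): 'twitter' or 'web'} naming the first corpus.
    """
    first = {}  # (user_id, query, corpus) -> earliest timestamp
    for user_id, ts, corpus, query in events:
        key = (user_id, query, corpus)
        if key not in first or ts < first[key]:
            first[key] = ts

    result = {}
    for (user_id, query, corpus), ts in first.items():
        if corpus != "twitter":
            continue
        web_ts = first.get((user_id, query, "web"))
        if web_ts is not None:
            result[(user_id, query)] = "web" if web_ts < ts else "twitter"
    return result
```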
Outline
Introduction
Why People Search Twitter
How People Search Twitter
What People Find on Twitter
– Collecting Twitter and Web Results
– Language Differences in Results
Design Implications
Conclusion
What People Find on Twitter (1/4)
Collecting Twitter and Web Results
Twitter content for the queries was collected from Twitter’s spritzer stream for November 17-24, 2009

Twitter search results differ from Web search results
– Twitter search results are presented to the user in the result list
– To represent the Web search results, we extracted the title and summary text of the results

For a better exploration of differences in Twitter and Web search results, very common and very rare terms were filtered from each query-specific result set
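The term filtering described above could look like the following document-frequency filter. The thresholds `min_df` and `max_df_ratio` are illustrative assumptions; the source does not give the cutoffs that were used.

```python
from collections import Counter


def filter_terms(docs, min_df=2, max_df_ratio=0.9):
    """Remove very rare and very common terms from one query's result set.

    docs: list of token lists (one per tweet or Web snippet).
    A term is kept if it appears in at least min_df documents and in no
    more than max_df_ratio of all documents. Both thresholds are assumed.
    """
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    n = len(docs)
    keep = {t for t, c in df.items() if c >= min_df and c / n <= max_df_ratio}
    return [[t for t in tokens if t in keep] for tokens in docs]
```

Filtering per result set, rather than over the whole collection, means a term that is common for one query can still survive for another.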
What People Find on Twitter (2/4)
Language Differences in Results
The amount of information available following a query
– Average number of words
    In Twitter results: 19.55
    In Web snippets: 33.95
– Content can be found via links
    Web snippets are associated with a Web page
    Only 34% of the Twitter results contain an external link

Many common terms are shared, but there are differences between Twitter and Web search results
What People Find on Twitter (3/4)
Language Differences in Results
We used LDA’s per-document topic distributions

Characteristics of topics in each corpus
– Common topics
    Information semantically related to the query
– Twitter topics
    Social chatter and current events
– Web topics
    Basic facts and navigational results
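Given per-document topic distributions from an already fitted LDA model, topics can be grouped into common, Twitter, and Web topics by comparing their average weight in each corpus. The sketch below is one hypothetical way to do this, not the authors' procedure: the `ratio` threshold and the labeling rule are assumptions, and fitting the LDA model itself is out of scope.

```python
def label_topics(theta, corpus, ratio=2.0):
    """Label each topic as 'twitter', 'web', or 'common'.

    theta: list of per-document topic distributions (equal-length lists).
    corpus: parallel list of 'twitter' / 'web' labels, one per document.
    A topic is assigned to a corpus when its mean weight there is more
    than `ratio` times its mean weight in the other corpus (assumed rule).
    """
    k = len(theta[0])
    sums = {"twitter": [0.0] * k, "web": [0.0] * k}
    counts = {"twitter": 0, "web": 0}
    for dist, c in zip(theta, corpus):
        counts[c] += 1
        for i, p in enumerate(dist):
            sums[c][i] += p

    labels = []
    for i in range(k):
        tw = sums["twitter"][i] / max(counts["twitter"], 1)
        web = sums["web"][i] / max(counts["web"], 1)
        if tw > ratio * web:
            labels.append("twitter")
        elif web > ratio * tw:
            labels.append("web")
        else:
            labels.append("common")
    return labels
```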
What People Find on Twitter (4/4)
Language Differences in Results
The language of tweets is significantly different from that of the Web results
Web results are more topically diverse than are tweets
Outline
Introduction
Why People Search Twitter
How People Search Twitter
What People Find on Twitter
Design Implications
Conclusion
Design Implications (1/2)
Suggestions for the design of next-generation search tools
– Enhancing temporal queries
    Return not only recent tweets but also top links or top stories
– Enriching people search
    Incorporate more information into Twitter and Web result pages
– Leveraging hashtags
    Social tags manually added to some sites could be automatically supplemented with hashtags
– Employing user history
    Because there are many more repeat queries on Twitter, query history could be useful
Design Implications (2/2)
Suggestions for the design of next-generation search tools (cont.)
– Providing query disambiguation
    If a query-specific Twitter topic were popular, pages matching that topic could be ranked higher
    Tweets that include questions could suggest additional information
Outline
Introduction
Why People Search Twitter
How People Search Twitter
What People Find on Twitter
Design Implications
Conclusion
Conclusion
Twitter Search vs Web Search
– People’s motivation for searching Twitter included an interest in timely information, social information, and topical information
– Twitter queries were shorter, but contained longer words, more specialized syntax, and more references to people
– People often used Twitter search to monitor for new content, while Web search was used to develop and learn about a topic
– Twitter results included more social content and events, while Web results contained more facts and navigation

We hope this understanding enables a new generation of search tools