Post on 28-Nov-2014
time for events: telling the world’s stories from social media
Mor Naaman, Rutgers SC&I & Mahaya, Inc.
@informor
enter: social media
(JCDL 2007)
yes.
(SIGIR 2007)
organize the world’s memories
people, together
BYOBW
outside lands festival
organize the world’s memories
objectives: detect, identify, organize
objectives: detect (ICWSM 2011a, JASIST 2011, WebDB 2009, SIGIR 2007)
objectives: identify (WSDM 2012, ICWSM 2011b, WSDM 2010)
objectives: organize (ICMR 2012, CHI 2012, CSCW 2012, MTAP 2012, VAST 2010, WWW 2009)
today: detect, organize, identify
Vox, Multiplayer, multi-site
overview: Vox Civitas, Multiplayer, Multi-site content
[with Hila Becker, Luis Gravano]
goal: effectively retrieve social media content for known events from multiple services
challenges:
- event descriptor not well-formed
- brief textual descriptors
- noise
- formats/conventions/metadata differ
approach:
- two-step query formulation: precision-based, then recall-based
- validate queries based on known/extracted event model
step 1: term extraction from event descriptors generates “high precision” queries, e.g. “andrew bird, opening gala, celebrate brooklyn, prospect park”
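A minimal sketch of what step 1 could look like, assuming the event descriptor has already been split into short phrases (title, venue, series name, and so on). The function names, the stopword list, and the choice to pair phrases into quoted conjunctive queries are illustrative assumptions, not the method used in the talk.

```python
from itertools import combinations

# Hypothetical, tiny stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "at", "in", "of", "and", "with"}

def normalize(phrase):
    """Lowercase a descriptor phrase and drop stopwords."""
    words = [w for w in phrase.lower().split() if w not in STOPWORDS]
    return " ".join(words)

def precision_queries(phrases):
    """Pair up descriptor phrases into strict (high-precision) queries.

    Each query requires two exact phrases to co-occur, which is an
    assumed stand-in for whatever query templates the authors used.
    """
    terms = []
    for phrase in phrases:
        term = normalize(phrase)
        if term and term not in terms:
            terms.append(term)
    return [f'"{a}" "{b}"' for a, b in combinations(terms, 2)]

queries = precision_queries(
    ["Andrew Bird", "Opening Gala", "Celebrate Brooklyn", "Prospect Park"]
)
```

Requiring two phrases at once keeps the result set small but clean, which is the property step 2 relies on.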
step 2: use “high precision” corpus to generate more general queries to improve recall, e.g. “andrew bird concert”, “state farm insurance”
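Step 2 can be pictured as mining the documents returned by the high-precision queries for frequent word sequences. The sketch below is an assumption-laden illustration: it counts bigrams and trigrams once per document and keeps the most common ones. The function names and the `min_doc_freq` and `top_k` parameters are hypothetical.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return all contiguous n-word sequences as strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def recall_queries(docs, n_values=(2, 3), min_doc_freq=2, top_k=5):
    """Propose broader queries from a high-precision corpus.

    Candidates are n-grams ranked by the number of documents they
    appear in (ties broken alphabetically).
    """
    doc_freq = Counter()
    for doc in docs:
        tokens = doc.lower().split()
        seen = set()
        for n in n_values:
            seen.update(ngrams(tokens, n))
        doc_freq.update(seen)  # count each n-gram once per document
    frequent = [(g, c) for g, c in doc_freq.items() if c >= min_doc_freq]
    frequent.sort(key=lambda gc: (-gc[1], gc[0]))
    return [g for g, _ in frequent[:top_k]]
```

Frequency alone will also surface off-topic phrases that happen to recur in the corpus, so the candidates still need validating against the event.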
recall-oriented queries
Benefits:
- Works cross-site
- Works with short content
Challenges:
- Introduces noise
- Potentially large set of queries
post-filtering:
- use known event model (topics, time, location)
- keep only queries whose result set matches the known model
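As a rough illustration of post-filtering, the sketch below checks only the time component of the event model: a query is kept when enough of its results fall close to the known event time. The one-day window, the 0.5 threshold, and the function names are assumptions made for this example. The topic and location checks mentioned on the slide are not modeled.

```python
from datetime import datetime, timedelta

def fraction_near_event(timestamps, event_time, window=timedelta(days=1)):
    """Share of result timestamps within `window` of the event time."""
    if not timestamps:
        return 0.0
    near = sum(1 for t in timestamps if abs(t - event_time) <= window)
    return near / len(timestamps)

def filter_queries(results_by_query, event_time, min_fraction=0.5):
    """Keep queries whose results cluster around the known event time.

    `results_by_query` maps a query string to the timestamps of the
    documents it retrieved.
    """
    return [
        query
        for query, timestamps in results_by_query.items()
        if fraction_near_event(timestamps, event_time) >= min_fraction
    ]
```

A query whose results are spread evenly over many days would be dropped, while one that spikes around the event would survive.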
for example...
[Figure: number of documents per day, 6/7/11 to 6/13/11, for the queries [andrew bird concert] and [state farm insurance]]
[Figure: NDCG vs. number of documents k (0 to 25); series: Precision, Twitter-MS, YouTube-MS]
evaluation:
- query generation
- relevance of retrieved documents
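For readers unfamiliar with NDCG, the metric used to score the relevance of retrieved documents at a cutoff k, here is a minimal sketch of the standard formulation. It assumes graded relevance labels listed in ranked order and computes the ideal ranking from the retrieved list only. The slides do not say which variant was used.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k ranked results."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )

def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the best possible ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

A ranking that places all relevant documents first scores 1.0, and the score drops as relevant documents slide down the list.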
takeaways:
- can aggregate content fragmented across platforms
- improve recall without relying on site-specific features
overview: Vox Civitas, Multiplayer, Multi-site content (WSDM 2012)
[with postdoctoral fellow Nick Diakopoulos]
research questions:
- can Twitter content around broadcast news events inform journalistic inquiry?
- what insights and analyses can we enable through visual analytic tools?
direct attention to relevant information
automatic content analysis for filtering
– relevance
– uniqueness / novelty
– sentiment
– keyword extraction
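To make one of these filters concrete, the sketch below scores uniqueness / novelty as one minus the highest Jaccard word overlap with any earlier message. This is a swapped-in, simplified technique chosen for illustration. The slides do not describe how the actual system computes the score, and the function names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of tokens."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def novelty_scores(messages):
    """Score each message by how unlike all earlier messages it is.

    Returns a value in [0, 1] per message: 1.0 means no word overlap
    with anything seen before, 0.0 means an exact repeat.
    """
    seen = []
    scores = []
    for text in messages:
        tokens = set(text.lower().split())
        max_sim = max((jaccard(tokens, prev) for prev in seen), default=0.0)
        scores.append(1.0 - max_sim)
        seen.append(tokens)
    return scores
```

Sorting or thresholding on such a score is one way to push retweets and near-duplicates out of view.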
supporting analysis
how to evaluate?
- directly evaluate the output of the algorithms (quantitative)
- deep, extensive evaluation of users’ interaction with the system (qualitative)
read more: Olsen (UIST ’07), Naaman (MTAP ’12)
Vox evaluation goals
• How effective is it for generating story ideas?
• What kinds of insights/analyses are supported?
• What are the shortcomings, and how are features used?
takeaways: can extract reliable event structure from social media
overview: Vox Civitas, Multiplayer, Multi-site content
(VAST 2010)
what the hell?
[with: Lyndon Kennedy, Dan Ellis, Kai Su]
supporting analysis: extract the signal from people’s attention
- find overlapping moments
- compute and rank scenes
- extract scene descriptors
audio fingerprinting
Wang et al. (ISMIR ’03)
two clips, aligned
[Figure: two clip timelines aligned against each other; time labels 0:00, 0:18, 2:32, 3:32]
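The alignment idea behind landmark-style audio fingerprinting (as in Wang et al.) can be sketched as a vote over time differences: when two clips share many fingerprint hashes at a consistent offset, that offset is where they align. The sketch assumes the hashes have already been extracted as `(hash_value, time_in_seconds)` pairs. The extraction step, the function name, and the `min_votes` threshold are assumptions.

```python
from collections import Counter

def estimate_offset(hashes_a, hashes_b, min_votes=3):
    """Estimate how far into clip A clip B starts, in seconds.

    Each matching hash votes for the difference between its time in
    clip A and its time in clip B; the most-voted difference wins.
    Returns None when no offset gathers at least `min_votes` votes.
    """
    times_b = {}
    for h, t in hashes_b:
        times_b.setdefault(h, []).append(t)

    votes = Counter()
    for h, t_a in hashes_a:
        for t_b in times_b.get(h, []):
            votes[round(t_a - t_b, 1)] += 1  # bin offsets to 0.1 s

    if not votes:
        return None
    offset, count = votes.most_common(1)[0]
    return offset if count >= min_votes else None
```

Chance hash collisions scatter their votes across many offsets, so a single tall peak is good evidence that two clips recorded the same moment.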
a story of n clips
[Figure: n clips laid out along a time axis]
from clips to scenes
[Figure: time axis with scene labels: Happy Birthday, Birthday, Higher Ground, Encore]
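Once clips sit on a shared timeline, finding overlapping moments and ranking scenes can be sketched as a sweep over clip start and end points. The code below assumes each clip is a `(start, end)` pair in seconds and treats the number of simultaneous clips as the attention signal. The function names and the `min_clips` threshold are hypothetical.

```python
def overlap_segments(clips):
    """Split the timeline into segments with a constant clip count.

    Returns (start, end, n_clips) tuples for every stretch of time
    covered by at least one clip.
    """
    events = []
    for start, end in clips:
        events.append((start, 1))   # a clip begins
        events.append((end, -1))    # a clip ends
    events.sort()

    segments = []
    active = 0
    prev = None
    for time, delta in events:
        if prev is not None and time > prev and active > 0:
            segments.append((prev, time, active))
        active += delta
        prev = time
    return segments

def rank_scenes(clips, min_clips=2):
    """Rank segments by how many clips captured them, most first."""
    segments = [s for s in overlap_segments(clips) if s[2] >= min_clips]
    return sorted(segments, key=lambda s: (-s[2], s[0]))
```

The moments many people chose to record at once float to the top, which is the signal from people's attention that the talk describes.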
evaluation:
- quantitative: evaluated matching, scene extraction…
- qualitative: evaluated deployment scenario/task
takeaways: can create an event presentation that gets better the more content is added
overview: Vox Civitas, Multiplayer, Multi-site content
(NM&S 2012, ICMR 2012, MTAP 2012, WWW 2009)
towards better models of large-scale human attention
printing press → knowledge archive
digital documents → digital archive
the web → networked archive
social media → experience archive
new methods?
search by subject code?
explore.
- new information seeking tasks (and models)
- new applications for social media content
explore.
- beyond real-time
- personal and social
mor@rutgers.edu @informor
http://mornaaman.com
questions?
thanks: Luis Gravano, Hila Becker, Nick Diakopoulos, Kai Su, Dan Ellis, Munmun de Choudhury, Tarikh Korula, …