The Power of Social Media

The Power of Social Media: Today is the Memory of Tomorrow. Remember! Ricardo Baeza-Yates, VP, Yahoo! Research, Barcelona, Spain & Santiago, Chile

description

Presentation by Ricardo Baeza-Yates (Yahoo! Research Barcelona, Spain & Santiago, Chile) at the workshop La Memoria al Tempo di Internet

Transcript of The Power of Social Media

Page 1: The Power of Social Media

The Power of Social Media

Today is the Memory of Tomorrow

Remember!

Ricardo Baeza-Yates

VP, Yahoo! Research

Barcelona, Spain & Santiago, Chile

Page 2: The Power of Social Media

Yahoo! Research

Agenda

• The Internet and the Web today

• Web 2.0 and Social Media

• Example: Social Search

• Yahoo! Research

• The Wisdom of the Crowds

• The Future

Internet and the Web

Page 3: The Power of Social Media

Internet and the Web Today

• Between 1 and 2.5 billion people connected

– 5 billion estimated for 2015

• 1.8 billion mobile phones today

– 500 million expected to have mobile broadband in 2010

• Internet traffic has increased 20 times in the last 5 years

• Today there are more than 185 million Web servers

– 50% Apache, 34% Windows

• The Web is in practice unbounded

– Dynamic pages are unbounded

– Static pages are over 12 billion?

Trends

• Web 2.0, social networks

– Fragmentation of content ownership

– Fragmentation of the access (age, topic, etc.)

– Fragmentation of the right to access

• Increase of the Semantic Web

– RDF, microformats, metadata in general

• Increase of Internet advertising associated with search/content

Page 4: The Power of Social Media

Advertising

[Chart: year labels 2007, 2011, 2012; region label USA]

Advertising and the Web 2.0

• The power of word of mouth

• The power of influential bloggers

• Viral marketing

– Positive (Dove)

– Negative (HSBC)

• Presence in virtual(?) worlds (Second Life)

Page 5: The Power of Social Media

Yahoo! Scale (2007)

24 languages, 20 countries

• > 4 billion page views per day (largest in the world)

• > 500 million unique users each month (half the Internet users!)

• > 250 million mail users (1 million new accounts a day)

• 95 million group members

• 7 million moderators

• 4 billion music videos streamed in 2005

• 20 PB of storage (20M GB)

– US Library of Congress every day (28M books, 20 TB)

• 15 TB of data processed per day

• 7 billion song ratings

• 2 billion photos stored

• 2 billion Mail + Messenger messages sent per day

Social Media

Page 6: The Power of Social Media

New Trends

The Web: A Play in Three Acts

“The” Web: Public

“My” Web: Personal

“Our” Web: Social

Page 7: The Power of Social Media

Web 2.0: Ingredients

Reviews

RSS

Photos

Video

Blogs

Bookmarks

Playlists

Audio

Podcasts

IM

Tags

VoIP

APIs

Groups

Some Social Networks

• Blogs

– Directed collaborative topical discussions

• Instant messenger

– Buddy list

• Yahoo! Groups

– Topically focused communities

• MySpace, Facebook, Friendster, Orkut

– Friendship network

• Del.icio.us

– Collaborative bookmarking

• Flickr, YouTube

– Photo/video sharing and tagging

• Yahoo! Answers

– People answering people

Page 8: The Power of Social Media

Web 2.0 in Yahoo!

• Yahoo! Groups 8 million, 1 of each 10 members

• Del.icio.us 2 million users

• Flickr 1 million pictures per day

• Yahoo! Respuestas 100M users, 150M answers

• Messenger 85M unique users

Social sites had 115M unique visitors, 56M of them “under 35”.

(2007 data)

Why do people come online?

• To communicate

• To be informed

• To be entertained

• Increasingly… to be part of new forms of participation, belonging and sharing

• To be part of social media

– also referred to as Social Networks

Page 9: The Power of Social Media

“One-way” Content

– Film Clips

– Competition

– Critics

– Picture Gallery

Community Content

– User’s photos

– User’s reviews

– User knowledge

Social Networks

Mainly young people (13-25)

Mobile use

Page 10: The Power of Social Media

Who are they?

[Table columns: Age, %, Representative interests]

What makes Flickr special?

1. User Generated Content

Content not licensed from providers such as Corbis or Getty, but rather contributed by users.

2. User Organized Content

Content is tagged, described, organized, discovered, etc. not by “editors” but by the users themselves.

3. User Distributed Content

Flickr achieved distribution across the Internet, not through “business deals” per se, but rather through the Flickr community, which distributed Flickr content on 3rd-party blogs.

4. User Developed Functionality

Flickr exposed APIs (PHP, Perl, etc.) that allowed the community of developers to build against the Flickr platform.

Entire ecosystem created by fewer than ten employees… aided by millions in the Flickr community.

Page 11: The Power of Social Media

Visualizing Tags: Tag Cloud from Flickr

A Digression: Computer Vision is hard

Page 12: The Power of Social Media

Page 13: The Power of Social Media
Page 14: The Power of Social Media

Internet UGC (User Generated Content)

[Chart: “Have you experienced UGC?” (Yes / No; as a publisher / as a consumer) and “Types of Content” (multiple choice): photos and images, text, videos, music, animation and Flash, others]

Source: National Internet Development Agency Report in June, 2006 (South Korea)

Page 15: The Power of Social Media

Using a system of user-assigned ratings, LAUNCHcast builds up a profile of preferences for each individual.

The more ratings users make, the more intelligent the radio becomes.

We have over 6 billion ratings

LAUNCHcast = music that listens to you

Users can then share their custom radio station with friends through Yahoo! Messenger, taking all the hassle out of discovering new music

Simple acts create value and opportunity

Community Dynamics

1 creator

10 synthesizers

100 consumers

Next generation products will blur distinctions between Creators, Synthesizers, and Consumers

Example: LAUNCHcast

Every act of consumption is an implicit act of production that requires no incremental effort…

Listening itself implicitly creates a radio station…
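A minimal sketch of the "consumption as implicit production" idea, assuming a listening log of (user, track) pairs. The function name `implicit_station`, the log format and the ranking by play count are all hypothetical illustrations, not LAUNCHcast's actual method.

```python
from collections import Counter

def implicit_station(listening_log, user, size=3):
    """Derive a 'station' for one user purely from what they listened to.

    listening_log: list of (user, track) pairs (hypothetical format).
    Returns the user's most-played tracks, most played first.
    """
    plays = Counter(track for listener, track in listening_log if listener == user)
    return [track for track, _count in plays.most_common(size)]

# Hypothetical log: no explicit rating or playlist editing is required.
log = [("ana", "t1"), ("ana", "t2"), ("ana", "t1"), ("bo", "t3")]
station = implicit_station(log, "ana", size=2)
```

The user never performs a separate "create station" step; the station is a by-product of listening.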

Page 16: The Power of Social Media

Social Process

• Millions of users of Flickr share and tag each other’s photographs (why???)

• Fernando Flores: Blogs

– Look into the future

– Warning

– Commotion

– Institution

• Individual or collaborative

– Community newspaper: www.elmorrocotudo.cl

• Power law distribution

Social Search

Page 17: The Power of Social Media

The Knowledge Challenge

Challenge: Enabling users to share knowledge with their community to create a better search experience

Example

Query                                       Number of Results
Vacation Chile                              26,800,000
“Everything Ricardo knows about Chile”      0

Subjective Queries

The kinds of queries that rely on domain expertise…

• “Do you know a reputable plumber in Southampton?”

• “Where is the cool nightlife in Trento?”

• “What political blogs do you think I’d enjoy reading?”

• “Where can I buy a cool pair of shoes?”

These kinds of queries are ill-served by today’s search engines, but are ironically the most valuable (i.e., transactional queries).

How do we capture the people’s experience?

Page 18: The Power of Social Media

Social Powered Search: Yahoo! Answers

• Democratize process of “voting” (whether explicit or implicit)

• Move out of the purview of webmasters and hand control back to users

• Allow dynamic assignment to various authorities of trust, new degree of freedom

“Better Search Through People”

Challenges in Social Search

• How do we use UGC for better search?

• What’s the ratings and reputation system?

• How do you cope with (social) spam?

• What are the incentive mechanisms?

• The bigger challenge: Where else can you leverage the power of the people?

Page 19: The Power of Social Media

Agenda

• European search vision

• Knowledge - the next challenge

• People power

• Making knowledge pay

Leader board

Poorly formed questions

Page 20: The Power of Social Media

P. Jurczyk, E. Agichtein: “Discovering authorities in Q.A. communities by using link analysis”, CIKM'07

Askers

Answerers
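A minimal sketch of link analysis on an asker/answerer graph, assuming one directed edge from each asker to each user who answered them and using the HITS hub/authority iteration as the scoring method. The slide only says "link analysis"; the edge direction, the choice of HITS, the function name `hits` and the sample users are assumptions made for illustration.

```python
def hits(edges, iterations=50):
    """Return (hub, authority) scores for (asker, answerer) edges.

    A good authority answers questions from good hubs (askers);
    a good hub asks questions that good authorities answer.
    """
    edges = list(edges)
    nodes = {node for edge in edges for node in edge}
    hub = dict.fromkeys(nodes, 1.0)
    auth = dict.fromkeys(nodes, 1.0)
    for _ in range(iterations):
        # Authority: sum of hub scores of the askers this user answered.
        auth = dict.fromkeys(nodes, 0.0)
        for asker, answerer in edges:
            auth[answerer] += hub[asker]
        # Hub: sum of authority scores of the users who answered this asker.
        hub = dict.fromkeys(nodes, 0.0)
        for asker, answerer in edges:
            hub[asker] += auth[answerer]
        # Normalize (Euclidean norm) so the scores stay bounded.
        auth_norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        hub_norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        auth = {n: v / auth_norm for n, v in auth.items()}
        hub = {n: v / hub_norm for n, v in hub.items()}
    return hub, auth

# Hypothetical data: bob answered two askers, dave answered one.
sample_edges = [("ann", "bob"), ("carl", "bob"), ("ann", "dave")]
hub_scores, auth_scores = hits(sample_edges)
top_answerer = max(auth_scores, key=auth_scores.get)
```

Users who only ask end up with an authority score of zero, which matches the asker/answerer split in the figure.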

No definitive answer

Unverifiable answer

Community consensus

Page 21: The Power of Social Media

What are the Problems?

• Which questions are legitimate?

• What is the incentive system?

• How do we validate answers?

• What is the role of the community?

• What is the reputation system?

What are the challenges?

• Community of users

– Social system

• Incentives and reputations

– Economic system

• Poorly phrased, “grammatically” limited queries

– Language analysis

• Improving user experience from past data

– Data mining

Page 22: The Power of Social Media

What are the sciences?

• Information retrieval & language processing

• Microeconomics

• Data Mining

• Sociology and human-computer interaction

• Community networks

Duncan Watts

Six Degrees of Separation

The Wisdom of the Crowds

Page 23: The Power of Social Media

• The Wisdom of Crowds (James Surowiecki, 2004)

– “Under the right circumstances, groups are remarkably intelligent”

• Importance of diversity, independence and decentralization

– “large groups of people are smarter than an elite few, no matter how brilliant—they are better at solving problems, fostering innovation, coming to wise decisions, even predicting the future”.

• How to deploy this in the next generation of social search and media services?

– SEMEDIA video retrieval EU Project (with BBC, Glasgow U., Smoke & Mirrors, Joanneum & UPF)

The Rationale behind Web Mining

Page 24: The Power of Social Media

Anchor Text

• The wisdom of the crowds can be used to search

• The principle is not new: anchor text is used in “standard” search. When indexing a document D, include anchor text from links pointing to D

[Figure: pages whose links point to www.ibm.com]

– “Armonk, NY-based computer giant IBM announced today”

– “Joe’s computer hardware links: Compaq HP IBM”

– “Big Blue today announced record profits for the quarter”
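A minimal sketch of anchor-text indexing as described on this slide: when indexing a document D, the index also receives the anchor text of links pointing to D. The function name `build_index`, the data layout, the source URLs and the page body are hypothetical; only www.ibm.com and the anchor words "IBM" and "Big Blue" come from the slide.

```python
from collections import defaultdict

def build_index(pages, links):
    """Build an inverted index: term -> set of URLs.

    pages: {url: body_text}
    links: list of (source_url, target_url, anchor_text)
    Anchor text is credited to the link TARGET, not to the page it appears on.
    """
    index = defaultdict(set)
    for url, text in pages.items():
        for term in text.lower().split():
            index[term].add(url)
    for _source, target, anchor in links:
        for term in anchor.lower().split():
            index[term].add(target)
    return index

# Hypothetical body text that never mentions "ibm" or "big blue".
pages = {"www.ibm.com": "international business machines"}
links = [
    ("news.example/a", "www.ibm.com", "IBM"),
    ("joe.example/links", "www.ibm.com", "IBM"),
    ("news.example/b", "www.ibm.com", "Big Blue"),
]
index = build_index(pages, links)
hits_for_blue = index.get("blue", set())
```

The page becomes findable under words that other people chose to describe it, which is why anchor text counts as an early use of crowd wisdom in search.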

Chris Anderson: “The Long Tail”. Hyperion, 2006.

Quality and Frequency

[Chart labels: Frequency, Quality, Traditional publishing, User-generated]

Page 25: The Power of Social Media

Chris Anderson: “The Long Tail”. Hyperion, 2006.

Quality and Quantity

[Chart labels: Quantity, Quality, User-generated, Traditional publishing]

Chris Martin from Coldplay in The Rolling Stone, Fortieth Anniversary, July 2007.

Quantity

Quality

“We think it's all about quality over quantity now, because there's so much noise everywhere, there's no point in putting anything out unless it's fucking amazing.”

Page 26: The Power of Social Media

The Push for Quality

[Chart labels: Quantity, Quality, User-generated, Traditional publishing, “?”]

Page 27: The Power of Social Media

¼ of questions want an opinion: informal polls

¾ of questions seek information or advice

Page 28: The Power of Social Media

Q. Su, D. Pavlov, J.-H. Chow, W. C. Baker. “Internet-scale collection of human-reviewed data”. WWW'07.

17%-45% of answers were correct

65%-90% of questions had at least one correct answer

Page 29: The Power of Social Media

There are top contributors ...

... but they don't have all the answers

Answer quality by question quality (each column sums to 100%):

                  Question quality
Answer quality    High    Medium    Low
High              41%     15%       8%
Medium            53%     76%       74%
Low               6%      9%        18%
Total             100%    100%      100%

Question quality and answer quality are not independent and can be predicted reasonably well (Castillo et al, 2008)
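A small worked check of the "not independent" claim, assuming the table's columns are question quality and its rows are answer quality (the reading under which each column sums to 100%). If the two were independent, the answer-quality distribution would be the same in every column. The function name `spread_across_questions` is hypothetical, and this is not a significance test, since the slide gives no counts.

```python
# P(answer quality | question quality), copied from the table above.
answer_given_question = {
    "High":   {"High": 0.41, "Medium": 0.53, "Low": 0.06},
    "Medium": {"High": 0.15, "Medium": 0.76, "Low": 0.09},
    "Low":    {"High": 0.08, "Medium": 0.74, "Low": 0.18},
}

def spread_across_questions(table):
    """For each answer-quality level, return max minus min of its share
    across the question-quality columns. Independence would give 0."""
    spread = {}
    for answer_level in ("High", "Medium", "Low"):
        shares = [column[answer_level] for column in table.values()]
        spread[answer_level] = round(max(shares) - min(shares), 2)
    return spread

spread = spread_across_questions(answer_given_question)
```

The share of high-quality answers falls from 41% for high-quality questions to 8% for low-quality ones, a gap of 33 percentage points.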

What about real quality?

Page 30: The Power of Social Media

Influence Leadership (Bopal et al, 2008)

• Influence of social graph in particular actions

– Social graph: Yahoo! Instant Messenger

– Actions log: Yahoo! Movies

• Action = user u rated movie m at time t

– joined through common user identifiers

• Started from Yahoo! Instant Messenger subgraph of “most active” users (110M nodes) and 21M ratings from Yahoo! Movies.

– Ended with 217.5K nodes, 221.4K edges and 1.8M ratings.
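A minimal sketch of joining a social graph with an action log on shared user identifiers, assuming undirected friendship edges and actions of the form (user, movie, time). The propagation rule used here (a friend rates the same movie later, within a time window) is an assumption for illustration and is not stated on the slide; the function name `count_followed_actions` and the sample data are hypothetical.

```python
from collections import defaultdict

def count_followed_actions(friend_edges, ratings, window):
    """Count, per user, how often a friend rated the same movie afterwards.

    friend_edges: iterable of (u, v) undirected friendships.
    ratings: iterable of (user, movie, time) actions.
    window: maximum allowed time gap between the two ratings.
    """
    friends = defaultdict(set)
    for u, v in friend_edges:
        friends[u].add(v)
        friends[v].add(u)

    # Group actions by movie so only same-movie pairs are compared.
    by_movie = defaultdict(list)
    for user, movie, time in ratings:
        by_movie[movie].append((time, user))

    followed = defaultdict(int)
    for events in by_movie.values():
        events.sort()
        for i, (t_first, first) in enumerate(events):
            for t_later, later in events[i + 1:]:
                if t_later - t_first > window:
                    break  # events are sorted by time, so stop early
                if t_later > t_first and later in friends[first]:
                    followed[first] += 1
    return dict(followed)

# Hypothetical data: b follows a quickly, c too late, d is not a friend.
edges = [("a", "b"), ("a", "c")]
actions = [("a", "m1", 1), ("b", "m1", 2), ("c", "m1", 5), ("d", "m1", 2)]
influence = count_followed_actions(edges, actions, window=3)
```

Users with high counts are candidate leaders; any real analysis would also need to control for movies that are simply popular at the time.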

Leaders vs. Tribe leaders

Page 31: The Power of Social Media

The Wisdom of Crowds

• Crucial for Search Ranking

• Text content: Web Writers

– not only for the Web!

• Links: Web Publishers

• Annotations: Web 2.0 Users

– Tags, bookmarks, comments, ratings, etc.

• Queries: All Web Users!

– Queries and actions

Query Intention (Broder, 2000)

• ~40% Navigational

• ~35% Transactional

• ~25% Informational

Page 32: The Power of Social Media

Mining Queries for ...

• Improved Web Search

• Ranking

• Query recommendations

• User Driven Design

– Information Scent

– The Web Site that the Users Want

– The Web Site that You should Have

– Improve content & structure

• Bootstrap of pseudo-semantic resources

Query Mining: Relating Similar Queries

Page 33: The Power of Social Media

Implicit Folksonomy

Implicit Knowledge (Baeza-Yates et al, 2007)

Page 34: The Power of Social Media

Experimental Evaluation

Some Open Issues

• Implicit social network

– Any fundamental similarities?

• How to evaluate with partial knowledge?

– Data volume amplifies the problem

• User aggregation vs. personalization

– Optimize common tasks: help more people

– Move away from privacy issues

Page 35: The Power of Social Media

Epilogue

The Future

• The Web is scientifically young

• It is intellectually diverse

– The human element

– The social element

• The technology mirrors the economic, legal and sociological reality

Page 36: The Power of Social Media

Mirror of the Society

Exports/Imports vs. Domain Links

Baeza-Yates & Castillo, WWW2006

Web Spam Challenge:

• UK Web Collection

• Training set with thousands of judged sites

Page 37: The Power of Social Media

What’s next? Fourth generation: From Information Retrieval to Information Supply

Explicit demand for information driven by a user query

Increased use of context

Active information supply driven by user activity and context

Web 3.0?

• We are at Web 2.0 beta

• People want to get tasks done

– Where do I go for an original holiday with 1,000 euros?

• Take into account the context of the task

I want to book a vacation in Tuscany.

Start

Finish

Yahoo! Experience