Implicit Structure and Dynamics of Blogspace



Page 1: Implicit Structure and Dynamics of Blogspace

© 2004 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice

Implicit Structure and Dynamics of Blogspace

Lada Adamic

Accelerating Change 2004

(joint work with: Eytan Adar, Li Zhang, and Rajan Lukose)

Page 2: Implicit Structure and Dynamics of Blogspace

Blogs and the digital experience

• Use:
  − Record real-world and virtual experiences
  − Easy to record and discuss things “seen” on the net
• Structure: blog-to-blog linking
• Use + Structure
  − Great to track “memes”: ideas spreading in the blogosphere like an epidemic

Page 3: Implicit Structure and Dynamics of Blogspace

Our interest

• Macroscopic patterns of blog epidemics
  − How does the popularity of a topic evolve over time?
• Microscopic patterns of blog epidemics
  − Implicit & explicit
  − Who is getting information from whom?
• Ranking algorithms that take advantage of infection patterns

Page 4: Implicit Structure and Dynamics of Blogspace

Tracking Blogs

• Blogdex: earliest example
  − Lets you see which blogs linked to a site (and when)
  − Others emerged with similar/related functionality
• Can find epidemic profiles (popularity over time)
• Our question: do different types of information have different epidemic profiles?
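An epidemic profile of the kind mentioned on this slide can be sketched as the fraction of all citations to a URL that arrive on each day after the first mention. This is only an illustration, not the authors' code: the function name `epidemic_profile`, the 20-day window, and the sample dates are assumptions made for the example.

```python
from collections import Counter
from datetime import date


def epidemic_profile(citation_dates, window_days=20):
    """Return the fraction of citations falling on each day since the first one.

    citation_dates: list of datetime.date, one per blog post citing the URL.
    window_days: assumed profile length; citations after the window are ignored.
    """
    if not citation_dates:
        return [0.0] * window_days
    first = min(citation_dates)
    # Count citations by day offset from the first mention.
    counts = Counter((d - first).days for d in citation_dates)
    # Day 0 always has at least one citation, so this total is never zero.
    in_window = sum(c for day, c in counts.items() if day < window_days)
    return [counts.get(day, 0) / in_window for day in range(window_days)]


# Hypothetical citation dates for one URL (not real Blogdex data).
dates = [date(2004, 5, 1)] * 6 + [date(2004, 5, 2)] * 3 + [date(2004, 5, 4)]
profile = epidemic_profile(dates)
print([round(p, 2) for p in profile[:5]])
```

Normalizing by the total makes profiles of very popular and less popular URLs comparable in shape, which is what the question about different types of information needs.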

Page 5: Implicit Structure and Dynamics of Blogspace

For Example…

[Figure: popularity over time, with curves labeled “Slashdot Effect” and “BoingBoing Effect”]

Page 6: Implicit Structure and Dynamics of Blogspace

Clusters reflect different epidemic profiles

[Figure: two panels plotting % of hits (0 to 1) against day (ticks at 5, 10, 15, 20)]

Slashdot: huge surge followed by sharp drop (slashdot-effect)

Major News, front page: more delayed death (broader interest)

Page 7: Implicit Structure and Dynamics of Blogspace

Clusters

[Figure: two panels plotting % of hits (0 to 1) against day (ticks at 5, 10, 15, 20)]

Products, etc.

Sustained over a period of time

Major-news site (editorial content) – back of the paper
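The slides do not say which clustering method produced these groups. As a stand-in, the sketch below runs plain k-means on normalized daily profiles to show how surge-and-drop curves could separate from sustained ones. The profiles are invented for illustration, and seeding the centroids with the first k profiles is an arbitrary choice.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles, k, iterations=50):
    """Plain k-means; the first k profiles seed the centroids (an assumption)."""
    centroids = [list(p) for p in profiles[:k]]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assign each profile to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in profiles
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical 5-day profiles (fraction of hits per day), not measured data.
profiles = [
    [0.80, 0.15, 0.05, 0.00, 0.00],  # sharp surge, then drop
    [0.20, 0.20, 0.20, 0.20, 0.20],  # sustained interest
    [0.75, 0.20, 0.05, 0.00, 0.00],
    [0.25, 0.20, 0.20, 0.20, 0.15],
]
labels, centroids = kmeans(profiles, k=2)
print(labels)
```

Each resulting centroid is itself an average profile, so it can be plotted the same way as the panels on these slides.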

Page 8: Implicit Structure and Dynamics of Blogspace


Microscale Example: Giant Microbes

Page 9: Implicit Structure and Dynamics of Blogspace

Microscale Dynamics

• What do we need to track specific epidemics?
  − Timings
  − Graphs

[Diagram: blogs b1, b2, b3 placed along a “Time of infection” axis running from t0 to t1]

Page 10: Implicit Structure and Dynamics of Blogspace

Microscale Dynamics

• Challenges
  − Root may be unknown
  − Multiple possible paths
  − Uncrawled space, alternate media (email, voice)
  − No links

[Diagram: blogs b1, b2, b3, “?”, and bn placed along a “Time of infection” axis running from t0 to t1]

Page 11: Implicit Structure and Dynamics of Blogspace

Microscale Dynamics: who is getting info from whom

• Explicit blog-to-blog links (easy)
  − Via links are even better
• Implicit/inferred transfer (harder)
  − Use ML algorithm for link inference problem
    • Support Vector Machine (SVM)
    • Logistic Regression
  − What we can use
    • Full text
    • Blogs in common
    • Links in common
    • History of infection
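To make the link inference idea concrete, here is a minimal logistic regression trained by gradient descent on pairs of blogs. It is a sketch under stated assumptions rather than the classifier used in the work: the four features loosely mirror the list above, and every number, including the learning rate and epoch count, is invented for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(examples, labels, lr=0.5, epochs=2000):
    """Fit weights and a bias with batch gradient descent on the log loss."""
    n_features = len(examples[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(examples, labels):
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / len(examples) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(examples)
    return weights, bias


def link_probability(x, weights, bias):
    """Estimated probability that information passed between the pair."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Hypothetical feature vectors for blog pairs (a, b), each scaled to [0, 1]:
# [text similarity, blogs linked in common, URLs linked in common,
#  fraction of past infections in which a followed b]
pairs = [
    [0.9, 0.8, 0.7, 0.9],
    [0.8, 0.6, 0.9, 0.7],
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.1, 0.3, 0.1],
]
has_link = [1, 1, 0, 0]  # 1 = information is known to have passed from b to a

weights, bias = train_logistic(pairs, has_link)
likely = link_probability([0.85, 0.7, 0.8, 0.8], weights, bias)
unlikely = link_probability([0.15, 0.1, 0.2, 0.05], weights, bias)
print(likely, unlikely)
```

Training labels would come from pairs where an explicit link is known; the fitted model then scores pairs where no link was observed.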

Page 12: Implicit Structure and Dynamics of Blogspace

Visualization

• Zoomgraph tool
  − Using GraphViz (by AT&T) layouts
• Simple algorithm
  − If a single explicit link exists, draw it
  − Otherwise use ML algorithm
    • Pick the most likely explicit link
    • Pick the most likely possible link
• Tool lets you zoom around space, control threshold, link types, etc.
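One reading of the simple algorithm above is sketched below: for each infected blog, draw the lone explicit link if there is exactly one, pick the best-scoring explicit link if there are several, and otherwise fall back to the best-scoring earlier-infected blog. The function name `choose_parent`, the data layout, and the scores are assumptions for illustration; `score` stands in for whatever classifier supplies link likelihoods.

```python
def choose_parent(blog, infected_at, explicit_links, score):
    """Pick the single most plausible source for one infected blog.

    infected_at: dict blog -> time it first mentioned the URL.
    explicit_links: dict blog -> set of blogs it links to.
    score: callable (blog, candidate) -> likelihood from some classifier.
    Returns (source, link_type), where source is None for a root.
    """
    # Only blogs infected strictly earlier can be the source.
    earlier = [b for b in infected_at if infected_at[b] < infected_at[blog]]
    explicit = [b for b in earlier if b in explicit_links.get(blog, set())]
    if len(explicit) == 1:
        return explicit[0], "explicit"
    if explicit:
        # Several explicit links: keep the most likely one.
        return max(explicit, key=lambda b: score(blog, b)), "explicit"
    if earlier:
        # No explicit link: infer the most likely possible link.
        return max(earlier, key=lambda b: score(blog, b)), "inferred"
    return None, "root"


# Hypothetical infection times, links, and classifier scores.
infected_at = {"b1": 0, "b2": 1, "b3": 2, "b4": 3}
explicit_links = {"b2": {"b1"}, "b3": {"b1", "b2"}}
scores = {
    ("b3", "b1"): 0.2, ("b3", "b2"): 0.7,
    ("b4", "b1"): 0.1, ("b4", "b2"): 0.3, ("b4", "b3"): 0.6,
}


def score(blog, candidate):
    return scores.get((blog, candidate), 0.0)


tree = {b: choose_parent(b, infected_at, explicit_links, score) for b in infected_at}
print(tree)
```

Keeping one incoming edge per blog yields a tree (or forest), which is easy to hand to a graph layout tool.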

Page 13: Implicit Structure and Dynamics of Blogspace

Giant Microbes epidemic visualization

[Figure legend: via link, explicit link, inferred link, blog]

Page 14: Implicit Structure and Dynamics of Blogspace

iRank

• “Practical” uses of inferred epidemic information
  − Can use a simpler inference (timing)
• Finding good sources
  − Invisible authorities

[Diagram: blogs b1, b2, b3, b4, b5, …, bn, with labels “True source” and “Popular site”]

Page 15: Implicit Structure and Dynamics of Blogspace

iRank Algorithm

• Draw a weighted edge for all pairs of blogs that cite the same URL
• Higher weight for mentions closer together
• Run PageRank
• Control for ‘spam’

[Diagram: “Time of infection” axis running from t0 to t1]
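The steps above can be sketched as follows. The slide does not give the weight function or the edge direction, so this example assumes a weight of 1 / (time gap in days) and points each edge from the later blog to the earlier one, so that rank flows toward likely sources. Same-day mentions are skipped, and the spam control step is left out. All names and data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_implicit_graph(mentions):
    """mentions: dict url -> list of (blog, day).

    Returns edges[later][earlier] = accumulated weight.
    """
    edges = defaultdict(lambda: defaultdict(float))
    for posts in mentions.values():
        for (a, ta), (b, tb) in combinations(posts, 2):
            if a == b or ta == tb:
                continue  # assumption: ties carry no direction, so skip them
            later, earlier = (a, b) if ta > tb else (b, a)
            # Assumed weighting: mentions closer in time count more.
            edges[later][earlier] += 1.0 / abs(ta - tb)
    return edges


def pagerank(edges, damping=0.85, iterations=100):
    """Weighted PageRank by power iteration; dangling rank is spread evenly."""
    nodes = set(edges)
    for targets in edges.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            out = edges.get(node, {})
            total = sum(out.values())
            if total == 0:
                for other in nodes:
                    new_rank[other] += damping * rank[node] / n
            else:
                for target, weight in out.items():
                    new_rank[target] += damping * rank[node] * weight / total
        rank = new_rank
    return rank


# Hypothetical mentions: (blog, day of first citation) per URL.
mentions = {
    "url1": [("b1", 0), ("b2", 2), ("b3", 3), ("b4", 3)],
    "url2": [("b1", 0), ("b2", 1), ("b5", 2)],
}
rank = pagerank(build_implicit_graph(mentions))
for blog in sorted(rank, key=rank.get, reverse=True):
    print(blog, round(rank[blog], 3))
```

In this toy data b1 always mentions URLs first, so it ends up ranked highest even though no blog links to it explicitly, which is the "invisible authority" case.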

Page 16: Implicit Structure and Dynamics of Blogspace


Do Bloggers Kill Kittens?

Friday morning Wired writes:

"Warning: Blogs Can Be Infectious."

Shortly thereafter Slashdot posts:

"Bloggers' Plagiarism Scientifically Proven"

Which is picked up by Metafilter as "A good amount of bloggers are outright thieves."

Page 17: Implicit Structure and Dynamics of Blogspace


Research at the Information Dynamics Lab at HP:

http://www.hpl.hp.com/research/idl

[email protected]