Exploring Blog Networks
Patterns and a Model for Information Propagation
Mary McGlohon
In collaboration with Jure Leskovec, Christos Faloutsos
Natalie Glance, Matthew Hurst
Sandia National Labs, July 6, 2007
(As seen at SIAM Data Mining 2007)
Long-term Goals
● How does information on the Web propagate?
● With what pattern do ideas catch on, diffuse, and decrease in popularity?
● Can we build a model for this propagation?
Why blogs?
● Blogs are a widely used medium of information for many topics and have become an important mode of communication.
● Blogs cite one another, creating a record of how information and ideas spread through a social network.
● This record is publicly available.
Why do we care?
● Understanding how the blog network works is important for:
– Social issues: Political mapping, social trends and change, reactions to mass media.
– Economic issues: Marketing, predicting commercial success, discovering links between companies.
Example: blogs in the 2004 election. [Adamic, Glance 2005]
Immediate Goals
● Temporal questions: Does popularity have a half-life? Is there periodicity?
● Topological questions: What topological patterns do posts and blogs follow? What shapes do cascades take on? Stars? Chains? Something else?
● Generative model: Can we build a generative model that mimics properties of cascades?
Outline
● Motivation
● Preliminaries
– Concepts and terminology
– Data
● Temporal Observations
● Topological Observations
● Cascade Generation Model
● Discussion & Conclusions
What is a blog?
● A blog is a frequently-updated webpage.
● A blog’s author updates the blog using posts.
● Each post has a permanent hyperlink, and may contain links to other blog posts.
slashdot boingboing
The iPhone is here, hooray!
At this link, Slashdot says the iPhone has arrived. But I’m not buying one, because …
Here Boingboing says they’re not buying an iPhone. They’re just jealous.
From blogs to networks
[Figure: the blogosphere network (slashdot, boingboing, Dlisted, MichelleMalkin), the blog network derived from it (nodes B1, B2, B3, B4 with weighted edges), and the post network (nodes a, b, c, d, e).]
● Non-trivial vs. trivial cascades
● Stars vs. chains
● Nodes a, b, c, d are cascade initiators; e is a connector.
From networks to cascades
[Figure: the blogosphere network (slashdot, boingboing, Dlisted, MichelleMalkin) and the cascades extracted from it.]
● Non-trivial vs. trivial cascades
● Cascade initiators are first sources of information
● We also have stars and chains
Dataset (Nielsen Buzzmetrics)
● Gathered from August-September 2005*
● Used set of 44,362 blogs, traced cascades
● 2.4 million posts, ~5 million out-links, 245,404 blog-to-blog links
[Figure: number of posts over time. x-axis: Time [1 day]; y-axis: Number of posts.]
Outline
● Motivation
● Preliminaries
– Concepts and terminology
– Data
● Temporal Observations
– Does blog traffic behave periodically?
– How does popularity change over time?
● Topological Observations
● Cascade Generation Model
● Discussion & Conclusions
● Future Work
Temporal Observations
Does blog traffic behave periodically?
• Posts have a “weekend effect”: less traffic on Saturday/Sunday.
Temporal Observations
Does blog traffic behave periodically?
• Monday appears to compensate for this behavior, but this is not actually the case.
• We normalize the data: $\mathrm{count}_{\mathrm{norm}} = \mathrm{count} / p_d$, where $p_d$ is the percentage of links on that day.
[Figure, two panels: "Monday post dropoff" and "Same data, normalized". x-axis: days after post; y-axis: number of in-links (log).]
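The normalization above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function name `normalize_by_weekday`, the `(weekday, count)` input layout, and the toy numbers are all assumptions, and $p_d$ is expressed here as a fraction rather than a percentage.

```python
def normalize_by_weekday(counts, link_share):
    """Divide each raw count by p_d, the share of all links made on that weekday.

    counts: list of (weekday, count) pairs, weekday in 0..6 (hypothetical layout).
    link_share: dict mapping weekday -> fraction of all links made on that weekday.
    """
    normalized = []
    for weekday, count in counts:
        p_d = link_share[weekday]
        if p_d <= 0:
            raise ValueError(f"no links recorded for weekday {weekday}")
        normalized.append(count / p_d)
    return normalized


# Toy numbers (not from the dataset): weekends carry a smaller share of links.
link_share = {0: 0.16, 1: 0.16, 2: 0.16, 3: 0.16, 4: 0.16, 5: 0.10, 6: 0.10}
raw = [(4, 32), (5, 20), (6, 20), (0, 32)]
normalized = normalize_by_weekday(raw, link_share)
print(normalized)
```

In this toy input the raw weekend counts are lower than the weekday counts, but after dividing by each day's share the four values come out about equal, which is the effect the normalization is meant to expose.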
Temporal Observations
How does post popularity change over time?
Post popularity dropoff follows a power law identical to that found in communication response times in [Vazquez2006].
Observation 1: The probability that a post written at time $t_p$ acquires a link at time $t_p + \Delta$ is:
$p(t_p + \Delta) \propto \Delta^{-1.5}$
[Figure: number of in-links vs. days after post.]
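One simple way to check a dropoff exponent like this is to fit a straight line to the data on log-log axes. The sketch below does that with ordinary least squares. The slides do not say how the exponent was estimated, so this is a stand-in technique; the function name `fit_power_law_exponent` and the synthetic data are assumptions.

```python
import math


def fit_power_law_exponent(xs, ys):
    """Least-squares slope of log(y) against log(x); for y ~ x**a this estimates a."""
    pts = [(math.log(x), math.log(y)) for x, y in zip(xs, ys) if x > 0 and y > 0]
    if len(pts) < 2:
        raise ValueError("need at least two points with positive x and y")
    n = len(pts)
    mean_x = sum(lx for lx, _ in pts) / n
    mean_y = sum(ly for _, ly in pts) / n
    sxx = sum((lx - mean_x) ** 2 for lx, _ in pts)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in pts)
    return sxy / sxx


# Synthetic in-link counts that follow delta ** -1.5 exactly (not real data).
days = list(range(1, 31))
links = [1000.0 * d ** -1.5 for d in days]
exponent = fit_power_law_exponent(days, links)
print(round(exponent, 3))
```

On real, noisy counts a log-log regression is known to be a rough estimator, so treat the result as a sanity check rather than a careful fit.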
Outline
● Motivation
● Preliminaries
● Temporal Observations
– Does blog traffic behave periodically?
– How does post popularity change over time?
● Topological Observations
– What are graph properties for blog networks?
– What shapes do cascades take on? Stars, chains, or something else?
● Cascade Generation Model
● Discussion & Conclusions
● Future Work
Topological Observations
What graph properties does the blog network exhibit? How connected?
● 44,356 nodes, 122,153 edges
● Half of blogs belong to largest connected component.
[Figure: blog network with nodes B1, B2, B3, B4 and weighted edges.]
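A connectivity figure like "half of blogs belong to the largest connected component" can be computed with a breadth-first search over the blog network. The sketch below assumes edge direction is ignored (weak connectivity), which the slide does not specify; the function name and the toy network are hypothetical, and every edge endpoint is assumed to appear in the node list.

```python
from collections import defaultdict, deque


def largest_component_fraction(nodes, edges):
    """Fraction of nodes in the largest connected component, ignoring edge direction."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    seen = set()
    largest = 0
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search over one component, counting its nodes.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        largest = max(largest, size)
    return largest / len(nodes) if nodes else 0.0


# Toy four-blog network: B3 has no links to or from the other blogs.
blogs = ["B1", "B2", "B3", "B4"]
links = [("B1", "B2"), ("B2", "B1"), ("B4", "B2"), ("B4", "B1")]
fraction = largest_component_fraction(blogs, links)
print(fraction)
```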
Topological Observations
What power laws does the blog network exhibit?
Both in- and out-degree follow a power law distribution: in-link PL exponent -1.7, out-degree PL exponent near -3.
This suggests strong rich-get-richer phenomena.
[Figure, two panels: count (log scale) vs. number of blog in-links (log scale); count (log scale) vs. number of blog out-links (log scale).]
Topological Observations
How are blog in- and out-degree related?
In-links and out-links are not correlated (correlation coefficient 0.16).
[Figure: number of blog out-links (log scale) vs. number of blog in-links (log scale).]
Topological Observations
What graph properties does the post network exhibit?
Very sparsely connected: 98% of posts are isolated.
[Figure: post network with nodes a, b, c, d, e.]
Topological Observations
What power laws does the post network exhibit?
• Both in- and out-degree follow power laws.
• In-degree has PL exponent -2.15, out-degree has PL exponent -2.95.
[Figure, two panels: count vs. post in-degree; count vs. post out-degree.]
Topological Observations
How do we measure how information flows through the network?
We gather cascades using the following procedure:
– Find all initiators (out-degree 0).
– Follow in-links.
– Produces directed acyclic graph.
[Figure: post network (nodes a, b, c, d, e) and the cascades extracted from it.]
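The gathering procedure above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: posts are plain identifiers, each link `(src, dst)` means post `src` links to post `dst`, and the function name `extract_cascades` and the toy posts `p1` to `p5` are hypothetical.

```python
from collections import defaultdict, deque


def extract_cascades(posts, links):
    """Return {initiator: (member_posts, edges)} for every post with out-degree 0."""
    out_degree = defaultdict(int)
    in_links = defaultdict(list)  # dst -> posts that link to dst
    for src, dst in links:
        out_degree[src] += 1
        in_links[dst].append(src)

    cascades = {}
    for post in posts:
        if out_degree[post] != 0:
            continue  # this post links to something, so it is not an initiator
        members = {post}
        edges = []
        queue = deque([post])
        while queue:
            current = queue.popleft()
            # Follow in-links: every post that cites `current` joins the cascade.
            for citing in in_links[current]:
                edges.append((citing, current))
                if citing not in members:
                    members.add(citing)
                    queue.append(citing)
        cascades[post] = (members, edges)
    return cascades


# Toy post network: p4 links into two different cascades.
posts = ["p1", "p2", "p3", "p4", "p5"]
links = [("p2", "p1"), ("p3", "p1"), ("p4", "p2"), ("p4", "p5")]
cascades = extract_cascades(posts, links)
for initiator, (members, edges) in cascades.items():
    print(initiator, sorted(members), edges)
```

In this sketch an isolated post would come back as a single-node cascade with no edges, so the non-trivial cascades are the ones with at least one edge.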
Topological Observations
How do we measure how information flows through the network?
Common cascade shapes are extracted using algorithms in [Leskovec2006].
Topological Observations
How do we measure how information flows through the network?
Number of edges increases linearly with cascade size, while effective diameter increases logarithmically, suggesting tree-like structures.
[Figure, two panels: number of edges vs. cascade size (# nodes); effective diameter vs. cascade size.]
Topological Observations
How do we measure how information flows through the network?
We work with a bag of cascades: each cascade is a disconnected subgraph.
We now explore some graph properties of cascades.
Topological Observations
What graph properties do cascades exhibit?
As before, in- and out-degree in bag of cascades follow power laws.
[Figure, two panels: count vs. cascade node in-degree; count vs. cascade node out-degree.]
Topological Observations
What graph properties do cascades exhibit?
Cascade size distributions also follow power law.
Observation 2: The probability of observing a cascade on $n$ nodes follows a Zipf distribution:
$p(n) \propto n^{-2}$
[Figure: count vs. cascade size (# of nodes).]
Topological Observations
What graph properties do cascades exhibit?
Stars and chains also follow a power law, with different exponents (star -3.1, chain -8.5).
[Figure, two panels: count vs. size of chain (# nodes); count vs. size of star (# nodes).]
Outline
● Motivation
● Preliminaries
● Temporal Observations
● Topological Observations
– What are graph properties for blog networks?
– What shapes and patterns do cascades take on?
● Cascade Generation Model
– Epidemiological Background
– Proposed Model
– Experimental Validation
● Discussion & Conclusions
● Future Work
Epidemiological models
● We consider modeling cascade generation as an epidemic, with ideas as viruses.
● We use the SIS model:
– At any time, an entity is in one of two states: susceptible or infected.
– One parameter determines how easily conversations spread.
– [Hethcote2000]
Cascade Generation Model
0. Begin with Blog Net, but ignore edge weights.
Example: B1 links to B2, B2 links to B1, B4 links to B2 and B1, as well as itself. B3 is isolated, linking to itself.
1. Randomly pick a blog to infect, add node to cascade.
2. Infect each in-linked neighbor with probability β.
3. Add infected neighbors to cascade.
4. Set “old” infected nodes to uninfected. Repeat steps 2-4 until no nodes are infected.
[Figure: example run on the blog network B1, B2, B3, B4. B1 is picked first; one of its in-linked neighbors, B4, is infected and added to the cascade, while the other is not; in the next round no further blog is infected. The completed cascade contains B1 and B4.]
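The steps above can be sketched as a short simulation. This is an illustration under stated assumptions, not the authors' code: the function name `generate_cascade`, the `(blog, step)` node labels, the rule that a blog is infected at most once per round, and the `max_steps` safety cap are all choices made for this example. The infection probability is written `beta`.

```python
import random
from collections import defaultdict


def generate_cascade(blogs, links, beta, rng, max_steps=1000):
    """One run of steps 0-4 on an unweighted blog network.

    links: (src, dst) pairs meaning blog src links to blog dst.
    Returns (nodes, edges); each node is a (blog, step) pair and each edge
    points from the newly infected blog's node to the node that infected it.
    """
    in_linked = defaultdict(list)  # dst -> blogs that link to dst
    for src, dst in dict.fromkeys(links):  # step 0: drop repeats, i.e. ignore weights
        in_linked[dst].append(src)

    start = rng.choice(blogs)  # step 1: pick a random blog to infect
    nodes = [(start, 0)]
    edges = []
    infected = [start]
    step = 0
    while infected and step < max_steps:
        step += 1
        newly_infected = {}  # blog -> the blog that infected it
        for blog in infected:  # step 2: try each in-linked neighbor
            for neighbor in in_linked[blog]:
                if neighbor not in newly_infected and rng.random() < beta:
                    newly_infected[neighbor] = blog
        for neighbor, source in newly_infected.items():  # step 3: grow the cascade
            nodes.append((neighbor, step))
            edges.append(((neighbor, step), (source, step - 1)))
        infected = list(newly_infected)  # step 4: old infections are cleared
    return nodes, edges


# The four-blog example from the slides, including the self-links.
blogs = ["B1", "B2", "B3", "B4"]
links = [("B1", "B2"), ("B2", "B1"), ("B4", "B2"), ("B4", "B1"),
         ("B4", "B4"), ("B3", "B3")]
rng = random.Random(0)
sizes = [len(generate_cascade(blogs, links, 0.025, rng)[0]) for _ in range(1000)]
print(max(sizes), sum(1 for s in sizes if s == 1))
```

Because a blog returns to the uninfected state after each round, the same blog can appear in a cascade more than once, which is why the nodes carry a step number.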
CGM matches observations
● After trying several values, we decide on β = 0.025.
● 10 simulations, 2 million cascades each
● Most frequent cascades: 7 of 10 matched exactly.
[Figure: most frequent cascade shapes, model vs. data.]
CGM matches observations
Cascade size in this model also follows a power law; the model distribution is shown with the real data points.
[Figure: count vs. cascade size (number of nodes), model distribution with real data points.]
CGM matches observations
● Stars and chains both follow power laws, close to those observed in real data.
[Figure, two panels: count vs. star size; count vs. chain size.]
Results in brief
● Analyzed one of the largest available collections of blog information.
● Two networks: “post network” and “blog network”.
● Discovered several properties of the networks.
● Also analyzed properties of “cascades”.
● Presented generative model for cascades.
Immediate questions: answered
● Temporal questions: Does popularity have a half-life? Is there periodicity?
– Popularity dropoff follows a power-law distribution exactly as found in response times in other work. We do find that posts follow weekly periodicity.
[Figure: number of in-links vs. days after post.]
Immediate questions: answered
● Topology: What topological patterns do posts and blogs follow? What shapes do cascades take on? Stars? Chains? Something else?
– We find power law distributions in almost every topological property. In cascade shapes, stars are more common than chains, and the size of cascades follows a power law. Cascades are tree-like.
[Figure, two panels: count vs. size of chain (# nodes); count vs. size of star (# nodes).]
Immediate questions: answered
● Can a simple model replicate this behavior?
– Yes. We developed a model based on the SIS model in epidemiology. It is a simple model with only one parameter, and it produces behavior remarkably similar to that found in the dataset.
[Figure, two panels: count vs. star size; count vs. chain size.]
Future work and applications
● This work suggested that ideas may behave like viruses under an SIS model.
● This may be useful for mapping social/political trends.
● Further investigation into these properties may also allow early detection of changes in social or economic structure.
![Page 65: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/65.jpg)
65
Related work
● For explanation of SIS model:
– [Hethcote2000] H.W. Hethcote. The mathematics of infectious diseases. SIAM Rev., 42(4):599–653, 2000.
● For algorithms for extracting cascade shapes:
– [Leskovec2006] J. Leskovec, A. Singh, and J. Kleinberg. Patterns of influence in a recommendation network. PAKDD 2006.
● For some modeling of power laws:
– [Vazquez2006] A. Vazquez, J. G. Oliveira, Z. Dezso, K. I. Goh, I. Kondor, and A. L. Barabasi. Modeling bursts and heavy tails in human dynamics. Physical Review E, 73:036127, 2006.
![Page 67: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/67.jpg)
67
Acknowledgments
● Mary McGlohon was partially supported by an NSF Graduate Fellowship.
● Jure Leskovec was partially supported by a Microsoft Fellowship.
![Page 68: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/68.jpg)
68
Questions?
![Page 69: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/69.jpg)
69
● EXTRA SLIDES BEGIN HERE!
![Page 70: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/70.jpg)
70
Preliminaries- PCA
● We will work with very high-dimensional data (~9,000 dimensions).
● Principal Component Analysis is a method of dimensionality reduction.
[Figure: hypothetically, for each blog, depth upwards vs. conversation mass upwards]
![Page 73: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/73.jpg)
73
Preliminaries- PCA
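We can represent any real N x M matrix X as X = U x Σ x Vt:

$$
\underbrace{\begin{pmatrix}
1 & 1 & 1 & 0 & 0 \\
2 & 2 & 2 & 0 & 0 \\
1 & 1 & 1 & 0 & 0 \\
5 & 5 & 5 & 0 & 0 \\
0 & 0 & 0 & 2 & 2 \\
0 & 0 & 0 & 3 & 3 \\
0 & 0 & 0 & 1 & 1
\end{pmatrix}}_{X}
=
\underbrace{\begin{pmatrix}
0.18 & 0 \\
0.36 & 0 \\
0.18 & 0 \\
0.90 & 0 \\
0 & 0.53 \\
0 & 0.80 \\
0 & 0.27
\end{pmatrix}}_{U}
\times
\underbrace{\begin{pmatrix}
9.64 & 0 \\
0 & 5.29
\end{pmatrix}}_{\Sigma}
\times
\underbrace{\begin{pmatrix}
0.58 & 0.58 & 0.58 & 0 & 0 \\
0 & 0 & 0 & 0.71 & 0.71
\end{pmatrix}}_{V^{T}}
$$

v1: first row of Vt.
Details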
![Page 74: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/74.jpg)
74
Preliminaries- PCA
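● Reduce dimensionality by keeping the largest singular values and setting all other components of Σ to zero.

$$
\begin{pmatrix}
1 & 1 & 1 & 0 & 0 \\
2 & 2 & 2 & 0 & 0 \\
1 & 1 & 1 & 0 & 0 \\
5 & 5 & 5 & 0 & 0 \\
0 & 0 & 0 & 2 & 2 \\
0 & 0 & 0 & 3 & 3 \\
0 & 0 & 0 & 1 & 1
\end{pmatrix}
=
\begin{pmatrix}
0.18 & 0 \\
0.36 & 0 \\
0.18 & 0 \\
0.90 & 0 \\
0 & 0.53 \\
0 & 0.80 \\
0 & 0.27
\end{pmatrix}
\times
\begin{pmatrix}
9.64 & 0 \\
0 & 5.29
\end{pmatrix}
\times
\begin{pmatrix}
0.58 & 0.58 & 0.58 & 0 & 0 \\
0 & 0 & 0 & 0.71 & 0.71
\end{pmatrix}
$$

Details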
![Page 75: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/75.jpg)
75
Preliminaries- PCA
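Reference: Fukunaga, K. (1990). Introduction to Statistical Pattern Recognition, Academic Press.

$$
\begin{pmatrix}
1 & 1 & 1 & 0 & 0 \\
2 & 2 & 2 & 0 & 0 \\
1 & 1 & 1 & 0 & 0 \\
5 & 5 & 5 & 0 & 0 \\
0 & 0 & 0 & 2 & 2 \\
0 & 0 & 0 & 3 & 3 \\
0 & 0 & 0 & 1 & 1
\end{pmatrix}
\approx
\begin{pmatrix}
0.18 & 0 \\
0.36 & 0 \\
0.18 & 0 \\
0.90 & 0 \\
0 & 0.53 \\
0 & 0.80 \\
0 & 0.27
\end{pmatrix}
\times
\begin{pmatrix}
9.64 & 0 \\
0 & 0
\end{pmatrix}
\times
\begin{pmatrix}
0.58 & 0.58 & 0.58 & 0 & 0 \\
0 & 0 & 0 & 0.71 & 0.71
\end{pmatrix}
$$

Details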
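The decomposition and truncation on these slides can be checked numerically. The sketch below uses NumPy's `np.linalg.svd` on the same 7 x 5 example matrix and keeps only the largest singular value; the variable names are chosen here for illustration, and the signs of the computed singular vectors may differ from those shown on the slide.

```python
import numpy as np

# The example matrix X from the slides (7 rows, 5 columns).
X = np.array([
    [1, 1, 1, 0, 0],
    [2, 2, 2, 0, 0],
    [1, 1, 1, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 0, 0, 2, 2],
    [0, 0, 0, 3, 3],
    [0, 0, 0, 1, 1],
], dtype=float)

# Thin SVD: X = U @ diag(singular_values) @ Vt.
U, singular_values, Vt = np.linalg.svd(X, full_matrices=False)

# Keep only the k largest singular values (set the others to zero).
k = 1
X_rank_k = U[:, :k] @ np.diag(singular_values[:k]) @ Vt[:k, :]

print(np.round(singular_values[:2], 2))  # the two non-zero singular values
print(np.round(X_rank_k, 2))             # rank-1 approximation of X
```

With `k = 1`, the first block of rows is reproduced and the second block is zeroed out, which is the approximation the last PCA slide shows.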
![Page 77: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/77.jpg)
77
Preliminaries- Regularizing data
● Not everything in life is normally distributed.
[Figure: blog properties on a linear-linear scale, total in-links vs. total conversation mass downwards. Annotation: 99.4% of points!]
![Page 79: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/79.jpg)
79
Preliminaries: Regularizing data
● Not everything in life is normally distributed.
[Figure: blog properties on a linear-linear scale, total in-links vs. total conversation mass downwards]
Try to fit a line...
Outliers dramatically affect fit.
![Page 81: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/81.jpg)
81
Preliminaries: Regularizing data
● Not everything in life is normally distributed.
● Therefore, we propose to take log(count+1).
[Figure: blog properties on a log-log scale, total in-links vs. total conversation mass downwards]
Outliers’ effects are minimized.
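A minimal sketch of the log(count+1) transform follows. The slides do not state the base of the logarithm, so the natural log (`math.log1p`) is assumed here; the function name `regularize` and the sample counts are made up for illustration.

```python
import math

def regularize(counts):
    """Apply log(count + 1) to each count; a zero count stays at zero."""
    return [math.log1p(c) for c in counts]

# Hypothetical in-link counts: most are small, one is a large outlier.
counts = [0, 1, 2, 3, 5, 10000]
scaled = regularize(counts)

print(max(counts) / 5)                # outlier is 2000x the next largest raw value
print(max(scaled) / math.log1p(5))    # after the transform the gap is far smaller
```

Adding 1 before taking the log keeps zero counts defined, and the compression of large values is what reduces the pull of outliers on a fitted line.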
![Page 82: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/82.jpg)
82
● Suppose we want to cluster blogs based on content. What features do we use per blog?
![Page 83: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/83.jpg)
83
CascadeType
• Perform PCA on sparse matrix.
• Use log(count+1)
• Project onto 2 PC…
[Table: sparse matrix of ~44,000 blogs (rows, e.g. boingboing, slashdot) by ~9,000 cascade types (columns)]
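The three steps above can be sketched on a tiny dense stand-in for the blog-by-cascade-type matrix. The counts, the function name `project_onto_pcs`, and the choice to center columns before the SVD are assumptions for illustration; at the real scale (~44,000 x ~9,000) a sparse solver would be the more practical choice.

```python
import numpy as np

def project_onto_pcs(counts, n_components=2):
    """PCA via SVD on log(count + 1) data; returns one row of coordinates per blog."""
    logged = np.log1p(np.asarray(counts, dtype=float))
    centered = logged - logged.mean(axis=0)      # PCA assumes zero-mean columns
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ Vt[:n_components].T        # project onto the top components

# Hypothetical counts: 5 blogs (rows) x 4 cascade types (columns).
toy_counts = np.array([
    [12, 0, 3, 0],
    [10, 1, 2, 0],
    [0, 7, 0, 5],
    [1, 9, 0, 4],
    [0, 0, 1, 0],
])
coords = project_onto_pcs(toy_counts)
print(np.round(coords, 2))  # 2-D coordinates, one row per blog
```

Blogs with similar cascade-type profiles land near each other in the projected plane, which is what makes a scatter plot of the two components usable for spotting clusters.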
![Page 84: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/84.jpg)
84
CascadeType: Results
● Observation: Content of blogs and cascade behavior are often related.
• Distinct clusters for “conservative” and “humorous” blogs (hand-labeling).
![Page 86: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/86.jpg)
86
● Suppose we want to cluster blog posts. What features do we use?
![Page 87: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/87.jpg)
87
Preliminaries- Blogs
● There are several terms we use to describe cascades:
● In-link, out-link
– Green node has one out-link.
– Yellow node has one in-link.
● Depth downwards/upwards
– Pink node has an upward depth of 1,
– downward depth of 2.
● Conversation mass upwards/downwards
– Pink node has upward CM 1,
– downward CM 3.
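To make these terms concrete, the sketch below computes them for one node of a toy cascade. The edge list is hypothetical (the slide's figure is not reproduced here) and was chosen so that the "pink" node gets the values quoted above. It assumes edges point from the citing post to the cited post, that "upward" follows out-links and "downward" follows in-links, and that cascades are tree-like so breadth-first distance equals depth.

```python
from collections import defaultdict

def build_links(edges):
    """edges are (citing_post, cited_post) pairs."""
    out_links, in_links = defaultdict(set), defaultdict(set)
    for citing, cited in edges:
        out_links[citing].add(cited)
        in_links[cited].add(citing)
    return out_links, in_links

def reachable(start, neighbors):
    """Breadth-first search; maps each reachable node to its distance from start."""
    dist = {start: 0}
    frontier = [start]
    while frontier:
        next_frontier = []
        for node in frontier:
            for nb in neighbors.get(node, ()):
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    next_frontier.append(nb)
        frontier = next_frontier
    del dist[start]
    return dist

def post_features(node, edges):
    out_links, in_links = build_links(edges)
    up = reachable(node, out_links)     # posts this one (transitively) cites
    down = reachable(node, in_links)    # posts that (transitively) cite this one
    return {
        "in_links": len(in_links.get(node, ())),
        "out_links": len(out_links.get(node, ())),
        "depth_up": max(up.values(), default=0),
        "depth_down": max(down.values(), default=0),
        "cm_up": len(up),
        "cm_down": len(down),
    }

# Hypothetical cascade: pink cites root; a and b cite pink; c cites a.
toy_edges = [("pink", "root"), ("a", "pink"), ("b", "pink"), ("c", "a")]
features = post_features("pink", toy_edges)
print(features)
```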
![Page 88: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/88.jpg)
88
PostFeatures
[Table: ~2,400,000 posts (rows, e.g. boingboing-p001, boingboing-p002, slashdot-p001, slashdot-p002) by six features (columns): # in-links, # out-links, CM up, CM down, depth up, depth down]
Run PCA…
![Page 90: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/90.jpg)
90
PostFeatures: Results
• Observation: Posts within a blog tend to retain similar network characteristics.
– PC1 ~ CM upward
– PC2 ~ CM downward
– We show this scatter plot instead.
[Figure: scatter plot of posts by CM upward and CM downward, with posts from MichelleMalkin and Dlisted labeled]
![Page 91: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/91.jpg)
91
Ranking blogs by PostFeatures
● Conversation mass up/down gives a better understanding of the blog posts than in-links and out-links.
● Therefore, we may choose to rank blogs based on these attributes.
![Page 92: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/92.jpg)
92
Blogs ranked by CM vs in-links
Top blogs by conversation mass
1 michellemalkin.com
2 boingboing.net
3 imao.us (75)
4 captainsquartersblog.com/mt
5 instapundit.com
6 radioequalizer.blogspot.com (53)
7 powerlineblog.com
8 waxy.org/links
9 washingtonmonthly.com
10 kottke.org/reminder
Top blogs by in-links
1 boingboing.net
2 michellemalkin.com
3 instapundit.com
4 waxy.org/links
5 kottke.org/reminder
6 patriotdaily.com (11)
7 captainsquartersblog.com/mt
8 powerlineblog.com
9 washingtonmonthly.com
10 petashon.com (30)
![Page 93: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/93.jpg)
93
Blogs ranked by CM vs in-links
Top blogs by conversation mass
1 michellemalkin.com
2 boingboing.net
3 imao.us (75)
4 captainsquartersblog.com/mt
Top blogs by in-links
1 boingboing.net
2 michellemalkin.com
3 instapundit.com
4 waxy.org/links
.....
10 petashon.com (30)
[Figure: two example cascades, one with in-links: 2, CM: 6 and one with in-links: 5, CM: 5]
– Perhaps IMAO has longer cascades, just fewer in-links.
– While petashun has “stars”.
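The difference between the two rankings can be reproduced with a few lines. In the sketch below, the blog names and the third entry are made up; the first two entries reuse the in-link and CM values from the two example cascades on the slide (2 in-links with CM 6, and 5 in-links with CM 5).

```python
def rank_by(blogs, key):
    """Return {blog: rank}, where rank 1 is the largest value of `key`."""
    ordered = sorted(blogs, key=lambda name: blogs[name][key], reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}

# Hypothetical blogs; names are placeholders, not real sites.
blogs = {
    "chain-like-blog": {"in_links": 2, "cm": 6},   # longer cascade, few in-links
    "star-like-blog": {"in_links": 5, "cm": 5},    # many direct in-links
    "quiet-blog": {"in_links": 1, "cm": 1},
}

cm_rank = rank_by(blogs, "cm")
in_link_rank = rank_by(blogs, "in_links")
for name in sorted(blogs, key=cm_rank.get):
    print(cm_rank[name], name, f"(in-link rank {in_link_rank[name]})")
```

The chain-like blog leads by conversation mass while the star-like blog leads by in-links, mirroring the kind of swap the two top-10 lists show.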
![Page 94: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/94.jpg)
94
BlogTimeFractal: some time series
● Problem: time series data is nonuniform and difficult to analyze.
● Any patterns?
● Any measures?
[Figure: in-links over time]
![Page 95: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/95.jpg)
95
BlogTimeFractal: Definitions
● Any patterns?
● Self-similarity!
● The 80-20 law describes self-similarity.
● For any sequence, we divide it into two equal-length subsequences. 80% of traffic is in one, 20% in the other.
– Repeat recursively.
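The recursive 80-20 split can be sketched as a small generator. The function name `b_model`, its parameters, and the choice to pick the heavier half at random at each split are assumptions for illustration.

```python
import random

def b_model(total, levels, b=0.8, rng=None):
    """Split `total` recursively: each interval gives a fraction b to one half
    and 1 - b to the other. Returns 2**levels interval values."""
    rng = rng or random.Random(0)
    traffic = [float(total)]
    for _ in range(levels):
        split = []
        for amount in traffic:
            heavy, light = b * amount, (1 - b) * amount
            # Which half receives the larger share is chosen at random.
            if rng.random() < 0.5:
                split.extend([heavy, light])
            else:
                split.extend([light, heavy])
        traffic = split
    return traffic

series = b_model(total=1000, levels=3)
print([round(v, 1) for v in series])  # 8 intervals with bursty traffic
```

Setting `b=0.5` gives a uniform series, while values closer to 1 concentrate nearly all traffic in a few intervals.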
![Page 98: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/98.jpg)
98
Self-similarity
● The bias factor for the 80-20 law is b=0.8.
[Figure: a sequence split into two halves carrying 20% and 80% of the traffic]
Q: How do we estimate b?
A: Entropy plots!
Details
101
BlogTimeFractal
● An entropy plot plots entropy vs. resolution.
● From time series data, begin with resolution R = T/2.
● Record entropy $H_R$.
● Recursively take finer resolutions.
![Page 102: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/102.jpg)
102
BlogTimeFractal: Definitions
● Entropy measures the non-uniformity of the histogram at a given resolution.
● We define the entropy of our sequence at a given resolution R:

$$H_R = -\sum_{t=1}^{2^R} p(t)\,\log_2 p(t)$$

where p(t) is the fraction of posts from a blog in interval t, R is the resolution, and $2^R$ is the number of intervals.
Details
![Page 103: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/103.jpg)
103
BlogTimeFractal
● For a b-model (and self-similar cases), the entropy plot is linear. The slope s tells us the bias factor.
● Lemma: For traffic generated by a b-model, the bias factor b obeys the equation:

$$s = -b \log_2 b - (1-b) \log_2 (1-b)$$
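As a sketch of how the lemma can be used, the code below builds a toy b-model series, computes its entropy at several resolutions (with 2**R equal-length intervals at resolution R), fits the slope of the entropy plot, and inverts the lemma numerically. The function names, the deterministic toy series, and the use of bisection on the interval [0.5, 1] are assumptions made here, not the authors' procedure.

```python
import math

def binary_entropy(b):
    """Right-hand side of the lemma: -b log2 b - (1 - b) log2 (1 - b)."""
    if b <= 0.0 or b >= 1.0:
        return 0.0
    return -b * math.log2(b) - (1 - b) * math.log2(1 - b)

def entropy_at(series, resolution):
    """Entropy of the series histogram over 2**resolution equal intervals."""
    n_intervals = 2 ** resolution
    width = len(series) // n_intervals
    total = sum(series)
    h = 0.0
    for i in range(n_intervals):
        p = sum(series[i * width:(i + 1) * width]) / total
        if p > 0:
            h -= p * math.log2(p)
    return h

def fit_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return num / sum((x - mean_x) ** 2 for x in xs)

def bias_from_slope(s, tol=1e-9):
    """Invert the lemma by bisection; binary entropy decreases on [0.5, 1]."""
    lo, hi = 0.5, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binary_entropy(mid) > s:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Toy b-model series with b = 0.8 and 2**6 intervals (heavy half always first).
series = [1.0]
for _ in range(6):
    series = [part for amount in series for part in (0.8 * amount, 0.2 * amount)]

resolutions = list(range(1, 7))
entropies = [entropy_at(series, r) for r in resolutions]
slope = fit_slope(resolutions, entropies)
print(round(slope, 3), round(bias_from_slope(slope), 3))
print(round(bias_from_slope(0.85), 2))  # slope reported for Michelle Malkin in-links
```

On real time series the entropy plot is only approximately linear, so the fitted slope and the recovered bias factor should be read as estimates.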
![Page 106: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/106.jpg)
106
Entropy Plots
● Linear plot → Self-similarity
● Uniform: slope s=1, bias b=0.5
● Point mass: slope s=0, bias b=1
[Figure: entropy plot (entropy vs. resolution) for Michelle Malkin in-links, s = 0.85]
By Lemma 1, b = 0.72
![Page 107: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/107.jpg)
107
BlogTimeFractal: Results
● Observation: Most time series of interest are self-similar.
● Observation: Bias factor is approximately 0.7, that is, more bursty than uniform (70/30 law).
[Figure: entropy plots for MichelleMalkin: in-links, b=.72; conversation mass, b=.76; number of posts, b=.70]
![Page 108: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/108.jpg)
108
● Other related work
![Page 109: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/109.jpg)
109
[Ali-Hasan, Adamic 2007] Expressing Social Relationships on the Blog through Links and Comments
Analyzed three blog communities:
Dallas-Fort Worth
-Most links are external to community (91%)
-Low centralization
-Low reciprocity
UAE
-Fewer links external to community
-More centralization
-Obvious “hub” structure
Kuwait
-Fewest links external to community (53%)
-Highly centralized
-Much reciprocity
![Page 110: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/110.jpg)
110
[Duarte et al. 2007]
● Classified blogs into parlor, register, and broadcast.
[Figure: fraction of sessions with comments vs. total sessions, with regions labeled parlor, register, and broadcast]
![Page 111: 1 Exploring Blog Networks Patterns and a Model for Information Propagation Mary McGlohon In collaboration with Jure Leskovec, Christos Faloutsos Natalie.](https://reader034.fdocuments.us/reader034/viewer/2022052701/56649d6d5503460f94a4d4fe/html5/thumbnails/111.jpg)
111
[Adar et al. 2004]
● Implicit Structure and the Dynamics of Blogspace
Suggested that ideas behaved like epidemics.
Presented iRank based on how “infectious” a blog was.
(giant microbes, a site infectious in more ways than one)