Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal...

50
Jure Leskovec Stanford University

Transcript of Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal...

Page 1: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

Jure Leskovec Stanford University

Page 2: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

Social Transformation of Computing Technological networks intertwined with social

Profound transformation in: How knowledge is produced and shared How people interact and communicate The scope of CS as a discipline

6/28/2012 Jure Leskovec, Stanford University 2

Corporate e-mail communication [Adamic-Adar, ‘05]

Online friendships [Ugander-Karrer-Backstrom-Marlow, ‘11]

Page 3: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

Two issues for foundations of computing (1) How do we design in this space?

Combine social models with core ideas from computing Complex networks:

design, analysis, models Algorithmic game theory:

designing with incentives Social media: reputation,

recommendation, contagion

6/28/2012 Jure Leskovec, Stanford University 3

Page 4: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

Two issues for foundations of computing (2) Science advanced when

the invisible becomes visible. Can we recognize fundamental

patterns of human behavior from raw digital traces? Can new computational

models address long-standing social-science questions?

6/28/2012 Jure Leskovec, Stanford University 4

Page 5: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

We are surrounded by linked objects Social networks: Friendships/informal contacts among people Collaboration in companies, organizations, …

Information networks: Content creation, markets People seeking information

Traditionally networks were hard to obtain

6/28/2012 Jure Leskovec, Stanford University 5

Page 6: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

Now: Large on-line systems

Social networks: On-line communities: Facebook, Twitter, ... E-mail, blogging, electronic markets

Information networks: Hypertext, Wikipedia, Web

6/28/2012 Jure Leskovec, Stanford University 6

What have we learned about these networks?

Page 7: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

We know a lot about the structure

6/28/2012 Jure Leskovec, Stanford University 7

Network Property

Social Networks (MSN [Leskovec,Horvitz ‘08])

Information Networks (Web [Broder et al. ‘00])

Connectivity: Well connected

Giant component of 99.9% nodes

Giant component of 90% nodes

Degrees: Heavy-tailed Log-normal Power-law

Diameter: Small 6-degrees of separation ~20

Model

Small-world Bow-tie

In Out Core 40%

Page 8: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

We know much less about processes!

What process is common to both?

Navigation! How people find their way

through social networks? How people find information

on the Web, Wikipedia?

6/28/2012 Jure Leskovec, Stanford University 8

Page 9: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

6/28/2012 Jure Leskovec, Stanford University

Browsing the Web

Literature search

Consulting an encyclopedia

9

Page 10: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

Milgram’s small-world experiment [‘67] People forward letters via friends to

far-away targets they don’t know Six steps on avg. → Six degrees of separation

6/28/2012 Jure Leskovec, Stanford University 10

Milgram experiment (Travers-Milgram ‘70)

Page 11: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

6/28/2012 Jure Leskovec, Stanford University 11

You are here

Get there!

Page 12: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

Study navigation in social as well as information networks What is common? What differs? What are the design implications for

computing applications and systems?

Common theme: Use large-scale online data to as a ‘telescope’ into these processes

6/28/2012 Jure Leskovec, Stanford University 12

Page 13: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

Why should strangers be able to find short chains of acquaintances linking them together? Models for decentralized routing in

social networks [Kleinberg ‘00, Watts-Dodds-Newman ‘02, ...]

6/28/2012 Jure Leskovec, Stanford University 13

Omaha, NE Council Bluffs, IA Pittsburgh, PA

Boston, MA

Sharon, MA

Page 14: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

The MSN Messenger network: 180 million people, 1.3 billion edges

6/28/2012 Jure Leskovec, Stanford University 14

Fraction of country’s population on MSN: •Iceland: 35% •Spain: 28% •Netherlands, Canada, Sweden, Norway: 26% •France, UK: 18% •USA, Brazil: 8%

[Leskovec-Horvitz, ‘08]

Page 15: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

Avg. degree of separation = 6.6, mode=6 Long paths (>30) exist in the network Network is robust to removal of hubs

6/28/2012 Jure Leskovec, Stanford University 15

Av.g

deg

ree

of s

epar

atio

n

[Leskovec-Horvitz, ‘08]

Page 16: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

What are characteristics of short paths?

How hard is it to find them? Strategy: S-T shortest-paths Pick random S-T, run Dijkstra, examine the paths

6/28/2012 Jure Leskovec, Stanford University 16

S

D

B E

T

U

C A

F

Source Target

Def: Node is lucrative, if it leads “closer” to T

[Leskovec-Horvitz, ‘12]

Page 17: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

17

High degree nodes

Many good choices

Nod

e D

egre

e # Lucrative N

odes

S T 1 2 Steps to-go to T

6/28/2012 Jure Leskovec, Stanford University

[Leskovec-Horvitz, ‘12]

Page 18: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

Probability of success if we forward to a random neighbor

18

S T 1 2 Steps to-go to T

P(L

ucra

tive)

6/28/2012 Jure Leskovec, Stanford University

Page 19: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

19

Geo

-dis

tanc

e to

T [1

03 k

m]

Path makes longest strides towards T in

steps 4 and 3

S T 1 2 Steps to-go to T

6/28/2012 Jure Leskovec, Stanford University

Page 20: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

How good are heuristics at navigation?

Heuristics: Jump to a node X chosen: R: Random G: min geo(𝑋𝑋,𝑇𝑇) D: max deg (𝑋𝑋)

DG: min 𝑔𝑔𝑔𝑔𝑔𝑔(𝑋𝑋,𝑇𝑇)deg2(𝑋𝑋)

6/28/2012 Jure Leskovec, Stanford University 20

Steps to-go to T

P(L

ucra

tive)

Page 21: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

Bottom line: P(hit T in

≤ 10 steps) = 0.001 P(get in 10km of T

in ≤ 10 steps) = 1

Geography provides an important cue but fails in local neighborhoods 6/28/2012 Jure Leskovec, Stanford University 21

Steps

Phi

t(T)

P10

km(T

)

Page 22: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

How do these translate to navigation in information networks? Web-browsing Encyclopedia navigation

6/28/2012 Jure Leskovec, Stanford University 22

Page 23: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

23

Understand how humans navigate Wikipedia

Get an idea of how people connect concepts

Large-scale study of navigation in Wikipedia

6/28/2012 Jure Leskovec, Stanford University

[West-Leskovec, ‘12]

Page 24: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

24

Goal-directed navigation of Wikipedia Optimal solution: ⟨DIK-DIK, WATER, GERMANY, EINSTEIN⟩

6/28/2012 Jure Leskovec, Stanford University

[West-Leskovec, ‘12]

Page 25: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

25 6/28/2012 Jure Leskovec, Stanford University

[West et al., ‘09]

Page 26: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

Graph: “Wikipedia Selection for schools” 4,000 articles, 120,000 links Shortest paths between all pairs:

median 3, mean 3.2, max 9 Wikispeedia 30,000 games since Aug 2009 9,400 distinct IP addresses

Important: We know the target! 6/28/2012 Jure Leskovec, Stanford University 26

[West-Leskovec, ‘12]

Page 27: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

Larger variance in human than opt. paths Overall, humans not much worse than opt.

6/28/2012 Jure Leskovec, Stanford University 27

incl. back-clicks mode 4, median 5, mean 5.8

excl. back-clicks mode 4, median 4, mean 4.9

optimal solutions mode 3, median 3, mean 2.9

Page 28: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

6/28/2012 Jure Leskovec, Stanford University 28

Distance to-go to the target Distance to-go to the target

Only missions of SPL 3

Page 29: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

For each path position: Logistic regression to predict human choice Inspect weights for similarity and degree

6/28/2012 Jure Leskovec, Stanford University 29

MUSIC current

ORPHEUS not chosen by human (neg. example)

...

KOREA chosen by human (pos. example) KIMCHI

target

Page 30: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

For each path position: Logistic regression to predict human choice Inspect weights for content similarity & degree

6/28/2012 Jure Leskovec, Stanford University 30

content

degree

Step on the path

Fea

ture

wei

ght

Page 31: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

For each path position: Logistic regression to predict human choice Inspect weights for content similarity & degree

6/28/2012 Jure Leskovec, Stanford University 31

content degree

Step on the path

Fea

ture

wei

ght

Page 32: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

Path: ... → Water → Germany → Albert Einstein

Endgame strategy: Map last 3 articles to categories: Science → Geography → People

Few popular endgame strategies (Target category)³ typically most popular

Among non-target categories, Geography most popular 6/28/2012 Jure Leskovec, Stanford University 32

Page 33: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

6/28/2012 Jure Leskovec, Stanford University 33

human game length – optimal game length

optimal game length

all games single-cat. most pop. multi-cat.

geography people

Overhead =

technology

Page 34: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

6/28/2012 Jure Leskovec, Stanford University 34

Can we build machines that navigate better than humans?

Page 35: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

6/28/2012 Jure Leskovec, Stanford University 35

No common sense, only low-level knowledge such as word counts

Common sense and high-level background knowledge

Who is better?

Page 36: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

An agent aims to navigate to target T

Agent is currently at node U and navigates to neighbor W s.t. 𝑊𝑊 = arg max𝑈𝑈→𝑊𝑊 𝑉𝑉(𝑊𝑊|𝑈𝑈,𝑇𝑇) Ideally: 𝑉𝑉(𝐸𝐸|𝑈𝑈,𝑇𝑇) > 𝑉𝑉(𝐶𝐶|𝑈𝑈,𝑇𝑇)

What is the value function? 6/28/2012 Jure Leskovec, Stanford University 36

D

E T

U

C A

F

Target

Page 37: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

(1) Human (2) Similarity based (TXT): 𝑉𝑉 𝑊𝑊 𝑈𝑈,𝑇𝑇 = tf−idf(𝑊𝑊,𝑇𝑇) Go to W that is textually most similar to T

(3) Machine learning agents (ML): Use human/shortest paths to learn the value

function

Support Vector Machines Reinforcement Learning

6/28/2012 Jure Leskovec, Stanford University 37

Page 38: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

Features for the machine learning agents Inspired by analysis of human behavior sim(next, target) (TF-IDF cosine) sim(current, next) deg(next) taxdist(next, target) (taxonomical distance) linkcos(next, target) (cosine similarity in

outgoing hyperlinks)

6/28/2012 Jure Leskovec, Stanford University 38

Page 39: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

Machine beats human! But, machines can get terribly lost Humans are sloppy (83% they miss a direct link)

6/28/2012 Jure Leskovec, Stanford University 39

H ML TXT H ML TXT

Page 40: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

6/28/2012 Jure Leskovec, Stanford University 40

Can we predict where the user is going?

Page 41: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

Task: Given first few clicks Predict the target player is trying to reach

6/28/2012 Jure Leskovec, Stanford University 41

given given

to be predicted

Page 42: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

Markov model of human navigation

Predict the most likely target

6/28/2012 Jure Leskovec, Stanford University 42

next current target params features

given path prefix

Page 43: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

Fit Θ in learning-to-rank setup [Weston et al. ‘10]

6/28/2012 Jure Leskovec, Stanford University 43

given given

to be predicted & given for training

initial Θ final Θ

training Kimchi Gopher … Albert Einstein …

Albert Einstein … Football Orpheus ...

Page 44: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

Given choice of 2, choose true target

Rank articles such that true target gets high rank

6/28/2012 Jure Leskovec, Stanford University 44

chance

2 clicks observed 3 clicks observed

1 click observed

Page 45: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

Humans manage to find their ways in large networks, despite having only local information

How do they do it?

Analyze large-scale data from the MSN network and Wikispeedia game

Answer: They leverage expectations about network connectivity, based on background knowledge

6/28/2012 Jure Leskovec, Stanford University 45

Page 46: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

Computational ideas play 2 crucial roles Designing systems in this new space Modeling the social processes

Designing systems: Search engines User click-trails for web search ranking

[Bilenko-White, ‘08] Web revisitation patterns for

crawling [Adar et al. ‘08]

6/28/2012 Jure Leskovec, Stanford University 46

Page 47: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

Designing systems: Navigational tools Is user lost? Where is she trying to go? User facing tools and browsers:

ScentTrails [Olston-Chi, ‘03]

Creating navigable networks Navigable maps, ontologies

[Helic-Strohmaier et al., ‘11] Social browsing

6/28/2012 Jure Leskovec, Stanford University 47

Page 48: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

Models: How we search for information Information scent [Chi et al., ‘01] Information foraging [Pirolli, ‘99]

Networks facilitate new ways of interacting with information Targeted search vs. Casual browsing

Can all this help us understand ourselves and each other any better?

6/28/2012 Jure Leskovec, Stanford University 48

Page 49: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

6/28/2012 Jure Leskovec, Stanford University 49

Page 50: Jure Leskovec · Small-world Bow-tie . In . Core Out . ... people connect concepts ... optimal solutions . mode 3, median 3, mean 2.9 . 6/28/2012 Jure Leskovec, Stanford University

50 Jure Leskovec, Stanford University 6/28/2012