Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Christos Faloutsos, CMU SCS


Page 1: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

CMU SCS

Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Christos Faloutsos, CMU

Page 2: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Thanks

• Alex Smola
• Jia Yu (Tim) Pan

Google, June 2013 C. Faloutsos (CMU) 2

Page 3: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Roadmap

• Graph problems:
  – G1: Fraud detection – BP
  – G2: Botnet detection – spectral
  – G3: Beyond graphs: tensors and “NELL”
• Influence propagation and spike modeling
  – C1: spikeM model
• Conclusions

Page 4: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

E-bay Fraud detection

w/ Polo Chau & Shashank Pandit, CMU [WWW’07]


Page 7: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation


E-bay Fraud detection - NetProbe

Page 8: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

E-bay Fraud detection - NetProbe

Compatibility matrix over the node states F, A, H (rows and columns), showing heterophily. Surviving entries, row by row:

F: 99%
A: 99%
H: 49%, 49%

Page 9: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Background 1: Belief Propagation Equations

Message from node i to node j:

$$m_{ij}(x_j) = \sum_{x_i} \phi_i(x_i) \cdot \psi_{ij}(x_i, x_j) \cdot \prod_{n \in N(i) \setminus j} m_{ni}(x_i)$$

Belief of node i:

$$b_i(x_i) = \eta \cdot \phi_i(x_i) \cdot \prod_{j \in N(i)} m_{ji}(x_i)$$

[Pearl ‘82] [Yedidia+ ‘02] … [Pandit+ ‘07] [Gonzalez+ ‘09] [Chechetka+ ‘10]
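The message and belief updates of belief propagation can be sketched in a few lines of NumPy. This is a minimal illustration, not the NetProbe code: the function name `belief_propagation`, the single shared compatibility matrix `psi`, the synchronous update schedule, and the toy 3-node chain are all assumptions made for the example.

```python
import numpy as np

def belief_propagation(edges, phi, psi, n_iter=20):
    """Loopy BP sketch (hypothetical helper).

    edges: list of undirected (i, j) pairs
    phi:   (n_nodes, n_states) prior potentials phi_i(x_i)
    psi:   (n_states, n_states) compatibility psi(x_i, x_j), shared by all edges
    """
    phi = np.asarray(phi, dtype=float)
    psi = np.asarray(psi, dtype=float)
    n, k = phi.shape
    nbrs = {i: set() for i in range(n)}
    for i, j in edges:
        nbrs[i].add(j)
        nbrs[j].add(i)
    # m[(i, j)] is the message from i to j, a vector over the states x_j
    m = {(i, j): np.ones(k) / k for i in nbrs for j in nbrs[i]}
    for _ in range(n_iter):
        new = {}
        for (i, j) in m:
            prod = phi[i].copy()
            for nb in nbrs[i] - {j}:      # product over N(i) \ j
                prod *= m[(nb, i)]
            msg = psi.T @ prod            # sum over x_i
            new[(i, j)] = msg / msg.sum() # normalize for numerical stability
        m = new
    beliefs = phi.copy()
    for i in range(n):
        for j in nbrs[i]:
            beliefs[i] *= m[(j, i)]       # incoming messages to i
    return beliefs / beliefs.sum(axis=1, keepdims=True)  # eta normalizes

# Toy chain 0 - 1 - 2; only node 0 has an informative prior.
edges = [(0, 1), (1, 2)]
phi = [[0.9, 0.1], [0.5, 0.5], [0.5, 0.5]]
homophily = [[0.9, 0.1], [0.1, 0.9]]
heterophily = [[0.1, 0.9], [0.9, 0.1]]
b_homo = belief_propagation(edges, phi, homophily)
b_hetero = belief_propagation(edges, phi, heterophily)
```

On this chain (a tree, so the result is exact), the homophily matrix pulls nodes 1 and 2 toward node 0's label, while the heterophily matrix flips node 1 to the opposite label and node 2 back again.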

Page 10: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Popular press

And less desirable attention:

• E-mail from ‘Belgium police’ (‘copy of your code?’)

Page 11: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Roadmap

• Graph problems:
  – G1: Fraud detection – BP
    • Ebay
    • Symantec
    • Unification
  – G2: Botnet detection – spectral
  – G3: Beyond graphs: tensors and “NELL”
• Influence propagation and spike modeling
• Conclusions

Page 12: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Polonium: Tera-Scale Graph Mining and Inference for Malware Detection

• Polo Chau, Machine Learning Dept
• Carey Nachenberg, Vice President & Fellow
• Jeffrey Wilhelm, Principal Software Engineer
• Adam Wright, Software Engineer
• Prof. Christos Faloutsos, Computer Science Dept

PATENT PENDING

SDM 2011, Mesa, Arizona

Page 13: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Polonium: The Data

60+ terabytes of data anonymously contributed by participants of the worldwide Norton Community Watch program

• 50+ million machines
• 900+ million executable files

Constructed a machine-file bipartite graph (0.2 TB+)

• 1 billion nodes (machines and files)
• 37 billion edges

Page 14: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Polonium: Key Ideas

• Use “guilt-by-association” (i.e., homophily)
  – E.g., files that appear on machines with many bad files are more likely to be bad
• Scalability: handles 37 billion-edge graph

Page 15: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Polonium: One-Interaction Results

84.9% True Positive Rate
1% False Positive Rate

[Figure: True Positive Rate (% of malware correctly identified) vs. False Positive Rate (% of non-malware wrongly labeled as malware); the “Ideal” point is marked]


Page 17: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Unifying Guilt-by-Association Approaches: Theorems and Fast Algorithms

Danai Koutra, U Kang, Hsing-Kuo Kenneth Pao, Tai-You Ke, Duen Horng (Polo) Chau, Christos Faloutsos

ECML PKDD, 5-9 September 2011, Athens, Greece

Page 18: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Problem Definition: GBA techniques

Given: graph; & few labeled nodes
Find: labels of rest (assuming network effects)

Page 19: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Homophily and Heterophily

[Figure: label propagation over Step 1 and Step 2, under homophily and under heterophily]

All methods handle homophily.
NOT all methods handle heterophily, BUT the proposed method does!

Page 20: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Are they related?

• RWR (Random Walk with Restarts)
  – Google’s PageRank (‘if my friends are important, I’m important, too’)
• SSL (Semi-supervised learning)
  – minimize the differences among neighbors
• BP (Belief propagation)
  – send messages to neighbors, on what you believe about them

Page 21: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Are they related? (RWR, SSL, BP)

YES!


Page 23: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Correspondence of Methods

Method | Matrix           | Unknown | Known
RWR    | [I - c A D^-1]   | × x     | = (1 - c) y
SSL    | [I + a (D - A)]  | × x     | = y
FaBP   | [I + a D - c’ A] | × b_h   | = φ_h

x and b_h are the final labels/beliefs; y and φ_h are the prior labels/beliefs; A is the adjacency matrix; D is the diagonal degree matrix.

Example (3-node graph): A = [0 1 0; 1 0 1; 0 1 0], D = diag(d1, d2, d3)
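Since all three methods reduce to a linear system of the same shape, each can be solved with one call to a linear solver on a small graph. The sketch below uses the 3-node path graph from the slide; the function names and the values of c, a, and c' are illustrative assumptions, not the settings used in the paper.

```python
import numpy as np

# 3-node path graph from the slide: 0 - 1 - 2
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
D = np.diag(A.sum(axis=1))  # D = diag(d1, d2, d3)

def rwr(A, D, y, c=0.85):
    """Random Walk with Restarts: [I - c A D^-1] x = (1 - c) y."""
    n = A.shape[0]
    return np.linalg.solve(np.eye(n) - c * A @ np.linalg.inv(D), (1 - c) * y)

def ssl(A, D, y, a=0.5):
    """Semi-supervised learning: [I + a (D - A)] x = y."""
    n = A.shape[0]
    return np.linalg.solve(np.eye(n) + a * (D - A), y)

def fabp(A, D, phi_h, a=0.1, c_prime=0.2):
    """FaBP: [I + a D - c' A] b_h = phi_h (a and c' chosen for illustration)."""
    n = A.shape[0]
    return np.linalg.solve(np.eye(n) + a * D - c_prime * A, phi_h)

# Prior knowledge only about node 0.
y = np.array([1.0, 0.0, 0.0])
x_rwr = rwr(A, D, y)
x_ssl = ssl(A, D, y)
b_fabp = fabp(A, D, 0.1 * y)
```

For graphs of realistic size one would replace the dense solve with an iterative method over a sparse matrix; the dense form is used here only to keep the correspondence with the table visible.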

Page 24: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Correspondence of Methods

We know when it converges!

Page 25: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Results: Scalability

FaBP is linear in the number of edges.

[Figure: runtime (min) vs. # of edges (Kronecker graphs)]

Page 26: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Results: Parallelism

FaBP ~2x faster & wins/ties on accuracy.

[Figure: % accuracy vs. runtime (min)]

Page 27: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Conclusions for BP

• ‘NetProbe’, ‘Polonium’, and belief propagation: exploit network effects.
• FaBP: fast & accurate (and -> convergence conditions)


Page 29: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

EigenSpokes

B. Aditya Prakash, Mukund Seshadri, Ashwin Sridharan, Sridhar Machiraju and Christos Faloutsos: EigenSpokes: Surprising Patterns and Scalable Community Chipping in Large Graphs, PAKDD 2010, Hyderabad, India, 21-24 June 2010.

Page 30: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

EigenSpokes

• Eigenvectors of adjacency matrix
  – equivalent to singular vectors (symmetric, undirected graph)


Page 35: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

EigenSpokes

• EE plot:
• Scatter plot of scores of u1 vs u2
• One would expect
  – Many points @ origin
  – A few scattered ~randomly

[Figure: EE plot; axes u1 (1st principal component) and u2 (2nd principal component)]

Page 36: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

EigenSpokes

• EE plot: scatter plot of scores of u1 vs u2

[Figure: EE plot of u1 vs u2, with a 90° angle marked]

Page 37: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

EigenSpokes - pervasiveness

• Present in mobile social graph, across time and space
• Patent citation graph

Page 38: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

EigenSpokes - explanation

Near-cliques, or near-bipartite-cores, loosely connected


Page 41: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

EigenSpokes - explanation

Near-cliques, or near-bipartite-cores, loosely connected

So what? Extract nodes with high scores -> high connectivity -> good “communities”

[Figure: spy plot of top 20 nodes]
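The idea of extracting high-scoring nodes from the top eigenvectors can be tried on a toy graph. The sketch below is an illustration under assumptions, not the EigenSpokes algorithm itself: the graph (two cliques joined loosely through a path), the helper names, and the fixed "top k by absolute score" rule are all made up for the example.

```python
import numpy as np

def clique_edges(nodes):
    """All pairs within a node list (hypothetical helper)."""
    return [(u, v) for a, u in enumerate(nodes) for v in nodes[a + 1:]]

def build_adjacency(n, edges):
    A = np.zeros((n, n))
    for u, v in edges:
        A[u, v] = A[v, u] = 1.0   # symmetric, undirected graph
    return A

def top_eigenvectors(A, k=2):
    vals, vecs = np.linalg.eigh(A)          # eigh: for symmetric matrices
    order = np.argsort(vals)[::-1][:k]      # largest eigenvalues first
    return vals[order], vecs[:, order]

def top_nodes(u, count):
    """Nodes with the largest absolute score in eigenvector u."""
    return sorted(np.argsort(-np.abs(u))[:count].tolist())

# Toy graph: a 6-clique and a 5-clique, loosely connected through a path.
clique_a = list(range(0, 6))
clique_b = list(range(6, 11))
path = [(i, i + 1) for i in range(11, 19)]
edges = clique_edges(clique_a) + clique_edges(clique_b) + path + [(5, 11), (10, 19)]
A = build_adjacency(20, edges)

vals, U = top_eigenvectors(A, k=2)
u1, u2 = U[:, 0], U[:, 1]
community_1 = top_nodes(u1, 6)   # high scores on u1
community_2 = top_nodes(u2, 5)   # high scores on u2
```

In this toy case each top eigenvector concentrates on one clique, so a node with a large u1 score has a near-zero u2 score and vice versa, which is what puts the points of an EE plot along the axes.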

Page 42: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Bipartite Communities!

[Figure: magnified bipartite community]

• patents from same inventor(s)
• ‘cut-and-paste’ bibliography!

Page 43: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

(maybe, botnets?)

• Victim IPs?
• Botnet members?

Exploring it with Dr. Eric Mao (III-Taiwan)

Page 44: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Roadmap

• Graph problems:
  – G1: Fraud detection – BP
  – G2: Botnet detection – spectral
  – G3: Beyond graphs: tensors and “NELL”
• Influence propagation and spike modeling
• Conclusions

Page 45: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

GigaTensor: Scaling Tensor Analysis Up By 100 Times – Algorithms and Discoveries

U Kang, Evangelos Papalexakis, Abhay Harpale, Christos Faloutsos

KDD’12

Page 46: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Background: Tensors

• Tensors (= multi-dimensional arrays) are everywhere
  – Hyperlinks & anchor text [Kolda+,05]

[Figure: sparse 3-mode tensor with modes URL 1, URL 2, and Anchor Text (e.g., Java, C++, C#); nonzero entries are 1]

Page 47: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Background: Tensors

• Tensors (= multi-dimensional arrays) are everywhere
  – Sensor stream (time, location, type)
  – Predicates (subject, verb, object) in knowledge base
    e.g., “Barack Obama is president of U.S.”, “Eric Clapton plays guitar”

NELL (Never Ending Language Learner) data: tensor modes of size 26M, 26M, and 48M; nonzeros = 144M

Page 48: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Background: Tensors

• Tensors (= multi-dimensional arrays) are everywhere
  – Sensor stream (time, location, type)
  – Predicates (subject, verb, object) in knowledge base
  – Anomaly detection in computer networks (IP-source, IP-destination, time-stamp)

Page 49: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Problem Definition

• How to decompose a billion-scale tensor?
  – Corresponds to SVD in 2D case

Page 50: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Problem Definition

• How to decompose a billion-scale tensor?
  – Corresponds to SVD in 2D case

[Figure: tensor decomposed into components, e.g., ‘Politicians’ and ‘Artists’]
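To make the "SVD in the 2D case" analogy concrete, here is a small in-memory sketch of a PARAFAC (CP) decomposition fitted by alternating least squares on a toy (subject, verb, object) tensor. It is not GigaTensor's distributed implementation: the helper names, the SVD-based initialization, the toy tensor with two planted concepts, and the 0.5 threshold for reading off concept members are assumptions made for illustration.

```python
import numpy as np

def unfold(X, mode):
    """Matricize a 3-way tensor along one mode."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def cp_als(X, rank, n_iter=50):
    """Rank-`rank` CP decomposition X[i,j,k] ~ sum_r A[i,r] B[j,r] C[k,r]."""
    # Initialize each factor with the leading left singular vectors of its unfolding.
    A, B, C = [np.linalg.svd(unfold(X, m), full_matrices=False)[0][:, :rank]
               for m in range(3)]
    for _ in range(n_iter):
        # Each step is a least-squares update of one factor with the others fixed.
        A = np.einsum('ijk,jr,kr->ir', X, B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = np.einsum('ijk,ir,kr->jr', X, A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = np.einsum('ijk,ir,jr->kr', X, A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

def reconstruct(A, B, C):
    return np.einsum('ir,jr,kr->ijk', A, B, C)

# Toy tensor with two planted "concepts" (blocks of co-occurring triples).
X = np.zeros((4, 5, 6))
X[0:2, 0:2, 0:3] = 1.0   # concept 1: subjects 0-1, verbs 0-1, objects 0-2
X[2:4, 2:5, 3:6] = 1.0   # concept 2: subjects 2-3, verbs 2-4, objects 3-5

A, B, C = cp_als(X, rank=2)
rel_err = np.linalg.norm(X - reconstruct(A, B, C)) / np.linalg.norm(X)

# Read off the subjects that dominate each component.
groups = sorted(
    np.flatnonzero(np.abs(A[:, r]) > 0.5 * np.abs(A[:, r]).max()).tolist()
    for r in range(2)
)
```

Each recovered component groups subjects, verbs, and objects that co-occur, which is the sense in which a component can be read as a concept such as ‘Politicians’ or ‘Artists’.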

Page 51: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Problem Definition

Q1: Dominant concepts/topics?
Q2: Find synonyms to a given noun phrase?
(and how to scale up: |data| > RAM)

NELL (Never Ending Language Learner) data: tensor modes of size 26M, 26M, and 48M; nonzeros = 144M

Page 52: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Experiments

• GigaTensor solves 100x larger problem

[Figure: tensor with modes (I), (J), (K); number of nonzeros = I / 50. The Tensor Toolbox runs out of memory; GigaTensor handles problems 100x larger]

Page 53: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

A1: Concept Discovery

• Concept Discovery in Knowledge Base


Page 55: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

A2: Synonym Discovery


Page 57: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Rise and Fall Patterns of Information Diffusion: Model and Implications

Yasuko Matsubara (Kyoto University), Yasushi Sakurai (NTT), B. Aditya Prakash (CMU), Lei Li (UCB), Christos Faloutsos (CMU)

KDD’12, Beijing, China

Page 58: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Rise and fall patterns in social media

• Meme (# of mentions in blogs)
  – short phrases sourced from U.S. politics in 2008

[Figure: mention counts over time for “you can put lipstick on a pig” and “yes we can”]

Page 59: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Rise and fall patterns in social media

• four classes on YouTube [Crane et al. ’08]
• six classes on Meme [Yang et al. ’11]

Page 60: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Rise and fall patterns in social media

• Can we find a unifying model, which includes these patterns?
  – four classes on YouTube [Crane et al. ’08]
  – six classes on Meme [Yang et al. ’11]

Page 61: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Rise and fall patterns in social media

• Answer: YES!
• We can represent all patterns with a single model

Page 62: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Main idea - SpikeM

1. Un-informed bloggers (uninformed about rumor)
2. External shock at time n_b (e.g., breaking news)
3. Infection (word-of-mouth)

Infectiveness of a blog-post at age n:
- Strength of infection β (quality of news)
- Decay function

[Figure: bloggers at time n = 0, time n = n_b, and time n = n_b + 1]


Page 64: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

-1.5 slope

J. G. Oliveira & A.-L. Barabási: Human Dynamics: The Correspondence Patterns of Darwin and Einstein. Nature 437, 1251 (2005).

[Figure: Prob(RT > x) (log) vs. response time (log); slope -1.5]

Page 65: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

SpikeM - with periodicity

• Full equation of SpikeM
• Periodicity: bloggers change their activity over time (e.g., daily, weekly, yearly)

[Figure: activity vs. time n, with a peak at noon and a dip at 3am]
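The slides name the ingredients of SpikeM (uninformed bloggers, an external shock at time n_b, infection strength β, a decay function, and periodicity), but the full equation did not survive in this transcript. The simulation below is therefore a sketch under stated assumptions: a power-law decay f(τ) = β·τ^(-1.5), matching the -1.5 slope cited earlier, and a sinusoidal activity modulation. The function name, parameter names, and default values are hypothetical, not the authors' code.

```python
import numpy as np

def spikem(N=1000.0, beta_N=1.0, n_b=5, S_b=10.0, eps=0.0,
           P_a=0.0, P_p=24, P_s=0, T=100):
    """Simulate a spike (assumed functional forms, see lead-in).

    N: total bloggers; beta_N: infection strength times N
    n_b, S_b: time and size of the external shock
    eps: background noise; P_a, P_p, P_s: periodicity amplitude, period, shift
    Returns (dB, U): newly informed and still-uninformed bloggers per time tick.
    """
    beta = beta_N / N
    dB = np.zeros(T)
    U = np.zeros(T)
    U[0] = N
    for n in range(T - 1):
        ts = np.arange(n_b, n + 1)                    # empty before the shock
        shock = np.where(ts == n_b, S_b, 0.0)
        ages = (n + 1 - ts).astype(float)
        # Influence of past posts, decaying as a power law of their age.
        influence = np.sum((dB[ts] + shock) * beta * ages ** -1.5)
        # Periodic modulation of blogger activity (equals 1 when P_a = 0).
        p = 1.0 - 0.5 * P_a * (np.sin(2 * np.pi / P_p * (n + 1 + P_s)) + 1.0)
        new = min(p * (U[n] * influence + eps), U[n])  # cannot exceed the uninformed
        dB[n + 1] = new
        U[n + 1] = U[n] - new
    return dB, U

dB, U = spikem()
peak_time = int(dB.argmax())
```

With these defaults nothing happens before the shock, the count of newly informed bloggers rises quickly after it, and it then decays slowly as the pool of uninformed bloggers shrinks. Setting `P_a` above zero adds the daily-style dips and peaks described on the slide.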

Page 66: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Details

• Analysis – exponential rise and power-law fall

Rise-part: SI -> exponential; SpikeM -> exponential

[Figure: rise part in lin-log and log-log scales]

Page 67: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Details

• Analysis – exponential rise and power-law fall

Fall-part: SI -> exponential; SpikeM -> power law

[Figure: fall part in lin-log and log-log scales]

Page 68: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Tail-part forecasts

• SpikeM can capture tail part

Page 69: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

“What-if” forecasting

e.g., given (1) first spike, (2) release date of two sequel movies, (3) access volume before the release date

[Figure: time series annotated with (1) first spike, (2) release date, (3) two weeks before release; the upcoming spikes are marked “?”]

Page 70: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

“What-if” forecasting

SpikeM can forecast upcoming spikes

[Figure: time series annotated with (1) first spike, (2) release date, (3) two weeks before release]

Page 71: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Conclusions for spikes

• Exp rise; PL decay
• ‘spikeM’ captures all patterns, with a few parms
  – And can do extrapolation
  – And forecasting

Page 72: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Roadmap

• Graph problems:
  – G1: Fraud detection – BP
  – G2: Botnet detection – spectral
  – G3: Beyond graphs: tensors and “NELL”
• Influence propagation and spike modeling
• Future research
• Conclusions

Page 73: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Challenge #1: Time evolving networks / tensors

• Periodicities? Burstiness?
• What is ‘typical’ behavior of a node, over time
• Heterogeneous graphs (= nodes w/ attributes)

Page 74: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Challenge #2: ‘Connectome’ – brain wiring

• Which neurons get activated by ‘bee’
• How wiring evolves
• Modeling epilepsy

N. Sidiropoulos, George Karypis, V. Papalexakis, Tom Mitchell

Page 75: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Thanks

Thanks to: NSF IIS-0705359, IIS-0534205, CTA-INARC; Yahoo (M45), LLNL, IBM, SPRINT, Google, INTEL, HP, iLab

Page 76: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Project info: PEGASUS

www.cs.cmu.edu/~pegasus

Results on large graphs: with Pegasus + hadoop + M45

• Apache license
• Code, papers, manual, video

Prof. U Kang, Prof. Polo Chau

Page 77: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Cast

• Akoglu, Leman
• Beutel, Alex
• Chau, Polo
• Kang, U
• Koutra, Danai
• McGlohon, Mary
• Papalexakis, Vagelis
• Prakash, Aditya
• Tong, Hanghang

Page 78: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

References

• Deepayan Chakrabarti, Christos Faloutsos: Graph mining: Laws, generators, and algorithms. ACM Comput. Surv. 38(1) (2006)

Page 79: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

References

• Christos Faloutsos, Tamara G. Kolda, Jimeng Sun: Mining large graphs and streams using matrix and tensor tools. Tutorial, SIGMOD Conference 2007: 1174

Page 80: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

References

• Yasuko Matsubara, Yasushi Sakurai, B. Aditya Prakash, Lei Li, Christos Faloutsos: Rise and Fall Patterns of Information Diffusion: Model and Implications. KDD’12, pp. 6-14, Beijing, China, August 2012

Page 81: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

References

• Jimeng Sun, Dacheng Tao, Christos Faloutsos: Beyond streams and graphs: dynamic tensor analysis. KDD 2006: 374-383

Page 82: Mining Large Graphs: Spectral Methods, Tensors and Influence propagation

Overall Conclusions

• G1: fraud detection
  – BP: powerful method
  – FaBP: faster; equally accurate; known convergence
• G2: botnets -> EigenSpokes
• G3: Subject-Verb-Object -> Tensors/GigaTensor
• Spikes: ‘spikeM’ (exp rise; PL drop)