CSE416A ANALYSIS OF NETWORK DATA SEMESTER SUMMARY

CSE416A ANALYSIS OF NETWORK DATA Fall 2019 Marion Neumann SEMESTER SUMMARY Contents in these slides may be subject to copyright. Some materials are adopted from: http://www.cs.cornell.edu/home/kleinber/networks-book, http://web.stanford.edu/class/cs224w/ , http://www.mmds.org .

CSE416A ANALYSIS OF NETWORK DATA

Fall 2019
Marion Neumann

SEMESTER SUMMARY

Contents in these slides may be subject to copyright. Some materials are adopted from: http://www.cs.cornell.edu/home/kleinber/networks-book, http://web.stanford.edu/class/cs224w/, http://www.mmds.org.

REASONING ABOUT NETWORKS

• What do we study in networks?
  • Structure and evolution:
    • What is the structure of a network?
    • Why and how did it come to have such structure?
  • Processes and dynamics:
    • Networks provide a “skeleton” for the spreading of information, behavior, diseases, …

REASONING ABOUT NETWORKS

• How do we reason about/understand networks?
  • Empirical: Study network data to find organizational principles
  • Mathematical models: Study probabilistic models and graph theory to derive properties theoretically
  • Algorithms: Methods for analyzing graphs to compute patterns, similarities, and interesting hidden features

REASONING ABOUT NETWORKS

• Empirical → organizational principles
• Mathematical models → theoretical properties
• Algorithms → patterns, similarities, features

Mathematics
→ prob/stats
→ graph theory
→ linear algebra

Field of Application
→ social/political science
→ biology
→ intelligence
→ ...

Computer Science
→ algorithms & data structures
→ data science
→ big data

Structure and Evolution

Processes and Dynamics

WHAT WE LEARNED

1. Communities in Networks
2. Betweenness-based Clustering
  • Girvan-Newman
3. Modularity
  • maximization
  • modularity matrix
4. Spectral clustering
  • Graph Laplacian
5. Overlapping Communities
  • Clique Percolation Method
  • Finding Cliques
6. Node Similarity
  • structural & regular equivalence
  • Random Walks
  • SimRank
7. Node Classification
  • Label Propagation

1. Random Graph Model
  • degree distribution
  • clustering coefficient
  • avg. path length
  • giant component
  • evolution & phase transition
  • problems
2. Small World Model
  • avg. path length
  • clustering coefficient
  • problems
3. Scale-free Networks
  • power-law distribution
  • exponent estimation
4. Preferential Attachment Model
  • rich-get-richer phenomenon
  • power-law distribution
  • clustering coefficient
  • average path length

1. Spreading Processes
  • Probabilistic Models of Spread
  • Epidemics
2. Cascading Behavior
  • Independent Cascade Model
  • Exposure Curves
  • Viral Marketing
3. Graph Classification

Part II

Part III

Part IV

Networked Data & Graphs… Part I

HOW IT ALL FITS TOGETHER

Measures & Properties | Models | Algorithms
diameter & local structure, small-world | ER & small world models | shortest-path algorithms
degree, scale-free, hubs | preferential attachment, power-law distribution | centrality measures, power iteration
strong and weak ties, communities | | Girvan-Newman, modularity, spectral, CPM
node properties & similarity | random walks | SimRank, label propagation
contagion & spread | epidemics, independent cascade model | simulations & statistical tests

NETWORK MODELS: ER & SM

• Properties
  • small-world → small diameter
  • local structure → high clustering coefficient
• Models
  • Random Graph Model
  • Small-World Model

→ phase transition

• Digression: configuration model
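
As an illustration of the properties listed above, the following is a minimal sketch (not taken from the course materials) that samples an Erdős-Rényi graph G(n, p) and measures the size of its largest connected component and its average clustering coefficient. The function names `er_graph`, `largest_component_size`, and `average_clustering` are hypothetical, and nodes with fewer than two neighbors are assumed to contribute a clustering coefficient of 0.

```python
import random
from collections import deque


def er_graph(n, p, seed=0):
    """Sample G(n, p): each possible undirected edge is present independently with probability p."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def largest_component_size(adj):
    """Return the number of nodes in the largest connected component (BFS from every unseen node)."""
    seen = set()
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        best = max(best, size)
    return best


def average_clustering(adj):
    """Average local clustering coefficient; nodes with degree < 2 count as 0 (an assumption)."""
    total = 0.0
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue
        # Count edges among the neighbors of v.
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


if __name__ == "__main__":
    n = 500
    # Sweep the expected average degree across 1, where the giant component is expected to emerge.
    for avg_degree in (0.5, 1.0, 2.0, 4.0):
        g = er_graph(n, avg_degree / (n - 1), seed=1)
        print(avg_degree, largest_component_size(g) / n, round(average_clustering(g), 4))
```

Running the sweep should show the largest component growing from a tiny fraction of the nodes to most of the graph as the average degree passes 1, while the clustering coefficient stays low, which is one of the known shortcomings of the random graph model as a model of real networks.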

NETWORK MODELS: PREFERENTIAL ATTACHMENT

• Properties
  • degree distribution
  • rich-get-richer/hubs
  • scale-free
• Model
  • Preferential attachment

→ model network evolution
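
The rich-get-richer mechanism can be sketched in a few lines. The code below is an illustrative growth process in the spirit of preferential attachment, not the course's reference implementation: the names `preferential_attachment` and `degrees`, the parameter `m` (edges added per new node), and the choice of an initial clique on `m + 1` nodes are all assumptions made for this example.

```python
import random
from collections import Counter


def preferential_attachment(n, m, seed=0):
    """Grow a graph to n nodes; each new node links to m distinct existing nodes,
    chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    # Assumed starting point: a clique on m + 1 nodes, so every node has positive degree.
    edges = [(u, v) for u in range(m + 1) for v in range(u + 1, m + 1)]
    # Each node appears in `stubs` once per incident edge, so a uniform draw
    # from `stubs` is a degree-proportional draw over nodes.
    stubs = [x for e in edges for x in e]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))
    return edges


def degrees(edges):
    """Degree of every node that appears in the edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


if __name__ == "__main__":
    deg = degrees(preferential_attachment(2000, 2, seed=1))
    histogram = Counter(deg.values())
    # A heavy tail shows up as a few nodes (hubs) with degree far above the minimum m.
    print("max degree:", max(deg.values()))
    print("nodes per degree (first few):", sorted(histogram.items())[:5])
```

Plotting the degree histogram on log-log axes would be the natural next step for checking whether the tail looks like a power law.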

COMMUNITY DETECTION

• Properties
  • local structure
  • homophily
• Algorithms
  • Girvan-Newman
  • Modularity Maximization
  • Spectral Clustering
  • Clique Percolation Method
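
Modularity maximization needs a score to maximize. As a hedged sketch, the function below evaluates the standard modularity Q of a given partition of an undirected simple graph using the per-community form Q = Σ_c [ L_c / m − (d_c / 2m)² ], where m is the number of edges, L_c the number of edges inside community c, and d_c the total degree of its nodes. The name `modularity` and the toy graph are assumptions for illustration; the sketch only scores a partition and does not search for the best one.

```python
def modularity(edges, community):
    """Modularity Q of a partition.

    edges: list of undirected edges (u, v), no self-loops or duplicates (assumed).
    community: dict mapping each node to its community label.
    """
    m = len(edges)
    internal = {}    # L_c: edges with both endpoints in community c
    degree_sum = {}  # d_c: sum of degrees of the nodes in community c
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())


if __name__ == "__main__":
    # Toy graph: two triangles {0, 1, 2} and {3, 4, 5} joined by the bridge (2, 3).
    edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
    two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
    one_group = {v: "all" for v in range(6)}
    print(round(modularity(edges, two_groups), 4))  # 0.3571, i.e. 5/14
    print(modularity(edges, one_group))             # 0.0
```

Splitting the toy graph along the bridge scores higher than lumping everything together, which matches the intuition that communities have more internal edges than expected at random.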

NODE SIMILARITY & CLASSIFICATION

• Properties
  • homophily/auto-correlation
  • node properties
  • missing information
• Model
  • Random walk (with restart)
• Algorithms
  • SimRank
  • Label Propagation
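
To make the link between homophily and missing information concrete, here is a minimal sketch of one common variant of label propagation: nodes with known labels are clamped, and every other node repeatedly replaces its label scores with the average of its neighbors' scores. This is an illustration under stated assumptions rather than the exact formulation used in the course; the name `label_propagation`, the fixed iteration count, and the tie-breaking by sorted label order are choices made for this example.

```python
def label_propagation(adj, seeds, iterations=50):
    """Predict a label for every node.

    adj: dict node -> list of neighbors (undirected graph).
    seeds: dict node -> known label; these nodes are clamped to their label.
    """
    labels = sorted(set(seeds.values()))
    # score[v][c] is the current belief that node v carries label c.
    score = {v: {c: (1.0 if seeds.get(v) == c else 0.0) for c in labels}
             for v in adj}
    for _ in range(iterations):
        new_score = {}
        for v, nbrs in adj.items():
            if v in seeds:
                new_score[v] = score[v]  # known labels never change
            elif nbrs:
                new_score[v] = {c: sum(score[u][c] for u in nbrs) / len(nbrs)
                                for c in labels}
            else:
                new_score[v] = score[v]  # isolated node: nothing to average
        score = new_score
    # Ties are broken in favor of the label that sorts first.
    return {v: max(labels, key=lambda c: score[v][c]) for v in adj}


if __name__ == "__main__":
    # Two triangles joined by the bridge 2-3; one labeled node on each side.
    adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
           3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
    print(label_propagation(adj, {0: "A", 5: "B"}))
```

On this toy graph each unlabeled node ends up with the label of the seed in its own triangle, since most of its neighbors sit on the same side of the bridge.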

NETWORK DYNAMICS

• Properties
  • contagion/spread
  • exposure/adoption
• Measures
  • epidemic threshold
  • exposure curves
• Models
  • SIS, SIR
  • Independent Cascade Model

[Figure labels: β, δ]
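
The independent cascade model lends itself to simulation. The sketch below assumes the simplest setting, a single activation probability `p` shared by all edges, and gives each newly activated node exactly one chance to activate each of its inactive neighbors. The names `independent_cascade` and `expected_spread` are hypothetical helpers introduced for this example.

```python
import random


def independent_cascade(adj, seeds, p, seed=0):
    """Run one cascade and return the set of nodes that end up active.

    adj: dict node -> list of neighbors.
    seeds: initially active nodes.
    p: probability that an active node activates a given inactive neighbor
       (assumed identical for every edge).
    """
    rng = random.Random(seed)
    active = set(seeds)
    frontier = list(active)
    while frontier:
        next_frontier = []
        for u in frontier:
            # u was activated in the previous round and gets a single try per neighbor.
            for v in adj[u]:
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


def expected_spread(adj, seeds, p, runs=1000):
    """Monte Carlo estimate of the expected number of active nodes."""
    total = sum(len(independent_cascade(adj, seeds, p, seed=r)) for r in range(runs))
    return total / runs


if __name__ == "__main__":
    # Path 0-1-2-3 plus an isolated node 4.
    adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2], 4: []}
    print(sorted(independent_cascade(adj, [0], 1.0)))  # [0, 1, 2, 3]
    print(expected_spread(adj, [0], 0.5))
```

Comparing `expected_spread` for different seed sets is the basic building block behind viral-marketing style questions such as which nodes to target first.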

WHAT CAN WE DO WITH COMPLEX NETWORK ANALYSIS?

Complex Network Analysis: Use empirical measures, comparisons to network models, and network algorithms to reason about network properties and underlying phenomena.

map of superpowers

SUMMARY

• You have learned a lot!
  • answered insightful questions
  • derived many interesting results
  • implemented a number of algorithms
  • practiced many real-world workflows

Thank You for the Hard Work!!!

Please fill in the course evaluation.

Looking for a good read for the break?