Geometric Representations of Graphs and Words …people.dsv.su.se/~henke/DSWS/devdatt.pdf ·...


Geometric Representations of Graphs and Words and Applications

Devdatt Dubhashi

Computer Science and Engg., Chalmers and GU

Machine Learning, Algorithms, Computational Biology Group

www.cs.chalmers.se/research/lab

Data Science, Stockholm

Dec 4-5, 2014

DD, Geometric Representations of Graphs and Words and Applications, Data Science, Stockholm, Dec 4-5, 2014, slide 1 / 32


“Era of Big Data is Here”

Large-scale, time-varying, heterogeneous, inter-related, etc.


Geometric Embeddings: Unifying Framework for Learning

Embed a graph, word, or other data in a high (moderate) dimensional Euclidean space in a way that preserves structure.

Can exploit geometry to design fast algorithms.

Can exploit the arsenal of machine learning techniques that work on vector-structured data.


Lovász ϑ and orthogonal labellings

[Figure: graph on five vertices labelled u1, ..., u5 (five animation frames)]

Orthogonal Labelling

$U = [u_1, \dots, u_n]$, with unit-norm columns, is an orthogonal labelling of $G$ if

$$u_i^\top u_j = 0 \quad \text{whenever } i \neq j \text{ and } (i, j) \notin E$$
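
The definition can be checked numerically. The sketch below is only an illustration: the function name `is_orthogonal_labelling`, the tolerance, and the three-node path example are assumptions, not part of the talk.

```python
import numpy as np

def is_orthogonal_labelling(U, edges, tol=1e-9):
    """Return True if the columns of U are unit vectors and
    u_i . u_j = 0 for every pair of distinct non-adjacent vertices."""
    n = U.shape[1]
    edge_set = {frozenset(e) for e in edges}
    if not np.allclose(np.linalg.norm(U, axis=0), 1.0, atol=tol):
        return False
    for i in range(n):
        for j in range(i + 1, n):
            if frozenset((i, j)) not in edge_set and abs(U[:, i] @ U[:, j]) > tol:
                return False
    return True

# Path 0 - 1 - 2: the only non-adjacent pair is (0, 2).
edges = [(0, 1), (1, 2)]
r = 1.0 / np.sqrt(2.0)
U_good = np.array([[1.0, r, 0.0],
                   [0.0, r, 1.0]])   # u_0 and u_2 are orthogonal
U_bad = np.array([[1.0, 0.0, r],
                  [0.0, 1.0, r]])    # u_0 and u_2 are not orthogonal

good = is_orthogonal_labelling(U_good, edges)
bad = is_orthogonal_labelling(U_bad, edges)
```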


Lovász ϑ and orthogonal labellings

[Figure: labelling vectors u1, ..., u5 around a handle c, with $1/\sqrt{\vartheta}$ marked]

$$\vartheta(G) = \min_{U} \min_{\|c\| = 1} \max_{i} \frac{1}{(c^\top u_i)^2}$$

$\cos^{-1}(1/\sqrt{\vartheta})$ is the maximum angle between the “handle” $c$ and any of the $u_i$, minimized over all valid orthogonal labellings and handles.
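
For a fixed labelling and handle, the inner expression is easy to evaluate and gives an upper bound on ϑ(G). The sketch below evaluates it on the 5-cycle using the classical "umbrella" labelling. That construction is a textbook example brought in here as an assumption, not something taken from the slides, and the function names are hypothetical.

```python
import numpy as np

def handle_objective(U, c):
    """Evaluate max_i 1 / (c^T u_i)^2 after normalising c and the columns of U."""
    c = c / np.linalg.norm(c)
    U = U / np.linalg.norm(U, axis=0, keepdims=True)
    return float(np.max(1.0 / (c @ U) ** 2))

def umbrella_c5():
    """Unit vectors for the 5-cycle with u_k orthogonal to u_{k+2},
    i.e. orthogonal exactly on the non-adjacent pairs."""
    cos_gap = np.cos(4.0 * np.pi / 5.0)   # planar angle between ribs k and k+2
    h2 = -cos_gap / (1.0 - cos_gap)       # squared height that makes them orthogonal
    s, h = np.sqrt(1.0 - h2), np.sqrt(h2)
    angles = 2.0 * np.pi * np.arange(5) / 5.0
    U = np.vstack([s * np.cos(angles), s * np.sin(angles), np.full(5, h)])
    handle = np.array([0.0, 0.0, 1.0])    # the umbrella's handle
    return U, handle

U, c = umbrella_c5()
value = handle_objective(U, c)
```

With this labelling the objective equals √5, the known value of ϑ for the 5-cycle, so the bound is tight in this case.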


Applications of ϑ(G)

Finding large independent sets/cliques

Graph coloring

Finding planted cliques in random graphs

Finding maxcut

· · ·

Problem

Cannot compute ϑ for graphs with more than a few hundred nodes!

SDP: “Efficient” in theory but disastrous in practice.


Approximating ϑ(G) by ω(K)

Vinay Jethava, Ph.D. (2014)

Given $K = \frac{A}{\rho} + I$, find the best handle $c$ ⇔ one-class SVM

[Figure: labelling vectors u1, ..., u5 around a handle c, with $1/\sqrt{\omega}$ marked]

$$\omega(K) = \max_{x_i \ge 0} \; 2\sum_{i=1}^{n} x_i - \sum_{i}\sum_{j} x_i x_j K_{ij}$$

How close is the one-class SVM solution ω(K) to ϑ(G)?
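
The quantity ω(K) is a concave quadratic program over the non-negative orthant whenever K is positive semidefinite, so even a very simple solver can approximate it. The sketch below uses projected gradient ascent as a generic stand-in (not the solver used in the cited work) and assumes ρ = −λ_min(A), the smallest value for which A/ρ + I is positive semidefinite. The function names are hypothetical, and the graph is assumed to have at least one edge.

```python
import numpy as np

def ls_kernel(A):
    """K = A / rho + I with rho = -lambda_min(A) (assumed choice of rho)."""
    rho = -np.linalg.eigvalsh(A)[0]        # eigvalsh returns ascending eigenvalues
    return A / rho + np.eye(len(A))

def omega(K, iters=2000):
    """Maximise 2 * sum(x) - x^T K x over x >= 0 by projected gradient ascent."""
    step = 1.0 / (2.0 * np.linalg.eigvalsh(K)[-1])   # 1 / Lipschitz constant of the gradient
    x = np.zeros(len(K))
    for _ in range(iters):
        grad = 2.0 - 2.0 * (K @ x)
        x = np.maximum(x + step * grad, 0.0)         # project back onto x >= 0
    return float(2.0 * x.sum() - x @ K @ x), x

# 5-cycle adjacency matrix
n = 5
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0

omega_c5, x_opt = omega(ls_kernel(A))
```

For the 5-cycle this gives ω(K) = √5, which coincides with ϑ of the 5-cycle; in general the two values need not agree, which is exactly the question the slide raises.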


Using the SVM-ϑ Theory

Vinay Jethava, Ph.D. 2014

Solving classic combinatorial optimization problems on graphs: maxcut, max-k-cut, graph colouring, replacing SDP by SVM.

Finding planted cliques.

Integrative analysis of networks: finding common dense subgraphs in multiple graphs using multiple kernel learning.


Google's word2vec: Words and Contexts

A context $c_{w_i}$ of an occurrence of the word $w_i$ in a text $w_1, \dots, w_i, \dots, w_N$ is a small window of words $w_{i-t}, \dots, w_{i-1}, w_{i+1}, \dots, w_{i+t}$ occurring around $w_i$.

word2vec assigns vectors $u_w, u_c$ to words and contexts using a simple logistic regression model based on the dot product $u_w^\top u_c$.

Completely unsupervised method that scales to very large corpora, e.g. Wikipedia.

Words that occur in the same kinds of contexts will get assigned similar vectors.

Surprising ability to capture semantic information (not very well understood).
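
The two ingredients named above, context windows and a logistic model on the dot product, can be sketched in a few lines. This is a simplified illustration and not the word2vec implementation: the function names and the example sentence tokens are assumptions, and training (how the vectors are fitted) is left out entirely.

```python
import math

def context_windows(tokens, t):
    """For each position i, collect the up-to-t words on either side of tokens[i]."""
    windows = []
    for i, word in enumerate(tokens):
        left = tokens[max(0, i - t):i]
        right = tokens[i + 1:i + 1 + t]
        windows.append((word, left + right))
    return windows

def pair_probability(u_w, u_c):
    """Logistic model: sigmoid of the dot product of a word and a context vector."""
    dot = sum(a * b for a, b in zip(u_w, u_c))
    return 1.0 / (1.0 + math.exp(-dot))

windows = context_windows(["the", "bass", "line", "is", "weak"], t=1)
prob_orthogonal = pair_probability([1.0, 0.0], [0.0, 1.0])   # dot product 0
prob_aligned = pair_probability([2.0, 0.0], [2.0, 0.0])      # dot product 4
```

Orthogonal vectors give probability 0.5, and vectors with a large positive dot product push the probability towards 1, which is how co-occurring words and contexts end up with aligned vectors.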


word2vec


Extending word2vec

Multiple vectors corresponding to different senses of a word.

graph2vec uses same idea for embeddings of graph structured data.


Applications

Language Technology

(VR Framework Project “Culturomics”, 2013-17, 17 M SEK)

Business Intelligence

(SSF Data Intensive Systems, 2012-16, 25 M SEK)

Computational Biology

(VR project 2011-15)


Word Sense Disambiguation (WSD)

Fundamental problem in language technology:

I went fishing for some sea bass.

The bass line of the song is too weak.


Using word vectors for WSD

(Fredrik Johansson, Mikael Kageback, Richard Johansson, 2014, in progress)

Train multiple vectors corresponding to different senses based on the different contexts (also represented by vectors).

Cluster the contexts in a non-parametric, data-driven fashion.

Use clusters to assign senses, possibly using semantic networks such as WordNet or BabelNet.
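
One way to picture "non-parametric" clustering of context vectors is a greedy scheme in which the number of clusters is not fixed in advance: a context joins the closest existing cluster if it is similar enough, and otherwise opens a new one. The sketch below is only that illustration. The slides do not say which clustering method is used, and the function name, the cosine threshold, and the toy 2-d vectors are all assumptions.

```python
import numpy as np

def cluster_contexts(context_vectors, threshold=0.5):
    """Greedy online clustering by cosine similarity to running centroids."""
    centroids, counts, labels = [], [], []
    for v in context_vectors:
        v = np.asarray(v, dtype=float)
        v = v / np.linalg.norm(v)
        sims = [float(v @ (c / np.linalg.norm(c))) for c in centroids]
        if sims and max(sims) >= threshold:
            k = int(np.argmax(sims))
            # update the running mean of the chosen cluster
            centroids[k] = (centroids[k] * counts[k] + v) / (counts[k] + 1)
            counts[k] += 1
        else:
            k = len(centroids)            # open a new cluster (a new candidate sense)
            centroids.append(v.copy())
            counts.append(1)
        labels.append(k)
    return labels

# Toy context vectors: two "fish"-like contexts, two "music"-like contexts.
contexts = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.0, 0.9]]
labels = cluster_contexts(contexts)
```

Each resulting cluster would then correspond to one candidate sense of the ambiguous word, for example the two uses of "bass".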


Multi-document Summarization

Fundamental problem in information extraction:


word2vec for Multi-document Summarization

(Olof Mögren, Mikael Kageback and Vinay Jethava 2014)

Measure similarity between sentences using similarity of vectors for their constituent words.

Use this similarity in submodular optimization of coherence and diversity of summary.

Can combine with other similarities using multiple kernel learning (MKL).

Tool available for use at Findwise Labs.
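
The pipeline described above can be sketched roughly as follows. Two choices here are assumptions and not taken from the slides: a sentence is represented by the mean of its word vectors, and the submodular objective is a facility-location coverage function maximised greedily. The word vectors are toy 2-d values made up for the example, and every sentence is assumed to contain at least one known word.

```python
import numpy as np

def sentence_vector(sentence, word_vectors):
    """Mean of the vectors of the known words in the sentence."""
    vecs = [word_vectors[w] for w in sentence.split() if w in word_vectors]
    return np.mean(vecs, axis=0)

def cosine_matrix(X):
    """Pairwise cosine similarities between the rows of X."""
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    return X @ X.T

def greedy_summary(S, budget):
    """Greedily maximise the coverage sum_i max_{j in summary} S[i, j]."""
    n = len(S)
    chosen, covered = [], np.zeros(n)
    for _ in range(budget):
        gains = [np.maximum(covered, S[j]).sum() - covered.sum()
                 if j not in chosen else -np.inf for j in range(n)]
        best = int(np.argmax(gains))
        chosen.append(best)
        covered = np.maximum(covered, S[best])
    return chosen

word_vectors = {                       # hypothetical toy embeddings
    "fish": np.array([1.0, 0.0]), "bass": np.array([0.9, 0.1]),
    "song": np.array([0.0, 1.0]), "music": np.array([0.1, 0.9]),
}
sentences = ["fish bass", "bass fish", "song music"]
X = np.array([sentence_vector(s, word_vectors) for s in sentences])
summary = greedy_summary(cosine_matrix(X), budget=2)
```

The greedy step skips the second, redundant sentence and picks one sentence per topic, which is the diversity effect a submodular objective is meant to provide.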


Entity Disambiguation (ED)

“Chris Anderson” (TED) vs. “Chris Anderson” (WIRED)


[Figure: graphs G1 and G2 (nodes v1, v2) within G, compared by K(G1, G2)]


Comparing graphs

Graph kernels compare how similar two graphs are

$K : \mathcal{G} \times \mathcal{G} \to \mathbb{R}$

Defined using features extracted from subgraphs
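
As a minimal illustration of a kernel built from subgraph features, the sketch below counts three small patterns (edges, paths on three nodes, triangles) and takes the dot product of the count vectors. The choice of features and the function names are assumptions made for the example, not one of the kernels discussed in the talk.

```python
import numpy as np

def subgraph_features(A):
    """Counts of edges, 2-paths (wedges) and triangles from an adjacency matrix."""
    A = np.asarray(A, dtype=float)
    degrees = A.sum(axis=1)
    edges = degrees.sum() / 2.0
    wedges = (degrees * (degrees - 1.0) / 2.0).sum()   # pairs of edges sharing a node
    triangles = np.trace(A @ A @ A) / 6.0              # each triangle is counted 6 times
    return np.array([edges, wedges, triangles])

def count_kernel(A1, A2):
    """K(G1, G2) = <phi(G1), phi(G2)>: a dot product of feature vectors."""
    return float(subgraph_features(A1) @ subgraph_features(A2))

triangle = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
path = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])

k_cross = count_kernel(triangle, path)
k_self = count_kernel(triangle, triangle)
```

Because the kernel is an explicit inner product of feature vectors, it is symmetric and positive semidefinite by construction.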


Graph kernels

Existing graph kernels

Random Walks (Gärtner et al., '03)

Shortest paths (Borgwardt & Kriegel, '05)

Graphlets (Shervashidze, Mehlhorn et al., '09)

· · ·

“Local” subgraphs cannot capture global properties:

girth: length of the shortest cycle

chromatic number χ(G)

maxcut


Kernels based on ϑ embedding

[Figure, three panels: (a) ϑ(G), labelling vectors u1, ..., u5 with handle c; (b) subgraph labelling $U_{G|B}$; (c) Lovász value $\vartheta_B$, with handle d]

Fredrik Johansson, Lic. 2014

$$K(G^{(1)}, G^{(2)}) = \sum_{(B_1, B_2),\; |B_1| = |B_2|} k(\vartheta_{B_1}, \vartheta_{B_2})$$

where $B_i$ is a subgraph of $G^{(i)}$.
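
The structure of this sum can be sketched independently of how the values $\vartheta_B$ are obtained. The code below takes, for each graph, a list of (subset size, value) pairs and sums a base kernel over all pairs of equal size. The Gaussian base kernel, its width, the function names, and the toy values are assumptions for illustration only.

```python
import math
from collections import defaultdict

def base_kernel(a, b, width=1.0):
    """Gaussian base kernel k on scalar values (an assumed choice)."""
    return math.exp(-((a - b) ** 2) / (2.0 * width ** 2))

def subset_value_kernel(values1, values2, width=1.0):
    """Sum k(v1, v2) over all pairs of subsets with equal size.
    Each input is a list of (subset_size, value) pairs."""
    by_size = defaultdict(list)
    for size, value in values2:
        by_size[size].append(value)
    total = 0.0
    for size, v1 in values1:
        for v2 in by_size[size]:       # only subsets of matching size are compared
            total += base_kernel(v1, v2, width)
    return total

# Made-up values for a handful of subsets of two graphs.
values_g1 = [(1, 1.0), (2, 1.5)]
values_g2 = [(1, 1.0), (2, 1.5), (2, 2.0)]
k12 = subset_value_kernel(values_g1, values_g2)
k21 = subset_value_kernel(values_g2, values_g1)
```

The number of subsets grows exponentially with the graph size, so the lists would in practice hold only a sample of them.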


Computing Efficiently

Problems

The Lovász-ϑ kernel involves a sum over all subgraphs: too expensive!

Computing ϑ(G) has complexity $O(n^5)$, where $n$ is the number of nodes. The optimization is a semidefinite program.

Solutions (Johansson et al., 2014)

Random sampling! $O(n \log n / \epsilon^2)$ samples are enough.

Use the SVM-ϑ approximation.


Classifying Graphs with Planted Structures

[Figure: (a) random graph; (b) graph with planted clique]

Theorem (F. Johansson, ICML 2014)

There is a linear separator that separates, with high probability, $G(n, p)$ graphs from $G(n, p, k)$ graphs, for large enough $k = 2t\sqrt{\frac{n(1-p)}{p}}$, with margin

$$\gamma \ge O(t) - o(\sqrt{n})$$
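
The two graph families in the theorem can be generated as follows. The sketch reads G(n, p, k) as a G(n, p) graph with a clique planted on k randomly chosen nodes, which is what the slide's figure suggests. The function names, the rounding of k, and the seed are assumptions.

```python
import math
import numpy as np

def clique_size(n, p, t):
    """k = 2 t sqrt(n (1 - p) / p), rounded up to an integer."""
    return math.ceil(2.0 * t * math.sqrt(n * (1.0 - p) / p))

def random_graph(n, p, rng):
    """Adjacency matrix of G(n, p): each edge is present independently with probability p."""
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return (upper | upper.T).astype(int)

def planted_clique_graph(n, p, k, rng):
    """G(n, p) with all edges added among k randomly chosen nodes."""
    A = random_graph(n, p, rng)
    clique = rng.choice(n, size=k, replace=False)
    A[np.ix_(clique, clique)] = 1
    A[clique, clique] = 0              # keep the diagonal empty
    return A, clique

rng = np.random.default_rng(0)
n, p, t = 100, 0.5, 1.0
k = clique_size(n, p, t)
A_plain = random_graph(n, p, rng)
A_planted, clique = planted_clique_graph(n, p, k, rng)
```

Pairs of graphs generated this way, labelled by family, are the kind of input a graph kernel classifier would be asked to separate.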


Results on graph kernel benchmarks

Table: Average classification accuracy (%) on benchmark datasets. The best result in each column (bold on the slide) is marked with *. SP: shortest paths, GL: graphlets, RW: random walks.

Kernel   PTC     MUTAG   ENZYME   NCIA
SP       63.0    87.2    30.5     67.3*
GL       63.1    83.5    26.7     62.9
RW       60.6    85.6    21.2     63.1
Lo-ϑ     64.3*   86.2    26.5     65.2
SVM-ϑ    63.8    87.8*   33.5*    62.7


Entity Disambiguation (ED)

[Figure: graphs G1 and G2 (nodes v1, v2) within G, compared by K(G1, G2)]

CIKM 2013 paper and tool implemented in Recorded Future pipeline.


Integrative Analysis/Data Fusion

How to combine multiple sources of evidence?

How to combine different data type measurements, e.g. RNA-seq, microarray, copy number variation ...?

How to find what is common and what is different across multiple networks?


Common dense subgraphs in multiple graphs

⇒ Integrative analysis of microarray gene expression datasets

Detecting motifs

Identifying functional groups

(Vinay Jethava et al., NIPS 2013, JMLR 2014)


Multiple kernel learning

Kernel: similarity matrix between objects of a certain type.

MKL: data fusion of multiple sources of information.
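
The fusion step can be pictured as a non-negative weighted combination of kernel matrices, which is again a valid kernel. In the sketch below the weights are fixed by hand, which is an assumption: actual MKL methods learn the weights from data. The function name and the toy 2×2 matrices are hypothetical.

```python
import numpy as np

def combine_kernels(kernels, weights):
    """Convex combination of kernel matrices; stays positive semidefinite
    because the weights are non-negative."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        raise ValueError("kernel weights must be non-negative")
    weights = weights / weights.sum()
    return sum(w * K for w, K in zip(weights, kernels))

# Two similarity matrices over the same two objects, from different data sources.
K_source1 = np.array([[1.0, 0.8], [0.8, 1.0]])
K_source2 = np.array([[1.0, 0.2], [0.2, 1.0]])

K_fused = combine_kernels([K_source1, K_source2], weights=[1.0, 1.0])
min_eigenvalue = float(np.linalg.eigvalsh(K_fused)[0])
```

The fused matrix can be passed to any kernel method in place of a single-source kernel.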


Summary

Geometric representations are a unifying framework for learning problems from diverse contexts.

www.cs.chalmers.se/research/lab
