A Sparsification Approach for Temporal Graphical Model Decomposition

A Sparsification Approach for Temporal Graphical Model Decomposition Ning Ruan Kent State University Joint work with Ruoming Jin (KSU), Victor Lee (KSU) and Kun Huang (OSU)

Transcript of A Sparsification Approach for Temporal Graphical Model Decomposition

Page 1: A Sparsification Approach for Temporal Graphical Model Decomposition

A Sparsification Approach for Temporal Graphical Model Decomposition

Ning Ruan Kent State University

Joint work with Ruoming Jin (KSU), Victor Lee (KSU) and Kun Huang (OSU)

Page 2: A Sparsification Approach for Temporal Graphical Model Decomposition

Motivation: Financial Markets

Page 3: A Sparsification Approach for Temporal Graphical Model Decomposition

Motivation: Biological Systems

[Figure panels: Microarray time series profile (axis: Fluorescence Counts); Protein-Protein Interaction]

Page 4: A Sparsification Approach for Temporal Graphical Model Decomposition

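Vector Autoregression

• Univariate Autoregression is self-regression for a time series

$$X(t) = \sum_{u=1}^{T} \phi(u)\, X(t-u) + \epsilon(t)$$

• VAR is the multivariate extension of autoregression

$$X(t) = \sum_{u=1}^{T} \Phi(u)\, X(t-u) + \epsilon(t)$$

Time series data, one row per series and one column per time point t = 0, 1, 2, 3, 4, ..., T:

$$\begin{matrix}
x_1(0) & x_1(1) & x_1(2) & x_1(3) & x_1(4) & \cdots & x_1(T) \\
x_2(0) & x_2(1) & x_2(2) & x_2(3) & x_2(4) & \cdots & x_2(T) \\
x_3(0) & x_3(1) & x_3(2) & x_3(3) & x_3(4) & \cdots & x_3(T) \\
\vdots & & & & & & \vdots \\
x_m(0) & x_m(1) & x_m(2) & x_m(3) & x_m(4) & \cdots & x_m(T)
\end{matrix}$$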
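The VAR recursion on this slide can be sketched in a few lines of NumPy. This is only an illustration under assumptions: the function names `var_predict` and `var_simulate`, the 2-series coefficient matrix `phi1`, and the noise settings are hypothetical and do not come from the talk.

```python
import numpy as np

def var_predict(history, phis):
    """One-step VAR forecast: sum over lags u of Phi(u) @ X(t-u).

    history: array of shape (t, N), rows ordered oldest to newest.
    phis: list of T coefficient matrices, phis[u-1] = Phi(u), each (N, N).
    """
    pred = np.zeros(history.shape[1])
    for u, phi in enumerate(phis, start=1):
        pred += phi @ history[-u]
    return pred

def var_simulate(phis, x_init, steps, noise_std=0.0, seed=0):
    """Roll the VAR forward from x_init (one row per initial time point)."""
    rng = np.random.default_rng(seed)
    series = [np.asarray(row, dtype=float) for row in x_init]
    for _ in range(steps):
        eps = rng.normal(0.0, noise_std, size=series[0].shape)
        series.append(var_predict(np.array(series), phis) + eps)
    return np.array(series)

# Hypothetical lag-1 model with two series: series 0 influences series 1.
phi1 = np.array([[0.5, 0.0],
                 [0.4, 0.3]])
X = var_simulate([phi1], x_init=[[1.0, 0.0]], steps=3)
```

With `noise_std=0.0` the simulation is deterministic, which makes it easy to check each step by hand against the recursion.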

Page 5: A Sparsification Approach for Temporal Graphical Model Decomposition

Granger Causality

• Goal: reveal causal relationship between two univariate time series.

– Y is Granger causal for X at time t if $X_{t-1}$ and $Y_{t-1}$ together are a better predictor for $X_t$ than $X_{t-1}$ alone.

– i.e., compare the magnitude of error ε(t) vs. ε′(t)

$$X(t) = \sum_{u=1}^{T} \left[ \alpha_u X(t-u) + \beta_u Y(t-u) \right] + \epsilon(t)$$

vs.

$$X(t) = \sum_{u=1}^{T} \left[ \alpha_u X(t-u) \right] + \epsilon'(t)$$
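The comparison of the two regressions can be illustrated with ordinary least squares. This is a rough sketch, not the procedure used in the talk: the helper names, the synthetic data, and the use of the residual sum of squares as the error magnitude are all assumptions, and a real Granger test would add a statistical significance test on top.

```python
import numpy as np

def lagged_design(series_list, lags):
    """Stack lags 1..lags of each series as columns; rows are times lags..end."""
    n = len(series_list[0])
    cols = [s[lags - u : n - u] for s in series_list for u in range(1, lags + 1)]
    return np.column_stack(cols)

def residual_sum_squares(target, predictors, lags):
    """Least-squares fit of target(t) on lagged predictors; returns squared error."""
    A = lagged_design(predictors, lags)
    b = target[lags:]
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return float(np.sum((b - A @ coef) ** 2))

# Hypothetical data in which x depends on the previous value of y.
rng = np.random.default_rng(0)
y = rng.normal(size=200)
x = np.zeros(200)
for t in range(1, 200):
    x[t] = 0.3 * x[t - 1] + 0.8 * y[t - 1] + 0.1 * rng.normal()

err_restricted = residual_sum_squares(x, [x], lags=1)  # past of X only
err_full = residual_sum_squares(x, [x, y], lags=1)     # past of X and Y
```

A much smaller `err_full` than `err_restricted` is the pattern one would expect when Y is Granger causal for X.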

Page 6: A Sparsification Approach for Temporal Graphical Model Decomposition

Temporal Graphical Modeling

• Recover the causal structure among a group of relevant time series

[Figure: time series X1 to X8 and the corresponding temporal graphical model, a graph over the nodes X1 to X8 with an edge labeled Φ12]

Page 7: A Sparsification Approach for Temporal Graphical Model Decomposition

The Problem

• Given a temporal graphical model, can we decompose it to get a simpler global view of the interactions among relevant time series?

How to interpret these causal relationships?

Page 8: A Sparsification Approach for Temporal Graphical Model Decomposition

Extra Benefit

• Clustering based on similarity

• Consider time series clustering from a new perspective!

[Figure: time series X1 to X8, their grouping under similarity-based clustering, and the temporal graphical model over X1 to X8]

Page 9: A Sparsification Approach for Temporal Graphical Model Decomposition

Clustered Regression Coefficient Matrix

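• Vector Autoregression Model

$$X(t) = \sum_{u=1}^{T} \Phi(u)\, X(t-u) + \epsilon(t)$$

– Φ(u) is an NxN coefficient matrix

• Clustered Regression Coefficient Matrix

$$\Phi(u) = \begin{pmatrix}
\Phi_1(u) & 0 & \cdots & 0 \\
0 & \Phi_2(u) & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & \Phi_K(u)
\end{pmatrix}$$

– Each diagonal block $\Phi_k(u)$ is a submatrix

1) if $\Phi(u)_{ij} \neq 0$, then time series i and j are in the same cluster

2) if time series i and j are not in the same cluster, then $\Phi(u)_{ij} = 0$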
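The two conditions on the clustered coefficient matrix amount to a simple mask test. The sketch below assumes cluster assignments are given as a list of integer labels; the names `cluster_mask` and `is_clustered` and the 4x4 example matrix are hypothetical.

```python
import numpy as np

def cluster_mask(labels):
    """N x N boolean mask: True where series i and j share a cluster."""
    labels = np.asarray(labels)
    return labels[:, None] == labels[None, :]

def is_clustered(phi, labels, tol=0.0):
    """True if every entry of phi linking two different clusters is (near) zero."""
    return bool(np.all(np.abs(phi[~cluster_mask(labels)]) <= tol))

# Hypothetical example: series 0 and 1 form one cluster, series 2 and 3 another.
labels = [0, 0, 1, 1]
phi = np.array([[0.5, 0.2, 0.0, 0.0],
                [0.1, 0.4, 0.0, 0.0],
                [0.0, 0.0, 0.3, 0.6],
                [0.0, 0.0, 0.7, 0.2]])
```

Multiplying a dense coefficient matrix elementwise by `cluster_mask(labels)` zeroes every cross-cluster entry, which is one way to picture what the decomposition sparsifies.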

Page 10: A Sparsification Approach for Temporal Graphical Model Decomposition

Temporal Graphical Model Decomposition Cost

• Goal: preserve prediction accuracy while reducing representation cost

• Given a temporal graphical model, the cost for model decomposition is

$$\text{cost} = \underbrace{\sum_{t=1}^{L} \Big\| X(t) - \sum_{u=1}^{T} \Phi(u)\, X(t-u) \Big\|^2}_{\text{prediction error}} + \underbrace{\lambda \sum_{u=1}^{T} \|\Phi(u)\|^2}_{\text{L2 penalty}}$$

• Problem

– Tend to group all time series into one cluster

Page 11: A Sparsification Approach for Temporal Graphical Model Decomposition

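Refined Cost for Decomposition

• Balance size of clusters

– C is NxK membership matrix

$$\mathrm{tr}(C^T C) = \sum_{k} \sum_{i} (C_{ik})^2$$

– Example (rows are time series such as X2, columns are clusters such as C1):

$$C = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

• Overall cost is the sum of three parts

$$\text{cost} = \underbrace{\sum_{t=1}^{L} \Big\| X(t) - \sum_{u=1}^{T} \Phi(u)\, X(t-u) \Big\|^2}_{\text{prediction error}} + \underbrace{\lambda \sum_{u=1}^{T} \|\Phi(u)\|^2}_{\text{L2 penalty}} + \underbrace{\gamma\, \mathrm{tr}(C^T C)}_{\text{size constraint}}$$

• Optimal Decomposition Problem

– Find a cluster membership matrix C and its regression coefficient matrix Φ such that the cost for decomposition is minimal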
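To make the three parts of the cost concrete, here is a small sketch that evaluates them for given data. It rests on assumptions: the weights `lam` and `gamma` are hypothetical names for the penalty weights, the size term is taken to be tr(CᵀC), the L2 penalty is summed over all lags, and the sum over time starts at the first point with a full lag history.

```python
import numpy as np

def decomposition_cost(X, phis, C, lam=1.0, gamma=1.0):
    """Prediction error + L2 penalty + size term for a VAR decomposition.

    X: array (n_times, N); phis: list of (N, N) matrices, phis[u-1] = Phi(u);
    C: (N, K) membership matrix. lam and gamma are assumed weights.
    """
    X = np.asarray(X, dtype=float)
    C = np.asarray(C, dtype=float)
    T = len(phis)
    pred_err = 0.0
    for t in range(T, X.shape[0]):
        pred = sum(phi @ X[t - u] for u, phi in enumerate(phis, start=1))
        pred_err += np.sum((X[t] - pred) ** 2)
    l2_penalty = lam * sum(np.sum(phi ** 2) for phi in phis)
    size_term = gamma * np.trace(C.T @ C)
    return float(pred_err + l2_penalty + size_term)

# Hypothetical two-series example with one lag and one series per cluster.
phi1 = np.array([[0.5, 0.0],
                 [0.4, 0.3]])
X = np.array([[1.0, 0.0],
              [0.5, 0.4]])
C = np.eye(2)
cost = decomposition_cost(X, [phi1], C)
```

In this example the second row of `X` is predicted exactly by `phi1`, so the cost consists only of the penalty and size terms.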

Page 12: A Sparsification Approach for Temporal Graphical Model Decomposition

Hardness of Decomposition Problem

• Combined integer (membership matrix) and numerical (regression coefficient matrix) optimization problem

• Large number of unknown variables

– NxK variables in membership matrix

– NxN variables in regression coefficient matrix

Page 13: A Sparsification Approach for Temporal Graphical Model Decomposition

Basic Idea for Iterative Optimization Algorithm

• Relax binary membership matrix C to probabilistic membership matrix P

• Optimize membership matrix while fixing regression coefficient matrix

• Optimize regression coefficient matrix while fixing membership matrix

• Alternate the two optimization steps iteratively to reach a locally optimal solution

Page 14: A Sparsification Approach for Temporal Graphical Model Decomposition

Overview of Iterative Optimization Algorithm

[Flowchart]

Time Series Data

Temporal Graphical Model

Step 1: Optimize cluster membership matrix (Quasi-Newton Method)

Step 2: Optimize regression coefficient matrix (Generalized ridge regression)

Page 15: A Sparsification Approach for Temporal Graphical Model Decomposition

Step 1: Optimize Membership Matrix

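• Apply Lagrange multiplier method:

$$F = \text{cost}(P) + \sum_{i} \mu_i \Big( \sum_{k} p(k \mid i) - 1 \Big)$$

• Quasi-Newton method

– Approximate Hessian matrix by iteratively updating

$$\begin{pmatrix} P^{(n+1)} \\ \mu^{(n+1)} \end{pmatrix} = \begin{pmatrix} P^{(n)} \\ \mu^{(n)} \end{pmatrix} - H^{(n)}\, \nabla F\big(P^{(n)}, \mu^{(n)}\big)$$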

Page 16: A Sparsification Approach for Temporal Graphical Model Decomposition

Step 2: Optimize Regression Coefficient Matrix

• Decompose cost function into N subfunctions

$$\text{cost} = \sum_{i=1}^{N} F_i + \text{constant}$$

• Generalized Ridge Regression

$$F_i = \sum_{k} \big( y_k - X_k \Phi_i^T \big)^T \big( y_k - X_k \Phi_i^T \big) + \Phi_i M_i \Phi_i^T$$

– $y_k$ is a vector that depends on P and X (length L)

– $X_k$ is a matrix that depends on P and X (size LxN)

– k=1: traditional ridge regression
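A generalized ridge subproblem of this shape has a closed-form minimizer when the penalty matrix is symmetric and the normal-equation matrix is invertible: setting the gradient to zero gives (Σ_k X_kᵀX_k + M) φ = Σ_k X_kᵀy_k. The sketch below assumes exactly that form; how the talk builds y_k, X_k and the penalty matrix from P and X is not shown here, so the inputs are random placeholders.

```python
import numpy as np

def ridge_objective(phi, Xs, ys, M):
    """Sum of squared residuals over all (X_k, y_k) plus the quadratic penalty."""
    fit = sum(np.sum((yk - Xk @ phi) ** 2) for Xk, yk in zip(Xs, ys))
    return float(fit + phi @ M @ phi)

def generalized_ridge(Xs, ys, M):
    """Minimize ridge_objective in closed form (M assumed symmetric)."""
    A = sum(Xk.T @ Xk for Xk in Xs) + M
    b = sum(Xk.T @ yk for Xk, yk in zip(Xs, ys))
    return np.linalg.solve(A, b)

# Placeholder inputs: two blocks of length L=20 over N=3 series.
rng = np.random.default_rng(0)
Xs = [rng.normal(size=(20, 3)) for _ in range(2)]
ys = [rng.normal(size=20) for _ in range(2)]
M = np.diag([0.5, 1.0, 2.0])
phi_i = generalized_ridge(Xs, ys, M)
```

With a single block and `M` equal to a multiple of the identity, the same solve reduces to traditional ridge regression.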

Page 17: A Sparsification Approach for Temporal Graphical Model Decomposition

Complexity Analysis

• Step 1 is the computational bottleneck of the entire algorithm

• Update Hessian matrix: takes $O(k(NK+N)^2)$

[Figure: matrices with dimension labels NxK+N and NxK, and a sparse example matrix]

• Compute coefficient matrix: $O(R_i N^3)$

Page 18: A Sparsification Approach for Temporal Graphical Model Decomposition

Basic Idea for Scalable Approach

• Utilize variable dependence relationship to optimize each variable (or a small number of variables) independently, assuming other relationships are fixed

• Convert the problem to a Maximum Weight Independent Set (MWIS) problem
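For readers unfamiliar with MWIS: given vertex weights and edges, the task is to pick vertices with no edge between them so that the total weight is as large as possible. The sketch below is a generic greedy heuristic on a toy graph, shown only to illustrate the problem. It is not the conversion or the solver used in the talk, and the vertex names and weights are made up.

```python
def greedy_mwis(weights, edges):
    """Greedy heuristic: take the heaviest free vertex, then block its neighbors."""
    neighbors = {v: set() for v in weights}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    chosen, blocked = [], set()
    for v in sorted(weights, key=weights.get, reverse=True):
        if v not in blocked:
            chosen.append(v)
            blocked.add(v)
            blocked |= neighbors[v]
    return chosen

# Toy graph: an edge marks two vertices that cannot be selected together.
weights = {"a": 3, "b": 2, "c": 2, "d": 1}
edges = [("a", "b"), ("a", "c"), ("c", "d")]
selected = greedy_mwis(weights, edges)
```

The greedy rule gives no optimality guarantee in general; it is used here only because it is short and easy to follow.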

Page 19: A Sparsification Approach for Temporal Graphical Model Decomposition

Experiments: Synthetic Data

• Synthetic data generator

– Generate community-based graph as underlying temporal graphical model [Girvan and Newman 05]

– Assign random weights to graphical model and generate time series data using recursive matrix multiplication [Arnold et al. 07]

• Decomposition Accuracy

– Find a matching between clustering results and ground-truth clusters such that the number of intersected variables is maximal

– The number of intersected variables over total number of variables is decomposition accuracy
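The decomposition accuracy described above can be computed by trying every one-to-one matching between predicted and ground-truth clusters and keeping the best overlap. The sketch below does this by brute force, which is only practical for a handful of clusters; the function name and the label-list input format are assumptions.

```python
from itertools import permutations

def decomposition_accuracy(predicted, truth):
    """Best one-to-one cluster matching: matched variables / total variables."""
    pairs = list(zip(predicted, truth))
    pred_ids = sorted(set(predicted))
    true_ids = sorted(set(truth))
    overlap = {(p, g): 0 for p in pred_ids for g in true_ids}
    for p, g in pairs:
        overlap[(p, g)] += 1
    # Put the side with fewer clusters first so permutations() covers all matchings.
    if len(pred_ids) <= len(true_ids):
        small, large = pred_ids, true_ids
        key = lambda s, l: (s, l)
    else:
        small, large = true_ids, pred_ids
        key = lambda s, l: (l, s)
    best = max(
        sum(overlap[key(s, l)] for s, l in zip(small, choice))
        for choice in permutations(large, len(small))
    )
    return best / len(pairs)

# Cluster labels are arbitrary, so a relabeled but identical clustering scores 1.0.
acc = decomposition_accuracy([0, 0, 1, 1, 2, 2], [1, 1, 0, 0, 2, 2])
```

For larger numbers of clusters the same matching could be found with an assignment-problem solver instead of enumerating permutations.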

Page 20: A Sparsification Approach for Temporal Graphical Model Decomposition

Experiments: Synthetic Data (cont.)

• Applied algorithms

– Iterative optimization algorithm based on Quasi-Newton method (newton)

– Iterative optimization algorithm based on MWIS method (mwis)

– Benchmark 1: Pearson correlation test to generate temporal graphical model, and Ncut [Shi00] for clustering (Cor_Ncut)

– Benchmark 2: directed spectral clustering [Zhou05] on ground-truth temporal graphical model (Dcut)

Page 21: A Sparsification Approach for Temporal Graphical Model Decomposition

Experimental Results: Synthetic

• On average, newton is better than Cor_Ncut and Dcut by 27% and 32%, respectively

• On average, mwis is better than Cor_Ncut and Dcut by 24% and 29%, respectively

Page 22: A Sparsification Approach for Temporal Graphical Model Decomposition

Experimental Results: Synthetic

mwis is better than Cor_Ncut by an average of 30%

mwis is better than Dcut by an average of 52%

Page 23: A Sparsification Approach for Temporal Graphical Model Decomposition

Experiment: Real Data

• Data

– Annual GDP growth rate (downloaded from http://www.ers.usda.gov/Data/Macroeconomics)

– 192 countries

• 4 Time periods

– 1969-1979

– 1980-1989

– 1990-1999

– 1998-2007

• Hierarchically bipartition into 6 or 7 clusters

Page 24: A Sparsification Approach for Temporal Graphical Model Decomposition

Experimental Result: Real Data

Page 25: A Sparsification Approach for Temporal Graphical Model Decomposition

Summary

• We formulate a novel objective function for the decomposition problem in temporal graphical modeling.

• We introduce an iterative optimization approach utilizing the Quasi-Newton method and generalized ridge regression.

• We employ a maximum weight independent set based approach to speed up the Quasi-Newton method.

• The experimental results demonstrate the effectiveness and efficiency of our approaches.

Page 26: A Sparsification Approach for Temporal Graphical Model Decomposition

Thank you