MACH: Fast Randomized Tensor Decompositions


Transcript of MACH: Fast Randomized Tensor Decompositions

Charalampos (Babis) E. Tsourakakis


SIAM Data Mining Conference, April 30, 2010

Outline
- Introduction: Why Tensors? Tensor Decompositions
- Our Motivation
- Proposed Method
- Experimental Results: Case Study I (Intemon), Case Study II (Intel Berkeley Lab)
- Conclusion

[Figure: Intel Berkeley lab sensor traces of temperature, light, voltage, and humidity (value vs. time in minutes).]

[Figure: data cube with time and location axes.]

Data is modeled as a tensor, i.e., a multidimensional matrix: T (timeticks) x (#sensors) x (#types of measurements).
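As an illustration (not from the slides; the sizes and variable names are hypothetical), such a tensor can be built directly in numpy:

```python
import numpy as np

# Hypothetical sizes: T timeticks, 54 sensors, 4 measurement types.
T, n_sensors, n_types = 10080, 54, 4

# One entry per (timetick, sensor, measurement type); random data
# stands in for the real sensor readings.
X = np.random.rand(T, n_sensors, n_types)
print(X.shape)  # (10080, 54, 4)
```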


Observation: multi-aspect data can be modeled in this way.


- Time mode
- Sensor mode
- Measurement type mode


A 5-mode tensor: voxels x subjects x trials x task conditions x timeticks

Functional Magnetic Resonance Imaging (fMRI)

Tensors naturally model numerous real-world datasets. And now what?

Tensor Decompositions


$A_{m \times n} = \sigma_1 u_1 v_1^\top + \sigma_2 u_2 v_2^\top + \sigma_3 u_3 v_3^\top + \cdots$

Singular value decomposition (SVD): the “Swiss army knife” of matrix decompositions (O’Leary).

[Figure: SVD of a document-to-term matrix, $A = U \Sigma V^\top$: $U$ maps documents to document hidden concepts (HCs), $\Sigma$ holds the strength of each concept, and $V^\top$ maps terms (data, graph, java, brain, lung) to term HCs, here a CS concept and an MD concept.]
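For concreteness, here is a minimal numpy sketch (not from the slides) of the rank-k truncated SVD that this picture depicts:

```python
import numpy as np

def truncated_svd(A, k):
    # Rank-k approximation: A ~ sum over the k leading terms
    # sigma_i * u_i * v_i^T of the SVD expansion above.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

A = np.random.rand(100, 50)
A2 = truncated_svd(A, k=3)
print(np.linalg.norm(A - A2))  # Frobenius-norm approximation error
```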

Two families of algorithms extend the SVD to the multilinear setting: PARAFAC/CANDECOMP decompositions and the Tucker decomposition.


Kolda and Bader, “Tensor Decompositions and Applications”, SIAM Review.


Tucker is an SVD-like decomposition of a tensor, with one projection matrix per mode and a core tensor.
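The slides give no code; as an illustrative sketch, the truncated HOSVD (one way to compute a Tucker decomposition) fits in a few lines of numpy. The helper names `unfold`, `mode_dot`, and `hosvd` are mine, not from the paper:

```python
import numpy as np

def unfold(X, n):
    # Mode-n unfolding: move mode n to the front, flatten the rest.
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1)

def mode_dot(X, U, n):
    # Mode-n product: multiply the mode-n unfolding by U, then refold.
    Y = U @ unfold(X, n)
    shape = [U.shape[0]] + [X.shape[i] for i in range(X.ndim) if i != n]
    return np.moveaxis(Y.reshape(shape), 0, n)

def hosvd(X, ranks):
    # One projection matrix per mode (leading left singular vectors of
    # each unfolding), plus a core tensor G.
    Us = [np.linalg.svd(unfold(X, n), full_matrices=False)[0][:, :r]
          for n, r in enumerate(ranks)]
    G = X
    for n, U in enumerate(Us):
        G = mode_dot(G, U.T, n)
    return G, Us

# Reconstruct and measure the relative error of a rank-(5, 3, 4) fit.
X = np.random.rand(20, 30, 40)
G, Us = hosvd(X, ranks=(5, 3, 4))
Xhat = G
for n, U in enumerate(Us):
    Xhat = mode_dot(Xhat, U, n)
print(np.linalg.norm(X - Xhat) / np.linalg.norm(X))
```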

J. Sun showed that Tucker decompositions can be used to extract useful knowledge from monitoring systems.

Our Motivation

Most real-world processes result in sparse tensors. However, there are important processes which result in dense tensors:


Physical process                                             Non-zero entries
Sensor network (sensor x measurement type x timeticks)       85%
Computer network (machine x measurement type x timeticks)    81%

It can be either very slow or impossible to perform a Tucker decomposition on a dense tensor due to memory constraints.


Given that (low-rank) Tucker decompositions are valuable in practice, can we “trade” a “little bit” of accuracy for efficiency?

Proposed Method


MACH extends the work of Achlioptas and McSherry on fast low-rank matrix approximation (STOC 2001) to the multilinear setting.

- Toss a coin for each non-zero entry with probability p.
- If the entry “survives”, reweight it by 1/p; if not, set it to zero (see the sketch below).
- Perform Tucker on the sparsified tensor!

For the theoretical results and more details, see the MACH paper.
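A minimal sketch of the sparsification step, assuming a dense numpy tensor (the function name is mine, not from the paper); reweighting survivors by 1/p keeps the sparsified tensor unbiased in expectation:

```python
import numpy as np

def mach_sparsify(X, p, rng=None):
    # Keep each entry independently with probability p and reweight
    # survivors by 1/p; zero entries stay zero either way.
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.random(X.shape) < p
    return np.where(mask, X / p, 0.0)

X = np.random.rand(100, 12, 10080)   # machines x types x timeticks
Xs = mach_sparsify(X, p=0.1)
print((Xs != 0).mean())              # roughly 0.1 of the entries survive
```

One then runs the usual Tucker routine on Xs instead of X.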


Experimental Results: Case Study I (Intemon)

Intemon: a prototype monitoring and mining system for data centers, developed at Carnegie Mellon University.

Tensor X: 100 machines x 12 types of measurement x 10080 timeticks.


For p = 0.1 we obtain a Pearson’s correlation coefficient of 0.99 (ideal: ρ = 1).
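Such a check is straightforward to reproduce (an illustrative sketch; the vectors here are hypothetical stand-ins for matching columns of the exact and MACH time-mode projection matrices):

```python
import numpy as np

# Stand-ins for matching components of the exact and MACH decompositions.
u_exact = np.random.rand(10080)
u_mach = u_exact + 0.01 * np.random.randn(10080)

# np.corrcoef gives the Pearson correlation matrix; take |r| because
# singular vectors are only defined up to sign.
r = np.corrcoef(u_exact, u_mach)[0, 1]
print(abs(r))  # close to 1 when MACH preserves the component
```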


[Figure: exact vs. MACH components, side by side. Find the differences!]

The qualitative analysis that is important for our goals remains the same!

Case Study II (Intel Berkeley Lab)

Intel Berkeley Lab: tensor of 54 sensors x 4 types of measurement x 5385 timeticks.


The qualitative analysis that is important for our goals remains the same!


The spatial principal mode is also preserved, and Pearson’s correlation coefficient is again almost 1!

[Figure: exact vs. MACH spatial components.]


Remarks:
1) Daily periodicity is apparent.
2) Pearson’s correlation coefficient with the exact component is 0.99.

Conclusion

- Randomized algorithms for tensors.
- Smallest p* for tensor sparsification under the HOOI algorithm.
- Randomized algorithms work very well (e.g., sublinear-time algorithms), but are typically hard to analyze.


Remark: even though our theoretical results refer to HOSVD, MACH works for HOOI as well.


[Figure: canonical decomposition (CANDECOMP/PARAFAC) vs. Tucker decomposition.]