Charalampos (Babis) E. Tsourakakis
MACH: Fast Randomized Tensor Decompositions
SIAM Data Mining Conference (SDM 2010), April 30th, 2010
Outline
Introduction: Why Tensors? Tensor Decompositions
Our Motivation
Proposed Method
Experimental Results: Case study I (Intemon), Case study II (Intel Berkeley Lab)
Conclusion
Intel Berkeley lab
[Figure: time series of the temperature, light, voltage, and humidity measurements (value vs. time in minutes) recorded by the Intel Berkeley lab sensors]
Data modeled as a tensor, i.e., a multidimensional matrix: T (#timeticks) x (#sensors) x (#types of measurements). The three modes are the time mode, the sensor mode, and the measurement-type mode.
Observation: multi-aspect data can be modeled in this way.
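As a minimal sketch of this modeling step (the sizes below are hypothetical stand-ins, and numpy is assumed; real entries would come from the sensor logs), a 3-mode sensor tensor can be built like this:

```python
import numpy as np

# Hypothetical dimensions: time mode x sensor mode x measurement-type mode.
T, n_sensors, n_types = 1000, 54, 4

rng = np.random.default_rng(0)
X = rng.random((T, n_sensors, n_types))  # synthetic stand-in for real readings

# Fixing the last index gives one (time x sensor) matrix per measurement type,
# e.g. all temperature readings:
temperature = X[:, :, 0]
print(X.shape, temperature.shape)
```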
Functional Magnetic Resonance Imaging (fMRI): a 5-mode tensor, voxels x subjects x trials x task conditions x timeticks.
Tensors naturally model numerous real-world datasets. And now what?
Tensor Decompositions
Singular value decomposition (SVD): the "Swiss army knife" of matrix decompositions (O'Leary).
A (m x n) = σ1 u1 v1^T + σ2 u2 v2^T + σ3 u3 v3^T + …
[Figure: SVD of a document-to-term matrix, factored into documents-to-hidden-concepts, the strength of each concept (singular values), and terms-to-hidden-concepts. Terms like "data", "graph", "java" load on a CS concept; "brain", "lung" load on an MD concept.]
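A minimal numpy sketch of the rank-k truncated SVD (the matrix here is a random placeholder, not data from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((50, 30))  # e.g., a document-to-term matrix

# A = sum_i sigma_i * u_i * v_i^T
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep only the k strongest hidden concepts.
k = 5
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# By Eckart-Young, A_k is the best rank-k approximation in Frobenius norm,
# and the error equals the norm of the discarded singular values.
err = np.linalg.norm(A - A_k)
print(err, np.sqrt(np.sum(s[k:] ** 2)))
```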
Two families of algorithms extend the SVD to the multilinear setting: PARAFAC/CANDECOMP decompositions and the Tucker decomposition.
Kolda and Bader, "Tensor Decompositions and Applications", SIAM Review.
Tucker is an SVD-like decomposition of a tensor, with one projection matrix per mode and a core tensor.
J. Sun showed that Tucker decompositions can be used to extract useful knowledge from monitoring systems.
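A minimal sketch of one way to compute such a decomposition, the truncated HOSVD: one SVD of the mode-n unfolding per mode, then the core is the tensor projected onto those bases. The function names and sizes are my own illustration, with numpy assumed:

```python
import numpy as np

def unfold(X, mode):
    """Mode-n unfolding: the given mode becomes the rows, the rest are flattened."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def hosvd(X, ranks):
    """Truncated HOSVD: one projection matrix per mode plus a core tensor."""
    # Per-mode bases: top-r left singular vectors of each unfolding.
    U = [np.linalg.svd(unfold(X, n), full_matrices=False)[0][:, :r]
         for n, r in enumerate(ranks)]
    # Core tensor: multiply X by U_n^T along every mode n.
    core = X
    for n, Un in enumerate(U):
        core = np.moveaxis(np.tensordot(Un.T, np.moveaxis(core, n, 0), axes=1), 0, n)
    return core, U

rng = np.random.default_rng(0)
X = rng.random((8, 6, 5))
core, U = hosvd(X, (3, 3, 3))
print(core.shape)  # (3, 3, 3)
```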
Our Motivation
Most real-world processes result in sparse tensors. However, there exist important processes which result in dense tensors:

Physical process                                            Non-zero entries
Sensor network (sensor x measurement type x timeticks)      85%
Computer network (machine x measurement type x timeticks)   81%

It can be either very slow or impossible to perform a Tucker decomposition on a dense tensor due to memory constraints.
Given that (low-rank) Tucker decompositions are valuable in practice, can we trade a little bit of accuracy for efficiency?
Proposed Method
MACH extends the work of Achlioptas and McSherry on fast low-rank matrix approximations (STOC 2001) to the multilinear setting.
Toss a coin for each non-zero entry with probability p. If it "survives", reweigh it by 1/p; if not, set it to zero. Then perform Tucker on the sparsified tensor! For the theoretical results and more details, see the MACH paper.
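The coin-tossing step above can be sketched in a few lines of numpy (the function name and tensor sizes are my own; the reweighing by 1/p makes each entry of the sparsified tensor an unbiased estimate of the original):

```python
import numpy as np

def mach_sparsify(X, p, seed=None):
    """Keep each non-zero entry with probability p and reweigh survivors
    by 1/p, so that E[Xs] = X entrywise."""
    rng = np.random.default_rng(seed)
    keep = rng.random(X.shape) < p
    return np.where(keep & (X != 0), X / p, 0.0)

rng = np.random.default_rng(1)
X = rng.random((54, 4, 500))   # dense, sensor-like tensor
Xs = mach_sparsify(X, p=0.1, seed=2)

print((Xs == 0).mean())                            # roughly 0.9 of entries dropped
print(np.allclose(Xs[Xs != 0] * 0.1, X[Xs != 0]))  # survivors reweighed by 1/p
```

Any off-the-shelf Tucker routine can then be run on `Xs` instead of `X`.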
Experimental Results. Case study I: Intemon
Intemon: a prototype monitoring and mining system for data centers, developed at Carnegie Mellon University.
Tensor X: 100 machines x 12 types of measurement x 10080 timeticks.
For p = 0.1, Pearson's correlation coefficient between the exact and the MACH components is 0.99 (ideal: ρ = 1).
[Figure: exact vs. MACH components, side by side. Find the differences!]
The qualitative analysis which is important for our goals remains the same!
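The comparison metric is straightforward to compute; here is a small sketch with hypothetical "exact" and "MACH" temporal components (synthetic signals, not the paper's data):

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation coefficient between two component vectors."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical components: the MACH one is the exact one plus small noise.
t = np.linspace(0, 4 * np.pi, 200)
exact = np.sin(t)
mach = exact + 0.05 * np.random.default_rng(0).standard_normal(200)
print(pearson(exact, mach))   # close to the ideal rho = 1
```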
Case study II: Intel Berkeley Lab
Intel Berkeley Lab
Tensor: 54 sensors x 4 types of measurement x 5385 timeticks.
The qualitative analysis which is important for our goals remains the same!
The spatial principal mode is also preserved, and Pearson's correlation coefficient is again almost 1!
[Figure: exact vs. MACH spatial components]
Remarks: 1) Daily periodicity is apparent. 2) Pearson's correlation coefficient with the exact component is 0.99.
Conclusion
Future work: randomized algorithms for tensors, and finding the smallest sparsification probability p* for the HOOI algorithm. Randomized algorithms work very well in practice (e.g., sublinear-time algorithms), but are typically hard to analyze.
Remark: even though our theoretical results refer to HOSVD, MACH also works for HOOI.