Transcript of "On the extension of trace norm to tensors" (ttic.uchicago.edu/~ryotat/talks/trace10.pdf)
On the extension of trace norm to tensors
Ryota Tomioka1, Kohei Hayashi2, Hisashi Kashima1
1 The University of Tokyo, 2 Nara Institute of Science and Technology
2010/12/10
NIPS2010 Workshop: Tensors, Kernels, and Machine Learning
Convex low-rank tensor completion
[Figure: Tucker decomposition of a Sensors × Time × Features tensor of size I1 × I2 × I3 into a core tensor (interactions) of size r1 × r2 × r3 and factor matrices (loadings) of sizes I1 × r1, I2 × r2, I3 × r3.]
Conventional formulation (nonconvex)
(×k: mode-k product; Ω: observation mask)

$$\min_{C,\,U_1,\,U_2,\,U_3}\; \|\Omega \circledast (Y - C \times_1 U_1 \times_2 U_2 \times_3 U_3)\|_F^2 + \text{regularization}$$

$$\min_{X}\; \|\Omega \circledast (Y - X)\|_F^2 \quad \text{s.t.}\quad \operatorname{rank}(X) \le (r_1, r_2, r_3)$$

• Alternate minimization
• Have to fix the rank beforehand
Our approach
• Matrix: estimation of a low-rank matrix (hard) → trace norm minimization (tractable) [Fazel, Hindi, Boyd 01]
• Tensor: estimation of a low-rank tensor (hard), with rank defined in the sense of the Tucker decomposition → extended trace norm minimization (tractable)
• The tensor case is obtained as a generalization of the matrix case.
Trace norm regularization (for matrices)
X ∈ ℝ^{R×C}, m = min(R, C). The trace norm is the linear sum of the singular values:

$$\|X\|_{\mathrm{tr}} = \sum_{j=1}^{m} \sigma_j(X)$$

• Roughly speaking, L1 regularization on the singular values.
• Stronger regularization → more zero singular values → low rank.
• Not obvious for tensors (there are no singular values for tensors).
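As a quick illustration, the trace norm can be computed directly from an SVD; a minimal sketch in NumPy (the function name `trace_norm` is introduced here, not from the slides):

```python
import numpy as np

def trace_norm(X):
    # Trace norm (nuclear norm): sum of the singular values of X.
    return np.linalg.svd(X, compute_uv=False).sum()

# For a rank-1 matrix u v^T, the single singular value is ||u|| ||v||.
u, v = np.ones(4), np.ones(3)
X = np.outer(u, v)                               # rank 1
print(np.isclose(trace_norm(X), 2 * np.sqrt(3)))  # → True  (||u|| = 2, ||v|| = sqrt(3))
```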
Mode-k unfolding (matricization)
[Figure: mode-1 unfolding X(1), of size I1 × (I2 I3), and mode-2 unfolding X(2), of size I2 × (I3 I1), of an I1 × I2 × I3 tensor.]
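Mode-k unfolding can be sketched in a few lines of NumPy; this assumes the common Kolda & Bader ordering of the columns (earlier remaining modes vary fastest), and the helper name `unfold` is introduced here:

```python
import numpy as np

def unfold(X, k):
    # Mode-k unfolding: mode k becomes the rows; the remaining modes
    # are flattened with earlier modes varying fastest (order='F').
    return np.reshape(np.moveaxis(X, k, 0), (X.shape[k], -1), order='F')

X = np.arange(24).reshape(2, 3, 4)  # an I1 x I2 x I3 = 2 x 3 x 4 tensor
print(unfold(X, 0).shape)  # → (2, 12)
print(unfold(X, 1).shape)  # → (3, 8)
print(unfold(X, 2).shape)  # → (4, 6)
```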
Elementary facts about Tucker decomposition
$$X = C \times_1 U_1 \times_2 U_2 \times_3 U_3$$

[Figure: core C of size r1 × r2 × r3; factors U1 (I1 × r1), U2 (I2 × r2), U3 (I3 × r3).]

The unfoldings factor as:

$$X_{(1)} = U_1\, C_{(1)}\, (U_3 \otimes U_2)^\top \qquad \operatorname{rank} \le r_1$$
$$X_{(2)} = U_2\, C_{(2)}\, (U_1 \otimes U_3)^\top \qquad \operatorname{rank} \le r_2$$
$$X_{(3)} = U_3\, C_{(3)}\, (U_2 \otimes U_1)^\top \qquad \operatorname{rank} \le r_3$$
The rank of X(k) is no more than the rank of C(k)
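The rank bound is easy to check numerically; a sketch assuming the Kolda & Bader unfolding convention, with helper names (`unfold`, `mode_product`) introduced here:

```python
import numpy as np

rng = np.random.default_rng(0)

def unfold(X, k):
    return np.reshape(np.moveaxis(X, k, 0), (X.shape[k], -1), order='F')

def mode_product(X, U, k):
    # Mode-k product X ×_k U: contract the columns of U against mode k of X.
    return np.moveaxis(np.tensordot(U, X, axes=(1, k)), 0, k)

I, r = (10, 11, 12), (3, 4, 5)
C = rng.standard_normal(r)                                # core, r1 x r2 x r3
Us = [rng.standard_normal((I[k], r[k])) for k in range(3)]  # factors, Ik x rk

X = C
for k in range(3):
    X = mode_product(X, Us[k], k)

# rank(X_(k)) <= r_k; for generic random factors equality holds.
print([np.linalg.matrix_rank(unfold(X, k)) for k in range(3)])  # → [3, 4, 5]
```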
What it means
• We can use the trace norm of an unfolding of a tensor X to learn a low-rank X.

Tensor X is low rank (∃k: rk < Ik) ⟺ the unfolding X(k) is a low-rank matrix (matricization one way, tensorization the other).
Approach 1: As a matrix
• Pick a mode k, and hope that the tensor to be learned is low rank in mode k.

$$\min_{X \in \mathbb{R}^{I_1 \times \cdots \times I_K}}\; \frac{1}{2\lambda} \|\Omega \circledast (Y - X)\|_F^2 + \|X_{(k)}\|_{\mathrm{tr}}$$

Pro: basically a matrix problem → theoretical guarantee (Candès & Recht 09).
Con: you have to be lucky to pick the right mode.
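Trace-norm problems like this are typically solved with proximal methods whose core step is singular-value soft-thresholding. The slides' actual solver is in the technical report (arXiv:1010.0789); the sketch below only shows that generic building block, with the name `svt` introduced here:

```python
import numpy as np

def svt(M, tau):
    # Proximal operator of tau * ||.||_tr: soft-threshold the singular values.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 4))
# Thresholding above the largest singular value collapses M to zero:
print(np.allclose(svt(M, tau=np.linalg.norm(M, 2) + 1.0), 0.0))  # → True
```

Applying `svt` to the unfolding X(k) and folding back is what pushes the estimate toward low rank in mode k.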
Approach 2: Constrained optimization
• Constrain X so that each unfolding is simultaneously low rank.

$$\min_{X \in \mathbb{R}^{I_1 \times \cdots \times I_K}}\; \frac{1}{2\lambda} \|\Omega \circledast (Y - X)\|_F^2 + \sum_{k=1}^{K} \gamma_k \|X_{(k)}\|_{\mathrm{tr}}$$

γk: tuning parameter, usually set to 1.

Pro: jointly regularizes every mode.
Con: a strong constraint.
See also Signoretto et al., 2010.
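The penalty term of this approach is straightforward to evaluate; a minimal sketch, with helper names introduced here (and γk defaulting to 1 as on the slide):

```python
import numpy as np

def unfold(X, k):
    return np.reshape(np.moveaxis(X, k, 0), (X.shape[k], -1), order='F')

def overlapped_trace_norm(X, gammas=None):
    # Sum over modes of gamma_k * trace norm of the mode-k unfolding.
    K = X.ndim
    if gammas is None:
        gammas = [1.0] * K   # gamma_k usually set to 1
    return sum(gammas[k] * np.linalg.svd(unfold(X, k), compute_uv=False).sum()
               for k in range(K))

# For a rank-1 tensor a∘b∘c, every unfolding has the single singular
# value ||a|| ||b|| ||c||, so the penalty is K times that value.
a, b, c = np.ones(2), np.ones(3), np.ones(4)
X = np.einsum('i,j,k->ijk', a, b, c)
print(np.isclose(overlapped_trace_norm(X), 3 * np.sqrt(24)))  # → True
```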
Approach 3: Mixture of low-rank tensors
• Each mixture component Zk is regularized to be low rank only in mode k.

$$\min_{Z_1, \ldots, Z_K}\; \frac{1}{2\lambda} \Big\|\Omega \circledast \Big(Y - \sum_{k=1}^{K} Z_k\Big)\Big\|_F^2 + \sum_{k=1}^{K} \gamma_k \|Z_{k(k)}\|_{\mathrm{tr}}$$

Pro: each Zk takes care of one mode.
Con: the sum is not low-rank.
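The "Con" is easy to see numerically: components that are rank 1 in their own modes can sum to a tensor that is not low rank in any mode. A small sketch (helper name `unfold` introduced here, Kolda & Bader convention assumed):

```python
import numpy as np

rng = np.random.default_rng(0)

def unfold(X, k):
    return np.reshape(np.moveaxis(X, k, 0), (X.shape[k], -1), order='F')

# Z1 is rank 1 in mode 1 only; Z2 is rank 1 in mode 2 only.
Z1 = np.einsum('i,jk->ijk', rng.standard_normal(6), rng.standard_normal((7, 8)))
Z2 = np.einsum('j,ik->ijk', rng.standard_normal(7), rng.standard_normal((6, 8)))
assert np.linalg.matrix_rank(unfold(Z1, 0)) == 1
assert np.linalg.matrix_rank(unfold(Z2, 1)) == 1

S = Z1 + Z2
# Generically the sum is far from rank 1 in every mode:
print([np.linalg.matrix_rank(unfold(S, k)) for k in range(3)])
```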
Numerical experiment
• True tensor: size 50×50×20, rank 7×8×9. No noise (λ = 0).
• Random train/test split.

[Figure: generalization error vs. fraction of observed elements for As a Matrix (modes 1, 2, 3), Constraint, Mixture, Tucker (large), and Tucker (exact), with the optimization tolerance shown for reference. Tucker = EM algorithm (nonconvex).]
Computation time
• The convex formulations are also fast.

[Figure: computation time (s) vs. fraction of observed elements for As a Matrix, Constraint, Mixture, Tucker (large), and Tucker (exact).]
Phase transition behaviour
[Figure: fraction of observed elements required for perfect reconstruction vs. sum of true ranks, for true rank triples such as [7 8 9], [10 12 13], [16 6 8], [20 20 5], [20 3 2], [30 12 10], [40 20 6], [40 3 2], [5 4 3], etc.]

• Sum of true ranks = min(r1, r2 r3) + min(r2, r3 r1) + min(r3, r1 r2)
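The x-axis quantity is a one-liner to compute; for instance, for the triple [7 8 9] used in the earlier experiment (function name introduced here):

```python
def sum_of_true_ranks(r1, r2, r3):
    # min(r1, r2*r3) + min(r2, r3*r1) + min(r3, r1*r2)
    return min(r1, r2 * r3) + min(r2, r3 * r1) + min(r3, r1 * r2)

print(sum_of_true_ranks(7, 8, 9))   # → 24
print(sum_of_true_ranks(20, 3, 2))  # → 11  (the min() caps r1 at r2*r3 = 6)
```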
Summary
• Low-rank tensor completion can be computed as a convex optimization problem using the trace norms of the unfoldings.
 - No need to specify the rank beforehand.
• The convex formulations are more accurate and faster than the conventional EM-based Tucker decomposition.
• A curious "phase transition" was found → a compressive-sensing-type analysis is ongoing work.
• Technical report: arXiv:1010.0789 (including the optimization).
• Code:
 - http://www.ibis.t.u-tokyo.ac.jp/RyotaTomioka/Softwares/Tensor
Acknowledgment
• This work was supported in part by MEXT KAKENHI 22700138 and 22700289, and by NTT Communication Science Laboratories.