Nonnegative Matrix Factorization:
Complexity, Algorithms and Applications

Nicolas Gillis†
Advisor: François Glineur, Université catholique de Louvain
†Université de Mons, Department of Mathematics and Operational Research, Belgium

Householder XIX Nonnegative Matrix Factorization: Complexity, Algorithms and Applications 1
Nonnegative Matrix Factorization (NMF)

Given a matrix M ∈ R^{p×n}_+ and a factorization rank r ≪ min(p, n), find U ∈ R^{p×r} and V ∈ R^{r×n} such that

    min_{U ≥ 0, V ≥ 0} ||M − UV||_F^2 = Σ_{i,j} (M − UV)_{ij}^2.    (NMF)

NMF is a linear dimensionality reduction technique for nonnegative data:

    M(:, j) ≈ Σ_{k=1}^r U(:, k) V(k, j)  for all j,

where M(:, j) ≥ 0, U(:, k) ≥ 0 and V(k, j) ≥ 0.

Why nonnegativity?
→ Interpretability: nonnegativity constraints lead to easily interpretable factors (and a sparse, parts-based representation).
→ Many applications: image processing, text mining, recommender systems, community detection, clustering, etc.
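To make the model concrete, here is a minimal NumPy sketch that fits (NMF) with the classical multiplicative updates of Lee and Seung, one standard heuristic chosen here for brevity (it is not the algorithm emphasized in this talk, and like all NMF methods it only reaches a stationary point):

```python
import numpy as np

def nmf_mu(M, r, iters=500, seed=0):
    """Lee-Seung multiplicative updates for min ||M - U V||_F^2
    with U, V >= 0; each update keeps the factors nonnegative and
    never increases the objective."""
    rng = np.random.default_rng(seed)
    p, n = M.shape
    U = rng.random((p, r))
    V = rng.random((r, n))
    eps = 1e-12  # guards against division by zero
    for _ in range(iters):
        V *= (U.T @ M) / (U.T @ U @ V + eps)
        U *= (M @ V.T) / (U @ V @ V.T + eps)
    return U, V

# A 3-by-3 nonnegative matrix that admits an exact rank-2 NMF:
M = np.array([[1., 0.], [0., 1.], [1., 1.]]) @ np.array([[1., 2., 0.], [0., 1., 3.]])
U, V = nmf_mu(M, r=2)
err = np.linalg.norm(M - U @ V, 'fro')
```

On this small example the residual becomes negligible, and both factors stay entrywise nonnegative by construction.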
Example: blind hyperspectral unmixing
Figure: Urban hyperspectral image with 162 spectral bands and 307-by-307 pixels.
Example: blind hyperspectral unmixing
- Basis elements make it possible to recover the different materials;
- Weights indicate which pixels contain which materials.
Example: blind hyperspectral unmixing
Figure: Decomposition of the Urban dataset.
Outline

1. Algorithms and Applications
   - General framework for NMF algorithms
   - Solving NMF sequentially with underapproximations
2. Complexity and Bounds
   - Nonnegative rank
   - Rank-one subproblems, weights and missing data
How can we 'solve' NMF problems?

Given a matrix M ∈ R^{m×n}_+ and a factorization rank r ∈ N:

    min_{U ∈ R^{m×r}_+, V ∈ R^{r×n}_+} ||M − UV||_F^2 = Σ_{i,j} (M − UV)_{ij}^2.    (NMF)

0. Initialize (U, V). Then, alternately update U and V:
1. Update V ≈ argmin_{X ≥ 0} ||M − UX||_F^2.    (NNLS)
2. Update U ≈ argmin_{Y ≥ 0} ||M − YV||_F^2.    (NNLS)

HALS algorithm: use block-coordinate descent on the NNLS subproblems
→ closed-form solutions for the columns of U and the rows of V:

    U*_{:k} = argmin_{U_{:k} ≥ 0} ||R_k − U_{:k} V_{k:}||_F^2 = max(0, R_k V_{k:}^T / ||V_{k:}||_2^2)  for all k,

where R_k := M − Σ_{j≠k} U_{:j} V_{j:}, and similarly for the rows of V.

HALS can be accelerated by ordering the updates efficiently.

G., Glineur, Accelerated Multiplicative Updates and Hierarchical ALS Algorithms for Nonnegative Matrix Factorization, Neural Computation, 2012.
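The closed-form column and row updates above translate almost directly into NumPy. The following is a naive sketch: it recomputes the full residual R_k for every k, whereas the accelerated variants in the cited paper avoid this and reorder the updates.

```python
import numpy as np

def hals(M, r, iters=200, seed=0):
    """HALS: block-coordinate descent over the columns of U and the
    rows of V, each updated in closed form as
        U[:, k] = max(0, R_k V[k, :]^T) / ||V[k, :]||_2^2."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    U = rng.random((m, r))
    V = rng.random((r, n))
    for _ in range(iters):
        for k in range(r):
            # residual excluding the k-th rank-one factor
            Rk = M - U @ V + np.outer(U[:, k], V[k, :])
            U[:, k] = np.maximum(0, Rk @ V[k, :]) / max(V[k, :] @ V[k, :], 1e-12)
            V[k, :] = np.maximum(0, U[:, k] @ Rk) / max(U[:, k] @ U[:, k], 1e-12)
    return U, V

# A matrix with an exact rank-2 NMF: HALS recovers it to high accuracy.
M = np.array([[1., 0.], [0., 1.], [1., 1.]]) @ np.array([[1., 2., 0.], [0., 1., 3.]])
U, V = hals(M, r=2)
rel_err = np.linalg.norm(M - U @ V) / np.linalg.norm(M)
```

Updating V[k, :] right after U[:, k] with the same R_k is valid because R_k does not involve the k-th factor.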
Drawbacks of standard NMF Algorithms
1. The optimal solution is, in most cases, non-unique, and the problem is ill-posed. Many variants of NMF impose additional constraints (e.g., sparsity, smoothness, spatial information).

2. In practice, it is difficult to choose the factorization rank (in general, by trial and error, or by estimation using the SVD).

A possible way to overcome these drawbacks is to use underapproximation constraints to solve NMF sequentially.
Nonnegative Matrix Underapproximation (NMU)

It is possible to solve NMF sequentially, solving at each step

    min_{u ≥ 0, v ≥ 0} ||M − u v^T||_F^2 such that u v^T ≤ M ⟺ M − u v^T ≥ 0.

NMU is yet another linear dimensionality reduction technique. However,
- Like PCA/SVD, it is sequential and well-posed.
- Like NMF, it leads to a separation by parts. Moreover, the additional underapproximation constraints enhance this property.
- In the presence of pure pixels, the NMU recursion is able to detect the materials individually.

G., Glineur, Using Underapproximations for Sparse Nonnegative Matrix Factorization, Pattern Recognition, 2010.
G., Plemmons, Dimensionality Reduction, Classification, and Spectral Mixture Analysis using Nonnegative Underapproximation, Optical Engineering, 2011.
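One simple way to attack a single NMU step, sketched below under the assumption M ≥ 0 (the cited papers use Lagrangian-based methods instead): for fixed v, the objective separates over the entries of u, and the constraint u_i v_j ≤ M_ij becomes the box u_i ≤ min_j M_ij / v_j, so each alternating subproblem is solved exactly by clipping the least-squares value to the box.

```python
import numpy as np

def nmu_rank_one(M, iters=200, seed=0):
    """One NMU step: min ||M - u v^T||_F^2 s.t. u, v >= 0, u v^T <= M.
    Alternating exact updates: for fixed v, the optimal u is the
    unconstrained least-squares value M v / ||v||^2 clipped entrywise
    to [0, min_j M[i, j] / v[j]]; symmetrically for v."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    u = rng.random(m)
    v = rng.random(n)
    eps = 1e-12
    for _ in range(iters):
        pos = v > eps
        cap_u = np.min(M[:, pos] / v[pos], axis=1) if pos.any() else np.inf
        u = np.clip(M @ v / max(v @ v, eps), 0.0, cap_u)
        pos = u > eps
        cap_v = np.min(M[pos, :] / u[pos, None], axis=0) if pos.any() else np.inf
        v = np.clip(M.T @ u / max(u @ u, eps), 0.0, cap_v)
    return u, v

# Exact rank-one nonnegative matrix: the underapproximation is exact.
M = np.outer([1., 2., 3.], [2., 1., 4.])
u, v = nmu_rank_one(M)
residual = M - np.outer(u, v)  # nonnegative by the constraint
```

The full NMU recursion would then apply the same step to the nonnegative residual M − u v^T.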
NMU on the Urban dataset
Identifying Lighting Orientations with NMU

A static scene is illuminated from many directions.

NMU is able to detect the different lighting orientations.

NMU has also been successfully used in text mining for anomaly detection, and in image processing for segmentation of medical images.
Outline

1. Algorithms and Applications
   - General framework for NMF algorithms
   - Solving NMF sequentially with underapproximations
2. Complexity and Bounds
   - Nonnegative rank
   - Rank-one subproblems, weights and missing data
What is the nonnegative rank?

The nonnegative rank of a nonnegative matrix M ∈ R^{m×n}_+ is the minimum number r of nonnegative rank-one factors needed to reconstruct M:

    M = Σ_{i=1}^r u_i v_i^T,  u_i ∈ R^m_+, v_i ∈ R^n_+,

that is, the minimum r such that an exact NMF exists:

    M = UV = Σ_{i=1}^r U_{:i} V_{i:},  U ≥ 0, V ≥ 0.

The nonnegative rank of M is denoted rank_+(M). Clearly,

    rank(M) ≤ rank_+(M) ≤ min(m, n).
An amazing result

Let P be a polytope

    P = {x ∈ R^k | b_i − A(i, :) x ≥ 0 for 1 ≤ i ≤ m},

and let the v_j's (1 ≤ j ≤ n) be its vertices.

We define the m-by-n slack matrix S_P of P as follows:

    S_P(i, j) = b_i − A(i, :) v_j ≥ 0,  for 1 ≤ i ≤ m, 1 ≤ j ≤ n.

An extended formulation of P is a higher-dimensional polyhedron Q ⊆ R^{k+p} that (linearly) projects onto P. The minimum number of facets of such a polytope is called the extension complexity xp(P) of P.

Theorem (Yannakakis, 1991). rank_+(S_P) = xp(P).

Remark. Closely related problems arise in communication complexity, probability, and graph theory.
The Hexagon

    S_P = | 0 1 2 2 1 0 |
          | 0 0 1 2 2 1 |
          | 1 0 0 1 2 2 |
          | 2 1 0 0 1 2 |
          | 2 2 1 0 0 1 |
          | 1 2 2 1 0 0 |

        = | 1 0 0 1/2  0  |
          | 0 1 0  1   0  |   | 0 1 2 1 0 0 |
          | 0 0 1 1/2  0  |   | 0 0 1 0 0 1 |
          | 0 0 1  0  1/2 | × | 1 0 0 0 1 2 |
          | 0 1 0  0   1  |   | 0 0 0 2 2 0 |
          | 1 0 0  0  1/2 |   | 2 2 0 0 0 0 |

with rank(S_P) = 3 ≤ rank_+(S_P) = 5 ≤ min(m, n) = 6.
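The hexagon factorization can be checked numerically, confirming that five nonnegative rank-one factors reproduce the slack matrix exactly while its ordinary rank is only 3:

```python
import numpy as np

# Slack matrix of the regular hexagon and the rank-5 nonnegative
# factorization from the slide, verified numerically.
S = np.array([
    [0, 1, 2, 2, 1, 0],
    [0, 0, 1, 2, 2, 1],
    [1, 0, 0, 1, 2, 2],
    [2, 1, 0, 0, 1, 2],
    [2, 2, 1, 0, 0, 1],
    [1, 2, 2, 1, 0, 0],
], dtype=float)
U = np.array([
    [1, 0, 0, 0.5, 0],
    [0, 1, 0, 1.0, 0],
    [0, 0, 1, 0.5, 0],
    [0, 0, 1, 0, 0.5],
    [0, 1, 0, 0, 1.0],
    [1, 0, 0, 0, 0.5],
])
V = np.array([
    [0, 1, 2, 1, 0, 0],
    [0, 0, 1, 0, 0, 1],
    [1, 0, 0, 0, 1, 2],
    [0, 0, 0, 2, 2, 0],
    [2, 2, 0, 0, 0, 0],
], dtype=float)

exact = np.allclose(U @ V, S)   # the factorization is exact
r = np.linalg.matrix_rank(S)    # ordinary rank: 3
```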
Lower Bounds for the Nonnegative Rank

It is difficult to compute the nonnegative rank (NP-hard). However, it is possible to compute lower bounds efficiently. For example, the number of vertices of P satisfies

    n = # vertices(P) ≤ 2^{rank_+(S_P)}.

Goemans, Smallest compact formulation for the permutahedron, ISMP 2009.
Beasley and Laffey, Real rank versus nonnegative rank, LAA 2009.
G., Glineur, On the geometric interpretation of the nonnegative rank, LAA 2012.
What is the complexity of NMF?

Given a matrix M ∈ R^{m×n}_+ and a factorization rank r ∈ N:

    min_{U ∈ R^{m×r}_+, V ∈ R^{r×n}_+} ||M − UV||_F^2 = Σ_{i,j} (M − UV)_{ij}^2.    (NMF)

- For r = 1: 'easy' [Perron-Frobenius and Eckart-Young theorems].
- For rank(M) = 2: rank_+(M) = 2, 'easy'.
- r part of the input: NP-hard (Vavasis, 2009).
- rank_+(M) = r not part of the input: polynomial, in O((mn)^{r^2}) operations (Arora et al. 2012, Moitra 2013).
- rank(M) = k not part of the input: open problem.
What about rank-one problems in NMF?

Solving NMF requires finding r nonnegative rank-one factors U_{:k} V_{k:}, each having to satisfy the following equality as well as possible:

    U_{:k} V_{k:} ≈ M − Σ_{j≠k} U_{:j} V_{j:} := R_k  for all k,

where the residual R_k is in general not nonnegative. These subproblems have the following form,

    min_{u ∈ R^m, v ∈ R^n} ||R − u v^T||_F^2 such that u ≥ 0, v ≥ 0,    (R1NF)

called rank-one nonnegative factorization.

Is this problem difficult?
- If R ≥ 0: no.
- If R ≱ 0: surprisingly, yes (NP-hard).

G., Glineur, A Continuous Characterization of the Maximum-Edge Biclique Problem, JOGO '12.
What about weights and missing data?

In some cases, some entries are missing/unknown.

For example, we would like to predict how much someone is going to like a movie based on their movie preferences (e.g., ratings from 1 to 5 stars):

                 Users
            2 3 2 ? ?
            ? 1 ? 3 2
   Movies   1 ? 4 1 ?
            5 4 ? 3 2
            ? 1 2 ? 4
            1 ? 3 4 3

Huge potential in electronic commerce sites (movies, books, music, ...). Good recommendations increase the propensity of a purchase.
Low-rank model for recommendation systems

The behavior of users is modeled using linear combinations of 'feature' users (related to age, sex, culture, etc.):

    M(:, j) ≈ Σ_{k=1}^r U(:, k) V(k, j),
    (user j)          (feature user k) (weights)

Or, equivalently, movie ratings are modeled as linear combinations of 'feature' movies (related to the genres: child-oriented, serious vs. escapist, thriller, romantic, actors, etc.):

    M(i, :) ≈ Σ_{k=1}^r U(i, k) V(k, :).
    (movie i)          (weights) (genre k)
Example

        | 2 3 2 ? ? |     |  0.5  0.6 −0.1 |
        | ? 1 ? 3 2 |     |  0.8 −0.2 −0.3 |   | 1.7 2.1 3.7 5   4.1 |
    M = | 1 ? 4 1 ? |  ≈  |  0.8 −0.7  0.6 | × | 2.2 3.2 0.8 5   0.5 | = UV
        | 5 4 ? 3 2 |     | −2    2.3  1.8 |   | 2   0.6 2.6 0.9 5   |
        | ? 1 2 ? 4 |     | −0.2  0.3  0.9 |
        | 1 ? 3 4 3 |     |  1   −0.2 −0.2 |

             | 2    2.9  2.1  5.4 1.9 |
             | 0.3  0.9  2    2.7 1.7 |
        UV = | 1   −0.2  4    1   5.9 |
             | 5.3  4.2 −0.9  3.1 2   |
             | 2.1  1.1  1.8  1.3 2.8 |
             | 0.9  1.3  3    3.8 3   |
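A completion of this kind can be computed with alternating least squares on the observed entries, a common heuristic (note that, as the complexity result below shows, no efficient method can be guaranteed to find a global optimum in general). A minimal sketch:

```python
import numpy as np

def weighted_als(M, W, r, iters=100, lam=1e-3, seed=0):
    """Heuristic for min sum_ij W_ij (M - U V)_ij^2: each column of V
    and each row of U solves a small ridge-regularized least-squares
    problem restricted to its observed entries (W[i, j] = 1)."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    U = rng.standard_normal((m, r))
    V = rng.standard_normal((r, n))
    I = lam * np.eye(r)  # small regularizer keeps the systems solvable
    for _ in range(iters):
        for j in range(n):                  # update column j of V
            obs = W[:, j] > 0
            A = U[obs]
            V[:, j] = np.linalg.solve(A.T @ A + I, A.T @ M[obs, j])
        for i in range(m):                  # update row i of U
            obs = W[i, :] > 0
            B = V[:, obs]
            U[i] = np.linalg.solve(B @ B.T + I, B @ M[i, obs])
    return U, V

# Synthetic check: a rank-2 matrix with roughly 20% of entries hidden.
rng = np.random.default_rng(1)
M_full = rng.random((6, 2)) @ rng.random((2, 5))
W = (rng.random((6, 5)) > 0.2).astype(float)
U, V = weighted_als(M_full * W, W, r=2)
rel_obs_err = (np.linalg.norm(W * (M_full - U @ V))
               / np.linalg.norm(W * M_full))
```

The fitted U @ V then supplies predictions for the unobserved entries, exactly as in the example above.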
Complexity

Weighted low-rank approximation (WLRA) is the following problem:

    inf_{U ∈ R^{m×r}, V ∈ R^{r×n}} ||M − UV^T||_W^2 = Σ_{i,j} W_{ij} (M − UV^T)_{ij}^2,    (WLRA)

where W ≥ 0 is the weighting matrix. A zero in W represents a missing/unknown entry in M. For r = 1 and M ≥ 0, WLRA is equivalent to weighted NMF.

Theorem. It is NP-hard to find an approximate solution of WLRA, for any r ≥ 1.

Other applications: computer vision, microarray data analysis, 2-D digital filters, etc.

G., Glineur, Low-Rank Matrix Approximation with Weights or Missing Data is NP-hard, SIMAX '11.
Maximum-edge biclique problem

The reductions for NMU, R1NF and WLRA are from the maximum-edge biclique problem: given two sets of objects interacting together (a bipartite graph), find highly connected subgroups.

This is the complete bipartite subgraph with the maximum number of edges.

Applications in clustering: text mining, web community discovery, ...
Example

Let M be the biadjacency matrix of the bipartite graph representing the interactions:

        | 1 0 1 |   | 1 0 |
    M = | 0 1 1 | = | 0 1 | × | 1 0 1 |
        | 1 0 1 |   | 1 0 |   | 0 1 1 |

        | 1 0 1 |   | 0 0 0 |
      = | 0 0 0 | + | 0 1 1 |
        | 1 0 1 |   | 0 0 0 |

Each rank-one factor corresponds to a community.

[W11] Wang et al., Community discovery using nonnegative matrix factorization, Data Min. Knowl. Disc. 22:493-521, 2011.
[PRE11] Psorakis, Roberts, and Ebden, Overlapping community detection using Bayesian non-negative matrix factorization, Physical Review E 83, 066114, 2011.
[YL13] Yang and Leskovec, Overlapping Community Detection at Scale: A Nonnegative Matrix Factorization Approach, Proc. of the sixth ACM int. conf. on web search and data mining, 2013.
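The decomposition above is easy to verify numerically; each rank-one factor is the (bi)adjacency pattern of one community:

```python
import numpy as np

# Biadjacency matrix from the slide and its exact NMF into two
# rank-one factors, one per community (biclique).
M = np.array([[1, 0, 1],
              [0, 1, 1],
              [1, 0, 1]])
U = np.array([[1, 0],
              [0, 1],
              [1, 0]])
V = np.array([[1, 0, 1],
              [0, 1, 1]])

communities = [np.outer(U[:, k], V[k, :]) for k in range(2)]
ok = np.array_equal(U @ V, M)  # exact factorization: rank_+(M) = 2
```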
Conclusion
Low-rank matrix approximation can be used for linear dimensionality reduction. It is a powerful tool for data analysis.

Nonnegativity renders the problem difficult.

However, it significantly enhances applicability in many areas (by improving interpretability), e.g., image processing, text mining, hyperspectral unmixing, clustering, community detection, and recommender systems.

Still many open questions for NMF and related problems, e.g., subclasses of matrices for which NMF can be computed efficiently (separable NMF), bounding and computing nonnegative ranks, characterizing non-uniqueness, complexity when using other norms, etc.
Thank you for your attention!
Code and papers available at https://sites.google.com/site/nicolasgillis/
Recent Survey: ‘The Why and How of Nonnegative Matrix Factorization’,arXiv:1401.5226.