Boolean tensor factorizations - mpi-inf.mpg.depmiettin/papers/BTF_ICDM_slides.pdf
BOOLEAN TENSOR FACTORIZATIONS
Pauli Miettinen 14 December 2011
BACKGROUND: TENSORS AND TENSOR FACTORIZATIONS
[Figure: a 3-way tensor X approximated (≈) by three factor matrices A, B, and C]
BACKGROUND: BOOLEAN MATRIX FACTORIZATIONS
• Given a binary matrix X and a positive integer R, find two binary matrices A and B such that A has R columns, B has R rows, and X ≈ A ∘ B.
• A ∘ B is the Boolean matrix product of A and B,
$$(A \circ B)_{ij} = \bigvee_{r=1}^{R} a_{ir} b_{rj}$$
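As a minimal sketch of the Boolean matrix product (assuming NumPy; the function name `boolean_matmul` is illustrative and does not come from the slides), the ordinary integer product can be computed first and every positive entry clipped to 1, which amounts to taking a logical OR over r:

```python
import numpy as np

def boolean_matmul(A, B):
    """Boolean product: (A o B)[i, j] = OR over r of (A[i, r] AND B[r, j])."""
    A = np.asarray(A, dtype=np.int64)
    B = np.asarray(B, dtype=np.int64)
    # The integer product counts the r with A[i, r] = B[r, j] = 1;
    # under Boolean arithmetic (1 + 1 = 1) any positive count becomes 1.
    return (A @ B > 0).astype(np.uint8)

A = np.array([[1, 0],
              [1, 1],
              [0, 1]])
B = np.array([[1, 1, 0],
              [0, 1, 1]])

X = boolean_matmul(A, B)   # Boolean product, entries in {0, 1}
X_normal = A @ B           # normal product, entry (1, 1) is 2
```

The two products differ only where rank-1 components overlap, which is exactly where the normal product exceeds 1.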
BOOLEAN TENSOR FACTORIZATIONS: THE IDEA
1. Take existing (normal) tensor factorization
2. Make everything binary and define summation as 1 + 1 = 1
3. Try to understand what you just did.
Research problem. What can we say about Boolean tensor factorizations and how do they relate to normal tensor factorizations and Boolean matrix factorizations?
RANK-1 (BOOLEAN) TENSORS
[Figure: a rank-1 matrix X drawn as the outer product of vectors a and b]
$$X = a \times b$$
[Figure: a rank-1 3-way tensor X drawn as the outer product of vectors a, b, and c]
$$X = a \times_1 b \times_2 c$$
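A small sketch of how rank-1 tensors can be built, assuming NumPy and arbitrarily chosen binary vectors: each entry is the product of the corresponding vector entries, which for 0/1 values is the same as a logical AND.

```python
import numpy as np

# Hypothetical binary vectors, chosen only for illustration
a = np.array([1, 0, 1], dtype=np.uint8)
b = np.array([1, 1], dtype=np.uint8)
c = np.array([0, 1, 1, 0], dtype=np.uint8)

# Rank-1 matrix: X2[i, j] = a[i] * b[j]
X2 = np.outer(a, b)

# Rank-1 3-way tensor: X3[i, j, k] = a[i] * b[j] * c[k]
X3 = np.einsum("i,j,k->ijk", a, b, c)
```

The 1s of `X3` form a single block: the entries whose three indices all select a 1 in `a`, `b`, and `c`.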
THE CP TENSOR DECOMPOSITION
$$x_{ijk} \approx \sum_{r=1}^{R} a_{ir} b_{jr} c_{kr}$$
[Figure: X ≈ (a1, b1, c1) + (a2, b2, c2) + · · · + (aR, bR, cR), a sum of R rank-1 tensors]
[Figure: the CP decomposition drawn with factor matrices, X ≈ A, B, C]
THE BOOLEAN CP TENSOR DECOMPOSITION
$$x_{ijk} \approx \bigvee_{r=1}^{R} a_{ir} b_{jr} c_{kr}$$
[Figure: X ≈ (a1, b1, c1) ∨ (a2, b2, c2) ∨ · · · ∨ (aR, bR, cR), a Boolean sum of R rank-1 binary tensors]
[Figure: the Boolean CP decomposition drawn with binary factor matrices A, B, and C combined by the Boolean product ∘]
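To make the difference between the two CP formulas concrete, here is a minimal sketch assuming NumPy and two small, hand-picked binary factor matrices with overlapping components (the function names are illustrative). The normal reconstruction sums the rank-1 terms; the Boolean one ORs them.

```python
import numpy as np

def cp_reconstruct(A, B, C):
    """Normal CP: x[i, j, k] = sum over r of A[i, r] * B[j, r] * C[k, r]."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

def boolean_cp_reconstruct(A, B, C):
    """Boolean CP: x[i, j, k] = OR over r of (A[i, r] AND B[j, r] AND C[k, r])."""
    counts = np.einsum("ir,jr,kr->ijk",
                       A.astype(np.int64), B.astype(np.int64), C.astype(np.int64))
    return (counts > 0).astype(np.uint8)

# Two overlapping rank-1 components (R = 2); column r holds a_r, b_r, c_r
A = np.array([[1, 0], [1, 1], [0, 1]])
B = np.array([[1, 1], [0, 1]])
C = np.array([[1, 1], [1, 0]])

X_normal = cp_reconstruct(A, B, C)
X_bool = boolean_cp_reconstruct(A, B, C)
```

Both components cover entry (1, 0, 0), so the normal reconstruction has a 2 there while the Boolean reconstruction has a 1.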
DIGRESSION: FREQUENT TRI-ITEMSET MINING
• Rank-1 N-way binary tensors define an N-way itemset
• Particularly, rank-1 binary matrices define an itemset
• In itemset mining the induced sub-tensor must be full of 1s
• Here, the itemsets can have holes (0s inside the induced sub-tensor)
• Boolean CP decomposition = lossy N-way tiling
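The contrast between an exact tri-itemset and a lossy tile can be sketched as follows, assuming NumPy; the helper `block_holes` and the toy data are hypothetical. The indicator vectors a, b, and c select a sub-tensor, and the number of 0s inside it is what a rank-1 Boolean component would get wrong.

```python
import numpy as np

def block_holes(X, a, b, c):
    """Count the 0s of X inside the sub-tensor selected by indicator vectors a, b, c."""
    sub = X[np.ix_(a.astype(bool), b.astype(bool), c.astype(bool))]
    return int(sub.size) - int(sub.sum())

X = np.zeros((3, 3, 2), dtype=np.uint8)
X[:2, :2, :] = 1      # a 2-by-2-by-2 block of 1s
X[1, 1, 0] = 0        # one hole inside the block

a = np.array([1, 1, 0])
b = np.array([1, 1, 0])
c = np.array([1, 1])

holes = block_holes(X, a, b, c)
is_exact_itemset = holes == 0   # itemset mining would require this to be True
```

Here the block is not a valid tri-itemset, but as a rank-1 Boolean component it covers seven 1s at the cost of a single error.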
TENSOR RANK
The rank of a tensor is the minimum number of rank-1 tensors needed to represent the tensor exactly.
[Figure: X = (a1, b1, c1) + (a2, b2, c2) + · · · + (aR, bR, cR)]
BOOLEAN TENSOR RANK
The Boolean rank of a binary tensor is the minimum number of binary rank-1 tensors needed to represent the tensor exactly using Boolean arithmetic.
[Figure: X = (a1, b1, c1) ∨ (a2, b2, c2) ∨ · · · ∨ (aR, bR, cR)]
SOME RESULTS ON RANKS
• Normal tensor rank is NP-hard to compute
• Normal tensor rank of n-by-m-by-k tensor can be more than min{n, m, k}
• But no more than min{nm, nk, mk}
• Boolean tensor rank is NP-hard to compute
• Boolean tensor rank of n-by-m-by-k tensor can be more than min{n, m, k}
• But no more than min{nm, nk, mk}
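One way to see the upper bound is a constructive sketch (assuming NumPy; `fiber_decomposition` is a hypothetical helper, not an algorithm from the slides): every nonzero fiber X[i, j, :] becomes one rank-1 component, so at most nm components are needed. Permuting the modes first gives the nk and mk variants. Because the components never overlap, the same construction is exact under both normal and Boolean arithmetic.

```python
import numpy as np

def fiber_decomposition(X):
    """Write X exactly as a sum of at most n*m rank-1 tensors,
    one per nonzero mode-3 fiber X[i, j, :]."""
    n, m, k = X.shape
    idx = [(i, j) for i in range(n) for j in range(m) if X[i, j, :].any()]
    R = len(idx)
    A = np.zeros((n, R), dtype=np.int64)
    B = np.zeros((m, R), dtype=np.int64)
    C = np.zeros((k, R), dtype=np.int64)
    for r, (i, j) in enumerate(idx):
        A[i, r] = 1              # unit vector selecting row i
        B[j, r] = 1              # unit vector selecting column j
        C[:, r] = X[i, j, :]     # the fiber itself
    return A, B, C

rng = np.random.default_rng(0)
X = (rng.random((3, 4, 5)) < 0.4).astype(np.int64)

A, B, C = fiber_decomposition(X)
X_rebuilt = np.einsum("ir,jr,kr->ijk", A, B, C)
```

This only demonstrates the bound; the actual rank of a given tensor can be much smaller.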
SPARSITY
• Binary matrix X of Boolean rank R and |X| 1s has Boolean rank-R decomposition A ∘ B such that |A| + |B| ≤ 2|X| [M., ICDM ’10]
• Binary N-way tensor X of Boolean tensor rank R has Boolean rank-R CP-decomposition with factor matrices A1, A2, …, AN such that ∑i |Ai| ≤ N|X|
• Both results are existential only and extend to approximate decompositions
THE TUCKER TENSOR DECOMPOSITION
[Figure: X ≈ core tensor G combined with factor matrices A, B, and C]
$$x_{ijk} \approx \sum_{p=1}^{P} \sum_{q=1}^{Q} \sum_{r=1}^{R} g_{pqr} a_{ip} b_{jq} c_{kr}$$
THE BOOLEAN TUCKER TENSOR DECOMPOSITION
[Figure: X ≈ binary core tensor G combined with binary factor matrices A, B, and C]
$$x_{ijk} \approx \bigvee_{p=1}^{P} \bigvee_{q=1}^{Q} \bigvee_{r=1}^{R} g_{pqr} a_{ip} b_{jq} c_{kr}$$
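The Tucker formulas can be sketched in a few lines, assuming NumPy and randomly generated binary factors (all names and sizes here are illustrative). The Boolean version thresholds the normal reconstruction, and a superdiagonal core reduces Tucker to CP, which serves as a sanity check.

```python
import numpy as np

def tucker_reconstruct(G, A, B, C):
    """Normal Tucker: x[i,j,k] = sum over p,q,r of G[p,q,r] * A[i,p] * B[j,q] * C[k,r]."""
    return np.einsum("pqr,ip,jq,kr->ijk", G, A, B, C)

def boolean_tucker_reconstruct(G, A, B, C):
    """Boolean Tucker: the same expression with OR in place of the sums."""
    return (tucker_reconstruct(G, A, B, C) > 0).astype(np.uint8)

rng = np.random.default_rng(1)
G = (rng.random((2, 2, 2)) < 0.5).astype(np.int64)   # P-by-Q-by-R binary core
A = (rng.random((4, 2)) < 0.5).astype(np.int64)      # n-by-P
B = (rng.random((3, 2)) < 0.5).astype(np.int64)      # m-by-Q
C = (rng.random((5, 2)) < 0.5).astype(np.int64)      # k-by-R

X_bool = boolean_tucker_reconstruct(G, A, B, C)

# Sanity check: with a superdiagonal core, Tucker coincides with CP
R = 2
G_diag = np.zeros((R, R, R), dtype=np.int64)
G_diag[np.arange(R), np.arange(R), np.arange(R)] = 1
X_cp = np.einsum("ir,jr,kr->ijk", A, B, C)
X_tucker_diag = tucker_reconstruct(G_diag, A, B, C)
```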
THE ALGORITHMS
• The normal CP decomposition can be computed using matricization and alternating least squares (ALS)
• ⊙ is the Khatri–Rao matrix product
• (C ⊙ B)^T is R-by-mk
• For normal matrices, we can use standard least-squares projections
• One projection per mode
• Similar algorithms for the Tucker decomposition
$$X_{(1)} = A (C \odot B)^T$$
$$X_{(2)} = B (C \odot A)^T$$
$$X_{(3)} = C (B \odot A)^T$$
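The mode-1 identity and a single least-squares projection can be sketched as follows, assuming NumPy and one common unfolding convention (column index j + k·m, so that the second mode varies fastest). The helper names are illustrative, and the slides may use a different but equivalent column ordering.

```python
import numpy as np

def khatri_rao(C, B):
    """Column-wise Kronecker product: row k*m + j, column r holds C[k, r] * B[j, r]."""
    k, R = C.shape
    m, _ = B.shape
    return np.einsum("kr,jr->kjr", C, B).reshape(k * m, R)

def unfold_mode1(X):
    """Mode-1 matricization: X1[i, k*m + j] = X[i, j, k]."""
    n, m, k = X.shape
    return X.transpose(0, 2, 1).reshape(n, k * m)

rng = np.random.default_rng(0)
n, m, k, R = 4, 3, 5, 2
A = rng.random((n, R))
B = rng.random((m, R))
C = rng.random((k, R))
X = np.einsum("ir,jr,kr->ijk", A, B, C)   # exact rank-R tensor

X1 = unfold_mode1(X)
KR = khatri_rao(C, B)                     # (m*k)-by-R, so KR.T is R-by-mk

# One ALS projection: with B and C fixed, solve X1 = A_hat @ KR.T for A_hat
A_hat = np.linalg.lstsq(KR, X1.T, rcond=None)[0].T
```

A full ALS loop would repeat the same projection for modes 2 and 3 and iterate until the error stops decreasing.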
THE ALGORITHMS
• For the Boolean case, the matrix product must be changed
• Khatri–Rao stays as it is
• Finding the optimal projection is NP-hard even to approximate
• Good initial values are needed due to multiple local minima
• Obtained by applying Boolean matrix factorization to the matricizations
$$X_{(1)} = A \circ (C \odot B)^T$$
$$X_{(2)} = B \circ (C \odot A)^T$$
$$X_{(3)} = C \circ (B \odot A)^T$$
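A sketch of the Boolean counterpart, assuming NumPy and the same unfolding convention as a standard mode-1 matricization with the second mode varying fastest. Since the projection is hard in general, this illustration simply tries all 2^R candidate rows, which is only feasible for very small R and is not the update rule used in the slides; all helper names are hypothetical.

```python
import itertools
import numpy as np

def khatri_rao(C, B):
    """Row k*m + j, column r holds C[k, r] * B[j, r] (AND for binary inputs)."""
    k, R = C.shape
    m, _ = B.shape
    return np.einsum("kr,jr->kjr", C, B).reshape(k * m, R)

def boolean_matmul(A, M):
    """Boolean matrix product via thresholding the integer product."""
    return ((A.astype(np.int64) @ M.astype(np.int64)) > 0).astype(np.uint8)

def best_boolean_rows(X1, KR):
    """For each row x of X1, pick the binary row a minimizing the number of
    mismatches between x and a o KR^T, by exhaustive search over 2^R rows."""
    R = KR.shape[1]
    candidates = np.array(list(itertools.product([0, 1], repeat=R)), dtype=np.uint8)
    recon = boolean_matmul(candidates, KR.T)               # (2^R)-by-(m*k)
    # errors[c, i] = mismatches between candidate c and row i of X1
    errors = (recon[:, None, :] != X1[None, :, :]).sum(axis=2)
    return candidates[errors.argmin(axis=0)]

rng = np.random.default_rng(2)
n, m, k, R = 4, 3, 5, 2
A = (rng.random((n, R)) < 0.5).astype(np.uint8)
B = (rng.random((m, R)) < 0.5).astype(np.uint8)
C = (rng.random((k, R)) < 0.5).astype(np.uint8)

counts = np.einsum("ir,jr,kr->ijk",
                   A.astype(np.int64), B.astype(np.int64), C.astype(np.int64))
X = (counts > 0).astype(np.uint8)                 # Boolean CP tensor
X1 = X.transpose(0, 2, 1).reshape(n, k * m)       # mode-1 matricization
KR = khatri_rao(C, B)

A_hat = best_boolean_rows(X1, KR)
```

The recovered `A_hat` reproduces `X1` exactly but need not equal `A`, since Boolean factorizations are generally not unique.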
THE TUCKER CASE
• The core tensor has global effects
• Updates are hard
• Core tensor is usually small
• We can afford more time per element
• In Boolean case many changes make no difference
SYNTHETIC EXPERIMENTS
[Figure, left panel: reconstruction error (×10^4) vs. r (4, 8, 16, 32, 64) for BCP_ALS, CP_ALS_B, CP_ALS_F, CP_OPT_B, and CP_OPT_F]
[Figure, right panel: reconstruction error (×10^4) vs. ranks ((4, 4, 4), (8, 4, 4), (8, 8, 8), (16, 8, 4)) for BTucker_ALS, Tucker_ALS_B, and Tucker_ALS_F]
REAL-WORLD EXPERIMENTS
[Figure, left panel: reconstruction error vs. r (5, 10, 15, 30) for BCP_ALS, CP_ALS_B, CP_ALS_F, CP_OPT_B, and CP_OPT_F]
[Figure, right panel: reconstruction error vs. ranks ((5, 5, 5), (10, 10, 10), (15, 15, 15), (30, 30, 15)) for BTucker_ALS, Tucker_ALS_B, and Tucker_ALS_F]
CONCLUSIONS
• Boolean tensor decompositions are a bit like normal tensor decompositions
• And a bit like Boolean matrix factorizations
• They generalize other data mining techniques in many ways
• The playing field between Boolean and normal tensor factorizations is more level
Thank You!