Multi-view Anomaly Detection via Robust Probabilistic ...
Copyright©2016 NTT corp. All Rights Reserved.
Multi-view Anomaly Detection via Robust Probabilistic Latent Variable Models
Tomoharu Iwata (NTT Communication Science Labs), Makoto Yamada (RIKEN AIP)
NIPS2016
Multi-view anomaly
• Instances that have inconsistent views
• Applications
  – information disparity management
    • find documents that contain different information across multilingual Wikipedia documents
  – malicious insider detection
  – purchase behavior analysis
    • find movies inconsistently purchased by users based on the genre (e.g. animation purchased by grown-ups)
[Figure: examples of multi-view data: image and annotation, text and URLs, audio and video, Japanese text and English text]
Single-view/multi-view anomaly
[Figure: the same instances (A to J, M, S) plotted in View1 and View2]
• A single-view anomaly is an instance that does not conform to expected behavior
• S is a single-view anomaly, but not a multi-view anomaly
• M is not a single-view anomaly, but is a multi-view anomaly
Existing method: HOAD (1)
• HOrizontal Anomaly Detection (HOAD)
  – "A Spectral Framework for Detecting Inconsistency across Multi-source Object Relationships", Gao et al., ICDM 2011
• Step 1: soft-cluster the two views together, with the constraint that an instance should be assigned to the same cluster in both views
[Figure: two-view combined similarity graph with view1 and view2 blocks linked by diagonal blocks, followed by spectral clustering]
Existing method: HOAD (2)
• Step 2: quantify the difference between the two clustering solutions
• Weak points
  – anomalous instances are also constrained to be assigned to the same cluster
  – requires hyper-parameter tuning (e.g. the weight for the constraint)
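The two HOAD steps can be illustrated with a simplified sketch. It is written in the spirit of the method and is not the reference implementation of Gao et al.: the function name `hoad_scores`, the unnormalized Laplacian, and the cosine-distance score are assumptions made for illustration. The parameter `m` plays the role of the constraint weight that has to be tuned.

```python
import numpy as np

def hoad_scores(sim1, sim2, m=1.0, k=2):
    """Simplified HOAD-style anomaly scores (illustrative assumption).

    sim1, sim2: (n, n) symmetric similarity matrices, one per view.
    m: weight of the constraint linking the two copies of each instance.
    k: number of spectral embedding dimensions.
    """
    n = sim1.shape[0]
    link = m * np.eye(n)
    # Step 1: combined 2n x 2n similarity graph. The view blocks sit on the
    # diagonal; the m * I blocks tie each instance to its copy in the other view.
    combined = np.block([[sim1, link], [link, sim2]])
    laplacian = np.diag(combined.sum(axis=1)) - combined
    # eigh returns eigenvalues in ascending order, so the first k
    # eigenvectors give the spectral (soft clustering) embedding.
    _, vecs = np.linalg.eigh(laplacian)
    emb = vecs[:, :k]
    u, v = emb[:n], emb[n:]  # embeddings of view 1 and view 2
    # Step 2: cosine distance between the two embeddings of each instance.
    norms = np.linalg.norm(u, axis=1) * np.linalg.norm(v, axis=1) + 1e-12
    return 1.0 - np.sum(u * v, axis=1) / norms
```

A larger score means the two views of an instance land in different parts of the embedding. Because every instance, anomalous or not, is pulled toward its copy by the `m * I` blocks, the score is sensitive to the choice of `m`.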
Proposed model
• Normal (non-anomalous) instance
  – all views are generated from a single latent vector
• Anomaly
  – different views are generated from different latent vectors
[Figure: instances in observed view 1 and observed view 2 generated from a shared latent space through projection matrices $\mathbf{W}_1$ and $\mathbf{W}_2$; the anomaly M appears at two positions in the latent space]
Proposed model
• Each instance has potentially a countably infinite number of latent vectors
• Each view of an instance is generated depending on a view-specific projection matrix and a latent vector
[Figure: latent vectors of instance A in the latent space mapped to observed views 1, 2, ..., D through projection matrices $\mathbf{W}_1, \mathbf{W}_2, \ldots, \mathbf{W}_D$]
Proposed probabilistic model
• By using a Dirichlet process prior for each instance, the number of latent vectors is automatically determined from the given data
  – if views are consistent, they are clustered; otherwise, they need different latent vectors
• By using view-dependent projection matrices, we can handle different properties across different views
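$$p(\mathbf{x}_{nd} \mid \mathbf{z}_n, \mathbf{W}_d, \boldsymbol{\theta}_n, \alpha) = \sum_{j=1}^{\infty} \theta_{nj}\, \mathcal{N}(\mathbf{x}_{nd} \mid \mathbf{W}_d \mathbf{z}_{nj}, \alpha^{-1}\mathbf{I})$$
  – $\mathbf{x}_{nd}$: view d of instance n
  – $\mathbf{z}_{nj}$: j-th latent vector of instance n
  – $\mathbf{W}_d$: projection matrix for view d
  – $\theta_{nj}$: mixture weight for the j-th latent vector
  – $\alpha$: precision
Anomaly score: probability that the instance uses more than one latent vector
$$v_n = \frac{1}{H} \sum_{h=1}^{H} I\left(J_n^{(h)} > 1\right)$$
  – $J_n^{(h)}$: number of latent vectors of instance n at the h-th iteration after burn-in
  – $I(\cdot)$: indicator function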
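The anomaly score is a simple average over posterior samples, which a short sketch can make concrete. The function name `anomaly_scores` and the array layout (one row per post-burn-in iteration, one column per instance) are assumptions for illustration.

```python
import numpy as np

def anomaly_scores(num_latent_vectors):
    """Fraction of post-burn-in iterations in which an instance used more
    than one latent vector.

    num_latent_vectors: array of shape (H, N); entry [h, n] is the number of
    latent vectors used by instance n at the h-th iteration after burn-in.
    """
    counts = np.asarray(num_latent_vectors)
    # The indicator I(J > 1) averaged over the H iterations.
    return (counts > 1).mean(axis=0)

# Four sampling iterations for three instances (toy numbers).
samples = [[1, 2, 1],
           [1, 3, 1],
           [1, 1, 1],
           [1, 2, 2]]
print(anomaly_scores(samples))  # scores for the three instances: 0, 0.75, 0.25
```

An instance whose views always share one latent vector gets a score of 0, and an instance whose views are split in every sample gets a score of 1.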
Proposed model
• clustering views for each instance in the latent space
[Figure: views A1 to A5 of instance A clustered in the latent space and mapped to observed views 1, 2, ..., 5 through projection matrices $\mathbf{W}_1, \mathbf{W}_2, \ldots, \mathbf{W}_5$]
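Relation with other latent variable models
• Proposed model
  – $p(\mathbf{x}_{nd} \mid \mathbf{z}_n, \mathbf{W}_d, \boldsymbol{\theta}_n, \alpha) = \sum_{j=1}^{\infty} \theta_{nj}\, \mathcal{N}(\mathbf{x}_{nd} \mid \mathbf{W}_d \mathbf{z}_{nj}, \alpha^{-1}\mathbf{I})$
• Probabilistic PCA
  – $p(\mathbf{x}_{nd} \mid \mathbf{Z}, \mathbf{W}) = \mathcal{N}(\mathbf{x}_{nd} \mid \mathbf{W} \mathbf{z}_{nd}, \alpha^{-1}\mathbf{I})$
• Probabilistic CCA
  – $p(\mathbf{x}_{nd} \mid \mathbf{Z}, \mathbf{W}) = \mathcal{N}(\mathbf{x}_{nd} \mid \mathbf{W}_d \mathbf{z}_{n}, \boldsymbol{\Sigma})$
• Infinite Gaussian mixture
  – $p(\mathbf{x}_{nd} \mid \boldsymbol{\mu}, \boldsymbol{\theta}) = \sum_{j=1}^{\infty} \theta_{j}\, \mathcal{N}(\mathbf{x}_{nd} \mid \boldsymbol{\mu}_j, \alpha^{-1}\mathbf{I})$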
Generative process
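As a rough illustration of how the views of one instance could be generated under the model described on the earlier slides, here is a minimal sketch. It is not the authors' exact generative process: the function name `sample_instance_views`, the standard normal prior on latent vectors, and the default values of the concentration parameter `gamma` and precision `alpha` are assumptions. The Dirichlet process prior is simulated through its Chinese restaurant process representation, which marginalizes the mixture weights.

```python
import numpy as np

def sample_instance_views(proj_mats, gamma=1.0, alpha=10.0, rng=None):
    """Sample the views of a single instance (illustrative sketch).

    proj_mats: list of D projection matrices; proj_mats[d] has shape (M_d, K).
    gamma: DP concentration (assumed); larger values yield more latent vectors.
    alpha: observation precision, so the noise variance is 1 / alpha.
    """
    rng = np.random.default_rng() if rng is None else rng
    latent_dim = proj_mats[0].shape[1]
    latent_vectors = []  # latent vectors of this instance, created lazily
    counts = []          # number of views assigned to each latent vector
    views, assignments = [], []
    for W in proj_mats:
        # Chinese restaurant process: reuse latent vector j with probability
        # proportional to counts[j], create a new one proportional to gamma.
        weights = np.array(counts + [gamma], dtype=float)
        j = int(rng.choice(len(weights), p=weights / weights.sum()))
        if j == len(latent_vectors):
            # Assumed prior: standard normal latent vector.
            latent_vectors.append(rng.standard_normal(latent_dim))
            counts.append(0)
        counts[j] += 1
        # View d is a noisy linear projection of its assigned latent vector.
        noise = rng.standard_normal(W.shape[0]) / np.sqrt(alpha)
        views.append(W @ latent_vectors[j] + noise)
        assignments.append(j)
    return views, assignments
```

An instance whose assignments are all 0 is normal in the sense of the model (all views come from one latent vector); an instance with more than one distinct assignment is a multi-view anomaly.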
Inference based on stochastic EM
• Analytically integrate out the latent vectors Z, mixture weights Θ, and precision $\alpha$
• E-step: collapsed Gibbs sampling of the latent vector assignment s for each view of each instance, $\ell = (n, d)$
• M-step: maximum joint likelihood estimation of the projection matrices W
Experiments
• Data
  – 11 data sets from the LIBSVM data sets
  – multiple views were generated by randomly splitting the features
  – anomalies were added by swapping views of two randomly selected instances
• Compared methods
  – PCCA: probabilistic canonical correlation analysis
  – HOAD: horizontal anomaly detection
  – CC: consensus clustering based anomaly detection
  – OCSVM: one-class support vector machine
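The data preparation described under "Data" can be sketched as follows. The function name `make_multiview_with_anomalies`, the meaning of `anomaly_rate` (fraction of instances turned into anomalies), and the choice to swap one randomly chosen view per pair are assumptions for illustration; the slides do not specify these details.

```python
import numpy as np

def make_multiview_with_anomalies(X, num_views=2, anomaly_rate=0.1, rng=None):
    """Split features into views and inject multi-view anomalies (sketch).

    X: (n, num_features) data matrix.
    Returns a list of view matrices and a boolean anomaly label per instance.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, num_features = X.shape
    # Randomly split the feature indices into num_views groups.
    perm = rng.permutation(num_features)
    views = [X[:, idx].copy() for idx in np.array_split(perm, num_views)]
    labels = np.zeros(n, dtype=bool)
    # Pick disjoint pairs of instances and swap one view within each pair.
    num_pairs = int(anomaly_rate * n) // 2
    chosen = rng.choice(n, size=2 * num_pairs, replace=False)
    for a, b in zip(chosen[0::2], chosen[1::2]):
        d = int(rng.integers(num_views))  # assumed: one random view is swapped
        views[d][[a, b]] = views[d][[b, a]]
        labels[[a, b]] = True
    return views, labels
```

Each swapped instance keeps plausible values in every single view, so it is not a single-view anomaly, but its views no longer belong together.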
Multi-view anomaly detection with different anomaly rates
The proposed model achieved the best performance on 8 of the 11 data sets
Multi-view anomaly detection with different latent dimensionality
MovieLens data analysis
• instance: movie, view1: list of users who rated the movie, view2: genre
• ‘The Full Monty’ and ‘Liar Liar’ are in the ‘Comedy’ genre, but they were rated not only by users who like ‘Comedy’ but also by users who like ‘Romance’ and ‘Action-Thriller’.
• ‘The Professional’ was an anomaly because it was rated by two different user groups, one that prefers ‘Romance’ and another that prefers ‘Action’.
• Since the ‘Star Trek’ series is typical Sci-Fi and liked by specific users, its anomaly score was low.
[Table: movies with high anomaly scores and movies with low anomaly scores]
Conclusion
• We proposed a generative model approach for multi-view anomaly detection, which finds instances that have inconsistent views
• In the experiments, we confirmed that the proposed model could perform much better than existing methods for detecting multi-view anomalies
• Future work
  – nonlinear projection using Gaussian processes or deep neural nets