Stick-breaking Construction for the Indian Buffet Process
description
Transcript of Stick-breaking Construction for the Indian Buffet Process
![Page 1: Stick-breaking Construction for the Indian Buffet Process](https://reader033.fdocuments.us/reader033/viewer/2022050723/56816005550346895dcf059c/html5/thumbnails/1.jpg)
Stick-breaking Construction for the Indian Buffet Process
Duke University Machine Learning Group
Presented by Kai Ni
July 27, 2007
Yee Whye The, Dilan Gorur, and Zoubin Ghahramani,AISTATS 2007
![Page 2: Stick-breaking Construction for the Indian Buffet Process](https://reader033.fdocuments.us/reader033/viewer/2022050723/56816005550346895dcf059c/html5/thumbnails/2.jpg)
Outline
Introduction
Indian Buffet Process (IBP)
Stick-breaking construction for IBP
Slice samplers
Results
![Page 3: Stick-breaking Construction for the Indian Buffet Process](https://reader033.fdocuments.us/reader033/viewer/2022050723/56816005550346895dcf059c/html5/thumbnails/3.jpg)
Introduction Indian Buffet Process (IBP)
A distribution over binary matrices consisting of N rows (objects) and an unbounded number of columns (features);
1/0 in entry (i,k) indicates feature k present/absent from object i.
An example Objects are movies – “Terminator 2”, “Shrek” and “Shanghai
Knights”;
Features are – “action”, “comedy”, “stars Jackie Chan”;
The matrix can be [101; 010; 110].
![Page 4: Stick-breaking Construction for the Indian Buffet Process](https://reader033.fdocuments.us/reader033/viewer/2022050723/56816005550346895dcf059c/html5/thumbnails/4.jpg)
Relationship to CRP IBP and CRP are both tools for defining nonparametric
Bayesian models with latent variables.
CRP – Each object belongs to only one of infinitely many latent classes.
IBP – Each object can possess potentially any combination of infinitely many latent features.
Previous Gibbs sampler for IBP is based on CRP. In this paper the author derives a stick-breaking representation for the IBP, and develop efficient slice samplers.
![Page 5: Stick-breaking Construction for the Indian Buffet Process](https://reader033.fdocuments.us/reader033/viewer/2022050723/56816005550346895dcf059c/html5/thumbnails/5.jpg)
Indiant Buffet Process
Let Z be a random binary N x K matrix, and denote entry (I,k) in Z by zik. For each feature k let uk be the prior probability that feature k is present in an object.
Let be the strength parameter of the IBP, the full model is:
If we integrated out uk and taking the limit of K -> infinity, we obtain the IBP in the situation similar to CRP.
![Page 6: Stick-breaking Construction for the Indian Buffet Process](https://reader033.fdocuments.us/reader033/viewer/2022050723/56816005550346895dcf059c/html5/thumbnails/6.jpg)
![Page 7: Stick-breaking Construction for the Indian Buffet Process](https://reader033.fdocuments.us/reader033/viewer/2022050723/56816005550346895dcf059c/html5/thumbnails/7.jpg)
Gibbs sampler for IBP
For new features:
![Page 8: Stick-breaking Construction for the Indian Buffet Process](https://reader033.fdocuments.us/reader033/viewer/2022050723/56816005550346895dcf059c/html5/thumbnails/8.jpg)
Stick-breaking construction for IBP
![Page 9: Stick-breaking Construction for the Indian Buffet Process](https://reader033.fdocuments.us/reader033/viewer/2022050723/56816005550346895dcf059c/html5/thumbnails/9.jpg)
Derivation
pdf for each u cdf for each u
cdf for u(1)
pdf for u(1)
![Page 10: Stick-breaking Construction for the Indian Buffet Process](https://reader033.fdocuments.us/reader033/viewer/2022050723/56816005550346895dcf059c/html5/thumbnails/10.jpg)
Relation to DP
![Page 11: Stick-breaking Construction for the Indian Buffet Process](https://reader033.fdocuments.us/reader033/viewer/2022050723/56816005550346895dcf059c/html5/thumbnails/11.jpg)
Stick-breaking for IBP (2)
In truncated stick-breaking for IBP, let K* be the truncation level. We set u(k)=0 for k>K*, and zik=0 for k>K*.
![Page 12: Stick-breaking Construction for the Indian Buffet Process](https://reader033.fdocuments.us/reader033/viewer/2022050723/56816005550346895dcf059c/html5/thumbnails/12.jpg)
Slice Sampler
Using Adaptive rejection sampling (ARS) to deal with the truncation level. Introduce an auxiliary slice variable s with
![Page 13: Stick-breaking Construction for the Indian Buffet Process](https://reader033.fdocuments.us/reader033/viewer/2022050723/56816005550346895dcf059c/html5/thumbnails/13.jpg)
1. Update s: if new s makes K* becomes larger, we iteratively draw u(k) until u(K*’) > s.
2. Update Z: given s, we only need to update zik for each i and k<=K*.
3. Update for k = 1,…, K*.
4. Update u(k) for k = 1, …, K*.
Sampling
k
1 2 K* K*’ Decreasing u(k)
Old s New s
0Range of uniform dist. for s
![Page 14: Stick-breaking Construction for the Indian Buffet Process](https://reader033.fdocuments.us/reader033/viewer/2022050723/56816005550346895dcf059c/html5/thumbnails/14.jpg)
Change of Representations IBP – ignoring the ordering on features; Stick-breaking IBP – enforcing an ordering with
decreasing weights.
Stick-breaking -> IBP: Drop the stick lengths and the inactive features, leaving only the K+ active feature columns along with the corresponding parameters.
IBP -> stick-breaking: Draw both the stick lengths and order the features in decreasing stick lengths, introducing Ko inactive features until
![Page 15: Stick-breaking Construction for the Indian Buffet Process](https://reader033.fdocuments.us/reader033/viewer/2022050723/56816005550346895dcf059c/html5/thumbnails/15.jpg)
Semi-ordered Stick-breaking uk
+ on active features are unordered and draw from a CRP similar distribution:
The stick length on inactive feature is similar to the stick-breaking IBP
The auxiliary variable s determines how many inactive features need to add.
(unordered 1~K+) Ko
s0Range of uniform dist. for s
Min(u(k))
![Page 16: Stick-breaking Construction for the Indian Buffet Process](https://reader033.fdocuments.us/reader033/viewer/2022050723/56816005550346895dcf059c/html5/thumbnails/16.jpg)
Results Used the conjugate linear-Gaussian binary latent
feature model for comparing the performance of the different samplers. Each data point is modeled using a spherical Gaussian with mean zi,:A and variance 2
X
![Page 17: Stick-breaking Construction for the Indian Buffet Process](https://reader033.fdocuments.us/reader033/viewer/2022050723/56816005550346895dcf059c/html5/thumbnails/17.jpg)
Demonstration
Apply semi-ordered slice sampler to 1000 examples of handwritten images of 3’s in the MNIST dataset.
![Page 18: Stick-breaking Construction for the Indian Buffet Process](https://reader033.fdocuments.us/reader033/viewer/2022050723/56816005550346895dcf059c/html5/thumbnails/18.jpg)
![Page 19: Stick-breaking Construction for the Indian Buffet Process](https://reader033.fdocuments.us/reader033/viewer/2022050723/56816005550346895dcf059c/html5/thumbnails/19.jpg)
Conclusion
The author derived novel stick-breaking representations of the Indian buffet process.
Based on these representations, new MCMC samplers are proposed that are easy to implement and work on more general models than Gibbs sampling.