Parameter Expanded Variational Bayesian Methods Yuan (Alan) Qi and Tommi S. Jaakkola, MIT NIPS 2006 Presented by: John Paisley Duke University, ECE 3/13/2009


Parameter Expanded Variational Bayesian Methods

Yuan (Alan) Qi and Tommi S. Jaakkola, MIT

NIPS 2006

Presented by: John Paisley

Duke University, ECE

3/13/2009


Outline

• Introduction

• PX-VB algorithm

• Applications
  – Bayesian Probit Regression
  – Automatic Relevance Determination

• Convergence Properties

• Conclusion


Introduction

• Variational Bayes is a popular method for approximating the posterior distribution of a model.

• VB can be slow to converge when the variables are strongly correlated.

• Parameter-expanded methods can speed convergence by adding auxiliary parameters, which can remove the strong coupling of parameters.
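The slowdown caused by strong correlation can be illustrated with a toy case that is not taken from the slides: a factorized (mean-field) approximation to a bivariate Gaussian with unit variances and correlation `rho`. The function name, starting point, and tolerance below are illustrative assumptions. Each coordinate update conditions on the other factor's current mean, so the error shrinks by roughly `rho**2` per sweep.

```python
def mean_field_sweeps(rho, mu=(0.0, 0.0), init=(1.0, 1.0), tol=1e-8, max_sweeps=100_000):
    """Coordinate-ascent mean-field updates for a bivariate Gaussian target.

    The target has mean `mu`, unit variances and correlation `rho`.
    Returns the number of sweeps until the means stop moving, and the means.
    """
    m1, m2 = init
    for sweep in range(1, max_sweeps + 1):
        # Optimal factor mean given the other factor's current mean.
        m1_new = mu[0] + rho * (m2 - mu[1])
        m2_new = mu[1] + rho * (m1_new - mu[0])
        change = max(abs(m1_new - m1), abs(m2_new - m2))
        m1, m2 = m1_new, m2_new
        if change < tol:
            return sweep, (m1, m2)
    return max_sweeps, (m1, m2)


# Weakly versus strongly correlated targets (toy settings).
sweeps_weak, means_weak = mean_field_sweeps(rho=0.5)
sweeps_strong, means_strong = mean_field_sweeps(rho=0.99)
print(sweeps_weak, sweeps_strong)
```

Both runs reach the same fixed point (the true mean); only the number of sweeps differs, which is the behaviour parameter expansion is meant to improve.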


PX-VB algorithm

Auxiliary variables are added to the model and optimized at each iteration. The original parameters are then recovered by mapping the auxiliary variables back to the values at which the expanded model reduces to the original one.
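One way to write a single iteration schematically is given below. The notation is an assumption made for illustration: x collects the hidden variables, D is the data, \alpha is the auxiliary variable with \alpha_0 the value that recovers the original model, and M_\alpha is the reduction (inverse) mapping. It is a sketch of the description above, not a transcription of the slide.

```latex
\begin{align*}
\text{1. VB update at } \alpha = \alpha_0:\quad
  & \hat{q}(x) = \prod_{s} \hat{q}(x_s) \\
\text{2. Auxiliary update:}\quad
  & \alpha^{(t+1)} = \arg\min_{\alpha}\;
    \mathrm{KL}\!\left(\hat{q}(x)\,\middle\|\,p_{\alpha}(x, D)\right) \\
\text{3. Reduction to the original model:}\quad
  & q^{(t+1)}(x) = \text{distribution of } M_{\alpha^{(t+1)}}(\hat{x}),
    \qquad \hat{x} \sim \hat{q}
\end{align*}
```

Because step 2 can always return \alpha_0, which leaves \hat{q} unchanged, the extra step does not worsen the variational objective relative to a plain VB update.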


Bayesian Probit Regression

• The original model:

where TN denotes the truncated Gaussian.

• The parameter-expanded model:

Here q(z_n) and q(w) are updated first; this is then followed by the inverse mapping back to the original model.
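A minimal sketch of how such an update could look in code, under assumptions that are not confirmed by the slide: labels t_n in {-1, +1}, latent z_n ~ N(w^T x_n, 1), prior w ~ N(0, v0 I), and a single scalar c that rescales both w and z in the expanded model. Under these assumptions the optimal c has the closed form c^2 = S / (N + d), where S is the expected squared residual plus the expected prior term. The function names, the synthetic data, and the value of `v0` are all hypothetical.

```python
import math
import numpy as np

_erfc = np.vectorize(math.erfc)


def _pdf(u):
    """Standard normal density."""
    return np.exp(-0.5 * u**2) / math.sqrt(2.0 * math.pi)


def _cdf(u):
    """Standard normal cumulative distribution function."""
    return 0.5 * _erfc(-u / math.sqrt(2.0))


def fit_probit(X, t, v0=10.0, n_iter=500, expand=False):
    """Mean-field VB for probit regression; expand=True adds a PX rescaling step.

    Returns the mean of q(w) and the last value of the auxiliary scale c.
    """
    n, d = X.shape
    V = np.linalg.inv(X.T @ X + np.eye(d) / v0)  # covariance of q(w) after its update
    m = np.zeros(d)
    c = 1.0
    for _ in range(n_iter):
        # q(z_n): unit-variance Gaussian centred at x_n^T m, truncated to the sign of t_n.
        mu = X @ m
        u = t * mu
        lam = _pdf(u) / np.maximum(_cdf(u), 1e-300)
        ez = mu + t * lam          # E[z_n]
        ez2 = 1.0 + mu * ez        # E[z_n^2]
        # q(w): Gaussian with covariance V and mean V X^T E[z].
        m = V @ (X.T @ ez)
        if expand:
            # Optimize the scale c, then map back to the original model.
            mu_new = X @ m
            quad = np.einsum("ni,ij,nj->n", X, V, X) + mu_new**2  # E[(w^T x_n)^2]
            s = np.sum(ez2 - 2.0 * ez * mu_new + quad) + (m @ m + np.trace(V)) / v0
            c = math.sqrt(s / (n + d))
            m = m / c  # the covariance would scale by 1/c^2; it is recomputed next pass
    return m, c


# Synthetic data for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
w_true = np.array([1.0, -1.0, 0.5])
t = np.where(X @ w_true + rng.normal(size=200) > 0, 1.0, -1.0)

m_vb, _ = fit_probit(X, t, expand=False)
m_px, c_last = fit_probit(X, t, expand=True)
```

In this sketch the rescaling only moves q along a direction that stays inside the factorized family, so both variants share the same fixed point and c tends to 1 at convergence. The quantity of interest when comparing them is how many iterations each needs to get there.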


Bayesian Probit Regression: Results


Automatic Relevance Determination (RVM)

• Separate auxiliary variables

There is also an auxiliary variable for \alpha, the details of which are omitted.

• Shared auxiliary variable

The auxiliary variable c is optimized at each iteration using Newton's method, as no closed-form solution exists.
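The Newton step for a positive scalar such as c can be sketched as follows. The objective used here is a stand-in chosen only because it is smooth and convex for c > 0; it is not the ARD objective from the slides, and the function names are hypothetical.

```python
import math


def newton_minimize(grad, hess, c0, tol=1e-10, max_iter=50):
    """One-dimensional Newton iteration for a positive scalar c."""
    c = c0
    for _ in range(max_iter):
        step = grad(c) / hess(c)
        # Halve the step if it would leave the feasible region c > 0.
        while c - step <= 0.0:
            step *= 0.5
        c -= step
        if abs(step) < tol:
            break
    return c


# Stand-in objective f(c) = c^2 - log(c) + exp(-c), convex for c > 0.
def toy_grad(c):
    return 2.0 * c - 1.0 / c - math.exp(-c)


def toy_hess(c):
    return 2.0 + 1.0 / c**2 + math.exp(-c)


c_opt = newton_minimize(toy_grad, toy_hess, c0=1.0)
print(c_opt, toy_grad(c_opt))
```

Within PX-VB a routine like this would be called once per outer iteration, typically warm-started from the previous value of c so that only a few Newton steps are needed.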


Automatic Relevance Determination: Results


Convergence Properties

• A general convergence theorem was presented and proven:


Conclusion

• The theorem and its proof show that, as long as the inverse mapping function, M_a, has a largest eigenvalue smaller than 1, PX-VB is guaranteed to converge faster than VB, and the speed-up grows as this eigenvalue decreases.

• The approach presented was a general method for speeding up VB inference. This was demonstrated on two popular Bayesian models.
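As background for why an eigenvalue controls the speed, the standard local analysis of a fixed-point iteration can be written as below. The symbols F, J and \theta are generic and assumed here for illustration; this is not a restatement of the theorem from the paper.

```latex
\begin{align*}
\theta^{(t+1)} &= F\big(\theta^{(t)}\big), \qquad \theta^{\ast} = F(\theta^{\ast}), \\
\theta^{(t+1)} - \theta^{\ast} &\approx J\,\big(\theta^{(t)} - \theta^{\ast}\big),
  \qquad J = \left.\frac{\partial F}{\partial \theta}\right|_{\theta = \theta^{\ast}}, \\
\big\|\theta^{(t)} - \theta^{\ast}\big\| &\sim \lambda_{\max}(J)^{\,t}
  \quad \text{for large } t \text{, when } 0 < \lambda_{\max}(J) < 1 .
\end{align*}
```

A smaller largest eigenvalue of J therefore means fewer iterations to reach a given accuracy, which is the sense in which one update scheme converges faster than another.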