
Multi-Prediction Deep Boltzmann Machines

Goodfellow, Mirza, Courville, Bengio

Vipul Venkataraman, Nov 29, 2016

Outline

• Goal of the paper [1]

• A primer on RBMs and DBMs

• Training DBMs

• Proposed method: motivations and intuitions

• Results

• Conclusions

Goal

Goal of the paper

Make training unsupervised models great again!


Deep Boltzmann Machines

Preliminaries

Deep Boltzmann Machines

Image: [2]
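For reference, the energy function of a two-layer DBM such as the one in the figure (biases omitted, following [2]):

E(v, h^{(1)}, h^{(2)}) = -v^\top W^{(1)} h^{(1)} - h^{(1)\top} W^{(2)} h^{(2)}

P(v, h^{(1)}, h^{(2)}) = \frac{1}{Z} \exp(-E(v, h^{(1)}, h^{(2)}))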


Training

Deep Boltzmann Machines

• Unsupervised

• Generative model

• Feature learning algorithm


Deep Boltzmann Machines

Training Methods

Deep Boltzmann Machines

Classification

• Exact inference is intractable

• Instead, use mean-field expectations of the hidden units (a sketch follows below)
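A minimal NumPy sketch of that mean-field inference for a two-layer binary DBM; the function name, the initialisation, and the fixed iteration count are assumptions for illustration, not the exact procedure of [2]:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mean_field_features(v, W1, W2, n_iters=10):
    """Approximate E[h1] and E[h2] given visibles v by iterating the
    mean-field fixed-point equations of a two-layer DBM (biases omitted)."""
    mu1 = sigmoid(v @ W1)                   # initial guess for E[h1]
    mu2 = sigmoid(mu1 @ W2)                 # initial guess for E[h2]
    for _ in range(n_iters):
        mu1 = sigmoid(v @ W1 + mu2 @ W2.T)  # h1 gets input from v and h2
        mu2 = sigmoid(mu1 @ W2)             # h2 gets input from h1
    return mu1, mu2

The expectations mu1 (and mu2) are then fed to a separate classifier (the "extra classifier" listed as a drawback later).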

Training DBMs

Steps [2]:

1. Layer-wise pre-training

• Unsupervised

• RBMs as building blocks

2. Discriminative fine-tuning

• Supervised

• Back-propagation

Good reference: https://www.youtube.com/watch?v=Oq38pINmddk
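A schematic sketch of these two stages; train_rbm and rbm_hidden_probs below are hypothetical placeholders, not code from [2]:

import numpy as np

def train_rbm(data, n_hidden):
    """Placeholder: train one RBM on `data` (e.g. with CD-k, sketched later)
    and return its weight matrix."""
    return np.zeros((data.shape[1], n_hidden))

def rbm_hidden_probs(data, W):
    """Hidden-unit probabilities, used as the input to the next layer."""
    return 1.0 / (1.0 + np.exp(-(data @ W)))

def pretrain(X, layer_sizes=(500, 1000)):
    """Step 1, layer-wise pre-training: one RBM per layer, each trained on
    the activations of the layer below it."""
    weights, data = [], X
    for n_hidden in layer_sizes:
        W = train_rbm(data, n_hidden)
        weights.append(W)
        data = rbm_hidden_probs(data, W)
    return weights

# Step 2, discriminative fine-tuning: initialise an MLP (plus a classifier
# layer on top) with these weights and train it with back-propagation on
# labelled data.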

Pre-training: stack of RBMs

Fine-tuning: MLP

Advantages

Pre-training

• Deep: good features

• Can use any unsupervised algorithm

• RBM (w/ CD)

• Auto-encoder

Fine-tuning

• Won’t make drastic changes

• Need less labelled data

• Can use a lot of unlabelled data

"

• Greedy training, not considering global interactions

• Many separate models and training criteria

• Plus an extra classifier on top

• CD-k: we don’t know k

• Gradient approximation may be bad

An Aside: CD Intuition


• Far-away "holes": the model can put probability mass in regions far from the data

• Our negative particles may need to take many steps to reach them [3]

• Mixing may get slower as training progresses

• So increase k as training goes on: CD-1 -> CD-3 -> CD-10
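A minimal CD-k sketch for a binary RBM, biases omitted; the names are mine, and implementations differ in where they sample versus use probabilities:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd_k_gradient(v0, W, k, rng=None):
    """One CD-k gradient estimate for the weights of a binary RBM."""
    rng = np.random.default_rng() if rng is None else rng
    h0 = sigmoid(v0 @ W)                        # positive phase: E[h | v0]
    v, h = v0, h0
    for _ in range(k):                          # k steps of Gibbs sampling
        h_sample = (rng.random(h.shape) < h).astype(float)
        v = sigmoid(h_sample @ W.T)             # reconstruct the visibles
        h = sigmoid(v @ W)
    positive = v0.T @ h0                        # data-driven statistics
    negative = v.T @ h                          # model-driven statistics
    return (positive - negative) / v0.shape[0]  # approximate log-likelihood gradient

The CD-1 -> CD-3 -> CD-10 schedule above simply means calling this with a larger k as training progresses.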

Solutions


Proposed method

• Mantra: Simplify

• Many models -> one model

• Many criteria -> one criterion

• Extra classification layer at the top -> unified model

Quick recap

Multi-Prediction Training

Random bit-mask

Example 1

Example 1, update 1

Example 1, update 2

Two mean-field fixed point updates

Example 2, all updates

Example 3, all updates


One iteration

Minibatch Backprop
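Putting the pictures above into a rough NumPy sketch: each example gets a random bit-mask splitting its variables into observed and to-be-predicted ones, mean-field fixed-point updates fill in the missing ones, and the cross-entropy on the predicted variables is back-propagated through the unrolled updates. The sketch below is the forward pass only, omits the class label (which [1] treats as just another variable that can be masked), and uses made-up names and a 0.5 initialisation:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mpt_forward(v, W1, W2, mask, n_updates=10):
    """One multi-prediction problem for a two-layer DBM (biases omitted).
    mask == 1: observed visible unit; mask == 0: unit to be predicted."""
    v_hat = v * mask + 0.5 * (1 - mask)        # clamp observed units, init missing ones
    mu1 = sigmoid(v_hat @ W1)
    mu2 = sigmoid(mu1 @ W2)
    for _ in range(n_updates):                 # unrolled mean-field fixed-point updates
        mu1 = sigmoid(v_hat @ W1 + mu2 @ W2.T)
        mu2 = sigmoid(mu1 @ W2)
        v_hat = mask * v + (1 - mask) * sigmoid(mu1 @ W1.T)  # only missing units change
    return v_hat, mu1, mu2

# Per minibatch: draw a fresh random mask for every example, run the updates,
# and minimise the cross-entropy of the predicted (masked) variables by
# back-propagating through the updates (which needs an autodiff framework).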

Performance

• Works well (results in a bit)

• Expensive though

• Needs several mean-field iterations per example to converge


Multi-Inference Trick

Mean field

Multi-inference: average with the mean-field estimate

Nesterov’s accelerated gradient descent
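One possible reading of "average with the mean-field estimate", as a NumPy sketch: instead of clamping the visibles to the data, each update sees the average of the observed values and the current mean-field reconstruction. This is only an illustration; the exact ensemble-averaging update in [1], and its relation to Nesterov's method, differs in the details:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def multi_inference(v, W1, W2, n_updates=10):
    """Multi-inference-style mean field for a two-layer DBM (biases omitted)."""
    v_bar = v.copy()                     # what the hidden units "see"
    mu1 = sigmoid(v_bar @ W1)
    mu2 = sigmoid(mu1 @ W2)
    for _ in range(n_updates):
        mu1 = sigmoid(v_bar @ W1 + mu2 @ W2.T)
        mu2 = sigmoid(mu1 @ W2)
        v_hat = sigmoid(mu1 @ W1.T)      # mean-field estimate of the visibles
        v_bar = 0.5 * (v + v_hat)        # average the data with the estimate
    return mu1, mu2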

Results

Can someone find me a suitable picture?

Multi-Inference Trick

Image: Goodfellow's defense

Setting

• Dataset: MNIST

• First layer: 500 hidden units

• Second layer: 1000 hidden units

• Minibatch size: 100 examples

• Test set: 10000 examples

• For more related results: [1]
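For concreteness, the weight shapes this setting implies for the sketches above (random values for illustration only; the real weights are learned):

import numpy as np

rng = np.random.default_rng(0)
W1 = 0.01 * rng.standard_normal((784, 500))   # MNIST pixels -> 500 hidden units
W2 = 0.01 * rng.standard_normal((500, 1000))  # 500 -> 1000 hidden units
minibatch = np.zeros((100, 784))              # one minibatch of 100 examples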

Classification

MP-DBM with 2X hidden units: 0.91% test error

Robustness

Missing inputs

Conclusions

• Simpler, more intuitive methodology for training Deep Boltzmann Machines

• Improved accuracy for approximate inference problems

References

[1] Goodfellow, Mirza, Courville, and Bengio. Multi-Prediction Deep Boltzmann Machines. NIPS 2013.

[2] Salakhutdinov and Hinton. Deep Boltzmann machines. AISTATS 2009.

[3] Hinton. Neural Networks for Machine Learning. Coursera.

Questions?
