Page 1:

DRAW: A Recurrent Neural Network for Image Generation

Brandon Marlowe - 2693414
CIS 601 - Spring 18

Page 2:

Agenda

● What is a Neural Network? (VERY briefly)
● DRAW (Deep Recurrent Attentive Writer) Overview
● Why DRAW matters
● DRAW...ing in Detail
● Experimentation and Results

Page 3:

What is a Neural Network?

● Statistical learning model inspired by the structure of the human brain
● Composed of “Neurons” (AKA, nodes)
● Consists of three main parts:
  ○ Input Layer
  ○ Hidden Layer
  ○ Output Layer

Page 4:

Extremely Simple Example: Feedforward Neural Network

(Computes the XOR Function)

Page 5:

Inputs of [1, 1] are passed into the Neural Network

Image: https://stevenmiller888.github.io/mind-how-to-build-a-neural-network/

Page 6:

Random weights are assigned to each Synapse in all layers

Image: https://stevenmiller888.github.io/mind-how-to-build-a-neural-network/

Page 7:

The weighted inputs feeding each Neuron are summed

Image: https://stevenmiller888.github.io/mind-how-to-build-a-neural-network/

Page 8:

An activation function (Sigmoid, in this case) is applied to each of the weighted sums

Image: https://stevenmiller888.github.io/mind-how-to-build-a-neural-network/

Page 9:

Example Activation Functions

Image: Özkan, C., & Erbek, F. S. (2003). The Comparison of Activation Functions for Multispectral Landsat TM Image Classification. Photogrammetric Engineering & Remote Sensing, 69(11), 1225-1234. doi:10.14358/pers.69.11.1225
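For concreteness, here is a minimal Python sketch of three common activation functions (the exact set shown in the figure may differ); NumPy and the sample values are assumptions for illustration:

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid: squashes any real input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    """Hyperbolic tangent: squashes input into (-1, 1)."""
    return np.tanh(z)

def relu(z):
    """Rectified linear unit: keeps positives, zeroes out negatives."""
    return np.maximum(0.0, z)

# Apply each function to the same weighted sums
z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(sigmoid(z))   # ~[0.12 0.38 0.5  0.62 0.88]
print(tanh(z))      # ~[-0.96 -0.46 0. 0.46 0.96]
print(relu(z))      # [0. 0. 0. 0.5 2.]
```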

Page 10:

Hidden-layer values are multiplied by the output weights and summed

Image: https://stevenmiller888.github.io/mind-how-to-build-a-neural-network/

Page 11:

Error = target - calculated = 0 - 0.77 = -0.77

Image: https://stevenmiller888.github.io/mind-how-to-build-a-neural-network/

Page 12:

The derivative of the activation function is used to propagate the error backward and adjust the weights, and the process is repeated (see the sketch below)

Image: https://stevenmiller888.github.io/mind-how-to-build-a-neural-network/
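Putting the last several slides together, here is a minimal NumPy sketch of the whole training loop: forward pass, error, and derivative-based weight adjustment. The 2-3-1 layer sizes follow the tutorial's XOR network; the learning rate, random seed, and added bias terms are assumptions made so the example actually converges:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_deriv(a):
    # Derivative of the sigmoid, written in terms of its output a = sigmoid(z)
    return a * (1.0 - a)

# XOR training data: four input pairs and their targets
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Random initial weights, as on the slides (2-3-1 network)
W1 = rng.normal(size=(2, 3))   # input -> hidden synapses
b1 = np.zeros(3)               # biases (assumed; the slide diagram omits them)
W2 = rng.normal(size=(3, 1))   # hidden -> output synapses
b2 = np.zeros(1)

lr = 1.0                       # learning rate (assumed)
for epoch in range(10000):
    # Forward pass: weighted sums, then sigmoid activations
    hidden = sigmoid(X @ W1 + b1)
    output = sigmoid(hidden @ W2 + b2)

    # Error = target - calculated, as on the slides
    error = y - output

    # Backward pass: scale the error by the activation derivative
    d_output = error * sigmoid_deriv(output)
    d_hidden = (d_output @ W2.T) * sigmoid_deriv(hidden)

    # Adjust the weights and repeat
    W2 += lr * hidden.T @ d_output
    b2 += lr * d_output.sum(axis=0)
    W1 += lr * X.T @ d_hidden
    b1 += lr * d_hidden.sum(axis=0)

print(output.round(2))         # after training, close to [[0], [1], [1], [0]]
```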

Page 13:

Recurrent Neural Networks (RNNs) vs. Feedforward Neural Networks (FNNs)

● RNNs are similar to FNNs
  ○ Main difference: RNNs are aware of previous inputs; FNNs are not

● An RNN unrolled over time can be thought of as multiple FNNs that share weights and pass state forward (see the sketch below)

Image: https://image.slidesharecdn.com/mdrnn-yandexmoscowcv-160427182305/95/multidimensional-rnn-4-638.jpg?cb=1461781453
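A minimal sketch of the difference: an RNN step feeds the previous hidden state back in alongside the new input, so unrolling it over a sequence looks like a stack of weight-sharing FNNs. All names and sizes here are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One recurrent step: the new hidden state depends on the current
    input AND the previous hidden state, which is the network's memory."""
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

# Illustrative sizes: 4-dimensional inputs, 8-dimensional hidden state
W_xh = rng.normal(size=(4, 8))
W_hh = rng.normal(size=(8, 8))
b_h = np.zeros(8)

# Unrolling over a sequence is like stacking FNNs that share weights
# and pass their hidden state forward
h = np.zeros(8)
for x_t in rng.normal(size=(5, 4)):   # a sequence of five inputs
    h = rnn_step(x_t, h, W_xh, W_hh, b_h)
```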

Page 14:

DRAW Overview

Page 15:

DRAW

● DRAW = Deep Recurrent Attentive Writer
  ○ Comprised of two Long Short-Term Memory (LSTM) Recurrent Neural Networks
    ■ Encoder RNN: compresses images
    ■ Decoder RNN: reconstitutes images
  ○ The Long Short-Term Memory architecture is composed of:
    ■ Read Gate, Write Gate, Keep/Forget Gate
● Not the first image-generation Neural Network
● Belongs to the family of Variational Autoencoders
● Mimics the behavior of the human eye
● Creates portions of scenes independently and iteratively refines them (see the sketch below)
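To make the encoder/decoder loop concrete, here is a hedged sketch of one DRAW generation pass following the paper's update equations, with the LSTMs replaced by plain tanh cells and the attentive read/write replaced by their simple whole-image variants. All sizes, the timestep count, and the initialization are assumptions for illustration, not the paper's settings:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

A = B = 28                                 # image size (MNIST-like; assumption)
n_img, n_h, n_z, T = A * B, 256, 10, 10    # layer sizes and timesteps (assumptions)

# Small random weights standing in for the trained LSTM/linear parameters
W_enc = rng.normal(scale=0.01, size=(n_h + 2 * n_img + n_h, n_h))
W_mu = rng.normal(scale=0.01, size=(n_h, n_z))
W_dec = rng.normal(scale=0.01, size=(n_h + n_z, n_h))
W_write = rng.normal(scale=0.01, size=(n_h, n_img))

x = rng.random(n_img)        # target image (flattened)
c = np.zeros(n_img)          # canvas, refined a little at each timestep
h_enc = np.zeros(n_h)        # encoder state (compresses what it reads)
h_dec = np.zeros(n_h)        # decoder state (decides what to write)

for t in range(T):
    x_hat = x - sigmoid(c)                  # error image: what is still missing
    r = np.concatenate([x, x_hat])          # "read" without attention
    h_enc = np.tanh(np.concatenate([h_enc, r, h_dec]) @ W_enc)
    z = h_enc @ W_mu + rng.normal(size=n_z) # sample a latent code from Q(z_t | h_enc)
    h_dec = np.tanh(np.concatenate([h_dec, z]) @ W_dec)
    c = c + h_dec @ W_write                 # "write": additive canvas update

generated = sigmoid(c)       # final canvas mapped to pixel intensities
```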

Page 16:

Why DRAW Matters

● Previous Autoencoders created images in a single pass
  ○ Accuracy suffered
  ○ Details were missed
  ○ Complex images posed problems
  ○ Could not create natural-looking images

● DRAW creates images iteratively
  ○ Generates complex images that cannot be distinguished from real images with the naked eye
  ○ Gradually refines each portion of the image
  ○ Substantially improves on state-of-the-art image generation models

Page 17:

Structure of DRAW in Detail

(Figure: Conventional Auto-Encoder vs. DRAW)

Images: Karol Gregor, Ivo Danihelka, Alex Graves, Daan Wierstra (2015). DRAW: A Recurrent Neural Network For Image Generation. CoRR, abs/1502.04623.

Page 18:

DRAW...ing with Attention to Detail

● Read Gate places an N x N grid of Gaussian filters on the image, centered at (gx, gy)

● δ = “stride” or “zoom” of the attention patch
  ○ A larger stride means more of the image is visible to the attention model (see the filterbank sketch below)

Images: Karol Gregor, Ivo Danihelka, Alex Graves, Daan Wierstra (2015). DRAW: A Recurrent Neural Network For Image Generation. CoRR, abs/1502.04623.
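A minimal NumPy sketch of the Gaussian filterbank behind the Read Gate, following the paper's filter-center formula; the image size, patch size N, and parameter values are illustrative assumptions, and the paper's intensity term γ is omitted:

```python
import numpy as np

def filterbank(g_x, g_y, delta, sigma2, N, A, B):
    """N x N grid of 1-D Gaussian filters; filter i is centered at
    g + (i - N/2 - 0.5) * delta, per the paper."""
    i = np.arange(1, N + 1)
    mu_x = g_x + (i - N / 2 - 0.5) * delta            # filter centers along x
    mu_y = g_y + (i - N / 2 - 0.5) * delta            # filter centers along y
    F_x = np.exp(-((np.arange(A) - mu_x[:, None]) ** 2) / (2 * sigma2))
    F_y = np.exp(-((np.arange(B) - mu_y[:, None]) ** 2) / (2 * sigma2))
    F_x /= F_x.sum(axis=1, keepdims=True)             # normalize each filter
    F_y /= F_y.sum(axis=1, keepdims=True)
    return F_x, F_y                                   # shapes (N, A), (N, B)

# A large delta "zooms out" (the grid spans more of the image); a small
# delta "zooms in" on detail around the grid center (g_x, g_y)
A = B = 28
x = np.random.default_rng(0).random((B, A))           # a stand-in image
F_x, F_y = filterbank(g_x=14, g_y=14, delta=2.0, sigma2=1.0, N=5, A=A, B=B)
patch = F_y @ x @ F_x.T                               # the 5 x 5 attended patch
```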

Page 19:

DRAW...ing with Attention to Detail

● Write Gate extracts the previous attention parameters and inverts them

● The inversion alternates focus between highly detailed and broad views of the image (see the sketch below)

Images: Karol Gregor, Ivo Danihelka, Alex Graves, Daan Wierstra (2015). DRAW: A Recurrent Neural Network For Image Generation. CoRR, abs/1502.04623.
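Continuing the read sketch above, a minimal illustration of the inversion: writing applies the transposed filterbanks, painting the decoder's small patch back onto the full canvas. The placeholder filterbanks and sizes are assumptions (in DRAW they come from the Write Gate's own attention parameters, built as in filterbank() above):

```python
import numpy as np

rng = np.random.default_rng(1)

# Placeholder filterbanks standing in for filterbank() output
N, A, B = 5, 28, 28
F_x, F_y = rng.random((N, A)), rng.random((N, B))

w = rng.random((N, N))            # small patch emitted by the decoder
canvas_update = F_y.T @ w @ F_x   # (B, A): painted onto the full canvas
```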


Page 20:

DRAW...ing with Attention to Detail

DRAW recreating images from the MNIST dataset

Image: Karol Gregor, Ivo Danihelka, Alex Graves, Daan Wierstra (2015). DRAW: A Recurrent Neural Network For Image Generation. CoRR, abs/1502.04623.

Page 21:

Experimentation

● Three sets of training data were used:
  ○ MNIST (Modified National Institute of Standards and Technology database)
    ■ Database of handwritten digits
  ○ SVHN (Street View House Numbers)
    ■ Database of images containing house numbers
  ○ CIFAR-10 (Canadian Institute For Advanced Research - 10 classes)
    ■ Database containing 10 classes of vehicles and animals

● Experiment consisted of:
  ○ Classifying MNIST images
  ○ Generating MNIST images
  ○ Generating SVHN images
  ○ Generating CIFAR-10 images

Page 22:

Classifying MNIST

● MNIST 100 x 100 Clutter Classification
  ○ 100 x 100 pixel images contained digit-like fragments
  ○ DRAW was tasked with identifying digits
  ○ The model was given a fixed number of “glimpses”
    ■ Each glimpse is 12 x 12 pixels in size
  ○ DRAW compared with RAM (Recurrent Attention Model)
    ■ DRAW uses ¼ of the attention patches RAM uses

Images: Karol Gregor, Ivo Danihelka, Alex Graves, Daan Wierstra (2015). DRAW: A Recurrent Neural Network For Image Generation. CoRR, abs/1502.04623.

Page 23:

Generating MNIST

● DRAW tasked with generating MNIST-like digits
  ○ MNIST is widely used, allowing DRAW to be easily compared with other models

● Trained on the MNIST dataset
● Generation with vs. without selective attention was also compared

(Figures: all images generated by DRAW, except the rightmost column, which shows training set images; negative log-likelihood, lower is better)

Images: Karol Gregor, Ivo Danihelka, Alex Graves, Daan Wierstra (2015). DRAW: A Recurrent Neural Network For Image Generation. CoRR, abs/1502.04623.

Page 24:

Generating SVHN

● DRAW trained on 64 x 64 pixel images of house numbers

● 231,053 images in dataset

● 4,701 validation images

Sequence of drawing SVHN digits

(All images generated by DRAW, except the rightmost column, which shows training set images)

Images: Karol Gregor, Ivo Danihelka, Alex Graves, Daan Wierstra (2015). DRAW: A Recurrent Neural Network For Image Generation. CoRR, abs/1502.04623.

Page 25:

Generating CIFAR-10

● DRAW trained on 50,000 images
  ○ A small training sample considering the diversity of the images
● Still able to capture a good portion of the detail

(All images generated by DRAW, except the rightmost column, which shows training set images)

Images: Karol Gregor, Ivo Danihelka, Alex Graves, Daan Wierstra (2015). DRAW: A Recurrent Neural Network For Image Generation. CoRR, abs/1502.04623.

Page 26:

DRAW in Action

Image: Karol Gregor, Ivo Danihelka, Alex Graves, Daan Wierstra (2015). DRAW: A Recurrent Neural Network For Image Generation. CoRR, abs/1502.04623.

Page 27:

Sources

● Karol Gregor, Ivo Danihelka, Alex Graves, Daan Wierstra (2015). DRAW: A Recurrent Neural Network For Image Generation. CoRR, abs/1502.04623.

● Özkan, C., & Erbek, F. S. (2003). The Comparison of Activation Functions for Multispectral Landsat TM Image Classification. Photogrammetric Engineering & Remote Sensing, 69(11), 1225-1234. doi:10.14358/pers.69.11.1225

● https://stevenmiller888.github.io/mind-how-to-build-a-neural-network/