Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic...

104
Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1 / 53

Transcript of Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic...

Page 1: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Artistic Style Transfer

Saifuddin Syed

MLRG Fall 2016

1 / 53

Page 2: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Introduction

Outline

1 Introduction

2 Review of CNNVGG Network

3 The Gatys et al ConstructionContent RepresentationStyle RepresentationImage ConstructionExamples

4 Alternative MethodsMRF ConstructionExamples

5 AST For videoExample

2 / 53

Page 3: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Introduction

Introduction

Suppose I give you a piece of art and a photo. I ask you to recreatethe photo in the style of the art. How would one go about doing it?

This is a very difficult task for humans, even talented ones.

3 / 53

Page 4: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Introduction

Introduction

Suppose I give you a piece of art and a photo. I ask you to recreatethe photo in the style of the art. How would one go about doing it?

This is a very difficult task for humans, even talented ones.

3 / 53

Page 5: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Introduction

Introduction

Suppose I give you a piece of art and a photo. I ask you to recreatethe photo in the style of the art. How would one go about doing it?

This is a very difficult task for humans, even talented ones.

3 / 53

Page 6: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Introduction

Our goal is to teach a computer to do exactly this.

4 / 53

Page 7: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Review of CNN

Outline

1 Introduction

2 Review of CNNVGG Network

3 The Gatys et al ConstructionContent RepresentationStyle RepresentationImage ConstructionExamples

4 Alternative MethodsMRF ConstructionExamples

5 AST For videoExample

5 / 53

Page 8: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Review of CNN

CNN Overview

Convolutional neural networks (CNN) are a type of neural networkwhich have been widely used for image recognition tasks.

We input an image and each layer applies a set of filters thatidentify local features in the network.

Typically the deeper we go in the network, high level content isidentified as opposed to just pixel values.

6 / 53

Page 9: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Review of CNN

CNN Overview

Convolutional neural networks (CNN) are a type of neural networkwhich have been widely used for image recognition tasks.

We input an image and each layer applies a set of filters thatidentify local features in the network.

Typically the deeper we go in the network, high level content isidentified as opposed to just pixel values.

6 / 53

Page 10: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Review of CNN

CNN Overview

Convolutional neural networks (CNN) are a type of neural networkwhich have been widely used for image recognition tasks.

We input an image and each layer applies a set of filters thatidentify local features in the network.

Typically the deeper we go in the network, high level content isidentified as opposed to just pixel values.

6 / 53

Page 11: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Review of CNN

CNN Overview

7 / 53

Page 12: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Review of CNN

Feature Maps

Suppose layer l of the network has Nl filters, we will refer thecollection of filtered images the feature maps at layer l .

8 / 53

Page 13: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Review of CNN

VGG Network

VGG Network

The content generated in this talk was done using the VGG-19network without the fully connected layer. Implementations inLenet and Resnet also exist.

Properties of VGG.

Won ImageNet with a 7.3% top 5 error rate.

Only 3x3 Conv stride 1, pad 1

2x2 MAX POOL stride 2

140 Million parameters

9 / 53

Page 14: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Review of CNN

VGG Network

VGG Network

The content generated in this talk was done using the VGG-19network without the fully connected layer. Implementations inLenet and Resnet also exist.

Properties of VGG.

Won ImageNet with a 7.3% top 5 error rate.

Only 3x3 Conv stride 1, pad 1

2x2 MAX POOL stride 2

140 Million parameters

9 / 53

Page 15: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Outline

1 Introduction

2 Review of CNNVGG Network

3 The Gatys et al ConstructionContent RepresentationStyle RepresentationImage ConstructionExamples

4 Alternative MethodsMRF ConstructionExamples

5 AST For videoExample

10 / 53

Page 16: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Artistic Style Transfer

We will now outline the procedure of Gatys, Ecker, Bethge. (Sept2015).

The key finding of this paper is that the representations of“content” and “style” in the Convolutional Neural Network areseparable.

We will make precise what we mean by style and content, but first,let us set up the problem formally.

11 / 53

Page 17: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Artistic Style Transfer

We will now outline the procedure of Gatys, Ecker, Bethge. (Sept2015).

The key finding of this paper is that the representations of“content” and “style” in the Convolutional Neural Network areseparable.

We will make precise what we mean by style and content, but first,let us set up the problem formally.

11 / 53

Page 18: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Artistic Style Transfer

We will now outline the procedure of Gatys, Ecker, Bethge. (Sept2015).

The key finding of this paper is that the representations of“content” and “style” in the Convolutional Neural Network areseparable.

We will make precise what we mean by style and content, but first,let us set up the problem formally.

11 / 53

Page 19: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Notation

Our aim to is to construct an image x with the content of image pin the style of image a.

We suppose that in our network, layer l has Nl filters, each withspatial dimension Ml (the product of its width and height).

12 / 53

Page 20: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Notation

Our aim to is to construct an image x with the content of image pin the style of image a.

We suppose that in our network, layer l has Nl filters, each withspatial dimension Ml (the product of its width and height).

12 / 53

Page 21: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Notation

Let Φl(·) the function implemented by the part of theconvolutional network from input up to the layer l .

The feature maps extracted by the network from the original imagep, the style image a and the stylized image x we denote by Pl , Sl ,and Fl respectively.

Pl = Φl(p), Sl = Φl(a), Fl = Φl(x)

The dimensionality of these feature maps is Nl ×Ml .

13 / 53

Page 22: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Notation

Let Φl(·) the function implemented by the part of theconvolutional network from input up to the layer l .

The feature maps extracted by the network from the original imagep, the style image a and the stylized image x we denote by Pl , Sl ,and Fl respectively.

Pl = Φl(p), Sl = Φl(a), Fl = Φl(x)

The dimensionality of these feature maps is Nl ×Ml .

13 / 53

Page 23: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Notation

Let Φl(·) the function implemented by the part of theconvolutional network from input up to the layer l .

The feature maps extracted by the network from the original imagep, the style image a and the stylized image x we denote by Pl , Sl ,and Fl respectively.

Pl = Φl(p), Sl = Φl(a), Fl = Φl(x)

The dimensionality of these feature maps is Nl ×Ml .

13 / 53

Page 24: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Notation

Let Φl(·) the function implemented by the part of theconvolutional network from input up to the layer l .

The feature maps extracted by the network from the original imagep, the style image a and the stylized image x we denote by Pl , Sl ,and Fl respectively.

Pl = Φl(p), Sl = Φl(a), Fl = Φl(x)

The dimensionality of these feature maps is Nl ×Ml .

13 / 53

Page 25: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Content Representation

Content Representation

Each layer aims to learn a different aspect of the image content. Itis reasonable to assume that two images with similar contentshould have similar feature maps at each layer.

We will say x matches the content of p at layer l , if their featureresponses at layer l of the network are the same.

14 / 53

Page 26: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Content Representation

Content Representation

Each layer aims to learn a different aspect of the image content. Itis reasonable to assume that two images with similar contentshould have similar feature maps at each layer.

We will say x matches the content of p at layer l , if their featureresponses at layer l of the network are the same.

14 / 53

Page 27: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Content Representation

Content Loss

Let F lij and P l

ij be the j th position of filter i in layer l of thenetwork.

We define the content loss at layer l to be,

Llc(x,p) =1

2NlMl‖Φl(x)− Φl(p)‖22

=1

2NlMl

∑i ,j

|F lij − P l

ij |2

We define our content reconstruction xlc to be

xlc = argminxLlc(x,p)

15 / 53

Page 28: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Content Representation

Content Loss

Let F lij and P l

ij be the j th position of filter i in layer l of thenetwork.

We define the content loss at layer l to be,

Llc(x,p) =1

2NlMl‖Φl(x)− Φl(p)‖22

=1

2NlMl

∑i ,j

|F lij − P l

ij |2

We define our content reconstruction xlc to be

xlc = argminxLlc(x,p)

15 / 53

Page 29: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Content Representation

Content Loss

Let F lij and P l

ij be the j th position of filter i in layer l of thenetwork.

We define the content loss at layer l to be,

Llc(x,p) =1

2NlMl‖Φl(x)− Φl(p)‖22

=1

2NlMl

∑i ,j

|F lij − P l

ij |2

We define our content reconstruction xlc to be

xlc = argminxLlc(x,p)

15 / 53

Page 30: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Content Representation

Content Loss

We have Llc satisfies

∂Llc∂F l

ij

=

{(Fl − Pl)ij F l

ij > 0

0 F lij < 0

We can use back propagation and descent methods to iterativelyminimize Llc and learn xlc .

Normally x is initialized as a Gaussian white noise.

16 / 53

Page 31: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Content Representation

Content Loss

We have Llc satisfies

∂Llc∂F l

ij

=

{(Fl − Pl)ij F l

ij > 0

0 F lij < 0

We can use back propagation and descent methods to iterativelyminimize Llc and learn xlc .

Normally x is initialized as a Gaussian white noise.

16 / 53

Page 32: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Content Representation

Content Loss

We have Llc satisfies

∂Llc∂F l

ij

=

{(Fl − Pl)ij F l

ij > 0

0 F lij < 0

We can use back propagation and descent methods to iterativelyminimize Llc and learn xlc .

Normally x is initialized as a Gaussian white noise.

16 / 53

Page 33: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Content Representation

Content Reconstruction

17 / 53

Page 34: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Content Representation

Content Reconstruction

Higher layers in the network capture the high-level content interms of objects and their arrangement in the input image but donot constrain the exact pixel values of the reconstruction.

In contrast, reconstructions from the lower layers simply reproducethe exact pixel values of the original image.

The feature responses in higher layers better encode the content ofthe image.

18 / 53

Page 35: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Content Representation

Content Reconstruction

Higher layers in the network capture the high-level content interms of objects and their arrangement in the input image but donot constrain the exact pixel values of the reconstruction.

In contrast, reconstructions from the lower layers simply reproducethe exact pixel values of the original image.

The feature responses in higher layers better encode the content ofthe image.

18 / 53

Page 36: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Content Representation

Content Reconstruction

Higher layers in the network capture the high-level content interms of objects and their arrangement in the input image but donot constrain the exact pixel values of the reconstruction.

In contrast, reconstructions from the lower layers simply reproducethe exact pixel values of the original image.

The feature responses in higher layers better encode the content ofthe image.

18 / 53

Page 37: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Style Representation

Style Representation

The feature responses of an image a at layer l encode the content,however to determine style we are less interested in any individualfeature of our image but rather how they all relate to each other.

The style consists of the correlations between the different featureresponses.

We will say x matches the style of a at layer l , if the correlationsbetween their feature maps at layer l of the network are the same.

This was the main insight of Gatys, et al.

19 / 53

Page 38: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Style Representation

Style Representation

The feature responses of an image a at layer l encode the content,however to determine style we are less interested in any individualfeature of our image but rather how they all relate to each other.

The style consists of the correlations between the different featureresponses.

We will say x matches the style of a at layer l , if the correlationsbetween their feature maps at layer l of the network are the same.

This was the main insight of Gatys, et al.

19 / 53

Page 39: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Style Representation

Style Representation

The feature responses of an image a at layer l encode the content,however to determine style we are less interested in any individualfeature of our image but rather how they all relate to each other.

The style consists of the correlations between the different featureresponses.

We will say x matches the style of a at layer l , if the correlationsbetween their feature maps at layer l of the network are the same.

This was the main insight of Gatys, et al.

19 / 53

Page 40: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Style Representation

Style Representation

The feature responses of an image a at layer l encode the content,however to determine style we are less interested in any individualfeature of our image but rather how they all relate to each other.

The style consists of the correlations between the different featureresponses.

We will say x matches the style of a at layer l , if the correlationsbetween their feature maps at layer l of the network are the same.

This was the main insight of Gatys, et al.

19 / 53

Page 41: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Style Representation

Style Representation

We will encode the correlations of the feature maps into theGraham Matrices,

Alij = Sl

i• · Slj• =

Ml∑k=1

S likS

ljk

G lij = Fl

i• · Flj• =

Ml∑k=1

F likF

ljk

Al and Gl define a Nl × Nl dimension matrix, where Nl is thenumber of filters in layer l .

20 / 53

Page 42: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Style Representation

Style Loss

We define the style loss at layer l to be,

Lls(x, a) =1

4N2l M

2l

‖Gl − Al‖2F

=1

4N2l M

2l

∑i ,j

|G lij − Al

ij |2

We define our style reconstruction xls to be

xls = argminxLls(x, a)

21 / 53

Page 43: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Style Representation

Style Loss

We define the style loss at layer l to be,

Lls(x, a) =1

4N2l M

2l

‖Gl − Al‖2F

=1

4N2l M

2l

∑i ,j

|G lij − Al

ij |2

We define our style reconstruction xls to be

xls = argminxLls(x, a)

21 / 53

Page 44: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Style Representation

Style Loss

Similar to the content loss, we have Lls satisfies

∂Lls∂F l

ij

=

{1

N2l M

2l

((Fl)T (Gl − Al))ij Flij > 0

0 Flij < 0

We can use back propagation and descent methods to iterativelyminimize Lls and learn xls

22 / 53

Page 45: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Style Representation

Style Loss

Similar to the content loss, we have Lls satisfies

∂Lls∂F l

ij

=

{1

N2l M

2l

((Fl)T (Gl − Al))ij Flij > 0

0 Flij < 0

We can use back propagation and descent methods to iterativelyminimize Lls and learn xls

22 / 53

Page 46: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Style Representation

Style Reconstruction

23 / 53

Page 47: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Style Representation

Style Reconstruction

Note the size and complexity of local image structures from theinput image increases along the hierarchy.

Heuristically, the higher layers learn more complex features thanlower layers, and produce a more detailed style representation.

24 / 53

Page 48: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Style Representation

Style Reconstruction

Note the size and complexity of local image structures from theinput image increases along the hierarchy.

Heuristically, the higher layers learn more complex features thanlower layers, and produce a more detailed style representation.

24 / 53

Page 49: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Image Construction

Image Construction

We now want to combine the content and style constructionsoutlined previously to develop an image x which simultaneouslytries to match the content of p with the style of a.

We will define our content (style) loss as the weighted average ofthe style (content) loss at each layer.

Lc(x,p) =∑l

αlLlc(x,p)

Ls(x, a) =∑l

βlLls(x,p)

Often we take the αl = 0 for low l , and βl = 1.

25 / 53

Page 50: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Image Construction

Image Construction

We now want to combine the content and style constructionsoutlined previously to develop an image x which simultaneouslytries to match the content of p with the style of a.

We will define our content (style) loss as the weighted average ofthe style (content) loss at each layer.

Lc(x,p) =∑l

αlLlc(x,p)

Ls(x, a) =∑l

βlLls(x,p)

Often we take the αl = 0 for low l , and βl = 1.

25 / 53

Page 51: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Image Construction

Image Construction

We now want to combine the content and style constructionsoutlined previously to develop an image x which simultaneouslytries to match the content of p with the style of a.

We will define our content (style) loss as the weighted average ofthe style (content) loss at each layer.

Lc(x,p) =∑l

αlLlc(x,p)

Ls(x, a) =∑l

βlLls(x,p)

Often we take the αl = 0 for low l , and βl = 1.

25 / 53

Page 52: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Image Construction

Image Construction

To match the content we need to minimize Lc and to match thestyle we need to minimize Ls . Therefore we will minimize bothsimultaneously by minimizing

L(x,p, a) = αLc(x,p) + βLs(x, a).

and we define the image x∗ as,

x∗ = argminxL(x,p, a).

26 / 53

Page 53: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Image Construction

Image Construction

To match the content we need to minimize Lc and to match thestyle we need to minimize Ls . Therefore we will minimize bothsimultaneously by minimizing

L(x,p, a) = αLc(x,p) + βLs(x, a).

and we define the image x∗ as,

x∗ = argminxL(x,p, a).

26 / 53

Page 54: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Image Construction

Image Construction

L(x,p, a) = αLc(x,p) + βLs(x, a).

The constants α and β dictate how much preference we give tocontent matching vs style matching.

If αβ is high, we favour matching the content more than the style.

If αβ is low, we favour matching the style more than the content.

27 / 53

Page 55: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Image Construction

Image Construction

L(x,p, a) = αLc(x,p) + βLs(x, a).

The constants α and β dictate how much preference we give tocontent matching vs style matching.

If αβ is high, we favour matching the content more than the style.

If αβ is low, we favour matching the style more than the content.

27 / 53

Page 56: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Image Construction

Image Construction

L(x,p, a) = αLc(x,p) + βLs(x, a).

The constants α and β dictate how much preference we give tocontent matching vs style matching.

If αβ is high, we favour matching the content more than the style.

If αβ is low, we favour matching the style more than the content.

27 / 53

Page 57: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Image Construction

Figure: Columns: αβ . Row: Layer of network

28 / 53

Page 58: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Image Construction

Examples

Figure: Left: Neckarfront in Tubingen, Germany, B: The Shipwreck ofthe Minotaur by J.M.W. Turner, 1805

29 / 53

Page 59: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Image Construction

Examples

Figure: Left: Neckarfront in Tubingen, Germany, C: The Starry Night byVincent van Gogh, 1889

30 / 53

Page 60: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Image Construction

Examples

Figure: Left: Neckarfront in Tubingen, Germany, D: Der Schrei byEdvard Munch, 1893

31 / 53

Page 61: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Image Construction

Examples

Figure: Left: Neckarfront in Tubingen, Germany, E: Femme nue assiseby Pablo Picasso, 1910

32 / 53

Page 62: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Image Construction

Examples

Figure: Left: Neckarfront in Tubingen, Germany, F: Composition VII byWassily Kandinsky, 1913

33 / 53

Page 63: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

The Gatys et al Construction

Image Construction

Examples

Figure: Left: My friend Grant, Right: Grant as a pizza

34 / 53

Page 64: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Alternative Methods

Outline

1 Introduction

2 Review of CNNVGG Network

3 The Gatys et al ConstructionContent RepresentationStyle RepresentationImage ConstructionExamples

4 Alternative MethodsMRF ConstructionExamples

5 AST For videoExample

35 / 53

Page 65: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Alternative Methods

Alternative Methods

Following the success of Gatys et. al., many people began to refineand improve upon their method.

Some examples include:

“Combining Markov Random Fields and Convolutional NeuralNetworks for Image Synthesis” (Li and Wand, Jan 2016)

“Perceptual Losses for Real-Time Style Transfer andSuper-Resolution” (Johnson, et al, Mar 2016)

“Semantic Style Transfer and Turning Two-Bit Doodles intoFine Artwork”(Champandard, Mar 2016)

We will outline the work of Li and Wand.

36 / 53

Page 66: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Alternative Methods

Alternative Methods

Following the success of Gatys et. al., many people began to refineand improve upon their method.

Some examples include:

“Combining Markov Random Fields and Convolutional NeuralNetworks for Image Synthesis” (Li and Wand, Jan 2016)

“Perceptual Losses for Real-Time Style Transfer andSuper-Resolution” (Johnson, et al, Mar 2016)

“Semantic Style Transfer and Turning Two-Bit Doodles intoFine Artwork”(Champandard, Mar 2016)

We will outline the work of Li and Wand.

36 / 53

Page 67: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Alternative Methods

Alternative Methods

Following the success of Gatys et. al., many people began to refineand improve upon their method.

Some examples include:

“Combining Markov Random Fields and Convolutional NeuralNetworks for Image Synthesis” (Li and Wand, Jan 2016)

“Perceptual Losses for Real-Time Style Transfer andSuper-Resolution” (Johnson, et al, Mar 2016)

“Semantic Style Transfer and Turning Two-Bit Doodles intoFine Artwork”(Champandard, Mar 2016)

We will outline the work of Li and Wand.

36 / 53

Page 68: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Alternative Methods

MRF Construction

MRF construction of Li and Wand

Their method was analogous to that of Gatys with a different stylerepresentation.

Following Gatys, et al. Li and Wand, tried to match the contentand style simultaneously by trying to minimize a linear combinationof content and style loss function in addition to a regularizer.

LMRF (x,p, a) = αLc(x,p) + βL̃s(x, a) + λR(x)

Where Lc is the same content loss function used in the Gatysconstruction.

37 / 53

Page 69: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Alternative Methods

MRF Construction

MRF construction of Li and Wand

Their method was analogous to that of Gatys with a different stylerepresentation.

Following Gatys, et al. Li and Wand, tried to match the contentand style simultaneously by trying to minimize a linear combinationof content and style loss function in addition to a regularizer.

LMRF (x,p, a) = αLc(x,p) + βL̃s(x, a) + λR(x)

Where Lc is the same content loss function used in the Gatysconstruction.

37 / 53

Page 70: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Alternative Methods

MRF Construction

MRF construction of Li and Wand

Their method was analogous to that of Gatys with a different stylerepresentation.

Following Gatys, et al. Li and Wand, tried to match the contentand style simultaneously by trying to minimize a linear combinationof content and style loss function in addition to a regularizer.

LMRF (x,p, a) = αLc(x,p) + βL̃s(x, a) + λR(x)

Where Lc is the same content loss function used in the Gatysconstruction.

37 / 53

Page 71: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Alternative Methods

MRF Construction

Style Loss

Li and Wand also noted that the style of an image is encoded inthe feature maps in a given layer of the network. Where theirapproach is differs is in how they extract it.

Unlike using correlations and Graham matrices, they noted thatthe “style” can encoded locally via patches in the feature maps ina given layer l .

The patches are of dimension k × k ×Nl , where k is the width andheight of the patch (typically k is small) and Nl is the number offilters in layer l .

Our goal will be to match patches of Φl(x) to Φl(a) in some layerl .

38 / 53

Page 72: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Alternative Methods

MRF Construction

Style Loss

Li and Wand also noted that the style of an image is encoded inthe feature maps in a given layer of the network. Where theirapproach is differs is in how they extract it.

Unlike using correlations and Graham matrices, they noted thatthe “style” can encoded locally via patches in the feature maps ina given layer l .

The patches are of dimension k × k ×Nl , where k is the width andheight of the patch (typically k is small) and Nl is the number offilters in layer l .

Our goal will be to match patches of Φl(x) to Φl(a) in some layerl .

38 / 53

Page 73: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Alternative Methods

MRF Construction

Style Loss

Li and Wand also noted that the style of an image is encoded inthe feature maps in a given layer of the network. Where theirapproach is differs is in how they extract it.

Unlike using correlations and Graham matrices, they noted thatthe “style” can encoded locally via patches in the feature maps ina given layer l .

The patches are of dimension k × k ×Nl , where k is the width andheight of the patch (typically k is small) and Nl is the number offilters in layer l .

Our goal will be to match patches of Φl(x) to Φl(a) in some layerl .

38 / 53

Page 74: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Alternative Methods

MRF Construction

Style Loss

Li and Wand also noted that the style of an image is encoded inthe feature maps in a given layer of the network. Where theirapproach is differs is in how they extract it.

Unlike using correlations and Graham matrices, they noted thatthe “style” can encoded locally via patches in the feature maps ina given layer l .

The patches are of dimension k × k ×Nl , where k is the width andheight of the patch (typically k is small) and Nl is the number offilters in layer l .

Our goal will be to match patches of Φl(x) to Φl(a) in some layerl .

38 / 53

Page 75: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Alternative Methods

MRF Construction

Style Loss

Let Ψ(Φl(x)) = {Ψi (Φl(x))}mli=1 be an ordered list of all local

patches extracted from Φl(x).

Given i , we define NN(i) to be the position of the patch inΨ(Φl(a)) that deviates the most from from Ψi (Φl(x)). I.e.

NN(i) = argminjΨi (Φl(x)) ·Ψj(Φl(a))

|Ψi (Φl(x))| · |Ψj(Φl(a))|

So we define L̃s to be

L̃s(x, a) =

ml∑i=1

‖Ψi (Φl(x))−ΨNN(i)(Φl(a))‖2

39 / 53

Page 76: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Alternative Methods

MRF Construction

Style Loss

Let Ψ(Φl(x)) = {Ψi (Φl(x))}mli=1 be an ordered list of all local

patches extracted from Φl(x).

Given i , we define NN(i) to be the position of the patch inΨ(Φl(a)) that deviates the most from from Ψi (Φl(x)). I.e.

NN(i) = argminjΨi (Φl(x)) ·Ψj(Φl(a))

|Ψi (Φl(x))| · |Ψj(Φl(a))|

So we define L̃s to be

L̃s(x, a) =

ml∑i=1

‖Ψi (Φl(x))−ΨNN(i)(Φl(a))‖2

39 / 53

Page 77: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Alternative Methods

MRF Construction

Style Loss

Let Ψ(Φl(x)) = {Ψi (Φl(x))}mli=1 be an ordered list of all local

patches extracted from Φl(x).

Given i , we define NN(i) to be the position of the patch inΨ(Φl(a)) that deviates the most from from Ψi (Φl(x)). I.e.

NN(i) = argminjΨi (Φl(x)) ·Ψj(Φl(a))

|Ψi (Φl(x))| · |Ψj(Φl(a))|

So we define L̃s to be

L̃s(x, a) =

ml∑i=1

‖Ψi (Φl(x))−ΨNN(i)(Φl(a))‖2

39 / 53

Page 78: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Alternative Methods

MRF Construction

Regularizer

There is significant amount of low-level image informationdiscarded during the training of the network. In consequence,reconstructing an image can often be noisy and unnatural.

To correct this we add a regularizer that encourages smoothness inthe final image.

We defining the discrete gradient of x as

∆xi ,j = (xi+1,j − xi ,j , xi ,j+1 − xi ,j).

There is smoothness in the image when ‖∆x‖22 is small, so we let

R(x) = ‖∆x‖22 =∑i ,j

(xi+1,j − xi ,j)2 + (xi ,j+1 − xi ,j)

2

40 / 53

Page 79: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Alternative Methods

MRF Construction

Regularizer

There is significant amount of low-level image informationdiscarded during the training of the network. In consequence,reconstructing an image can often be noisy and unnatural.

To correct this we add a regularizer that encourages smoothness inthe final image.

We defining the discrete gradient of x as

∆xi ,j = (xi+1,j − xi ,j , xi ,j+1 − xi ,j).

There is smoothness in the image when ‖∆x‖22 is small, so we let

R(x) = ‖∆x‖22 =∑i ,j

(xi+1,j − xi ,j)2 + (xi ,j+1 − xi ,j)

2

40 / 53

Page 80: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Alternative Methods

MRF Construction

Regularizer

There is significant amount of low-level image informationdiscarded during the training of the network. In consequence,reconstructing an image can often be noisy and unnatural.

To correct this we add a regularizer that encourages smoothness inthe final image.

We defining the discrete gradient of x as

∆xi ,j = (xi+1,j − xi ,j , xi ,j+1 − xi ,j).

There is smoothness in the image when ‖∆x‖22 is small, so we let

R(x) = ‖∆x‖22 =∑i ,j

(xi+1,j − xi ,j)2 + (xi ,j+1 − xi ,j)

2

40 / 53

Page 81: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Alternative Methods

MRF Construction

Regularizer

There is significant amount of low-level image informationdiscarded during the training of the network. In consequence,reconstructing an image can often be noisy and unnatural.

To correct this we add a regularizer that encourages smoothness inthe final image.

We defining the discrete gradient of x as

∆xi ,j = (xi+1,j − xi ,j , xi ,j+1 − xi ,j).

There is smoothness in the image when ‖∆x‖22 is small, so we let

R(x) = ‖∆x‖22 =∑i ,j

(xi+1,j − xi ,j)2 + (xi ,j+1 − xi ,j)

2

40 / 53

Page 82: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Alternative Methods

Examples

Examples

41 / 53

Page 83: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Alternative Methods

Examples

Examples

42 / 53

Page 84: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Alternative Methods

Examples

Examples

43 / 53

Page 85: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Alternative Methods

Examples

Examples

44 / 53

Page 86: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

Alternative Methods

Examples

Examples

Figure: Example of Gatys, et al performing better

45 / 53

Page 87: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

AST For video

Outline

1 Introduction

2 Review of CNNVGG Network

3 The Gatys et al ConstructionContent RepresentationStyle RepresentationImage ConstructionExamples

4 Alternative MethodsMRF ConstructionExamples

5 AST For videoExample

46 / 53

Page 88: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

AST For video

Artistic Style Transfer for Video

The natural extension to doing artistic style transfer for an imageis to do artistic style transfer for a video.

Ruder, Dosovitskiy, and Brox in April 2016 did exactly this. Wewill now outline their construction.

The two main issues we will have to deal with is the initializationof the optimization procedure and the temporal consistencybetween frames.

47 / 53

Page 89: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

AST For video

Artistic Style Transfer for Video

The natural extension to doing artistic style transfer for an imageis to do artistic style transfer for a video.

Ruder, Dosovitskiy, and Brox in April 2016 did exactly this. Wewill now outline their construction.

The two main issues we will have to deal with is the initializationof the optimization procedure and the temporal consistencybetween frames.

47 / 53

Page 90: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

AST For video

Artistic Style Transfer for Video

The natural extension to doing artistic style transfer for an imageis to do artistic style transfer for a video.

Ruder, Dosovitskiy, and Brox in April 2016 did exactly this. Wewill now outline their construction.

The two main issues we will have to deal with is the initializationof the optimization procedure and the temporal consistencybetween frames.

47 / 53

Page 91: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

AST For video

Notation

We use the following notation: Let p be the content video withframes pi , and a be the style image. We want to create a video xsuch that each frame xi has the content of pi and style a.

Our goal will be to determine xi in chronological order. We willalso denote xi0 to be the initialization of in the optimizationprocedure to determine xi .

48 / 53

Page 92: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

AST For video

Notation

We use the following notation: Let p be the content video withframes pi , and a be the style image. We want to create a video xsuch that each frame xi has the content of pi and style a.

Our goal will be to determine xi in chronological order. We willalso denote xi0 to be the initialization of in the optimizationprocedure to determine xi .

48 / 53

Page 93: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

AST For video

Naive Method

The first natural thing to attempt is to apply your favourite artisticstyle routine to each frame individually.

However, the optimization procedure is not perfect and due to therandom initialization, different frames fall into different localminima. This results in flickering and lack of continuity betweenframes.

The next natural step would be to initialize the optimizationprocedure by xi0 = xi−1.

If there is motion in the scene, this simple approach does notperform well since moving objects are initialized incorrectly.

49 / 53

Page 94: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

AST For video

Naive Method

The first natural thing to attempt is to apply your favourite artisticstyle routine to each frame individually.

However, the optimization procedure is not perfect and due to therandom initialization, different frames fall into different localminima. This results in flickering and lack of continuity betweenframes.

The next natural step would be to initialize the optimizationprocedure by xi0 = xi−1.

If there is motion in the scene, this simple approach does notperform well since moving objects are initialized incorrectly.

49 / 53

Page 95: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

AST For video

Naive Method

The first natural thing to attempt is to apply your favourite artisticstyle routine to each frame individually.

However, the optimization procedure is not perfect and due to therandom initialization, different frames fall into different localminima. This results in flickering and lack of continuity betweenframes.

The next natural step would be to initialize the optimizationprocedure by xi0 = xi−1.

If there is motion in the scene, this simple approach does notperform well since moving objects are initialized incorrectly.

49 / 53

Page 96: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

AST For video

Optical Flow to the rescue

The optical flow in a the content video p between frame j to i(denoted by w i

j ) is a function that warps a given image pj using

the optical flow field that was estimated between frame pj and pi .

Intuitively, it can be thought of as a function that predicts frame iin p given frame j . I.e.

w ij (pj) ≈ pi .

Optical flow can be computed via two state-of-the-art optical flowestimation algorithms: DeepFlow and EpicFlow.

Given the optical flow of w ii−1 between frame pi−1 and pi of the

content video, we can initialize xi0 via

xi0 = w ii−1(xi−1).

50 / 53

Page 97: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

AST For video

Optical Flow to the rescue

The optical flow in a the content video p between frame j to i(denoted by w i

j ) is a function that warps a given image pj using

the optical flow field that was estimated between frame pj and pi .

Intuitively, it can be thought of as a function that predicts frame iin p given frame j . I.e.

w ij (pj) ≈ pi .

Optical flow can be computed via two state-of-the-art optical flowestimation algorithms: DeepFlow and EpicFlow.

Given the optical flow of w ii−1 between frame pi−1 and pi of the

content video, we can initialize xi0 via

xi0 = w ii−1(xi−1).

50 / 53

Page 98: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

AST For video

Optical Flow to the rescue

The optical flow in a the content video p between frame j to i(denoted by w i

j ) is a function that warps a given image pj using

the optical flow field that was estimated between frame pj and pi .

Intuitively, it can be thought of as a function that predicts frame iin p given frame j . I.e.

w ij (pj) ≈ pi .

Optical flow can be computed via two state-of-the-art optical flowestimation algorithms: DeepFlow and EpicFlow.

Given the optical flow of w ii−1 between frame pi−1 and pi of the

content video, we can initialize xi0 via

xi0 = w ii−1(xi−1).

50 / 53

Page 99: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

AST For video

Optical Flow to the rescue

The optical flow in a the content video p between frame j to i(denoted by w i

j ) is a function that warps a given image pj using

the optical flow field that was estimated between frame pj and pi .

Intuitively, it can be thought of as a function that predicts frame iin p given frame j . I.e.

w ij (pj) ≈ pi .

Optical flow can be computed via two state-of-the-art optical flowestimation algorithms: DeepFlow and EpicFlow.

Given the optical flow of w ii−1 between frame pi−1 and pi of the

content video, we can initialize xi0 via

xi0 = w ii−1(xi−1). 50 / 53

Page 100: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

AST For video

Temporal consistency

To force extra temporal continuity between frames, we will add aregularizer that rewards consecutive frames that are consistentwith each other.

We define the temporal loss to be

Ltemp(x,w, c) =1

D

D∑k=1

ck(xk − wk)2

Where D is the dimension of the image. And ck = 0 if the motionat pixel wk is a boundary point, and 1 otherwise. ci can beapproximated using optical flow. For details of this procedure seeArxiv 1604.08610.

51 / 53

Page 101: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

AST For video

Temporal consistency

To force extra temporal continuity between frames, we will add aregularizer that rewards consecutive frames that are consistentwith each other.

We define the temporal loss to be

Ltemp(x,w, c) =1

D

D∑k=1

ck(xk − wk)2

Where D is the dimension of the image. And ck = 0 if the motionat pixel wk is a boundary point, and 1 otherwise. ci can beapproximated using optical flow. For details of this procedure seeArxiv 1604.08610.

51 / 53

Page 102: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

AST For video

Temporal Consistency

To force some consistency between consecutive frames, we canminimize

Lshort(xi ,pi , a) = αLc(xi ,pi ) + βLs(xi , a)

+ γLtemp(xi ,w ii−1(xi−1), ci )

To get a smoother result, it is better to achieve some long termcosistency between not just the previous frame, but rather theprevious J frames (typically J = 1, 2, 4).

Llong (xi ,pi , a) = αLc(xi ,pi ) + βLs(xi , a)

+ γJ∑

j=1

Ltemp(xi ,w ii−j(xi−j), ci−j)

52 / 53

Page 103: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

AST For video

Temporal Consistency

To force some consistency between consecutive frames, we canminimize

Lshort(xi ,pi , a) = αLc(xi ,pi ) + βLs(xi , a)

+ γLtemp(xi ,w ii−1(xi−1), ci )

To get a smoother result, it is better to achieve some long termcosistency between not just the previous frame, but rather theprevious J frames (typically J = 1, 2, 4).

Llong (xi ,pi , a) = αLc(xi ,pi ) + βLs(xi , a)

+ γ

J∑j=1

Ltemp(xi ,w ii−j(xi−j), ci−j)

52 / 53

Page 104: Saifuddin Syed - University of British Columbiasaif.syed/papers/mlrg_style_transfer.pdf · Artistic Style Transfer Artistic Style Transfer Saifuddin Syed MLRG Fall 2016 1/53. Artistic

Artistic Style Transfer

AST For video

Example

Example

See https://www.youtube.com/watch?v=Khuj4ASldmU

53 / 53