Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: –...
Transcript of Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: –...
![Page 1: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/1.jpg)
Lars Mescheder1, Sebastian Nowozin2, Andreas Geiger1
1MPI-IS Tübingen 2University of Tübingen 3Microsoft Research Cambridge
Which Training Methods for GANs do actually Converge?
![Page 2: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/2.jpg)
2
Introduction
Generative neural networks:
Key challenge:have to learn high dimensional probability distribution
noise
![Page 3: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/3.jpg)
3
Introduction
Generative Adversarial networks (GANs):
real or fake?Generator
Discriminator
![Page 4: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/4.jpg)
4
Generative Adversarial Networks
Alternating gradient descent
![Page 5: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/5.jpg)
5
Generative Adversarial Networks
Simultaneous gradient descent
![Page 6: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/6.jpg)
6
Generative Adversarial Networks
● Does a (pure) Nash-equilibrium exist?– Yes, if there is with (Goodfellow et al., 2014)
● Does it solve the min-max problem?– Yes, if (Goodfellow et al., 2014)
● Do simultaneous and / or alternating gradient descent converge to the Nash-equilbrium?
![Page 7: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/7.jpg)
7
Ist GAN training locally asymptotically stable?
Mescheder et al. (2017): No, if Jacobian of gradientvector field has purely imaginary eigenvalues
Is GAN training locally asymptotically stable in the general case?
Nagarajan and Kolter (2017): Yes, if generatorand data distributions locally have the same support
Heusel et al. (2017): Yes, if optimal discriminatorparameters are continuous function of generator parametersand two-timescale annealing scheme is adopted
![Page 8: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/8.jpg)
8
Our Contributions
● Dirac-GAN:– Unregularized GAN-training is not always stable
when distributions do not have the same support● Analysis of common regularizers:
– WGAN and WGAN-GP not always stable– Instance noise & zero-centered gradient penalties are stable
● Simplified gradient penalties– Convergence proof for realizable case
● Empirical results:– High resolution (1024x1024) generative models
without progressively growing architectures
![Page 9: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/9.jpg)
9
Convergence theory
“Simple experiments, simple theorems are the building blocks that help us understand more complicated systems.”
Ali Rahimi – Test of Time Award speech, NIPS 2017
![Page 10: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/10.jpg)
10
Convergence theory
The Dirac-GAN:
![Page 11: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/11.jpg)
11
Convergence theory
Unregularized GAN training:
![Page 12: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/12.jpg)
12
Convergence theory
Understanding the gradient vector field:
Local convergence of simultaneous and alternating gradient descent determined by eigenvalues of Jacobian
![Page 13: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/13.jpg)
13
Convergence theory
Continuous system:
![Page 14: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/14.jpg)
14
Convergence theory
Continuous system:
![Page 15: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/15.jpg)
15
Convergence theory
Discretized system:
![Page 16: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/16.jpg)
16
Convergence theory
Discretized system:
![Page 17: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/17.jpg)
17
Which training methods converge?
Unregularized GAN training:
Eigenvalues:
![Page 18: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/18.jpg)
18
Which training methods converge?
Eigenvalues:
Wasserstein-GAN1 training:
1Arjovsky et al. - Wasserstein GAN (2017)
![Page 19: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/19.jpg)
19
Which training methods converge?
Eigenvalues:
Wasserstein-GAN1 training(5 discriminator updates / generator update):
1Arjovsky et al. - Wasserstein GAN (2017)
![Page 20: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/20.jpg)
20
Which training methods converge?
Zero-centered Gradient Penalties
Eigenvalues:
![Page 21: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/21.jpg)
21
Which training methods converge?
Zero-centered Gradient Penalties (critical)
Eigenvalues:
![Page 22: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/22.jpg)
22
Which training methods converge?
Zero-centered Gradient Penalties
no convergence
slow convergence
fast convergence
![Page 23: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/23.jpg)
23
General convergence results
● Regularizers for discriminator
● Regularized gradient vector field
![Page 24: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/24.jpg)
24
Assumption IV: the generator and data distributionhave locally the same support (Nagarajan & Kolter)
General convergence results
Assumption I: the generator can represent the true data distribution
Assumption II: and
Assumption III: the discriminator can detect when the generator deviates from equilibrium
![Page 25: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/25.jpg)
25
Assumption IV: the generator and data distributionhave locally the same support (Nagarajan & Kolter)
General convergence results
Assumption I: the generator can represent the true data distribution
Assumption II: and
Assumption III: the discriminator can detect when the generator deviates from equilibrium
For the Dirac-GAN:
![Page 26: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/26.jpg)
26
Assumption IV: the generator and data distributionhave locally the same support (Nagarajan & Kolter)
General convergence results
Assumption I: the generator can represent the true data distribution
Assumption II: and
Assumption III: the discriminator can detect when the generator deviates from equilibrium
For GANs in the wild:
![Page 27: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/27.jpg)
27
General convergence results
Theorem: under Assumption I, II, III and some mild technical assumptions the GAN training dynamics for the regularized training objective are locally asymptotically stable near the equilibrium point
![Page 28: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/28.jpg)
28
General convergence results
Proof (idea):
1Nagarajan & Kolter - Gradient descent GAN optimization is locally stable (2017)
(Extends prior work by Nagarajan & Kolter1)
Negative definiteFull column rank
(orthogonal to )
All eigenvalues of have negative real part
![Page 29: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/29.jpg)
29
Experiments
Imagenet (128 x 128, 1k classes)
![Page 30: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/30.jpg)
30
Experiments
LSUN bedrooms (256 x 256)
![Page 31: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/31.jpg)
31
Experiments
LSUN churches (256 x 256)
![Page 32: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/32.jpg)
32
Experiments
LSUN towers (256 x 256)
![Page 33: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/33.jpg)
33
Experiments
celebA-HQ (1024 x 1024)
![Page 34: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/34.jpg)
34
● use alternating instead of simultaneous gradient descent● don’t use momentum● use regularization to stabilize the training● simple zero-centered gradient penalties for the discriminator
yield excellent results ● progressively growing architectures might be not all that
important when using a good regularizer
Practical recommendations
![Page 35: Which Training Methods for GANs do actually Converge? · 8 Our Contributions Dirac-GAN: – Unregularized GAN-training is not always stable when distributions do not have the same](https://reader035.fdocuments.us/reader035/viewer/2022070916/5fb63b86aa488725854683bc/html5/thumbnails/35.jpg)
35
Poster #77