Abstract

Generative Adversarial Networks (GANs) provide a fascinating new paradigm in machine learning and artificial intelligence, especially in the context of unsupervised learning. GANs are quickly becoming a state-of-the-art tool, used in applications such as image generation, image super-resolution, text generation, and text-to-image synthesis, to name a few. However, the potential of GANs is restricted by various training difficulties. To overcome these difficulties, a more powerful measure of dissimilarity, the Wasserstein distance, was proposed, giving birth to the GAN extension known as Wasserstein Generative Adversarial Networks (WGANs).

Recognizing the central role played by both the cost function and the order of the Wasserstein distance used in WGANs, this thesis provides a comparative assessment of the effect of various commonly used norms on WGANs. Inspired by the impact of the L1-norm in LASSO Regression, the L2-norm in Ridge Regression, and the great success of the combination of the L1 and L2 norms in the elastic net and its extensions in statistical machine learning, we investigate the effect of these same norms in the WGAN context. The primary goal of our research is to study the impact of these norms on WGANs from a purely computational and empirical standpoint, with an emphasis on how each norm affects the space of the weights/parameters of the machines contributing to the WGAN. We also explore the effect of different clipping values, which are used to enforce the k-Lipschitz constraint on the functions making up the specific WGAN under consideration. Another crucial component of this research is the impact of the number of training iterations on the WGAN loss function (objective function), which gives us a rough empirical estimate of the computational complexity of WGANs. Finally, and quite importantly, in keeping with the application of WGANs to the recovery of scenes and the reconstruction of complex images, we dedicate a relatively important part of our research to comparing the quality of recovery across the norms considered. Like researchers before us, we perform a substantial empirical exploration on both synthetic and real-life data.
We specifically explore a simulated data set made up of a mixture of eight bivariate Gaussian random variables with large gaps, the likes of which would be a hard task for traditional GANs but can be handled quite well by WGANs thanks to the inherent strength of the underlying Wasserstein distance. We also explore real data sets, namely the ubiquitous MNIST dataset of handwritten digits and the now very popular CIFAR-10 dataset, used as a de facto benchmark data set for GANs and WGANs. For all the datasets, synthetic and real, we provide a thorough comparative assessment of the effect of the norms mentioned earlier. There are indeed qualitative and quantitative differences from one norm to another, established via measures such as (a) the shape, value and pattern of the generator loss, (b) the shape, value and pattern of the discriminator loss, (c) the shape, value and pattern of the inception score, and (d) human inspection of the quality of recovery or reconstruction of images and scenes.
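
As a rough illustration of two ingredients mentioned above, the sketch below samples from a mixture of eight bivariate Gaussians and applies weight clipping of the kind used to enforce a Lipschitz constraint. It is not taken from the thesis. The circular layout of the centers, the radius, the standard deviation, the clipping value and all function names are assumptions made only for this example.

```python
import numpy as np


def sample_eight_gaussians(n, radius=2.0, std=0.02, rng=None):
    """Draw n points from an equally weighted mixture of eight bivariate Gaussians.

    Assumption: the eight centers sit evenly spaced on a circle of the given
    radius, a common toy layout; the thesis may use a different arrangement.
    """
    rng = np.random.default_rng() if rng is None else rng
    angles = 2.0 * np.pi * np.arange(8) / 8
    centers = radius * np.stack([np.cos(angles), np.sin(angles)], axis=1)
    # Pick a mixture component for each sample, then add isotropic noise.
    component = rng.integers(0, 8, size=n)
    return centers[component] + std * rng.standard_normal((n, 2))


def clip_weights(params, c):
    """Clamp every weight array to the interval [-c, c]."""
    return [np.clip(w, -c, c) for w in params]


def weight_norms(params):
    """Return the L1 and L2 norms of all weights taken as one flat vector."""
    flat = np.concatenate([np.ravel(w) for w in params])
    return {"l1": float(np.abs(flat).sum()), "l2": float(np.sqrt((flat ** 2).sum()))}


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = sample_eight_gaussians(1000, rng=rng)
    # Hypothetical critic weights, used only to show the clipping step.
    weights = [rng.standard_normal((2, 16)), rng.standard_normal(16)]
    clipped = clip_weights(weights, c=0.01)  # 0.01 is an assumed clipping value
    print(data.shape, weight_norms(clipped))
```

In a training loop, a step like `clip_weights` would typically be applied to the critic's parameters after each update, and comparing `weight_norms` before and after clipping is one simple way to see how a given clipping value constrains the weight space.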

Library of Congress Subject Headings

Neural networks (Computer science); Machine learning; Image processing--Digital techniques

Publication Date

5-2019

Document Type

Thesis

Student Type

Graduate

Degree Name

Computer Science (MS)

Department, Program, or Center

Computer Science (GCCIS)

Advisor

Reynold Bailey

Advisor/Committee Member

Leon Reznik

Advisor/Committee Member

Ernest Fokoue

Campus

RIT – Main Campus

Plan Codes

COMPSCI-MS
