by Liam Hinzman
For the past decade, Artificial Intelligence, powered by machine learning, has been the number-crunching king of the jungle. AI already outperforms humans in many fields. But can AI create art? It turns out that machines using Generative Adversarial Networks (GANs) can create high-quality, original art and images.
Here’s How Generative Adversarial Networks Work
Two neural networks (a type of computing system) are trained against each other; they’re adversaries. The first neural network is the Generator, which tries to generate ‘fake’ images that are indistinguishable from the real thing. The second neural network is the Discriminator, which tries to identify whether an image is real or a fake created by the Generator.
How the Generator Learns
- Looks at its dataset (could be Picasso paintings)
- Tries to learn what makes that type of image unique
- Generates a fake image
- Gets feedback from Discriminator: positive if it tricked the Discriminator, or negative if it failed to
How the Discriminator Learns
- Looks at its dataset
- Tries to learn what makes that type of image unique
- Looks at images (both fake and real), and tries to classify whether an image is real or fake
- Gets feedback from label: positive if it correctly classified an image as fake or real, or negative if it failed
GANs learn the same way a forger and a detective learn together. The forger (the Generator) tries to create replica paintings that look just like a Picasso painting. The forger looks at real Picasso paintings (its dataset), attempts to learn what makes a Picasso painting unique, and tries to make its forgeries look just like the real thing. Meanwhile, the detective (the Discriminator) tries to identify which paintings are real and which are forgeries.
Despite Being Adversaries, the Generator and Discriminator Learn Together
In this example, and in actual GANs, the forger and detective slowly improve together. Every time the forger generates a convincing forgery, it learns a little bit more about what makes a Picasso painting unique. Every time the detective catches a fake, it learns what isn’t a Picasso painting.
The forger only has a reason to improve if the detective is detecting the fakes, otherwise, there’s no motivation to improve. Likewise, the detective only has a reason to improve if forgeries are getting past it undetected.
The Generator and Discriminator are in a race, each trying to stay one step ahead of the other.
It’s important that the Discriminator does not start off too powerful. If the Discriminator is too good at its job, always telling real from forgery, then the Generator receives no useful feedback to improve itself, and its forgeries will look terrible. If the Discriminator is well balanced, neither too weak nor too strong, then the Generator will become a true artist (of fakes) and earn the title of Vincent GAN Gogh.
In a race, and with GANs, the two competitors have to be well-matched to keep them both motivated and improving.
How I Created My Own Generative Adversarial Networks (GANs)
I decided to create a Deep Convolutional GAN (DCGAN) to generate images of celebrities and cats. DCGANs are like GANs, but they use convolutional layers to find patterns in images much more effectively than regular neural networks.
A Quick Explanation on How Convolutional Layers work
Regular neural networks look at each pixel individually and then try to find patterns. At a high level, neural networks using convolutional layers (ConvNets or CNNs) use kernels to look at groups of pixels and try to find patterns in them. For example, if there’s a group of white pixels with a vertical line of black pixels to the right, this likely indicates an edge. Convolutional layers can easily find patterns like this.
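The vertical-edge example can be sketched with a single convolution (the tiny 4x4 image and the 1x2 kernel here are illustrative):

```python
import torch
import torch.nn.functional as F

# A tiny 4x4 grayscale "image": white pixels on the left, black on the right,
# so there is a vertical edge down the middle.
image = torch.tensor([
    [1., 1., 0., 0.],
    [1., 1., 0., 0.],
    [1., 1., 0., 0.],
    [1., 1., 0., 0.],
]).reshape(1, 1, 4, 4)  # (batch, channels, height, width)

# A 1x2 kernel that responds strongly where a bright pixel sits
# immediately left of a dark pixel, i.e. a vertical edge.
kernel = torch.tensor([[1., -1.]]).reshape(1, 1, 1, 2)

response = F.conv2d(image, kernel)
print(response.squeeze())  # each row reads [0, 1, 0]: the edge lights up
```

The kernel slides across the image; it outputs 0 over flat regions and 1 exactly where white meets black, which is how convolutional layers pick out edges.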
What Happens If You Stack Convolutional Layers?
- 1 layer can find features in images like edges
- 2 layers can find features composed of edges, such as shapes
- 3 layers can find features composed of shapes, such as noses or eyes
Increasing the number of convolutional layers increases the level of abstraction of the features a ConvNet can detect.
ConvNets are how apps like Snapchat detect the edges and features of your face to create photo filters.
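The growing abstraction comes from each layer looking at patches of the previous layer’s output, so deeper layers effectively see larger regions of the original image. A quick sketch of three stacked convolutional layers (channel counts are illustrative):

```python
import torch
import torch.nn as nn

# Three stacked 3x3 convolutions. Each layer sees a 3x3 patch of the
# previous layer's output, so the effective receptive field grows:
# 3x3 -> 5x5 -> 7x7 pixels of the original input.
stack = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3),    # layer 1: edges
    nn.Conv2d(8, 16, kernel_size=3),   # layer 2: shapes built from edges
    nn.Conv2d(16, 32, kernel_size=3),  # layer 3: parts built from shapes
)

x = torch.randn(1, 1, 64, 64)
print(stack(x).shape)  # each unpadded 3x3 conv trims 2 pixels: 64 -> 58
```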
The Architecture I Chose to Use For My DCGAN
My DCGAN’s architecture (and training procedure) closely follows what was outlined in the original DCGAN paper, with modifications recommended by ganhacks. I’ll keep the technical explanations brief.
The Generator is given a random latent vector of size 100, with values ranging from -1 to 1, as input. The Generator then maps this latent vector to ‘data-space’, turning it into an image, with the goal of making the generated image match the statistical distribution of the dataset (this is why the Generator creates new images rather than copying from the dataset).
The Generator is composed of a series of convolutional layers, with batch normalization and a ReLU activation between each convolutional layer (a big ol’ GAN and cheese sandwich). The output of the Generator is then fed through a Tanh activation (basically Mayonnaise with more math).
Here’s My PyTorch Implementation of the Generator
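A minimal sketch following the DCGAN paper’s layout; the channel counts and the 64x64 output size are assumptions:

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """DCGAN-style Generator: latent vector -> 64x64 RGB image."""
    def __init__(self, latent_dim=100, feature_maps=64):
        super().__init__()
        self.net = nn.Sequential(
            # latent vector (latent_dim x 1 x 1) -> 4x4
            nn.ConvTranspose2d(latent_dim, feature_maps * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(feature_maps * 8),
            nn.ReLU(True),
            # 4x4 -> 8x8
            nn.ConvTranspose2d(feature_maps * 8, feature_maps * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(feature_maps * 4),
            nn.ReLU(True),
            # 8x8 -> 16x16
            nn.ConvTranspose2d(feature_maps * 4, feature_maps * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(feature_maps * 2),
            nn.ReLU(True),
            # 16x16 -> 32x32
            nn.ConvTranspose2d(feature_maps * 2, feature_maps, 4, 2, 1, bias=False),
            nn.BatchNorm2d(feature_maps),
            nn.ReLU(True),
            # 32x32 -> 64x64; Tanh squashes pixel values into [-1, 1]
            nn.ConvTranspose2d(feature_maps, 3, 4, 2, 1, bias=False),
            nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z)

z = torch.rand(16, 100, 1, 1) * 2 - 1  # 16 latent vectors, values in [-1, 1]
fake_images = Generator()(z)
print(fake_images.shape)  # 16 generated 3-channel 64x64 images
```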
The Discriminator’s architecture is very similar to the Generator’s; the main difference is that instead of taking a latent vector and outputting an image, the Discriminator takes an image as input and outputs a scalar probability that the input image is real (as opposed to fake).
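The Discriminator can be sketched the same way, mirroring the Generator with strided convolutions (again, the channel counts and 64x64 input size are assumptions):

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """DCGAN-style Discriminator: 64x64 RGB image -> probability it is real."""
    def __init__(self, feature_maps=64):
        super().__init__()
        self.net = nn.Sequential(
            # 64x64 -> 32x32
            nn.Conv2d(3, feature_maps, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            # 32x32 -> 16x16
            nn.Conv2d(feature_maps, feature_maps * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(feature_maps * 2),
            nn.LeakyReLU(0.2, inplace=True),
            # 16x16 -> 8x8
            nn.Conv2d(feature_maps * 2, feature_maps * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(feature_maps * 4),
            nn.LeakyReLU(0.2, inplace=True),
            # 8x8 -> 4x4
            nn.Conv2d(feature_maps * 4, feature_maps * 8, 4, 2, 1, bias=False),
            nn.BatchNorm2d(feature_maps * 8),
            nn.LeakyReLU(0.2, inplace=True),
            # 4x4 -> one scalar; Sigmoid turns it into a probability
            nn.Conv2d(feature_maps * 8, 1, 4, 1, 0, bias=False),
            nn.Sigmoid(),
        )

    def forward(self, image):
        return self.net(image).view(-1)

images = torch.randn(16, 3, 64, 64)
print(Discriminator()(images).shape)  # one real/fake probability per image
```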
How I Trained My DCGAN
- Discriminator is trained with all-real batch of images
- Discriminator is trained with all-fake batch of images
- Generator is trained: it creates images, the Discriminator classifies them as real/fake, and the Generator is updated based on whether it fooled the Discriminator
For each step listed above, the training loss is computed, then backpropagated through the network to improve it.
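One training iteration covering those three steps can be sketched as follows; tiny linear stand-in networks are used here so the sketch is self-contained (a real DCGAN would use convolutional models and image batches):

```python
import torch
import torch.nn as nn

# Stand-in networks: a real DCGAN uses convolutional models.
latent_dim, image_dim = 16, 64
generator = nn.Sequential(nn.Linear(latent_dim, image_dim), nn.Tanh())
discriminator = nn.Sequential(nn.Linear(image_dim, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
criterion = nn.BCELoss()

real_images = torch.randn(32, image_dim)  # stand-in for a batch of real data
real_labels = torch.ones(32, 1)
fake_labels = torch.zeros(32, 1)

# 1. Train the Discriminator on an all-real batch
opt_d.zero_grad()
loss_real = criterion(discriminator(real_images), real_labels)
loss_real.backward()

# 2. Train the Discriminator on an all-fake batch
#    (.detach() so this step doesn't update the Generator)
fake_images = generator(torch.randn(32, latent_dim))
loss_fake = criterion(discriminator(fake_images.detach()), fake_labels)
loss_fake.backward()
opt_d.step()

# 3. Train the Generator: it is rewarded when its fakes are labeled "real"
opt_g.zero_grad()
loss_g = criterion(discriminator(fake_images), real_labels)
loss_g.backward()
opt_g.step()
```

Each of the three losses is backpropagated through the relevant network, updating the Discriminator in steps 1 and 2 and the Generator in step 3.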
Results of My DCGAN
After just 4 hours of training time (on a laptop!) I was able to train my DCGAN to create stamp-sized images of celebrities.
Although some of the faces are distorted, many of them look great.
Because of the great versatility of GANs, I was able to train my DCGAN to generate images of cats without changing any code. All I had to do was give my DCGAN another 3 hours to train.
I’ve never been great at art, so creating a machine to make art for me is very satisfying.
Fantastic Generative Adversarial Networks and Where to Find Them
GANs can create high-quality original art and images, but once the novelty wears off why should you care?
Regular Generative Adversarial Networks
Generative Adversarial Networks can create a wide variety of images. As long as the GAN has access to a dataset where the images share similar features, then it can create new images that resemble the originals. GANs can create images of dogs, cats, anime characters and so much more.
Progressive Growing of Generative Adversarial Networks
Using a technique known as Progressive Growing of GANs, HD pictures can be created. The way it works is just like normal GANs, but this time, there’s a team of generators. The first generator creates an image, and then each of the following generators learn how to slightly upscale the resolution. By using a large enough team of generators, beautiful HD photos can be created.
Progressive Growing of GANs could be used to create photo-realistic HD avatars for profile pictures, chatbots, advertisements, and many other scenarios.
Super Resolution Generative Adversarial Networks (SRGANs)
Super Resolution GANs (SRGANs) can drastically increase the resolution of photos. SRGANs increase image resolution by attempting to make the converted image look like other HD images (adversarial loss), and by keeping features, such as hands or a string, in the image (perceptual similarity).
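The SRGAN generator’s objective combines those two ideas; a minimal sketch, where the feature extractor is a stand-in (SRGAN itself uses a pretrained VGG network) and the loss weight is illustrative:

```python
import torch
import torch.nn as nn

# Stand-in feature extractor; SRGAN uses a pretrained VGG network here.
feature_extractor = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())

def srgan_generator_loss(disc_output_on_fake, upscaled, target_hd, weight=1e-3):
    # Perceptual similarity: compare feature maps rather than raw pixels,
    # which keeps recognizable content (hands, textures) intact.
    perceptual = nn.functional.mse_loss(
        feature_extractor(upscaled), feature_extractor(target_hd))
    # Adversarial loss: the generator wants the discriminator to say "real" (1)
    adversarial = nn.functional.binary_cross_entropy(
        disc_output_on_fake, torch.ones_like(disc_output_on_fake))
    return perceptual + weight * adversarial

upscaled = torch.rand(4, 3, 64, 64)           # generator's upscaled output
target_hd = torch.rand(4, 3, 64, 64)          # true high-resolution image
disc_out = torch.sigmoid(torch.randn(4, 1))   # discriminator's verdicts
print(srgan_generator_loss(disc_out, upscaled, target_hd))
```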
SRGANs could be used to convert movies or TV shows shot at low resolutions to 4k, to automatically fix blurry photos taken on consumer phones, or by businesses that provide photo editing services. In fact, Photoshop CC 2018 uses SRGANs to increase the resolution of photos.
Cycle Generative Adversarial Networks (CycleGANs)
CycleGANs can change the content of images, while still staying similar to the original. Just like regular GANs, CycleGANs have to create a new image, but this time they start with an existing image and modify it. To ensure that the generated image looks similar to the original, except for the desired changes, two generators and two discriminators are trained. The first generator, and its discriminator, are responsible for transforming the photo. The second generator/discriminator pair is responsible for ensuring that the modified photo is similar to the original.
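The “stay similar to the original” requirement is enforced by a cycle-consistency loss: translating an image to the other domain and back should recover the original. A minimal sketch, using stand-in linear networks (real CycleGANs use convolutional models):

```python
import torch
import torch.nn as nn

# Stand-in "generators"; real CycleGANs use convolutional models.
g_ab = nn.Linear(64, 64)  # translates domain A -> B (e.g. photo -> painting)
g_ba = nn.Linear(64, 64)  # translates domain B -> A (painting -> photo)

real_a = torch.randn(8, 64)            # a batch of domain-A "images"
reconstructed_a = g_ba(g_ab(real_a))   # A -> B -> back to A

# The cycle loss penalizes any drift from the original image, so the
# translation can only change what the other domain actually requires.
cycle_loss = nn.functional.l1_loss(reconstructed_a, real_a)
print(cycle_loss)
```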
CycleGANs have incredible potential. They could be used by historians to colorize photos, or art galleries could use them to show what the painter saw when they were creating the original painting.
Generative Adversarial Networks are not limited to only creating images. As long as the dataset can be represented as an array of numbers, and these numbers have patterns (hidden or not), GANs can generate new data that fits the pattern. For example, songs can be turned into a series of numbers, known as MIDI, that represent the pitch and duration of the instruments. Give GANs enough MIDI information, and they can create catchy tunes just like GANs N’ Roses.
You can listen to music created by GANs here.
Put GANs to Work for You Online
- Bring color to your B&W photos
- Increase the resolution of your photos
- Transform the artistic style of an image
Key Takeaways
- Using GANs, machines can create (and modify) a wide variety of high-quality original art and images
- GANs work by having two neural networks, a Generator and a Discriminator, compete against each other
- SRGANs can increase the resolution of photos, creating HD images
- CycleGANs can change the contents of photos
- GANs are not only limited to image generation
- Anyone can use GANs online right now
With GANs, AI can now be both a number-crunching machine and a creative artist. AI’s great versatility makes it poised to disrupt almost any field of work, even the creative industries.
Andrew Ng, a leading figure in the AI industry, captures the enormous possibilities that AI has.
“Just as electricity transformed almost everything 100 years ago, today I actually have a hard time thinking of an industry that I don’t think AI will transform in the next several years” - Andrew Ng