This Answer explains the mechanism behind the generator for our AI anime GAN model, which allows us to generate anime character images in real time! After understanding the concepts behind the generator and the discriminator, we will be ready to write our own code for generating AI anime character images. Without further ado, let's get started!
A generator is the component of a GAN responsible for producing data that resembles the real samples. As the generator improves, the model can produce data that is virtually indistinguishable from reality, which makes the discriminator's task of telling real data from generated data harder.

This continuous back-and-forth between the two networks gradually pushes the GAN toward more realistic outputs: the generator keeps producing samples that are harder for the discriminator to catch.
Note: GANs (generative adversarial networks) are a machine learning framework consisting of two neural networks:

- The generator creates data.
- The discriminator evaluates it.
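To make the adversarial setup concrete, here is a minimal NumPy sketch of the two training signals. The `generator` and `discriminator` functions below are toy stand-ins for illustration only, not the Keras model we build later in this Answer:

```python
import numpy as np

def discriminator(x):
    # Toy stand-in: squashes a score into (0, 1), where 1 means "real."
    return 1 / (1 + np.exp(-x.mean()))

def generator(z):
    # Toy stand-in: maps random noise to "data."
    return z * 0.5

rng = np.random.default_rng(0)
real = rng.normal(1.0, 0.1, size=8)   # pretend "real" samples
z = rng.normal(size=8)                # noise input
fake = generator(z)

# The discriminator wants D(real) -> 1 and D(fake) -> 0;
# the generator wants D(fake) -> 1. Both losses are positive
# until each network achieves its goal perfectly.
d_loss = -np.log(discriminator(real)) - np.log(1 - discriminator(fake))
g_loss = -np.log(discriminator(fake))
print(d_loss > 0, g_loss > 0)
```

Training alternates between the two objectives, which is the "continuous exchange" described above.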
Before we continue with the generator's code for our AI anime model, let's go over a few crucial concepts, with easy explanations, to build your understanding upon.
| Concept | Explanation |
|---|---|
| Sequential model | A Sequential model is a linear stack of layers in a neural network. It lets us build the network layer by layer, in order. |
| Input dimensions | The input dimension specifies the size of the input vector. |
| Padding | A padding value of "same" keeps the output size the same as the input size. |
| Kernel initializer | A kernel initializer sets how the kernel's weights are initialized. |
| Dense layer | A Dense layer with 8 * 8 * 512 neurons projects the 100-dimensional input vector into a high-dimensional space. |
| ReLU activation | ReLU activation introduces non-linearity into the model, which helps it capture complex patterns. |
| Reshaping | The Reshape layer transforms the dense layer's output into a 3D tensor of shape 8 * 8 * 512, preparing the data for the convolutional layers that follow. |
| Deconvolutional blocks | Conv2DTranspose layers upsample the data. Moving through the blocks, the spatial dimensions grow while the number of filters shrinks, progressively refining the image. |
| Output layer | The final Conv2D layer combines the patterns learned in earlier layers to produce the complete picture. |
| Tanh activation | The tanh activation function maps values to the range -1 to 1. |
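As a quick sanity check on the upsampling idea, we can trace how the spatial dimensions grow through three stride-2 deconvolutional blocks. With padding "same", Keras computes a transposed convolution's output size as input size × stride (the helper below is our own illustration, not a Keras API):

```python
def conv2d_transpose_out(size, stride, padding="same", kernel=4):
    """Output spatial size of a Conv2DTranspose layer (Keras conventions)."""
    if padding == "same":
        # 'same' padding: output size is exactly size * stride.
        return size * stride
    # 'valid' padding: output size is (size - 1) * stride + kernel.
    return (size - 1) * stride + kernel

# Trace the generator's spatial dimensions: 8 -> 16 -> 32 -> 64.
size = 8
for _ in range(3):
    size = conv2d_transpose_out(size, 2)
print(size)  # 64
```

So three doublings turn the 8 x 8 feature map into a 64 x 64 image.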
We're now ready to write our own generator code. The core concepts explained above make up most of the code, which we'll write using TensorFlow and Keras.
```python
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential

generator = Sequential(name='generator')
generator.add(layers.Dense(8 * 8 * 512, input_dim=100))
generator.add(layers.ReLU())
generator.add(layers.Reshape((8, 8, 512)))
generator.add(layers.Conv2DTranspose(256, (4, 4), strides=(2, 2), padding='same',
                                     kernel_initializer=keras.initializers.RandomNormal(mean=0.0, stddev=0.02)))
generator.add(layers.ReLU())
generator.add(layers.Conv2DTranspose(128, (4, 4), strides=(2, 2), padding='same',
                                     kernel_initializer=keras.initializers.RandomNormal(mean=0.0, stddev=0.02)))
generator.add(layers.ReLU())
generator.add(layers.Conv2DTranspose(64, (4, 4), strides=(2, 2), padding='same',
                                     kernel_initializer=keras.initializers.RandomNormal(mean=0.0, stddev=0.02)))
generator.add(layers.ReLU())
generator.add(layers.Conv2D(3, (4, 4), padding='same', activation='tanh'))
```
Let's go through the steps one by one.
We start by creating a Sequential model called "generator".
We add a `Dense` layer with 8 * 8 * 512 (32,768) units, fed by an input vector of size 100.
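As a quick sanity check on this layer, we can compute its trainable parameter count by hand (assuming the bias term that Keras enables by default):

```python
input_dim = 100
units = 8 * 8 * 512          # 32,768 neurons
weights = input_dim * units  # one weight per input-output pair
biases = units               # one bias per neuron (Keras default use_bias=True)
print(weights + biases)      # 3309568
```

This matches the figure `generator.summary()` reports for the dense layer.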
Here, we add a `ReLU` activation function to introduce non-linearity.
We reshape the output into a 3D tensor of shape 8 x 8 x 512 using the `Reshape` layer.
We add a 2D transposed convolutional layer, `Conv2DTranspose` (sometimes called deconvolution), with 256 filters. The (2, 2) strides double the spatial dimensions in both height and width, and we set the padding to "same". The kernel weights are initialized from a random normal distribution with a mean of 0 and a standard deviation of 0.02.
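To see what this layer does to the tensor shapes and how many parameters it adds, here is a small hand calculation (assuming Keras conventions: "same" padding and the default bias term):

```python
# Spatial size: with padding='same' and stride 2, the output is input * stride.
in_h = in_w = 8
stride = 2
out_h, out_w = in_h * stride, in_w * stride   # 8 x 8 -> 16 x 16

# Parameters: one 4x4 kernel per (input channel, output channel) pair,
# plus one bias per output filter.
kh = kw = 4
in_filters, out_filters = 512, 256
params = kh * kw * in_filters * out_filters + out_filters
print(out_h, out_w, params)  # 16 16 2097408
```

The same arithmetic applies to the later deconvolutional blocks, just with fewer filters.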
We again apply a ReLU activation to the output of our model's previous layer.
Next, we add another 2D transposed convolutional layer with 128 filters and the same kernel size and padding as before. Its kernel weights are initialized the same way.
Again, we apply a `ReLU` activation to the output.
We add a third 2D transposed convolutional layer with 64 filters along with the same settings.
Next, we apply yet another `ReLU` activation to the output.
Finally, we add a regular 2D convolutional layer, `Conv2D`, with 3 filters, each of size 4 x 4. The padding is "same", and the activation function is the hyperbolic tangent (tanh). This layer outputs an image with 3 color channels.
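The tanh activation is what bounds every output pixel. A tiny NumPy check makes the range concrete (this is also why training images for GANs are typically rescaled to [-1, 1] before being fed to the discriminator):

```python
import numpy as np

# tanh squashes any real input strictly into (-1, 1),
# so every generated pixel value lands in that range.
x = np.linspace(-10, 10, 1001)
y = np.tanh(x)
print(y.min() > -1, y.max() < 1)
```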
```python
generator.summary()
```
We can now display a summary of the model's architecture using `generator.summary()`, showing the layers, output shapes, parameter counts, etc.
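As a cross-check on the summary, we can tally the trainable parameters layer by layer (weights plus biases, assuming the Keras default bias terms):

```python
def conv_params(fin, fout, k=4):
    # k x k kernel per (input, output) channel pair, plus one bias per filter.
    return k * k * fin * fout + fout

dense = 100 * (8 * 8 * 512) + 8 * 8 * 512   # Dense weights + biases
total = (dense
         + conv_params(512, 256)    # first Conv2DTranspose
         + conv_params(256, 128)    # second Conv2DTranspose
         + conv_params(128, 64)     # third Conv2DTranspose
         + conv_params(64, 3))      # final Conv2D output layer
print(total)  # 6065603
```

If the figure printed by `generator.summary()` differs, a layer was likely configured differently from the code above.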
We have successfully learned how to set up a generator for our AI anime GAN model. In summary, the generator starts from a low-dimensional noise input and gradually transforms it, through deconvolutional layers, into a higher-dimensional, more realistic image. Each deconvolutional layer increases the generated image's spatial dimensions, producing sharper images. The final output, activated by tanh, represents an AI-generated image that resembles real images if the generator is trained well.
Note: The complete GAN code for generating new images using the discriminator and generator is present here. After you've understood both of these Answers, you can continue with the rest of the code.
Take this quick recap to revise your concepts:

- The `Conv2D` layer is a convolutional layer that is used to extract features from images.
- The `ReLU` activation introduces non-linearity to the model, enabling it to recognize complex patterns.
- A generator tries to create more authentic images that are difficult for the discriminator to declare fake.