This Answer explains the mechanism behind the discriminator of our AI anime GAN model, the model that lets us generate anime character images in real time! After understanding the concepts behind the generator and the discriminator, we will be ready to write our own code for generating AI anime character images. Without further ado, let's get started!
A discriminator is the component of a GAN that learns to differentiate between real and generated data. It assigns a probability score to each input, indicating the likelihood that the input is real. As the generator in the GAN starts producing higher-quality, close-to-reality data, the discriminator's task becomes more challenging.
This continuous feedback between the two networks allows our GAN to gradually generate more realistic outputs.
Note: GANs (generative adversarial networks) are a machine learning framework consisting of two neural networks:
- The generator creates data.
- The discriminator evaluates it.
There are a few crucial concepts that we will brush up on before we code the discriminator ourselves. These concepts will build your understanding of the complete project.
| Concept | Explanation |
| --- | --- |
| Sequential model | A `Sequential` model is a linear stack of layers. It allows us to build a neural network layer by layer, in order. |
| Input shape | An `input_shape` of `(64, 64, 3)` means an image of 64 x 64 pixels with 3 color channels (RGB). |
| Convolutional layers | Convolutional layers identify patterns in images by sliding small filters over the input. |
| Strides | Strides control how far the filter moves at each step, which downsamples the image. |
| Batch normalization | Batch normalization normalizes, then scales and shifts, the output of the previous layer. |
| Leaky ReLU | Leaky ReLU is an activation function that allows some signal to pass through even for negative inputs, so that neurons don't die out. |
| Flattening | Flattening converts multi-dimensional data into a 1D array. |
| Dropout | Dropout randomly deactivates neurons during training to reduce overfitting. |
| Dense layers | A `Dense` (fully connected) layer connects every neuron from the previous layer to each of its own neurons; here, the final `Dense` layer has a single neuron. |
| Sigmoid | A layer with a sigmoid activation outputs a probability value between 0 and 1. |
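To make the strides and Leaky ReLU rows above concrete, here is a small, framework-free sketch (plain Python, no TensorFlow required) of the arithmetic involved:

```python
import math

def conv_output_size(input_size, stride):
    # With 'same' padding, the output size depends only on the stride:
    # output = ceil(input / stride)
    return math.ceil(input_size / stride)

def leaky_relu(x, alpha=0.2):
    # Positive values pass through unchanged; negative values are
    # scaled by alpha instead of being zeroed out, so neurons don't "die".
    return x if x >= 0 else alpha * x

size = 64
for _ in range(3):                  # three stride-2 convolutions
    size = conv_output_size(size, 2)
print(size)                         # 64 -> 32 -> 16 -> 8

print(leaky_relu(5.0))              # 5.0
print(leaky_relu(-5.0))             # -1.0
```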
The code for setting up the discriminator of our anime generator is given below. The core concepts explained above make up most of the code, which is written using TensorFlow and Keras.
```python
from tensorflow.keras import Sequential, layers

discriminator = Sequential(name='discriminator')
input_shape = (64, 64, 3)
discriminator.add(layers.Conv2D(64, (4, 4), strides=(2, 2), padding='same', input_shape=input_shape))
discriminator.add(layers.BatchNormalization())
discriminator.add(layers.LeakyReLU(alpha=0.2))
discriminator.add(layers.Conv2D(128, (4, 4), strides=(2, 2), padding='same'))
discriminator.add(layers.BatchNormalization())
discriminator.add(layers.LeakyReLU(alpha=0.2))
discriminator.add(layers.Conv2D(128, (4, 4), strides=(2, 2), padding='same'))
discriminator.add(layers.BatchNormalization())
discriminator.add(layers.LeakyReLU(alpha=0.2))
discriminator.add(layers.Flatten())
discriminator.add(layers.Dropout(0.3))
discriminator.add(layers.Dense(1, activation='sigmoid'))
```
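As a sanity check on the architecture above, we can trace the tensor shapes through the network by hand. The helper below is a plain-Python sketch (not part of the model); `BatchNormalization`, `LeakyReLU`, and `Dropout` leave the shape unchanged:

```python
import math

def conv2d_same(shape, filters, stride=2):
    # A Conv2D with 'same' padding and stride s maps (h, w, c)
    # to (ceil(h / s), ceil(w / s), filters).
    h, w, _ = shape
    return (math.ceil(h / stride), math.ceil(w / stride), filters)

shape = (64, 64, 3)
for filters in (64, 128, 128):
    shape = conv2d_same(shape, filters)
    print(shape)            # (32, 32, 64) -> (16, 16, 128) -> (8, 8, 128)

flat_units = shape[0] * shape[1] * shape[2]
print(flat_units)           # 8192 units feed the final Dense(1) layer
```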
Let's dive deeper into the steps of the discriminator creation.
We create a `Sequential` model that we name `discriminator`.
Next, we define the input shape for the model, i.e., a 64 x 64 image with 3 RGB color channels.
We add a 2D convolutional layer, `Conv2D`, with 64 filters of size 4 x 4. The convolution has a stride of 2 in both dimensions, and the padding is set to `'same'`, so the output size is determined by the stride alone: each of these layers halves the spatial dimensions. The `input_shape` variable supplies the input shape for this first layer.
We add a `BatchNormalization` layer to stabilize the learning process by normalizing the previous layer's outputs.
Then we add a `LeakyReLU` activation function with a slope of 0.2, introducing a small gradient for negative inputs to prevent dying neurons.
We add another 2D convolutional layer, `Conv2D`, but this time with 128 filters.
Next, we add another `BatchNormalization` layer for the same purpose.
We also add another `LeakyReLU` activation with the same slope.
We again add a 2D convolutional layer, `Conv2D`, with the same settings.
We add another `BatchNormalization` layer.
Then we add another `LeakyReLU` activation.
We convert the output to a 1D vector using a `Flatten` layer.
We add a `Dropout` layer with a rate of 0.3, which randomly deactivates 30% of the neurons during training to prevent overfitting.
Finally, we add a `Dense`, or fully connected, layer with 1 neuron and sigmoid activation. This layer produces a probability score showing whether the input image is real or fake.
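The `Dropout` step above can be illustrated in isolation. The sketch below is plain Python, but it mirrors what Keras actually does during training ("inverted dropout": surviving units are scaled by 1 / (1 - rate) so the expected activations are unchanged):

```python
import random

random.seed(42)  # for reproducibility

def dropout(values, rate=0.3):
    # Keep each unit with probability (1 - rate); scale survivors
    # by 1 / (1 - rate) so the expected sum stays the same.
    keep = 1 - rate
    return [v / keep if random.random() < keep else 0.0 for v in values]

out = dropout([1.0] * 10)
print(out)  # roughly 70% of entries are ~1.43, the rest are 0.0
```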
```python
discriminator.summary()
```
We can now display a summary of the model's architecture using `discriminator.summary()`, showing the layers, output shapes, parameter counts, etc.
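The parameter counts that `summary()` prints can also be derived by hand. Here is a plain-Python sketch of that arithmetic; the totals assume the exact architecture above (three conv blocks, then `Dense(1)` on the 8 x 8 x 128 = 8192 flattened units):

```python
def conv2d_params(kernel, in_ch, out_ch):
    # One weight per kernel cell per input channel per filter, plus one bias per filter.
    return kernel * kernel * in_ch * out_ch + out_ch

def batchnorm_params(channels):
    # gamma, beta, moving mean, and moving variance: four values per channel.
    return 4 * channels

total = (conv2d_params(4, 3, 64)    + batchnorm_params(64)    # block 1
         + conv2d_params(4, 64, 128)  + batchnorm_params(128)   # block 2
         + conv2d_params(4, 128, 128) + batchnorm_params(128)   # block 3
         + 8 * 8 * 128 + 1)                                     # Dense(1) on 8192 inputs
print(total)  # 406081
```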
Having gone through both the code and the concepts, we now understand the discriminator of our AI anime GAN model. In short, the discriminator takes real and generated images as inputs and learns to distinguish between them. Along the way, it gradually trains itself to become better at telling real images apart from the ones produced by the generator. The output of the discriminator is its evaluation of whether an image is real or AI-generated. In this way, it pushes our generator to produce more convincing images too.
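To see this training dynamic in miniature, here is a toy, framework-free sketch of a "discriminator" being fitted with binary cross-entropy. The model is a one-weight logistic regression, and the 1-D "real" and "generated" samples are invented purely for illustration; they stand in for images only conceptually:

```python
import math
import random

random.seed(0)

# A one-weight logistic model standing in for the discriminator.
w, b, lr = 0.0, 0.0, 0.1

def predict(x):
    # Sigmoid output, like the final Dense(1, activation='sigmoid') layer.
    return 1 / (1 + math.exp(-(w * x + b)))

for _ in range(200):
    # Toy data: "real" samples near +1.0 (label 1), "generated" near -1.0 (label 0).
    for x, y in [(random.gauss(1.0, 0.3), 1), (random.gauss(-1.0, 0.3), 0)]:
        p = predict(x)
        # Gradient of binary cross-entropy with respect to w and b.
        w -= lr * (p - y) * x
        b -= lr * (p - y)

print(predict(1.0), predict(-1.0))  # high score for "real", low for "generated"
```

In the real GAN, the same loss drives both networks: the discriminator minimizes it while the generator tries to make the discriminator's job harder.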
Note: The complete GAN code for generating new images using the discriminator and generator is present here. After you've understood both of these Answers, you can continue with the rest of the code.