DCGANs for the CIFAR-10 Dataset
Introduction
Generative Adversarial Networks (GANs) are a deep learning approach for creating new, synthetic data that resembles a training dataset. A GAN is made up of two neural networks, a generator and a discriminator. The generator tries to produce synthetic data comparable to the training data, while the discriminator tries to separate the synthetic data from the actual training data. The two networks are trained simultaneously: as the discriminator gets better at spotting fakes, the generator gets better at producing data that can fool it. GANs have been used to produce many types of synthetic data, including images, audio, and text.
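The adversarial game described above is usually written as a minimax objective. In the formula below, D(x) is the discriminator's estimated probability that x is real, G(z) is the generator's output for a latent vector z, p_data is the data distribution and p_z is the latent prior. These symbols are introduced here for illustration only and are not used elsewhere in this article.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator tries to make both terms large, while the generator tries to make the second term small by pushing D(G(z)) toward 1.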
There are many different variations of generative adversarial networks (GANs), each with its own unique characteristics and applications. Some of the most common types of GANs include:
- Vanilla GANs: These are the simplest and most basic types of GANs. They consist of a generator and a discriminator, as described above.
- Conditional GANs: These GANs are able to generate synthetic data that is conditioned on some additional input. For example, a conditional GAN could be trained to generate images of a specific type of object (e.g. cats) when given a label indicating the desired object type as input.
- Deep Convolutional GANs (DCGANs): These GANs use deep convolutional neural networks as the generator and discriminator, which makes them well-suited for generating images.
- InfoGANs: These GANs are designed to disentangle the latent factors of variation in the training data and allow control over the generated data by manipulating these factors.
- Wasserstein GANs (WGANs): These GANs use the Wasserstein distance as a measure of the difference between the real data distribution and the synthetic data distribution, rather than the traditional GAN objective of minimizing the cross-entropy loss.
- CycleGANs: These GANs are used for image-to-image translation tasks, such as translating photos of horses into photos of zebras.
- StyleGANs: These GANs are able to generate highly realistic images, and are particularly well-suited for tasks such as generating synthetic faces.
Dataset Introduction
The CIFAR-10 dataset consists of 60,000 32x32 colour images in 10 classes, with 6,000 images per class. There are 50,000 training images and 10,000 test images. The 10 classes in the dataset are:
- Aeroplane
- Automobile
- Bird
- Cat
- Deer
- Dog
- Frog
- Horse
- Ship
- Truck
The images are small, low-resolution colour photographs of animals and vehicles. They were collected from the web and labelled by human annotators, and each image carries exactly one of the 10 class labels. The dataset was collected by Alex Krizhevsky, Vinod Nair and Geoffrey Hinton and takes its name from the Canadian Institute for Advanced Research (CIFAR). CIFAR-10 is widely used as a benchmark for image classification and for testing computer vision and machine learning algorithms. It is relatively small, which makes it a good starting point for experimenting with different algorithms and models, and the low image resolution makes it possible to train models on a regular personal computer.
The dataset is available for download from the CIFAR-10 page hosted by the University of Toronto and can be used for non-commercial research purposes. It can also be fetched through the Keras dataset loader, which is how it is loaded below.
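# example of loading the cifar10 dataset
from keras.datasets.cifar10 import load_data
import matplotlib.pyplot as plt
# load the images into memory
(trainX, trainy), (testX, testy) = load_data()
# plot the first 25 training images in a 5x5 grid
fig, ax = plt.subplots(ncols=5, nrows=5, figsize=(10, 10))
for i in range(5):
    for j in range(5):
        # i * 5 + j walks the images row by row without repeats
        ax[i][j].imshow(trainX[i * 5 + j])
        ax[i][j].axis('off')
plt.show()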
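As a quick sanity check on the class balance described above, the labels can be tallied per class. The sketch below is an illustration only: class_counts and CLASS_NAMES are hypothetical names not used elsewhere in this article, and a small synthetic label array stands in for trainy (which Keras returns as an integer array of shape (n, 1)) so that the example runs without downloading the dataset.

```python
import numpy as np

# Class names in CIFAR-10 label order (label 0 is aeroplane, label 9 is truck)
CLASS_NAMES = ['aeroplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

def class_counts(labels, names=CLASS_NAMES):
    """Return {class name: number of images} for an (n, 1) or (n,) integer label array."""
    counts = np.bincount(np.asarray(labels).ravel(), minlength=len(names))
    return {name: int(count) for name, count in zip(names, counts)}

# Synthetic stand-in for trainy (an assumption for this demo): 3 labels per class, shuffled
rng = np.random.default_rng(0)
demo_labels = rng.permutation(np.repeat(np.arange(10), 3)).reshape(-1, 1)
demo_counts = class_counts(demo_labels)
print(demo_counts)
```

Calling class_counts(trainy) on the real training labels should report 5,000 images for every class.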
Preprocessing
- Feature Scaling: Feature scaling is an important preprocessing step. The raw pixel values are integers in [0, 255], while the generator built later ends in a tanh activation whose outputs lie in [-1, 1], so the real images are rescaled to that same range. This is a min-max scaling to [-1, 1], applied here by subtracting 127.5 and dividing by 127.5.
def load_real_samples():
    # convert from unsigned ints to floats
    X = trainX.astype('float32')
    # scale from [0, 255] to [-1, 1]
    X = (X - 127.5) / 127.5
    return X
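The scaling above and its inverse (used later when plotting generated images) can be checked on a handful of pixel values. This is a minimal sketch: the helper names to_tanh_range and to_display_range are hypothetical and do not appear elsewhere in the article.

```python
import numpy as np

def to_tanh_range(pixels):
    """Map uint8 pixel values in [0, 255] to floats in [-1, 1]."""
    return (pixels.astype('float32') - 127.5) / 127.5

def to_display_range(scaled):
    """Map values in [-1, 1] to [0, 1], the range imshow expects for float images."""
    return (scaled + 1.0) / 2.0

pixels = np.array([0, 127, 128, 255], dtype='uint8')
scaled = to_tanh_range(pixels)
restored = to_display_range(scaled)
print(scaled.min(), scaled.max())  # the extremes 0 and 255 map to -1.0 and 1.0
```

Keeping the two mappings side by side makes it easier to see that plotting code must undo the training-time scaling before calling imshow.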
Preparing data for the Discriminator Model
1. Generate Real Samples: Before building the discriminator, we need a way to draw real and fake samples and label them accordingly. Real images are labelled 1 and fake images 0, and the discriminator is trained on both so that it learns to tell them apart.
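from numpy import ones
from numpy.random import randint
def generate_real_samples(dataset, n_samples):
    # choose random image indices (sampled with replacement)
    ix = randint(0, dataset.shape[0], n_samples)
    X = dataset[ix]
    # label real images as 1
    y = ones((n_samples, 1))
    return X, y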
2. Generate Fake Samples: Fake samples matter as much as real ones. At this stage there is no generator yet, so the fake samples are simply images of uniformly random pixel values in [-1, 1], labelled 0.
from numpy import zeros
from numpy.random import rand
def generate_fake_samples(n_samples):
    # generate uniform samples in [0, 1)
    X = rand(32 * 32 * 3 * n_samples)
    # scale to [-1, 1): multiply by 2, then shift by -1
    X = -1 + X * 2
    # reshape into a batch of colour images
    X = X.reshape((n_samples, 32, 32, 3))
    # label fake images as 0
    y = zeros((n_samples, 1))
    return X, y
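The scaling step for the random images relies on the affine map u -> 2u - 1, which sends uniform noise in [0, 1) to [-1, 1), the same range as the scaled real images. A minimal check of that mapping follows. As an assumption made for reproducibility, it uses numpy's seeded Generator API rather than the rand function used in the article.

```python
import numpy as np

rng = np.random.default_rng(42)   # seeded generator, an assumption for repeatable output
u = rng.random(10_000)            # uniform samples in [0, 1)
scaled = 2.0 * u - 1.0            # affine map to [-1, 1)

print(scaled.min() >= -1.0, scaled.max() < 1.0)
print(abs(scaled.mean()) < 0.05)  # the mean of a symmetric range should sit near 0
```

Adding 2 instead of multiplying by 2 would shift the noise to [1, 2), outside the range of the real images, which would make the two classes trivially separable.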
Building & Training the Discriminator Model
In a Generative Adversarial Network (GAN), the discriminator model is responsible for distinguishing between real and fake samples. The goal of the discriminator is to correctly identify whether a given sample is real (from the training set) or fake (generated by the generator model).
To build a discriminator model for a GAN, one typically starts by selecting a suitable deep learning architecture, such as a convolutional neural network (CNN) or a deep feedforward network (DNN). The architecture should be chosen based on the type of data the GAN will be working with (e.g. images, text, etc.).
Next, the model is initialized with random weights and trained on a labelled mix of real and fake samples. During training, the model is fed both kinds of samples, and its weights are adjusted according to the errors it makes in telling them apart.
from keras.models import Sequential
from keras.layers import Conv2D, LeakyReLU, Flatten, Dropout, Dense
from keras.optimizers import Adam
def define_discriminator_model(in_shape=(32, 32, 3)):
    model = Sequential()
    # downsample to 16x16
    model.add(Conv2D(64, (3,3), strides=(2,2), padding='same', input_shape=in_shape))
    model.add(LeakyReLU(0.2))
    # downsample to 8x8
    model.add(Conv2D(128, (3,3), strides=(2,2), padding='same'))
    model.add(LeakyReLU(0.2))
    # downsample to 4x4
    model.add(Conv2D(128, (3,3), strides=(2,2), padding='same'))
    model.add(LeakyReLU(0.2))
    # downsample to 2x2
    model.add(Conv2D(256, (3,3), strides=(2,2), padding='same'))
    model.add(LeakyReLU(0.2))
    # classifier
    model.add(Flatten())
    model.add(Dropout(0.4))
    model.add(Dense(1, activation='sigmoid'))
    # compile model (learning_rate replaces the deprecated lr argument)
    opt = Adam(learning_rate=0.0002, beta_1=0.5)
    model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
    return model
model = define_discriminator_model()
model.summary()
From the snippet above, we can see that the discriminator stacks four convolutional layers with 64, 128, 128 and 256 filters. Every layer uses strides = (2,2) with 'same' padding, so each one halves the spatial size: the 32x32x3 input becomes 16x16x64, then 8x8x128, then 4x4x128 and finally 2x2x256, which is flattened and passed to a single sigmoid unit. We will do the reverse, upsampling, when we build the generator model.
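The feature-map sizes for the discriminator code as written (four Conv2D layers, each with strides=(2,2) and padding='same') can be traced by hand. With 'same' padding, a strided convolution produces ceil(input_size / stride) outputs along each spatial axis. The sketch below applies that rule; the helper name same_padding_output is hypothetical.

```python
import math

def same_padding_output(size, stride):
    """Spatial output size of a convolution with padding='same'."""
    return math.ceil(size / stride)

filters = [64, 128, 128, 256]   # filter counts taken from define_discriminator_model
size = 32                       # CIFAR-10 images are 32x32
shapes = []
for n_filters in filters:
    size = same_padding_output(size, stride=2)
    shapes.append((size, size, n_filters))

# number of values fed to the final Dense layer after Flatten
flattened = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]
print(shapes)      # [(16, 16, 64), (8, 8, 128), (4, 4, 128), (2, 2, 256)]
print(flattened)   # 1024
```

These numbers should match the output shapes reported by model.summary().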
def train_discriminator_model(model, dataset, n_iter=20, n_batch=128):
    half_batch = int(n_batch / 2)
    for i in range(n_iter):
        # update on a half batch of real samples
        X_real, y_real = generate_real_samples(dataset, half_batch)
        _, real_acc = model.train_on_batch(X_real, y_real)
        # update on a half batch of fake samples
        X_fake, y_fake = generate_fake_samples(half_batch)
        _, fake_acc = model.train_on_batch(X_fake, y_fake)
        print("Real Accuracy ", (real_acc * 100), "Fake Accuracy", (fake_acc * 100))
# define the discriminator model
model = define_discriminator_model()
# load image data
dataset = load_real_samples()
# fit the model
train_discriminator_model(model, dataset)
The snippet above trains the discriminator on its own. Each iteration updates the model with a half batch of real samples and then a half batch of fake samples, and prints the accuracy on each.
Building & Testing Generator Model
In a GAN, the generator's job is to fool the discriminator into classifying generated images as real. The relationship cuts both ways: when the discriminator correctly flags a generated image as fake, the resulting error is used to update the generator's weights, so a stronger discriminator pushes the generator to produce more convincing images.
from keras.models import Sequential
from keras.layers import Dense, Reshape, Conv2D, Conv2DTranspose, LeakyReLU
def define_generator_model(latent_dim):
    model = Sequential()
    # foundation for 4x4 image
    n_nodes = 256 * 4 * 4
    model.add(Dense(n_nodes, input_dim=latent_dim))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Reshape((4, 4, 256)))
    # upsample to 8x8
    model.add(Conv2DTranspose(128, (4,4), strides=(2,2), padding='same'))
    model.add(LeakyReLU(alpha=0.2))
    # upsample to 16x16
    model.add(Conv2DTranspose(128, (4,4), strides=(2,2), padding='same'))
    model.add(LeakyReLU(alpha=0.2))
    # upsample to 32x32
    model.add(Conv2DTranspose(128, (4,4), strides=(2,2), padding='same'))
    model.add(LeakyReLU(alpha=0.2))
    # output layer: tanh keeps pixel values in [-1, 1]
    model.add(Conv2D(3, (3,3), activation='tanh', padding='same'))
    return model
# define the size of the latent space
latent_dim = 100
# define the generator model
model = define_generator_model(latent_dim)
# summarize the model
model.summary()
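The snippet above creates the generator. It takes a 100-dimensional latent vector, projects it with a Dense layer to 4*4*256 values, and reshapes them into a 4x4x256 feature map. Three Conv2DTranspose layers then upsample this map to 8x8, 16x16 and 32x32, and a final Conv2D layer with a tanh activation produces the 32x32x3 image with pixel values in [-1, 1].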
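The generator's shapes can be traced the same way as the discriminator's. A Conv2DTranspose layer with padding='same' multiplies each spatial dimension by its stride, so three stride-2 layers take the 4x4 foundation to 32x32. The sketch below is plain arithmetic with the layer settings copied from define_generator_model; the variable names are chosen for this illustration.

```python
latent_dim = 100
h, w, channels = 4, 4, 256

# Dense layer parameters: one weight per (input, output) pair plus one bias per output
dense_params = latent_dim * (h * w * channels) + (h * w * channels)

shapes = [(h, w, channels)]
for n_filters in (128, 128, 128):     # the three Conv2DTranspose layers
    h, w = h * 2, w * 2               # stride 2 with 'same' padding doubles each side
    shapes.append((h, w, n_filters))
shapes.append((h, w, 3))              # final Conv2D keeps 32x32 and outputs 3 colour channels

print(dense_params)   # 413696
print(shapes)         # [(4, 4, 256), (8, 8, 128), (16, 16, 128), (32, 32, 128), (32, 32, 3)]
```

The final shape matches the (32, 32, 3) input that the discriminator expects, which is what allows the two models to be stacked.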
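from numpy import zeros
import matplotlib.pyplot as plt
# note: generate_latent_points is defined in the "Testing the GAN" section below;
# run that definition first if executing the code in order
def generate_fake_samples(model, latent_dim, n_samples):
    # sample points in the latent space
    x_input = generate_latent_points(latent_dim, n_samples)
    # let the generator turn them into images
    X = model.predict(x_input)
    # label generated images as 0 (fake)
    y = zeros((n_samples, 1))
    return X, y
latent_dim = 100
n_samples = 50
model = define_generator_model(latent_dim)
X, _ = generate_fake_samples(model, latent_dim, n_samples)
# rescale from [-1, 1] to [0, 1] for plotting
X = (X + 1) / 2.0
fig, ax = plt.subplots(ncols=3, nrows=3, figsize=(10, 10))
for i in range(3):
    for j in range(3):
        # i * 3 + j walks the images row by row without repeats
        ax[i][j].imshow(X[i * 3 + j])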
The figure above shows the images produced by the untrained generator from random latent points. Because its weights are still random, the output is nothing but noise.
Building & Training the GAN Model
To train the generator, we build a combined model that stacks the generator and the discriminator, so that the discriminator's verdict on generated images can be used to update the generator's weights.
from keras.models import Sequential
from keras.optimizers import Adam
def define_gan(g_model, d_model):
    # freeze the discriminator's weights inside the combined model
    d_model.trainable = False
    model = Sequential()
    model.add(g_model)
    model.add(d_model)
    # compile model (learning_rate replaces the deprecated lr argument)
    opt = Adam(learning_rate=0.0002, beta_1=0.5)
    model.compile(loss='binary_crossentropy', optimizer=opt)
    return model
# size of the latent space
latent_dim = 100
# create the discriminator
d_model = define_discriminator_model()
# create the generator
g_model = define_generator_model(latent_dim)
# create the gan
gan_model = define_gan(g_model, d_model)
# summarize gan model
gan_model.summary()
The snippet above creates the GAN model by combining the generator and discriminator models. The discriminator's weights are frozen inside the combined model, so training it only updates the generator.
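Because the combined model is compiled with binary cross-entropy and the discriminator is frozen, training it with a target of 1 on generated images (as the training loop later in the article does) gives the generator a small loss only when the discriminator scores its images as real. The sketch below illustrates that with a hand-written numpy version of the loss. The function is a simplified stand-in written for this example, not the Keras implementation, and the discriminator scores are made-up values.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy; predictions are clipped to avoid log(0)."""
    y_pred = np.clip(y_pred, eps, 1 - eps)
    losses = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return float(np.mean(losses))

# Inverted labels: generated images are presented with the target "real" (1)
target_real = np.ones((4, 1))

fooled = np.full((4, 1), 0.9)   # hypothetical scores: discriminator thinks the fakes are real
caught = np.full((4, 1), 0.1)   # hypothetical scores: discriminator spots the fakes

loss_fooled = binary_crossentropy(target_real, fooled)
loss_caught = binary_crossentropy(target_real, caught)
print(loss_fooled < loss_caught)  # True: fooling the discriminator lowers the generator's loss
```

With a target of 1 the loss reduces to -log(D(G(z))), so gradient descent on it moves the generator toward images the discriminator scores closer to 1.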
from numpy import ones
def train(g_model, d_model, gan_model, dataset, latent_dim, n_epochs=200, n_batch=128):
    bat_per_epo = int(dataset.shape[0] / n_batch)
    half_batch = int(n_batch / 2)
    # manually enumerate epochs
    for i in range(n_epochs):
        # enumerate batches over the training set
        for j in range(bat_per_epo):
            # get randomly selected 'real' samples
            X_real, y_real = generate_real_samples(dataset, half_batch)
            # update discriminator model weights
            d_loss1, _ = d_model.train_on_batch(X_real, y_real)
            # generate 'fake' examples
            X_fake, y_fake = generate_fake_samples(g_model, latent_dim, half_batch)
            # update discriminator model weights
            d_loss2, _ = d_model.train_on_batch(X_fake, y_fake)
            # prepare points in latent space as input for the generator
            X_gan = generate_latent_points(latent_dim, n_batch)
            # create inverted labels for the fake samples
            y_gan = ones((n_batch, 1))
            # update the generator via the discriminator's error
            g_loss = gan_model.train_on_batch(X_gan, y_gan)
            # summarize loss on this batch
            print('>%d, %d/%d, d1=%.3f, d2=%.3f g=%.3f' %
                (i+1, j+1, bat_per_epo, d_loss1, d_loss2, g_loss))
        # every 10 epochs, evaluate the models and save the generator
        # (summarize_performance must be defined before calling train)
        if (i+1) % 10 == 0:
            summarize_performance(i, g_model, d_model, dataset, latent_dim)
# size of the latent space
latent_dim = 100
# create the discriminator
d_model = define_discriminator_model()
# create the generator
g_model = define_generator_model(latent_dim)
# create the gan
gan_model = define_gan(g_model, d_model)
# load image data
dataset = load_real_samples()
# train model
train(g_model, d_model, gan_model, dataset, latent_dim)
The snippet above trains the GAN. Each iteration first updates the discriminator on a half batch of real samples and a half batch of generated samples, and then updates the generator through the combined GAN model, in which the discriminator's weights are frozen. The trick in this second step is to invert the labels: the generated images are labelled as real (1), so the generator's loss is small only when the discriminator is fooled, and the weight update pushes the generator toward images the discriminator scores as real. The more accurate the discriminator becomes, the more useful its feedback is to the generator, and the better the generator becomes, the harder the discriminator's task is. This back-and-forth is what makes the training adversarial.
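The training loop calls summarize_performance every 10 epochs, but the article does not show that function. The sketch below is one plausible version and should be read as an assumption, not the author's code: it reports the discriminator's accuracy on real and generated samples and saves the generator under a name such as generator_model_200.h5, which matches the file loaded in the testing section. The n_samples parameter and the filename pattern are choices made for this sketch, and the function draws its own samples with numpy so that it does not depend on the helper functions defined earlier.

```python
import numpy as np

def summarize_performance(epoch, g_model, d_model, dataset, latent_dim, n_samples=150):
    """Hypothetical helper: print discriminator accuracy and checkpoint the generator."""
    # real samples, labelled 1
    ix = np.random.randint(0, dataset.shape[0], n_samples)
    X_real, y_real = dataset[ix], np.ones((n_samples, 1))
    _, acc_real = d_model.evaluate(X_real, y_real, verbose=0)
    # generated samples, labelled 0
    z = np.random.randn(n_samples, latent_dim)
    X_fake, y_fake = g_model.predict(z, verbose=0), np.zeros((n_samples, 1))
    _, acc_fake = d_model.evaluate(X_fake, y_fake, verbose=0)
    print('>Accuracy real: %.0f%%, fake: %.0f%%' % (acc_real * 100, acc_fake * 100))
    # save the generator; epoch is zero-based, so epoch 199 gives generator_model_200.h5
    filename = 'generator_model_%03d.h5' % (epoch + 1)
    g_model.save(filename)
    return filename
```

This relies on the discriminator being compiled with metrics=['accuracy'], as it is in define_discriminator_model, so that evaluate returns a loss and an accuracy.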
Testing the GAN
We trained the model for 200 epochs with a batch size of 128. The following code loads the saved generator and plots its results.
from numpy.random import randn
from keras.models import load_model
import matplotlib.pyplot as plt
def generate_latent_points(latent_dim, n_samples):
    # generate points in the latent space
    x_input = randn(latent_dim * n_samples)
    # reshape into a batch of inputs for the network
    x_input = x_input.reshape(n_samples, latent_dim)
    return x_input
# create a plot of generated images in an n x n grid
def save_plot(examples, n):
    fig, ax = plt.subplots(ncols=n, nrows=n, figsize=(20, 20))
    for i in range(n):
        for j in range(n):
            # i * n + j walks the images row by row without repeats
            ax[i][j].imshow(examples[i * n + j])
            ax[i][j].axis('off')
    plt.show()
# load model
model = load_model('generator_model_200.h5')
# generate 100 latent points, one per image
latent_points = generate_latent_points(100, 100)
# generate images
X = model.predict(latent_points)
# scale from [-1,1] to [0,1]
X = (X + 1) / 2.0
# plot the result as a 10x10 grid
save_plot(X, 10)
The results are pretty good! We can just about make out a boat, a car and an aeroplane. The generator did a good job of producing such images from random latent points. It could certainly be improved further, but for now we are happy with the results.