Choose optimal number of epochs to train a neural network in Keras

Situation

One of the critical issues when training a neural network on sample data is overfitting. When a model is trained for more epochs than necessary, it learns patterns that are specific to the sample data. Such a model achieves high accuracy on the training set (sample data) but fails to achieve good accuracy on the test set or on any new dataset. In other words, the model loses its ability to generalize by overfitting to the training data.

Solution

Steps to follow

Finding the optimal number of epochs to avoid overfitting on the MNIST dataset. Loading and preprocessing the dataset:

import keras
# to_categorical is imported from keras.utils; the np_utils submodule is
# not available in recent Keras releases
from keras.utils import to_categorical
from keras.datasets import mnist

# Loading data
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Reshaping data: adding the number of channels as 1 (grayscale images)
train_images = train_images.reshape((train_images.shape[0],
                                     train_images.shape[1],
                                     train_images.shape[2], 1))
test_images = test_images.reshape((test_images.shape[0],
                                   test_images.shape[1],
                                   test_images.shape[2], 1))

# Scaling pixel values down to the range [0, 1]
train_images = train_images.astype('float32') / 255
test_images = test_images.astype('float32') / 255

# Encoding labels to a binary class matrix (one-hot)
y_train = to_categorical(train_labels)
y_test = to_categorical(test_labels)
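The label-encoding step can be illustrated without Keras. The sketch below is a plain-Python stand-in for the kind of matrix that to_categorical produces (one row per label, with a single 1 in the column of the class). The function name one_hot is hypothetical, the number of classes is passed explicitly here, and the real Keras function returns a NumPy array rather than nested lists.

```python
def one_hot(labels, num_classes):
    """Return a one-hot matrix (list of rows) for integer class labels.

    Illustrative stand-in for keras.utils.to_categorical, not its
    actual implementation.
    """
    matrix = []
    for label in labels:
        row = [0.0] * num_classes   # start with all zeros
        row[label] = 1.0            # mark the column of the true class
        matrix.append(row)
    return matrix


# Three example digit labels encoded over the 10 MNIST classes
encoded = one_hot([5, 0, 4], num_classes=10)
for row in encoded:
    print(row)
```

Each row sums to 1, which is the target format that the categorical cross-entropy loss expects.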

Building a CNN model:

from keras import models
from keras import layers

model = models.Sequential()
# Two convolution + max-pooling stages extract image features
model.add(layers.Conv2D(32, (3, 3), activation="relu",
                        input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D(2, 2))
model.add(layers.Conv2D(64, (3, 3), activation="relu"))
model.add(layers.MaxPooling2D(2, 2))
# Flatten the feature maps and classify into the 10 digit classes
model.add(layers.Flatten())
model.add(layers.Dense(64, activation="relu"))
model.add(layers.Dense(10, activation="softmax"))
model.summary()

model.summary() prints a summary of the model, listing each layer with its output shape and number of parameters.
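The parameter counts that model.summary() reports for this architecture can be reproduced by hand. The following is a minimal sketch, assuming the Keras defaults used above ("valid" padding and stride 1 for Conv2D, non-overlapping 2x2 max pooling); the helper names are illustrative and not part of Keras.

```python
def conv2d_params(kernel, in_channels, filters):
    # One kernel x kernel x in_channels weight block plus one bias per filter
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(in_units, out_units):
    # One weight per input plus one bias, for every output unit
    return (in_units + 1) * out_units


def conv_out(size, kernel):
    # Spatial size after a 'valid' convolution with stride 1
    return size - kernel + 1


def pool_out(size, pool):
    # Spatial size after non-overlapping max pooling
    return size // pool


size = 28
size = pool_out(conv_out(size, 3), 2)   # Conv2D(32) then MaxPooling2D
size = pool_out(conv_out(size, 3), 2)   # Conv2D(64) then MaxPooling2D
flat_units = size * size * 64           # Flatten output

counts = {
    "conv2d_1": conv2d_params(3, 1, 32),
    "conv2d_2": conv2d_params(3, 32, 64),
    "dense_1": dense_params(flat_units, 64),
    "dense_2": dense_params(64, 10),
}
total_params = sum(counts.values())
print(counts, total_params)
```

Most of the parameters sit in the first Dense layer, because it connects every flattened feature to each of its 64 units.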

Compiling the model with the RMSprop optimizer, categorical cross-entropy as the loss function, and accuracy as the success metric:

model.compile(optimizer="rmsprop", loss="categorical_crossentropy",
              metrics=['accuracy'])

Creating a validation set and a smaller training set by partitioning the original training set:

val_images = train_images[:10000]
partial_images = train_images[10000:]
val_labels = y_train[:10000]
partial_labels = y_train[10000:]

Initializing the EarlyStopping callback and training the model:

from keras import callbacks

# Stop when val_loss has not improved for 5 consecutive epochs and
# restore the weights from the best epoch
earlystopping = callbacks.EarlyStopping(monitor="val_loss",
                                        mode="min", patience=5,
                                        restore_best_weights=True)

history = model.fit(partial_images, partial_labels, batch_size=128,
                    epochs=25, validation_data=(val_images, val_labels),
                    callbacks=[earlystopping])

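Training stopped at the 11th epoch. Because patience is 5, the callback stops only after the validation loss has failed to improve for five consecutive epochs, so the lowest validation loss was reached around epoch 6 and the model was already overfitting in the epochs after that. With restore_best_weights = True, the model keeps the weights from that best epoch. The optimal number of epochs is therefore the epoch with the lowest validation loss, and it applies to this model and this dataset only; other datasets, architectures, or even repeated runs can give a different value.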
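How patience relates the epoch where training stops to the epoch with the best validation loss can be sketched in plain Python. This is a simplified model of the callback's behaviour (strict improvement, no min_delta), not Keras's actual implementation, and the validation losses below are made-up numbers chosen for illustration.

```python
def simulate_early_stopping(val_losses, patience):
    """Return (stopped_epoch, best_epoch), both counted from 1.

    Training stops once the validation loss has failed to improve
    for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # New best value: remember it and reset the counter
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Ran through all epochs without triggering early stopping
    return len(val_losses), best_epoch


# Hypothetical validation losses: improving until epoch 6, then worsening
hypothetical_val_losses = [0.20, 0.10, 0.07, 0.06, 0.05, 0.04,
                           0.045, 0.05, 0.05, 0.06, 0.07, 0.08, 0.09]
stopped, best = simulate_early_stopping(hypothetical_val_losses, patience=5)
print(stopped, best)
```

In this sketch the gap between the two returned epochs equals the patience value whenever early stopping is triggered.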

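Observing loss values without the EarlyStopping callback:
Train the model for the full 25 epochs and plot the training loss and the validation loss against the number of epochs. In such a plot, overfitting shows up as the point where the validation loss stops decreasing and starts to rise while the training loss keeps falling.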
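One possible way to produce such a plot is sketched below. It assumes the model was trained without the callback and that the result of model.fit was stored in a variable named history, whose history.history dictionary contains the "loss" and "val_loss" lists. The helper names are illustrative, matplotlib is an assumed extra dependency, and the small dictionary at the end is made-up data used only to show the call.

```python
def loss_curves(history_dict):
    """Return (epochs, train_loss, val_loss) from a Keras-style history dict."""
    train_loss = list(history_dict["loss"])
    val_loss = list(history_dict["val_loss"])
    epochs = list(range(1, len(train_loss) + 1))  # epochs counted from 1
    return epochs, train_loss, val_loss


def best_epoch(history_dict):
    """Return the epoch (counted from 1) with the lowest validation loss."""
    epochs, _, val_loss = loss_curves(history_dict)
    return epochs[val_loss.index(min(val_loss))]


def plot_loss(history_dict):
    """Plot training and validation loss against the epoch number."""
    # Imported here so the helpers above work without matplotlib installed
    import matplotlib.pyplot as plt

    epochs, train_loss, val_loss = loss_curves(history_dict)
    plt.plot(epochs, train_loss, "bo", label="Training loss")
    plt.plot(epochs, val_loss, "b", label="Validation loss")
    plt.xlabel("Epochs")
    plt.ylabel("Loss")
    plt.legend()
    plt.show()


# Made-up values standing in for history.history
example_history = {"loss": [0.5, 0.3, 0.2, 0.1],
                   "val_loss": [0.4, 0.3, 0.35, 0.4]}
print(best_epoch(example_history))
```

After a real training run, plot_loss(history.history) would draw both curves, and best_epoch(history.history) gives the epoch at which the validation loss was lowest.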

Solution type

Permanent
