Choose optimal number of epochs to train a neural network in Keras

New configuration (How To)

Situation

One of the critical issues when training a neural network on sample data is overfitting. When a model is trained for more epochs than necessary, it learns patterns that are specific to the sample data rather than patterns that carry over to new data. Such a model achieves high accuracy on the training set (the sample data) but fails to achieve good accuracy on the test set. In other words, the model loses its capacity to generalize by overfitting to the training data.

Solution

Steps to follow

The steps below find the optimal number of epochs to avoid overfitting on the MNIST dataset. Loading the dataset and preprocessing:

import keras
from keras.utils import to_categorical
from keras.datasets import mnist
# Loading data
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
# Reshaping data: adding the number of channels as 1 (grayscale images)
train_images = train_images.reshape((train_images.shape[0],
                                     train_images.shape[1],
                                     train_images.shape[2], 1))
test_images = test_images.reshape((test_images.shape[0],
                                   test_images.shape[1],
                                   test_images.shape[2], 1))
# Scaling pixel values down from [0, 255] to [0, 1]
train_images = train_images.astype('float32') / 255
test_images = test_images.astype('float32') / 255
# Encoding labels to a binary class matrix (one-hot vectors)
y_train = to_categorical(train_labels)
y_test = to_categorical(test_labels)
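To make the label-encoding step concrete, here is a minimal pure-Python sketch of the kind of binary class matrix that to_categorical produces. The helper name one_hot and the sample labels are illustrative assumptions and are not part of Keras.

```python
def one_hot(labels, num_classes):
    """Encode integer class labels as rows of a binary class matrix."""
    return [[1.0 if i == label else 0.0 for i in range(num_classes)]
            for label in labels]


# Hypothetical labels for three digit images: 5, 0 and 9.
encoded = one_hot([5, 0, 9], num_classes=10)
for row in encoded:
    print(row)
```

Each row contains a single 1.0 at the index of the digit, which is the target format that the categorical cross-entropy loss expects.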

Building a CNN model:

from keras import models
from keras import layers
model = models.Sequential()
# Two convolution + max-pooling stages extract image features
model.add(layers.Conv2D(32, (3, 3), activation="relu",
                        input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation="relu"))
model.add(layers.MaxPooling2D((2, 2)))
# Dense classifier on top; softmax outputs one probability per digit
model.add(layers.Flatten())
model.add(layers.Dense(64, activation="relu"))
model.add(layers.Dense(10, activation="softmax"))
model.summary()

Summary of the model:
[Figure: output of model.summary()]
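The parameter counts reported in a model summary can be checked by hand. The sketch below recomputes them for the layers defined above, assuming the Keras defaults of no padding for Conv2D and a bias term in every layer. The helper names and dictionary keys are illustrative, not Keras layer names.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return (inputs + 1) * units


# Spatial size: 28 -> 26 (3x3 conv, no padding) -> 13 (2x2 pool)
#               -> 11 (3x3 conv, no padding) -> 5 (2x2 pool)
flattened = 5 * 5 * 64

layer_params = {
    "first_conv": conv2d_params(3, 3, 1, 32),
    "second_conv": conv2d_params(3, 3, 32, 64),
    "hidden_dense": dense_params(flattened, 64),
    "output_dense": dense_params(64, 10),
}
total_params = sum(layer_params.values())
print(layer_params, total_params)
```

Most of the parameters sit in the first dense layer, because it connects all 1600 flattened features to 64 units.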

Compiling the model with the RMSprop optimizer, the categorical cross-entropy loss function, and accuracy as the success metric:

model.compile(optimizer="rmsprop", loss="categorical_crossentropy",
              metrics=['accuracy'])
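As a rough illustration of what the categorical cross-entropy loss measures, the sketch below computes it for a single sample in plain Python. The function name, the clipping constant eps, and the probability vectors are assumptions chosen for the example; Keras computes the loss over whole batches internally.

```python
import math


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy for one sample: -sum(y_true_i * log(y_pred_i))."""
    # Clip predictions away from 0 so that log() is always defined.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


# Hypothetical 3-class example: the true class is index 2.
y_true = [0.0, 0.0, 1.0]
confident = categorical_crossentropy(y_true, [0.1, 0.2, 0.7])
unsure = categorical_crossentropy(y_true, [0.4, 0.4, 0.2])
print(confident, unsure)
```

The loss is lower when the model assigns more probability to the true class, which is why a rising validation loss signals that predictions on unseen data are getting worse.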

Creating a validation set and a training set by partitioning the current training set:

val_images = train_images[:10000]
partial_images = train_images[10000:]
val_labels = y_train[:10000]
partial_labels = y_train[10000:]

Initializing the EarlyStopping callback and training the model:

from keras import callbacks
# Stop once val_loss has not improved for 5 consecutive epochs,
# then restore the weights from the epoch with the lowest val_loss
earlystopping = callbacks.EarlyStopping(monitor="val_loss",
                                        mode="min", patience=5,
                                        restore_best_weights=True)
history = model.fit(partial_images, partial_labels, batch_size=128,
                    epochs=25, validation_data=(val_images, val_labels),
                    callbacks=[earlystopping])

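In this run, training stopped at the 11th epoch. Because patience = 5, this means the validation loss had not improved during the last 5 epochs, so its lowest value was recorded at the 6th epoch, and restore_best_weights = True returns the model to the weights from that epoch. Beyond that point the model starts to overfit. The optimal number of epochs for this model on this dataset is therefore the epoch with the lowest validation loss rather than the epoch at which training stopped, and the exact number can vary from run to run.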
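The patience rule can be traced with a small simulation. This is a simplified sketch of the stopping logic, not the Keras implementation, and the validation-loss values are invented for illustration.

```python
def simulate_early_stopping(val_losses, patience=5):
    """Return (stopped_epoch, best_epoch), both counted from 1.

    Training stops once the validation loss has failed to improve
    for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # New best value: remember it and reset the counter.
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Patience never ran out: training used every epoch.
    return len(val_losses), best_epoch


# Hypothetical validation losses for ten epochs.
losses = [0.10, 0.07, 0.05, 0.06, 0.055, 0.06, 0.07, 0.08, 0.09, 0.10]
stopped, best = simulate_early_stopping(losses, patience=5)
print(stopped, best)  # 8 3
```

In this made-up series the lowest loss occurs at epoch 3 and training halts at epoch 8, so the stopping epoch is the best epoch plus the patience value.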


Observing loss values without the EarlyStopping callback:
Train the model for the full 25 epochs and plot the training loss and validation loss against the number of epochs.
[Figure: training and validation loss plotted against the number of epochs]
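One possible way to produce such a plot is sketched below, assuming matplotlib is installed. The history object returned by model.fit stores per-epoch values in history.history under the keys "loss" and "val_loss". The function names and the output file name are illustrative assumptions.

```python
def best_epoch(val_losses):
    """Return the epoch (counted from 1) with the lowest validation loss."""
    return min(range(len(val_losses)), key=val_losses.__getitem__) + 1


def plot_losses(history_dict, path="loss_curves.png"):
    """Plot training and validation loss per epoch and save the figure."""
    # Imported here so that best_epoch() works without matplotlib.
    import matplotlib.pyplot as plt

    train_loss = history_dict["loss"]
    val_loss = history_dict["val_loss"]
    epochs = range(1, len(train_loss) + 1)

    plt.figure()
    plt.plot(epochs, train_loss, "bo-", label="Training loss")
    plt.plot(epochs, val_loss, "rs-", label="Validation loss")
    # Mark the epoch after which validation loss stops improving.
    plt.axvline(best_epoch(val_loss), linestyle="--", color="gray",
                label="Lowest validation loss")
    plt.xlabel("Epoch")
    plt.ylabel("Loss")
    plt.legend()
    plt.savefig(path)
    plt.close()


# Usage after training without the callback:
# plot_losses(history.history)
```

On a plot like this, overfitting shows up as the point where the training loss keeps falling while the validation loss starts to rise.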

Solution type

Permanent
