# Hyperparameters
training_epochs = 5 # Total number of training epochs
learning_rate = 0.03 # The learning rate
Conv2D - This layer creates a convolution kernel that is convolved with the layer input to produce a tensor of outputs.
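To make the sliding-kernel idea concrete, here is a minimal NumPy sketch of a single-channel convolution with stride 1 and no padding. The function name `conv2d_valid` and the toy inputs are illustrative assumptions, not part of Keras. Bias, activation and multiple filters are left out. Like Keras, it does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding (hypothetical helper)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the kernel with the patch under it, element by element, then sum.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)  # toy 4x4 "image"
kernel = np.ones((3, 3))                          # toy 3x3 kernel
feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)  # (2, 2): each spatial dimension shrinks by kernel_size - 1
```

A real Conv2D layer learns the kernel values during training and applies many kernels in parallel, one per filter.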
BatchNormalization - Normalizes the activations of the previous layer at each batch, i.e. it applies a transformation that keeps the mean activation close to 0 and the activation standard deviation close to 1.
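The normalization step can be sketched in NumPy as follows. This is a simplified training-time version under stated assumptions: the helper name `batch_norm` is hypothetical, and the learned scale and shift parameters and the moving averages used at inference time are omitted.

```python
import numpy as np

def batch_norm(x, eps=1e-3):
    """Normalize each feature (column) using the statistics of the current batch."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    # eps is a small constant that avoids division by zero.
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
activations = rng.normal(loc=5.0, scale=3.0, size=(128, 4))  # toy batch of 128 samples
normalized = batch_norm(activations)
print(normalized.mean(axis=0))  # each value is close to 0
print(normalized.std(axis=0))   # each value is close to 1
```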
Max pooling is a sample-based discretization process. The objective is to down-sample an input representation (image, hidden-layer output matrix, etc.), reducing its dimensionality and allowing for assumptions to be made about features contained in the sub-regions binned.
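As a small illustration, the sketch below applies 2x2 max pooling with stride 2 to a toy feature map using NumPy. The helper `max_pool_2x2` and the input values are assumptions made for this example, and it handles a single channel only.

```python
import numpy as np

def max_pool_2x2(x):
    """Keep the largest value of every non-overlapping 2x2 block."""
    h, w = x.shape
    x = x[:h - h % 2, :w - w % 2]             # drop a trailing odd row or column
    blocks = x.reshape(h // 2, 2, w // 2, 2)  # blocks[i, a, j, b] == x[2*i + a, 2*j + b]
    return blocks.max(axis=(1, 3))

feature_map = np.array([[1, 3, 2, 0],
                        [4, 2, 1, 5],
                        [0, 1, 9, 8],
                        [2, 6, 7, 3]])
pooled = max_pool_2x2(feature_map)
print(pooled)  # [[4 5]
               #  [6 9]]
```

The output has half the height and width of the input, which is the down-sampling described above.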
Dropout is a technique used to reduce overfitting in neural networks; it works best alongside other techniques such as L2 regularization. During training, a random subset of the neurons in a layer is deactivated on each update. This improves generalization because it forces the layer to learn the same "concept" with different neurons. During the prediction phase, dropout is deactivated.
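The behaviour in both phases can be sketched with NumPy. This is a minimal "inverted dropout" illustration, not the Keras implementation; the function name `dropout` and its signature are assumptions. Surviving activations are scaled by 1 / (1 - rate) so that their expected value stays the same.

```python
import numpy as np

def dropout(x, rate, training, rng):
    """Zero a random fraction `rate` of the entries during training only."""
    if not training:
        return x  # prediction phase: the layer passes its input through unchanged
    keep_mask = rng.random(x.shape) >= rate
    # Scale the kept activations so the expected sum is unchanged.
    return x * keep_mask / (1.0 - rate)

rng = np.random.default_rng(42)
activations = np.ones((1000, 10))
train_out = dropout(activations, rate=0.25, training=True, rng=rng)
test_out = dropout(activations, rate=0.25, training=False, rng=rng)
print((train_out == 0).mean())  # roughly 0.25 of the entries are dropped
print(test_out.mean())          # 1.0, nothing is dropped at prediction time
```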
Flatten - Flattens the input. Does not affect the batch size.
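In NumPy terms, flattening amounts to a reshape that keeps the batch axis and merges all the others. The shapes below are illustrative assumptions.

```python
import numpy as np

batch = np.zeros((128, 4, 4, 32))             # (batch, height, width, channels)
flat = batch.reshape(batch.shape[0], -1)      # merge every axis except the batch axis
print(flat.shape)  # (128, 512)
```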
To make this work in Keras, we need to compile the model. An important choice to make is the loss function. We use the categorical_crossentropy loss because it measures the probability error in discrete classification tasks in which the classes are mutually exclusive (each entry is in exactly one class).
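For a single sample with a one-hot label, the loss reduces to the negative log of the probability assigned to the true class. The sketch below uses plain Python. The function is a hypothetical stand-in for illustration, not the Keras implementation, and the probabilities are made up.

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot label and a predicted distribution."""
    # eps guards against log(0) when a predicted probability is exactly zero.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

y_true = [0.0, 1.0, 0.0]     # the true class is index 1
confident = [0.1, 0.8, 0.1]  # most probability on the true class
wrong = [0.7, 0.2, 0.1]      # most probability on a wrong class

loss_confident = categorical_crossentropy(y_true, confident)
loss_wrong = categorical_crossentropy(y_true, wrong)
print(round(loss_confident, 4))  # 0.2231
print(round(loss_wrong, 4))      # 1.6094
```

The loss grows as the model puts less probability on the correct class, which is what training tries to minimize.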
Adadelta is a more robust extension of Adagrad that adapts learning rates based on a moving window of gradient updates, instead of accumulating all past gradients. This way, Adadelta continues learning even when many updates have been done.
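The update rule can be sketched for a single scalar parameter as follows. This follows the commonly described Adadelta formulation with decay `rho` and stability constant `eps`. The function name and the toy objective f(x) = x^2 are assumptions, and the extra learning-rate multiplier that the Keras optimizer exposes is omitted.

```python
import math

def adadelta_minimize(grad_fn, x, steps, rho=0.95, eps=1e-6):
    """Run `steps` Adadelta updates on a scalar parameter `x`."""
    avg_sq_grad = 0.0  # decaying average of squared gradients
    avg_sq_step = 0.0  # decaying average of squared updates
    for _ in range(steps):
        g = grad_fn(x)
        avg_sq_grad = rho * avg_sq_grad + (1 - rho) * g * g
        # The step size is the ratio of the two running RMS values.
        step = -math.sqrt(avg_sq_step + eps) / math.sqrt(avg_sq_grad + eps) * g
        avg_sq_step = rho * avg_sq_step + (1 - rho) * step * step
        x += step
    return x

x_start = 1.0
x_end = adadelta_minimize(lambda x: 2 * x, x_start, steps=100)  # gradient of x**2
print(x_start, x_end)  # x_end is closer to the minimum at 0 than x_start
```

Because both averages decay, old gradients are gradually forgotten, which is the "moving window" behaviour described above.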
# create a model
def create_model():
    model = Sequential()
    # First convolutional block: two 3x3 convolutions with 16 filters each
    model.add(Conv2D(filters=16, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)))
    model.add(BatchNormalization())
    model.add(Conv2D(filters=16, kernel_size=(3, 3), activation='relu'))
    model.add(BatchNormalization())
    model.add(MaxPool2D(strides=(2, 2)))
    model.add(Dropout(0.25))
    # Second convolutional block: two 3x3 convolutions with 32 filters each
    model.add(Conv2D(filters=32, kernel_size=(3, 3), activation='relu'))
    model.add(BatchNormalization())
    model.add(Conv2D(filters=32, kernel_size=(3, 3), activation='relu'))
    model.add(BatchNormalization())
    model.add(MaxPool2D(strides=(2, 2)))
    model.add(Dropout(0.25))
    # Classifier head: fully connected layers ending in a 10-way softmax
    model.add(Flatten())
    model.add(Dense(512, activation='relu'))
    model.add(Dropout(0.25))
    model.add(Dense(1024, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(10, activation='softmax'))
    # Compile the model
    model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.Adadelta(), metrics=['accuracy'])
    return model
model = create_model()
model.summary()
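The output shapes reported by the summary can be checked by hand. The sketch below traces the spatial size through the layers, assuming the Keras defaults of 'valid' padding for Conv2D and a 2x2 pool size for MaxPool2D. The helper names are hypothetical.

```python
def conv_out(size, kernel=3):
    """Output size of a stride-1 convolution without padding."""
    return size - kernel + 1

def pool_out(size, pool=2, stride=2):
    """Output size of a pooling layer without padding."""
    return (size - pool) // stride + 1

size = 28
size = conv_out(conv_out(size))  # 28 -> 26 -> 24 (first two Conv2D layers)
size = pool_out(size)            # 24 -> 12
size = conv_out(conv_out(size))  # 12 -> 10 -> 8 (last two Conv2D layers)
size = pool_out(size)            # 8 -> 4
flattened = size * size * 32     # 32 filters in the last Conv2D layer
print(size, flattened)  # 4 512
```

Under these assumptions the Flatten layer hands 512 values per image to the first Dense layer.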
Let's train the model for the given number of epochs.
results = model.fit(
    X_train, y_train,
    epochs=training_epochs,
    batch_size=128,
    validation_data=(X_test, y_test),
    verbose=2
)
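The model can generate output predictions for the input samples.
# predict_classes is not available in newer Keras releases; taking the argmax
# of the softmax output gives the predicted class index for each sample.
prediction_values = np.argmax(model.predict(X_test), axis=1)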
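Test accuracy, computed as the mean validation accuracy over all training epochs (X_test serves as the validation set):
# Note: newer Keras releases store this metric under "val_accuracy" instead of "val_acc".
print("Test-Accuracy:", "%.2f%%" % (np.mean(results.history["val_acc"]) * 100))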
Now we can check the loss and accuracy of our model on the training and test sets.
print("Evaluating on training set...")
(loss, accuracy) = model.evaluate(X_train,y_train)
print("loss={:.4f}, accuracy: {:.4f}%".format(loss,accuracy * 100))
print("Evaluating on testing set...")
(loss, accuracy) = model.evaluate(X_test, y_test)
print("loss={:.4f}, accuracy: {:.4f}%".format(loss,accuracy * 100))
# summarize history for accuracy
plt.subplot(211)
plt.plot(results.history['acc'])
plt.plot(results.history['val_acc'])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='lower right')
# summarize history for loss
plt.subplot(212)
plt.plot(results.history['loss'])
plt.plot(results.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper right')
plt.tight_layout()
max_loss = np.max(results.history['loss'])
min_loss = np.min(results.history['loss'])
print("Maximum Loss : {:.4f}".format(max_loss))
print("Minimum Loss : {:.4f}".format(min_loss))
print("Loss difference : {:.4f}".format((max_loss - min_loss)))
Y_true = np.argmax(y_test,axis = 1)
confusion_mtx = confusion_matrix(Y_true, prediction_values)
sns.heatmap(confusion_mtx, annot=True, fmt="d")
plt.ylabel('True')
plt.xlabel('Predicted')
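To show what the heatmap encodes, here is a minimal NumPy sketch that builds a confusion matrix by counting (true, predicted) pairs, with rows as true classes and columns as predicted classes. The helper `confusion_counts` and the toy labels are assumptions for illustration.

```python
import numpy as np

def confusion_counts(y_true, y_pred, num_classes):
    """Count how often each true class (row) is predicted as each class (column)."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    # np.add.at accumulates correctly even when the same index pair repeats.
    np.add.at(matrix, (y_true, y_pred), 1)
    return matrix

y_true = np.array([0, 1, 2, 2, 1, 0])
y_pred = np.array([0, 2, 2, 2, 1, 1])
matrix = confusion_counts(y_true, y_pred, num_classes=3)
print(matrix)
# [[1 1 0]
#  [0 1 1]
#  [0 0 2]]
print(matrix.trace() / matrix.sum())  # accuracy: correct predictions lie on the diagonal
```

Off-diagonal cells show which digits get mistaken for which, something a single accuracy number does not reveal.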
model_json = model.to_json()
with open("CNN_model_Keras_digits_recoginition.json", "w") as json_file:
    json_file.write(model_json)
# save weights to HDF5
model.save_weights("CNN_model_Keras_digits_recoginition.h5")