Solving MNIST with Convolutional Neural Networks


Now that we understand the basics of CNNs, let's pull those ideas together to create a CNN version of our MNIST solution.

We will flesh out the createConvModel function in start.js to create our CNN. That should be all that's needed; the rest of our demo application remains the same. First, make sure we create a convolutional model instead of our dense model by replacing:

MODEL = createDenseModel();

with

MODEL = createConvModel();

in our trainModel function.
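In isolation, the change looks like this. The snippet below is a minimal sketch with stub factories standing in for the real TensorFlow.js model builders; the actual trainModel in start.js of course contains the full training logic:

```javascript
// Hedged sketch of the one-line change inside trainModel.
// These stubs stand in for the real TensorFlow.js model factories in start.js.
function createDenseModel() {
  return { kind: "dense" }; // stub: the real function returns a tf.Sequential
}
function createConvModel() {
  return { kind: "conv" }; // stub: the real function returns a tf.Sequential
}

let MODEL = null;

function trainModel() {
  // Before: MODEL = createDenseModel();
  MODEL = createConvModel(); // after the change, we build the CNN instead
  return MODEL;
}

console.log(trainModel().kind); // "conv"
```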

We’ve covered the CNN concepts in the previous lecture, so we will now explain how to leverage those concepts in our demo application.

Next, inside our createConvModel function, paste this code:

const model = tf.sequential();
model.add(
    tf.layers.conv2d({ (1)
        inputShape: [28, 28, 1],
        kernelSize: 3,
        filters: 16,
        activation: "relu"
    })
);
model.add(
    tf.layers.maxPooling2d({ (2)
        poolSize: 2,
        strides: 2
    })
);
model.add(
    tf.layers.conv2d({ (3)
        kernelSize: 3,
        filters: 32,
        activation: "relu"
    })
);
model.add(
    tf.layers.maxPooling2d({ (4)
        poolSize: 2,
        strides: 2
    })
);

model.add(
    tf.layers.conv2d({ (5)
        kernelSize: 3,
        filters: 32,
        activation: "relu"
    })
);

model.add(
    tf.layers.flatten({}) (6)
);

model.add(
    tf.layers.dense({ (7)
        units: 64,
        activation: "relu"
    })
);

model.add(
    tf.layers.dense({ (8)
        units: 10,
        activation: "softmax"
    })
);

return model;
1 The first layer of the convolutional neural network plays a dual role: it is both the input layer of the network and the layer that performs the first convolution on the input. It receives the 28x28-pixel grayscale images. This layer uses 16 filters, each with a kernel size of 3 pixels, and a ReLU activation function.
2 We use our first maxPooling2d layer to downsample the data.
3 We now add another convolution layer, this time with 32 filters.
4 We again use max-pooling to downsample the data.
5 And we add another convolutional layer with another 32 filters.
6 Now we flatten everything; this turns the multi-dimensional activations into a single 1-D vector (3 x 3 x 32 = 288 values with this architecture).
7 We need to condense this information down to just 10 numbers, so we first create a densely connected layer that turns those flattened inputs into 64 outputs.
8 Our last layer is a dense layer with 10 output units, one for each output class (i.e. 0, 1, 2, 3, 4, 5, 6, 7, 8, 9). Here the classes represent numbers, but it’s the same idea if you had classes that represented other entities like dogs and cats (two output classes: 0, 1). We use the softmax function as the activation for the output layer as it creates a probability distribution over our 10 classes, so their output values sum to 1.
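To see why the flatten and dense layers have the sizes they do, we can trace the tensor shapes through the stack by hand. This is a plain-JavaScript sketch (no TensorFlow.js required), assuming the tf.js defaults of "valid" padding and stride 1 for the conv2d layers:

```javascript
// Trace the spatial size of a 28x28 input through the conv/pool stack.
// conv2d with "valid" padding and stride 1 shrinks each side by (kernelSize - 1);
// maxPooling2d with poolSize 2 and strides 2 halves each side (rounding down).
const conv = (size, kernelSize) => size - kernelSize + 1;
const pool = (size) => Math.floor(size / 2);

let size = 28;          // input: 28x28x1
size = conv(size, 3);   // conv2d, 16 filters -> 26x26x16
size = pool(size);      // maxPooling2d      -> 13x13x16
size = conv(size, 3);   // conv2d, 32 filters -> 11x11x32
size = pool(size);      // maxPooling2d      -> 5x5x32
size = conv(size, 3);   // conv2d, 32 filters -> 3x3x32

const flattened = size * size * 32; // flatten -> 1-D vector
console.log(flattened); // 288 values feed the 64-unit dense layer
```

So the flatten layer emits 288 values, and the dense layer that follows holds 288 x 64 weights plus biases, which is where most of the "thousands of weights" in the tail of the network live.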

Running the application again should give us a slightly more accurate model than our dense version.
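Point 8 above noted that softmax turns the final layer's raw scores into a probability distribution. A minimal plain-JavaScript sketch of the function (independent of TensorFlow.js) makes that property concrete:

```javascript
// Softmax: exponentiate each score, then normalize so the outputs sum to 1.
// Subtracting the max score first is a standard trick for numerical stability.
function softmax(scores) {
  const max = Math.max(...scores);
  const exps = scores.map((s) => Math.exp(s - max));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / total);
}

// Ten made-up raw scores, one per digit class 0-9.
const scores = [1.2, 0.3, 4.5, 0.1, 0.0, 2.2, 0.5, 0.9, 1.1, 0.4];
const probs = softmax(scores);

const sum = probs.reduce((a, b) => a + b, 0);
console.log(sum.toFixed(2));                    // "1.00"
console.log(probs.indexOf(Math.max(...probs))); // 2 -> the predicted digit
```

The predicted class is simply the index of the largest probability, which is what the demo application reads off when classifying a drawn digit.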
