Lab 4: MNIST Digit Classification
🔢 Teach a computer to read handwriting using professional AI tools.
Libraries: TensorFlow, Keras • Estimated Time: 3 hours
Part 1: From "Scratch" to Pro Tools
In the last lab, you built a neural network from scratch using NumPy. You manually coded forward
propagation, backward propagation, and the weight updates. It was a fantastic way to learn what's
happening under the hood!
But in the real world, data scientists don't do that every time. They use powerful libraries
that automate the process. Today, we'll use TensorFlow and Keras, the
most popular tools for building neural networks.
Our New Toolkit:
- TensorFlow: A powerful library from Google for high-performance numerical
computation. Think of it as the engine of our race car.
- Keras: A user-friendly API that runs on top of TensorFlow. It provides simple,
building-block-like commands to create neural networks. Think of it as the easy-to-use steering
wheel and dashboard for our race car. You tell Keras what you want, and it handles all the
complex TensorFlow engine work for you.
Our mission is to solve the "Hello, World!" of computer vision: classifying the
MNIST dataset of handwritten digits.
Part 2: Getting the Data
Keras makes it incredibly easy to access famous datasets like MNIST.
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
Just like that, we have our data! It's already split into two parts:
- Training set (`x_train`, `y_train`): The data our model will learn from (60,000
images).
- Testing set (`x_test`, `y_test`): The data we will use to evaluate our model's
performance on images it has never seen before (10,000 images).
💡 Your Turn
Let's inspect our data. Use the `.shape` attribute to see the dimensions of the data. In a new Colab
cell, type and run the following:
print(f"x_train shape:
{x_train.shape}")
print(f"y_train shape:
{y_train.shape}")
print(f"x_test shape: {x_test.shape}")
print(f"y_test shape: {y_test.shape}")
What does `(60000, 28, 28)` mean? It means we have 60,000 images, and each image is a
28x28 grid of pixels.
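Before moving on, one more quick sanity check on the raw data is a good habit. A minimal sketch; the values in the comments follow from how Keras stores MNIST:
print(x_train.dtype)                 # uint8: each pixel is an 8-bit integer
print(x_train.min(), x_train.max()) # 0 and 255, the full pixel range
print(y_train[:10])                  # the first ten labels, each a digit 0-9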
Part 3: Preparing the Evidence
Before we feed our data to the network, we need to prepare it. This is one of the most important steps in
any machine learning project.
3.1 Visualizing the Data
Let's look at one of the images to see what we're working with.
plt.imshow(x_train[0], cmap='gray')
plt.show()
print(f"The label for the first image is:
{y_train[0]}")
The label for the first image is: 5
3.2 Normalizing the Pixel Values
The pixel values in our images range from 0 (black) to 255 (white). Neural networks work best when input
values are small, typically between 0 and 1. So, we'll normalize the data by dividing every pixel value
by 255.
x_train = x_train / 255.0
x_test = x_test / 255.0
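One small detail: the raw pixels are 8-bit integers (`uint8`), and dividing by `255.0` silently converts them to 64-bit floats. A common variant you'll see in other tutorials casts to `float32` explicitly, which halves memory use. If you want to try it, run this instead of the cell above, not after it:
# Cast to float32 first, then scale into the 0-1 range
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0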
💡 Your Turn
After running the code above, print a pixel from the middle of the first image to confirm it's been normalized. Type `print(x_train[0][10][10])`. You should see a number between 0 and 1, not a large number like 192.
3.3 Flattening the Images
Our network will use fully connected ("Dense") layers that expect a 1D vector of numbers, not a 2D grid. We need to "flatten" each 28x28 image into a single array of 784 values.
x_train = x_train.reshape(-1, 28*28)
x_test = x_test.reshape(-1, 28*28)
print(f"New x_train shape: {x_train.shape}")
New x_train shape: (60000, 784)
Part 4: Assembling the Brain
Now for the fun part! With Keras, building a neural network is like stacking LEGO blocks. We'll use a
`Sequential` model, which means a simple, layer-by-layer stack.
model = keras.Sequential([
    keras.Input(shape=(784,)),
    keras.layers.Dense(units=128, activation='relu'),
    keras.layers.Dense(units=10, activation='softmax')
])
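Side note: Keras could have done the flattening from Part 3 for us. A `Flatten` layer at the front of the model reshapes each raw 28x28 image into 784 values on the fly. A minimal sketch of that equivalent design, named `model_alt` here so it doesn't replace the model above:
model_alt = keras.Sequential([
    keras.Input(shape=(28, 28)),   # accepts the raw, un-flattened images
    keras.layers.Flatten(),        # flattens each image to 784 values
    keras.layers.Dense(units=128, activation='relu'),
    keras.layers.Dense(units=10, activation='softmax')
])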
Model Summary
Let's print a summary of our architecture.
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 128) 100480
dense_1 (Dense) (None, 10) 1290
=================================================================
Total params: 101,770
Trainable params: 101,770
Non-trainable params: 0
_________________________________________________________________
Detective's Note: Look at "Param #". The first layer has 100,480 parameters! That's `784
inputs * 128 neurons + 128 biases`. Imagine calculating the gradient for all of those by hand! This is
why we use Keras.
💡 Your Turn
What would the `Param #` be for the first dense layer if you changed it from `128` neurons to `64`
neurons? Calculate it first (`784 * 64 + 64`), then change the code and run `model.summary()` to
check your answer.
Part 5: Teaching the Brain
5.1 Compiling the Model
Before we can train, we need to "compile" the model. This is where we define the learning process.
model.compile(
optimizer='adam',
loss='sparse_categorical_crossentropy',
metrics=['accuracy']
)
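Why `sparse_categorical_crossentropy`? Because our labels are plain integers (0 through 9), and the "sparse" version of the loss accepts them directly. If you one-hot encoded the labels instead, you would use `categorical_crossentropy`. A quick sketch of that equivalent alternative, purely for reference; this lab sticks with the sparse version:
# One-hot encode: label 5 becomes [0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
y_train_onehot = keras.utils.to_categorical(y_train, num_classes=10)
y_test_onehot = keras.utils.to_categorical(y_test, num_classes=10)
# With one-hot labels you would compile with loss='categorical_crossentropy'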
5.2 Training the Model
Now we're ready to train! The `.fit()` method is where the magic happens. It will show the data to the
network, calculate the loss, and update the weights over and over.
history = model.fit(
x_train, y_train,
epochs=5,
batch_size=32,
validation_split=0.2
)
Epoch 1/5
1500/1500 [==============================] - 5s 3ms/step - loss: 0.2871 - accuracy: 0.9184 - val_loss: 0.1554 - val_accuracy: 0.9557
Epoch 2/5
1500/1500 [==============================] - 4s 3ms/step - loss: 0.1268 - accuracy: 0.9631 - val_loss: 0.1119 - val_accuracy: 0.9673
...
Epoch 5/5
1500/1500 [==============================] - 4s 3ms/step - loss: 0.0519 - accuracy: 0.9840 - val_loss: 0.0818 - val_accuracy: 0.9753
Wow! After just 5 passes through the data, our model is achieving over 97% accuracy on
the validation set! This is the power of TensorFlow and Keras.
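By the way, the `history` object returned by `.fit()` records the loss and accuracy after every epoch. A quick sketch for plotting the learning curves; the exact numbers you see will vary slightly from run to run:
plt.plot(history.history['accuracy'], label='training accuracy')
plt.plot(history.history['val_accuracy'], label='validation accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()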
Part 6: Grading the Test
Our model did well on the validation data, but the real test is how it performs on the `x_test` set, which the model has never seen at any point during training.
test_loss, test_accuracy = model.evaluate(x_test, y_test)
print(f"\nTest Accuracy:
{test_accuracy*100:.2f}%")
313/313 [==============================] - 1s 2ms/step - loss: 0.0768 - accuracy:
0.9759
Test Accuracy: 97.59%
Amazing! Now let's make a prediction on a single image and see the result.
test_image = x_test[0]
test_image_batch = np.expand_dims(test_image, axis=0)
prediction = model.predict(test_image_batch)
predicted_label = np.argmax(prediction)
plt.imshow(test_image.reshape(28,28), cmap='gray')
plt.show()
print(f"Model prediction:
{predicted_label}")
print(f"Actual label: {y_test[0]}")
Model prediction: 7
Actual label: 7
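Note that the model doesn't output a single digit directly: `prediction` is a row of 10 softmax probabilities, one per class, and `np.argmax` simply picks the largest. You can inspect the whole distribution; the exact values will differ from run to run:
for digit, probability in enumerate(prediction[0]):
    print(f"Digit {digit}: {probability:.4f}")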
💡 Your Turn
Copy the prediction code block above. Change the index from `[0]` to another number (e.g., `[25]`) to
test a different image. Does the model get it right?
Part 7: Your Mission - Improve the Model
Assignment: Become an AI Architect
Your goal is to improve the test accuracy of our model. Can you get it above 98%?
Experiment with the following ideas in your Colab notebook. Remember to rebuild and re-train the
model after each change.
Ideas to Try:
- More Neurons: Change the number of units in the first Dense layer from `128` to
`256`. Does a bigger layer help?
- Deeper Network: Add a second hidden Dense layer. After the first `Dense(128,
...)` layer, add another one, e.g., `keras.layers.Dense(units=64, activation='relu')`. Does
making the network deeper improve performance?
- More Training: Increase the number of `epochs` from `5` to `10`. Does giving
the model more time to learn help?
- The Dropout Technique: Overfitting is when a model learns the training data too well but fails on new data. A "Dropout" layer helps prevent this by randomly "turning off" a fraction of a layer's outputs during each training step. Try adding `keras.layers.Dropout(0.2)` after your Dense layer(s) to randomly drop 20% of that layer's activations (see the sketch after this list).
For each experiment, record the final test accuracy in a text cell. Which combination of
changes gave you the best result?
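To get you started, here is a minimal sketch combining several of the ideas above (a wider first layer, a second hidden layer, and Dropout). Treat it as a starting point with guessed hyperparameters, not the answer; finding a better combination is the assignment:
model = keras.Sequential([
    keras.Input(shape=(784,)),
    keras.layers.Dense(units=256, activation='relu'),  # more neurons
    keras.layers.Dropout(0.2),                         # drop 20% of activations while training
    keras.layers.Dense(units=64, activation='relu'),   # a second hidden layer
    keras.layers.Dense(units=10, activation='softmax')
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
history = model.fit(x_train, y_train, epochs=10, batch_size=32, validation_split=0.2)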
Part 8: Bonus - Fashion Police
Now that you can classify digits, let's try something a bit harder: classifying images of clothing! The Fashion-MNIST dataset has the exact same format as MNIST (28x28 grayscale images, 10 classes), but it's a more challenging problem.
Kaggle & The Fashion-MNIST Dataset
This dataset is also built into Keras, making it easy to start. The labels are numbers from 0 to 9,
corresponding to different clothing items like 'T-shirt/top', 'Trouser', 'Pullover', etc.
Your Challenge:
- Load the Data: In a new notebook, use `(x_train, y_train), (x_test, y_test) = keras.datasets.fashion_mnist.load_data()` to get the data (the starter sketch after this list shows this step).
- Build and Train: Copy your best model architecture from the MNIST assignment.
Preprocess the fashion data in the same way (normalize, flatten) and train your network on it.
- Evaluate: What test accuracy can you achieve on this harder dataset? It will likely
be lower than what you got on MNIST. Can you tweak your model to get the best possible accuracy?
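A minimal starter sketch for the bonus. The ten class names below are Fashion-MNIST's standard label order:
(x_train, y_train), (x_test, y_test) = keras.datasets.fashion_mnist.load_data()

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

# Same preprocessing as for MNIST: normalize, then flatten
x_train = x_train / 255.0
x_test = x_test / 255.0
x_train = x_train.reshape(-1, 28*28)
x_test = x_test.reshape(-1, 28*28)

# Peek at the first item and its human-readable label
plt.imshow(x_train[0].reshape(28, 28), cmap='gray')
plt.show()
print(f"This is a: {class_names[y_train[0]]}")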
Part 9: Submission Guidelines
To complete this lab, please follow these instructions carefully.
- Complete all "Your Turn" tasks and the main "Lab Assignment" in a single Google Colab notebook. The
Kaggle project is a bonus.
- In the assignment section, use Text Cells to clearly label each experiment you run
and to report the final test accuracy for each one. Conclude with which model performed the best.
- Ensure all your code cells have been run so that their outputs and plots are visible.
- When you are finished, generate a shareable link. In Colab, click the "Share"
button in the top right.
- In the popup, under "General access", change "Restricted" to "Anyone with the link"
and ensure the role is set to "Viewer".
- Click "Copy link" and submit this link as your assignment.