Lab 3: Neural Network from Scratch

Part 1: The Goal - Teaching a Robot to Think

Imagine you have a tiny robot. You want to teach it a simple game. The game has four possible scenarios, based on two switches, and you want the robot to output the correct answer (0 or 1).

How can a robot "learn"? We can't just write a bunch of `if/else` rules, because that's not learning, that's just following instructions. We want the robot to figure out the rules on its own, just by looking at examples.

This is the core idea of a Neural Network. It's a computer program, inspired by the human brain, that can learn from data.

Our Plan:

The Brain Cell: We'll start by building a single artificial neuron, called a Perceptron.
The Learning Rule: We'll give it a way to measure its mistakes and correct them, a process called Gradient Descent.
Connecting The Pieces: We'll put it all together to create a simple, working Neural Network that learns to solve the puzzle.

Part 2: The Simplest Brain Cell (The Perceptron)

A real brain neuron gets signals from other neurons and decides whether to fire its own signal. An artificial neuron, or Perceptron, does the same thing with numbers.

How a Perceptron Works:

It takes inputs: These are the pieces of information for our puzzle.
It has weights: Each input is given an "importance" score, called a weight. A higher weight means that input is more important for the final decision.
It calculates a weighted sum: It multiplies each input by its weight and adds them all up.
It uses an activation function: It passes the sum through a final function to make a decision. The simplest one is a "Step Function": if the sum is greater than a threshold, output 1 (fire!), otherwise output 0 (don't fire!).

Let's code a simple perceptron in pure Python. Imagine we're teaching it to decide if you should go for a walk. The inputs are `is_sunny` (1 for yes, 0 for no) and `is_warm` (1 for yes, 0 for no).

                def perceptron(inputs, weights, threshold):
                    # Calculate the weighted sum
                    weighted_sum = 0
                    for i in range(len(inputs)):
                        weighted_sum += inputs[i] * weights[i]

                    # Apply the step function
                    if weighted_sum > threshold:
                        return 1  # Fire! (Go for a walk)
                    else:
                        return 0  # Don't fire. (Stay inside)

                # Inputs: [is_sunny, is_warm]
                inputs = [1, 1]  # It's sunny and warm

                # Weights: Let's say warmth is more important than sunniness
                weights = [0.5, 0.8]

                # Threshold: Our decision boundary
                threshold = 1.0

                output = perceptron(inputs, weights, threshold)
                print(f"Decision: {output}")

Decision: 1

💡 Your Turn

Copy the code above into a Colab cell.

What happens if you change the `inputs` to `[0, 1]` (not sunny, but warm)? Does the robot still decide to go for a walk?
What happens if you change the `weights` to `[0.8, 0.1]` making sunniness more important?
What happens if you increase the `threshold` to `1.5`?

This is cool, but how do we find the right `weights` and `threshold`? We could guess forever, or we could teach the robot to find them itself. This is where learning comes in.

Part 3: How a Neural Network Learns

Our network learns by doing three things over and over:

Make a guess (Forward Propagation): It takes the inputs and passes them through the network to get an output.
Measure the mistake (Calculate Loss): It compares its guess to the correct answer. The difference is the "error" or "loss." A big error means a bad guess.
Correct the weights (Backward Propagation & Gradient Descent): This is the magic step. It works backward from the error and figures out how much to "blame" each weight. It then nudges each weight in the right direction to make the error smaller next time.

Imagine you are blindfolded in a hilly field and want to find the lowest point. This process is like that:

The spot where you are standing is the current `weight`.
The height of the ground under your feet is the `loss`, and the slope you feel is the `gradient`.
If the ground rises in the direction you just stepped (the error increased), you know to take your next step in the opposite direction.
The size of your step is the `learning_rate`. A tiny step means slow progress; a giant step might overshoot the lowest point entirely.

This process of following the slope downhill is called Gradient Descent.
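To make the hill analogy concrete, here is a minimal sketch of gradient descent on a single weight. The bowl-shaped `loss` function, its lowest point at `3`, and the helper name `gradient_descent` are all made up for illustration; they are not part of the network we build later.

```python
def loss(w):
    # A made-up bowl-shaped "field": its lowest point is at w = 3
    return (w - 3) ** 2

def gradient(w):
    # Slope of the loss at w (the derivative of (w - 3)^2)
    return 2 * (w - 3)

def gradient_descent(start_weight, learning_rate, steps):
    weight = start_weight
    for _ in range(steps):
        # Step in the direction opposite to the slope (downhill)
        weight -= learning_rate * gradient(weight)
    return weight

final_weight = gradient_descent(start_weight=0.0, learning_rate=0.1, steps=50)
print(f"Final weight: {final_weight:.4f}, loss: {loss(final_weight):.6f}")
```

With `learning_rate=0.1` the weight settles very close to 3. Try `learning_rate=1.1` instead: each step overshoots the lowest point by more than the last, and the weight drifts further away rather than closer.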

Part 4: Building Our First Real Network with NumPy

Doing math with Python lists is slow. We'll now use NumPy, a library that is super-fast at handling arrays of numbers, which is exactly what our inputs and weights are. We will also switch to a smoother activation function called the Sigmoid function. Unlike the harsh 0-or-1 step function, Sigmoid squishes any number into a smooth curve between 0 and 1. This smoothness matters because gradient descent needs a slope to follow: the step function is flat everywhere except at its jump, so it gives no hint about which way to nudge the weights.
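As a rough illustration of the difference between the two activation functions, the sketch below evaluates both on a few sample numbers. The helper name `step` and the sample values in `xs` are assumptions chosen for this example. It also checks that the slope of Sigmoid can be written as `s * (1 - s)`, where `s` is the Sigmoid output, by comparing it with a numerical estimate.

```python
import numpy as np

def step(x, threshold=0.0):
    # Harsh decision: 1 if above the threshold, otherwise 0
    return np.where(x > threshold, 1, 0)

def sigmoid(x):
    # Smooth curve between 0 and 1
    return 1 / (1 + np.exp(-x))

xs = np.array([-4.0, -1.0, 0.0, 1.0, 4.0])
print("step:   ", step(xs))
print("sigmoid:", np.round(sigmoid(xs), 3))

# Slope of sigmoid, written in terms of its own output s
s = sigmoid(xs)
analytic_slope = s * (1 - s)

# Numerical estimate of the same slope, for comparison
h = 1e-5
numeric_slope = (sigmoid(xs + h) - sigmoid(xs - h)) / (2 * h)
print("slope:  ", np.round(analytic_slope, 3))
print("matches numerical estimate:", np.allclose(analytic_slope, numeric_slope))
```

The step function only ever reports 0 or 1, while Sigmoid reports how close the input is to each side, and its slope is never exactly zero.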

4.1 The Setup

We'll tackle a simple problem. Given an input `[0, 0, 1]`, the correct output is `0`. Given `[1, 1, 1]`, the output is `1`. Can our network learn this pattern?

                import numpy as np

                # The Sigmoid function and its derivative
                def sigmoid(x):
                    return 1 / (1 + np.exp(-x))

                def sigmoid_derivative(x):
                    # Note: x is expected to already be a sigmoid output
                    return x * (1 - x)

                # Our training data
                training_inputs = np.array([[0, 0, 1], [1, 1, 1], [1, 0, 1], [0, 1, 1]])
                training_outputs = np.array([[0, 1, 1, 0]]).T  # .T transposes it to a column

                # Seed the random numbers to make calculations deterministic (good for debugging)
                np.random.seed(1)

                # Initialize weights randomly between -1 and 1 (mean 0)
                synaptic_weights = 2 * np.random.random((3, 1)) - 1

                print('Random starting synaptic weights:')
                print(synaptic_weights)

Random starting synaptic weights:
[[-0.16595599]
 [ 0.44064899]
 [-0.99977125]]

4.2 The Training Loop

This is where the learning happens! We'll show the network the data 10,000 times. Each full pass over the data is called an "epoch". In each epoch, the network will guess, check its error, and adjust its weights.

                for iteration in range(10000):
                    # Step 1: Forward Propagation
                    input_layer = training_inputs
                    outputs = sigmoid(np.dot(input_layer, synaptic_weights))  # Make a guess

                    # Step 2: Calculate Loss (the error)
                    error = training_outputs - outputs

                    # Step 3: Backward Propagation & Weight Update
                    adjustments = error * sigmoid_derivative(outputs)  # Find the 'blame'
                    synaptic_weights += np.dot(input_layer.T, adjustments)  # Nudge the weights

                print('Synaptic weights after training:')
                print(synaptic_weights)

                print('\nOutputs after training:')
                print(outputs)

Synaptic weights after training:
[[ 9.67299303]
 [-0.2078435 ]
 [-4.62963669]]

Outputs after training:
[[0.00966449]
 [0.99211957]
 [0.99358898]
 [0.00786506]]

Look at that! The final outputs are very close to the correct answers `[0, 1, 1, 0]`. Our network learned the pattern! Notice how the first weight is a large positive number. The network learned that the first input is the key to solving this puzzle.
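The loop in 4.2 applies the full adjustment on every pass, which amounts to a learning rate of 1. The sketch below wraps the same three steps in a hypothetical `train` function with an explicit `learning_rate` parameter. Neither the function nor the parameter appears in the lab code; they are assumptions added here to connect the loop back to the step-size idea from Part 3.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def train(inputs, targets, learning_rate=1.0, iterations=10000, seed=1):
    # Same starting weights as seeding NumPy with 1 and drawing a (3, 1) array
    random_state = np.random.RandomState(seed)
    weights = 2 * random_state.random_sample((inputs.shape[1], 1)) - 1

    for _ in range(iterations):
        outputs = sigmoid(np.dot(inputs, weights))       # Make a guess
        error = targets - outputs                        # Measure the mistake
        adjustments = error * outputs * (1 - outputs)    # Find the 'blame'
        # Scale the nudge by the learning rate before applying it
        weights += learning_rate * np.dot(inputs.T, adjustments)
    return weights

inputs = np.array([[0, 0, 1], [1, 1, 1], [1, 0, 1], [0, 1, 1]])
targets = np.array([[0, 1, 1, 0]]).T

weights = train(inputs, targets)
predictions = sigmoid(np.dot(inputs, weights))
print(np.round(predictions, 3))
```

With the default `learning_rate=1.0` this should behave like the loop in 4.2. Passing a smaller value such as `learning_rate=0.1` makes each nudge smaller, so the network needs more iterations to get equally close to the correct answers.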

💡 Your Turn

Combine the code from 4.1 and 4.2 into one Colab cell.

Change the number of iterations in the `range()` from `10000` to `100`. Run it. Are the final outputs as good?
Change it to `100000`. Do the outputs get even closer to 0 and 1?
Inside the loop, right after the weights are updated, add `if iteration % 1000 == 0: print(np.mean(np.abs(error)))`. This will print the average error every 1000 steps. You should see the error getting smaller and smaller!

Part 5: Your First Neural Network Mission

Assignment: The Electronics Shop

You are building a quality control system. You have data from 4 electronic components. Each component has 3 tests performed on it (pass=1, fail=0). Your goal is to train a neural network to predict if the component is faulty (output=1) or acceptable (output=0).

The Data:

A component is considered faulty if its first test result is a `1`.

                    # Inputs: [Test 1, Test 2, Test 3]
                    training_inputs = np.array([[0, 1, 1], [1, 0, 0], [0, 1, 0], [1, 1, 0]])

                    # Outputs: [Is Faulty?]
                    training_outputs = np.array([[0, 1, 0, 1]]).T

Your Tasks:

Set up the Network: In a new Colab notebook, copy the setup code from section 4.1, but replace the `training_inputs` and `training_outputs` with the new data from the electronics shop.
Train the Network: Copy the training loop from section 4.2 and train your network on the new data for at least 20,000 iterations.
Analyze the Results: Print the final weights after training. Which weight is the largest? What does this tell you about which test is the most important for predicting a fault? (Write your answer in a text cell).
Make a New Prediction: A new component is tested with results `[1, 1, 1]`. Should it be marked as faulty? Write the code to pass this new input through your *trained* network and print the prediction.

Part 6: Bonus - The Digit Recognizer Challenge

The network you built has one neuron. Real neural networks have many neurons arranged in layers. Let's see how the concepts you learned apply to a real-world problem: recognizing handwritten digits.

Kaggle & The Digit Recognizer Dataset

The "Digit Recognizer" competition is a classic. You are given thousands of images of handwritten digits (0-9) and your goal is to correctly identify them.

Task 1: Get and See the Data

Go to the Digit Recognizer data page. Download `train.csv`.
In Colab, upload `train.csv`, run `import pandas as pd`, and load the file with `digit_df = pd.read_csv('train.csv')`.
The first column, `label`, is the correct digit. The other 784 columns (`pixel0` to `pixel783`) are the pixel values of a 28x28 image.
Use this code to see the first digit in the dataset:

                import matplotlib.pyplot as plt

                first_digit_pixels = digit_df.iloc[0, 1:].values  # Get all pixel columns for the first row
                first_digit_image = first_digit_pixels.reshape(28, 28)  # Reshape from 784 numbers to a 28x28 grid

                plt.imshow(first_digit_image, cmap='gray')
                plt.show()
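The same reshape idea extends to many rows at once. Here is a small sketch that uses a made-up two-row DataFrame as a stand-in for `train.csv`, so it runs without the download. The random pixel values, their 0 to 255 range, and the names `fake_df`, `labels`, `pixels`, and `images` are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Stand-in for train.csv: a `label` column plus pixel0 ... pixel783
rng = np.random.default_rng(0)
pixel_columns = [f"pixel{i}" for i in range(784)]
fake_pixels = rng.integers(0, 256, size=(2, 784))  # made-up pixel values
fake_df = pd.DataFrame(fake_pixels, columns=pixel_columns)
fake_df.insert(0, "label", [5, 0])  # made-up labels

# Separate the correct answers from the pixel data
labels = fake_df["label"].to_numpy()
pixels = fake_df.drop(columns="label").to_numpy()

# Turn every row of 784 numbers into a 28x28 grid
images = pixels.reshape(-1, 28, 28)

print(labels.shape, pixels.shape, images.shape)
```

Each row of `pixels` is what a network would receive as input, while `images` is only needed for plotting.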

Task 2: Your Challenge - Connect the Concepts

You don't need to build a full network for this. Instead, answer these questions in a text cell in your notebook to connect what you've learned.

Inputs: In our simple network, we had 3 inputs. For the digit recognizer, how many inputs would a neural network need?
Weights: Our network had one set of 3 weights. If a digit-recognizer network had just one neuron, how many weights would it have?
Outputs: Our network had one output (0 or 1). For this problem, we need to identify 10 different digits (0 through 9). How many output neurons do you think we would need?
Thinking Bigger: Why do you think a single neuron, even with 784 inputs, would not be enough to solve this problem accurately? What's the benefit of having multiple layers of neurons?

Part 7: Submission Guidelines

To complete this lab, please follow these instructions carefully.

Complete all "Your Turn" tasks and the main assignment from Part 5 ("The Electronics Shop") in a single Google Colab notebook. The Kaggle Digit Recognizer challenge in Part 6 is a bonus.
Use Text Cells to label each section and answer any written questions.
Ensure all your code cells have been run so that their outputs are visible.
When you are finished, generate a shareable link. In Colab, click the "Share" button in the top right.
In the popup, under "General access", change "Restricted" to "Anyone with the link" and ensure the role is set to "Viewer".
Click "Copy link" and submit this link as your assignment.