(Neural Networks From Scratch · Article 2)
Why we need loss functions
In Article 1, we built something important:
- a neural network structure
- layers connected together
- a working forward pass
But there is a big unanswered question:
How do we know if the network is doing well or badly?
Right now, the network produces a number — but it has no idea whether that number is correct.
This is where loss functions come in.
The core idea (no math yet)
A loss function answers one simple question:
“How wrong is the network’s output?”
- Small loss → good prediction
- Large loss → bad prediction
A neural network cannot learn without a loss function because:
- there is no feedback
- no direction
- no notion of improvement
Loss is the bridge between prediction and learning.
Predictions vs. targets
Every learning problem has two things:
- Prediction (what the network outputs)
- Target (the correct answer)
Example:
Prediction: 2.7
Target: 3.0
They are close — but not equal.
The loss function turns this difference into a single number.
Our first loss function: Mean Squared Error (MSE)
For beginners, Mean Squared Error (MSE) is perfect because:
- it is intuitive
- it works for regression
- it is easy to implement
Intuition first
- Compute the difference
- Square it (to avoid negatives)
- Average over all samples
That’s it.
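Here is what those three steps look like on a few made-up numbers (just an illustration, not part of our project code):

```python
# Made-up predictions and targets, just to walk through the three steps
y_pred = [2.7, 4.1, 5.8]
y_true = [3.0, 4.0, 6.0]

# 1. Compute the difference for each sample
diffs = [yp - yt for yp, yt in zip(y_pred, y_true)]  # ≈ [-0.3, 0.1, -0.2]

# 2. Square it (no more negatives)
squares = [d ** 2 for d in diffs]                    # ≈ [0.09, 0.01, 0.04]

# 3. Average over all samples
print(sum(squares) / len(squares))                   # ≈ 0.047
```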
MSE in plain English
If the network predicts:
- exactly right → loss = 0
- a bit wrong → small loss
- very wrong → large loss
Loss is always non-negative.
MSE formula (light touch)
We keep this minimal:
loss = (y − ŷ)², averaged over all samples

y = target, ŷ = prediction
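Plugging in the earlier example (prediction 2.7, target 3.0):

loss = (3.0 − 2.7)² = 0.3² = 0.09

A small error gives a small loss.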
No calculus yet.
No gradients yet.
Implementing MSE in Python
Create nn/losses.py:
```python
class MSE:
    def forward(self, y_pred, y_true):
        # Squared error for each sample, then the average
        losses = []
        for yp, yt in zip(y_pred, y_true):
            losses.append((yp - yt) ** 2)
        return sum(losses) / len(losses)
```
That is a complete loss function.
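Before wiring it into the network, you can sanity-check it on its own. The numbers below are arbitrary; they are only here to confirm the averaging behaves as expected:

```python
from nn.losses import MSE

loss_fn = MSE()

# Two samples: predictions vs. correct answers (arbitrary numbers)
y_pred = [2.7, 4.1]
y_true = [3.0, 4.0]

print(loss_fn.forward(y_pred, y_true))  # (0.09 + 0.01) / 2 ≈ 0.05
```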
Connecting loss to the network
For now, we will not integrate the loss into a training loop.
Instead, we use it manually to understand its behavior.
Update examples/02_loss_demo.py:
```python
import random

from nn.layers import Dense
from nn.core import NeuralNet
from nn.losses import MSE

random.seed(42)

net = NeuralNet([
    Dense(1, 1)
])
loss_fn = MSE()

X = [[1.0], [2.0], [3.0]]
y_true = [2.0, 4.0, 6.0]  # perfect linear relationship

y_pred = []
for x in X:
    y_pred.append(net.forward(x)[0])

loss = loss_fn.forward(y_pred, y_true)

print("Predictions:", y_pred)
print("Loss:", loss)
```
Run it:
python -m examples.02_loss_demo
You will see:
- random predictions
- a large loss value
This is expected.
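If you want to convince yourself that a perfect prediction really gives a loss of 0, a quick check (not part of the demo file, just something to try) is to feed the targets in as if they were predictions:

```python
from nn.losses import MSE

y_true = [2.0, 4.0, 6.0]

# A "perfect" network would output the targets exactly
print(MSE().forward(y_true, y_true))  # 0.0
```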
Important insight: loss is a signal, not a fix
Right now:
- the network knows it is wrong
- but it does not know how to improve
Loss tells us:
“You are off by this much.”
But it does not say:
“Change this weight.”
That comes next.
Why we square the error
Two reasons:
- No negative errors
- Large mistakes hurt more
Example:
| Error | Squared |
|---|---|
| 1 | 1 |
| 2 | 4 |
| 3 | 9 |
This encourages the network to fix big mistakes first.
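If you want to see this numerically, here is a tiny sketch. The `mse` helper below is just for this illustration, not part of our `nn` package:

```python
def mse(errors):
    # Average of the squared errors, same idea as our MSE class
    return sum(e ** 2 for e in errors) / len(errors)

# Same total error (3), distributed differently
print(mse([3, 0, 0]))  # one big mistake   -> 9 / 3 = 3.0
print(mse([1, 1, 1]))  # three small ones  -> 3 / 3 = 1.0
```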
Common beginner confusion
“Why is my loss so large?”
Because weights are random and untrained.
“Why doesn’t loss go down?”
Because we haven’t implemented learning yet.
“Is MSE always the best?”
No. But it is the best starting point.
What we achieved in Article 2
You now understand:
- what a loss function is
- why neural networks need loss
- how MSE works
- how to compute error in Python
You have feedback, but no learning yet.
That is exactly where we want to be.
What comes next (Article 3)
The next logical question is:
“How does the network know which weights caused the error?”
That leads to:
- gradients
- backward pass
- the foundation of backpropagation
No magic.
No heavy math.
Step by step.
Series progress
- Article 1: Project setup & core abstractions ✅
- Article 2: Loss functions & error intuition ✅
- Article 3: Gradients and the backward pass ⏭️
The Python source code is available on GitHub: https://github.com/Benard-Kemp/Building-a-Neural-Network-Template-in-Python