Up to now, we have focused on how to train a neural network:
- forward propagation
- backpropagation
- vectorization
- batch and mini-batch training
At this point, your network runs fast and learns.
But there is a new problem you must learn to recognize:
Sometimes a model keeps improving on its training data while getting worse on data it has never seen.
This article teaches you how to see that happening.
Why Training Loss Alone Is Not Enough
Most beginner tutorials celebrate this moment:
```
Epoch 0   → Loss = 4.82
Epoch 200 → Loss = 0.12
Epoch 500 → Loss = 0.01
```
Lower loss looks good.
But here is the catch:
A model can achieve very low training loss and still perform terribly on new data.
This is called overfitting.
The Core Idea: Train vs Validation
To detect overfitting, we must split our data.
Training Set
- Used to update weights
- The model learns from this data
Validation Set
- Never used for updates
- Only used to evaluate performance
If these two behave differently, something is wrong.
Step 1: Creating a Train / Validation Split
```python
X_train = X[:80]
y_train = y[:80]
X_val = X[80:]
y_val = y[80:]
```
The exact ratio doesn’t matter at first.
What matters is separation.
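One caveat: if the rows of X are ordered (for example, by class), a straight slice can leave the training and validation sets with different distributions. A minimal sketch of a shuffled split, assuming X and y are NumPy arrays:

```python
import numpy as np

# Shuffle indices first so the split is not biased by row order
rng = np.random.default_rng(seed=0)
idx = rng.permutation(len(X))

split = int(0.8 * len(X))  # 80/20 split
X_train, y_train = X[idx[:split]], y[idx[:split]]
X_val, y_val = X[idx[split:]], y[idx[split:]]
```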
Step 2: Tracking Loss Over Time
Instead of printing one loss value, we store them.
```python
train_losses = []
val_losses = []
```
At each epoch:
```python
train_losses.append(train_loss)
val_losses.append(val_loss)
```
This allows us to see learning behavior.
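Put together, the tracking loop looks roughly like this. Here train_step and compute_loss stand in for the training and evaluation routines built earlier in this series; the names are placeholders, not a fixed API:

```python
train_losses = []
val_losses = []

for epoch in range(num_epochs):
    # One pass of forward prop, backprop, and weight updates (placeholder)
    train_loss = train_step(X_train, y_train)

    # Evaluation only: the validation set never updates the weights
    val_loss = compute_loss(X_val, y_val)

    train_losses.append(train_loss)
    val_losses.append(val_loss)
```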
Step 3: Plotting Loss Curves
```python
import matplotlib.pyplot as plt

plt.plot(train_losses, label="Training Loss")
plt.plot(val_losses, label="Validation Loss")
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.legend()
plt.show()
```
This plot is one of the most important debugging tools in machine learning.
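One optional tweak, not required but often helpful: late in training, the two curves can sit very close together on a linear axis, and a log-scaled y-axis makes small differences visible:

```python
plt.yscale("log")  # optional: call before plt.show() to reveal small differences
```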
How to Read Loss Curves (Very Carefully)
Case 1: Healthy Learning
- Training Loss ↓
- Validation Loss ↓
This is ideal.
The model is learning and generalizing.
Case 2: Overfitting
- Training Loss ↓
- Validation Loss ↑
The model is memorizing the training data.
This is the most common failure mode.
Case 3: Underfitting
- Training Loss → flat
- Validation Loss → flat
The model is too simple, or it has not been trained long enough.
Why Overfitting Happens
Overfitting occurs when:
- The model has too many parameters
- The dataset is small
- Training runs too long
- Noise is learned as signal
In other words:
The model becomes too specialized.
A Simple Overfitting Example
Imagine a model that learns:
“If input = exactly this pattern, output = correct”
Instead of:
“If input is similar to this pattern, output = correct”
The first fails in the real world.
Why Neural Networks Are Especially Prone to Overfitting
Neural networks:
- are highly expressive
- can memorize arbitrary patterns
- do not know what “generalization” means
They only minimize loss.
If minimizing loss means memorizing — they will.
Early Warning Signs of Overfitting
Watch for:
- Validation loss increasing while training loss decreases
- Validation accuracy stagnating
- Large gap between training and validation metrics
If you see these, stop training.
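These signs can also be checked in code at the end of each epoch. A rough sketch, assuming the train_losses and val_losses lists from Step 2; the 0.5 gap threshold is an arbitrary illustration, not a standard value:

```python
# Run after appending this epoch's losses (see Step 2)
if len(val_losses) >= 2 and val_losses[-1] > val_losses[-2]:
    print("Warning: validation loss increased this epoch")

gap = val_losses[-1] - train_losses[-1]
if gap > 0.5:  # threshold chosen for illustration only
    print(f"Warning: large train/validation gap ({gap:.3f})")
```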
The Simplest Defense: Early Stopping
Early stopping means:
Stop training when validation loss stops improving.
Example:
```python
if val_loss > previous_val_loss:
    stop_training = True
```
This is often enough to prevent severe overfitting.
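Because validation loss normally fluctuates from epoch to epoch, stopping on the first uptick is usually too aggressive. A common refinement is to wait a fixed number of epochs ("patience") for an improvement. A minimal sketch, again using the placeholder routines from Step 2:

```python
best_val_loss = float("inf")
patience = 10  # epochs to wait for an improvement before stopping
epochs_without_improvement = 0

for epoch in range(num_epochs):
    train_step(X_train, y_train)           # placeholder training pass
    val_loss = compute_loss(X_val, y_val)  # placeholder evaluation

    if val_loss < best_val_loss:
        best_val_loss = val_loss
        epochs_without_improvement = 0
    else:
        epochs_without_improvement += 1

    if epochs_without_improvement >= patience:
        break  # validation loss has stopped improving
```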
What We Are Not Doing Yet (On Purpose)
We are not yet using:
- regularization
- dropout
- weight decay
Those come next.
First, you must be able to recognize the problem visually.
Common Beginner Mistakes Here
Mistake 1: Trusting training loss only
→ Always track validation loss.
Mistake 2: Training “just a bit longer”
→ Often makes things worse.
Mistake 3: Assuming more data isn’t needed
→ Often, it is.
What You Have Learned in This Article
You can now:
- split data properly
- track training vs validation loss
- interpret loss curves
- detect overfitting and underfitting
- know when training should stop
This is the beginning of model diagnosis.
What’s Next in the Series
In Article #15, we will introduce:
- L2 regularization (weight decay)
- Why it reduces overfitting
- How it modifies the loss function
- How to implement it from scratch
This will be your first active defense against overfitting.
Series Status
- Part I — Foundations ✔
- Part II — Scaling & Diagnostics ▶ In Progress
You now understand not just how to train a neural network, but how to tell whether training is actually working.