SolveWithPython

Building a Neural Network Template in Python — Your First Complete Model

(Neural Networks From Scratch · Article 5)

This is the milestone article

In the previous articles, we built the neural network piece by piece:

  • structure (layers)
  • error measurement (loss)
  • learning signal (gradients)
  • weight updates (SGD)

Now we do something important:

We turn all of that into a usable template.

That means:

  • a clean API
  • less boilerplate
  • clearer intent
  • code that feels like a real model

This is the transition from learning mechanics to using a model.

What “complete” means (for now)

A complete neural network template should allow you to:

  • train a model with fit()
  • make predictions with predict()
  • reuse the same structure on new data
  • hide low-level details from the user

We are not adding advanced features yet.
We are making the basics clean and usable.

Step 1: Clean up the NeuralNet API

Update nn/core.py:

Python
class NeuralNet:
    def __init__(self, layers, loss, optimizer):
        self.layers = layers
        self.loss = loss
        self.optimizer = optimizer

    def forward(self, x):
        # Pass the input through every layer in order.
        for layer in self.layers:
            x = layer.forward(x)
        return x

    def backward(self, grad):
        # Propagate the loss gradient through the layers in reverse order.
        for layer in reversed(self.layers):
            grad = layer.backward(grad)

    def train_step(self, x, y):
        # One full pass: forward, loss, backward, parameter update.
        y_pred = self.forward(x)
        loss_value = self.loss.forward(y_pred, y)
        grad_loss = self.loss.backward()
        self.backward(grad_loss)
        self.optimizer.step(self.layers)
        return loss_value

    def fit(self, X, y, epochs=100, verbose=True):
        # Train sample by sample, summing the loss over each epoch.
        for epoch in range(epochs):
            total_loss = 0.0
            for x, target in zip(X, y):
                total_loss += self.train_step(x, [target])
            if verbose:
                print(f"Epoch {epoch:03d} | Loss: {total_loss:.6f}")

    def predict(self, X):
        # Run the forward pass only; no learning happens here.
        predictions = []
        for x in X:
            predictions.append(self.forward(x))
        return predictions

This is the first real user-facing interface.
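
For reference, this is the contract NeuralNet expects from the pieces built in the earlier articles. The sketch below shows the method shapes only, not the real implementations, which live in nn/layers.py, nn/losses.py, and nn/optimizers.py:

Python
# Sketch of the interfaces NeuralNet relies on (method shapes only).
class Layer:
    def forward(self, x):
        ...  # return the layer's output for input x

    def backward(self, grad):
        ...  # store parameter gradients, return the gradient w.r.t. the input


class Loss:
    def forward(self, y_pred, y):
        ...  # return a scalar loss value

    def backward(self):
        ...  # return the gradient of the loss w.r.t. y_pred


class Optimizer:
    def step(self, layers):
        ...  # update each layer's parameters using its stored gradients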

Step 2: Build a real example (end to end)

Create examples/05_complete_model.py:

Python
import random

from nn.layers import Dense
from nn.core import NeuralNet
from nn.losses import MSE
from nn.optimizers import SGD

random.seed(42)

# Dataset: y = 2x
X = [[1.0], [2.0], [3.0], [4.0], [5.0]]
y = [2.0, 4.0, 6.0, 8.0, 10.0]

net = NeuralNet(
    layers=[
        Dense(1, 3),
        Dense(3, 1),
    ],
    loss=MSE(),
    optimizer=SGD(lr=0.05),
)

net.fit(X, y, epochs=50)

predictions = net.predict(X)

print("\nPredictions:")
for x, pred in zip(X, predictions):
    print(f"x={x[0]:.1f} → ŷ={pred[0]:.2f}")

Run it:

python -m examples.05_complete_model
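
This assumes you run the command from the project root, with a layout along these lines (inferred from the import paths above, so adjust to match your own repository):

nn/
  core.py
  layers.py
  losses.py
  optimizers.py
examples/
  05_complete_model.py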

What you should see

You should observe:

  • loss decreasing steadily
  • predictions close to: y ≈ 2x

Example output:

Epoch 000 | Loss: 12.834215
Epoch 010 | Loss: 0.921233
Epoch 020 | Loss: 0.142331
Epoch 049 | Loss: 0.012194
Predictions:
x=1.0 → ŷ=2.01
x=2.0 → ŷ=3.98
x=3.0 → ŷ=6.02
...

The exact numbers may differ slightly.

What matters is:

  • the trend
  • the correctness
  • the clarity

Important limitations (and why they are OK)

Right now, the template:

  • uses pure Python loops
  • trains sample by sample
  • has no activation functions
  • has no batching

That is intentional.

We prioritized:

  • understanding
  • correctness
  • structure

Performance comes later.

Common beginner questions

“Is this a real neural network?”

Yes. It uses the same principles as large frameworks.

“Why not NumPy yet?”

Because clarity beats speed at this stage.
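
To give a sense of what the eventual NumPy switch buys, here is the same weighted-sum idea written both ways. This is an illustration only, not part of the template:

Python
# Illustration: one weighted sum in pure Python vs. NumPy.
import numpy as np

weights = [0.5, -1.0, 2.0]
inputs = [1.0, 2.0, 3.0]

# Pure Python: an explicit loop you can trace by hand.
out_python = sum(w * x for w, x in zip(weights, inputs))

# NumPy: one vectorized call, much faster on large arrays.
out_numpy = float(np.dot(weights, inputs))

print(out_python, out_numpy)  # both print 4.5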

“Can I extend this?”

Absolutely — and we will.
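
To make that concrete: any new building block only has to follow the same forward/backward contract the layers already use. Here is a rough sketch of what a ReLU activation could look like under that contract; the real version is built and explained properly when activation functions arrive in the next articles:

Python
# Sketch only: a ReLU activation written against the same Layer contract.
class ReLU:
    def forward(self, x):
        self.input = x
        # Keep positive values, zero out the rest.
        return [max(0.0, v) for v in x]

    def backward(self, grad):
        # The gradient flows through only where the input was positive.
        return [g if v > 0 else 0.0 for g, v in zip(grad, self.input)]

Because a layer like this has no weights, the optimizer would need to skip parameter-free layers; that detail is handled when activations are added for real.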

What we achieved in Articles 1–5

You now understand and have implemented:

  • neural network structure
  • forward propagation
  • loss computation
  • backpropagation
  • gradient descent
  • a usable training API

This is the foundation of everything that follows.

What comes next (Phase II)

Now we move from foundation to capability.

Next articles will add:

  • activation functions (ReLU, Sigmoid)
  • classification problems
  • batching
  • evaluation metrics
  • production-style usage

This is where the template becomes powerful.

Series progress