Implementation of a regularized 2-layer multiclass classifier on nonlinear 2D benchmarks (flower, spiral) in NumPy (explicit backprop) and PyTorch (module-based baseline).
A single-hidden-layer network:

$$
z^{[1]} = W^{[1]} x + b^{[1]}, \qquad a^{[1]} = g\!\left(z^{[1]}\right), \qquad
z^{[2]} = W^{[2]} a^{[1]} + b^{[2]}, \qquad \hat{y} = \mathrm{softmax}\!\left(z^{[2]}\right)
$$

with activation $g$ (sigmoid is used for gradient checking, since it is differentiable everywhere).
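A forward pass for this architecture might be sketched as follows (the function and parameter names, the tanh hidden activation, and the column-per-example layout are illustrative assumptions, not necessarily the repo's actual code):

```python
import numpy as np

def forward(X, W1, b1, W2, b2):
    """Forward pass. X: (n_features, m) with one example per column.

    Assumed shapes: W1 (n_hidden, n_features), W2 (n_classes, n_hidden).
    Returns intermediates needed by backprop plus softmax probabilities.
    """
    Z1 = W1 @ X + b1                               # hidden pre-activation
    A1 = np.tanh(Z1)                               # hidden activation g (assumed tanh)
    Z2 = W2 @ A1 + b2                              # output logits
    shifted = Z2 - Z2.max(axis=0, keepdims=True)   # log-sum-exp shift for stability
    expZ = np.exp(shifted)
    A2 = expZ / expZ.sum(axis=0, keepdims=True)    # softmax probabilities, columns sum to 1
    return Z1, A1, Z2, A2
```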
Empirical risk minimization with L2 regularization (weight decay):

$$
J = \frac{1}{m}\sum_{i=1}^{m} \mathrm{CE}\!\left(y^{(i)}, \hat{y}^{(i)}\right)
  + \frac{\lambda}{2m}\left(\lVert W^{[1]}\rVert_F^2 + \lVert W^{[2]}\rVert_F^2\right)
$$
Cross-entropy corresponds to the negative log-likelihood under a categorical model:

$$
\mathrm{CE}(y, \hat{y}) = -\sum_{k} y_k \log \hat{y}_k
$$
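The regularized objective could be computed along these lines (a sketch; the function name, one-hot column layout, and the small epsilon guarding the log are assumptions):

```python
import numpy as np

def regularized_loss(A2, Y, W1, W2, lam):
    """Average cross-entropy plus L2 penalty on both weight matrices.

    A2: softmax probabilities (n_classes, m); Y: one-hot labels, same shape.
    """
    m = Y.shape[1]
    ce = -np.sum(Y * np.log(A2 + 1e-12)) / m                 # mean cross-entropy (epsilon avoids log(0))
    l2 = (lam / (2 * m)) * (np.sum(W1**2) + np.sum(W2**2))   # weight-decay term
    return ce + l2
```

With uniform predictions over 3 classes and no regularization, the loss reduces to log 3.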
Full-batch gradient descent / SGD using analytically derived gradients; demonstrates the chain rule through softmax + cross-entropy (yielding the simple per-example logit gradient $\partial \mathrm{CE} / \partial z^{[2]} = \hat{y} - y$).
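The backward pass implied by that simplification can be sketched as follows (a tanh hidden activation and these parameter names are assumptions; the L2 terms match the $\frac{\lambda}{2m}$ penalty):

```python
import numpy as np

def backward(X, Y, A1, A2, W1, W2, lam):
    """Analytic gradients for the 2-layer net, one example per column.

    Starts from the softmax + cross-entropy shortcut dZ2 = A2 - Y.
    """
    m = Y.shape[1]
    dZ2 = A2 - Y                                   # gradient w.r.t. output logits
    dW2 = dZ2 @ A1.T / m + (lam / m) * W2          # data term + weight decay
    db2 = dZ2.sum(axis=1, keepdims=True) / m
    dZ1 = (W2.T @ dZ2) * (1.0 - A1**2)             # tanh'(z) = 1 - tanh(z)^2
    dW1 = dZ1 @ X.T / m + (lam / m) * W1
    db1 = dZ1.sum(axis=1, keepdims=True) / m
    return dW1, db1, dW2, db2
```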
Softmax is computed with log-sum-exp shifting to prevent overflow.
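The shift is equivalent to subtracting the column-wise maximum from the logits before exponentiating, so `exp` never sees large positive inputs. A minimal sketch (the function name is illustrative):

```python
import numpy as np

def stable_softmax(Z):
    """Column-wise softmax with max-shift; invariant to adding a constant per column."""
    shifted = Z - Z.max(axis=0, keepdims=True)   # largest entry becomes 0
    expZ = np.exp(shifted)                       # no overflow even for huge logits
    return expZ / expZ.sum(axis=0, keepdims=True)
```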
Finite-difference gradient checking (using sigmoid to ensure differentiability everywhere) validates the NumPy backprop by comparing the analytic gradient against the central-difference estimate

$$
\frac{\partial J}{\partial \theta_i} \approx \frac{J(\theta + \varepsilon e_i) - J(\theta - \varepsilon e_i)}{2\varepsilon}
$$

and requiring their relative difference to stay below a small tolerance.
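A minimal central-difference checker, assuming the loss is exposed as a scalar function of a flat parameter vector (the function name and relative-difference formula below are illustrative):

```python
import numpy as np

def grad_check(f, theta, analytic_grad, eps=1e-7):
    """Compare an analytic gradient of f at theta with central differences.

    Returns ||g_num - g_ana|| / (||g_num|| + ||g_ana||); small values indicate agreement.
    """
    num = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = eps
        num[i] = (f(theta + e) - f(theta - e)) / (2 * eps)   # central difference per coordinate
    denom = np.linalg.norm(num) + np.linalg.norm(analytic_grad)
    return np.linalg.norm(num - analytic_grad) / denom
```

For a simple quadratic, the analytic gradient $2\theta$ passes easily.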
```
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

NumPy (Flower):

```
python scripts/train_flower_numpy.py
```

NumPy (Spiral):

```
python scripts/train_spiral_numpy.py
```

PyTorch (Flower):

```
python scripts/train_flower_torch.py
```

PyTorch (Spiral):

```
python scripts/train_spiral_torch.py
```

Plots are saved to:

```
outputs/figures/flower-boundary.jpg
outputs/figures/spiral-boundary.jpg
```
Gradient checking uses a sigmoid hidden activation so the network is differentiable everywhere (ReLU is not differentiable at 0, which makes numerical gradients disagree near 0).
Run:

```
python scripts/gradient_check.py
```

The printed output includes the relative difference

$$
\mathrm{diff} = \frac{\lVert g_{\text{analytic}} - g_{\text{numeric}} \rVert_2}{\lVert g_{\text{analytic}} \rVert_2 + \lVert g_{\text{numeric}} \rVert_2}
$$

A typical pass condition is `diff < 1e-6`.



