CIFAR-10 Image Classification with ResNet18

Author: Dhanush Srinivas
Date: Today


Project Description

This project implements a deep-learning pipeline for image classification on the CIFAR-10 dataset using a ResNet18 architecture.

CIFAR-10 consists of 60 000 colour images of size \( 32 \times 32 \) across 10 classes (50 000 for training and 10 000 for testing).
The model is trained from scratch in PyTorch with standard preprocessing and data augmentation.

We compare ResNet18 against a simple CNN baseline and report:

  • Accuracy and losses
  • Learning curves
  • Confusion matrices
  • Misclassified examples

Features

  • ResNet18 implemented from scratch (residual connections)
  • Simple CNN baseline for comparison
  • Data augmentation: random crop, horizontal flip, normalization with CIFAR-10 stats
  • Comprehensive evaluation: accuracy, loss curves, confusion matrices, misclassifications
  • Clear comparison and trade-offs (capacity, time, performance)
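The "residual connections" mentioned above are the defining feature of ResNet18. As a rough sketch (class and argument names are illustrative, not necessarily those used in the notebook), a basic block adds its input back onto the output of two 3×3 convolutions, projecting the shortcut with a 1×1 convolution whenever the spatial size or channel count changes:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BasicBlock(nn.Module):
    """One ResNet basic block: two 3x3 convs plus an identity shortcut."""

    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, 3,
                               stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, 3,
                               stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        # Project the shortcut only when the shape changes
        self.shortcut = nn.Sequential()
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, 1,
                          stride=stride, bias=False),
                nn.BatchNorm2d(out_channels),
            )

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = out + self.shortcut(x)  # the residual connection
        return F.relu(out)
```

Stacking these blocks in stages of 2 (with channel widths 64, 128, 256, 512) yields the 18-layer variant.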

Requirements

Recommended versions (or newer):

  • PyTorch (≥ 2.0) with CUDA support (optional but recommended)
  • Torchvision
  • NumPy
  • Matplotlib
  • Seaborn
  • scikit-learn

Install via:

pip install -r requirements.txt

Or install the CUDA-enabled build directly:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

Project Structure

Project/
├── C10.ipynb          # Main Jupyter notebook
├── requirements.txt   # Dependencies
├── README.md          # Markdown readme (this file)
└── data/              # Auto-downloaded CIFAR-10 dataset

Usage

  1. Open VS Code (with the Jupyter extension) and load C10.ipynb.
  2. Ensure the Python environment includes PyTorch + CUDA (if a GPU is available).
  3. Run all cells sequentially — dataset auto-downloads on first run.
  4. Start with a small number of epochs to validate your setup.

Notebook Contents

  1. Setup & Imports – Seeding, device selection (CPU/GPU)
  2. Data Loading & Preprocessing – CIFAR-10 with augmentation and normalization
  3. Model Architectures – ResNet18 and Simple CNN definitions
  4. Training Setup – Loss, optimizer (SGD + momentum), scheduler (MultiStepLR)
  5. Training – Train both models for given epochs
  6. Evaluation & Visualization – Curves, confusion matrices, misclassifications
  7. Results Comparison – Final metrics, per-class accuracy, runtime, overfitting gap
  8. Conclusion & Future Work – Findings and extension ideas

Training Configuration

Parameter      Value
---------      -----
Optimizer      SGD (momentum = 0.9)
Initial LR     0.1
Scheduler      MultiStepLR (milestones = 30, 40; \( \gamma = 0.1 \))
Weight Decay   \( 1 \times 10^{-4} \)
Batch Size     128
Epochs         50 (adjust per hardware)
Loss           Cross-Entropy
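The configuration above maps directly onto standard PyTorch objects. A minimal sketch (the `model` here is a placeholder module; the notebook uses the ResNet18 defined earlier):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 10)  # placeholder; substitute the ResNet18 model

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(),
                            lr=0.1, momentum=0.9, weight_decay=1e-4)
# Multiply the learning rate by gamma=0.1 at epochs 30 and 40
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[30, 40], gamma=0.1)
```

With `scheduler.step()` called once per epoch, the learning rate follows 0.1 → 0.01 → 0.001 across the 50-epoch schedule.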

Data Augmentation

  • Random crop (32 with padding = 4)
  • Random horizontal flip ( p = 0.5 )
  • Normalize with CIFAR-10 mean/std:

\[ \text{mean} = (0.4914, 0.4822, 0.4465), \quad \text{std} = (0.2023, 0.1994, 0.2010) \]


Results (Typical)

  • ResNet18: ≈ 93 – 95 % test accuracy (50 epochs, GPU)
  • Simple CNN: ≈ 75 – 80 % test accuracy
  • Learning curves show better convergence and generalization for ResNet18
  • Confusion matrices and misclassified samples show class-wise behavior

Tips for GPU Execution

  • Verify CUDA availability: torch.cuda.is_available() should return True
  • Ensure the VS Code kernel uses the CUDA-enabled environment
  • Begin with fewer epochs to validate speed and stability
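The usual pattern for the first tip is to select the device once and move both the model and each batch onto it, so the notebook runs unchanged on CPU-only machines:

```python
import torch

# Use the GPU when CUDA is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.randn(4, 3, 32, 32).to(device)  # move a batch to the chosen device
```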

Future Work

  • Transfer learning (ImageNet pretraining)
  • Advanced augmentation (AutoAugment, RandAugment, Mixup, CutMix, TTA)
  • Larger datasets (CIFAR-100, ImageNet subsets)
  • Optimization (AMP/mixed precision, pruning, quantization, distillation)
  • Architectural variants (ResNet34/50, SE/CBAM, EfficientNet, ViT)
  • Hyperparameter tuning (AdamW, cosine annealing, cyclical LR)

References

  • He, K., Zhang, X., Ren, S., Sun, J. (2016). Deep Residual Learning for Image Recognition. CVPR.
  • Krizhevsky, A. (2009). Learning Multiple Layers of Features from Tiny Images. Technical Report.
  • PyTorch Documentation
