CIFAR-10 Image Classification with ResNet18

Author: Dhanush Srinivas
Date: Today


Project Description

This project implements a deep-learning pipeline for image classification on the CIFAR-10 dataset using a ResNet18 architecture.

CIFAR-10 consists of 60 000 colour images of size \( 32 \times 32 \) across 10 classes (50 000 for training and 10 000 for testing).
The model is trained from scratch in PyTorch with standard preprocessing and data augmentation.

We compare ResNet18 against a simple CNN baseline and report:

  • Accuracy and losses
  • Learning curves
  • Confusion matrices
  • Misclassified examples

Features

  • ResNet18 implemented from scratch (residual connections)
  • Simple CNN baseline for comparison
  • Data augmentation: random crop, horizontal flip, normalization with CIFAR-10 stats
  • Comprehensive evaluation: accuracy, loss curves, confusion matrices, misclassifications
  • Clear comparison and trade-offs (capacity, time, performance)
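The "residual connections" mentioned above are the defining feature of ResNet18. As a rough sketch (class and argument names are illustrative, not necessarily those used in the notebook), a basic block adds its input back onto the output of two 3×3 convolutions, projecting the shortcut with a 1×1 convolution whenever the spatial size or channel count changes:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BasicBlock(nn.Module):
    """One ResNet basic block: two 3x3 convs plus an identity shortcut."""

    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, 3,
                               stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, 3,
                               stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        # Project the shortcut only when the shape changes
        self.shortcut = nn.Sequential()
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, 1,
                          stride=stride, bias=False),
                nn.BatchNorm2d(out_channels),
            )

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = out + self.shortcut(x)  # the residual connection
        return F.relu(out)
```

Stacking these blocks in stages of 2 (with channel widths 64, 128, 256, 512) yields the 18-layer variant.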

Requirements

Recommended versions (or newer):

  • PyTorch (≥ 2.0) with CUDA support (optional but recommended)
  • Torchvision
  • NumPy
  • Matplotlib
  • Seaborn
  • scikit-learn

Install via:

pip install -r requirements.txt

Or install the CUDA-enabled build directly:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

Project Structure

Project/
├── C10.ipynb          # Main Jupyter notebook
├── requirements.txt   # Dependencies
├── README.md          # Markdown readme (this file)
└── data/              # Auto-downloaded CIFAR-10 dataset

Usage

  1. Open VS Code (with the Jupyter extension) and load C10.ipynb.
  2. Ensure the Python environment includes PyTorch + CUDA (if a GPU is available).
  3. Run all cells sequentially — dataset auto-downloads on first run.
  4. Start with a small number of epochs to validate your setup.

Notebook Contents

  1. Setup & Imports – Seeding, device selection (CPU/GPU)
  2. Data Loading & Preprocessing – CIFAR-10 with augmentation and normalization
  3. Model Architectures – ResNet18 and Simple CNN definitions
  4. Training Setup – Loss, optimizer (SGD + momentum), scheduler (MultiStepLR)
  5. Training – Train both models for given epochs
  6. Evaluation & Visualization – Curves, confusion matrices, misclassifications
  7. Results Comparison – Final metrics, per-class accuracy, runtime, overfitting gap
  8. Conclusion & Future Work – Findings and extension ideas

Training Configuration

Parameter      Value
---------      -----
Optimizer      SGD (momentum = 0.9)
Initial LR     0.1
Scheduler      MultiStepLR (milestones = 30, 40; \( \gamma = 0.1 \))
Weight Decay   \( 1 \times 10^{-4} \)
Batch Size     128
Epochs         50 (adjust per hardware)
Loss           Cross-Entropy
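The configuration above maps directly onto standard PyTorch objects. A minimal sketch (the `model` here is a placeholder module; the notebook uses the ResNet18 defined earlier):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 10)  # placeholder; substitute the ResNet18 model

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(),
                            lr=0.1, momentum=0.9, weight_decay=1e-4)
# Multiply the learning rate by gamma=0.1 at epochs 30 and 40
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[30, 40], gamma=0.1)
```

With `scheduler.step()` called once per epoch, the learning rate follows 0.1 → 0.01 → 0.001 across the 50-epoch schedule.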

Data Augmentation

  • Random crop (32 with padding = 4)
  • Random horizontal flip ( p = 0.5 )
  • Normalize with CIFAR-10 mean/std:

\[ \text{mean} = (0.4914, 0.4822, 0.4465), \quad \text{std} = (0.2023, 0.1994, 0.2010) \]


Results (Typical)

  • ResNet18: ≈ 93 – 95 % test accuracy (50 epochs, GPU)
  • Simple CNN: ≈ 75 – 80 % test accuracy
  • Learning curves show better convergence and generalization for ResNet18
  • Confusion matrices and misclassified samples show class-wise behavior

Tips for GPU Execution

  • Verify CUDA availability: torch.cuda.is_available() should return True
  • Ensure the VS Code kernel uses the CUDA-enabled environment
  • Begin with fewer epochs to validate speed and stability
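The usual pattern for the first tip is to select the device once and move both the model and each batch onto it, so the notebook runs unchanged on CPU-only machines:

```python
import torch

# Use the GPU when CUDA is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.randn(4, 3, 32, 32).to(device)  # move a batch to the chosen device
```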

Future Work

  • Transfer learning (ImageNet pretraining)
  • Advanced augmentation (AutoAugment, RandAugment, Mixup, CutMix, TTA)
  • Larger datasets (CIFAR-100, ImageNet subsets)
  • Optimization (AMP/mixed precision, pruning, quantization, distillation)
  • Architectural variants (ResNet34/50, SE/CBAM, EfficientNet, ViT)
  • Hyperparameter tuning (AdamW, cosine annealing, cyclical LR)

References

  • He, K., Zhang, X., Ren, S., Sun, J. (2016). Deep Residual Learning for Image Recognition. CVPR.
  • Krizhevsky, A. (2009). Learning Multiple Layers of Features from Tiny Images. Technical Report.
  • PyTorch Documentation
