# CIFAR-10 Image Classification with ResNet18

Author: Dhanush Srinivas
Date: Today
## Project Description
This project implements a deep-learning pipeline for image classification on the CIFAR-10 dataset using a ResNet18 architecture.
CIFAR-10 consists of 60,000 images of size \( 32 \times 32 \) across 10 classes.
The model is trained from scratch in PyTorch with standard preprocessing and data augmentation.
We compare ResNet18 against a simple CNN baseline and report:
- Accuracy and losses
- Learning curves
- Confusion matrices
- Misclassified examples
## Features
- ResNet18 implemented from scratch (residual connections)
- Simple CNN baseline for comparison
- Data augmentation: random crop, horizontal flip, normalization with CIFAR-10 stats
- Comprehensive evaluation: accuracy, loss curves, confusion matrices, misclassifications
- Clear comparison and trade-offs (capacity, time, performance)
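To illustrate the residual connections the from-scratch ResNet18 relies on, here is a minimal sketch of a CIFAR-style basic block in PyTorch. The class and variable names are illustrative, not the notebook's exact code:

```python
import torch
import torch.nn as nn

class BasicBlock(nn.Module):
    """Two 3x3 convs plus an identity (or projection) shortcut."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.shortcut = nn.Sequential()  # identity by default
        if stride != 1 or in_ch != out_ch:
            # 1x1 projection so the shortcut matches the main path's shape
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # Residual connection: add the shortcut before the final activation
        return torch.relu(out + self.shortcut(x))
```

Stacking these blocks (with stride-2 blocks to downsample) yields the ResNet18 topology of He et al. (2016).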
## Requirements
Recommended versions (or newer):
- PyTorch (≥ 2.0) with CUDA support (optional but recommended)
- Torchvision
- NumPy
- Matplotlib
- Seaborn
- scikit-learn
Install via:

```bash
pip install -r requirements.txt
```

Or directly with CUDA support:

```bash
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
```
## Project Structure

```text
Project/
├── C10.ipynb          # Main Jupyter notebook
├── requirements.txt   # Dependencies
├── README.md          # Markdown readme (this file)
└── data/              # Auto-downloaded CIFAR-10 dataset
```
## Usage

- Open VS Code (with the Jupyter extension) and load `C10.ipynb`.
- Ensure the Python environment includes PyTorch, with CUDA if a GPU is available.
- Run all cells sequentially; the dataset auto-downloads on the first run.
- Start with a small number of epochs to validate your setup.
## Notebook Contents
- Setup & Imports – Seeding, device selection (CPU/GPU)
- Data Loading & Preprocessing – CIFAR-10 with augmentation and normalization
- Model Architectures – ResNet18 and Simple CNN definitions
- Training Setup – Loss, optimizer (SGD + momentum), scheduler (MultiStepLR)
- Training – Train both models for given epochs
- Evaluation & Visualization – Curves, confusion matrices, misclassifications
- Results Comparison – Final metrics, per-class accuracy, runtime, overfitting gap
- Conclusion & Future Work – Findings and extension ideas
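The setup step (seeding and device selection) typically looks like the following sketch; the function name `set_seed` is illustrative:

```python
import random
import numpy as np
import torch

def set_seed(seed=42):
    # Seed Python, NumPy, and PyTorch (CPU and all GPUs) for reproducibility
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # no-op on CPU-only machines

set_seed(42)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
```

All models and batches are then moved to `device` with `.to(device)`.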
## Training Configuration
| Parameter | Value |
|---|---|
| Optimizer | SGD (momentum = 0.9) |
| Initial LR | 0.1 |
| Scheduler | MultiStepLR (milestones = 30, 40; \( \gamma = 0.1 \)) |
| Weight Decay | \( 1 \times 10^{-4} \) |
| Batch Size | 128 |
| Epochs | 50 (adjust per hardware) |
| Loss | Cross-Entropy |
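The table above maps directly onto standard PyTorch objects. A sketch of the training setup, with a small `nn.Linear` standing in for the actual ResNet18 model:

```python
import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(3 * 32 * 32, 10)  # stand-in for ResNet18
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(
    model.parameters(),
    lr=0.1,            # initial LR
    momentum=0.9,
    weight_decay=1e-4,
)
# Drop the LR by 10x at epochs 30 and 40
scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[30, 40], gamma=0.1)
```

Each epoch, the training loop calls `optimizer.step()` per batch and `scheduler.step()` once at the end.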
## Data Augmentation

- Random crop to \( 32 \times 32 \) with padding = 4
- Random horizontal flip (p = 0.5)
- Normalize with CIFAR-10 mean/std:
\[ \text{mean} = (0.4914, 0.4822, 0.4465), \quad \text{std} = (0.2023, 0.1994, 0.2010) \]
## Results (Typical)
- ResNet18: ≈ 93 – 95 % test accuracy (50 epochs, GPU)
- Simple CNN: ≈ 75 – 80 % test accuracy
- Learning curves show better convergence and generalization for ResNet18
- Confusion matrices and misclassified samples show class-wise behavior
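The class-wise evaluation can be sketched with scikit-learn's `confusion_matrix`; the toy labels below are illustrative, not project results:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Toy labels for a 3-class example (the project uses 10 CIFAR classes)
y_true = np.array([0, 1, 2, 2, 1])
y_pred = np.array([0, 2, 2, 2, 1])

cm = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])
# Per-class accuracy = correct predictions / true count, per row
per_class_acc = cm.diagonal() / cm.sum(axis=1)
```

Plotting `cm` as a heatmap (e.g. with Seaborn's `heatmap`) gives the confusion matrices shown in the notebook, and rows with low diagonal values flag the classes contributing most misclassified examples.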
## Tips for GPU Execution

- Verify CUDA availability: `torch.cuda.is_available()` should return `True`
- Ensure the VS Code kernel uses the CUDA-enabled environment
- Begin with fewer epochs to validate speed and stability
## Future Work
- Transfer learning (ImageNet pretraining)
- Advanced augmentation (AutoAugment, RandAugment, Mixup, CutMix, TTA)
- Larger datasets (CIFAR-100, ImageNet subsets)
- Optimization (AMP/mixed precision, pruning, quantization, distillation)
- Architectural variants (ResNet34/50, SE/CBAM, EfficientNet, ViT)
- Hyperparameter tuning (AdamW, cosine annealing, cyclical LR)
## References
- He, K., Zhang, X., Ren, S., Sun, J. (2016). Deep Residual Learning for Image Recognition. CVPR.
- Krizhevsky, A. (2009). Learning Multiple Layers of Features from Tiny Images. Technical Report.
- PyTorch Documentation
