A quick deep learning project for autonomous RC robot navigation using behavioral cloning. The system learns to map raw camera inputs directly to steering and throttle controls by mimicking human driving behavior.
- CNN architecture inspired by NVIDIA's self-driving car model
- 5 convolutional layers with batch normalization
- 4 fully connected layers with dropout
- Dual output: steering and throttle, each a PWM command in the 1000-2000 range
- Input: 66x200 RGB images from the car's perspective
- Output: control values normalized to [-1, 1], mapped back to the 1000-2000 PWM range for actuation
- Real-time inference at 30+ FPS
- End-to-end behavioral cloning
- Real-time trajectory visualization
- Support for both image and video inference
- Batch normalization for training stability
- Dropout layers for regularization
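Since the network predicts values in [-1, 1] while the servos expect PWM commands in 1000-2000, a linear mapping is needed in both directions. A minimal sketch (function names are illustrative, not taken from the repository):

```python
def pwm_to_normalized(pwm, lo=1000.0, hi=2000.0):
    """Map a PWM value in [lo, hi] to [-1, 1] (e.g. as a training target)."""
    return 2.0 * (pwm - lo) / (hi - lo) - 1.0

def normalized_to_pwm(value, lo=1000.0, hi=2000.0):
    """Map a network output in [-1, 1] back to a PWM value in [lo, hi]."""
    return lo + (value + 1.0) * (hi - lo) / 2.0
```

With this mapping, a neutral stick position (PWM 1500) corresponds exactly to a network output of 0.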
- Conv2d(3, 24, 5, stride=2) → BatchNorm2d
- Conv2d(24, 36, 5, stride=2) → BatchNorm2d
- Conv2d(36, 48, 5, stride=2) → BatchNorm2d
- Conv2d(48, 64, 3) → BatchNorm2d
- Conv2d(64, 64, 3) → BatchNorm2d
- Linear(1152, 100) → ReLU → Dropout(0.5)
- Linear(100, 50) → ReLU → Dropout(0.5)
- Linear(50, 10) → ReLU
- Linear(10, 2) → Tanh
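The layer listing above can be sketched as a PyTorch module. The class name `DrivingNet` and the ReLU activations after each convolution are assumptions (the listing omits the conv activations); the repository's actual module may differ:

```python
import torch
import torch.nn as nn

class DrivingNet(nn.Module):
    """NVIDIA-style end-to-end driving network (illustrative sketch)."""

    def __init__(self):
        super().__init__()
        # Five conv layers; with a 3x66x200 input, the final feature map is
        # 64 x 1 x 18 = 1152 values, matching Linear(1152, 100) below.
        self.features = nn.Sequential(
            nn.Conv2d(3, 24, 5, stride=2), nn.BatchNorm2d(24), nn.ReLU(),
            nn.Conv2d(24, 36, 5, stride=2), nn.BatchNorm2d(36), nn.ReLU(),
            nn.Conv2d(36, 48, 5, stride=2), nn.BatchNorm2d(48), nn.ReLU(),
            nn.Conv2d(48, 64, 3), nn.BatchNorm2d(64), nn.ReLU(),
            nn.Conv2d(64, 64, 3), nn.BatchNorm2d(64), nn.ReLU(),
        )
        self.regressor = nn.Sequential(
            nn.Flatten(),
            nn.Linear(1152, 100), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(100, 50), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(50, 10), nn.ReLU(),
            nn.Linear(10, 2), nn.Tanh(),  # (steering, throttle) in [-1, 1]
        )

    def forward(self, x):
        return self.regressor(self.features(x))
```

The Tanh head bounds both outputs to [-1, 1], which keeps the denormalized commands inside the valid 1000-2000 PWM range.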
- Train the model on collected driving data in Colab with `rc_robot_train.ipynb`
- Run inference on images: `python rc_robot_inference_v2.py`
- Process videos: `python rc_robot_video_inference.py`
- PyTorch
- OpenCV
- NumPy
- Matplotlib
- Pillow (PIL)
- NVIDIA's End-to-End Deep Learning for Self-Driving Cars paper
