A real-time yoga pose classification system using MediaPipe pose estimation and PyTorch neural networks. This project reaches 95.4% validation accuracy when classifying five yoga poses from images.
This project helps yoga practitioners get instant feedback on their poses without needing an instructor present. Built as a complete end-to-end ML pipeline, it demonstrates skills in computer vision, deep learning, and deployment.
As someone passionate about making yoga accessible, I built this to help practitioners verify their form in real-time. This project taught me:
- End-to-end ML engineering (data preprocessing → deployment)
- Production-ready computer vision pipelines
- Real-time pose estimation and classification
- Web application deployment with Streamlit
- High Accuracy: 95.4% validation accuracy on yoga pose classification
- Real-time Detection: Fast pose estimation using MediaPipe
- Image & Video Support: Works with both static images and video files
- Visual Feedback: Annotated pose visualization with confidence scores
- Interactive Web App: User-friendly Streamlit interface
- Production-Ready: Clean, modular code with proper train/test separation
The system consists of three main components:
- Feature Extraction: MediaPipe extracts 33 3D landmarks (99 features total)
- Neural Network: Fully connected network with dropout regularization
- Classification: Multi-class softmax classifier with confidence scoring
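As a rough illustration of the feature extraction step, the sketch below flattens 33 landmarks into the 99-value vector the network consumes. It assumes each landmark exposes `x`, `y`, and `z` attributes (as MediaPipe pose landmarks do) and that the values are laid out per landmark as x, y, z. That ordering and the helper name `landmarks_to_features` are assumptions for illustration, not necessarily what `creating_csv.py` does.

```python
from collections import namedtuple

NUM_LANDMARKS = 33  # MediaPipe Pose reports 33 body landmarks


def landmarks_to_features(landmarks):
    """Flatten landmarks with .x/.y/.z into [x0, y0, z0, x1, y1, z1, ...]."""
    if len(landmarks) != NUM_LANDMARKS:
        raise ValueError(f"expected {NUM_LANDMARKS} landmarks, got {len(landmarks)}")
    features = []
    for lm in landmarks:
        features.extend((lm.x, lm.y, lm.z))
    return features


# Stand-in for real MediaPipe output, so the sketch runs without a camera or image.
Landmark = namedtuple("Landmark", ["x", "y", "z"])
fake_pose = [Landmark(0.01 * i, 0.02 * i, 0.0) for i in range(NUM_LANDMARKS)]

features = landmarks_to_features(fake_pose)
print(len(features))  # 33 landmarks x 3 coordinates
```

With real input, the same function could be applied to the landmark list MediaPipe returns for a detected pose.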
Input (99 features: 33 landmarks × 3 coordinates)
↓
Linear(99 → 128) + ReLU + Dropout(0.3)
↓
Linear(128 → 64) + ReLU + Dropout(0.2)
↓
Linear(64 → num_classes)
↓
Softmax Output
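The diagram above could be written in PyTorch roughly as follows. This is a minimal sketch, not the code from `train.py`: the class name `PoseClassifier` is hypothetical, and `num_classes=5` is taken from the five poses in the dataset. The sketch returns raw logits from `forward` and applies softmax only at inference time, on the assumption that training uses `nn.CrossEntropyLoss`, which expects logits.

```python
import torch
from torch import nn


class PoseClassifier(nn.Module):
    """Fully connected classifier: 99 -> 128 -> 64 -> num_classes."""

    def __init__(self, num_classes=5, in_features=99):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, 128), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(128, 64), nn.ReLU(), nn.Dropout(0.2),
            nn.Linear(64, num_classes),
        )

    def forward(self, x):
        return self.net(x)  # raw logits, one per class


model = PoseClassifier()
model.eval()  # disables dropout for inference

with torch.no_grad():
    logits = model(torch.randn(4, 99))    # batch of 4 random feature vectors
    probs = torch.softmax(logits, dim=1)  # per-class confidence scores

print(probs.shape)
```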
| Metric | Score |
|---|---|
| Training Accuracy | 97.2% |
| Validation Accuracy | 95.4% |
| Training Loss (final) | 0.0875 |
| Model Size | ~50KB |
| Inference Speed | Real-time |
Upload an image or video to see real-time pose detection and classification!
- Python 3.10+
- pip
# Clone the repository
git clone https://github.com/Pooja-Vachhad/AI-Yoga-Pose-Classifier.git
cd AI-Yoga-Pose-Classifier
# Install dependencies
pip install -r requirements.txt
This project uses the Yoga Pose Classification Dataset from Kaggle.
Dataset Details:
- 5 yoga poses: Downdog, Goddess, Plank, Tree, Warrior2
- High-quality images for training and validation
- Preprocessed using MediaPipe pose estimation
Dataset Structure:
dataset/
├── Downdog/
├── Goddess/
├── Plank/
├── Tree/
└── Warrior2/
To prepare the dataset:
- Download from Kaggle
- Extract to the dataset/ folder
- Run the preprocessing script (see Usage below)
# Update the dataset path in creating_csv.py
# Then run:
python creating_csv.py
This extracts pose landmarks and creates pose_landmarks.csv.
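For orientation, here is one way a landmark CSV of this kind could be laid out: 99 coordinate columns followed by a label column. The column names (`x0`, `y0`, `z0`, ..., `label`) and the helper functions are hypothetical. The actual layout of pose_landmarks.csv is defined by `creating_csv.py` and may differ.

```python
import csv
import io

NUM_LANDMARKS = 33


def csv_header():
    """Build column names x0, y0, z0, ..., x32, y32, z32, label."""
    cols = []
    for i in range(NUM_LANDMARKS):
        cols += [f"x{i}", f"y{i}", f"z{i}"]
    return cols + ["label"]


def write_rows(handle, samples):
    """Write (features, label) pairs as CSV rows under a header line."""
    writer = csv.writer(handle)
    writer.writerow(csv_header())
    for features, label in samples:
        writer.writerow(list(features) + [label])


# Two made-up samples, written to memory instead of disk for this sketch.
samples = [([0.5] * 99, "Tree"), ([0.25] * 99, "Plank")]
buffer = io.StringIO()
write_rows(buffer, samples)

rows = buffer.getvalue().splitlines()
print(len(rows), len(rows[0].split(",")))
```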
python train.py
This will:
- Train the neural network for 50 epochs
- Display training progress with loss and accuracy
- Save yoga_pose_classifier.pth (model weights)
- Save label_encoder.pkl (class encoder)
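The training steps listed above might look roughly like the sketch below. It runs on random synthetic data rather than the real landmark CSV, and the optimizer (Adam), learning rate, full-batch updates, and temporary output directory are all assumptions made for illustration. `train.py` may use different settings.

```python
import os
import pickle
import tempfile

import torch
from torch import nn
from sklearn.preprocessing import LabelEncoder

torch.manual_seed(0)

# Synthetic stand-in for the landmark data: 100 samples, 99 features each.
pose_names = ["Downdog", "Goddess", "Plank", "Tree", "Warrior2"]
labels = [pose_names[i % 5] for i in range(100)]
X = torch.randn(100, 99)

# Map class names to integer ids (LabelEncoder sorts names alphabetically).
encoder = LabelEncoder()
y = torch.tensor(encoder.fit_transform(labels), dtype=torch.long)

model = nn.Sequential(
    nn.Linear(99, 128), nn.ReLU(), nn.Dropout(0.3),
    nn.Linear(128, 64), nn.ReLU(), nn.Dropout(0.2),
    nn.Linear(64, len(encoder.classes_)),
)
criterion = nn.CrossEntropyLoss()  # expects raw logits
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

model.train()
for epoch in range(50):
    optimizer.zero_grad()
    logits = model(X)
    loss = criterion(logits, y)
    loss.backward()
    optimizer.step()
    if (epoch + 1) % 10 == 0:
        accuracy = (logits.argmax(dim=1) == y).float().mean().item()
        print(f"epoch {epoch + 1:2d}  loss {loss.item():.4f}  acc {accuracy:.3f}")

# Save weights and the label encoder (a temp directory is used for this sketch).
out_dir = tempfile.mkdtemp()
model_path = os.path.join(out_dir, "yoga_pose_classifier.pth")
encoder_path = os.path.join(out_dir, "label_encoder.pkl")
torch.save(model.state_dict(), model_path)
with open(encoder_path, "wb") as f:
    pickle.dump(encoder, f)
```

Saving the encoder alongside the weights keeps the mapping between class indices and pose names available at inference time.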
# Update test_image_path in test.py
python test.py
streamlit run app.py
Then open http://localhost:8501 in your browser.
Web App Features:
- Upload images or videos
- Real-time pose detection with landmarks
- Confidence scores for predictions
- Side-by-side comparison
- Download processed videos
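The confidence scores shown in the app can be understood as softmax probabilities over the model's raw outputs. The sketch below shows that calculation in plain Python. The class order assumes the alphabetical ordering a scikit-learn `LabelEncoder` would produce, and the `threshold` parameter and "Uncertain" fallback are hypothetical additions, not features confirmed by this project.

```python
import math

# Assumed class order (alphabetical, as a LabelEncoder would produce).
CLASS_NAMES = ["Downdog", "Goddess", "Plank", "Tree", "Warrior2"]


def softmax(logits):
    """Convert raw scores to probabilities, subtracting the max for stability."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_confidence(logits, class_names=CLASS_NAMES, threshold=0.6):
    """Return (label, confidence); label is 'Uncertain' below the threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    label = class_names[best] if probs[best] >= threshold else "Uncertain"
    return label, probs[best]


label, confidence = predict_with_confidence([0.2, 0.1, 3.5, 0.4, 0.3])
print(f"{label}: {confidence:.1%}")
```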
| Component | Technology |
|---|---|
| Deep Learning | PyTorch |
| Pose Estimation | MediaPipe |
| Computer Vision | OpenCV |
| Data Processing | Pandas, NumPy |
| Visualization | Matplotlib |
| Web Framework | Streamlit |
| ML Utils | scikit-learn |
AI-Yoga-Pose-Classifier/
├── app.py # Streamlit web application
├── train.py # Model training script
├── test.py # Testing script for images
├── creating_csv.py # Dataset preprocessing
├── requirements.txt # Python dependencies
├── packages.txt # System dependencies
├── yoga_pose_classifier.pth # Trained model weights
├── label_encoder.pkl # Class label encoder
├── README.md # Project documentation
└── dataset/ # Training data (not included)
├── Downdog/
├── Goddess/
├── Plank/
├── Tree/
└── Warrior2/
Solution:
- Added dropout layers (0.3 and 0.2) to prevent overfitting
- Implemented stratified train-test split for balanced data
- Result: 95.4% validation accuracy
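A stratified split keeps the proportion of each pose the same in the training and validation sets. A minimal sketch with scikit-learn's `train_test_split` follows. The 80/20 ratio, the `random_state`, and the synthetic data are assumptions for illustration, since the values used in `train.py` are not stated here.

```python
from collections import Counter

from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 20 samples per pose, 99 features each.
poses = ["Downdog", "Goddess", "Plank", "Tree", "Warrior2"]
labels = [pose for pose in poses for _ in range(20)]
features = [[float(i)] * 99 for i in range(len(labels))]

# stratify=labels preserves the per-class proportions in both splits.
X_train, X_val, y_train, y_val = train_test_split(
    features, labels, test_size=0.2, stratify=labels, random_state=42
)

print(Counter(y_train))
print(Counter(y_val))
```

Without `stratify`, a random split could leave one pose under-represented in the validation set, which would make the validation accuracy less reliable.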
Solution:
- Used MediaPipe's optimized pose estimation
- Lightweight neural network architecture (~50KB)
- Achieved real-time inference on CPU
Solution:
- Frame-by-frame pose detection
- Efficient video encoding with OpenCV
- Added download feature for processed videos
- Add more yoga poses (expand to 20+ poses)
- Real-time webcam support
- Mobile app deployment (Flutter/React Native)
- Pose correction suggestions
- Multi-person pose detection
- Export workout analytics
