Deep Learning Projects

Overview

This repository contains deep learning projects using state-of-the-art neural networks, pretrained models, and transfer learning across vision, NLP, and audio domains. These projects are designed to demonstrate the power of deep neural networks.

Project Structure

/
├── README.md                                 # Main documentation
├── Speaker Identification/                   # Speaker recognition using deep audio features
├── Sentiment_Classification_BERT.ipynb       # NLP sentiment classification using BERT
├── ImageClassification/                      # Image classification using CNN + Transfer Learning

Key Features

Domains Covered: Speech Recognition, Natural Language Processing, and Computer Vision.
Model Categories:
- Deep audio classification (Speaker Recognition)
- Transformer-based models (BERT for NLP)
- Convolutional Neural Networks (Image Classification)
Libraries and Tools: PyTorch, torchaudio, HuggingFace Transformers, TensorFlow/Keras, Matplotlib, seaborn, NumPy, pandas.
Workflow Steps:
- Problem definition
- Data preprocessing and augmentation
- Feature extraction
- Model training and evaluation
- Visualization and interpretation

1. Speaker Recognition and Classification

Summary

This project identifies speakers using audio recordings by leveraging pretrained ECAPA-TDNN embeddings from the SpeechBrain library. It explores both classification and similarity-based approaches to speaker recognition, and includes a Gradio demo app for interactive testing. Designed using a small sample dataset, this project showcases a lightweight hybrid speaker recognition pipeline.

Highlights

Data: Small-scale speaker dataset with short audio clips representing multiple identities.
Methods:
- Preprocessing: Mono conversion, resampling to 16 kHz, and fixed-length waveform padding using torchaudio.
- Embedding Extraction: ECAPA-TDNN model from speechbrain.pretrained.EncoderClassifier.
- Two Approaches:
  - Classification-based: A custom neural network (EmbeddingClassifierBN) trained on embeddings for multi-class speaker identification.
  - Similarity-based: Cosine similarity between embeddings to verify if two audio clips are from the same speaker.
- Hybrid Evaluation: Combines both approaches for flexible inference and robustness testing.
Interface: A user-friendly Gradio app allows real-time testing of speaker audio to evaluate both classification and similarity predictions.
Evaluation: Accuracy, training loss curves, confusion matrices, and real-world inference via the Gradio interface.

📁 Project Directory: Speaker Recognition

Sentiment Classification with BERT (IMDB Movie Reviews)

Summary

This project leverages BERT (Bidirectional Encoder Representations from Transformers) for fine-tuned sentiment classification on the IMDB movie review dataset. The goal is to predict the sentiment— positive, or negative—of a given movie review using Transformer-based contextual embeddings.

Highlights

Dataset: IMDB movie reviews labeled with sentiment classes.
Methods & Tools:
- Pretrained BERT model: bert-base-uncased
- Input preprocessing: tokenization, padding, attention masks
- Fine-tuning BERT on the labeled sentiment data
Evaluation Metrics:
- Accuracy
- Precision, Recall, F1-score
- Confusion Matrix for class-wise performance

📁 Project Notebook

Sentiment Classification with BERT (via Transfer Learning)

3. Image Classification via Transfer Learning

Summary

This computer vision project uses transfer learning with pretrained CNNs (like ResNet or VGG) for classifying images into different categories. It reduces training time while achieving strong performance.

Highlights

Data: A folder-based image dataset with labels.
Methods:
- Pretrained CNN (e.g., ResNet18) from torchvision.models
- Data augmentation using transforms
- Fine-tuning last layers on custom dataset
Evaluation: Training accuracy, validation accuracy, confusion matrix, and visualizations.

📁 Project Directory: Image Classification

Contributing

Contributions are welcome! Fork the repository, make your changes, and submit a pull request. For major changes, please open an issue to discuss what you’d like to change.

Name		Name	Last commit message	Last commit date
Latest commit History 22 Commits
ImageClassification		ImageClassification
Speaker Identification		Speaker Identification
README.md		README.md
Sentiment_Classification_with_BERT_via_Transfer_Learning.ipynb		Sentiment_Classification_with_BERT_via_Transfer_Learning.ipynb

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Deep Learning Projects

Overview

Project Structure

Key Features

1. Speaker Recognition and Classification

Summary

Highlights

Sentiment Classification with BERT (IMDB Movie Reviews)

Summary

Highlights

📁 Project Notebook

3. Image Classification via Transfer Learning

Summary

Highlights

Contributing

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Deep Learning Projects

Overview

Project Structure

Key Features

1. Speaker Recognition and Classification

Summary

Highlights

Sentiment Classification with BERT (IMDB Movie Reviews)

Summary

Highlights

📁 Project Notebook

3. Image Classification via Transfer Learning

Summary

Highlights

Contributing

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages