Yeeyash/Classification-Decision-Trees-and-KNN

📊 Classification: Decision Trees and KNN

This repository is part of my structured learning path in Machine Learning. My aim is to understand not just how algorithms work but why they work, by working through the mathematics, logic, and code implementation of classification methods such as Decision Trees and K-Nearest Neighbors (KNN).


🚀 What This Project Covers

🔍 Mathematical Foundations:

  • Entropy
  • Gini Impurity
  • Information Gain
  • Euclidean Distance
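
As a rough illustration, the four measures listed above can be sketched in plain Python. This is a minimal sketch that uses only the standard library. It is not the code in utils.py, and the function names (`entropy`, `gini`, `information_gain`, `euclidean_distance`) are assumptions chosen for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    # Sum of -p * log2(p) over the observed class proportions p.
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child groups."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length numeric points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

For a balanced two-class list such as `["a", "a", "b", "b"]`, `entropy` gives 1.0 and `gini` gives 0.5. Splitting that list into `["a", "a"]` and `["b", "b"]` yields an information gain of 1.0, because both children are pure.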

🧠 Machine Learning Concepts:

  • Splitting criteria and tree growth
  • Lazy vs eager learning
  • Overfitting and generalization in classification
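
To make the idea of a splitting criterion concrete, here is a minimal sketch of how a tree might pick a threshold on a single numeric feature by minimizing weighted Gini impurity. The helper name `best_threshold` and the midpoint candidate strategy are assumptions for illustration, not necessarily what decision_tree.py does.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best split `value <= threshold`.

    Returns (None, inf) when all feature values are identical, since no
    split is possible in that case.
    """
    n = len(values)
    best = (None, float("inf"))
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = len(left) / n * gini(left) + len(right) / n * gini(right)
        if score < best[1]:
            best = (threshold, score)
    return best
```

Growing a tree repeats this search recursively on each child group. Repeating it until every leaf is pure tends to memorize the training data, which is why limits such as a maximum depth are commonly used to improve generalization.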

⚙️ Algorithms Implemented:

  • Decision Tree Classifier (custom and sklearn)
  • K-Nearest Neighbors (KNN)
  • Data preprocessing, model evaluation & visualization
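
For orientation, a KNN classifier can be sketched in a few lines. The class below is a hypothetical `SimpleKNN` written for this example, not the implementation in knn.py. It assumes Euclidean distance and majority voting, and it also shows why KNN is called a lazy learner: `fit` only stores the data, and all the work happens in `predict`.

```python
import math
from collections import Counter


class SimpleKNN:
    """Illustrative k-nearest-neighbors classifier (majority vote)."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # Lazy learning: nothing is trained here, the data is only stored.
        self.X = list(X)
        self.y = list(y)
        return self

    def predict_one(self, point):
        # Distance from the query point to every stored training point.
        distances = [
            (math.dist(row, point), label) for row, label in zip(self.X, self.y)
        ]
        distances.sort(key=lambda pair: pair[0])
        nearest = [label for _, label in distances[: self.k]]
        # The most common label among the k nearest neighbors wins.
        return Counter(nearest).most_common(1)[0][0]

    def predict(self, points):
        return [self.predict_one(p) for p in points]
```

A short usage example with made-up points:

```python
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = ["a", "a", "a", "b", "b", "b"]
model = SimpleKNN(k=3).fit(X, y)
print(model.predict([(0.2, 0.1), (5.5, 5.5)]))
```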

📁 Project Structure

Classification-Decision-Trees-and-KNN/
├── dataset.csv                 # Sample dataset used for training/testing
├── decision_tree.py            # Custom Decision Tree classifier implementation
├── knn.py                      # K-Nearest Neighbors classifier implementation
├── utils.py                    # Helper functions (entropy, gini, info gain, etc.)
├── notebook.ipynb              # Jupyter Notebook with explanation & visualizations
├── requirements.txt            # List of Python dependencies
└── README.md                   # Project documentation

🛠️ Tech Stack

  • Python 3
  • NumPy & Pandas — data manipulation
  • Matplotlib & Seaborn — visualization
  • Scikit-learn — for model comparison & validation

📊 Visual Insights

The notebook includes:

  • Decision boundaries for KNN
  • Feature splits in Decision Trees
  • Comparative accuracy metrics
  • Confusion matrices
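
As a small illustration of the last two items, a confusion matrix and the accuracy derived from it can be computed without any libraries. This is a sketch with assumed function names (`confusion_matrix`, `accuracy`); the notebook itself may rely on scikit-learn for these metrics.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, predicted in zip(y_true, y_pred):
        matrix[index[truth]][index[predicted]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0
```

With true labels `["a", "a", "b", "b"]` and predictions `["a", "b", "b", "b"]`, the matrix is `[[1, 1], [0, 2]]` and the accuracy is 0.75.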

📚 Key Learning Outcomes

This project helped me:

  • Grasp the intuition behind classification algorithms
  • Understand how mathematical metrics guide decisions
  • Reinforce programming skills by writing logic from scratch
  • Use ML libraries with confidence and clarity

📦 Getting Started

  1. Clone the repo:
git clone https://github.com/Yeeyash/Classification-Decision-Trees-and-KNN.git
cd Classification-Decision-Trees-and-KNN
  2. Install dependencies. requirements.txt lists:
numpy
pandas
matplotlib
seaborn
scikit-learn
jupyter
Install them with:
pip install -r requirements.txt
  3. Launch the notebook:
jupyter notebook Classification_DecisionTrees_and_KNN.ipynb

🙌 Connect?

LinkedIn: Yash Ghansham Thakare

About

Exploration of Decision Trees and KNN classifiers with focus on entropy, gini impurity, and information gain. Features custom implementations, preprocessing, and evaluation.
