D.R.I.B.B.L.E

Data-Driven Insights for Basketball Location & Efficiency

This project uses K-Nearest Neighbors (KNN) classifiers with several distance metrics and ensemble methods to predict NBA shot outcomes from historical shot-location data. The goal is to support decision-making by identifying the court locations and shooting strategies most likely to produce successful shots.

Project Structure

The project is organized into several key modules, each responsible for a specific aspect of the workflow:

1) `data_loader.py`

This script loads the shot logs and other relevant datasets from specified file paths. It uses pandas to read CSV files so that shot logs, player statistics, and game schedules are in memory for the later stages. It is the first stage of the data pipeline, supplying the raw data for preprocessing and analysis.
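As a rough illustration of this loading step, the sketch below reads several named CSV sources into DataFrames with `pandas.read_csv`. The function name `load_datasets`, the dataset key, and the column names are assumptions made for the example; the project's actual file paths and schema are not shown in this README.

```python
import io

import pandas as pd


def load_datasets(sources):
    """Read each named CSV source (path or file-like object) into a DataFrame."""
    return {name: pd.read_csv(src) for name, src in sources.items()}


# In-memory stand-in for a shot-log CSV; the column names are hypothetical.
demo_csv = io.StringIO(
    "player,loc_x,loc_y,shot_made\n"
    "A,1.5,20.0,1\n"
    "B,-3.0,5.5,0\n"
)

frames = load_datasets({"shot_logs": demo_csv})
print(frames["shot_logs"].head())
```

In real use, the dictionary values would be file paths rather than in-memory buffers.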

2) `data_cleaning.py`

This file implements the preprocessing routines that clean and prepare the datasets for analysis: handling missing values, correcting data types, and, where needed, filtering out irrelevant data points. The goal is to preserve data integrity and give the model consistent, well-typed inputs, which accurate machine learning predictions depend on.
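A minimal sketch of such a cleaning pass is shown below, assuming hypothetical columns `loc_x`, `loc_y`, and `shot_made`. It coerces the coordinates to numbers, drops rows that are missing a required value, and fixes the label type. The real script may apply different rules.

```python
import pandas as pd


def clean_shots(df):
    """Coerce coordinate types, drop incomplete rows, and cast the label to int."""
    out = df.copy()
    # Unparseable coordinates become NaN so they can be dropped below.
    out["loc_x"] = pd.to_numeric(out["loc_x"], errors="coerce")
    out["loc_y"] = pd.to_numeric(out["loc_y"], errors="coerce")
    out = out.dropna(subset=["loc_x", "loc_y", "shot_made"])
    out["shot_made"] = out["shot_made"].astype(int)
    return out.reset_index(drop=True)


# Hypothetical raw data with one malformed and one missing coordinate.
raw = pd.DataFrame({
    "loc_x": ["1.5", "bad", "3.0", "4.0"],
    "loc_y": [20.0, 5.5, None, 7.0],
    "shot_made": [1, 0, 1, 0],
})

cleaned = clean_shots(raw)
print(cleaned)
```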

3) `data_analysis.py`

This script performs exploratory data analysis (EDA) and feature engineering on the cleaned data. It covers statistical analysis of the feature distributions, creation of new features from existing data (e.g., shot efficiency by player and location), and selection of the features used to train the machine learning models. This stage surfaces insights from the data and puts it into a form that improves the model's predictive performance.
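The shot-efficiency feature mentioned above could be derived along these lines. This is only a sketch: the `player`, `zone`, and `shot_made` columns and the `add_efficiency` helper are hypothetical, and the project may define efficiency differently.

```python
import pandas as pd


def add_efficiency(df):
    """Add the make rate of each (player, zone) group as a new column."""
    out = df.copy()
    out["efficiency"] = (
        out.groupby(["player", "zone"])["shot_made"].transform("mean")
    )
    return out


# Hypothetical cleaned shots; 1 = made, 0 = missed.
shots = pd.DataFrame({
    "player": ["A", "A", "A", "B"],
    "zone": ["corner", "corner", "paint", "paint"],
    "shot_made": [1, 0, 1, 0],
})

features = add_efficiency(shots)
print(features)
```

Because this feature is computed from the target column, a real pipeline should compute it on the training split only and then map it onto the test split, to avoid leaking labels into the evaluation.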

4) `model.py`

This module contains the core machine learning implementation, built on the K-Nearest Neighbors (KNN) algorithm. It evaluates several KNN configurations that vary the distance metric (Euclidean, Manhattan, and the more general Minkowski distance, of which the first two are special cases) and applies weighted KNN and bagging to improve prediction accuracy. The script trains the model on the preprocessed data, validates it using accuracy, and outputs the model’s predictions. This file is central to the project because it handles the creation, training, and evaluation of the predictive model.
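To make the distance metrics and weighted voting concrete, here is a small from-scratch sketch in plain Python. It is not the project's implementation, and the training points are invented for illustration. It shows that Manhattan and Euclidean distance are the Minkowski distance with `p=1` and `p=2`, and that distance weighting can change a prediction.

```python
from collections import defaultdict


def minkowski(a, b, p=2):
    """Minkowski distance; p=1 gives Manhattan, p=2 gives Euclidean."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def knn_predict(train, query, k=3, p=2, weighted=False):
    """Predict a label from the k nearest (point, label) pairs in train."""
    nearest = sorted(train, key=lambda item: minkowski(item[0], query, p))[:k]
    votes = defaultdict(float)
    for point, label in nearest:
        d = minkowski(point, query, p)
        # Weighted KNN: closer neighbors count more (epsilon avoids 1/0).
        votes[label] += 1.0 / (d + 1e-9) if weighted else 1.0
    return max(votes, key=votes.get)


# Invented (x, y) shot locations; label 1 = made, 0 = missed.
train = [((0, 1), 1), ((3, 0), 0), ((0, 4), 0), ((5, 5), 1)]
query = (0, 0)

uniform = knn_predict(train, query, k=3)
weighted_vote = knn_predict(train, query, k=3, weighted=True)
print(uniform, weighted_vote)
```

With uniform voting the two more distant misses outvote the single nearby make; with distance weighting the nearby make dominates. Bagging would repeat this procedure on bootstrap resamples of the training set and combine the resulting predictions by majority vote.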

5) `output.py`

The output script formats the model's results for end-users or saves them to a file. It may include functions that structure the outputs into readable reports or dashboards and that export data to CSV files or databases for further use or presentation. This makes the insights generated by the model accessible and actionable.
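As one possible shape for the CSV export, the sketch below converts numeric predictions to readable labels and writes them with `DataFrame.to_csv`. The `export_predictions` helper and the column names are assumptions for the example, and an in-memory buffer stands in for an output file.

```python
import io

import pandas as pd


def export_predictions(df, target):
    """Write predictions to target (path or buffer) with readable labels."""
    report = df.copy()
    report["predicted_outcome"] = report["predicted"].map({1: "made", 0: "missed"})
    report = report.drop(columns="predicted")
    report.to_csv(target, index=False)
    return report


# Hypothetical model output: shot locations with a 0/1 prediction.
preds = pd.DataFrame({
    "loc_x": [1.5, -3.0],
    "loc_y": [20.0, 5.5],
    "predicted": [1, 0],
})

buf = io.StringIO()
export_predictions(preds, buf)
csv_text = buf.getvalue()
print(csv_text)
```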

Running the Project

To execute the full workflow, run the following commands:

Run the backend process:
```
python main.py
```
Run the frontend process:
```
streamlit run website.py
```

Model Accuracies

Best cross-validated score: 67.61%
Test set accuracy: 67.78%
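
For readers unfamiliar with the two numbers above, the sketch below shows one common way such figures are produced with scikit-learn: a grid search reports the best mean cross-validated accuracy on the training data, and the refitted best model is then scored once on a held-out test set. The synthetic data, the parameter grid, and the split are all assumptions. This is not the project's configuration and will not reproduce the reported values.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data: (x, y) locations and an invented made/missed rule.
rng = np.random.default_rng(0)
X = rng.uniform(-25, 25, size=(400, 2))
y = (np.hypot(X[:, 0], X[:, 1]) < 15).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

search = GridSearchCV(
    KNeighborsClassifier(),
    param_grid={
        "n_neighbors": [3, 7, 15],
        "metric": ["euclidean", "manhattan"],
        "weights": ["uniform", "distance"],
    },
    cv=5,
    scoring="accuracy",
)
search.fit(X_train, y_train)

cv_score = search.best_score_                 # best mean cross-validated accuracy
test_accuracy = search.score(X_test, y_test)  # accuracy on the held-out test set
print(search.best_params_, cv_score, test_accuracy)
```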
