Shop Customer Clustering and Classification

Overview

This project involves analyzing and classifying shop customer data to gain insights into customer behavior and improve business strategies. The project consists of two main components:

- Customer Clustering: Grouping customers into clusters based on their characteristics.
- Customer Classification: Building classification models to predict customer clusters based on specific features.

The customer data is sourced from a Kaggle dataset.

Customer Clustering Notebook

This notebook focuses on clustering customers based on their features. Below are the key steps:

Steps Performed

Introduction to Dataset:
- Overview of the dataset with details on rows, columns, and feature descriptions.
Importing Libraries:
- Essential libraries such as pandas, numpy, matplotlib, seaborn, and scikit-learn are imported.
Loading the Dataset:
- Data is loaded into a Pandas DataFrame, and the first few rows are displayed.
Exploratory Data Analysis (EDA):
- Dataset structure and descriptive statistics are examined.
- Missing values are identified, and data visualizations are used to understand variable distributions.
Data Preprocessing:
- Handling missing values by removing rows with null values.
- Removing duplicate entries.
- Detecting and handling outliers using the IQR method.
- Encoding categorical features with LabelEncoder.
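
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration on toy data, not the notebook's actual code: the column names (`gender`, `age`, `income`), the values, and the helper `iqr_mask` are all hypothetical, and the 1.5 multiplier is the conventional IQR choice rather than something the notebook is known to use.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy data; column names and values are hypothetical, not taken from the notebook.
df = pd.DataFrame({
    "gender": ["Male", "Female", "Female", None, "Male", "Male", "Female"],
    "age": [25, 31, 31, 40, 22, 58, 35],
    "income": [30_000, 42_000, 42_000, 55_000, 38_000, 900_000, 47_000],
})

# 1. Drop rows containing null values, then drop exact duplicate rows.
clean = df.dropna().drop_duplicates().reset_index(drop=True)


def iqr_mask(series, k=1.5):
    """Return True for values inside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return series.between(q1 - k * iqr, q3 + k * iqr)


# 2. Keep only rows that are inliers on every numeric column.
mask = iqr_mask(clean["age"]) & iqr_mask(clean["income"])
clean = clean[mask].reset_index(drop=True)

# 3. Encode the categorical column as integers (classes are sorted alphabetically).
encoder = LabelEncoder()
clean["gender"] = encoder.fit_transform(clean["gender"])
```

Here outliers are dropped; clipping them to the IQR bounds is another common way to "handle" them, and the source does not say which one the notebook applies.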
Clustering Model Development:
- Implementing K-Means clustering with an initial number of clusters (K=3).
- Evaluating the model using Silhouette Score.
- Optimizing the number of clusters using the Elbow method and Silhouette Score.
- Retraining the model with the optimal number of clusters.
- Performing feature selection to identify influential features.
- Training the K-Means model with selected features and comparing the results.
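
As a rough sketch of the tuning loop described above, the snippet below fits K-Means for several values of K, records the inertia (the quantity plotted in an Elbow chart) and the Silhouette Score, and retrains with the best-scoring K. The data is synthetic (`make_blobs`), and the K range, `n_init`, and `random_state` are assumptions made for illustration; the feature-selection step is not shown.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the preprocessed customer features (assumption).
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.6, random_state=42)

inertias = {}      # within-cluster sum of squares per K, used for the Elbow plot
silhouettes = {}   # mean Silhouette Score per K

for k in range(2, 8):
    model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    inertias[k] = model.inertia_
    silhouettes[k] = silhouette_score(X, model.labels_)

# Pick the K with the highest Silhouette Score and retrain on it.
best_k = max(silhouettes, key=silhouettes.get)
final_model = KMeans(n_clusters=best_k, n_init=10, random_state=42).fit(X)
cluster_labels = final_model.labels_
```

Plotting `inertias` against K gives the Elbow curve; the Silhouette Score offers a numeric tie-breaker when the elbow is ambiguous.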
Clustering Results Visualization:
- Visualizing clustering outcomes using PCA for dimensionality reduction.
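
One way to produce such a plot is to project the features onto the first two principal components and color the points by cluster. The sketch below uses synthetic five-dimensional data, and `plot_clusters` is a hypothetical helper, not a function from the notebook.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA

# Synthetic stand-in for the customer features (assumption).
X, _ = make_blobs(n_samples=200, centers=3, n_features=5, random_state=0)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Reduce the feature space to two components for plotting.
pca = PCA(n_components=2)
coords = pca.fit_transform(X)


def plot_clusters(coords, labels):
    """Scatter the 2-D PCA coordinates, colored by cluster label."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    scatter = ax.scatter(coords[:, 0], coords[:, 1], c=labels, cmap="viridis", s=15)
    ax.set_xlabel("PC 1")
    ax.set_ylabel("PC 2")
    ax.legend(*scatter.legend_elements(), title="Cluster")
    return ax
```

Calling `plot_clusters(coords, labels)` draws the figure. `pca.explained_variance_ratio_` shows how much of the original variance the two axes retain, which helps judge how faithful the 2-D picture is.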
Cluster Analysis and Interpretation:
- Examining the characteristics of each cluster based on available features.
- Displaying value distributions within each cluster.
Exporting Results:
- Saving the clustering results to a CSV file.

This notebook provides a comprehensive analysis of shop customers, enabling the business to group customers into distinct segments for targeted strategies.

Customer Classification Notebook

This notebook focuses on building machine learning models to classify customers into their respective clusters.

Steps Performed

Importing Libraries:
- Libraries such as pandas, scikit-learn, seaborn, and matplotlib are imported.
Loading Clustered Dataset:
- The dataset from the clustering notebook is loaded into a DataFrame for further analysis.
Data Splitting:
- The dataset is split into training (70%) and testing (30%) sets.
Classification Model Development:
- Building models using the following algorithms:
  - Logistic Regression
  - Decision Tree
  - Random Forest
  - K-Nearest Neighbors (K-NN)
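
The split and the four models listed above might be set up along these lines. This is a sketch on synthetic data: the hyperparameters, the `random_state` values, and the use of `stratify` are assumptions, since the source only states the 70/30 split and the algorithm names.

```python
from sklearn.datasets import make_blobs
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in: X plays the role of the features, y the cluster labels.
X, y = make_blobs(n_samples=500, centers=3, n_features=4, random_state=0)

# 70% training, 30% testing; stratifying on the label is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(random_state=42),
    "K-NN": KNeighborsClassifier(),
}

# Fit each model on the training set and record its test-set accuracy.
test_accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_accuracy[name] = model.score(X_test, y_test)
```
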
Model Evaluation:
- Evaluating models on the testing set using metrics such as:
  - Accuracy
  - Precision
  - Recall
  - F1-Score
  - Confusion Matrix
- Summary of results:
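  - Decision Tree, Random Forest, and K-NN each scored 1.0 on Accuracy, Precision, Recall, and F1-Score.
  - Logistic Regression reached an accuracy of 0.9119.
  - Because the target labels are the K-Means cluster assignments, these scores measure how well each model reproduces the clustering, not how well it predicts an independent outcome.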
Confusion Matrix Visualization:
- Confusion matrices for each model are visualized using seaborn.
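
The evaluation and heatmap steps can be illustrated for a single model as below. The data is synthetic, `plot_confusion_matrix` is a hypothetical helper, and macro averaging for the multi-class Precision, Recall, and F1-Score is an assumption; the source does not say which averaging the notebook uses.

```python
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, deliberately overlapping classes so the metrics are not trivially 1.0.
X, y = make_blobs(n_samples=300, centers=3, cluster_std=3.0, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1
)

y_pred = DecisionTreeClassifier(random_state=1).fit(X_train, y_train).predict(X_test)

# Macro averaging is an assumption; "weighted" is an equally common choice.
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred, average="macro", zero_division=0),
    "recall": recall_score(y_test, y_pred, average="macro", zero_division=0),
    "f1": f1_score(y_test, y_pred, average="macro", zero_division=0),
}

# Rows are true labels, columns are predicted labels.
cm = confusion_matrix(y_test, y_pred)


def plot_confusion_matrix(cm, title):
    """Draw an annotated seaborn heatmap of a confusion matrix."""
    import seaborn as sns

    ax = sns.heatmap(cm, annot=True, fmt="d", cmap="Blues")
    ax.set_xlabel("Predicted cluster")
    ax.set_ylabel("True cluster")
    ax.set_title(title)
    return ax
```

Calling `plot_confusion_matrix(cm, "Decision Tree")` renders the heatmap; repeating the fit, predict, and plot steps per model gives one matrix for each classifier.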

This notebook demonstrates a step-by-step process for data analysis, model training, and performance evaluation for classifying shop customers.

Conclusion

By combining clustering and classification approaches, this project provides valuable insights into customer segmentation and predictive analytics. These insights can be leveraged by businesses to create tailored marketing strategies, improve customer satisfaction, and optimize resource allocation.
