Hotel Customer Segmentation - Attrition Prediction

Insights from three years of customer booking data.

-- Project Status: [Active]

Project Intro/Objective

The purpose of this project is to cluster customers into segments based on their booking data and their recency, frequency and monetary (RFM) value. The goal is a deeper understanding of our top-value customers to support more targeted marketing, management action, promo triggers, etc. We can also use these segments to help predict whether a customer is at risk of attrition (a non-returning or lost customer).

The primary reason for attrition prediction is to retain customers at high risk of loss by taking preventative action, which preserves revenue sources and helps ensure that customer acquisition spend yields a solid ROI.
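
As an illustration of the recency, frequency and monetary values mentioned above, here is a minimal sketch that derives them from a list of bookings using only the Python standard library. The record layout (`customer_id`, booking date, revenue), the sample values and the `compute_rfm` helper are assumptions made for this example; they are not the dataset's actual columns or the project's code, which relies on Pandas.

```python
from collections import defaultdict
from datetime import date

# Hypothetical booking records: (customer_id, booking_date, revenue).
# Field names and values are illustrative, not taken from the dataset.
bookings = [
    ("c1", date(2018, 11, 20), 250.0),
    ("c1", date(2018, 6, 2), 180.0),
    ("c2", date(2016, 3, 15), 90.0),
    ("c3", date(2018, 12, 1), 400.0),
    ("c3", date(2017, 12, 24), 320.0),
    ("c3", date(2016, 8, 9), 150.0),
]


def compute_rfm(records, snapshot):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    recency_days: days between the customer's latest booking and `snapshot`
    frequency:    number of bookings
    monetary:     total revenue across all bookings
    """
    last_seen = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, booked_on, revenue in records:
        # Track the most recent booking date per customer.
        if customer_id not in last_seen or booked_on > last_seen[customer_id]:
            last_seen[customer_id] = booked_on
        frequency[customer_id] += 1
        monetary[customer_id] += revenue
    return {
        cid: ((snapshot - last_seen[cid]).days, frequency[cid], monetary[cid])
        for cid in last_seen
    }


rfm = compute_rfm(bookings, snapshot=date(2018, 12, 31))
for cid, (recency, freq, spend) in sorted(rfm.items()):
    print(cid, recency, freq, spend)
```

A customer with a large recency value and a low frequency, such as `c2` here, is the kind of profile an attrition model would be expected to flag, although the cut-off for "at risk" is a modeling decision this sketch does not make.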

Project Organization

├── LICENSE
├── Makefile           <- Makefile with commands like `make data` or `make train`
├── README.md          <- The top-level README for developers using this project.
├── data
│   ├── external       <- Data from third party sources.
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump.
│
├── docs               <- A default Sphinx project; see sphinx-doc.org for details
│
├── models             <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering),
│                         the creator's initials, and a short `-` delimited description, e.g.
│                         `1.0-jqp-initial-data-exploration`.
│
├── references         <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   └── figures        <- Generated graphics and figures to be used in reporting
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`
│
├── setup.py           <- makes project pip installable (pip install -e .) so src can be imported
├── src                <- Source code for use in this project.
│   ├── __init__.py    <- Makes src a Python module
│   │
│   ├── data           <- Scripts to download or generate data
│   │   └── make_dataset.py
│   │
│   ├── features       <- Scripts to turn raw data into features for modeling
│   │   └── build_features.py
│   │
│   ├── models         <- Scripts to train models and then use trained models to make
│   │   │                 predictions
│   │   ├── predict_model.py
│   │   └── train_model.py
│   │
│   └── visualization  <- Scripts to create exploratory and results oriented visualizations
│       └── visualize.py
│
└── tox.ini            <- tox file with settings for running tox; see tox.readthedocs.io

Project Description

Methods Used

Inferential Statistics
Machine Learning
Data Visualization
Predictive Modeling
KMeans clustering
Classification
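
To show the mechanics behind the KMeans clustering step, the sketch below runs a plain Lloyd's-algorithm k-means over a handful of made-up RFM rows. It is a standard-library stand-in written for illustration, not the project's implementation, which lists Scikit-learn; the `standardize` and `kmeans` helpers, the sample rows and the choice of `k=2` are all assumptions.

```python
import math

# Hypothetical (recency_days, frequency, monetary) rows, one per customer.
rfm_rows = [
    (10, 8, 2400.0),   # recent, frequent, high spend
    (400, 1, 120.0),   # long absent, rare, low spend
    (15, 7, 2100.0),
    (380, 2, 200.0),
    (20, 9, 2600.0),
    (420, 1, 90.0),
]


def standardize(rows):
    """Scale each column to zero mean and unit variance so no feature dominates."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column is constant to avoid dividing by zero.
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [tuple((x - m) / s for x, m, s in zip(row, means, stds)) for row in rows]


def kmeans(points, k, iterations=50):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
        # Stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


scaled = standardize(rfm_rows)
labels, centroids = kmeans(scaled, k=2)
for row, label in zip(rfm_rows, labels):
    print(label, row)
```

Standardizing first matters because monetary values are orders of magnitude larger than frequency counts; without it, the distance calculation would be driven almost entirely by spend. Seeding from the first `k` points keeps the example deterministic, whereas library implementations typically use smarter initialization.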

Languages and Technologies

Python
Pandas
Jupyter
Numpy
Scikit-learn
PyCaret
Tableau

The Data

A real-world customer dataset with 31 variables describing 83,590 instances (customers) from a hotel in Lisbon, Portugal. Each instance includes the customer's personal, behavioral, demographic, and geographical information, covering 3 full years. The dataset is available on Kaggle.

Kaggle dataset origin, domain assumptions and data collection information:

Nuno Antonio, Ana de Almeida, Luis Nunes. A hotel's customer's personal, behavioral, demographic, and geographic dataset from Lisbon, Portugal (2015-2018). Data in Brief 33(2020)106583, 24(November), 2020. URL: https://www.sciencedirect.com/journal/data-in-brief.

Preview


Getting Started


Clone this repo (for help see this tutorial).
Raw data is kept [here](Repo folder containing raw data) within this repo.
Data processing/transformation scripts are kept [here](Repo folder containing data processing scripts/notebooks).
Recreate environment and dependencies using this file
- Using anaconda prompt . . .
Follow setup [instructions](Link to file)

Featured Notebooks and Deliverables


Credits

Kaggle dataset origin, domain assumptions and data collection information:

Nuno Antonio, Ana de Almeida, Luis Nunes. A hotel's customer's personal, behavioral, demographic, and geographic dataset from Lisbon, Portugal (2015-2018). Data in Brief 33(2020)106583, 24(November), 2020. URL: https://www.sciencedirect.com/journal/data-in-brief.

Contact

Connect with me on LinkedIn.
Personal website coming soon . . .
