merrillm1/Predicting_Hotel_Cancellations

Predicting Hotel Cancellations

Can we predict if a guest will cancel a reservation at the moment of booking?

The booking process has changed dramatically over the past decade, with more guests choosing to book online rather than directly with the hotel. While the accessibility of online travel agencies may increase exposure and demand for many hotels, it has also brought an increase in cancellation rates. Cancellations are a familiar foe of the hotel industry, but the "risk free cancellations" campaigns run by online travel agencies have made them a damaging statistic worthy of a second look.

According to a study conducted by D-Edge Hospitality Solutions, cancellation rates in the hotel industry peaked at 41.3% in 2017, up from around 32% in 2014. It is important to note how heavily this average is skewed by online travel agencies such as Booking.com, which posted a whopping 50% cancellation rate in 2018.^[Hertzfeld, Esther. Study: Cancellation Rate at 40% as OTAs Push Free Change Policy. Hotel Management, 23 Apr. 2019, www.hotelmanagement.net/tech/study-cancelation-rate-at-40-as-otas-push-free-change-policy.] This is in stark contrast to an average cancellation rate of 18.2% in 2018 for customers booking direct. The booking process has changed, and hotels are now forced to find ways of limiting the damage caused by cancellations. My work here aims to predict cancellations and to offer a solution based on early outreach for bookings flagged as high cancellation risk.

The dataset was obtained from ScienceDirect and contains observations of guest bookings, taken from 2015 to 2017, for two hotels located in Portugal. The data was collected directly from the hotels' PMS (property management system) databases and was relatively clean and structured upon retrieval. Each observation represents a booking, with variables capturing details such as when it was booked, the dates of stay, which operator the guest booked through, whether the booking was cancelled, and the average daily rate.


Overview

The project comprises all steps of Data Science work broken down as follows:

  • Data collection and wrangling: done in Jupyter Notebook
  • Exploratory Data Analysis: using Python in RStudio with the reticulate library for statistical data analysis
  • Machine learning: using Python, with Logistic Regression and Random Forests from scikit-learn and a final CatBoost model, in Jupyter Notebook
  • Report completed and rendered as Rmarkdown document
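
The machine learning step can be sketched roughly as follows. This is a minimal illustration and not the project's actual notebook code: it assumes scikit-learn is installed, uses synthetic data from `make_classification` in place of the hotel bookings, and the hyperparameters and the `results` dictionary are placeholders chosen for the example. The CatBoost model is left out to keep the sketch dependency-light.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the booking data (assumption: roughly 40% positives,
# where the positive class plays the role of "cancelled").
X, y = make_classification(
    n_samples=2000,
    n_features=10,
    n_informative=5,
    weights=[0.6, 0.4],
    random_state=0,
)

# Hold out a stratified test set so both classes appear in each split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Two baseline models named in the overview; settings are illustrative only.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # AUROC is computed from the predicted probability of the positive class.
    proba = model.predict_proba(X_test)[:, 1]
    results[name] = roc_auc_score(y_test, proba)

for name, auc in results.items():
    print(f"{name}: AUROC = {auc:.3f}")
```

Scoring every model with the same held-out split and the same metric keeps the comparison between the baselines and a later gradient-boosted model fair.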

Links

The work has been broken down in stages and summary slides have been created for a quick look at the results.

Author

Acknowledgements

  • Dhiraj Khanna - Springboard mentor

About

Created a prediction algorithm for determining, at the moment of booking, whether a customer will cancel. After eliminating numerous sources of data leakage, we achieved a 90% AUROC with a CatBoost classification model.
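
For readers unfamiliar with the metric, AUROC can be read as the probability that a randomly chosen cancelled booking receives a higher predicted score than a randomly chosen non-cancelled one. The sketch below computes it from ranks using only the standard library. The function name `auroc` and the toy labels and scores are made up for illustration and are not taken from the project.

```python
def auroc(labels, scores):
    """Rank-based AUROC: P(score of a positive > score of a negative), ties count half."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")

    pairs = sorted(zip(scores, labels))  # ascending by score
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Find the run of tied scores starting at position i.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        # Tied items share the average of the 1-based ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j

    # Mann-Whitney U statistic for the positive class, normalised to [0, 1].
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


# Toy example: 1 = cancelled, 0 = not cancelled (hypothetical values).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(toy_labels, toy_scores))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.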
