This analysis is based on real-world, anonymized, freshly collected business data.
This project focuses on analyzing a diverse set of data related to the hotel and accommodation industry, including marketing performance, website traffic, visitor behavior, booking engine usage, reservation patterns, and occupancy trends. The primary objective is to derive actionable insights and formulate data-driven recommendations that help optimize and maximize accommodation bookings.
By leveraging this comprehensive dataset, the analysis explores ways to:
- Improve the efficiency of marketing campaigns and maximize ROI.
- Enhance website usability and booking engine performance to boost user engagement.
- Identify key patterns and trends in booking behavior to target high-value opportunities.
- Optimize occupancy rates by understanding changes in demand and availability.
The findings aim to provide a strategic foundation for improving the overall performance of hotel booking systems and achieving greater profitability in the competitive accommodation market.
The analysis was originally made for the in-house competition of the Data36.com data science club, where I placed second with this work.
It covers the processing and analysis of marketing and booking data for several hotels of a hotel industry company, with a view to optimising hotel booking opportunities and increasing the company's profitability.
Main contributions:
- Data preprocessing and cleaning
- Comprehensive Exploratory Data Analysis
- Analysis of search patterns
- Conversion analysis
- Analysis for revenue and yield management and for campaign optimisation
- Analysis of advertisement and PPC costs
- Proposals based on findings and insights to improve business profitability
Dataset:
11 tables per hotel with data collected between 18.09.2024 and 18.10.2024:
- 1 table with data on website traffic
- 1 table with data on marketing spending
- 8 tables with search and booking data
- 1 table with occupancy data
The description of the dataset variables is available here.
In the analysis, I identified the key factors that drive guest behaviour during room booking and communicated them, together with recommended changes that would help optimize the marketing campaigns and increase profitability:
Main proposals:
- When not to advertise: in high-demand periods, when the rooms fill up without advertising
- When to advertise: at lead times between 10 and 50 days
- Meta is an informational, demand-generating platform, where the PPC costs can be much higher
- However, optimizing the Meta and Microsoft advertisements is recommended
- Google, by contrast, is a conventional transmission interface, so its PPC spend per conversion is lower by nature, which is reflected in the data
- Simplify room recommendations to customers and visitors
- Base the recommendations on the purchase behaviour analysis (for example, prefer and suggest the more expensive rooms for guests with children)
- Improve conversion rates for families with children
- Improve upsell offers for guests with children, particularly in Hotels 2 and 3
- Presentation of results to stakeholders
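The lead-time window and the per-platform cost comparison in the proposals above can be illustrated with a minimal sketch. The function names and the sample figures are hypothetical and are not taken from the dataset; only the 10 to 50 day window comes from the proposals, and treating both ends of the window as inclusive is an assumption.

```python
from datetime import date
from typing import Optional

# Advertising window from the proposals: 10 to 50 days of lead time.
# Treating both bounds as inclusive is an assumption.
AD_WINDOW_DAYS = (10, 50)


def lead_time_days(search_date: date, arrival_date: date) -> int:
    """Number of days between the search date and the arrival date."""
    return (arrival_date - search_date).days


def in_ad_window(search_date: date, arrival_date: date) -> bool:
    """True when the lead time falls inside the recommended advertising window."""
    low, high = AD_WINDOW_DAYS
    return low <= lead_time_days(search_date, arrival_date) <= high


def cost_per_conversion(spend: float, conversions: int) -> Optional[float]:
    """PPC spend divided by the number of bookings; None when there were none."""
    return spend / conversions if conversions else None


# Made-up illustration values, not results from the analysis.
print(in_ad_window(date(2024, 9, 18), date(2024, 10, 18)))
print(cost_per_conversion(200.0, 8))
```

Computing the same two quantities per platform and per lead-time bucket would make it possible to compare Meta, Microsoft, and Google spend on an equal footing.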
Methods Used
- Data Cleaning
- Data Exploration
- Investigate Statistical Metrics
- Data Visualization
- Feature importance analysis
- Identification of business insights
- Proposals on advertisement costs and mediums
- Presentation of found insights
Tools
- Python
Data preprocessing and cleaning
- Join and merge several data tables (11 tables in total for each hotel) in Python
- Ensuring data integrity
- Handling of missing data
- Handling of outliers
- Correction of inconsistent variables
- Feature engineering
- Generate new useful variables
- Currency conversion with web scraping
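The joining, missing-data, and outlier steps listed above might look roughly like the following sketch. It uses only the Python standard library and made-up records. The column names (`session_id`, `adults`, `price`), the median imputation, and the 1.5 times IQR outlier rule are assumptions for illustration, not necessarily the methods used in the analysis.

```python
from statistics import median, quantiles


def join_on_key(left, right, key):
    """Left-join two lists of dicts on a shared key column."""
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup.get(row[key], {})} for row in left]


def fill_missing_with_median(rows, column):
    """Replace missing (None or absent) values in `column` with the observed median."""
    observed = [row[column] for row in rows if row.get(column) is not None]
    fill = median(observed)
    return [
        {**row, column: fill if row.get(column) is None else row[column]}
        for row in rows
    ]


def flag_iqr_outliers(rows, column, k=1.5):
    """Add an `is_outlier` flag based on the k * IQR rule (an assumed convention)."""
    values = [row[column] for row in rows]
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [{**row, "is_outlier": not (low <= row[column] <= high)} for row in rows]


# Hypothetical records standing in for two of the tables of one hotel.
searches = [{"session_id": i, "adults": 2} for i in range(1, 6)]
bookings = [
    {"session_id": 1, "price": 100.0},
    {"session_id": 2, "price": 110.0},
    {"session_id": 3, "price": None},   # missing value to impute
    {"session_id": 4, "price": 120.0},
    {"session_id": 5, "price": 900.0},  # suspiciously high value
]

merged = join_on_key(searches, bookings, "session_id")
cleaned = flag_iqr_outliers(fill_missing_with_median(merged, "price"), "price")
for row in cleaned:
    print(row)
```

Flagging outliers rather than dropping them keeps the decision about business-relevant extremes for the exploratory stage.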
EDA
- Providing basic statistical metrics
- Providing important business insights, supported by statistical testing
- Identifying and segmenting customer groups
- Defining business-impacting outliers and groups
- Proposals for advertising platforms and marketing costs and time periods
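One way to support a conversion insight with a statistical test is a two-proportion z-test between two visitor segments. The sketch below is an assumption about how such a check could be done, not necessarily the test used in the analysis; the function name and the counts are made up.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates between two groups.

    Returns (z statistic, p-value); the standard error uses the pooled proportion.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: bookings out of searches for two segments.
z, p = two_proportion_z_test(conv_a=30, n_a=1000, conv_b=60, n_b=1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value would suggest that the difference in conversion rates between the two segments is unlikely to be due to chance alone, which is the kind of evidence a segment-specific proposal can rest on.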
Presentation of results
- In-person presentation of results to a jury of recognized senior data scientists
