This repository contains analysis and insights derived from the Telco Customer Churn dataset. The dataset provides valuable information about customer behavior, allowing businesses to predict and mitigate customer churn effectively. By exploring various data points, we aim to identify key factors influencing churn and propose actionable strategies for customer retention.
- Analyze the Telco Customer Churn dataset to gain valuable insights and identify specific business actions to reduce customer churn.
- Customer churn occurs when a customer stops using a product or service. Analyzing churn is crucial for businesses because retaining customers is at least as valuable as acquiring new ones, and losing clients is very costly. This focus on retention reflects a more product-led mindset in technology businesses.
- How can we reduce customer churn through action-oriented analysis?
| Source | Size |
|---|---|
| Kaggle: Telco Customer Churn | 7,043 rows, 21 columns |
- customerid: Customer identification number.
- gender: Whether the customer is male or female.
- seniorcitizen: 0 if the customer is under 65 years old, or 1 if they are 65 or older.
- partner: Whether the customer has a partner (Yes, No).
- dependents: Whether the customer has dependents (Yes, No) such as children, parents, etc.
- tenure: Number of months the customer has stayed with the company.
- phoneservice: Whether the customer has a phone service (Yes, No).
- multiplelines: Whether the customer has multiple lines (Yes, No, No phone service).
- internetservice: Customer’s internet service provider (DSL, Fiber optic, No).
- onlinesecurity: Whether the customer has online security (Yes, No, No internet service).
- onlinebackup: Whether the customer has online backup (Yes, No, No internet service).
- deviceprotection: Whether the customer has device protection (Yes, No, No internet service).
- techsupport: Whether the customer has tech support (Yes, No, No internet service).
- streamingtv: Whether the customer has streaming TV (Yes, No, No internet service).
- streamingmovies: Whether the customer has streaming movies (Yes, No, No internet service).
- contract: Customer’s current contract type (Month-to-Month, One Year, Two Year).
- paperlessbilling: Whether the customer has paperless billing (Yes, No).
- paymentmethod: How the customer pays their bill (Bank Withdrawal, Credit Card, Electronic Check, Mailed Check).
- monthlycharges: Customer’s current total monthly charge for all services.
- totalcharges: Customer’s total charges up to the end of the specified quarter.
Target Column = churn: 1 if the customer left the company (Churn=Yes) and 0 if they remained (Churn=No) based on the last quarter.
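As a rough illustration of the target encoding described above, the sketch below parses a few rows, lowercases the column names, and maps Churn to 0/1. The sample rows are made up, and the helper name `load_rows` is hypothetical; the repository's actual loading code may differ.

```python
import csv
import io

# Toy sample in the shape of the dataset; the values are invented for illustration.
SAMPLE_CSV = """customerID,tenure,Contract,MonthlyCharges,Churn
0001-AAAA,1,Month-to-Month,70.5,Yes
0002-BBBB,34,One Year,56.9,No
0003-CCCC,60,Two Year,25.0,No
"""


def load_rows(text):
    """Parse CSV text, lowercase the column names, and encode churn as 0/1."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = {key.lower(): value for key, value in raw.items()}
        # Churn=Yes becomes 1, anything else (Churn=No) becomes 0.
        row["churn"] = 1 if row["churn"] == "Yes" else 0
        rows.append(row)
    return rows


rows = load_rows(SAMPLE_CSV)
print([row["churn"] for row in rows])  # prints [1, 0, 0]
```

With the real file, the same function could be fed the contents of the downloaded CSV instead of `SAMPLE_CSV`.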
We built a demographics dashboard in Tableau to visualize how demographic attributes are distributed by churn:
Telco customer churn Tableau demographics dashboard
We segmented customers using the customer value matrix, which analyzes customer loyalty and profitability, to check whether segmentation helps explain customer churn.
The following graph shows the segmentation of customers who left (churn = yes):
Learning: More than half (∼55%) of customers leaving are 'butterflies' 🦋 (= highly profitable) so assessing customer churn is crucial to increasing Telco profit and improving business.
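The segmentation idea can be sketched as a simple quadrant rule. This is a minimal sketch under stated assumptions: tenure stands in for loyalty, monthly charges for profitability, the cut-offs are the sample means, and the quadrant names other than 'butterfly' follow the usual customer value matrix terminology rather than anything stated in this analysis. The customer values are made up.

```python
from statistics import mean

# Hypothetical customers as (tenure in months, monthly charges); values are invented.
customers = [(2, 95.0), (60, 100.0), (3, 20.0), (55, 25.0)]


def segment(tenure, monthly, tenure_cut, charge_cut):
    """Assign a customer value matrix quadrant from loyalty and profitability."""
    loyal = tenure >= tenure_cut
    profitable = monthly >= charge_cut
    if profitable and not loyal:
        return "butterfly"    # highly profitable, short tenure
    if profitable and loyal:
        return "true friend"  # highly profitable, long tenure
    if loyal:
        return "barnacle"     # low profitability, long tenure
    return "stranger"         # low profitability, short tenure


# Assumed thresholds: the mean of each axis over the sample.
tenure_cut = mean(t for t, _ in customers)
charge_cut = mean(m for _, m in customers)

labels = [segment(t, m, tenure_cut, charge_cut) for t, m in customers]
print(labels)  # prints ['butterfly', 'true friend', 'stranger', 'barnacle']
```

Counting the labels among churned customers only would give the kind of share reported in the learning above.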
Analyzing billing features like payment methods, we created a heatmap that shows there is room for improvement in the 'electronic check' payment method as most of the customers leaving (churn = yes) are using this method.
Tableau: Payment Method heatmap
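The figure behind this heatmap, each payment method's share of the customers who left, can be computed with a small helper like the one below. The records are made up and the function name `churner_share_by_method` is hypothetical; it only illustrates the calculation, not the Tableau workbook.

```python
from collections import Counter

# Invented (payment method, churn) pairs for illustration only.
records = [
    ("Electronic Check", 1), ("Electronic Check", 1), ("Electronic Check", 0),
    ("Mailed Check", 1), ("Mailed Check", 0),
    ("Credit Card", 0), ("Credit Card", 0),
    ("Bank Withdrawal", 1), ("Bank Withdrawal", 0), ("Bank Withdrawal", 0),
]


def churner_share_by_method(records):
    """Return each payment method's share of all churned customers."""
    churned = Counter(method for method, churn in records if churn == 1)
    total = sum(churned.values())
    if total == 0:
        return {}  # no churned customers, so no shares to report
    return {method: count / total for method, count in churned.items()}


shares = churner_share_by_method(records)
for method, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{method}: {share:.0%}")
```

Methods with no churned customers (here, Credit Card) simply do not appear in the result.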
We looked for potential correlations through a heat map and also aimed to identify the key features influencing the target column, churn.
Heatmap:
Correlations to the Target churn:
We can confirm that the top correlations to churn (customers leaving) are:
- Contract type (∼0.40)
- Internet service (∼0.32)
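To make the correlation step concrete, here is a minimal sketch of a Pearson correlation between an encoded contract type and churn. The ordinal encoding in `CONTRACT_CODES` is an assumption (the analysis does not state how categories were encoded), and the data is made up. Note that the sign of the result depends on the encoding, so the values listed above are presumably magnitudes.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Assumed ordinal encoding: longer contracts get larger codes.
CONTRACT_CODES = {"Month-to-Month": 0, "One Year": 1, "Two Year": 2}

# Invented sample: contract type and churn flag per customer.
contracts = ["Month-to-Month", "Month-to-Month", "One Year",
             "Two Year", "Month-to-Month", "Two Year"]
churn = [1, 1, 0, 0, 0, 0]

encoded = [CONTRACT_CODES[c] for c in contracts]
r = pearson(encoded, churn)
print(f"correlation between contract length and churn: {r:.2f}")
```

In this toy sample the coefficient is negative, since longer contracts go with less churn.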
Bivariate Analysis:
We plotted the key features (Contract type & Internet service) with the highest correlation to the target 'churn'.
---
Settings & Data Overview
ML Models tested
No Upsampling models confusion matrix results:
Since the 'false positives' counts were low, we decided to upsample the data with SMOTE (Synthetic Minority Over-sampling Technique).
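The core idea of SMOTE is to create synthetic minority-class samples by interpolating between a minority point and one of its nearest minority neighbors. The sketch below is a simplified, standard-library illustration of that idea, not the implementation used in this analysis (which is not stated here; in practice a library such as imbalanced-learn's `SMOTE` class would typically be used). The function name `smote_like`, its parameters, and the data points are all hypothetical.

```python
import random
from math import dist


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        # The k nearest minority neighbors of the base point, excluding itself.
        others = [p for i, p in enumerate(minority) if i != base_idx]
        neighbors = sorted(others, key=lambda p: dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbor))
        )
    return synthetic


# Invented minority-class (churned) points: (tenure, monthly charges).
churned = [(1.0, 70.0), (2.0, 80.0), (3.0, 90.0), (5.0, 75.0)]
new_points = smote_like(churned, n_new=3)
print(new_points)
```

Because every synthetic point lies on a segment between two real minority points, the new samples stay within the range of the observed churned customers. Upsampling should only be applied to the training split so that the evaluation data keeps its original class balance.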
Upsampling models confusion matrix results:
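For readers comparing the confusion matrices before and after upsampling, the following sketch shows how the four cells are counted and how precision and recall for the churn class follow from them. The labels and predictions are made up; they are not results from the models tested here.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = churn)."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 0 and predicted == 1:
            fp += 1
        elif actual == 0 and predicted == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def precision_recall(tp, fp, fn):
    """Precision and recall for the positive class, 0.0 when undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Invented example: 3 churned and 5 retained customers.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 1, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
precision, recall = precision_recall(tp, fp, fn)
print(f"TP={tp} FP={fp} TN={tn} FN={fn}")
print(f"precision={precision:.2f} recall={recall:.2f}")
```

On imbalanced data, a model can score well on accuracy while recall for the churn class stays low, which is why the per-cell counts are worth inspecting.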
Data Exploration:
- Demographics are not key features to understand customer churn.
- Potential target audience for improvement: single people with no kids/dependents.
- Customer segmentation highlights the importance of assessing customer churn.
- More than half of our customers leaving are highly profitable (spending more than average).
- 57% of customers leaving use electronic check as a payment method (= Room for improvement).
Correlation:
- Key features related to customer churn: contract type and internet service type.
- Contract type has a moderate relationship with churn. Focus on pitching longer-term contracts (two years) which have the lowest churn rate.
Machine Learning Models:
- Upsampling is crucial for assessing machine learning models due to significant data imbalance (70/30 split).
- Random Forest classifier using SMOTE upsampling is the best option for building an accurate model to predict customer churn.
