- Project Overview
- Business Questions
- Dataset
- Tech Stack
- Project Structure
- Key Findings
- Methodology
- Tableau Dashboard
- How to Run
- Skills Demonstrated
- Connect
This project delivers a full-cycle data analytics pipeline on a dataset of 5,200 social media advertising campaigns spanning 5 channels, 5 campaign goals, and 4 customer segments. The analysis moves from raw data through exploratory analysis, statistical testing, and feature engineering, culminating in an interactive Tableau dashboard built for business decision-making.
The central question: what combination of channel, goal, and audience maximises return on advertising spend?
| # | Question | Method |
|---|---|---|
| 1 | Which channel delivers the highest average ROI? | Group aggregation, ranked bar chart |
| 2 | Which channel-goal combination is most efficient? | Pivot heatmap, ANOVA |
| 3 | What separates high-ROI campaigns from low-ROI ones? | Efficiency quadrant, correlation analysis |
| 4 | Which customer segment converts best? | Segment breakdown, statistical testing |
| 5 | Are there time patterns in campaign performance? | Monthly trend, day-of-week, hour-of-day analysis |
| 6 | Which companies consistently outperform on ROI? | Company-level aggregation, tier classification |
| Property | Value |
|---|---|
| Source | Kaggle — Social Media Advertising Dataset |
| Rows | 5,200 campaigns |
| Columns | 19 features |
| Date range | Full calendar year |
| File | data/social_media_ads.csv |
| Column | Type | Description |
|---|---|---|
| Campaign_ID | String | Unique campaign identifier |
| Target_Audience | String | Audience demographic |
| Campaign_Goal | String | Business objective of the campaign |
| Duration | Integer | Campaign length in days |
| Channel_Used | String | Social media platform |
| Conversion_Rate | Float | Ratio of conversions to impressions |
| Acquisition_Cost | Float | Cost per acquired customer ($) |
| ROI | Float | Return on investment multiplier |
| Location | String | Geographic market |
| Language | String | Campaign language |
| Clicks | Integer | Total clicks |
| Impressions | Integer | Total impressions |
| Engagement_Score | Float | Platform engagement score (0–10) |
| Customer_Segment | String | Customer value tier |
| Date | Date | Campaign date |
| Company | String | Advertiser company |
| hour | Integer | Hour of day (0–23) |
| day | Integer | Day of week (0–6) |
| month | Integer | Month (1–12) |
| Feature | Formula | Purpose |
|---|---|---|
| CTR | Clicks / Impressions | Click efficiency |
| Cost_per_Click | Acquisition_Cost / Clicks | Cost efficiency |
| ROI_per_Day | ROI / Duration | Time-adjusted return |
social-media-ads-analytics/
│
├── README.md
├── requirements.txt
│
├── data/
│ ├── social_media_ads.csv ← Raw dataset
│ └── data_dictionary.md ← Full column reference
│
├── notebooks/
│ ├── 01_data_exploration.ipynb ← Schema, dtypes, missing values
│ ├── 02_cleaning_feature_eng.ipynb ← Cleaning + derived KPIs
│ ├── 03_EDA_visualizations.ipynb ← All charts and visual analysis
│ └── 04_statistical_analysis.ipynb ← ANOVA, Pearson, hypothesis tests
│
├── scripts/
│ ├── social_media_ads_EDA.py ← Full EDA pipeline script
│ └── social_media_ads_analysis.py ← Analytics pipeline script
│
├── tableau/
│ ├── dashboard_screenshots/
│ ├── overview.png
│ ├── efficiency_quadrant.png
│ ├── monthly_trend.png
│ └── channel_heatmap.png
│
│
├── outputs/
│ ├── channel_goal_summary.csv ← Exported aggregation table
Instagram campaigns average 5.81x ROI, outperforming the dataset average by 11.5%. Pinterest sits at the bottom at 4.62x.
Across all channels, Product Launch consistently outperforms other goals. The Instagram + Product Launch combination delivers 6.1x ROI — the peak of the entire dataset.
The efficiency quadrant analysis reveals that the highest-ROI campaigns do not necessarily have the highest acquisition costs. Several campaigns achieve top-quartile ROI with below-median spend — indicating targeting quality matters more than budget.
High-Value segment conversion rate: 13.4% Casual segment conversion rate: 7.2% This 86% gap suggests significant opportunity in audience targeting strategy.
Mid-week (Tuesday–Thursday) shows consistently higher engagement scores. Weekend engagement drops by approximately 32% compared to the weekly peak.
ROI shows a clear upward trend from January (4.8x) to November (5.8x), suggesting campaign optimisation over time or seasonal effects.
- Schema inspection, data type validation
- Missing value analysis across all 19 columns
- Descriptive statistics (mean, median, std, skewness, kurtosis)
- Handled zero-value edge cases in Clicks and Impressions using
np.nanreplacement - Engineered 3 derived KPIs: CTR, Cost_per_Click, ROI_per_Day
- Parsed Date column and extracted temporal features
- Univariate distributions for all numeric features
- Categorical breakdowns for Channel, Goal, Audience, Segment
- Correlation matrix across all numeric features
- Channel performance aggregation
- Efficiency quadrant scatter (Acquisition Cost vs ROI, sized by Engagement)
- Temporal analysis — monthly, day-of-week, hour-of-day
- Company performance ranking
- One-way ANOVA — tested whether mean ROI differs significantly across channels
- F-statistic and p-value reported
- Null hypothesis: all channel means are equal
- Pearson correlation — tested relationship between Acquisition Cost and ROI
- r value and p-value reported
The interactive Tableau dashboard was built with 9 sheets assembled into a dashboard view.
- 5 KPI scorecards with monthly sparklines and delta vs average
- Channel ROI ranked bar chart (metric switchable via parameter)
- Channel × Campaign Goal heatmap
- Efficiency Quadrant scatter plot with quadrant annotations
- Dual-axis monthly trend line (ROI + Conversion Rate)
- Customer segment donut chart
- Day-of-week engagement bar chart
- Top companies table with dynamic tier badges
- Campaign health score progress bar
- Top insight callout (auto-generated from filter state)
- 5 action filters for cross-sheet interactivity
- URL actions linking to Kaggle and GitHub profiles
22 calculated fields including LOD expressions (FIXED), window functions (WINDOW_MAX, WINDOW_MIN), parameter-driven metric switching, and dynamic tier classification.
View the live dashboard → Tableau Public
git clone https://github.com/yourusername/social-media-ads-analytics.git
cd social-media-ads-analyticspip install -r requirements.txtpython scripts/social_media_ads_EDA.pyjupyter notebook notebooks/Run notebooks 01 → 02 → 03 → 04 in sequence.
Open outputs/social_media_ads_report.html in any browser — no installation needed.
| Skill | Where demonstrated |
|---|---|
| Python data manipulation | Pandas groupby, pivot, merge, apply across 5,200 rows |
| Feature engineering | 3 derived KPIs from raw columns |
| Data visualisation | 15+ chart types using Matplotlib and Seaborn |
| Statistical analysis | One-way ANOVA, Pearson correlation, hypothesis testing |
| Business intelligence | KPI design, efficiency scoring, tier classification |
| Tableau | 22 calculated fields, LOD expressions, dual-axis charts, action filters |
| Data storytelling | Finding-driven titles, insight callouts, annotated charts |
| Documentation | Data dictionary, workbook build guide, structured README |
| Platform | Link |
|---|---|
| Kaggle | kaggle.com/Taofeek OLADIGBOLU |
| Tableau Public | public.tableau.com/Taofeek OLADIGBOLU |
| linkedin.com/Taofeek OLADIGBOLU | |
| GitHub | github.com/Taofeek OLADIGBOLU |
If you found this project useful, please consider giving it a star on GitHub, a upvote on Kaggle, and a favorite on Tableau Public.
