craftedbygaby/pigebank

Pig E Bank – Data Ethics & Applied Analytics

Project: CareerFoundry Data Immersion Program – Achievement 5
Objective: Provide analytical support to a global bank's anti-money-laundering compliance department, assessing client and transaction risk while handling real-world data ethics challenges.


Key Questions

  • What factors contribute most to a client's likelihood of closing their account?
  • How can we identify and mitigate bias in client risk models?
  • How do ethical considerations affect the collection, use, and sharing of client data?
  • How can time-series forecasting support compliance reporting and decision-making?

Data (refer to the Project Brief for links)

  • Client dataset covering demographics, account activity, and product usage
  • Dataset cleaned and analyzed in Microsoft Excel
  • Decision tree model built on this dataset to predict account churn risk

Tools & Skills

  • Microsoft Excel for data cleaning, descriptive statistics, and time-series analysis
  • Decision tree modeling for churn prediction
  • Linear regression for predictive analysis
  • Time-series analysis & forecasting (moving averages)
  • Data ethics frameworks: bias identification, security & privacy considerations
  • GitHub for portfolio hosting and version control

Methodology

  1. Big Data Exploration: Identified characteristics of structured vs. unstructured data and limitations of big data approaches
  2. Data Ethics – Bias: Identified sources of bias in the client dataset and proposed mitigation strategies
  3. Data Ethics – Security & Privacy: Analyzed ethical dilemmas and relevant data protection considerations
  4. Data Mining: Cleaned client data, computed descriptive statistics, and built a decision tree to model churn outcomes
  5. Predictive Analysis: Applied linear regression concepts to client risk scenarios
  6. Time-Series Analysis: Created moving averages in Excel and explored forecasting models
  7. Reporting & Portfolio: Documented findings and hosted work on GitHub as part of a data analytics portfolio
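The moving averages in step 6 were created in Excel. For readers who prefer code, the same calculation can be sketched in Python. This is an illustrative equivalent, not the project's workbook: the function name `moving_average`, the window size, and the monthly counts are all assumptions made up for the example.

```python
def moving_average(values, window):
    """Return the simple moving average of `values` over `window` points."""
    if window < 1:
        raise ValueError("window must be at least 1")
    if window > len(values):
        return []
    # Start with the sum of the first full window, then slide it forward
    # by adding the newest value and dropping the oldest one.
    running_total = sum(values[:window])
    averages = [running_total / window]
    for i in range(window, len(values)):
        running_total += values[i] - values[i - window]
        averages.append(running_total / window)
    return averages


# Hypothetical monthly activity counts, invented for illustration only.
monthly_counts = [120, 135, 128, 150, 162, 158]

smoothed = moving_average(monthly_counts, window=3)

# A naive one-step-ahead forecast: reuse the most recent moving average.
naive_forecast = smoothed[-1]
```

A longer window smooths more aggressively but reacts more slowly to recent changes, so the window size is a judgment call that depends on how noisy the series is.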

Deliverables

  • 📊 Client Data Set Analysis (Excel) — available in this repository
  • 🌳 Decision Tree – Churn Model (PDF) — available in this repository

Insights

  • Key churn risk factors identified include account age, activity status, number of products held, and client demographics
  • The decision tree model highlights how risk compounds when multiple churn indicators are present together
  • Bias analysis revealed the importance of reviewing demographic variables in compliance models
  • Time-series patterns in client activity provide a basis for proactive risk forecasting
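To make the decision tree insight more concrete, the sketch below shows one common way a tree decides which factor to split on: comparing weighted Gini impurity across candidate features. The README does not state which splitting criterion the project used, so Gini is an assumption here, and the field names (`is_active`, `num_products`, `exited`) and the eight sample rows are hypothetical, not taken from the client dataset.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


def weighted_gini_after_split(rows, feature, label):
    """Weighted Gini impurity after grouping rows by each value of `feature`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row[label])
    total = len(rows)
    return sum(len(g) / total * gini(g) for g in groups.values())


# Hypothetical client rows, invented for illustration only.
clients = [
    {"is_active": True, "num_products": 1, "exited": False},
    {"is_active": True, "num_products": 2, "exited": False},
    {"is_active": True, "num_products": 1, "exited": False},
    {"is_active": True, "num_products": 2, "exited": True},
    {"is_active": False, "num_products": 1, "exited": True},
    {"is_active": False, "num_products": 2, "exited": True},
    {"is_active": False, "num_products": 1, "exited": True},
    {"is_active": False, "num_products": 2, "exited": False},
]

parent_impurity = gini([row["exited"] for row in clients])

# The feature with the lowest weighted impurity separates churners best.
candidates = ["is_active", "num_products"]
scores = {f: weighted_gini_after_split(clients, f, "exited") for f in candidates}
best_feature = min(scores, key=scores.get)
```

In this made-up sample, splitting on `is_active` lowers impurity while splitting on `num_products` does not, which is the kind of comparison a tree repeats at every node.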

Recommendations

  • Apply churn model to flag at-risk clients for proactive retention outreach
  • Review model variables regularly to control for demographic bias
  • Strengthen data governance practices around client PII in compliance workflows
  • Use time-series forecasting to anticipate volume changes in compliance workloads
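One simple way to act on the bias recommendation is to compare how often the model flags clients in each demographic group. The sketch below is a minimal illustration under stated assumptions: the keys `group` and `flagged`, the helper names, and the sample records are hypothetical, and the ratio it computes is only a screening signal, not a full fairness assessment.

```python
def flag_rates(records, group_key, flag_key):
    """Return the share of flagged records for each value of `group_key`."""
    totals = {}
    flagged = {}
    for record in records:
        group = record[group_key]
        totals[group] = totals.get(group, 0) + 1
        flagged[group] = flagged.get(group, 0) + (1 if record[flag_key] else 0)
    return {group: flagged[group] / totals[group] for group in totals}


def rate_ratio(rates):
    """Ratio of the lowest to the highest flag rate (1.0 means equal rates)."""
    highest = max(rates.values())
    lowest = min(rates.values())
    if highest == 0:
        return 1.0  # Nobody is flagged in any group, so rates are equal.
    return lowest / highest


# Hypothetical model output, invented for illustration only.
records = [
    {"group": "A", "flagged": True},
    {"group": "A", "flagged": True},
    {"group": "A", "flagged": False},
    {"group": "A", "flagged": False},
    {"group": "B", "flagged": True},
    {"group": "B", "flagged": False},
    {"group": "B", "flagged": False},
    {"group": "B", "flagged": False},
]

rates = flag_rates(records, "group", "flagged")
ratio = rate_ratio(rates)
```

A ratio well below 1.0 suggests the groups are being flagged at different rates and the model variables deserve a closer look. What counts as an acceptable ratio is a policy decision rather than something the code can settle.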

Author: Gabriela Cascione

About

Churn prediction model built on customer behavioral and demographic data using decision trees. Identifies at-risk customers and surfaces key characteristics driving account closures, with attention to data ethics in a financial compliance context.
