- A chart showing the changes in outstanding account value for accounts that eventually defaulted on their loans.
- A map showing the loans contained in the dataset.
- Graphs showing the interest rates of the loans. The top graph shows the range and the bottom graph shows the average.
- A heatmap showing the distribution of data across loan channel, property ownership, and type of property.
- Code sample showing my decision tree.
Inspiration
I wanted to apply the machine learning concepts I've learned in my classes, and the dataset from Risk Span seemed interesting. I hope to go into data science, and risk management is a large area of that field. I am also a business minor and am currently taking accounting. We have not yet covered mortgages or loans in much depth, but I find I understand financial information well and am always interested in learning more.
What it is
My final result is a Jupyter Notebook hosted on Google Colaboratory, accompanied by some visualizations created in Tableau. The notebook details the process I went through during Bitcamp 2019.
How I built it
I started off naively hoping I could intuitively understand the mortgage terminology in the data. This was a mistake, as I spent far too much time looking at the data without context for the numbers I was seeing. I did some cursory research and decided to start with a decision tree classifier. I soon realized the data had limited variability and doubted the high accuracy my classifier was reporting, but I did identify some of the most important factors to consider. I used this information to guide the data visualizations I created in Tableau.
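A minimal sketch of one way to sanity-check a suspiciously high accuracy when the labels have little variability: compare the model against a baseline that always predicts the most common class, and look at how many of the rare cases it actually catches. The labels, predictions, and function names below are made up for illustration. They are not taken from the hackathon dataset or from the notebook.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def majority_baseline_accuracy(y_true):
    """Accuracy obtained by always predicting the most common label."""
    most_common_count = Counter(y_true).most_common(1)[0][1]
    return most_common_count / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Share of actual positives that were predicted as positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_for_positives:
        return 0.0
    hits = sum(p == positive for p in preds_for_positives)
    return hits / len(preds_for_positives)


# Hypothetical labels: 1 = loan defaulted, 0 = loan did not default.
# 95 of the 100 made-up loans do not default.
y_true = [0] * 95 + [1] * 5
# A made-up model that flags only one of the five defaults.
y_pred = [0] * 95 + [1, 0, 0, 0, 0]

print("model accuracy:   ", accuracy(y_true, y_pred))
print("baseline accuracy:", majority_baseline_accuracy(y_true))
print("default recall:   ", recall(y_true, y_pred))
```

In this toy case the model's accuracy is barely above the baseline even though it misses most of the defaults, which is the kind of gap that a single accuracy number can hide.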
Challenges I ran into
Time was the largest obstacle, even more so than at other hackathons I have been to, as I was working alone. I tried to find others who were interested in machine learning and looking for teammates, but that search was not successful.
Accomplishments that I'm proud of
I am very proud that I managed to submit a project alone, especially considering I had no prior knowledge of mortgage loans and spent a decent amount of time doing research. I am also proud of the visualizations I made with Tableau, as it was my first time using this software.
What I learned
I learned how to use Google Colaboratory and some Tableau basics, and I experienced the struggle of submitting a project alone. I must say I far prefer the team experience.
What's next for Risk Data Analysis
I hope to use this dataset to gain experience with more complex machine learning algorithms that I have so far only learned about conceptually, such as neural networks and unsupervised learning techniques.