FailSafe

Inspiration

We wanted to apply a data set to a predictive model that estimates how students are likely to perform in the coming semester. The idea came from a data mining course our group is taking this semester: we wanted to apply what we had learned to a real-world example.

What it does

Our software loads the data into data frames and builds two predictive tools. The first mines the data for interesting rules and patterns; association rules like these help identify likely outcomes for new students in the system. The second is clustering: we used two clusters, representing students predicted to pass and students predicted to fail.
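To make the association-rule idea concrete, here is a minimal sketch in plain Python. The attribute names and the five toy records are hypothetical, not taken from our data set, and the brute-force enumeration below stands in for whatever itemset miner a real pipeline would use. A rule "A -> c" is kept when the itemset is frequent enough (support) and c appears in a large enough share of the records that contain A (confidence).

```python
from itertools import combinations

# Hypothetical toy records: each set holds the attributes observed for one student.
records = [
    {"low_attendance", "no_tutoring", "fail"},
    {"low_attendance", "no_tutoring", "fail"},
    {"low_attendance", "tutoring", "pass"},
    {"high_attendance", "no_tutoring", "pass"},
    {"high_attendance", "tutoring", "pass"},
]


def support(itemset, records):
    """Fraction of records that contain every item in itemset."""
    return sum(itemset <= r for r in records) / len(records)


def mine_rules(records, min_support=0.4, min_confidence=0.8):
    """Return (antecedent, consequent, support, confidence) tuples.

    Enumerates every 2- and 3-item set by brute force, which is only
    practical for a handful of attributes.
    """
    items = sorted(set().union(*records))
    rules = []
    for size in (2, 3):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            s = support(itemset, records)
            if s < min_support:
                continue  # not a frequent itemset
            for consequent in combo:
                antecedent = itemset - {consequent}
                confidence = s / support(antecedent, records)
                if confidence >= min_confidence:
                    rules.append((antecedent, consequent, s, confidence))
    return rules


rules = mine_rules(records)
for antecedent, consequent, s, confidence in rules:
    print(sorted(antecedent), "->", consequent, f"support={s:.2f} confidence={confidence:.2f}")
```

On this toy data the rule "low_attendance, no_tutoring -> fail" survives the thresholds, while "low_attendance -> fail" does not, because one low-attendance student passed.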

How I built it

We used Python, pandas, and NumPy to build data frames. From these data frames, we built clusters and a confusion matrix. We also built frequent itemsets and derived interesting rules using a defined confidence threshold.
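The clustering and confusion-matrix steps can be sketched roughly as follows. This is an illustration under stated assumptions, not our actual code: the two features, the six sample rows, and the helper names `two_means` and `confusion_matrix` are all hypothetical, and k-means is assumed as the clustering method. The clusters are seeded from the lowest- and highest-scoring rows so that cluster 0 can be read as "predicted to fail" and cluster 1 as "predicted to pass".

```python
import numpy as np


def two_means(X, n_iter=20):
    """Plain k-means with k=2, seeded with the lowest- and highest-sum rows."""
    totals = X.sum(axis=1)
    centers = X[[totals.argmin(), totals.argmax()]].astype(float)
    for _ in range(n_iter):
        # Distance from every row to each center, shape (n_samples, 2).
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):  # leave a center in place if its cluster is empty
                centers[k] = X[labels == k].mean(axis=0)
    return labels


def confusion_matrix(actual, predicted):
    """2x2 counts: rows are actual classes, columns are predicted classes."""
    cm = np.zeros((2, 2), dtype=int)
    for a, p in zip(actual, predicted):
        cm[a, p] += 1
    return cm


# Hypothetical features: [attendance rate, average quiz score]; 1 = passed, 0 = failed.
X = np.array([[0.95, 0.88], [0.90, 0.75], [0.85, 0.80],
              [0.40, 0.35], [0.55, 0.42], [0.30, 0.50]])
actual = np.array([1, 1, 1, 0, 0, 0])

predicted = two_means(X)
cm = confusion_matrix(actual, predicted)
accuracy = np.trace(cm) / cm.sum()
print(cm)
print("accuracy:", accuracy)
```

The toy rows are deliberately well separated; real student data is noisier, so the off-diagonal cells of the confusion matrix are where the model's mistakes would show up.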

Challenges I ran into

Pruning the data set to reduce noise was fairly difficult. The hardest part of data mining is determining which data hurts your predictions more than it helps. Getting a good grasp of the entire data set was crucial to understanding the data and how to set up our models.

Accomplishments that I'm proud of

We were able to create multiple predictive models. Additionally, we produced a handful of histograms and other visual tools to create a truly exceptional data analysis presentation.

What I learned

We learned that real-world data is much harder to understand and use than classroom examples. The data is often muddled and hard to read until the noise has been reduced.

What's next for FailSafe

Ideally, we would like to train the model on more data. Another useful feature would take a student's current class grades and the points already scored in the class to estimate how hard it will be for the student to pass from their current position.
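One simple way to frame that second feature is to compute the share of the remaining points a student still needs. The sketch below is only an illustration of the arithmetic: the function name `required_share`, its parameters, and the 60% pass mark are assumptions, not part of the current project.

```python
def required_share(points_earned, points_graded, points_total, pass_mark=0.6):
    """Share of the remaining points a student must earn to reach pass_mark.

    Returns 0.0 if the student has already secured a pass and None if
    passing is no longer possible.
    """
    if not 0 <= points_earned <= points_graded <= points_total:
        raise ValueError("expected 0 <= earned <= graded <= total")
    needed = pass_mark * points_total - points_earned
    remaining = points_total - points_graded
    if needed <= 0:
        return 0.0
    if needed > remaining:
        return None
    return needed / remaining


# A student with 250 of the 500 points graded so far, in a 1000-point class.
share = required_share(250, 500, 1000)
print(f"needs {share:.0%} of the remaining points")
```

Here the student needs 350 of the 500 points still available. A value close to 1.0 would flag a student who has almost no margin left.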

Submitted to

HackPSU Fall 2017

Created by

I did the initial statistical analysis of the data, preprocessed the data for clustering, clustered the data, and assessed the predictive power of the model.

jwa5426
I created the presentation for the team and also advised on some of the data mining aspects of the project.

dmp5658
Devon Phillips
dht5043

Updates

Devon Phillips started this project — Nov 05, 2017 12:24 PM EST
