What it does

The submitted model was trained specifically for Cyclica's challenge question. The submission contains the code for the deep learning algorithm we developed and our predictions for the unlabeled dataset.

How we built it

We tried multiple models: MLP, logistic regression, linear, random forest, voting, etc. In the end we used a random forest, which reached an F1 score of 71.89.
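
The comparison described above can be sketched roughly as follows. This is a minimal illustration, not our actual training code: the data is a synthetic stand-in generated with make_classification, and the hyperparameters and the names candidates, scores, and best_name are assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the challenge data (assumption: binary labels,
# roughly 10% positive). The real dataset is not reproduced here.
X, y = make_classification(
    n_samples=2000, n_features=20, weights=[0.9, 0.1], random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Candidate models; all hyperparameters are illustrative guesses.
candidates = {
    "logistic": LogisticRegression(max_iter=1000),
    "mlp": MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}
# Soft-voting ensemble over fresh copies of the same three model types.
candidates["voting"] = VotingClassifier(
    estimators=[
        ("logistic", LogisticRegression(max_iter=1000)),
        ("mlp", MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)),
        ("random_forest", RandomForestClassifier(n_estimators=100, random_state=0)),
    ],
    voting="soft",
)

# Fit each candidate and score it with F1 on the held-out validation split.
scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    scores[name] = f1_score(y_val, model.predict(X_val))

best_name = max(scores, key=scores.get)
print(scores, "best:", best_name)
```

Scoring on a stratified hold-out split keeps the positive rate similar in both splits, which matters when F1 is the selection metric.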

Challenges we ran into

Imbalanced data (far more negative samples than positive ones).
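
One common way to counter this kind of imbalance is class weighting. The sketch below is an assumption for illustration only and does not claim to be the approach used in the submission; it again uses synthetic data, and class_weights and val_f1 are names invented for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.utils.class_weight import compute_class_weight

# Synthetic imbalanced data (assumption: about 10% positive samples).
X, y = make_classification(
    n_samples=2000, n_features=20, weights=[0.9, 0.1], random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Inspect the weights that "balanced" mode assigns: each class is weighted
# inversely to its frequency, so the rare positive class counts for more.
classes = np.unique(y_train)
weights = compute_class_weight(class_weight="balanced", classes=classes, y=y_train)
class_weights = dict(zip(classes.tolist(), weights.tolist()))
print("class weights:", class_weights)

# Random forest with the same balanced weighting applied during training.
model = RandomForestClassifier(
    n_estimators=100, class_weight="balanced", random_state=0
)
model.fit(X_train, y_train)
val_f1 = f1_score(y_val, model.predict(X_val))
print("validation F1:", val_f1)
```

Resampling the training set or tuning the decision threshold are alternatives worth comparing against class weighting.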

Accomplishments that we're proud of

We built and evaluated multiple models to tackle the problem, then chose the best approach for the given dataset.

What we learned

We learned fundamental skills in PyTorch, sklearn, pandas, and other libraries. We also gained experience in developing and training deep learning models. Last but not least, we learned that not every model suits a given dataset and that pre-processing the data beforehand is essential.

What's next for Data-Hackthon-2023

We want to develop the model further or train a new one for more accurate predictions. We also want to try the challenge questions we did not have time to complete.
