Introduction to Machine Learning is a summer project offered by the Programming Club of IIT Kanpur.
The given UP crime dataset was created for a project at IIT Kanpur.
The task is to create a simple visualization notebook and find relations between the reasons for the crimes. The dataset is preprocessed so that nulls and strings are all converted to a standard representation. Some graphs are plotted: reason-wise, number-of-people-wise, and city-wise. Relations between the reasons for the crimes are then deduced from the given data.
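The preprocessing and counting steps above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the notebook's actual code. The column names (`reason`, `city`), the toy rows, and the set of null markers are assumptions made for the example.

```python
from collections import Counter

# Assumed spellings of "missing" in the raw data; adjust to the real dataset.
NULL_MARKERS = {"", "na", "n/a", "nan", "null", "none", "-"}


def normalize(value):
    """Return a trimmed, lower-cased string, or None for null-like values."""
    if value is None:
        return None
    text = str(value).strip().lower()
    return None if text in NULL_MARKERS else text


def preprocess(rows):
    """Normalize every field of every row (each row is a dict)."""
    return [{key: normalize(val) for key, val in row.items()} for row in rows]


def count_by(rows, column):
    """Count rows per distinct value of `column`, skipping nulls."""
    return Counter(row[column] for row in rows if row[column] is not None)


# Toy rows with hypothetical column names, only to show the behaviour.
raw = [
    {"reason": " Theft ", "city": "Kanpur"},
    {"reason": "theft", "city": "LUCKNOW"},
    {"reason": "N/A", "city": "kanpur "},
]
clean = preprocess(raw)
reason_counts = count_by(clean, "reason")
city_counts = count_by(clean, "city")
```

Counts like `reason_counts` and `city_counts` are the kind of input a reason-wise or city-wise bar chart needs.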
Predicting the Taxi-Out Delay. Given the dataset, predict the runway time of the flight.
Performed a 90:10 split for training and testing.
- Label encoded the required columns. The target (y) variable is the TAXI-OUT time. Ran all 8 algorithms listed below on the dataset, scoring each by RMSE (Root Mean Square Error).
- One-hot encoded the data and repeated the process.
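The 90:10 split and the two encodings can be sketched as below. This is a dependency-free illustration, not the notebook's code; the notebook may well have used library helpers such as scikit-learn's `train_test_split`, `LabelEncoder`, and `OneHotEncoder` instead. The function names and the airline-code column are hypothetical.

```python
import random


def train_test_split_90_10(rows, seed=0):
    """Shuffle a copy of `rows` and return (train, test) in a 90:10 ratio."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 9 // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


def label_encode(values):
    """Map each category to an integer, in sorted order of the categories."""
    mapping = {cat: idx for idx, cat in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


def one_hot_encode(values):
    """Map each category to a 0/1 vector with one position per category."""
    categories = sorted(set(values))
    vectors = [[1 if v == cat else 0 for cat in categories] for v in values]
    return vectors, categories


# Hypothetical categorical column (airline codes), only for illustration.
airlines = ["AA", "DL", "AA", "UA"]
labels, mapping = label_encode(airlines)
one_hot, categories = one_hot_encode(airlines)

# Split 100 dummy row indices into 90 training and 10 test rows.
train, test = train_test_split_90_10(range(100))
```

Label encoding keeps one column but imposes an arbitrary numeric order on the categories, which distance-based and linear models treat as meaningful. One-hot encoding avoids that at the cost of one column per category.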
Models used: Linear Regression, Ridge Regression (L2 regularization), Lasso Regression (L1 regularization), KNN, SVR, Naive Bayes, Random Forest, LightGBM (tree-based model).
The notebook shows whether label encoding or one-hot encoding works better for each model, and which of the 8 algorithms performs best. Graphs are plotted to understand the results on bigger datasets.
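The comparison amounts to computing the RMSE of each model's predictions on the test set and picking the lowest. A minimal sketch, assuming the predictions are already available as plain lists; the helper names, the model names `model_a` and `model_b`, and all numbers are made up for illustration and are not results from the project.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def best_model(y_true, predictions):
    """Return (name, score) of the entry in `predictions` with the lowest RMSE."""
    scores = {name: rmse(y_true, pred) for name, pred in predictions.items()}
    name = min(scores, key=scores.get)
    return name, scores[name]


# Made-up taxi-out times (minutes) and predictions; not project results.
y_test = [10.0, 20.0, 30.0]
predictions = {
    "model_a": [12.0, 18.0, 33.0],
    "model_b": [10.0, 21.0, 30.0],
}
name, score = best_model(y_test, predictions)
```

Running the same comparison once with label-encoded features and once with one-hot encoded features gives two RMSE values per model, which is what the encoding comparison needs.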