GitHub - jishmisc28/HFresh_RecipeScore_Predictor

Hello Fresh RecipeScore Predictor

## Food Recipe Score Prediction Challenge

Problem Statement

A recipe score, or rating, is the comparative ranking that a user gives to a specific recipe. If customers like a recipe, we expect them to rate it high; if they do not like it, we expect them to rate it low. Here we build a model that predicts the average recipe score from the data provided in the input datasets. The solution concentrates on two parts: 1. Problem Modeling 2. Production-Ready Implementation

Solution Approach

1. Problem Modeling: We work with Python's scikit-learn library and carry out a comprehensive Exploratory Data Analysis (EDA) of the provided dataset in a Jupyter Notebook. We preprocess the dataset and, based on the EDA, create new features and examine feature importances and their impact on the model's R2 score. For modeling we use a Gradient Boosted Regressor, which gives good accuracy across its loss function options and works well with varied data.

2. Production-Ready Implementation: We combine the components above in a scikit-learn Pipeline tuned with GridSearchCV, and use GridSearchCV to find and store the best set of parameters. We hold out a test set as unseen data and divide the remaining data into training and validation sets. All of the code is written as Python modules, and the model and output files are saved.
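The split described here can be sketched with two calls to train_test_split: one to hold out the test set, and one to divide the remainder into training and validation sets. The column names and the 60/20/20 proportions are assumptions chosen for illustration; the repository's actual ratios are not stated.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical frame standing in for the recipe dataset.
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "prep_time": rng.integers(10, 60, size=100),
    "n_ingredients": rng.integers(3, 15, size=100),
    "score": rng.uniform(1, 5, size=100),
})

# Step 1: hold out 20% of the rows as the unseen test set.
rest, test = train_test_split(df, test_size=0.2, random_state=42)
# Step 2: split the remaining 80% into training (75%) and validation (25%).
train, val = train_test_split(rest, test_size=0.25, random_state=42)

print(len(train), len(val), len(test))
```

Fixing random_state keeps the three sets reproducible between the training and prediction runs.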

Structure

data

This folder has the training, validation and test data.

meta

This folder stores intermediate results, such as the best set of parameters determined by GridSearchCV and our trained model.

notebooks

This folder hosts the notebook containing the EDA steps and the model execution steps.

out

This folder contains the final prediction output on the test (unseen) data.

src

This folder contains all of the source code.

Training

Model training is implemented in the Train.py script in the HFresh package inside src. Here is the flow of events:

Create a GridSearchCV pipeline comprising a custom feature transformer and a GradientBoostingRegressor.
Load the training data, perform cleanup and feature transformation, and create the training and validation sets.
Fit the pipeline on the training set.
Determine the best set of parameters and make predictions on the validation set.
Evaluate metrics and scores for the pipeline's best_estimator_.
Document the prediction results on the validation set and save them to a pandas DataFrame.
Store the best set of parameters.
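The training flow above can be sketched roughly as follows. This is not the contents of Train.py: the RatioFeatures transformer, the column names, the parameter grid, and the best_params.json file name are all hypothetical, and a temporary directory stands in for the meta folder.

```python
import json
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline


class RatioFeatures(BaseEstimator, TransformerMixin):
    """Hypothetical custom transformer: adds preparation time per ingredient."""

    def fit(self, X, y=None):
        return self  # stateless: nothing to learn

    def transform(self, X):
        X = X.copy()
        X["time_per_ingredient"] = X["prep_time"] / X["n_ingredients"]
        return X


# Synthetic data standing in for the cleaned training file.
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "prep_time": rng.uniform(10, 60, size=300),
    "n_ingredients": rng.integers(3, 15, size=300).astype(float),
})
y = 5.0 - 0.3 * X["prep_time"] / X["n_ingredients"] + rng.normal(scale=0.1, size=300)

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

pipeline = Pipeline([
    ("features", RatioFeatures()),
    ("gbr", GradientBoostingRegressor(random_state=0)),
])
param_grid = {"gbr__n_estimators": [50, 100], "gbr__max_depth": [2, 3]}
search = GridSearchCV(pipeline, param_grid, cv=3, scoring="r2")
search.fit(X_train, y_train)

# Predict on the validation set with the refitted best estimator.
val_pred = search.best_estimator_.predict(X_val)
val_r2 = r2_score(y_val, val_pred)
val_results = pd.DataFrame({"actual": y_val.to_numpy(), "predicted": val_pred})

# Store the best parameters; a temporary directory stands in for meta/.
meta_dir = Path(tempfile.mkdtemp())
params_path = meta_dir / "best_params.json"
params_path.write_text(json.dumps(search.best_params_))
print(search.best_params_, round(val_r2, 3))
```

Because the transformer sits inside the pipeline, GridSearchCV applies it separately within each cross-validation fold, so the same feature logic is reused unchanged at prediction time.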

Predict

Please look at the documented Predict.py script in the HFresh package inside src. Here is the flow of events:

Create a GridSearchCV pipeline comprising a custom feature transformer and a GradientBoostingRegressor. This time we use the best set of parameters obtained from training.
Load the test data and perform cleanup and feature transformation.
Make predictions on the test data and store the prediction output in the out folder.
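A minimal sketch of the prediction flow, assuming the stored parameters are in the form returned by GridSearchCV.best_params_. Here they are applied with set_params and the pipeline is refitted before predicting; Predict.py may reuse them differently. The parameter values, column names, and predictions.csv file name are hypothetical, StandardScaler is only a placeholder for the custom feature transformer, and a temporary directory stands in for the out folder.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical stored parameters, keyed by pipeline step name.
best_params = {"gbr__n_estimators": 50, "gbr__max_depth": 2}

# Synthetic stand-ins for the cleaned training and test data.
rng = np.random.default_rng(0)
X_train = pd.DataFrame(rng.normal(size=(200, 2)), columns=["f1", "f2"])
y_train = 2.0 * X_train["f1"] - X_train["f2"] + rng.normal(scale=0.1, size=200)
X_test = pd.DataFrame(rng.normal(size=(20, 2)), columns=["f1", "f2"])

pipeline = Pipeline([
    ("scale", StandardScaler()),  # placeholder for the custom feature transformer
    ("gbr", GradientBoostingRegressor(random_state=0)),
])
pipeline.set_params(**best_params)  # apply the parameters found during training
pipeline.fit(X_train, y_train)

predictions = pd.DataFrame(
    {"predicted_score": pipeline.predict(X_test)}, index=X_test.index
)

# Write the output; a temporary directory stands in for out/.
out_dir = Path(tempfile.mkdtemp())
out_path = out_dir / "predictions.csv"
predictions.to_csv(out_path, index_label="row")
print(predictions.head())
```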

References

https://www.ritchieng.com/machine-learning-evaluate-linear-regression-model/
https://lib.ugent.be/fulltxt/RUG01/002/166/653/RUG01-002166653_2014_0001_AC.pdf
https://github.com/sinclam2/regression-analysis-with-python/blob/master/03-Multiple-Regression-in-Action.ipynb
https://github.com/abtpst/Kaggle-Whats-Cooking/blob/master/src/cook/cookTrain.py
