PysparkTrainingExample.ipynb
README.md

Pysparkexample

A repository that shows how to train a Random Forest regressor with PySpark MLlib.

The notebook follows these steps:

Install Spark on Google Colab and load a dataset in PySpark
Describe and clean your dataset
Create a Random Forest pipeline to predict car prices
Create a cross validator for hyperparameter tuning
Train your model and predict test set car prices
Evaluate your model’s performance via several metrics

Link to the Coursera project: https://www.coursera.org/projects/spark-machine-learning-pipeline-python
