Worked examples for talk: Producing and evaluating machine learning models.

Lecture slides: CV.pdf

Files:

The files below are telegraphic examples used to generate the graphs and numbers in the presentation. One can in principle work through them using R ( https://cran.r-project.org ), RStudio ( https://www.rstudio.com ), and the referenced packages. They are not complete tutorials; they were used to generate the numbers for the included presentation slides.

For a free video lecture on gradient boosting (one of the methods used) please see here: http://www.win-vector.com/blog/2015/11/free-gradient-boosting-lecture/ .

For a description of the vtreat package (used for data preparation) please see here: http://www.win-vector.com/blog/2016/06/a-demonstration-of-vtreat-data-preparation/ .

CV.pdf : lecture slides.
project.Rproj : RStudio project file (see https://www.rstudio.com ).
installH2O.R : Instructions to install the h2o deep learning kit.

kdd2009.Rmd : R knitr/r-markdown neural net fitting/scoring.
kdd2009.html : HTML rendering of above file.

KDD2009vtreat.Rmd : R knitr/r-markdown demonstration fitting/scoring.
KDD2009vtreat.html : HTML rendering of above file.
kdd2009tree.Rmd : R knitr/r-markdown decision tree fitting/scoring.
kdd2009tree.html : HTML rendering of above file.
kdd2009xgboost.Rmd : R knitr/r-markdown demonstration fitting/scoring.
kdd2009xgboost.html : HTML rendering of above file.

orange_small_train.data.gz : Example data.
orange_small_train_churn.labels.txt : Example data.