Given data with n parameters (where the last parameter is the output), fit a separate linear regression line for each of the n-1 features against the output parameter.
- pandas
- scikit-learn
- matplotlib
- numpy
Run
pip install -r requirements.txt
in a terminal to install the necessary dependencies. If you don't already have pip, install it first.
- Datasets differ in delimiter, number of features, and so on, so the script first parses the dataset, writes a clean copy with a fixed delimiter to 'temporary.txt', and works from that copy afterwards.
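
The cleaning step above could look roughly like the sketch below. It is not taken from demo.py: the function name `normalize_delimiters`, the comma as output delimiter, and the assumption that fields are separated by commas, semicolons, tabs, or spaces are all illustrative choices.

```python
import re
from pathlib import Path


def normalize_delimiters(src_path, dst_path="temporary.txt", out_delim=","):
    """Rewrite a delimited text file so every row uses the same delimiter.

    Hypothetical helper: assumes fields are separated by commas,
    semicolons, tabs, or spaces, and that no field contains whitespace.
    """
    splitter = re.compile(r"[,;\s]+")
    rows = []
    for line in Path(src_path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        rows.append(splitter.split(line))
    # Write the cleaned rows with a single, fixed delimiter.
    cleaned = "\n".join(out_delim.join(row) for row in rows) + "\n"
    Path(dst_path).write_text(cleaned)
    return len(rows)
```

A file with a header row whose column names contain spaces would be split incorrectly by this sketch, so such files would need extra handling.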
This is the code for Siraj's challenge on Linear Regression.
- Diabetes - (Age, Deficit, C_peptide) - Given the Age and Deficit, predict the amount of C_peptide
- Electrical Length - (Inhabitants, Distance, Length) - Given the number of inhabitants and distance, predict the length required
- Challenge Dataset
- Diabetes Dataset
- Electrical Length Dataset
Run python demo.py <datafile_location> in a terminal to see the scatter plot and line of best fit for each (feature, output) pair.
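
For readers who want to see the idea without opening demo.py, here is a minimal sketch of fitting one line per feature against the last column. It is an illustration under assumptions, not the script's actual code: the function names `fit_per_feature` and `plot_fits` are hypothetical, and it uses `numpy.polyfit` for the least-squares fit, whereas demo.py may use scikit-learn instead.

```python
import numpy as np


def fit_per_feature(data):
    """Fit y = slope * x + intercept for each feature column.

    The last column of `data` is treated as the output; every other
    column is fitted against it independently.
    """
    data = np.asarray(data, dtype=float)
    y = data[:, -1]
    fits = []
    for j in range(data.shape[1] - 1):
        # Degree-1 polynomial fit returns [slope, intercept].
        slope, intercept = np.polyfit(data[:, j], y, deg=1)
        fits.append((float(slope), float(intercept)))
    return fits


def plot_fits(data, fits):
    """Draw one scatter plot with its fitted line per feature."""
    # Imported here so that fitting alone does not require matplotlib.
    import matplotlib.pyplot as plt

    data = np.asarray(data, dtype=float)
    y = data[:, -1]
    for j, (slope, intercept) in enumerate(fits):
        x = data[:, j]
        xs = np.array([x.min(), x.max()])
        plt.figure()
        plt.scatter(x, y)
        plt.plot(xs, slope * xs + intercept, color="red")
        plt.xlabel(f"feature {j}")
        plt.ylabel("output")
    plt.show()
```

Assuming a purely numeric, comma-delimited file with no header row, the data could be loaded with `np.loadtxt("temporary.txt", delimiter=",")` and passed to `fit_per_feature`, then to `plot_fits`.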




