This project uses a simple linear regression model to predict house prices based on house size (in square feet). It uses a dataset stored in a CSV file and visualizes both the raw data and the model's predictions.
The goal is to:
- Load and explore a dataset of house sizes and prices
- Train a linear regression model using scikit-learn
- Make predictions on unseen data
- Visualize actual vs. predicted house prices using matplotlib
- Python 3
- Pandas
- NumPy
- Matplotlib
- scikit-learn
- home_dataset.csv: CSV file containing two columns:
  - HouseSize (in sq. ft)
  - HousePrice (in millions of dollars)
- predict_home_prices.py (or your script file): contains the full code for loading data, training the model, and plotting results.
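As a quick sanity check on the dataset layout described above, the following is a minimal sketch of loading the CSV and confirming that both columns are present. The inline sample rows are made up for illustration and are not taken from home_dataset.csv; the sketch also assumes the column headers are spelled exactly `HouseSize` and `HousePrice`.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for home_dataset.csv (illustrative values only).
sample_csv = io.StringIO(
    "HouseSize,HousePrice\n"
    "1200,0.45\n"
    "1850,0.72\n"
    "2400,0.98\n"
)

# For the real data, pass the file path instead: pd.read_csv("home_dataset.csv")
df = pd.read_csv(sample_csv)

# Fail early if the assumed column names are not present.
expected_columns = {"HouseSize", "HousePrice"}
missing = expected_columns - set(df.columns)
if missing:
    raise ValueError(f"Missing expected columns: {sorted(missing)}")

print(df.head())      # first few rows
print(df.describe())  # count, mean, min, max, and quartiles per column
```

If the real file uses different headers, adjust `expected_columns` to match.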
- Load the dataset using pandas
- Plot raw data as a scatter plot
- Split data into training and test sets (80/20 split)
- Train a linear regression model on the training data
- Predict house prices for the test set
- Visualize:
  - Blue dots = actual house prices
  - Red line = predicted prices based on the model
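The split, train, and predict steps above can be sketched roughly as follows. This is not necessarily the exact code in predict_home_prices.py: the helper name `fit_price_model`, the `random_state` value, and the synthetic stand-in data are all assumptions made so the example runs on its own. Only the column names and the 80/20 split come from the description above.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split


def fit_price_model(df, test_size=0.2, random_state=42):
    """Hypothetical helper: split 80/20, fit a linear regression, predict on the test set."""
    X = df[["HouseSize"]]  # 2-D feature matrix with a single column
    y = df["HousePrice"]   # 1-D target
    x_train, x_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )
    model = LinearRegression()
    model.fit(x_train, y_train)
    predictions = model.predict(x_test)
    return model, x_test, y_test, predictions


# Synthetic stand-in data (an assumption; replace with pd.read_csv("home_dataset.csv")).
rng = np.random.default_rng(0)
sizes = rng.uniform(800, 3500, size=50)
prices = 0.0004 * sizes + 0.1 + rng.normal(0, 0.05, size=50)
demo_df = pd.DataFrame({"HouseSize": sizes, "HousePrice": prices})

model, x_test, y_test, predictions = fit_price_model(demo_df)
print(f"slope={model.coef_[0]:.6f}, intercept={model.intercept_:.3f}")
```

Fixing `random_state` makes the split reproducible between runs, which helps when comparing plots. Omit it for a different random split each time.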
- Clone the repository:
    git clone https://github.com/alishbamateen/predict_home_prices.git
    cd predict_home_prices
- Make sure you have the required libraries:
    pip install pandas numpy matplotlib scikit-learn
- Run the script:
    python predict_home_prices.py
A second plot showing:
- Blue dots: actual house prices (test data)
- Red line: predicted house prices from the linear regression model
- The red line in the final graph may not appear properly unless x_test is sorted before plotting.
- This project is for educational purposes and demonstrates basic regression modeling and visualization.
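One way to handle the sorting note above is to reorder the test sizes and their predictions together before drawing the line. The sketch below assumes small made-up arrays in place of the real `x_test` and predictions, and the helper name `sort_by_size` is hypothetical. It uses the non-interactive Agg backend so that it runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np


def sort_by_size(x_test, predictions):
    """Hypothetical helper: return both arrays reordered by ascending house size."""
    x = np.asarray(x_test, dtype=float).ravel()
    y = np.asarray(predictions, dtype=float).ravel()
    order = np.argsort(x)  # indices that sort x in ascending order
    return x[order], y[order]


# Made-up values standing in for the test split and the model output.
x_demo = [2400, 900, 1800]
actual_demo = [1.10, 0.40, 0.85]
pred_demo = [1.06, 0.46, 0.82]

x_sorted, y_sorted = sort_by_size(x_demo, pred_demo)

fig, ax = plt.subplots()
ax.scatter(x_demo, actual_demo, color="blue", label="Actual prices")
ax.plot(x_sorted, y_sorted, color="red", label="Predicted prices")
ax.set_xlabel("House size (sq. ft)")
ax.set_ylabel("House price (millions of dollars)")
ax.legend()
```

Only the line needs sorted input. The scatter points can be plotted in any order because they are not connected to each other.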