Linear regression is a supervised machine learning algorithm used to predict a continuous target variable based on one or more input variables. It assumes a linear relationship between the input and output, meaning the output changes proportionally as the input changes. The relationship is represented by a straight line that best fits the data.
- Identifies the best-fitting straight line (regression line) that minimizes the difference between predicted and actual values.
- Learns the relationship between independent (input) variables and the dependent (target) variable using a training dataset.
- Calculates coefficients (weights) and intercept to define the linear equation for prediction.
- Uses the learned model to make predictions on new, unseen data by applying the same linear relationship.
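For a single feature, the best-fitting line above has a closed-form solution: the slope is the covariance of x and y divided by the variance of x, and the intercept follows from the means. A minimal sketch on toy data (the numbers here are illustrative, not from the tutorial's dataset):

```python
import numpy as np

# Toy data: y roughly follows 2*x + 1
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.9, 7.2, 9.0, 10.8])

# Closed-form least-squares slope and intercept for one feature
slope = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
intercept = y.mean() - slope * x.mean()

print(slope, intercept)       # slope = 1.95, intercept = 1.15
print(intercept + slope * 6)  # prediction for a new input x = 6 -> 12.85
```

The sections below generalize this idea to multiple features and polynomial terms via the Normal Equation.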

1. Simple Linear Regression
Simple Linear Regression is a supervised learning technique used to predict a continuous target variable based on a single input feature, assuming a linear relationship between the input and output. We now implement Simple Linear Regression from scratch.
Step 1: Import Libraries
Import the required libraries: NumPy for numerical operations and Matplotlib for plotting the data and the regression line.
import numpy as np
import matplotlib.pyplot as plt
Step 2: Implement Simple Linear Regression Class
Here we define a SimpleLinearRegression class to model the relationship between a single input feature and a target variable using a linear equation.
- __init__ method: Initializes the coefficient, intercept, and R² attributes.
- fit method: Adds a bias column to X, computes the best-fit intercept and slope using the Normal Equation, and calculates predicted values to determine the R² score.
- predict method: Adds a bias column to the input X and calculates predicted values using the learned coefficients.
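The Normal Equation referred to above solves the least-squares problem directly as θ = (XᵀX)⁻¹Xᵀy, where the first entry of θ is the intercept. A quick sketch on exact data (toy numbers, not the tutorial's dataset), also cross-checked against NumPy's built-in least-squares solver:

```python
import numpy as np

X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])  # exactly y = 2x + 1

X_b = np.c_[np.ones((len(X), 1)), X]            # prepend a bias column
theta = np.linalg.inv(X_b.T @ X_b) @ X_b.T @ y  # Normal Equation
print(theta)  # [intercept, slope] -> [1.0, 2.0]

# np.linalg.lstsq solves the same problem with better numerical stability
theta_lstsq, *_ = np.linalg.lstsq(X_b, y, rcond=None)
```

On noiseless data both approaches recover the generating coefficients exactly.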
class SimpleLinearRegression:
    def __init__(self):
        self.coefficients_ = None
        self.intercept_ = None
        self.r2score_ = None

    def fit(self, X, y):
        n = len(X)
        X_b = np.c_[np.ones((n, 1)), X]  # prepend a bias column of ones
        # Normal Equation: theta = (X^T X)^-1 X^T y
        self.coefficients_ = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y)
        self.intercept_ = self.coefficients_[0]
        y_pred = X_b.dot(self.coefficients_)
        # R² = 1 - SS_res / SS_tot
        self.r2score_ = 1 - (np.sum((y - y_pred)**2) / np.sum((y - np.mean(y))**2))
        self.y_pred_ = y_pred

    def predict(self, X):
        X_b = np.c_[np.ones((len(X), 1)), X]
        return X_b.dot(self.coefficients_)
Step 3: Fit the Model and Visualize Results
Here we fit the Simple Linear Regression model on the dataset, print the coefficients and R² score, and plot the data points along with the best-fit regression line.
X_simple = np.array([1,2,3,4,5,6,7,8,9,10]).reshape(-1,1)
y_simple = np.array([2,4,5,4,5,7,8,9,10,12])
slr = SimpleLinearRegression()
slr.fit(X_simple, y_simple)
print(f"Simple LR Coefficients: {slr.coefficients_}")
print(f"R² Score: {slr.r2score_:.2f}")
plt.scatter(X_simple, y_simple, color='blue', label='Data')
plt.plot(X_simple, slr.y_pred_, color='red', label='Regression Line')
plt.title("Simple Linear Regression")
plt.xlabel("X")
plt.ylabel("y")
plt.legend()
plt.show()
Output:

2. Multiple Linear Regression
Multiple Linear Regression is used to predict a continuous target variable based on two or more input features, assuming a linear relationship between the inputs and the output.
Step 1: Import Libraries
Import NumPy for numerical operations, Matplotlib for plotting and mpl_toolkits.mplot3d to create 3D visualizations.
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
Step 2: Implement Multiple Linear Regression Class
Here we implement a MultipleLinearRegression class to model the relationship between multiple input features and a continuous target variable using a linear equation.
- __init__ method: Initializes attributes for the coefficients (slopes), intercept (bias) and R² score to store model accuracy.
- fit method: Adds a bias column to X, computes coefficients using the Normal Equation, calculates predicted values, computes the R² score and stores the predictions for the training data.
- predict method: Adds a bias column to new input X and computes predicted values using the learned coefficients.
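One caveat about the Normal Equation as used in the fit method: np.linalg.inv fails (or is numerically unstable) when XᵀX is singular or ill-conditioned, e.g. with collinear features. A common variant is the Moore-Penrose pseudoinverse via np.linalg.pinv, which handles those cases. A minimal sketch on synthetic noiseless data (not the tutorial's implementation, just an alternative worth knowing):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((20, 2)) * 10
y = 1 + 2 * X[:, 0] + 3 * X[:, 1]  # noiseless plane for an exact check

X_b = np.c_[np.ones((len(X), 1)), X]
theta_inv = np.linalg.inv(X_b.T @ X_b) @ X_b.T @ y  # Normal Equation, as above
theta_pinv = np.linalg.pinv(X_b) @ y                # pseudoinverse variant

print(theta_pinv)  # ~ [1.0, 2.0, 3.0] = [intercept, coef for X1, coef for X2]
```

On well-conditioned data the two agree; pinv simply degrades more gracefully.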
class MultipleLinearRegression:
    def __init__(self):
        self.coefficients_ = None
        self.intercept_ = None
        self.r2score_ = None

    def fit(self, X, y):
        n = X.shape[0]
        X_b = np.c_[np.ones((n, 1)), X]  # prepend a bias column of ones
        # Normal Equation: theta = (X^T X)^-1 X^T y
        self.coefficients_ = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y)
        self.intercept_ = self.coefficients_[0]
        y_pred = X_b.dot(self.coefficients_)
        self.r2score_ = 1 - (np.sum((y - y_pred)**2) / np.sum((y - np.mean(y))**2))
        self.y_pred_ = y_pred

    def predict(self, X):
        X_b = np.c_[np.ones((X.shape[0], 1)), X]
        return X_b.dot(self.coefficients_)
Step 3: Generate Sample Dataset
We create a small dataset with two input features and a target variable, adding some random noise to simulate real-world data.
np.random.seed(0)
X1 = np.random.randint(1, 11, 15)
X2 = np.random.randint(1, 11, 15)
X_multi = np.column_stack((X1, X2))
y_multi = 1 + 2*X1 + 3*X2 + np.random.randn(15)*2
Step 4: Fit the Model and Visualize
Here we fit the Multiple Linear Regression model on the dataset, print the coefficients and R² score, and visualize the data along with the best-fit regression plane in 3D.
mlr = MultipleLinearRegression()
mlr.fit(X_multi, y_multi)
print(f"Multiple LR Coefficients: {mlr.coefficients_}")
print(f"R² Score: {mlr.r2score_:.2f}")
fig = plt.figure(figsize=(10,7))
ax = fig.add_subplot(111, projection='3d')
ax.scatter(X_multi[:,0], X_multi[:,1], y_multi, color='blue', label='Data')
x1_surf, x2_surf = np.meshgrid(
    np.linspace(X_multi[:,0].min(), X_multi[:,0].max(), 10),
    np.linspace(X_multi[:,1].min(), X_multi[:,1].max(), 10)
)
pred_surf = mlr.predict(np.c_[x1_surf.ravel(), x2_surf.ravel()]).reshape(x1_surf.shape)
ax.plot_surface(x1_surf, x2_surf, pred_surf, color='red', alpha=0.5, rstride=1, cstride=1)
ax.set_xlabel('X1')
ax.set_ylabel('X2')
ax.set_zlabel('y')
ax.set_title("Multiple Linear Regression with Regression Plane")
ax.legend()
plt.show()
Output:
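The fitted coefficients can also be cross-checked against np.linalg.lstsq on the same generated dataset. This sketch recreates the Step 3 data (same seed) and recomputes R² the same way the class does; since the data carries noise, we only check that the fit explains most of the variance rather than asserting exact coefficient values:

```python
import numpy as np

# Recreate the Step 3 dataset and fit via np.linalg.lstsq as a cross-check
np.random.seed(0)
X1 = np.random.randint(1, 11, 15)
X2 = np.random.randint(1, 11, 15)
y = 1 + 2 * X1 + 3 * X2 + np.random.randn(15) * 2

X_b = np.c_[np.ones(15), X1, X2]
theta, *_ = np.linalg.lstsq(X_b, y, rcond=None)  # [intercept, coef1, coef2]
y_pred = X_b @ theta
r2 = 1 - np.sum((y - y_pred) ** 2) / np.sum((y - y.mean()) ** 2)
print(theta, r2)
```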

3. Polynomial Regression
Polynomial Regression is an extension of linear regression that models the relationship between the input and output as a polynomial equation, allowing it to capture non-linear patterns in the data.
Step 1: Define the Polynomial Regression Class
Here we implement a Polynomial Regression class to model the relationship between an input feature and a continuous target variable using a polynomial equation, allowing the model to capture non-linear patterns in the data.
- __init__(self, degree=2): Initializes the model with the specified polynomial degree and sets placeholders for the coefficients, intercept, R² score and polynomial feature transformer.
- fit(self, X, y): Transforms input X into polynomial features, computes coefficients using the Normal Equation, stores the intercept, makes predictions and calculates the R² score.
- predict(self, X): Generates predictions for new input data.
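The class below relies on scikit-learn's PolynomialFeatures to expand the input. For a single feature, the same design matrix (columns 1, x, x², ...) can be built with plain NumPy via np.vander, a useful fallback if scikit-learn is unavailable (a sketch, not part of the tutorial's implementation):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
# Columns [1, x, x^2], matching PolynomialFeatures(degree=2, include_bias=True)
V = np.vander(x, N=3, increasing=True)
print(V)
# [[1. 1. 1.]
#  [1. 2. 4.]
#  [1. 3. 9.]]
```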
from sklearn.preprocessing import PolynomialFeatures  # builds the polynomial feature matrix

class PolynomialRegression:
    def __init__(self, degree=2):
        self.degree = degree
        self.coefficients_ = None
        self.intercept_ = None
        self.r2score_ = None
        self.poly_features = None

    def fit(self, X, y):
        self.poly_features = PolynomialFeatures(degree=self.degree, include_bias=True)
        X_poly = self.poly_features.fit_transform(X)
        # Normal Equation on the expanded features
        self.coefficients_ = np.linalg.inv(X_poly.T.dot(X_poly)).dot(X_poly.T).dot(y)
        self.intercept_ = self.coefficients_[0]
        # Predictions and R²
        y_pred = X_poly.dot(self.coefficients_)
        self.r2score_ = 1 - (np.sum((y - y_pred)**2) / np.sum((y - np.mean(y))**2))
        self.y_pred_ = y_pred

    def predict(self, X):
        X_poly = self.poly_features.transform(X)
        return X_poly.dot(self.coefficients_)
Step 2: Train the Model and Visualize the Fit
- Generate sample data with a quadratic relationship plus random noise.
- Train the polynomial regression model on it.
- Visualize how well the model fits the data.
- The plot shows the original data points and the polynomial curve representing the model's predictions.
X_poly = np.random.rand(50,1)*6 - 3
y_poly = 0.5*X_poly**2 + X_poly + 2 + np.random.randn(50,1)*0.5
y_poly = y_poly.flatten()
pr = PolynomialRegression(degree=2)
pr.fit(X_poly, y_poly)
print(f"Polynomial LR Coefficients: {pr.coefficients_}")
print(f"R² Score: {pr.r2score_:.2f}")
plt.scatter(X_poly, y_poly, color='blue', label='Data')
order = np.argsort(X_poly.ravel())  # sort by X so the curve is drawn left to right
plt.plot(X_poly.ravel()[order], pr.y_pred_[order], color='red', label='Polynomial Fit')
plt.title("Polynomial Regression")
plt.xlabel("X")
plt.ylabel("y")
plt.legend()
plt.show()
Output:
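On a noiseless version of the same quadratic, np.polyfit with degree 2 recovers the generating coefficients exactly, which makes a handy sanity check for the class above (note np.polyfit orders coefficients highest degree first, the reverse of the PolynomialFeatures layout):

```python
import numpy as np

x = np.linspace(-3, 3, 50)
y = 0.5 * x**2 + x + 2  # the noiseless version of the curve above

# np.polyfit returns highest degree first: [0.5, 1.0, 2.0]
coeffs = np.polyfit(x, y, deg=2)
print(coeffs)
```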
