Skip to content

samankhalilinia/MachineLearning-SupportVectorMachines

 
 

Repository files navigation

Machine Learning-Support Vector Machines

Description:

  • A Python script to estimate from scratch Support Vector Machines for linear, polynomial and Gaussian kernels utilising the quadratic programming optimisation algorithm from library CVXOPT.
  • Support Vector Machines implemented from scratch and compared to scikit-learn's implementation.
  • All labelled examples are simulated data.
    • Linear hard margin fits linearly separable data. Linear soft margin fits linearly separable data with some overlap in class examples. Both Gaussian and polynomial kernels estimate non-linearly separable examples.

Using this Python Script

Using SKLearn Library

Hard Margin

Linear SVM hard margin. Using custom library.
Hard Margin

Linear SVM hard margin. Using SKLearn.
Soft Margin

Linear SVM soft margin.
Soft Margin

Linear SVM soft margin.
Polynomial Kernel

Polynomial SVM. Custom library.
Polynomial Kernel

Polynomial SVM. Using SKLearn.
Gaussian Kernel

Gaussian SVM. Custom library.
Gaussian Kernel

Gaussian SVM. Using SKLearn.
Given two classes of labelled examples, we are interested in finding a decision boundary resulting from an appropriate choice of support vectors.

    Model

  • Simulate labelled training dataset where there are N couples of and k is the number (dimension) of x variables.

  • We are interested in SVM disrimitive analysis by finding the optimum decision boundary resulting from a choice of S support vectors. This SVM optimization problem is a constrained convex quadratic optimization problem. Meaning it can be formulated in terms of Lagrange multipliers. For the Lagrange multipliers to be applied under constraints, can use the Wolfe Dual Principle which is appropriate as long as the Karush Kuhn Tucker (KKT) conditions are satisfied. Further, as this is a convex problem, Slater’s condition tells us that strong duality holds such that the dual and primal optimums give the solution.
  • A benefit to this dual formulation is that the objective function only depends on the Lagrange multipliers (weights and intercept drop out). Further, this formulation will be useful when requiring Kernals for entangled data sets.
  • The Wolfe dual soft margin formula with kernel is given by

    Where

    are the Lagrange multipliers, is the kernel function, N are the number of training samples in the dataset, x is the matrix of training samples, y is the vector of target values, C is a supplied hyperparameter.

  • The non-zero Lagrange multipliers are the data points which contribute to the formation of the decision boundary.

    The hypothesis function is the decision boundary. The hypothesis formula in terms of the Kernel function is given by:

  • Where S is the set of support vectors, is the Lagrange multiplier, b is the bias term, y is the target from the examples, is the Kernel and

    Linear Kernel

  • Applies to linearly separable data in the space.

  • When the constraint C is zero. The data must be completely linearly separable and the decision boundary is referred to as 'hard margin'.
  • When the parameter C is non-zero, the approach allows for some overlap in the data and the decision boundary is referred to as 'soft-margin'.
  • Where x and x' are two vectors.

    Polynomial Kernel

    Applies to non-linearly separable data in space.

    Where C is a constant and d is the degree of the kernel.

    Gaussian Kernel (aka Radial Basis Function (RBF))

    Applies to non-linearly separable data in . space.

    Where is a free scalar parameter chosen based on the data and defines the influence of each training example.

    CVXOPT Library

    The CVXOPT library solves the Wolfe dual soft margin constrained optimisation with the following API:

    Note: indicates component-wise vector inequalities. It means that each row of the matrix represents an inequality that must be satisfied.

    To use the CVXOPT convex solver API. The Wolfe dual soft margin formula is re-written as follows

    Where

    G is a Gram matrix of all possible dot products of vectors .

    How to use

    python supportVectorMachines.py
    

    Expected Output

    Estimating kernel: linear
         pcost       dcost       gap    pres   dres
     0: -1.4469e+01 -2.6709e+01  5e+02  2e+01  2e+00
     1: -1.6173e+01 -1.1054e+01  1e+02  6e+00  6e-01
     2: -2.1081e+01 -1.0259e+01  1e+02  4e+00  3e-01
     3: -6.7928e+00 -4.0658e+00  1e+01  4e-01  3e-02
     4: -3.1094e+00 -3.4085e+00  9e-01  2e-02  2e-03
     5: -3.2416e+00 -3.2970e+00  7e-02  4e-04  4e-05
     6: -3.2908e+00 -3.2914e+00  7e-04  4e-06  4e-07
     7: -3.2913e+00 -3.2913e+00  7e-06  4e-08  4e-09
     8: -3.2913e+00 -3.2913e+00  7e-08  4e-10  4e-11
     9: -3.2913e+00 -3.2913e+00  7e-10  4e-12  4e-13
    10: -3.2913e+00 -3.2913e+00  7e-12  4e-14  7e-15
    Optimal solution found.
    ============================================================
            SUPPORT VECTOR MACHINE TERMINATION RESULTS
    ============================================================
                   **** In-Sample: ****
    3 support vectors found from 160 examples.
                   **** Predictions: ****
    40 out of 40 predictions correct.
    ============================================================
    Finished
    

    Highlights

    • SVMs are particularly well-suited for classification of complex but small- or medium-sized linear and non-linearably separable datasets.
    • The SKLearn SVC class is based on the libsvm library, which implements an algorithm that supports the kernel trick. The training time complexity is usually between and . This means that training becomes slow when the number of training instances are large.
    • Support Vector Machines in Machine Learning generally follow

      Logistic regression in studies. This work is a natural stepping stone and advances the candidates Machine Learning knowledge from this step as follows:

      • Strong introduction to vectors and vector operations.
      • Appreciation and usage of advanced optimisation routines such as convex optimisation and Sequential Minimal Optimization (SMO).
      • SKLearn implements SVM optimisation using LIBSVM which utilises an SMO routine.

    Requirements

    Python (>2.7), Numpy, CVXOPT, sklearn and matplotlib.

About

Support vector machines implemented from scratch in Python.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages

  • Python 100.0%