SVM (Part 1): Prerequisites — vectors, linear separability and hyperplanes

Jyoti Yadav
3 min read · Jan 7, 2020

I am starting a new series of posts for data science enthusiasts who are struggling to understand Support Vector Machines (SVMs). This series will include not only a detailed description of SVMs but also the other dimensions attached to them.

Vectors

The term Support Vector Machine contains the mathematical term “vector”. So, the natural question that comes to mind is: what is the significance of this term in the overall algorithm? For most machine learning algorithms, operations on vectors are a fundamental part of the computation.

In mathematics, a vector is an object that has two components: magnitude and direction. The magnitude of a vector is more formally called its norm; it is the Euclidean distance of the vector’s coordinates from the origin. The Euclidean norm of a vector x lying in an n-dimensional space can be calculated as:

\|x\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}

The direction of a vector, on the other hand, is the angle it makes with the horizontal axis. For a two-dimensional vector x = (x_1, x_2), it can be calculated through the following formula:

\theta = \tan^{-1}(x_2 / x_1)
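As a quick illustration, here is a minimal NumPy sketch that computes both quantities for a sample 2-D vector (the vector values are arbitrary, chosen just for this example):

```python
import numpy as np

# A sample 2-D vector (values are illustrative).
x = np.array([3.0, 4.0])

# Euclidean norm: sqrt(3^2 + 4^2) = 5.0
norm = np.linalg.norm(x)

# Direction as the angle with the horizontal axis;
# arctan2 handles all four quadrants correctly.
theta = np.degrees(np.arctan2(x[1], x[0]))

print(norm)   # 5.0
print(theta)  # ~53.13 degrees
```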

Linear Separability

We will understand linear separability through an example. Consider a case study: a factory produces two types of soap, organic soap and normal soap. In recent days, they have observed that a few customers are getting normal soap instead of organic soap.

The manager observed a trend in the weights of the two soaps: organic soap weighs more than 135 g, while normal soap weighs less. Therefore, the two soaps can be segregated using a decision boundary of 135 g. Data of this type, where a linear hyperplane can classify the points into the required categories, is called a linearly separable dataset.
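Here is a minimal sketch of this one-dimensional decision rule, using hypothetical soap weights made up for illustration:

```python
import numpy as np

# Hypothetical soap weights in grams (the data is invented for this example).
weights = np.array([120.0, 128.0, 133.0, 137.0, 142.0, 150.0])

# Decision boundary at 135 g: heavier than 135 g -> organic, else normal.
labels = np.where(weights > 135.0, "organic", "normal")

for w, label in zip(weights, labels):
    print(f"{w:5.1f} g -> {label}")
```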

Hyperplanes

The aim of the SVM is to find the optimal hyperplane that separates the data points in the space under consideration. A hyperplane is a subspace having one dimension less than that space: in a two-dimensional space the hyperplane is a line, and in a three-dimensional space it is a plane. y = ax + b is the equation of a line and is a very simple example of a hyperplane in two dimensions.

In terms of vectors, it can be written as:

w \cdot x + b = 0

where w = (a, -1) and x = (x, y), since w \cdot x + b = ax - y + b = 0 is equivalent to y = ax + b.
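To check this equivalence numerically, here is a tiny sketch with arbitrary line parameters (a = 2, b = 1):

```python
import numpy as np

# Illustrative line: y = 2x + 1, so w = (2, -1) and b = 1.
a, b = 2.0, 1.0
w = np.array([a, -1.0])

# Any point on the line, e.g. x = 3 gives y = 2*3 + 1 = 7.
point = np.array([3.0, a * 3.0 + b])

# For points on the line, w . x + b evaluates to zero.
print(np.dot(w, point) + b)  # 0.0
```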

Now let us understand the classification of data with a hyperplane. The following is an example of linearly separable data points, with a line of the above-mentioned form acting as the hyperplane separating the two classes.

[Figure: Linear separation through a hyperplane]

With each vector x there is an associated label y, and these labels can take the value +1 or -1. Points that lie exactly on the line give w \cdot x + b = 0. In the figure, the points lying above the line are labelled +1 and the ones lying below it are labelled -1 (which side receives +1 depends on the orientation chosen for w).

The above labelling and separation can be represented through the equations:

y = +1 \quad \text{if } w \cdot x + b > 0
y = -1 \quad \text{if } w \cdot x + b < 0

or, more compactly,

h(x) = \operatorname{sign}(w \cdot x + b)
Since this function produces a linear combination of the input values, it is referred to as a linear classifier.
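A minimal sketch of such a linear classifier, with arbitrary hyperplane parameters chosen so that points above the line y = 2x + 1 receive the label +1:

```python
import numpy as np

def linear_classifier(x, w, b):
    """Label a point by the sign of the linear combination w . x + b."""
    return int(np.sign(np.dot(w, x) + b))

# The line y = 2x + 1 rewritten as w . x + b = 0,
# with w oriented so that points above the line give a positive value.
w = np.array([-2.0, 1.0])
b = -1.0

print(linear_classifier(np.array([0.0, 5.0]), w, b))   # +1 (above the line)
print(linear_classifier(np.array([0.0, -5.0]), w, b))  # -1 (below the line)
print(linear_classifier(np.array([0.0, 1.0]), w, b))   #  0 (on the line)
```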

That was a simple explanation of the essential prerequisites for understanding SVMs. The more difficult topics will be explained in the subsequent articles.

Stay tuned!
