SVM algorithm: a deep dive!



The SVM (support vector machine) algorithm is a powerful paradigm for classifying or regressing over data points that belong to two different categories. In machine learning, SVM is mostly used for classification, whether the inputs are numeric values, measurement results, or even image pixels. SVM-based models are trained with supervised learning, and implementing one is straightforward: a labelled data set is fed to the model repeatedly until the margin of error is acceptably small, and only then can the model act on unlabelled data. In practice, implementation is as easy as loading a Python library, fitting the model on training data, and asking it for predictions. Under the hood, SVM solves a convex optimization problem: it maximizes the width of the margin while keeping the data points of the two categories on opposite sides of the decision boundary.
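The load-train-predict workflow described above can be sketched with scikit-learn's `SVC` class (the toy data points here are assumed for illustration):

```python
# Minimal sketch: supervised SVM classification with scikit-learn.
from sklearn import svm

# Labelled training data: two well-separated groups of 2-D points.
X_train = [[0, 0], [1, 1], [1, 0], [8, 8], [9, 9], [8, 9]]
y_train = [0, 0, 0, 1, 1, 1]

clf = svm.SVC(kernel="linear")
clf.fit(X_train, y_train)  # supervised training on labelled data

# The trained model can now classify unlabelled points.
predictions = clf.predict([[0.5, 0.5], [8.5, 8.5]])
print(predictions)  # one point from each side of the boundary
```

The same three steps (load data, fit, predict) apply regardless of the kernel or data set used.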

The principle

Classification with the SVM algorithm involves drawing a decision boundary between two groups of data points. The decision boundary is a line in a two-dimensional representation and a plane or hyperplane in three or more dimensions. Dimensionality matters for SVM-based classification because of how the segregation works: the two groups of data points are usually separated by an empty region of the space, and the decision boundary runs through the middle of it. This region and the decision boundary are supported by the data points that sit at the edge of each cluster. The distance between these extreme data points is called the margin, and the decision boundary passes through the midpoint of the margin. The extreme points themselves are called support vectors, because they support the position of the decision boundary and the margin.


Different components of the SVM algorithm

  • Decision boundary

The decision boundary is a line in 2D representations and a plane (or hyperplane) in 3D and higher-dimensional representations. This line or plane lies midway between the support vectors, so the gap on both sides of the boundary is quantitatively equal.

  • Support vector

The support vectors are the extreme data points. A minimum of two support vectors, one from each category, is essential for forming an optimal decision boundary. SVM algorithms classify the two categories of data points by ensuring an optimal margin, and the support vectors are the very foundation of that margin.
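Trained scikit-learn models expose the support vectors they found via the `support_vectors_` attribute; a small sketch with assumed toy data:

```python
# Sketch: inspecting the support vectors found for two small clusters.
from sklearn import svm

X = [[0, 0], [1, 1], [4, 4], [5, 5]]
y = [0, 0, 1, 1]

# A large C approximates a hard margin (see the section on margins below).
clf = svm.SVC(kernel="linear", C=1e6).fit(X, y)

# Only the extreme points nearest the gap become support vectors:
# [1, 1] from class 0 and [4, 4] from class 1.
print(clf.support_vectors_)
```

The interior points ([0, 0] and [5, 5]) play no role in fixing the boundary; moving them around (without crossing the margin) leaves the model unchanged.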

  • Margin

The margin of an SVM-based classification is the width of the empty region between the two groups of data points: the distance between the support vectors of the two categories, measured perpendicular to the decision boundary. The decision boundary bisects this margin. The SVM algorithm is trained to maximize the margin and to keep the space between the groups free of data points.
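In the standard textbook formulation (not spelled out in this article, but useful for reference), the decision boundary is written as $w \cdot x + b = 0$, and margin maximization becomes the convex optimization problem mentioned earlier:

```latex
\min_{w,\,b} \; \frac{1}{2}\lVert w \rVert^2
\quad \text{subject to} \quad
y_i \,(w \cdot x_i + b) \ge 1 \;\; \text{for all } i,
```

where each label $y_i \in \{-1, +1\}$. The margin width equals $2 / \lVert w \rVert$, so minimizing $\lVert w \rVert$ is exactly what maximizes the margin.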

Types of SVM based on margins

Hard margins

Hard-margin SVM classification is used for clean data sets without many exceptions or outliers. A hard margin is free of intrusion from either category of data points: every point lies outside the margin, on the correct side of the decision boundary.

Soft margins

Soft margins are flexible. Instead of applying the margin constraint with full stringency, a soft margin allows deviations and exceptions to be tolerated regardless of where they fall on the plot. Such outliers can turn up in the most unwanted positions, and their existence poses a challenge to deploying a hard-margin SVM algorithm.
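In scikit-learn, margin softness is controlled by the `C` parameter of `SVC`: a small `C` yields a soft margin that tolerates outliers, while a very large `C` approaches hard-margin behaviour. A sketch with assumed toy data containing one outlier:

```python
# Sketch: a soft margin ignoring a mislabelled outlier.
from sklearn import svm

# Class 0 clustered near the origin, class 1 far away,
# plus one class-0 outlier sitting inside the class-1 region.
X = [[0, 0], [1, 1], [8, 8], [9, 9], [8.5, 8.5]]
y = [0, 0, 1, 1, 0]  # the last point is the outlier

soft = svm.SVC(kernel="linear", C=1).fit(X, y)

# The soft margin places the wide boundary between the clusters and
# accepts the outlier as a training error: even the outlier's own
# location is classified as class 1.
print(soft.predict([[8.5, 8.5], [0.5, 0.5]]))
```

A hard margin would be infeasible here with a linear boundary, since no straight line can put the outlier on the class-0 side without also misplacing the class-1 cluster.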


The kernel trick

Most data sets we encounter in real life cannot be classified by a linear decision boundary. For such data sets, SVM-based classification relies on the kernel trick: implicitly adding one or more dimensions to the data so that the two groups become separable.

Imagine a data set plotted in 2D in which one group of points is completely surrounded by the other. Any decision boundary in the plane would have to be a circle or some other nonlinear curve, never a straight line. But add a third dimension in which, say, the surrounded group takes low values and the outer group takes high values, and the two groups separate cleanly. The SVM can then place a flat decision boundary in that higher-dimensional space.
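This surrounded-group scenario can be reproduced with scikit-learn's `make_circles` data set: a linear kernel fails on the concentric rings, while an RBF kernel, which implicitly lifts the data into a higher-dimensional space, separates them almost perfectly.

```python
# Sketch of the kernel trick: concentric circles cannot be split by a
# straight line, but an RBF kernel separates them easily.
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# One ring of points surrounded by another (factor sets the inner radius).
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear = SVC(kernel="linear").fit(X, y)
rbf = SVC(kernel="rbf").fit(X, y)

print("linear accuracy:", linear.score(X, y))  # roughly chance level
print("rbf accuracy:", rbf.score(X, y))        # near perfect
```

The RBF model never constructs the extra dimension explicitly; the kernel computes the equivalent inner products directly, which is what makes the trick cheap.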



Sanket Goyal

Sanket has been in digital marketing for 8 years. He has worked with various MNCs and brands, helping them grow their online presence.