Day 13 - 100 Days of ML Code: Support Vector Machine


November 22, 2018 · 100-Days-Of-ML-Code blog

In the last blog we discussed Logistic Regression; today, we are going to discuss the Support Vector Machine.

The Support Vector Machine is a very popular and powerful machine learning model, used for linear and nonlinear classification, regression, and even outlier detection.

As per Wikipedia:

"Support vector machines (SVMs, also support vector networks) are supervised learning models with associated learning algorithms that analyze data used for classification and regression analysis."

Linear SVM Classification

A linear SVM can be seen as a hyperplane that separates the two classes, as shown in the image below. The dotted lines act as the margin boundaries, and the red line in the center represents the decision boundary of the SVM classifier. This decision line not only separates the classes but also stays as far away as possible from the nearest training instances. The instances located on the edges of the margin (the dotted lines) are called support vectors, and adding new instances outside the margin will not affect the decision boundary, because it is fully determined ("supported") by the instances on the edges. This is called large margin classification.

(Image: a linear SVM's maximum-margin decision boundary, margins, and support vectors; copied from Wikipedia)
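To make this concrete, here is a minimal sketch (the toy points below are made up purely for illustration) that fits a linear SVM on a tiny 2-D dataset and prints the support vectors that define the decision boundary:

import numpy as np
from sklearn.svm import SVC

# A tiny, linearly separable toy dataset (made up for illustration)
X = np.array([[1, 1], [2, 1], [1, 2],    # class 0
              [4, 4], [5, 4], [4, 5]])   # class 1
y = np.array([0, 0, 0, 1, 1, 1])

# A large C approximates a hard margin (see the next section)
clf = SVC(kernel='linear', C=1000)
clf.fit(X, y)

# Only the support vectors determine the decision boundary w.x + b = 0
print("Support vectors:\n", clf.support_vectors_)
print("w =", clf.coef_[0], "b =", clf.intercept_[0])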

SVMs are very sensitive to feature scales, so make sure you use Scikit-Learn's StandardScaler class to scale the features before training the model.
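As a quick sketch of what the scaler does (the numbers below are made up), StandardScaler rescales each feature to zero mean and unit variance:

import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 400.0]])  # made-up features
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
print(X_scaled.mean(axis=0))  # approximately [0, 0]
print(X_scaled.std(axis=0))   # approximately [1, 1]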

Hard Margin Classification

If our training data is fully linearly separable and we strictly require every training instance to be off the margin and on the correct side of the decision boundary, this is called hard margin classification. The image shown above looks like hard margin classification.

Soft Margin Classification

In practice, training data is rarely linearly separable, so you can seldom find a straight line that perfectly separates the classes. Our aim is to keep a good balance between making the margin as wide as possible and limiting margin violations (instances that end up inside the margin or on the wrong side). This act of keeping balance is called soft margin classification.

In Scikit-Learn, you control the margin by tuning the hyperparameter C of the SVM class. The smaller the value of C, the wider the margin, but the more margin violations.
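To see the effect of C, here is a minimal sketch on a made-up blob dataset (not the digits data used below): instances whose decision function lies between -1 and 1 are inside the margin, so counting them shows how the margin widens as C shrinks:

import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import LinearSVC

# Two slightly overlapping clusters (made up for illustration)
X, y = make_blobs(n_samples=100, centers=2, cluster_std=2.0, random_state=42)

for C in (0.01, 100):
    clf = LinearSVC(C=C, loss='hinge', random_state=42, max_iter=100000)
    clf.fit(X, y)
    # |decision_function| < 1 means the instance is inside the margin
    inside = np.sum(np.abs(clf.decision_function(X)) < 1)
    print(f"C={C}: {inside} training instances inside the margin")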

Let's create a small model using Scikit-Learn's LinearSVC class on the digits dataset (a small, MNIST-like dataset of 8x8 digit images).

Import Libraries

import numpy as np
from sklearn import datasets

# Load the 8x8 handwritten digits dataset bundled with Scikit-Learn
mnist_data = datasets.load_digits()

Check the structure of the dataset

list(mnist_data.keys())

Output: ['data', 'target', 'target_names', 'images', 'DESCR']

Prepare the train and test data using SciKit-Learn

from sklearn.model_selection import train_test_split

# Binary target: 1 if the digit is 7, 0 otherwise
X_train, X_test, y_train, y_test = train_test_split(
    mnist_data.data, (mnist_data.target == 7).astype(int),
    test_size=0.20, random_state=42)

In the above code, we convert the target y to 0 if the digit is not 7 and to 1 if the digit is 7.

Let's train a model to detect whether a particular digit is 7 or not. Before we start training, let's verify our label data; it should contain only 0s and 1s, not the actual digits.

y_train

Output: array([0, 0, 0, …, 0, 1, 0])
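As an extra sanity check (a small sketch), you can also count the labels; since roughly one digit in ten is a 7, the 1s should be a small minority:

import numpy as np

# Count how many 0s and 1s the training labels contain
labels, counts = np.unique(y_train, return_counts=True)
print(dict(zip(labels, counts)))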

from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Scale the features, then fit a linear SVM with hinge loss
svm_clf = Pipeline([
    ("scaler", StandardScaler()),
    ("linear_svc", LinearSVC(C=1, loss='hinge', random_state=42)),
])
svm_clf.fit(X_train, y_train)
Output:
Pipeline(memory=None,
     steps=[('scaler', StandardScaler(copy=True, with_mean=True, with_std=True)), ('linear_svc', LinearSVC(C=1, class_weight=None, dual=True, fit_intercept=True,
     intercept_scaling=1, loss='hinge', max_iter=1000, multi_class='ovr',
     penalty='l2', random_state=42, tol=0.0001, verbose=0))])

Let’s predict something to see whether our model is working or not.

# Predict for One Observation (image)
import random
i = random.randint(0, (len(X_test)-1))
print(f"Prediction is {svm_clf.predict(X_test[i].reshape(1,-1))} and the actual value is {y_test[i]}")

Output: Prediction is [1] and the actual value is 1

Now let's check the accuracy of our model on the training set:

from sklearn.metrics import accuracy_score

y_pred = svm_clf.predict(X_train)
accuracy_score(y_train, y_pred)

Output: 0.9979123173277662
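Note that the score above is computed on the training set, so it can be optimistic. As a quick follow-up sketch, evaluating on the held-out test set (which the model never saw during training) gives a fairer estimate; the exact number depends on the split:

# Evaluate on the held-out test set instead of the training set
y_test_pred = svm_clf.predict(X_test)
accuracy_score(y_test, y_test_pred)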

This looks very good. In the next blog, we'll work on non-linear SVMs.

In conclusion, an SVM is a discriminative classifier formally defined by a separating hyperplane. It is a very popular model, widely used for classification and regression problems. You can find today's code here.