A Support Vector Machine (SVM) is a supervised learning method used to solve classification or regression problems. Suppose a dataset has n features. We can then think of an n-dimensional space formed by those features, and a hyperplane is an (n-1)-dimensional flat surface that divides this space into two parts.
For example, if a dataset has two features, the feature variables form a two-dimensional space, and a hyperplane is a line that separates the points in that space.
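As a small illustration of the two-feature case, the sketch below checks on which side of a line each point falls. The line x1 + x2 - 3 = 0, the sample points, and the helper name `side_of_hyperplane` are all made up for this example; they are not taken from any dataset.

```python
import numpy as np

# Hypothetical hyperplane in 2-D (a line): w . x + b = 0, here x1 + x2 - 3 = 0
w = np.array([1.0, 1.0])
b = -3.0

# A few made-up points with two features each
points = np.array([
    [0.0, 1.0],
    [1.0, 1.0],
    [3.0, 2.0],
    [4.0, 4.0],
])

def side_of_hyperplane(points, w, b):
    """Return +1 or -1 for each point, depending on its side of w . x + b = 0."""
    return np.where(points @ w + b >= 0, 1, -1)

sides = side_of_hyperplane(points, w, b)
print(sides)
```

The sign of w . x + b is what turns a hyperplane into a two-class decision rule: points with a non-negative value land on one side, the rest on the other.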
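An SVM selects the hyperplane that separates the data points in the n-dimensional space in the best possible way, namely the one with the largest margin, that is, the largest distance to the nearest training points of each class. Those nearest points are called support vectors.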
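In the usual formulation, the "best" separating hyperplane is the one with the widest margin between the two classes. The sketch below fits a linear SVM on a tiny made-up 2-D dataset and reads the learned hyperplane back from the model. The data points and the very large `C` value (chosen here to approximate a hard margin) are assumptions for illustration only.

```python
import numpy as np
from sklearn.svm import SVC

# Made-up, linearly separable toy data with two features
X = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0],
              [4.0, 4.0], [5.0, 4.0], [4.0, 5.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A very large C leaves almost no room for margin violations (assumption:
# this approximates a hard-margin SVM on separable data)
clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

# For a linear kernel, the hyperplane is w . x + b = 0
w = clf.coef_[0]
b = clf.intercept_[0]

# Distance between the two margin boundaries is 2 / ||w||
margin_width = 2.0 / np.linalg.norm(w)

print("w:", w, "b:", b)
print("support vectors:\n", clf.support_vectors_)
print("margin width:", margin_width)
```

Only the points listed in `support_vectors_` determine the hyperplane; moving any other point without crossing the margin would leave the fitted model unchanged.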
SVMs are used together with a kernel function, which can be linear, polynomial, or radial (RBF), among others. Readers who want to know more about how SVMs work can refer to these YouTube videos:
https://www.youtube.com/watch?v=efR1C6CvhmE
https://www.youtube.com/watch?v=Toet3EiSFcM
https://www.youtube.com/watch?v=Qc5IyLW_hns
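To make the choice of kernel concrete, here is a minimal sketch that trains sklearn's `SVC` with the linear, polynomial, and RBF kernels on the same data. The synthetic two-moons dataset from `make_moons`, its noise level, and the random seeds are assumptions picked for illustration. The scores depend on them, so treat the printed numbers as indicative only.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic two-class data that a straight line cannot separate well
X, y = make_moons(n_samples=200, noise=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# Train one classifier per kernel and record its test accuracy
scores = {}
for kernel in ("linear", "poly", "rbf"):
    clf = SVC(kernel=kernel)
    clf.fit(X_train, y_train)
    scores[kernel] = clf.score(X_test, y_test)

for kernel, score in scores.items():
    print(f"{kernel}: {score:.3f}")
```

On data with a curved class boundary like this, one would typically expect a non-linear kernel such as RBF to do at least as well as the linear one, though that is not guaranteed for every dataset.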
How to solve classification problems using Support Vector Machines (SVM) in sklearn?
Let’s read the iris dataset. It contains four features from which we can determine the type of flower, and there are three different types of flowers in the dataset. We can use the following Python code to solve this multiclass classification problem using SVM.
```python
import seaborn
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Load the iris dataset and split it into features and target
df = seaborn.load_dataset("iris")
df_features = df.drop(labels=["species"], axis=1)
df_target = df.filter(items=["species"])

# Hold out part of the data for testing
X_train, X_test, y_train, y_test = train_test_split(
    df_features, df_target["species"], shuffle=True, random_state=1
)

# Train the SVM classifier (SVC uses the RBF kernel by default)
classifier = SVC()
classifier.fit(X_train, y_train)

# Predict on the test set and measure accuracy
y_test_pred = classifier.predict(X_test)
accuracy = accuracy_score(y_test, y_test_pred)
print("Accuracy Score: ", accuracy)
```
Here, we first read the iris dataset using the seaborn library. Then, we split the dataset into features and target. df_features contains all four features, and df_target contains the output labels…