Support Vector Machines Part 1

Hello everyone. Image classification is among the most common applications of artificial intelligence. There are many ways to classify images, but in this blog I want to talk about support vector machines.

In machine learning, support vector machines are supervised learning models with associated learning algorithms that analyze data for classification and regression analysis. Since the algorithm does not require any information about the joint distribution of the data, SVMs are distribution-free learning algorithms. A Support Vector Machine (SVM) can be used for both classification and regression challenges, but it is mostly used for classification problems.

How to solve the classification problem with SVM?

In this algorithm, we plot each data item as a point in n-dimensional space. Next, we classify by finding the hyperplane that best separates the two classes. The separating line is placed so that it passes as far as possible from the nearest elements of both classes. SVM is a nonparametric classifier. It can classify both linear and nonlinear data, but it generally tries to separate the data linearly.

SVMs apply a classification strategy that uses a margin-based geometric criterion instead of a purely statistical one. In other words, SVMs do not need estimates of the classes' statistical distributions to carry out the classification task; they define the classification model through the concept of margin maximization.

In the SVM literature, a predictor variable is called an attribute, and a transformed attribute used to define the hyperplane is called a feature. The task of choosing the most appropriate representation is known as feature selection. A set of features that describes one case is called a vector.

Thus, the goal of SVM modeling is to find the optimal hyperplane separating the vector sets, with cases of one category of the target variable on one side of the plane and cases of the other category on the other side.
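As a toy illustration of that decision rule, the side of the hyperplane a point falls on is simply the sign of w·x + b. The weights below are made-up values for a 2-D example, not fitted to real data:

```python
import numpy as np

# Hypothetical weights and bias for a 2-D hyperplane w·x + b = 0
# (illustrative values only, not learned from real data).
w = np.array([2.0, -1.0])
b = -1.0

def classify(x):
    """Return +1 or -1 depending on which side of the hyperplane x falls."""
    return 1 if np.dot(w, x) + b >= 0 else -1

print(classify(np.array([3.0, 1.0])))  # 2*3 - 1*1 - 1 = 4, so class +1
print(classify(np.array([0.0, 2.0])))  # 0 - 2 - 1 = -3, so class -1
```

Training an SVM amounts to choosing w and b so that this rule separates the two categories with the widest possible gap.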

Classification with SVM

The mathematical algorithms behind SVM were originally designed for the classification of two-class linear data and were later generalized to multi-class and nonlinear data. The working principle of SVM is based on estimating the most suitable decision function that can distinguish the two classes, in other words, on defining the hyperplane that separates the two classes in the most appropriate way (Vapnik, 1995; Vapnik, 2000). In recent years, intensive studies have been carried out on the use of SVMs in the field of remote sensing, where they have been applied successfully in many areas (Foody et al., 2004; Melgani et al., 2004; Pal et al., 2005; Kavzoglu et al., 2009). To determine the optimum hyperplane, two hyperplanes parallel to it, marking its boundaries, must be determined. The points that lie on these boundary hyperplanes are called support vectors.

How to Identify the Correct Hyperplane?

It is quite easy to find the correct hyperplane with packages such as R or Python, but we can also identify it manually with simple methods. Let’s consider a few simple examples.

Here we have three different hyperplanes: a, b, and c. Now let’s identify the correct hyperplane to classify the stars and the circles. Hyperplane b is chosen because it is the one that correctly separates the stars from the circles in this graph.

If all of our hyperplanes separate the classes well, how can we choose the correct one?

Here, maximizing the distance between the hyperplane and the nearest data point of either class will help us decide on the correct hyperplane. This distance is called the margin.

We can see that the margin of hyperplane C is large compared to both A and B. Hence, we choose C as the correct hyperplane.
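The margin can be made concrete with a little NumPy. The coefficients below are illustrative and assume the canonical scaling, in which the support vectors satisfy |w·x + b| = 1:

```python
import numpy as np

# Hypothetical canonical hyperplane (support vectors satisfy |w·x + b| = 1);
# the coefficients are illustrative, not learned from data.
w = np.array([3.0, 4.0])
b = 2.0

def distance_to_hyperplane(x):
    """Geometric distance from point x to the hyperplane w·x + b = 0."""
    return abs(np.dot(w, x) + b) / np.linalg.norm(w)

# For a canonical hyperplane, the margin (the full width of the separating
# band) is 2/||w||, so maximizing the margin means minimizing ||w||.
margin = 2.0 / np.linalg.norm(w)
print(margin)  # 2 / 5 = 0.4
```

This is why SVM training is formulated as minimizing ||w|| subject to all training points lying outside the margin band.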

SVM for linearly inseparable data

In many problems, such as the classification of satellite images, it is not possible to separate the data linearly. In this case, the problem of some training samples falling on the wrong side of the optimum hyperplane is solved by introducing positive slack variables. The balance between maximizing the margin and minimizing misclassification errors is controlled by a regularization parameter that takes positive values (0 < C < ∞) and is denoted by C (Cortes et al., 1995). Support vector machines can also apply nonlinear transformations with the help of a kernel function, which allows the data to be separated linearly in a higher-dimensional space, where the hyperplane between the classes can then be determined.
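A minimal sketch of that soft-margin objective, with hypothetical data and parameters: the slack ξᵢ = max(0, 1 − yᵢ(w·xᵢ + b)) is the positive slack variable mentioned above, and C weights how heavily violations are punished.

```python
import numpy as np

# Toy training set with labels in {+1, -1}; all values are illustrative.
X = np.array([[2.0, 1.0], [0.0, 0.0], [1.0, 1.0]])
y = np.array([1, -1, 1])

w = np.array([1.0, 1.0])  # hypothetical hyperplane parameters
b = -1.5
C = 1.0                   # regularization: larger C punishes slack harder

# Slack xi_i = max(0, 1 - y_i (w·x_i + b)): zero for points outside the
# margin on the correct side, positive for margin violations.
slack = np.maximum(0.0, 1.0 - y * (X @ w + b))

# Soft-margin objective that SVM training minimizes over w and b.
objective = 0.5 * np.dot(w, w) + C * slack.sum()
print(slack)      # [0.  0.  0.5] -> only the third point violates the margin
print(objective)  # 1.5
```

A small C tolerates margin violations in exchange for a wider margin; a large C forces the fit closer to hard separation.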

For a classification process with support vector machines (SVM), it is essential to choose the kernel function to be used and the optimum parameters of that function. The most commonly used kernel functions in the literature are the polynomial, radial basis function (RBF), PUK, and normalized polynomial kernels.
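For reference, the polynomial and RBF kernels mentioned above can be written in a few lines. The parameter values below (degree, coef0, gamma) are illustrative defaults, not recommendations:

```python
import numpy as np

def polynomial_kernel(x, z, degree=2, coef0=1.0):
    """Polynomial kernel K(x, z) = (x·z + coef0)^degree."""
    return (np.dot(x, z) + coef0) ** degree

def rbf_kernel(x, z, gamma=0.5):
    """Radial basis function kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    return np.exp(-gamma * np.sum((x - z) ** 2))

x = np.array([1.0, 2.0])
z = np.array([2.0, 0.0])
print(polynomial_kernel(x, z))  # (1*2 + 2*0 + 1)^2 = 9.0
print(rbf_kernel(x, z))         # exp(-0.5 * ((1-2)^2 + (2-0)^2)) = exp(-2.5)
```

A kernel returns the inner product of two samples in the transformed feature space without computing the transformation explicitly, which is what lets SVMs separate nonlinear data.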

SVM is used for tasks such as disease recognition in medicine, assessing consumer loans in banking, and face recognition in artificial intelligence. In the next blog, I will try to talk about their applications in package programs. Goodbye until we meet again…




Article Review: Multi-Category Classification with CNN

Classification of Multi-Category Images Using Deep Learning: A Convolutional Neural Network Model

This post reviews the article ‘Classifying multi-category images using Deep Learning: A Convolutional Neural Network Model’, presented in India in 2017 by Ardhendu Bandhu and Sanjiban Sekhar Roy. The article presents an image classification model built with TensorFlow, a popular open-source library for machine learning and deep neural networks, using a convolutional neural network. A multi-category image dataset is considered for classification. A traditional back-propagation neural network has an input layer, hidden layers, and an output layer, whereas a convolutional neural network additionally has convolutional and max-pooling layers. The proposed classifier is trained to calculate the decision boundary of the image dataset. Real-world data is mostly unlabeled and unstructured; this unstructured data can be images, audio, or text. Useful information cannot easily be derived from shallow neural networks, that is, those with few hidden layers. The authors therefore propose a deep CNN classifier, which has many hidden layers and can obtain meaningful information from images.

Keywords: Image, Classification, Convolutional Neural Network, TensorFlow, Deep Neural Network.

First of all, let’s examine what classification is so that we can understand the steps laid out in the project. Image classification refers to the task of assigning images to classes from a multi-class set. To classify an image dataset into multiple classes or categories, there must be a good correspondence between the dataset and the classes.

In this article:

  1. A Convolutional Neural Network (CNN) based on deep learning is proposed to classify images.
  2. The proposed model achieves high accuracy after 10,000 training iterations on a dataset of 20,000 dog and cat images; training and validating on the dataset takes about 300 minutes.

In this project, a convolutional neural network consisting of a convolutional layer, a ReLU activation, a pooling layer, and a fully connected layer is used. A convolutional neural network is the natural choice when it comes to image recognition using deep learning.

Convolutional Neural Network

For classification purposes, the convolutional network has the architecture [INPUT-CONV-RELU-POOL-FC].

INPUT- Holds the raw pixel values of the image.

CONV- Computes the outputs of neurons connected to local regions of the input.

RELU- Applies the activation function.

POOL- Performs downsampling along the spatial dimensions.

FC- Computes the class scores.
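The [INPUT-CONV-RELU-POOL-FC] pipeline can be sketched as a tiny NumPy forward pass. The shapes and random weights below are illustrative only, not the article’s trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

# INPUT: one 8x8 grayscale image of raw pixel values.
image = rng.random((8, 8))

# CONV: slide a single 3x3 filter over the image ("valid" padding -> 6x6 map).
kernel = rng.standard_normal((3, 3))
conv = np.array([[np.sum(image[i:i + 3, j:j + 3] * kernel)
                  for j in range(6)] for i in range(6)])

# RELU: elementwise activation max(0, x).
relu = np.maximum(conv, 0.0)

# POOL: 2x2 max pooling downsamples the 6x6 map to 3x3.
pool = relu.reshape(3, 2, 3, 2).max(axis=(1, 3))

# FC: flatten and compute class scores, e.g. for the two classes cat and dog.
weights = rng.standard_normal((2, 9))
scores = weights @ pool.flatten()

print(conv.shape, pool.shape, scores.shape)  # (6, 6) (3, 3) (2,)
```

A real CNN stacks many such filters and layers and learns the kernel and fully connected weights by backpropagation, but the data flow per layer is exactly this.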

In this publication, a multi-level deep learning system for image classification is designed and implemented. In particular, the proposed framework shows:

1) how to find local regions of the image that are discriminative for the classification problem;

2) given these regions, how to build the final-level classifier.


A dataset containing 20,000 dog and cat images from Kaggle was used; the Kaggle database has a total of 25,000 images available. The images are divided into training and test sets: 12,000 images in the training set and 8,000 in the test set. Splitting the dataset into training and test sets enables cross-validation of the data and provides a check on errors; cross-validation verifies whether the proposed classifier classifies cat and dog images correctly.

The following experimental setup is done on Spyder, a scientific Python development environment.

  1. First of all, SciPy, NumPy, and TensorFlow are imported as needed.
  2. A start time, training path, and test path are defined as constants. Image height and width are set to 64 pixels. The image dataset containing 20,000 images is then loaded and, because of its large size, resized and iterated over. This step takes approximately 5-10 minutes.
  3. The data is fed to TensorFlow. In TensorFlow, all data is passed between operations in a computation graph, so features and labels must be in matrix form for the tensors to interpret them easily.
  4. TensorFlow prediction: to feed data into the model, the session is started with an additional argument mapping each placeholder name to the corresponding data. Because data in TensorFlow is passed as variables, these must be initialized before the graph can be run in a session. To update the value of a variable, an update operation is defined that can then be run.
  5. After the variables are initialized, the initial value of the state variable is printed and the update process is run. Then comes the activation function; its choice has a great influence on the behavior of the network. The activation function of a node maps the node's set of inputs to its output.
  6. Next, the hyperparameters needed to train the input features are defined. More complex neural networks involve more hyperparameters. One hyperparameter is the learning rate; another is the number of iterations we train our data for; the next is the batch size, which sets how many images are sent for classification at one time.
  7. Finally, the TensorFlow session is started, because nothing runs in TensorFlow without a session. After that, the model starts the training process.
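The article’s own TensorFlow code is not reproduced here, but the three hyperparameters from step 6 (learning rate, number of iterations, batch size) can be seen at work in a self-contained mini-batch training loop on a toy problem. All names and values below are hypothetical stand-ins, not the authors’ settings:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy, linearly separable binary dataset standing in for the image features.
X = rng.standard_normal((200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# The three hyperparameters discussed in step 6 above.
learning_rate = 0.1
num_iterations = 500
batch_size = 20

w = np.zeros(2)
b = 0.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(num_iterations):
    # Draw a mini-batch of `batch_size` examples for each iteration.
    idx = rng.integers(0, len(X), size=batch_size)
    xb, yb = X[idx], y[idx]
    pred = sigmoid(xb @ w + b)
    # Gradient of the logistic loss; each step mirrors one optimizer update.
    w -= learning_rate * xb.T @ (pred - yb) / batch_size
    b -= learning_rate * (pred - yb).mean()

accuracy = ((sigmoid(X @ w + b) > 0.5) == (y > 0.5)).mean()
print(accuracy)  # well above chance after training
```

More iterations and well-chosen learning rate and batch size improve accuracy at the cost of training time, which is exactly the trade-off the article reports.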


🖇 As the deep architecture, a convolutional neural network was used, implemented with the TensorFlow deep learning library. The experimental results below were obtained on Spyder, a scientific Python development environment, using 20,000 images with the batch size fixed at 100.

🖇 It is essential to examine the accuracy of the models on test data rather than training data. To run the convolutional neural network with TensorFlow, a Windows 10 machine with 8 GB of RAM was used, with the CPU version of TensorFlow.

📌 As the number of iterations increases, training accuracy increases, but so does training time. Table 1 shows the number of iterations against the accuracy obtained.

[Figure: Number of iterations vs. accuracy]

The graph becomes almost constant after several thousand iterations. Different batch size values can lead to different results; we set a batch size of 100 images.

✨ In this article, a high accuracy rate was obtained in classifying images with the proposed method. The CNN was implemented using TensorFlow, and the classifier was observed to perform well in terms of accuracy. However, a CPU-based system was used, so the experiment took extra training time; with a GPU-based system, training time would be shorter. The CNN model can be applied to complex image classification problems in medical imaging and other fields.



A Step-By-Step Journey To Artificial Intelligence

Machine learning (ML) is the study of computer algorithms that improve automatically through experience [1]. According to Wikipedia, machine learning involves computers discovering how to perform tasks without being explicitly programmed [2]. For most of you, the first thing that comes to mind when artificial intelligence is mentioned is undoubtedly robots. Today I have researched courses covering the basics of machine learning and artificial intelligence, and here I will list the DataCamp and Coursera courses that I am most pleased with.

DataCamp Courses

💠 Image Processing with Keras in Python: This course teaches how to build, train, and evaluate CNNs, how they learn from data, and how to interpret the results of training.
Click to go to the course 🔗
💠 Preprocessing for Machine Learning in Python: You’ll learn how to standardize your data into the right format for your model, create new features to make the most of the information in your dataset, and select the best features to improve your model’s fit.
Click to go to the course  🔗
💠 Advanced Deep Learning with Keras: This course shows you how to solve a variety of problems with the versatile Keras functional API, including training a network that performs both classification and regression.
Click to go to the course 🔗
💠 Introduction to TensorFlow in Python: In this course, you will use TensorFlow 2.3 to develop, train, and make predictions with recommender systems, image classification, and models that power significant advances in fintech. You will learn both high-level APIs that let you design and train deep learning models in 15 lines of code and low-level APIs that let you go beyond off-the-shelf routines.
Click to go to the course 🔗
💠 Introduction to Deep Learning with PyTorch: PyTorch is one of the leading deep learning frameworks, both powerful and easy to use. In this course, you will use PyTorch to learn the basic concepts of neural networks before building your first neural network to recognize handwritten digits from the MNIST dataset. You will then learn about CNNs and use them to build more powerful models that deliver more accurate results, evaluating those results and applying different techniques to improve them.
Click to go to the course 🔗
💠 Supervised Learning with scikit-learn: 

  • Classification
  • Regression
  • Fine-tuning your model
  • Preprocessing and pipelines

Click to go to the course 🔗

💠 AI Fundamentals:

  • Introduction to AI
  • Supervised Learning
  • Unsupervised Learning
  • Deep Learning & Beyond

Click to go to the course 🔗

Coursera Courses

💠 Machine Learning: Classification, University of Washington: 

  • Solving both binary and multi-class classification problems
  • Improving the performance of any model using boosting
  • Scaling methods with stochastic gradient ascent
  • Using techniques for handling missing data
  • Evaluating models using precision-recall metrics

Click to go to the course 🔗

💠 AI For Everyone:

  • What can AI realistically do, and what can it not do?
  • How do you identify opportunities to apply artificial intelligence to problems in your own organization?
  • What is it like to build machine learning and data science projects?
  • How do you work with an AI team and build an AI strategy in your company?
  • How do you navigate ethical and societal discussions about artificial intelligence?

Click to go to the course  🔗

💠 AI for Medical Diagnosis:

  • In Lesson 1, you will create convolutional neural network image classification and segmentation models to diagnose lung and brain disorders.
  • In Lesson 2, you will create risk models and survival predictors for heart disease using statistical methods and a random forest predictor to determine patient prognosis.
  • In Lesson 3, you will create a treatment effect predictor, apply model interpretation techniques, and use natural language processing to extract information from radiology reports.

Click to go to the course 🔗
As a priority step in learning artificial intelligence, I took Artificial Neural Networks and Pattern Recognition courses during my Master’s degree. I developed projects in these areas and had the opportunity to present them, and I realized that I learn more when I pass on what I know. In this article, I briefly covered the DataCamp and Coursera courses you should take; before these, I strongly recommend that you also finish the Machine Learning Crash Course.


  1. Mitchell, Tom (1997). Machine Learning. New York: McGraw Hill. ISBN 0-07-042807-7. OCLC 36417892.
  2. From Wikipedia, The free encyclopedia, Machine learning, 19 November 2020.
  3. DataCamp,
  4. Coursera,