Feature extraction techniques with MATLAB

Feature extraction, a method commonly used in computer vision, image processing and artificial intelligence projects, is the application of dimensionality reduction to raw data [1]. Machine learning has developed dramatically in recent years, which has drawn great interest from industry, academia and popular culture 🏭👩‍🔬. With the recent introduction of machine learning and deep learning models in healthcare, intelligent systems can detect many diseases early and can catch details that an expert might overlook [2]. In MRI images, which are common in medical practice, there are particular regions in which a disease can be detected ☢️. To concentrate on these regions, feature selection and feature extraction are carried out, and the results are passed to various algorithms so that the machine can detect diseases that are otherwise detected by human experts.

Detection of lesion sites in a sample brain MR image by feature extraction [3]

I’m going to continue working with the left-hand wrist MRI images taken from individuals, which I mentioned earlier in my article “preliminary stages in bone age determination with image processing”. Of course, the choice of images is entirely up to you; if you wish, you can also perform feature extraction on another image data set ✔️. The important thing is that we can recognize the objects in the image and determine which features they contain. Figure 2 briefly shows the image preprocessing stages in the MATLAB environment. During the preprocessing stages, certain filters were used to remove trivial details from the image, reduce the effect of lighting, and sharpen certain areas.

MATLAB pre-processing steps
Examining Feature Extraction

📌 Feature extraction is the process of obtaining informative details while reducing dimensionality, in order to improve performance in a project. Used in machine learning, pattern recognition, and image processing, it creates derived values (features) from the measured data given as input [3].

📌 A feature in machine learning is an individually measurable property of the observed data. Features are the inputs fed into the machine learning model to make a prediction or classification [4].

Steps of Data Analysis

Feature Extraction Techniques

Feature extraction aims to reduce the number of features in the data set by creating new features from the existing ones and then discarding the original features [6]. In line with this, the color channels of the images were first checked in MATLAB. Then, depending on the color channel in question, the RGB and gray-level information is retained through color conversions, and numerical values are obtained from the results. These numerical values will later be used in machine learning, but first they are checked manually to see whether they belong to the same classes.


🔎 Examining the RGB, HSV and LAB Color Spaces and GLCM

🔗 RGB Color Space : RGB is the most widely used color space. In this color model, each color appears in its primary spectral components of red, green and blue. The model is based on a Cartesian coordinate system, and the color subspace of interest is a cube, which is frequently used in image processing [7].

📌 When working in the RGB color space, let’s first check that the image really is an RGB image, and then store the matrices of the red, green and blue channels in separate variables.
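The original listing was written in MATLAB and is only available as a screenshot, so the following is a minimal sketch of the same idea in plain Python rather than the author's code. It assumes the image is a nested list of `(r, g, b)` tuples; the function name `split_rgb` is hypothetical.

```python
def split_rgb(image):
    """Split an H x W image of (r, g, b) tuples into three H x W channel matrices."""
    # Check that every pixel really has three components before splitting.
    if not all(len(pixel) == 3 for row in image for pixel in row):
        raise ValueError("expected an RGB image with 3 values per pixel")
    red = [[pixel[0] for pixel in row] for row in image]
    green = [[pixel[1] for pixel in row] for row in image]
    blue = [[pixel[2] for pixel in row] for row in image]
    return red, green, blue


# A tiny 2 x 2 test image: red, green, blue and mid-gray pixels.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (128, 128, 128)],
]
red, green, blue = split_rgb(image)
print(red)  # the red channel as a 2 x 2 matrix
```

Each returned matrix has the same height and width as the image, which is what makes the per-channel statistics used later possible.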

Parsing the image into R, G and B channels

Representation of Sample Red Channel Values


🔗 HSV Color Space : The name of the HSV space comes from the initials of the words hue, saturation and value (brightness). Whereas RGB describes a color as a mixture of primary colors, HSV describes it by its hue, saturation and brightness. Saturation determines the vividness of the color, while value refers to its brightness. The related HSI space separates the intensity component of a color image from the hue and saturation, which carry the color information [9].
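As a rough illustration of the conversion (in MATLAB this role is played by `rgb2hsv`), here is a small sketch using Python's standard `colorsys` module. The helper name `split_hsv` and the nested-list image layout are assumptions made for the example; all three output channels are scaled to the range 0 to 1.

```python
import colorsys


def split_hsv(image):
    """Convert an H x W image of 8-bit (r, g, b) tuples to H, S and V matrices in [0, 1]."""
    hue, sat, val = [], [], []
    for row in image:
        # colorsys expects each component scaled to the range 0..1.
        converted = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255) for r, g, b in row]
        hue.append([h for h, _, _ in converted])
        sat.append([s for _, s, _ in converted])
        val.append([v for _, _, v in converted])
    return hue, sat, val


image = [[(255, 0, 0), (0, 0, 0), (255, 255, 255)]]  # red, black, white
hue, sat, val = split_hsv(image)
print(val[0])  # brightness of each pixel
```

Black has a value of 0 and white a value of 1, while pure red is fully saturated, which matches the description of saturation and brightness above.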

Parsing the image to H, S and V properties

Example Representation of V Channel Values


🔗 CIE Color Space : The CIE 1931 color spaces were the first defined quantitative links between distributions of wavelengths in the visible electromagnetic spectrum and physiologically perceived colors in human color vision. The mathematical relationships that define these color spaces are essential tools for color management, which is important when dealing with color inks, illuminated displays, and recording devices such as digital cameras [11]. To parse the RGB image into CIELAB channels, the conversion must be performed with the rgb2lab command, which returns the L*, a* and b* channels.
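To give a feeling for what such a conversion does internally, the sketch below converts a single 8-bit sRGB pixel to CIELAB in plain Python. It is not MATLAB's implementation: it assumes the standard sRGB transfer curve, the usual sRGB-to-XYZ matrix and a D65 white point, and the function name `srgb_to_lab` is hypothetical.

```python
def srgb_to_lab(r, g, b):
    """Convert one 8-bit sRGB pixel to CIELAB (L*, a*, b*), assuming a D65 white point."""

    def linearize(c):
        # Undo the sRGB gamma curve to get linear light in 0..1.
        c = c / 255
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)

    # Linear sRGB -> CIE XYZ.
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    def f(t):
        # Cube root with a linear segment near zero, as in the CIELAB definition.
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    xn, yn, zn = 0.95047, 1.0, 1.08883  # D65 reference white
    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)


print(srgb_to_lab(255, 255, 255))  # close to L* = 100, a* = 0, b* = 0
print(srgb_to_lab(255, 0, 0))      # red: positive a* and b*
```

L* carries the lightness, while a* and b* carry the green-red and blue-yellow information, so the lightness can be analyzed separately from the color.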

Parsing the image into the three CIELAB channels (named C, I and E here)

Representation of Sample C Channel Values


🔎 GLCM (Gray-Level Co-Occurrence Matrix) : Several texture features can be extracted with the gray-level co-occurrence matrix. The texture filter functions provide a statistical view of texture based on the image histogram. These functions can provide useful information about the texture of an image, but they cannot provide information about shape, that is, the spatial relationships of pixels in an image. The GLCM fills this gap by counting how often pairs of pixels with specific gray levels occur in a specified spatial relationship [12].
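The counting idea behind the GLCM can be sketched in a few lines of plain Python. This is an illustration only, not MATLAB's `graycomatrix`: it assumes the image is already quantized to integer gray levels starting at 0 and uses a horizontal "pixel to its right" offset by default. The function names are hypothetical.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count how often gray level i has gray level j at the offset (dy, dx)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Only count pairs whose neighbor lies inside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def glcm_contrast(matrix):
    """Contrast: squared gray-level difference weighted by pair probability."""
    total = sum(sum(row) for row in matrix)
    return sum((i - j) ** 2 * count / total
               for i, row in enumerate(matrix)
               for j, count in enumerate(row))


def glcm_energy(matrix):
    """Energy: sum of squared pair probabilities."""
    total = sum(sum(row) for row in matrix)
    return sum((count / total) ** 2 for row in matrix for count in row)


image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
matrix = glcm(image, levels=4)
print(matrix)
print(glcm_contrast(matrix), glcm_energy(matrix))
```

Statistics such as contrast and energy are single numbers, so they are convenient candidates for entries of a feature vector.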

Calculation of sample GLCM values [12]

Creating the Gray-Level Co-Occurrence Matrix of the Image

🔔 Creation of the feature vector: In machine learning, feature vectors are used to represent the numerical or symbolic characteristics of an object, called features, in a mathematical, easily analyzable way. They are important in many different areas of machine learning and pattern processing. Machine learning algorithms often require a numerical representation of objects so that they can perform processing and statistical analysis. Feature vectors are the equivalent of the vectors of explanatory variables used in statistical procedures such as linear regression [13].

The resulting feature vector is a 1×28 vector of values.
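The post does not list which 28 values make up the vector, so the sketch below does not try to reproduce it. It only shows the general pattern of concatenating per-channel statistics into one flat vector, here with two hypothetical statistics (mean and standard deviation) for each of three channels.

```python
from statistics import mean, pstdev


def channel_features(channel):
    """Return [mean, standard deviation] of one channel matrix."""
    values = [v for row in channel for v in row]
    return [mean(values), pstdev(values)]


def build_feature_vector(channels):
    """Concatenate the per-channel statistics into one flat feature vector."""
    vector = []
    for channel in channels:
        vector.extend(channel_features(channel))
    return vector


# Hypothetical 2 x 2 channel matrices used only for illustration.
red = [[10, 20], [30, 40]]
green = [[5, 5], [5, 5]]
blue = [[0, 100], [0, 100]]
features = build_feature_vector([red, green, blue])
print(len(features), features)  # a 1 x 6 vector in this toy case
```

Adding more channels (for example H, S, V or the CIELAB channels) or more statistics (for example GLCM properties) lengthens the vector in the same way.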

Plot of the Feature Vector

In this way we have obtained the feature vector. Hope to see you in my next post 🙌🏻

REFERENCES

[1] Sadi Evren Seker, “Feature Extraction”, December 2008, http://bilgisayarkavramlari.sadievrenseker.com/2008/12/01/ozellik-cikarimi-feature-extraction/.

[2] M. Mert Tunalı, “Brain tumor detection via MRI images Part 1 (U-Net)” taken from Medium.

[3] Shahab Aslani, Michael Dayan, Vittorio Murino, Diego Sona, “Deep 2D Encoder-Decoder Convolutional Neural Network for Multiple Sclerosis Lesion Segmentation in Brain MRI”, September 2018, Conference Paper, MICCAI2018 (BrainLes Workshop).


[4] MC.AI, The Computer Vision Pipeline, Part 4: Feature Extraction, October 2019, https://mc.ai/the-computer-vision-pipeline-part-4-feature-extraction/.

[5] Javier Gonzalez-Sanchez, Mustafa Baydoğan, Maria-Elena Chavez-Echeagaray, Winslow Burleson, Affect Measurement: A Roadmap Through Approaches, Technologies, and Data Analysis, December 2017.

[6] Pier Paolo Ippolito, “Feature Extraction Techniques”, Towards Data Science, https://towardsdatascience.com/feature-extraction-techniques-d619b56e31be.

[7] Rafael C. Gonzalez, Richard E. Woods, Digital Image Processing, Palme Publishing, (Ankara, 2014).

[8] Retrieved from https://favpng.com/png_view/light-rgb-color-space-rgb-color-model-light-png/BsYUHtec.

[9] Asst. Prof. Dr. Caner Özcan, Karabuk University, CME429 Introduction to Image Processing, “Color Image Processing”.

[10] Retrieved from https://tr.pinterest.com/pin/391179917623338540/.

[11] From Wikipedia, The Free Encyclopedia, “CIE 1931 Color Space”, April 2020, https://en.wikipedia.org/wiki/CIE_1931_color_space.

[12] MATLAB, Image Processing Toolbox User’s Guide, “Using a Gray-Level Co-Occurrence Matrix (GLCM)”, http://matlab.izmiran.ru/help/toolbox/images/enhanc15.html.

[13] Brilliant, “Feature Vector”,  https://brilliant.org/wiki/feature-vector/, April 2020.

Image Processing Color Spaces | RGB, HSV and CMYK 🌈

Welcome to the world of image processing 🎉 I’m going to talk about color spaces in image processing, one of the areas of computer vision that is very important today. As you know, image processing basically means performing operations on an image. The use of color in image processing is motivated by two factors. First, color is a descriptor that facilitates object recognition and the extraction of objects from an image. Second, people can distinguish thousands of color shades and intensities, far more than they can shades of gray. For image analysis, we also need to specify the color channel to be used when performing various operations on the image. Before learning about color channels, let’s learn a little about color image processing. Scientifically, the foundations of the concept of color were laid in 1665 by the British physicist Isaac Newton. In his experiment, carried out in a dark room, he noticed that light entering through a hole in the door was split by a prism into a color spectrum. Below is a summary drawing of this experiment.

We mentioned the color spectrum but didn’t explain what it means. The color spectrum is what results when white light is separated into its colors by passing through a prism. In fact, we all know it very well: rainbows, which enthrall us with their colors after the rain, are the most beautiful examples of colors formed by the refraction of light 🎆.

Color spectrum formed when white light is passed through the prism 🌈

💎 The main reason for mentioning all this is to present, as descriptively as possible, the range of the spectrum that the human eye can perceive. The human eye can detect wavelengths from about 400 to 700 nm. This interval is known scientifically as the visible region. It is exactly what you see in the photograph, where white light passed through the prism separates into the visible color spectrum. This visible region is what we are going to work with. Let’s talk about the most commonly used color channels!

As you know from everyday life and from image processing, the primary colors are red, green and blue. The color model built from these colors is called RGB. These primary colors combine to form the intermediate colors we use, and different color channels have been created from the primary and intermediate colors.

🔎 As you can see, the primary colors red, green and blue combine to form yellow, cyan and magenta. We’ll see these again later when we examine the color spaces.

Formation of basic and intermediate colors 🌈

Up to this point we have covered the basic concept of color and the widely used RGB structure. Now that we have the basics, we can move on to the color spaces. In image processing, a grayscale image has only one channel. Each pixel takes a value from 0 to 255, and the shade displayed depends on that value. In gray images, instead of colors such as the red, green and blue of the RGB channels, only the intensity level, i.e. brightness, is represented; brightness here is the colorless (achromatic) notion of intensity. In color images, the storage space increases because more than one channel is used.

The purpose of the color spaces or models I will now describe is to make it easier to specify colors. A color space is the specification of a coordinate system and a subspace within it in which each color is represented by a single point. The RGB model is widely used for color monitors and color video cameras, the CMY and CMYK models for color printing, and the HSI (HSV) model was created to match the way people describe and interpret color. These are the leading models in image processing, so they are the ones I will talk about today.


RGB Color Channel

RGB is the most widely used color space. In this color model, each color appears in its primary spectral components of red, green and blue. The model is based on a Cartesian coordinate system, and the color subspace of interest is a cube, which is frequently used in image processing.

📝 When this cube is examined, the RGB primary colors are found at three corners of the cube, and cyan, yellow and magenta at three other corners. The values R, G and B are expressed as vectors in the coordinate system. As you can see, in the RGB color space the different colors are points on or inside the cube. When an RGB image is represented with 24 bits, that is, 8 bits (1 byte) per channel, the total number of colors is (2^8)^3 = 16,777,216. The cube you see above is a solid containing these 16,777,216 colors. To use the colors in this cube, there are color codes or values written in specific color models. Many websites list them, w3schools being one example. Next, using OpenCV, a very common image processing library, we will examine how to define an RGB color model and how to extract the RGB histogram of an image.
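The color count and the idea of a color code can be checked with a few lines of Python. This is only an illustrative sketch; the helper name `to_hex` is hypothetical, and the hex notation shown is the web-style code used on sites such as w3schools.

```python
bits_per_channel = 8
levels = 2 ** bits_per_channel  # 256 possible values per channel
total_colors = levels ** 3      # three channels: R, G and B


def to_hex(r, g, b):
    """Format an 8-bit RGB triple as a web-style hex color code."""
    return "#{:02X}{:02X}{:02X}".format(r, g, b)


print(total_colors)         # 16777216
print(to_hex(255, 255, 0))  # #FFFF00, yellow = red + green
```

Each corner of the cube corresponds to a code in which every channel is either 00 or FF.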

📃 OpenCV stores the channels of a color image in BGR order rather than RGB. A histogram is a chart that shows how many times each color value occurs in an image. To plot the histogram of an image, the x and y axes are specified first. Then the histogram is drawn with hist by specifying how many boxes (bins) there will be, and the image is displayed with imshow( ). When creating the histogram of an image, we first convert the original BGR image to grayscale.
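The original OpenCV listing is only available as a screenshot, so here is a library-free Python sketch of the two steps described above: converting BGR pixels to gray and counting the gray values into boxes. It assumes the common luma weights (0.299, 0.587, 0.114) for the gray conversion and requires the number of boxes to divide 256 evenly; both function names are hypothetical.

```python
def bgr_to_gray(pixel):
    """Convert one (b, g, r) pixel to a gray level with the common luma weights."""
    b, g, r = pixel
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))


def histogram(values, bins, levels=256):
    """Count gray values (0..levels-1) into equally wide boxes."""
    if levels % bins != 0:
        raise ValueError("bins must divide the number of gray levels evenly")
    width = levels // bins
    counts = [0] * bins
    for v in values:
        counts[v // width] += 1
    return counts


# White, black, red and blue pixels, written in BGR order.
pixels = [(255, 255, 255), (0, 0, 0), (0, 0, 255), (255, 0, 0)]
gray = [bgr_to_gray(p) for p in pixels]
print(gray)
for bins in (32, 8):
    print(bins, histogram(gray, bins))
```

With 256 boxes every gray level gets its own box; with 32 or 8 boxes neighboring levels are pooled, which gives a coarser but smoother chart.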

Create a histogram chart

The original image was chosen as one in which the RGB colors predominate.

📝 Histogram equalization is a method for correcting the poor color distribution that results when the color values in a picture are clustered in a narrow range. In the graph generated below, the values are clustered in the range 50-100.
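A minimal sketch of histogram equalization on a flat list of gray values is given below. It uses the common cumulative-distribution mapping and is written in plain Python for illustration; it is not the routine used to produce the charts in this post, and the function name `equalize` is hypothetical.

```python
def equalize(values, levels=256):
    """Histogram-equalize a flat list of gray values in the range 0..levels-1."""
    counts = [0] * levels
    for v in values:
        counts[v] += 1

    # Cumulative distribution: number of pixels with a value <= each level.
    cdf, running = [], 0
    for c in counts:
        running += c
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(values)
    if total == cdf_min:
        # All pixels share one value, so there is nothing to spread out.
        return list(values)

    scale = (levels - 1) / (total - cdf_min)
    return [round((cdf[v] - cdf_min) * scale) for v in values]


gray = [50, 60, 60, 80, 100, 100, 100, 100]  # values clustered between 50 and 100
print(equalize(gray))
```

After the mapping, the values that were packed into the 50-100 range are stretched over the whole 0-255 range.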

Output of the histogram chart with 256 (2^8) boxes 📊

The image is a color image, so the RGB values are processed separately: the colors are split into channels and the histogram is equalized for each of red, green and blue.

Create a histogram chart with 32 and 8 boxes

Histogram chart with 32 boxes 📊

Histogram chart with 8 boxes 📊

Histogram Normalization

To convert the generated histogram into a probability distribution, each value is divided by the sum of all the values.

Creating a normalized histogram with the normed=True parameter (called density=True in current Matplotlib and NumPy)
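The division described above can be sketched directly in Python. The function name `normalize` and the sample counts are made up for the example; note that the density-style options in plotting libraries additionally divide by the box width, which gives the same result only when each box is one unit wide.

```python
def normalize(counts):
    """Divide each box count by the total so that the values sum to 1."""
    total = sum(counts)
    if total == 0:
        raise ValueError("cannot normalize an empty histogram")
    return [c / total for c in counts]


counts = [2, 0, 1, 0, 0, 0, 0, 1]  # a hypothetical histogram with 8 boxes
probabilities = normalize(counts)
print(probabilities)
print(sum(probabilities))
```

Each entry can now be read as the probability that a randomly chosen pixel falls into that box.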

Normalized histogram boxes 📊

HSV | HSI Color Channel

Color models such as RGB, CMY and CMYK are not practical in terms of human interpretation. For example, no one describes the color of a house by giving the percentages of the primary colors that make it up. When we look at a colored object, we describe it by its hue, saturation and brightness. For this reason, the concepts of hue, saturation and brightness, which make it easy for us to define colors, were introduced. The name of the HSI space comes from the initials of hue, saturation and intensity. The HSV color space defines color with the terms hue, saturation and value. Whereas RGB describes a color as a mixture of primary colors, HSV describes it by its hue, saturation and brightness. Saturation determines the vividness of the color, while value refers to its brightness. The HSI space separates the intensity component of a color image from the hue and saturation, which carry the color information.

The hue, saturation and brightness values used in the HSV space are derived from the RGB color cube. For black, the brightness value is zero, while the hue and saturation can take any value between 0 and 255. For white, the brightness value is 255.

Conversion from RGB color space to HSV color space

Original : Image Converted to RGB and HSV Space

(a) – hue (b) – saturation (c) – intensity

CMY & CMYK Color Channel

In the CMY model, the pigment primaries cyan, magenta and yellow, combined in equal amounts, should produce black. In practice, combining these colors for printing produces a muddy-looking black. To produce a true black, a fourth color, black, is added, giving rise to the CMYK color model.

As mentioned earlier, this color model is used to produce hard copies. Because black is the dominant color in printing, a black component was added to the CMY color space, and the CMYK color space was obtained. In publishing houses, “four-color printing” refers to CMYK, while “three-color printing” refers to the CMY color model.
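The relationship between RGB, CMY and CMYK can be sketched with the simple textbook conversion below. This is an idealized illustration in plain Python: real printing workflows rely on device color profiles, and the function names are hypothetical.

```python
def rgb_to_cmy(r, g, b):
    """Convert 8-bit RGB to CMY in the range 0..1 (each pigment absorbs one primary)."""
    return 1 - r / 255, 1 - g / 255, 1 - b / 255


def cmy_to_cmyk(c, m, y):
    """Pull the shared dark part of C, M and Y out into a separate black (K) component."""
    k = min(c, m, y)
    if k == 1:
        # Pure black: use only black ink.
        return 0.0, 0.0, 0.0, 1.0
    return (c - k) / (1 - k), (m - k) / (1 - k), (y - k) / (1 - k), k


print(cmy_to_cmyk(*rgb_to_cmy(255, 0, 0)))  # red: magenta + yellow, no black
print(cmy_to_cmyk(*rgb_to_cmy(0, 0, 0)))    # black: black ink only
```

In this idealized form, black is printed with the K ink alone instead of equal amounts of cyan, magenta and yellow, which is exactly the motivation for the fourth color.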
