Hello again from an intensive and productive research day! One of the areas of image processing and computer vision I have worked on most fondly is dividing an image into regions. Since this is generally known as ‘image segmentation’ in our field, I will use that term throughout. When deep learning and image processing are blended, one of the best-known approaches to object detection is, of course, segmenting images.
There are many deep learning methods available for this task. Among the ConvNet architectures I often encounter, Mask-RCNN segments a digital image into differently colored regions. It performs this process according to the pixel values in the image and is often used alongside classification. Let’s examine the image given below 📝
Mask-RCNN segmentation result
You can find this image in the Udemy course I have noted in the references. As you may have noticed, we are not talking about plain classification; if we were, rectangular boxes with label names would usually be enough. This time the objects are also separated by color: the pixels belonging to each object are covered with a colored mask. In addition, the masked objects are identified and labeled with boxes, just as in object detection.
🍃In image segmentation, each pixel in a region, just like the regions in this image, is grouped under a mask according to shared characteristics or computed properties, such as color values, intensity, and texture.
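The idea of grouping pixels by a shared property can be shown with a tiny sketch. The toy image, the threshold value, and the "bright object on a dark background" setup below are my own assumptions for illustration; this is the simplest possible pixel-based segmentation, not what Mask-RCNN itself does.

```python
import numpy as np

# Toy 6x6 grayscale "image": a bright 3x3 square (the object) on a dark background.
image = np.zeros((6, 6), dtype=np.uint8)
image[2:5, 2:5] = 200  # object pixels

# Segment by a simple intensity threshold: pixels above it share the
# "brightness" property and are grouped into one mask.
threshold = 100
mask = image > threshold  # boolean mask: True = object, False = background

print(mask.sum())  # number of pixels assigned to the object region → 9
```

Real segmenters replace this hand-picked threshold with learned criteria, but the output has the same form: a per-pixel membership map.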
How Does Mask-RCNN Work?
To find out, we first need to master the RCNN structure a little more. I have covered CNNs in many of my previous posts, and I will leave the link here if you want to examine them.
A CNN is used to take different regions of interest from the image and classify whether an object is present in each region. The problem with this approach is that the objects of interest can have different spatial positions and different aspect ratios within the image. For this reason, region-based structures such as RCNN run much slower than other neural network architectures.
Faster-RCNN builds on the RCNN structure but delivers much faster performance. In essence, all of our effort consists of analyzing each part of the image.
Mask-RCNN is logically similar to Faster-RCNN. So what exactly does Faster-RCNN do? It lets us recognize objects by proposing candidate regions of the image and drawing boxes around the detected objects. Pay attention, because the masking step matters here: Mask-RCNN adds it on top of this structure, and the mask it predicts is binary. For each pixel of a detected object, the mask branch outputs 1 if the pixel belongs to the object and 0 if it does not.
❗️In summary, if we represent RCNN-style structures as artificial neural networks, they take an extracted feature map as input and produce a matrix of 0s and 1s, the mask, as output. Of course, it should be noted that during these steps, region proposals are also fed into the neural network.
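A 0/1 mask matrix of the kind described above can be sketched directly. The 4x4 image and the hand-written mask below are hypothetical stand-ins for a feature map and a mask branch's output; the point is only how a binary mask "cuts out" an object's pixels.

```python
import numpy as np

# Hypothetical 4x4 image (or feature map) with distinct pixel values.
image = np.arange(16, dtype=float).reshape(4, 4)

# A 0/1 matrix as a mask branch might output for one detected object:
# 1 where the pixel belongs to the object, 0 elsewhere.
mask = np.array([[0, 0, 0, 0],
                 [0, 1, 1, 0],
                 [0, 1, 1, 0],
                 [0, 0, 0, 0]])

# Element-wise multiplication keeps only the object's pixels;
# everything outside the mask is zeroed out.
segmented = image * mask
```

The same principle scales up: Mask-RCNN predicts one such binary matrix per detected object, then overlays each on the image in its own color.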
To see segmentation results in practice, I examined notebooks on Kaggle and found one used for segmenting optical coherence tomography images of diabetic macular edema. A U-Net architecture was built and trained, and the training produced the following figures.
Total loss at epoch 1000: 0.24524860084056854; validation loss: 1.4668775796890259
🍃According to Ayyuce Kizrak, U-Net is a distinct architecture built from convolutional neural network layers, and for pixel-based image segmentation it gives more successful results than classical models.
In this project, the following image shows the final plots obtained with U-Net. As you may have noticed, the edema images have undergone the necessary segmentation process.
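What gives U-Net its name and its strength is its shape: an encoder that shrinks the image, a decoder that grows it back, and skip connections between matching levels. The sketch below is only a structural illustration with pooling, upsampling, and one skip connection; the real architecture uses learned convolutions at every step, which I omit here.

```python
import numpy as np

def down(x):
    """Encoder step: 2x2 max pooling halves the spatial resolution."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def up(x):
    """Decoder step: nearest-neighbour upsampling doubles the resolution."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

x = np.random.rand(8, 8)         # input "image"
e1 = down(x)                     # encoder: 8x8 -> 4x4
e2 = down(e1)                    # bottleneck: 4x4 -> 2x2
d1 = np.stack([up(e2), e1])      # decoder: upsample, then concatenate the
                                 # skip connection from the matching encoder level
d2 = up(d1.mean(axis=0))         # fuse and upsample back: 4x4 -> 8x8

print(d2.shape)                  # (8, 8): output resolution matches the input
```

Because the output has the same resolution as the input, U-Net can assign a class to every single pixel, which is exactly what pixel-based segmentation requires.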
In my next post, we will prepare to code image segmentation together. It is always important to get to the bottom of a subject before tackling the algorithm and its coding. For this reason, it mattered to me that classification and segmentation be separated from each other by a sharp line. I hope I have given you enough explanatory information. I wish you healthy days✨
- Face and Object Detection with Computer Vision | R-CNN, SSD, GANs, Udemy.
- Ayyuce Kizrak, Görüntü Bölütleme (Segmentasyon) için Derin Öğrenme: U-Net, Medium, https://medium.com/@ayyucekizrak/görüntü-bölütleme-segmentasyon-için-derin-öğrenme-u-net-3340be23096b.