Tutorial: Deep Learning Approach for Image Recognition
by Alok Negi
Image recognition, often referred to as “image classification” or “image labeling”, is a computer vision technique that helps machines perceive and categorize what they see in images and videos. This core task is a key component in solving a broad range of computer vision-based machine learning problems.
Image recognition aims to save time by automating manual image processing while achieving a higher degree of accuracy. For this purpose, an algorithm is fed a large number of representative pictures and trained to detect specific features in them. Recognizing image patterns and extracting features is the foundation for other, more advanced computer vision techniques, such as object recognition and image segmentation. It also has a wide range of standalone applications, which makes it an important machine learning task.
Image recognition is currently used by developers in several areas, ranging from Snapchat and Instagram masks to scientific research and medical applications. While state-of-the-art image recognition does not allow technology to completely replace medical professionals, it does make their jobs easier. Image recognition applications in healthcare include automated postpartum hemorrhage prediction, skin cancer identification, surgery simulations, and pneumonia detection, all of which now assist radiologists, surgeons, and other physicians.
Image processing techniques can be applied to medical images of a wide range of diseases, from simple two-dimensional images to complex four-dimensional Doppler images. Two-dimensional images, such as x-rays used for lung conditions like tuberculosis, emphysema, and pneumonia, can be quickly obtained and stored. They also make it possible to diagnose and treat various bone disorders such as cervical spondylosis, fractures, malignancies, and subclinical changes in bone material. Using 3D images from CT scans and MRIs to diagnose and treat diseases affecting the head and abdomen is more difficult. Image analysis may also be used to help predict disease and as a prognostic tool.
Image recognition deals with recognizing images and classifying them into categories. It can be divided into single-label and multi-label recognition. In single-label recognition, a model predicts exactly one label per image, while in multi-label recognition a model can assign several labels to the same image.
Image Recognition with Deep Neural Architecture
In general, deep learning architectures suitable for image recognition are variations of convolutional neural networks (CNNs). AlexNet, VGGNet, Inception, ResNet, SqueezeNet, and MobileNet are examples of advanced CNNs that can be used for image recognition.
Image recognition models are trained to take an image as input and output one or more labels describing the image. The set of possible output labels is referred to as the target classes. Technically, each input image passes through a series of convolution layers with filters (kernels), pooling layers, and fully connected (FC) layers, after which a softmax function assigns each class a probability between 0 and 1. Figure 1 shows the full CNN flow, which processes an input image and classifies objects based on these values.
Figure 1: CNN Model for Image Recognition
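To make the flow above more concrete, here is a minimal pure-Python sketch of three of its building blocks: a convolution with a single filter, max pooling, and softmax. It is only an illustration and not the model used in this tutorial. The toy image, the edge-detecting kernel, and the raw scores are made-up values, and the function names conv2d, max_pool, and softmax are chosen just for this example.

```python
import math

def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]

def softmax(scores):
    """Turn raw class scores into probabilities between 0 and 1 that sum to 1."""
    peak = max(scores)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy 5 x 5 grayscale image (assumed values) with a vertical edge after column 1.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical 2 x 2 filter that responds to dark-to-bright vertical edges.
kernel = [[-1, 1], [-1, 1]]

feature_map = conv2d(image, kernel)  # 4 x 4 map, strongest where the edge is
pooled = max_pool(feature_map)       # 2 x 2 map after pooling
probs = softmax([2.0, 0.5])          # made-up raw scores for two classes
```

In a real CNN the filter values are learned during training, and many filters are stacked in each layer; the fully connected layers then produce the raw scores that softmax converts into class probabilities.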
Face Mask Detection & Implementation: Case Study
The COVID-19 pandemic has had a fundamental impact on our daily lives, affecting global commerce and movement. As of 2:10 pm CET on 12 March 2021, WHO had received reports of 118,754,336 confirmed cases of COVID-19, with 2,634,370 deaths. A total of 300,002,228 vaccine doses had been distributed as of 9 March 2021.
Wearing a medical face mask is one of the protective measures that can help reduce the spread of respiratory viral diseases like COVID-19. As a result, detecting face masks has become a critical task. Authorities advise the use of face masks and coverings to limit the spread of COVID-19, for public safety and hygiene purposes. In this example, we implement a deep learning model that detects face masks in masked and unmasked images. Sample images are shown in Figure 2.
Figure 2: Sample Images (Masked and Unmasked)
This detection mechanism can be used in hospitals, shopping malls, transportation hubs, restaurants, and other public places where surveillance is needed. The flowchart for the proposed work is shown in Figure 3.
Figure 3: Proposed Flowchart
Step 1: Dataset Description
The dataset contains a total of 1376 images in two classes: Masked and Unmasked. The dataset is divided into a training set and a validation set, following the structure given in Figure 4. The training set contains 1104 images, while the validation set has 272 images.
Figure 4: Dataset Structure
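Before training, it is worth checking that the files on disk match the counts above. The sketch below counts images per split and per class. It assumes a layout of the form root/split/class/image files (for example, one folder per split with one subfolder per class), which may differ from the exact structure in Figure 4. The function names count_images and split_totals and the list of file extensions are assumptions made for this example.

```python
from pathlib import Path

# Assumed image extensions; adjust to whatever the dataset actually contains.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}

def count_images(root):
    """Return {split: {class_name: image_count}} for a root/split/class layout."""
    counts = {}
    for split_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        counts[split_dir.name] = {
            class_dir.name: sum(
                1 for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
            for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir())
        }
    return counts

def split_totals(counts):
    """Add up the per-class counts to get one total per split."""
    return {split: sum(per_class.values()) for split, per_class in counts.items()}
```

Calling split_totals(count_images(...)) on the dataset folder should then report 1104 images for the training split and 272 for the validation split if the layout matches this assumption.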
Step 2: Data Pre-processing and Augmentation
For data pre-processing, images are resized to 196 x 196 x 3 (height x width x color channels), and data augmentation is used to increase the diversity of the data by applying random transformations. Figure 5 shows an image after augmentation.
Figure 5: Image after Data Augmentation
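The idea behind augmentation can be shown with a small pure-Python sketch that applies a random horizontal flip and a random shift to a tiny image stored as nested lists. The tutorial does not state which transformations were actually used, so the flip, the shift, and the function names below are assumptions for illustration; in practice a deep learning library's augmentation utilities would do this on full-size images.

```python
import random

def horizontal_flip(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]

def shift_right(image, pixels, fill=0):
    """Shift every row right by `pixels`, padding the left edge with `fill`."""
    if pixels <= 0:
        return [list(row) for row in image]
    width = len(image[0])
    pixels = min(pixels, width)
    return [[fill] * pixels + list(row[: width - pixels]) for row in image]

def augment(image, rng, max_shift=1):
    """Return a randomly flipped and shifted copy; the input is left unchanged."""
    out = [list(row) for row in image]
    if rng.random() < 0.5:  # flip about half of the time
        out = horizontal_flip(out)
    return shift_right(out, rng.randint(0, max_shift))

# Toy 2 x 3 grayscale image (made-up pixel values).
image = [[1, 2, 3], [4, 5, 6]]
rng = random.Random(0)  # fixed seed so repeated runs give the same variants
variants = [augment(image, rng) for _ in range(4)]
```

Each call produces a slightly different version of the same picture, so the network sees more variety than the original images alone provide.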
Step 3: Training Using Convolutional Neural Network
A CNN model is trained using a Python script with a batch size of 32 for 10 epochs. The batch size is the number of training examples used in one iteration, while the number of epochs is the number of complete passes the learning algorithm makes over the entire training dataset. The training history is shown in Figure 6.
Figure 6: Training History
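The relationship between batch size, epochs, and the number of iterations can be worked out directly from the numbers in this tutorial. The short sketch below does that arithmetic. It assumes the final, smaller batch of each epoch is kept rather than dropped, and the function name training_steps is chosen for this example.

```python
import math

def training_steps(num_samples, batch_size, epochs):
    """Return (iterations per epoch, total iterations), keeping the last partial batch."""
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs

# Values from this tutorial: 1104 training images, batch size 32, 10 epochs.
steps_per_epoch, total_steps = training_steps(1104, 32, 10)
last_batch_size = 1104 - (steps_per_epoch - 1) * 32

print(steps_per_epoch)   # 35 iterations per epoch
print(total_steps)       # 350 iterations in total
print(last_batch_size)   # 16 images in the final batch of each epoch
```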
For the evaluation of the CNN model, accuracy and categorical cross-entropy (log loss) are calculated. The final training accuracy and validation accuracy are 99.82 percent and 98.90 percent, while the log loss is 0.01 and 0.14, respectively. The lower the log loss, the better the model; a perfect model would have a log loss of zero. The accuracy and log loss curves are shown in Figure 7.
Figure 7: Accuracy and Loss Curve
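For readers who want to see how these two metrics are computed, here is a minimal pure-Python sketch. The one-hot labels and predicted probabilities are made-up values, not results from the model above, and the small eps clamp is an assumption added to avoid taking the logarithm of zero.

```python
import math

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean negative log-probability that the model assigns to the true class."""
    total = 0.0
    for true_row, pred_row in zip(y_true, y_pred):
        total -= sum(t * math.log(max(p, eps)) for t, p in zip(true_row, pred_row))
    return total / len(y_true)

def accuracy(y_true, y_pred):
    """Fraction of samples whose highest-probability class is the true class."""
    def argmax(row):
        return max(range(len(row)), key=lambda i: row[i])
    hits = sum(argmax(t) == argmax(p) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)

# Three made-up samples with one-hot labels over two classes.
y_true = [[1, 0], [0, 1], [1, 0]]
y_pred = [[0.9, 0.1], [0.2, 0.8], [0.4, 0.6]]

loss = categorical_cross_entropy(y_true, y_pred)
acc = accuracy(y_true, y_pred)  # two of the three samples are classified correctly
```

Note that the third sample is both misclassified and penalized heavily by the loss, because the model gave the true class a probability of only 0.4. A confident, correct prediction contributes almost nothing to the loss.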
Step 4: Select Random Image and Detect Face
To test the model, a face is detected in a randomly selected image using the Haar cascade classifier. Figure 8 shows the detected face inside a bounding rectangle.
Figure 8: Random Image and Face Detection
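Once a detector has returned bounding rectangles, the face region has to be cut out of the image before it can be passed to the classifier. The sketch below shows one way to do that in pure Python, assuming boxes in (x, y, w, h) form, which is the format OpenCV's cascade detector returns. It does not run the Haar cascade itself. The helper names largest_box and crop, and the choice to keep only the largest box, are assumptions made for this example.

```python
def largest_box(boxes):
    """Pick the (x, y, w, h) box with the largest area, or None if there are none."""
    return max(boxes, key=lambda b: b[2] * b[3], default=None)

def crop(image, box):
    """Cut an (x, y, w, h) box out of a row-major image, clipped to the image bounds."""
    x, y, w, h = box
    height, width = len(image), len(image[0])
    left, top = max(x, 0), max(y, 0)
    right, bottom = min(x + w, width), min(y + h, height)
    return [row[left:right] for row in image[top:bottom]]

# Toy 4 x 4 image whose pixel value encodes its position (row * 10 + column).
image = [[r * 10 + c for c in range(4)] for r in range(4)]
# Hypothetical detector output: one tiny box and one larger box.
boxes = [(0, 0, 1, 1), (1, 1, 2, 2)]

face_box = largest_box(boxes)
face = crop(image, face_box)
```

Keeping only the largest box is a simple heuristic for images with one main subject; for group photos, each box would be cropped and classified separately.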
Step 5: Apply Trained CNN model for Prediction
Finally, the trained CNN model is applied to the detected face to decide whether or not the person is wearing a face mask. The prediction for a random image is shown in Figure 9.
Figure 9: Model Prediction
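The last step, turning the model's output probabilities into a readable decision, can be sketched as follows. The class order ("Masked" first, "Unmasked" second) and the function name predict_label are assumptions for this example; the actual order depends on how the labels were encoded during training.

```python
def predict_label(probabilities, class_names=("Masked", "Unmasked")):
    """Return the most probable class name together with its probability."""
    if len(probabilities) != len(class_names):
        raise ValueError("expected one probability per class")
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return class_names[best], probabilities[best]

# Made-up softmax output for one detected face.
label, confidence = predict_label([0.97, 0.03])
print(label, confidence)  # Masked 0.97
```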
Preventive measures must be implemented in order to protect people from the spread of COVID-19. The proposed image recognition method could be used as one such preventive measure against the COVID-19 pandemic. It can assist in identifying security breaches, monitoring the use of face masks, and maintaining a safe working environment.