Deep Transfer Learning-Based COVID-19 Prediction Using Chest X-Rays

The novel coronavirus disease (COVID-19) is spreading very rapidly across the globe because of its highly contagious nature and has been declared a pandemic by the World Health Organization (WHO). Scientists are endeavouring to identify drugs for its effective treatment, because no foolproof drug is yet available to cure this deadly disease. Therefore, identifying COVID-19 positive people and quarantining them can be an effective way to control its spread. Many machine learning and deep learning techniques are being used quite effectively to classify positive and negative cases. In this work, a deep transfer learning-based model is proposed to classify COVID-19 cases using chest X-ray or CT scan images of infected persons. The proposed model is an ensemble of DenseNet121 and SqueezeNet1.0 and is named DeQueezeNet. The model extracts influential features from the X-ray images, which are then used to classify the COVID-19 cases. The performance study of the proposed model demonstrates its effectiveness in terms of accuracy and precision. A comparative study with recently published works has also been carried out, and the proposed model is observed to perform significantly better.


Introduction
COVID-19, which stands for coronavirus disease 2019, has emerged as an epidemic that has attracted worldwide attention since December 2019. Initially, it was referred to as the novel coronavirus 2019 (2019-nCoV), but it was later officially renamed COVID-19 by the World Health Organization (WHO) on 11 February 2020 (Guarner, 2020). Coronavirus disease is an infectious disease caused by a new virus. It causes respiratory illness in humans, with symptoms such as cough, high fever and, in severe cases, difficulty in breathing.
Historically, coronaviruses are viruses that cause diseases in mammals and birds. In humans, coronaviruses cause respiratory tract infections (RTI) (Kharitonov et al., 1995) that can vary from mild to lethal. The first virus of the family Coronaviridae to cause a major outbreak was SARS-CoV, which stands for severe acute respiratory syndrome coronavirus. Its outbreak started in Guangdong, China, and later spread to many other countries in Southeast Asia. The last case of SARS-CoV was reported in September 2003; it infected 8,000 persons and caused 774 deaths, a case mortality rate of 9.6% (Guarner, 2020).
In mid-2012, a deadly virus from this family appeared in the Middle East, known as Middle East respiratory syndrome coronavirus (MERS-CoV) (Al-Tawfiq et al., 2013; Guarner, 2020). Its case mortality rate, at around 35%, is considerably higher than that of SARS-CoV.
COVID-19 is again caused by a novel virus from the Coronaviridae family, which emerged from the city of Wuhan, China (Khan et al., 2020), and spread to several other countries. The virus is of zoonotic origin, that is, it was transmitted from animals to humans, and it is now transmitted from human to human.
The novel coronavirus disease, COVID-19, spread like wildfire, faster than SARS in 2003. Figure 1 shows the worldwide infected cases at intervals of 10 days. The vertical axis represents the numbers of confirmed, recovered and death cases. As shown in Figure 1, on 2 May 2020, the numbers of confirmed, recovered and death cases were 3,427,343, 1,093,112 and 243,808, respectively. In the beginning, during the first phase of transmission, the number of cases increased slowly, but once the disease reached the third, or community transmission, phase (in which the source of infection cannot be traced), the rate of increase became very high. For example, the numbers of confirmed cases on 2 April 2020 and 12 April 2020 were around 1.01M and 1.83M, respectively, a steep increase of 800K+ cases in a short span of 10 days. Owing to its widespread, worldwide impact, WHO declared COVID-19 a pandemic on 11 March 2020.
At the time of writing (7 May 2020), the worldwide numbers of confirmed, recovered and fatal coronavirus cases are 3.7M+, 1.2M+ and 250K+, respectively, and the mortality rate is more than 6.75%. Among the highly affected countries, leading the charts are the United States of America (from the North American region), Spain, Italy and Germany (from the European region), China, Iran (from the Eastern Mediterranean region) and many other countries including India (from the South East Asian region). As per the WHO report, the whole world is suffering from this devastating disease, and its global risk assessment level is very high.
Unfortunately, the clinical features for identifying the various stages of COVID-19 pneumonia (Yan et al., 2020) are still not clear. Because of this, infected persons are often identified only after reaching the severe stages. Scientists are endeavouring to discover drugs for effective treatment. For the time being, chloroquine and hydroxychloroquine have been reported to be effective in controlling the severity of pneumonia, improving lung imaging findings and promoting a virus-negative conversion (Gao et al., 2020; Gautret et al., 2020). Still, this is not a satisfactory solution, because it cannot effectively check the rapid increase in the number of cases, and undetected cases go on to infect many others. Therefore, identifying cases at an early stage and quarantining the infected persons is a relevant solution.
Many machine learning and deep transfer learning-based models are being applied to X-ray images to detect COVID-19 cases. But speedy and accurate detection is still a challenging task. For this, we propose a deep transfer learning-based model, an ensemble of DenseNet121 and SqueezeNet1.0, to detect COVID-19 cases effectively.
The remaining part of the article is organised as follows. The next section highlights the related work which covers a few recently published models in the area. The third and fourth sections state the problem statement and data (data acquisition and pre-processing), respectively. The fifth section highlights the transfer learning, ensemble learning and the fundamental architecture of DenseNet121 and SqueezeNet. It also defines the proposed model and its flow. The performance analysis of the proposed model is covered in the sixth section. The seventh section concludes the model and its importance.

Related Work
Several related works have applied deep learning techniques. Applying these techniques to medical images such as X-rays, CT scans and MRI has long been in practice, and the success rate is quite significant (Lundervold & Lundervold, 2019; Razzak et al., 2018). Deep learning techniques have been implemented for a wide range of tasks such as disease diagnosis (Gulshan et al., 2016), survival rate prediction (Han et al., 2020), tumour segmentation (Cui et al., 2016) and many other applications. DenseNet is also used frequently in the field of medical imaging. To mention some noteworthy works, it has been used for cardiac segmentation and diagnosis (Khened et al., 2019), a fully convolutional framework has been used for semantic segmentation (Jégou et al., 2017) and a transfer learning-based framework has been used on fundus medical image data (Xu et al., 2018); the outcomes have been quite encouraging in all these cases.
In the proposed model, we use X-ray images for prediction. X-ray images also have a wide range of applications in deep learning, especially in pneumonia prediction, the symptoms of which are very similar to those of COVID-19. A few highly accurate and clinically approved uses of X-rays in deep learning are radiologist-level pneumonia detection, for which Rajpurkar et al. (2017) constructed an architecture named CheXNet, and tuberculosis detection using a deep convolutional neural network architecture called TX-CNN.
A few works have diagnosed COVID-19 using machine learning and deep learning frameworks. Notably, Narin et al. (2020) propose a deep convolutional network-based model called ResNet50 and compare its performance against two other models, namely InceptionV3 and InceptionResNetV2, for detecting coronavirus-infected patients using X-ray images. Their results establish that, among these three, ResNet50 is the best-performing model in terms of accuracy. But the reliability of this model is questionable, as it was tested on quite a small data set, and its performance also needs to be observed on a larger data set.
Another noteworthy study has been performed by Sethy and Behera (2020), where they have stacked a support vector machine (SVM) in front of ResNet50 to detect coronavirus-infected persons using X-ray images.
Other recently published works on the same topic include a deep learning-based model that detects the disease from chest CT scans using weak labels (Zheng et al., 2020) and an Inception Net-inspired model that also uses CT scans to predict the disease. Motivated by all the aforementioned works, the proposed model aims to achieve higher efficiency standards on a larger data set so that it can be used for clinical purposes.

Problem Statement
Currently, many countries of the world, including India, are suffering from COVID-19. A few countries such as the USA, Germany, Italy and several others are facing its spread in the community transmission phase, which indicates that one infected person can infect more than 100 people with whom they come into contact. So, the solution to the problem is to identify the infected persons and put them in quarantine to stop further spread. Existing diagnosis procedures for identifying infected persons are time-consuming, which limits the rate of diagnosis when dealing with a large number of cases.
Therefore, to overcome this issue, we propose a model that can efficiently classify COVID-19 positive and negative cases well in advance.
So, given the problem, the task is to classify people as COVID-19 positive or COVID-19 negative by using a suitable automated machine learning-based model. The model takes as input X-ray images that show initial symptoms of the disease.

Data
This section describes the data acquisition and discusses the pre-processing applied to the data.

Background and Data Acquisition
The diagnosis of COVID-19 is confirmed by polymerase chain reaction (PCR), a complex procedure that requires heavy machinery and the transfer of test samples and consumes a lot of time. Also, various studies suggest that patients suffering from pneumonia can be diagnosed by observing the abnormalities in their chest X-rays or CT scans (Kermany, Goldbaum et al., 2018). The procedure used for pneumonia-like diseases can also be followed for patients showing symptoms of COVID-19 (Ng et al., 2020), which can save a lot of valuable time. As conducting X-rays is quite a feasible and quick procedure, a large number of tests can be conducted in a short time.
So, for our study, we have used a total of 401 chest X-ray images of patients, out of which 196 were obtained from the COVID-19 image data collection (Cohen et al., 2020), an open-source data set. This data set, at the time of acquisition, was small and skewed towards COVID-19 positive samples. So, to increase our sample size and skew it towards negative samples, which reflects a more realistic scenario, we have included 205 X-ray images from another open-source data set (Kermany, Zhang, & Goldbaum, 2018) available on Kaggle. These images are of normal or pneumonia-infected people and were collected for studies of bacterial and viral pneumonia prediction.

Pre-Processing
The data collected from the aforementioned two sources is pre-processed to convert it into a usable form. First, we scaled all the images of the data set to a uniform size of 512 × 512 and then divided the data into three parts: training, validation and test sets. The data set contains 401 images, of which 262 are COVID-19 negative and the remaining 139 are positive cases. A set of 73 images was selected randomly for testing purposes. Out of these 73 selected images, 21 were COVID-19 positive instances and the remaining 52 were COVID-19 negative instances. The remaining 328 images were randomly shuffled and then a 3:1 training-validation split was performed, producing 246 images for training and 82 for validation.
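The split sizes described above can be checked with a short sketch. This is an illustration only, not the authors' code: the function name `split_dataset`, the seed and the label encoding (0 = negative, 1 = positive) are assumptions, and a random shuffle will not reproduce the exact 21/52 class composition of the original test set.

```python
import random

def split_dataset(labels, n_test=73, train_fraction=0.75, seed=0):
    """Randomly split sample indices into training, validation and test sets."""
    rng = random.Random(seed)            # fixed seed (an assumption) for repeatability
    indices = list(range(len(labels)))
    rng.shuffle(indices)
    test = indices[:n_test]              # hold out the test images first
    rest = indices[n_test:]
    n_train = round(len(rest) * train_fraction)   # 3:1 training-validation split
    return rest[:n_train], rest[n_train:], test

# 262 negative (0) and 139 positive (1) labels, matching the counts in the text.
labels = [0] * 262 + [1] * 139
train_idx, val_idx, test_idx = split_dataset(labels)
print(len(train_idx), len(val_idx), len(test_idx))  # 246 82 73
```

Holding out the test images before the training-validation split keeps the test set untouched by any later model selection.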

The Model
This section introduces transfer learning, ensemble learning, DenseNet and SqueezeNet, then defines the proposed model and its detailed architecture and, finally, describes the training procedure.

Transfer Learning
In modern machine-dependent automated prediction and classification tasks, deep learning is among the most sought-after approaches, thanks to its remarkable performance. But it suffers from data dependence: its models require a tremendous amount of data to train, owing to their complex architectures. Newly emerged diseases such as COVID-19 have an insufficient amount of training data, making it difficult to use deep learning architectures for tasks related to them. Therefore, to deal with the problem of insufficient training data, an existing machine learning tool is used that transfers pre-existing knowledge from other tasks (the source domain) to the task of interest (the target domain). This whole procedure is known as transfer learning (Tan et al., 2018).
In the case of deep learning, given a very deep neural network for a generalised image classification task, the initial layers and their weights are largely task-independent; only the final layers decide which kind of image classification task is performed. We can therefore import the weights of the initial layers of a pre-trained model and then train the final layers with whatever data we have to make the model task-specific. This whole process is broadly termed deep transfer learning.

Ensemble Learning
Ensemble learning, in the context of deep neural networks, is training different neural networks with the same input data configuration and measuring the average value of the predictions made by each network to have our final prediction.
There are several methods to take the average, such as mean and mode. For our model, we have used mean predictions.
Given $N$ neural networks, each making a prediction $p_i$, the final prediction $P$ made by a mean-based ensemble method is defined as in Equation 1.
$$P = \frac{1}{N} \sum_{i=1}^{N} p_i \tag{1}$$
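The mean-based ensemble described above can be sketched in a few lines. The function name `mean_ensemble` and the probability values are hypothetical and chosen only for illustration; each model is assumed to output one probability per class.

```python
def mean_ensemble(predictions):
    """Average the per-class probabilities produced by N models for one sample."""
    n_models = len(predictions)
    n_classes = len(predictions[0])
    return [sum(p[c] for p in predictions) / n_models for c in range(n_classes)]

# Hypothetical outputs [P(negative), P(positive)] of two models for one image.
model_a = [0.25, 0.75]
model_b = [0.50, 0.50]

ensembled = mean_ensemble([model_a, model_b])
# The class with the highest averaged probability is the final prediction.
predicted_class = max(range(len(ensembled)), key=ensembled.__getitem__)
print(ensembled, predicted_class)  # [0.375, 0.625] 1
```

Averaging probabilities rather than hard labels lets a confident model outweigh an uncertain one, which is useful when only two models are combined.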

Model Selections to Ensemble
Using the same training, validation and test data, we trained five deep transfer learning models, namely AlexNet, DenseNet121, ResNet50, VGG16 and SqueezeNet1.0. The best-performing models out of these five are DenseNet121 and SqueezeNet1.0. As indicated in Figure 2, DenseNet121 and SqueezeNet1.0 return the highest precision, recall and accuracy values of all five models. We also observed a notable pattern in the test-set results of these two models: the wrong predictions made by the two classifiers were mostly on different samples.
Hence, it is appropriate to ensemble these two models (DenseNet121 and SqueezeNet1.0) so that each compensates for the prediction errors of the other.

DenseNet
For image classification tasks, Convolutional Neural Networks (CNNs) are the state-of-the-art neural architectures used around the world. Over time, in an attempt to further improve accuracy, researchers have made many alterations to their architecture. As the complexity of tasks has increased, deeper network architectures are required to process them effectively. Now that hardware availability has improved significantly, truly deep CNN architectures are being proposed and trained. The barrier on the number of layers is being broken with time, be it the 16 or 19 layers of VGG (Russakovsky et al., 2015) or the more than 100 layers in Highway Networks (Srivastava et al., 2015). There is also a trend of passing the output of a layer to layers well ahead of it, as in ResNet (He et al., 2016); that is, Layer-1 not only forwards its feature map to Layer-2, but can also forward it to Layer-3, Layer-4 or any subsequent layer. Several architectures have more than 1,000 layers.
To further improve the information flow between layers, it is proposed to connect each layer to all its subsequent layers; that is, for a given layer $l$, the feature map $x_l$ is defined as
$$x_l = H_l([x_0, x_1, \ldots, x_{l-1}])$$
where $H_l$ denotes the transformation applied by layer $l$ and $[x_0, x_1, \ldots, x_{l-1}]$ is a combined feature map of all previous layers $\{0, 1, 2, \ldots, l-1\}$. This process enables very dense connectivity between layers, and the resulting network is termed a Dense Convolutional Network (DenseNet) (Huang et al., 2017).
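To make the dense connectivity concrete, the following sketch tracks only channel counts: each layer receives the concatenation of all earlier feature maps and contributes a fixed number of new channels. The function name `dense_block` and the values used (64 input channels, 6 layers, 32 new channels per layer) are illustrative assumptions, not figures taken from this article.

```python
def dense_block(input_channels, num_layers, growth_rate):
    """Track how many channels each layer receives in a densely connected block."""
    features = [input_channels]        # channel counts of x_0, x_1, ...
    inputs_seen = []
    for _ in range(num_layers):
        concatenated = sum(features)   # layer input: all earlier maps concatenated
        inputs_seen.append(concatenated)
        features.append(growth_rate)   # each layer emits growth_rate new channels
    return inputs_seen, sum(features)

inputs_seen, out_channels = dense_block(input_channels=64, num_layers=6, growth_rate=32)
print(inputs_seen)    # [64, 96, 128, 160, 192, 224]
print(out_channels)   # 256
```

The input width grows linearly with depth, which is why transition layers between blocks are needed to reduce the feature-map size.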
Many DenseNet models are in use. Here, DenseNet121 is used as one of the models for ensembling into the proposed model, where 121 is the total number of layers present in the architecture. The task flow in the DenseNet121 model can be visualised as shown in Figure 3.
It is a deep DenseNet with four dense blocks, in which the layers between two adjacent blocks are referred to as transition layers. The transition layers change the feature-map sizes via convolution and pooling.
The architecture takes the image as input and after detection shows the output in the form of COVID-19 positive or negative.

SqueezeNet
This model was also a breakthrough in the field of deep CNNs. Its main goal was to reach AlexNet-level accuracy while greatly reducing the number of parameters compared with AlexNet.
The model, in uncompressed form, has 50x fewer parameters than AlexNet while maintaining the same accuracy level as AlexNet on ImageNet data. To achieve this, the model introduced the novel concept of the fire module. The model uses 8 fire modules in total, each having two blocks: a squeeze block consisting of 1 × 1 convolution filters and an expand block consisting of both 1 × 1 and 3 × 3 convolution filters (Iandola et al., 2016).
The first version of this model, named SqueezeNet1.0, has been used in this study.
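The parameter saving of a fire module can be illustrated by counting weights and biases. This is a sketch under assumed layer sizes: the helper names and the channel counts (96 input channels, 16 squeeze filters, 64 + 64 expand filters) are chosen for illustration and are compared against a plain 3 × 3 convolution with the same 128 output channels.

```python
def conv_params(in_ch, out_ch, kernel):
    """Weights plus biases of a kernel x kernel convolution layer."""
    return in_ch * out_ch * kernel * kernel + out_ch

def fire_params(in_ch, squeeze, expand1x1, expand3x3):
    """Parameters of a fire module: 1x1 squeeze, then parallel 1x1 and 3x3 expand."""
    return (conv_params(in_ch, squeeze, 1)
            + conv_params(squeeze, expand1x1, 1)
            + conv_params(squeeze, expand3x3, 3))

# Illustrative sizes (assumptions): both layers map 96 channels to 128 channels.
fire = fire_params(in_ch=96, squeeze=16, expand1x1=64, expand3x3=64)
plain = conv_params(in_ch=96, out_ch=128, kernel=3)
print(fire, plain)  # 11920 110720
```

The squeeze block reduces the number of channels seen by the expensive 3 × 3 filters, which is where most of the saving comes from.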

The Proposed Model
The proposed model is an ensemble of DenseNet121 and SqueezeNet1.0, built in an attempt to increase the efficiency of COVID-19 prediction.
For this model, we separately train DenseNet121 and SqueezeNet1.0 and then, for any given chest X-ray image from the data sample, take the mean of the prediction probabilities obtained from the two models to get the final prediction. That is, for any given chest X-ray image $I$, if $p_D(I)$ and $p_S(I)$ denote the probabilities predicted by DenseNet121 and SqueezeNet1.0, respectively, the final prediction is $P(I) = \frac{1}{2}\,(p_D(I) + p_S(I))$.
As shown in Figure 4, we take the X-ray image data set as input, resize all images to 512 × 512 as a part of pre-processing and reserve a fraction of the data set (73 images) for testing purposes.
The rest of the pre-processed data set is split into training and validation sets, in which 75% of the data is for the training set and the remaining 25% is for the validation set.
Using the training data set, we train DenseNet121 and SqueezeNet1.0 separately, followed by validation of the trained models.
The fully trained and validated models take the test set as input and predict the outcome in terms of probability. These predicted probabilities are then ensembled for further processing.
In the final step, the probability obtained from the ensemble is used to predict the COVID-19 positive and negative cases.

Model Training
The models used for the ensemble in the proposed model are trained as follows.
Although there are multiple variants of DenseNet and SqueezeNet, we have utilised DenseNet121 (121 is the number of layers present) and SqueezeNet1.0 (the first version of SqueezeNet) for our study.
Due to the limited availability of data, the transfer learning method has been used. We imported a DenseNet121 model pretrained on ImageNet, which is an image database consisting of millions of images organised according to the WordNet hierarchy, where each node of the hierarchy is represented by a collection of images, averaging 500 images per node (Deng et al., 2009). The model was imported from the fastai library (Howard & Gugger, 2020), and then our training and validation data were fed to it.
For training both models (DenseNet121 and SqueezeNet1.0), we have used the LR-finder module of fastai (Howard & Gugger, 2020) to find the optimal learning rate for the training data. The LR-finder works by selecting a sequence of learning rates and training the models for 1 epoch each, using each learning rate. At high learning rates, the loss starts to explode, that is, to increase exponentially. Therefore, the maximum admissible learning rates are chosen for training the models. Figures 5 and 6 show the learning rate versus loss plots obtained after running the LR-finder module, with Figure 5 showing the plot for DenseNet121 and Figure 6 the plot for SqueezeNet1.0. From these plots, the maximum permissible learning rate is 1e-04 for DenseNet121 and 2e-04 for SqueezeNet1.0, because beyond these levels the loss starts to increase exponentially. Both models are trained for 30 epochs, using their aforementioned maximum learning rates with the fit-one-cycle policy (Smith, 2017). In the fit-one-cycle policy, the learning rate is adjusted slightly after completion of each epoch.
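The idea behind a learning-rate range test can be illustrated on a toy problem. The sketch below is not fastai's implementation and does not involve the models of this study: it assumes a one-dimensional quadratic loss, trains from the same starting point with each candidate learning rate, and keeps the largest rate reached before the loss turns upward. All names and constants are hypothetical.

```python
def loss_after_training(lr, steps=20, curvature=10.0, w0=1.0):
    """Run gradient descent on 0.5 * curvature * w**2 and return the final loss."""
    w = w0
    for _ in range(steps):
        grad = curvature * w          # derivative of the toy loss with respect to w
        w -= lr * grad
    return 0.5 * curvature * w * w

def lr_range_test(lrs):
    """Return the largest learning rate tried before the loss first increases."""
    losses = [loss_after_training(lr) for lr in lrs]
    for i in range(1, len(losses)):
        if losses[i] > losses[i - 1]:
            return lrs[i - 1], losses
    return lrs[-1], losses

# Learning rates spaced geometrically from 1e-4 to 1e0.
candidate_lrs = [10 ** (-4 + 0.5 * i) for i in range(9)]
best_lr, losses = lr_range_test(candidate_lrs)
print(f"chosen learning rate: {best_lr:.4g}")
```

On this toy loss the final loss shrinks as the learning rate grows, then explodes once the rate passes the stability limit, which mirrors the shape of the learning rate versus loss curves described above. In fastai, the corresponding functionality is exposed through the `lr_find` and `fit_one_cycle` methods of a learner.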

Performance Analysis and Comparative Study
To study the performance of the proposed model, programming is done in Python 3, and the model is trained and tested on Google Colaboratory, a cloud-based programming platform.
The performance of the proposed model is analysed by plotting the confusion matrix for validation and test samples, and the comparative study indicates the importance of the model.

Performance Metrics
A few important metrics used to compare the performance of the proposed model with other models are as follows.

Losses on Training and Validation
The loss is calculated by computing the difference between the actual and predicted values of the class probabilities. While comparing models based on their losses, we look at three factors: first, the loss should be low, and the difference between the training and validation losses should be as small as possible; second, if the training loss is much lower than the validation loss, the corresponding model is overfitting; and third, if the validation loss is lower than the training loss, the corresponding model is underfitting. The loss function used in our model is the softmax loss, that is, cross-entropy computed over softmax class probabilities, a popular choice for binary classification.
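As a minimal sketch of how such a loss behaves, the code below computes a softmax over two raw class scores and the cross-entropy of the true class. It assumes the loss is cross-entropy over softmax outputs; the function names and the example scores are hypothetical.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)                           # subtract the max for stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Negative log-probability assigned to the true class."""
    return -math.log(softmax(logits)[target])

# Hypothetical scores [negative, positive] for a truly positive sample (target=1).
uncertain = cross_entropy([0.0, 0.0], target=1)
confident_right = cross_entropy([-2.0, 2.0], target=1)
confident_wrong = cross_entropy([2.0, -2.0], target=1)
print(round(uncertain, 4), round(confident_right, 4), round(confident_wrong, 4))
```

The loss is small when the model is confident and correct, equals ln 2 when the two classes are judged equally likely, and grows quickly when the model is confident and wrong.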

Accuracy
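Accuracy (Powers, 2008) is the fraction of all samples that are classified correctly:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
where TP = true positive, TN = true negative, FP = false positive and FN = false negative. A higher accuracy value indicates a more effective model.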

Precision
In machine learning, precision (Powers, 2008), shown in Equation 9, is defined as the percentage of samples predicted as a specific class that actually belong to that class. In the proposed model, the class considered is that of COVID-19 positive people.
$$\text{Precision} = \frac{TP}{TP + FP} \tag{9}$$

Recall
Recall is defined as the percentage of actual positive samples predicted correctly. In the proposed model, it is the percentage of COVID-19 positive samples that are predicted correctly. It can be calculated with the help of Equation 10.
$$\text{Recall} = \frac{TP}{TP + FN} \tag{10}$$

Area Under ROC Curve (AUROC)
The receiver operating characteristic (ROC) curve (Powers, 2008) is a graphical representation of the true positive rate (TPR) plotted against the false positive rate (FPR). TPR and FPR can be calculated as given in Equations 11 and 12. A higher TPR together with a lower FPR yields a larger area under the curve, and a larger area indicates a better model.
$$\text{TPR} = \frac{TP}{TP + FN} \tag{11}$$
$$\text{FPR} = \frac{FP}{FP + TN} \tag{12}$$
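The construction of the ROC curve and its area can be sketched as follows. The labels and scores are made-up values for illustration, the function names are hypothetical, and the sketch assumes that all scores are distinct (tied scores would need to be grouped).

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs obtained by sweeping the decision threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    tp = fp = 0
    # Visit samples from the highest score down, lowering the threshold each time.
    for _, label in sorted(zip(scores, labels), reverse=True):
        if label == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / negatives, tp / positives))
    return points

def auroc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Made-up ground-truth labels (1 = positive) and predicted positive-class scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
area = auroc(roc_points(labels, scores))
print(round(area, 4))  # 0.8889
```

Here 8 of the 9 positive-negative pairs are ranked correctly, which matches the area of 8/9 and shows the usual interpretation of AUROC as a ranking quality measure.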

Confusion Matrix
The confusion matrix is used to analyse the performance of the proposed model DeQueezeNet. Figure 7 shows the outcomes obtained by the DeQueezeNet model on the test set.
Figure 7 shows that out of 21 positive samples, 19 were classified correctly, and out of 52 negative samples, 50 were classified correctly, making it 69 out of 73 correct predictions, with an accuracy of 94.52% (69/73), precision of 90.48% (19/21) and recall of 96.15% (50/52, that is, the fraction of negative samples correctly identified). These scores emphasise the usability of the model in practical and clinical cases.
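The scores can be recomputed from the counts stated above (19 of 21 positive and 50 of 52 negative test samples classified correctly). This is a sketch with a hypothetical function name; it treats COVID-19 positive as the positive class, so the two misclassified positives are false negatives and the two misclassified negatives are false positives.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute standard metrics from the four confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),          # true positive rate
        "specificity": tn / (tn + fp),     # fraction of negatives identified
    }

# Counts derived from the test-set figures stated in the text.
metrics = classification_metrics(tp=19, fn=2, fp=2, tn=50)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

With these counts, the 94.52% figure is the accuracy, 19/21 gives 90.48% for both precision and positive-class recall, and 50/52 gives 96.15% for the negative class.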

Comparative Analysis
To study the comparative performance of the proposed model, we chose three different models for comparison: DenseNet121, SqueezeNet1.0 (both of which are used to build the proposed model DeQueezeNet) and ResNet50, proposed by Narin et al. (2020) for detecting the coronavirus disease using X-ray images.
These models are applied to predict the COVID-19 positive and negative cases. Figure 8 shows the accuracy over a varying number of epochs from 1 to 30.
Figure 8 shows that the proposed model DeQueezeNet performs better than all the other models for every epoch of training. The proposed model's performance is better even when it is trained for just one epoch.
The second comparative study observes the percentage scores of the proposed model. All the models are trained for up to 30 epochs before being applied to the test-set data sample. Figure 9 shows the percentage scores in terms of accuracy, precision and recall.
Figure 9 shows that the proposed model DeQueezeNet performs better than all three other models in terms of accuracy, precision and recall. The performance of DeQueezeNet is not only better than that of SqueezeNet1.0 and DenseNet121, but also better than that of the recently proposed ResNet50 model (Narin et al., 2020).

Conclusion
In this article, we have proposed an ensemble-based DeQueezeNet model which consists of DenseNet121 and SqueezeNet1.0. The performance evaluation is done by applying the model on X-ray images to predict COVID-19 positive and negative cases.
Performance evaluation is done by considering the performance metrics which are accuracy, precision and recall. The confusion matrix shows that the proposed model can identify the COVID-19 positive and negative cases effectively. Suitable accuracy and high precision imply the significance of the model.
A comparative study with recent work is also carried out on the above-stated performance metrics. It is observed that the performance of the proposed model is significantly better, which signifies the importance of the model and suggests that the proposed model is well suited for classifying COVID-19 positive and negative cases.