Abstract
This study presents an automated deep learning approach for the rapid detection of COVID-19 from chest X-ray images. Given the urgent need for efficient disease diagnostics, we use convolutional neural networks (CNNs) to develop a ResNet-based classification model. Our dataset comprises ten images: five COVID-19 cases and five normal chest X-rays. X-ray imaging was chosen for its speed and low radiation dose. The model demonstrates high accuracy, successfully classifying all test images, and transfer learning further enhances performance, indicating potential for broader application and improved diagnostic capability. This research addresses the current gap in automated COVID-19 diagnostics, offering a method for swift and accurate detection, with implications for improving disease management strategies.
Highlights:
- Rapid COVID-19 detection using deep learning.
- High accuracy with convolutional neural networks.
- Quick results with low radiation in chest X-rays.
Keywords: COVID-19, deep learning, chest X-ray, automated diagnosis, convolutional neural networks
Introduction
In the digital setting, it is important to distinguish between data and information. Pixel gray-level values, or data, represent a pixel's brightness at a particular location in space, whereas information is an interpretation of the available data. Just as the alphabet is used to convey information through words, data are used to convey information, and the same amount of information can be represented by different quantities of data.
Digital picture presentation in two dimensions
As shown in Figure 1-2, a two-dimensional digital image is represented as a matrix with a finite number of elements. Each element at location (r, c), where r and c are the row and column coordinates, is called an image element, or pixel. The term "pixel" denotes both the location and the value of an element: the value at point (r, c) gives the brightness of the image at that position [1].
3D Digital Picture Display
3D digital images are readily available for computer representation and appear in computer vision, medical imaging, and computed tomography [3]. A CT image is made up of multiple slices, each representing a different section of the scanned body. Unlike ordinary pixel-based digital images, a CT slice is composed of voxels, since its elements represent volume-based data. A voxel stands for a 3D volume of tissue and can be viewed as shown in Figure (1-3) [4,5]. The X, Y, and Z values define the width, length, and height (or thickness) of a voxel, respectively, while X and Y together represent the face of the voxel [6].
Digital Image Types
The most basic kind of image is the binary image, whose pixels can take only two intensities, usually shown as black and white. Numerically, binary (black-and-white) images, also called 1-bit-per-pixel images [8], use two values: 0 for black and either 1 or 255 for white. See Figure (1.4).
Binary image processing offers several benefits, such as minimal storage (no more than one bit per pixel) and ease of acquisition, but it also has drawbacks: its use is restricted, and it cannot be extended to 3D [9].
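The 0/255 convention above can be made concrete with a short thresholding sketch. Python is used here purely for illustration, since the idea is language-independent; the 4x4 pixel values and the threshold of 128 are invented for the example.

```python
# Threshold a tiny grayscale "image" (a list of rows) into a binary image.
# The 4x4 pixel values and the threshold of 128 are illustrative choices.
def to_binary(image, threshold=128):
    return [[255 if pixel >= threshold else 0 for pixel in row]
            for row in image]

gray = [
    [ 10,  50, 200, 240],
    [ 30, 130, 180,  90],
    [250,  20,  60, 140],
    [100, 160,  40, 220],
]

binary = to_binary(gray)
# Every output pixel is now one of the two allowed intensities.
assert all(p in (0, 255) for row in binary for p in row)
```

Real binary images are typically produced in exactly this way, by thresholding a grayscale capture.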
Grayscale Images: Grayscale images, also known as monochrome or one-colour images, contain only brightness information; no colour information is present.
Gray-level images are very common, partly because most modern displays and image-acquisition hardware handle 8-bit images, and grayscale is adequate for many tasks, so more complex colour imagery is often unnecessary. With 8 bits per pixel there are 256 possible shades of grey, ranging from black to white.
These 256 grey shades span the entire spectrum from black to white. The eye can distinguish only about 200 grey levels, so 256 levels still give the illusion of continuous tone. With 256 grey levels, each pixel can be stored in a single byte (8 bits) of memory [10]. However, images with 12 or 16 bits per pixel are useful in fields such as medical imaging, where the extra grey levels help when zooming into small regions of an image [11].
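The storage arithmetic is worth making concrete. A small sketch (Python, for illustration; the 28 x 28 image size is an arbitrary example):

```python
# Storage needed for an 8-bit grayscale image: one byte per pixel.
# The 28 x 28 size is an illustrative choice, not from the text.
width, height, bits_per_pixel = 28, 28, 8

levels = 2 ** bits_per_pixel                     # distinct gray levels
bytes_needed = width * height * bits_per_pixel // 8

print(levels)        # 256 gray shades, from black (0) to white (255)
print(bytes_needed)  # 784 bytes for the whole image
```

Doubling the bit depth to 16 bits per pixel doubles the storage but raises the number of gray levels from 256 to 65,536, which is why medical imaging accepts the cost.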
Multispectral images:
An image formed at specific frequencies across the electromagnetic spectrum is referred to as a multispectral image. It can contain data from radar, X-ray, infrared, ultraviolet, or even acoustic sources, as illustrated in Figure (1.8). Because such information lies outside the human visual spectrum, it is not perceived directly by the human visual system; the benefit of multispectral imaging is that it lets viewers see things that would otherwise be invisible, much as X-rays allow practitioners to assess bone structure without surgery. Satellite systems, underwater sonar systems, and infrared imaging systems, among others, provide these kinds of images, which find applications in many fields, including medical diagnostics [1] [11].
2.1 X-ray Imaging and Deep Learning
X-rays were identified by Wilhelm Conrad Röntgen as a form of electromagnetic radiation in 1895. While general awareness of X-rays dates from that discovery, X-ray production and usage have evolved significantly in recent times. An important application is imaging, which makes it possible to visualize structures that are otherwise hidden or unresolved. The technique is relevant to many fields: biological imaging, medicine, materials research, security imaging, and structural integrity testing, among others. Even art history now benefits from the technology: fake artwork can be distinguished from authentic pieces based on structural composition, and high-harmonic generation now allows coherent X-rays to be produced at much lower cost than before. Differentiating real pearls, which shine red under X-ray light, from fake ones is similarly straightforward. Worldwide access to high-energy, high-intensity X-rays for research and industry is provided by several facilities, including free-electron laser sources in the X-ray range such as the European XFEL (1).
2.2 What is the utility of x-rays?
The spacing between atoms in a crystal lies in the angstrom range, comparable to X-ray wavelengths, so the atomic arrays within crystals can act as a diffraction grating for X-rays. Laue's groundbreaking discovery of X-ray diffraction by crystals earned him the Nobel Prize in Physics in 1914. The following year, 1915, William Henry Bragg and William Lawrence Bragg were awarded the Nobel Prize for relating crystal structure to the diffraction pattern, laying the foundation for X-ray crystallography, now widely used for purposes ranging from stress diagnosis in aircraft engines to the determination of protein structures. Debye's contributions to understanding the behaviour of these high-energy, short-wavelength photons earned him the 1936 Nobel Prize in Chemistry; their enormous energy and momentum also present absorption challenges, since absorption rises with the electron density of the material.
In a similar fashion, consider an object with thick and thin parts, as in Figure 1-3. If the object were assembled from painted plywood alternating with tissue paper, you could trace an outline of the plywood pieces, in shadow on the wall behind, by firing bullets at it, because bullets pass through the soft tissue paper but not always through the plywood. If bricks were used instead, projectiles heavier than bullets would be needed, since bullets might not pass through at all. X-ray inspection works the same way: automobiles, steel shipping containers, and the like are inspected with high-energy (hard) X-rays, as depicted in Figure 2-3, while softer X-rays are used for detecting broken bones, since the components of an object imaged with hard X-rays appear as shadows of differing depth.
In solids, the refractive index for X-rays is barely different from one, because X-rays interact only weakly with solid materials. This low refractive index is why radiography produces sharp shadows, but it also poses a substantial barrier to the development of refractive optics analogous to the lenses used for visible light, which are rendered largely ineffective.
Making optics such as Fresnel zone plates, or even pinholes for pinhole cameras, is difficult owing to the penetrating nature of X-rays, since the masking material has to be thicker than the diameter of the apertures. An X-ray photon's energy is in the kiloelectronvolt range, while valence electron transitions in materials typically occur in the sub-electronvolt region. As a result, X-ray characteristics are less sensitive to chemical state than those of visible light, for which chemistry can quickly change opacity or colour. However, because X-ray energies are close to the ionization energies of core electrons, X-rays can probe distinct atomic transitions; X-ray fluorescence and X-ray absorption spectroscopy are therefore used for elemental analysis. Siegbahn founded the field of X-ray spectroscopy, which won him the 1924 Nobel Prize in Physics.
2.3 Production of X-rays
A typical block layout of the x-ray generator is shown in Figure 1. The generator comprises an x-ray tube, a DC filament power source, a turbomolecular pump, and a continuous high-voltage power supply (SL150, Spellman). The structure of the x-ray tube is detailed in Figure 2. Its major components are a molybdenum hole target with a 2.0 mm internal diameter, a tungsten hairpin cathode (filament), a Wehnelt focusing electrode, a polyethylene terephthalate x-ray window 0.25 mm thick, and a stainless steel tube body. The turbomolecular pump keeps the demountable diode at a pressure of about 0.5 mPa. The anode of the x-ray tube receives a positive high voltage while the cathode is held at ground potential. During the experiment, the tube voltage ranged between 25 and 35 kV, and the tube current was kept within 10 μA by regulating the filament temperature. To achieve optimal x-ray output, the exposure time must be precisely managed.
X-rays are generated when the electron beam from the cathode is converged onto the target by the focusing electrode. Pure molybdenum K-series x-rays can be produced without a filter because, according to Sommerfeld's hypothesis, bremsstrahlung rays are not emitted in the direction opposite to the electron track. The generator and its usage are described by Sato S., Osawa A., Matsukiyo H., et al., Proceedings of SPIE - The International Society for Optical Engineering 7080, August 2008.
2.4 Application of X-rays
Most uses of X-rays depend on their penetrating power, which differs among materials: wood and flesh are easily penetrated, while denser materials such as lead and bone are more opaque. The energy of the X-rays also determines the depth of penetration: low-energy soft X-rays do not travel far into materials, whereas high-energy hard X-rays penetrate much further.
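The depth of penetration follows the Beer-Lambert law, I = I0·exp(-μx), where the attenuation coefficient μ grows with the density of the material and falls with X-ray energy. A small sketch (Python, for illustration; the μ values below are invented placeholders, not measured coefficients):

```python
import math

# Beer-Lambert attenuation: I = I0 * exp(-mu * x).
# The attenuation coefficients are illustrative placeholders, not
# measured values; denser materials such as bone get a larger mu.
def transmitted_fraction(mu_per_cm, thickness_cm):
    return math.exp(-mu_per_cm * thickness_cm)

soft_tissue = transmitted_fraction(0.2, 5.0)   # easily penetrated
bone        = transmitted_fraction(3.0, 5.0)   # mostly absorbed

# Soft tissue transmits far more of the beam than bone of equal
# thickness, which is why bone casts the darker shadow on the detector.
assert soft_tissue > bone
```

The same relation explains the energy dependence noted above: a harder (higher-energy) beam has a smaller effective μ in a given material, so more of it survives to the detector.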
When X-rays pass through the body, the relative opacity of the various parts determines the darkness of the shadows cast on a photographic plate or screen, creating images that reveal internal structures. The images produced by this technology are called radiographs or skiagraphs; industry uses them for nondestructive testing, while medicine uses them for diagnosis. In fluoroscopy, a bright screen replaces the plate; though the image quality is lower, it saves time and money compared with radiography. Another application is the CAT scan, which combines X-rays with a computer to generate cross-sectional images, making it easier to identify anomalies within the body than with traditional two-dimensional plain X-ray imaging.
Examining and analysing paintings is another use for radiography: such studies can reveal the age of a work and the brushstroke techniques used, which can help identify or confirm the artist. Several X-ray methods can also provide magnified views of the structure of opaque objects and can benefit the quantitative study of many materials; collectively they are known as X-ray microscopy or microradiography. (5) X-ray Characterization, The Columbia Electronic Encyclopedia, 6th ed., Columbia University Press, 2012.
2.5 Risks from X-rays
Reviewing radiation risks falls to international and national bodies, including the International Commission on Radiological Protection (ICRP), the United States National Council on Radiation Protection and Measurements (NCRP), the Radiation Protection Division of the UK Health Protection Agency, and the United Nations Scientific Committee on the Effects of Atomic Radiation (UNSCEAR). These organizations regularly review the international literature and provide an objective assessment of the risks of exposure to ionizing radiation. The risk model currently most accepted for radiation protection at low levels holds that the likelihood of radiation-induced cancer and genetic disorders increases with radiation dose, with no threshold. Under this Linear No Threshold (LNT) concept, risk rises linearly with every increase in exposure above natural background levels. LNT is not an ironclad rule: it is considered a reliable operational standard, but its validity has been questioned both by those who believe low doses are more harmful than LNT suggests and by those who think low doses are less dangerous, or even beneficial, a position often termed hormesis. This paper evaluates the arguments for and against the LNT hypothesis, particularly those stemming from presentations at the 2004 Radiological Congresses supporting hormesis and those backing the LNT model. We conclude by discussing how adoption of the LNT model shapes the assessment of medical X-ray risks, and thereby the justification and control of patient protection during medical exposures: patients cannot be denied the clinical benefits of modern diagnostic radiology, yet their safety must not be compromised. (2) Meara J.R., Wall B.F., Kendall G., Edwards A.A.
The statistical capacity to clearly demonstrate negative health effects from the far lower doses typically encountered in diagnostic practice remains limited, owing primarily to methodological reasons. The particular difficulties and basic limitations of typical studies preclude convincing evidence of the real magnitude of any potential effect at dose levels of a few tens of mGy, whether for cancer induction or for the other long-term adverse outcomes being sought. The lack of experimental data produced under controlled conditions prevents the mechanisms responsible for the observations from being uncovered, and this weakness means the results cannot provide specific grounds for amending existing concepts and estimates. Diagnostic estimates refer to individual situations; the sources of information are the recorded doses received by the examinee, where the actual exposure is intended to obtain clinical benefit [6].
Exposed and reference populations are typically chosen on statistical criteria, although such selection is not straightforward. In real-world scenarios there are further challenges, such as ambiguity in dose estimates, residual effects of confounding variables, and biases of various kinds. Despite these limits on epidemiologists' ability to determine actual radiation harm at low doses, where any adverse effects are small, organizations such as UNSCEAR have focused their analysis on this zone, which also gives an opening to critics who believe concerns over radiation are unwarranted and who aim to show exaggeration or misrepresentation. One should note, however, that lack of evidence for an effect is different from evidence against it. Additionally, selection effects strongly influence the findings of epidemiological investigations; among them is the "Healthy Worker Effect," which will be discussed further later on.
Conversely, those arguing that radiation risks are higher than LNT extrapolation indicates sometimes cite specific studies in which random variation produced apparently higher hazards at low doses, while ignoring data contradicting their position. Publication bias compounds the problem: research showing significant findings enters the literature faster than research with ambiguous results.
All of this calls for well-designed research, since the grey literature may include poorly designed epidemiological studies that invite misinterpretation as proof that risks are exaggerated or miscalculated.
2.6 Definition of deep learning
Although deep learning is a relatively new field within artificial intelligence, the ideas behind it are familiar to chemical engineers. The methods that make deep learning possible, artificial neural networks, are described in this section.
Artificial intelligence is the imitation by a computer of intelligent human behaviour (Figure 2-5). Machine learning (ML) is the branch of artificial intelligence that gives computers the ability to learn from data without being explicitly programmed. Deep learning is a subfield of ML in which self-learning algorithms called artificial neural networks (ANNs) take inspiration from the structure and function of the brain. They receive no direct instructions on how to solve a problem; instead, they are trained to discover models and patterns in data. The basic unit of an artificial neural network is the perceptron, an algorithm that mimics a biological neuron [5]. Although introduced in 1957, ANNs were not widely adopted until recently, because producing effective results demands amounts of training data and computational power that were long unavailable.
To grasp the recent escalation in computing power, consider that in 2012 the Google Brain project required a specially constructed computer that cost $5,000,000 and consumed 600 kW of electricity. Two years later, in 2014, the Stanford AI Lab achieved greater processing power with three GPU-accelerated PCs, each costing $33,000 and consuming only 4 kW. Today, a specialized Neural Compute Stick delivering over 100 gigaflops can be bought for about $80.
2.7 The perceptron
A typical human brain contains approximately one hundred billion neurons. A neuron receives information from other neurons through its dendrites, sums all the inputs, and fires if the total exceeds a certain threshold. The output produced is then transmitted to other connected neurons (Figure 2.6).
The perceptron is a mathematical model of its biological counterpart [6]. Like a real neuron, it computes an output from its inputs. Each input has a weight; the inputs are individually multiplied by their weights and summed, and the result is passed to an activation function, which determines whether the neuron fires and what its final output is (Figure 2.7).
The most basic kind of activation function is the step function, though there are many others with distinctive features [7]. In a step function, if the input exceeds a certain threshold the output is 1; otherwise it is 0. Suppose we have a perceptron with two inputs (x1 and x2) such that x1 = 0.9 and x2 = 0.7, weighted by w1 and w2, respectively.
w1 = 0.2 and w2 = 0.9
Assuming the threshold of the activation function is set at 0.75, the weighted sum of the inputs is:
x1w1 + x2w2 = (0.9×0.2) + (0.7×0.9) = 0.18 + 0.63 = 0.81
The total input surpasses the threshold (0.75), so the neuron fires, and since we chose a simple step function the output is 1. How does all of this contribute to intelligence? The first step toward intelligence is the ability to acquire skills through training.
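The worked example above can be reproduced in a few lines of code (Python here for illustration, while the implementation later in this work uses MATLAB):

```python
# A minimal perceptron with a step activation, reproducing the worked
# example above: x1 = 0.9, x2 = 0.7, w1 = 0.2, w2 = 0.9, threshold 0.75.
def perceptron(inputs, weights, threshold):
    total = sum(x * w for x, w in zip(inputs, weights))
    return (1 if total > threshold else 0), total

output, total = perceptron([0.9, 0.7], [0.2, 0.9], 0.75)
print(round(total, 2))  # 0.81, the weighted sum
print(output)           # 1: the weighted sum exceeds the 0.75 threshold
```

Changing the threshold to anything above 0.81 would silence the neuron, which is exactly the behaviour the step function is meant to capture.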
2.8 Training a Perceptron
To train a perceptron, one provides it with many training examples and computes the output for each sample. After each sample, the weights are adjusted to reduce the output error, generally defined as the discrepancy between the anticipated (target) output and the actual output (Fig. 2.8).
The perceptron's ability to perform classification is critical, because many intelligent activities depend on it. One common application is identifying spam e-mails: an algorithm that recognizes features specific to spam is trained on a dataset containing e-mails that are marked as spam and e-mails marked as regular, non-spam messages. Algorithms of this kind can likewise distinguish normally functioning valves from malfunctioning ones, determine whether a tumour is benign or malignant, or even learn your musical preferences and classify songs by how likely you are to enjoy them, all without explicit instructions.
Perceptrons are strong classifiers. By themselves, however, they can learn only linear patterns; complex, nonlinear patterns are beyond their capabilities.
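The training procedure described above can be sketched with the classic perceptron learning rule (a Python sketch; the AND function, the zero initial weights, and the 0.1 learning rate are illustrative choices, not from the text):

```python
# Sketch of the perceptron learning rule on the linearly separable AND
# function. After each sample the weights are nudged in proportion to
# the error, as described above. Learning rate 0.1 is illustrative.
def train_perceptron(samples, epochs=10, lr=0.1):
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - out            # discrepancy: target vs actual
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b

and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(and_samples)

predict = lambda x1, x2: 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
print([predict(x1, x2) for (x1, x2), _ in and_samples])  # [0, 0, 0, 1]
```

AND is linearly separable, so this converges; the same loop run on XOR would never settle, which is precisely the limitation of the single perceptron noted above.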
2.9 Multilayer perceptrons
Simple patterns can be learned by a single neuron, but learning capacity increases dramatically when many neurons are coupled. In the human brain, each of the roughly 100 billion neurons has around 7,000 connections with other neurons, and a three-year-old's brain is thought to contain about one quadrillion connections between neurons.
An artificial neural network with one or more layers of neurons between the input and output is known as a multilayer perceptron (MLP). These networks are often called feedforward neural networks, because data flow in one direction only, from the input layer to the output layer. Each neuron in a given layer is connected to every neuron in the subsequent layer. The layers located between the input and output layers are termed hidden layers (Figure 2.10).
MLPs are frequently used for pattern recognition, classification, approximation, and prediction. Unlike single perceptrons, they can also learn complex patterns that cannot be separated by a line or other simple curve. The more neurons and layers an MLP has, the more complex the patterns it can learn (Figure 2.11).
MLPs have been applied successfully across many fields of artificial intelligence, including speech recognition [9], continuous stirred-tank reactor control [11], and prediction of thermal conductivity in aqueous electrolyte solutions [10]. An example MLP structure for digit recognition takes a bitmap grid of inputs (e.g., 9x12) representing individual pixels, passes them through one or more hidden layers, and ends in ten output neurons indicating the recognized digit (0-9). Such an MLP is trained by presenting pictures of numbers: the neuron weights, which are initially random, are modified through feedback about whether the network identified each image correctly, and these weights determine future classifications.
A practical MLP for handwritten digit recognition has 784 input perceptrons, which take a 28 x 28 pixel bitmap image of the handwritten digit, 15 hidden-layer neurons, and 10 output neurons [12]. A model of this type is typically trained on a batch of 50,000 handwritten number images along with their annotations. Trained on a good computer system, it can recognize new handwritten numerals with a striking accuracy of 95% within a few minutes.
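The shape of that 784-15-10 network can be sketched as follows (Python, for illustration; the weights are random, as they are before training, so only the dimensions and the forward pass are meaningful):

```python
import math
import random

# Shape sketch of the MLP described above: 784 inputs (28 x 28 pixels),
# one hidden layer of 15 neurons, and 10 outputs (digits 0-9).
random.seed(0)

def layer(n_in, n_out):
    # One weight row per output neuron, fully connected to the inputs.
    return [[random.uniform(-1, 1) for _ in range(n_in)]
            for _ in range(n_out)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, weights):
    # Weighted sum followed by activation, for every neuron in the layer.
    return [sigmoid(sum(w_i * x_i for w_i, x_i in zip(w_row, x)))
            for w_row in weights]

hidden_w = layer(784, 15)
output_w = layer(15, 10)

image = [0.0] * 784                  # a blank 28 x 28 bitmap, flattened
hidden = forward(image, hidden_w)
scores = forward(hidden, output_w)

print(len(hidden))  # 15 hidden activations
print(len(scores))  # 10 output scores, one per digit
```

Training would adjust hidden_w and output_w from feedback on labelled digits; untrained, the scores are meaningless, which is the point of the 50,000-image training batch mentioned above.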
In a similar scenario, researchers used data from Perry's Chemical Engineers' Handbook to train an MLP for estimating substance viscosity [13].
Results and Discussion
3-1 MATLAB-Based Deep Learning Model for COVID-19 Detection on Chest X-ray
The dataset comprises ten example images: five COVID-19 cases and five standard X-rays. This study employed chest radiography because an X-ray examination typically takes less than 15 minutes and delivers a comparatively low radiation dose. Because the X-ray images are digital, a physician can view them on a screen within minutes.
3-2 ResNet-50
A convolutional neural network with 50 layers is referred to as ResNet-50. ResNet stands for Residual Network, a type of convolutional neural network that forms the backbone of numerous computer vision applications; this model won the 2015 ImageNet competition, an achievement worth noting. Users can load a pre-trained version of the network, which has been exposed to a diverse range of over a million images and can classify photos into 1000 distinct categories. The workflow for applying deep learning to COVID-19 detection is illustrated in Figure (3-1).
The first step in the process is to create an image datastore. We store the photographs in two subfolders named by class: infected X-ray images are kept in the covid subfolder, while uninfected X-ray images are kept in the normal subfolder.
3-3 Dataset
The primary folder is named dataset.
Testing and training data are kept separate; testing is crucial for measuring accuracy. We then fine-tune the ResNet-50 network to our task: ResNet-50 was pre-trained on over a million images across 1000 classes, whereas this instance has ten images in two classes. The training parameters, including the batch size, maximum number of epochs, and initial learning rate, must then be specified; if necessary, we adjust these hyperparameters and continue training the model. After the model has been trained, we test it on the testing dataset to verify its accuracy. If everything goes as planned, the model could even be deployed on hardware for real-time applications.
1. Acquire and import the data
Let's use imageDatastore to create an image database. Labels are applied to all photographs automatically.
close all; clear all; clc;
% Image datapath (adjust the path as necessary)
datapath = 'dataset';
% Image Datastore
imds = imageDatastore(datapath, 'IncludeSubfolders', true, ...
    'LabelSource', 'foldernames');
% Split-up of the labels
Total_split = countEachLabel(imds)
2. Display a few images
Examine the photographs visually to observe the variations between the classes. We read six photos at random using a random permutation.
num_images = length(imds.Labels); % Number of images
% View random images
randper = randperm(num_images, 6); % random integer permutation
for idx = 1:length(randper)
    subplot(2, 3, idx); % create axes in tiled positions
    imshow(imread(imds.Files{randper(idx)}));
    title(string(imds.Labels(randper(idx))));
end
Because this dataset contains only a small number of photos, we split it into equal folds for cross-validation, training a separate model on each fold's training subset. Compared with the usual hold-out validation, this kind of validation study lets us measure performance more reliably. Since the ResNet-50 design has proven very successful in various medical imaging applications, we use it in this work [1,2].
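The strided fold assignment used by the MATLAB code in this section (test_idx = fold_idx:num_folds:num_images) can be sketched as follows (Python, for illustration; num_folds = 5 is an arbitrary choice that splits the ten images evenly):

```python
# Sketch of a strided k-fold split: fold k takes every k-th image as
# its test set and trains on the rest. num_images = 10 matches the
# ten-image dataset; num_folds = 5 is an illustrative choice.
num_images = 10
num_folds = 5

folds = []
for fold_idx in range(1, num_folds + 1):
    # MATLAB's fold_idx:num_folds:num_images, with 1-based indices
    test_idx = list(range(fold_idx, num_images + 1, num_folds))
    train_idx = [i for i in range(1, num_images + 1) if i not in test_idx]
    folds.append((test_idx, train_idx))

print(folds[0][0])  # [1, 6]: fold 1 tests on images 1 and 6

# Every image appears in exactly one test fold.
all_test = sorted(i for test, _ in folds for i in test)
assert all_test == list(range(1, num_images + 1))
```

Each image is therefore tested exactly once by a model that never saw it during training, which is what makes the fold-averaged accuracy a fairer estimate than a single hold-out split.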
Split the data for training and testing
% Number of folds
num_folds = 4;
% Loop over each fold
for fold_idx = 1:num_folds
    fprintf('Processing %d among %d folds\n', fold_idx, num_folds);
    % Test indices for the current fold
    test_idx = fold_idx:num_folds:num_images;
    % Test cases for the current fold
    imdsTest = subset(imds, test_idx);
    % Training indices for the current fold
    train_idx = setdiff(1:length(imds.Files), test_idx);
    % Training cases for the current fold
    imdsTrain = subset(imds, train_idx);
    % ResNet-50 architecture
    net = resnet50;
    lgraph = layerGraph(net);
    % Number of categories (two in this example: Normal and COVID-19)
    numClasses = numel(categories(imdsTrain.Labels));
    Create the new layers.
    % New learnable layer
    newLearnableLayer = fullyConnectedLayer(numClasses, ...
        'Name', 'new_fc', ...
        'WeightLearnRateFactor', 10, ...
        'BiasLearnRateFactor', 10);
    % Replace the final layers with the new ones
    lgraph = replaceLayer(lgraph, 'fc1000', newLearnableLayer);
    newsoftmaxLayer = softmaxLayer('Name', 'new_softmax');
    lgraph = replaceLayer(lgraph, 'fc1000_softmax', newsoftmaxLayer);
    newClassLayer = classificationLayer('Name', 'new_classoutput');
    lgraph = replaceLayer(lgraph, 'ClassificationLayer_fc1000', newClassLayer);
Check the architecture of the layer graph:
plot(lgraph);
The plot looks dense because the network has 177 layers.
Preprocessing the dataset
A function handle specifies the function that reads the data, so the preprocessing function is applied automatically during training. This preprocessing is needed because photographs may be either grayscale or colour, that is, one matrix for a grayscale image and three matrices for a colour image; during training the matrices must have the same dimensions, otherwise an error is generated.
% Preprocessing function
imdsTrain.ReadFcn = @(filename) preprocess_Xray(filename);
imdsTest.ReadFcn = @(filename) preprocess_Xray(filename);
Training options
% Training options: we select a small mini-batch size because there are few photos
options = trainingOptions('adam', ...
    'MaxEpochs', 30, 'MiniBatchSize', 8, ...
    'Shuffle', 'every-epoch', ...
    'InitialLearnRate', 1e-4, ...
    'Verbose', false, ...
    'Plots', 'training-progress');
Data augmentation
An image data augmenter configures several preprocessing options for image augmentation, such as scaling, rotation, and reflection. If you have comparable types of images, you can use data augmentation to improve classification on any kind of chest X-ray image.
% Data augmentation
augmenter = imageDataAugmenter( ...
    'RandRotation', [-5 5], ...
    'RandXReflection', 1, 'RandYReflection', 1, ...
    'RandXShear', [-0.05 0.05], 'RandYShear', [-0.05 0.05]);
viii - Resize all training images to [224 224], since the input image size of the ResNet architecture is fixed at [224 224].
% Resize all training images to [224 224] for the ResNet architecture
auimds = augmentedImageDatastore([224 224], imdsTrain, 'DataAugmentation', augmenter);
ix. The training step
Training took three to four minutes in my case. The deep learning model is trained using the trainNetwork function, which needs three input arguments: the image data, the layer graph, and the training options. All of them were generated in the steps above.
% Training
netTransfer = trainNetwork(auimds,lgraph,options);
Figure (3-3) (a-b-c-d): as the training progress shows, accuracy reaches about 100% over 30 epochs, and the loss is also reduced to nearly zero.
x. Testing
All test pictures should also be resized to [224 224] for the ResNet architecture.
% Resize all test images to [224 224] for the ResNet architecture
augtestimds = augmentedImageDatastore([224 224],imdsTest);
% Testing: labels and posterior for each case
[predicted_labels(test_idx), posterior(test_idx,:)] = classify(netTransfer,augtestimds);
The predicted classes are stored in predicted_labels.
Accuracy
% Accuracy
accuracy = sum(predicted_labels == imdsTest.Labels) / numel(predicted_labels) * 100
In my case the accuracy is 100%.
As you can see, every image in the testing data has been correctly identified; not a single one has been misclassified. This indicates that we have created a well-suited model for categorising COVID-19 chest X-ray pictures.
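The accuracy formula above — correct predictions divided by the total number of predictions, times 100 — can be sketched in Python for illustration (the paper's own code is MATLAB; the function name here is hypothetical):

```python
def accuracy_percent(predicted, actual):
    """Percentage of predictions that match the ground-truth labels,
    mirroring sum(predicted==actual)/numel(predicted)*100."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(predicted) * 100

pred = ["covid", "covid", "normal", "normal"]
true = ["covid", "covid", "normal", "normal"]
print(accuracy_percent(pred, true))  # 100.0
```

With only ten images, a single misclassification would already drop the accuracy by 10 percentage points, which is why a larger test set would give a more reliable estimate.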
% Save the independent ResNet architecture obtained for each fold
save(sprintf('ResNet50_%d_among_%d_folds',fold_idx,num_folds), ...
    'netTransfer','test_idx','train_idx');
% Clear unnecessary variables
clearvars -except fold_idx num_folds num_images predicted_labels posterior imds netTransfer
end % close the cross-validation fold loop
Confusion Matrix
We analyse our algorithm's performance using the testing dataset. The confusion matrix tells us which photos were correctly and incorrectly identified, so we can evaluate the performance of our deep learning model.
% Actual labels
actual_labels = imdsTest.Labels;
% Confusion matrix figure
figure;
plotconfusion(actual_labels,predicted_labels);
title('ResNet50 Confusion Matrix');
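Conceptually, a confusion matrix just counts (actual, predicted) label pairs: rows index the true class and columns the predicted class, so off-diagonal entries are misclassifications. A minimal Python sketch of this counting (illustrative only; the paper's code uses MATLAB's plotconfusion):

```python
def confusion_matrix(actual, predicted, classes):
    """Count occurrences of each (actual, predicted) pair.
    Rows index the true class, columns the predicted class."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix

classes = ["covid", "normal"]
actual    = ["covid", "covid", "normal", "normal", "normal"]
predicted = ["covid", "normal", "normal", "normal", "covid"]
print(confusion_matrix(actual, predicted, classes))
# [[1, 1], [1, 2]]
```

A perfect classifier, like the one reported above, produces a purely diagonal matrix with zeros everywhere else.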
Conclusion
The dataset consists of ten example images in total: five depicting COVID-19 cases and five standard X-ray images. Radiography was chosen for this study because an examination typically takes less than 15 minutes and requires a lower radiation dose. As X-ray images are digital, they can be viewed on a screen by a physician within minutes.
We have introduced a basic deep learning approach for COVID-19 X-ray classification: a ResNet-based classifier, which demonstrated good performance with high overall accuracy. The success of transfer learning further underscores the suitability of CNN-based classification models for feature extraction; the algorithm can easily be re-trained with new sets of labeled images to improve its performance even further.
Not a single image in the tested data was misclassified: all predictions are correct. This indicates that we have developed a strong model for classifying COVID-19 chest X-ray images.
References
- S. Umbaugh, "Digital image processing and analysis: Human and computer vision applications with CVIP tools," 2nd ed. CRC Press, 2010.
- R. C. Gonzalez, R. E. Woods, and S. L. Eddins, "Digital image processing using MATLAB," 2nd ed. MedData Interactive, The MathWorks, Inc., Sep. 5, 2003.
- S. N. Srihari, "Representation of three-dimensional digital images," ACM Computing Surveys (CSUR), vol. 13, no. 4, pp. 399-424, 1981.
- G. T. Herman, "Image reconstruction from projections: Fundamentals of computerized tomography," 2nd ed. Springer Science & Business Media, 2009.
- R. M. Zain, A. M. Razali, K. A. Salleh, and R. Yahya, "Image reconstruction of X-ray tomography by using Image J platform," AIP Conference Proceedings, vol. 1799, no. 1, pp. 050010(1-6), 2017.
- L. E. Romans, "Computed tomography for technologists: A comprehensive text," Wolters Kluwer Health/Lippincott Williams & Wilkins, 2011.
- H. M. Abdul Jabar, "Image reconstruction from its 2D projections," M.Sc. thesis, Physics Department, College of Education for Pure Science, Ibn Al-Hatham, University of Baghdad, 2002.
- S. Jayaraman, S. Esakkirajan, and T. Veerakumar, "Digital image processing," Tata McGraw Hill Education, 2009.
- A. Khajeh Djahromi, "Binary image processing," Department of Electrical Engineering, University of Texas at Arlington.
- A. McAndrew, "An introduction to digital image processing with MATLAB," Notes for SCM2511 Image Processing1, Semester 1, 2004, School of Computer Science and Mathematics, Victoria University of Technology.
- R. Gonzalez and R. Woods, "Digital image processing," 2nd ed. Pearson Education International, Prentice Hall, Inc., 2002.
- F. R. Verdun et al., "Image quality in CT: From physical measurements to model observers," European Journal of Medical Physics, vol. 31, no. 8, pp. 823-843, 2015.
- W. C. Scarfe and C. Angelopoulos, "Maxillofacial cone beam computed tomography: Principles, techniques and clinical applications," 1st ed. Springer, 2018.
- A. Staude and J. Goebbels, "Determining the spatial resolution in computed tomography – Comparison of MTF and line-pair structures," 2011.
- P. Sprawls, "Physical principles of medical imaging," 2nd ed. Medical Physics Publishing Corporation, 1995.
- J. Hsieh, "Computed tomography: Principles, design, artifacts, and recent advances," 2nd ed. SPIE, 2009.
- M. Vogel, "An introduction to X-ray physics, optics, and applications," Contemporary Physics, vol. 59, no. 1, pp. 1-1, Nov. 2017.
- J. Als-Nielsen and D. McMorrow, "Elements of modern X-ray physics," John Wiley & Sons, 2001.
- E. Lifshin, "X-ray characterization of materials," John Wiley & Sons, 1999.
- G. Michette and C. J. Buckley, "X-ray science and technology," Institute of Physics Publishing, 1993.
- A. Michette and S. Pfauntsch, "X-rays: The first hundred years," John Wiley & Sons, 1996.
- E. Spiller, "Soft X-ray optics," SPIE Press, 1994.
- D. Attwood and A. Sakdinawat, "X-rays and extreme ultraviolet radiation: Principles and applications," Cambridge University Press, 2016.
- F. Rosenblatt, "The perceptron: A probabilistic model for information storage and organization in the brain," Psychological Review, vol. 65, no. 6, 1958. [Online]. Available: www.ling.upenn.edu/courses/cogs501/Rosenblatt1958.pdf. [Accessed: May 4, 2018].
- C. Clabaugh et al., "Neural networks: The perceptron," Stanford University, [Online]. Available: https://cs.stanford.edu/people/eroberts/courses/soco/projects/neural-networks/Neuron/index.html. [Accessed: May 4, 2018].
- A. V. Sharma, "Understanding activation functions in neural networks," Medium, Mar. 30, 2017, [Online]. Available: https://medium.com/the-theory-of-everything/understanding-activation-functions-in-neural-networks-9491262884e0. [Accessed: May 4, 2018].
- D. Pandya et al., "Digital twins for predicting early onset of failures flow valves," presented at the AIChE Spring Meeting and Global Congress on Process Safety, Orlando, FL, Apr. 23, 2018.
- N. Azzizi and A. Zaatri, "A learning process of multilayer perceptron for speech recognition," International Journal of Pure and Applied Mathematics, vol. 107, no. 4, pp. 1005–1012, May 7, 2016.
- R. Eslamloueyan et al., "Using a multilayer perceptron network for thermal conductivity prediction of aqueous electrolyte solutions," Industrial and Engineering Chemistry Research, vol. 50, no. 7, pp. 4050–4056, Mar. 2, 2011.
- B. ZareNezhad and A. Aminian, "Application of the neural network-based model predictive controllers in nonlinear industrial systems. Case study," Journal of the University of Chemical Technology and Metallurgy, vol. 46, no. 1, pp. 67–74, 2011.
- M. Nielsen, "Chapter 1: Using neural nets to recognize handwritten digits," in "Neural networks and deep learning," Dec. 2017, [Online]. Available: http://neuralnetworksanddeeplearning.com/chap1.html. [Accessed: May 4, 2018].
- A. Moghadassi et al., "Application of artificial neural network for prediction of liquid viscosity," Indian Chemical Engineer, vol. 52, no. 1, pp. 37–48, Apr. 23, 2010.
- D. M. Himmelblau et al., "Fault classification with the aid of artificial neural networks," IFAC Proceedings Volumes, vol. 24, no. 6, pp. 541–545, Sept. 1991.
- A. Vasičkaninová and M. Bakošová, "Neural network predictive control of a chemical reactor," Acta Chimica Slovaca, vol. 2, no. 2, pp. 21–36, 2009.