Deep Learning for Neuroimaging-based Diagnosis and Rehabilitation of Autism Spectrum Disorder: A Review - arXiv

Page created by Harry Jacobs
 
CONTINUE READING
Deep Learning for Neuroimaging-based Diagnosis and Rehabilitation of Autism Spectrum Disorder: A Review - arXiv
1

                                            Deep Learning for Neuroimaging-based Diagnosis
                                            and Rehabilitation of Autism Spectrum Disorder:
                                                                A Review
                                            Marjane Khodatars, Afshin Shoeibi, Navid Ghassemi, Mahboobeh Jafari, Ali Khadem, Delaram Sadeghi,
                                                     Parisa Moridian, Sadiq Hussain, Roohallah Alizadehsani, Member, IEEE, Assef Zare,
                                          Abbas Khosravi, Member, IEEE, Saeid Nahavandi, Fellow, IEEE, U. Rajendra Acharya, Senior Member, IEEE,
                                                                                    Michael Berk
arXiv:2007.01285v3 [cs.LG] 26 Jul 2020

                                            Abstract—Accurate diagnosis of Autism Spectrum Disorder                                            I. I NTRODUCTION
                                         (ASD) is essential for its management and rehabilitation. Neu-
                                         roimaging techniques that are non-invasive are disease markers                        SD is a disorder of the nervous system that affects
                                         and may be leveraged to aid ASD diagnosis. Structural and func-
                                         tional neuroimaging techniques provide physicians substantial
                                                                                                                         A     the brain and results in difficulties in speech, social
                                                                                                                         interaction and communication deficits, repetitive behaviors,
                                         information about the structure (anatomy and structural con-
                                         nectivity) and function (activity and functional connectivity) of               and delays in motor abilities [1]. This disease can generally
                                         the brain. Due to the intricate structure and function of the brain,            be distinguished with extant diagnostic protocols from the
                                         diagnosing ASD with neuroimaging data without exploiting artifi-                age of three years onwards. Autism influences many parts
                                         cial intelligence (AI) techniques is extremely challenging. AI tech-            of the brain. This disorder also involves a genetic influence
                                         niques comprise traditional machine learning (ML) approaches
                                         and deep learning (DL) techniques. Conventional ML methods
                                                                                                                         via the gene interactions or polymorphisms [2], [3]. One in
                                         employ various feature extraction and classification techniques,                70 children worldwide is affected by autism. In 2018, the
                                         but in DL, the process of feature extraction and classification is              prevalence of ASD was estimated to occur in 168 out of 10,000
                                         accomplished intelligently and integrally. In this paper, studies               children in the United States, one of the highest prevalence
                                         conducted with the aid of DL networks to distinguish ASD were                   rates worldwide. Autism is significantly more common in boys
                                         investigated. Rehabilitation tools provided by supporting ASD
                                         patients utilizing DL networks were also assessed. Finally, we
                                                                                                                         than in girls. In the United States, about 3.63 percent of boys
                                         presented important challenges in the automated detection and                   aged 3 to 17 years have autism spectrum disorder, compared
                                         rehabilitation of ASD.                                                          with approximately 1.25 percent of girls [4].
                                            Index Terms—Autism Spectrum Disorder, Diagnosis, Rehabil-                       Diagnosing ASD is difficult because there is no pathophys-
                                         itation, Deep Learning, Neuroimaging, Neuroscience.                             iological marker, relying instead just on psychological criteria
                                                                                                                         [5]. Psychological tools can identify individual behaviors,
                                            M. Khodatars and D. Sadeghi are with the Dept. of Medical Engineering,
                                         Mashhad Branch, Islamic Azad University, Mashhad, Iran.
                                                                                                                         levels of social interaction, and consequently facilitate early
                                            A. Shoeibi and N. Ghassemi are with the Faculty of Electrical Engineering,   diagnosis. Behavioral evaluations embrace various instruments
                                         Biomedical Data Acquisition Lab, K. N. Toosi University of Technology,          and questionnaires to assist the physicians to specify the partic-
                                         Tehran, Iran, and the Computer Engineering Department, Ferdowsi University
                                         of Mashhad, Mashhad, Iran.
                                                                                                                         ular type of delay in a child’s development, including clinical
                                            M. Jafari is with Electrical and Computer Engineering Faculty, Semnan        observations, medical history, autism diagnosis instructions,
                                         University, Semnan, Iran.                                                       and growth and intelligence tests [6].
                                            A. Khadem is with the Faculty of Electrical Engineering, K. N. Toosi
                                         University of Technology, Tehran, Iran. (Corresponding author: Ali Khadem,         Several investigations for the diagnosis of ASD have re-
                                         email: alikhadem@kntu.ac.ir).                                                   cently been conducted on neuroimaging data (structural and
                                            P. Moridian is with the Faculty of Engineering, Science and Research         functional).
                                         Branch, Islamic Azad University, Tehran, Iran.
                                            Sadiq Hussain is System Administrator at Dibrugarh University, Assam,           Analyzing anatomy and structural connections of brain areas
                                         India, 786004.                                                                  with structural neuroimaging is an essential tool for study-
                                            A. Zare is with Faculty of Electrical Engineering, Gonabad Branch, Islamic   ing structural disorders of the brain in ASD. The principal
                                         Azad University, Gonabad, Iran.
                                            R. Alizadehsani, A. Khosravi and S. Nahavandi. are with the Institute        tools for structural brain imaging are magnetic resonance
                                         for Intelligent Systems Research and Innovation (IISRI), Deakin University,     imaging (MRI) techniques [7], [8], [9]. Cerebral anatomy is
                                         Victoria 3217, Australia.                                                       investigated by structrul MRI (sMRI) images and anatomical
                                            U. R. Acharya is with the Dept. of Electronics and Computer Engineering,
                                         Ngee Ann Polytechnic, Singapore 599489, Singapore, the Dept. of Biomedical      connections are assesed by diffusion tensor imaging MRI
                                         Informatics and Medical Engineering, Asia University, Taichung, Taiwan, and     (DTI-MR) [10]. Investigating the activity and functional con-
                                         the Dept. of Biomedical Engineering, School of Science and Technology,          nections of brain areas using functional neuroimaging can
                                         Singapore University of Social Sciences, Singapore.
                                            M. Berk is with the Deakin University, IMPACT - the Institute for            also be used for studying ASD. Brain functional diagnostic
                                         Mental and Physical Health and Clinical Translation, School of Medicine,        tools are older approaches than structural methods for studying
                                         Barwon Health, Geelong, Australia, and the Orygen, The National Centre          ASD. The most basic modality of functional neuroimaging is
                                         of Excellence in Youth Mental Health, Centre for Youth Mental Health,
                                         Florey Institute for Neuroscience and Mental Health and the Department of       electroencephalography (EEG), which records the electrical
                                         Psychiatry, The University of Melbourne, Melbourne, Australia.                  activity of the brain from the scalp with a high temporal
Deep Learning for Neuroimaging-based Diagnosis and Rehabilitation of Autism Spectrum Disorder: A Review - arXiv
2

resolution (in milliseconds order) [11]. Studies have shown
that employing EEG signals to diagnose ASD have been useful
[12], [13], [14]. Functional MRI (fMRI) is one of the most
promising imaging modalities in functional brain disorders,
used as task-based (T-fMRI) or restingstate (rs-fMRI) [15],
[16]. fMRI-based techniques have a high spatial resolution (in
the order of millimeters) but a low temporal resolution due
to slow response of the hemodynamic system of the brain as
well as fMRI imaging time constraints and is not ideal for
recording the fast dynamics of brain activities. In addition,    Fig. 1: Number of papers published every year for ASD
these techniques have a high sensitivity to motion artifacts.    diagnosis and rehabilitation.
It should be stressed that in consonance with studies, three
less prevalent modalities of electrocorticography (ECoG) [17],
functional near-infrared spectroscopy (fNIRS) [18], and Mag-
netoencephalography (MEG) [19] can also attain reasonable
performance in ASD diagnosis. An appropriate approach is to
utilize machine-learning techniques alongside functional and
structural data to collaborate with physicians in the process
of accurately assessing ASD. In the field of ASD, applying
machine learning methods generally entail two categories of
traditional methods [20] and DL methods [21]. As opposed
to traditional methods, much less work has been done on DL
methods to explore ASD or design rehabilitation tools.
   This study reviews ASD assesment methods and patients’             Fig. 2: Illustration of various types of DL methods.
rehabilitation with DL networks. The outline of this paper is
as follows. Section 2 is search strategy. Section 3 concisely
presents the DL networks employed in the field of ASD. In        reinforcement learning [28], and a variety of DL networks are
section 4, existing computer-aided diagnosis systems (CADS)      provided for each type. So far, most studies applied to identify
that use brain functional and structural data are reviewed.      ASD using DL have been based on supervised or unsupervised
In section 5, DL-based rehabilitation tools for supporting       approaches. Figure 2 illustrates generally employed types of
ASD patients are introduced. Section 6 discusses the reviewed    DL networks with supervised or unsupervised learning to
papers. Section 7 reveals the challenges of ASD diagnosis        study ASD.
and rehabilitation with DL. Finally, the paper concludes and
suggests future work in section 8.                               IV. CADS- BASED DEEP LEARNING TECHNIQUES FOR ASD
                                                                             DIAGNOSIS BY NEUROIMAGING DATA
                   II. S EARCH S TRATEGY                            A traditional artificial intelligence (AI)-based CADS en-
   In this review, IEEE Xplore, ScienceDirect, SpringerLink,     compasses several stages of data acquisition, data pre-
ACM, as well as other conferences or journals were used to       processing, feature extraction, and classification [29], [30],
acquire papers on ASD diagnosis and rehabilitation using DL      [31], [32]. In [33], [34], [35] existing traditional algorithms for
methods. Further, the keywords ”ASD”, ”Autism Spectrum           diagnosing ASD have been reviewed. In contrast to traditional
Disorder” and ”Deep Learning” were used to select the papers.    methods, in DL-based CADS, feature extraction, and classi-
The papers are analyzed till June 03th, 2020 by the authors      fication are performed intelligently within the model. Also,
(AK, SN). Figure 1 depicts the number of considered papers       due to the structure of DL networks, using large dataset to
using DL methods for the automated detection and rehabilita-     train DL networks and recognize intricate patterns in datasets
tion of ASD each year.                                           is incumbent. The components of DL-based CADS for ASD
                                                                 detection are shown in Figure 3. It can be noted from the
    III. D EEP L EARNING T ECHNIQUES FOR ASD D IAGNOSIS          figure that, large and free databases are first introduced to
                     AND R EHABILITATION                         diagnose ASD. In the second step, various types of pre-
                                                                 processing techniques are used on functional and structural
   Nowadays, DL algorithms are used in many areas of
                                                                 data to be scrutinized. Finally, the DL networks are applied
medicine including structural and functional neuroimaging.
                                                                 on the preprocessed data.
The application of DL in neural imaging ranges from brain
MR image segmentation [22], to detection of brain lesions
such as tumors [23], diagnosis of brain functional disorders     A. Neuroimaging ASD Datasets
such as ASD [24], and production of artificial structural or       Datasets are the heart of any CADS development and the
functional brain images [25]. Machine learning techniques        capability of CADS depends primarily on the affluence of
are categorized into three fundamental categories of learning:   the input data. To diagnose ASD, several brain functional
supervised learning [26], unsupervised learning [27], and        and structural datasets are available. The most complete free
Deep Learning for Neuroimaging-based Diagnosis and Rehabilitation of Autism Spectrum Disorder: A Review - arXiv
3

                       Fig. 3: Block diagram of CAD system using DL architecture for ASD detection.

dataset available is ABIDE [36] dataset with two subsets:           motion correction is to align all images to a reference image so
ABIDE-I and ABIDE-II, which encompasses sMRI, rs-fMRI,              that the coordinates and orientation of the voxels be identical
and phenotypic data. ABIDE-I involves data from 17 inter-           in all fMRI volumetric images [43], [44], [45].
national sites, yielding a total of 1112 datasets, including 539       S LICE T IMING C ORRECTION : The purpose of modifying
individuals with ASD and 573 healthy individuals (ages 64-7).       the slice time is to adjust the time series of the voxels so that
In accordance with HIPAA guidelines and 1000 FCP / INDI             all the voxels in each fMRI volume image have a common
protocols, these data are anonymized. In contrast, ABIDE-           reference time. Usually, the corresponding time of the first
II contains data from 19 international sites, with a total of       slice recorded in each fMRI volume image is selected as the
1114 datasets from 521 individuals with ASD and 593 healthy         reference time [43], [44], [45].
individuals (ages 5-64). Also, preprocessed images of the              I NTENSITY N ORMALIZATION : at this stage, the average
ABIDE-I series called PCP [37] can be freely downloaded by          intensity of fMRI signals are rescaled to compensate for global
the researchers. The second recently released ASD diagnostic        deviations within and between the recording sessions [43],
database is called NDAR, which comprises various modalities,        [44], [45].
and more information is provided in [38].                              R EGISTRATION TO A STANDARD ATLAS : The human brain
                                                                    entails hundreds of cortical and subcortical areas with var-
B. Preprocessing Techniques                                         ious structures and functions, each of which is very time-
    Neuroimaging data (especially functional ones) is of rela-      consuming and complex to study. To overcome the problem,
tively complicated structure, and if not pre-processed properly,    brain atlases are employed to partition brain images into a
it may affect the final diagnosis. Preprocessing of this data       confined number of ROIs, following which the mean time
typically entails multiple common steps performed by different      series of each ROI can be extracted [46]. ABIDE datasets use a
software as standard. Indeed, occasionally prepared pipelines       manifold of atlases, including Automated Anatomical Labeling
are applied on the dataset to yield pre-processed data for future   (AAL) [47], Eickhoff-Zilles (EZ) [48], Harvard-Oxford (HO)
researches. In the following section, preprocessing steps are       [49], Talaraich and Tournoux (TT) [50], Dosenbach 160 [51],
briefly explained for fMRI data.                                    Craddock 200 (CC200) [52] and Craddock 400 (CC400) [53]
    1) Standard (Low-level) fMRI preprocessing steps: Low-          and more information is provided in [54]. Table I provides
level pre-processing of fMRI images normally has fixed num-         complete information on preprocessing tools, atlases, and
ber of steps applied on the data, and prepared toolboxes            some other preprocessing information.
are usually used to reduce execution time and yield better             2) Pipeline Methods: Pipelines present preprocessed im-
accuracy. Some of these reputable toolboxes contain FMRIB           ages of ABIDE databases. They embrace generic pre-
software libraries (FSL) [39], BET [40], FreeSurfer [41],           processing procedures. Employing pipelines, distinct methods
and SPM [42]. Also, important and vital fMRI preprocess-            can be compared with each other. In ABIDE datasets, pre-
ing incorporates brain extraction, spatial smoothing, temporal      processing is performed by four pipeline techniques: neu-
filtering, motion correction, slice timing correction, intensity    roimaging analysis kit (NIAK) [55], data processing assistant
normalization, and registration to standard atlas, which are        for rs- fMRI (DPARSF) [56], the configurable pipeline for the
summarized as follows:                                              analysis of connectomes (CPAC) [57], or connectome com-
    B RAIN EXTRACTION : the goal is to remove the skull and         putation system (CCS) [58]. The preprocessing steps carried
cerebellum from the fMRI image and maintain the brain tissue        out by the various pipelines are comparatively analogous. The
[43], [44], [45].                                                   chief differences are in the particular algorithms for each
    S PATIAL SMOOTHING : involves averaging the adjacent vox-       step, the software simulations, and the parameters applied.
els signal. This process is persuasive on account of neighbor-      Details of each pipeline technique are provided in [54]. Table I
ing brain voxels being usually closely related in function and      demonstrates the pipeline techniques used in autism detection
blood supply [43], [44], [45].                                      exploiting DL.
    T EMPORAL FILTERING : the aim is to eliminate unwanted             3) High-level preprocessing Steps: High-level techniques
components from the time series of voxels without impairing         for pre-processing brain data are important, and using them ac-
the signal of interest [43], [44], [45].                            companying preliminary pre-processing methods can enhance
    R EALIGNMENT (M OTION C ORRECTION ): During the                 the accuracy of ASD recognition. These methods are applied
fMRI test, people often move their heads. The objective of          after the standard pre-processing of functional and structural
Deep Learning for Neuroimaging-based Diagnosis and Rehabilitation of Autism Spectrum Disorder: A Review - arXiv
4

                             Fig. 4: Overall block diagram of a 2D-CNN used for ASD detection.

                             Fig. 5: Overall block diagram of a 3D-CNN used for ASD detection.

brain data. These include sliding window (SW) [24], data           markedly lessened and make it feasible to train these networks
augmentation (DA) [59], functional connectivity matrix (FCM)       with smaller databases [21]. Figure 4 shows the block digram
estimation [60], [61] and applying fast Fourier transformation     of 2D-CNN used for ASD detection.
(FFT) [62]. Furthermore, some of the researches utilized              3D-CNN
feature extraction [63] techniques and some also use feature          By transforming the data into three dimensions, the con-
selection methods. Precise information on reviewed studies is      volution network will also be altered to a three-dimensional
indicated in detail in Table I.                                    format (Figure 5). It should be noted that the manipulation of
                                                                   three dimensional CNN (3D-CNN) networks is less beneficial
C. Deep Neural Networks                                            than 1D-CNN and 2D-CNN networks for diverse reasons.
   Deep learning in various medical applications, including the    First, the data required to train these networks must be much
diagnosis of ASD, has become extremely popular in recent           larger which conventionally such datasets are not utilizable and
years. In this section of the paper, the types of Deep Learning    methods such as pre-training, which are extensively exploited
networks used in ASD detection are examined, which include         in 2D networks, cannot be used here. Another reason is that
CNN, RNN, AE, DBN, CNN-RNN, and CNN-AE models.                     with more complicated structure of networks, it becomes much
   1) Convolutional Neural Networks (CNNs) : In this section,      tougher to fix the number of layers, and network structure.
the types of popular convolutional networks used in ASD            The 3D activation map generated during the convolution of a
diagnosis are surveyed. These networks involve 1D-CNN, 2D-         3D CNN is essential for analyzing data where volumetric or
CNN, 3D-CNN models, and a variety of pre-trained networks          temporal context is crucial. This ability to analyze a series of
such as VGG.                                                       frames or images in context has led to the use of 3D CNNs as
   1D AND 2D-CNN                                                   tools for action detection and evaluation of medical imaging.
   There are many spatial dependancies present in the data         [65].
and it is difficult to extract these hidden signatures from the       2) Deep Belief Networks (DBNs): DBNs are not popular
data. Convolution network uses a structure alike to convolu-       today as they used to be, and have been substituted by new
tion filters to extract these features properly and contribute     models to perform various applications ( e.g., autoencoders for
to the knowledge that features should be processed taking          unsupervised learning, generative adversarial networks (GAN)
into account spatial dependencies; so the number of network        for generative modes [66], variational autoencoders (VAE)
parameters are significantly reduced. The principal application    [67]). However, disregarding the restricted use of these net-
of these networks is in image processing and due to the            works in this era, their influence on the advancement of neural
two-dimensional (2D) image inputs, convolution layers form         networks cannot be overlooked. The use of these networks
2D structures, which is why these networks are called 2D           in this paper is related to the feature extraction without a
convolutional neural network (2D-CNN). By using another            supervisor or pre-training of networks. These networks serve
type of data, one-dimensional signals, the convolution layers’     as unsupervised, consisting of several layers after the input
structure also resembles the data structure [64]. In convo-        layer, which are shown in Figure 6. The training of these
lution networks, assuming that various data sections do not        networks is done greedily and from bottom to top, in other
require learning different filters, the number of parameters are   words, each separate layer is trained and then the next layer is
Deep Learning for Neuroimaging-based Diagnosis and Rehabilitation of Autism Spectrum Disorder: A Review - arXiv
5

Fig. 6: Overall block diagram of a DBN used for ASD
detection.

                                                                   Fig. 8: Overall block diagram of an LSTM used for ASD
                                                                   detection.

Fig. 7: Overall block diagram of an AE used for ASD
detection.

appended. After training, these networks are used as a feature
extractor or the network weights are used as initial weights of
a network for classification [21].
   3) Autoencoders (AEs): Autoencoders (AEs) are more than
30 years old, and have undergone dramatic changes over the
years to enhance their performance. But the overall structure
of these networks has remained the same [21].These networks        Fig. 9: Overall block diagram of a CNN-RNN used for ASD
consist of two parts: coder and decoder so that the first part     detection.
of the input leads to coding in the latent space, and the
decoder part endeavors to convert the code into preliminary
data (Figure 7). Autoencoders are a special type of feedforward    most efforts have been made to enhance these two structures
neural networks where the input is the same as the output.         and make them resistant to challenges (e.g., GRU-D [68] is
They compress the input into a lower-dimensional code and          used to find the lost data).
then reconstruct the output from this representation. The code        5) CNN-RNN: The initial idea in these networks is to utilize
is a compact summary or compression of the input, also called      convolution layers to amend the performance of RNNs so that
the latent-space representation. Various methods have been         the advantages of both networks can be used; CNN-RNN,
proposed to block the data memorization by the network,            on the one hand, can find temporal dependencies with the
including sparse AE (SpAE) and denoising AE (DAE) [21].            aid of RNN, and on the other hand, it can discover spatial
Trained properly, the coder part of an Autoencoder can be used     dependencies in data with the help of convolution layers [69].
to extract features; creating an unsupervised feature extractor.   These networks are highly beneficial for analyzing time series
   4) Recurrent Neural Networks (RNNs): In convolution net-        with more than one dimension (such as video) [70] but further
works, a kind of spatial dependencies in the data is addressed.    to the simpler matter, these networks also yield the analysis
But interdependencies between data are not confined to this        of three-dimensional data so that instead of a more complex
model. For example in time-series, dependencies may be             design of a 3D-CNN, a 2D-CNN with an RNN is occasionally
highly distant from each other, on the other hand, the long-       used. The superiority of this model is due to the feasibility
term and variable length of these sequences results in that        of employing pre-trained models. Figure 9 demonstrates the
the ordinary networks do not perform well enough to process        CNN-RNN model.
these data. To overcome these problems, RNNs can be used.             6) CNN-AE: In the construction of these networks, the
LSTM structures are proposed to extract long term and short        principal aim and prerequisite have been to decrease the
term dependencies in the data (Figure 8). Another well-known       number of parameters. As shown before, changing merely the
structure called GRU is developed after LSTM, and since then,      network layers to convolution markedly lessens the number
Deep Learning for Neuroimaging-based Diagnosis and Rehabilitation of Autism Spectrum Disorder: A Review - arXiv
6

                                                                   Fig. 11: Block diagram of ios application for ASD rehabilita-
                                                                   tion.

Fig. 10: Overall block diagram of a CNN-AE used for ASD
detection.

of parameters; combining AE with convolution structures also
makes significant contribution. This helps to exploit higher
dimensional data and extracts more information from the data           Fig. 12: Cloud system design for ASD rehabilitation.
without changing the size of the database. Similar structures,
with or without some modifications, are widely deployed for
image segmentation [71], and likewise unsupervised network         B. Cloud Systems
can be applied for network pre-training or feature extraction.
                                                                      Mohammadian et al. [74] proposed a new application of DL
Figure 10 depicts the CNN-AE network used for ASD detec-
                                                                   to facilitate automatic stereotypical motor movement (SMM)
tion. Tables I and II, provide the summary of papers published
                                                                   identification by applying multi-axis inertial measurement
on detection and rehabilitation of ASD patients using DL,
                                                                   units (IMUs). They applied CNN to transform multi-sensor
respectively.
                                                                   time series into feature space. An LSTM network was then
                                                                   combined with CNN to obtain the temporal patterns for SMM
        V. D EEP L EARNING T ECHNIQUES FOR ASD                     identification. Finally, they employed the classier selection
                     R EHABILITATION                               voting approach to combine an ensemble of the best base
                                                                   learners. After various experiments, the superiority of their
   Rehabilitation tools are employed in multiple fields of         proposed procedure over other base methods was proven.
medicine and their main purpose is to help the patients to         Figure 12 shows the real-time SMM detection system. First,
recover after the treatment. Various and multiple rehabilitation   IMUs, which are wearable sensors, are used for data collec-
tools using DL algorithms have been presented. Rehabilitation      tion; the data can then be analyzed locally or remotely (using
tools are used to help ASD patients using mobile, computer         Wi-Fi to transfer data to tablets, cell phones, medical center
applications, robotic devices, cloud systems, and eye tracking,    servers, etc.) to identify SMMs. If abnormal movements are
which will be discussed below. Also, the summary of papers         detected, an alarm will be sent to a therapist or parents.
published on rehabilitation of ASD patients using DL algo-
rithms are shown in Table II.
                                                                   C. Eye Tracking
                                                                      Wu et al. [75] proposed a model of DL saliency predic-
A. Mobile and Software Applications                                tion for autistic children. They used DCN in their proposed
                                                                   paradigm, with a SM saliency map output. The fixation density
   Facial expressions are a key mode of non-verbal commu-          map (FDM) was then processed by the single-side clipping
nication in children with ASD and play a pivotal role in           (SSC) to optimize the proposed loss function as a true label
social interactions. Use of BCI systems provides insight into      along with the SM saliency map. Finally, they exploited an
the user’s inner-emotional state. Valles et al. [72] conducted     autism eye-tracking dataset to test the model. Their proposed
research focused on mobile software design to provide assis-       model outperformed other base methods. Elbattah et al. [76]
tance to children with ASD. They aimed to design a smart           aimed to combine unsupervised clustring algorithms with deep
iOS app based on facial images according to Figure 11. In          learning to help ASD rehabilitation. The first step involved the
this way, people’s faces at different angles and brightness        visualization of the eye-tracking path, and the images captured
are first photographed, and are turned into various emoji          from this step were fed to an autoencoder to learn the features.
so that the autistic child can express his/her feelings and        Using autoencoder features, clustering models are developed
emotions. In this group’s investigation [72], Kaggle’s (The        using the K-Means algorithm. Their method performed better
Facial Expression Recognition 2013) and KDEF (Kaggle’s             than other state-of-art techniques.
FER2013 and Karolinska Directed Emotional Faces) databases
were used to train the VGG-16. In addition, the LEAP system
was adapted to train the model at the University of Texas.
The research provided the highest rate accuracy of 86.44%. In
another similar study, they achieved an accuracy of 78.32%
[73].
Deep Learning for Neuroimaging-based Diagnosis and Rehabilitation of Autism Spectrum Disorder: A Review - arXiv
TABLE I: Summary of articles published using DL methods for neuroimaging-based ASD detection.
                              Neuroimaging          Number                        Image       Preprocessing           High level                        Inputs                 DNN                      Number of                             Performance
Work        Datasets                                               Pipelines                                                                                                                DNN                     Classifier   K fold
                               Modalities           of Cases                       Atlas         Toolbox             Preprocessing                       DNN                  Toolbox                    Layers                               Criteria (%)
                                 T-fMRI             82 ASD                                                                                    Single Mean Channel Input                                             Majority
[24]   Clinical Acquisition      residual                            NA          MNI152           FSL                     SW                  Single STD Channel Input          NA         2CC3D           17                     NA         F1-Score = 89
                                                     48 HC                                                                                                                                                          Voting
                                  -fMRI                                                                                                       Combined 2-Channel Input
                                                     82 ASD                                                   SVE, C-SVE, H-SVE, Monte         Mean-Channel Sequence
[77]   Clinical Acquisition      T-fMRI                              NA            AAL            NA                                                                            NA         2CC3D           14       Sigmoid       NA          Acc = 97.32
                                                      48 HC                                                      Carlo Approximation                 STD-Channel
         HCP Dataset in          T-fMRI         68 Subjects with                                                  Dictionary Learning
[78]                                               7 Tasks and       NA            NA             FSL                                           Functional RNSs Maps            NA        3D-CNN           8        Softmax       NA          Acc = 94.61
       the HAFNI Project         rs-fMRI                                                                          and Sparse Coding
                                                 1 rs-fMRI Data
                                                    21 ASD
[59]   Clinical Acquisition      T-fMRI                              NA            AAL            FSL                     DA                       ROIs Time-Series            Keras       LSTM            7        Sigmoid       10           Acc=69.8
                                                     19 HC
                                 T-fMRI            1711 ASD                                       SPM           Wavelet Transform and                                                                                                     Ensemble AUROC=0.92
[79]   Different Datasets        rs-fMRI                             NA            AAL                                                                   FCM                   Keras      2D-CNN           14       Softmax       NA
                                                   15903 HC                                    SpeedyPP          Different Techniques                                                                                                      Ensemble Acc=85.19
                              Phenotypic Info
            Clinical                                82 ASD                                                                 SW                   Original fMRI Sequence
[80]                             T-fMRI                              NA            AAL         Neurosynth                                                                       NA         2CC3D           16       Sigmoid       NA           Acc= 87.1
           Acquisition                               48 HC                                                        Corrupting Strategy           Mean-Channel Sequence
                                                    41 ASD                                                       Prediction Distribution         std-channel sequence
[80]        ABIDE-I              rs-fMRI                             NA            AAL            FSL                                                                           NA         2CC3D           16       Sigmoid       NA           Acc= 85.3
                                                     54 HC                                                              Analysis            Corrupt a ROI of Original Image
            ABIDE-I                             379 ASD, 395 HC                 All Atlases                                                 Concatenating Voxel-Level Maps
[81]                             rs-fMRI                            CPAC                          NA                     FCM                                                    NA        3D-CNN           7        Sigmoid       10           Acc= 73.3
            ABIDE II                            163 ASD, 230 HC                   ABIDE                                                       of Connectivity Fingerprints
                                                                                 CC-200                                                                                                                                                        Acc=70.1
                                                   505 ASD                                                                                            Masking
[82]        ABIDE-I              rs-fMRI                            CPAC           AAL            NA                   FCM, DA                                                PyTorch        AE            NA         SLP         10           Sen=67.8
                                                    530 HC                                                                                           Correlations
                                                                               Dosenbach160                                                                                                                                                    Spec=72.8
[83]        ABIDE-I              rs-fMRI          872 subjects      CPAC            HO           Nilearn                  NA                         Raw Images                 NA         G-CNNs          5        Softmax       10           Acc=70.86
                                                                                                                                                                                        BrainNetCNN                                            Acc= 68.7
                                                   474 ASD                                                                                            Functional
[84]        ABIDE-I              rs-fMRI                             CCS           AAL            NA                     FCM                                                    NA      with Proposed      15       Softmax        5           Sen= 69.2
                                                    539 HC                                                                                           Connectomes
                                                                                                                                                                                            Layers                                             Spe= 68.3
                                                                                                 SPM8                                             Pearson Correlation
                                                    13 ASD                                                            Qcut, NMI
[85]        ABIDE-I              rs-fMRI                             NA            AAL           REST                                              Coefficient Matrix           NA          DAE            NA          NA         NA           Acc=54.49
                                                     22 HC                                                          Statistic Matrix
                                                                                                DPARSF                                               NMI Matrix
                                                                                                  FSL                                                                                                                                           Acc=100
                                                    11 ASD           NA                                            Convert NII Files                 Preprocessed
[86]         ABIDE               rs-fMRI                                           NA                                                                                          Caffe      LeNet-5       Standard    Softmax       NA           Sen=99.99
                                                     16 HC                                        BET               to PNG Images                    PNG Images
                                                                                                                                                                                                                                               Spec=100
                                                    55 ASD                                                           FCM, Feature                                                         Multiple                   Softmax
[87]        ABIDE-I              rs-fMRI                            NIAK           AAL            NA                                              Whole-Brain FCPs              NA                         4                       5           Acc=86.36
                                                     55 HC                                                             Selection                                                           SAEs                     regression
            ABIDE-I                                 54 ASD                                                                                    Images with 95 68, 79 68,        Keras
                                                                                                                 Dimension Reduction                                                                                                          Acc=72.73
                                                     62 HC                                                                                                                      with
[62]        ABIDE-II             rs-fMRI                             NA            NA            SPM8                                       and 79 95 Dimensions, Around                  MCNNEs           9        Binary SR     10           Sen=71.2
                                                   156 ASD                                                                                                                    Theano
                                                                                                                          FFT                    the x, y, and z Axes                                                                         Spec=73.48
          ABIDE-I + II                              187 HC                                                                                                                    backend
                                                                                                                  Creating Stochastic
                                                   542 ASD                         All                                                              Gray Matter                                                     Various
[88]         ABIDE               rs-fMRI                            CPAC                          NA            Parcellations by Poisson                                        NA        3D-CNN           6                      10            Acc=72
                                                    625 HC                        Atlases                                                         Mask Parcellations                                                Methods
                                                                                                                    Disk Sampling

                                                   465 ASD                                                                                     Edge Weights of Subjects’
[89]        ABIDE-I              rs-fMRI                           DPARSF          AAL            NA                      FCM                                                  Keras        VAE            3           NA                         NA
                                                    507 HC                                                                                           Brain Graph                                                                  NA

                                                   539 ASD                       Craddock                                                         Mean Time Courses
[90]        ABIDE-I              rs-fMRI                             CCS                       Neurosynth                 DA                                                   Keras       LSTM            5        Sigmoid       10           Acc=68.5
                                                    573 HC                         200                                                               from ROIs
                                                                                                                  Slicetiming, Spatial
                                 rs-fMRI,
                                                   505 ASD                                                    Standardization, Smoothing,          4005-Dimensional                                                                             Acc=61
[91]         ABIDE              Phenotypic                           NA           CC200          DPABI                                                                          NA          SAE            3        Clustering    NA
                                                    530 HC                                                        Filtering, Removing                 Eigenvector                                                                            F1-Score=60.2
                                    Info
                                                                                                              Covariates, FCM, AE-MKFC
                                                                                                               Independent Components                                                                                                         Acc=87.21
                                                    42 ASD                                                                                         Time Courses of
[92]         ABIDE               rs-fMRI                             NA            NA             FSL            (Time Course, Power                                            NA          SAE            9        Softmax       21          Sen=89.49
                                                     42 HC                                                                                           Each Subject
                                                                                                              Spectrum and Spatial Map)                                                                                                       Spec=83.73
                                                    NY site
                                                    UM site                                                                                     fMRI ROI Time-Series,
[93]        ABIDE-I              rs-fMRI                             CCS           AAL         Neurosynth                 DA                                                   Keras       LSTM            6        Sigmoid       10           Acc=74.8
                                                    US site                                                                                     Functional Connectivity
                                                    UC site
                                                                                   HO                                                                                                                                                          Acc=73.2
                                                   408 ASD                         AAL                                                            3 Different FCM+
[94]        ABIDE-I              rs-fMRI                            CPAC                          FSL                     NA                                                   Keras       DANN            25       Sigmoid       10           Sen=74.5
                                                    401 HC                                                                                        Demographic Data
                                                                                  CC200                                                                                                                                                        Spec=71.7
                                                                                                                                                                                                                                             Avg Acc= 67.1
                                                                                                                 DTL-NN Framework:
                                                  At Least 60                                                                                                                                                        Softmax                 Avg Sen=65.7
[95]         ABIDE               rs-fMRI                             CCS           AAL            FSL          Offline Learning, Transfer             FC Patterns               NA         SSAE            4                       5
                                                   Subjects                                                                                                                                                         regression               Avg spec=68.3
                                                                                                                     Learning FCM
                                                                                                                                                                                                                                               AUC=0.71
                                                                                  AAL            FAST
                                                   993 ASD                     Schaefer-100                                                        Mean Time-Series
[96]      ABIDE I+II             rs-fMRI                             NA                           BET                     NA                                                    NA        1D-CNN           5        Softmax       10            Acc=68
                                                   1092 HC                         HO                                                              within Each ROI
                                                                               Schaefer-400      FAST

                                                                                                                                                                                                                                                                7
Deep Learning for Neuroimaging-based Diagnosis and Rehabilitation of Autism Spectrum Disorder: A Review - arXiv
8
                                                                                                                                                                                      4 Deep                                                  Acc=87
                                                    529 ASD           All                                         Single Volume                 Glass Brain and                      Ensemble                                               F1-score=86
[97]         ABIDE-I              rs-fMRI                                         NA            NA                                                                        Keras                     16         Sigmoid         NA
                                                     573 HC        Pipelines                                     Image Generator                Stat Map Images                      Classifier                                             Recall=85.2
                                                                                                                                                                                    techniques                                               Pre=86.8
[98]         ABIDE-II             rs-fMRI        303 ASD, 390 HC     NA           NA           FSL                     NA                  1D Time Series from Voxels      NA        1D-CAE         14           NA            NA           Acc= 65.3
                                                                                                                Thresholding Based
[99]          ABIDE               rs-fMRI        40 ASD, 40 HC       CCS          NA            NA                                               WM, GM, CSF               NA        AlexNet      Standard     Softmax         NA           Acc=82.61
                                                                                                                   Segmentation
                                                                               Parcellated                      DA Using SMOTE                                                                                                               Acc=82
                                                     Whole            All                                                                    Upper Triangle Part of                   ASD-
[100]         ABIDE               rs-fMRI                                       into 200        NA              and Graph Network                                          NA                     Proposed       SLP           NA           Sen=79.1
                                                     Dataset       Pipelines                                                                 the Correlation Matrix                  DiagNet
                                                                                Regions                            Motifs, FCM                                                                                                              Spec=83.3

                                                                                                               Time Series Extraction
                                                                                                              from Different Regions,                                                 Auto-                                                  Acc=80
                                                     12 ASD
[60]         ABIDE-I              rs-fMRI                           CPAC         SCSC           NA                                                    FCM                PyTorch      ASD-        Proposed      SVM             5            Sen=73
                                                      14 HC                                                     Connectivity Matrix,
                                                                                                                                                                                     Network                                                 Spec=83
                                                                                                                 SMOTE Algorithm

                                                                                                                                                                                                                                            Acc=70.20
                                                    505 ASD
[61]         ABIDE-I              rs-fMRI                           CPAC        CC400           NA                     FCM                            FCM                  NA        2D-CNN         20          MLP             10          Sen=77.00
                                                     530 HC
                                                                                                                                                                                                                                            Spec=61.00

                                                                                                                                                                                                                                             Acc=70
                                                    505 ASD                                                                                                                          1D CNN
[101]         ABIDE               rs-fMRI                            NA           NA            NA                     FCM                     1D-Vector of FCM            NA                        7         Softmax         NA            Sen=74
                                                     530 HC                                                                                                                            -AE
                                                                                                                                                                                                                                             Spec=63

                                                    539 ASD                                                                                                                                                                              Mean DSC=91.56
[102]        ABIDE-I              rs-fMRI                            NA           NA         FreeSurfer                NA                       Single 3D Image          Theano     3D-FCNN         13         Softmax          6
                                                     573 HC                                                                                                                                                                              Mean MHD=14.05

                                                                                                                                                                                                                                            Acc=93.59
                                                    501 ASD                                                     FCM, Converting to           1000 Features Selected                                                          Different
[103]        ABIDE-I              rs-fMRI                          DPARSF        AAL            NA                                                                         NA         SSAE           3         Softmax                      Sen=92.52
                                                     553 HC                                                        1D-Vector                   by the SVM-RFE                                                                 Folds
                                                                                                                                                                                                                                            Spec=94.56
                                                                                                            Online Dictionary Learning
                                                                                               FSL                                                                                                                                       Average Acc= 70.5
                                                    100 ASD                                                 and Sparse Representation        4D Matrix with 150 3D
[104]        ABIDE-I              rs-fMRI                            NA           NA                                                                                     Theano      3D-CNN         14           NA             10        Average Sen= 74
                                                     100 HC                                                  Techniques, Generating          Network Overlap Maps
                                                                                               FEAT                                                                                                                                      Average Spec= 67
                                                                                                             Spatial Overlap Patterns
                                                                                                          Population Graph Construction,
                                 rs-fMRI &          529 ASD                                                                                        Population             Scikit
[105]        ABIDE-I                                                CPAC          HO           FSL         Feature Selection Strategies                                               GCN           11         Softmax          10           Acc=80.0
                               Phenotypic Info       571 HC                                                                                          Graph                -learn
                                                                                                             (RFE, PCA, MLP, AE)

                                 rs-fMRI &          403 ASD                                                                                     Mean Time-Series
[106]        ABIDE-I                                                 CCS        CC200           NA                     DA                                                 Keras       LSTM           6         Sigmoid          10           Acc=70.1
                               Phenotypic Info       468 HC                                                                                        from ROIs

                                 rs-fMRI &
                                                                                                                                                                                       Two                                                   Acc=70
                                  S-MRI &           505 ASD                    Craddock
[107]        ABIDE-I                                                CPAC                        NA                     FCM                     1D-Vector of FCM            NA         SdAE          NA         Softmax          10           Sen=74
                                 Phenotypic          530 HC                      200
                                                                                                                                                                                     + MLP                                                   Spec=63
                                     Info
                                  rs-fMRI                                                      FSL
                                                   75 Qualified                              Brainsuite         Extraction of Fetal            Mean Time Series                                                                             F1-score=84
[108]   Clinical Acquisition      Fetal                              NA           NA                                                                                     PyTorch     3D-CNN          7         Sigmoid         NA
                                                     Subjects                                 SPM8             Brain fMRI Data, SW            of 3D fMRI Volumes                                                                             AUC=91
                                BOLD fMRI
                                                                                              CONN
             ABIDE-I              rs-fMRI                                                                     Segmentation, Average                                                                                                         Acc=65.56
                                                    116 ASD                                                                                   Rs-fMRI + GM+WM
[109]                                                                NA          AAL           SPM8             Mean Time Series                                         Theano       DBN            6           LR             10            Sen=84
                                                     69 HC                                                                                        Data Fusions
             ABIDE-II              s-MRI                                                                          of Each ROI                                                                                                               Spec=32.96
                                                                                                                                                  FCM Vector              Keras
                                  rs-fMRI
                                                    418 ASD                        All                            FCM, Features               Anatomical Features        Tensor-    Different                  Various
[110]         IMPAC                                                  NA                         NA                                            Combination of Both         Flow                       8                          3            AUC= 80
                                                     497 HC                      atlases                      Extraction from S-MRI                                                 Networks                   Methods
                                   s-MRI                                                                                                   Anatomical and Connectivity
                                                                                                                                                                          Caffe
                                                                                                                                                  FCM Vector
                                  rs-fMRI                                        AAL                                                                                                                          Label Fus-
                                                                                                                                                                                    Ensemble                   ion Using                    Acc= 85.06
                                                    368 ASD                     CC200                                 FCM,                                                         of 5 Stacked
[111]        ABIDE-I                                                CPAC                     Freesurfer                                        1D-Vector of FCM            NA                       31       the Average        10           Sen= 81
                                                     449 HC                                                        Fisher Score                                                      AEs and
                                   s-MRI                                                                                                                                                                      of Softmax                     Spec= 89
                                                                               Destrieux                                                                                               MLP
                                                                                                                                                                                                             Probabilities
                                  rs-fMRI            61ASD                                                Data-Driven Landmark Discovery      50 Patches Extracted                    Multi-
[112]         NDAR                                                   NA           NA            NA                                                                         NA                       13         Softmax          10          Acc=76.24
                                   s-MRI             215 HC                                                 Algorithm, Patch Extraction       from 50 Landmarks                    Channel CNN

                                                                                               FSL                                                                                                  Each
                                                                                                                                                                                                                                            Acc=88.5
                                   All               78 ASD                    Proposed                          PICA, Extraction                 PSDs of 34                           34           SAE
[63]          NDAR                                                   NA                                                                                                    NA                                  PSVM            NA           Sen=85.1
                                 Modalities          124 HC                      Atlas                               of PSD                       Components                          SAEs         Has 2
                                                                                               BET                                                                                                                                          Spec=90.4
                                                                                                                                                                                                   Layers
                                   s-MRI                                                     In-House                                                                                                11
[113]         NDAR                               60 ASD, 211 HC      NA           NA                           3D Patches Extraction           Patch Size 166416           NA       DDUNET                       NA             5              NA
                                                                                               Tools                                                                                               Blocks
            ABIDE-I                              21 ASD, 21 HC                                 FSL             Segmentation, Shape                CDF Values
[114]      NDAR/Pitt               s-MRI         16 ASD, 16 HC       NA           NA                                                                                       NA        SNCAE          NA         Softmax         NA           Acc=96.88
                                                                                              iBEAT             Feature Extraction                of Features
           NDAR/IBIS                             10 ASD, 10 HC
Deep Learning for Neuroimaging-based Diagnosis and Rehabilitation of Autism Spectrum Disorder: A Review - arXiv
Acc=90.39
                                      78 ASD                                              Construction of Individual
[115]    ABIDE-I        s-MRI                       NA      Destrieux     FreeSurfer                                         3000 Top Features           NA        SAE          3       Softmax      10   Sen=84.37
                                      104 HC                                                 Network, F-score
                                                                                                                                                                                                          Spec=95.88
           HCP                        1113 HC               Desikan                         Normalization, Apply                                       Tensor-
[116]                   s-MRI         83 ASD        NA                    FreeSurfer                                        Preprocessed Images         Flow       DEA          3          NA        10   AUC=63.9
         ABIDE-I                                             Killia                           One-Hot Coding
                                      105 HC                                                                                                            Keras
         ABIDE                                                                                                                   32 Slices                                                                 Acc=84
[117]                   s-MRI       1112 Subjects   NA         NA          SPM12                     NA                      Along Each Axial,          Keras     DCNN         17       Sigmoid      NA    Sen=77
         CombiRx                                                                                                            Coronal, and Sagittal                                                          Spec=85
                                                                                                                         Coronal, Axial and Sagittal             FastSur-
[118]   ABIDE-II        s-MRI           NA          NA        DKT         FreeSurfer            Segmentation                                           PyTorch               Proposed   Softmax      NA       NA
                                                                                                                                 2D Slices                       fer CNN
                                                           HO Cortical
                                                               and                         GABM Method, New
                                      500 ASD                                                                                   Preprocessed
[119]    ABIDE-I        s-MRI                       NA     Subcortical      FSL            Chromosome Encoding                                           NA      3D-CNN        11       Softmax      5      Acc=70
                                       500 HC                                                                                    MRI Scans
                                                            Structural                           Scheme
                                                              Atlas

         Clinical                                                                            Sparse Annotations,                                                                                          Acc=91.6
[120]                   s-MRI          48 HC        NA         NA         FreeSurfer                                            Image Patch             Caffe    3D-CNN        18       Softmax      NA
        Acquisition                                                                                 DA                                                                                                    AUC=94.1

                                                             Brain-                         Filtering, Calculating
                                                                                                                                                                                                          Acc=74.54
                                      270 ASD                Netome                         Mean Time Series for          Mean Time Series Data                   CNN-
[121]    ABIDE-I       rs-fMRI                      CPAC                     NA                                                                          NA                    14       Sigmoid      5    Sen=63.46
                                       305 HC                 Atlas                       ROIs Using BrainNetome           Stacked Across ROIs                    GRU
                                                                                                                                                                                                          Spec=84.33
                                                             (BNA)                       Atlas (BNA), Normalization

                                                                                              Transformation of                                                                                            Acc=95.7
         Clinical                     25 ASD                                                                                                                     1D CNN-
[122]                   fNIRS                        No        No            No                the Time Series                 PM, GM, SM               Keras                  NA       Bagging      NA    Sen=97.1
        Acquisition                    22 HC                                                                                                                      LSTM
                                                                                              to Three Variants                                                                                            Spec=94.3
                                                                                                     SW                                                                                                    Acc=92.2
         Clinical                     25 ASD
[123]                   fNIRS                        No        No            No              Converted into the                  3D Tensor               NA      CGRNN          7          NA        NA    Sen=85.0
        Acquisition                    22 HC
                                                                                                3D Tensor                                                                                                  Spec=99.4
                                                                          FreeSurfer
         Different                                          Various          FSL
[124]                   s-MRI           NA          NA                                         Geometric DA                  3D Cortical Mask          Theano    ConvNet     U-Nets        NA        8        NA
         Datasets                                           Methods        SPM12
                                                                           VolBrain
                                                                                       Performed an Automatic Quality
                                                                                         Control, Visually Inspection,
                                                                                                                                                                    MM-
                                      620 ASD                                          9 Temporal Summary Measures,                                                                     Majority            Acc=64
[125]   ABIDE I+II     rs-fMRI                      CPAC       HO           FSL                                           Each Summary Measure           NA       ensemble      7                    5
                                      2085 HC                                          Mean and STD of the Summary                                                                      Voting            F1-Score=66
                                                                                                                                                                 (3D-CNN)
                                                                                          Measures, Normalization,
                                                                                         Occlusion of Brain Regions
                        rs-fMRI       184 ASD                                                                                                                    3D-CNN                                     Acc=77
[126]    ABIDE-I      Phenotypic                    CPAC       NA            NA                Down Sampling                  Raw 4D Volume              NA                    21       Softmax      5
                                       110 HC                                                                                                                    C-LSTM                                   F1-score=78
                      Information
                                                            264 ROIs       AFNI                     FCM,
                      rs-fMRI and     403 ASD                 Based                                                                                                                                       Acc= 79.2
[127]    ABIDE-I                                    NA                     FSL                Feature Extraction            Normalized Features          NA         AE          7         DNN        10
                          s-MRI        468 HC              Parcellation                                                                                                                                   AUC= 82.4
                                                                          MATLAB             (Different Features)
                                                             Scheme
                                                                                                                                                                                                           Acc=71
                                      505 ASD                                                                                                                                           K-Means
[128]    ABIDE-I       rs-fMRI                      CPAC     CC200           NA                     FCM                      1D-Vector of FCM          PyTorch   CapsNet     Standard                10    Sen=73
                                       530 HC                                                                                                                                           Clustering
                                                                                                                                                                                                           Spec=66

                                                                                                                                                                                                                        9
Deep Learning for Neuroimaging-based Diagnosis and Rehabilitation of Autism Spectrum Disorder: A Review - arXiv
10
                                                           TABLE II: Summary of papers published on rehabilitation of ASD patients using DL algorithms.
                                              Type of                       Number                                                                  Inputs                   DNN                                Number of                                              Performance
Work         Datasets                                                                                   Preprocessing                                                                          DNNs                                     Classifier           K fold
                                            Applications                    of Cases                                                                 DNN                    Toolbox                              Layers                                                Criteria (%)

                                                                                                                                                                              Caffe                                                                                     Acc=85
                                                                             20 ASD                   HFM Construction,                         HFMs, Natural
[129]         OSIE                               —                                                                                                                                           VGGNeT                 50                  Softmax               13        Sen=80
                                                                              19 HC               Filtering Normalizing, DA                     Scene Images
                                                                                                                                                                           TensorFlow                                                                                   Spec=89

                                                                               70
[73]          KDEF                 Facial Expression Recognition                                             DA                             RGB Images (562762)              Keras            DCNN                  44                  Softmax               NA       Acc=78.32
                                                                           Individuals

            Clinical           Detecting Audio Regimes That Directly                                                                                                                     Noisemes Network        Standard               Synthetic
[130]                                                                        33 ASD                  MFCC Spectrograms                         32 Spectrograms                NA         DiarTK Diarization                                                   NA        Acc=84.7
           Acquisition       Estimate ASD Severity Social Affect scores                                                                                                                                          Network                  RF
                                                                                                                                                                                              Network
            Kaggle’s                                                                                                                                                          Keras
[72]        FER2013                Facial Expression Recognition               NA                             No                              4848-Pixel Images            (TensorFlow        DCNN                  44                  Softmax               NA       Acc=86.44
             KDEF                                                                                                                                                            Backend)
                                                                                                                                                                                                                                                                       Acc=57.90
                                                                             14 ASD                    SalGAN Model,
[131]       SALICON                      ASD Classification                                                                               Sequence of Image Patches           NA            SP-ASDNet               11                     NA                 NA       Rec=59.21
                                                                              14 HC                   Feature Extraction
                                                                                                                                                                                                                                                                       Pre=56.26

                                                                                                                                        5-channel Sub-Sequence Stacks
[132]       BigFaceX               Facial Expression Recognition          196 Subjects     SW, Merge in the Channel Dimension, DA                                            Keras         TimeConvNet         PreTrain Nets            Softmax               NA        Acc=97.9
                                                                                                                                        within a Specific Time Window

            Different                                                                             Interactive and Intelligent
[133]                        Suitable Courseware for Children with ASD         NA                                                              Different Inputs               NA           Different Nets          NA                      NA                 NA           NA
            Datasets                                                                              Chatbot , NLP, Visual Aid

                                                                                                  Resizing, Frame Extraction,
                                                                                                                                                                                              R-CNN              VGG-16                  K-NN                          Acc= 88.2
                                   Estimating Visual Attention in                                      Visual Inspection                   5 Facial Landmarks - 36                                                                                                      Pre=83.3
[134]    Camera Images                                                    6 ASD and ID                                                                                        NA                                                                              10
                                      Robot-Assisted Therapy                                     Face Detection (ViolaJones),                 HOG Descriptors                                                                                                          Sen=83.0
                                                                                                                                                                                                              Cascaded CNNs
                                                                                                                                                                                             MTCNN                                        Nave                         Spec=87.3
                                                                                             Feature Extraction (HOG Descriptors)                                                                              Architecture
              Sensor                                                         6 ASD                                                                                                                                                      Majority
[135]                                Automatic SMM detection                                      Resampling, Filtering, SW             Time-Series of Multiple Sensors      Keras          CNN-LSTM                13                                        NA           NA
               Data                                                           5 HC                                                                                                                                                      Voting
                                                                                                    Segmentation, Different
[136]       KOMAA                  Facial Expression Recognition           55 subjects                                                 Greedy Forward Feature Selection       NA               CNN                  9                     SVM                 NA         Acc=96
                                                                                                      Features, Z-scores

          Story-Telling                                                      31 ASD                                                                                                                                            Coherence Representation of
[137]                                    ASD Classification                                         DA, ChineseWord2Vec                  32-Dimensional Word Vector           NA              LSTM                  1                                         NA         Acc=92
        Narrative Corpora                                                     36 HC                                                                                                                                                LSTM Forget Gate

            Ext-Dataset                                                     136 ASD              TLD Method, Accumulative                  Angle Histogram, Length                                                                                                      Acc=92.6
[138]                          ASD Classification using Eye Tracking                                                                                                         Keras            LSTM                  4                      NA                 10        Sen=91.9
          (video dataset)                                                    136 HC               Histogram Computation                Histogram and Fused Histogram,
                                                                                                                                                                                                                                                                        Spec=93.4
                                     Predicting Visual Attention                                                                                                                                                                                                        SIM=67.8
[139]       MIT1003                                                        300 Images                        NA                                  Raw Images                   NA               DCN                  26                     NA                 NA         CC=76.9
                                       of Children with ASD.
                                                                                                                                                                                                                                                                       AUC-J=83.4
          Scan Path Data,                                                    14 ASD                                                                                                                                                                                     Acc=55.13
[75]    Including Location               ASD Classification                                              DA Methods                           Image, Data Points             Pytorch         ResNet18            Standard               Softmax               NA        Sen=63.5
                                                                              14 HC
           and Duration                                                                                                                                                                                                                                                 Spec=47.1
          UCI Machine                                                                                                                                                                                                                                                  Acc=99.53
[140]       Learning                     ASD Classification                    704                    Different Methods                       Preprocessed Data               NA               CNN                  7                      NA                 NA       Sens=99.39
           Repository                                                                                                                                                                                                                                                   Spec=100

                                                                                                                                                                             Keras,
          Eye Tracking                                                       29 ASD         Visualization of Eye-Tracking Scanpaths                                                                                                    K-Means                          Silhouette
[76]                                     ASD Classification                                                                                     100*100 Image                Scikit-            AE                  8                                         NA
            Scanpath                                                          30 HC                    Scaling Down, PCA                                                                                                               Clustering                       score=60
                                                                                                                                                                             Learn

                                      Engagement Estimation                                                                                                                  Keras                              R-CNN +                                                ICC=43.35
                                                                                                                                             Cropped Face Images              with
[141]      Video Data             of Children with ASD During a            30 children                       NA                                                                             CultureNet          ResNet50+               Softmax               NA       CCC=43.18
                                                                                                                                                 (256 *256)                TensorFlow
                                  Robot-Assisted Autism Therapy                                                                                                                                                 5FC layers                                              PC=45.17
                                                                                                                                                                            Backend
            YouTube                                                                                                                                                                                                                                                    Avg Pre=73
                              Modeling Typical and Atypical Behaviors                                                                   Sequences of Individual Frames      openCV,
[142]         ASD                                                         68 video Clips              Different Methods                                                                       DCNN                 NA                      DT                  5      Avg Recall=75
                                         in ASD Children                                                                                     at a Rate of 30 fps             Caffe
             Dataset                                                                                                                                                                                                                                                   Avg Acc=71
                                     Behavioral Data Extracted                               Segmentation, Upper Body tracking,                                                                                                                                        Acc=88.46
             Video                                                           5 ASD                                                       3 Movement Features with 68
[143]                                 from Video Analysis of                                    Laban Movement Analysis to                                                    NA               CNN                  10                  Softmax               NA       Pre=89.12
             Dataset                                                          7 HC                                                            Facial Key-Points
                                     Child-Robot Interactions.                                 Drive Weight, Different features                                                                                                                                       Recall=88.53

             Video                  Developing Automatic SMM                                      Resampling, Filtering, SW,                Time-Series of Multiple          Deeppy
[144]                                                                        6 ASD                                                                                                             CNN                  8                     SVM                 NA       F1-score=95
             Dataset                    Detection Systems                                        Data Balancing, Normalizing                Accelerometer Sensors            Library

                                                                                                                                                                                                                                                                        Acc=100
                                                                                                                                      The Embedded Categorical Variables
                                                                            513 ASD          Cleaning Missing Values and Outliers,                                                                                                                                      Spec=99
[145]    ASD Screening                   Autism Screening                                                                              are Concatenated with Numerical        NA              DENN                  4                   Sigmoid               NA
                                                                             189 HC             Visualization, Identity Mapping                                                                                                                                         Sen=100
                                                                                                                                       Features as New Feature Vectors
                                                                                                                                                                                                                                                                       F1-score=99

         ASD Screening                   Classification of                                   Handling of Missing Values, Variable                                                                                                                                      Acc=99.40
[146]                                                                          —                Reduction, Normalization, and                Normalized Variables            Keras             DNN                  7                   Sigmoid               NA       Sen=97.89
           Datasets                      Adults with ASD
                                                                                                       Label Encoding                                                                                                                                                  Spec=100
11

Fig. 13: Number of DL tools used for the diagnosis and             Fig. 14: Number of of DL networks used for ASD detection
rehabilitation of ASD patients in reviewed papers.                 and rehabilitation in the reviewed works.

                      VI. D ISCUSSION
   In this study, we performed a comprehensive overview of
the investigations conducted in the scope of ASD diagnostic
CAD systems as well as DL-based rehabilitation tools for ASD
patients. In the field of ASD diagnosis, numerous papers have
been published using functional and structural neuroimaging
data as well as rehabilitation tools, as summarized in Table
III in the appendix. A variety of DL toolboxes have been
proposed for implementing deep networks. In Tables I and
II the types of DL toolboxes utilized for each study are
depicted, and the total number of their usage is demonstrated
in Figure 13. The Keras toolbox is used in the majority of the
studies due to its simplicity. Keras offers a consistent high-
level application programming interface (APIs) to build the
models more straightforward, and by using powerful backends
such as TensorFlow, its performance is sound. Additionally,
                                                                   Fig. 15: Number of various classification algorithms used for
due to all pre-trained models and available codes on platforms
                                                                   the detection of ASD and rehabilitation in DL.
such as GitHub, Keras is quite popular among researchers.
   Number of DL networks used for the ASD detection in the
reviewed works is shown in Figure 14. Among the various
DL architectures, CNN is found to be the most popular one          and healthy) available in the public domain. Hence, researchers
as it has achieved more promising results compared to other        are not able to broaden their investigation to all sub-types
deep methodologies. The autoencoder, as well as RNN, have          of ASD. Two of the cheapest and most pragmatic functional
yielded favorable results. It can be noted that in recent years,   neuro-screening modalities for diagnosis of ASD are EEG, and
the number of DL-based papers has increased exponentially          fNIRS. But unfortunately the deficiency of freely available
due to their sound performance and also the availability of        datasets has resulted in little research in this area. Another
vast and thorough datasets.                                        obstacle is that multi-modality databases such as EEG-fMRI
   The number of various classification algorithms used in DL      are not available to researchers to evaluate the effectiveness of
networks are shown in Figure 15. One of the best and most          incorporating information of different imaging modalities to
widely used is the Softmax algorithm (Tables I and II). It is      detect ASD. Although fMRI and sMRI data are ubiquitous in
most popular since it is differentiable in the entire domain and   the ABIDE dataset, the results of merging these structural and
computationally less expensive.                                    functional data for ASD diagnosis using DL have not yet been
                                                                   investigated. Another problem grappling the researchers is
                     VII. C HALLENGES                              designing the DL-based rehabilitation systems with hardware
  Some of the most substantial challenges in ASD diagnosis         resources. Nowadays, assistive tools such as Google Colab
scope using DL-based techniques are addressed in this section,     are available to researchers to improve the processing power;
which comprise database and algorithmic problems. There are        however, the problems still prevail when implementing these
only two-class brain structural and functional datasets (ASD       systems in real-world scenarios.
12

         VIII. C ONCLUSION AND F UTURE W ORKS                           The receiver operating characteristic curve (ROC-curve)
   ASD is typically characterized by social disorders, com-          depicts the performance of the proposed model at all clas-
munication deficits, and stereotypical behaviors. Numerous           sification thresholds. It is the graph of true positive rate vs.
computer-aided diagnosis systems and rehabilitation tools have       false positive rate (TPR vs. FPR). Equations for calculation of
been developed to assist patients with autism disorders. In this     TPR and FPR are presented below.
survey, research on ASD diagnosis applying DL and func-                                                   TP
                                                                                           TPR =                                     (6)
tional and structural neuroimaging data were first assessed.                                          TP + FN
The researchers have taken advantage of deep CNNs, RNNs,                                                  FP
AEs, and CNN-RNN networks to improve the performance                                         FPR =                                   (7)
                                                                                                      FP + TN
of their systems. Boosting the accuracy of the system, the
capability of generalizing and adapting to differing data and           A REA UNDER THE ROC C URVE (AUC)
real-world challenges, as well as reducing the hardware power           AUC presents the area under the ROC-curve from (0, 0)
requirements to the extent that the final system can be utilized     to (1, 1). It provides the aggregate measure of all possible
by all are the principal challenges of these systems. To enhance     classification thresholds. AUC has a range from 0 to 1. A
the accuracy and performance of CADS for ASD detection in            100% wrong classification will have AUC value of 0.0, while
the future, deep reinforcement networks (RL) or GANs can be          a 100% correct classified version will have the AUC value
exploited. Scarcity of data is always an aparamount problem in       of 1.0. It has two folded advantages. One is that it is scale-
the medical field that can be resolved relatively with the help of   invariant, which implies how well the model is predicted rather
these deep GANs. Also, as another direction for future works,        than checking the absolute values. The second advantage is
handcrafted features can be extracted from data and fed to           that it is classification threshold-invariant as it will verify the
DL networks in addition to raw data; this can help increasing        performance of the model irrespective of the threshold being
performance by adding the potential of traditional methods to        selected.
DL-based models.
   Many researchers have proposed various DL-based rehabil-                                    A PPENDIX B
itation tools to aid the ASD patients. Designing a reliable,           Table III shows details of Deep Nets in all the papers
accurate, and wearable low power consumption DL algorithm            reviewed in this study.
based device is the future tool for ASD patients. An achievable
rehabilitation tool is to wear smart glasses to help the children                         ACKNOWLEDGMENT
with ASD. These glasses with the built-in cameras will acquire          MB is supported by a NHMRC Senior Principal Re-
the images from the different directions of environment. Then        search Fellowship (1059660 and 1156072). MB has received
the DL algorithm processes these images and produces mean-           Grant/Research Support from the NIH, Cooperative Research
ingful images for the ASD children to better communicate             Centre, Simons Autism Foundation, Cancer Council of Vic-
with their surroundings.                                             toria, Stanley Medical Research Foundation, Medical Benefits
                                                                     Fund, National Health and Medical Research Council, Medical
                         A PPENDIX A                                 Research Futures Fund, Beyond Blue, Rotary Health, A2 milk
                    S TATISTICAL M ETRICS                            company, Meat and Livestock Board, Woolworths, Avant and
  This section demonstrates the equations for the calculation        the Harry Windsor Foundation, has been a speaker for Astra
of each evaluation metric. In these equations, True positive         Zeneca, Lundbeck, Merck, Pfizer, and served as a consultant
(TP) is the correct classification of the positive class, True       to Allergan, Astra Zeneca, Bioadvantex, Bionomics, Collab-
negative (TN) is the correct classification of the negative class,   orative Medicinal Development, Lundbeck Merck, Pfizer and
False positive (FP) is the incorrect prediction of the positives,    Servier - all unrelated to this work.
False negative (FN) is the incorrect prediction of the negatives.
                                   TP + TN
         Accuracy(Acc) =                                      (1)
                              TP + TN + FP + FN
                                         TN
              Specif icity(Spec) =                            (2)
                                       TN + FP
                                        TP
               Sensitivity(Sen) =                             (3)
                                      TP + FN
                                        TP
               P recision(P rec) =                            (4)
                                      TP + FP
                                   P rec ∗ Sens
               F 1 − Score = 2 ∗                              (5)
                                   P rec + Sens
  R ECEIVER O PERATING C HARACTERISTIC C URVE (ROC-
C URVE )
13

                                          TABLE III: Details of Deep Nets. For ASD diagnosis and Rehabilitation.
Work          Network                                             Details for Deep Networks                                        Dropout              Classifier               Optimizer              Loss Function
                                                                                                                                 2 (rate=0.5)
[24]           2CC3D                                CNN Layers (6) + Pooling Layers (4) + FC Layers (2)                                                  Sigmoid                   NA                       BCE
                                                                                                                                 2 (rate=0.65)
[77]           2CC3D                              CNN Layers (6) + Pooling Layers (4) + FC Layers (3)                                 NA                 Sigmoid                   NA                       NA
[78]          3D-CNN                      CNN Layers (2) + LReLU Actication + Pooling Layers (1) + FC Layers (1)                 3 (rate=NA)             Softmax                  SGD                      MNLL
[59]           LSTM                              LSTM Layers (1) + Pooling Layers (1) + FC Layers (3)                            1 (rate=0.5)            Sigmoid                 Adadelta                  MSE
                                                                                                                                 2 (rate=0.3)
[79]            CNN                          CNN Layers (2) + ReLU Activation + BN Layers (4) + FC Layers (3)                                            Softmax                  Adam                      NA
                                                                                                                                 2 (rate=0.7)
[80]           2CC3D                              CNN Layers (6) + Pooling Layers (4) + FC Layers (2)                            2 (rate=NA)             Sigmoid                    NA                      NA
[81]            CNN                        CNN Layers (2) + ELU Activation + Pooling Layers (2) + FC Layers (2)                       NA                 Sigmoid                   SGD                      NA
                                                                                                                                                                                                            MSE
[82]             AE                                           Standard AE with Tanh Actication                                       NA                    SLP                     NA
                                                                                                                                                                                                            BCE
[83]          G-CNN                                         Proposed G-CNN with 3 Layer CNN                                       (rate=0.3)             Softmax                  Adam                      NA
         BrainNetCNN with                  Element-wise layer (1) + E2E layers (2) + E2N layer (1) + N2G layer (1)               5 (rate=0.5)
[84]                                                                                                                                                     Softmax                  Adam             Proposed Loss Function
          Proposed Layers                        + FC layers (3)+ Leaky ReLU Activation+ Tanh Activation                         1 (rate= 0.6)
[85]           DAE                                                      Standard DAE                                                  NA                   NA                       NA             Proposed Loss function
[86]          LeNet-5                                           Standard LeNet-5 Architecture                                         NA                 Softmax                    NA                      NA
[87]           SAE                                                 SAE with LSF Activation                                            NA                 Softmax                  LBFGS                     NA
                                                                                                                                                                                   Adam
[62]          MCNNEs                      CNN Layers (3) + ReLU Activation + Pooling Layers (3) + FC Layers (1)                  1 (rate=0.5)           Binary SR                                           BCE
                                                                                                                                                                                  Adamax
                                                                                                                                                                                   SGD                      BCE
[88]          3D-CNN                       CNN Layers (2) + ELU Activation + Pooling Layers (2) + FC Layers (3)                      NA                  Sigmoid
                                                                                                                                                                                   Adam                    MSD
[89]            VAE                                                 VAE with 3 Layers                                                 NA                   NA                    Adadelta          Proposed Loss Function
[90]           LSTM                                 LSTM Layers (1) + Pooling Layers (1) + FC Layers (1)                          1(rate=0.5)            Sigmoid                 Adadelta                   BCE
[91]            SAE                                       SAE Layers (3) + Sigmoid Activation                                         NA                Clustering             Proposed Opt.                NA
[92]            SAE                                       SAE Layers (8) + Sigmoid Activation                                         NA                   SR                     L-BFGS                    MSE
                                                                                                                                                                                                            BCE
[93]           LSTM                                 LSTM Layers (2) + Pooling Layers (1) + FC Layers (2)                             NA                  Sigmoid                  Adam
                                                                                                                                                                                                            MSE
                                        3 MLP (1 Dropout Layer and 4 Dense Layers) + Self-Attention (3) + Fusion (3)
[94]     Multichannel DANN                                                                                                       1 (rate=NA)             Sigmoid                   NA                        CE
                                           + Aggregation Layer + Dense Layer (1) + ReLU, ELU, Tanh Activations
                                                                                                                                                                             Scaled Conjugate
[95]            SSAE                                                   3 SSAE Layers                                                 NA                  Softmax                                   Proposed Loss Function
                                                                                                                                                                             Gradient Descent
 [96]         1D-CNN                                 CNN Layers (1) + Pooling Layers (1) + FC Layers (1)                          (rate=0.2)             Softmax                  Adam                       NA
 [97]          CNN                          CNN Layers (6) + Pooling Layers (4) + BN Layers (2) + FC Layers (2)                  1 (rate=0.25)           Sigmoid                  Adam              Propose Loss Function
 [98]      1D CAE-CNN            Encoder (4 layers) + Decoder (4 layers) + CNN layers (2) + pooling layers (2) + FC layers (2)        NA                   NA                       NA                       NA
 [99]         AlexNet                                           Standard AlexNet Architecture                                         NA                 Softmax                    NA                       CE
[100]      ASD-DiagNet                                                 Proposed DiagNet                                               NA                   SLP                      NA                       NA
 [60]    Auto-ASD-Network                                        Proposed Auto-ASD-Network                                            NA                  SVM                       NA                      NLLF
 [61]          CNN                                    CNN layers (7) + Pooling layers (7) + FC layers (3)                        1 (rate=0.25)            MLP                       NA                       NA
[101]      2 SdAE-CNN                                     Proposed SDAE-CNN with 7 Layes CNN                                          NA                 Softmax                    NA                       NA
[102]        3D-FCNN                                 CNN Layers (9) + PReLU Activation + FC Layers (3)                                NA                 Softmax                   SGD                       CE
[103]          SSAE                                                     2 Layers SSAE                                                 NA                 Softmax                    NA                       NA
[104]         3D-CNN                  CNN layers (7) + Pooling Layers (3) + FC Layers (2) + Log-Likelihood Activation            2 (rate=0.2)              NA                      SGD                     MNLL
               GCN                                        GCN with ReLU and Sigmoid Actication                                                                                                               CE
[105]                                                                                                                             (rate=0.3)             Softmax                   NA
                AE                                                 SAE wth Tanh Activation                                                                                                                  MSE
                                                                                                                                                                                                            BCE
[106]          LSTM                                                 Proposed Deep Nework                                          (rate=0.5)             Sigmoid                 Adadelta
                                                                                                                                                                                                            MSE
[107]       2 SdAE-MLP                                      Proposed 2-SDAE-MLP Network                                               NA                 Softmax                   NA                       MSE
[108]         3D-CNN                      CNN Layers (2) + ReLU Activation + Pooling Layers (2) + FC Layers (2)                       NA                 Sigmoid                  SGD                       BCE
[109]           DBN                                            DBN with 5 Hidden Layers                                               NA                   LR                      NA                        NA
[110]         FeedFWD                                      Dense layers (5) + LReLU activation                                   3 (rate= NA)              NA                     Adam                      BCE
              Ensemble                                                                                                                                Averaging the
                                                                                                                                                         Softmax
[111]         of 5 SAEs                                       5 [ AE (3) + MLP (2)] + Softmax                                    5 (rate=NA)                                       NA                       NA
                                                                                                                                                        Activation
              and MLPs
                                                                                                                                                       Probabilities
[112]    Multi-Channel CNN              CNN Layers (5) + ReLU Activation + Pooling Layers (2) + FC Layers (5)                         NA                 Softmax                   NA                         CE
 [63]          34 SAEs                                             34 [ SAE network (2)]                                              NA                  PSVM                   L-BFGS                       NA
[113]         DDUNET                             Proposed DDUNET with 11 blocks and ReLU Activation                               (rate=0.1)               NA                     SGD                         CE
[114]          SNCAE                                             Proposed SNCAE Newtork                                               NA                 Softmax                   NA                         NA
[115]           SpAE                                               SpAE with 2 Networks                                               NA                 Softmax                   NA                        MSE
[116]            DAE                                             AE (3) + SELU Activation                                             NA                   NA                     Adam            Sum of MSE + 2 CE + CC
[117]           DCNN                    CNN Layers (6) + ReLU Activation + Pooling Layers (6) + FC Layers (4)                         NA                 Sigmoid                  Adam                       BCE
[118]      FastSurfer CNN                                    Proposed FastSurfer CNN Network                                          NA                 Softmax                  Adam              Logistic & Dice Losses
[119]          3D-CNN                   CNN Layers (3) + ReLU Activation + Pooling Layers (3) + FC Layers (2)                    2 (rate=0.5)            Softmax                 Adadelta                     CE
[120]         3D-UNET                  DCNN Layers (7) + ReLU Activation + Pooling Layers (2) + BN Layers (6)                    2 (rate=0.5)            Softmax                  SGD                    weighted CE
[121]        CNN-GRU             CNN Layers (4) + GRU Layers (2) + ReLU Activation + Pooling Layers (2) + FC Layers (5)               NA                 Sigmoid                  Adam                       BCE
[122]     1D CNN - LSTM                              Proposed 1D-CNN LSTM with ReLU Activation                                    (rate=0.2)             Softmax                  Adam                       CCE
                                                 CNN layers (3) + ReLU Activation + Pooling Layers (1)
[123]         CGRNN                                                                                                              1 (rate=0.5)              NA                     Adam                      BCE
                                                 + GRU Layers (1) + Sigmoid Activation + FC Layer (1)
[124]         ConvNet                                Variation of the U-net Convolutional Architecture                               NA                    NA                     Adam             Proposed Loss Function
[125]         3D-CNN                    CNN Layers (2) + ELU Activations + Pooling Layers (2) + FC Layers (2)                        NA                  Sigmoid                  SGD                       BCE
                                      CNN Layers (8) + Conv-Bi LSTM Layers (2) + Sigmoid Activation (for LSTM)
[126]     3DCNN C-LSTM                                                                                                           8 (rate=0.2)            Softmax                  Adam                       CE
                                                           + Pooling Layers (1) + FC Layers (1)
[127]         AE                                                Proposed AE with 7 Layers                                             NA                DNN                        NA                       NA
[128]      CapsNets                                                 Standard Architecture                                             NA           K-Means Clustering             Adam             Proposed Loss Function
[129]   VGGNets + ASDNet               CNN Layers (27) + ReLU Activation + Pooling Layers (10) + FC Layers (6)                   6 (rate=0.5)          Softmax                    SGD                       CE
                                                                                                                                 7 (rate=0.25)
[73]           DCNN                 CNN Layers (7) + Activation+ Pooling Layers (13) + FC Layers (3) + BN Layers (10)                                    Softmax                   SGD                      NA
                                                                                                                                 3 (rate=0.5)
            Noisemes net
[130]                                                                 Standard Networks                                              NA                    RF                      NA                       NA
        DiarTK Diarization net
                                                                                                                                 7 (rate=0.25)
[72]           DCNN              CNN Layers (7) + ELU Activation + Pooling Layers (13) + FC Layers (3) + BN Layers (10)                                  Softmax                   SGD                      NA
                                                                                                                                 3 (rate=0.5)
[131]        SP-ASDNet                  CNN Layers (2) + LSTM Layers (2) + Pooling Layers (3) + FC Layers (2)                    2 (rate=NA)               NA                     Adam                      BCE
                                          Convolutional Spatiotemporal Encoding Layer+ Backbone Convolutional
[132]       TimeConvNet                                                                                                              NA                  Softmax                  Adam                      CCE
                                          Neural Network Architecture (Mini-Xception, ResNet20, MobileNetV2)
[133]    Different Networks                                          Proposed Structure                                              NA                    NA                      NA                       NA
               RCNN                                                      VGG-16
[134]                                                                                                                                NA                   KNN                      NA                       NA
              MTCNN                                             Cascaded CNNs Architecture
                                                  CNN Layers (3) + LSTM Layers (1) + ReLU Activation                             1 (rate=0.5)
[135]        CNN-LSTM                                                                                                                                    Softmax                   SGD                      NA
                                                            + Pooling Layers (3) + FC Layers (3)                                 1 (rate=0.2)
[136]            CNN                               CNN Layers (4) + Pooling Layers (2) + FC Layers (2)                               NA                  Softmax                   NA                       NA
[137]           LSTM                                                  LSTM layer (1)                                                 NA          Coherence Representation          NA                       NA
[138]           LSTM                             LSTM Layers (3) + Sigmoid Activation + FC Layers (1)                                NA                    NA                      NA                       CE
[139]            DCN                CNN Layers (17) + Pooling Layers (3) + Deconvolution Layers (3) + Learned Priors (3)             NA                    NA                      NA              Proposed Loss Function
 [75]    Pretrained Resnet18                                  Standard ResNet-18 Architecture                                     Standard              Standard                  Adam                      BCE
[140]            CNN                     CNN Layers (2) + ReLU Activation + Pooling Layers (2) + FC Layers (2)                   1 (rate=0.5)              NA                     Adam                      BCE
 [76]             AE                                                 AE with 8 Layers                                                NA            K-Means Clustering              NA                       NA
[141]         CultureNet                             Faster R-CNN + Modified ResNet50 + 5FC Layers                                   NA                  Softmax                  Adelta           Proposed Loss Function
[142]           DCNN                                Proposed DCNN Architecture with Different Layers                                 NA            Decision Tree (DT)       Manual Optimization             NA
[143]            CNN                                CNN Layers (2) + ReLU Activation + FC Layers (3)                             4 (rate=0.2)            Softmax                   NA                       NA
            SA-B3D with                                                                                                                                                                                     CE
[144]                                     CNN Layers (5) + LSTM Layers (1) + Pooling Layers (4) + FC Layers (1)                      NA                  Sigmoid                  Adam
            LSTM Model                                                                                                                                                                             Proposed Loss Function
 [74]            CNN                      CNN Layers (3) + ReLU Activation + Pooling Layers (3) + FC Layers (1)                       NA                  SVM                      SGD                      NA
[145]           DENN                        Proposed DENN Architecture with ReLU Activation + FC Layers (2)                           NA                 Sigmoid              mini-batch SGD                CCE
                                                                                                                                  (rate =0.2)
[146]           DNN                                 Propoded DNN with ReLU Activation + FC Layers (2)                                                    Sigmoid                  Adam                      BCE
                                                                                                                                  (rate =0.4)
You can also read