Applications of Artificial Neural Networks in Microorganism Image Analysis: A Comprehensive Review from Conventional Multilayer Perceptron to ...

Page created by Ken Higgins
 
CONTINUE READING
Applications of Artificial Neural Networks in Microorganism Image Analysis: A Comprehensive Review from Conventional Multilayer Perceptron to ...
Noname manuscript No.
                                        (will be inserted by the editor)

                                        Applications of Artificial Neural Networks in
                                        Microorganism Image Analysis: A Comprehensive
                                        Review from Conventional Multilayer Perceptron
                                        to Popular Convolutional Neural Network and
                                        Potential Visual Transformer

                                        Jinghua Zhang1,2 · Chen Li1,∗,# · Marcin
                                        Grzegorzek2
                                        Received: date / Accepted: date
arXiv:2108.00358v1 [cs.CV] 1 Aug 2021

                                        Abstract Microorganisms are widely distributed in the human daily living
                                        environment. They play an essential role in environmental pollution control,
                                        disease prevention and treatment, and food and drug production. The identifi-
                                        cation, counting, and detection are the basic steps for making full use of differ-
                                        ent microorganisms. However, the conventional analysis methods are expen-
                                        sive, laborious, and time-consuming. To overcome these limitations, artificial
                                        neural networks are applied for microorganism image analysis. We conduct
                                        this review to understand the development process of microorganism image
                                        analysis based on artificial neural networks. In this review, the background
                                        and motivation are introduced first. Then, the development of artificial neural
                                        networks and representative networks are introduced. After that, the papers
                                        related to microorganism image analysis based on classical and deep neural
                                        networks are reviewed from the perspectives of different tasks. In the end, the
                                        methodology analysis and potential direction are discussed.
                                        Keywords Microorganism image analysis · Classical neural network · Deep
                                        neural network · Transfer learning

                                        1 Introduction

                                        Microorganisms are tiny living organisms that can appear as unicellular, multi-
                                        cellular, and acellular types [91]. We provide some examples in Fig. 1. Some mi-
                                        croorganisms are benefiting, such as Lactobacteria can decompose substances
                                        to give nutrients to plants [70], Actinophrys can digest the organic waste in

                                        1. Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological
                                        Information Engineering, Northeastern University, Shenyang, China
                                        2. Institute for Medical Informatics, University of Luebeck, Luebeck, Germany
                                        ∗. Co-first author: Chen Li
                                        #. Corresponding author: Chen Li, E-mail: lichen201096@hotmail.com; Marcin Grzegorzek,
                                        E-mail: marcin.grzegorzek@uni-luebeck.de
Applications of Artificial Neural Networks in Microorganism Image Analysis: A Comprehensive Review from Conventional Multilayer Perceptron to ...
2                                                                 Jinghua Zhang1,2 et al.

sludge and increase the quality of freshwater [167], and Rhizobium legumi-
nosarum can help soybean to fix nitrogen and supply food to human beings [8].
However, there are also many harmful microorganisms, such as Mycobacterium
tuberculosis can lead to disease and death [43], and the novel coronavirus dis-
ease 2019 (COVID-19) constitutes a public health emergency globally [112].
Therefore, microorganism research plays a vital role in pollution monitoring,
environmental management, medical diagnosis, agriculture, and food produc-
tion [70, 80], and the analysis of microorganisms is the essential step for related
researches and applications [82].

    Actinophrys   Arcella     Aspidisca    Codosiga    Colpoda     Epistylis     Euglypha

    Paramecium    Rotifera    Vorticella   Noctiluca   Ceratium    Stentor      Siprostomum

    K.Quadrala    Euglena    Gymnodinium   Gonyaulax   Phacus     Stylonychia    Synchaeta

Fig. 1 The examples of images of the investigated classes of microorganisms [167].

    In general, microorganism analysis methods can be summarised into four
categories: chemical (e.g., chemical component analysis), physical (e.g., spec-
trum analysis), molecular biological (e.g., DNA and RNA analysis), and mor-
phological (e.g.manual observation under a microscope) methods [80]. Their
main advantages and disadvantages are compared in Table 1. The chemical
method is highly accurate but often results in secondary pollution of chemical
reagent [80]. The physical method also has high accuracy, but it requires expen-
sive pieces of equipment [80]. The molecular biological method distinguishes
microorganism by sequence analysis of genome [159]. This strategy needs ex-
pensive pieces of equipment, plenty of time, and professional researchers. The
morphological method is the most direct and brief approach, where a microor-
ganism is observed under a microscope and recognized manually based on its
shape [106]. The morphological method is the most cost-effective of the above
methods, but it is still laborious, tedious, and time-consuming [167]. Besides,
the objectivity of this manual analyzing process is unstable, depending on the
experience, workload, and mood of the biologist significantly.
    In recent years, Artificial Intelligence (AI) technology develops rapidly [169].
It achieves outstanding performance in many fields of image analysis and pro-
cessing, such as autonomous driving [125, 18, 147], face recognition [88, 146,
32], and disease diagnosis [133, 83, 112]. AI can undertake the laborious and
Applications of Artificial Neural Networks in Microorganism Image Analysis: A Comprehensive Review from Conventional Multilayer Perceptron to ...
Title Suppressed Due to Excessive Length                                                   3

Table 1 A comparison of traditional methods for microorganism analysis [77].

        Methods             Advantages                     Disadvantages
    Chemical method         High accuracy      Secondary pollution of chemical reagent
    Physical method         High accuracy               Expensive equipment
   Molecular biological                        Secondary pollution of chemical reagent,
                            High accuracy
         method                                expensive equipment, long testing time
  Morphological method    Short testing time   Long training time for skillful operators

time-consuming work and quickly extract valuable information from image
data. Therefore, AI shows potential in Microorganism Image Analysis (MIA).
Besides, AI has a strong objective analysis ability in MIA and can avoid subjec-
tive differences caused by manual analysis. To some extent, the misjudgement
of biologists can be reduced, and the efficiency can be improved.
    As an essential part of artificial intelligence technology, Artificial Neural
Network (ANN) is originally designed according to the biological neuron [93].
Due to the limitation of computer performance, the difficulty of training, and
the popularization of Support Vector Machine (SVM), the early ANN devel-
opment once fall into a state of stagnation [160]. After that, with the improve-
ment of computer performance, Convolutional Neural Network (CNN) shows
an overwhelming advantage in image recognition [68], and ANN is paid at-
tention to again and developed rapidly. We find that ANNs are widely used
in MIA thorough investigation because they can learn useful patterns from
enormous data and features.

1.1 Artificial Intelligence Methods for Microorganism Image Analysis

AI is an umbrella term encompassing the techniques for a machine to mimic or
go beyond human intelligence, primarily in cognitive capabilities [116]. Fig. 2
provides the structure of AI technology [86, 169]. AI has several important sub-
domains, such as Machine learning (ML), computer vision, natural language
processing, etc. Among them, ML technology has been widely used in the
MIA task, such as Environmental Microorganism (EM) segmentation [82, 167],
Herpesvirus detection [33], and Tuberculosis Bacilli (TB) classification [121,
103].
    As shown in Fig. 2, ML can be grouped into conventional methods and
ANNs. In conventional methods, SVM, k-Nearest Neighbor (KNN), Random
Forest (RF) and other methods have been applied to the MIA task. For ex-
ample, work [79] proposes an automatic EM classification system based on
content-based image analysis techniques. There are four features (histogram
descriptor, geometric feature, Fourier descriptor and internal structure his-
togram) extracted from the Sobel edge detector based segmentation result for
training SVM to perform the classification task. For ten EM classes tested in
Applications of Artificial Neural Networks in Microorganism Image Analysis: A Comprehensive Review from Conventional Multilayer Perceptron to ...
4                                                                                                      Jinghua Zhang1,2 et al.

                                                               Artificial
                                                             Intelligence

                           Natural
    Computer                                    Speech        Machine                 Expert
                          Language                                                                          Robotics
     Vision                                   Recognition     Learning                System
                          Processing

                                                                                               Artificial
                        Conventional
                                                                                                Neural
                          Methods
                                                                                               Networks

                                                                         Classical
                                                                                                                    Deep Neural
                                                                          Neural
                                                                                                                     Networks
                                                                         Networks
      Linear Regression                  Decision Trees

    Support Vector Machine             k-Nearest Neighbor                                                   Convolutional Neural Network
             (SVM)                           (KNN)                       Perception
                                                                                                                       (CNN)

         Naive Bayes                   Logistic Regression   Radial Basis Function Neural Network             Recurrent Neural Network
                                                                             (RBF)                                      (RNN)

           k-Means                       Sparse Coding          Restricted Boltzmann Machine                Generative Adversarial Network
                                                                            (RBM)                                       (GAN)

        Random Forest             Conditional Random Field            Hopfield Network                           Deep Belief Network
            (RF)                           (CRF)                            (HN)                                       (DBN)

Fig. 2 The structure of ML in the AI.

this work, the mean of average precisions obtained by the system amounts to
94.75% [70]. An automatic identification approach of TB is proposed in [40].
The Canny edge technique is applied for edge detection, followed by a non-
maxima suppression and a hysteresis threshold operations. After that, a mor-
phological closing operation is applied. Then compactness and eccentricity fea-
tures are extracted in one branch, and the same segmented images are passed
through k-means clustering in the second branch. Each branch independently
goes to classification part using the nearest neighbour classifier. The average
results obtained from the two branches are 93.30% and 100% sensitivity for
the first and second branches [70]. In [34], a content-based MIA work based
on ZooImage automated system is introduced, where 1437 objects of 11 taxa
are classified using four shape features and a Random Forest (RF) classifier,
and a general classification accuracy of 83.92% is finally achieved [77].
    ANNs also play a vital role in the MIA task. Fig. 2 shows that they include
classical neural networks and deep neural networks. In the early years, due to
the computer performance limitation, classical neural networks represented by
Multilayer Perceptron (MLP), Radial Basis Function Neural Network (RBF),
Probabilistic Neural Network (PNN), etc. are applied to the MIA tasks. For
example, in [28], human experts’ performance in identifying Dinoflagellates
is compared to that achieved by two ANN classifiers (MLP and RBF) and
two other statistical techniques, KNN and Quadratic Discriminant Analysis
(QDA). The data set used for training and testing comprised a collection of
60 features that are extracted from the specimen images. The result shows
that the ANN classifiers outperform classical statistical techniques. Extended
Applications of Artificial Neural Networks in Microorganism Image Analysis: A Comprehensive Review from Conventional Multilayer Perceptron to ...
Title Suppressed Due to Excessive Length                                        5

trials show that the human experts achieved 85% accuracy while the RBF
achieves the best performance of 83%, the MLP 66%, KNN 60%, and the QDA
56%. The work [71] uses PNN to select the best identification parameters of
the features extracting from the microorganism images. PNN is then used to
classify the microorganisms with a 100% accuracy using nine identification
parameters.
    Later, with the significant improvement of computer performance, the de-
velopment of neural network theory and the proposal of Convolutional Neural
Network (CNN), deep neural networks show an overwhelming advantage in
image analysis, including microbial image analysis. For example, the work [92]
uses U-Net to perform the Rift Valley virus segmentation and achieves a Dice
score of 90% and Intersection Over Union (IOU) of 83.1%. In [144], the trans-
fer learning based on Xception is applied to perform the bacterial classification.
Seven varieties of bacteria for recognition which might be lethal for human are
chosen for the experiment. The performance is evaluated on 230 bacteria im-
ages of seven varieties from test dataset, which shows promising performance
with approximately 97.5% prediction accuracy in bacteria image classification.
    In conclusion, we can find that conventional machine learning methods
and classical neural network methods have similar workflows in the MIA task.
They typically rely heavily on feature engineering [3]. These workflows usually
contain image acquisition, image preprocessing, image segmentation, feature
extraction, classifier design, and evaluation. The reliability of accuracy de-
pends on the design and extraction of features [3]. In recent years, with the
development and popularization of CNN, one of the most important parts
of deep neural networks, the MIA task is no longer affected by feature engi-
neering. Compared with the classical ANN method, CNN can directly extract
useful features from the image through the convolutional kernel. This kind of
ability makes the research and application of CNN in MIA increase rapidly
and obtain overwhelming advantages.

1.2 Motivation of This Review

This paper focuses on the development and application of ANNs in the MIA
task. A comprehensive overview of techniques for the image analysis of mi-
croorganism using classical neural networks and deep neural networks is pre-
sented. The motivation is to clarify the development history of ANNs, under-
stand the popular technology and trend of ANN applications in the MIA field.
Besides, this paper also discusses potential techniques for the image analysis
of microorganism by ANNs. As far as we know, there are some review papers
that summarize researches related to the MIA task, for example, papers [70,
78, 80, 81]. In the following part, we go through these review papers.
    The review [70] comprehensively analyses the various studies focused on
microorganism image segmentation methods from 1989 to 2018. These meth-
ods include classical segmentation methods (e.g., edge-based, threshold-based,
and region-based segmentation methods) and machine learning-based methods
Applications of Artificial Neural Networks in Microorganism Image Analysis: A Comprehensive Review from Conventional Multilayer Perceptron to ...
6                                                             Jinghua Zhang1,2 et al.

(supervised and unsupervised machine learning methods). About 85 related
papers are summarized in this review. The ANN-based segmentation method
is one part of this review.
    In the review [78], to clarify the potential and application of different clus-
tering techniques in MIA, the related works from 1997 to 2017 are summarized
while pinning out the specific challenges on each work (area) with the corre-
sponding suitable clustering algorithm. More than 20 related research papers
are summarized in this review.
    In the review [80], the development history of microorganism classifica-
tion using content-based microscopic image analysis approaches is summa-
rized. This review introduces the classification methods from different appli-
cation domains, including agricultural, food, environmental, industrial, medi-
cal, water-borne, and scientific microorganisms. Around 240 related works are
summarized. The classification methods discussed in this review contains not
only ANNs but also many other methods like SVM, KNN, RF, and so on.
    In our previous work [81], we propose a brief review for content-based
microorganism image analysis using classical and deep neural networks. This
review briefly summarizes 55 related papers from 1992 to 2017.
    To sum up, we can find that the review [78] only summarizes the MIA
method based on clustering techniques. The review papers [70, 80] focus on
the segmentation and classification tasks, respectively. Although the methods
discussed in these reviews include ANN methods, ANN is not central to these
two reviews’ discussions. Besides, our previous review [81] only briefly dis-
cusses the ANN technique in the MIA task, and the time of the discussion is
up to 2017. Hence, to clarify the development history of ANNs, understand
the popular technology and trend of ANN applications in the MIA field, we
summarize the related papers and provide a detailed discussion about the re-
search motivation, contribution, dataset, workflow, and result for each work
in this review.

1.3 Paper Searching and Screening

To illustrate the collection process of related papers, Fig. 3 provides the details.
Based on our previous review, there are 55 related papers from 1992 to 2017.
We collect 51 papers from several databases including Google Scholar, IEEE,
Elsevier, Springer, ACM, etc. After carefully reading, 16 papers on other topics
are excluded, and six related paper is added. Finally, a total of 96 research
papers are retained for our review. The popularity and trend of ANN for the
analysis of microorganism image are provided in Fig. 4.

1.4 Structure of This Review

This review is structured as follows, we begin by introducing the development
of ANN and some representative networks used in MIA tasks in Sec. 2; Sec. 3
Applications of Artificial Neural Networks in Microorganism Image Analysis: A Comprehensive Review from Conventional Multilayer Perceptron to ...
Title Suppressed Due to Excessive Length                                                                    7

                       Previous work                     Irrelevant

                        55 papers                        16 papers

                                                                                         Final: 96 papers

                                        51 papers                        6 paper

                                       First searching                Second searching

Fig. 3 The collection process of related papers.

       100

        90

        80

        70

        60

        50

        40

        30

        20

        10

         0
          92
          93
          94
          95
          96
          97
          98
          99
          00
          01
          02
          03
          04
          05
          06
          07
          08
          09
          10
          11
          12
          13
          14
          15
          16
          17
          18
          19
          20
        19
        19
        19
        19
        19
        19
        19
        19
        20
        20
        20
        20
        20
        20
        20
        20
        20
        20
        20
        20
        20
        20
        20
        20
        20
        20
        20
        20
        20
                       All ANN      Classical Neural Network          Deep Neural Network

Fig. 4 The development trend of ANN methods for MIA tasks. The horizontal direction
shows the time. The vertical direction shows the cumulative number of related works in each
year.

introduces the MIA work using classical ANN methods; in Sec. 4, State-of-
the-art deep ANN methods applied in the MIA tasks are summarized; Sec. 5
presents the method analysis; Sec. 6 concludes this paper.

2 Artificial Neural Networks

In this section, to better understand the development of ANN in the MIA task,
we briefly introduce the evolution history of ANN. Besides, some representative
ANN structures in MIA tasks are introduced.

2.1 Evolution of Artificial Neural Networks

The development of ANN has a long history. As shown in Fig. 5, its course
can be divided into three stages [160]. The ANN research can be traced back
to M-P Neuron, designed according to the biological neuron, proposed in [93]
in 1943. This model is a physical model made up of elements such as resistors.
Applications of Artificial Neural Networks in Microorganism Image Analysis: A Comprehensive Review from Conventional Multilayer Perceptron to ...
8                                                                                                                            Jinghua Zhang1,2 et al.

Perceptron, whose learning rule is based on the original M-P Neuron, is pro-
posed in [119]. After training, perceptron can determine the connection weights
of neurons [160]. Since then, the first upsurge of ANN starts. However, Minsky
et al. point out that perceptron can not be applied to solve the XOR problem
(the linearly inseparable problem) in [95], which makes the ANN research fall
into the trough.

              M-P Neuron                      XOR Problem                                                  BP Algorithm                          SVM
           The earliest                     Perceptron can                                               It solves the XOR                The popularizatio-
           artificial neuron                not solve the XOR                                            problem                          n of SVM makes

                                                                                                  1986

                                                                                                                                   1995
    1943

                                     1943

                                            problem                                                                                       ANN fall into the
                                                                                                                                          trough again
                    1958

                                                                                  1982

                                                                                                                  1989
                            The first and basic                                           It makes the ANN
                            model of the                                                  research rise                  The first time to
                            ANNs                                                          again                          propose CNN
                                Perceptron                                                Hopfield Network                     CNN

                                 ResNet                          GAN                          GoogleNet                       AlexNet
                            It proposes short-             It consists of a               Propose Inception              It applis ReLU,
                            cut connection                 generator and a                structure. Win the             Dropout, and LRN
                     2015

                                                    2014

                                                                                   2014

                                                                                                                  2012

                            and wins the first             discriminator and              first place in                 to CNN and wins
                            place in ImageNet              is used for image              ILSVRC 2014                    the first place in
                            2015                           generation                                                    ILSVRC 2012

           It proposes the                                                The first time to
           end-to-end                                                     apply CNN to the
    2015

                                     2015

                                                                   2014

                                                                                                  2014

                                                                                                                                   2006

           structure and is                 It is simpler and             obejct detection               It wins the second               It proposes
           used to image                    faster than RCNN.             task. (two stage               place in ILSVRC                  layerwise pre-
           segmentation                     (one stage method)            method)                        2014                             training
                 U-Net                            YOLO                          RCNN                            VGG                              DBN

Fig. 5 The evolution of ANN.

    In the second stage, with the proposal of Hopfield network [57] in 1982,
the study of ANN attracts attention again. After that, with the proposal of
Back-Propagation (BP) algorithm [122], the XOR problem can be solved by
training Multilayer Perceptron (MLP) using BP algorithm. Besides, LeCun
et al. propose CNN by introducing the convolutional layer, inspired by the
biological primary visual cortex, into the neural network [74, 75]. However,
limited by the computer performance and neural network theory at that time,
although BP enables CNN to be trained, there are still problems such as too
long training time and easy over-fitting. With the popularization of SVM, the
ANN research falls into the trough again.
    Although the ANN research falls into the trough, the related researches by
Hinton [49, 50, 51, 97, 124] and Bengio [12, 13, 46, 73] et al. don’t stop. Benefit
from their research progresses, ANNs show overwhelming advantages in speech
Applications of Artificial Neural Networks in Microorganism Image Analysis: A Comprehensive Review from Conventional Multilayer Perceptron to ...
Title Suppressed Due to Excessive Length                                      9

and image recognition [68]. Since then, the third rise of ANN begins. Different
from the second stage, the computer performance is significantly improved.
Training a deep neural network does not need such a long time like before.
Besides, with the popularization of the internet, more and more data can be
used for the training, which can reduce the over-fitting.
    The ANN based MIA research also follows the development trend of ANN.
In the following part, some representative ANN structures in MIA tasks are
introduced.

2.2 Representative Artificial Neural Networks

ANN plays an essential role in the MIA task. By investigating related research
papers, we find that early MIA tasks based on classical neural networks rely on
the “Feature Engineering + Classifier” workflow. They usually extract effective
features from images by the existing experience or the select method and then
use them for training classifiers to perform corresponding tasks. Among these
classifiers, MLP is the most widely used one. With the widespread use of CNNs,
the MIA task no longer relies on feature engineering. Among these CNNs,
AlexNet [68], VGGNet [132], ResNet [48], and Inception [136] based methods
are widely used in the microorganism image classification task, U-Net [118]
and its related improved networks are also widely used for the microorganism
segmentation task, and YOLO [114] is also widely used in the microorganism
detection task. To understand the characteristics of these networks, we provide
a brief description of MLP, AlexNet, VGGNet, ResNet, Inception, U-Net, and
YOLO below.

2.2.1 Multilayer Perception

As is shown in Fig. 6, an MLP consists of at least three layers: an input layer,
a hidden layer, and an output layer. The early MLP is a class of feed-forward
ANN. Like the perceptron, the early MLP can determine the connection weight
between two layers by the error correction learning, which adjusts the connec-
tion weight according to the error between the expected output and the actual
output. However, error correction learning cannot work between multiple lay-
ers. For this reason, the early MLP uses the random number to determine the
connection weight between the input layer and the hidden layer, and the error
correction learning is used to adjust the connection weight between the hidden
layer and the output layer. With the proposal of the BP algorithm, MLP can
adjust the connection weight layer by layer.

2.2.2 AlexNet

AlexNet adopts an architecture with consecutive convolutional layers. It is the
first ANN to win the ILSVRC 2012 [111]. After its triumphant performance,
the following years’ winning architectures are all deep CNNs. As shown in
Applications of Artificial Neural Networks in Microorganism Image Analysis: A Comprehensive Review from Conventional Multilayer Perceptron to ...
10                                                                      Jinghua Zhang1,2 et al.

                  Input 1
                                                                  Property 1

                  Input 2

                                                                  Property 2

                  Input 3                                Output Layer
                            Input Layer

                                          Hidden Layer

Fig. 6 An example of Multilayer Perceptron.

Fig. 7(a), AlexNet consists of 8 layers, which includes five convolutional layers
and three fully connected layers. The significance of AlexNet is that they use
the Rectified Linear Unit (ReLU) as an activation function instead of the
sigmoid or hyperbolic tangent function [111]. AlexNet is trained by multi-
GPU, which can cut down on the training time. Besides, different from the
conventional pooling methods, overlapping pooling is introduced to AlexNet.
AlexNet has 60 million parameters [123]. To solve the overfitting problem,
data augmentation and dropout are employed.

2.2.3 VGGNet

Simonyan et al. propose a CNN model named VGGNet, which wins second
place in ILSVRC 2014 [132]. VGGNet is characterized by its simplicity, using
only the 3 × 3 convolutional filter, which is the smallest size to capture the
spatial information of left/right, up/down, center. The layer number of VG-
GNet could be 16 or 19. As the architecture of VGG-16 shown in Fig. 7(b),
the network consists of 13 convolutional layers, 5 max pooling layers, 3 fully
connected layers, and a softmax layer. VGG-19 has 3 more convolution layers
than VGG-16. VGG-16 and VGG-19 comprise 138 and 144 million parame-
ters [132]. The significance of VGGNet is the use of 3 × 3 convolutional filters,
whose stack could obtain the same receptive filed as the bigger convolutional
filter used in AlexNet. Besides, this stack, which has fewer parameters than
the bigger convolutional filter used in AlexNet, allows VGGNet to have more
weight layers, which results to improve performance [111].

2.2.4 Inception

Szegedy et al. propose GoogleNet, which wins first place in ILSVRC 2014 [136].
As shown in Fig. 7(c), it consists of 22 convolutional layers and 5 pooling
layers. Nevertheless, this network has only 7 million parameters [111]. The
Title Suppressed Due to Excessive Length                                                              11

                                     Input

                                 3 × 3 conv

                                 3 × 3 conv

                                                                                          Input
                                     Pool
                                                                                        7 × 7 conv
                                 3 × 3 conv
                                                                                          Pool
                                 3 × 3 conv

                                     Pool                                               3 × 3 conv

                                 3 × 3 conv                                             3 × 3 conv

                                 3 × 3 conv                 Input
                                                                                        3 × 3 conv

                                 3 × 3 conv             7 × 7 conv
                                                                                        3 × 3 conv

          Input                      Pool                   Pool
                                                                                        3 × 3 conv
       11×11 conv                3 × 3 conv             1 × 1 conv
                                                                                        3 × 3 conv
       5 × 5 conv                3 × 3 conv             3 × 3 conv

          Pool                   3 × 3 conv                 Pool
                                                                                        3 × 3 conv
       3 × 3 conv                    Pool            Inception block       ×2
                                                                                        3 × 3 conv
          Pool                   3 × 3 conv
                                                            Pool
                                                                                        3 × 3 conv
       3 × 3 conv                3 × 3 conv
                                                     Inception block       ×5           3 × 3 conv
       3 × 3 conv                3 × 3 conv
                                                            Pool
          Pool                       Pool                                               3 × 3 conv

                                                     Inception block       ×2           3 × 3 conv
           FC                         FC

           FC                         FC                    Pool                          Pool

           FC                         FC                     FC                            FC

        SoftMax                   SoftMax                SoftMax                         SoftMax

     (a) AlexNet             (b) VGG-16              (c) GoogleNet                       (d) ResNet

                             Input
                                                                                    x

                    1 × 1 conv    1 × 1 conv     Pool                        3 × 3 conv
     1 × 1 conv                                                F(x)                     ReLU
                    3 × 3 conv    5 × 5 conv   1 × 1 conv
                                                                             3 × 3 conv

                                                                  H(x)  F(x)  x
                         Concatenation

                    (e) Inception block                               (f) Residual learning block

Fig. 7 DNN architectures. (a) is AlexNet, (b) is VGG-16, (c) is GoogleNet, (d) is ResNet,
(e) is the Inception block used in GoogleNet, and (f) is the residual learning block used in
ResNet.
12                                                          Jinghua Zhang1,2 et al.

significance of GoogleNet is that it first introduces the Inception structure.
As the Inception structure is shown in Fig. 7(e), it first uses the idea of using
the 1 × 1 convolutional filter to reduce the channel number of the previous
feature map to reduce the total number of parameters of the network. Besides,
the structure uses convolutional filters of different sizes to obtain multi-level
features to improve the performance [167].
    In the Inception-V2, batch normalization is added, and a stack of two 3 × 3
convolutional filters are employed instead of a 5 × 5 convolutional filter to
increase the network depth and reduce the parameters [61]. In Inception-V3,
a stack of 1 × n and n × 1 convolutional filters is used to replace the n × n
convolutional filter [137]. In Inception-V4, the idea of residual learning block
is incorporated [135].

2.2.5 ResNet

From AlexNet (5 convolutional layers), VGGNet (16 or 19 convolutional lay-
ers), to GoogleNet (22 convolutional layers), the structure of the network is
getting deeper and deeper. This is because deeper networks allow more com-
plex features to be extracted, which in theory leads to better results for deeper
networks. However, adding depth to the network by simply adding layers can
lead to gradient vanishing/gradient explosion problems [14, 46]. To solve these
problems, He et al. propose ResNet, whose structure is shown in Fig. 7(d).
ResNet proposes a novel approach called identity mapping so that the net-
work depth can increase for improved performance. Fig. 7(f) shows the resid-
ual learning block based on identity mapping. In this block, H(x) denotes
an underlying mapping to be fit by the stacked layers, and the input feature
map of these layers is denoted as x. The residual function can be denoted as
F (x) = H(x) − x. The residual network’s primary purpose is to make a deeper
network from a shallow network by copying weight layers in the shallow net-
work and setting other layers in the deeper network to be identity mapping.

2.2.6 U-Net

U-Net is a CNN that is initially used to perform the microscopic image seg-
mentation task [166]. As the structure is shown in Fig. 8, U-Net is symmetrical,
consisting of a contracting path (left side) and an expansive path (right side),
which gives it the u-shaped architecture. The training strategy of U-Net re-
lies on the strong use of data augmentation to make more effective use of the
available annotated samples [118]. Besides, the end-to-end structure of U-Net
can retrieve the shallow information of the network [118].

2.2.7 YOLO

Redmon et al. propose a novel framework called YOLO, which uses the whole
topmost feature map to predict both confidences for multiple categories and
bounding boxes [168]. YOLO’s main idea is that it divides the input image into
Title Suppressed Due to Excessive Length                                                                                                                                                                                                                                     13

                                                                                                                                                                                                                                                                         2562
 2562

        1                64       64                                                                                                                                                                                     128             64            64           1
                          1282

                                                                                                                                                                                                                                       1282
                                 64 128 128                                                                                                                                                   256 128 128

                                                                                                                                                                                                           642
                                           642

                                                 128 256 256                                                                                                            512     256           256
                                                                                                                                                                                                                         3×3 Convolution (ReLU)
                                                              322

                                                                                                                                                                                322
                                                                                                                                                                                                                         2×2 Max Pooling
                                                                    256 512                        512                          1024                      512            512                                             2×2 Up-convolution
                                                                                                                                                                                                                         Copy & Concatenate
                                                                                        162

                                                                                                  512             1024               1024                                                                                1×1 Convolution (Sigmoid)

Fig. 8 U-Net architecture.

an S×S grid, and the grid cell is responsible for detecting the object centered in
that cell [114]. Each grid cell predicts B bounding boxes and confidence scores
for those boxes [114]. These confidence scores reflect how confident the model
is that the box contains an object and how accurate it thinks the box is that
it predicts [114]. Each grid cell also predicts C conditional class probabilities.
It should be noticed that only the contribution from the grid cell containing
an object is calculated [168]. The structure of YOLO is shown in Fig. 9. It
consists of 24 convolutional layers and 2 fully connected layers. Instead of the
Inception used by GoogleNet, they use 1 × 1 reduction layers followed by 3 × 3
convolutional layers.
                                                                                                                           ×4

                                                                                                                                                                                         ×2
            7 × 7 conv

                                  3 × 3 conv

                                                      1 × 1 conv

                                                                   3 × 3 conv

                                                                                1 × 1 conv

                                                                                              3 × 3 conv

                                                                                                                   1 × 1 conv

                                                                                                                                3 × 3 conv

                                                                                                                                             1 × 1 conv

                                                                                                                                                           3 × 3 conv

                                                                                                                                                                                 1 × 1 conv

                                                                                                                                                                                              3 × 3 conv

                                                                                                                                                                                                            3 × 3 conv

                                                                                                                                                                                                                          3 × 3 conv

                                                                                                                                                                                                                                         3 × 3 conv

                                                                                                                                                                                                                                                      3 × 3 conv
   Input

                          Pool

                                               Pool

                                                                                                           Pool

                                                                                                                                                                         Pool

                                                                                                                                                                                                                                                                   FC

                                                                                                                                                                                                                                                                        FC

Fig. 9 YOLO architecture.

3 Classical Neural Network for Microorganism Image Analysis

An overview of the MIA task using classical neural networks is discussed in
this section. We divide the tasks into two categories: classification and other
tasks. In the classification part, we mainly introduce the works related to clas-
sification and identification tasks. The other tasks, including segmentation,
14                                                          Jinghua Zhang1,2 et al.

feature generation, and enumeration, are discussed in the following part. Be-
sides, we provide a summary to summarize the characters of the MIA based
on classical neural networks and a table for readers to find relevant research
works conveniently.

3.1 Classification Tasks

In [10], to discriminate and identify algal species with the Optical Plank-
ton Analyser (OPA), a multilayer feed-forward network is applied. In the
experiment, a total of eight species of algae are cultivated in monoculture.
From each monoculture, six-parameter logarithmic data generated by OPA
is used to train the network. The results show that the network can distin-
guish Cyanobacteria from other algae with 99% accuracy. The identification
of species is performed with less accuracy but is generally more than 90%.
    In [26], to achieve automatic taxonomic classification in ecological research,
back-propagation of error network is used to perform the Cymatocylis classifi-
cation task of this paper, which contains five species: Cymatocylis calyciformis,
C. drygalskii, C. vanhöffeni, C. convallaria, and C. parva. In the experiment,
the images are digitized, binarised, and edited by hand to remove large de-
bris first. Then, the network is trained by the image data after the Fourier
transformation. The results show that 28% of the 299 trials achieve more than
70% correct categorization rates of the data used in the training sets. The best
network shows error rates of 11% and 18% of training dataset and previously
unseen data.
    In [28], to decrease the cost of monitoring of noxious and toxic algae and
other parameters in coastal waters and improve the efficiency, a comparison
between human experts, two ANN methods (MLP and RBF), and two ML
methods (KNN and QDA) is made in this paper. The dataset used in this paper
contains 23 species of toxic and noxious Dinoflagellates. In the experiment,
images are pre-processed to segment the specimen from the background, debris
and clutter. Then, these images are analyzed by six functions, which includes
the 2D Fast Fourier Transform of the object, the Discrete Fourier Transform
of the object’s profile, its second-order statistics, a Sobel edge descriptor, a
junction descriptor, and a texture metric, to generate a multiplicity of low-
resolution parameters. These parameters are fed into the automatic classifiers
for training. The experiment results show that the human expert achieves the
averaged identification rate of 85% while RBF achieves the best performance
of 83%, MLP 66%, KNN 60%, and QDA 56%.
    Tuberculosis is largely treatable with early diagnosis and subsequent mon-
itoring. To detect TB in sputum smears, an automatic method is proposed
in [142]. The image data used in the experiment is created based on the spu-
tum smears taken at the South African Institute for Medical Research, Cape
Town. There are 1147 objects (267 TB and 880 other objects) are obtained.
From these, l000 examples are used for training and 147 for testing. In the ex-
periment, the pre-processing steps, including edge detection, region labelling
Title Suppressed Due to Excessive Length                                      15

and removal, edge pixel linking, and boundary tracing, are applied one by one
first for the shape features extraction in which the discrete Fourier transform
is used. Then, in the classifier training phase, the first 15 coefficients of the
discrete Fourier transform are sufficient for training. During this process, the
classifiers from discriminant methods of statistics or neural networks are com-
pared. The results show that the MLP based on the BP algorithm performs
best. It achieves an accuracy of 97.9%
    In [42], the MLP based on the BP algorithm is proposed to perform the
classification task for investigating the reactors’ influence on the fungal mor-
phology. These reactors are shaking flask, stirred tank, and airlift tower loop
reactor. The authors themselves make the image data used in the experiment.
Five morphological features, including object area, eccentricity, circularity,
length of the skeleton, and branching frequency, are extracted from the image
data. These five features and the cultivation time are used as the inputs of the
network. The clustering method is used to find the best combination of fea-
tures. The experimental results show that the best classification results (better
than 90%) were obtained with cultivations in shaking flasks with baffles and
a stirred tank reactor.
    The microscopic examination used to determine concentrations and biovol-
umes of microorganisms is time-consuming and tedious. Automation can ef-
fectively improve this situation. In [15], a classification method based on ANN
is proposed, and it can effectively classify individual bacteria and non-bacteria
objects. The private dataset used in the experiment is based on samples col-
lected at several stations in the Baltic Sea. A total of 690 images were taken
for the experiment, each containing about 200 bacteria. The process of the
experiment is to use edge detection to perform image segmentation and then
extract shape features for network training. The network used in the exper-
iment is a MLP based on the BP algorithm, which contains 16 input nodes,
five hidden nodes and three output nodes. Experimental results show that this
method can effectively perform the classification task.
    Pathogen identification plays an important role in disease control and pre-
vention. Simple and quick methods are essential in the pathogen identification
task. In [64], a classification system based on statistical methods successfully
discriminate closely related species within the same host. Five closely related
species of Gyrodactulus are used to test the system performance. There are
two kinds of image data used in the experiment: one is prepared under the
scanning electron microscope, and another is prepared under the light micro-
scope. In the experiment, the features are extracted from these images to train
the classifiers, which contains linear discriminant analysis, nearest neighbours,
feed-forward neural network and projection pursuit regression. The experi-
mental results show that nearest neighbours and linear discriminant analysis
give the best results on the data of light microscope and nearest neighbours
and feed-forward neural network achieve absolute detection of Gyrodactulus
salaris from just a single sclerite, the marginal hook, from scanning electron
microscope images.
16                                                         Jinghua Zhang1,2 et al.

    It is a time-consuming task to classify marine dinoflagellates by artificial
analysis. However, the analysis of seawater samples is very important for ma-
rine ecology and fishery health. Therefore, using automatic methods to assist
ecologists to analyse seawater samples has great significance. A system called
DiCANN is proposed for this purpose in [27]. This system consists of ANNs, an
Internet distributed database, and image analysis techniques. The image data
used in the experiment is derived from a Distributed Image Database (DIB)
on the Internet. As the system’s workflow shown in Fig. 10, the input image
data is pre-processed first, and then the features are extracted for training or
testing the classifiers. The classifiers contain back-propagation of error feed-
forward Perceptron (BPN), KNN, RBF, and QDA. The experimental results
are shown in Fig. 11. As the number of species increases, human performance
declines while the performance of BPN rises most obviously.

Fig. 10 The workflow of DiCANN in [27].

Fig. 11 The experimental results of DiCANNin [27].
Title Suppressed Due to Excessive Length                                   17

     In [5], four kinds of ANNs are applied to the classification task of Es-
cherichia coli, including MLP, RBF, General Regression Neural Network (GRNN),
and PNN. A total of 336 Escherichia coli instances are used in the experiment.
There are eight classes in the dataset. The experimental results show that PNN
is the most suitable model for Escherichia coli classification. Using this neural
network, the benchmark result using the ad hoc structured probability model
is improved to 82.97% correct classification rate.
     Counting phytoplankton using a microscope is a time-consuming process.
Therefore, it is of great significance to develop an automatic counting system.
In [39], a phytoplankton identification and counting system based on com-
puter image analysis and pattern recognition is proposed. The system consists
of ANNs and simple rule-based procedures. The image data used in the ex-
periment is taken directly from water samples from Lough Neagh in northern
Northern Ireland. In the experimental process, the input image is pre-processed
and analysed first, then the 74 parameters describing size, shape, colour and
grey level distribution are extracted for the training of the network, and fi-
nally, the test data are used for testing. The experimental result shows that
the automatic system is close to the results obtained by manual.
     Their own experience often influences human experts in the identification
of Cryptosporidium parvum and Giardia lamblia that can infect humans and
cause deadly gastroenteritis. ANNs are widely used in biomedical image iden-
tification in recent years. Therefore, the methods based on ANNs are proposed
to perform the identification task of these two protozoa in [152]. In the exper-
iments, the image data is first clipped to fit the input of the networks (40 by
40 and 95 by 95 pixels for Cryptosporidium parvum and Giardia lamblia, re-
spectively). The ANNs used for Cryptosporidium parvum and Giardia lamblia
have 1600 and 9025 input neurons. Both of them have five hidden and two
output neurons. In the experiments, there are 1586 and 2431 images for train-
ing Cryptosporidium oocyst ANN and Giardia cyst ANN, respectively. After
the training process, 100 images are used to select ANNs, which show the
best performance. Finally, Cryptosporidium oocyst ANN achieves 91.8% cor-
rect identification rate on 500 unseen images and Giardia cyst ANN achieves
99.6% correct identification rate on 282 unseen images.
     The analysis of sedimentary organic matter can be used in geochronologi-
cal, biostratigraphical, paleoecological and paleoenvironmental analysis. How-
ever, the traditional microscope analysis is time-consuming and tedious. Au-
tomatic technology can improve effectiveness. Thus, in [149], a semi-automatic
system is presented to perform the classification task of sedimentary organic
matter. The image data used in the experiments are from Paleogene and
Holocene samples. The data can be divided into two orders. In the first order,
there are 3266 samples belong to three second order classes: 501 for second or-
der amorphous, 1475 for second order palynodebris, and 1290 for second order
palynomorphs. In the experiments, after image capture, the pre-processing is
applied to segment the foreground, and then 194 original features, including
morphological and textural features, are extracted. After that, the Exhaus-
tive CHi-square Automatic Interaction Detector classification tree algorithm
18                                                            Jinghua Zhang1,2 et al.

is used to determine the effective features for each ANN training. The ANNs
used here are back-propagation MLP. The Gamma test is used to prevent the
overtraining problem. The workflow of the system is shown in Fig. 12, the
classification contains two stages. In the first stage, the test image is classified
into the second order category, and then it is classified into a subcategory in
the second stage. The experimental results show that the system achieves an
average correct classification rate of 87%.

Fig. 12 The workflow of the system in [149].

    The related image recognition methods are behindhand with the wide ap-
plication of optical imaging samplers in plankton ecology research. Most meth-
ods may need manual post-processing to correct their results. To optimize this
situation, a dual-classification method is proposed in [58]. As the workflow of
the dual-classification system shown in Fig. 13, four kinds of shape features, in-
cluding moment invariants, morphological measurements, Fourier descriptors,
and granulometry curves, are used to train a Learning Vector Quantization
Neural Network (LVQ-NN) in the first classification, and texture-based fea-
tures (co-occurrence matrix) are used to train the SVM to achieve the second
classification. Only when the results of the two classifiers are the same, can
the plankton be classified into one class, which effectively reduces the false
positive rate. The image data used in the experiments are collected from the
Great South Channel off Cape Cod, Massachusetts. It contains seven cate-
gories and a total of about 20000 images. The experimental results show that
compared with previous methods, the dual-classification system after correc-
tion can effectively achieve abundance estimation and reduce the error by 50%
to 100%.
    It is a time-consuming and laborious task to analyse protozoa and meta-
zoa by observing through the microscope. It is of great significance to apply
digital image analysis technology in this analysis process. In [45], the authors
develop an image analysis program to perform the semi-automatic recogni-
tion of several protozoa and metazoa commonly used in sewage treatment.
Title Suppressed Due to Excessive Length                                      19

Fig. 13 Schematic diagram of dual-classification system in [58].

The program uses around 40 morphological parameters to train Discriminant
Analysis and ANN to to identify and classify protozoan or metazoan images.
The experimental results show that this program can effectively distinguish
amoebas, sessile ciliates, crawling ciliates, large flagellates and free swimming
ciliates.
     Protozoa and metazoa are good indicators in sludge treatment. Manual
analysis methods are time-consuming and have professional requirements for
operators. Therefore, the digital image analysis method has the potential capa-
bility in this task. In [44], a semi-automatic image analysis program is proposed
for protozoa and metazoa classification. The image data used in the experi-
ments are taken from 22 classes of protozoa and metazoa collected from aera-
tion tanks of WWTPs of Nancy (France) and Portugal treating domestic and
industrial effluents. In the program’s workflow, the image data is pre-processed
(filtering, segmentation, noise reduction, etc.) to generate the region of inter-
est. Then the morphological features are extracted for training the classifiers,
which includes discriminant analysis, neural network, and decision tree. The
neural network used here is a two-layer feed-forward neural network without a
hidden layer based on the BP algorithm. The experimental results show that
discriminant analysis and neural network are more suitable for this task, while
the decision tree is not.
     It is tedious and time-consuming work to identify, count and measure in-
dividual cells manually. Image analysis methods are effective tools. In [157],
20                                                          Jinghua Zhang1,2 et al.

the authors present a novel method for bacterial image classification. In this
method, the image pre-processing is applied first, in which Iterative threshold
segmentation and mathematical morphology are proposed to realize bacte-
ria image edge detection, and then 20 original features are extracted. After
the feature extraction, eight effective features are selected by the Bayesian
model, and Principal Component Analysis (PCA) is used. These processed
features are used to train ANN, a three-layer feed-forward neural network, to
perform the classification. The data used in the experiments contains eight
bacteria classes, which are from National Microbe Culture Resource (NMCR)
database. There are 60 images used as a training set and 20 images for testing.
The accuracy of recognition rate in the experiments is 82.3%.
    As an extension of work [157], different feature extraction and selection is
used in [154]. Six effective features are selected from 15 original features (four
morphological features, seven invariant moments, and four texture features) by
their variance contribution. Under the same experimental condition as [157],
the accuracy of recognition rate in [154] is 86.3%.
    As an extension of work [149], a new version of the classification system is
proposed in [150]. Due to the ANNs used in the previous system show poor
performance on the first order and second order amorphous classification tasks,
RBF is used to replace back-propagation MLP in this work. The novel system
is shown in Fig. 14. The experimental results show that in the best case, the
correct recognition rate is improved by 4% to just over 91%.

Fig. 14 The workflow of the system in [150].

    The determination of bacterial abundance, biovolume and morphology in
wastewater by manual analysis through a microscope is a time-consuming task.
Therefore, in [155, 29, 156], a bacteria image recognition system is proposed for
related research and application to solve the problem. In this system, adaptive
and enhanced edge detection is proposed for image pre-processing first. Then,
the original features, including contour Invariant moment and morphological
features, are extracted, and six effective features are selected from the original
Title Suppressed Due to Excessive Length                                       21

features, and PCA is applied to reduce the features’ dimension. Finally, the
processed features are used to train an ANN to perform the classification
task. The ANN used here is a three-layer feed-forward neural network based
on an adaptive accelerated BP algorithm. The data used in the experiments
are from the CECC database. In the experiments, the adaptive accelerated BP
algorithm can effectively accelerate the training process by five times. Thirty
images from the CECC database are used to test the system’s performance,
and the result shows that the recognition rate is 85.5%.
    Because the manual method to actived sludge screening is time-consuming
and laborious, it can be widely applied. In [4], a semi-automatic method is pro-
posed to perform the recognition task of stalked protozoa species in sewage.
In this method, the acquired images are segmented first, and then the fea-
tures, which contains geometrical, morphological and signature descriptors,
are extracted. Finally, the discriminant analysis and neural networks trained
by these features were used to identify stalked protozoa. The data used in the
experiments are collected from aeration tanks of WWTP in Nancy (France)
and Braga (Portugal). The main goal of this study is to find out the most ef-
fective feature class. The experimental results show that geometrical features
are the most important.
    The development of a simple and reliable computer-aided microscopy method
to analyse complex microbial communities is a major challenge in microbial
ecology. In [52], a method for the classification of different cell growth phases
is introduced. In this method, the adaptive global thresholding is applied to
segment the cell images first, and then five geometric features, including cir-
cularity, compactness, eccentricity, tortuosity and length-width ratio, are ex-
tracted for training the classifiers. The classifiers contain 3σ, KNN, neural
network, and fuzzy classifiers. The neural network used in this study is RBF.
The dataset used here has three phases: normal or grownup or about-to-divide.
Each phase of the bacilli bacterial cell has 100 colour images. The experimental
results show that fuzzy classifier performs best, and the classification accuracy
of the three phases are 100%, 98% and 98%, respectively, while the neural net-
work is the second, and the classification accuracies are 100%, 96% and 95%,
respectively.
    In [53, 55], an image analysis method used for cocci bacterial cell classifica-
tion is proposed. In this method, the image segmentation is performed by the
actived contour, and then five geometric features, which contain circularity,
compactness, eccentricity, tortuosity and length-width ratio, are extracted for
training and testing of the classifiers. There are three classifiers, 3σ, KNN, and
neural network classifiers, used in this study. The neural network is RBF. A
total of 350 image data is used in the experiment, with six categories: cocci,
diplococci, streptococci, tetrad, sarcinae, and staphylococci. The experimental
results show that the neural network achieves the best performance. According
to [53], the classification accuracy details of each class achieve by the neural
network are 100%, 99%, 98%, 99%, 98%, and 99%, respectively.
    The traditional microorganism detection and identification method is labours
and time-consuming. In [71], a rapid and cost-effective method based on PNN,
22                                                               Jinghua Zhang1,2 et al.

whose structure is shown in Fig. 15, is applied to classify the microorganism
image classification task. The features, including various geometrical, opti-
cal, and textural parameters, are extracted from the pre-processed images to
train the network in this method. In the experiments, the dataset includes five
categories that are Bacillus thuringiensis, Escherichia coli K12, Lactobacil-
lus brevis, Listeria innocua, and Staphylococcus epidermis. The experimental
results show that the PNN based on nine kinds of features, which are 45◦
run length non-uniformity, width, shape factor, horizontal run length non-
uniformity, mean grey level intensity, ten percentile values of the grey level
histogram, 99 percentile values of the grey level histogram, sum entropy, and
entropy, can classify the microorganisms with 100% accuracy.

Fig. 15 The structure of probabilistic neural network in [71].

    In environmental bacteria image acquisition, it is easy to generate low-
quality images, bringing troubles to the subsequent analysis. Therefore, a
computer-aid method based on ANN is introduced to perform the classification
task of different quality categories in [165]. The experimental data is private,
and it contains 25000 images with three categories: high quality, medium qual-
ity, and low quality. As the workflow is shown in Fig. 16, the input image is
divided into nine sub-images, and three features, including mean grey value
(MGV), background inhomogeneity (BGI), and cell density measure (CDM),
are extracted from each sub-image. Then the ANN is trained by the normal-
ization and sorting of the measured features. The network used in this study
consists of 27 input, 90 hidden, and three output neurons. The experimental
results show that the optimal ANN achieves a correct identification rate of
94%.
Title Suppressed Due to Excessive Length                                    23

Fig. 16 The workflow of the system in [165].

    The conventional manual method for detecting Mycobacterium tubercu-
losis is an ineffective but necessary part of diagnosing tuberculosis disease.
In [101], a method based on the image analysis technique and ANN is intro-
duced to detect the Mycobacterium tuberculosis in the tissue section. Fifteen
tissue slides are collected in the Department of Pathology, USM Hospital, Ke-
lantan, and each slide is captured to generate 30 to 50 images. A total of 607
objects, which contains 302 definite TB and 305 possible TB, are obtained.
In the experiments, the moving k-means clustering is applied to segment the
24                                                            Jinghua Zhang1,2 et al.

images, and geometrical features of Zernike moments are extracted. Then, the
Hybrid Multilayered Perceptron (HMLP) is used to test the performance of
different feature combinations in the detection task. Fig 17 provides an exam-
ple of HMLP with one hidden layer. The experimental results show that the
HMLP with the best feature combination can achieve an accuracy of 98.07%,
a sensitivity of 100%, and a specificity of 96.19%.

Fig. 17 An example of HMLP with one hidden layer in [101].

    Similar to [52, 53], the same workflow is applied to classify three types of
spiral bacterial cells: are vibrio, spirillum, and spirochete in [54]. Same as [52],
3σ, KNN, neural network, and fuzzy classifiers are applied in this study. The
experimental results show 3σ, KNN (K = 5), neural network (RBF), and
fuzzy classifiers achieve 100% classification accuracy on the test data of three
categories. It proves that neural network can provide good classification ability.
Besides, in [56], the neuro fuzzy classifier is used to replace the fuzzy classifier
in [54], the results show that it shows the same performance as the fuzzy
classifier.
    TB detection in tissue is more complex and challenging than detection
in sputum. In [100], a method based on the image processing technique and
ANN is introduced to detect and classify TB in tissue. The slides used in this
work are collected from the Pathology Department, Hospital Universiti Sains
Malaysia, Kelantan. The 1603 objects consist of three categories, which are TB,
overlapped TB, and non-TB, are collected from 150 tissue slide images. As the
workflow is shown in Fig. 18, the captured image is segmented first, and then
Title Suppressed Due to Excessive Length                                     25

the features are extracted to train the network to perform the classification
task. The network is a single-layer feed-forward neural network trained by
Extreme Learning Machine. The experimental results show that the network
obtains a classification accuracy of 77.25%.

Fig. 18 The workflow of the detection method in [100].

    As the extension of [100], a new method based on image processing tech-
nique and ANN is proposed in [99]. The data and workflow used in this study
are the same as [100]. The network used here is the HMLP network trained by
integrating both Modified Recursive Prediction Error algorithm and Extreme
Learning Machine. The results show that this network can achieve the highest
testing accuracy of 77.33% and average testing accuracy of 74.62% for 35 hid-
den nodes. Besides, in [103], similar work is presented. The data used in this
study contains 1600 objects consisting of TB, overlapped TB, and non-TB,
which are collected from 500 tissue slide images. A single hidden layer feed-
forward network trained by Online Sequential Extreme Learning Machine is
applied in this study. The experimental results show that the network trained
by geometrical features can achieve classification scores above 90.00% on three
categories.
    It is essential but ineffective to analyze the sputum specimen by manual
methods. The digital image analysis method can effectively optimize this situ-
ation. In [121], a classification method based on ANN is introduced. A total of
100 images used in this study are the binary images directly taken from [41].
There are 75 and 25 images for training and testing, respectively. In the work-
flow of this method, geometry features, containing circularity, compactness,
eccentricity, and tortuosity, are extracted from the binary images for training
and testing the classifier. The network used here is BP based MLP. The ex-
perimental results show that the ANN can accurately classify the tuberculosis
bacteria or not with a mean square error of 0.000368.
    Biomonitor is important for the researches related to the ecosystem. How-
ever, the manual identification method is ineffective. Therefore, [66] focus on
the automatic classification and retrieval of macroinvertebrate images. In this
You can also read