Predicting the Cases of Hepatitis B with the A-LSTM Model

Journal of Physics: Conference Series

PAPER • OPEN ACCESS

To cite this article: Yang Li et al 2021 J. Phys.: Conf. Ser. 1995 012007

CMSA 2021                                                                                                       IOP Publishing
Journal of Physics: Conference Series                         1995 (2021) 012007          doi:10.1088/1742-6596/1995/1/012007

Predicting the Cases of Hepatitis B with the A-LSTM Model

                     Yang Li 1,2,*, Yali Yang 1, Cong Yang 1 and Baolin Zhang 3
                     1 Fundamentals Department of Air Force Engineering University, Xi'an, Shaanxi, China
                     2 Command and Control Centre of Wenchang Spacecraft Launch Site, Wenchang, Hainan, China
                     3 Radar NCO School of Air Force Early Warning Academy, Wuhan, Hubei, China

                     *Corresponding author email: xleon2008@outlook.com

                     Abstract. Hepatitis B is a disease caused by the hepatitis B virus. Because of the virus's
                     strong infectivity and carcinogenicity, predicting the number of hepatitis B cases is of great
                     value. To accurately predict the monthly number of new hepatitis B patients in China, a neural
                     network with an attention-based LSTM (A-LSTM) model is proposed. Driven by the historical
                     data provided by the Data-center of China Public Health Science, the model's evaluation
                     indexes RMSE, MAPE, MAE and R-squared are 1780.495, 1.789%, 1469.208 and 0.867
                     respectively, while those of BPNN are 3532.959, 3.311%, 2677.009 and 0.478. The results
                     show that the A-LSTM model in this work predicts the monthly number of new hepatitis B
                     patients very well and performs much better than BPNN and other traditional time series models.
                     Keywords: A-LSTM; Hepatitis B; Attention mechanism; Predictive modelling.

1. Introduction
Hepatitis B is caused by the hepatitis B virus, which can cause acute or chronic hepatitis and lead to
cirrhosis or liver cancer. In China, hepatitis B belongs to the class B notifiable infectious diseases,
second in severity only to plague and cholera. Its main routes of transmission are blood, mother-to-child
transmission, sexual contact and other body-fluid contact. According to data released in 2018, there are
almost 70 million chronic hepatitis B patients in China. As hepatitis B cannot currently be cured, it is
very important to accurately predict its trend and to take active prevention and control measures.
Both traditional models, represented by ARIMA (Autoregressive Integrated Moving Average), and deep
learning models, such as BPNN (Back Propagation Neural Network), RNN (Recurrent Neural Network)
and CNN (Convolutional Neural Network), perform well in time series prediction. The ARIMA model
proposed by George Box is used to predict environmental indexes [1], infectious diseases [2], etc. The
BPNN model proposed by Rumelhart and McClelland in 1986 is widely used to predict the stock
market [3], class B notifiable diseases [4] and PM10 concentration [5]. RNN and its derived LSTM
(Long Short-Term Memory) model perform well on sequence data, such as machine translation [6],
speech recognition [7] and house price prediction [8]. Liu et al. used ARIMA and BPNN to predict the
incidence of pulmonary tuberculosis in China, and found that BPNN performed significantly better than
the traditional ARIMA model [9].
In this work, we propose a deep learning model, A-LSTM. Driven by the historical data of the monthly
new cases of hepatitis B in China, the model performs well at forecasting the short-term number of new
cases. In the remainder of the paper, we first introduce the data and the models, then present the results,
and finally give the discussion and conclusion.

              Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution
              of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
Published under licence by IOP Publishing Ltd

2. Data and the Model

2.1. Data
Several organizations publish hepatitis B data; we use the data published by the authoritative Data-center
of China Public Health Science. According to these data, since January 2004 the monthly number of new
hepatitis B cases in China has been above 50,000, at times approaching 120,000. Figure 1 shows the
monthly new cases of hepatitis B in China from January 2004 to December 2017.

         Figure 1. The monthly new cases of hepatitis B from January, 2004 to December, 2017.
We use the data of the i-th calendar month in each of the last 5 years to predict the new cases of that
month in the next year. Let X_i denote the new hepatitis B cases of the i-th month. Before constructing
the dataset used to train the model, X_i is normalized to [0, 1] as
X'_i = (X_i - min{X_i}) / (max{X_i} - min{X_i}). We then obtain the dataset
{(X'_i, X'_{i+12}, ..., X'_{i+48}; X'_{i+60})}, which is divided into training, validation and test sets
with the ratio 0.7225 : 0.1275 : 0.15.
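As an illustrative sketch (not the authors' code; the function and variable names are ours), the normalization and the construction of the {(X'_i, X'_{i+12}, ..., X'_{i+48}; X'_{i+60})} samples can be written in NumPy as:

```python
import numpy as np

def build_dataset(cases):
    """Min-max normalize the monthly series, then build samples that use the
    same calendar month of the last 5 years to predict it in the next year."""
    x = np.asarray(cases, dtype=float)
    x_norm = (x - x.min()) / (x.max() - x.min())  # scale to [0, 1]
    inputs, targets = [], []
    # inputs: X'_i, X'_{i+12}, ..., X'_{i+48}; target: X'_{i+60}
    for i in range(len(x_norm) - 60):
        inputs.append(x_norm[i:i + 49:12])  # 5 values, 12 months apart
        targets.append(x_norm[i + 60])
    return np.array(inputs), np.array(targets)

# 168 months (January 2004 - December 2017) of synthetic stand-in data
series = np.random.default_rng(0).uniform(50_000, 120_000, 168)
X, y = build_dataset(series)
print(X.shape, y.shape)  # (108, 5) (108,)
```

The resulting samples would then be split 0.7225 : 0.1275 : 0.15 into training, validation and test sets as described above.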

2.2. LSTM Model and Attention Mechanism
Because an RNN can extract the state information of all inputs before the current time, it is widely used
for sequence data processing. However, during the training of an RNN the gradients may vanish or
explode. To solve these two problems, Hochreiter and Schmidhuber proposed the long short-term
memory (LSTM) model in 1997. The structure of
LSTM is shown in Figure 2.


                 Figure 2. The unit structure of LSTM (a) and the structure of LSTM (b) [10].
In each unit of the LSTM network, X_t denotes the input at time t, h_t ∈ R^m is the hidden state at
time t, m is the dimension of the hidden state, σ is the sigmoid function, and ⊙ is the Hadamard product.
The hidden states of the LSTM layer at the different times are concatenated into
H_1 = [h_1, h_2, ..., h_T] ∈ R^{m×T}, and H_1 is used to construct the attention values:
h(j) = H_1 W_j + b_j, j ∈ {1, 2, ..., T}, where W_j ∈ R^T and b_j ∈ R^m are parameters the network
learns during training. Finally, h(1), h(2), ..., h(T) are concatenated into the attention value
H_2 = [h(1), h(2), ..., h(T)] ∈ R^{T×m}.
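A minimal NumPy sketch of this attention computation (our own illustration, since the paper gives no code; stacking all W_j as the columns of one matrix W, and all b_j as the rows of one matrix b, is our convention):

```python
import numpy as np

def attention_values(H1, W, b):
    """H1: (m, T) concatenated hidden states; W: (T, T), whose j-th column
    is W_j; b: (T, m), whose j-th row is b_j. Returns H2 of shape (T, m)."""
    m, T = H1.shape
    # h(j) = H1 @ W_j + b_j, each of shape (m,), stacked into rows of H2
    return np.stack([H1 @ W[:, j] + b[j] for j in range(T)])

rng = np.random.default_rng(1)
m, T = 8, 5
H1 = rng.standard_normal((m, T))
W = rng.standard_normal((T, T))
b = rng.standard_normal((T, m))
H2 = attention_values(H1, W, b)
print(H2.shape)  # (5, 8)
```

Note that the loop is equivalent to the vectorized form `(H1 @ W).T + b`, which is how one would implement it in practice.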

2.3. A-LSTM Model
The LSTM model based on the attention mechanism proposed in this work, called A-LSTM, is shown in
Figure 3(a). The time series data pass through two LSTM modules in parallel. The output of the
LSTM_1 module goes into the attention module, which gives the attention value H_2 ∈ R^{T×m}. At
the same time, the hidden states of each layer in the LSTM_2 module are concatenated into
H_3 ∈ R^{T×m}, in the same way as H_1 ∈ R^{m×T}. The input of the Decoder module is then
constructed as H_4 = H_2 ⊙ H_3. Finally, the output of the A-LSTM model is given by the Decoder
module as Ŷ = Σ(W_out ⊙ H_4) + b_out, where W_out ∈ R^{T×m} and b_out ∈ R are learned during
training, and the operator Σ(A) denotes the sum of all elements of the matrix A.
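Under our reading of the equation (assuming the product between W_out and H_4 is elementwise, consistent with the Σ operator summing all matrix elements; the paper gives no code), the decoder step can be sketched as:

```python
import numpy as np

def decoder(H2, H3, W_out, b_out):
    """Combine attention values H2 with LSTM_2 states H3 (both T x m) and
    reduce to a scalar prediction: sum of all elements of W_out ⊙ H2 ⊙ H3."""
    H4 = H2 * H3                       # H4 = H2 ⊙ H3 (Hadamard product)
    return np.sum(W_out * H4) + b_out  # Σ(A) sums all matrix elements

rng = np.random.default_rng(2)
T, m = 5, 8
H2 = rng.standard_normal((T, m))
H3 = rng.standard_normal((T, m))
W_out = rng.standard_normal((T, m))
y_hat = decoder(H2, H3, W_out, 0.1)
print(float(y_hat))
```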

              Figure 3. The structures of A-LSTM model (a) and BPNN model (b) [11].
The network structure of the A-LSTM model is mainly determined by the number n of LSTM units and
the number L of LSTM layers. In this work we fix L = 2, train the model and make predictions for each
n ∈ {5, 6, ..., 12}, and choose the best-performing model as the final model. The loss function is MSE
and the optimizer is RMSprop.

2.4. BPNN Model
The BPNN model simulates the working principle of biological neurons through forward propagation of
signals and backward propagation of errors, which is the core of the model. The general network
structure of a BPNN is shown in Figure 3(b): one input layer, several hidden layers and one output
layer [11].


In the model, tanh, ReLU and linear are the activation functions of the input layer, the hidden layers and
the output layer, respectively; the loss function is MSE and the optimizer is RMSprop.
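As a sketch of the forward pass with these activations (our own NumPy illustration, with hypothetical parameter shapes; training via error backpropagation with the MSE loss and the RMSprop optimizer is omitted):

```python
import numpy as np

def bpnn_forward(x, params):
    """Forward pass of a small BPNN: tanh input layer, ReLU hidden
    layers, linear output layer, as described in Section 2.4."""
    (W_in, b_in), hidden, (W_out, b_out) = params
    a = np.tanh(x @ W_in + b_in)        # input layer: tanh
    for W, b in hidden:
        a = np.maximum(0.0, a @ W + b)  # hidden layers: ReLU
    return a @ W_out + b_out            # output layer: linear

rng = np.random.default_rng(3)
params = ((rng.standard_normal((5, 8)), np.zeros(8)),     # input layer
          [(rng.standard_normal((8, 8)), np.zeros(8))],   # one hidden layer
          (rng.standard_normal((8, 1)), np.zeros(1)))     # output layer
y = bpnn_forward(rng.standard_normal((4, 5)), params)
print(y.shape)  # (4, 1)
```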

2.5. Evaluation Indexes
It is necessary to objectively evaluate the performance of different neural networks, and of different
models of the same type. Common evaluation indexes are RMSE, MAPE, MAE and R-squared (R²).
RMSE, MAPE and MAE denote the root mean square error, the mean absolute percentage error and the
mean absolute error, respectively; the smaller these three indexes are, the better the model's predictions.
R² assesses how strong the linear relationship between the predicted and observed values is; the closer
R² is to 1, the better the model.
   RMSE = [ (1/n) Σ_{i=1}^n (ŷ_i - y_i)² ]^{1/2},   MAPE = (100%/n) Σ_{i=1}^n |(ŷ_i - y_i)/y_i|,

   MAE = (1/n) Σ_{i=1}^n |ŷ_i - y_i|,   R² = 1 - Σ_{i=1}^n (ŷ_i - y_i)² / Σ_{i=1}^n (y_i - ȳ)².

Here ŷ_i, y_i and ȳ are the predicted value, the observed value and the average of the observed values,
respectively.
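The four indexes can be computed directly from their definitions; for example (a NumPy sketch of our own, with MAPE returned in percent):

```python
import numpy as np

def evaluate(y_pred, y_true):
    """Return RMSE, MAPE (%), MAE and R² as defined above."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    err = y_pred - y_true
    rmse = np.sqrt(np.mean(err ** 2))
    mape = 100.0 * np.mean(np.abs(err / y_true))
    mae = np.mean(np.abs(err))
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    return rmse, mape, mae, r2

# e.g. the January and February rows of Table 2 (A-LSTM column)
rmse, mape, mae, r2 = evaluate([84513, 90054], [84430, 89034])
```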

3. Results
The dataset constructed in Section 2.1 is used to train the A-LSTM models with n ∈ {5, 6, ..., 12} and
the BPNN models with (n, L) ∈ {5, 6, ..., 12} × {1, 2, 3}. The trained models then make their
predictions.
             Table 1. The performance of A-LSTM when n ranges within {5, 6, ..., 12}.
      n        5         6         7         8         9         10        11        12
   RMSE     3532.96   4857.50   5129.39   5173.20   5184.12   5444.06   5582.12   5593.57
According to Table 1, the RMSE of the A-LSTM model increases as n ranges over {5, 6, ..., 12}, so the
A-LSTM with n = 5 is chosen as the best model; its prediction is shown in Table 2. Similarly, the BPNN
model with (n, L) = (6, 3) makes the best BPNN prediction, which is also shown in Table 2.
                         Table 2. The best predictions of A-LSTM and BPNN.
                      Observed            A-LSTM                        BPNN
        Date           Value       Prediction  Rel. Error       Prediction  Rel. Error
      January         84430           84513         0.098%         87103           3.166%
     February         89034           90054         1.146%         86573           2.764%
       March          92756           92909         0.165%         88783           4.283%
       April          83305           84457         1.383%         81609           2.035%
        May           87915           87628         0.326%         84495           3.890%
        June          85074           82640         2.861%         80275           5.641%
        July          83303           81281         2.428%         79057           5.097%
      August          85770           82046         4.341%         80125           6.582%
    September         76990           75036         2.538%         73493           4.542%
      October         76826           78442         2.103%         76581           0.318%
     November         79043           77748         1.638%         75853           4.035%
     December         77506           75615         2.439%         74394           4.015%
It can be seen from Table 2 that both the A-LSTM model and the BPNN model predict the monthly new
cases of hepatitis B well. Moreover, according to the evaluation indexes in Table 3, the A-LSTM model
is superior to the BPNN model in terms of RMSE, MAPE, MAE and R². This indicates that the A-LSTM
model proposed in this work effectively extracts the attention features of the dataset and predicts better
than common time series models.
                         Table 3. The comparison between A-LSTM and BPNN.
    Evaluation Indexes           RMSE           MAPE (%)           MAE               R²
        A-LSTM                 1780.495          1.789           1469.208          0.867
         BPNN                  3519.336          3.864           3246.290          0.482

4. Discussion and Conclusion
The performance of a deep learning model is determined not only by the parameters of the model itself
but also by the dataset used for training; the same model generally performs differently on different
datasets.
For the optimal model in this work, i.e., the A-LSTM model with n = 5, we construct NewDataset,
{(X'_i, X'_{i+1}, X'_{i+2}, ..., X'_{i+59}; X'_{i+60})}. After being trained on NewDataset, A-LSTM
makes the prediction shown in Figure 4(a); the evaluation indexes on the two datasets are shown in
Figure 4(b)-(e). The predictions of A-LSTM trained on the two datasets both fit the observed values
very well, which means that the model based on the attention mechanism extracts features from the
dataset well and makes good predictions.

Figure 4. (a) The predictions of A-LSTM on the two datasets; (b), (c), (d), (e) the corresponding RMSE,
MAPE, MAE and R².
In this work, we propose the A-LSTM model based on LSTM and the attention mechanism, and use it to
predict the monthly number of new hepatitis B patients. The results show that the A-LSTM model
predicts very well and performs much better than BPNN and other traditional time series models.
This work forecasts the short-term monthly number of new patients from one-dimensional historical
data of hepatitis B cases. Although good prediction results have been achieved, as research on the
hepatitis B virus deepens and treatment technology progresses, the counts of typical hepatitis B
complications can be incorporated alongside the historical case data to form a more suitable dataset,
constructed as multiple dimensions of attention, so as to make more accurate and broader predictions of
hepatitis B and its complications. This is the direction and content of our follow-up research.


Acknowledgments
The research is supported by the Graduate Scientific Research Foundation of Fundamentals Department
of Air Force Engineering University.

References
[1] P. J. García Nieto, F. Sánchez Lasheras, E. García-Gonzalo, and F. J. de Cos Juez 2018 PM10
     concentration forecasting in the metropolitan area of Oviedo (Northern Spain) using models based
     on SVM, MLP, VARMA and ARIMA: A case study Science of the Total Environment, 753-61
     DOI: 10.1016/j.scitotenv.2017.11.291
[2] Lulu Wang, Chen Liang, Wei Wu, Shengwen Wu, Jinghua Yang, Xiaobo Lu, Yuan Cai, and
     Cuihong Jin 2019 Epidemic Situation of Brucellosis in Jinzhou City of China and Prediction
     Using the ARIMA Model Canadian Journal of Infectious Diseases and Medical Microbiology,
     1-09 DOI: 10.1155/2019/1429462
[3] RAN Yangfan and JIANG Hongxun 2018 Stock Prices Prediction based on Back Propagation
     Neural Network and Support Vector Regression Journal of Shanxi University (Natural Science
     Edition), 1-14
[4] LIU Tian, YAO Menglei, HUANG Jigui, HUANG Shuqiong, CHEN Hongying, and Hossein
     Abbasimehr 2021 A Comparison of the Prediction Effects of Back Propagation Neural Network
     Model and Seasonal Autoregressive Integrated Moving Average Model in the Caseload of Class
     B Notifiable Diseases in Jingzhou City Chinese Journal of Social Medicine, 109-13 DOI:
     10.3969/j.issn.1673-5625.2021.01.028
[5] Huang Meng, Wang Ying, Qin Chuang, Liu Yang, and Li Xue-chao 2020 PM_(10) mass
     concentration prediction in Lanzhou based on SVR and BPNN Journal of Lanzhou
     University(Natural Sciences), 659-65
[6] LIU Wanwan, SU Yila, WU Ni'Er, and RENQING Dao'Erji 2018 Mongolian-Chinese machine
     translation based on LSTM Computer Engineering & Science, 1890-96
[7] YU Xuwen 2020 Speech recognition of Hainan dialect based on deep learning, Hainan
     University
[8] Tianxiang Zheng, Shaopeng Liu, Zini Chen, Yuhan Qiao, and Rob Law 2020 Forecasting Daily
     Room Rates on the Basis of an LSTM Model in Difficult Times of Hong Kong: Evidence from
     Online Distribution Channels on the Hotel Industry Sustainability, 7334 DOI:
     10.3390/su12187334
[9] Qiao Liu, Zhongqi Li, Ye Ji, Leonardo Martinez, Zia Ul Haq, Arshad Javaid, Wei Lu, and
     Jianming Wang 2019 Forecasting the seasonality and trend of pulmonary tuberculosis in Jiangsu
     Province of China using advanced statistical time-series analyses Infection and Drug Resistance,
     2311-22 DOI: 10.2147/IDR.S207809
[10] Christopher Olah 2015 Understanding LSTM Networks -- colah's blog Retrieved 2021/4/11 from
     http://colah.github.io/posts/2015-08-Understanding-LSTMs/
[11] 2021       Back-Propagation        Neural       Networks       Retrieved      2021/4/6      from
     http://murphylab.web.cmu.edu/publications/boland/boland_node17.html
