MNRAS 507, 3519–3539 (2021)    https://doi.org/10.1093/mnras/stab2132
Advance Access publication 2021 July 26

Hybrid deep convolutional neural network with one-versus-one approach for solar flare prediction
Yanfang Zheng, Xuebao Li,⋆ Yingzhen Si, Weishu Qin and Huifeng Tian
School of Computer Science, Jiangsu University of Science and Technology, Zhenjiang, China

⋆ E-mail: 305122880@qq.com

Accepted 2021 July 19. Received 2021 July 1; in original form 2021 February 8

ABSTRACT
We propose a novel hybrid convolutional neural network (CNN) model with a one-versus-one approach to forecast solar flare occurrence within 24 h, with outputs in four classes (No-flare, C, M, and X). We train and test our model using the same data sets as in Zheng, Li & Wang, and then compare our results with previous models using the true skill statistic (TSS) as the primary metric. The main results are as follows. (1) This is the first time that a CNN model in conjunction with the one-versus-one approach has been used in solar physics to make multiclass flare predictions. (2) In the four-class flare prediction, our model achieves quite high mean scores of TSS = 0.703, 0.489, 0.432, and 0.436 for the No-flare, C, M, and X classes, respectively, which are much better than or comparable to those of previous studies. In addition, our model obtains TSS scores of 0.703 ± 0.070 for ≥C-class and 0.739 ± 0.109 for ≥M-class predictions. (3) This is the first attempt to open the black-box CNN model and study the visualization of feature maps in order to interpret the prediction model. The visualization results indicate that our model pays attention to regions with strong gradients, strong intensity, high total intensity, and a large intensity range in the high-level feature maps. The median gradient and intensity, the total intensity, and the intensity range of the high-level feature maps increase approximately with increasing flare level.
Key words: magnetic fields – methods: data analysis – techniques: image processing – Sun: activity – Sun: flares.

1 INTRODUCTION

Solar flares can cause space weather hazards in the near-Earth space environment. It is therefore essential to establish a reliable and accurate prediction model for solar flares in order to prevent or reduce the damage from strong flares. In the past few years, several authors applied statistical methods to flare prediction (Song et al. 2009; Mason & Hoeksema 2010; Bloomfield et al. 2012; Barnes et al. 2016). Other researchers employed classic machine-learning methods to predict solar flare occurrence; the best-known examples include, but are not limited to, artificial neural networks (Qahwaji & Colak 2007; Ahmed et al. 2013; Li & Zhu 2013; Nishizuka et al. 2018), support vector machines (Yuan et al. 2010; Bobra & Couvidat 2015; Nishizuka et al. 2017; Sadykov & Kosovichev 2017), random forests (Liu et al. 2017; Florios et al. 2018; Cinto et al. 2020), k-nearest neighbours (Li et al. 2008; Huang et al. 2013), and ensemble learning (Colak & Qahwaji 2009; Huang et al. 2010; Guerra, Pulkkinen & Uritsky 2015). Multiclass flare prediction was performed by Liu et al. (2017), Bloomfield et al. (2012), and Colak & Qahwaji (2009).
   Convolutional neural networks (CNNs; LeCun, Bengio & Hinton 2015) are primarily made of stacked convolution layers, which can be regarded as a series of learnable feature extractors designed to acquire features from image data automatically, without human intervention. There have been a few attempts to apply CNNs to forecast solar flares. Huang et al. (2018) presented a CNN model to make binary class predictions for solar flares using many patches of solar active regions (ARs) from line-of-sight (LOS) magnetograms located within ±30° of the solar disc centre. Park et al. (2018) proposed a CNN model to predict binary class flares within 24 h using full-disc LOS magnetograms. Zheng, Li & Wang (2019) proposed a hybrid CNN model to make multiclass flare predictions within 24 h, and achieved the best forecasting performance in the existing literature in terms of the true skill statistic (TSS; Hanssen & Kuipers 1965; Muranushi et al. 2015; Leka, Barnes & Wagner 2018). Their model adopted a hierarchical classification strategy to gradually decompose the complex four-class classification problem into three binary classification sub-problems. In the existing literature, only Zheng et al. (2019) applied a CNN to multiclass flare prediction. Multiclass flare prediction can be treated as a multiclass classification task, and it is typically more difficult than binary class prediction, since the decision boundary of a multiclass classification problem tends to be more complicated than that of a binary classification (Galar et al. 2011; Zhang et al. 2017). In this paper, we attempt to propose a novel hybrid deep CNN model with a one-versus-one approach, different from previous studies, to predict solar flare occurrence with outputs in four classes (i.e. No-flare, C, M, and X) within 24 h.


© The Author(s) 2021.

Published by Oxford University Press on behalf of Royal Astronomical Society. This is an Open Access article distributed under the terms of the Creative
Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium,
provided the original work is properly cited.
This is because the one-versus-one approach is considered one of the most common and effective techniques for dealing with multiclass classification problems (Kang, Cho & Kang 2015).
   The computational process of CNN-based prediction models is to split and combine various features from the raw solar observational data, which are uninterpretable to humans, and thus CNN models are usually considered a black box (Yosinski et al. 2015). Such a black-box structure makes it impossible to explain the forecasting process and the forecasting basis of the model. Feature visualization analysis is helpful for explaining the feature parameters attended to by the CNN model (Zeiler & Fergus 2014; Yosinski et al. 2015). However, we have not found any such analysis in the existing literature. We attempt for the first time to open the black-box structure of a CNN flare prediction model by visualizing the feature maps of different convolutional layers in order to study the image features that affect the prediction results.
   The remainder of this paper is organized as follows. The data are described in Section 2 and the method is introduced in Section 3. Results are presented in Section 4, and conclusions and discussions are provided in Section 5.

2 DATA

The Helioseismic and Magnetic Imager (HMI; Schou et al. 2012) on board the Solar Dynamics Observatory (SDO; Pesnell, Thompson & Chamberlin 2012) began its routine observations on 2010 April 30. SDO/HMI subsequently started to publicly release a new data product called Space-weather HMI Active Region Patches (SHARP; Bobra et al. 2014), which provides the LOS magnetograms of ARs. In this study, we adopt SHARP LOS magnetogram data from 2010 May 1 to 2018 September 13, covering the main peak of solar cycle 24, as the raw input data for our prediction model. Only magnetogram data of ARs with central longitudes within ±45° of the central meridian are included, to avoid the influence of projection effects (Ahmed et al. 2013; Bobra et al. 2014). The output of our prediction model is compared with the daily flare observations of the Geostationary Operational Environmental Satellite (GOES).
   In order to assess the performance of our model properly, we need to segregate the whole data set into training and testing data sets. The data sets we use are the same as those used by Zheng et al. (2019), because they adopt the method of shuffle-and-split cross-validation by ARs. This method ensures that the ARs and samples in the testing data set do not appear in the training data set, which can effectively verify the validity and stability of the model. Using the same data sets also allows fair performance comparisons between different flare forecasting models. Initially, we collect 870 ARs and 136 134 magnetogram samples, including 443 X-class, 6534 M-class, 72 412 C-class, and 56 745 No-flare magnetogram samples, to build the data sets. There is clearly a significant imbalance in the number of samples across classes. To alleviate this imbalance, we undersample the No-flare/C-class samples by randomly keeping about 2 samples out of every 10, and we augment the M/X-class samples by rotating and reflecting the images. Table 1 shows the number of samples and ARs for the 10 separate data sets used in our work. As can be seen from Table 1, the undersampling and data augmentation decrease the number of No-flare/C-class samples and increase the number of M/X-class samples. The data imbalance is thus alleviated to some extent, although the number of samples is still not equal across classes. In addition, the undersampling and data augmentation do not increase the number of ARs, so the number of ARs remains unbalanced across classes: there are more No-flare/C-class ARs than M/X-class ARs in our data sets.
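As a rough illustration of the undersampling and the rotation/reflection augmentation just described, the following sketch uses NumPy. The paper only states "rotating and reflecting", so the particular transform set (left-right flip, up-down flip, 180° rotation) and the random seed are our assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def undersample(images, keep_fraction=0.2):
    """Randomly keep roughly 2 out of every 10 No-flare/C-class samples."""
    mask = rng.random(len(images)) < keep_fraction
    return images[mask]

def augment(images):
    """Enlarge the M/X-class set by reflecting and rotating each magnetogram.
    Only shape-preserving transforms are used here (flips and a 180-degree
    rotation); the exact transform set used by the authors is not specified."""
    out = []
    for img in images:
        out.extend([img, np.fliplr(img), np.flipud(img), np.rot90(img, 2)])
    return np.stack(out)
```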
3 METHOD

Initially, we tried to train a single CNN model to predict four-class flares directly. However, the single CNN model showed a large training loss during training and then did not perform well during testing. Therefore, we adopt a hybrid deep CNN model with a one-versus-one approach. The one-versus-one approach decomposes an m-class classification problem into C_m^2 = m(m − 1)/2 binary classification tasks through pairwise combination. Each binary classification task is managed by an independent binary classifier, and all binary classifiers are trained using only the samples of the two corresponding classes from the original training data set. In the prediction (testing) phase, each sample from the testing data set is submitted to all binary classifiers at the same time, and the outputs of these classifiers can be represented by the following score matrix P (Zhang et al. 2018a,b):

P = \begin{bmatrix} - & p_{12} & \cdots & p_{1m} \\ p_{21} & - & \cdots & p_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ p_{m1} & p_{m2} & \cdots & - \end{bmatrix},    (1)

where p_{ij} ∈ [0, 1] is the probability output by the binary classifier that discriminates class i from class j in favour of the former class; the probability in favour of the latter can be calculated as p_{ji} = 1 − p_{ij} if the classifier does not provide it. After the score matrix is obtained, the final output can be inferred by an aggregation strategy such as majority voting (Furnkranz 2002) or weighted voting (Galar et al. 2011). Previous studies have shown that the weighted voting strategy is more robust and effective, so it is widely used as the most popular aggregation strategy (Hullermeier & Vanderlooy 2010; Galar et al. 2011), and we also adopt it in this work. In the weighted voting strategy, the total probability of each class is calculated from the score matrix P, and the class with the largest total is chosen as the final output class:

class = \arg\max_{i=1,\ldots,m} \sum_{1 \le j \ne i \le m} p_{ij}.    (2)
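As a concrete illustration of equations (1) and (2), the sketch below assembles the score matrix from the six pairwise probabilities and applies the weighted voting rule. The function and variable names are ours, not the authors', and the example probabilities are made up.

```python
import numpy as np

CLASSES = ["No-flare", "C", "M", "X"]

def weighted_vote(pairwise_probs):
    """pairwise_probs maps a class pair (i, j), i < j, to p_ij, the probability
    that the binary classifier for that pair favours class i. Builds the score
    matrix P of equation (1) and returns the class with the largest row sum,
    as in equation (2)."""
    m = len(CLASSES)
    P = np.zeros((m, m))
    for (i, j), p_ij in pairwise_probs.items():
        P[i, j] = p_ij
        P[j, i] = 1.0 - p_ij          # p_ji = 1 - p_ij when not provided
    scores = P.sum(axis=1)            # total probability per class
    return CLASSES[int(np.argmax(scores))]

# Example: six pairwise outputs for one test magnetogram (made-up numbers).
probs = {(0, 1): 0.3, (0, 2): 0.2, (0, 3): 0.4,
         (1, 2): 0.6, (1, 3): 0.7, (2, 3): 0.55}
print(weighted_vote(probs))           # prints "C" for these numbers
```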

              Table 1. The number of solar magnetogram samples and ARs for 10 separate data sets.

              Data set                   No-flare Class              C Class                 M Class                      X Class
                                      (Sample/AR numbers)      (Sample/AR numbers)     (Sample/AR numbers)          (Sample/AR numbers)

              No. 1: Training               8796/359                 11091/237                  8862/60                    2856/8
              Test                           1331/57                  1756/39                    1146/8                    1500/2
              Total                         10127/416                12847/276                 10008/68                   4356/10
              No. 2: Training               8824/351                 11242/235                  9408/60                   4080/8
              Test                           1733/72                  1589/36                    876/7                     60/1
              Total                         10557/423                12831/271                 10284/67                   4140/9
              No. 3: Training               8996/362                 11549/235                  8562/59                    4788/8
              Test                           1382/58                  1379/35                   1950/11                     312/2
              Total                         10378/420                12928/270                 10512/70                   5100/10

              No. 4: Training               8807/347                 10864/239                  8814/60                   3684/8
              Test                           1565/66                  1635/37                    828/8                     1416/2
              Total                         10372/413                12499/276                  9642/68                   5100/10
              No. 5: Training               8951/352                 11611/234                  9270/59                    3756/8
              Test                           1664/68                  1686/38                    1218/9                     600/2
              Total                         10615/420                13237/272                 10488/68                   4356/10
              No. 6: Training               8836/359                 11189/238                  8580/62                   3648/8
              Test                           1500/62                  1641/39                    1332/5                    1668/3
              Total                         10336/421                12830/277                  9912/67                   5316/11
              No. 7: Training               8633/353                 11450/242                  8856/61                    3840/8
              Test                           1579/68                  1025/27                    1692/9                    1476/3
              Total                         10212/421                12475/269                 10548/70                   5316/11
              No. 8: Training               8908/357                 11689/243                  9468/58                    4632/8
              Test                           1459/64                   925/27                    750/7                      312/2
              Total                         10367/421                12614/270                 10218/65                   4944/10
              No. 9: Training               8964/359                 11398/241                  9174/62                    3504/8
              Test                           1413/61                  1473/34                    1038/5                    1440/2
              Total                         10377/420                12871/275                 10212/67                   4944/10
              No. 10: Training              8632/357                 11473/236                 9216/60                    3888/8
              Test                           1517/65                  1266/30                   1392/10                    252/1
              Total                         10149/422                12739/266                 10608/70                   4140/9

   We design the proposed hybrid CNN model with the one-versus-one approach as follows. The model is implemented in the PYTHON programming language with the deep learning library KERAS and TENSORFLOW (Abadi et al. 2016) as the backend. Fig. 1 shows the proposed hybrid CNN model with the one-versus-one approach and the architecture of our model, which consists of six binary CNN models. The proposed model adopts the one-versus-one approach to decompose the problem of four-class flare prediction into the sub-problems of six binary class flare predictions. As shown in Fig. 1(a), the proposed model is composed of six different binary CNN models (Model N C, Model N M, Model N X, Model C M, Model C X, and Model M X), which are independent binary classifiers, each responsible for distinguishing between a different pair of classes. The development of the proposed model involves two phases: a training phase and a prediction (testing) phase. In the training phase, each binary CNN model is trained independently on the subset of the training data set containing only the magnetogram samples of its two corresponding classes; it thus outputs the probabilities that an AR will produce a solar flare of each of the two corresponding classes within 24 h. In the testing phase, each sample from the testing data set is input to all six binary CNN models simultaneously. Each binary CNN model then performs a binary class prediction and outputs the probabilities (i.e. p_{ij} and p_{ji}) of the two corresponding classes. Once all six binary CNN models have output their probabilities, the score matrix P introduced above can be assembled. Finally, the proposed model uses the weighted voting strategy to aggregate the output predictions of all six binary CNN models and calculates the final output class from the score matrix. Based on the final output class, the proposed model makes a four-class (i.e. No-flare, C, M, and X) flare prediction.
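A minimal sketch of how the six pairwise training subsets can be assembled and trained is given below. Here `labels` holds integer class indices (0 = No-flare, ..., 3 = X), and `build_binary_cnn` is a placeholder for the network of Fig. 1(b); the epoch count is illustrative.

```python
from itertools import combinations
import numpy as np
import tensorflow as tf

def train_ovo_models(images, labels, build_binary_cnn, epochs=20):
    """Train one independent binary CNN per class pair (six models in total),
    using only the magnetogram samples of the two classes in that pair."""
    models = {}
    for i, j in combinations(range(4), 2):          # (0,1), (0,2), ..., (2,3)
        mask = np.isin(labels, [i, j])
        x = images[mask]
        # one-hot targets: column 0 for class i, column 1 for class j
        y = tf.keras.utils.to_categorical((labels[mask] == j).astype(int),
                                          num_classes=2)
        model = build_binary_cnn()                  # network of Fig. 1(b), assumed
        model.fit(x, y, epochs=epochs, validation_split=0.2)
        models[(i, j)] = model
    return models
```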
   All six binary CNN models constituting the proposed hybrid CNN model utilize the same network architecture, which is illustrated in Fig. 1(b). In this study, after trying different combinations of choices such as the number of convolutional layers, the filter size, and the activation function, we adopt the binary CNN architecture with the best performance. Since the proposed hybrid CNN model consists of six binary CNN models, it can achieve the best performance for four-class prediction when each of the six binary CNN models achieves the best performance for binary class prediction. As shown in Fig. 1(b), each binary CNN model consists of 28 different layers and functions, comprising convolutional layers, pooling layers, dense layers, a softmax layer, dropout layers, batch normalization (BN) layers, and ReLU activation functions.
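The exact 28-layer configuration is given in Fig. 1(b), which we cannot reproduce here; the sketch below only illustrates the kind of Conv-BN-ReLU-pooling stack named in the text. The filter counts (apart from the 64 filters of the last convolutional layer mentioned in Section 4.3), kernel sizes, input size, dropout rate, and optimizer settings are placeholders of ours, not the published configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_binary_cnn(input_shape=(128, 128, 1)):
    """Illustrative binary CNN: five Conv-BN-ReLU-pooling blocks followed by
    dropout, dense layers, and a two-way softmax. Hyperparameters assumed."""
    model = models.Sequential()
    model.add(layers.Input(shape=input_shape))
    for _ in range(5):                              # five convolutional blocks
        model.add(layers.Conv2D(64, (3, 3), padding="same"))
        model.add(layers.BatchNormalization())
        model.add(layers.Activation("relu"))
        model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Flatten())
    model.add(layers.Dropout(0.5))
    model.add(layers.Dense(128, activation="relu"))
    model.add(layers.Dense(2, activation="softmax"))
    # A plain cross entropy is used here; the class-weighted loss of
    # equations (3)-(4), sketched further below, could be passed instead.
    model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model
```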
   To maximize the prediction performance, all six binary CNN models are iteratively trained on small parts of the training data set called mini batches (Goodfellow, Bengio & Courville 2016), each model minimizing its own loss function. Since the number of ARs is imbalanced in our data sets, we employ the sum of a class-weighted cross entropy as the loss function,

J = \sum_{n=1}^{N} \sum_{k=0}^{K-1} C_k \, y_{nk} \log_{e}(\hat{y}_{nk}),    (3)

C_k = L(\mathrm{AR}_k) \, L(\mathrm{sample}_k) \, \beta_k,    (4)

where K denotes the number of classes (equal to 2), N is the number of training samples per mini batch, and y_{nk} and \hat{y}_{nk} represent the expected output and the forecast output of the kth class during a forward propagation, respectively. C_k is the weight of the kth class used for weighting the loss function, L(AR_k) is the number of ARs of the kth class, and L(sample_k) is the number of samples of the kth class. β_k (k = 0, 1) is an optimized parameter used to adjust C_k, which is obtained through experiment and comes in pairs for the two corresponding classes. In the training phase, the parameters (weights and biases) of our models are iteratively updated to minimize the loss function using stochastic gradient descent (SGD; LeCun et al. 1998a), where the gradients of the loss function with respect to the weights and biases are calculated with the backpropagation procedure (Rumelhart, Hinton & Williams 1986).
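Equations (3) and (4) can be written as a custom Keras loss, as sketched below. The extracted equation (3) shows no explicit minus sign; the sketch adds the conventional negative sign so that minimizing the loss is meaningful, and the β_k values as well as the example numbers are placeholders (the authors tune β_k experimentally).

```python
import tensorflow as tf

def class_weighted_cross_entropy(n_ars, n_samples, betas):
    """Class-weighted cross entropy in the spirit of equations (3)-(4):
    C_k = L(AR_k) * L(sample_k) * beta_k, summed over the mini batch."""
    c = tf.constant([n_ars[k] * n_samples[k] * betas[k] for k in range(2)],
                    dtype=tf.float32)

    def loss(y_true, y_pred):
        y_true = tf.cast(y_true, tf.float32)
        y_pred = tf.clip_by_value(y_pred, 1e-7, 1.0)     # numerical safety
        per_sample = -tf.reduce_sum(c * y_true * tf.math.log(y_pred), axis=-1)
        return tf.reduce_sum(per_sample)                  # summed over the mini batch
    return loss

# Example for the No-flare/M pair of data set No. 1 in Table 1
# (359 ARs / 8796 samples vs. 60 ARs / 8862 samples); betas are placeholders:
# model.compile(optimizer="sgd",
#               loss=class_weighted_cross_entropy(n_ars=[359, 60],
#                                                 n_samples=[8796, 8862],
#                                                 betas=[1.0, 1.0]))
```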



    Figure 1. The proposed hybrid CNN model with one-versus-one approach and the architecture of our model consisting of six binary CNN models.



4 RESULTS

4.1 Training results

We train and validate the six binary CNN models that are combined to form the hybrid CNN model with the one-versus-one approach on the training and validation data sets during each epoch to track the learning performance. About 80 per cent of the samples in the training data set are used to train each binary CNN model, and the remaining 20 per cent are used as the validation data set. The training and validation sets for each binary model contain only the samples of the corresponding two classes from the full training and validation sets. In total, we obtain 10 separate training and validation data sets for training and validating the model. At the end of each epoch, we compute the loss function on the validation data set, and the model that minimizes the validation loss is selected as the best trained model, similar to previous studies (Huang et al. 2018; Park et al. 2018; Zheng et al. 2019). The learning curves for the six binary CNN models are illustrated in Fig. 2, giving the training loss and validation loss with respect to the number of epochs. The 10 differently coloured curves indicate the variations in training and validation loss with epochs for the model trained and validated on the 10 different data sets. In Figs 2(g) and (h), the peaks on the pink loss curves may be caused by excessive changes in the weights and biases of the model during the training process. This phenomenon could be alleviated by gradually decreasing the learning rate. In our work, we select the learning rate through experiment to reduce the number of peaks as much as possible. However, owing to the differences between the experimental data sets used for training the different models (Model N C, Model N M, Model N X, Model C M, Model C X, and Model M X), there are still a few peaks on the loss curves in Figs 2(a), (e), and (g). In summary, it can be seen from Fig. 2 that both the training loss and the validation loss generally decrease as the number of epochs increases for all binary CNN models. By checking the learning curves, we find that none of the binary CNN models suffers from poor learning or serious overfitting.
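The epoch-wise selection of the model with the lowest validation loss maps directly on to a Keras ModelCheckpoint callback, as sketched below; the file path, batch size, and epoch count are illustrative values rather than the authors' settings.

```python
import tensorflow as tf

def fit_with_best_checkpoint(model, x_train, y_train, path="best_model.h5"):
    """Train with mini-batch SGD and keep the weights of the epoch that
    minimizes the validation loss, as described in Section 4.1."""
    checkpoint = tf.keras.callbacks.ModelCheckpoint(
        filepath=path, monitor="val_loss", save_best_only=True)
    history = model.fit(x_train, y_train,
                        validation_split=0.2,     # 80/20 split of the training set
                        epochs=50, batch_size=64,  # illustrative values
                        callbacks=[checkpoint])
    return history
```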
4.2 Testing results

The prediction results of the proposed hybrid CNN model can be characterized by a confusion matrix. From this confusion matrix, we can calculate the following metrics (Zheng et al. 2019): precision, recall, accuracy, false alarm ratio (FAR), Heidke skill score (HSS; Heidke 1926), and TSS. We use these metrics to evaluate the model performance. Precision, recall, and accuracy range from 0 to 1, with the maximum value of 1 being the perfect score. FAR also ranges from 0 to 1, but with the minimum value of 0 being the perfect score. HSS ranges from −∞ to 1, with 1 indicating a perfect score and values below 0 indicating no skill.



Figure 2. Learning curves showing the training loss and validation loss with respect to the number of epochs for the six different binary CNN models. The 10 differently coloured curves indicate the variations in training and validation loss with epochs for the model trained and validated on 10 different data sets.




TSS ranges from −1 (no correct predictions) to +1 (all correct predictions), and a value of zero indicates that the predictions have been generated mainly by chance. Among these six metrics, only the TSS is insensitive to the class imbalance ratio (Bloomfield et al. 2012; Bobra & Couvidat 2015). We therefore follow the suggestion of Bloomfield et al. (2012) and use the TSS as the primary metric.
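For reference, the verification metrics listed above can be computed from the entries of a binary confusion matrix as follows. These are the standard definitions; the authors' own implementation is not published in the text.

```python
def binary_skill_scores(tp, fn, fp, tn):
    """Metrics of Section 4.2 from a 2x2 confusion matrix
    (positive = flaring class): precision, recall, accuracy, FAR, HSS, TSS."""
    total     = tp + fn + fp + tn
    recall    = tp / (tp + fn)
    precision = tp / (tp + fp)
    accuracy  = (tp + tn) / total
    far       = fp / (tp + fp)                     # false alarm ratio
    # Heidke skill score (Heidke 1926): skill relative to random chance
    expected  = ((tp + fn) * (tp + fp) + (tn + fn) * (tn + fp)) / total
    hss       = (tp + tn - expected) / (total - expected)
    # True skill statistic (Hanssen & Kuipers 1965)
    tss       = tp / (tp + fn) - fp / (fp + tn)
    return dict(recall=recall, precision=precision, accuracy=accuracy,
                far=far, hss=hss, tss=tss)
```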



                        Figure 3. The confusion matrices for the proposed model evaluated on each of 10 data sets.

      Table 2. The four-class flare prediction results of our proposed hybrid CNN model (within 24 h) and comparison to previous studies.

      Metric                  Model                No-flare (weaker than C1.0) Class             C Class              M Class              X Class

      Recall                This work                        0.807 ± 0.076                   0.644 ± 0.083        0.518 ± 0.272         0.523 ± 0.398
                       Zheng et al. (2019)                   0.869 ± 0.034                   0.671 ± 0.059        0.617 ± 0.148         0.594 ± 0.394
                         Liu et al. (2017)                   0.812 ± 0.039                   0.526 ± 0.050        0.671 ± 0.037         0.297 ± 0.039
                     Bloomfield et al. (2012)                      –                             0.737                0.693                 0.859
                     Colak & Qahwaji (2009)                        –                             0.772                0.865                 0.917
      Precision             This work                        0.773 ± 0.045                   0.638 ± 0.041        0.659 ± 0.060         0.502 ± 0.346
                       Zheng et al. (2019)                   0.793 ± 0.054                   0.670 ± 0.079        0.699 ± 0.087         0.562 ± 0.383
                         Liu et al. (2017)                   0.703 ± 0.037                   0.563 ± 0.054        0.656 ± 0.036         0.745 ± 0.152
                     Bloomfield et al. (2012)                      –                             0.330                0.136                 0.029
                     Colak & Qahwaji (2009)                        –                               –                    –                     –

      Accuracy              This work                        0.869 ± 0.027                   0.791 ± 0.039        0.830 ± 0.031         0.893 ± 0.065
                       Zheng et al. (2019)                   0.891 ± 0.018                   0.812 ± 0.029        0.849 ± 0.034         0.933 ± 0.041
                         Liu et al. (2017)                   0.844 ± 0.017                   0.712 ± 0.026        0.778 ± 0.019         0.957 ± 0.005
                     Bloomfield et al. (2012)                      –                             0.711                0.829                 0.881
                     Colak & Qahwaji (2009)                        –                             0.811                0.944                 0.981
      FAR                   This work                        0.226 ± 0.045                   0.361 ± 0.041        0.340 ± 0.060         0.297 ± 0.281
                       Zheng et al. (2019)                   0.207 ± 0.054                   0.330 ± 0.079        0.301 ± 0.087         0.138 ± 0.140
                         Liu et al. (2017)                   0.297 ± 0.023                   0.437 ± 0.016        0.344 ± 0.020         0.255 ± 0.126
                     Bloomfield et al. (2012)                      –                             0.670                0.864                 0.971
                     Colak & Qahwaji (2009)                        –                             0.319                0.688                 0.967
      HSS                   This work                        0.692 ± 0.058                   0.488 ± 0.047        0.444 ± 0.194         0.419 ± 0.309
                       Zheng et al. (2019)                   0.747 ± 0.037                   0.535 ± 0.061        0.551 ± 0.120         0.539 ± 0.366
                         Liu et al. (2017)                   0.640 ± 0.032                   0.334 ± 0.028        0.497 ± 0.031         0.406 ± 0.014
                     Bloomfield et al. (2012)                      –                             0.296                0.177                 0.049
                     Colak & Qahwaji (2009)                        –                             0.493                0.470                 0.169
      TSS                   This work                        0.703 ± 0.070                   0.489 ± 0.049        0.432 ± 0.222         0.436 ± 0.330
                       Zheng et al. (2019)                   0.768 ± 0.028                   0.538 ± 0.059        0.534 ± 0.137         0.552 ± 0.370
                         Liu et al. (2017)                   0.669 ± 0.039                   0.328 ± 0.050        0.500 ± 0.037         0.291 ± 0.039
                     Bloomfield et al. (2012)                      –                             0.443                0.526                 0.740
                     Colak & Qahwaji (2009)                        –                               –                    –                     –
      Note. For Liu et al. (2017) and Zheng et al. (2019), we use the results provided by Zheng et al. (2019) in their table 3. Colak & Qahwaji (2009)
      did not provide TSS or Precision scores. The scores shown in this table are averages, except for Bloomfield et al. (2012), whose results are their
      optimum values rather than averages.

We evaluate (test) the proposed hybrid CNN model with the one-versus-one approach on each of the 10 testing data sets to obtain the prediction results. Fig. 3 shows the confusion matrices for the proposed model evaluated on each of the 10 data sets. The values in each confusion matrix represent the numbers of samples predicted correctly (on the primary diagonal) and incorrectly (off the primary diagonal). The total confusion matrix (i.e. the top matrix in Fig. 3) is derived from the summation of these 10 confusion matrices. The four-class flare prediction results of our proposed hybrid CNN model within 24 h are presented and compared with previous studies in Table 2. Our model and the model of Zheng et al. (2019) give the means and standard deviations of the performance metrics based on different CNN methods using the same data sets. The average TSS scores of our model are 0.703, 0.489, 0.432, and 0.436 for the No-flare, C, M, and X classes, respectively, which are slightly smaller than those of Zheng et al. (2019). Our average TSS scores are better than those of Liu et al. (2017) in every class except the M class. We show average scores in Table 2, whereas Bloomfield et al. (2012) only report their optimum result, which is not an average; our optimum TSS scores are 0.726, 0.489, 0.614, and 0.772 for the No-flare, C, M, and X classes, respectively, which are better than theirs. Table 2 also shows that, in addition to the TSS, we obtain quite good scores in the other five metrics. Based on the same data sets, our model can also make binary class predictions for solar flares. The binary class prediction results can be derived from the confusion matrices for four-class flare prediction. Table 3 shows the prediction results of our proposed hybrid CNN model within 24 h for the ≥C class and ≥M class, compared with previous studies. For ≥C-class and ≥M-class prediction, the TSS scores of the model are 0.703 ± 0.070 and 0.739 ± 0.109, respectively, which is clearly superior to Huang et al. (2018) and Park et al. (2018) but slightly smaller than Zheng et al. (2019). Overall, the experimental results demonstrate that the predictive performance of our proposed hybrid CNN model with the one-versus-one approach is close to that of Zheng et al. (2019) while being superior to all the others for both multiclass and binary class prediction.

4.3 Feature visualization

In our study, the effectiveness of our prediction model is verified by the above experiments. However, CNN models are generally regarded as black boxes, and understanding what current CNN-based solar flare forecasting models have learned is a key way to further improve them. Thus, in order to interpret why the model works well, it is necessary to visualize what the model learns through its feature maps and then give a qualitative empirical analysis (Zeiler & Fergus 2014; Yosinski et al. 2015). We modify the loss function by abandoning the weight C_k in equation (3), and keep the training and validation splits, initial weights, and other factors unchanged to obtain a

                              Table 3. The flare prediction results of our proposed hybrid CNN model (within 24 h) for
                              ≥C class and ≥M class and comparison to previous studies.

                              Metric                       Model                     ≥C Class                ≥M Class

                              Recall                    This work                 0.895 ± 0.024            0.818 ± 0.120
                                                    Zheng et al. (2019)           0.898 ± 0.030            0.817 ± 0.084
                                                    Huang et al. (2018)               0.726                    0.850
                                                     Park et al. (2018)                0.85                      –
                              Precision                 This work                 0.913 ± 0.041            0.873 ± 0.044
                                                    Zheng et al. (2019)           0.939 ± 0.019            0.889 ± 0.056
                                                    Huang et al. (2018)               0.352                    0.101
                                                     Park et al. (2018)                 –                        –
                              Accuracy                  This work                 0.869 ± 0.027            0.887 ± 0.026

                                                    Zheng et al. (2019)           0.891 ± 0.017            0.891 ± 0.024
                                                    Huang et al. (2018)               0.756                    0.813
                                                     Park et al. (2018)                0.82                      –
                              FAR                       This work                 0.086 ± 0.041            0.126 ± 0.044
                                                    Zheng et al. (2019)           0.060 ± 0.019            0.111 ± 0.056
                                                    Huang et al. (2018)               0.648                    0.899
                                                     Park et al. (2018)                0.17                      –
                              HSS                       This work                 0.692 ± 0.058            0.746 ± 0.089
                                                    Zheng et al. (2019)           0.746 ± 0.037            0.759 ± 0.071
                                                    Huang et al. (2018)               0.339                    0.143
                                                     Park et al. (2018)                0.63                      –
                              TSS                       This work                 0.703 ± 0.070            0.739 ± 0.109
                                                    Zheng et al. (2019)           0.767 ± 0.028            0.749 ± 0.079
                                                    Huang et al. (2018)               0.487                    0.662
                                                     Park et al. (2018)                0.63                      –
                              Note. For Zheng et al. (2019), we compute all six metric scores from the confusion matrices provided in
                              their table 4. For Huang et al. (2018), we calculate the Precision, Accuracy, and FAR scores from the
                              contingency table provided in their table 4.

new model. We train and evaluate this model again and obtain the following results: the TSS scores of the low-performance model are 0.527 ± 0.066, 0.142 ± 0.075, 0.279 ± 0.075, and −0.027 ± 0.073 for the No-flare, C, M, and X classes, respectively. As shown in Fig. 4, we perform feature visualization for both the high-performance model and the low-performance model, and obtain the feature images output by each convolutional layer of each model. As shown in Figs 4(b)–(c) and (f)–(g), the low-level features of the input image, e.g. edges or shapes, are detected in the lower convolutional layers and are still recognizable. Subsequent layers use these low-level features to detect higher level features. As shown in Figs 4(d) and (h), high-level image features are detected in the last convolutional layer and become more abstract and difficult to explain.
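Intermediate feature maps of the kind shown in Fig. 4 can be read out by wrapping the trained network in a second Keras model, as sketched below. The argument `layer_name` is a placeholder; the layer names of the actual network can be listed with model.summary().

```python
import tensorflow as tf

def feature_maps(model, magnetogram, layer_name):
    """Return the feature maps produced by one convolutional layer for a
    single normalized AR magnetogram of shape (height, width, 1)."""
    probe = tf.keras.Model(inputs=model.input,
                           outputs=model.get_layer(layer_name).output)
    maps = probe.predict(magnetogram[tf.newaxis, ...])   # add a batch axis
    return maps[0]            # shape (h, w, n_filters), e.g. 64 filters
```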
   By investigating the 64 feature map images output by each convolutional layer of the high-performance and low-performance models, we find the following two characteristics. First, there is no significant difference in feature distribution (e.g. edges or shapes) in the lower convolutional layers of the high-performance and low-performance models, especially in the first convolutional layer. Secondly, by comparing the dynamic range of the data, we find that the range in the high-performance prediction is significantly higher than that in the low-performance prediction, especially in the last convolutional layer. Since the final output of the prediction model mainly depends on the image features output by the last convolutional layer, we randomly select 10 different AR samples for each class as input to the model and calculate the gradient values for the feature maps output by the last convolutional layer of the model. The natural logarithmic values of the gradient median for the high-level feature maps output by the 64 filters are given in Fig. A1(a). From Fig. A1, we obtain the following results. (1) The median values of the gradients in the areas attended to by the high-performance model are significantly higher than those of the low-performance model, and this applies not only to one AR sample but also to the other AR samples. (2) The median values of the gradients for both the high-performance and low-performance models increase approximately with the increase of flare level; the relationship between the magnitude of the gradient values and the flare level can be clearly seen in Fig. A1(b). (3) In addition, we calculate the median intensity, the total intensity, and the range of intensity values for the feature map images from the last convolutional layer, as shown in Figs A1(c)–(h), and they behave similarly to the gradients in the high-performance and low-performance models.
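The statistics plotted in Fig. A1 (median gradient, median intensity, total intensity, and intensity range of each high-level feature map) can be computed with NumPy as follows; this is our reading of the quantities named in the text, and the gradient is taken as the per-pixel gradient magnitude.

```python
import numpy as np

def feature_map_statistics(maps):
    """`maps` has shape (height, width, n_filters), e.g. the 64 feature maps
    of the last convolutional layer. Returns per-filter statistics."""
    stats = []
    for k in range(maps.shape[-1]):
        fmap = maps[..., k]
        gy, gx = np.gradient(fmap)                 # finite-difference gradients
        grad_mag = np.hypot(gx, gy)
        stats.append({
            "log_median_gradient": np.log(np.median(grad_mag) + 1e-12),
            "median_intensity": np.median(fmap),
            "total_intensity": fmap.sum(),
            "intensity_range": np.ptp(fmap),       # max minus min
        })
    return stats
```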
   In addition, we find from Figs A1(a), (c), (e), and (g) that there are a few peaks for the 64 filters corresponding to each AR sample in the low-performance prediction. To investigate the source of these peaks and their rationality, we examine all the feature visualization images, as shown in Fig. A2. Figs A2(a)–(c) present the feature map images output by the first, second, and fifth convolutional layers of the low-performance model, respectively, and Fig. A2(d) presents the gradient images for the feature maps of Fig. A2(c). As shown in Fig. A2(a), two of the output images from the 64 filters in the first convolutional layer exhibit image features with a large dynamic range which are distinctly different from those of the other images, and these features are similar to those visualized by the


Figure 4. Feature visualization. (a) shows the normalized AR image used as the input of the model, which is the LOS magnetogram sample of X-class AR 11890 observed at 19:36 UT on 2013 November 7. (b)–(d) present one of the feature map images output by the first, second, and fifth convolutional layers of the high-performance model, respectively, and (f)–(h) correspondingly present one of the feature map images output by the low-performance model. We find that the output image from the fifth convolutional layer shows a strong contrast. Therefore, we calculate the gradient for the output image from the last convolutional layer, and (e) and (i) show the gradient images for the feature maps of (d) and (h), respectively. The values on the colour bars for (a)–(i) can be seen clearly by zooming in.

high-performance model.¹ As shown in Figs A2(b) and (c), five images similar to the visualized results of the high-performance model appear in the second convolutional layer, and eight such images appear in the fifth convolutional layer. The number of visualized images with distinctive features from the last convolutional layer is consistent with the number of peaks in Fig. A1. A similar phenomenon is also observed for the other nine AR samples. This is because CNNs combine the three structural concepts of local receptive fields, shared weights, and spatial or temporal subsampling to ensure a degree of invariance to displacement, scale, and deformation. From the second convolutional layer onward, each unit in a layer receives inputs from a set of units located in a small neighbourhood in the previous layer (LeCun et al. 1998b). In other words, each unit in each feature map is connected to several small neighbourhoods at identical locations in a subset of the feature maps in the previous layer. In the low-performance prediction, the first convolutional layer acquires some of the features that affect the correct forecasting results. These feature images and the neighbourhood images are combined into multiple feature sets, which are used as the input of the next convolutional layer, corresponding to a certain number of outputs.

¹ https://github.com/FlarePrediction/Repository/tree/papers/paper11/MNRAS V2/Figure%20C

increases, the number of features learned that are relevant to the          as class M, while some of M-class samples are mostly incorrectly
correct prediction result also increases, so that the low-performance       predicted as class X. This indicates that the correct predictions of
model can also obtain some correct predictions.                             ≥M-class major flares are not missed, which does not degrade
                                                                            the performance of the model for ≥M-class prediction. Based on
                                                                            our experimental results, we conclude that the proposed hybrid
5 CONCLUSIONS AND DISCUSSIONS
                                                                            CNN model with one-versus-one approach is an effective method
In this study, we propose a new hybrid deep CNN model with                  for the solar flare prediction task. In near future work, with the
one-versus-one approach to forecast solar flare occurrence with the         continued observation of SDO/HMI, we would gather more X-class
outputs of four classes (i.e. No-flare, C, M, and X) within 24 h.           ARs and samples in the data sets to improve the model robustness for
The proposed model decomposes the problem of four-class flare               predicting M/X-class flares. However, waiting for more X-class flares
prediction into the sub-problems of six binary class predictions by         to use for training is a very slow process. It is worthwhile exploring
the one-versus-one approach, which are managed by six different             other data augmentation techniques, such as generative adversarial
binary CNN models respectively. We train these binary CNN models            networks (GANs; Kim et al. 2019) to gain more X-class AR samples.

                                                                                                                                                           Downloaded from https://academic.oup.com/mnras/article/507/3/3519/6328501 by guest on 07 December 2021
independently in the training phase, and then aggregate the output          We will also try other methods, such as including domain knowledge
predictions of binary models by the weighted voting strategy to             into the fully-connected network, to further improve the predictive
obtain a final prediction result in the testing phase. The main results     performance of the model. In addition, our current hybrid CNN
of this paper are summarized as follows. (1) To our knowledge, this         model can process time-series data, but it does not use the linear
is the first time that the CNN model in conjunction with one-versus-        relationship for analysis, which is a very important problem in the
one approach has been used in solar physics to make multiclass              process of flare prediction. The long short-term memory (LSTM)
forecasting for solar flares. (2) In the multiclass flare prediction, our   networks can compute the dependencies between time-series data
model achieves relatively high average scores of TSS = 0.703, 0.489,        and are often used to solve time-series prediction problem. Our next
0.432, and 0.436 for No-flare, C, M, and X class, respectively. In the      step work is to develop the model with CNNs and LSTM networks
binary class prediction, the TSS score of our model is 0.703 ± 0.070        to deal with time-series images and further improve the performance
for ≥C-class prediction and 0.739 ± 0.109 for ≥M-class prediction,          of flare prediction.
respectively. (3) According to the literature, this is the first attempt
to open the black-box CNN model to visualize the feature maps
                                                                            AC K N OW L E D G E M E N T S
for interpreting the flare prediction model. In addition, the results
of feature visualization demonstrate that the prediction performance        We wish to thank the anonymous referee for valuable suggestions
of our model is related to the gradient, the intensity, total intensity,    and comments that improved this work significantly. We thank the
and the range of the intensity in feature maps of deeper layer, which       Solar Dynamics Observatory/Helioseismic and Magnetic Imager
increase approximately with the increase of flare level. It is worth        (SDO/HMI) team members who have made contributions to the
noting that the median gradient concerned by both high-performance          SDO mission for their hard work. This work is supported by
model and low-performance model increases approximately as the              the National Natural Science Foundation of China (Grants No.
flare level increases. Exploring this further would be worthwhile in        11703009, No.11803010), the Natural Science Foundation of Jiangsu
follow-up work as it may lead to insights into improving the network.       Province, China (Grant No. BK20170566, No. BK20201199), and
Many previous studies have pointed out that solar flares are closely related to physical features extracted from near the polarity inversion line (PIL) in the photospheric magnetic field, and such features are widely used for flare forecasting (e.g. Cui et al. 2006; Georgoulis & Rust 2007; Schrijver 2009; Bobra & Couvidat 2015; Liu et al. 2017). Cui et al. (2006) find that solar flares are correlated with the maximum horizontal gradient and the length of the neutral line. Schrijver (2009) reviews that large flares tend to occur in ARs with strong magnetic field gradients and long PILs. Bobra & Couvidat (2015) utilize dozens of physical features, including magnetic field gradients, to predict solar flares. Georgoulis & Rust (2007) define the effective connected magnetic field intensity to measure the flaring potential of ARs. Liu et al. (2017) find that the flux near the PIL is one of the most important features for predicting solar flares. Therefore, combining these results with our finding, we speculate that our flare prediction model concentrates on regions with a strong magnetic field gradient, strong magnetic field intensity, high total magnetic field intensity, and a large variation range of the magnetic field intensity, which is in agreement with previous studies.
It can be found from Table 2 that the standard deviations of the metrics for the M and X classes are larger, implying that our model is not very stable in predicting M/X-class flares in the multiclass forecasting. This is because our model has some difficulty in distinguishing between M-class and X-class samples, mostly resulting from the shortage of X-class ARs and samples in solar cycle 24. Fortunately, the confusion matrices in Fig. 3 show that the misclassified X-class samples are mostly predicted as M class.
In future work, we will consider using other data augmentation techniques, such as generative adversarial networks (GANs; Kim et al. 2019), to obtain more X-class AR samples. We will also try other methods, such as incorporating domain knowledge into the fully connected network, to further improve the predictive performance of the model. In addition, our current hybrid CNN model can process time-series data, but it does not exploit the temporal relationships within those data, which are very important for flare prediction. Long short-term memory (LSTM) networks can capture the dependencies in time-series data and are widely used for time-series prediction problems. Our next step is to develop a model combining CNNs and LSTM networks to handle time-series images and further improve the performance of flare prediction.
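As a rough illustration of this CNN + LSTM direction, the following Keras sketch wraps a small CNN frame encoder in a TimeDistributed layer and feeds the resulting feature sequence to an LSTM. The layer sizes, sequence length, and input resolution are placeholders and do not describe the architecture we ultimately intend to build.

import tensorflow as tf
from tensorflow.keras import layers, models

def build_cnn_lstm(seq_len=6, height=128, width=128, n_classes=4):
    """Toy CNN + LSTM classifier for a sequence of LOS magnetogram patches."""
    frame_input = layers.Input(shape=(height, width, 1))
    x = layers.Conv2D(16, 3, activation="relu", padding="same")(frame_input)
    x = layers.MaxPooling2D(4)(x)
    x = layers.Conv2D(32, 3, activation="relu", padding="same")(x)
    x = layers.GlobalAveragePooling2D()(x)
    cnn = models.Model(frame_input, x, name="frame_encoder")

    seq_input = layers.Input(shape=(seq_len, height, width, 1))
    features = layers.TimeDistributed(cnn)(seq_input)   # per-frame CNN features
    hidden = layers.LSTM(64)(features)                  # temporal dependencies
    output = layers.Dense(n_classes, activation="softmax")(hidden)
    return models.Model(seq_input, output, name="cnn_lstm_flare_model")

model = build_cnn_lstm()
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()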
ACKNOWLEDGEMENTS

We wish to thank the anonymous referee for valuable suggestions and comments that improved this work significantly. We thank the Solar Dynamics Observatory/Helioseismic and Magnetic Imager (SDO/HMI) team members, who have made contributions to the SDO mission, for their hard work. This work is supported by the National Natural Science Foundation of China (Grants No. 11703009 and No. 11803010), the Natural Science Foundation of Jiangsu Province, China (Grants No. BK20170566 and No. BK20201199), and the Qing Lan Project.

DATA AVAILABILITY

The data underlying this article are available in the article and in its online supplementary material, and are also available at https://github.com/FlarePrediction/Repository/tree/papers/paper11/MNRAS V2. The data sets used in the paper are large (about 5.23 GB after compression), so we provide the download information there. You may need to register a BaiduNetDisk account and then download the data sets following the instructions in the readme file in the supplementary material.

REFERENCES

Abadi M. et al., 2016, preprint (arXiv:1603.04467)
Ahmed O. W., Qahwaji R., Colak T., Higgins P. A., Gallagher P. T., Bloomfield D. S., 2013, Sol. Phys., 283, 157
Barnes G. et al., 2016, ApJ, 829, 89
Bloomfield D. S., Higgins P. A., James McAteer R. T., Gallagher P. T., 2012, ApJ, 747, L41
Bobra M. G., Couvidat S., 2015, ApJ, 798, 135
Bobra M. G., Sun X., Hoeksema J. T., Turmon M., Liu Y., Hayashi K., Barnes G., Leka K. D., 2014, Sol. Phys., 289, 3549
Cinto T., Gradvohl A. L. S., Coelho G. P., Silva A. E. A. da, 2020, MNRAS, 495, 3332
Colak T., Qahwaji R., 2009, Space Weather, 7, S06001
Cui Y. M., Li R., Zhang L. Y., He Y., Wang H., 2006, Sol. Phys., 237, 45
Florios K., Kontogiannis I., Park S. H., Guerra J. A., Benvenuto F., Bloomfield D. S., Georgoulis M. K., 2018, Sol. Phys., 293, 28
Fürnkranz J., 2002, J. Mach. Learn. Res., 2, 721
Galar M., Fernandez A., Barrenechea E., Bustince H., Herrera F., 2011, Pattern Recognit., 44, 1761
Georgoulis M. K., Rust D. M., 2007, ApJ, 661, L109
Goodfellow I., Bengio Y., Courville A., 2016, Deep Learning. MIT Press, Cambridge. Available at: http://www.deeplearningbook.org/
Guerra J. A., Pulkkinen A., Uritsky V. M., 2015, Space Weather, 13, 626
Hanssen A. W., Kuipers W. J. A., 1965, Meded. Verh., 81, 2
Heidke P., 1926, Geogr. Ann., 8, 301
Huang X., Yu D. R., Hu Q. H., Wang H. N., Cui Y. M., 2010, Sol. Phys., 263, 175
Huang X., Zhang L., Wang H., Li L., 2013, A&A, 549, A127
Huang X., Wang H., Xu L., Liu J., Li R., Dai X., 2018, ApJ, 856, 7
Hüllermeier E., Vanderlooy S., 2010, Pattern Recognit., 43, 128
Kang S. K., Cho S., Kang P., 2015, Neurocomputing, 149, 677
Kim T. et al., 2019, Nature Astron., 3, 397
LeCun Y., Bottou L., Orr G. B., Müller K.-R., 1998a, in Montavon G., Orr G., Müller K.-R., eds, Neural Networks: Tricks of the Trade. Springer, Berlin, p. 9
LeCun Y., Bottou L., Bengio Y., Haffner P., 1998b, Proc. IEEE, 86, 2278
LeCun Y., Bengio Y., Hinton G., 2015, Nature, 521, 436
Leka K. D., Barnes G., Wagner E., 2018, J. Space Weather Space Clim., 8, A25
Li R., Zhu J., 2013, Res. Astron. Astrophys., 13, 1118
Li R., Cui Y., He H., Wang H., 2008, Adv. Space Res., 42, 1469
Liu C., Deng N., Wang J. T. L., Wang H. M., 2017, ApJ, 843, 104
Mason J. P., Hoeksema J. T., 2010, ApJ, 723, 634
Muranushi T., Shibayama T., Muranushi Y. H., Yuko H., Isobe H., Nemoto S., Komazaki K., Shibata K., 2015, Space Weather, 13, 778
Nishizuka N., Sugiura K., Kubo Y., Den M., Watari S., Ishii M., 2017, ApJ, 835, 156
Nishizuka N., Sugiura K., Kubo Y., Den M., Ishii M., 2018, ApJ, 858, 113
Park E., Moon Y. J., Shin S., Yi K., Lim D., Lee H., Shin G., 2018, ApJ, 869, 91
Pesnell W. D., Thompson B. J., Chamberlin P. C., 2012, Sol. Phys., 275, 3
Qahwaji R., Colak T., 2007, Sol. Phys., 241, 195
Rumelhart D. E., Hinton G. E., Williams R. J., 1986, Nature, 323, 533
Sadykov V. M., Kosovichev A. G., 2017, ApJ, 849, 148
Schou J. et al., 2012, Sol. Phys., 275, 229
Schrijver C. J., 2009, Adv. Space Res., 43, 739
Song H., Tan C., Jing J., Wang H., Yurchyshyn V., Abramenko V., 2009, Sol. Phys., 254, 101
Yosinski J., Clune J., Nguyen A., Fuchs T., Lipson H., 2015, in Deep Learning Workshop, Proc. 31st Int. Conf. on Machine Learning. ICML-15, Lille
Yuan Y., Shih F. Y., Jing J., Wang H. M., 2010, Res. Astron. Astrophys., 10, 785
Zeiler M. D., Fergus R., 2014, in European Conference on Computer Vision. Springer, Berlin, p. 818
Zhang Z. L., Luo X. G., Garcia S., Tang J. F., Herrera F., 2017, Knowl.-Based Syst., 125, 53
Zhang Z. L., Luo X. G., Gonzalez S., Garcia S., Herrera F., 2018a, Neurocomputing, 285, 176
Zhang Z. L., Luo X. G., Yu Y., Yuan B. W., Tang J. F., 2018b, Eng. Appl. Artif. Intel., 74, 43
Zheng Y. F., Li X. B., Wang X. S., 2019, ApJ, 885, 73

SUPPORTING INFORMATION

Supplementary data are available at MNRAS online.

Supplementary material.zip

Please note: Oxford University Press is not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the article.

APPENDIX A: FEATURE VISUALIZATION

In Fig. A1, we present the results of the feature visualization. In Fig. A2, we show the feature maps output by different convolutional layers of the low-performance model.
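The feature maps in Figs A1 and A2 are intermediate convolutional activations. A common way to extract such activations from a trained Keras CNN is sketched below; the model variable and layer names are placeholders rather than the actual network used in this paper.

import numpy as np
from tensorflow.keras import models

def extract_feature_maps(model, magnetogram, layer_names):
    """Return the activations of selected convolutional layers.

    model       : a trained tf.keras CNN (placeholder for the flare model).
    magnetogram : 2-D array holding one LOS magnetogram patch.
    layer_names : names of the convolutional layers to visualize.
    """
    outputs = [model.get_layer(name).output for name in layer_names]
    probe = models.Model(inputs=model.input, outputs=outputs)
    batch = magnetogram[np.newaxis, ..., np.newaxis].astype("float32")
    return probe.predict(batch)  # one (1, H, W, channels) array per requested layer

# Hypothetical usage:
# activations = extract_feature_maps(cnn, sample, ["conv2d", "conv2d_1"])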
Figure A1. The results of feature visualization. Panels (a), (c), (e), and (g) show the logarithmic values of the median gradient, median intensity, total intensity, and range of the intensity, respectively, for the feature maps from the last convolutional layer in the high-performance model and in the low-performance model, using 10 different AR samples covering the four classes. Panels (b), (d), (f), and (h) show the corresponding values of the median gradient, median intensity, total intensity, and range of the intensity for the feature maps from the last convolutional layer in the high-performance model.
Figure A2. The feature maps output by different convolutional layers of the low-performance model. The raw input corresponding to these feature maps is the LOS magnetogram sample of X-class AR 11890 observed at 19:36 UT on 2013 November 7. Panels (a)–(c) show the feature maps output by the first, second, and fifth convolutional layers of the low-performance model, respectively, and panel (d) shows the gradient images for the feature maps in (c). The feature maps outlined with red boxes show image features that differ from the others. The values on the colour bars of (a)–(d) can be seen clearly by zooming in.
This paper has been typeset from a TEX/LATEX file prepared by the author.