SuperAI on Supercomputers - Haohuan Fu High Performance Geo-Computing (HPGC) Group

Page created by Jeffrey Adams
 
CONTINUE READING
SuperAI on Supercomputers - Haohuan Fu High Performance Geo-Computing (HPGC) Group
SuperAI on Supercomputers

                   Haohuan Fu
            haohuan@tsinghua.edu.cn
   High Performance Geo-Computing (HPGC) Group
              http://www.thuhpgc.org
SuperAI on Supercomputers - Haohuan Fu High Performance Geo-Computing (HPGC) Group
Earth Science and Supercomputers

                                                 nCreate a digital earth,
                                                 so as to:
                                                     p   simulate
                                                     p   analyze
                                                     p   understand
                                                     p   predict and mitigate

           figure credit: Earth Simulator, JPN
SuperAI on Supercomputers - Haohuan Fu High Performance Geo-Computing (HPGC) Group
Two Major Functions

Simulation        Data Analysis

                                  3
SuperAI on Supercomputers - Haohuan Fu High Performance Geo-Computing (HPGC) Group
Two Major Functions

To design highly efficient and    To develop intelligent data
  highly scalable simulation       mining methods for the
         applications            analysis of BIG scientific DATA

                                                                   4
SuperAI on Supercomputers - Haohuan Fu High Performance Geo-Computing (HPGC) Group
Two Major Functions

To design highly efficient and    To develop intelligent data
  highly scalable simulation       mining methods for the
         applications            analysis of BIG scientific DATA

                                                                   5
SuperAI on Supercomputers - Haohuan Fu High Performance Geo-Computing (HPGC) Group
2015-now: The CESM Project on Sunway TaihuLight

      CAM5.0                               POP2.0

                         CPL7

     CLM4.0                             CICE4.0

                                                                       • Four component models, millions lines of code
                                                                       • Large-scale run on Sunway TaihuLight
                      CESM1.2.0                                               • 24,000 MPI processes
                                                                              • Over one million cores
                                                                       • 10-20x speedup for kernels
Tsinghua + BNU 30+ Professors and Students                             • 2-3x speedup for the entire model

                  Haohuan Fu, Junfeng Liao, Wei Xue, Lanning Wang and et al., “Refactoring and Optimizing the Community
                  Atmosphere Model (CAM) on the Sunway TaihuLight Supercomputer”, The International Conference for High   6
                  Performance Computing, Networking, Storage and Analysis (SC), Salt Lake City, Utah, US, 2016
SuperAI on Supercomputers - Haohuan Fu High Performance Geo-Computing (HPGC) Group
2017: Non-Linear Earthquake Simulation
        Dynamic Rupture Source Generator
             (Based on CG-FDM)
Fault Stress        Friction         Wave Eqn
   Init            Law Ctrl            Solver
                                                  3D Vel/Den Model

        Source Partitioner               3D Model Interpolator

      Seismic Wave Propagation                  Snapshot/Sesimo
        (Based on AWP-ODC)                         Recorder
    Velocity
                     Stress Update
    Update
Next Timestep                                      Restart
       Stress
                        Source                    Controller
  Adjustment For
      Plasticity       Injection

  LZ4 Compression, Group I/O, Balanced I/O Forwarding
SuperAI on Supercomputers - Haohuan Fu High Performance Geo-Computing (HPGC) Group
2018: Simulating the Wenchuan Earthquake with
         Accurate Surface Topography
                (a)                       (b)
                          Sunway (2017)                              Sunway (2018)
                          Tangshan                                   Wenchuan
  Surface      5km
  Topography

                      z                                          '                   &            −T-
                                                y
                                                                                            −T.
                                                    ! = !($, &, ')
                                                    ) = )($, &, ')
                                                    * = *($, &, ')                                      T.

                                                                                         /& = 0
                                                                                                  T-
                                                x                                    $
                                                       (1)                                        (2)
SuperAI on Supercomputers - Haohuan Fu High Performance Geo-Computing (HPGC) Group
Two Major Functions

To design highly efficient and    To develop intelligent data
  highly scalable simulation       mining methods for the
         applications            analysis of BIG scientific DATA

                                                                   9
SuperAI on Supercomputers - Haohuan Fu High Performance Geo-Computing (HPGC) Group
Data Challenge for Climate Scientists

                 • Over 100 PB just for
                   climate change studies

                 • A Challenge but also
                   a huge opportunity

                   Climate Data Challenges in the 21st Century,
                   Overpeck et al., Science , 331 (6018): 700-702
Remote sensing data

  Satellite                           Google earth images   High-resolution image
                                      1984-2018

  UAV                                 UAV image

  - Recording what is happening on the earth for decades
  - What issues can we solve based on these data?                                   11
Look Ahead: Data-Driven Modeling and Prediction

                                       city
                        migration
                                                green land
                         of birds

            ecology
           protection                                    waterbody
              zone

          Land                       Remote
          Cover                      Sensing                 sea-level
         Mapping                    Data Sets
Deep learning in computer vision
                                    Image Classification
                                               Cat
                                     Object Detection

                                   Semantic Segmentation
                                     Sky              Trees

                                                Cat

                                     Grasses
                                                        13
Deep learning + remote sensing data
                                                                     Challenges:
                                                                     - Few public labeled
                                                                       datasets in RS domain
                                                                     - Difference between RS
                                                                       images and CV images
                                                                     - Real-world applications
                                                                       (small size, imbalance)
    More accurate       More intelligent         More reliable
 land cover mapping   oil palm monitoring   urban land use mapping
                                                                     Major issues:
                                                                     - How to select and
                                                                       build the RS datasets?
                                                                     - How to select the
                                                                       most suitable method?
                                                                     - How to improve the
                                                                       method for our tasks?

                                                                                      14
Case 1: More Accurate Land Cover Map
First 30 m resolution global land cover maps

Running SVM on 8900 Landsat scenes on a supercomputer, achieved result in one day.   Gong et al., 2013, IJRS
FROM-GLC (Accuracy: 63.72%)

                    Gong et al., 2013. IJRS
A Starting Point: Direct Application of Stacked AutoEncoder

            Landsat Image                RF Image             SAE Image

                             RF               SVM          ANN                SAE
    Overall Accuracy        76.03%           77.74%       77.86%             78.99%
     Mapping Time       33.605 ± 0.183     16344.188   4.014 ± 0.003      13.250 ± 0.042
                                                                                    18
Integrating Google Earth Image

                            Spatial features from 0.5-m images
   Google Earth

                                                                       Predicted
                                                                       Result 1    Land cover map

                                                                       Predicted
                                                                 SVM   Result 2
                                          CNN

                           Spectral features from 30-m images
                                                                       Predicted
                                                                 SVM
                       Surface reflectance, NDVI, DOY                  Result 3
                       Longitude and Latitude
                       Elevation and Slope
                       Surface reflectance of max NDVI, DOY            Predicted
                                                                 RF
                                                                       Result 4
                  24
  Landsat + DEM

                                                                                         19
Integrating Google Earth Image

                                 20
Integrating Google Earth Image

                                 21
Integrating Google Earth Image

                                 22
Integrating Google Earth Image

                                    Slight increase using      Great increase using
                                 30-meter resolution images   Multi-resolution images

                                                                           23
Integrating Google Earth Image
  Google Earth image       Results of RF       Results of SVM         Results of Ours     Legend

                                                                                             Impervious

                                                                                             Forest

                                                                                             Cropland

                                                                                             Water

                                                                                             Grassland

                                                                                             Bare land

                                                                                             Shrubland

                                                                                             Cloud

 Fewer confusions among various land cover types in the results of our proposed method.       24
30m to 10m

     Gong P., et al., 2019. Stable classification with limited sample: transferring a 30-m resolution sample set
     collected in 2015 to mapping 10-m resolution global land cover in 2017,Science Bulletin.                 25
10m to 3m

            26
削弱部分时相或局部区域厚云遮盖影像

   City of Changsha
Case 2: Detection of Oil Palm Trees
Case 2: Intelligent monitoring of oil palm trees

  Mapping of oil palm trees using    Detection of oil palm trees using    Detection and classification of
  high-resolution satellite images   high-resolution satellite images    oil palm trees using UAV images

                                                                                                 29
CNN based large-scale oil palm tree detection

                                                30
Semantic segmentation based oil palm plantation extraction
  • We proposed the first semantic segmentation based approach for large-scale oil palm plantation extraction from
    QuickBird Satellite images in 0.6-m spatial resolution.
  • The enhanced pixel-wise dataset contains 36,000 images in 256×256 pixels located in southern Malaysia, including
    four categories of samples: oil palm, other vegetation, buildings and the others.
  • We present an end-to-end deep convolutional neural network based on the SegNet model, which has a symmetrical
    encoder-decoder architecture based on the convolutional layers of the VGG-16 model.
  • Our proposed approach combines the SegNet model with the Conditional Random Fields and integrates the post-
    processing results with the outputs of SegNet to improve the localization of boundaries.

                                                                                               Fully connected
                                                                                                      CRF

                                                     conv+BN+ReLU   pooling
                  Input                              upsampling     Softmax
                                                                              Decoder
                                                                                                                 Segmentation
                                   Encoder
                              conv-BN-ReLU-pooling                    upsamling-conv-BN-ReLU

                                                                                                                                31
CNN based large-scale oil palm tree detection
                                           Multi-level CNN training and optimization
•   The first CNN is used for land cover classification to locate the oil palm plantation area, including three types of samples (oil palm plantation
    area, other vegetation / bare land, and impervious/cloud).
•   The second CNN is used for object classification to identify the oil palms, including for types of samples (oil palm, background, other
    vegetation / bare land, and impervious/cloud).
•   The two CNNs are trained and optimized independently based on 17,000 training samples and 3000 validation samples.

                        CNN-1: Land cover classification                                     CNN-2: Object classification
                                                                                                                                     32
Transfer Learning

  One Source Domain to One Target Domain   One Source Domain to Multi Target Domain
Case 3: Urban Land Use Map
Case 3: More accurate urban land use map
                                                     GIS Map data                   Satellite imagery                Predicted buildings
• A building extraction method based on
  the U-Net semantic segmentation model.

• A data fusion method combining the
  multispectral satellite images with the
  public GIS map datasets.

• This work won the fifth place in the
  Building Extraction Track of DeepGlobe
  - CVPR 2018 Satellite Challenge.

                                                                                   16×16×512
                                                                       32×32×256                  32×32×256
                                                           64×64×128                                          64×64×128

                                                  128×128×64      Convolution + Batch Normalization + Activation          128×128×64

                                                                   Upsampling       Max-pooling      Concatenation
                                            256×256×32                                                                       256×256×32
                                                                                                                                     35 256×256×1
Case 3: More accurate urban land use map

                                                   Deep Convolutional
                                                    Neural Network
                    Rescaled images

                                                   Deep Convolutional
 Satellite images                                   Neural Network

                     Sliced images

                                                   Deep Convolutional
                                                    Neural Network
                    Rescaled images

                                                   Deep Convolutional
                                                    Neural Network
 Satellite images
 GIS Map images      Sliced images    Augmented   Semantic Segmentation   Probability   Post-processing   Predicted
                                        images                               maps        Ensembling       buildings

                                                                                                          36
Case 3: More accurate urban land use map
                  Results of the proposed method                  • Some examples of the building extraction results.
     Index        Vegas      Paris    Shanghai     Khartoum
      TP          27526      3097       11323           3495
   Precision      0.9441    0.8459      0.7470          0.6398
    Recall        0.8437    0.6825      0.5396          0.4694
   F1-score       0.8911    0.7555      0.6266          0.5415
               F1-scores obtained after each strategy
     Method        Vegas      Paris    Shanghai     Khartoum
                                                                             Vegas                     Paris
     Baseline      0.8611    0.6774      0.5342         0.4544
    Enhanced       0.8730    0.7181      0.5471         0.4935
   Postprocess     0.8866    0.7383      0.5897         0.5210
    Add-map        0.8911    0.7555      0.6266         0.5415

  • Our method improves the F1-scores by 3%-9% compared
    with the baseline, depending on the situation of each city.
  • Combining the GIS Map datasets with satellite images
    improves the results for all cities, even for places with               Shanghai                 Khartoum
    few building information on the map.                                                                       37
AI for City

         n    Intelligent City Brain:
              p Detection
              p Understanding
              p Modeling
              p Planning
n    Li, Weijia and Dong, Runmin and Fu, Haohuan and Wang, Jie and Yu, Le and Gong, Peng, "Integrating Google Earth imagery with Landsat
    data to improve 30-m resolution land cover mapping", Remote Sensing of Environment, vol. 237, pp. 111563, 2020.

n     Zheng, Juepeng and Li, Weijia and Xia, Maocai and Dong, Runmin and Fu, Haohuan and Yuan, Shuai, "Large-Scale Oil Palm Tree Detection
    from High-Resolution Remote Sensing Images Using Faster-RCNN", Proc. IEEE International Geoscience and Remote Sensing Symposium
    (IGARSS), pp. 1422-1425, 2019.

n     Dong, Runmin and Li, Weijia and Fu, Haohuan and Gan, Lin and Yu, Le and Zheng, Juepeng and Xia, Maocai, "Oil palm plantation mapping
    from high-resolution remote sensing images using deep learning", to appear in International Journal of Remote Sensing, pp. 1-25, 2019.

n     Xia, Maocai and Li, Weijia and Fu, Haohuan and Yu, Le and Dong, Runmin and Zheng, Juepeng, "Fast and robust detection of oil palm trees
    using high-resolution remote sensing images", Proc. Automatic Target Recognition XXIX, vol. 10988, pp. 109880C, 2019.

n     Dong, Runmin and Li, Weijia and Fu, Haohuan and Xia, Maocai and Zheng, Juepeng and Yu, Le, "Semantic segmentation based large-scale
    oil palm plantation detection using high-resolution satellite images", Proc. Automatic Target Recognition XXIX, vol. 10988, pp. 109880D,
    2019.

n     Li, Weijia and He, Conghui and Fu, Haohuan and Zheng, Juepeng and Dong, Runmin and Xia, Maocai and Yu, Le and Luk, Wayne, "A Real-
    Time Tree Crown Detection Approach for Large-Scale Remote Sensing Images on FPGAs", Remote Sensing, vol. 11, no. 9, pp. 1025, 2019.

n     Gong, Peng and Liu, Han and Zhang, Meinan and Li, Congcong and Wang, Jie and Huang, Huabing and Clinton, Nicholas and Ji, Luyan and
    Li, Wenyu and Bai, Yuqi and others, "Stable classification with limited sample: transferring a 30-m resolution sample set collected in 2015 to
    mapping 10-m resolution global land cover in 2017", Science Bulletin, vol. 64, no. 6, pp. 370-373, 2019.

n     Li, Weijia and He, Conghui and Fang, Jiarui and Zheng, Juepeng and Fu, Haohuan and Yu, Le, "Semantic Segmentation-Based Building
    Footprint Extraction Using Very High-Resolution Satellite Images and Multi-Source GIS Data", Remote Sensing, vol. 11, no. 4, pp. 403, 2019.

n     Li, Weijia and Dong, Runmin and Fu, Haohuan and Yu, Le, “Large-scale oil palm tree detection from high-resolution satellite images using
    two-stage convolutional neural networks”, Remote Sensing, vol. 11, no. 1, pp. 11, 2019.
You can also read