THE EXTREMES PREDICTION USE CASE - HIPEAC
←
→
Page content transcription
If your browser does not render page correctly, please read the page content below
TECHNICAL DIMENSION
The TransContinuum Initiative: exploiting the full range of digital technologies for the prediction
of weather and climate extremes
The extremes prediction
use case
By PETER BAUER, MARC DURANTON and MICHAEL MALMS
Dealing responsibly with extreme events requires not only a drastic change in the ways society
addresses its energy and population crises. It also requires a new capability for using present and
future information on the Earth system to reliably predict the occurrence and impact of such
events. A breakthrough in Europe’s predictive capability can be made manifest through science
and technology solutions delivering as yet unseen levels of predictive reliability with real value
for society.
The “TransContinuum Initiative”, initiated by ETP4HPC, ECSO, BDVA, 5GIA, EU MATHS IN,
CLAIRE, AIOTI and HiPEAC, offers unprecedented opportunities to overcome the technological
limitations currently hampering progress in this area. Beyond providing this use case with better
technology solutions, the initiative offers a foundation for an Earth system – computational
science collaboration that will eventually lead to science and technology being truly co-developed,
and thus to sustainable benefit for one of today’s most relevant applications and for European
technology providers.
Key insights Key recommendations
• The reliable prediction of weather • Future systems will require at least • Interoperable machine learning tool-
and climate extremes represents an 100 times more computational power kits facilitating the portability of data
extreme-scale computing and data for producing reliable predictions of processing in the cloud and workflow
handling challenge. Earth-system extremes with short lead management options in the cloud for
times. It will require implementing an orchestrating the rather complex data
• The European and global Earth system
Earth-system digital twin – a cyber- assimilation and Earth-system simulation
science community has established a
physical entanglement. A solution is a workloads should be developed
solid network of technology solutions
layered federation with fewer elements
for its complex and time critical work- • Several domains need to be simultane-
near the heavy workloads and more
flows; however, they will not scale to ously promoted:
elements at the observational data
meet future requirements. • for software: interactive workflows,
pre-processing front-end and the data
mathematical methods and algorithms,
• Climate change and the urgency for analytics post-processing back-end.
high-productivity programming envi-
societies to adapt to future extremes
• The extensive use of edge computing ronments, performance models and
require new technology solutions that
needs low-cost yet high-performance optimization tools.
provide breakthroughs for how we
computing facilities and to overcome • for hardware: heterogeneous proces-
observe and simulate the Earth system.
the data-transfer bottlenecks between sor configurations through accele-
• The cross-disciplinary “TransContinuum the computing and data intensive parts rators and data-flow engines, high-
Initiative (TCI)” promises true co-design of the digital twin and downstream bandwidth memory, deep memory
opportunities exploiting the entire applications hierarchies for I/O and storage, super-
breadth of digital technologies for the fast interconnects and configurable
benefit of skilful extremes prediction computing including the supporting
capabilities in Europe. system software stack.
44THE EXTREMES PREDICTION USE CASE
The use case which is crucially important for decision- is the ultimate goal. The loop shown in the
Natural hazards represent some of the making on tight schedules [4]. figure would be open as humans are continu-
most important socio-economic chal- ally influencing nature, for example through
lenges our society faces in the decades to Europe currently leads the world in CO2 emissions at global scale or irrigation in
come. Natural hazards have caused over medium-range weather prediction and is agriculture at small scales. But natural vari-
1 million fatalities and over 3 trillion Euros also a major competence centre for global ability can also affect the system and needs
of economic loss worldwide in the last climate prediction [5,6]. For decades, HPC to be replicated, for example extreme events
twenty years, and this trend is accelerat- has been a key enabler for this track-record such as volcanic eruptions injecting large
ing as a result of the drastic rise in demand and has led to weather and climate predic- amounts of ash into the atmosphere leading
for resources and population growth. The tion becoming one of the leading use cases to a change in radiative forcing.
combination of likelihood and impact for computing and data handling at large
makes extremes and climate action failure scale. The apparent change in computing Beyond advancing present-day Earth-
the leading threats for our society [1,2,3]. architectures has stimulated a wide range system simulation and observation capa-
of programmes that aim to prepare the bilities, the digital twin would close the gap
The “Earth system extremes” use case operational Earth-system monitoring and between Earth system science and socio-
relies on very complex numerical Earth prediction infrastructures for future tech- economic sectors in which it might be
system simulation models that ingest nologies [7]. These programmes increas- applied and facilitate a high level of flexible
hundreds of millions of observations per ingly realize that it will take more than intervention by non-science/technology
day to help improve the formulation of progress in HPC to fulfil future extremes experts. Hiding the complexity of the digi-
the physical process representation as well prediction targets. This is where the need tal TransContinuum from the user is key
as produce the initial conditions used for to exploit the opportunities offered by the to success for user-driven digital twins [9].
launching predictions of the future. This entire digital TransContinuum [8] comes
logic applies equally to weather timescales into play, and where new ways of co-devel- As shown in Fig. 1, the TransContin-
of days or weeks and climate timescales oping Earth system and computational uum embeds smart sensors and IoT within
of decades. These systems are also run as science need to be found. smart networks supplying data to extreme-
ensembles whereby each ensemble member scale computing and big data handling
represents both initial condition and model Implementing an Earth-system digital platforms. These platforms exploit cloud
uncertainty. The result is a prediction of twin – called the cyber-physical entangle- services across workflows giving access
state, but also a prediction of uncertainty, ment in Fig. 1 – through these technologies to additional distributed computing and
Figure 1: Main elements of digital continuum and relevance for extremes prediction use case including readiness of both application and technologies
(green = established in production, but not optimal in performance; yellow = research, not in production mode; red = novel and unexplored).
45TECHNICAL DIMENSION
data analytics capabilities. Mathematical observations and perform on-the-fly data the desired quality of service for different
methods and algorithms as well as artificial pre-processing. data flows. The extremes prediction use
intelligence and machine learning provide case is particularly HPC- and big data-
the glue between the elements of the Trans- However, observations from commod- driven and creates a significant footprint for
Continuum. Cybersecurity acts as a shield ity devices deployed on e.g. phones, car the supporting communication networks.
for the entire loop. Such a methodological sensors and specialized industrial devices Pre-processing in edge devices can already
framework based on extremes prediction monitoring agriculture, renewable energy offload some of this burden as this use case
also offers gains for other domains. sources and infrastructures also offer data requires near real-time data availability at
that is currently inaccessible to operational the HPC facilities. The extensive use of edge
Challenges along the TransContinuum services and that can fill vast observational computing needs low-cost yet high-perfor-
Assessing the challenges for the gaps in less developed countries, offers mance computing facilities that will interact
extremes prediction use case with respect much finer resolution in densely populated with end devices as well as with one another.
to the TransContinuum requires a closer areas and in regions of significant socio-
look at the production workflow in today’s economic interest [11]. Beyond techni- Given the decade-long evolution of
systems and how they are likely to evolve cal challenges, a generic approach for fast interconnected data collection, transmis-
in the future. access to commercial data for public use sion, pre-processing, centralized HPC
via public-private partnerships is required, and post-processing, and dissemination
Smart sensors and Internet of Things which is already on the agenda of inter- in weather prediction workflows there is
At present, Earth-system observa- governmental organisations like the World little room for optimization of present-day
tions that are used operationally already Meteorological Organization. This element systems. However, workloads, data volumes
comprise hundreds of millions of observa- of the TransContinuum is only developing and data diversity grow much faster than
tions collected daily to monitor the atmos- now and has huge potential. in the past and the demand for providing
phere, oceans, cryosphere, biosphere and more skilful predictions is more urgent than
the solid Earth, the largest data volumes Smart networks and services ever before. Therefore, network and service
being provided close to a hundred satellite Collecting and transferring massive solutions need to orchestrate and dynami-
instruments [10]. This volume is expected amounts of diverse data from devices scat- cally manage data routing and computing
to increase by several orders of magni- tered in multiple areas via distributed pre- resources with much more flexibility and
tude in the next decade, bringing with it a processing centres to centralized digital scalability. Whether both performance and
need to ingest such observations in digital platforms and computing centres requires cost-effectiveness need dedicated solutions
twin systems within hours. Smart sensor a suitable network infrastructure. Evolved or whether it can be achieved by a mixture
technology is highly relevant for satellite networks and services should offer secure of commercial and institutional systems is
instruments that can perform targeted and trustable solutions that will support an as yet unsolved question.
Figure 2: a) The volume of worldwide climate and daily weather forecast data is expanding rapidly, creating challenges for both physical archiving
and sharing, as well as for ease of access, particularly for non-experts. The figure shows the projected increase in global climate data holdings
for climate models, remotely sensed data, and in situ instrumental/proxy data (from [12]). b) Daily data volumes produced by the 50-member
ECMWF weather forecast ensemble at today’s spatial resolution of 18 km, the next upgrade to 9 km and the expected configuration in 2025.
46THE EXTREMES PREDICTION USE CASE
Big data and data analytics water, food, energy, health, finance and the decomposition of workflows such that
The volume of Earth-system observa- civil protection. Machine learning-based the production chain can exploit the full
tion and simulation data in weather predic- quality control and error correction, infor- processing and analysis potential of the
tion already exceeds hundreds of TBytes/ mation extraction and data compression as TransContinuum. A likely solution is a
day while climate projection programmes well as processing acceleration will be key. layered federation with fewer elements near
produce PByte-sized archives, which take the heavy workloads and more elements at
climate scientists years to analyse (see High-performance and cloud computing the observational data pre-processing front-
Fig. 2) [12]. The extrapolation of data Today, experimental and operational end and the data analytics post-processing
volumes and production rates to advanced Earth-system simulations use petascale back-end. This distribution also needs to
Earth-system digital twins will prohibit the HPC infrastructures, and the expectation be driven by the urgency of product deliv-
effective and timely information extraction is that future systems will require at least ery as, particularly for extremes prediction,
that is critical for timely actions to antici- 100 times more computational power for processing speed trumps all other consid-
pate and mitigate the effects of extremes. producing reliable predictions of Earth- erations.
Both simulations and observations need to system extremes with lead times that
be generated and combined in the Earth- are sufficient for society and industry to Adding flexibility without sacrificing
system digital twin within minutes to hours respond [13]. This need translates into a performance through the TransContin-
of time-critical workflows towards near- new software paradigm to gain full and uum is where most of the strategic recom-
real-time decision-making. sustainable access to low-energy process- mendations for research and innovation
ing capabilities, dense memory hierar- compiled by the European Technology
Overcoming the data-transfer bottle- chies as well as post-processing and data Platform for HPC (ETP4HPC) [15], the Big
necks between the computing and data dissemination pipelines that are optimally Data Value Association (BDVA) [16] and
intensive parts of the digital twin and configured across centralized and cloud- High Performance Embedded Architec-
downstream applications is crucial, and based facilities. European leadership in this ture and Compilation (HiPEAC) [17] are
future workflow management needs to software domain offers a unique opportu- positioned. For software, these are interac-
make such applications an integral part of nity to turn European investment in HPC tive workflows, mathematical methods and
the observation and prediction infrastruc- digital technology into real value. algorithms, high-productivity program-
ture. Powerful data analytics technology ming environments, performance models
and methodologies offer the only option A big challenge is the definition of the and optimization tools. For hardware,
to make the effective conversion from evolution of today’s strongly centralized heterogeneous processor configurations
PBytes/day of raw data to Mbytes/day of workload and data lifecycle management through accelerators and dataflow engines,
usable information. This conversion must towards a more distributed system – still high-bandwidth memory, deep memory
be tailored to individual sectors needing to being data-centric to keep big data near hierarchies for I/O and storage, super-fast
prepare and respond to extremes, namely extreme-scale computing – that allows interconnects and configurable computing
Figure 3: Main components of extremes prediction workflow, their alignment with edge, high-performance and cloud computing elements of the
TransContinuum, and key contributions of machine learning to workflow enhancements.
47TECHNICAL DIMENSION
including the supporting system software data analytics. Here, the biggest gaps are tions will not scale to meet future require-
stack are part of the agenda [14]. interoperable machine learning toolkits ments. True near-sensor, edge processing
facilitating the portability of data process- does not yet exist and IoT devices are not
For the prediction of extremes, the main ing in the cloud and workflow management yet used; smart and configurable networks
computing tasks are currently performed options in the cloud for orchestrating the as well as flexible HPC and cloud solutions
on centralized, dedicated systems where rather complex data assimilation and Earth- are not available; and even for the extreme-
the main data storage facilities also lie. system simulation workloads (see Fig. 3). scale computing and data handling tasks,
The observational input data flow crosses the present algorithmic and programming
all levels of edge, cloud and centralized Conclusions frameworks do not allow the exploitation of
computing during which selected pre- Predicting environmental extremes the real potential of emerging technologies.
processing steps are exercised, for example, clearly has an extreme-scale computing
for satellite data collection and pre-process- and data footprint. It has reached a point Building on past European science
ing at distributed receiving stations of the where only through significant investment and technology programmes such as the
ground segment, their further dissemina- in all parts of the TransContinuum can Future and Emerging Technology for HPC
tion and processing by space agencies and the future needs of our society for reliable (FET-HPC) [18] under Horizon 2020, the
meteorological centres, and assimilation information on the present and future of recent implementation of the EuroHPC
into models at prediction centres. Similar our planet be fulfilled. These investments Joint Undertaking, and the Copernicus
data flow mechanisms exist for ground- must go hand in hand with investments in Programme, Europe has actually rec-
based meteorological observation taken basic Earth system science, to better under- ognized the present gaps and needs for
from networks, stations and (few) commer- stand key processes and their interaction, serious investments. This has motivated the
cial providers. and for service provision ensuring that ambitious Destination Earth action [19]. In
Earth system data is translated into useful support of the Green Deal and the European
Increasingly however, the back-end of information for society. strategy for data [20], Destination Earth
model output data processing interfacing promises to bring the Earth system science,
with commercial service providers is being While the present digital infrastructures digital TransContinuum and service devel-
placed into the cloud, which also offers support weather and climate prediction well opment strands together assigning the
better access to machine learning-based enough, the chosen domain-specific solu- highest priority to extremes prediction and
48THE EXTREMES PREDICTION USE CASE
climate adaptation. The programme has
adapted the concept of digital twins of the
Earth system for this purpose as promoted
by HiPEAC and ETP4HPC and its strong
digital infrastructure contribution has huge
potential for achieving European technol-
ogy leadership by addressing one of the
principal challenges for today’s society.
References
[1] UNISDR, “Economic losses, poverty & disasters, 1998-
2017”, (UNISDR, Geneva, 2018)
[2] AON, “Weather, climate & catastrophe insight”, Annual
report, (AON, Chicago, 2020)
[3] WEF, “The global risks report 2020”, Insight report
15th edition, (World Economic Forum, Geneva, 2020)
[4] P. Bauer, A. Thorpe, and G. Brunet (2015). The quiet
revolution of numerical weather prediction. Nature,
525(7567), 47-55.
[5] R. A. Kerr (2012). One Sandy forecast a bigger winner
than others. Science, 736-737.
[6] P. Voosen (2019). New climate models predict a
warming surge. Science, 16.
[7] P. Neumann, J. Biercamp, (2019, September).
ESiWACE: On European Infrastructure Efforts for
Weather and Climate Modeling at Exascale. In 2019
15th International Conference on eScience (eScience)
(pp. 498-501). IEEE.
[8] “TransContinuum Initiative (TCI): our vision”, https://
www.etp4hpc.eu/transcontinuum-initiative.html
[9] P. Bauer, B. Stevens, & W. Hazeleger (2020). A digital
twin of Earth for the green transition. Nature Climate
Change, submitted.
[10] W. Balogh, and K. Toshiyuki. “The World
Meteorological Organization and Space-Based
Observations for Weather, Climate, Water and Related
Environmental Services.” In Space Capacity Building in
the XXI Century, pp. 223-232. Springer, Cham, 2020.
[11] J. M. Talavera, L. E. Tobón, J. A. Gómez,
M. A. Culman, J. M. Aranda, D. T. Parra, ... &
L. E. Garreta, (2017). Review of IoT applications in
agro-industrial and environmental fields. Computers and
Electronics in Agriculture, 142, 283-297.
[12] J. T. Overpeck, G. A. Meehl, S. Bony &
D. R. Easterling (2011). Climate data challenges in the
21st century. Science, 331(6018), 700-702.
This document is part of the HiPEAC
[13] T. C. Schulthess, P. Bauer, N. Wedi, O. Fuhrer,
T. Hoefler & C. Schär (2018). Reflecting on the goal Peter Bauer is the Deputy Director Vision available at hipeac.net/vision.
and baseline for exascale computing: a roadmap based V.1, January 2021.
on weather and climate simulations. Computing in
of the Research Department at the
European Centre for Medium-Range Cite as: P. Bauer, M. Duranton, and M. Malms.
Science & Engineering, 21(1), 30-41.
The extremes prediction use case.
[14] P. Bauer, P. D. Dueben, T. Hoefler, T. Quintino, Weather Forecasts, Reading, UK.
T. Schulthess & N. P. Wedi (2020), The digital In M. Duranton et al., editors, HiPEAC Vision
revolution of Earth-system science. Nature 2021, pages 44-49, Jan 2021.
Computational Science, submitted. The HiPEAC project has received funding from
[15] ETP4HPC, https://www.etp4hpc.eu/pujades/files/ Marc Duranton is researcher at the the European Union’s Horizon 2020
ETP4HPC_SRA4_2020_web(1).pdf Research and Technology Department research and innovation programme under
[16] BDVA, https://www.bdva.eu/BDVA_SRIA_v4 of CEA (Alternative energies and Atomic grant agreement number 871174.
[17] HiPEAC, https://www.hipeac.net/vision/#/latest/
[18] “European HPC technology research projects”, https:// Energy Commission), France and the © HiPEAC 2021
www.scientific-computing.com/white-paper/european- coordinator of the HiPEAC Vision 2021.
hpc-technology-research-projects
[19] “Destination Earth (DestinE)”, https://ec.europa.eu/
digital-single-market/en/destination-earth-destine
[20] “A European Strategy for Data”, https://ec.europa.eu/ Michael Malms is the SRA lead editor
digital-single-market/en/european-strategy-data within the office team of ETP4HPC and
main initiator and coordinator of the
TransContinuum Initiative.
49You can also read