Note No. 26/2009
Oceanography
Oslo, December 1, 2009

The TOPAZ system at met.no under MERSEA¹

Pål Erik Isachsen, Harald Engedahl, Ann Kristin Sperrevik,
Bruce Hackett and Arne Melsom

 1 This document contains hyperlinks that are active when viewed with properly enabled software.
Contents

1   Introduction
2   Overview of the TOPAZ system
3   Implementation at NERSC
4   Modifications made
    4.1    Removal of manual menu-based inputs
    4.2    File transfer
    4.3    HPC queue scheduling
    4.4    Batch control by SMS
    4.5    OPeNDAP server
    4.6    Daily updates of deterministic forecast
    4.7    Transition from SSM/I to OSI-SAF ice fields
    4.8    SVN version control
5   Today’s TOPAZ system at met.no
    5.1    The TOPAZ cycle
    5.2    HPC setup
    5.3    SMS
    5.4    OPeNDAP
6   Experience from an initial testing period
    6.1    Model fields
    6.2    HPC performance
    6.3    SMS
    6.4    THREDDS
7   Future work
    7.1    Argo In situ observations
    7.2    met.no uses of TOPAZ results
A   The HPC directory structure
B   Initialization files

List of Figures

    1      The original TOPAZ cycle
    2      The main menu in the NERSC implementation
    3      The TOPAZ week at met.no. ’INIT’ = Initialization, ’ANA’ = Analysis, ’PAF’
           = Prepare Atmospheric Forcing, ’F07/F14’ = Forecast07/14, ’GFP’ = Generate
           Forecast Products.

1 Introduction
The TOPAZ system (“Towards an Operational Prediction system for the North Atlantic and
European coastal Zones”) consists of a numerical ocean-sea ice model of the North Atlantic and
Arctic oceans which utilizes an Ensemble Kalman Filter (EnKF) to assimilate ocean and ice
observations into the model. TOPAZ has been developed at the Nansen Environmental and Remote
Sensing Center (NERSC) under the EU FP6 MERSEA project². As part of the same project the
model system is to be implemented for operational use at the Norwegian Meteorological Institute
(met.no).
   This note describes the work that has gone into the transfer of TOPAZ from the
development branch at NERSC to the operational branch at met.no, and presents the state
of the met.no implementation as of fall 2009. However, TOPAZ is an evolving system and will
be continually modified and expanded under the EU FP7 MyOcean project³. For this reason we
largely leave out exhaustive descriptions of technical details as they are implemented at the time
of writing.

2 Overview of the TOPAZ system
TOPAZ (Bertino and Lisæter, 2008) is an ensemble ocean-sea ice forecasting system of the
North Atlantic and Arctic Oceans which includes data assimilation via an Ensemble Kalman
Filter (EnKF: Evensen, 1994; Evensen, 2003). Data is presently assimilated into the model once
a week and the overall purpose of the system is twofold: 1) to make a best possible weekly
estimate of the true ocean-sea ice state, and 2) to make a forecast of the evolution of this state
some days into the future.
   The system first produces a best possible guess, or analysis, of the true ocean-sea ice state by
merging the model estimate of that state with an estimate based on actual observations. This first
step is done with the Ensemble Kalman Filter: The model estimate is taken to be the ensemble
mean of one hundred individual model runs, each integrated from slightly perturbed initial
conditions and each forced with slightly perturbed atmospheric fields. The observational estimate of
the ocean-sea ice state comes from remotely sensed sea level anomalies (SLA), sea surface
temperatures (SST), sea ice concentrations (ICEC) and sea ice drift velocity (IDRIFT)⁴. A final estimate
is then constructed as a linear combination of the model estimate and the observation estimate
where the relative weights of the two are based on estimates of the respective error covariances.

 2 http://www.mersea.eu.org
 3 http://www.myocean.eu.org
 4 The preparation towards assimilation of in situ hydrographic data is ongoing at NERSC.


                              Figure 1: The original TOPAZ cycle

The error covariances of the model estimate are taken from the ensemble spread while those of
the observations are specified a priori.
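
The weighting described above can be illustrated for a single scalar quantity. This is only a minimal sketch under simplifying assumptions: the real EnKF operates on the full model state with covariance matrices, whereas here one ensemble of values is combined with one observation of the same quantity. The function name `analysis` and the sample numbers are made up for illustration.

```python
from statistics import mean, variance


def analysis(ensemble, obs, obs_error_var):
    """Combine an ensemble forecast with one observation of the same quantity."""
    model_estimate = mean(ensemble)
    # Model error variance is taken from the ensemble spread (sample variance).
    model_error_var = variance(ensemble)
    # The weight on the observation grows as the model becomes less certain.
    gain = model_error_var / (model_error_var + obs_error_var)
    return model_estimate + gain * (obs - model_estimate)


# Illustrative numbers only: three ensemble values and one observation.
ensemble_sst = [4.0, 5.0, 6.0]
analysed = analysis(ensemble_sst, obs=7.0, obs_error_var=1.0)
```

With equal model and observation error variances, as in this example, the analysis lands halfway between the ensemble mean and the observation.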
   The generic ’TOPAZ week’ proceeds as follows (Figure 1): Every Tuesday an analysis of
the ocean state the previous Wednesday, i.e. six days back in time, is made from the end state
of the previous week’s ensemble model runs and from observations centered around that previous
Wednesday. The next day, on Wednesday (of the current week), the system then makes a 17-day
integration of one single model member, from last week’s Wednesday to next week’s Saturday
(-7 to +10 days). At the end of this integration one is then left with a 10-day ocean-sea ice
forecast. Finally, the system integrates a new set of 100 ensemble members, this time 7 days
from the previous Wednesday to the current Wednesday. These 100 ensemble members will thus
form the basis for the analysis to be made at the onset of next week’s cycle, i.e. on the following
Tuesday.
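
The date bookkeeping of the generic TOPAZ week can be written out explicitly. The following is a sketch of the arithmetic only, not code from the TOPAZ scripts; the function name `topaz_week` and the dictionary keys are chosen here for illustration.

```python
from datetime import date, timedelta


def topaz_week(analysis_tuesday):
    """Return the key dates of one TOPAZ week, given the Tuesday of the analysis."""
    if analysis_tuesday.weekday() != 1:  # Monday is 0, Tuesday is 1
        raise ValueError("the weekly cycle is assumed to start on a Tuesday")
    current_wed = analysis_tuesday + timedelta(days=1)
    previous_wed = current_wed - timedelta(days=7)
    return {
        # The analysis is valid six days before the Tuesday it is made.
        "analysis_valid": previous_wed,
        # Single-member run: -7 to +10 days relative to the current Wednesday.
        "forecast14": (previous_wed, current_wed + timedelta(days=10)),
        # Ensemble run: previous Wednesday to current Wednesday.
        "forecast07": (previous_wed, current_wed),
    }


week = topaz_week(date(2009, 12, 1))
```

For the Tuesday used here, the single-member run spans 17 days and ends on a Saturday, matching the description above.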
   The TOPAZ ocean model is an implementation of the HYbrid Coordinate Ocean Model
(HYCOM). HYCOM has been developed based on the Miami Isopycnic Coordinate Ocean Model
(MICOM) (Bleck et al., 1992 and references therein). In HYCOM, the vertical coordinate is
specified as target densities. When the requested specification of layers can be met according to
an algorithm embedded in HYCOM, the model layers are isopycnic. Thus, the isopycnic layers
normally span the water column beneath the mixed layer in the deep, stratified ocean. There is a
smooth transition to terrain-following coordinates in shallow regions, and to z-level coordinates
in the mixed layer and unstratified seas. The hybrid coordinate algorithm has been described in
detail by Bleck (2002), and various specifications of the vertical coordinate have been described
and tested by Chassignet et al. (2003).
   Another feature in HYCOM is that the user may select one of several vertical mixing
parameterizations. Halliwell (2003) gives a detailed discussion of how HYCOM performs with five
different mixed layer models. The K-Profile Parameterization (KPP) closure
scheme (Large et al., 1994) is used in the TOPAZ implementation. Further, TOPAZ is run with
22 layers in the vertical, all of which are allowed to become hybrid depending on the results from
the algorithm. The present implementation in TOPAZ is based on HYCOM version 2.1.34.
   In TOPAZ, HYCOM is coupled to a prognostic sea ice model. The thermodynamic part of this
model is based on Drange and Simonsen (1996), and the dynamic part is due to Hibler III (1979),
modified by the elastic-viscous-plastic (EVP) rheology by Harder et al. (1998).
   After completion of model integrations, the TOPAZ fields are split up into North Atlantic and
Arctic regions and distributed in NetCDF format via OPeNDAP⁵ as so-called MERSEA class 1, 2
and 3 products. Class 1 products are 3-dimensional daily mean fields of all prognostic variables
interpolated to a set of fixed z levels, class 2 products are the same fields interpolated onto a set
of oceanic sections and class 3 products are integrated transports (mass, heat, etc.) through these
same sections.

3 Implementation at NERSC
TOPAZ has until recently⁶ been run on the High Performance Computing (HPC) system
“njord” at NTNU⁷. In the NERSC setup the model ran under the “laurentb” user; most of
the scripts and executable code were placed at /home/ntnu/laurentb/TOPAZ3 while most
of the data storage took place at /work/laurentb/TOPAZ3.
   The NERSC implementation of TOPAZ is operated manually via a menu-based system. From
the main script (topaz_realt.sh) the operator can start and monitor the progress of all steps of the
TOPAZ week (Figure 2).
   The week starts on Tuesday with an initialization step where model dates are updated (ad-
vanced by seven days). Next the observational data are downloaded one by one from various
data providers via ftp, and finally the EnKF analysis step is started.
   On Wednesday, if the analysis step was successful, the operator sets off the prognostic model
runs. The two steps, forecast14⁸ for the 17-day (7+10) single-member integration and forecast07
 5 http://opendap.org/
 6 As of 2009 NERSC has migrated their TOPAZ setup from njord in Trondheim to hexagon in Bergen.
 7 http://www.notur.no/hardware/njord
 8 The 17-day single-member integration is named forecast14 since the last three days of the integration are forced


Figure 2: The main menu in the NERSC implementation


for the 7-day 100-member integration, actually involve first generating relevant atmospheric
forcing fields⁹.
   In the current setup each model member utilizes one 16-cpu node on njord. The 17-day
integration (including the preparation of forcing fields) requires about two and a half wall-clock
hours while one single 7-day integration requires about one hour. The 100 ensemble members are
submitted to the HPC queue one by one as 100 different jobs, and since TOPAZ runs submitted by
NERSC receive normal queue priority, several days may pass before all jobs have passed through
the queue.
   Since some of the 100 ensemble members will typically crash during integration, the operator
will have to manually monitor the progress of forecast07 and re-submit failed members. The
resubmitted members get their new initial conditions copied from another, successful, member.
This procedure will not result in two identical integrations, however, since the two are forced by
atmospheric fields perturbed differently.
   When forecast14 and forecast07 have each completed successfully, the operator starts
post-processing scripts that create a set of pre-defined products for dissemination. The daily mean
model fields are now interpolated onto two sub-regions, an Arctic domain defined on a polar
stereographic grid and a North Atlantic domain defined on a regular lat-lon grid. The class 1, 2
and 3 NetCDF files are uploaded (via scp) from njord to a local OPeNDAP server at NERSC.

4 Modifications made
For transfer of the TOPAZ system from NERSC to met.no, the run setup required two major
modifications. First, the met.no operational system will enjoy top queue priority on the HPC
system as opposed to the normal queue priority during the development stages. Since top
queue priority enables TOPAZ to essentially claim the entire HPC system during certain stages,
and thus to exclude other work on the computer cluster (including other top-priority forecast
jobs), extra care must be taken in planning the operational TOPAZ week at met.no. Second,
the overall control and execution of the various steps of TOPAZ should be automatic rather
than manual (as with the above-mentioned menu-based interface operated by NERSC).
   The following sections describe these initial modifications needed for the transition to met.no
in some detail. Then follows a short description of some added features to the TOPAZ system: 1)
a daily update of the deterministic forecast, 2) the replacement of SSM/I ice fields with OSI-SAF
products, and 3) the setup of SVN version control.

4.1 Removal of manual menu-based inputs
The initial stages in porting the TOPAZ model setup consisted of changing hard-coded paths,
URLs and email addresses in the actual code, then changing symbolic links. All menus and

   by climatological atmospheric fields.
 9 ECMWF (799) forcing fields in GRIB format are downloaded daily via ftp from met.no.


command-based interfaces were then removed and replaced with functionality for automatic
execution of the code (see the description of the SMS-run system below).

4.2 File transfer
Some work also had to be done in modifying the way data files are transferred. During the
development stages of the system, all file transfers were initiated and controlled from the external
HPC machine. But since met.no’s computers are protected by firewalls, some of the scripting had to
be modified to move the control of all file transfers between met.no and the HPC machine to the
local met.no machines. This involved uploading daily atmospheric forcing fields from ECMWF
and downloading the final MERSEA products to be displayed on an OPeNDAP server.
Downloading of observation fields (SLA, SST, ice concentration and ice drift) by ftp from external
sources was left unchanged.

4.3 HPC queue scheduling
With regard to the top queue priority enjoyed by the met.no "forecast" user on the HPC cluster,
the execution of the 100-member forecast07 integrations needed special care. Submitting 100
such jobs all at once from the forecast user who enjoys top queue priority would essentially
occupy the entire HPC cluster for a minimum of two and a half to three hours. After consulting
the HPC staff at NTNU, an initial approach consisting of sending the 100 jobs in batches of five to
ten members was considered too inefficient. Instead, we split the forecast07 step into three parts:
Part one runs the first fifty members during a relatively quiet period during early Wednesday
evening, part two then runs the second batch of fifty members in a second quiet period later
that same night or early Thursday morning. Both parts one and two allow any crashed members
to be rerun one time. Then, finally, part three runs immediately after part two and deals with
any remaining unsuccessful ensemble members from parts one and two. A crashed member is
first given a final chance after its initial conditions have been copied from a randomly chosen
successful member. Then, should the integration of the member still fail, its end state is copied
from a randomly chosen successful member (and the ensemble will thus contain two identical
members).
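
The recovery logic of the three parts can be sketched as follows. This is an illustration of the control flow only, not the met.no job scripts: `integrate(member, start_from)` is a hypothetical callable standing in for a 7-day HYCOM integration, members are run sequentially rather than as queued jobs, and the quiet-period scheduling is left out.

```python
import random


def run_forecast07(members, integrate, rng):
    """Return, for each member, the member whose end state it finally holds."""
    half = len(members) // 2
    succeeded = set()
    # Parts one and two: run each batch; a crashed member is rerun once.
    for batch in (members[:half], members[half:]):
        for member in batch:
            if integrate(member, member) or integrate(member, member):
                succeeded.add(member)
    if not succeeded:
        raise RuntimeError("no successful member to copy from")
    donors = sorted(succeeded)
    end_state_from = {member: member for member in succeeded}
    # Part three: sweep up members that are still missing.
    for member in members:
        if member in succeeded:
            continue
        if integrate(member, rng.choice(donors)):
            # Final chance succeeded with initial conditions from a donor.
            end_state_from[member] = member
        else:
            # Give up: copy a donor's end state (two identical members).
            end_state_from[member] = rng.choice(donors)
    return end_state_from


def flaky(member, start_from):
    # Hypothetical behaviour: member 7 always crashes, member 3 only
    # succeeds when started from another member's initial conditions.
    if member == 7:
        return False
    if member == 3:
        return start_from != 3
    return True


result = run_forecast07(list(range(1, 11)), flaky, random.Random(0))
```

In this toy run member 3 is rescued by the final chance and keeps its own end state, while member 7 ends up as a copy of a successful member.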

4.4 Batch control by SMS
Control of TOPAZ at met.no had to go through Supervisor Monitor Scheduler (SMS), an
application that makes it possible to run a large number of jobs which depend on each other
and/or on time. All jobs in the SMS are grouped in ’suites’ containing one or more ’families’.
Typically a ’suite’ contains all ’families’ which are run at the same time (e.g. at 00 UTC, 12 UTC,
etc.). A ’family’ contains all the SMS jobs or scripts which are applied for a certain model or
application. SMS is applied to 1) submit jobs, 2) control each submitted job, and 3) report back
to the operator:


Submitting jobs: The SMS is applied to run jobs depending on given criteria, e.g., to start at a
       specified wall clock time, to start and stop at specified times (i.e. the job runs for a certain
       period), to start when another specified job has the status ’complete’, or to start a job when
       an ’event flag’ is set.
Controlling jobs: The SMS monitors the status of each job and also detects when a job sets an
       ’event flag’.
Interface with the operator: The operator communicates with the SMS by XCdp (X Command
       and Display Program). By XCdp the operator can start (submit), suspend, set jobs as
       ’complete’ or abort jobs. In XCdp the status of each job is shown by color codes.
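
The two trigger types mentioned above (another job being ’complete’, or an ’event flag’ being set) can be illustrated with a small dependency check. This is a sketch of the idea only: the data layout, the function `runnable` and the task names are invented here and are not SMS syntax or the SMS programming interface.

```python
def runnable(tasks, status, events):
    """Return the queued tasks whose triggers are all satisfied.

    `tasks` maps a task name to a list of triggers, each either
    ("complete", other_task) or ("event", flag_name).
    """
    ready = []
    for name, triggers in tasks.items():
        if status.get(name, "queued") != "queued":
            continue  # already submitted, complete or aborted
        satisfied = all(
            (kind == "complete" and status.get(target) == "complete")
            or (kind == "event" and target in events)
            for kind, target in triggers
        )
        if satisfied:
            ready.append(name)
    return ready


# Hypothetical two-task chain: the first waits for an event flag,
# the second waits for the first to complete.
tasks = {
    "prep_atmos": [("event", "atmos_fields_arrived")],
    "put_atmos": [("complete", "prep_atmos")],
}
status = {}
first = runnable(tasks, status, events=set())
second = runnable(tasks, status, events={"atmos_fields_arrived"})
status["prep_atmos"] = "complete"
third = runnable(tasks, status, events={"atmos_fields_arrived"})
```

Nothing can run until the event flag is set; after that the first task becomes runnable, and its completion releases the second.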

4.5 OPeNDAP server
Only minor modifications were made to the setup scripts used by the THREDDS¹⁰ OPeNDAP
server.

4.6 Daily updates of deterministic forecast
During the spring of 2009, the system has been modified such that the deterministic forecast
(forecast14, from -7 days to +10 days) is updated every day. Each such update will thus enjoy
one more day of analyzed atmospheric forcing fields and should thus be a better forecast than the
one made the day before.

4.7 Transition from SSM/I to OSI-SAF ice fields
Within the EU FP7 MyOcean project the data for assimilation in the TOPAZ system will be
pulled from the MyOcean in-situ and satellite Thematic Assembly Centers. The first adjustment
to meet this objective is the transition from SSM/I to OSI-SAF¹¹ sea ice concentration fields,
which is the product that will be available through MyOcean. This transition was made during
September 2009. Due to the high resolution of the OSI-SAF data, a routine for generating ’super
observations’ (the mean of several observations) has been added to the preprocessing of the sea
ice concentration data.
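
The idea of a ’super observation’ can be sketched as a simple binning average. How the TOPAZ preprocessing actually groups the observations is not described in this note, so the square cells, the coordinate convention and the function name `super_observations` below are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def super_observations(observations, cell_size):
    """Average all observations that fall in the same square cell.

    `observations` is an iterable of (x, y, value) tuples in arbitrary
    length units; `cell_size` is the side of a cell in the same units.
    """
    cells = defaultdict(list)
    for x, y, value in observations:
        key = (int(x // cell_size), int(y // cell_size))
        cells[key].append(value)
    return {key: mean(values) for key, values in cells.items()}


# Illustrative ice concentrations: two fall in one cell, one in its neighbour.
obs = [(1.0, 1.0, 0.80), (3.0, 2.0, 0.90), (12.0, 1.0, 0.10)]
superobs = super_observations(obs, cell_size=10.0)
```

Three raw observations are reduced to two super observations, one per occupied cell.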

4.8 SVN version control
In March 2009 an SVN repository was established for the TOPAZ system¹². This repository
will ease the exchange of code updates between NERSC and met.no. The HPC directories (see
Appendix A) currently under version control are Jobscripts, Progs and Realtime_exp, which
contain all scripts and source code of the TOPAZ system. In addition to the TOPAZ version used
operationally at met.no, the repository also contains a NERSC branch.
10 http://www.unidata.ucar.edu/projects/THREDDS/
11 http://www.osi-saf.org/
12 https://svn.met.no/topaz


5 Today’s TOPAZ system at met.no
5.1 The TOPAZ cycle
At present, the TOPAZ cycle at met.no (Figure 3) is run along two different paths: On a daily
basis the forecast14 and genfore14prods are run after new atmospheric data is transferred from
ECMWF. The forecast14 step integrates one single member from day -7 to day +10, and gen-
fore14prods generates the MERSEA class 1, 2 and 3 products in NetCDF format. The TOPAZ
daily cycle ends with a post-processing stage which transfers the NetCDF files with MERSEA
products to the OPeNDAP server at met.no.
   In parallel, on a weekly basis the following jobs are executed: On Tuesday morning at 07:00
UTC the initialize and analysis steps are launched. Then, on Wednesday evening after met.no’s
12-UTC cycle (“termin”) has completed, the forecast07 runs are started to produce the 100-member
ensemble. By this procedure, the forecast14 runs are updated every day, each run with
an improved set of atmospheric forcing fields, i.e., more analyzed atmospheric fields. However,
the ocean (and sea ice) model initial fields are kept unchanged until the next weekly cycle with
the forecast07 runs.
   The initialize step advances the TOPAZ time stamp by 7 days to prepare for a new week. In
addition, it does some cleaning up. The analysis step then downloads observation fields and
conducts the analysis, one field at a time.
   The forecast07 step, which integrates 100 ensemble members from day -7 to day 0, starts off
with a part1 which integrates the first 50 members. This first part of the job is triggered Wednesday
evening by the completion of the 12-UTC cycle that day¹³. Then the next 50 ensemble
members are integrated by part2, which is triggered by the completion of the 18-UTC cycle (by a
similar procedure as for the 12-UTC cycle) sometime early on Thursday morning. In order to avoid
serious delays for other operational forecasting jobs, part1 and part2 of forecast07 are each
allowed a maximum of about three wall clock hours to complete their runs on the HPC. Finally,
to sweep up any remaining crashed members, a part3 is submitted to run immediately after the
completion of part2. The timing here is not critical since this last stage will normally not require
an excessive amount of HPC resources (very few crashed ensemble members will remain after
two trials in part1 and part2).
   After the 100 ensemble members are ready the genfore07prods step generates MERSEA class
1, 2 and 3 products from the ensemble integration (as for the daily cycle). The TOPAZ weekly
cycle ends with a post-processing stage which transfers the NetCDF files with MERSEA prod-
ucts to the OPeNDAP server at met.no. This last step will normally be done some time Thursday
morning.

13 In fact this is done indirectly, by an ’event trigger’ which is set by the completion of the last job in the 12-UTC
   cycle.


[Figure 3 diagram, text labels only: a time axis MON, TUE, WED, THU, FRI, SAT, SUN with PAF, F14 and GFP
repeated for each day; INIT and ANA under TUE; F07-1, F07-2, F07-3 and GFP under WED-THU; boxes labelled
OBSERVATIONS, INIT FIELDS, FORCING FIELDS, ECMWF ATM. FIELDS and THREDDS.]

Figure 3: The TOPAZ week at met.no. ’INIT’ = Initialization, ’ANA’ = Analysis, ’PAF’ = Prepare
          Atmospheric Forcing, ’F07/F14’ = Forecast07/14, ’GFP’ = Generate Forecast Products.


5.2 HPC setup
Currently, the HPC setup for met.no’s forecast user has purposely been made as similar as
possible to that of the development branch¹⁴. The HPC operation for met.no’s operational setup
is executed via a set of top-level job scripts under the home directory of the forecast user, all
triggered by SMS (according to Figure 3):

Weekly cycle

topaz3_initialize.job: Cleans up some _msg files (for communication with SMS) in Realtime_exp.
         Then executes the script Realtime_exp/Subscripts2/topaz_cleanup.sh (which in turn calls
         Realtime_exp/Subscripts2/topaz_cleanup_migrate.sh). These scripts do some cleaning up and
         copying of model output files to the BusStop/Backup directory. Finally, adds 7 days to the
         dates stored in Realtime_exp/Startup_files. Note that the NERSC version had this script
         start a new set of log files (not implemented in the met.no version).

topaz3_analysis.job: Tries to download and process each of the observation types (SLA, SST,
         ICEC, IDRIFT) via scripts in Realtime_exp/Subscripts2. Starts the EnKF via
         Realtime_exp/Subscripts2/topaz_enkf.sh. Waits a maximum of 10 hours for this script to finish.

topaz3_forecast07_part1.job: Generates two HYCOM input files (as for forecast14). Generates
         forcing files by submitting the job Forecast07/job_forfun_nersc.sh. Then submits members
         1–50 to the queue (uses the template job script Realtime_exp/Infiles/forecast07_job_single.mal_test).
         Waits for all members to finish (or until a maximum wait time has passed). Checks which
         members were unsuccessful (crashed) by looking for missing ENSrestart files. Resubmits
         these missing members one more time. Waits for all these resubmitted jobs to finish.

topaz3_forecast07_part2.job: Identical to the previous step, except that neither input files nor
         forcing files are generated. Now submits members 51–100, and gives any unsuccessful
         members one more chance.

topaz3_forecast07_part3.job: Checks how many jobs were unsuccessful (by the same procedure
         as above). For each such member, resubmits after having replaced the initial condition
         (ENSrestart file) with a random draw amongst the successful members. Waits for all
         members to finish. Again, checks for any remaining unsuccessful members. This time,
         replaces the end state of such a member with one drawn randomly amongst the end states
         of successful members.

topaz3_genfore07prods.job: Same procedure as in topaz3_genfore14prods.job, but this time
         for the ensemble run.
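
The check for crashed members by looking for missing restart files might look roughly like the sketch below. The file naming used here (`ENSrestart_mem001` and so on) is a hypothetical placeholder, since this note does not give the actual ENSrestart file names, and the helper `missing_members` is not part of the TOPAZ scripts.

```python
import tempfile
from pathlib import Path


def missing_members(restart_dir, members, name="ENSrestart_mem{:03d}"):
    """List the members for which no restart file exists in `restart_dir`.

    The default `name` template is a made-up naming convention.
    """
    restart_dir = Path(restart_dir)
    return [m for m in members if not (restart_dir / name.format(m)).exists()]


# Demonstration in a temporary directory: members 1, 2 and 4 "succeeded".
with tempfile.TemporaryDirectory() as tmp:
    for m in (1, 2, 4):
        (Path(tmp) / "ENSrestart_mem{:03d}".format(m)).touch()
    missing = missing_members(tmp, range(1, 6))
```

The returned list is what a job like part1 or part2 would resubmit, and what part3 would treat with copied initial conditions or end states.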

14 The development branch is associated with user laurentb and resides under /home/laurentb/TOPAZ3 and
   /work/laurentb/TOPAZ3.


Daily cycle

topaz3_prepatmforcing.job: This job only executes the script tmp/T799_ecnc every day (independent
      of the other steps of the TOPAZ cycle). Here GRIB files of ECMWF forcing fields
      (uploaded from met.no to the HPC machine independently) are converted to NetCDF via
      CDO¹⁵.

topaz3_forecast14.job: Starts the forecast14 job via the script Realtime_exp/Subscripts2/topaz_fore14_nest.sh.
      This script first generates two input files for HYCOM, then submits another job script,
      Forecast14/forecast14_job.sh. Finally, this job script does two things: first it generates
      atmospheric forcing fields (from NetCDF to HYCOM’s format) via the executable
      Realtime_exp/bin/forfun_nersc, then it runs the HYCOM executable.

topaz3_genfore14prods.job: Generates a job file (by inserting correct dates into a generic version)
      which generates the MERSEA class 1, 2 and 3 products for the North Atlantic (NAT) and
      Arctic (ARCTIC) regions via the script Realtime_exp/Subscripts2/topaz_generate_daily.sh.
      Class 1 products are actually generated by the executable Realtime_exp/bin/hyc2proj, while
      class 2 and 3 are generated by the scripts Realtime_exp/Subscripts2/merseaip_class2_sections.sh
      and Realtime_exp/Subscripts2/merseaip_class3_transport.sh.

5.3 SMS
All jobs in the SMS are grouped in ’suites’ containing one or more ’families’. Typically a ’suite’
might contain all ’families’ which are run during the same time slot (e.g. at 00 UTC, 12 UTC,
etc.). A ’family’ contains all the SMS jobs or tasks which are applied for a certain model or
application. All SMS jobs are scripts, most commonly of shell or perl type. They must have
a specified format at the beginning (header) and at the end to be recognized by the SMS. An
ASCII input file named ’metop.def’ tells the SMS which jobs and dependencies to use.
This file changes as the system evolves: it must be edited every
time an SMS job is added to or removed from the system.
   In the daily cycle, in SMS family run_everyday, the atmospheric forcing fields for the TOPAZ
model system are retrieved from ECMWF. This is done after the ECMWF fields from their 12
UTC model run have arrived at met.no. After the forcing fields have been downloaded and converted
to NetCDF, the forecast14 and genfore14prods steps are executed. Finally the model results
(now in NetCDF format) are transferred to the OPeNDAP server at met.no. The procedure is as
follows: Under suite ’ec_atmo’ in family ’topaz’ the SMS job ’check_if_atmo_12_240’ checks
whether the ECMWF fields for the 12 UTC run have arrived at met.no. If this is the case, the event
flag ’ec_atmo_12_240’ is set, and the SMS jobs under the suite ’trigjobs’, family ’run_everyday’
and family ’topaz’ are started. These jobs prepare and transfer ECMWF atmospheric fields to the
remote HPC host ’njord’, and start the forecast14 runs. The lines below show the SMS jobs with
events and triggers as they appear in family run_everyday in the file ’metop.def’ (all comments
start with #):

15 http://www.mpimet.mpg.de/fileadmin/software/cdo/


suite trigjobs........
    family run_everyday
      edit JOB_HOSTS ""
      family topaz
# - Prepare new atmospheric fields from ECMWF
# - Perform complete Forecast14 run, including transfer of output
#   data to thredds
# - Requeue family
        task prep_atmos_topaz3
            trigger /ec_atmo/topaz/check_if_atmo_12_240:ec_atmo_12_240
        task put_atmos_topaz3
            trigger ./prep_atmos_topaz3 == complete
        task start_topaz3_prepatmforcing
            trigger ./put_atmos_topaz3 == complete
        task get_topaz3_prepatmforcing
            edit RUN_JOB_OPTIONS "--timeout-job=170"
            trigger ./start_topaz3_prepatmforcing == complete
# On every day of the week, run topaz forecast14 with updated atmospheric
        task start_topaz3_forecast14
            trigger ./get_topaz3_prepatmforcing == complete
        task get_topaz3_forecast14
            edit RUN_JOB_OPTIONS "--timeout-job=190"
            trigger ./start_topaz3_forecast14 == complete
        task start_topaz3_genfore14prods
            trigger ./get_topaz3_forecast14 == complete
        task get_topaz3_genfore14prods
            edit RUN_JOB_OPTIONS "--timeout-job=130"
            trigger ./start_topaz3_genfore14prods == complete
# Put model output tar file on thredds, unpack, and requeue family
        task post_topaz3
            trigger ./get_topaz3_genfore14prods == complete
        task requeue_run_everyday
            trigger ./post_topaz3 == complete
      endfamily
    endfamily
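
The way such a chain of triggers fixes the execution order can be illustrated with a small parser. This is a sketch under stated assumptions, not part of the SMS: it only understands 'task' lines and trigger terms of the form './name == complete', ignores event triggers and continuation lines, and the function names are hypothetical.

```python
from __future__ import annotations

import re


def parse_tasks(definition: str) -> dict[str, list[str]]:
    """Map each task name to the sibling tasks its trigger waits for."""
    deps: dict[str, list[str]] = {}
    current = None
    for line in definition.splitlines():
        words = line.split("#", 1)[0].split()  # drop comments
        if not words:
            continue
        if words[0] == "task":
            current = words[1]
            deps[current] = []
        elif words[0] == "trigger" and current is not None:
            deps[current] += re.findall(r"\./(\w+)\s*==\s*complete", line)
    return deps


def run_order(deps: dict[str, list[str]]) -> list[str]:
    """Return the tasks in an order that respects every dependency."""
    order: list[str] = []
    done: set[str] = set()
    pending = list(deps)
    while pending:
        ready = [t for t in pending if all(d in done for d in deps[t])]
        if not ready:
            raise ValueError("circular or unresolved trigger")
        order.extend(ready)
        done.update(ready)
        pending = [t for t in pending if t not in done]
    return order


EXAMPLE = """
task prep_atmos_topaz3
    trigger /ec_atmo/topaz/check_if_atmo_12_240:ec_atmo_12_240
task put_atmos_topaz3
    trigger ./prep_atmos_topaz3 == complete
task start_topaz3_prepatmforcing
    trigger ./put_atmos_topaz3 == complete
"""

print(run_order(parse_tasks(EXAMPLE)))
```

For the three tasks in EXAMPLE the order is simply the order in which they are listed, which reflects the strictly sequential design of the daily cycle.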

   The weekly cycle starts with the initialize and analysis steps on Tuesdays. Thus, these jobs are
run under family run_tuesday. When it is Tuesday, the event flag ’tuesday_OK’ is set. When this
event flag is set, all SMS jobs under suite ’trigjobs’, family ’run_tuesday’, and family ’topaz’ are
started. In ’metop.def’ this "production line" looks like:

    family run_tuesday
      edit JOB_HOSTS ""
      family topaz
# Initialization & cleanup, then perform the analysis part:
        task start_topaz3_initialize
            trigger /cronjobs/topaz/check_which_day_of_week:tuesday_OK
# Copy last "old" results from njord to rhino
        task copy_topaz3_results
            edit RUN_JOB_OPTIONS "--timeout-job=120"
            trigger /cronjobs/topaz/check_which_day_of_week:tuesday_OK
        task get_topaz3_initialize
            trigger ./start_topaz3_initialize == complete
        task start_topaz3_analysis
            trigger ./get_topaz3_initialize == complete
        task get_topaz3_analysis
            edit RUN_JOB_OPTIONS "--timeout-job=500"
            trigger ./start_topaz3_analysis == complete
        task requeue_run_tuesday
            trigger ./get_topaz3_analysis == complete and ./copy_topaz3_r
      endfamily
    endfamily
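
The day-of-week check that sets the event flag can be sketched as below. This is an assumption about how a job such as check_which_day_of_week might decide, not its actual code; only the event name ’tuesday_OK’ is taken from ’metop.def’, and the function name is hypothetical.

```python
from __future__ import annotations

from datetime import date

TUESDAY = 1  # date.weekday() counts from Monday == 0


def events_for(today: date) -> set[str]:
    """Return the event flags a day-of-week check would set for this date."""
    return {"tuesday_OK"} if today.weekday() == TUESDAY else set()
```

With this convention, 20 January 2009 (a Tuesday) yields the flag, while the following day yields none.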

   The forecast07 runs are initiated on Wednesday evening with the first part (part1) at the end
of the 12-UTC operational suite at met.no. Since 100 model runs are now started, special care
must be taken to avoid blocking other operational models on the HPC. This is done by checking whether
one of the last models in the 12-UTC suite has finished (a similar procedure is used for part2 in the
18-UTC suite). If it has, the event flag ’wednesday_OK’ is set. When this event flag is set, all SMS
jobs under suite ’trigjobs’, family ’run_wednesday’, and family ’topaz’ are started. To avoid
part1 and part2 runs being performed at the same time, something that could completely
block the computer for other runs, part2 (and part3) is not started before part1 is completed,
regardless of whether the 18-UTC suite has finished. Below is the portion of ’metop.def’ for
family ’run_wednesday’:

    family run_wednesday
      edit JOB_HOSTS ""
      family topaz
# When the last model (um4exp) during the 12 UTC time is completed,
# start forecast07 part1
        task start_topaz3_forecast07_part1
            trigger /metop/mod12/topaz/check_which_day_of_week:wednesday_
        task get_topaz3_forecast07_part1
            edit RUN_JOB_OPTIONS "--timeout-job=250"
            trigger ./start_topaz3_forecast07_part1 == complete

# When the last model (hirlam4) during the 18 UTC time is completed,
# start forecast07 part2:
        task start_topaz3_forecast07_part2
            trigger /metop/mod18/topaz/check_which_day_of_week:wednesday_
                    ./get_topaz3_forecast07_part1 == complete
        task get_topaz3_forecast07_part2
            edit RUN_JOB_OPTIONS "--timeout-job=250"
            trigger ./start_topaz3_forecast07_part2 == complete

# If there is still some ensemble members which have crashed, even after
# we give those a third and last chance in part3:
        task start_topaz3_forecast07_part3
            trigger ./get_topaz3_forecast07_part2 == complete
        task get_topaz3_forecast07_part3
            edit RUN_JOB_OPTIONS "--timeout-job=250"
            trigger ./start_topaz3_forecast07_part3 == complete

# When forecast07 part1, part2 and part3 are all completed, make products
        task start_topaz3_genfore07prods
            trigger ./get_topaz3_forecast07_part1 == complete && \
                    ./get_topaz3_forecast07_part2 == complete && \
                    ./get_topaz3_forecast07_part3 == complete
        task get_topaz3_genfore07prods
            edit RUN_JOB_OPTIONS "--timeout-job=100"
            trigger ./start_topaz3_genfore07prods == complete

# Put model output tar file on thredds, unpack, and requeue family
        task post_topaz3
            trigger ./get_topaz3_genfore07prods == complete
        task requeue_run_wednesday
            trigger ./post_topaz3 == complete
      endfamily
    endfamily
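
The rule that part2 waits for both the 18-UTC event and the completion of part1 can be illustrated with a small trigger evaluator. This is a sketch, not the SMS logic itself: the event labels 'event:mod12_wednesday_OK' and 'event:mod18_wednesday_OK' are shortened placeholders for the full trigger paths in ’metop.def’, and all triggers are assumed to be conjunctions.

```python
from __future__ import annotations

# Conjunctive triggers for the first tasks of family run_wednesday.
# Event labels are placeholders, not the paths used in metop.def.
TRIGGERS: dict[str, set[str]] = {
    "start_topaz3_forecast07_part1": {"event:mod12_wednesday_OK"},
    "get_topaz3_forecast07_part1": {"start_topaz3_forecast07_part1"},
    "start_topaz3_forecast07_part2": {
        "event:mod18_wednesday_OK",
        "get_topaz3_forecast07_part1",
    },
    "get_topaz3_forecast07_part2": {"start_topaz3_forecast07_part2"},
}


def ready_tasks(satisfied: set[str]) -> list[str]:
    """Tasks not yet complete whose every trigger condition is satisfied.

    'satisfied' holds the labels of events that are set and the names of
    tasks that have completed.
    """
    return sorted(
        task
        for task, needs in TRIGGERS.items()
        if task not in satisfied and needs <= satisfied
    )
```

If only the 18-UTC event is set, no task is ready: part2 stays queued until get_topaz3_forecast07_part1 has completed, which is the blocking behaviour described above.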

5.4 OPeNDAP
For data dissemination by OPeNDAP, THREDDS Data Server (TDS16) software was installed on
server hardware in the met.no De-Militarized Zone (DMZ). An area on this server is dedicated to
serving TOPAZ Mersea data products (thredds.met.no/thredds/public/mersea-ipv2.html).
Configuration of the Mersea area of this server is essentially a copy of the configuration used by
NERSC (see topaz.nersc.no/thredds/catalog.html). Catalog-generation scripts from NERSC were
adapted and implemented at met.no.

16 www.unidata.ucar.edu/projects/THREDDS/


  The TOPAZ products are located under threddsday:/metno/eksternweb/thredds/content/mersea-ipv2
in the following structure:

/metno/eksternweb/thredds/content/ Top of THREDDS content tree
      mersea-ipv2.xml Catalog for Mersea tree (static)
      mersea-ipv2/ Top of Mersea tree
           gen_xml.sh Script to update catalog xml files. Called by the SMS job post_topaz3.sms.
                Runs perl scripts located in Agg-XML to generate updated xml files for each set
                of products. xml files are built in Agg-XML/work and then copied to this
                subdirectory.
           mersea-ipv2-class1-arctic.xml Catalog for Arctic Class 1 products
           mersea-ipv2-class2-arctic.xml Catalog for Arctic Class 2 products
           mersea-ipv2-class3-arctic.xml Catalog for Arctic Class 3 products
           mersea-ipv2-class1-nat.xml Catalog for North Atlantic Class 1 products
           mersea-ipv2.tar tar ball of all NetCDF data files in current update. Currently 2 Gb.
           arctic/ Arctic grid (polar-stereographic)
                mersea-class1/ Class 1 products (gridded fields)
                      Filename template: topaz_V3_mersea_arctic_grid1to8_da_class1_b[bulletin
                       date YYYYMMDD]_f[field date YYYYMMDD]9999.nc
                mersea-class2/ Class 2 products (vertical section fields)
                    section01/ Data files for Section 1.
                         Filename template: topaz_V3_mersea_arctic_section01_dc_b[bulletin date
                          YYYYMMDD]_f[field date YYYYMMDD]9999.nc
                    section02/ ... section24/ Data files for Sections 2-24.
                         Filename template: topaz_V3_mersea_arctic_sectionNN_dc_b[bulletin
                          date YYYYMMDD]_f[field date YYYYMMDD]9999.nc
                    moorings/ Data files for mooring locations.
                         Filename template: topaz_V3_mersea_arctic_moorings_dc_b[bulletin date
                          YYYYMMDD]_f[field date YYYYMMDD]9999.nc
                mersea-class3/ Class 3 products (transports)
                    transport/ Data files for water transport.
                         Filename template: topaz_V3_mersea_arctic_transport_b[bulletin date
                          YYYYMMDD]_f[field date YYYYMMDD]9999.nc
                    icetransport/ Data files for sea ice transport.
                         Filename template: topaz_V3_mersea_arctic_icetransport_b[bulletin date
                          YYYYMMDD]_f[field date YYYYMMDD]9999.nc
           nat/ North Atlantic grid (geographic)
                mersea-class1/ Class 1 products (gridded fields)
                      Filename template: topaz_V3_mersea_nat_grid1to8_da_class1_b[bulletin
                       date YYYYMMDD]_f[field date YYYYMMDD]9999.nc
           Agg-XML/ Contains perl code for generating aggregation catalog files in xml. Perl
                scripts are called by gen_xml.sh.


                   work/ Work area for writing the catalog xml files. These are copied to the top
                    directory.
                    old/ Contains catalog xml files from the previous update.
                 backup/ Purpose unknown, currently empty.
             old_content/ Obsolete forms of xml catalog files. (Not used?)
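
The Class 1 filename template listed above can be expressed as a small helper. This is an illustrative sketch; the function name is hypothetical, and only the template itself is taken from the directory listing.

```python
from datetime import date


def class1_filename(region: str, bulletin: date, field: date) -> str:
    """Build a Class 1 product file name for region 'arctic' or 'nat'."""
    if region not in ("arctic", "nat"):
        raise ValueError(f"unknown region: {region}")
    return (
        f"topaz_V3_mersea_{region}_grid1to8_da_class1_"
        f"b{bulletin:%Y%m%d}_f{field:%Y%m%d}9999.nc"
    )


print(class1_filename("arctic", date(2009, 1, 20), date(2009, 1, 22)))
```

The same pattern, with the grid and class part replaced by a section, mooring or transport label, applies to the Class 2 and Class 3 templates.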

6 Experience from an initial testing period
The first operational runs of TOPAZ at met.no took place in early April 2008. The system has
been operational since mid-May 2008, and the resulting MERSEA class 1, 2 and 3 files can be
downloaded from: http://thredds.met.no/thredds/public/mersea-ipv2.html

6.1 Model fields
No thorough validation of TOPAZ output fields has been conducted. A few random visual
inspections have shown the met.no fields to be in good accordance with those produced by NERSC.
Regular validation is planned within the MyOcean project.

6.2 HPC performance
Scheduling and execution on njord under the forecast user has resulted in few, if any, problems.
One irregularity in njord networking caused problems with ftp connections, and TOPAZ was
then unable to download any observational data for that week.

6.3 SMS
The once-weekly structure of TOPAZ has caused some problems. On two or three occasions
operational personnel failed to reset some triggers used by TOPAZ, which caused failures. New
routines have been implemented which should avoid such problems in the future.

6.4 THREDDS
THREDDS has run largely without problems.

7 Future work
7.1 Argo in situ observations
The NERSC version of TOPAZ has been assimilating Argo17 in situ hydrographic profiles since
the end of 2008. This feature has yet to be implemented in the met.no version.
17 http://www.argo.net/


7.2 met.no uses of TOPAZ results
Presently the TOPAZ fields are available to the community at large via the OPeNDAP server, but
they are not used for any in-house met.no applications.

A The HPC directory structure
The TOPAZ3 directory under /home/ntnu/forecast contains these major subdirectories:

Realtime_exp: Most of the shell scripts and code are stored and executed here:
       Startup_files: Four ASCII text files contain the current analysis date (in year, week, day
             and Julian day). These files are modified by the top-level script topaz3_initialize.job
             and subsequently read in by many other scripts.
       Logfiles: Not presently used by the met.no setup.
       Subscripts: Generic scripts
       Subscripts2: Scripts more specific to TOPAZ
       Subprogs: Generic programs
       Subprogs2: Programs more specific to TOPAZ
       Infiles: Various controlling input files (including generic run scripts for the HYCOM
             model)
       bin: Various executables
       Helpdocs: (Largely outdated) help documents.

Analysis: The analysis step is executed here

Forecast14: The Forecast14 step is executed here

Forecast07: The Forecast07 step is executed here

BusStop: Temporary storage for data to be transferred:
       Backup: Backup fields (Analysis, Forecast14, Forecast07, MERSEA)
       OpenDAP: The Mersea products to be uploaded to the THREDDS server
       Diagnostics: Not presently used

Progs: Various source code (copied from the NERSC laurentb user)
       EnKF_MPI: EnKF code for MPI

tmp:
       T799_ecnc: ECMWF forcing fields are downloaded here (every day) and converted from
           GRIB to NetCDF format.
       TOPAZ3_relax: Climatological fields for ocean model.


       Met.no: Presently not used
       ECMWFR: Presently not used

HYCOM_inputs: HYCOM data

EOdata: Downloaded observation data (SLA, SST, ICEC, IDRIFT)

Jobscripts: Contains all top-level job scripts described in Section 5.2.

B Initialization files
The following is a list of data files from last week’s run which must be present at the beginning of
a new TOPAZ week. The list is not exhaustive but includes files which will typically be missing
if one week’s runs went astray. The files must then be copied from the (hopefully successful)
NERSC runs. All files belong in the (HPC) directory ~/TOPAZ3/Forecast07/.

ENSrestartyyyy_ddd* These are restart files containing the end state of the model at the end of
      last week’s run.

ENSrestRANDyyyy_ddd_00.[ab] Contain random component?

ENSDAILY_yyyy_d-7_ICEDRIFT.uf Ice drift data throughout entire previous ensemble run?

  Here ’yyyy’ and ’ddd’ are the year and the day-in-the-year (days after 1 January) pointing to
the analysis day we want to integrate from. The ’d-7’ in the ENSDAILY files points to seven
days before the analysis, i.e. to the start of the previous analysis (two weeks ago).
  An example: For the TOPAZ week starting 20 January 2009, we need to restart from the
previous Wednesday, i.e. 14 January or yyyy=2009, ddd=013. We would need the files
ENSrestart2009_013*, ENSrestRAND2009_013_00.[ab] and ENSDAILY_2009_006*.
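
The date arithmetic in this example can be sketched as follows. The function names are hypothetical, the week is assumed to start on a Tuesday with the restart day six days earlier, and the treatment of the year in the ENSDAILY name near a year boundary is an assumption, since the text does not specify it.

```python
from __future__ import annotations

from datetime import date, timedelta


def day_in_year(d: date) -> int:
    """Days after 1 January, so 1 January is day 0."""
    return d.timetuple().tm_yday - 1


def required_patterns(week_start: date) -> list[str]:
    """File name patterns needed in Forecast07/ for the given TOPAZ week.

    week_start is the Tuesday that opens the week; the restart day is the
    Wednesday six days earlier.
    """
    restart = week_start - timedelta(days=6)
    # Assumption: the ENSDAILY year follows the date seven days earlier.
    daily = restart - timedelta(days=7)
    tag = f"{restart.year}_{day_in_year(restart):03d}"
    return [
        f"ENSrestart{tag}*",
        f"ENSrestRAND{tag}_00.[ab]",
        f"ENSDAILY_{daily.year}_{day_in_year(daily):03d}*",
    ]


print(required_patterns(date(2009, 1, 20)))
```

For the week starting 20 January 2009 this reproduces the three patterns given in the example above.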

References
  Bertino, L. and K. A. Lisæter, 2008: The TOPAZ monitoring and prediction system for the
       Atlantic and Arctic Oceans. J. Operat. Oceanogr., 1(2), 15–19.
  Bleck, R., 2002: An oceanic general circulation model framed in hybrid isopycnic-cartesian
       coordinates. Ocean Modelling, 4, 55–88.
  Bleck, R., C. Rooth, D. Hu, and L. T. Smith, 1992: Salinity thermocline transients in a wind-
       and thermohaline-forced isopycnic coordinate model of the Atlantic Ocean. J. Phys.
       Oceanogr., 22, 1486–1505.
  Chassignet, E. P., L. T. Smith, G. R. Halliwell, and R. Bleck, 2003: North Atlantic simulations
       with the HYbrid Coordinate Ocean Model (HYCOM): Impact of the vertical coordinate
       choice, reference pressure, and thermobaricity. J. Phys. Oceanogr., 33, 2504–2526.


Drange, H., and K. Simonsen, 1996: Formulation of air-sea fluxes in the ESOP2 version
     of MICOM. Technical Report 125, Nansen Environmental and Remote Sensing Center,
     Bergen, Norway. 23 pp.
Evensen, G., 1994: Sequential data assimilation with a nonlinear quasi-geostrophic model
     using Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99(C5), 10143–
     10162.
Evensen, G., 2003: The Ensemble Kalman Filter: theoretical formulation and practical imple-
     mentation. Ocean Dynamics, 53, 343–367.
Halliwell, G. R., 2004: Evaluation of vertical coordinate and vertical mixing algorithms in the
     HYbrid Coordinate Ocean Model (HYCOM). Ocean Modelling, 7(3–4), 285–322.
Harder, M., P. Lemke, and M. Hilmer, 1998: Simulation of sea ice transport through Fram
     Strait: Natural variability and sensitivity to forcing. J. Geophys. Res., 103(C3), 5595–
     5606.
Hibler, W. D., III, 1979: A dynamic thermodynamic sea ice model. J. Phys. Oceanogr., 9,
     815–846.
Lisæter, K. A., and G. Evensen, 2003: Assimilation of ice concentration in a coupled ice –
     ocean model, using the Ensemble Kalman filter. Ocean Dynamics, 53, 368–388.
