The Swiss National Supercomputing Centre - Driving innovation in computational research in Switzerland - SKA | EPFL

The Swiss National Supercomputing Centre
Driving innovation in computational research in Switzerland
CSCS in a nutshell

§ Established in 1991 as a unit of ETH Zurich
§ 90 highly qualified staff from 16 nations
§ Develops and operates the key supercomputing capabilities required to
  solve problems of importance to science and society
§ Leads the national strategy for High-Performance Computing and Networking
  (HPCN)
§ Has operated a dedicated User Laboratory for supercomputing since 2011 (i.e. a
  research infrastructure funded by the ETH Domain on a programmatic basis)
   § ~1200 users, 116 projects (2017)
§ Annual budget
   § Operations: CHF 17 million
   § Investments: CHF 20 million
Services Provided

User Lab

§ Scientific users can access CSCS computing resources free of charge
   § They must submit project proposals, which are assessed by international experts
   § 43.4 million node hours were used in 2016
   § 109 projects, 754 users

Collaboration Agreements
Core Mission
§ User Lab
§ PRACE Tier 0

Housing
§ BlueBrain for EPFL
§ Euler for ETH Zurich

Hosting (dedicated systems)
§ MeteoSwiss
§ Mönch Cluster for ETH Zurich
§ Phoenix for CHIPP (now also as a non-dedicated system)

Services on non-dedicated systems
§ Empa
§ ETH Zurich
§ Hilti (ending in 2019)
§ MARVEL
§ PartnerRe
§ Paul Scherrer Institute
§ Università della Svizzera italiana
§ University of Geneva / CADMOS
§ University of Zurich

Example of Co-Design Project
MeteoSwiss NExT Cosmo Suite

Improving simulation quality requires higher performance –
what exactly and by how much?

Current model (running through mid-2016)
§ COSMO-2 (2.2 km grid)
   § 24 h forecast running in 30 min., 8x per day

New models (starting operation in 2016)
§ COSMO-1 (1.1 km grid)
   § 24 h forecast running in 30 min., 8x per day (~10x COSMO-2)
§ COSMO-2E (21 times the 2.2 km grid)
   § 21-member ensemble, 120 h forecast in 150 min., 2x per day (~26x COSMO-2)
§ KENDA
   § 40-member ensemble, 1 h forecast in 15 min., 24x per day (~5x COSMO-2)
§ The new production system must deliver ~40x the simulation performance of the
  existing HPC system
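The relative cost figures in parentheses can be sanity-checked from the schedules above by comparing forecast-hours computed per day; a back-of-envelope sketch (the slide's exact accounting may differ):

```python
# Forecast-hours computed per day, relative to the current COSMO-2 suite.
# A back-of-envelope sketch; the slide's exact accounting may differ.

cosmo2 = 8 * 24          # 8 runs/day x 24 h forecast = 192 forecast-hours/day

cosmo2e = 2 * 21 * 120   # 2 runs/day x 21 members x 120 h forecast
kenda = 24 * 40 * 1      # 24 runs/day x 40 members x 1 h forecast

print(round(cosmo2e / cosmo2, 2))  # 26.25, matching "~26x COSMO-2"
print(kenda / cosmo2)              # 5.0, matching "~5x COSMO-2"

# COSMO-1 keeps the COSMO-2 schedule but halves the grid spacing:
# 4x the grid columns plus a roughly halved time step give ~8x the cost
# per forecast-hour, consistent with the quoted ~10x once overheads are added.
```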
Origin of the factor-40 performance improvement

§ Current production system installed in 2012
§ New Piz Kesch/Escha installed in 2015
   § Processor performance: 2.8x (Moore's law)
   § Improved system utilisation: 2.8x
   § General software performance: 1.7x (software refactoring)
   § Port to GPU architecture: 2.3x
   § Increase in number of processors: 1.3x
   § Total performance improvement: ~40x
§ Bonus: the simulation running on GPUs is 3x more energy efficient than on a
  conventional state-of-the-art CPU
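Since each gain acts on an independent aspect of the system, the factors compound multiplicatively; a quick check that they indeed reach ~40x:

```python
# The individual speedups listed above compound multiplicatively
# to the overall performance improvement.
factors = {
    "processor performance": 2.8,
    "improved system utilisation": 2.8,
    "general software performance": 1.7,
    "port to GPU architecture": 2.3,
    "increase in number of processors": 1.3,
}

total = 1.0
for f in factors.values():
    total *= f

print(round(total, 1))  # 39.9, i.e. ~40x
```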

Example of Co-Design Project
Swiss Particle Physics Community (CHIPP) and CERN

CHIPPonCray – The Context

§ Collaboration with the Swiss Institute of Particle Physics (CHIPP) to analyse data
  from the Large Hadron Collider (LHC) experiments at CERN in Geneva
§ Within this collaboration, CSCS has been running a dedicated cluster since 2007
   § Its computing power corresponds to about one Piz Daint cabinet
   § Up to 50 TB of data are transferred daily to CSCS for analysis
   § 4 PB of local data
§ Phoenix contributes about 2% of CERN's total computing infrastructure
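The quoted 50 TB/day implies a non-trivial sustained network bandwidth; a rough estimate, assuming decimal terabytes (1 TB = 10^12 bytes) and a perfectly even transfer over 24 hours (real transfers are burstier):

```python
# Sustained bandwidth implied by "up to 50 TB transferred daily".
# A rough estimate; assumes decimal terabytes and an even 24 h transfer.
tb_per_day = 50
bits_per_day = tb_per_day * 10**12 * 8
seconds_per_day = 24 * 60 * 60

gbit_per_s = bits_per_day / seconds_per_day / 10**9
print(round(gbit_per_s, 1))  # ~4.6 Gbit/s sustained
```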

CHIPPonCray – What Changed?

Before (still today)
§ Phoenix as a dedicated cluster
§ Few synergies with other HPC activities being developed at CSCS
§ Headaches managing its yearly upgrades
§ Difficult to scale the infrastructure up by an order of magnitude
§ No flexibility in service provision

Tomorrow (in fact, already now)
§ Ported to Piz Daint, a non-dedicated environment
§ Porting only became possible because the Cray environment is getting less specific
§ Will allow taking full advantage of the Cray HPC environment
§ An important step towards meeting the increasing needs of the particle physics
  community (x 50)

CHIPPonCray – The Solution

§ Containerized compute nodes (with Docker/Shifter)
§ Several components are shared with Phoenix
§ Equivalent to Phoenix in performance, but with better economies of scale
§ Opens the door to other HPC technologies

§ We're the first ones doing this!
§ Lots of visibility inside and outside CSCS
§ Builds a bridge for future collaborations with CERN
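On Cray systems, Shifter lets Slurm jobs run inside Docker-built images. A minimal sketch of how such a containerized job could be launched; the image name, node count, and payload script are hypothetical placeholders, not CSCS's actual setup:

```python
import shlex

# Minimal sketch of a Shifter-based containerized job launch under Slurm.
# Image name, node count, and payload script are hypothetical placeholders.
image = "docker:chipp/analysis:latest"

# Pull the Docker image into Shifter's image gateway (done once per image):
pull_cmd = ["shifterimg", "pull", image]

# Run the payload inside the container on one compute node:
run_cmd = ["srun", "-N", "1", "shifter", f"--image={image}", "./run_analysis.sh"]

print(shlex.join(pull_cmd))
print(shlex.join(run_cmd))
```

Sharing the Docker image format is what decouples the workload from the dedicated hardware: the same image can run on Phoenix, on Piz Daint, or elsewhere.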

The Supercomputers of CSCS

[Timeline figure, 2009–2020:]
§ Begin construction of new building; new building complete
§ Monte Rosa, Cray XT5, 14,762 cores; hex-core upgrade to 22,128 cores;
  upgrade to Cray XE6, 47,200 cores
§ Development & procurement of a petaflop/s-scale supercomputer
§ Phase I: Aries network & multi-core
§ Phase II: K20X-based hybrid; Phase III: Pascal-based hybrid (upgrade)
§ 2017–2020: application-driven co-design of a pre-exascale supercomputing
  ecosystem (high-risk & high-impact projects)

Three-pronged approach of the HPCN Initiative:
1. New, flexible, and efficient building
2. Efficient supercomputers
3. Efficient applications
Piz Daint specifications

Final Considerations

Possible Contribution of CSCS to SKA

§ Co-design of scientific applications and the computing infrastructure they need
§ Design of distributed infrastructures for data management and computation
§ Design of the technical infrastructure to support the required computational
  infrastructure
§ Provision of storage and computational resources

Thanks for your attention.
