DCache: An Overview - CERN Indico

dCache: An Overview
             Paul Millar
   on behalf of the dCache team

 Nordic Data Management Workshop
        Oslo, Norway; 2019-02-27
 https://indico.cern.ch/event/779913/

         eXtreme DataCloud is co-funded by the Horizon2020
         Framework Program – Grant Agreement 777367
         Copyright © Members of the XDC Collaboration, 2017-2020
Scientific data challenges
● Volume
● Fast ingest
● Chaotic Access
● Sharing data
● Access Control
● Persistence & long-term archival
● Immutability

                              dCache: An Overview |   | 2019-02-27 | 2
[Diagram: labels arranged around dCache]
● Data management & workflow control (Rucio, Kafka, SSE)
● High Speed Data Ingest
● Fast Analysis: NFS 4.1/pNFS
● Interactive analysis & Sharing
● Wide Area Transfers (Globus Online, FTS) by GridFTP, HTTP
● HERA
● Tevatron
● WLCG
● Belle II
● LOFAR
● CTA
● IceCUBE
● EU-XFEL
● Petra3
● DUNE
● And many more ...

Flexibility that works …
● Supports many authentication schemes: username+password, X.509, Kerberos and OpenID-Connect:
    ● Integrates with existing infrastructure + pluggable for flexibility,
    ● Users have same rights, irrespective of how they authenticate.
● Supports delegated authorisation, using Macaroons.
● Multiple protocols: (Grid)FTP, HTTP/WebDAV, SRM, xrootd, NFS v4.1/pNFS and dcap.
    ● Using different protocols, users will see the same data.

dCache innovations:
 Storage Events
Storage events: the problems
[Diagram: request/response exchanges with dCache; catalogue: Rucio/LFC/…]
    Upload a file → OK
    Delete a file → OK
    Stage files from tape → Request queued
    Are these files on disk? → no, no, no, …
    Are these files on disk? → no, no, no, …
    …
    Are these files on disk? → no, YES, no, …
9000 stats per second!

A new approach: storage events

[Diagram: a client subscribes once, then receives events]
    Subscribe to events → OK
    Something happened #1
    Something happened #2
    Something happened #3
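
The difference between repeatedly asking "are these files on disk?" and subscribing to events can be illustrated with a toy model. The sketch below is not dCache code: `ToyStorage`, `poll_until_staged` and `subscribe_until_staged` are hypothetical names, and the "storage" is an in-memory set. It only counts how many status queries each style of client needs.

```python
from typing import Callable, List, Set


class ToyStorage:
    """In-memory stand-in for a storage system (hypothetical, not dCache's API)."""

    def __init__(self) -> None:
        self.on_disk: Set[str] = set()
        self.queries = 0  # number of "is this file on disk?" questions answered
        self.subscribers: List[Callable[[str, str], None]] = []

    def is_on_disk(self, path: str) -> bool:
        self.queries += 1
        return path in self.on_disk

    def subscribe(self, callback: Callable[[str, str], None]) -> None:
        self.subscribers.append(callback)

    def stage(self, path: str) -> None:
        """Simulate a tape stage completing and notify every subscriber."""
        self.on_disk.add(path)
        for notify in self.subscribers:
            notify("staged", path)


def poll_until_staged(storage: ToyStorage, wanted: List[str], stage_order: List[str]) -> int:
    """Polling client: after each stage, ask about every file still missing."""
    missing = set(wanted)
    for path in stage_order:
        storage.stage(path)
        missing = {p for p in missing if not storage.is_on_disk(p)}
    return storage.queries


def subscribe_until_staged(storage: ToyStorage, wanted: List[str], stage_order: List[str]) -> List[str]:
    """Event client: subscribe once, then simply record the events received."""
    seen: List[str] = []

    def on_event(kind: str, path: str) -> None:
        if kind == "staged" and path in wanted:
            seen.append(path)

    storage.subscribe(on_event)
    for path in stage_order:
        storage.stage(path)
    return seen
```

In this model the polling client's query count grows roughly with the square of the number of files, while the subscribing client issues no status queries at all.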

New solutions to old problems:
● User- and internally triggered events:
    ● Data uploaded
    ● Data deleted/renamed/moved
    ● Tape flush/stage operations
● Uses: update catalogue, metadata extraction, data normalisation, build derived data, …
● Two event systems:
    ● Site integration (Kafka)
    ● Per user events (SSE/inotify)

[Diagram: message exchanges with dCache; Rucio receives events]
    Subscribe … → OK
    Upload → OK; event: File uploaded
    Delete → OK; event: File deleted
    Stage files → Request queued
    Subscribe … → OK; event: File #16 on disk

(DEMO :-)
dCache innovations:
Distributed storage & Data Lakes
Data Lakes: distributed resources
● dCache has over a decade of production use as a data lake:
    ● NDGF is a distributed dCache, spread over five countries.
    ● AGLT2 is a distributed dCache, spread over two campuses.
● dCache can already provide protocol-based QoS; e.g., cache data for NFS access, read remotely for HTTP/GridFTP.
● Currently building new testbed to demonstrate existing solutions and improve upon them:
        Hamburg → Zeuthen (RTT: ~5 ms); Hamburg → Moscow (RTT: ~70 ms)
● Adding ability to provide cached data when detached:
        A “satellite” can offer data if disconnected from the rest of dCache.

Data Lakes: cloud bursting
● dCache stores data either in a local filesystem or as objects within a Ceph cluster.
● Two new developments:
    ● Storing data within an S3 endpoint
    ● Dynamic pools: just start a dCache pool and that capacity becomes usable.
● Together, these support the cloud-bursting use case:
    ● As cloud capacity “comes online”, either due to load (cloud burst) or due to resources being cheap (Amazon grants), start a dCache pool.
    ● Jobs can run “in the cloud” with dCache taking care of any data movement.

dCache innovations:
Delegated Authorisation with Macaroons
Macaroons: delegated authorisation

                                 Photo by Alan Cleaver (CC-BY)

Example use: community portals / BOINC

[Diagram: client, portal with user database, dCache]
    1. Request data (GET)
    2. Request a macaroon
    Response: 307
    3. Request data directly from dCache (GET)
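
The redirect step in this flow can be sketched as follows. This is a minimal illustration, not portal or dCache code: the function name `redirect_to_storage`, the host name and the token value are hypothetical, and carrying the macaroon in an `authz` query parameter is an assumption that should be checked against the dCache documentation before use.

```python
from http import HTTPStatus
from typing import Dict, Tuple
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def redirect_to_storage(storage_url: str, macaroon: str) -> Tuple[int, Dict[str, str]]:
    """Build a 307 response that sends the client to the storage URL.

    The macaroon is appended as an `authz` query parameter (an assumption,
    see the lead-in), so the client needs no credentials of its own.
    """
    parts = urlsplit(storage_url)
    query = parse_qsl(parts.query, keep_blank_values=True) + [("authz", macaroon)]
    location = urlunsplit(parts._replace(query=urlencode(query)))
    return int(HTTPStatus.TEMPORARY_REDIRECT), {"Location": location}
```

A portal would call this after obtaining a suitably restricted macaroon and return the status and header to the client, which then fetches the data from the storage system directly.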
Example use: ad-hoc sharing

[Diagram: user, colleague, dCache]
    1. Request a macaroon
    2. Send to colleague (e.g. via email)
    3. Use macaroon (GET/PUT/DELETE)
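
Why a macaroon can safely be handed to a colleague is easier to see from the general construction: each caveat is folded into the signature with an HMAC, so the holder can add restrictions but cannot remove them. The sketch below follows that general idea only. It is not dCache's implementation or wire format, and the names `mint`, `verify` and the caveat strings used in the usage note are illustrative assumptions.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Callable, Tuple


def _chain(key: bytes, message: str) -> bytes:
    """One link of the HMAC chain: sign `message` using `key`."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


@dataclass(frozen=True)
class Macaroon:
    identifier: str
    caveats: Tuple[str, ...]
    signature: bytes

    def add_caveat(self, caveat: str) -> "Macaroon":
        """Narrow the token; only the current signature is needed, not the root key."""
        return Macaroon(self.identifier, self.caveats + (caveat,), _chain(self.signature, caveat))


def mint(root_key: bytes, identifier: str) -> Macaroon:
    """Server side: create an unrestricted token for `identifier`."""
    return Macaroon(identifier, (), _chain(root_key, identifier))


def verify(root_key: bytes, token: Macaroon, satisfied: Callable[[str], bool]) -> bool:
    """Server side: every caveat must hold and the recomputed chain must match."""
    signature = _chain(root_key, token.identifier)
    for caveat in token.caveats:
        if not satisfied(caveat):
            return False
        signature = _chain(signature, caveat)
    return hmac.compare_digest(signature, token.signature)
```

For example, a user could call `mint(key, "alice").add_caveat("activity:DOWNLOAD")` and send the result to a colleague. Dropping the caveat again would change the expected signature, so `verify` rejects the tampered token.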

dCache Workshop: 2019-05-21 to 2019-05-22
● Located in Madrid, Spain.
● Learn more about latest developments in dCache
● Opportunity to discuss issues directly with dCache developers
● Share stories with dCache admins
● Help shape the future direction of dCache.
            https://indico.desy.de/indico/event/22170/
The take-home message
● dCache is advanced storage software for data-intensive science.
● dCache:
    ● has decades of production use throughout the world,
    ● provides scalable resources, used by many scientific disciplines,
    ● offers innovative solutions that help drive the next generation of scientific discovery.

Backup slides
dCache 101: Motivation
● Data never fits into a single server
    ● Multiple servers
    ● Off-load to tape
● Growing number of client hosts
    ● Mainframe vs Linux cluster
● Control over hardware/OS selection
    ● Better tender offers
    ● Use and enhance local expertise

dCache 101: Design
● Single-rooted namespace, distributed data
● Client talks to namespace for metadata operations only
● Bandwidth and performance grow with number of data servers
● Standard clients (OS native or experiment)
● Some data can be offloaded to tape
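
The separation between namespace and data servers listed above can be sketched with a toy model. Everything here is hypothetical and much simpler than dCache itself: the class names `Namespace` and `Pool`, and the "fewest files first" placement rule, are assumptions made only to show that the namespace answers metadata questions while file contents move between client and pool.

```python
from typing import Dict, List


class Pool:
    """A data server holding file contents (toy model)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.files: Dict[str, bytes] = {}


class Namespace:
    """Single-rooted namespace: maps each path to the pool holding its data."""

    def __init__(self, pools: List[Pool]) -> None:
        self.pools = {pool.name: pool for pool in pools}
        self.location: Dict[str, str] = {}

    def create(self, path: str) -> Pool:
        # Placement rule (an assumption): pick the pool with the fewest files.
        pool = min(self.pools.values(), key=lambda p: (len(p.files), p.name))
        self.location[path] = pool.name
        return pool

    def locate(self, path: str) -> Pool:
        return self.pools[self.location[path]]


def write(namespace: Namespace, path: str, data: bytes) -> str:
    pool = namespace.create(path)  # metadata operation only
    pool.files[path] = data        # data goes straight to the pool
    return pool.name


def read(namespace: Namespace, path: str) -> bytes:
    return namespace.locate(path).files[path]  # lookup, then read from the pool
```

Because data never passes through the namespace in this model, adding a pool adds capacity and bandwidth without changing how clients address files.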

What are macaroons good for?

Processing data without user credentials / BOINC
    GET
    2. Request a macaroon
    3. Add caveats
    Response: 307
    GET
    4. Request data directly from dCache
What are macaroons good for?
HTTP 3rd party copies
    1. Request copy (FTS)
    2. Request a macaroon
    3. Add caveats
    4. COPY with embedded macaroon
    5. GET with macaroon
What are macaroons good for?

Enforcing catalogue permissions
    1. Request access to data (Rucio)
    2. Request a macaroon
    3. Add caveats
    4. Access data
Comparison: it’s what industry is doing…
Comparison: it’s what Open-Source is doing…
dCache Storage Events: Kafka

[Diagram with labels: “billing”, “Log”, “created”, “staged”]
dCache Server-Sent Events (SSE)
● Based on HTTP v1.1
● HTML 5 standard
        Support for many languages and web-browsers
● Initially adding support for inotify events
        (it’s how Linux does namespace notification)
● Plan to add:
    ● Locality change notification: flush, stage, …
    ● Transfer-related events
    ● QoS changes
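
Since SSE is a plain-text format carried over HTTP, a client needs very little code to consume it. The sketch below parses the generic `text/event-stream` framing defined by the HTML standard (fields `event`, `data` and `id`, with a blank line ending each event). It makes no assumption about dCache's endpoint or payloads, and `parse_sse` and `ServerSentEvent` are illustrative names rather than part of any library.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional


@dataclass
class ServerSentEvent:
    event: str
    data: str
    id: Optional[str]


def parse_sse(lines: Iterable[str]) -> Iterator[ServerSentEvent]:
    """Turn the lines of a text/event-stream body into events."""
    event_type = ""
    data: List[str] = []
    last_id: Optional[str] = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event, if any data was collected.
            if data:
                yield ServerSentEvent(event_type or "message", "\n".join(data), last_id)
            event_type, data = "", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        name, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if name == "event":
            event_type = value
        elif name == "data":
            data.append(value)
        elif name == "id":
            last_id = value  # the last seen id persists across events
```

The function accepts any iterable of lines, so it can be fed from a file, a test fixture or a streamed HTTP response body.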
Cheat sheet: Kafka vs SSE

                           Kafka                    SSE
Standard …                 Component                Protocol
What events does it see?   dCache internal events   Controlled
Main benefit               Easy integration         Built-in security
“Catch-up” storage         Memory & disk            Memory-only (currently)
Target audience            Site-level integration   Events for users
EOSC-Pilot demonstrator: EU-XFEL data ingest

[Diagram with labels: new RAW data; create derived data; store derived file; extract metadata; update catalog; metadata catalog]
Rucio demonstrator: automated replication with SSE

[Diagram with labels: Upload; New data (SSE); Rucio; Third-party copy]
Future directions
● Complete SSE inotify support in dCache.
● Add additional events, based on initial feedback.
● Further explore automated data workflow (EU-XFEL use case).
● Work with Rucio team to explore SSE integration.
● Work with dCache sites to deploy storage events in production.