DCache: An Overview - CERN Indico

Page created by Jacob Klein
dCache: An Overview
             Paul Millar
   on behalf of the dCache team

 Nordic Data Management Workshop
        Oslo, Norway; 2019-02-27

         eXtreme DataCloud is co-funded by the Horizon2020
         Framework Program – Grant Agreement 777367
         Copyright © Members of the XDC Collaboration, 2017-2020
Scientific data challenges
    Fast ingest
    Chaotic Access
    Sharing data
    Access Control
    Persistence & long-term

                              dCache: An Overview |   | 2019-02-27 | 2
[Diagram]
    Data management & workflow control (Rucio, Kafka, SSE)
    High Speed Data Ingest
    Fast Analysis (NFS 4.1/pNFS)
    Interactive analysis & Sharing
    Wide Area Transfers (Globus Online, FTS) by GridFTP, HTTP
               Belle II
               And many more ...

Flexibility that works …
    Supports many authentication schemes: username+password,
    X.509, Kerberos and OpenID-Connect:
        Integrates with existing infrastructure + pluggable for flexibility,
        Users have same rights, irrespective of how they authenticate.
    Supports delegated authorisation, using Macaroons.
    Multiple protocols: (Grid)FTP, HTTP/WebDAV, SRM, xrootd, NFS
    v4.1/pNFS and dcap.
        Using different protocols, users will see the same data.

dCache innovations:
 Storage Events
Storage events: the problems
    [Diagram, left] Catalogue: Rucio/LFC/…
        Upload a file → OK
        Delete a file
    [Diagram, right] Stage files from tape → Request queued
        Are these files on disk? → no, no, no, …
        Are these files on disk? → no, no, no, …
        Are these files on disk? → no, YES, no, …
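
The cost of the polling pattern above can be estimated with a toy simulation. This is only a sketch: the stage times, the poll interval and the function name are made up for illustration, and nothing here talks to dCache.

```python
def count_polls(stage_done_at, poll_interval):
    """Count per-file "is it on disk?" queries until every file is staged.

    stage_done_at: seconds after the stage request at which each file
    lands on disk (hypothetical values). Every file still missing is
    re-checked once per poll_interval.
    """
    queries = 0
    now = 0
    pending = list(stage_done_at)
    while pending:
        now += poll_interval
        queries += len(pending)  # one status query per pending file
        pending = [t for t in pending if t > now]
    return queries


# Three files staged after 40 s, 90 s and 300 s, polled every 10 s.
polls = count_polls([40, 90, 300], poll_interval=10)
notifications = 3  # an event-based client would receive one message per file
print(polls, notifications)
```

The number of queries grows with both the number of files and the staging latency, whereas the number of notifications grows only with the number of files.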
9000 stats per second!

A new approach: storage events
    Subscribe to events
        Something happened #1
        Something happened #2
        Something happened #3

New solutions to old problems:
    User- and internally triggered events:
        Data uploaded
        Data deleted/renamed/moved
        Tape flush/stage operations
    Uses: update catalogue, metadata extraction, data normalisation,
    build derived data, …
    Two event systems:
        Site integration (Kafka)
        Per user events (SSE/inotify)
    (DEMO :-)
    [Diagram]
        Subscribe … → OK
        OK → File uploaded
        Delete → OK → File deleted
        Stage files → Request queued
        Subscribe … → File #16 on disk
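
The "update catalogue" use can be sketched as a small event handler. The event dictionaries below (the `type`, `path` and `new_path` fields and their values) are hypothetical and do not reproduce dCache's actual Kafka or SSE message format; the point is only that each event maps to one catalogue update, with no polling.

```python
def apply_event(catalogue, event):
    """Update an in-memory catalogue (path -> state) from one storage event."""
    kind, path = event["type"], event["path"]
    if kind == "uploaded":
        catalogue[path] = {"on_disk": True}
    elif kind == "deleted":
        catalogue.pop(path, None)
    elif kind == "moved":
        # Carry the existing state over to the new path.
        catalogue[event["new_path"]] = catalogue.pop(path, {"on_disk": True})
    elif kind == "staged":
        catalogue.setdefault(path, {})["on_disk"] = True
    elif kind == "flushed":
        catalogue.setdefault(path, {})["on_tape"] = True
    return catalogue


# Hypothetical event stream, in the order the storage system emitted it.
events = [
    {"type": "uploaded", "path": "/data/run1.raw"},
    {"type": "flushed", "path": "/data/run1.raw"},
    {"type": "moved", "path": "/data/run1.raw", "new_path": "/archive/run1.raw"},
    {"type": "uploaded", "path": "/data/tmp.raw"},
    {"type": "deleted", "path": "/data/tmp.raw"},
]

catalogue = {}
for ev in events:
    apply_event(catalogue, ev)
print(catalogue)
```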
dCache innovations:
Distributed storage & Data Lakes
Data Lakes: distributed resources
    dCache has over a decade of production use as a data lake:
        NDGF is a distributed dCache, spread over five countries.
        AGLT2 is a distributed dCache, spread over two campuses.
    dCache can already provide protocol-based QoS; e.g., cache data for
    NFS access, read remotely for HTTP/GridFTP.
    Currently building new testbed to demonstrate existing solutions and
    improve upon them:
        Hamburg → Zeuthen (RTT: ~5 ms); Hamburg → Moscow (RTT: ~70 ms)
    Adding ability to provide cached data when detached:
        A “satellite” can offer data if disconnected from the rest of dCache.

Data Lakes: cloud bursting
    dCache stores data in either a local filesystem or as objects within a
    CEPH cluster.
    Two new developments:
        Storing data within an S3 endpoint
        Dynamic pools: just start a dCache pool and that capacity becomes usable.
    Together, these support the cloud bursting use-case:
        As cloud capacity “comes online”, either due to load (cloud burst) or because
        resources are cheap (Amazon grants), start a dCache pool.
        Jobs can run “in the cloud” with dCache taking care of any data movement.

dCache innovations:
Delegated Authorisation with Macaroons
Macaroons: delegated authorisation

                                 Photo by Alan Cleaver (CC-BY)

Example use: community portals / BOINC
    1. Request data
    2. Request a macaroon
    3. Request data directly from dCache (GET)
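
Step 2 might look like the request below. This is a sketch under assumptions: the host name and path are placeholders, and the `application/macaroon-request` content type and the `caveats`/`validity` fields are how the author of this example understands dCache's macaroon request format, so they should be checked against the dCache documentation. The request is only built, not sent, and the portal's own authentication is left out.

```python
import json
import urllib.request


def build_macaroon_request(url, caveats, validity="PT1H"):
    """Build (but do not send) a POST asking the storage for a macaroon."""
    body = json.dumps({"caveats": caveats, "validity": validity}).encode()
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        # Assumed content type for macaroon requests.
        headers={"Content-Type": "application/macaroon-request"},
    )


# Placeholder URL; the caveat limits the macaroon to downloads.
req = build_macaroon_request(
    "https://dcache.example.org/data/run1.raw", ["activity:DOWNLOAD"]
)
print(req.get_method(), req.full_url)
print(req.data.decode())
```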

Example use: ad-hoc sharing
    1. Request a macaroon
    2. Send to colleague (e.g. via email)
    3. Use macaroon
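
For step 2, one way to hand a macaroon to a colleague is to embed it in the link itself. The sketch below assumes the storage accepts the macaroon in an `authz` query parameter (to be confirmed for the deployed dCache version); the URL and the macaroon value are placeholders.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def share_link(url, macaroon):
    """Return url with the macaroon appended as an 'authz' query parameter."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("authz", macaroon))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder macaroon; a real one is a much longer opaque string.
link = share_link("https://dcache.example.org/data/plots.tar", "MDAxY2xvY2F0aW9u")
print(link)
```

Anyone holding the link can act within the macaroon's caveats, so a short validity and a narrow path caveat are sensible before sending it by email.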

dCache Workshop: 2019-05-21 to 2019-05-22
    Located in Madrid, Spain.
    Learn more about latest
    developments in dCache
    Opportunity to discuss issues directly
    with dCache developers
    Share stories with dCache admins
    Help shape the future direction of dCache
The take-home message
    dCache is advanced storage software for data-intensive science:
        has decades of production use throughout the world,
        provides scalable resources, used by many scientific disciplines,
        offers innovative solutions that help drive the next generation
        of scientific discovery.

Backup slides
dCache 101: Motivation
    Data never fits into a single server
        Multiple servers
        Off-load to tape
    Growing number of client hosts
        Mainframe vs Linux cluster
    Control over hardware/OS selection
        Better tender offers
        Use and enhance local expertise

dCache 101: Design
    Single-rooted namespace, distributed data
    Client talks to namespace for metadata operations only
    Bandwidth and performance grow with number of data
    Standard clients (OS native or experiment)
    Some data can be offloaded to tape
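
The split between namespace and data can be illustrated with a toy model. The class and function names are invented for this sketch and bear no relation to dCache's internals: the namespace answers only "where is this file?", while the bytes flow between the client and a pool.

```python
class Pool:
    """A data server: stores file contents and serves them to clients."""

    def __init__(self, name):
        self.name = name
        self.files = {}
        self.bytes_served = 0

    def used(self):
        return sum(len(payload) for payload in self.files.values())

    def write(self, path, payload):
        self.files[path] = payload

    def read(self, path):
        payload = self.files[path]
        self.bytes_served += len(payload)
        return payload


class Namespace:
    """Single-rooted namespace: metadata only, never touches file contents."""

    def __init__(self, pools):
        self.pools = list(pools)
        self.location = {}

    def allocate(self, path):
        # Pick the emptiest pool for a new file and remember the choice.
        pool = min(self.pools, key=lambda p: p.used())
        self.location[path] = pool
        return pool

    def locate(self, path):
        return self.location[path]


def upload(namespace, path, payload):
    namespace.allocate(path).write(path, payload)  # data goes to the pool


def download(namespace, path):
    return namespace.locate(path).read(path)  # data comes from the pool


pools = [Pool("pool-a"), Pool("pool-b")]
ns = Namespace(pools)
upload(ns, "/data/a", b"x" * 100)
upload(ns, "/data/b", b"y" * 50)
download(ns, "/data/a")
download(ns, "/data/b")
print({p.name: p.bytes_served for p in pools})
```

Because each pool serves its own files, adding pools adds bandwidth without putting more data traffic through the namespace.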

What are macaroons good for?
    Processing data without user credentials / BOINC
        2. Request a macaroon
        3. Add caveats
        4. Request data directly from dCache (GET)
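
The "add caveats" step works because a macaroon's signature is a chain of HMACs: each caveat is folded into the previous signature, so a holder can narrow a macaroon without knowing the server's secret, but cannot remove a caveat. The sketch below shows only that general construction with plain dictionaries; it is not dCache's implementation or wire format, and the key, identifier and caveat strings are placeholders.

```python
import hashlib
import hmac


def _mac(key, message):
    return hmac.new(key, message, hashlib.sha256).digest()


def mint(root_key, identifier):
    """Server side: create a macaroon with no caveats."""
    return {"identifier": identifier, "caveats": [],
            "signature": _mac(root_key, identifier)}


def add_caveat(macaroon, caveat):
    """Holder side: attenuate using only the current signature as the key."""
    return {"identifier": macaroon["identifier"],
            "caveats": macaroon["caveats"] + [caveat],
            "signature": _mac(macaroon["signature"], caveat)}


def verify(root_key, macaroon):
    """Server side: recompute the chain and compare signatures.

    This checks integrity only; whether each caveat is satisfied by the
    request must be evaluated separately.
    """
    signature = _mac(root_key, macaroon["identifier"])
    for caveat in macaroon["caveats"]:
        signature = _mac(signature, caveat)
    return hmac.compare_digest(signature, macaroon["signature"])


root_key = b"server-side secret"             # placeholder
broad = mint(root_key, b"user=alice")         # placeholder identifier
restricted = add_caveat(broad, b"activity:DOWNLOAD")
forged = dict(restricted, caveats=[])         # caveat stripped by an attacker

print(verify(root_key, restricted), verify(root_key, forged))
```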
What are macaroons good for?
    HTTP 3rd party copies (FTS)
        1. Request copy
        2. Request a macaroon
        3. Add caveats
        4. COPY with embedded macaroon
        5. GET with macaroon
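
Step 4 might be expressed as the request below. The sketch assumes the pull-mode convention for HTTP third-party copy as this example's author understands it: the `COPY` goes to the destination, a `Source` header names the file to fetch, and a header prefixed with `TransferHeader` is forwarded (prefix stripped) on the destination's `GET` in step 5. Host names and the macaroon are placeholders; the request is built but not sent, and the caller's own authorisation at the destination is omitted.

```python
import urllib.request


def build_tpc_pull(destination_url, source_url, source_macaroon):
    """Build (but do not send) a pull-mode third-party COPY request."""
    return urllib.request.Request(
        destination_url,
        method="COPY",
        headers={
            "Source": source_url,
            # Assumed to be forwarded to the source as "Authorization".
            "TransferHeaderAuthorization": f"Bearer {source_macaroon}",
        },
    )


req = build_tpc_pull(
    "https://dest.example.org/data/run1.raw",
    "https://source.example.org/data/run1.raw",
    "MDAxY2xvY2F0aW9u",
)
print(req.get_method(), req.full_url)
for name, value in req.header_items():
    print(f"{name}: {value}")
```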
What are macaroons good for?
    Enforcing catalogue permissions (Rucio)
        1. Request access to data
        2. Request a macaroon
        3. Add caveats
        4. Access data
Comparison: it’s what industry is doing…
Comparison: it’s what Open-Source is doing…
dCache Storage Events: Kafka
    [Diagram labels: billing, Log]

dCache Server-Sent Events (SSE)
    Based on HTTP v1.1
    HTML 5 standard
        Support for many languages and web-browsers
    Initially adding support for inotify events
        (it’s how Linux does namespace notification)
    Plan to add:
        Locality change notification: flush, stage, …
        Transfer-related events
        QoS changes
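
Since SSE is a plain-text format carried over HTTP, a client needs very little code to consume it. The parser below is a minimal sketch of the standard `event:`/`data:`/`id:` framing; the sample stream, including the `inotify` event name and the JSON payload, is invented for illustration and is not dCache's actual message layout.

```python
def parse_sse(lines):
    """Yield (event, data, last_id) tuples from an iterable of text lines."""
    event, data, last_id = "message", [], None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data:
                yield event, "\n".join(data), last_id
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)
        elif field == "id":
            last_id = value


# Hypothetical stream, as it might be read line by line from an HTTP response.
stream = [
    ": keep-alive\n",
    "event: inotify\n",
    "id: 7\n",
    'data: {"mask": ["IN_CREATE"], "name": "run1.raw"}\n',
    "\n",
]
for event, data, last_id in parse_sse(stream):
    print(event, last_id, data)
```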
Cheat sheet: Kafka vs SSE

                            Kafka                     SSE
Standard …                  Component                 Protocol
What events does it see?    dCache internal events    Controlled
Main benefit                Easy integration          Built-in security
“Catch-up” storage          Memory & disk             Memory-only
Target audience             Site-level integration    Events for users
EOSC-Pilot demonstrator: EU-XFEL data ingest
    [Diagram labels: new RAW data; create metadata; store derived file metadata; update catalog]
Rucio demonstrator: automated replication with SSE
    [Diagram labels: Upload; New data; SSE; Third-party copy]
Future directions
    Complete SSE inotify support in dCache.
    Add additional events, based on initial feedback.
    Further explore automated data workflow (EU-XFEL
    Work with Rucio team to explore SSE integration.
    Work with dCache sites to deploy storage events in