A View of DUNE Production Software - BV August 10, 2020 - Indico

A View of DUNE Production Software

                            BV

                  August 10, 2020

High-level View of DUNE FD TPC Processing

Stage 1: noise filter + signal processing.
    - Fairly well defined stage.
    - Noise filtered, deconvolved signal regions of interest (ROI), zero suppressed.
        * ≈ 10^3 data volume reduction.
Stage 2+: reconstruction
    - WC and LS branches take fundamentally different approaches.
    - Some comparisons between later-stage products possible.
    - WC 3D → LS 2D cross-pollination possible (e.g. flash matching).
Caveats:
    - Conceptual view, ignoring many details.
    - Light sim and processing not depicted.
        * Less computation than TPC.
        * WCP has flash matching.
    - ND processing is an unknown creature (to me).

Processing flow (diagram):
    DAQ → raw packed
    Stage 1 Data Production: raw decode → WC noise filter → WC signal processing → signal-ROI waveforms
    Stage 1 MC Production: Int. Kinematics, Geant4 → WC TPC Sim → WC noise filter → WC signal processing → signal-ROI waveforms
    Stage 2+ Production (3D-first wire-cell): WC 3D imaging → WC 3D clustering → WC flash matching → wire-cell reco → Wire-Cell tuples
    Stage 2+ Production (2D-first larsoft): 2D Gaussian wf fits → LS 2D clustering → track stitching → LS flash matching → pandora reco → LArSoft tuples
    Analysis: Wire-Cell Based Analysis, Joint Analysis, LArSoft Based Analysis

Stage 1 Data Production Issues - Data Size

[Diagram: raw packed → Stage 1 Data Production (raw decode → WC noise filter → WC signal processing) → signal-ROI waveforms]

DUNE FD 10kt “event” readout is large compared to the rule-of-thumb memory budget of 2 GB / CPU core
      10kt-5ms: 6 GB packed, 15 GB unpacked in memory.
      SNB 10kt-100s is 20,000 times larger.
      APA-5ms: 40 MB packed, 100 MB in memory, very reasonable.
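
The sizes above can be cross-checked with a few lines of arithmetic. This is only a sketch: the figure of 150 APAs per 10kt module is borrowed from the 150-APA number quoted for MC production, decimal units (1 GB = 10^9 bytes) are assumed, and all variable names are illustrative.

```python
# Decimal byte units (an assumption; the slide does not say GB vs GiB).
MB = 1000**2
GB = 1000**3
TB = 1000**4

n_apa = 150                    # APAs per 10kt module (assumed, see lead-in)
packed_10kt_5ms = 6 * GB       # packed readout, one 10kt module, 5 ms
unpacked_10kt_5ms = 15 * GB    # same readout unpacked in memory
budget_per_core = 2 * GB       # rule-of-thumb memory per CPU core

# Per-APA share of one 5 ms readout.
packed_per_apa = packed_10kt_5ms // n_apa
unpacked_per_apa = unpacked_10kt_5ms // n_apa

# A 100 s supernova-burst (SNB) readout relative to a 5 ms readout,
# computed in integer microseconds to avoid floating-point rounding.
snb_factor = 100_000_000 // 5_000
snb_packed = packed_10kt_5ms * snb_factor

# How many cores' worth of memory one unpacked 10kt readout occupies.
cores_worth = unpacked_10kt_5ms / budget_per_core

print(f"per-APA packed:   {packed_per_apa / MB:.0f} MB")
print(f"per-APA unpacked: {unpacked_per_apa / MB:.0f} MB")
print(f"SNB factor:       {snb_factor}")
print(f"SNB packed:       {snb_packed / TB:.0f} TB")
print(f"cores' worth:     {cores_worth}")
```

Under these assumptions the per-APA numbers reproduce the 40 MB and 100 MB quoted above, a single unpacked 10kt readout takes the memory budget of 7.5 cores, and a packed SNB readout comes to about 120 TB.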

Per-APA Stage 1 processing is desired and required
       DAQ will likely produce per-APA files.
       No Stage 1 processing requires multi-APA info.
           - NF/SP operates on a per-APA basis in any case.
       SNB needs yet more fragmentation.
           - Slice 100 s into ≈ 5-10 ms per-APA chunks; adjacent chunks must reuse (overlap by) ≈ 100 µs to avoid edge effects.
       Residual care needed:
           - ROOT overhead in art / LArSoft adds memory pressure, would like to avoid entirely.
           - WCT NF/SP is “light weight” enough for per-APA 5-10 ms chunks (PDSP processing is okay).
           - WCT can operate in multi-threaded mode to better utilize memory on 3+ cores.
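
The SNB slicing can be sketched as follows. This is a hypothetical helper, not WCT or DAQ code: it assumes each chunk re-reads the last ≈ 100 µs of the previous chunk (a leading overlap only), and the function name and microsecond units are choices made for illustration.

```python
def snb_chunks(total_us, chunk_us, overlap_us):
    """Yield (start, stop) readout windows in microseconds.

    Each window after the first starts overlap_us before its nominal
    boundary, so it reuses the tail of the previous window.
    """
    start = 0
    while start < total_us:
        yield max(0, start - overlap_us), min(total_us, start + chunk_us)
        start += chunk_us


# 100 s readout, 5 ms chunks, 100 us reused between neighbours.
chunks = list(snb_chunks(100_000_000, 5_000, 100))

print(len(chunks))   # number of per-APA chunks
print(chunks[0])     # first window, no leading overlap
print(chunks[1])     # second window, starts 100 us early
print(chunks[-1])    # last window, ends at the end of the readout
```

With 5 ms chunks the reused 100 µs adds about 2% to the data each job reads.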
Stage 1 Data Production Issues - Multi-processing Issues

    Multi-threading CPU
      - WCT has a pipelined, multi-threaded execution mode.
      - Means multiple “events” in flight at once.
      - Higher total memory but lower per-thread usage.
      - Best exploited if Stage 1 data processing runs as a pure-WCT job.
          * N-1 threads are wasted in the non-WCT, single-threaded (ST) portion of a job.
    Hybrid CPU/GPU production environment becoming required
      - WCT SP has new DNN ROI (Haiwang) ML inference.
          * ≈ 20 s / APA / 5 ms on CPU → 200 ms on GPU.
      - More GPU usage to come:
          * CCE-PPS (CSI+Haiwang) attacking WCT sim/SP hot spots with GPU.
    GPU is “too” fast
      - Need O(100) CPU cores to avoid idle GPUs with SP tasks.
    WCT developing multi-job/multi-host shared use of GPU.
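
A rough balance estimate shows where an O(100) figure can come from. This is a sketch under a loudly stated assumption: the CPU-side work that each core does per APA per 5 ms before handing a task to the GPU is taken to be of the same order as the quoted 20 s, which the slide does not state. The function name is hypothetical.

```python
def cores_to_saturate_gpu(cpu_ms_per_task, gpu_ms_per_task):
    """CPU workers needed so a new task is ready each time the GPU finishes one.

    Integer ceiling division; times are in milliseconds.
    """
    return -(-cpu_ms_per_task // gpu_ms_per_task)


# Quoted DNN ROI inference times per APA per 5 ms.
cpu_inference_ms = 20_000
gpu_inference_ms = 200
speedup = cpu_inference_ms // gpu_inference_ms

# ASSUMPTION: CPU-side work per task is also about 20 s.
cores = cores_to_saturate_gpu(20_000, gpu_inference_ms)

print(speedup)
print(cores)
```

If the CPU-side work per task is shorter, fewer cores are needed per GPU; the point is only that a single job on a few cores leaves the GPU mostly idle, which motivates sharing it across jobs and hosts.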

Stage 1 MC Production Issues

[Diagram: Stage 1 MC Production: Int. Kinematics, Geant4 → WC TPC Sim → WC noise filter → WC signal processing → signal-ROI waveforms]

WCT relies on LArSoft for int./Geant4 interface
    Relatively low-CPU compared to NF/SP.
    Many small “energy depos” produce unwanted overhead when transferred to WCT via art data products.
    LS also handles light sim, WCT has nothing here.
    Possibly develop direct Geant4-WCT integration.
       - EDEP library good candidate
           * (McGrew, CAPTAIN and DUNE ND)
       - WCT-native Geant4 not out of the question.
Scaling WCT sim to 150 APAs? Probably OK.
    PDSP 6 APA posed no substantial problem.
    ST: no known issues.
    MT: known TBB “N-fanout” limitation, work-arounds exist.
    Easy feature: avoid producing APAs with only noise + ³⁹Ar.
    Still, may expect scaling issues, we just need to try it!
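
The “easy feature” could look something like the selection below. This is a minimal sketch, not WCT code: the mapping from APA number to a list of energy depositions, the tuple layout of a deposition, and the function name are all hypothetical.

```python
def apas_with_signal(depos_by_apa, min_depos=1):
    """Return the sorted APA ids that received at least min_depos depositions.

    APAs not in the result would contain only noise and 39Ar, so their
    simulation could be skipped.
    """
    return sorted(
        apa for apa, depos in depos_by_apa.items() if len(depos) >= min_depos
    )


# 150 APAs, only two of which see energy depositions from the interaction.
depos_by_apa = {apa: [] for apa in range(150)}
depos_by_apa[17] = [(1.2, 0.5), (1.3, 0.7)]   # hypothetical (time, charge) pairs
depos_by_apa[18] = [(1.4, 0.2)]

active = apas_with_signal(depos_by_apa)
print(active)
print(len(depos_by_apa) - len(active), "APAs skipped")
```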

Stage 2+ 3D-first reconstruction - Prototype vs Toolkit

[Diagram: signal-ROI waveforms → Stage 2+ Production (3D-first wire-cell): WC 3D imaging → WC 3D clustering → WC flash matching → wire-cell reco → Wire-Cell tuples]

Wire-Cell Prototype (WCP)
    Focus on functionality, features, fast development.
    WCP code is a working prototype
       - All late-stage Wire-Cell®™© reco for MicroBooNE.

Wire-Cell Toolkit (WCT)
    Focus on functionality, features, configurability, performance, generality, maintenance.
    Proven ideas in WCP are ported to WCT
       - Typically completely rewritten based on understanding the algorithms
       - Sometimes, better algorithms used instead
       - Coded to fit WCT patterns, standards, conventions, etc.
       - Very labor intensive, but result is high quality

Stage 2+ 3D-first reconstruction - MicroBooNE → DUNE

[Diagram: signal-ROI waveforms → Stage 2+ Production (3D-first wire-cell): WC 3D imaging → WC 3D clustering → WC flash matching → wire-cell reco → Wire-Cell tuples]

MicroBooNE uses WCT + WCP
    WCT NF/SP via art module.
    WCP hooked in via art modules + exchange files.
DUNE needs WCP algorithms (in WCT!)
    Now: WCT only has initial 3D imaging stages
       - “tiling” and “charge solving”
    World-class LArTPC reco algs resulting from a huge investment by EDG uBooNE effort must not be squandered on “just” uBooNE.
Substantial effort needed to DUNE’ify and WCT’ify
    Prior wild guess is 5 FTE-years of expert physicist/developer effort.
       - Do actual porting, optimization, re-inventing
       - Physics validation is needed and additional.
    Now, maybe higher: more WCP development since then.
    Possible to make “short cuts” in porting, but may not save much effort in near term, likely more effort in long term.

Some WCT Future Directions

DUNE offline framework task force
    Led by Paul (+A. Norman, FNAL), to deliver a requirements document.
    IMO: WCT could be “it” but politics will likely trump technical merit.
    WCT runs now, and will run, in whatever (reasonable) framework we end up with.
WCT is developing “distributed I/O”
    Connect multiple sources/sinks of data to multiple algorithms on multiple threads.
    Distribute “live” data between processes / hosts.
    Target DUNE’s big problems:
       - “Large event” problem: spread data across many threads, reduce memory/core, keep “event coherency”, avoid bookkeeping worries.
       - CPU/GPU heterogeneity: share GPU among many threads/procs/hosts.
    CPU/GPU heterogeneity is also being attacked with a “local” code solution
       - Haiwang+CSI / CCE-PPS using “Kokkos” library shim over h/w

My “vision” for us and DUNE production software

   Shore up the investment in WCP algorithm development for
   MicroBooNE by porting into WCT while optimizing, generalizing and
   reinventing.
   Apply WCT distributed I/O to “large event” and “CPU/GPU
   heterogeneity” problems.
   Continue to strengthen collaboration with CSI on GPU acceleration and ML,
   both continuing our current initiatives and pursuing new ones.
