Belle II data production - Jake Bennett for the data production team July 2021 BPAC

Belle II data production
Jake Bennett for the data production team
July 2021 BPAC
Belle II data
•         Data available for summer analyses (~140/fb total):
          - Latest official reprocessing (proc12): 65.4/fb @ 4S + 6.9/fb off-resonance
          - Prompt 2021 data processed: 65.9/fb @ 4S + 2.6/fb off-resonance

•         Remaining 2021 data: 73.8/fb
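
As a quick cross-check of the quoted total, the four available samples can simply be added up. This is only an arithmetic sketch; the dictionary keys are labels chosen here for illustration, and the values are the luminosities listed above.

```python
# Integrated luminosities in /fb, copied from the bullets above.
# The key names are illustrative labels, not official sample names.
samples = {
    "proc12 4S": 65.4,
    "proc12 off-resonance": 6.9,
    "prompt 2021 4S": 65.9,
    "prompt 2021 off-resonance": 2.6,
}
remaining_2021 = 73.8  # /fb, still to be calibrated and processed

available = sum(samples.values())
on_resonance = samples["proc12 4S"] + samples["prompt 2021 4S"]

print(f"available for summer analyses: {available:.1f} /fb")
print(f"of which on-resonance:         {on_resonance:.1f} /fb")
print(f"available + remaining 2021:    {available + remaining_2021:.1f} /fb")
```

The sum comes to 140.8/fb, consistent with the "~140/fb total" quoted on the slide.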

A reminder about nomenclature
             •   Data processing
                 -     Unofficial reprocessing: fast reprocessing of data immediately after availability in the offline system
                       •   Not for physics publications, important for fast feedback on detector performance, no longer active
                       •   Special processings upon request, depending on resource availability
                 -     Prompt reprocessing: first processing with automated calibration (internal terminology - bucketXX)
                       •   Automated calibration with airflow runs at BNL, mDST production at raw data centers
                       •   Not yet intended for physics publications, ok for conference presentations
                       •   Once calibration algorithms and workflow are mature, may be used to top up official samples
                 -     Official reprocessing: careful calibration and validation of results for physics publications
                       •   Automated calibration with airflow runs at DESY, mDST production at raw data centers
                       •   Part of official reprocessing campaigns (procYY), subdivided by experiment (internal terminology - chunkZZ)
             •   Official (internal) terminology: “procYY + prompt (AA/ab)”

             •   MC production
                 -     Unique campaign names for different releases, global tags, conditions
                 -     Run-independent samples use simulated backgrounds and default conditions (MCXXri_a, MCXXri_b, etc.)
                 -     Run-dependent samples use random trigger events from data and real conditions (MCXXrd_a, MCXXrd_b, etc.)
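
The MC naming convention above is regular enough to be checked mechanically. The sketch below is an illustration only: the regular expression is inferred from the two example patterns on this slide (MCXXri_a, MCXXrd_b), and `parse_campaign` is a hypothetical helper, not part of any Belle II tool. Other campaign spellings used elsewhere would need their own patterns.

```python
import re

# Hypothetical pattern inferred from the examples MCXXri_a and MCXXrd_b.
CAMPAIGN_RE = re.compile(r"^MC(?P<number>\d+)(?P<kind>ri|rd)_(?P<label>[a-z])$")

# Meaning of the two sample types, as described on the slide.
KIND_DESCRIPTION = {
    "ri": "run-independent: simulated backgrounds, default conditions",
    "rd": "run-dependent: random trigger events from data, real conditions",
}


def parse_campaign(name):
    """Split a campaign name such as 'MC14ri_a' into its components."""
    match = CAMPAIGN_RE.match(name)
    if match is None:
        raise ValueError(f"not an MCXXri_x / MCXXrd_x campaign name: {name!r}")
    return {
        "number": int(match["number"]),
        "kind": match["kind"],
        "label": match["label"],
    }


for name in ("MC14ri_a", "MC14rd_b"):
    info = parse_campaign(name)
    print(name, "->", info, "|", KIND_DESCRIPTION[info["kind"]])
```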
Belle II data flow
[Figure: data flow diagram; hRAW = HLT skimmed raw data]

•   Extensive discussions between data processing, MC production, and distributed computing experts to maximize efficiency
Prompt processing and reprocessing scheme
      •   Data calibrated weekly in “buckets” containing ~9/fb
      •   Reprocess for final, physics-ready data

      [Figure: prompt “buckets” (e.g. bucket22) and official recalibration (e.g. proc12)]

      •   Important: efficient use of resources to finish ~weekly calibration

proc12 - the first recalibration in the new scheme
     •   proc12 was a new experience, and required several adjustments
         - HLT skim logic was drastically improved in exp14
            • prompt cDST for exp 7-12 had to be re-produced from scratch, recalculating the HLT skim assignment
         - Disk at DESY became available only in late fall
            • cDST could not be moved there until very late
         - Technical issues while running at DESY (data staging took 14 days for re-calibration of exp12)
         - Exp 7-10 integrated very little luminosity
            • The first chunk merges them together, requiring adjustment of the scripts
         - Calibration scripts had to be adapted
            • Work with cDST as input instead of RAW, integrate data from different experiments

     •   Despite numerous tests and preparation, proc12 was greatly delayed by
         - Inaction/mistakes by the calibration experts
         - Unexpected SW issues when running over large datasets
         - Depending on one single airflow manager

     proc12 logbook: https://hackmd.io/@Hqns4j77TjW1vfIUdN_6Vg/Sk6eLx4G_
Improvements to calibration
•   Fully* automated loop (including validation)
    - Still some production-related manual work (by managers)
    - Not all calibrations have integrated validation scripts
    - Some calibrations require manual intervention
•   Better use of HTCondor backend (failed jobs automatically resubmitted)
•   Optimized calibrations
•   Increased computing resources at BNL

•   Calibration time quickly becoming a bottleneck (collecting >1/fb/day)
    - Now increasing the bucket size from 10/fb to 18/fb
    - Will eventually scale with luminosity
    - Prescale determination is critically important, since hRAW samples cannot be easily reprocessed (requires staging of full raw data for skimming)

* Note: experts may exclude automated calibration results and “recycle” constants from the previous bucket to avoid poor quality
Prompt calibration performance
•   Prompt calibration is regularly run in 7 days (+2 for validation)
    - Latest calibrations met the goal for 2021 calibration time

•   Excellent work by the calibration team!

•   Great support by BNL, DESY, and the distributed computing group!

•   Recent improvements in understanding of trigger payloads will further reduce time pressure
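
To see why the bucket size matters, the 7 (+2) day turnaround can be combined with the collection rate (>1/fb/day) and the bucket sizes (10/fb, 18/fb) quoted in this talk. The sketch below rests on two simplifying assumptions that are not from the talk: buckets are calibrated strictly one after another, and the turnaround does not depend on bucket size. The function name is hypothetical.

```python
def max_sustainable_rate(bucket_size_fb, turnaround_days):
    """Largest collection rate (/fb/day) that a single calibration loop can
    keep up with, assuming sequential buckets and a turnaround that is
    independent of the bucket size (both are simplifying assumptions)."""
    return bucket_size_fb / turnaround_days


TURNAROUND_DAYS = 7 + 2  # calibration plus validation, as quoted above

rates = {}
for bucket_size in (10.0, 18.0):  # old and new bucket sizes in /fb
    rates[bucket_size] = max_sustainable_rate(bucket_size, TURNAROUND_DAYS)
    print(f"{bucket_size:4.0f}/fb buckets keep up with "
          f"{rates[bucket_size]:.2f} /fb/day")
```

Under these assumptions a 10/fb bucket only just keeps pace with a collection rate slightly above 1/fb/day, while 18/fb buckets leave headroom up to about 2/fb/day.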

Prompt processing performance
•   We can regularly process 2-3/fb/day on the GRID
•   Fantastic work by the processing managers and the DC team!
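
The quoted processing rate can be put in perspective with the 73.8/fb of 2021 data still to be processed. The sketch below is a rough estimate only: it assumes a constant rate and ignores calibration and staging, which this talk identifies as the actual limiting factors.

```python
# Numbers quoted in this talk; the projection itself is an illustration.
REMAINING_FB = 73.8                  # remaining 2021 data in /fb
RATE_RANGE_FB_PER_DAY = (2.0, 3.0)   # quoted grid processing rate


def processing_days(luminosity_fb, rate_fb_per_day):
    """Days of pure grid processing needed at a constant rate."""
    return luminosity_fb / rate_fb_per_day


# The lower rate gives the slower estimate, the higher rate the faster one.
slow, fast = (processing_days(REMAINING_FB, rate) for rate in RATE_RANGE_FB_PER_DAY)
print(f"{fast:.1f} to {slow:.1f} days of grid processing")
```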

Data readiness for summer (actual)

Streamlining data processing
•   Calibration for proc13 will start in October
    - Earlier than proc12 to finish in time
    - Using release-06, validated in summer
    - Special reprocessing of 2021b bucket in autumn to validate software for calibration
    - Reprocess everything collected through 2021b

•   Expected dataset for Moriond 2022:
    - proc12 (rel-05)
    - prompt 2020c+2021ab (rel-05)
    - prompt 2021c (rel-06, to be discussed)

•   Expected dataset for Summer 2022:
    - proc13 (rel-06)
    - prompt 2021c + 2022ab (rel-06)

Analysis skimming improvements and plans
•   Large number of uDST files due to run structure
    - One or a few jobs per run
    - Even with many input files per user job, running on the grid with data will remain inconvenient
    - Heavy I/O overhead, downloading many files is slow
•   Proposal: drop run structure in the final uDST sample
    - Reminder: the run number and experiment number are stored in the metadata of each file

    Sample                       Skim Production Status
    MC14_ri generic              Done
    MC14_ri low multiplicity     In progress
    MC14_rd                      To be prepared
    Proc12 chunk1                Almost Ready
    Proc12 chunk2                In progress
    Bucket16-18                  Done
    Bucket19-21                  In progress

•   Signal MC skims are requested by the analysts (typically common for analyses in a working group)
•   Proposal: Skims of signal MC will be produced by skim liaisons for each working group and stored under a specific group area on the grid

•   Certain analysts still rely on mDST
    - Available sooner than uDST, some analyses cannot use skims
    - Running on mDST increases job processing time on the grid (heavy traffic)
•   Proposal: Optimize gbasf2 analysis tools for uDST processing, but allow mDST use

•   MC production runs quickly, but slowed by
    - storage availability
    - preparation of run-dependent payloads
•   New tools will help
    - Rucio automated deletion (e.g. intermediate files)
    - More efficient user tools for deletion
•   Processing running very smoothly
    - Still calibration limited
    - Some delay from staging
    - Low luminosity requires less MC

[Figure: plot with labels MC13, MC14a, MC14ri_a/b, MC14rd_b, and proc12 + prompt; annotation: MC production reduced to make room for analysis jobs]
Revised MC nomenclature

New MC production strategy

Improving MC production
•   Considering ways to improve the currently inefficient run-dependent MC production
    - GT/payload management must improve
    - Run dependence of input (BG overlay) files and payloads causes trouble
    - Need creative ideas to handle low-luminosity productions for which few events per run are needed

•   Proposals
    - Include payload preparation and sign-off in calibration airflow
    - Merge streams (x4) to reduce the number of productions to manage/use
    - Extend number of jobs per production (depends on prod. system updates)
    - Beam conditions manager (coordinate preparation of all beam-related payloads, BG overlay files)

Data and MC for summer analysis
•   Overview of data production plans for proc12 and 2021a/b prompt (details and links on DP status page)

              Remaining 2021a/b data to be calibrated and processed by mid-August (double-sized buckets 22-25)
Interface between data production (and computing) and users
•    gbasf2/gb2 tools (documentation - temporary place: http://linuxfarmb.phy.olemiss.edu/gbasf2/html/)
     - Significant development ongoing

•    Production managers upload production details to the dataset searcher and announce via email and Confluence

          gb2_ds_search dataset --campaign proc12 --general_skim hadron --skim_decay 10601300 --exp_low 12 --exp_high 12

     - Required (soon) input: general skim name (which sample was skimmed to obtain the sample)
     - Skim details can be found via the skim registry
     - Analysts need to know what goes into their analyses!
     - Running on uDSTs will save time!
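
For scripted lookups, the search command above can be assembled programmatically. The sketch below only builds and prints the command line; it uses exactly the flags shown in the example and makes no claim about other options of `gb2_ds_search`. The helper name `build_ds_search_command` is hypothetical.

```python
import shlex


def build_ds_search_command(campaign, general_skim, skim_decay, exp_low, exp_high):
    """Return the argument list for the dataset search shown above.

    Hypothetical helper: only the flags from the slide's example are used.
    """
    return [
        "gb2_ds_search", "dataset",
        "--campaign", campaign,
        "--general_skim", general_skim,
        "--skim_decay", str(skim_decay),
        "--exp_low", str(exp_low),
        "--exp_high", str(exp_high),
    ]


command = build_ds_search_command("proc12", "hadron", "10601300", 12, 12)
print(shlex.join(command))
```

Keeping the arguments as a list rather than a single string means they could be handed to `subprocess.run` without shell quoting issues, in an environment where the gbasf2 tools are set up.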

•    Luminosity determination is made twice, once in the online system (stored in RunDB) and once offline (not yet in RunDB)
     - Currently a backlog for offline calculation

                            $ b2info-luminosity --exp 12 --runs 0-99999 --what online
                            Read 4102 runs for experiment 12
                            TOTAL online   : L = 63404840.40 /nb = 63404.84 /pb =   63.405 /fb =            0.0634 /ab

                            $ b2info-luminosity --exp 12 --runs 0-99999 --what offline
                            Read 1574 runs for experiment 12
                            TOTAL offline   : L = 62553677.26 /nb = 62553.68 /pb =   62.554 /fb =           0.0626 /ab
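
The two totals can be compared directly once they are converted to a common unit. The sketch below parses the /nb value from the two TOTAL lines shown above (with whitespace condensed) and converts it using 1/fb = 10^6/nb. The regular expression is an assumption based on this sample output only, not on a documented output format of `b2info-luminosity`.

```python
import re

# TOTAL lines copied from the output above, with whitespace condensed.
ONLINE = "TOTAL online : L = 63404840.40 /nb = 63404.84 /pb = 63.405 /fb = 0.0634 /ab"
OFFLINE = "TOTAL offline : L = 62553677.26 /nb = 62553.68 /pb = 62.554 /fb = 0.0626 /ab"

NB_PER_FB = 1e6  # 1 /fb = 1e3 /pb = 1e6 /nb

# Assumed line format, inferred from the sample output only.
TOTAL_RE = re.compile(r"TOTAL\s+(\w+)\s*:\s*L\s*=\s*([\d.]+)\s*/nb")


def total_in_fb(line):
    """Extract the /nb total from a TOTAL line and convert it to /fb."""
    match = TOTAL_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognized line: {line!r}")
    return float(match.group(2)) / NB_PER_FB


online_fb = total_in_fb(ONLINE)
offline_fb = total_in_fb(OFFLINE)
relative_difference = (online_fb - offline_fb) / online_fb

print(f"online : {online_fb:.3f} /fb")
print(f"offline: {offline_fb:.3f} /fb")
print(f"offline is lower by {100 * relative_difference:.2f}% of the online total")
```

Note that the offline total above was read from fewer runs (1574 versus 4102), so the comparison should be interpreted with the offline backlog in mind.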

Many more details available!
     •   Summary pages for data readiness, links to more details (good pages to bookmark)

     •   Conference readiness: https://confluence.desy.de/display/BI/Conference+readiness
         -     Identifies recommended MC samples to use with officially processed data
         -     List of recommendations for data/MC comparisons
         -     Physics performance deliverables

     •   Data Production Status: https://confluence.desy.de/display/BI/Data+Production+Status
         -     Ongoing and planned productions for data, MC, analysis skims, etc.
         -     Links to find data/MC samples
         -     Status of computing resources

     •   Data production workshop reports:
         -     Fall 2020: https://docs.belle2.org/record/2116/files/BELLE2-NOTE-TE-2020-025.pdf
         -     Spring 2021: https://docs.belle2.org/record/2358/files/BELLE2-NOTE-TE-2021-013.pdf
         -     Data production plan: https://relativity.phy.olemiss.edu/~jbennett/DataProductionPlan_V2.1.pdf