# THE SPECTRE CAUCHY-CHARACTERISTIC EVOLUTION SYSTEM FOR RAPID, PRECISE WAVEFORM EXTRACTION


Jordan Moxon,1 Mark A. Scheel,1 Saul A. Teukolsky,1,2 Nils Deppe,1 Nils Fischer,3 Francois Hébert,1 Lawrence E. Kidder,2 and William Throwe2

1 Theoretical Astrophysics, Walter Burke Institute for Theoretical Physics, California Institute of Technology, Pasadena, CA 91125, USA.
2 Cornell Center for Astrophysics and Planetary Science, Cornell University, Ithaca, New York 14853, USA.
3 Max Planck Institute for Gravitational Physics (Albert Einstein Institute), Am Mühlenberg 1, D-14476 Potsdam, Germany

arXiv:2110.08635v1 [gr-qc] 16 Oct 2021

We give full details regarding the new Cauchy-characteristic evolution (CCE) system in SpECTRE. The implementation is built to provide streamlined flexibility for either extracting waveforms during the process of a SpECTRE binary compact object simulation, or as a standalone module for extracting waveforms from worldtube data provided by another code base. Using our recently presented improved analytic formulation, the CCE system is free of pure-gauge logarithms that would spoil the spectral convergence of the scheme. It gracefully extracts all five Weyl scalars, in addition to the news and the strain. The SpECTRE CCE system makes significant improvements on previous implementations in modularity, ease of use, and speed of computation.

I. INTRODUCTION

Since the original gravitational wave detections by the LIGO-VIRGO collaborations [1, 2], sensitivities of ground-based detectors have continued to advance [3, 4]. A crucial requirement for the successful detection and parameter estimation of astrophysical gravitational-wave sources is the accurate modelling of potential gravitational wave signals. Gravitational wave modelling is required both to construct templates for extracting signals from instrumentation noise [5, 6] and for performing follow-up parameter estimation [7–11]. Currently, the precision of numerical relativity waveforms is sufficient to cause no significant bias in detections produced by the present generation of gravitational wave detectors [12].

As the technology of the current network of gravitational wave detectors (Advanced LIGO [13], VIRGO, and KAGRA [14]) continues to mature, next-generation ground based interferometers (Cosmic Explorer [15] and Einstein Telescope [16]) are planned, and space-based gravitational wave detector projects (LISA [17], TianQin [18] and DECIGO [19]) move forward, the demand for high-precision waveform models for binary inspirals continues to grow. Recent investigations [12] have indicated that future ground-based gravitational wave detectors will have sufficient sensitivity that current numerical relativity waveforms are not precise enough to produce unbiased parameter recovery. Further, space-based gravitational wave detectors, such as LISA, will likely observe several sources simultaneously, and sufficiently precise modelling of each source will help make best use of the resulting data by improving the capability to distinguish overlapping signals.

An important ingredient to improved precision for numerical relativity waveforms is the refinement of waveform extraction methods. The process of waveform extraction refers to the calculation of the observable asymptotic waveform from a strong-field simulation of the Einstein field equations. Current strong-field numerical relativity simulation methods are 'Cauchy' methods [20–23]: initial data is generated for a desired configuration of the compact binary using an elliptic solve on a restricted region, and that spacelike hypersurface data is evolved in the timelike direction. One output of a Cauchy simulation is the metric and its derivatives as a function of time, evaluated on one or more spheres of finite distance from the binary, typically $\sim 100$–$1000M$ from the coalescence. Waveform extraction then uses the Cauchy worldtube metric and its derivatives to determine the observable asymptotic waveform that is directly applicable to data analysis efforts for gravitational wave interferometers.

The most widely used technique of waveform extraction is the method of extrapolation to large radii using several worldtubes of finite radius [24, 25]. For each waveform quantity of interest, such as the gravitational wave strain or one of the Weyl scalars, there is a clear power law asymptotic behavior in well-behaved gauges. The extrapolation method then fits for the leading behavior in $r^{-1}$ and obtains a reasonable approximation for the asymptotic waveform. The extrapolation method has been used to generate a great number of useful waveforms for gravitational wave data analysis [26–28]. However, the extrapolation method makes a number of simplifying assumptions regarding the choice of coordinates and behavior of the field equations far from the system that diminish the precision of the method.

In addition, there is good evidence [29] that there are large, low-frequency parts of gravitational waveforms ('memory' contributions) that are not well modeled by waveform extrapolation. These memory effects do not have significant impact on the frequency bands important for LIGO, but will likely be important for more sensitive detectors (such as the Einstein Telescope or Cosmic Explorer) or detectors sensitive to lower frequency bands (such as DECIGO or LISA).
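To make the extrapolation method discussed in this introduction concrete, the following is a minimal sketch and not the procedure of any particular code. It assumes that a waveform quantity $h$ has already been sampled at one fixed retarded time on several extraction radii, so that $r\,h$ can be fit as a polynomial in $1/r$; the function name `extrapolate_to_infinity`, the fit order, and the synthetic numbers are all illustrative assumptions. Aligning retarded times across radii, which a practical extrapolation also requires, is omitted here.

```python
import numpy as np


def extrapolate_to_infinity(radii, r_times_h, order=2):
    """Estimate the asymptotic value of r*h from samples at finite radii.

    Fits r*h = c0 + c1/r + ... + c_order/r**order by linear least squares
    and returns c0, the coefficient of the leading 1/r falloff of h.
    """
    radii = np.asarray(radii, dtype=float)
    values = np.asarray(r_times_h, dtype=float)
    # Columns of the design matrix are (1/r)**0, (1/r)**1, ..., (1/r)**order.
    design = np.vander(1.0 / radii, N=order + 1, increasing=True)
    coefficients, *_ = np.linalg.lstsq(design, values, rcond=None)
    return coefficients[0]


# Synthetic data (illustrative numbers only): r*h = 0.3 + 5/r - 40/r**2.
radii = np.array([100.0, 200.0, 400.0, 800.0])
r_times_h = 0.3 + 5.0 / radii - 40.0 / radii**2
estimate = extrapolate_to_infinity(radii, r_times_h, order=2)
print(f"extrapolated value of r*h: {estimate:.6f}")
```

The fit recovers the leading coefficient only insofar as the assumed power-law form holds at the sampled radii, which is the kind of simplifying assumption that CCE is designed to avoid.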

Cauchy-characteristic evolution (CCE)¹ [30–32] is an alternative waveform extraction method that uses metric data on a single worldtube $\Gamma$ to provide boundary conditions for a second full nonlinear field simulation along hypersurfaces generated by outgoing null geodesics. CCE avoids many of the assumptions made by other extraction methods, and instead computes the full solution to Einstein's equations in a Bondi-Sachs coordinate system at $\mathscr{I}^+$, from which waveform quantities may be unambiguously derived. The CCE domain and salient hypersurfaces are illustrated in Fig. 1.

¹ The acronym CCE has also been used in the past to refer to "Cauchy-characteristic extraction", which describes only the part of the computation moving from the Cauchy coordinates to a set of quantities that could separately be evolved on null characteristic curves. Most of our descriptions refer to the entire algorithm as a single part of the wave computation, so we refer to the combination of Cauchy-characteristic extraction and characteristic evolution as simply CCE.

[Figure 1: sketch with the labels $\mathscr{I}^+$, $u$, $\Gamma$, $\Sigma_u$, "CCE Domain", and "Cauchy Domain".]

FIG. 1: A sketch of the Cauchy and Characteristic domains. The Cauchy system evolves Einstein's equations on spacelike hypersurfaces, while the Characteristic system evolves Einstein's equations on compactified null hypersurfaces $\Sigma_u$ that extend to $\mathscr{I}^+$. Boundary conditions for the Characteristic system are required on the worldtube $\Gamma$ and are provided there by the Cauchy system.

There are two notable previous implementations of CCE. The original implementation, PITT Null [33, 34], is a part of the Einstein Toolkit, and demonstrated the feasibility of the CCE approach. Unfortunately, as it is a finite difference implementation, PITT Null struggles to achieve high precision and can be very costly to run [35]. The first spectral implementation of CCE is a module of the Spectral Einstein Code (SpEC). That implementation was first reported in [36], and has undergone a number of updates and refinements [37, 38], including recent work that assembled a number of valuable analytic tests that assisted in refining and optimizing the code [35].

In this paper, we present our new implementation of CCE in the SpECTRE [39] code base, which incorporates a number of improvements to the waveform extraction system. The SpECTRE CCE module implements a modified version of the evolution system in Bondi-Sachs coordinates [40] that is able to guarantee that no pure-gauge logarithms arise that spoil the spectral convergence of the scheme as the system evolves. Further, the SpECTRE CCE system is able to use formulation simplifications to implement the computation for all five Weyl scalars as suggested in [40]. We have also implemented numerical optimizations specific to the SpECTRE CCE system to ensure rapid and precise waveform extraction, and we have re-implemented and extended the collection of tests that was previously effective in testing and refining the SpEC implementation [35].

SpECTRE [39, 41] is a next-generation code base for which the aim is to construct scalable multi-physics simulations of astrophysical phenomena such as neutron star mergers, binary black hole coalescences, and core-collapse supernovae. It is the goal of the SpECTRE project to construct a highly precise astrophysical simulation framework that scales well to $\gtrsim 10^6$ cores. The core SpECTRE evolution system uses discontinuous Galerkin methods with a task-based parallelism model. The discontinuous Galerkin method has the ability to refine a domain by subdividing the computation into local calculations coupled by boundary fluxes. SpECTRE then uses the task-based parallelism framework, charm++ [42–44], to schedule and run the resulting multitude of separate calculations, which ensures good scaling properties of the method.

The CCE system in SpECTRE enjoys some efficiency gain from sharing a common well-optimized infrastructure with the discontinuous Galerkin methods and makes modest use of the parallelization framework (see Sec. IV). However, the characteristic evolution itself is implemented as a single spectral domain that covers the entire asymptotic region from the worldtube $\Gamma$ out to $\mathscr{I}^+$. The smooth behavior of the metric away from the binary coalescence ensures exponential convergence of the monolithic spectral method. In principle, the CCE method could be applied to a subdivided asymptotic domain. However, the unusual features of the field equations for CCE (reviewed in Sec. II) would require special treatment to appropriately account for boundary information. Moreover, any subdivision of the angular direction would obscure the spherical shell geometry that permits efficient calculation of the angular degrees of freedom of the system via spin-weighted spherical harmonic (SWSH) methods.

It is important to note that the SpECTRE CCE module, like every part of SpECTRE, is a rapidly evolving open-source code base. The discussion in this paper represents as completely as possible the state of our efforts to optimize and refine the system at the time of publication. However, we will continue to make modifications and improvements, so we encourage the reader to explore the full code base at [45], and refer to the documentation at [46]. For up-to-date details on making use of the standalone SpECTRE CCE system, please see the documentation page [47].

We first describe the mathematical aspects of the evolution system, including the incorporation of formulation improvements from [40], in Sec. II. Next, we discuss some of the numerical methods that we have constructed for our new SpECTRE implementation to improve runtime and precision in Sec. III. We discuss how the SpECTRE CCE module fits into the wider task-based SpECTRE infrastructure in Sec. IV. Finally, we demonstrate the precision and accuracy of the code by applying the system to a collection of analytic test cases in Sec. V, and to a realistic use-case of extracting data from a binary black-hole evolution from SpEC in Sec. VI. We describe the major future improvements that we hope to make for the CCE system in Sec. VII.

II. THE EVOLUTION SYSTEM

The discussion of CCE and its numerical implementations relies closely on a number of coordinate systems. We use the following notation for coordinate variables and spacetime indices:

• $x^{\alpha}$: $\{u, r, \theta, \phi\}$ are generic Bondi-like coordinates. These are the coordinates determined by the first stage of local coordinate transformations at the worldtube first derived in [48].

• $\hat{x}^{\hat{\alpha}}$: $\{\hat{u}, \hat{r}, \hat{\theta}, \hat{\phi}\}$ are partially flat Bondi-like coordinates introduced in [40].

• $\breve{x}^{\breve{\alpha}}$: $\{\breve{u}, \breve{y}, \breve{\theta}, \breve{\phi}\}$ are numeric partially flat coordinates. These are the coordinates directly represented in the SpECTRE numeric implementation, and are related to the partially flat Bondi-like coordinates by

$$
\breve{u} = \hat{u}, \qquad \breve{y} = 1 - 2\hat{R}/\hat{r}, \qquad \breve{\theta} = \hat{\theta}, \qquad \breve{\phi} = \hat{\phi}, \tag{1a}
$$

where the worldtube hypersurface is determined by $\hat{r} = \hat{R}(\hat{u}, \hat{\theta}, \hat{\phi})$.

• $\mathring{x}^{\mathring{\alpha}}$: $\{\mathring{u}, \mathring{r}, \mathring{\theta}, \mathring{\phi}\}$ are the asymptotically flat 'true' Bondi-Sachs coordinates. These are the coordinates in which we'd like to determine the final waveform quantities.

We use Greek letters $\alpha, \beta, \gamma, \ldots$ to represent spacetime indices, uppercase roman letters $A, B, C, \ldots$ to represent spherical angular indices, and lowercase roman letters from the middle of the alphabet $i, j, k, \ldots$ to represent spatial indices.

When relevant, we similarly adorn the spin-weighted scalars and tensors that represent components of the metric to indicate the coordinates in which they are components of the Bondi-like metric. For instance, the $g_{\hat{r}\hat{u}}$ component of a partially flat Bondi-like metric is $-e^{2\hat{\beta}}$. Our notation conventions are consistent with our previous paper regarding the mathematics of the CCE system [40].

A. Spectral representation

The SpECTRE CCE system represents its null hypersurface data on the domain $I \times S^2$, where the real interval $I$ describes the domain $\breve{y} \in [-1, 1]$ for compactified radial coordinate

$$
\breve{y} = 1 - \frac{2\hat{R}(\hat{u}, \hat{x}^{\hat{A}})}{\hat{r}}, \tag{2}
$$

where $\hat{r}$ is the partially flat Bondi-like radial coordinate and $\hat{R}$ is the Bondi-like radius at the worldtube.

We use a pseudospectral representation for each physical variable on this domain, using Gauss-Lobatto points for the radial dependence, and libsharp [49, 50]-compatible collocation points for the angular dependence. The angular collocation points are chosen to be equiangular in the $\phi$ direction, and Gauss-Legendre points in $\cos\theta$.²

² It is of some numerical convenience that there are no points at the poles, where spherical polar coordinates are singular. However, care must still be taken to avoid unnecessary factors of $\sin\theta$ in quantities like derivative operators, as they give rise to greater numerical errors when points are merely close to the pole.

The choice of Gauss-Lobatto points for the radial dependence simplifies the CCE algorithm because it is convenient to specify boundary conditions for the radial integrals as simple boundary values.

The choice of angular collocation points enables fast SWSH transforms, so that libsharp routines can efficiently provide the angular harmonic coefficients ${}_s a_{\ell m}(\breve{y})$ for an arbitrary function $f(\breve{y}, \breve{\theta}, \breve{\phi})$ of spin weight $s$, defined by

$$
f(\breve{y}, \breve{\theta}, \breve{\phi}) = \sum_{\ell m} {}_s a_{\ell m}(\breve{y})\, {}_s Y_{\ell m}(\breve{\theta}, \breve{\phi}). \tag{3}
$$

Here ${}_s Y_{\ell m}(\breve{\theta}, \breve{\phi})$ are the SWSHs as defined in Eq. (C1). We then perform all angular calculus operations using the spin-weighted derivative operators $\breve{\eth}$ and $\breve{\bar{\eth}}$. We use

an angular dyad $\breve{q}^{\breve{A}}$:

$$
\breve{q}^{\breve{A}} = \left\{-1, \frac{-i}{\sin\breve{\theta}}\right\}. \tag{4}
$$

Then, for any spin-weighted scalar quantity $\breve{v} = \breve{q}_1^{\breve{A}_1} \ldots \breve{q}_n^{\breve{A}_n} \breve{v}_{\breve{A}_1 \ldots \breve{A}_n}$, where each $\breve{q}_i$ may be either $\breve{q}$ or $\breve{\bar{q}}$, we define the spin-weighted derivative operators

\begin{align}
\breve{\eth}\breve{v} &= \breve{q}_1^{\breve{A}_1} \ldots \breve{q}_n^{\breve{A}_n}\, \breve{q}^{\breve{B}} \breve{D}_{\breve{B}} \breve{v}_{\breve{A}_1 \ldots \breve{A}_n}, \tag{5a} \\
\breve{\bar{\eth}}\breve{v} &= \breve{q}_1^{\breve{A}_1} \ldots \breve{q}_n^{\breve{A}_n}\, \breve{\bar{q}}^{\breve{B}} \breve{D}_{\breve{B}} \breve{v}_{\breve{A}_1 \ldots \breve{A}_n}, \tag{5b}
\end{align}

where $\breve{D}_{\breve{A}}$ is the angular covariant derivative. All angular derivatives may be expressed in a combination of $\breve{\eth}$ and $\breve{\bar{\eth}}$ operators. We perform angular differentiation of an arbitrary function $f(\breve{y}, \breve{\theta}, \breve{\phi})$ of spin weight $s$ by transforming to SWSH modes on each concentric spherical slice of the domain represented by ${}_s a_{\ell m}(\breve{y})$, then applying the diagonal modal multipliers

\begin{align}
\breve{\eth} f(\breve{y}, \breve{\theta}, \breve{\phi}) &= \sum_{\ell m} \sqrt{(\ell - s)(\ell + s + 1)}\; {}_s a_{\ell m}(\breve{y})\; {}_{s+1} Y_{\ell m}(\breve{\theta}, \breve{\phi}), \tag{6a} \\
\breve{\bar{\eth}} f(\breve{y}, \breve{\theta}, \breve{\phi}) &= -\sum_{\ell m} \sqrt{(\ell + s)(\ell - s + 1)}\; {}_s a_{\ell m}(\breve{y})\; {}_{s-1} Y_{\ell m}(\breve{\theta}, \breve{\phi}), \tag{6b}
\end{align}

and then performing an inverse transform.

In addition, it is occasionally valuable to apply the inverse of the angular derivative operators $\breve{\eth}$ and $\breve{\bar{\eth}}$. This can be performed by applying the inverse of the multiplicative factors in the modal representation (6), and is approximately as efficient to compute as the derivative.

B. Hierarchical evolution system

For evolution in the characteristic domain (see Fig. 1), we solve the Einstein field equations for the spin-weighted scalars that appear in the Bondi-Sachs form of the metric:

$$
ds^2 = -\left(e^{2\beta}\frac{V}{r} - r^2 h_{AB} U^A U^B\right) du^2 - 2 e^{2\beta}\, du\, dr - 2 r^2 h_{AB} U^B\, du\, dx^A + r^2 h_{AB}\, dx^A dx^B. \tag{7}
$$

The spin-weighted scalars that are used in the evolution system are then $J$, $\beta$, $Q$, $U$, $W$, and $H$, where

\begin{align}
U &\equiv U^A q_A, \tag{8a} \\
Q &\equiv r^2 e^{-2\beta} q^A h_{AB} \partial_r U^B, \tag{8b} \\
r^2 W &\equiv V - r, \tag{8c} \\
J &\equiv \frac{1}{2} q^A q^B h_{AB}, \tag{8d} \\
K &\equiv \frac{1}{2} q^A \bar{q}^B h_{AB}. \tag{8e}
\end{align}

In a Bondi-like metric, surfaces of constant $u$ are generated by outgoing null geodesics. The Bondi-Sachs metric further imposes asymptotic conditions on each component of the metric that we will not impose for all of our coordinate systems. The same form (7) holds in any Bondi-like coordinates, including the partially flat Bondi-like coordinates $\hat{x}^{\hat{\alpha}}$ and true Bondi-Sachs coordinates $\mathring{x}^{\mathring{\alpha}}$.

It is important to note that for numerical implementations, the system is usually not evolved in a true Bondi-Sachs coordinate system. For convenience of numerical calculation, most CCE implementations enforce gauge choices only at the worldtube boundary, and therefore do not ensure asymptotic flatness. The SpECTRE CCE implementation employs a somewhat different tactic, as the generic Bondi-like gauge is vulnerable to pure-gauge logarithmic dependence that spoils spectral convergence. Instead, we use the partially flat gauge introduced in [40], which ensures that the evolved coordinates are in the asymptotically inertial angular coordinates, while keeping the time coordinate choice fixed by the arbitrary Cauchy time coordinate.

In the Bondi-like coordinates, it is possible to choose a subset of the Einstein field equations that entirely determine the scalars $\{J, \beta, U, W\}$ and that form a computationally elegant, hierarchical set of differential equations. Represented in terms of the numerical Bondi-like coordinates $\{\breve{u}, \breve{y}, \breve{\theta}, \breve{\phi}\}$, the hierarchical differential equations take the form

\begin{align}
\partial_{\breve{y}} \breve{\beta} &= S_{\breve{\beta}}(\breve{J}), \tag{9a} \\
\partial_{\breve{y}}\left((1 - \breve{y})^2 \breve{Q}\right) &= S_{\breve{Q}}(\breve{J}, \breve{\beta}), \tag{9b} \\
\partial_{\breve{y}} \breve{U} &= S_{\breve{U}}(\breve{J}, \breve{\beta}, \breve{Q}), \tag{9c} \\
\partial_{\breve{y}}\left((1 - \breve{y})^2 \breve{W}\right) &= S_{\breve{W}}(\breve{J}, \breve{\beta}, \breve{Q}, \breve{U}), \tag{9d} \\
\partial_{\breve{y}}\left((1 - \breve{y}) \breve{H}\right) + L_{\breve{H}}(\breve{J}, \breve{\beta}, \breve{Q}, \breve{U}, \breve{W})\, \breve{H} + L_{\breve{\bar{H}}}(\breve{J}, \breve{\beta}, \breve{Q}, \breve{U}, \breve{W})\, \breve{\bar{H}} &= S_{\breve{H}}(\breve{J}, \breve{\beta}, \breve{Q}, \breve{U}, \breve{W}), \tag{9e} \\
\partial_{\breve{u}} \breve{J} &= \breve{H}. \tag{9f}
\end{align}

The detailed definitions for the source functions $\breve{S}(\ldots)$ and the factors $L_{\breve{H}}$ in (9) can be found in Sec. IV of [40]. We emphasize that the only time derivative appearing in the core evolution system (9) is that of $\breve{J}$ (9f), so we have only the single complex field to evolve and all of the other equations are radial constraints within each null hypersurface.

The SpECTRE CCE system requires input data specified on two hypersurfaces: the worldtube $\Gamma$ and the initial hypersurface $\Sigma_{\breve{u}_0}$ (see Fig. 1). The worldtube surface data must provide sufficient information to set the boundary values for each of the radial differential equations in (9). Namely, we must specify $\breve{\beta}$, $\breve{U}$, $\breve{Q}$, $\breve{W}$, and $\breve{H}$ at the worldtube (see Sec. II C below). The worldtube data is typically specified by determining the full spacetime metric on a surface of constant coordinate radius in a Cauchy code, then performing multiple gauge transformations to adapt the boundary data to the appropriate partially flat Bondi-like gauge.

The initial hypersurface data requires specification only of the single evolved field $\breve{J}$. In contrast to Cauchy approaches to the Einstein field equations, the initial data for CCE does not have a collection of constraints that form an elliptic differential equation. Instead, $\breve{J}$ may be arbitrarily specified on the initial data surface, constrained only by asymptotic flatness conditions. The choice of "correct" initial data to best match the physical history of an inspiral system, however, remains very difficult. We discuss our current heuristic methods for fixing the initial hypersurface data in Sec. II E.

C. Gauge-corrected control flow

The SpECTRE CCE system implements the partially flat gauge strategy discussed at length in [40]. The practical impact of the method is that we must include the evolved angular coordinates in the process of determining the Bondi-Sachs scalars for the radial hypersurface equations. Past implementations have performed the angular transformation at $\mathscr{I}^+$, which results in a simpler algorithm, but also gives rise to undesirable pure-gauge logarithmic dependence.

In this discussion, we make use of the local Bondi-Sachs-like coordinates $\hat{x}^{\hat{\mu}}$ on the worldtube that are determined by the standard procedure introduced in [30] and reviewed in [35, 40]. This procedure obtains a unique Bondi-Sachs-like coordinate system by generating a null hypersurface with geodesics outgoing with respect to the worldtube, and with time and angular coordinates chosen to match the Cauchy coordinates on the worldtube.

In the below discussion, we make use of an intermediate spin-weight 1 scalar

$$
\mathcal{U} = \breve{U} + \mathcal{U}_0, \tag{10}
$$

where $\mathcal{U}_0 = \mathcal{U}|_{\mathscr{I}^+}$ is a radially-independent contribution fixed by the worldtube boundary conditions. $\mathcal{U}$ obeys the same radial differential equation as $\breve{U}$, but possesses a constant asymptotic value that is used to determine the evolution of the angular coordinates.

The computational procedure with the gauge transformation to partially flat coordinates is then:

1. Perform the gauge transformation from the Cauchy gauge metric to the local Bondi-Sachs coordinates on the worldtube $\Gamma$, generated by geodesics with null vectors that are outgoing with respect to the worldtube surface.

2. For each spin weighted scalar $I$ in $\{\beta, Q, U\}$:

   (a) Transform $I$ to partially flat gauge $\breve{I}$ (or $\mathcal{U}$) via the angular coordinates $x^A(\breve{u}, \breve{x}^{\breve{A}})$.³ The transformations for these scalars depend only on angular Jacobians $\partial_{\breve{A}} x^B$, and are described in Sec. II D.

   (b) Evaluate the hypersurface equation for the spin-weighted scalar $\breve{I}$ using the radial integration methods described in Sec. III B.

3. Determine the time derivative of the angular coordinates $\partial_{\breve{u}} x^A(\breve{x})$ (see Sec. II D) using the asymptotic value of $\mathcal{U}$.

4. Transform $\mathcal{U}$ to the partially flat gauge $\breve{U}$ by subtracting its asymptotic value $\mathcal{U}_0 \equiv \mathcal{U}|_{\mathscr{I}^+}$.

5. For each spin weighted scalar $I$ in $\{W, H\}$:

   (a) Transform $I$ to partially flat gauge $\breve{I}$ via the angular coordinates $x^A(\breve{x}^{\breve{A}})$ and their first derivatives $\partial_{\breve{u}} x^A(\breve{x})$ (see Sec. II D).

   (b) Evaluate the hypersurface equation for $\breve{I}$.

6. For each output waveform quantity $O$ in $\{h, N, \Psi_4, \Psi_3, \Psi_2, \Psi_1, \Psi_0\}$:

   (a) Compute the asymptotic value of $O$, and transform to asymptotically inertial coordinate time as described in App. B, using $\mathring{u}(\breve{x}^{\breve{A}})$.

7. Step $\breve{J}$ forward in time using $\partial_{\breve{u}} \breve{J} = \breve{H}$, step $x^A$ using Eq. (12) below for $\partial_{\breve{u}} x^A$, and step $\mathring{u}$ using Eq. (B1) below for $\partial_{\breve{u}} \mathring{u}$.

³ When performing spectral interpolation, we require the positions of all of the target collocation points in the source coordinate system. See Sec. III A for more details regarding our interpolation methods.

See Sec. III A for details regarding the calculation of the angular Jacobian factors required for the gauge transformation and the practical methods used to evolve the angular coordinates.

D. Worldtube data interpolation and transformation

The collection of hypersurface equations (9) requires data for each of the quantities $\{\breve{\beta}, \breve{Q}, \breve{U}, \breve{W}, \breve{H}\}$ on a single spherical shell at each timestep. For $\breve{\beta}$ and $\breve{U}$, the worldtube data specifies the constant-in-$\breve{y}$ part of the solution on the hypersurface; for $\breve{Q}$ and $\breve{W}$, the worldtube data fixes the $\propto (1 - \breve{y})^2$ part; and for $\breve{H}$, the worldtube data fixes a combination of radial modes that includes the $\propto (1 - \breve{y})$ contribution.

The worldtube data provided by a Cauchy simulation contains the spacetime metric, as well as its first radial and time derivatives. The procedure for transforming the data provided by the Cauchy evolution to boundary data for the hypersurface equations (9) is then, for each hypersurface time $\breve{u}$,

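1. Interpolate the worldtube data to the desired hypersurface time $\breve{u}$.

2. Perform the local transformation of the Cauchy worldtube metric and its derivatives to a Bondi-like gauge as described in [48].

3. Perform angular transformation and interpolation from the generic Bondi-like gauge to the partially flat gauge used for the evolution quantities.

The worldtube data is usually generated by the Cauchy simulation at time steps that are suited to the strong-field calculations, but the characteristic system can usually take significantly larger time steps. Once the characteristic time stepping infrastructure has selected a desired time step, we interpolate the worldtube data at each angular collocation point to the target time for the next hypersurface. In SpECTRE, the interpolation is performed by selecting a number of time points as centered as possible on the target time, then performing a barycentric rational interpolation to the target time.

After the time interpolation of the worldtube data, we have the values of the spacetime metric and its radial and time derivatives on a single inner boundary of the CCE hypersurface of constant retarded time $\breve{u}$. We then compute the outgoing radial null vector $l^{\mu'}$ (denoting Cauchy coordinate quantities with $'$), construct a radial null coordinate system using the affine parameter along null geodesics generated by $l^{\mu'}$, then normalize the radial coordinate to construct an areal radius $r$. Following these transformations, for which explicit formulas are given in [35, 40, 48], the spacetime metric $g_{\alpha\beta}$ is of the form (7), but with no asymptotic flatness behavior imposed. During the transformation from the Cauchy coordinates to the Bondi-like coordinates, the angular and time coordinates remain fixed on the worldtube surface, so no alteration of the pseudospectral grid is necessary.

The final step for the worldtube computation is to perform a constant-in-$r$ angular coordinate transformation to a set of angular coordinates $x^A(\breve{x}^{\breve{A}})$ for which the metric satisfies the asymptotic conditions:

\begin{align}
\lim_{\breve{y} \to 1} \breve{J} &= 0, \tag{11a} \\
\lim_{\breve{y} \to 1} \breve{U} &= 0. \tag{11b}
\end{align}

These conditions are satisfied if the angular coordinates obey the radially-independent evolution equation [40]

$$
\partial_{\breve{u}} x^A = -\mathcal{U}_0^{\breve{A}} \partial_{\breve{A}} x^A, \tag{12}
$$

where $\mathcal{U}_0^{\breve{A}} \breve{q}_{\breve{A}} \equiv \mathcal{U}_0$.

The angular transformations for the remaining spin-weighted scalars require the spin-weighted angular Jacobian factors

\begin{align}
\breve{a} &= \breve{q}^{\breve{A}} \partial_{\breve{A}} x^B q_B, \tag{13a} \\
\breve{b} &= \breve{\bar{q}}^{\breve{A}} \partial_{\breve{A}} x^B q_B, \tag{13b}
\end{align}

and conformal factor

\begin{align}
\breve{\omega} &= \frac{1}{2} \sqrt{\breve{b}\breve{\bar{b}} - \breve{a}\breve{\bar{a}}}, \tag{14a} \\
\partial_{\breve{u}} \breve{\omega} &= \frac{\breve{\omega}}{4}\left(\breve{\bar{\eth}} \mathcal{U}_0 + \breve{\eth} \bar{\mathcal{U}}_0\right) + \frac{1}{2}\left(\mathcal{U}_0 \breve{\bar{\eth}} \breve{\omega} + \bar{\mathcal{U}}_0 \breve{\eth} \breve{\omega}\right). \tag{14b}
\end{align}

Given the angular coordinates determined by the time evolution of (12), we perform interpolation of each of the spin-weighted scalars $\{R, \partial_u R, J, U, \partial_r U, \beta, Q, W, H\}$ to the new angular collocation points (more details for the numerical interpolation procedure are in Sec. III A), and perform the transformation of the spin-weighted scalars as

\begin{align}
\breve{R} &= \breve{\omega} R, \tag{15a} \\
\partial_{\breve{u}} \breve{R} &= \breve{\omega} \partial_u R + \partial_{\breve{u}} \breve{\omega} + \frac{\breve{\omega}}{2}\left(\mathcal{U}_0 \breve{\bar{\eth}} R + \bar{\mathcal{U}}_0 \breve{\eth} R\right), \tag{15b} \\
\breve{J} &= \frac{1}{4\breve{\omega}^2}\left(\breve{\bar{b}}^2 J + \breve{a}^2 \bar{J} + 2 \breve{a} \breve{\bar{b}} K\right), \tag{15c} \\
e^{2\breve{\beta}} &= \frac{e^{2\beta}}{\breve{\omega}}, \tag{15d} \\
\partial_{\breve{y}} \breve{U} &= \frac{\breve{R}}{\breve{\omega}^3 (1 - \breve{y})^2}\left(\breve{\bar{b}}\, \partial_r U - \breve{c}\, \partial_r \bar{U}\right) + 4\breve{R}\, \frac{e^{2\breve{\beta}}}{\breve{\omega}}\left[\breve{\bar{\eth}}\breve{\omega}\, \partial_{\breve{y}} \breve{J} - \breve{\eth}\breve{\omega}\, \frac{\partial_{\breve{y}}(\breve{J}\breve{\bar{J}})}{2\breve{K}}\right] \nonumber \\
&\quad + 2\breve{R}\, \frac{e^{2\breve{\beta}}}{\breve{\omega}}\left(\breve{J} \breve{\bar{\eth}}\breve{\omega} - \breve{K} \breve{\eth}\breve{\omega}\right)\left(-1 + \partial_{\breve{y}} \breve{J}\, \partial_{\breve{y}} \breve{\bar{J}} - \left[\frac{\partial_{\breve{y}}(\breve{J}\breve{\bar{J}})}{2\breve{K}}\right]^2\right), \tag{15e}
\end{align}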

\begin{align}
\breve{Q} &= 2\breve{R}\, e^{-2\breve{\beta}}\left(\breve{K} \partial_{\breve{y}} \breve{U} + \breve{J} \partial_{\breve{y}} \breve{\bar{U}}\right), \tag{15f} \\
\mathcal{U} &= \frac{1}{2\breve{\omega}^2}\left(\breve{\bar{b}} U - \breve{c} \bar{U}\right) - \frac{e^{2\breve{\beta}} (1 - \breve{y})}{2\breve{R}\breve{\omega}}\left(\breve{K} \breve{\eth}\breve{\omega} - \breve{J} \breve{\bar{\eth}}\breve{\omega}\right), \tag{15g} \\
\breve{U} &= \mathcal{U} - \mathcal{U}_0, \tag{15h} \\
\breve{W} &= W + \frac{(\breve{\omega} - 1)(1 - \breve{y})}{2\breve{R}} + \frac{e^{2\breve{\beta}} (1 - \breve{y})}{4\breve{R}\breve{\omega}^2}\left[\breve{J} (\breve{\bar{\eth}}\breve{\omega})^2 + \breve{\bar{J}} (\breve{\eth}\breve{\omega})^2 - 2\breve{K} (\breve{\eth}\breve{\omega})(\breve{\bar{\eth}}\breve{\omega})\right] - \frac{2\partial_{\breve{u}} \breve{\omega}}{\breve{\omega}} - \frac{\breve{U} \breve{\bar{\eth}}\breve{\omega} + \breve{\bar{U}} \breve{\eth}\breve{\omega}}{\breve{\omega}}, \tag{15i} \\
\breve{H} &= \frac{1}{2}\left[\mathcal{U}_0 \breve{\bar{\eth}} \breve{J} + \breve{\eth}(\bar{\mathcal{U}}_0 \breve{J}) - \breve{J} \breve{\eth} \bar{\mathcal{U}}_0\right] + \frac{\partial_{\breve{u}} \breve{\omega} - \frac{1}{2}\left(\mathcal{U}_0 \breve{\bar{\eth}}\breve{\omega} + \bar{\mathcal{U}}_0 \breve{\eth}\breve{\omega}\right)}{\breve{\omega}}\left(2\breve{J} - 2\partial_{\breve{y}} \breve{J}\right) - \breve{J} \breve{\bar{\eth}} \mathcal{U}_0 + \breve{K} \breve{\eth} \bar{\mathcal{U}}_0 \nonumber \\
&\quad + \frac{1}{4\breve{\omega}^2}\left(\breve{\bar{b}}^2 H + \breve{a}^2 \bar{H} + \breve{\bar{b}} \breve{c}\, \frac{H \bar{J} + J \bar{H}}{K}\right) + 2\, \frac{\partial_{\breve{u}} \breve{R}}{\breve{R}}\, \partial_{\breve{y}} \breve{J}, \tag{15j}
\end{align}

where $K = \sqrt{1 + J\bar{J}}$ and $\breve{K} = \sqrt{1 + \breve{J}\breve{\bar{J}}}$. Finally, the quantities $\{\breve{\beta}, \breve{Q}, \mathcal{U}, \breve{W}, \breve{H}\}$ are used directly to determine the integration constants in the hypersurface equations (9). Note that in all of the equations (15h) onward, we have explicit dependence on $\mathcal{U}_0$ or implicit dependence on $\mathcal{U}_0$ via $\partial_{\breve{u}} \breve{\omega}$. This dependence necessitates finishing the hypersurface integration of $\mathcal{U}$ to determine its asymptotic value before computing the remaining gauge-transformed quantities on the worldtube.

E. Initial data

In addition to the specification of the worldtube data at the interface to the Cauchy simulation, the characteristic system requires initial data at the first outgoing null hypersurface in the evolution (see Fig. 1). The initial data problem on this hypersurface is physically similar to the initial data problem for the Cauchy evolution: It is computationally prohibitive to directly construct the spacetime metric in the state that it would possess during the inspiral. Ideally, we would like the starting state of the simulation to be simply a snapshot of the state if we had been simulating the system for far longer.

The initial data problem in CCE has been investigated previously by [51], in which a linearized solution scheme was considered. The most important part of the initial data specification appears to be choosing the first hypersurface such that it is consistent with the boundary data at the same timestep. Without that constraint, previous authors [51], and empirical tests of our own code, indicate that spurious oscillations emerge that often last the full duration of the simulation.

Computationally, the initial data freedom in CCE is much simpler than the Cauchy case [52, 53]. We may specify the Bondi-Sachs transverse-traceless angular scalar $\breve{J}$ arbitrarily. Even when we take the practical constraint that $\breve{J}$ must be consistent with the worldtube data at the first timestep, we still have almost arbitrary freedom in the specification of $J$, as it must be consistent with the worldtube data only up to an arbitrary angular coordinate transformation.⁴

⁴ In our evolution system, we track and perform an angular coordinate transformation at the worldtube regardless of initial data choice, so permitting this transformation on the initial hypersurface amounts only to setting nontrivial initial data for $x^A(\hat{x}^{\hat{A}})$.

Current methods of choosing initial data for $J$ do not represent a snapshot of a much longer simulation, and this gives rise to transients in the resulting strain outputs (see Fig. 2). These initial data transients are analogous to 'junk radiation' frequently found in Cauchy simulations, but are somewhat more frustrating for data analysis because the CCE initial data transients tend to have comparatively long timescales. We observe that the strain waveform tends to settle to a suitable state within a few orbits of the start of the simulation. However, when recovering high-fidelity waveforms from an expensive Cauchy simulation, every orbit of trustworthy worldtube data is precious, and it is disappointing to lose those first orbits of data to the initial data transient. It is a topic of ongoing work to develop methods of efficiently generating high-quality initial data for CCE to improve the initial data transient behavior (see Sec. VII A).

[Figure 2: "SpECTRE CCE initial data transient". Two panels, strain $h$ and curvature $\Psi_0$, each showing the real and imaginary parts of the $Y_{22}$ and $Y_{20}$ mode coefficients against simulation time ($M$).]

FIG. 2: The initial data transient for an example CCE run using worldtube data obtained from a binary black hole simulation SXS:BBH:2096 from the SXS catalog. The dominant modes of the strain and $\Psi_0$ display visually apparent drift during the first $\sim 2$ orbits of the inspiral. The initial data transient contaminates the data for the early part of the simulation and leads to a BMS frame shift in the strain waveform. The frame shift can be seen visually from the fact that the $Y_{22}$ mode does not oscillate about 0. The initial data method used for this demonstration is the cubic ansatz initial data described as method 1 below.

We currently support three methods for generating initial hypersurface data:

1. Keep $\breve{J}$ and $\partial_{\breve{y}} \breve{J}$ consistent with the first timestep of the worldtube data. Use those quantities to fix the angularly dependent coefficients $A$ and $B$ in the cubic initial hypersurface ansatz:

$$
\breve{J}(\breve{y}, \breve{\theta}, \breve{\phi}) = A(\breve{\theta}, \breve{\phi})(1 - \breve{y}) + B(\breve{\theta}, \breve{\phi})(1 - \breve{y})^3. \tag{16}
$$

   This is a similar initial data construction to [51], and is chosen to omit any $(1 - \breve{y})^2$ dependence, which guarantees that no pure-gauge logarithmic terms arise during the evolution [40].

2. Set the Newman-Penrose quantity $\Psi_0 = 0$ on the initial hypersurface. This amounts to enforcing a second-order nonlinear ordinary differential equation in $y \equiv 1 - 2R/r$ for $J$, before constructing the coordinate transformation from $x^{\alpha}$ to $\breve{x}^{\breve{\alpha}}$. After some simplification, the expression for $\Psi_0$ in [40] may be used to show that the equation

$$
\partial_y^2 J = \frac{1}{16K^2}\left[\bar{J}^2 (\partial_y J)^2 - 2(2 + J\bar{J})\, \partial_y J\, \partial_y \bar{J} + J^2 (\partial_y \bar{J})^2\right] \times \left(-4J - (1 - y)\partial_y J\right) \tag{17}
$$

   is equivalent to the condition $\Psi_0 = 0$. The initial hypersurface data is generated by first using (17) to perform a radial ODE integration out to $\mathscr{I}^+$, with boundary values of $J$ and $\partial_y J$ on the initial worldtube. However, the data so generated is not necessarily asymptotically flat, so an angular coordinate transformation is calculated to fix $\breve{J}|_{\mathscr{I}^+} = 0$. Encouragingly, fixing both (17) and the asymptotic flatness condition also constrains the $(1 - y)^2$ part of $J$ to vanish, which is sufficient to prevent the emergence of pure-gauge logarithmic dependence during the evolution of $J$.

3. Set $\breve{J} = 0$ along the entire initial hypersurface. In general, this choice will be inconsistent with the data specified on the worldtube $J|_{\Gamma}$, so it is necessary to construct an angular transformation $x(\breve{x}^{\breve{A}})$ such that $\breve{J}|_{\Gamma} = 0$ following the transformation.

Methods 2 and 3 above require the ability to compute the angular coordinate transformation $x^A(\hat{x}^{\hat{B}})$ such that

$$
0 = \breve{J} = \frac{\breve{\bar{b}}^2 J + \breve{a}^2 \bar{J} + 2\breve{a}\breve{\bar{b}} K}{4\breve{\omega}^2} \tag{18}
$$

on some surface. Solving (18) in general would amount to an expensive high-dimensional root-find. However, in our present application, practical solutions in the wave zone typically have a value of $\breve{J}$ no greater than $\sim 5 \times 10^{-3}$, and we should not expect to find a well-behaved angular coordinate transform otherwise. So, we take advantage of the small parameter in the equation to iteratively construct candidate angular coordinate systems that approach the condition (18). Our linearized iteration is based on the approximation

\begin{align}
\breve{a}_{n+1} &= -\frac{1}{2}\, \frac{\breve{J}_n \breve{\omega}_n}{\breve{\bar{b}}_n \breve{K}_n}, \tag{19a} \\
\breve{x}^i_{n+1}(\breve{x}) &= \frac{1}{2}\, \breve{\eth}^{-1}_{n+1}\left[\breve{a}_{n+1} \breve{\eth} \breve{x}^i + \breve{\bar{b}}_{n+1} \breve{\bar{\eth}} \breve{x}^i\right], \tag{19b}
\end{align}

for a collection of Cartesian coordinates $\breve{x}^i$ that are representative of the angular coordinate transformation (see Sec. III A).

We find that this procedure typically approaches roundoff in $\sim 10^3$ iterations. Despite the crude inefficiency of this approximation, the iterative solve needs to be conducted only once, so it represents only a small portion of the CCE execution time for the initial data methods that take advantage of it.

In practical investigations, it has been found that most frequently the simplest method of an inverse cubic ansatz (1. above) performs best in various measures of asymptotic data quality [54]. However, because the reasons for the difference in precision for different initial data schemes are not currently well understood, we believe it useful to include descriptions of all viable methods.

III. IMPLEMENTATION DETAILS AND NUMERICAL OPTIMIZATIONS

Much of the good performance of the SpECTRE CCE system is inherited from the shared SpECTRE infrastructure. In particular, the SpECTRE data structures offer easy interfaces to aggregated allocations (which limit

expensive allocation of memory), fast vector operations through the interface with the open source Blaze library [55], and rapid SWSH transforms via the open source libsharp library. Further, we take advantage of per-core caching mechanisms to avoid recomputing common numerical constants, such as spectral weights and collocation values.

However, in addition to establishing ambitious "best practices" for the mechanical details of the software development, we have implemented numerical optimizations specialized to calculations in the CCE system. We give a brief explanation of the techniques we use to improve performance of angular interpolation in Sec. III A, which is required to perform the gauge transformation discussed in Sec. II D. In Sec. III B, we explain our methods for efficiently performing the hypersurface integrals in our chosen Legendre-Gauss-Lobatto pseudospectral representation.

A. Angular interpolation techniques using spin-weighted Clenshaw algorithm

The Clenshaw recurrence algorithm is a fast method of computing the sum over basis functions,

$$f(x) = \sum_{n=0}^{N} a_n \phi_n(x), \tag{20}$$

provided the set of basis functions $\phi_n$ obeys a standard form of a three-term recurrence relation common to many polynomial bases. In particular, it is assumed that $\phi_n$ may be written as

$$\phi_n(x) = \alpha_n(x) \phi_{n-1}(x) + \beta_n(x) \phi_{n-2}(x), \tag{21}$$

for some set of easily computed $\alpha_n$ and $\beta_n$.

The algorithm for computing the full sum $f(x)$ [56] is then to compute the set of quantities $y_n$ for $n \ge 1$, where $y_n$ is

$$y_{N+2}(x) = y_{N+1}(x) = 0, \tag{22a}$$

$$y_n(x) = \alpha_{n+1}(x) y_{n+1}(x) + \beta_{n+2}(x) y_{n+2}(x) + a_n. \tag{22b}$$

Once the last two quantities in the chain, $y_1(x)$ and $y_2(x)$, are determined, the final sum is obtained from the formula

$$f(x) = \beta_2(x) \phi_0(x) y_2(x) + \phi_1(x) y_1(x) + a_0 \phi_0(x). \tag{23}$$

We use the Clenshaw method for interpolating SWSH data to arbitrary points $x$ on the sphere. For spherical harmonics, it is successive values of $\ell$ that have convenient three-term recurrence relations, so the lowest modes in the recursion are $Y_{|m|,m}(\theta, \phi)$ and $Y_{|m|+1,m}(\theta, \phi)$. The values of $\alpha_{\ell,m}(\theta, \phi)$ and $\beta_{\ell,m}(\theta, \phi)$ are cached for the target interpolation points, and the source collocation values are transformed to spectral coefficients $a_{\ell,m}$. The Clenshaw algorithm can be applied directly at each of the target points $(\theta, \phi)$ to obtain the values $f(\theta, \phi)$. Note that the step of caching the $\alpha_{\ell,m}(\theta, \phi)$ and $\beta_{\ell,m}(\theta, \phi)$ is primarily useful for interpolating multiple functions to the same grid; if only one function is needed for each grid, there will be little gain in caching $\alpha$ and $\beta$, as they would each be evaluated only once in a given recurrence chain.

In Appendix C, we give full details of the specific recurrence relations that can be used to efficiently calculate the Clenshaw sum for SWSH, as well as additional recurrence relations that improve performance when moving between the $m$ modes. For the remaining discussion it is convenient to define a few auxiliary variables that are used in the formulas for the SWSH recurrence:

$$a = |s + m|, \tag{24a}$$

$$b = |s - m|, \tag{24b}$$

$$\lambda = \begin{cases} 0, & s \ge -m \\ s + m, & s < -m \end{cases} \tag{24c}$$

The step-by-step procedure for efficiently interpolating a spin-weighted function represented as a series of spin-weighted spherical harmonic coefficients to a set of target collocation points $(\theta_i, \phi_i)$ is then:

1. Assemble the lookup table of required $(\alpha_\ell^{(a,b)}(\theta), \beta_\ell^{(a,b)}, \lambda_m)$:
   (a) For each $m \in [-\ell_{\max}, \ell_{\max}]$ there is a pair $(a, b)$ from (24) to be computed. Note that $\alpha_\ell^{(a,b)}$ must be cached separately for each target point, but $\beta_\ell^{(a,b)}$ does not depend on the target coordinates.
2. For $m \in [0, \ell_{\max}]$:
   (a) If $|s| \ge |m|$: Determine ${}_sY_{|s|,m}(\theta, \phi)$ from direct evaluation of (C1) with (C3) and ${}_sY_{|s|+1,m}(\theta, \phi)$ from (C10). Store ${}_sY_{|s|,m}(\theta, \phi)$ for recursion if $|s| = |m|$.
   (b) If $|m| > |s|$: Determine ${}_sY_{|m|,m}(\theta, \phi)$ from recurrence (C9) and ${}_sY_{|m|+1,m}(\theta, \phi)$ from (C10). Store ${}_sY_{|m|,m}(\theta, \phi)$ for recursion.
   (c) Perform the Clenshaw algorithm to sum over $\ell \in [\max(|s|, |m|), \ell_{\max}]$, using the spectral coefficients $a_{\ell m}$, the precomputed $\alpha_\ell^{(a,b)}$ and $\beta_\ell^{(a,b)}$ recurrence coefficients, and the first two harmonics in the sequence computed from the previous step.
3. For $m \in [-1, -\ell_{\max}]$, repeat the substeps of step 2, but for the negative set of $m$'s.

Although the procedure for interpolation is performed efficiently, there are a number of details of the implementation of the angular coordinate transformation that must be handled carefully.
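As a minimal sketch of the recurrence (20)-(23), the Python below evaluates a Clenshaw sum for a generic three-term basis and checks it against a direct sum. Ordinary Legendre polynomials are used here as a stand-in basis; the SWSH recurrence coefficients of Appendix C are not reproduced, and all function names are illustrative rather than taken from SpECTRE.

```python
def clenshaw_sum(coeffs, alpha, beta, phi0, phi1, x):
    """Evaluate sum_n coeffs[n] * phi_n(x) using Eqs. (22)-(23).

    alpha(n, x) and beta(n, x) define the recurrence (21):
    phi_n = alpha_n * phi_{n-1} + beta_n * phi_{n-2}.
    """
    n_max = len(coeffs) - 1
    y_next, y_next2 = 0.0, 0.0  # y_{n+1} and y_{n+2}, zero above n = N
    for n in range(n_max, 0, -1):
        y_n = alpha(n + 1, x) * y_next + beta(n + 2, x) * y_next2 + coeffs[n]
        y_next, y_next2 = y_n, y_next
    # Here y_next holds y_1 and y_next2 holds y_2.
    return (beta(2, x) * phi0(x) * y_next2
            + phi1(x) * y_next
            + coeffs[0] * phi0(x))


# Stand-in basis (an assumption for this sketch): Legendre polynomials,
# n P_n = (2n - 1) x P_{n-1} - (n - 1) P_{n-2}.
def legendre_alpha(n, x):
    return (2 * n - 1) * x / n


def legendre_beta(n, x):
    return -(n - 1) / n


def direct_sum(coeffs, x):
    """Reference evaluation by building every P_n(x) explicitly."""
    phis = [1.0, x]
    for n in range(2, len(coeffs)):
        phis.append(legendre_alpha(n, x) * phis[n - 1]
                    + legendre_beta(n, x) * phis[n - 2])
    return sum(c * p for c, p in zip(coeffs, phis))


coeffs = [1.0, -0.5, 0.25, 2.0]
value = clenshaw_sum(coeffs, legendre_alpha, legendre_beta,
                     lambda x: 1.0, lambda x: x, 0.3)
```

The backward pass never stores the basis functions themselves, which is why caching only the $\alpha$ and $\beta$ values per target point is sufficient when several functions are interpolated to the same grid.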

[Figure 3 panel labels: field values at source collocation (Source frame); field values at target collocation (Target frame).]

FIG. 3: An illustration of the interpolation reasoning for pseudospectral methods. The input to the interpolation is the field values at the collocation points in the source frame, and we wish to determine the field values for the same function at the collocation points in the target frame, which will be at non-collocation points in the source frame coordinates. Therefore, the interpolation seeks to calculate the field value at points $x(\hat{x})$ in the source frame, for all collocation points $\hat{x}$ in the target frame.

First, it is important to note the counterintuitive nature of the set of coordinate functions we require for the interpolation. In both the source frame and the target frame, we use a pseudospectral grid, evenly spaced in $\phi$, and at Legendre-Gauss points in $\theta$. When interpolating, we require the location in the source frame coordinates of the target frame collocation points. Therefore, when expressed as a function over collocation points, the function that we use for interpolation is $x^A(\hat{x}^{\hat{A}})$. We have found this feature of the interpolation for pseudospectral methods easy to misremember, so we have included Fig. 3 to assist in recalling the correct reasoning.

Most of the quantities that we wish to interpolate have nonzero spin-weight, so do not transform as scalars. Instead, their transformation involves factors of the spin-weighted angular Jacobians (13). The tensor transformations for each of the relevant quantities at the worldtube boundary are given in (15). For illustration, let us discuss the transformation of the spin-weight 2 scalar $\breve{J}$:

$$\breve{J} = \frac{\bar{\breve{b}}^2 J + \breve{a}^2 \bar{J} + 2 \breve{a} \bar{\breve{b}} K}{4 \breve{\omega}^2}. \tag{25}$$

It is important to note that at the start of the transformation procedure, we have the values of $J$ on the source grid $x^A$ and the values of $\breve{a}$, $\breve{b}$, and $\breve{\omega}$ on the target grid $\breve{x}^{\breve{A}}$ (the Jacobians are derivatives of $x(\breve{x})$; see Fig. 3).

The spin-weighted interpolation procedure can be performed only on quantities that are representable by the SWSH basis. We can store non-representable quantities (including, e.g., the angular coordinates themselves) on our chosen angular grid, but we cannot perform a SWSH transform on such quantities, so we cannot interpolate them using pseudospectral methods with any predictable accuracy. Inconveniently, we are burdened with a number of quantities that are not representable on the SWSH basis. Immediately after interpolation, $J(x^A(\breve{x}^{\breve{A}}))$ is not representable on the basis corresponding to the new grid because the Jacobian factors have not yet been applied. Similarly, the Jacobian factors $\breve{a}$ and $\breve{b}$ are not representable on the SWSH basis whenever the angular transform is not trivial. Accordingly, for our example of $\breve{J}$, we must apply the transformation operations in a specific sequence:

1. Interpolate $J(x^A)$ and $K(x^A)$ to $J(x^A(\breve{x}^{\breve{A}}))$ and $K(x^A(\breve{x}^{\breve{A}}))$.
2. Multiply the result by the Jacobian factors that appear in (25).

We meet a similar complication when manipulating the evolved angular coordinates $x^A(\breve{u}, \breve{x}^{\breve{A}})$. The angular coordinates are not representable on the SWSH basis, yet we must take angular derivatives of the angular coordinates to determine the Jacobian factors (13). The method we use to evade the problems for the angular coordinate representation is to introduce a unit sphere Cartesian representation of the angular coordinates:

$$x_{\rm unit} = \sin\theta \cos\phi, \tag{26a}$$

$$y_{\rm unit} = \sin\theta \sin\phi, \tag{26b}$$

$$z_{\rm unit} = \cos\theta. \tag{26c}$$

The evolution equation for the unit sphere Cartesian representation is then derived from the angular coordinate evolution equation (12):

$$\partial_{\breve{u}} x^i_{\rm unit} = U_0^{\breve{A}} \partial_{\breve{A}} x^i_{\rm unit} = \frac{1}{2} \left( U_0 \bar{\breve{\eth}} x^i_{\rm unit} + \bar{U}_0 \breve{\eth} x^i_{\rm unit} \right). \tag{27}$$

The main advantage of promoting the angular coordinates $x^A(\breve{u}, \breve{x}^{\breve{A}})$ to their unit sphere Cartesian analogs is that the Cartesian coordinates $x^i$ are spin-weight 0 and so we can quickly and accurately evaluate their angular derivatives.

The spin-weighted Jacobian factors (13) are then calculated as

$$\breve{a} = \breve{\eth} x^i \, \partial_i x^A q_A, \tag{28a}$$

$$\breve{b} = \bar{\breve{\eth}} x^i \, \partial_i x^A q_A, \tag{28b}$$

where the factors $\partial_i x^A$ are the Cartesian-to-angular Jacobians in the source frame, so are analytically computed
as

$$\partial_x \theta = \cos[\phi(\hat{x}^{\hat{A}})] \cos[\theta(\hat{x}^{\hat{A}})], \tag{29a}$$

$$\partial_x \phi = -\sin[\phi(\hat{x}^{\hat{A}})] / \sin[\theta(\hat{x}^{\hat{A}})], \tag{29b}$$

$$\partial_y \theta = \cos[\theta(\hat{x}^{\hat{A}})] \sin[\phi(\hat{x}^{\hat{A}})], \tag{29c}$$

$$\partial_y \phi = \cos[\phi(\hat{x}^{\hat{A}})] / \sin[\theta(\hat{x}^{\hat{A}})], \tag{29d}$$

$$\partial_z \theta = -\sin[\theta(\hat{x}^{\hat{A}})], \tag{29e}$$

$$\partial_z \phi = 0. \tag{29f}$$

B. Rapid linear algebra methods for radial integration

SpECTRE CCE uses a Legendre Gauss-Lobatto spectral representation for the radial dependence of the spin-weighted scalars on its domain. The use of spectral methods allows rapid integration of the radial differential equations of the hierarchical CCE system (9). The numerical methods we employ in this section are not themselves new, but they have not previously been applied to efficiently solving the CCE system of equations.

Each of the angular derivatives that appears in the hierarchy of radial differential equations is first evaluated by the procedure described around Eq. (6): perform a spin-weighted spherical harmonic transform using libsharp, multiply by $\sqrt{(\ell - s)(\ell + s + 1)}$ in the modal basis for $\breve{\eth}$ and by $-\sqrt{(\ell + s)(\ell - s + 1)}$ for $\bar{\breve{\eth}}$, and recover the nodal representation of the derivative with an inverse spin-weighted transform. Using these nodal values of the angular derivative terms, we may then directly compute each of the right-hand sides of the radial differential equations over the nodal grid. Therefore, for each of the radial differential equations, the problem reduces to a collection of radial ODE solves.

The spectral representation in the radial direction allows the further simplification of determining linear operators that correspond to indefinite integration. Given the function $f$ expressed in the modal representation

$$f(\breve{y}) = \sum_n a_n P_n(\breve{y}), \tag{30}$$

we seek the integration matrix $I$ such that

$$\int^{\breve{y}} \sum_n a_n P_n(\breve{y}') \, d\breve{y}' = \sum_n (I \cdot a)_n P_n(\breve{y}) \implies \sum_n a_n P_n(\breve{y}) = \sum_n (I \cdot a)_n \partial_{\breve{y}} P_n(\breve{y}). \tag{31}$$

The relevant identity for Legendre polynomials that we use to determine the integration matrix $I$ is

$$P_n(\breve{y}) = \frac{1}{2n + 1} \frac{d}{d\breve{y}} \left[ P_{n+1}(\breve{y}) - P_{n-1}(\breve{y}) \right]. \tag{32}$$

By integrating both sides of this equation and applying the result to the modal representation (30), we find the almost-tridiagonal indefinite integration matrix for the spectral representation, whose rows $k \ge 1$ have the nonzero entries

$$I_{k,k-1} = \frac{1}{2k - 1}, \qquad I_{k,k+1} = -\frac{1}{2k + 3}. \tag{33}$$

Here the first row is chosen to zero the function at the innermost gridpoint (at $\breve{y} = -1$). It is convenient to generate linear operators acting entirely on the nodal representation. These are composed as $M^{-1} I M$, where $M$ is the linear operator that maps the nodal representation to the modal representation. We may then add an integration constant freely to the result of the indefinite integration operator in the nodal representation to satisfy the boundary conditions.

Two of the five equations (those that determine $\breve{\beta}$ and $\breve{U}$) take the simple form

$$\partial_{\breve{y}} f = S_f. \tag{34}$$

The radial ODE solves for these cases are a straightforward application of the nodal integration matrix $M^{-1} I M$ using (33). In the CCE system, the choice to zero the value at the innermost boundary point ensures that we may impose the boundary conditions for the worldtube quantities $\breve{\beta}|_\Gamma$ and $\breve{U}|_\Gamma$ by adding the appropriate boundary value to all points along the radial rays for each angular point on the boundary.

Two more of the radial differential equations (those that determine $\breve{Q}$ and $\breve{W}$) take the form

$$(1 - \breve{y}) \partial_{\breve{y}} f + 2 f = S_f. \tag{35}$$

This case requires more care than the original indefinite integral, but the full integration matrix is still readily calculable for arbitrary Legendre order $n$.

Considering again the modal representation (30), we wish to find the linear operator $K$ such that

$$\sum_n a_n P_n(\breve{y}) = \sum_n (K \cdot a)_n \left[ (1 - \breve{y}) \partial_{\breve{y}} P_n(\breve{y}) + 2 P_n(\breve{y}) \right]. \tag{36}$$

The operator $K$ is the inverse of the operator in Eq. (35).

We will again make use of the integration matrix $I$ (33). We also require the matrix $C$ associated with multiplication by $(1 - \breve{y})$:

$$\sum_n (C \cdot a)_n P_n(\breve{y}) = \sum_n a_n (1 - \breve{y}) P_n(\breve{y}). \tag{37}$$

The matrix $C$ is derived by algebraic manipulations of Bonnet's recursion formula for Legendre polynomials

$$(n + 1) P_{n+1} = (2n + 1) \breve{y} P_n - n P_{n-1} \implies (1 - \breve{y}) P_n = -\frac{n + 1}{2n + 1} P_{n+1} + P_n - \frac{n}{2n + 1} P_{n-1}. \tag{38}$$
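To make the two modal operators concrete, the following Python sketch assembles an integration matrix from identity (32) and a multiplication-by-$(1 - \breve{y})$ matrix from (38), then checks both on a low-order polynomial. This is an illustration under stated assumptions, not SpECTRE's implementation: the function names are hypothetical, plain nested lists stand in for an optimized matrix type, and the first row of the integration matrix is filled in here by imposing a vanishing integral at $\breve{y} = -1$ with $P_k(-1) = (-1)^k$.

```python
def legendre_values(n_max, y):
    """Return [P_0(y), ..., P_{n_max}(y)] from Bonnet's recursion."""
    values = [1.0, y]
    for n in range(1, n_max):
        values.append(((2 * n + 1) * y * values[n] - n * values[n - 1])
                      / (n + 1))
    return values[: n_max + 1]


def integration_matrix(n_max):
    """Modal indefinite-integration matrix built from identity (32).

    The P_{n_max + 1} term produced by integrating the highest mode is
    dropped, so the top input coefficient should be negligible.
    """
    size = n_max + 1
    mat = [[0.0] * size for _ in range(size)]
    for k in range(1, size):
        mat[k][k - 1] = 1.0 / (2 * k - 1)
        if k + 1 < size:
            mat[k][k + 1] = -1.0 / (2 * k + 3)
    # First row: pick the constant so the result vanishes at y = -1.
    for j in range(size):
        mat[0][j] = -sum((-1) ** k * mat[k][j] for k in range(1, size))
    return mat


def one_minus_y_matrix(n_max):
    """Modal matrix for multiplication by (1 - y), from Eq. (38)."""
    size = n_max + 1
    mat = [[0.0] * size for _ in range(size)]
    for n in range(size):
        mat[n][n] = 1.0
        if n + 1 < size:
            mat[n + 1][n] = -(n + 1) / (2 * n + 1)
        if n >= 1:
            mat[n - 1][n] = -n / (2 * n + 1)
    return mat


def apply(matrix, coeffs):
    """Matrix-vector product on plain Python lists."""
    return [sum(m * c for m, c in zip(row, coeffs)) for row in matrix]


def evaluate(coeffs, y):
    """Evaluate sum_n coeffs[n] * P_n(y)."""
    return sum(c * p
               for c, p in zip(coeffs, legendre_values(len(coeffs) - 1, y)))


# f(y) = 0.5 P_0 - P_1 + 2 P_2 = 3 y^2 - y - 0.5, padded with a zero mode.
coeffs = [0.5, -1.0, 2.0, 0.0]
integral = apply(integration_matrix(3), coeffs)
product = apply(one_minus_y_matrix(3), coeffs)
```

For this input the integral from $-1$ is $y^3 - y^2/2 - y/2 + 1$, which the modal result reproduces. Because neither matrix depends on the field values, both can be built once and reused for every radial ray.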

Therefore, composing the operations of $C$ and $I$, we find

$$\sum_n ((C + 2I) \cdot a)_n P_n(\breve{y}) = \sum_n (I \cdot a)_n \left[ (1 - \breve{y}) \partial_{\breve{y}} P_n + 2 P_n \right] \tag{39}$$

and

$$K = I \cdot (C + 2I)^{-1}. \tag{40}$$

To compute $K$ in practice, we determine the values of $C$ and $I$ analytically, then perform a single numerical inversion to finish the computation of (40). Boundary conditions then determine the quadratic part of the solution, so are imposed by adding the appropriate $b(\breve{\theta}, \breve{\phi})(1 - \breve{y})^2$ contribution along each radial ray.

Importantly, for both of the above types of the radial ODE solve, the integration matrix in question is independent of the values of the fields. So, at the start of the simulation, we precompute and store the necessary integration matrices, reducing each of the ODE solves described above to a matrix-vector multiplication for each radial ray. In SpECTRE, these matrix-vector product calculations are optimized via the vector intrinsic library libxsmm [57].

The final type of radial differential equation appears only in the equation that determines $H$. This type is more complicated:

$$(1 - \breve{y}) \partial_{\breve{y}} f + [1 + (1 - \breve{y}) L_G L_J] f + (1 - \breve{y}) \bar{L}_G L_J \bar{f} = S, \tag{41}$$

in which the $L$ factors depend on the field quantities of the current hypersurface. In this case, there is little hope of determining an elegant simplification using the modal basis. In any case, there would be no opportunity for caching and reusing an integration matrix, as the differential operator that acts on $f$ depends on the other fields on the hypersurface. So, for the integration of the $H$ equation, we decompose the complex linear differential equation into a real linear equation on vectors of length $2n$:

$$\left[ \begin{pmatrix} (1 - \breve{y}) \partial_{\breve{y}} + 1 & 0 \\ 0 & (1 - \breve{y}) \partial_{\breve{y}} + 1 \end{pmatrix} + (1 - \breve{y}) \begin{pmatrix} {\rm Re}(L_J) {\rm Re}(L_G) & {\rm Re}(L_J) {\rm Im}(L_G) \\ {\rm Im}(L_J) {\rm Re}(L_G) & {\rm Im}(L_J) {\rm Im}(L_G) \end{pmatrix} \right] \begin{pmatrix} {\rm Re}(f) \\ {\rm Im}(f) \end{pmatrix} = \begin{pmatrix} {\rm Re}(S) \\ {\rm Im}(S) \end{pmatrix}, \tag{42}$$

where the multiplication by $(1 - \breve{y})$ and differentiation $\partial_{\breve{y}}$ are understood to represent linear operators on the Legendre Gauss-Lobatto nodal representation. We then solve (42) by numerically computing the linear operator along each radial ray and performing an aggregated linear solve via LAPACK. Boundary conditions are imposed as usual by setting the first row of the operands ${\rm Re}(S)$ and ${\rm Im}(S)$ to the desired boundary value before the operation, and adjusting the first and $(n + 1)$th rows of the linear operator to be equivalent to the first and $(n + 1)$th rows of the identity matrix.

IV. PARALLELIZATION AND MODULARITY

Because of the dependence of the gauge transformation at the inner boundary on the field values at $\mathscr{I}^+$ needed to establish an asymptotically flat gauge, the opportunities for subdividing the CCE domain for parallelization purposes are limited. However, we are able to take advantage of the task-based parallelism in SpECTRE to: a) parallelize independent portions of the CCE information flow, and b) efficiently parallelize the CCE calculation with a simultaneously running Cauchy simulation.

A. Component construction

In SpECTRE, we refer to the separate units of the simulation that may be executed in parallel via task-based parallelism as components. For instance, in the near-field region in which the domain can be parallelized among several subregions of the domain, each portion of the domain is associated with a component.

For SpECTRE CCE, we use three components (in addition to components that are used for the Cauchy evolution): one component for the characteristic evolution, another component dedicated to providing boundary data on the worldtube, and a third component for writing results to disk.

Much of the efficiency and precision of the SpECTRE CCE system comes from the ability to cover the entire asymptotic domain from the worldtube $\Gamma$ to $\mathscr{I}^+$ with a single spectral domain. In principle, there may be opportunity to parallelize multiple radial shells of the computation, but in practice our initial assessments indicated that there would be little gain for the typical gravitational wave extraction scenario. First, there is a significant constraint that comes from the asymptotic flatness condition: the gauge transformation throughout the domain on a given hypersurface depends on the asymptotic value $U|_{\mathscr{I}^+}$ on the same hypersurface, which forces a significant portion of the computation to serial execution. Additionally, we have seen very rapid convergence in the number of radial points used for the CCE system, so it is unlikely that subdividing the domain radially would offer much additional gain for the typical use case.

Therefore, the entire characteristic evolution system is assigned to a single component, and represents the computational core of the algorithm. The evolution component is responsible for

• The angular gauge transformation and interpolation (via Clenshaw recurrence)
• The calculation of the right-hand sides of the set of hierarchical equations (9)
• The integration of each of the radial ODEs
• The time interpolation and preparation of waveform data
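The three-component split can be pictured with a toy producer/consumer pipeline. The Python below is only a schematic under loud assumptions: threads and queues stand in for SpECTRE's task-based parallelism, the component functions and the placeholder arithmetic are invented for illustration, and nothing here reflects the actual SpECTRE interfaces. It shows one design point from the text: the evolution stage does compute work only, while a separate stage handles output.

```python
import queue
import threading

SENTINEL = None  # marks the end of the data stream


def worldtube_component(times, to_evolution):
    """Stand-in for the worldtube component: supplies boundary data."""
    for t in times:
        to_evolution.put({"time": t, "value": 2.0 * t})  # placeholder data
    to_evolution.put(SENTINEL)


def evolution_component(from_worldtube, to_observer):
    """Stand-in for the evolution component: compute only, no file I/O."""
    while True:
        data = from_worldtube.get()
        if data is SENTINEL:
            to_observer.put(SENTINEL)
            return
        waveform = data["value"] ** 2  # placeholder for the CCE steps
        to_observer.put((data["time"], waveform))


def observer_component(from_evolution, sink):
    """Stand-in for the third component: the only stage that writes."""
    while True:
        item = from_evolution.get()
        if item is SENTINEL:
            return
        sink.append(item)  # a real observer would write to disk here


def run_pipeline(times):
    """Run the three stages concurrently and return the collected output."""
    boundary_queue, output_queue = queue.Queue(), queue.Queue()
    sink = []
    threads = [
        threading.Thread(target=worldtube_component,
                         args=(times, boundary_queue)),
        threading.Thread(target=evolution_component,
                         args=(boundary_queue, output_queue)),
        threading.Thread(target=observer_component,
                         args=(output_queue, sink)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sink


results = run_pipeline([0.0, 0.5, 1.0])
```

Swapping the first function for a different data source leaves the other two stages untouched, which mirrors the modularity the section describes for the source of worldtube data.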

[Figure 4 diagram labels: One of: worldtube from disk, DG interpolation and GH boundary (parallel w/ Cauchy), analytic boundary $G_{\mu\nu} = 0$; boundary calculation from stored metric; interpolation/gauge to Bondi-Sachs; hypersurface solve; evolution; write waveform to disk.]

FIG. 4: Components of the CCE task-based parallelism system. The worldtube component (left) is modular and can be switched out according to the desired source of worldtube data. We currently support reading worldtube data from disk, interpolating worldtube data from a simultaneously running Generalized Harmonic system in SpECTRE, or computing analytic boundary data from a known solution or approximation to the Einstein field equations.

The core evolution component performs no reads from or writes to the filesystem, which ensures that the expensive part of the computation will not waste time waiting for potentially slow disk operations.

The second component used in CCE is the worldtube component. A worldtube component is responsible for:

• Collecting the Cauchy worldtube metric and its derivatives from an assigned data source
• Interpolating the data to time steps appropriate to the CCE evolution system
• Performing the transformation to the Bondi-Sachs-like coordinate system on the worldtube

The user has a choice of several different worldtube components, each of which corresponds to a different source of the metric quantities on the worldtube. Worldtube components are available that:

• Read worldtube data directly from disk
• Accept interpolated data from a simultaneously running Cauchy execution in SpECTRE
• Calculate worldtube data from an analytically determined metric on the boundary

Our methods for reading from disk are currently optimized for easily reading worldtube data written by SpEC, but our worldtube module should accept data from any code that can produce the spacetime metric and its first derivatives decomposed into spherical harmonic modes.

Finally, there is a generic observer component that handles the output of the waveform data to disk. When CCE is simultaneously running with a Cauchy evolution, there will be additional components running in parallel with the CCE components, such as components that perform the Cauchy evolution, components that search for apparent horizons, and components that write simulation data to disk. The division of the CCE pipeline into parallel components is illustrated in Fig. 4.

B. Independently stepped interface with Cauchy simulation

Because the Cauchy-characteristic evolution system does not have much opportunity to parallelize internally, we need to ensure that its serial execution is optimized. Our goal is that when running simultaneously with the highly parallel discontinuous Galerkin system used for the Generalized Harmonic evolution, the CCE system does not impose any significant runtime penalty.

An important contribution to the efficiency of the CCE system is that the solutions to the Einstein field equations are smooth and slowly varying in time. As a result, the spectral methods used in CCE converge rapidly, and the scales that we seek to resolve with the time-stepper are primarily on orbital timescales. Therefore, we anticipate that the CCE system should be able to take far larger timesteps than the Generalized Harmonic system running in concert, and it will be important for the overall efficiency of the extraction pipeline to adjust the time steps of the CCE evolution independently of the time
