Constraining stochastic 3-D structural geological models with topology information using approximate Bayesian computation in GemPy 2.1

Geosci. Model Dev., 14, 3899–3913, 2021
https://doi.org/10.5194/gmd-14-3899-2021
© Author(s) 2021. This work is distributed under
the Creative Commons Attribution 4.0 License.

Constraining stochastic 3-D structural geological models with
topology information using approximate Bayesian
computation in GemPy 2.1
Alexander Schaaf¹,², Miguel de la Varga², Florian Wellmann², and Clare E. Bond¹
¹ Geology and Petroleum Geology, School of Geosciences, University of Aberdeen, Aberdeen, AB24 3UE, UK
² Computational Geoscience and Reservoir Engineering, RWTH Aachen University, Aachen, Germany

Correspondence: Florian Wellmann (florian.wellmann@cgre.rwth-aachen.de)

Received: 11 May 2020 – Discussion started: 18 August 2020
Revised: 13 May 2021 – Accepted: 24 May 2021 – Published: 28 June 2021

Abstract. Structural geomodeling is a key technology for the visualization and quantification of subsurface systems. Given the limited data and the resulting necessity for geological interpretation to construct these geomodels, uncertainty is pervasive and traditionally unquantified. Probabilistic geomodeling allows for the simulation of uncertainties by automatically constructing geomodel ensembles from perturbed input data sampled from probability distributions. But random sampling of input parameters can lead to construction of geomodels that are unrealistic, either due to modeling artifacts or by not matching known information about the regional geology of the modeled system. We present a method to incorporate geological information in the form of known geomodel topology into stochastic simulations to constrain resulting probabilistic geomodel ensembles using the open-source geomodeling software GemPy. Simulated geomodel realizations are checked against topology information using an approximate Bayesian computation approach to avoid the specification of a likelihood function. We demonstrate how we can infer the posterior distributions of the model parameters using topology information in two experiments: (1) a synthetic geomodel using a rejection sampling scheme (ABC-REJ) to demonstrate the approach and (2) a geomodel of a subset of the Gullfaks field in the North Sea comparing both rejection sampling and a sequential Monte Carlo sampler (ABC-SMC). Possible improvements to processing speed of up to 10.1 times are discussed, focusing on the use of more advanced sampling techniques to avoid the simulation of unfeasible geomodels in the first place. Results demonstrate the feasibility of using topology graphs as a summary statistic to restrict the generation of geomodel ensembles with known geological information and to obtain improved ensembles of probable geomodels which respect the known topology information and exhibit reduced uncertainty using stochastic simulation methods.

1   Introduction

Structural geomodeling is an elemental part of visualizing and quantifying geological systems (Wellmann and Caumon, 2018). Topology relationships in geological systems (e.g., how layers are connected to each other stratigraphically, or their across-fault connectivity) are important constraints for fundamental geological processes, such as fluid and heat flow (Thiele et al., 2016a, b). Each unique interpretation (model) of a geological setting has a specific topology. And as geology is not only an experimental science, but also an interpretive and historical science (Frodeman, 1995), the deduction of the geomodel – often from sparse amounts of data – can inherently lead to numerous potentially valid geological interpretations (Bond et al., 2007), which themselves can lead to equally numerous topology graphs. This aspect is compounded by the complex nature of geological systems and interpretation bias imparted by geoscientists in the explicit creation of geomodels (Bond et al., 2007; Polson and Curtis, 2010; Bond, 2015). It also leads to the creation, and favoring, of specific models that fit expectations and prior knowledge (Baddeley et al., 2004) rather than consideration of the full range of possible models. However, methodologies to create

Published by Copernicus Publications on behalf of the European Geosciences Union.

models often focus on the creation of a single deterministic model (Bond et al., 2008) and lack systematic consideration of data uncertainty (Thore et al., 2002; Tacher et al., 2006; Bardossy and Fodor, 2013). These facts call for the development of alternative approaches. The increasing development of implicit modeling algorithms (Mallet, 2004; Hillier et al., 2014; Laurent et al., 2016) allows for the creation of vast structural geomodel ensembles by making use of interpolation functions, which makes the analysis and visualization of uncertainty using probabilistic simulation approaches possible (Bistacchi et al., 2008; Suzuki et al., 2008; Wellmann et al., 2010; Lindsay et al., 2012; Wellmann and Regenauer-Lieb, 2012; Wellmann, 2013).

The mathematical nature of implicit modeling, in combination with the use of a probabilistic modeling process, often leads to geologically unsound model realizations and modeling artifacts. Additionally, the modeling algorithms only take a limited set of input data types, e.g., layer interface locations and structural orientation data, which significantly limits the amount of geological information that can be included in the modeling process. Wellmann et al. (2017) and de la Varga and Wellmann (2016) showed how Bayesian inference can be used to reduce uncertainty and modeling artifacts in both synthetic and real, implicit, structural geomodel ensembles. Their concept uses supplemental geological information (e.g., layer thicknesses or fault offsets) in the form of likelihood functions to constrain stochastic geomodel ensembles. In other words, by conditioning the probability of model parameters to some additional data, we are able to increase the overall information of the probabilistic model. Additional data can be, for example, a range of possible layer thicknesses in a depositional setting, geophysics or arguably geological knowledge in the form of valid geometrical configurations.

While the overall idea has been demonstrated in some specific cases, the general question of how to define suitable likelihood functions for specific types of observations – given specific geological systems and diverse types of prior geological knowledge – still remains.

Geological expert knowledge contains much more information that is vital to model creation, such as understanding the geological processes that result in the thickening and thinning of sedimentary deposits and their relative spatial distribution. One key knowledge-based input into geomodeling is the understanding of the kinematic evolution of the rock units into their present configuration. While kinematic modeling software exists (see Groshong et al., 2012; Brandes and Tanner, 2014, for reviews), it is limited to “end-member” kinematic models, resulting in geometrical deformations defined by few parameters not taking into account a range of other factors, not least of which being the mechanics of the different units (Butler et al., 2018). But we can capture certain kinematics using topology information – for example, the across-fault connectivity of layers, for which extensional deformation leads to fundamentally different topological relationships than compressional deformation (see Fig. 1).

We therefore hypothesize that topological information about a geological system can be used as a meaningful constraint for probabilistic 3-D geomodeling outputs.

This topological information is difficult to incorporate into the mathematical foundations of implicit modeling functions and is highly case-dependent.

The origin of topological information is generally qualitative. For this reason, choosing a likelihood function, or trying to attach any probabilistic meaning to the comparison of topological graphs, does not seem to enhance the inference (Curtis and Wood, 2004). This work, favoring model simplicity, adopts an approximate Bayesian computation (ABC) approach to compute the posterior using a distance function instead of a likelihood function.

To test this approach we designed two distinct experiments: one synthetic and one case study.

1. We construct a synthetic fault model and explore its topological uncertainty. We do this by describing our input data not as fixed parameters, but as probability distributions. We then use Monte Carlo sampling to obtain input data realizations from which geomodels are constructed. We then show how a single topology graph can be used as a summary statistic in an ABC-rejection scheme to approximate the posterior model ensemble that honors the added information.

2. To test the same ABC approach on a real-world dataset, we apply it to a model extracted from a seismic interpretation of the North Sea Gullfaks field. We also explore a more advanced sampling technique to demonstrate possibilities for reducing the computational costs of the method.

In the following section we will give an overview of the applied implicit geomodeling approach, the basic concept of Bayesian inference and its use in probabilistic geomodeling, as well as the theory behind approximate Bayesian computation. We further describe how we analyze model topology and use it as a summary statistic. We will then introduce, in detail, both the synthetic fault model and the case study, followed by a comprehensive discussion of our findings.

2   Methodology

2.1   Implicit geomodeling

Several approaches exist for creating structural geomodels, which can be separated into three main categories: (a) interpolation, (b) kinematic methods and (c) process simulation. The interpolation of surfaces and volumes from spatial data is currently the most widely used approach in geosciences, typically performed manually by geoscientists, which requires robust knowledge of the geological setting and extensive amounts of data in order to robustly approximate reality. Additionally, highly complex structures such as extensive fault networks and repeatedly folded areas are challenging to recreate using current interpolation methods (Jessell et al., 2014; Wellmann et al., 2016; Laurent et al., 2016).

The open-source, Python-based implicit modeling package GemPy¹ (de la Varga et al., 2019) is used here. It is based on the work of Lajaunie et al. (1997) and Calcagno et al. (2008), and it allows the interpolation of geological interface position and plane orientation data by using a scalar field method in combination with cokriging (Chilès et al., 2004). For a detailed overview of the algorithm and the functionality of GemPy, we refer the reader to de la Varga et al. (2019).

¹ URL: https://github.com/cgre-aachen/gempy (last access: 26 June 2021)

2.2   Geological topology

Topology, referring to “properties of space that are maintained under continuous deformation, such as adjacency, overlap or separation” (Thiele et al., 2016a; Crossley, 2006), is a highly relevant concept in structural geology, as it provides a useful description of the relations between stratigraphic units across layer interfaces, faults or the contact to an intrusive body. Generally, eight binary topological relationships can exist between three-dimensional objects (Egenhofer, 1990), while a total of 69 relations are possible between simple lines, surfaces and bodies (e.g., surfaces without holes; see Zlatanova, 2000). From these eight Egenhofer–Herring relationships, meets (i.e., adjacency) is the most relevant one for describing structural and stratigraphic relationships, such as the across-fault connectivity of layers (see Fig. 1). The topology relationships of geological models can be represented by an adjacency graph, which represents topological units as individual nodes and their connections by edges (see Fig. 1). The adjacency topology of geological structures is highly dependent on deformation: compressional deformation leads to different connectivities in the topology graph than extensional deformation, but even the same type of deformation can lead to different topologies – as visualized by the horst and graben structures in Fig. 1. Not only the type of deformation has an important influence on the system topology, but also its amount – e.g., the fault throw. For an in-depth introduction and discussion of topology in geology see Thiele et al. (2016a) for the fundamental theory and Thiele et al. (2016b) and also Pakyuz-Charrier et al. (2019) for the influence of structural uncertainty on geomodel topology.

Figure 1. Idealized horst (a) and graben (b) structures with topology graph overlay, showing the difference in graph structure for different tectonic settings (modified from Fossen, 2010). The black nodes represent the centroids of the geobodies and the black edges the topology connections, together building a topology graph.

2.2.1   Computing geomodel topology

To compute the geomodel topology with the necessary computational efficiency to conduct a feasible stochastic simulation of realistic geomodels, we implemented a topology algorithm in the core of GemPy using theano (Theano Development Team, 2016). This enables the topology computation to run alongside the geomodel interpolation on graphics processing units (GPUs). As theano is a highly optimized linear algebra library, the employed method is mainly focused on utilizing matrix operations for the computation of the geomodel topology. When the implicit geomodel is discretized using a regular grid, it becomes a 3-D matrix of lithology IDs L (Fig. 2a), which we use for the calculation of the geomodel topology. For each geomodel we also have access to the 3-D Boolean matrices $F_n$ for each fault, representing the two sides of the respective fault by two ascending consecutive integers (Fig. 2b). Given these two types of input data, we compute the geomodel topology as follows.

1. The lithology matrix L and the summed fault matrices $\sum_{i=1}^{n_{\mathrm{fault}}} F_i$, where $n_{\mathrm{fault}}$ is the total number of faults in the geomodel, are combined into a matrix in which each lithology in each fault block is represented by its own unique integer, referred to as the topology labels matrix T (see Fig. 2c):

$$T = L + n_{\mathrm{lith}} \sum_{i=1}^{n_{\mathrm{fault}}} F_i, \tag{1}$$

with $n_{\mathrm{lith}}$ being the total number of lithology IDs in the geomodel.

2. The topology labels matrix T is then shifted twice (forward and backward) along each axis X, Y and Z. The two resulting shifted matrices $S_1$ and $S_2$ along each axis are then subtracted from each other to result in a difference matrix D, in which only the cells along a lithology or fault boundary are nonzero (Fig. 3).


3. The topology labels matrix T is then evaluated at all nonzero cells of D to obtain the two topology labels $n_a$ and $n_b$ of each topological connection (referred to as an edge e) in the geobody, which are stored in a set of unique edges E representing the geomodel topology. For the example shown in Figs. 2 and 3 the abbreviated set is E = {(0, 4), (0, 5), (0, 1), ..., (3, 7)}.

This method of topology calculation works on regular grids, which imposes a strong bias on the result: if the main lithological and structural features are not aligned with the grid orientation, the resulting topology graph could contain spurious connections (or miss real ones). For a more detailed discussion on the effects of model discretization see Wellmann and Caumon (2018).

2.3   Stochastic modeling approach

2.3.1   Bayesian inference

Bayesian inference is fundamentally different to the classical frequentist approach of inference. It treats probabilities as degrees of certainty of a parameter θ, which is inherently considered to be a random variable itself (Bolstad, 2009; VanderPlas, 2014). It is based on Bayes’ theorem (Eq. 2), which allows updating of a given probability – the prior probability p(θ) of a parameter θ – after the occurrence of a connected event (Bolstad, 2009). This updating process relies on the use of a likelihood function p(y|θ), representing the conditional probability of the observed data y given the prior probability of the underlying parameter θ and the theoretical connection of the occurring event. It is used to condition the prior into the posterior distribution p(θ|y), which represents the degree of certainty over the parameter θ given the occurrence of the event and its observed data y.

$$p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{\int p(y \mid \theta)\, p(\theta)\, \mathrm{d}\theta} \tag{2}$$

In this paper, we consider Bayes’ equation as a general way to combine conditional probabilities as in the interpretation of, for example, probabilistic graphical models (Koller et al., 2009). For use in geomodeling, the terms in Eq. (2) can be seen as follows (de la Varga and Wellmann, 2016; Gelman et al., 2013).

– Model parameters, θ, are model-defining parameters (e.g., layer interface positions, dip or fault parameters) used for the interpolation of the geomodel, which can be either deterministic (thus be exactly defined and known) or probabilistic. The latter represent uncertain parameters, which are expressed in the form of probability distributions (e.g., a normal distribution expressing the uncertainty of the vertical subsurface position of a layer interface). We will use θ′ as the notation for a sample from these parameter distributions.

– Observed data: y represents additional measurements or any other source of data, which should enhance the model definition by providing additional information with the goal of reducing model uncertainty or enabling the comparison of the model to reality (e.g., by comparing geophysical potential-field measurements with the corresponding forward simulation on the basis of a geomodel). In this work we use topology information in the form of a topology adjacency graph as the observed data. Notice that when the terms “observation” or “observed data” are used in the context of a probabilistic model, we refer to this mathematical term y instead of to the literal semantic meaning of the words.

– Likelihood functions, p(y|θ): these form the relationship between the model parameters θ and the observed data y. Essentially, this function describes the conditional probability for observing the data y given the parameters θ (e.g., MacKay and Kay, 2003). In the case of structural modeling, this essentially means that we compute the geomodel from the input parameters θ and compare model predictions (e.g., the thickness of a certain layer at a certain position or topology adjacency graphs) with additional observed data.

While constructing meaningful likelihood functions for physical properties such as layer thickness or geobody volume from observed data is straightforward (de la Varga and Wellmann, 2016), we have no proper framework to construct them for more abstract or “soft data”, such as our understanding of the geological setting or the topology relationships of our layers across faults or unconformities. For this reason, we chose to apply methods to estimate our posterior distributions given abstract geological information without specifying a likelihood function: approximate Bayesian computation.

2.3.2   Approximate Bayesian computation

Geoscientists often have extensive implicit knowledge of geological settings (e.g., our understanding of the tectonics of a system), but only a limited amount of this knowledge can be incorporated into the geological interpolation function (Wellmann and Caumon, 2018). Additionally, it is often difficult to define formal likelihood functions for geological knowledge as required for conventional Bayesian inference methods (Wood and Curtis, 2004). A less formal but valid alternative approach is to approximate the posterior distributions using approximate Bayesian computation (ABC) methods. These methods, also referred to as likelihood-free inference methods by some (Marin et al., 2012), evaluate the distance of stochastically generated models to our additional data using one or multiple summary statistics, S, instead of a probabilistic likelihood function. While summary statistics are often measures such as the mean, mode or median of a model, they tend to be insufficient in summarizing geomodels. In this work we use the geomodel topology graph as a summary


statistic of the geomodel to provide a meaningful comparison between geomodels.

Figure 2. (a) Lithology matrix L of an example 2-D geomodel that consists of four layers and a vertical fault in the center; (b) fault matrix F of the geomodel; (c) topology labels matrix T of the geomodel.

Figure 3. Vertical (a) and horizontal (b) difference matrix D showing all cells (red) in the shifted matrices $S_1$ and $S_2$, which are next to the interface between two different layers or of any layers across a fault. The highlighted (yellow) part shows the area in which the implicit interface must be located.

To obtain the approximate posterior distribution we need to sample from our prior parameter distributions, plug the sample values θ′ into our simulator function y (our geomodeling software), compute the summary statistic S(y(θ′)) (geomodel topology) and evaluate its distance to our observed summary statistic (data) S(y) (e.g., a geomodel topology graph). The most fundamental sampling scheme for ABC is based on rejection sampling (ABC-REJ; see Algorithm 1), for which the distance between our simulated data y(θ′) (the simulated geomodel) and observed data y (initial geomodel) is calculated using a distance function of their summary statistics (topology graphs), d(S(y), S(y(θ′))). The simulated model is accepted if the distance is at or below a user-specified error bound ε ≥ 0 (Sadegh and Vrugt, 2014); otherwise, it is rejected. The accepted samples form the approximate posterior. Thus, this method circumvents the need to specify a likelihood function for our additional data, while still approximating the posterior distributions incorporating the information of both our priors and our additional information (Sunnåker et al., 2013). Within this work we use the Jaccard distance, 1 − J (with J the Jaccard index), as a distance function between topology graphs.

A more advanced sampling scheme for ABC is sequential Monte Carlo sampling (ABC-SMC). In its simplest form it can be seen as an extension of rejection sampling by chaining rejection sampling simulations together (each referred to as an epoch). During the first epoch of rejection sampling, a large error threshold $\epsilon_1$ is used while sampling from the prior distributions p(θ). The accepted samples, forming the posterior distributions of the first epoch, form the updated priors of the second epoch by replacing the priors with the kernel density estimation $\hat{f}_h(\theta_{\mathrm{accepted}})$ of the posterior samples. Iteratively, with every epoch, the error threshold ε is reduced until it reaches the target value (e.g., ε = 0) to obtain the final posterior sample. Thus, every epoch, the sampler “learns” from the previous epoch by adjusting the prior distributions further towards the posterior distributions. As ABC-REJ tends to suffer from potentially low computational efficiency when using low error thresholds ε, the iterative shrinking paired with adjustment of the prior distributions can potentially obtain the approximate posterior much more quickly. We apply this sampling scheme to our Gullfaks case study to show the potential speed-ups.
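The rejection scheme described in this section can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration: the simulator is a toy 2-D section (four flat layers cut by one vertical fault) rather than GemPy, the only uncertain parameter is the fault throw in grid cells with a Normal(0, 10) prior, and the function names are hypothetical. The topology is obtained by comparing each cell of the labels matrix with its neighbour along every axis, which mirrors the shift-and-subtract idea but uses NumPy slicing instead of the theano implementation.

```python
import numpy as np

N_LITH, THICKNESS, NX = 4, 10, 20      # toy section: 4 layers, 10 cells thick each
NZ = N_LITH * THICKNESS

def simulate_labels(throw):
    """Toy simulator y(theta'): the block right of a central vertical fault
    is displaced downwards by `throw` cells (hypothetical stand-in for GemPy)."""
    z = np.arange(NZ)[:, None]                                  # row index, increasing downwards
    fault = (np.arange(NX)[None, :] >= NX // 2).astype(int)     # F: 0 = left block, 1 = right block
    lith = np.clip((z - throw * fault) // THICKNESS, 0, N_LITH - 1)  # L
    return lith + N_LITH * fault                                # T = L + n_lith * F for one fault

def topology_edges(labels):
    """Summary statistic S: set of unique label pairs of adjacent cells."""
    edges = set()
    for axis in range(labels.ndim):
        a = np.swapaxes(labels, 0, axis)[:-1]   # every cell ...
        b = np.swapaxes(labels, 0, axis)[1:]    # ... and its neighbour along this axis
        boundary = a != b                       # differing labels mark a boundary
        lo = np.minimum(a, b)[boundary].tolist()
        hi = np.maximum(a, b)[boundary].tolist()
        edges.update(zip(lo, hi))
    return edges

def jaccard_distance(edges_a, edges_b):
    """1 - J, with J = |A intersect B| / |A union B| on the two edge sets."""
    union = edges_a | edges_b
    if not union:
        return 0.0
    return 1.0 - len(edges_a & edges_b) / len(union)

def abc_rejection(observed_edges, n_samples, epsilon, sigma=10.0, seed=0):
    """ABC-REJ: keep prior samples whose simulated topology lies within epsilon."""
    rng = np.random.default_rng(seed)
    accepted = []
    for theta in rng.normal(0.0, sigma, size=n_samples):        # theta' drawn from the prior
        sim_edges = topology_edges(simulate_labels(int(np.rint(theta))))
        if jaccard_distance(observed_edges, sim_edges) <= epsilon:
            accepted.append(theta)
    return np.array(accepted)

# "Observed" topology: the graph of an initial model with a throw of 5 cells
observed_edges = topology_edges(simulate_labels(5))
posterior = abc_rejection(observed_edges, n_samples=2000, epsilon=0.0)
```

With ε = 0 only throws that reproduce the observed graph exactly survive, so the accepted samples are a truncated version of the prior. Raising ε admits graphs that differ in a few edges, which is the quantity an ABC-SMC scheme would shrink from epoch to epoch.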


Table 1. Distribution parameters for prior parameterization of the synthetic fault model.

Name             Distribution   µ (m)   σ (m)
Sandstone_2_Z    Normal             0      50
Siltstone_Z      Normal             0      70
Shale_Z          Normal             0      90
Sandstone_1_Z    Normal             0     110
Main_Fault_X     Normal             0      60
Main_Fault_Z     Normal             0      60

                                                                         Ae between two geobodies could yield a more granular
2.4    Topology distance functions
                                                                         comparison that allows us to take into account trends of
To use geomodel topology as a constraint for probabilistic               the contact size. Thus, the ABC error tolerance  could
geomodels in an ABC framework, we need a consistent way                  be used to reject geomodels wherein certain topological
of comparing geomodel topologies – i.e., suitable distance               contact areas are above and/or below a certain value:
functions. We consider three possible comparison methods                 Ae − low ≤ Ae ≤ Ae + high .
here.                                                                 In this work we demonstrate the second approach, as it
 1. Presence or absence of defined connections. As the re-         allows us to directly compare entire geomodel topologies.
    lational topology information is captured in adjacency         We have chosen to compare the simulated results to a sin-
    graphs, the most fundamental approach is to check if           gle topology graph – the initial geomodel topology. This ap-
    two relevant nodes n1 and n2 (e.g., representing two re-       proach was selected as a base case to demonstrate how large
    gions in the model) share an edge e = (n1 , n2 ) (are ad-      variations in geomodel topology observed in the stochas-
    jacent) and if this edge exists in both models. This is the    tic simulation of input data uncertainties in geomodels (see
    most simple way of comparing specific aspects of rela-         Thiele et al., 2016b) can be constrained to a base topology
    tional topology between geomodels. This approach can           (i.e., conceptual model). This of course reinforces the bias
    be viewed as a Boolean comparison that is true if the          of the initial base model in the uncertainty simulation, but
    given edge exists in both models and false if not. This        it allows for the reliable exploration of the uncertainty of all
    also enables the direct comparison of i multiple edges,        possible geomodels honoring the topology constraint.
    which would result in a vector of i Boolean statements
    for each comparison [e1 , e2 , . . ., ei ].                    2.5     Quantifying uncertainty using Shannon entropy

 2. Comparing entire graphs. To compare topology graphs            Stochastic simulations yield vast ensembles of geomodel re-
    as a whole, Thiele et al. (2016b) describe the use of the      alizations, and their variability (and thus uncertainty) needs
    Jaccard index (Jaccard, 1912). It can be used to com-          to be analyzed and understood. The uncertainty of a single
    pare the similarity of sets by creating the ratio of the       geological entity (e.g., a layer or a fault) can be estimated
    intersection and union of two graphs A and B.                  from its frequency of occurrence in each single geomodel
                                                                   voxel. In order to analyze the whole geomodel uncertainty at
                   |A ∩ B|
      J (A, B) =                                            (3)    once, more sophisticated measures can be applied: the con-
                   |A ∪ B|                                         cept of Shannon entropy H can be used in a spatial context
      For two topology graphs A and B, this means we calcu-        to evaluate the uncertainty of an entire geomodel ensemble at
      late the ratio of edges (representing connected regions)     once, as described by Wellmann and Regenauer-Lieb (2012).
      shared in both (intersection: A ∩ B) and their total com-    Average model entropy H collapses the uncertainty of a ge-
      bined number of edges (union: A ∪ B). This ratio can be      omodel ensemble into a single number. It will be equal to 0
      used to efficiently identify all unique topology graphs in   if all cells x have only one possible outcome (no uncertainty)
      a given ensemble, as only an identical pair of graphs re-    and reach its maximum when all outcomes are equally likely
      sults in a Jaccard index of J (A, B) = 1. A comparison       for all cells of the model (maximum uncertainty).
      using the Jaccard index yields ratios of integers and thus
      a discrete comparison. This method also allows specify-      2.6     Experiment design
      ing a tolerance 0 <  < 1 for model acceptance, i.e., to
      accept models within the range 1 −  ≤ J ≤ 1.                2.6.1    Synthetic fault model

 3. Contact area. Comparing the number of actual edge              As a proof of concept we show how ABC can be used to
    pixels (or voxels) representing the area of the contact        incorporate geological knowledge and reasoning into an un-

Geosci. Model Dev., 14, 3899–3913, 2021                                             https://doi.org/10.5194/gmd-14-3899-2021
A. Schaaf et al.: Constraining stochastic 3-D structural geomodels with topology information                                         3905

Figure 4. (a) 3-D view of the synthetic fault model, with top surfaces of the four lithologies shown and the fault surface in blue. (b) X–Z
slice through the center of the discretized model showing partial input data (for visual brevity) and example standard deviations of prior
parameters used for the stochastic simulation. (c) Model overlaid with its topology graph used as our summary statistic for the ABC.

certain synthetic geomodel. This model represents a folded                      logical knowledge used during its construction, and thus
layer cake stratigraphy that is cut by a N–S-striking normal                    geometrical configurations more similar to this graph
fault to represent an idealized reservoir scenario frequently                   can be considered more likely (see also Thiele et al.,
encountered in the energy industry (see Fig. 4a).                               2016b; Pakyuz-Charrier et al., 2019). This graph would
   The prior parameterization is schematically visualized in                    be treated from this point on as an “observation” y due
Fig. 4b and consists of two different kinds of uncertain pa-                    to its use as a constraint within the probabilistic model.
rameters: (i) vertical location of the layer and fault interfaces               We are employing a rejection sampling scheme (ABC-
and (ii) lateral location of the fault interface, with the spe-                 REJ) with an error tolerance of  = 0 to obtain 500
cific parameterization displayed in Table 1. As this work fo-                   generated posterior models. The resulting posterior geo-
cuses on developing and describing a novel methodology for                      model ensemble will contain only samples with match-
constraining uncertain geomodels, we have chosen the un-                        ing topology graphs.
certainty parameterization of the synthetic geomodel entirely
subjectively as normal distributions increasing in uncertainty
with depth. The uncertainty is individually applied to each             2.6.2     Case study: the Gullfaks field
set of surface points to preserve surface shape within each of
the two fault blocks. Proper prior parameterization of uncer-           To demonstrate the applicability of the method to real
tain geomodels is a vital branch of research on its own (e.g.,          datasets we apply it to a model of part of the Gullfaks field,
Pakyuz-Charrier, 2018; Krajnovich et al., 2020) and out of              located in the northern North Sea. The field is located in the
the scope of this work.                                                 western part of the Viking Graben and consists of the NNE–
   Two separate simulations were run for this experiment so             SSW-trending 10–25 km wide Gullfaks fault block (Fossen
we can see how topology can constrain an uncertain geo-                 and Hesthammer, 1998). For a detailed overview of the re-
model compared to the Monte Carlo simulation of input pa-               gional and structural geology we refer to Fossen and Rørnes
rameter uncertainties alone.                                            (1996), Fossen and Hesthammer (1998), Fossen et al. (2000),
                                                                        and Schaaf and Bond (2019).
  1. A Monte Carlo simulation of the prior parameters was
                                                                           For the experiment, we constructed a base geomodel
     run to evaluate the uncertainty in the resulting geo-
                                                                        (Fig. 4a) founded in an interpretation of the training dataset
     model ensemble consisting of 2000 generated models.
                                                                        provided with the seismic interpretation software Petrel™.
     This represents our “base case” uncertainty without any
                                                                        We have chosen a relatively simple subset of the interpreta-
     topological constraints. It is important to note that this
                                                                        tion containing two faults, three horizon tops (Tarbert – red,
     simulation is only a forward uncertainty propagation
                                                                        Ness – purple and Etive – green) and the Base Cretaceous
     and does not entail any type of inference.
                                                                        Unconformity (BCU, yellow). To create the geomodel, we
  2. An approximate Bayesian computation was done using                 exported the corresponding seismic interpretation data from
     the initial model topology graph (see Fig. 4c) to repre-           Petrel and imported them into Python. The surface interpre-
     sent our geological knowledge. This graph is extracted             tations were then decimated down to 510 surface points and
     from the initial geological model, which has been manu-            187 surface orientations via a target reduction of 80 % per
     ally built by an expert. The assumption is that this topo-         fault block or surface using the VTK-based decimation func-
     logical graph encapsulates important aspects of the geo-           tionality of pyvista (Sullivan and Kaszynski, 2019) to re-

https://doi.org/10.5194/gmd-14-3899-2021                                                     Geosci. Model Dev., 14, 3899–3913, 2021
3906                         A. Schaaf et al.: Constraining stochastic 3-D structural geomodels with topology information

Figure 5. (a) 3-D view of the Gullfaks geomodel used as the mean prior model in our case study. (b) X–Z section through the discretized
geomodel with an overlaid observed topology graph showing the inter- and intra-fault block relations of geobodies.

Table 2. Distribution parameters for prior parameterization of the           tainty for comparison with the following two simula-
Gullfaks case study.                                                         tions.

          Name           Distribution   µ (m)    σ (m)                    2. An ABC-REJ simulation was run using the initial ge-
                                                                             omodel topology graph (see Fig. 4b) to represent our
          BCU Z          Normal              0     43.3
                                                                             geological knowledge. We used an error threshold of
          Fault 3 X      Normal              0     90.9
          Fault 4 X      Normal              0     90.5
                                                                              = 0.025 for 1000 accepted posterior samples, as the
          Tarbert A Z    Normal              0     46.5                      threshold was small enough to constrain the posterior
          Tarbert B Z    Normal              0     45.5                      topology spread to the initial geomodel topology graph.
          Tarbert C Z    Normal              0     44.2
          Ness A Z       Normal              0     48.6                   3. An ABC-SMC simulation was run using the same ini-
          Ness B Z       Normal              0     46.7                      tial geomodel topology graph. We ran six SMC epochs
          Ness C Z       Normal              0     45.1                      using  values of 0.3, 0.2, 0.1, 0.075, 0.05 and 0.025.
          Etive A Z      Normal              0     50.9                      Each epoch was run for 1000 accepted posterior sam-
          Etive B Z      Normal              0     48.1                      ples.
          Etive C Z      Normal              0     46.3

                                                                      3     Results
tain the best possible surface shape while allowing fast im-
plicit geomodel construction times in GemPy.                          3.1    Synthetic fault model
   The prior parameterization consists of two different kinds
of uncertain parameters: (i) vertical location of the layer in-       Simulating the uncertainties encoded in the prior parame-
terfaces within each fault block and (ii) the lateral location        terization resulted in 100 unique model topologies within
of the fault interfaces. This parameterization is similar to the      the geomodel ensemble of 2000 models, with 18 topology
synthetic fault model (all specifications are listed in Table 2),     graphs occurring at least 10 times and the most frequent
and all sets of surface points within each individual fault           14 making up 90 % of geomodel ensemble topologies. It is
block were perturbed together to retain surface shape. This           also notable that the most frequent topology graph (29.5 %)
parameterization was chosen to demonstrate how even a few             is not the initial (mean prior) topology graph (15.6 %), but
uncertain parameters in an uncertainty modeling workflow              rather represents models wherein the shale layer (green) of
can lead to highly uncertain results, especially regarding the        the footwall shares an across-fault connection with the sand-
topology graphs of the resulting geomodel ensembles in real-          stone 2 layer (red) of the hanging wall. The uncertainty of
world geomodels. We then conducted a sensitivity study of             the prior geomodel ensemble is visualized in Fig. 5a–c in X–
the topological spread with respect to the geomodel resolu-           Z, Y–Z and X–Y sections as Shannon entropy, as described
tion. This allowed us to determine the appropriate geomodel           in the Methodology section. All three sections through the
resolution necessary for our experiment. Next, we performed           model clearly show the uncertainty of the layer interface po-
three separate simulations to compare different approaches.           sition and the highest uncertainty around the fault surface. In
                                                                      comparison, applying a single topology graph as a summary
  1. A Monte Carlo simulation was run of the prior uncer-             statistic to the simulation using ABC leads to significantly
     tainty for 1000 samples to evaluate the spatial uncer-           reduced uncertainty throughout the geomodel ensemble (see
     tainty and the topological spread of the resulting geo-          Fig. 5d–f), with average geomodel ensemble entropy being
     model ensemble. This serves as our base case uncer-              reduced from H prior = 0.44 down to H posterior = 0.31, which


is a drop in geomodel uncertainty of nearly 30 %. Visualizing the entropy difference between the prior and the posterior geomodel ensembles shows the highest reduction in entropy for the two inner layer interfaces (see Fig. 6) and not around the fault surface. As expected, constraining the simulation using a single topology graph with an error of ε = 0 collapses the number of geomodel ensemble topologies from 100 down to 1.

Figure 6. Shannon entropy slices in the X–Z (a, d), Y–Z (b, e) and X–Y (c, f) planes of the prior (top, a–c) and posterior (bottom, d–f) geomodel ensemble. The white lines show the location of other respective cross sections.

Figure 7. X–Y section of entropy difference between the forward simulated entropy and the approximate posterior entropy. The plot highlights areas where the entropy was reduced (blue), increased (red) and kept constant (white).

Figure 7 plots the kernel density estimations (KDEs) of the input parameter distributions of prior (grey) and posterior (colored) samples. The strongest change in the mean from prior to posterior distributions occurred for the vertical interface location perturbation priors of sandstone 2 (red), shale (green) and sandstone 1 (brown; see Fig. 7), with the first shifted to higher mean z values and the latter two shifted deeper by −72 and −53 m, respectively. Additionally, the initially normally distributed prior of sandstone 1 shows a strong negative skewness of −0.61 in the posterior distribution. The standard deviation for the siltstone and shale interface distributions was reduced by roughly 32 % and 40 %, respectively. The prior and posterior distributions for the lateral and vertical fault parameter uncertainties show no significant difference (panels e and f).

3.2 Case study: the Gullfaks field

Forward simulation of the prior uncertainties of the Gullfaks geomodel resulted in 676 unique geomodel topologies within a 1000-model ensemble, with 116 unique topologies occurring more than once. Again, the most frequent topology graph is not the initial (mean prior) topology graph. The uncertainty of an X–Z section of the forward ensemble is visualized in Fig. 9a using Shannon entropy. The section illustrates the general trend of uncertainty throughout the forward


simulation: we observe the highest uncertainty surrounding the two faults in the geomodel, especially around the eastern fault. The area also shows increased uncertainty due to the interaction of layer interfaces, the fault and the vertical vicinity of the BCU.

Figure 8. Prior (grey) and posterior (color) kernel density estimations for the different stochastic model parameters for our synthetic fault model.

The initial topology graph is used as a constraining summary statistic using ABC with rejection sampling (ABC-REJ) and a threshold of ε = 0.025. The absolute threshold value will be directly proportional to the sensitivity of the model geometry with respect to the stochastic parameters. This prevents the selection of a value independent of the actual geological model under study. In this case study, the value of ε has been chosen empirically by performing several predictive simulations. Results were evaluated based on their correspondence to the geological setting, as judged by expert knowledge.

The results show that this approach leads to reduced uncertainty, as exemplified by the entropy section shown in Fig. 9b. At this threshold, the approximate posterior geomodel ensemble contains only the applied initial topology graph. Using rejection sampling with such a strict threshold resulted in a very low acceptance of only 0.59 % of simulated geomodels, which required about 40 h of simulation time to obtain 1000 posterior samples (see footnote 2). In contrast, using a sequential Monte Carlo sampling scheme (ABC-SMC) required only 3.96 h to obtain the same number of posterior samples at the same threshold – a speed-up by a factor of 10.1. This includes the five sampling epochs using ε = {0.3, 0.2, 0.1, 0.075, 0.05} with 1000 accepted samples each, which are used to sequentially adapt the priors.

Figure 11a shows the number of unique topologies for forward simulations and each threshold of the ABC-SMC. As we iteratively lower the acceptable threshold during the SMC simulation, the simulated and accepted topologies iteratively converge towards the topology graph we used as our prior geological knowledge. The average geomodel ensemble entropy H also iteratively decreases from 0.233 for the forward simulation down to 0.112 at ε = 0.025 (see Fig. 11b), showing how fixing a probabilistic geomodel to a single topology graph can significantly reduce, or rather significantly constrain, the simulated uncertainty.

Figure 8 shows how the ABC-SMC simulation iteratively affects the probability distributions of selected probabilistic geomodel parameters with decreasing thresholds ε. Each row shows the consecutive epochs of the ABC-SMC simulation and corresponds to a specific ε. Each column describes a different stochastic parameter in the stochastic model. By applying the initial topology graph of the geomodel as our summary statistics, we can directly see here how the parameter distribution for the BCU (Fig. 8a) shifts its mean µ by 47.4 m upwards and reduces its standard deviation σ by 35.8 % to accommodate our geological knowledge about the geomodel topology. We can observe this effect in the entropy section of the posterior geomodel ensemble as well (Fig. 9b). In Fig. 10, we show the difference in entropy between the prior and approximate posterior geomodel ensemble shown in Fig. 9, where areas with decreasing entropy values are shown in blue and increasing values in red. We observe here how the BCU moves upward and increases the entropy there, while lowering entropy in the lithologies below. The parameter distributions for Tarbert B (Fig. 8b, red) and Etive B (Fig. 8c, green) show similar behavior: a shifted mean and reduced standard deviation to accommodate the topology information. We see a much stronger reduction in standard deviation for the two faults (Fig. 8d, e): 80.4 % and 80.0 % for Fault A and Fault B, respectively. This is also shown as the strongest reduction in entropy in Fig. 10.

   2 The experiment was run on consumer-grade hardware, leveraging GPU computation: Intel Core i5-8600 K @ 3.60 GHz, Nvidia GeForce RTX 2070 8 GB GDDR6, 16 GB DDR4 RAM @ 2133 MHz.
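
The topology statistics reported in this section (number of unique topology graphs in an ensemble, share of the most frequent graph) can be obtained with a simple counting step. The following is a minimal sketch, assuming each topology graph is available as a collection of undirected node pairs; the function names and the toy ensemble are hypothetical and not taken from GemPy. It relies on the fact that two graphs have a Jaccard index of 1 only if their edge sets are identical, so that set equality is sufficient to identify unique graphs.

```python
from collections import Counter


def topology_key(edges):
    """Hashable, order-independent form of an undirected topology graph."""
    return frozenset(frozenset(edge) for edge in edges)


def summarize_topologies(ensemble):
    """Return (number of unique graphs, [(graph, relative frequency), ...])."""
    counts = Counter(topology_key(edges) for edges in ensemble)
    total = sum(counts.values())
    ranked = [(graph, n / total) for graph, n in counts.most_common()]
    return len(counts), ranked


# Four toy realizations; node labels are illustrative only.
ensemble = [
    [("shale", "siltstone"), ("siltstone", "sandstone_2")],
    [("siltstone", "shale"), ("sandstone_2", "siltstone")],  # same graph, other order
    [("shale", "siltstone"), ("shale", "sandstone_2")],
    [("shale", "siltstone"), ("siltstone", "sandstone_2")],
]
n_unique, ranked = summarize_topologies(ensemble)
print(n_unique, ranked[0][1])
```

Storing each edge as a frozenset makes the comparison independent of node order within an edge and of edge order within a graph.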


Figure 9. Prior (grey) and posterior (colored) kernel density estimations for selected model parameters (a–e) for the six epochs (each
row represents an epoch) of the ABC-SMC simulation of the Gullfaks case study, showing how the simulation iteratively approaches the
approximate posterior distribution, which shows the possible parameter uncertainty given our topological information. The mean µ and
standard deviation σ are shown for the first and last epochs.
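
The epoch-wise tightening of the threshold described for the ABC-SMC simulation can be illustrated with a generic population Monte Carlo variant of ABC-SMC for a single parameter. This is a sketch under stated assumptions, not the sampler used in the study: the function names, the Gaussian perturbation kernel with a standard deviation of sqrt(2 × weighted population variance), the Normal(0, 60) prior and the toy distance function are all chosen for illustration. Only the threshold schedule is taken from the text.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def abc_smc(distance, prior_mu, prior_sigma, epsilons, n_particles, rng):
    """Return one (particles, weights) population per threshold in epsilons."""
    populations = []
    particles, weights, tau = [], [], None
    for epoch, eps in enumerate(epsilons):
        new_particles, new_weights = [], []
        while len(new_particles) < n_particles:
            if epoch == 0:
                theta = rng.gauss(prior_mu, prior_sigma)   # sample the prior
            else:
                parent = rng.choices(particles, weights=weights)[0]
                theta = rng.gauss(parent, tau)             # perturb a previous particle
            if distance(theta) > eps:
                continue                                   # reject this proposal
            if epoch == 0:
                weight = 1.0
            else:
                # importance weight: prior density / mixture proposal density
                proposal = sum(w * normal_pdf(theta, p, tau)
                               for w, p in zip(weights, particles))
                weight = normal_pdf(theta, prior_mu, prior_sigma) / proposal
            new_particles.append(theta)
            new_weights.append(weight)
        total = sum(new_weights)
        particles = new_particles
        weights = [w / total for w in new_weights]
        mean = sum(w * p for w, p in zip(weights, particles))
        var = sum(w * (p - mean) ** 2 for w, p in zip(weights, particles))
        tau = math.sqrt(2.0 * var)  # kernel std for the next epoch (common heuristic)
        populations.append((particles, weights))
    return populations


def toy_distance(theta):
    """Hypothetical stand-in for simulating a geomodel and comparing topologies."""
    return min(1.0, abs(theta) / 100.0)


epsilons = [0.3, 0.2, 0.1, 0.075, 0.05, 0.025]  # threshold schedule quoted in the text
populations = abc_smc(toy_distance, 0.0, 60.0, epsilons, 200, random.Random(1))
final_particles, final_weights = populations[-1]
print(max(abs(p) for p in final_particles))
```

Because each epoch proposes from the previous accepted population rather than from the prior, far fewer simulations are rejected at the final, strict threshold than under plain rejection sampling.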

Figure 10. (a) Section of the entropy block of the forward simulation for the prior uncertainty (HT = 0.223). (b) Section of the entropy block of the final epoch (ε = 0.025) of the ABC-SMC simulation (HT = 0.113).
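
The cell-wise and average entropy values discussed in this paper can be illustrated with a short sketch of the Shannon entropy of a geomodel ensemble. It assumes that each realization is stored as a flat list of lithology ids (one per cell) and that the logarithm is taken to base 2; both are assumptions made here for illustration, as are the function names, and the example is not the GemPy implementation.

```python
import math
from collections import Counter


def cell_entropy(labels, base=2.0):
    """Shannon entropy of the outcomes observed in one cell across the ensemble."""
    n = len(labels)
    entropy = 0.0
    for count in Counter(labels).values():
        p = count / n
        entropy -= p * math.log(p, base)
    return entropy


def average_entropy(ensemble, base=2.0):
    """Return (per-cell entropies, their mean) for a list of realizations."""
    n_cells = len(ensemble[0])
    per_cell = [cell_entropy([model[i] for model in ensemble], base)
                for i in range(n_cells)]
    return per_cell, sum(per_cell) / n_cells


# Four toy realizations with three cells each (lithology ids are arbitrary).
ensemble = [
    [1, 1, 1],
    [1, 1, 2],
    [1, 2, 3],
    [1, 2, 4],
]
per_cell, mean_entropy = average_entropy(ensemble)
print(per_cell, mean_entropy)
```

The first cell has a single outcome and therefore zero entropy, whereas the last cell, with four equally likely outcomes, reaches the maximum for four classes.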

4 Discussion

We showed how topology information, as an encoding for important aspects of geological knowledge and reasoning, can be included in probabilistic geomodeling methods in a Bayesian framework. The simulation experiments for our two case studies demonstrated that we are able to approximate posterior distributions to obtain probabilistic geomodel ensembles that honor both our prior parameter knowledge and qualitative geological knowledge. If the applied topological information is meaningful, then the constrained stochastic geomodel ensemble will see a meaningful reduction in uncertainty and will subsequently allow for more precise model-based estimates and decision-making (Stamm et al., 2019). More importantly, the (approximate) Bayesian approach requires the explicit statement of the geological


knowledge (here the topology information) used in the probabilistic geomodel, increasing the transparency of assumptions made during the geomodeling process and any subsequent decisions.

Figure 11. X–Z section of entropy difference between the forward-simulated entropy and the approximate posterior entropy H (ε = 0.025). The plot highlights areas where the entropy was reduced (blue), increased (red) and kept constant (white).

Figure 12. (a) Number of unique topologies within the geomodel ensembles of each SMC epoch, showing the iterative reduction in topological uncertainty throughout the SMC simulation. (b) Average geomodel entropy of the ensembles for each epoch, showing how the reduction of topological uncertainty shown in (a) affects the total geomodel uncertainty.

With our approach, we directly address a scientific challenge raised in recent work by Thiele et al. (2016b) that known topological relationships are frequently not honored during the probabilistic modeling process, thus potentially invalidating large parts of the resulting geomodel ensemble. Injecting topology information into a Bayesian approach allows us to obtain topologically valid, and hence geologically reasonable, geomodel ensembles. And, although we have only used simple topology information within this study, the demonstrated ABC approach allows us to easily scale the amount of topology information used: from simple true–false comparisons of single topology graphs to the use of a whole range of topology graphs and relationships. If a set of acceptable topologies is used, one could, for example, accept a simulated model if it matches at least one within the error tolerance.

The work of Pakyuz-Charrier et al. (2019) shows how clustering of probabilistic geomodel topologies can be used to differentiate between different modes of topologies. Their approach compares geomodel topologies by describing them as half-vectorized adjacency matrices, resulting in a binary string that can be compared using the Hamming distance (Hamming, 1950). It could be considered as a different distance metric in the ABC approach presented in this work to constrain the simulated probabilistic geomodel. And, while their work focuses on the analysis of existing probabilistic geomodel ensembles, our approach focuses on training probabilistic geomodels on topology information.

As more complex geomodels strongly increase the required parameterization to accurately describe the model domain in a probabilistic framework, constraining them with topological information could help keep this parameterization at computationally feasible levels by reducing the parameter dimensionality, while still obtaining meaningful geomodels (e.g., free of modeling artifacts caused by random perturbations of the limited input data). This would not work using an inefficient rejection sampling scheme (e.g., ABC-REJ) but would rather require the use of “adaptive” sampling algorithms to efficiently explore the posterior parameter space without wasting too much computing power on rejected models (e.g., ABC-SMC). In our Gullfaks case study, we have not only shown the efficacy of the method in a real-world example, but have also demonstrated the stark increase in computational efficiency when using advanced sampling techniques. The SMC sampler used in our work requires manual setting of the acceptance thresholds, which directly influence the algorithm’s efficiency in acquiring samples of the approximate posterior distribution. Adaptive SMC methods automatically tune acceptance thresholds to increase sampling efficiency “on the fly” to minimize computation

time and avoid manual (subjective) selection of thresholds (Del Moral et al., 2012).

Sadegh and Vrugt (2014) describe a more complex ABC algorithm based on Differential Evolution Adaptive Metropolis (DREAM-ABC) and demonstrate its much higher efficiency in approximating the posterior. It might be of particular interest for the approximate inference of complex structural geomodels with topology constraints, as it has shown promise to very efficiently explore high-dimensional (read: large amount of prior parameters) and multi-modal parameter spaces. When using multiple topology graphs (which are discrete) in an ABC framework, the posterior parameter space may potentially become multi-modal, which poses significant challenges for traditional Markov chain-based samplers (Feroz and Hobson, 2008). The approach by Sadegh and Vrugt (2014) is based on combining multiple Markov chains, which natively supports parallel computing and would thus allow for a high scalability of the approach to complex, computationally intensive geomodels.

Alternatively, Bayesian optimization for likelihood-free inference (BOLFI; Gutmann and Corander, 2016) could be worth considering for complex structural geomodels. The method abstracts the simulator and/or implicit function into a statistical surrogate model between the priors and the summary statistics and then attempts to minimize their distance, with the potential to significantly reduce the number of needed computations of the geomodel. Overall, the spatial and discrete nature of geomodels and the use of discrete summary statistics pose unique challenges for sampling algorithms, requiring further research to identify algorithms that can confidently converge and minimize the high computational cost of probabilistic 3-D geomodels.

The method demonstrated the effect of topology information on geomodel uncertainty – showing how well the parameterization of a probabilistic geomodel fits our geological assumptions. The acceptance rates during sampling could potentially be used as a proxy for the validity of our assumptions: low acceptance rates could reveal a bad fit between our model and our added geological knowledge and reasoning. Using entropy-difference plots, the effect of geological assumptions on geomodel uncertainty can be analyzed spatially, e.g., how it changes around faults and other structures in the geomodel ensemble.

5   Summary

    – We have shown how to use approximate Bayesian computation to constrain probabilistic geomodels so that the approximate posterior incorporates known topology information.

    – The method enables additional geological knowledge and reasoning to be explicitly encoded and incorporated into probabilistic geomodel ensembles, potentially increasing the transparency of the modeling assumptions.

    – As opposed to standard MC with rejection, the implemented SMC approach makes the use of ABC feasible in realistic settings. Further research into using more advanced sampling schemes could provide additional speed-ups in obtaining the posterior geomodel ensemble, which is especially relevant for computationally more expensive complex geomodels with large parameterizations.

Code and data availability. Input data and scripts to run the model and produce the plots for all the simulations presented in this paper are archived at Zenodo (Schaaf, 2020).

Author contributions. AS was responsible for the data curation, investigation, validation and visualization. Conceptualization, design and development of the research’s methodology were done by AS, MdlV and FW. Formal analysis and software development were done by AS and MdlV. Supervision, research resources and funding acquisition were provided by CB and FW. AS was responsible for writing the original draft, while all authors contributed to the review and editing process.

Competing interests. The authors declare that they have no conflict of interest.

Disclaimer. Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Acknowledgements. We would like to thank Total E&P UK in Aberdeen for funding this research. We also thank Fabian Stamm for providing the wonderful synthetic geomodel used in this paper. We are grateful for the constructive reviews from Ashton Krajnovich and an anonymous reviewer for helping us improve this paper.

Financial support. This research was conducted within the scope of a Total E&P UK-funded postgraduate research project.

Review statement. This paper was edited by Thomas Poulet and reviewed by Ashton Krajnovich and one anonymous referee.

References

Baddeley, M. C., Curtis, A., and Wood, R.: An introduction to prior information derived from probabilistic judgements: elicitation of knowledge, cognitive bias and herding, Geological Society, London, Special Publications, 239, 15–27, 2004.
Bardossy, G. and Fodor, J.: Evaluation of Uncertainties and Risks in Geology: New Mathematical Approaches for Their Handling, Springer Science & Business Media, 2013.
Bistacchi, A., Massironi, M., Dal Piaz, G. V., Dal Piaz, G., Monopoli, B., Schiavo, A., and Toffolon, G.: 3D Fold and Fault Reconstruction with an Uncertainty Model: An Example from an Alpine Tunnel Case Study, Comput. Geosc., 34, 351–372, https://doi.org/10.1016/j.cageo.2007.04.002, 2008.
Bolstad, W. M.: Understanding Computational Bayesian Statistics, John Wiley & Sons, 2009.
Bond, C., Gibbs, A., Shipton, Z., and Jones, S.: What Do You Think This Is? “Conceptual Uncertainty” in Geoscience Interpretation, GSA Today, 17, 4, https://doi.org/10.1130/GSAT01711A.1, 2007.
Bond, C. E.: Uncertainty in structural interpretation: Lessons to be learnt, J. Struct. Geol., 74, 185–200, 2015.
Bond, C. E., Shipton, Z. K., Gibbs, A. D., and Jones, S.: Structural models: optimizing risk analysis by understanding conceptual uncertainty, First Break, 26, 6, https://doi.org/10.3997/1365-2397.2008006, 2008.
Brandes, C. and Tanner, D. C.: Fault-related folding: A review of kinematic models and their application, Earth-Sci. Rev., 138, 352–370, 2014.
Butler, R. W., Bond, C. E., Cooper, M. A., and Watkins, H.: Interpreting structural geometry in fold-thrust belts: Why style matters, J. Struct. Geol., 114, 251–273, 2018.
Calcagno, P., Chilès, J. P., Courrioux, G., and Guillen, A.: Geological Modelling from Field Data and Geological Knowledge: Part I. Modelling Method Coupling 3D Potential-Field Interpolation and Geological Rules, Phys. Earth Planet. In., 171, 147–157, https://doi.org/10.1016/j.pepi.2008.06.013, 2008.
Chilès, J. P., Aug, C., Guillen, A., and Lees, T.: Modelling the Geometry of Geological Units and Its Uncertainty in 3D From Structural Data: The Potential-Field Method, Proceedings of international symposium on orebody modelling and strategic mine planning, Perth, Australia, 22, 24 pp., 2004.
Crossley, M. D.: Essential Topology, Springer Science & Business Media, 2006.
Curtis, A. and Wood, R.: Optimal Elicitation of Probabilistic Information from Experts, Geological Society, London, Special Publications, 239, 127–145, https://doi.org/10.1144/GSL.SP.2004.239.01.09, 2004.
de la Varga, M. and Wellmann, J. F.: Structural Geologic Modeling as an Inference Problem: A Bayesian Perspective, Interpretation, 4, SM1–SM16, https://doi.org/10.1190/INT-2015-0188.1, 2016.
de la Varga, M., Schaaf, A., and Wellmann, F.: GemPy 1.0: open-source stochastic geological modeling and inversion, Geosci. Model Dev., 12, 1–32, https://doi.org/10.5194/gmd-12-1-2019, 2019.
Del Moral, P., Doucet, A., and Jasra, A.: An Adaptive Sequential Monte Carlo Method for Approximate Bayesian Computation, Stat. Comput., 22, 1009–1020, https://doi.org/10.1007/s11222-011-9271-y, 2012.
Egenhofer, M. J.: Categorizing binary topological relations between regions, lines, and points in geographic databases, Santa Barbara CA National Center for Geographic Information and Analysis Technical Report, 9, 76 pp., 1990.
Feroz, F. and Hobson, M. P.: Multimodal Nested Sampling: An Efficient and Robust Alternative to Markov Chain Monte Carlo Methods for Astronomical Data Analyses, Mon. Not. R. Astron. Soc., 384, 449–463, https://doi.org/10.1111/j.1365-2966.2007.12353.x, 2008.
Fossen, H.: Structural Geology, Cambridge University Press, 1st Edn., 2010.
Fossen, H. and Hesthammer, J.: Structural Geology of the Gullfaks Field, Northern North Sea, Geological Society, London, Special Publications, 127, 231–261, https://doi.org/10.1144/GSL.SP.1998.127.01.16, 1998.
Fossen, H. and Rørnes, A.: Properties of Fault Populations in the Gullfaks Field, Northern North Sea, J. Struct. Geol., 18, 179–190, https://doi.org/10.1016/S0191-8141(96)80043-5, 1996.
Fossen, H., Odinsen, T., Færseth, R. B., and Gabrielsen, R. H.: Detachments and Low-Angle Faults in the Northern North Sea Rift System, Geological Society, London, Special Publications, 167, 105–131, https://doi.org/10.1144/GSL.SP.2000.167.01.06, 2000.
Frodeman, R.: Geological Reasoning: Geology as an Interpretive and Historical Science, Geol. Soc. Am. Bull., 107, 960–0968, https://doi.org/10.1130/0016-7606(1995)1072.3.CO;2, 1995.
Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., and Rubin, D. B.: Bayesian Data Analysis, Chapman and Hall/CRC, 675 pp., https://doi.org/10.1201/b16018, 2013.
Groshong, R., Bond, C., Gibbs, A., Ratcliff, R., and Wiltschko, D.: Preface: Structural balancing at the start of the 21st century: 100 years since Chamberlin, J. Struct. Geol., 41, 1–5, 2012.
Gutmann, M. U. and Corander, J.: Bayesian Optimization for Likelihood-Free Inference of Simulator-Based Statistical Models, J. Mach. Learn. Res., 17, 1–47, 2016.
Hamming, R. W.: Error detecting and error correcting codes, The Bell system technical journal, 29, 147–160, 1950.
Hillier, M. J., Schetselaar, E. M., de Kemp, E. A., and Perron, G.: Three-Dimensional Modelling of Geological Surfaces Using Generalized Interpolation with Radial Basis Functions, Math. Geosci., 46, 931–953, https://doi.org/10.1007/s11004-014-9540-3, 2014.
Jaccard, P.: The Distribution of the Flora in the Alpine Zone.1, New Phytologist, 11, 37–50, https://doi.org/10.1111/j.1469-8137.1912.tb05611.x, 1912.
Jessell, M., Aillères, L., de Kemp, E., Lindsay, M., Wellmann, F., Hillier, M., Laurent, G., Carmichael, T., and Martin, R.: Next Generation Three-Dimensional Geologic Modeling and Inversion, in: Building Exploration Capability for the 21st Century, Society of Economic Geologists, https://doi.org/10.5382/SP.18.13, 2014.
Koller, D., Friedman, N., and Bach, F.: Probabilistic Graphical Models: Principles and Techniques, MIT Press, 2009.
Krajnovich, A., Zhou, W., and Gutierrez, M.: Uncertainty assessment for 3D geologic modeling of fault zones based on geologic inputs and prior knowledge, Solid Earth, 11, 1457–1474, https://doi.org/10.5194/se-11-1457-2020, 2020.
Lajaunie, C., Courrioux, G., and Manuel, L.: Foliation Fields and 3D Cartography in Geology: Principles of a Method Based on Potential Interpolation, Math. Geol., 29, 571–584, https://doi.org/10.1007/BF02775087, 1997.
Laurent, G., Ailleres, L., Grose, L., Caumon, G., Jessell, M., and Armit, R.: Implicit Modeling of Folds and Overprinting Deformation, Earth Planet. Sc. Lett., 456, 26–38, https://doi.org/10.1016/j.epsl.2016.09.040, 2016.
