REXPACO: An algorithm for high contrast reconstruction of the circumstellar environment by angular differential imaging

Page created by Marvin Santos
 
CONTINUE READING
Astronomy & Astrophysics manuscript no. REXPACO                                                                         ©ESO 2021
                                               May 14, 2021

                                                    REXPACO: An algorithm for high contrast reconstruction
                                                 of the circumstellar environment by angular differential imaging
                                                               Olivier Flasseur1 , Samuel Thé1 , Loïc Denis2 , Éric Thiébaut1 , and Maud Langlois1

                                                    1
                                                        Université de Lyon, Université Lyon1, ENS de Lyon, CNRS, Centre de Recherche Astrophysique de Lyon UMR 5574,
                                                        F-69230, Saint-Genis-Laval, France
                                                        e-mail: surname.name@univ-lyon1.fr
                                                    2
                                                        Université de Lyon, UJM-Saint-Etienne, CNRS, Institut d Optique Graduate School, Laboratoire Hubert Curien
                                                        UMR 5516, F-42023, Saint-Etienne, France
arXiv:2104.09672v2 [astro-ph.IM] 13 May 2021

                                                        e-mail: surname.name@univ-st-etienne.fr
                                                    May 14, 2021

                                                                                                           ABSTRACT

                                                    Context. Direct imaging is a method of choice for probing the close environment of young stars. Even with the coupling
                                                    of adaptive optics and coronagraphy, the direct detection of off-axis sources such as circumstellar disks and exoplanets
                                                    remains challenging due to the required high contrast and small angular resolution. Angular differential imaging (ADI)
                                                    is an observational technique that introduces an angular diversity to help disentangle the signal of off-axis sources from
                                                    the residual signal of the star in a post-processing step.
                                                    Aims. While various detection algorithms have been proposed in the last decade to process ADI sequences and reach
                                                    high contrast for the detection of point-like sources, very few methods are available to reconstruct meaningful images of
                                                    extended features such as circumstellar disks. The purpose of this paper is to describe a new post-processing algorithm
                                                    dedicated to the reconstruction of the spatial distribution of light (total intensity) received from off-axis sources, in
                                                    particular from circumstellar disks.
                                                    Methods. Built on the recent PACO algorithm dedicated to the detection of point-like sources, the proposed method is
                                                    based on the local learning of patch covariances capturing the spatial fluctuations of the stellar leakages. From this
                                                    statistical modeling, we develop a regularized image reconstruction algorithm (REXPACO) following an inverse problems
                                                    approach based on a forward image formation model of the off-axis sources in the ADI sequences.
                                                    Results. Injections of fake circumstellar disks in ADI sequences from the VLT/SPHERE-IRDIS instrument show that
                                                    both the morphology and the photometry of the disks are better preserved by REXPACO compared to standard post-
                                                    processing methods such as cADI. In particular, the modeling of the spatial covariances proves useful in reducing
                                                    typical ADI artifacts and in better disentangling the signal of these sources from the residual stellar contamination.
                                                    The application to stars hosting circumstellar disks with various morphologies confirms the ability of REXPACO to produce
                                                    images of the light distribution with reduced artifacts. Finally, we show how REXPACO can be combined with PACO to
                                                    disentangle the signal of circumstellar disks from the signal of candidate point-like sources.
                                                    Conclusions. REXPACO is a novel post-processing algorithm for reconstructing images of the circumstellar environment
                                                    from high contrast ADI sequences. It produces numerically deblurred images and exploits the spatial covariances of
                                                    the stellar leakages and of the noise to efficiently eliminate this nuisance term. The processing is fully unsupervised, all
                                                    tuning parameters being directly estimated from the data themselves.
                                                    Key words. Techniques: image processing - Techniques: high angular resolution - Methods: statistical - Methods: data
                                                    analysis.

                                               1. Introduction                                                     2011; Kluska et al. 2020) and gaps (van Boekel et al. 2017)
                                                                                                                   that could be signposts for the presence of exoplanets. In
                                               Circumstellar disks are at the heart of the planet forma-           this context, protoplanetary disks allow unique studies of
                                               tion processes. Direct imaging in the near infrared is a            the exoplanet-disk interactions (Keppler et al. 2018; Haf-
                                               unique method for addressing this question because plan-            fert et al. 2019; Mesa et al. 2019). In order to understand
                                               ets can be detected together with the disk environment.             the physical processes governing these disks, it is essential
                                               Despite the very promising results from direct imaging, a           to reconstruct their surface-brightness or scattering phase
                                               small fraction of the expected circumstellar disks have been        function but also to disentangle, in a post-processing step,
                                               unveiled and characterized using high contrast direct imag-         the disks and the exoplanets.
                                               ing in the near infrared (Esposito et al. 2020; Garufi et al.
                                               2020; Langlois et al. 2020). Recent protoplanetary and de-          From an observational point of view, the direct detection
                                               bris disk studies performed in intensity or in polarimetry          of circumstellar disks requires: (i) high angular resolution
                                               have focused on morphological disk characteristics such as          to resolve the close environment of stars (the typical size
                                               the presence of spirals (Benisty et al. 2015; Ren et al. 2018;      of disks is smaller than 1 arcsecond) and (ii) high con-
                                               Muro-Arena et al. 2020), asymmetries, warps (Dawson et al.          trast to detect the faint light scattered by disks (in the
                                                                                                                                                       Article number, page 1 of 25
A&A proofs: manuscript no. REXPACO

Fig. 1. Schematic representation of the main steps followed by the state-of-the-art post-processing methods for ADI sequences
with circumstellar disks. (a) Principle of the classical approach based on the estimation and subtraction of an on-axis PSF. (b)
Four main categories of algorithms designed to reduce reconstruction artifacts seen in (a): using a physics-based model of the disk,
mitigating artifacts directly from the ADI sequence of interest or using an additional sequence of reference, or formalizing the
reconstruction task as an inverse problem. REXPACO falls into the last category: methods based on an inverse problems approach.

infrared, an effective contrast better than 105 relative to           disks. The common principle is to estimate a reference
the host star is typically required). Among the available             image3 of the nuisance component (see Fig. 1(a) for a
ground-based observing facilities, the Spectro-Polarimetry            schematic illustration). This reference image is subtracted
High-contrast Exoplanet Research instrument (SPHERE;                  to each image of the sequence. The resulting residual images
Beuzit et al. 2019) operating at the Very Large Telescope             are aligned along a common direction to spatially superim-
(VLT) offers such capabilities. It produces angular differen-         pose the signals from the off-axis sources, and are added
tial imaging (ADI; Marois et al. 2006) sequences using the            to each others. More sophisticated methods follow an in-
pupil-tracking mode of the telescope as a means to enhance            verse problems framework. In particular, the recent PACO
the achievable contrast. The presence of strong residual stel-        algorithm (Flasseur et al. 2018a,b,c) jointly learns the aver-
lar leakages from the coronagraph, in the form of so-called           age nuisance component and its fluctuations by estimating
speckles, is the current main limitation of the reachable             spatial covariances. Since the nuisance component is highly
contrast. A post-processing step is required to combine the           nonstationary, the underlying statistical model is estimated
images of the ADI sequence and cancel out most of the                 at a local scale of small patches.
nuisance component1 while preserving the signal from the                  Most of these standard ADI post-processing algorithms
off-axis sources2 .                                                   are not suited to detect and reconstruct the light distri-
                                                                      bution of circumstellar disks. Their major drawback is the
There is a large variety of methods dedicated to the post-            presence, in the output images, of strong artifacts that take
processing of ADI sequences (see for example Pueyo (2018)             the form of partial replicas, the removal of extended smooth
for a review). Even though these methods are primarily                components, smearing, and nonuniform attenuations due
designed for the detection of unresolved point-like sources,          to the so-called self-subtraction phenomenon. As a direct
they are also widely used to process observations of disks            consequence, both the morphology and the photometry of
(Milli et al. 2012) and share a common framework with                 the disks are strongly corrupted. Very few specific solutions
most of the existing post-processing methods designed for             have been developed to tackle these limitations in the ADI
                                                                      processing of circumstellar disks. The existing approaches
1
  We refer to the term “nuisance component” to designate the          can be classified into four categories (see Fig. 1(b) for a
contribution of light sources other than the object of interest, as   schematic diagram of their principles).
well as different sources of noise.
2                                                                     3
  Since the signals from circumstellar disks and point-like              Since the nuisance component is mainly due to light coming
sources come from directions other than the optical axis, we          from the optical axis, we refer to the term “on-axis point-spread
use the term “off-axis sources” to refer indifferently to these two   function” (in short, “on-axis PSF”) to designate the estimated
types of objects.                                                     reference image of the nuisance component.

Article number, page 2 of 25
Olivier Flasseur et al.: REXPACO – Reconstruction of EXtended features by PAtch COvariances

Fig. 2. Typical ADI sequence from the VLT/SPHERE-IRDIS instrument: (a) Examples of temporal frames; (b) two spatio-
temporal slices cut along the solid and dashed lines of (a) emphasizing the spatial variations of the structure of the signal; (c)
estimated covariance matrices for four regions of interest (ROIs) at different angular separations s with the host star. Data set:
HR 4796A, see Sect. 3.3 for observing conditions.

    A first family of methods introduce a forward-backward        Based on the previous analysis of the astrophysical needs
loop embedding a physics-based model of the disk (Milli           and of the state-of-the-art post-processing methods for cir-
et al. 2012; Esposito et al. 2013; Mazoyer et al. 2020). In a     cumstellar disk detection and reconstruction, the desirable
nutshell, the parameters constraining the disk morphology         properties of post-processing algorithms may be listed as
and flux are optimized by minimizing the residuals pro-           follows: (i) detection of disks at high contrasts, (ii) recon-
duced by the selected post-processing algorithm from the          struction of a physically plausible image of the flux distribu-
ADI sequence free from the current estimation of the disk         tion with limited processing artifacts, and (iii) ability to dis-
contribution. This method is however limited as it requires       entangle the spatially extended contribution of disks from
a physics-based model of the disk that is sufficiently simple     the spatially localized contribution of point-like sources.
to be numerically optimized and yet sufficiently flexible to
accurately model the actual disk.
    A second category of methods attempt to avoid self-               In this paper, we attempt to address these three points
subtraction by reducing the disk contribution in the estima-      by proposing a new post-processing method to extract the
tion of the reference on-axis PSF. For instance, Pairet et al.    light distribution of extended features like circumstellar
(2018) have proposed a modified version of the PCA-based          disks from ADI sequences. To get rid of the nuisance term,
post-processing algorithm that iteratively removes the esti-      we formalize the reconstruction as an inverse problem where
mated disk component to improve the estimation of the on-         all the contributions appear explicitly and where the mea-
axis PSF. To limit self-subtraction, Ren et al. (2020) have       sured instrumental PSF is taken into account. As a result,
proposed a data-imputation strategy: The areas of the field       the estimation of the object of interest is nearly free from
of view impacted by the disk are considered as missing-data       contamination by the stellar leakages and from blurring
based on prior knowledge of the disk location and replaced        by the off-axis PSF. The method proposed in this paper,
by regions free from any disk contribution.                       named REXPACO (for Reconstruction of EXtended features by
                                                                  PAtch COvariances), introduces dedicated constraints for
    So-called reference differential imaging (RDI; Gerard &       each component to help disentangle the object of interest
Marois 2016; Wahhaj et al. 2021) is a third class of ap-          from the nuisance term. For the object of interest, these con-
proaches that require an additional ADI sequence of a ref-        straints take the form of specific regularizations favoring the
erence star that hosts no known off-axis sources to esti-         smoothness of extended structures (e.g., disk-like compo-
mate the stellar contamination. The quality of the post-          nent) or the sparsity of point-like sources. For the nuisance
processing highly depends both on the selection of the refer-     term, we consider the same statistical model as with PACO’s
ence star, and on the similarity of the observing conditions      algorithm (Flasseur et al. 2018a,b,c), initially designed for
(Ruane et al. 2019). Ren et al. (2018) also use a reference       exoplanet detection, but which has also proven very effec-
ADI sequence to build a nonnegative decomposition (via            tive for the recovery of extended patterns in holographic
nonnegative matrix factorization) of the nuisance compo-          microscopy (Flasseur et al. 2019). The power of PACO sta-
nent.                                                             tistical model resides in its ability to capture the local be-
    Finally, a fourth class of methods is emerging and states     havior of the stellar leakages and the noise through the
the reconstruction of the object of interest as an inverse        spatial covariances. Another advantage is that this model
problem. It consists in jointly estimating the nuisance term,     is directly learned from the ADI sequence and requires no
the disk and, possibly, point-like sources given the data and     other specific calibration data. Similarly, REXPACO is an un-
low-complexity priors adapted to each kind of contribution        supervised algorithm: All tuning parameters are automati-
(Pairet et al. 2019, 2021). The method we propose in this         cally estimated from the data. The framework implemented
paper belongs to this family.                                     by REXPACO is rather flexible. As an example, REXPACO can
                                                                                                     Article number, page 3 of 25
A&A proofs: manuscript no. REXPACO

Table 1. Summary of the main notations.                         and described by a vector x ∈ RM + , where M is the number
                                                                of pixels of the reconstructed field of view and where R+
Not. Range        Definition                                    is the set of nonnegative real numbers to account that the
               . Constants and related indexes                  intensity is necessarily nonnegative everywhere. The ADI
                                                                data sequence r results from the contribution of two com-
 K        N       number of pixels in a patch                   ponents as expressed by:
 M        N       number of pixels in a reconstructed image
 N        N       number of pixels in a temporal frame          r = Ax + f ,                                              (1)
 T        N       number of temporal frames                                                           M        NT
                                                                where A is the linear operator (R → R ) modeling
 n      J1, N K   pixel index                                   the effects of the off-axis instrumental point spread func-
 t      J1, T K   temporal index                                tion (off-axis PSF), that is to say the non-coronagraphic
                      . Data quantities                         image of the star, and f ∈ RN T is a nuisance component
          NT                                                    accounting for the residual stellar light in the presence of
  r     R         ADI sequence (observed data)                  aberrations (that is, the speckles) and for the contribution
  f     RN T      nuisance component                            of noise. Due to the temporal variation of aberrations and
  x     RM+       light distribution of off-axis sources        to the noise, the nuisance component f is the result of some
                         . Operators                            random process. The quality of the statistical modeling of
        N T ×M
                                                                this process is crucial for the inversion of Eq. (1), and it
  A R             off-axis PSF model: A = V H Γ Q               represents a key element of REXPACO. Figure 2(a) displays
  Q RM T ×M       field rotation                                an example of an ADI sequence from the star HR 4796A ob-
  Γ RM T ×M T     coronagraph attenuation                       served by VLT/SPHERE-IRDIS. HR 4796A is surrounded
  H RM T ×M T     convolution by off-axis PSF                   by a bright inclined debris disk (see Sect. 3.3 for a more
  V RN T ×M T     field of view cropping                        complete description of its astrophysical properties). We
  P  RK×N         patch extractor                               selected this data set to illustrate the rationale of the dif-
                                                                ferent ingredients of REXPACO. Part (a) of Fig. 2 shows that
                   . Estimated quantities
                                                                the bright disk of HR 4796A can only be marginally de-
           N                                                    tected by a close visual inspection of a given frame of the
 m
 c      R         temporal mean of f
 C
 e     RK×K       empirical spatial covariance of f             sequence. This is due to the very large contrast between
 C
 b     RK×K       shrunk spatial covariance of f                the star and its environment: The contribution of the com-
                                                                ponent of interest A x is typically fainter than the nui-
 Ω
 b      RN T      statistics of the nuisance component f        sance component f by one to four orders of magnitude.
 x
 b      RM
         +        reconstructed distribution of light           This high contrast sets the gain that must be achieved by
                                                                the post-processing method in order to recover the compo-
                                                                nent of interest x from the data. Part (b) of Fig. 2 shows
be coupled with PACO to better estimate off-axis point-like     two spatio-temporal slices extracted at two different loca-
sources.                                                        tions (along the solid line: far from the host star, along
    The paper is organized as follows. In Sect. 2, we formal-   the dashed line: closer to the host star) and illustrates the
ize REXPACO algorithm. In Sect. 3, we illustrate the results    nonstationarity of the nuisance component. Near the star
obtained by REXPACO, both with semisynthetic data and           (i.e., at angular separations smaller than about 1.5 arcsec-
with real observations of known circumstellar disks from        ond), the nuisance component is dominated by speckles. At
the InfraRed Dual Imaging Spectrograph (IRDIS; Dohlen           larger angular separations, the nuisance component mainly
et al. 2008a,b) of the VLT/SPHERE instrument. Section 4         results from a combination of photon noise, thermal back-
presents results from the combination of REXPACO with the       ground flux and detector readout. Part (c) of Fig. 2 shows
PACO algorithm to unmix the spatially extended contribu-        the empirical spatial covariances of the data computed by
tion of disk from the point-like contribution of exoplanets.    temporal averaging through the ADI sequence in different
Section 5 summarizes our contributions.                         regions of interest (ROIs). This figure not only reveals that
                                                                the covariances of the fluctuations are spatially nonstation-
                                                                ary but also that the covariances are non-negligible in a
2. REXPACO: Reconstruction of EXtended features                 small neighborhood. In a nutshell, the nuisance component
   by PAtch COvariances modeling                                f has strong spatial correlations and its distribution is spa-
                                                                tially nonstationary and fluctuates over time, making the
This section describes REXPACO, the method we propose to        extraction of the object x all the more difficult. We intro-
recover the spatial distribution x of the light flux of off-    duce in Sect. 2.3 a model of the probability density function
axis sources from an ADI sequence r ∈ RN T formed by            p(f ) of the nuisance component f that is suitable to de-
T images of N pixels each. Table 1 summarizes the main          scribe the statistical behavior of these fluctuations.
notations used throughout the paper.                                We formalize the reconstruction of the object x from the
                                                                data r as an inverse problem: We seek the light distribution
2.1. Inverse problems formulation                               x
                                                                b that would produce observations, given the instrumental
                                                                model A, statistically close to the actual measurements.
REXPACO follows an inverse problems formulation: It com-        Image reconstruction methods based on the resolution of
putes an estimate xb from the sequence of observations r by     inverse problems are generally expressed in the form of an
fitting a model of the data under given constraints. Since      optimization problem. In a Bayesian framework, the so-
all observations r are collected during a single night, the     lution of this optimization problem is interpreted as the
distribution of interest x can be considered time-invariant     maximum a posteriori estimator (see for example Thiébaut
Article number, page 4 of 25
Olivier Flasseur et al.: REXPACO – Reconstruction of EXtended features by PAtch COvariances

Fig. 3. Schematic illustration of the forward model A x = V H Γ Q x of REXPACO. First, the object x is rotated by the parallactic
angles by operator Q. Then, each resulting temporal frame is attenuated by the (time independent) transmission of the coronagraph
with Γ, and convolved by the off-axis PSF by application of H. Finally, the field of view seen by the sensor is extracted by V to
obtain the object contribution A x. The recorded ADI sequence is modeled by the sum of A x and of the nuisance component f ,
as described by Eq. (1).

(2006)). Introducing an objective function C to account for       planets, brown dwarfs, background stars). During the ac-
the instrumental effects, the statistical model of the mea-       quisition of an ADI sequence, the field of view as seen by
surements and regularization terms that promote more sat-         the detector rotates around the optical axis. After suit-
isfying solutions (e.g., sparser and smoother solutions) and      able instrument calibration, this apparent rotation is com-
better reject noise, the problem writes:                          pletely defined by the center of rotation and the parallactic
b = arg min C (r, x, A, Ω, µ) ,
x                                                          (2)    angles recorded with the data4 . The contribution of x in
      x≥0                                                         the t-th data frame r t is given by the linear model At x
                                                                  where At : RM → RN implements the off-axis point spread
where:
                                                                  function for the considered frame: it models the appar-
C (r, x, A, Ω, µ) = D(r, A x, Ω) + R(x, µ) ,               (3)    ent rotation of the field of view, the nonuniform attenu-
and arg minx≥0 C (r, x, A, Ω, µ) denotes the set of non-          ation by the coronagraph and the instrumental blurring.
negative values of x minimizing C (r, x, A, Ω, µ). The first      In other words, [A x]t ≡ At x for all frames t ∈ J1, T K
term, D(r, A x, Ω), is the data-fidelity term that penalizes      where A : RM → RN ×T is the linear operator introduced
the discrepancy between the recorded data r and the mod-          in Sect. 2.1 and such that A x models the contribution of
eled contribution A x of the object of interest x in the data.    x to all data frames.
The data fidelity term also depends on parameters Ω that              Considering the size of the image reconstruction prob-
define the statistical distribution of the nuisance component     lem, it is essential that our model A of the off-axis PSFs not
f (e.g., mean, variances or covariances, see Sect. 2.3). The      only be an accurate approximation of the actual instrumen-
second term, R(x, µ), is a regularization term enforcing          tal effects but also that A and its adjoint be fast to apply
some prior knowledge about the object properties to favor         in an iterative reconstruction algorithm (see Appendix A).
physically more plausible reconstructions and prevent noise       To obtain a numerically efficient model, we decompose each
amplification. The hyper-parameters µ tune the behavior           operator At into different factors: Qt to model the tempo-
of the regularization, in particular, its relative importance     rally and spatially variant effects due to the rotation of the
compared to the data fidelity term. In the following, we          field of view, Γ to account for the attenuation close to the
detail the ingredients of this general framework. The for-        optical axis due to the coronagraph, and H to implement
ward model A of the off-axis PSF is developed in Sect. 2.2.       the instrumental blurring (considered in first approxima-
The statistical model of the nuisance component is stated         tion as spatially and temporally invariant). Operators Qt
in Sect. 2.3 while the estimation of the parameters Ω of this     can be implemented by interpolation operations while op-
model is described in Sect. 2.4. Different possible regulariza-   erator H performs bi-dimensional convolutions that can be
tions are considered in Sect. 2.5. Finally, a strategy for the    computed efficiently using fast Fourier transforms (FFTs).
automatic tuning of the regularization hyper-parameters is        Modeling the instrumental blurring as being shift-invariant
presented in 3.2.                                                 4
                                                                     In practice for VLT/SPHERE-IRDIS, not all parallactic an-
                                                                  gles are measured and some of them are interpolated from oth-
2.2. Forward model: off-axis point spread function                ers. Parallactic angles are also corrected for the estimated over-
                                                                  heads of storage in a calibration and preprocessing step (Pavlov
The parameters x ∈ RM + represent the distribution of light       et al. 2008). Estimation errors of the rotation center and paral-
due to the off-axis sources (e.g., circumstellar disks, exo-      lactic angles are not included in our statistical modeling.

                                                                                                      Article number, page 5 of 25
A&A proofs: manuscript no. REXPACO

(a convolution) is accurate in most of the field of view ex-           2.3. Statistical model of the nuisance component
cept close to the optical axis. The coronagraph induces a
reduction of the light transmission at short angular separa-           The nuisance component f in ADI sequences gener-
tions (i.e., under and near the coronagraphic mask) and a              ally presents strong and nonstationary spatial correlations
deformation of the PSF. We model the variable light trans-             (speckles) that fluctuate over time, especially near the host
mission by a diagonal operator Γ = diag(γ) with γ a 2-D                star where the stellar leakages dominate, see Fig. 2(a). Also,
transmission mask. The mask γ has a smooth profile with                the presence of several bad pixels (hot, dead or randomly
all entries in the range [0, 1]; from 0 at the center of the coro-     fluctuating) is a common issue in high contrast imaging
nagraph (i.e., no flux from off-axis sources transmitted) to 1         where the images are recorded by infrared detectors. De-
farther away, and being equal to 0.5 at the inner working an-          spite a prereduction stage identifying and correcting for
gle (IWA) of the coronagraph. We neglect the deformation               defective pixels, some of them displaying large fluctuations
of the PSF in our model. This approximation has a lim-                 only on a few frames remain after this processing (see for
ited effect since the area affected by the PSF deformations            example Pavlov et al. (2008); Delorme et al. (2017) for
(close to or under the coronagraphic mask) is dominated                the SPHERE instrument). We thus aim to account for the
by the nuisance component. An alternate way to handle                  different causes of fluctuations of the nuisance component
the measurements in the area of the coronagraph would                  through a statistical model.
consist of assuming that they suffer an infinite variance (so              We recover an image of the objects of interest x based
that they be completely discarded in the subsequent estima-            on a co-log-likelihood term D(r, A x, Ω) that captures the
tions). Both approaches lead to comparable reconstructions             statistical distribution of the residuals between r and the
(i.e., differences in the reconstructed flux are several orders        model A x. Like in our previous works on the PACO algo-
of magnitude lower than the disk magnitude) in all exper-              rithm (Flasseur et al. 2018a,b,c, 2020a,b), we model the
imental cases we have considered. This model of the effect             covariances locally at the scale of small patches of size K
of the coronagraph on off-axis point sources as a simple at-           (a few tens of pixels in practice). This approximation is
tenuation varying with the angular position is consistent              justified by the nonstationarity of the spatial correlations
(for angular separations larger than 0.05 arcsecond) with              and can be seen as a trade-off between considering full-size
the calibrations of the coronagraphs considered in Beuzit              spatial covariance matrices (of size N × N , and hence too
et al. (2019) for the SPHERE instrument. For the results               large to be estimated and stored in practice) and completely
presented in Sect. 3, the experimental calibration of this             neglecting the spatial correlations. Our choice amounts to
transmission has been performed in the H-Johnson’s spec-               model the nuisance component f t of the t-th frame as the
tral band (i.e., at ' 1.6 µm) and extrapolated to the other            realization of independent Gaussian processes for each K-
wavelengths by using the experimental measurement and                  pixel patch, without overlap. Given that model, the prob-
dedicated simulations, as described in Beuzit et al. (2019).           ability density function of the nuisance component is given
    To summarize, our forward model of the contribution of             by:
the object of interest x in the t-th frame writes5 :                                     Y          1
                                                                       ∀t,   p(f t ) ∝         det− 2 (Cn )
At x = V H Γ Q t x ,                                            (4)
                                                                                         n∈P
where V is a simple truncation operator to extract the N
                                                                                  h                                     i
                                                                                                      >
                                                                             × exp − 12 (Pn (f t − m)) C−1
                                                                                                        n (Pn (f t − m)) , (5)
pixels corresponding to the actual data field of view from
the (larger) M -pixels output of blurring operator H. Due
to the apparent rotation of the field of view during the ADI           with Pn : RN → RK the operator extracting patch n from
sequence, previously unseen areas fall within the corners of           a given temporal frame for any index n in P defining the
the sensor and, by combining all measurements, a region                partition of the data frame into nonoverlapping K-pixel
larger than N pixels can be reconstructed (up to 57% more              patches, and P is the list of the indices of the pixels at
pixels).                                                               the center of the nonoverlapping patches. The vector m ∈
    The factorization of the forward operator At can                   RN denotes the temporal mean Et (f t ) while Cn ∈ RK×K
straightforwardly be generalized to the case of a time-                denotes the temporal covariance matrix within the patch
varying blur Ht or transmission Γt to account for an evo-              n.
lution of the observing conditions during the ADI sequence                For m and Cn , we use the same estimators that proved
or a possible partial decentering of the coronagraph. Fig-             successful with PACO for exoplanet detection: the temporal
ure 3 illustrates the different steps to convert the object of         mean m is estimated by the sample mean m   c defined by:
interest x into the model of its contribution into the ADI
sequence recorded by the high contrast imaging instrument.                      T
                                                                             1X
In this figure, we adopt the same convention as for A that,            m
                                                                       c=          (r t − At x) ,                                (6)
                                                                             T t=1
when the frame index t of an operator is omitted, the result
of applying the operator for all frames is considered.
                                                                       and the covariance matrices Cn are estimated using a
5
   The forward model could be written differently so that the op-      shrinkage estimator (Ledoit & Wolf 2004; Chen et al. 2010):
erations involved in Qt , Γ and H would be applied in a different
order, subject to slight adaptations. For example, if the field
rotation and translation modeled by operator Qt are applied af-        b n = (1 − ρen ) C
                                                                       C                e n + ρen F
                                                                                                  en ,                           (7)
ter the convolution H, the non-isotropic off-axis PSF has to be
counter-rotated by the parallactic angles (i.e., the blurring is not
                                                                       that combines the high-variance and low-bias sample co-
the same for all t) to counter-balance the effect of the permuta-
tion of Qt and H. The same remark applies for the attenuation          variance estimator C
                                                                                          e n with the low-variance and high-bias
Γ if the mask γ is not circular symmetrical.                           estimator Fn that neglects all covariances. In other words,
                                                                                 e

Article number, page 6 of 25
Olivier Flasseur et al.: REXPACO – Reconstruction of EXtended features by PAtch COvariances

                                                                    It can be noted that most above defined quantities — m        c
                                                                    in Eq. (6), Cb n in Eq. (7), C e n in Eq. (8), u
                                                                                                                   b n,t in Eq. (9),
                                                                    f n in Eq. (11) and ρen in Eq. (12) — depend implicitly
                                                                    W
                                                                    on the sought parameters x although, and for the sake of
                                                                    simplicity, this is not always explicitly indicated. In Al-
                                                                    gorithms 1 and 2 presented in Sect. 2.4, these quantities
                                                                    are labeled with the iteration number as the parameters
                                                                    x change at each iteration. Figure 4 illustrates the extrac-
                                                                    tion of a collection of patches from an ADI sequence for
                                                                    locally learning the statistics of the nuisance component,
                                                                    in particular for the estimation of C  e n . The number K of
                                                                    pixels in each patch, and hence the dimension of the local
                                                                    covariance matrices should be large enough to encompass
                                                                    the core of the off-axis PSF and yet not too large compared
                                                                    to the number T of samples available for the estimation.
                                                                    The optimal patch size is estimated using the same crite-
                                                                    rion (validated empirically) as in PACO and which yields an
                                                                    optimal patch size corresponding roughly to twice the off-
                                                                    axis PSF full width at half maximum (FWHM). This rule
                                                                    typically leads to 50 . K . 100. Figure 2(c) gives exam-
                                                                    ples of estimated patch covariance matrices C  b n for different
                                                                    angular separations. It shows that the structure of the cor-
                                                                    relations strongly depends on the distance to the star and
                                                                    that neglecting these correlations would be a crude approx-
                                                                    imation, especially near the star were the nuisance compo-
                                                                    nent is highly correlated and fluctuating. Figure 2(c) also
                                                                    emphasizes that the covariances are well preserved by the
Fig. 4. Extraction of patch collections for local learning of the
statistics of the nuisance component from ADI sequences. The        regularization (i.e., ρen is small enough for Ce n to represent
contribution of off-axis sources is shown in blue.                  the largest contribution). We have shown in Flasseur et al.
                                                                    (2018a, 2020b) that such a local model of the covariances
                                                                    approximates well the empirical distribution of the nuisance
e n is a diagonal matrix containing the pixel sample vari-
F                                                                   component. In Flasseur et al. (2020b), a small discrepancy
ances that are the diagonal entries of the sample covariance        was observed near the star due to the larger fluctuations6
of the considered patch:                                            of the nuisance component in this area. The impact of a
        T                                                           statistical mis-modeling in this area is further discussed in
en = 1           b>                                                 Sect. 3.3.
       X
C          u
           b n,t un,t ,                                      (8)                                           b = {c
     T t=1                                                              Given the patch-based statistics Ω        m, {C b n}
                                                                                                                            n∈P } lo-
                                                                    cally accounting for the fluctuations of the nuisance com-
with:                                                               ponent, the co-log-likelihood rewrites:
                         
b n,t = Pn r t − m
u                c − At x ,                                  (9)                              T
                                                                              b =1
                                                                                   X X                               
                                                                    D(r, A x, Ω)           un,t k2b −1 + T log det C
                                                                                          kb                       b n , (13)
the residuals in patch n of frame t. With this specific choice                   2               Cn
                                                                                      t=1
                                                                                       n∈P
for F
    e n , Eq. (7) rewrites:
                                                                    where the summation over n is performed on the list P of
bn = W
C    fn       en ,
              C                                             (10)    nonoverlapping K-pixel patches while the residuals ub n,t in
                                                                    the patch n for the frame t are defined in Eq. (9).
where stands for the Hadamard product (i.e., entrywise
multiplication), and W  f n ∈ RK×K is defined for {k, k 0 } ∈
      2
J1, KK by:                                                          2.4. Unbiased estimation of the mean and spatial covariances
h    i         
                 1       if k = k 0 ,                               The estimators mc, in Eq. (6), and Cb n , in Eq. (7), that
  W
  fn         =                                          (11)        appear in Eq. (13) both depend on the object flux x which
       k, k0     1 − ρen if k 6= k 0 .
                                                                    is unknown. To handle this problem, the simplest solution
The shrinkage parameter ρen balances each estimator to              6
                                                                       Large fluctuations in the nuisance component can be due to a
reach a bias-variance trade-off. The extension of the results       slight decentering of the coronagraph or to a sudden degradation
of Chen et al. (2010) to the diagonal covariance matrix F en
                                                                    of the observing conditions. Finer statistical models of the nui-
leads to the following data-driven expression for the shrink-       sance component could be considered to account for these fluctu-
age parameter ρen (see Eq. (12) of Flasseur et al. (2018a)):        ations. In particular, multivariate Gaussian scale mixture mod-
                                                                    els (GSM; Wainwright & Simoncelli 2000) have been shown to
                                  2                               be very effective in the context of exoplanet detection (Flasseur
         e 2 + tr2 Ce n − 2 PK C
                       
      tr C n                 k=1
                                  en
                                     k,k                            et al. 2020a,b). Replacing the multi-variate Gaussian model con-
ρen =                  PK  2  .                        (12)    sider in this paper by a GSM model requires specific develop-
                     2
       (T + 1) tr Cn − k=1 Cn k,k
                   e           e                                    ments that are left for future work.

                                                                                                       Article number, page 7 of 25
A&A proofs: manuscript no. REXPACO

 Algorithm 1: REXPACO reconstruction                                                  Algorithm 2: REXPACO reconstruction
 (Alternating estimation approach).                                                   (Joint estimation approach).
 Input: ADI sequence r.                                                               Input: ADI sequence r.
 Input: Forward operator A.                                                           Input: Forward operator A.
 Input: Regularization parameters µ.                                                  Input: Regularization parameters µ.
 Input: Relative precision η ∈ (0, 1).                                                Output: Unbiased estimate x  b of the light distribution of
 Output: Unbiased estimate x  b of the light distribution of                                    off-axis sources.
           off-axis sources.                                                          I Step 1. Compute shrinkage matrices.
 b [0] ← 0M
 x                                                                                    {W
                                                                                       f n }n∈P
 i←0                                                                                             T
                                                                                        [1]
                                                                                      c = T1
                                                                                                P
 do                                                                                   m            rt                                 / Eq. (14a)
      i←i+1                                                                                       t=1
                                                                                      for n ∈ P do
          I Step 1. Learn statistics of nuisance term.                                                    T                                    >
                                                                                         e [1]
                                                                                         C n ←
                                                                                                      1
                                                                                                          P
                                                                                                          Pn r t − m
                                                                                                                            [1]
                                                                                                                                   rt − m
                                                                                                                                              [1]
                                                                                                                                                    P>n / Eq. (14b)
            [i]  1
                   PT              [i−1]
                                          
          c ← T t=1 r t − At x
          m                                                                                           T
                                                                                                                        c                 c
                                  b                    / Eq. (6)                                     t=1                    PK  2
          for n ∈ P do                                                                    [1]     tr Ce [1]
                                                                                                        n
                                                                                                            2
                                                                                                               +tr2 C e [1]
                                                                                                                        n    −2
                                                                                                                                    k=1
                                                                                                                                           e [1]
                                                                                                                                           C n    k,k
             for t ∈ J1, T K do                                                          ρen    ←                                                     / Eq. (12)
                                                                                                                                         [1] 2
                                                                                                                    [1] 2
                                                                                                                            P   K
                                                                                                                                     
                    [i]
                                    
                                                   [i]
                                                                                                   (T +1) tr C   e n − k=1 C          e n k,k
                   un,t ← Pn r t − m          c − At x     b [i−1]
                                                                                         for (k, k 0 ) ∈ J1, KK 2
                                                                                                               do
              e [i]                   [i]       [i]                                                                            if k = k 0
                           PT
             C  n ← T
                         1
                                     u u >                            / Eq. (8)              [1] 
                                                                                                           ←
                                                                                                               1
                               t=1  n,t n,t                                                  W
                                                                                              fn                                                         / Eq. (11)
                                                                                                      k,k0     1 − ρen              6 k0
                                                                                                                               if k =
                                                    PK  2
                             [i] 2      2      [i]              [i]
               [i]      tr C
                           e n      +tr    Ce  n    −2        C
                                                              e n k,k
                                                         k=1
             ρen ←                              PK  [i] 2  / Eq. (12)
                                          [i] 2
                         (T +1) tr Cn             −        Cn
                                       e               k=1
                                                           e    k,k                   I Step 2. Joint minimization with fixed {W
                                                                                                                               f n }n∈P .
              for (k, k 0 ) ∈ J1, KK 2
                                   do                                                x ← arg min Djoint (r, x) + R(x, µ)
                                                                                      b                                     / Eqs. (2) and (15)
                  [i]             1            if k = k 0                                     x≥0
                   W
                   fn           ←                                 / Eq. (11)
                           k,k0     1 − ρen           6 k0
                                                 if k =
               Cb [i]  f [i] C
                  n ← Wn
                                  e [i]                           / Eq. (10)
                                    n                                             Algorithm 1 for all the reconstructions. Approach (ii) im-
            [i]
          Ω ← {c
          b            [i]    [i]
                      m , {Cn }n∈P }
                            b                                                     plemented by Algorithm 2 converges faster and is especially
                                                                                  beneficial when reconstructing structures that are close to
          I Step 2. Reconstruct an image of the objects.                          circular symmetry such as in Figs. 7 and 8. In order to be
          b [i] ← arg min D(r, A x, Ω
          x                         b [i] ) + R(x, µ) / Eq. (2)                   equivalent to Algorithm 1, Algorithm 2 requires replacing
                     x≥0                                                          the data-fidelity term D defined in Eq. (13) by (the proof
 while          b [i] − x
                x       b [i−1] > η x
                                    b [i]                                         is detailed in Appendix C):

                                                                                                          T X
                                                                                  Djoint (r, x) =             log det C
                                                                                                                      bn
would be to neglect the object contribution as it is typically                                            2
                                                                                                            n∈P
small compared to the nuisance component f . Under this
                                                                                                                                    T
assumption, Eqs. (6) and (8) rewrite:                                                                   1 X  b −1  f                                   
                                                                                                                                                 b>
                                                                                                                                    X
                                                                                                      +    tr Cn Wn                        u
                                                                                                                                           b n,t u n,t        , (15)
               T                                                                                        2
    [1]     1X                                                                                             n∈P                       t=1
m
c         =       rt ,                                                 (14a)
            T t=1
                                                                                  with ub n,t the residuals in the patch n of frame t defined
               T                                   >                            in Eq. (9). This modified data-fidelity corresponds to the
e [1]       1 X            [1]
                                  
                                               [1]
C n       =       Pn r t − m
                           c        Pn r t − m
                                             c         ,               (14b)      co-log-likelihood under a Gaussian assumption except that
            T t=1                                                                                      f n are introduced7 so that the shrunk
                                                                                  shrinkage matrices W
and correspond respectively to the sample mean and the                            covariances Cn defined in Eq. (10) are minimizers of Djoint .
                                                                                                b
sample covariances learned locally from the image patches.                        The mean m   c and shrunk covariances C   b n can thus be re-
    We show in Appendix B that neglecting the light flux                          placed by their closed-form expressions given in Eqs. (6)
of off-axis objects when characterizing the nuisance com-                         and (10) that depend on x and the minimization be per-
ponent leads to biased estimations, a problem common to                           formed solely on x. Beyond the faster convergence of Algo-
many ADI post-processing methods and known as the self-                           rithm 2, the two algorithms differ slightly due to the shrink-
subtraction problem (Milli et al. 2012; Pairet et al. 2019).                      age factors not being refined throughout the reconstruction
To recover a better estimation of the object flux, it is then                     in Algorithm 2. We found the impact of that discrepancy to
necessary to develop an unbiased estimation of the statis-                        be barely noticeable in the reconstruction results and hence
tics Ω. To do so, two strategies are possible: (i) an alterna-                    would recommend the general usage of Algorithm 2 for its
tion between the reconstruction of x b and the update of the                      improved speed.
statistics Ωb of the nuisance component f from the residuals
r−A x  b , until convergence, as done in Algorithm 1, or (ii) a                   7
                                                                                    Shrinkage matrices {W f n }n∈P are defined in Eq. (11) and we
joint estimation approach implemented by Algorithm 2. In                          make the simplifying assumption that they are independent from
this paper, we used η = 10−8 in the stopping condition of                         x so that they are estimated once for all given the data r.

Article number, page 8 of 25
Olivier Flasseur et al.: REXPACO – Reconstruction of EXtended features by PAtch COvariances

2.5. Regularization term                                           simulated disks with very different morphologies: (i) a spa-
                                                                   tially centered elliptical disk with an eccentricity about 0.95
After discussing the data-fidelity term D(r, A x, Ω) in the        and with sharp edges; (ii) a spiral disk with two arms whose
previous sections, we focus here on the regularization term        center is shifted by ten pixels from the star center in one
R(x, µ) and on the resolution of the inverse problem (2).          of two spatial directions. Contrary to the elliptical disk,
    Regularization terms are introduced to enforce prior           the spiral disk has smooth edges. These simulated disks re-
knowledge about the unknown object x and to improve                semble the actual circumstellar disks presented in Sect. 3.3
the conditioning of the inversion. Two classical regulariza-       so that our simulations can help to assess the quality of
tions are considered in the proposed method. The first one         the reconstructions of real circumstellar disks. Each sim-
promotes smooth objects with sharp edges by favoring the           ulated disk is injected into a data set of HIP 80019 with
sparsity of their spatial gradients. This is the goal of the       no known off-axis source, at three different contrast levels:
so-called `2 − `1 edge-preserving regularization (Charbon-         αgt ∈ {1 × 10−6 , 5 × 10−6 , 1 × 10−5 }. Reconstructions of the
nier et al. 1997; Mugnier et al. 2004; Thiébaut 2006) widely       elliptical disk are performed with Algorithm 1 while the spi-
used in image processing:                                          ral disk that requires many more iterations of Algorithm 1
                    N q                                            to converge has been processed with Algorithm 2. A total
                                                                   of 60 reconstructions have been performed: for each disk
                    X
Rsmooth (x, ) =            ||∆n x||22 + 2 ,              (16)
                    n=1
                                                                   and each level αgt , the simulated disk has been injected in
                                                                   one of ten different orientations with respect to the back-
where ∆n approximates the spatial gradient at pixel n by           ground in order to evaluate the mean and variance of the
finite differences and  > 0 sets the√ edge-preservation be-       reconstructions.
havior of the regularization ( = 10−7 in all the experi-              Simulations of the elliptical disk are reported in Figs. 5
ments of this paper): Local differences ∆n x that are below        and 6. We compare REXPACO reconstructions to the cADI
 in magnitude are smoothed similarly as a quadratic regu-         image combination technique. Since cADI does not perform
larization would, while larger differences are preserved (e.g.,    a deconvolution, the comparison is performed at the resolu-
border of a disk). This classical regularization is sufficiently   tion of the instrument, that is to say the images produced
flexible to remain adapted to different disk morphologies, as      by cADI are shown next to the reblurred reconstructions
illustrated by Figs. 5 to 8 where elliptical disks with sharp      Hx b of REXPACO in Figs. 5(a) and 6(a). Large errors can
edges and spiral disks with smooth edges are considered.           be noted in cADI images, in particular at small angular
Since we aim to reconstruct a well-contrasted object on an         separations. The computation of the average reconstruc-
dark background, we also consider a second regularization          tion obtained for the ten different orientations of the disk
term promoting the sparsity of the object x by penalizing          with respect to the background indicates the presence of
its separable `1 -norm:                                            systematic errors with cADI: an under-estimation of the
                                                                   light-flux in the region of the disk that is closest to the
            N               N
            X               X                                      star (an issue due to limited angular diversity). In contrast,
R`1 (x) =         |xn | =         xn .                     (17)    REXPACO reconstructions are close to the ground truth, even
                      x≥0
            n=1             n=1                                    for the lower level of contrast αgt = 10−6 . The deblurred
This sparsity-promoting regularization term is also well           reconstructions shown in Figs. 5(b) and 6(b) are in good
adapted to restore point-like sources. We discuss in Sect. 4       agreement with the ground truth. Unsurprisingly, the re-
how an alternating estimation strategy can be applied to           construction quality is higher when the disk is brighter:
further improve the separation of overlapping point-like           more spurious fluctuations are visible in the deblurred re-
sources and extended structures. Due to the positivity con-        construction at αgt = 10−6 than at αgt = 5 × 10−6 or
straint in Eq. (2) that guarantees that reconstructed light        αgt = 10−5 . Although some discrepancies can be noted in
fluxes are nonnegative (i.e., x ≥ 0), the `1 -norm in Eq. (17)     the deblurred line profiles, the resolution is improved by the
                P                                                  deconvolution process.
boils down to     n xn which is a differentiable expression
and smooth optimization techniques are applicable (see Ap-             Simulations of the spiral disk are reported in Figs. 7
pendix A). The two terms Rsmooth and R`1 are combined              and 8. The comparison with cADI shows a clear improve-
by:                                                                ment of the reconstructions with REXPACO: cADI fails to
                                                                   recover most of the disk. At the lowest contrast αgt = 10−6
R(x, µ) = µsmooth Rsmooth (x, ) + µ`1 R`1 (x) ,           (18)    and owing to the near circular-symmetry of the structures
                                                                   close to the star, the spiral disk is challenging to disentan-
where µ = {µsmooth , µ`1 } balances their relative weight          gle from the nuisance component. REXPACO is however able
with respect to the data-fidelity term D. For a given set          to recover some parts of the spiral arms although their flux
of parameters µ, we solve the constrained minimization             is underestimated. At the highest contrast αgt = 10−5 , line
problem (2) by running the VMLM-B algorithm (Thiébaut              profiles of Fig. 8 indicate that REXPACO reconstructions are
2002). Some technical elements about this minimization             strongly improved compared to cADI and that some errors
technique can be found in Appendix A.                              remain in particular at the lowest angular separations.
                                                                       The reconstructions are quantitatively evaluated by re-
3. Results                                                         porting the normalized root mean square error (N-RMSE,
                                                                   the lower the better) computed over different regions (the
3.1. Evaluation on simulated disks                                 disk, the background, and the whole image):
We first evaluate quantitatively, on numerical simulations,
the ability of REXPACO to estimate a faithful image of the                              ||xgt − x b ||2
light flux distribution of off-axis sources. We consider two       N-RMSE(xgt , x
                                                                                b) =                    .                      (19)
                                                                                           ||xgt ||2
                                                                                                        Article number, page 9 of 25
A&A proofs: manuscript no. REXPACO

Fig. 5. Reconstructions of a simulated elliptical disk: (a) comparisons between cADI image combinations and reblurred REXPACO
reconstructions; (b) resolution improvement achieved by deconvolution with REXPACO. The average reconstructions are computed
over ten injections of the simulated disk with various orientations with respect to the background.

Table 2 reports N-RMSE values and indicates a clear im-         and error until the reconstruction is qualitatively accept-
provement over cADI: the errors are typically divided by a      able, but this approach is not completely satisfactory since
factor of three to four (the error reduction is more modest     it is time-consuming and user-dependent. Several methods
for the more challenging reconstructions of the spiral disk     have been developed in the literature to automatically tune
at the lowest fluxes).                                          regularization parameters by minimizing a quantitative cri-
                                                                terion (Craven & Wahba 1978; Wahba et al. 1985; Stein
                                                                1981). In this work, we consider the Stein’s unbiased risk
3.2. Automatic setting of the regularization parameters         estimator approach (SURE; Stein 1981), which minimizes
                                                                an estimate of the mean-square error (MSE) in the data
The setting of the parameters µ is of uttermost importance      space:
since the results of the regularized reconstructions depend
both on the characteristics of the data set (e.g., observing                        b µ (r)) ||2C−1 ,
                                                                MSE(µ) = ||A (xgt − x                                   (20)
conditions, amplitude of the field rotation) and on the prop-
erties of the objects to reconstruct (e.g., morphology, con-    where xgt is the (unknown) ground truth object, and x
                                                                                                                    b µ (r)
trast). The parameters µ can be tuned manually by trial         denotes the reconstructed object x
                                                                                                 b obtained from the data

Article number, page 10 of 25
Olivier Flasseur et al.: REXPACO – Reconstruction of EXtended features by PAtch COvariances

                 max
reconstruction
    single

                 min

                   ( )
reconstruction
                 max
   average

                 min

                   ( )
reconstruction
    single

                   ( )
reconstruction
   average

                   ( )

 Fig. 6. Line profiles extracted from the reconstructions at αgt = 10−6 shown in Fig.5.

 r when using regularization parameters µ. For any given set            where Jµ (r) = ∂ x   b µ (r)/∂r is the Jacobian matrix of the
 of parameters µ, the criterion SURE(µ) provides an unbi-               mapping r → x     b µ (r) with respect to the components of
 ased estimation of MSE(µ) that does not require knowledge              the data r: [Jµ (r)]a,b = ∂[b xµ (r)]a /∂[r]b . Since there is no
 of the true object xgt (Stein 1981):                                   closed-form expression for x  b µ (r) (it is obtained by an itera-
                                                                        tive process), it is complex to derive tr (A Jµ (r)). An alter-
                                                                        native proposed by Girard (1989) considers a Monte Carlo
 SURE(µ) = kr − m    b µ (r)k2C−1
                c − Ax                                                  perturbation method to numerically approximate this term
                        + 2 tr (A Jµ (r)) − N T ,                (21)

                                                                                                          Article number, page 11 of 25
A&A proofs: manuscript no. REXPACO

Fig. 7. Reconstructions of a simulated spiral disk: (a) comparisons between cADI image combinations and reblurred REXPACO
reconstructions; (b) resolution improvement achieved by deconvolution with REXPACO. The average reconstructions are computed
over ten injections of the simulated disk with various orientations with respect to the background.

by finite-differences:                                            MAD(r) = median(|r − median(r)|) is a robust estimator
                                                                  of the standard-deviation.
tr (A Jµ (r)) ≈ ξ −1 b> A [b
                           xµ (r + ξb) − x
                                         b µ (r)] ,       (22)        As in our previous derivation of the statistical model
                                                                  of the nuisance component, we approximate the full covari-
                                                                  ance matrix C that appears in Eq. (21) as block-diagonal,
where b ∈ RN T is an independent and identically dis-             with blocks corresponding to the partition of the image into
tributed pseudo-random vector of unit variance and ξ is the       nonoverlapping patches. The patch covariance model and
amplitude of the perturbation. While the precise setting of ξ     the perturbation approximation lead to the following risk
is not a crucial point of the method, it must be chosen both      estimator:
large enough to prevent errors due to numerical underflows
in the computation of the difference x   b µ (r + ξb) − x
                                                        b µ (r)              T
                                                                            XX
and small enough so that the approximation in Eq. (22) re-        SURE(µ) ≈    kPn (r t − m    b µ (r))k2b −1
                                                                                          c − Ax
mains valid (the effects of the perturbation being nonlinear                                                    Cn
                                                                              n∈P t=1
in the model). In practice, when we evaluate Eq. (22), we                           >
set ξ = 0.1×MAD(r), where the median absolute deviation                                  xµ (r + ξb) − x
                                                                            + (2/ξ) b A [b             b µ (r)] − N T .   (23)

Article number, page 12 of 25
Olivier Flasseur et al.: REXPACO – Reconstruction of EXtended features by PAtch COvariances

Fig. 8. Line profiles extracted from the reconstructions at αgt = 10−5 shown in Fig.7.

The optimal value µb SURE of the regularization parameters       true MSE: The two criteria reach a global minimum for sim-
is obtained by minimizing the quantity (23) with respect to      ilar hyper-parameter values. The SURE criterion is intrin-
µ.                                                               sically non-convex and additional non-convexities arise due
                                                                 to the sensitivity of the evaluation of the term tr (A Jµ (r))
   We have validated on a numerical example the auto-
                                                                 by finite differences. Rather than proceeding by local mini-
matic tuning of the regularization parameters µ: The sim-
                                                                 mization of SURE(µ), it is therefore safer to systematically
ulated elliptical disk is injected at the level of contrast
                                                                 evaluate the criterion on a 2-D grid in order to identify the
αgt = 10−5 on the data set of HIP 80019. Our SURE risk es-
                                                                 global minimum.
timator given in Eq. (23) is evaluated for different values of
the parameters µ. Figure 9 compares this criterion with the
                                                                                                 Article number, page 13 of 25
A&A proofs: manuscript no. REXPACO

Table 2. Quantitative evaluation of the reconstructions on simulated elliptical and spiral disks: N-RMSE, defined in Eq. (19), for
the reconstructions shown in Figs. 5 to 8 . The N-RMSE is also given on the restrictions D(xgt ) and D(x b) to the area actually
covered by the disks. The best scores are highlighted in bold fonts.

                            Score               Algorithm       αgt = 1 × 10−6      αgt = 5 × 10−6      αgt = 1 × 10−5
                                                                         — Elliptical disk, see Figs. 5 and 6 —
                N-RMSE (D(xgt ), D(bx))          REXPACO                0.20               0.18              0.17
                   N-RMSE (xgt , x
                                 b)              REXPACO                0.51               0.32             0.25
                 N-RMSE (H xgt , H xb)            cADI                  0.65               0.45              0.43
                 N-RMSE (H xgt , H xb)           REXPACO                0.19              0.13              0.07
                                                                          — Spiral disk, see Figs. 7 and 8 —
                N-RMSE (D(xgt ), D(bx))          REXPACO                0.75              0.66             0.32
                   N-RMSE (xgt , x
                                 b)              REXPACO                0.76              0.81             0.35
                 N-RMSE (H xgt , H xb)            cADI                  0.88              0.87             0.89
                 N-RMSE (H xgt , H xb)           REXPACO                0.75             0.40              0.24

                                                                         side of the disk (reconstructed values are very close to zero:
                                                                         the ground truth value) at the expense of a slightly nois-
                                                                         ier reconstruction of the disk. In Figure 9, the location of
                                                                         the global minimum (denoted by a circle) corresponds to
                                                                         parameters µ bSURE          MSE
                                                                                        smooth > µsmooth and µ   bSURE
                                                                                                                  `1   < µMSE
                                                                                                                           `1   , which
                                                                         leads to a stronger smoothing of the object with µ      b SURE
                                                                         and a better rejection of close-to-zero noise in the back-
                                                                         ground with µ b MSE . This qualitative observation is also con-
                                                                         firmed by Table 3 which reports RMSE8 scores between the
                                                                         ground truth xgt and the reconstructed flux distribution x    b
                                                                                       1
                                                                         (RMSE = M       ||xgt − x
                                                                                                 b ||2 ). Table 3 shows that the qual-
                                                                         ity of the reconstruction obtained with the regularization
                                                                         parameters µ b SURE estimated by minimizing the SURE cri-
Fig. 9. Comparison between the SURE criterion accounting                 terion (22) is very close to the best-possible reconstruction
for the fluctuations of the nuisance component through a local           obtained by minimizing the MSE criterion (20).
learning of the covariances (left, see Eq. (23)) and the true MSE
(right, see Eq. (20)). The pink circles represent the global min-
imum of each criterion. Data set: HIP 80019, see Sect. 3.3 for           3.3. Evaluation on data sets with circumstellar disks
observing conditions.
                                                                         In this section, we evaluate the ability of REXPACO to recon-
                                                                         struct light flux distributions of actual circumstellar disks
Table 3. RMSE scores of REXPACO reconstructions of a simu-               from ADI sequences. The results are compared to two ex-
lated elliptical disk, for three different regularization levels. The    isting methods, cADI and PCA (see Sect. 1 for their respec-
best scores are highlighted in bold fonts.                               tive principle), both designed for the detection of point-like
                                                                         sources but used extensively with different tuning parame-
        Score (×10−5 )          µ → 0+       µMSE      b SURE
                                                       µ                 ters to process data with disks (see Milli et al. (2012); Pairet
                                                                         et al. (2019) for studies of the resulting post-processing
       RMSE (disk)                  4.37     4.41       2.99             artifacts). We also compare REXPACO to PACO. The ratio-
     RMSE (background)              5.19     2.54       3.96             nale behind this last comparison is to assess the benefit
     RMSE (whole field)             6.79     4.96       5.09             of the reconstruction framework of REXPACO in addition to
                                                                         the common statistical modeling of the spatial covariances
                                                                         shared by the two methods. For cADI and the PCA, we
    Figure 10 illustrates the impact of the regularization in            used the SpeCal pipeline (Galicher et al. 2018) which is
three cases: an under-regularized setting (i.e., µ → 0+ ), the           the post-processing standard of the SPHERE data center
                     b SURE based on the minimization of the             (Delorme et al. 2017). The PACO implementation is based
automatic setting µ
                                                                         on the algorithm described in Flasseur et al. (2018a,b,c). A
SURE criterion (23), and the best-possible setting µMSE
                                                                         Matlab™ implementation of the REXPACO main routines is
computed using the ground truth by minimizing the MSE
                                                                         available online9 as a purpose of illustration. It is based on
criterion (20). The morphology of the reconstructed disk is
                                                                         GlobalBioIm10 , an opensource software for image recon-
improved by the regularization: It is smoother and sharper
thanks to the edge-preserving term. The background of x                   8
                                                                             Here, we use this metric instead of N-RMSE defined in
                                                             b
is also less noisy (closer to zero) due to the `1 -norm penal-            Eq. (19) since we aim to compare the absolute error made both
ization. It can be noted that in this simulation, the recon-              inside and outside the spatial support of the disk.
structed object x b with µ b SURE is flatter and sharper than             9
                                                                             https://github.com/olivier-flasseur/rexpaco_demo.git
the disk reconstructed with the optimal parameters because               10
                                                                             https://biomedical-imaging-group.github.io/
the optimal parameters lead to a better reconstruction out-               GlobalBioIm/

Article number, page 14 of 25
You can also read