Long-time simulations with high fidelity on quantum hardware

Page created by Margaret Arnold
 
CONTINUE READING
Long-time simulations with high fidelity on quantum hardware

                                                                      Joe Gibbs,1 Kaitlin Gili,1, 2 Zoë Holmes,3 Benjamin Commeau,3, 4 Andrew
                                                                       Arrasmith,1 Lukasz Cincio,1 Patrick J. Coles,1 and Andrew Sornborger3
                                                                     1
                                                                       Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA.
                                                                  2
                                                                    Department of Physics, University of Oxford, Clarendon Laboratory, Oxford, U.K.
                                                                    3
                                                                      Information Sciences, Los Alamos National Laboratory, Los Alamos, NM, USA.
                                                                  4
                                                                    Department of Physics, University of Connecticut, Storrs, Connecticut, CT, USA.
                                                                                                   (Dated: July 15, 2021)
                                                            Moderate-size quantum computers are now publicly accessible over the cloud, opening the excit-
                                                         ing possibility of performing dynamical simulations of quantum systems. However, while rapidly
                                                         improving, these devices have short coherence times, limiting the depth of algorithms that may
                                                         be successfully implemented. Here we demonstrate that, despite these limitations, it is possible
                                                         to implement long-time, high fidelity simulations on current hardware. Specifically, we simulate
arXiv:2102.04313v2 [quant-ph] 14 Jul 2021

                                                         an XY-model spin chain on the Rigetti and IBM quantum computers, maintaining a fidelity of at
                                                         least 0.9 for over 600 time steps. This is a factor of 150 longer than is possible using the iterated
                                                         Trotter method. Our simulations are performed using a new algorithm that we call the fixed state
                                                         Variational Fast Forwarding (fsVFF) algorithm. This algorithm decreases the circuit depth and
                                                         width required for a quantum simulation by finding an approximate diagonalization of a short time
                                                         evolution unitary. Crucially, fsVFF only requires finding a diagonalization on the subspace spanned
                                                         by the initial state, rather than on the total Hilbert space as with previous methods, substantially
                                                         reducing the required resources. We further demonstrate the viability of fsVFF through large nu-
                                                         merical implementations of the algorithm, as well as an analysis of its noise resilience and the scaling
                                                         of simulation errors.

                                                             I.       INTRODUCTION                                In this work, we improve upon a recently proposed
                                                                                                               variational quantum algorithm known as Variational Fast
                                                                                                               Forwarding (VFF) [21]. VFF allows long time simula-
                                               The simulation of physical systems is both valuable
                                                                                                               tions to be performed using a fixed depth circuit, thus
                                            for basic science and technological applications across a
                                                                                                               enabling a quantum simulation to be ‘fast forwarded’ be-
                                            diverse range of industries, from materials design to phar-
                                                                                                               yond the coherence time of noisy hardware. The VFF
                                            maceutical development. Relative to classical computers,
                                                                                                               algorithm requires finding a full diagonalization of the
                                            quantum computers have the potential to provide an ex-
                                                                                                               short time evolution operator U of the system of inter-
                                            ponentially more efficient means of simulating quantum
                                                                                                               est. Once found, the diagonalization enables any initial
                                            mechanical systems. Quantum hardware has progressed
                                                                                                               state of that system to be fast forwarded. However, for
                                            substantially in recent years [1, 2]. However, despite con-
                                                                                                               practical purposes, one is often interested in studying the
                                            tinual progress, we remain in the ‘noisy intermediate-
                                                                                                               evolution of a particular fixed initial state of interest. In
                                            scale quantum’ (NISQ) era in which the available hard-
                                                                                                               that case a full diagonalization of U is overkill. Instead, it
                                            ware is limited to relatively small numbers of qubits
                                                                                                               suffices to find a diagonal compilation of U that captures
                                            and prone to errors. Simulation algorithms designed for
                                                                                                               its action on the given initial state. Here, we show that
                                            fault-tolerant quantum computers, such as Trotterization
                                                                                                               focusing on this commonly encountered but less exacting
                                            methods [3, 4], qubitization methods [5], and Taylor se-
                                                                                                               task can substantially reduce the resources required for
                                            ries methods [6], require deeper circuits than viable given
                                                                                                               the simulation.
                                            the short coherence times of current hardware. Thus al-
                                            ternative approaches are needed to successfully imple-                Specifically, we introduce the fixed state VFF algo-
                                            ment useful simulations on NISQ hardware.                          rithm (fsVFF) for fast forwarding a fixed initial state
                                               Variational quantum algorithms [7–22], where a clas-            beyond the coherence time of a quantum computer. This
                                            sical computer optimizes a cost function measured on               approach is tailored to making dynamical simulation
                                            a quantum computer, show promise for NISQ quantum                  more suitable for NISQ hardware in two key ways. First,
                                            simulations. An early approach introduced an iterative             the cost function requires half as many qubits as VFF.
                                            method, where the state is variationally learned on a              This not only allows larger scale simulations to be per-
                                            step-by-step basis using action principles [18, 19, 23, 24].       formed on current resource-limited hardware, but also
                                            Subsequently, a generalization of the variational quan-            has the potential to enable higher fidelity simulations
                                            tum eigensolver [10] was developed for simulations in              since larger devices tend to be noisier. Second, fsVFF
                                            low lying energy subspaces [20]. Very recently, quantum-           can utilize simpler ansätze than VFF both in terms of the
                                            assisted methods have been proposed that perform all               depth of the ansatz and the number of parameters that
                                            necessary quantum measurements at the start of the al-             need to be learnt. Thus, fsVFF can reduce the width,
                                            gorithm instead of employing a classical-quantum feed-             depth, and total number of circuits required to fast for-
                                            back loop [25–27].                                                 ward quantum simulations, hence increasing the viability
2

of performing simulations on near-term hardware.                 3. Use the compiled form to simulate for time T =
   We demonstrate these advantages by implementing                  N ∆t using the circuit
long-time high fidelity quantum simulations of the 2-
qubit XY spin chain on Rigetti’s and IBM’s quantum                         W (θopt )D(γopt , N ∆t)W (θopt )† .              (2)
computers. Specifically, while the iterated Trotter ap-
proach has a fidelity of less than 0.9 after 4 time steps        VFF has proven effective for providing a fixed quantum
and has completely thermalized by 25 time steps, with         circuit structure with which to fast-forward beyond the
fsVFF we achieve a simulation fidelity greater than 0.9       coherence time of current noisy quantum devices. How-
for over 600 time steps. We further support the effective-    ever, the algorithm requires a full diagonalization of U
ness of this approach for NISQ simulations, with 4 qubit      over the entire Hilbert space. The local Hilbert-Schmidt
noisy and 8 qubit noiseless numerical simulations of the      test used to find this diagonalization requires 2n qubits.
XY model and Fermi-Hubbard model respectively.                Additionally, the ansatz must be sufficiently expressible
   In our analytical results, we prove the faithfulness       to diagonalize the full unitary U to a high degree of ap-
of the fsVFF cost function by utilizing the newly de-         proximation [31–33]. This typically requires a large num-
veloped No-Free-Lunch theorems for quantum machine            ber of parameters and a reasonably deep circuit. These
learning [28, 29]. We also provide a proof of the noise       overheads limit VFF’s utility on current hardware.
resilience of the fsVFF cost function, specifically the op-      In what follows, we introduce a more NISQ-friendly
timal parameter resilience [30]. Finally, we perform an       refinement to VFF that reduces these overheads when
analysis of simulation errors under fast-forwarding.          one is interested in fast-forwarding a fixed initial state
   The diagonalizations obtained using fsVFF may fur-         |ψ0 i, rather than any possible initial state. The fixed
ther be useful for determining the eigenstates and eigen-     state VFF algorithm (fsVFF) is summarised in Fig. 1.
values of the Hamiltonian on the subspace spanned by             We note that VFF, like the standard iterated Trot-
the initial state. This can be done using a time series       ter approach to quantum simulation, necessarily incurs a
analysis, by using fsVFF to reduce the depth of the quan-     Trotter error by approximating e−iH∆t with U = U (∆t).
tum phase estimation (QPE) algorithm, or using a simple       This Trotter error may be removed using the Variational
sampling method. We demonstrate on IBM’s quantum              Hamiltonian Diagonalization algorithm (VHD), which di-
computer that, while standard QPE fails on real hard-         rectly diagonalizes the Hamiltonian H [22]. However,
ware, fsVFF can be used to obtain accurate estimates of       VHD is yet more resource intensive than VFF on current
the spectrum.                                                 hardware, so we focus here on refining VFF.

                                                                  III.   FIXED STATE VARIATIONAL FAST
                 II.   BACKGROUND                                         FORWARDING ALGORITHM

   Before presenting our fsVFF algorithm, let us first re-                            A.   Cost function
view the original VFF algorithm from Ref. [21]. Consider
a Hamiltonian H on a d = 2n dimensional Hilbert space            In fsVFF, instead of searching for a full diagonalization
(i.e., on n qubits) evolved for a short time ∆t with the      of U over the entire Hilbert space, we search for a diag-
simulation unitary e−iH∆t , and let T (larger than ∆t)        onal compilation of U that captures the action of U on
denote the desired simulation time. Then the VFF algo-        the initial state |ψ0 i and its future evolution, e−iHt |ψ0 i.
rithm consists of the following steps:                        Here, we introduce a cost function tailored to this task.
                                                                 To make precise what is required of the cost for fsVFF,
   1. Approximate e−iH∆t with a single-timestep Trot-         let us first note that as the state |ψ0 i evolves, it remains
      terized unitary denoted U = U (∆t).                     within its initial energy subspace. This can be seen by
                                                              expanding then initial state in terms of the energy eigen-
   2. Variationally search for an approximate diagonal-       basis {|Ek i}2k=1 (the eigenbasis of H) as
      ization of U by compiling it to a unitary with a
      structure of the form                                                                     neig
                                                                                                X
                                                                                      |ψ0 i =          ak |Ek i ,           (3)
           V (α, ∆t) := W (θ)D(γ, ∆t)W (θ)† ,          (1)                                      k=1

     where α = (θ, γ) is a vector of parameters. Here,        where ak = hEk |ψ0 i, and noting that
     D(γ, ∆t) is a parameterized unitary that will (after                                       neig
                                                                                                X
     training) encode the eigenvalues of U (∆t), while                     e   −iHt
                                                                                      |ψ0 i =          ak e−iEk t |Ek i .   (4)
     W (θ) is a parameterized unitary matrix that will                                          k=1
     consist of the corresponding eigenvectors [21]. The
     compilation is performed using the local Hilbert-        Thus it follows that if |ψ0 i spans neig energy eigenstates
     Schmidt test [13] to find the parameters θopt and        of H, so does e−iHt |ψ0 i for all future times. Therefore
     γopt that minimize the local Hilbert-Schmidt cost.       to find a compilation of U that captures its action on
3

                                                   H, |   0i        (b) Trotter Expansion:      U, |   0i
                         (a) Input: H, | 0 i

      (d) Unitary Diagonalization with Maximizing Fidelity up to k = neig :                             (c) Optional: Calculate neig
          Gradient Descent Optimization Loop                                                                :
                                                                                                              Construct G(k) :
             Evaluate
                                  ⇢
                                      @CfsVFF @CfsVFF                                                                                   l,l0 =k
                                                                                                               G(k) = [h l |       l0 i]l,l0 =0
             Gradients:                      ,                                                                                                    Hadamard Test
                                        @✓i     @ i
                                  ⇢
             Update                              @CfsVFF                   @CfsVFF
             Parameters:              ✓i     ⌘           ,      i      ⌘
                                                   @✓i                       @ i
                                                                                                               Calculate Det(G(k)) :
             Evaluate Cost:
                                                     neig                                                      If Det(G(k)) 6= 0                  construct G(k+1)
                                              1 X                    † k k             2
                     CfsVFF := 1                  |h            0 |(V ) U |       0 i|                         If Det(G(k)) = 0                   neig = k
                                             neig
                                                     k=1

                                                                                                        U, |   0 i, neig

                           (e) Output: Fast Forwarded Simulation
                             Approximately simulate H for time N t using
         ✓ opt ,   opt
                             W (✓✓ opt )D(   opt , N      t)W (✓✓ opt )†

FIG. 1. The fsVFF Algorithm. (a) An input Hamiltonian and an initial input state are necessary (b) to create a single
time-step Trotterized unitary, U (∆t) and (c) to calculate the number of eigenstates spanned by the initial state. The value
of neig can be calculated by constructing a matrix of state overlaps U k |ψ0 i and increasing the matrix dimension until the
determinant is zero. (d) The unitary is then variationally diagonalized into the form, V (α, ∆t) = W (θ)D(γ, ∆t)W † (θ). The
cost function CfsVFF is minimized with a classical optimizer (e.g., gradient descent), where the parameters θ and γ are updated.
(e) The optimal parameters θopt and γopt are then used to implement a fast-forwarded simulation with the diagonalized unitary
form.

e−iHt |ψ0 i (for all times t) it suffices to find a compi-                                 due to Trotter error is negligible). In general these states
lation of U on the neig dimensional subspace spanned                                       may be freely chosen from the subspace spanned by |ψ0 i.
           neig                                           n
by {|Ek i}k=1   . We stress that the eigenstates {|Ek i}2k=1                               Here a convenient choice in training states would be |ψ0 i
                                                                                                                                                   neig
need not be ordered, and therefore the subspace spanned                                    and its Trotter evolutions, that is the set {U k |ψ0 i}k=1   .
                       neig
by the subset {|Ek i}k=1    is not necessarily low lying in                                Motivated by these observations, we define our cost func-
energy.                                                                                    tion for fsVFF as
                                                                                                                           neig
   A No-Free-Lunch Theorem for quantum machine                                                                         1 X
learning introduced in Ref. [28] proves that to perfectly                                        CfsVFF := 1 −                  | hψ0 | (V † )k U k |ψ0 i |2 ,       (5)
                                                                                                                      neig
learn the action of a unitary on a d-dimensional space                                                                       k=1
requires d training pairs. In the context of fsVFF, we                                     where similarly to VFF we use a diagonal ansatz
are interested in learning the action of a unitary on an                                   V (α, ∆t) := W (θ)D(γ, ∆t)W (θ)† . This cost quantifies
neig -dimensional subspace. Since the unitary is block di-                                 the overlap between the initial state evolved under U for
agonal, one can directly apply this NFL theorem to the                                     k time steps, U k |ψ0 i, and the initial state evolved under
subspace of interest. Therefore neig training pairs are                                    the trained unitary for k time steps, W Dk W † |ψ0 i, aver-
required to learn the unitary’s action on this subspace.                                   aged over neig time steps. Assuming we have access to
(Note, we assume here that the training states are not                                     the unitary that prepares the state |ψ0 i, the state over-
entangled with an additional register. It was shown in                                     laps can be measured using n qubits, via a circuit that
Ref. [29] that using entangled training data can reduce                                    performs a Loschmidt echo [30]. Therefore CfsVFF can
the required number of training states. In fact, this more                                 be evaluated using only n qubits. This is half as many as
powerful method is used by the VFF algorithm. How-                                         standard VFF, opening up the possibility of performing
ever, producing such entangled training data requires ad-                                  larger simulations on current hardware.
ditional qubits and two-qubit gates and therefore is less                                     It is important to note that while the exact time-
NISQ-friendly.)                                                                            evolved state exp(−iHt) |ψ0 i is perfectly confined to the
   The No-Free-Lunch theorem therefore implies that neig                                   initial subspace, the approximate evolution induced by
states are required to learn U on |ψ0 i (assuming leakage                                  U (∆t) allows for leakage from the initial subspace [34].
4

                                                n
Thus the subspace spanned by {U k |ψ0 i}k=1      eig
                                                     in general   zero if and only if the vectors are linearly dependent. The
                                            neig
does not perfectly overlap with {|Ek i}k=1 . However, by          Gramian corresponding to Vk is given by
reducing ∆t and considering higher order Trotter ap-                                                                    
proximations [4, 35], this leakage can be made arbitrarily                           hψ0 |ψ0 i hψ0 |ψ1 i · · · hψ0 |ψk i
small. In Appendix A, we prove that in the limit that                             hψ1 |ψ0 i hψ1 |ψ1 i · · · hψ1 |ψk i
                                                                                                                        
leakage from the initial subspace is negligible, CfsVFF                   G(k) =        ..        ..    ..        ..    . (7)
                                                                                         .         .        .      .    
is faithful. That is, we show that the cost vanishes,
CfsVFF = 0, if and only if the fidelity of the fast-forwarded                        hψk |ψ0 i hψk |ψ1 i · · · hψk |ψk i
simulation is perfect,
                                                                  If Det(G(k)) 6= 0, then the vectors in Vk are linearly in-
                           †   τ    τ       2
            Fτ = | hψ0 | W D W U |ψ0 i | = 1 ,             (6)    dependent and therefore span at least a k +1 dimensional
                                                                  subspace. Conversely, if Det(G(k)) = 0, the set Vk con-
for all times τ . Note, that the reverse direction is trivial.    tains linear dependencies and the subspace they span is
If Fτ = 1 for all τ , then CfsVFF = 0.                            less than k + 1 dimensional. Therefore, if we can find
   Similar to the VFF cost, the fsVFF cost is noise re-           kmin , the smallest k such that Det(G(k)) = 0, then (not-
silient in the sense that incoherent noise should not affect      ing that G(k) is a k + 1 dimensional matrix) we know
the global optimum of the function. This is proven for a          that kmin is the largest number of linearly independent
broad class of incoherent noise models using the results          vectors spanned by V∞ . That is, kmin is the dimension
of Ref. [30] in Appendix B.                                       of K∞ (U, ψ0 ) and so we have that neig = kmin .
   Nonetheless, it is only possible to measure CfsVFF if the         The overlaps hψl |ψl0 i for any l and l0 can be mea-
unitary U neig can be implemented comfortably within the          sured using the Hadamard Test, shown in Fig. 1, and
coherence time of the QC. Additionally, the number of             thus the Hadamard test can be used to determine G(k)
circuits required to evaluate CfsVFF in general scales with       on quantum hardware. Since the Gramian here con-
neig . Given these two restrictions, fsVFF is limited to          tains two symmetries,     0
                                                                                               hermiticity 0 and the invariance
simulating quantum states spanning a non-exponential              hψl |ψl0 i = hψ0 |U −l U l |ψ0 i = hψ0 |U l −l |ψ0 i = hψ0 |ψl0 −l i,
number of eigenstates. Consequently, we advocate us-              we only have to calculate the first row of the matrix G(k)
ing fsVFF to simulate states with neig = poly(n). Cru-            on the quantum computer.
cially these states need not be low lying and therefore our          In summary, our proposed algorithm to determine neig
approach is more widely applicable than the Subspace              consists of the following loop. Starting with k = 1,
Variational Quantum Simulator (SVQS) algorithm [20],
                                                                     1. Construct G(k) using the Hadamard test.
which simulates fixed low energy input states. In Sec-
tion V C we develop methods for reducing the resources
                                                                     2. Calculate (classically) Det(G(k)).
required to evaluate the cost for larger values of neig .
   While CfsVFF was motivated as a natural choice of cost                If Det(G(k)) = 0, terminate the loop and conclude
function to learn the evolution induced by a target uni-                 that neig = k.
tary on a fixed initial state, it is a global cost [36] and              If Det(G(k)) 6= 0, increase k → k + 1 and return to
hence it encounters what is known as a barren plateau for                step 1.
large simulation sizes [33, 36–46]. In Appendix C we sug-
gest an alternative local version of the cost to mitigate         This is shown schematically in Fig. 1.
such trainability issues.
                                                                    We remark that in the presence of degeneracies in the
                                                                  spectrum of H, the eigenvectors corresponding to degen-
                   B.    Calculating neig                         erate eigenvalues are not unique. Therefore, in this case,
                                                                  the number of states spanned by |ψ0 i depends on how
                                                                  the eigenvectors corresponding to degenerate eigenval-
   In this section, we present an algorithm to calcu-             ues are chosen. However, as detailed in Appendix A,
late neig and therefore determine the number of train-            to learn the action of U on |ψ0 i, what matters is the
ing states required to evaluate CfsVFF . Our proposed             number of eigenstates spanned by |ψ0 i corresponding to
algorithm utilizes the fact that the number of energy             unique eigenvalues. This is equivalent to the dimension
eigenstates spanned by |ψ0 i is equivalent to the num-            of the Krylov subspace K∞ (U, ψ0 ). Consequently, the
ber of linearly independent states in the set V∞ where            algorithm detailed above can also be used in this case.
Vk := {|ψl i}l=k                           l
               l=0 with |ψl i := U (∆t) |ψ0 i. The sub-
space Kk (U, ψ0 ) spanned by Vk is known as the Krylov              While it is beneficial to learn neig to determine how
subspace associated with the operator U and vector                many training states are required to perfectly learn the
|ψ0 i [47]. Therefore, neig is equivalently the dimension of      diagonalization on the subspace spanned by the initial
the Krylov subspace K∞ (U, ψ0 ).                                  state, we stress that it is not strictly necessary for the suc-
   To determine the dimension of K∞ (U, ψ0 ) we can uti-          cessful implementation of fsVFF. One could always train
lize the fact that the determinant of the Gramian matrix          on an increasing number of states and study the conver-
of a set of vectors (i.e., the matrix of their overlaps) is       gence of an observable of interest. More concretely, one
5

could train on k states and then use the resultant diag-      may be possible to find a fully expressive, compact ansatz
onalization to compute the evolution of a particular ob-      by inspection. This is the case for a simple 2-qubit XY
servable as a function of time. For k < neig the trajectory   Hamiltonian, as discussed in Section IV.
of the observable will alter as k is increased. However,         More generally, it can be challenging to analyti-
for k > neig increasing k further will no longer change the   cally find compact but sufficiently expressible ansätze.
trajectory of the observable because it will have already     Nonetheless, it is possible to variationally update the
converged on the true trajectory. Using this approach,        ansatz structure and thereby systematically discover sim-
neig need not be already known to implement fsVFF.            ple structures. One straightforward approach is to use a
                                                              layered ansatz where each layer initializes to the iden-
                                                              tity gate [50, 51]. The ansatz can be optimized until
                       C.     Ansatz                          it plateaus, redundant single qubit gates removed, then
                                                              another layer can be appended and the process repeats.
  The fsVFF algorithm, similarly to VFF, employs an           Alternatively, more sophisticated discrete optimization
ansatz of the form                                            techniques may be used to variationally search the space
                                                              of ansätze.
           V (α, ∆t) = W (θ)D(γ, ∆t)W † (θ) ,          (8)

to diagonalize the initial Trotter unitary U (∆t). Here                    D.    Summary of algorithm
W (θ) is a quantum circuit that approximately rotates
the standard basis into the eigenbasis of H, and D(γ)            The fixed state Variational Fast Forwarding algorithm
is a diagonal unitary that captures the (exponentiated)       (fsVFF) is summarized in Fig. 1. We start with an ini-
eigenvalues of H. A generic diagonal operator D can be        tial state |ψ0 i that we wish to evolve under the Hamilto-
written in the form                                           nian H.
                            Y         q
                D(γ, ∆t) =      eiγq Z ∆t ,         (9)          1. The first step is to approximate the short time evo-
                               q
                                                                    lution using a single step Trotter approximation U .
where γq ∈ R and we use the notation                             2. This Trotter approximation can be used to find an
                                                                    approximation for neig , the dimension of the en-
                   q
                 Z =   Z1q1   ⊗ ··· ⊗   Znqn   ,      (10)
                                                                    ergy eigenspace spanned by |ψ0 i, using the method
                                                                    outlined in Section III B.
with Zj the Pauli Z operator acting on qubit j. While
Eq. (9) provides a general expression for a diagonal uni-        3. Equipped with a value for neig , we then varia-
tary, for practical ansätze it may be desirable to assume           tionally search for a diagonalization of U over the
that the Z q operators are local operators and the prod-            energy subspace spanned by |ψ0 i using CfsVFF ,
uct contains a polynomial number of terms, i.e., is in              Eq. (5). At each iteration step the gradient of the
O(poly(n)). There is more flexibility in the construc-              cost with respect to a parameter θi is measured on
tion of the ansätze for W since these are generic unitary           the quantum computer for a fixed set of parameters
operations. A natural choice might be to use a hardware-            using the analytic expressions for ∂θi CfsVFF pro-
efficient ansatz [48] or an adaptive ansatz [13, 49].               vided in Appendix D. These gradients are used to
   One of the main advantages of fsVFF is that diago-               update the parameters using a classical optimizer,
nalization is only necessary over the subspace spanned              such as those in Refs. [52–54]. The output of the
by the initial state, rather than the entire Hilbert space          optimization loop is the set of parameters that min-
which will be significantly larger. To outperform stan-             imize CfsVFF ,
dard VFF, it is in our interest to take advantage of this
small subspace to find compact ansätze.                                {θopt , γopt } = arg min CfsVFF (θ, γ) .    (11)
   The two main impeding factors we wish to minimize                                      θ,γ
to aid diagonalization are error rates and optimization
time. Therefore, when searching for ansätze, our priori-         4. Finally, the state |ψ0 i can be simulated for time
ties are to minimize the number of CNOT gates required              T = N ∆t using the circuit
(the noisiest component in the ansätze) and the num-
ber of rotation parameters. There is, however, a trade                    W (θopt )D(γopt , N ∆t)W (θopt )† .      (12)
off between expressibility of the ansatz and its trainabil-
ity. There needs to be enough freedom in the unitary to            That is, by simply multiplying the parameters γopt
map the required eigenvectors to the computational basis           in the diagonalized unitary by a constant number
but generically highly expressive ansätze exhibit barren           of iterations N .
plateaus [33].
   For systems with symmetries and/or systems that are          In Appendix E, we show that the total simulation fi-
nearby perturbations of known diagonalizable systems, it      delity, in the limit that leakage is small, is expected
6

                                                                                 states |00i (corresponding to neig = 1), |10i (neig = 2)
              1.00
                                               Classical Simulation
                                                                                 and √12 (|00i + |10i) (neig = 3). As described in Sec-
              0.75
                                               Hardware Implementation           tion III B, the neig of these states can be found by calcu-
                                                               neig = 1          lating Det(G(k)) for increasing values of k since, as k is
  Det(G(k))

              0.50                                             neig = 2          increased, the determinant first equals 0 when k = neig .
                                                               neig = 3             To verify this for the states considered here, we first
              0.25                                                               determine G using a classical simulator. As seen in
                                                                                 Figure 2, in this case Det(G(k)) exactly equals 0 when
              0.00
                     1            2                3                 4           k = neig . We then measured G on Honeywell’s quantum
                                           k                                     computer. Although on the real quantum device gate
                                                                                 noise and sampling errors are introduced, the results re-
FIG. 2. Gramian Determinant Calculation. Here we                                 produce the classical results reasonably well. Namely, at
plot the determinant of the Gramian matrix, Det(G), for G                        the correct value of k, Det(G(k)) drastically reduces and
measured on the Honeywell quantum computer (solid) and                           approximately equals 0. Thus, we have shown that it is
simulated classically (dashed) for a 2-qubit XY spin chain.                      possible to determine neig for an initial state by measur-
Specifically we looked at states spanning k = 1 (blue), k = 2                    ing G on quantum hardware.
(yellow) and k = 3 (red) eigenstates. For both sets of
data Det(G(neig )) ≈ 0, demonstrating the effectiveness of
the method for determining neig that we introduce in Sec-
tion III B. For the Honeywell implementation we used 1000                                              B.   Training
measurement samples per circuit.
                                                                                    We tested the training step of the algorithm on
                                                                                 IBM and Rigetti’s quantum computers, specifically
to scale sub-quadratically with the number of fast-                              ibmq_toronto and Aspen-8. For the purpose of imple-
forwarding time steps N . Thus, if the minimal cost from                         menting a complete simulation, we chose to focus on sim-
the optimization loop is sufficiently small, we expect the                       ulating the evolution of the state |ψ0 i = |10i. As dis-
fsVFF algorithm to allow for long, high fidelity simula-                         cussed in the previous section, this state spans neig = 2
tions.                                                                           eigenstates.
                                                                                    To diagonalize HXY on the 2-dimensional subspace
                                                                                 spanned by |10i, we used a hybrid quantum-classical op-
                IV.      HARDWARE IMPLEMENTATION                                 timization loop to minimize CfsVFF . For a state with
                                                                                 neig = 2 the cost CfsVFF , Eq. (5), uses two training
  In this section we demonstrate that fsVFF can be used                          states {U (∆t)k |ψ0 i}k=1,2 where U (t) is the first-order
to implement long time simulations on quantum hard-                              Trotter approximation of HXY . On the IBM quantum
ware. Specifically, we simulate the XY spin chain, which                         computer we evaluated the full cost function for each
has the Hamiltonian                                                              gradient descent iteration. However, the time available
                                                                                 on the Aspen-8 device was limited, so to speed up the
                                  n−1
                                  X                                              rate of optimization we evaluated the overlap on just one
                         HXY :=         Xj Xj+1 + Yj Yj+1 ,               (13)   of the two training states per iteration, alternating be-
                                  j=1                                            tween iterations (instead of evaluating the overlaps on
                                                                                 both training states every iteration). To allow the move-
where Xj and Yj are Pauli operators on the jth qubit. In                         ment through parameter space to use information aver-
what follows, we first present results showing that we can                       aged over the two timesteps, whilst only using a single
determine neig for an initial state |ψ0 i using the method                       training state per cost function evaluation, momentum
described in Section III B. We then demonstrate that the                         was added to the gradient updates [55].
fsVFF cost can be trained to find an approximate diago-
                                                                                    To take advantage of the fact that more compact an-
nalization of HXY on the subspace spanned by |ψ0 i. We
                                                                                 sätze are viable for fsVFF, we variationally searched for
finally use this diagonalization to perform a long time
                                                                                 a short depth ansatz, tailored to the target problem.
fast forwarded simulation. In all cases we focus on a two
                                                                                 Specifically, we started training with a general 2-qubit
qubit chain, i.e. n = 2, and we approximate its evolution
                                                                                 unitary and then during training the structure was min-
operator using a first-order Trotter approximation.
                                                                                 imised by pruning unnecessary gates. In Figure 3, we
                                                                                 show the circuit for the optimal ansatz obtained using
                                                                                 the method. The ansatz requires one CNOT gate and
                            A.    Determining neig
                                                                                 two single qubit gates for W and only one Rz rotation for
                                                                                 D. This is a substantial compression on the most general
  The 2-qubit XY Hamiltonian has the eigenvectors                                two qubit ansatz for W which requires 3 CNOTs and 15
{|00i , √12 (|10i + |01i), √12 (|10i − |01i), |11i}, correspond-                 single qubit rotations and the most general 2 qubit ansatz
ing to the eigenvalues {0, 1, -1, 0}. As proof of princi-                        for D which requires 2 Rz rotations and one 2-qubit ZZ
ple, we tested the algorithm for determining neig on the                         rotation (though in the case of the XY Hamiltonian this
7

                                                                                                   the number of iterations for the implementations on
                                                                                                   ibmq_toronto (yellow) and Aspen-8 (red). The dashed
                                                                                                   line indicates the noisy cost value obtained from the
                                                                                                   quantum computer. To evaluate the quality of the op-
                                                                                                   timization, we additionally classically compute the true
                                                                                                   cost (indicated by the solid lines) using the parameters
FIG. 3. Ansatz for Hardware Implementation. The                                                    found on ibmq_toronto and Aspen-8. While the noisy
ansatz used to diagonalize the 2-qubit XY Hamiltonian in the                                       cost saturates at around 10−1 , we obtained a minimum
subspace of initial state |10i for the implementation on Rigetti                                   noise-free cost of the order 10−3 . The two orders of mag-
and IBM’s quantum computers. Here Rj (θ) = exp(−iθσj /2)                                           nitude difference between the noisy and the noise-free
for j = x, y, z.                                                                                   cost is experimental evidence that the cost function is
                                                                                                   noise resilient on extant quantum hardware.
                       100
a)
                                                                                                                     C.    Fast forwarding
                10−1
 Cost, CfsVFF

                                                                                                      Finally we took the two sets of parameters found
                10−2                                                                               from training on ibmq_toronto and Aspen-8, and used
                                                                                                   them to implement a fast-forwarded simulation of the
                                 Noisy
                10−3             Noise-free
                                                                                                   state |10i on ibmq_rome. To evaluate the quality of
                                                                                                   the fast forwarding we calculated the fidelity, F (N ) =
                             0           5            10                15            20           hψ(N )|ρ(N )|ψ(N )i, between the density matrix of the
                                                     Iterations
                       1.0
                                                                                                   simulated state, ρ(N ), after N timesteps, and the exact
b)                                                                                                 time evolved state, |ψ(N )i, at time T = N ∆t. We used
                       0.8
                                                                                                   Quantum State Tomography to reconstruct the output
                                                                  0.9
                                          fsVFF (ibmq-toronto)                                     density matrix, and then calculated the overlap with the
         Fidelity, F

                       0.6                fsVFF (Aspen-8)         0.8                              exact state classically.
                                          Iterated Trotter                                            As shown by the plots of F (N ) in Figure 4(b), fsVFF
                                                                  0.7
                       0.4                                                   625   1275    2000    significantly outperforms the iterated Trotter method.
                                                                                                   Let us refer to the time before the simulation falls be-
                       0.2                                                                         low an error threshold δ as the high fidelity time. Then
                             0    25      50     75    100     125           150    175      200   the ratio of the high fidelity time for fsVFF (TδFF ) and for
                                                   Timesteps, N                                    standard Trotterization (TδTrot ) is a convenient measure
                                                                                                   of simulation performance,
FIG. 4. Hardware Implementation. a) The 2-qubit pa-                                                                    RδFF = TδFF /TδTrot .               (14)
rameterised quantum circuit shown in Figure 3 was trained
to diagonalize U (∆t), a first order Trotter expansion of the
                                                                                                   A simulation can be said to have been successfully fast-
2-qubit XY Hamiltonian with ∆t = 0.5, in the subspace
spanned by |10i. The dashed line plots the noisy cost as
                                                                                                   forwarded if RδFF > 1. The iterated Trotter method
measured on ibmq_toronto (yellow) and Aspen-8 (red) us-                                            dropped below a simulation infidelity threshold of δ =
ing 30,000 samples per circuit. The solid line indicates the                                       1 − F = 0.1 (0.2) after 4 (8) timesteps. In compar-
equivalent noise-free cost that was calculated on a classical                                      ison, fsVFF maintained a high fidelity for 625 (1275)
simulator. b) The initial state |ψ0 i = |10i is evolved forwards                                   timesteps. Thus we achieved a simulation fast-forwarding
                                                                                                             FF          FF
in time on the ibmq_rome quantum computer using the it-                                            ratio of R0.1 = 156 (R0.2 = 159).
erated Trotter method (blue) and using fsVFF with the opti-
mum parameters found on ibmq_toronto (yellow) and Aspen-
8 (red). The quality of the simulation is evaluated by plotting                                             V.    NUMERICAL SIMULATIONS
the fidelity F = hψ|ρ|ψi between the evolved state and exact
evolution. The grey dotted line at F = 0.25 represents the
overlap with the maximally mixed state. The black dotted                                                              A.   Noisy Training
line denotes a threshold fidelity at F = 0.9. The inset shows
the fast-forwarding of the ansatz trained on ibmq_toronto on                                          We further validate fsVFF’s performance by testing it
a longer timescale, where the fidelity dropped below 0.9 (0.8)                                     on a simulator of a noisy quantum computer. The noise
at 625 (1275) timesteps. All simulation data was taken using                                       levels on the simulator are lower than those experienced
1000 samples per circuit.
                                                                                                   on current devices and hence these results are indicative
                                                                                                   of the performance of the algorithm in the near future as
                                                                                                   hardware improves.
may be simplified to only 2 Rz rotations [56]).                                                       For these numerics we diagonalize the evolution of the
 Figure 4(a) shows the fsVFF cost function versus                                                  4 qubit XY Hamiltonian, in the subspace spanned by the
8

domain wall state |ψ0 i = |1100i. This state spans 5 en-                              
ergy eigenstates of the XY Hamiltonian and so we use the
training states {U (∆t)k |ψ0 i}5k=1 . Here U (∆t) is chosen                           
to be a second-order Trotter-Suzuki decomposition for
the evolution operator under HXY with ∆t = 0.5. The                                                                                                  
                                                                                      

                                                                  )LGHOLW\F

                                                                                                                                &RVWCfsVFF
noise model used was based upon the IBM architecture.                                                        IV9))
   To construct the ansatz for the diagonalizing unitary,                                                                                                  1RLV\
                                                                                                       7URWWHU
W , we developed an adaptive technique, similar to that                                                                                                          1RLVHIUHH
proposed in [13, 49], to evolve the discrete circuit struc-                                                                                             
ture, as well as optimize the rotation parameters using                                                                                                    ,WHUDWLRQV
gradient descent. This method tends to produce shal-
lower circuits than the ones obtained with fixed ansatz                               
approaches. It is also less prone to get stuck in local                                                                                             
minima. Since, the XY Hamiltonian is particle number
                                                                                                                              7LPHVWHSVN
conserving we further use only particle number conserv-
                                                                    FIG. 5. Noisy Training and Fast-Forwarding of the 4
ing gates. This reduces the number of parameters in W ,             qubit XY Hamiltonian. The inset shows the cost curve
as well as minimizing the leakage out of the symmetry               as the ansatz is evolved and optimized to diagonalize the 4
sector when the circuit is executed with a noisy simula-            qubit XY Hamiltonian in the 5-dimensional subspace spanned
tor. Additional details on this adaptive learning method            by initial state |1100i. The final circuit found by the learning
are provided Appendix F. The ansatz for D, as in our 2              algorithm for the diagonalizing unitary, W , had 50 CNOT
qubit hardware implementation, simply consisted of Rz               gates. The main plot evaluates the fast-forwarding perfor-
rotations on each qubit.                                            mance of the trained ansatz, in comparison to the Iterated-
   The result of the training is shown in the inset of Fig. 5.      Trotter evolved state. The fidelity is calculated against the
The noisy cost was measured by the noisy quantum sim-               ideal state found in simulation using the iterated Trotter
ulator, whereas the noise-free cost is calculated simul-            method in the absence of noise, F (N ) = hψ(N )| ρ |ψ(N )i with
                                                                    |ψ(N )i = U (∆t)N |ψ0 i. The black dotted line highlights a
taneously but in the absence of any noise. The signif-
                                                                    threshold value F = 0.8. The gray dotted line at F = 1/24
icant separation between the noisy and noise-free cost              represents the overlap with the maximally mixed state.
again demonstrates the noise resilience of the VFF al-
gorithm. After successfully training the cost, the fast-
forward performance was then evaluated. Using the same              σ =↑, ↓ and nj,σ = c†j,σ cj,σ is a particle number operator.
noise model, the output density matrix of the Iterated-
Trotter state and the fast-forwarded state was compared                    P number of fermions with a spin σ is given by
                                                                    The total
                                                                    Nσ = j nj,σ . The term with coefficient J in Eq. (15)
against the the Iterated-Trotter state in the absence of            represents a single-fermion nearest-neighbor hopping and
noise, with the fidelity between the two states plotted. As         the term with coefficient U introduces on-site repulsion.
shown in Fig. 5, the fast-forwarded evolution significantly         The Hamiltonian preserves particle numbers N↑ and N↓ .
out performs the Iterated-Trotter evolution, with the for-
                                                                       In our numerical studies, we choose L = 4 (which re-
mer’s fidelity dropping below 0.8 after 700 timesteps,
                                                                    quires 8 qubits to simulate) and J = 1, U = 2 as well as
compared to only 8 steps of the latter. Thus we achieved
                              FF                                    N↑ = N↓ = 2 (half filling). The initial state is chosen to
a fast forwarding ratio of R0.8   = 87.5.
                                                                    be a superposition of neig = 5 eigenvectors of HFH in the
                                                                    particle sector N↑ = N↓ = 2. Similar to our noisy sim-
               B.    Fermi-Hubbard model                            ulations of the XY model, we utilize an adaptive ansatz
                                                                    for W that is made out of gates that preserve particle
                                                                    number N↑ and N↓ . The ansatz for D takes the form
   Finally, to probe the scalability and the breadth of ap-         of Eq. (9), where we only allow for single-Z terms in
plicability of the fsVFF algorithm we performed a larger            Eq. (10). For this numerical result, we trained using the
(noiseless) numerical implementation of the algorithm on            full exponentiation of the Hamiltonian as the evolution
the Fermi-Hubbard model. Specifically, we considered                operator, with no Trotter error.
the 1D Fermi-Hubbard Hamiltonian on an L-site lattice
                                                                       In the inset of Fig. 6 we show the cost function as
with open boundary conditions:
                                                                    it is iteratively minimized. We then test the perfor-
                         L−1
                         X     X                                    mance on a noisy simulator based upon a fully connected
         HFH = − J                   c†j,σ cj+1,σ + h.c.            8-qubit trapped-ion device [23]. As shown in Fig. 6,
                         j=1 σ=↑,↓                                  small final cost values typically require deeper circuits
                                                           (15)     to achieve, the optimum diagonalization to use depends
                         L
                         X
                    +U         nj,↑ nj,↓ .                          on the length of time one wishes to simulate. At short
                         j=1                                        times, a larger final cost function value performs better
                                                                    since this corresponds to a shorter ansatz which expe-
Here, cj,σ (c†j,σ ) denotes fermionic creation (annihila-           riences less noise. However, to simulate longer times, a
tion) operator at site j for each of the two spin states            higher quality diagonalization is required, with the addi-
9

                                                                                                                                                       D
                                                                                                                                                                                   
                             

                                                                                                                                                          &RVWCfsVFF
                                                                                                                                                                             
)LGHOLW\FH[DFW

                                                     IV9))

                                                                               &RVWCfsVFF
                                                           IV9))                                                                                                                          TXELWV
                                                                                                                                                                                           TXELWV
                                                     IV9))                                                                                                            
                                                           7URWWHU                                                                                                                                                
                                                                                                                                                                                                           ,WHUDWLRQV
                                                                                                                  ,WHUDWLRQV                                 
                                                                                                                                                             E

                             

                                                                                                                                                         ,QILGHOLW\1 F
                                                                                                                                 
                                                                                   7LPHT
                                                                                                                                                                                   
  FIG. 6. Training and Fast-Forwarding of the 8 qubit
  Hubbard Model. The inset shows the cost as it is iter-
  atively minimized using an adaptive ansatz. Various qual-                                                                                                                         
  ity diagonalizations are indicated by the colored circles. In                                                                                                                                                                                       
  the main figure, we plot the fidelity between the simulated                                                                                                                                                             7LPHVWHSVN
  state and the exact evolution as a function of time. The red,
  green and yellow lines denote fsVFF simulations using the
  corresponding quality diagonalization shown in the inset. In                                                                                           FIG. 7. Randomized Training. a) The 5 (6) qubit XY
  blue we plot the fidelity of the Iterated-Trotter simulation,                                                                                          Hamiltonian with initial state |11100i (|111000i) is diagonal-
  Fexact (T ) = hψ(T )| ρtrot |ψ(T )i with |ψ(T )i = e−iHT |ψ0 i and                                                                                     ized using the cost function Eq. (16), using only 2 train-
  ρtrot the simulated iterated Trotter state.                                                                                                            ing states per cost function evaluation to learn the evolu-
                                                                                                                                                         tion within the 9 (12) dimensional subspace. The final circuit
                                                                                                                                                         found by the learning algorithm for the diagonalizing unitary,
                                                                                                                                                         W , had 32 (134) CNOT gates. b) After completion of the
  tional noise induced by increased circuit depth resulting                                                                                              randomised training, the Hamiltonians were fast-forwarded,
  in a relatively small decrease in fidelity. As shown in                                                                                                with the infidelity evaluated in comparison to the noiseless
  Fig. 6, we find that the fast forwarding corresponding to                                                                                              Trotter-iterated state U (∆t)N |ψ0 i.
  an optimized cost of 1.1 × 10−5 maintained a fidelity of
  greater than 0.8 for T < 800. In contrast, the iterated
  Trotter method drops below 0.8 for T > 4.6 and hence                                                                                                   where V (t) = W D(t)W † is the fsVFF ansatz, U (t) is a
                                               FF
  we here achieve a fast forwarding ratio of R0.8  = 174.                                                                                                Trotter approximation for the short time unitary evolu-
                                                                                                                                                         tion, and the elements of the set R are randomly gen-
                                                                                                                                                         erated numbers from the interval [−1, 1]. The gradients
                                                    C.       Randomized Training                                                                         of the cost function are smaller when the unitary acts
                                                                                                                                                         close to the identity operation, so to maintain stronger
     While the fsVFF cost as stated in Eq. (5) has neig                                                                                                  gradients it is advantageous for the elements of R to be
  terms, this does not necessarily mean that the number                                                                                                  slightly biased towards the edges of the interval. Specifi-
  of circuits required to evaluate it also scales with neig .                                                                                            cally, in our numerics to test this approach, the absolute
  Analogous to mini-batch gradient descent methods pop-                                                                                                  magnitude of r was raised to the power of 0.75. Although
  ular for the training of classical neural networks, we can                                                                                             this approach does not require an a priori calculation of
  use only a small random selection of the total training                                                                                                neig , there is a caveat that tmax needs to be large enough
  dataset per gradient evaluation, yet over the whole opti-                                                                                              to get sufficient separation of the training states so they
  mization the total training set will be fully explored many                                                                                            are not functionally identical. This alternative training
  times over. Therefore, instead of restricting ourselves to                                                                                             setup is potentially a yet more NISQ friendly variant, as
  a discrete set of training states, which requires setting                                                                                              the unitary does not need to be decomposed into the form
  the size of the training set to be equal to or greater than                                                                                            U (∆t)neig , as required in Eq. 5, and therefore allows for
  neig , we can instead randomly select our training states                                                                                              shorter depth circuits.
  from a continuum. This has the added advantage that it                                                                                                    To demonstrate the viability of this batched training
  is then unnecessary to explicitly compute neig .                                                                                                       method we diagonalized the 5 (6) qubit XY Hamilto-
     This approach results in a modified cost function of                                                                                                nian with initial state |11100i (|111000i), which has an
  the form                                                                                                                                               neig = 9 (12). For both training curves shown in Fig. 7,
                              X                                                                                                                          only 2 training states per cost function evaluation were
             efsVFF := 1 − 1
             C                  | hψ0 | V (−rtmax )U (rtmax ) |ψ0 i |2                                                                                   used. In both cases, we trained with the unitary U (t/6)6
                          |R|                                                                                                                            where U was the second order Trotter-Suzuki operator,
                                                              r∈R
                                                                                                                                                (16)     and tmax = 1. The cost was successfully minimised to
10

10−5 in both cases, and a noiseless simulation error of        for large scale systems on current hardware. Once an evo-
less than 10−2 was maintained for over 100 time steps on       lution operator has been diagonalized in the subspace of
fast forwarding.                                               an initial state, fsVFF can be used to significantly reduce
                                                               the circuit depth of QPE and QEE, as shown in Figure 8.
                                                               In this manner, fsVFF provides a NISQ friendly means of
            VI.   ENERGY ESTIMATION                            estimating the eigenvalues within a subspace of a Hamil-
                                                               tonian.
   The diagonalization obtained from the optimization             To demonstrate the power of fsVFF to reduce the
stage of fsVFF, W (θopt )D(γopt , ∆t)W (θopt )† , implicitly   depth of QPE, we perform QPE using the diagonaliza-
contains approximations of the eigenvalues and eigenvec-       tion obtained from training on IBM’s quantum com-
tors of the Hamiltonian of the system of interest. In          puter. Specifically, we consider the input eigenvector
this section we discuss methods for extracting the energy      |E1 i := √12 (|01i + |10i). This is one of the eigenvec-
eigenstates and eigenvalues from a successful diagonal-        tors spanned by the input state of our earlier hardware
ization and implement them on quantum hardware.                implementation, |ψ0 i = |10i. We then consider evolving
   The energy eigenvectors spanned by the initial state        |E1 i under HXY for a time step of ∆t = 1/8. Since the
|ψ0 i can be determined by the following simple sampling       energy of the state |E1 i equals 1, we expect this to result
method. The first step is to apply W † to the initial state    in a phase shift of e2πi/8 being applied to |E1 i. We im-
|ψ0 i. In the limit of perfect learning and vanishing Trot-    plemented QPE and fsVFF enhanced QPE to measure
ter error, this gives                                          this phase using the circuits shown in Fig 8. We chose to
                                                               measure to 3 bits of precision and therefore the output
                                    neig                       should be the measurement 001 with probability one. As
                                    X
               W (θopt )† |ψ0 i =          ak |vk i    (17)    Figure 9 shows, it appears that the standard QPE imple-
                                    k=1                        mentation was unable to discern this phase. In contrast,
                                     n
                                   eig
                                                               when fsVFF was used to reduce the circuit depth, the
where ak = hEk |ψ0 i and {|vk i}k=1    is a set of compu-      output distribution was strongly peaked at the correct
tational basis states. The energy eigenstates spanned          state.
by |ψ0 i are then found by applying W (θopt ) to any              QEE requires only one ancillary qubit, a single im-
of the states obtained from measuring W (θopt )† |ψ0 i         plementation of e−iHt , and no Quantum Fourier Trans-
                                                   neig
in the computational basis, that is {|Ek i}k=1           =     form and therefore is less resource intensive than QPE.
                 neig
{W (θopt ) |vk i}k=1 .                                         Nonetheless, we can again, as shown in Fig. 8, use fsVFF
   Extracting the energy eigenvalues from D is more sub-       as a pre-processing step to reduce the circuit depth.
tle. Firstly, as W DW † and U , even in the limit of per-         We tested this on the 3-qubit XY Hamiltonian by first
fectly minimizing the cost CfsVFF , may disagree by a          performing fsVFF on a quantum simulator with the ini-
global phase φ, at best we can hope to learn the difference    tial state |ψ0 i = |110i. Having obtained an approximate
between, rather than absolute values, of the energy eigen-     diagonalization, we determined the eigenstates using the
values of H. For simple cases, where the diagonal ansatz       sampling method described earlier. Figure 10 shows the
D is composed of a polynomial number of terms, these           results of the measurement of W (θopt )† |ψ0 i, with four
energy value differences may be extracted directly by          strong peaks corresponding to the four eigenvectors in
rewriting D in the computational basis. For  example,  in    this subspace.
                                                γ∆tZ1
our hardware implementation D(γ) = exp −i 2             ⊗11       Figure 11 shows the results of QEE implemented on
and therefore the difference in energy between the two         ibmq_boeblingen. We use the basis states found from
eigenvalues spanned by |ψ0 i is given by γopt + ∆tkπ
                                                     . Here    the sampling method as our inputs to reduce the depth
k is an integer correcting for the arbitrary phase arising     of the circuit, and remove the need to use the time-series
from taking the log of D that can be determined using          method originally proposed for extracting the eigenval-
the method described in Ref. [22]. Using this approach,        ues, as we could calculate the eigenvalues individually
we obtain 1.9995 and 2.0019 from the training on IBM           by inputting their corresponding eigenvectors. A value
and Rigetti respectively, in good agreement with the the-      of ∆t = 1 was used so the phase calculated directly
oretically expected value of 2. For more complex cases,        matched the eigenvalue. After removing a global phase,
this simple post-processing method will be become in-          QEE had accurately found the eigenvalues of the four
tractable and an algorithmic approach will be necessary.       eigenvectors, with a mean-squared error from the true
   Quantum Phase Estimation (QPE) [57] and Quan-               values of 4.37 × 10−3 .
tum Eigenvalue Estimation (QEE) [58] are fault toler-
ant quantum algorithms for estimating the eigenvalues
of a unitary operation. However, their implementation                            VII.   DISCUSSION
on current quantum devices is limited by the reliance
on the execution of controlled unitaries from ancillary           In this work, we demonstrated that despite the mod-
qubits. These controlled unitaries require many entan-         est size and noise levels of the quantum hardware that
gling gates, and introduce too much noise to be realized       is currently available, it is possible to perform long time
11

                                                                                  j                                                                                                                                 j
                                                a) QPE             |0i                     H                          QF T †                                                                           |0i                 H                          QF T †
                                                                                                                                                                               ⇒
                                                                                                                 j
                                                               |Ek i                            U (∆t)2                                                                                            |vk i                       D(γ opt , 2j ∆t)

                                                b) QEE             |0i            H                            hσx + iσy i                                                                             |0i        H                               hσx + iσy i
                                                                                                                                                                           ⇒
                                                               |Ek i                       U (∆t)                                                                                                 |vk i                    D(γ opt , ∆t)

FIG. 8. Energy estimation circuits a)/b) show circuit diagrams depicting the enhancement of QPE/QEE using fsVFF. A
circuit depth reduction is achieved through replacing U (∆t) with D(γopt , ∆t), and removing the need to prepare an eigenstate
                                                                                                                                 j
in favour of a computational basis state, |vk i = W † |Ek i. QPE relies on implementing controlled unitaries of the form U (∆t)2
                                              j
and therefore replacing these with D(γopt , 2 ∆t) results in an exponential reduction in circuit depth.

                                                                                                                                                                                      
                                                                                                        6WDQGDUG43(
                                                                                                  IV9))HQKDQFHG43(                                                  

                                                                                                                                               3UREDELOLW\$PSOLWXGH
3UREDELOLW\$PSOLWXGH

                                                                                                                                                                                      

                                                                                                                                                                                        

                                                                                                                                                                                      

                                                                                                                                                                                        
                                                                                                                                                                               

FIG. 9. Quantum Phase Estimation. Using the 2-                                                                                                 FIG. 10. Determining the eigenstates spanned by the
qubit diagonalization found from training on ibmq_toronto,                                                                                     initial state: The 3-qubit XY Hamiltonian was diagonal-
QPE was performed on ibmq_boeblingen on the eigenvector                                                                                        ized on a quantum simulator in the subspace of initial state
                                        2πi
|E1 i := √12 (|01i + |10i). A phase of e 8 is applied, so the                                                                                  |ψ0 i = |110i to obtain θopt and γopt . Here we show the out-
measured output should be 001 with probability 1. The vari-                                                                                    put of measuring W (θopt )† |ψ0 i in the computational basis on
ation distance from the target probability distribution when                                                                                   ibmq_boeblingen. The 4 non-zero states correspond to the 4
using QPE with fsVFF was 0.578, compared to 0.917 using                                                                                        eigenvectors spanned by |ψ0 i.
standard QPE.

                                                                                                                                               initial state rather than an arbitrary initial state. By fo-
dynamical simulations with a high fidelity. Specifically,                                                                                      cusing on this less demanding task, we showed that it is
we have introduced fsVFF, a new algorithm for NISQ                                                                                             possible to substantially reduce the width and depth of
simulations, which we used to simulate a 2-qubit XY-                                                                                           the previously proposed VFF algorithm. In particular,
model spin chain on the Rigetti and IBM quantum com-                                                                                           since fsVFF only requires finding a diagonalization of a
puters. We achieved a fidelity of at least 0.9 for over                                                                                        short-time evolution unitary on the subspace spanned by
600 time steps. This is a 150-fold improvement on the                                                                                          the initial state (compared to the entire Hilbert space in
standard iterated Trotter approach, which had a fidelity                                                                                       the case of VFF), fsVFF can utilize much simpler an-
of less than 0.9 after only 4 time steps. Moreover, our                                                                                        sätze. This is demonstrated in our hardware implemen-
numerical simulations of the 4 qubit XY model and 8                                                                                            tation, where one CNOT and two parameterized single
qubit Fermi-Hubbard model achieved fast-forwarding ra-                                                                                         qubit rotations proved sufficient for an effective ansatz
                                            1
tios of 87.5 and 174 respectively, indicating the viability                                                                                    for W and one single qubit rotation was sufficient for D.
of larger implementations in the near future as hardware                                                                                          The fsVFF algorithm, similarly to VFF, is fundamen-
improves.                                                                                                                                      tally limited by the initial Trotter error approximating
   Central to the success of the fsVFF algorithm is the                                                                                        the short time evolution of the system. The Variational
fact that it is tailored to simulating a particular fixed                                                                                      Diagonalization Hamiltonian (VHD) algorithm [22] may
12

                                                                  ticular observable of interest, rather than all possible ob-
                              2                                   servables. It would be interesting to investigate whether a
                3                                                 fixed-observable fsVFF could further reduce the resources
                 4                          4                     required to implement long time high fidelity simulations.
                                                                  More broadly, an awareness of this trade off may prove
                                                                  useful beyond dynamical simulation for the ongoing chal-
                                                                  lenge of adapting quantum algorithms to the constraints
                                                                of NISQ hardware.

                                             
                                                                            VIII.    ACKNOWLEDGEMENTS
                                             
                5                          7 
                 4                          4                        JG and KG acknowledge support from the U.S. De-
                             3               
                                                                  partment of Energy (DOE) through a quantum comput-
                              2              ([DFW           ing program sponsored by the Los Alamos National Lab-
                                                                  oratory (LANL) Information Science & Technology In-
FIG. 11. Eigenvalue Estimation using QEE. Here we                 stitute. ZH, BC and PJC acknowledge support from the
show the result of implementing QEE (using fsVFF as a pre-        LANL ASC Beyond Moore’s Law project. ZH acknowl-
processing step) on ibmq_santiago to calculate the eigenval-
                                                                  edges subsequent support from the Mark Kac Fellow-
ues of the eigenvectors in the subspace spanned by |011i. The
solid yellow, red, blue and green lines represent the eigenval-   ship. We acknowledge the LANL Laboratory Directed
ues obtained for the |000i, |001i, |100i and |101i states, with   Research and Development (LDRD) program for sup-
exact corresponding energies of {-2.828, 0, 0, 2.828}, indi-      port of AS and initial support of BC under project num-
cated by the dotted lines. The eigenvalues are plotted as         ber 20190065DR as well as LC under project number
phases since for ∆t = 1 there is a one to one correspondence.     20200022DR. AA was supported by the U.S. Department
                                                                  of Energy (DOE), Office of Science, Office of High Energy
                                                                  Physics QuantISED program under Contract No. DE-
be used to remove this error. However, like VFF, VHD is           AC52-06NA25396. LC and PJC were also supported by
designed to simulate any possible initial state. There are        the U.S. DOE, Office of Science, Basic Energy Sciences,
a number of different approaches inspired by fsVFF that           Materials Sciences and Engineering Division, Condensed
could be explored for reducing the resource requirements          Matter Theory Program. This research used quantum
of the VHD algorithm by focusing on simulating a par-             computing resources provided by the LANL Institutional
ticular initial state. Such a “fixed state VHD” algorithm         Computing Program, which is supported by the U.S.
would allow for more accurate long time simulations on            Department of Energy National Nuclear Security Ad-
NISQ hardware.                                                    ministration under Contract No. 89233218CNA000001.
   More generally, our work highlights the trade off be-          This research used additional quantum computational re-
tween the universality of an algorithm and the resources          sources supported by the LANL ASC Beyond Moore’s
required to implement it. One can imagine a number of             Law program and by the Oak Ridge Leadership Comput-
alternative ways in which the universality of an algorithm        ing Facility, which is a DOE Office of Science User Facil-
can be sacrificed, without significantly reducing its util-       ity supported under Contract DE-AC05-00OR22725.
ity, in order to make it more NISQ friendly. For example,
one is often interested in studying the evolution of a par-

 [1] Frank Arute, Kunal Arya, Ryan Babbush, Dave Bacon,            [5] Guang Hao Low and Isaac L Chuang, “Hamiltonian sim-
     et al., “Quantum supremacy using a programmable su-               ulation by qubitization,” Quantum 3, 163 (2019).
     perconducting processor,” Nature 574, 505–510 (2019).         [6] Dominic W Berry, Andrew M Childs, Richard Cleve,
 [2] Frank Arute, Kunal Arya, Ryan Babbush, Dave Ba-                   Robin Kothari, and Rolando D Somma, “Simulating
     con, Joseph C Bardin, Rami Barends, Andreas Bengts-               hamiltonian dynamics with a truncated taylor series,”
     son, Sergio Boixo, Michael Broughton, Bob B Buck-                 Physical Review Letters 114, 090502 (2015).
     ley, et al., “Observation of separated dynamics of charge     [7] M. Cerezo, Andrew Arrasmith, Ryan Babbush, Si-
     and spin in the fermi-hubbard model,” arXiv preprint              mon C Benjamin, Suguru Endo, Keisuke Fujii, Jarrod R
     arXiv:2010.07965 (2020).                                          McClean, Kosuke Mitarai, Xiao Yuan, Lukasz Cincio,
 [3] Seth Lloyd, “Universal quantum simulators,” Science ,             and Patrick J. Coles, “Variational quantum algorithms,”
     1073–1078 (1996).                                                 arXiv preprint arXiv:2012.09265 (2020).
 [4] AT Sornborger and Ewan D Stewart, “Higher-order               [8] Suguru Endo, Zhenyu Cai, Simon C Benjamin, and Xiao
     methods for simulations on quantum computers,” Physi-             Yuan, “Hybrid quantum-classical algorithms and quan-
     cal Review A 60, 1956 (1999).                                     tum error mitigation,” Journal of the Physical Society of
                                                                       Japan 90, 032001 (2021).
You can also read