This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Exploring non-linear correlators on AGP

Armin Khamoshi armin.khamoshi@rice.edu Department of Physics and Astronomy, Rice University, Houston, TX 77005-1892    Guo P. Chen Department of Chemistry, Rice University, Houston, TX 77005-1892    Thomas M. Henderson Department of Chemistry, Rice University, Houston, TX 77005-1892 Department of Physics and Astronomy, Rice University, Houston, TX 77005-1892    Gustavo E. Scuseria Department of Chemistry, Rice University, Houston, TX 77005-1892 Department of Physics and Astronomy, Rice University, Houston, TX 77005-1892
Abstract

Single-reference methods such as Hartree-Fock-based coupled cluster theory are well known for their accuracy and efficiency for weakly correlated systems. For strongly correlated systems, more sophisticated methods are needed. Recent studies have revealed the potential of the antisymmetrized geminal power (AGP) as an excellent initial reference for the strong correlation problem. While these studies improved on AGP by linear correlators, we explore some non-linear exponential ansätze in this paper. We investigate two approaches in particular. Similar to Phys. Rev. B 91, 041114(R) (2015), we show that the similarity transformed Hamiltonian with a Hilbert-space Jastrow operator is summable to all orders and can be solved over AGP by projecting Schrödinger’s equation. The second approach is based on approximating the unitary pair-hopper ansatz recently proposed for application on a quantum computer. We report benchmark numerical calculations against the ground state of the pairing Hamiltonian for both of these approaches.

strongly correlated electrons, antisymmetrized geminal power, Jastrow correlator, canonincal transformation

I Introduction

Many-body methods in electronic structure theory frequently start from a mean-field reference. Commonly, a single-reference Slater determinant obtained by Hartree-Fock (HF) calculations is used for post-HF methods that make correction to this mean-field Helgaker et al. (2000). Such methods include configuration interaction (CI) and coupled cluster theory (CC). For weakly correlated systems, single-reference CC is often regarded as the gold standard and calculations are routinely performed using CC with single and double excitations (CCSD) very accurately at an affordable cost of 𝒪(M6)\mathcal{O}(M^{6}) where MM is the size of the system Scuseria et al. (1988); Bartlett and Musiał (2007); Helgaker et al. (2000). Unfortunately, the same cannot be said for the strong correlation problem. When many-body interactions are strong, more than one Slater determinant becomes important and methods based on a single HF reference often break down Bulik et al. (2015); Degroote et al. (2016); Qiu et al. (2017). Although multireference methods exist, they have certain limitations and are not always straightforward to use Roos et al. (1980); Olsen (2011); Szalay et al. (2012); Lyakh et al. (2012). As such, a new a way of thinking and more sophisticated methods are desirable.

A useful way of organizing Hilbert space in the context of strong correlation is by seniority Ring and Schuck (1980); Bytautas et al. (2011). Loosely speaking, seniority (Ω\Omega) is the number of unpaired fermions in some appropriate pairing scheme. For example, a two-body Hamiltonian can generally be written as Henderson et al. (2015)

𝑯=𝑯δΩ=0+𝑯δΩ=2+𝑯δΩ=4\displaystyle\boldsymbol{H}=\boldsymbol{H}^{\delta\Omega=0}+\boldsymbol{H}^{\delta\Omega=2}+\boldsymbol{H}^{\delta\Omega=4} (1)

where the superscripts indicate the portion of the Hamiltonian that couples those determinants whose seniority differs by 0,20,2, etc. Numerical evidence has shown that the low-seniority portions of the Hilbert space can recover a very large part of the correlation energy and have the correct qualitative behavior Bytautas et al. (2011, 2015). The seniority-zero subspace can be solved exactly by doubly occupied configuration interaction (DOCI) but it scales combinatorially with system size Veillard and Clementi (1967); Couty and Hall (1997); Kollmar and Heß (2003). Remarkably, CC restricted to paired double excitation (pCCD) Stein et al. (2014), also known as AP1roG Limacher et al. (2013), is often nearly exact in this subspace with a very low computational cost for realistic molecular Hamiltonians Johnson et al. (2013); Henderson et al. (2014a); Shepherd et al. (2016). However, this correspondence is not universal and it has been shown that pCCD and its extensions break down for some problems including the attractive pairing Hamiltonian Dukelsky et al. (2004); Henderson et al. (2014b, 2015); Degroote et al. (2016); Henderson and Scuseria (2020),

𝑯=pϵp𝐍pGpq𝐏p𝐏p,\displaystyle\boldsymbol{H}=\sum_{p}\epsilon_{p}\mathbf{N}_{p}-G\sum_{pq}\mathbf{P}^{\dagger}_{p}\mathbf{P}_{p}, (2)

where ϵp=pΔϵ\epsilon_{p}=p\Delta\epsilon are equally spaced, doubly degenerate energy levels where Δϵ\Delta\epsilon is the level spacing, GG tunes the strength of the infinite-range pairwise interactions, and

𝐏p\displaystyle\mathbf{P}^{\dagger}_{p} =cpcp¯,\displaystyle={c}^{\dagger}_{p}{c}^{\dagger}_{\bar{p}}, (3a)
𝐍p\displaystyle\mathbf{N}_{p} =cpcp+cp¯cp¯,\displaystyle={c}^{\dagger}_{p}{c}_{p}+{c}^{\dagger}_{\bar{p}}{c}_{\bar{p}}, (3b)

such that cp{c}^{\dagger}_{p} creates a fermion in spin-orbital pp, and p¯\bar{p} is its “paired” companion. Generally, in the seniority-zero subspace, all operators can be organized as a linear combination of terms like 𝐏p𝐍q𝐏r\mathbf{P}^{\dagger}_{p}\ldots\mathbf{N}_{q}\ldots\mathbf{P}_{r} Khamoshi et al. (2019), where the nilpotent operators 𝐏p\mathbf{P}^{\dagger}_{p} and 𝐏p\mathbf{P}_{p} together with 𝐍p\mathbf{N}_{p} are generators of a su(2)su(2) Lie algebra

[𝐏p,𝐏q]\displaystyle\left[\mathbf{P}_{p},\mathbf{P}^{\dagger}_{q}\right] =δpq(1𝐍p),\displaystyle=\delta_{pq}\left(1-\mathbf{N}_{p}\right), (4a)
[𝐍p,𝐏q]\displaystyle\left[\mathbf{N}_{p},\mathbf{P}_{q}^{\dagger}\right] =2δpq𝐏q.\displaystyle=2\delta_{pq}\mathbf{P}^{\dagger}_{q}. (4b)

The breakdown of CC in the pairing Hamiltonian can be understood in terms of the spontaneous breaking of number symmetry in the electronic mean-field, which occurs at some critical value of GG, Gc>0G_{c}>0. As a result, the mean-field solution gives rise to the number-broken Bardeen–Cooper–Schrieffer (BCS) wavefunction Bardeen et al. (1957) for G>GcG>G_{c} and a number-conserving Slater determinant for G<GcG<G_{c} including all G<0G<0 where the interaction is repulsive. Indeed, it is in the attractive regime where CC as well as a whole host of other conventional methods struggle Henderson and Scuseria (2020). It is noteworthy that the pairing Hamiltonian is exactly solvable Richardson and Sherman (1964); Dukelsky et al. (2004).

Recently, we proposed to use the AGP wavefunction as the initial reference for this and potentially more general problems Henderson and Scuseria (2019); Khamoshi et al. (2019); Henderson and Scuseria (2020); Harsha et al. (2020); Dutta et al. (2020); Khamoshi et al. (2020). Along similar lines of research, Johnson et al. have developed correlated methods by explicitly using the pairing Hamiltonian’s eigenvectors, Richardson-Gaudin states, as the initial reference Johnson et al. (2020a); Fecteau et al. (2020a, b); Johnson et al. (2020b). The main advantage of AGP for the seniority-zero sector is that on the one hand, it puts coefficients on each Slater determinant in the DOCI while still conserving seniority, and on the other hand, AGP is computationally straightforward. To put it differently, the natural orbitals of AGP, HF, pCCD, and DOCI are all the same if the same pairing scheme is chosen, differing only in occupation. The AGP wavefunction can then be expanded in its seniority-zero natural orbital determinants and can occupy all (MN)M\choose N such determinants, where NN is the number of fermion pairs. It thus encompasses all possible configurations in the seniority-zero space, albeit with inexact coefficients. However, unlike DOCI, AGP can be optimized at mean-field cost 𝒪(M3)\mathcal{O}(M^{3}) Scuseria et al. (2011). It also encompasses the HF state, so it has a much richer structure as an initial reference. Moreover, we have shown recently that the reduced density matrices (RDM) over AGP can be computed efficiently as all higher order RDMs are decomposable in terms of lower order ones. Khamoshi et al. (2019) These realizations permit us to envision post-AGP methods—analogous to post-HF—to capture the remaining error and provide accurate ansätze in both weak and strong correlation. A case can be made for using AGP in the seniority non-zero spaces where pairs are broken, but we concentrate on the seniority-zero case at this stage.

Recent studies in our group have developed CI-based methods over AGP and numerical calculations show excellent accuracy when applied to the pairing Hamiltonian. These methods include orthogonal excitations Henderson and Scuseria (2019, 2020) and those based on the generators of the Lie algebra Eq. (4) Dutta et al. (2020). It has been shown that these models are formally equivalent and can be viewed as “geminal replacement” methods Dutta et al. (2020). Encouraged by the accuracy of linear models, we would like to explore non-linear correlator ansätze over AGP. Non-linear models in principle can approximate the DOCI coefficient differently and it remains to be seen whether they provide an advantage over the linear case. Developing a CC theory over AGP is not trivial and remains an open problem Henderson and Scuseria (2020).

In this paper, we explore two exponential correlators over AGP which we build from number-conserving combinations of the generators of the Lie algebra, Eq. (4). The first is the exponential of Jastrow Jastrow (1955) operators on AGP (JAGP). Jastrow-Gutzwiller correlators are ubiquitous in physics and chemistry and are widely used in Monte Carlo calculations. Jastrow (1955); Haldane (1988); Foulkes et al. (2001); Casula and Sorella (2003); Casula et al. (2004); Umezawa and Tsuneyuki (2003); Tsuneyuki (2008); Henderson and Scuseria (2013); Neuscamman (2013a, 2016); Genovese et al. (2019); Nakano et al. (2019) It was shown in Ref. Wahlen-Strothman et al. (2015); Wahlen-Strothman and Scuseria (2016) that the similarity transformation of any fermionic Hamiltonian, e𝑱𝑯e𝑱\mathrm{e}^{-\boldsymbol{J}}\boldsymbol{H}\mathrm{e}^{\boldsymbol{J}}, where 𝑱\boldsymbol{J} is a Hilbert-space Jastrow operator, is analytically summable to all orders. Inspired by that, we extend the formalism to the pairing su(2)su(2) algebra Eq. (4) and show that we can efficiently solve the projected Schrödinger equation over AGP. In a way, this could be viewed as a form of generalized coupled cluster theory, but instead of the traditional particle-hole excitations, correlation is mediated by number operators that create excitations on AGP—a possibility that does not occur for single Slater determinants as they are eigenfunctions of 𝑱\boldsymbol{J}.

The second ansatz is the unitary pair-hopper that we proposed recently for applications in quantum computers Khamoshi et al. (2020). Unitary ansätze can be implemented efficiently on a quantum computer, Cao et al. (2019) but some approximations must be made in order to implement them on a classical computer. Here, we approximate the unitary transformation of the Hamiltonian using the Baker-Campbell-Hausdorff (BCH) formula by truncating it at the 4th commutator. Affordable computational scaling is made possible by the reconstruction formulae Khamoshi et al. (2019) with which we can eliminate the bottleneck of computing and storing higher rank density matrices from our calculations. As an alternative to the brute-force expansion of the BCH formula, we apply the canonical transformation theory as formulated in Ref. Yanai and Chan (2006, 2007); Neuscamman et al. (2009, 2010); Yanai et al. (2012) to recursively sum the transformed Hamiltonian. We report benchmark calculations of all methods against the ground state energy of the pairing Hamiltonian.

The organization of this paper is as follows. To familiarize the reader with our notation, in Sec. II we review some basic properties of AGP and its density matrices. In Sec. III, we present our formalism for solving similarity transformed JAGP as well as its application to the pairing Hamiltonian. Sec. IV discusses the truncated commutator expansion and canonical transformation with the unitary pair-hoppers ansatz over the pairing Hamiltonian. Lastly, some discussion and concluding remarks are provided in Sec. V.

II Background

The AGP wavefunction with NN pairs corresponds to a condensate where all 2N2N fermions are placed in the same geminal. Mathematically it can be written as Coleman (1965); Surján (1999)

|AGP=1N!(𝚪)N|vac,\displaystyle|\text{AGP}\rangle=\frac{1}{N!}\left({\mathbf{{\Gamma}}}^{\dagger}\right)^{N}|\text{vac}\rangle, (5)

where |vac|\text{vac}\rangle is the physical vacuum and 𝚪{\mathbf{{\Gamma}}}^{\dagger} is the geminal creation operator whose general expression is

𝚪=pqηpqcpcq.\displaystyle{\mathbf{{\Gamma}}}^{\dagger}=\sum_{pq}\eta_{pq}{c}^{\dagger}_{p}{c}^{\dagger}_{q}. (6)

Here, p,qp,q denote spin-orbitals and η\eta is a skew-symmetric matrix known as the geminal coefficient. To make a connection with Eq. (3) and for mathematical convenience, we perform an orbital rotation that brings the matrix of geminal coefficients into a block-diagonal form Hua (1944), that is

η=p=1M(0ηpηp0).\displaystyle\eta=\bigoplus_{p=1}^{M}\begin{pmatrix}0&\eta_{p}\\ -\eta_{p}&0\\ \end{pmatrix}. (7)

Therefore, without loss of generality, we obtain the geminal operator in the natural orbital basis of the geminal as

𝚪=p=1Mηpcpcp¯=p=1Mηp𝐏p.\displaystyle{\mathbf{{\Gamma}}}^{\dagger}=\sum_{p=1}^{M}\eta_{p}{c}^{\dagger}_{p}{c}^{\dagger}_{\bar{p}}=\sum_{p=1}^{M}\eta_{p}\mathbf{P}^{\dagger}_{p}. (8)

where we made use of the notation in Eq. (3) and MM denotes the number of orbitals. In this formalism, we can express the AGP wavefunction as

|AGP=1p1<<pNMηp1ηpN𝐏p1𝐏pN|vac.\displaystyle|\text{AGP}\rangle=\sum_{1\leq p_{1}<...<p_{N}\leq M}\eta_{p_{1}}...\eta_{p_{N}}\mathbf{P}^{\dagger}_{p_{1}}...\mathbf{P}^{\dagger}_{p_{N}}|\text{vac}\rangle. (9)

Similarly, we can write down the equation for DOCI,

|DOCI=1p1<<pNMDp1pN𝐏p1𝐏pN|vac.\displaystyle|{\text{DOCI}}\rangle=\sum_{1\leq p_{1}<...<p_{N}\leq M}D_{p_{1}...p_{N}}\mathbf{P}^{\dagger}_{p_{1}}...\mathbf{P}^{\dagger}_{p_{N}}|\text{vac}\rangle. (10)

From these expressions, it is evident that the two wavefunctions have the same structure but differ in the coefficients of each determinant; in DOCI the amplitude tensor is fully connected, whereas in AGP it is fully factorized.

Despite being a linear combination of a combinatorial number of determinants, reduced density matrices of many-body fermionic operators over AGP can be computed efficiently at polynomial cost and are very sparse in the natural orbital basis Khamoshi et al. (2019). To see this, first note that for a generic many-body operator O^\hat{O} we have

AGP|O^|AGP=AGP|O^δΩ=0|AGP.\displaystyle\langle\text{AGP}|{\hat{O}}|\text{AGP}\rangle=\langle\text{AGP}|{\hat{O}^{\delta\Omega=0}}|\text{AGP}\rangle. (11)

In other words, since AGP is a seniority-zero wavefunction, all many-body density matrices over AGP can be written in terms of AGP|𝐏p𝐍q𝐏r|AGP\langle\text{AGP}|{\mathbf{P}^{\dagger}_{p}...\mathbf{N}_{q}...\mathbf{P}_{r}...}|\text{AGP}\rangle where the overlaps associated with particle-number breaking or seniority non-conserving operators are identically zero. Therefore, we can organize all nonzero reduced density matrices in ascending rank as follows

Zp(1,1)\displaystyle Z_{p}^{(1,1)} =AGP|𝐍p|AGP,\displaystyle=\langle\text{AGP}|{\mathbf{N}_{p}}|\text{AGP}\rangle, (12a)
Zpq(0,2)\displaystyle Z_{pq}^{(0,2)} =AGP|𝐏p𝐏q|AGP,\displaystyle=\langle\text{AGP}|{\mathbf{P}^{\dagger}_{p}\mathbf{P}_{q}}|\text{AGP}\rangle, (12b)
Zpq(2,2)\displaystyle Z_{pq}^{(2,2)} =AGP|𝐍p𝐍q|AGP,\displaystyle=\langle\text{AGP}|{\mathbf{N}_{p}\mathbf{N}_{q}}|\text{AGP}\rangle, (12c)
Zpqr(1,3)\displaystyle Z_{pqr}^{(1,3)} =AGP|𝐏p𝐍q𝐏r|AGP,\displaystyle=\langle\text{AGP}|{\mathbf{P}^{\dagger}_{p}\mathbf{N}_{q}\mathbf{P}_{r}}|\text{AGP}\rangle, (12d)
Zpqr(3,3)\displaystyle Z_{pqr}^{(3,3)} =AGP|𝐍p𝐍q𝐍r|AGP,\displaystyle=\langle\text{AGP}|{\mathbf{N}_{p}\mathbf{N}_{q}\mathbf{N}_{r}}|\text{AGP}\rangle, (12e)
\displaystyle\ldots

where the first integer in the superscript indicates the number of 𝐍p\mathbf{N}_{p} operators in the middle and the second integer is the rank of the density matrix. When indices of an RDM are different, we refer to it as an irreducible density matrix. This is because 𝐍p=𝐍p2/2=2𝐏p𝐏p\mathbf{N}_{p}=\mathbf{N}_{p}^{2}/2=2\mathbf{P}^{\dagger}_{p}\mathbf{P}_{p} together with 𝐏p𝐏p=𝐏p𝐍p=0\mathbf{P}^{\dagger}_{p}\mathbf{P}^{\dagger}_{p}=\mathbf{P}^{\dagger}_{p}\mathbf{N}_{p}=0 in the seniority-zero space imply that RDMs with repeated indices are either zero or can be written in terms of lower rank density matrices.

Every matrix element of an AGP RDM, regardless of the rank, can be computed directly at 𝒪(M2)\mathcal{O}(M^{2}) cost using a method based on elementary symmetric polynomials (ESP) Khamoshi et al. (2019). Even more efficiently, one can take advantage of the reconstruction formulae to express any irreducible RDM as a linear combination of lower rank ones, which can be performed all the way down to occupation numbers and geminal coefficients Khamoshi et al. (2019). This allows us to compute any density matrix of rank 2 or higher with 𝒪(1)\mathcal{O}(1) cost per element. Finally, we can take advantage of the equivalence between AGP and the number-projected BCS wavefunction Ring and Schuck (1980); Dukelsky et al. (2016) to extract AGP density matrices via grid integrations of BCS transition density matrices Dutta et al. (2021), at 𝒪(Ngrid)\mathcal{O}(N_{\mathrm{grid}}) cost per element, where NgridN_{\mathrm{grid}} is the size of the numerical quadrature for gauge integration Scuseria et al. (2011). While less efficient than the reconstruction formulae, this final approach can be very efficient when used to contract density matrices with other factorized quantities. In the subsequent sections, we use these methods to compute all density matrices in our calculations.

III Jastrow-AGP

III.1 Analytic properties

Let the kk-body, or rank-kk, Hilbert-space Jastrow operator Neuscamman (2013b) be written as

𝑱k\displaystyle\boldsymbol{J}_{k} =12kp1<<pkαp1pk𝐍p1𝐍p2𝐍pk,\displaystyle=\frac{1}{2^{k}}\sum_{p_{1}<...<p_{k}}\alpha_{p_{1}...p_{k}}\mathbf{N}_{p_{1}}\mathbf{N}_{p_{2}}...\mathbf{N}_{p_{k}}, (13)

where the sums run over all orbitals, α\alpha is a symmetric tensor and is invariant under the exchange of its indices, and 1/2k{1}/{2^{k}} is introduced for convenience. Importantly, since 𝐍p2=2𝐍p\mathbf{N}_{p}^{2}=2\mathbf{N}_{p} in the seniority-zero space, we impose that α\alpha is zero if any two indices are the same and eliminate these terms from the sum in Eq. (13).

From the kk-body Jastrow operator we can define the kk-body Jastrow-AGP (JkAGP) wavefunction as

|JkAGP=e𝑱k|AGP|\mathrm{J_{k}AGP}\rangle=\mathrm{e}^{\boldsymbol{J}_{k}}|\mathrm{AGP}\rangle (14)

As we prove in Appendix. A, we do not need to include lower body Jastrow operators in this JkAGP wavefunction, as they are contained inside 𝑱k\boldsymbol{J}_{k}. Thus, for example, we may write

e𝑱1+𝑱2|AGP=e𝑱2|AGP.\mathrm{e}^{\boldsymbol{J}_{1}+\boldsymbol{J}_{2}}|\mathrm{AGP}\rangle=\mathrm{e}^{\boldsymbol{J}_{2}^{\prime}}|\mathrm{AGP}\rangle. (15)

where 𝑱2\boldsymbol{J}_{2}^{\prime} is another two-body Jastrow operator.

Slater determinants built from the same orbitals used to define the Hilbert-space Jastrow operators are eigenfunctions of 𝑱k\boldsymbol{J}_{k}, that is

𝑱k|Φμ\displaystyle\boldsymbol{J}_{k}|\Phi_{\mu}\rangle =Jk,μ|Φμ,\displaystyle=J_{k,\mu}|\Phi_{\mu}\rangle, (16a)
Jk,μ\displaystyle J_{k,\mu} =12kp1<<pkαp1pkNp1,μNpk,μ\displaystyle=\frac{1}{2^{k}}\,\sum_{p_{1}<\ldots<p_{k}}\alpha_{p_{1}\ldots p_{k}}N_{p_{1},\mu}\ldots N_{p_{k},\mu} (16b)
=i1<<ikαi1ik,\displaystyle=\sum_{i_{1}<\ldots<i_{k}}\alpha_{i_{1}\ldots i_{k}}, (16c)

where |Φμ|\Phi_{\mu}\rangle is a determinant, Np,μN_{p,\mu} is the occupation number of level pp in determinant μ\mu, and the indices i1iki_{1}\ldots i_{k} run over the occupied orbitals in |Φμ|\Phi_{\mu}\rangle. From this, it follows that

e𝑱kcμ|Φμ=eJk,μcμ|Φμ.\mathrm{e}^{\boldsymbol{J}_{k}}\sum c_{\mu}|\Phi_{\mu}\rangle=\sum\mathrm{e}^{J_{k,\mu}}\,c_{\mu}\,|\Phi_{\mu}\rangle. (17)

For a complex α\alpha, the magnitude of the coefficient of a given determinant is modified by the Hermitian part of 𝑱k\boldsymbol{J}_{k}, i.e. the real part of α\alpha. The anti-Hermitian part of 𝑱k\boldsymbol{J}_{k} (the imaginary part of α\alpha) only adjusts the phase of each coefficient. It is thus apparent that for an NN-body initial wavefunction |Ψ|\Psi\rangle in which all determinants |Φμ|\Phi_{\mu}\rangle have non-zero coefficients, exp(𝑱N)|Ψ\exp(\boldsymbol{J}_{N})|\Psi\rangle can be made an exact eigenstate of a chosen Hamiltonian provided that 𝑱N\boldsymbol{J}_{N} is non-Hermitian.

AGP is, in general, such an initial wavefunction, so exp(𝑱N)\exp(\boldsymbol{J}_{N}) acting on it can be made exact for the seniority zero space. For any kNk\leq N, it follows from Eq. (16) and Eq. (17) that

e𝑱k|AGP=p1<<pNexp(1j1<<jkNαpj1pjk)ηp1ηpN𝐏p1𝐏pN|vac.\mathrm{e}^{\boldsymbol{J}_{k}}|\text{AGP}\rangle=\sum_{p_{1}<...<p_{N}}\exp\left(\sum_{1\leq j_{1}<...<j_{k}\leq N}\alpha_{p_{j_{1}}...p_{j_{k}}}\right)\\ \eta_{p_{1}}...\eta_{p_{N}}\mathbf{P}^{\dagger}_{p_{1}}...\mathbf{P}^{\dagger}_{p_{N}}|\text{vac}\rangle. (18)

In contrast, we can see how the linear wavefunction |Ψ=𝑱k|AGP|{\Psi}\rangle=\boldsymbol{J}_{k}|\text{AGP}\rangle, which we refer to as JkCI AGP Dutta et al. (2020), modifies AGP as follows

|Ψ=p1<<pN(1j1<<jkNαpj1pjk)ηp1ηpN𝐏p1𝐏pN|vac.|{\Psi}\rangle=\sum_{p_{1}<...<p_{N}}\left(\sum_{1\leq j_{1}<...<j_{k}\leq N}\alpha_{p_{j_{1}}...p_{j_{k}}}\right)\eta_{p_{1}}...\eta_{p_{N}}\\ \mathbf{P}^{\dagger}_{p_{1}}...\mathbf{P}^{\dagger}_{p_{N}}|\text{vac}\rangle. (19)

Thus, the way in which the linear and non-linear 𝑱k\boldsymbol{J}_{k} correlators factorize the DOCI coefficients can be easily deduced from Eq. (18) and Eq. (19). Together with Eq. (10), we can readily see that

Dp1pN(Linear)1j1<<jkNαpj1pjkηp1ηpN,\displaystyle D_{p_{1}...p_{N}}^{\text{(Linear)}}\approx\sum_{1\leq j_{1}<...<j_{k}\leq N}\alpha_{p_{j_{1}}...p_{j_{k}}}\eta_{p_{1}}...\eta_{p_{N}}, (20a)
Dp1pN(Non-linear)1j1<<jkNeαpj1pjkηp1ηpN.\displaystyle D_{p_{1}...p_{N}}^{\text{(Non-linear)}}\approx\prod_{1\leq j_{1}<...<j_{k}\leq N}\mathrm{e}^{\alpha_{p_{j_{1}}...p_{j_{k}}}}\eta_{p_{1}}...\eta_{p_{N}}. (20b)

Indeed the two factorizations are different, and Eq. (20a) can be understood as the linear approximation to Eq. (20b) up to a constant shift of all amplitudes. An implication can be made, for example, about the sign of each coefficient; if α\alpha is real, the DOCI coefficient in Eq. (20b) can be negative only if some η\eta’s are negative, whereas the same restriction is not implied in (20a).

Having shown analytically how 𝑱k\boldsymbol{J}_{k} acts on AGP, we concern ourselves with optimizing the energy over JAGP in the subsequent sections.

III.2 Energy optimization with JAGP

Consider k=1k=1 for which the J1AGP wavefunction is just an AGP,

e𝑱1|AGP\displaystyle\mathrm{e}^{\boldsymbol{J}_{1}}|\text{AGP}\rangle =p1<<pNejαpj(kηpk𝐏pk)|vac\displaystyle=\sum_{p_{1}<\ldots<p_{N}}\mathrm{e}^{\sum_{j}\alpha_{p_{j}}}\,\left(\prod_{k}\eta_{p_{k}}\,\mathbf{P}^{\dagger}_{p_{k}}\right)|\text{vac}\rangle (21a)
=p1<<pN(kηpkeαpk𝐏pk)|vac.\displaystyle=\sum_{p_{1}<\ldots<p_{N}}\left(\prod_{k}\eta_{p_{k}}\,\mathrm{e}^{\alpha_{p_{k}}}\,\mathbf{P}^{\dagger}_{p_{k}}\right)|\text{vac}\rangle. (21b)

Evidently, the action of 𝑱1\boldsymbol{J}_{1} just changes the geminal coefficient ηp\eta_{p} to ηpeαp\eta_{p}\,\mathrm{e}^{\alpha_{p}}. For k=1k=1, J1AGP can optimize an AGP, but it cannot correlate it; we need k>1k>1 to go beyond an AGP mean-field. Unfortunately, except for k=1k=1, the norm and density matrices over e𝑱k|AGP\mathrm{e}^{\boldsymbol{J}_{k}}|\text{AGP}\rangle are not trivial to compute, and aside from quantum Monte Carlo methods, Foulkes et al. (2001); Casula et al. (2004); Neuscamman (2013a, 2016); Wei and Neuscamman (2018); Cohen et al. (2019) we are not aware of an efficient non-stochastic algorithm to compute the overlaps. This for examples makes the variational optimization of JAGP energy

Evar-JAGP=AGP|e𝑱𝑯e𝑱|AGPAGP|e𝑱e𝑱|AGP,\displaystyle E_{\text{var-JAGP}}=\frac{\langle\text{AGP}|{\mathrm{e}^{\boldsymbol{J}^{\dagger}}\boldsymbol{H}\mathrm{e}^{\boldsymbol{J}}}|\text{AGP}\rangle}{\langle\text{AGP}|{\mathrm{e}^{\boldsymbol{J}^{\dagger}}\mathrm{e}^{\boldsymbol{J}}}|\text{AGP}\rangle}, (22)

out of reach for the existing non-stochastic methods.

However, a case can be made for the similarity transformed Hamiltonian using 𝑱2\boldsymbol{J}_{2}, that is

EST-J2AGP=AGP|e𝑱2𝑯e𝑱2|AGP.\displaystyle E_{\text{ST-J${}_{2}$AGP}}={\langle\text{AGP}|{\mathrm{e}^{-\boldsymbol{J}_{2}}\boldsymbol{H}\mathrm{e}^{\boldsymbol{J}_{2}}}|\text{AGP}\rangle}. (23)

The two-body Jastrow correlator has a long history in electronic structure methods and has shown to be capable of recovering a significant portion of the correlation energy. Casula and Sorella (2003); Neuscamman (2013a); Genovese et al. (2019) Previous work in our group has shown that the similarity transformation of fermionic Hamiltonians using a two-body Jastrow operator is summable to all orders and the amplitudes can be solved by left projection via components of 𝑱2\boldsymbol{J}_{2} acting on the HF reference Wahlen-Strothman et al. (2015); Wahlen-Strothman and Scuseria (2016). Following the same line of reasoning, we extend that formalism to the pairing su(2)su(2) algebra Eq. (4) and show how to obtain the working equations for AGP in the next section.

The main idea relies on the fact that the similarity transformation of the Hamiltonian reduces the rank of 𝑱2\boldsymbol{J}_{2} by one, thus the remaining one-body Jastrow operator can be subsequently absorbed into the reference. The same idea can be applied to the unitary transformation of the Hamiltonian with an anti-Hermitian i𝑱2i\boldsymbol{J}_{2} operator

EuJ2AGP=AGP|ei𝑱2𝑯ei𝑱2|AGP,\displaystyle E_{\text{uJ${}_{2}$AGP}}={\langle\text{AGP}|{\mathrm{e}^{-i\boldsymbol{J}_{2}}\boldsymbol{H}\mathrm{e}^{i\boldsymbol{J}_{2}}}|\text{AGP}\rangle}, (24)

wherein α\alpha is assumed to be real. Based on the argument in the previous section, the Jastrow correlators with purely imaginary amplitudes only contribute to the phase of each Slater determinant in AGP and do not alter their magnitudes. This can also be seen from Eq. (18) by αiα\alpha\rightarrow i\alpha. The phase contributions could be important for Hamiltonians that contain complex terms including those that break time-reversal symmetry. For other Hamiltonians, the mere change in phase is insufficient to make the wavefunction exact.

III.3 Similarity transformation with JAGP

In this section, we extend the results of Ref. Wahlen-Strothman et al. (2015); Wahlen-Strothman and Scuseria (2016) to the AGP wavefunction and our su(2)su(2) algebra. For pedagogical reasons, we keep this section self-contained. Suppose one had a generic Jastrow operator 𝑱\boldsymbol{J}. In a manner similar to coupled cluster theory, we intend to solve for the ground state energy by similarity transforming the Hamiltonian,

𝑯¯\displaystyle\bar{\boldsymbol{H}} =e𝑱𝑯e𝑱,\displaystyle=\mathrm{e}^{-\boldsymbol{J}}\boldsymbol{H}\mathrm{e}^{\boldsymbol{J}}, (25a)
E¯\displaystyle\bar{E} =AGP|𝑯¯|AGP.\displaystyle=\langle\text{AGP}|{\bar{\boldsymbol{H}}}|\text{AGP}\rangle. (25b)

Similarity transformation does not change the spectrum of 𝑯\boldsymbol{H}, but 𝑯¯\bar{\boldsymbol{H}} is no longer Hermitian and E¯\bar{E} is not variational. Therefore we obtain the residual equations by left projecting the Schrödinger equation to get

𝐍p1𝐍pk𝑯¯E¯𝐍p1𝐍pk=0p1<<pk\langle{\mathbf{N}_{p_{1}}\ldots\mathbf{N}_{p_{k}}\,\bar{\boldsymbol{H}}}\rangle-\bar{E}\,\langle{\mathbf{N}_{p_{1}}\ldots\mathbf{N}_{p_{k}}}\rangle=0\quad\forall_{p_{1}<\ldots<p_{k}} (26)

where the expectation values are taken with respect to AGP. In traditional coupled cluster theory using particle-hole excitations, one expresses the similarity transformation by the Baker-Campbell-Hausdorff (BCH) expansion whose series naturally truncates at the 4th commutator for a two-body Hamiltonian. In contrast, the transformation using the Jastrow operator does not truncate, but it is summable to all orders for any Hamiltonian.

Our ultimate goal is to use a two- or higher-body Jastrow operators to similarity transform the Hamiltonian and provide correlation. However, it will prove useful to begin by working through the case in which we use a one-body operator, for which the algebra is much simpler. Therefore, consider the 𝑱1\boldsymbol{J}_{1} operator

𝑱1=12pαp𝐍p.\boldsymbol{J}_{1}=\frac{1}{2}\,\sum_{p}\alpha_{p}\,\mathbf{N}_{p}. (27)

Using Eq. (4), it is easy to show that

e𝑱1𝐏pe𝑱1\displaystyle\mathrm{e}^{-\boldsymbol{J}_{1}}\;\mathbf{P}^{\dagger}_{p}\;\mathrm{e}^{\boldsymbol{J}_{1}} =eαp𝐏p,\displaystyle=\mathrm{e}^{-\alpha_{p}}\mathbf{P}^{\dagger}_{p}, (28a)
e𝑱1𝐏pe𝑱1\displaystyle\mathrm{e}^{-\boldsymbol{J}_{1}}\;\mathbf{P}_{p}\;\mathrm{e}^{\boldsymbol{J}_{1}} =eαp𝐏p,\displaystyle=\mathrm{e}^{\alpha_{p}}\mathbf{P}_{p}, (28b)
e𝑱1𝐍pe𝑱1\displaystyle\mathrm{e}^{-\boldsymbol{J}_{1}}\;\mathbf{N}_{p}\;\mathrm{e}^{\boldsymbol{J}_{1}} =𝐍p.\displaystyle=\mathbf{N}_{p}. (28c)

Thus, the similarity-transformed Hamiltonian simply adjusts the coefficients of the 𝐏\mathbf{P}^{\dagger} and 𝐏\mathbf{P} operators.

Refer to caption
(a)
Refer to caption
(b)
Figure 1: Comparison of energy obtained using ST-J2AGP at 12 levels, half-filled with J2-CI and var-J2AGP. The left figure is the correlation energy as the difference with HF and the right is the total energy error.

Alternatively, we can use Eq. (21) to absorb exp(𝑱1)\exp(\boldsymbol{J}_{1}) into AGP. Although we have seen how this can be done, we will extract the same result in an different fashion. Notice by Eq. (5) that

e𝑱1|AGP=1N!(e𝑱1𝚪e𝑱1)N|vac,\displaystyle\mathrm{e}^{\boldsymbol{J}_{1}}|\text{AGP}\rangle=\frac{1}{N!}\left(\mathrm{e}^{\boldsymbol{J}_{1}}{\mathbf{{\Gamma}}}^{\dagger}\mathrm{e}^{-\boldsymbol{J}_{1}}\right)^{N}|\text{vac}\rangle, (29)

where we used the fact that exp(𝑱1)|vac=|vac\exp(\boldsymbol{J}_{1})|\text{vac}\rangle=|\text{vac}\rangle. From the algebra, Eq. (4), it follows

e𝑱1𝚪e𝑱1\displaystyle\mathrm{e}^{\boldsymbol{J}_{1}}{\mathbf{{\Gamma}}}^{\dagger}\mathrm{e}^{-\boldsymbol{J}_{1}} =pηpeαp𝐏p,\displaystyle=\sum_{p}\eta_{p}\mathrm{e}^{\alpha_{p}}\mathbf{P}^{\dagger}_{p}, (30a)
e𝑱1𝚪e𝑱1\displaystyle\mathrm{e}^{\boldsymbol{J}_{1}}{\mathbf{{\Gamma}}}\mathrm{e}^{-\boldsymbol{J}_{1}} =pηpeαp𝐏p,\displaystyle=\sum_{p}\eta_{p}\mathrm{e}^{-\alpha_{p}}\mathbf{P}_{p}, (30b)

from which it is clear that e𝑱1|AGP\mathrm{e}^{\boldsymbol{J}_{1}}|\text{AGP}\rangle leads to ηpηpeαp\eta_{p}\rightarrow\eta_{p}\mathrm{e}^{\alpha_{p}} for all geminal coefficients in the ket, and AGP|e𝑱1\langle\text{AGP}|\mathrm{e}^{-\boldsymbol{J}_{1}} results in ηpηpeαp\eta_{p}\rightarrow\eta_{p}\mathrm{e}^{-\alpha_{p}} in the bra.

We now proceed to derive the equations for similarity transformation with 𝑱2\boldsymbol{J}_{2}. Define 𝑱1(p)=qαpq𝐍p/4\boldsymbol{J}_{1}(p)=\sum_{q}\alpha_{pq}\mathbf{N}_{p}/4 and recall αpp=0\alpha_{pp}=0, then as in Eq. (28) we obtain

e𝑱2𝐏pe𝑱2\displaystyle\mathrm{e}^{-\boldsymbol{J}_{2}}\;\mathbf{P}^{\dagger}_{p}\;\mathrm{e}^{\boldsymbol{J}_{2}} =e𝑱1(p)𝐏pe𝑱1(p),\displaystyle=\mathrm{e}^{-\boldsymbol{J}_{1}(p)}\;\mathbf{P}^{\dagger}_{p}\;\mathrm{e}^{-\boldsymbol{J}_{1}(p)}, (31a)
e𝑱2𝐏pe𝑱2\displaystyle\mathrm{e}^{-\boldsymbol{J}_{2}}\;\mathbf{P}_{p}\;\mathrm{e}^{\boldsymbol{J}_{2}} =e𝑱1(p)𝐏pe𝑱1(p),\displaystyle=\mathrm{e}^{\boldsymbol{J}_{1}(p)}\;\mathbf{P}_{p}\;\mathrm{e}^{\boldsymbol{J}_{1}(p)}, (31b)
e𝑱2𝐍pe𝑱2\displaystyle\mathrm{e}^{-\boldsymbol{J}_{2}}\;\mathbf{N}_{p}\;\mathrm{e}^{\boldsymbol{J}_{2}} =𝐍p\displaystyle=\mathbf{N}_{p} (31c)

where we used [𝑱1(p),𝐏p]=0[{\boldsymbol{J}_{1}(p)},{\mathbf{P}^{\dagger}_{p}}]=0 to symmetrically distribute the exponentials in the right hand side of Eq. (31) on either side of each operator. In fact, using these equations, we can transform any many-body operator of the form 𝐏p𝐍q𝐏r\mathbf{P}^{\dagger}_{p}...\mathbf{N}_{q}...\mathbf{P}_{r}... in a symmetric way. In general, it can be shown that

e𝑱2𝐏p1𝐏pn𝐍q1𝐍qm𝐏r1𝐏rne𝑱2=ei=1n𝑱1(ri)𝑱1(pi)𝐏p1𝐏rnei=1n𝑱1(ri)𝑱1(pi).\mathrm{e}^{-\boldsymbol{J}_{2}}\;\mathbf{P}^{\dagger}_{p_{1}}...\mathbf{P}^{\dagger}_{p_{n}}\mathbf{N}_{q_{1}}...\mathbf{N}_{q_{m}}\mathbf{P}_{r_{1}}...\mathbf{P}_{r_{n}}\;\mathrm{e}^{\boldsymbol{J}_{2}}=\\ \mathrm{e}^{\sum_{i=1}^{n}\boldsymbol{J}_{1}(r_{i})-\boldsymbol{J}_{1}(p_{i})}\mathbf{P}^{\dagger}_{p_{1}}...\mathbf{P}_{r_{n}}\mathrm{e}^{\sum_{i=1}^{n}\boldsymbol{J}_{1}(r_{i})-\boldsymbol{J}_{1}(p_{i})}. (32)

To obtain a closed-form expression for the energy and the residual equations, we need to further absorb each 𝑱1(p)\boldsymbol{J}_{1}(p) into AGP. Following the same steps as in Eq. (30), we can see that

e𝑱1(q)𝑱1(p)𝚪e𝑱1(p)𝑱1(q)=r(ηre(αqrαpr)/2)𝐏r=𝚪~pq.\mathrm{e}^{\boldsymbol{J}_{1}(q)-\boldsymbol{J}_{1}(p)}{\mathbf{{\Gamma}}}^{\dagger}\mathrm{e}^{\boldsymbol{J}_{1}(p)-\boldsymbol{J}_{1}(q)}\\ =\sum_{r}\left(\eta_{r}\mathrm{e}^{(\alpha_{qr}-\alpha_{pr})/2}\right)\mathbf{P}^{\dagger}_{r}=\widetilde{\mathbf{{\mathbf{{\Gamma}}}}}^{\dagger}_{pq}. (33)

This together with Eq. (32) show that we can evaluate the expectation value of any similarity transformed operator of the form 𝐏p𝐍q𝐏r\mathbf{P}^{\dagger}_{p}...\mathbf{N}_{q}...\mathbf{P}_{r}... with 𝑱2\boldsymbol{J}_{2} by scaling the geminal coefficients appropriately.

The same basic steps follow for even higher-body Jastrow operators; the initial similarity-transformation of the Hamiltonian leads to modified matrix elements along with exponentials of Jastrow operators whose ranks are reduced by one. However, only 𝑱2\boldsymbol{J}_{2} permits easy evaluation of the remaining expectation values.

For unitary JAGP, as in Eq. (24), one follows the same steps as above but replaces αiα\alpha\rightarrow i\alpha.

III.4 Application to the pairing Hamiltonian

We now show how to apply the equations above to similarly transform the pairing Eq. (2). It follows form Eq. (32) that

e𝑱2𝑯e𝑱2=pϵp𝐍pGpq(e𝑱1(q)𝑱1(p)𝐏p𝐏qe𝑱1(q)𝑱1(p)).\mathrm{e}^{-\boldsymbol{J}_{2}}\boldsymbol{H}\mathrm{e}^{\boldsymbol{J}_{2}}=\sum_{p}\epsilon_{p}\mathbf{N}_{p}\\ -G\sum_{pq}\left(\mathrm{e}^{\boldsymbol{J}_{1}(q)-\boldsymbol{J}_{1}(p)}\mathbf{P}^{\dagger}_{p}\mathbf{P}_{q}\mathrm{e}^{\boldsymbol{J}_{1}(q)-\boldsymbol{J}_{1}(p)}\right). (34)

For every 𝐏p𝐏q\mathbf{P}^{\dagger}_{p}\mathbf{P}_{q}, define |AGP~pq=e𝑱1(q)𝑱1(p)|AGP|\widetilde{\mathrm{AGP}}_{pq}\rangle=\mathrm{e}^{\boldsymbol{J}_{1}(q)-\boldsymbol{J}_{1}(p)}|\text{AGP}\rangle, and note that |AGP~pq|\widetilde{\mathrm{AGP}}_{pq}\rangle is an AGP whose geminal coefficients have been rescaled by η~r=ηre(αqrαpr)/2\tilde{\eta}_{r}=\eta_{r}\mathrm{e}^{(\alpha_{qr}-\alpha_{pr})/2} for all orbitals rr from Eq. (33). As such, we can obtain the energy,

E¯=e𝑱2𝑯e𝑱2=pϵpAGP|𝐍p|AGPGpqAGP~pq|𝐏p𝐏q|AGP~pq.\bar{E}=\langle{\mathrm{e}^{-\boldsymbol{J}_{2}}\boldsymbol{H}\mathrm{e}^{\boldsymbol{J}_{2}}}\rangle=\sum_{p}\epsilon_{p}\langle\text{AGP}|{\mathbf{N}_{p}}|\text{AGP}\rangle\\ -G\,\sum_{pq}\langle\widetilde{\mathrm{AGP}}_{pq}|\mathbf{P}^{\dagger}_{p}\mathbf{P}_{q}|\widetilde{\mathrm{AGP}}_{pq}\rangle. (35)

Following similar steps as above, it is straightforward to get closed-form expressions for the residual equations, Eq. (26), as well. We simply use

𝐍r𝐍se𝑱2𝑯e𝑱2=pϵpAGP|𝐍p𝐍r𝐍s|AGPGpqAGP~pq|𝐍r𝐍s𝐏p𝐏q|AGP~pq\langle{\mathbf{N}_{r}\,\mathbf{N}_{s}\,\mathrm{e}^{-\boldsymbol{J}_{2}}\,\boldsymbol{H}\,\mathrm{e}^{\boldsymbol{J}_{2}}}\rangle=\sum_{p}\epsilon_{p}\,\langle{\mathrm{AGP}}|\mathbf{N}_{p}\,\mathbf{N}_{r}\,\mathbf{N}_{s}|{\mathrm{AGP}}\rangle\\ -G\,\sum_{pq}\langle\widetilde{\mathrm{AGP}}_{pq}|\mathbf{N}_{r}\,\mathbf{N}_{s}\,\mathbf{P}^{\dagger}_{p}\,\mathbf{P}_{q}|\widetilde{\mathrm{AGP}}_{pq}\rangle (36)

which can then be normal ordered in terms of 𝐏p𝐍q𝐏r\mathbf{P}^{\dagger}_{p}\ldots\mathbf{N}_{q}\ldots\mathbf{P}_{r}\ldots.

As outlined in Sec. II, it is a property of AGP-based RDMs that Zp(1,1)=AGP|𝐍p|AGPZ_{p}^{(1,1)}=\langle\text{AGP}|{\mathbf{N}_{p}}|\text{AGP}\rangle is sufficient to construct all higher-body density matrices, where the cost of building Z(1,1)Z^{(1,1)} is 𝒪(NgridM)\mathcal{O}(N_{\mathrm{grid}}M) using grid integration in the number-projected BCS representation of AGP. Of course the same principle applies to RDMs using |AGP~pq|\widetilde{\mathrm{AGP}}_{pq}\rangle, except that Z~r;pq(1,1)=AGP~pq|𝐍r|AGP~pq\widetilde{Z}^{(1,1)}_{r;pq}=\langle\widetilde{\mathrm{AGP}}_{pq}|\mathbf{N}_{r}|\widetilde{\mathrm{AGP}}_{pq}\rangle needs to be evaluated for every pqpq in the Hamiltonian, as demonstrated for Eq. (35). The cost of building Z~r;pq(1,1)\widetilde{Z}^{(1,1)}_{r;pq} for all pqrpqr is 𝒪(NgridM3)\mathcal{O}(N_{\mathrm{grid}}M^{3}) and is the most expensive step in evaluating the ST-J2AGP energy. The cost of evaluating the residuals using the reconstruction formulae is 𝒪(M4)\mathcal{O}(M^{4}). Without the reconstruction formulae, the cost would be 𝒪(M6)\mathcal{O}(M^{6}).

In Fig. 1, we show the energy error and the fraction of correlation energy captured by ST-J2AGP. All calculations were carried out using the energy optimized AGP for 𝑯\boldsymbol{H}. The correlation energy is computed as the difference with that of HF. As shown in Fig. 1, ST-J2AGP over-correlates near the recoupling regime, GGcG\approx G_{c} and smaller, including G<0G<0. In the strongly attractive correlated limit, GGcG\gg G_{c}, ST-J2AGP is highly accurate and becomes exact for sufficiently large G/GcG/G_{c}. It is noteworthy, however, that AGP is an eigenstate of the pairing Hamiltonian for G/GcG/G_{c}\rightarrow\infty. For comparison purposes, we have plotted the results for variational J2AGP where we carried out the optimization using an in-house DOCI code. We have also included results using a linear wavefunction 𝑱2|AGP\boldsymbol{J}_{2}|\mathrm{AGP}\rangle in which the energy is determined variationally; this J2CI-AGP has 𝒪(M6)\mathcal{O}(M^{6}) scaling, or 𝒪(M8)\mathcal{O}(M^{8}) without reconstruction formulae.

For larger systems, the over-correlation of ST-J2AGP near the recoupling regime worsens while the linear variational theory remains well-behaved. Thus, although ST-J2AGP has a lower cost than other AGP-based methods developed recently in our group, neither var-J2AGP nor ST-J2AGP yield as accurate energies for the pairing Hamiltonian. While correlators build from particle-hole excitations generally benefit from being put in an exponential operator, it is far from clear that the same is true of correlators built from Hilbert space Jastrow operators. Thus in the next section we discuss an exponential ansatz that we build from different generators.

IV Unitary pair-hopper Correlator

An alternative way of creating an exponential ansatz based on two-body, number preserving operators with the generators of the algebra is to use a linear combination of terms like 𝐏p𝐏q\mathbf{P}^{\dagger}_{p}\mathbf{P}_{q}. Unlike the Jastrow operators discussed in Sec. III, a unitary or similarity transformation using 𝐏p𝐏q\mathbf{P}^{\dagger}_{p}\mathbf{P}_{q} operators is not easily summable nor does it truncate naturally the way it does in CC theory. This is a direct consequence of the commutation relations Eq. (4). As a result, some form of approximation must be made. Recently, we proposed a unitary ansatz that we call unitary pair-hopper for an implementation on a quantum computer Khamoshi et al. (2020). Benchmark calculations against the ground state energy of the pairing-Hamiltonian show highly accurate results.

In this section, we seek approximation schemes to this model that can be implemented affordably on a classical computer. In particular, we implement two approaches: First, we use the BCH formula and truncate the commutator expansion at some high order. Second, we apply the canonical transformation theory as studied extensively in Ref. Yanai and Chan (2006, 2007); Neuscamman et al. (2009, 2010); Yanai et al. (2012) to this problem later in this section.

Refer to caption
(a)
Refer to caption
(b)
Figure 2: Fraction of the correlation energy and the energy error of the unitary pair-hopper ansatz, with and without truncation, in comparison with other methods in a small system of 8 levels, half-filled. In addition to AGP based-methods, the plot shows the results for the single-reference unitary coupled cluster on HF. The left figure shows the correlation energy as the difference with HF and the right is the total energy error.

IV.1 Truncated unitary pair-hopper

Recall the definition of the anti-Hermitian pair-hopper operator Khamoshi et al. (2020)

𝓣=p<qτpq(𝐏p𝐏q𝐏q𝐏p),\displaystyle\boldsymbol{\mathcal{T}}=\sum_{p<q}\tau_{pq}\left(\mathbf{P}^{\dagger}_{p}\mathbf{P}_{q}-\mathbf{P}^{\dagger}_{q}\mathbf{P}_{p}\right), (37)

where τpq\tau_{pq} is the amplitude and is antisymmetric. 𝓣\boldsymbol{\mathcal{T}} is derived from

𝐊pq𝐊pq𝐏p𝐏q𝐏q𝐏p,\displaystyle\mathbf{K}_{pq}-\mathbf{K}^{\dagger}_{pq}\propto\mathbf{P}^{\dagger}_{p}\mathbf{P}_{q}-\mathbf{P}^{\dagger}_{q}\mathbf{P}_{p}, (38)

where 𝐊pq|AGP=0\mathbf{K}_{pq}|\text{AGP}\rangle=0 is the two-body killer of AGP Henderson and Scuseria (2019). Its exact expression is

𝐊pq\displaystyle\mathbf{K}_{pq} =ηp2𝐏p𝐏q+ηq2𝐏q𝐏p+\displaystyle=\eta_{p}^{2}\mathbf{P}^{\dagger}_{p}\mathbf{P}_{q}+\eta_{q}^{2}\mathbf{P}^{\dagger}_{q}\mathbf{P}_{p}+
12ηpηq(𝐍p𝐍q𝐍p𝐍q).\displaystyle{}\quad\frac{1}{2}\eta_{p}\eta_{q}\left(\mathbf{N}_{p}\mathbf{N}_{q}-\mathbf{N}_{p}-\mathbf{N}_{q}\right). (39)

Notice that e𝓣|HF\mathrm{e}^{\boldsymbol{\mathcal{T}}}|{\text{HF}}\rangle is equivalent to paired unitary CC with generalized indices (UGCC) Nooijen (2000); Stein et al. (2014); Lee et al. (2019).

Define the unitary pair-hopper ansatz as e𝓣|AGP\mathrm{e}^{\boldsymbol{\mathcal{T}}}|\text{AGP}\rangle which we denote as uP-AGP. We want to solve for the amplitudes by minimizing the energy, that is

E(τ)\displaystyle E(\tau) =AGP|e𝓣𝑯e𝓣|AGP\displaystyle=\langle\text{AGP}|{\mathrm{e}^{-\boldsymbol{\mathcal{T}}}\boldsymbol{H}\mathrm{e}^{\boldsymbol{\mathcal{T}}}}|\text{AGP}\rangle (40a)
τ\displaystyle\tau^{*} =argminτpqE(τ).\displaystyle=\underset{\tau_{pq}\in\mathbb{R}}{\text{argmin}}\;E(\tau). (40b)

However, since the BCH expansion of the unitary transformed Hamiltonian does not truncate naturally, we choose to truncate the expansion at the highest order that gives us an affordable scaling, i.e. 𝒪(M6)\mathcal{O}(M^{6}). Again, using the reconstruction formulae, the expectation value of all two-body or higher density matrices can be accessed at 𝒪(1)\mathcal{O}(1) cost per element. This means the asymptotic scaling of the commutator expansion is dominated by the many-body contractions. For example, for a two-body Hamiltonian, the first commutator, [𝑯,𝓣]\langle{[{\boldsymbol{H}},{\boldsymbol{\mathcal{T}}}]}\rangle contains three-body contractions; the second commutator, [[𝑯,𝓣],𝓣]\langle{[{[{\boldsymbol{H}},{\boldsymbol{\mathcal{T}}}]},{\boldsymbol{\mathcal{T}}}]}\rangle contains four-body contractions, and so on. By truncating the expansion at the 4th commuter, we obtain a scaling of 𝒪(M6)\mathcal{O}(M^{6}) for the energy. In other words, we can state

E(τ)𝑯𝒪(M2)+[𝑯,𝓣]𝒪(M3)+12![[𝑯,𝓣],𝓣]𝒪(M4)+13![[[𝑯,𝓣],𝓣],𝓣]𝒪(M5)+14![[[[𝑯,𝓣],𝓣],𝓣],𝓣]𝒪(M6),E(\tau)\approx\underbrace{\langle{\boldsymbol{H}}\rangle}_{\text{$\mathcal{O}(M^{2})$}}+\underbrace{\langle{[{\boldsymbol{H}},{\boldsymbol{\mathcal{T}}}]}\rangle}_{\text{$\mathcal{O}(M^{3})$}}+\frac{1}{2!}\underbrace{\langle{[{[{\boldsymbol{H}},{\boldsymbol{\mathcal{T}}}]},{\boldsymbol{\mathcal{T}}}]}\rangle}_{\text{$\mathcal{O}(M^{4})$}}\\ +\frac{1}{3!}\underbrace{\langle{[{[{[{\boldsymbol{H}},{\boldsymbol{\mathcal{T}}}]},{\boldsymbol{\mathcal{T}}}]},{\boldsymbol{\mathcal{T}}}]}\rangle}_{\text{$\mathcal{O}(M^{5})$}}+\frac{1}{4!}\underbrace{\langle{[{[{[{[{\boldsymbol{H}},{\boldsymbol{\mathcal{T}}}]},{\boldsymbol{\mathcal{T}}}]},{\boldsymbol{\mathcal{T}}}]},{\boldsymbol{\mathcal{T}}}]}\rangle}_{\text{$\mathcal{O}(M^{6})$}}, (41)

such that addition of every term in the expansion multiplies the leading scaling by a factor of MM. Obtaining the equations above analytically is cumbersome but not prohibitive. We use our home-built algebraic manipulator software, drudge Zhao (2018), to generate this and the analytic gradient of the energy. As a result, we can compute both E(τ)E(\tau) and E(τ)\nabla E(\tau) at 𝒪(M6)\mathcal{O}(M^{6}) cost albeit with a relatively large prefactor.

We minimize the energy as in Eq. (40) using the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm Byrd et al. (1995) where we provide the gradients using our analytical expressions. We could alternatively find the solutions to E(τ)=0\nabla E(\tau)=0 using non-linear root-finding algorithms, but we find that the former is typically more robust when an accurate initial guess is provided.

Because truncating the commutator expansion yields a non-variational energy expression, we must take care with the minimization of the energy. This becomes particularly important in larger systems for which the numerical algorithms may go below the exact energy and eventually blow up. Fortunately, this failure mode is easy to identify as it is accompanied by extremely large amplitudes in τ\tau. To avoid this problem, we introduce an L2L_{2} regularization Nocedal and Wright (2006)

E~(τ)\displaystyle\widetilde{E}({\tau}) =E(τ)+λ2τ2\displaystyle=E({\tau})+\frac{\lambda}{2}\left\lVert{\tau}\right\rVert^{2} (42a)
E~(τ)\displaystyle\implies\nabla\widetilde{E}({\tau}) =E(τ)+λτ\displaystyle=\nabla{E}({\tau})+\lambda{\tau} (42b)

where λ>0\lambda>0 is the regularization parameter and is sufficiently small to allow a robust convergence. In effect, the regularization penalizes the energy optimization in proportion to the magnitude of the amplitudes, thereby preventing it from blowing up. Although this compromises the accuracy of final energy, we gain a smoother convergence. To find λ\lambda, we pick a small value e.g. 10310^{-3}, converge to an optimal solution for τ\tau, and use this solution as the initial guess of another optimization in which a smaller λ\lambda is used. The first initial guess for τ\tau can be set to zero. We repeat this process until we can no longer converge or the change in energy is negligible. The convergence threshold of the gradient is set to 10810^{-8}, and we have not observed significant dependence of the final results on the initial guess for τ\tau or λ\lambda in our calculations.

In Fig. 2 we plot the results for the pairing Hamiltonian in a small system in which we can compare the accuracy of our truncated commutator expansion with that of exact uP-AGP which we obtain from a full configuration interaction (FCI) code. As we can see in the plot, the agreement between the exact and truncated uP-AGP is excellent on the attractive side while at the same time being slightly more accurate than the linear wavefunction 𝓣|AGP\boldsymbol{\mathcal{T}}|\text{AGP}\rangle denoted as P-CI AGP. On the repulsive side, the results for the truncated uP-AGP is comparable to P-CI and becomes more accurate only at sufficiently small G/GcG/G_{c}.

Refer to caption
(a)
Refer to caption
(b)
Figure 3: Accuracy of the truncated unitary pair-hopper as compared with P-CI AGP and ST-J2AGP at 20 levels, half-filled. The left figure shows the correlation energy as the difference with HF and the right is the total energy error.

For benchmarking purposes, we also plot HF-based unitary coupled cluster (UCC) Bartlett et al. (1989); Taube and Bartlett (2006) as well as that with generalized indices, UGCC, in Fig. 2. The results are obtained from a FCI code. As expected UCC works well in the repulsive regime, but it breaks down as GG approaches GcG_{c}. On the other hand, UGCC is remarkably more accurate than UCC, but it too eventually breaks down for large G/GcG/G_{c}. In contrast, our AGP based methods are well behaved in all correlation regimes.

The results for a larger system is show in Fig. 3 where we compare the accuracy of the truncated uP-AGP with ST-J2AGP and P-CI. Although all of these methods have the same number of parameters, they clearly differ considerably in their accuracy. The observation that uP-AGP could be more accurate than P-CI in principle is a testament to its non-linear nature. Previously, it was shown that CI methods based on the generators of the algebra, namely J2-CI, P-CI, and K-CI, yield identical results Dutta et al. (2020). The other advantage of the non-linear model is that it is agnostic to the linear dependencies of the ansatz. This is true for both ST-J2AGP and the unitary pair-hopper. Although removing the linear dependencies by diagonalizing the metric can yield faster convergence, our numerical results show very little difference in the final energy.

The downside of the truncated uP-AGP is that it has a relatively large prefactor which is a result of the brute-force expansion of the BCH formula. In the next section we apply canonical transformation theory as another means for approximating the unitary transformed Hamiltonian.

IV.2 AGP-based canonical transformation

In canonical transformation (CT) theory, as formulated originally by Yanai and Chan Yanai and Chan (2006, 2007), the commutator expansion of a unitary transformed Hamiltonian,

e𝑨𝑯e𝑨=𝑯+[𝑯,𝑨]+12![[𝑯,𝑨],𝑨]+,\displaystyle\mathrm{e}^{-\boldsymbol{A}}\boldsymbol{H}\mathrm{e}^{\boldsymbol{A}}=\boldsymbol{H}+[{\boldsymbol{H}},{\boldsymbol{A}}]+\frac{1}{2!}[{[{\boldsymbol{H}},{\boldsymbol{A}}]},{\boldsymbol{A}}]+\ldots, (43)

where 𝑨\boldsymbol{A} is an anti-Hermitian operator, is recursively summed by systematically replacing all higher-body operators obtained from each commutator by an approximate, lower-body operator. For example, if 𝑨\boldsymbol{A} and 𝑯\boldsymbol{H} are each two-body operators, one approximates [𝑯,𝑨][𝑯,𝑨](1,2)[{\boldsymbol{H}},{\boldsymbol{A}}]\rightarrow[{\boldsymbol{H}},{\boldsymbol{A}}]_{(1,2)} wherein the three-body operator resulting from [𝑯,𝑨][{\boldsymbol{H}},{\boldsymbol{A}}] is approximated in terms of one- and two-body operators. This process yields an effective Hamiltonian, 𝑯¯e𝑨𝑯e𝑨\bar{\boldsymbol{H}}\approx\mathrm{e}^{-\boldsymbol{A}}\boldsymbol{H}\mathrm{e}^{\boldsymbol{A}}, that, in its simplest form, can be written as

𝑯¯=𝑯+[𝑯,𝑨](1,2)+12![[𝑯,𝑨](1,2),𝑨](1,2)+,\displaystyle\bar{\boldsymbol{H}}=\boldsymbol{H}+[{\boldsymbol{H}},{\boldsymbol{A}}]_{(1,2)}+\frac{1}{2!}\left[[\boldsymbol{H},\boldsymbol{A}\right]_{(1,2)},\boldsymbol{A}]_{(1,2)}+\ldots, (44)

where we have retained up to two-body terms in the effective Hamiltonian. Eq. (44) in particular can be summed recursively by

𝑯¯(n+1)\displaystyle\bar{\boldsymbol{H}}^{(n+1)} =1n[𝑯¯(n),𝑨](1,2),\displaystyle=\frac{1}{n}\left[\bar{\boldsymbol{H}}^{(n)},\boldsymbol{A}\right]_{(1,2)}, (45a)
𝑯¯\displaystyle\bar{\boldsymbol{H}} =n=0𝑯¯(n),\displaystyle=\sum_{n=0}\bar{\boldsymbol{H}}^{(n)}, (45b)

where 𝑯¯(0)=𝑯\bar{\boldsymbol{H}}^{(0)}=\boldsymbol{H}, and the sum is continued until the change in the parameters of the effective Hamiltonian is sufficiently small. The energy is then defined as E=Ψ|𝑯¯|ΨE=\langle{\Psi}|\bar{\boldsymbol{H}}|{\Psi}\rangle where |Ψ|{\Psi}\rangle is some mean-field reference. The minimization of the energy can be carried out by solving CC-style amplitude equations Ψ|𝑨^i𝑯¯|Ψ\langle{\Psi}|\boldsymbol{\hat{A}}^{\dagger}_{i}\bar{\boldsymbol{H}}|{\Psi}\rangle which can be rearranged to give

Ri=Ψ|[𝑯¯,𝑨i^]|Ψ=0,𝑨i^𝑨,\displaystyle R_{i}=\langle{\Psi}|[\bar{\boldsymbol{H}},\hat{\boldsymbol{A}_{i}}]|{\Psi}\rangle=0,\quad\forall{\hat{\boldsymbol{A}_{i}}\in\boldsymbol{A}}, (46)

where 𝑨i^\hat{\boldsymbol{A}_{i}} are the operator parts of 𝑨\boldsymbol{A}.

Refer to caption
(a)
Refer to caption
(b)
Figure 4: Accuracy of canonical transformation with the unitary pair-hoppers, CT-uP-AGP, and that of P-CI AGP. The left figure is the correlation energy as the difference with HF and the right is the total energy error.

In essence, the goal of CT theory is to make the effective Hamiltonian more and more diagonal by successively approximating higher order excitations. It can be viewed as a kind of renormalization group approach and is rooted in an earlier work by White White (2002) as well as the flow renormalization approach by Wegner Wegner (1994) and Glazek and Wilson Glazek and Wilson (1994). In this section, however, we very closely follow the formalism presented in Ref. Yanai and Chan (2006, 2007); Neuscamman et al. (2009, 2010); Yanai et al. (2012) by Yanai, Chan, and Neuscamman.

The chief difference between our approach and those of earlier papers is in the operator decomposition or downfolding scheme. Instead of invoking the cumulant decomposition of RDMs Kutzelnigg and Mukherjee (1999) as a means to get lower-body operators, we use AGP and our reconstruction formulae. Just as in cumulant decomposition, the reconstruction formulae are statements about the RDMs, but we can extended them to operators as an approximation.

For example, consider the three-body irreducible density matrix Zpqr(3,3)Z^{(3,3)}_{pqr}. From the reconstruction formulae Khamoshi et al. (2019), it follows that

𝐍p𝐍q𝐍r=ΛqpΛrp𝐍p+ΛpqΛrq𝐍q+ΛprΛqr𝐍r,\langle{\mathbf{N}_{p}\mathbf{N}_{q}\mathbf{N}_{r}}\rangle=\\ \Lambda_{qp}\Lambda_{rp}\langle{\mathbf{N}_{p}}\rangle+\Lambda_{pq}\Lambda_{rq}\langle{\mathbf{N}_{q}}\rangle+\Lambda_{pr}\Lambda_{qr}\langle{\mathbf{N}_{r}}\rangle, (47)

where Λqp=2ηq2/(ηq2ηp2)\Lambda_{qp}=2\eta_{q}^{2}/(\eta_{q}^{2}-\eta_{p}^{2}). For CT, we propose to approximate 𝐍p𝐍q𝐍r\mathbf{N}_{p}\mathbf{N}_{q}\mathbf{N}_{r} by removing the expectation values to get

𝐍p𝐍q𝐍rΛqpΛrp𝐍p+ΛpqΛrq𝐍q+ΛprΛqr𝐍r.\displaystyle\mathbf{N}_{p}\mathbf{N}_{q}\mathbf{N}_{r}\rightarrow\Lambda_{qp}\Lambda_{rp}\mathbf{N}_{p}+\Lambda_{pq}\Lambda_{rq}\mathbf{N}_{q}+\Lambda_{pr}\Lambda_{qr}\mathbf{N}_{r}. (48)

Note that since the reconstruction formulae apply only to irreducible RDMs, we limit our downfolding to irreducible operators, i.e. 𝐏p𝐍q𝐏r\mathbf{P}^{\dagger}_{p}\ldots\mathbf{N}_{q}\ldots\mathbf{P}_{r}\ldots where all indices are different. Indeed, all reducible operators (RDMs) can be written as a sum of irreducible ones.

To carry out the CT procedure, we take the unitary pair-hopper e𝓣\mathrm{e}^{\boldsymbol{\mathcal{T}}} ansatz to be the generator of our transformation, and we seek an effective Hamiltonian of the form

𝑯¯=php𝐍p+pqwpq𝐍p𝐍q+pqvpq𝐏p𝐏q,\displaystyle\bar{\boldsymbol{H}}=\sum_{p}h_{p}\mathbf{N}_{p}+\sum_{p\neq q}w_{pq}\mathbf{N}_{p}\mathbf{N}_{q}+\sum_{pq}v_{pq}\mathbf{P}^{\dagger}_{p}\mathbf{P}_{q}, (49)

where wpqw_{pq} is symmetric and vpqv_{pq} is Hermitian. Note that Eq. (49) is the general form of a two-body Hamiltonian in the seniority-zero space Henderson et al. (2015). In order to get a relatively more accurate approximation in CT, we would like to delay the downfolding as much possible. To this end, following Ref. Neuscamman et al. (2010), we split 𝑯¯=𝑯¯1+𝑯¯2\bar{\boldsymbol{H}}=\bar{\boldsymbol{H}}_{1}+\bar{\boldsymbol{H}}_{2} such that 𝑯¯1\bar{\boldsymbol{H}}_{1} and 𝑯¯2\bar{\boldsymbol{H}}_{2} are the one- and two-body parts of the effective Hamiltonian respectively, and the downfolding is delayed until four-body operators appear. As such, one obtains

𝑯1¯\displaystyle\bar{\boldsymbol{H}_{1}} =𝑯1+[𝑯1,𝓣]+12![[𝑯1,𝓣],𝓣](1,2)\displaystyle=\boldsymbol{H}_{1}+[\boldsymbol{H}_{1},\boldsymbol{\mathcal{T}}]+\frac{1}{2!}[[\boldsymbol{H}_{1},\boldsymbol{\mathcal{T}}],\boldsymbol{\mathcal{T}}]_{(1,2)}
+13![[[𝑯1,𝓣],𝓣],𝓣](1,2)+\displaystyle{}\quad+\frac{1}{3!}[[[\boldsymbol{H}_{1},\boldsymbol{\mathcal{T}}],\boldsymbol{\mathcal{T}}],\boldsymbol{\mathcal{T}}]_{(1,2)}+... (50a)
𝑯2¯\displaystyle\bar{\boldsymbol{H}_{2}} =𝑯2+[𝑯2,𝓣](1,2)+12![[𝑯2,𝓣],𝓣](1,2)\displaystyle=\boldsymbol{H}_{2}+[\boldsymbol{H}_{2},\boldsymbol{\mathcal{T}}]_{(1,2)}+\frac{1}{2!}[[\boldsymbol{H}_{2},\boldsymbol{\mathcal{T}}],\boldsymbol{\mathcal{T}}]_{(1,2)}
+13![[[𝑯2,𝓣],𝓣](1,2),𝓣](1,2)+\displaystyle{}\quad+\frac{1}{3!}[[[\boldsymbol{H}_{2},\boldsymbol{\mathcal{T}}],\boldsymbol{\mathcal{T}}]_{(1,2)},\boldsymbol{\mathcal{T}}]_{(1,2)}+... (50b)

Similarly, to get accurate residual equations, we split Rpq=R1,pq+R2,pqR_{pq}=R_{1,pq}+R_{2,pq} and evaluate the residuals associated with the one- and two-body parts recursively

R1,pq\displaystyle R_{1,pq} =[𝑯1,𝓣pq]+12![[𝑯1,𝓣pq],𝓣pq](1,2)+,\displaystyle=\langle{[\boldsymbol{H}_{1},\boldsymbol{\mathcal{T}}_{pq}]}\rangle+\frac{1}{2!}\langle{[[\boldsymbol{H}_{1},\boldsymbol{\mathcal{T}}_{pq}],\boldsymbol{\mathcal{T}}_{pq}]_{(1,2)}}\rangle+\ldots, (51a)
R2,pq\displaystyle R_{2,pq} =[𝑯2,𝓣pq](1,2)+12![[𝑯2,𝓣pq],𝓣pq](1,2).+\displaystyle=\langle{[\boldsymbol{H}_{2},\boldsymbol{\mathcal{T}}_{pq}]_{(1,2)}}\rangle+\frac{1}{2!}\langle{[[\boldsymbol{H}_{2},\boldsymbol{\mathcal{T}}_{pq}],\boldsymbol{\mathcal{T}}_{pq}]_{(1,2)}}\rangle.+\ldots (51b)

In both Eq. (50) and Eq. (51) we retain all two body-terms that naturally appear in the commutator expansion and absorb their coefficients into wpqw_{pq} and vpqv_{pq} in Eq. (49). All higher-body operators are downfolded directly into one-body with the intent of making the Hamiltonian more diagonal. The cost of evaluating the effective Hamiltonian and the residuals is 𝒪(M4)\mathcal{O}(M^{4}).

Having the effective Hamiltonian and the residual equations, one can solve the amplitudes using a non-linear root-finding algorithm. In so doing, we can provide an approximation to the Jacobian matrix as follows

Jpq,rs\displaystyle J_{pq,rs} =[[𝑯¯,𝓣rs],𝓣pq].\displaystyle=\langle{[[\bar{\boldsymbol{H}},\boldsymbol{\mathcal{T}}_{rs}],\boldsymbol{\mathcal{T}}_{pq}]}\rangle. (52)

The cost of building the Jacobian is also 𝒪(M4)\mathcal{O}(M^{4}) by using the reconstruction formula.

We apply the CT as described above to the pairing Hamiltonian Eq. (2). As encountered in other implementations of CT, we find that the optimization over the amplitude equations is ill-conditioned Neuscamman et al. (2009, 2010); Yanai et al. (2012). These numerical difficulties are attributed to near-zero eigenvalues of the Jacobian and are said to be similar to the intruder states in the second order perturbation theory Neuscamman et al. (2010). Several techniques have been proposed to mitigate these difficulties Neuscamman et al. (2010); Yanai et al. (2012). In our implementation, we find that shifting of the amplitude equations Yanai et al. (2012)

R~μ\displaystyle\tilde{R}_{\mu} =Rμ+λτμ\displaystyle=R_{\mu}+\lambda\tau_{\mu} (53a)
J~μν\displaystyle\implies\tilde{J}_{\mu\nu} =Jμν+λδμν,\displaystyle={J}_{\mu\nu}+\lambda\delta_{\mu\nu}, (53b)

where μ,ν\mu,\nu are flattened indices, to be the most effective. This is analogous to the regularization introduced in Eq. (42). The shifting parameter λ>0\lambda>0 can be chosen ad hoc Yanai et al. (2012). While we may follow the same steps as in the previous section to find optimal values for λ\lambda, for convenience, we picked some sufficiently small values with which the equations can converge reliably.

The results for 40 levels, 20 pairs is shown in Fig. 4. We picked λ=1\lambda=1 for G>0G>0 and a larger λ=2.5\lambda=2.5 for G<0G<0. CT with the unitary pair-hopper on AGP (CT-uP-AGP) is considerably less expensive than the truncated uP-AGP as well as the linear P-CI AGP. However, the approximation made to higher order excitations compromises the accuracy of CT-uP-AGP. Systematic improvements can be made to CT-uP-AGP by delaying the downfolding even further, but we leave these improvements for future work.

V Conclusions

We explore several non-linear exponential ansätze based on AGP in this paper. First, we study Hilbert-space Jastrow correlators on AGP. We show analytically that while a non-Hermitian JAGP exponential ansatz can in principle be made exact, practical computational limitations with non-stochastic methods compel us to make certain approximations. To this end, we extend the formalism of Ref. Wahlen-Strothman et al. (2015) to AGP and the su(2)su(2) algebra, and show how to similarity transform any given Hamiltonian in the seniority-zero space and solve for the amplitudes in a CC-style manner for AGP. Benchmark calculations of this ST-J2AGP on the ground state of the pairing Hamiltonian show significant improvement on AGP but the resulting energies are less accurate compared to their CI counter-part, J2-CI AGP. The variational ansatz, var-J2AGP, is generally more accurate than the similarity transformed one, but it is out of reach for non-stochastic methods to the best of our knowledge. The cost of ST-J2AGP is 𝒪(M4)\mathcal{O}(M^{4}).

In the second half of the paper, we sought an approximation to the unitary pair-hopper ansatz. Owing to its non-linear nature, exact variational calculations show uP-AGP can be considerably more accurate than the linear P-CI AGP. However, due to its non truncating commutator expansion, some approximation must be made to implement it efficiently on a classical computer. For an approximation to uP-AGP, we expand e𝓣𝑯e𝓣\mathrm{e}^{-\boldsymbol{\mathcal{T}}}\boldsymbol{H}\mathrm{e}^{\boldsymbol{\mathcal{T}}} using the well-known BCH formula up to the highest affordable computational scaling 𝒪(M6)\mathcal{O}(M^{6}). Numerical results indicate improvement over P-CI AGP on the attractive regime of the pairing Hamiltonian.

The downside of truncated uP-AGP is that it has a relatively large prefactor in its asymptotic scaling. To resolve this, we implement a slight variation of canonical transformation theory Neuscamman et al. (2010) wherein we carry out the operator decomposition from the reconstruction formulae Khamoshi et al. (2019) instead of the commonly used cumulant decomposition Kutzelnigg and Mukherjee (1999). Our CT-uP-AGP is significantly less expensive than either truncated uP-AGP or P-CI AGP, but the approximation made to the higher order excitations noticeably compromises its accuracy.

Considering the pros and cons of each method, it is far from clear whether exponential ansätze for post-AGP methods are necessarily more advantageous than linear correlators. While there are non-linear ansätze that are more accurate than their linear counterparts, practical considerations for implementing them could limit their scope. So far, the linear ansätze have shown remarkable accuracy, they are straightforward to implement and seem rather resilient to system size, at least in the pairing Hamiltonian.

In electronic structure theory, exponential ansätze typically enjoy more favorable size-extensivity properties. However, we intentionally avoided discussing size-extensivity in this paper since the interactions in the pairing Hamiltonian are infinite-range. This makes the energy a non-linear function of system size and complicates the discussion of size-extensivity. Nevertheless, it must be noted that Neuscamman has shown Neuscamman (2012) variational JAGP is fully size-consistent. Also, earlier work in our group has shown a significant but incomplete restoration of size-consistency using K-CI and linearized-CC correlators on AGP Henderson and Scuseria (2020) with the latter exhibiting more favorable size-consistency. A more thorough analysis regarding size-extensivity of post-AGP methods will be reported in due time.

VI Acknowledgments

This work was supported by the U.S. National Science Foundation under Grant No. CHE-1762320. G.E.S. is a Welch Foundation Chair (Grant No. c-0036). A.K. is thankful to Yiheng Qiu for useful discussions regarding Jastrow AGP and to Gaurav Harsha for assistance with the drudge software. We thank Rishab Dutta for letting us use his JCI-AGP code.

VII Data Availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Appendix A Proof for JkCI Manifold Containing Lower-Rank JCI Manifolds over AGP

The JkCI-AGP manifold contains the Jk-1CI-AGP and thus all lower-rank JCI-AGP manifolds. This has been observed numerically in Ref. Dutta et al. (2020). Here we provide a proof for the k=2k=2 case, which can be readily generalized to higher ranks. Along the same lines, JkAGP does not need to include lower-rank Jastrow operators, which is shown as a corollary.

For an AGP with NN pairs of electrons

𝑱1|AGP\displaystyle\,\boldsymbol{J}_{1}|{\mathrm{AGP}}\rangle
=\displaystyle= 12Np𝐍p𝑱1|AGP\displaystyle\,\frac{1}{2N}\sum_{p}\mathbf{N}_{p}\boldsymbol{J}_{1}|{\mathrm{AGP}}\rangle
=\displaystyle= 14Npqαq𝐍p𝐍q|AGP\displaystyle\,\frac{1}{4N}\sum_{pq}\alpha_{q}\mathbf{N}_{p}\mathbf{N}_{q}|{\mathrm{AGP}}\rangle
=\displaystyle= 14Np<q(αp+αq)𝐍p𝐍q|AGP+14Npαp𝐍p2|AGP\displaystyle\,\frac{1}{4N}\sum_{p<q}\left(\alpha_{p}+\alpha_{q}\right)\mathbf{N}_{p}\mathbf{N}_{q}|{\mathrm{AGP}}\rangle+\frac{1}{4N}\sum_{p}\alpha_{p}\mathbf{N}_{p}^{2}|{\mathrm{AGP}}\rangle
=\displaystyle= 14Np<q(αp+αq)𝐍p𝐍q|AGP+12Npαp𝐍p|AGP.\displaystyle\,\frac{1}{4N}\sum_{p<q}\left(\alpha_{p}+\alpha_{q}\right)\mathbf{N}_{p}\mathbf{N}_{q}|{\mathrm{AGP}}\rangle+\frac{1}{2N}\sum_{p}\alpha_{p}\mathbf{N}_{p}|{\mathrm{AGP}}\rangle.

It follows that

𝑱1|AGP=14p<qαp+αqN1𝐍p𝐍q|AGP.\boldsymbol{J}_{1}|{\mathrm{AGP}}\rangle=\frac{1}{4}\sum_{p<q}\frac{\alpha_{p}+\alpha_{q}}{N-1}\mathbf{N}_{p}\mathbf{N}_{q}|{\mathrm{AGP}}\rangle. (54)

Namely, 𝑱1|AGP\boldsymbol{J}_{1}|{\mathrm{AGP}}\rangle can be expressed as a linear combination of states in J2CI.

From Eq. (54), we can show Eq. (15) by defining

αpq=αp+αqN1+αpq\alpha^{\prime}_{pq}=\frac{\alpha_{p}+\alpha_{q}}{N-1}+\alpha_{pq} (55)

and

𝑱2=14p<qαpq𝐍p𝐍q.\boldsymbol{J}_{2}^{\prime}=\frac{1}{4}\sum_{p<q}\alpha^{\prime}_{pq}\mathbf{N}_{p}\mathbf{N}_{q}. (56)

Generalization to higher-rank cases is straightforward.

References