Exploring non-linear correlators on AGP

Armin Khamoshi armin.khamoshi@rice.edu Department of Physics and Astronomy, Rice University, Houston, TX 77005-1892 Guo P. Chen Department of Chemistry, Rice University, Houston, TX 77005-1892 Thomas M. Henderson Department of Chemistry, Rice University, Houston, TX 77005-1892 Department of Physics and Astronomy, Rice University, Houston, TX 77005-1892 Gustavo E. Scuseria Department of Chemistry, Rice University, Houston, TX 77005-1892 Department of Physics and Astronomy, Rice University, Houston, TX 77005-1892

Abstract

Single-reference methods such as Hartree-Fock-based coupled cluster theory are well known for their accuracy and efficiency for weakly correlated systems. For strongly correlated systems, more sophisticated methods are needed. Recent studies have revealed the potential of the antisymmetrized geminal power (AGP) as an excellent initial reference for the strong correlation problem. While these studies improved on AGP by linear correlators, we explore some non-linear exponential ansätze in this paper. We investigate two approaches in particular. Similar to Phys. Rev. B 91, 041114(R) (2015), we show that the similarity transformed Hamiltonian with a Hilbert-space Jastrow operator is summable to all orders and can be solved over AGP by projecting Schrödinger’s equation. The second approach is based on approximating the unitary pair-hopper ansatz recently proposed for application on a quantum computer. We report benchmark numerical calculations against the ground state of the pairing Hamiltonian for both of these approaches.

strongly correlated electrons, antisymmetrized geminal power, Jastrow correlator, canonincal transformation

I Introduction

Many-body methods in electronic structure theory frequently start from a mean-field reference. Commonly, a single-reference Slater determinant obtained by Hartree-Fock (HF) calculations is used for post-HF methods that make correction to this mean-field Helgaker et al. (2000). Such methods include configuration interaction (CI) and coupled cluster theory (CC). For weakly correlated systems, single-reference CC is often regarded as the gold standard and calculations are routinely performed using CC with single and double excitations (CCSD) very accurately at an affordable cost of $\mathcal{O}(M^{6})$ where $M$ is the size of the system Scuseria et al. (1988); Bartlett and Musiał (2007); Helgaker et al. (2000). Unfortunately, the same cannot be said for the strong correlation problem. When many-body interactions are strong, more than one Slater determinant becomes important and methods based on a single HF reference often break down Bulik et al. (2015); Degroote et al. (2016); Qiu et al. (2017). Although multireference methods exist, they have certain limitations and are not always straightforward to use Roos et al. (1980); Olsen (2011); Szalay et al. (2012); Lyakh et al. (2012). As such, a new a way of thinking and more sophisticated methods are desirable.

A useful way of organizing Hilbert space in the context of strong correlation is by seniority Ring and Schuck (1980); Bytautas et al. (2011). Loosely speaking, seniority ( $\Omega$ ) is the number of unpaired fermions in some appropriate pairing scheme. For example, a two-body Hamiltonian can generally be written as Henderson et al. (2015)

\displaystyle\boldsymbol{H}=\boldsymbol{H}^{\delta\Omega=0}+\boldsymbol{H}^{\delta\Omega=2}+\boldsymbol{H}^{\delta\Omega=4}

(1)

where the superscripts indicate the portion of the Hamiltonian that couples those determinants whose seniority differs by $0,2$ , etc. Numerical evidence has shown that the low-seniority portions of the Hilbert space can recover a very large part of the correlation energy and have the correct qualitative behavior Bytautas et al. (2011, 2015). The seniority-zero subspace can be solved exactly by doubly occupied configuration interaction (DOCI) but it scales combinatorially with system size Veillard and Clementi (1967); Couty and Hall (1997); Kollmar and Heß (2003). Remarkably, CC restricted to paired double excitation (pCCD) Stein et al. (2014), also known as AP1roG Limacher et al. (2013), is often nearly exact in this subspace with a very low computational cost for realistic molecular Hamiltonians Johnson et al. (2013); Henderson et al. (2014a); Shepherd et al. (2016). However, this correspondence is not universal and it has been shown that pCCD and its extensions break down for some problems including the attractive pairing Hamiltonian Dukelsky et al. (2004); Henderson et al. (2014b, 2015); Degroote et al. (2016); Henderson and Scuseria (2020),

\displaystyle\boldsymbol{H}=\sum_{p}\epsilon_{p}\mathbf{N}_{p}-G\sum_{pq}\mathbf{P}^{\dagger}_{p}\mathbf{P}_{p},

(2)

where $\epsilon_{p}=p\Delta\epsilon$ are equally spaced, doubly degenerate energy levels where $\Delta\epsilon$ is the level spacing, $G$ tunes the strength of the infinite-range pairwise interactions, and


$\displaystyle\mathbf{P}^{\dagger}_{p}$	$\displaystyle={c}^{\dagger}_{p}{c}^{\dagger}_{\bar{p}},$	(3a)
$\displaystyle\mathbf{N}_{p}$	$\displaystyle={c}^{\dagger}_{p}{c}_{p}+{c}^{\dagger}_{\bar{p}}{c}_{\bar{p}},$	(3b)

such that ${c}^{\dagger}_{p}$ creates a fermion in spin-orbital $p$ , and $\bar{p}$ is its “paired” companion. Generally, in the seniority-zero subspace, all operators can be organized as a linear combination of terms like $\mathbf{P}^{\dagger}_{p}\ldots\mathbf{N}_{q}\ldots\mathbf{P}_{r}$ Khamoshi et al. (2019), where the nilpotent operators $\mathbf{P}^{\dagger}_{p}$ and $\mathbf{P}_{p}$ together with $\mathbf{N}_{p}$ are generators of a $su(2)$ Lie algebra


$\displaystyle\left[\mathbf{P}_{p},\mathbf{P}^{\dagger}_{q}\right]$	$\displaystyle=\delta_{pq}\left(1-\mathbf{N}_{p}\right),$	(4a)
$\displaystyle\left[\mathbf{N}_{p},\mathbf{P}_{q}^{\dagger}\right]$	$\displaystyle=2\delta_{pq}\mathbf{P}^{\dagger}_{q}.$	(4b)

The breakdown of CC in the pairing Hamiltonian can be understood in terms of the spontaneous breaking of number symmetry in the electronic mean-field, which occurs at some critical value of $G$ , $G_{c}>0$ . As a result, the mean-field solution gives rise to the number-broken Bardeen–Cooper–Schrieffer (BCS) wavefunction Bardeen et al. (1957) for $G>G_{c}$ and a number-conserving Slater determinant for $G<G_{c}$ including all $G<0$ where the interaction is repulsive. Indeed, it is in the attractive regime where CC as well as a whole host of other conventional methods struggle Henderson and Scuseria (2020). It is noteworthy that the pairing Hamiltonian is exactly solvable Richardson and Sherman (1964); Dukelsky et al. (2004).

Recently, we proposed to use the AGP wavefunction as the initial reference for this and potentially more general problems Henderson and Scuseria (2019); Khamoshi et al. (2019); Henderson and Scuseria (2020); Harsha et al. (2020); Dutta et al. (2020); Khamoshi et al. (2020). Along similar lines of research, Johnson et al. have developed correlated methods by explicitly using the pairing Hamiltonian’s eigenvectors, Richardson-Gaudin states, as the initial reference Johnson et al. (2020a); Fecteau et al. (2020a, b); Johnson et al. (2020b). The main advantage of AGP for the seniority-zero sector is that on the one hand, it puts coefficients on each Slater determinant in the DOCI while still conserving seniority, and on the other hand, AGP is computationally straightforward. To put it differently, the natural orbitals of AGP, HF, pCCD, and DOCI are all the same if the same pairing scheme is chosen, differing only in occupation. The AGP wavefunction can then be expanded in its seniority-zero natural orbital determinants and can occupy all $M\choose N$ such determinants, where $N$ is the number of fermion pairs. It thus encompasses all possible configurations in the seniority-zero space, albeit with inexact coefficients. However, unlike DOCI, AGP can be optimized at mean-field cost $\mathcal{O}(M^{3})$ Scuseria et al. (2011). It also encompasses the HF state, so it has a much richer structure as an initial reference. Moreover, we have shown recently that the reduced density matrices (RDM) over AGP can be computed efficiently as all higher order RDMs are decomposable in terms of lower order ones. Khamoshi et al. (2019) These realizations permit us to envision post-AGP methods—analogous to post-HF—to capture the remaining error and provide accurate ansätze in both weak and strong correlation. A case can be made for using AGP in the seniority non-zero spaces where pairs are broken, but we concentrate on the seniority-zero case at this stage.

Recent studies in our group have developed CI-based methods over AGP and numerical calculations show excellent accuracy when applied to the pairing Hamiltonian. These methods include orthogonal excitations Henderson and Scuseria (2019, 2020) and those based on the generators of the Lie algebra Eq. (4) Dutta et al. (2020). It has been shown that these models are formally equivalent and can be viewed as “geminal replacement” methods Dutta et al. (2020). Encouraged by the accuracy of linear models, we would like to explore non-linear correlator ansätze over AGP. Non-linear models in principle can approximate the DOCI coefficient differently and it remains to be seen whether they provide an advantage over the linear case. Developing a CC theory over AGP is not trivial and remains an open problem Henderson and Scuseria (2020).

In this paper, we explore two exponential correlators over AGP which we build from number-conserving combinations of the generators of the Lie algebra, Eq. (4). The first is the exponential of Jastrow Jastrow (1955) operators on AGP (JAGP). Jastrow-Gutzwiller correlators are ubiquitous in physics and chemistry and are widely used in Monte Carlo calculations. Jastrow (1955); Haldane (1988); Foulkes et al. (2001); Casula and Sorella (2003); Casula et al. (2004); Umezawa and Tsuneyuki (2003); Tsuneyuki (2008); Henderson and Scuseria (2013); Neuscamman (2013a, 2016); Genovese et al. (2019); Nakano et al. (2019) It was shown in Ref. Wahlen-Strothman et al. (2015); Wahlen-Strothman and Scuseria (2016) that the similarity transformation of any fermionic Hamiltonian, $\mathrm{e}^{-\boldsymbol{J}}\boldsymbol{H}\mathrm{e}^{\boldsymbol{J}}$ , where $\boldsymbol{J}$ is a Hilbert-space Jastrow operator, is analytically summable to all orders. Inspired by that, we extend the formalism to the pairing $su(2)$ algebra Eq. (4) and show that we can efficiently solve the projected Schrödinger equation over AGP. In a way, this could be viewed as a form of generalized coupled cluster theory, but instead of the traditional particle-hole excitations, correlation is mediated by number operators that create excitations on AGP—a possibility that does not occur for single Slater determinants as they are eigenfunctions of $\boldsymbol{J}$ .

The second ansatz is the unitary pair-hopper that we proposed recently for applications in quantum computers Khamoshi et al. (2020). Unitary ansätze can be implemented efficiently on a quantum computer, Cao et al. (2019) but some approximations must be made in order to implement them on a classical computer. Here, we approximate the unitary transformation of the Hamiltonian using the Baker-Campbell-Hausdorff (BCH) formula by truncating it at the 4th commutator. Affordable computational scaling is made possible by the reconstruction formulae Khamoshi et al. (2019) with which we can eliminate the bottleneck of computing and storing higher rank density matrices from our calculations. As an alternative to the brute-force expansion of the BCH formula, we apply the canonical transformation theory as formulated in Ref. Yanai and Chan (2006, 2007); Neuscamman et al. (2009, 2010); Yanai et al. (2012) to recursively sum the transformed Hamiltonian. We report benchmark calculations of all methods against the ground state energy of the pairing Hamiltonian.

The organization of this paper is as follows. To familiarize the reader with our notation, in Sec. II we review some basic properties of AGP and its density matrices. In Sec. III, we present our formalism for solving similarity transformed JAGP as well as its application to the pairing Hamiltonian. Sec. IV discusses the truncated commutator expansion and canonical transformation with the unitary pair-hoppers ansatz over the pairing Hamiltonian. Lastly, some discussion and concluding remarks are provided in Sec. V.

II Background

The AGP wavefunction with $N$ pairs corresponds to a condensate where all $2N$ fermions are placed in the same geminal. Mathematically it can be written as Coleman (1965); Surján (1999)

\displaystyle|\text{AGP}\rangle=\frac{1}{N!}\left({\mathbf{{\Gamma}}}^{\dagger}\right)^{N}|\text{vac}\rangle,

(5)

where $|\text{vac}\rangle$ is the physical vacuum and ${\mathbf{{\Gamma}}}^{\dagger}$ is the geminal creation operator whose general expression is

\displaystyle{\mathbf{{\Gamma}}}^{\dagger}=\sum_{pq}\eta_{pq}{c}^{\dagger}_{p}{c}^{\dagger}_{q}.

(6)

Here, $p,q$ denote spin-orbitals and $\eta$ is a skew-symmetric matrix known as the geminal coefficient. To make a connection with Eq. (3) and for mathematical convenience, we perform an orbital rotation that brings the matrix of geminal coefficients into a block-diagonal form Hua (1944), that is

\displaystyle\eta=\bigoplus_{p=1}^{M}\begin{pmatrix}0&\eta_{p}\\ -\eta_{p}&0\\ \end{pmatrix}.

(7)

Therefore, without loss of generality, we obtain the geminal operator in the natural orbital basis of the geminal as

\displaystyle{\mathbf{{\Gamma}}}^{\dagger}=\sum_{p=1}^{M}\eta_{p}{c}^{\dagger}_{p}{c}^{\dagger}_{\bar{p}}=\sum_{p=1}^{M}\eta_{p}\mathbf{P}^{\dagger}_{p}.

(8)

where we made use of the notation in Eq. (3) and $M$ denotes the number of orbitals. In this formalism, we can express the AGP wavefunction as

\displaystyle|\text{AGP}\rangle=\sum_{1\leq p_{1}<...<p_{N}\leq M}\eta_{p_{1}}...\eta_{p_{N}}\mathbf{P}^{\dagger}_{p_{1}}...\mathbf{P}^{\dagger}_{p_{N}}|\text{vac}\rangle.

(9)

Similarly, we can write down the equation for DOCI,

\displaystyle|{\text{DOCI}}\rangle=\sum_{1\leq p_{1}<...<p_{N}\leq M}D_{p_{1}...p_{N}}\mathbf{P}^{\dagger}_{p_{1}}...\mathbf{P}^{\dagger}_{p_{N}}|\text{vac}\rangle.

(10)

From these expressions, it is evident that the two wavefunctions have the same structure but differ in the coefficients of each determinant; in DOCI the amplitude tensor is fully connected, whereas in AGP it is fully factorized.

Despite being a linear combination of a combinatorial number of determinants, reduced density matrices of many-body fermionic operators over AGP can be computed efficiently at polynomial cost and are very sparse in the natural orbital basis Khamoshi et al. (2019). To see this, first note that for a generic many-body operator $\hat{O}$ we have

\displaystyle\langle\text{AGP}|{\hat{O}}|\text{AGP}\rangle=\langle\text{AGP}|{\hat{O}^{\delta\Omega=0}}|\text{AGP}\rangle.

(11)

In other words, since AGP is a seniority-zero wavefunction, all many-body density matrices over AGP can be written in terms of $\langle\text{AGP}|{\mathbf{P}^{\dagger}_{p}...\mathbf{N}_{q}...\mathbf{P}_{r}...}|\text{AGP}\rangle$ where the overlaps associated with particle-number breaking or seniority non-conserving operators are identically zero. Therefore, we can organize all nonzero reduced density matrices in ascending rank as follows


$\displaystyle Z_{p}^{(1,1)}$	$\displaystyle=\langle\text{AGP}\|{\mathbf{N}_{p}}\|\text{AGP}\rangle,$	(12a)
$\displaystyle Z_{pq}^{(0,2)}$	$\displaystyle=\langle\text{AGP}\|{\mathbf{P}^{\dagger}_{p}\mathbf{P}_{q}}\|\text{AGP}\rangle,$	(12b)
$\displaystyle Z_{pq}^{(2,2)}$	$\displaystyle=\langle\text{AGP}\|{\mathbf{N}_{p}\mathbf{N}_{q}}\|\text{AGP}\rangle,$	(12c)
$\displaystyle Z_{pqr}^{(1,3)}$	$\displaystyle=\langle\text{AGP}\|{\mathbf{P}^{\dagger}_{p}\mathbf{N}_{q}\mathbf{P}_{r}}\|\text{AGP}\rangle,$	(12d)
$\displaystyle Z_{pqr}^{(3,3)}$	$\displaystyle=\langle\text{AGP}\|{\mathbf{N}_{p}\mathbf{N}_{q}\mathbf{N}_{r}}\|\text{AGP}\rangle,$	(12e)
$\displaystyle\ldots$

where the first integer in the superscript indicates the number of $\mathbf{N}_{p}$ operators in the middle and the second integer is the rank of the density matrix. When indices of an RDM are different, we refer to it as an irreducible density matrix. This is because $\mathbf{N}_{p}=\mathbf{N}_{p}^{2}/2=2\mathbf{P}^{\dagger}_{p}\mathbf{P}_{p}$ together with $\mathbf{P}^{\dagger}_{p}\mathbf{P}^{\dagger}_{p}=\mathbf{P}^{\dagger}_{p}\mathbf{N}_{p}=0$ in the seniority-zero space imply that RDMs with repeated indices are either zero or can be written in terms of lower rank density matrices.

Every matrix element of an AGP RDM, regardless of the rank, can be computed directly at $\mathcal{O}(M^{2})$ cost using a method based on elementary symmetric polynomials (ESP) Khamoshi et al. (2019). Even more efficiently, one can take advantage of the reconstruction formulae to express any irreducible RDM as a linear combination of lower rank ones, which can be performed all the way down to occupation numbers and geminal coefficients Khamoshi et al. (2019). This allows us to compute any density matrix of rank 2 or higher with $\mathcal{O}(1)$ cost per element. Finally, we can take advantage of the equivalence between AGP and the number-projected BCS wavefunction Ring and Schuck (1980); Dukelsky et al. (2016) to extract AGP density matrices via grid integrations of BCS transition density matrices Dutta et al. (2021), at $\mathcal{O}(N_{\mathrm{grid}})$ cost per element, where $N_{\mathrm{grid}}$ is the size of the numerical quadrature for gauge integration Scuseria et al. (2011). While less efficient than the reconstruction formulae, this final approach can be very efficient when used to contract density matrices with other factorized quantities. In the subsequent sections, we use these methods to compute all density matrices in our calculations.

III Jastrow-AGP

III.1 Analytic properties

Let the $k$ -body, or rank- $k$ , Hilbert-space Jastrow operator Neuscamman (2013b) be written as

\displaystyle\boldsymbol{J}_{k}

\displaystyle=\frac{1}{2^{k}}\sum_{p_{1}<...<p_{k}}\alpha_{p_{1}...p_{k}}\mathbf{N}_{p_{1}}\mathbf{N}_{p_{2}}...\mathbf{N}_{p_{k}},

(13)

where the sums run over all orbitals, $\alpha$ is a symmetric tensor and is invariant under the exchange of its indices, and ${1}/{2^{k}}$ is introduced for convenience. Importantly, since $\mathbf{N}_{p}^{2}=2\mathbf{N}_{p}$ in the seniority-zero space, we impose that $\alpha$ is zero if any two indices are the same and eliminate these terms from the sum in Eq. (13).

From the $k$ -body Jastrow operator we can define the $k$ -body Jastrow-AGP (J_kAGP) wavefunction as

|\mathrm{J_{k}AGP}\rangle=\mathrm{e}^{\boldsymbol{J}_{k}}|\mathrm{AGP}\rangle

(14)

As we prove in Appendix. A, we do not need to include lower body Jastrow operators in this J_kAGP wavefunction, as they are contained inside $\boldsymbol{J}_{k}$ . Thus, for example, we may write

\mathrm{e}^{\boldsymbol{J}_{1}+\boldsymbol{J}_{2}}|\mathrm{AGP}\rangle=\mathrm{e}^{\boldsymbol{J}_{2}^{\prime}}|\mathrm{AGP}\rangle.

(15)

where $\boldsymbol{J}_{2}^{\prime}$ is another two-body Jastrow operator.

Slater determinants built from the same orbitals used to define the Hilbert-space Jastrow operators are eigenfunctions of $\boldsymbol{J}_{k}$ , that is


$\displaystyle\boldsymbol{J}_{k}\|\Phi_{\mu}\rangle$	$\displaystyle=J_{k,\mu}\|\Phi_{\mu}\rangle,$	(16a)
$\displaystyle J_{k,\mu}$	$\displaystyle=\frac{1}{2^{k}}\,\sum_{p_{1}<\ldots<p_{k}}\alpha_{p_{1}\ldots p_{k}}N_{p_{1},\mu}\ldots N_{p_{k},\mu}$	(16b)
	$\displaystyle=\sum_{i_{1}<\ldots<i_{k}}\alpha_{i_{1}\ldots i_{k}},$	(16c)

where $|\Phi_{\mu}\rangle$ is a determinant, $N_{p,\mu}$ is the occupation number of level $p$ in determinant $\mu$ , and the indices $i_{1}\ldots i_{k}$ run over the occupied orbitals in $|\Phi_{\mu}\rangle$ . From this, it follows that

\mathrm{e}^{\boldsymbol{J}_{k}}\sum c_{\mu}|\Phi_{\mu}\rangle=\sum\mathrm{e}^{J_{k,\mu}}\,c_{\mu}\,|\Phi_{\mu}\rangle.

(17)

For a complex $\alpha$ , the magnitude of the coefficient of a given determinant is modified by the Hermitian part of $\boldsymbol{J}_{k}$ , i.e. the real part of $\alpha$ . The anti-Hermitian part of $\boldsymbol{J}_{k}$ (the imaginary part of $\alpha$ ) only adjusts the phase of each coefficient. It is thus apparent that for an $N$ -body initial wavefunction $|\Psi\rangle$ in which all determinants $|\Phi_{\mu}\rangle$ have non-zero coefficients, $\exp(\boldsymbol{J}_{N})|\Psi\rangle$ can be made an exact eigenstate of a chosen Hamiltonian provided that $\boldsymbol{J}_{N}$ is non-Hermitian.

AGP is, in general, such an initial wavefunction, so $\exp(\boldsymbol{J}_{N})$ acting on it can be made exact for the seniority zero space. For any $k\leq N$ , it follows from Eq. (16) and Eq. (17) that

\mathrm{e}^{\boldsymbol{J}_{k}}|\text{AGP}\rangle=\sum_{p_{1}<...<p_{N}}\exp\left(\sum_{1\leq j_{1}<...<j_{k}\leq N}\alpha_{p_{j_{1}}...p_{j_{k}}}\right)\\ \eta_{p_{1}}...\eta_{p_{N}}\mathbf{P}^{\dagger}_{p_{1}}...\mathbf{P}^{\dagger}_{p_{N}}|\text{vac}\rangle.

(18)

In contrast, we can see how the linear wavefunction $|{\Psi}\rangle=\boldsymbol{J}_{k}|\text{AGP}\rangle$ , which we refer to as J_kCI AGP Dutta et al. (2020), modifies AGP as follows

|{\Psi}\rangle=\sum_{p_{1}<...<p_{N}}\left(\sum_{1\leq j_{1}<...<j_{k}\leq N}\alpha_{p_{j_{1}}...p_{j_{k}}}\right)\eta_{p_{1}}...\eta_{p_{N}}\\ \mathbf{P}^{\dagger}_{p_{1}}...\mathbf{P}^{\dagger}_{p_{N}}|\text{vac}\rangle.

(19)

Thus, the way in which the linear and non-linear $\boldsymbol{J}_{k}$ correlators factorize the DOCI coefficients can be easily deduced from Eq. (18) and Eq. (19). Together with Eq. (10), we can readily see that


	$\displaystyle D_{p_{1}...p_{N}}^{\text{(Linear)}}\approx\sum_{1\leq j_{1}<...<j_{k}\leq N}\alpha_{p_{j_{1}}...p_{j_{k}}}\eta_{p_{1}}...\eta_{p_{N}},$		(20a)
	$\displaystyle D_{p_{1}...p_{N}}^{\text{(Non-linear)}}\approx\prod_{1\leq j_{1}<...<j_{k}\leq N}\mathrm{e}^{\alpha_{p_{j_{1}}...p_{j_{k}}}}\eta_{p_{1}}...\eta_{p_{N}}.$		(20b)

Indeed the two factorizations are different, and Eq. (20a) can be understood as the linear approximation to Eq. (20b) up to a constant shift of all amplitudes. An implication can be made, for example, about the sign of each coefficient; if $\alpha$ is real, the DOCI coefficient in Eq. (20b) can be negative only if some $\eta$ ’s are negative, whereas the same restriction is not implied in (20a).

Having shown analytically how $\boldsymbol{J}_{k}$ acts on AGP, we concern ourselves with optimizing the energy over JAGP in the subsequent sections.

III.2 Energy optimization with JAGP

Consider $k=1$ for which the J₁AGP wavefunction is just an AGP,


$\displaystyle\mathrm{e}^{\boldsymbol{J}_{1}}\|\text{AGP}\rangle$	$\displaystyle=\sum_{p_{1}<\ldots<p_{N}}\mathrm{e}^{\sum_{j}\alpha_{p_{j}}}\,\left(\prod_{k}\eta_{p_{k}}\,\mathbf{P}^{\dagger}_{p_{k}}\right)\|\text{vac}\rangle$	(21a)
	$\displaystyle=\sum_{p_{1}<\ldots<p_{N}}\left(\prod_{k}\eta_{p_{k}}\,\mathrm{e}^{\alpha_{p_{k}}}\,\mathbf{P}^{\dagger}_{p_{k}}\right)\|\text{vac}\rangle.$	(21b)

Evidently, the action of $\boldsymbol{J}_{1}$ just changes the geminal coefficient $\eta_{p}$ to $\eta_{p}\,\mathrm{e}^{\alpha_{p}}$ . For $k=1$ , J₁AGP can optimize an AGP, but it cannot correlate it; we need $k>1$ to go beyond an AGP mean-field. Unfortunately, except for $k=1$ , the norm and density matrices over $\mathrm{e}^{\boldsymbol{J}_{k}}|\text{AGP}\rangle$ are not trivial to compute, and aside from quantum Monte Carlo methods, Foulkes et al. (2001); Casula et al. (2004); Neuscamman (2013a, 2016); Wei and Neuscamman (2018); Cohen et al. (2019) we are not aware of an efficient non-stochastic algorithm to compute the overlaps. This for examples makes the variational optimization of JAGP energy

\displaystyle E_{\text{var-JAGP}}=\frac{\langle\text{AGP}|{\mathrm{e}^{\boldsymbol{J}^{\dagger}}\boldsymbol{H}\mathrm{e}^{\boldsymbol{J}}}|\text{AGP}\rangle}{\langle\text{AGP}|{\mathrm{e}^{\boldsymbol{J}^{\dagger}}\mathrm{e}^{\boldsymbol{J}}}|\text{AGP}\rangle},

(22)

out of reach for the existing non-stochastic methods.

However, a case can be made for the similarity transformed Hamiltonian using $\boldsymbol{J}_{2}$ , that is

\displaystyle E_{\text{ST-J${}_{2}$AGP}}={\langle\text{AGP}|{\mathrm{e}^{-\boldsymbol{J}_{2}}\boldsymbol{H}\mathrm{e}^{\boldsymbol{J}_{2}}}|\text{AGP}\rangle}.

(23)

The two-body Jastrow correlator has a long history in electronic structure methods and has shown to be capable of recovering a significant portion of the correlation energy. Casula and Sorella (2003); Neuscamman (2013a); Genovese et al. (2019) Previous work in our group has shown that the similarity transformation of fermionic Hamiltonians using a two-body Jastrow operator is summable to all orders and the amplitudes can be solved by left projection via components of $\boldsymbol{J}_{2}$ acting on the HF reference Wahlen-Strothman et al. (2015); Wahlen-Strothman and Scuseria (2016). Following the same line of reasoning, we extend that formalism to the pairing $su(2)$ algebra Eq. (4) and show how to obtain the working equations for AGP in the next section.

The main idea relies on the fact that the similarity transformation of the Hamiltonian reduces the rank of $\boldsymbol{J}_{2}$ by one, thus the remaining one-body Jastrow operator can be subsequently absorbed into the reference. The same idea can be applied to the unitary transformation of the Hamiltonian with an anti-Hermitian $i\boldsymbol{J}_{2}$ operator

\displaystyle E_{\text{uJ${}_{2}$AGP}}={\langle\text{AGP}|{\mathrm{e}^{-i\boldsymbol{J}_{2}}\boldsymbol{H}\mathrm{e}^{i\boldsymbol{J}_{2}}}|\text{AGP}\rangle},

(24)

wherein $\alpha$ is assumed to be real. Based on the argument in the previous section, the Jastrow correlators with purely imaginary amplitudes only contribute to the phase of each Slater determinant in AGP and do not alter their magnitudes. This can also be seen from Eq. (18) by $\alpha\rightarrow i\alpha$ . The phase contributions could be important for Hamiltonians that contain complex terms including those that break time-reversal symmetry. For other Hamiltonians, the mere change in phase is insufficient to make the wavefunction exact.

III.3 Similarity transformation with JAGP

In this section, we extend the results of Ref. Wahlen-Strothman et al. (2015); Wahlen-Strothman and Scuseria (2016) to the AGP wavefunction and our $su(2)$ algebra. For pedagogical reasons, we keep this section self-contained. Suppose one had a generic Jastrow operator $\boldsymbol{J}$ . In a manner similar to coupled cluster theory, we intend to solve for the ground state energy by similarity transforming the Hamiltonian,


$\displaystyle\bar{\boldsymbol{H}}$	$\displaystyle=\mathrm{e}^{-\boldsymbol{J}}\boldsymbol{H}\mathrm{e}^{\boldsymbol{J}},$	(25a)
$\displaystyle\bar{E}$	$\displaystyle=\langle\text{AGP}\|{\bar{\boldsymbol{H}}}\|\text{AGP}\rangle.$	(25b)

Similarity transformation does not change the spectrum of $\boldsymbol{H}$ , but $\bar{\boldsymbol{H}}$ is no longer Hermitian and $\bar{E}$ is not variational. Therefore we obtain the residual equations by left projecting the Schrödinger equation to get

\langle{\mathbf{N}_{p_{1}}\ldots\mathbf{N}_{p_{k}}\,\bar{\boldsymbol{H}}}\rangle-\bar{E}\,\langle{\mathbf{N}_{p_{1}}\ldots\mathbf{N}_{p_{k}}}\rangle=0\quad\forall_{p_{1}<\ldots<p_{k}}

(26)

where the expectation values are taken with respect to AGP. In traditional coupled cluster theory using particle-hole excitations, one expresses the similarity transformation by the Baker-Campbell-Hausdorff (BCH) expansion whose series naturally truncates at the 4th commutator for a two-body Hamiltonian. In contrast, the transformation using the Jastrow operator does not truncate, but it is summable to all orders for any Hamiltonian.

Our ultimate goal is to use a two- or higher-body Jastrow operators to similarity transform the Hamiltonian and provide correlation. However, it will prove useful to begin by working through the case in which we use a one-body operator, for which the algebra is much simpler. Therefore, consider the $\boldsymbol{J}_{1}$ operator

\boldsymbol{J}_{1}=\frac{1}{2}\,\sum_{p}\alpha_{p}\,\mathbf{N}_{p}.

(27)

Using Eq. (4), it is easy to show that


$\displaystyle\mathrm{e}^{-\boldsymbol{J}_{1}}\;\mathbf{P}^{\dagger}_{p}\;\mathrm{e}^{\boldsymbol{J}_{1}}$	$\displaystyle=\mathrm{e}^{-\alpha_{p}}\mathbf{P}^{\dagger}_{p},$	(28a)
$\displaystyle\mathrm{e}^{-\boldsymbol{J}_{1}}\;\mathbf{P}_{p}\;\mathrm{e}^{\boldsymbol{J}_{1}}$	$\displaystyle=\mathrm{e}^{\alpha_{p}}\mathbf{P}_{p},$	(28b)
$\displaystyle\mathrm{e}^{-\boldsymbol{J}_{1}}\;\mathbf{N}_{p}\;\mathrm{e}^{\boldsymbol{J}_{1}}$	$\displaystyle=\mathbf{N}_{p}.$	(28c)

Thus, the similarity-transformed Hamiltonian simply adjusts the coefficients of the $\mathbf{P}^{\dagger}$ and $\mathbf{P}$ operators.

Alternatively, we can use Eq. (21) to absorb $\exp(\boldsymbol{J}_{1})$ into AGP. Although we have seen how this can be done, we will extract the same result in an different fashion. Notice by Eq. (5) that

\displaystyle\mathrm{e}^{\boldsymbol{J}_{1}}|\text{AGP}\rangle=\frac{1}{N!}\left(\mathrm{e}^{\boldsymbol{J}_{1}}{\mathbf{{\Gamma}}}^{\dagger}\mathrm{e}^{-\boldsymbol{J}_{1}}\right)^{N}|\text{vac}\rangle,

(29)

where we used the fact that $\exp(\boldsymbol{J}_{1})|\text{vac}\rangle=|\text{vac}\rangle$ . From the algebra, Eq. (4), it follows


$\displaystyle\mathrm{e}^{\boldsymbol{J}_{1}}{\mathbf{{\Gamma}}}^{\dagger}\mathrm{e}^{-\boldsymbol{J}_{1}}$	$\displaystyle=\sum_{p}\eta_{p}\mathrm{e}^{\alpha_{p}}\mathbf{P}^{\dagger}_{p},$	(30a)
$\displaystyle\mathrm{e}^{\boldsymbol{J}_{1}}{\mathbf{{\Gamma}}}\mathrm{e}^{-\boldsymbol{J}_{1}}$	$\displaystyle=\sum_{p}\eta_{p}\mathrm{e}^{-\alpha_{p}}\mathbf{P}_{p},$	(30b)

from which it is clear that $\mathrm{e}^{\boldsymbol{J}_{1}}|\text{AGP}\rangle$ leads to $\eta_{p}\rightarrow\eta_{p}\mathrm{e}^{\alpha_{p}}$ for all geminal coefficients in the ket, and $\langle\text{AGP}|\mathrm{e}^{-\boldsymbol{J}_{1}}$ results in $\eta_{p}\rightarrow\eta_{p}\mathrm{e}^{-\alpha_{p}}$ in the bra.

We now proceed to derive the equations for similarity transformation with $\boldsymbol{J}_{2}$ . Define $\boldsymbol{J}_{1}(p)=\sum_{q}\alpha_{pq}\mathbf{N}_{p}/4$ and recall $\alpha_{pp}=0$ , then as in Eq. (28) we obtain


$\displaystyle\mathrm{e}^{-\boldsymbol{J}_{2}}\;\mathbf{P}^{\dagger}_{p}\;\mathrm{e}^{\boldsymbol{J}_{2}}$	$\displaystyle=\mathrm{e}^{-\boldsymbol{J}_{1}(p)}\;\mathbf{P}^{\dagger}_{p}\;\mathrm{e}^{-\boldsymbol{J}_{1}(p)},$	(31a)
$\displaystyle\mathrm{e}^{-\boldsymbol{J}_{2}}\;\mathbf{P}_{p}\;\mathrm{e}^{\boldsymbol{J}_{2}}$	$\displaystyle=\mathrm{e}^{\boldsymbol{J}_{1}(p)}\;\mathbf{P}_{p}\;\mathrm{e}^{\boldsymbol{J}_{1}(p)},$	(31b)
$\displaystyle\mathrm{e}^{-\boldsymbol{J}_{2}}\;\mathbf{N}_{p}\;\mathrm{e}^{\boldsymbol{J}_{2}}$	$\displaystyle=\mathbf{N}_{p}$	(31c)

where we used $[{\boldsymbol{J}_{1}(p)},{\mathbf{P}^{\dagger}_{p}}]=0$ to symmetrically distribute the exponentials in the right hand side of Eq. (31) on either side of each operator. In fact, using these equations, we can transform any many-body operator of the form $\mathbf{P}^{\dagger}_{p}...\mathbf{N}_{q}...\mathbf{P}_{r}...$ in a symmetric way. In general, it can be shown that

\mathrm{e}^{-\boldsymbol{J}_{2}}\;\mathbf{P}^{\dagger}_{p_{1}}...\mathbf{P}^{\dagger}_{p_{n}}\mathbf{N}_{q_{1}}...\mathbf{N}_{q_{m}}\mathbf{P}_{r_{1}}...\mathbf{P}_{r_{n}}\;\mathrm{e}^{\boldsymbol{J}_{2}}=\\ \mathrm{e}^{\sum_{i=1}^{n}\boldsymbol{J}_{1}(r_{i})-\boldsymbol{J}_{1}(p_{i})}\mathbf{P}^{\dagger}_{p_{1}}...\mathbf{P}_{r_{n}}\mathrm{e}^{\sum_{i=1}^{n}\boldsymbol{J}_{1}(r_{i})-\boldsymbol{J}_{1}(p_{i})}.

(32)

To obtain a closed-form expression for the energy and the residual equations, we need to further absorb each $\boldsymbol{J}_{1}(p)$ into AGP. Following the same steps as in Eq. (30), we can see that

\mathrm{e}^{\boldsymbol{J}_{1}(q)-\boldsymbol{J}_{1}(p)}{\mathbf{{\Gamma}}}^{\dagger}\mathrm{e}^{\boldsymbol{J}_{1}(p)-\boldsymbol{J}_{1}(q)}\\ =\sum_{r}\left(\eta_{r}\mathrm{e}^{(\alpha_{qr}-\alpha_{pr})/2}\right)\mathbf{P}^{\dagger}_{r}=\widetilde{\mathbf{{\mathbf{{\Gamma}}}}}^{\dagger}_{pq}.

(33)

This together with Eq. (32) show that we can evaluate the expectation value of any similarity transformed operator of the form $\mathbf{P}^{\dagger}_{p}...\mathbf{N}_{q}...\mathbf{P}_{r}...$ with $\boldsymbol{J}_{2}$ by scaling the geminal coefficients appropriately.

The same basic steps follow for even higher-body Jastrow operators; the initial similarity-transformation of the Hamiltonian leads to modified matrix elements along with exponentials of Jastrow operators whose ranks are reduced by one. However, only $\boldsymbol{J}_{2}$ permits easy evaluation of the remaining expectation values.

For unitary JAGP, as in Eq. (24), one follows the same steps as above but replaces $\alpha\rightarrow i\alpha$ .

III.4 Application to the pairing Hamiltonian

We now show how to apply the equations above to similarly transform the pairing Eq. (2). It follows form Eq. (32) that

\mathrm{e}^{-\boldsymbol{J}_{2}}\boldsymbol{H}\mathrm{e}^{\boldsymbol{J}_{2}}=\sum_{p}\epsilon_{p}\mathbf{N}_{p}\\ -G\sum_{pq}\left(\mathrm{e}^{\boldsymbol{J}_{1}(q)-\boldsymbol{J}_{1}(p)}\mathbf{P}^{\dagger}_{p}\mathbf{P}_{q}\mathrm{e}^{\boldsymbol{J}_{1}(q)-\boldsymbol{J}_{1}(p)}\right).

(34)

For every $\mathbf{P}^{\dagger}_{p}\mathbf{P}_{q}$ , define $|\widetilde{\mathrm{AGP}}_{pq}\rangle=\mathrm{e}^{\boldsymbol{J}_{1}(q)-\boldsymbol{J}_{1}(p)}|\text{AGP}\rangle$ , and note that $|\widetilde{\mathrm{AGP}}_{pq}\rangle$ is an AGP whose geminal coefficients have been rescaled by $\tilde{\eta}_{r}=\eta_{r}\mathrm{e}^{(\alpha_{qr}-\alpha_{pr})/2}$ for all orbitals $r$ from Eq. (33). As such, we can obtain the energy,

\bar{E}=\langle{\mathrm{e}^{-\boldsymbol{J}_{2}}\boldsymbol{H}\mathrm{e}^{\boldsymbol{J}_{2}}}\rangle=\sum_{p}\epsilon_{p}\langle\text{AGP}|{\mathbf{N}_{p}}|\text{AGP}\rangle\\ -G\,\sum_{pq}\langle\widetilde{\mathrm{AGP}}_{pq}|\mathbf{P}^{\dagger}_{p}\mathbf{P}_{q}|\widetilde{\mathrm{AGP}}_{pq}\rangle.

(35)

Following similar steps as above, it is straightforward to get closed-form expressions for the residual equations, Eq. (26), as well. We simply use

\langle{\mathbf{N}_{r}\,\mathbf{N}_{s}\,\mathrm{e}^{-\boldsymbol{J}_{2}}\,\boldsymbol{H}\,\mathrm{e}^{\boldsymbol{J}_{2}}}\rangle=\sum_{p}\epsilon_{p}\,\langle{\mathrm{AGP}}|\mathbf{N}_{p}\,\mathbf{N}_{r}\,\mathbf{N}_{s}|{\mathrm{AGP}}\rangle\\ -G\,\sum_{pq}\langle\widetilde{\mathrm{AGP}}_{pq}|\mathbf{N}_{r}\,\mathbf{N}_{s}\,\mathbf{P}^{\dagger}_{p}\,\mathbf{P}_{q}|\widetilde{\mathrm{AGP}}_{pq}\rangle

(36)

which can then be normal ordered in terms of $\mathbf{P}^{\dagger}_{p}\ldots\mathbf{N}_{q}\ldots\mathbf{P}_{r}\ldots$ .

As outlined in Sec. II, it is a property of AGP-based RDMs that $Z_{p}^{(1,1)}=\langle\text{AGP}|{\mathbf{N}_{p}}|\text{AGP}\rangle$ is sufficient to construct all higher-body density matrices, where the cost of building $Z^{(1,1)}$ is $\mathcal{O}(N_{\mathrm{grid}}M)$ using grid integration in the number-projected BCS representation of AGP. Of course the same principle applies to RDMs using $|\widetilde{\mathrm{AGP}}_{pq}\rangle$ , except that $\widetilde{Z}^{(1,1)}_{r;pq}=\langle\widetilde{\mathrm{AGP}}_{pq}|\mathbf{N}_{r}|\widetilde{\mathrm{AGP}}_{pq}\rangle$ needs to be evaluated for every $pq$ in the Hamiltonian, as demonstrated for Eq. (35). The cost of building $\widetilde{Z}^{(1,1)}_{r;pq}$ for all $pqr$ is $\mathcal{O}(N_{\mathrm{grid}}M^{3})$ and is the most expensive step in evaluating the ST-J₂AGP energy. The cost of evaluating the residuals using the reconstruction formulae is $\mathcal{O}(M^{4})$ . Without the reconstruction formulae, the cost would be $\mathcal{O}(M^{6})$ .

In Fig. 1, we show the energy error and the fraction of correlation energy captured by ST-J₂AGP. All calculations were carried out using the energy optimized AGP for $\boldsymbol{H}$ . The correlation energy is computed as the difference with that of HF. As shown in Fig. 1, ST-J₂AGP over-correlates near the recoupling regime, $G\approx G_{c}$ and smaller, including $G<0$ . In the strongly attractive correlated limit, $G\gg G_{c}$ , ST-J₂AGP is highly accurate and becomes exact for sufficiently large $G/G_{c}$ . It is noteworthy, however, that AGP is an eigenstate of the pairing Hamiltonian for $G/G_{c}\rightarrow\infty$ . For comparison purposes, we have plotted the results for variational J₂AGP where we carried out the optimization using an in-house DOCI code. We have also included results using a linear wavefunction $\boldsymbol{J}_{2}|\mathrm{AGP}\rangle$ in which the energy is determined variationally; this J₂CI-AGP has $\mathcal{O}(M^{6})$ scaling, or $\mathcal{O}(M^{8})$ without reconstruction formulae.

For larger systems, the over-correlation of ST-J₂AGP near the recoupling regime worsens while the linear variational theory remains well-behaved. Thus, although ST-J₂AGP has a lower cost than other AGP-based methods developed recently in our group, neither var-J₂AGP nor ST-J₂AGP yield as accurate energies for the pairing Hamiltonian. While correlators build from particle-hole excitations generally benefit from being put in an exponential operator, it is far from clear that the same is true of correlators built from Hilbert space Jastrow operators. Thus in the next section we discuss an exponential ansatz that we build from different generators.

IV Unitary pair-hopper Correlator

An alternative way of creating an exponential ansatz based on two-body, number preserving operators with the generators of the algebra is to use a linear combination of terms like $\mathbf{P}^{\dagger}_{p}\mathbf{P}_{q}$ . Unlike the Jastrow operators discussed in Sec. III, a unitary or similarity transformation using $\mathbf{P}^{\dagger}_{p}\mathbf{P}_{q}$ operators is not easily summable nor does it truncate naturally the way it does in CC theory. This is a direct consequence of the commutation relations Eq. (4). As a result, some form of approximation must be made. Recently, we proposed a unitary ansatz that we call unitary pair-hopper for an implementation on a quantum computer Khamoshi et al. (2020). Benchmark calculations against the ground state energy of the pairing-Hamiltonian show highly accurate results.

In this section, we seek approximation schemes to this model that can be implemented affordably on a classical computer. In particular, we implement two approaches: First, we use the BCH formula and truncate the commutator expansion at some high order. Second, we apply the canonical transformation theory as studied extensively in Ref. Yanai and Chan (2006, 2007); Neuscamman et al. (2009, 2010); Yanai et al. (2012) to this problem later in this section.

IV.1 Truncated unitary pair-hopper

Recall the definition of the anti-Hermitian pair-hopper operator Khamoshi et al. (2020)

\displaystyle\boldsymbol{\mathcal{T}}=\sum_{p<q}\tau_{pq}\left(\mathbf{P}^{\dagger}_{p}\mathbf{P}_{q}-\mathbf{P}^{\dagger}_{q}\mathbf{P}_{p}\right),

(37)

where $\tau_{pq}$ is the amplitude and is antisymmetric. $\boldsymbol{\mathcal{T}}$ is derived from

\displaystyle\mathbf{K}_{pq}-\mathbf{K}^{\dagger}_{pq}\propto\mathbf{P}^{\dagger}_{p}\mathbf{P}_{q}-\mathbf{P}^{\dagger}_{q}\mathbf{P}_{p},

(38)

where $\mathbf{K}_{pq}|\text{AGP}\rangle=0$ is the two-body killer of AGP Henderson and Scuseria (2019). Its exact expression is

	$\displaystyle\mathbf{K}_{pq}$	$\displaystyle=\eta_{p}^{2}\mathbf{P}^{\dagger}_{p}\mathbf{P}_{q}+\eta_{q}^{2}\mathbf{P}^{\dagger}_{q}\mathbf{P}_{p}+$
		$\displaystyle{}\quad\frac{1}{2}\eta_{p}\eta_{q}\left(\mathbf{N}_{p}\mathbf{N}_{q}-\mathbf{N}_{p}-\mathbf{N}_{q}\right).$		(39)

Notice that $\mathrm{e}^{\boldsymbol{\mathcal{T}}}|{\text{HF}}\rangle$ is equivalent to paired unitary CC with generalized indices (UGCC) Nooijen (2000); Stein et al. (2014); Lee et al. (2019).

Define the unitary pair-hopper ansatz as $\mathrm{e}^{\boldsymbol{\mathcal{T}}}|\text{AGP}\rangle$ which we denote as uP-AGP. We want to solve for the amplitudes by minimizing the energy, that is


$\displaystyle E(\tau)$	$\displaystyle=\langle\text{AGP}\|{\mathrm{e}^{-\boldsymbol{\mathcal{T}}}\boldsymbol{H}\mathrm{e}^{\boldsymbol{\mathcal{T}}}}\|\text{AGP}\rangle$	(40a)
$\displaystyle\tau^{*}$	$\displaystyle=\underset{\tau_{pq}\in\mathbb{R}}{\text{argmin}}\;E(\tau).$	(40b)

However, since the BCH expansion of the unitary transformed Hamiltonian does not truncate naturally, we choose to truncate the expansion at the highest order that gives us an affordable scaling, i.e. $\mathcal{O}(M^{6})$ . Again, using the reconstruction formulae, the expectation value of all two-body or higher density matrices can be accessed at $\mathcal{O}(1)$ cost per element. This means the asymptotic scaling of the commutator expansion is dominated by the many-body contractions. For example, for a two-body Hamiltonian, the first commutator, $\langle{[{\boldsymbol{H}},{\boldsymbol{\mathcal{T}}}]}\rangle$ contains three-body contractions; the second commutator, $\langle{[{[{\boldsymbol{H}},{\boldsymbol{\mathcal{T}}}]},{\boldsymbol{\mathcal{T}}}]}\rangle$ contains four-body contractions, and so on. By truncating the expansion at the 4th commuter, we obtain a scaling of $\mathcal{O}(M^{6})$ for the energy. In other words, we can state

E(\tau)\approx\underbrace{\langle{\boldsymbol{H}}\rangle}_{\text{$\mathcal{O}(M^{2})$}}+\underbrace{\langle{[{\boldsymbol{H}},{\boldsymbol{\mathcal{T}}}]}\rangle}_{\text{$\mathcal{O}(M^{3})$}}+\frac{1}{2!}\underbrace{\langle{[{[{\boldsymbol{H}},{\boldsymbol{\mathcal{T}}}]},{\boldsymbol{\mathcal{T}}}]}\rangle}_{\text{$\mathcal{O}(M^{4})$}}\\ +\frac{1}{3!}\underbrace{\langle{[{[{[{\boldsymbol{H}},{\boldsymbol{\mathcal{T}}}]},{\boldsymbol{\mathcal{T}}}]},{\boldsymbol{\mathcal{T}}}]}\rangle}_{\text{$\mathcal{O}(M^{5})$}}+\frac{1}{4!}\underbrace{\langle{[{[{[{[{\boldsymbol{H}},{\boldsymbol{\mathcal{T}}}]},{\boldsymbol{\mathcal{T}}}]},{\boldsymbol{\mathcal{T}}}]},{\boldsymbol{\mathcal{T}}}]}\rangle}_{\text{$\mathcal{O}(M^{6})$}},

(41)

such that addition of every term in the expansion multiplies the leading scaling by a factor of $M$ . Obtaining the equations above analytically is cumbersome but not prohibitive. We use our home-built algebraic manipulator software, drudge Zhao (2018), to generate this and the analytic gradient of the energy. As a result, we can compute both $E(\tau)$ and $\nabla E(\tau)$ at $\mathcal{O}(M^{6})$ cost albeit with a relatively large prefactor.

We minimize the energy as in Eq. (40) using the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm Byrd et al. (1995) where we provide the gradients using our analytical expressions. We could alternatively find the solutions to $\nabla E(\tau)=0$ using non-linear root-finding algorithms, but we find that the former is typically more robust when an accurate initial guess is provided.

Because truncating the commutator expansion yields a non-variational energy expression, we must take care with the minimization of the energy. This becomes particularly important in larger systems for which the numerical algorithms may go below the exact energy and eventually blow up. Fortunately, this failure mode is easy to identify as it is accompanied by extremely large amplitudes in $\tau$ . To avoid this problem, we introduce an $L_{2}$ regularization Nocedal and Wright (2006)


$\displaystyle\widetilde{E}({\tau})$	$\displaystyle=E({\tau})+\frac{\lambda}{2}\left\lVert{\tau}\right\rVert^{2}$	(42a)
$\displaystyle\implies\nabla\widetilde{E}({\tau})$	$\displaystyle=\nabla{E}({\tau})+\lambda{\tau}$	(42b)

where $\lambda>0$ is the regularization parameter and is sufficiently small to allow a robust convergence. In effect, the regularization penalizes the energy optimization in proportion to the magnitude of the amplitudes, thereby preventing it from blowing up. Although this compromises the accuracy of final energy, we gain a smoother convergence. To find $\lambda$ , we pick a small value e.g. $10^{-3}$ , converge to an optimal solution for $\tau$ , and use this solution as the initial guess of another optimization in which a smaller $\lambda$ is used. The first initial guess for $\tau$ can be set to zero. We repeat this process until we can no longer converge or the change in energy is negligible. The convergence threshold of the gradient is set to $10^{-8}$ , and we have not observed significant dependence of the final results on the initial guess for $\tau$ or $\lambda$ in our calculations.

In Fig. 2 we plot the results for the pairing Hamiltonian in a small system in which we can compare the accuracy of our truncated commutator expansion with that of exact uP-AGP which we obtain from a full configuration interaction (FCI) code. As we can see in the plot, the agreement between the exact and truncated uP-AGP is excellent on the attractive side while at the same time being slightly more accurate than the linear wavefunction $\boldsymbol{\mathcal{T}}|\text{AGP}\rangle$ denoted as P-CI AGP. On the repulsive side, the results for the truncated uP-AGP is comparable to P-CI and becomes more accurate only at sufficiently small $G/G_{c}$ .

For benchmarking purposes, we also plot HF-based unitary coupled cluster (UCC) Bartlett et al. (1989); Taube and Bartlett (2006) as well as that with generalized indices, UGCC, in Fig. 2. The results are obtained from a FCI code. As expected UCC works well in the repulsive regime, but it breaks down as $G$ approaches $G_{c}$ . On the other hand, UGCC is remarkably more accurate than UCC, but it too eventually breaks down for large $G/G_{c}$ . In contrast, our AGP based methods are well behaved in all correlation regimes.

The results for a larger system is show in Fig. 3 where we compare the accuracy of the truncated uP-AGP with ST-J₂AGP and P-CI. Although all of these methods have the same number of parameters, they clearly differ considerably in their accuracy. The observation that uP-AGP could be more accurate than P-CI in principle is a testament to its non-linear nature. Previously, it was shown that CI methods based on the generators of the algebra, namely J₂-CI, P-CI, and K-CI, yield identical results Dutta et al. (2020). The other advantage of the non-linear model is that it is agnostic to the linear dependencies of the ansatz. This is true for both ST-J₂AGP and the unitary pair-hopper. Although removing the linear dependencies by diagonalizing the metric can yield faster convergence, our numerical results show very little difference in the final energy.

The downside of the truncated uP-AGP is that it has a relatively large prefactor which is a result of the brute-force expansion of the BCH formula. In the next section we apply canonical transformation theory as another means for approximating the unitary transformed Hamiltonian.

IV.2 AGP-based canonical transformation

In canonical transformation (CT) theory, as formulated originally by Yanai and Chan Yanai and Chan (2006, 2007), the commutator expansion of a unitary transformed Hamiltonian,

\displaystyle\mathrm{e}^{-\boldsymbol{A}}\boldsymbol{H}\mathrm{e}^{\boldsymbol{A}}=\boldsymbol{H}+[{\boldsymbol{H}},{\boldsymbol{A}}]+\frac{1}{2!}[{[{\boldsymbol{H}},{\boldsymbol{A}}]},{\boldsymbol{A}}]+\ldots,

(43)

where $\boldsymbol{A}$ is an anti-Hermitian operator, is recursively summed by systematically replacing all higher-body operators obtained from each commutator by an approximate, lower-body operator. For example, if $\boldsymbol{A}$ and $\boldsymbol{H}$ are each two-body operators, one approximates $[{\boldsymbol{H}},{\boldsymbol{A}}]\rightarrow[{\boldsymbol{H}},{\boldsymbol{A}}]_{(1,2)}$ wherein the three-body operator resulting from $[{\boldsymbol{H}},{\boldsymbol{A}}]$ is approximated in terms of one- and two-body operators. This process yields an effective Hamiltonian, $\bar{\boldsymbol{H}}\approx\mathrm{e}^{-\boldsymbol{A}}\boldsymbol{H}\mathrm{e}^{\boldsymbol{A}}$ , that, in its simplest form, can be written as

\displaystyle\bar{\boldsymbol{H}}=\boldsymbol{H}+[{\boldsymbol{H}},{\boldsymbol{A}}]_{(1,2)}+\frac{1}{2!}\left[[\boldsymbol{H},\boldsymbol{A}\right]_{(1,2)},\boldsymbol{A}]_{(1,2)}+\ldots,

(44)

where we have retained up to two-body terms in the effective Hamiltonian. Eq. (44) in particular can be summed recursively by


$\displaystyle\bar{\boldsymbol{H}}^{(n+1)}$	$\displaystyle=\frac{1}{n}\left[\bar{\boldsymbol{H}}^{(n)},\boldsymbol{A}\right]_{(1,2)},$	(45a)
$\displaystyle\bar{\boldsymbol{H}}$	$\displaystyle=\sum_{n=0}\bar{\boldsymbol{H}}^{(n)},$	(45b)

where $\bar{\boldsymbol{H}}^{(0)}=\boldsymbol{H}$ , and the sum is continued until the change in the parameters of the effective Hamiltonian is sufficiently small. The energy is then defined as $E=\langle{\Psi}|\bar{\boldsymbol{H}}|{\Psi}\rangle$ where $|{\Psi}\rangle$ is some mean-field reference. The minimization of the energy can be carried out by solving CC-style amplitude equations $\langle{\Psi}|\boldsymbol{\hat{A}}^{\dagger}_{i}\bar{\boldsymbol{H}}|{\Psi}\rangle$ which can be rearranged to give

\displaystyle R_{i}=\langle{\Psi}|[\bar{\boldsymbol{H}},\hat{\boldsymbol{A}_{i}}]|{\Psi}\rangle=0,\quad\forall{\hat{\boldsymbol{A}_{i}}\in\boldsymbol{A}},

(46)

where $\hat{\boldsymbol{A}_{i}}$ are the operator parts of $\boldsymbol{A}$ .

In essence, the goal of CT theory is to make the effective Hamiltonian more and more diagonal by successively approximating higher order excitations. It can be viewed as a kind of renormalization group approach and is rooted in an earlier work by White White (2002) as well as the flow renormalization approach by Wegner Wegner (1994) and Glazek and Wilson Glazek and Wilson (1994). In this section, however, we very closely follow the formalism presented in Ref. Yanai and Chan (2006, 2007); Neuscamman et al. (2009, 2010); Yanai et al. (2012) by Yanai, Chan, and Neuscamman.

The chief difference between our approach and those of earlier papers is in the operator decomposition or downfolding scheme. Instead of invoking the cumulant decomposition of RDMs Kutzelnigg and Mukherjee (1999) as a means to get lower-body operators, we use AGP and our reconstruction formulae. Just as in cumulant decomposition, the reconstruction formulae are statements about the RDMs, but we can extended them to operators as an approximation.

For example, consider the three-body irreducible density matrix $Z^{(3,3)}_{pqr}$ . From the reconstruction formulae Khamoshi et al. (2019), it follows that

\langle{\mathbf{N}_{p}\mathbf{N}_{q}\mathbf{N}_{r}}\rangle=\\ \Lambda_{qp}\Lambda_{rp}\langle{\mathbf{N}_{p}}\rangle+\Lambda_{pq}\Lambda_{rq}\langle{\mathbf{N}_{q}}\rangle+\Lambda_{pr}\Lambda_{qr}\langle{\mathbf{N}_{r}}\rangle,

(47)

where $\Lambda_{qp}=2\eta_{q}^{2}/(\eta_{q}^{2}-\eta_{p}^{2})$ . For CT, we propose to approximate $\mathbf{N}_{p}\mathbf{N}_{q}\mathbf{N}_{r}$ by removing the expectation values to get

\displaystyle\mathbf{N}_{p}\mathbf{N}_{q}\mathbf{N}_{r}\rightarrow\Lambda_{qp}\Lambda_{rp}\mathbf{N}_{p}+\Lambda_{pq}\Lambda_{rq}\mathbf{N}_{q}+\Lambda_{pr}\Lambda_{qr}\mathbf{N}_{r}.

(48)

Note that since the reconstruction formulae apply only to irreducible RDMs, we limit our downfolding to irreducible operators, i.e. $\mathbf{P}^{\dagger}_{p}\ldots\mathbf{N}_{q}\ldots\mathbf{P}_{r}\ldots$ where all indices are different. Indeed, all reducible operators (RDMs) can be written as a sum of irreducible ones.

To carry out the CT procedure, we take the unitary pair-hopper $\mathrm{e}^{\boldsymbol{\mathcal{T}}}$ ansatz to be the generator of our transformation, and we seek an effective Hamiltonian of the form

\displaystyle\bar{\boldsymbol{H}}=\sum_{p}h_{p}\mathbf{N}_{p}+\sum_{p\neq q}w_{pq}\mathbf{N}_{p}\mathbf{N}_{q}+\sum_{pq}v_{pq}\mathbf{P}^{\dagger}_{p}\mathbf{P}_{q},

(49)

where $w_{pq}$ is symmetric and $v_{pq}$ is Hermitian. Note that Eq. (49) is the general form of a two-body Hamiltonian in the seniority-zero space Henderson et al. (2015). In order to get a relatively more accurate approximation in CT, we would like to delay the downfolding as much possible. To this end, following Ref. Neuscamman et al. (2010), we split $\bar{\boldsymbol{H}}=\bar{\boldsymbol{H}}_{1}+\bar{\boldsymbol{H}}_{2}$ such that $\bar{\boldsymbol{H}}_{1}$ and $\bar{\boldsymbol{H}}_{2}$ are the one- and two-body parts of the effective Hamiltonian respectively, and the downfolding is delayed until four-body operators appear. As such, one obtains


$\displaystyle\bar{\boldsymbol{H}_{1}}$	$\displaystyle=\boldsymbol{H}_{1}+[\boldsymbol{H}_{1},\boldsymbol{\mathcal{T}}]+\frac{1}{2!}[[\boldsymbol{H}_{1},\boldsymbol{\mathcal{T}}],\boldsymbol{\mathcal{T}}]_{(1,2)}$
	$\displaystyle{}\quad+\frac{1}{3!}[[[\boldsymbol{H}_{1},\boldsymbol{\mathcal{T}}],\boldsymbol{\mathcal{T}}],\boldsymbol{\mathcal{T}}]_{(1,2)}+...$	(50a)
$\displaystyle\bar{\boldsymbol{H}_{2}}$	$\displaystyle=\boldsymbol{H}_{2}+[\boldsymbol{H}_{2},\boldsymbol{\mathcal{T}}]_{(1,2)}+\frac{1}{2!}[[\boldsymbol{H}_{2},\boldsymbol{\mathcal{T}}],\boldsymbol{\mathcal{T}}]_{(1,2)}$
	$\displaystyle{}\quad+\frac{1}{3!}[[[\boldsymbol{H}_{2},\boldsymbol{\mathcal{T}}],\boldsymbol{\mathcal{T}}]_{(1,2)},\boldsymbol{\mathcal{T}}]_{(1,2)}+...$	(50b)

Similarly, to get accurate residual equations, we split $R_{pq}=R_{1,pq}+R_{2,pq}$ and evaluate the residuals associated with the one- and two-body parts recursively


$\displaystyle R_{1,pq}$	$\displaystyle=\langle{[\boldsymbol{H}_{1},\boldsymbol{\mathcal{T}}_{pq}]}\rangle+\frac{1}{2!}\langle{[[\boldsymbol{H}_{1},\boldsymbol{\mathcal{T}}_{pq}],\boldsymbol{\mathcal{T}}_{pq}]_{(1,2)}}\rangle+\ldots,$	(51a)
$\displaystyle R_{2,pq}$	$\displaystyle=\langle{[\boldsymbol{H}_{2},\boldsymbol{\mathcal{T}}_{pq}]_{(1,2)}}\rangle+\frac{1}{2!}\langle{[[\boldsymbol{H}_{2},\boldsymbol{\mathcal{T}}_{pq}],\boldsymbol{\mathcal{T}}_{pq}]_{(1,2)}}\rangle.+\ldots$	(51b)

In both Eq. (50) and Eq. (51) we retain all two body-terms that naturally appear in the commutator expansion and absorb their coefficients into $w_{pq}$ and $v_{pq}$ in Eq. (49). All higher-body operators are downfolded directly into one-body with the intent of making the Hamiltonian more diagonal. The cost of evaluating the effective Hamiltonian and the residuals is $\mathcal{O}(M^{4})$ .

Having the effective Hamiltonian and the residual equations, one can solve the amplitudes using a non-linear root-finding algorithm. In so doing, we can provide an approximation to the Jacobian matrix as follows

\displaystyle J_{pq,rs}

\displaystyle=\langle{[[\bar{\boldsymbol{H}},\boldsymbol{\mathcal{T}}_{rs}],\boldsymbol{\mathcal{T}}_{pq}]}\rangle.

(52)

The cost of building the Jacobian is also $\mathcal{O}(M^{4})$ by using the reconstruction formula.

We apply the CT as described above to the pairing Hamiltonian Eq. (2). As encountered in other implementations of CT, we find that the optimization over the amplitude equations is ill-conditioned Neuscamman et al. (2009, 2010); Yanai et al. (2012). These numerical difficulties are attributed to near-zero eigenvalues of the Jacobian and are said to be similar to the intruder states in the second order perturbation theory Neuscamman et al. (2010). Several techniques have been proposed to mitigate these difficulties Neuscamman et al. (2010); Yanai et al. (2012). In our implementation, we find that shifting of the amplitude equations Yanai et al. (2012)


$\displaystyle\tilde{R}_{\mu}$	$\displaystyle=R_{\mu}+\lambda\tau_{\mu}$	(53a)
$\displaystyle\implies\tilde{J}_{\mu\nu}$	$\displaystyle={J}_{\mu\nu}+\lambda\delta_{\mu\nu},$	(53b)

where $\mu,\nu$ are flattened indices, to be the most effective. This is analogous to the regularization introduced in Eq. (42). The shifting parameter $\lambda>0$ can be chosen ad hoc Yanai et al. (2012). While we may follow the same steps as in the previous section to find optimal values for $\lambda$ , for convenience, we picked some sufficiently small values with which the equations can converge reliably.

The results for 40 levels, 20 pairs is shown in Fig. 4. We picked $\lambda=1$ for $G>0$ and a larger $\lambda=2.5$ for $G<0$ . CT with the unitary pair-hopper on AGP (CT-uP-AGP) is considerably less expensive than the truncated uP-AGP as well as the linear P-CI AGP. However, the approximation made to higher order excitations compromises the accuracy of CT-uP-AGP. Systematic improvements can be made to CT-uP-AGP by delaying the downfolding even further, but we leave these improvements for future work.

V Conclusions

We explore several non-linear exponential ansätze based on AGP in this paper. First, we study Hilbert-space Jastrow correlators on AGP. We show analytically that while a non-Hermitian JAGP exponential ansatz can in principle be made exact, practical computational limitations with non-stochastic methods compel us to make certain approximations. To this end, we extend the formalism of Ref. Wahlen-Strothman et al. (2015) to AGP and the $su(2)$ algebra, and show how to similarity transform any given Hamiltonian in the seniority-zero space and solve for the amplitudes in a CC-style manner for AGP. Benchmark calculations of this ST-J₂AGP on the ground state of the pairing Hamiltonian show significant improvement on AGP but the resulting energies are less accurate compared to their CI counter-part, J₂-CI AGP. The variational ansatz, var-J₂AGP, is generally more accurate than the similarity transformed one, but it is out of reach for non-stochastic methods to the best of our knowledge. The cost of ST-J₂AGP is $\mathcal{O}(M^{4})$ .

In the second half of the paper, we sought an approximation to the unitary pair-hopper ansatz. Owing to its non-linear nature, exact variational calculations show uP-AGP can be considerably more accurate than the linear P-CI AGP. However, due to its non truncating commutator expansion, some approximation must be made to implement it efficiently on a classical computer. For an approximation to uP-AGP, we expand $\mathrm{e}^{-\boldsymbol{\mathcal{T}}}\boldsymbol{H}\mathrm{e}^{\boldsymbol{\mathcal{T}}}$ using the well-known BCH formula up to the highest affordable computational scaling $\mathcal{O}(M^{6})$ . Numerical results indicate improvement over P-CI AGP on the attractive regime of the pairing Hamiltonian.

The downside of truncated uP-AGP is that it has a relatively large prefactor in its asymptotic scaling. To resolve this, we implement a slight variation of canonical transformation theory Neuscamman et al. (2010) wherein we carry out the operator decomposition from the reconstruction formulae Khamoshi et al. (2019) instead of the commonly used cumulant decomposition Kutzelnigg and Mukherjee (1999). Our CT-uP-AGP is significantly less expensive than either truncated uP-AGP or P-CI AGP, but the approximation made to the higher order excitations noticeably compromises its accuracy.

Considering the pros and cons of each method, it is far from clear whether exponential ansätze for post-AGP methods are necessarily more advantageous than linear correlators. While there are non-linear ansätze that are more accurate than their linear counterparts, practical considerations for implementing them could limit their scope. So far, the linear ansätze have shown remarkable accuracy, they are straightforward to implement and seem rather resilient to system size, at least in the pairing Hamiltonian.

In electronic structure theory, exponential ansätze typically enjoy more favorable size-extensivity properties. However, we intentionally avoided discussing size-extensivity in this paper since the interactions in the pairing Hamiltonian are infinite-range. This makes the energy a non-linear function of system size and complicates the discussion of size-extensivity. Nevertheless, it must be noted that Neuscamman has shown Neuscamman (2012) variational JAGP is fully size-consistent. Also, earlier work in our group has shown a significant but incomplete restoration of size-consistency using K-CI and linearized-CC correlators on AGP Henderson and Scuseria (2020) with the latter exhibiting more favorable size-consistency. A more thorough analysis regarding size-extensivity of post-AGP methods will be reported in due time.

VI Acknowledgments

This work was supported by the U.S. National Science Foundation under Grant No. CHE-1762320. G.E.S. is a Welch Foundation Chair (Grant No. c-0036). A.K. is thankful to Yiheng Qiu for useful discussions regarding Jastrow AGP and to Gaurav Harsha for assistance with the drudge software. We thank Rishab Dutta for letting us use his JCI-AGP code.

VII Data Availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Appendix A Proof for J_kCI Manifold Containing Lower-Rank JCI Manifolds over AGP

The J_kCI-AGP manifold contains the J_k-1CI-AGP and thus all lower-rank JCI-AGP manifolds. This has been observed numerically in Ref. Dutta et al. (2020). Here we provide a proof for the $k=2$ case, which can be readily generalized to higher ranks. Along the same lines, J_kAGP does not need to include lower-rank Jastrow operators, which is shown as a corollary.

For an AGP with $N$ pairs of electrons

		$\displaystyle\,\boldsymbol{J}_{1}\|{\mathrm{AGP}}\rangle$
	$\displaystyle=$	$\displaystyle\,\frac{1}{2N}\sum_{p}\mathbf{N}_{p}\boldsymbol{J}_{1}\|{\mathrm{AGP}}\rangle$
	$\displaystyle=$	$\displaystyle\,\frac{1}{4N}\sum_{pq}\alpha_{q}\mathbf{N}_{p}\mathbf{N}_{q}\|{\mathrm{AGP}}\rangle$
	$\displaystyle=$	$\displaystyle\,\frac{1}{4N}\sum_{p<q}\left(\alpha_{p}+\alpha_{q}\right)\mathbf{N}_{p}\mathbf{N}_{q}\|{\mathrm{AGP}}\rangle+\frac{1}{4N}\sum_{p}\alpha_{p}\mathbf{N}_{p}^{2}\|{\mathrm{AGP}}\rangle$
	$\displaystyle=$	$\displaystyle\,\frac{1}{4N}\sum_{p<q}\left(\alpha_{p}+\alpha_{q}\right)\mathbf{N}_{p}\mathbf{N}_{q}\|{\mathrm{AGP}}\rangle+\frac{1}{2N}\sum_{p}\alpha_{p}\mathbf{N}_{p}\|{\mathrm{AGP}}\rangle.$

It follows that

\boldsymbol{J}_{1}|{\mathrm{AGP}}\rangle=\frac{1}{4}\sum_{p<q}\frac{\alpha_{p}+\alpha_{q}}{N-1}\mathbf{N}_{p}\mathbf{N}_{q}|{\mathrm{AGP}}\rangle.

(54)

Namely, $\boldsymbol{J}_{1}|{\mathrm{AGP}}\rangle$ can be expressed as a linear combination of states in J₂CI.

From Eq. (54), we can show Eq. (15) by defining

\alpha^{\prime}_{pq}=\frac{\alpha_{p}+\alpha_{q}}{N-1}+\alpha_{pq}

(55)

and

\boldsymbol{J}_{2}^{\prime}=\frac{1}{4}\sum_{p<q}\alpha^{\prime}_{pq}\mathbf{N}_{p}\mathbf{N}_{q}.

(56)

Generalization to higher-rank cases is straightforward.

References

Helgaker et al. (2000) T. Helgaker, P. Jørgensen, and J. Olsen, Molecular electronic-structure theory (Wiley, Chichester; New York, 2000).
Scuseria et al. (1988) G. E. Scuseria, C. L. Janssen, and H. F. Schaefer, J. Chem. Phys. 89, 7382 (1988).
Bartlett and Musiał (2007) R. J. Bartlett and M. Musiał, Rev. Mod. Phys. 79, 291 (2007).
Bulik et al. (2015) I. W. Bulik, T. M. Henderson, and G. E. Scuseria, J. Chem. Theory Comput. 11, 3171 (2015).
Degroote et al. (2016) M. Degroote, T. M. Henderson, J. Zhao, J. Dukelsky, and G. E. Scuseria, Phys. Rev. B 93, 125124 (2016).
Qiu et al. (2017) Y. Qiu, T. M. Henderson, J. Zhao, and G. E. Scuseria, J. Chem. Phys. 147, 064111 (2017).
Roos et al. (1980) B. O. Roos, P. R. Taylor, and P. E. M. Sigbahn, Chemical Physics 48, 157 (1980).
Olsen (2011) J. Olsen, Int. J. Quant. Chem. 111, 3267 (2011).
Szalay et al. (2012) P. G. Szalay, T. Müller, G. Gidofalvi, H. Lischka, and R. Shepard, Chem. Rev. 112, 108 (2012).
Lyakh et al. (2012) D. I. Lyakh, M. Musiał, V. F. Lotrich, and R. J. Bartlett, Chem. Rev. 112, 182 (2012).
Ring and Schuck (1980) P. Ring and P. Schuck, The Nuclear Many-Body Problem, Theoretical and Mathematical Physics, The Nuclear Many-Body Problem (Springer-Verlag, Berlin Heidelberg, 1980).
Bytautas et al. (2011) L. Bytautas, T. M. Henderson, C. A. Jiménez-Hoyos, J. K. Ellis, and G. E. Scuseria, J. Chem. Phys. 135, 044119 (2011).
Henderson et al. (2015) T. M. Henderson, I. W. Bulik, and G. E. Scuseria, J. Chem. Phys. 142, 214116 (2015).
Bytautas et al. (2015) L. Bytautas, G. E. Scuseria, and K. Ruedenberg, J. Chem. Phys. 143, 094105 (2015).
Veillard and Clementi (1967) A. Veillard and E. Clementi, Theoret. Chim. Acta 7, 133 (1967).
Couty and Hall (1997) M. Couty and M. B. Hall, J. Phys. Chem. A 101, 6936 (1997).
Kollmar and Heß (2003) C. Kollmar and B. A. Heß, J. Chem. Phys. 119, 4655 (2003).
Stein et al. (2014) T. Stein, T. M. Henderson, and G. E. Scuseria, J. Chem. Phys. 140, 214113 (2014).
Limacher et al. (2013) P. A. Limacher, P. W. Ayers, P. A. Johnson, S. De Baerdemacker, D. Van Neck, and P. Bultinck, J. Chem. Theory Comput. 9, 1394 (2013).
Johnson et al. (2013) P. A. Johnson, P. W. Ayers, P. A. Limacher, S. D. Baerdemacker, D. V. Neck, and P. Bultinck, Comput. Theor. Chem. 1003, 101 (2013).
Henderson et al. (2014a) T. M. Henderson, I. W. Bulik, T. Stein, and G. E. Scuseria, J. Chem. Phys. 141, 244104 (2014a).
Shepherd et al. (2016) J. J. Shepherd, T. M. Henderson, and G. E. Scuseria, J. Chem. Phys. 144, 094112 (2016).
Dukelsky et al. (2004) J. Dukelsky, S. Pittel, and G. Sierra, Rev. Mod. Phys. 76, 643 (2004).
Henderson et al. (2014b) T. M. Henderson, G. E. Scuseria, J. Dukelsky, A. Signoracci, and T. Duguet, Phys. Rev. C 89, 054305 (2014b).
Henderson and Scuseria (2020) T. M. Henderson and G. E. Scuseria, J. Chem. Phys. 153, 084111 (2020).
Khamoshi et al. (2019) A. Khamoshi, T. M. Henderson, and G. E. Scuseria, J. Chem. Phys. 151, 184103 (2019).
Bardeen et al. (1957) J. Bardeen, L. N. Cooper, and J. R. Schrieffer, Phys. Rev. 108, 1175 (1957).
Richardson and Sherman (1964) R. W. Richardson and N. Sherman, Nucl. Phys. 52, 221 (1964).
Henderson and Scuseria (2019) T. M. Henderson and G. E. Scuseria, J. Chem. Phys. 151, 051101 (2019).
Harsha et al. (2020) G. Harsha, T. M. Henderson, and G. E. Scuseria, J. Chem. Phys. 153, 124115 (2020).
Dutta et al. (2020) R. Dutta, T. M. Henderson, and G. E. Scuseria, J. Chem. Theory Comput. 16, 6358 (2020).
Khamoshi et al. (2020) A. Khamoshi, F. A. Evangelista, and G. E. Scuseria, Quantum Sci. Technol. 6, 014004 (2020).
Johnson et al. (2020a) P. A. Johnson, C.-E. Fecteau, F. Berthiaume, S. Cloutier, L. Carrier, M. Gratton, P. Bultinck, S. De Baerdemacker, D. Van Neck, P. Limacher, and P. W. Ayers, J. Chem. Phys. 153, 104110 (2020a).
Fecteau et al. (2020a) C.-E. Fecteau, H. Fortin, S. Cloutier, and P. A. Johnson, J. Chem. Phys. 153, 164117 (2020a).
Fecteau et al. (2020b) C.-E. Fecteau, F. Berthiaume, M. Khalfoun, and P. A. Johnson, J Math Chem (2020b), https://doi.org/10.1007/s10910-020-01197-0.
Johnson et al. (2020b) P. A. Johnson, H. Fortin, S. Cloutier, and C.-E. Fecteau, arXiv:2012.10477 (2020b).
Scuseria et al. (2011) G. E. Scuseria, C. A. Jiménez-Hoyos, T. M. Henderson, K. Samanta, and J. K. Ellis, J. Chem. Phys. 135, 124108 (2011).
Jastrow (1955) R. Jastrow, Phys. Rev. 98, 1479 (1955).
Haldane (1988) F. D. M. Haldane, Phys. Rev. Lett. 60, 635 (1988).
Foulkes et al. (2001) W. M. C. Foulkes, L. Mitas, R. J. Needs, and G. Rajagopal, Rev. Mod. Phys. 73, 33 (2001).
Casula and Sorella (2003) M. Casula and S. Sorella, J. Chem. Phys. 119, 6500 (2003).
Casula et al. (2004) M. Casula, C. Attaccalite, and S. Sorella, J. Chem. Phys. 121, 7110 (2004).
Umezawa and Tsuneyuki (2003) N. Umezawa and S. Tsuneyuki, J. Chem. Phys. 119, 10015 (2003).
Tsuneyuki (2008) S. Tsuneyuki, Prog Theor Phys 176, 134 (2008).
Henderson and Scuseria (2013) T. M. Henderson and G. E. Scuseria, J. Chem. Phys. 139, 234113 (2013).
Neuscamman (2013a) E. Neuscamman, J. Chem. Phys. 139, 181101 (2013a).
Neuscamman (2016) E. Neuscamman, J. Chem. Theory Comput. 12, 3149 (2016).
Genovese et al. (2019) C. Genovese, A. Meninno, and S. Sorella, J. Chem. Phys. 150, 084102 (2019).
Nakano et al. (2019) K. Nakano, R. Maezono, and S. Sorella, J. Chem. Theory Comput. 15, 4044 (2019).
Wahlen-Strothman et al. (2015) J. M. Wahlen-Strothman, C. A. Jiménez-Hoyos, T. M. Henderson, and G. E. Scuseria, Phys. Rev. B 91, 041114 (2015).
Wahlen-Strothman and Scuseria (2016) J. M. Wahlen-Strothman and G. E. Scuseria, J. Phys.: Condens. Matter 28, 485502 (2016).
Cao et al. (2019) Y. Cao, J. Romero, J. P. Olson, M. Degroote, P. D. Johnson, M. Kieferová, I. D. Kivlichan, T. Menke, B. Peropadre, N. P. D. Sawaya, S. Sim, L. Veis, and A. Aspuru-Guzik, Chem. Rev. 119, 10856 (2019).
Yanai and Chan (2006) T. Yanai and G. K.-L. Chan, J. Chem. Phys. 124, 194106 (2006).
Yanai and Chan (2007) T. Yanai and G. K.-L. Chan, J. Chem. Phys. 127, 104107 (2007).
Neuscamman et al. (2009) E. Neuscamman, T. Yanai, and G. K.-L. Chan, J. Chem. Phys. 130, 124102 (2009).
Neuscamman et al. (2010) E. Neuscamman, T. Yanai, and G. K.-L. Chan, Int. Rev. Phys. Chem. 29, 231 (2010).
Yanai et al. (2012) T. Yanai, Y. Kurashige, E. Neuscamman, and G. K.-L. Chan, Phys. Chem. Chem. Phys. 14, 7809 (2012).
Coleman (1965) A. J. Coleman, J. Math. Phys. 6, 1425 (1965).
Surján (1999) P. R. Surján, Top. Curr. Chem. 203, 63 (1999).
Hua (1944) L.-K. Hua, Am. J. Math 66, 531 (1944).
Dukelsky et al. (2016) J. Dukelsky, S. Pittel, and C. Esebbag, Phys. Rev. C 93, 034313 (2016).
Dutta et al. (2021) R. Dutta, G. P. Chen, T. M. Henderson, and G. E. Scuseria, arXiv:2101.08911 (2021).
Neuscamman (2013b) E. Neuscamman, J. Chem. Phys. 139, 194105 (2013b).
Wei and Neuscamman (2018) H. Wei and E. Neuscamman, J. Chem. Phys. 149, 184106 (2018).
Cohen et al. (2019) A. J. Cohen, H. Luo, K. Guther, W. Dobrautz, D. P. Tew, and A. Alavi, J. Chem. Phys. 151, 061101 (2019).
Nooijen (2000) M. Nooijen, Phys. Rev. Lett. 84, 2108 (2000).
Lee et al. (2019) J. Lee, W. J. Huggins, M. Head-Gordon, and K. B. Whaley, J. Chem. Theory Comput. 15, 311 (2019).
Zhao (2018) J. Zhao, Symbolic solution for computational quantum many-body theory development, Thesis (2018).
Byrd et al. (1995) R. H. Byrd, P. Lu, J. Nocedal, and C. Zhu, SIAM J. Sci. Comput. 16, 1190 (1995).
Nocedal and Wright (2006) J. Nocedal and S. J. Wright, Numerical Optimization (Springer Science & Business Media, New York, NY, 2006).
Bartlett et al. (1989) R. J. Bartlett, S. A. Kucharski, and J. Noga, Chem. Phys. Lett. 155, 133 (1989).
Taube and Bartlett (2006) A. G. Taube and R. J. Bartlett, Int. J. Quantum Chem. 106, 3393 (2006).
White (2002) S. R. White, J. Chem. Phys. 117, 7472 (2002).
Wegner (1994) F. Wegner, Ann. Phys. 506, 77 (1994).
Glazek and Wilson (1994) S. D. Glazek and K. G. Wilson, Phys. Rev. D 49, 4214 (1994).
Kutzelnigg and Mukherjee (1999) W. Kutzelnigg and D. Mukherjee, J. Chem. Phys. 110, 2800 (1999).
Neuscamman (2012) E. Neuscamman, Phys. Rev. Lett. 109, 203001 (2012).


$\displaystyle\mathrm{e}^{\boldsymbol{J}_{1}}\|\text{AGP}\rangle$	$\displaystyle=\sum_{p_{1}<\ldots<p_{N}}\mathrm{e}^{\sum_{j}\alpha_{p_{j}}}\,\left(\prod_{k}\eta_{p_{k}}\,\mathbf{P}^{\dagger}_{p_{k}}\right)\|\text{vac}\rangle$	(21a)
	$\displaystyle=\sum_{p_{1}<\ldots<p_{N}}\left(\prod_{k}\eta_{p_{k}}\,\mathrm{e}^{\alpha_{p_{k}}}\,\mathbf{P}^{\dagger}_{p_{k}}\right)\|\text{vac}\rangle.$	(21b)

		$\displaystyle\,\boldsymbol{J}_{1}\|{\mathrm{AGP}}\rangle$
	$\displaystyle=$	$\displaystyle\,\frac{1}{2N}\sum_{p}\mathbf{N}_{p}\boldsymbol{J}_{1}\|{\mathrm{AGP}}\rangle$
	$\displaystyle=$	$\displaystyle\,\frac{1}{4N}\sum_{pq}\alpha_{q}\mathbf{N}_{p}\mathbf{N}_{q}\|{\mathrm{AGP}}\rangle$
	$\displaystyle=$	$\displaystyle\,\frac{1}{4N}\sum_{p<q}\left(\alpha_{p}+\alpha_{q}\right)\mathbf{N}_{p}\mathbf{N}_{q}\|{\mathrm{AGP}}\rangle+\frac{1}{4N}\sum_{p}\alpha_{p}\mathbf{N}_{p}^{2}\|{\mathrm{AGP}}\rangle$
	$\displaystyle=$	$\displaystyle\,\frac{1}{4N}\sum_{p<q}\left(\alpha_{p}+\alpha_{q}\right)\mathbf{N}_{p}\mathbf{N}_{q}\|{\mathrm{AGP}}\rangle+\frac{1}{2N}\sum_{p}\alpha_{p}\mathbf{N}_{p}\|{\mathrm{AGP}}\rangle.$