
Application-level Benchmarking of Quantum Computers using Nonlocal Game Strategies

Jim Furches (1), Sarah Chehade (2), Kathleen Hamilton (2), Nathan Wiebe (3,4,5), and Carlos Ortiz Marrero (1,6)
(1) Physical Detection Systems and Deployment Division, Pacific Northwest National Laboratory, Richland, WA 99354
(2) Quantum Computational Science Group, Oak Ridge National Laboratory, Oak Ridge, TN 37830
(3) Department of Computer Science, University of Toronto, ON M5S 1A1, Canada
(4) High Performance Computing Group, Pacific Northwest National Laboratory, Richland, WA 99354
(5) Canadian Institute for Advanced Research, Toronto, ON M5G 1M1, Canada
(6) Department of Computer Science, Colorado State University, Fort Collins, CO 80523
carlos.ortizmarrero@pnnl.gov
Abstract

In a nonlocal game, two noncommunicating players cooperate to convince a referee that they possess a strategy that does not violate the rules of the game. Quantum strategies allow players to optimally win some games by performing joint measurements on a shared entangled state, but computing these strategies can be challenging. We present a variational quantum algorithm to compute quantum strategies for nonlocal games by encoding the rules of a nonlocal game into a Hamiltonian. We show how this algorithm can generate a short-depth optimal quantum strategy for a graph coloring game with a quantum advantage. This quantum strategy is then evaluated on fourteen different quantum hardware platforms to demonstrate its utility as a benchmark. Finally, we discuss potential sources of errors that can explain the observed decreased performance of the executed task and derive an expression for the number of samples required to accurately estimate the win rate in the presence of noise.

1 Introduction

Running simple instances of quantum algorithms with a provable advantage is difficult given the current state of quantum hardware [1, 2]. For this reason, it is important to develop benchmarking tools and techniques that can test and validate the unique aspects of quantum hardware that are consistent with the predictions of quantum theory. In particular, recent work on quantum benchmarking has highlighted the importance of developing benchmarking metrics that can measure progress toward quantum utility of useful quantum algorithms [3].

Low-level benchmark metrics such as randomized benchmarking [4, 5, 6] aim to measure the average error rates of a gate set independent of the initial state or measurement scheme, but they are limited: for example, they cannot help identify sources of error in an algorithmic pipeline [7] and can overestimate gate fidelity in the presence of errors [8]. High-level benchmarks such as Quantum Volume [9] aim to measure the performance of the entire quantum computing stack, including all classical control systems, but this can be too broad a metric and does not necessarily capture the performance of useful quantum algorithms. In addition, both of these benchmark metrics are difficult to compute at scale and fail to capture the ability of a specific hardware platform to attain a quantum advantage.

Recent work on nonlocal games has begun to shed light on their utility for quantum hardware verification, quantum advantage, and self-testing [10, 11, 12, 13, 14]. In a nonlocal game, two noncommunicating players cooperate to convince a referee that they possess a strategy that does not violate the rules of a game. When players are allowed to use entanglement as a resource in the development of their joint strategy, they are able to perform computations that no classical computer can replicate without communication and can win the game with higher probability. Nonlocal games have been historically important and provide a unique setting to explore the relationship between classical physics, quantum theory, and other non-signaling theories [15, 16, 17, 18]. An extensive body of research links these games to foundational problems in quantum physics, conjectures in operator algebras, and computational complexity theory [19, 20, 21]. Moreover, advances in quantum information theory and combinatorics have revealed broad classes of games with a provable quantum advantage when players are allowed to incorporate quantum resources into their strategies, such as graph coloring and graph homomorphism games [22, 23], making them exciting experimental candidates for testing quantum hardware [24]. In addition, nonlocal games are classically verifiable, i.e. given a strategy, one can check in polynomial time whether the answers satisfy the rules of the game.

Despite many breakthroughs in our theoretical understanding of nonlocal games, constructing optimal strategies for general nonlocal games remains a challenge. In our work, we propose a new methodology for constructing strategies using variational methods and outline the utility of the strategies found for benchmarking. We begin Section 2 with an introduction to nonlocal games and some definitions. In Section 3, we propose the use of a dual-phase optimization technique to find the resource state and the measurement scheme of a quantum strategy for a nonlocal game. In Section 4, we demonstrate how our method is able to successfully find optimal strategies for CHSH, an N-Partite Symmetric game, and the graph coloring game. For the graph coloring game, we were able to find a short-depth perfect quantum strategy for a graph on 14 vertices, shown to be the smallest graph instance where there exists a strict separation between classical and quantum strategies [25, 26]. We then proceed to test the performance of this novel short-depth strategy on 14 superconducting quantum computing devices and highlight some potential sources of errors causing decreased performance on some of the devices we tested. In Section 5, we outline how we can use quantum strategies to benchmark quantum devices, their desirable noise robustness properties, and a win rate estimation procedure in the presence of device shot noise.

2 Background

A nonlocal game of $N$ players $\mathcal{G}=(Q_{1},\dots,Q_{N},A_{1},\dots,A_{N},\lambda)$ (illustrated in Fig. 1) consists of a set of possible questions $Q_{j}$ that player $j$ receives from a referee and a set of answers $A_{j}$ that player $j$ is allowed to send to the referee, which the referee then evaluates against a rule function $\lambda:Q_{1}\times\dots\times Q_{N}\times A_{1}\times\dots\times A_{N}\to\{0,1\}$. The sets $Q_{j}$ and $A_{j}$ have cardinality $m_{j}$ and $k_{j}$, respectively; however, in our work, we assume that there are $m$ questions and $k$ answers for each player. The game proceeds in the following steps:

  1. The players are informed about the rules of the game $\lambda$ and, before the game starts, can collaborate to create a joint strategy, modeled as a conditional probability density between the questions and answers, to maximize their chances of satisfying the rules.

  2. Players are then separated or isolated to prevent them from communicating. This is referred to as non-signaling; in other words, each player's actions are independent of the others'.

  3. The referee tests the strategy by asking questions to each player $\mathbf{q}=[q_{1},\dots,q_{N}]$ and receiving their responses $\mathbf{a}=[a_{1},\dots,a_{N}]$, where $q_{i}\in Q_{i}$ and $a_{i}\in A_{i}$, respectively.

  4. The players win a round if $\lambda(\mathbf{a}|\mathbf{q})=1$, and lose if $\lambda(\mathbf{a}|\mathbf{q})=0$. Multiple rounds are played with different questions to establish that players have a valid strategy.

Figure 1: Flow of a nonlocal game. After formulating a strategy, Alice and Bob separate and cannot communicate. For a quantum strategy, they each take a part of an entangled state $\rho$ and, upon receiving a question $q$ from the referee, they perform a measurement on their respective states, giving an answer $a\sim p(a|q)$. Finally, the referee receives their answers and verifies them against the rules $\lambda(a|q)$.

It is common that all players share the same set of possible questions $Q$ and answers $A$. In particular, synchronous games have rules that require the answers of two (or more) players to be identical when they are asked the same question: $\lambda(a_{1},\dots,a_{i},a_{j},\dots,a_{N}|\tilde{\mathbf{q}})=0$ for all $a_{i}\neq a_{j}$, where $\tilde{\mathbf{q}}$ is a vector of questions. In our work, we only consider computing strategies for synchronous games, although the optimization procedure we propose in Section 3 applies to more general strategies.

Using the rules, we can define the value of the game as

$$\omega(\mathcal{G})=\sum_{qa}\lambda(a|q)p(q)p(a|q), \tag{1}$$

where the sum is taken over all possible values of $q\in Q^{N}$ and $a\in A^{N}$ (we drop the vector notation for convenience). The distribution $p(q)$ of the questions asked is typically chosen to be uniform, and the behavior $p(a|q)$ is determined by the strategy that the players construct. Notice that $p(a|q)$ is the only term that players can control to maximize their win rate. A strategy is said to be perfect if $\lambda(a|q)=0\implies p(a|q)=0$ and, consequently, $\omega(\mathcal{G})=1$.

Classical strategies consist of a lookup table that indexes each player’s response to a particular question. It suffices to consider deterministic strategies since stochastic strategies involving shared randomness between the players cannot outperform deterministic strategies due to the linearity of the value of the game [17].
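To make the lookup-table picture concrete, the following minimal sketch enumerates every deterministic strategy of a small two-player game and evaluates Eq. (1) with uniform $p(q)$. The function names (`chsh_rule`, `classical_value`) are illustrative and not taken from the authors' code; the CHSH rule used here is the one stated later in Eq. (9).

```python
from itertools import product

def chsh_rule(a, q):
    """CHSH rule: answers must differ when q = (1, 1) and agree otherwise."""
    if q == (1, 1):
        return a[0] ^ a[1]
    return int(a[0] == a[1])

def classical_value(rule, n_questions=2, n_answers=2, n_players=2):
    """Maximize Eq. (1) over deterministic lookup tables with uniform p(q)."""
    all_q = list(product(range(n_questions), repeat=n_players))
    # A lookup table assigns one answer to every question a player can receive.
    tables = list(product(range(n_answers), repeat=n_questions))
    best = 0.0
    for strategy in product(tables, repeat=n_players):
        wins = 0
        for q in all_q:
            a = tuple(strategy[i][q[i]] for i in range(n_players))
            wins += rule(a, q)
        best = max(best, wins / len(all_q))
    return best

chsh_classical = classical_value(chsh_rule)
print(chsh_classical)
```

For CHSH the enumeration recovers the familiar classical bound of 3/4. The search is exponential in the number of questions, so it is only practical for very small games.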

Suppose that players share a quantum state $\ket{\psi}\in\bigotimes_{i}\mathcal{H}_{i}$, and each player has a set of positive operator-valued measures (POVMs) with elements of the form $\{\mathcal{M}_{a|q}\}$, which they perform on their subsystem. Using this setup, players can generate the following conditional probability density,

$$p(a|q)=\mathrm{Tr}\left[\rho\left(\bigotimes_{i}\mathcal{M}_{a_{i}|q_{i}}\right)\right], \tag{2}$$

where $\rho=|\psi\rangle\langle\psi|$. These densities are known as quantum strategies.

In addition to the above definition of a quantum strategy, there are a variety of competing definitions for a quantum strategy depending on the choice of axioms that describe how joint measurements between two parties should be performed [27]. In our work, we will only consider strategies as defined above, but the study of quantum strategies is a very active area of research [28, 29, 30, 31]. Note that in [32], the authors prove that for synchronous games, a maximally entangled state is sufficient for a quantum strategy to win the graph coloring game.

A game exhibits a quantum advantage if there exists a quantum strategy that performs better than the best possible classical strategy, in which case there is a Bell inequality $\mathcal{I}$ that is violated by some quantum strategies. Such inequalities have historically served as an experimental test of local realism [33]. They are constraints satisfied by classical (local hidden-variable) models, and are often linear inequalities derived from the local realism assumption. More specifically, a Bell inequality consists of a function $\mathcal{I}$ of the probabilities $\{p(a|q)\}$ such that

$$\mathcal{I}(\{p(a|q)\})\leq\xi, \tag{3}$$

for some $\xi\in\mathbb{R}$. Bell inequalities are a central object for self-testing of states [12]. The functions $\mathcal{I}$ and constants $\xi$ are constructed as follows: for a given Bell inequality $\mathcal{I}=\sum_{q,a}w_{q,a}p(a|q)$, where $w_{q,a}$ are weights, there is a corresponding Bell operator $\mathcal{B}=\sum_{q,a}w_{q,a}\bigotimes_{i}\mathcal{M}_{a_{i}|q_{i}}$, such that a violation is obtained as $\xi=\mathrm{Tr}(\mathcal{B}\rho)$. We denote by $\xi_{Q}$ the maximal violation achievable using quantum resources and consider the shifted Bell operator $\xi_{Q}\mathbbm{1}-\mathcal{B}$. If the shifted Bell operator admits a decomposition

$$\xi_{Q}\mathbbm{1}-\mathcal{B}=\sum_{\gamma}P_{\gamma}^{\dagger}P_{\gamma}, \tag{4}$$

where each $P_{\gamma}$ is a polynomial in the measurement operators $\{\mathcal{M}_{a_{i}|q_{i}}\}$, then we call the decomposition a sum of squares (SOS) for the Bell inequality. Such a decomposition is extremely hard to find [34].

3 Method

In this section, we present a variational quantum algorithm for computing quantum strategies of nonlocal games. Let $\ket{\psi}$ be the shared entangled state between the players and $\mathcal{M}_{a|q}=\bigotimes_{i}\mathcal{M}_{a_{i}|q_{i}}$ be the joint POVM applied to that state for question $q$, returning $a$ with probability $p(a|q)=\langle\mathcal{M}_{a|q}\rangle$. It was noted in [35] that fixing these measurement operators gives a Hamiltonian whose ground state is the optimal shared state for this measurement setting. This fact has been used with reinforcement learning to optimize measurements while selecting the optimal shared state through exact diagonalization [36].

3.1 Dual-Phase Optimization

Our approach is a dual-phase optimization (DPO) that alternates between two phases: preparing the optimal state $\ket{\psi}$ for fixed measurements $\{\mathcal{M}_{a|q}\}$, and optimizing the measurements while fixing the shared state. We assume that the players parameterize their state $\ket{\psi}\rightarrow\ket{\psi(\theta)}$ and POVMs $\mathcal{M}_{a|q}\rightarrow\mathcal{M}_{a|q}(\phi)$. The particular choice of parameterization depends on characteristics of the game (e.g. the number of qubits required depends on the number of answers).

Algorithm 1 Dual-Phase Optimization
Initialize $\phi$ randomly
while $\Delta\langle H\rangle>\epsilon$ do
  Construct $H(\phi)$
  $\ket{\psi(\theta)}\leftarrow VQE(H(\phi))$
  $\phi\leftarrow GD(\langle H(\phi)\rangle)$
end while
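The loop of Algorithm 1 can be sketched classically for the CHSH game. This is only an illustration under several assumptions that are not the authors' implementation: the VQE phase is replaced by exact diagonalization (a stand-in in the spirit of [36]), the measurement unitaries are single $R_y$ rotations, the parameter ordering `[phi_a0, phi_a1, phi_b0, phi_b1]` is chosen for convenience, and the step size `lr` is arbitrary.

```python
import numpy as np

def ry(phi):
    """R_y(phi) = exp(-i phi Y / 2), a real 2x2 rotation."""
    c, s = np.cos(phi / 2), np.sin(phi / 2)
    return np.array([[c, -s], [s, c]])

# Projectors onto winning answers: agree, except differ when q = (1, 1).
P_SAME = np.diag([1.0, 0.0, 0.0, 1.0])
P_DIFF = np.eye(4) - P_SAME

def value_hamiltonian(phi):
    """H(phi) = -beta for CHSH with uniform questions."""
    beta = np.zeros((4, 4))
    for qa in (0, 1):
        for qb in (0, 1):
            U = np.kron(ry(phi[qa]), ry(phi[2 + qb]))
            win = P_DIFF if (qa, qb) == (1, 1) else P_SAME
            beta += 0.25 * U.T @ win @ U  # U is real, so U^dagger = U^T
    return -beta

def ground_state(H):
    """Stand-in for the VQE phase: exact lowest eigenvector of H."""
    _, vecs = np.linalg.eigh(H)
    return vecs[:, 0]

def energy(phi, psi):
    return float(psi @ value_hamiltonian(phi) @ psi)

def gradient(phi, psi):
    """Parameter-shift gradient of <H(phi)> with the state held fixed."""
    g = np.zeros_like(phi)
    for k in range(len(phi)):
        shift = np.zeros_like(phi)
        shift[k] = np.pi / 2
        g[k] = 0.5 * (energy(phi + shift, psi) - energy(phi - shift, psi))
    return g

def dpo(seed=0, eps=1e-10, lr=1.0, max_iters=2000):
    rng = np.random.default_rng(seed)
    phi = rng.uniform(-np.pi, np.pi, size=4)
    prev = np.inf
    for _ in range(max_iters):
        psi = ground_state(value_hamiltonian(phi))  # phase 1: state
        phi = phi - lr * gradient(phi, psi)         # phase 2: measurements
        e = energy(phi, psi)
        if abs(prev - e) < eps:
            break
        prev = e
    psi = ground_state(value_hamiltonian(phi))
    return energy(phi, psi), phi

e_final, phi_final = dpo()
print(e_final)
```

The returned energy is the negative win rate, so it can never drop below $-\cos^{2}(\pi/8)$ for CHSH. As with the full algorithm, a given random start may stop at a local optimum rather than the quantum bound.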

The preparation of the Hamiltonian depends on the specific measurement scheme the players decide on, which in turn depends on the game. Later, we outline a method for constructing a Hamiltonian from arbitrary game rules $\lambda$ and measurements $\{\mathcal{M}_{a|q}\}$.

The optimal shared state is prepared in the first phase using any VQE procedure $VQE(\cdot)$. Here, we choose ADAPT-VQE [37] because it generates compact variational circuits for use on near-term quantum hardware, but any other solver can also be used (see Appendix B). The reference state $\ket{\psi_{0}}$ can be a product state, e.g. $\ket{0}$ or $\ket{+}$. We choose a qubit operator pool consisting of all possible Pauli strings $P$ acting on the entire system. The operators added to the state take the form $e^{i\theta P}$, giving $\ket{\psi(\theta)}=\prod_{j=N}^{1}e^{i\theta_{j}P_{j}}\ket{\psi_{0}}$, and they are capable of generating the entanglement required to win nonlocal games, provided they act non-trivially on at least 2 qubits.

The second phase uses a gradient descent-like optimizer $GD(\cdot)$ to update the measurement parameters $\phi$. This requires the calculation of gradients $\nabla_{\phi}\braket{\psi(\theta)|H(\phi)|\psi(\theta)}$ on the quantum device, which can be done through parameter shift rules [38, 39]. In Appendix F, we outline the cost of computing this gradient for larger problem instances and some optimization challenges.
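As a small check of the parameter shift rule, the sketch below compares it against a central finite difference for a single $R_y$ measurement rotation acting on a random real single-qubit state. The setup (one qubit, observable $Z$, the helper names) is an assumption made for illustration; the rule is exact here because the expectation value is a sinusoid in $\phi$ with unit frequency.

```python
import numpy as np

Z = np.diag([1.0, -1.0])

def ry(phi):
    """R_y(phi) = exp(-i phi Y / 2), a real 2x2 rotation."""
    c, s = np.cos(phi / 2), np.sin(phi / 2)
    return np.array([[c, -s], [s, c]])

def expectation(phi, psi):
    """<psi| R_y(phi)^dagger Z R_y(phi) |psi> for a real state psi."""
    U = ry(phi)
    return float(psi @ U.T @ Z @ U @ psi)

def parameter_shift(phi, psi):
    """Exact derivative from two evaluations shifted by +/- pi/2."""
    return 0.5 * (expectation(phi + np.pi / 2, psi)
                  - expectation(phi - np.pi / 2, psi))

def finite_difference(phi, psi, h=1e-6):
    """Numerical reference derivative (central difference)."""
    return (expectation(phi + h, psi) - expectation(phi - h, psi)) / (2 * h)

rng = np.random.default_rng(1)
psi = rng.normal(size=2)
psi /= np.linalg.norm(psi)

ps = parameter_shift(0.7, psi)
fd = finite_difference(0.7, psi)
print(ps, fd)
```

On hardware each shifted expectation value would be estimated from shots, so the gradient costs two circuit evaluations per parameter.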

3.2 Game Hamiltonians

As mentioned above, DPO requires the construction of a Hamiltonian based on the measurements of the players, which determines the quantum strategy. Player $i$ may measure their qubits $\rho_{i}$ in an arbitrary basis depending on the question, leading to measurement operators of the form

\begin{align}
\mathcal{M}_{a|q} &=\bigotimes_{i}U_{iq_{i}}^{\dagger}P_{a_{i}}U_{iq_{i}} \tag{5}\\
&=U_{q}^{\dagger}P_{a}U_{q}, \tag{6}
\end{align}

where $P_{a_{i}}=\ket{a_{i}}\bra{a_{i}}$ is the projector onto answer $a_{i}$, and $U_{iq_{i}}$ acts on $\rho_{i}$ in response to question $q_{i}$. Because $\langle\mathcal{M}_{a|q}\rangle=p(a|q)$, we can substitute this into (1) to construct the game operator

\begin{align}
\beta &=\sum_{qa}\lambda(a|q)p(q)\mathcal{M}_{a|q} \tag{7}\\
&=\sum_{q}p(q)U_{q}^{\dagger}\left(\sum_{a}\lambda(a|q)P_{a}\right)U_{q} \tag{8}
\end{align}

with the property $\braket{\beta}=\omega(\mathcal{G})$. A VQE finds the ground state of a Hamiltonian, so to maximize the win rate, we use the value Hamiltonian $H=-\beta$ in DPO.
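The property $\braket{\beta}=\omega(\mathcal{G})$ can be checked numerically on a toy instance. The sketch below builds $\beta$ from Eq. (7) for the CHSH rule of Eq. (9), using $R_y$ measurement rotations and a random real two-qubit state, and compares $\braket{\beta}$ with the value computed term by term from Eqs. (1) and (2). The helper names and the `phi[player, question]` layout are assumptions made for this illustration.

```python
import numpy as np
from itertools import product

def ry(phi):
    """R_y(phi) = exp(-i phi Y / 2), a real 2x2 rotation."""
    c, s = np.cos(phi / 2), np.sin(phi / 2)
    return np.array([[c, -s], [s, c]])

def chsh_rule(a, q):
    """lambda(a|q) of Eq. (9)."""
    if q == (1, 1):
        return a[0] ^ a[1]
    return int(a[0] == a[1])

def measurement(a, q, phi):
    """M_{a|q} = U_q^dagger P_a U_q, Eq. (6), with U_q = R_y (x) R_y."""
    U = np.kron(ry(phi[0, q[0]]), ry(phi[1, q[1]]))
    P = np.zeros((4, 4))
    idx = 2 * a[0] + a[1]
    P[idx, idx] = 1.0          # projector |a><a|
    return U.T @ P @ U         # U is real, so U^dagger = U^T

def game_operator(rule, phi):
    """beta of Eq. (7) with uniform p(q) = 1/4."""
    pairs = product(product((0, 1), repeat=2), repeat=2)
    return sum(0.25 * rule(a, q) * measurement(a, q, phi) for q, a in pairs)

def game_value(rule, phi, psi):
    """omega of Eq. (1), with p(a|q) = <psi|M_{a|q}|psi> as in Eq. (2)."""
    pairs = product(product((0, 1), repeat=2), repeat=2)
    return sum(0.25 * rule(a, q) * (psi @ measurement(a, q, phi) @ psi)
               for q, a in pairs)

rng = np.random.default_rng(0)
phi = rng.uniform(-np.pi, np.pi, size=(2, 2))  # phi[player, question]
psi = rng.normal(size=4)
psi /= np.linalg.norm(psi)                     # random real 2-qubit state

beta = game_operator(chsh_rule, phi)
expect_beta = float(psi @ beta @ psi)
omega = float(game_value(chsh_rule, phi, psi))
print(expect_beta, omega)
```

Because $\beta$ is a convex combination of projectors, its spectrum lies in $[0,1]$, and its largest eigenvalue is the best win rate reachable with the chosen measurements.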

Proposition 3.1.

The value $\braket{\beta}=1$ if and only if the players have a perfect quantum strategy; otherwise $\braket{\beta}<1$.

Proof 3.1.

We show this by first computing the value for a perfect strategy and then for an imperfect strategy. Let $I=\{(q,a)~|~p(q,a)>0,~\lambda(a|q)=0\}$ be the set of question-answer pairs for which the strategy violates the rules, and let $P=I^{c}$ be its complement, the set of correctly answered question-answer pairs.

For a perfect quantum strategy, $I=\emptyset$ and $P=Q^{N}\times A^{N}$, therefore we get

\begin{align*}
\braket{\beta} &=\sum_{qa\in P\cup I}p(q)p(a|q)\lambda(a|q)\\
&=\sum_{qa\in P}p(q,a)+(0)\sum_{qa\in I}p(q,a).
\end{align*}

Since $I=\emptyset$, every pair $(q,a)$ that occurs with nonzero probability is contained in $P$ and satisfies $\lambda=1$, so the joint probability density in the left term must sum to 1. Hence, we obtain $\braket{\beta}=1$.

A very similar line of reasoning holds for an imperfect strategy, where $I\neq\emptyset$. Reusing the above expression,

\begin{align*}
\braket{\beta} &=\sum_{qa\in P}p(q,a)+(0)\sum_{qa\in I}p(q,a)\\
&=\sum_{qa\in P}p(q,a)<1,
\end{align*}

since the sum of $p(q,a)$ over $(q,a)\in P$ no longer contains the full probability density, as $I$ carries some of the probability. We conclude that $\braket{\beta}\leq 1$, with $\braket{\beta}=1$ iff a strategy is perfect.

$\square$

To parameterize this Hamiltonian for DPO, a general single-qubit unitary $U_{1}$ may be described by 3 parameters, leading to a parameterization of the full measurement gate $U_{q}=\bigotimes_{i,j_{i}}U_{1}(\phi_{iq_{i}j_{i}})$, where $i$ indexes the player, $q_{i}$ indexes the question, and $j_{i}$ indexes the particular qubit of player $i$. In measurement layers acting on multiple qubits, we expand each entry of $\phi_{iq_{i}j_{i}}$ to be the concatenated vector of parameters for all gates applied to that qubit, i.e. $U_{q}=\bigotimes_{i}U_{N_{i}}(\phi_{iq_{i}})$, where $U_{N_{i}}$ is an $N_{i}$-qubit unitary.

4 Experiments

To evaluate the performance of DPO, we apply it to several nonlocal games with known quantum bounds: CHSH, the N-partite symmetric (NPS) game [36], and the odd-cycle game [14]. Then, we use DPO to explicitly compute an optimal quantum strategy for the coloring game of a 14-vertex graph called $G_{14}$ [25]. This strategy is then evaluated on quantum hardware, demonstrating that it can be used to benchmark the nonlocal capabilities of quantum devices and find sources of errors.

4.1 CHSH

The Clauser-Horne-Shimony-Holt (CHSH) scenario [33] is the simplest nonlocal game that admits a quantum advantage. CHSH features 2 players, Alice and Bob, who each receive a question $q_{i}\in Q=\{0,1\}$ and answer $a_{i}\in A=\{0,1\}$. The inequality operator can be expressed in the familiar form $\mathcal{I}=A_{0}B_{0}+A_{1}B_{0}+A_{0}B_{1}-A_{1}B_{1}$ following the rules

$$\lambda(a|q)=\begin{cases}a_{a}\oplus a_{b},&\text{if }q_{a}=q_{b}=1\\ \delta_{a_{a},a_{b}},&\text{otherwise}\end{cases} \tag{9}$$

and making the substitution $\lambda=0\rightarrow\lambda=-1$ in (8). Here $A_{q}$ denotes Alice's measurement operator and likewise $B_{q}$ for Bob. All classical strategies are bounded by $\braket{\mathcal{I}}\leq 2$, whereas quantum strategies can violate this up to $\braket{\mathcal{I}}=2\sqrt{2}$; that is, in the notation of Equation (3), the maximal violation is $\xi_{Q}=2\sqrt{2}$. It suffices to share a Bell state and then perform the appropriate single-qubit measurements.

We applied DPO to the CHSH game using $R_{y}(\phi)=e^{-i\frac{\phi}{2}Y}$ gates as the $U_{q}$ operators. In the first iteration, ADAPT chose $\ket{\psi(\theta)}=e^{i\frac{\pi}{4}YX}\ket{00}$, which correctly generates a Bell state $\ket{\Phi^{-}}$. In the second phase of that iteration, the measurement parameters were optimized to $\phi\approx[0,-\pi/2,\pi/4,-\pi/4]$ by constraining $\phi_{a0}=0$, giving the optimal inequality value $2\sqrt{2}$.
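The reported CHSH solution can be reproduced with a few lines of linear algebra. The sketch below assumes the parameter ordering $\phi=[\phi_{a0},\phi_{a1},\phi_{b0},\phi_{b1}]$ and the observable $U^{\dagger}ZU$ with $U=R_{y}(\phi)$ for each player; both are assumptions about conventions, not statements about the authors' code.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def ry(phi):
    """R_y(phi) = exp(-i phi Y / 2)."""
    return np.cos(phi / 2) * I2 - 1j * np.sin(phi / 2) * Y

def observable(phi):
    """Dichotomic observable U^dagger Z U measured after rotating by R_y(phi)."""
    U = ry(phi)
    return U.conj().T @ Z @ U

# Shared state exp(i pi/4 Y(x)X)|00>, using exp(i t P) = cos(t) I + i sin(t) P.
ket00 = np.zeros(4, dtype=complex)
ket00[0] = 1.0
psi = (np.cos(np.pi / 4) * np.eye(4) + 1j * np.sin(np.pi / 4) * np.kron(Y, X)) @ ket00

# Assumed ordering of the reported angles: [phi_a0, phi_a1, phi_b0, phi_b1].
phi_a0, phi_a1, phi_b0, phi_b1 = 0.0, -np.pi / 2, np.pi / 4, -np.pi / 4
A = [observable(phi_a0), observable(phi_a1)]
B = [observable(phi_b0), observable(phi_b1)]

def corr(Aq, Bq):
    """Two-party correlator <psi| A_q (x) B_q |psi>."""
    return float(np.real(psi.conj() @ np.kron(Aq, Bq) @ psi))

chsh = corr(A[0], B[0]) + corr(A[1], B[0]) + corr(A[0], B[1]) - corr(A[1], B[1])
win_rate = 0.5 + chsh / 8   # each question wins with probability (1 +/- <AB>)/2
print(chsh, win_rate)
```

Under these conventions the state is $(\ket{00}-\ket{11})/\sqrt{2}$, the inequality value is $2\sqrt{2}$, and the corresponding win rate is $\cos^{2}(\pi/8)\approx 0.854$.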

4.2 N-partite Symmetric

The NPS scenario [40] involves the correlations between $N$ players, each receiving a binary question $q_{i}\in\{0,1\}$ and returning a dichotomic answer $a_{i}=\pm 1$. The inequality is expressed in terms of one-body and symmetric two-body correlators,

$$S_{q}=\sum_{i}\braket{\mathcal{M}_{iq}},\quad S_{qq'}=\sum_{i\neq j}\braket{\mathcal{M}_{iq}\mathcal{M}_{jq'}}, \tag{10}$$

where the measurement $\mathcal{M}_{iq}=U_{iq}^{\dagger}Z_{i}U_{iq}$. The classical bound on the correlations is

$$\mathcal{I}=-2S_{0}+\frac{1}{2}S_{00}-S_{01}+\frac{1}{2}S_{11}+2N\geq 0, \tag{11}$$

with negative values only achievable with quantum strategies [36].
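The classical bound in Eq. (11) can be confirmed by brute force for small $N$. Since the inequality is linear in the correlators, it suffices to check deterministic strategies, where player $i$ answers $a_{i}^{q}\in\{\pm 1\}$ to question $q$ and every correlator becomes a product of answers. The function names below are illustrative.

```python
from itertools import product

def two_body(x, y):
    """Symmetric two-body correlator: sum over i != j of x_i * y_j."""
    return sum(x) * sum(y) - sum(xi * yi for xi, yi in zip(x, y))

def nps_value(a0, a1):
    """Eq. (11) for deterministic answers a0[i], a1[i] to questions 0 and 1."""
    n = len(a0)
    return (-2 * sum(a0) + 0.5 * two_body(a0, a0) - two_body(a0, a1)
            + 0.5 * two_body(a1, a1) + 2 * n)

def classical_minimum(n):
    """Smallest value of Eq. (11) over all 4^n deterministic strategies."""
    outcomes = list(product((1, -1), repeat=n))
    return min(nps_value(a0, a1) for a0 in outcomes for a1 in outcomes)

nps_min = classical_minimum(6)
print(nps_min)
```

For $N=6$ the enumeration covers 4096 strategies and the minimum is exactly 0, attained for example when every player always answers $+1$.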

We tested 50 DPO trials for the $N=6$ case (Fig. 2). Our algorithm encounters some local minima, particularly at the classical bound of $\mathcal{I}=0$, but 19 of the 50 trials still reach $\mathcal{I}=-0.258$, the value also found in [36]. In total, 29 of the 50 trials violated the classical bound.

Figure 2: Trials of DPO on NPS for $N=6$. (Left) Trajectory of all 50 trials. Negative inequality values are not reachable with classical states. (Right) Distribution of the final inequality values. Despite the non-convexity of the problem, many trials still reach the optimal value.

4.3 Chromatic Number Game

The objective of the chromatic number game [25] is to color a graph $G=(V,E)$ in such a way that adjacent vertices are never given the same color. If this can be done using $c$ colors, we call this a $c$-coloring of the graph. It has been shown recently that winning strategies for this game generate the set of all possible correlations for synchronous nonlocal games [41]. This differs from the other nonlocal games we mentioned, as the sets of questions and answers are much larger, and each player requires more qubits to encode their answer. The referee asks a question $q=[v_{a},v_{b}]\in\{(v,v)|v\in V\}\cup E$, and the players respond with colors $a=[c_{a},c_{b}]\in\{1,\dots,c\}^{2}$. The rules are given by

$$\lambda(a|q)=\begin{cases}\delta_{c_{a},c_{b}}&\text{if }v_{a}=v_{b}\\ (1-\delta_{c_{a},c_{b}})&\text{if }(v_{a},v_{b})\in E\end{cases}. \tag{12}$$

From these rules, one can encode the graph-coloring game into a Hamiltonian for DPO. For convenience, we denote a vertex question $[v,v]$ as $v$, and an edge question $[v_{a},v_{b}]\in E$ as $e$. Let the answers also be given by $c=[c_{a},c_{b}]$. Then, the expression in (8) gives

$$\beta=\frac{1}{|Q|}\left[\sum_{v}U_{v}^{\dagger}P_{cc}U_{v}+\sum_{e}U_{e}^{\dagger}(I-P_{cc})U_{e}\right], \tag{13}$$

where $P_{cc}=\sum_{c}\ket{cc}\bra{cc}$ is the projector onto the subspace of matching colors, and $|Q|=|V|+|E|$. Intuitively, the first term maximizes $p(c_{a}=c_{b}|v)$, and the second term maximizes $p(c_{a}\neq c_{b}|e)=1-p(c_{a}=c_{b}|e)$. Note that to ensure that all possible questions are asked to each player, $E$ contains both edges $(v_{a},v_{b})$ and their reverse $(v_{b},v_{a})$.

First we consider the odd-cycle game [17], defined on an odd-length cycle graph $G(n)$ of $n$ vertices. The players are restricted to using 2 colors $c_{a},c_{b}\in\{0,1\}$, meaning there is no perfect classical strategy because an odd cycle requires 3 colors. Indeed, the optimal classical win rate is $\omega_{c}(n)=1-1/(2n)$. The optimal quantum strategy has a higher rate of $\omega_{q}(n)=\cos^{2}[\pi/(4n)]$, yielding a separation, but no perfect quantum strategy exists. Additionally, this game is of particular interest because it was recently experimentally demonstrated in a spatially separated pair of trapped ions, showing quantum advantage for $n\leq 19$ vertices [14].
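The size of the classical-quantum separation for the odd-cycle game follows directly from the two closed-form win rates quoted above. The short sketch below tabulates the gap for odd $n$ from 3 to 19; the function names are illustrative.

```python
import math

def classical_win_rate(n):
    """Optimal classical win rate for the odd-cycle game on n vertices."""
    return 1 - 1 / (2 * n)

def quantum_win_rate(n):
    """Optimal quantum win rate for the odd-cycle game on n vertices."""
    return math.cos(math.pi / (4 * n)) ** 2

gaps = {n: quantum_win_rate(n) - classical_win_rate(n) for n in range(3, 20, 2)}
for n, gap in gaps.items():
    print(f"n={n:2d}  classical={classical_win_rate(n):.4f}  "
          f"quantum={quantum_win_rate(n):.4f}  gap={gap:.4f}")
```

The gap is positive for every odd $n\geq 3$ but shrinks as $n$ grows, since both rates approach 1, so larger instances need more shots to resolve the advantage.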

Figure 3: Win rates of discovered strategies for the odd-cycle game. (Top) The maximal win rate found for each game instance $G(n)$ compared to the optimal quantum value $\omega_{q}(n)$. (Bottom) The distribution of values for each instance. The black lines denote the median, and the gray regions correspond to the classical bound $\omega_{c}(n)$.

Fig. 3 shows the distribution of strategies discovered by our algorithm with a measurement layer of one $R_{Y}$ gate per player. We evaluated each game instance $G(n)$ with 25 trials. In all instances, we observed that the best discovered strategy was within $10^{-8}$ of the optimal quantum value. The algorithm did get stuck in some local minima near the classical value $\omega_{c}$, but the median values were within the algorithm tolerance, showing that it is able to find graph coloring strategies for the odd-cycle game.

Now we focus on the quantum chromatic game for the graph $G_{14}$ (Fig. 4). For this graph there exists a perfect quantum strategy with 4 colors, while the smallest possible classical coloring strategy requires 5 [25]. Recall that finding the smallest possible classical coloring strategy is equivalent to finding the chromatic number of this graph [42]. This graph was conjectured to be the smallest possible graph with a perfect quantum strategy for this game, and subsequently this was proved to be the case [25, 26]. In [25] a construction was provided using an orthogonal representation of $G_{14}$, that is, a map of vertices to vectors in $\mathbb{R}^{4}$ such that adjacent vertices are assigned orthogonal vectors. These vectors are then used to define a set of projective measurement operators acting on the maximally entangled state to get a perfect quantum strategy to color the graph. It is unclear how to obtain an explicit set of ansätze from the projective measurements to construct a short-depth circuit that can be executed on near-term hardware (see Appendix C). We use the DPO algorithm to generate a perfect (up to numerical precision error) quantum strategy for this graph.

Figure 4: Picture of the $G_{13}$ graph. To get the graph $G_{14}$, add an apex vertex $\alpha$, i.e. a vertex connected to all the other vertices, to this graph. Image courtesy of [25].

To simplify the search for strategies, we restrict the players to 2 qubits each, since 2 qubits suffice to represent $c\in\{0,\dots,\chi_{q}-1\}$ using a binary encoding. We also impose a known constraint on an optimal strategy for synchronous games [25, 28]: Bob's measurement operators are the complex conjugate of Alice's, halving the number of measurement parameters $\phi$ required. We use a measurement layer per player of general single-qubit $U$ gates, a CNOT from qubit 0 to qubit 1, and $R_{y}$ gates applied to each qubit, resulting in 8 parameters per question or 112 in total (in our code, this is the U3Ry layer).

We classically simulated 500 trials of DPO, achieving a minimum energy of $E=-1.0000$. We removed the gates added by ADAPT with $|\theta_{i}|<10^{-4}$. The corresponding circuit was then converted into a Qiskit circuit (Fig. 5), and the evaluation using the classical AerSimulator confirmed a 100% win rate (Fig. 7(a)). The 112 parameters $\phi$ can be found in our code repository (see Appendix A).

Figure 5: Generated circuit for $G_{14}$. The initial state $\ket{+}^{\otimes N}$ is prepared, then ADAPT added the operators $Y_{0}Z_{2}$ and $Y_{0}Z_{1}Y_{2}Y_{3}$, giving the shared state $\ket{\psi(\theta)}$. The remaining gate layers along with the 112 measurement parameters $\phi$ constitute the measurement strategy. We only adaptively added circuits on the state preparation and fixed the measurement scheme in this case.

It is worth noting that in a nonlocal game the referee cannot cross-check answers from previous questions (otherwise the graph would not be 4-quantum colorable), and the players change their coloring for each vertex probabilistically in subsequent runs, using the entanglement to coordinate their answers as required. For example, when asked $q=[A,A]$ multiple times, the responses are nearly uniform among $[\chi_{q}]$ but always match. Furthermore, we found that measurement layers consisting of only single-qubit gates were insufficient and generated imperfect strategies at $E=-0.9921$. In these cases, we frequently observed that the errors consisted of a cyclic path through some graph edges.

The operators chosen by ADAPT are nonlocal as expected, acting on 2 and 4 qubits. The shared state discovered,

\begin{align}
\ket{\psi(\theta)} &=e^{-i\frac{\pi}{4}YZYY}e^{i\frac{\pi}{4}YIZI}\ket{+}^{\otimes 4} \tag{14}\\
&=\frac{1}{2}H^{\otimes 4}\sum_{c\in[\chi_{q}]}\ket{cc}, \tag{15}
\end{align}

is the maximally entangled state followed by local Hadamard gates. This matches the existing strategy described in [25], which leverages the maximally entangled state, up to local unitary operations.
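The equality between Eqs. (14) and (15) can be checked numerically. The sketch below assumes that Pauli strings are read left to right as qubits 0 to 3, that Alice holds qubits 0 and 1 and Bob holds qubits 2 and 3, and that $\ket{cc}$ uses the same 2-bit encoding of the color on both sides. These conventions are inferred from the operator labels $Y_{0}Z_{2}$ and $Y_{0}Z_{1}Y_{2}Y_{3}$ and are assumptions, as are the helper names.

```python
import numpy as np
from functools import reduce

PAULI = {
    "I": np.eye(2, dtype=complex),
    "X": np.array([[0, 1], [1, 0]], dtype=complex),
    "Y": np.array([[0, -1j], [1j, 0]]),
    "Z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def pauli_string(label):
    """Tensor product of single-qubit Paulis; leftmost letter is qubit 0."""
    return reduce(np.kron, [PAULI[c] for c in label])

def exp_pauli(theta, label):
    """exp(i theta P) = cos(theta) I + i sin(theta) P, since P^2 = I."""
    P = pauli_string(label)
    return np.cos(theta) * np.eye(P.shape[0]) + 1j * np.sin(theta) * P

# Left-hand side, Eq. (14): two ADAPT operators applied to |+>^{(x)4}.
plus = np.ones(2, dtype=complex) / np.sqrt(2)
plus4 = reduce(np.kron, [plus] * 4)
psi_adapt = exp_pauli(-np.pi / 4, "YZYY") @ exp_pauli(np.pi / 4, "YIZI") @ plus4

# Right-hand side, Eq. (15): Hadamards on (1/2) sum_c |c c>.
H1 = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
H4 = reduce(np.kron, [H1] * 4)
max_ent = np.zeros(16, dtype=complex)
for c0 in (0, 1):
    for c1 in (0, 1):
        max_ent[8 * c0 + 4 * c1 + 2 * c0 + c1] = 0.5  # basis state |c0 c1 c0 c1>
psi_target = H4 @ max_ent

# Two states agree up to a global phase iff |<a|b>| = 1.
overlap = abs(np.vdot(psi_target, psi_adapt))
print(overlap)
```

With these conventions the overlap is 1, so the two expressions describe the same state up to a global phase. The state factorizes into Bell pairs on qubits (0, 2) and (1, 3).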

The circuit preparing the shared state $\ket{\psi}$ needs 8 CNOT gates when transpiled using a ladder-like formulation with CNOT gates applied between nearest-neighbor qubits. This can be reduced to 2 CNOT gates (Fig. 6) by noting that the state $\frac{1}{2}\sum_{c}\ket{cc}$ can be generated with transversal Bell pairs shared between the players on qubits $a_{0},b_{0}$ and $a_{1},b_{1}$. Applying the transversal Hadamards $H^{\otimes 4}$ in (15) flips the direction of the CNOT gates using a simple circuit identity. We refer to this version of the shared state circuit as the “Bell pair” strategy, which uses the same measurement layer and parameters as the original strategy.

Figure 6: Simplification of the $G_{14}$ state preparation. (Left) The original strategy using the gates found with the adaptive procedure. (Right) The “Bell pair” strategy, an equivalent circuit producing two independent Bell state pairs between the players. This requires just 2 CNOT gates compared to the 8 required for the original strategy.

4.4 Experiments on IBM Devices

Figure 7: Win rate of the original $G_{14}$ strategy by question. (a) Qiskit AerSimulator: classical statevector simulation with 10000 shots per question. (b) IBMQ ibm_hanoi: performance on a quantum device with 1024 shots per question.

Figure 8: (a) All devices: average performance of the strategy on quantum devices grouped by question category, either a vertex $q=[v,v]$ or an edge $q=[v_{1},v_{2}]$ (vertex win rate on the left, edge win rate on the right). The number of device qubits is reported to distinguish processor types; the circuit was executed on 4 qubits. (b) Noise simulation: classical simulation of the circuit with random Pauli noise applied to CNOT gates with probability $p_{err}$. The circuit was transpiled to the basis gates and coupling map of ibm_hanoi, and the observed data are fitted to the curves by maximizing the probability of observing the data assuming a normal distribution. The estimated error for CNOT gates is higher than reported, since our simulation did not account for measurement readout or decoherence errors.

This strategy was submitted to 11 IBM quantum devices with 4 or more qubits (Fig. 7). A decrease in performance was observed on IBM quantum devices compared to the classical simulation due to noise, particularly affecting the success rate of vertex questions (Fig. 7(b), 8). There are several possible sources of noise:

  1. Vertex questions are more sensitive to bit flips, as any 1-bit error will result in the answer violating the rules, $X_{j}\ket{cc}\rightarrow\ket{c_{a}c_{b}}$, while the same is not true for edge questions, since bit flips may not necessarily make the answers agree, $X_{j}\ket{c_{a}c_{b}}\not\rightarrow\ket{cc}$ (see Section 6). This asymmetry comes from the rules of the game.

  2. As the resource state depends on entanglement, error in entangling 2-qubit gates disrupts the strategy.

  3. Circuit transpilation to hardware with fixed qubit connectivity further incurs two-qubit gate overhead.
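The asymmetry in the first item can be made concrete by enumerating single bit flips on winning answers. The sketch below is illustrative only: it assumes answers are 4-bit strings $(a_{0},a_{1},b_{0},b_{1})$, that a vertex question is won when the two colors match and an edge question when they differ, and that all winning answers are equally likely (the actual strategy need not be uniform).

```python
from itertools import product


def single_flip_loss_rate(is_vertex_question):
    """Fraction of single bit flips that turn a winning 4-bit answer into a losing one."""
    losses = 0
    total = 0
    for a0, a1, b0, b1 in product((0, 1), repeat=4):
        colors_match = (a0, a1) == (b0, b1)
        wins = colors_match if is_vertex_question else not colors_match
        if not wins:
            continue
        for j in range(4):
            bits = [a0, a1, b0, b1]
            bits[j] ^= 1  # X_j error on qubit j
            flipped_match = bits[:2] == bits[2:]
            flipped_wins = flipped_match if is_vertex_question else not flipped_match
            total += 1
            losses += int(not flipped_wins)
    return losses / total


vertex_loss = single_flip_loss_rate(True)   # every flip breaks a matching answer
edge_loss = single_flip_loss_rate(False)    # only some flips make the colors agree
print(vertex_loss, edge_loss)
```

Under these assumptions every single bit flip loses a vertex question, while only one third of the flips lose an edge question.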

This sensitivity suggests that the win rate of the $G_{14}$ strategy is a good benchmark of a quantum device's ability to suppress bit-flip errors while simultaneously performing nonlocal operations. In particular, the vertex question win rate is very sensitive to noise and so measures the fidelity of the device gates, whereas the edge question win rate can confirm whether a device is using quantum resources. The optimal classical strategy with 4 colors consists of a 4-coloring of $G_{13}$ and assigning the most infrequently used color to the apex vertex $\alpha$. Therefore, all vertex questions would be correctly answered and one edge would be incorrectly answered, resulting in an edge win rate of $36/37\approx 97.3\%$, or an overall win rate of $86/88\approx 97.7\%$. Any win rate higher than this requires quantum resources. In our experiments, no device exceeded this threshold (Fig. 8). However, introducing an error-correcting version of our quantum strategy could improve the robustness of this test, which we leave to future investigations.
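The classical thresholds quoted above follow from simple counting. A short sketch of the arithmetic is given below; it assumes, consistent with the $86/88$ figure, that each of the 37 edges is asked in both orders, and the helper `beats_classical_bound` is a hypothetical name introduced here for illustration.

```python
from fractions import Fraction

NUM_VERTEX_QUESTIONS = 14               # questions q = [v, v]
NUM_EDGES = 37                          # from the 36/37 edge win rate
NUM_EDGE_QUESTIONS = 2 * NUM_EDGES      # assumed: each edge is asked in both orders

# One edge is miscolored by the best classical strategy.
edge_win = Fraction(NUM_EDGES - 1, NUM_EDGES)

# That edge appears twice among the ordered questions, so two questions are lost.
total_questions = NUM_VERTEX_QUESTIONS + NUM_EDGE_QUESTIONS
overall_win = Fraction(total_questions - 2, total_questions)
print(float(edge_win), float(overall_win))


def beats_classical_bound(win_rate, half_width):
    """True if the lower end of the confidence interval exceeds the classical bound."""
    return win_rate - half_width > float(overall_win)


# Example with the ibm_sherbrooke entry of Table 2, 94.3(2)%.
sherbrooke_beats_bound = beats_classical_bound(0.943, 0.002)
print(sherbrooke_beats_bound)
```

Comparing the lower end of the interval, rather than the point estimate, guards against declaring a quantum win rate on the basis of a statistical fluctuation.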

5 Nonlocal Games as Quantum Hardware Benchmarks

Nonlocal games with perfect strategies can serve as hardware benchmarks: the empirical win rates obtained when the strategies are executed on near-term hardware can be assessed and analyzed. Under certain assumptions about the structure of quantum noise, nonlocal games can exhibit quantum advantage in shallow circuits, even with noisy qubits [43]. The ‘noisy entanglement’ generated in shallow circuits enables correlations that classical circuits fundamentally struggle to reproduce; as shown in [43], classical circuits of constant depth cannot simulate these long-range correlations.

In this section, we demonstrate the effectiveness of this benchmark by analyzing hardware noise and its strong correlation to strategy performance. We proceed backwards, from the unentangled readout measurements, to the independent Bell state measurements, to the initial entangled resource state preparation. By investigating the effects of hardware noise on the empirical win rates we seek to establish: a) which questions are most affected by noise, b) which components of the circuit are most affected by noise, and c) whether classical correlations, or quantum noise, could be misinterpreted as a winning quantum strategy.

In addition to the classical bounds provided in Section 4.4, we also consider the worst outcome on hardware: a nearly uniform distribution over all bitstrings. This would skew the win rates in the $G_{14}$ game as follows: the average win rate of any vertex question would be only $1/4=0.25$ and the average win rate of any edge question would be $12/16=0.75$. Thus, over the 14 vertex questions and 74 ordered edge questions, random guessing would return an overall win rate of $(14\cdot 0.25+74\cdot 0.75)/88=59/88\approx 67\%$. In Figures 11, 12, 13 and 14 we include these values as a reference.
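The per-question baselines for uniformly random output can be reproduced by enumerating all 16 four-bit answers. This is a minimal sketch under the same simplified rules as above (vertex questions require matching colors, edge questions require different colors), with the assumed bit ordering $(a_{0},a_{1},b_{0},b_{1})$.

```python
from itertools import product

# All 16 possible 4-bit answers (a0, a1, b0, b1), each equally likely.
answers = list(product((0, 1), repeat=4))

# Answers in which Alice and Bob report the same two-bit color.
matching = sum(1 for a0, a1, b0, b1 in answers if (a0, a1) == (b0, b1))

vertex_rate = matching / len(answers)                 # colors must match
edge_rate = (len(answers) - matching) / len(answers)  # colors must differ
print(vertex_rate, edge_rate)
```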

Quantum hardware is affected by many sources of noise. The noise profile is time dependent, and many strategies have been developed to optimize the scheduling and execution of quantum circuits. Using superconducting qubit platforms from IBM and Rigetti, we investigate the robustness of the original $G_{14}$ strategy and the Bell pair strategy with respect to hardware noise fluctuations over several days, and also to changes in the circuit made during the transpilation step.

5.1 Theoretical Noise Robustness

There is an asymmetric sensitivity to noise between the vertex and edge questions due to the game rules (see Section 4.4). Furthermore, there is variance in the performance of the edge questions. We hypothesize that this arises from the particular strategy and distribution of answers found via the ADAPT algorithm. We therefore further investigate the effects of bit-flip errors on the game strategies.

Multiple bitstrings satisfy the constraints for edge questions, $\lambda(c,c|v_{A},v_{B})=0$ for all colors $c$. Players using the four-qubit strategy can win an edge question by outputting a bit string that is either Hamming distance $H(a,b)=1$ from matching (e.g. 0001) or distance 2 (e.g. 1100). While both options satisfy the game rules, choosing bitstrings with a greater Hamming distance reduces the likelihood of losing due to device noise, as higher-weight errors occur less frequently.

To quantify the noise robustness of the strategy resulting from this, we introduce the expected Hamming distance (EHD),

$$\mathrm{EHD}(v_{a},v_{b})=\mathbb{E}_{c_{a},c_{b}\sim p(c_{a},c_{b}|v_{a},v_{b})}\left[H(c_{a},c_{b})\right], \tag{16}$$

where $H(a,b)$ denotes the Hamming distance between the binary representations of answers $a$ and $b$. In general, the EHD is not efficiently computable on a classical computer since it requires sampling the strategy. However, because the $G_{14}$ strategy is sufficiently small, we calculate the EHD for each circuit via classical simulation (Fig. 9).
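Given the output distribution of one question's circuit, (16) reduces to a weighted sum. The sketch below is illustrative: the function names are hypothetical, answers are assumed to be 4-bit strings with Alice's two bits first, and the example distributions are made up rather than taken from the $G_{14}$ strategy.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length bitstrings differ."""
    return sum(x != y for x, y in zip(a, b))


def expected_hamming_distance(distribution):
    """EHD of one question.

    `distribution` maps 4-bit answer strings "a0a1b0b1" to probabilities
    (or raw counts; the values are normalized here).
    """
    total = sum(distribution.values())
    weighted = sum(
        weight * hamming_distance(bits[:2], bits[2:])
        for bits, weight in distribution.items()
    )
    return weighted / total


# Hypothetical edge question: half the answers at distance 1, half at distance 2.
ehd_edge = expected_hamming_distance({"0001": 0.5, "1100": 0.5})

# Perfect vertex question: both players always report the same color.
ehd_vertex = expected_hamming_distance(
    {"0000": 0.25, "0101": 0.25, "1010": 0.25, "1111": 0.25}
)
print(ehd_edge, ehd_vertex)
```

A perfect vertex strategy always has an EHD of zero, which is why the EHD can only discriminate between edge questions.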

Figure 9: (Left) Noise robustness of the $G_{14}$ strategy. The adjacency matrix of the $G_{14}$ graph is shown with the color scale denoting the EHD. Vertex questions form the main diagonal. (Right) Average performance on ibm_sherbrooke over 7 runs plotted for comparison (same as in Fig. 10).

To show that the EHD effectively predicts question performance, we also plot results collected on ibm_sherbrooke (an Eagle r3 processor), a superconducting qubit platform available from IBM with 127 qubits. We executed the strategy described in Section 4.3 multiple times over the course of a week. Supplemental experimental details are available in Appendix G. The heatmaps exhibit a high degree of correlation ($r=0.8812$, $p<0.001$), suggesting that the strategy produced greatly influences noise robustness. The standard deviation is also presented alongside the win rate (Fig. 10), further highlighting the sensitivity of different questions to variations in device calibration. There are some outliers, particularly $(0,2)$ and $(2,0)$, that perform worse than expected, and the EHD cannot account for variation in the vertex question performance. We leave these to future investigations.

Figure 10: (Left) Win rate of the Bell pair strategy averaged over 7 separate experiments on ibm_sherbrooke. All runs used the same hardware qubits and layout, forming a linear chain. Because the experiments were spaced out, the calibration parameters differed between each trial. (Right) Standard deviation of each question for those experiments. This shows how sensitive each circuit in the strategy is to fluctuations in the device parameters.

We executed circuits for the original $G_{14}$ strategy and the Bell pair strategy on Rigetti’s Ankaa-2 and Ankaa-3 devices. Both have square lattice qubit connectivity, and to take advantage of this we prioritized running experiments on qubit subsets with cyclic connectivity. The circuits, first constructed in Qiskit, are exported to Open Quantum Assembly Language (QASM) [44], then imported and compiled into Quil using the qiskit-rigetti plugin [45]. Circuit optimization is possible during compilation into native operations that are executable on the Rigetti platforms.

Figure 11: (Left) All vertex question win rates of the original $G_{14}$ strategy grouped by hardware qubits used on Ankaa-2 from Rigetti. (Right) Edge question win rate of the original $G_{14}$ strategy grouped by hardware qubits used on Ankaa-3 from Rigetti.

The compiled circuit can be further optimized through rewiring directives that determine how program qubits are mapped onto hardware qubits. The NAIVE rewiring uses the program qubit register index as the hardware qubit index. This rewiring may require the use of additional operations to mitigate non-neighboring interactions. The PARTIAL rewiring attempts to choose the mapping between program and physical qubits that maximizes the fidelity of the compiled circuit. We specified the rewiring strategy through the use of pre-compilation hooks. If no hooks were specified by the user, the rewiring strategy was not verified and we denote the strategy as (NOT SPECIFIED).

Figure 12: (Left) Vertex question win rate using the Bell pair strategy averaged over multiple experiments on Rigetti’s Ankaa-2. The runs used different hardware qubits and wiring strategies. (Right) Edge question win rate using the Bell pair strategy averaged over multiple experiments on Rigetti’s Ankaa-3.

Figure 13: (Left) All vertex question win rates of the original $G_{14}$ strategy grouped by hardware qubits used on Ankaa-3 from Rigetti. (Right) Edge question win rate of the original $G_{14}$ strategy grouped by hardware qubits used on Ankaa-3 from Rigetti.

Figure 14: (Left) Vertex question win rate using the Bell pair strategy averaged over multiple experiments on Rigetti’s Ankaa-3. The runs used different hardware qubits and wiring strategies. (Right) Edge question win rate using the Bell pair strategy averaged over multiple experiments on Rigetti’s Ankaa-3.

5.2 Noise Robustness of Game Components

In this section we analyze how hardware noise affects different nonlocal game circuit components, supported by results collected on superconducting qubit platforms. This extends the simulated noise results shown in Fig. 8(b) where the error rate of two-qubit gates was inferred from the hardware results reported in Section 4.4. We supplement these results with additional experiments designed to characterize key components of the strategy: readout measurement error, independent entangled measurements, and imperfect resource state preparation (shown in Fig. 15). Throughout this section we analyze and characterize each element individually. We determine the effective win rate that would be observed by the players if one of these elements failed or was replaced by randomness and use this to demonstrate the effectiveness of nonlocal games as a hardware benchmark.

Figure 15: The components of the nonlocal game circuit that we characterize: resource state (purple), independent Bell basis measurements (orange), and readout error (grey).

The readout measurement error can be characterized by a $2^{n}\times 2^{n}$ matrix constructed row-wise from individual computational basis state preparation and measurement: preparing the register in $|0\rangle^{\otimes n}$, applying $X$-gates, and projecting the final state onto the computational basis. This can be used to estimate bit flip error probabilities (independent or correlated) [46], and can also be leveraged for readout error mitigation (the results we report do not include readout error mitigation; we reserve this for future work). We collected data to characterize the readout error on ibm_sherbrooke and on Rigetti’s Ankaa-2 and Ankaa-3 platforms. In Fig. 16 we plot the Ankaa-2 and Ankaa-3 results to emphasize the connection to the EHD metric (see Section 5.1). Though the circuits executed on the hardware are very shallow, state preparation and measurement (SPAM) error can have a significant impact.
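The row-wise construction of the readout matrix can be sketched as follows. This is an illustration only: the function name and the input layout (a dictionary of measured counts for each prepared basis state) are assumptions, and the one-qubit counts in the example are invented, not measured data.

```python
from itertools import product


def assignment_matrix(counts_by_prepared_state, num_qubits):
    """Build the 2^n x 2^n readout matrix row-wise.

    Row i holds the empirical distribution of measured bitstrings when basis
    state i was prepared. `counts_by_prepared_state[prepared][measured]` is the
    number of shots (hypothetical input layout).
    """
    labels = ["".join(bits) for bits in product("01", repeat=num_qubits)]
    matrix = []
    for prepared in labels:
        counts = counts_by_prepared_state[prepared]
        shots = sum(counts.values())
        matrix.append([counts.get(measured, 0) / shots for measured in labels])
    return labels, matrix


# Invented one-qubit example: |0> is read out more reliably than |1>.
toy_counts = {"0": {"0": 98, "1": 2}, "1": {"0": 5, "1": 95}}
labels, readout = assignment_matrix(toy_counts, num_qubits=1)
print(labels, readout)
```

Off-diagonal entries of a row are the probabilities of reading out a wrong bitstring for that prepared state; for four qubits the matrix has 16 rows, one per prepared answer.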

We now connect the SPAM error back to the EHD. If a nonlocal circuit was correctly executed and the only errors occurred during the readout stage, Fig. 16 shows that vertex questions are more likely to return incorrect answers, while for edge questions correct answers can still be returned with high probability. For vertex questions, the all-zero bitstring is relatively unaffected by errors during the readout measurement step, in contrast to the remaining three bitstrings. For edge questions, bit-flip errors that occur during the readout step can still return valid edge question bitstrings. The edge answers most likely to be erroneously read out as a non-valid bitstring are bitstrings with high Hamming weights. Thus, if a state is correctly prepared and the error only occurs during the readout stage, it affects vertex questions and low-weight edge answers.

Figure 16: Example of spurious bitstring counts caused by SPAM errors on (a) Rigetti Ankaa-2, (b) Rigetti Ankaa-3, and (c) ibm_sherbrooke.

The full quantum strategy is composed of multiple circuits needed to evaluate the players’ performance on all questions posed by the referee. The construction assumes that the two players are separated in space to prohibit classical communication, and implementing the strategy requires nonlocal operations. Prior to the final qubit readout, the two players implement entangled unitaries ($\mathcal{U}_{A}\otimes\mathcal{U}_{B}$) that are assumed to be independent. We assess the ability of each player to apply these entangled measurements with high fidelity independently, and simultaneously without corrupting each other’s operations. This is tested on four qubits $[46,47,48,49]$ connected in a linear chain (see Appendix G). A specific Bell state is prepared by applying $\mathcal{U}_{A}\otimes 1\otimes 1$ or $1\otimes 1\otimes\mathcal{U}_{B}$, where only two qubits prepare a Bell state while the other two qubits remain in the $|0\rangle$ state. Then, a Bell basis measurement is applied to the prepared Bell state and the remaining two qubits are measured in the computational basis. This is compared to the preparation of two independent Bell states, both measured by Bell state measurement. To amplify the gate noise we construct and execute these circuits with basic unitary folding by inserting pairs of CNOT gates.
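The logic of the folded Bell test can be illustrated with a noiseless two-qubit statevector calculation. This sketch assumes the textbook preparation of $|\Phi^{+}\rangle$ (Hadamard then CNOT) and its inverse as the Bell basis measurement, which may differ from the unitaries $\mathcal{U}_{A}$ and $\mathcal{U}_{B}$ used in the experiments; `num_cnot_pairs` is a hypothetical parameter name for the number of inserted CNOT pairs.

```python
from math import isclose, sqrt


def apply(gate, state):
    """Multiply a 4x4 gate (list of rows) into a 2-qubit statevector."""
    return [sum(gate[r][c] * state[c] for c in range(4)) for r in range(4)]


s = 1 / sqrt(2)
# Basis order |q0 q1> = |00>, |01>, |10>, |11>; q0 is the CNOT control.
H_ON_Q0 = [[s, 0, s, 0], [0, s, 0, s], [s, 0, -s, 0], [0, s, 0, -s]]
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]


def bell_round_trip_probabilities(num_cnot_pairs):
    """Prepare |Phi+>, insert CNOT pairs, apply the Bell measurement, return outcome probabilities."""
    state = [1.0, 0.0, 0.0, 0.0]
    state = apply(H_ON_Q0, state)
    state = apply(CNOT, state)            # |Phi+> = (|00> + |11>)/sqrt(2)
    for _ in range(2 * num_cnot_pairs):   # folding: each CNOT pair is the identity
        state = apply(CNOT, state)
    state = apply(CNOT, state)            # Bell basis measurement: CNOT, then H
    state = apply(H_ON_Q0, state)
    return [abs(amplitude) ** 2 for amplitude in state]


probs_unfolded = bell_round_trip_probabilities(0)
probs_folded = bell_round_trip_probabilities(2)
print(probs_unfolded, probs_folded)
```

In the ideal case folding leaves the outcome unchanged (all weight on 00), so any drop in the measured success probability as pairs are added can be attributed to the noise of the extra two-qubit gates.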

The general success probabilities are plotted in Fig. 17.

Figure 17: Median probability of successfully preparing and measuring independent copies of Bell states, (a) $|\Psi^{+}\rangle$ and (b) $|\Phi^{+}\rangle$, on ibm_sherbrooke.

For the single Bell state preparations, we extract the marginal distributions of each subset and plot the mean probability of observing counts of each Bell state. The mean is evaluated using 14 executions of these experiments on ibm_sherbrooke.

Figure 18: Mean probability of successfully preparing independent copies of Bell states combined with Bell state measurements, and mean probability of successfully observing idle qubits in the $|00\rangle$ state. (a) (Top) Bell states prepared on qubit subset $\mathcal{A}$. (Bottom) Qubit subset $\mathcal{B}$ remains idle. (b) (Top) Qubit subset $\mathcal{A}$ remains idle. (Bottom) Bell states prepared on qubit subset $\mathcal{B}$.

The distinct separation between the success probabilities of isolated Bell state preparation on either $[q_{a},q_{b}]$ or $[q_{c},q_{d}]$ could be caused by individual two-qubit gate error rates, indicating that a coupler between particular pairs of qubits could be more stable than those between neighboring qubits. Another cause could be the choice of hardware qubits combined with circuit optimization options (see Appendix G).

On Ankaa-2 we prepared the state $|\Psi^{+}\rangle\otimes|0\rangle\otimes|0\rangle$ and observed that over 75% of the observed bitstrings correspond to the correct Bell state. The highest number of counts is returned in the all-zero bitstring, indicating that the state was prepared and measured correctly while the idle qubits remained idle. Preparing the state $|0\rangle\otimes|0\rangle\otimes|\Psi^{+}\rangle$, we observe that between 71% and 72% of the observed bitstrings correspond to the correct Bell state. However, preparing and measuring the state $|\Psi^{+}\rangle\otimes|\Psi^{+}\rangle$ showed a sharp decline in counts observed in the expected bitstrings.

| Stretch | $\ket{\Phi^{+}}_{A}$ | $\ket{\Phi^{+}}_{B}$ | $\ket{\Psi^{+}}_{A}$ | $\ket{\Psi^{+}}_{B}$ |
|---|---|---|---|---|
| 0 | 0.94(1) | 0.96(2) | 0.91(2) | 0.96(3) |
| 2 | 0.92(1) | 0.94(2) | 0.89(2) | 0.94(3) |
| 4 | 0.89(2) | 0.90(2) | 0.86(2) | 0.92(4) |

Table 1: Mean and standard error of measuring counts in the target Bell state.

Connecting this characterization back to the nonlocal game as a benchmark: the game construction assumes the players are separated in space and classical communication is not possible. However, the implementation on near-term hardware will likely use physical qubits that are connected via tunable couplings. If correlated noise is significant when executing simultaneous multi-qubit gates on non-overlapping qubit subsets, this can affect the win rate of the players. For the Bell state example we observe that this affects the ability to implement and measure two identical states. We believe that correlated noise may again impede the performance of vertex questions. Finally, we consider the impact of hardware noise on the resource state $|\Psi\rangle$ shared by Alice and Bob. With mirrored unitary circuits [47], we measure the probability of applying $\mathcal{U}_{R}\mathcal{U}^{\dagger}_{R}$ and successfully returning to the initial all-zero register. Testing the four-qubit unitary on ibm_sherbrooke, Ankaa-2, and Ankaa-3 multiple times, we find that the success rate fluctuates depending on the hardware, the qubit subset, and the choice of resource state.

On ibm_sherbrooke the success rate of the mirrored four-qubit unitary was 19.43%. On Ankaa-2, the mirrored four-qubit unitary of the original $G_{14}$ strategy had a success rate $<10\%$. Specifically, on September 29, 2024 the mirrored unitaries successfully returned to the initial state $|0\rangle^{\otimes 4}$ at the following rates: on qubit subset (9,10,17,16), 8.06%; (2,3,10,9), 6.49%; (9,10,16,17), 5.66%; (2,9,16,23), 8.06%; and (2,3,9,10), 7.32%. The circuits on Ankaa-2 were compiled with PARTIAL rewiring. For Ankaa-3, the mirrored four-qubit unitary success rate was much higher. On September 30, 2024 the mirrored unitaries successfully returned to the initial state $|0\rangle^{\otimes 4}$ at the following rates: on qubit subset (0,1,4,3), 55.32%; and (0,1,3,4), 33.08%. The circuits on Ankaa-3 were compiled with NAIVE rewiring.

The mirrored circuits are much deeper than the resource state preparation alone, and contain more multi-qubit operations. Since noisy hardware can better prepare shallower, sparser resource state constructions, the mirror fidelity provides a pessimistic estimate of the fidelity of the resource state preparation. However, we find it informative to compare the mirror fidelity of the arbitrary four-qubit unitary to the mirror fidelity of the shared states used in the Bell state strategy, which we measured 14 times during one week using ibm_sherbrooke. For this set of shared states the mean success probability was $91.56\pm 0.91\%$.

Overall, what we can infer from these individual characterizations is how well hardware can generate nonlocal correlations (in the resource state preparation), how well independent qubits can be controlled (via the players’ entangled operations), and finally the robustness of the players’ answers to readout errors. The development of a full predictive model is beyond the scope of this work, but from the initial characterizations of the game components it is clear that improving individual components can significantly impact the overall win rate. This is of particular importance in the $G_{14}$ game, where the separation between the classical and quantum strategies is small.

6 Statistical fluctuations and Sample Complexity of Estimating the Win Rate

On near-term quantum devices, the win rate of each circuit (question) is estimated by statistical sampling, using independently drawn samples to estimate the probability that the players return the correct answers. Finite sample effects lead to statistical fluctuations. In this section, we derive an upper bound on the number of individual samples (shots) to draw from a prepared state to sufficiently assess whether a circuit has correctly answered the referee’s question.

In the interactive nonlocal game setup, the scenario is repeated with random questions until the referee is satisfied with the outcome. We consider how to obtain a low error estimate of the win rate with high probability using a finite number of repetitions.

Let $n$ be the number of rounds performed, where each round consists of the referee asking the players all $m$ possible questions once and checking their answers using the rule function $\lambda(a|q)$. In the context of quantum hardware, this can be viewed as the execution of $m$ quantum circuits with $n$ shots per circuit.

Because the outcome of each question is binary, i.e., $\lambda(a|q)\in\{0,1\}$, we model the outcome of question $q_{j}$ as a Bernoulli random variable $\lambda_{j}$ with an unknown success probability $p_{j}$. The random variable describing the game value of a single round is $\omega=\frac{1}{m}\sum_{j}^{m}\lambda_{j}$. We denote the empirical estimate of the win rate with $n$ rounds as $\bar{\omega}=\frac{1}{n}\sum_{i}^{n}\omega_{i}$, where $\omega_{1},\dots,\omega_{n}\sim\omega$ are i.i.d. samples. Under these mild assumptions, we derive an expression for the number of samples needed to accurately estimate the win rate within error $\epsilon$.

Theorem 6.1.

Let $\bar{\omega}=\frac{1}{n}\sum_{i}^{n}\omega_{i}$ be the empirical estimate of the game win rate after $n$ rounds, where each round $\omega_{i}$ is independent and identically distributed (i.i.d.). Then, for any $\epsilon>0$,

$$P(|\bar{\omega}-\mathbb{E}[\bar{\omega}]|\geq\epsilon)\leq 2\exp\left(\frac{-n\epsilon^{2}/2}{\bar{\sigma}^{2}/m+\epsilon/3}\right), \tag{17}$$

where $m=|Q|$ is the number of questions and $\bar{\sigma}^{2}=\frac{1}{m}\sum_{j}^{m}p_{j}(1-p_{j})$, where $p_{j}$ is the win rate of question $q_{j}$.

Proof.

We make use of the Bernstein inequality [48, 49], which is restated here for convenience. Let $S_{n}=\sum_{i}^{n}X_{i}$ be the sum of independent zero-mean random variables $X_{1},\dots,X_{n}$ with $|X_{i}|\leq c$ almost surely. Then, for any $\epsilon>0$,

$$P(|S_{n}|\geq\epsilon)\leq 2\exp\left(\frac{-\epsilon^{2}/2}{\sum_{i}^{n}\mathrm{Var}[X_{i}]+c\epsilon/3}\right). \tag{18}$$

To use the inequality, we construct the sum $S_{n}=\sum_{i}^{n}\left(\omega_{i}-\mathbb{E}[\omega_{i}]\right)$, subtracting the expectation values to meet the zero-mean condition, yielding

$$P(|n\bar{\omega}-n\mathbb{E}[\bar{\omega}]|\geq\epsilon)\leq 2\exp\left(\frac{-\epsilon^{2}/2}{\sum_{i}^{n}\mathrm{Var}[\omega_{i}]+c\epsilon/3}\right). \tag{19}$$

The magnitude of each term is bounded, $|\omega_{i}-\mathbb{E}[\omega_{i}]|\leq 1=c$. Furthermore, because each round $\omega_{i}\sim\omega$ is i.i.d., $\sum_{i}^{n}\mathrm{Var}[\omega_{i}]=n\bar{\sigma}^{2}/m$, where $\bar{\sigma}$ is defined above and we have used the independence of the question outcomes within a round together with the fact that the variance of a Bernoulli random variable is $p(1-p)$. Substituting $n\epsilon$ in place of $\epsilon$ gives (17). ∎

Corollary 6.2 (Sample complexity).

With probability $1-\delta$, we obtain an $\epsilon$-close estimate of $\omega$ using at least

$$n\geq 2\log(2/\delta)\left(\frac{\bar{\sigma}^{2}}{m\epsilon^{2}}+\frac{1}{3\epsilon}\right) \tag{20}$$

rounds.

Proof.

This results from setting (17) less than or equal to $\delta$ and solving for $n$. ∎
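For planning an experiment, (20) can be evaluated directly. The following is a minimal sketch with a hypothetical function name; the per-question win rates in the example (all equal to 0.9 over 88 questions) are an illustrative assumption, not measured values.

```python
from math import ceil, log


def rounds_needed(eps, delta, win_rates):
    """Smallest integer n satisfying (20).

    eps       -- target accuracy of the win rate estimate
    delta     -- allowed failure probability
    win_rates -- per-question win rates p_j (assumed known or guessed in advance)
    """
    m = len(win_rates)
    # Average Bernoulli variance over the questions (sigma-bar squared).
    sigma_bar_sq = sum(p * (1 - p) for p in win_rates) / m
    return ceil(2 * log(2 / delta) * (sigma_bar_sq / (m * eps**2) + 1 / (3 * eps)))


# Illustrative values: 88 questions, each won with probability 0.9.
example_rounds = rounds_needed(eps=0.01, delta=0.05, win_rates=[0.9] * 88)
print(example_rounds)
```

Since the $p_{j}$ are unknown before the experiment, the worst case $p_{j}=1/2$ (that is, $\bar{\sigma}^{2}=1/4$) gives a conservative choice of $n$.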

Corollary 6.3 (Confidence interval).

With $n$ rounds and with probability $1-\delta$, the error of our estimate is

$$|\bar{\omega}-\mathbb{E}[\bar{\omega}]|\leq\frac{2\log(2/\delta)}{3n}+\bar{\sigma}\sqrt{\frac{2\log(2/\delta)}{mn}}. \tag{21}$$

Proof.

This can be obtained by solving (20) for $\epsilon$ and taking the positive solution, then applying the inequality $\sqrt{x+y}\leq\sqrt{x}+\sqrt{y}$. ∎
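The half-width in (21) is straightforward to compute from the per-question win rates. The sketch below uses a hypothetical function name, and the example assumes, purely for illustration, that all 88 questions share the same win rate of 0.781; the per-question rates of the actual experiments are not used here.

```python
from math import log, sqrt


def confidence_half_width(n, delta, win_rates):
    """Right-hand side of (21) for n rounds and failure probability delta."""
    m = len(win_rates)
    # sigma-bar: root of the average Bernoulli variance over the questions.
    sigma_bar = sqrt(sum(p * (1 - p) for p in win_rates) / m)
    log_term = log(2 / delta)
    return 2 * log_term / (3 * n) + sigma_bar * sqrt(2 * log_term / (m * n))


# Illustrative values: 1024 rounds, 95% confidence, equal win rates of 0.781.
example_half_width = confidence_half_width(n=1024, delta=0.05, win_rates=[0.781] * 88)
print(example_half_width)
```

With these inputs the half-width is roughly 0.006, i.e. about 0.6 percentage points. Assuming equal win rates can only overestimate $\bar{\sigma}$ for a fixed average win rate, so this choice is conservative.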

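| Year | Provider | Device | Strategy | Shots | Win rate (%) |
|---|---|---|---|---|---|
| 2023 | IBM | Guadalupe | Original | 1024 | 78.1(6) |
| 2023 | IBM | Auckland | Original | 1024 | 83.9(6) |
| 2023 | IBM | Jakarta | Original | 1024 | 84.2(6) |
| 2023 | IBM | Manila | Original | 1024 | 84.2(6) |
| 2023 | IBM | Cairo | Original | 1024 | 84.3(6) |
| 2023 | IBM | Nairobi | Original | 1024 | 84.5(6) |
| 2023 | IBM | Quito | Original | 1024 | 84.6(6) |
| 2023 | IBM | Mumbai | Original | 1024 | 85.6(6) |
| 2023 | IBM | Lima | Original | 1024 | 85.8(6) |
| 2023 | IBM | Belem | Original | 1024 | 87.3(6) |
| 2023 | IBM | Hanoi | Original | 1024 | 92.5(5) |
| 2024 | IBM | Sherbrooke | Bell Pair | 4096 | 94.3(2) |
| 2024 | Rigetti | Ankaa-2 | Bell Pair | 2048 | 83.8(4) |
| 2024 | Rigetti | Ankaa-3 | Bell Pair | 2048 | 91.8(3) |

Table 2: Overall win rate for each device with 95% confidence intervals.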

From (20), we see two possibilities to achieve asymptotic $O(\log(1/\delta)/\epsilon)$ sampling: with a large number of questions $m$ and when $\bar{\sigma}\simeq\epsilon$. The first case is not practical because increasing the number of questions counterproductively increases the total number of circuit samples $mn$.

The second case is also hard to achieve (at present) because it requires a near-perfect strategy on high-fidelity quantum hardware. This results from $\bar{\sigma}$ being directly linked to the win rate of the questions, which depends on both the strategy and quantum hardware. Assuming all questions have equal win rate $p$ for simplicity, this requires $p(1-p)\approx\epsilon^{2}$, i.e. (again taking the positive solution) $p\approx\frac{1}{2}(1+\sqrt{1-4\epsilon^{2}})$, which is approximately $p\approx 1-\epsilon^{2}+O(\epsilon^{4})$. We expect that $O(\log(1/\delta)/\epsilon)$ sampling may become feasible for perfect strategies with improved gate fidelity, quantum error correction, or amplitude amplification. Table 2 contains all the win rates of our executed experiments with confidence intervals derived from Corollary 6.3.

7 Conclusion

Figure 19: Overall win rate for all devices tested. The symbol $\dagger$ means that the trial used the Bell pair strategy instead of the original strategy. Uncertainties on the win rates are 95% confidence intervals. Colors represent the two device providers we tested: IBM in purple and Rigetti in teal. The horizontal line represents the classical win rate threshold for this game; any strategy above this line requires quantum resources. All results are above the win rate achievable by random guessing.

We present a variational algorithm to compute novel quantum strategies for nonlocal games by encoding the rules of a nonlocal game into a Hamiltonian and employing a two-step optimization procedure. Our key insight is to optimize separately the state preparation circuit and the measurement scheme while leveraging robust circuit initialization and general techniques, such as ADAPT, during optimization. The proposed algorithm successfully reproduces known quantum strategies and has also discovered new short-depth, perfect quantum strategies for a graph on 14 vertices using four qubits. This demonstrates that variational techniques can be effectively used on classical computers to identify short-depth, optimal strategies for small examples of nonlocal games where analytic methods fail. Moreover, these techniques extend to a quantum setting, where sample-based gradient estimation is employed. However, the presence of barren plateaus is a known challenge with the training objective function, suggesting that “warm starts” or other techniques to mitigate vanishing gradients may be necessary for scaling these methods to larger nonlocal games.

We further illustrate how the execution of a nonlocal game strategy can serve as an application-level benchmark for quantum devices. The win rates of vertex and edge questions in these games are complementary: the win rate of vertex questions reflects a device’s ability to perform nonlocal operations and maintain gate fidelity, while the win rate of edge questions can help confirm the utilization of entanglement across a device. Although none of the devices we tested surpassed the quantum advantage threshold, primarily due to noise in circuit execution, we believe our results can be improved by optimizing the transpilation of the individual circuits before execution and by controlling the device calibration schedules. It is also worth noting that our experiments do not provide a full proof of quantum advantage, given that the particles are not spatially separated enough to guarantee that classical communication does not happen during the experiment; they do, however, provide validation that the quantum hardware in question outputs results consistent with the hypotheses of quantum theory. Recent work has begun to outline ways of guaranteeing a “loophole-free” full verification of quantum advantage by compiling a multi-prover nonlocal game strategy into a single-prover strategy [50, 10, 11], and we leave it to future work to investigate the feasibility and implications of these schemes for the games we studied. In a recent survey [51], the authors outlined five desirable properties for a good quantum benchmark, and in our work we argued that the win rate from nonlocal game strategies fits all five points:

  • Relevant: The win rate measures the ability to prepare, control, and manipulate entangled states.

  • Reproducible: Strategy and questions are fixed.

  • Fair: Device independent and the executed circuits are shallow.

  • Verifiable: Straightforward to calculate the win rate via sampling.

  • Usable: Circuits can be made accessible via QASM files and can easily be ported to other quantum devices.

We believe that the continued study and extensions of nonlocal games, in particular graph-based games, can enable the design of more appropriate quantum benchmarks as quantum devices scale and hardware architectures become more complex. Ultimately, our research not only advances the understanding of variational quantum strategies but also lays the foundation for leveraging quantum machine learning techniques to explore other nonlocal games strategies beyond the reach of classical methods.

Acknowledgments

Thanks to David Roberson and Eleanor Rieffel for providing valuable feedback. NW, JF, and COM were funded by grants from the US Department of Energy, Office of Science, National Quantum Information Science Research Centers, Co-Design Center for Quantum Advantage under contract number DE-SC0012704. JF and COM were partially supported by the Laboratory Directed Research and Development Program and Mathematics for Artificial Reasoning for Scientific Discovery investment at the Pacific Northwest National Laboratory, a multiprogram national laboratory operated by Battelle for the U.S. Department of Energy under Contract DE-AC05-76RLO1830. S. C. is supported in part by the DOE Advanced Scientific Computing Research (ASCR) Accelerated Research in Quantum Computing (ARQC) Program under field work proposal ERKJ354. K. H. was supported by the DOE Advanced Scientific Computing Research (ASCR) Pathfinder Testbed Program under FWP ERKJ418.

This research used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725.

This manuscript has been authored in part by UT-Battelle, LLC, under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for the United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan.

References

  • [1] Dalzell AM, McArdle S, Berta M, Bienias P, Chen CF, Gilyén A, et al. Quantum algorithms: A survey of applications and end-to-end complexities. arXiv preprint arXiv:2310.03011. 2023.
  • [2] Rieffel EG, Asanjan AA, Alam MS, Anand N, Neira DEB, Block S, et al. Assessing and advancing the potential of quantum computing: A NASA case study. Future Generation Computer Systems. 2024.
  • [3] Proctor T, Young K, Baczewski AD, Blume-Kohout R. Benchmarking quantum computers. Nature Reviews Physics. 2025:1-14.
  • [4] Emerson J, Alicki R, Życzkowski K. Scalable noise estimation with random unitary operators. Journal of Optics B: Quantum and Semiclassical Optics. 2005;7(10):S347.
  • [5] Knill E, Leibfried D, Reichle R, Britton J, Blakestad RB, Jost JD, et al. Randomized benchmarking of quantum gates. Physical Review A—Atomic, Molecular, and Optical Physics. 2008;77(1):012307.
  • [6] Magesan E, Gambetta JM, Johnson BR, Ryan CA, Chow JM, Merkel ST, et al. Efficient measurement of quantum gate error by interleaved randomized benchmarking. Physical review letters. 2012;109(8):080505.
  • [7] Helsen J, Roth I, Onorati E, Werner AH, Eisert J. General framework for randomized benchmarking. PRX Quantum. 2022;3(2):020357.
  • [8] Chen YH, Baldwin CH. Randomized Benchmarking with Leakage Errors; 2025. Available from: https://arxiv.org/abs/2502.00154.
  • [9] Cross AW, Bishop LS, Sheldon S, Nation PD, Gambetta JM. Validating quantum computers using randomized model circuits. Physical Review A. 2019;100(3):032328.
  • [10] Natarajan A, Zhang T. Bounding the quantum value of compiled nonlocal games: from CHSH to BQP verification. In: 2023 IEEE 64th Annual Symposium on Foundations of Computer Science (FOCS). IEEE; 2023. p. 1342-8.
  • [11] Kalai Y, Lombardi A, Vaikuntanathan V, Yang L. Quantum advantage from any non-local game. In: Proceedings of the 55th Annual ACM Symposium on Theory of Computing; 2023. p. 1617-28.
  • [12] Šupić I, Bowles J. Self-testing of quantum systems: a review. Quantum. 2020;4:337.
  • [13] Hart O, Stephen DT, Williamson DJ, Foss-Feig M, Nandkishore R. Playing nonlocal games across a topological phase transition on a quantum computer. arXiv preprint arXiv:2403.04829. 2024.
  • [14] Drmota P, Main D, Ainley E, Agrawal A, Araneda G, Nadlinger D, et al. Experimental Quantum Advantage in the Odd-Cycle Game. Physical Review Letters. 2025;134(7):070201.
  • [15] Bell JS. On the Einstein Podolsky Rosen paradox. Physics Physique Fizika. 1964;1(3):195.
  • [16] Clauser JF, Horne MA, Shimony A, Holt RA. Proposed experiment to test local hidden-variable theories. Physical review letters. 1969;23(15):880.
  • [17] Cleve R, Hoyer P, Toner B, Watrous J. Consequences and limits of nonlocal strategies. In: Proceedings. 19th IEEE Annual Conference on Computational Complexity, 2004. IEEE; 2004. p. 236-49.
  • [18] Reichardt BW, Unger F, Vazirani U. A classical leash for a quantum system: Command of quantum systems via rigidity of CHSH games. arXiv preprint arXiv:1209.0448. 2012.
  • [19] Fritz T. Tsirelson’s problem and Kirchberg’s conjecture. Reviews in Mathematical Physics. 2012;24(05):1250012.
  • [20] Slofstra W. Tsirelson’s problem and an embedding theorem for groups arising from non-local games. Journal of the American Mathematical Society. 2020;33(1):1-56.
  • [21] Ji Z, Natarajan A, Vidick T, Wright J, Yuen H. MIP*=RE. Communications of the ACM. 2021;64(11):131-8.
  • [22] Cameron PJ, Montanaro A, Newman MW, Severini S, Winter A. On the Quantum Chromatic Number of a Graph. The Electronic Journal of Combinatorics. 2007;14(1):R81.
  • [23] Mančinska L, Roberson DE. Quantum homomorphisms. Journal of Combinatorial Theory, Series B. 2016;118:228-67.
  • [24] Daniel AK, Zhu Y, Alderete CH, Buchemmavari V, Green AM, Nguyen NH, et al. Quantum computational advantage attested by nonlocal games with the cyclic cluster state. Physical Review Research. 2022;4(3):033068.
  • [25] Mančinska L, Roberson DE. Oddities of Quantum Colorings. Baltic Journal on Modern Computing. 2016;4(4):846-59.
  • [26] Lalonde O. On the Quantum Chromatic Numbers of Small Graphs. arXiv preprint arXiv:2311.08194. 2023. Available from: https://arxiv.org/abs/2311.08194.
  • [27] Lupini M, Mančinska L, Paulsen VI, Roberson DE, Scarpa G, Severini S, et al. Perfect strategies for non-local games. Mathematical Physics, Analysis and Geometry. 2020;23(1):7.
  • [28] Helton JW, Mousavi H, Nezhadi SS, Paulsen VI, Russell TB. Synchronous values of games. In: Annales Henri Poincaré. vol. 25. Springer; 2024. p. 4357-97.
  • [29] Helton JW, Meyer KP, Paulsen VI, Satriano M. Algebras, synchronous games, and chromatic numbers of graphs. New York J Math. 2019;25:328-61.
  • [30] Ortiz CM, Paulsen VI. Quantum graph homomorphisms via operator systems. Linear Algebra and its Applications. 2016;497:23-43.
  • [31] Mančinska L, Paulsen VI, Todorov IG, Winter A. Products of synchronous games. arXiv preprint arXiv:2109.12039. 2021.
  • [32] Cameron PJ, Montanaro A, Newman MW, Severini S, Winter A. On the quantum chromatic number of a graph. arXiv preprint quant-ph/0608016. 2006.
  • [33] Clauser JF, Horne MA, Shimony A, Holt RA. Proposed Experiment to Test Local Hidden-Variable Theories. Physical Review Letters. 1969 oct;23(15):880-4.
  • [34] Kempe J, Kobayashi H, Matsumoto K, Toner B, Vidick T. Entangled games are hard to approximate. SIAM Journal on Computing. 2011;40(3):848-77.
  • [35] Bharti K, Haug T, Vedral V, Kwek LC. Machine learning meets quantum foundations: A brief survey. AVS Quantum Science. 2020 jul;2(3). Available from: https://doi.org/10.1116%2F5.0007529.
  • [36] Bharti K, Haug T, Vedral V, Kwek LC. How to Teach AI to Play Bell Non-Local Games: Reinforcement Learning; 2019.
  • [37] Grimsley HR, Economou SE, Barnes E, Mayhall NJ. An adaptive variational algorithm for exact molecular simulations on a quantum computer. Nature communications. 2019;10(1):3007.
  • [38] Mitarai K, Negoro M, Kitagawa M, Fujii K. Quantum circuit learning. Physical Review A. 2018 sep;98(3). Available from: https://doi.org/10.1103%2Fphysreva.98.032309.
  • [39] Schuld M, Bergholm V, Gogolin C, Izaac J, Killoran N. Evaluating analytic gradients on quantum hardware. Physical Review A. 2019 mar;99(3). Available from: https://doi.org/10.1103%2Fphysreva.99.032331.
  • [40] Tura J, Augusiak R, Sainz AB, Vértesi T, Lewenstein M, Acín A. Detecting nonlocality in many-body quantum states. Science. 2014;344(6189):1256-8.
  • [41] Harris SJ. Universality of graph homomorphism games and the quantum coloring problem. In: Annales Henri Poincaré. Springer; 2024. p. 1-36.
  • [42] Paulsen VI, Todorov IG. Quantum chromatic numbers via operator systems. The Quarterly Journal of Mathematics. 2015;66(2):677-92.
  • [43] Bravyi S, Gosset D, König R, Tomamichel M. Quantum advantage with noisy shallow circuits. Nature Physics. 2020;16(10):1040-5.
  • [44] Cross AW, Bishop LS, Smolin JA, Gambetta JM. Open quantum assembly language. arXiv preprint arXiv:1707.03429. 2017.
  • [45] Rigetti. Qiskit-Rigetti Plugin. GitHub; 2024. https://github.com/rigetti/qiskit-rigetti.
  • [46] Hamilton KE, Kharazi T, Morris T, McCaskey AJ, Bennink RS, Pooser RC. Scalable quantum processor noise characterization. In: 2020 IEEE International Conference on Quantum Computing and Engineering (QCE). IEEE; 2020. p. 430-40.
  • [47] Mayer K, Hall A, Gatterman T, Halit SK, Lee K, Bohnet J, et al. Theory of mirror benchmarking and demonstration on a quantum computer. arXiv preprint arXiv:2108.10431. 2021.
  • [48] Bernstein S. On a modification of Chebyshev’s inequality and of the error formula of Laplace. Ann Sci Inst Sav Ukraine, Sect Math. 1924;1(4):38-49.
  • [49] Zhang H, Chen S. Concentration Inequalities for Statistical Inference. Communications in Mathematical Research. 2021;37(1):1-85.
  • [50] Grilo AB. A Simple Protocol for Verifiable Delegation of Quantum Computation in One Round. In: 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik; 2019.
  • [51] Acuaviva A, Aguirre D, Peña R, Sanz M. Benchmarking Quantum Computers: Towards a Standard Performance Evaluation Approach. arXiv preprint arXiv:2407.10941. 2024.
  • [52] Cerezo M, Arrasmith A, Babbush R, Benjamin SC, Endo S, Fujii K, et al. Variational quantum algorithms. Nature Reviews Physics. 2021;3(9):625-44.
  • [53] Warren A, Zhu L, Mayhall NJ, Barnes E, Economou SE. Adaptive variational algorithms for quantum Gibbs state preparation. arXiv preprint arXiv:2203.12757. 2022.
  • [54] Sherbert K, Furches J, Shirali K, Economou SE, Marrero CO. Adaptive Quantum Generative Training using an Unbounded Loss Function. In: 2024 IEEE International Conference on Quantum Computing and Engineering (QCE). vol. 1. IEEE; 2024. p. 1731-8.
  • [55] Childs AM, Wiebe N. Hamiltonian simulation using linear combinations of unitary operations. arXiv preprint arXiv:1202.5822. 2012.
  • [56] Chakraborty S. Implementing any linear combination of unitaries on intermediate-term quantum computers. Quantum. 2024;8:1496.
  • [57] Catli AB, Simon S, Wiebe N. Exponentially Better Bounds for Quantum Optimization via Dynamical Simulation. arXiv preprint arXiv:2502.04285. 2025.

Appendix

Appendix A Data Availability

The code used to generate the data and figures in this article can be found at
https://github.com/jfurches/nonlocalgames. The authors will make the data collected for noise characterization available upon reasonable request.

Appendix B ADAPT-VQE

The Adaptive Derivative-Assembled Pseudo-Trotter ansatz Variational Quantum Eigensolver (ADAPT-VQE) is a hybrid quantum-classical algorithm designed to dynamically construct an efficient and compact ansatz for molecular simulations on quantum hardware [37]. It enhances the traditional Variational Quantum Eigensolver (VQE) by adaptively building a problem-specific ansatz for the quantum state. Unlike traditional approaches such as Unitary Coupled Cluster (UCC), which rely on pre-defined and often redundant wavefunction ansätze, ADAPT-VQE grows the ansatz iteratively by selecting operators that maximize energy recovery at each step. This adaptive approach minimizes the number of parameters and quantum gates required, making it well-suited for noisy intermediate-scale quantum (NISQ) devices.

ADAPT-VQE operates by measuring the gradient of the Hamiltonian’s expectation value with respect to each operator in a predefined operator pool. The operator with the largest gradient is added to the ansatz, and its parameter is optimized alongside previously added parameters using a classical variational optimizer. This process is repeated until the norm of the gradient vector falls below a threshold, ensuring convergence to the desired accuracy.

More concretely: given variational parameters $\boldsymbol{\theta}^{(k)}=(\theta_{1},\dots,\theta_{k})$ and the operator pool $\mathcal{A}=\{A^{(1)},A^{(2)},\dots,A^{(N)}\}$, the ansatz in iteration $k+1$ of the algorithm may be written as

$$|\psi_{k+1}(\boldsymbol{\theta}^{(k+1)})\rangle=e^{-i\theta_{k+1}A_{k+1}}|\psi_{k}(\boldsymbol{\theta}^{(k)})\rangle.$$

Notice that the ansatz at iteration $k$ is grown by appending operator $A_{k+1}$ with coefficient $\theta_{k+1}$; the operator is chosen by measuring the energy gradients $\left|\left.\partial\langle H\rangle/\partial\theta_{k+1}\right|_{\theta_{k+1}=0}\right|$ for each operator in the pool and selecting the one with the largest gradient. For this step, it can be shown that

$$\left|\left.\partial\langle H\rangle/\partial\theta_{k+1}\right|_{\theta_{k+1}=0}\right|=\left|\langle\psi_{k}(\boldsymbol{\theta}^{(k)})|\left[A_{k+1},H\right]|\psi_{k}(\boldsymbol{\theta}^{(k)})\rangle\right|,$$

where the right-hand side can be efficiently measured on a quantum processor as the size of the problem scales. The pool operator gradient-measurement step is followed by a convergence check: if the pool operator gradient norm is smaller than a threshold $\varepsilon$, the calculation is terminated; if not, the iteration procedure continues. The ansatz-growing step is followed by a VQE optimization of all variational parameters.

By tailoring the ansatz to the problem at hand, ADAPT-VQE achieves high accuracy with significantly reduced circuit depth compared to fixed-ansatz methods. This variational technique has been studied extensively [52], and it has been extended to tackle problems in quantum generative training [53, 54].
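The loop described in this appendix can be sketched on a small classical simulation. The following is a minimal illustration under stated assumptions, not the implementation used in this work: the two-qubit Hamiltonian `H`, the four-element Pauli pool `POOL`, the reference state `REF`, and the plain gradient-descent inner optimizer with parameter-shift gradients are all hypothetical choices made to keep the example short. The parameter-shift formula used here relies on every pool operator squaring to the identity.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Toy problem (an assumption, not the game Hamiltonian of the paper).
H = -(np.kron(X, X) + np.kron(Z, Z))
POOL = [np.kron(Y, X), np.kron(X, Y), np.kron(Y, I2), np.kron(I2, Y)]
REF = np.array([1, 0, 0, 0], dtype=complex)  # |00>


def prepare(ops, thetas):
    """Apply exp(-i theta_k A_k) in order; uses A_k @ A_k = I for Pauli strings."""
    psi = REF.copy()
    for A, t in zip(ops, thetas):
        psi = np.cos(t) * psi - 1j * np.sin(t) * (A @ psi)
    return psi


def energy(ops, thetas):
    psi = prepare(ops, thetas)
    return float(np.real(np.vdot(psi, H @ psi)))


def pool_gradients(psi):
    """|<psi|[A, H]|psi>| for every pool operator A."""
    return np.array([abs(np.vdot(psi, (A @ H - H @ A) @ psi)) for A in POOL])


def optimize(ops, thetas, lr=0.2, steps=300):
    """Gradient descent with parameter-shift gradients (stand-in for the VQE step)."""
    thetas = np.array(thetas, dtype=float)
    for _ in range(steps):
        grad = np.zeros_like(thetas)
        for k in range(len(thetas)):
            shift = np.zeros_like(thetas)
            shift[k] = np.pi / 4
            grad[k] = energy(ops, thetas + shift) - energy(ops, thetas - shift)
        thetas -= lr * grad
    return thetas


def adapt_vqe(eps=1e-6, max_ops=4):
    ops, thetas = [], []
    for _ in range(max_ops):
        grads = pool_gradients(prepare(ops, thetas))
        if np.linalg.norm(grads) < eps:  # convergence check on the pool gradient norm
            break
        ops.append(POOL[int(np.argmax(grads))])        # grow the ansatz
        thetas = optimize(ops, list(thetas) + [0.0])   # re-optimize all parameters
    return ops, np.array(thetas), energy(ops, thetas)


ops, thetas, e_min = adapt_vqe()
```

For this toy instance a single pool operator suffices: the loop selects one generator, rotates $|00\rangle$ into a Bell state, and then stops because every pool gradient vanishes on an eigenstate of `H`.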

Appendix C Original $G_{14}$ Strategy

Here we outline the perfect quantum strategy for the graph $G_{14}$ using $4$ colors as detailed in [25]. The authors construct this strategy by leveraging a 4-dimensional real orthogonal representation of the graph and a transformation derived from quaternion multiplication outlined in [22]. Here is an outline of their construction:

  1. Orthogonal Representation: To each vertex $v\in V(G_{14})$ we assign a 4-dimensional real unit vector $\varphi(v)$ as follows:

    • To each vertex in $G_{13}$ (see Figure 4), assign a 3-dimensional vector with entries in $\{-1,0,1\}$ such that two vertices in $G_{13}$ are adjacent if and only if their corresponding vectors are orthogonal.

    • Extend each 3-dimensional vector $(x,y,z)^{T}$ to a 4D vector by appending a zero: $(x,y,z,0)^{T}$.

    • Assign the apex vertex $\Omega$ of $G_{14}$ the 4D vector $(0,0,0,1)^{T}$.

    • Normalize each vector to be a unit vector and let $\varphi(v)$ be the vector corresponding to vertex $v\in G_{14}$.

    • This assignment guarantees that if vertices $u$ and $v$ are adjacent ($u\sim v$), their vectors are orthogonal ($\varphi(u)^{T}\varphi(v)=0$).

    • To each vector $\varphi(v)=(r_{0},r_{1},r_{2},r_{3})^{T}$, we associate a set of four mutually orthogonal unit vectors, $\{\varphi_{k}(v)\}_{k=0}^{3}$, where each vector is a column of the following matrix:

      $$M_{v}=\begin{pmatrix}r_{0}&-r_{1}&-r_{2}&-r_{3}\\ r_{1}&r_{0}&r_{3}&-r_{2}\\ r_{2}&-r_{3}&r_{0}&r_{1}\\ r_{3}&r_{2}&-r_{1}&r_{0}\end{pmatrix}$$

      So, $\varphi_{0}(v)=(r_{0},r_{1},r_{2},r_{3})^{T}$, $\varphi_{1}(v)=(-r_{1},r_{0},-r_{3},r_{2})^{T}$, and so on. These four vectors form the measurement basis for vertex $v$.

  2. State and Projectors $P_{k}(v)$: In the corresponding nonlocal game, Alice and Bob share a 4-dimensional maximally entangled state, $|\Psi^{+}\rangle=\frac{1}{2}\sum_{j=0}^{3}|j\rangle_{A}\otimes|j\rangle_{B}$. Upon receiving a vertex $v$, a player performs a measurement using projectors $\{P_{k}(v)\}_{k=0}^{3}$, where each projector is defined by the corresponding basis vector:

    $$P_{k}(v)=\varphi_{k}(v)\varphi_{k}(v)^{T}$$

  3. Joint Probabilities $P(a,b|u,v)$: The probability that Alice and Bob obtain outcomes (colors) $a$ and $b$ for questions $u$ and $v$, respectively, is given by:

    $$P(a,b|u,v)=\frac{1}{4}\mathrm{Tr}(P_{a}(u)P_{b}(v))=\frac{1}{4}|\varphi_{a}(u)^{T}\varphi_{b}(v)|^{2}$$

    This formula ensures that the winning conditions of the coloring game are met with certainty. Specifically, if $u=v$, then $P(a,b|u,u)=\frac{1}{4}\delta_{ab}$. If $u\sim v$, then $P(a,a|u,v)=0$ for all $a$.
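The construction above can be checked numerically. The sketch below assumes that the $G_{13}$ vectors are the 13 nonzero vectors of $\{-1,0,1\}^{3}$ taken up to an overall sign, with adjacency defined by orthogonality; this choice, the vertex ordering, and the helper names are illustrative and are not taken from [25] or from the repository of this work.

```python
import itertools
import numpy as np


def g13_vectors():
    """Nonzero vectors in {-1,0,1}^3, one per +/- pair (first nonzero entry is +1)."""
    vecs = []
    for v in itertools.product((-1, 0, 1), repeat=3):
        nonzero = [x for x in v if x != 0]
        if nonzero and nonzero[0] == 1:
            vecs.append(v)
    return vecs


def measurement_basis(r):
    """Matrix M_v whose columns are the four basis vectors built from phi(v) = r."""
    r0, r1, r2, r3 = r
    return np.array([
        [r0, -r1, -r2, -r3],
        [r1,  r0,  r3, -r2],
        [r2, -r3,  r0,  r1],
        [r3,  r2, -r1,  r0],
    ])


# phi(v): 13 normalized vectors padded with a zero, plus the apex (0, 0, 0, 1).
phi = [np.array(v + (0,), dtype=float) / np.linalg.norm(v) for v in g13_vectors()]
phi.append(np.array([0.0, 0.0, 0.0, 1.0]))
bases = [measurement_basis(r) for r in phi]


def joint_probs(u, v):
    """P(a,b|u,v) = |phi_a(u)^T phi_b(v)|^2 / 4 as a 4x4 array indexed by (a, b)."""
    return (bases[u].T @ bases[v]) ** 2 / 4


# Adjacent pairs are the orthogonal pairs (assumed definition of the edges).
edges = [(u, v) for u in range(14) for v in range(14)
         if u != v and abs(phi[u] @ phi[v]) < 1e-12]

# Largest probability of both players answering the same color on an edge.
worst_edge = max(np.trace(joint_probs(u, v)) for u, v in edges)
# Largest deviation from P(a,b|u,u) = delta_ab / 4 on equal questions.
worst_same = max(np.abs(joint_probs(u, u) - np.eye(4) / 4).max() for u in range(14))
```

Both `worst_edge` and `worst_same` vanish up to floating-point error, which is the statement that adjacent questions never receive equal colors and equal questions always do.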

One difficulty in implementing this strategy comes from the fact that the measurement schemes are given as projections, which would need to be decomposed, for example, using Linear Combinations of Unitaries (LCU) [55]. Standard LCU can be resource-intensive: it requires $\lceil\log(M)\rceil$ ancilla qubits, where $M$ is the number of unitaries in the linear combination, as well as a "prepare" unitary and a sophisticated multi-qubit controlled "select" unitary for each projection separately. Other techniques like ancilla-free LCU [56] might be able to reduce this overhead, but assessing the feasibility of implementing this strategy on near-term hardware is non-trivial and outside the scope of this work.

Appendix D Measurement Parameters

Alice's measurement parameters of the $G_{14}$ strategy are contained within
data/g14_constrained_u3ry/g14_state.json with the key phi. Constructing this into a NumPy array should return a tensor of shape $(1,14,2,4)$, corresponding to (players, questions, qubits, parameters). This tensor can be transformed to produce the conjugated measurement angles for Bob, as seen in U3RyLayer in measurement.py.
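A minimal loading sketch follows. It assumes the value stored under the key phi is a nested JSON list of numbers; the function name `load_alice_angles` is hypothetical and is not part of the repository.

```python
import json

import numpy as np


def load_alice_angles(path):
    """Read the 'phi' entry of a state file and check the expected tensor shape."""
    with open(path) as f:
        state = json.load(f)
    phi = np.asarray(state["phi"], dtype=float)
    # Axes: (players, questions, qubits, parameters).
    if phi.shape != (1, 14, 2, 4):
        raise ValueError(f"unexpected shape {phi.shape}, expected (1, 14, 2, 4)")
    return phi
```

Calling `load_alice_angles("data/g14_constrained_u3ry/g14_state.json")` from the repository root would then return Alice's angles, with `phi[0, q]` holding the two-qubit parameter block for question `q`.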

Appendix E Hyperparameters

Problem | Hyperparameter | Value
CHSH | ADAPT Grad Max $\epsilon_{\theta}$ | $10^{-3}$
CHSH | BFGS Grad Max $\epsilon_{\phi}$ | $10^{-5}$
CHSH | DPO Tolerance $\Delta E$ | $10^{-3}$
NPS | Same as CHSH |
$G_{14}$ | ADAPT Grad Max $\epsilon_{\theta}$ | $10^{-6}$
$G_{14}$ | BFGS Grad Max $\epsilon_{\phi}$ | $10^{-5}$
$G_{14}$ | DPO Tolerance $\Delta E$ | $10^{-6}$
Table 3: Hyperparameters for DPO experiments

We give the algorithm hyperparameters for our experiments. The parameter $\epsilon_{\theta}$ refers to the convergence criterion of ADAPT used to prepare the shared state $|\psi(\theta)\rangle$. ADAPT finishes when the maximum pool gradient element falls below the threshold, $\max_{A_{i}}\left|\langle[H,A_{i}]\rangle\right|<\epsilon_{\theta}$. Similarly, the parameter $\epsilon_{\phi}$ controls the convergence of the second phase of DPO, as the BFGS optimizer halts when $\max_{i}\left|\partial\langle H\rangle/\partial\phi_{i}\right|<\epsilon_{\phi}$. Finally, $\Delta E$ controls the termination of the overall DPO procedure, which ends when $\langle H^{(k-1)}\rangle-\langle H^{(k)}\rangle<\Delta E$ at iteration $k$.
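How the three thresholds interact can be illustrated with a toy alternating minimization. This is only a sketch of the stopping logic: the quadratic `energy` function is hypothetical, and plain one-dimensional gradient descent stands in for both ADAPT (the $\theta$ phase) and BFGS (the $\phi$ phase).

```python
def energy(theta, phi):
    """Hypothetical convex stand-in for <H>; minimum -1.4 at (1.6, -2.4)."""
    return (theta - 1.0) ** 2 + (phi + 2.0) ** 2 + 0.5 * theta * phi


def minimize_1d(grad, x, tol, lr=0.1, max_iter=10_000):
    """Gradient descent on one variable until |gradient| < tol."""
    for _ in range(max_iter):
        g = grad(x)
        if abs(g) < tol:
            break
        x -= lr * g
    return x


def dpo(eps_theta=1e-3, eps_phi=1e-5, delta_e=1e-3, max_rounds=100):
    theta, phi = 0.0, 0.0
    history = [energy(theta, phi)]
    for _ in range(max_rounds):
        # Phase 1: update the state parameter until its gradient is below eps_theta.
        theta = minimize_1d(lambda t: 2 * (t - 1.0) + 0.5 * phi, theta, eps_theta)
        # Phase 2: update the measurement parameter until its gradient is below eps_phi.
        phi = minimize_1d(lambda p: 2 * (p + 2.0) + 0.5 * theta, phi, eps_phi)
        history.append(energy(theta, phi))
        # Outer check: stop once a full round improves the energy by less than delta_e.
        if history[-2] - history[-1] < delta_e:
            break
    return theta, phi, history


theta_opt, phi_opt, history = dpo()
```

The default arguments mirror the CHSH row of Table 3; tightening `delta_e` and `eps_theta` to $10^{-6}$, as in the $G_{14}$ row, simply forces more outer rounds before the loop exits.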

Appendix F Gradient Sample Complexity

In this section, we analyze the efficiency of the gradient simulation to understand its sample complexity. This addresses the practical and theoretical challenges faced when implementing our algorithm. The gradient complexity we consider is in terms of the number of exponentials required to achieve a given precision $\epsilon$.

Theorem F.1.

Let $\mathcal{E}_{j}$ be a random variable describing the error in the gradient estimate for the $j$-th experiment with variance $\mathbb{E}[\mathcal{E}^{2}_{j}]=\epsilon_{0}^{2}$. Then the sample complexity of estimating the gradient with $\epsilon^{2}$ variance is given by $N_{\mathrm{exp}}\in\mathcal{O}\left(\frac{N^{2}}{\epsilon^{2}}\right)$, where $N$ is the dimensionality of the parameter space.

Proof.

Let $\epsilon_{0}=\frac{\epsilon}{\sqrt{N}}$. By the additivity of the variance, it follows that $\mathbb{E}\left[\sum_{j=1}^{N}\mathcal{E}_{j}^{2}\right]=N\mathbb{E}[\mathcal{E}_{j}^{2}]=N\epsilon_{0}^{2}$. The Euclidean norm of the gradient is approximated using the variances of the measurement outcomes. Hence

$$\|\nabla\|^{2}\approx\mathbb{E}\left[\sum_{j=1}^{N}\mathcal{E}_{j}^{2}\right]=N\frac{\epsilon^{2}}{N}=\epsilon^{2}. \tag{22}$$

Since each experiment requires $\mathcal{O}\left(\frac{1}{\epsilon_{0}^{2}}\right)=\mathcal{O}\left(\frac{N}{\epsilon^{2}}\right)$ operator exponentials and this must be repeated $N$ times, the total number of operator exponentials $N_{\mathrm{exp}}$ is

$$N_{\mathrm{exp}}\in\mathcal{O}\left(\frac{N^{2}}{\epsilon^{2}}\right) \tag{23}$$

as desired. ∎
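The counting in this proof can be written out as a short calculation. The sketch below sets the constant hidden in the $\mathcal{O}(\cdot)$ notation to 1, which is an assumption made purely for illustration; the function name is hypothetical.

```python
import math


def gradient_sample_budget(n_params, eps):
    """Return (eps0, exponentials per component, total) with the O(.) constant set to 1."""
    eps0 = eps / math.sqrt(n_params)             # per-component target error
    per_component = math.ceil(1.0 / eps0 ** 2)   # O(1/eps0^2) = O(N/eps^2)
    total = n_params * per_component             # repeated for each of the N components
    return eps0, per_component, total


eps0, per_component, total = gradient_sample_budget(n_params=4, eps=0.5)
```

With $N=4$ and $\epsilon=0.5$ this gives $\epsilon_{0}=0.25$, 16 exponentials per gradient component, and $64=N^{2}/\epsilon^{2}$ in total.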

Theorem F.2.

Assume that the variational state $|\psi(\theta)\rangle$ requires $N$ parameters to specify and that we wish to minimize $F(\theta):=\langle\psi(\theta)|H|\psi(\theta)\rangle$ over $\theta$. Assume that $F$ is Lipschitz continuous with constant $C$ and that $\nabla F$ is Lipschitz continuous with constant $L$. We then have that the number of exponentials required to perform gradient descent optimization with final error in the objective function at most $\epsilon_{\rm tot}$ using learning rate $\eta$ and $N_{\rm epoch}$ epochs is in

$$O\left(\frac{N^{2}N_{\rm epoch}C^{2}((1+\eta L)^{N_{\rm epoch}}-1)^{2}}{\epsilon_{\rm tot}^{2}L^{2}}\right)$$

$$N_{\mathrm{epoch,tot}}\in\mathcal{O}\left(\frac{N^{2}N_{\mathrm{epoch}}}{\gamma^{2}\min\limits_{\theta\in\Gamma}\|\nabla\langle\psi(\theta)|H(\phi)|\psi(\theta)\rangle\|}\right). \tag{24}$$
Proof.

The gradient descent rule with learning rate $\eta$ reads

$$\theta\rightarrow\theta-\eta\nabla_{\theta}\langle\psi(\theta)|H(\phi)|\psi(\theta)\rangle. \tag{25}$$

Using our assumption that the gradient is Lipschitz continuous with constant $L$, we have

$$\|\nabla\langle\psi(\theta)|H|\psi(\theta)\rangle-\nabla\langle\psi(\theta+\delta)|H|\psi(\theta+\delta)\rangle\|\leq L\|\delta\|. \tag{26}$$

If we define $\tilde{G}(\theta)$ to be an approximate gradient evaluated at the parameters $\theta$, with error at most $\epsilon$ in norm, then

$$\begin{aligned}\|\nabla\langle\psi(\theta)|H|\psi(\theta)\rangle-\tilde{G}(\theta+\delta)\|&\leq\|\nabla\langle\psi(\theta)|H|\psi(\theta)\rangle-\nabla\langle\psi(\theta+\delta)|H|\psi(\theta+\delta)\rangle\|\\ &\qquad+\|\nabla\langle\psi(\theta+\delta)|H|\psi(\theta+\delta)\rangle-\tilde{G}(\theta+\delta)\|\\ &\leq L\|\delta\|+\epsilon.\end{aligned} \tag{27}$$

Thus, we can recursively define the error in the parameter vector after $k$ epochs to be $\delta_{k}$, and from the triangle inequality and the gradient update rule we have

$$\|\delta_{k}\|\leq\eta(L\|\delta_{k-1}\|+\epsilon)+\|\delta_{k-1}\| \tag{28}$$

We can then solve this recursion relation to find that

$$\begin{aligned}\|\delta_{k}\|&\leq\eta\epsilon+(1+\eta L)\eta\epsilon+(1+\eta L)^{2}\eta\epsilon+\cdots\\ &\leq\frac{\epsilon((1+\eta L)^{k}-1)}{L}.\end{aligned} \tag{29}$$

Using the assumption that the objective function is Lipschitz continuous with constant $C$,

$$\left|\langle\psi(\theta+\delta)|H|\psi(\theta+\delta)\rangle-\langle\psi(\theta)|H|\psi(\theta)\rangle\right|\leq C\|\delta\|. \tag{30}$$

Then it suffices to choose the error per gradient evaluation such that

$$\frac{C\epsilon((1+\eta L)^{N_{\rm epoch}}-1)}{L}\leq\epsilon_{\rm tot}. \tag{31}$$

Isolating $\epsilon$ yields

$$\epsilon\leq\frac{\epsilon_{\rm tot}L}{C((1+\eta L)^{N_{\rm epoch}}-1)}. \tag{32}$$

This means that, from Theorem F.1, the total number of exponentials needed per epoch is

$$N_{\exp}\in O\left(\frac{N^{2}C^{2}((1+\eta L)^{N_{\rm epoch}}-1)^{2}}{\epsilon_{\rm tot}^{2}L^{2}}\right). \tag{33}$$

Using the fact that there are $N_{\rm epoch}$ repetitions of the above, we obtain

$$N_{\mathrm{exp,tot}}\in O\left(\frac{N^{2}N_{\rm epoch}C^{2}((1+\eta L)^{N_{\rm epoch}}-1)^{2}}{\epsilon_{\rm tot}^{2}L^{2}}\right). \tag{34}$$

This shows that the sample complexity of such problems can, in general, be substantial. In particular, if a small learning rate is required for the evolution, the number of operations needed for optimization can be exponential. The learning rate $\eta$ should be chosen (in the strongly convex case) to be proportional to the smallest eigenvalue of the Hessian matrix, implying that the number of samples scales exponentially with the condition number. This can be prohibitive in cases where some optimization directions are vastly steeper than others, such as in the vicinity of a saddle point. The number of epochs required for optimization is similarly difficult to bound. However, in the case where the optimization function is strongly convex, the number of epochs varies logarithmically with the error in the final objective function. In general, however, such optimization problems are not necessarily strongly convex. For these reasons, we leave the parameters of the gradient descent arbitrary.

As a final note, this suggests that variationally optimizing the parameters for a nonlocal game is not necessarily expected to be efficient in general. To make this optimization tractable at scale, we need to minimize the number of epochs as much as possible. This can be achieved by starting with a well-informed initial guess for the protocol before attempting to optimize the result. If such conditions are met, the above analysis suggests that a manageable number of operations will be needed to achieve a constant distance from the locally optimized strategy. To tackle the general problem, we suggest exploring alternative optimization approaches such as solving the variational problem using dynamical simulation-based methods [57].
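The recursion (28), its closed form (29), and the resulting budget (34) can be evaluated numerically to get a feel for the scaling. The sketch below sets all constants hidden in the $O(\cdot)$ notation to 1 and uses arbitrary illustrative values for $\eta$, $L$, $C$, and $\epsilon_{\rm tot}$; none of these numbers come from the experiments in this work.

```python
def iterated_error(k, eta, L, eps):
    """Iterate delta_k = eta * (L * delta_{k-1} + eps) + delta_{k-1} from delta_0 = 0."""
    delta = 0.0
    for _ in range(k):
        delta = eta * (L * delta + eps) + delta
    return delta


def closed_form_error(k, eta, L, eps):
    """Right-hand side of Eq. (29)."""
    return eps * ((1 + eta * L) ** k - 1) / L


def total_exponentials(n_params, n_epoch, eta, L, C, eps_tot):
    """Per-gradient error from Eq. (32) and total count from Eq. (34), constants set to 1."""
    eps = eps_tot * L / (C * ((1 + eta * L) ** n_epoch - 1))
    per_epoch = n_params ** 2 / eps ** 2   # Theorem F.1 with the required eps
    return eps, n_epoch * per_epoch


eps, n_total = total_exponentials(n_params=8, n_epoch=50, eta=0.1, L=2.0, C=1.0, eps_tot=1e-2)
```

Because the factor $(1+\eta L)^{N_{\rm epoch}}$ enters squared, doubling `n_epoch` in this example increases `n_total` by many orders of magnitude, which is the exponential dependence discussed above.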

Appendix G Experimental Details

The experiments on ibm_sherbrooke were conducted 7 times between Sep. 27 and Oct. 1, 2024, with 4096 shots per circuit. The layout was chosen on the first run to be qubits 46-49 (a linear chain) using the dense method of the Qiskit transpiler with no optimization (level 0). For subsequent runs, the same layout was reused. Each batch contained the SPAM characterization circuits, the independent unitary noise characterization circuits, the mirror fidelity circuits, and the Bell pair game circuits. In Table 2 and Fig. 19, the best run on ibm_sherbrooke is reported. Calibration data for the backend was queried and saved at the time the circuit batch entered the queue, and in Table 4 we report the two-qubit (ECR) gate errors.

name value
ecr45_44 0.010494
ecr45_46 0.007731
ecr47_46 0.004980
ecr47_48 0.006505
ecr49_48 0.005589
ecr49_50 0.010321
ecr50_51 0.007020
Table 4: Calibration data for individual ECR gates between hardware qubits 46-49 on ibm_sherbrooke.

The reported data from Rigetti's Ankaa-2 was collected on September 29 and September 30, 2024. Each circuit was sampled with 2048 shots and the hardware qubits used are reported in Figs. 11 and 12.

The reported data from Rigetti's Ankaa-3 was collected on September 30, 2024. Each circuit was sampled with 2048 shots and the hardware qubits used are reported in Figs. 13 and 14.
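For orientation, the shot counts above translate into the following statistical resolution on an estimated win rate. This is a generic binomial standard-error calculation that assumes independent, identically distributed shots; it is not the sample-complexity expression derived in the main text, and the function name is hypothetical.

```python
import math


def win_rate_std_error(p, shots):
    """Standard error of a win-rate estimate from independent Bernoulli(p) shots."""
    return math.sqrt(p * (1.0 - p) / shots)


# Worst case p = 0.5 for the two shot counts used in the experiments.
se_4096 = win_rate_std_error(0.5, 4096)
se_2048 = win_rate_std_error(0.5, 2048)
```

In the worst case $p=0.5$, 4096 shots resolve the win rate to about $0.008$ and 2048 shots to about $0.011$; win rates closer to 1 have smaller standard errors.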