Optimizing embedding-related quantum annealing parameters for reducing hardware bias

Aaron Barbosa, Elijah Pelofske, Georg Hahn, and Hristo N. Djidjev
(Los Alamos National Laboratory)
Abstract

Quantum annealers have been designed to propose near-optimal solutions to NP-hard optimization problems. However, the accuracy of current annealers, such as the ones of D-Wave Systems, Inc., is limited by environmental noise and hardware biases. One way to deal with these imperfections and to improve the quality of the annealing results is to apply a variety of pre-processing techniques such as spin reversal (SR), anneal offsets (AO), or chain weights (CW). Maximizing the effectiveness of these techniques involves performing optimizations over a large number of parameters, which would be too costly to repeat for each new problem instance. In this work, we show that the aforementioned parameter optimization can be done for an entire class of problems, provided that each instance uses a previously chosen fixed embedding. Specifically, in the training phase, we fix an embedding $E$ of a complete graph onto the hardware of the annealer, and then run an optimization algorithm to tune the following set of parameter values: the set of bits to be flipped for SR, the specific qubit offsets for AO, and the distribution of chain weights, optimized over a set of training graphs randomly chosen from that class, where the graphs are embedded onto the hardware using $E$. In the testing phase, we estimate how well the parameters computed during the training phase work on a random selection of other graphs from that class. We investigate graph instances of varying densities for the Maximum Clique, Maximum Cut, and Graph Partitioning problems. Our results indicate that, compared to their default behavior, substantial improvements of the annealing results can be achieved by using the optimized parameters for SR, AO, and CW.

1 Introduction

Quantum annealers such as the ones designed by D-Wave Systems, Inc. [3] are able to find high-quality approximate solutions to NP-hard problems by resorting to a technique called quantum annealing. To be precise, the D-Wave annealer is designed to solve optimization problems requiring the minimization of a function of the form

$$H(x_{1},\ldots,x_{n})=\sum_{i=1}^{n}h_{i}x_{i}+\sum_{i<j}J_{ij}x_{i}x_{j}, \tag{1}$$

where the linear weights $h_{i}\in\mathbb{R}$ and the quadratic couplers $J_{ij}\in\mathbb{R}$ are specified by the user and define the problem, and $x_{i}$ are unknown binary variables, where $i,j\in\{1,\ldots,n\}$. To minimize eq. (1), the D-Wave annealer maps the connectivity of the logical qubits in eq. (1), i.e., the graph defined by the set of edges $(i,j)$ for which $J_{ij}\neq 0$, to the qubits and links between them on its hardware chip, called a Chimera graph (see [2] for a graphical representation of the Chimera graph). The process of submitting a problem to a D-Wave machine is as follows:

  1. The problem of interest must be represented as the minimization of a function of the form of eq. (1). The function of eq. (1) is called a QUBO (quadratic unconstrained binary optimization) problem if $x_{i}\in\{0,1\}$ for $i\in\{1,\ldots,n\}$, and an Ising problem if $x_{i}\in\{-1,+1\}$ for $i\in\{1,\ldots,n\}$. The QUBO and Ising formulations are equivalent [2]. We can represent the function of eq. (1) as a graph $P$ with $n$ vertices, one for each variable $x_{i}$, $i\in\{1,\ldots,n\}$. In this representation, each vertex $i$ is assigned a vertex weight $h_{i}$, and each edge between vertices $i$ and $j$ is assigned the edge weight $J_{ij}$.

  2. Next, the problem graph $P$ is mapped onto the hardware of the D-Wave 2000Q annealer. Since it is usually not the case that the structure of the graph $P$ perfectly matches the structure of the Chimera graph of the D-Wave 2000Q, a minor embedding of $P$ onto the Chimera graph has to be computed. In such an embedding, some logical qubits in eq. (1) become a chain, which is a set of hardware qubits on the chip linked together in a way that prompts them to take the same value at the end of the anneal. Defining the chains requires the specification of a parameter determining the strength of the coupling between the qubits in a chain (the chain strength or chain weight). Minor-embedding the problem $P$ onto the Chimera graph yields a new graph $P^{\prime}$, which is a subgraph of the Chimera graph.

  3. At the start of the annealing process, the qubits used in the embedding of $P^{\prime}$ onto the D-Wave hardware are initialized in an equal superposition [3, 7]. During annealing, the system is slowly driven from the neutral transverse field Hamiltonian to the user-specified QUBO or Ising problem $H$ of eq. (1) while remaining, in theory, in the ground state.

  4. Since all qubits in a chain represent one logical qubit, they act as one in theory. However, this is not guaranteed in practice, and hardware qubits in a chain might not always take the same value after annealing. In this case, we speak of a broken chain. There is no unique way to assign a definite value to each logical qubit employed in eq. (1) based on its broken chain, and D-Wave offers several default methods to unembed chains, i.e., to decide on the value to be assigned to broken chains.
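
As an illustration of the last step, the following is a minimal sketch of majority-vote unembedding, one way of resolving broken chains. The embedding dictionary, the sample values, and the tie-breaking rule are hypothetical choices made for this example; it is not D-Wave's implementation.

```python
from collections import Counter


def majority_vote_unembed(sample, embedding, tie_value=1):
    """Assign each logical variable the most common spin in its chain.

    sample:    dict mapping physical qubit -> spin value (-1 or +1)
    embedding: dict mapping logical variable -> list of physical qubits
    tie_value: spin used when a chain is evenly split (an assumption)
    """
    logical = {}
    broken = []
    for var, chain in embedding.items():
        counts = Counter(sample[q] for q in chain)
        if len(counts) > 1:
            # The chain qubits disagree, so the chain is broken.
            broken.append(var)
        plus, minus = counts.get(1, 0), counts.get(-1, 0)
        if plus == minus:
            logical[var] = tie_value
        else:
            logical[var] = 1 if plus > minus else -1
    return logical, broken


# Hypothetical embedding of three logical variables onto six physical qubits.
embedding = {"a": [0, 1, 2], "b": [3, 4], "c": [5]}
sample = {0: 1, 1: -1, 2: 1, 3: -1, 4: -1, 5: 1}
logical, broken = majority_vote_unembed(sample, embedding)
```

Here the chain of variable `a` is broken (two qubits read +1, one reads -1), and the vote assigns it the value +1.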

In practice, several sources of error potentially decrease the quality of the solution returned by the D-Wave annealer. First, before annealing, the linear weights and quadratic couplers in eq. (1) have to be mapped to electrical currents on the hardware chip using a digital-to-analog converter [7]. This conversion works with a finite precision of 8 bits, so weights spanning a range larger than 8 bits are necessarily mapped imprecisely due to rounding errors. Moreover, so-called leakage may occur on the physical chip from the coupler $J_{ij}$ to the adjacent linear weights $h_{i}$ and $h_{j}$, where $i,j\in\{1,\ldots,n\}$. This can likewise alter the linear weights $h_{i}$ and $h_{j}$ [6], where the effect is reported to be more serious for chained qubits.

One simple way to mitigate such hardware biases is the so-called spin reversal (SR) or gauge transform. Spin reversal works on Ising problems and is based on the idea that although, theoretically, quantum annealing is invariant under a gauge transformation (i.e., the reversal of spin-up and spin-down in a quantum system), the D-Wave annealer is not a closed system and thus breaks gauge symmetry. As a consequence, two Ising problems that differ only in which spins have been flipped result in (slightly) different systems when mapped onto the annealer. Solving several Ising problems, each with a different set of spin-reversed qubits, allows us to average results and balance out errors. In practice, we select an arbitrary subset of variables $S\subseteq\{1,\ldots,n\}$ in an Ising problem, and substitute the corresponding variables as $x_{i}\rightarrow-x_{i}$ for all $i\in S$ in the Ising problem (the corresponding linear terms and quadratic couplers have to be modified as well). This is equivalent to re-interpreting an up spin as a down spin and vice versa, thus leaving the ground state of the Ising problem invariant, but having the potential to reduce analog and systematic errors on the device. The spin reversal can be applied on two different levels, either before or after embedding eq. (1) onto the hardware (see Section 2). In this work we explore both variants.

Second, in theory, all qubits evolve simultaneously during the anneal process, experiencing equal changes to the tunneling energy and equally contributing to the classical energy function [1]. In practice, however, qubits freeze out at different times during the anneal [10], which might bias the qubit states at readout after annealing. To counteract this, D-Wave allows users to set anneal offsets (AO) for individual qubits with the aim of improving the solution quality. In order to synchronize the evolution of the qubits, the D-Wave 2000Q device offers the ability to delay or advance the evolution of individual qubits within predefined ranges, meaning that qubits can individually be set to start their anneal process earlier or later compared to the default schedule. We consider setting individual anneal offsets for all qubits in this work. Analogously to spin reversal, we consider applying anneal offsets in two different ways, either using separate AO for each individual qubit, or using the same AO for all qubits in each chain (see Section 2).

Third, all couplers on the D-Wave hardware require the specification of a weight, and as such, when embedding a logical qubit as a chain on the D-Wave hardware, couplers have to also be assigned to all pairwise connections of chained qubits. The chain weight is typically set by D-Wave, in which case some overall weight is equally distributed to all couplers in a chain. However, it can also be set manually. We read out the total chain weight (CW) determined by D-Wave for each chain, and aim to re-distribute it along the chain in an optimal way.

Although SR, AO, and CW can be effective for removing hardware biases in the annealer, it is non-trivial to optimally select the actual qubits to which a spin reversal is applied, or the values of the anneal offsets or chain weights for each new problem instance being solved. This is because each technique has around 2000 degrees of freedom. Previous work has improved upon the spin reversal transform by using classical optimization in order to find an optimal set of qubits to spin reverse [9]. Concerning the anneal offsets, D-Wave reports that longer chains are likely to freeze out sooner during the anneal process due to their lower effective tunneling energy [1], and they recommend delaying the evolution of those qubits which will be subjected to strong magnetic fields relative to the other working qubits. Moreover, King et al. [8] suggest advancing qubits in their evolution if their final state does not contribute to the energy of the classical solution.

Since tuning all qubits for an application of SR, AO, or CW individually for each new problem instance under consideration is infeasible, we propose a different approach in this work. We aim to optimize SR, AO and CW with respect to a whole class of input problems. We carry out all optimizations in the classical set-up of training and validation sets for three NP-hard problems, the Maximum Clique, Maximum Cut, and Graph Partitioning problems. Using a differential evolution optimizer, we tune the average performance of the spin reversal transform, anneal offsets, or chain weights across all of the training graphs, and evaluate our optimized sets of parameters on a set of test graphs.

The article is structured as follows. In Section 2, we describe the background of SR, AO, and CW, as well as the optimization framework we employed. We also introduce the NP-hard problems we consider (Maximum Clique, Maximum Cut, and Graph Partitioning). Experimental results are reported in Section 3. The article concludes with a discussion in Section 4.

2 Methods

In this section, we justify our approach of using a fixed embedding for optimizing SR, AO, and CW (Section 2.1). We provide more details on the two types of spin reversal we apply (Section 2.2), as well as on anneal offsets (Section 2.3) and chain weights (Section 2.4). Section 2.5 defines the NP-hard problems we consider. A description of the optimization we perform to tune the application of SR, AO, and CW is given in Section 2.6.

2.1 Using a fixed embedding

Using the same (fixed) embedding is a key ingredient of our approach. If we want the same set of optimized parameters to work for multiple problems, we need at least one invariant, and it is, in this case, the hardware embedding. We use the fact that, for an NP-hard problem, the limiting factor for embedding its problem graph $P$ is the largest complete graph that can be embedded onto the quantum annealing hardware. Hence, instead of using an arbitrary embedding of a complete graph that D-Wave's embedding method minorminer would randomly find, we can use a fixed one, and optimize the hardware-related parameters using that fixed embedding. In addition, since we will use the same embedding many times, it makes sense to choose one with as good properties as possible. For this reason, we try several complete graph embeddings and choose the one that gives the best performance overall, i.e., the best QUBO/Ising value when using default D-Wave parameters, separately for each of the three considered problems.

2.2 Spin reversal

Suppose we are given an Ising problem in the form of eq. (1) and a set $S\subseteq\{1,\ldots,n\}$ of spins to be reversed. To transform a particular $x_{i}$ from $-1$ to $+1$, while keeping the value of eq. (1) unchanged, we define a new function $H^{\prime}$ with $h_{i}^{\prime}\rightarrow-h_{i}$ and $J_{ij}^{\prime}\rightarrow-J_{ij}$, $J_{ji}^{\prime}\rightarrow-J_{ji}$ for all $i\in S$, where $j\in\{1,\ldots,n\}$. Note that the ground state energies of $H$ and $H^{\prime}$ are identical, and each minimum of $H^{\prime}$ is a minimum of $H$ with the $i$th variable having a flipped sign. To apply spin reversal to a set $S$, we apply the above transformation to each $i\in S$.
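
The transformation can be checked on a toy instance. The sketch below is illustrative only: the weights and the set $S$ are made-up values, and the helper names are hypothetical. It applies the sign changes for every $i\in S$ (so a coupler with both endpoints in $S$ is negated twice and ends up unchanged) and verifies by brute force that every state of $H$ has the same energy as the correspondingly flipped state of $H^{\prime}$.

```python
from itertools import product


def ising_energy(h, J, x):
    """Evaluate eq. (1) for spins x, with h a dict i -> h_i and J a dict (i, j) -> J_ij."""
    linear = sum(h_i * x[i] for i, h_i in h.items())
    quadratic = sum(J_ij * x[i] * x[j] for (i, j), J_ij in J.items())
    return linear + quadratic


def spin_reverse(h, J, S):
    """Return the weights of H' after reversing all spins in S."""
    S = set(S)
    h_new = {i: (-h_i if i in S else h_i) for i, h_i in h.items()}
    # A coupler changes sign only if exactly one of its endpoints is reversed.
    J_new = {(i, j): (-J_ij if (i in S) != (j in S) else J_ij)
             for (i, j), J_ij in J.items()}
    return h_new, J_new


def flip(x, S):
    """Flip the sign of the spins in S."""
    return tuple(-x_i if i in S else x_i for i, x_i in enumerate(x))


# Toy Ising problem on three spins (integer weights keep the check exact).
h = {0: 1, 1: -2, 2: 3}
J = {(0, 1): 2, (1, 2): -1, (0, 2): 4}
S = {0, 2}
h2, J2 = spin_reverse(h, J, S)

states = list(product([-1, 1], repeat=3))
invariant = all(ising_energy(h, J, x) == ising_energy(h2, J2, flip(x, S))
                for x in states)
ground_energy = min(ising_energy(h, J, x) for x in states)
ground_energy_reversed = min(ising_energy(h2, J2, x) for x in states)
```

Because the energies agree state by state, the two problems share the same ground state energy, and a minimizer of $H^{\prime}$ is mapped back to a minimizer of $H$ by flipping the spins in $S$ again.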

It is not obvious how the set SS should be chosen to maximize the benefit of the spin reversal. As remarked in [6], reversing too few spins leaves the Ising model almost unchanged, whereas applying spin reversal to too many qubits likely results in many pairs of connected qubits being transformed, thus effectively leaving the corresponding quadratic couplers unchanged. In both cases, the spin reversal transform might only have little effect. Hence, the default spin reversal implemented by D-Wave flips roughly half of the qubits randomly.

When optimizing the spin reversal for a particular problem, we are thus asked to determine for each involved qubit a binary spin reversal indicator, denoting if a particular qubit is spin reversed or not.

Note that spin reversal can be applied on two levels: First, we can flip the logical qubits in the formulation of eq. (1). This will be referred to as spin reversal on the chain level, since in this case qubits which are being mapped onto the hardware as chains are either all reversed or all non-reversed. Second, we can embed the original problem graph $P$ (see Section 1) and read out the embedded Ising problem in the graph representation $P^{\prime}$. The resulting Ising problem likely consists of more qubits, due to some logical qubits being mapped to a set of physical qubits of the D-Wave hardware, and we can flip each hardware qubit individually. This will be referred to as spin reversal on the qubit level.
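
The difference between the two levels can be made concrete with a small sketch. The embedding and the indicator values below are hypothetical, as is the helper name. On the chain level there is one binary indicator per logical qubit, which is copied to every physical qubit of its chain. On the qubit level each physical qubit has its own indicator, so the search space is larger.

```python
def chain_to_qubit_flips(chain_flips, embedding):
    """Copy the flip indicator of each logical qubit to all qubits in its chain."""
    return {q: chain_flips[var]
            for var, chain in embedding.items()
            for q in chain}


# Hypothetical embedding: logical qubit -> chain of physical qubits.
embedding = {0: [10, 11, 12], 1: [20], 2: [30, 31]}

# Chain-level indicators: one binary value per logical qubit.
chain_flips = {0: 1, 1: 0, 2: 1}
qubit_flips = chain_to_qubit_flips(chain_flips, embedding)

# Number of binary variables the optimizer has to tune on each level.
n_chain_level = len(embedding)
n_qubit_level = sum(len(chain) for chain in embedding.values())
```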

2.3 Anneal offsets

Typically, all hardware qubits being used in a problem embedded onto the D-Wave quantum chip undergo the same anneal process simultaneously. In such a case, it is normal that qubits freeze out at different times during the anneal [10], which, however, might negatively affect the dynamics of the annealing and prevent the system from reaching its ground state. To counteract this, in the newest generation of the annealer, the start of the anneal process of individual qubits can be moved forward or backward in time, within specified limits.

For D-Wave 2000Q, individual anneal offsets can be specified for each hardware qubit using the parameter anneal_offsets, where the value 0 denotes no offset, and positive (negative) values indicate that a qubit’s anneal begins ahead of (behind) the standard schedule. The range of the variable anneal_offsets is machine dependent, and can be queried with anneal_offset_ranges. Moreover, anneal offsets are discrete, with a machine dependent step size specified in anneal_offset_step. We tune the anneal offset of each involved qubit with an optimization within the range anneal_offset_ranges (given as boundary condition to the optimizer) over a discrete search space, similar to spin reversal optimization.
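
Since the optimizer may propose arbitrary real values, each candidate offset has to be mapped to an admissible discrete value. The following is a minimal sketch of one way to do this. The ranges, the step size, and the candidate values are hypothetical placeholders (the real ones are machine dependent), and the sketch assumes that admissible offsets are integer multiples of the step size.

```python
import math


def snap_offset(value, lower, upper, step, eps=1e-9):
    """Clip value to [lower, upper] and snap it to a multiple of step.

    The computation uses integer grid indices so that floating-point
    rounding cannot push the result outside the admissible range.
    """
    k_min = math.ceil(lower / step - eps)   # smallest admissible grid index
    k_max = math.floor(upper / step + eps)  # largest admissible grid index
    k = round(value / step)                 # nearest grid index
    k = min(max(k, k_min), k_max)
    return k * step


# Hypothetical per-qubit ranges and step size.
offset_ranges = {0: (-0.2, 0.05), 1: (-0.1, 0.1), 2: (-0.1, 0.1)}
offset_step = 0.01

# Hypothetical values proposed by an optimizer.
candidates = {0: 0.0734, 1: -0.1234, 2: 0.0234}

offsets = {q: snap_offset(candidates[q], *offset_ranges[q], offset_step)
           for q in candidates}
```

In this example the first two candidates lie outside their ranges and are moved to the nearest boundary, while the third is rounded to the nearest grid point.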

2.4 Bias distribution on physical qubits

Figure 1: The logical qubit $x_{i}$ (left) is mapped onto a chain of seven physical qubits $x_{i}^{1},\dots,x_{i}^{7}$ (right), and the logical couplers $J_{ji}$ and $J_{ik}$ (connecting $x_{i}$ to qubits $x_{j}$ and $x_{k}$) are mapped to the physical couplers $J_{ji}^{15}$, $J_{ji}^{12}$ and $J_{ik}^{31}$, $J_{ik}^{61}$, $J_{ik}^{72}$, respectively.

Typically, logical qubits have to be mapped to chains of hardware qubits when computing a minor embedding of a QUBO/Ising problem of eq. (1) onto the D-Wave Chimera graph. In this case, a logical qubit (variable) $x_{i}$ is mapped onto a chain $\{x_{i}^{1},\ldots,x_{i}^{l}\}$ having $l\in\mathbb{N}$ hardware qubits. The linear bias $h_{i}$ and the quadratic biases $J_{ij}$ have to be distributed between the physical qubits and the links between them (see Figure 1). The default method of D-Wave distributes $h_{i}$ and $J_{ij}$ uniformly between the qubits $\{x_{i}^{1},\ldots,x_{i}^{l}\}$ and the links between the physical qubits (chains) implementing $x_{i}$ and $x_{j}$, respectively. However, there is no evidence that such a method is optimal. In fact, Pudenz [12] has compared the default method with two other distribution strategies and has shown that in some cases the alternative methods work better.

However, none of the previous strategies considers the possible effect of hardware biases, or looks at bias distribution strategies to mitigate such issues. Here we address this problem, using our fixed embedding approach, and treat the bias distribution problem (i.e., how to distribute the biases on the physical qubits and couplers so as to optimize the annealing results) as an optimization problem. The optimization of linear weights and quadratic couplers is done separately, denoted as CW(L) and CW(Q), respectively. When optimizing linear weights, we evenly distribute the quadratic weights among the quadratic couplers, and analogously for the linear weights when optimizing quadratic couplers.
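
To make the idea of a bias distribution concrete, the sketch below splits a logical bias $h_{i}$ over the qubits of a chain in proportion to a vector of non-negative weights, so that the pieces always sum to $h_{i}$. This proportional parameterization and the function name are assumptions made for illustration; the text does not specify how the distribution is encoded for the optimizer.

```python
def distribute_linear_bias(h_i, weights):
    """Split h_i over the chain qubits proportionally to non-negative weights."""
    total = sum(weights)
    if total <= 0:
        # No usable weights: fall back to the uniform split.
        return [h_i / len(weights)] * len(weights)
    return [h_i * w / total for w in weights]


# Uniform split of a bias over a chain of three qubits.
uniform = distribute_linear_bias(3.0, [1, 1, 1])

# A skewed split that places half of the bias on the first qubit.
skewed = distribute_linear_bias(3.0, [2, 1, 1])
```

Equal weights reproduce a uniform distribution, while unequal weights move part of the bias to selected qubits without changing the total.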

2.5 Formulations of the NP-hard problems studied

We consider three classical NP-hard problems in this work, the Maximum Clique, Maximum Cut, and Graph Partitioning problems. For a graph $G=(V,E)$ with vertex set $V$ and edge set $E$, a clique in $G$ is any subgraph $C$ of $G$ that is complete, i.e., there is an edge between each pair of vertices of $C$. The Maximum Clique problem asks us to find a clique of maximum size. A formulation of the Maximum Clique problem in the form of eq. (1) can be found in [11].

Similarly, a cut of the graph is any partition of $V$ into two disjoint sets, that is $V=V_{1}\cup V_{2}$ and $V_{1}\cap V_{2}=\emptyset$. The cut size of any cut is the number of edges having one endpoint in $V_{1}$ and one endpoint in $V_{2}$, that is $|\{e=(v,w):v\in V_{1},w\in V_{2}\}|$. The Maximum Cut problem asks us to find a cut of maximum size. A formulation of the Maximum Cut problem as an Ising problem can be found in [5].

Last, the Graph Partitioning problem asks us to divide the set of vertices $V$ into two disjoint and balanced sets (partitions) $V_{1}$ and $V_{2}$, satisfying $V=V_{1}\cup V_{2}$ and $V_{1}\cap V_{2}=\emptyset$, such that the sizes of $V_{1}$ and $V_{2}$ differ by at most one and the number of cut edges $\{e=(v,w):v\in V_{1},w\in V_{2}\}$ between the two partitions is minimized. An Ising formulation for Graph Partitioning is given in [4].
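
The three definitions above translate directly into short checks. The sketch below uses a small made-up graph and hypothetical helper names. It only evaluates a given candidate solution and does not solve any of the three problems.

```python
from itertools import combinations


def is_clique(edges, vertices):
    """Check that every pair of the given vertices is joined by an edge."""
    edge_set = {frozenset(e) for e in edges}
    return all(frozenset((u, v)) in edge_set
               for u, v in combinations(vertices, 2))


def cut_size(edges, part_one):
    """Count the edges with exactly one endpoint in part_one."""
    part_one = set(part_one)
    return sum((u in part_one) != (v in part_one) for u, v in edges)


def is_balanced(vertices, part_one):
    """Check that the two parts differ in size by at most one."""
    n_one = len(set(part_one))
    n_two = len(set(vertices)) - n_one
    return abs(n_one - n_two) <= 1


# Toy graph: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
vertices = [0, 1, 2, 3]
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]

triangle_is_clique = is_clique(edges, [0, 1, 2])
balanced_cut = cut_size(edges, {0, 1})
largest_cut = cut_size(edges, {2})
```

For this graph the balanced partition `{0, 1}` versus `{2, 3}` cuts two edges, whereas the unbalanced cut that isolates vertex 2 cuts three. Maximum Cut and Graph Partitioning therefore generally have different optimal solutions on the same graph.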

2.6 Differential evolution optimization

In this work, we aim to tune three parameters per qubit, that is, its spin indicator for spin reversal, its anneal offset, and its chain weights in case we are dealing with a chained qubit on the D-Wave Chimera hardware.

To carry out the optimization we employ the differential evolution solver of the SciPy library in Python [14], available under the command scipy.optimize.differential_evolution(), which implements the algorithm of Storn and Price [13].

We employ the differential evolution optimizer with a population size of 80 and 50 generations. We do not use the polishing option, i.e., no steepest descent with a quasi-Newton method is performed to fine-tune the solution. All remaining parameters are left at their default values. The initial population is random, but we made sure to include the default parameters for SR, AO, and CW given by D-Wave Systems, Inc. Lastly, we use elitism in the optimization, i.e., we make sure that the best solution is always passed on to the next generation, as opposed to the possibility of being replaced by crossover and random selection operations.
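
The set-up can be sketched with a call to SciPy's optimizer. Everything problem specific below is a stand-in: the objective is a synthetic surrogate (the distance to a made-up hidden flip pattern) rather than averaged annealer performance on training graphs, the continuous-to-binary decoding by thresholding is an assumption, and the population is reduced to 20 members over 6 variables to keep the example fast. The sketch shows how a default setting can be placed in the initial population and how polishing is switched off.

```python
import numpy as np
from scipy.optimize import differential_evolution

N_QUBITS = 6
# Hypothetical "ideal" flip pattern, used only to build a toy objective.
HIDDEN_BEST = np.array([1, 0, 1, 1, 0, 0])


def decode(vector):
    """Threshold a continuous vector in [0, 1] to binary flip indicators."""
    return (np.asarray(vector) > 0.5).astype(int)


def surrogate_objective(vector):
    """Toy stand-in for the averaged annealer performance (lower is better)."""
    return float(np.sum(decode(vector) != HIDDEN_BEST))


# Random initial population with the default setting (no spin reversed)
# included as its first member.
rng = np.random.default_rng(0)
init = rng.uniform(0.0, 1.0, size=(20, N_QUBITS))
init[0] = 0.0

result = differential_evolution(
    surrogate_objective,
    bounds=[(0.0, 1.0)] * N_QUBITS,
    init=init,      # user-supplied initial population
    maxiter=50,     # number of generations
    polish=False,   # no local fine-tuning after the evolution
    seed=0,
)

best_flips = decode(result.x)
default_value = surrogate_objective(init[0])
```

In SciPy's implementation a population member is only replaced by a trial vector that is at least as good, so the best member is never lost. Because the default setting is part of the initial population, the returned value `result.fun` cannot be worse than `default_value`.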

3 Experimental analysis

We perform the optimization on a class of random training graphs, separately for the three problems introduced in Section 2.5, and evaluate the performance on a series of (unseen) testing graphs. With this, we aim to find out if it is possible to enhance the performance of the D-Wave 2000Q on a whole class of problem instances, since, due to the large search space, optimizing parameters for each newly solved problem is infeasible.

In all experiments, we fix the anneal duration at 1000 microseconds. We generate 10 training and 10 testing graphs, each with 65 vertices, the size of the largest complete graph embeddable onto the D-Wave hardware. We vary the density of the training and validation graphs in $\{0.25,0.50,0.75\}$. We use the majority vote unembedding algorithm. For Maximum Clique and Maximum Cut we use a chain strength of 1, and for Graph Partitioning a chain strength of $20\cdot 32\cdot 33\cdot d$, where $d$ is the graph density. This is similar to the choice in [4], where the chain strength for Graph Partitioning is set to a prefactor multiplied with an estimate of the value of the objective function.

For the optimization, for each fixed point in the parameter search space, we perform 1000 anneals per graph, and record the average performance across all 10 training graphs, measured in both the value of the QUBO/Ising objective function and the energy (before unembedding) returned by the D-Wave annealer. For testing, we perform 10000 anneals per graph.

When reporting the experimental results, we denote by Default-RE the default behavior of the D-Wave annealer with a random embedding and with all other parameters set to their default values. As the optimization is performed on the same embedding, it is reasonable to try to find one that will result in the best performance on average. Hence, we try 30 random embeddings and choose one for each problem that yields the best objective function value during forward annealing with default parameters (since Maximum Clique and Maximum Cut are maximization problems, the higher the value the better, whereas for Graph Partitioning, which is a minimization problem, lower is better). We denote this as Default-OE (for default D-Wave with optimized embedding). Moreover, we denote by SR(Q) and SR(C) the tuned spin reversal on the qubit or chain level, and similarly AO(Q) and AO(C) denote the tuned anneal offsets on the qubit or chain level. Finally, CW(L) and CW(Q) refer to setting the chain weights of linear or quadratic couplers. Since the D-Wave 2000Q features auto_scale and extended_j_range are mutually exclusive, and because we use auto scaling for all experiments, we optimized CW(Q) without the extended J range feature.

We assess the performance of all methods using the time-to-solution metric, defined as the time to reach an optimum solution at least once with probability 0.99. It is computed as $\text{TTS}=T_{\text{QPU}}\cdot\log(0.01)/\log(1-p)$, where $T_{\text{QPU}}$ is the solve time on D-Wave, and $p$ is the proportion of times the optimal solution was found. Two caveats are worth mentioning: For problem instances where the optimal solution can be found using a classical solver, we are able to compute the time-to-optimal-solution, which we simply denote by TTS (time-to-solution). In case the optimal solution cannot be found in reasonable time, we relax this metric to time-to-best-solution, denoted by TBS, which uses the best solution found by any of the methods, instead of the provably best one. The TBS measure depends on the set of algorithms employed in the study, their parameters, and the D-Wave samples on which these algorithms are run. However, the setup of the simulations presented here is fixed, thus making the TBS measure well-defined. Lastly, the TTS measure we report does not include any classical computation time, nor the training portion of the optimization done for each problem.
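
The formula above can be evaluated with a few lines of code. The sketch below is illustrative: the function name and the example numbers are made up, and the handling of the boundary cases $p=0$ and $p=1$ is an assumption, since the formula itself is only defined for $0<p<1$.

```python
import math


def time_to_solution(t_qpu, p, target=0.99):
    """Time needed to observe the optimum at least once with probability `target`.

    t_qpu: solve time of one batch of anneals
    p:     proportion of anneals that returned the optimal (or best) solution
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    if p == 0.0:
        # The optimum was never observed, so no finite time can be estimated.
        return math.inf
    if p == 1.0:
        # Limit of the formula as p approaches 1 (assumption for this sketch).
        return 0.0
    return t_qpu * math.log(1.0 - target) / math.log(1.0 - p)


# Made-up example: the optimum appears in 12 of 10000 anneals.
tts_rare = time_to_solution(t_qpu=1.0, p=12 / 10000)

# A success proportion of one half needs between six and seven repetitions.
tts_half = time_to_solution(t_qpu=1.0, p=0.5)
```

The metric decreases as $p$ grows, which is why a method that finds the best solution rarely but on few graphs can show a TTS/TBS value that is hard to compare with that of a method solving more graphs.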

Figure 2: Performance on test graphs compared to Default-OE for graphs of density 0.25 (left column), 0.5 (middle column), and 0.75 (right column). Metric is the cut size for the Maximum Cut problem (top row), cut size for the Graph Partitioning problem (middle row), and clique size for the Maximum Clique problem (bottom row).

Figure 2 gives a graphical representation of our results on the suite of test graphs for all three problems. We measure results in the raw values of the found cut size (for Maximum Cut and Graph Partitioning) or clique size for the Maximum Clique problem. We observe that, compared to the baseline of Default-OE, SR as well as AO and CW seem to perform well for Maximum Cut. For the other two problems, it is mostly SR and AO on the chain level which perform best.

Problem Density Default-OE Default-RE SR(Q) SR(C) AO(Q) AO(C) CW(L) CW(Q)
MaxCut 0.25 2857.8 5212.4 1121.0
(1) (10) (1)
0.5 3246.2 3080.3
(5) (6)
0.75 1436.1 3185.2
(3) (9)
GraphPart. 0.25 7179.0 5809.4 3457.0 3869.6 4060.5 3396.9
(1) (3) (3) (1) (2) (1)
0.5 7215.7 7987.9 6817.3 3569.4 3733.9 3853.2
(2) (4) (1) (3) (1) (2)
0.75 7669.1 4286.7 3961.7
(6) (3) (2)
MaxClique 0.25 14979.0 11520.8 8226.6 8292.9 10728.6
(3) (1) (2) (3) (2)
0.5 683.1 6061.0 2612.8 5571.9 111.7 559.4 1754.3
(2) (3) (2) (2) (1) (2) (2)
0.75 9.9 323.7 4.1 13825.2 58.6 20.2
(1) (1) (1) (1) (1) (1)
Table 1: TBS (white) and TTS (gray) for the test graphs. Number of test graphs for which optimal solutions were found in parentheses. The best TTS and the highest number of optimal/best solutions found for each problem/density combination are given in bold. In case of a tie, the combination with the smallest TTS is chosen.

Next, the results for the TTS estimations are given in Table 1. It is complicated to rank the performances of these techniques, since two measures are of relevance here. First, a low TTS or TBS time is desired for any method. However, some methods might be able to achieve a low TTS/TBS time, but only at the expense of solving fewer test graphs than others, and a method solving almost all graphs usually incurs a higher TTS/TBS time (this occurs, for instance, for the Maximum Cut problem and density 0.25). We thus report both the measured TTS/TBS time for the graphs that could be solved and, in parentheses, the number of graph problems for which the best solution could be found.

First, we note that for the Maximum Cut and Graph Partitioning problems, classical solvers are unable to find the optimal solution, meaning we have to resort to the TBS measure. For Maximum Cut, none of the Default-OE, Default-RE, SR(Q), and SR(C) methods could compute the best known solution for any graph. Only the optimized anneal offsets (AO on the qubit and chain level) and the optimized chain weights (CW for linear weights) can solve some of the graphs and thus attain TBS values. Overall, AO(C), i.e., anneal offsets at the chain level, gives the best performance for Maximum Cut.

For the Graph Partitioning problem, the optimized spin reversal solves the most problems on average, but only at the expense of incurring large TBS times. Using optimized anneal offsets on the chain level, as well as optimized chain weights (for quadratic couplers), yields the best TBS times, although again at the expense of solving only a few graphs, which can bias the results. Spin reversal at the qubit level (SR(Q)) seems to perform best for this type of problem.

For the Maximum Clique problem, we are able to solve problem instances classically to optimality using the function networkx.algorithms.clique.find_cliques in Python's NetworkX package, and thus report TTS times. We observe that all methods can only solve around two of the 10 test graphs, and that overall, SR(Q) and CW(Q) yield the best TTS times for the Maximum Clique problem.

Problem     Density  Default-RE  SR(Q)  SR(C)  AO(Q)  AO(C)  CW(L)  CW(Q)
MaxCut      0.25           -0.3   -1.0   -0.7    4.2    8.8    0.8   -1.0
MaxCut      0.50            0.2   -0.0    0.1    8.8    8.3    7.2    1.3
MaxCut      0.75            0.0   -0.3   -0.7    9.0    9.9    4.4    0.1
GraphPart.  0.25            0.3    1.6    1.3    0.6    1.4    0.2    0.6
GraphPart.  0.50           -0.2    0.6   -0.6    0.0    0.4    0.1    0.5
GraphPart.  0.75           -0.2    0.7   -0.7   -0.4    0.6    0.4    0.2
MaxClique   0.25           -8.8    0.5    2.3   -4.6    6.4   -1.8   -6.6
MaxClique   0.50           -0.6    1.2    2.3   -3.3    5.5    0.3    0.3
MaxClique   0.75           -2.2    3.1    1.8   -1.4    0.3    0.1    0.2

Table 2: Improvement (%) in the value of the QUBO/Ising formulation compared to Default-OE for the test graphs.

Apart from reporting TBS/TTS times, we can also evaluate all methods using the value of the Ising or QUBO formulation on the bitstring returned by D-Wave. Doing this for the best of the 30 embeddings on D-Wave with default parameters (i.e., Default-OE) allows us to set a reference point, and we report the improvement (in percent) of the obtained value over this reference in Table 2.

For the Maximum Cut problem, optimized anneal offsets (both on the qubit and the chain level) resulted in the largest percent improvement. For Graph Partitioning, spin reversal on the qubit level performs best, though the improvements are only of the order of one percent. Surprisingly, using Default-RE for Maximum Cut and Graph Partitioning does not perform very differently from Default-OE. Lastly, for Maximum Clique, using anneal offsets on the chain level performs considerably better than the other techniques for densities 0.25 and 0.5, with spin reversal being best for high densities.

4 Discussion

This work considered optimizing three recent features of the D-Wave 2000Q annealer, namely spin reversal (on qubit or chain level), anneal offsets (on qubit or chain level), and chain weight distribution (for linear or quadratic couplers). After fixing the embedding, we performed a classical optimization over a suite of random training graphs using a differential evolution optimizer, and investigated whether it is possible to outperform the default D-Wave anneal setting with optimized parameters for the three techniques. Our overall aim is to tune SR, AO and CW such that these features work better on a whole class of problems.

We conclude that for random graphs, and the three NP-hard problems we considered, tuning anneal offsets indeed works best, yielding substantial improvements over the default D-Wave behavior, especially for the Maximum Cut and Graph Partitioning problems. Optimizing spin reversal and chain weights seems more dependent on the problem and measure.

This work leaves scope for a variety of future research avenues. First, we performed the optimization for SR, AO and CW individually, since each involves tuning around 2000 variables, and we found larger optimization problems to be infeasible. More elaborate optimization methods could allow us to optimize all parameters simultaneously, potentially improving results. Second, it would be interesting to extend the experiments to more classes of NP-hard problems, with the aim to see how much the trained parameters of SR, AO, or CW differ, and to investigate if the trained SR, AO, or CW can be recycled for certain classes. Third, the fixed embedding setup could allow us to determine if certain hardware qubits behave more favorably if consistently spin reversed, or if consistently employed with a particular anneal offset. Finally, though time consuming, our experiments would benefit from larger sets of testing and training graphs.

Acknowledgments

This work has been supported by the US Department of Energy through the Los Alamos National Laboratory. Los Alamos National Laboratory is operated by Triad National Security, LLC, for the National Nuclear Security Administration of the U.S. Department of Energy (Contract No. 89233218CNA000001) and by the Laboratory Directed Research and Development program of Los Alamos National Laboratory under project numbers 20190065DR and 20180267ER.

References

  • [1] Evgeny Andriyash, Zhengbing Bian, Fabián A. Chudak, Andrew D. King, and William G. Macready. Boosting integer factoring performance via quantum annealing offsets. Technical report, D-Wave Systems, 2016.
  • [2] Guillaume Chapuis, Hristo Djidjev, Georg Hahn, and Guillaume Rizk. Finding Maximum Cliques on the D-Wave Quantum Annealer. J Signal Process Sys, 91(3-4):363–377, 2019.
  • [3] D-Wave Systems. Quantum Computing for the Real World Today, 2000.
  • [4] D-Wave Systems. Graph Partitioning QUBO, 2019.
  • [5] D-Wave Systems. Maximum Cut QUBO, 2019.
  • [6] D-Wave Systems. D-Wave System Documentation: Solving a Problem on the QPU – Using Spin-Reversal (Gauge) Transforms. Technical report, D-Wave Systems, 2020.
  • [7] D-Wave Systems. Technical Description of the D-Wave Quantum Processing Unit, 2020.
  • [8] Andrew D. King, Emile Hoskinson, Trevor Lanting, Evgeny Andriyash, and Mohammad H. Amin. Degeneracy, degree, and heavy tails in quantum annealing. Phys Rev A, 93(5), 2016.
  • [9] E. Pelofske, G. Hahn, and H. Djidjev. Optimizing the Spin Reversal Transform on the D-Wave 2000Q. In Proceedings of the 2019 IEEE International Conference on Rebooting Computing (ICRC), pages 1–8, 2019.
  • [10] E. Pelofske, G. Hahn, and H. Djidjev. Inferring the Dynamics of Ground-State Evolution of Quantum Annealers. arXiv:2009.06387, pages 1–21, 2020.
  • [11] Elijah Pelofske, Georg Hahn, and Hristo Djidjev. Solving large maximum clique problems on a quantum annealer. In Feld S., Linnhoff-Popien C. (eds) Quantum Technology and Optimization Problems. QTOP 2019. Lecture Notes in Computer Science, vol 11413. Springer, Cham., pages 123–135, 2019.
  • [12] K. L. Pudenz. Parameter setting for quantum annealers. In 2016 IEEE High Performance Extreme Computing Conference (HPEC), pages 1–6, 2016.
  • [13] R Storn and K Price. Differential Evolution – a Simple and Efficient Heuristic for Global Optimization over Continuous Spaces. J Global Optim, 11:341–359, 1997.
  • [14] Pauli Virtanen, Ralf Gommers, Travis E. Oliphant, Matt Haberland, Tyler Reddy, David Cournapeau, Evgeni Burovski, Pearu Peterson, Warren Weckesser, Jonathan Bright, Stéfan J. van der Walt, Matthew Brett, Joshua Wilson, K. Jarrod Millman, Nikolay Mayorov, Andrew R.J. Nelson, Eric Jones, Robert Kern, Eric Larson, CJ Carey, Ilhan Polat, Yu Feng, Eric W. Moore, Jake VanderPlas, Denis Laxalde, Josef Perktold, Robert Cimrman, Ian Henriksen, E.A. Quintero, Charles R. Harris, Anne M. Archibald, A.H. Ribeiro, Fabian Pedregosa, Paul van Mulbregt, and SciPy 1.0 Contributors. SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python. Nature Methods, 17:261–272, 2020.