
Revisiting some classical linearizations of the quadratic binary optimization problem

Abraham P Punnen
Department of Mathematics, Simon Fraser University,
Surrey, B C, Canada.
  (apunnen@sfu.ca)
\AND
Navpreet Kaur
Department of Mathematics, Simon Fraser University,
Surrey, B C, Canada.
  (navpreet_kaur_2@sfu.ca)
Abstract

In this paper, we present several new linearizations of a quadratic binary optimization problem (QBOP), primarily using the method of aggregations. Although aggregations were studied in the past in the context of solving systems of Diophantine equations in non-negative variables, none of the approaches developed produced practical models, particularly due to the large size of the associated multipliers. Exploiting the special structure of QBOP, we show that selective aggregation of constraints provides valid linearizations with interesting properties. For our aggregations, the multipliers can be any non-zero real numbers. Moreover, by choosing the multipliers appropriately, we demonstrate that the resulting LP relaxations have value identical to the corresponding non-aggregated models. We also provide a review of existing explicit linearizations of QBOP and present the first systematic study of such models. Theoretical and experimental comparisons of new and existing models are also provided.

1 Introduction

Let Q=(q_{ij}) be an n\times n matrix and \mathbf{c}^{T}=(c_{1},c_{2},\ldots,c_{n}) be an n-vector. Then, the quadratic binary optimization problem (QBOP) with linear constraints is to

Maximize \sum_{i=1}^{n}\sum_{j\in R_{i}}q_{ij}x_{i}x_{j}+\sum_{i=1}^{n}c_{i}x_{i}
Subject to: \mathbf{x}\in K,\mbox{ and }x_{i}\in\{0,1\}\mbox{ for all }i=1,2,\ldots,n,

where K is a polyhedral set, \mathbf{x}^{T}=(x_{1},x_{2},\ldots,x_{n}), and R_{i}=\{j:q_{ij}\neq 0\}. Unless otherwise stated, we assume that the matrix Q is symmetric and its diagonal elements are zeros. Since our focus is primarily on the linearization problem associated with QBOP, for simplicity we assume that K is the unit cube in R^{n}, i.e., K=[0,1]^{n}, so that the binary restrictions alone determine the feasible set. However, the theoretical results in this paper extend to any set K that is polyhedral in nature. Consequently, most of the discussion in this paper will be on the quadratic unconstrained binary optimization problem (QUBO), with the understanding that the corresponding results extend directly to the more general case of QBOP. For a state-of-the-art discussion on QUBO, we refer to the book [44].
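To make the objective concrete, here is a minimal brute-force sketch in Python (assuming NumPy is available; the small instance Q, c is illustrative only, not from the paper) that evaluates f(\mathbf{x})=\mathbf{x}^{T}Q\mathbf{x}+\mathbf{c}^{T}\mathbf{x} over all binary vectors. Since the diagonal of Q is zero, \mathbf{x}^{T}Q\mathbf{x} equals the double sum over j\in R_{i} used above.

```python
import itertools
import numpy as np

def qubo_value(Q, c, x):
    # f(x) = x^T Q x + c^T x; with a zero diagonal this equals the
    # double sum over j in R_i used in the formulation above.
    x = np.asarray(x, dtype=float)
    return float(x @ Q @ x + c @ x)

def solve_qubo_by_enumeration(Q, c):
    # Exhaustive search over {0,1}^n; practical only for small n.
    n = len(c)
    best = max(itertools.product([0, 1], repeat=n),
               key=lambda bits: qubo_value(Q, c, bits))
    return np.array(best), qubo_value(Q, c, best)

# Small illustrative instance (not from the paper).
Q = np.array([[0.0, 3.0], [3.0, 0.0]])
c = np.array([-1.0, -1.0])
print(solve_qubo_by_enumeration(Q, c))  # x = (1, 1) with value 6 - 2 = 4
```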

The terminology ‘linearization’ in the context of QBOP generally refers to writing a QBOP as an equivalent mixed integer linear program (MILP). There is another notion of ‘linearization’ associated with QBOP where conditions are sought on the existence of a row vector \mathbf{d} such that \mathbf{x}^{T}Q\mathbf{x}=\mathbf{d}^{T}\mathbf{x} for all binary \mathbf{x}\in K [29, 30, 45, 12, 6, 48]. The notion of linearization considered in this paper is the former.

A linearization (i.e., an equivalent MILP formulation) of QBOP is said to be explicit if it contains a variable, say y_{ij}, corresponding to the product x_{i}x_{j} for each i\neq j, such that y_{ij}=1 if x_{i}=x_{j}=1 and zero otherwise. There is another class of linearizations studied for QBOP, called compact linearizations, that do not use the variables y_{ij} [3, 5, 9, 11, 20, 27, 28, 35]. Compact linearizations have been investigated by many researchers, particularly for obtaining strong linear programming (LP) relaxation bounds [3, 5, 9, 28] that match the roof duality bound [1]. The major advantage of a compact linearization is that it has fewer variables and constraints compared to explicit linearizations. However, many such models use numbers of large absolute value in defining bounds on variables or expressions, and this could lead to numerical difficulties. In a companion paper, we present a thorough analysis of existing as well as new compact linearizations [47]. These formulations provide strong relaxations but require the use of SDP solvers. Researchers have also studied convex reformulations [9, 10] of QBOP making use of results from semidefinite programming.

Our main focus in this paper is to study explicit formulations in a systematic way, from both theoretical and computational perspectives. Though traditional explicit linearizations have a large number of constraints and variables, some of them provide strong upper bounds when the integrality constraints are relaxed. For example, the Glover-Woolsey linearization [24] (also called the standard linearization) has an LP relaxation value that matches other upper bounds such as the roof duality bound [1]. Since explicit formulations use the y_{ij} variables, insights gained from the polyhedral structure of the Boolean quadric polytope [41] and valid inequalities derived from the support graph of the Q matrix [26] can be readily used to strengthen the corresponding LP relaxations and to aid the development of branch and cut algorithms.

We distinguish between two classes of explicit formulations. In one case, the variables y_{ij} define the product x_{i}x_{j} precisely for all feasible solutions \mathbf{x} of the model; i.e., y_{ij} takes only the values zero and one, and y_{ij}=1 precisely when x_{i}=x_{j}=1. We call MILP formulations having this property precise models. In the other class of MILP models, the above definition of the y_{ij} variables is guaranteed only when \mathbf{x} is an optimal solution of the model. We call such MILP formulations optimality restricted models. Optimality restricted models normally have fewer constraints and mostly run faster than precise models. When the model is solved to optimality, the optimal solution and the objective function value produced are indeed accurate under both precise and optimality restricted models. On the other hand, if optimality is not reached and a solver terminates the program due to, say, a time limit, the objective function value of the best solution produced, say \mathbf{x}^{0}, may not be equal to f(\mathbf{x}^{0})=(\mathbf{x}^{0})^{T}Q\mathbf{x}^{0}+\mathbf{c}^{T}\mathbf{x}^{0} for an optimality restricted model. To get the precise objective function value of such a (heuristic) solution, one may need to recompute f(\mathbf{x}^{0}). After recomputation, the resulting objective function value is unpredictable and could be significantly worse or significantly better than the value reported by the solver. Later, we will illustrate this with an example. Thus, it is important to study both precise and optimality restricted models separately.

The MILP formulations are not simply a tool to compute an exact optimal solution. Considering the significant power of modern day MILP solvers such as Gurobi and Cplex, they are often used to obtain heuristic solutions by imposing time restrictions or could be used in developing sophisticated matheuristics [36]. For such purposes, precise models are more valuable. However, to simply compute an optimal solution, optimality restricted models are, in general, more useful. Because of this, after discussing each precise model, we also present a corresponding optimality restricted model, whenever applicable.

In this paper, we present the first systematic experimental and theoretical study of explicit linearizations of QUBO. Starting with a review of existing explicit linearizations of QUBO, we present different ways of generating new explicit linearizations and analyze their quality from an LP relaxation point of view. One class of our new explicit linearizations uses only 2n general constraints, which matches the number of general constraints in many of the compact formulations [3, 5, 9, 28], and also has an LP relaxation bound that matches the bound of the standard linearization. To the best of our knowledge, no such compact explicit linearizations were known before. We also present a detailed and systematic experimental analysis using all of the explicit linearizations we obtained, offering insights into the relative merits of these models.

One of the techniques we use in obtaining new linearizations is selective aggregation of constraints. Weighted aggregations that preserve the set of feasible solutions of a Diophantine system in non-negative variables have been investigated by many researchers [15, 59, 22, 32, 51, 40, 42, 37]. While theoretically interesting, such an approach never worked well in practice, particularly due to the large coefficients associated with the aggregations and the weaker LP relaxation bounds that result. We show that selective aggregation of constraints provides a valid MILP model for QUBO for any selection of weights in the aggregation. In particular, by choosing these weights carefully, we can achieve an LP relaxation bound that is the same as that of the non-aggregated model. Our work also has linkages with surrogate duality, providing a strong duality relationship, unlike the weak duality generally seen in MILP [14, 18, 19, 21].

2 Basic Explicit MILP models

Let us now discuss four basic explicit linearizations of QUBO and their corresponding optimality restricted versions. Linearizing the product x_{i}x_{j} of binary variables is achieved by replacing it with a variable y_{ij} and introducing additional constraints that force y_{ij} to take the value 0 or 1 only, in such a way that y_{ij}=1 if and only if x_{i}=x_{j}=1. The process of replacing x_{i}x_{j} by a single variable y_{ij} as discussed above is called linearization of the product x_{i}x_{j}. The product x_{i}x_{j} can be linearized in many ways. Perhaps the oldest such linearization technique was discussed explicitly by Watters [55], and it was implicit in other works as well [13, 16, 17, 25, 58]. To linearize the product x_{i}x_{j}, Watters [55] introduced the constraints (which are implicit in the work of Dantzig [13])

xi+xjyij1\displaystyle x_{i}+x_{j}-y_{ij}\leq 1 (1)
2yijxixj0\displaystyle 2y_{ij}-x_{i}-x_{j}\leq 0 (2)
yij,xi,xj{0,1}\displaystyle y_{ij},x_{i},x_{j}\in\{0,1\} (3)

This leads to the following 0-1 linear programming formulation of QUBO [13, 55].

DW: Maximize i=1njRiqijyij+i=1ncixi\displaystyle\sum_{i=1}^{n}\sum_{j\in R_{i}}q_{ij}y_{ij}+\sum_{i=1}^{n}c_{i}x_{i}
Subject to: xi+xjyij1 for all jRi,i=1,2,,n,\displaystyle x_{i}+x_{j}-y_{ij}\leq 1\mbox{ for all }j\in R_{i},i=1,2,\ldots,n, (4)
2yijxixj0 for all jRi,i=1,2,,n,\displaystyle 2y_{ij}-x_{i}-x_{j}\leq 0\mbox{ for all }j\in R_{i},i=1,2,\ldots,n, (5)
xi{0,1} for all i=1,2,,n,\displaystyle x_{i}\in\{0,1\}\mbox{ for all }i=1,2,\ldots,n, (6)
yij{0,1} for all jRi,i=1,2,,n.\displaystyle y_{ij}\in\{0,1\}\mbox{ for all }j\in R_{i},i=1,2,\ldots,n. (7)

This formulation has at most 2n(n-1) general constraints and n^{2}+n binary variables.
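As an illustration, the following sketch builds the DW model with an off-the-shelf modeling layer (PuLP with its bundled CBC solver is assumed here purely for illustration; any MILP interface would do). The helper name and the small instance are ours, not from the paper.

```python
import pulp

def build_dw(Q, c):
    # DW linearization: a binary y_ij for every ordered pair with q_ij != 0.
    n = len(c)
    R = {i: [j for j in range(n) if Q[i][j] != 0] for i in range(n)}
    prob = pulp.LpProblem("DW", pulp.LpMaximize)
    x = [pulp.LpVariable(f"x_{i}", cat="Binary") for i in range(n)]
    y = {(i, j): pulp.LpVariable(f"y_{i}_{j}", cat="Binary")
         for i in range(n) for j in R[i]}
    prob += (pulp.lpSum(Q[i][j] * y[i, j] for (i, j) in y)
             + pulp.lpSum(c[i] * x[i] for i in range(n)))
    for (i, j) in y:
        prob += x[i] + x[j] - y[i, j] <= 1          # constraint (4)
        prob += 2 * y[i, j] - x[i] - x[j] <= 0      # constraint (5)
    return prob, x, y

Q = [[0, 3, -2], [3, 0, 1], [-2, 1, 0]]             # illustrative instance
c = [1, -1, 0]
prob, x, y = build_dw(Q, c)
prob.solve(pulp.PULP_CBC_CMD(msg=False))
print(pulp.value(prob.objective), [int(v.value()) for v in x])
```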

Using a disaggregation of constraints (2), Glover and Woolsey [23] proposed to linearize the product x_{i}x_{j} by introducing the constraints

xi+xjyij1\displaystyle x_{i}+x_{j}-y_{ij}\leq 1 (8)
yijxi\displaystyle y_{ij}\leq x_{i} (9)
yijxj\displaystyle y_{ij}\leq x_{j} (10)
yij0,xi,xj{0,1}.\displaystyle y_{ij}\geq 0,x_{i},x_{j}\in\{0,1\}. (11)

Note that y_{ij} is now a continuous variable and the number of constraints for each (i,j) pair is increased by one. This leads to the following 0-1 programming formulation of QUBO [23].

GW: Maximize i=1njRiqijyij+i=1ncixi\displaystyle\sum_{i=1}^{n}\sum_{j\in R_{i}}q_{ij}y_{ij}+\sum_{i=1}^{n}c_{i}x_{i}
Subject to: xi+xjyij1 for all jRi,i=1,2,,n,\displaystyle x_{i}+x_{j}-y_{ij}\leq 1\mbox{ for all }j\in R_{i},i=1,2,\ldots,n, (12)
yijxi0 for all jRi,i=1,2,,n\displaystyle y_{ij}-x_{i}\leq 0\mbox{ for all }j\in R_{i},i=1,2,\ldots,n (13)
yjixi0 for all jSi,i=1,2,,n\displaystyle y_{ji}-x_{i}\leq 0\mbox{ for all }j\in S_{i},i=1,2,\ldots,n (14)
xi{0,1} for all i=1,2,n.\displaystyle x_{i}\in\{0,1\}\mbox{ for all }i=1,2\ldots,n. (15)
yij0 for all jRi,i=1,2,,n.\displaystyle y_{ij}\geq 0\mbox{ for all }j\in R_{i},i=1,2,\ldots,n. (16)

where S_{i}=\{j:q_{ji}\neq 0\}. Since Q is symmetric, R_{i}=S_{i} for i=1,2,\ldots,n. The formulation GW has 3n(n-1) general constraints, an increase compared to DW. GW also has n binary variables and n(n-1) continuous variables. Note that constraints (13) and (14) are restatements of constraints (9) and (10) when taken over all pairs i and j. We use this representation to aid easy presentation of some of our forthcoming results. In the literature, constraints (9) and (10) are normally used instead of (13) and (14) when describing GW. GW is perhaps the most popular linearization studied in the literature for QUBO and is often called the standard linearization or the Glover-Woolsey linearization.

A variation of GW, often attributed to the works of Fortet [16, 17] can be stated as follows.

FT: Maximize i=1njRiqijyij+i=1ncixi\displaystyle\sum_{i=1}^{n}\sum_{j\in R_{i}}q_{ij}y_{ij}+\sum_{i=1}^{n}c_{i}x_{i}
Subject to: xi+xjyij1 for all jRi,i=1,2,,n,\displaystyle x_{i}+x_{j}-y_{ij}\leq 1\mbox{ for all }j\in R_{i},i=1,2,\ldots,n, (17)
yijxi0 for all jRi,i=1,2,,n\displaystyle y_{ij}-x_{i}\leq 0\mbox{ for all }j\in R_{i},i=1,2,\ldots,n (18)
yijyji0 for all jRi,i=1,2,,n\displaystyle y_{ij}-y_{ji}\leq 0\mbox{ for all }j\in R_{i},i=1,2,\ldots,n (19)
xi{0,1} for all i=1,2,n.\displaystyle x_{i}\in\{0,1\}\mbox{ for all }i=1,2\ldots,n. (20)
yij0 for all jRi,i=1,2,,n.\displaystyle y_{ij}\geq 0\mbox{ for all }j\in R_{i},i=1,2,\ldots,n. (21)

Constraints (19) imply y_{ij}=y_{ji} for j\in R_{i}, i=1,2,\ldots,n. In fact, we can replace constraints (19) by

yij=yji for all jRi,j>i,i=1,2,,n.y_{ij}=y_{ji}\mbox{ for all }j\in R_{i},j>i,i=1,2,\ldots,n. (22)

The formulation FT has 3n(n-1) general constraints, n binary variables, and n(n-1) continuous variables. The number of constraints reduces to \frac{5}{2}n(n-1) when constraints (19) are replaced by (22).

Let us now propose a new basic linearization of QUBO. Note that DW can be viewed as pairwise aggregation of the constraints (9) and (10) of GW along with additional binary restriction on the yijy_{ij} variables. Using this line of reasoning and applying pairwise aggregation of constraints (13) and (14) of GW, we get the linearization

PK: Maximize i=1njRiqijyij+i=1ncixi\displaystyle\sum_{i=1}^{n}\sum_{j\in R_{i}}q_{ij}y_{ij}+\sum_{i=1}^{n}c_{i}x_{i}
Subject to: xi+xjyij1 for all jRi,i=1,2,,n,\displaystyle x_{i}+x_{j}-y_{ij}\leq 1\mbox{ for all }j\in R_{i},i=1,2,\ldots,n, (23)
yij+yji2xi0 for all jRi,i=1,2,,n,\displaystyle y_{ij}+y_{ji}-2x_{i}\leq 0\mbox{ for all }j\in R_{i},i=1,2,\ldots,n, (24)
xi{0,1} for all i=1,2,n.\displaystyle x_{i}\in\{0,1\}\mbox{ for all }i=1,2\ldots,n. (25)
yij0 for all jRi,i=1,2,,n.\displaystyle y_{ij}\geq 0\mbox{ for all }j\in R_{i},i=1,2,\ldots,n. (26)
Theorem 1.

PK is a valid MILP formulation of QUBO.

Proof.

Consider the variables xix_{i} and xjx_{j}. When xi=0x_{i}=0 or xj=0x_{j}=0, by constraint (24), yij=yji=0y_{ij}=y_{ji}=0. Suppose xi=xj=1x_{i}=x_{j}=1. Then, from constraint (23),

yij1 and yji1.y_{ij}\geq 1\mbox{ and }y_{ji}\geq 1. (27)

From constraint (24), we have yij+yji2y_{ij}+y_{ji}\leq 2. In view of (27), yij=yji=1y_{ij}=y_{ji}=1. ∎

PK has at most 2n(n-1) general constraints, n binary variables, and n(n-1) continuous variables. Unlike DW, the variables y_{ij} in PK are continuous, and it is the most compact basic linearization of QUBO.
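A quick numerical illustration of Theorem 1 (a sketch, assuming SciPy's linprog is available): fix x_{i}=x_{j}=1 and check that the PK constraints for that pair leave no slack in y_{ij}, i.e., both its minimum and maximum over the feasible region equal 1.

```python
from scipy.optimize import linprog

# Variables (y_ij, y_ji) >= 0 for one pair, with x_i = x_j = 1 fixed.
A_ub = [[-1.0, 0.0],   # (23) for (i, j): 1 + 1 - y_ij <= 1  =>  -y_ij <= -1
        [0.0, -1.0],   # (23) for (j, i): -y_ji <= -1
        [1.0, 1.0]]    # (24): y_ij + y_ji <= 2*x_i = 2 (same for index j)
b_ub = [-1.0, -1.0, 2.0]
bounds = [(0, None), (0, None)]

lo = linprog(c=[1, 0], A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
hi = linprog(c=[-1, 0], A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
print(lo.fun, -hi.fun)  # both 1.0: y_ij is forced to equal x_i * x_j
```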

2.1 Optimality restricted variations of the basic models

In Section 1, we discussed the concept of optimality restricted variations of a linearization of QUBO, their merits and drawbacks, and why it is important and relevant to investigate them separately from precise models. Let us now discuss the optimality restricted variations derived from the four basic linearization models discussed earlier in this section. Let R^{+}_{i}=\{j:q_{ij}>0\} and R^{-}_{i}=\{j:q_{ij}<0\}. First note that for DW, the binary restriction on the y_{ij} variables can be relaxed to y_{ij}\geq 0 for j\in R^{-}_{i},i=1,2,\ldots,n. This provides an optimality restricted variation of DW. To see why this works, note that when x_{i}=x_{j}=0, constraint (5) forces y_{ij}=0 without requiring y_{ij} to be binary. Likewise, when x_{i}=x_{j}=1, constraints (4) and (5) force y_{ij}=1 without requiring y_{ij} to be binary. The binary restriction is relevant only when precisely one of x_{i} or x_{j} is 1; in this case constraint (5) becomes y_{ij}\leq\frac{1}{2}, and the binary restriction then forces y_{ij} to be zero. But if q_{ij}<0, then y_{ij} will take value 0 at optimality regardless of whether y_{ij} is binary or not. In fact, we can simplify the model further. Note that the general linearization constraints (1) and (2) can be rewritten as

x_{i}+x_{j}-1\leq y_{ij}\leq(1/2)(x_{i}+x_{j})

Thus, constraint (4) is not binding at optimality if j\in R^{+}_{i} and hence can be removed. Similarly, constraint (5) can be removed if j\in R^{-}_{i}. Thus, an optimality restricted version of DW can be stated as

ORDW: Maximize i=1njRiqijyij+i=1ncixi\displaystyle\sum_{i=1}^{n}\sum_{j\in R_{i}}q_{ij}y_{ij}+\sum_{i=1}^{n}c_{i}x_{i}
Subject to: xi+xjyij1 for all jRi,i=1,2,,n,\displaystyle x_{i}+x_{j}-y_{ij}\leq 1\mbox{ for all }j\in R^{-}_{i},i=1,2,\ldots,n, (28)
2yijxixj0 for all jRi+,i=1,2,,n,\displaystyle 2y_{ij}-x_{i}-x_{j}\leq 0\mbox{ for all }j\in R^{+}_{i},i=1,2,\ldots,n, (29)
xi{0,1} for all i=1,2,n,\displaystyle x_{i}\in\{0,1\}\mbox{ for all }i=1,2\ldots,n, (30)
yij{0,1} for all jRi+,i=1,2,,n,\displaystyle y_{ij}\in\{0,1\}\mbox{ for all }j\in R_{i}^{+},i=1,2,\ldots,n, (31)
yij0 for all jRi,i=1,2,,n.\displaystyle y_{ij}\geq 0\mbox{ for all }j\in R_{i}^{-},i=1,2,\ldots,n. (32)

The formulation ORDW has \sum_{i=1}^{n}\left(|R^{-}_{i}|+|R^{+}_{i}|\right) general constraints and n+\sum_{i=1}^{n}|R^{+}_{i}| binary variables.
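The index sets that drive these counts are straightforward to compute; a small sketch (Python, with an illustrative instance of our own) follows, tallying the quantities quoted above for ORDW.

```python
def index_sets(Q):
    # R_i = {j : q_ij != 0}, split by the sign of q_ij.
    n = len(Q)
    R  = {i: [j for j in range(n) if Q[i][j] != 0] for i in range(n)}
    Rp = {i: [j for j in R[i] if Q[i][j] > 0] for i in range(n)}
    Rm = {i: [j for j in R[i] if Q[i][j] < 0] for i in range(n)}
    return R, Rp, Rm

Q = [[0, 3, -2], [3, 0, 1], [-2, 1, 0]]   # illustrative symmetric instance
R, Rp, Rm = index_sets(Q)
n = len(Q)
general_constraints = sum(len(Rm[i]) + len(Rp[i]) for i in range(n))
binary_variables = n + sum(len(Rp[i]) for i in range(n))   # x's plus binary y's
print(general_constraints, binary_variables)               # 6 and 7 here
```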

While the optimality restricted model ORDW guarantees an optimal solution, if the solver is terminated before reaching an optimal solution, the y_{ij} values in the best solution found need not equal x_{i}x_{j}. Consequently, the reported objective function value could be erroneous, and the correct QUBO objective function value needs to be recomputed using the available values of x_{i}, i=1,2,\ldots,n. Let us illustrate this with an example. Consider the instance EX1 of QUBO with

Q=\begin{bmatrix}0&\alpha\\ \alpha&0\end{bmatrix}\mbox{ and }\mathbf{c}^{T}=\begin{bmatrix}0&0\end{bmatrix},

where \alpha>0. Suppose that we use ORDW to solve EX1 and terminate the solver at the solution x_{1}=x_{2}=1, y_{12}=y_{21}=0. Note that this is a feasible solution to ORDW (but not feasible for DW) with objective function value 0. But the QUBO objective function value of this solution is 2\alpha, which can be arbitrarily larger than the value the solver produced for the model ORDW when terminated prematurely.

Thus, when MILP models in optimality restricted form are used as a heuristic, by choosing the best solution produced within a given time or space limit, it is important to recompute the objective function value. For precise models like DW, such an issue does not arise.
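For instance, a few lines (a sketch, assuming NumPy) recover the true value of the prematurely returned EX1 solution:

```python
import numpy as np

def recompute_objective(Q, c, x0):
    # True QUBO value of a heuristic solution; the value reported by an
    # optimality restricted model may differ when terminated early.
    x0 = np.asarray(x0, dtype=float)
    return float(x0 @ Q @ x0 + c @ x0)

alpha = 5.0                                    # any alpha > 0
Q_ex1 = np.array([[0.0, alpha], [alpha, 0.0]])
c_ex1 = np.zeros(2)
# ORDW reports value 0 for x1 = x2 = 1, y12 = y21 = 0; the true value is 2*alpha.
print(recompute_objective(Q_ex1, c_ex1, [1, 1]))   # 10.0
```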

Let us now look at an optimality restricted version of GW. Note that the linearization constraints in GW can be restated as

\max\{0,x_{i}+x_{j}-1\}\leq y_{ij}\leq\min\{x_{i},x_{j}\}

Thus, constraints (12) and (16) are redundant at optimality if jRi+j\in R^{+}_{i} and constraints (13) and (14) are redundant at optimality if jRij\in R^{-}_{i}. Removing these redundant constraints, GW can be restated in the optimality restricted form as

ORGW: Maximize i=1njRiqijyij+i=1ncixi\displaystyle\sum_{i=1}^{n}\sum_{j\in R_{i}}q_{ij}y_{ij}+\sum_{i=1}^{n}c_{i}x_{i}
Subject to: xi+xjyij1 for all jRi,i=1,2,,n,\displaystyle x_{i}+x_{j}-y_{ij}\leq 1\mbox{ for all }j\in R^{-}_{i},i=1,2,\ldots,n, (33)
yijxi0 for all jRi+,i=1,2,,n\displaystyle y_{ij}-x_{i}\leq 0\mbox{ for all }j\in R^{+}_{i},i=1,2,\ldots,n (34)
yjixi0 for all jSi+,i=1,2,,n\displaystyle y_{ji}-x_{i}\leq 0\mbox{ for all }j\in S^{+}_{i},i=1,2,\ldots,n (35)
xi{0,1} for all i=1,2,n,\displaystyle x_{i}\in\{0,1\}\mbox{ for all }i=1,2\ldots,n, (36)
yij0 for all jRi,i=1,2,,n.\displaystyle y_{ij}\geq 0\mbox{ for all }j\in R^{-}_{i},i=1,2,\ldots,n. (37)

The formulation ORGW has at most 2n(n1)2n(n-1) general constraints. Precisely, the number of general constraints will be i=1n(2|Ri+|+|Ri|)\sum_{i=1}^{n}\left(2|R_{i}^{+}|+|R_{i}^{-}|\right)

The main linearization constraints of FT can be stated as

\max\{0,x_{i}+x_{j}-1\}\leq y_{ij}\leq x_{i}

Thus, constraints (17) and (21) are redundant at optimality if jRij\in R^{-}_{i} and (18) and (19) are redundant at optimality if jRi+j\in R^{+}_{i}. Removing these redundant constraints, FT can be restated in a modified form as

ORFT: Maximize i=1njRiqijyij+i=1ncixi\displaystyle\sum_{i=1}^{n}\sum_{j\in R_{i}}q_{ij}y_{ij}+\sum_{i=1}^{n}c_{i}x_{i}
Subject to: xi+xjyij1 for all jRi,i=1,2,,n,\displaystyle x_{i}+x_{j}-y_{ij}\leq 1\mbox{ for all }j\in R^{-}_{i},i=1,2,\ldots,n, (38)
yijxi0 for all jRi+,i=1,2,,n,\displaystyle y_{ij}-x_{i}\leq 0\mbox{ for all }j\in R^{+}_{i},i=1,2,\ldots,n, (39)
yijyji for all jRi+,i=1,2,,n,\displaystyle y_{ij}\leq y_{ji}\mbox{ for all }j\in R^{+}_{i},i=1,2,\ldots,n, (40)
xi{0,1} for all i=1,2,n,\displaystyle x_{i}\in\{0,1\}\mbox{ for all }i=1,2\ldots,n, (41)
yij0 for all jRi,i=1,2,,n.\displaystyle y_{ij}\geq 0\mbox{ for all }j\in R^{-}_{i},i=1,2,\ldots,n. (42)

The model ORFT has \sum_{i=1}^{n}\left(2|R_{i}^{+}|+|R_{i}^{-}|\right) general constraints, n binary variables, and \sum_{i=1}^{n}|R_{i}^{-}| continuous variables.

The linearization constraint (23) of PK can be written as

\max\{0,x_{i}+x_{j}-1\}\leq y_{ij}

Thus, constraints (23) and (26) are redundant at optimality for j\in R^{+}_{i}. For q_{ij}=q_{ji}>0, if at least one of x_{i} or x_{j} is 0, then at optimality y_{ij}=y_{ji}=0, but if x_{i}=x_{j}=1, then the other linearization constraint

y_{ij}+y_{ji}\leq 2x_{i}

will become y_{ij}+y_{ji}\leq 2 and will be satisfied with equality by an optimal solution in this case. Thus, upper bound restrictions on y_{ij} are not required here. Since Q is symmetric, q_{ij}=q_{ji} and hence they have the same sign. As a result, constraint (24) is redundant at optimality for j\in R^{-}_{i}. To avoid unboundedness, we need to add the constraint y_{ij}\geq 0 for all j\in R^{-}_{i}, i=1,2,\ldots,n. Thus, the optimality restricted version of PK can be stated as

ORPK: Maximize i=1njRiqijyij+i=1ncixi\displaystyle\sum_{i=1}^{n}\sum_{j\in R_{i}}q_{ij}y_{ij}+\sum_{i=1}^{n}c_{i}x_{i}
Subject to: xi+xjyij1 for all jRi,i=1,2,,n,\displaystyle x_{i}+x_{j}-y_{ij}\leq 1\mbox{ for all }j\in R^{-}_{i},i=1,2,\ldots,n, (43)
yij+yji2xi0 for all jRi+,i=1,2,,n,\displaystyle y_{ij}+y_{ji}-2x_{i}\leq 0\mbox{ for all }j\in R^{+}_{i},i=1,2,\ldots,n, (44)
xi{0,1} for all i=1,2,n.\displaystyle x_{i}\in\{0,1\}\mbox{ for all }i=1,2\ldots,n. (45)
yij0 for all jRi,i=1,2,,n.\displaystyle y_{ij}\geq 0\mbox{ for all }j\in R^{-}_{i},i=1,2,\ldots,n. (46)

The number of general constraints and continuous variables in ORPK is i=1n(|Ri|+|Ri+|)\sum_{i=1}^{n}\left(|R^{-}_{i}|+|R^{+}_{i}|\right) and (i=1n|Ri|)\left(\sum_{i=1}^{n}|R^{-}_{i}|\right) respectively.

3 Linearization by Weighted Aggregation

Let us now discuss some general techniques to generate new explicit linearizations of QUBO from DW, GW, FT and PK using weighted aggregation of selected constraints, thereby reducing the total number of constraints in the resulting model. In general, for integer (binary) programs, weighted aggregation of constraints alters the problem characteristics, and the resulting models only provide an upper bound. This is the principle used in surrogate constraints and related duality [14, 18, 19, 21]. Precise aggregations have also been studied in the literature but are not suitable for practical applications. Interestingly, by selecting the constraints to be aggregated carefully, we show that new precise formulations for QUBO can be obtained that have interesting theoretical properties and practical value.

An unweighted aggregation model derived from GW was presented by Glover and Woolsey [24] using constraints (13) and (14) with amendments provided by Goldman [25]. To the best of our knowledge, no other aggregation models (weighted or unweighted) are discussed in the literature for explicit linearizations of QUBO using general weights. The effect of these aggregations on the quality of the corresponding LP relaxations will be discussed later.

3.1 Weighted aggregation of type 1 linearization constraints

Recall that all of the four basic models have the constraints

x_{i}+x_{j}-y_{ij}\leq 1\mbox{ for all }j\in R_{i},i=1,2,\ldots,n

and we call them type 1 linearization constraints. Let αij>0,jRi,i=1,2,,n\alpha_{ij}>0,j\in R_{i},i=1,2,\ldots,n. Consider the system of nn linear inequalities

jRiαij(xjyij)(jRiαij)(1xi), for i=1,2,,n.\sum_{j\in R_{i}}\alpha_{ij}(x_{j}-y_{ij})\leq\left(\sum_{j\in R_{i}}\alpha_{ij}\right)(1-x_{i}),\mbox{ for }i=1,2,\ldots,n. (47)

obtained by weighted aggregation of type 1 linearization constraints.

Let DW(\alpha) be the mixed integer linear program obtained from DW by replacing constraint set (4) by (47). When q_{ij}=0, neither y_{ij} nor y_{ji} is defined. If q_{ij}\neq 0, then by the symmetry of Q, q_{ji}=q_{ij}\neq 0. Thus, y_{ij} is defined if and only if y_{ji} is defined. Moreover, when both y_{ij} and y_{ji} are defined, j\in R_{i} if and only if i\in R_{j}.
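A sketch of how the n aggregated constraints (47) can be generated, continuing the PuLP conventions of the DW sketch in Section 2 (the containers x, y, R and the weight dictionary alpha are assumed to be built as there):

```python
import pulp

def add_aggregated_type1(prob, x, y, R, alpha):
    # One constraint (47) per row i, replacing the |R_i| type 1 constraints (4);
    # alpha[i, j] > 0 can be any positive weights.
    for i in range(len(x)):
        if not R[i]:
            continue
        total = sum(alpha[i, j] for j in R[i])
        prob += (pulp.lpSum(alpha[i, j] * (x[j] - y[i, j]) for j in R[i])
                 <= total * (1 - x[i]))
```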

Theorem 2.

DW(α\alpha) is a valid formulation of QUBO.

Proof.

For any pair (i,j) with j\in R_{i} we have to show that y_{ij}=y_{ji}=x_{i}x_{j}. Suppose x_{i}=0. Choose any j\in R_{i}. By constraint set (5) we have

2y_{ij}\leq x_{j}\mbox{ and }2y_{ji}\leq x_{j}

Since x_{j}\in\{0,1\} and y_{ij},y_{ji}\in\{0,1\}, we have y_{ij}=0 and y_{ji}=0. Now, suppose x_{i}=1. From the constraint set (47),

\sum_{j\in R_{i}}\alpha_{ij}(x_{j}-y_{ij})\leq 0

If x_{j}-y_{ij}<0 for some j\in R_{i}, then y_{ij}\neq 0. Since y_{ij}\in\{0,1\}, y_{ij}=1 and hence x_{j}=0. But we already established that if x_{j}=0 then y_{ij}=0, a contradiction. If x_{j}-y_{ij}>0 for some j\in R_{i}, then, since the weighted sum is non-positive and all \alpha_{ij}>0, there exists a k\in R_{i} such that x_{k}-y_{ik}<0, which is impossible as established above. Thus x_{j}-y_{ij}=0, i.e., y_{ij}=x_{j}, for all j\in R_{i}. In particular, for any j\in R_{i}, if x_{j}=1 then y_{ij}=1. Since j\in R_{i} we have i\in R_{j}, so if x_{i}=1 and x_{j}=1, a similar analysis gives y_{ji}=1. This completes the proof. ∎

Let PK(\alpha) be the mixed integer linear program obtained from PK by replacing constraint set (23) with (47) and by adding the upper bound constraints y_{ij}\leq 1 for j\in R_{i},i=1,2,\ldots,n.

Theorem 3.

PK(α\alpha) is a valid formulation of QUBO.

Proof.

If x_{i}=0 for any i, then by constraints (24), y_{ij}+y_{ji}\leq 0, yielding y_{ij}=y_{ji}=0 for any j. Now suppose x_{i}=1. Then, from constraints (47),

\sum_{j\in R_{i}}\alpha_{ij}(x_{j}-y_{ij})\leq 0

If x_{j}-y_{ij}<0 for some j, then y_{ij}\neq 0. If 0\leq y_{ij}\leq 1, then x_{j}=0, which by constraint (24) written for index j implies y_{ij}=0, a contradiction; thus y_{ij}>1, which contradicts the upper bound restriction on y_{ij}. If x_{j}-y_{ij}>0 for some j, then there exists a k\in R_{i} such that x_{k}-y_{ik}<0, a contradiction. Thus, y_{ij}=x_{j} for all j\in R_{i}. Hence, if x_{j}=1 then y_{ij}=1. If j\in R_{i} then i\in R_{j}, and using similar arguments with x_{j}=1 we get y_{ji}=1. Thus, if x_{i}=x_{j}=1 then y_{ij}=y_{ji}=1, and this concludes the proof. ∎

The upper bound of 1 on the variables yijy_{ij} is required for this formulation to be valid. Consider the instance EX2 of QUBO with

Q=\begin{pmatrix}0&3&-6&-3\\ 3&0&6&3\\ -6&6&0&-6\\ -3&3&-6&0\\ \end{pmatrix}\mbox{ and }\mathbf{c}^{T}=(-3,-6,0,3)

and choose \alpha_{14}=3, \alpha_{34}=6, and all other \alpha_{ij}=1 for j\in R_{i}, i=1,\ldots,n. If the constraints y_{ij}\leq 1 are not used, an optimal solution of the formulation PK(\alpha) is x_{1}=0, x_{2}=x_{3}=x_{4}=1, y_{23}=y_{42}=2, y_{34}=7/6, with the remaining y_{ij} equal to 0 and objective function value 8. But the optimal objective function value of this problem is 6, attained at the optimal solution x_{2}=x_{3}=1 and x_{1}=x_{4}=0.
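The claim can be checked directly; the following sketch (plain Python, with the EX2 data copied above using 0-based indices) verifies that the fractional point satisfies every PK(\alpha) constraint once the bound y_{ij}\leq 1 is dropped, and that its value 8 exceeds the true optimum 6.

```python
Q = [[0, 3, -6, -3],
     [3, 0, 6, 3],
     [-6, 6, 0, -6],
     [-3, 3, -6, 0]]
c = [-3, -6, 0, 3]
n = 4
R = {i: [j for j in range(n) if Q[i][j] != 0] for i in range(n)}

alpha = {(i, j): 1.0 for i in range(n) for j in R[i]}
alpha[0, 3] = 3.0        # alpha_14 in the paper's 1-based notation
alpha[2, 3] = 6.0        # alpha_34

x = [0, 1, 1, 1]
y = {(i, j): 0.0 for i in range(n) for j in R[i]}
y[1, 2] = 2.0            # y_23
y[3, 1] = 2.0            # y_42
y[2, 3] = 7.0 / 6.0      # y_34

eps = 1e-9
for i in range(n):       # aggregated type 1 constraints (47)
    lhs = sum(alpha[i, j] * (x[j] - y[i, j]) for j in R[i])
    rhs = sum(alpha[i, j] for j in R[i]) * (1 - x[i])
    assert lhs <= rhs + eps
for i in range(n):       # pairwise type 2 constraints (24)
    for j in R[i]:
        assert y[i, j] + y[j, i] - 2 * x[i] <= eps

obj = sum(Q[i][j] * y[i, j] for i in range(n) for j in R[i]) \
      + sum(c[i] * x[i] for i in range(n))
print(obj)               # 8.0, above the true QUBO optimum of 6
```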

Let GW(α\alpha) be the mixed integer linear program obtained from GW by replacing constraints set (12) by (47).

Theorem 4.

GW(α\alpha) is a valid formulation of QUBO.

Proof.

If xi=0x_{i}=0 then by constraints (13) and (14) we have yij=yji=0y_{ij}=y_{ji}=0 for all jRij\in R_{i}. Also, we have an implicit upper bound of 11 on yijy_{ij} for all ii and jj. Now, following arguments similar to that of proof of Theorem 3, we have if xi=xj=1x_{i}=x_{j}=1 then yij=yji=1y_{ij}=y_{ji}=1. This completes the proof. ∎

Finally, let FT(\alpha) be the mixed integer linear program obtained from FT by replacing constraint set (17) by (47).

Theorem 5.

FT(α\alpha) is a valid formulation of QUBO.

Proof.

If xi=0x_{i}=0 then by constraints (18) we have yij=0y_{ij}=0 for all jRij\in R_{i} and by constraints (19) we have yji=0y_{ji}=0 for jRij\in R_{i}. Also, we have an implicit upper bound of 11 on yijy_{ij} for all ii and jj. Now, following arguments similar to that of proof of Theorem 3, we have if xi=xj=1x_{i}=x_{j}=1 then yij=yji=1y_{ij}=y_{ji}=1. This completes the proof. ∎

When αij=1\alpha_{ij}=1 for all i,ji,j, we have the unit-weight special cases of DW(α\alpha),GW(α\alpha), FT(α\alpha) and PK(α\alpha). In this case, the aggregated constraint is

jRi(xjyij)|Ri|(1xi), for i=1,2,,n.\sum_{j\in R_{i}}(x_{j}-y_{ij})\leq|R_{i}|(1-x_{i}),\mbox{ for }i=1,2,\ldots,n. (48)

Note that the number of general constraints is n2n^{2} in each of the models DW(α\alpha) and PK(α\alpha), a reduction from 2n(n1)2n(n-1) for DW and PK. The number of general constraints in FT(α\alpha) and GW(α\alpha) is n(2n1)n(2n-1) a reduction from 3n(n1)3n(n-1) constraints in FT and GW.

3.2 Weighted aggregation of type 2 linearization constraints

The collection of linearization constraints in our basic models that are not of type 1 are called type 2 linearization constraints. For DW and PK, there is only one class of type 2 linearization constraints. For GW and FT there are two classes of type 2 linearization constraints. In such cases, there are different types of aggregations possible.

Let us start with aggregation of type 2 linearization constraints (24) from PK. For βij>0,jRi,i=1,2,,n\beta_{ij}>0,j\in R_{i},i=1,2,\ldots,n, consider the inequality

jRiβij(yij+yji)2(jRiβij)xi, for i=1,2,,n.\sum_{j\in R_{i}}\beta_{ij}(y_{ij}+y_{ji})\leq 2\left(\sum_{j\in R_{i}}\beta_{ij}\right)x_{i},\mbox{ for }i=1,2,\ldots,n. (49)

Let PK(,β*,\beta) be the MILP obtained from PK by replacing (24) with (49) and adding the constraints yij1,jRi,i=1,2,,ny_{ij}\leq 1,j\in R_{i},i=1,2,\ldots,n. The notation PK(,β*,\beta) indicates type 1 linearization constraints are not aggregated and type 2 linearization constraints are aggregated using parameter values represented by βij\beta_{ij}.

Theorem 6.

PK(,β*,\beta) is a valid MILP model for QUBO.

Proof.

If x_{i}=0 then constraint (49), together with the nonnegativity of the y variables, implies y_{ij}+y_{ji}=0, which guarantees that y_{ij}=y_{ji}=0 for all j\in R_{i}. If x_{i}=1 then constraint (49) is redundant. But then, constraint (23) assures that y_{ij}\geq x_{j} and y_{ji}\geq x_{j}. We also have the constraint y_{ij}\leq 1. Thus, if x_{j} and x_{i} are both 1, then y_{ij}=1 and y_{ji}=1. If x_{i}=1 but x_{j}=0, then constraint (49) with x_{j} on the right-hand side ensures that y_{ji}=y_{ij}=0, and the result follows. ∎

For the special case of PK(,β*,\beta) where βij=1\beta_{ij}=1 for all jRi,i=1,2,,nj\in R_{i},i=1,2,\ldots,n, the constraint (49) becomes

jRi(yij+yji)2|Ri|xi, for i=1,2,,n.\sum_{j\in R_{i}}(y_{ij}+y_{ji})\leq 2|R_{i}|x_{i},\mbox{ for }i=1,2,\ldots,n. (50)

PK(*,\beta) has at most n^{2} general constraints, a reduction from 2n(n-1) for PK.
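In the same spirit as the type 1 sketch above, the aggregated type 2 constraints (49) of PK(*,\beta), together with the required bounds y_{ij}\leq 1, can be generated as follows (again assuming the PuLP setup of the DW sketch):

```python
import pulp

def add_aggregated_type2_pk(prob, x, y, R, beta):
    # One constraint (49) per row i, replacing the pairwise constraints (24),
    # plus the upper bounds y_ij <= 1 that Theorem 6 requires.
    for i in range(len(x)):
        if R[i]:
            total = sum(beta[i, j] for j in R[i])
            prob += (pulp.lpSum(beta[i, j] * (y[i, j] + y[j, i]) for j in R[i])
                     <= 2 * total * x[i])
    for (i, j) in y:
        prob += y[i, j] <= 1
```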

Although the weighted aggregation of type 2 linearization constraints produced the valid MILP model PK(*,\beta) for QUBO, weighted aggregation of the type 2 linearization constraints of DW need not produce a valid MILP model for QUBO. Let \beta_{ij}>0, j\in R_{i}, i=1,2,\ldots,n. Consider

\sum_{j\in R_{i}}\beta_{ij}(2y_{ij}-x_{j})\leq\left(\sum_{j\in R_{i}}\beta_{ij}\right)x_{i},\mbox{ for }i=1,2,\ldots,n. (51)

Let DW(*,\beta) be the MILP obtained from DW by replacing constraints (5) by (51). Then, DW(*,\beta) is not a valid MILP model for QUBO. For example, consider the instance (EX3) of QUBO where

Q=\begin{pmatrix}0&1&1\\ 1&0&0\\ 1&0&0\\ \end{pmatrix}\mbox{ and }\textbf{c}^{T}=(1,-5,-5)

An optimal solution to the formulation DW(*,\beta) obtained from this instance, with \beta_{ij}=1 for all j\in R_{i}, i=1,\ldots,n, is x_{1}=1, x_{2}=x_{3}=0, y_{12}=1, y_{23}=y_{31}=y_{13}=y_{21}=y_{32}=0 with objective function value 2. Note that y_{12}\neq x_{1}x_{2}. The optimal objective function value for this problem is 1.

Weighted aggregation of type 2 linearization constraints (18) in FT yields another MILP formulation of QUBO. Let γij>0,jRi,i=1,2,,n\gamma_{ij}>0,j\in R_{i},i=1,2,\ldots,n. Now consider the inequality

jRiγijyij(jRiγij)xi, for i=1,2,,n.\sum_{j\in R_{i}}\gamma_{ij}y_{ij}\leq\left(\sum_{j\in R_{i}}\gamma_{ij}\right)x_{i},\mbox{ for }i=1,2,\ldots,n. (52)

Let FT(,γ*,\gamma) be the MILP obtained from FT by replacing the inequality (18) in FT with (52) and adding an upper bound of 1 on all yijy_{ij} variables.

Theorem 7.

FT(,γ*,\gamma) is a valid MILP formulation of QUBO.

Proof.

When xi=0x_{i}=0, constraint (52) assures that yij=0y_{ij}=0. When xi=1x_{i}=1 constraint (52) is redundant. However, in this case, from constraints (17) and the upper bound restrictions on yijy_{ij} we have 1yijxj1\geq y_{ij}\geq x_{j}. Thus, if xj=1x_{j}=1 then yij=1y_{ij}=1. If xi=1x_{i}=1 but xj=0x_{j}=0 then the constraint (52) corresponding to xjx_{j} assures that yji=0y_{ji}=0. But then, by constraint (19), yij=0y_{ij}=0 and the result follows. ∎

FT(*,\gamma) has at most n(2n-1) general constraints and n(n-1) upper-bound constraints. We can also use weighted aggregation of constraints (19) in FT to yield a valid MILP model for QUBO. Let \theta_{ij}>0, j\in R_{i}, i=1,2,\ldots,n. Now consider the inequality

jRiθij(yijyji)0 for i=1,2,,n.\sum_{j\in R_{i}}\theta_{ij}(y_{ij}-y_{ji})\leq 0\mbox{ for }i=1,2,\ldots,n. (53)

Let FT(*,\theta) be the MILP obtained from FT by replacing constraints (19) with (53), and let FT(*,\gamma,\theta) be the MILP obtained by replacing constraints (18) by (52) and (19) by (53), along with the addition of y_{ij}\leq 1 for all j\in R_{i}, i=1,\ldots,n.

Theorem 8.

FT(,θ*,\theta) and FT(,γ,θ*,\gamma,\theta) are valid MILP formulations of QUBO.

We skip the proof of this theorem as it is easy to construct.

FT(*,\gamma,\theta) has n^{2}+n general constraints, and both FT(*,\theta) and FT(*,\gamma) have n(2n-1) general constraints. The corresponding unweighted versions of these formulations can be obtained by setting \gamma_{ij}=\theta_{ij}=1 for all j\in R_{i}, i=1,\ldots,n.
Now, we will explore the weighted aggregation of the type 2 linearization constraints (13) and (14) of GW. Let \gamma_{ij}>0 and \delta_{ij}>0 for all j\in R_{i}, i=1,2,\ldots,n (note that since Q is assumed to be symmetric, R_{i}=S_{i}). Now, consider the inequality

jRiγijyij+jSiδijyji(jRiγij+jSiδij)xi, for i=1,2,,n.\sum_{j\in R_{i}}\gamma_{ij}y_{ij}+\sum_{j\in S_{i}}\delta_{ij}y_{ji}\leq\left(\sum_{j\in R_{i}}\gamma_{ij}+\sum_{j\in S_{i}}\delta_{ij}\right)x_{i},\mbox{ for }i=1,2,\ldots,n. (54)

Let GW(,γ+δ*,\gamma+\delta) be the MILP obtained from GW by replacing (13) and (14) in GW by (54) and adding the upper bound constraints yij1y_{ij}\leq 1 for all ii and jj.

Theorem 9.

GW(,γ+δ*,\gamma+\delta) is a valid MILP model for QUBO.

Proof.

When xi=0x_{i}=0, constraint (54) assures that yij=yji=0y_{ij}=y_{ji}=0. When xi=1x_{i}=1 constraint (54) is redundant. However, in this case, from constraints (12) and the upper bound restrictions on yijy_{ij} we have 1yijxj1\geq y_{ij}\geq x_{j} and 1yjixj1\geq y_{ji}\geq x_{j}. Thus, if xj=1x_{j}=1 we have yij=yji=1y_{ij}=y_{ji}=1. If xj=0x_{j}=0 and xi=1x_{i}=1, using constraint (54) corresponding to xjx_{j}, we have yji=yij=0y_{ji}=y_{ij}=0 and the result follows. ∎

Unlike GW(,γ+δ*,\gamma+\delta), the aggregation of (53) and the following constraint

jRiθij(yijyji)0 for i=1,2,,n.\sum_{j\in R_{i}}\theta_{ij}(y_{ij}-y_{ji})\leq 0\mbox{ for }i=1,2,\ldots,n. (55)

need not produce a valid MILP model corresponding to FT.

Note that PK(,β*,\beta) is the same as GW(,β+β*,\beta+\beta). When γij=δij=1\gamma_{ij}=\delta_{ij}=1 for all ii and jj, GW(,γ+δ*,\gamma+\delta) becomes the Glover-Woolsey aggregation discussed in [24] along with the amendments proposed by Goldman [25].

Let us now consider weighted aggregation of constraints (13) and (14) of GW separately. Consider the inequalities

jRiγijyij\displaystyle\sum_{j\in R_{i}}\gamma_{ij}y_{ij} (jRiγij)xi, for i=1,2,,n.\displaystyle\leq\left(\sum_{j\in R_{i}}\gamma_{ij}\right)x_{i},\mbox{ for }i=1,2,\ldots,n. (56)
jSiδijyji\displaystyle\sum_{j\in S_{i}}\delta_{ij}y_{ji} (jSiδij)xi, for i=1,2,,n.\displaystyle\leq\left(\sum_{j\in S_{i}}\delta_{ij}\right)x_{i},\mbox{ for }i=1,2,\ldots,n. (57)

In GW, replace constraints (13) and (14) by (56) and (57) and add the upper bound restriction yij1y_{ij}\leq 1 for all i,ji,j. Let GW(,γ,δ*,\gamma,\delta) be the resulting model.

Theorem 10.

GW(,γ,δ*,\gamma,\delta) is a valid MILP model for QUBO.

Proof.

When xi=0x_{i}=0 constraints (56) and (57) guarantees that yij=yji=0y_{ij}=y_{ji}=0. When xi=1x_{i}=1 constraints (56) and (57) are redundant. However, in this case, from constraints (12) and the upper bound restrictions on yijy_{ij} we have 1yijxj1\geq y_{ij}\geq x_{j} and 1yjixj1\geq y_{ji}\geq x_{j}. Thus, if xj=1x_{j}=1 we have yij=yji=1y_{ij}=y_{ji}=1. If xj=0x_{j}=0 and xi=1x_{i}=1, using constraint (56) and (57) corresponding to xjx_{j}, we have yji=yij=0y_{ji}=y_{ij}=0 and the result follows. ∎

When γij=δij=1\gamma_{ij}=\delta_{ij}=1 for all i,ji,j, we have the corresponding unit weight case. Aggregating only (13) and replacing by (56) and leaving (14) unaltered is also a valid formulation for QUBO and we denote this by GW(,γ*,\gamma). Likewise, aggregating only (14) as (57) while leaving (13) unaltered is another valid formulation and we denote this by GW(,δ*,\delta). For each of the above variations, we also have the corresponding unit-weight cases (i.e. γij=1\gamma_{ij}=1 and δij=1\delta_{ij}=1 for all (i,j)(i,j)). We are skipping the proofs for these two formulations. The number of general constraints for GW(,γ+δ*,\gamma+\delta), GW(,γ,δ*,\gamma,\delta), and GW(,γ*,\gamma)(GW(,δ*,\delta) ) are n2,n(n+1) and n(2n1)n^{2},\;n(n+1)\mbox{ and }n(2n-1) respectively.

3.3 Simultaneous weighted aggregations

Simultaneously aggregating type 1 and type 2 constraints in our basic models also leads to valid MILP formulations of QUBO with a significantly reduced number of constraints. This is, however, not applicable to DW, since its type 2 linearization constraints cannot be aggregated in the way discussed here. Let us first examine simultaneous aggregation in the context of GW. Let GW(\alpha,\gamma+\delta) be the MILP obtained by replacing constraints (12) with (47), replacing constraints (13) and (14) with (54), and adding the upper bound restrictions y_{ij}\leq 1. That is,

GW(α,γ+δ\alpha,\gamma+\delta):   Maximize (i,j)Sqijyij+i=1ncixi\displaystyle\sum_{(i,j)\in S}q_{ij}y_{ij}+\sum_{i=1}^{n}c_{i}x_{i}
Subject to: jRiαij(xjyij)(jRiαij)(1xi), for i=1,2,,n.\displaystyle\sum_{j\in R_{i}}\alpha_{ij}(x_{j}-y_{ij})\leq\left(\sum_{j\in R_{i}}\alpha_{ij}\right)(1-x_{i}),\mbox{ for }i=1,2,\ldots,n. (58)
jRiγijyij+jSiδijyji(jRiγij+jSiδij)xi, for i=1,2,,n.\displaystyle\!\!\sum_{j\in R_{i}}\gamma_{ij}y_{ij}+\!\!\sum_{j\in S_{i}}\delta_{ij}y_{ji}\leq\!\!\left(\sum_{j\in R_{i}}\!\!\gamma_{ij}+\!\!\sum_{j\in S_{i}}\!\!\delta_{ij}\right)x_{i},\mbox{ for }i\!\!=\!\!1,2,\ldots,n. (59)
xi{0,1} for all i=1,2,n.\displaystyle x_{i}\in\{0,1\}\mbox{ for all }i=1,2\ldots,n. (60)
0yij1 for all jRi,i=1,2,,n.\displaystyle 0\leq y_{ij}\leq 1\mbox{ for all }j\in R_{i},i=1,2,\ldots,n. (61)
Theorem 11.

GW(α,γ+δ\alpha,\gamma+\delta) is a valid MILP model for QUBO.

Proof.

When xi=0x_{i}=0, constraint (59) assures that yij=0y_{ij}=0 for all jRij\in R_{i} and yji=0y_{ji}=0 for all jSij\in S_{i} .(Since QQ is symmetric, Ri=Si,i=1,2,,nR_{i}=S_{i},i=1,2,\ldots,n.) When xi=1x_{i}=1, using arguments similar to those in the proof of Theorem 3, it can be shown that yij=xjy_{ij}=x_{j} for all jRij\in R_{i} and yji=xjy_{ji}=x_{j} for all jSij\in S_{i}. This establishes the validity of the model. ∎

GW(\alpha,\gamma+\delta) has 2n general constraints, n(n-1) upper bound constraints, n binary variables, and n(n-1) continuous variables. This is the most compact among the explicit formulations.
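For concreteness, here is a PuLP sketch of GW(\alpha,\gamma+\delta), i.e., constraints (58)-(61); the weights and the small instance at the end are illustrative only.

```python
import pulp

def build_gw_agg(Q, c, alpha, gamma, delta):
    n = len(c)
    R = {i: [j for j in range(n) if Q[i][j] != 0] for i in range(n)}
    prob = pulp.LpProblem("GW_alpha_gamma_delta", pulp.LpMaximize)
    x = [pulp.LpVariable(f"x_{i}", cat="Binary") for i in range(n)]
    y = {(i, j): pulp.LpVariable(f"y_{i}_{j}", lowBound=0, upBound=1)  # (61)
         for i in range(n) for j in R[i]}
    prob += (pulp.lpSum(Q[i][j] * y[i, j] for (i, j) in y)
             + pulp.lpSum(c[i] * x[i] for i in range(n)))
    for i in range(n):
        if not R[i]:
            continue
        a_sum = sum(alpha[i, j] for j in R[i])
        prob += (pulp.lpSum(alpha[i, j] * (x[j] - y[i, j]) for j in R[i])
                 <= a_sum * (1 - x[i]))                              # (58)
        gd_sum = sum(gamma[i, j] + delta[i, j] for j in R[i])
        prob += (pulp.lpSum(gamma[i, j] * y[i, j] + delta[i, j] * y[j, i]
                            for j in R[i]) <= gd_sum * x[i])         # (59)
    return prob, x, y

Q = [[0, 3, -2], [3, 0, 1], [-2, 1, 0]]
c = [1, -1, 0]
w = {(i, j): 1.0 for i in range(3) for j in range(3) if Q[i][j] != 0}
prob, x, y = build_gw_agg(Q, c, w, w, w)        # unit weights
prob.solve(pulp.PULP_CBC_CMD(msg=False))
print(pulp.value(prob.objective), [int(v.value()) for v in x])
```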

Other variations of GW(α,γ+δ\alpha,\gamma+\delta) are also interesting. For example, we can replace (59) by (56) and (57) and let GW(α,γ,δ\alpha,\gamma,\delta) be the resulting model which is valid for QUBO. The model GW(α,γ\alpha,\gamma) obtained by replacing (59) by (56) and (14) is yet another valid MILP for QUBO. Likewise, the model GW(α,δ\alpha,\delta) obtained by replacing (59) by (57) and (13) is also a valid model for QUBO.

Let us now combine the two types of weighted aggregations in FT (i.e. inequalities (47) and (52)). The resulting MILP formulation is

FT(α,γ,\alpha,\gamma,*): Maximize (i,j)Sqijyij+i=1ncixi\displaystyle\sum_{(i,j)\in S}q_{ij}y_{ij}+\sum_{i=1}^{n}c_{i}x_{i}
Subject to: jRiαij(xjyij)(jRiαij)(1xi), for i=1,2,,n.\displaystyle\sum_{j\in R_{i}}\alpha_{ij}(x_{j}-y_{ij})\leq\left(\sum_{j\in R_{i}}\alpha_{ij}\right)(1-x_{i}),\mbox{ for }i=1,2,\ldots,n. (62)
jRiγijyij(jRiγij)xi, for i=1,2,,n.\displaystyle\sum_{j\in R_{i}}\gamma_{ij}y_{ij}\leq\left(\sum_{j\in R_{i}}\gamma_{ij}\right)x_{i},\mbox{ for }i=1,2,\ldots,n. (63)
yijyji for all jRi,i=1,2,,n.\displaystyle y_{ij}\leq y_{ji}\mbox{ for all }j\in R_{i},i=1,2,\ldots,n. (64)
xi{0,1} for all i=1,2,n.\displaystyle x_{i}\in\{0,1\}\mbox{ for all }i=1,2\ldots,n. (65)
0yij1 for all jRi,i=1,2,,n.\displaystyle 0\leq y_{ij}\leq 1\mbox{ for all }j\in R_{i},i=1,2,\ldots,n. (66)
Theorem 12.

FT(α,γ,\alpha,\gamma,*) is a valid MILP formulation of QUBO.

Proof.

When xi=0x_{i}=0, constraint (63) assures that yij=0y_{ij}=0 for all jRij\in R_{i}. By constraints (64), whenever yij=0y_{ij}=0, we have yji=0y_{ji}=0. When xi=1x_{i}=1, using arguments similar to those in the proof of Theorem 3, it can be shown that yij=xjy_{ij}=x_{j} for all jRij\in R_{i} and by using (64) yji=xjy_{ji}=x_{j} for all jRij\in R_{i}. This establishes the validity of the model. ∎

FT(\alpha,\gamma,*) has at most n(n+1) general constraints and n(n-1) upper bound constraints. Using similar arguments, it can be shown that FT(\alpha,*,\theta) is a valid formulation of QUBO as well. Let FT(\alpha,\gamma,\theta) be the MILP obtained from FT(\alpha,\gamma,*) by replacing (64) with (55). From the basic model FT, another possible aggregation can be obtained by using the following constraint instead:

jRiθij(yijyji)=0\sum_{j\in R_{i}}\theta_{ij}(y_{ij}-y_{ji})=0 (67)

It may be noted that FT(α,γ,θ\alpha,\gamma,\theta) is a valid formulation of QUBO if the constraint (64) in FT(α,γ,\alpha,\gamma,*) is replaced by (55). But, using (67) instead does not give a valid formulation of QUBO. This can be established using example EX4. The number of general constraints in FT(α,γ,θ\alpha,\gamma,\theta) is 3n3n.

Finally, it is possible to generate a combined aggregation model from PK, PK(α,β\alpha,\beta). However, such a model is a special case of GW(α,γ+δ\alpha,\gamma+\delta) with γ=δ\gamma=\delta, hence will be a valid formulation of QUBO.

4 Optimality restricted variations and aggregations

Let us now discuss the optimality restricted variations of the weighted aggregation based models discussed in Section 3. This needs to be done with more care: in this case, we lose some of the advantages gained by the optimality restricted variations discussed earlier while gaining some other advantages.

4.1 Weighted aggregation of type 1 constraints

Note that under the optimality restricted variations, the number of type 1 constraints is reduced. Let us consider weighted aggregation of these reduced type 1 linearization constraints. Let \alpha_{ij}>0 for j\in R_{i}^{-}, i=1,2,\ldots,n. Now consider the inequality

jRiαij(xjyij)(jRiαij)(1xi), for i=1,2,,n.\sum_{j\in R^{-}_{i}}\alpha_{ij}(x_{j}-y_{ij})\leq\left(\sum_{j\in R^{-}_{i}}\alpha_{ij}\right)(1-x_{i}),\mbox{ for }i=1,2,\ldots,n. (68)

obtained by the weighted aggregation of the type 1 linearization constraints in optimality restricted variations of the basic models. Unlike the precise basic models, for the optimality restricted variations of the basic models, if we replace the type 1 linearization constraints by (68), the resulting MILP need not be a valid model for QUBO. To see this, consider the MILP model ORDW and replace constraints (28) by (68). Let ORDW-A be the resulting MILP. Interestingly, ORDW-A is not a valid MILP model for QUBO. To see this, consider the instance of QUBO (EX6) with

Q=\begin{pmatrix}0&-2&0\\ -2&0&-1\\ 0&-1&0\\ \end{pmatrix}\mbox{ and }\mathbf{c}^{T}=(5,5,0)

where αij=1for all jRi\alpha_{ij}=1\;\mbox{for all }j\in R_{i} for i=1,,ni=1,\dots,n. An optimal solution to the corresponding ORDW-A is x1=x2=1,x3=0,y12=y23=1,y13=y21=y31=y32=0x_{1}=x_{2}=1,\;x_{3}=0,\;y_{12}=y_{23}=1,\;y_{13}=y_{21}=y_{31}=y_{32}=0 with objective function value 7 but the optimal objective function value for this QUBO is 6. Here, y23x2x3y_{23}\neq x_{2}x_{3}. Also, y12y21y_{12}\neq y_{21} and y32y23y_{32}\neq y_{23}. However, we can still use (68) to replace constraints (28) provided we put back all remaining constraints from the corresponding basic model rather than the optimality restricted model. More precisely, consider the model ORDW(α\alpha)

Maximize i=1njRiqijyij+i=1ncixi\displaystyle\sum_{i=1}^{n}\sum_{j\in R_{i}}q_{ij}y_{ij}+\sum_{i=1}^{n}c_{i}x_{i}
Subject to: jRiαij(xjyij)(jRiαij)(1xi), for i=1,2,,n.\displaystyle\sum_{j\in R^{-}_{i}}\alpha_{ij}(x_{j}-y_{ij})\leq\left(\sum_{j\in R^{-}_{i}}\alpha_{ij}\right)(1-x_{i}),\mbox{ for }i=1,2,\ldots,n. (69)
2yijxixj0 for all jRi,i=1,2,,n,\displaystyle 2y_{ij}-x_{i}-x_{j}\leq 0\mbox{ for all }j\in R_{i},i=1,2,\ldots,n, (70)
xi{0,1} for all i=1,2,n,\displaystyle x_{i}\in\{0,1\}\mbox{ for all }i=1,2\ldots,n, (71)
yij{0,1} for all jRi,i=1,2,,n.\displaystyle y_{ij}\in\{0,1\}\mbox{ for all }j\in R_{i},i=1,2,\ldots,n. (72)
Theorem 13.

ORDW(α\alpha) is a valid MILP formulation of QUBO.

Proof.

Let DW-R be the MILP obtained from DW by replacing constraints (4) by constraints (28). Based on our discussion on optimality restricted models, it can be verified that DW-R is a valid MILP model for QUBO. Clearly, every feasible solution of DW-R is a feasible solution of ORDW(α\alpha). We now show that every feasible solution of ORDW(α\alpha) is a feasible solution of DW-R. For this, it is sufficient to show that every feasible solution of ORDW(α\alpha) satisfies

xi+xjyij1 for all jRi,i=1,2,,n.x_{i}+x_{j}-y_{ij}\leq 1\mbox{ for all }j\in R^{-}_{i},i=1,2,\ldots,n. (73)

If xi=0x_{i}=0 then inequality (73) is clearly satisfied. Suppose xi=1x_{i}=1. Then, from (69),

\sum_{j\in R^{-}_{i}}\alpha_{ij}(x_{j}-y_{ij})\leq 0.

But then, as in the proof of Theorem 2, we can show that xj=yijx_{j}=y_{ij} for all jRij\in R^{-}_{i} and hence (73) is satisfied and the proof follows. ∎

The difference between ORDW(\alpha) and DW(\alpha) is minor. For the former, there are fewer variables in the aggregated constraints, and hence this part of the coefficient matrix could sometimes be sparser. Otherwise, the aggregation does not achieve much.

It is also interesting to note that even if we replace constraints (28) in ORDW by the full weighted aggregation constraint (47), still we do not get a valid MILP formulation of QUBO. Details on this are omitted.

Similarly, let ORPK(α\alpha), ORFT(α\alpha), and ORGW(α\alpha) be the MILP obtained respectively from PK, FT, and GW by replacing the type 1 linearization constraints with

jRiαij(xjyij)(jRiαij)(1xi), for i=1,2,,n.\sum_{j\in R^{-}_{i}}\alpha_{ij}(x_{j}-y_{ij})\leq\left(\sum_{j\in R^{-}_{i}}\alpha_{ij}\right)(1-x_{i}),\mbox{ for }i=1,2,\ldots,n. (74)
Theorem 14.

ORPK(α\alpha), ORFT(α\alpha), and ORGW(α\alpha) are valid optimality restricted MILP models for QUBO

Proof.

The proof is similar to that of Theorem 13: we create models ORPK(\alpha)-R1, ORFT(\alpha)-R1 and ORGW(\alpha)-R2 by using the type 2 constraints and lower bound constraints from the respective basic models instead of from their optimality restricted versions, and then proceed with the same idea as in Theorem 13. ∎

We want to emphasize that, as in the case of ORDW(\alpha), the type 2 linearization constraints used in ORPK(\alpha), ORFT(\alpha), and ORGW(\alpha) are those corresponding to their respective basic models rather than the optimality restricted versions of these basic models, as using the latter will lead to an invalid MILP. The only optimality restricted part used is inequality (74). To clarify this observation, let us consider EX4 once again, where \alpha_{ij}=1 for all j\in R_{i}, i=1,\ldots,n. The solution obtained is x_{1}=x_{3}=1, x_{2}=0, y_{12}=y_{31}=1, with the rest of the y_{ij} equal to 0, and objective function value 35. But the optimal value of this problem is 34, with x_{1}=x_{3}=1 and x_{2}=0.

4.2 Weighted aggregation of type 2 constraints

Recall that we cannot aggregate type 2 linearization constraints from DW to yield a valid MILP formulation for QUBO. This negative result carries over to ORDW as well. Let us now examine weighted aggregation of type 2 linearization constraints of ORPK. That is,

ORPK(,β*,\beta): Maximize i=1njRiqijyij+i=1ncixi\displaystyle\sum_{i=1}^{n}\sum_{j\in R_{i}}q_{ij}y_{ij}+\sum_{i=1}^{n}c_{i}x_{i}
Subject to: xi+xjyij1 for all jRi,i=1,2,,n,\displaystyle x_{i}+x_{j}-y_{ij}\leq 1\mbox{ for all }j\in R^{-}_{i},i=1,2,\ldots,n, (75)
jRi+βij(yij+yji)2(jRi+βij)xi, for i=1,2,,n.\displaystyle\sum_{j\in R^{+}_{i}}\beta_{ij}(y_{ij}+y_{ji})\leq 2\left(\sum_{j\in R^{+}_{i}}\beta_{ij}\right)x_{i},\mbox{ for }i=1,2,\ldots,n. (76)
xi{0,1} for all i=1,2,n,\displaystyle x_{i}\in\{0,1\}\mbox{ for all }i=1,2\ldots,n, (77)
yij0 for all jRi,i=1,2,,n.\displaystyle y_{ij}\geq 0\mbox{ for all }j\in R_{i},i=1,2,\ldots,n. (78)
yij1 for all jRi+,i=1,2,,n.\displaystyle y_{ij}\leq 1\mbox{ for all }j\in R_{i}^{+},i=1,2,\ldots,n. (79)

Notice that in the model above the lower bound constraints y_{ij}\geq 0 for all j\in R_{i}, i=1,2,\ldots,n are required in full and cannot be restricted as in the optimality restricted model of PK. As seen in the example EX6, if

Q=\begin{pmatrix}0&6&1\\ 6&0&-7\\ 1&-7&0\\ \end{pmatrix}\mbox{ and }\mathbf{c}^{T}=(-7,-2,-15)

where \beta_{ij}=1 for all j\in R_{i}, i=1,\ldots,n, and the lower bounds are imposed only as y_{ij}\geq 0 for all j\in R_{i}^{-}, i=1,2,\ldots,n, then the solution obtained from the corresponding formulation is x_{1}=x_{3}=0, x_{2}=1, y_{12}=y_{13}=y_{21}=1, y_{31}=-3, with the rest of the y_{ij} equal to 0 and objective function value 8. But the optimal value of this problem is 3, with solution x_{1}=x_{2}=1 and x_{3}=0.

It may be noted that the constraints y_{ij}\leq 1 are redundant for the PK model. Since PK and ORPK are valid models for QUBO, the model ORPK-R given below is also valid for QUBO

ORPK-R: Maximize i=1njRiqijyij+i=1ncixi\displaystyle\sum_{i=1}^{n}\sum_{j\in R_{i}}q_{ij}y_{ij}+\sum_{i=1}^{n}c_{i}x_{i}
Subject to: xi+xjyij1 for all jRi,i=1,2,,n,\displaystyle x_{i}+x_{j}-y_{ij}\leq 1\mbox{ for all }j\in R^{-}_{i},i=1,2,\ldots,n, (80)
yij+yji2xi0 for all jRi+,i=1,2,,n,\displaystyle y_{ij}+y_{ji}-2x_{i}\leq 0\mbox{ for all }j\in R^{+}_{i},i=1,2,\ldots,n, (81)
xi{0,1} for all i=1,2,n.\displaystyle x_{i}\in\{0,1\}\mbox{ for all }i=1,2\ldots,n. (82)
yij0 for all jRi,i=1,2,,n.\displaystyle y_{ij}\geq 0\mbox{ for all }j\in R_{i},i=1,2,\ldots,n. (83)
yij1 for all jRi+,i=1,2,,n.\displaystyle y_{ij}\leq 1\mbox{ for all }j\in R^{+}_{i},i=1,2,\ldots,n. (84)
Theorem 15.

ORPK(,β*,\beta) is a valid optimality restricted MILP formulation of QUBO.

Proof.

Clearly, every feasible solution of ORPK-R is a feasible solution of ORPK(*,\beta). Now, we will prove that every feasible solution of ORPK(*,\beta) is also feasible for ORPK-R. It is sufficient to show that every feasible solution of ORPK(*,\beta) satisfies (81). If x_{i}=1, constraint (81) is clearly satisfied. On the other hand, if x_{i}=0, then (76) becomes

\sum_{j\in R^{+}_{i}}\beta_{ij}(y_{ij}+y_{ji})\leq 0,\mbox{ for }i=1,2,\ldots,n.

Since \beta_{ij}>0 and y_{ij},y_{ji}\geq 0, this forces y_{ij}+y_{ji}=0, and hence y_{ij}=y_{ji}=0 for all j\in R_{i}^{+}. This satisfies (81). ∎

The number of general constraints for the model ORPK(*,\beta) is at most n+\sum_{i=1}^{n}|R_{i}^{-}|.
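A sketch of the constraint generation for ORPK(*,\beta), i.e., constraints (75)-(79), reusing the PuLP conventions and the index-set helper from the earlier sketches; the y variables are assumed to be created with lowBound=0 for every j\in R_{i}, as constraint (78) requires.

```python
import pulp

def add_orpk_beta(prob, x, y, Rp, Rm, beta):
    # (75): type 1 constraints kept only for j in R_i^-.
    # (76): aggregated type 2 constraints over j in R_i^+.
    # (79): upper bounds y_ij <= 1 for j in R_i^+.
    # (78): y_ij >= 0 for every j in R_i, set on the variables themselves.
    for i in range(len(x)):
        for j in Rm[i]:
            prob += x[i] + x[j] - y[i, j] <= 1
        if Rp[i]:
            total = sum(beta[i, j] for j in Rp[i])
            prob += (pulp.lpSum(beta[i, j] * (y[i, j] + y[j, i]) for j in Rp[i])
                     <= 2 * total * x[i])
        for j in Rp[i]:
            prob += y[i, j] <= 1
```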

The optimality restricted version of GW(,γ+δ*,\gamma+\delta) is denoted by ORGW(,γ+δ*,\gamma+\delta) and it can be stated as

Maximize (i,j)Sqijyij+i=1ncixi\displaystyle\sum_{(i,j)\in S}q_{ij}y_{ij}+\sum_{i=1}^{n}c_{i}x_{i}
Subject to: xj+xiyij1 for jRi,i=1,2,,n.\displaystyle x_{j}+x_{i}-y_{ij}\leq 1\mbox{ for }j\in R^{-}_{i},i=1,2,\ldots,n. (85)
jRi+γijyij+jSi+δijyji(jRi+γij+jSi+δij)xi, for i=1,2,,n.\displaystyle\sum_{j\in R^{+}_{i}}\gamma_{ij}y_{ij}+\sum_{j\in S^{+}_{i}}\delta_{ij}y_{ji}\leq\left(\sum_{j\in R^{+}_{i}}\gamma_{ij}+\sum_{j\in S^{+}_{i}}\delta_{ij}\right)x_{i},\mbox{ for }i=1,2,\ldots,n. (86)
xi{0,1} for all i=1,2,n.\displaystyle x_{i}\in\{0,1\}\mbox{ for all }i=1,2\ldots,n. (87)
0yij1 for all jRi,i=1,2,,n.\displaystyle 0\leq y_{ij}\leq 1\mbox{ for all }j\in R_{i},i=1,2,\ldots,n. (88)
Theorem 16.

ORGW(,γ+δ*,\gamma+\delta) is valid optimality restricted MILP formulation of QUBO.

Proof.

In ORGW, if we replace (37) by (88), we get a new formulation, similar to the case of Theorem 15; it can be easily verified that it is a valid formulation for QUBO. Let us call it ORGW-R. Clearly, every feasible solution of ORGW-R is a feasible solution of ORGW(*,\gamma+\delta). Now, we need to show that every feasible solution of ORGW(*,\gamma+\delta) is also feasible for ORGW-R. It is sufficient to show that every feasible solution of ORGW(*,\gamma+\delta) also satisfies constraints (34) and (35). If x_{i}=0, then constraint (86) gives y_{ij}=y_{ji}=0, which satisfies constraints (34) and (35). Also, if x_{i}=1, these constraints are clearly satisfied. ∎

Replacing constraint (86) by the pair of constraints

\displaystyle\sum_{j\in R^{+}_{i}}\gamma_{ij}y_{ij}\leq\sum_{j\in R^{+}_{i}}\gamma_{ij}x_{i},\mbox{ for }i=1,2,\ldots,n, (89)
\displaystyle\sum_{j\in S^{+}_{i}}\delta_{ij}y_{ji}\leq\sum_{j\in S^{+}_{i}}\delta_{ij}x_{i},\mbox{ for }i=1,2,\ldots,n, (90)

(or by only one of them, together with the corresponding non-aggregated constraints), we also obtain the optimality restricted models ORGW($*,\gamma,\delta$), ORGW($*,\gamma$) and ORGW($*,\delta$) corresponding to GW($*,\gamma,\delta$), GW($*,\gamma$) and GW($*,\delta$). It can be shown that ORGW($*,\gamma,\delta$), ORGW($*,\gamma$) and ORGW($*,\delta$) are valid optimality restricted MILP models for QUBO using ideas similar to those in Theorem 16. The number of general constraints for the models ORGW($*,\gamma+\delta$) and ORGW($*,\gamma,\delta$) is at most $n+\sum_{i=1}^{n}|R_{i}^{-}|$ and $2n+\sum_{i=1}^{n}|R_{i}^{-}|$, respectively. For ORGW($*,\gamma$) and ORGW($*,\delta$), the number of general constraints is $n+\sum_{i=1}^{n}\left(|R_{i}^{-}|+|R_{i}^{+}|\right)$.

The optimality restricted version of FT(,γ,θ*,\gamma,\theta) is given by

ORFT(,γ,θ*,\gamma,\theta): Maximize (i,j)Sqijyij+i=1ncixi\displaystyle\sum_{(i,j)\in S}q_{ij}y_{ij}+\sum_{i=1}^{n}c_{i}x_{i}
Subject to: xj+xiyij1 for jRi,i=1,2,,n,\displaystyle x_{j}+x_{i}-y_{ij}\leq 1\mbox{ for }j\in R^{-}_{i},i=1,2,\ldots,n, (91)
jRi+γijyijjRi+γijxi, for i=1,2,,n,\displaystyle\sum_{j\in R^{+}_{i}}\gamma_{ij}y_{ij}\leq\sum_{j\in R^{+}_{i}}\gamma_{ij}x_{i},\mbox{ for }i=1,2,\ldots,n, (92)
jRi+θijyijjRi+θijyji,i=1,2,n\displaystyle\sum_{j\in R^{+}_{i}}\theta_{ij}y_{ij}\leq\sum_{j\in R^{+}_{i}}\theta_{ij}y_{ji},i=1,2\ldots,n (93)
xi{0,1} for all i=1,2,n,\displaystyle x_{i}\in\{0,1\}\mbox{ for all }i=1,2\ldots,n, (94)
yij0 for all jRi,i=1,2,,n.\displaystyle y_{ij}\geq 0\mbox{ for all }j\in R_{i}^{-},i=1,2,\ldots,n. (95)
yij1 for all jRi+,i=1,2,,n.\displaystyle y_{ij}\leq 1\mbox{ for all }j\in R_{i}^{+},i=1,2,\ldots,n. (96)
Theorem 17.

ORFT($*,\gamma,\theta$) is a valid MILP model for QUBO.

Proof.

By adding the constraints (96) to ORFT, we get another valid formulation of QUBO, say ORFT-R, similar to the case of Theorem 15. Clearly, every feasible solution of ORFT-R is a feasible solution of ORFT($*,\gamma,\theta$). To prove the other direction, it is sufficient to show that every feasible solution of ORFT($*,\gamma,\theta$) satisfies constraints (39) and (40). If $x_i=0$, then constraint (92), together with $\gamma_{ij}>0$ and the non-negativity of the $y$ variables, gives $y_{ij}=0$ for all $j\in R_i^{+}$, and constraint (93) then gives $y_{ji}=0$ for all $j\in R_i^{+}$. This satisfies constraints (39) and (40). ∎

The number of general constraints for the model ORFT($*,\gamma,\theta$) is at most $2n+\sum_{i=1}^{n}|R_{i}^{-}|$. We can similarly obtain the optimality restricted aggregations ORFT($*,\gamma$) and ORFT($*,\theta$) corresponding to the models FT($*,\gamma$) and FT($*,\theta$); their validity can be established using ideas similar to those in Theorem 17. For ORFT($*,\gamma$) and ORFT($*,\theta$), the number of general constraints is $n+\sum_{i=1}^{n}\left(|R_{i}^{-}|+|R_{i}^{+}|\right)$.

4.3 Simultaneous weighted optimality restricted aggregations of type 1 and type 2 constraints

Let us now examine simultaneous aggregation of type 1 and type 2 linearization constraints in the context of optimality restricted models. Recall that when type 1 linearization constraints are aggregated (either full version or the optimality restricted version) we must use full type 2 linearization constraints. Thus, for the simultaneous aggregations, it is necessary that we use aggregation of full type 2 linearization constraints, along with full bound constraints on yijy_{ij} variables. Let us start with the optimality restricted version of simultaneous aggregation of PK. Let ORPK(α,β\alpha,\beta) be:

ORPK(α,β\alpha,\beta): Maximize i=1njRiqijyij+i=1ncixi\displaystyle\sum_{i=1}^{n}\sum_{j\in R_{i}}q_{ij}y_{ij}+\sum_{i=1}^{n}c_{i}x_{i}
Subject to: jRiαij(xjyij)jRiαij(1xi),i=1,2,,n,\displaystyle\sum_{j\in R_{i}^{-}}\alpha_{ij}(x_{j}-y_{ij})\leq\sum_{j\in R_{i}^{-}}\alpha_{ij}(1-x_{i}),i=1,2,\ldots,n, (97)
jRiβij(yij+yji)2(jRiβij)xi, for i=1,2,,n.\displaystyle\sum_{j\in R_{i}}\beta_{ij}(y_{ij}+y_{ji})\leq 2\left(\sum_{j\in R_{i}}\beta_{ij}\right)x_{i},\mbox{ for }i=1,2,\ldots,n. (98)
xi{0,1} for all i=1,2,n,\displaystyle x_{i}\in\{0,1\}\mbox{ for all }i=1,2\ldots,n, (99)
yij0 for all jRi,i=1,2,,n.\displaystyle y_{ij}\geq 0\mbox{ for all }j\in R_{i},i=1,2,\ldots,n. (100)
yij1 for all jRi,i=1,2,,n.\displaystyle y_{ij}\leq 1\mbox{ for all }j\in R_{i},i=1,2,\ldots,n. (101)
Theorem 18.

ORPK(α,β\alpha,\beta) is a valid formulation of QUBO.

Proof.

Notice that after adding $y_{ij}\leq 1$ for all $j\in R_i$, $i=1,2,\ldots,n$, to ORPK($\alpha,*$), we still get a valid formulation of QUBO; let us call it ORPK($\alpha,*$)-R. We need to show that every feasible solution of ORPK($\alpha,\beta$) is a feasible solution of ORPK($\alpha,*$)-R. For this, we prove that every feasible solution of ORPK($\alpha,\beta$) satisfies $y_{ij}+y_{ji}\leq 2x_i$ for all $j\in R_i$, $i=1,\ldots,n$. If $x_i=0$, then constraint (98), together with $\beta_{ij}>0$ and the non-negativity constraints (100), gives $y_{ij}=y_{ji}=0$, so the required constraints are satisfied. On the other hand, if $x_i=1$, then (101) gives $y_{ij}+y_{ji}\leq 2=2x_i$. The other direction is immediate, and hence the result follows. ∎

Similar to the above model, we have the optimality restricted aggregation ORGW($\alpha,\gamma+\delta$) of GW($\alpha,\gamma+\delta$), which is given below:

ORGW(α,γ+δ\alpha,\gamma+\delta): Maximize (i,j)Sqijyij+i=1ncixi\displaystyle\sum_{(i,j)\in S}q_{ij}y_{ij}+\sum_{i=1}^{n}c_{i}x_{i}
Subject to: jRiαij(xjyij)jRiαij(1xi),i=1,2,,n,\displaystyle\sum_{j\in R_{i}^{-}}\alpha_{ij}(x_{j}-y_{ij})\leq\sum_{j\in R_{i}^{-}}\alpha_{ij}(1-x_{i}),i=1,2,\ldots,n, (102)
jRiγijyij+jSiδijyji(jRiγij+jSiδij)xi, for i=1,2,,n.\displaystyle\sum_{j\in R_{i}}\gamma_{ij}y_{ij}+\sum_{j\in S_{i}}\delta_{ij}y_{ji}\leq\left(\sum_{j\in R_{i}}\gamma_{ij}+\sum_{j\in S_{i}}\delta_{ij}\right)x_{i},\mbox{ for }i=1,2,\ldots,n. (103)
xi{0,1} for all i=1,2,n.\displaystyle x_{i}\in\{0,1\}\mbox{ for all }i=1,2\ldots,n. (104)
0yij1 for all jRi,i=1,2,,n.\displaystyle 0\leq y_{ij}\leq 1\mbox{ for all }j\in R_{i},i=1,2,\ldots,n. (105)
Theorem 19.

ORGW(α,γ+δ\alpha,\gamma+\delta) is a valid formulation for QUBO.

Proof.

Similar to Theorem 18, add $y_{ij}\leq 1$ for all $j\in R_i$, $i=1,2,\ldots,n$, to ORGW($\alpha,*$) to form ORGW($\alpha,*$)-R. Then, every feasible solution of ORGW($\alpha,*$)-R is a feasible solution of ORGW($\alpha,\gamma+\delta$). Further, we need to show that every feasible solution of ORGW($\alpha,\gamma+\delta$) satisfies constraints (13) and (14). If $x_i=0$, then constraint (103), together with $\gamma_{ij},\delta_{ij}>0$ and the non-negativity in (105), gives $y_{ij}=y_{ji}=0$, so the required constraints are satisfied. On the other hand, if $x_i=1$, then (105) ensures that the required constraints are satisfied. Hence, the model is valid. ∎

Other variations of ORGW($\alpha,\gamma+\delta$) are also interesting. As we did for the exact simultaneous models in Section 3, we can replace (103) by (56) and (57) and let ORGW($\alpha,\gamma,\delta$) be the resulting model, which is valid for QUBO. The model ORGW($\alpha,\gamma,*$), obtained by replacing (103) by (56) and (14), is yet another valid MILP for QUBO. Likewise, the model ORGW($\alpha,*,\delta$), obtained by replacing (103) by (57) and (13), is also a valid model for QUBO.

Similar to GW and PK, simultaneously aggregated models can be constructed for FT as well. The formulation ORFT($\alpha,\gamma,\theta$) is given as

ORFT($\alpha,\gamma,\theta$): Maximize \displaystyle\sum_{(i,j)\in S}q_{ij}y_{ij}+\sum_{i=1}^{n}c_{i}x_{i}
Subject to: jRiαij(xjyij)jRiαij(1xi),i=1,2,,n,\displaystyle\sum_{j\in R_{i}^{-}}\alpha_{ij}(x_{j}-y_{ij})\leq\sum_{j\in R_{i}^{-}}\alpha_{ij}(1-x_{i}),i=1,2,\ldots,n, (106)
jRiγijyijjRiγijxi,i=1,2,,n,\displaystyle\sum_{j\in R_{i}}\gamma_{ij}y_{ij}\leq\sum_{j\in R_{i}}\gamma_{ij}x_{i},i=1,2,\ldots,n, (107)
jRiθijyijjRiθijxi,i=1,2,,n,\displaystyle\sum_{j\in R_{i}}\theta_{ij}y_{ij}\leq\sum_{j\in R_{i}}\theta_{ij}x_{i},i=1,2,\ldots,n, (108)
xi{0,1} for all i=1,2,n.\displaystyle x_{i}\in\{0,1\}\mbox{ for all }i=1,2\ldots,n. (109)
0yij1 for all jRi,i=1,2,,n.\displaystyle 0\leq y_{ij}\leq 1\mbox{ for all }j\in R_{i},i=1,2,\ldots,n. (110)
Theorem 20.

ORFT(α,γ,θ\alpha,\gamma,\theta) is a valid formulation for QUBO.

The proof of this theorem is similar to that of Theorem 18 and hence is omitted. ORFT($\alpha,\gamma$) and ORFT($\alpha,\theta$) are two other valid optimality restricted models for QUBO, derived along the same lines as the other models discussed above. Details on these are omitted.

5 Linear Programming Relaxations

Linear programming relaxations (LP relaxations) of the MILP models we have discussed so far provide valid upper bounds on the optimal objective function value of QUBO. Such bounds can be effectively used in specially designed enumerative algorithmic paradigms such as branch-and-bound, branch-and-cut, etc. The LP relaxation of GW provides an upper bound on the optimal objective function value of QUBO that matches the roof duality bound [1]. Many researchers have tried to obtain MILP formulations of QUBO whose LP relaxation bounds match that of GW [3, 5, 9, 28], and almost all of these are compact linearizations. No explicit linearization, other than the standard linearization itself, is known to have this property. We now show that FT and PK also share this property, along with the weighted aggregation models for appropriately chosen weights.

For any mathematical programming model $P$, its optimal objective function value is denoted by $\operatorname{OBJ}(P)$. Also, if $P$ is an MILP, its LP relaxation is denoted by $\overline{P}$. Two MILP models $P1$ and $P2$ are said to be LP-equivalent if $\operatorname{OBJ}(\overline{P1})=\operatorname{OBJ}(\overline{P2})$ for all instances of $P1$ and $P2$. $P1$ is said to be stronger than $P2$ (for maximization problems) if $\operatorname{OBJ}(\overline{P1})\leq\operatorname{OBJ}(\overline{P2})$ for all instances, with strict inequality holding for at least one instance. $P1$ and $P2$ are said to be incomparable if there exist instances for which $\operatorname{OBJ}(\overline{P1})<\operatorname{OBJ}(\overline{P2})$ and also instances for which $\operatorname{OBJ}(\overline{P1})>\operatorname{OBJ}(\overline{P2})$.

Let us now analyse the LP relaxations of our basic models and their optimality restricted counterparts.

5.1 LP relaxation of basic models

Our next lemma shows that the models DW, GW, FT, and PK are respectively LP-equivalent to their corresponding optimality restricted forms ORDW, ORGW, ORFT, and ORPK.

Lemma 21.

$\operatorname{OBJ}(\overline{DW})=\operatorname{OBJ}(\overline{ORDW})$, $\operatorname{OBJ}(\overline{GW})=\operatorname{OBJ}(\overline{ORGW})$, $\operatorname{OBJ}(\overline{FT})=\operatorname{OBJ}(\overline{ORFT})$, and $\operatorname{OBJ}(\overline{PK})=\operatorname{OBJ}(\overline{ORPK})$.

Theorem 22.

$\operatorname{OBJ}(\overline{GW})=\operatorname{OBJ}(\overline{FT})$.

Proof.

For any indices $i$ and $j$, the constraints (18) and (19) of $\overline{FT}$ are given by

yijxi,yjixj,yijyji, and yjiyij.\displaystyle y_{ij}\leq x_{i},y_{ji}\leq x_{j},y_{ij}\leq y_{ji},\mbox{ and }y_{ji}\leq y_{ij}.

Thus $y_{ji}=y_{ij}$, and hence $y_{ij}\leq x_i$ and $y_{ij}\leq x_j$. Therefore, every feasible solution of $\overline{FT}$ is feasible for $\overline{GW}$, establishing that $\operatorname{OBJ}(\overline{GW})\geq\operatorname{OBJ}(\overline{FT})$.

On the other hand, we show that every optimal solution of $\overline{GW}$ must also be a feasible solution of $\overline{FT}$. Suppose not; then there exists an optimal solution, say $(\overline{x},\overline{y})$ with objective value $v(\overline{GW}_{(\overline{x},\overline{y})})$, such that at least one constraint of $\overline{FT}$ is violated. Since the given solution is a feasible solution of $\overline{GW}$, assume that constraint (19) is violated for some pair $(p,q)$:

y¯qp<y¯pq\displaystyle\overline{y}_{qp}<\overline{y}_{pq}

Since the matrix $Q$ is assumed to be symmetric, $q_{pq}=q_{qp}$. Suppose first that $q_{pq}=q_{qp}>0$. Then, we can construct another solution $(x,y)$ for $\overline{GW}$ such that $x_i=\overline{x}_i$ for all $i=1,\dots,n$ and

yij={y¯qp+ϵif (i,j)=(q,p)y¯ijotherwisey_{ij}=\begin{cases}\overline{y}_{qp}+\epsilon&\text{if $(i,j)=(q,p)$}\\ \overline{y}_{ij}&\text{otherwise}\end{cases} (111)

Clearly, for a small ϵ>0\epsilon>0 this is a feasible solution as

x_{p}+x_{q}-1\leq y_{qp}=\overline{y}_{qp}+\epsilon<\overline{y}_{pq}\leq\min\{x_{p},x_{q}\} (112)

The objective value of this new solution is $v(\overline{GW}_{(\overline{x},\overline{y})})+q_{qp}\epsilon$, which contradicts the fact that $(\overline{x},\overline{y})$ is an optimal solution of $\overline{GW}$. The case $q_{pq}=q_{qp}<0$ is handled analogously by decreasing $\overline{y}_{pq}$ by a small $\epsilon>0$, which again yields a feasible solution with a larger objective value. Thus, every optimal solution of $\overline{GW}$ must be a feasible solution of $\overline{FT}$. Hence $\operatorname{OBJ}(\overline{GW})\leq\operatorname{OBJ}(\overline{FT})$. ∎

Punnen, Pandey, and Friesen [49] studied the effect of different equivalent representations of the matrix $Q$ on the LP relaxations of MILP models of QUBO. Recall that we assumed $Q$ to be a symmetric matrix. Interestingly, when $Q$ is not necessarily symmetric, FT could provide a stronger LP relaxation than GW. To see this, consider the instance of QUBO (EX7) where

Q=(03αα0) and 𝐜T=(α,α)Q=\begin{pmatrix}0&3\alpha\\ -\alpha&0\end{pmatrix}\mbox{ and }\mathbf{c}^{\mkern-1.0mu{T}}=(-\alpha,-\alpha)

where $\alpha\geq 1$. Then, $\overline{FT}$ has an optimal solution $x_1=x_2=y_{12}=y_{21}=0$ with objective function value 0, whereas $x_1=x_2=y_{12}=0.5,\;y_{21}=0$ is an optimal solution to $\overline{GW}$ with objective function value $\alpha/2$. This shows that $\operatorname{OBJ}(\overline{GW})$ could be arbitrarily bad compared to $\operatorname{OBJ}(\overline{FT})$. We now show that GW is stronger than DW.

Theorem 23.

$\operatorname{OBJ}(\overline{GW})\leq\operatorname{OBJ}(\overline{DW})$. Further, $\operatorname{OBJ}(\overline{DW})$ could be arbitrarily bad compared to $\operatorname{OBJ}(\overline{GW})$.

Proof.

Note that constraint (5) in DW is the sum of the constraints (9) and (10). Thus, every feasible solution of $\overline{GW}$ is also feasible for $\overline{DW}$, and hence $\operatorname{OBJ}(\overline{GW})\leq\operatorname{OBJ}(\overline{DW})$. Now consider the example (EX8) where

Q=(0α/2α/20) and 𝐜T=(α,1), where α>0.Q=\begin{pmatrix}0&\alpha/2\\ \alpha/2&0\end{pmatrix}\mbox{ and }\mathbf{c}^{\mkern-1.0mu{T}}=(-\alpha,1),\mbox{ where }\alpha>0.

Then, $x_1=x_2=y_{12}=y_{21}=1$ is an optimal solution to $\overline{GW}$ with objective function value $1$, whereas $x_1=0,x_2=1,y_{12}=y_{21}=0.5$ is an optimal solution to $\overline{DW}$ with objective function value $1+\alpha/2$ (the quadratic terms contribute $(\alpha/2)(y_{12}+y_{21})=\alpha/2$ and the linear term contributes $1$). ∎

Recall that any solution to $\overline{PK}$ can be represented by the ordered pair $(\mathbf{x},Y)$, where $\mathbf{x}=(x_1,x_2,\ldots,x_n)$ is a vector and $Y$ is an ordered list consisting of $y_{ij}$, $i=1,2,\ldots,n$, $j\in R_i$.

Lemma 24.

Let $(\bar{\mathbf{x}},\bar{Y})$ be an optimal solution to $\overline{PK}$. If $\bar{y}_{uv}\neq\bar{y}_{vu}$ for some indices $u$ and $v$, then $\bar{y}_{uv}+\bar{y}_{vu}=2\bar{x}_u$ and $\bar{y}_{vu}+\bar{y}_{uv}=2\bar{x}_v$.

Proof.

Let $S=\{(i,j):i=1,2,\ldots,n,\;j\in R_i,\mbox{ and }(i,j)\neq(u,v),(v,u)\}$. The objective function of $\overline{PK}$ can then be written as

quvyuv+qvuyvu+(i,j)Sqijyij+j=1ncjxj.q_{uv}y_{uv}+q_{vu}y_{vu}+\sum_{(i,j)\in S}q_{ij}y_{ij}+\sum_{j=1}^{n}c_{j}x_{j}.

Without loss of generality, assume that $\bar{y}_{uv}<\bar{y}_{vu}$. Thus, $\bar{y}_{vu}\neq 0$ and $\bar{x}_u\neq 0$, $\bar{x}_v\neq 0$. Since $Q$ is symmetric, note that $q_{uv}=q_{vu}\neq 0$. If possible, assume that $\bar{y}_{uv}+\bar{y}_{vu}<2\bar{x}_u$. If $q_{uv}>0$, we can increase the value of $\bar{y}_{uv}$ to $\bar{y}_{uv}+\epsilon$ for arbitrarily small $\epsilon>0$ to yield a feasible solution to $\overline{PK}$ with objective function value larger than that of $(\bar{\mathbf{x}},\bar{Y})$, a contradiction. Likewise, if $q_{uv}<0$, we can decrease the value of $\bar{y}_{vu}$ to $\bar{y}_{vu}-\epsilon$ to yield a feasible solution with objective function value larger than that of $(\bar{\mathbf{x}},\bar{Y})$, a contradiction. Thus, $\bar{y}_{uv}+\bar{y}_{vu}=2\bar{x}_u$. The equality $\bar{y}_{vu}+\bar{y}_{uv}=2\bar{x}_v$ follows by symmetry. ∎

Lemma 25.

If $(\bar{\mathbf{x}},\bar{Y})$ is an optimal solution to $\overline{PK}$ and $\bar{y}_{uv}\neq\bar{y}_{vu}$ for some indices $u$ and $v$, then $\bar{x}_u=\bar{x}_v$. Further, $(\mathbf{x},Y)$ is also an optimal solution to $\overline{PK}$, where $\mathbf{x}=\bar{\mathbf{x}}$, $y_{uv}=y_{vu}=\bar{x}_u=\bar{x}_v$ and $y_{ij}=\bar{y}_{ij}$ for all $(i,j)\neq(u,v),(v,u)$.

Proof.

The proof of the first part of this lemma follows from Lemma 24. Let us now consider the proof of the second part. Since $\bar{y}_{uv}\neq\bar{y}_{vu}$, by Lemma 24, $\bar{y}_{uv}+\bar{y}_{vu}=2\bar{x}_u$ and $\bar{y}_{vu}+\bar{y}_{uv}=2\bar{x}_v$. Also, $\bar{x}_u=\bar{x}_v$. Thus, it can be verified that $(\mathbf{x},Y)$ and $(\bar{\mathbf{x}},\bar{Y})$ have the same objective function value and that $(\mathbf{x},Y)$ is a feasible solution to $\overline{PK}$. Thus $(\mathbf{x},Y)$ is an optimal solution to $\overline{PK}$. ∎

The proof of Lemma 25 implicitly used the fact that QQ is symmetric.

Theorem 26.

$\operatorname{OBJ}(\overline{GW})=\operatorname{OBJ}(\overline{PK})$.

Proof.

The constraint (24) in $\overline{PK}$ is the sum of the constraints (13) and (14) in $\overline{GW}$. Thus every feasible solution of $\overline{GW}$ is also feasible for $\overline{PK}$, and hence $\operatorname{OBJ}(\overline{GW})\leq\operatorname{OBJ}(\overline{PK})$. Now, let $(\bar{\mathbf{x}},\bar{Y})$ be an optimal solution to $\overline{PK}$. If $(\bar{\mathbf{x}},\bar{Y})$ is a feasible solution to $\overline{GW}$, the proof is over. So, suppose that $(\bar{\mathbf{x}},\bar{Y})$ is not a feasible solution to $\overline{GW}$. Let $T$ be the set of pairs $(i,j)$, $i=1,2,\ldots,n$, $j\in R_i$, for which the solution $(\bar{\mathbf{x}},\bar{Y})$ violates at least one of the constraints

yijxi\displaystyle y_{ij}\leq x_{i} (113)
yjixi\displaystyle y_{ji}\leq x_{i} (114)

Note that for any pair $(i,j)\in T$, both of these constraints cannot be violated simultaneously; thus, precisely one of them is violated. By assumption, $T\neq\emptyset$. Let $(u,v)\in T$; then $y_{uv}\neq y_{vu}$. By Lemma 25, we can construct an alternative optimal solution $(\mathbf{x},Y)$ to $\overline{PK}$ such that constraints (113) and (114) are satisfied for all $(i,j)\notin T\setminus\{(u,v)\}$. Thus, $(\mathbf{x},Y)$ is an optimal solution to $\overline{PK}$ for which the number of violated constraints of $\overline{GW}$ is reduced by one. Repeating this process, we can construct a feasible solution of $\overline{GW}$ which is an optimal solution to $\overline{PK}$. Thus $\operatorname{OBJ}(\overline{GW})\geq\operatorname{OBJ}(\overline{PK})$ and the result follows. ∎

Theorem 26 need not be true if the underlying matrix $Q$ is not symmetric. This can be illustrated using the same example (EX7) constructed above for the corresponding result for FT. Here $\overline{PK}$ has an optimal solution $x_1=x_2=0.5,\;y_{12}=1,\;y_{21}=0$ with objective function value $2\alpha$. Thus, $\overline{GW}$ gives a tighter bound than $\overline{PK}$ in this case.

5.2 LP relaxations of weighted aggregation based models

In the preceding subsection, we showed that the LP relaxations of GW, FT, and PK yield the same optimal objective function value. Weighted aggregation models are expected to produce weaker bounds. We now show that careful choices of the multipliers used in some of our weighted aggregation models provide an LP relaxation bound matching that of GW. Before discussing these results, let us state a property, related to surrogate duality [14, 18, 19], that follows from linear programming duality.

Consider the linear program

P: Maximize 𝐜T𝐱\displaystyle\mathbf{c}^{\mkern-1.0mu{T}}\mathbf{x}
Subject to: A𝐱𝐛,𝐱𝟎\displaystyle A\mathbf{x}\leq\mathbf{b},\mathbf{x}\geq\mathbf{0}

where the matrix AA and vectors 𝐛,𝐜T\mathbf{b},\mathbf{c}^{\mkern-1.0mu{T}} and 𝐱\mathbf{x} are of appropriate dimensions. Let 𝐰\mathbf{w}^{*} be a given non-negative row vector. Now consider the continuous knapsack problem obtained from PP as

CKP(𝐰\mathbf{w}^{*}): Maximize 𝐜T𝐱\displaystyle\mathbf{c}^{\mkern-1.0mu{T}}\mathbf{x}
Subject to: 𝐰A𝐱𝐰𝐛,𝐱𝟎\displaystyle\mathbf{w}^{*}A\mathbf{x}\leq\mathbf{w}^{*}\mathbf{b},\mathbf{x}\geq\mathbf{0}
Theorem 27.

If P has an optimal solution with an optimal dual solution 𝐰0\mathbf{w}^{0}, then CKP(𝐰0\mathbf{w}^{0}) also has an optimal solution. Further, the optimal objective function values of P and CKP(𝐰0\mathbf{w}^{0}) are the same.
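As a small illustration of Theorem 27 (the instance below is ours and is not taken from the paper), consider

\[
P:\ \mbox{Maximize } 2x_1+x_2 \quad\mbox{Subject to: } x_1+x_2\leq 3,\ \ x_1\leq 2,\ \ x_1,x_2\geq 0,
\]

whose optimal value is $5$, attained at $(x_1,x_2)=(2,1)$, with optimal dual solution $\mathbf{w}^0=(1,1)$. Aggregating the two constraints with these weights gives

\[
\mathrm{CKP}(\mathbf{w}^0):\ \mbox{Maximize } 2x_1+x_2 \quad\mbox{Subject to: } 1\cdot(x_1+x_2)+1\cdot x_1=2x_1+x_2\leq 1\cdot 3+1\cdot 2=5,\ \ x_1,x_2\geq 0,
\]

whose optimal value is again $5$, as Theorem 27 asserts. In contrast, the non-optimal weight vector $\mathbf{w}=(1,0)$ yields the single constraint $x_1+x_2\leq 3$ and an optimal value of $6>5$, showing that the quality of the aggregated problem depends on the choice of multipliers.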

Now, consider the linear program

PP^{*}: Maximize 𝐜T𝐱\displaystyle\mathbf{c}^{\mkern-1.0mu{T}}\mathbf{x}
Subject to: A𝐱𝐛,A1𝐱𝐛1,A2𝐱𝐛2,,Ap𝐱𝐛p,𝐱𝟎\displaystyle A\mathbf{x}\leq\mathbf{b},A^{1}\mathbf{x}\leq\mathbf{b}^{1},A^{2}\mathbf{x}\leq\mathbf{b}^{2},\ldots,A^{p}\mathbf{x}\leq\mathbf{b}^{p},\mathbf{x}\geq\mathbf{0}

where $\mathbf{b}^i\in R^{m_i}$ for $i=1,2,\ldots,p$. Let $\mathbf{w}^i\in R^{m_i}$ be a given non-negative row vector, for $i=1,2,\ldots,p$. Now consider the new linear program $P^w$ obtained from $P^{*}$ using weighted aggregation of the constraint blocks $A^i\mathbf{x}\leq\mathbf{b}^i$, $i=1,2,\ldots,p$. Then $P^w$ can be written as

PwP^{w}: Maximize 𝐜T𝐱\displaystyle\mathbf{c}^{\mkern-1.0mu{T}}\mathbf{x}
Subject to: A𝐱𝐛,𝐰1A1𝐱𝐰1𝐛1,𝐰2A2𝐱𝐰2𝐛2,,𝐰pAp𝐱𝐰p𝐛p,𝐱𝟎\displaystyle A\mathbf{x}\leq\mathbf{b},\mathbf{w}^{1}A^{1}\mathbf{x}\leq\mathbf{w}^{1}\mathbf{b}^{1},\mathbf{w}^{2}A^{2}\mathbf{x}\leq\mathbf{w}^{2}\mathbf{b}^{2},\ldots,\mathbf{w}^{p}A^{p}\mathbf{x}\leq\mathbf{w}^{p}\mathbf{b}^{p},\mathbf{x}\geq\mathbf{0}
Theorem 28.

When 𝐰i\mathbf{w}^{i} is the part of an optimal dual solution 𝐰\mathbf{w} of PP^{*} that is associated with the constraint block Ai𝐱𝐛i,i=1,2,,pA^{i}\mathbf{x}\leq\mathbf{b}^{i},i=1,2,\dots,p, the optimal objective function values of PP^{*} and PwP^{w} are the same.

The proofs of Theorem 27 and Theorem 28 are skipped due to space constraints; they can be found in the forthcoming thesis [31].

Let us now analyze the quality of LP relaxations of the MILP models generated by weighted aggregations.

Theorem 29.

If the multipliers in the weighted aggregation models are chosen as the optimal dual variables associated with the corresponding aggregated constraints, then the LP relaxation of the resulting aggregated model has the same optimal objective function value as that of the corresponding original model. In particular,

(i) If $\alpha_{ij}$ in the type 1 aggregation models is selected as the dual variable associated with the corresponding aggregated constraint in the corresponding model (DW, GW, FT, and PK), then $\operatorname{OBJ}(\overline{DW})=\operatorname{OBJ}(\overline{DW}(\alpha))$ and $\operatorname{OBJ}(\overline{GW})=\operatorname{OBJ}(\overline{GW}(\alpha))=\operatorname{OBJ}(\overline{FT}(\alpha))=\operatorname{OBJ}(\overline{PK}(\alpha))$.

(ii) If $\beta_{ij}$, $\gamma_{ij}$, $\delta_{ij}$ and $\theta_{ij}$ in the type 2 aggregation models are selected as the dual variables associated with the corresponding aggregated constraints in the corresponding models (GW, FT, and PK), then $\operatorname{OBJ}(\overline{GW})=\operatorname{OBJ}(\overline{GW}(*,\gamma))=\operatorname{OBJ}(\overline{GW}(*,\delta))=\operatorname{OBJ}(\overline{GW}(*,\gamma+\delta))=\operatorname{OBJ}(\overline{FT}(*,\gamma))=\operatorname{OBJ}(\overline{FT}(*,\theta))=\operatorname{OBJ}(\overline{PK}(*,\beta))$.

(iii) If $\alpha_{ij}$, $\beta_{ij}$, $\gamma_{ij}$, $\delta_{ij}$ and $\theta_{ij}$ in the simultaneous aggregation models are selected as the dual variables associated with the corresponding aggregated constraints in the corresponding models (GW, FT, and PK), then $\operatorname{OBJ}(\overline{GW})=\operatorname{OBJ}(\overline{GW}(\alpha,\gamma))=\operatorname{OBJ}(\overline{GW}(\alpha,\delta))=\operatorname{OBJ}(\overline{GW}(\alpha,\gamma+\delta))=\operatorname{OBJ}(\overline{FT}(\alpha,\gamma))=\operatorname{OBJ}(\overline{FT}(\alpha,\theta))=\operatorname{OBJ}(\overline{PK}(\alpha,\beta))$.

Proof.

The proof of this theorem follows from Theorem 22, Theorem 26, and Theorem 28. ∎

It may be noted that when the multipliers ($\alpha_{ij},\beta_{ij}$, etc.) are chosen as the corresponding dual variables, as discussed in Theorem 29, the corresponding MILP models could become invalid, since some of these multipliers could be zero. Recall that the validity of these MILP models is guaranteed only when the multipliers are strictly positive. Thus, if any of the multipliers is zero, we either replace it with an appropriately selected value $\epsilon>0$ or find an alternative optimal dual solution (if one exists) in which it is non-zero. The choice of $\epsilon$ must take into account the numerical issues associated with the solver used and the tightness of the LP relaxation bound required; depending on the extent of these adjustments, the LP relaxation bound may deteriorate slightly.

6 Experimental analysis

In addition to the theoretical analysis of our MILP models for QUBO, we have also carried out extensive experimental analysis to assess the relative merits of these models. The objectives of the experiments were twofold. First, we wanted to assess the ability of the models to solve the problems to optimality within a given time threshold. The second objective was to assess the heuristic value of the models; that is, given an upper threshold on the running time, to identify whether any of the models consistently produced better solutions. We do not compare the models with the best known specially designed QUBO solvers, as that would be an unfair comparison. However, we believe the insights gained from our experiments can be used to introduce further refinements to existing QUBO solvers or to design new ones with improved capabilities.

All of our computational experiments were carried out on different PCs with the same configuration: Windows 10 Enterprise 64-bit operating system with an Intel(R) Core(TM) i7-3770 3.40GHz processor and 16 GB of memory. Gurobi 9.5.1 was used as the MILP solver through the Gurobipy interface in Anaconda's Spyder IDE. The parameters 'Presolve' and 'Cuts' were set to 0 throughout these experiments, to eliminate bias coming from the well-known cuts generated based on the Boolean quadric polytope [41]. We considered three classes of test instances: the well-known Beasley instances [56] and the Billionnet and Elloumi instances [56], in addition to instances that we generated, which we call the balanced data set. To limit the experimental runs, we did not test some simultaneous aggregations of FT, namely FT($\alpha,\gamma$), FT($\alpha,\delta$), ORFT($\alpha,\gamma$) and ORFT($\alpha,\delta$), since they behave similarly to the GW models.

We use the notational convention MU($\alpha,\beta$) to represent the unweighted aggregation corresponding to the weighted model M($\alpha,\beta$), where M represents the model and $\alpha$ and $\beta$ represent the weights/parameters used for the aggregation. For the weighted versions, the weights are taken as the optimal dual variables of the respective constraints in the LP relaxation of the corresponding model; all optimal dual variables with value zero are replaced by 1. For the unweighted versions, all multipliers are taken as 1.
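As a rough sketch of how this weight selection might be carried out with Gurobipy (our own illustration, not the exact script used for the experiments), the snippet below relaxes a given MILP model, solves the LP relaxation, and reads the dual value of each constraint of interest, replacing zero duals by 1. The function name and the assumption that the constraints to be aggregated carry names with a common prefix are ours and purely illustrative.

```python
# Illustrative sketch only: extract aggregation weights from LP-relaxation duals.
# `model` is assumed to be a gurobipy MILP (e.g., one of the models of Section 2),
# and `prefix` a name prefix used when the constraints to be aggregated were added.
def aggregation_weights(model, prefix):
    relaxed = model.relax()              # LP relaxation of the MILP
    relaxed.Params.OutputFlag = 0
    relaxed.optimize()
    weights = {}
    for constr in relaxed.getConstrs():
        if constr.ConstrName.startswith(prefix):
            dual = constr.Pi             # dual value of the constraint
            weights[constr.ConstrName] = dual if dual != 0 else 1.0
    return weights
```

As discussed after Theorem 29, replacing a zero dual by 1 (or by a small positive value) keeps the aggregated model valid, possibly at the cost of a slightly weaker LP relaxation bound.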

6.1 Balanced Data Set

Figure 1: Experimental analysis for Balanced data set

This data set is generated with the following characteristics: $c_i\in[-10,10]$ for all $i$, and $q_{ij}\in[-20,20]$ for all $i,j$, with $Q$ symmetric and having zero diagonal elements. Ten different problem sizes $n=10,20,30,\ldots,90,100$ are considered. After generating a pair $(Q,\mathbf{c})$, we solved the corresponding $\overline{GW}$. If all the $x_i$ variables turned out to be half-integral, the generated instance was accepted into the data set; otherwise it was discarded. For each problem size $n$, five different instances were constructed. The balanced data set we generated will be made available to researchers for future experimental analysis. The time limit was set to 30 minutes for $n\leq 60$ and 1 hour for all other instances in this set.
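The following sketch (our own illustration, not the exact generator used for the experiments) shows one way such an instance could be generated and screened with Gurobipy. It assumes the GW constraints $x_i+x_j-y_{ij}\leq 1$, $y_{ij}\leq x_i$ and $y_{ij}\leq x_j$ described earlier, and it accepts an instance only when every $x_i$ equals $1/2$ in the $\overline{GW}$ solution, which is our reading of the half-integrality condition used to make the instances balanced; the tolerance is our choice.

```python
# Sketch (assumptions ours): generate a candidate (Q, c) for the balanced data
# set and accept it only if every x_i is 1/2 in the GW LP relaxation.
import random

import gurobipy as gp
from gurobipy import GRB


def random_instance(n, seed=None):
    rng = random.Random(seed)
    c = [rng.randint(-10, 10) for _ in range(n)]
    Q = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            Q[i][j] = Q[j][i] = rng.randint(-20, 20)  # symmetric, zero diagonal
    return Q, c


def gw_lp_x_values(Q, c):
    n = len(c)
    m = gp.Model("GW-LP")
    m.Params.OutputFlag = 0
    x = m.addVars(n, lb=0.0, ub=1.0, name="x")           # LP relaxation of GW
    pairs = [(i, j) for i in range(n) for j in range(n) if Q[i][j] != 0]
    y = m.addVars(pairs, lb=0.0, ub=1.0, name="y")
    for i, j in pairs:
        m.addConstr(x[i] + x[j] - y[i, j] <= 1)          # type 1 constraint
        m.addConstr(y[i, j] <= x[i])                     # (13)
        m.addConstr(y[i, j] <= x[j])                     # (14)
    m.setObjective(
        gp.quicksum(Q[i][j] * y[i, j] for i, j in pairs)
        + gp.quicksum(c[i] * x[i] for i in range(n)),
        GRB.MAXIMIZE,
    )
    m.optimize()
    return [x[i].X for i in range(n)]


def generate_balanced_instance(n, tol=1e-6):
    while True:
        Q, c = random_instance(n)
        if all(abs(v - 0.5) <= tol for v in gw_lp_x_values(Q, c)):
            return Q, c
```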

Figure 1 shows the experimental results corresponding to the balanced data set for the models discussed in Sections 2, 3 and 4. For this data set, it is observed that for the smaller instances ($n\leq 50$) most of the models reached optimality; however, for $n\geq 60$ the time limit was reached for all instances, and the best objective function value found is used for the comparison. It can be noticed that among the basic models discussed in Section 2, the optimality restricted version of the PK model was the best.

Overall, the type 2 aggregations of PK, both unweighted and weighted, performed well here. In general, the models in which type 1 constraints are aggregated never performed better than the aggregations based on type 2 constraints. For the aggregations of GW in particular, the exact type 2 aggregations performed better than the others. Hansen and Meyer [28] indicated that replacing constraints (13) and (14) in GW by

jRiyij\displaystyle\sum_{j\in R_{i}}y_{ij} |Ri|xi, for i=1,2,,n.\displaystyle\leq|R_{i}|x_{i},\mbox{ for }i=1,2,\ldots,n. (115)
\displaystyle y_{ij} \leq 1,\mbox{ for all }j\in R_i,\ i=1,2,\ldots,n. (116)

yields a valid formulation for QUBO and stated it as the Glover-Woolsey aggregation. They showed that the LP relaxation of this model yields a very weak upper bound and discarded it from their further experiments. We first observe that the model discussed in [28] as the Glover-Woolsey aggregation is not precisely the Glover-Woolsey aggregation, and that model need not produce an optimal solution for QUBO. For example, consider the QUBO with $Q=\begin{pmatrix}0&1\\ 1&0\end{pmatrix}$ and $\mathbf{c}^{T}=(-2,1)$. An optimal solution produced by this formulation is $x_1=0,\;x_2=1,\;y_{12}=0,\;y_{21}=1$ with objective function value 2, but it does not correspond to a solution of the original QUBO because $x_1x_2\neq y_{21}$; the solution $x_1=0,\;x_2=1,\;y_{12}=y_{21}=0$ has objective function value 1, which is the optimal objective function value of this QUBO. However, the correct Glover-Woolsey aggregation (with the amendments from Goldman [25]) in fact performs computationally better, as can be observed from Figure 1.

Overall, when we compare all the models together, the optimality restricted version of GW($*,\gamma,\delta$) turned out to be the best formulation, with the weighted version having a frequency of 23 and the unweighted version a frequency of 21 for this data set. For the basic models, the exact and optimality restricted versions performed the same for this data set, while for the aggregations the optimality restricted models performed better. For the exact models, unweighted aggregations were better, but for the optimality restricted models, weighted aggregations showed better results. Overall, the optimality restricted versions of both the aggregated and the basic models outperformed the other models for this data set. As mentioned earlier, the bar chart clearly shows that most of the type 1 constraint based aggregations and some simultaneous aggregations did not perform well in this experiment, so we discarded PK($\alpha,*$), GW($\alpha,*$), GW($\alpha,\gamma,\delta$), GW($\alpha,\delta$), GW($\alpha,\gamma$), FT($\alpha,*$) and their respective unweighted and optimality restricted versions from the subsequent experiments.

6.2 Beasley Data Set[56]

The data for this experiment is from the OR-Library [7]. More details of the data set can be found in Wiegele [56] and Beasley [8]. We selected instances with $n=50, 100, 250, 500$; all the coefficients are uniformly distributed integers in $[-100,100]$. The time limit was set to 30 minutes for $n=50$ and 1 hour for all other instances.

For $n=50$ and $100$, most of the models produced an optimal solution before reaching the time limit, and hence the comparison is based on run times.

For $n=250$ and $500$, optimality was not reached, so we look for the models that give the tightest bounds. Hence, we categorize the comparison on the basis of run time and best bounds.

Figure 2: Experimental analysis for Beasley data set

From Figure 2, for the smaller instances ($n=50,100$), it is evident that the PK model is the best most often overall, while among the aggregations the optimality restricted version of GW($\beta,\gamma$) was the best. Further, it may be noted that for this data set the optimality restricted models have better run times.
For $n=250$ and $500$, the unweighted version of the optimality restricted PK($*,\beta$) and the exact version of FT($*,\gamma,\theta$) gave the best bound most often.

Overall, ORPK and GW($*,\beta,\gamma$) were the best, with frequencies of 21 and 17, respectively, out of 40. For the basic models, the optimality restricted versions performed better for $n=50,100$, whereas the exact models were better for $n=250,500$. For the aggregations, the optimality restricted models performed better for $n=50,100$, while for $n=250,500$ the exact and optimality restricted models performed the same. Also, for the aggregations, the unweighted models were better overall, but among the exact models for $n=50,100$ the weighted models were better.

6.3 Billionnet and Elloumi instances [56]

Figure 3: Experimental analysis for BE data set

The third data set we used consists of the Billionnet and Elloumi instances [56, 43]. Overall, there are 80 instances in this data set with different values of $n$ and density. We selected the problems with $n=100,120,150,200,250$, where the diagonal coefficients are from $[-100,100]$ and the off-diagonal coefficients are from $[-50,50]$. The density of the instances is 1 for $n=100$, 0.3 and 0.8 for $n=120,150,200$, and 0.1 for $n=250$. The time limit was set to 1 hour for all instances. For this data set, all the instances reached the time limit, so the comparison is done with respect to the value of the heuristic solutions obtained.

Overall, from Figure 3, the optimality restricted version of GW($*,\gamma+\delta$) is the best with a frequency of 42, followed by the basic model FT with a frequency of 38. It is worth noting that all the aggregations of GW and PK gave good solutions, with results either better than or at least comparable to GW. Also, the unweighted formulations performed better than the weighted formulations for this data set. For the basic models, the exact models were better. For the weighted aggregations, the exact and optimality restricted models behaved the same, but for the unweighted aggregations the optimality restricted models were better, and, among the aggregations in particular, the unweighted models were generally better.

Considering the analysis over all the data sets, it is evident that the type 2 aggregations of GW and PK performed better overall.

Conclusion

In this paper, we have presented various MILP formulations of QBOP, including some basic models and models obtained by selective aggregation of the constraints of the basic models. Unlike the aggregation based models studied in the literature for general integer programs, our models provide viable alternatives for practical problem solving. Theoretical and experimental analyses of the models have been carried out to assess their relative strengths. We also developed new benchmark instances, which will be made available through GitHub for future experimental work. The applicability of the basic strategies used in this paper goes beyond QUBO to other quadratic models (e.g. [38, 57]) as well as to models that are not explicitly quadratic. More details on this will be reported in a sequel. Our computational results show that the aggregation based models, particularly the type 2 aggregation based models, performed well, with the added advantage of having the same LP relaxation bound as their corresponding basic models when the multipliers are chosen as optimal dual variables. The aggregation methods suggested in this paper can also be used in other integer programming models, for example the stable set problem, the vertex cover problem, and facility location problems, among others. For details on such applications, we refer to the forthcoming Ph.D. thesis [31] and our follow-up papers.

Acknowledgments

This work was supported by an NSERC discovery grant awarded to Abraham P. Punnen. A preliminary reduced version of this paper was released as a technical report of the mathematics department, Simon Fraser University.

References

  • [1] W. P. Adams and P. M. Dearing, On the equivalence between roof duality and Lagrangian duality for unconstrained 0-1 quadratic programming problems, Discrete Applied Mathematics 48 (1994) 1–20.
  • [2] W.P. Adams, R.J. Forrester, A simple recipe for concise mixed 0-1 linearizations, Operations Research Letters 33 (2005) 55–61.
  • [3] W.P. Adams, R. Forrester and F. Glover, Comparisons and enhancement strategies for linearizing mixed 0-1 quadratic programs, Discrete Optimization 1 (2004) 99–120.
  • [4] W. P. Adams and S. M. Henry, Base-2 Expansions for Linearizing Products of Functions of Discrete Variables, Operations Research 60 (2012) 1477–1490.
  • [5] W.P. Adams and H.D. Sherali, A tight linearization and an algorithm for 0–1 quadratic programming problems, Management Science 32 (1986) 1274–1290.
  • [6] W. P. Adams and L. Waddell, Linear programming insights into solvable cases of the quadratic assignment problem, Discrete Optimization 14 (2014) 46–60.
  • [7] J. E. Beasley, OR-Library, http://people.brunel.ac.uk/~mastjjb/jeb/info.html, 1990.
  • [8] J. E. Beasley, Heuristic algorithms for the unconstrained binary quadratic programming problem. Technical report, The Management School, Imperial College, London SW7 2AZ, England, 1998
  • [9] A. Billionnet, S. Elloumi, and M-C. Plateau, Quadratic 0-1 programming: Tightening linear or quadratic convex reformulation by use of relaxations, RAIRO Operations Research 42 (2008) 103–121.
  • [10] A. Billionnet, S. Elloumi, M.-C. Plateau, Improving the performance of standard solvers for quadratic 0–1 programs by a tight convex reformulation: the QCR method, Discrete Applied Mathematics 157 (2009) 1185–1197.
  • [11] W. Chaovalitwongse, P. Pardalos, O.A. Prokopyev, A new linearization technique for multi-quadratic 0–1 programming problems, Operations Research Letters 32 (2004) 517–522
  • [12] E. Çela, V. G. Deineko and G. J. Woeginger, Linearizable special cases of the QAP, Journal of Combinatorial Optimization 31 (2016) 1269–1279.
  • [13] G.B. Dantzig, On the significance of solving linear programming problems with some integer variables, Econometrica 28 (1960) 30–44.
  • [14] M. E. Dyer, Calculating surrogate constraints, Mathematical Programming 19 (1980) 255–278.
  • [15] A. A. Elimam and S. E. Elmaghary, On the reduction method for integer linear programs, Discrete Applied Mathematics 12 (1982) 241–260.
  • [16] R. Fortet, Applications de l’algèbre de boole en recherche opérationelle, Revue Francaise Recherche Opérationelle 4 (1959) 5–36.
  • [17] R. Fortet, L’algèbre de boole et ses applications en recherche opérationnelle, Cahiers du Centre d’Etudes de Recherche Opérationnelle 4 (1960) 17–26.
  • [18] F. Glover, Surrogate Constraints, Operations Research 16 (1968) 741–749.
  • [19] F. Glover, Surrogate Constraint Duality in Mathematical Programming, Operations Research 23 (1975) 434–451.
  • [20] F. Glover, Improved linear integer programming formulations of nonlinear integer problems, Management Science 22 (1975) 455–460.
  • [21] F. Glover, Tutorial on surrogate constraints approaches for optimization graphs, Research report, Journal of Heuristics 9 (2003) 175–227.
  • [22] F. Glover and D. A. Babayev, New results for aggregating integer-valued equations, Annals of Operations Research 58 (1995) 227–242.
  • [23] F. Glover and E. Woolsey, Converting the 0–1 polynomial programming problem to a 0–1 linear program, Operations Research 22 (1974) 180–182.
  • [24] F. Glover and E. Woolsey, Further reduction of zero-one polynomial programming problems to zero-one linear programming problems, Operations Research 21 (1973) 141–161.
  • [25] A. J. Goldman, Linearization in 0-1 Variables: A Clarification, Operations Research 31 (1983) 946–947.
  • [26] S. Gueye and P. Michelon, A linearization framework for unconstrained quadratic (0-1) problems, Discrete Applied Mathematics 157 (2009) 1255–1266.
  • [27] S. Gueye and P. Michelon, “Miniaturized” Linearizations for Quadratic 0/1 Problems, Annals of Operations Research 140 (2005) 235–261.
  • [28] P. Hansen and C. Meyer, Improved compact linearizations for the unconstrained quadratic 0–1 minimization problem, Discrete Applied Mathematics 157 (2009) 1267–1290.
  • [29] H. Hu and R. Sotirov, The linearization problem of a binary quadratic problem and its applications, Annals of Operations Research 307 (2021) 229–249.
  • [30] S. N. Kabadi and A. P. Punnen, An O(n4)O(n^{4}) algorithm for the QAP linearization problem, Mathematics of Operations Research 36 (2011) 754-761.
  • [31] N. Kaur, Aggregation properties of 0-1 integer programming problems and connections with binary quadratic programs, Forthcoming Ph.D. thesis, Simon Fraser University, 2024.
  • [32] K. E. Kendall and S. Zionts, Solving integer programming problems by aggregating constraints, Operations Research 25 (1977) 346–351.
  • [33] G. Kochenberger, J-K. Hao, F. Glover, M. Lewis, Z. Lu, H. Wang and Y. Wang, The Unconstrained Binary Quadratic Programming Problem: A Survey, Journal of Combinatorial Optimization 28 (2014) 58–81.
  • [34] A. N. Letchford, The Boolean quadric polytope, in The Quadratic Unconstrained Binary Optimization Problem: Theory, Algorithms, and Applications, A. P. Punnen (ed.), Springer, Switzerland, 2022.
  • [35] L Liberti, Compact linearization for binary quadratic problems, 4OR 5 (3) (2007) 231–245.
  • [36] V. Maniezzo, M. A. Boschetti, and T. Stützle, Matheuristics: Algorithms and Implementations, Springer 2021.
  • [37] G. B. Mathews, On the partition of numbers, Proceedings of the London Mathematical Society 28 (1897) 486–490.
  • [38] F. de Meijer and R. Sotirov, SDP-Based Bounds for the Quadratic Cycle Cover Problem via Cutting-Plane Augmented Lagrangian Methods and Reinforcement Learning, INFORMS Journal on Computing 33 (2021) 262–1276.
  • [39] K. G. Murty, Linear Programming, Wiley, 1983.
  • [40] D. C. Onyekwelu, Computational viability of a constraint aggregation scheme for integer linear programming problems, Operations Research 31 (1983) 795–801.
  • [41] M. Padberg, The boolean quadric polytope: Some characteristics, facets and relatives, Mathematical Programming 45 (1989) 134–172.
  • [42] M. W. Padberg, Equivalent knapsack-type formulations of bounded integer linear programs: An alternative approach, Naval Research Logistics Quarterly 19 (1973) 699–708.
  • [43] P. M. Pardalos and G. P. Rodgers, Computational aspects of a branch and bound algorithm for quadratic zero-one programming. Computing 45 (1990) 131–144.
  • [44] A. P. Punnen (ed.), The Quadratic Unconstrained Binary Optimization Problem: Theory, Algorithms, and Applications, Springer, Switzerland, 2022.
  • [45] A. P. Punnen and S. N. Kabadi, A linear time algorithm for the Koopmans–Beckmann QAP linearization and related problems, Discrete Optimization 10 (2013) 200–209.
  • [46] A. P. Punnen and N. Kaur, Revisiting some classical explicit linearizations for the quadratic binary optimization problem, Research Report, Department of Mathematics, Simon Fraser University, 2022.
  • [47] A. P. Punnen and N. Kaur, Revisiting compact linearizations of the quadratic binary optimization problem, Research Report, Department of Mathematics, Simon Fraser University, 2022.
  • [48] A. P. Punnen, B. D Woods and S. N. Kabadi, A characterization of linearizable instances of the quadratic traveling salesman problem.(2017), arXiv:1708.07217
  • [49] A. P. Punnen, P. Pandey, and M. Friesen, Representations of quadratic combinatorial optimization problems: A case study using the quadratic set covering problem, Computers & Operations Research 112 (2019) 104769.
  • [50] A. P. Punnen and R. Sotirov, Mathematical programming models and exact algorithms, in The Quadratic Unconstrained Binary Optimization Problem: Theory, Algorithms, and Applications, A. P. Punnen (ed.), Springer, Switzerland, 2022.
  • [51] I. G. Rosenberg, Aggregation of equations in integer programming, Discrete Mathematics 10 (1974) 325–341.
  • [52] H.D. Sherali and W.P. Adams, A Reformulation-Linearization Technique for Solving Discrete and Continuous Nonconvex Problems. Kluwer Academic Publ., Norwell, MA (1999)
  • [53] H. Sherali and W. Adams, A hierarchy of relaxations between the continuous and convex hull representations for zero-one programming problems, SIAM Journal on Discrete Mathematics 3 (1990) 411–430.
  • [54] H. Sherali and W. Adams, A hierarchy of relaxations and convex hull characterizations for mixed-integer zero-one programming problems, Discrete Applied Mathematics 52 (1994) 83–106.
  • [55] L. J. Watters, Reduction of integer polynomial programming problems to zero-one linear programming problems, Operations Research 15 (1967)
  • [56] A. Wiegele, Biq Mac library – a collection of Max-Cut and quadratic 0-1 programming instances of medium size, (2007)
  • [57] Q. Wu, Y. Wang, and F. Glover, Advanced Tabu Search Algorithms for Bipartite Boolean Quadratic Programs Guided by Strategic Oscillation and Path Relinking, INFORMS Journal on Computing 32 (2019) 74–89.
  • [58] W. I. Zangwill, Media selection by decision programming, Journal of Advertising Research 5 (1965) 30–36.
  • [59] N. Zhu and K. Broughan, On aggregating two linear Diophantine equations, Discrete Applied Mathematics 82 (1998) 231–246.