
Iteration Complexity of an Infeasible Interior Point Method for Second-order Cone Programming and its Warmstarting

Yushu Chen and Guangwen Yang
Department of Computer Science and Technology
Tsinghua University
Beijing, China, 100084
Lu Wang, Qingzhong Gan, and Haipeng Chen
Shanghai Aerospace Control Technology Institute
Shanghai, China, 201108
Yushu Chen: also at National Supercomputing Center in Wuxi, Jiangsu, China. chenyushu@mail.tsinghua.edu.cn
Guangwen Yang (corresponding author): also at National Supercomputing Center in Wuxi, Jiangsu, China. ygw@mail.tsinghua.edu.cn
Abstract

This paper studies the worst-case iteration complexity of an infeasible interior point method (IPM) for second-order cone programming (SOCP), which is more convenient for warmstarting than feasible IPMs. The method studied is based on the homogeneous and self-dual model and the Monteiro-Zhang family of search directions. Its worst-case iteration complexity is $O\left(k^{1/2}\log\left(\epsilon^{-1}\right)\right)$, to reduce the primal residual, dual residual, and complementarity gap by a factor of $\epsilon$, where $k$ is the number of cone constraints. This matches the best known result for feasible IPMs. The condition under which warmstarting improves the complexity bound is also studied.

1 Introduction

Warm starting of IPMs is widely perceived to be difficult [Potra and Wright, 2000]. When the initial value is close to the boundary and not well-centered, IPMs usually converge very slowly. Since infeasible IPMs can be initialized trivially inside the cone constraints without satisfying the equality constraints, they are more convenient for warm starting than feasible IPMs [Skajaa et al., 2013]. In order to study the convergence of warm starting schemes for SOCP, a better understanding of the iteration complexity of infeasible IPMs is required. This paper studies the iteration complexity of an infeasible interior point method for SOCP, and presents the condition under which warmstarting improves the complexity.

Most existing work on the worst-case iteration complexity of IPMs for SOCP focuses on feasible IPMs, which require the iterates to satisfy the equality constraints. Among these works, Nesterov and Todd [1997, 1998] present the Nesterov-Todd (NT) search direction and show that the short-step path-following algorithm [Kojima et al., 1989] using this direction has $O\left(k^{1/2}\log\left(\epsilon^{-1}\right)\right)$ iteration complexity, where $k$ is the number of cones. Tsuchiya [1999] further studies the convergence of path-following algorithms for SOCP and shows that: (1) using the HRVW/KSH/M directions [Helmberg et al., 1996, Kojima et al., 1997, Monteiro, 1997], the iteration complexities of the short-step, semi-long-step, and long-step methods are $O\left(k^{1/2}\log\left(\epsilon^{-1}\right)\right)$, $O\left(k\log\left(\epsilon^{-1}\right)\right)$, and $O\left(k^{3/2}\log\left(\epsilon^{-1}\right)\right)$, respectively; (2) using the NT directions, the iteration complexities of these methods are $O\left(k^{1/2}\log\left(\epsilon^{-1}\right)\right)$, $O\left(k\log\left(\epsilon^{-1}\right)\right)$, and $O\left(k\log\left(\epsilon^{-1}\right)\right)$, respectively. Monteiro and Tsuchiya [2000] extend the $O\left(k^{1/2}\log\left(\epsilon^{-1}\right)\right)$ complexity bound to the short-step method and the Mizuno-Todd-Ye predictor-corrector method using the Monteiro-Zhang (MZ) family of directions [Monteiro, 1997, Monteiro and Zhang, 1998, Zhang, 1998], which includes the HRVW/KSH/M and NT directions as special cases. Mohammad-Nezhad and Terlaky [2019] show that a Newton method converges quadratically when initialized with an IPM solution with a sufficiently small complementarity gap.

Compared with feasible IPMs, the iteration complexity of infeasible IPMs for SOCP is not thoroughly investigated. Rangarajan [2006] studies the convergence of infeasible IPMs for conic programming, which includes SOCP as a special case, and shows that using the $\mathcal{N}_{-\infty}$ central path neighborhood, the method using the NT directions takes $O\left(k^{2}\log\left(\epsilon^{-1}\right)\right)$ iterations, and the method using the XS or SX directions [Zhang, 1998] takes $O\left(k^{5/2}\log\left(\epsilon^{-1}\right)\right)$ iterations.

In this paper, we show that an infeasible IPM for SOCP has $O\left(k^{1/2}\log\left(\epsilon^{-1}\right)\right)$ iteration complexity, which is the same as the best known result for feasible IPMs. We also present the condition under which warmstarting improves the bound. The method studied is an extension of the well-known short-step path-following algorithm for linear programming, carried over to the context of SOCP. The modifications are as follows: firstly, the iterates are not required to be feasible with respect to the equality constraints, but only to lie in the interior of the cone constraints, so that the method is convenient for warmstarting; secondly, a homogeneous and self-dual (HSD) model [Nemirovskii and Nesterov, 1993, Peng et al., 2003] is applied, so that an infeasibility certificate can be obtained when either the primal or the dual problem is infeasible; thirdly, the MZ family of search directions is employed for acceleration. The warmstarting scheme analyzed is presented in a companion paper [Chen et al., 2022], which uses initial values modified from inexact solutions of previous problems to save computation.

The paper is organized as follows. Section 2 introduces the SOCP problem and the IPM for solving it. Section 3 studies the iteration complexity for the IPM to reduce the primal residual, dual residual, and complementarity gap. Section 4 studies the conditions under which warm starting improves the worst-case complexity compared with cold starting.

1.1 Notation and terminology

The following notations are used throughout the paper. $\mathcal{R}_{+}$ denotes the set of positive real numbers, $\mathcal{R}^{n}$ denotes the $n$-dimensional Euclidean space, and $\mathcal{R}^{m\times n}$ denotes the set of all $m\times n$ matrices with real entries. $\mathbf{I}_{n}$ denotes the identity matrix of order $n$. $\mathbf{G}\succ 0$ means $\mathbf{G}$ is symmetric and positive definite, and $\mathbf{G}\succeq 0$ means $\mathbf{G}$ is symmetric and positive semidefinite. For a square matrix $\mathbf{M}$ with all real eigenvalues, its largest and smallest eigenvalues are denoted by $\lambda_{\max}(\mathbf{M})$ and $\lambda_{\min}(\mathbf{M})$. For a set $\mathcal{K}$, $\text{int}(\mathcal{K})$ denotes its interior. $\text{diag}(\cdots)$ denotes a diagonal matrix with a given vector as its diagonal, and $\text{blkdiag}(\cdots)$ denotes a block diagonal matrix with given matrix blocks. $\left\|\cdot\right\|$ without specification is used instead of $\left\|\cdot\right\|_{2}$ for simplicity. With a slight abuse of notation, when concatenating column vectors $\mathbf{x}$ and $\mathbf{y}$, we write $(\mathbf{x},\mathbf{y})$ instead of $(\mathbf{x}^{T},\mathbf{y}^{T})^{T}$ for simplicity.

2 The SOCP problem and the infeasible IPM

In this section, we introduce the second-order cone programming (SOCP) problem, some important concepts related to it, and the infeasible IPM for SOCP studied in this paper.

2.1 The SOCP problem

SOCP minimizes a linear function over the intersection of an affine set and the direct product of linear cones (LCs) and second-order cones (SOCs).

The linear cones $\mathcal{K}_{L}^{n}$ and second-order cones $\mathcal{K}_{S}^{n}$ are defined as

\mathcal{K}_{L}^{n}=\{\mathbf{v}\in\mathcal{R}^{n}:\mathbf{v}\geq 0\},\quad n\geq 1, \qquad (1)
\mathcal{K}_{S}^{n}=\{\mathbf{v}\in\mathcal{R}^{n}:\mathbf{v}_{1}\geq\left\|\mathbf{v}_{2:n}\right\|\},\quad n\geq 2.

The 1-dimensional SOC is defined as $\mathcal{K}_{S}^{1}=\mathcal{K}_{L}^{1}$.
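As a concrete illustration (our own sketch, not part of the original text), membership in the two cone types of Eq. (1) can be tested in a few lines of Python/NumPy:

```python
import numpy as np

def in_linear_cone(v):
    # K_L^n of Eq. (1): all components nonnegative
    return bool(np.all(v >= 0))

def in_second_order_cone(v):
    # K_S^n of Eq. (1): v_1 >= ||v_{2:n}||; for n = 1 this reduces to K_L^1
    if v.size == 1:
        return v[0] >= 0
    return v[0] >= np.linalg.norm(v[1:])

print(in_second_order_cone(np.array([2.0, 1.0, 1.0])))  # True:  2 >= sqrt(2)
print(in_second_order_cone(np.array([1.0, 1.0, 1.0])))  # False: 1 <  sqrt(2)
```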

Then, the standard form of SOCP is defined as

\min_{\mathbf{x}}\ \mathbf{c}^{T}\mathbf{x}, \qquad (2)
s.t.
\mathbf{A}\mathbf{x}=\mathbf{b},
\mathbf{x}\in\mathcal{K},
\mathcal{K}=\mathcal{K}_{L}^{l}\times\mathcal{K}_{S}^{n_{l+1}}\times\mathcal{K}_{S}^{n_{l+2}}\times\cdots\times\mathcal{K}_{S}^{n_{l+m}},

where $\mathbf{A}\in\mathcal{R}^{p\times n}$ and $\text{rank}(\mathbf{A})=p\leq n$. Without loss of generality, we assume that in the solution variable the linear cone is arranged first, followed by the SOCs. The number of cones is

k=l+m, \qquad (3)

where the $l$-dimensional LC $\mathcal{K}_{L}^{l}$ is viewed as $l$ 1-dimensional LCs, so that $n_{i}=1$ for $i\in\left\{1,\cdots,l\right\}$. The $i$-th cone is also denoted by $\mathcal{K}^{i}$ for simplicity, and correspondingly $\mathcal{K}=\mathcal{K}^{1}\times\cdots\times\mathcal{K}^{k}$. For a vector $\mathbf{v}\in\mathcal{K}$, we use $\mathbf{v}^{(i)}$ to denote the subvector in the cone $\mathcal{K}^{i}$. For $\mathbf{v}\in\mathcal{R}^{n}\setminus\mathcal{K}$, $\mathbf{v}^{(i)}$ also denotes the subvector corresponding to the position of $\mathcal{K}^{i}$.

The dual problem of SOCP Eq. (2) is

\max_{\mathbf{y}}\ \mathbf{b}^{T}\mathbf{y} \qquad (4)
s.t.
\mathbf{A}^{T}\mathbf{y}+\mathbf{s}=\mathbf{c},
\mathbf{s}\in\mathcal{K}.

2.2 Important concepts about SOCP

This subsection introduces some important concepts about SOCP for later use.

Firstly, we introduce the Euclidean Jordan algebra associated with the SOC (see e.g. [Faraut and Koranyi, 1994, Faybusovich, 1997a, b]), which is crucial in IPMs for SOCP. The algebra product for the cone $\mathcal{K}^{i}$ is defined as

\mathbf{x}^{(i)}\circ\mathbf{s}^{(i)}=\left(\left(\mathbf{x}^{(i)}\right)^{T}\mathbf{s}^{(i)},\ \mathbf{x}_{1}^{(i)}\mathbf{s}_{2:n_{i}}^{(i)}+\mathbf{s}_{1}^{(i)}\mathbf{x}_{2:n_{i}}^{(i)}\right),\quad\forall\mathbf{x}^{(i)},\mathbf{s}^{(i)}\in\mathcal{R}^{n_{i}}. \qquad (5)

The Euclidean Jordan algebra for $\mathcal{K}$ is defined as

\mathbf{x}\circ\mathbf{s}=\left(\mathbf{x}^{(1)}\circ\mathbf{s}^{(1)},\cdots,\mathbf{x}^{(k)}\circ\mathbf{s}^{(k)}\right),\quad\forall\mathbf{x},\mathbf{s}\in\mathcal{R}^{n}. \qquad (6)

From now on, we assume that the space $\mathcal{R}^{n}$ is endowed with the Euclidean Jordan algebra.

The unit element of the algebra is $\mathbf{e}=\left(\mathbf{e}^{(1)},\cdots,\mathbf{e}^{(k)}\right)$, where $\mathbf{e}^{(i)}=(1,0,\cdots,0)^{T}$, $i\in\{1,\cdots,k\}$, since $\mathbf{v}\circ\mathbf{e}=\mathbf{v}$. The algebra product can then be written as a multiplication of arrowhead matrices and the unit element:

\mathbf{x}\circ\mathbf{s}=\text{mat}(\mathbf{x})\,\text{mat}(\mathbf{s})\,\mathbf{e}, \qquad (7)

where the arrowhead matrix for $\mathbf{v}\in\mathcal{R}^{n}$ is defined as

\text{mat}(\mathbf{v})=\text{blkdiag}\left(\text{mat}\left(\mathbf{v}^{(1)}\right),\cdots,\text{mat}\left(\mathbf{v}^{(k)}\right)\right), \qquad (8)
\text{mat}\left(\mathbf{v}^{(i)}\right)\triangleq\begin{pmatrix}\mathbf{v}^{(i)}_{1}&\left(\mathbf{v}^{(i)}_{2:n_{i}}\right)^{T}\\ \mathbf{v}^{(i)}_{2:n_{i}}&\mathbf{v}^{(i)}_{1}\mathbf{I}_{n_{i}-1}\end{pmatrix},\quad i\in\{1,\cdots,k\}.

It is easy to verify that the extreme eigenvalues of the arrowhead matrix of a single-cone vector $\mathbf{v}$ satisfy

\lambda_{\min}(\text{mat}(\mathbf{v}))=\mathbf{v}_{1}-\|\mathbf{v}_{2:n}\|,\quad\lambda_{\max}(\text{mat}(\mathbf{v}))=\mathbf{v}_{1}+\|\mathbf{v}_{2:n}\|. \qquad (9)

Consequently, we define the notation

\lambda_{\min}(\mathbf{v})\triangleq\mathbf{v}_{1}-\|\mathbf{v}_{2:n}\|,\quad\lambda_{\max}(\mathbf{v})\triangleq\mathbf{v}_{1}+\|\mathbf{v}_{2:n}\|. \qquad (10)

For a vector $\mathbf{v}\in\mathcal{K}$, $\lambda_{i}^{1}(\mathbf{v})$ and $\lambda_{i}^{2}(\mathbf{v})$ are used as abbreviations of $\lambda_{\min}(\mathbf{v}^{(i)})$ and $\lambda_{\max}(\mathbf{v}^{(i)})$, where $\mathbf{v}^{(i)}$ denotes the part of $\mathbf{v}$ in the $i$-th cone.
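To make these definitions concrete, the following sketch (ours, not part of the original paper) implements the single-cone Jordan product, the arrowhead matrix, and the eigenvalue notation, and checks Eq. (7) and Eq. (9) numerically:

```python
import numpy as np

def jordan_product(x, s):
    # Single-cone Jordan product, Eq. (5): x o s = (x^T s, x_1 s_{2:n} + s_1 x_{2:n})
    return np.concatenate(([x @ s], x[0] * s[1:] + s[0] * x[1:]))

def arrowhead(v):
    # mat(v) for a single cone, Eq. (8)
    M = v[0] * np.eye(v.size)
    M[0, 1:] = v[1:]
    M[1:, 0] = v[1:]
    return M

def lam_min(v):  # Eq. (10)
    return v[0] - np.linalg.norm(v[1:])

def lam_max(v):  # Eq. (10)
    return v[0] + np.linalg.norm(v[1:])

rng = np.random.default_rng(0)
x, s = rng.normal(size=4), rng.normal(size=4)
e = np.array([1.0, 0.0, 0.0, 0.0])
assert np.allclose(jordan_product(x, s), arrowhead(x) @ arrowhead(s) @ e)         # Eq. (7)
eigs = np.linalg.eigvalsh(arrowhead(x))
assert np.isclose(eigs.min(), lam_min(x)) and np.isclose(eigs.max(), lam_max(x))  # Eq. (9)
```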

Secondly, we introduce the homogeneous and self-dual (HSD) model and the central path. The HSD model combines the primal and dual problems and enables them to be solved simultaneously. The model is given by

\min\ \tilde{\beta}\nu \qquad (11)
s.t.
\mathbf{A}\mathbf{x}-\mathbf{b}\tau-\tilde{\mathbf{r}}_{p}\nu=0,
-\mathbf{A}^{T}\mathbf{y}+\mathbf{c}\tau-\mathbf{s}-\tilde{\mathbf{r}}_{d}\nu=0,
\mathbf{b}^{T}\mathbf{y}-\mathbf{c}^{T}\mathbf{x}-\kappa-\tilde{r}_{g}\nu=0,
\tilde{\mathbf{r}}_{p}^{T}\mathbf{y}+\tilde{\mathbf{r}}_{d}^{T}\mathbf{x}+\tilde{r}_{g}\tau=-\tilde{\beta},
(\mathbf{x},\mathbf{y},\mathbf{s},\kappa,\tau)\in\mathcal{C},\quad\mathcal{C}=\mathcal{K}\times\mathcal{R}^{p}\times\mathcal{K}\times\mathcal{K}_{L}^{1}\times\mathcal{K}_{L}^{1},

where $\nu\in\mathcal{R}_{+}$ is a free variable, and $\tilde{\mathbf{r}}_{p}\in\mathcal{R}^{p}$, $\tilde{\mathbf{r}}_{d}\in\mathcal{R}^{n}$, $\tilde{r}_{g}\in\mathcal{R}$, $\tilde{\beta}\in\mathcal{R}$ are residuals defined by

\tilde{\mathbf{r}}_{p}\triangleq\left(\mathbf{A}\mathbf{x}_{0}-\mathbf{b}\tau_{0}\right)/\nu_{0}, \qquad (12)
\tilde{\mathbf{r}}_{d}\triangleq\left(-\mathbf{A}^{T}\mathbf{y}_{0}+\mathbf{c}\tau_{0}-\mathbf{s}_{0}\right)/\nu_{0},
\tilde{r}_{g}\triangleq\left(\mathbf{b}^{T}\mathbf{y}_{0}-\mathbf{c}^{T}\mathbf{x}_{0}-\kappa_{0}\right)/\nu_{0},
\tilde{\beta}\triangleq-\left(\tilde{\mathbf{r}}_{p}^{T}\mathbf{y}_{0}+\tilde{\mathbf{r}}_{d}^{T}\mathbf{x}_{0}+\tilde{r}_{g}\tau_{0}\right),

and $\nu_{0}$ is typically set to 1.

The solution of the HSD model fully characterizes the primal and dual problems, as summarized by the following theorem.

Theorem 1. Let $\left(\mathbf{x},\mathbf{y},\mathbf{s},\kappa,\tau\right)$ be an optimal solution of the HSD model Eq. (11). Then the following relations hold [Terlaky et al., 2002, Wang, 2003]:

a) $\mathbf{x}^{T}\mathbf{s}=0$ and $\kappa\tau=0$;

b) If $\tau>0$, then $\left(\mathbf{x},\mathbf{y},\mathbf{s}\right)/\tau$ is the optimal solution of the primal problem Eq. (2) and the dual problem Eq. (4);

c) If $\kappa>0$, then at least one of the inequalities $\mathbf{b}^{T}\mathbf{y}>0$ and $\mathbf{c}^{T}\mathbf{x}<0$ holds. If the first inequality holds, then the primal problem Eq. (2) is infeasible. If the second inequality holds, then the dual problem Eq. (4) is infeasible.

Proof. See Corollary 1.3.1 of [Wang, 2003].

The central path is defined by

\mathbf{A}\mathbf{x}-\mathbf{b}\tau-\tilde{\mathbf{r}}_{p}\nu=0, \qquad (13)
-\mathbf{A}^{T}\mathbf{y}+\mathbf{c}\tau-\mathbf{s}-\tilde{\mathbf{r}}_{d}\nu=0,
\mathbf{b}^{T}\mathbf{y}-\mathbf{c}^{T}\mathbf{x}-\kappa-\tilde{r}_{g}\nu=0,
\mathbf{x}\circ\mathbf{s}=\mu\mathbf{e},
\kappa\tau=\mu.

The central path approaches the optimal solution of the HSD model as $\mu\to 0$, which corresponds to the solution or an infeasibility certificate of the SOCP problem.

Thirdly, we introduce the scaling matrices, which can be applied to scale the search directions. Monteiro and Tsuchiya [2000] introduce a group of scaling automorphisms that map the cone $\mathcal{K}^{i}$ onto itself, namely the following group of matrices

\mathcal{G}^{i}\triangleq\left\{\theta^{i}\mathbf{G}^{i}:\theta^{i}>0,\ \mathbf{G}^{i}\in\mathcal{R}^{n_{i}\times n_{i}},\ (\mathbf{G}^{i})^{T}\mathbf{Q}^{i}\mathbf{G}^{i}=\mathbf{Q}^{i}\right\}, \qquad (14)

where $\theta^{i}$ is a positive scaling value, and

\mathbf{Q}^{i}=\begin{pmatrix}1&0\\ 0&-\mathbf{I}_{n_{i}-1}\end{pmatrix}. \qquad (15)

The set of scaling matrices for a vector in $\mathcal{K}$ is defined as

\mathcal{G}\triangleq\left\{\Theta\mathbf{G}:\mathbf{G}=\text{blkdiag}\left(\mathbf{G}^{1},\cdots,\mathbf{G}^{k}\right),\ \mathbf{G}^{i}\in\mathcal{R}^{n_{i}\times n_{i}},\ (\mathbf{G}^{i})^{T}\mathbf{Q}^{i}\mathbf{G}^{i}=\mathbf{Q}^{i}\right\}, \qquad (16)

where $i\in\left\{1,\cdots,k\right\}$ and

\Theta=\text{blkdiag}\left(\theta^{1}\mathbf{I}_{n_{1}},\cdots,\theta^{k}\mathbf{I}_{n_{k}}\right). \qquad (17)

The solution variables $\mathbf{x}$ and $\mathbf{s}$ are scaled by

\bar{\mathbf{x}}=(\Theta\mathbf{G})^{T}\mathbf{x},\quad\bar{\mathbf{s}}=(\Theta\mathbf{G})^{-1}\mathbf{s}. \qquad (18)

For convenience, we denote a scaling matrix by

\mathbf{D}=(\mathbf{\Theta}\mathbf{G})^{-1}. \qquad (19)

The following proposition gives several important properties of the scaling matrix.

Proposition 1 [Andersen et al., 2003]. Let $\mathbf{D}\in\mathcal{G}$. Then

(\mathbf{x}^{(i)})^{T}\mathbf{s}^{(i)}=(\bar{\mathbf{x}}^{(i)})^{T}\bar{\mathbf{s}}^{(i)}, \qquad (20)
(\theta^{i})^{2}(\mathbf{x}^{(i)})^{T}\mathbf{Q}^{i}\mathbf{x}^{(i)}=(\bar{\mathbf{x}}^{(i)})^{T}\mathbf{Q}^{i}\bar{\mathbf{x}}^{(i)},
(\theta^{i})^{-2}(\mathbf{s}^{(i)})^{T}\mathbf{Q}^{i}\mathbf{s}^{(i)}=(\bar{\mathbf{s}}^{(i)})^{T}\mathbf{Q}^{i}\bar{\mathbf{s}}^{(i)},
\mathbf{x}\in\mathcal{K}\Leftrightarrow\bar{\mathbf{x}}\in\mathcal{K},\text{ and }\mathbf{x}\in\text{int}(\mathcal{K})\Leftrightarrow\bar{\mathbf{x}}\in\text{int}(\mathcal{K}).

Proof. See Lemma 3 of [Andersen et al., 2003].
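As a quick numerical illustration of Proposition 1, the following sketch (ours) uses a hyperbolic rotation, chosen by us as a nontrivial automorphism satisfying $(\mathbf{G}^{i})^{T}\mathbf{Q}^{i}\mathbf{G}^{i}=\mathbf{Q}^{i}$:

```python
import numpy as np

def hyperbolic_rotation(n, t):
    # A nontrivial automorphism of K_S^n: G^T Q G = Q (rotation in the (1,2)-plane)
    G = np.eye(n)
    G[0, 0] = G[1, 1] = np.cosh(t)
    G[0, 1] = G[1, 0] = np.sinh(t)
    return G

n, theta = 4, 1.7
Q = np.diag([1.0] + [-1.0] * (n - 1))
G = hyperbolic_rotation(n, 0.3)
assert np.allclose(G.T @ Q @ G, Q)

x = np.array([3.0, 1.0, -0.5, 0.2])      # interior point of K_S^4
s = np.array([2.0, 0.3, 0.4, -1.0])
xb = (theta * G).T @ x                    # scaling of Eq. (18)
sb = np.linalg.inv(theta * G) @ s

assert np.isclose(x @ s, xb @ sb)                      # gap invariance, Eq. (20)
assert np.isclose(theta**2 * x @ Q @ x, xb @ Q @ xb)   # second relation of Eq. (20)
assert x[0] > np.linalg.norm(x[1:]) and xb[0] > np.linalg.norm(xb[1:])  # cone preserved
```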

For a vector $\mathbf{v}\in\text{int}(\mathcal{K})$, Tsuchiya [1999] introduces a special scaling matrix $\mathbf{T}_{v}$ defined as

\mathbf{T}_{v}\triangleq\text{blkdiag}\left(\mathbf{T}_{v^{(1)}},\cdots,\mathbf{T}_{v^{(k)}}\right), \qquad (21)

where

\mathbf{T}_{u}\triangleq\begin{pmatrix}\mathbf{u}_{1}&\mathbf{u}_{2:n}^{T}\\ \mathbf{u}_{2:n}&\beta_{u}\mathbf{I}+\dfrac{\mathbf{u}_{2:n}\mathbf{u}_{2:n}^{T}}{\beta_{u}+\mathbf{u}_{1}}\end{pmatrix},\quad\beta_{u}=\sqrt{\mathbf{u}_{1}^{2}-\|\mathbf{u}_{2:n}\|^{2}}, \qquad (22)

for $\mathbf{u}\in\mathcal{R}^{n}$.
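A minimal sketch of $\mathbf{T}_{u}$ for a single cone (ours; the checks anticipate Proposition 2 below, plus the identity $\mathbf{T}_{v}^{T}\mathbf{Q}\mathbf{T}_{v}=(\mathbf{v}^{T}\mathbf{Q}\mathbf{v})\mathbf{Q}$, which we use only as a numerical sanity check that $\mathbf{T}_{v}$ is indeed a scaling matrix):

```python
import numpy as np

def T(u):
    # Scaling matrix T_u of Eq. (22); requires u in int(K_S^n), i.e. u_1 > ||u_{2:n}||
    beta = np.sqrt(u[0]**2 - np.linalg.norm(u[1:])**2)
    n = u.size
    M = np.empty((n, n))
    M[0, 0] = u[0]
    M[0, 1:] = M[1:, 0] = u[1:]
    M[1:, 1:] = beta * np.eye(n - 1) + np.outer(u[1:], u[1:]) / (beta + u[0])
    return M

v = np.array([2.0, 0.5, -1.0, 0.3])
Tv = T(v)
e = np.array([1.0, 0.0, 0.0, 0.0])
Q = np.diag([1.0, -1.0, -1.0, -1.0])
assert np.allclose(Tv @ e, v)                       # T_v maps e to v (Proposition 2)
assert np.all(np.linalg.eigvalsh(Tv) > 0)           # T_v is symmetric positive definite
assert np.allclose(Tv.T @ Q @ Tv, (v @ Q @ v) * Q)  # T_v is a scaling matrix in G
```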

Proposition 2 [Tsuchiya, 1999]. For a vector $\mathbf{v}\in\text{int}(\mathcal{K})$, $\mathbf{T}_{v}$ is the unique symmetric scaling matrix which maps $\mathbf{e}$ to $\mathbf{v}$. Moreover, $\mathbf{T}_{v}\succ 0$.

Proof. See Proposition 2.1 of [Tsuchiya, 1999].

Finally, we are ready to define the scaling-invariant neighborhoods of the central path. For $(\mathbf{x},\mathbf{s},\kappa,\tau)\in\text{int}(\mathcal{K})\times\text{int}(\mathcal{K})\times\mathcal{R}_{+}\times\mathcal{R}_{+}$, define the central path coefficient

\mu(\mathbf{x},\mathbf{s},\kappa,\tau)=(\mathbf{x}^{T}\mathbf{s}+\kappa\tau)/(k+1). \qquad (23)

With a slight abuse of notation, we also write $\mu$ instead of $\mu(\mathbf{x},\mathbf{s},\kappa,\tau)$ for simplicity. The central path distances are defined as

d_{2}(\mathbf{x},\mathbf{s},\kappa,\tau)=\sqrt{2}\left\|\mathbf{w}_{xs\kappa\tau}-\mu\mathbf{e}\right\|, \qquad (24)
d_{\infty}(\mathbf{x},\mathbf{s},\kappa,\tau)=\max\left(d_{\infty}(\mathbf{x},\mathbf{s}),\left|\kappa\tau-\mu\right|\right),

where

\mathbf{w}_{xs}=\mathbf{T}_{x}\mathbf{s},\quad\mathbf{w}_{xs\kappa\tau}=\left(\mathbf{w}_{xs},\kappa\tau\right), \qquad (25)
d_{\infty}(\mathbf{x},\mathbf{s})=\max_{i=1,\cdots,k}\left(\left|(\mathbf{w}_{xs})_{1}^{(i)}+\|(\mathbf{w}_{xs})_{2:n_{i}}^{(i)}\|-\mu\right|,\left|(\mathbf{w}_{xs})_{1}^{(i)}-\|(\mathbf{w}_{xs})_{2:n_{i}}^{(i)}\|-\mu\right|\right).

The neighborhoods of the central path determined by the above distances are

\mathcal{N}_{2}(\gamma)=\left\{(\mathbf{x},\mathbf{y},\mathbf{s},\kappa,\tau)\in\text{int}(\mathcal{C}):d_{2}(\mathbf{x},\mathbf{s},\kappa,\tau)\leq\gamma\mu(\mathbf{x},\mathbf{s},\kappa,\tau)\right\}, \qquad (26)
\mathcal{N}_{\infty}(\gamma)=\left\{(\mathbf{x},\mathbf{y},\mathbf{s},\kappa,\tau)\in\text{int}(\mathcal{C}):d_{\infty}(\mathbf{x},\mathbf{s},\kappa,\tau)\leq\gamma\mu(\mathbf{x},\mathbf{s},\kappa,\tau)\right\},

where the parameter $\gamma\in(0,1)$. For $(\mathbf{x},\mathbf{y},\mathbf{s},\kappa,\tau)\in\text{int}(\mathcal{C})$, it is easy to show that $d_{\infty}(\mathbf{x},\mathbf{s},\kappa,\tau)\leq d_{2}(\mathbf{x},\mathbf{s},\kappa,\tau)$, so that $\mathcal{N}_{2}(\gamma)\subseteq\mathcal{N}_{\infty}(\gamma)$.
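The following sketch (ours) evaluates $\mu$, $d_{2}$, and $d_{\infty}$ for cones given as lists of subvectors, reusing the single-cone T() helper from the previous sketch; on a point of the central path both distances vanish:

```python
import numpy as np

def central_path_distances(x_blocks, s_blocks, kappa, tau):
    # mu, d_2, d_inf of Eq. (23)-(25); T() is the scaling-matrix helper defined above
    k = len(x_blocks)
    mu = (sum(xi @ si for xi, si in zip(x_blocks, s_blocks)) + kappa * tau) / (k + 1)
    w = [T(xi) @ si for xi, si in zip(x_blocks, s_blocks)]       # w_xs, cone by cone
    sq = sum((wi[0] - mu)**2 + wi[1:] @ wi[1:] for wi in w) + (kappa * tau - mu)**2
    d2 = np.sqrt(2.0 * sq)                                       # Eq. (24), d_2
    dinf = max(max(abs(wi[0] + np.linalg.norm(wi[1:]) - mu),
                   abs(wi[0] - np.linalg.norm(wi[1:]) - mu)) for wi in w)
    dinf = max(dinf, abs(kappa * tau - mu))                      # Eq. (24), d_inf
    return mu, d2, dinf

# x1 o s1 = 3e and kappa*tau = 3, so this point lies on the central path with mu = 3:
x1, s1 = np.array([2.0, 1.0, 0.0]), np.array([2.0, -1.0, 0.0])
print(central_path_distances([x1], [s1], kappa=np.sqrt(3.0), tau=np.sqrt(3.0)))
# -> (3.0, ~0.0, ~0.0)
```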

The following proposition shows the invariance property of $\mathbf{w}_{xs}$.

Proposition 3 [Tsuchiya, 1999]. Suppose that $(\mathbf{x},\mathbf{s})\in\text{int}(\mathcal{K})\times\text{int}(\mathcal{K})$. Let $\mathbf{w}\triangleq\mathbf{w}_{xs}$ and $\bar{\mathbf{w}}\triangleq\mathbf{w}_{\bar{x}\bar{s}}$, and let $i\in\{1,\cdots,k\}$. Then:

a) $\bar{\mathbf{w}}_{1}^{(i)}=\mathbf{w}_{1}^{(i)}$ and $\|\bar{\mathbf{w}}_{2:n_{i}}^{(i)}\|=\|\mathbf{w}_{2:n_{i}}^{(i)}\|$;

b) $\lambda_{\min}(\bar{\mathbf{w}}^{(i)})=\lambda_{\min}(\mathbf{w}^{(i)})$ and $\lambda_{\max}(\bar{\mathbf{w}}^{(i)})=\lambda_{\max}(\mathbf{w}^{(i)})$;

c) $d_{2}(\bar{\mathbf{x}},\bar{\mathbf{s}})=d_{2}(\mathbf{x},\mathbf{s})$ and $d_{\infty}(\bar{\mathbf{x}},\bar{\mathbf{s}})=d_{\infty}(\mathbf{x},\mathbf{s})$.

Proof. See Proposition 2.4 of Tsuchiya [1999].

Using Proposition 3, it is easy to verify that for a point $(\mathbf{x},\mathbf{y},\mathbf{s},\kappa,\tau)$ on the central path defined by Eq. (13), the distances satisfy $d_{2}(\mathbf{x},\mathbf{s},\kappa,\tau)=d_{\infty}(\mathbf{x},\mathbf{s},\kappa,\tau)=0$.

2.3 The infeasible IPM for SOCP

In this subsection, we briefly introduce the infeasible IPM for SOCP studied in this paper.

The SOCP problem can be solved by loosely tracking the central path towards the optimal solution of the HSD model. In this approach, the updating direction in each iteration is computed by solving

\mathbf{A}\Delta\mathbf{x}-\mathbf{b}\Delta\tau=-\left(1-\nu\right)\left(\mathbf{A}\mathbf{x}-\mathbf{b}\tau\right), \qquad (27)
-\mathbf{A}^{T}\Delta\mathbf{y}+\mathbf{c}\Delta\tau-\Delta\mathbf{s}=-\left(1-\nu\right)\left(-\mathbf{A}^{T}\mathbf{y}+\mathbf{c}\tau-\mathbf{s}\right),
\mathbf{b}^{T}\Delta\mathbf{y}-\mathbf{c}^{T}\Delta\mathbf{x}-\Delta\kappa=-\left(1-\nu\right)\left(\mathbf{b}^{T}\mathbf{y}-\mathbf{c}^{T}\mathbf{x}-\kappa\right),
\mathbf{X}\Delta\mathbf{s}+\mathbf{S}\Delta\mathbf{x}=\nu\mu\mathbf{e}-\mathbf{X}\mathbf{S}\mathbf{e},
\kappa\Delta\tau+\tau\Delta\kappa=\nu\mu-\kappa\tau,

where $\mathbf{X}=\text{mat}(\mathbf{x})$, $\mathbf{S}=\text{mat}(\mathbf{s})$, and $\mu=\mu(\mathbf{x},\mathbf{s},\kappa,\tau)$ as defined in Eq. (23), which is proportional to the complementarity gap $\mathbf{x}^{T}\mathbf{s}+\kappa\tau$.

Instead of using the search direction obtained by solving Eq. (27) directly, the MZ family of directions is applied to speed up convergence. The directions are calculated by solving a scaled linear system:

\mathbf{A}\Delta\mathbf{x}-\mathbf{b}\Delta\tau=-\left(1-\nu\right)\left(\mathbf{A}\mathbf{x}-\mathbf{b}\tau\right), \qquad (28)
-\mathbf{A}^{T}\Delta\mathbf{y}+\mathbf{c}\Delta\tau-\Delta\mathbf{s}=-\left(1-\nu\right)\left(-\mathbf{A}^{T}\mathbf{y}+\mathbf{c}\tau-\mathbf{s}\right),
\mathbf{b}^{T}\Delta\mathbf{y}-\mathbf{c}^{T}\Delta\mathbf{x}-\Delta\kappa=-\left(1-\nu\right)\left(\mathbf{b}^{T}\mathbf{y}-\mathbf{c}^{T}\mathbf{x}-\kappa\right),
\bar{\mathbf{X}}\mathbf{D}\Delta\mathbf{s}+\bar{\mathbf{S}}\mathbf{D}^{-T}\Delta\mathbf{x}=\nu\mu\mathbf{e}-\bar{\mathbf{X}}\bar{\mathbf{S}}\mathbf{e},
\kappa\Delta\tau+\tau\Delta\kappa=\nu\mu-\kappa\tau,

where

\bar{\mathbf{X}}=\text{mat}(\bar{\mathbf{x}}),\quad\bar{\mathbf{S}}=\text{mat}(\bar{\mathbf{s}}),\quad\bar{\mathbf{x}}=\mathbf{D}^{-T}\mathbf{x},\quad\bar{\mathbf{s}}=\mathbf{D}\mathbf{s},\quad\mathbf{D}\in\mathcal{G}. \qquad (29)

The scaled linear system Eq. (28) is obtained by using the scaled variables $\bar{\mathbf{x}},\bar{\mathbf{s}}$ instead of $\mathbf{x}$ and $\mathbf{s}$ in the HSD model Eq. (11).

The MZ family of directions includes the HRVW/KSH/M directions and NT directions as special cases. For example, in the NT scaling, the scaling matrix $\mathbf{D}$ is symmetric and satisfies $\mathbf{x}=\mathbf{D}^{2}\mathbf{s}$, and multiplication with $\mathbf{D}$ or $\mathbf{D}^{2}$ can be carried out in $O(n)$ operations.

Then, the infeasible IPM for SOCP is presented as Algorithm 1.

Algorithm 1 An Infeasible IPM for SOCP

Input: constant coefficients $\gamma,\delta\in(0,1)$; a small positive threshold $\epsilon\in(0,1)$; the SOCP problem parameters $\mathbf{A},\mathbf{b},\mathbf{c},\mathcal{K}$; an initial value $(\mathbf{x}_{0},\mathbf{y}_{0},\mathbf{s}_{0},\kappa_{0},\tau_{0})\in\mathcal{N}_{2}(\gamma)$.
Output: the solution $(\mathbf{x},\mathbf{y},\mathbf{s},\kappa,\tau)$ and the final status.

1:  Set $\nu=1-\delta/\sqrt{2(k+1)}$;
2:  Set $(\mathbf{x},\mathbf{y},\mathbf{s},\kappa,\tau)=(\mathbf{x}_{0},\mathbf{y}_{0},\mathbf{s}_{0},\kappa_{0},\tau_{0})$;
3:  while the stop criteria are not satisfied do
4:     Set $\mu=(\mathbf{x}^{T}\mathbf{s}+\kappa\tau)/(k+1)$;
5:     Compute $(\Delta\mathbf{x},\Delta\mathbf{y},\Delta\mathbf{s},\Delta\kappa,\Delta\tau)$ by solving the linear system Eq. (28);
6:     Set $(\mathbf{x},\mathbf{y},\mathbf{s},\kappa,\tau)=(\mathbf{x},\mathbf{y},\mathbf{s},\kappa,\tau)+(\Delta\mathbf{x},\Delta\mathbf{y},\Delta\mathbf{s},\Delta\kappa,\Delta\tau)$;
7:  end while
8:  Select the final status according to the stop criteria.

In Algorithm 1, the stop criteria are as follows:

\left\|\mathbf{r}_{p}(\mathbf{x},\mathbf{y},\mathbf{s},\kappa,\tau)\right\|\leq\epsilon\left\|\mathbf{r}_{p}(\mathbf{x}_{0},\mathbf{y}_{0},\mathbf{s}_{0},\kappa_{0},\tau_{0})\right\|, \qquad (30)
\left\|\mathbf{r}_{d}(\mathbf{x},\mathbf{y},\mathbf{s},\kappa,\tau)\right\|\leq\epsilon\left\|\mathbf{r}_{d}(\mathbf{x}_{0},\mathbf{y}_{0},\mathbf{s}_{0},\kappa_{0},\tau_{0})\right\|,
\mu\left(\mathbf{x},\mathbf{s},\kappa,\tau\right)\leq\epsilon\,\mu\left(\mathbf{x}_{0},\mathbf{s}_{0},\kappa_{0},\tau_{0}\right),

where the primal residual $\mathbf{r}_{p}$ and the dual residual $\mathbf{r}_{d}$ are defined as

\mathbf{r}_{p}(\mathbf{x},\mathbf{y},\mathbf{s},\kappa,\tau)=\mathbf{A}\mathbf{x}-\tau\mathbf{b},\quad\mathbf{r}_{d}(\mathbf{x},\mathbf{y},\mathbf{s},\kappa,\tau)=\mathbf{A}^{T}\mathbf{y}+\mathbf{s}-\tau\mathbf{c}. \qquad (31)

For convenience, we also define

r_{g}(\mathbf{x},\mathbf{y},\mathbf{s},\kappa,\tau)=\mathbf{b}^{T}\mathbf{y}-\mathbf{c}^{T}\mathbf{x}-\kappa. \qquad (32)

The criteria indicate that the primal residual, dual residual, and complementarity gap are reduced by a certain ratio $\epsilon$. When the criteria are satisfied, the final status is decided using an approximate version of Theorem 1: if $\tau\geq\epsilon\max(1,\kappa)$, then the problem is solved; if $\tau<\epsilon\max(1,\kappa)$, the primal problem is infeasible if $\mathbf{b}^{T}\mathbf{y}>0$, and the dual problem is infeasible if $\mathbf{c}^{T}\mathbf{x}<0$. Some important implementation issues, such as the maximum iteration number limit and rounding errors, are not considered in the theoretical analysis of this paper.

Algorithm 1 obtains a solution of the HSD model. If the problem is solved, the solution of the SOCP problem is $(\mathbf{x}/\tau,\mathbf{y}/\tau,\mathbf{s}/\tau)$. In order to guarantee the convergence of Algorithm 1, the parameters $\gamma,\delta$ should be chosen to satisfy certain conditions, which will be presented in Theorem 4 in the next section.
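For concreteness, a minimal Python rendering of Algorithm 1 and its termination logic is sketched below (our own, with `solve_scaled_newton` as an assumed helper that returns the solution of Eq. (28); it is not part of the paper):

```python
import numpy as np

def infeasible_ipm(A, b, c, x0, y0, s0, kappa0, tau0, k, delta, eps, solve_scaled_newton):
    # Sketch of Algorithm 1; `solve_scaled_newton` is an assumed (hypothetical) helper.
    nu = 1.0 - delta / np.sqrt(2.0 * (k + 1))                              # step 1
    x, y, s, kappa, tau = x0, y0, s0, kappa0, tau0                         # step 2
    res = lambda x, y, s, tau: (A @ x - tau * b, A.T @ y + s - tau * c)    # Eq. (31)
    mu = lambda x, s, kappa, tau: (x @ s + kappa * tau) / (k + 1)          # Eq. (23)
    rp0, rd0 = res(x0, y0, s0, tau0)
    mu0 = mu(x0, s0, kappa0, tau0)
    while True:   # iteration limits are not modeled, matching the analysis in the text
        rp, rd = res(x, y, s, tau)
        # stop criteria of Eq. (30): all three measures reduced by the factor eps
        if (np.linalg.norm(rp) <= eps * np.linalg.norm(rp0) and
                np.linalg.norm(rd) <= eps * np.linalg.norm(rd0) and
                mu(x, s, kappa, tau) <= eps * mu0):
            break
        dx, dy, ds, dk, dt = solve_scaled_newton(x, y, s, kappa, tau, nu)  # steps 4-5
        x, y, s, kappa, tau = x + dx, y + dy, s + ds, kappa + dk, tau + dt # step 6
    # final status: an approximate version of Theorem 1
    if tau >= eps * max(1.0, kappa):
        return "solved", (x / tau, y / tau, s / tau)
    if b @ y > 0:
        return "primal infeasible", None
    if c @ x < 0:
        return "dual infeasible", None
    return "undetermined", None   # neither certificate detected at this tolerance
```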

3 Analysis of the iteration complexity

In order to prove polynomial convergence in the worst case, it is crucial to show two points: (1) the update keeps the solution in a central path neighborhood, which is a subset of the interior of the cone constraints; (2) when the solution is bounded in the central path neighborhood, the primal residual, dual residual, and complementarity gap are reduced by a certain ratio in each iteration.

This section is organized as follows: firstly, an equivalent form of the HSD model is proposed to simplify the subsequent analysis; secondly, the well-definedness of the search directions is proved; thirdly, the improvement of the primal residual, dual residual, and complementarity gap is investigated; fourthly, several important results are presented to bound the update within the central path neighborhood for the unscaled search directions; fifthly, polynomial convergence for the unscaled search directions is proved; finally, the proof is extended to the MZ family of search directions.

3.1 An equivalent form of the HSD model

In the HSD model, $\kappa$ and $\tau$ are variables in $\mathcal{K}_{L}^{1}$, but they act differently from the other variables in LCs. Although we find it theoretically not difficult to treat them as additional terms, the analysis in that way would be long and tedious. In order to simplify the analysis, we propose an equivalent form of the HSD model in this subsection.

Consider the following change of variables

\hat{\mathbf{A}}=[\mathbf{A},-\mathbf{b}],\quad\hat{\mathbf{C}}=\begin{pmatrix}0&-\mathbf{c}\\ \mathbf{c}^{T}&0\end{pmatrix},\quad\hat{\mathbf{x}}=\begin{pmatrix}\mathbf{x}\\ \tau\end{pmatrix},\quad\hat{\mathbf{s}}=\begin{pmatrix}\mathbf{s}\\ \kappa\end{pmatrix},\quad\hat{\mathbf{e}}=\begin{pmatrix}\mathbf{e}\\ 1\end{pmatrix}, \qquad (33)

where $\hat{\mathbf{x}},\hat{\mathbf{s}}\in\mathcal{K}\times\mathcal{K}_{L}^{1}$ and $\mathbf{y}\in\mathcal{R}^{p}$.
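The embedding is mechanical; a small sketch (ours) builds $\hat{\mathbf{A}}$ and $\hat{\mathbf{C}}$ and checks the skew-symmetry required later in Eq. (39):

```python
import numpy as np

def hsd_embedding(A, b, c):
    # Change of variables of Eq. (33): A_hat = [A, -b], C_hat = [[0, -c], [c^T, 0]]
    p, n = A.shape
    A_hat = np.hstack([A, -b.reshape(-1, 1)])
    C_hat = np.zeros((n + 1, n + 1))
    C_hat[:n, n] = -c
    C_hat[n, :n] = c
    return A_hat, C_hat

A = np.arange(6.0).reshape(2, 3)
b, c = np.ones(2), np.array([1.0, -2.0, 0.5])
A_hat, C_hat = hsd_embedding(A, b, c)
assert np.allclose(C_hat, -C_hat.T)   # C = -C^T, as required in Eq. (39)
```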

It is easy to verify that the linear system Eq. (27) can be written as

\hat{\mathbf{A}}\Delta\hat{\mathbf{x}}=-\left(1-\nu\right)\hat{\mathbf{A}}\hat{\mathbf{x}}, \qquad (34)
\hat{\mathbf{A}}^{T}\Delta\mathbf{y}+\hat{\mathbf{C}}\Delta\hat{\mathbf{x}}+\Delta\hat{\mathbf{s}}=-(1-\nu)\left(\hat{\mathbf{A}}^{T}\mathbf{y}+\hat{\mathbf{C}}\hat{\mathbf{x}}+\hat{\mathbf{s}}\right),
\hat{\mathbf{X}}\Delta\hat{\mathbf{s}}+\hat{\mathbf{S}}\Delta\hat{\mathbf{x}}=\nu\mu\hat{\mathbf{e}}-\hat{\mathbf{X}}\hat{\mathbf{S}}\hat{\mathbf{e}},

where $\hat{\mathbf{X}}=\text{mat}(\hat{\mathbf{x}})$, $\hat{\mathbf{S}}=\text{mat}(\hat{\mathbf{s}})$. It is worth noting that the second and third equations in Eq. (27) are combined into the second equation in Eq. (34).

The scaled linear system Eq. (28) is written as

\hat{\mathbf{A}}\Delta\hat{\mathbf{x}}=-\left(1-\nu\right)\hat{\mathbf{A}}\hat{\mathbf{x}}, \qquad (35)
\hat{\mathbf{A}}^{T}\Delta\mathbf{y}+\hat{\mathbf{C}}\Delta\hat{\mathbf{x}}+\Delta\hat{\mathbf{s}}=-(1-\nu)\left(\hat{\mathbf{A}}^{T}\mathbf{y}+\hat{\mathbf{C}}\hat{\mathbf{x}}+\hat{\mathbf{s}}\right),
\text{mat}(\hat{\mathbf{D}}^{-T}\hat{\mathbf{x}})\hat{\mathbf{D}}\Delta\hat{\mathbf{s}}+\text{mat}(\hat{\mathbf{D}}\hat{\mathbf{s}})\hat{\mathbf{D}}^{-T}\Delta\hat{\mathbf{x}}=\nu\mu\hat{\mathbf{e}}-(\hat{\mathbf{D}}^{-T}\hat{\mathbf{x}})\circ(\hat{\mathbf{D}}\hat{\mathbf{s}}),

where the scaling matrix

\hat{\mathbf{D}}=\begin{pmatrix}\mathbf{D}&0\\ 0&1\end{pmatrix}. \qquad (36)

The central path coefficient is also simplified as

\mu=(\mathbf{x}^{T}\mathbf{s}+\kappa\tau)/(k+1)=\hat{\mathbf{x}}^{T}\hat{\mathbf{s}}/(k+1). \qquad (37)

Because the two linear systems are identical, solving the HSD model by Eq. (28) is identical to solving it by Eq. (35). Without loss of generality, the last LC $\mathcal{K}_{L}^{1}$ can be viewed as an SOC $\mathcal{K}_{S}^{1}$. Then, with a slight abuse of notation, Eq. (35) is a special case of the following generalized linear system

\mathbf{A}\Delta\mathbf{x}=-\left(1-\nu\right)\mathbf{A}\mathbf{x}, \qquad (38)
\mathbf{A}^{T}\Delta\mathbf{y}+\mathbf{C}\Delta\mathbf{x}+\Delta\mathbf{s}=-(1-\nu)\left(\mathbf{A}^{T}\mathbf{y}+\mathbf{C}\mathbf{x}+\mathbf{s}\right),
\bar{\mathbf{X}}\mathbf{D}\Delta\mathbf{s}+\bar{\mathbf{S}}\mathbf{D}^{-T}\Delta\mathbf{x}=\nu\mu\mathbf{e}-\bar{\mathbf{x}}\circ\bar{\mathbf{s}},

where the variables are redefined for simplicity, as

\mathbf{A}\in\mathcal{R}^{p\times n},\quad\text{rank}(\mathbf{A})=p\leq n,\quad\mathbf{C}\in\mathcal{R}^{n\times n},\quad\mathbf{C}=-\mathbf{C}^{T}, \qquad (39)
\mathcal{K}=\mathcal{K}_{L}^{l}\times\mathcal{K}_{S}^{n_{l+1}}\times\mathcal{K}_{S}^{n_{l+2}}\times\cdots\times\mathcal{K}_{S}^{n_{l+m}},\quad k=l+m,
\mathbf{x}\in\mathcal{K},\quad\mathbf{y}\in\mathcal{R}^{p},\quad\mathbf{s}\in\mathcal{K},\quad\mathcal{C}=\mathcal{K}\times\mathcal{R}^{p}\times\mathcal{K},
\mathbf{D}\in\mathcal{G},\quad\bar{\mathbf{x}}=\mathbf{D}^{-T}\mathbf{x},\quad\bar{\mathbf{s}}=\mathbf{D}\mathbf{s},\quad\bar{\mathbf{X}}=\text{mat}(\bar{\mathbf{x}}),\quad\bar{\mathbf{S}}=\text{mat}(\bar{\mathbf{s}}).

The central path coefficient for the generalized linear system Eq. (38) is defined as

\mu=\mu(\mathbf{x},\mathbf{s})=\mathbf{x}^{T}\mathbf{s}/k, \qquad (40)

where $\mathbf{x}^{T}\mathbf{s}$ is the complementarity gap. The residuals are defined as

\hat{\mathbf{r}}_{p}=\mathbf{A}\mathbf{x},\quad\hat{\mathbf{r}}_{d}=\mathbf{A}^{T}\mathbf{y}+\mathbf{C}\mathbf{x}+\mathbf{s}. \qquad (41)

For the special case Eq. (35), the generalized linear system Eq. (38) has one additional dimension in both $\mathbf{x}$ and $\mathbf{s}$ compared with the original form Eq. (28), substituting for $\kappa$ and $\tau$; correspondingly, its cone constraint $\mathcal{K}$ is one dimension higher. The scaling matrix $\mathbf{D}\in\mathcal{G}$ generates the MZ family of search directions. The residual $\hat{\mathbf{r}}_{p}$ is identical to the primal residual $\mathbf{r}_{p}$ of the HSD model defined in Eq. (31), $\hat{\mathbf{r}}_{d}$ is the combination $(\mathbf{r}_{d},r_{g})$ of the dual residual and the gap residual, and the central path coefficient $\mu(\mathbf{x},\mathbf{s})$ is identical to $\mu(\mathbf{x},\mathbf{s},\kappa,\tau)$ defined in Eq. (23).

In the following, it is sufficient to study the convergence properties of reducing the residuals $\hat{\mathbf{r}}_{p}$, $\hat{\mathbf{r}}_{d}$, and the complementarity gap (which is proportional to $\mu$) by solving the linear system Eq. (38); the results then apply to the process of solving Eq. (28).

For the generalized linear system Eq. (38), the central path distances are

d_{2}(\mathbf{x},\mathbf{s})=\sqrt{2}\left\|\mathbf{w}_{xs}-\mu\mathbf{e}\right\|, \qquad (42)
d_{\infty}(\mathbf{x},\mathbf{s})=\max_{i=1,\cdots,k}\left(\left|(\mathbf{w}_{xs})_{1}^{(i)}+\|(\mathbf{w}_{xs})_{2:n_{i}}^{(i)}\|-\mu\right|,\left|(\mathbf{w}_{xs})_{1}^{(i)}-\|(\mathbf{w}_{xs})_{2:n_{i}}^{(i)}\|-\mu\right|\right).

The central path neighborhoods are

\mathcal{N}_{2}(\gamma)=\left\{(\mathbf{x},\mathbf{y},\mathbf{s})\in\text{int}(\mathcal{C}):d_{2}(\mathbf{x},\mathbf{s})\leq\gamma\mu(\mathbf{x},\mathbf{s})\right\}, \qquad (43)
\mathcal{N}_{\infty}(\gamma)=\left\{(\mathbf{x},\mathbf{y},\mathbf{s})\in\text{int}(\mathcal{C}):d_{\infty}(\mathbf{x},\mathbf{s})\leq\gamma\mu(\mathbf{x},\mathbf{s})\right\},

where the parameter $\gamma\in(0,1)$. It is easy to verify that these central path distances and neighborhoods are identical to the definitions Eq. (24) and Eq. (26) for the special case Eq. (35). We also have $d_{\infty}(\mathbf{x},\mathbf{s})\leq d_{2}(\mathbf{x},\mathbf{s})$, and $\mathcal{N}_{2}(\gamma)\subseteq\mathcal{N}_{\infty}(\gamma)$.

3.2 The well-definedness of the search directions

Tsuchiya [1999] and Monteiro and Tsuchiya [2000] provide valuable tools for our study. For the convenience of analysis, we restate Lemmas 1-4 of [Monteiro and Tsuchiya, 2000] as Lemmas 1-4 below (see that paper for the proofs).

Define the following notations:

\mathbf{W}_{xs}\triangleq\text{mat}(\mathbf{w}_{xs}),\quad\mathbf{R}_{xs}\triangleq\mathbf{T}_{x}\mathbf{X}^{-1}\mathbf{S}\mathbf{T}_{x}. \qquad (44)

Lemma 1 [Monteiro and Tsuchiya, 2000]. For any $\mathbf{x}\in\text{int}(\mathcal{K})$, the matrices $\mathbf{X}$ and $\mathbf{T}_{x}$ satisfy:

a)

\mathbf{X}-\mathbf{T}_{x}=\mathbf{U}_{x}=\text{blkdiag}\left(\mathbf{U}_{x^{(i)}}:i=1,\cdots,k\right), \qquad (45)

where

\mathbf{U}_{v}=\begin{pmatrix}0&0\\ 0&(\mathbf{v}_{1}-\beta_{v})\mathbf{P}_{v}\end{pmatrix} \qquad (46)

for $\mathbf{v}\in\mathcal{R}^{n}$, and $\mathbf{P}_{v}$ is the orthogonal projection matrix onto the subspace orthogonal to $\mathbf{v}_{2:n}$:

\mathbf{P}_{v}=\mathbf{I}_{n-1}-\mathbf{v}_{2:n}(\mathbf{v}_{2:n})^{T}/\left\|\mathbf{v}_{2:n}\right\|^{2}. \qquad (47)

We also use the notation

\mathbf{P}_{x}=\text{blkdiag}\left(\begin{pmatrix}0&0\\ 0&\mathbf{P}_{x^{(i)}}\end{pmatrix}:i=1,\cdots,k\right). \qquad (48)

b)

\mathbf{T}_{x}\mathbf{X}^{-1}=\mathbf{X}^{-1}\mathbf{T}_{x}=\text{blkdiag}\left(\mathbf{I}-\mathbf{U}_{x^{(i)}}/\mathbf{x}^{(i)}_{1}:i=1,\cdots,k\right), \qquad (49)
\mathbf{T}_{x}\mathbf{X}^{-1}\mathbf{e}=\mathbf{e};

c) $\mathbf{X}$ and $\mathbf{T}_{x}$ commute and $\mathbf{X}\succeq\mathbf{T}_{x}$.

Lemma 2 [Monteiro and Tsuchiya, 2000]. $\mathbf{R}_{xs}$ satisfies

\mathbf{R}_{xs}=\text{blkdiag}\left(\mathbf{R}_{xs}^{i}:i=1,\cdots,k\right), \qquad (50)

where

\mathbf{R}_{xs}^{i}\triangleq\begin{pmatrix}\dot{\mathbf{w}}_{1}&\dot{\mathbf{w}}_{2:n_{i}}^{T}\\ \dot{\mathbf{w}}_{2:n_{i}}&\tilde{\mathbf{R}}_{xs}^{i}\end{pmatrix} \qquad (51)

where $\dot{\mathbf{w}}$ denotes $\mathbf{w}_{xs}^{(i)}$ and

\tilde{\mathbf{R}}_{xs}^{i}\triangleq\frac{1}{\mathbf{x}^{(i)}_{1}}\left(\dot{\mathbf{w}}_{2:n_{i}}(\mathbf{x}^{(i)}_{2:n_{i}})^{T}+\beta_{x^{(i)}}^{2}\mathbf{s}^{(i)}_{1}\mathbf{I}\right)=\frac{\dot{\mathbf{w}}_{2:n_{i}}(\mathbf{x}^{(i)}_{2:n_{i}})^{T}}{\mathbf{x}^{(i)}_{1}}+\left(\dot{\mathbf{w}}_{1}-\frac{\dot{\mathbf{w}}_{2:n_{i}}^{T}\mathbf{x}^{(i)}_{2:n_{i}}}{\mathbf{x}^{(i)}_{1}}\right)\mathbf{I}. \qquad (52)

Lemma 3 [Monteiro and Tsuchiya, 2000]. Let $(\mathbf{x},\mathbf{s})\in\text{int}(\mathcal{K})\times\text{int}(\mathcal{K})$ satisfy

\max_{i,j}\left|\lambda_{i}^{j}(\mathbf{w}_{xs})-\dot{\nu}\right|\leq\gamma\dot{\nu} \qquad (53)

for some scalars $\gamma>0$ and $\dot{\nu}>0$. Then

\left\|\mathbf{R}_{xs}-\mathbf{W}_{xs}\right\|\leq 2\gamma\dot{\nu}, \qquad (54)
\left\|\mathbf{W}_{xs}-\dot{\nu}\mathbf{I}\right\|\leq\gamma\dot{\nu},
\left\|\mathbf{R}_{xs}-\dot{\nu}\mathbf{I}\right\|\leq 3\gamma\dot{\nu}.

Lemma 4 [Monteiro and Tsuchiya, 2000]. Let $(\mathbf{x},\mathbf{s})\in\text{int}(\mathcal{K})\times\text{int}(\mathcal{K})$ satisfy

\left\|\mathbf{R}_{xs}-\dot{\nu}\mathbf{I}\right\|\leq\dot{\tau}\dot{\nu}, \qquad (55)

for some scalars $\dot{\tau}\in(0,1)$ and $\dot{\nu}>0$. Assume that $(\mathbf{u},\mathbf{v})\in\mathcal{R}^{n}\times\mathcal{R}^{n}$ and $\mathbf{h}\in\mathcal{R}^{n}$ satisfy

\mathbf{S}\mathbf{u}+\mathbf{X}\mathbf{v}=\mathbf{h},\quad\mathbf{u}^{T}\mathbf{v}\geq 0, \qquad (56)

and define $\delta_{u}\triangleq\left\|\mathbf{T}_{x}^{-1}\mathbf{u}\right\|$ and $\delta_{v}\triangleq\left\|\mathbf{T}_{x}\mathbf{v}\right\|$. Then,

\delta_{u}\leq\frac{\left\|\mathbf{T}_{x}\mathbf{X}^{-1}\mathbf{h}\right\|}{(1-\dot{\tau})\dot{\nu}},\quad\delta_{v}\leq\frac{2\left\|\mathbf{T}_{x}\mathbf{X}^{-1}\mathbf{h}\right\|}{1-\dot{\tau}}. \qquad (57)

Using the above lemmas, we obtain the following result, which shows the well-definedness of the search directions.

Theorem 2. Let $(\mathbf{x},\mathbf{y},\mathbf{s})\in\text{int}(\mathcal{C})$ satisfy

\max_{i,j}\left|\lambda_{i}^{j}(\mathbf{w}_{xs})-\nu\right|\leq\gamma\nu \qquad (58)

for $\gamma\in(0,1/3)$, and let $\mathbf{D}\in\mathcal{G}$. Then the linear system Eq. (38) has a unique solution.

Proof. To prove that the linear system Eq. (38) has a unique solution, it is equivalent to show that the associated homogeneous system has only the zero solution. The homogeneous system is

\mathbf{A}\mathbf{d}_{hx}=0, \qquad (59)
\mathbf{A}^{T}\mathbf{d}_{hy}+\mathbf{C}\mathbf{d}_{hx}+\mathbf{d}_{hs}=0,
\bar{\mathbf{X}}\mathbf{D}\mathbf{d}_{hs}+\bar{\mathbf{S}}\mathbf{D}^{-T}\mathbf{d}_{hx}=0,

where the solution is denoted by $(\mathbf{d}_{hx},\mathbf{d}_{hy},\mathbf{d}_{hs})$.

Because $\mathbf{D}\in\mathcal{G}$, it is invertible. Since $\mathbf{C}=-\mathbf{C}^{T}$ by the definition Eq. (39), multiplying the first and second equations in Eq. (59) by $\mathbf{d}_{hy}^{T}$ and $\mathbf{d}_{hx}^{T}$, respectively, and combining the results, we obtain

(\mathbf{D}^{-T}\mathbf{d}_{hx})^{T}(\mathbf{D}\mathbf{d}_{hs})=\mathbf{d}_{hx}^{T}\mathbf{d}_{hs}=0. \qquad (60)

By Proposition 3 b),

\max_{i,j}\left|\lambda_{i}^{j}(\mathbf{w}_{\bar{x}\bar{s}})-\nu\right|=\max_{i,j}\left|\lambda_{i}^{j}(\mathbf{w}_{xs})-\nu\right|\leq\gamma\nu. \qquad (61)

By Lemma 3, Eq. (55) holds with $\dot{\tau}=3\gamma<1$ and $\dot{\nu}=\nu$ at the scaled point $(\bar{\mathbf{x}},\bar{\mathbf{s}})$. By the last equation in Eq. (59), using $\mathbf{D}^{-T}\mathbf{d}_{hx}$ and $\mathbf{D}\mathbf{d}_{hs}$ as $\mathbf{u}$ and $\mathbf{v}$ in Lemma 4 with $\mathbf{h}=0$, we obtain

\mathbf{d}_{hx}=0,\quad\mathbf{d}_{hs}=0. \qquad (62)

Then, by the second equation in Eq. (59),

\mathbf{A}^{T}\mathbf{d}_{hy}=0. \qquad (63)

Since $\mathbf{A}$ has full row rank, we obtain

\mathbf{d}_{hy}=0. \qquad (64)

Consequently, the homogeneous system Eq. (59) has only the zero solution, and hence the linear system Eq. (38) has a unique solution. Q.E.D.

3.3 The improvements in each iteration

Now we investigate the improvement of the primal residual, dual residual and complementarity gap in each iteration. For a step size $\alpha$, define

\mathbf{x}(\alpha)\triangleq\mathbf{x}+\alpha\Delta\mathbf{x},\quad\mathbf{y}(\alpha)\triangleq\mathbf{y}+\alpha\Delta\mathbf{y},\quad\mathbf{s}(\alpha)\triangleq\mathbf{s}+\alpha\Delta\mathbf{s}. \qquad (65)

The following theorem states the improvement in each iteration when solving the linear system Eq. (38). It also shows that the updating directions of $\mathbf{x}$ and $\mathbf{s}$ are orthogonal.

Theorem 3. Assume the updating direction $\left(\Delta\mathbf{x},\Delta\mathbf{y},\Delta\mathbf{s}\right)$ satisfies the linear system Eq. (38). Then

\mathbf{A}\mathbf{x}\left(\alpha\right)=\left(1-\left(1-\nu\right)\alpha\right)\mathbf{A}\mathbf{x}, \qquad (66)
\mathbf{A}^{T}\mathbf{y}(\alpha)+\mathbf{C}\mathbf{x}\left(\alpha\right)+\mathbf{s}(\alpha)=\left(1-\left(1-\nu\right)\alpha\right)\left(\mathbf{A}^{T}\mathbf{y}+\mathbf{C}\mathbf{x}+\mathbf{s}\right),
\mathbf{x}\left(\alpha\right)^{T}\mathbf{s}\left(\alpha\right)=\left(1-\left(1-\nu\right)\alpha\right)\mathbf{x}^{T}\mathbf{s},
\Delta\mathbf{x}^{T}\Delta\mathbf{s}=0.

Proof. The first two equations in Eq. (66) can be proved easily using elementary linear algebra.

Then, we prove the last two equations in Eq. (66).

Since $\mathbf{C}=-\mathbf{C}^{T}$ by the definition Eq. (39), multiplying the first two equations in Eq. (38) by $-\Delta\mathbf{y}^{T}$ and $\Delta\mathbf{x}^{T}$, respectively, and adding them together, we obtain

\Delta\mathbf{x}^{T}\Delta\mathbf{s}=\left(1-\nu\right)\left(\Delta\mathbf{y}^{T}\mathbf{A}\mathbf{x}-\Delta\mathbf{x}^{T}\left(\mathbf{A}^{T}\mathbf{y}+\mathbf{C}\mathbf{x}+\mathbf{s}\right)\right). \qquad (67)

Multiplying the first two equations in Eq. (38) by $-\mathbf{y}^{T}$ and $\mathbf{x}^{T}$, respectively, and adding them together, we get

\left(\mathbf{x}+\Delta\mathbf{x}\right)^{T}\left(\mathbf{s}+\Delta\mathbf{s}\right)+\Delta\mathbf{y}^{T}\mathbf{A}\mathbf{x}-\Delta\mathbf{x}^{T}\left(\mathbf{A}^{T}\mathbf{y}+\mathbf{C}\mathbf{x}+\mathbf{s}\right)=\nu\mathbf{x}^{T}\mathbf{s}+\Delta\mathbf{x}^{T}\Delta\mathbf{s}. \qquad (68)

Using the definitions Eq. (39) and Eq. (40), summing the first elements of each cone in the third equation in Eq. (38) yields

\left(\mathbf{x}+\Delta\mathbf{x}\right)^{T}\left(\mathbf{s}+\Delta\mathbf{s}\right)=\nu\mathbf{x}^{T}\mathbf{s}+\Delta\mathbf{x}^{T}\Delta\mathbf{s}. \qquad (69)

Subtracting Eq. (69) from Eq. (68), we obtain

\Delta\mathbf{y}^{T}\mathbf{A}\mathbf{x}-\Delta\mathbf{x}^{T}\left(\mathbf{A}^{T}\mathbf{y}+\mathbf{C}\mathbf{x}+\mathbf{s}\right)=0. \qquad (70)

From Eq. (67) and Eq. (70), we get

\Delta\mathbf{x}^{T}\Delta\mathbf{s}=0. \qquad (71)

Combining Eq. (71) with Eq. (69) and the definition Eq. (65) yields

\mathbf{x}\left(\alpha\right)^{T}\mathbf{s}\left(\alpha\right)=\left(1-\left(1-\nu\right)\alpha\right)\mathbf{x}^{T}\mathbf{s}. \qquad (72)

Q.E.D.

The first three equations in Eq. (66) can also be written as

\hat{\mathbf{r}}_{p}(\alpha)=(1-(1-\nu)\alpha)\hat{\mathbf{r}}_{p}, \qquad (73)
\hat{\mathbf{r}}_{d}(\alpha)=(1-(1-\nu)\alpha)\hat{\mathbf{r}}_{d},
\mu(\alpha)=(1-(1-\nu)\alpha)\mu,

where $\hat{\mathbf{r}}_{p}(\alpha)$, $\hat{\mathbf{r}}_{d}(\alpha)$, and $\mu(\alpha)$ denote $\hat{\mathbf{r}}_{p}$, $\hat{\mathbf{r}}_{d}$, and $\mu$ evaluated at the point $(\mathbf{x}(\alpha),\mathbf{y}(\alpha),\mathbf{s}(\alpha))$. The relation shows that the improvement in each iteration depends on $\alpha$ and $\nu$. If there are constants $\alpha\in(0,1]$ and $\nu\in[0,1)$ that keep $(\mathbf{x}(\alpha),\mathbf{y}(\alpha),\mathbf{s}(\alpha))$ in a central path neighborhood $\mathcal{N}_{2}(\gamma)$ or $\mathcal{N}_{\infty}(\gamma)$, where $\gamma\in(0,1)$ is also constant, then polynomial convergence is established.
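Theorem 3 is easy to check numerically; the sketch below (ours) assembles and solves Eq. (38) with $\mathbf{D}=\mathbf{I}$ for a random single-cone instance and verifies the orthogonality and the uniform reduction in Eq. (73):

```python
import numpy as np

def arrowhead(v):
    # mat(v) for a single second-order cone
    M = v[0] * np.eye(v.size)
    M[0, 1:] = v[1:]
    M[1:, 0] = v[1:]
    return M

rng = np.random.default_rng(2)
n, p, k = 5, 2, 1                                   # one SOC, so mu = x^T s / k
A = rng.normal(size=(p, n))
C = rng.normal(size=(n, n)); C = C - C.T            # C = -C^T, as in Eq. (39)
x = np.concatenate(([3.0], 0.3 * rng.normal(size=n - 1)))   # interior points
s = np.concatenate(([3.0], 0.3 * rng.normal(size=n - 1)))
y = rng.normal(size=p)
nu, alpha = 0.9, 1.0
mu = x @ s / k                                      # Eq. (40)
e = np.concatenate(([1.0], np.zeros(n - 1)))
X, S = arrowhead(x), arrowhead(s)

# Block system for (dx, dy, ds) from Eq. (38) with D = I
K = np.block([[A, np.zeros((p, p)), np.zeros((p, n))],
              [C, A.T, np.eye(n)],
              [S, np.zeros((n, p)), X]])
rhs = np.concatenate([-(1 - nu) * (A @ x),
                      -(1 - nu) * (A.T @ y + C @ x + s),
                      nu * mu * e - X @ S @ e])
dx, dy, ds = np.split(np.linalg.solve(K, rhs), [n, n + p])

rho = 1 - (1 - nu) * alpha
assert abs(dx @ ds) < 1e-8                                            # orthogonality
assert np.allclose(A @ (x + alpha * dx), rho * (A @ x))               # Eq. (73), r_p
assert np.allclose(A.T @ (y + alpha * dy) + C @ (x + alpha * dx)
                   + (s + alpha * ds), rho * (A.T @ y + C @ x + s))   # Eq. (73), r_d
assert np.isclose((x + alpha * dx) @ (s + alpha * ds), rho * (x @ s)) # Eq. (73), mu
```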

3.4 Results to bound the updating in the central path neighborhood

This subsection presents results to bound the updated solution $(\mathbf{x}(\alpha),\mathbf{y}(\alpha),\mathbf{s}(\alpha))$ within a central path neighborhood $\mathcal{N}_{2}(\gamma)$ for some $\gamma\in(0,1)$, when the updating directions are unscaled.

By the definition Eq. (43), a point lying in $\mathcal{N}_{2}(\gamma)$ should satisfy two conditions: (1) the distance $d_{2}$ is bounded by $\gamma\mu$; (2) the point is in the interior of the cone constraints. The following lemmas provide means to guarantee these two conditions. They are modifications of the results for feasible IPMs [Monteiro and Tsuchiya, 2000], obtained by including the HSD model and removing the equality constraints, so as to be applicable to infeasible IPMs and to hold in cases where the primal or the dual problem is infeasible. The equivalent form of the HSD model and the orthogonality of the search directions provided by Theorem 3 allow them to be proved in roughly the same way as their counterparts for feasible IPMs.

Firstly, we study the conditions to bound the distance $d_{2}$.

The following lemma simplifies the expression of $\mathbf{T}_{x}^{-1}\mathbf{x}(\alpha)\circ\mathbf{T}_{x}\mathbf{s}(\alpha)-\mu(\alpha)\mathbf{e}$, which is useful for analyzing $d_{2}(\mathbf{x}(\alpha),\mathbf{s}(\alpha))$.

Lemma 5. Let $(\mathbf{x},\mathbf{y},\mathbf{s})\in\text{int}(\mathcal{C})$, and let the updating direction $\left(\Delta\mathbf{x},\Delta\mathbf{y},\Delta\mathbf{s}\right)$ satisfy the linear system Eq. (38) with $\alpha\in\mathcal{R}$ and

\mathbf{D}=\mathbf{I}. \qquad (74)

Then

\mathbf{z}(\alpha)=(1-\alpha)(\mathbf{w}_{xs}-\mu\mathbf{e})+\alpha(\mathbf{W}_{xs}-\mathbf{R}_{xs})\widehat{\Delta\mathbf{x}}+\alpha^{2}\widehat{\Delta\mathbf{x}}\circ\widehat{\Delta\mathbf{s}}, \qquad (75)

where

\mathbf{z}(\alpha)\triangleq\mathbf{T}_{x}^{-1}\mathbf{x}(\alpha)\circ\mathbf{T}_{x}\mathbf{s}(\alpha)-\mu(\alpha)\mathbf{e},\quad\widehat{\Delta\mathbf{x}}\triangleq\mathbf{T}_{x}^{-1}\Delta\mathbf{x},\quad\widehat{\Delta\mathbf{s}}\triangleq\mathbf{T}_{x}\Delta\mathbf{s}. \qquad (76)

Proof. By the assumption, the definitions Eq. (25) and Eq. (44), and Lemma 1 b), multiplying the third equation of the linear system Eq. (38) by $\mathbf{T}_{x}\mathbf{X}^{-1}$ yields

\widehat{\Delta\mathbf{s}}=\mathbf{T}_{x}\Delta\mathbf{s}=\mathbf{T}_{x}\mathbf{X}^{-1}(\nu\mu\mathbf{e}-\mathbf{X}\mathbf{s}-\mathbf{S}\Delta\mathbf{x})=\nu\mu\mathbf{e}-\mathbf{w}_{xs}-\mathbf{R}_{xs}\widehat{\Delta\mathbf{x}}. \qquad (77)

Using the above result and the definition Eq. (65), we have

\mathbf{T}_{x}^{-1}\mathbf{x}(\alpha)\circ\mathbf{T}_{x}\mathbf{s}(\alpha)=\left(\mathbf{T}_{x}^{-1}\mathbf{x}+\alpha\widehat{\Delta\mathbf{x}}\right)\circ\left(\mathbf{T}_{x}\mathbf{s}+\alpha\widehat{\Delta\mathbf{s}}\right) \qquad (78)
=\left(\mathbf{e}+\alpha\widehat{\Delta\mathbf{x}}\right)\circ\left(\mathbf{w}_{xs}+\alpha\widehat{\Delta\mathbf{s}}\right)
=\mathbf{w}_{xs}+\alpha(\mathbf{w}_{xs}\circ\widehat{\Delta\mathbf{x}}+\widehat{\Delta\mathbf{s}})+\alpha^{2}\widehat{\Delta\mathbf{x}}\circ\widehat{\Delta\mathbf{s}}
=\mathbf{w}_{xs}+\alpha\left(\nu\mu\mathbf{e}-\mathbf{w}_{xs}+(\mathbf{W}_{xs}-\mathbf{R}_{xs})\widehat{\Delta\mathbf{x}}\right)+\alpha^{2}\widehat{\Delta\mathbf{x}}\circ\widehat{\Delta\mathbf{s}}.

Combining Eq. (78) and Eq. (73), we obtain Eq. (75). Q.E.D.

Lemmas 6 and 7 are introduced to bound the norm of $\widehat{\Delta\mathbf{x}}\circ\widehat{\Delta\mathbf{s}}$ in Eq. (75).

Lemma 6. Assume that $\gamma\in(0,1/3)$, $(\mathbf{x},\mathbf{y},\mathbf{s})\in\mathcal{N}_{2}(\gamma)$, and $(\Delta\mathbf{x},\Delta\mathbf{y},\Delta\mathbf{s})$ is the solution of the linear system Eq. (38) with $\mathbf{D}=\mathbf{I}$. Then the scaled increments satisfy

\left\|\widehat{\Delta\mathbf{x}}\right\|\leq\Gamma_{p}/2,\quad\left\|\widehat{\Delta\mathbf{s}}\right\|\leq\Gamma_{p}\mu, \qquad (79)

where

\Gamma_{p}=2\left(\gamma^{2}/2+(1-\nu)^{2}k\right)^{1/2}/(1-3\gamma). \qquad (80)

Proof.

The assumption and the definition of the neighborhood Eq. (43) show that

\|\mathbf{w}_{xs}-\mu\mathbf{e}\|\leq\gamma\mu/\sqrt{2}. \qquad (81)

Because

\mathbf{w}_{xs}^{T}\mathbf{e}=\mathbf{s}^{T}\mathbf{T}_{x}\mathbf{e}=\mathbf{s}^{T}\mathbf{x}=k\mu, \qquad (82)

we obtain

(\mathbf{w}_{xs}-\mu\mathbf{e})^{T}\mathbf{e}=0. \qquad (83)

Then,

\|\mathbf{w}_{xs}-\nu\mu\mathbf{e}\|^{2}=\|\mathbf{w}_{xs}-\mu\mathbf{e}\|^{2}+\|\mu\mathbf{e}-\nu\mu\mathbf{e}\|^{2}+2(1-\nu)\mu(\mathbf{w}_{xs}-\mu\mathbf{e})^{T}\mathbf{e} \qquad (84)
\leq\left(\gamma^{2}/2+(1-\nu)^{2}k\right)\mu^{2}.

Because $d_{\infty}(\mathbf{x},\mathbf{s})\leq\gamma\mu$, Lemma 3 applies with $\dot{\nu}=\mu$, and Eq. (55) holds with $\dot{\tau}=3\gamma<1$ and $\dot{\nu}=\mu$.

By Theorem 3, we also have $\Delta\mathbf{x}^{T}\Delta\mathbf{s}=0$. Hence, using Lemma 4 with $\dot{\nu}=\mu$, $\mathbf{u}=\Delta\mathbf{x}$, $\mathbf{v}=\Delta\mathbf{s}$, $\mathbf{h}=\nu\mu\mathbf{e}-\mathbf{x}\circ\mathbf{s}$, and $\dot{\tau}=3\gamma$, we have

\left\|\widehat{\Delta\mathbf{x}}\right\|\leq\frac{\|\mathbf{T}_{x}\mathbf{X}^{-1}\left(\nu\mu\mathbf{e}-\mathbf{X}\mathbf{s}\right)\|}{\mu(1-3\gamma)}=\frac{\|\nu\mu\mathbf{e}-\mathbf{w}_{xs}\|}{\mu(1-3\gamma)}, \qquad (85)
\left\|\widehat{\Delta\mathbf{s}}\right\|\leq\frac{2\|\mathbf{T}_{x}\mathbf{X}^{-1}\left(\nu\mu\mathbf{e}-\mathbf{X}\mathbf{s}\right)\|}{1-3\gamma}=\frac{2\|\nu\mu\mathbf{e}-\mathbf{w}_{xs}\|}{1-3\gamma}.

Combining this result with Eq. (84) and the definition Eq. (80), we obtain Eq. (79). Q.E.D.

Lemma 7 [Tsuchiya, 1999]. Let $\mathbf{u},\mathbf{v}\in\mathcal{R}^{n}$. Then

\left\|\mathbf{u}\circ\mathbf{v}\right\|\leq\sqrt{2}\left\|\mathbf{u}\right\|\left\|\mathbf{v}\right\|. \qquad (86)

Proof. See Lemma 2.12 of [Tsuchiya, 1999].

Now we are ready to bound $\|\mathbf{z}(\alpha)\|$, as in the following lemma.

Lemma 8. Assume that $(\mathbf{x},\mathbf{y},\mathbf{s})\in\mathcal{N}_{2}(\gamma)$ for some scalar $\gamma\in(0,1/3)$, and let $(\Delta\mathbf{x},\Delta\mathbf{y},\Delta\mathbf{s})$ be the unique solution of the linear system Eq. (38) with $\mathbf{D}=\mathbf{I}$ for some $\nu\in\mathcal{R}$ and any $\alpha\in[0,1]$. Then

\sqrt{2}\|\mathbf{z}(\alpha)\|\leq\left((1-\alpha)\gamma+\sqrt{2}\alpha\gamma\Gamma_{p}+\alpha^{2}\Gamma_{p}^{2}\right)\mu. \qquad (87)

Proof. Due to Lemma 6 and Lemma 7, we have

\left\|\widehat{\Delta\mathbf{x}}\circ\widehat{\Delta\mathbf{s}}\right\|\leq\frac{\sqrt{2}}{2}\Gamma_{p}^{2}\mu. \qquad (88)

By the assumption, Lemma 5, Eq. (43), and Eq. (88), we obtain

\sqrt{2}\|\mathbf{z}(\alpha)\|=\sqrt{2}\left\|\mathbf{T}_{x}^{-1}\mathbf{x}(\alpha)\circ\mathbf{T}_{x}\mathbf{s}(\alpha)-\mu(\alpha)\mathbf{e}\right\| \qquad (89)
\leq\sqrt{2}\left((1-\alpha)\|\mathbf{w}_{xs}-\mu\mathbf{e}\|+\alpha\|\mathbf{R}_{xs}-\mathbf{W}_{xs}\|\left\|\widehat{\Delta\mathbf{x}}\right\|+\alpha^{2}\left\|\widehat{\Delta\mathbf{x}}\circ\widehat{\Delta\mathbf{s}}\right\|\right)
\leq\left((1-\alpha)\gamma+\sqrt{2}\alpha\gamma\Gamma_{p}+\alpha^{2}\Gamma_{p}^{2}\right)\mu.

Q.E.D.

The following lemma shows that $d_{2}(\mathbf{x}(\alpha),\mathbf{s}(\alpha))$ is bounded above by its approximation $\sqrt{2}\|\mathbf{z}(\alpha)\|$.

Lemma 9 [Tsuchiya, 1999]. Let $(\mathbf{x},\mathbf{s})\in\text{int}(\mathcal{K})\times\text{int}(\mathcal{K})$. Then

d_{2}(\mathbf{x},\mathbf{s})=\sqrt{2}\|\mathbf{w}_{xs}-\mu\mathbf{e}\|=\min_{\mathbf{D}\in\mathcal{G}}\sqrt{2}\|(\mathbf{D}^{-T}\mathbf{x})\circ(\mathbf{D}\mathbf{s})-\mu\mathbf{e}\|. \qquad (90)

Proof. See Lemma 2.10 of [Tsuchiya, 1999].

Next, we introduce a lemma that facilitates the proof of the interior-point condition.

Lemma 10 [Monteiro and Tsuchiya, 2000]. Let (𝐱,𝐬)𝒦×𝒦(\mathbf{x},\mathbf{s})\in\mathcal{K}\times\mathcal{K}. If 𝐱𝐬int(𝒦)\mathbf{x}\circ\mathbf{s}\in\text{int}(\mathcal{K}), then (𝐱,𝐬)int(𝒦)×int(𝒦)(\mathbf{x},\mathbf{s})\in\text{int}(\mathcal{K})\times\text{int}(\mathcal{K}). In particular, if 2𝐱𝐬ν𝐞γν\sqrt{2}\left\|\mathbf{x}\circ\mathbf{s}-\nu\mathbf{e}\right\|\leq\gamma\nu for some γ(0,1)\gamma\in(0,1), and ν>0\nu>0, then (𝐱,𝐬)int(𝒦)×int(𝒦)(\mathbf{x},\mathbf{s})\in\text{int}(\mathcal{K})\times\text{int}(\mathcal{K}).

Proof. See Lemma 10 of [Monteiro and Tsuchiya, 2000].

3.5 Polynomial convergence for unscaled searching directions

This subsection establishes iteration-complexity bounds for the infeasible IPM Algorithm 1 to reduce the residuals \hat{\mathbf{r}}_{p}, \hat{\mathbf{r}}_{d}, and the complementarity gap \mu, when the linear system Eq. (38) is solved with the unscaled searching directions.

Theorem 4. Let \gamma\in(0,1/3), \alpha\in[0,1], and \delta\in(0,1) satisfy

Γ~4(γ2+δ2)(13γ)2(1δ2k)1<γ,\displaystyle\tilde{\Gamma}\triangleq\frac{4\left(\gamma^{2}+\delta^{2}\right)}{\left(1-3\gamma\right)^{2}}\left(1-\frac{\delta}{\sqrt{2k}}\right)^{-1}<\gamma, (91)
ν=1δ/2k.\displaystyle\nu=1-\delta/\sqrt{2k}.

Assume that (𝐱,𝐲,𝐬)𝒩2(γ)(\mathbf{x},\mathbf{y},\mathbf{s})\in\mathcal{N}_{2}(\gamma), 𝐃=𝐈\mathbf{D}=\mathbf{I} and (Δ𝐱,Δ𝐲,Δ𝐬)(\Delta\mathbf{x},\Delta\mathbf{y},\Delta\mathbf{s}) is the solution of the linear system Eq. (38). Then,

(𝐱(α),𝐲(α),𝐬(α))𝒩2(γ),(\mathbf{x}(\alpha),\mathbf{y}(\alpha),\mathbf{s}(\alpha))\in\mathcal{N}_{2}(\gamma), (92)

and

𝐫^p(α)=(1αδ/2k)𝐫^p,\displaystyle\hat{\mathbf{r}}_{p}(\alpha)=(1-\alpha\delta/\sqrt{2k})\hat{\mathbf{r}}_{p}, (93)
𝐫^d(α)=(1αδ/2k)𝐫^d,\displaystyle\hat{\mathbf{r}}_{d}(\alpha)=(1-\alpha\delta/\sqrt{2k})\hat{\mathbf{r}}_{d},
μ(α)=(1αδ/2k)μ.\displaystyle\mu(\alpha)=(1-\alpha\delta/\sqrt{2k})\mu.

Proof.

Due to the assumption, Lemma 8, Eq. (73), and Γp2γ\Gamma_{p}\geq\sqrt{2}\gamma, we obtain

\displaystyle\sqrt{2}\|\mathbf{z}(\alpha)\|\leq\left((1-\alpha)\gamma+2\alpha\Gamma_{p}^{2}\right)\mu (94)
((1α)γ+8α(γ2/2+(1ν)2k)/(13γ)2)μ\displaystyle\leq\left((1-\alpha)\gamma+8\alpha(\gamma^{2}/2+(1-\nu)^{2}k)/(1-3\gamma)^{2}\right)\mu
=((1α)γ+4α(γ2+δ2)/(13γ)2)μ\displaystyle=\left((1-\alpha)\gamma+4\alpha(\gamma^{2}+\delta^{2})/(1-3\gamma)^{2}\right)\mu
=((1α)γ+αΓ~(1δ/2k))μ\displaystyle=\left((1-\alpha)\gamma+\alpha\tilde{\Gamma}\left(1-\delta/\sqrt{2k}\right)\right)\mu
=((1α)γ+ανΓ~)μ\displaystyle=\left((1-\alpha)\gamma+\alpha\nu\tilde{\Gamma}\right)\mu
γ(1α+αν)μ\displaystyle\leq\gamma\left(1-\alpha+\alpha\nu\right)\mu
=γμ(α).\displaystyle=\gamma\mu(\alpha).

Denote \dot{\mathbf{u}}(\alpha)=(\mathbf{x}(\alpha),\mathbf{y}(\alpha),\mathbf{s}(\alpha)). In order to prove Eq. (92), we must show \dot{\mathbf{u}}(\alpha)\in\text{int}(\mathcal{C}). Suppose that \dot{\mathbf{u}}(\dot{\alpha})\notin\mathcal{C} for some step size \dot{\alpha}\in[0,1]. Then at least one of the cone constraints is violated. Denote the part of \dot{\mathbf{u}}(\alpha) corresponding to a violated cone constraint by \dot{\mathbf{v}}(\alpha). Then \lambda_{\min}(\dot{\mathbf{v}}(0))>0 and \lambda_{\min}(\dot{\mathbf{v}}(\dot{\alpha}))<0. Since \lambda_{\min}(\dot{\mathbf{v}}(\alpha)) is continuous in \alpha, there exists \ddot{\alpha}\in(0,\dot{\alpha}) satisfying \lambda_{\min}(\dot{\mathbf{v}}(\ddot{\alpha}))=0. We use \check{\alpha} to denote the minimum such \ddot{\alpha} over all the cone constraints violated at \dot{\mathbf{u}}(\dot{\alpha}). Then \dot{\mathbf{u}}(\check{\alpha})\in\mathcal{C}\setminus\text{int}(\mathcal{C}). However, by Lemma 10 and Eq. (94), \dot{\mathbf{u}}(\check{\alpha})\in\text{int}(\mathcal{C}). The contradiction shows that \dot{\alpha} does not exist; consequently, \dot{\mathbf{u}}(\alpha)\in\mathcal{C}. Using Lemma 10 and Eq. (94) again, we conclude that \dot{\mathbf{u}}(\alpha)\in\text{int}(\mathcal{C}) for all \alpha\in[0,1].

Due to Lemma 9 and Eq. (94), we obtain

d2(𝐱(α),𝐬(α))=2𝐓x(α)1𝐱(α)𝐓x(α)𝐬(α)μ(α)𝐞\displaystyle d_{2}(\mathbf{x}(\alpha),\mathbf{s}(\alpha))=\sqrt{2}\left\|\mathbf{T}_{x(\alpha)}^{-1}\mathbf{x}(\alpha)\circ\mathbf{T}_{x(\alpha)}\mathbf{s}(\alpha)-\mu(\alpha)\mathbf{e}\right\| (95)
2𝐓x1𝐱(α)𝐓x𝐬(α)μ(α)𝐞\displaystyle\leq\sqrt{2}\left\|\mathbf{T}_{x}^{-1}\mathbf{x}(\alpha)\circ\mathbf{T}_{x}\mathbf{s}(\alpha)-\mu(\alpha)\mathbf{e}\right\|
γμ(α).\displaystyle\leq\gamma\mu(\alpha).

Consequently, Eq. (92) holds.

From Eq. (73) and the assumption, we also obtain Eq. (93). Q.E.D.

Algorithm 1 computes the search direction (\Delta\mathbf{x},\Delta\mathbf{y},\Delta\mathbf{s},\Delta\kappa,\Delta\tau) by solving the linear system Eq. (28), which is identical to Eq. (35), and updates the iterate with step size \alpha=1. Because Eq. (35) is a special case of Eq. (38), \alpha=1 satisfies the condition of Theorem 4, the residuals \mathbf{r}_{p},\mathbf{r}_{d} are subvectors of \hat{\mathbf{r}}_{p},\hat{\mathbf{r}}_{d}, and \mu is unchanged, we obtain the following proposition.

Proposition 4. Let the parameters satisfy

4(γ2+δ2)(13γ)2(1δ2(k+1))1<γ,γ(0,1/3),δ(0,1).\frac{4\left(\gamma^{2}+\delta^{2}\right)}{\left(1-3\gamma\right)^{2}}\left(1-\frac{\delta}{\sqrt{2(k+1)}}\right)^{-1}<\gamma,\gamma\in(0,1/3),\delta\in(0,1). (96)

Assume that (𝐱0,𝐲0,𝐬0,κ0,τ0)𝒩2(γ)(\mathbf{x}_{0},\mathbf{y}_{0},\mathbf{s}_{0},\kappa_{0},\tau_{0})\in\mathcal{N}_{2}(\gamma), 𝐃=𝐈\mathbf{D}=\mathbf{I}. Then, in Algorithm 1,

(𝐱,𝐲,𝐬,κ,τ)𝒩2(γ),\displaystyle(\mathbf{x},\mathbf{y},\mathbf{s},\kappa,\tau)\in\mathcal{N}_{2}(\gamma), (97)
𝐫p(𝐱,𝐲,𝐬,κ,τ)=(1δ/2(k+1))j𝐫p(𝐱0,𝐲0,𝐬0,κ0,τ0),\displaystyle\left\|\mathbf{r}_{p}(\mathbf{x},\mathbf{y},\mathbf{s},\kappa,\tau)\right\|=\left(1-\delta/\sqrt{2(k+1)}\right)^{j}\left\|\mathbf{r}_{p}(\mathbf{x}_{0},\mathbf{y}_{0},\mathbf{s}_{0},\kappa_{0},\tau_{0})\right\|,
𝐫d(𝐱,𝐲,𝐬,κ,τ)=(1δ/2(k+1))j𝐫d(𝐱0,𝐲0,𝐬0,κ0,τ0),\displaystyle\left\|\mathbf{r}_{d}(\mathbf{x},\mathbf{y},\mathbf{s},\kappa,\tau)\right\|=\left(1-\delta/\sqrt{2(k+1)}\right)^{j}\left\|\mathbf{r}_{d}(\mathbf{x}_{0},\mathbf{y}_{0},\mathbf{s}_{0},\kappa_{0},\tau_{0})\right\|,
μ(𝐱,𝐬,κ,τ)=(1δ/2(k+1))jμ(𝐱0,𝐬0,κ0,τ0),\displaystyle\mu\left(\mathbf{x},\mathbf{s},\kappa,\tau\right)=\left(1-\delta/\sqrt{2(k+1)}\right)^{j}\mu\left({{\mathbf{x}}_{0}},{{\mathbf{s}}_{0}},{{\kappa}_{0}},{{\tau}_{0}}\right),

where jj is the iteration number.

Proposition 4 shows the worst case iteration complexity of Algorithm 1 is O(k1/2log(ϵ1))O\left(k^{1/2}\log\left(\epsilon^{-1}\right)\right) when 𝐃=𝐈\mathbf{D}=\mathbf{I} and the parameters γ\gamma and δ\delta satisfy Eq. (96).
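For concreteness, the following sketch (illustrative only, not part of the algorithm) scans a grid of (\gamma,\delta) pairs for a given number of cones k and reports the largest \delta on the grid for which the parameter condition Eq. (96) holds.

```python
import numpy as np

def satisfies_condition(gamma, delta, k):
    """Check the parameter condition of Eq. (96):
    4(gamma^2 + delta^2)/(1 - 3 gamma)^2 * (1 - delta/sqrt(2(k+1)))^{-1} < gamma."""
    lhs = 4.0 * (gamma**2 + delta**2) / (1.0 - 3.0 * gamma)**2
    lhs /= 1.0 - delta / np.sqrt(2.0 * (k + 1))
    return lhs < gamma

k = 100
for gamma in (0.05, 0.10, 0.15, 0.20, 0.25, 0.30):
    feasible = [d for d in np.linspace(0.01, 0.99, 99)
                if satisfies_condition(gamma, d, k)]
    if feasible:
        print(f"gamma = {gamma:.2f}: condition holds for delta up to ~{max(feasible):.2f}")
    else:
        print(f"gamma = {gamma:.2f}: no feasible delta on the grid")
```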

At the end of this subsection, we show that the centrality is improved when the parameters are set properly. Setting \alpha=1 in Eq. (94) and using Lemma 9, we also obtain

d_{2}(\mathbf{x}(1),\mathbf{s}(1))\leq\sqrt{2}\|\mathbf{z}(1)\|\leq\nu\tilde{\Gamma}\mu=\tilde{\Gamma}\mu(1). (98)

Combining the result with Eq. (92),

(𝐱(1),𝐲(1),𝐬(1))𝒩2(Γ~),(\mathbf{x}(1),\mathbf{y}(1),\mathbf{s}(1))\in\mathcal{N}_{2}(\tilde{\Gamma}), (99)

which shows that the centrality is improved, since \tilde{\Gamma}<\gamma. For example, in Algorithm 1, if the parameters satisfy {4\left(\gamma^{2}+\delta^{2}\right)}/{\left(1-3\gamma\right)^{2}}<\gamma, the central-path neighborhood coefficient improves by a factor of 1-\delta/\sqrt{2(k+1)} in each iteration.

3.6 Polynomial convergence for the MZ family of directions

The MZ family of searching directions is generated by choosing \mathbf{D}\in\mathcal{G} in the linear system Eq. (38).

For a scaling matrix 𝐃𝒢\mathbf{D}\in\mathcal{G}, consider the following change of variables

𝐀¯=𝐀𝐃T,𝐂¯=𝐃𝐂𝐃T.\bar{\mathbf{A}}=\mathbf{A}\mathbf{D}^{T},\bar{\mathbf{C}}=\mathbf{D}\mathbf{C}\mathbf{D}^{T}. (100)

Linear system Eq. (38) is transformed to

𝐀¯Δ𝐱¯=(1ν)𝐀¯𝐱¯,\displaystyle\bar{\mathbf{A}}\Delta\bar{\mathbf{x}}=-\left(1-\nu\right)\bar{\mathbf{A}}\bar{\mathbf{x}}, (101)
𝐀¯TΔ𝐲+𝐂¯Δ𝐱¯+Δ𝐬¯=(1ν)(𝐀¯T𝐲+𝐂¯𝐱¯+𝐬¯),\displaystyle\bar{\mathbf{A}}^{T}\Delta\mathbf{y}+\bar{\mathbf{C}}\Delta\bar{\mathbf{x}}+\Delta\bar{\mathbf{s}}=-(1-\nu)\left(\bar{\mathbf{A}}^{T}\mathbf{y}+\bar{\mathbf{C}}\bar{\mathbf{x}}+\bar{\mathbf{s}}\right),
𝐗¯Δ𝐬¯+𝐒¯Δ𝐱¯=νμ𝐞𝐱¯𝐬¯,\displaystyle\bar{\mathbf{X}}\Delta\bar{\mathbf{s}}+\bar{\mathbf{S}}\Delta\bar{\mathbf{x}}=\nu\mu\mathbf{e}-\bar{\mathbf{x}}\circ\bar{\mathbf{s}},

which is Eq. (38) with \mathbf{D}=\mathbf{I} and with \bar{\mathbf{x}}=\mathbf{D}^{-T}\mathbf{x},\bar{\mathbf{s}}=\mathbf{D}\mathbf{s} in place of \mathbf{x} and \mathbf{s}. Using Theorem 4 together with Proposition 3 c), which shows that the central path neighborhood \mathcal{N}_{2}(\gamma) is invariant under the scaling, we conclude that Theorem 4 also holds with \mathbf{D}\in\mathcal{G} in place of \mathbf{D}=\mathbf{I}.

Then, similar to Proposition 4, we obtain the next proposition immediately.

Proposition 5. Let the parameters satisfy

4(γ2+δ2)(13γ)2(1δ2(k+1))1<γ,γ(0,1/3),δ(0,1).\frac{4\left(\gamma^{2}+\delta^{2}\right)}{\left(1-3\gamma\right)^{2}}\left(1-\frac{\delta}{\sqrt{2(k+1)}}\right)^{-1}<\gamma,\gamma\in(0,1/3),\delta\in(0,1). (102)

Assume that (𝐱0,𝐲0,𝐬0,κ0,τ0)𝒩2(γ)(\mathbf{x}_{0},\mathbf{y}_{0},\mathbf{s}_{0},\kappa_{0},\tau_{0})\in\mathcal{N}_{2}(\gamma), 𝐃𝒢\mathbf{D}\in\mathcal{G}. Then, in Algorithm 1,

(𝐱,𝐲,𝐬,κ,τ)𝒩2(γ),\displaystyle(\mathbf{x},\mathbf{y},\mathbf{s},\kappa,\tau)\in\mathcal{N}_{2}(\gamma), (103)
𝐫p(𝐱,𝐲,𝐬,κ,τ)=(1δ/2(k+1))j𝐫p(𝐱0,𝐲0,𝐬0,κ0,τ0),\displaystyle\left\|\mathbf{r}_{p}(\mathbf{x},\mathbf{y},\mathbf{s},\kappa,\tau)\right\|=\left(1-\delta/\sqrt{2(k+1)}\right)^{j}\left\|\mathbf{r}_{p}(\mathbf{x}_{0},\mathbf{y}_{0},\mathbf{s}_{0},\kappa_{0},\tau_{0})\right\|,
𝐫d(𝐱,𝐲,𝐬,κ,τ)=(1δ/2(k+1))j𝐫d(𝐱0,𝐲0,𝐬0,κ0,τ0),\displaystyle\left\|\mathbf{r}_{d}(\mathbf{x},\mathbf{y},\mathbf{s},\kappa,\tau)\right\|=\left(1-\delta/\sqrt{2(k+1)}\right)^{j}\left\|\mathbf{r}_{d}(\mathbf{x}_{0},\mathbf{y}_{0},\mathbf{s}_{0},\kappa_{0},\tau_{0})\right\|,
μ(𝐱,𝐬,κ,τ)=(1δ/2(k+1))jμ(𝐱0,𝐬0,κ0,τ0),\displaystyle\mu\left(\mathbf{x},\mathbf{s},\kappa,\tau\right)=\left(1-\delta/\sqrt{2(k+1)}\right)^{j}\mu\left({{\mathbf{x}}_{0}},{{\mathbf{s}}_{0}},{{\kappa}_{0}},{{\tau}_{0}}\right),

where jj is the iteration number.

Proposition 5 shows that the worst case iteration complexity of Algorithm 1 is O\left(k^{1/2}\log\left(\epsilon^{-1}\right)\right) for the MZ family of searching directions, when the parameters \gamma and \delta satisfy Eq. (102). This equals the best known complexity of feasible IPMs. More precisely, the algorithm takes \log(\epsilon)/\log\left(1-\delta/\sqrt{2(k+1)}\right) iterations to reduce the primal residual, dual residual, and complementary gap by a factor of \epsilon.

4 Analysis of warm starting

In this section, we study the conditions under which warm starting improves the worst case iteration bound compared with cold starting for IPMs for SOCP.

The warm starting scheme investigated is modified from a scheme presented by Skajaa et al. [2013], which initializes infeasible IPMs for SOCP with a linear combination of the optimal solution of a similar, previously solved problem and the cold starting point. They show that the scheme can improve the worst case iteration complexity under certain conditions in linear programming, and generalize the scheme to SOCP. We extend the scheme by using an inexact solution of the previous problem instead of the optimal solution, which allows the previous problem to be solved in only a few iterations, saving computation in scenarios such as successive convexification, e.g., [Szmuk et al., 2020, Miao et al., 2021].

We first introduce the warm starting scheme.

Consider two related SOCP problems defined by Eq. (2), where the problem dimensions and cone constraints are unchanged. Denote the current SOCP problem by \mathcal{P} and its parameters by \mathbf{A},\mathbf{b},\mathbf{c}. The previous problem is \mathcal{P}_{o} with parameters \mathbf{A}_{o},\mathbf{b}_{o},\mathbf{c}_{o}, an inexact solution of which is (\mathbf{x}_{o},\mathbf{y}_{o},\mathbf{s}_{o}). The differences of the problem parameters are defined as

Δ𝐀=𝐀𝐀o,Δ𝐛=𝐛𝐛o,Δ𝐜=𝐜𝐜o.\Delta\mathbf{A}=\mathbf{A}-\mathbf{A}_{o},\ \Delta\mathbf{b}=\mathbf{b}-\mathbf{b}_{o},\ \Delta\mathbf{c}=\mathbf{c}-\mathbf{c}_{o}. (104)

The cold starting point for infeasible IPMs based on the HSD model (e.g., Algorithm 1) is defined as

\mathbf{q}_{c}=(\mathbf{x}_{c},\mathbf{y}_{c},\mathbf{s}_{c},\kappa_{c},\tau_{c})=(\mathbf{e},0,\mathbf{e},1,1), (105)

where 𝐪\mathbf{q} denotes the solution as a whole for simplicity. We also have μ(𝐪c)=1\mu(\mathbf{q}_{c})=1.

Then, the warm starting point 𝐪w=(𝐱w,𝐲w,𝐬w,κw,τw)\mathbf{q}_{w}=(\mathbf{x}_{w},\mathbf{y}_{w},\mathbf{s}_{w},\kappa_{w},\tau_{w}) is computed as

𝐱w=ω𝐱o+(1ω)𝐞\displaystyle{{\mathbf{x}}_{w}}=\omega{{\mathbf{x}}_{o}}+\left(1-\omega\right)\mathbf{e} (106)
𝐲w=ω𝐲o\displaystyle{{\mathbf{y}}_{w}}=\omega{{\mathbf{y}}_{o}}
𝐬w=ω𝐬o+(1ω)𝐞\displaystyle{{\mathbf{s}}_{w}}=\omega{{\mathbf{s}}_{o}}+\left(1-\omega\right)\mathbf{e}
κw=(𝐱w)T𝐬w/k\displaystyle{{\kappa}_{w}}={{\left({{\mathbf{x}}_{w}}\right)}^{T}}{{\mathbf{s}}_{w}}/k
τw=1,\displaystyle{{\tau}_{w}}=1,

where the coefficient ω[0,1]\omega\in[0,1].
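A minimal sketch of the warm starting rule Eq. (105)-(106) is given below, assuming \mathbf{x}_{o} and \mathbf{s}_{o} are stored as stacked cone blocks; the cone dimensions and the previous inexact solution used here are placeholder inputs.

```python
import numpy as np

def identity_element(dims):
    """Stack the Jordan identity e = (1, 0, ..., 0) of each second-order cone."""
    e = np.zeros(sum(dims))
    offset = 0
    for n in dims:
        e[offset] = 1.0
        offset += n
    return e

def warm_start(x_o, y_o, s_o, omega, dims):
    """Warm starting point of Eq. (106), combining the previous inexact
    solution with the cold start q_c = (e, 0, e, 1, 1) of Eq. (105)."""
    k = len(dims)                      # number of cones
    e = identity_element(dims)
    x_w = omega * x_o + (1.0 - omega) * e
    y_w = omega * y_o
    s_w = omega * s_o + (1.0 - omega) * e
    kappa_w = x_w @ s_w / k
    tau_w = 1.0
    return x_w, y_w, s_w, kappa_w, tau_w

# Toy usage with two 3-dimensional cones (placeholder data):
dims = [3, 3]
x_o = np.array([2.0, 0.5, 0.3, 1.5, 0.2, -0.1])
s_o = np.array([1.0, -0.2, 0.1, 0.8, 0.1, 0.3])
y_o = np.zeros(4)
print(warm_start(x_o, y_o, s_o, omega=0.9, dims=dims))
```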

Then, we investigate the number of iterations required when the warm starting scheme is used in Algorithm 1, which has the best known worst case iteration complexity among IPMs for SOCP. Considering that the initial primal residual, dual residual, and complementary gap differ between cold and warm starting, we use a unified stopping criterion:

𝐫p(𝐪)ϵu,𝐫d(𝐪)ϵu,μ(𝐪)ϵu.\left\|\mathbf{r}_{p}(\mathbf{q})\right\|\leq\epsilon_{u},\left\|\mathbf{r}_{d}(\mathbf{q})\right\|\leq\epsilon_{u},\mu\left(\mathbf{q}\right)\leq\epsilon_{u}. (107)

Then, according to Proposition 5, when the parameters \gamma and \delta satisfy Eq. (102), the number of iterations required is

N_{it}=-\log\left(\max\left(\left\|\mathbf{r}_{p}(\mathbf{q}_{w})\right\|,\left\|\mathbf{r}_{d}(\mathbf{q}_{w})\right\|,\mu(\mathbf{q}_{w})\right)/\epsilon_{u}\right)/\log\left(1-\delta/\sqrt{2(k+1)}\right). (108)

Obviously, a sufficient condition for the warm starting to improve the iteration bound is

𝐫p(𝐪w)cw𝐫p(𝐪c),𝐫d(𝐪w)cw𝐫d(𝐪c),μ(𝐪w)cw,𝐪w𝒩2(γ),\left\|\mathbf{r}_{p}(\mathbf{q}_{w})\right\|\leq c_{w}\left\|\mathbf{r}_{p}(\mathbf{q}_{c})\right\|,\left\|\mathbf{r}_{d}(\mathbf{q}_{w})\right\|\leq c_{w}\left\|\mathbf{r}_{d}(\mathbf{q}_{c})\right\|,\mu(\mathbf{q}_{w})\leq c_{w},\mathbf{q}_{w}\in\mathcal{N}_{2}(\gamma), (109)

where c_{w}\in(0,1) is a constant. Under this condition, the number of iterations is reduced by \log(c_{w})/\log\left(1-\delta/\sqrt{2(k+1)}\right) compared with cold starting.
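The following sketch evaluates Eq. (108) for illustrative parameter values; the residuals, \delta, k, \epsilon_{u}, and c_{w} below are placeholders, and unit initial residuals are assumed for the cold start.

```python
import numpy as np

def iterations_required(r_p, r_d, mu, eps_u, delta, k):
    """Worst-case iteration count of Eq. (108) to meet the stopping
    criterion Eq. (107)."""
    worst = max(r_p, r_d, mu)
    return -np.log(worst / eps_u) / np.log(1.0 - delta / np.sqrt(2.0 * (k + 1)))

k, delta, eps_u = 100, 0.1, 1e-8
n_cold = iterations_required(1.0, 1.0, 1.0, eps_u, delta, k)  # cold start: residuals and gap equal 1
c_w = 1e-2                                                    # warm start improves all three by c_w
n_warm = iterations_required(c_w, c_w, c_w, eps_u, delta, k)
print(f"cold: {n_cold:.0f} iterations, warm: {n_warm:.0f}, "
      f"saved: {n_cold - n_warm:.0f} = log(c_w)/log(1 - delta/sqrt(2(k+1)))")
```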

In the rest of this section, we study the conditions for warm starting to improve the initial primal residual, dual residual, and complementary gap by at least a constant factor c_{w}\in(0,1), and to keep the initial point in the central path neighborhood.

4.1 The conditions to improve the primal residual

In this subsection, we analyze the primal residual 𝐫p(𝐪w)\mathbf{r}_{p}(\mathbf{q}_{w}).

Assume that

\displaystyle\left\|\Delta\mathbf{A}\right\|\left\|\mathbf{x}_{o}\right\|\leq c_{A}\left\|\mathbf{r}_{p}(\mathbf{q}_{c})\right\|, (110)
\displaystyle\left\|\Delta\mathbf{b}\right\|\leq c_{b}\left\|\mathbf{r}_{p}(\mathbf{q}_{c})\right\|,
\displaystyle\left\|\mathbf{r}_{p}^{o}(\mathbf{q}_{o})\right\|\leq c_{p}\left\|\mathbf{r}_{p}(\mathbf{q}_{c})\right\|,
\displaystyle c_{A}+c_{b}+c_{p}\leq 1.

Combining the assumptions with the definition Eq. (106), the primal residual satisfies

𝐫p(𝐪w)=𝐀𝐱w𝐛τw=ω(𝐀𝐱o𝐛)+(1ω)𝐫p(𝐪c)\displaystyle{{\mathbf{r}}_{p}}({{\mathbf{q}}_{w}})=\mathbf{A}{{\mathbf{x}}_{w}}-\mathbf{b}{{\tau}_{w}}=\omega\left(\mathbf{A}{{\mathbf{x}}_{o}}-\mathbf{b}\right)+\left(1-\omega\right){{\mathbf{r}}_{p}}({{\mathbf{q}}_{c}}) (111)
=ω((𝐀o+Δ𝐀)𝐱o(𝐛o+Δ𝐛))+(1ω)𝐫p(𝐪c)\displaystyle=\omega\left(\left({{\mathbf{A}}_{o}}+\Delta\mathbf{A}\right){{\mathbf{x}}_{o}}-\left({{\mathbf{b}}_{o}}+\Delta\mathbf{b}\right)\right)+\left(1-\omega\right){{\mathbf{r}}_{p}}({{\mathbf{q}}_{c}})
\displaystyle=\omega\mathbf{r}_{p}^{o}({{\mathbf{q}}_{o}})+\omega\left(\Delta\mathbf{A}{{\mathbf{x}}_{o}}-\Delta\mathbf{b}\right)+\left(1-\omega\right){{\mathbf{r}}_{p}}({{\mathbf{q}}_{c}}).

Consequently,

𝐫p(𝐪w)(1ω(1cAcbcp))𝐫p(𝐪c).\left\|{{\mathbf{r}}_{p}}({{\mathbf{q}}_{w}})\right\|\leq\left(1-\omega\left(1-{{c}_{A}}-{{c}_{b}}-{c_{p}}\right)\right)\|{{\mathbf{r}}_{p}}({\mathbf{q}}_{c})\|. (112)

Then, we obtain

𝐫p(𝐪w)cw𝐫p(𝐪c), if cw(1ω(1cAcbcp)).\left\|\mathbf{r}_{p}(\mathbf{q}_{w})\right\|\leq c_{w}\left\|\mathbf{r}_{p}(\mathbf{q}_{c})\right\|,\text{ if }c_{w}\geq\left(1-\omega\left(1-{{c}_{A}}-{{c}_{b}}-{c_{p}}\right)\right). (113)

The result shows that when the previous primal residual and the differences of the problem parameters are small, warm starting with the coefficient \omega close to 1 reduces the primal residual compared with cold starting.
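As a numerical illustration of Eq. (111) and the bound Eq. (112) (random placeholder data, one dense block treated as a single cone), the sketch below forms the warm-start primal residual and compares it against the triangle-inequality bound.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, omega = 5, 8, 0.9

# Previous problem and an inexact primal point x_o (placeholder data).
A_o = rng.standard_normal((m, n))
x_o = rng.standard_normal(n)
b_o = A_o @ x_o + 1e-3 * rng.standard_normal(m)   # small previous residual r_p^o

# Current problem: a small perturbation of the previous one, Eq. (104).
dA = 1e-3 * rng.standard_normal((m, n))
db = 1e-3 * rng.standard_normal(m)
A, b = A_o + dA, b_o + db

e = np.zeros(n)
e[0] = 1.0                                        # single-cone identity (assumption)
x_w = omega * x_o + (1.0 - omega) * e             # Eq. (106), tau_w = 1
r_p_w = A @ x_w - b                               # warm-start primal residual
r_p_c = A @ e - b                                 # cold-start primal residual

# Triangle-inequality bound underlying Eq. (112), with the constants of Eq. (110).
bound = (omega * (np.linalg.norm(A_o @ x_o - b_o)
                  + np.linalg.norm(dA, 2) * np.linalg.norm(x_o)
                  + np.linalg.norm(db))
         + (1.0 - omega) * np.linalg.norm(r_p_c))
print(np.linalg.norm(r_p_w), "<=", bound)
```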

4.2 The conditions to improve the dual residual

In this subsection, we analyze the dual residual 𝐫d(𝐪w)\mathbf{r}_{d}(\mathbf{q}_{w}).

Assume that

\displaystyle\left\|\Delta\mathbf{A}^{T}\right\|\left\|\mathbf{y}_{o}\right\|\leq c_{AT}\left\|\mathbf{r}_{d}(\mathbf{q}_{c})\right\|, (114)
\displaystyle\left\|\Delta\mathbf{c}\right\|\leq c_{c}\left\|\mathbf{r}_{d}(\mathbf{q}_{c})\right\|,
\displaystyle\left\|\mathbf{r}_{d}^{o}(\mathbf{q}_{o})\right\|\leq c_{d}\left\|\mathbf{r}_{d}(\mathbf{q}_{c})\right\|,
\displaystyle c_{d}+c_{AT}+c_{c}\leq 1.

Then, using the definition Eq. (106),

𝐫d(𝐪w)=𝐀T𝐲w𝐬w+τw𝐜\displaystyle{{\mathbf{r}}_{d}}({{\mathbf{q}}_{w}})=-{{\mathbf{A}}^{T}}{{\mathbf{y}}_{w}}-{{\mathbf{s}}_{w}}+{{\tau}_{w}}\mathbf{c} (115)
=ω(𝐀o+Δ𝐀)T𝐲oω𝐬o(1ω)𝐞+ω(𝐜o+Δ𝐜)+(1ω)𝐜,\displaystyle=-\omega{{\left({{\mathbf{A}}_{o}}+\Delta\mathbf{A}\right)}^{T}}{{\mathbf{y}}_{o}}-\omega{{\mathbf{s}}_{o}}-\left(1-\omega\right)\mathbf{e}+\omega\left({{\mathbf{c}}_{o}}+\Delta\mathbf{c}\right)+\left(1-\omega\right)\mathbf{c},
=ω𝐫do(𝐪o)+ω(Δ𝐀T𝐲o+Δ𝐜)+(1ω)𝐫d(𝐪c),\displaystyle=\omega\mathbf{r}_{d}^{o}({{\mathbf{q}}_{o}})+\omega\left(-\Delta{{\mathbf{A}}^{T}}{{\mathbf{y}}_{o}}+\Delta\mathbf{c}\right)+\left(1-\omega\right){{\mathbf{r}}_{d}}({{\mathbf{q}}_{c}}),
\displaystyle\left\|{{\mathbf{r}}_{d}}({{\mathbf{q}}_{w}})\right\|\leq\left(1-\omega\left(1-{{c}_{d}}-{{c}_{AT}}-{{c}_{c}}\right)\right)\|{{\mathbf{r}}_{d}}({{\mathbf{q}}_{c}})\|.

Then, we obtain

𝐫d(𝐪w)cw𝐫d(𝐪c), if cw(1ω(1cdcATcc)).\left\|\mathbf{r}_{d}(\mathbf{q}_{w})\right\|\leq c_{w}\left\|\mathbf{r}_{d}(\mathbf{q}_{c})\right\|,\text{ if }c_{w}\geq\left(1-\omega\left(1-{{c}_{d}}-{{c}_{AT}}-{{c}_{c}}\right)\right). (116)

The result shows that when the previous dual residual and the differences of the problem parameters are small, warm starting with the coefficient \omega close to 1 reduces the dual residual compared with cold starting.

4.3 The conditions to improve the complementary gap

In this subsection, we analyze the central path coefficient \mu(\mathbf{q}_{w}), which is proportional to the complementary gap.

Assume that

\displaystyle\mu({{\mathbf{q}}_{o}})\leq{{c}_{\mu}}\mu({{\mathbf{q}}_{c}}), (117)
\displaystyle\left(1-\omega\right)\left(\mathbf{e}^{T}({{\mathbf{x}}_{o}}+{{\mathbf{s}}_{o}})/k+1\right)\leq{{c}_{xs}},
\displaystyle{{c}_{\mu}}+{{c}_{xs}}\leq 1.

Then, we have

(𝐱w)T𝐬w=(ω𝐱o+(1ω)𝐞)T(ω𝐬o+(1ω)𝐞)\displaystyle{{\left({{\mathbf{x}}_{w}}\right)}^{T}}{{\mathbf{s}}_{w}}={{\left(\omega{{\mathbf{x}}_{o}}+\left(1-\omega\right)\mathbf{e}\right)}^{T}}\left(\omega{{\mathbf{s}}_{o}}+\left(1-\omega\right)\mathbf{e}\right) (118)
=ω2(𝐱o)T𝐬o+ω(1ω)𝐞T(𝐱o+𝐬o)+(1ω)2k.\displaystyle={{\omega}^{2}}{{\left({{\mathbf{x}}_{o}}\right)}^{T}}{{\mathbf{s}}_{o}}+\omega\left(1-\omega\right){{\mathbf{e}}^{T}}\left({{\mathbf{x}}_{o}}+{{\mathbf{s}}_{o}}\right)+{{\left(1-\omega\right)}^{2}}k.

Due to the definition Eq. (106),

μ(𝐪w)=(𝐱w)T𝐬w+κwτwk+1=(𝐱w)T𝐬wk\displaystyle\mu({{\mathbf{q}}_{w}})=\frac{{{\left({{\mathbf{x}}_{w}}\right)}^{T}}{{\mathbf{s}}_{w}}+{{\kappa}_{w}}{{\tau}_{w}}}{k+1}=\frac{{{\left({{\mathbf{x}}_{w}}\right)}^{T}}{{\mathbf{s}}_{w}}}{k} (119)
=ω2μ(𝐪o)+ω(1ω)k𝐞T(𝐱o+𝐬o)+(1ω)2\displaystyle={{\omega}^{2}}\mu({{\mathbf{q}}_{o}})+\frac{\omega\left(1-\omega\right)}{k}{{\mathbf{e}}^{T}}\left({{\mathbf{x}}_{o}}+{{\mathbf{s}}_{o}}\right)+{{\left(1-\omega\right)}^{2}}
cμ+(1ω)(𝐞T(𝐱o+𝐬o)/k+1)\displaystyle\leq{{c}_{\mu}}+\left(1-\omega\right)(\mathbf{e}^{T}({{\mathbf{x}}_{o}}+{{\mathbf{s}}_{o}})/k+1)
\displaystyle\leq{{c}_{\mu}}+{{c}_{xs}}=\left({{c}_{\mu}}+{{c}_{xs}}\right)\mu({{\mathbf{q}}_{c}}).

Then, we obtain

μ(𝐪w)cwμ(𝐪c), if cwcμ+cxs.\mu(\mathbf{q}_{w})\leq c_{w}\mu(\mathbf{q}_{c}),\text{ if }c_{w}\geq{{c}_{\mu}}+{{c}_{xs}}. (120)

The second inequality in Eq. (117) requires

ω1cxs/(𝐞T(𝐱o+𝐬o)/k+1),\omega\geq 1-c_{xs}/(\mathbf{e}^{T}({{\mathbf{x}}_{o}}+{{\mathbf{s}}_{o}})/k+1), (121)

that show ω\omega should be close to 1.

The result shows that when the previous complementary gap is small, warm starting with the coefficient ω\omega close to 1 can reduce the complementary gap compared with cold starting.
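The expansion of \mu(\mathbf{q}_{w}) in Eq. (119) is an exact identity, which can be checked numerically; the sketch below uses three cones with placeholder vectors, where \mathbf{e}^{T}(\mathbf{x}_{o}+\mathbf{s}_{o}) sums the first component of each cone block.

```python
import numpy as np

def identity_element(dims):
    e = np.zeros(sum(dims))
    e[np.cumsum([0] + dims[:-1])] = 1.0   # first entry of each cone block
    return e

rng = np.random.default_rng(2)
dims, omega = [3, 4, 2], 0.8
k, e = len(dims), identity_element(dims)
x_o = rng.standard_normal(sum(dims))
s_o = rng.standard_normal(sum(dims))

x_w = omega * x_o + (1 - omega) * e
s_w = omega * s_o + (1 - omega) * e
mu_w = x_w @ s_w / k                      # Eq. (119), using kappa_w * tau_w = x_w^T s_w / k

rhs = (omega**2 * (x_o @ s_o) / k
       + omega * (1 - omega) / k * (e @ (x_o + s_o))
       + (1 - omega)**2)
print(np.isclose(mu_w, rhs))              # True: the expansion in Eq. (119) is exact
```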

4.4 The conditions to maintain centrality

Finally, we study the condition ensuring \mathbf{q}_{w}\in\mathcal{N}_{2}(\gamma), which is required for the iteration count Eq. (108) to hold.

Because Eq. (99) shows that the centrality is improved during the solution process of Algorithm 1, we can assume that \mathbf{q}_{o}\in\mathcal{N}_{2}(\gamma) and d_{2}(\mathbf{q}_{o})\leq\gamma_{o}\mu(\mathbf{q}_{o})<\gamma\mu(\mathbf{q}_{o}). Besides, the cold starting point is perfectly centered: \mathbf{q}_{c}\in\mathcal{N}_{2}(\gamma) and d_{2}(\mathbf{q}_{c})=0. Since \text{int}(\mathcal{K}) is convex, by the definition Eq. (106), \mathbf{q}_{w}\in\text{int}(\mathcal{K}).

Then, in order to ensure \mathbf{q}_{w}\in\mathcal{N}_{2}(\gamma), we must ensure d_{2}(\mathbf{q}_{w})<\gamma\mu(\mathbf{q}_{w}). To investigate d_{2}(\mathbf{q}_{w}), we first study

𝐰xw,sw=𝐓xw𝐬w.\mathbf{w}_{x_{w},s_{w}}=\mathbf{T}_{x_{w}}\mathbf{s}_{w}. (122)

Due to Eq. (45),

𝐓x=mat(𝐱)𝐔x,𝐔x𝐞=0.\mathbf{T}_{x}=\text{mat}(\mathbf{x})-\mathbf{U}_{x},\mathbf{U}_{x}\mathbf{e}=0. (123)

Denote

𝐔~=𝐔xwω𝐔xo.\tilde{\mathbf{U}}=\mathbf{U}_{x_{w}}-\omega\mathbf{U}_{x_{o}}. (124)

Due to Eq. (22), Eq. (45), and Eq. (106), we have

𝐔xw(i)ω𝐔xo(i)=(000ρ~i𝐏xo(i)),\mathbf{U}_{x_{w}^{(i)}}-\omega\mathbf{U}_{x_{o}^{(i)}}=\begin{pmatrix}0&0\\ 0&\tilde{\rho}_{i}\mathbf{P}_{x_{o}^{(i)}}\end{pmatrix}, (125)

where

ρ~i=(xw(i))1ω(xo(i))1βxw(i)+ωβxo(i)\displaystyle\tilde{\rho}_{i}=(x_{w}^{(i)})_{1}-\omega(x_{o}^{(i)})_{1}-\beta_{x_{w}^{(i)}}+\omega\beta_{x_{o}^{(i)}} (126)
=(xw(i))1ω(xo(i))1(xw(i))12(xw(i))2:n2+ω(xo(i))12(xo(i))2:n2\displaystyle=(x_{w}^{(i)})_{1}-\omega(x_{o}^{(i)})_{1}-\sqrt{(x_{w}^{(i)})_{1}^{2}-\|(x_{w}^{(i)})_{2:n}\|^{2}}+\omega\sqrt{(x_{o}^{(i)})_{1}^{2}-\|(x_{o}^{(i)})_{2:n}\|^{2}}
=(1ω)(12ω(xo(i))1+1ωβxw(i)+ωβxo(i)).\displaystyle=(1-\omega)\left(1-\frac{2\omega(x_{o}^{(i)})_{1}+1-\omega}{\beta_{x_{w}^{(i)}}+\omega\beta_{x_{o}^{(i)}}}\right).

Because Eq. (47) shows 𝐏xo1\left\|\mathbf{P}_{x_{o}}\right\|\leq 1, we obtain

\left\|\tilde{\mathbf{U}}\right\|\leq(1-\omega)\rho,\ \rho=\max_{i}\left|1-\frac{2\omega(x_{o}^{(i)})_{1}+1-\omega}{\beta_{x_{w}^{(i)}}+\omega\beta_{x_{o}^{(i)}}}\right|. (127)

Then, using the definition Eq. (106),

𝐰xw,sw=𝐓xw𝐬w=𝐱w𝐬w𝐔xw𝐬w\displaystyle\mathbf{w}_{x_{w},s_{w}}=\mathbf{T}_{x_{w}}\mathbf{s}_{w}=\mathbf{x}_{w}\circ\mathbf{s}_{w}-\mathbf{U}_{x_{w}}\mathbf{s}_{w} (128)
=ω2𝐱o𝐬o+ω(1ω)(𝐱o+𝐬o)+(1ω)2𝐞𝐔xw(ω𝐬o+(1ω)𝐞)\displaystyle=\omega^{2}\mathbf{x}_{o}\circ\mathbf{s}_{o}+\omega(1-\omega)(\mathbf{x}_{o}+\mathbf{s}_{o})+(1-\omega)^{2}\mathbf{e}-\mathbf{U}_{x_{w}}(\omega\mathbf{s}_{o}+(1-\omega)\mathbf{e})
=ω2𝐱o𝐬o+ω(1ω)(𝐱o+𝐬o)+(1ω)2𝐞(ω𝐔xo+𝐔~)ω𝐬o\displaystyle=\omega^{2}\mathbf{x}_{o}\circ\mathbf{s}_{o}+\omega(1-\omega)(\mathbf{x}_{o}+\mathbf{s}_{o})+(1-\omega)^{2}\mathbf{e}-(\omega\mathbf{U}_{x_{o}}+\tilde{\mathbf{U}})\omega\mathbf{s}_{o}
=ω2(mat(𝐱o)𝐔xo)𝐬o+ω(1ω)(𝐱o+𝐬o)+(1ω)2𝐞ω𝐔~𝐬o\displaystyle=\omega^{2}(\text{mat}(\mathbf{x}_{o})-\mathbf{U}_{x_{o}})\mathbf{s}_{o}+\omega(1-\omega)(\mathbf{x}_{o}+\mathbf{s}_{o})+(1-\omega)^{2}\mathbf{e}-\omega\tilde{\mathbf{U}}\mathbf{s}_{o}
=ω2𝐰xo,so+ω(1ω)(𝐱o+𝐬o)+(1ω)2𝐞ω𝐔~𝐬o.\displaystyle=\omega^{2}\mathbf{w}_{x_{o},s_{o}}+\omega(1-\omega)(\mathbf{x}_{o}+\mathbf{s}_{o})+(1-\omega)^{2}\mathbf{e}-\omega\tilde{\mathbf{U}}\mathbf{s}_{o}.

Combining the result with Eq. (119), we obtain

\displaystyle{{d}_{2}}\left({{\mathbf{q}}_{w}}\right)=\sqrt{2}\left\|{{\mathbf{w}}_{{{x}_{w}},{{s}_{w}}}}-\mu({{\mathbf{q}}_{w}})\mathbf{e}\right\| (129)
\displaystyle\leq\sqrt{2}{{\omega}^{2}}\left\|{{\mathbf{w}}_{{{x}_{o}},{{s}_{o}}}}-\mu({{\mathbf{q}}_{o}})\mathbf{e}\right\|+\sqrt{2}\left\|\omega(1-\omega)\left(({{\mathbf{x}}_{o}}+{{\mathbf{s}}_{o}})-\frac{{{\mathbf{e}}^{T}}}{k}\left({{\mathbf{x}}_{o}}+{{\mathbf{s}}_{o}}\right)\mathbf{e}\right)-\omega\widetilde{\mathbf{U}}{{\mathbf{s}}_{o}}\right\|
\displaystyle\leq{{\omega}^{2}}{{d}_{2}}\left({{\mathbf{q}}_{o}}\right)+\sqrt{2}\omega(1-\omega)\left(\left\|({{\mathbf{x}}_{o}}+{{\mathbf{s}}_{o}})-\psi_{o}\mathbf{e}\right\|+\rho\left\|{{\mathbf{s}}_{o}}\right\|\right),

where

ψo=𝐞Tk(𝐱o+𝐬o).\psi_{o}=\frac{{{\mathbf{e}}^{T}}}{k}\left({{\mathbf{x}}_{o}}+{{\mathbf{s}}_{o}}\right). (130)

Then, we obtain a sufficient condition for d2(𝐪w)<γμ(𝐪w)d_{2}(\mathbf{q}_{w})<\gamma\mu(\mathbf{q}_{w}) by the following derivation:

\displaystyle{{d}_{2}}\left({{\mathbf{q}}_{w}}\right)\leq\gamma\mu({{\mathbf{q}}_{w}}) (131)
\displaystyle\Leftarrow{{\omega}^{2}}{{d}_{2}}\left({{\mathbf{q}}_{o}}\right)+\sqrt{2}\omega(1-\omega)\left(\left\|({{\mathbf{x}}_{o}}+{{\mathbf{s}}_{o}})-{{\psi}_{o}}\mathbf{e}\right\|+\rho\left\|{{\mathbf{s}}_{o}}\right\|\right)
\displaystyle\leq\gamma\left({{\omega}^{2}}\mu({{\mathbf{q}}_{o}})+\omega\left(1-\omega\right){{\psi}_{o}}+{{\left(1-\omega\right)}^{2}}\right)
\displaystyle\Leftarrow\omega{{\gamma}_{o}}\mu({{\mathbf{q}}_{o}})+\sqrt{2}(1-\omega)\left(\left\|({{\mathbf{x}}_{o}}+{{\mathbf{s}}_{o}})-{{\psi}_{o}}\mathbf{e}\right\|+\rho\left\|{{\mathbf{s}}_{o}}\right\|\right)\leq\gamma\left(\omega\mu({{\mathbf{q}}_{o}})+\left(1-\omega\right){{\psi}_{o}}\right)
\displaystyle\Leftarrow{{\xi}_{o}}\leq\omega\left({{\xi}_{o}}+\left(\gamma-{{\gamma}_{o}}\right)\mu({{\mathbf{q}}_{o}})\right),

where

{{\xi}_{o}}=\sqrt{2}\left(\left\|({{\mathbf{x}}_{o}}+{{\mathbf{s}}_{o}})-{{\psi}_{o}}\mathbf{e}\right\|+\rho\left\|{{\mathbf{s}}_{o}}\right\|\right)-\gamma{{\psi}_{o}}. (132)

Then, a sufficient condition for \mathbf{q}_{w}\in\mathcal{N}_{2}(\gamma) is

ω[0,1] if ξo0\displaystyle\omega\in[0,1]\text{ if }{{\xi}_{o}}\leq 0 (133)
ω[ξoξo+(γγo)μ(𝐪o),1], if ξo>0.\displaystyle\omega\in\left[\frac{{{\xi}_{o}}}{{{\xi}_{o}}+\left(\gamma-{{\gamma}_{o}}\right)\mu({{\mathbf{q}}_{o}})},1\right],\text{ if }{{\xi}_{o}}>0.

The result shows that when \mathbf{q}_{o} is close to the boundary of the cone constraints, the coefficient \omega required to ensure \mathbf{q}_{w}\in\mathcal{N}_{2}(\gamma) may be very close to 1. The reason is twofold: firstly, near the boundary, the corresponding elements of \mathbf{x}_{o} and \mathbf{s}_{o} may differ by many orders of magnitude, making \left\|({{\mathbf{x}}_{o}}+{{\mathbf{s}}_{o}})-{{\psi}_{o}}\mathbf{e}\right\|\gg 0; secondly, by Eq. (127), when \mathbf{x}_{o} is close to the boundary, \rho may be very large, as illustrated numerically below. In that case, only a small correction to the previous solution can be added. Typically, if the exact solution lies on the boundary, the iterates approach the boundary as the number of iterations increases. Consequently, an early-stage inexact solution of the previous problem is preferred in the warm starting scheme: less computation is spent on solving the previous problem, and it is easier to maintain centrality. This is similar to [Yildirim and Wright, 2002], which also prefers early-stage inexact solutions for warm starting of feasible IPMs for linear programming.
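The growth of \rho near the boundary can be illustrated with a single cone (placeholder values): as \beta_{x_{o}}\to 0 with the first component fixed, the magnitude of the bracketed expression in Eq. (127) blows up.

```python
import numpy as np

def beta(x):
    """beta_x = sqrt(x_1^2 - ||x_{2:n}||^2) for one second-order cone."""
    return np.sqrt(x[0]**2 - np.linalg.norm(x[1:])**2)

omega = 0.99
for margin in (0.5, 1e-2, 1e-4, 1e-6):
    # x_o inside the cone, at distance `margin` from the boundary in beta
    x_o = np.array([1.0, np.sqrt(1.0 - margin**2)])
    e = np.array([1.0, 0.0])
    x_w = omega * x_o + (1.0 - omega) * e
    rho = abs(1.0 - (2.0 * omega * x_o[0] + 1.0 - omega)
              / (beta(x_w) + omega * beta(x_o)))
    print(f"beta_xo = {beta(x_o):.1e}, rho = {rho:.2f}")
```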

However, when the differences of the problem parameters defined in Eq. (104) are very small and the previous solution is well centered, little correction is required, so the warm starting scheme with \omega very close to 1 is applicable. For example, in some successive convexification applications, the increments of \mathbf{A}, \mathbf{b}, and \mathbf{c} between problems depend on the increment of \mathbf{x}, and the Jacobian matrix is bounded. In the late stage of these applications, the differences of the problem parameters are small since the increment of \mathbf{x} is small. As a result, warm starting with \omega very close to 1 may also work well.

In summary, when the differences of the problem parameters are small, and the previous solution is well centered with small primal residual, dual residual, and complementary gap, the warm starting scheme with \omega close to 1 can improve the worst case iteration complexity compared with cold starting.

5 Conclusion

This paper shows that an infeasible IPM for SOCP has O\left(k^{1/2}\log\left(\epsilon^{-1}\right)\right) iteration complexity, which is the same as the best known result for feasible IPMs. Compared with cold starting, a warm starting scheme can reduce the number of iterations required when the differences of the problem parameters are small and the previous solution is well centered with small primal residual, dual residual, and complementary gap.

References

  • Andersen et al. [2003] E. D. Andersen, C. Roos, and T. Terlaky. On implementing a primal-dual interior-point method for conic quadratic optimization. Mathematical Programming, 95(2):249–277, 2003.
  • Chen et al. [2022] Yushu Chen, Guangwen Yang, Lu Wang, Qingzhong Gan, Haipeng Chen, and Quanyong Xu. A fast algorithm for onboard atmospheric powered descent guidance. ArXiv, abs/2209.04157, 2022.
  • Faraut and Koranyi [1994] J. Faraut and A. Koranyi. Analysis on symmetric cones. Oxford Mathematical Monographs, 1994.
  • Faybusovich [1997a] Leonid Faybusovich. Linear systems in Jordan algebras and primal-dual interior-point algorithms. Journal of Computational and Applied Mathematics, 86(1):149–175, 1997a. ISSN 0377-0427. doi: 10.1016/S0377-0427(97)00153-2.
  • Faybusovich [1997b] Leonid Faybusovich. Euclidean Jordan algebras and interior-point algorithms. Positivity, 1:331–357, 12 1997b. doi: 10.1023/A:1009701824047.
  • Helmberg et al. [1996] Christoph Helmberg, Franz Rendl, Robert J. Vanderbei, and Henry Wolkowicz. An interior-point method for semidefinite programming. SIAM Journal on Optimization, 6(2):342–361, 1996. doi: 10.1137/0806020. URL https://doi.org/10.1137/0806020.
  • Kojima et al. [1989] Masakazu Kojima, Shinji Mizuno, and Akiko Yoshise. A polynomial-time algorithm for a class of linear complementarity problems. Mathematical programming, 44(1):1–26, 1989.
  • Kojima et al. [1997] Masakazu Kojima, Susumu Shindoh, and Shinji Hara. Interior-point methods for the monotone semidefinite linear complementarity problem in symmetric matrices. SIAM Journal on Optimization, 7(1):86–125, 1997. doi: 10.1137/S1052623494269035. URL https://doi.org/10.1137/S1052623494269035.
  • Miao et al. [2021] Xinyuan Miao, Yu Song, Zhiguo Zhang, and Shengping Gong. Successive convexification for ascent trajectory replanning of a multi-stage launch vehicle experiencing non-fatal dynamic faults. IEEE Transactions on Aerospace and Electronic Systems, pages 1–1, 2021. doi: 10.1109/TAES.2021.3133310.
  • Mohammad-Nezhad and Terlaky [2019] Ali Mohammad-Nezhad and Tamás Terlaky. Quadratic convergence to the optimal solution of second-order conic optimization without strict complementarity. Optimization Methods and Software, 34(5):960–990, 2019. doi: 10.1080/10556788.2018.1528249. URL https://doi.org/10.1080/10556788.2018.1528249.
  • Monteiro [1997] Renato D. C. Monteiro. Primal–dual path-following algorithms for semidefinite programming. SIAM Journal on Optimization, 7(3):663–678, 1997. doi: 10.1137/S1052623495293056.
  • Monteiro and Tsuchiya [2000] Renato D. C. Monteiro and T. Tsuchiya. Polynomial convergence of primal-dual algorithms for the second-order cone program based on the MZ-family of directions. Mathematical Programming, 88(1):61–83, 2000.
  • Monteiro and Zhang [1998] Renato DC Monteiro and Yin Zhang. A unified analysis for a class of long-step primal-dual path-following interior-point algorithms for semidefinite programming. Mathematical Programming, 81(3):281–299, 1998. doi: 10.1007/BF01580085.
  • Nemirovskii and Nesterov [1993] A. S. Nemirovskii and Y. E. Nesterov. Interior-point polynomial algorithms in convex programming. SIAM Studies in Applied Mathematics, Philadelphia, 1993.
  • Nesterov and Todd [1998] Yu. E. Nesterov and M. J. Todd. Primal-dual interior-point methods for self-scaled cones. SIAM Journal on Optimization, 8(2):324–364, 1998. doi: 10.1137/S1052623495290209. URL https://doi.org/10.1137/S1052623495290209.
  • Nesterov and Todd [1997] Yurii E Nesterov and Michael Jeremy Todd. Self-scaled barriers and interior-point methods for convex programming. Mathematics of Operations Research, 1997.
  • Peng et al. [2003] Jiming Peng, Cornelis Roos, and Tamás Terlaky. Self-Regularity: A New Paradigm for Primal-Dual Interior-Point Algorithms. Princeton University Press, 01 2003. ISBN 9781400825134. doi: 10.1515/9781400825134.
  • Potra and Wright [2000] Florian A. Potra and Stephen J. Wright. Interior-point methods. Journal of Computational and Applied Mathematics, 124(1):281–302, 2000. ISSN 0377-0427. doi: https://doi.org/10.1016/S0377-0427(00)00433-7. URL https://www.sciencedirect.com/science/article/pii/S0377042700004337. Numerical Analysis 2000. Vol. IV: Optimization and Nonlinear Equations.
  • Rangarajan [2006] Bharath Kumar Rangarajan. Polynomial convergence of infeasible-interior-point methods over symmetric cones. SIAM Journal on Optimization, 16(4):1211–1229, 2006. doi: 10.1137/040606557. URL https://doi.org/10.1137/040606557.
  • Skajaa et al. [2013] Anders Skajaa, Erling D Andersen, and Yinyu Ye. Warmstarting the homogeneous and self-dual interior point method for linear and conic quadratic problems. Mathematical Programming Computation, 5(1):1–25, 2013.
  • Szmuk et al. [2020] Michael Szmuk, Taylor P. Reynolds, and Behçet Açıkmeşe. Successive convexification for real-time six-degree-of-freedom powered descent guidance with state-triggered constraints. Journal of Guidance, Control, and Dynamics, 43(8):1399–1413, 2020. doi: 10.2514/1.G004549. URL https://doi.org/10.2514/1.G004549.
  • Terlaky et al. [2002] Tamás Terlaky, Cornelis Roos, and Jiming Peng. Self-regularity: A New Paradigm for Primal-dual Interior-point Algorithms (Princeton Series in Applied Mathematics). Princeton University Press, 2002.
  • Tsuchiya [1999] Takashi Tsuchiya. A convergence analysis of the scaling-invariant primal-dual path-following algorithms for second-order cone programming. Optimization Methods and Software, 11(1-4):141–182, 1999. doi: 10.1080/10556789908805750.
  • Wang [2003] Bixiang Wang. Implementation of interior point methods for second order conic optimization. PhD thesis, McMaster University, 2003.
  • Yildirim and Wright [2002] E. Alper Yildirim and Stephen J. Wright. Warm-start strategies in interior-point methods for linear programming. SIAM Journal on Optimization, 12(3):782–810, 2002. doi: 10.1137/S1052623400369235.
  • Zhang [1998] Yin Zhang. On extending some primal–dual interior-point algorithms from linear programming to semidefinite programming. SIAM Journal on Optimization, 8(2):365–386, 1998. doi: 10.1137/S1052623495296115. URL https://doi.org/10.1137/S1052623495296115.