
A non-monotone smoothing Newton algorithm for solving the system of generalized absolute value equations

Cairong Chen Supported partially by the National Natural Science Foundation of China (Grant No. 11901024) and the Natural Science Foundation of Fujian Province (Grant No. 2021J01661). Email address: cairongchen@fjnu.edu.cn. School of Mathematics and Statistics, FJKLMAA and Center for Applied Mathematics of Fujian Province, Fujian Normal University, Fuzhou, 350007, P.R. China. Dongmei Yu Corresponding author. Supported partially by the Natural Science Foundation of Liaoning Province (Nos. 2020-MS-301 and 2019-BS-118), the Liaoning Provincial Department of Education (Nos. LJ2020ZD002 and LJ2019ZL001) and the China Postdoctoral Science Foundation (No. 2019M650449). Email address: yudongmei1113@163.com. Institute for Optimization and Decision Analytics, Liaoning Technical University, Fuxin, 123000, P.R. China. Deren Han Supported partially by the National Natural Science Foundation of China (Grant Nos. 11625105 and 11926358). Email address: handr@buaa.edu.cn. LMIB of the Ministry of Education, School of Mathematical Sciences, Beihang University, Beijing, 100191, P.R. China. Changfeng Ma Supported partially by the National Key Research and Development Program of China (No. 2019YFC0312003). Email address: macf@fjnu.edu.cn. School of Mathematics and Statistics, FJKLMAA and Center for Applied Mathematics of Fujian Province, Fujian Normal University, Fuzhou, 350007, P.R. China.

Abstract: The system of generalized absolute value equations (GAVE) has attracted increasing attention in the optimization community. In this paper, by introducing a smoothing function, we develop a smoothing Newton algorithm with non-monotone line search to solve the GAVE. We show that the non-monotone algorithm is globally and locally quadratically convergent under an assumption weaker than those used in most existing algorithms for solving the GAVE. Numerical results are given to demonstrate the viability and efficiency of the approach.

2000 Mathematics Subject Classification. 65F10, 65H10, 90C30

Keywords. Generalized absolute value equations; Smoothing function; Smoothing Newton algorithm; Non-monotone line search; Global and local quadratic convergence.

1 Introduction

The system of generalized absolute value equations (GAVE) is to find a vector $x\in\mathbb{R}^{n}$ such that

$Ax + B|x| - b = 0,$ (1.1)

where $A\in\mathbb{R}^{n\times n}$ and $0\neq B\in\mathbb{R}^{n\times n}$ are two known matrices, $b\in\mathbb{R}^{n}$ is a known vector, and $|x|$ denotes the componentwise absolute value of $x\in\mathbb{R}^{n}$. To the best of our knowledge, GAVE (1.1) was first introduced by Rohn in [34] and further investigated in [30, 20, 12, 26, 15, 28] and references therein. Obviously, when $B=-I$ with $I$ being the identity matrix, GAVE (1.1) reduces to the system of absolute value equations (AVE)

$Ax - |x| - b = 0,$ (1.2)

which is the subject of numerous research works; see, e.g., [24, 48, 49, 3, 21, 11] and references therein.

GAVE (1.1) and AVE (1.2) have attracted considerable attention in the field of optimization for almost twenty years, and the primary reason is that they are closely related to the linear complementarity problem (LCP) [24, 30] and the horizontal LCP (HLCP) [26], which encompass many mathematical programming problems and have many practical applications [6, 27]. In addition, GAVE (1.1) and AVE (1.2) are also bound up with the system of linear interval equations [33].

Due to the combinatorial character introduced by the absolute value operator, solving GAVE (1.1) is generally NP-hard [20, Proposition 2]. Moreover, if GAVE (1.1) is solvable, checking whether it has a unique solution or multiple solutions is NP-complete [30, Proposition 2.1]. Recently, GAVE (1.1) and AVE (1.2) have been extensively investigated in the literature, and the main research effort can be summarized in the following two aspects.

On the theoretical side, one of the main branches is to investigate conditions for the existence, non-existence and uniqueness of solutions of GAVE (1.1) or AVE (1.2); see, e.g., [34, 26, 44, 24, 13, 30, 12, 45, 43] and references therein. Specifically, the following necessary and sufficient conditions ensuring the existence and uniqueness of the solution of GAVE (1.1) can be found in [26, 45] (see section 2 for the definition of the column $\mathcal{W}$-property).

Theorem 1.1.

([26, Theorem 1]) The following statements are equivalent:

  • (i)

    GAVE (1.1) has a unique solution for any $b\in\mathbb{R}^{n}$;

  • (ii)

    $\{A+B, A-B\}$ has the column $\mathcal{W}$-property;

  • (iii)

    for arbitrary nonnegative diagonal matrices $D_{1}, D_{2}\in\mathbb{R}^{n\times n}$ with $D_{1}+D_{2}>0$,

    $\det\left[(A+B)D_{1}+(A-B)D_{2}\right]\neq 0;$

  • (iv)

    $A+B$ is nonsingular and $\{I, (A+B)^{-1}(A-B)\}$ has the column $\mathcal{W}$-property.

Theorem 1.2.

([45, Theorem 3.2]) GAVE (1.1) has a unique solution for any $b\in\mathbb{R}^{n}$ if and only if the matrix $A+BD$ is nonsingular for every diagonal matrix $D=\operatorname{diag}(d_{i})$ with $d_{i}\in[-1,1]$.

It is easy to conclude that Theorem 1.1 and Theorem 1.2 together imply that $\{A+B, A-B\}$ has the column $\mathcal{W}$-property if and only if the matrix $A+BD$ is nonsingular for every diagonal matrix $D=\operatorname{diag}(d_{i})$ with $d_{i}\in[-1,1]$ (see Lemma 2.3 for more details).
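This equivalence is also easy to probe numerically. Since the entry $d_{i}$ enters only the $i$th column of $A+BD$, $\det(A+BD)$ is affine in each $d_{i}$, so its extreme values over the box $d_{i}\in[-1,1]$ are attained at the vertices $d_{i}\in\{-1,1\}$; these vertices are exactly the column representative matrices of $\{A+B, A-B\}$. For small $n$ the condition can therefore be checked exhaustively. The following Python/NumPy sketch is our illustration (cost $2^{n}$, not part of the paper's experiments; the helper name `has_column_w_property` is ours):

```python
import itertools
import numpy as np

def has_column_w_property(A, B):
    """Check the column W-property of {A+B, A-B} by enumerating all 2^n
    column representatives, i.e., all matrices A + B*diag(s) with s_i = +-1.
    By Lemma 2.3, this is equivalent to A + BD being nonsingular for every
    D = diag(d_i) with d_i in [-1, 1]."""
    n = A.shape[0]
    dets = [np.linalg.det(A + B @ np.diag(s))
            for s in itertools.product([-1.0, 1.0], repeat=n)]
    # All determinants must share one nonzero sign.
    return all(d > 0 for d in dets) or all(d < 0 for d in dets)
```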

On the numerical side, there are various algorithms for solving AVE (1.2) or GAVE (1.1). For example, Mangasarian proposed a concave minimization method [21], a generalized Newton method [22] and a successive linear programming method [23] for solving AVE (1.2). Zamani and Hladík proposed a new concave minimization algorithm for AVE (1.2) [48], which remedies a deficiency of the method proposed in [21]. Zainali and Lotfi modified the generalized Newton method and developed a stable and quadratically convergent method for AVE (1.2) [47]. Cruz et al. proposed an inexact semi-smooth Newton method for AVE (1.2) [7]. Shahsavari and Ketabchi proposed two types of proximal algorithms to solve AVE (1.2) [36]. Haghani introduced a generalized Traub's method for AVE (1.2) [11]. Ke and Ma proposed an SOR-like iteration method for AVE (1.2) [18]. Caccetta et al. proposed a smoothing Newton method for AVE (1.2) [3]. Saheya et al. summarized several systematic ways of constructing smoothing functions and proposed a unified neural network model for solving AVE (1.2) [35]. Zhang and Wei proposed a generalized Newton method which combines the semismooth and the smoothing Newton steps for AVE (1.2) [49]. In [19], Lian et al. further considered the generalized Newton method for GAVE (1.1) and presented convergence conditions weaker than those in [22, 15]. Wang et al. proposed modified Newton-type iteration methods for GAVE (1.1) [42]. Zhou et al. established Newton-based matrix splitting methods for GAVE (1.1) [51]. Jiang and Zhang proposed a smoothing-type algorithm for GAVE (1.1) [29]. Tang and Zhou proposed a quadratically convergent descent method for GAVE (1.1) [40]. Hu et al. proposed a generalized Newton method for absolute value equations associated with second order cones (SOCAVE), an extension of GAVE (1.1) [15]. For more numerical algorithms, one can refer to [1, 4, 10, 8, 46, 17, 25] and references therein.

By looking into the mathematical form of GAVE (1.1), the non-differentiability is caused by the absolute value operator. Smoothing algorithms have been successfully applied to solve GAVE (1.1) [29, 40]; however, the methods proposed in [29, 40] use monotone line search techniques. Recently, great attention has been paid to smoothing algorithms with non-monotone line search; see, e.g., [39, 38, 16, 52] and references therein. Non-monotone line search schemes can improve the likelihood of finding a global optimum and can improve the convergence speed in cases where a monotone line search scheme is forced to creep along the bottom of a narrow curved valley [50]. This motivates us to develop a non-monotone smoothing Newton algorithm for solving GAVE (1.1). Our work here is inspired by recent studies on the weighted complementarity problem [39, 41].

The rest of this paper is organized as follows. In section 2, we provide some concepts and results used throughout the paper. In section 3, we develop a non-monotone smoothing Newton algorithm for solving GAVE (1.1), while section 4 is devoted to discussing the convergence. Numerical experiments are given in section 5. Finally, section 6 concludes this paper.

Notation. $\mathbb{R}^{m\times m}$ is the set of all $m\times m$ real matrices, $\mathbb{R}^{n}=\mathbb{R}^{n\times 1}$, and $\mathbb{R}=\mathbb{R}^{1}$. $\mathbb{R}_{+}$ and $\mathbb{R}_{++}$ denote the nonnegative and positive real numbers, respectively. $I_{n}$ (or simply $I$ if its dimension is clear from the context) is the $n\times n$ identity matrix. The superscript "$\cdot^{\top}$" denotes the transpose. For $X\in\mathbb{R}^{m\times n}$, $X_{i,j}$ refers to its $(i,j)$th entry, and $|X|\in\mathbb{R}^{m\times n}$ is the matrix with $(i,j)$th entry $|X_{i,j}|$. The inequality $X\leq Y$ means $X_{i,j}\leq Y_{i,j}$ for all $(i,j)$, and similarly for $X<Y$. We use $t\downarrow 0$ to denote a positive scalar $t$ tending to $0$, and $\alpha=O(\beta)$ to mean that $\frac{\alpha}{\beta}$ is bounded uniformly as $\beta\rightarrow 0$. For any $a\in\mathbb{R}$, we define
$$\operatorname{sgn}(a):=\begin{cases}1,&\text{if }a>0,\\ 0,&\text{if }a=0,\\ -1,&\text{if }a<0.\end{cases}$$
We denote by $\operatorname{diag}(x_{i})$ the diagonal matrix whose $i$th diagonal element is $x_{i}$ and define $D(x):=\operatorname{diag}(\operatorname{sgn}(x_{i}))$. The symbol $\|\cdot\|$ stands for the $2$-norm. For a matrix $P\in\mathbb{R}^{m\times n}$, $\sigma_{\min}(P)$ and $\sigma_{\max}(P)$ denote its smallest and largest singular values, respectively. For a differentiable mapping $G:\mathbb{V}\subset\mathbb{R}^{n}\rightarrow\mathbb{R}^{n}$, $G^{\prime}(x)$ denotes the Jacobian of $G$ at $x\in\mathbb{V}$ and $\nabla G(x)=G^{\prime}(x)^{\top}$ denotes the gradient of $G$ at $x$. For $x\in\mathbb{R}^{n}$, we also write $x=\operatorname{vec}(x_{i})$.

2 Preliminaries

In this section, we collect some basic notions as well as corresponding assertions, which are useful in this paper.

Definition 2.1.

([37]) Let $\mathcal{M}:=\{M,N\}$ be a set of matrices $M, N\in\mathbb{R}^{n\times n}$. A matrix $R\in\mathbb{R}^{n\times n}$ is called a column representative of $\mathcal{M}$ if

$R_{\cdot j}\in\{M_{\cdot j}, N_{\cdot j}\},\quad j=1,2,\cdots,n,$

where $R_{\cdot j}$, $M_{\cdot j}$ and $N_{\cdot j}$ denote the $j$th column of $R$, $M$ and $N$, respectively. $\mathcal{M}$ is said to have the column $\mathcal{W}$-property if the determinants of all column representative matrices of $\mathcal{M}$ are all positive or all negative.

Definition 2.2.

([33]) An interval matrix $A^{I}$ is defined by $A^{I}:=[\underline{A},\bar{A}]=\{X:\underline{A}\leq X\leq\bar{A}\}$. A square interval matrix $A^{I}$ is called regular if each $X\in A^{I}$ is nonsingular.

Definition 2.3.

(See, e.g., [9]) The classic (one-sided) directional derivative of a function $f:\mathbb{R}^{n}\rightarrow\mathbb{R}$ at $x$ in the direction $y$ is defined by

$f^{\prime}(x;y)=\lim_{t\downarrow 0}\frac{f(x+ty)-f(x)}{t},$

provided that the limit exists. Accordingly, $F^{\prime}(x;y)=[F^{\prime}_{1}(x;y),\cdots,F^{\prime}_{m}(x;y)]^{\top}$ denotes the directional derivative of the vector-valued function $F:\mathbb{R}^{n}\rightarrow\mathbb{R}^{m}$.

Definition 2.4.

([9]) A vector-valued function $F:\mathbb{R}^{n}\rightarrow\mathbb{R}^{m}$ is said to be Lipschitz continuous on a set $\mathcal{S}\subset\mathbb{R}^{n}$ if there is a constant $L>0$ such that

$\|F(x)-F(y)\|\leq L\|x-y\|,\quad x,y\in\mathcal{S}.$

Moreover, $F$ is called locally Lipschitz continuous on $\mathbb{R}^{n}$ if it is Lipschitz continuous on every compact subset $\mathcal{S}\subset\mathbb{R}^{n}$.

If $F:\mathbb{R}^{n}\rightarrow\mathbb{R}^{m}$ is locally Lipschitz continuous, then, by Rademacher's theorem, $F$ is differentiable almost everywhere [31]. Let $D_{F}$ be the set where $F$ is differentiable; then the generalized Jacobian of $F$ at $x$ in the sense of Clarke [5] is

$\partial F(x)=\operatorname{co}\left\{\lim_{x^{(k)}\in D_{F},\ x^{(k)}\rightarrow x}\nabla F(x^{(k)})\right\},$

where "$\operatorname{co}$" denotes the convex hull.

Definition 2.5.

([32]) A locally Lipschitz continuous vector-valued function $F:\mathbb{R}^{n}\rightarrow\mathbb{R}^{m}$ is called semismooth at $x$ if

$\lim_{V\in\partial F(x+td^{\prime}),\ d^{\prime}\rightarrow d,\ t\downarrow 0}\{Vd^{\prime}\}$

exists for any $d\in\mathbb{R}^{n}$.

Lemma 2.1.

([32]) Let $F:\mathbb{R}^{n}\rightarrow\mathbb{R}^{m}$. If $F$ is semismooth at $x$, then the directional derivative $F^{\prime}(x;d)$ exists for any $d\in\mathbb{R}^{n}$.

Lemma 2.2.

([32]) Suppose that $F:\mathbb{R}^{n}\rightarrow\mathbb{R}^{m}$ is semismooth at $x$. Then $F$ is called strongly semismooth at $x$ if

$Vd-F^{\prime}(x;d)=O(\|d\|^{2})$

for any $V\in\partial F(x+d)$ as $d\rightarrow 0$.

Throughout the rest of this paper, we always assume that the following assumption holds.

Assumption 2.1.

The matrices $A$ and $B$ are such that $\{A+B, A-B\}$ has the column $\mathcal{W}$-property.

It is known that, if Assumption 2.1 holds, GAVE (1.1) has a unique solution for any $b\in\mathbb{R}^{n}$ [26]. In addition, we have the following lemma, which is needed in the subsequent discussion.

Lemma 2.3.

Assumption 2.1 holds if and only if the matrix $A+BD$ is nonsingular for every diagonal matrix $D=\operatorname{diag}(d_{i})$ with $d_{i}\in[-1,1]$ $(i=1,2,\cdots,n)$.

Proof.

The result can be derived directly from Theorem 1.1 and Theorem 1.2. Indeed, it follows from Theorem 1.1 that Assumption 2.1 holds if and only if

$\det\left[(A+B)\bar{D}+(A-B)\tilde{D}\right]\neq 0$

for any nonnegative diagonal matrices $\bar{D},\tilde{D}\in\mathbb{R}^{n\times n}$ with $\bar{D}+\tilde{D}>0$, that is,

$\det\left[A+B(\bar{D}-\tilde{D})(\bar{D}+\tilde{D})^{-1}\right]\neq 0$ for any nonnegative diagonal matrices $\bar{D},\tilde{D}\in\mathbb{R}^{n\times n}$ with $\bar{D}+\tilde{D}>0$. (2.1)

Let

$\mathcal{D}_{1}:=\{D\in\mathbb{R}^{n\times n}:D=\operatorname{diag}(d_{i}),\ d_{i}\in[-1,1]\ (i=1,2,\cdots,n)\},$
$\mathcal{D}_{2}:=\{D\in\mathbb{R}^{n\times n}:D=(\bar{D}-\tilde{D})(\bar{D}+\tilde{D})^{-1},\ \bar{D}=\operatorname{diag}(\bar{d}_{i})\geq 0,\ \tilde{D}=\operatorname{diag}(\tilde{d}_{i})\geq 0,\ \bar{D}+\tilde{D}>0\}.$

Then, on the one hand, for any $D\in\mathcal{D}_{2}$, we have $|D_{i,i}|=\frac{|\bar{d}_{i}-\tilde{d}_{i}|}{|\bar{d}_{i}+\tilde{d}_{i}|}\leq 1$. Thus, $\mathcal{D}_{2}\subseteq\mathcal{D}_{1}$. On the other hand, for any $D=\operatorname{diag}(d_{i})\in\mathcal{D}_{1}$, each $d_{i}\in[-1,1]$ can be written as $d_{i}=\frac{\bar{d}_{i}-\tilde{d}_{i}}{\bar{d}_{i}+\tilde{d}_{i}}$ with

$\begin{cases}\bar{d}_{i}>0,\ \tilde{d}_{i}=0,&\text{if }d_{i}=1;\\ \bar{d}_{i}=0,\ \tilde{d}_{i}>0,&\text{if }d_{i}=-1;\\ \bar{d}_{i}=\frac{(1+d_{i})\tilde{d}_{i}}{1-d_{i}}>0\ \text{(for an arbitrary }\tilde{d}_{i}>0\text{)},&\text{if }d_{i}\in(-1,1).\end{cases}$

Hence, $\mathcal{D}_{1}\subseteq\mathcal{D}_{2}$. It follows from the above discussion that $\mathcal{D}_{1}=\mathcal{D}_{2}$. Then (2.1) is equivalent to

$\det(A+BD)\neq 0$

for any $D=\operatorname{diag}(d_{i})$ with $d_{i}\in[-1,1]$ $(i=1,2,\cdots,n)$. This completes the proof. ∎

Remark 2.1.

For symmetric matrices $A$ and $B$, under the assumption that $\sigma_{\min}(A)>\sigma_{\max}(B)$, the authors of [2, Lemma 1] proved the nonsingularity of $A+BD$ for any diagonal matrix $D$ whose elements are equal to $1$, $0$ or $-1$. We should mention that the symmetry of the matrices $A$ and $B$ can be relaxed there, and our result here is more general than theirs.

Remark 2.2.

In [29], the authors used the assumption that $\sigma_{\min}(A)>\sigma_{\max}(B)$, while in [40], the authors assumed that the interval matrix $[A-|B|, A+|B|]$ is regular. Regularity of $[A-|B|, A+|B|]$ is weaker than $\sigma_{\min}(A)>\sigma_{\max}(B)$; examples can be found in [49, Examples 2.1 and 2.3]. In addition, it is easy to prove that regularity of $[A-|B|, A+|B|]$ implies Assumption 2.1, but the converse is not true. For instance, let

$A=\left[\begin{array}{cc}1001&-496\\ -994&501\end{array}\right],\quad B=\left[\begin{array}{cc}999&-494\\ -995&499\end{array}\right];$

then $\{A+B, A-B\}$ has the column $\mathcal{W}$-property [26] while $[A-|B|, A+|B|]$ is not regular. Indeed, the singular matrix $\left[\begin{array}{cc}2&-2\\ -2&2\end{array}\right]$ belongs to $[A-|B|, A+|B|]$. In conclusion, our Assumption 2.1 is more general than the assumptions used in [29, 40].
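Both claims of this remark can be verified numerically. A short check of the $2\times 2$ example (Python/NumPy, our illustration only):

```python
import itertools
import numpy as np

A = np.array([[1001., -496.], [-994., 501.]])
B = np.array([[999., -494.], [-995., 499.]])

# Column W-property of {A+B, A-B}: all four column-representative
# determinants, i.e., det(A + B*diag(s)) with s_i = +-1, share one sign.
dets = [np.linalg.det(A + B @ np.diag(s))
        for s in itertools.product([1.0, -1.0], repeat=2)]
print(dets)                # approx [30890, 22, 2990, 6]: all positive
assert all(d > 0 for d in dets)

# The interval matrix [A - |B|, A + |B|] is not regular: it contains
# the singular matrix X.
X = np.array([[2., -2.], [-2., 2.]])
assert np.all(A - np.abs(B) <= X) and np.all(X <= A + np.abs(B))
print(np.linalg.det(X))    # 0.0
```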

3 The algorithm

In this section, we develop a non-monotone smoothing Newton algorithm for solving GAVE (1.1). To this end, we first consider an equivalent reformulation of GAVE (1.1) by introducing a smoothing function for the absolute value operator.

3.1 A smoothing function for $|x|$ with $x\in\mathbb{R}$

In this subsection, we consider a smoothing function for $|x|$ with $x\in\mathbb{R}$ and discuss some of its properties, which lay the foundation for the next subsection.

Since $|x|$ is not differentiable at $x=0$, in order to overcome this hurdle in analysis and application, researchers have constructed numerous smoothing functions for it [35]. In this paper, we adopt the smoothing function $\phi:\mathbb{R}^{2}\rightarrow\mathbb{R}$ defined by

$\phi(\mu,x)=\sqrt{\mu^{2}+x^{2}}-\mu,$ (3.1)

which can be derived from the perspective of the convex conjugate [35].
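As a quick numerical sanity check (our illustration in Python/NumPy; the paper's experiments in section 5 were run in MATLAB), the following sketch verifies that $\phi(0,x)=|x|$, that $0\leq|x|-\phi(\mu,x)\leq\mu$ for $\mu>0$, and that the partial derivatives stated in Proposition 3.1 (ii) below agree with finite differences:

```python
import numpy as np

def phi(mu, x):
    """Smoothing function (3.1): phi(mu, x) = sqrt(mu^2 + x^2) - mu."""
    return np.sqrt(mu**2 + x**2) - mu

xs = np.linspace(-2.0, 2.0, 9)
assert np.allclose(phi(0.0, xs), np.abs(xs))        # phi(0, x) = |x|

mu = 1e-2
gap = np.abs(xs) - phi(mu, xs)                      # 0 <= |x| - phi <= mu
assert np.all(gap >= -1e-12) and np.all(gap <= mu + 1e-12)

h, x0 = 1e-7, 0.3                                   # finite-difference check
d_mu = (phi(mu + h, x0) - phi(mu - h, x0)) / (2 * h)
d_x  = (phi(mu, x0 + h) - phi(mu, x0 - h)) / (2 * h)
r = np.sqrt(mu**2 + x0**2)
assert np.isclose(d_mu, mu / r - 1, atol=1e-6)
assert np.isclose(d_x,  x0 / r,     atol=1e-6)
```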

In the following, we give some properties related to the smoothing function (3.1).

Proposition 3.1.

Let $\phi$ be defined by (3.1). Then:

  • (i)

    $\phi(0,x)=|x|$;

  • (ii)

    $\phi$ is continuously differentiable on $\mathbb{R}^{2}\setminus\{(0,0)\}$, and for $(\mu,x)\neq(0,0)$ we have

    $\frac{\partial\phi}{\partial\mu}=\frac{\mu}{\sqrt{\mu^{2}+x^{2}}}-1\quad\text{and}\quad\frac{\partial\phi}{\partial x}=\frac{x}{\sqrt{\mu^{2}+x^{2}}};$

  • (iii)

    $\phi$ is a convex function on $\mathbb{R}^{2}$, i.e., $\phi(\alpha(\bar{\mu},\bar{x})+(1-\alpha)(\tilde{\mu},\tilde{x}))\leq\alpha\phi(\bar{\mu},\bar{x})+(1-\alpha)\phi(\tilde{\mu},\tilde{x})$ for all $(\bar{\mu},\bar{x}),(\tilde{\mu},\tilde{x})\in\mathbb{R}^{2}$ and $\alpha\in[0,1]$;

  • (iv)

    $\phi$ is Lipschitz continuous on $\mathbb{R}^{2}$;

  • (v)

    $\phi$ is strongly semismooth on $\mathbb{R}^{2}$.

Proof.

The proofs of (i) and (ii) are trivial.

Now we turn to result (iii). For any $(\bar{\mu},\bar{x}),(\tilde{\mu},\tilde{x})\in\mathbb{R}^{2}$ and $\alpha\in[0,1]$, we have

$\phi(\alpha(\bar{\mu},\bar{x})+(1-\alpha)(\tilde{\mu},\tilde{x}))-\alpha\phi(\bar{\mu},\bar{x})-(1-\alpha)\phi(\tilde{\mu},\tilde{x})$
$\qquad=\sqrt{[\alpha\bar{\mu}+(1-\alpha)\tilde{\mu}]^{2}+[\alpha\bar{x}+(1-\alpha)\tilde{x}]^{2}}-\alpha\bar{\mu}-(1-\alpha)\tilde{\mu}-\alpha\sqrt{\bar{\mu}^{2}+\bar{x}^{2}}+\alpha\bar{\mu}-(1-\alpha)\sqrt{\tilde{\mu}^{2}+\tilde{x}^{2}}+(1-\alpha)\tilde{\mu}$
$\qquad=\sqrt{[\alpha\bar{\mu}+(1-\alpha)\tilde{\mu}]^{2}+[\alpha\bar{x}+(1-\alpha)\tilde{x}]^{2}}-\alpha\sqrt{\bar{\mu}^{2}+\bar{x}^{2}}-(1-\alpha)\sqrt{\tilde{\mu}^{2}+\tilde{x}^{2}}.$ (3.2)

On the one hand,

$\left(\sqrt{[\alpha\bar{\mu}+(1-\alpha)\tilde{\mu}]^{2}+[\alpha\bar{x}+(1-\alpha)\tilde{x}]^{2}}\right)^{2}=\alpha^{2}(\bar{\mu}^{2}+\bar{x}^{2})+(1-\alpha)^{2}(\tilde{\mu}^{2}+\tilde{x}^{2})+2\alpha(1-\alpha)(\bar{\mu}\tilde{\mu}+\bar{x}\tilde{x}).$ (3.3)

On the other hand, by the Cauchy-Schwarz inequality,

$\left[\alpha\sqrt{\bar{\mu}^{2}+\bar{x}^{2}}+(1-\alpha)\sqrt{\tilde{\mu}^{2}+\tilde{x}^{2}}\right]^{2}=\alpha^{2}(\bar{\mu}^{2}+\bar{x}^{2})+(1-\alpha)^{2}(\tilde{\mu}^{2}+\tilde{x}^{2})+2\alpha(1-\alpha)\sqrt{(\bar{\mu}^{2}+\bar{x}^{2})(\tilde{\mu}^{2}+\tilde{x}^{2})}\geq\alpha^{2}(\bar{\mu}^{2}+\bar{x}^{2})+(1-\alpha)^{2}(\tilde{\mu}^{2}+\tilde{x}^{2})+2\alpha(1-\alpha)|\bar{\mu}\tilde{\mu}+\bar{x}\tilde{x}|.$ (3.4)

Then the result (iii) follows from (3.2)-(3.4).

Consider result (iv). For any $(\bar{\mu},\bar{x}),(\tilde{\mu},\tilde{x})\in\mathbb{R}^{2}$, we have

$|\phi(\bar{\mu},\bar{x})-\phi(\tilde{\mu},\tilde{x})|=\left|\|(\bar{\mu},\bar{x})\|-\bar{\mu}-\|(\tilde{\mu},\tilde{x})\|+\tilde{\mu}\right|\leq\left|\|(\bar{\mu},\bar{x})\|-\|(\tilde{\mu},\tilde{x})\|\right|+|\bar{\mu}-\tilde{\mu}|\leq\|(\bar{\mu}-\tilde{\mu},\bar{x}-\tilde{x})\|+\|(\bar{\mu}-\tilde{\mu},\bar{x}-\tilde{x})\|=2\|(\bar{\mu}-\tilde{\mu},\bar{x}-\tilde{x})\|.$

Hence, $\phi$ is Lipschitz continuous with Lipschitz constant $2$.

Finally, we prove result (v). It follows from result (iii) that $\phi$ is semismooth on $\mathbb{R}^{2}$ [32]. Note that $\phi$ is arbitrarily many times differentiable at every $(\mu,x)\in\mathbb{R}^{2}$ with $(\mu,x)\neq(0,0)$, and hence strongly semismooth at these points. Therefore, it suffices to show that $\phi$ is strongly semismooth at $(0,0)$. For any $(\mu,x)\in\mathbb{R}^{2}\backslash\{(0,0)\}$, $\phi$ is differentiable at $(\mu,x)$, and hence $\partial\phi(\mu,x)=\nabla\phi(\mu,x)=\left[\frac{\partial\phi(\mu,x)}{\partial\mu},\frac{\partial\phi(\mu,x)}{\partial x}\right]^{\top}$. In addition, by Lemma 2.1, the classic directional derivative of $\phi$ at $(0,0)$ exists and

$\phi^{\prime}((0,0);(\mu,x))=\lim_{t\downarrow 0}\frac{\phi((0,0)+t(\mu,x))-\phi(0,0)}{t}=\phi(\mu,x),$

from which we have

$\phi(\mu,x)-\left[\frac{\partial\phi(\mu,x)}{\partial\mu},\frac{\partial\phi(\mu,x)}{\partial x}\right]\left[\begin{array}{c}\mu\\ x\end{array}\right]=\sqrt{\mu^{2}+x^{2}}-\mu-\left(\frac{\mu}{\sqrt{\mu^{2}+x^{2}}}-1\right)\mu-\frac{x}{\sqrt{\mu^{2}+x^{2}}}\,x=0=O(\|(\mu,x)\|^{2}).$

Then the result follows from Lemma 2.2. ∎

3.2 The reformulation of GAVE (1.1)

In this subsection, building on the previous one, we give a reformulation of GAVE (1.1) and explore some of its properties.

Let $z:=(\mu,x)\in\mathbb{R}\times\mathbb{R}^{n}$. We first define the function $H:\mathbb{R}\times\mathbb{R}^{n}\rightarrow\mathbb{R}\times\mathbb{R}^{n}$ as

$H(z):=\left[\begin{array}{c}\mu\\ Ax+B\Phi(\mu,x)-b\end{array}\right],$ (3.5)

where $\Phi:\mathbb{R}^{n+1}\rightarrow\mathbb{R}^{n}$ is defined by

$\Phi(\mu,x):=\left[\phi(\mu,x_{1}),\phi(\mu,x_{2}),\cdots,\phi(\mu,x_{n})\right]^{\top}$

with $\phi$ being the smoothing function given in (3.1). According to Proposition 3.1 (i), it holds that

$H(z)=0\quad\Leftrightarrow\quad\mu=0~\text{and}~x~\text{is a solution of GAVE (1.1)}.$ (3.6)

It follows from (3.6) that solving GAVE (1.1) is equivalent to solving the system of nonlinear equations $H(z)=0$. Before presenting the algorithm for solving $H(z)=0$, we first give some properties of the function $H$.
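Before turning to those properties, note that (3.5) transcribes directly into code. A minimal sketch (Python/NumPy, our illustration; `A`, `B`, `b` are the data of GAVE (1.1)):

```python
import numpy as np

def Phi(mu, x):
    """Componentwise smoothing of |x|: Phi_i(mu, x) = sqrt(mu^2 + x_i^2) - mu."""
    return np.sqrt(mu**2 + x**2) - mu

def H(z, A, B, b):
    """The map (3.5): H(mu, x) = [mu; A x + B Phi(mu, x) - b]."""
    mu, x = z[0], z[1:]
    return np.concatenate(([mu], A @ x + B @ Phi(mu, x) - b))
```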

Proposition 3.2.

Let $H$ be defined by (3.5). Then:

  • (i)

    $H$ is continuously differentiable on $\mathbb{R}^{n+1}\backslash\{0\}$; when $\mu=0$ and $x_{i}\neq 0$ for all $i=1,2,\cdots,n$, or $\mu\neq 0$, the Jacobian of $H$ is given by

    $H^{\prime}(z)=\left[\begin{array}{cc}1&0\\ BV_{1}&A+BV_{2}\end{array}\right]$ (3.7)

    with

    $V_{1}=\operatorname{vec}\left(\frac{\mu}{\sqrt{\mu^{2}+x_{i}^{2}}}-1\right),\quad V_{2}=\operatorname{diag}\left(\frac{x_{i}}{\sqrt{\mu^{2}+x_{i}^{2}}}\right);$ (3.8)

  • (ii)

    $H$ is strongly semismooth on $\mathbb{R}^{n+1}$.

Proof.

Result (i) follows from Proposition 3.1 (ii).

Now we turn to result (ii). Since $H$ is strongly semismooth on $\mathbb{R}^{n+1}$ if and only if each of its component functions $H_{i}$, $i=1,2,\cdots,n+1$, is strongly semismooth [32], and the composition of strongly semismooth functions is strongly semismooth [9, Theorem 19], result (ii) follows from Proposition 3.1 (v) and the fact that a continuously differentiable function with a Lipschitz continuous gradient is strongly semismooth [14]. ∎
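The Jacobian (3.7)-(3.8) is equally mechanical to assemble. A sketch consistent with the function `H` above (again our illustration, not the paper's code):

```python
import numpy as np

def H_jac(z, A, B):
    """The Jacobian (3.7)-(3.8) of H at z = (mu, x) with (mu, x_i) != (0, 0)."""
    mu, x = z[0], z[1:]
    n = x.size
    r = np.sqrt(mu**2 + x**2)      # r_i = sqrt(mu^2 + x_i^2)
    J = np.zeros((n + 1, n + 1))
    J[0, 0] = 1.0
    J[1:, 0] = B @ (mu / r - 1.0)  # B V_1, with V_1 = vec(mu/r_i - 1)
    J[1:, 1:] = A + B @ np.diag(x / r)  # A + B V_2, with V_2 = diag(x_i/r_i)
    return J
```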

3.3 The non-monotone smoothing Newton algorithm for GAVE (1.1)

Now we are in a position to develop a non-monotone smoothing Newton algorithm to solve the system of nonlinear equations $H(z)=0$, and hence GAVE (1.1).

Let $H(z)$ be given in (3.5) and define the merit function $\mathcal{M}:\mathbb{R}\times\mathbb{R}^{n}\rightarrow\mathbb{R}_{+}$ by

$\mathcal{M}(z):=\|H(z)\|^{2}.$

Clearly, solving the system of nonlinear equations $H(z)=0$ is equivalent to solving the unconstrained optimization problem

$\min_{z\in\mathbb{R}^{n+1}}~\mathcal{M}(z)$

whose optimal objective value is zero. We now propose a non-monotone smoothing Newton algorithm that solves $H(z)=0$ by minimizing the merit function $\mathcal{M}(z)$; it is described in Algorithm 1, and a runnable sketch follows the algorithm.

Algorithm 1 A non-monotone smoothing Newton algorithm (NSNA) for GAVE (1.1)
1:  Choose $\theta,\delta\in(0,1)$ and $z^{(0)}:=(\mu^{(0)},x^{(0)})\in\mathbb{R}_{++}\times\mathbb{R}^{n}$. Let $\mathcal{C}^{(0)}:=\mathcal{M}(z^{(0)})$. Choose $\gamma\in(0,1)$ such that $\beta^{(0)}=\gamma\mathcal{C}^{(0)}<\mu^{(0)}$ and $\gamma\mu^{(0)}<1$. Set $k:=0$.
2:  If $\|H(z^{(k)})\|=0$, stop. Otherwise, compute the search direction $\Delta z^{(k)}=(\Delta\mu^{(k)},\Delta x^{(k)})\in\mathbb{R}\times\mathbb{R}^{n}$ by solving the perturbed Newton system
$H^{\prime}(z^{(k)})\Delta z^{(k)}=-H(z^{(k)})+\beta^{(k)}e^{(1)},$ (3.9)
where $e^{(1)}=[1,0]^{\top}\in\mathbb{R}\times\mathbb{R}^{n}$. If $\Delta z^{(k)}$ satisfies
$\|H(z^{(k)}+\Delta z^{(k)})\|\leq\theta\|H(z^{(k)})\|,$ (3.10)
set $z^{(k+1)}:=z^{(k)}+\Delta z^{(k)}$ and go to step 4. Otherwise, go to step 3.
3:  Let $\alpha^{(k)}$ be the maximum of the values $1,\delta,\delta^{2},\cdots$ such that
$\mathcal{M}(z^{(k)}+\alpha^{(k)}\Delta z^{(k)})\leq\mathcal{C}^{(k)}-\gamma\|\alpha^{(k)}\Delta z^{(k)}\|^{2}.$ (3.11)
Set $z^{(k+1)}:=z^{(k)}+\alpha^{(k)}\Delta z^{(k)}$ and go to step 4.
4:  Compute $\mathcal{M}(z^{(k+1)})=\|H(z^{(k+1)})\|^{2}$ and set
$\mathcal{C}^{(k+1)}:=\frac{(\mathcal{C}^{(k)}+1)\mathcal{M}(z^{(k+1)})}{\mathcal{M}(z^{(k+1)})+1},\quad\beta^{(k+1)}:=\gamma\mathcal{C}^{(k+1)}.$ (3.12)
5:  Set $k:=k+1$ and go to step 2.
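To make the steps above concrete, a minimal self-contained sketch of Algorithm 1 in Python/NumPy is given below. It is our illustration under the parameter choices of section 5, not the authors' MATLAB code; `A`, `B`, `b` are the data of GAVE (1.1), and the dense `np.linalg.solve` in step 2 could be replaced by any suitable linear solver.

```python
import numpy as np

def nsna(A, B, b, x0, mu0=0.01, theta=0.2, delta=0.8, tol=1e-7, max_iter=100):
    """A sketch of Algorithm 1 (NSNA) for GAVE  A x + B|x| - b = 0."""
    n = b.size

    def H(mu, x):
        # The map (3.5) with Phi_i(mu, x) = sqrt(mu^2 + x_i^2) - mu.
        return np.concatenate(([mu], A @ x + B @ (np.sqrt(mu**2 + x**2) - mu) - b))

    def H_jac(mu, x):
        # The Jacobian (3.7)-(3.8); nonsingular for mu > 0 by Lemma 3.1.
        r = np.sqrt(mu**2 + x**2)
        J = np.zeros((n + 1, n + 1))
        J[0, 0] = 1.0
        J[1:, 0] = B @ (mu / r - 1.0)
        J[1:, 1:] = A + B @ np.diag(x / r)
        return J

    mu, x = mu0, np.asarray(x0, dtype=float).copy()
    C = np.sum(H(mu, x) ** 2)                               # C^(0) = M(z^(0))
    gamma = min(mu0 / (C + 1.0), 1.0 / (mu0 + 1.0), 1e-12)  # choice of Section 5
    beta = gamma * C

    for k in range(max_iter):
        if np.linalg.norm(A @ x + B @ np.abs(x) - b) <= tol:  # Res test of Section 5
            return x, k
        Hz = H(mu, x)
        # Step 2: perturbed Newton direction (3.9).
        rhs = -Hz
        rhs[0] += beta
        dz = np.linalg.solve(H_jac(mu, x), rhs)
        alpha = 1.0
        if np.linalg.norm(H(mu + dz[0], x + dz[1:])) > theta * np.linalg.norm(Hz):
            # Step 3: non-monotone line search (3.11); terminates by Theorem 3.1.
            while np.sum(H(mu + alpha * dz[0], x + alpha * dz[1:]) ** 2) \
                    > C - gamma * alpha**2 * np.dot(dz, dz):
                alpha *= delta
        mu, x = mu + alpha * dz[0], x + alpha * dz[1:]
        # Step 4: non-monotone reference value update (3.12).
        M_new = np.sum(H(mu, x) ** 2)
        C = (C + 1.0) * M_new / (M_new + 1.0)
        beta = gamma * C
    return x, max_iter
```

Under Assumption 2.1, the linear solve is well defined at every iterate because $\mu^{(k)}>0$ (Theorem 3.1) and $H^{\prime}(z^{(k)})$ is then nonsingular (Lemma 3.1 below).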

Remark 3.1.

The development of Algorithm 1 is inspired by the non-monotone smoothing Newton algorithm for the weighted complementarity problem [39] and the non-monotone Levenberg-Marquardt type method for the weighted nonlinear complementarity problem [41].

Before ending this section, we will show that Algorithm 1 is well-defined. To this end, we need the following lemma.

Lemma 3.1.

Let $H^{\prime}(z)$ be defined by (3.7) and (3.8). If Assumption 2.1 holds, then $H^{\prime}(z)$ is nonsingular at any $z=(\mu,x)\in\mathbb{R}_{++}\times\mathbb{R}^{n}$.

Proof.

From (3.7), we only need to show that $A+BV_{2}$ is nonsingular. Since Assumption 2.1 holds and, for $\mu>0$, $\left|\frac{x_{i}}{\sqrt{x_{i}^{2}+\mu^{2}}}\right|<1$ $(i=1,2,\cdots,n)$, the result follows immediately from Lemma 2.3. ∎

Then we have the following theorem.

Theorem 3.1.

If Assumption 2.1 holds, then Algorithm 1 is well defined and either terminates in finitely many steps or generates an infinite sequence $\{z^{(k)}\}$ satisfying $\mathcal{M}(z^{(k)})\leq\mathcal{C}^{(k)}$, $\mu^{(k)}>0$ and $\beta^{(k)}<\mu^{(k)}$ for all $k\geq 0$.

Proof.

We prove the assertion by mathematical induction. Suppose that $\mathcal{M}(z^{(k)})\leq\mathcal{C}^{(k)}$, $\mu^{(k)}>0$ and $\beta^{(k)}<\mu^{(k)}$ hold for some $k$. Since $\mu^{(k)}>0$, it follows from Lemma 3.1 that $H^{\prime}(z^{(k)})$ is nonsingular. Hence, $\Delta z^{(k)}$ is uniquely determined by (3.9). If $\|H(z^{(k)})\|=0$, then Algorithm 1 terminates. Otherwise, $\|H(z^{(k)})\|^{2}=\mathcal{M}(z^{(k)})\leq\mathcal{C}^{(k)}$ implies that $\mathcal{C}^{(k)}>0$, from which and the second equation in (3.12) we have $\beta^{(k)}=\gamma\mathcal{C}^{(k)}>0$. In the following, we divide the proof into three parts.

Firstly, we show that $\mu^{(k+1)}>0$. On the one hand, if $z^{(k+1)}$ is generated by step 2, it follows from (3.9) that $\mu^{(k+1)}=\mu^{(k)}+\Delta\mu^{(k)}=\mu^{(k)}+(-\mu^{(k)}+\beta^{(k)})=\beta^{(k)}>0$. On the other hand, if $z^{(k+1)}$ is generated by step 3, we first show that there exists at least one nonnegative integer $l$ satisfying (3.11). Suppose, on the contrary, that for every nonnegative integer $l$,

$\mathcal{M}(z^{(k)}+\delta^{l}\Delta z^{(k)})>\mathcal{C}^{(k)}-\gamma\|\delta^{l}\Delta z^{(k)}\|^{2},$ (3.13)

which together with $\mathcal{M}(z^{(k)})\leq\mathcal{C}^{(k)}$ gives

$\frac{\mathcal{M}(z^{(k)}+\delta^{l}\Delta z^{(k)})-\mathcal{M}(z^{(k)})}{\delta^{l}}+\gamma\delta^{l}\|\Delta z^{(k)}\|^{2}>0.$

Since $\mathcal{M}$ is differentiable at $z^{(k)}$ and $\delta\in(0,1)$, letting $l\rightarrow+\infty$ in the above inequality yields

$\mathcal{M}^{\prime}(z^{(k)})\Delta z^{(k)}\geq 0.$ (3.14)

In addition, from (3.9) we have

$\mathcal{M}^{\prime}(z^{(k)})\Delta z^{(k)}=2H(z^{(k)})^{\top}H^{\prime}(z^{(k)})\Delta z^{(k)}=-2\|H(z^{(k)})\|^{2}+2\mu^{(k)}\beta^{(k)}=2\mu^{(k)}(\beta^{(k)}-\mu^{(k)})-2\|Ax^{(k)}+B\Phi(\mu^{(k)},x^{(k)})-b\|^{2}.$ (3.15)

Since $\mu^{(k)}>0$ and $\beta^{(k)}<\mu^{(k)}$, (3.15) implies that $\mathcal{M}^{\prime}(z^{(k)})\Delta z^{(k)}<0$, which contradicts (3.14). Therefore, there exists $\alpha^{(k)}\in(0,1]$ such that $z^{(k+1)}=z^{(k)}+\alpha^{(k)}\Delta z^{(k)}$ in step 3. In this case, it follows from (3.9) that $\mu^{(k+1)}=(1-\alpha^{(k)})\mu^{(k)}+\alpha^{(k)}\beta^{(k)}>0$.

Secondly, we show that $\mathcal{M}(z^{(k+1)})<\mathcal{C}^{(k+1)}$. Indeed, if $z^{(k+1)}$ is generated by step 2, then it follows from $\theta\in(0,1)$ and (3.10) that $\mathcal{M}(z^{(k+1)})\leq\theta^{2}\mathcal{M}(z^{(k)})<\mathcal{M}(z^{(k)})\leq\mathcal{C}^{(k)}$. Otherwise, by step 3, we also obtain $\mathcal{M}(z^{(k+1)})<\mathcal{C}^{(k)}$: in fact, $\beta^{(k)}<\mu^{(k)}$ implies that $\Delta z^{(k)}\neq 0$, and thereby (3.11) gives $\mathcal{M}(z^{(k+1)})<\mathcal{C}^{(k)}$. Consequently, $\mathcal{M}(z^{(k+1)})<\mathcal{C}^{(k)}$ and the first equation in (3.12) imply

$\mathcal{C}^{(k+1)}=\frac{(\mathcal{C}^{(k)}+1)\mathcal{M}(z^{(k+1)})}{\mathcal{M}(z^{(k+1)})+1}>\frac{(\mathcal{M}(z^{(k+1)})+1)\mathcal{M}(z^{(k+1)})}{\mathcal{M}(z^{(k+1)})+1}=\mathcal{M}(z^{(k+1)}).$

Finally, we show that $\mu^{(k+1)}>\beta^{(k+1)}$. As mentioned earlier, we have $\mu^{(k+1)}=\beta^{(k)}$ if $z^{(k+1)}$ is generated by step 2 and $\mu^{(k+1)}=(1-\alpha^{(k)})\mu^{(k)}+\alpha^{(k)}\beta^{(k)}$ if it is generated by step 3. For the latter, since $\alpha^{(k)}\in(0,1]$ and $\mu^{(k)}>\beta^{(k)}>0$, we get $\mu^{(k+1)}=(1-\alpha^{(k)})\mu^{(k)}+\alpha^{(k)}\beta^{(k)}\geq(1-\alpha^{(k)})\beta^{(k)}+\alpha^{(k)}\beta^{(k)}=\beta^{(k)}$. In a word, $\mu^{(k+1)}\geq\beta^{(k)}$. In addition, it follows from $\mathcal{M}(z^{(k+1)})<\mathcal{C}^{(k)}$ and the first equation in (3.12) that

$0\leq\mathcal{C}^{(k+1)}=\frac{\mathcal{C}^{(k)}\mathcal{M}(z^{(k+1)})+\mathcal{M}(z^{(k+1)})}{\mathcal{M}(z^{(k+1)})+1}<\frac{\mathcal{C}^{(k)}\mathcal{M}(z^{(k+1)})+\mathcal{C}^{(k)}}{\mathcal{M}(z^{(k+1)})+1}=\mathcal{C}^{(k)},$

from which and $\gamma>0$ we obtain $\mu^{(k+1)}\geq\beta^{(k)}=\gamma\mathcal{C}^{(k)}>\gamma\mathcal{C}^{(k+1)}=\beta^{(k+1)}$.

The proof is completed by noting that $\mathcal{M}(z^{(0)})\leq\mathcal{C}^{(0)}$, $\mu^{(0)}>0$ and $\beta^{(0)}<\mu^{(0)}$ hold by the choices made in step 1. ∎

Remark 3.2.

We should mention that equation (3.15) plays the key role in the proof of Theorem 3.1. This equation motivates us to design the algorithm so that $\beta^{(k)}<\mu^{(k)}$, which is slightly different from the algorithm given in [39] (where $\beta^{(k)}\leq\mu^{(k)}$ was proved).

4 Convergence analysis

In this section, we analyze the convergence of Algorithm 1. In what follows, we assume that $\|H(z^{(k)})\|\neq 0$ for all $k\geq 0$. To establish the global convergence of Algorithm 1, we need the following lemmas.

Lemma 4.1.

Suppose that Assumption 2.1 holds and let $\{z^{(k)}=(\mu^{(k)},x^{(k)})\}$ be the iteration sequence generated by Algorithm 1. Then $\mathcal{C}^{(k)}>\mathcal{C}^{(k+1)}$ and $\mu^{(k)}>\mu^{(k+1)}$ for all $k\geq 0$.

Proof.

The proof of $\mathcal{C}^{(k)}>\mathcal{C}^{(k+1)}$ for all $k\geq 0$ is contained in the proof of Theorem 3.1. It follows from Theorem 3.1 that $\mu^{(k)}>\beta^{(k)}$ for all $k\geq 0$. Then, by step 2, we have $\mu^{(k+1)}=\beta^{(k)}<\mu^{(k)}$, and by step 3, $\mu^{(k+1)}=(1-\alpha^{(k)})\mu^{(k)}+\alpha^{(k)}\beta^{(k)}<(1-\alpha^{(k)})\mu^{(k)}+\alpha^{(k)}\mu^{(k)}=\mu^{(k)}$. The proof is complete. ∎

Lemma 4.2.

If Assumption 2.1 holds, then the sequence $\{z^{(k)}\}$ generated by Algorithm 1 is bounded.

Proof.

We first prove that the level set

$\mathcal{L}(\Lambda):=\{z=(\mu,x)\in\mathbb{R}^{n+1}:\|H(z)\|\leq\Lambda\}$

is bounded for any $\Lambda>0$. Suppose, on the contrary, that there exists a sequence $\{\bar{z}^{(k)}=(\bar{\mu}^{(k)},\bar{x}^{(k)})\}$ such that $\lim\limits_{k\rightarrow\infty}\|\bar{z}^{(k)}\|=\infty$ and $\|H(\bar{z}^{(k)})\|\leq\Lambda$ for some constant $\Lambda>0$. Since

$\|H(\bar{z}^{(k)})\|^{2}=(\bar{\mu}^{(k)})^{2}+\|A\bar{x}^{(k)}+B\Phi(\bar{\mu}^{(k)},\bar{x}^{(k)})-b\|^{2},$ (4.1)

we conclude that $\{\bar{\mu}^{(k)}\}$ is bounded. From this and the unboundedness of $\{(\bar{\mu}^{(k)},\bar{x}^{(k)})\}$, it follows that $\lim\limits_{k\rightarrow\infty}\|\bar{x}^{(k)}\|=\infty$. Since the sequence $\left\{\frac{\bar{x}^{(k)}}{\|\bar{x}^{(k)}\|}\right\}$ is bounded, it has at least one accumulation point $\hat{x}$, so there exists a subsequence $\{\bar{x}^{(k)}\}_{k\in K}$, $K\subset\{0,1,2,\cdots\}$, such that $\lim\limits_{k\in K,k\rightarrow+\infty}\frac{\bar{x}^{(k)}}{\|\bar{x}^{(k)}\|}=\hat{x}$. It follows from the continuity of the $2$-norm that $\|\hat{x}\|=1$. In the following, we restrict attention to $k\in K$. From (4.1), we have

$\frac{\Lambda^{2}}{\|\bar{x}^{(k)}\|^{2}}\geq\frac{\|H(\bar{z}^{(k)})\|^{2}}{\|\bar{x}^{(k)}\|^{2}}=\frac{(\bar{\mu}^{(k)})^{2}}{\|\bar{x}^{(k)}\|^{2}}+\left\|A\frac{\bar{x}^{(k)}}{\|\bar{x}^{(k)}\|}+B\frac{\Phi(\bar{\mu}^{(k)},\bar{x}^{(k)})}{\|\bar{x}^{(k)}\|}-\frac{b}{\|\bar{x}^{(k)}\|}\right\|^{2}.$ (4.2)

Since

$\frac{\sqrt{(\bar{\mu}^{(k)})^{2}+(\bar{x}^{(k)}_{i})^{2}}-\bar{\mu}^{(k)}}{\|\bar{x}^{(k)}\|}=\sqrt{\left(\frac{\bar{\mu}^{(k)}}{\|\bar{x}^{(k)}\|}\right)^{2}+\left(\frac{\bar{x}^{(k)}_{i}}{\|\bar{x}^{(k)}\|}\right)^{2}}-\frac{\bar{\mu}^{(k)}}{\|\bar{x}^{(k)}\|}\quad(i=1,2,\cdots,n),$

the boundedness of $\{\bar{\mu}^{(k)}\}$ yields

$\lim_{k\rightarrow\infty}\frac{\sqrt{(\bar{\mu}^{(k)})^{2}+(\bar{x}^{(k)}_{i})^{2}}-\bar{\mu}^{(k)}}{\|\bar{x}^{(k)}\|}=\sqrt{\hat{x}_{i}^{2}}=|\hat{x}_{i}|.$

Hence, letting $k\rightarrow\infty$ in (4.2) gives $A\hat{x}+B|\hat{x}|=0$, i.e., $[A+BD(\hat{x})]\hat{x}=0$. Since $D(\hat{x})\in[-I,I]$, it follows from Lemma 2.3 that $A+BD(\hat{x})$ is nonsingular. Thus $\hat{x}=0$, which contradicts the fact that $\|\hat{x}\|=1$.

If $\{z^{(k)}\}$ is generated by Algorithm 1, then $\|H(z^{(k)})\|\leq\sqrt{\mathcal{C}^{(0)}}$ for all $k\geq 0$. Hence, $\{z^{(k)}\}$ is bounded by the above discussion. ∎

Remark 4.1.

The proof of Lemma 4.2 is inspired by that of [40, Theorem 2.3], which treats the case where the interval matrix $[A-|B|, A+|B|]$ is regular. In addition, similar to the proof of [29, Lemma 4.1], the boundedness of $\{z^{(k)}\}$ can be derived under the assumption that $\sigma_{\min}(A)>\sigma_{\max}(B)$. Our result here is more general than that of [29, Lemma 4.1].

Now we show the global convergence of Algorithm 1.

Theorem 4.1.

Assume that Assumption 2.1 holds and let $\{z^{(k)}=(\mu^{(k)},x^{(k)})\}$ be the iteration sequence generated by Algorithm 1. Then any accumulation point $z^{*}$ of $\{z^{(k)}\}$ satisfies

$H(z^{*})=0,$

i.e., $z^{*}=(0,x^{*})$ and $x^{*}$ is a solution of GAVE (1.1).

Proof.

Lemma 4.2 implies that $\{z^{(k)}\}$ has at least one accumulation point. Let $z^{*}$ be any accumulation point of $\{z^{(k)}\}$; then there exists a subsequence of $\{z^{(k)}\}$ converging to $z^{*}$. For convenience, we still denote this subsequence by $\{z^{(k)}\}$.

By Lemma 4.1, $\{\mathcal{C}^{(k)}\}$ is monotonically decreasing and hence convergent, so there exists a constant $\mathcal{C}^{*}\geq 0$ such that $\lim\limits_{k\rightarrow+\infty}\mathcal{C}^{(k)}=\mathcal{C}^{*}$. Since $\mathcal{M}(z^{(k)})=\|H(z^{(k)})\|^{2}\leq\mathcal{C}^{(k)}$ for all $k\geq 0$, if $\mathcal{C}^{*}=0$ then $\lim\limits_{k\rightarrow+\infty}\|H(z^{(k)})\|=0$, and the continuity of $H$ gives $H(z^{*})=0$. In the following, we assume that $\mathcal{C}^{*}>0$ and derive a contradiction.

According to the first equation in (3.12), we have

$\lim_{k\rightarrow+\infty}\mathcal{M}(z^{(k+1)})=\lim_{k\rightarrow+\infty}\frac{\mathcal{C}^{(k+1)}}{1+\mathcal{C}^{(k)}-\mathcal{C}^{(k+1)}}=\mathcal{C}^{*}>0.$ (4.3)

Since $\beta^{(k)}=\gamma\mathcal{C}^{(k)}$, we have $\beta^{*}:=\lim\limits_{k\rightarrow+\infty}\beta^{(k)}=\gamma\mathcal{C}^{*}>0$. Based on Theorem 3.1 and Lemma 4.1, $\mu^{*}:=\lim\limits_{k\rightarrow+\infty}\mu^{(k)}\geq\lim\limits_{k\rightarrow+\infty}\beta^{(k)}=\beta^{*}>0$. Since $\mu^{*}>0$, $H^{\prime}(z^{*})$ is nonsingular and $\mathcal{M}$ is continuously differentiable at $z^{*}$.

Let $\mathcal{N}:=\{k:\|H(z^{(k)}+\Delta z^{(k)})\|\leq\theta\|H(z^{(k)})\|\}$. We claim that $\mathcal{N}$ must be a finite set. In fact, if $\mathcal{N}$ were infinite, then $\|H(z^{(k)}+\Delta z^{(k)})\|\leq\theta\|H(z^{(k)})\|$, i.e., $\mathcal{M}(z^{(k+1)})\leq\theta^{2}\mathcal{M}(z^{(k)})$, would hold for infinitely many $k$. Letting $k\rightarrow+\infty$ with $k\in\mathcal{N}$ would give $\mathcal{C}^{*}\leq\theta^{2}\mathcal{C}^{*}$, a contradiction since $\theta\in(0,1)$ and $\mathcal{C}^{*}>0$. Hence, we may suppose that there exists an index $\bar{k}>0$ such that $\|H(z^{(k)}+\Delta z^{(k)})\|>\theta\|H(z^{(k)})\|$ for all $k\geq\bar{k}$. Then, for all $k\geq\bar{k}$, $z^{(k+1)}=z^{(k)}+\alpha^{(k)}\Delta z^{(k)}$ (generated by step 3) satisfies $\mathcal{M}(z^{(k+1)})\leq\mathcal{C}^{(k)}-\gamma\|\alpha^{(k)}\Delta z^{(k)}\|^{2}$, i.e.,

$\gamma\|\alpha^{(k)}\Delta z^{(k)}\|^{2}\leq\mathcal{C}^{(k)}-\mathcal{M}(z^{(k+1)}),$

from which and (4.3) we have $\lim\limits_{k\rightarrow+\infty}\alpha^{(k)}\|\Delta z^{(k)}\|=0$.

On the one hand, if $1\geq\alpha^{(k)}=\delta^{l_{k}}\geq\varrho>0$ for all $k\geq\bar{k}$ with $\varrho$ a fixed constant, then $\Delta z^{*}:=\lim\limits_{\bar{k}\leq k\rightarrow+\infty}\Delta z^{(k)}=0$, which implies that

$\mathcal{M}^{\prime}(z^{*})\Delta z^{*}=0.$ (4.4)

Here and in the sequel, $\Delta z^{*}$ is the unique solution of $H^{\prime}(z^{*})\Delta z^{*}=-H(z^{*})+\beta^{*}e^{(1)}$.

On the other hand, if $\{\alpha^{(k)}\}_{k\geq\bar{k}}$ has a subsequence converging to $0$, we may assume without loss of generality that $\lim\limits_{\bar{k}\leq k\rightarrow+\infty}\alpha^{(k)}=0$. Let $\hat{\alpha}^{(k)}:=\delta^{l_{k}}/\delta$; then $\lim\limits_{\bar{k}\leq k\rightarrow+\infty}\hat{\alpha}^{(k)}=0$. Moreover, for all $k\geq\bar{k}$, it follows from the definition of $\alpha^{(k)}$ and Theorem 3.1 that

$\mathcal{M}(z^{(k)}+\hat{\alpha}^{(k)}\Delta z^{(k)})>\mathcal{C}^{(k)}-\gamma\|\hat{\alpha}^{(k)}\Delta z^{(k)}\|^{2}\geq\mathcal{M}(z^{(k)})-\gamma\|\hat{\alpha}^{(k)}\Delta z^{(k)}\|^{2}.$

Thus,

$\frac{\mathcal{M}(z^{(k)}+\hat{\alpha}^{(k)}\Delta z^{(k)})-\mathcal{M}(z^{(k)})}{\hat{\alpha}^{(k)}}+\gamma\hat{\alpha}^{(k)}\|\Delta z^{(k)}\|^{2}>0.$

Letting $k\rightarrow+\infty$ in the above inequality yields

$\mathcal{M}^{\prime}(z^{*})\Delta z^{*}\geq 0.$ (4.5)

Since $\mu^{*}\gamma\leq\gamma\mu^{(0)}<1$, it follows from (3.9) and (4.3) that

$\frac{1}{2}\mathcal{M}^{\prime}(z^{*})\Delta z^{*}=H(z^{*})^{\top}H^{\prime}(z^{*})\Delta z^{*}=-\mathcal{M}(z^{*})+\mu^{*}\beta^{*}=-\mathcal{C}^{*}+\mu^{*}\gamma\mathcal{C}^{*}=(\mu^{*}\gamma-1)\mathcal{C}^{*}<0,$

which contradicts both (4.4) and (4.5). The proof of the theorem is now complete. ∎

Under Assumption 2.1, GAVE (1.1) has a unique solution, and thus Lemma 4.2 and Theorem 4.1 imply that the sequence generated by Algorithm 1 has a unique accumulation point $z^{*}$ and that $\lim\limits_{k\rightarrow+\infty}z^{(k)}=z^{*}$. In the following, we discuss the local quadratic convergence of Algorithm 1.

Lemma 4.3.

Assume that Assumption 2.1 holds and let $z^{*}=(0,x^{*})$ be the accumulation point of the sequence $\{z^{(k)}\}$ generated by Algorithm 1. We have:

  • (i)

    $\partial H(z^{*})\subseteq\left\{V:V=\left[\begin{array}{cc}1&0\\ B\operatorname{vec}(\kappa_{i}-1)&A+B\operatorname{diag}(\chi_{i})\end{array}\right],\ \kappa_{i},\chi_{i}\in[-1,1]\ (i=1,2,\cdots,n)\right\};$

  • (ii)

    all $V\in\partial H(z^{*})$ are nonsingular;

  • (iii)

    there exist a neighborhood $\mathscr{N}(z^{*})$ of $z^{*}$ and a constant $C>0$ such that for any $z:=(\mu,x)\in\mathscr{N}(z^{*})$ with $\mu>0$, $H^{\prime}(z)$ is nonsingular and $\|H^{\prime}(z)^{-1}\|\leq C$.

Proof.

A direct computation yields the result (i). The result (ii) follows from (i) and Lemma 2.3, and the result (iii) follows from [31, Lemma 2.6]. ∎

Owing to Proposition 3.2 (ii) and Lemma 4.3, we obtain the following local quadratic convergence theorem for Algorithm 1. Results of this type are well known for smoothing-type Newton methods; the theorem follows from [39, Theorem 8], so we omit the proof here.

Theorem 4.2.

Assume that Assumption 2.1 holds and let $z^{*}$ be the accumulation point of the sequence $\{z^{(k)}\}$ generated by Algorithm 1. Then the whole sequence $\{z^{(k)}\}$ converges to $z^{*}$ with

$\|z^{(k+1)}-z^{*}\|=O(\|z^{(k)}-z^{*}\|^{2}).$

5 Numerical results

In this section, we present two numerical examples to illustrate the performance of Algorithm 1. Three algorithms are tested: Algorithm 1 (denoted by "NSNA"), the monotone smoothing Newton algorithm proposed by Jiang and Zhang [29] (denoted by "JZ-MSNA"), and the monotone smoothing Newton algorithm proposed by Tang and Zhou [40] (denoted by "TZ-MSNA"). All experiments are implemented in MATLAB R2018b, with machine precision $2.22\times 10^{-16}$, on a Windows 10 PC with an Intel i7-9700 CPU and 8GB RAM.

We apply the aforementioned algorithms to solve GAVE (1.1) arising from the HLCP. Given $M,N\in\mathbb{R}^{n\times n}$ and $q\in\mathbb{R}^{n}$, the HLCP is to find a pair $(z,w)\in\mathbb{R}^{n}\times\mathbb{R}^{n}$ such that

$Mz-Nw=q,\quad z\geq 0,\quad w\geq 0,\quad z^{\top}w=0.$ (5.1)

The equivalent relationship between GAVE (1.1) and HLCP (5.1) can be found in [26, Proposition 1].
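Concretely, one standard change of variables behind this equivalence is $x=(z-w)/2$: complementarity gives $z=|x|+x$ and $w=|x|-x$, so (5.1) becomes GAVE (1.1) with $A=M+N$, $B=M-N$ and $b=q$ (the identification used below). A minimal sketch of the conversion and of the recovery of $(z,w)$:

```python
import numpy as np

def hlcp_to_gave(M, N, q):
    """HLCP (5.1) -> GAVE (1.1) under x = (z - w)/2: A = M+N, B = M-N, b = q."""
    return M + N, M - N, q

def recover_zw(x):
    """Recover the HLCP pair from a GAVE solution x: z = |x|+x, w = |x|-x."""
    return np.abs(x) + x, np.abs(x) - x
```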

Example 5.1.

Consider HLCP (5.1) with $M=\hat{A}+\xi I$ and $N=\hat{B}+\zeta I$ [27], where

$\hat{A}=\left[\begin{array}{cccccc}S&-I&&&&\\ -I&S&-I&&&\\ &-I&S&-I&&\\ &&\ddots&\ddots&\ddots&\\ &&&-I&S&-I\\ &&&&-I&S\end{array}\right],\quad\hat{B}=\left[\begin{array}{cccccc}S&&&&&\\ &S&&&&\\ &&S&&&\\ &&&\ddots&&\\ &&&&S&\\ &&&&&S\end{array}\right]$

and

$S=\left[\begin{array}{cccccc}4&-1&&&&\\ -1&4&-1&&&\\ &-1&4&-1&&\\ &&\ddots&\ddots&\ddots&\\ &&&-1&4&-1\\ &&&&-1&4\end{array}\right].$
Example 5.2.

Consider HLCP (5.1) with $M=\hat{A}+\xi I$ and $N=\hat{B}+\zeta I$ [27], where

$\hat{A}=\left[\begin{array}{cccccc}S&-0.5I&&&&\\ -1.5I&S&-0.5I&&&\\ &-1.5I&S&-0.5I&&\\ &&\ddots&\ddots&\ddots&\\ &&&-1.5I&S&-0.5I\\ &&&&-1.5I&S\end{array}\right],\quad\hat{B}=\left[\begin{array}{cccccc}S&&&&&\\ &S&&&&\\ &&S&&&\\ &&&\ddots&&\\ &&&&S&\\ &&&&&S\end{array}\right]$

and

$S=\left[\begin{array}{cccccc}4&-0.5&&&&\\ -1.5&4&-0.5&&&\\ &-1.5&4&-0.5&&\\ &&\ddots&\ddots&\ddots&\\ &&&-1.5&4&-0.5\\ &&&&-1.5&4\end{array}\right].$

Obviously, when $\xi,\zeta\geq 0$, the matrices $M$ and $N$ in Example 5.1 are symmetric positive definite, while the corresponding matrices in Example 5.2 are nonsymmetric positive definite. Moreover, it is easy to verify that $\{M,N\}$ has the column $\mathcal{W}$-property [27], and thus HLCP (5.1) has a unique solution for any $q\in\mathbb{R}^{n}$ [37, Theorem 2]. Correspondingly, GAVE (1.1) with $A=M+N$ and $B=M-N$ satisfies Assumption 2.1 and has a unique solution for any $b=q\in\mathbb{R}^{n}$.

In both examples, we define $q=Mz^{*}-Nw^{*}$ with

$z^{*}=[0,1,0,1,\cdots,0,1]^{\top},\quad w^{*}=[1,0,1,0,\cdots,1,0]^{\top}.$

In addition, three pairs of values of $\xi$ and $\zeta$ are used, i.e., $(\xi,\zeta)=(0,0)$, $(\xi,\zeta)=(0,4)$ and $(\xi,\zeta)=(4,0)$.

For NSNA, we set $\theta=0.2$, $\delta=0.8$, $\mu^{(0)}=0.01$ and choose $\gamma:=\min\{\frac{\mu^{(0)}}{\mathcal{C}^{(0)}+1},\frac{1}{\mu^{(0)}+1},10^{-12}\}$ so that $\gamma\mathcal{C}^{(0)}<\mu^{(0)}$, $\gamma\mu^{(0)}<1$ and $\gamma\in(0,1)$. For JZ-MSNA, we set $\delta=0.8$, $\sigma=0.2$, $\mu^{(0)}=0.01$, $p=2$ and choose $\beta:=\max\{100,1.01(\tau^{(0)})^{2}/\mu^{(0)}\}$ to satisfy the conditions needed for this algorithm [29] (we refer to [29] for the definition of $\tau^{(0)}$). For TZ-MSNA, as in [40], we set $\sigma=0.2$, $\delta=0.8$, $\gamma=0.001$ and $\mu^{(0)}=0.01$. For all methods, the initial point is $x^{(0)}=[2,2,\cdots,2]^{\top}$, and the iterations are stopped once $\operatorname{Res}=\|Ax^{(k)}+B|x^{(k)}|-b\|\leq 10^{-7}$ or the maximum number of iterations $it\_\max=100$ is exceeded.
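For reproducibility, the test problems are easy to assemble with Kronecker products. The following sketch builds the data of Example 5.1 (Example 5.2 only changes the off-diagonal weights to $-1.5$ and $-0.5$); it is our illustration of the setup described above, not the authors' script, and `m` denotes the block size, so $n=m^{2}$:

```python
import numpy as np

def example_5_1(m, xi=0.0, zeta=0.0):
    """Build GAVE data A, B, b for Example 5.1 (n = m^2, m even in our tests)."""
    I = np.eye(m)
    T = np.diag(np.ones(m - 1), 1) + np.diag(np.ones(m - 1), -1)  # 0/1 tridiagonal
    S = 4 * I - T                          # tridiag(-1, 4, -1)
    A_hat = np.kron(I, S) - np.kron(T, I)  # block tridiag(-I, S, -I)
    B_hat = np.kron(I, S)                  # block diag(S, ..., S)
    n = m * m
    M = A_hat + xi * np.eye(n)
    N = B_hat + zeta * np.eye(n)
    z_star = np.tile([0.0, 1.0], n // 2)   # z* = (0, 1, 0, 1, ...)
    w_star = np.tile([1.0, 0.0], n // 2)   # w* = (1, 0, 1, 0, ...)
    q = M @ z_star - N @ w_star
    return M + N, M - N, q                 # A, B, b of GAVE (1.1)
```

For instance, `A, B, b = example_5_1(32)` reproduces the $n=32^{2}$ setting of Figures 1 and 2.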

For Example 5.1, numerical results are shown in Tables 1-3, from which we find that NSNA outperforms JZ-MSNA and TZ-MSNA in terms of $\operatorname{Iter}$ (the number of iterations) and $\operatorname{Cpu}$ (the elapsed CPU time in seconds). Figure 1 plots the convergence curves of the tested methods, all of which exhibit monotone convergence (for JZ-MSNA and TZ-MSNA, $\|H(z^{(k)})\|$ is defined as in (3.5) with $\phi(a,b)=(|a|^{p}+|b|^{p})^{\frac{1}{p}}$). For Example 5.2, numerical results are shown in Tables 4-6, from which we again find that NSNA is superior to JZ-MSNA and TZ-MSNA in terms of $\operatorname{Iter}$ and $\operatorname{Cpu}$. Figure 2 plots the convergence curves of the tested methods: JZ-MSNA and TZ-MSNA converge monotonically, while the non-monotone convergence behavior of NSNA is visible. In conclusion, under our setting, NSNA is a competitive method for solving GAVE (1.1).

Table 1: Numerical results for Example 5.1 with $\xi=\zeta=0$.

Method              $n=256$                  $n=1024$                 $n=2304$                 $n=4096$
NSNA      Iter      5                        5                        6                        6
          Cpu       0.0044                   0.0619                   0.3364                   0.9985
          Res       $1.1392\times 10^{-14}$  $5.7894\times 10^{-14}$  $2.8847\times 10^{-13}$  $1.5603\times 10^{-12}$
JZ-MSNA   Iter      6                        6                        8                        7
          Cpu       0.0052                   0.0724                   0.4497                   1.2020
          Res       $2.3488\times 10^{-11}$  $4.9355\times 10^{-10}$  $1.4026\times 10^{-11}$  $9.4826\times 10^{-11}$
TZ-MSNA   Iter      6                        6                        7                        7
          Cpu       0.0049                   0.0664                   0.3860                   1.1631
          Res       $1.1147\times 10^{-14}$  $2.2960\times 10^{-14}$  $3.4131\times 10^{-14}$  $4.6847\times 10^{-14}$
Table 2: Numerical results for Example 5.1 with $\xi=0$ and $\zeta=4$.

Method              $n=256$                  $n=1024$                 $n=2304$                 $n=4096$
NSNA      Iter      5                        6                        7                        7
          Cpu       0.0035                   0.0745                   0.3975                   1.3254
          Res       $2.9913\times 10^{-14}$  $7.8455\times 10^{-13}$  $4.8343\times 10^{-12}$  $2.3106\times 10^{-11}$
JZ-MSNA   Iter      7                        8                        8                        9
          Cpu       0.0055                   0.1102                   0.4676                   1.6890
          Res       $6.4061\times 10^{-11}$  $1.7803\times 10^{-9}$   $4.4746\times 10^{-8}$   $7.1534\times 10^{-8}$
TZ-MSNA   Iter      6                        7                        8                        8
          Cpu       0.0039                   0.0862                   0.4599                   1.5145
          Res       $1.7297\times 10^{-14}$  $3.4071\times 10^{-14}$  $5.3415\times 10^{-14}$  $6.7748\times 10^{-14}$
Table 3: Numerical results for Example 5.1 with $\xi=4$ and $\zeta=0$.

Method              $n=256$                  $n=1024$                 $n=2304$                 $n=4096$
NSNA      Iter      3                        3                        3                        3
          Cpu       0.0021                   0.0361                   0.1536                   0.5444
          Res       $2.0696\times 10^{-13}$  $5.8026\times 10^{-12}$  $4.2044\times 10^{-11}$  $1.7292\times 10^{-10}$
JZ-MSNA   Iter      4                        4                        5                        5
          Cpu       0.0035                   0.0477                   0.2846                   0.9388
          Res       $2.5682\times 10^{-11}$  $1.1439\times 10^{-11}$  $2.1336\times 10^{-11}$  $3.0094\times 10^{-11}$
TZ-MSNA   Iter      4                        4                        4                        4
          Cpu       0.0024                   0.0458                   0.2293                   0.7017
          Res       $1.7786\times 10^{-14}$  $3.4123\times 10^{-14}$  $5.2927\times 10^{-14}$  $6.8855\times 10^{-14}$
Table 4: Numerical results for Example 5.2 with $\xi=\zeta=0$.

Method              $n=256$                  $n=1024$                 $n=2304$                 $n=4096$
NSNA      Iter      4                        5                        6                        6
          Cpu       0.0031                   0.0642                   0.3332                   1.0440
          Res       $1.0500\times 10^{-14}$  $5.9914\times 10^{-14}$  $2.9611\times 10^{-13}$  $1.6331\times 10^{-12}$
JZ-MSNA   Iter      5                        6                        8                        7
          Cpu       0.0036                   0.0774                   0.4503                   1.2214
          Res       $4.1569\times 10^{-11}$  $2.7899\times 10^{-10}$  $1.4829\times 10^{-11}$  $6.0501\times 10^{-9}$
TZ-MSNA   Iter      5                        6                        7                        7
          Cpu       0.0035                   0.0679                   0.3776                   1.2130
          Res       $1.1200\times 10^{-14}$  $2.0773\times 10^{-14}$  $3.1480\times 10^{-14}$  $4.2238\times 10^{-14}$
Table 5: Numerical results for Example 5.2 with $\xi=0$ and $\zeta=4$.

Method              $n=256$                  $n=1024$                 $n=2304$                 $n=4096$
NSNA      Iter      6                        7                        7                        8
          Cpu       0.0047                   0.0869                   0.4074                   1.4394
          Res       $3.0087\times 10^{-14}$  $7.9704\times 10^{-13}$  $5.8680\times 10^{-12}$  $2.4058\times 10^{-11}$
JZ-MSNA   Iter      8                        9                        10                       11
          Cpu       0.0075                   0.1225                   0.5793                   1.9709
          Res       $1.4866\times 10^{-11}$  $2.0476\times 10^{-9}$   $1.3748\times 10^{-8}$   $7.0290\times 10^{-8}$
TZ-MSNA   Iter      7                        8                        9                        9
          Cpu       0.0050                   0.0945                   0.5193                   1.6262
          Res       $1.6717\times 10^{-14}$  $3.3155\times 10^{-14}$  $5.1814\times 10^{-14}$  $9.2842\times 10^{-14}$
Table 6: Numerical results for Example 5.2 with $\xi=4$ and $\zeta=0$.

Method              $n=256$                  $n=1024$                 $n=2304$                 $n=4096$
NSNA      Iter      3                        3                        3                        3
          Cpu       0.0024                   0.0388                   0.1623                   0.5437
          Res       $2.1175\times 10^{-13}$  $5.8378\times 10^{-12}$  $4.2243\times 10^{-11}$  $1.7356\times 10^{-10}$
JZ-MSNA   Iter      4                        4                        5                        5
          Cpu       0.0030                   0.0500                   0.2979                   0.8752
          Res       $2.5874\times 10^{-11}$  $1.1674\times 10^{-11}$  $2.1543\times 10^{-11}$  $3.0534\times 10^{-11}$
TZ-MSNA   Iter      4                        4                        4                        4
          Cpu       0.0029                   0.0493                   0.2448                   0.7257
          Res       $1.7719\times 10^{-14}$  $3.3493\times 10^{-14}$  $4.6185\times 10^{-14}$  $6.7869\times 10^{-14}$
Figure 1: Convergence history curves for Example 5.1 with n = 32^2 (a 3×3 grid of plots). Columns, left to right: NSNA, JZ-MSNA, TZ-MSNA. Rows, top to bottom: ξ = ζ = 0; ξ = 0 and ζ = 4; ξ = 4 and ζ = 0.
Figure 2: Convergence history curves for Example 5.2 with n = 32^2 (a 3×3 grid of plots). Columns, left to right: NSNA, JZ-MSNA, TZ-MSNA. Rows, top to bottom: ξ = ζ = 0; ξ = 0 and ζ = 4; ξ = 4 and ζ = 0.

6 Conclusions

In this paper, a non-monotone smoothing Newton method is proposed for solving the system of generalized absolute value equations. Global convergence and local quadratic convergence are established under an assumption weaker than those required by most existing methods. Numerical results demonstrate that the proposed method can outperform two existing smoothing-type Newton methods in some situations.
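
To make the template behind such methods concrete, the Python sketch below implements a generic smoothing Newton iteration for Ax + B|x| - b = 0. It is a minimal illustration under stated assumptions, not the algorithm analyzed in this paper: the smoothing function φ_μ(x) = sqrt(x² + 4μ²) is one common choice from the literature, the non-monotone line search is replaced here by plain Armijo backtracking, the smoothing parameter μ is shrunk by a fixed geometric schedule rather than updated through the Newton system, and the function name smoothing_newton_gave and the test instance are hypothetical. The demo takes A = I and scales B so that ‖B‖₂ < σ_min(A), a known sufficient condition for unique solvability of the GAVE.

```python
import numpy as np

def smoothing_newton_gave(A, B, b, x0=None, mu0=1.0, tol=1e-10, max_iter=100):
    """Illustrative smoothing-Newton sketch for A x + B|x| = b (not the paper's method)."""
    n = b.size
    x = np.zeros(n) if x0 is None else np.asarray(x0, dtype=float).copy()
    mu = mu0
    for k in range(max_iter):
        # Stop on the residual of the original (non-smooth) equation.
        if np.linalg.norm(A @ x + B @ np.abs(x) - b) <= tol:
            return x, k
        phi = np.sqrt(x**2 + 4.0 * mu**2)        # smooth surrogate of |x|
        F = A @ x + B @ phi - b                  # smoothed residual
        J = A + B * (x / phi)                    # Jacobian: A + B diag(phi'(x))
        d = np.linalg.solve(J, -F)               # Newton direction
        # Plain Armijo backtracking on the smoothed merit ||F||^2
        # (a simplification of the non-monotone search used in the paper).
        t, merit = 1.0, float(F @ F)
        while t > 1e-12:
            xt = x + t * d
            Ft = A @ xt + B @ np.sqrt(xt**2 + 4.0 * mu**2) - b
            if float(Ft @ Ft) <= (1.0 - 1e-4 * t) * merit:
                break
            t *= 0.5
        x = x + t * d
        mu *= 0.1                                # drive the smoothing parameter to zero

    return x, max_iter

if __name__ == "__main__":
    # Hypothetical test instance with a known solution.
    rng = np.random.default_rng(0)
    n = 200
    B = rng.standard_normal((n, n))
    B *= 0.5 / np.linalg.norm(B, 2)              # ensure ||B||_2 < sigma_min(A) = 1
    A = np.eye(n)
    x_star = rng.standard_normal(n)
    b = A @ x_star + B @ np.abs(x_star)          # build a GAVE solved by x_star
    x, iters = smoothing_newton_gave(A, B, b)
    print(iters, np.linalg.norm(x - x_star))
```

Smoothing Newton methods of the kind analyzed here typically treat μ as an additional unknown that the Newton system itself drives to zero, which is what yields local quadratic convergence; the fixed geometric schedule above is only a simplification for readability.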

References

  • [1] M. Achache, N. Hazzam. Solving absolute value equations via complementarity and interior-point methods, J. Nonlinear Funct. Anal., Article ID 39, 2018.
  • [2] N. Anane, M. Achache. Preconditioned conjugate gradient methods for absolute value equations, J. Numer. Anal. Approx. Theory, 49: 3–14, 2020.
  • [3] L. Caccetta, B. Qu, G.-L. Zhou. A globally and quadratically convergent method for absolute value equations, Comput. Optim. Appl., 48(1): 45–58, 2011.
  • [4] C.-R. Chen, Y.-N. Yang, D.-M. Yu, D.-R. Han. An inverse-free dynamical system for solving the absolute value equations, Appl. Numer. Math., 168: 170–181, 2021.
  • [5] F.H. Clarke. Optimization and Nonsmooth Analysis, Wiley, New York, 1983.
  • [6] R.W. Cottle, J.-S. Pang, R.E. Stone. The Linear Complementarity Problem. Classics in Applied Mathematics, SIAM, Philadelphia, 2009.
  • [7] J.Y.B. Cruz, O.P. Ferreira, L.F. Prudente. On the global convergence of the inexact semi-smooth Newton method for absolute value equation, Comput. Optim. Appl., 65(1): 93–108, 2016.
  • [8] V. Edalatpour, D. Hezari, D.K. Salkuyeh. A generalization of the Gauss-Seidel iteration method for solving absolute value equations, Appl. Math. Comput., 293: 156–167, 2017.
  • [9] A. Fischer. Solution of monotone complementarity problems with locally Lipschitzian functions, Math. Program., 76: 513–532, 1997.
  • [10] X.-M. Gu, T.-Z. Huang, H.-B. Li, S.-F. Wang, L. Li. Two CSCS-based iteration methods for solving absolute value equations, J. Appl. Anal. Comput., 7(4): 1336–1356, 2017.
  • [11] F.K. Haghani. On generalized Traub’s method for absolute value equations, J. Optim. Theory Appl., 166: 619–625, 2015.
  • [12] M. Hladík. Bounds for the solution of absolute value equations, Comput. Optim. Appl., 69: 243–266, 2018.
  • [13] S.-L. Hu, Z.-H. Huang. A note on absolute value equations, Optim. Lett., 4: 417–424, 2010.
  • [14] S.-L. Hu, Z.-H. Huang, J.-S. Chen. Properties of a family of generalized NCP-functions and a derivative free algorithm for complementarity problems, J. Comput. Appl. Math., 230: 69–82, 2009.
  • [15] S.-L. Hu, Z.-H. Huang, Q. Zhang. A generalized Newton method for absolute value equations associated with second order cones, J. Comput. Appl. Math., 235: 1490–1501, 2011.
  • [16] Z.-H. Huang, S.-L. Hu, J.-Y. Han. Convergence of a smoothing algorithm for symmetric cone complementarity problems with a nonmonotone line search, Sci. China Ser. A: Math., 52: 833–848, 2009.
  • [17] Y.-F. Ke. The new iteration algorithm for absolute value equation, Appl. Math. Lett., 99: 105990, 2020.
  • [18] Y.-F. Ke, C.-F. Ma. SOR-like iteration method for solving absolute value equations, Appl. Math. Comput., 311: 195–202, 2017.
  • [19] Y.-Y. Lian, C.-X. Li, S.-L. Wu. Weaker convergent results of the generalized Newton method for the generalized absolute value equations, J. Comput. Appl. Math., 338: 221–226, 2018.
  • [20] O.L. Mangasarian. Absolute value programming, Comput. Optim. Appl., 36(1): 43–53, 2007.
  • [21] O.L. Mangasarian. Absolute value equation solution via concave minimization, Optim. Lett., 1(1): 3–8, 2007.
  • [22] O.L. Mangasarian. A generalized Newton method for absolute value equations, Optim. Lett., 3(1): 101–108, 2009.
  • [23] O.L. Mangasarian. Knapsack feasibility as an absolute value equation solvable by successive linear programming, Optim. Lett., 3: 161–170, 2009.
  • [24] O.L. Mangasarian, R.R. Meyer. Absolute value equations, Linear Algebra Appl., 419: 359–367, 2006.
  • [25] A. Mansoori, M. Erfanian. A dynamic model to solve the absolute value equations, J. Comput. Appl. Math., 333: 28–35, 2018.
  • [26] F. Mezzadri. On the solution of general absolute value equations, Appl. Math. Lett., 107: 106462, 2020.
  • [27] F. Mezzadri, E. Galligani. Modulus-based matrix splitting methods for horizontal linear complementarity problems, Numer. Algor., 83: 201–219, 2020.
  • [28] X.-H. Miao, J.-T. Yang, B. Saheya, J.-S. Chen. A smoothing Newton method for absolute value equation associated with second-order cone, Appl. Numer. Math., 120: 82–96, 2017.
  • [29] X.-Q. Jiang, Y. Zhang. A smoothing-type algorithm for absolute value equations, J. Ind. Manag. Optim., 9(4): 789–798, 2013.
  • [30] O. Prokopyev. On equivalent reformulations for absolute value equations, Comput. Optim. Appl., 44(3): 363–372, 2009.
  • [31] L.-Q. Qi. Convergence analysis of some algorithms for solving nonsmooth equations, Math. Oper. Res., 18: 227–244, 1993.
  • [32] L.-Q. Qi, D.-F. Sun, G.-L. Zhou. A new look at smoothing Newton methods for nonlinear complementarity problems and box constrained variational inequalities, Math. Program., Ser. A, 87: 1–35, 2000.
  • [33] J. Rohn. Systems of linear interval equations, Linear Algebra Appl., 126: 39–78, 1989.
  • [34] J. Rohn. A theorem of the alternatives for the equation Ax+B|x|=bAx+B|x|=b, Linear Multilinear Algebra, 52(6): 421–426, 2004.
  • [35] B. Saheya, C.T. Nguyen, J.-S. Chen. Neural network based on systematically generated smoothing functions for absolute value equation, J. Appl. Math. Comput., 61: 533–558, 2019.
  • [36] S. Shahsavari, S. Ketabchi. The proximal methods for solving absolute value equation, Numer. Algebra Control Optim., 11: 449–460, 2021.
  • [37] R. Sznajder, M.S. Gowda. Generalizations of P0P_{0}- and PP-properties; Extended vertical and horizontal linear complementarity problems, Linear Algebra Appl., 223/224: 695–715, 1995.
  • [38] J.-Y. Tang. A variant nonmonotone smoothing algorithm with improved numerical results for large-scale LWCPs, Comp. Appl. Math., 37: 3927–3936, 2018.
  • [39] J.-Y. Tang, H.-C. Zhang. A nonmonotone smoothing Newton algorithm for weighted complementarity problem, J. Optim. Theory Appl., 189: 679–715, 2021.
  • [40] J.-Y. Tang, J.-C. Zhou. A quadratically convergent descent method for the absolute value equation Ax+B|x|=bAx+B|x|=b, Oper. Res. Lett., 47: 229–234, 2019.
  • [41] J.-Y. Tang, J.-C. Zhou. Quadratic convergence analysis of a nonmonotone Levenberg-Marquardt type method for the weighted nonlinear complementarity problem, Comput. Optim. Appl., 80: 213–244, 2021.
  • [42] A. Wang, Y. Cao, J.-X. Chen. Modified Newton-type iteration methods for generalized absolute value equations, J. Optim. Theory Appl., 181: 216–230, 2019.
  • [43] S.-L. Wu, C.-X. Li. The unique solution of the absolute value equations, Appl. Math. Lett., 76: 195–200, 2018.
  • [44] S.-L. Wu, C.-X. Li. A note on unique solvability of the absolute value equation, Optim. Lett., 14: 1957–1960, 2020. https://doi.org/10.1007/s11590-019-01478-x.
  • [45] S.-L. Wu, S.-Q. Shen. On the unique solution of the generalized absolute value equation, Optim. Lett., 15: 2017–2024, 2021.
  • [46] D.-M. Yu, C.-R. Chen, D.-R. Han. A modified fixed point iteration method for solving the system of absolute value equations, Optimization, 2020. https://doi.org/10.1080/02331934.2020.1804568.
  • [47] N. Zainali, T. Lotfi. On developing a stable and quadratic convergent method for solving absolute value equation, J. Comput. Appl. Math., 330: 742–747, 2018.
  • [48] M. Zamani, M. Hladík. A new concave minimization algorithm for the absolute value equation solution, Optim. Lett., 2021. https://doi.org/10.1007/s11590-020-01691-z.
  • [49] C. Zhang, Q.-J. Wei. Global and finite convergence of a generalized Newton method for absolute value equations, J. Optim. Theory Appl., 143: 391–403, 2009.
  • [50] H.-C. Zhang, W.W. Hager. A nonmonotone line search technique and its application to unconstrained optimization, SIAM J. Optim., 14: 1043–1056, 2004.
  • [51] H.-Y. Zhou, S.-L. Wu, C.-X. Li. Newton-based matrix splitting method for generalized absolute value equation, J. Comput. Appl. Math., 394: 113578, 2021.
  • [52] J.-G. Zhu, H.-W. Liu, C.-H. Liu. A family of new smoothing functions and a nonmonotone smoothing Newton method for the nonlinear complementarity problems, J. Appl. Math. Comput., 37: 647–662, 2011.