
A generalized projection iterative method for solving non-singular linear systems

Ashif Mustafa, Manideepa Saha
Department of Mathematics, National Institute of Technology Meghalaya, Shillong 793003, India
manideepa.saha@nitm.ac.in
(The work was supported by the Department of Science and Technology-Science and Engineering Research Board, grant no. ECR/2017/002116.)
Abstract

In this paper, we propose and analyze an iterative method based on projection techniques to solve a non-singular linear system Ax = b. In particular, for a given positive integer m, the m-dimensional successive projection method (mD-SPM) for a symmetric positive definite matrix A is generalized to non-singular matrices A. Moreover, it is proved that mD-SPM gives better results for larger values of m. Numerical experiments are carried out to demonstrate the superiority of the proposed method in comparison with other schemes in the scientific literature.

keywords:
Symmetric linear systems, Iterative techniques, Petrov-Galerkin condition, Orthogonal projection method, Oblique projection method

1 Introduction

Consider the linear system of equations

Ax = b, (1.1)

where A \in \mathbb{R}^{n \times n}, b \in \mathbb{R}^{n}, and x \in \mathbb{R}^{n} is an unknown vector. In [1], Ujević introduced a new iterative method for solving (1.1). The method can be viewed as a generalization of the Gauss-Seidel method. In [2], Jing and Huang interpreted Ujević's method as a one-dimensional double successive projection method (1D-DSPM), and the Gauss-Seidel method as a one-dimensional single successive projection method (1D-SSPM). They established a different approach and called it the two-dimensional double successive projection method (2D-DSPM). In [3], Salkuyeh improved this method and gave a generalization of it, called mD-SPM. In an independent work [4], Hou and Wang explained 2D-DSPM as a two-dimensional orthogonal projection method (2D-OPM) and generalized it to a three-dimensional orthogonal projection method (3D-OPM). All these works address systems in which the matrix A in (1.1) is symmetric positive definite (SPD). In this paper, we generalize the mD-SPM and apply it to any non-singular system. The proposed method is called mD-OPM, where OPM refers to 'orthogonal' as well as 'oblique' projection method.

Given an initial approximation x_{0}, a typical projection method for solving (1.1) on the subspace \mathcal{K} (known as the search subspace) and orthogonal to the subspace \mathcal{L} (known as the constraint subspace) finds an approximate solution x of (1.1) by imposing the Petrov-Galerkin condition [5] that

x \in x_{0} + \mathcal{K} \text{ and } b - Ax \perp \mathcal{L}. (1.2)

In the case of an orthogonal projection method, the search space and the constraint space coincide, whereas in an oblique projection method they differ. The elementary Gauss-Seidel method can be considered as a one-dimensional OPM with \mathcal{K} = \mathcal{L} = \mathrm{span}\{e_{i}\}, where e_{i} is the i-th column of the identity matrix. In [4], the authors proposed a three-dimensional OPM (3D-OPM) and showed, both theoretically and numerically, that 3D-OPM gives a better (or at least the same) reduction of the error than the 2D-OPM of [2]. In [3], the author proposed a generalization of 2D-OPM and also gave a way to choose the subspace \mathcal{K}. We put forward the m-dimensional OPM (mD-OPM) by considering m-dimensional subspaces \mathcal{K}, \mathcal{L}, where for the oblique projection we take \mathcal{L} = A\mathcal{K}. At each iterative step, \mathcal{K}, \mathcal{L} are taken as m-dimensional subspaces and each iteration is cycled over i = 1, 2, \ldots, n, until convergence.
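As a concrete illustration of this framework (a minimal NumPy sketch, not code from the cited works), cycling the one-dimensional OPM with \mathcal{K} = \mathcal{L} = span{e_i} over i = 1, ..., n reproduces one Gauss-Seidel sweep:

import numpy as np

def gauss_seidel_sweep(A, b, x):
    # One sweep of the 1D OPM with K = L = span{e_i}, i = 1, ..., n.
    # Each step enforces the Petrov-Galerkin condition (b - A x) perp e_i.
    x = x.copy()
    for i in range(A.shape[0]):
        r_i = b[i] - A[i, :] @ x      # i-th component of the current residual
        x[i] += r_i / A[i, i]         # projection step along e_i
    return x

# toy usage on a small SPD system (hypothetical data)
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = np.zeros(2)
for _ in range(25):
    x = gauss_seidel_sweep(A, b, x)
print(np.linalg.norm(b - A @ x))      # residual should be tiny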

The paper is organized as follows: In Section 2, a theoretical proof of the advantage of choosing a larger value of m in mD-OPM is provided, and the convergence of mD-OPM for an SPD system is also shown, which supplements the work in [3]. Section 3 shows the application of mD-OPM to more general non-singular systems. Lastly, in Section 4, numerical examples are considered for illustration.

2 mD-OPM for symmetric matrices

Throughout this section, the matrix A under consideration is assumed to be SPD. From the numerical experiments provided in [3], it is observed that mD-OPM provides better (or at least the same) results for larger values of m. In this section we prove this theoretically.

In mD-OPM, \mathcal{K} (= \mathcal{L}) is considered as an m-dimensional space. If \mathcal{K} = \mathcal{L} = \mathrm{span}\{v_{1}, v_{2}, \ldots, v_{m}\} and V_{m} = [v_{1}, v_{2}, \ldots, v_{m}], a basic projection step of mD-OPM is defined in [3, 5] as:

Find x^{(i+1)} \in x^{(i)} + \mathcal{K} such that b - Ax^{(i+1)} \perp \mathcal{K}. (2.1)
Equivalently, x^{(i+1)} = x^{(i)} + V_{m}(V_{m}^{T}AV_{m})^{-1}V_{m}^{T}r^{(i)}, where r^{(i)} = b - Ax^{(i)}. (2.2)
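For reference, here is a minimal NumPy sketch (an illustration under the assumption that \mathcal{K} is spanned by selected identity columns, not code from [3]) of the single projection step (2.2):

import numpy as np

def opm_step_spd(A, b, x, idx):
    # One basic projection step (2.2) for SPD A, with
    # K = L = span{e_i : i in idx}; idx holds the m chosen indices.
    r = b - A @ x                                # current residual r^(i)
    V = np.eye(A.shape[0])[:, idx]               # V_m = [e_{i_1}, ..., e_{i_m}]
    y = np.linalg.solve(V.T @ A @ V, V.T @ r)    # solve the m x m system
    return x + V @ y                             # x^(i+1)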

If x^{*} is the exact solution of (1.1), the quantity \|x^{(i+1)} - x^{*}\|_{A}^{2} - \|x^{(i)} - x^{*}\|_{A}^{2} is defined as the error reduction at the i-th iteration step of mD-OPM (2.1), and is denoted by E.R_{mD} as in [4]. In Theorem 1 of [3], the author proved that E.R_{mD} \leq 0; in particular, E.R_{mD} < 0 if r^{(i)} is not perpendicular to \mathcal{K} (= \mathcal{L}). To prove the main theorem of this section, we need the following lemma.

Lemma 2.1.

If x^{(i+1)} is defined as in (2.2), then E.R_{mD} = \left\langle z^{(i)}, p^{(i)} \right\rangle, where p^{(i)} = -V_{m}^{T}r^{(i)} and z^{(i)} = -(V_{m}^{T}AV_{m})^{-1}p^{(i)}.

Proof.

The proof follows from the identity E.R_{mD} = -(V_{m}^{T}r^{(i)})^{T}(V_{m}^{T}AV_{m})^{-1}V_{m}^{T}r^{(i)} established in the proof of Theorem 1 of [3]. ∎

For any positive integer k, write B_{k} = V_{k}^{T}AV_{k}. Let l \in \{1, 2, \ldots, m-1\}. If we choose \dim\mathcal{K} = l in (2.1), then by Lemma 2.1

E.R_{lD} = \langle y^{(i)}, \tilde{p}^{(i)} \rangle, (2.3)

where \tilde{p}^{(i)} = -V_{l}^{T}\tilde{r}^{(i)} and y^{(i)} = -(B_{l})^{-1}\tilde{p}^{(i)}. Note that B_{l} = B_{m}[\alpha], y^{(i)} = z^{(i)}[\alpha], \tilde{p}^{(i)} = p^{(i)}[\alpha], where \alpha = \{1, 2, \ldots, l\}.

Theorem 2.2.

E.R_{mD} \leq E.R_{lD}, when m > l.

Proof.

For simplicity, we write y^{(i)} := y, z^{(i)} := z, p^{(i)} := p, \tilde{p}^{(i)} := \tilde{p}, B_{m} := B. For \alpha \subsetneq \{1, 2, \ldots, m\}, define \tilde{z} \in \mathbb{R}^{m} by \tilde{z}(\alpha) = y and 0 elsewhere. By Lemma 2.1, it is sufficient to show that z^{T}p \leq y^{T}\tilde{p}, or equivalently, z^{T}Bz \geq \tilde{z}^{T}B\tilde{z}. Since B is a positive definite matrix, (z - \tilde{z})^{T}B(z - \tilde{z}) \geq 0, which implies that

z^{T}Bz - \tilde{z}^{T}B\tilde{z} \geq \tilde{z}^{T}Bz + z^{T}B\tilde{z} - 2\tilde{z}^{T}B\tilde{z}. (2.4)

However, \tilde{z}^{T}Bz + z^{T}B\tilde{z} - 2\tilde{z}^{T}B\tilde{z} = -\tilde{z}^{T}p - p^{T}\tilde{z} - 2y^{T}B_{l}y = -2\tilde{p}^{T}y + 2y^{T}\tilde{p} = 0.

Thus, from (2.4), we get z^{T}Bz \geq \tilde{z}^{T}B\tilde{z}. ∎
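Theorem 2.2 can also be checked numerically. The following sketch (assuming NumPy; the SPD matrix and residual are randomly generated and purely illustrative) compares E.R_{mD} and E.R_{lD} for nested index sets using the formula from Lemma 2.1:

import numpy as np

# Small numerical check (not a proof) of Theorem 2.2: with nested index sets,
# E.R_{mD} is at least as negative as E.R_{lD}.
rng = np.random.default_rng(0)
n, m, l = 20, 5, 2
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)            # random SPD test matrix (hypothetical)
r = rng.standard_normal(n)             # plays the role of the residual b - Ax

def error_reduction(A, r, idx):
    # E.R = -(V^T r)^T (V^T A V)^{-1} (V^T r), as in Lemma 2.1 / [3, Theorem 1]
    V = np.eye(A.shape[0])[:, idx]
    p = V.T @ r
    return -p @ np.linalg.solve(V.T @ A @ V, p)

idx = list(range(m))
assert error_reduction(A, r, idx) <= error_reduction(A, r, idx[:l]) + 1e-12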

Corollary 2.3.

mD-OPM defined in (2.2) converges.

Proof.

In [2, 4], it is shown that E.R_{mD} \leq 0 for m = 2, 3, which assures the convergence of 2D-OPM and 3D-OPM. Since, by Theorem 2.2, E.R_{mD} \leq E.R_{2D} for any m \geq 2, the error reduction of mD-OPM is at least as large, and the conclusion follows. ∎

3 m-dimensional oblique projection method for non-singular matrices

In this section we present a new m-dimensional oblique projection method (mD-OPM) to solve the nonsingular system (1.1). Assume that \dim\mathcal{K} = \dim\mathcal{L} = m, with m \ll n. Take V_{m} = [v_{1}, v_{2}, \ldots, v_{m}] and W_{m} = [w_{1}, w_{2}, \ldots, w_{m}] so that the columns of V_{m} and W_{m} form bases for \mathcal{K} and \mathcal{L}, respectively. If \mathcal{L} = A\mathcal{K}, then the oblique projection iterative step discussed in (1.2) is given as follows [5]:

x^{(i+1)} = x^{(i)} + V_{m}(W_{m}^{T}AV_{m})^{-1}W_{m}^{T}r^{(i)}, (3.1)

where r^{(i)} = b - Ax^{(i)} is the residual at the i-th iteration step.

Choose \mathcal{L} = A\mathcal{K}. Then x^{(i+1)} as defined in (1.2) minimizes the 2-norm of the residual r^{(i+1)} over x \in x^{(i)} + \mathcal{K} (see Ch. 5 in [5]). Throughout this section, \|\cdot\| represents the 2-norm on the Euclidean space \mathbb{R}^{n}, and we drop the suffix m that signifies the dimension of V_{m} and W_{m}.

As \mathcal{L} = A\mathcal{K}, we may take W = AV. Then (3.1) reduces to

x^{(i+1)} = x^{(i)} + VW^{\dagger}r^{(i)}, (3.2)

where W^{\dagger} denotes the pseudo-inverse of W, so that r^{(i+1)} = b - Ax^{(i+1)} = r^{(i)} - WW^{\dagger}r^{(i)}. The main goal of this section is to prove the convergence of (3.2). The following lemma will help us reach this goal.
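Before turning to the analysis, here is a minimal NumPy sketch (an illustration under the assumption that V consists of selected identity columns, not code prescribed by the paper) of one oblique projection step (3.2), computing W^{\dagger}r^{(i)} by a least-squares solve:

import numpy as np

def oblique_step(A, b, x, idx):
    # One oblique projection step (3.2) with V = [e_{i_1}, ..., e_{i_m}]
    # and W = A V (i.e. L = A K); y = W^dagger r via least squares.
    r = b - A @ x
    V = np.eye(A.shape[0])[:, idx]
    W = A @ V
    y, *_ = np.linalg.lstsq(W, r, rcond=None)
    return x + V @ y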

Lemma 3.1.

If \sigma_{1} is the maximum singular value of A and y = W^{T}r^{(i)}, then

\|r^{(i)}\|^{2} - \|r^{(i+1)}\|^{2} \geq \frac{1}{\sigma_{1}^{2}}\|y\|^{2}.
Proof.

As (WW^{\dagger})^{T} = WW^{\dagger} and WW^{\dagger}W = W, we have

\|r^{(i+1)}\|^{2} = (r^{(i)T} - r^{(i)T}(WW^{\dagger})^{T})(r^{(i)} - WW^{\dagger}r^{(i)}) = \|r^{(i)}\|^{2} - r^{(i)T}WW^{\dagger}r^{(i)}. (3.3)

Using the Courant-Fischer min-max principle [5], from (3.3) we obtain

\|r^{(i)}\|^{2} - \|r^{(i+1)}\|^{2} = y^{T}(W^{T}W)^{-1}y = \frac{\langle (W^{T}W)^{-1}y, y \rangle}{\|y\|^{2}}\|y\|^{2} (3.4)
\geq \lambda_{\min}((W^{T}W)^{-1})\|y\|^{2}
= \frac{1}{\lambda_{\max}(W^{T}W)}\|y\|^{2}
= \frac{1}{(\sigma_{\max}(W))^{2}}\|y\|^{2},

where \lambda_{\min} and \lambda_{\max} denote the minimum and maximum eigenvalues, and \sigma_{\max} denotes the maximum singular value, of the corresponding matrix.

Let W = \tilde{U}\Sigma\tilde{V}^{T} be the singular value decomposition of W. If \tilde{U} = [\tilde{u}_{1}, \tilde{u}_{2}, \ldots, \tilde{u}_{n}] and \tilde{V} = [\tilde{v}_{1}, \tilde{v}_{2}, \ldots, \tilde{v}_{m}], then W\tilde{v}_{1} = \sigma_{1}(W)\tilde{u}_{1}, so that

(\sigma_{1}(W))^{2} = \|\sigma_{1}(W)\tilde{u}_{1}\|^{2} = \|W\tilde{v}_{1}\|^{2} = \|AV\tilde{v}_{1}\|^{2} \leq \|AV\|^{2} \leq \|A\|^{2}\|V\|^{2} = \sigma_{1}^{2}.

Hence the result follows from (3.4). ∎

In Theorem 3 of [3], the author proved the convergence of the method (2.2) for SPD matrices and also gave an idea of how to choose the optimal vectors v_{i}. A similar idea is used here to prove the convergence of (3.2). The next theorem is due to [6] (see Ch. 3, Cor. 3.1.1); it gives the relation between the singular values of a matrix and those of its submatrices.

Theorem 3.2.

[6] If A is an m \times n matrix and A_{l} denotes a submatrix of A obtained by deleting a total of l rows and/or columns from A, then

\sigma_{k}(A) \geq \sigma_{k}(A_{l}) \geq \sigma_{k+l}(A), \quad k = 1 : \min\{m, n\},

where the singular values \sigma_{i} are arranged in decreasing order.

We now prove our main theorem, in which the singular values of the matrix under consideration are assumed to be arranged in decreasing order.

Theorem 3.3.

Let \sigma_{1} \geq \sigma_{2} \geq \ldots \geq \sigma_{n} be the singular values of A. If i_{1} < i_{2} < \ldots < i_{m} and v_{j} = e_{i_{j}}, the i_{j}-th column of the identity matrix, then

\|r^{(i+1)}\|^{2} \leq \left(1 - \frac{\sigma_{n}^{2}}{\sigma_{1}^{2}}\right)\|r^{(i)}\|^{2}. (3.5)
Proof.

Let \alpha = \{i_{1}, i_{2}, \ldots, i_{m}\} and A_{m} = A[:, \alpha]^{T}. Then y = W^{T}r^{(i)} = A_{m}r^{(i)}. Since A_{m} has full row rank, arguing as in (3.4) we infer

\|y\|^{2} = \frac{\|A_{m}r^{(i)}\|^{2}}{\|r^{(i)}\|^{2}}\|r^{(i)}\|^{2} \geq \sigma_{\min}^{2}(A_{m})\|r^{(i)}\|^{2} = \sigma_{m}^{2}(A_{m})\|r^{(i)}\|^{2}.

Taking l = n - m and k = m in Theorem 3.2 gives \sigma_{m}(A_{m}) \geq \sigma_{n}(A), so \|y\|^{2} \geq \sigma_{n}^{2}\|r^{(i)}\|^{2}. Hence, from Lemma 3.1 we conclude that

\|r^{(i)}\|^{2} - \|r^{(i+1)}\|^{2} \geq \frac{\sigma_{n}^{2}}{\sigma_{1}^{2}}\|r^{(i)}\|^{2},

and hence the conclusion follows. ∎

Remark 3.4.

The quantity \|r^{(i)}\|^{2} - \|r^{(i+1)}\|^{2} is larger when a larger value of m is chosen. This can be seen from (3.3) by following steps similar to those used in proving Theorem 2.2.

Remark 3.5.

Under the assumptions of Theorem 3.3, and since A is non-singular so that 0 \leq 1 - \sigma_{n}^{2}/\sigma_{1}^{2} < 1, equation (3.5) shows that the iteration process (3.2) converges.
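The bound (3.5) can also be checked numerically for a single step. The sketch below (assuming NumPy; the test matrix and right-hand side are randomly generated and purely illustrative) compares the residual reduction of one step of (3.2) with the factor 1 - \sigma_{n}^{2}/\sigma_{1}^{2}:

import numpy as np

# Quick numerical check (not a proof) of the bound (3.5) for one step of (3.2)
# with coordinate columns v_j = e_{i_j}.
rng = np.random.default_rng(1)
n, m = 30, 4
A = rng.standard_normal((n, n))        # generic (almost surely non-singular) matrix
b = rng.standard_normal(n)
x = np.zeros(n)

s = np.linalg.svd(A, compute_uv=False) # singular values in decreasing order
factor = 1.0 - (s[-1] / s[0]) ** 2

r_old = b - A @ x
V = np.eye(n)[:, :m]                   # here simply i_j = j
W = A @ V                              # W = A V, so L = A K
y, *_ = np.linalg.lstsq(W, r_old, rcond=None)
r_new = r_old - W @ y
assert np.linalg.norm(r_new) ** 2 <= factor * np.linalg.norm(r_old) ** 2 + 1e-12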

4 Numerical Experiments

In this section, mD-OPM is compared with various methods, such as CGNR, GMRES and Craig's method [7], on non-singular linear systems.

The algorithm for the mD-OPM discussed in Section 3 is as follows; it is the same as the one proposed in [3], applied to the symmetric system A^{T}Ax = A^{T}b.

Algorithm 1 [3] A particular implementation of arbitrary-dimensional OPM

1. Choose an initial guess x^{(0)} \in \mathbb{R}^{n} and decide on m, the number of times each component of x^{(0)} is improved in each iteration.
2. Until convergence, Do
3.   x = x^{(0)}, r = b - Ax
4.   For i = 1, 2, \dots, n, Do
5.     Select the indices i_{1}, i_{2}, \dots, i_{m} of r
6.     E_{m} = [e_{i_{1}}, e_{i_{2}}, \dots, e_{i_{m}}]
7.     Solve (E_{m}^{T}A^{T}AE_{m})y_{m} = E_{m}^{T}A^{T}r for y_{m}
8.     x = x + E_{m}y_{m}
9.     r = r - AE_{m}y_{m}
10.  End Do
11.  x^{(0)} = x
12. End Do
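A minimal NumPy sketch of Algorithm 1 is given below. It is an illustrative implementation under assumptions: the index-selection rule in step 5 is not specified above, so the sketch picks the m components of r with largest magnitude, and the stopping criterion is the one used in the experiments of this section.

import numpy as np

def md_opm(A, b, m, x0=None, tol=1e-12, max_sweeps=1000):
    # Sketch of Algorithm 1 (mD-OPM applied to A^T A x = A^T b).
    n = A.shape[0]
    x = np.zeros(n) if x0 is None else np.array(x0, dtype=float)
    r = b - A @ x
    for _ in range(max_sweeps):
        x_prev = x.copy()
        for _ in range(n):                                # step 4: inner cycle
            idx = np.argsort(np.abs(r))[-m:]              # step 5 (assumed selection rule)
            Em = np.eye(n)[:, idx]                        # step 6
            AEm = A @ Em
            ym = np.linalg.solve(AEm.T @ AEm, AEm.T @ r)  # step 7
            x += Em @ ym                                  # step 8
            r -= AEm @ ym                                 # step 9
        if np.linalg.norm(x - x_prev) < tol:              # stopping criterion of Section 4
            break
    return x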

The experiments are done on a PC with an Intel(R) Core(TM) i3-7100U CPU @ 2.40 GHz and 4 GB RAM. The computations are implemented in MATLAB 9.2.0.538062. The initial guess is x^{(0)} = [0, 0, \dots, 0]^{T} and the stopping criterion is \|x^{(i+1)} - x^{(i)}\| < 10^{-12}.

In the comparisons, we consider different values of m for mD-OPM. The theory suggests that mD-OPM converges well for matrices whose singular values are closely spaced; hence we choose the test matrices accordingly.

Example 4.1.

The first matrix is a symmetric n \times n Hankel matrix with elements A(i, j) = \frac{0.5}{n - i - j + 1.5}. The eigenvalues of A cluster around -\frac{\pi}{2} and \frac{\pi}{2}, and the condition number is of O(1). The matrix is of size 100. Comparison is done for different values of m as well as with the CGNR, GMRES and Craig's methods; the results are reported in Table 1 after the construction sketched below.
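The sketch below (assuming NumPy and 1-based indices i, j in the formula; the right-hand side b is a hypothetical choice with known solution, not specified in the text) builds this Hankel test matrix:

import numpy as np

n = 100
i = np.arange(1, n + 1)                                  # 1-based indices
A = 0.5 / (n - i[:, None] - i[None, :] + 1.5)            # A(i, j) = 0.5/(n - i - j + 1.5)
b = A @ np.ones(n)                                       # hypothetical right-hand side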

Table 1: Results for Example 4.1
Iteration Process | No. of Iterations | Residual
6D-OPM | 14 | 3.5755×10^{-12}
10D-OPM | 8 | 4.6142×10^{-12}
50D-OPM | 2 | 3.8×10^{-15}
GMRES | 10 | 3.7×10^{-15}
CGNR | 9 | 5.3427×10^{-15}
Craig | 9 | 4.9704×10^{-15}
Example 4.2.

We consider a square matrix of size n with singular values 1 + 10^{-i}, i = 1 : n. This is again a matrix with a very good condition number. For such a well-conditioned matrix, mD-OPM performs very well and is better than CGNR. The matrix taken here is of size 400; one way to construct such a test matrix is sketched below, followed by the results in Table 2.
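The factorized construction below (a sketch assuming NumPy; only the singular values are specified in the text, and the random orthogonal factors are an assumption) produces a matrix with the prescribed singular values:

import numpy as np

rng = np.random.default_rng(2)
n = 400
U, _ = np.linalg.qr(rng.standard_normal((n, n)))         # random orthogonal factor (assumed)
V, _ = np.linalg.qr(rng.standard_normal((n, n)))         # random orthogonal factor (assumed)
s = 1.0 + 10.0 ** (-np.arange(1, n + 1, dtype=float))    # prescribed singular values 1 + 10^{-i}
A = U @ np.diag(s) @ V.T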

Table 2: Results for Example 4.2
Iteration Process | No. of Iterations | Residual
4D-OPM | 1 | 2.1618×10^{-15}
CGNR | 6 | 5.3328×10^{-15}
Craig | 6 | 5.4702×10^{-15}

5 Conclusion

mD-OPM, presented in this paper, is a generalization of mD-SPM [3] and can be applied to any non-singular system. Numerical experiments showed that this method is on par with other established methods. The way in which the search subspace is chosen gives this method a clear advantage over GMRES, because in GMRES the orthogonalisation through the Arnoldi process can lead to infeasible growth in storage requirements.


References

  • [1] N. Ujević, A new iterative method for solving linear systems, Applied Mathematics and Computation 179 (2006) 725–730.
  • [2] Y. F. Jing, T. Z. Huang, On a new iterative method for solving linear systems and comparison results, Journal of Computational and Applied Mathematics 220 (2008) 74–84.
  • [3] D. K. Salkuyeh, A generalization of the 2D-DSPM for solving linear system of equations, arXiv preprint arXiv:0906.1798v1 (2009), http://arxiv.org/abs/0906.1798v1.
  • [4] G. Hou, L. Wang, A generalized iterative method and comparison results using projection techniques for solving linear systems, Applied Mathematics and Computation 215 (2009) 806–817.
  • [5] Y. Saad, Iterative Methods for Sparse Linear Systems, second ed., SIAM, Philadelphia.
  • [6] R. A. Horn, C. R. Johnson, Topics in matrix analysis, Cambridge University Press.
  • [7] N. N. Kalitkin, L. V. Kuz’mina, Convergence of Craig’s method for linear algebraic systems, Mathematical Models and Computer Simulations 4 (2012) 509–526.