
On a formula of Thompson and McEnteggert for the adjugate matrix

Kenier Castillo¹ (kenier@mat.uc.pt), CMUC, Department of Mathematics, University of Coimbra, 3001-501 Coimbra, Portugal.
Ion Zaballa² (ion.zaballa@ehu.es), Departamento de Matemática Aplicada y EIO, Universidad del País Vasco (UPV/EHU), Apdo. 644, 48080 Bilbao, Spain.

¹ Partially supported by the Centre for Mathematics of the University of Coimbra - UIDB/00324/2020, funded by the Portuguese Government through FCT/MCTES.
² Partially supported by “Ministerio de Economía, Industria y Competitividad (MINECO)” of Spain and “Fondo Europeo de Desarrollo Regional (FEDER)” of EU through grants MTM2017-83624-P and MTM2017-90682-REDT, and by UPV/EHU through grant GIU16/42.
Abstract

For an eigenvalue $\lambda_{0}$ of a Hermitian matrix $A$, the formula of Thompson and McEnteggert gives an explicit expression for the adjugate of $\lambda_{0}I-A$, $\mathop{\rm Adj}(\lambda_{0}I-A)$, in terms of eigenvectors of $A$ for $\lambda_{0}$ and all its eigenvalues. In this paper the Thompson-McEnteggert formula is generalized to include any matrix with entries in an arbitrary field. In addition, for any nonsingular matrix $A$, a formula for the elementary divisors of $\mathop{\rm Adj}(A)$ is provided in terms of those of $A$. Finally, a generalization of the eigenvalue-eigenvector identity and two applications of the Thompson-McEnteggert formula are presented.

keywords:
Adjugate, eigenvalues, eigenvectors, elementary divisors, rank-one matrices.
MSC:
15A18, 15A15

1 Introduction

Let $\mathcal{R}$ be a commutative ring with identity. Following [16, Ch. 30], for a polynomial $p(\lambda)=\sum_{k=0}^{n}p_{k}\lambda^{k}\in\mathcal{R}[\lambda]$ its derivative is $p^{\prime}(\lambda)=\sum_{k=1}^{n}kp_{k}\lambda^{k-1}$. Recall that if $X\in\mathcal{R}^{n\times n}$ is a square matrix of order $n$ with entries in $\mathcal{R}$ and $M_{ij}(X)$ is the minor obtained from $X$ by deleting the $i$th row and $j$th column, then the adjugate of $X$, $\mathop{\rm Adj}(X)$, is the matrix whose $(i,j)$ entry is $(-1)^{i+j}M_{ji}(X)$; that is, $\mathop{\rm Adj}(X)=\begin{bmatrix}(-1)^{i+j}M_{ji}(X)\end{bmatrix}_{1\leq i,j\leq n}$.
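As a quick illustration (added in this version; not part of the original text), the following SymPy lines check the defining property $X\mathop{\rm Adj}(X)=\mathop{\rm Adj}(X)X=(\det X)I_{n}$ on an ad hoc integer matrix; SymPy's `adjugate` uses the same transposed-cofactor convention.

```python
# Illustration: X * Adj(X) = Adj(X) * X = det(X) * I for a sample matrix.
from sympy import Matrix, eye

X = Matrix([[1, 2, 0],
            [0, 3, 1],
            [4, 0, 1]])
adjX = X.adjugate()            # (i, j) entry is (-1)**(i+j) * M_ji(X)
assert X * adjX == X.det() * eye(3)
assert adjX * X == X.det() * eye(3)
```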

Formula (1) below, from now on the TM formula, was proved, with $w=v$ and the normalization $w^{\ast}v=1$, for a Hermitian matrix $A\in\mathbb{C}^{n\times n}$ by Thompson and McEnteggert (see [33, pp. 212-213]). Inspection of the proof shows that the formula also holds for normal matrices over $\mathbb{C}$ (see [28]). With the same arguments we can go further. Recently, Denton, Parke, Tao, and Zhang pointed out that the TM formula extends to a non-normal matrix $A\in\mathbb{R}^{n\times n}$, as long as it is diagonalizable (see [12, Rem. 4]). Even more, as shown in Remark 5 of [12], it holds for matrices over commutative rings (see [17] for an informal proof). A more detailed proof of this result will be given in Section 2. However, for matrices over fields (or over integral domains) with repeated eigenvalues, (1) does not provide meaningful information (see Remark 2.4). We will exhibit in Section 2 a generalization of the TM formula which holds for matrices over arbitrary fields with repeated eigenvalues. This new TM formula will be used to generalize the so-called eigenvector-eigenvalue identity (see (20)) to non-diagonalizable matrices over arbitrary fields. In addition, we will provide a complete characterization of the similarity invariants of $\mathop{\rm Adj}(A)$ in terms of those of $A$, generalizing a result about the eigenvalues and the minimal polynomial in [18]. Then, in Section 3, two additional consequences of the TM formula will be analysed.

2 The TM formula and its generalization

Let $A\in\mathcal{R}^{n\times n}$ be a square matrix of order $n$ with entries in $\mathcal{R}$. An element $\lambda_{0}\in\mathcal{R}$ is said to be an eigenvalue of $A$ if $Ax=\lambda_{0}x$ for some nonzero vector $x\in\mathcal{R}^{n\times 1}$ ([7, Def. 17.1]). This vector is said to be a right eigenvector of $A$ for (or associated with) $\lambda_{0}$. The left eigenvectors of $A$ for $\lambda_{0}$ are the right eigenvectors for $\lambda_{0}$ of $A^{T}$, the transpose of $A$, or, if $\mathcal{R}=\mathbb{C}$ is the field of complex numbers, of $A^{\ast}$, the conjugate transpose of $A$. That is to say, $y\in\mathcal{R}^{n\times 1}$ is a left eigenvector of $A$ for $\lambda_{0}$ if $y^{T}A=\lambda_{0}y^{T}$ (or $y^{\ast}A=\lambda_{0}y^{\ast}$ if $\mathcal{R}=\mathbb{C}$). The characteristic polynomial of $A$ is $p_{A}(\lambda)=\det(\lambda I_{n}-A)$ and $\lambda_{0}$ is an eigenvalue of $A$ if and only if $p_{A}(\lambda_{0})\in Z(\mathcal{R})$, where $Z(\mathcal{R})$ is the set of zero divisors of $\mathcal{R}$ ([7, Lem. 17.2]).

The following result, in a slightly different form, was proved by D. Grinberg in [17].

Theorem 2.1.

Let $A\in\mathcal{R}^{n\times n}$ and let $\lambda_{0}\in\mathcal{R}$ be an eigenvalue of $A$. Let $v,w\in\mathcal{R}^{n\times 1}$ be a right and a left eigenvector, respectively, of $A$ for $\lambda_{0}$. Then

$w^{T}v\,\mathop{\rm Adj}(\lambda_{0}I_{n}-A)=p^{\prime}_{A}(\lambda_{0})\,vw^{T}.$ (1)
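The following SymPy sketch (an illustration added in this version, built on an ad hoc diagonalizable rational matrix) verifies (1) for the simple eigenvalue $\lambda_{0}=2$.

```python
# Illustration: TM formula (1) for a diagonalizable matrix, lambda0 = 2 simple.
from sympy import Matrix, diag, eye, symbols

lam = symbols('lambda')
P = Matrix([[1, 1, 0], [0, 1, 1], [1, 0, 1]])
A = P * diag(1, 2, 4) * P.inv()           # eigenvalues 1, 2, 4

lam0 = 2
v = P[:, 1]                               # right eigenvector for lambda0
w = (P.inv().T)[:, 1]                     # left eigenvector for lambda0
dpA = A.charpoly(lam).as_expr().diff(lam)

lhs = (w.T * v)[0] * (lam0 * eye(3) - A).adjugate()
rhs = dpA.subs(lam, lam0) * v * w.T
assert lhs == rhs
```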

The proof in [17] is based on Lemma 2.2 below, which is interesting in its own right. According to McCoy's theorem ([7, Th. 5.3]) there is a nonzero vector $x\in\mathcal{R}^{n\times 1}$ such that $Ax=0$ if and only if $\mathop{\rm rk}(A)<n$, where $\mathop{\rm rk}(A)$ is the (McCoy) rank of $A$ ([7, Def. 4.10]). In other words, $0$ is an eigenvalue of $A$ if and only if $\mathop{\rm rk}(A)<n$. Note that $\mathop{\rm rk}(A)=\mathop{\rm rk}(A^{T})$.

Lemma 2.2.

Let $A\in\mathcal{R}^{n\times n}$ be a matrix such that $\mathop{\rm rk}(A)<n$ and let $w\in\mathcal{R}^{n\times 1}$ be a left eigenvector of $A$ for the eigenvalue $0$. For $j=1,\ldots,n$, let $(\mathop{\rm Adj}A)_{j}$ be the $j$th column of $\mathop{\rm Adj}(A)$. Then, for all $i,j=1,\ldots,n$,

$w_{i}(\mathop{\rm Adj}A)_{j}=w_{j}(\mathop{\rm Adj}A)_{i},$ (2)

where $w=\begin{bmatrix}w_{1}&w_{2}&\cdots&w_{n}\end{bmatrix}^{T}$.

This is Lemma 3 of [17]. The author himself considers the proof to be informal, so a detailed proof of Lemma 2.2, following Grinberg’s ideas (Grinberg’s permission was granted to include the proofs of this lemma and of Theorem 2.1), is given next for the reader’s convenience.

Proof of Lemma 2.2.

Let us take $i,j\in\{1,\ldots,n\}$ and assume that $i\neq j$; otherwise, there is nothing to prove. We assume also, without loss of generality, that $i<j$. Let $w=\begin{bmatrix}w_{1}&w_{2}&\cdots&w_{n}\end{bmatrix}^{T}$ and, for $k=1,\ldots,n$, let $a_{k}$ be the $k$th row of $A$. Define $B\in\mathcal{R}^{n\times n}$ to be the matrix whose $k$th row, $b_{k}$, is equal to $a_{k}$ if $k\neq i,j$ and $b_{k}=w_{k}a_{k}$ if $k=i,j$. A simple computation shows that $w_{i}(\mathop{\rm Adj}A)_{j}=(\mathop{\rm Adj}B)_{j}$ and $w_{j}(\mathop{\rm Adj}A)_{i}=(\mathop{\rm Adj}B)_{i}$. We claim that $(\mathop{\rm Adj}B)_{j}=(\mathop{\rm Adj}B)_{i}$; this would prove the lemma.

It follows from $w^{T}A=0$ that $\sum_{k=1}^{n}w_{k}a_{k}=0$ and so

$b_{i}+b_{j}=-\sum_{k=1,\,k\neq i,j}^{n}w_{k}b_{k}.$ (3)

Let

$P=\begin{bmatrix}1&&&&&&&&&&\\ &\ddots&&&&&&&&&\\ &&1&&&&&&&&\\ -w_{1}&\cdots&-w_{i-1}&-1&-w_{i+1}&\cdots&-w_{j-1}&0&-w_{j+1}&\cdots&-w_{n}\\ &&&&1&&&&&&\\ &&&&&\ddots&&&&&\\ &&&&&&1&&&&\\ -w_{1}&\cdots&-w_{i-1}&0&-w_{i+1}&\cdots&-w_{j-1}&-1&-w_{j+1}&\cdots&-w_{n}\\ &&&&&&&&1&&\\ &&&&&&&&&\ddots&\\ &&&&&&&&&&1\end{bmatrix},$

where the two full rows displayed are the $i$th and the $j$th rows of $P$, and all unspecified entries are zero.

This matrix is invertible in $\mathcal{R}$ (its determinant is $1$) and, by (3),

$\widetilde{B}=PB=\begin{bmatrix}b_{1}^{T}&\cdots&b_{i-1}^{T}&b_{j}^{T}&b_{i+1}^{T}&\cdots&b_{j-1}^{T}&b_{i}^{T}&b_{j+1}^{T}&\cdots&b_{n}^{T}\end{bmatrix}^{T}.$

Then $\mathop{\rm Adj}(\widetilde{B})=\mathop{\rm Adj}(B)\mathop{\rm Adj}(P)$ and, since $P$ is invertible, $\mathop{\rm Adj}(P)=(\det P)P^{-1}=P^{-1}$. Hence $\mathop{\rm Adj}(B)=\mathop{\rm Adj}(\widetilde{B})P$ and, for $k=1,\ldots,n$,

$(\mathop{\rm Adj}B)_{ki}=\sum_{\ell=1}^{n}(\mathop{\rm Adj}\widetilde{B})_{k\ell}P_{\ell i}.$

But in the $i$th column of $P$ the only nonzero entry is $-1$, in position $(i,i)$. Therefore, $(\mathop{\rm Adj}B)_{ki}=-(\mathop{\rm Adj}\widetilde{B})_{ki}$. Now, taking into account that $\widetilde{B}$ is the matrix $B$ with the $i$th and $j$th rows interchanged, and recalling that $M_{ij}(X)$ is the minor of $X$ obtained by deleting the $i$th row and $j$th column of $X$, we get

$\begin{array}{rcl}(\mathop{\rm Adj}B)_{ki}&=&-(\mathop{\rm Adj}\widetilde{B})_{ki}=(-1)^{k+i+1}M_{ik}(\widetilde{B})\\ &=&(-1)^{k+i+1}(-1)^{j-i-1}M_{jk}(B)\\ &=&(-1)^{k+j}M_{jk}(B)=(\mathop{\rm Adj}B)_{kj},\end{array}$

as claimed. ∎
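The identity (2) is easy to test numerically. The following SymPy check (an added illustration; the singular matrix is an ad hoc choice) runs over all pairs $(i,j)$.

```python
# Illustration: identity (2) for a rank-2 integer matrix and a left null vector.
from sympy import Matrix

A = Matrix([[1, 2, 3],
            [2, 4, 6],
            [1, 1, 1]])                   # rank 2, so 0 is an eigenvalue
w = Matrix([2, -1, 0])
assert (w.T * A).is_zero_matrix           # w is a left eigenvector for 0

adjA = A.adjugate()
for i in range(3):
    for j in range(3):
        assert w[i] * adjA[:, j] == w[j] * adjA[:, i]
```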

There is a “row version” of Lemma 2.2 which can be proved along the same lines.

Lemma 2.3.

Let $A\in\mathcal{R}^{n\times n}$ be a matrix such that $\mathop{\rm rk}(A)<n$ and let $v\in\mathcal{R}^{n\times 1}$ be a right eigenvector of $A$ for the eigenvalue $0$. For $j=1,\ldots,n$, let $(\mathop{\rm Adj}A)^{j}$ be the $j$th row of $\mathop{\rm Adj}(A)$. Then, for all $i,j=1,\ldots,n$,

$v_{i}(\mathop{\rm Adj}A)^{j}=v_{j}(\mathop{\rm Adj}A)^{i},$ (4)

where $v=\begin{bmatrix}v_{1}&v_{2}&\cdots&v_{n}\end{bmatrix}^{T}$.

The proof of Theorem 2.1 which follows is very much that of Grinberg in [17]. It is included for completeness and the reader’s convenience.

Proof of Theorem 2.1.

Let $B=\lambda_{0}I_{n}-A$ and let $p_{B}(\lambda)=\det(\lambda I_{n}-B)$ be its characteristic polynomial. Then $p_{B}(\lambda)=\lambda^{n}+\sum_{k=1}^{n}(-1)^{k}c_{k}\lambda^{n-k}$ where, for $k=1,\ldots,n$, $c_{k}=\sum_{1\leq i_{1}<\cdots<i_{k}\leq n}\det B(i_{1},\ldots,i_{k})$ and $B(i_{1},\ldots,i_{k})=\begin{bmatrix}b_{i_{j},i_{\ell}}\end{bmatrix}_{1\leq j,\ell\leq k}$ is the principal submatrix of $B$ formed by the rows and columns $i_{1},\ldots,i_{k}$. In particular, $c_{n-1}=\sum_{j=1}^{n}M_{jj}(B)$, where $M_{jj}(B)$ is the principal minor of $B$ obtained by deleting the $j$th row and column. Thus $p^{\prime}_{B}(0)=(-1)^{n-1}\sum_{j=1}^{n}M_{jj}(B)$.

On the other hand, $\det(\lambda I_{n}-B)=\det(\lambda I_{n}-\lambda_{0}I_{n}+A)=(-1)^{n}\det((\lambda_{0}-\lambda)I_{n}-A)=(-1)^{n}p_{A}(\lambda_{0}-\lambda)$. It follows from the definition of the derivative of a polynomial that

$p^{\prime}_{A}(\lambda_{0})=(-1)^{n+1}p^{\prime}_{B}(0)=\sum_{j=1}^{n}M_{jj}(\lambda_{0}I_{n}-A).$

Hence, proving (1) is equivalent to proving

$w^{T}v\,\mathop{\rm Adj}(B)=\sum_{j=1}^{n}M_{jj}(B)\,vw^{T},$ (5)

where $B=\lambda_{0}I_{n}-A$. It follows from $Av=\lambda_{0}v$ and $w^{T}A=\lambda_{0}w^{T}$ that $Bv=0$ and $w^{T}B=0$, respectively. So we can apply to $B$ properties (2) and (4). It follows from (2) that $w_{k}(\mathop{\rm Adj}B)_{ij}=w_{j}(\mathop{\rm Adj}B)_{ik}$ for all $i,j,k\in\{1,\ldots,n\}$. Then $v_{k}w_{k}(\mathop{\rm Adj}B)_{ij}=w_{j}v_{k}(\mathop{\rm Adj}B)_{ik}$ and, from (4), $v_{k}(\mathop{\rm Adj}B)_{ik}=v_{i}(\mathop{\rm Adj}B)_{kk}$. Hence,

$v_{k}w_{k}(\mathop{\rm Adj}B)_{ij}=v_{i}w_{j}(\mathop{\rm Adj}B)_{kk},\quad i,j,k=1,\ldots,n.$

Adding over $k$ and taking into account that $(\mathop{\rm Adj}B)_{kk}=M_{kk}(B)$, we get

$w^{T}v\,(\mathop{\rm Adj}B)_{ij}=\sum_{k=1}^{n}M_{kk}(B)\,v_{i}w_{j},\quad i,j=1,\ldots,n.$

This is equivalent to (5) and the theorem follows. ∎

Remark 2.4.

Assume that $\mathcal{R}$ is an integral domain and note that in this case $\mathop{\rm rk}(A)=\mathop{\rm rank}(A)$; i.e., the McCoy rank and the usual rank coincide. It is an interesting consequence of (1) that $w^{T}v=0$ implies $p^{\prime}_{A}(\lambda_{0})=0$. The converse is not true in general. For example, if $A=\lambda_{0}I_{2}$ then $v=\begin{bmatrix}1&0\end{bmatrix}^{T}$ satisfies both $Av=\lambda_{0}v$ and $v^{T}A=\lambda_{0}v^{T}$, but $v^{T}v=1$ and $p^{\prime}_{A}(\lambda_{0})=0$. However, if $p^{\prime}_{A}(\lambda_{0})=0$ and $\mathop{\rm rank}(\lambda_{0}I_{n}-A)=n-1$ then, necessarily, $w^{T}v=0$, because $\mathop{\rm Adj}(\lambda_{0}I_{n}-A)$ is not the zero matrix. In particular, if $\mathbb{F}$ is a field of characteristic zero (see [16, Ch. 30]), then it follows from (1) that if $w^{T}v=0$ then $\lambda_{0}$ is an eigenvalue of algebraic multiplicity at least $2$. On the other hand, it is easily checked that if $\lambda_{0}$ is an eigenvalue of algebraic multiplicity bigger than $1$ and geometric multiplicity $1$ then $w^{T}v=0$ for any right and left eigenvectors, $v$ and $w$ respectively, of $A$ for $\lambda_{0}$. This is the case, for example, of $A=\begin{bmatrix}0&0\\ 1&0\end{bmatrix}$. For this matrix, the TM formula (1) does not provide any substantial information about $\mathop{\rm Adj}(\lambda_{0}I_{n}-A)$ because, in this case, $w^{T}v=0$ and $p_{A}^{\prime}(\lambda_{0})=0$. Thus, the TM formula (1) is relevant only for matrices with simple eigenvalues. $\Box$
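The degenerate behaviour described above is easy to reproduce (an added illustration, using the matrix of the remark):

```python
# Illustration: both sides of (1) vanish for the defective matrix of Remark 2.4.
from sympy import Matrix, symbols

lam = symbols('lambda')
A = Matrix([[0, 0], [1, 0]])              # eigenvalue 0, algebraic mult. 2
v = Matrix([0, 1])                        # right eigenvector: A*v = 0
w = Matrix([1, 0])                        # left eigenvector: w.T*A = 0
pA = A.charpoly(lam).as_expr()            # lambda**2

assert (w.T * v)[0] == 0                  # w^T v = 0 ...
assert pA.diff(lam).subs(lam, 0) == 0     # ... and p_A'(0) = 0
```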

Our next goal is to provide a generalization of the TM formula (1) which is meaningful for nondiagonalizable matrices over fields. We will use the following notation: $\mathbb{F}$ will denote an arbitrary field. If $A\in\mathbb{F}^{n\times n}$ then $p_{1}(\lambda),\ldots,p_{r}(\lambda)$ will be its (possibly repeated) elementary divisors in $\mathbb{F}$ ([15, Ch. VI, Sec. 3]). These are powers of monic irreducible polynomials of $\mathbb{F}[\lambda]$ (the ring of polynomials with coefficients in $\mathbb{F}$). We will assume that, for $j=1,\ldots,r$,

$p_{j}(\lambda)=\lambda^{d_{j}}+a_{j1}\lambda^{d_{j}-1}+a_{j2}\lambda^{d_{j}-2}+\cdots+a_{jd_{j}-1}\lambda+a_{jd_{j}}.$

Let $\Delta_{A}$ denote the determinant of $A$ and $\Lambda(A)$ the set of eigenvalues (the spectrum) of $A$ in, perhaps, an extension field $\widetilde{\mathbb{F}}$ of $\mathbb{F}$. Thus $\lambda_{0}\in\Lambda(A)$ if and only if it is a root in $\widetilde{\mathbb{F}}$ of $p_{j}(\lambda)$ for some $j\in\{1,2,\ldots,r\}$. In particular, $p_{A}(\lambda)=\prod_{j=1}^{r}p_{j}(\lambda)$ is the characteristic polynomial of $A$.

Item (ii) of the following theorem is an elementary result that is included for completeness.

Theorem 2.5.

With the above notation:

  • (i)

    If $0\not\in\Lambda(A)$ then the elementary divisors of $\mathop{\rm Adj}(A)$ are $q_{1}(\lambda),\ldots,q_{r}(\lambda)$ where, for $j=1,\ldots,r$,

    $q_{j}(\lambda)=\lambda^{d_{j}}+\Delta_{A}\frac{a_{jd_{j}-1}}{a_{jd_{j}}}\lambda^{d_{j}-1}+\cdots+\Delta_{A}^{d_{j}-1}\frac{a_{j1}}{a_{jd_{j}}}\lambda+\Delta_{A}^{d_{j}}\frac{1}{a_{jd_{j}}}.$ (6)
  • (ii)

    If $0\in\Lambda(A)$ and there are two indices $i,k\in\{1,\ldots,r\}$, $i\neq k$, such that $p_{i}(0)=p_{k}(0)=0$, then $\mathop{\rm Adj}(A)=0$.

  • (iii)

    If $0\in\Lambda(A)$, $p_{k}(0)=0$ for only one value $k\in\{1,\ldots,r\}$, and $u,v\in\mathbb{F}^{n\times 1}$ are arbitrary right and left eigenvectors of $A$, respectively, for the eigenvalue $0$, then $v^{T}A^{d_{k}-1}u\neq 0$ and

    $\mathop{\rm Adj}(A)=\frac{(-1)^{n-1}}{d_{k}!}\,p_{A}^{(d_{k})}(0)\,\frac{uv^{T}}{v^{T}A^{d_{k}-1}u},$ (7)

    where $p_{A}^{(d_{k})}(\lambda)$ is the $d_{k}$th derivative of $p_{A}(\lambda)$.

Proof.

For $j=1,\ldots,r$, let the companion matrix of $p_{j}(\lambda)$ be

$C_{j}=\begin{bmatrix}0&0&\cdots&0&-a_{jd_{j}}\\ 1&0&\cdots&0&-a_{jd_{j}-1}\\ 0&1&\cdots&0&-a_{jd_{j}-2}\\ \vdots&\vdots&\ddots&\vdots&\vdots\\ 0&0&\cdots&1&-a_{j1}\end{bmatrix}.$ (8)

Then (see [15, Ch. VI, Sec. 6]) there is an invertible matrix $S\in\mathbb{F}^{n\times n}$ such that

$C=S^{-1}AS=\bigoplus_{j=1}^{r}C_{j}.$ (9)

An explicit computation shows that

$\mathop{\rm Adj}(C_{j})=(-1)^{d_{j}}\begin{bmatrix}-a_{jd_{j}-1}&a_{jd_{j}}&0&\cdots&0\\ -a_{jd_{j}-2}&0&a_{jd_{j}}&\cdots&0\\ \vdots&\vdots&\vdots&\ddots&\vdots\\ -a_{j1}&0&0&\cdots&a_{jd_{j}}\\ -1&0&0&\cdots&0\end{bmatrix}.$

Bearing in mind that $\det C_{j}=(-1)^{d_{j}}a_{jd_{j}}$, we obtain $\mathop{\rm Adj}(C)=\oplus_{j=1}^{r}L_{j}$ where, for $j=1,\ldots,r$,

$L_{j}=(-1)^{n}\prod_{i=1,i\neq j}^{r}a_{id_{i}}\begin{bmatrix}-a_{jd_{j}-1}&a_{jd_{j}}&0&\cdots&0\\ -a_{jd_{j}-2}&0&a_{jd_{j}}&\cdots&0\\ \vdots&\vdots&\vdots&\ddots&\vdots\\ -a_{j1}&0&0&\cdots&a_{jd_{j}}\\ -1&0&0&\cdots&0\end{bmatrix}.$ (10)

Therefore, from (9) we get

$\mathop{\rm Adj}(A)=S\left(\bigoplus_{j=1}^{r}L_{j}\right)S^{-1}.$ (11)
  • (i)

    Assume that $0\not\in\Lambda(A)$. This means that $a_{jd_{j}}\neq 0$ for all $j=1,\ldots,r$ and we can write

    $L_{j}=\det A\begin{bmatrix}-\frac{a_{jd_{j}-1}}{a_{jd_{j}}}&1&0&\cdots&0\\ -\frac{a_{jd_{j}-2}}{a_{jd_{j}}}&0&1&\cdots&0\\ \vdots&\vdots&\vdots&\ddots&\vdots\\ -\frac{a_{j1}}{a_{jd_{j}}}&0&0&\cdots&1\\ -\frac{1}{a_{jd_{j}}}&0&0&\cdots&0\end{bmatrix}.$

    Taking into account the definition of $q_{j}(\lambda)$ in (6),

    $\det(\lambda I_{d_{j}}-L_{j})=\Delta_{A}^{d_{j}}\left(\frac{\lambda^{d_{j}}}{\Delta_{A}^{d_{j}}}+\frac{a_{jd_{j}-1}}{a_{jd_{j}}}\frac{\lambda^{d_{j}-1}}{\Delta_{A}^{d_{j}-1}}+\cdots+\frac{a_{j1}}{a_{jd_{j}}}\frac{\lambda}{\Delta_{A}}+\frac{1}{a_{jd_{j}}}\right)=q_{j}(\lambda).$

    Let us see that $q_{j}(\lambda)$ is a power of an irreducible polynomial in $\mathbb{F}[\lambda]$. In fact, put

    $s_{j}(\lambda)=\lambda^{d_{j}}p_{j}\!\left(\frac{1}{\lambda}\right)=a_{jd_{j}}\lambda^{d_{j}}+a_{jd_{j}-1}\lambda^{d_{j}-1}+\cdots+a_{j1}\lambda+1.$

    This polynomial is sometimes called the reversal polynomial of $p_{j}(\lambda)$ (see, for example, [22]). Since $p_{j}(\lambda)$ is an elementary divisor of $A$ in $\mathbb{F}$, it is a power of an irreducible polynomial of $\mathbb{F}[\lambda]$. By [1, Lemma 4.4], $s_{j}(\lambda)$ is also a power of an irreducible polynomial. Now, it is not difficult to see that $q_{j}(\lambda)=\frac{\Delta_{A}^{d_{j}}}{a_{jd_{j}}}\,s_{j}\!\left(\frac{\lambda}{\Delta_{A}}\right)$ is a power of an irreducible polynomial too. As a consequence, $q_{1}(\lambda),q_{2}(\lambda),\ldots,q_{r}(\lambda)$ are the elementary divisors of $\mathop{\rm Adj}(C)=\oplus_{j=1}^{r}L_{j}$. Since this matrix and $\mathop{\rm Adj}(A)$ are similar (cf. (11)), $q_{1}(\lambda),q_{2}(\lambda),\ldots,q_{r}(\lambda)$ are the elementary divisors of $\mathop{\rm Adj}(A)$. This proves (i).

  • (ii)

    If $p_{i}(0)=p_{k}(0)=0$ for $i\neq k$, then $\mathop{\rm rank}(A)=\mathop{\rm rank}(C)\leq n-2$. Hence all minors of $A$ of order $n-1$ are equal to zero and so $\mathop{\rm Adj}(A)=0$.

  • (iii)

    Assume now that there is only one index $k\in\{1,\ldots,r\}$ such that $a_{kd_{k}}=0$. Then $p_{k}(\lambda)=\lambda^{d_{k}}$ because it is a power of an irreducible polynomial. Thus $a_{kj}=0$ for $j=1,\ldots,d_{k}$ and, by (8) and (10), $C_{k}=\begin{bmatrix}0&0\\ I_{d_{k}-1}&0\end{bmatrix}$ and

    $L_{k}=(-1)^{n-1}\prod_{j=1,j\neq k}^{r}a_{jd_{j}}\begin{bmatrix}0\\ 0\\ \vdots\\ 0\\ 1\end{bmatrix}\begin{bmatrix}1&0&\cdots&0&0\end{bmatrix}=(-1)^{n-1}\prod_{j=1,j\neq k}^{r}a_{jd_{j}}\,e_{d_{k}}e_{1}^{T},$ (12)

    respectively. Also, it follows from $a_{kd_{k}}=0$ that $L_{j}=0$ for $j=1,\ldots,r$, $j\neq k$.

    Recall now that $S^{-1}AS=C=\oplus_{j=1}^{r}C_{j}$ and split $S$ and $S^{-1}$ accordingly:

    $S=\begin{bmatrix}S_{1}&S_{2}&\cdots&S_{r}\end{bmatrix},\quad S^{-1}=\begin{bmatrix}T_{1}\\ T_{2}\\ \vdots\\ T_{r}\end{bmatrix},$

    with $S_{j}\in\mathbb{F}^{n\times d_{j}}$ and $T_{j}\in\mathbb{F}^{d_{j}\times n}$, $j=1,\ldots,r$. Then

    $AS_{k}=S_{k}C_{k},\quad T_{k}A=C_{k}T_{k}.$ (13)

    For $i=1,\ldots,d_{k}$, let $s_{ki}$ and $t_{ki}^{T}$ be the $i$th column and row of $S_{k}$ and $T_{k}$, respectively:

    $S_{k}=\begin{bmatrix}s_{k1}&s_{k2}&\cdots&s_{kd_{k}}\end{bmatrix},\quad T_{k}=\begin{bmatrix}t_{k1}^{T}\\ t_{k2}^{T}\\ \vdots\\ t_{kd_{k}}^{T}\end{bmatrix}.$

    Bearing in mind that $\mathop{\rm Adj}(A)=S(\oplus_{j=1}^{r}L_{j})S^{-1}$ (cf. (11)), the representation (12) of $L_{k}$ as a rank-one matrix, and that $L_{j}=0$ for $j\neq k$, we get

    $\mathop{\rm Adj}(A)=S_{k}L_{k}T_{k}=(-1)^{n-1}\left(\prod_{j=1,j\neq k}^{r}a_{jd_{j}}\right)s_{kd_{k}}t_{k1}^{T}.$ (14)

    Now, it follows from (13) that

    $\begin{array}{lll}s_{kj}=As_{k\,j-1},&\quad t_{k\,j-1}^{T}=t_{kj}^{T}A,&\quad j=2,3,\ldots,d_{k},\\ As_{kd_{k}}=0,&\quad t_{k1}^{T}A=0.&\end{array}$

    Hence $s_{kd_{k}}$ and $t_{k1}^{T}$ are right and left eigenvectors of $A$ for the eigenvalue $0$. Also, $\mathfrak{I}_{k}=\langle s_{k1},As_{k1},\ldots,A^{d_{k}-1}s_{k1}\rangle$ is a cyclic $A$-invariant subspace with $s_{k1}$ as generating vector. Similarly, $\mathfrak{J}_{k}=\langle t_{kd_{k}},A^{T}t_{kd_{k}},\ldots,(A^{T})^{d_{k}-1}t_{kd_{k}}\rangle$ is a cyclic $A^{T}$-invariant subspace with $t_{kd_{k}}$ as generating vector. Thus (14) is an explicit rank-one representation of $\mathop{\rm Adj}(A)$ in terms of a right and a left eigenvector of $A$ for the eigenvalue zero. Actually, this representation depends on a particular normalization of the vectors which span the cyclic subspaces $\mathfrak{I}_{k}$ and $\mathfrak{J}_{k}$; specifically, $T_{k}S_{k}=I_{d_{k}}$. However, we are looking for a more general representation in terms of arbitrary right and left eigenvectors, for which such a normalization may fail to hold.

    Let us assume that $u,v\in\mathbb{F}^{n\times 1}$ are arbitrary right and left eigenvectors of $A$ for the eigenvalue $0$. Then $Au=0$ and $v^{T}A=0$ and, since $\dim\ker A=\dim\ker A^{T}=1$, there are nonzero scalars $\alpha_{1},\beta_{1}\in\mathbb{F}$ such that $u=\alpha_{1}s_{kd_{k}}$ and $v=\beta_{1}t_{k1}$. Put $u_{d_{k}}=u$, $v_{1}=v$ and, for $j=1,2,\ldots,d_{k}-1$, define

    $\begin{array}{lcl}u_{d_{k}-j}&=&\alpha_{j+1}s_{kd_{k}}+\alpha_{j}s_{kd_{k}-1}+\cdots+\alpha_{1}s_{kd_{k}-j},\\ v_{j+1}&=&\beta_{j+1}t_{k1}+\beta_{j}t_{k2}+\cdots+\beta_{1}t_{k\,j+1},\end{array}$

    with $\alpha_{2},\ldots,\alpha_{d_{k}},\beta_{2},\ldots,\beta_{d_{k}}\in\mathbb{F}$ arbitrary scalars. Using these scalars we define the following triangular matrices:

    $X=\begin{bmatrix}\alpha_{1}&&&&\\ \alpha_{2}&\alpha_{1}&&&\\ \vdots&\vdots&\ddots&&\\ \alpha_{d_{k}-1}&\alpha_{d_{k}-2}&\cdots&\alpha_{1}&\\ \alpha_{d_{k}}&\alpha_{d_{k}-1}&\cdots&\alpha_{2}&\alpha_{1}\end{bmatrix},\quad Y=\begin{bmatrix}\beta_{1}&\beta_{2}&\cdots&\beta_{d_{k}-1}&\beta_{d_{k}}\\ &\beta_{1}&\cdots&\beta_{d_{k}-2}&\beta_{d_{k}-1}\\ &&\ddots&\vdots&\vdots\\ &&&\beta_{1}&\beta_{2}\\ &&&&\beta_{1}\end{bmatrix}.$

    It is plain that $\begin{bmatrix}u_{1}&u_{2}&\cdots&u_{d_{k}}\end{bmatrix}=\begin{bmatrix}s_{k1}&s_{k2}&\cdots&s_{kd_{k}}\end{bmatrix}X$ and also $\begin{bmatrix}v_{1}&v_{2}&\cdots&v_{d_{k}}\end{bmatrix}=\begin{bmatrix}t_{k1}&t_{k2}&\cdots&t_{kd_{k}}\end{bmatrix}Y$. Since $X$ and $Y$ are nonsingular for any choice of $\alpha_{2},\ldots,\alpha_{d_{k}},\beta_{2},\ldots,\beta_{d_{k}}$ (because $\alpha_{1}\neq 0$ and $\beta_{1}\neq 0$), we conclude that $\mathfrak{I}_{k}=\langle u_{1},u_{2},\ldots,u_{d_{k}}\rangle$ and $\mathfrak{J}_{k}=\langle v_{1},v_{2},\ldots,v_{d_{k}}\rangle$. In addition, for $j=1,2,\ldots,d_{k}-1$,

    $\begin{array}{rcl}Au_{d_{k}-j}&=&\alpha_{j+1}As_{kd_{k}}+\alpha_{j}As_{kd_{k}-1}+\cdots+\alpha_{1}As_{kd_{k}-j}\\ &=&\alpha_{j}s_{kd_{k}}+\alpha_{j-1}s_{kd_{k}-1}+\cdots+\alpha_{1}s_{kd_{k}-j+1}=u_{d_{k}-j+1},\end{array}$

    and

    $\begin{array}{rcl}v_{j}^{T}A&=&\beta_{j}t_{k1}^{T}A+\beta_{j-1}t_{k2}^{T}A+\cdots+\beta_{1}t_{kj}^{T}A\\ &=&\beta_{j-1}t_{k1}^{T}+\beta_{j-2}t_{k2}^{T}+\cdots+\beta_{1}t_{k\,j-1}^{T}=v_{j-1}^{T}.\end{array}$

    In other words, $u_{1}$ and $v_{d_{k}}$ are generating vectors of $\mathfrak{I}_{k}$ and $\mathfrak{J}_{k}$, and $u=u_{d_{k}}=A^{d_{k}-1}u_{1}$ and $v=v_{1}=(A^{T})^{d_{k}-1}v_{d_{k}}$ are the given right and left eigenvectors of $A$ for the eigenvalue $0$. Now, it follows from $u=\alpha_{1}s_{kd_{k}}$, $v=\beta_{1}t_{k1}$ and (14) that

    $\mathop{\rm Adj}(A)=(-1)^{n-1}\left(\prod_{j=1,j\neq k}^{r}a_{jd_{j}}\right)\frac{uv^{T}}{\alpha_{1}\beta_{1}}.$ (15)

    Since $T_{k}S_{k}=I_{d_{k}}$,

    $\begin{bmatrix}v_{1}^{T}\\ v_{2}^{T}\\ \vdots\\ v_{d_{k}}^{T}\end{bmatrix}\begin{bmatrix}u_{1}&u_{2}&\cdots&u_{d_{k}}\end{bmatrix}=Y^{T}T_{k}S_{k}X=Y^{T}X.$

    But $Y^{T}X$ is a lower triangular matrix whose diagonal elements are all equal to $\alpha_{1}\beta_{1}$. Thus, for $j=1,\ldots,d_{k}$, $\alpha_{1}\beta_{1}=v_{j}^{T}u_{j}=v^{T}A^{d_{k}-1}u$. Since $\alpha_{1}\neq 0$ and $\beta_{1}\neq 0$, $v^{T}A^{d_{k}-1}u\neq 0$ as claimed. Now, from (15),

    $\mathop{\rm Adj}(A)=(-1)^{n-1}\left(\prod_{j=1,j\neq k}^{r}a_{jd_{j}}\right)\frac{uv^{T}}{v^{T}A^{d_{k}-1}u}.$ (16)

    Finally, $p_{A}(\lambda)=\prod_{j=1}^{r}p_{j}(\lambda)=\lambda^{d_{k}}\prod_{j=1,j\neq k}^{r}p_{j}(\lambda)$. Therefore, $p_{A}^{(d_{k})}(0)=d_{k}!\prod_{j=1,j\neq k}^{r}p_{j}(0)=d_{k}!\prod_{j=1,j\neq k}^{r}a_{jd_{j}}$ and (7) follows from (16). ∎
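A symbolic check of item (i) in the smallest nontrivial case (an added illustration, not part of the original proof): for the companion matrix of $\lambda^{2}+a_{1}\lambda+a_{2}$ with $a_{2}\neq 0$, the characteristic polynomial of the adjugate is exactly $q(\lambda)$ of (6).

```python
# Illustration: char. polynomial of Adj(C) equals q(lambda) of (6) when d_j = 2.
from sympy import Matrix, eye, symbols, expand, simplify

lam, a1, a2 = symbols('lambda a1 a2')
C = Matrix([[0, -a2], [1, -a1]])     # companion matrix of lambda**2 + a1*lambda + a2
Delta = C.det()                      # Delta_A = a2

q = expand((lam * eye(2) - C.adjugate()).det())
assert simplify(q - (lam**2 + Delta * (a1 / a2) * lam + Delta**2 / a2)) == 0
```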

As a first consequence of Theorem 2.5, we present a generalization of the formula for the eigenvalues of the adjugate matrix given in [18].

Corollary 2.6.

Let $A\in\mathbb{F}^{n\times n}$ be a nonsingular matrix. Let $\lambda_{0}\in\Lambda(A)$ and let $m_{1}\geq\cdots\geq m_{s}$ be its partial multiplicities (i.e., the sizes of the Jordan blocks associated with $\lambda_{0}$ in any Jordan form of $A$ in, perhaps, an extension field $\widetilde{\mathbb{F}}$). Then $\frac{\Delta_{A}}{\lambda_{0}}$ is an eigenvalue of $\mathop{\rm Adj}(A)$ with $m_{1}\geq\cdots\geq m_{s}$ as partial multiplicities.

Proof.

The elementary divisors of $A$ for the eigenvalue $\lambda_{0}$ in $\widetilde{\mathbb{F}}[\lambda]$ are $(\lambda-\lambda_{0})^{m_{1}},\ldots,(\lambda-\lambda_{0})^{m_{s}}$. Then it follows from item (i) of Theorem 2.5 (see (6)) that $\left(\lambda-\frac{\Delta_{A}}{\lambda_{0}}\right)^{m_{1}},\ldots,\left(\lambda-\frac{\Delta_{A}}{\lambda_{0}}\right)^{m_{s}}$ are the corresponding elementary divisors of $\mathop{\rm Adj}(A)$. ∎
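For instance (an added illustration with an ad hoc matrix), a Jordan block survives in the adjugate with the same partial multiplicities:

```python
# Illustration: eigenvalues of Adj(A) are det(A)/lambda0, multiplicities preserved.
from sympy import Matrix, eye

A = Matrix([[2, 1, 0],
            [0, 2, 0],
            [0, 0, 3]])                   # Jordan block for 2; det A = 12
adjA = A.adjugate()
assert adjA.eigenvals() == {6: 2, 4: 1}   # 12/2 and 12/3
assert (adjA - 6 * eye(3)).rank() == 2    # a single 2x2 Jordan block for 6
```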

Corollary 2.7.

Let $A\in\mathbb{F}^{n\times n}$, let $\lambda_{0}\in\Lambda(A)\cap\mathbb{F}$ and let $m_{1}\geq\cdots\geq m_{s}$ be its partial multiplicities. Let $u,v\in\mathbb{F}^{n\times 1}$ be arbitrary right and left eigenvectors of $A$ for $\lambda_{0}$. Then

$\mathop{\rm Adj}(\lambda_{0}I_{n}-A)=\frac{(-\delta_{1s})^{m-1}}{m!}\;p^{(m)}_{A}(\lambda_{0})\;\frac{uv^{T}}{v^{T}(\lambda_{0}I_{n}-A)^{m-1}u},$ (17)

where $m$ is the algebraic multiplicity of $\lambda_{0}$ and $\delta_{ij}$ denotes the Kronecker delta.

Proof.

Put $B=\lambda_{0}I_{n}-A$. Then $0\in\Lambda(B)$, $u$ and $v$ are right and left eigenvectors of $B$ for the eigenvalue $0$, and $m_{1}\geq\cdots\geq m_{s}$ are the partial multiplicities of this eigenvalue. By Theorem 2.5, $\mathop{\rm Adj}(\lambda_{0}I_{n}-A)=\mathop{\rm Adj}(B)\neq 0$ if and only if $s=1$. In this case,

$\mathop{\rm Adj}(\lambda_{0}I_{n}-A)=\mathop{\rm Adj}(B)=\frac{(-1)^{n-1}}{m!}\,p_{B}^{(m)}(0)\;\frac{uv^{T}}{v^{T}B^{m-1}u}.$

Therefore (17) follows from the fact that $p^{(m)}_{B}(0)=(-1)^{n+m}p^{(m)}_{A}(\lambda_{0})$ (see the proof of Theorem 2.1). ∎

The following result is an immediate consequence of Corollary 2.7.

Corollary 2.8.

Let $A\in\mathbb{F}^{n\times n}$ and let $\Lambda(A)=\{\lambda_{1},\ldots,\lambda_{s}\}$ be its spectrum. Assume that $\Lambda(A)\subset\mathbb{F}$ and let $m_{j}$ and $g_{j}$ be the algebraic and geometric multiplicities of $A$ for the eigenvalue $\lambda_{j}$, $j=1,\ldots,s$. Fix $k\in\{1,\ldots,s\}$ and let $u_{k}$ and $v_{k}$ be right and left eigenvectors of $A$ for $\lambda_{k}$. Then

$\mathop{\rm Adj}(\lambda_{k}I-A)=(-\delta_{1g_{k}})^{m_{k}-1}\prod_{j=1,\,j\neq k}^{s}(\lambda_{k}-\lambda_{j})^{m_{j}}\,\frac{u_{k}v^{T}_{k}}{v^{T}_{k}(\lambda_{k}I-A)^{m_{k}-1}u_{k}}.$ (18)

The TM formula (1) can be used to provide an easy proof of the so-called eigenvector-eigenvalue identity (see [12, Sec. 2.1]). In fact, under the hypothesis of Theorem 2.1, it follows from (1) that $w^{T}v\,[\mathop{\rm Adj}(\lambda_{0}I_{n}-A)]_{jj}=p^{\prime}_{A}(\lambda_{0})v_{j}w_{j}$, $j=1,\ldots,n$ (see [12, Rem. 5]). Hence, denoting by $M_{jj}$ the submatrix of $A$ obtained by removing its $j$th row and column, so that $[\mathop{\rm Adj}(\lambda_{0}I_{n}-A)]_{jj}=\det(\lambda_{0}I_{n-1}-M_{jj})=p_{M_{jj}}(\lambda_{0})$, we get

$(w^{T}v)\;p_{M_{jj}}(\lambda_{0})=p^{\prime}_{A}(\lambda_{0})\;v_{j}w_{j},\quad j=1,\ldots,n.$ (19)

In particular, if $A\in\mathbb{C}^{n\times n}$ is Hermitian, $\lambda_{1}\geq\lambda_{2}\geq\cdots\geq\lambda_{n}$ are its eigenvalues and, for $i=1,\ldots,n$, $v_{i}=\begin{bmatrix}v_{i1}&v_{i2}&\cdots&v_{in}\end{bmatrix}^{T}$ is a unit right and left eigenvector of $A$ for $\lambda_{i}$ (that is, $Av_{i}=\lambda_{i}v_{i}$, $v_{i}^{\ast}A=\lambda_{i}v_{i}^{\ast}$ and $v_{i}^{\ast}v_{i}=1$; recall that we must change transpose to conjugate transpose in the complex case), then

$|v_{ij}|^{2}p^{\prime}_{A}(\lambda_{i})=p_{M_{jj}}(\lambda_{i}),\quad i,j=1,\ldots,n.$

Equivalently, if $\mu_{j1}\geq\mu_{j2}\geq\cdots\geq\mu_{j\,n-1}$ are the eigenvalues of $M_{jj}$,

$|v_{ij}|^{2}\prod_{k=1,k\neq i}^{n}(\lambda_{i}-\lambda_{k})=\prod_{k=1}^{n-1}(\lambda_{i}-\mu_{jk}),\quad i,j=1,\ldots,n.$ (20)

This is the classical eigenvector-eigenvalue identity (see [12, Thm. 1]).
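A numerical check of (20) (an added illustration; the symmetric matrix is randomly generated):

```python
# Illustration: eigenvector-eigenvalue identity (20) for a random symmetric matrix.
import numpy as np

rng = np.random.default_rng(0)
G = rng.standard_normal((4, 4))
A = (G + G.T) / 2                         # symmetric, simple eigenvalues a.s.
lam, V = np.linalg.eigh(A)                # columns of V are unit eigenvectors

i, j = 1, 2                               # arbitrary indices
M = np.delete(np.delete(A, j, axis=0), j, axis=1)   # M_jj
mu = np.linalg.eigvalsh(M)
lhs = V[j, i] ** 2 * np.prod([lam[i] - lam[k] for k in range(4) if k != i])
rhs = np.prod(lam[i] - mu)
assert abs(lhs - rhs) < 1e-10
```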

As mentioned in Remark 2.4, if $\mathbb{F}$ is a field of characteristic zero and $A\in\mathbb{F}^{n\times n}$, then (19) is meaningful if and only if $\lambda_{0}$ is a simple eigenvalue. If $\lambda_{0}$ is defective and its geometric multiplicity is bigger than $1$, then (19) becomes a trivial identity because, in this case, $\mathop{\rm Adj}(\lambda_{0}I_{n}-A)=0$ (item (ii) of Theorem 2.5) and so $p_{M_{jj}}(\lambda_{0})=\det(\lambda_{0}I_{n-1}-M_{jj})=0$. However, if $\lambda_{0}$ is defective and its geometric multiplicity is $1$, then (17) can be used to obtain a generalization of the eigenvector-eigenvalue identity. In fact, one readily gets from (17):

$p_{M_{jj}}(\lambda_{0})=\frac{(-\delta_{1g})^{m-1}}{m!}\,p_{A}^{(m)}(\lambda_{0})\,\frac{u_{j}v_{j}}{v^{T}(\lambda_{0}I_{n}-A)^{m-1}u},\quad j=1,\ldots,n,$ (21)

where $m$ and $g$ are the algebraic and geometric multiplicities of $\lambda_{0}$, respectively. Moreover, if both $p_{A}(\lambda)$ and $p_{M_{jj}}(\lambda)$ split in $\mathbb{F}$ then, with the notation of Corollary 2.8, the following identity follows from (18) for the non-repeated eigenvalues $\{\mu_{j1},\ldots,\mu_{jr_{j}}\}$ of $M_{jj}$ and for $i=1,\ldots,s$:

$\prod_{k=1}^{r_{j}}(\lambda_{i}-\mu_{jk})^{q_{jk}}=(-\delta_{1g_{i}})^{m_{i}-1}\,\frac{u_{ij}v_{ij}}{v_{i}^{T}(\lambda_{i}I_{n}-A)^{m_{i}-1}u_{i}}\prod_{k=1,k\neq i}^{s}(\lambda_{i}-\lambda_{k})^{m_{k}},\quad j=1,\ldots,n,$ (22)

where $u_{i}=\begin{bmatrix}u_{i1}&\cdots&u_{in}\end{bmatrix}^{T}$, $v_{i}=\begin{bmatrix}v_{i1}&\cdots&v_{in}\end{bmatrix}^{T}$, and $q_{jk}$ is the algebraic multiplicity of $\mu_{jk}$, $k=1,\ldots,r_{j}$, $j=1,\ldots,n$.

In the following section two additional applications will be presented.

3 Two additional consequences of the TM formula

The well-known formula (23) below gives the derivative of a simple eigenvalue of a matrix depending on a (real or complex) parameter. The investigation of the eigenvalue sensitivity of matrices depending on one or several parameters can be traced back to the work of Jacobi ([19]). However, a systematic study of the perturbation theory of the eigenvalue problem starts with the books of Rellich (1953), Wilkinson (1965) and Kato (1966), as well as the papers by Lancaster [20], Osborne and Michaelson [27], Fox and Kapoor [14], and Crossley and Porter [9] (see also [31] and the references therein). Since then this topic has become classical, as evidenced by an extensive literature including books and papers addressed to mathematicians and a broad spectrum of scientists and engineers. In addition to the above early references, a short, and by no means exhaustive, list of books could include [4, p. 463], [24, Ch. 8, Sec. 9], [10, Sec. 4.2] or [21, pp. 134-135].

In proving (23), one must of course first prove that the eigenvalues depend smoothly on the parameter. It is also common practice to prove, or assume (see [23], [13, Ch. 11, Th. 2] and the books cited above), the existence of eigenvectors which depend smoothly on the parameter. It is worth remarking that in the proof by Lancaster in [20] only the existence of eigenvectors depending continuously on the parameter is required. We propose a simple alternative proof of (23) in which no assumption is made on the right and left eigenvector functions.

Let $D_{\epsilon}(z_{0})$ be the open disc of radius $\epsilon>0$ with center $z_{0}$. For the following result, $\mathbb{F}$ will be either the field $\mathbb{R}$ of real numbers or the field $\mathbb{C}$ of complex numbers. Recall that $v\in\mathbb{C}^{n\times 1}$ is a left eigenvector of $A\in\mathbb{C}^{n\times n}$ for an eigenvalue $z_{0}$ if $v^{\ast}A=z_{0}v^{\ast}$, where $v^{\ast}=\bar{v}^{T}$ is the conjugate transpose of $v$. Hence, we will replace $^{T}$ by $^{\ast}$ to include complex vectors in our discussion.

Proposition 3.1.

Let $A(\omega)\in\mathbb{F}^{n\times n}$ be a square matrix-valued function whose entries are analytic at $\omega_{0}\in\mathbb{C}$, and let $z_{0}$ be a simple eigenvalue of $A(\omega_{0})$. Then there exist $\epsilon>0$, $\delta>0$ and a function $z:D_{\epsilon}(\omega_{0})\to D_{\delta}(z_{0})$ such that, for each $\omega\in D_{\epsilon}(\omega_{0})$, $z(\omega)$ is the unique eigenvalue of $A(\omega)$ in $D_{\delta}(z_{0})$. Moreover, $z$ is analytic on $D_{\epsilon}(\omega_{0})$ and

$z^{\prime}(\omega)=\frac{v(\omega)^{\ast}A^{\prime}(\omega)u(\omega)}{v(\omega)^{\ast}u(\omega)},$ (23)

where, for $\omega\in D_{\epsilon}(\omega_{0})$, $u(\omega)$ and $v(\omega)$ are arbitrary right and left eigenvectors, respectively, of $A(\omega)$ for $z(\omega)$.

Proof.

Since $z_{0}$ is a simple root of $p(z,\omega)=\det(zI-A(\omega))$, by the analytic implicit function theorem we have, in addition to the first part of the result, that

$z^{\prime}(\omega)=-\frac{\dfrac{\partial p}{\partial\omega}(z(\omega),\omega)}{\dfrac{\partial p}{\partial z}(z(\omega),\omega)}.$

By the Jacobi formula for the derivative of the determinant and the TM formula (1), we have (note that, since $z(\omega)$ is a simple eigenvalue, $v(\omega)^{\ast}u(\omega)\neq 0$ for any right and left eigenvectors $u(\omega)$ and $v(\omega)$)

$\begin{array}{rcl}\dfrac{\partial p}{\partial z}(z(\omega),\omega)&=&\mathop{\rm tr}\bigl(\mathop{\rm Adj}(z(\omega)I-A(\omega))\bigr),\\[2mm] \dfrac{\partial p}{\partial\omega}(z(\omega),\omega)&=&-\mathop{\rm tr}\bigl(\mathop{\rm Adj}(z(\omega)I-A(\omega))\,A^{\prime}(\omega)\bigr)=-\dfrac{\partial p}{\partial z}(z(\omega),\omega)\,\dfrac{v(\omega)^{\ast}A^{\prime}(\omega)u(\omega)}{v(\omega)^{\ast}u(\omega)},\end{array}$

and the result follows. ∎
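Formula (23) can be compared with a finite difference (an added illustration; the pencil $A(\omega)=A_{0}+\omega A_{1}$ and the helper `eig_near` are ad hoc choices):

```python
# Illustration: (23) versus a central finite difference for a simple eigenvalue.
import numpy as np

A0 = np.diag([1.0, 2.0, 4.0])
A1 = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
A = lambda w: A0 + w * A1                 # A'(w) = A1

def eig_near(M, target):
    vals, vecs = np.linalg.eig(M)
    k = np.argmin(abs(vals - target))
    return vals[k], vecs[:, k]

w0, h = 0.3, 1e-6
z, u = eig_near(A(w0), 2.0)                      # right eigenvector
_, v = eig_near(A(w0).conj().T, np.conj(z))      # left eigenvector
exact = (v.conj() @ A1 @ u) / (v.conj() @ u)
fd = (eig_near(A(w0 + h), z)[0] - eig_near(A(w0 - h), z)[0]) / (2 * h)
assert abs(exact - fd) < 1e-5
```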

Remark 3.2.
  • (a)

    The same conclusion can be drawn in Proposition 3.1 if $A$ is a complex or real matrix-valued differentiable function of a real variable. In the first case we would need a non-standard version of the implicit function theorem, like the one in [3, Theorem 2.4]. In the second case the standard implicit function theorem is enough.

  • (b)

    It is shown in [2] that the existence of eigenvectors smoothly depending on the parameter can be easily obtained from the properties of the adjugate matrix. In fact, since $z(\omega)$ is a simple eigenvalue of $A(\omega)$ for each $\omega\in D_{\epsilon}(\omega_{0})$, $\mathop{\rm rank}(z(\omega)I_{n}-A(\omega))=n-1$ and so, by the TM formula, $\mathop{\rm rank}\mathop{\rm Adj}(z(\omega)I_{n}-A(\omega))=1$ (see Remark 2.4). Now, $\mathop{\rm Adj}(z(\omega)I_{n}-A(\omega))$ is a differentiable matrix function of $\omega\in D_{\epsilon}(\omega_{0})$ and $(z(\omega)I_{n}-A(\omega))\mathop{\rm Adj}(z(\omega)I_{n}-A(\omega))=\mathop{\rm Adj}(z(\omega)I_{n}-A(\omega))(z(\omega)I_{n}-A(\omega))=\det(z(\omega)I_{n}-A(\omega))I_{n}=0$. Hence the nonzero columns of $\mathop{\rm Adj}(z(\omega)I_{n}-A(\omega))$, which are all proportional, are right eigenvectors, and the nonzero rows are left eigenvectors, of $A(\omega)$ for $z(\omega)$. $\Box$
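The observation in item (b) is also easy to test (an added illustration reusing the ad hoc matrix from the check after Theorem 2.1):

```python
# Illustration: nonzero columns of Adj(z*I - A) are right eigenvectors for z.
from sympy import Matrix, diag, eye

P = Matrix([[1, 1, 0], [0, 1, 1], [1, 0, 1]])
A = P * diag(1, 2, 4) * P.inv()
z = 2                                      # simple eigenvalue of A
adj = (z * eye(3) - A).adjugate()
for j in range(3):
    col = adj[:, j]
    if not col.is_zero_matrix:
        assert (A * col - z * col).is_zero_matrix
```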

The second application is related to the problem of characterizing the admissible eigenstructures and, more generally, the similarity orbits of rank-one updated matrices. There is a vast literature on this problem; a non-exhaustive list of publications is [32, 29, 34, 26, 6, 25, 8, 5] and the references therein. It is a consequence of Theorem 2 in [32] that if $\lambda_{0}$ is an eigenvalue of $A\in\mathbb{F}^{n\times n}$ with geometric multiplicity $1$ and $\mathop{\rm rank}(B-A)=1$, then $\lambda_{0}$ may or may not be an eigenvalue of $B\in\mathbb{F}^{n\times n}$. It is then proved in [25, Th. 2.3] that in the complex case, generically, $\lambda_{0}$ is not an eigenvalue of $B$. That is to say, there is a Zariski open set $\Omega\subset\mathbb{C}^{n}\times\mathbb{C}^{n}$ such that for all $(x,y)\in\Omega$, $\lambda_{0}$ is not an eigenvalue of $A+xy^{T}$. With the help of the TM formula we can be a little more precise about the set $\Omega$. From now on, $\mathbb{F}$ will again be an arbitrary field.

Proposition 3.3.

Let $A\in\mathbb{F}^{n\times n}$ and let $\lambda_{0}$ be an eigenvalue of $A$ in, perhaps, an extension field $\widetilde{\mathbb{F}}$. Assume that the geometric multiplicity of $\lambda_{0}$ is $1$ and its algebraic multiplicity is $m$. Let $u_{0},v_{0}\in\mathbb{F}^{n\times 1}$ be right and left eigenvectors of $A$ for $\lambda_{0}$. If $x,y\in\mathbb{F}^{n\times 1}$, then $\lambda_{0}$ is an eigenvalue of $A+xy^{T}$ if and only if $y^{T}u_{0}=0$ or $v_{0}^{T}x=0$.

Proof.

Let $B=A+xy^{T}$. Then $\lambda I_{n}-B=(\lambda I_{n}-A)-xy^{T}$. Taking into account that $\lambda I_{n}-A$ is invertible in $\mathbb{F}(\lambda)^{n\times n}$, where $\mathbb{F}(\lambda)$ is the field of rational functions over $\mathbb{F}$, and using the formula for the determinant of a rank-one update, we get

$p_{B}(\lambda)=p_{A}(\lambda)-p_{A}(\lambda)\,y^{T}(\lambda I_{n}-A)^{-1}x=p_{A}(\lambda)-y^{T}\mathop{\rm Adj}(\lambda I_{n}-A)x.$

In particular,

$p_{B}(\lambda_{0})=p_{A}(\lambda_{0})-y^{T}\mathop{\rm Adj}(\lambda_{0}I_{n}-A)x=-y^{T}\mathop{\rm Adj}(\lambda_{0}I_{n}-A)x.$ (24)

It follows from (17) that (recall that $v_{0}^{T}(\lambda_{0}I_{n}-A)^{m-1}u_{0}\neq 0$)

$p_{B}(\lambda_{0})=\frac{(-1)^{m}}{m!}\,p_{A}^{(m)}(\lambda_{0})\,\frac{(y^{T}u_{0})(v_{0}^{T}x)}{v_{0}^{T}(\lambda_{0}I_{n}-A)^{m-1}u_{0}}.$

Since $p_{A}^{(m)}(\lambda_{0})\neq 0$, the proposition follows. ∎
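For the matrix of Remark 2.4 the proposition can be checked directly (an added illustration; $x$ and $y$ are symbolic):

```python
# Illustration: Proposition 3.3 for A = [[0,0],[1,0]], u0 = e2, v0 = e1.
from sympy import Matrix, symbols

x1, x2, y1, y2 = symbols('x1 x2 y1 y2')
A = Matrix([[0, 0], [1, 0]])
x = Matrix([x1, x2])
y = Matrix([y1, y2])
B = A + x * y.T

# 0 is an eigenvalue of B iff det(B) = 0 iff (y^T u0)(v0^T x) = x1*y2 = 0.
assert B.det().expand() == -x1 * y2
```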

Remark 3.4.

Note that, by (24) and item (ii) of Theorem 2.5, if the geometric multiplicity of $\lambda_{0}$ as an eigenvalue of $A$ is at least $2$, then $\mathop{\rm Adj}(\lambda_{0}I_{n}-A)=0$ and so $\lambda_{0}$ is necessarily an eigenvalue of $A+xy^{T}$. This is an easy consequence of the interlacing inequalities of [32, Th. 2]. However, proving that those interlacing inequalities are necessary conditions that the invariant polynomials of $A$ and $A+xy^{T}$ must satisfy is by no means a trivial matter. $\Box$

The eigenvalues of rank-one updated matrices are at the core of the divide-and-conquer algorithm for computing the eigenvalues of real symmetric or complex Hermitian matrices (see, for example, [11, Sec. 5.3.3], [30, Sec. 2.1]). At each step of the algorithm a diagonal matrix $D=D_{1}\oplus D_{2}$ and a vector $u\in\mathbb{C}^{n\times 1}$ are given, and the eigenvalues and eigenvectors of $D+uu^{\ast}$ are to be computed. For the algorithm to run smoothly it is required, among other things, that the diagonal elements of $D$ be all distinct. Thus a so-called deflation process must be carried out, which amounts to checking at each step for the presence of repeated eigenvalues and, if there are any, removing and saving them. The result that follows is related to the problem of detecting repeated eigenvalues, but for much more general matrices over arbitrary fields.

Proposition 3.5.

Let $A=A_{1}\oplus A_{2}$ with $A_{i}\in\mathbb{F}^{n_{i}\times n_{i}}$, $i=1,2$. Let $x,y\in\mathbb{F}^{n\times 1}$ and split $B=A+xy^{T}=\begin{bmatrix}B_{ij}\end{bmatrix}_{i,j=1,2}$ into $2\times 2$ blocks such that $B_{ii}\in\mathbb{F}^{n_{i}\times n_{i}}$, $i=1,2$. Assume also that the eigenvalues of $A_{1}$ and $A_{2}$ have geometric multiplicity equal to $1$ and that $\Lambda(A_{1})\cap\Lambda(B_{11})=\Lambda(A_{2})\cap\Lambda(B_{22})=\emptyset$. Then

$\Lambda(A_{1})\cap\Lambda(A_{2})=\Lambda(B)\cap\Lambda(A_{1})=\Lambda(B)\cap\Lambda(A_{2}).$
Proof.

If $\lambda_{0}\in\Lambda(A_{1})\cap\Lambda(A_{2})$ then $\lambda_{0}$, as an eigenvalue of $A$, has geometric multiplicity $2$. By Remark 3.4, $\lambda_{0}\in\Lambda(B)\cap\Lambda(A_{1})\cap\Lambda(A_{2})$. Assume now that $\lambda_{0}\in\Lambda(B)\cap\Lambda(A_{1})$ but $\lambda_{0}\not\in\Lambda(A_{2})$. Let us see that this assumption leads to a contradiction. Let $u_{0},v_{0}\in\mathbb{F}^{n_{1}\times 1}$ be a right and a left eigenvector of $A_{1}$, respectively, for $\lambda_{0}$. Then $w_{0}=\begin{bmatrix}u_{0}^{T}&0\end{bmatrix}^{T}\in\mathbb{F}^{n\times 1}$ and $z_{0}=\begin{bmatrix}v_{0}^{T}&0\end{bmatrix}^{T}\in\mathbb{F}^{n\times 1}$ are right and left eigenvectors of $A$, respectively, for $\lambda_{0}$. Since $\lambda_{0}\not\in\Lambda(A_{2})$, the geometric multiplicity of $\lambda_{0}$ as an eigenvalue of $A$ is $1$. Then, by Proposition 3.3, $y^{T}w_{0}=0$ or $z_{0}^{T}x=0$, because $\lambda_{0}\in\Lambda(B)$. Let us assume that $y^{T}w_{0}=0$; otherwise we would proceed similarly with $z_{0}^{T}x=0$. If we put $y=\begin{bmatrix}y_{1}^{T}&y_{2}^{T}\end{bmatrix}^{T}$ and $x=\begin{bmatrix}x_{1}^{T}&x_{2}^{T}\end{bmatrix}^{T}$, with $x_{1},y_{1}\in\mathbb{F}^{n_{1}\times 1}$, then $y_{1}^{T}u_{0}=0$ and $B_{11}=A_{1}+x_{1}y_{1}^{T}$. It follows from Proposition 3.3 that $\lambda_{0}\in\Lambda(B_{11})$, contradicting the hypothesis $\Lambda(A_{1})\cap\Lambda(B_{11})=\emptyset$. That $\Lambda(B)\cap\Lambda(A_{2})\subset\Lambda(A_{1})\cap\Lambda(A_{2})$ is proved similarly. ∎
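An added illustration of the deflation phenomenon (the diagonal matrix and the update vector are ad hoc): a common eigenvalue of $A_{1}$ and $A_{2}$ survives any rank-one update, in agreement with Remark 3.4.

```python
# Illustration: a shared eigenvalue (3) survives the rank-one update A + u*u^T.
from sympy import Matrix, diag, eye

A = diag(1, 3, 3, 7)                      # A1 = diag(1, 3), A2 = diag(3, 7)
u = Matrix([2, 1, -1, 5])
B = A + u * u.T
assert (3 * eye(4) - B).det() == 0        # 3 remains an eigenvalue of B
```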

Remark 3.6.
  • (i)

    Note that, with the notation of the proof of Proposition 3.5, $B_{11}=A_{1}+x_{1}y_{1}^{T}$ and $B_{22}=A_{2}+x_{2}y_{2}^{T}$. Then, according to Proposition 3.3, $\lambda_{0}\not\in\Lambda(B_{11})$ unless $(y_{1}^{T}u_{0})(v_{0}^{T}x_{1})=0$. Hence the hypothesis $\Lambda(A_{1})\cap\Lambda(B_{11})=\emptyset$ is a generic property, and so is $\Lambda(A_{2})\cap\Lambda(B_{22})=\emptyset$.

  • (ii)

    Consider Proposition 3.5 over $\mathbb{C}$. If $A$ and $B$ are both Hermitian or both unitary, then $\Lambda(B)\setminus\bigl(\Lambda(A_{1})\cap\Lambda(A_{2})\bigr)$ and $\Lambda(A_{1})\cup\bigl(\Lambda(A_{2})\setminus(\Lambda(A_{1})\cap\Lambda(A_{2}))\bigr)$ strictly interlace on the real line or on the unit circle, respectively (see, for example, [30, Th. 2.1, Sec. 2]). $\Box$



References

  • [1] A. Amparan, S. Marcaida, and I. Zaballa. On the structure invariants of proper rational matrices with prescribed finite poles. Linear and Multilinear Algebra, 61(11):1464–1486, 2013.
  • [2] A. L. Andrew, K.-W. E. Chu, and P. Lancaster. Derivatives of eigenvalues and eigenvectors of matrix functions. SIAM J. Matrix Anal. Appl., 14(4):903–926, 1993.
  • [3] M. S. Ashbaugh and E. M. Harrell II. Perturbation theory for shape resonances and large barrier potentials. Comm. Math. Phys., 83(2):151–170, 1982.
  • [4] F. V. Atkinson. Discrete and continuous boundary problems, volume 8 of Mathematics in Science and Engineering. Academic Press, New York-London, 1964.
  • [5] I. Baragaña. The number of distinct eigenvalues of a regular pencil and of a square matrix after rank perturbation. Linear Algebra Appl., 588:101–121, 2020.
  • [6] M. A. Beitia, I. de Hoyos, and I. Zaballa. The change of the Jordan structure under one row perturbations. Linear Algebra Appl., 401:119 – 134, 2005.
  • [7] W. C. Brown. Matrices over Commutative Rings. Marcel Dekker Inc., New York, 1993.
  • [8] R. Bru, R. Cantó, and A. M. Urbano. Eigenstructure of rank one updated matrices. Linear Algebra Appl., 485:372–391, 2015.
  • [9] T. R. Crossley and B. Porter. Eigenvalue and eigenvector sensitivities in linear system theory. Int. J. Control, 10:163–170, 1969.
  • [10] D. Hinrichsen and A. J. Pritchard. Mathematical System Theory I. Modelling, State Space Analysis, Stability and Robustness. Springer, Berlin, 2005.
  • [11] J. W. Demmel. Applied Numerical Linear Algebra. SIAM, Philadelphia, 1997.
  • [12] P. B. Denton, S. J. Parke, T. Tao, and X. Zhang. Eigenvectors from eigenvalues: a survey of a basic identity in linear algebra. arXiv:1908.03795, 2020.
  • [13] L. C. Evans. Partial differential equations, volume 19 of Graduate Studies in Mathematics. American Mathematical Society, Providence, RI, second edition, 2010.
  • [14] R. L. Fox and M. P. Kapoor. Rate of change of eigenvalues and eigenvectors. AIAA J., 6:2426–2429, 1968.
  • [15] F. R. Gantmacher. The Theory of Matrices. AMS Chelsea Publishing, Providence, Rhode Island, 1988.
  • [16] R. Godement. Cours d’algèbre. Hermann Éditeurs, Paris, 2005.
  • [17] D. Grinberg. Eigenvectors from eigenvalues: a survey of a basic identity in linear algebra — what’s new. https://terrytao.wordpress.com/2019/12/03/eigenvectors-from-eigenvalues-a-survey-of-a-basic-identity-in-linear-algebra/#comment-531597, 2019.
  • [18] R. D. Hill and E. E. Underwood. On the matrix adjoint (adjugate). SIAM J. Algebraic Discrete Methods, 6(4):731–737, 1985.
  • [19] C. G. J. Jacobi. Über ein leichtes verfahren die in der theorie der säcularstörungen vorkommenden gleichungen numerisch aufzulösen. J. für die Reine und Angew. Math., 1846(30):51–94, 1846.
  • [20] P. Lancaster. On eigenvalues of matrices dependent on a parameter. Numer. Math., 6:377–387, 1964.
  • [21] P. D. Lax. Linear Algebra and its Applications. Pure and Applied Mathematics (Hoboken). Wiley-Interscience [John Wiley & Sons], Hoboken, NJ, second edition, 2007.
  • [22] D. S. Mackey, N. Mackey, C. Mehl, and V. Mehrmann. Vector spaces of linearizations for matrix polynomials. SIAM J. Matrix Anal. Appl., 28(4):971–1004, 2006.
  • [23] J. R. Magnus. On differentiating eigenvalues and eigenvectors. Econometric Theory, 1:179–191, 1985.
  • [24] J. R. Magnus and H. Neudecker. Matrix Differential Calculus with Applications in Statistics and Econometrics. John Wiley & Sons, Chichester, 1988.
  • [25] C. Mehl, V. Mehrmann, A. C. M. Ran, and L. Rodman. Eigenvalue perturbation theory of classes of structured matrices under generic structured rank one perturbations. Linear Algebra Appl., 435(3):687–716, 2011.
  • [26] J. Moro and F. M. Dopico. Low rank perturbation of Jordan structure. SIAM J. Matrix Anal. Appl., 25(2):495–506, 2003.
  • [27] M. R. Osborne and S. Michaelson. The numerical solution of eigenvalue problems in which the eigenvalue appears nonlinearly, with an application to differential equations. Computer J., 7:66–71, 1964.
  • [28] D. S. Scott. How to make the Lanczos algorithm converge slowly. Math. Comp., 33:239–247, 1979.
  • [29] F. C. Silva. The rank of the difference of matrices with prescribed similarity classes. Linear and Multilinear Algebra, 24(1):51–58, 1988.
  • [30] G. W. Stewart. Matrix Algorithms, Volume II: Eigensystems. SIAM, Philadelphia, 2001.
  • [31] J. G. Su. Multiple eigenvalue sensitivity analysis. Linear Algebra Appl., 137(4):183–211, 1990.
  • [32] R. C. Thompson. Invariant factors under rank one perturbations. Canad. J. Math., 32(1):240–245, 1980.
  • [33] R. C. Thompson and P. McEnteggert. Principal submatrices. II: The upper and lower quadratic inequalities. Linear Algebra Appl., 1:211–243, 1968.
  • [34] I. Zaballa. Pole assignment and additive perturbations of fixed rank. SIAM J. Matrix Anal. Appl., 12(1):16–23, 1991.