This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

A QR Algorithm for Symmetric Tensors

Kim Batselier    Ngai Wong Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong
Abstract

We extend the celebrated QR algorithm for matrices to symmetric tensors. The algorithm, named QR algorithm for symmetric tensors (QRST), exhibits similar properties to its matrix version, and allows the derivation of a shifted implementation with faster convergence. We further show that multiple tensor eigenpairs can be found from a local permutation heuristic which is effectively a tensor similarity transform, resulting in the permuted version of QRST called PQRST. Examples demonstrate the remarkable effectiveness of the proposed schemes for finding stable and unstable eigenpairs not found by previous tensor power methods.

keywords:
Symmetric tensor, QR algorithm, shift, eigenpair
AMS:
15A69,15A18

1 Introduction

The study of tensors has become intensive in recent years, e.g., [9, 13, 3, 8, 10, 11]. Symmetric tensors arise naturally in various engineering problems. They are especially important in the problem of blind identification of under-determined mixtures [1, 5, 2, 15]. Applications are found in areas such as speech, mobile communications, biomedical engineering and chemometrics. Although in principle, the tensor eigenproblem can be fully solved by casting it into a system of multivariate polynomial equations and solving them, this approach is only feasible for small problems and the scalability hurdle quickly defies its practicality. For larger problems, numerical iterative schemes are necessary and the higher-order power methods (HOPMs) presented in [3, 8, 10, 11] are pioneering attempts in this pursuit. Nonetheless, HOPM and its variants, e.g., SS-HOPM [10] and GEAP [11], suffer from the need of setting initial conditions and that only one stable eigenpair is computed at a time.

In the context of matrix eigenpair computation, the QR algorithm is much more powerful and widely adopted, yet its counterpart in the tensor scenario seems to be lacking. Interestingly, the existence of a tensor QR algorithm was questioned at least a decade ago [7], which we try to answer here. Specifically, the central idea of the symmetric QR algorithm is to progressively align the original matrix (tensor) onto its eigenvector basis yet preserving the particular matrix (tensor) structure, viz. symmetry, by similarity transform. In that regard, the tensor nature does not preclude itself from having a tensor QR algorithm counterpart. The main contribution of this paper is an algorithm, called the QR algorithm for Symmetric Tensors (QRST), that computes the real eigenpair(s) of a real ddth-order nn-dimensional symmetric tensor 𝒜n××n\mathcal{A}\in\mathbb{R}^{n\times\cdots\times n}. We show that QRST also features a shifted version to accelerate convergence. Moreover, a permutation heuristic is introduced, captured in the algorithm named permuted QRST (PQRST), to scramble the tensor entries to produce possibly more distinct eigenpairs. We emphasize that the current focus is essentially a demonstration of the feasibility of QR algorithm for tensors, rather than an effort to develop efficient QRST implementations which will definitely involve further numerical work and smart shift design. We believe the latter will come about when QRST has become recognized.

This paper is organized as follows. Section 2 reviews some tensor notions and operations, and defines symmetric tensors and real eigenpairs. Section 3 proposes QRST and PQRST, together with a shift strategy and convergence analysis. Section 4 demonstrates the efficacy of (P)QRST through numerical examples, and Section 5 draws the conclusion.

2 Tensor Notations and Conventions

Here we present some tensor basics and notational conventions adopted throughout this work. The reader is referred to [9] for a comprehensive overview. A general ddth-order or dd-way tensor, assumed real throughout this paper, is a multidimensional array 𝒜n1×n2××nd\mathcal{A}\in\mathbb{R}^{n_{1}\times n_{2}\times\cdots\times n_{d}} that can be perceived as an extension of the matrix format to its general ddth-order counterpart [9]. We use calligraphic font (e.g., 𝒜\mathcal{A}) to denote a general (d>2d>2) tensor and normal font (e.g., AA) for the specific case of a matrix (d=2d=2). The inner product between two tensors 𝒜,n1×n2××nd\mathcal{A},\mathcal{B}\in\mathbb{R}^{n_{1}\times n_{2}\times\cdots\times n_{d}} is defined as

𝒜,\displaystyle\left<\mathcal{A},\mathcal{B}\right> =i1,i2,,id𝒜i1i2idi1i2id\displaystyle=\sum_{i_{1},i_{2},\cdots,i_{d}}\mathcal{A}_{i_{1}i_{2}\cdots i_{d}}\mathcal{B}_{i_{1}i_{2}\cdots i_{d}}
(1) =vec(𝒜)Tvec(),\displaystyle=\operatorname{vec}(\mathcal{A})^{T}\operatorname{vec}(\mathcal{B}),

where vec()\operatorname{vec}(\circ) is the vectorization operator that stretches the tensor entries into a tall column vector. Specifically, for the index tuple [i1i2id][i_{1}i_{2}\cdots i_{d}], the convention is to treat i1i_{1} as the fastest changing index while idi_{d} the slowest, so vec(𝒜)\operatorname{vec}(\mathcal{A}) will arrange from top to bottom, namely, 𝒜111\mathcal{A}_{11\cdots 1}, 𝒜211\mathcal{A}_{21\cdots 1}, \cdots, 𝒜121\mathcal{A}_{12\cdots 1}, \cdots, 𝒜n1n2nd\mathcal{A}_{n_{1}n_{2}\cdots n_{d}}. The norm of a tensor is often taken to be the Frobenius norm ||𝒜||=||𝒜||F=<𝒜,𝒜>1/2||\mathcal{A}||=||\mathcal{A}||_{F}=<\mathcal{A},\mathcal{A}>^{1/2}.

The kk-mode product of a tensor 𝒜n1××nk××nd\mathcal{A}\in\mathbb{R}^{n_{1}\times\cdots\times n_{k}\times\cdots\times n_{d}} with a matrix Upk×nkU\in\mathbb{R}^{p_{k}\times n_{k}} is defined by [9]

(2) (𝒜U×k)i1ik1jkik+1id=ik=1nkUjkik𝒜i1ikid,\displaystyle\left(\mathcal{A}{{}_{\times_{k}}}U\right)_{i_{1}\cdots i_{k-1}j_{k}i_{k+1}\cdots i_{d}}=\sum_{i_{k}=1}^{n_{k}}U_{j_{k}i_{k}}\mathcal{A}_{i_{1}\cdots i_{k}\cdots i_{d}},

and 𝒜U×kn1××nk1×pk×nk+1××nd\mathcal{A}{{}_{\times_{k}}}U\in\mathbb{R}^{n_{1}\times\cdots\times n_{k-1}\times p_{k}\times n_{k+1}\times\cdots\times n_{d}}. For distinct modes in a series of tensor-matrix multiplication, the ordering of the multiplication is immaterial, namely,

(3) 𝒜B×pC×q=𝒜C×qB×p (pq),\displaystyle\mathcal{A}{{}_{\times_{p}}}B{{}_{\times_{q}}}C=\mathcal{A}{{}_{\times_{q}}}C{{}_{\times_{p}}}B\mbox{~~}(p\neq q),

whereas it matters for the same mode,

(4) 𝒜B×pC×p=𝒜(CB)×p.\displaystyle\mathcal{A}{{}_{\times_{p}}}B{{}_{\times_{p}}}C=\mathcal{A}{{}_{\times_{p}}}(CB).

This is in fact quite easy to recognize by referring to Figure 1 where 𝒜B×1C×1D×2E×3=𝒜(CB)×1D×2E×3\mathcal{A}{{}_{\times_{1}}}B{{}_{\times_{1}}}C{{}_{\times_{2}}}D{{}_{\times_{3}}}E=\mathcal{A}{{}_{\times_{1}}}(CB){{}_{\times_{2}}}D{{}_{\times_{3}}}E. As an example, for matrices with compatible dimensions, it can be readily shown that AB×1C×2=BACTA{{}_{\times_{1}}}B{{}_{\times_{2}}}C=BAC^{T}. Apparently, 𝒜Ink×k=𝒜\mathcal{A}{{}_{\times_{k}}}I_{n_{k}}=\mathcal{A} for k=1,,dk=1,\cdots,d whereby ImI_{m} denotes the m×mm\times m identity matrix.

Refer to caption
Fig. 1: Conceptual illustration of 𝒜B×1C×1D×2E×3\mathcal{A}{{}_{\times_{1}}}B{{}_{\times_{1}}}C{{}_{\times_{2}}}D{{}_{\times_{3}}}E.

2.1 Symmetric Tensors and Real Eigenpairs

In this paper we focus on symmetric tensors. In particular, a ddth-order symmetric tensor has all its dimensions being equal, namely, n1==nd=nn_{1}=\cdots=n_{d}=n, and 𝒜i1id=𝒜π(i1id)\mathcal{A}_{i_{1}\ldots i_{d}}=\mathcal{A}_{\pi(i_{1}\ldots i_{d})}, where π(i1id)\pi(i_{1}\ldots i_{d}) is any permutation of the indices i1idi_{1}\ldots i_{d}. To facilitate notation we further introduce some shorthand specific to symmetric tensors. First, we use 𝕊[d,n]\mathbb{S}^{[d,n]} to denote the set of all real symmetric ddth-order nn-dimensional tensors. When a row vector xT1×nx^{T}\in\mathbb{R}^{1\times n} is to be multiplied onto kk modes of 𝒜𝕊[d,n]\mathcal{A}\in\mathbb{S}^{[d,n]} (due to symmetry it can be assumed without loss of generality that all but the first dkd-k modes are multiplied with xTx^{T}), we use the shorthand

(5a) 𝒜x\displaystyle\mathcal{A}x 𝒜xT×d𝕊[d1,n],\displaystyle\triangleq\mathcal{A}{{}_{\times_{d}}}{x^{T}}\in\mathbb{S}^{[d-1,n]},
(5b) 𝒜x2\displaystyle\mathcal{A}x^{2} 𝒜xT×d1xT×d𝕊[d2,n],\displaystyle\triangleq\mathcal{A}{{}_{\times_{d-1}}}{x^{T}}{{}_{\times_{d}}}{x^{T}}\in\mathbb{S}^{[d-2,n]},\cdots
(5c) 𝒜xd2\displaystyle\mathcal{A}x^{d-2} 𝒜xT×3xT×4xT×d𝕊[2,n]n×n,\displaystyle\triangleq\mathcal{A}{{}_{\times_{3}}}{x^{T}}{{}_{\times_{4}}}{x^{T}}\cdots{{}_{\times_{d}}}x^{T}\in\mathbb{S}^{[2,n]}\in\mathbb{R}^{n\times n},
(5d) 𝒜xd1\displaystyle\mathcal{A}x^{d-1} 𝒜xT×2xT×3xT×d𝕊[1,n]=n,\displaystyle\triangleq\mathcal{A}{{}_{\times_{2}}}{x^{T}}{{}_{\times_{3}}}{x^{T}}\cdots{{}_{\times_{d}}}x^{T}\in\mathbb{S}^{[1,n]}=\mathbb{R}^{n},
(5e) 𝒜xd\displaystyle\mathcal{A}x^{d} 𝒜xT×1xT×2xT×d𝕊[0,n]=.\displaystyle\triangleq\mathcal{A}{{}_{\times_{1}}}{x^{T}}{{}_{\times_{2}}}{x^{T}}\cdots{{}_{\times_{d}}}x^{T}\in\mathbb{S}^{[0,n]}=\mathbb{R}.

This is a contraction process because the resulting tensors are progressively shrunk wherein we assume any higher order singleton dimensions (ni=1n_{i}=1) are “squeezed”, e.g., 𝒜xd1\mathcal{A}x^{d-1} is treated as an n×nn\times n matrix rather than its equivalent n×n×1×1n\times n\times 1\cdots\times 1 tensor. Besides contraction by a row vector, the notation generalizes to any matrix of appropriate dimensions. Suppose we have a matrix Vn×p(p>1)V\in\mathbb{R}^{n\times p}(p>1), then

(6) 𝒜Vd1𝒜VT×2VT×3VT×dn×p×p.\displaystyle\mathcal{A}V^{d-1}\triangleq\mathcal{A}{{}_{\times 2}}{V^{T}}{{}_{\times 3}}{V^{T}}\cdots{{}_{\times d}}{V^{T}}\in\mathbb{R}^{n\times p\cdots\times p}.

Now, the real eigenpair of a symmetric tensor 𝒜𝕊[d,n]\mathcal{A}\in\mathbb{S}^{[d,n]}, namely, (λ,x)×n(\lambda,x)\in\mathbb{R}\times\mathbb{R}^{n}, is defined as

(7) 𝒜xd1=λx, xTx=1.\displaystyle\mathcal{A}x^{d-1}=\lambda x,\mbox{~~}x^{T}x=1.

This definition is called the (real) Z-eigenpair [16] or l2l^{2}-eigenpair [14] which is of interest in this paper, apart from other differently defined eigenpairs, e.g., [11]. It follows that

(8) xT𝒜xd1=𝒜xd=λ.\displaystyle x^{T}\mathcal{A}x^{d-1}=\mathcal{A}x^{d}=\lambda.

For 𝒜𝕊[d,n]\mathcal{A}\in\mathbb{S}^{[d,n]}, depending on whether dd is odd or even, we have the following result regarding eigenpairs that occur in pairs [10]:

Lemma 1.

For an even-order symmetric tensor, if (λ,x)(\lambda,x) is an eigenpair, then so is (λ,x)(\lambda,-x). For an odd-order symmetric tensor, if (λ,x)(\lambda,x) is an eigenpair, then so is (λ,x)(-\lambda,-x).

Proof.

For an even dd, we have 𝒜(x)d1=𝒜xd1=λx=λ(x)\mathcal{A}(-x)^{d-1}=-\mathcal{A}x^{d-1}=-\lambda x=\lambda(-x). For an odd dd, we have 𝒜(x)d1=𝒜xd1=λx=λ(x)\mathcal{A}(-x)^{d-1}=\mathcal{A}x^{d-1}=\lambda x=-\lambda(-x).∎

Subsequently, we do not call two eigenpairs distinct if they follow the relationship in Lemma 1. For even-order symmetric tensors, there exists a symmetric identity tensor counterpart 𝕊[d,n]\mathcal{E}\in\mathbb{S}^{[d,n]} such that [16, 10]

(9) xd1=x for all xTx=1.\displaystyle\mathcal{E}x^{d-1}=x~\mbox{~for all~}~x^{T}x=1.

The constraint that x=1||x||=1 is necessary so it holds for all even d2d\geq 2. Obviously, (1,x)(1,x) is an eigenpair for \mathcal{E}. It is also obvious that such an identity tensor does not exist for an odd dd since contracting with x-x does not lead to itself. Consequently, for an even dd, we have similar shifting of eigenvalues when we add a multiple of \mathcal{E} to 𝒜\mathcal{A}, i.e., if (λ,x)(\lambda,x) is an eigenpair of 𝒜\mathcal{A}, then (λ+α,x)(\lambda+\alpha,x) is an eigenpair of 𝒜+α\mathcal{A}+\alpha\mathcal{E}.

2.2 Similar Tensors

In this paper, we say that two symmetric tensors 𝒜,𝕊[d,n]\mathcal{A},\mathcal{B}\in\mathbb{S}^{[d,n]} are similar, denoted by 𝒜\mathcal{A}\sim\mathcal{B}, if for some non-singular Pn×nP\in\mathbb{R}^{n\times n}

(10) =𝒜Pd,\displaystyle\mathcal{B}=\mathcal{A}P^{d},

so that the inverse transform of \mathcal{B} by P1P^{-1} will return 𝒜\mathcal{A}, namely,

(11) Pd=(𝒜Pd)Pd=𝒜(PTPT)×1(PTPT)×2(PTPT)×d=𝒜.\displaystyle\mathcal{B}P^{-d}=\left(\mathcal{A}P^{d}\right)P^{-d}=\mathcal{A}{{}_{\times 1}}(P^{-T}P^{T}){{}_{\times 2}}(P^{-T}P^{T})\cdots{{}_{\times d}}(P^{-T}P^{T})=\mathcal{A}.

Suppose 𝒜~𝕊[d,n]\mathcal{\tilde{A}}\in\mathbb{S}^{[d,n]} is similar to 𝒜\mathcal{A} through a similarity transform by an orthogonal matrix Q¯\underline{Q}, i.e., Q¯TQ¯=In\underline{Q}^{T}\underline{Q}=I_{n} and 𝒜~=𝒜Q¯d\mathcal{\tilde{A}}=\mathcal{A}\underline{Q}^{d}. It follows that if (λ,e1)(\lambda,e_{1}) is an eigenpair of 𝒜~\mathcal{\tilde{A}}, then (λ,q¯1)(\lambda,\underline{q}_{1}) is an eigenpair of 𝒜\mathcal{A} where q¯1\underline{q}_{1} denotes the first column of Q¯\underline{Q}. This is seen by

𝒜~e1d1=λe1\displaystyle\mathcal{\tilde{A}}e_{1}^{d-1}=\lambda e_{1}\Leftrightarrow 𝒜Q¯T×1(e1TQ¯T)×2(e1TQ¯T)×d=λe1\displaystyle\mathcal{A}{{}_{\times 1}}\underline{Q}^{T}{{}_{\times 2}}(e_{1}^{T}\underline{Q}^{T})\cdots{{}_{\times d}}(e_{1}^{T}\underline{Q}^{T})=\lambda e_{1}
\displaystyle\Leftrightarrow 𝒜Q¯T×1(q¯1)T×2(q¯1)T×d=λe1\displaystyle\mathcal{A}{{}_{\times 1}}\underline{Q}^{T}{{}_{\times 2}}(\underline{q}_{1})^{T}\cdots{{}_{\times d}}(\underline{q}_{1})^{T}=\lambda e_{1}
\displaystyle\Leftrightarrow Q¯T𝒜q¯1d1=λe1\displaystyle\underline{Q}^{T}\mathcal{A}\underline{q}_{1}^{d-1}=\lambda e_{1}
(12) \displaystyle\Leftrightarrow 𝒜q¯1d1=λq¯1.\displaystyle\mathcal{A}\underline{q}_{1}^{d-1}=\lambda\underline{q}_{1}.

In fact, we do not need to limit ourselves to e1e_{1} and the above result easily generalizes to any eie_{i}, which we present as lemma 2.

Lemma 2.

For a symmetric tensor 𝒜𝕊[d,n]\mathcal{A}\in\mathbb{S}^{[d,n]}, if (λ,ei)(\lambda,e_{i}) is an eigenpair of 𝒜Q¯d\mathcal{A}\underline{Q}^{d}, then (λ,q¯i)(\lambda,\underline{q}_{i}) is an eigenpair of 𝒜\mathcal{A} where q¯i\underline{q}_{i} is the iith column of the orthogonal matrix Q¯n×n\underline{Q}\in\mathbb{R}^{n\times n}.

Proof.

The proof easily follows from (12). ∎

3 QR Algorithm for Symmetric Tensors

The conventional QR algorithm for a symmetric matrix starts with an initial symmetric matrix A0A_{0}, and computes its QR factorization Q1R1=A0Q_{1}R_{1}=A_{0}. Next, the QR product is reversed to give A1=R1Q1=Q1TA0Q1A_{1}=R_{1}Q_{1}=Q_{1}^{T}A_{0}Q_{1}. And in subsequent iterations, we have QkRk=Ak1Q_{k}R_{k}=A_{k-1}, and Ak=QkTAk1Qk=(Q1Qk)TA0(Q1Qk)=Q¯kTA0Q¯kA_{k}=Q_{k}^{T}A_{k-1}Q_{k}=(Q_{1}\cdots Q_{k})^{T}A_{0}(Q_{1}\cdots Q_{k})=\underline{Q}_{k}^{T}A_{0}\underline{Q}_{k} where we define Q¯k=Q1Q2Qk\underline{Q}_{k}=Q_{1}Q_{2}\cdots Q_{k}. Under some mild assumptions, AkA_{k} will converge to a diagonal matrix holding the eigenvalues of A0A_{0}. In the following, we present QRST which computes multiple, stable and unstable [10], eigenpairs. Moreover, with a permutation strategy to scramble tensor entries, PQRST can produce possibly even more distinct eigenpairs, rendering such tensor QR algorithm much more effective compared to tensor power methods [10, 11] which find only one stable eigenpair at a time.

3.1 QRST

Refer to caption
Fig. 2: (a) The example labeling tensor in 𝕊[3,3]\mathbb{S}^{[3,3]} with sequential numbering of each unique entry. (b) Matricization about the first mode.

The symmetric tensor 𝒜𝕊[3,3]\mathcal{A}\in\mathbb{S}^{[3,3]} in Figure 2(a) will be used as a running example. We call it a labeling matrix as it labels each distinct entry in a symmetric tensor sequentially with a natural number. Moreover, Figure 2(b) is the matricization of this labeling matrix along the first mode [9]. As a matter of fact, matricization about any mode of 𝒜\mathcal{A} gives rise to the same matrix due to symmetry.

Now we formally propose the QRST algorithm, summarized in Algorithm 3.1. The core of QRST is a two-step iteration, remarked in Algorithm 3.1 as Steps 1 and 2, for the iith slice where i=1,,ni=1,\cdots,n. In the Algorithm, it should be noted that Matlab convention is used whereby it can be recognized that 𝒜k(:,i,i,,i)=𝒜keid1\mathcal{A}_{k}(:,i,i,\cdots,i)=\mathcal{A}_{k}e_{i}^{d-1} is a column and 𝒜k(:,:,i,,i)=𝒜keid2\mathcal{A}_{k}(:,:,i,\cdots,i)=\mathcal{A}_{k}e_{i}^{d-2} is a (symmetric) square matrix.

Algorithm 3.1.
(Shifted) QRST
Input: 𝒜0𝕊[d,n]\mathcal{A}_{0}\in\mathbb{S}^{[d,n]}, tolerance τ\tau, maximum iteration kmaxk_{max}, shifts sks_{k} from shift scheme
Output: eigenpairs (λi,q¯i)(\lambda_{i},\underline{q}_{i})
for i=1:ni=1:n do \eqparboxCOMMENTiterate through the iith square slice 𝒜0(:,:,i,,i)\mathcal{A}_{0}(:,:,i,\cdots,i)   ϵ𝒜0eid1ei2/𝒜0(:,:,i,,i)2\epsilon\leftarrow||\mathcal{A}_{0}e_{i}^{d-1}-e_{i}||_{2}/||\mathcal{A}_{0}(:,:,i,\cdots,i)||_{2}   Q¯In\underline{Q}\leftarrow I_{n}   iteration count k1k\leftarrow 1   while ϵ>τ\epsilon>\tau or kkmaxk\leq k_{max} do    Step 1: QkRk=𝒜k1(:,:,i,,i)+sk1InQ_{k}R_{k}=\mathcal{A}_{k-1}(:,:,i,\cdots,i)+s_{k-1}I_{n}   # perform QR factorization    Step 2: 𝒜k=𝒜k1Qkd\mathcal{A}_{k}=\mathcal{A}_{k-1}Q_{k}^{d}   # similarity transform    Q¯Q¯Qk\underline{Q}\leftarrow\underline{Q}Q_{k}    ϵ𝒜keid1ei2/𝒜k(:,:,i,,i)2\epsilon\leftarrow||\mathcal{A}_{k}e_{i}^{d-1}-e_{i}||_{2}/||\mathcal{A}_{k}(:,:,i,\cdots,i)||_{2}   end while end for if converged then   λi𝒜k(i,i,i,,i)\lambda_{i}\leftarrow\mathcal{A}_{k}(i,i,i,\cdots,i)   q¯iQ¯(:,i)\underline{q}_{i}\leftarrow\underline{Q}(:,i) end if


Referring to Figure 2(b), if we take 𝒜0=𝒜\mathcal{A}_{0}=\mathcal{A}, i=1i=1 and start with 𝒜0e1d2=𝒜0e1\mathcal{A}_{0}e_{1}^{d-2}=\mathcal{A}_{0}e_{1}, the first (leftmost) 3×33\times 3 “slice” of the 3×93\times 9 matrix is extracted for QR factorization to get Q1R1Q_{1}R_{1}, and then all three modes of 𝒜0\mathcal{A}_{0} are multiplied with Q1TQ_{1}^{T} so the tensor symmetry is preserved. Using Algorithm 3.1 with the heuristic shift described in Section 3.2, this tensor QR algorithm converges in 152152 iterations when 𝒜ke12\mathcal{A}_{k}e_{1}^{2} becomes a multiple of e1e_{1} (i.e., all but the first entry in the 𝒜ke12\mathcal{A}_{k}e_{1}^{2} column are numerically zero), resulting in

[0.140.000.000.000.190.560.000.560.400.000.190.560.1920.6911.560.5611.565.560.000.560.400.5611.565.560.405.562.46],\displaystyle\left[{\begin{array}[]{*{20}c}-0.14&0.00&0.00&\vline&0.00&-0.19&0.56&\vline&0.00&0.56&-0.40\\ 0.00&-0.19&0.56&\vline&-0.19&-20.69&11.56&\vline&0.56&11.56&-5.56\\ 0.00&0.56&-0.40&\vline&0.56&11.56&-5.56&\vline&-0.40&-5.56&2.46\\ \end{array}}\right],

for which e1=[100]Te_{1}=[1~0~0]^{T} is obviously an eigenvector as its contraction along the i2i_{2} and i3i_{3} axes will extract the first column [0.1400]T[-0.14~0~0]^{T}. Using (12), this corresponds to the eigenpair (0.14,[0.78540.60290.1401]T)(-0.14,[-0.7854~0.6029~-0.1401]^{T}) of the labeling tensor. In fact, for different ii’s, we shoot for different converged patterns in the corresponding square slices as depicted in Figure 3, where it is obvious that 𝒜kei2=λiei\mathcal{A}_{k}e_{i}^{2}=\lambda_{i}e_{i} for each ii.

Refer to caption
Fig. 3: The “signatures” of convergence in the example labeling tensor for i=1,2,3i=1,2,3 where ×\times denotes “don’t care”. Positions of 0’s are obvious by comparing with Figure 2(b).

Since the core computation of QRST lies in the QR factorization (𝒪(n3)\mathcal{O}(n^{3})) and the corresponding similarity transform (𝒪(dn2nd1)=𝒪(dnd+1)\mathcal{O}(dn^{2}n^{d-1})=\mathcal{O}(dn^{d+1})), the overall complexity is therefore dominated by the similarity transform process. Note that this high complexity comes about whereby the tensor symmetry is not exploited. When a low-rank decomposition for 𝒜0\mathcal{A}_{0} exists that represents 𝒜0\mathcal{A}_{0} as a finite sum of rr rank-11 outer factors, the complexity of QRST can be largely reduced to 𝒪(drn2)\mathcal{O}(drn^{2}). However, we refrain from elaborating as it is beyond our scope of verifying the idea of QRST. Such low-rank implementation will be presented separately in a sequel of this work.

3.2 Convergence and Shift Selection

We assume an initial symmetric 𝒜0=𝒜𝕊[d,n]\mathcal{A}_{0}=\mathcal{A}\in\mathbb{S}^{[d,n]}. The vector eie_{i} denotes the iith column of the identity matrix InI_{n} and we define Q¯kQ1Q2Qk\underline{Q}_{k}\triangleq Q_{1}Q_{2}\cdots Q_{k} with Q¯0=In\underline{Q}_{0}=I_{n} and Q¯1=Q1\underline{Q}_{1}=Q_{1} etc. A tensor with higher order singleton dimensions is assumed squeezed, e.g., 𝒜0eid2\mathcal{A}_{0}e_{i}^{d-2} is regarded as an n×nn\times n matrix rather than its equivalent n×n×1×1n\times n\times 1\cdots\times 1 tensor. Also, we use q¯i,k\underline{q}_{i,k} to denote the iith column in Q¯k\underline{Q}_{k} and rij,kr_{ij,k} to stand for the ijij-entry of RkR_{k}. It will be shown that the unshifted QRST operating on the matrix slice resulting from contracting all but the first two modes 𝒜k\mathcal{A}_{k} by eie_{i}, i.e., 𝒜keid2\mathcal{A}_{k}e_{i}^{d-2}, is in fact an orthogonal iteration. This can be better visualized by enumerating a few iterations of QRST:

(13a) Q1R1\displaystyle Q_{1}R_{1} =𝒜0eid2,\displaystyle=\mathcal{A}_{0}e_{i}^{d-2},
(13b) 𝒜1\displaystyle\mathcal{A}_{1} =𝒜0Q1d=𝒜0Q¯1d,\displaystyle=\mathcal{A}_{0}Q_{1}^{d}=\mathcal{A}_{0}\underline{Q}_{1}^{d},
(13c) Q2R2\displaystyle Q_{2}R_{2} =𝒜1eid2=𝒜0Q¯1T×1Q¯1T×2q¯i,1T×3q¯i,1T×d=Q¯1T(𝒜0q¯i,1d2)Q¯1,\displaystyle=\mathcal{A}_{1}e_{i}^{d-2}=\mathcal{A}_{0}{{}_{\times_{1}}}{\underline{Q}_{1}^{T}}{{}_{\times_{2}}}{\underline{Q}_{1}^{T}}{{}_{\times_{3}}}{\underline{q}_{i,1}^{T}}\cdots{{}_{\times_{d}}}{\underline{q}_{i,1}^{T}}=\underline{Q}_{1}^{T}(\mathcal{A}_{0}\underline{q}_{i,1}^{d-2})\underline{Q}_{1},
(13d) 𝒜2\displaystyle\mathcal{A}_{2} =𝒜1Q2d=(𝒜0Q¯1d)Q2d=𝒜0(Q¯1Q2)d=𝒜0Q¯2d,\displaystyle=\mathcal{A}_{1}Q_{2}^{d}=(\mathcal{A}_{0}\underline{Q}_{1}^{d})Q_{2}^{d}=\mathcal{A}_{0}(\underline{Q}_{1}Q_{2})^{d}=\mathcal{A}_{0}\underline{Q}_{2}^{d},
(13e) Q3R3\displaystyle Q_{3}R_{3} =𝒜2eid2=𝒜0Q¯2T×1Q¯2T×2q¯i,2T×3q¯i,2T×d=Q¯2T(𝒜0q¯i,2d2)Q¯2,\displaystyle=\mathcal{A}_{2}e_{i}^{d-2}=\mathcal{A}_{0}{{}_{\times_{1}}}{\underline{Q}_{2}^{T}}{{}_{\times_{2}}}{\underline{Q}_{2}^{T}}{{}_{\times_{3}}}{\underline{q}_{i,2}^{T}}\cdots{{}_{\times_{d}}}{\underline{q}_{i,2}^{T}}=\underline{Q}_{2}^{T}(\mathcal{A}_{0}\underline{q}_{i,2}^{d-2})\underline{Q}_{2},

etc. From (13), we see that in general for k=1,2,k=1,2,\cdots,

(14a) 𝒜k1\displaystyle\mathcal{A}_{k-1} =𝒜0Q¯k1d,\displaystyle=\mathcal{A}_{0}\underline{Q}_{k-1}^{d},
(14b) QkRk\displaystyle Q_{k}R_{k} =𝒜k1eid2=Q¯k1T(𝒜0q¯i,k1d2)Q¯k1.\displaystyle={\mathcal{A}_{k-1}}{e_{i}^{d-2}}=\underline{Q}_{k-1}^{T}(\mathcal{A}_{0}\underline{q}_{i,k-1}^{d-2})\underline{Q}_{k-1}.

Multiplying Q¯k1\underline{Q}_{k-1} to both sides of (14b) we get

Q¯kRk=(𝒜0q¯i,k1d2)Q¯k1\displaystyle\underline{Q}_{k}R_{k}=(\mathcal{A}_{0}\underline{q}_{i,k-1}^{d-2})\underline{Q}_{k-1}\Leftrightarrow
(22) [|||q¯1,kq¯2,kq¯n,k|||][r11,kr12,kr1n,kr22,kr2n,krnn,k]\displaystyle\left[{\begin{array}[]{*{20}c}|&|&{}\hfil&|\\ {\underline{q}_{1,k}}&{\underline{q}_{2,k}}&\cdots&{\underline{q}_{n,k}}\\ |&|&{}\hfil&|\\ \end{array}}\right]\left[{\begin{array}[]{*{20}c}{r_{11,k}}&{r_{12,k}}&\cdots&{r_{1n,k}}\\ {}\hfil&{r_{22,k}}&{}\hfil&{r_{2n,k}}\\ {}\hfil&{}\hfil&\ddots&\vdots\\ {}\hfil&{}\hfil&{}\hfil&{r_{nn,k}}\\ \end{array}}\right]
(29) =[𝒜0q¯i,k1d2][|||q¯1,k1q¯2,k1q¯n,k1|||].\displaystyle~~~~~~~~~~~~~~=\left[{\begin{array}[]{*{20}c}{}\hfil&{}\hfil&{}\hfil\\ {}\hfil&{\mathcal{A}_{0}\underline{q}_{i,k-1}^{d-2}}&{}\hfil\\ {}\hfil&{}\hfil&{}\hfil\\ \end{array}}\right]\left[{\begin{array}[]{*{20}c}|&|&{}\hfil&|\\ {\underline{q}_{1,k-1}}&{\underline{q}_{2,k-1}}&\cdots&{\underline{q}_{n,k-1}}\\ |&|&{}\hfil&|\\ \end{array}}\right].

It should now become obvious that (29) is the tensor generalization of the matrix orthogonal iteration [6] (also called simultaneous iteration or subspace iteration) for finding dominant invariant eigenspace of matrices111Unlike the matrix case with a constant (static) matrix AA, the tensor counterpart comes with a dynamic 𝒜0q¯i,k1d2\mathcal{A}_{0}\underline{q}_{i,k-1}^{d-2} that is updated in every pass with the iith column of Q¯k1\underline{Q}_{k-1}.. This result comes at no surprise as the matrix QR algorithm is known to be equivalent to orthogonal iteration applied to a block column vector. However, our work confirms, for the first time, this is also true also for symmetric tensors.

Now, if we set i=1i=1 and extract the first columns on both sides of (29), we get

(30) q¯1,kr11,k=(𝒜0q¯1,k1d2)q¯1,k1=𝒜0q¯1,kd1 or q¯1,k=𝒜0q¯1,kd1𝒜0q¯1,kd1,\displaystyle\underline{q}_{1,k}r_{11,k}=(\mathcal{A}_{0}\underline{q}_{1,k-1}^{d-2})\underline{q}_{1,k-1}=\mathcal{A}_{0}\underline{q}_{1,k}^{d-1}\mbox{~~or~~}\underline{q}_{1,k}=\frac{\mathcal{A}_{0}\underline{q}_{1,k}^{d-1}}{||\mathcal{A}_{0}\underline{q}_{1,k}^{d-1}||},

which is just the unshifted S-HOPM method [3, 8] cited as Algorithm 1 in [10]. If we start QRST with i=2i=2, then the first two columns of (29) proceed as

(31a) q¯1,kr11,k\displaystyle\underline{q}_{1,k}r_{11,k} =(𝒜0q¯2,k1d2)q¯1,k1,\displaystyle=(\mathcal{A}_{0}\underline{q}_{2,k-1}^{d-2})\underline{q}_{1,k-1},
(31b) q¯1,kr12,k+q¯2,kr22,k\displaystyle\underline{q}_{1,k}r_{12,k}+\underline{q}_{2,k}r_{22,k} =𝒜0q¯2,k1d1.\displaystyle=\mathcal{A}_{0}\underline{q}_{2,k-1}^{d-1}.

Assuming convergence at k=k=\infty,

(32a) q¯1,r11,\displaystyle\underline{q}_{1,\infty}r_{11,\infty} =(𝒜0q¯2,d2)q¯1,,\displaystyle=(\mathcal{A}_{0}\underline{q}_{2,\infty}^{d-2})\underline{q}_{1,\infty},
(32b) q¯1,r12,+q¯2,r22,\displaystyle\underline{q}_{1,\infty}r_{12,\infty}+\underline{q}_{2,\infty}r_{22,\infty} =(𝒜0q¯2,d2)q¯2,=𝒜0q¯2,d1,\displaystyle=(\mathcal{A}_{0}\underline{q}_{2,\infty}^{d-2})\underline{q}_{2,\infty}=\mathcal{A}_{0}\underline{q}_{2,\infty}^{d-1},

from which we can easily check that r12,=q¯1,T(𝒜0q¯2,d2)q¯2,=r11,q¯1,Tq¯2,=0r_{12,\infty}=\underline{q}_{1,\infty}^{T}(\mathcal{A}_{0}\underline{q}_{2,\infty}^{d-2})\underline{q}_{2,\infty}=r_{11,\infty}\underline{q}_{1,\infty}^{T}\underline{q}_{2,\infty}=0 due to (32a). In fact, for any ii, if Q¯k\underline{Q}_{k} and RkR_{k} converge to Q¯\underline{Q}_{\infty} and RR_{\infty}, respectively, then from (29), we must have

(37) R=[r11,000r22,000rnn,]=Q¯T(𝒜0q¯i,d2)Q¯,\displaystyle R_{\infty}=\left[{\begin{array}[]{*{20}c}{r_{11,\infty}}&0&\cdots&0\\ 0&{r_{22,\infty}}&{}\hfil&\vdots\\ {\vdots}&{}\hfil&\ddots&0\\ 0&{\cdots}&{0}&{r_{nn,\infty}}\\ \end{array}}\right]=\underline{Q}_{\infty}^{T}(\mathcal{A}_{0}\underline{q}_{i,\infty}^{d-2})\underline{Q}_{\infty},

where RR_{\infty} is diagonal due to symmetry on the right of equality. Now if we pre-multiply Q¯\underline{Q}_{\infty} onto both sides of (37) and post-multiply with eje_{j}, then

(38) q¯j,rjj,=(𝒜0q¯i,d2)q¯j,\displaystyle\underline{q}_{j,\infty}r_{jj,\infty}=(\mathcal{A}_{0}\underline{q}_{i,\infty}^{d-2})\underline{q}_{j,\infty}

which boils down to the eigenpair (rii,,q¯i,)(r_{ii,\infty},\underline{q}_{i,\infty}) when i=ji=j. So in effect, (shifted) QRST finds more than just the conventional eigenpairs, but actually all “local” eigenpairs of the square matrix (𝒜0q¯i,d2)(\mathcal{A}_{0}\underline{q}_{i,\infty}^{d-2}), one of them happens to be the tensor eigenpair. This explains why in practice, (shifted) QRST can locate even unstable tensor eigenpairs as it is not restricted to the “global” convexity/concavity constraint imposed on the function f(x)=𝒜0xdf(x)=\mathcal{A}_{0}x^{d} that always leads to positive/negative stable eigenpairs [10].

Till now, it can be seen that except for i=1i=1, (shifted) QRST differs from the (S)S-HOPM algorithm presented in [10, 11], and convergence is generally NOT ensured for i1i\neq 1. However, inspired by (32b), the key for convergence starts with the convergence of the first column that reads

(39) q¯1,k\displaystyle\underline{q}_{1,k} =(𝒜0q¯i,k1d2)q¯1,k1(𝒜0q¯i,k1d2)q¯1,k1.\displaystyle=\frac{(\mathcal{A}_{0}\underline{q}_{i,k-1}^{d-2})\underline{q}_{1,k-1}}{||(\mathcal{A}_{0}\underline{q}_{i,k-1}^{d-2})\underline{q}_{1,k-1}||}.

Consequently, applying results in [10, 11], convergence of (39) can be enforced locally around q¯1,k1\underline{q}_{1,k-1} provided the scalar function f(x)=xT(𝒜0q¯i,k1d2)xf(x)=x^{T}(\mathcal{A}_{0}\underline{q}_{i,k-1}^{d-2})x is convex (concave), which is “localized” in the sense that the contracted (𝒜0q¯i,k1d2)(\mathcal{A}_{0}\underline{q}_{i,k-1}^{d-2}) is used instead of the whole 𝒜0\mathcal{A}_{0}. Then, for example, convexity can be achieved by adding a shift sk1s_{k-1} that makes (𝒜0q¯i,k1d2)+sk1In(\mathcal{A}_{0}\underline{q}_{i,k-1}^{d-2})+s_{k-1}I_{n} positive semidefinite, namely,

sk1\displaystyle s_{k-1} λmin(𝒜0q¯i,k1d2)\displaystyle\geq-\lambda_{\min}\left(\mathcal{A}_{0}\underline{q}_{i,k-1}^{d-2}\right)
=λmin(Q¯k1T(𝒜0q¯i,k1d2)Q¯k1)\displaystyle=-\lambda_{\min}\left(\underline{Q}_{k-1}^{T}(\mathcal{A}_{0}\underline{q}_{i,k-1}^{d-2})\underline{Q}_{k-1}\right)
(40) =λmin(𝒜k1eid2).\displaystyle=-\lambda_{\min}\left(\mathcal{A}_{k-1}e_{i}^{d-2}\right).

So the shift step in Algorithm 3.1 is simply sk1=λmin(𝒜k1(:,:,i,,i))+δs_{k-1}=-\lambda_{\min}(\mathcal{A}_{k-1}(:,:,i,\cdots,i))+\delta where δ\delta is a small positive quantity which is set to be unity in our tests unless otherwise specified. This is analogous to the core step in SS-HOPM (Algorithm 2 of [10]). In short, the shift strategy in [10, 11] can be locally reused for QRST with the advantage that QRST may further compute the unstable eigenpairs alongside the positive/negative-stable ones, since a group of eigenpairs are generated at a time due to analogy of QRST to orthogonal iteration.

3.3 PQRST

For a symmetric 𝒜0𝕊[d,n]\mathcal{A}_{0}\in\mathbb{S}^{[d,n]}, QRST produces at most nn eigenpairs in one pass, and fewer than nn eigenpairs can result when QRST does not converge or when the same eigenpair is computed repeatedly. To increase the probability of generating distinct eigenpairs, and noting that eigenvectors of similar tensors are just a change of basis, we introduce the heuristic of permutation to scramble the entries in 𝒜0\mathcal{A}_{0} as much as possible. Figure 4 shows how the 𝕊[3,3]\mathbb{S}^{[3,3]} labeling tensor is permuted by the n!n! (n=3)(n=3) (orthogonal) permutation matrices.

Refer to caption
Fig. 4: Successive permutations of the example labeling tensor. Here the corresponding permutation matrices are put alongside the flattened tensors.

As such, we can run QRST on a tensor subject to different permutations. We call this permuted QRST or PQRST, summarized in Algorithm 3.2.

Algorithm 3.2.
PQRST
Input: 𝒜0𝕊[d,n]\mathcal{A}_{0}\in\mathbb{S}^{[d,n]}, preconditioning orthogonal matrices in a cell P{p}P\{p\}, p=1,,p¯p=1,\cdots,\bar{p}
Output: eigenpairs (λi,q¯i)(\lambda_{i},\underline{q}_{i})
for p=1:p¯p=1:\bar{p} do   PP{p}P\leftarrow P\{p\}   𝒜p𝒜0Pd\mathcal{A}_{p}\leftarrow\mathcal{A}_{0}P^{d}   call QRST Algorithm 3.1 with 𝒜p\mathcal{A}_{p} and collect distinct converged eigenpairs end for


Of course, PQRST requires running QRST n!n! times which quickly becomes impractical for large nn. In that case, we should limit the number of permutations by sampling across all possible permutations or use other heuristics for setting the initial pre-transformed 𝒜0\mathcal{A}_{0}, which is our on-going research.

4 Numerical Examples

In this section, we compare PQRST with SS-HOPM from [10], and with its adaptive version described in [12]. For all the SS-HOPM runs the shift was chosen as (m1)i1,,im|ai1im|(m-1)\sum_{i_{1},\ldots,i_{m}}|a_{i_{1}\cdots i_{m}}|. We considered the iteration to have converged when |λk+1λk|1015|\lambda_{k+1}-\lambda_{k}|\leq 10^{-15}. If the eigenpairs are not known in advance then we also solve the corresponding polynomial system using the numerical method described in [4]. All numerical experiments were done on a quad-core desktop computer with 16 GB memory. Each core runs at 3.3 GHz.

4.1 Example 1

As a first example we use the labeling tensor of Figure 2(a). Due to the odd order of the tensor we have that both (λ,x)(\lambda,x) and (λ,x)(-\lambda,-x) are eigenpairs. From solving the polynomial system we learn that there are 4 distinct eigenpairs, listed in Table 1. The stability of each of these eigenpairs was determined by checking the signs of the eigenvalues of the projected Hessian of the Lagrangian as described in [10].

Table 1: Distinct eigenpairs for labeling tensor of Figure 2(a).
λ\lambda xx stability
30.4557 (0.370.610.70)\begin{pmatrix}0.37&0.61&0.70\end{pmatrix} negatively stable
0.4961 (0.800.340.50)\begin{pmatrix}-0.80&-0.34&0.50\end{pmatrix} positively stable
0.1688 (0.860.440.23)\begin{pmatrix}0.86&-0.44&-0.23\end{pmatrix} positively stable
0.1401 (0.780.600.14)\begin{pmatrix}0.78&-0.60&0.14\end{pmatrix} unstable

100 runs of the SS-HOPM algorithm were performed, each with a different random initial vector. Choosing the random initial vectors from either a uniform or normal distribution had no effect whatsoever on the final result. The shift was fixed to the “conservative” choice of (m1)i1,,im|ai1im|=288(m-1)\sum_{i_{1},\ldots,i_{m}}|a_{i_{1}\cdots i_{m}}|=288. All 100 runs converged to the eigenpair with the eigenvalue 30.455730.4557. The median of number of iterations is 169. One run of 169 iterations took about 0.02 seconds. Next, 100 runs of the adaptive SS-HOPM algorithm were performed with a tolerance of 101510^{-15}. When the initial guess for xx was a random vector sampled from a uniform distribution, then all 100 runs consistently converged to the 30.455730.4557 eigenpair with a median number of 14 iterations. This clearly demonstrates the power of the adaptive shift in order to reduce the number of iterations. Choosing the initial vector xx from a normal distribution resulted in finding the 3 stable eigenpairs and 2 iterations not converging (Table 2). The error was computed as the average value of 𝒜xm1λx2||\mathcal{A}x^{m-1}-\lambda\,x||_{2}. Similar to the SS-HOPM algorithm, the PQRST algorithm without shifting can only find the “largest” eigenpair with a median of 13 iterations. This took about 0.082 seconds. PQRST with shift δ=1\delta=1 results in 18 runs that find all eigenpairs (Table 3). From the 18 runs, 8 did not converge.

Table 2: Adaptive SS-HOPM for the labeling tensor of Figure 2(a).
occ. λ\lambda xx median its error
58 30.4557 (0.370.610.70)\begin{pmatrix}0.37&0.61&0.70\end{pmatrix} 15 4.95×10154.95\times 10^{-15}
23 0.4961 (0.800.340.50)\begin{pmatrix}-0.80&-0.34&0.50\end{pmatrix} 225 1.09×10141.09\times 10^{-14}
17 0.1688 (0.860.440.23)\begin{pmatrix}0.86&-0.44&-0.23\end{pmatrix} 701 4.85×10144.85\times 10^{-14}
Table 3: PQRST with shift for the labeling tensor of Figure 2(a).
occ. λ\lambda xx median its error
2 30.4557 (0.370.610.70)\begin{pmatrix}0.37&0.61&0.70\end{pmatrix} 18 3.10×10143.10\times 10^{-14}
2 0.4961 (0.800.340.50)\begin{pmatrix}-0.80&-0.34&0.50\end{pmatrix} 69.5 6.96×10156.96\times 10^{-15}
2 0.1688 (0.860.440.23)\begin{pmatrix}0.86&-0.44&-0.23\end{pmatrix} 246 4.74×10154.74\times 10^{-15}
4 0.1401 (0.780.600.14)\begin{pmatrix}0.78&-0.60&0.14\end{pmatrix} 201 4.24×10154.24\times 10^{-15}

4.2 Example 2

For the second example we consider the 4th-order tensor from Example 3.5 in [10]. Solving the polynomial system shows that there are 11 distinct eigenpairs (Table 4).

Table 4: Distinct eigenpairs for the tensor of Example 3.5 in [10].
λ\lambda xx stability
0.8893 (0.670.250.70)\begin{pmatrix}0.67&0.25&-0.70\end{pmatrix} negatively stable
0.8169 (0.840.260.47)\begin{pmatrix}0.84&-0.26&0.47\end{pmatrix} negatively stable
0.5105 (0.360.780.51)\begin{pmatrix}0.36&-0.78&0.51\end{pmatrix} unstable
0.3633 (0.270.640.72)\begin{pmatrix}0.27&0.64&0.72\end{pmatrix} negatively stable
0.2682 (0.610.440.66)\begin{pmatrix}0.61&0.44&0.66\end{pmatrix} unstable
0.2628 (0.130.440.89)\begin{pmatrix}0.13&-0.44&-0.89\end{pmatrix} unstable
0.2433 (0.990.090.11)\begin{pmatrix}0.99&0.09&-0.11\end{pmatrix} unstable
0.1735 (0.330.910.25)\begin{pmatrix}0.33&0.91&0.25\end{pmatrix} unstable
-0.0451 (0.780.610.12)\begin{pmatrix}0.78&0.61&0.12\end{pmatrix} positively stable
-0.5629 (0.180.180.97)\begin{pmatrix}0.18&-0.18&0.97\end{pmatrix} positively stable
-1.0954 (0.590.750.30)\begin{pmatrix}0.59&-0.75&-0.30\end{pmatrix} positively stable

Both a positive and negative shift are used for the (adaptive) SS-HOPM runs. Tables 5 and 6 list the results of applying 200 (100 positive shifts and 100 negative shifts) SS-HOPM and adaptive SS-HOPM runs to this tensor. Using the conservative shift of 3737 and the same convergence criterion as in Example 4.1 resulted in 27 runs not converging for the SS-HOPM algorithm. Again, the main strength of the adaptive SS-HOPM is the reduction of number of iterations required to converge. A faster convergence also implies a smaller error as observed in the numerical results. The same initial random vectors were used for both the SS-HOPM and adaptive SS-HOPM runs. Observe that in the 100 adaptive SS-HOPM runs with a negative shift the eigenpair with eigenvalue 0.0451-0.0451 was never found. The total run time for 100 iterations of one adaptive SS-HOPM algorithm took about 0.025 seconds, which means that 100 runs take about 2.5 seconds to complete.

Similarly, the PQRST algorithm without shifting does not converge in any of the permutations. PQRST however retrieves all eigenpairs except one (Table 7). The total time to run PQRST for all permutations for 100 iterations took about 0.35 seconds.

Table 5: SS-HOPM for the tensor of Example 3.5 in [10].
occ. λ\lambda xx median its error
2 0.8893 (0.670.230.70)\begin{pmatrix}0.67&0.23&-0.70\end{pmatrix} 886 2.07×1082.07\times 10^{-8}
30 0.8169 (0.840.260.47)\begin{pmatrix}0.84&-0.26&0.47\end{pmatrix} 644 9.15×10119.15\times 10^{-11}
41 0.3633 (0.270.640.71)\begin{pmatrix}0.27&0.64&0.71\end{pmatrix} 931 4.03×1084.03\times 10^{-8}
55 -0.0451 (0.780.610.12)\begin{pmatrix}-0.78&-0.61&-0.12\end{pmatrix} 680 1.81×10101.81\times 10^{-10}
27 -0.5629 (0.180.180.97)\begin{pmatrix}-0.18&0.18&-0.97\end{pmatrix} 386 1.37×10141.37\times 10^{-14}
18 -1.0954 (0.590.750.30)\begin{pmatrix}-0.59&0.75&0.30\end{pmatrix} 357.5 1.56×10141.56\times 10^{-14}
Table 6: Adaptive SS-HOPM for the tensor of Example 3.5 in [10].
occ. λ\lambda xx median its error
19 0.8893 (0.670.230.70)\begin{pmatrix}0.67&0.23&-0.70\end{pmatrix} 69 6.14×10166.14\times 10^{-16}
30 0.8169 (0.840.260.47)\begin{pmatrix}0.84&-0.26&0.47\end{pmatrix} 33 8.17×10168.17\times 10^{-16}
51 0.3633 (0.270.640.71)\begin{pmatrix}0.27&0.64&0.71\end{pmatrix} 26 7.22×10167.22\times 10^{-16}
78 -0.5629 (0.180.180.97)\begin{pmatrix}-0.18&0.18&-0.97\end{pmatrix} 18 7.53×10167.53\times 10^{-16}
22 -1.0954 (0.590.750.30)\begin{pmatrix}-0.59&0.75&0.30\end{pmatrix} 34 6.50×10166.50\times 10^{-16}
Table 7: PQRST with shift for the tensor of Example 3.5 in [10].
occ. λ\lambda xx median its error
4 0.8893 (0.670.230.70)\begin{pmatrix}0.67&0.23&-0.70\end{pmatrix} 96.5 1.55×10151.55\times 10^{-15}
2 0.8169 (0.840.260.47)\begin{pmatrix}0.84&-0.26&0.47\end{pmatrix} 70 1.37×10151.37\times 10^{-15}
1 0.5105 (0.360.780.51)\begin{pmatrix}0.36&-0.78&0.51\end{pmatrix} 33 1.34×10151.34\times 10^{-15}
4 0.2682 (0.610.440.66)\begin{pmatrix}0.61&0.44&0.66\end{pmatrix} 186.5 2.48×10152.48\times 10^{-15}
2 0.2628 (0.130.440.87)\begin{pmatrix}0.13&-0.44&-0.87\end{pmatrix} 46 7.80×10167.80\times 10^{-16}
2 0.2433 (0.990.090.11)\begin{pmatrix}0.99&0.09&-0.11\end{pmatrix} 45.5 4.46×10164.46\times 10^{-16}
1 0.1735 (0.330.910.25)\begin{pmatrix}0.33&0.91&0.25\end{pmatrix} 45 5.85×10165.85\times 10^{-16}
6 -0.0451 (0.780.610.12)\begin{pmatrix}-0.78&-0.61&-0.12\end{pmatrix} 34 4.20×10164.20\times 10^{-16}
2 -0.5629 (0.180.180.97)\begin{pmatrix}-0.18&0.18&-0.97\end{pmatrix} 19 4.60×10164.60\times 10^{-16}
4 -1.0954 (0.590.750.30)\begin{pmatrix}-0.59&0.75&0.30\end{pmatrix} 25.5 1.59×10151.59\times 10^{-15}

4.3 Example 3

For the third example we consider the 3rd-order tensor from Example 3.6 in [10]. Solving the polynomial system reveals that there are 4 stable and 3 unstable distinct eigenpairs (Table 8).

Table 8: Distinct eigenpairs for the tensor of Example 3.6 in [10].
λ\lambda xx stability
0.8730 (0.390.720.57)\begin{pmatrix}-0.39&0.72&0.57\end{pmatrix} negatively stable
0.4306 (0.720.120.68)\begin{pmatrix}-0.72&-0.12&-0.68\end{pmatrix} positively stable
0.2294 (0.840.440.31)\begin{pmatrix}-0.84&0.44&-0.31\end{pmatrix} unstable
0.0180 (0.710.510.48)\begin{pmatrix}0.71&0.51&-0.48\end{pmatrix} negatively stable
0.0033 (0.450.770.45)\begin{pmatrix}0.45&0.77&-0.45\end{pmatrix} unstable
0.0018 (0.330.630.70)\begin{pmatrix}0.33&0.63&-0.70\end{pmatrix} unstable
0.0006 (0.290.730.61)\begin{pmatrix}0.29&0.73&-0.61\end{pmatrix} positively stable

The results of running SS-HOPM, adaptive SS-HOPM and PQRST on this tensor exhibit very similar features as in Example 4.1. Just like in Example 4.1, the SS-HOPM algorithm can only retrieve the eigenpair with the largest eigenvalue when the initial random vector is drawn from a uniform distribution. When the initial guess is drawn from a normal distribution, SS-HOPM retrieves 2 eigenpairs in 100 runs while the adaptive SS-HOPM retrieves the 4 stable eigenpairs. (Table 9). The shifted PQRST algorithm (δ=.5\delta=.5) retrieves all 7 eigenpairs (Table 10).

Table 9: Adaptive SS-HOPM for the tensor of Example 3.6 in [10].
occ. λ\lambda xx median its error
43 0.8730 (0.390.720.57)\begin{pmatrix}-0.39&0.72&0.57\end{pmatrix} 13 4.00×10164.00\times 10^{-16}
35 0.4306 (0.720.120.68)\begin{pmatrix}-0.72&-0.12&-0.68\end{pmatrix} 22 8.24×10168.24\times 10^{-16}
16 0.0180 (0.710.510.48)\begin{pmatrix}0.71&0.51&-0.48\end{pmatrix} 43 1.26×10151.26\times 10^{-15}
6 -0.0006 (0.290.730.61)\begin{pmatrix}-0.29&-0.73&0.61\end{pmatrix} 18.5 3.35×10163.35\times 10^{-16}
Table 10: PQRST with shift for the tensor of Example 3.6 in [10].
occ. λ\lambda xx median its error
1 0.8730 (0.390.720.57)\begin{pmatrix}-0.39&0.72&0.57\end{pmatrix} 569 7.09×10157.09\times 10^{-15}
5 0.4306 (0.720.120.68)\begin{pmatrix}-0.72&-0.12&-0.68\end{pmatrix} 56 1.79×10151.79\times 10^{-15}
2 0.2294 (0.840.440.31)\begin{pmatrix}-0.84&0.44&-0.31\end{pmatrix} 511 1.65×10151.65\times 10^{-15}
2 0.0180 (0.710.510.48)\begin{pmatrix}0.71&0.51&-0.48\end{pmatrix} 873 5.58×10165.58\times 10^{-16}
2 0.0033 (0.450.770.45)\begin{pmatrix}0.45&0.77&-0.45\end{pmatrix} 1328 4.03×10164.03\times 10^{-16}
2 0.0018 (0.330.630.70)\begin{pmatrix}0.33&0.63&-0.70\end{pmatrix} 1177 4.76×10164.76\times 10^{-16}
2 0.0006 (0.290.730.61)\begin{pmatrix}0.29&0.73&-0.61\end{pmatrix} 3929.5 3.63×10163.63\times 10^{-16}

4.4 Example 4

As a last example we consider a random symmetric tensor in 𝕊[3,6]\mathbb{S}^{[3,6]}. Both SS-HOPM and the adaptive SS-HOPM are able to find 4 distinct eigenpairs in 100 runs. Since the difference in the results between SS-HOPM and the adaptive SS-HOPM lies in the total number of iterations we only list the results for the adaptive SS-HOPM method in Table 11.

Table 11: Adaptive SS-HOPM for random symmetric tensor of order 3 and dimension 6.
occ. λ\lambda xx median its error
23 2.4333 (0.640.360.570.330.170.07)\begin{pmatrix}-0.64&-0.36&0.57&0.33&-0.17&-0.07\end{pmatrix} 79 1.87×10141.87\times 10^{-14}
23 4.3544 (0.070.300.380.480.680.26)\begin{pmatrix}-0.07&0.30&-0.38&0.48&0.68&0.26\end{pmatrix} 25 5.96×10155.96\times 10^{-15}
33 3.4441 (0.230.160.380.720.430.27)\begin{pmatrix}0.23&-0.16&0.38&-0.72&0.43&0.27\end{pmatrix} 52 1.02×10141.02\times 10^{-14}
21 2.8488 (0.020.260.300.310.520.69)\begin{pmatrix}-0.02&-0.26&-0.30&0.31&-0.52&0.69\end{pmatrix} 70 9.45×10159.45\times 10^{-15}

In contrast, PQRST with shift δ=1\delta=1 finds 7 distinct eigenpairs. The 3 additional eigenpairs are unstable.

Table 12: PQRST for random symmetric tensor of order 3 and dimension 6.
occ. λ\lambda xx median its error
24 1.3030 (0.470.840.030.070.050.27)\begin{pmatrix}-0.47&0.84&0.03&-0.07&0.05&0.27\end{pmatrix} 599 6.78×10146.78\times 10^{-14}
576 2.4333 (0.640.360.570.330.170.07)\begin{pmatrix}-0.64&-0.36&0.57&0.33&-0.17&-0.07\end{pmatrix} 75 5.68×10115.68\times 10^{-11}
216 4.3544 (0.070.300.380.480.680.26)\begin{pmatrix}-0.07&0.30&-0.38&0.48&0.68&0.26\end{pmatrix} 148 2.66×10132.66\times 10^{-13}
120 3.4441 (0.230.160.380.720.430.27)\begin{pmatrix}0.23&-0.16&0.38&-0.72&0.43&0.27\end{pmatrix} 41 6.90×10146.90\times 10^{-14}
240 2.8488 (0.020.260.300.310.520.69)\begin{pmatrix}-0.02&-0.26&-0.30&0.31&-0.52&0.69\end{pmatrix} 48 6.63×10146.63\times 10^{-14}
120 1.6652 (0.270.370.020.310.260.79)\begin{pmatrix}-0.27&0.37&0.02&-0.31&-0.26&0.79\end{pmatrix} 773 2.66×10132.66\times 10^{-13}
132 1.2643 (0.020.510.480.610.220.30)\begin{pmatrix}-0.02&-0.51&-0.48&-0.61&0.22&-0.30\end{pmatrix} 150 2.65×10132.65\times 10^{-13}

5 Conclusions

The QR algorithm for finding eigenpairs of a matrix has been extended to its symmetric tensor counterpart called QRST (QR algorithm for symmetric tensors). The QRST algorithm, like its matrix version, permits a shift to accelerate convergence. A preconditioned QRST algorithm, called PQRST, has further been proposed to efficiently locate multiple eigenpairs of a symmetric tensor by orienting the tensor into various directions. Numerical examples have verified the effectiveness of (P)QRST in locating stable and unstable eigenpairs not found by previous tensor power methods.

References

  • [1] P. Comon and M. Rajih, Blind identification of under-determined mixtures based on the characteristic function, Signal Processing, 86 (2006), pp. 2271 – 2281. Special Section: Signal Processing in {UWB} Communications.
  • [2] L. De Lathauwer, J. Castaing, and J. Cardoso, Fourth-order cumulant-based blind identification of underdetermined mixtures, Signal Processing, IEEE Transactions on, 55 (2007), pp. 2965–2973.
  • [3] L. De Lathauwer, B. De Moor, and J. Vandewalle, On the best rank-11 and rank-(R1R_{1}, R2R_{2},…, RnR_{n}) approximation of higher-order tensors, SIAM Journal on Matrix Analysis and Applications, 21 (2000), pp. 1324–1342.
  • [4] P. Dreesen, Back to the Roots: Polynomial System Solving Using Linear Algebra, PhD thesis, Faculty of Engineering, KU Leuven (Leuven, Belgium), 2013.
  • [5] A Ferreol, L. Albera, and P. Chevalier, Fourth-order blind identification of underdetermined mixtures of sources (FOBIUM), Signal Processing, IEEE Transactions on, 53 (2005), pp. 1640–1653.
  • [6] G. H. Golub and C. F. Van Loan, Matrix Computations, The Johns Hopkins University Press, 3rd ed., Oct. 1996.
  • [7] M. E Kilmer and C. D. M. Martin, Decomposing a tensor, SIAM News, 37 (2004), pp. 19–20.
  • [8] E. Kofidis and P. A. Regalia, On the best rank-1 approximation of higher-order supersymmetric tensors, SIAM Journal on Matrix Analysis and Applications, 23 (2002), pp. 863–884.
  • [9] T. Kolda and B. Bader, Tensor decompositions and applications, SIAM Review, 51 (2009), pp. 455–500.
  • [10] T. G. Kolda and J. R. Mayo, Shifted power method for computing tensor eigenpairs, SIAM Journal on Matrix Analysis and Applications, 32 (2011), pp. 1095–1124.
  • [11] T. G. Kolda and J. R. Mayo, An adaptive shifted power method for computing generalized tensor eigenpairs, ArXiv e-prints, (2014).
  • [12]  , An Adaptive Shifted Power Method for Computing Generalized Tensor Eigenpairs, ArXiv e-prints, (2014).
  • [13] J. M. Landsberg, Tensors: Geometry and Applications, American Mathematical Society, 2012.
  • [14] L.-H. Lim, Singular values and eigenvalues of tensors: a variational approach, in in CAMAP2005: 1st IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing, 2005, pp. 129–132.
  • [15] L.-H. Lim and P. Comon, Blind multilinear identification, IEEE Trans. Inf. Theory, 60 (2014), pp. 1260–1280.
  • [16] L. Qi, Eigenvalues of a real supersymmetric tensor, Journal of Symbolic Computation, 40 (2005), pp. 1302–1324.