
Infrared Small Target Detection via Tensor $L_{2,1}$ Norm Minimization and ASSTV Regularization: A Novel Tensor Recovery Approach

Jiqian Zhao and An-Bao Xu (corresponding author). This work was supported by the National Natural Science Foundation of China under Grant 11801418. Jiqian Zhao is with the College of Computer Science and Artificial Intelligence, Wenzhou University, Zhejiang 325035, China (e-mail: zjq18698557926@gmail.com). An-Bao Xu is with the College of Mathematics and Physics, Wenzhou University, Zhejiang 325035, China (e-mail: xuanbao@wzu.edu.cn).
Abstract

In recent years, infrared small target detection has received considerable attention, given its vital importance in processing signals from infrared remote sensing. The considerable computational cost of prior methods, which rely heavily on the nuclear norm for noise separation, motivates the search for efficient alternatives. The aim of this research is to identify a fast and resilient tensor recovery method for efficiently extracting infrared small targets from image sequences. Theoretical analysis indicates that the smaller singular values predominantly contribute to reconstructing noise information. In the exclusion process, tensor QR decomposition is employed to reduce the size of the target tensor. We then propose a Tensor $L_{2,1}$ Norm Minimization via T-QR (TLNMTQR) method to effectively isolate the noise, markedly improving computational speed without compromising accuracy. Concurrently, by integrating the asymmetric spatial-temporal total variation (ASSTV) regularization method, we augment the flexibility and efficacy of our algorithm in handling time-series data. Finally, our method was rigorously tested on real-world data, demonstrating the superiority of our algorithm in terms of speed, precision, and robustness.

Index Terms:
Tensor recovery, $L_{2,1}$ norm, infrared small target detection, tensor QR decomposition, asymmetric spatial-temporal total variation, image recovery.

I INTRODUCTION

In the contemporary era of the internet, our lives are intricately woven into the fabric of signals [1]. As we harness these signals, our constant aspiration is for precise and expeditious data collection. Nonetheless, in practical application, the dichotomy between speed and accuracy arises due to technological constraints. Sampling at lower rates presents a formidable challenge in acquiring truly comprehensive data. This challenge is particularly pronounced in the domain of infrared small target detection [2, 3, 4], where there is a pressing demand for a rapid and efficient methodology.

Compressive sensing [5, 6] offers a technical solution by sparsely sampling signals at lower rates and solving an optimization problem to ultimately reconstruct the complete signal. However, this method usually applies to one-dimensional data. In contemporary society, data is typically stored in matrices, such as infrared images, audio, and video. These data likewise suffer partial element loss during collection, transmission, and storage. If compressive sensing methods are used directly, they may overlook the relationships between elements across the rows and columns of two-dimensional data.

This is where matrix recovery methods come into play. When discussing early effective algorithms in matrix recovery, it is inevitable to mention Robust Principal Component Analysis (RPCA). The RPCA model is defined as $\mathbf{Y}=\mathbf{L}+\mathbf{S}$. This model describes the decomposition of a noisy original matrix $\mathbf{Y}$ into a low-rank matrix $\mathbf{L}$ and a sparse matrix $\mathbf{S}$. The solving process uses an optimization problem composed of the $L_1$-norm and the nuclear norm. These methods fall under the category of Low-Rank and Sparse Matrix Decomposition Models (LRSD) [7, 8, 9]. For instance, [10] introduced a swift and effective solver for robust principal component analysis. Building upon this, various methods have since emerged that leverage the concept of low-rank matrix recovery; see [11, 12, 13].

TABLE I: Explanation of notation in this paper
$\mathbf{a}$: vectors | $\mathbf{I}_n$: the identity matrix
$\mathbf{A}$: matrices | $\mathcal{I}$: the identity tensor
$\mathcal{A}$: tensors | $\langle\mathcal{A},\mathcal{B}\rangle$: the inner product of $\mathcal{A}$ and $\mathcal{B}$
$\mathcal{A}\in\mathbb{C}^{n_1\times n_2\times n_3}$: a third-order tensor | $\text{conj}(\mathcal{A})$: the complex conjugate of $\mathcal{A}$
$\mathcal{A}(i,j,k)$: the $(i,j,k)$-th entry of $\mathcal{A}$ | $\|\mathcal{A}\|_F$: the tensor Frobenius norm of $\mathcal{A}$
$\mathcal{A}(i,j,:)$: the tube fiber of $\mathcal{A}$ | $\|\mathcal{A}\|_1$: the tensor $L_1$-norm of $\mathcal{A}$
$\mathcal{A}^{(i)}$: the $i$-th frontal slice of $\mathcal{A}$ | $\|\mathcal{A}\|_{2,1}$: the tensor $L_{2,1}$-norm of $\mathcal{A}$
$\mathcal{A}^{\ast}$: the conjugate transpose of $\mathcal{A}$ | $\|\mathcal{A}\|_{\ast}$: the tensor nuclear norm of $\mathcal{A}$
$\hat{\mathcal{A}}$: the result of DFT on $\mathcal{A}$ | $\|\mathcal{A}\|_{ASSTV}$: the tensor ASSTV norm of $\mathcal{A}$

As research progresses, tensors are seen as a way to further optimize and upgrade LRSD, known as Tensor Low-Rank and Sparse Matrix Decomposition Models (TLRSD) [14, 15]. Tensors, due to their increased dimensions, encompass more information and exhibit significantly enhanced computational efficiency.

Analogous to matrix recovery methods, Tensor Robust Principal Component Analysis (TRPCA) [16] was proposed and showed promising results in tensor recovery. Similarly, methods related to tensor decomposition have gradually emerged. In recent years, Total Variation (TV) regularization [17, 18, 19] has been widely applied to solve tensor recovery problems. However, traditional TV regularization, while considering spatial information, is computationally complex. Hence, spatial spectral total variation (SSTV) [20] was introduced, significantly reducing the complexity of solving TV regularization. Yet, besides the spatial information within the tensor itself, the temporal information in tensors remained underutilized, leading to the introduction of ASSTV [21] to incorporate both temporal and spatial information.

Thus, tensor recovery has evolved to primarily solve ASSTV-regularized nuclear norm optimization problems. However, this process involves tensor singular value decomposition (T-SVD) [22] to solve the tensor nuclear norm minimization problem, which is highly time-consuming. To address this, we propose a new method, ASSTV-TLNMTQR, which combines ASSTV regularization with TLNMTQR. Our method contributes in the following ways:

  1. We introduce the ASSTV-TLNMTQR method, which significantly accelerates tensor decomposition. Through rigorous experiments on infrared small target detection, this approach demonstrates rapid and efficient tensor recovery capabilities.

  2. We present a pioneering strategy, rooted in tensor tri-factorization, for minimizing the $L_{2,1}$ norm. This method is highly effective for tensor decomposition problems, surpassing the speed of previous approaches that rely on the nuclear norm.

  3. Our approach integrates ASSTV regularization and TLNMTQR, resulting in a model endowed with greater flexibility. This integration enables our model to produce strong solutions across varied contexts through flexible parameter adjustment.

II NOTATIONS AND PRELIMINARIES

II-A Fast Fourier transform

In this paper, we have compiled a comprehensive list of symbols and definitions for reference, as presented in TABLE I. To articulate our model effectively, we first introduce some fundamental definitions and theorems. We begin with a critically important operation associated with the tensor and matrix products: the Discrete Fourier Transform (DFT). We denote the DFT of a vector $\mathbf{v}\in\mathbb{R}^{n}$ by $\hat{\mathbf{v}}$, i.e.,

$\hat{\mathbf{v}}=\mathbf{F}_{n}\mathbf{v},$

where $\mathbf{F}_{n}$ is the DFT matrix, defined as

$\mathbf{F}_{n}=\begin{bmatrix}1&1&1&\cdots&1\\ 1&\omega&\omega^{2}&\cdots&\omega^{n-1}\\ \vdots&\vdots&\vdots&\ddots&\vdots\\ 1&\omega^{n-1}&\omega^{2(n-1)}&\cdots&\omega^{(n-1)(n-1)}\end{bmatrix}\in\mathbb{C}^{n\times n},$

where $\omega=e^{-\frac{2\pi i}{n}}$ and $i$ is the imaginary unit. It follows that $\frac{\mathbf{F}_{n}}{\sqrt{n}}$ is unitary, i.e.,

$\mathbf{F}_{n}^{\ast}\mathbf{F}_{n}=\mathbf{F}_{n}\mathbf{F}_{n}^{\ast}=n\mathbf{I}_{n}.$ (1)
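As a quick numerical sanity check (illustrative only, not part of the derivation), the definition of $\mathbf{F}_n$ and identity (1) can be verified with NumPy, whose `fft` routine applies $\mathbf{F}_n$ to a vector:

```python
import numpy as np

n = 8
# Build the DFT matrix F_n explicitly from its definition.
omega = np.exp(-2j * np.pi / n)
F = omega ** np.outer(np.arange(n), np.arange(n))

# Identity (1): F_n^* F_n = n I_n, so F_n / sqrt(n) is unitary.
assert np.allclose(F.conj().T @ F, n * np.eye(n))

# np.fft.fft computes F_n @ v, matching v_hat = F_n v.
v = np.random.rand(n)
assert np.allclose(np.fft.fft(v), F @ v)
```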

It can further be shown that

$\left(\mathbf{F}_{n_3}\otimes\mathbf{I}_{n_1}\right)\cdot\text{bcirc}\left(\mathcal{A}\right)\cdot\left(\mathbf{F}_{n_3}^{-1}\otimes\mathbf{I}_{n_2}\right)=\bar{\mathbf{A}},$ (2)

where \otimes denotes the Kronecker product and 𝐅n31𝐈n2n3\frac{\mathbf{F}_{{n}_{3}}^{-1}\otimes\mathbf{I}_{{n}_{2}}}{\sqrt{n_{3}}} is orthogonal. Moreover, 𝐀¯\bar{\mathbf{A}} and bcirc(𝒜)\text{bcirc}\left(\mathcal{A}\right) are defined as follows.

$\bar{\mathbf{A}}=\text{bdiag}\left(\hat{\mathcal{A}}\right)=\begin{bmatrix}\hat{\mathcal{A}}^{(1)}&&&\\ &\hat{\mathcal{A}}^{(2)}&&\\ &&\ddots&\\ &&&\hat{\mathcal{A}}^{(n_3)}\end{bmatrix},$

$\text{bcirc}\left(\mathcal{A}\right)=\begin{bmatrix}\mathcal{A}^{(1)}&\mathcal{A}^{(n_3)}&\cdots&\mathcal{A}^{(2)}\\ \mathcal{A}^{(2)}&\mathcal{A}^{(1)}&\cdots&\mathcal{A}^{(3)}\\ \vdots&\ddots&\ddots&\vdots\\ \mathcal{A}^{(n_3)}&\mathcal{A}^{(n_3-1)}&\cdots&\mathcal{A}^{(1)}\end{bmatrix}.$
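The block-diagonalization (2) can likewise be checked numerically. The sketch below (helper names are illustrative, not from the paper) builds $\text{bcirc}(\mathcal{A})$ explicitly and confirms that conjugating it by the DFT-based matrices yields a block-diagonal matrix whose blocks are the frontal slices of $\hat{\mathcal{A}}$:

```python
import numpy as np

def bcirc(A):
    """Block-circulant matrix of a third-order tensor A (n1 x n2 x n3)."""
    n1, n2, n3 = A.shape
    M = np.zeros((n1 * n3, n2 * n3))
    for i in range(n3):
        for j in range(n3):
            # Block (i, j) is the frontal slice with index (i - j) mod n3.
            M[i*n1:(i+1)*n1, j*n2:(j+1)*n2] = A[:, :, (i - j) % n3]
    return M

n1, n2, n3 = 3, 4, 5
A = np.random.rand(n1, n2, n3)
F = np.fft.fft(np.eye(n3))                 # DFT matrix F_{n3}
Lmat = np.kron(F, np.eye(n1))              # F_{n3} ⊗ I_{n1}
Rmat = np.kron(np.linalg.inv(F), np.eye(n2))  # F_{n3}^{-1} ⊗ I_{n2}
B = Lmat @ bcirc(A) @ Rmat

# The diagonal blocks of B are the frontal slices of the DFT of A along mode 3;
# the off-diagonal blocks vanish.
A_hat = np.fft.fft(A, axis=2)
for k in range(n3):
    assert np.allclose(B[k*n1:(k+1)*n1, k*n2:(k+1)*n2], A_hat[:, :, k])
```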

II-B Tensor product

Here, we provide the definition of the t-product. Let $\mathcal{A}\in\mathbb{R}^{n_1\times n_2\times n_3}$ and $\mathcal{B}\in\mathbb{R}^{n_2\times l\times n_3}$. Then the t-product $\mathcal{A}\ast\mathcal{B}$ is the following tensor of size $n_1\times l\times n_3$:

$\mathcal{A}\ast\mathcal{B}=\text{fold}\left(\text{bcirc}\left(\mathcal{A}\right)\cdot\text{unfold}\left(\mathcal{B}\right)\right),$ (3)

where $\text{unfold}(\cdot)$ and $\text{fold}(\cdot)$ are the unfold and fold operators, respectively, i.e.,

$\text{unfold}\left(\mathcal{A}\right)=\begin{bmatrix}\mathcal{A}^{(1)}\\ \mathcal{A}^{(2)}\\ \vdots\\ \mathcal{A}^{(n_3)}\end{bmatrix}\in\mathbb{R}^{n_1 n_3\times n_2},$

$\text{fold}\left(\text{unfold}\left(\mathcal{A}\right)\right)=\mathcal{A}.$
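The equivalence between definition (3) and the slice-wise Fourier-domain computation (the standard way the t-product is computed in practice) can be sketched as follows; all function names are illustrative:

```python
import numpy as np

def bcirc(A):
    """Block-circulant matrix of a third-order tensor."""
    n1, n2, n3 = A.shape
    M = np.zeros((n1 * n3, n2 * n3))
    for i in range(n3):
        for j in range(n3):
            M[i*n1:(i+1)*n1, j*n2:(j+1)*n2] = A[:, :, (i - j) % n3]
    return M

def unfold(B):
    """Stack the frontal slices vertically: [B^(1); B^(2); ...; B^(n3)]."""
    n2, l, n3 = B.shape
    return B.transpose(2, 0, 1).reshape(n3 * n2, l)

def fold(M, n1, l, n3):
    """Inverse of unfold."""
    return M.reshape(n3, n1, l).transpose(1, 2, 0)

def t_product(A, B):
    """t-product by definition (3): fold(bcirc(A) @ unfold(B))."""
    n1, _, n3 = A.shape
    l = B.shape[1]
    return fold(bcirc(A) @ unfold(B), n1, l, n3)

def t_product_fft(A, B):
    """Equivalent computation: slice-wise matrix products in the Fourier domain."""
    A_hat = np.fft.fft(A, axis=2)
    B_hat = np.fft.fft(B, axis=2)
    C_hat = np.einsum('ijk,jlk->ilk', A_hat, B_hat)
    return np.real(np.fft.ifft(C_hat, axis=2))

A = np.random.rand(3, 4, 5)
B = np.random.rand(4, 2, 5)
assert np.allclose(t_product(A, B), t_product_fft(A, B))
```

The Fourier-domain route costs one FFT per tube plus $n_3$ small matrix products, instead of forming the large block-circulant matrix, which is why it is preferred in implementations.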

II-C Tensor QR decomposition

The most important component of our $L_{2,1}$-norm minimization [23] is the tensor QR decomposition (T-QR), which serves as a fast approximation of the tensor SVD. Let $\mathcal{A}\in\mathbb{R}^{n_1\times n_2\times n_3}$. Then it can be factored as

$\mathcal{A}=\mathcal{Q}\ast\mathcal{R},$ (4)

where $\mathcal{Q}\in\mathbb{R}^{n_1\times n_1\times n_3}$ is orthogonal, and $\mathcal{R}\in\mathbb{R}^{n_1\times n_2\times n_3}$ is analogous to an upper triangular matrix.
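A common way to realize factorization (4) is slice-wise QR in the Fourier domain. The sketch below is our own illustrative implementation (not the authors' code); it also checks orthogonality of $\mathcal{Q}$ under the t-product, i.e., $\mathcal{Q}^{\ast}\ast\mathcal{Q}=\mathcal{I}$:

```python
import numpy as np

def t_product_fft(A, B):
    """t-product via slice-wise products in the Fourier domain."""
    A_hat = np.fft.fft(A, axis=2)
    B_hat = np.fft.fft(B, axis=2)
    return np.real(np.fft.ifft(np.einsum('ijk,jlk->ilk', A_hat, B_hat), axis=2))

def t_transpose(A):
    """Tensor (conjugate) transpose: transpose each slice, reverse slices 2..n3."""
    At = A.transpose(1, 0, 2).copy()
    At[:, :, 1:] = At[:, :, :0:-1]
    return At

def t_qr(A):
    """Tensor QR (4): full QR of each frontal slice in the Fourier domain."""
    n1, n2, n3 = A.shape
    A_hat = np.fft.fft(A, axis=2)
    Q_hat = np.zeros((n1, n1, n3), dtype=complex)
    R_hat = np.zeros((n1, n2, n3), dtype=complex)
    for k in range(n3):
        Q_hat[:, :, k], R_hat[:, :, k] = np.linalg.qr(A_hat[:, :, k], mode='complete')
    return np.real(np.fft.ifft(Q_hat, axis=2)), np.real(np.fft.ifft(R_hat, axis=2))

A = np.random.rand(5, 3, 4)
Q, R = t_qr(A)
assert np.allclose(t_product_fft(Q, R), A)          # A = Q * R

# Q^* * Q equals the identity tensor: I on the first slice, zeros elsewhere.
QtQ = t_product_fft(t_transpose(Q), Q)
assert np.allclose(QtQ[:, :, 0], np.eye(5), atol=1e-8)
assert np.allclose(QtQ[:, :, 1:], 0, atol=1e-8)
```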

II-D Asymmetric spatial-temporal total variation

TV regularization [17, 18, 19] is widely applied in infrared small target detection, as it allows us to better utilize the information in the images. By incorporating TV regularization, we can maintain the smoothness of the image during image recovery and denoising, thereby eliminating potential artifacts. Typically, TV regularization is applied to matrices as shown in Equation (5).

$\left\|\mathcal{A}\right\|_{TV}=\left\|D_h\mathcal{A}\right\|_1+\left\|D_v\mathcal{A}\right\|_1$ (5)

However, this approach neglects temporal information. Hence, we use the asymmetric spatial-temporal total variation (ASSTV) regularization method to model both spatial and temporal continuity. There are two main reasons for choosing the ASSTV regularization method:

  • It facilitates the preservation of image details and better noise elimination.

  • By introducing a new parameter $\delta$, it allows us to assign different weights to the temporal and spatial components, enabling more flexible tensor recovery.

Now, we provide the definition of ASSTV, as shown in Equation (6):

$\left\|\mathcal{A}\right\|_{ASSTV}=\left\|D_h\mathcal{A}\right\|_1+\left\|D_v\mathcal{A}\right\|_1+\delta\left\|D_z\mathcal{A}\right\|_1$ (6)

where $D_h$, $D_v$, and $D_z$ are the horizontal, vertical, and temporal difference operators, respectively, and $\delta$ is a positive constant that controls the contribution of the temporal dimension. The three operators are defined as:

$D_h\mathcal{A}(i,j,k)=\mathcal{A}(i+1,j,k)-\mathcal{A}(i,j,k)$ (7)
$D_v\mathcal{A}(i,j,k)=\mathcal{A}(i,j+1,k)-\mathcal{A}(i,j,k)$ (8)
$D_z\mathcal{A}(i,j,k)=\mathcal{A}(i,j,k+1)-\mathcal{A}(i,j,k)$ (9)
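The three difference operators (7)-(9) admit a direct implementation. The sketch below assumes periodic (circular) boundary conditions, a common convention that makes each operator a circular convolution (and is consistent with an FFT-based solver such as the one used later in (30)):

```python
import numpy as np

def D_h(A):
    """Horizontal forward difference (7), periodic at the boundary."""
    return np.roll(A, -1, axis=0) - A

def D_v(A):
    """Vertical forward difference (8), periodic at the boundary."""
    return np.roll(A, -1, axis=1) - A

def D_z(A):
    """Temporal forward difference (9), periodic at the boundary."""
    return np.roll(A, -1, axis=2) - A

def asstv_norm(A, delta=1.0):
    """ASSTV norm (6): weighted sum of the L1 norms of the three differences."""
    return (np.abs(D_h(A)).sum() + np.abs(D_v(A)).sum()
            + delta * np.abs(D_z(A)).sum())

# A constant tensor is perfectly smooth, so its ASSTV norm is zero.
assert asstv_norm(np.ones((5, 5, 3))) == 0.0
```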

II-E Tensor $L_{2,1}$ norm minimization via tensor QR decomposition

Figure 1: Singular values of one image

In the realm of matrix recovery, a commonly employed approach disentangles noise and restores images by solving a nuclear norm minimization problem through the Singular Value Decomposition (SVD). Applying the SVD to a single channel of an RGB image yields the singular values depicted in Fig. 1. However, the SVD is computationally demanding and often requires substantial execution time to converge. Consequently, we adopt a novel tensor recovery method based on tensor tri-factorization to address the $L_{2,1}$ norm minimization problem, achieving image recovery with reduced computational complexity.

$\min_{\mathbf{Z}}\ \tau\left\|\mathbf{Z}\right\|_{2,1}+\frac{1}{2}\left\|\mathbf{Z}-\mathbf{Y}\right\|_F^2$ (10)

First, we introduce the matrix version. Problem (10) has an optimal solution, given by (11).

$\mathbf{Z}\left(:,j\right)=\max\left\{\frac{\left\|\mathbf{Y}\left(:,j\right)\right\|_F-\tau}{\left\|\mathbf{Y}\left(:,j\right)\right\|_F},0\right\}\mathbf{Y}\left(:,j\right)$ (11)

However, computing the entire matrix $\mathbf{Z}$ requires significant computational resources and time. Moreover, in most cases, the smaller singular values of the matrix primarily reconstruct noise, so it is unnecessary to compute them during matrix recovery. Hence, we seek a method that computes only the first few larger singular values and their corresponding singular vectors, accelerating the overall computation.

This problem essentially amounts to finding a rank-$r$ approximation of the matrix $\mathbf{Z}$, i.e., solving problem (12).

$\min_{\mathbf{L},\mathbf{D},\mathbf{R}}\left\|\mathbf{Z}-\mathbf{L}\mathbf{D}\mathbf{R}\right\|_F^2\leq\varepsilon$ (12)

where $\mathbf{L}\in\mathbb{R}^{m\times r}$, $\mathbf{D}\in\mathbb{R}^{r\times r}$, $\mathbf{R}\in\mathbb{R}^{r\times n}$, and $r$ represents the target rank of matrix $\mathbf{Z}$ ($r\in(0,n]$). $\mathbf{L}$ and $\mathbf{R}$ are column-orthogonal and row-orthogonal matrices, respectively, while $\mathbf{D}$ is a square core matrix. $\varepsilon$ is a positive error threshold. Problem (12) can be solved with an iterative method.

We initialize three matrices as follows: $\mathbf{L}_1=\text{eye}(m,r)$, $\mathbf{D}_1=\text{eye}(r,r)$, and $\mathbf{R}_1=\text{eye}(r,n)$. Then, we iteratively update $\mathbf{L}$, $\mathbf{D}$, and $\mathbf{R}$ using the QR decomposition. First, fixing $\mathbf{R}$, we have:

$\left[\mathbf{L}_{j+1},\sim\right]=\text{qr}\left(\mathbf{Z}\mathbf{R}_j^T\right).$

Next, fixing 𝐋\mathbf{L}, we obtain:

$\left[\mathbf{R}_{j+1},\sim\right]=\text{qr}\left(\mathbf{Z}^{T}\mathbf{L}_{j+1}\right).$

Finally:

$\mathbf{D}_{j+1}=\mathbf{L}_{j+1}^{T}\mathbf{Z}\mathbf{R}_{j+1}^{T}.$

After multiple iterations, the tri-factorization of 𝐙\mathbf{Z} is completed.
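The alternating-QR iteration above can be sketched as follows (matrix case; initialization as in the text, all other names illustrative). For a matrix of exact rank $r$, the tri-factorization is numerically exact:

```python
import numpy as np

def tri_factorize(Z, r, iters=10):
    """Alternating-QR rank-r tri-factorization Z ≈ L D R (a sketch of the scheme above)."""
    m, n = Z.shape
    L = np.eye(m, r)
    R = np.eye(r, n)
    for _ in range(iters):
        # Fix R, update L: orthonormal basis of the column space of Z R^T.
        L, _ = np.linalg.qr(Z @ R.T)
        # Fix L, update R: orthonormal basis of the row space of L^T Z.
        Q, _ = np.linalg.qr(Z.T @ L)
        R = Q.T
    D = L.T @ Z @ R.T                      # r x r core matrix
    return L, D, R

rng = np.random.default_rng(0)
# A 30 x 20 matrix of exact rank 5.
Z = rng.standard_normal((30, 5)) @ rng.standard_normal((5, 20))
L, D, R = tri_factorize(Z, r=5)
assert np.linalg.norm(Z - L @ D @ R) < 1e-8
```

Because each step is a thin QR of an $m\times r$ (or $n\times r$) matrix, the per-iteration cost is far below that of a full SVD when $r\ll\min(m,n)$.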

Next, we utilize the obtained tri-factorization matrices to optimize our problem. Since $\mathbf{L}\mathbf{D}\mathbf{R}\approx\mathbf{Z}$, we establish the following $L_{2,1}$ minimization model:

$\min\ \tau\left\|\mathbf{L}\mathbf{D}\mathbf{R}\right\|_{2,1}+\frac{1}{2}\left\|\mathbf{L}\mathbf{D}\mathbf{R}-\mathbf{Y}\right\|_F^2$ (13)

Since both $\mathbf{L}$ and $\mathbf{R}$ are orthogonal, Equation (13) simplifies to Equation (14).

$\min\ \tau\left\|\mathbf{D}\right\|_{2,1}+\frac{1}{2}\left\|\mathbf{D}-\mathbf{L}^T\mathbf{Y}\mathbf{R}^T\right\|_F^2$ (14)

Let $\mathbf{C}=\mathbf{L}^T\mathbf{Y}\mathbf{R}^T$; then we have Equation (15).

$\min\ \tau\left\|\mathbf{D}\right\|_{2,1}+\frac{1}{2}\left\|\mathbf{D}-\mathbf{C}\right\|_F^2$ (15)

We observe that this problem, resembling (10), can be solved using the contraction operator from (11), as shown in Equation (16).

$\mathbf{D}\left(:,j\right)=\max\left\{\frac{\left\|\mathbf{C}\left(:,j\right)\right\|_F-\tau}{\left\|\mathbf{C}\left(:,j\right)\right\|_F},0\right\}\mathbf{C}\left(:,j\right)$ (16)
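The column-wise shrinkage operator (16) has a direct implementation; a minimal sketch (function name illustrative):

```python
import numpy as np

def l21_shrink(C, tau):
    """Column-wise shrinkage (16): shrink each column's L2 norm by tau;
    columns whose norm is below tau are set to zero entirely."""
    out = np.zeros_like(C)
    for j in range(C.shape[1]):
        nrm = np.linalg.norm(C[:, j])
        if nrm > tau:
            out[:, j] = (nrm - tau) / nrm * C[:, j]
    return out

C = np.array([[3.0, 0.1],
              [4.0, 0.1]])
D = l21_shrink(C, tau=1.0)
# First column (norm 5) is scaled by 4/5; second column (norm ≈ 0.14) is zeroed.
assert np.allclose(D[:, 0], [2.4, 3.2])
assert np.allclose(D[:, 1], 0.0)
```

This whole-column thresholding is what gives the $L_{2,1}$ norm its column-sparsity effect, in contrast to the entry-wise shrinkage induced by the $L_1$ norm.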

Similarly, for the tensor $\mathcal{Z}$, we need to solve the following optimization problem:

$\min\ \left\|\mathcal{Z}\right\|_{2,1}+\frac{\mu}{2}\left\|\mathcal{Z}-\left(\mathcal{B}-\frac{y}{\mu}\right)\right\|_F^2$ (17)

We perform tri-factorization on the tensor $\mathcal{Z}$, finding three tensors such that $\mathcal{L}\ast\mathcal{D}\ast\mathcal{R}\approx\mathcal{Z}$, where $\mathcal{L}\in\mathbb{R}^{n_1\times r\times n_3}$ and $\mathcal{R}\in\mathbb{R}^{r\times n_2\times n_3}$ are orthogonal, and $\mathcal{D}\in\mathbb{R}^{r\times r\times n_3}$ is the core tensor.

We obtain $\mathcal{L}$, $\mathcal{D}$, and $\mathcal{R}$ by solving the following optimization problem:

$\left\|\mathcal{Z}-\mathcal{L}\ast\mathcal{D}\ast\mathcal{R}\right\|_F^2\leq\varepsilon$ (18)

Then, we can utilize the same iterative method to obtain $\mathcal{L}$, $\mathcal{D}$, and $\mathcal{R}$. The solution of problem (17) is thus transformed into solving the following problem:

$\min\ \left\|\mathcal{L}\ast\mathcal{D}\ast\mathcal{R}\right\|_{2,1}+\frac{\mu}{2}\left\|\mathcal{L}\ast\mathcal{D}\ast\mathcal{R}-\left(\mathcal{B}-\frac{y}{\mu}\right)\right\|_F^2$ (19)

Simplifying (19) leads to Equation (20).

$\min\ \left\|\mathcal{D}\right\|_{2,1}+\frac{\mu}{2}\left\|\mathcal{D}-\mathcal{L}^{\ast}\ast\left(\mathcal{B}-\frac{y}{\mu}\right)\ast\mathcal{R}^{\ast}\right\|_F^2$ (20)

Let $\mathcal{D}_T=\mathcal{L}^{\ast}\ast\left(\mathcal{B}-\frac{y}{\mu}\right)\ast\mathcal{R}^{\ast}$. The optimal solution of this modified problem is:

$\mathcal{D}=\text{ifft}\left(\max\left\{\frac{\left\|\mathcal{D}_T\right\|_F-\tau}{\left\|\mathcal{D}_T\right\|_F},0\right\}\mathcal{D}_T,[\,],3\right)$ (21)

The complete algorithmic process is illustrated in Algorithm 1.

Algorithm 1: Tensor $L_{2,1}$-Norm Minimization Method Based on Tensor QR Decomposition.
Input: a real incomplete tensor $\mathcal{X}\in\mathbb{R}^{n_1\times n_2\times n_3}$.
1:  Initialize: $r>0$, $t>0$, $\tau>0$, $\varepsilon=10^{-6}$, $\mathcal{L}_0=\mathcal{I}\in\mathbb{R}^{n_1\times r\times n_3}$, $\mathcal{D}_0=\mathcal{I}\in\mathbb{R}^{r\times r\times n_3}$, $\mathcal{R}_0=\mathcal{I}\in\mathbb{R}^{r\times n_2\times n_3}$, $\mathcal{M}=\mathcal{X}$.
2:  while not converged do
3:    $\left[\mathcal{L},\sim\right]=\text{T-QR}\left(\mathcal{M}\ast\mathcal{R}^{\ast}\right)$;
4:    $\left[\mathcal{R},\mathcal{D}_T^{\ast}\right]=\text{T-QR}\left(\mathcal{M}^{\ast}\ast\mathcal{L}\right)$;
5:    $\mathcal{R}=\mathcal{R}^{\ast}$;
6:    for $t=1,\dots,n_3$ do
7:      for $j=1,\dots,r$ do
8:        $\mathcal{D}^{j(t)}=\text{ifft}\left(\frac{\max\left\{\left\|\hat{\mathcal{D}_T^{j}}^{(t)}\right\|_F-\tau,0\right\}}{\left\|\hat{\mathcal{D}_T^{j}}^{(t)}\right\|_F}\hat{\mathcal{D}_T^{j}}^{(t)},[\,],3\right)$;
9:      end for
10:   end for
11:   $\mathcal{M}=\mathcal{L}\ast\mathcal{D}\ast\mathcal{R}$;
12:   check the convergence condition $\left\|\mathcal{X}-\mathcal{L}\ast\mathcal{D}\ast\mathcal{R}\right\|_F^2\leq\varepsilon$;
13: end while
14: return $\mathcal{M}$.

III OUR METHOD FOR TENSOR RECOVERY

There are different methods for constructing the tensor $\mathcal{K}$ under different backgrounds. In infrared small target detection, we divide a continuous sequence of images into blocks using a sliding window [21, 24], moving from the top left to the bottom right. Eventually, we obtain a tensor $\mathcal{K}$; see Fig. 2. The height and width of the sliding window are $n_1$ and $n_2$, respectively, and the thickness is $n_3=L$.
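One plausible reading of this construction can be sketched as follows; the window size, step, and frame grouping below are illustrative assumptions, not the paper's exact settings:

```python
import numpy as np

def build_patch_tensors(frames, win=(40, 40), step=40, L=3):
    """Slide an n1 x n2 window from top-left to bottom-right and, at every
    position, stack the patch from L consecutive frames into an n1 x n2 x L
    tensor (thickness n3 = L). Window size and step are illustrative."""
    n1, n2 = win
    H, W = frames[0].shape
    tensors = []
    for t in range(0, len(frames) - L + 1, L):          # groups of L frames
        for i in range(0, H - n1 + 1, step):            # top to bottom
            for j in range(0, W - n2 + 1, step):        # left to right
                patch = np.stack(
                    [frames[t + k][i:i+n1, j:j+n2] for k in range(L)], axis=2)
                tensors.append(patch)
    return tensors

frames = [np.random.rand(256, 256) for _ in range(6)]
Ks = build_patch_tensors(frames)
assert Ks[0].shape == (40, 40, 3)
```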

Figure 2: The construction of the tensor

After constructing the tensor $\mathcal{K}$, performing the SVD on $\mathcal{K}$ shows clearly that $\mathcal{K}$ is a low-rank tensor.

Then, we define the following linear model:

$\mathcal{K}=\mathcal{B}+\mathcal{T}+\mathcal{N}$ (22)

where $\mathcal{B}$, $\mathcal{T}$, and $\mathcal{N}$ represent the background image, target image, and noise image, respectively, with $\mathcal{K},\mathcal{B},\mathcal{T},\mathcal{N}\in\mathbb{R}^{n_1\times n_2\times n_3}$.

III-A The proposed model

By combining the $L_{2,1}$ norm and ASSTV regularization, we define our model as follows:

$\mathcal{B},\mathcal{T},\mathcal{N}=\arg\min_{\mathcal{B},\mathcal{T},\mathcal{N}}\left\|\mathcal{B}\right\|_{2,1}+\lambda_{tv}\left\|\mathcal{B}\right\|_{ASSTV}+\lambda_s\left\|\mathcal{T}\right\|_1+\lambda_3\left\|\mathcal{N}\right\|_F^2$ (23)

$\text{s.t.}\quad\mathcal{K}=\mathcal{B}+\mathcal{T}+\mathcal{N},$

By utilizing the ASSTV formulation (6), we can rewrite (23) as follows:

$\mathcal{B},\mathcal{T},\mathcal{N}=\arg\min_{\mathcal{B},\mathcal{T},\mathcal{N}}\left\|\mathcal{B}\right\|_{2,1}+\lambda_s\left\|\mathcal{T}\right\|_1+\lambda_3\left\|\mathcal{N}\right\|_F^2+\lambda_{tv}\left(\left\|\mathcal{K}_h\mathcal{B}\right\|_1+\left\|\mathcal{K}_v\mathcal{B}\right\|_1+\delta\left\|\mathcal{K}_z\mathcal{B}\right\|_1\right)$ (24)

$\text{s.t.}\quad\mathcal{K}=\mathcal{B}+\mathcal{T}+\mathcal{N},$

Our model employs the $L_{2,1}$ norm, which significantly enhances algorithm efficiency. Additionally, we utilize tensor low-rank approximation to better estimate the background, and the use of ASSTV allows for more flexible recovery of tensors.

III-B Optimization procedure

In this section, we use the Alternating Direction Method of Multipliers (ADMM) to solve (23). First, we rewrite (23) in the following form:

$\mathcal{B},\mathcal{T},\mathcal{N}=\arg\min_{\mathcal{B},\mathcal{T},\mathcal{N}}\left\|\mathcal{Z}\right\|_{2,1}+\lambda_s\left\|\mathcal{T}\right\|_1+\lambda_3\left\|\mathcal{N}\right\|_F^2+\lambda_{tv}\left(\left\|\mathcal{V}_1\right\|_1+\left\|\mathcal{V}_2\right\|_1+\delta\left\|\mathcal{V}_3\right\|_1\right)$ (25)

$\text{s.t.}\quad\mathcal{K}=\mathcal{B}+\mathcal{T}+\mathcal{N},\ \mathcal{Z}=\mathcal{B},\ \mathcal{V}_1=\mathcal{K}_h\mathcal{B},\ \mathcal{V}_2=\mathcal{K}_v\mathcal{B},\ \mathcal{V}_3=\mathcal{K}_z\mathcal{B}$

Next, we present the augmented Lagrangian formulation of (25):

$L_A(\mathcal{B},\mathcal{T},\mathcal{N},\mathcal{Z},\mathcal{V})=\left\|\mathcal{Z}\right\|_{2,1}+\lambda_s\left\|\mathcal{T}\right\|_1+\lambda_3\left\|\mathcal{N}\right\|_F^2+\lambda_{tv}\left(\left\|\mathcal{V}_1\right\|_1+\left\|\mathcal{V}_2\right\|_1+\delta\left\|\mathcal{V}_3\right\|_1\right)+\left\langle y_1,\mathcal{K}-\mathcal{B}-\mathcal{T}-\mathcal{N}\right\rangle+\left\langle y_2,\mathcal{Z}-\mathcal{B}\right\rangle+\left\langle y_3,\mathcal{V}_1-\mathcal{K}_h\mathcal{B}\right\rangle+\left\langle y_4,\mathcal{V}_2-\mathcal{K}_v\mathcal{B}\right\rangle+\left\langle y_5,\mathcal{V}_3-\mathcal{K}_z\mathcal{B}\right\rangle+\frac{\mu}{2}\left(\left\|\mathcal{K}-\mathcal{B}-\mathcal{T}-\mathcal{N}\right\|_F^2+\left\|\mathcal{Z}-\mathcal{B}\right\|_F^2+\left\|\mathcal{V}_1-\mathcal{K}_h\mathcal{B}\right\|_F^2+\left\|\mathcal{V}_2-\mathcal{K}_v\mathcal{B}\right\|_F^2+\left\|\mathcal{V}_3-\mathcal{K}_z\mathcal{B}\right\|_F^2\right)$ (26)

where $\mu$ is a positive penalty scalar and $y_1,y_2,y_3,y_4,y_5$ are the Lagrangian multipliers. We use the ADMM algorithm to solve (26) for $\mathcal{Z}$, $\mathcal{B}$, $\mathcal{T}$, $\mathcal{V}_1$, $\mathcal{V}_2$, $\mathcal{V}_3$, and $\mathcal{N}$, alternately updating the variables as follows:

  1. We update the value of $\mathcal{Z}$ using the following equation:

$\mathcal{Z}^{k+1}=\arg\min_{\mathcal{Z}}\left\|\mathcal{Z}\right\|_{2,1}+\frac{\mu^k}{2}\left\|\mathcal{Z}-\mathcal{B}+\frac{y_2^k}{\mu^k}\right\|_F^2$ (27)

For more details about the update of $\mathcal{Z}$, please refer to Algorithm 1. In our experiments, one iteration produced promising results, so we perform only one.

  2. We update the value of $\mathcal{B}$ using the following equation:

$\mathcal{B}^{k+1}=\arg\min_{\mathcal{B}}\frac{\mu^k}{2}\Big(\left\|\mathcal{K}-\mathcal{B}-\mathcal{T}^k-\mathcal{N}^k+\frac{y_1^k}{\mu^k}\right\|_F^2+\left\|\mathcal{Z}^{k+1}-\mathcal{B}+\frac{y_2^k}{\mu^k}\right\|_F^2+\left\|\mathcal{V}_1^k-\mathcal{K}_h\mathcal{B}+\frac{y_3^k}{\mu^k}\right\|_F^2+\left\|\mathcal{V}_2^k-\mathcal{K}_v\mathcal{B}+\frac{y_4^k}{\mu^k}\right\|_F^2+\left\|\mathcal{V}_3^k-\mathcal{K}_z\mathcal{B}+\frac{y_5^k}{\mu^k}\right\|_F^2\Big)$ (28)

    To solve (28), we can utilize the following system of linear equations:

$\left(2I+\Delta\right)\mathcal{B}^{k+1}=\mathcal{L}^k+\theta_1+\theta_2+\theta_3$ (29)

where $\Delta=\mathcal{K}_h^T\mathcal{K}_h+\mathcal{K}_v^T\mathcal{K}_v+\mathcal{K}_z^T\mathcal{K}_z$, $\mathcal{L}^k=\mathcal{K}-\mathcal{T}^k-\mathcal{N}^k+\frac{y_1^k}{\mu^k}+\mathcal{Z}^{k+1}+\frac{y_2^k}{\mu^k}$, $\theta_1=\mathcal{K}_h^T\left(\mathcal{V}_1^k+\frac{y_3^k}{\mu^k}\right)$, $\theta_2=\mathcal{K}_v^T\left(\mathcal{V}_2^k+\frac{y_4^k}{\mu^k}\right)$, $\theta_3=\mathcal{K}_z^T\left(\mathcal{V}_3^k+\frac{y_5^k}{\mu^k}\right)$, and $^T$ denotes the transpose. By treating the differences as convolutions along the two spatial directions and one temporal direction, we obtain Equation (30):

$\mathcal{B}^{k+1}=F^{-1}\left(\frac{F\left(\mathcal{L}^k+\theta_1+\theta_2+\theta_3\right)}{2+\sum_{i\in\{h,v,z\}}F\left(\mathcal{K}_i\right)^H F\left(\mathcal{K}_i\right)}\right)$ (30)
  3. We update the value of $\mathcal{T}$ using the following equation:

$\mathcal{T}^{k+1}=\arg\min_{\mathcal{T}}\lambda_s\left\|\mathcal{T}\right\|_1+\frac{\mu^k}{2}\left\|\mathcal{K}-\mathcal{B}^{k+1}-\mathcal{T}-\mathcal{N}^k+\frac{y_1^k}{\mu^k}\right\|_F^2$ (31)

The closed-form solution of (31) can be obtained via the element-wise shrinkage operator, that is:

$\mathcal{T}^{k+1}=\mathrm{Th}_{\lambda_s\left(\mu^k\right)^{-1}}\left(\mathcal{K}-\mathcal{B}^{k+1}-\mathcal{N}^k+\frac{y_1^k}{\mu^k}\right)$ (32)
  4. We update the values of $\mathcal{V}_1$, $\mathcal{V}_2$, and $\mathcal{V}_3$ using the following equations:

$\mathcal{V}_1^{k+1}=\arg\min_{\mathcal{V}_1}\lambda_{tv}\left\|\mathcal{V}_1\right\|_1+\frac{\mu^k}{2}\left\|\mathcal{V}_1-\mathcal{K}_h\mathcal{B}^{k+1}+\frac{y_3^k}{\mu^k}\right\|_F^2$
$\mathcal{V}_2^{k+1}=\arg\min_{\mathcal{V}_2}\lambda_{tv}\left\|\mathcal{V}_2\right\|_1+\frac{\mu^k}{2}\left\|\mathcal{V}_2-\mathcal{K}_v\mathcal{B}^{k+1}+\frac{y_4^k}{\mu^k}\right\|_F^2$
$\mathcal{V}_3^{k+1}=\arg\min_{\mathcal{V}_3}\delta\lambda_{tv}\left\|\mathcal{V}_3\right\|_1+\frac{\mu^k}{2}\left\|\mathcal{V}_3-\mathcal{K}_z\mathcal{B}^{k+1}+\frac{y_5^k}{\mu^k}\right\|_F^2$ (33)

Here, we can also utilize the element-wise shrinkage operator to solve the above problems. The equations are as follows:

$\mathcal{V}_1^{k+1}=\mathrm{Th}_{\lambda_{tv}\left(\mu^k\right)^{-1}}\left(\mathcal{K}_h\mathcal{B}^{k+1}-\frac{y_3^k}{\mu^k}\right)$
$\mathcal{V}_2^{k+1}=\mathrm{Th}_{\lambda_{tv}\left(\mu^k\right)^{-1}}\left(\mathcal{K}_v\mathcal{B}^{k+1}-\frac{y_4^k}{\mu^k}\right)$
$\mathcal{V}_3^{k+1}=\mathrm{Th}_{\delta\lambda_{tv}\left(\mu^k\right)^{-1}}\left(\mathcal{K}_z\mathcal{B}^{k+1}-\frac{y_5^k}{\mu^k}\right)$ (34)
  5. We update the value of $\mathcal{N}$ using the following equation:

$\mathcal{N}^{k+1}=\arg\min_{\mathcal{N}}\lambda_3\left\|\mathcal{N}\right\|_F^2+\frac{\mu^k}{2}\left\|\mathcal{K}-\mathcal{B}^{k+1}-\mathcal{T}^{k+1}-\mathcal{N}+\frac{y_1^k}{\mu^k}\right\|_F^2$ (35)
TABLE II: Sequence descriptions
Sequence | Frames | Image Size | Average SCR | Target Description | Background Description
1 | 120 | $256\times 256$ | 5.45 | Far range, single target | ground
2 | 120 | $256\times 256$ | 5.11 | Near to far, single target | ground
3 | 120 | $256\times 256$ | 6.33 | Near to far, single target | ground
4 | 120 | $256\times 256$ | 6.07 | Far to near, single target | ground
5 | 120 | $256\times 256$ | 5.20 | Far distance, single target | ground-sky boundary
6 | 120 | $256\times 256$ | 1.98 | Near to far, single, faint target | ground

TABLE III: Parameter settings
Methods | Acronyms | Parameter settings
Total Variation Regularization and Principal Component Pursuit [25] | TV-PCP | $\lambda_1=0.005$, $\lambda_2=\frac{1}{\sqrt{\max(M,N)}}$, $\beta=0.025$, $\gamma=1.5$
Partial Sum of the Tensor Nuclear Norm [26] | PSTNN | sliding step: 40, $\lambda=\frac{0.6}{\sqrt{\max(n_1,n_2)}}$, patch size: $30\times 30$
Multiple Subspace Learning and Spatial-temporal Patch-Tensor Model [27] | MSLSTIPT | $L=6$, $P=0.8$, $\lambda=\frac{1}{\sqrt{n_3\times\max(n_1,n_2)}}$, patch size: $30\times 30$
Nonconvex Tensor Fibered Rank Approximation [28] | NTFRA | sliding step: 40, patch size: $40\times 40$, $\lambda=\frac{1}{\sqrt{n_3\times\max(n_1,n_2)}}$, $\beta=0.01$
Non-Convex Tensor Low-Rank Approximation [21] | ASSTV-NTLA | $L=3$, $H=6$, $\lambda_{tv}=0.5$, $\lambda_s=\frac{H}{\sqrt{\max(M,N)\times L}}$, $\lambda_3=100$
Tensor $L_{2,1}$ Norm Minimization via Tensor QR Decomposition (ours) | ASSTV-TLNMTQR | $r=180$, $L=3$, $H=6$, $\lambda_{tv}=0.5$, $\lambda_s=\frac{H}{\sqrt{\max(M,N)\times L}}$, $\lambda_3=100$

    The solution of (35) is as follows:

$$\mathcal{N}^{k+1}=\frac{\mu^k\left(\mathcal{K}-\mathcal{B}^{k+1}-\mathcal{T}^{k+1}\right)+y_1^k}{\mu^k+2\lambda_3} \quad (36)$$
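The closed-form update (36) is just a ridge-regularized residual; a minimal NumPy sketch (function and variable names ours) makes this explicit:

```python
import numpy as np

def update_noise(K, B, T, y1, mu, lambda3):
    """Closed-form minimizer of (35), i.e. Equation (36):
    N = (mu * (K - B - T) + y1) / (mu + 2 * lambda3)."""
    return (mu * (K - B - T) + y1) / (mu + 2.0 * lambda3)

# With lambda3 >> mu (e.g. lambda3 = 100 as in TABLE III), the residual
# K - B - T is shrunk strongly, keeping the estimated noise small.
K = np.ones((4, 4)); B = 0.6 * np.ones((4, 4)); T = 0.1 * np.ones((4, 4))
N = update_noise(K, B, T, y1=np.zeros((4, 4)), mu=0.005, lambda3=100.0)
```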
  6.

    Updating the multipliers $y_1, y_2, y_3, y_4, y_5$ with the other variables fixed:

$$\left\{\begin{aligned}
y_1^{k+1}&=y_1^k+\mu^k\left(\mathcal{K}-\mathcal{B}^{k+1}-\mathcal{T}^{k+1}-\mathcal{N}^{k+1}\right)\\
y_2^{k+1}&=y_2^k+\mu^k\left(\mathcal{Z}^{k+1}-\mathcal{B}^{k+1}\right)\\
y_3^{k+1}&=y_3^k+\mu^k\left(V_1^{k+1}-\mathcal{K}_h\mathcal{B}^{k+1}\right)\\
y_4^{k+1}&=y_4^k+\mu^k\left(V_2^{k+1}-\mathcal{K}_v\mathcal{B}^{k+1}\right)\\
y_5^{k+1}&=y_5^k+\mu^k\left(V_3^{k+1}-\mathcal{K}_z\mathcal{B}^{k+1}\right)
\end{aligned}\right. \quad (37)$$
  7.

    Update $\mu^{k+1}$ by $\mu^{k+1}=\min(\rho\mu^k,\mu_{\max})$.

    Algorithm 2: ASSTV-TLNMTQR algorithm
    Input: infrared image sequence $k_1,k_2,\cdots,k_P\in\mathbb{R}^{n_1\times n_2}$, number of frames $L$, parameters $\lambda_s,\lambda_{tv},\lambda_3,\mu>0$
    1:  Initialize: transform the image sequence into the original tensor $\mathcal{K}$; $\mathcal{B}^0=\mathcal{T}^0=\mathcal{N}^0=\mathcal{V}_i^0=0$, $i=1,2,3$; $y_i^0=0$, $i=1,\dots,5$; $\mu^0=0.005$, $\mu_{\max}=1\mathrm{e}7$, $k=0$, $\rho=1.5$, $\xi=1\mathrm{e}{-6}$
    2:  while not converged do
    3:     Update $\mathcal{Z}^{k+1}$ by (27)
    4:     Update $\mathcal{B}^{k+1}$ by (28)
    5:     Update $\mathcal{T}^{k+1}$ by (31)
    6:     Update $V_1^{k+1},V_2^{k+1},V_3^{k+1}$ by (33)
    7:     Update $\mathcal{N}^{k+1}$ by (35)
    8:     Update multipliers $y_i^{k+1}$, $i=1,2,\cdots,5$ by (37)
    9:     Update $\mu^{k+1}=\min(\rho\mu^k,\mu_{\max})$
    10:    Check the convergence condition $\left\|\mathcal{K}-\mathcal{B}^{k+1}-\mathcal{T}^{k+1}-\mathcal{N}^{k+1}+\frac{y_1^k}{\mu^k}\right\|_F^2 \big/ \left\|\mathcal{K}\right\|_F^2\leq\xi$
    11:    Update $k=k+1$
    12: end while
    Output: $\mathcal{B}^{k+1}$, $\mathcal{T}^{k+1}$, $\mathcal{N}^{k+1}$
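For orientation, the outer loop of Algorithm 2 follows the standard ADMM pattern. The sketch below shows only the bookkeeping (the first multiplier update, penalty growth, and convergence check); the per-variable solvers of Equations (27)-(35) are left as a placeholder comment, and all names are ours:

```python
import numpy as np

def asstv_tlnmtqr_loop(K, rho=1.5, mu=0.005, mu_max=1e7, xi=1e-6, max_iter=500):
    """Control-flow sketch of Algorithm 2 (subproblem solvers stubbed out)."""
    B = np.zeros_like(K); T = np.zeros_like(K); N = np.zeros_like(K)
    y1 = np.zeros_like(K)
    for k in range(max_iter):
        # ... update Z, B, T, V1-V3, N via (27), (28), (31), (33), (35) ...
        y1 = y1 + mu * (K - B - T - N)   # first multiplier update of (37)
        mu = min(rho * mu, mu_max)       # mu^{k+1} = min(rho * mu^k, mu_max)
        rel_err = (np.linalg.norm(K - B - T - N) ** 2
                   / max(np.linalg.norm(K) ** 2, 1e-30))
        if rel_err <= xi:                # convergence check of step 10
            break
    return B, T, N
```

Because the penalty $\mu$ grows geometrically up to $\mu_{\max}$, the equality constraints are enforced ever more strongly as the iterations proceed.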

IV NUMERICAL TESTS AND APPLICATIONS

The main purpose of our algorithm is to decompose the tensor $\mathcal{K}$ into three tensors: the background tensor $\mathcal{B}$, the target tensor $\mathcal{T}$, and the noise tensor $\mathcal{N}$. We explore the effects of different parameter settings and identify a suitable configuration for the most effective extraction of infrared small targets. Finally, we compare our algorithm with state-of-the-art algorithms.

Figure 3: The parameter analysis of r=10
Figure 4: The parameter analysis of r=50
Figure 5: The parameter analysis of r=90
Figure 6: The parameter analysis of r=130
Figure 7: The parameter analysis of r=170
Figure 8: The parameter analysis of r=210
Figure 9: Parameter analysis of $L$ in Sequence 1
Figure 10: Parameter analysis of $L$ in Sequence 2
Figure 11: Parameter analysis of $L$ in Sequence 3
Figure 12: Parameter analysis of $L$ in Sequence 4
Figure 13: Parameter analysis of $L$ in Sequence 5
Figure 14: Parameter analysis of $L$ in Sequence 6

IV-A Infrared small target detection

This section demonstrates the robustness of our algorithm through a series of experiments on infrared small target detection. We employ the 3D receiver operating characteristic (ROC) curve [29] to assess how well our algorithm separates the target tensor $\mathcal{T}$. Additionally, we selected five comparative methods to highlight the superiority of our approach.

IV-A1 Evaluation metrics and baseline methods

In the experiments on small infrared targets, we use the 3D ROC to assess the algorithm's capability. The 3D ROC is built from the 2D ROC [30, 31]. We first plot the (PD, PF) curve, followed by separate curves for (PD, τ) and (PF, τ), and finally combine them to form the 3D ROC.

The horizontal axis of the 2D ROC represents the false alarm rate $F_a$, and the vertical axis represents the detection probability $P_d$. They are defined as follows:

$$P_d=\frac{\text{number of true detections}}{\text{number of actual targets}} \quad (38)$$
$$F_a=\frac{\text{number of false detections}}{\text{number of image pixels}} \quad (39)$$

The above two indicators range between 0 and 1.
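Assuming detections have already been matched against the ground-truth targets, (38) and (39) reduce to plain ratios; a small sketch (function name and counts ours):

```python
def detection_metrics(n_true_det, n_targets, n_false_det, n_pixels):
    """Compute P_d (38) and F_a (39) as ratios in [0, 1]."""
    Pd = n_true_det / n_targets
    Fa = n_false_det / n_pixels
    return Pd, Fa

# e.g. 114 of 120 targets hit, 52 false detections over 120 frames of 256x256
Pd, Fa = detection_metrics(114, 120, 52, 120 * 256 * 256)
```

Sweeping the detection threshold τ and recording (Pd, Fa) at each value yields the ROC curves discussed below.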

Figure 15: Parameter analysis of $H$ in Sequence 1
Figure 16: Parameter analysis of $H$ in Sequence 2
Figure 17: Parameter analysis of $H$ in Sequence 3
Figure 18: Parameter analysis of $H$ in Sequence 4
Figure 19: Parameter analysis of $H$ in Sequence 5
Figure 20: Parameter analysis of $H$ in Sequence 6
TABLE IV: Quantitative comparison of different methods on Sequences 1-6
| Sequence | Metric | ASSTV-TLNMTQR (ours) | ASSTV-NTLA | NTFRA | MSLSTIPT | PSTNN | TV-PCP |
| 1 | AUC(PF,PD) | 0.844 | 0.849 | 0.881 | 0.701 | 0.682 | 0.848 |
|   | AUC(PF,τ) | 0.005 | 0.005 | 0.054 | 0.013 | 0.021 | 0.006 |
|   | Runtime (s) | 4.135 | 5.304 | 25.688 | 0.161 | 3.031 | 0.490 |
| 2 | AUC(PF,PD) | 0.895 | 0.899 | 0.965 | 0.833 | 0.998 | 0.950 |
|   | AUC(PF,τ) | 0.005 | 0.005 | 0.054 | 0.011 | 0.009 | 0.006 |
|   | Runtime (s) | 4.337 | 5.558 | 27.126 | 0.219 | 3.112 | 0.442 |
| 3 | AUC(PF,PD) | 0.895 | 0.899 | 0.960 | 0.686 | 0.946 | 0.899 |
|   | AUC(PF,τ) | 0.005 | 0.005 | 0.068 | 0.008 | 0.015 | 0.006 |
|   | Runtime (s) | 4.476 | 5.806 | 26.409 | 0.251 | 2.948 | 0.369 |
| 4 | AUC(PF,PD) | 0.998 | 0.998 | 0.951 | 0.947 | 0.972 | 0.981 |
|   | AUC(PF,τ) | 0.006 | 0.005 | 0.59 | 0.010 | 0.023 | 0.007 |
|   | Runtime (s) | 4.061 | 5.188 | 21.649 | 0.213 | 2.732 | 0.443 |
| 5 | AUC(PF,PD) | 0.896 | 0.8999 | 0.951 | 0.705 | 0.781 | 0.899 |
|   | AUC(PF,τ) | 0.005 | 0.005 | 0.054 | 0.014 | 0.022 | 0.008 |
|   | Runtime (s) | 4.136 | 5.133 | 24.497 | 0.159 | 2.817 | 0.517 |
| 6 | AUC(PF,PD) | 0.947 | 0.949 | 0.995 | 0.937 | 0.645 | 0.750 |
|   | AUC(PF,τ) | 0.006 | 0.005 | 0.066 | 0.013 | 0.009 | 0.006 |
|   | Runtime (s) | 4.367 | 5.203 | 25.966 | 0.195 | 2.869 | 0.492 |

In all ROC analyses, we assess performance by comparing the area under the curve (AUC). In a 2D ROC, a larger AUC generally signifies better algorithm performance. However, when an algorithm's false alarm rate is high, meaning it detects many irrelevant points, the area under the (PD, PF) curve can be misleadingly large. Hence the 2D ROC alone may not fully reflect performance, which motivates our choice of the 3D ROC. The 3D ROC incorporates the relationships between PD, PF, and the threshold τ. The (PD, τ) curve carries the same implication as the (PD, PF) curve: a larger area signifies better performance. Therefore, of these two, subsequent experiments showcase only the (PD, PF) curve. Meanwhile, the (PF, τ) curve reflects the algorithm's background suppression capability, specifically how many irrelevant points it detects: a smaller area under the (PF, τ) curve implies stronger background suppression and better algorithm performance.
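As an illustration of how the three AUCs can be computed from sampled (PD, PF, τ) triplets, the sketch below applies the trapezoidal rule after sorting each curve by its x-coordinate; all names and the toy detector values are ours:

```python
import numpy as np

def roc_aucs(pd, pf, tau):
    """AUCs of the three 2D projections of a 3D ROC: (PD, PF), (PD, tau),
    and (PF, tau). Each curve is sorted by its x-coordinate, then
    integrated with the trapezoidal rule."""
    def auc(y, x):
        order = np.argsort(np.asarray(x))
        x = np.asarray(x, dtype=float)[order]
        y = np.asarray(y, dtype=float)[order]
        return float(np.sum((x[1:] - x[:-1]) * (y[1:] + y[:-1]) / 2.0))
    return auc(pd, pf), auc(pd, tau), auc(pf, tau)

# A near-perfect detector: PD stays at 1 until the highest threshold,
# while PF drops quickly as the threshold tau rises.
tau = np.linspace(0.0, 1.0, 5)
pd = np.array([1.0, 1.0, 1.0, 1.0, 0.0])
pf = np.array([1.0, 0.5, 0.1, 0.02, 0.0])
auc_pd_pf, auc_pd_tau, auc_pf_tau = roc_aucs(pd, pf, tau)
```

For this toy detector, AUC(PD, PF) and AUC(PD, τ) are large while AUC(PF, τ) is small, matching the interpretation above.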

IV-A2 Parameter setting and datasets

Above, we provided detailed parameter settings for our method and the comparative methods, as shown in TABLE III. We selected six different image sequences from the dataset [32], each comprising a time sequence of 120 frames. For specific details of the images, refer to TABLE II.

IV-B Parameter analysis

IV-B1 The size $r$ of tensor $\mathcal{D}$

In our algorithm, the tensor $\mathcal{D}\in\mathbb{C}^{r\times r\times n_3}$ serves as an approximation of tensor $\mathcal{Z}$ in Equation (27), making its size $r$ the most critical parameter. The choice of $r$ significantly impacts our algorithm's performance in detecting weak infrared targets; refer to Fig. 3 to Fig. 8. We ran our algorithm on Sequence 6. By plotting mesh grids of different $r$ values on the same frame, a clear trend emerges: as $r$ gradually increases, the detection performance for infrared small targets improves. Additionally, the differences between the mesh grids for $r=170$ and $r=210$ are negligible. To pursue faster processing, we have chosen $r=180$ as the parameter for our subsequent experiments.

IV-B2 Number of frames $L$

In a sequence, the number of frames inputted each time determines the $n_3$ dimension of the tensor we construct. The $n_3$ dimension primarily encapsulates temporal information and greatly influences the accuracy of the ASSTV calculation. Hence, we test different frame counts to identify the optimal one. In our algorithm, the frame count is denoted by $L$, which we vary from 2 to 6 in increments of 1. From the experimental results in Fig. 9 to Fig. 14, we observe that in Sequences 1, 3, and 5 the best results are achieved when $L=3$, with minimal differences in the remaining sequences. After examining the false alarm rates of the remaining sequences, we found that $L=3$ yields the lowest. Therefore, for subsequent experiments, we adopt $L=3$.

Figure 21: The 3D ROC of Sequence 1
Figure 22: The 3D ROC of Sequence 2
Figure 23: The 3D ROC of Sequence 3

IV-B3 Tuning parameter $H$

$H$ plays a crucial role in optimizing the ASSTV-TLNMTQR model. We vary $H$ from 2 to 10 in increments of 2. From the results in Fig. 15 to Fig. 20, it is evident that $H$ behaves differently across backgrounds. In Sequences 1 and 5, $H=6$ emerges as the best choice, while maintaining a top-three position in the other sequences as well. Although $H=8$ performs best in Sequences 2 and 3, its false alarm rate there is high, so those results are not meaningful. Hence, we consider $H=6$ the most stable setting and continue using it in subsequent experiments.

IV-C Comparison to state-of-the-art methods

In our comparative experiments, we systematically evaluated the performance of our ASSTV-TLNMTQR algorithm against five other algorithms, all of which were tested on the six sequences outlined in TABLE II. The specific parameter settings for each algorithm can be found in TABLE III.

Upon analyzing the experimental results depicted in Fig. 21 to Fig. 26, we observed that certain algorithms achieve detection rates higher than or nearly comparable to ASSTV-NTLA and ASSTV-TLNMTQR (ours) on specific sequences. However, a closer examination of TABLE IV reveals that these seemingly impressive detection rates are accompanied by high false alarm rates. This underscores the superior performance of ASSTV-NTLA and ASSTV-TLNMTQR (ours), which consistently achieve high detection rates while maintaining lower false alarm rates. Furthermore, TABLE IV highlights an additional advantage of our ASSTV-TLNMTQR algorithm: it runs approximately 25% faster than ASSTV-NTLA across all six sequences. Notably, this speedup is achieved without compromising accuracy, as evidenced by the mere 0.2% difference in average accuracy relative to ASSTV-NTLA.

In summary, our ASSTV-TLNMTQR algorithm is the best overall choice: it is markedly faster than methods of comparable accuracy while ensuring a high detection rate and a low false alarm rate.

Figure 24: The 3D ROC of Sequence 4
Figure 25: The 3D ROC of Sequence 5
Figure 26: The 3D ROC of Sequence 6

V CONCLUSION

This article introduces an innovative tensor recovery algorithm that combines ASSTV regularization with the TLNMTQR method, resulting in a significant improvement in computational speed. We evaluated the algorithm on infrared small target detection, testing its ability to extract the target tensor $\mathcal{T}$ and suppress the background tensor $\mathcal{B}$. Our algorithm greatly enhances computational speed without sacrificing accuracy.

In future research, we see several avenues for optimization. First, when addressing the $L_{2,1}$ norm minimization problem, we could assign different weights to different singular values to enhance background extraction. Second, beyond parameter tuning, we could explore more effective methods to determine the optimal size for approximating tensor $\mathcal{K}$.

References

  • [1] R. R. Nadakuditi, “Optshrink: An algorithm for improved low-rank signal matrix denoising by optimal, data-driven singular value shrinkage,” IEEE Transactions on Information Theory, vol. 60, no. 5, pp. 3002–3018, 2014.
  • [2] Z. Zhang, C. Ding, Z. Gao, and C. Xie, “Anlpt: Self-adaptive and non-local patch-tensor model for infrared small target detection,” Remote Sensing, vol. 15, no. 4, p. 1021, 2023.
  • [3] H. Yi, C. Yang, R. Qie, J. Liao, F. Wu, T. Pu, and Z. Peng, “Spatial-temporal tensor ring norm regularization for infrared small target detection,” IEEE Geoscience and Remote Sensing Letters, vol. 20, pp. 1–5, 2023.
  • [4] L. Chuntong and W. Hao, “Research on infrared dim and small target detection algorithm based on low-rank tensor recovery,” Journal of Systems Engineering and Electronics, 2023.
  • [5] M. Lustig, D. L. Donoho, J. M. Santos, and J. M. Pauly, “Compressed sensing MRI,” IEEE Signal Processing Magazine, vol. 25, no. 2, pp. 72–82, 2008.
  • [6] M. Rani, S. B. Dhok, and R. B. Deshmukh, “A systematic review of compressive sensing: Concepts, implementations and applications,” IEEE Access, vol. 6, pp. 4875–4894, 2018.
  • [7] V. Chandrasekaran, S. Sanghavi, P. A. Parrilo, and A. S. Willsky, “Sparse and low-rank matrix decompositions,” IFAC Proceedings Volumes, vol. 42, no. 10, pp. 1493–1498, 2009, 15th IFAC Symposium on System Identification. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1474667016388632
  • [8] D. Bertsimas, R. Cory-Wright, and N. A. Johnson, “Sparse plus low rank matrix decomposition: A discrete optimization approach,” Journal of Machine Learning Research, vol. 24, no. 267, pp. 1–51, 2023.
  • [9] G. Daniel, G. Meirav, O. Noam, B.-K. Tamar, R. Dvir, O. Ricardo, and B.-E. Noam, “Fast and accurate t2 mapping using bloch simulations and low-rank plus sparse matrix decomposition,” Magnetic Resonance Imaging, vol. 98, pp. 66–75, 2023.
  • [10] Z. Lin, M. Chen, and Y. Ma, “The augmented lagrange multiplier method for exact recovery of corrupted low-rank matrices,” arXiv preprint arXiv:1009.5055, 2010.
  • [11] Y. Chen, Y. Guo, Y. Wang, D. Wang, C. Peng, and G. He, “Denoising of hyperspectral images using nonconvex low rank matrix approximation,” IEEE Transactions on Geoscience and Remote Sensing, vol. 55, no. 9, pp. 5366–5380, 2017.
  • [12] Y. Xie, Y. Qu, D. Tao, W. Wu, Q. Yuan, and W. Zhang, “Hyperspectral image restoration via iteratively regularized weighted schatten pp-norm minimization,” IEEE Transactions on Geoscience and Remote Sensing, vol. 54, no. 8, pp. 4642–4659, 2016.
  • [13] H. Zhang, W. He, L. Zhang, H. Shen, and Q. Yuan, “Hyperspectral image restoration using low-rank matrix recovery,” IEEE Transactions on Geoscience and Remote Sensing, vol. 52, no. 8, pp. 4729–4743, 2013.
  • [14] J. Xue, Y. Zhao, S. Huang, W. Liao, J. C.-W. Chan, and S. G. Kong, “Multilayer sparsity-based tensor decomposition for low-rank tensor completion,” IEEE Transactions on Neural Networks and Learning Systems, vol. 33, no. 11, pp. 6916–6930, 2022.
  • [15] M. Wang, D. Hong, Z. Han, J. Li, J. Yao, L. Gao, B. Zhang, and J. Chanussot, “Tensor decompositions for hyperspectral data processing in remote sensing: A comprehensive review,” IEEE Geoscience and Remote Sensing Magazine, 2023.
  • [16] C. Lu, J. Feng, Y. Chen, W. Liu, Z. Lin, and S. Yan, “Tensor robust principal component analysis with a new tensor nuclear norm,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 42, no. 4, pp. 925–938, 2020.
  • [17] W. He, H. Zhang, and L. Zhang, “Total variation regularized reweighted sparse nonnegative matrix factorization for hyperspectral unmixing,” IEEE Transactions on Geoscience and Remote Sensing, vol. 55, no. 7, pp. 3909–3921, 2017.
  • [18] M.-D. Iordache, J. M. Bioucas-Dias, and A. Plaza, “Total variation spatial regularization for sparse hyperspectral unmixing,” IEEE Transactions on Geoscience and Remote Sensing, vol. 50, no. 11, pp. 4484–4502, 2012.
  • [19] V. Vishnevskiy, T. Gass, G. Szekely, C. Tanner, and O. Goksel, “Isotropic total variation regularization of displacements in parametric image registration,” IEEE Transactions on Medical Imaging, vol. 36, no. 2, pp. 385–395, 2017.
  • [20] W. He, H. Zhang, H. Shen, and L. Zhang, “Hyperspectral image denoising using local low-rank matrix recovery and global spatial–spectral total variation,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 11, no. 3, pp. 713–729, 2018.
  • [21] T. Liu, J. Yang, B. Li, C. Xiao, Y. Sun, Y. Wang, and W. An, “Nonconvex tensor low-rank approximation for infrared small target detection,” IEEE Transactions on Geoscience and Remote Sensing, vol. 60, pp. 1–18, 2022.
  • [22] S. Weiland and F. Van Belzen, “Singular value decompositions and low rank approximations of tensors,” IEEE Transactions on Signal Processing, vol. 58, no. 3, pp. 1171–1182, 2009.
  • [23] Y. Zheng and A.-B. Xu, “Tensor completion via tensor qr decomposition and l2,1-norm minimization,” Signal Processing, vol. 189, p. 108240, 2021. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0165168421002772
  • [24] C. Gao, D. Meng, Y. Yang, Y. Wang, X. Zhou, and A. G. Hauptmann, “Infrared patch-image model for small target detection in a single image,” IEEE Transactions on Image Processing, vol. 22, no. 12, pp. 4996–5009, 2013.
  • [25] X. Wang, Z. Peng, D. Kong, P. Zhang, and Y. He, “Infrared dim target detection based on total variation regularization and principal component pursuit,” Image and Vision Computing, vol. 63, pp. 1–9, 2017. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0262885617300756
  • [26] L. Zhang and Z. Peng, “Infrared small target detection based on partial sum of the tensor nuclear norm,” Remote Sensing, vol. 11, no. 4, p. 382, 2019.
  • [27] Y. Sun, J. Yang, and W. An, “Infrared dim and small target detection via multiple subspace learning and spatial-temporal patch-tensor model,” IEEE Transactions on Geoscience and Remote Sensing, vol. 59, no. 5, pp. 3737–3752, 2020.
  • [28] X. Kong, C. Yang, S. Cao, C. Li, and Z. Peng, “Infrared small target detection via nonconvex tensor fibered rank approximation,” IEEE Transactions on Geoscience and Remote Sensing, vol. 60, pp. 1–21, 2021.
  • [29] C.-I. Chang, “An effective evaluation tool for hyperspectral target detection: 3d receiver operating characteristic curve analysis,” IEEE Transactions on Geoscience and Remote Sensing, vol. 59, no. 6, pp. 5131–5153, 2021.
  • [30] S. Atapattu, C. Tellambura, and H. Jiang, “Analysis of area under the roc curve of energy detection,” IEEE Transactions on Wireless Communications, vol. 9, no. 3, pp. 1216–1225, 2010.
  • [31] S. Manti, M. K. Svendsen, N. R. Knøsgaard, P. M. Lyngby, and K. S. Thygesen, “Exploring and machine learning structural instabilities in 2d materials,” npj Computational Materials, vol. 9, no. 1, p. 33, 2023.
  • [32] B. Hui, Z. Song, H. Fan, P. Zhong, W. Hu, X. Zhang, J. Lin, H. Su, W. Jin, Y. Zhang, and Y. Bai, “A dataset for infrared image dim-small aircraft target detection and tracking under ground / air background,” Oct. 2019.