
Hyperspectral Image Denoising with Partially Orthogonal Matrix Vector Tensor Factorization

Zhen Long, Yipeng Liu, Sixing Zeng, Jiani Liu, Fei Wen, Ce Zhu. This research is supported by the National Natural Science Foundation of China (NSFC, No. 61602091, No. 61571102) and the Sichuan Science and Technology Program (No. 2019YFH0008, No. 2018JY0035). The corresponding author is Yipeng Liu. Z. Long, Y. Liu, S. Zeng, J. Liu and C. Zhu are with the School of Communication and Information Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China. E-mail: yipengliu@uestc.edu.cn. F. Wen is with the Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai 200240, China.
Abstract

Hyperspectral images (HSI) have advantages over natural images in various applications due to the extra spectral information. During acquisition, an HSI is often contaminated by severe noise, including Gaussian noise, impulse noise, dead lines, and stripes, and this degradation of image quality can badly affect downstream applications. In this paper, we present an HSI restoration method named smooth and robust low-rank tensor recovery. Specifically, we propose a structural tensor decomposition in accordance with the linear spectral mixture model of HSI. It decomposes a tensor into a sum of outer matrix-vector products, where the vectors are orthogonal due to the independence of endmember spectra. Based on it, the global low-rank tensor structure can be well exploited for HSI denoising. In addition, 3D anisotropic total variation is used to model the spatial-spectral piecewise smoothness of HSI. Meanwhile, the sparse noise, including impulse noise, dead lines and stripes, is detected by $\ell_1$-norm regularization, and the Frobenius norm is used for the heavy Gaussian noise present in some real-world scenarios. The alternating direction method of multipliers is adopted to solve the proposed optimization model, which simultaneously exploits the global low-rank property and the spatial-spectral smoothness of the HSI. Numerical experiments on both simulated and real data illustrate the superiority of the proposed method in comparison with existing ones.

Index Terms:
hyperspectral image, image denoising, matrix-vector tensor factorization, total variation.

I Introduction

Hyperspectral imaging uses spectrometers to collect data over hundreds of spectral bands ranging from 400 nm to 2500 nm over the same region. It provides both spectral and spatial information about objects due to its numerous and continuous spectral bands. With abundant spectral information available, hyperspectral images (HSI) have become popular in a range of fields, such as remote sensing [1], food safety [2], and object detection and classification [3, 4]. However, due to thermal electronics, dark current, and random error in light counting during image formation, HSI is inevitably affected by severe noise during acquisition, which degrades the image quality. Therefore, noise removal from HSI has become an important research area and has attracted much attention [5, 6, 7, 8, 9].

A number of denoising algorithms have been proposed for HSI, including block matching and 3-dimensional filtering (BM3D) [10], total variation based methods [11, 12, 13], sparse representation based methods [14, 15, 16], Bayesian methods [17, 18, 19, 20], deep learning based methods [21, 22, 23, 24, 25], and low rank based methods [26, 27, 8, 28, 29]. Among them, low rank approximation, as a representative of model-based approaches, shows good performance without training. As shown in Fig. 1, low-rank based denoising methods can be divided into matrix based ones and tensor based ones.

Refer to caption
Figure 1: The schematic diagram of matrix based methods and tensor based methods

The matrix-based denoising unfolds the three-dimensional tensor into a matrix or treats each band independently, and uses the traditional two-dimensional image denoising methods. For example, Zhang et al. [26] divide the HSI into several fragments, rearrange these fragments into a two-dimensional matrix, and restore each fragment using the low rank matrix recovery (LRMR) method. This method is very successful in recovering HSI with mixed noise, leading to a series of denoising models based on LRMR [8, 27, 28].

However, it is difficult for these two-dimensional denoising algorithms to achieve optimal results since the joint spatial-spectral information of HSI is partly damaged. To address this problem, low-rank tensor recovery (LRTR) [30] has been proposed for HSI denoising [29]; it simultaneously makes use of the spectral and spatial information of HSI and obtains better results than the LRMR methods. Following it, other works using different tensor decompositions have been proposed. For instance, LRTDTV [31] uses a Tucker decomposition based low rank tensor recovery method [32, 33] for HSI denoising, while NLR-CPTD [34] and GSLRTD [35] apply the CP decomposition [36] and the t-SVD [37, 38] for HSI denoising, respectively.

Refer to caption
Figure 2: Matrix-vector tensor factorization

However, these three commonly used tensor decompositions cannot explicitly represent the linear spectral mixture model for HSI processing. In the linear spectral mixture model, the spectrum at each pixel in the spatial domain is assumed to be a linear combination of several endmember spectra. Matrix-vector tensor factorization (MVTF) provides a more natural way to fit the linear spectral mixture model of HSI. As shown in Fig. 2, an HSI can be represented by the sum of the cross products of an endmember (vector) and its abundance map (matrix). Besides, MVTF has shown its superiority in HSI related applications, such as HSI unmixing [39] and HSI super-resolution [40]. However, both of these works ignore the independence of endmember spectra, which is an important linear spectral mixture model prior for HSI related applications [41, 42].

In this paper, motivated by the linear spectral mixture model, we propose a structural MVTF with orthogonal vectors and apply it to low rank based HSI denoising. The resulting partially orthogonal matrix-vector tensor factorization can better exploit the global structure of HSI. To further enhance the recovery performance, total variation (TV) [43, 44] is used to model the piecewise-smooth structure of HSI. In addition, Frobenius norm and $\ell_1$ norm terms are used in the optimization model to deal with Gaussian noise and impulse noise, respectively. The alternating direction method of multipliers (ADMM) is used to divide the optimization problem into several sub-problems. Experimental results on HSI recovery show that the proposed method outperforms state-of-the-art ones in terms of mean peak signal to noise ratio (MPSNR), mean structural similarity index (MSSIM), and dimensionless global relative error of synthesis (ERGAS).

The main contributions of this paper can be summarized as follows:

  • 1)

    We develop a new low rank approximation method based on a new structural tensor factorization to model HSI. Considering that endmember spectra are uncorrelated, an orthogonality constraint is introduced on the vectors (endmember spectra) in MVTF. With this constraint, we directly minimize the ranks of the abundance matrices. Compared with methods based on standard MVTF, the proposed one is much faster, and there is no need to set all the ranks of the MVTF in advance.

  • 2)

    A 3D anisotropic total variation (3DATV) term designed for spatial-spectral regularization is incorporated into the low-rank tensor recovery optimization model. It models the local piecewise smoothness of HSI data in the joint spatial-spectral domain for denoising.

  • 3)

    The ADMM is used for solving the optimization model. Numerical experiments on simulated and real data illustrate the superiority of the proposed methods.

The structure of the paper is as follows. Section II gives the notation used in this paper. Section III presents the details of the proposed hyperspectral image denoising method. Section IV reports the experimental results, Section V provides further discussion, and Section VI concludes the paper.

II Notations and Preliminaries

II-A Notations

In this paper, scalars, vectors, matrices and tensors are denoted by lowercase letters $x$, boldface lowercase letters $\mathbf{x}$, boldface capital letters $\mathbf{X}$, and calligraphic letters $\mathcal{X}$, respectively.

II-B Preliminaries on tensor computation

The $\ell_1$ norm and the Frobenius norm of a matrix $\mathbf{X}$ are defined as:

\left\|\mathbf{X}\right\|_{1}=\sum^{I_{1}}_{i_{1}=1}\sum^{I_{2}}_{i_{2}=1}\left|x_{i_{1},i_{2}}\right|, \quad (1)

and

\left\|\mathbf{X}\right\|_{\text{F}}=\Big(\sum^{I_{1}}_{i_{1}=1}\sum^{I_{2}}_{i_{2}=1}\left|x_{i_{1},i_{2}}\right|^{2}\Big)^{1/2}, \quad (2)

respectively, where $x_{i_{1},i_{2}}$ is the $(i_{1},i_{2})$-th entry of $\mathbf{X}$.

The inner product of two tensors $\mathcal{X},\mathcal{Y}\in\mathbb{R}^{I_{1}\times I_{2}\times\cdots\times I_{N}}$ is defined as:

\left\langle\mathcal{X},\mathcal{Y}\right\rangle=\sum^{I_{1}}_{i_{1}=1}\sum^{I_{2}}_{i_{2}=1}\cdots\sum^{I_{N}}_{i_{N}=1}x_{i_{1}\cdots i_{N}}y_{i_{1}\cdots i_{N}}. \quad (3)

The cross product of two tensors $\mathcal{X}\in\mathbb{R}^{I_{1}\times I_{2}\times\cdots\times I_{N}}$ and $\mathcal{Y}\in\mathbb{R}^{J_{1}\times J_{2}\times\cdots\times J_{M}}$ is defined as:

(\mathcal{X}\circ\mathcal{Y})_{i_{1}\cdots i_{N}j_{1}\cdots j_{M}}=x_{i_{1}\cdots i_{N}}y_{j_{1}\cdots j_{M}}. \quad (4)

The mode-$n$ product of an $N$-th order tensor $\mathcal{X}\in\mathbb{R}^{I_{1}\times I_{2}\times\cdots\times I_{N}}$ with a matrix $\mathbf{U}\in\mathbb{R}^{J_{n}\times I_{n}}$ is a new tensor of size $I_{1}\times\cdots\times I_{n-1}\times J_{n}\times I_{n+1}\times\cdots\times I_{N}$, defined as follows:

(\mathcal{X}\times_{n}\mathbf{U})_{i_{1}\cdots i_{n-1}\,j\,i_{n+1}\cdots i_{N}}=\sum_{i_{n}=1}^{I_{n}}x_{i_{1}i_{2}\cdots i_{N}}u_{ji_{n}}. \quad (5)

The mode-$n$ unfolding of an $N$-th order tensor $\mathcal{X}\in\mathbb{R}^{I_{1}\times I_{2}\times\cdots\times I_{N}}$ is expressed as $\mathbf{X}_{(n)}$. The mode-$n$ unfolding operator arranges the $n$-th mode of $\mathcal{X}$ as the rows and the remaining modes as the columns of the unfolding matrix. Mathematically, the elements of $\mathbf{X}_{(n)}$ satisfy

\mathbf{X}_{(n)}(i_{n},j)=\mathcal{X}(i_{1},\cdots,i_{n},\cdots,i_{N}), \quad (6)

where $j=\overline{i_{1},\cdots,i_{n-1},i_{n+1},\cdots,i_{N}}$.
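As an illustration (our own NumPy sketch, not code from the paper; the function names are ours), the mode-$n$ unfolding and mode-$n$ product can be realized as below. The column ordering of the unfolding follows NumPy's C order, which may differ from the overline convention above but is used consistently:

import numpy as np

def unfold(X, n):
    # mode-n unfolding: mode n becomes the rows, the remaining modes are
    # flattened into the columns (NumPy C order)
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1)

def mode_n_product(X, U, n):
    # mode-n product of Eq. (5): multiply the mode-n unfolding by U, then fold back
    Y = U @ unfold(X, n)
    new_shape = (U.shape[0],) + X.shape[:n] + X.shape[n + 1:]
    return np.moveaxis(Y.reshape(new_shape), 0, n)

# quick check: a (4, 5, 6) tensor times a (3, 6) matrix along mode 3 gives (4, 5, 3)
X = np.random.randn(4, 5, 6)
U = np.random.randn(3, 6)
assert mode_n_product(X, U, 2).shape == (4, 5, 3)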

II-C Preliminaries on total variation

For third-order hyperspectral data $\mathcal{X}\in\mathbb{R}^{I\times J\times K}$, the total variation of the HSI is defined as [45, 46]:

\|\mathcal{X}\|_{\text{TV}}=\|\mathfrak{D}(\mathcal{X})\|_{1}=\|\mathfrak{D}_{h}(\mathcal{X})\|_{1}+\|\mathfrak{D}_{v}(\mathcal{X})\|_{1}+\|\mathfrak{D}_{z}(\mathcal{X})\|_{1}, \quad (7)

where $\mathfrak{D}(\cdot)=\left[\mathfrak{D}_{h}(\cdot);\mathfrak{D}_{v}(\cdot);\mathfrak{D}_{z}(\cdot)\right]$ is a three-dimensional difference operator, and $\mathfrak{D}_{h}$, $\mathfrak{D}_{v}$, $\mathfrak{D}_{z}$ are first-order finite difference operators along the three modes of the hyperspectral data, defined as:

\mathfrak{D}_{h}(\mathcal{X})=\mathcal{X}(i+1,j,k)-\mathcal{X}(i,j,k),\ i=1,\cdots,I-1,
\mathfrak{D}_{v}(\mathcal{X})=\mathcal{X}(i,j+1,k)-\mathcal{X}(i,j,k),\ j=1,\cdots,J-1,
\mathfrak{D}_{z}(\mathcal{X})=\mathcal{X}(i,j,k+1)-\mathcal{X}(i,j,k),\ k=1,\cdots,K-1. \quad (8)
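For concreteness, a minimal NumPy sketch (our illustration; names are ours) of the difference operators in (8) and the anisotropic TV value in (7):

import numpy as np

def diff3(X):
    # first-order forward differences along the three modes, Eq. (8)
    Dh = X[1:, :, :] - X[:-1, :, :]   # horizontal spatial mode
    Dv = X[:, 1:, :] - X[:, :-1, :]   # vertical spatial mode
    Dz = X[:, :, 1:] - X[:, :, :-1]   # spectral mode
    return Dh, Dv, Dz

def tv3d(X):
    # anisotropic 3D TV, Eq. (7): sum of the l1 norms of the three difference arrays
    return sum(np.abs(D).sum() for D in diff3(X))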

II-D Matrix-vector tensor factorization

MVTF decomposes a tensor $\mathcal{X}\in\mathbb{R}^{I\times J\times K}$ into the sum of several component tensors, each of which can be written as the cross product of a matrix $\mathbf{G}_{r}\in\mathbb{R}^{I\times J}$ and a vector $\mathbf{c}_{r}\in\mathbb{R}^{K}$ [39]:

\mathcal{X}\approx\sum_{r=1}^{R}\mathbf{G}_{r}\circ\mathbf{c}_{r}=\mathcal{G}\times_{3}\mathbf{C}, \quad (9)

where $\mathcal{G}\in\mathbb{R}^{I\times J\times R}$, $\mathbf{G}_{r}$ is the $r$-th frontal slice of $\mathcal{G}$, and $\mathbf{C}\in\mathbb{R}^{K\times R}$.

In the linear spectral mixture model of a hyperspectral image, $\mathbf{c}_{r}$ can be modeled as the $r$-th endmember spectrum, and the abundance matrix $\mathbf{G}_{r}$ represents the spatial distribution of the $r$-th endmember. Due to the strong spatial correlation of HSI, each abundance matrix is typically low rank.
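As a small illustration (our own sketch with hypothetical sizes), an HSI tensor can be synthesized from an MVTF with an orthonormal endmember matrix, matching Eq. (9):

import numpy as np

I, J, K, R = 64, 64, 100, 5                 # hypothetical sizes: spatial, spectral, endmembers
G = np.random.rand(I, J, R)                 # abundance maps, G[:, :, r] = G_r
C, _ = np.linalg.qr(np.random.randn(K, R))  # endmember matrix with C^T C = I_R

X = np.einsum('ijr,kr->ijk', G, C)          # X = sum_r G_r o c_r = G x_3 C, Eq. (9)
assert np.allclose(C.T @ C, np.eye(R))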

III Optimization model

As mentioned above, an HSI is a third-order tensor with two spatial dimensions and one spectral dimension. During data acquisition, it is inevitably polluted by noise. Considering that the observed image $\mathcal{Y}$ can be polluted by Gaussian noise and outliers, we set up the measurement model as follows:

\mathcal{Y}=\mathcal{X}+\mathcal{S}+\mathcal{N}, \quad (10)

where $\mathcal{Y}$ is the observed image, $\mathcal{X}$ is the clean image, $\mathcal{S}$ is the sparse noise or outliers, and $\mathcal{N}$ is the Gaussian noise.

The purpose of HSI denoising is to estimate the clean hyperspectral image from the observed data contaminated by noise. In this study, we propose a novel smooth and robust low rank partially orthogonal MVTF method for HSI denoising under the assumption that endmember spectra are uncorrelated.

First, we develop a low rank partially orthogonal MVTF model for the clean HSI:

\min_{\mathbf{G}_{r},r=1,\cdots,R,\,\mathbf{C}}\ \sum_{r=1}^{R}\operatorname{rank}(\mathbf{G}_{r}),\quad\text{s. t. }\mathcal{X}=\mathcal{G}\times_{3}\mathbf{C},\ \mathbf{C}^{\operatorname{T}}\mathbf{C}=\mathbf{I}_{R}, \quad (11)

where $\mathcal{X}$ is the clean HSI, $\operatorname{rank}(\mathbf{G}_{r})$ is the rank of the $r$-th abundance matrix $\mathbf{G}_{r}$ with $\mathcal{G}(:,:,r)=\mathbf{G}_{r}$, the objective $\sum_{r=1}^{R}\operatorname{rank}(\mathbf{G}_{r})$ enforces each abundance matrix to be low rank, and the constraint $\mathbf{C}^{\operatorname{T}}\mathbf{C}=\mathbf{I}_{R}$ encodes that the endmember spectra are uncorrelated. Next, the $\ell_1$ norm is used to separate the sparse outliers from the observation. In addition, to enhance the recovery performance, the total variation term is used to exploit the local smoothness structure of the HSI data.

The optimization model for this smooth and robust low rank tensor recovery (SRLRTR) for HSI denoising can be formulated as follows:

\min_{\mathcal{X},\mathcal{S},\mathcal{N},\mathcal{G},\mathbf{C}}\ \lambda_{\text{TV}}\left\|\mathcal{X}\right\|_{\text{TV}}+\lambda_{\mathcal{S}}\left\|\mathcal{S}\right\|_{1}+\lambda_{\mathcal{N}}\left\|\mathcal{N}\right\|_{\text{F}}^{2}+\lambda_{\mathcal{G}}\sum_{r=1}^{R}\operatorname{rank}(\mathbf{G}_{r})
\text{s. t. }\mathcal{Y}=\mathcal{X}+\mathcal{S}+\mathcal{N},\ \mathcal{X}=\mathcal{G}\times_{3}\mathbf{C},\ \mathbf{C}^{\operatorname{T}}\mathbf{C}=\mathbf{I}_{R}. \quad (12)

Since $\operatorname{rank}(\mathbf{G}_{r})$ is nonconvex, we use the nuclear norm $\|\mathbf{G}_{r}\|_{*}$ as its convex surrogate and rewrite (12) as follows:

\min_{\mathcal{X},\mathcal{S},\mathcal{N},\mathcal{G},\mathbf{C}}\ \lambda_{\text{TV}}\left\|\mathfrak{D}(\mathcal{X})\right\|_{1}+\lambda_{\mathcal{S}}\left\|\mathcal{S}\right\|_{1}+\lambda_{\mathcal{N}}\left\|\mathcal{N}\right\|_{\text{F}}^{2}+\lambda_{\mathcal{G}}\sum_{r=1}^{R}\|\mathbf{G}_{r}\|_{*}
\text{s. t. }\mathcal{Y}=\mathcal{X}+\mathcal{S}+\mathcal{N},\ \mathcal{X}=\mathcal{G}\times_{3}\mathbf{C},\ \mathbf{C}^{\operatorname{T}}\mathbf{C}=\mathbf{I}_{R}. \quad (13)

To solve this problem, we introduce two auxiliary variables $\mathcal{Z}$ and $\mathbf{L}$, and rewrite the optimization model (13) in the following equivalent form:

\min_{\mathcal{X},\mathcal{S},\mathcal{N},\mathcal{G},\mathbf{C},\mathcal{Z},\mathbf{L}}\ \lambda_{\text{TV}}\left\|\mathbf{L}\right\|_{1}+\lambda_{\mathcal{S}}\left\|\mathcal{S}\right\|_{1}+\lambda_{\mathcal{N}}\left\|\mathcal{N}\right\|_{\text{F}}^{2}+\lambda_{\mathcal{G}}\sum_{r=1}^{R}\|\mathbf{G}_{r}\|_{*}
\text{s. t. }\mathcal{Y}=\mathcal{X}+\mathcal{S}+\mathcal{N},\ \mathcal{Z}=\mathcal{X},\ \mathbf{L}=\mathfrak{D}(\mathcal{Z}),
\mathcal{X}=\mathcal{G}\times_{3}\mathbf{C},\ \mathbf{C}^{\operatorname{T}}\mathbf{C}=\mathbf{I}_{R}. \quad (14)

The augmented Lagrangian function of the optimization model (14) is:

\mathfrak{L}(\mathcal{X},\mathcal{S},\mathcal{N},\mathcal{G},\mathbf{C},\mathcal{Z},\mathbf{L},\Lambda_{1},\Lambda_{2},\Lambda_{3},\Lambda_{4})=\lambda_{\text{TV}}\left\|\mathbf{L}\right\|_{1}+\lambda_{\mathcal{S}}\left\|\mathcal{S}\right\|_{1}
+\lambda_{\mathcal{N}}\left\|\mathcal{N}\right\|_{\text{F}}^{2}+\lambda_{\mathcal{G}}\sum_{r=1}^{R}\|\mathbf{G}_{r}\|_{*}+\left\langle\Lambda_{1},\mathcal{Y}-\mathcal{X}-\mathcal{S}-\mathcal{N}\right\rangle
+\left\langle\Lambda_{2},\mathcal{Z}-\mathcal{X}\right\rangle+\left\langle\Lambda_{3},\mathbf{L}-\mathfrak{D}(\mathcal{Z})\right\rangle+\left\langle\Lambda_{4},\mathcal{X}-\mathcal{G}\times_{3}\mathbf{C}\right\rangle
+\frac{\beta_{1}}{2}\left\|\mathcal{Y}-\mathcal{X}-\mathcal{S}-\mathcal{N}\right\|_{\text{F}}^{2}+\frac{\beta_{2}}{2}\left\|\mathcal{Z}-\mathcal{X}\right\|_{\text{F}}^{2}
+\frac{\beta_{3}}{2}\left\|\mathbf{L}-\mathfrak{D}(\mathcal{Z})\right\|_{\text{F}}^{2}+\frac{\beta_{4}}{2}\left\|\mathcal{X}-\mathcal{G}\times_{3}\mathbf{C}\right\|_{\text{F}}^{2}, \quad (15)

under the constraint $\mathbf{C}^{\operatorname{T}}\mathbf{C}=\mathbf{I}_{R}$. We therefore iteratively optimize one variable while keeping the others fixed.

III-A Updating $\mathcal{G}$

Fixing the other variables to update $\mathcal{G}$, the optimization model can be rewritten as:

\min_{\mathbf{G}_{r},r=1,\cdots,R}\ \lambda_{\mathcal{G}}\sum_{r=1}^{R}\|\mathbf{G}_{r}\|_{*}+\left\langle\Lambda_{4},\mathcal{X}-\mathcal{G}\times_{3}\mathbf{C}\right\rangle+\frac{\beta_{4}}{2}\left\|\mathcal{X}-\mathcal{G}\times_{3}\mathbf{C}\right\|_{\text{F}}^{2}, \quad (16)

where $\mathcal{G}(:,:,r)=\mathbf{G}_{r}$, $r=1,\cdots,R$. The optimization problem (16) is equivalent to

\min_{\mathbf{G}_{r},r=1,\cdots,R}\ \lambda_{\mathcal{G}}\sum_{r=1}^{R}\|\mathbf{G}_{r}\|_{*}+\frac{\beta_{4}}{2}\sum_{r=1}^{R}\|\mathbf{G}_{r}-\mathcal{M}(:,:,r)\|_{\text{F}}^{2}, \quad (17)

where $\mathcal{M}=(\mathcal{X}+\frac{\Lambda_{4}}{\beta_{4}})\times_{3}\mathbf{C}^{\text{T}}$. The detailed solution for $\mathcal{G}$ is summarized in Algorithm 1, where SVD denotes the singular value decomposition and the soft thresholding operator of $x$ is defined as:

\text{Thr}_{\tau}(x)=\operatorname{sgn}(x)\max(|x|-\tau,0). \quad (18)
Input: $\mathcal{X}\in\mathbb{R}^{I_{1}\times I_{2}\times I_{3}}$; $\Lambda_{4},\beta_{4}$; $\lambda_{\mathcal{G}}$; $R$
Output: $\hat{\mathcal{G}}$
 1. $\tau=\frac{\lambda_{\mathcal{G}}}{\beta_{4}}$
 2. $\mathcal{M}=(\mathcal{X}+\frac{\Lambda_{4}}{\beta_{4}})\times_{3}\mathbf{C}^{\text{T}}$
 3. for $r=1$ to $R$:
   $[\mathbf{U},\bm{\Sigma},\mathbf{V}]=\operatorname{SVD}(\mathcal{M}(:,:,r))$
   $\hat{\bm{\Sigma}}=\text{Thr}_{\tau}(\bm{\Sigma})$
   $\mathbf{G}_{r}=\mathbf{U}\hat{\bm{\Sigma}}\mathbf{V}^{\operatorname{T}}$
   $\hat{\mathcal{G}}(:,:,r)=\mathbf{G}_{r}$
  end for
Algorithm 1: Updating $\mathcal{G}$
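A NumPy sketch of Algorithm 1 (our illustration; the helper names are ours) applies singular value thresholding to each frontal slice of $\mathcal{M}$:

import numpy as np

def soft_threshold(x, tau):
    # Thr_tau(x) = sgn(x) * max(|x| - tau, 0), Eq. (18), applied elementwise
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def update_G(X, Lam4, C, beta4, lam_G):
    # Algorithm 1: singular value thresholding of each frontal slice of
    # M = (X + Lam4 / beta4) x_3 C^T
    tau = lam_G / beta4
    M = np.einsum('ijk,kr->ijr', X + Lam4 / beta4, C)
    G = np.empty_like(M)
    for r in range(M.shape[2]):
        U, s, Vt = np.linalg.svd(M[:, :, r], full_matrices=False)
        G[:, :, r] = (U * soft_threshold(s, tau)) @ Vt   # U diag(Thr(s)) V^T
    return G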

III-B Updating $\mathbf{C}$

Fixing all variables except $\mathbf{C}$ in (14), we obtain the following sub-problem:

\min_{\mathbf{C}}\ \left\langle\Lambda_{4},\mathcal{X}-\mathcal{G}\times_{3}\mathbf{C}\right\rangle+\frac{\beta_{4}}{2}\left\|\mathcal{X}-\mathcal{G}\times_{3}\mathbf{C}\right\|_{\text{F}}^{2},\quad\text{s. t. }\mathbf{C}^{\text{T}}\mathbf{C}=\mathbf{I}_{R}. \quad (19)

Solving problem (19) is equivalent to:

\max_{\mathbf{C}}\ \operatorname{trace}(\mathbf{G}_{(3)}(\Lambda_{4(3)}^{\text{T}}+\beta_{4}\mathbf{X}_{(3)}^{\text{T}})\mathbf{C})\quad\text{s. t. }\mathbf{C}^{\text{T}}\mathbf{C}=\mathbf{I}_{R}. \quad (20)

Letting $\mathbf{M}=\mathbf{G}_{(3)}(\Lambda_{4(3)}^{\text{T}}+\beta_{4}\mathbf{X}_{(3)}^{\text{T}})$ and performing the SVD $\mathbf{M}=\mathbf{U}\bm{\Sigma}\mathbf{V}^{\text{T}}$, the solution for $\mathbf{C}$ is

\mathbf{C}=\mathbf{V}\mathbf{U}^{\text{T}}. \quad (21)
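Problem (20) is an orthogonal Procrustes problem; a sketch of its closed-form solution (21), with the mode-3 unfoldings realized by NumPy reshapes (our own helper, not code from the paper):

import numpy as np

def update_C(G, X, Lam4, beta4):
    # solve (20) via the SVD of M = G_(3) (Lam4_(3)^T + beta4 * X_(3)^T); Eq. (21)
    R, K = G.shape[2], X.shape[2]
    G3 = G.reshape(-1, R).T        # mode-3 unfolding of G, size R x (IJ)
    X3 = X.reshape(-1, K).T        # mode-3 unfolding of X, size K x (IJ)
    L3 = Lam4.reshape(-1, K).T     # mode-3 unfolding of Lambda_4
    M = G3 @ (L3.T + beta4 * X3.T)                  # R x K
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    return Vt.T @ U.T              # C = V U^T, with C^T C = I_R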

III-C Updating $\mathcal{X}$

Fixing the other variables to update $\mathcal{X}$, the optimization model (14) can be rewritten as:

\min_{\mathcal{X}}\ \left\langle\Lambda_{1},\mathcal{Y}-\mathcal{X}-\mathcal{S}-\mathcal{N}\right\rangle+\left\langle\Lambda_{2},\mathcal{Z}-\mathcal{X}\right\rangle+\left\langle\Lambda_{4},\mathcal{X}-\mathcal{G}\times_{3}\mathbf{C}\right\rangle
+\frac{\beta_{1}}{2}\left\|\mathcal{Y}-\mathcal{X}-\mathcal{S}-\mathcal{N}\right\|_{\text{F}}^{2}+\frac{\beta_{2}}{2}\left\|\mathcal{Z}-\mathcal{X}\right\|_{\text{F}}^{2}+\frac{\beta_{4}}{2}\left\|\mathcal{X}-\mathcal{G}\times_{3}\mathbf{C}\right\|_{\text{F}}^{2}. \quad (22)

Setting the derivative of the objective function in (22) with respect to $\mathcal{X}$ to zero yields the closed-form solution:

\mathcal{X}=\frac{\beta_{1}(\mathcal{Y}-\mathcal{S}-\mathcal{N})+\Lambda_{1}+\beta_{2}\mathcal{Z}+\Lambda_{2}+\beta_{4}(\mathcal{G}\times_{3}\mathbf{C})-\Lambda_{4}}{\beta_{1}+\beta_{2}+\beta_{4}}. \quad (23)
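In code, the update (23) is a simple weighted average of the three copies of the image (a sketch, reusing the einsum-based mode-3 product from above):

import numpy as np

def update_X(Y, S, N, Z, G, C, Lam1, Lam2, Lam4, b1, b2, b4):
    # closed-form solution (23) for the X sub-problem
    GC = np.einsum('ijr,kr->ijk', G, C)   # G x_3 C
    return (b1 * (Y - S - N) + Lam1 + b2 * Z + Lam2 + b4 * GC - Lam4) / (b1 + b2 + b4)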

III-D Updating $\mathcal{Z}$

Similarly, the optimization model (14) with respect to $\mathcal{Z}$ can be rewritten as:

\min_{\mathcal{Z}}\ \left\langle\Lambda_{2},\mathcal{Z}-\mathcal{X}\right\rangle+\left\langle\Lambda_{3},\mathbf{L}-\mathfrak{D}(\mathcal{Z})\right\rangle+\frac{\beta_{2}}{2}\left\|\mathcal{Z}-\mathcal{X}\right\|_{\text{F}}^{2}+\frac{\beta_{3}}{2}\left\|\mathbf{L}-\mathfrak{D}(\mathcal{Z})\right\|_{\text{F}}^{2}. \quad (24)

The solution of this optimization problem can be transformed into the solution of the following linear system:

(\beta_{2}\mathbf{I}+\beta_{3}\mathfrak{D}^{*}\mathfrak{D})\mathcal{Z}=\beta_{2}\mathcal{X}-\Lambda_{2}+\mathfrak{D}^{*}(\beta_{3}\mathbf{L}+\Lambda_{3}). \quad (25)

Considering the block circulant structure of the operator $\mathfrak{D}^{*}\mathfrak{D}$, the system can be diagonalized in the Fourier domain and solved quickly. The fast computation of $\mathcal{Z}$ can be written as:

\mathcal{Z}=\operatorname{ifftn}\left(\frac{\operatorname{fftn}(\mathcal{M})}{\beta_{2}\mathbf{1}+\beta_{3}\operatorname{fftn}(\mathfrak{D}^{*}\mathfrak{D})}\right), \quad (26)

where $\mathcal{M}=\beta_{2}\mathcal{X}-\Lambda_{2}+\mathfrak{D}^{*}(\beta_{3}\mathbf{L}+\Lambda_{3})$; $\operatorname{fftn}$ and $\operatorname{ifftn}$ denote the 3D fast Fourier transform and its inverse, respectively.
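A sketch of this Fourier-domain solve in NumPy (our illustration). It assumes periodic boundary differences, so that $\mathfrak{D}^{*}\mathfrak{D}$ is exactly diagonalized by the 3D FFT; under this assumption, $\mathbf{L}$ and $\Lambda_{3}$ are stored as lists of three full-size difference arrays, one per mode, whereas the differences in (8) drop the last sample:

import numpy as np

def adjoint_diff(V, ax):
    # adjoint of the periodic forward difference along axis ax: (D^T v)_i = v_{i-1} - v_i
    return np.roll(V, 1, axis=ax) - V

def update_Z(X, Lam2, L, Lam3, b2, b3):
    # solve the linear system (25) in the Fourier domain
    eig = np.zeros(X.shape)
    for ax, n in enumerate(X.shape):
        # eigenvalues of D^T D for a periodic difference: |1 - e^{-2 pi i f}|^2
        lam = 2.0 - 2.0 * np.cos(2.0 * np.pi * np.fft.fftfreq(n))
        shape = [1, 1, 1]
        shape[ax] = n
        eig = eig + lam.reshape(shape)
    rhs = b2 * X - Lam2
    for ax in range(3):
        rhs = rhs + adjoint_diff(b3 * L[ax] + Lam3[ax], ax)
    return np.real(np.fft.ifftn(np.fft.fftn(rhs) / (b2 + b3 * eig)))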

III-E Updating $\mathbf{L}$

To update $\mathbf{L}$, we fix all the other variables in (14), and the sub-problem can be formulated as follows:

\min_{\mathbf{L}}\ \lambda_{\text{TV}}\left\|\mathbf{L}\right\|_{1}+\left\langle\Lambda_{3},\mathbf{L}-\mathfrak{D}(\mathcal{Z})\right\rangle+\frac{\beta_{3}}{2}\left\|\mathbf{L}-\mathfrak{D}(\mathcal{Z})\right\|_{\text{F}}^{2}. \quad (27)

It can be transformed into:

\min_{\mathbf{L}}\ \lambda_{\text{TV}}\left\|\mathbf{L}\right\|_{1}+\frac{\beta_{3}}{2}\left\|\mathbf{L}-\left(\mathfrak{D}(\mathcal{Z})-\frac{\Lambda_{3}}{\beta_{3}}\right)\right\|_{\text{F}}^{2}. \quad (28)

The solution for $\mathbf{L}$ is obtained via the soft thresholding operator (18):

\mathbf{L}=\text{Thr}_{\frac{\lambda_{\text{TV}}}{\beta_{3}}}\left(\mathfrak{D}(\mathcal{Z})-\frac{\Lambda_{3}}{\beta_{3}}\right). \quad (29)

III-F Updating $\mathcal{S}$

Fixing all the other variables in (14), we update $\mathcal{S}$ via the sub-problem:

\min_{\mathcal{S}}\ \lambda_{\mathcal{S}}\left\|\mathcal{S}\right\|_{1}+\left\langle\Lambda_{1},\mathcal{Y}-\mathcal{X}-\mathcal{S}-\mathcal{N}\right\rangle+\frac{\beta_{1}}{2}\left\|\mathcal{Y}-\mathcal{X}-\mathcal{S}-\mathcal{N}\right\|_{\text{F}}^{2}. \quad (30)

Similarly to the update of $\mathbf{L}$ above, this optimization model is solved by:

\mathcal{S}=\text{Thr}_{\frac{\lambda_{\mathcal{S}}}{\beta_{1}}}\left(\mathcal{Y}-\mathcal{X}-\mathcal{N}+\frac{\Lambda_{1}}{\beta_{1}}\right). \quad (31)

III-G Updating $\mathcal{N}$

Fixing the other variables in (14), the optimization model to update $\mathcal{N}$ can be formulated as follows:

\min_{\mathcal{N}}\ \lambda_{\mathcal{N}}\left\|\mathcal{N}\right\|_{\text{F}}^{2}+\left\langle\Lambda_{1},\mathcal{Y}-\mathcal{X}-\mathcal{S}-\mathcal{N}\right\rangle+\frac{\beta_{1}}{2}\left\|\mathcal{Y}-\mathcal{X}-\mathcal{S}-\mathcal{N}\right\|_{\text{F}}^{2}. \quad (32)

It can be easily solved by:

\mathcal{N}=\frac{\beta_{1}(\mathcal{Y}-\mathcal{X}-\mathcal{S})+\Lambda_{1}}{\beta_{1}+2\lambda_{\mathcal{N}}}. \quad (33)
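The three elementwise updates (29), (31) and (33) can be sketched as follows (our illustration, reusing the soft_threshold helper from the Algorithm 1 sketch; DZ stands for the stacked differences $\mathfrak{D}(\mathcal{Z})$):

def update_L(DZ, Lam3, b3, lam_tv):
    # soft thresholding, Eq. (29)
    return soft_threshold(DZ - Lam3 / b3, lam_tv / b3)

def update_S(Y, X, N, Lam1, b1, lam_S):
    # soft thresholding, Eq. (31)
    return soft_threshold(Y - X - N + Lam1 / b1, lam_S / b1)

def update_N(Y, X, S, Lam1, b1, lam_N):
    # closed form for the Gaussian noise term, Eq. (33)
    return (b1 * (Y - X - S) + Lam1) / (b1 + 2.0 * lam_N)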

III-H Updating $\Lambda_{1},\Lambda_{2},\Lambda_{3},\Lambda_{4}$

The Lagrangian multiplier updating scheme is as follows:

\Lambda_{1}=\Lambda_{1}+\beta_{1}(\mathcal{Y}-\mathcal{X}-\mathcal{S}-\mathcal{N})
\Lambda_{2}=\Lambda_{2}+\beta_{2}(\mathcal{Z}-\mathcal{X})
\Lambda_{3}=\Lambda_{3}+\beta_{3}(\mathbf{L}-\mathfrak{D}(\mathcal{Z}))
\Lambda_{4}=\Lambda_{4}+\beta_{4}(\mathcal{X}-\mathcal{G}\times_{3}\mathbf{C}). \quad (34)

We summarize the algorithm for the SRLRTR with partially orthogonal MVTF for HSI denoising in Algorithm 2.

Input: Observed hyperspectral image $\mathcal{Y}\in\mathbb{R}^{I\times J\times K}$; the stopping threshold $\varepsilon$; $\beta_{1},\beta_{2},\beta_{3},\beta_{4}$; $\lambda_{\mathcal{S}},\lambda_{\mathcal{G}},\lambda_{\mathcal{N}},\lambda_{\text{TV}}$
Output: clean image $\mathcal{X}$
Initialization: $\mathcal{X}=\mathcal{Y}$, $\mathcal{Z}=\mathcal{N}=\mathcal{S}=0$; $\Lambda_{1}=\Lambda_{2}=\Lambda_{3}=\Lambda_{4}=0$
While not converged do
 1. update $\mathcal{G}$ by Algorithm 1
 2. update $\mathbf{C}$ via (21)
 3. update $\mathcal{X}$ via (23)
 4. update $\mathcal{Z}$ via (26)
 5. update $\mathbf{L}$ via (29)
 6. update $\mathcal{S}$ via (31)
 7. update $\mathcal{N}$ via (33)
 8. update the dual variables via (34)
End while
Algorithm 2: SRLRTR with partially orthogonal MVTF

The convergence condition is reached when the relative error between two successive estimates of $\mathcal{X}$ is smaller than a threshold, i.e., $\left\|\mathcal{X}^{(k)}-\mathcal{X}^{(k+1)}\right\|_{\text{F}}^{2}/\left\|\mathcal{X}^{(k+1)}\right\|_{\text{F}}^{2}\leq\varepsilon$, where $\mathcal{X}^{(k)}$ is the recovered image at the $k$-th iteration and $\varepsilon$ is the threshold.
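In code, this stopping rule is one line (a sketch):

import numpy as np

def converged(X_prev, X_curr, eps):
    # relative squared change between successive iterates (the stopping rule above)
    return np.linalg.norm(X_prev - X_curr) ** 2 <= eps * np.linalg.norm(X_curr) ** 2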

III-I Computational complexity analysis

The main complexity of the proposed algorithm comes from updating $\mathcal{G}$. For a hyperspectral image $\mathcal{X}\in\mathbb{R}^{I\times J\times K}$, the most time-consuming part is computing the SVD of $\mathcal{M}(:,:,r)\in\mathbb{R}^{I\times J}$, $r=1,\cdots,R$. Assuming $I=J$, each SVD costs $O(I^{3})$. Therefore, the total complexity is $O(PRI^{3})$, where $P$ is the number of iterations.

IV Experiments

In this section, two simulated datasets and two real datasets are used to verify the effectiveness of the proposed algorithm. The reflectance values of all pixels in the hyperspectral images are normalized to between 0 and 1 before the experiments.

IV-A Datasets Descriptions

1. Washington DC Mall (WDC): The Washington DC Mall dataset (https://engineering.purdue.edu/~biehl/MultiSpec/hyperspectral.html) is provided with the permission of the Spectral Information Technology Application Center of Virginia. The size of each scene is $1208\times 307$. The dataset includes 210 bands in the spectral range from 400 nm to 2400 nm. We use 191 bands after removing the bands where the atmosphere is opaque. A $256\times 256$-pixel region is selected from each image, forming an HSI of size $256\times 256\times 191$. The false color image (R: 29, G: 19, B: 9) of WDC is shown in Fig. 3(a).

2. Pavia University (PaviaU): This scene (https://rslab.ut.ac.ir/data) was acquired by the ROSIS sensor over Pavia, northern Italy. The spatial size of PaviaU is $610\times 340$, and there are 103 channels ranging from 430 nm to 860 nm. A $300\times 300$-pixel region is considered, forming an HSI of size $300\times 300\times 103$. The false color image (R: 29, G: 19, B: 9) of PaviaU is shown in Fig. 3(b).

3. EO-1 Hyperion dataset (EO-1): The EO-1 Hyperion hyperspectral dataset (https://www.usgs.gov/centers/eros/science/) covers an agricultural area of the state of Indiana, USA. It contains $1000\times 1000$ pixels and 242 bands spanning 350 nm to 2600 nm. After removing the water absorption bands, 166 bands are retained, and a $200\times 200$-pixel region is selected from the original scene, forming an HSI of size $200\times 200\times 166$. It is mainly contaminated by Gaussian noise, stripes and dead lines. The false color image (R: 2, G: 132, B: 136) of EO-1 is shown in Fig. 3(c).

4. HYDICE Urban dataset (Urban): This scene was acquired in the Hyperspectral Digital Imagery Collection Experiment (HYDICE) [47]. It contains $307\times 307$ pixels and 210 bands with wavelengths ranging from 400 nm to 2500 nm. It is mainly contaminated by stripes and dead lines. The false color image (R: 2, G: 109, B: 207) of Urban is shown in Fig. 3(d).

Refer to caption
(a) WDC
Refer to caption
(b) PaviaU
Refer to caption
(c) EO-1
Refer to caption
(d) Urban
Figure 3: Testing HSI datasets
TABLE I: Quantitative comparison of different algorithms for PaviaU and WDC denoising.
Case | Dataset | Metric | 3DATVLR | RSLRNTF | LRTDTV | SSTVLRTF | NLRCPTD | SNLRSF | TVLASDA | HyRes | BM4D | SRLRTR
1 | PaviaU | MPSNR (dB) | 27.12 | 19.26 | 29.01 | 20.93 | 22.13 | 22.05 | 26.65 | 20.21 | 21.15 | 29.17
1 | PaviaU | MSSIM | 0.73 | 0.44 | 0.79 | 0.51 | 0.57 | 0.69 | 0.64 | 0.62 | 0.56 | 0.75
1 | PaviaU | ERGAS | 172.50 | 424.75 | 142.19 | 325.18 | 387.03 | 5.1E+3 | 197.91 | 384.57 | 344.21 | 141.38
1 | PaviaU | CPU time (sec) | 217.04 | 723.36 | 179.96 | 729.27 | 1.8E+4 | 794.33 | 391.09 | 3.42 | 549.27 | 245.45
1 | WDC | MPSNR (dB) | 26.07 | 19.20 | 27.98 | 21.44 | 19.30 | 21.43 | 27.188 | 19.404 | 20.005 | 29.642
1 | WDC | MSSIM | 0.664 | 0.59 | 0.78 | 0.61 | 0.56 | 0.68 | 0.79 | 0.62 | 0.49 | 0.85
1 | WDC | ERGAS | 202.10 | 490.13 | 162.25 | 380.48 | 281.5 | 5.5E+3 | 181.64 | 480.19 | 432.21 | 141.34
1 | WDC | CPU time (sec) | 212.72 | 631.34 | 327.80 | 1.34E+3 | 3.98E+4 | 601.76 | 332.35 | 1.98 | 1.11E+3 | 271.30
2 | PaviaU | MPSNR (dB) | 29.12 | 31.29 | 31.12 | 30.70 | 33.04 | 35.76 | 28.87 | 33.78 | 32.05 | 32.63
2 | PaviaU | MSSIM | 0.80 | 0.819 | 0.851 | 0.746 | 0.882 | 0.924 | 0.731 | 0.886 | 0.854 | 0.823
2 | PaviaU | ERGAS | 139.72 | 111.53 | 108.186 | 121.21 | 90.21 | 1.06E+3 | 148.75 | 85.64 | 101.95 | 98.45
2 | PaviaU | CPU time (sec) | 180.88 | 446.2 | 132.57 | 657.66 | 2.7E+4 | 2.1E+3 | 321.665 | 1.649 | 361.01 | 98.08
2 | WDC | MPSNR (dB) | 28.04 | 31.89 | 30.27 | 30.71 | 31.94 | 36.25 | 29.21 | 34.47 | 29.72 | 33.56
2 | WDC | MSSIM | 0.76 | 0.91 | 0.85 | 0.89 | 0.80 | 0.966 | 0.857 | 0.947 | 0.853 | 0.929
2 | WDC | ERGAS | 160.74 | 104.13 | 124.26 | 107.48 | 126.2 | 1.05E+3 | 142.42 | 78.10 | 132.68 | 87.58
2 | WDC | CPU time (sec) | 536.77 | 888.73 | 272.92 | 1.05E+3 | 4.24E+4 | 590.02 | 431.91 | 5.541 | 683.03 | 283.91
3 | PaviaU | MPSNR (dB) | 32.30 | 26.06 | 33.23 | 32.733 | 26.42 | 30.35 | 32.35 | 27.44 | 24.92 | 36.75
3 | PaviaU | MSSIM | 0.89 | 0.709 | 0.901 | 0.825 | 0.77 | 0.875 | 0.894 | 0.80 | 0.537 | 0.936
3 | PaviaU | ERGAS | 99.83 | 207.99 | 87.02 | 109.89 | 181.37 | 1.9E+3 | 103.47 | 185.12 | 253.36 | 63.12
3 | PaviaU | CPU time (sec) | 167.85 | 811.507 | 129.46 | 572.934 | 2.3E+4 | 1.4E+3 | 537.47 | 3.626 | 747.35 | 182.65
3 | WDC | MPSNR (dB) | 32.23 | 26.05 | 32.96 | 33.96 | 26.04 | 29.72 | 32.91 | 26.88 | 23.81 | 37.85
3 | WDC | MSSIM | 0.90 | 0.82 | 0.92 | 0.93 | 0.79 | 0.90 | 0.93 | 0.85 | 0.62 | 0.97
3 | WDC | ERGAS | 97.85 | 208.55 | 91.81 | 85.78 | 185.3 | 2.11E+3 | 99.57 | 196.58 | 279.44 | 58.97
3 | WDC | CPU time (sec) | 339.54 | 894.08 | 260.91 | 1.46E+3 | 3.63E+4 | 598.662 | 430.395 | 4.94 | 772.359 | 287.08
4 | PaviaU | MPSNR (dB) | 32.27 | 25.75 | 33.45 | 31.77 | 26.07 | 30.04 | 32.28 | 26.85 | 25.04 | 36.56
4 | PaviaU | MSSIM | 0.888 | 0.693 | 0.904 | 0.819 | 0.751 | 0.879 | 0.888 | 0.849 | 0.543 | 0.934
4 | PaviaU | ERGAS | 100.18 | 213.79 | 87.93 | 112.32 | 187.81 | 2.10E+3 | 104.43 | 196.75 | 250.74 | 63.34
4 | PaviaU | CPU time (sec) | 209.25 | 717.91 | 182.04 | 762.24 | 3.6E+4 | 790.49 | 386.89 | 4.110 | 526.23 | 211.807
4 | WDC | MPSNR (dB) | 32.089 | 25.83 | 32.76 | 32.99 | 25.54 | 29.62 | 32.76 | 26.85 | 23.75 | 37.181
4 | WDC | MSSIM | 0.897 | 0.80 | 0.91 | 0.926 | 0.76 | 0.902 | 0.936 | 0.849 | 0.617 | 0.971
4 | WDC | ERGAS | 101.104 | 214.84 | 93.68 | 97.85 | 202.7 | 2.14E+3 | 100.31 | 196.73 | 279.65 | 52.55
4 | WDC | CPU time (sec) | 295.158 | 979.73 | 263.30 | 1.17E+3 | 4.31E+4 | 556.23 | 427.82 | 4.11 | 774.98 | 314.45
Refer to caption
(a) Clean Image
Refer to caption
(b) Noisy Image
Refer to caption
(c) LRTDTV
Refer to caption
(d) RSLRNTF
Refer to caption
(e) SNLRSF
Refer to caption
(f) 3DATVLR
Refer to caption
(g) NLRCPTD
Refer to caption
(h) SSTVLRTF
Refer to caption
(i) TVLASDS
Refer to caption
(j) HyRes
Refer to caption
(k) BM4D
Refer to caption
(l) SRLRTR
Figure 4: The recovered 121st spectral segment of the WDC data in case 1.
Refer to caption
(a) Clean Image
Refer to caption
(b) Noisy Image
Refer to caption
(c) LRTDTV
Refer to caption
(d) RSLRNTF
Refer to caption
(e) SNLRSF
Refer to caption
(f) 3DATVLR
Refer to caption
(g) NLRCPTD
Refer to caption
(h) SSTVLRTF
Refer to caption
(i) TVLASDS
Refer to caption
(j) HyRes
Refer to caption
(k) BM4D
Refer to caption
(l) SRLRTR
Figure 5: The recovered 85th spectral segment of the PaviaU data in case 3.

IV-B Competitive methods and measure metrics

To compare with the proposed method, we select nine competing algorithms: eight state-of-the-art HSI denoising methods and one representative wavelet based denoising algorithm (BM4D). The detailed descriptions are as follows:

  • SNLRSF [48]: a novel subspace method based on nonlocal low-rank and sparse constraints for mixed noise reduction of HSI.

  • NLRCPTD  [34]: a method based on nonlocal low-rank regularized CP decomposition with rank determination for HSI denoising.

  • 3DATVLR  [12]: a method based on 3D anisotropic total variation and low rank constraints for mixed-noise removal of HSI.

  • LRTDTV  [31]: a method based on total variation and low rank Tucker decomposition for HSI mixed noise removal.

  • RSLRNTF [49]: a novel method based on matrix-vector tensor factorization with low rank and nonnegative constraints for HSI denoising.

  • SSTVLRTF [50]: a method based on spatial-spectral total variation and low rank t-SVD factorization to remove mixed noise in HSI.

  • TVLASDA [51]: a method based on spectral difference-induced total variation and low rank approximation for HSI denoising.

  • HyRes [52]: a method based on sparse low-rank model, which automatically tunes parameter.

  • BM4D [53]: a wavelet method based on nonlocal self-similarities, which achieves great success in nature image denoising.

The parameters of the comparison algorithms are set according to the values in the corresponding literature.

In addition, to evaluate the performance quantitatively, four metrics are used, including MPSNR, MSSIM, ERGAS and CPU time. The detailed descriptions are as follows:

  • $\text{MPSNR}=\frac{1}{K}\sum_{k=1}^{K}\text{PSNR}_{k}$

  • $\text{MSSIM}=\frac{1}{K}\sum_{k=1}^{K}\text{SSIM}_{k}$

  • $\text{ERGAS}=\sqrt{\frac{1}{K}\sum_{k=1}^{K}\left\|\hat{\mathcal{X}}(:,:,k)-\mathcal{X}(:,:,k)\right\|_{\text{F}}^{2}/\mu_{k}^{2}}$

where $K$ is the number of spectral bands, $\mu_{k}$ is the mean value of $\mathcal{X}(:,:,k)$, and $\hat{\mathcal{X}}(:,:,k)$ is the recovered data. All the experiments are conducted using MATLAB 2019b on a computer with a 2.4 GHz quad-core Intel Core i5 processor and 16 GB RAM.
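For reference, these band-wise metrics can be computed as in the following sketch (ours, not the paper's code; it assumes pixel values in [0, 1] for the PSNR peak, computes SSIM via scikit-image, and implements ERGAS exactly as written above, whereas the conventional ERGAS definition adds a 100(h/l) scale factor):

import numpy as np
from skimage.metrics import structural_similarity

def mpsnr(X_hat, X):
    # mean PSNR over the K bands, assuming pixel values scaled to [0, 1]
    mse = np.mean((X_hat - X) ** 2, axis=(0, 1))
    return np.mean(10.0 * np.log10(1.0 / mse))

def mssim(X_hat, X):
    # mean SSIM over bands
    K = X.shape[2]
    return np.mean([structural_similarity(X_hat[:, :, k], X[:, :, k], data_range=1.0)
                    for k in range(K)])

def ergas(X_hat, X):
    # ERGAS as written above: sqrt of the band-averaged ||error||_F^2 / mu_k^2
    err = np.sum((X_hat - X) ** 2, axis=(0, 1))   # per-band squared Frobenius error
    mu = np.mean(X, axis=(0, 1))                  # per-band mean of the reference
    return np.sqrt(np.mean(err / mu ** 2))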

IV-C Experiments with simulated data

The WDC and PaviaU hyperspectral images are used as simulated datasets in this experiment. To test the proposed method, we add Gaussian noise, salt-and-pepper noise, dead lines and stripe noise to the ground-truth datasets. Four groups of experiments are performed with the following settings:

Case 1: Gaussian noise + impulse noise

In this group of experiments, the noise intensities of different bands are the same. In each band, the same zero-mean Gaussian noise and the same percentage of impulse noise are added: the variance of the Gaussian noise is 0.2, and the percentage of impulse noise is 0.2.

Case 2: Gaussian noise + dead lines

In this group of experiments, Gaussian noise with a noise level of 0.15 is added to each band in the same way. In addition, dead lines are added to bands 41 to 100, with the number of stripes randomly chosen from 3 to 10 and the width randomly chosen from one to three lines.

Case 3: Gaussian noise + impulse noise + dead lines

In this group, the noise intensities of different bands are not the same. Gaussian noise with zero mean and 0.05 variance is added to each band, and the percentage of impulse noise is 0.1. Dead lines are added randomly between bands 41 and 100, with the number of stripes randomly selected from 3 to 10 and the width randomly ranging from one to three lines.

Case 4: Gaussian noise + impulse noise + dead lines + stripes

Based on the third group of experiments, this group additionally adds stripe noise randomly between bands 101 and 190, with the number of stripes ranging from 20 to 40, for the WDC data. For PaviaU, dead lines are added randomly between bands 41 and 60, with the number of stripes randomly selected from 3 to 10 and the width randomly ranging from one to three lines; in addition, stripe noise is added between bands 61 and 100, with the number of stripes ranging from 20 to 40.

Refer to caption
Figure 6: The comparison performance of different methods in terms of PSNR in case 4 for WDC denoising.
Refer to caption
Figure 7: The comparison performance of different methods in terms of PSNR in case 4 for PaviaU denoising.
Refer to caption
(a) Noisy Image
Refer to caption
(b) LRTDTV
Refer to caption
(c) RSLRNTF
Refer to caption
(d) SNLRSF
Refer to caption
(e) 3DATVLR
Refer to caption
(f) NLRCPTD
Refer to caption
(g) SSTVLRTF
Refer to caption
(h) TVLASDS
Refer to caption
(i) HyRes
Refer to caption
(j) BM4D
Refer to caption
(k) SRLRTR
Figure 8: The recovered 68th spectral segment of the EO-1 data.
Refer to caption
(a) Noisy Image
Refer to caption
(b) LRTDTV
Refer to caption
(c) RSLRNTF
Refer to caption
(d) SNLRSF
Refer to caption
(e) 3DATVLR
Refer to caption
(f) NLRCPTD
Refer to caption
(g) SSTVLRTF
Refer to caption
(h) TVLASDS
Refer to caption
(i) HyRes
Refer to caption
(j) BM4D
Refer to caption
(k) SRLRTR
Figure 9: The recovered 96th spectral segment of the EO-1 data.
Refer to caption
(a) Noisy Image
Refer to caption
(b) LRTDTV
Refer to caption
(c) RSLRNTF
Refer to caption
(d) SNLRSF
Refer to caption
(e) 3DATVLR
Refer to caption
(f) NLRCPTD
Refer to caption
(g) SSTVLRTF
Refer to caption
(h) TVLASDS
Refer to caption
(i) HyRes
Refer to caption
(j) BM4D
Refer to caption
(k) SRLRTR
Figure 10: The recovered 132nd spectral segment of the EO-1 data.

For the proposed method, we set the parameters $\lambda_{\text{TV}}=0.0002$, $\lambda_{\mathcal{S}}=0.02$, $\lambda_{\mathcal{N}}=0.1$, $\lambda_{\mathcal{G}}=0.1$ and $R=5$ for WDC and PaviaU according to the highest MPSNR value. Detailed information is given in the discussion section.

Table I shows the results of the experiments on simulated data in terms of MPSNR, MSSIM, ERGAS and CPU time. It can be observed that, in most cases, the proposed algorithm is superior to all the other methods in terms of MPSNR, MSSIM and ERGAS. Specifically, our algorithm shows an advantage in removing mixed Gaussian and impulse noise in case 1, which means it handles Gaussian and impulse noise well. Similarly, in cases 3 and 4, where dead lines and stripe noise are added, the proposed method is also superior to existing state-of-the-art methods in terms of recovery quality. This implies the superiority of the proposed method in processing mixed noise under different conditions. However, in case 2, which contains Gaussian noise and dead lines, SNLRSF, NLRCPTD and HyRes perform better.

To illustrate the recovery results more vividly, Fig. 4 and Fig. 5 present comparative results for the 121st band of the WDC HSI in case 1 and the 85th band of the PaviaU HSI in case 3, respectively. Fig. 4 shows the images recovered by the different methods under mixed Gaussian and impulse noise; the image recovered by our proposed method is clearer than those of the others. In addition, Fig. 5 shows the recovered images under mixed Gaussian noise, impulse noise and dead lines, where the images recovered by BM4D and RSLRNTF still suffer from dead line noise.

To give the details of the results on each spectral band, Fig. 6 and Fig. 7 provide the PSNR of each band in case 4 for the WDC and PaviaU data, respectively. In case 4, different kinds of noise are added in different spectral segments. From Fig. 6, we can see that the PSNR values change smoothly over bands 1-41, where Gaussian and impulse noise are added, while they fluctuate over bands 50-190, where dead lines and stripe noise are added. In addition, for each scene, the recovery performance of the proposed method is better than that of the others with respect to PSNR, indicating its superiority in removing mixed noise. In Fig. 7, the PSNR values of the proposed method are better than those of the others in almost every band, which means it removes noise while better retaining the structural characteristics of the HSI.

In conclusion, the proposed method performs better in removing Gaussian and impulse noise, and it also holds a leading position among the compared methods in processing stripe noise.

IV-D Experiments with real data

We choose the EO-1 and Urban datasets for this experiment. Both scenes are seriously contaminated by Gaussian noise, stripes and dead lines. In this case, we set the parameters $\lambda_{\text{TV}}=0.00001$, $\lambda_{\mathcal{S}}=0.013$, $\lambda_{\mathcal{N}}=0.1$, $\lambda_{\mathcal{G}}=0.1$, $R=2$ for the Urban and EO-1 data.

Fig. 8, Fig. 9 and Fig. 10 show the denoising results for the EO-1 hyperspectral image in bands 68, 96 and 132. As can be observed in Fig. 8, the original data are heavily contaminated by stripe noise. Our proposed algorithm can recover the HSI while preserving its local details and structural information. However, HyRes and BM4D fail to recover the HSI under stripe noise, mainly because they cannot model the sparse noise. Similarly, HyRes and BM4D perform badly in Fig. 9 and Fig. 10. In addition, it can be observed that the proposed method obtains better performance in terms of image resolution.

Fig. 11, Fig. 12 and Fig. 13 show the 109th, 151st and 207th bands of the Urban hyperspectral image after denoising. Fig. 11 shows that the TV based methods, including LRTDTV, 3DATVLR, SSTVLRTF, TVLASDS and our proposed one, achieve good recovery under stripe noise, and the nonlocal based methods such as SNLRSF and NLRCPTD can also recover the image. However, as the noise increases in Fig. 12, some TV based methods, including LRTDTV, 3DATVLR and SSTVLRTF, over-smooth the recovered HSI. Instead, SNLRSF, TVLASDS and our proposed method can successfully recover the clean image under severe noise. In particular, as shown in Fig. 13, where the scene is severely polluted by Gaussian noise, stripe noise and dead lines, only SNLRSF, TVLASDS and our proposed method achieve good recovery.

Refer to caption
(a) Noisy Image
Refer to caption
(b) LRTDTV
Refer to caption
(c) RSLRNTF
Refer to caption
(d) SNLRSF
Refer to caption
(e) 3DATVLR
Refer to caption
(f) NLRCPTD
Refer to caption
(g) SSTVLRTF
Refer to caption
(h) TVLASDS
Refer to caption
(i) HyRes
Refer to caption
(j) BM4D
Refer to caption
(k) SRLRTR
Figure 11: The recovered 109th spectral segment of the Urban data.
Refer to caption
(a) Noisy Image
Refer to caption
(b) LRTDTV
Refer to caption
(c) RSLRNTF
Refer to caption
(d) SNLRSF
Refer to caption
(e) 3DATVLR
Refer to caption
(f) NLRCPTD
Refer to caption
(g) SSTVLRTF
Refer to caption
(h) TVLASDS
Refer to caption
(i) HyRes
Refer to caption
(j) BM4D
Refer to caption
(k) SRLRTR
Figure 12: The recovered 151st spectral segment of the Urban data.
Refer to caption
(a) Noisy Image
Refer to caption
(b) LRTDTV
Refer to caption
(c) RSLRNTF
Refer to caption
(d) SNLRSF
Refer to caption
(e) 3DATVLR
Refer to caption
(f) NLRCPTD
Refer to caption
(g) SSTVLRTF
Refer to caption
(h) TVLASDS
Refer to caption
(i) HyRes
Refer to caption
(j) BM4D
Refer to caption
(k) SRLRTR
Figure 13: The recovered 208th spectral segment of the Urban data.
TABLE II: Comparison of different algorithms on Urban and EO-1 denoising in terms of CPU time.
Dataset | Metric | 3DATVLR | RSLRNTF | LRTDTV | SSTVLRTF | SNLRSF | NLRCPTD | TVLASDS | SRLRTR
Urban | CPU time (sec) | 730.26 | 1.68E+3 | 558.40 | 2.82E+3 | 1.91E+3 | 6.03E+4 | 1.14E+3 | 620.03
EO-1 | CPU time (sec) | 128.38 | 574.46 | 192.97 | 1.38E+3 | 805.01 | 5.78E+4 | 247 | 163.29

In addition, Table II shows the CPU time of our algorithm and the state-of-the-art algorithms on real data. Note that due to the poor performance of BM4D and HyRes in the above experiments, we do not include their CPU time for comparison. As seen in Table II, 3DATVLR is the fastest compared with the others, followed by our proposed method, but its recovery performance is worse than that of SRLRTR. Meanwhile, compared with TVLASDS and SNLRSF, which perform well under severe noise, the proposed method is faster. This implies that the partially orthogonal MVTF method is more efficient and suitable for exploring HSI information.

V Discussion

V-A The difference between partially orthogonal MVTF and traditional MVTF

We introduce a new smooth and robust low rank partially orthogonal MVTF method for HSI denoising, in which the partially orthogonal MVTF is crucial to model the clean HSI. Compared with traditional MVTF, including orthogonal MVTF [54] and nonnegative MVTF [39], the proposed one differs in two aspects.

1. The traditional orthogonal MVTF in [54] is defined as:

\mathcal{X}\approx\sum_{r=1}^{R}\mathbf{G}_{r}\circ\mathbf{c}_{r}=\sum_{r=1}^{R}\mathbf{D}_{r}\times_{1}\mathbf{A}_{r}\times_{2}\mathbf{B}_{r}\times_{3}\mathbf{c}_{r}, \quad (35)

where $\mathbf{G}_{r}=\mathbf{A}_{r}\mathbf{D}_{r}\mathbf{B}_{r}^{\text{T}}$ is the SVD of $\mathbf{G}_{r}$ and $\mathbf{c}_{r}$ is under the unit-norm constraint. In this case, the matrices $\mathbf{A}_{r}$ and $\mathbf{B}_{r}$ are orthogonal. In contrast, the proposed factorization constrains $\mathbf{C}^{\text{T}}\mathbf{C}=\mathbf{I}_{R}$ with $\mathbf{C}(:,r)=\mathbf{c}_{r}$, $r=1,\cdots,R$, resulting in a partially orthogonal MVTF. The partially orthogonal MVTF has a good physical interpretation under the linear spectral mixture model of HSI, where $\mathbf{c}_{r}$ is regarded as the $r$-th endmember spectrum and the endmember spectra are uncorrelated.

2. The nonnegative MVTF in [39] is denoted by:

\mathcal{X}\approx\sum_{r=1}^{R}\mathbf{G}_{r}\circ\mathbf{c}_{r}=\sum_{r=1}^{R}\mathbf{A}_{r}\mathbf{B}_{r}^{\text{T}}\circ\mathbf{c}_{r},\quad\text{s. t. }\mathbf{A}_{r},\mathbf{B}_{r},\mathbf{c}_{r}\geq 0, \quad (36)

where $\mathbf{G}_{r}$ is of rank $L_{r}$, $r=1,\cdots,R$, and can be factorized as $\mathbf{A}_{r}\mathbf{B}_{r}^{\text{T}}$ with a predefined rank $L_{r}$. In HSI, the abundance matrix is low rank.

This low rank model has shown its superiority in HSI unmixing [39], where $\mathbf{c}_{r}$ is considered an endmember and the matrix $\mathbf{G}_{r}$ is the corresponding abundance map. In [49], RSLRNTF uses this model for HSI denoising. In these cases, the ranks $L_{r}$, $r=1,\cdots,R$, need to be defined in advance. In practice, accurately determining the rank of the abundance matrix $\mathbf{G}_{r}$ of the $r$-th endmember is impossible. Therefore, we directly minimize the rank of each abundance matrix under the assumption that the endmember spectra are uncorrelated, resulting in a low rank partially orthogonal MVTF. In addition, compared with RSLRNTF [49], our proposed method shows better recovery performance in the experiments.

V-B The impact of parameters

There are four parameters in the model (14), but we only need to tune three of them because the parameters represent the relative weights of the different terms in the objective function. We therefore fix $\lambda_{\mathcal{G}}$ and analyze the impacts of the parameters $\lambda_{\mathcal{S}}$, $\lambda_{\mathcal{N}}$ and $\lambda_{\text{TV}}$. In addition, the parameter $R$ in our algorithm has a physical meaning: it represents the number of endmembers (also called pure materials). In the following, we take the case 4 experiments on the WDC dataset as an example and discuss how to choose these parameters.

1. The impact of the parameters $\lambda_{\text{TV}}$, $\lambda_{\mathcal{S}}$ and $\lambda_{\mathcal{N}}$: To tune these parameters, we set $\lambda_{\mathcal{G}}=0.1$. The three parameters correspond to the weights of the TV regularizer, the sparse noise (i.e., impulse noise, stripes, and dead lines), and the Gaussian noise, respectively. Fig. 15 shows the MPSNR as a function of $\lambda_{\text{TV}}$ and $\lambda_{\mathcal{S}}$ for the proposed algorithm, with $\lambda_{\mathcal{N}}$ chosen from the set {0.01, 0.018, 0.032, 0.056, 0.1}, $\lambda_{\text{TV}}$ selected from the set {0.0001, 0.00017, 0.00046, 0.0013, 0.0036, 0.01}, and $\lambda_{\mathcal{S}}$ selected from the set {0.001, 0.0028, 0.0077, 0.02, 0.05, 0.1}. It is clear that the smaller $\lambda_{\text{TV}}$ and the larger $\lambda_{\mathcal{N}}$ are, the larger the MPSNR becomes, and the proposed method reaches its peak MPSNR at $\lambda_{\mathcal{S}}=0.02$.

Refer to caption
(a) $\lambda_{\mathcal{N}}=0.01$
Refer to caption
(b) $\lambda_{\mathcal{N}}=0.018$
Refer to caption
(c) $\lambda_{\mathcal{N}}=0.032$
Refer to caption
(d) $\lambda_{\mathcal{N}}=0.056$
Refer to caption
(e) $\lambda_{\mathcal{N}}=0.1$
Figure 15: The impact of the parameters $\lambda_{\text{TV}}$, $\lambda_{\mathcal{S}}$ and $\lambda_{\mathcal{N}}$.

2. The impact of the parameter $R$:

Refer to caption
Figure 16: The impact of the parameter $R$

The parameter $R$ in our algorithm represents the number of endmembers. Fig. 16 shows the MPSNR as a function of $R$, where $R$ varies from 2 to 15 with step size 1. We can see that the MPSNR reaches its peak at $R=5$.

VI Conclusion

In this paper, we develop a new low rank partially orthogonal MVTF model for hyperspectral image denoising. In addition to the low rank tensor term for the global data structure, a 3D tensor total variation term is used to exploit the local data structure, and the mixed noise is removed by different norm constraints in the optimization model. The ADMM is used to efficiently solve the problem. Numerical experiments on HSI denoising show that our algorithm is superior to state-of-the-art algorithms in terms of recovery quality.

References

  • [1] J. R. Jensen and K. Lulla, “Introductory digital image processing: a remote sensing perspective,” 1987.
  • [2] A. Gowen, C. O’Donnell, P. Cullen, G. Downey, and J. Frias, “Hyperspectral imaging–an emerging process analytical tool for food quality and safety control,” Trends in Food Science & Technology, vol. 18, no. 12, pp. 590–598, 2007.
  • [3] Y. Tarabalka, J. Chanussot, and J. A. Benediktsson, “Segmentation and classification of hyperspectral images using watershed transformation,” Pattern Recognition, vol. 43, no. 7, pp. 2367–2379, 2010.
  • [4] D. Manolakis and G. Shaw, “Detection algorithms for hyperspectral imaging applications,” IEEE Signal Processing Magazine, vol. 19, no. 1, pp. 29–43, 2002.
  • [5] Q. Yuan, L. Zhang, and H. Shen, “Hyperspectral image denoising employing a spectral–spatial adaptive total variation model,” IEEE Transactions on Geoscience and Remote Sensing, vol. 50, no. 10, pp. 3660–3677, 2012.
  • [6] Y.-Q. Zhao and J. Yang, “Hyperspectral image denoising via sparse representation and low-rank constraint,” IEEE Transactions on Geoscience and Remote Sensing, vol. 53, no. 1, pp. 296–308, 2014.
  • [7] B. Rasti, J. R. Sveinsson, M. O. Ulfarsson, and J. A. Benediktsson, “Hyperspectral image denoising using 3d wavelets,” in 2012 IEEE International Geoscience and Remote Sensing Symposium, pp. 1349–1352, IEEE, 2012.
  • [8] W. He, H. Zhang, L. Zhang, and H. Shen, “Hyperspectral image denoising via noise-adjusted iterative low-rank matrix approximation,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 8, no. 6, pp. 3050–3061, 2015.
  • [9] Q. Yuan, L. Zhang, and H. Shen, “Hyperspectral image denoising with a spatial–spectral view fusion strategy,” IEEE Transactions on Geoscience and Remote Sensing, vol. 52, no. 5, pp. 2314–2325, 2013.
  • [10] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, “Image denoising with block-matching and 3d filtering,” in Image Processing: Algorithms and Systems, Neural Networks, and Machine Learning, vol. 6064, p. 606414, International Society for Optics and Photonics, 2006.
  • [11] X. Bresson and T. F. Chan, “Fast dual minimization of the vectorial total variation norm and applications to color image processing,” Inverse Problems and Imaging, vol. 2, no. 4, pp. 455–484, 2008.
  • [12] L. Sun, T. Zhan, Z. Wu, and B. Jeon, “A novel 3d anisotropic total variation regularized low rank method for hyperspectral image mixed denoising,” ISPRS International Journal of Geo-Information, vol. 7, no. 10, 2018.
  • [13] H. K. Aggarwal and A. Majumdar, “Hyperspectral image denoising using spatio-spectral total variation,” IEEE Geoscience and Remote Sensing Letters, vol. 13, no. 3, pp. 442–446, 2016.
  • [14] M. Elad and M. Aharon, “Image denoising via sparse and redundant representations over learned dictionaries,” IEEE Transactions on Image processing, vol. 15, no. 12, pp. 3736–3745, 2006.
  • [15] J. Yan, M. Zhu, H. Liu, and Y. Liu, “Visual saliency detection via sparsity pursuit,” IEEE Signal Processing Letters, vol. 17, no. 8, pp. 739–742, 2010.
  • [16] Q. Tian and S. Chen, “Cross-heterogeneous-database age estimation through correlation representation learning,” Neurocomputing, vol. 238, pp. 286–295, 2017.
  • [17] M. Lebrun, A. Buades, and J.-M. Morel, “A nonlocal bayesian image denoising algorithm,” SIAM Journal on Imaging Sciences, vol. 6, no. 3, pp. 1665–1688, 2013.
  • [18] L. Xu, F. Li, A. Wong, and D. A. Clausi, “Hyperspectral image denoising using a spatial–spectral monte carlo sampling approach,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 8, no. 6, pp. 3025–3038, 2015.
  • [19] Y. Zheng, B. Jeon, L. Sun, J. Zhang, and H. Zhang, “Student’s t-hidden markov model for unsupervised learning using localized feature selection,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 28, no. 10, pp. 2586–2598, 2018.
  • [20] B. Gu, X. Sun, and V. S. Sheng, “Structural minimax probability machine,” IEEE Transactions on Neural Networks and Learning Systems, vol. 28, no. 7, pp. 1646–1656, 2017.
  • [21] W. Xie and Y. Li, “Hyperspectral imagery denoising by deep learning with trainable nonlinearity function,” IEEE Geoscience and Remote Sensing Letters, vol. 14, no. 11, pp. 1963–1967, 2017.
  • [22] B. Wang, X. Gu, L. Ma, and S. Yan, “Temperature error correction based on BP neural network in meteorological wireless sensor network,” in International Conference on Cloud Computing and Security, pp. 117–132, Springer, 2016.
  • [23] Z. Qu, J. Keeney, S. Robitzsch, F. Zaman, and X. Wang, “Multilevel pattern mining architecture for automatic network monitoring in heterogeneous wireless communication networks,” China Communications, vol. 13, no. 7, pp. 108–116, 2016.
  • [24] Q. Yuan, Q. Zhang, J. Li, H. Shen, and L. Zhang, “Hyperspectral image denoising employing a spatial-spectral deep residual convolutional neural network,” IEEE Transactions on Geoscience and Remote Sensing, vol. 57, no. 2, pp. 1205–1218, 2019.
  • [25] A. Maffei, J. M. Haut, M. E. Paoletti, J. Plaza, L. Bruzzone, and A. Plaza, “A single model CNN for hyperspectral image denoising,” IEEE Transactions on Geoscience and Remote Sensing, 2019.
  • [26] H. Zhang, W. He, L. Zhang, H. Shen, and Q. Yuan, “Hyperspectral image restoration using low-rank matrix recovery,” IEEE Transactions on Geoscience and Remote Sensing, vol. 52, no. 8, pp. 4729–4743, 2014.
  • [27] R. Zhu, M. Dong, and J.-H. Xue, “Spectral nonlocal restoration of hyperspectral images with low-rank property,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 8, no. 6, pp. 3062–3067, 2015.
  • [28] W. He, Q. Yao, C. Li, N. Yokoya, and Q. Zhao, “Non-local meets global: An integrated paradigm for hyperspectral denoising,” arXiv preprint arXiv:1812.04243, 2018.
  • [29] H. Fan, Y. Chen, Y. Guo, H. Zhang, and G. Kuang, “Hyperspectral image restoration using low-rank tensor recovery,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 10, no. 10, pp. 4589–4604, 2017.
  • [30] H. Huang, Y. Liu, J. Liu, and C. Zhu, “Provable tensor ring completion,” Signal Processing, vol. 171, p. 107486, 2020.
  • [31] Y. Wang, J. Peng, Q. Zhao, Y. Leung, X. Zhao, and D. Meng, “Hyperspectral image restoration via total variation regularized low-rank tensor decomposition,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 11, no. 4, pp. 1227–1243, 2018.
  • [32] Y.-D. Kim and S. Choi, “Nonnegative Tucker decomposition,” in 2007 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8, IEEE, 2007.
  • [33] Z. Long, Y. Liu, L. Chen, and C. Zhu, “Low rank tensor completion for multiway visual data,” Signal Processing, vol. 155, pp. 301–316, 2019.
  • [34] J. Xue, Y. Zhao, W. Liao, and J. C.-W. Chan, “Nonlocal low-rank regularized tensor decomposition for hyperspectral image denoising,” IEEE Transactions on Geoscience and Remote Sensing, 2019.
  • [35] Z. Huang, S. Li, L. Fang, H. Li, and J. A. Benediktsson, “Hyperspectral image denoising with group sparse and low-rank tensor decomposition,” IEEE Access, vol. 6, pp. 1380–1390, 2018.
  • [36] T. Jiang and N. D. Sidiropoulos, “Kruskal’s permutation lemma and the identification of CANDECOMP/PARAFAC and bilinear models with constant modulus constraints,” IEEE Transactions on Signal Processing, vol. 52, no. 9, pp. 2625–2636, 2004.
  • [37] M. E. Kilmer, K. Braman, N. Hao, and R. C. Hoover, “Third-order tensors as operators on matrices: A theoretical and computational framework with applications in imaging,” SIAM Journal on Matrix Analysis and Applications, vol. 34, no. 1, pp. 148–172, 2013.
  • [38] L. Feng, Y. Liu, L. Chen, X. Zhang, and C. Zhu, “Robust block tensor principal component analysis,” Signal Processing, vol. 166, p. 107271, 2020.
  • [39] Y. Qian, F. Xiong, S. Zeng, J. Zhou, and Y. Y. Tang, “Matrix-vector nonnegative tensor factorization for blind unmixing of hyperspectral imagery,” IEEE Transactions on Geoscience and Remote Sensing, vol. 55, no. 3, pp. 1776–1792, 2017.
  • [40] G. Zhang, X. Fu, K. Huang, and J. Wang, “Hyperspectral super-resolution: A coupled nonnegative block-term tensor decomposition approach,” arXiv preprint arXiv:1910.10275, 2019.
  • [41] N. Wang, B. Du, L. Zhang, and L. Zhang, “An abundance characteristic-based independent component analysis for hyperspectral unmixing,” IEEE Transactions on Geoscience and Remote Sensing, vol. 53, no. 1, pp. 416–428, 2015.
  • [42] J. M. Nascimento and J. M. Dias, “Does independent component analysis play a role in unmixing hyperspectral data?” IEEE Transactions on Geoscience and Remote Sensing, vol. 43, no. 1, pp. 175–187, 2005.
  • [43] Y. Liu, T. Liu, J. Liu, and C. Zhu, “Smooth robust tensor principal component analysis for compressed sensing of dynamic MRI,” Pattern Recognition, vol. 102, p. 107252, 2020.
  • [44] Y. Liu, Z. Long, and C. Zhu, “Image completion using low tensor tree rank and total variation minimization,” IEEE Transactions on Multimedia, vol. 21, no. 2, pp. 338–350, 2019.
  • [45] H. Fan, C. Li, Y. Guo, G. Kuang, and J. Ma, “Spatial-spectral total variation regularized low-rank tensor decomposition for hyperspectral image denoising,” IEEE Transactions on Geoscience and Remote Sensing, vol. 56, no. 10, pp. 6196–6213, 2018.
  • [46] Y. Wang, J. Peng, Q. Zhao, Y. Leung, X.-L. Zhao, and D. Meng, “Hyperspectral image restoration via total variation regularized low-rank tensor decomposition,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 11, no. 4, pp. 1227–1243, 2018.
  • [47] P. A. Mitchell, “Hyperspectral digital imagery collection experiment (HYDICE),” in Geographic Information Systems, Photogrammetry, and Geological/Geophysical Remote Sensing, vol. 2587, pp. 70–95, International Society for Optics and Photonics, 1995.
  • [48] C. Cao, J. Yu, C. Zhou, K. Hu, F. Xiao, and X. Gao, “Hyperspectral image denoising via subspace-based nonlocal low-rank and sparse factorization,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2019.
  • [49] F. Xiong, J. Zhou, and Y. Qian, “Hyperspectral imagery denoising via reweighed sparse low-rank nonnegative tensor factorization,” in 2018 25th IEEE International Conference on Image Processing (ICIP), pp. 3219–3223, IEEE, 2018.
  • [50] H. Fan, C. Li, Y. Guo, G. Kuang, and J. Ma, “Spatial-spectral total variation regularized low-rank tensor decomposition for hyperspectral image denoising,” IEEE Transactions on Geoscience and Remote Sensing, vol. 56, no. 10, pp. 6196–6213, 2018.
  • [51] L. Sun, T. Zhan, Z. Wu, L. Xiao, and B. Jeon, “Hyperspectral mixed denoising via spectral difference-induced total variation and low-rank approximation,” Remote Sensing, vol. 10, no. 12, p. 1956, 2018.
  • [52] B. Rasti, M. O. Ulfarsson, and P. Ghamisi, “Automatic hyperspectral image restoration using sparse and low-rank modeling,” IEEE Geoscience and Remote Sensing Letters, vol. 14, no. 12, pp. 2335–2339, 2017.
  • [53] M. Maggioni, V. Katkovnik, K. Egiazarian, and A. Foi, “Nonlocal transform-domain filter for volumetric data denoising and reconstruction,” IEEE Transactions on Image Processing, vol. 22, no. 1, pp. 119–133, 2013.
  • [54] L. De Lathauwer, “Decompositions of a higher-order tensor in block terms—Part II: Definitions and uniqueness,” SIAM Journal on Matrix Analysis and Applications, vol. 30, no. 3, pp. 1033–1066, 2008.