

Block Coordinate Descent Algorithms for Auxiliary-Function-Based Independent Vector Extraction

Rintaro Ikeshita, Tomohiro Nakatani, and Shoko Araki

Rintaro Ikeshita, Tomohiro Nakatani, and Shoko Araki are with NTT Corporation, Kyoto, 619-0237, Japan (e-mail: ikeshita@ieee.org).
Abstract

In this paper, we address the problem of extracting all super-Gaussian source signals from a linear mixture in which (i) the number of super-Gaussian sources $K$ is less than that of sensors $M$, and (ii) there are up to $M-K$ stationary Gaussian noises that do not need to be extracted. To solve this problem, independent vector extraction (IVE) using majorization-minimization (MM) and block coordinate descent (BCD) algorithms has been developed, attaining robust source extraction and low computational cost. We here improve the conventional BCDs for IVE by carefully exploiting the stationarity of the Gaussian noise components. We also newly develop a BCD for a semiblind IVE in which the transfer functions for several super-Gaussian sources are given a priori. Both algorithms consist of a closed-form formula and a generalized eigenvalue decomposition. In a numerical experiment of extracting speech signals from noisy mixtures, we show that when $K=1$ in the blind case, or when at least $K-1$ transfer functions are given in the semiblind case, the convergence of our proposed BCDs is significantly faster than that of the conventional ones.

Index Terms:
Blind source extraction, independent component analysis, independent vector analysis, block coordinate descent method, generalized eigenvalue problem

I Introduction

Blind source separation (BSS) is a problem concerned with estimating the original source signals from a mixture signal captured by multiple sensors. When the number of sources is no greater than that of sensors, i.e., in the (over-)determined case, independent component analysis (ICA [1, 2, 3, 4]) has been a popular approach to BSS because it is simple and effective. When each original source is a vector and a multivariate random variable, independent vector analysis (IVA [5, 6]), also termed joint BSS [7, 8, 9], has been widely studied as an extension of ICA.

In this paper, we will only focus on the problem of extracting all the super-Gaussian sources from a linear mixture signal under the following assumptions and improve the computational efficiency of IVA (ICA can be improved in the same way as IVA, and so we only deal with IVA):

  1. The number of super-Gaussian sources $K$ is known and smaller than that of sensors $M$, i.e., $K<M$.

  2. There can be up to $M-K$ stationary Gaussian noises, and thus the problem remains (over-)determined.

The second assumption, which concerns the rigorous development of efficient algorithms, can be violated to some extent when applied in practice (see the numerical experiments in Section VII). To distinguish it from general BSS, this problem is called blind source extraction (BSE [2]). The BSE problem is often encountered in such applications as speaker source enhancement, biomedical signal processing, and passive radar/sonar. In speaker source enhancement, for instance, the maximum number of speakers (super-Gaussian sources) can often be predetermined as a certain number (e.g., two or three) while an audio device is equipped with more microphones. In real-world applications, the observed signal is contaminated with background noises, and most noise signals are more stationary than speaker signals.

BSE can be solved by simply applying IVA as if there were $M$ super-Gaussian sources and then selecting the top $K$ $(<M)$ super-Gaussian signals from the $M$ separated signals in some way. However, this approach is computationally intensive if $M$ gets too large. To reduce the computing cost, as preprocessing for IVA, the number of channels (or sensors) can be reduced to $K$ by using principal component analysis or by selecting $K$ sensors with high SNRs. These channel reductions, however, often degrade the separation performance due to the presence of the background noises.

BSE methods for efficiently extracting just one or several non-Gaussian signals (not restricted to super-Gaussian signals) have already been studied, mainly in the context of ICA [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]. The natural gradient algorithm and the FastICA method with deflation techniques [16, 17, 18] can sequentially extract non-Gaussian sources one by one. In FastICA, the deflation is based on the orthogonal constraint, under which the sample correlation between every pair of separated signals equals zero. This constraint was also used to develop FastICA with symmetric orthonormalization [17, 18, 19, 20], which can simultaneously extract $K$ non-Gaussian signals.

I-A Independent vector extraction (IVE)

Recently, maximum likelihood approaches have been proposed for BSE in which the background noise components are modeled as stationary Gaussians. These methods include independent vector extraction (IVE [21, 22, 23]) and overdetermined IVA (OverIVA [24, 25, 26, 27]), which will be collectively referred to as IVE in this paper. When the $K$ non-Gaussian sources are all super-Gaussian (as defined in Assumption 2), IVE can use a majorization-minimization (MM [28]) optimization algorithm developed for an auxiliary-function-based ICA (AuxICA [29]) and an auxiliary-function-based IVA (AuxIVA [30, 31, 32]). In this paper, we only deal with an IVE based on the MM algorithm [24, 25, 26, 27, 23] and always assume that the super-Gaussian distributions are defined by Assumption 2 in Section III-A.

In the MM-based IVE, each separation matrix $W\in\mathbb{C}^{M\times M}$ is optimized by iteratively minimizing a surrogate function of the following form:

\begin{align}
g(W) &= \sum_{i=1}^{K}\bm{w}_{i}^{h}V_{i}\bm{w}_{i}+\operatorname{tr}\left(W_{\mathrm{z}}^{h}V_{\mathrm{z}}W_{\mathrm{z}}\right)-\log|\det W|^{2},\\
W &= [\bm{w}_{1},\ldots,\bm{w}_{K},W_{\mathrm{z}}]\in\mathbb{C}^{M\times M},\\
W_{\mathrm{z}} &= [\bm{w}_{K+1},\ldots,\bm{w}_{M}]\in\mathbb{C}^{M\times(M-K)}.
\end{align}

Here, $\bm{w}_{1},\ldots,\bm{w}_{K}\in\mathbb{C}^{M}$ are filters that extract the target-source signals, and $W_{\mathrm{z}}$ is a filter that extracts the $M-K$ noise components. $V_{1},\ldots,V_{K},V_{\mathrm{z}}\in\mathbb{C}^{M\times M}$ are Hermitian positive definite matrices that are updated in each iteration of the MM algorithm. Interestingly, when $K=M$, or when viewing the second term of $g(W)$ as $\operatorname{tr}(W_{\mathrm{z}}^{h}V_{\mathrm{z}}W_{\mathrm{z}})=\sum_{i=K+1}^{M}\bm{w}_{i}^{h}V_{\mathrm{z}}\bm{w}_{i}$, the problem of minimizing $g(W)$ has been discussed in the ICA/IVA literature [33, 34, 35, 36, 37]. Among the algorithms developed so far, block coordinate descent (BCD [38]) methods with simple analytical solutions have attracted much attention in the field of audio source separation because they have been experimentally shown to give stable and fast convergence. A family of these BCD algorithms [29, 30, 31, 32], summarized in Table I, is currently called an iterative projection (IP) method. The IP method has been widely applied not only to ICA and IVA but also to audio source separation methods using more advanced source models (see [39, 40, 41, 42, 43] for details).
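As a concrete illustration, here is a minimal numpy sketch, entirely our own, that evaluates the surrogate $g(W)$ at a single frequency bin; the function name and the convention that the weighted covariances are passed as a list are assumptions of this sketch, not notation from the paper.

```python
import numpy as np

def surrogate(W, Vs, Vz, K):
    """Evaluate g(W) = sum_i w_i^h V_i w_i + tr(Wz^h Vz Wz) - log|det W|^2.

    W  : (M, M) complex separation matrix, columns [w_1, ..., w_K, Wz]
    Vs : length-K list of (M, M) Hermitian PD matrices V_1, ..., V_K
    Vz : (M, M) Hermitian PD matrix V_z
    """
    val = sum(np.real(W[:, i].conj() @ Vs[i] @ W[:, i]) for i in range(K))
    Wz = W[:, K:]
    val += np.real(np.trace(Wz.conj().T @ Vz @ Wz))
    val -= 2.0 * np.log(np.abs(np.linalg.det(W)))
    return val
```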

I-B Conventional BCD (or IP) algorithms for IVE

When we consider directly applying the IP methods developed for AuxIVA to the BSE problem of minimizing $g(W)$ with respect to $W$, AuxIVA-IP1 [30] in Table I, for instance, will cyclically update $\bm{w}_{1}\rightarrow\cdots\rightarrow\bm{w}_{K}\rightarrow\cdots\rightarrow\bm{w}_{M}$ one by one. However, in the BSE problem, the signals of interest are the $K$ non-Gaussian sources, and most of the optimization of $W_{\mathrm{z}}$ should be skipped.

Therefore, in a previous work [24], an algorithm (IVE-OC in Table I) was proposed that cyclically updates $\bm{w}_{1}\rightarrow W_{\mathrm{z}}\rightarrow\cdots\rightarrow\bm{w}_{K}\rightarrow W_{\mathrm{z}}$ one by one with a computationally cheap formula for $W_{\mathrm{z}}$. In that work [24], the updating equation for $W_{\mathrm{z}}$ was derived solely from the (weak) orthogonal constraint (OC [22, 21, 23]), under which the sample correlation between the $K$ separated non-Gaussian sources and the $M-K$ background noises equals zero. (Note that the $K$ non-Gaussian sources are not constrained to be mutually orthogonal in the OC.) Although the algorithm has successfully reduced the computational cost of IVA by nearly a factor of $K/M$, the validity of imposing the heuristic OC remains unclear.

After that, IVE-OC was extended by removing the OC from the model. One such extension is a direct method that obtains a global minimizer of $g(W)$ when $K=1$ [25, 27]. The other is a BCD algorithm for $K\geq 2$ that cyclically updates the pairs $(\bm{w}_{1},W_{\mathrm{z}})\rightarrow\cdots\rightarrow(\bm{w}_{K},W_{\mathrm{z}})$ one by one [26], but its computational cost is not so cheap due to the full update of $W_{\mathrm{z}}$ in each iteration. These algorithms are called IVE-IP2 in this paper (see Table I).

I-C Contributions

In this work, we propose BCD algorithms for IVE, which are summarized in Table I with comparisons to the previous BCDs (IPs). The following are the contributions of this paper.

(i) We speed up the previous IVE-IP2 for $K\geq 2$ by showing that the update of $W_{\mathrm{z}}$ can be omitted in each iteration of BCD without changing the behavior of the algorithm (Section IV-F). This is attained by carefully exploiting the stationary Gaussian assumption for the noise components. In an experiment of speaker source enhancement, we confirmed that the computational cost of the conventional IVE-IP2 is consistently reduced.

(ii) We provide a comprehensive explanation of IVE-IP2 for $K=1$ (Sections IV-D and IV-H). Interestingly, it turns out to be a method that iteratively updates the maximum signal-to-noise-ratio (MaxSNR [44, 45]) beamformer and the power spectrum for the unique ($K=1$) target-source signal. The experimental results show that this algorithm converges much faster than conventional algorithms. Note that IVE-IP2 for $K=1$ was developed independently and simultaneously in our conference paper [27] and by Scheibler–Ono [25].

(iii) We reveal that IVE-OC [24], which was developed with the help of the heuristic orthogonal constraint (OC), can also be obtained as a BCD algorithm for our proposed IVE without the OC (Section IV-G). (This interpretation of IVE-OC as a BCD algorithm was described in our conference paper [27], but not in a mathematically rigorous way. We here provide a rigorous proof of this interpretation.)

(iv) We further extend the proposed IVE-IP2 to the semiblind case where the transfer functions for $L$ ($1\leq L\leq K$) sources of interest are given a priori (Section V). We call the proposed semiblind method Semi-IVE. In Semi-IVE, the $L$ separation filters, e.g., $\bm{w}_{1},\ldots,\bm{w}_{L}$, that correspond to the known transfer functions are optimized by iteratively solving the linearly constrained minimum variance (LCMV [44]) beamforming problem, and the remaining $\bm{w}_{L+1},\ldots,\bm{w}_{K}$ (and $W_{\mathrm{z}}$) are optimized by the fully blind IVE-IP2 algorithm. We experimentally show that when $L\geq K-1$ transfer functions are given, Semi-IVE yields surprisingly fast convergence.

Organization: The rest of this paper is organized as follows. The BSE and semi-BSE problems are defined in Section II. The probabilistic model for the proposed IVE is compared with related methods in Section III. The optimization algorithms are developed separately for the BSE and semi-BSE problems in Sections IV and V. The computational time complexity for these methods is discussed in Section VI. In Sections VII and VIII, experimental results and concluding remarks are described.

I-D Notations

Let $\mathcal{S}_{+}^{d}$ and $\mathcal{S}_{++}^{d}$ denote the sets of all Hermitian positive semidefinite (PSD) and positive definite (PD) matrices of size $d\times d$, respectively. Let $\operatorname{GL}(d)$ denote the set of all (complex) nonsingular matrices of size $d\times d$. Also, let $\bm{0}_{d}\in\mathbb{C}^{d}$ be the zero vector, $O_{i,j}\in\mathbb{C}^{i\times j}$ the zero matrix, $I_{d}\in\mathbb{C}^{d\times d}$ the identity matrix, and $\bm{e}_{k}$ a vector whose $k$-th element equals one and whose other elements are zero. For a vector or matrix $A$, $A^{\top}$ and $A^{h}$ denote the transpose and the conjugate transpose of $A$. The element-wise conjugate is denoted by $A^{\ast}=(A^{\top})^{h}$. For a matrix $A\in\mathbb{C}^{i\times j}$, $\operatorname{Im}A$ is defined as the subspace $\{A\bm{x}\mid\bm{x}\in\mathbb{C}^{j}\}\subset\mathbb{C}^{i}$. For a vector $\bm{v}$, $\|\bm{v}\|=\sqrt{\bm{v}^{h}\bm{v}}$ denotes the Euclidean norm.

TABLE I: Optimization process of BCD algorithms for problem (29)

| Method | Algorithm | Reference | Assumption | Optimization process of BCD |
| --- | --- | --- | --- | --- |
| Conventional (Aux)IVA | IP1 | [30] | — | $\bm{w}_{1}\rightarrow\cdots\rightarrow\bm{w}_{K}\rightarrow\cdots\rightarrow\bm{w}_{M}$ |
| Conventional (Aux)IVA | IP2 | [31] | $K=M=2$ | $W=[\bm{w}_{1},\bm{w}_{2}]$ (direct optimization) |
| Conventional (Aux)IVA | IP2 | [32] | (if $M$ is even) | $(\bm{w}_{1},\bm{w}_{2})\rightarrow\cdots\rightarrow(\bm{w}_{M-1},\bm{w}_{M})$ |
| Conventional IVE | IVE-OC 1) | [24], §III-C | Orthogonal constraint (OC) | $\bm{w}_{1}\rightarrow W_{\mathrm{z}}^{1}\rightarrow\cdots\rightarrow\bm{w}_{K}\rightarrow W_{\mathrm{z}}^{1}$ |
| Conventional IVE | IP2 3) | [26], §IV-E | $K\geq 2$ | $(\bm{w}_{1},W_{\mathrm{z}})\rightarrow\cdots\rightarrow(\bm{w}_{K},W_{\mathrm{z}})$ |
| Proposed IVE | IP1 1) | §IV-G | — | $\bm{w}_{1}\rightarrow(W_{\mathrm{z}},\Omega_{\mathrm{z}})\rightarrow\cdots\rightarrow\bm{w}_{K}\rightarrow(W_{\mathrm{z}},\Omega_{\mathrm{z}})$ |
| Proposed IVE | IP2 2) | §IV-D | $K=1$ | $W=[\bm{w}_{1},(W_{\mathrm{z}})]$ (direct optimization) |
| Proposed IVE | IP2 3) | §IV-F | $K\geq 2$ | $(\bm{w}_{1},(W_{\mathrm{z}}))\rightarrow\cdots\rightarrow(\bm{w}_{K},(W_{\mathrm{z}}))$ |
| Proposed Semi-IVE | IP2 4) | §V | Given $\bm{a}_{1},\ldots,\bm{a}_{L}$ ($L\geq K-1$) | $W=[\bm{w}_{1},\ldots,\bm{w}_{K},(W_{\mathrm{z}})]$ (direct optimization) |
| Proposed Semi-IVE | IP2 4) | §V | Given $\bm{a}_{1},\ldots,\bm{a}_{L}$ ($L\leq K-2$) | $(\bm{w}_{L+1},(W_{\mathrm{z}}))\rightarrow\cdots\rightarrow(\bm{w}_{K},(W_{\mathrm{z}}))$ |

1) The two algorithms are identical, as shown in Section IV-G.
2) Developed independently and simultaneously by Scheibler–Ono [25] and the authors [27] in the Proceedings of ICASSP 2020.
3) The proposed IVE-IP2 for $K\geq 2$ is an acceleration of the conventional IVE-IP2 developed in [26].
4) $\bm{w}_{1},\ldots,\bm{w}_{L}$, which correspond to $\bm{a}_{1},\ldots,\bm{a}_{L}$, are directly and globally optimized as LCMV beamformers (see Section V-B).
2,3,4) In the proposed IVE-IP2 and Semi-IVE, the optimizations of $W_{\mathrm{z}}$ are skipped (see Sections IV-D, IV-F, and V-C).

II Blind and semiblind source extraction

II-A Blind source extraction (BSE) problem

Throughout this paper, we discuss IVA and IVE using the terminology from audio source separation in the short-term Fourier transform (STFT) domain.

Suppose that $K$ super-Gaussian target-source signals and a stationary Gaussian noise signal of dimension $M-K$ are transmitted and observed by $M$ sensors. In this paper, we only consider the case where $1\leq K<M$. The observed signal in the STFT domain is modeled at each frequency bin $f=1,\ldots,F$ and time frame $t=1,\ldots,T$ as

\begin{align}
\bm{x}(f,t) &= A_{\mathrm{s}}(f)\bm{s}(f,t)+A_{\mathrm{z}}(f)\bm{z}(f,t)\in\mathbb{C}^{M}, \tag{1}\\
A_{\mathrm{s}}(f) &= [\,\bm{a}_{1}(f),\ldots,\bm{a}_{K}(f)\,]\in\mathbb{C}^{M\times K}, \tag{2}\\
\bm{s}(f,t) &= [\,s_{1}(f,t),\ldots,s_{K}(f,t)\,]^{\top}\in\mathbb{C}^{K}, \tag{3}\\
A_{\mathrm{z}}(f) &\in\mathbb{C}^{M\times(M-K)},\quad \bm{z}(f,t)\in\mathbb{C}^{M-K}, \tag{4}
\end{align}

where $s_{i}(f,t)\in\mathbb{C}$ and $\bm{z}(f,t)\in\mathbb{C}^{M-K}$ are the STFT coefficients of target source $i\in\{1,\ldots,K\}$ and the noise signal, respectively. $\bm{a}_{i}(f)\in\mathbb{C}^{M}$ is the (time-independent) acoustic transfer function (or steering vector) of source $i$ to the sensors, and $A_{\mathrm{z}}(f)\in\mathbb{C}^{M\times(M-K)}$ is that of the noise. The source signals are assumed to be statistically mutually independent.

In the blind source extraction (BSE) problem, we are given an observed mixture $\bm{x}=\{\bm{x}(f,t)\}_{f,t}$ and the number of target sources $K$. From these inputs, we seek to estimate the spatial images $\bm{x}_{1},\ldots,\bm{x}_{K}$ of the target sources, defined as $\bm{x}_{i}(f,t)\coloneqq\bm{a}_{i}(f)s_{i}(f,t)\in\mathbb{C}^{M}$, $i=1,\ldots,K$.
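To make the data model concrete, the following minimal numpy sketch (ours; all sizes and distributions are purely illustrative) simulates the mixing model (1) at a single frequency bin and forms the spatial image of source 1:

```python
import numpy as np

rng = np.random.default_rng(0)
M, K, T = 4, 2, 1000  # illustrative sizes (hypothetical)

# Mixing model (1) at one frequency bin (the frequency index f is dropped):
A_s = rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))          # a_1, ..., a_K, Eq. (2)
A_z = rng.standard_normal((M, M - K)) + 1j * rng.standard_normal((M, M - K))  # noise mixing, Eq. (4)
s = rng.laplace(size=(K, T)) * np.exp(2j * np.pi * rng.random((K, T)))        # super-Gaussian targets
z = (rng.standard_normal((M - K, T)) + 1j * rng.standard_normal((M - K, T))) / np.sqrt(2)  # CN(0, I)
x = A_s @ s + A_z @ z                # observed mixture, Eq. (1)
x_1 = np.outer(A_s[:, 0], s[0])      # spatial image x_1(f,t) = a_1(f) s_1(f,t)
```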

II-B Semiblind source extraction (Semi-BSE) problem

In the semiblind source extraction (Semi-BSE) problem, in addition to the BSE inputs, we are given $L$ transfer functions $\bm{a}_{1},\ldots,\bm{a}_{L}$ for $L$ super-Gaussian sources, where $1\leq L\leq K$. From these inputs, we estimate all the target-source spatial images $\bm{x}_{1},\ldots,\bm{x}_{K}$. If $L=K$, then Semi-BSE is known as a beamforming problem.

Motivation to address Semi-BSE: In some applications of audio source extraction, such as meeting diarization [46], the locations of some (but not necessarily all) point sources are available or can be estimated accurately, and their acoustic transfer functions can be obtained from, e.g., a sound propagation model [47]. For instance, in a conference situation, some attendees may be sitting in chairs at fixed, known locations, while the locations of moderators, panel speakers, or the audience may change from time to time and cannot be determined in advance. In such a case, by using this partial prior knowledge of transfer functions, Semi-BSE methods can improve the computational efficiency and separation performance of BSE methods. In addition, since there are many effective methods for estimating the transfer function of (at least) a dominant source [48, 49, 50], there is a wide range of applications in which Semi-BSE can improve the performance of BSE.

III Probabilistic models

We start by presenting the probabilistic model for the proposed auxiliary-function-based independent vector extraction (IVE), which is almost the same as those of related methods such as IVA [5], AuxIVA [30, 31, 32], and the conventional IVE [22, 21, 23, 24, 25, 26].

III-A Probabilistic model of proposed IVE

In the mixing model (1), the mixing matrix

$$A(f)=[\,\bm{a}_{1}(f),\ldots,\bm{a}_{K}(f),A_{\mathrm{z}}(f)\,]\in\mathbb{C}^{M\times M} \tag{5}$$

is square and generally invertible, and hence the problem can be converted into one of estimating a separation matrix $W(f)\in\operatorname{GL}(M)$ satisfying $W(f)^{h}A(f)=I_{M}$, or equivalently, satisfying

\begin{align}
s_{i}(f,t) &= \bm{w}_{i}(f)^{h}\bm{x}(f,t)\in\mathbb{C},\quad i=1,\ldots,K, \tag{6}\\
\bm{z}(f,t) &= W_{\mathrm{z}}(f)^{h}\bm{x}(f,t)\in\mathbb{C}^{M-K}, \tag{7}
\end{align}

where we define

\begin{align}
W(f) &= [\,\bm{w}_{1}(f),\ldots,\bm{w}_{K}(f),W_{\mathrm{z}}(f)\,]\in\operatorname{GL}(M), \tag{8}\\
\bm{w}_{i}(f) &\in\mathbb{C}^{M},\quad i=1,\ldots,K, \tag{9}\\
W_{\mathrm{z}}(f) &\in\mathbb{C}^{M\times(M-K)}. \tag{10}
\end{align}

Denote by $\bm{s}_{i}(t)=[\,s_{i}(1,t),\ldots,s_{i}(F,t)\,]^{\top}\in\mathbb{C}^{F}$ the vector of all the frequency components of source $i$ at time frame $t$. The proposed IVE exploits the following three assumptions. Note that Assumption 2 was introduced for developing AuxICA [29] and AuxIVA [30, 31, 32].

Assumption 1 (Independence of sources).

The random variables $\{\bm{s}_{i}(t),\bm{z}(f,t)\}_{i,f,t}$ are mutually independent:

$$p(\{\bm{s}_{i}(t),\bm{z}(f,t)\}_{i,f,t})=\prod_{i,t}p(\bm{s}_{i}(t))\cdot\prod_{f,t}p(\bm{z}(f,t)).$$

Assumption 2 (Super-Gaussian distributions for the target sources [29, 30]).

The target-source signal $\bm{s}_{i}(t)$ follows a circularly symmetric super-Gaussian distribution:

$$-\log p(\bm{s}_{i}(t))=G(\|\bm{s}_{i}(t)\|)+\mathrm{const.}, \tag{11}$$

where $G\colon\mathbb{R}_{\geq 0}\to\mathbb{R}$ is differentiable and such that $\frac{G^{\prime}(r)}{r}$ is nonincreasing on $r\in\mathbb{R}_{>0}$. Here, $G^{\prime}$ is the first derivative of $G$. Candidates for $G$ (or for the probability density function) include the $\log\cosh$ function and the circularly symmetric generalized Gaussian distribution (GGD) with scale parameter $\alpha_{i}\in\mathbb{R}_{>0}$ and shape parameter $0<\beta<2$, which is also known as the exponential power distribution [51]:

$$G(\|\bm{s}_{i}(t)\|)=\left(\frac{\|\bm{s}_{i}(t)\|}{\alpha_{i}}\right)^{\beta}+2F\log\alpha_{i}. \tag{12}$$

The GGD is a parametric family of symmetric distributions; when $\beta=1$, it is nothing but the complex Laplace distribution. Many studies have experimentally shown that ICA-type methods, including IVA and IVE, work effectively for audio source separation tasks when audio signals such as speech are modeled by super-Gaussian distributions (see, e.g., [5, 6, 21, 22, 23, 24, 25, 26, 27, 30, 31, 32]).
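For reference, a minimal sketch (our illustration; the flooring constant is our own safeguard) of the MM weight $\phi_{i}(t)=G^{\prime}(r)/(2r)$ induced by the GGD contrast (12), which appears later in Eq. (33):

```python
import numpy as np

def ggd_weight(S_i, alpha=1.0, beta=1.0):
    """MM weight phi_i(t) for the GGD contrast (12):
    G(r) = (r/alpha)^beta + 2F log(alpha) gives G'(r) = beta r^(beta-1) / alpha^beta,
    hence phi_i(t) = G'(r)/(2r) = beta r^(beta-2) / (2 alpha^beta).

    S_i : (F, T) STFT coefficients of source i, so r_i(t) = ||S_i[:, t]||.
    """
    r = np.linalg.norm(S_i, axis=0)   # r_i(t)
    r = np.maximum(r, 1e-12)          # numerical flooring (our choice)
    return beta * r ** (beta - 2.0) / (2.0 * alpha ** beta)
```

For $\beta=1$ (the complex Laplace case), this reduces to $\phi_{i}(t)=1/(2\alpha_{i}r_{i}(t))$.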

Assumption 3 (Stationary Gaussian distribution for the background noise).

The noise signal $\bm{z}(f,t)\in\mathbb{C}^{M-K}$ follows a circularly symmetric complex Gaussian distribution with zero mean and identity covariance matrix:

\begin{align}
\bm{z}(f,t) &\sim \mathbb{C}\mathcal{N}\left(\bm{0}_{M-K},I_{M-K}\right), \tag{13}\\
p(\bm{z}(f,t)) &= \frac{1}{\pi^{M-K}}\exp\left(-\|\bm{z}(f,t)\|^{2}\right). \tag{14}
\end{align}

Assumption 3 plays a central role in deriving several efficient algorithms for IVE. Despite this assumption, as we experimentally confirm in Section VII, the proposed IVE can extract speech signals even in a diffuse noise environment, where the noise signal is considered super-Gaussian or nonstationary and has an arbitrarily large spatial rank.

With the model defined by (6)–(13), the negative loglikelihood $g_{0}(W)\coloneqq-\frac{1}{T}\log p(\bm{x}\mid W)$ can be computed as

\begin{align}
g_{0}(W) ={}& \frac{1}{T}\sum_{i=1}^{K}\sum_{t=1}^{T}G(\|\bm{s}_{i}(t)\|)+\frac{1}{T}\sum_{f=1}^{F}\sum_{t=1}^{T}\|\bm{z}(f,t)\|^{2}\nonumber\\
&-2\sum_{f=1}^{F}\log|\det W(f)|+\mathrm{const.}, \tag{15}
\end{align}

where $W\coloneqq\{W(f)\}_{f=1}^{F}$ are the variables to be optimized.

Remark 1.

The stationary Gaussian noise model (13) with the identity covariance matrix does not sacrifice generality. At first glance, it might seem better to employ

$$\bm{z}(f,t)\sim\mathbb{C}\mathcal{N}\left(\bm{0},\Omega_{\mathrm{z}}(f)\right) \tag{16}$$

with a general covariance matrix $\Omega_{\mathrm{z}}(f)\in\mathcal{S}_{++}^{M-K}$. However, we can freely change the variables to satisfy (13) using the ambiguity between $A_{\mathrm{z}}$ and $\bm{z}$, given by

$$A_{\mathrm{z}}(f)\bm{z}(f,t) = \left(A_{\mathrm{z}}(f)\Omega_{\mathrm{z}}(f)^{\frac{1}{2}}\right)\left(\Omega_{\mathrm{z}}(f)^{-\frac{1}{2}}\bm{z}(f,t)\right). \tag{17}$$

III-B Relation to IVA and AuxIVA

If we assume that the $M-K$ noise components also independently follow super-Gaussian distributions, then the IVE model coincides with that of IVA [5] or AuxIVA [30, 31, 32]. The following are two advantages of assuming the stationary Gaussian model (13).

(i) As we confirm experimentally in Section VII, when we optimize the IVE model, the separation filters $\bm{w}_{1},\ldots,\bm{w}_{K}$ extract the top $K$ highly super-Gaussian (or nonstationary) signals, such as speech, from the observed mixture, while $W_{\mathrm{z}}$ extracts only the background noises, which are more stationary and approximately follow Gaussian distributions. In IVA, by contrast, which assumes super-Gaussian noise models, the $K$ ($<M$) target-source signals need to be chosen from the $M$ separated signals after optimizing the model.

(ii) As we reveal in Section IV-G, in the proposed IVE it suffices to optimize $\operatorname{Im}W_{\mathrm{z}}=\{W_{\mathrm{z}}\bm{v}\mid\bm{v}\in\mathbb{C}^{M-K}\}\subset\mathbb{C}^{M}$, the subspace spanned by the columns of $W_{\mathrm{z}}\in\mathbb{C}^{M\times(M-K)}$, instead of $W_{\mathrm{z}}$ itself. Because $\operatorname{Im}W_{\mathrm{z}}$ can be optimized very efficiently (Section IV-G), IVE can reduce the computation time of IVA, especially when $K\ll M$.

III-C Relation to IVE with orthogonal constraint

The proposed IVE is inspired by OverIVA with an orthogonal constraint (OC) [24]. This conventional method will be called IVE-OC in this paper.

IVE-OC [24] was proposed as an acceleration of IVA [30] for the case where $K<M$ while maintaining its separation performance. The IVE-OC model is defined as the proposed IVE with the noise model (13) replaced by (16) and with two additional constraints:

\begin{align}
W_{\mathrm{z}}(f) &= \begin{bmatrix}W_{\mathrm{z}}^{1}(f)\\ -I_{M-K}\end{bmatrix},\quad W_{\mathrm{z}}^{1}(f)\in\mathbb{C}^{K\times(M-K)}, \tag{18}\\
&\frac{1}{T}\sum_{t=1}^{T}\bm{s}(f,t)\bm{z}(f,t)^{h}=O_{K,M-K}. \tag{19}
\end{align}

The first constraint (18) can be imposed because there is no need to extract the noise components (see [24, 26] for details). The second constraint (19), called an orthogonal constraint (OC [21]), was introduced to help the model distinguish between the target-source and noise signals.

The OC, which forces the sample correlation between the separated target-source and noise signals to be zero, can equivalently be expressed as

\begin{align}
W_{\mathrm{s}}(f)^{h}V_{\mathrm{z}}(f)W_{\mathrm{z}}(f) &= O_{K,M-K}, \tag{20}\\
V_{\mathrm{z}}(f) &= \frac{1}{T}\sum_{t=1}^{T}\bm{x}(f,t)\bm{x}(f,t)^{h}\in\mathcal{S}_{+}^{M}, \tag{21}\\
W_{\mathrm{s}}(f) &= [\,\bm{w}_{1}(f),\ldots,\bm{w}_{K}(f)\,]\in\mathbb{C}^{M\times K}, \tag{22}
\end{align}

which together with (18) implies

$$W_{\mathrm{z}}^{1}(f)=(W_{\mathrm{s}}(f)^{h}V_{\mathrm{z}}(f)E_{\mathrm{s}})^{-1}(W_{\mathrm{s}}(f)^{h}V_{\mathrm{z}}(f)E_{\mathrm{z}}), \tag{23}$$

where we define

\begin{align}
E_{\mathrm{s}} &\coloneqq [\,\bm{e}_{1},\ldots,\bm{e}_{K}\,]=\begin{bmatrix}I_{K}\\ O_{M-K,K}\end{bmatrix}\in\mathbb{C}^{M\times K}, \tag{24}\\
E_{\mathrm{z}} &\coloneqq [\,\bm{e}_{K+1},\ldots,\bm{e}_{M}\,]=\begin{bmatrix}O_{K,M-K}\\ I_{M-K}\end{bmatrix}\in\mathbb{C}^{M\times(M-K)}. \tag{25}
\end{align}

From (18) and (23), it turns out that $W_{\mathrm{z}}$ is uniquely determined by $W_{\mathrm{s}}$ in the IVE-OC model. Hence, in the paper on IVE-OC [24], an algorithm was proposed in which $W_{\mathrm{z}}$ is updated based on (18) and (23) immediately after updating any of the other variables $\bm{w}_{1},\ldots,\bm{w}_{K}$, so that the OC is always imposed on the model. Although the algorithm was experimentally shown to work well, its validity from a theoretical point of view is unclear because the update rule for $W_{\mathrm{z}}$ is derived solely from the constraints (18)–(19) and does not reflect an objective of the optimization problem for parameter estimation, such as minimizing the negative loglikelihood.
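For concreteness, a minimal numpy sketch (ours; the helper name is hypothetical) of this OC-based noise filter update, Eqs. (18) and (23), at one frequency bin:

```python
import numpy as np

def update_Wz_oc(Ws, Vz):
    """Update Wz from Ws via the OC, Eqs. (18) and (23).

    Ws : (M, K) target filters [w_1, ..., w_K]
    Vz : (M, M) sample covariance of the mixture, Eq. (21)
    """
    M, K = Ws.shape
    Es = np.eye(M, K)            # Eq. (24)
    Ez = np.eye(M, M - K, k=-K)  # Eq. (25)
    WsVz = Ws.conj().T @ Vz      # (K, M)
    Wz1 = np.linalg.solve(WsVz @ Es, WsVz @ Ez)  # Eq. (23)
    return np.vstack([Wz1, -np.eye(M - K)])      # Eq. (18)
```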

In this paper, we develop BCD algorithms for the maximum likelihood estimation of the proposed IVE that do not rely on the OC, and identify one such algorithm (IVE-IP1, developed in Section IV-G) that exactly coincides with the conventional algorithm for IVE-OC. This means that the OC is not essential for developing fast algorithms such as IVE-OC. By removing the OC from the model, we can also provide other, more computationally efficient algorithms for IVE, which is the main contribution of this paper.

IV Algorithms for the BSE problem

We develop iterative algorithms for the maximum likelihood estimation of IVE. The proposed and some conventional algorithms [24, 25, 26, 27] are based on the following two methods.

  • One is the conventional majorization-minimization (MM) algorithm developed for AuxICA [29] and AuxIVA [30]. In MM, instead of dealing with the original objective function $g_{0}$ (Eq. (15)), a surrogate function of $g_{0}$ that is easier to minimize is addressed (Section IV-A).

  • The other is a family of block coordinate descent (BCD) algorithms. In each iteration of a BCD, several columns of $W$ are updated so as to globally minimize the above surrogate function with respect to those variables. In this paper, we propose several BCDs that improve upon the conventional ones.

Our proposed algorithms are summarized in Algorithm 1. The optimization processes of all the BCDs are summarized in Table I and detailed in the following subsections. The computational time complexities of the algorithms are discussed in Section VI.

IV-A Majorization-minimization (MM) approach

We briefly describe how to apply the conventional MM technique developed for AuxICA [29] to the proposed IVE.

In IVE, as in AuxICA/AuxIVA, a surrogate function $g$ of $g_{0}$ is designed with an auxiliary variable $r$ that satisfies

$$g_{0}(W)=\min_{r}g(W,r). \tag{26}$$

Then, the variables $W$ and $r$ are alternately updated by iteratively solving

\begin{align}
r^{(l)} &\in \operatorname*{argmin}_{r} g(W^{(l-1)},r), \tag{27}\\
W^{(l)} &\in \operatorname*{argmin}_{W} g(W,r^{(l)}) \tag{28}
\end{align}

for $l=1,2,\ldots$ until convergence. In the same way as in AuxIVA [30], by applying Proposition 5 in Appendix A to the first term of $g_{0}(W)$, problem (28) in IVE comes down to solving the following $F$ subproblems:

$$W(f)\in\operatorname*{argmin}_{W(f)}~g(W(f),r^{(l)}),\quad f=1,\ldots,F, \tag{29}$$

where $r^{(l)}\coloneqq\{r_{i}^{(l)}(t)\in\mathbb{R}_{\geq 0}\}_{i,t}$ and

\begin{align}
g(W(f),r^{(l)}) ={}& \sum_{i=1}^{K}\bm{w}_{i}(f)^{h}V_{i}(f)\bm{w}_{i}(f)\nonumber\\
&+\operatorname{tr}\left(W_{\mathrm{z}}(f)^{h}V_{\mathrm{z}}(f)W_{\mathrm{z}}(f)\right)-2\log|\det W(f)|, \tag{30}\\
V_{i}(f) ={}& \frac{1}{T}\sum_{t=1}^{T}\phi_{i}(t)\,\bm{x}(f,t)\bm{x}(f,t)^{h}, \tag{31}\\
V_{\mathrm{z}}(f) ={}& \frac{1}{T}\sum_{t=1}^{T}\bm{x}(f,t)\bm{x}(f,t)^{h}, \tag{32}\\
\phi_{i}(t) ={}& \frac{G^{\prime}(r_{i}^{(l)}(t))}{2r_{i}^{(l)}(t)},\quad r_{i}^{(l)}(t)=\|\bm{s}_{i}^{(l)}(t)\|, \tag{33}\\
s_{i}^{(l)}(f,t) ={}& \bm{w}_{i}^{(l-1)}(f)^{h}\bm{x}(f,t). \tag{34}
\end{align}

Here, the computation of (33)–(34) corresponds to the optimization of (27). Recall that $G^{\prime}$ in (33) is the first derivative of $G\colon\mathbb{R}_{\geq 0}\to\mathbb{R}$ (see Assumption 2). To efficiently solve (29), we propose BCD algorithms in the following subsections. From the derivation, it is guaranteed that the objective function is monotonically nonincreasing at each iteration of the MM algorithm.¹

¹ Due to space limitations, the convergence rate and other convergence properties of the proposed algorithms are left as future work.
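To illustrate one outer MM step, the following numpy sketch (ours; the array layout and function name are our own conventions) evaluates Eqs. (31)–(34), i.e., the separated signals, the weights $\phi_{i}(t)$, and the covariance matrices for all frequency bins at once:

```python
import numpy as np

def mm_statistics(X, Ws, G_prime):
    """Evaluate Eqs. (31)-(34) for the current filters.

    X       : (F, M, T) multichannel mixture STFT
    Ws      : (F, M, K) current target filters w_1, ..., w_K per frequency bin
    G_prime : callable returning G'(r) elementwise (see Assumption 2)
    """
    F, M, T = X.shape
    S = np.einsum('fmk,fmt->fkt', Ws.conj(), X)        # s_i(f,t), Eq. (34)
    r = np.linalg.norm(S, axis=0)                      # r_i(t) = ||s_i(t)||, shape (K, T)
    phi = G_prime(r) / (2.0 * np.maximum(r, 1e-12))    # phi_i(t), Eq. (33)
    V = np.einsum('kt,fmt,fnt->fkmn', phi, X, X.conj()) / T  # V_i(f), Eq. (31)
    Vz = np.einsum('fmt,fnt->fmn', X, X.conj()) / T          # V_z(f), Eq. (32)
    return V, Vz
```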

IV-B Block coordinate descent (BCD) algorithms and the stationary condition for problem (29)

No algorithm is known that obtains a globally optimal solution of problem (29) for general $K,M\in\mathbb{N}$. Thus, iterative algorithms have been proposed to find a locally optimal solution. Among them, a family of BCD algorithms has attracted much attention (e.g., [29, 30, 31, 32, 24, 26, 25, 27, 39, 40, 41, 42]) because they have been experimentally shown to work faster and more robustly than other algorithms such as the natural gradient method [52]. This family of BCD algorithms (specialized to solve (29)) is currently called an iterative projection (IP) method.

As we will see in the following subsections, all the IP algorithms summarized in Table I can be derived by exploiting the stationary condition, which is also called the first-order necessary optimality condition [53]. To simplify the notation, when we discuss (29), we omit the frequency bin index $f$; for instance, $W(f)$ and $V_{i}(f)$ are simply denoted as $W$ and $V_{i}$ (without confusion).

Lemma 1 (See, e.g., [33, 34, 35]).

The stationary condition for problem (29) is expressed as ($i=1,\ldots,K$)

\begin{align}
\frac{\partial g}{\partial\bm{w}_{i}^{\ast}} = \bm{0}_{M} \quad&\Longleftrightarrow\quad W^{h}V_{i}\bm{w}_{i} = \bm{e}_{i}, \tag{35}\\
\frac{\partial g}{\partial W_{\mathrm{z}}^{\ast}} = O_{M,M-K} \quad&\Longleftrightarrow\quad W^{h}V_{\mathrm{z}}W_{\mathrm{z}} = E_{\mathrm{z}}, \tag{36}
\end{align}

where $E_{\mathrm{z}}$ is given by (25).

To rigorously derive the algorithms, we always assume the following two technical but mild conditions (C1) and (C2) for problem (29):²

(C1) $V_{1},\ldots,V_{K},V_{\mathrm{z}}\in\mathcal{S}_{+}^{M}$ are positive definite.

(C2) Estimates of $W\in\mathbb{C}^{M\times M}$ are always nonsingular during optimization.

² If (C1) is violated, problem (29) has no optimal solution and algorithms should diverge to infinity (see [54, Proposition 1] for the proof). Conversely, if (C1) is satisfied, problem (29) is guaranteed to have an optimal solution by Proposition 6 in Appendix A. In practice, the number of frames $T$ exceeds that of sensors $M$, and (C1) holds in general. Condition (C2) is satisfied automatically if we initialize $W$ as nonsingular. Intuitively, a singular $W$ implies $-\log|\det W|=+\infty$, which will never occur during optimization.

IV-C Conventional methods: IVA-IP1, IVA-IP2, and IVE-OC

IV-C1 IVA-IP1

Let $W_{\mathrm{z}}=[\bm{w}_{K+1},\ldots,\bm{w}_{M}]$. As shown in Table I, IVA-IP1 [30] cyclically updates each separation filter $\bm{w}_{1},\ldots,\bm{w}_{M}$ by solving the following subproblem for each $i=1,\ldots,M$ one by one:

$$\bm{w}_{i}\in\operatorname*{argmin}_{\bm{w}_{i}}g(\bm{w}_{1},\ldots,\bm{w}_{M},r). \tag{37}$$

This can be solved under (C1) and (C2) by

\begin{align}
\bm{u}_{i} &\leftarrow (W^{h}V_{i})^{-1}\bm{e}_{i}\in\mathbb{C}^{M}, \tag{38}\\
\bm{w}_{i} &\leftarrow \bm{u}_{i}\left(\bm{u}_{i}^{h}V_{i}\bm{u}_{i}\right)^{-\frac{1}{2}}\in\mathbb{C}^{M}. \tag{39}
\end{align}

Here, we define $V_{i}\coloneqq V_{\mathrm{z}}\in\mathcal{S}_{++}^{M}$ for $i=K+1,\ldots,M$, and the $i$-th column of $W$ in (38), i.e., $\bm{w}_{i}$, is set to its current value before the update. When applied to the BSE problem, the main drawback of IVA-IP1 is that the computation time increases significantly as $M$ gets larger, since it updates $W_{\mathrm{z}}$ even though there is no need to extract the background noises.
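A minimal numpy sketch (ours) of the IP1 update (38)–(39) of a single filter $\bm{w}_{i}$ at one frequency bin:

```python
import numpy as np

def ip1_update(W, Vi, i):
    """IP1 update of w_i by Eqs. (38)-(39); W is modified in place.

    W  : (M, M) current separation matrix [w_1, ..., w_M]
    Vi : (M, M) Hermitian PD matrix V_i (with V_i := V_z for i > K)
    """
    M = W.shape[0]
    e_i = np.zeros(M)
    e_i[i] = 1.0
    u = np.linalg.solve(W.conj().T @ Vi, e_i)          # Eq. (38)
    W[:, i] = u / np.sqrt(np.real(u.conj() @ Vi @ u))  # Eq. (39)
    return W
```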

IV-C2 IVE-OC

To accelerate IVA-IP1 when it is applied to the BSE problem, IVE-OC [24] updates $W_{\mathrm{z}}$ using (18) and (23) (see Section III-C and Algorithm 2). Although this update rule seems heuristic, we reveal in Section IV-G that IVE-OC can be understood in terms of BCD for the proposed IVE, which does not rely on the OC.

IV-C3 IVA-IP2

When $(K,M)=(1,2)$ or $(2,2)$, i.e., when $W=[\bm{w}_{1},\bm{w}_{2}]$, problem (29) can be solved directly (not iteratively) through a generalized eigenvalue problem [35, 31], which is more efficient than IVA-IP1. We extend this direct method to the case where $K=1$ and general $M$ ($\geq 2$) in Section IV-D.

In a previous work [32], IVA-IP2 was extended to the case where $K=M\geq 3$. This algorithm, which is also called IVA-IP2, updates two separation filters, e.g., $\bm{w}_{i}$ and $\bm{w}_{j}$, in each iteration by exactly solving the following subproblem:

$$(\bm{w}_{i},\bm{w}_{j})\in\operatorname*{argmin}_{\bm{w}_{i},\,\bm{w}_{j}}g(\bm{w}_{1},\ldots,\bm{w}_{M},r). \tag{40}$$

We extend this algorithm in Section IV-F.

IV-D Proposed IVE-IP2 for the case of $K=1$

We focus on the case where $K=1$ and derive the proposed algorithm IVE-IP2,³ which is expected to be more efficient than IVE-IP1. The algorithm is summarized in Algorithm 3.

³ IVE-IP2 for $K=1$ was developed independently and simultaneously by Scheibler–Ono (under the name fast IVE or FIVE [25]) and by the authors [27] in the Proceedings of ICASSP 2020. In this paper, we give a complete proof of the derivation of IVE-IP2 and add some remarks (see also Section IV-H for the projection back operation).

When $(K,M)=(1,2)$ or $(2,2)$, problem (29) can be solved directly through a generalized eigenvalue problem [35, 31, 32]. The following proposition extends this direct method to the case where $K=1$ and $M\geq 2$.

Proposition 2.

Let $K=1$, $M\geq 2$, and $V_{1},V_{\mathrm{z}}\in\mathcal{S}_{++}^{M}$. A matrix $W=[\bm{w}_{1},W_{\mathrm{z}}]\in\operatorname{GL}(M)$ with $\bm{w}_{1}\in\mathbb{C}^{M}$ satisfies the stationary conditions (35) and (36) if and only if

\begin{align}
\bm{w}_{1} &= \bm{u}_{1}\left(\bm{u}_{1}^{h}V_{1}\bm{u}_{1}\right)^{-\frac{1}{2}}Q_{1},\quad Q_{1}\in\mathcal{U}(1), \tag{41}\\
W_{\mathrm{z}} &= U_{\mathrm{z}}\left(U_{\mathrm{z}}^{h}V_{\mathrm{z}}U_{\mathrm{z}}\right)^{-\frac{1}{2}}Q_{\mathrm{z}},\quad Q_{\mathrm{z}}\in\mathcal{U}(M-1), \tag{42}\\
U_{\mathrm{z}} &\in\mathbb{C}^{M\times(M-1)}\quad\text{with}\quad U_{\mathrm{z}}^{h}V_{\mathrm{z}}\bm{u}_{1}=\bm{0}_{M-1}, \tag{43}\\
\bm{u}_{1} &\in\mathbb{C}^{M\times 1}\quad\text{with}\quad V_{\mathrm{z}}\bm{u}_{1}=\lambda V_{1}\bm{u}_{1}, \tag{44}
\end{align}

where $\mathcal{U}(d)$ is the set of all unitary matrices of size $d\times d$. Also, (44) is the generalized eigenvalue problem for $(V_{\mathrm{z}},V_{1})$ with eigenvalue $\lambda\in\mathbb{R}_{>0}$ and eigenvector $\bm{u}_{1}\in\mathbb{C}^{M}$.

Moreover, if the generalized eigenvalue $\lambda\in\mathbb{R}_{>0}$ in (44) is chosen as the largest one, then any $W\in\operatorname{GL}(M)$ obtained by (41)–(44) is a globally optimal solution of problem (29).

Proof.

We first show that $W$ satisfies (35)–(36) if and only if it is computed by (41)–(44).

For the “if” part, observe that $U_{\mathrm{z}}^{h}V_{1}\bm{u}_{1}=\bm{0}_{M-1}$ holds by (43)–(44) and $\lambda\neq 0$. Thus, $W$ surely satisfies (35)–(36).

We prove the “only if” part. The stationary conditions (35)–(36) imply that the vectors $V_{1}\bm{w}_{1}$ and $V_{\mathrm{z}}\bm{w}_{1}$ are orthogonal to the subspace $\operatorname{Im}W_{\mathrm{z}}$ of dimension $M-1$. Hence, $\bm{w}_{1}=c\bm{u}_{1}$ holds for some $c\in\mathbb{C}$, where $\bm{u}_{1}$ is given by (44). This $c$ is restricted by $\bm{w}_{1}^{h}V_{1}\bm{w}_{1}=1$, and we obtain (41). In a similar manner, (36) implies that the vector $V_{\mathrm{z}}\bm{u}_{1}$ is orthogonal to $\operatorname{Im}W_{\mathrm{z}}$ of dimension $M-1$. Hence, $W_{\mathrm{z}}=U_{\mathrm{z}}R$ holds for some $R\in\mathbb{C}^{(M-1)\times(M-1)}$, where $U_{\mathrm{z}}$ is given by (43). This $R$ is restricted by $W_{\mathrm{z}}^{h}V_{\mathrm{z}}W_{\mathrm{z}}=I_{M-1}$, and we have (42).

We next show the latter statement. By Proposition 6 in Appendix A, globally optimal solutions exist, and they must satisfy the stationary conditions, which are equivalent to (41)–(44). Since (41)–(44) satisfy (35)–(36), the sum of the first and second terms of $g$ equals $M$, which is constant. On the other hand, for the $\log\det$ term, it holds that

\begin{align}
|\det W| &= \left(\bm{u}_{1}^{h}V_{1}\bm{u}_{1}\right)^{-\frac{1}{2}}\cdot\det\left(U_{\mathrm{z}}^{h}V_{\mathrm{z}}U_{\mathrm{z}}\right)^{-\frac{1}{2}}\cdot|\det U|\nonumber\\
&= \sqrt{\lambda}\,\det\left(U^{h}V_{\mathrm{z}}U\right)^{-\frac{1}{2}}\cdot|\det U| = \sqrt{\lambda}\,\det(V_{\mathrm{z}})^{-\frac{1}{2}},
\end{align}

where we define $U\coloneqq[\bm{u}_{1},U_{\mathrm{z}}]$ and use $U_{\mathrm{z}}^{h}V_{\mathrm{z}}\bm{u}_{1}=\bm{0}_{M-1}$ in the second equality. Hence, the largest $\lambda$ leads to the smallest $g$, which concludes the proof. ∎

By Proposition 2, under condition (C1), a globally optimal solution of problem (29) can be obtained by updating $W=[\bm{w}_{1},W_{\mathrm{z}}]$ using (41)–(44) with $Q_{1}=1$ and $Q_{\mathrm{z}}=I_{M-1}$ and choosing the generalized eigenvalue $\lambda$ in (44) as the largest one. Moreover, this algorithm can be accelerated by omitting the computation of (42)–(43) and updating only $\bm{w}_{1}$ according to (41) and (44). This is applicable for extracting the unique target-source signal because the formulas for computing $\bm{w}_{1}$, $V_{1}$, and $V_{\mathrm{z}}$ are independent of $W_{\mathrm{z}}$. The obtained algorithm, IVE-IP2, is shown in Algorithm 3.

Interestingly, because $V_{\mathrm{z}}$ and $V_{1}$ can be viewed as the covariance matrices of the mixture and noise signals, respectively, the update rules (41) and (44) turn out to form a MaxSNR beamformer [44, 45]. Hence, IVE-IP2 can be understood as a method that iteratively updates the MaxSNR beamformer $\bm{w}_{1}$ and the target-source signal $\bm{s}_{1}$.
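A minimal sketch (ours) of this update using scipy's generalized Hermitian eigensolver; as noted above, $W_{\mathrm{z}}$ need not be touched:

```python
import numpy as np
from scipy.linalg import eigh

def ive_ip2_k1(V1, Vz):
    """Update w_1 by Eqs. (41) and (44) for K = 1 (MaxSNR-type update)."""
    lam, U = eigh(Vz, V1)   # generalized eigenvalues of Vz u = lambda V1 u, ascending
    u1 = U[:, -1]           # eigenvector of the largest eigenvalue, Eq. (44)
    return u1 / np.sqrt(np.real(u1.conj() @ V1 @ u1))  # Eq. (41) with Q1 = 1
```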

IV-E Drawback of conventional IVE-IP2 for the case of $K\geq 2$

Suppose $2\leq K<M$. The conventional IVE-IP2 developed in [26, Algorithm 2] is a BCD algorithm that cyclically updates the pairs $(\bm{w}_{1},W_{\mathrm{z}}),\ldots,(\bm{w}_{K},W_{\mathrm{z}})$ by solving the following subproblem for each $i=1,\ldots,K$ one by one:

$$(\bm{w}_{i},W_{\mathrm{z}})\in\operatorname*{argmin}_{\bm{w}_{i},\,W_{\mathrm{z}}}g(\bm{w}_{1},\ldots,\bm{w}_{K},W_{\mathrm{z}},r). \tag{45}$$

It was shown in [26, Theorem 3] that a globally optimal solution of problem (45) can be obtained through a generalized eigenvalue problem. In this paper, we simplify the result of [26, Theorem 3] in the next proposition, based on which we speed up the conventional IVE-IP2 in Section IV-F.

Proposition 3.

Let $2\leq K<M$ and $V_{i},V_{\mathrm{z}}\in\mathcal{S}_{++}^{M}$. A globally optimal solution of problem (45) is obtained as

\begin{align}
\bm{w}_{i} &= P_{i}\bm{b}_{i}\left(\bm{b}_{i}^{h}G_{i}\bm{b}_{i}\right)^{-\frac{1}{2}}\in\mathbb{C}^{M\times 1}, \tag{46}\\
W_{\mathrm{z}} &= P_{\mathrm{z}}B_{\mathrm{z}}\left(B_{\mathrm{z}}^{h}G_{\mathrm{z}}B_{\mathrm{z}}\right)^{-\frac{1}{2}}\in\mathbb{C}^{M\times(M-K)}, \tag{47}\\
P_{\ell} &= \left((W^{\prime})^{h}V_{\ell}\right)^{-1}[\,\bm{e}_{i},E_{\mathrm{z}}\,]\in\mathbb{C}^{M\times(M-K+1)}, \tag{48}\\
G_{\ell} &= P_{\ell}^{h}V_{\ell}P_{\ell}\in\mathcal{S}_{++}^{M-K+1},\quad \ell\in\{i,\mathrm{z}\}, \tag{49}\\
W^{\prime} &= [\,\bm{w}_{1},\ldots,\bm{w}_{i}^{\prime},\ldots,\bm{w}_{K},W^{\prime}_{\mathrm{z}}\,]\in\operatorname{GL}(M), \tag{50}\\
B_{\mathrm{z}} &\in\mathbb{C}^{(M-K+1)\times(M-K)}\quad\text{with}\quad B_{\mathrm{z}}^{h}G_{\mathrm{z}}\bm{b}_{i}=\bm{0}_{M-K}, \tag{51}\\
\bm{b}_{i} &\in\mathbb{C}^{(M-K+1)\times 1}\quad\text{with}\quad G_{i}\bm{b}_{i}=\lambda_{\max}G_{\mathrm{z}}\bm{b}_{i}, \tag{52}
\end{align}

where $E_{\mathrm{z}}$ is defined by (25), and $\bm{w}_{i}^{\prime}\in\mathbb{C}^{M\times 1}$ and $W^{\prime}_{\mathrm{z}}\in\mathbb{C}^{M\times(M-K)}$ in (50) are set arbitrarily as long as $W^{\prime}$ is nonsingular (for instance, they can be set to the current values under condition (C2)). Also, (52) is the generalized eigenvalue problem for $(G_{i},G_{\mathrm{z}})$, and $\bm{b}_{i}$ is the eigenvector corresponding to the largest generalized eigenvalue $\lambda_{\max}\in\mathbb{R}_{>0}$.

Proof.

The proof is given in Appendix B. ∎

In the conventional IVE-IP2, under condition (C2), problem (45) is solved by (46)–(50), where $B_{i,\mathrm{z}}\coloneqq[\bm{b}_{i},B_{\mathrm{z}}]$ is computed through the following full generalized eigenvalue problem instead of through (51)–(52):

$$G_{i}B_{i,\mathrm{z}}=G_{\mathrm{z}}B_{i,\mathrm{z}}\operatorname{diag}\{\lambda_{\max},\lambda_{2},\ldots,\lambda_{M-K+1}\}. \tag{53}$$

Since the computational cost of solving (53) is greater than that of computing (51)–(52), we can speed up the conventional IVE-IP2, as shown in the next subsection.

IV-F Proposed IVE-IP2 for $K\geq 2$

The proposed IVE-IP2 for $K\geq 2$, which is summarized in Algorithm 4, is an acceleration of the conventional IVE-IP2 described in the previous subsection. We here provide another, more efficient formula for solving (45) without changing the behavior of the conventional IVE-IP2.

Thanks to Proposition 3, we can update the pair $(\bm{w}_{i},W_{\mathrm{z}})$ according to (46)–(52), in which only the first generalized eigenvector has to be computed. Moreover, for the purpose of optimizing $\bm{w}_{1},\ldots,\bm{w}_{K}$, we do not have to update $W_{\mathrm{z}}$ using Eqs. (47) and (51) when solving problem (45) for each $i=1,\ldots,K$. To see this, observe the following two facts:

  1. In the MM algorithm (Eqs. (29)–(34)), the auxiliary variable $r$, and by extension the covariance matrices $V_{1},\ldots,V_{K},V_{\mathrm{z}}$, are independent of $W_{\mathrm{z}}$; and hence,

  2. $W_{\mathrm{z}}$ never contributes to the construction of the surrogate function $g$ during iterative optimization.

This observation implies that problem (45) remains the same during iterative optimization regardless of whether we update $W_{\mathrm{z}}$ or not. Hence, in IVE-IP2, it is sufficient to update only $\bm{w}_{i}$ in (45) using Eqs. (46), (48)–(50), and (52), as sketched below.

This advantage stems from the stationary Gaussian assumption for the noise components. By contrast, suppose that the noise components were nonstationary or non-Gaussian. Then, the surrogate function $g$ as well as $V_{\mathrm{z}}$ would depend on $W_{\mathrm{z}}$ in the same way that $V_{i}$ depends on $\bm{w}_{i}$, meaning that $W_{\mathrm{z}}$ would have to be optimized in subproblem (45). In this way, the stationary Gaussian assumption is essential for deriving a computationally efficient algorithm for IVE.
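The resulting update of $\bm{w}_{i}$ alone can be sketched as follows (our numpy/scipy illustration; the function name is hypothetical):

```python
import numpy as np
from scipy.linalg import eigh

def ive_ip2_update_wi(W, Vi, Vz, i, K):
    """Proposed IVE-IP2: update only w_i via Eqs. (46), (48)-(50), and (52).

    W  : (M, M) current separation matrix W' = [w_1, ..., w_K, Wz]
    Vi : (M, M) weighted covariance V_i;  Vz : (M, M) covariance V_z
    """
    M = W.shape[0]
    E = np.zeros((M, M - K + 1), dtype=complex)
    E[i, 0] = 1.0              # e_i
    E[K:, 1:] = np.eye(M - K)  # E_z, Eq. (25)
    Pi = np.linalg.solve(W.conj().T @ Vi, E)  # P_i, Eq. (48)
    Pz = np.linalg.solve(W.conj().T @ Vz, E)  # P_z, Eq. (48)
    Gi = Pi.conj().T @ Vi @ Pi                # G_i, Eq. (49)
    Gz = Pz.conj().T @ Vz @ Pz                # G_z, Eq. (49)
    lam, B = eigh(Gi, Gz)                     # Eq. (52): G_i b = lambda G_z b
    b = B[:, -1]                              # eigenvector of the largest eigenvalue
    W[:, i] = Pi @ b / np.sqrt(np.real(b.conj() @ Gi @ b))  # Eq. (46)
    return W
```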

IV-G IVE-IP1: Reinterpretation of the conventional IVE-OC

In this subsection, we model the noise components by (16) with a general time-independent covariance matrix $\Omega_{\mathrm{z}}$, i.e.,

$$-\log p(\bm{z}(t))=\bm{z}(t)^{h}\Omega_{\mathrm{z}}^{-1}\bm{z}(t)+\log\det\Omega_{\mathrm{z}}+\mathrm{const.}, \tag{54}$$

and develop a BCD algorithm, called IVE-IP1, that cyclically updates $\bm{w}_{1},(W_{\mathrm{z}},\Omega_{\mathrm{z}}),\bm{w}_{2},(W_{\mathrm{z}},\Omega_{\mathrm{z}}),\ldots,\bm{w}_{K},(W_{\mathrm{z}},\Omega_{\mathrm{z}})$ by solving the following subproblems one by one:

\begin{align}
\bm{w}_{i} &\in \operatorname*{argmin}_{\bm{w}_{i}} g(\bm{w}_{1},\ldots,\bm{w}_{K},W_{\mathrm{z}},\Omega_{\mathrm{z}},r), \tag{55}\\
(W_{\mathrm{z}},\Omega_{\mathrm{z}}) &\in \operatorname*{argmin}_{W_{\mathrm{z}},\,\Omega_{\mathrm{z}}} g(\bm{w}_{1},\ldots,\bm{w}_{K},W_{\mathrm{z}},\Omega_{\mathrm{z}},r). \tag{56}
\end{align}

Here, the frequency bin index $f$ is omitted. Because of the noise covariance $\Omega_{\mathrm{z}}$, the objective function $g$ changes slightly to

\begin{align}
g(W,\Omega_{\mathrm{z}},r) ={}& \sum_{i=1}^{K}\bm{w}_{i}^{h}V_{i}\bm{w}_{i}+\operatorname{tr}\left(W_{\mathrm{z}}^{h}V_{\mathrm{z}}W_{\mathrm{z}}\Omega_{\mathrm{z}}^{-1}\right)\nonumber\\
&+\log\det\Omega_{\mathrm{z}}-2\log|\det W|. \tag{57}
\end{align}

It will turn out that this BCD algorithm, IVE-IP1, is identical to the conventional IVE-OC summarized in Algorithm 2. In other words, IVE-OC, which relies on the orthogonal constraint (OC), can be interpreted as a BCD algorithm for the proposed IVE without the OC.

Since problem (55) is the same as problem (37), the update rules for $\bm{w}_{1},\ldots,\bm{w}_{K}$ are given by (38)–(39). An algorithm for problem (56), on the other hand, can be obtained from the next proposition.

Proposition 4.

Let $K,M\in\mathbb{N}$ with $K<M$, let $V_{\mathrm{z}}\in\mathcal{S}_{++}^{M}$, and let $W_{\mathrm{s}}=[\bm{w}_{1},\ldots,\bm{w}_{K}]\in\mathbb{C}^{M\times K}$ have full column rank. Then, a pair $(W_{\mathrm{z}},\Omega_{\mathrm{z}})$ is a globally optimal solution of problem (56) if and only if it is computed by

\begin{align}
W_{\mathrm{z}} &\in\mathbb{C}^{M\times(M-K)}\quad\text{with}\quad W_{\mathrm{s}}^{h}V_{\mathrm{z}}W_{\mathrm{z}}=O_{K,M-K}, \tag{58}\\
\Omega_{\mathrm{z}} &= W_{\mathrm{z}}^{h}V_{\mathrm{z}}W_{\mathrm{z}}\in\mathcal{S}_{++}^{M-K}. \tag{59}
\end{align}

(Note that $W_{\mathrm{z}}$ must have full column rank to guarantee the positive definiteness of $\Omega_{\mathrm{z}}$.)

Proof.

The stationary condition, which is a necessary optimality condition of problem (56), is expressed as

\begin{align}
\frac{\partial g}{\partial W_{\mathrm{z}}^{\ast}} = O \quad&\Longleftrightarrow\quad \begin{cases}W_{\mathrm{s}}^{h}V_{\mathrm{z}}W_{\mathrm{z}}=O_{K,M-K},\\ W_{\mathrm{z}}^{h}V_{\mathrm{z}}W_{\mathrm{z}}=\Omega_{\mathrm{z}},\end{cases} \tag{60}\\
\frac{\partial g}{\partial\Omega_{\mathrm{z}}^{-1}} = O \quad&\Longleftrightarrow\quad W_{\mathrm{z}}^{h}V_{\mathrm{z}}W_{\mathrm{z}}=\Omega_{\mathrm{z}}. \tag{61}
\end{align}

Hence, the “only if” part is obvious.

To see the “if” part, we show that all stationary points are globally optimal. To this end, it suffices to prove that

(i) $g$ takes the same value at all stationary points; and

(ii) $g$ attains its minimum at some $(W_{\mathrm{z}},\Omega_{\mathrm{z}})$.

We first check (i). By (59), the second term of $g$ becomes $\operatorname{tr}(W_{\mathrm{z}}^{h}V_{\mathrm{z}}W_{\mathrm{z}}\Omega_{\mathrm{z}}^{-1})=M-K$, which is constant. Let $(W_{\mathrm{z}},\Omega_{\mathrm{z}})$ and $(W_{\mathrm{z}}^{\prime},\Omega_{\mathrm{z}}^{\prime})$ be two stationary points. Since $\operatorname{Im}W_{\mathrm{z}}=\operatorname{Im}W^{\prime}_{\mathrm{z}}$ by the first part of (60), we have $W_{\mathrm{z}}^{\prime}=W_{\mathrm{z}}Q$ for some $Q\in\operatorname{GL}(M-K)$. Then, by (61), $\Omega_{\mathrm{z}}^{\prime}=Q^{h}\Omega_{\mathrm{z}}Q$. Hence, for the $\log\det$ terms of $g$, we have

\begin{align}
&\log\det\Omega_{\mathrm{z}}^{\prime}-2\log|\det[W_{\mathrm{s}},W_{\mathrm{z}}^{\prime}]|\nonumber\\
={}&\log\det(Q^{h}\Omega_{\mathrm{z}}Q)-2\log|\det[W_{\mathrm{s}},W_{\mathrm{z}}]\cdot\det Q|\nonumber\\
={}&\log\det\Omega_{\mathrm{z}}-2\log|\det[W_{\mathrm{s}},W_{\mathrm{z}}]|,
\end{align}

which concludes (i).

We next show (ii). Let us change the variables from $(W_{\mathrm{z}},\Omega_{\mathrm{z}})$ to $(W_{\mathrm{z}}^{\prime},\Omega_{\mathrm{z}})$ with $W_{\mathrm{z}}^{\prime}=W_{\mathrm{z}}\Omega_{\mathrm{z}}^{-\frac{1}{2}}$. Then, the objective function $g$ with respect to $W_{\mathrm{z}}^{\prime}$ and $\Omega_{\mathrm{z}}$ is expressed as

$$g(W_{\mathrm{z}}^{\prime},\Omega_{\mathrm{z}})=\operatorname{tr}(W_{\mathrm{z}}^{\prime h}V_{\mathrm{z}}W_{\mathrm{z}}^{\prime})-2\log|\det[W_{\mathrm{s}},W_{\mathrm{z}}^{\prime}]|+\mathrm{const.},$$

which is independent of $\Omega_{\mathrm{z}}$. By Proposition 6, the problem of minimizing $g$ with respect to $W_{\mathrm{z}}^{\prime}$ attains its minimum at some $W_{\mathrm{z}}^{\prime}$, which in turn implies that problem (56) also attains its minimum at some $(W_{\mathrm{z}},\Omega_{\mathrm{z}})$. ∎

Proposition 4 implies that in problem (56) it is sufficient to optimize $\operatorname{Im}W_{\mathrm{z}}$, instead of $W_{\mathrm{z}}$ itself, so as to satisfy the orthogonal constraint (OC), i.e., Eq. (58), which is part of the stationary condition (60).⁴ Since the update formulas (18) and (23) for $W_{\mathrm{z}}$ in IVE-OC surely satisfy the OC, they are applicable for solving problem (56). Hence, it turns out that the conventional IVE-OC summarized in Algorithm 2 can be obtained as a BCD for the proposed IVE.

⁴ The observation that the OC appears as part of the stationary condition was pointed out in [26].
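Putting the pieces together, one IVE-IP1 sweep at a frequency bin could then look as follows; this sketch reuses the hypothetical helpers ip1_update and update_Wz_oc from the earlier sketches and assumes W, the list V of matrices $V_{i}$, Vz, and K are already defined:

```python
# One IVE-IP1 sweep: w_1 -> (Wz, Omega_z) -> ... -> w_K -> (Wz, Omega_z).
for i in range(K):
    W = ip1_update(W, V[i], i)        # Eqs. (38)-(39)
    Wz = update_Wz_oc(W[:, :K], Vz)   # Eq. (58) via Eqs. (18) and (23)
    Omega_z = Wz.conj().T @ Vz @ Wz   # Eq. (59)
    W[:, K:] = Wz
```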

IV-H Source extraction and projection back

After optimizing the separation matrix $W$ in IVE, we can achieve source extraction by computing the minimum mean square error (MMSE) estimate of the source spatial image $\bm{x}_{i}$ for each $i=1,\ldots,K$:

$$\hat{\bm{x}}_{i}(f,t)\coloneqq\underbrace{\left(W(f)^{-h}\bm{e}_{i}\right)}_{\text{projection back}}\underbrace{\left(\bm{w}_{i}(f)^{h}\bm{x}(f,t)\right)}_{\text{source extraction}}. \tag{62}$$

The projection back operation [55] adjusts the amplitude ambiguity of the extracted signals between frequency bins.
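As a reference, the following NumPy sketch illustrates (62) for a single source index; the function name, 0-based indexing, and array shapes are our illustrative assumptions, not part of the original implementation.

```python
import numpy as np

def extract_with_projection_back(W, X, i):
    """Estimate the i-th source spatial image via Eq. (62).

    W : (F, M, M) complex separation matrices with columns w_1, ..., w_M.
    X : (F, M, T) complex STFT of the observed mixture.
    i : 0-based source index.
    """
    # Source extraction: s_i(f, t) = w_i(f)^h x(f, t).
    s_i = np.einsum('fm,fmt->ft', W[:, :, i].conj(), X)
    # Projection back: a_i(f) = W(f)^{-h} e_i, i.e., the i-th column of W^{-h}.
    a_i = np.linalg.inv(W.conj().transpose(0, 2, 1))[:, :, i]
    # Spatial image estimate: x_hat_i(f, t) = a_i(f) s_i(f, t).
    return a_i[:, :, None] * s_i[:, None, :]
```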

We here show that the projection back operation, i.e., W^{-h}\bm{e}_{i} for i=1,\ldots,K, requires only \bm{w}_{1},\ldots,\bm{w}_{K} and \operatorname*{Im}W_{\operatorname*{z}}, but not W_{\operatorname*{z}} itself (the frequency bin index f is dropped for simplicity). To see this, let

W(i)=[𝒘1,,𝒘K,Wz(i)]GL(M),i{1,2}\displaystyle W^{(i)}=[\bm{w}_{1},\ldots,\bm{w}_{K},W_{\operatorname*{z}}^{(i)}]\in\operatorname*{GL}(M),\quad i\in\{1,2\} (63)

satisfy ImWz(1)=ImWz(2)\operatorname*{Im}W_{\operatorname*{z}}^{(1)}=\operatorname*{Im}W_{\operatorname*{z}}^{(2)}. Then, we have

W^{(1)}=W^{(2)}\begin{bmatrix}I_{K}&O\\ O&D\end{bmatrix}\quad\text{for some }D\in\operatorname*{GL}(M-K).

This implies \left(W^{(1)}\right)^{-h}\bm{e}_{i}=\left(W^{(2)}\right)^{-h}\bm{e}_{i} for i=1,\ldots,K, meaning that the projection back operation depends only on \operatorname*{Im}W_{\operatorname*{z}}, as claimed.

Remark 2.

Since the proposed IVE-IP2 never updates W_{\operatorname*{z}} from its initial value, we update \operatorname*{Im}W_{\operatorname*{z}} using (18) and (23) after the optimization, which corresponds to Step 1 in Algorithm 1. This update rule is valid for this purpose because of the stationary condition (36), which is equivalent to (60) with \Omega_{\operatorname*{z}}=I_{M-K}.

V Algorithm for semiblind BSE problem

V-A Proposed Semi-IVE

We next address the semiblind source extraction problem (Semi-BSE) defined in Section II. In Semi-BSE, we are given a priori the LL transfer functions (steering vectors) for target sources l=1,,Ll=1,\ldots,L (1LK1\leq L\leq K), denoted as

A1(f)[𝒂1(f),,𝒂L(f)]M×L.\displaystyle A_{1}(f)\coloneqq[\,\bm{a}_{1}(f),\ldots,\bm{a}_{L}(f)\,]\in\mathbb{C}^{M\times L}. (64)

In this situation, it is natural, in the context of the linearly constrained minimum variance (LCMV [44]) beamforming algorithm, to constrain the separation matrix as W(f)^{h}A_{1}(f)=E_{1}, where E_{1} is defined by (71) below. We thus propose the Semi-BSE method, called Semi-IVE, as IVE with these linear constraints, i.e.,

\left.\begin{array}{cl}\underset{W}{\operatorname*{minimize}}&g_{0}(W)\quad\text{(defined by (15))}\\ \operatorname*{subject~to}&W(f)^{h}A_{1}(f)=E_{1},\quad f=1,\ldots,F.\end{array}\right\}\quad(67)

Here, W{W(f)}f=1FW\coloneqq\{W(f)\}_{f=1}^{F}.

The optimization of Semi-IVE is also based on the MM algorithm described in Section IV-A. The surrogate function for Semi-IVE is the same as that of IVE, i.e., Eq. (30), and the problem of minimizing the surrogate function becomes the following constrained optimization problem for each frequency bin f=1,,Ff=1,\ldots,F:

\left.\begin{array}{cl}\underset{W(f)}{\operatorname*{minimize}}&g(W(f),r)\quad\text{(defined by (30))}\\ \operatorname*{subject~to}&W(f)^{h}A_{1}(f)=E_{1}.\end{array}\right\}\quad(70)

The goal of this section is to develop an efficient algorithm to obtain a (local) optimal solution for problem (70).

Hereafter, we omit frequency bin index ff and use the following notations for simplicity:

E1\displaystyle E_{1} =[𝒆1(M),,𝒆L(M)]=[ILOML,L]M×L,\displaystyle=[\,\bm{e}_{1}^{(M)},\ldots,\bm{e}_{L}^{(M)}\,]=\begin{bmatrix}I_{L}\\ O_{M-L,L}\end{bmatrix}\in\mathbb{C}^{M\times L}, (71)
E2\displaystyle E_{2} =[𝒆L+1(M),,𝒆M(M)]=[OL,MLIML]M×(ML),\displaystyle=[\,\bm{e}_{L+1}^{(M)},\ldots,\bm{e}_{M}^{(M)}\,]=\begin{bmatrix}O_{L,M-L}\\ I_{M-L}\end{bmatrix}\in\mathbb{C}^{M\times(M-L)}, (72)
W2\displaystyle W_{2} =[𝒘L+1,,𝒘K,Wz]M×(ML).\displaystyle=[\,\bm{w}_{L+1},\ldots,\bm{w}_{K},W_{\operatorname*{z}}\,]\in\mathbb{C}^{M\times(M-L)}. (73)

Here, the superscript (d) in \bm{e}_{i}^{(d)}\in\mathbb{C}^{d} denotes the dimension of the unit vector.

V-B LCMV beamforming algorithm for 𝐰1,,𝐰L\bm{w}_{1},\ldots,\bm{w}_{L}

By applying Proposition 7 in Appendix A, the objective function can be expressed as

g(W)\displaystyle g(W) =\displaystyle= i=1K𝒘ihVi𝒘i+tr(WzhVzWz)\displaystyle\sum_{i=1}^{K}\bm{w}_{i}^{h}V_{i}\bm{w}_{i}+\operatorname*{tr}(W_{\operatorname*{z}}^{h}V_{\operatorname*{z}}W_{\operatorname*{z}})
+logdet(A1hA1)logdet(W2hW2).\displaystyle+\log\det\left(A_{1}^{h}A_{1}\right)-\log\det\left(W_{2}^{h}W_{2}\right). (74)

Hence, in problem (70), separation filters 𝒘1,,𝒘L\bm{w}_{1},\ldots,\bm{w}_{L} that correspond to the given transfer functions 𝒂1,,𝒂L\bm{a}_{1},\ldots,\bm{a}_{L} can be globally optimized by solving

\left.\begin{array}{cl}\underset{\bm{w}_{i}}{\operatorname*{minimize}}&\bm{w}_{i}^{h}V_{i}\bm{w}_{i}\\ \operatorname*{subject~to}&\bm{w}_{i}^{h}A_{1}=(\bm{e}_{i}^{(L)})^{\top}\in\mathbb{C}^{1\times L}.\end{array}\right\}\quad(77)

This problem is nothing but the LCMV beamforming problem, whose closed-form solution is

𝒘i=Vi1A1(A1hVi1A1)1𝒆i(L)M.\displaystyle\bm{w}_{i}=V_{i}^{-1}A_{1}\left(A_{1}^{h}V_{i}^{-1}A_{1}\right)^{-1}\bm{e}_{i}^{(L)}\in\mathbb{C}^{M}. (78)
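As an illustration, here is a minimal NumPy sketch of the closed-form LCMV solution (78) at one frequency bin; the function name and array conventions are illustrative assumptions.

```python
import numpy as np

def lcmv_filter(V_i, A1, i):
    """Closed-form LCMV solution (78) at a single frequency bin.

    V_i : (M, M) Hermitian positive-definite weighted covariance.
    A1  : (M, L) matrix of the given transfer functions a_1, ..., a_L.
    i   : 0-based target index in {0, ..., L-1}.
    """
    L = A1.shape[1]
    ViA = np.linalg.solve(V_i, A1)            # V_i^{-1} A_1
    gram = A1.conj().T @ ViA                  # A_1^h V_i^{-1} A_1, (L, L)
    # w_i = V_i^{-1} A_1 (A_1^h V_i^{-1} A_1)^{-1} e_i
    return ViA @ np.linalg.solve(gram, np.eye(L)[:, i])
```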

V-C Block coordinate descent (BCD) algorithm for W2W_{2}

We develop a BCD algorithm for optimizing the remaining variables W_{2}. Since W_{2}\in\mathbb{C}^{M\times(M-L)} is restricted by the (M-L)L linear constraints W_{2}^{h}A_{1}=O_{M-L,L}, it can be parameterized by (M-L)^{2} free variables. One such parameterization is given by

W2\displaystyle W_{2} =W2W¯M×(ML),W¯(ML)×(ML),\displaystyle=W_{2}^{\prime}\overline{W}\in\mathbb{C}^{M\times(M-L)},\quad\overline{W}\in\mathbb{C}^{(M-L)\times(M-L)}, (79)
W2\displaystyle W_{2}^{\prime} =[A1,E2]hE2M×(ML),\displaystyle=[A_{1},E_{2}]^{-h}E_{2}\in\mathbb{C}^{M\times(M-L)}, (80)

where E_{2} is defined in (72), and [A_{1},E_{2}] is assumed to be nonsingular. It is easy to verify that this W_{2} satisfies the linear constraints. By substituting (79)–(80) into (74), we can reformulate problem (70) as the unconstrained problem of minimizing

g¯(W¯,r)=i=L+1K𝒘¯ihV¯i𝒘¯i+tr(W¯zhV¯zW¯z)2log|detW¯|\displaystyle\overline{g}(\overline{W},r)=\sum_{i=L+1}^{K}\overline{\bm{w}}_{i}^{h}\overline{V}_{\!i}\overline{\bm{w}}_{i}+\operatorname*{tr}(\overline{W}_{\!\operatorname*{z}}^{h}\overline{V}_{\!\operatorname*{z}}\overline{W}_{\!\operatorname*{z}})-2\log|\det\overline{W}|

with respect to W¯\overline{W}, where we define

W¯\displaystyle\overline{W} =[𝒘¯L+1,,𝒘¯K,W¯z](ML)×(ML),\displaystyle=[\,\overline{\bm{w}}_{L+1},\ldots,\overline{\bm{w}}_{K},\overline{W}_{\!\operatorname*{z}}\,]\in\mathbb{C}^{(M-L)\times(M-L)}, (81)
𝒘¯i\displaystyle\overline{\bm{w}}_{i} (ML)×1,i=L+1,,K,\displaystyle\in\mathbb{C}^{(M-L)\times 1},\quad i=L+1,\ldots,K, (82)
W¯z\displaystyle\overline{W}_{\!\operatorname*{z}} (ML)×(MK),\displaystyle\in\mathbb{C}^{(M-L)\times(M-K)}, (83)
V¯i\displaystyle\overline{V}_{\!i} =(W2)hViW2𝒮++ML,i{L+1,,K,z}.\displaystyle=\left(W_{2}^{\prime}\right)^{h}V_{i}W_{2}^{\prime}\in\mathcal{S}_{++}^{M-L},~{}~{}~{}i\in\{L+1,\ldots,K,\operatorname*{z}\}. (84)

Interestingly, this problem is nothing but the BSE problem already discussed in Section IV.
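As a quick sanity check, the following NumPy sketch (with illustrative names and random data) confirms that the parameterization (79)–(80) builds the constraint W_{2}^{h}A_{1}=O into W_{2} by construction.

```python
import numpy as np

M, L = 6, 2
rng = np.random.default_rng(0)
A1 = rng.standard_normal((M, L)) + 1j * rng.standard_normal((M, L))
E2 = np.eye(M)[:, L:]                     # (72): last M-L unit vectors

B = np.concatenate([A1, E2], axis=1)      # [A_1, E_2], assumed nonsingular
W2p = np.linalg.inv(B.conj().T) @ E2      # (80): W_2' = [A_1, E_2]^{-h} E_2

# Any W2 = W2' @ Wbar satisfies W2^h A1 = O, so the constraint holds for free.
Wbar = rng.standard_normal((M - L, M - L))
W2 = W2p @ Wbar
print(np.allclose(W2.conj().T @ A1, 0))   # True up to round-off
```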

In a similar manner to IVE-IP2, our proposed Semi-IVE algorithm updates 𝒘¯L+1,,𝒘¯K\overline{\bm{w}}_{L+1},\ldots,\overline{\bm{w}}_{K} one by one by solving the following subproblem for each i=L+1,,Ki=L+1,\ldots,K:

(\overline{\bm{w}}_{i},\overline{W}_{\!\operatorname*{z}})\in\operatorname*{argmin}_{\overline{\bm{w}}_{i},\,\overline{W}_{\!\operatorname*{z}}}\overline{g}(\overline{\bm{w}}_{L+1},\ldots,\overline{\bm{w}}_{K},\overline{W}_{\!\operatorname*{z}},r).\quad(85)
  • When L=K1L=K-1 and W¯=[𝒘¯K,W¯z]\overline{W}=[\,\overline{\bm{w}}_{K},\overline{W}_{\!\operatorname*{z}}\,], a global optimal solution for problem (85) can be obtained by applying Proposition 2.

  • When 1LK21\leq L\leq K-2, a global optimal solution of (85) can be obtained by applying Proposition 3.

  • Note that for the same reason as in the proposed IVE-IP2, updating W¯z\overline{W}_{\!\operatorname*{z}} in (85) is not necessary.

In summary, Semi-IVE is presented in Algorithm 5, in which we use the following notations:

IML\displaystyle I_{M-L} =[𝒆¯L+1,,𝒆¯K,E¯z]GL(ML),\displaystyle=[\,\overline{\bm{e}}_{L+1},\ldots,\overline{\bm{e}}_{K},\overline{E}_{\operatorname*{z}}\,]\in\operatorname*{GL}(M-L), (86)
𝒆¯i\displaystyle\overline{\bm{e}}_{i} =𝒆iL(ML)ML,i=L+1,,K,\displaystyle=\bm{e}_{i-L}^{(M-L)}\in\mathbb{C}^{M-L},\quad i=L+1,\ldots,K, (87)
E¯z\displaystyle\overline{E}_{\operatorname*{z}} =[𝒆KL+1(ML),,𝒆ML(ML)](ML)×(MK).\displaystyle=[\,\bm{e}_{K-L+1}^{(M-L)},\ldots,\bm{e}_{M-L}^{(M-L)}\,]\in\mathbb{C}^{(M-L)\times(M-K)}. (88)
Remark 3.

Semi-IVE does not update W¯z\overline{W}_{\!\operatorname*{z}} and WzW_{\operatorname*{z}} from the initial values. Hence, for the same reason as in Remark 2, we need to optimize ImW¯z\operatorname*{Im}\overline{W}_{\!\operatorname*{z}} and ImWz\operatorname*{Im}W_{\operatorname*{z}} after the optimization and before performing the projection back operation. For this purpose, we adopt the following formula:

Wz=W2W¯z,W¯z=[(W¯shV¯zE¯s)1(W¯shV¯zE¯z)IMK].\displaystyle W_{\operatorname*{z}}=W_{2}^{\prime}\overline{W}_{\!\operatorname*{z}},\quad\overline{W}_{\!\operatorname*{z}}=\begin{bmatrix}(\overline{W}_{\!\operatorname*{s}}^{h}\overline{V}_{\!\operatorname*{z}}\overline{E}_{\operatorname*{s}})^{-1}(\overline{W}_{\!\operatorname*{s}}^{h}\overline{V}_{\!\operatorname*{z}}\overline{E}_{\operatorname*{z}})\\ -I_{M-K}\end{bmatrix}. (89)

Here, we use the following notations:

W¯\displaystyle\overline{W} =[W¯s,W¯z],\displaystyle=[\,\overline{W}_{\!\operatorname*{s}},\overline{W}_{\!\operatorname*{z}}\,], W¯s\displaystyle\quad\overline{W}_{\!\operatorname*{s}} =[𝒘¯L+1,,𝒘¯K],\displaystyle=[\,\overline{\bm{w}}_{L+1},\ldots,\overline{\bm{w}}_{K}\,],
IML\displaystyle I_{M-L} =[E¯s,E¯z],\displaystyle=[\,\overline{E}_{\operatorname*{s}},\overline{E}_{\operatorname*{z}}\,], E¯s\displaystyle\quad\overline{E}_{\operatorname*{s}} =[𝒆¯L+1,,𝒆¯K].\displaystyle=[\,\overline{\bm{e}}_{L+1},\ldots,\overline{\bm{e}}_{K}\,].

The formula (89) optimizes ImW¯z\operatorname*{Im}\overline{W}_{\!\operatorname*{z}} (and hence ImWz\operatorname*{Im}W_{\operatorname*{z}}) as described in Remark 2.

VI Computational time complexity

We compare the computational time complexity per iteration of the algorithms presented in Table I. In IVE-IP1, IVE-IP2, and Semi-IVE, the runtime is dominated by

  • the computation of V1,,VK𝒮++MV_{1},\ldots,V_{K}\in\mathcal{S}_{++}^{M} at Step 1 in Algorithm 1, which costs O(KM2FT)\mathrm{O}(KM^{2}FT); or

  • the matrix inversions and generalized eigenvalue decompositions in Algorithms 25, which cost O(KM3F)\mathrm{O}(KM^{3}F).

Thus, the time complexity of these algorithms is

O(KM2FT+KM3F).\displaystyle\mathrm{O}(KM^{2}FT+KM^{3}F). (90)

On the other hand, the time complexity of IVA-IP1 and IVA-IP2 is

O(M3FT+M4F)\displaystyle\mathrm{O}(M^{3}FT+M^{4}F) (91)

due to the computation of MM covariance matrices V1,,VMV_{1},\ldots,V_{M}. Consequently, IVE reduces the computational time complexity of IVA by a factor of K/MK/M.

1 Data: Observed signal 𝒙\bm{x};
         Number of sources KK; and
         LL transfer functions A1=[𝒂1,,𝒂L]A_1=[\bm{a}_{1},\ldots,\bm{a}_{L}],
         where 0LK0\leq L\leq K.
2 Result: Separated spatial images 𝒙1,,𝒙K\bm{x}_{1},\ldots,\bm{x}_{K}.
3 begin
       /* Initialization */
      4 W(f)IMW(f)\leftarrow-I_{M}
      5 Vz(f)1Tt=1T𝒙(f,t)𝒙(f,t)hV_{\operatorname*{z}}(f)\leftarrow\frac{1}{T}\sum_{t=1}^{T}\bm{x}(f,t)\bm{x}(f,t)^{h}
      6 if using IVE-IP1 or IVE-IP2 then
            7 Update Wz(f)W_{\operatorname*{z}}(f) using Step 2 in Algorithm 2
      8if using Semi-IVE then
            9 W2(f)[A1(f),E2]hE2W_{2}^{\prime}(f)\leftarrow[\,A_{1}(f),E_{2}\,]^{-h}E_{2}
            10 V¯z(f)W2(f)hVz(f)W2(f)\overline{V}_{\!\operatorname*{z}}(f)\leftarrow W_{2}^{\prime}(f)^{h}V_{\operatorname*{z}}(f)W_{2}^{\prime}(f)
            11 Update Wz(f)W_{\operatorname*{z}}(f) using (89)
            
      /* Optimization */
      12 repeat
            13 for i=1,,Ki=1,\ldots,K do
                  14 si(f,t)𝒘i(f)h𝒙(f,t)s_{i}(f,t)\leftarrow\bm{w}_{i}(f)^{h}\bm{x}(f,t)
                  15 ri(t)𝒔i(t)r_{i}(t)\leftarrow\|\bm{s}_{i}(t)\|
                  16 αiβ=β2F(1Ttri(t)β)\alpha_{i}^{\beta}=\frac{\beta}{2F}(\frac{1}{T}\sum_{t}r_{i}(t)^{\beta})
                  17 ϕi(t)G(ri(t),αi)2ri(t)=β21αiβri(t)2β\phi_{i}(t)\leftarrow\frac{G^{\prime}(r_{i}(t),\alpha_{i})}{2r_{i}(t)}=\frac{\beta}{2}\frac{1}{\alpha_{i}^{\beta}r_{i}(t)^{2-\beta}}
                  18 ϕi(t)min{ϕi(t),105×min{ϕi(t)}t=1T}\phi_{i}(t)\leftarrow\min\{\phi_{i}(t),10^{5}\times\min\{\phi_{i}(t)\}_{t=1}^{T}\} // for numerical stability
                  19 Vi(f)1Ttϕi(t)𝒙(f,t)𝒙(f,t)hV_{i}(f)\leftarrow\frac{1}{T}\sum_{t}\phi_{i}(t)\bm{x}(f,t)\bm{x}(f,t)^{h}
                  20 Vi(f)Vi(f)+103tr{Vi(f)}IMV_{i}(f)\leftarrow V_{i}(f)+10^{-3}\operatorname*{tr}\{V_{i}(f)\}I_{M} // for numerical stability
                  
            21Update W(f)W(f) for each ff using IVE-IP1, IVE-IP2, or Semi-IVE.
            
      until convergence
      /* Source extraction */
      22 if using IVE-IP2 then
            23 Update Wz(f)W_{\operatorname*{z}}(f) using Step 2 in Algorithm 2
      24if using Semi-IVE then
            25 Update Wz(f)W_{\operatorname*{z}}(f) using (89)
            
      26𝒙i(f,t)(W(f)h𝒆i)𝒘i(f)h𝒙(f,t)\bm{x}_{i}(f,t)\leftarrow\left(W(f)^{-h}\bm{e}_{i}\right)\bm{w}_{i}(f)^{h}\bm{x}(f,t)
      
Algorithm 1 IVE (L=0L=0) and Semi-IVE (1LK1\leq L\leq K) based on generalized Gaussian distributions in Assumption 2: G(𝒔i(t),αi)=(𝒔i(t)αi)β+2FlogαiG(\|\bm{s}_{i}(t)\|,\alpha_{i})=\left(\frac{\|\bm{s}_{i}(t)\|}{\alpha_{i}}\right)^{\beta}+2F\log\alpha_{i}. Here, α1,,αK\alpha_{1},\ldots,\alpha_{K} are parameters to be optimized.
1 for i=1,,Ki=1,\ldots,K do
      2 𝒖i(f)(W(f)hVi(f))1𝒆i\bm{u}_{i}(f)\leftarrow\left(W(f)^{h}V_{i}(f)\right)^{-1}\bm{e}_{i}
      3 𝒘i(f)𝒖i(f)(𝒖i(f)hVi(f)𝒖i(f))12\bm{w}_{i}(f)\leftarrow\bm{u}_{i}(f)\left(\bm{u}_{i}(f)^{h}V_{i}(f)\bm{u}_{i}(f)\right)^{-\frac{1}{2}}
      4 Wz(f)[(Ws(f)hVz(f)Es)1(Ws(f)hVz(f)Ez)IMK]W_{\operatorname*{z}}(f)\leftarrow{\small\begin{bmatrix}(W_{\operatorname*{s}}(f)^{h}V_{\operatorname*{z}}(f)E_{\operatorname*{s}})^{-1}(W_{\operatorname*{s}}(f)^{h}V_{\operatorname*{z}}(f)E_{\operatorname*{z}})\\ -I_{M-K}\end{bmatrix}}
      
Algorithm 2 IVE-IP1 (proposed in [24])
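To make the update order of Algorithm 2 concrete, here is a minimal single-frequency-bin NumPy sketch; the function name, 0-based indexing, and in-place update of W are illustrative assumptions.

```python
import numpy as np

def ive_ip1_update(W, V_list, V_z, K):
    """One sweep of Algorithm 2 (IVE-IP1) at a single frequency bin.

    W      : (M, M) complex separation matrix [w_1, ..., w_K, W_z].
    V_list : list of K (M, M) weighted covariances V_1, ..., V_K.
    V_z    : (M, M) noise covariance.
    """
    M = W.shape[0]
    Es, Ez = np.eye(M)[:, :K], np.eye(M)[:, K:]
    for i in range(K):
        # Step 2: u_i = (W^h V_i)^{-1} e_i.
        u = np.linalg.solve(W.conj().T @ V_list[i], np.eye(M)[:, i])
        # Step 3: normalize so that w_i^h V_i w_i = 1.
        W[:, i] = u / np.sqrt((u.conj() @ V_list[i] @ u).real)
        # Step 4: closed-form update of W_z.
        Ws = W[:, :K]
        top = np.linalg.solve(Ws.conj().T @ V_z @ Es, Ws.conj().T @ V_z @ Ez)
        W[:, K:] = np.concatenate([top, -np.eye(M - K)], axis=0)
    return W
```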
1 Solve Vz(f)𝒖=λmaxV1(f)𝒖V_{\operatorname*{z}}(f)\bm{u}=\lambda_{\max}V_{1}(f)\bm{u} to obtain the eigenvector 𝒖\bm{u} corresponding to the largest eigenvalue λmax\lambda_{\max}.
2 𝒘1(f)𝒖(𝒖hV1(f)𝒖)12\bm{w}_{1}(f)\leftarrow\bm{u}\left(\bm{u}^{h}V_{1}(f)\bm{u}\right)^{-\frac{1}{2}}
Algorithm 3 IVE-IP2 for K=1K=1
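For reference, a single-bin sketch of Algorithm 3, assuming SciPy's generalized Hermitian eigensolver is available; names are illustrative.

```python
import numpy as np
from scipy.linalg import eigh

def ive_ip2_k1_update(V1, Vz):
    """Algorithm 3: update w_1 at one frequency bin via the GEVD
    V_z u = lambda_max V_1 u (V1, Vz Hermitian, V1 positive definite)."""
    _, vecs = eigh(Vz, V1)           # generalized EVD, ascending eigenvalues
    u = vecs[:, -1]                  # eigenvector for lambda_max
    # Normalize so that w_1^h V_1 w_1 = 1, as in Step 2.
    return u / np.sqrt((u.conj() @ V1 @ u).real)
```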
1 for i=1,,Ki=1,\ldots,K do
      2 for {i,z}\ell\in\{i,\operatorname*{z}\} do
            3 P(f)(W(f)hV(f))1[𝒆i,Ez]P_{\ell}(f)\leftarrow\left(W(f)^{h}V_{\ell}(f)\right)^{-1}[\,\bm{e}_{i},E_{\operatorname*{z}}\,]
            4 G(f)P(f)hV(f)P(f)G_{\ell}(f)\leftarrow P_{\ell}(f)^{h}V_{\ell}(f)P_{\ell}(f)
            
      5Solve Gi(f)𝒃=λmaxGz(f)𝒃G_{i}(f)\bm{b}=\lambda_{\max}G_{\operatorname*{z}}(f)\bm{b} to obtain 𝒃\bm{b} corresponding to the largest eigenvalue λmax\lambda_{\max}.
      6 𝒘i(f)Pi(f)𝒃(𝒃hGi(f)𝒃)12\bm{w}_{i}(f)\leftarrow P_{i}(f)\bm{b}\left(\bm{b}^{h}G_{i}(f)\bm{b}\right)^{-\frac{1}{2}}
Algorithm 4 IVE-IP2 for K2K\geq 2
/* LCMV beamforming */
1 for i=1,,Li=1,\ldots,L do
      2 𝒘i(f)Vi(f)1A1(f)(A1(f)hVi(f)1A1(f))1𝒆i\bm{w}_{i}(f)\leftarrow V_{i}(f)^{-1}A_{1}(f)\left(A_{1}(f)^{h}V_{i}(f)^{-1}A_{1}(f)\right)^{-1}\bm{e}_{i}
      
3if L=KL=K then
       return
/* BCD */
4 for i=L+1,,Ki=L+1,\ldots,K do
      5 V¯i(f)W2(f)hVi(f)W2(f)\overline{V}_{\!i}(f)\leftarrow W_{2}^{\prime}(f)^{h}V_{i}(f)W_{2}^{\prime}(f)
      
6if L=K1L=K-1 then
      7 Solve V¯z(f)𝒖¯=λmaxV¯K(f)𝒖¯\overline{V}_{\!\operatorname*{z}}(f)\overline{\bm{u}}=\lambda_{\max}\overline{V}_{\!K}(f)\overline{\bm{u}} to obtain 𝒖¯\overline{\bm{u}} corresponding to the largest eigenvalue λmax\lambda_{\max}.
      8 𝒘K(f)W2(f)𝒖¯(𝒖¯hV¯K(f)𝒖¯)12\bm{w}_{K}(f)\leftarrow W_{2}^{\prime}(f)\overline{\bm{u}}\left(\overline{\bm{u}}^{h}\overline{V}_{\!K}(f)\overline{\bm{u}}\right)^{-\frac{1}{2}}
9else
      10 for i=L+1,,Ki=L+1,\ldots,K do
            11 for {i,z}\ell\in\{i,\operatorname*{z}\} do
                  12 P¯(f)(W¯(f)hV¯(f))1[𝒆¯i,E¯z]\overline{P}_{\ell}(f)\leftarrow\left(\overline{W}(f)^{h}\overline{V}_{\ell}(f)\right)^{-1}[\,\overline{\bm{e}}_{i},\overline{E}_{\operatorname*{z}}\,]
                  13 G¯(f)P¯(f)hV¯(f)P¯(f)\overline{G}_{\ell}(f)\leftarrow\overline{P}_{\ell}(f)^{h}\overline{V}_{\ell}(f)\overline{P}_{\ell}(f)
                  
            14Solve G¯i(f)𝒃¯=λmaxG¯z(f)𝒃¯\overline{G}_{i}(f)\overline{\bm{b}}=\lambda_{\max}\overline{G}_{\operatorname*{z}}(f)\overline{\bm{b}} to obtain 𝒃¯\overline{\bm{b}} corresponding to the largest eigenvalue λmax\lambda_{\max}.
            15 𝒘i(f)W2(f)P¯i(f)𝒃¯(𝒃¯hG¯i(f)𝒃¯)12\bm{w}_{i}(f)\leftarrow W_{2}^{\prime}(f)\overline{P}_{i}(f)\overline{\bm{b}}\left(\overline{\bm{b}}^{h}\overline{G}_{i}(f)\overline{\bm{b}}\right)^{-\frac{1}{2}}
      
Algorithm 5 Semi-IVE

VII Experiments

In this numerical experiment, we evaluated the following properties of the IVE and Semi-IVE algorithms:

  • The source extraction performance for the speech signals in terms of the signal-to-distortion ratio (SDR [56]) between the estimated and oracle source spatial images;

  • the runtime performance.

We compared the performance of the following five methods whose optimization procedures are summarized in Table I:

  1. 1.

    IVA-IP1-old: The conventional AuxIVA with IP1 [30], followed by picking KK signals in an oracle manner.

  2. 2.

    IVE-IP1-old: The conventional IVE-IP1 or IVE-OC [24] (Algorithms 1 and 2).

  3. 3.

    IVE-IP2-old: The conventional IVE-IP2 [26] (see [26, Algorithm 2] for the implementation when K2K\geq 2).

  4. 4.

    IVE-IP2-new: The proposed IVE-IP2 (Algorithms 1, 3, and 4), where the generalized eigenvalue problems are solved using the power method with 30 iterations (see Section VII-C for details).⁵ Note that in the case of K=1, IVE-IP2-old (called FIVE [25]) and IVE-IP2-new are the same algorithm (see Section IV-D). However, we distinguish between them because we propose to use the power method to efficiently obtain the first generalized eigenvector in the proposed IVE-IP2-new.

  5. 5.

    Semi-IVE-(L)(L)-new: The proposed semiblind IVE algorithm, where the transfer functions of LL super-Gaussian sources are given as an oracle (Algorithms 1 and 5).

VII-A Dataset

As evaluation data, we generated synthesized convolutive noisy mixtures of speech signals.

Room impulse response (RIR) data: We used the RIR data recorded in room E2A from the RWCP Sound Scene Database in Real Acoustical Environments [57]. The reverberation time (RT60\mathrm{RT}_{60}) of room E2A is 300 ms. These data consist of nine RIRs from nine different directions.

Speech data: We used point-source speech signals from the test set of the TIMIT corpus [58]. We concatenated the speech signals from the same speaker so that the length of each signal exceeded ten seconds. We prepared 168 speech signals in total.

Noise data: We used a background noise signal recorded in a cafe (CAF) from the third ‘CHiME’ Speech Separation and Recognition Challenge [59]. We chose a monaural signal captured at ch1 on a tablet and used it as a point-source noise signal. The noise signal was about 35 minutes long.

Mixture signals: We generated 100 mixtures consisting of K{1,2,3}K\in\{1,2,3\} speech signals and J=5J=5 noise signals:

  1. 1.

    We selected KK speech signals at random from the 168 speech data. We selected JJ non-overlapping segments at random from the noise data and prepared JJ noise signals. We selected K+JK+J RIRs at random from the original nine RIRs.

  2. 2.

    We convolved the KK speech and JJ noise signals with the selected K+JK+J RIRs to create K+JK+J spatial images.

  3. 3.

    We added the obtained K+JK+J spatial images in such a way that SNR10log101Ki=1Kλi(s)j=1Jλj(n)\mathrm{SNR}\coloneqq 10\log_{10}\frac{\frac{1}{K}\sum_{i=1}^{K}\lambda_{i}^{(\mathrm{s})}}{\sum_{j=1}^{J}\lambda_{j}^{(\mathrm{n})}} becomes 0 dB if K=1,2K=1,2 and 5 dB if K=3K=3, where λi(s)\lambda_{i}^{(\mathrm{s})} and λj(n)\lambda_{j}^{(\mathrm{n})} denote the sample variances of the iith speech-source and jjth noise-source spatial images.

VII-B Experimental conditions

For all the methods, we initialized W(f)=-I_{M} and assumed that each super-Gaussian source \bm{s}_{i}(t)\in\mathbb{C}^{F} follows the generalized Gaussian distribution (GGD). More concretely, we set G(r_{i}(t),\alpha_{i})=(\frac{r_{i}(t)}{\alpha_{i}})^{\beta}+2F\log\alpha_{i} with shape parameter \beta=0.1 in Assumption 2. The scale parameters \alpha_{1},\ldots,\alpha_{K}\in\mathbb{R}_{>0} are optimized as \alpha_{i}^{\beta}=\frac{\beta}{2F}(\frac{1}{T}\sum_{t}r_{i}(t)^{\beta}) each time W is updated (Step 1 in Algorithm 1).
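For concreteness, the following NumPy sketch (with illustrative names and shapes) computes the GGD scale and weight updates of Algorithm 1 exactly as written above, including the clipping for numerical stability.

```python
import numpy as np

def ggd_updates(S, beta=0.1):
    """Scale and weight updates for the GGD contrast in Algorithm 1.

    S : (K, F, T) complex separated sources s_i(f, t).
    Returns alpha_i^beta of shape (K,) and weights phi of shape (K, T).
    """
    K, F, T = S.shape
    r = np.linalg.norm(S, axis=1)                     # r_i(t) = ||s_i(t)||
    alpha_beta = beta / (2 * F) * np.mean(r ** beta, axis=1)
    phi = (beta / 2) / (alpha_beta[:, None] * r ** (2 - beta))
    # Clipping for numerical stability, as in Algorithm 1
    # (zero-norm frames would need extra guarding in practice).
    phi = np.minimum(phi, 1e5 * phi.min(axis=1, keepdims=True))
    return alpha_beta, phi
```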

In Semi-IVE-(L)-new, we prepared oracle transfer functions \bm{a}_{1},\ldots,\bm{a}_{L} using the oracle spatial images \bm{x}_{1},\ldots,\bm{x}_{L}. We set \bm{a}_{\ell}(f) to the first eigenvector of the sample covariance matrix R_{\ell}(f)=\frac{1}{T}\sum_{t=1}^{T}\bm{x}_{\ell}(f,t)\bm{x}_{\ell}(f,t)^{h} for each source \ell\in\{1,\ldots,L\}, normalized so that \|\bm{a}_{\ell}(f)\|=1.
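A minimal NumPy sketch of this oracle steering-vector computation, under our illustrative shape conventions:

```python
import numpy as np

def oracle_steering_vector(X_img):
    """First eigenvector of the sample covariance R_l(f), per frequency.

    X_img : (F, M, T) complex STFT of the oracle spatial image x_l.
    Returns unit-norm steering vectors a_l(f) of shape (F, M).
    """
    T = X_img.shape[2]
    R = np.einsum('fmt,fnt->fmn', X_img, X_img.conj()) / T
    # eigh returns ascending eigenvalues; take the last (principal) eigenvector.
    _, vecs = np.linalg.eigh(R)
    a = vecs[:, :, -1]
    return a / np.linalg.norm(a, axis=1, keepdims=True)
```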

The performance was tested for one to three speakers and two to eight microphones. The sampling rate was 16 kHz, the reverberation time was 300 ms, the frame length was 4096 (256 ms), and the frame shift was 1024 (64 ms).

VII-C Implementation notes

We implemented all the algorithms in Python 3.7.1.

In IVE-IP2-new and Semi-IVE, we need to solve generalized eigenvalue problems of the form A\bm{x}=\lambda_{\mathrm{max}}B\bm{x}, where only the first generalized eigenvector is required. To avoid ‘for loops’ in the Python implementation, we solved them (i) by first transforming the problem into the eigenvalue problem \left(B^{-1}A\right)\bm{x}=\lambda_{\mathrm{max}}\bm{x}, and then (ii) by using power iteration (also known as the power method [60]) to obtain the required first eigenvector. The number of iterations in the power method was set to 30 in this experiment.
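A minimal batched NumPy sketch of this procedure (illustrative names; convergence of power iteration to the dominant eigenvector is assumed, as in our experiments):

```python
import numpy as np

def first_gev_power_method(A, B, n_iter=30):
    """First generalized eigenvector of A x = lambda_max B x for each bin.

    A, B : (F, M, M) Hermitian matrices with B positive definite. The problem
    is recast as (B^{-1} A) x = lambda_max x and solved by power iteration,
    which avoids per-frequency Python loops inside the eigensolver.
    """
    C = np.linalg.solve(B, A)                    # B^{-1} A, batched over f
    x = np.ones(C.shape[:2], dtype=C.dtype)      # (F, M) initial vectors
    for _ in range(n_iter):
        x = np.einsum('fmn,fn->fm', C, x)
        x /= np.linalg.norm(x, axis=1, keepdims=True)
    return x
```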

On the other hand, in IVE-IP2-old for K\geq 2, all the generalized eigenvectors have to be obtained. To avoid ‘for loops,’ we implemented it (i) by first transforming the generalized eigenvalue problem into an ordinary eigenvalue problem in the same way as above, and then (ii) by calling the numpy.linalg.eig function. IVE-IP2-old for K=1 (i.e., FIVE [25]) was implemented in the same way.

These implementations are not recommended from the viewpoint of numerical stability, but we adopted them to make the algorithms run fast.

Figure 1: SDR [dB] performance as a function of runtime, averaged over 100 mixtures. (a) One target source (K=1), SNR = 0 dB, average signal length 11.46 s. (b) Two target sources (K=2), SNR = 0 dB, average signal length 12.85 s. (c) Three target sources (K=3), SNR = 5 dB, average signal length 12.92 s.
TABLE II: Realtime factors of algorithms when number of iterations is set to 50
Number of sources K=1K=1 K=2K=2 K=3K=3
Number of microphones M=2M=2 M=6M=6 M=8M=8 M=3M=3 M=6M=6 M=8M=8 M=4M=4 M=6M=6 M=8M=8
Signal length [sec] 11.77 11.77 11.77 12.43 12.43 12.43 12.81 12.81 12.81
IVA-IP1-old 0.08 0.58 0.99 0.18 0.61 1.05 0.31 0.70 1.20
IVE-IP1-old 0.05 0.13 0.18 0.14 0.27 0.36 0.28 0.44 0.67
IVE-IP2-old 0.07 0.32 0.48 0.21 0.59 0.93 0.38 0.81 1.45
IVE-IP2-new 0.06 0.16 0.22 0.19 0.40 0.59 0.35 0.59 1.00
Semi-IVE-(1)(1)-new 0.04 0.12 0.17 0.15 0.29 0.41 0.32 0.54 0.89
Semi-IVE-(2)(2)-new - - - 0.14 0.26 0.35 0.28 0.44 0.70
Semi-IVE-(3)(3)-new - - - - - - 0.28 0.42 0.65

VII-D Experimental results

Figure 1 shows the SDR performance of each algorithm as a function of the runtime, and Table II presents the realtime factor (RTF) of the algorithms when the number of iterations is set to 50 for all the algorithms. These results were obtained by running the algorithms on a PC with Intel(R) Core(TM) i7-7820X CPU @ 3.60GHz using a single thread.

VII-D1 Effectiveness of IVE-IP2-new for extracting unique target source (K=1K=1)

In Fig. 1(a), IVE-IP2-new showed faster convergence than the conventional algorithms while achieving similar SDR performance. Interestingly, the fully blind IVE algorithms with M\geq 6 produced even better SDRs than the informed Semi-IVE-(1)-new with M=2. This result motivates the use of more microphones at the expense of increased runtime. As desired, the increase in the runtime of the IVE algorithms, especially the proposed IVE-IP2-new, was shown to be very small compared to IVA-IP1-old.

VII-D2 Effectiveness of IVE-IP2-new compared to IVE-IP2-old

From Fig. 1 and Table II, it can be confirmed for M\geq 6 that the computational cost of the proposed IVE-IP2-new is consistently smaller than that of IVE-IP2-old while maintaining the separation performance, which shows the effectiveness of IVE-IP2-new. On the other hand, when M is small, IVE-IP2-new gave almost the same performance as IVE-IP2-old.

VII-D3 Comparison of IVA-IP1-old, IVE-IP1-old, and IVE-IP2-new

When K=1 (Fig. 1(a)), the proposed IVE-IP2-new gave the fastest convergence among the blind algorithms, which clearly shows its superiority. On the other hand, when K\in\{2,3\} (Fig. 1(b)–(c)), the convergence of IVE-IP2-new was slower than that of IVE-IP1-old, mainly because the computational cost per iteration of IVE-IP2-new exceeds that of IVE-IP1-old (see Table II). Therefore, IVE-IP1-old remains important when extracting two or more sources.

VII-D4 Effectiveness of Semi-IVE-(L)(L)-new

In Fig. 1, the proposed Semi-IVE algorithms naturally outperformed all of the fully blind IVA and IVE algorithms. Surprisingly, when L=K-1 transfer functions were given a priori (and M\geq 4), the Semi-IVE algorithms (Semi-IVE-(1)-new if K=2 and Semi-IVE-(2)-new if K=3) achieved comparable or sometimes better performance than Semi-IVE-(K)-new. The convergence of Semi-IVE-(L)-new with L\geq K-1 was also extremely fast. These results clearly show the effectiveness of the proposed Semi-IVE algorithms.

VIII Concluding remarks

We presented two new efficient BCD algorithms for IVE: (i) IVE-IP2, which extracts all the super-Gaussian sources from a linear mixture in a fully blind manner, and (ii) Semi-IVE, which improves IVE-IP2 in the semiblind scenario where the transfer functions for several super-Gaussian sources are available as prior knowledge. We also argued that the conventional IVE relying on the orthogonal constraint (IVE-OC) can be interpreted as a BCD for IVE (IVE-IP1). Due to the stationary Gaussian noise assumption, these BCD (or IP) algorithms can skip most of the optimization of the filters that separate the noise components, which plays a central role in achieving a low computational cost.

Our numerical experiment on extracting speech signals from noisy reverberant mixtures showed that when K=1 in the blind case, or when at least K-1 transfer functions are given in the semiblind case, the proposed IVE-IP2 and Semi-IVE converge significantly faster than the conventional algorithms, where K is the number of target-source signals. The new IVE-IP2 consistently sped up the old IVE-IP2, while the conventional IVE-IP1 remains important for extracting multiple sources.

Appendix A

We prepare several propositions that are needed to rigorously develop the algorithms in Sections IV and V.

Proposition 5 below gives an inequality which is the basis of the MM algorithm for AuxICA [29], AuxIVA [30], and (the auxiliary-function-based) IVE.

Proposition 5 (See [29, Theorem 1]).

Let G\colon\mathbb{R}_{\geq 0}\to\mathbb{R} be differentiable and suppose that \frac{G^{\prime}(r)}{r} is nonincreasing on r\in\mathbb{R}_{>0}. Then, for arbitrary r,\widetilde{r}\in\mathbb{R}_{>0}, it holds that

G(r)G(r~)2r~r2+(G(r~)r~G(r~)2).\displaystyle G(r)\leq\frac{G^{\prime}(\widetilde{r})}{2\widetilde{r}}\cdot r^{2}+\left(G(\widetilde{r})-\frac{\widetilde{r}\cdot G^{\prime}(\widetilde{r})}{2}\right). (92)

The inequality holds with equality if r=r~r=\widetilde{r}.
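For instance, for the GGD contrast G(r)=(r/\alpha)^{\beta} with 0<\beta\leq 2 used in Algorithm 1, G^{\prime}(r)/r=(\beta/\alpha^{\beta})\,r^{\beta-2} is nonincreasing on r>0, and (92) becomes

G(r)\leq\frac{\beta}{2\,\alpha^{\beta}\,\widetilde{r}^{\,2-\beta}}\,r^{2}+\operatorname*{const.},

whose quadratic coefficient is exactly the weight \phi_{i}(t) computed in Algorithm 1.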

To prove Propositions 2, 3, and 4, we need the following proposition, which provides a sufficient condition for the existence of globally optimal solutions.

Proposition 6.

Suppose V1,,Vn𝒮++MV_{1},\ldots,V_{n}\in\mathcal{S}_{++}^{M} and 1nM1\leq n\leq M. Let W2M×(Mn)W_{2}\in\mathbb{C}^{M\times(M-n)} be full column rank. The function gg with respect to W1=[𝒘1,,𝒘n]M×nW_{1}=[\bm{w}_{1},\ldots,\bm{w}_{n}]\in\mathbb{C}^{M\times n}, defined by

g(W1)\displaystyle g(W_{1}) =i=1n𝒘ihVi𝒘ilog|det[W1,W2]|2,\displaystyle=\sum_{i=1}^{n}\bm{w}_{i}^{h}V_{i}\bm{w}_{i}-\log|\det[W_{1},W_{2}]|^{2},

is lower bounded and attains its minimum at some W1W_{1}.

Proof.

The statement for n=M was proved in [34, 35, 36, 37]. Since the statement for 1\leq n<M can be proved in the same way, we omit the proof. ∎

Proposition 7 below gives a useful expression for |\det W|^{2}. The same statement for L=1 can be found in [61], and we here extend it to general 1\leq L<M.

Proposition 7.

Let 1L<M1\leq L<M and

W\displaystyle W =[W1,W2]M×M,\displaystyle=\begin{bmatrix}W_{1},W_{2}\end{bmatrix}\in\mathbb{C}^{M\times M},
W1\displaystyle W_{1} M×L,W2M×(ML),\displaystyle\in\mathbb{C}^{M\times L},\quad W_{2}\in\mathbb{C}^{M\times(M-L)},
A1\displaystyle A_{1} M×L,W1hA1=IL,W2hA1=OML,L.\displaystyle\in\mathbb{C}^{M\times L},\quad W_{1}^{h}A_{1}=I_{L},\quad W_{2}^{h}A_{1}=O_{M-L,L}.

Then, |detW|2=det(A1hA1)1det(W2hW2)|\det W|^{2}=\det\left(A_{1}^{h}A_{1}\right)^{-1}\det\left(W_{2}^{h}W_{2}\right) holds.

Proof.

The orthogonal projection onto ImA1\operatorname*{Im}A_{1} is given by P=A1(A1hA1)1A1hM×MP=A_{1}(A_{1}^{h}A_{1})^{-1}A_{1}^{h}\in\mathbb{C}^{M\times M}. Then, it holds that

|detW|2\displaystyle|\det W|^{2} =|det([PW1+(IP)W1,W2])|2\displaystyle=\left|\det([\,PW_{1}+(I-P)W_{1},W_{2}\,])\right|^{2}
=|det([PW1,W2])|2\displaystyle=\left|\det([\,PW_{1},W_{2}\,])\right|^{2}
=det([PW1,W2]h[PW1,W2])\displaystyle=\det([\,PW_{1},W_{2}\,]^{h}[\,PW_{1},W_{2}\,])
=det([W1hPhPW1OL,MLOML,LW2hW2])\displaystyle=\det\left(\begin{bmatrix}W_{1}^{h}P^{h}PW_{1}&O_{L,M-L}\\ O_{M-L,L}&W_{2}^{h}W_{2}\end{bmatrix}\right)
=det(A1hA1)1det(W2hW2),\displaystyle=\det(A_{1}^{h}A_{1})^{-1}\cdot\det(W_{2}^{h}W_{2}),

where we use Im(IP)ImW2\operatorname*{Im}(I-P)\supseteq\operatorname*{Im}W_{2} in the second equality. ∎
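As a quick numeric sanity check of Proposition 7 (illustrative, not part of the proof), one can verify the identity on random data:

```python
import numpy as np

rng = np.random.default_rng(1)
M, L = 5, 2
A1 = rng.standard_normal((M, L)) + 1j * rng.standard_normal((M, L))
# Build W = [W1, W2] with W1^h A1 = I_L and W2^h A1 = O.
W1 = np.linalg.pinv(A1.conj().T)              # W1 = A1 (A1^h A1)^{-1}
Q, _ = np.linalg.qr(A1)                       # orthonormal basis of Im(A1)
P = np.eye(M) - Q @ Q.conj().T                # projector onto Im(A1)^perp
W2 = P @ (rng.standard_normal((M, M - L)) + 1j * rng.standard_normal((M, M - L)))
W = np.concatenate([W1, W2], axis=1)

lhs = np.abs(np.linalg.det(W)) ** 2
rhs = np.linalg.det(W2.conj().T @ W2).real / np.linalg.det(A1.conj().T @ A1).real
print(np.isclose(lhs, rhs))                   # True up to round-off
```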

Appendix B Proof of Proposition 3

Proof.

The proof is very similar to that of [26, Theorem 3]. By Proposition 6, problem (45) has global optimal solutions, and they satisfy the stationary conditions (35)–(36). In Eqs. (35)–(36), the first K rows except for the i-th row are linear with respect to \bm{w}_{i} and W_{\operatorname*{z}}, and so they can be solved as

𝒘i\displaystyle\bm{w}_{i} =PiGi1𝒄i,\displaystyle=P_{i}G_{i}^{-1}\bm{c}_{i}, 𝒄i\displaystyle\quad\bm{c}_{i} (MK+1)×1,\displaystyle\in\mathbb{C}^{(M-K+1)\times 1}, (93)
Wz\displaystyle W_{\operatorname*{z}} =PzGz1Cz,\displaystyle=P_{\operatorname*{z}}G_{\operatorname*{z}}^{-1}C_{\operatorname*{z}}, Cz\displaystyle\quad C_{\operatorname*{z}} (MK+1)×(MK),\displaystyle\in\mathbb{C}^{(M-K+1)\times(M-K)}, (94)

where PP_{\ell} and GG_{\ell} for {i,z}\ell\in\{i,\operatorname*{z}\} are defined by (48)–(49), and 𝒄i\bm{c}_{i} and CzC_{\operatorname*{z}} are free variables (the roles of Gi1G_{i}^{-1} and Gz1G_{\operatorname*{z}}^{-1} will be clear below). Substituting (93)–(94) into the objective function gg, we have

g=\bm{c}_{i}^{h}G_{i}^{-1}\bm{c}_{i}+\operatorname*{tr}\{C_{\operatorname*{z}}^{h}G_{\operatorname*{z}}^{-1}C_{\operatorname*{z}}\}-\log|\det[\,W_{0}\mid P_{i}G_{i}^{-1}\bm{c}_{i}\mid P_{\operatorname*{z}}G_{\operatorname*{z}}^{-1}C_{\operatorname*{z}}\,]|^{2}+\operatorname*{const.},\quad(95)

where W0[𝒘1,,𝒘i1,𝒘i+1,,𝒘K]M×(K1)W_{0}\coloneqq[\,\bm{w}_{1},\ldots,\bm{w}_{i-1},\bm{w}_{i+1},\ldots,\bm{w}_{K}\,]\in\mathbb{C}^{M\times(K-1)}.

Now, let us consider minimizing gg with respect to 𝒄i\bm{c}_{i} and CzC_{\operatorname*{z}}. In (95), the logdet\log\det term can be simplified as

2log|det[W0PiGi1𝒄iPzGz1Cz]|\displaystyle 2\log|\det\begin{bmatrix}W_{0}\mid P_{i}G_{i}^{-1}\bm{c}_{i}\mid P_{\operatorname*{z}}G_{\operatorname*{z}}^{-1}C_{\operatorname*{z}}\end{bmatrix}|
=\displaystyle= 2log|det[W0hPihVi]1[W0hPihVi][W0PiGi1𝒄iPzGz1Cz]|\displaystyle 2\log|\det\begin{bmatrix}W_{0}^{h}\\ P_{i}^{h}V_{i}\end{bmatrix}^{-1}\begin{bmatrix}W_{0}^{h}\\ P_{i}^{h}V_{i}\end{bmatrix}\begin{bmatrix}W_{0}\mid P_{i}G_{i}^{-1}\bm{c}_{i}\mid P_{\operatorname*{z}}G_{\operatorname*{z}}^{-1}C_{\operatorname*{z}}\end{bmatrix}|
=\displaystyle= 2log|det[W0hW0OMK+1,K1𝒄iCz]|+const.\displaystyle 2\log|\det\begin{bmatrix}W_{0}^{h}W_{0}&*&*\\ O_{M-K+1,K-1}&\bm{c}_{i}&C_{\operatorname*{z}}\end{bmatrix}|+\operatorname*{const.}
=\displaystyle= 2log|det[𝒄i,Cz]|+const.\displaystyle 2\log|\det[\,\bm{c}_{i},C_{\operatorname*{z}}\,]|+\operatorname*{const.}

Here, we used ViPi=(W)h[𝒆i,Ez]=VzPzV_{i}P_{i}=\left(W^{\prime}\right)^{-h}[\,\bm{e}_{i},E_{\operatorname*{z}}\,]=V_{\operatorname*{z}}P_{\operatorname*{z}} twice in the second equality. Hence, by applying Proposition 2, gg attains its minimum when

𝒄i\displaystyle\bm{c}_{i} =𝒖i(𝒖ihGi1𝒖i)12,Cz=Uz(UzhGz1Uz)12,\displaystyle=\bm{u}_{i}\left(\bm{u}_{i}^{h}G_{i}^{-1}\bm{u}_{i}\right)^{-\frac{1}{2}},\quad C_{\operatorname*{z}}=U_{\operatorname*{z}}\left(U_{\operatorname*{z}}^{h}G_{\operatorname*{z}}^{-1}U_{\operatorname*{z}}\right)^{-\frac{1}{2}}, (96)
Uz\displaystyle U_{\operatorname*{z}} (MK+1)×(MK)withUzhGz1𝒖i=𝟎,\displaystyle\in\mathbb{C}^{(M-K+1)\times(M-K)}\quad\text{with}\quad U_{\operatorname*{z}}^{h}G_{\operatorname*{z}}^{-1}\bm{u}_{i}=\bm{0}, (97)
𝒖i\displaystyle\bm{u}_{i} (MK+1)×1withGz1𝒖i=λmaxGi1𝒖i,\displaystyle\in\mathbb{C}^{(M-K+1)\times 1}\quad\text{with}\quad G_{\operatorname*{z}}^{-1}\bm{u}_{i}=\lambda_{\max}G_{i}^{-1}\bm{u}_{i}, (98)

where λmax\lambda_{\max} in (98) denotes the largest generalized eigenvalue. Because (97)–(98) are equivalent to (51)–(52) through 𝒃i=Gi1𝒖i\bm{b}_{i}=G_{i}^{-1}\bm{u}_{i} and Bz=Gz1UzB_{\operatorname*{z}}=G_{\operatorname*{z}}^{-1}U_{\operatorname*{z}}, a global optimal solution for (45) can be obtained by

𝒘i\displaystyle\bm{w}_{i} =PiGi1𝒖i(𝒖ihGi1𝒖i)12=Pi𝒃i(𝒃ihGi𝒃i)12,\displaystyle=P_{i}G_{i}^{-1}\bm{u}_{i}\left(\bm{u}_{i}^{h}G_{i}^{-1}\bm{u}_{i}\right)^{-\frac{1}{2}}=P_{i}\bm{b}_{i}(\bm{b}_{i}^{h}G_{i}\bm{b}_{i})^{-\frac{1}{2}},
Wz\displaystyle W_{\operatorname*{z}} =PzGz1Uz(UzhGz1Uz)12=PzBz(BzhGzBz)12\displaystyle=P_{\operatorname*{z}}G_{\operatorname*{z}}^{-1}U_{\operatorname*{z}}\left(U_{\operatorname*{z}}^{h}G_{\operatorname*{z}}^{-1}U_{\operatorname*{z}}\right)^{-\frac{1}{2}}=P_{\operatorname*{z}}B_{\operatorname*{z}}\left(B_{\operatorname*{z}}^{h}G_{\operatorname*{z}}B_{\operatorname*{z}}\right)^{-\frac{1}{2}}

as desired. ∎

References

  • [1] P. Comon and C. Jutten, Handbook of Blind Source Separation: Independent component analysis and applications.   Academic press, 2010.
  • [2] A. Cichocki and S. Amari, Adaptive blind signal and image processing: learning algorithms and applications.   John Wiley & Sons, 2002.
  • [3] T.-W. Lee, “Independent component analysis,” in Independent component analysis.   Springer, 1998, pp. 27–66.
  • [4] J. V. Stone, Independent component analysis: a tutorial introduction.   MIT press, 2004.
  • [5] T. Kim, H. T. Attias, S.-Y. Lee, and T.-W. Lee, “Blind source separation exploiting higher-order frequency dependencies,” IEEE Trans. Audio, Speech, Language Process., vol. 15, no. 1, pp. 70–79, 2007.
  • [6] A. Hiroe, “Solution of permutation problem in frequency domain ICA, using multivariate probability density functions,” in Proc. ICA, 2006, pp. 601–608.
  • [7] Y.-O. Li, T. Adali, W. Wang, and V. D. Calhoun, “Joint blind source separation by multiset canonical correlation analysis,” IEEE Trans. Signal Process., vol. 57, no. 10, pp. 3918–3929, 2009.
  • [8] X.-L. Li, T. Adalı, and M. Anderson, “Joint blind source separation by generalized joint diagonalization of cumulant matrices,” Signal Processing, vol. 91, no. 10, pp. 2314–2322, 2011.
  • [9] M. Anderson, T. Adali, and X.-L. Li, “Joint blind source separation with multivariate Gaussian model: Algorithms and performance analysis,” IEEE Trans. Signal Process., vol. 60, no. 4, pp. 1672–1683, 2011.
  • [10] J. H. Friedman and J. W. Tukey, “A projection pursuit algorithm for exploratory data analysis,” IEEE Trans. Comput., vol. 100, no. 9, pp. 881–890, 1974.
  • [11] P. J. Huber, “Projection pursuit,” The annals of Statistics, pp. 435–475, 1985.
  • [12] J. H. Friedman, “Exploratory projection pursuit,” Journal of the American statistical association, vol. 82, no. 397, pp. 249–266, 1987.
  • [13] J.-F. Cardoso and A. Souloumiac, “Blind beamforming for non-Gaussian signals,” in IEE proceedings F (radar and signal processing), vol. 140, no. 6, 1993, pp. 362–370.
  • [14] N. Delfosse and P. Loubaton, “Adaptive blind separation of independent sources: A deflation approach,” Signal processing, vol. 45, no. 1, pp. 59–83, 1995.
  • [15] S. A. Cruces-Alvarez, A. Cichocki, and S. Amari, “From blind signal extraction to blind instantaneous signal separation: Criteria, algorithms, and stability,” IEEE Trans. Neural Netw., vol. 15, no. 4, pp. 859–873, 2004.
  • [16] S. Amari and A. Cichocki, “Adaptive blind signal processing-neural network approaches,” Proceedings of the IEEE, vol. 86, no. 10, pp. 2026–2048, 1998.
  • [17] A. Hyvärinen and E. Oja, “A fast fixed-point algorithm for independent component analysis,” Neural Comput., vol. 9, no. 7, pp. 1483–1492, 1997.
  • [18] A. Hyvarinen, “Fast and robust fixed-point algorithms for independent component analysis,” IEEE Trans. Neural Netw., vol. 10, no. 3, pp. 626–634, 1999.
  • [19] E. Oja and Z. Yuan, “The FastICA algorithm revisited: Convergence analysis,” IEEE Trans. Neural Netw., vol. 17, no. 6, pp. 1370–1381, 2006.
  • [20] T. Wei, “A convergence and asymptotic analysis of the generalized symmetric FastICA algorithm,” IEEE Trans. Signal Process., vol. 63, no. 24, pp. 6445–6458, 2015.
  • [21] Z. Koldovský and P. Tichavský, “Gradient algorithms for complex non-Gaussian independent component/vector extraction, question of convergence,” IEEE Trans. Signal Process., vol. 67, no. 4, pp. 1050–1064, 2018.
  • [22] Z. Koldovský, P. Tichavský, and V. Kautský, “Orthogonally constrained independent component extraction: Blind MPDR beamforming,” in Proc. EUSIPCO, 2017, pp. 1155–1159.
  • [23] J. Janský, J. Málek, J. Čmejla, T. Kounovský, Z. Koldovský, and J. Žd’ánský, “Adaptive blind audio source extraction supervised by dominant speaker identification using x-vectors,” in Proc. ICASSP, 2020, pp. 676–680.
  • [24] R. Scheibler and N. Ono, “Independent vector analysis with more microphones than sources,” in Proc. WASPAA, 2019, pp. 185–189.
  • [25] ——, “Fast independent vector extraction by iterative SINR maximization,” in Proc. ICASSP, 2020, pp. 601–605.
  • [26] ——, “MM algorithms for joint independent subspace analysis with application to blind single and multi-source extraction,” arXiv:2004.03926v1, 2020.
  • [27] R. Ikeshita, T. Nakatani, and S. Araki, “Overdetermined independent vector analysis,” in Proc. ICASSP, 2020, pp. 591–595.
  • [28] K. Lange, MM optimization algorithms.   SIAM, 2016, vol. 147.
  • [29] N. Ono and S. Miyabe, “Auxiliary-function-based independent component analysis for super-Gaussian sources,” in Proc. LVA/ICA, 2010, pp. 165–172.
  • [30] N. Ono, “Stable and fast update rules for independent vector analysis based on auxiliary function technique,” in Proc. WASPAA, 2011, pp. 189–192.
  • [31] ——, “Fast stereo independent vector analysis and its implementation on mobile phone,” in Proc. IWAENC, 2012, pp. 1–4.
  • [32] ——, “Fast algorithm for independent component/vector/low-rank matrix analysis with three or more sources,” in Proc. ASJ Spring Meeting, 2018, (in Japanese).
  • [33] D.-T. Pham and J.-F. Cardoso, “Blind separation of instantaneous mixtures of nonstationary sources,” IEEE Trans. Signal Process., vol. 49, no. 9, pp. 1837–1848, 2001.
  • [34] S. Dégerine and A. Zaïdi, “Separation of an instantaneous mixture of Gaussian autoregressive sources by the exact maximum likelihood approach,” IEEE Trans. Signal Process., vol. 52, no. 6, pp. 1499–1512, 2004.
  • [35] ——, “Determinant maximization of a nonsymmetric matrix with quadratic constraints,” SIAM J. Optim., vol. 17, no. 4, pp. 997–1014, 2006.
  • [36] A. Yeredor, “On hybrid exact-approximate joint diagonalization,” in Proc. CAMSAP, 2009, pp. 312–315.
  • [37] A. Yeredor, B. Song, F. Roemer, and M. Haardt, “A “sequentially drilled” joint congruence (SeDJoCo) transformation with applications in blind source separation and multiuser MIMO systems,” IEEE Trans. Signal Process., vol. 60, no. 6, pp. 2744–2757, 2012.
  • [38] P. Tseng, “Convergence of a block coordinate descent method for nondifferentiable minimization,” Journal of optimization theory and applications, vol. 109, no. 3, pp. 475–494, 2001.
  • [39] D. Kitamura, N. Ono, H. Sawada, H. Kameoka, and H. Saruwatari, “Determined blind source separation unifying independent vector analysis and nonnegative matrix factorization,” IEEE/ACM Trans. Audio, Speech, Language Process., vol. 24, no. 9, pp. 1622–1637, 2016.
  • [40] H. Kameoka, L. Li, S. Inoue, and S. Makino, “Supervised determined source separation with multichannel variational autoencoder,” Neural computation, vol. 31, no. 9, pp. 1891–1914, 2019.
  • [41] N. Makishima, S. Mogami, N. Takamune, D. Kitamura, H. Sumino, S. Takamichi, H. Saruwatari, and N. Ono, “Independent deeply learned matrix analysis for determined audio source separation,” IEEE/ACM Trans. Audio, Speech, Language Process., vol. 27, no. 10, pp. 1601–1615, 2019.
  • [42] K. Sekiguchi, A. A. Nugraha, Y. Bando, and K. Yoshii, “Fast multichannel source separation based on jointly diagonalizable spatial covariance matrices,” in Proc. EUSIPCO, 2019, pp. 1–5.
  • [43] N. Ito and T. Nakatani, “FastMNMF: Joint diagonalization based accelerated algorithms for multichannel nonnegative matrix factorization,” in Proc. ICASSP, 2019, pp. 371–375.
  • [44] H. L. Van Trees, Optimum array processing: Part IV of detection, estimation, and modulation theory.   John Wiley & Sons, 2004.
  • [45] E. Warsitz and R. Haeb-Umbach, “Blind acoustic beamforming based on generalized eigenvalue decomposition,” IEEE Trans. Audio, Speech, Language Process., vol. 15, no. 5, pp. 1529–1539, 2007.
  • [46] N. Ito, S. Araki, and T. Nakatani, “Data-driven and physical model-based designs of probabilistic spatial dictionary for online meeting diarization and adaptive beamforming,” in Proc. EUSIPCO, 2017, pp. 1165–1169.
  • [47] D. H. Johnson and D. E. Dudgeon, Array signal processing: concepts and techniques.   Simon & Schuster, Inc., 1992.
  • [48] E. Warsitz and R. Haeb-Umbach, “Blind acoustic beamforming based on generalized eigenvalue decomposition,” IEEE Trans. Audio, Speech, Language Process., vol. 15, no. 5, pp. 1529–1539, 2007.
  • [49] A. Khabbazibasmenj, S. A. Vorobyov, and A. Hassanien, “Robust adaptive beamforming based on steering vector estimation with as little as possible prior information,” IEEE Trans. Signal Process., vol. 60, no. 6, pp. 2974–2987, 2012.
  • [50] S. A. Vorobyov, “Principles of minimum variance robust adaptive beamforming design,” Signal Processing, vol. 93, no. 12, pp. 3264–3277, 2013.
  • [51] E. Gómez, M. Gómez-Villegas, and J. Marin, “A multivariate generalization of the power exponential family of distributions,” Communications in Statistics-Theory and Methods, vol. 27, no. 3, pp. 589–600, 1998.
  • [52] S. Amari, A. Cichocki, and H. H. Yang, “A new learning algorithm for blind signal separation,” in Proc. NIPS, 1996, pp. 757–763.
  • [53] J. Nocedal and S. Wright, Numerical optimization.   Springer Science & Business Media, 2006.
  • [54] R. Ikeshita, N. Ito, T. Nakatani, and H. Sawada, “Independent low-rank matrix analysis with decorrelation learning,” in Proc. WASPAA, 2019, pp. 288–292.
  • [55] N. Murata, S. Ikeda, and A. Ziehe, “An approach to blind source separation based on temporal structure of speech signals,” Neurocomputing, vol. 41, no. 1-4, pp. 1–24, 2001.
  • [56] E. Vincent, R. Gribonval, and C. Févotte, “Performance measurement in blind audio source separation,” IEEE Trans. Audio, Speech, Language Process., vol. 14, no. 4, pp. 1462–1469, 2006.
  • [57] S. Nakamura, K. Hiyane, F. Asano, T. Nishiura, and T. Yamada, “Acoustical sound database in real environments for sound scene understanding and hands-free speech recognition,” in LREC, 2000.
  • [58] J. Garofolo, L. Lamel, W. Fisher, J. Fiscus, D. Pallett, N. Dahlgren, and V. Zue, “TIMIT Acoustic-Phonetic Continuous Speech Corpus LDC93S1. Web Download. Philadelphia: Linguistic Data Consortium,” 1993.
  • [59] J. Barker, R. Marxer, E. Vincent, and S. Watanabe, “The third ‘CHiME’ speech separation and recognition challenge: Dataset, task and baselines,” in Proc. ASRU, 2015, pp. 504–511.
  • [60] K. E. Atkinson, An introduction to numerical analysis.   John Wiley & Sons, 2008.
  • [61] T. Nakatani and K. Kinoshita, “Maximum likelihood convolutional beamformer for simultaneous denoising and dereverberation,” in Proc. EUSIPCO, 2019, pp. 1–5.