
Lyapunov exponents for products of truncated orthogonal matrices

Dong Qichao (dqccha@mail.ustc.edu.cn)
Abstract

This article gives a non-asymptotic analysis of the largest Lyapunov exponent of truncated orthogonal matrix products. We prove that as long as $N$, the number of factors in the product, is sufficiently large, the largest Lyapunov exponent is asymptotically Gaussian. Furthermore, the sum of any finite number of the largest Lyapunov exponents is asymptotically Gaussian; here we use Weingarten calculus.

1 Introduction

1.1 Main results

Let $R_i$ be independent Haar-distributed random real orthogonal matrices of size $(l_i+n)\times(l_i+n)$ and let $A_i$ be the top $n\times n$ block of $R_i$, where $l_i>0$. We consider the random matrix products

$$X_{N,n}:=A_N\cdots A_1. \qquad (1)$$

Let $s_1\geq\cdots\geq s_n$ be the singular values of $X_{N,n}$; the Lyapunov exponents of $X_{N,n}$ are defined as

$$\lambda_i=\lambda_i\left(X_{N,n}\right):=\frac{1}{N}\log s_i\left(X_{N,n}\right). \qquad (2)$$

We prove that as long as $N$ is sufficiently large as a function of $n$ and the $l_i$, the largest Lyapunov exponent

$$\lambda_1=\lambda_1\left(X_{N,n}\right):=\frac{1}{N}\log s_1\left(X_{N,n}\right)$$

of $X_{N,n}$ is asymptotically Gaussian (see Theorem 1.1). Our estimates provide quantitative concentration bounds when $N$ is large but finite, even when $n$ grows with $N$.
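To make the setting concrete, here is a minimal numerical sketch in Python (assuming numpy and scipy are available; the helper names are ours). It samples the truncated blocks $A_i$ and tracks $\frac{1}{N}\log\|X_{N,n}(\theta)\|$ along a fixed unit vector $\theta$, which by the small-ball reduction recalled in Section 1.3 differs from $\lambda_1$ by at most $O(\frac{\log n}{N})$ with high probability.

```python
import numpy as np
from scipy.stats import ortho_group

def truncated_orthogonal(n, l, rng=None):
    """Top n x n block of a Haar-distributed (l+n) x (l+n) real orthogonal matrix."""
    R = ortho_group.rvs(dim=l + n, random_state=rng)
    return R[:n, :n]

def log_norm_along_direction(N, n, l, seed=None):
    """(1/N) log ||A_N ... A_1 theta|| for a fixed random unit vector theta.

    The product is never formed explicitly; renormalizing at each step
    avoids numerical underflow for large N.
    """
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal(n)
    theta /= np.linalg.norm(theta)
    total = 0.0
    for _ in range(N):
        theta = truncated_orthogonal(n, l, rng) @ theta
        r = np.linalg.norm(theta)
        total += np.log(r)
        theta /= r
    return total / N
```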

Let us fix some notation. Define $l=\min\limits_{1\leq i\leq N}l_i$ and $L=\max\limits_{1\leq i\leq N}l_i$; furthermore,

$$\mu_{n,l_i}=\mathbb{E}\log\left(\operatorname{Beta}(n/2,l_i/2)\right)=\psi(n/2)-\psi((l_i+n)/2), \qquad (3)$$
$$\sigma_{n,l_i}=\operatorname{Var}\log\left(\operatorname{Beta}(n/2,l_i/2)\right)=\psi_1(n/2)-\psi_1((l_i+n)/2), \qquad (4)$$
$$\mu_{n,l}=\frac{1}{N}\sum_{i=1}^N\mu_{n,l_i}, \qquad (5)$$
$$\Sigma_{n,l}=\frac{1}{N^2}\sum_{i=1}^N\sigma_{n,l_i}. \qquad (6)$$
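These quantities are directly computable with scipy's digamma and trigamma functions; a short sketch (the helper name `mu_sigma` is ours):

```python
import numpy as np
from scipy.special import digamma, polygamma

def mu_sigma(n, ls):
    """Evaluate (3)-(6): per-factor mean/variance of log Beta(n/2, l_i/2),
    then the aggregates mu_{n,l} and Sigma_{n,l}."""
    ls = np.asarray(ls, dtype=float)
    mu_i = digamma(n / 2.0) - digamma((ls + n) / 2.0)               # eq. (3)
    sigma_i = polygamma(1, n / 2.0) - polygamma(1, (ls + n) / 2.0)  # eq. (4)
    N = ls.size
    return mu_i.sum() / N, sigma_i.sum() / N**2                     # eqs. (5)-(6)
```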

Our main results are as follows.

Theorem 1.1.

Suppose $X_{N,n}$ is given as in (1). There exists a constant $C>0$ such that

$$d_{KS}\left(\frac{\lambda_1-\mu_{n,l}}{\Sigma_{n,l}},\mathcal{N}\left(0,1\right)\right)\leq\left(\frac{4C\log^2 n\,\log^2(N/n)\,n(n+l)}{2lN}\right)^{1/2}, \qquad (7)$$

so $\lambda_1$ is approximately Gaussian when $N$ is sufficiently large as a function of $n$ and $l$. Here $\mathcal{N}(\mu,\Sigma)$ denotes a Gaussian with mean $\mu$ and covariance $\Sigma$, and $d_{KS}$ is the Kolmogorov-Smirnov distance.
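Theorem 1.1 is easy to probe numerically. The sketch below uses the exact one-dimensional Beta representation (16)-(17) derived in Section 2 for a fixed direction (so no matrix product is ever formed) and standardizes by the mean and standard deviation of $\frac{1}{2N}\sum_i T_i$; the parameter values are illustrative only.

```python
import numpy as np
from scipy.special import digamma, polygamma
from scipy.stats import kstest

n, l, N, trials = 30, 6, 1000, 4000
rng = np.random.default_rng(0)

# lambda_1 proxy: (1/2N) sum_i log Beta(n/2, l/2), cf. (16)-(17)
lam = np.log(rng.beta(n / 2, l / 2, size=(trials, N))).mean(axis=1) / 2

mean = (digamma(n / 2) - digamma((n + l) / 2)) / 2   # mean of (1/2) log Beta
sd = np.sqrt((polygamma(1, n / 2) - polygamma(1, (n + l) / 2)) / (4 * N))
res = kstest((lam - mean) / sd, "norm")
print(f"KS distance to N(0,1): {res.statistic:.4f}")  # decays as N grows
```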

Furthermore, for the $k$-dimensional case we have

Theorem 1.2.

Suppose $X_{N,n}$ is as in (1) and $k$ is finite. Then the sum of the top $k$ Lyapunov exponents $\lambda_1+\cdots+\lambda_k$ is approximately Gaussian.

Remark: Our results can be extended to truncated unitary matrices.

1.2 Prior work

Furstenberg and Kesten [10] first proved that $\lambda_1$ converges provided that $\mathbb{E}\log_+\left(\|A_i\|\right)<\infty$. Oseledec [19] later proved convergence of the remaining singular values, a result referred to as the multiplicative ergodic theorem. Cohen and Newman [18] studied the behavior of the limit as $N$ approaches infinity. Moreover, the work of Le Page [15] and subsequent work [7] showed that the top Lyapunov exponent of matrix products such as $X_{N,n}$ (not necessarily Gaussian) is asymptotically normal. To the best of our knowledge, all known mathematical proofs of asymptotic normality hold only for fixed finite $n$ and do not include quantitative rates of convergence. For the case we study, we overcome these deficiencies.

When $t=\frac{n}{N}$ is viewed as a time parameter in an interacting particle system, there is also interesting work. A number of remarkable articles [1, 2, 3], and especially [4], establish a correspondence between $t$ and the time parameter in the stochastic evolution of an interacting particle system. This correspondence between singular values of products of complex Ginibre matrices and Dyson Brownian motion (DBM) appears to be originally due to Maurice Duits.

A rigorous analysis of the determinantal kernel for the joint distribution of singular values of products of complex Gaussian matrices was undertaken in a variety of articles. In particular, [16] shows that when $N$ is arbitrary but fixed and $n\rightarrow\infty$, the determinantal kernel for singular values of products of $N$ iid complex Gaussian matrices of size $n\times n$ converges to the familiar sine and Airy kernels that arise in the local spectral statistics of large GUE matrices in the bulk and at the edge, respectively. Moreover, [16] rigorously obtained an expression for the limiting determinantal kernel when $t=\frac{n}{N}$ is arbitrary in the context of products of complex Ginibre matrices; there, the fluctuations of the singular values of $X_{N,n}$ around the triangle law converge to a Gaussian field.

We also refer the reader to [5], which obtains a CLT for linear statistics of the top singular values when $n/N$ is fixed and finite. For the real Gaussian case, [11] provides a non-asymptotic analysis of the singular values, which inspired this article.

1.3 Strategy of proof

Basically, we follow the strategy of [11]. By reduction to small-ball estimates for volumes of random projections (Proposition 8.1 and Lemma 8.2 in [11]), we can estimate the difference

$$\frac{1}{N}\log\left\|X_{N,n}(\Theta)\right\|-\sup_{\Theta'}\frac{1}{N}\log\left\|X_{N,n}(\Theta')\right\|=\frac{1}{N}\log\left\|X_{N,n}(\Theta)\right\|-\lambda_1. \qquad (8)$$

Proposition 1.3 ([11], Proposition 8.1).

There exists $C>0$ with the following property. For any $\varepsilon\in(0,1)$ and any $\Theta\in\operatorname{Fr}_{n,k}$ we have

$$\mathbb{P}\left(\left|\frac{1}{N}\log\left\|X_{N,n}(\Theta)\right\|-\sup_{\Theta'\in\operatorname{Fr}_{n,k}}\frac{1}{N}\log\left\|X_{N,n}\left(\Theta'\right)\right\|\right|\geq\frac{k}{2N}\log\left(\frac{n}{k\varepsilon^2}\right)\right)\leq(C\varepsilon)^{k/2}. \qquad (9)$$

We use a high-dimensional version of the Kolmogorov-Smirnov distance to measure normality:

$$d(X,Y):=\sup_{C\in\mathcal{C}_k}|\mathbb{P}(X\in C)-\mathbb{P}(Y\in C)|,$$

where $\mathcal{C}_k$ denotes the class of convex subsets of $\mathbb{R}^k$. We have the following:

Proposition 1.4 ([11], Proposition 6.3).

There exists $c>0$ with the following property. Suppose that $X,Y$ are $\mathbb{R}^k$-valued random variables defined on the same probability space. For all $\mu\in\mathbb{R}^k$, invertible symmetric matrices $\Sigma\in\mathrm{Sym}_k^+$, and $\delta>0$ we have

$$d(X+Y,\mathcal{N}(\mu,\Sigma))\leq 3\,d(X,\mathcal{N}(\mu,\Sigma))+c\delta\sqrt{\left\|\Sigma^{-1}\right\|_{HS}}+2\,\mathbb{P}\left(\|Y\|_2>\delta\right).$$

We follow the notation of Bentkus [6], specialized later to the one-dimensional case, and define

$$S:=S_N=X_1+\cdots+X_N,$$

where $X_1,\ldots,X_N$ are independent random variables in $\mathbb{R}^k$ with common mean $\mathbb{E}X_j=0$. We set

$$C:=\operatorname{cov}(S)$$

to be the covariance matrix of $S$, which is assumed to be invertible. With the definition

$$\beta_j:=\mathbb{E}\left\|C^{-\frac{1}{2}}X_j\right\|_2^3,\qquad\beta:=\sum_{j=1}^N\beta_j,$$

we have the following:

Theorem 1.5 ([6], Theorem 1.1).

There exists an absolute constant $c>0$ such that

$$d\left(S,C^{\frac{1}{2}}Z\right)\leq ck^{\frac{1}{4}}\beta, \qquad (10)$$

where $Z\sim\mathcal{N}\left(0,\mathrm{Id}_k\right)$ denotes a standard Gaussian on $\mathbb{R}^k$.

We use Proposition 1.4 and Theorem 1.5 to derive Theorem 1.1.

For Theorem 1.2, we use integration with respect to the Haar measure on the orthogonal group, the so-called Weingarten calculus, to estimate moments of the sum of Lyapunov exponents; this is a new technique in this area.

2 Proof of Theorem 1.1

Each $A_i$, $i=1,\ldots,N$, is drawn from a rotationally invariant ensemble, so Lemma 8.2 and Proposition 8.1 of [11] still apply. Following [11, Lemma 9.5], a standard technique in this area, we have

$$\log\left\|X_{N,n}(\Theta)\right\|=\log\left\|A_N\cdots A_1(\Theta)\right\| \qquad (11)$$
$$=\log\left\|A_N\cdots A_2\frac{A_1(\Theta)}{\left\|A_1(\Theta)\right\|}\right\|+\log\left\|A_1(\Theta)\right\|. \qquad (12)$$

Moreover, $A_2\left(\frac{A_1(\Theta)}{\left\|A_1(\Theta)\right\|}\right)$ is independent of $A_3,\ldots,A_N$ and

$$A_2\left(\frac{A_1(\Theta)}{\left\|A_1(\Theta)\right\|}\right)\stackrel{d}{=}A_2(\Theta); \qquad (13)$$

the above equations use the rotational invariance of the $A_i$, $i=1,\ldots,N$.
In conclusion, we have

$$\log\left\|X_{N,n}(\Theta)\right\|\stackrel{d}{=}\sum_{i=1}^N\log\left\|A_i(\Theta)\right\|. \qquad (14)$$

To carry out a precise calculation, we take $\Theta$ to be the standard orthogonal basis and consider the case where $\Theta$ is one-dimensional.
Here we recall a well-known fact: a Haar-distributed orthogonal matrix can be obtained from the Gram-Schmidt transformation of a Ginibre matrix, whose entries are iid real standard Gaussian variables; see [9] for more details. Then

$$\log\left\|A_i(\Theta)\right\|=\log\|\zeta_i\|, \qquad (15)$$

where $\zeta_i$ is the first row of $A_i$. Furthermore,

$$\|\zeta_i\|^2=\frac{x_1^2+\cdots+x_n^2}{\|\vec{x}\|^2},$$

where $\vec{x}=(x_1,\ldots,x_{n+l_i})$ is a vector of independent standard Gaussian variables. Then we have

$$\|\zeta_i\|^2\sim\operatorname{Beta}(n/2,l_i/2),$$

since if $X\sim\operatorname{Gamma}(\alpha,\theta)$ and $Y\sim\operatorname{Gamma}(\beta,\theta)$ are independent, then $\frac{X}{X+Y}\sim\operatorname{Beta}(\alpha,\beta)$.
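This reduction is easy to verify empirically. A sketch (comparing the squared norm of the first row of the truncated block against the Beta law; parameter values are illustrative):

```python
import numpy as np
from scipy.stats import ortho_group, beta, kstest

n, l, trials = 6, 4, 2000
rng = np.random.default_rng(1)
samples = np.empty(trials)
for t in range(trials):
    R = ortho_group.rvs(dim=n + l, random_state=rng)
    samples[t] = np.sum(R[0, :n] ** 2)  # ||zeta||^2: first n entries of a Haar row
res = kstest(samples, beta(n / 2, l / 2).cdf)
print(res.pvalue)  # a large p-value is consistent with Beta(n/2, l/2)
```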
According to [11] Lemma 9.5, we have in distribution that

$$\frac{2}{N}\log\left\|X_{N,n}(\Theta)\right\|=\frac{1}{N}\sum_{i=1}^N T_i, \qquad (16)$$

where the $T_i$ are independent and

$$T_i\sim\log\left(\operatorname{Beta}(n/2,l_i/2)\right). \qquad (17)$$

We already know that

$$\mu_{n,l_i}=\mathbb{E}\log\left(\operatorname{Beta}(n/2,l_i/2)\right)=\psi(n/2)-\psi((l_i+n)/2), \qquad (18)$$
$$\sigma_{n,l_i}=\operatorname{Var}\log\left(\operatorname{Beta}(n/2,l_i/2)\right)=\psi_1(n/2)-\psi_1((l_i+n)/2), \qquad (19)$$

where $\psi(z)=\frac{\mathrm{d}}{\mathrm{d}z}\log\Gamma(z)=\frac{\Gamma'(z)}{\Gamma(z)}$ and $\psi_1(z)=\frac{\mathrm{d}}{\mathrm{d}z}\psi(z)$.

We find in particular that

$$\mathbb{E}\left[\frac{1}{N}\log\left\|X_{N,n}(\Theta)\right\|\right]=\mu_{n,l}. \qquad (20)$$

For any random variable $Y$ we will use the shorthand

$$\bar{Y}:=Y-\mathbb{E}[Y].$$

We will apply Markov's inequality to bound the tail of the sum of the $T_i$'s, so we first establish the following moment estimate.

Proposition 2.1.

There exists a universal constant $C$ so that for any $n,N,l$ and $p\geq 1$,

$$\left(\mathbb{E}\left[\left|\sum_{i=1}^N\bar{T}_i\right|^p\right]\right)^{1/p}\leq C\left(\sqrt{p\sum_{j=1}^N\frac{1}{M_j}}+\frac{p}{M_N}\right)\leq C\left(\sqrt{\frac{pN}{M}}+\frac{p}{M}\right), \qquad (21)$$

where $M_N=\frac{n^2(n+l_N+2)}{(n+l_N)^2}$ and $M=\frac{n^2(n+L+2)}{(n+L)^2}$.

Lemma 2.2.

There exists a universal constant $C$ so that

$$\left(\mathbb{E}\left[\left|\bar{T}_i\right|^p\right]\right)^{1/p}\leq C\sqrt{\frac{p}{M_i}}\leq C\sqrt{\frac{p}{M}}. \qquad (22)$$

Proof.

We first make a reduction: we verify that the estimate of Lemma 2.2 for $\bar{T}_i=T_i-\mathbb{E}\left[\log\left(\operatorname{Beta}(n/2,l_i/2)\right)\right]$ can be derived directly from the corresponding estimate for

$$\widehat{T}_i:=T_i-\log\left(\frac{n}{n+l_i}\right).$$

We have

$$\mathbb{E}\left[\log\left(\operatorname{Beta}(n/2,l_i/2)\right)\right]=\psi(n/2)-\psi((l_i+n)/2)=\log\left(\frac{n}{n+l_i}\right)+\varepsilon,\quad\varepsilon=O\left(n^{-1}\right),$$

where $\psi$ is the digamma function and we have used its asymptotic expansion $\psi(z)=\log(z)+O\left(z^{-1}\right)$ for large arguments. Thus we have for each $i$ that

$$\mathbb{E}\left[\left|\bar{T}_i\right|^p\right]=\mathbb{E}\left[\left|\widehat{T}_i+\varepsilon\right|^p\right]\leq\sum_{k=0}^p\binom{p}{k}\mathbb{E}\left[\left|\widehat{T}_i\right|^k\right]|\varepsilon|^{p-k}.$$

So assuming that $\widehat{T}_i$ satisfies the conclusion of Lemma 2.2, we find

$$\mathbb{E}\left[\left|\bar{T}_i\right|^p\right]\leq\sum_{k=0}^p\binom{p}{k}\zeta_k^k|\varepsilon|^{p-k},\quad\zeta_k:=C\sqrt{\frac{k}{M}}.$$

Since $\zeta_k\leq\zeta_p$ for $0\leq k\leq p$, we see that

$$\mathbb{E}\left[\left|\bar{T}_i\right|^p\right]\leq\sum_{k=0}^p\binom{p}{k}\zeta_p^k|\varepsilon|^{p-k}\leq\left(\zeta_p+|\varepsilon|\right)^p.$$

Finally, since $\varepsilon=O\left(n^{-1}\right)=o\left(\zeta_p\right)$, we find that there exists $C>0$ so that

$$\left(\mathbb{E}\left[\left|\bar{T}_i\right|^p\right]\right)^{1/p}\leq C\zeta_p\leq C\sqrt{\frac{p}{M_i}},$$

as desired. It therefore remains to show that $\widehat{T}_i=T_i-\log\left(\frac{n}{n+l_i}\right)$ satisfies the conclusion of Lemma 2.2. To do this, we begin by checking that, with $M_i:=\frac{n^2(n+l_i+2)}{(n+l_i)^2}$, for all $s\geq 0$,

$$\mathbb{P}\left(\left|T_i-\log\left(\frac{n}{n+l_i}\right)\right|\geq s\right)\leq 2e^{-M_is^2}. \qquad (23)$$

We have

$$\mathbb{P}\left(\left|T_i-\log\left(\frac{n}{n+l_i}\right)\right|\geq s\right)=\mathbb{P}\left(\left|\log\left(\operatorname{Beta}(n/2,l_i/2)\right)-\log\left(\frac{n}{n+l_i}\right)\right|\geq s\right)$$
$$=\mathbb{P}\left(\left|\log\left(\frac{n+l_i}{n}\operatorname{Beta}(n/2,l_i/2)\right)\right|\geq s\right)$$
$$=\mathbb{P}\left(\operatorname{Beta}(n/2,l_i/2)\geq\frac{n}{n+l_i}e^s\right)+\mathbb{P}\left(\operatorname{Beta}(n/2,l_i/2)\leq\frac{n}{n+l_i}e^{-s}\right).$$

Let us first bound $\mathbb{P}\left(\operatorname{Beta}(n/2,l_i/2)\geq\frac{n}{n+l_i}e^s\right)$. Notice that the mean of $\operatorname{Beta}(n/2,l_i/2)$ is $\frac{n}{n+l_i}$ and that Beta random variables are sub-Gaussian, so we can use Bernstein-type tail estimates for Beta variables. According to [20, Theorem 1]:

Theorem 2.3 ([20], Theorem 1).

Let $X\sim\operatorname{Beta}(\alpha,\beta)$. Define the parameters

$$v\triangleq\frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)},\qquad c\triangleq\frac{2(\beta-\alpha)}{(\alpha+\beta)(\alpha+\beta+2)}.$$

Then the upper tail of $X$ is bounded as

$$\mathbf{P}\{X>\mathbf{E}[X]+\epsilon\}\leq\begin{cases}\exp\left(-\frac{\epsilon^2}{2\left(v+\frac{c\epsilon}{3}\right)}\right),&\beta\geq\alpha,\\ \exp\left(-\frac{\epsilon^2}{2v}\right),&\beta<\alpha,\end{cases}$$

and the lower tail of $X$ is bounded as

$$\mathbf{P}\{X<\mathbf{E}[X]-\epsilon\}\leq\begin{cases}\exp\left(-\frac{\epsilon^2}{2\left(v+\frac{c\epsilon}{3}\right)}\right),&\alpha\geq\beta,\\ \exp\left(-\frac{\epsilon^2}{2v}\right),&\alpha<\beta.\end{cases}$$

For $X\sim\operatorname{Beta}(n/2,l_i/2)$ we have $v=\frac{2nl_i}{(n+l_i)^2(n+l_i+2)}\leq\frac{1}{2(n+l_i+2)}$, so

$$\mathbf{P}\{X>\mathbf{E}[X]+\epsilon\}\leq\exp\left(-\frac{\epsilon^2}{2v}\right).$$

Thus

$$\mathbb{P}\left(\operatorname{Beta}(n/2,l_i/2)\geq\frac{n}{n+l_i}e^s\right)=\mathbb{P}\left(\operatorname{Beta}(n/2,l_i/2)-\frac{n}{n+l_i}\geq\frac{n}{n+l_i}\left(e^s-1\right)\right)\leq\exp\left(-M_is^2\right),$$

where we used $e^s-1\geq s$ and the definition $M_i=\frac{n^2(n+l_i+2)}{(n+l_i)^2}$. The term $\mathbb{P}\left(\operatorname{Beta}(n/2,l_i/2)\leq\frac{n}{n+l_i}e^{-s}\right)$ is bounded similarly, so we have

$$\mathbb{P}\left(\left|T_i-\log\left(\frac{n}{n+l_i}\right)\right|\geq s\right)\leq 2e^{-M_is^2}.$$
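A quick Monte Carlo sanity check of (23); the parameter values are illustrative, and the bound is of course far from tight for small $s$:

```python
import numpy as np

n, l = 400, 100
M = n**2 * (n + l + 2) / (n + l) ** 2
rng = np.random.default_rng(2)
T = np.log(rng.beta(n / 2, l / 2, size=500_000))
center = np.log(n / (n + l))
for s in (0.02, 0.05, 0.10):
    emp = np.mean(np.abs(T - center) >= s)
    print(f"s={s:.2f}: empirical {emp:.2e} <= bound {2 * np.exp(-M * s**2):.2e}")
```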

To complete the proof of Lemma 2.2, we write

$$\mathbb{E}\left[\left|\widehat{T}_i\right|^p\right]=\int_0^\infty\mathbb{P}\left(\left|\widehat{T}_i\right|>x\right)px^{p-1}\,dx\leq p\int_0^\infty e^{-cM_ix^2}x^{p-1}\,dx,$$

and the last term can be estimated by comparison with Gaussian moments:

$$p\int_0^\infty e^{-cM_ix^2}x^{p-1}\,dx=p\left(2cM_i\right)^{-p/2}\int_0^\infty e^{-x^2/2}x^{p-1}\,dx\leq p\left(2cM_i\right)^{-p/2}2^{\frac{p}{2}}\Gamma\left(\frac{p}{2}\right)\leq p\left(\frac{p}{2cM_i}\right)^{p/2}, \qquad (24)$$

where we used that $\Gamma(z)\leq z^z$ for $z>0$. Taking $1/p$-th powers, we find that there exists $C>0$ so that

$$\left(\mathbb{E}\left[\left|\widehat{T}_i\right|^p\right]\right)^{1/p}\leq C\sqrt{\frac{p}{M_i}}\leq C\sqrt{\frac{p}{M}}, \qquad (25)$$

for all $p\geq 1$, since $M_i$ is a decreasing function of $l_i$. This completes the proof.

We are now in a position to prove Proposition 2.1 with Lemma 2.2 in hand. We will need the following result of R. Latała [14].

Theorem 2.4 ([14]).

Let $X_1,\ldots,X_N$ be mean-zero, independent random variables and $p\geq 1$. Then

$$\left(\mathbb{E}\left[\left|\sum_{j=1}^N X_j\right|^p\right]\right)^{\frac{1}{p}}\simeq\inf\left\{t>0:\sum_{j=1}^N\log\left[\mathbb{E}\left|1+\frac{X_j}{t}\right|^p\right]\leq p\right\},$$

where $a\simeq b$ means there exist universal constants $c_1,c_2$ so that $c_1a\leq b\leq c_2a$. Moreover, if the $X_i$ are also identically distributed, then

$$\left(\mathbb{E}\left[\left|\sum_{j=1}^N X_j\right|^p\right]\right)^{\frac{1}{p}}\simeq\sup_{\max\left\{2,\frac{p}{N}\right\}\leq s\leq p}\frac{p}{s}\left(\frac{N}{p}\right)^{\frac{1}{s}}\left\|X_i\right\|_s.$$

We know that

$$\left(\mathbb{E}\left[\left|\sum_{i=1}^N\bar{T}_i\right|^p\right]\right)^{1/p}\simeq\inf\left\{t>0:\sum_{j=1}^N\log\left[\mathbb{E}\left|1+\frac{\bar{T}_j}{t}\right|^p\right]\leq p\right\}. \qquad (26)$$

We will use the notation

$$p_0=M_N^2\sum_{j=1}^N M_j^{-1}.$$

Since

$$\sqrt{p\sum_{j=1}^N M_j^{-1}}\leq\frac{p}{M_N}\quad\Longleftrightarrow\quad p\geq p_0, \qquad (27)$$

we will show that there exists $C>0$ so that

$$p\leq p_0\quad\Longrightarrow\quad\left(\mathbb{E}\left|\sum_{i=1}^N\bar{T}_i\right|^p\right)^{\frac{1}{p}}\leq C\sqrt{p\sum_{j=1}^N M_j^{-1}}, \qquad (28)$$

as well as

$$p\geq p_0\quad\Longrightarrow\quad\left(\mathbb{E}\left|\sum_{i=1}^N\bar{T}_i\right|^p\right)^{\frac{1}{p}}\leq C\frac{p}{M_N}. \qquad (29)$$

We may assume without loss of generality that $p$ is even, since we can use higher even moments to control odd moments. Here we recall the well-known estimate

$$\left(\frac{n}{k}\right)^k\leq\binom{n}{k}\leq\left(\frac{n}{k}\right)^ke^k,\quad k\geq 1.$$

Since $p$ is even, we can drop the absolute value:

$$\mathbb{E}\left[\left(1+\frac{\bar{T}_i}{t}\right)^p\right]=1+\sum_{l=2}^p\binom{p}{l}\frac{\mathbb{E}\left[\bar{T}_i^l\right]}{t^l}\leq 1+\sum_{l=2}^p\binom{p}{l}\frac{1}{t^l}\left(C\sqrt{\frac{l}{M}}\right)^l\leq 1+\sum_{l=2}^p\left(\frac{p}{l}\right)^{\frac{l}{2}}\left(\frac{Cp}{t^2M}\right)^{\frac{l}{2}}. \qquad (30)$$

We now bound the sum in the previous line by splitting into the terms where $l$ is even and where $l$ is odd. When $l$ is even, the corresponding part of (30) can be bounded by

$$1+\sum_{\substack{l=2\\ l\text{ even}}}^p\left(\frac{p}{l}\right)^{\frac{l}{2}}\left(\frac{Cp}{t^2M}\right)^{\frac{l}{2}}\leq 1+\sum_{\substack{l=2\\ l\text{ even}}}^p\binom{p/2}{l/2}\left(\frac{Cp}{t^2M}\right)^{\frac{l}{2}}\leq\left(1+\frac{Cp}{t^2M}\right)^{\frac{p}{2}}. \qquad (31)$$

When $l$ is odd, we use the fact that for any $0\leq m\leq\ell\leq p$,

$$\left(\frac{p}{\ell}\right)^{\ell}\leq p^m\binom{p}{\ell-m}. \qquad (32)$$

Thus the odd part of (30) is bounded by

$$\sum_{\substack{\ell=3\\ \ell\text{ odd}}}^p\left(\frac{p}{\ell}\right)^{\ell/2}\left(\frac{Cp}{t^2M}\right)^{\ell/2}\leq\min_{m=1,3}\left\{\left(\frac{Cp^2}{t^2M}\right)^{\frac{m}{2}}\sum_{\substack{\ell=3\\ \ell\text{ odd}}}^{p}\binom{p}{\ell-m}^{\frac{1}{2}}\left(\frac{Cp}{t^2M}\right)^{\frac{\ell-m}{2}}\right\}. \qquad (33)$$

To proceed, note that for any $0\leq b\leq a$,

$$\binom{2a}{2b}\leq 2^b\binom{a}{b}^2.$$

This inequality follows by observing that for any $j=0,\ldots,b-1$ we have

$$\frac{(2a-2j)(2a-2j-1)}{(2b-2j)(2b-2j-1)}=\frac{(a-j)(a-j-1/2)}{(b-j)(b-j-1/2)}\leq 2\left(\frac{a-j}{b-j}\right)^2,$$

and repeatedly applying this estimate to the factors of $\binom{2a}{2b}$. Thus we obtain

$$\sum_{\substack{\ell=3\\ \ell\text{ odd}}}^p\left(\frac{p}{\ell}\right)^{\ell/2}\left(\frac{Cp}{t^2M}\right)^{\ell/2}\leq\min_{m=1,3}\left\{\left(\frac{Cp^2}{t^2M}\right)^{\frac{m}{2}}\right\}\sum_{l=0}^{p/2}\binom{p/2}{l}\left(\frac{Cp}{t^2M}\right)^l=\min_{m=1,3}\left\{\left(\frac{Cp^2}{t^2M}\right)^{\frac{m}{2}}\right\}\left(1+\frac{Cp}{t^2M}\right)^{p/2}\leq\left(\frac{Cp^2}{t^2M}\right)\left(1+\frac{Cp}{t^2M}\right)^{p/2}, \qquad (34)$$

where in the last inequality we used that $\min\left\{x^{1/2},x^{3/2}\right\}\leq x$ for all $x\geq 0$. In conclusion, we see that there exists $C>0$ so that

$$\mathbb{E}\left[\left(1+\frac{\bar{T}_i}{t}\right)^p\right]\leq\left(1+\frac{Cp^2}{t^2M}\right)\left(1+\frac{Cp}{t^2M}\right)^{p/2}. \qquad (35)$$

Hence, since $\log(a+b)\leq\log a+b$ for $a\geq 1$ and $b>0$,

$$\sum_{j=1}^N\log\mathbb{E}\left[\left(1+\frac{\bar{T}_j}{t}\right)^p\right]\leq\frac{p}{2}\sum_{j=1}^N\log\left(1+\frac{Cp}{t^2M_j}\right)+\sum_{j=1}^N\log\left(1+\frac{Cp^2}{t^2M_j}\right)\leq\frac{p}{2}\sum_{j=1}^N\frac{Cp}{t^2M_j}+\sum_{j=1}^N\frac{Cp^2}{t^2M_j}. \qquad (36)$$

When $p\leq p_0$, set $t=\sqrt{C'p\sum_{i=1}^N M_i^{-1}}$ with

$$C'=\max\left\{(16C)^2,2C^{1/2}\right\},$$

and when $p\geq p_0$, set

$$t=\frac{C'p}{M_N}$$

with

$$C'=\max\left\{4C,2C^{1/2}\right\}.$$

Then the right-hand side of (36) is at most $p$, which completes the proof of Proposition 2.1.

Proposition 2.5.

There exists a universal constant $c>0$ with the following property. For any fixed vector $\theta\in\mathbb{R}^n$ we have

$$\mathbb{P}\left(\left|\frac{1}{N}\log\left\|X_{N,n}(\theta)\right\|-\mu_{n,l}\right|\geq s\right)\leq 2\exp\left\{-cN\min\{\hat{M}s^2,M_Ns\}\right\},\quad s>0. \qquad (37)$$

Proof.

Proposition 2.5 is equivalent to the statement that for any $s>0$,

$$\mathbb{P}\left(\left|\frac{1}{N}\sum_{i=1}^N\bar{T}_i\right|\geq s\right)\leq 2\exp\left\{-cN\min\{\hat{M}s^2,M_Ns\}\right\}. \qquad (38)$$

We remind the reader that we use the notation

$$p_0=M_N^2\sum_{j=1}^N M_j^{-1},$$

and note that

$$\sqrt{p\sum_{j=1}^N M_j^{-1}}\leq\frac{p}{M_N}\quad\Longleftrightarrow\quad p\geq p_0.$$

Thus, applying Markov's inequality to Proposition 2.1 shows that there exists $C>0$ so that for $1\leq p\leq p_0$,

$$\mathbb{P}\left(\left|\frac{1}{N}\sum_{i=1}^N\bar{T}_i\right|\geq\frac{C}{N}\sqrt{p\sum_{j=1}^N\frac{1}{M_j}}\right)\leq e^{-p}. \qquad (39)$$

Equivalently, setting

$$\hat{M}=\left(\frac{1}{N}\sum_{j=1}^N\frac{1}{M_j}\right)^{-1},$$

the harmonic mean of the $M_j$, we see that there exists $c>0$ so that

$$\mathbb{P}\left(\left|\frac{1}{N}\sum_{i=1}^N\bar{T}_i\right|\geq s\right)\leq 2e^{-cN\hat{M}s^2},\quad 0\leq s\leq C\frac{M_N}{\hat{M}}. \qquad (40)$$

This establishes (37) in this range of $s$. To treat $s\geq C\frac{M_N}{\hat{M}}$, we again apply Markov's inequality to Proposition 2.1 to see that there exists $C>0$ so that

$$p\geq p_0\Longrightarrow\mathbb{P}\left(\left|\sum_{i=1}^N\bar{T}_i\right|>C\frac{p}{M_N}\right)\leq e^{-p}.$$

Hence, there exists $c>0$ so that

$$\mathbb{P}\left(\left|\frac{1}{N}\sum_{i=1}^N\bar{T}_i\right|\geq s\right)\leq e^{-cNM_Ns},\quad s\geq C\frac{M_N}{\hat{M}}, \qquad (41)$$

completing the proof.

Turning to the probability in (9), recall from [11, Proposition 8.1] that for every $\varepsilon\in(0,1)$,

$$\mathbb{P}\left(\left|\frac{1}{nN}\log\left\|X_{N,n}(\Theta)\right\|-\frac{1}{n}\sum_{i=1}^k\lambda_i\right|\geq\frac{k}{2Nn}\log\left(\frac{n}{k\varepsilon^2}\right)\right)\leq(C\varepsilon)^{\frac{k}{2}}. \qquad (42)$$

If we set $s:=\frac{k}{nN}\log\frac{en}{k\varepsilon^2}$, then

$$(C\varepsilon)^{k/2}=\exp\left[-\frac{1}{4}snN+\frac{k}{4}\log\left(\frac{en}{k}\right)+\frac{k}{2}\log(C)\right].$$

Hence, assuming that

$$s\geq C'\frac{k}{nN}\log\left(\frac{en}{k}\right)$$

for $C'$ sufficiently large, we arrive at the following:

$$\mathbb{P}\left(\left|\frac{1}{n}\sum_{i=1}^k\lambda_i-\frac{\log\left\|X_{N,n}(\Theta)\right\|}{nN}\right|\geq s\right)\leq e^{-\frac{snN}{4}},\quad s\geq C'\frac{k}{nN}\log\left(\frac{en}{k}\right). \qquad (43)$$

Now we prove Theorem 1.1, following the proof of Theorem 1.3 in [11]. Let $\mu_n$ be the mean of $\frac{1}{2}\log\left(\operatorname{Beta}(n/2,l/2)\right)$ and $\Sigma_{n,l}=\frac{1}{N}\sigma_{n,l}^2$, where $\sigma_{n,l}^2$ is its variance. By Proposition 1.4,

$$d\left(\lambda_1,\mathcal{N}\left(\mu_n,\Sigma_{n,l}\right)\right)\leq 3d\left(\widehat{S}_1,\mathcal{N}\left(\mu_n,\Sigma_{n,l}\right)\right)+c_0\delta\left\|\left(\Sigma_{n,l}\right)^{-1}\right\|_{HS}^{1/2}+2\mathbb{P}\left(\left\|S_1-\widehat{S}_1\right\|>\delta\right), \qquad (44)$$

where $\widehat{S}_1=\frac{1}{N}\log\left\|X_{N,n}(\Theta)\right\|$ and $S_1=\lambda_1$. For part 1, we use Theorem 1.5 with $k=1$:

$$\frac{1}{N}\log\left\|X_{N,n}(\Theta)\right\|=\frac{1}{N}\sum_{i=1}^N Y_i,$$

where $Y_i\sim\frac{1}{2}\log\operatorname{Beta}(n/2,l/2)$ and each $Y_i$ is a log-concave random variable. Furthermore,

$$\left(\mathbb{E}\|(\Sigma_{n,l})^{-\frac{1}{2}}\bar{Y_j}\|_2^3\right)^{\frac{1}{3}}\leq C\left(\mathbb{E}\|(\Sigma_{n,l})^{-\frac{1}{2}}\bar{Y_j}\|_2^2\right)^{\frac{1}{2}}=C. \qquad (45)$$

Thus we have

$$\beta_i=\frac{1}{N^{\frac{3}{2}}}\mathbb{E}\|(\Sigma_{n,l})^{-\frac{1}{2}}\bar{Y_j}\|_2^3\leq\frac{C^3}{N^{\frac{3}{2}}},\quad 1\leq i\leq N. \qquad (46)$$

Therefore,

$$\beta:=\sum_{j=1}^N\beta_j\leq\frac{C^3}{N^{\frac{1}{2}}},$$

and we conclude that there exists an absolute constant $c>0$ so that

$$d\left(\widehat{S}_1,\mathcal{N}\left(\mu_n,\Sigma_{n,l}\right)\right)\leq cN^{-1/2}. \qquad (47)$$

For part 2,

$$\left\|\left(\Sigma_{n,l}\right)^{-1}\right\|_{HS}\leq\frac{N}{\psi_1(n/2)-\psi_1((n+l)/2)}\sim\frac{n(n+l)N}{2l}, \qquad (48)$$

where we used the asymptotic expansion $\psi_1(z)=\frac{1}{z}+O\left(z^{-2}\right)$ for large arguments and the fact that $\psi_1(z)$ is a decreasing function for $z>0$. Then

$$\left\|\left(\Sigma_{n,l}\right)^{-1}\right\|_{HS}^{\frac{1}{2}}\sim\sqrt{\frac{n(n+l)N}{2l}}. \qquad (49)$$

For part 3, the argument is the same as in [11]. For any $\delta>C\frac{1}{N}\log(en)$, we have

$$\mathbb{P}\left(\left\|S_1-\widehat{S}_1\right\|_2\geq\delta\right)\leq 2e^{-\delta N/4}.$$

Setting

$$\delta:=\frac{C}{N}\log(en)\log\left(\frac{N}{n}\right)$$

for a sufficiently large constant $C$, we find

$$\mathbb{P}\left(\left|S_1-\widehat{S}_1\right|\geq\delta\right)\leq 2e^{-C\log(en)\log(N/n)}\leq 2(n/N)^{1/2}. \qquad (50)$$

Hence, as soon as $N>n$, we have

$$\mathbb{P}\left(\left\|S_1-\widehat{S}_1\right\|_2\geq\delta\right)\leq C\left(\frac{n}{N}\right)^{1/2}, \qquad (51)$$

where

$$\delta\leq\frac{C\log(n)\log(N/n)}{N}.$$

In conclusion,

$$d\left(\lambda_1,\mathcal{N}\left(\mu_n,\Sigma_{n,l}\right)\right)\leq C\sqrt{\frac{1}{N}}+\left(\frac{C\log^2n\,\log^2(N/n)\,n(n+l)}{2lN}\right)^{1/2}+C\left(\frac{n}{N}\right)^{1/2} \qquad (52)$$
$$\leq\left(\frac{4C\log^2n\,\log^2(N/n)\,n(n+l)}{2lN}\right)^{1/2}. \qquad (53)$$

3 Proof of Theorem 1.2

For the $k$-dimensional case, we remind the reader that

$$A_i(\Theta)=A_i\theta_1\wedge\cdots\wedge A_i\theta_k, \qquad (54)$$

where $\Theta=(\theta_1,\ldots,\theta_k)$ is a fixed $k$-frame in $\mathbb{R}^n$. Here we take $\Theta$ to be the standard $k$-frame; then the Gram identity reads

$$\left\|A(\Theta)\right\|^2=\det(A_{(k)}^*A_{(k)})=\sum_{\sigma\in S_k}(-1)^{|\sigma|}\sum_{j_1,j_2,\ldots,j_k=1}^n\prod_{i=1}^kA_{j_ii}A_{j_i\sigma(i)}, \qquad (55)$$

where $A_{(k)}$ denotes the matrix formed by the first $k$ columns of $A$, and the second equality comes from the definition of the determinant. With some observation, we have the following lemma.

Lemma 3.1.

For any $k$,

$$Z=\det(A_{(k)}^*A_{(k)})=\sum_{\sigma\in S_k}(-1)^{|\sigma|}\sum_{j_1\neq j_2\neq\cdots\neq j_k}\prod_{i=1}^kA_{j_ii}A_{j_i\sigma(i)}.$$

Proof.

Without loss of generality, assume $j_1=j_2$. We may compose $\sigma$ with the transposition $(12)$, defining

$$\sigma':=\sigma(12);$$

then, since $j_1=j_2$, we have

$$\prod_{i=1}^kA_{j_ii}A_{j_i\sigma'(i)}=\prod_{i=1}^kA_{j_ii}A_{j_i\sigma(i)}.$$

Moreover, taking

$$B=\sum_{\sigma\in S_k}(-1)^{|\sigma|}\sum_{j_1,\ldots,j_k,\,j_1=j_2}\prod_{i=1}^kA_{j_ii}A_{j_i\sigma(i)}=\sum_{\sigma\in S_k}(-1)^{|\sigma|}\sum_{j_1,\ldots,j_k,\,j_1=j_2}\prod_{i=1}^kA_{j_ii}A_{j_i\sigma'(i)}=-\sum_{\sigma'\in S_k}(-1)^{|\sigma'|}\sum_{j_1,\ldots,j_k,\,j_1=j_2}\prod_{i=1}^kA_{j_ii}A_{j_i\sigma'(i)}=-B,$$

we conclude that $B=0$, which completes the proof.

We want to control the moments of $Z$. First,

$$\mathbb{E}Z=\mathbb{E}\sum_{\sigma\in S_k}(-1)^{|\sigma|}\sum_{j_1\neq j_2\neq\cdots\neq j_k}\prod_{i=1}^kA_{j_ii}A_{j_i\sigma(i)}=\sum_{\sigma\in S_k}(-1)^{|\sigma|}\sum_{j_1\neq j_2\neq\cdots\neq j_k}\mathbb{E}\prod_{i=1}^kA_{j_ii}A_{j_i\sigma(i)}. \qquad (56)$$

Here we recall a main result about integration with respect to the Haar measure on the orthogonal group (see [8], Theorem 3.13).

Proposition 3.2.

Suppose $N\geq n$. Let $g=(g_{ij})_{1\leq i,j\leq N}$ be a Haar-distributed random matrix from $O(N)$ and let $dg$ denote the normalized Haar measure on $O(N)$. Given two functions $\bm{i},\bm{j}$ from $\{1,2,\ldots,2n\}$ to $\{1,2,\ldots,N\}$, we have

$$\int_{g\in O(N)}g_{i(1)j(1)}g_{i(2)j(2)}\cdots g_{i(2n)j(2n)}\,dg=\sum_{\mathrm{m},\mathrm{n}\in\mathcal{M}(2n)}\mathrm{Wg}_n^{O(N)}\left(\mathrm{m}^{-1}\mathrm{n}\right)\prod_{k=1}^n\delta_{i(\mathrm{m}(2k-1)),i(\mathrm{m}(2k))}\,\delta_{j(\mathrm{n}(2k-1)),j(\mathrm{n}(2k))}.$$

Here we regard $\mathcal{M}(2n)$, the set of pair partitions of $\{1,2,\ldots,2n\}$, as a subset of $S_{2n}$. As a special case, we obtain an integral expression for $\mathrm{Wg}_n^{O(N)}(\sigma)$:

$$\mathrm{Wg}_n^{O(N)}(\sigma)=\int_{g\in O(N)}g_{1j_1}g_{1j_2}g_{2j_3}g_{2j_4}\cdots g_{nj_{2n-1}}g_{nj_{2n}}\,dg,\quad\sigma\in S_{2n},$$

with

$$\left(j_1,j_2,\ldots,j_{2n}\right)=\left(\left\lceil\frac{\sigma(1)}{2}\right\rceil,\left\lceil\frac{\sigma(2)}{2}\right\rceil,\ldots,\left\lceil\frac{\sigma(2n)}{2}\right\rceil\right).$$

Collins and Śniady [8] obtained

$$\mathrm{Wg}_n^{O(N)}(\sigma)=(-1)^{|\mu|}\prod_{i\geq 1}\mathrm{Cat}_{\mu_i}\cdot N^{-n-|\mu|}+\mathrm{O}\left(N^{-n-|\mu|-1}\right),\quad N\rightarrow\infty,$$

where $\sigma$ is a permutation in $S_{2n}$ of reduced coset-type $\mu$. This implies that the permutation $\sigma$ for which $\mathrm{Wg}(\sigma)$ has the largest order is the unique one satisfying $|\sigma|=0$, i.e. $\sigma=\mathrm{id}$.
Furthermore, S. Matsumoto obtained a more precise expansion of $\mathrm{Wg}(\sigma)$ [17]. Given a partition $\mu$, we define $\mathrm{Wg}^{O(N)}(\mu;n)=\mathrm{Wg}_n^{O(N)}(\sigma)$, where $\sigma$ is a permutation in $S_{2n}$ of reduced coset-type $\mu$. For example,

$$\mathrm{Wg}^{O(N)}((0);n)=\mathrm{Wg}^{O(N)}\left(\mathrm{id}_{2n}\right)=N^{-n}+n(n-1)N^{-n-2}-n(n-1)N^{-n-3}+\mathrm{O}\left(N^{-n-4}\right), \qquad (57)$$
$$\mathrm{Wg}^{O(N)}((1);n)=-N^{-n-1}+N^{-n-2}-\left(n^2+3n-7\right)N^{-n-3}+\mathrm{O}\left(N^{-n-4}\right). \qquad (58)$$
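These expansions are easy to check by direct Monte Carlo integration over $O(N)$. By the integral formula above, $\mathrm{Wg}^{O(N)}(\mathrm{id}_{2n})=\mathbb{E}[g_{11}^2g_{22}^2\cdots g_{nn}^2]$; a sketch for $n=2$ (the parameter values are illustrative):

```python
import numpy as np
from scipy.stats import ortho_group

Ndim, n, trials = 8, 2, 100_000
rng = np.random.default_rng(3)
acc = 0.0
for _ in range(trials):
    g = ortho_group.rvs(dim=Ndim, random_state=rng)
    acc += g[0, 0] ** 2 * g[1, 1] ** 2  # integrand for Wg(id_4)
est = acc / trials
series = Ndim**-n + n * (n - 1) * Ndim ** (-n - 2) - n * (n - 1) * Ndim ** (-n - 3)
print(est, series)  # both are close to 0.016 for Ndim = 8
```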

In our case, $\mathrm{Wg}(\sigma)$ attains the largest order only when $\sigma(i)=i$; then

Figure 3.1: pairing diagram for $\sigma=\mathrm{id}$.

$$\mathbb{E}\prod_{i=1}^kA_{j_ii}A_{j_ii}=\mathrm{Wg}^{O(n)}((0);k)=n^{-k}+k(k-1)n^{-k-2}-k(k-1)n^{-k-3}+\mathrm{O}\left(n^{-k-4}\right). \qquad (59)$$

If $|\sigma|=1$, we have

$$\mathrm{Wg}^{O(n)}((1);k)=-n^{-k-1}+n^{-k-2}-\left(k^2+3k-7\right)n^{-k-3}+\mathrm{O}\left(n^{-k-4}\right).$$

Directly, when $k$ is finite we have

$$\mathbb{E}Z=\frac{n(n-1)\cdots(n-k+1)}{n^k}+\mathrm{O}(n^{-1})=1+\mathrm{O}(n^{-1}), \qquad (60)$$

which comes from the fact that the set $\{j_1\neq j_2\neq\cdots\neq j_k\}$ has size $n(n-1)\cdots(n-k+1)$. A quick numerical sanity check of (60) appears below; after it, we estimate the variance and fourth moment of $Z$.
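The sketch samples the truncated block, forms the $k\times k$ Gram determinant, and averages (parameter values are illustrative; the empirical variance also previews Proposition 3.3):

```python
import numpy as np
from scipy.stats import ortho_group

n, l, k, trials = 50, 3, 2, 2000
rng = np.random.default_rng(4)
zs = np.empty(trials)
for t in range(trials):
    A = ortho_group.rvs(dim=n + l, random_state=rng)[:n, :n]
    G = A[:, :k].T @ A[:, :k]  # Gram matrix of the first k columns
    zs[t] = np.linalg.det(G)   # Z = det(A_(k)^* A_(k))
print(zs.mean(), zs.var())     # mean = 1 + O(1/n), variance = O(1/n)
```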

Proposition 3.3.

For finite k and large n, we have

$$\mathbf{Var}\,Z=\mathrm{O}(n^{-1}).$$

Proof.
$$\mathbb{E}Z^2=\mathbb{E}\left(\sum_{\sigma\in S_k}(-1)^{|\sigma|}\sum_{j_1\neq j_2\neq\cdots\neq j_k}\prod_{i=1}^kA_{j_ii}A_{j_i\sigma(i)}\right)^2=\mathbb{E}\left(\sum_{\sigma\in S_k}(-1)^{|\sigma|}\sum_{j_1\neq\cdots\neq j_k}\prod_{i=1}^kA_{j_ii}A_{j_i\sigma(i)}\right)\left(\sum_{\tau\in S_k}(-1)^{|\tau|}\sum_{l_1\neq\cdots\neq l_k}\prod_{i=1}^kA_{l_ii}A_{l_i\tau(i)}\right),$$

and we need to find the leading term in this expression.
Case (i): if the $k$-tuples $\vec{j}:=(j_1,\ldots,j_k)$ and $\vec{l}:=(l_1,\ldots,l_k)$ are totally different, the contribution is

$$\mathbb{E}\sum_{\sigma,\tau}(-1)^{|\sigma|+|\tau|}\sum_{\vec{j},\vec{l}}\prod_{i=1}^kA_{j_ii}A_{j_i\sigma(i)}\prod_{i=1}^kA_{l_ii}A_{l_i\tau(i)}. \qquad (61)$$

Only when $\sigma=\tau=\mathrm{id}$ do we get the leading term:

Figure 3.2: pairing diagram for case (i) with $\sigma=\mathrm{id}$.

$$\mathbb{E}\prod_{i=1}^kA_{j_ii}A_{j_i\sigma(i)}\prod_{i=1}^kA_{l_ii}A_{l_i\sigma(i)}=n^{-2k}-2k(2k-1)n^{-2k-2}+\mathrm{O}(n^{-2k-3}). \qquad (62)$$

If $\sigma$ has one cycle,

$$\mathbb{E}\prod_{i=1}^kA_{j_ii}A_{j_i\sigma(i)}\prod_{i=1}^kA_{l_ii}A_{l_i\sigma(i)}=-n^{-2k-1}+n^{-2k-2}+\mathrm{O}(n^{-2k-3}). \qquad (63)$$

If $\sigma$ has two cycles,

$$\mathbb{E}\prod_{i=1}^kA_{j_ii}A_{j_i\sigma(i)}\prod_{i=1}^kA_{l_ii}A_{l_i\sigma(i)}=\mathrm{O}(n^{-2k-2}). \qquad (64)$$

Case (ii): if the tuples $\vec{j}$ and $\vec{l}$ have exactly one entry in common and it occurs at the same index, w.l.o.g. we may assume $j_1=l_1$. Since $j_2\neq j_3\neq\cdots\neq j_k$, we need $\sigma(i)=i$ for $i\geq 2$ to attain the largest order, which means $\sigma=\mathrm{id}$. In this sub-case the total number of pairings is $3k$.

Figure 3.3: pairing diagram for case (ii) with the common entry at the same index.

If the common entry occurs at different indices, for example $j_1=l_2$, then $\sigma$ again needs to be the identity to attain the largest order.

Figure 3.4: pairing diagram for case (ii) with the common entry at different indices.

In this sub-case the total number of pairings is $2\binom{k}{2}$.

Case (iii): if $\vec{j}$ and $\vec{l}$ have two or more entries in common, the contribution is a negligible $\mathrm{O}(n^{-2})$. Combining the three cases,

$$\mathbb{E}Z^2=\left(n^{-2k}+\mathrm{O}\left(\frac{1}{n^{2k+2}}\right)\right)\cdot n(n-1)\cdots(n-2k+1)-2\binom{k}{2}\left(-n^{-2k-1}+n^{-2k-2}\right)\cdot n(n-1)\cdots(n-2k+1)+\left(3k+2\binom{k}{2}\right)\cdot\frac{1}{n^{2k}}\,n(n-1)\cdots(n-2k+2)=1+\mathrm{O}\left(\frac{k}{n}\right)+\mathrm{O}\left(\frac{1}{n^2}\right).$$

Moreover,

$$\mathbf{Var}\,Z=\mathbb{E}Z^2-\left(\mathbb{E}Z\right)^2=\mathrm{O}\left(\frac{k}{n}\right)+\mathrm{O}\left(\frac{1}{n^2}\right). \qquad (65)$$

Next, we consider the fourth central moment of Z.

Proposition 3.4.
$$\mathbb{E}\left(Z-\mathbb{E}Z\right)^4=\mathrm{O}\left(\frac{1}{n^2}\right).$$

Proof.

Write $Z=D_{\vec{j}}$, where $D_{\vec{j}}=\sum_{\sigma\in S_k}(-1)^{|\sigma|}\sum_{j_1\neq j_2\neq\cdots\neq j_k}\prod_{i=1}^kA_{j_ii}A_{j_i\sigma(i)}$, and let $\vec{k},\vec{l},\vec{m}$ denote further dummy index tuples. Then

$$\mathbb{E}\left(Z-\mathbb{E}Z\right)^4=\mathbb{E}\left(D_{\vec{j}}-\mathbb{E}D_{\vec{j}}\right)\left(D_{\vec{k}}-\mathbb{E}D_{\vec{k}}\right)\left(D_{\vec{l}}-\mathbb{E}D_{\vec{l}}\right)\left(D_{\vec{m}}-\mathbb{E}D_{\vec{m}}\right)$$
$$=\mathbb{E}D_{\vec{j}}D_{\vec{k}}D_{\vec{l}}D_{\vec{m}}-4\,\mathbb{E}D_{\vec{j}}D_{\vec{k}}D_{\vec{l}}\,\mathbb{E}D_{\vec{m}}+6\,\mathbb{E}D_{\vec{j}}D_{\vec{k}}\,\mathbb{E}D_{\vec{l}}\,\mathbb{E}D_{\vec{m}}-4\,\mathbb{E}D_{\vec{j}}\,\mathbb{E}D_{\vec{k}}\,\mathbb{E}D_{\vec{l}}\,\mathbb{E}D_{\vec{m}}+\left(\mathbb{E}D_{\vec{j}}\right)^4. \qquad (66)$$

Step 1: if some indices coincide, the number of free indices decreases, so the corresponding terms become smaller. W.l.o.g. we assume $j_1=l_1$ and show that the $\mathrm{O}(n^{-4k})$ term vanishes. As before, the leading term only arises when $\sigma=\mathrm{id}$.

Figure 3.5: pairing diagram for $j_1=l_1$.

The number of $\mathrm{O}(n^{-4k})$ terms is

$$3-(3\times 2+1\times 2)+(3\times 1+5)-4+1=0.$$

Multiplying by the number of index choices, $n^{4k-1}$, we see that the $\mathrm{O}(\frac{1}{n})$ term vanishes in the fourth moment of $Z$. Step 2: we prove that for four totally different $k$-tuples, the $\mathrm{O}(\frac{1}{n^{4k}})$ and $\mathrm{O}(\frac{1}{n^{4k+1}})$ terms both vanish. First, for the $\mathrm{O}(\frac{1}{n^{4k}})$ term, $\sigma$ must be the identity, and the number of such terms is

$$1-4+6-4+1=0. \qquad (67)$$

Next consider the $\mathrm{O}(\frac{1}{n^{4k+1}})$ term. The Weingarten calculus shows that the sub-leading term arises when $\sigma$ has exactly one transposition. To simplify the proof, we omit the error terms:

$$\mathbb{E}D_{\vec{j}}=n^{-k}-\binom{k}{2}n^{-k-1}, \qquad (68)$$
$$\mathbb{E}D_{\vec{j}}D_{\vec{k}}=n^{-2k}+2kn^{-2k-1}-2\binom{k}{2}n^{-2k-1}=n^{-2k}+\left(2k-2\binom{k}{2}\right)n^{-2k-1}. \qquad (69)$$

The first part of the second equation comes from the case $\sigma_j=\sigma_k=\mathrm{id}$; the second part comes from the case where $\sigma_j$ and $\sigma_k$ each have exactly one transposition. For example, the factor $2k$ in the first part arises from the two pairings below.

Figures 3.6 and 3.7: the two pairing diagrams producing the factor $2k$.

Similarly, we have

$$\mathbb{E}D_{\vec{j}}D_{\vec{k}}D_{\vec{l}}=n^{-3k}+\left(6k-3\binom{k}{2}\right)n^{-3k-1}, \qquad (70)$$
$$\mathbb{E}D_{\vec{j}}D_{\vec{k}}D_{\vec{l}}D_{\vec{m}}=n^{-4k}+\left(2\binom{4}{2}k-4\binom{k}{2}\right)n^{-4k-1}. \qquad (71)$$

Substituting the above four equations back into (66), we see that the $\mathrm{O}(n^{-4k-1})$ term vanishes. Since the number of free indices is $4k$, this means that the $\mathrm{O}(n^{-1})$ term vanishes in the fourth central moment of $Z$.

Let $\mu_k$ denote the $k$-th absolute central moment of $Z$. Consider the Taylor expansion of $\log Z$ at $\mathbb{E}Z$, a special case of the delta method; see [12], p. 166, for more details. For smooth $f$,

$$\mathbb{E}[f(Z)]=\mathbb{E}\left[f\left(\mathbb{E}Z+\left(Z-\mathbb{E}Z\right)\right)\right]\approx\mathbb{E}\left[f\left(\mathbb{E}Z\right)+f'\left(\mathbb{E}Z\right)\left(Z-\mathbb{E}Z\right)+\frac{1}{2}f''\left(\mathbb{E}Z\right)\left(Z-\mathbb{E}Z\right)^2\right]=f\left(\mathbb{E}Z\right)+\frac{1}{2}f''\left(\mathbb{E}Z\right)\mathbb{E}\left[\left(Z-\mathbb{E}Z\right)^2\right]. \qquad (72)$$

Similarly,

$$\mathbf{Var}[f(Z)]\approx\left(f'(\mathbb{E}Z)\right)^2\mu_2(Z)-\frac{1}{4}\left(f''(\mathbb{E}Z)\right)^2\mu_2(Z)^2.$$

Specializing to $f=\log$,

$$\log Z\approx\log\mathbb{E}Z+\frac{Z-\mathbb{E}Z}{\mathbb{E}Z}, \qquad (73)$$
$$\mathbf{Var}\log Z\approx\frac{\mathbf{Var}Z}{(\mathbb{E}Z)^2}=\mathrm{O}\left(\frac{1}{n}\right), \qquad (74)$$
$$\mu_4(\log Z)\approx\frac{\mu_4(Z)}{(\mathbb{E}Z)^4}=\mathrm{O}\left(\frac{1}{n^2}\right). \qquad (75)$$
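Continuing the Monte Carlo sketch above (reusing the array `zs` of samples of $Z$), one can compare the two sides of the delta-method approximation (74):

```python
logz = np.log(zs)  # zs > 0 almost surely, since Z is a Gram determinant
print(np.var(logz), np.var(zs) / np.mean(zs) ** 2)  # the two sides of (74)
```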

We again follow the notation of Bentkus [6] introduced in Section 1.3: $S:=S_N=X_1+\cdots+X_N$, where $X_1,\ldots,X_N$ are independent mean-zero random vectors in $\mathbb{R}^k$; $C:=\operatorname{cov}(S)$ is the covariance matrix of $S$, assumed invertible; $\beta_j:=\mathbb{E}\left\|C^{-\frac{1}{2}}X_j\right\|_2^3$; and $\beta:=\sum_{j=1}^N\beta_j$. By Theorem 1.5, there is an absolute constant $c>0$ such that $d\left(S,C^{\frac{1}{2}}Z\right)\leq ck^{\frac{1}{4}}\beta$, where $Z\sim\mathcal{N}\left(0,\operatorname{Id}_k\right)$ denotes a standard Gaussian on $\mathbb{R}^k$.
We have

$$\beta_j=\mathbb{E}\left|[N\,\mathbf{Var}(\log Z)]^{-\frac{1}{2}}\left(\log Z-\mathbb{E}\log Z\right)\right|^3 \qquad (76)$$
$$\leq CN^{-\frac{3}{2}}[\mathbf{Var}(\log Z)]^{-\frac{3}{2}}\mu_4(\log Z) \qquad (77)$$
$$=C\frac{1}{N^{\frac{3}{2}}n^{\frac{1}{2}}}. \qquad (78)$$

Therefore,

$$\beta:=\sum_{j=1}^N\beta_j\leq C\frac{1}{\sqrt{nN}}, \qquad (79)$$

and then, arguing as in the proof of Theorem 1.1 by dividing into three parts, we can prove that $\lambda_1+\cdots+\lambda_k$ converges to a Gaussian.

Acknowledgements

I am grateful to Professor Dang-Zheng Liu for his guidance. I would also like to thank Yandong Gu, Guangyi Zou, and Ruohan Geng for their useful advice.

References

  • [1] G. Akemann and Z. Burda. Universal microscopic correlation functions for products of independent Ginibre matrices. J. Phys. A: Math. Theor. 45(46) (2012), 465201.
  • [2] G. Akemann, Z. Burda, M. Kieburg, and T. Nagao. Universal microscopic correlation functions for products of truncated unitary matrices. J. Phys. A: Math. Theor. 47(25) (2014), 255202.
  • [3] G. Akemann, Z. Burda, and M. Kieburg. Universal distribution of Lyapunov exponents for products of Ginibre matrices. J. Phys. A: Math. Theor. 47(39) (2014), 395202.
  • [4] G. Akemann, Z. Burda, and M. Kieburg. From integrable to chaotic systems: universal local statistics of Lyapunov exponents. EPL (Europhysics Letters) 126(4) (2019), 40001.
  • [5] A. Ahn. Fluctuations of beta-Jacobi product processes. Probability Theory and Related Fields 183(1) (2022), 57-123.
  • [6] V. Bentkus. A Lyapunov-type bound in $\mathbb{R}^d$. Theory of Probability and Its Applications 49(2) (2005), 311-323.
  • [7] R. Carmona. Exponential localization in one dimensional disordered systems. Duke Mathematical Journal 49(1) (1982), 191-213.
  • [8] B. Collins and P. Śniady. Integration with respect to the Haar measure on unitary, orthogonal and symplectic group. Communications in Mathematical Physics 264(3) (2006), 773-795.
  • [9] P. Diaconis. Patterns in eigenvalues: the 70th Josiah Willard Gibbs Lecture. Bull. Amer. Math. Soc. 40(2) (2003), 155-178.
  • [10] H. Furstenberg and H. Kesten. Products of random matrices. Ann. Math. Statist. 31 (1960), 457-469.
  • [11] B. Hanin and G. Paouris. Non-asymptotic results for singular values of Gaussian matrix products. Geometric and Functional Analysis 31(2) (2021), 268-324.
  • [12] H. Benaroya, S. M. Han, and M. Nagurka. Probability Models in Engineering and Science. CRC Press (2005).
  • [13] V. Kargin. On the largest Lyapunov exponent for products of Gaussian matrices. Journal of Statistical Physics 157(1) (2014), 70-83.
  • [14] R. Latała. Estimation of moments of sums of independent real random variables. The Annals of Probability 25(3) (1997), 1502-1513.
  • [15] É. Le Page. Théorèmes limites pour les produits de matrices aléatoires. In: Probability Measures on Groups, Springer, Berlin (1982), 258-303.
  • [16] D.-Z. Liu, D. Wang, and Y. Wang. Lyapunov exponent, universality and phase transition for products of random matrices. Communications in Mathematical Physics 399(3) (2023), 1811-1855.
  • [17] S. Matsumoto. Jucys-Murphy elements, orthogonal matrix integrals, and Jack measures. The Ramanujan Journal 26 (2011), 69-107.
  • [18] J. E. Cohen and C. M. Newman. The stability of large random matrices and their products. The Annals of Probability 12 (1984), 283-310.
  • [19] V. I. Oseledec. A multiplicative ergodic theorem. Ljapunov characteristic numbers for dynamical systems. Trans. Moscow Math. Soc. 19 (1968), 197-231.
  • [20] M. Skorski. Bernstein-type bounds for beta distribution. Modern Stochastics: Theory and Applications 10(2) (2023), 211-228.