
Electronic Journal of Probability

Vol. 16 (2011), Paper no. 81, pages 2219–2245.

Journal URL

http://www.math.washington.edu/~ejpecp/

Products of independent non-Hermitian random matrices

Sean O’Rourke^∗ Alexander Soshnikov^†

Abstract

We consider the product of a finite number of non-Hermitian random matrices with i.i.d. centered entries of growing size. We assume that the entries have a finite moment of order bigger than two. We show that the empirical spectral distribution of the properly normalized product converges, almost surely, to a non-random, rotationally invariant distribution with compact support in the complex plane. The limiting distribution is a power of the circular law.

Key words: Random matrices, Circular law.

AMS 2010 Subject Classification: Primary 60B20.

Submitted to EJP on January 28, 2011, final version accepted November 3, 2011.

∗Department of Mathematics, University of California, Davis, One Shields Avenue, Davis, CA 95616-8633. Email: sdorourk@math.ucdavis.edu. Research was supported in part by NSF grants VIGRE DMS-0636297 and DMS-1007558.

†Department of Mathematics, University of California, Davis, One Shields Avenue, Davis, CA 95616-8633. Email: soshniko@math.ucdavis.edu. Research was supported in part by NSF grant DMS-1007558.


1 Introduction and Formulation of Results

Many important results in random matrix theory pertain to Hermitian random matrices. Two powerful tools used in this area are the moment method and the Stieltjes transform. Unfortunately, these two techniques are not suitable for dealing with non-Hermitian random matrices [6].

1.1 The Circular Law

One of the fundamental results in the study of non-Hermitian random matrices is the circular law.

We begin by defining the empirical spectral distribution (ESD).

Definition 1. Let $X$ be a matrix of order $n$ and let $\lambda_1, \ldots, \lambda_n$ be the eigenvalues of $X$. Then the empirical spectral distribution (ESD) $\mu_X$ of $X$ is defined as

\[ \mu_X(z, \bar z) = \frac{1}{n} \# \left\{ k \le n : \operatorname{Re}(\lambda_k) \le \operatorname{Re}(z);\ \operatorname{Im}(\lambda_k) \le \operatorname{Im}(z) \right\}. \]

Let $\xi$ be a complex random variable with finite non-zero variance $\sigma^2$ and let $N_n$ be a random matrix of order $n$ with entries being i.i.d. copies of $\xi$. We say that the circular law holds for $\xi$ if, with probability 1, the ESD $\mu_{\frac{1}{\sigma\sqrt{n}} N_n}$ of $\frac{1}{\sigma\sqrt{n}} N_n$ converges (uniformly) to the uniform distribution over the unit disk as $n$ tends to infinity.

The circular law was conjectured in the 1950’s as a non-Hermitian counterpart to Wigner’s semicircle law. The circular law was first shown by Mehta in 1967 [22] when $\xi$ is complex Gaussian. Mehta relied upon the joint density of the eigenvalues, which was discovered by Ginibre [10] two years earlier.

Building on the work of Girko [11], Bai proved the circular law under the conditions that $\xi$ has finite sixth moment and that the joint distribution of the real and imaginary parts of $\xi$ has bounded density [3]. In [6], the sixth moment assumption was weakened to $E|\xi|^{2+\eta} < \infty$ for any specified $\eta > 0$, but the bounded density assumption still remained. Götze and Tikhomirov [15] proved the circular law in the case of i.i.d. sub-Gaussian matrix entries. Pan and Zhou proved the circular law for any distribution $\xi$ with finite fourth moment [25] by building on [15] and utilizing the work of Rudelson and Vershynin in [27]. In an important development, Götze and Tikhomirov showed in [14] that the expected spectral distribution $E\mu_{N_n}$ converges to the uniform distribution over the unit disk as $n$ tends to infinity assuming that $\sup_{jk} E|(N_n)_{jk}|^2 \varphi((N_n)_{jk}) < \infty$, where $\varphi(x) = (\ln(1+|x|))^{19+\eta}$, $\eta > 0$. In [28], Tao and Vu proved the circular law assuming a bounded $(2+\eta)$th moment, for any fixed $\eta > 0$. Finally, Tao and Vu were able to remove the extra $\eta$ in the moment condition. Namely, they proved the circular law in [29] assuming only that the second moment is bounded.

1.2 Main Results

In this paper, we study the ESD of the product

\[ X^{(n)} = X_1^{(n)} X_2^{(n)} \cdots X_m^{(n)} \]

of $m$ independent $n \times n$ non-Hermitian random matrices as $n$ tends to infinity. Burda, Janik, and Waclaw [8] studied the mathematical expectation of the limiting ESD, $\lim_{n\to\infty} E\mu_{X^{(n)}}$, in the case that the entries of the matrices are Gaussian. Here we extend their results by proving the almost sure convergence of the ESD, $\mu_{X^{(n)}}$, for a class of non-Gaussian random matrices. Namely, we require that the entries of $X_i^{(n)}$, $i = 1, \ldots, m$, are i.i.d. random variables with a finite moment of order $2+\eta$, $\eta > 0$.

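Theorem 2. Fix $m > 1$ and let $\xi$ be a complex random variable with variance 1 such that $\operatorname{Re}(\xi)$ and $\operatorname{Im}(\xi)$ are independent, each with mean zero, and $E|\xi|^{2+\eta} < \infty$ for some $\eta > 0$. Let $X_1^{(n)}, \ldots, X_m^{(n)}$ be independent random matrices of order $n$ where the entries of $X_j^{(n)}$ are i.i.d. copies of $\sigma_j \xi / \sqrt{n}$ for some collection of positive constants $\sigma_1, \ldots, \sigma_m$. Then the ESD $\mu_{X^{(n)}}$ of $X^{(n)} = X_1^{(n)} X_2^{(n)} \cdots X_m^{(n)}$ converges, with probability 1, as $n \to \infty$ to the distribution whose density is given by

\[ \rho(z, \bar z) = \begin{cases} \dfrac{1}{m\pi}\, \sigma^{-2/m} |z|^{2/m - 2} & \text{for } |z| \le \sigma, \\ 0 & \text{for } |z| > \sigma, \end{cases} \tag{1} \]

where $\sigma = \sigma_1 \cdots \sigma_m$.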
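As a quick sanity check on (1), the following Python sketch (not part of the original argument) verifies numerically that the density integrates to one over the disk of radius $\sigma$ and that the mass inside radius $r$ equals $(r/\sigma)^{2/m}$. The function names `limiting_density` and `mass_within`, and the substitution $u = r^{2/m}$ used to tame the singularity at the origin, are illustrative choices and do not come from the paper.

```python
import math

def limiting_density(z, m, sigma=1.0):
    """Density (1) of Theorem 2 evaluated at the complex point z."""
    r = abs(z)
    if r > sigma:
        return 0.0
    if r == 0.0:
        # For m > 1 the density is unbounded (but integrable) at the origin.
        return 1.0 / (math.pi * sigma ** 2) if m == 1 else math.inf
    return sigma ** (-2.0 / m) * r ** (2.0 / m - 2.0) / (m * math.pi)

def mass_within(radius, m, sigma=1.0, steps=2000):
    """Mass of the disk |z| <= radius, by the midpoint rule in u = r**(2/m)."""
    u_max = radius ** (2.0 / m)
    h = u_max / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        r = u ** (m / 2.0)
        dr_du = (m / 2.0) * u ** (m / 2.0 - 1.0)
        # For a radial density the area element in polar form is 2*pi*r dr.
        total += 2.0 * math.pi * r * limiting_density(r, m, sigma) * dr_du * h
    return total

if __name__ == "__main__":
    for m in (1, 2, 3):
        print(m, mass_within(1.5, m, sigma=1.5), mass_within(0.5, m, sigma=1.5))
```

For $m = 1$ the formula reduces to the uniform density on the disk of radius $\sigma$, which is the circular law itself.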

Remark 3. The almost sure convergence of $\mu_{X^{(n)}}$ implies the convergence of $E\mu_{X^{(n)}}$ as well.

Remark 4. We refer the reader to [4] for bounds on powers of a square random matrix with i.i.d. entries. See also [1], [2], [9], [5], [7], and [24] for some other results on the spectral properties of products of random matrices.

2 Notation and Setup

The proof of Theorem 2 is divided into two parts and presented in Sections 3 and 4.

We note that without loss of generality, we may assume $\sigma_1 = \sigma_2 = \cdots = \sigma_m = 1$. Indeed, the spectrum for arbitrary $\sigma_1, \ldots, \sigma_m$ can be obtained by a trivial rescaling. Following Burda, Janik, and Waclaw in [8], we let $Y^{(n)}$ be an $(mn) \times (mn)$ matrix defined as

\[ Y^{(n)} = \begin{pmatrix} 0 & X_1^{(n)} & & & 0 \\ 0 & 0 & X_2^{(n)} & & 0 \\ & & \ddots & \ddots & \\ 0 & & & 0 & X_{m-1}^{(n)} \\ X_m^{(n)} & & & & 0 \end{pmatrix}. \tag{2} \]

Section 4 will be devoted to proving that the ESD of $Y^{(n)}$ obeys the circular law as $n$ tends to infinity. This statement is presented in the following lemma.

Lemma 5 ($Y^{(n)}$ obeys the circular law). The ESD $\mu_{Y^{(n)}}$ of $Y^{(n)}$ converges, with probability 1, to the uniform distribution over the unit disk as $n \to \infty$.

3 Proof of Theorem 2

With Lemma 5 above, we are ready to prove Theorem 2.

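Proof of Theorem 2. Using the definition of $Y^{(n)}$ in (2), we can compute

\[ \left(Y^{(n)}\right)^m = \begin{pmatrix} Y_1 & & & 0 \\ & Y_2 & & \\ & & \ddots & \\ 0 & & & Y_m \end{pmatrix}, \]

where $Y_k = X_k^{(n)} X_{k+1}^{(n)} \cdots X_m^{(n)} X_1^{(n)} \cdots X_{k-1}^{(n)}$ for $1 \le k \le m$. Notice that each $Y_k$ has the same eigenvalues as $X^{(n)}$. Let $\lambda_1, \ldots, \lambda_n$ denote the eigenvalues of $X^{(n)}$ and let $\eta_1, \ldots, \eta_{mn}$ denote the eigenvalues of $Y^{(n)}$. Then it follows that each $\lambda_k$ is an eigenvalue of $\left(Y^{(n)}\right)^m$ with multiplicity $m$.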
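The block-diagonal form of $(Y^{(n)})^m$ can be checked on a small example. The Python sketch below is an illustration only: the helper names (`build_Y`, `cyclic_product`, `max_block_error`) and the choice $m = 3$, $n = 2$ with Gaussian entries are assumptions made here, not part of the paper. It assembles $Y$ from the blocks $X_1, \ldots, X_m$, raises it to the $m$-th power, and compares each block with the cyclic products $Y_k$.

```python
import random

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def build_Y(xs):
    """Assemble the (mn) x (mn) matrix (2) from the n x n blocks xs[0..m-1]."""
    m, n = len(xs), len(xs[0])
    Y = [[0j] * (m * n) for _ in range(m * n)]
    for a in range(m):
        b = (a + 1) % m  # X_a sits in block (a, a+1); the last block wraps to column 1
        for i in range(n):
            for j in range(n):
                Y[a * n + i][b * n + j] = xs[a][i][j]
    return Y

def block(A, a, b, n):
    """The n x n block of A in block position (a, b), 0-based."""
    return [row[b * n:(b + 1) * n] for row in A[a * n:(a + 1) * n]]

def cyclic_product(xs, k):
    """The product X_k X_{k+1} ... X_m X_1 ... X_{k-1} (k is 0-based here)."""
    prod = xs[k]
    for step in range(1, len(xs)):
        prod = matmul(prod, xs[(k + step) % len(xs)])
    return prod

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def max_block_error(xs):
    """Largest entrywise gap between Y^m and diag(Y_1, ..., Y_m)."""
    m, n = len(xs), len(xs[0])
    Y = build_Y(xs)
    Ym = Y
    for _ in range(m - 1):
        Ym = matmul(Ym, Y)
    err = 0.0
    for a in range(m):
        for b in range(m):
            target = cyclic_product(xs, a) if a == b else [[0j] * n for _ in range(n)]
            got = block(Ym, a, b, n)
            for i in range(n):
                for j in range(n):
                    err = max(err, abs(got[i][j] - target[i][j]))
    return err

rng = random.Random(0)
xs = [[[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]
       for _ in range(2)] for _ in range(3)]
print(max_block_error(xs))
```

Equality of the traces of the cyclic products $Y_k$ is a necessary consequence of their sharing the eigenvalues of $X^{(n)}$, and gives a cheap additional consistency check.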

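Let $f : \mathbb{C} \to \mathbb{C}$ be a continuous, bounded function. Then we have

\[ \int_{\mathbb{C}} f(z)\, d\mu_{X^{(n)}}(z, \bar z) = \frac{1}{n} \sum_{k=1}^{n} f(\lambda_k) = \frac{1}{mn} \sum_{k=1}^{mn} f(\eta_k^m) = \int_{\mathbb{C}} f(z^m)\, d\mu_{Y^{(n)}}(z, \bar z). \]

By Lemma 5,

\[ \int_{\mathbb{C}} f(z^m)\, d\mu_{Y^{(n)}}(z, \bar z) \longrightarrow \frac{1}{\pi} \int_{D} f(z^m)\, dz\, d\bar z \quad \text{a.s.} \]

as $n \to \infty$, where $D$ denotes the unit disk in the complex plane. Thus, by the change of variables $z \mapsto z^m$ and $\bar z \mapsto \bar z^m$ we can write

\[ \frac{1}{\pi} \int_{D} f(z^m)\, dz\, d\bar z = \frac{m}{\pi} \int_{D} f(z)\, \frac{1}{m^2} |z|^{2/m - 2}\, dz\, d\bar z = \frac{1}{m\pi} \int_{D} f(z)\, |z|^{2/m - 2}\, dz\, d\bar z, \]

where the factor of $m$ out front of the integral corresponds to the fact that the transformation maps the complex plane $m$ times onto itself.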
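The change of variables can be illustrated by simulation: if $z$ is uniform on the unit disk, then $w = z^m$ should have radial distribution $P(|w| \le r) = r^{2/m}$, which is the mass that the density $\frac{1}{m\pi}|w|^{2/m-2}$ assigns to the disk of radius $r$. The following Monte Carlo sketch is a hedged illustration; the sample size, seed, and function names are arbitrary choices made here.

```python
import math
import random

def sample_uniform_disk(rng):
    """One point drawn uniformly from the unit disk."""
    r = math.sqrt(rng.random())
    theta = 2.0 * math.pi * rng.random()
    return complex(r * math.cos(theta), r * math.sin(theta))

def empirical_radial_cdf(m, radius, samples=20000, seed=0):
    """Fraction of samples z, uniform on the disk, with |z**m| <= radius."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(samples)
               if abs(sample_uniform_disk(rng) ** m) <= radius)
    return hits / samples

def predicted_radial_cdf(m, radius):
    """Mass of |w| <= radius under the density |w|**(2/m - 2) / (m * pi)."""
    return min(radius, 1.0) ** (2.0 / m)

if __name__ == "__main__":
    for m in (2, 3):
        for radius in (0.1, 0.5):
            print(m, radius, empirical_radial_cdf(m, radius),
                  predicted_radial_cdf(m, radius))
```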

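Therefore, we have shown that for all continuous, bounded functions $f$,

\[ \int_{\mathbb{C}} f(z)\, d\mu_{X^{(n)}}(z, \bar z) \longrightarrow \frac{1}{m\pi} \int_{D} f(z)\, |z|^{2/m - 2}\, dz\, d\bar z \quad \text{a.s.} \]

as $n \to \infty$, and the proof is complete.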

4 Proof of Lemma 5

In order to prove that the ESD of $Y^{(n)}$ obeys the circular law, we follow the work of Bai in [3] and Bai and Silverstein in [6], and use the results developed by Tao and Vu in [28]. To do so, we introduce the following notation. Let $\mu_n$ denote the ESD of $Y^{(n)}$. That is,

\[ \mu_n(x, y) = \frac{1}{mn} \# \left\{ k \le mn : \operatorname{Re}(\lambda_k) \le x;\ \operatorname{Im}(\lambda_k) \le y \right\}, \]

where $\lambda_1, \ldots, \lambda_{mn}$ are the eigenvalues of $Y^{(n)}$.

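An important idea in the proof is to analyze the Stieltjes transform $s_n : \mathbb{C} \to \mathbb{C}$ of $\mu_n$ defined by

\[ s_n(z) = \frac{1}{mn} \sum_{k=1}^{mn} \frac{1}{\lambda_k - z} = \int_{\mathbb{C}} \frac{1}{x + iy - z}\, d\mu_n(x, y). \]

Since $s_n(z)$ is analytic everywhere except the poles, the real part determines the eigenvalues. Let $z = s + it$. Then we can write

\begin{align*}
\operatorname{Re}(s_n(z)) &= \frac{1}{mn} \sum_{k=1}^{mn} \frac{\operatorname{Re}(\lambda_k) - s}{|\lambda_k - z|^2} \\
&= -\frac{1}{2mn} \sum_{k=1}^{mn} \frac{\partial}{\partial s} \ln |\lambda_k - z|^2 \\
&= -\frac{1}{2} \frac{\partial}{\partial s} \int_0^\infty \ln x\, \nu_n(dx, z),
\end{align*}

where $\nu_n(\cdot, z)$ is the ESD of the Hermitian matrix $H_n = (Y^{(n)} - zI)^*(Y^{(n)} - zI)$. This reduces the task to controlling the distributions $\nu_n$.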

The main difficulties arise from the two poles of the log function, at $\infty$ and $0$. We will need to use the bounds developed in [3] and [28] to control the largest singular value and the least singular value of $Y^{(n)} - zI$.

A version of the following lemma was first presented by Girko [11]. We present a slightly refined version by Bai and Silverstein [6].

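Lemma 6. For any $uv \neq 0$, we have

\begin{align*}
c_n(u, v) &= \iint e^{iux + ivy}\, \mu_n(dx, dy) \\
&= \frac{u^2 + v^2}{4iu\pi} \iint \frac{\partial}{\partial s} \left[ \int_0^\infty \ln x\, \nu_n(dx, z) \right] e^{ius + ivt}\, dt\, ds, \tag{3}
\end{align*}

where $z = s + it$.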

We note that the singular values of $Y^{(n)}$ are the union of the singular values of $X_k^{(n)}$ for $1 \le k \le m$. Thus, under the assumptions of Theorem 2, the ESD of $Y^{(n)*} Y^{(n)}$ converges to the Marchenko-Pastur Law (see [20] and [6, Theorem 3.7]). Thus by Lemma 8 it follows that, with probability 1, the family of distributions $\mu_n$ is tight. To prove the circular law we will show that the right-hand side of (3) converges to $c(u, v)$, its counterpart generated by the circular law, for all $uv \neq 0$. Several steps of the proof will follow closely the work of Bai in [3] and Bai and Silverstein in [6]. We present an outline of the proof as follows.

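1. We reduce the range of integration to a finite rectangle in Section 4.2. We will show that the proof reduces to showing that, for every large $A > 0$ and small $\varepsilon > 0$,

\[ \iint_T \frac{\partial}{\partial s} \left[ \int_0^\infty \ln x\, \nu_n(dx, z) \right] e^{ius + ivt}\, ds\, dt \to \iint_T \frac{\partial}{\partial s} \left[ \int_0^\infty \ln x\, \nu(dx, z) \right] e^{ius + ivt}\, ds\, dt, \]

where $T = \{(s, t) : |s| \le A,\ |t| \le A^3,\ |\sqrt{s^2 + t^2} - 1| \ge \varepsilon\}$ and $\nu(x, z)$ is the limiting spectral distribution of the sequence of matrices $H_n = (Y^{(n)} - zI)^*(Y^{(n)} - zI)$.

2. We characterize the limiting spectrum $\nu(\cdot, z)$ of $\nu_n(\cdot, z)$.

3. We establish a convergence rate of $\nu_n(\cdot, z)$ to $\nu(\cdot, z)$ uniformly in every bounded region of $z$.

4. Finally, we show that for a suitably defined sequence $\varepsilon_n$, with probability 1,

\[ \limsup_{n \to \infty} \iint_T \left| \int_{\varepsilon_n}^\infty \ln x\, \big(\nu_n(dx, z) - \nu(dx, z)\big) \right| = 0 \]

and

\[ \lim_{n \to \infty} \int_0^{\varepsilon_n} \ln x\, \nu_n(dx, z) = 0. \]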

4.1 Notation

In this section, we introduce some notation that we will use throughout the paper.

First, we will drop the superscript $(n)$ from the matrices $Y^{(n)}, X^{(n)}, X_1^{(n)}, \ldots, X_m^{(n)}$ and simply write $Y, X, X_1, \ldots, X_m$.

We write $R = Y - zI$ where $I$ is the identity matrix and $z = s + it \in \mathbb{C}$. We will continue to let $H_n = (Y - zI)^*(Y - zI) = R^*R$ and have $\nu_n(x, z)$ denote the empirical spectral distribution of $H_n$ for each fixed $z$.

For an $(mn) \times (mn)$ matrix $A$, there are $m^2$ blocks each consisting of an $n \times n$ matrix. We let $A_{ab}$ denote the $n \times n$ matrix in position $a, b$ where $1 \le a, b \le m$. $A_{a,b;i,j}$ then refers to the element $(A_{ab})_{ij}$ where $1 \le i, j \le n$.

Finally, $C$ will be used as some positive constant that may change from line to line.

4.2 Integral Range Reduction

To establish Lemma 5, we need to find the limiting counterpart to

\[ g_n(s, t) = \frac{\partial}{\partial s} \int_0^\infty \ln x\, \nu_n(dx, z). \]

We begin by presenting the following lemmas.

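Lemma 7 (Bai-Silverstein [6]). For all $uv \neq 0$, we have

\[ c(u, v) = \frac{1}{\pi} \iint_{x^2 + y^2 \le 1} e^{iux + ivy}\, dx\, dy = \frac{u^2 + v^2}{4iu\pi} \iint g(s, t)\, e^{ius + ivt}\, dt\, ds, \]

where

\[ g(s, t) = \begin{cases} \dfrac{2s}{s^2 + t^2}, & \text{if } s^2 + t^2 > 1, \\ 2s, & \text{otherwise.} \end{cases} \]

Lemma 8 (Horn-Johnson [17]). Let $\lambda_j$ and $\eta_j$ denote the eigenvalues and singular values of an $n \times n$ matrix $A$, respectively. Then for any $k \le n$,

\[ \sum_{j=1}^{k} |\lambda_j|^2 \le \sum_{j=1}^{k} \eta_j^2 \]

if $\eta_j$ is arranged in descending order.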

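Lemma 9 (Bai-Silverstein [6]). For any $uv \neq 0$ and $A > 2$, we have

\[ \left| \int_{|s| \ge A} \int_{-\infty}^{\infty} g_n(s, t)\, e^{ius + ivt}\, dt\, ds \right| \le \frac{4\pi}{|v|} e^{-\frac{1}{2}|v|A} + \frac{2\pi}{n|v|} \sum_{k=1}^{mn} I\left(|\lambda_k| \ge \frac{A}{2}\right) \]

and

\[ \left| \int_{|s| \le A} \int_{|t| \ge A^3} g_n(s, t)\, e^{ius + ivt}\, dt\, ds \right| \le \frac{8A}{A^2 - 1} + \frac{4\pi A}{n} \sum_{k=1}^{mn} I(|\lambda_k| > A), \]

where $\lambda_1, \ldots, \lambda_{mn}$ are the eigenvalues of $Y$. Furthermore, if the function $g_n(s, t)$ is replaced by $g(s, t)$, the two inequalities above hold without the second terms.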

Now we note that under the assumptions of Theorem 2 and by Lemma 8 and the law of large numbers we have

\[ \frac{1}{n} \sum_{k=1}^{mn} I(|\lambda_k| > A) \le \frac{1}{nA^2} \operatorname{Tr}(Y^*Y) \longrightarrow \frac{m}{A^2} \quad \text{a.s.} \]

Therefore, the right-hand sides of the inequalities in Lemma 9 can be made arbitrarily small by making $A$ large enough. The same is true when $g_n(s, t)$ is replaced by $g(s, t)$. Our task is then reduced to showing

\[ \int_{|s| \le A} \int_{|t| \le A^3} [g_n(s, t) - g(s, t)]\, e^{ius + ivt}\, ds\, dt \longrightarrow 0. \]

We define the sets

\[ T = \left\{ (s, t) : |s| \le A,\ |t| \le A^3 \text{ and } ||z| - 1| \ge \varepsilon \right\} \]

and

\[ T_1 = \{ (s, t) : ||z| - 1| < \varepsilon \}, \]

where $z = s + it$.

Lemma 10 (Bai-Silverstein [6]). For all fixed $A$ and $0 < \varepsilon < 1$,

\[ \iint_{T_1} |g_n(s, t)|\, ds\, dt \le 32 \sqrt{\varepsilon}. \tag{4} \]

Furthermore, if the function $g_n(s, t)$ is replaced by $g(s, t)$, the inequality above holds.

Since the right-hand side of (4) can be made arbitrarily small by choosing $\varepsilon$ small, our task is reduced to showing

\[ \iint_T [g_n(s, t) - g(s, t)]\, e^{ius + ivt}\, ds\, dt \longrightarrow 0 \quad \text{a.s.} \tag{5} \]

4.3 Characterization of the Circular Law

In this section, we study the convergence of the distributions $\nu_n(x, z)$ to a limiting distribution $\nu(x, z)$ and discuss properties of the limiting distribution $\nu(x, z)$. We begin with a standard truncation argument which can be found, for example, in [6].

4.3.1 Truncation

Let $\hat Y$ and $\tilde Y$ be the $(mn) \times (mn)$ matrices with entries

\[ \hat Y_{a,b;i,j} = Y_{a,b;i,j}\, I(\sqrt{n}\, |Y_{a,b;i,j}| \le n^\delta) - E\left[ Y_{a,b;i,j}\, I(\sqrt{n}\, |Y_{a,b;i,j}| \le n^\delta) \right] \]

and

\[ \tilde Y_{a,b;i,j} = \frac{\hat Y_{a,b;i,j}}{\sqrt{n E|\hat Y_{a,b;i,j}|^2}}, \]

where $\delta > 0$. We denote the ESD of $\hat H_n = (\hat Y - zI)^*(\hat Y - zI)$ by $\hat\nu_n(\cdot, z)$ and the ESD of $\tilde H_n = (\tilde Y - zI)^*(\tilde Y - zI)$ by $\tilde\nu_n(\cdot, z)$.

We will let $L(F_1, F_2)$ be the Levy distance between two distribution functions $F_1$ and $F_2$ defined by

\[ L(F_1, F_2) = \inf\{ \varepsilon : F_1(x - \varepsilon) - \varepsilon \le F_2(x) \le F_1(x + \varepsilon) + \varepsilon \text{ for all } x \in \mathbb{R} \}. \]

We then have the following lemma.

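Lemma 11. We have that

\[ L(\nu_n(\cdot, z), \tilde\nu_n(\cdot, z)) = o(n^{-\eta\delta/4}) \quad \text{a.s.,} \]

where the bound is uniform for $|z| \le M$.

Proof. By [6, Corollary A.42] we have that

\[ L^4(\nu_n(\cdot, z), \hat\nu_n(\cdot, z)) \le \frac{2}{n^2} \operatorname{Tr}(H_n + \hat H_n)\, \operatorname{Tr}[(Y - \hat Y)^*(Y - \hat Y)]. \]

By the law of large numbers it follows that, with probability 1,

\[ \frac{1}{n} \operatorname{Tr} H_n = \frac{1}{n} \sum_{a=1}^{m} \sum_{1 \le i, j \le n} |Y_{a,a+1;i,j}|^2 + m|z|^2 \longrightarrow m(1 + |z|^2). \]

Similarly, $\frac{1}{n} \operatorname{Tr}(\hat H_n) \to m(1 + |z|^2)$ a.s.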

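For any $L > 0$, we have

\begin{align*}
\frac{n^{\delta\eta}}{n} \operatorname{Tr}[(Y - \hat Y)^*(Y - \hat Y)] &= \frac{n^{\delta\eta}}{n} \sum_{a=1}^{m} \sum_{1 \le i, j \le n} |(Y - \hat Y)_{a,a+1;i,j}|^2 \\
&\le \frac{n^{\delta\eta}}{n^2} \sum_{a=1}^{m} \sum_{1 \le i, j \le n} \left| \sqrt{n}\, Y_{a,a+1;i,j}\, I(\sqrt{n}\, |Y_{a,a+1;i,j}| > n^\delta) - E\left[ \sqrt{n}\, Y_{a,a+1;i,j}\, I(\sqrt{n}\, |Y_{a,a+1;i,j}| > n^\delta) \right] \right|^2 \\
&\le 2 n^{\delta\eta} \left[ \frac{1}{n^2} \sum_{a=1}^{m} \sum_{1 \le i, j \le n} |\sqrt{n}\, Y_{a,a+1;i,j}|^2\, I(\sqrt{n}\, |Y_{a,a+1;i,j}| > n^\delta) + E|\xi|^2 I(|\xi| > n^\delta) \right] \\
&\le \frac{2}{n^2} \sum_{a=1}^{m} \sum_{1 \le i, j \le n} |\sqrt{n}\, Y_{a,a+1;i,j}|^{2+\eta}\, I(\sqrt{n}\, |Y_{a,a+1;i,j}| > L) + E|\xi|^{2+\eta} I(|\xi| > L)
\end{align*}

and hence

\[ \limsup_{n \to \infty} \frac{n^{\delta\eta}}{n} \operatorname{Tr}[(Y - \hat Y)^*(Y - \hat Y)] \le 4m\, E|\xi|^{2+\eta} I(|\xi| > L) \quad \text{a.s.,} \]

which can be made arbitrarily small by making $L$ large. Thus we have that

\[ L(\nu_n(\cdot, z), \hat\nu_n(\cdot, z)) = o(n^{-\eta\delta/4}) \quad \text{a.s.,} \]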

where the bound is uniform for $|z| \le M$. By [6, Corollary A.42] we also have that

\[ L^4(\hat\nu_n(\cdot, z), \tilde\nu_n(\cdot, z)) \le \frac{2}{n^2} \operatorname{Tr}(\hat H_n + \tilde H_n)\, \operatorname{Tr}(\hat Y^* \hat Y) \left( 1 - \frac{1}{\sqrt{E|\sqrt{n}\, \hat Y_{1,2;1,1}|^2}} \right). \]

A similar argument shows that $1 - \sqrt{E|\sqrt{n}\, \hat Y_{1,2;1,1}|^2} = o(n^{-\eta\delta})$ and the proof is complete.

Remark 12. For the remainder of the subsection, we will assume the conditions of Theorem 2 hold. Also, by Lemma 11 we additionally assume that $|Y_{a,a+1;i,j}| \le n^\delta$.

4.3.2 Useful tools and lemmas

We begin by denoting the Stieltjes transform of $\nu_n(\cdot, z)$ by

\[ \Delta_n(\alpha, z) = \int \frac{\nu_n(dx, z)}{x - \alpha}, \]

where $\alpha = x + iy$ with $y > 0$. We also note that $\Delta_n(\alpha, z) = \frac{1}{mn} \operatorname{Tr}(G)$ where $G = (H_n - \alpha I)^{-1}$ is the resolvent matrix. For brevity, the variable $z$ will be suppressed when there is no confusion and we will simply write $\Delta_n(\alpha)$.

We first present a number of lemmas that we will need to study $\Delta_n(\alpha)$. We remind the reader that $R = Y - zI$ and $\alpha = x + iy$.

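Lemma 13. If $y > 0$ and $x \in K$ for some compact set $K$, then we have the following bounds,

\begin{align}
\|Y\|^2 &\le \max_{1 \le k \le m} \|X_k\|^2 \le \sum_{k=1}^{m} \|X_k\|^2, \tag{6} \\
\|G\| &\le \frac{1}{y}, \tag{7} \\
\|RG\| &\le C \sqrt{\frac{1}{y^2} + \frac{1}{y}}, \tag{8} \\
\|GR^*\| &\le C \sqrt{\frac{1}{y^2} + \frac{1}{y}}, \tag{9}
\end{align}

for some constant $C > 0$ which depends on $K$. Moreover, there exists a constant $C$ which depends only on $K$ such that

\begin{align}
\sup \{ \|RG\| : x \in K,\ y \ge y_n,\ z \in \mathbb{C} \} &\le C \sqrt{\frac{1}{y_n^2} + \frac{1}{y_n}}, \tag{10} \\
\sup \{ \|GR^*\| : x \in K,\ y \ge y_n,\ z \in \mathbb{C} \} &\le C \sqrt{\frac{1}{y_n^2} + \frac{1}{y_n}}, \tag{11}
\end{align}

for any sequence $y_n > 0$.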

Proof. The first inequality in (6) follows from the definition of the norm and the second inequality is trivial. The resolvent bound in (7) follows immediately because $H_n$ is a Hermitian matrix.

To prove (8), we use polar decomposition to write $R = U|R|$ where $U$ is a partial isometry and $|R| = \sqrt{R^*R}$. Then

\begin{align*}
\|RG\| = \|U|R|(R^*R - \alpha)^{-1}\| &\le \||R|(R^*R - \alpha)^{-1}\| \\
&\le \sup_{t \in \operatorname{Sp}(R^*R)} |\sqrt{t}\,(t - \alpha)^{-1}| \le \sup_{t \ge 0} |\sqrt{t}\,(t - \alpha)^{-1}| \le C \sqrt{\frac{1}{y^2} + \frac{1}{y}}.
\end{align*}
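The last inequality in the chain above can be checked numerically. The sketch below is a hedged illustration: it assumes $|x| \le N$ and uses the explicit constant $C = \sqrt{\max(1/2, N)}$, which is a choice made here (it follows from $t \le |t - x| + |x|$ and $|t - x| / ((t - x)^2 + y^2) \le 1/(2y)$) and is not stated in the paper. The function names and grid parameters are likewise illustrative.

```python
import math

def sup_ratio(x, y, t_max=200.0, steps=20000):
    """Grid approximation of the sup over 0 <= t <= t_max of sqrt(t) / |t - alpha|,
    where alpha = x + i*y."""
    best = 0.0
    for k in range(steps + 1):
        t = t_max * k / steps
        best = max(best, math.sqrt(t) / math.hypot(t - x, y))
    return best

def explicit_bound(x_abs_max, y):
    """C * sqrt(1/y**2 + 1/y) with the assumed constant C = sqrt(max(1/2, N))."""
    c = math.sqrt(max(0.5, x_abs_max))
    return c * math.sqrt(1.0 / y ** 2 + 1.0 / y)

if __name__ == "__main__":
    for y in (0.01, 0.1, 1.0):
        print(y, sup_ratio(1.0, y), explicit_bound(2.0, y))
```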

A similar argument verifies (9). The bounds (10) and (11) follow from (8) and (9) by using that $y \ge y_n$.

Lemma 14. We have that

\[ E\left[ \frac{1}{n} \operatorname{Tr} G_{a,a} \right] = E\left[ \frac{1}{mn} \operatorname{Tr} G \right] \]

for any $1 \le a \le m$.

Proof. Fix $1 \le a \le m$ and $1 \le i \le n$. We will show that $E\, G_{a,a;i,i} = E\, G_{a+1,a+1;i,i}$.

Using the adjoint formula for the inverse of a matrix, we can write that for any $1 \le b \le m$

\[ G_{b,b;i,i} = \frac{\det (R^*R - \alpha I)^{(b,i)}}{\det (R^*R - \alpha I)}, \]

where $(R^*R - \alpha I)^{(b,i)}$ is the matrix $R^*R - \alpha I$ with the entries in the row and column that contain the element $(R^*R - \alpha I)_{b,b;i,i}$ replaced by zeroes except for the diagonal element which is replaced by a 1.

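We will write $Q_b = X_b^* X_b + |z|^2 I - \alpha I$ and then note that $R^*R - \alpha I$ has the form

\[ \begin{pmatrix}
Q_m & -\bar z X_1 & 0 & \cdots & 0 & -z X_m^* \\
-z X_1^* & Q_1 & -\bar z X_2 & 0 & \cdots & 0 \\
0 & -z X_2^* & Q_2 & \ddots & 0 & \vdots \\
\vdots & 0 & \ddots & \ddots & -\bar z X_{m-2} & 0 \\
0 & \cdots & 0 & -z X_{m-2}^* & Q_{m-2} & -\bar z X_{m-1} \\
-\bar z X_m & 0 & \cdots & 0 & -z X_{m-1}^* & Q_{m-1}
\end{pmatrix}, \tag{12} \]

where $Q_m, Q_1, \ldots, Q_{m-1}$ appear along the diagonal.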

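Let $\sigma = (1\ 2\ 3\ \ldots\ m) \in S_m$. We now construct two bijective maps. Let $T_\sigma$ be the map that takes matrices of the form (12) into the matrix where each occurrence of $X_b$ is replaced by $X_{\sigma(b)}$ and each occurrence of $Q_b$ is replaced by $Q_{\sigma(b)}$. Also, let

\[ \Omega = \underbrace{\mathbb{C}^{n^2} \times \mathbb{C}^{n^2} \times \cdots \times \mathbb{C}^{n^2}}_{m \text{ times}} \]

denote the probability space. Then we write $\omega \in \Omega$ as $\omega = (X_1, X_2, \ldots, X_m)$. We now define $T'_\sigma : \Omega \to \Omega$ by $T'_\sigma(X_1, \ldots, X_m) = (X_2, X_3, \ldots, X_m, X_1)$. Since $X_1, \ldots, X_m$ are independent and identically distributed random matrices, $T'_\sigma$ is a measure preserving map.

We claim that $\det(R^*R - \alpha I) = \det(T_\sigma(R^*R - \alpha I))$. Indeed, if $\lambda$ is an eigenvalue of $(R^*R - \alpha I)$ with eigenvector $v = (v_m, v_1, \ldots, v_{m-1})^T$ where $v_b$ is an $n$-vector, then a simple computation reveals that $w = (v_{\sigma(m)}, v_{\sigma(1)}, \ldots, v_{\sigma(m-1)})^T = (v_1, \ldots, v_m)^T$ is an eigenvector of $T_\sigma(R^*R - \alpha I)$ with eigenvalue $\lambda$. Similarly, $\det(R^*R - \alpha I)^{(b,i)} = \det\left[ T_\sigma\left( (R^*R - \alpha I)^{(b,i)} \right) \right]$. Define $f_{a,i}(\omega)$ to be $\det(R^*R - \alpha I)^{(a,i)}(\omega)$ for each realization $\omega \in \Omega$. Then we have that

\begin{align*}
f_{a+1,i}(\omega) &= \det\left[ (R^*R - \alpha I)^{(a+1,i)}(\omega) \right] \\
&= \det\left[ T_\sigma\left( (R^*R - \alpha I)^{(a+1,i)} \right)(\omega) \right] \\
&= \det\left[ \left( T_\sigma(R^*R - \alpha I) \right)^{(a,i)}(\omega) \right] \\
&= \det\left[ (R^*R - \alpha I)^{(a,i)}(T'_\sigma(\omega)) \right] = f_{a,i}(T'_\sigma(\omega))
\end{align*}

and

\[ \det(R^*R - \alpha I)(\omega) = \det(R^*R - \alpha I)(T'_\sigma(\omega)). \]

Thus $G_{a,a;i,i}(T'_\sigma(\omega)) = G_{a+1,a+1;i,i}(\omega)$ for each $\omega \in \Omega$. Since $T'_\sigma$ is measure preserving, the proof is complete.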

Next, we present the decoupling formula, which can be found, for example, in [18]. If $\xi$ is a real-valued random variable such that $E|\xi|^{p+2} < \infty$ and if $f(t)$ is a complex-valued function of a real variable such that its first $p+1$ derivatives are continuous and bounded, then

\[ E[\xi f(\xi)] = \sum_{a=0}^{p} \frac{\kappa_{a+1}}{a!} E[f^{(a)}(\xi)] + \varepsilon, \tag{13} \]

where $\kappa_a$ are the cumulants of $\xi$ and $|\varepsilon| \le C \sup_t |f^{(p+1)}(t)|\, E|\xi|^{p+2}$, where $C$ depends only on $p$.

If $\xi$ is a Gaussian random variable with mean zero, then all the cumulants vanish except for $\kappa_2$ and the decoupling formula reduces to the exact equation

\[ E[\xi f(\xi)] = E[\xi^2]\, E[f'(\xi)]. \]
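The exact Gaussian identity can be confirmed numerically for a concrete test function. The sketch below is an illustration only: the choice $f = \sin$, the variance $0.7$, and the trapezoid-rule quadrature are assumptions made here. It also evaluates both sides for a Rademacher variable ($\xi = \pm 1$ with equal probability), where higher cumulants do not vanish and the two-term identity is no longer exact.

```python
import math

def gaussian_expectation(h, variance, half_width=12.0, steps=24000):
    """E[h(xi)] for xi ~ N(0, variance), by the trapezoid rule on a wide interval."""
    s = math.sqrt(variance)
    a, b = -half_width * s, half_width * s
    dx = (b - a) / steps
    total = 0.0
    for k in range(steps + 1):
        x = a + k * dx
        w = 0.5 if k in (0, steps) else 1.0
        total += w * h(x) * math.exp(-x * x / (2.0 * variance))
    return total * dx / math.sqrt(2.0 * math.pi * variance)

variance = 0.7
# Left side E[xi f(xi)] and right side E[xi^2] E[f'(xi)] with f = sin, f' = cos.
lhs = gaussian_expectation(lambda x: x * math.sin(x), variance)
rhs = variance * gaussian_expectation(math.cos, variance)

# Rademacher case: E[xi sin(xi)] = sin(1), while E[xi^2] E[cos(xi)] = cos(1).
rad_lhs = 0.5 * (1.0 * math.sin(1.0) + (-1.0) * math.sin(-1.0))
rad_rhs = 1.0 * 0.5 * (math.cos(1.0) + math.cos(-1.0))

if __name__ == "__main__":
    print(lhs, rhs)
    print(rad_lhs, rad_rhs)
```

The gap in the Rademacher case is what the higher-order terms and the error $\varepsilon$ in (13) account for.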

Finally, to use (13), we need to compute the derivatives of the resolvent matrix $G$ with respect to the various entries of $Y$. This can be done by utilizing the resolvent identity and we find

\begin{align*}
\frac{\partial G_{a,b;k,l}}{\partial \operatorname{Re}(Y_{c,c+1;q,p})} &= -(GR^*)_{a,c;k,q}\, G_{c+1,b;p,l} - G_{a,c+1;k,p}\, (RG)_{c,b;q,l}, \\
\frac{\partial G_{a,b;k,l}}{\partial \operatorname{Im}(Y_{c,c+1;q,p})} &= -i(GR^*)_{a,c;k,q}\, G_{c+1,b;p,l} + i G_{a,c+1;k,p}\, (RG)_{c,b;q,l}.
\end{align*}

4.3.3 Main Theorem

For the results below, we will consider $\alpha = x + iy$ where $y \ge y_n$ with $y_n = n^{-\eta\delta}$. Our goal is to establish the following result.

Theorem 15. Under the conditions of Theorem 2 and the additional assumption that $|Y_{a,a+1;i,j}| \le n^\delta$, we have

\[ \Delta_n^3(\alpha, z) + 2\Delta_n^2(\alpha, z) + \frac{\alpha + 1 - |z|^2}{\alpha} \Delta_n(\alpha, z) + \frac{1}{\alpha} = r_n(\alpha, z), \]

where if $\delta$ is chosen such that $\delta\eta \le 1/32$ and $\delta \le 1/32$, then the remainder term $r_n$ satisfies

\[ \sup \{ |r_n(\alpha, z)| : |z| \le M,\ |x| \le N,\ y \ge y_n \} = O(\delta_n) \quad \text{a.s.} \]

with $\delta_n = n^{-1/4} y_n^{-5} n^\delta$.

Remark 16. We note that the bounds presented here and in the rest of this section are not optimal and can be improved. The bounds given, however, are sufficient for our purposes.

In order to prove Theorem 15, we will need the following lemmas. The first lemma is McDiarmid’s Concentration Inequality[21].

Lemma 17 (McDiarmid’s Concentration Inequality). Let $X = (X_1, X_2, \ldots, X_n)$ be a family of independent random variables with $X_k$ taking values in a set $A_k$ for each $k$. Suppose that the real-valued function $f$ defined on $\prod A_k$ satisfies

\[ |f(x) - f(x')| \le c_k \]

whenever the vectors $x$ and $x'$ differ only in the $k$th coordinate. Let $\mu$ be the expected value of the random variable $f(X)$. Then for any $t \ge 0$,

\[ P\left( |f(X) - \mu| \ge t \right) \le 2 e^{-2t^2 / \sum c_k^2}. \]

Remark 18. McDiarmid’s Concentration Inequality also applies to complex-valued functions by applying Lemma 17 to the real part and imaginary part separately.
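To make Lemma 17 concrete, the following sketch compares the bound with a simulated tail probability in the simplest setting: $f$ is the sample mean of $n$ independent uniform variables on $[0, 1]$, so that changing one coordinate moves $f$ by at most $c_k = 1/n$. The example, the function names, and the simulation parameters are illustrative assumptions and are unrelated to the matrices studied in this paper.

```python
import math
import random

def mcdiarmid_bound(t, c):
    """Right-hand side of Lemma 17 for bounded differences c = (c_1, ..., c_n)."""
    return 2.0 * math.exp(-2.0 * t * t / sum(ck * ck for ck in c))

def empirical_tail(n, t, trials=5000, seed=0):
    """Frequency of |mean(U_1..U_n) - 1/2| >= t for independent uniforms on [0, 1]."""
    rng = random.Random(seed)
    count = 0
    for _ in range(trials):
        mean = sum(rng.random() for _ in range(n)) / n
        if abs(mean - 0.5) >= t:
            count += 1
    return count / trials

if __name__ == "__main__":
    n = 50
    for t in (0.1, 0.2):
        print(t, empirical_tail(n, t), mcdiarmid_bound(t, [1.0 / n] * n))
```

The simulated frequencies sit well below the bound, as expected from an inequality that holds for every function with the stated bounded differences.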


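Lemma 19. For $y \ge y_n$ and $|x| \le N$ (where $\alpha = x + iy$),

\[ P\left( |\Delta_n(\alpha, z) - E\Delta_n(\alpha, z)| > t \right) \le 4 e^{-c t^2 n y_n^4} \tag{14} \]

for some absolute constant $c > 0$. Moreover,

\[ \sup \{ |\Delta_n(\alpha, z) - E\Delta_n(\alpha, z)| : |z| \le M,\ |x| \le N,\ y \ge y_n \} = O\left( n^{-1/4} y_n^{-2} \right) \quad \text{a.s.} \]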

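Proof. Let $R_k$ denote the matrix $R$ with the $k$-th column replaced by zeroes. Then $R^*R$ and $R_k^* R_k$ differ by a matrix with rank at most two. So by the resolvent identity

\begin{align}
\left| \frac{1}{mn} \operatorname{Tr} (R^*R - \alpha)^{-1} - \frac{1}{mn} \operatorname{Tr} (R_k^* R_k - \alpha)^{-1} \right| &\le \frac{2}{mn} \left\| (R^*R - \alpha)^{-1} (R_k^* R_k - R^*R) (R_k^* R_k - \alpha)^{-1} \right\| \tag{15} \\
&\le \frac{C}{n y_n} \sup_{t \ge 0} \left| (t - \alpha)^{-1} t \right| = C' \frac{1}{n y_n^2}, \notag
\end{align}

where the constant $C'$ depends only on $N$. The $mn$ columns of $Y$ form an independent family of random variables. We now apply Lemma 17 to the complex-valued function $\frac{1}{mn} \operatorname{Tr}(R^*R - \alpha)^{-1}$ with the bound $c_k = O(n^{-1} y_n^{-2})$ obtained in (15). This proves the bound (14). Thus, for any fixed point $(\alpha, z)$ in the region

\[ \{ (\alpha = x + iy,\ z = s + it) : |x| \le N,\ y \ge y_n,\ |z| \le M \} \tag{16} \]

one has

\[ P\left( |\Delta_n(\alpha, z) - E\Delta_n(\alpha, z)| > n^{-1/4} y_n^{-2} \right) \le 4 e^{-c n^{1/2}}, \tag{17} \]

where we recall that $y_n = n^{-\eta\delta}$ and $\delta > 0$ could be chosen to be arbitrarily small.

If $y = \operatorname{Im} \alpha > n^{1/4} y_n^2$, then

\[ |\Delta_n(\alpha, z)| \le \frac{1}{\operatorname{Im} \alpha} < n^{-1/4} y_n^{-2}, \qquad |E\Delta_n(\alpha, z)| < n^{-1/4} y_n^{-2}. \tag{18} \]

Therefore, it is enough to bound the supremum of $|\Delta_n(\alpha) - E\Delta_n(\alpha)|$ over the region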

\[ D = \{ (\alpha = x + iy,\ z = s + it) : |x| \le N,\ y_n \le y \le n^{1/4} y_n^2,\ |z| \le M \}. \tag{19} \]

To this end, we consider a finite $n^{-C}$-net of $D$ where $C$ is some sufficiently large positive constant to be chosen later. Clearly, one can construct such a net that contains at most $[4M n^{4C} n^{1/4} y_n^2]$ points if $n$ is sufficiently large, where $[k]$ denotes the integer part of $k$. Let us denote these points by $(\alpha_i, z_i)$, $1 \le i \le [4M n^{4C} n^{1/4} y_n^2]$. It follows from (17) that one has

\[ P\left( \sup_i |\Delta_n(\alpha_i, z_i) - E\Delta_n(\alpha_i, z_i)| > n^{-1/4} y_n^{-2} \right) \le 16 M y_n^2\, n^{4C + 1/4} e^{-c n^{1/2}}, \tag{20} \]

where the supremum is taken over the points of the net. Applying the Borel-Cantelli lemma, we obtain that

\[ \sup_i |\Delta_n(\alpha_i, z_i) - E\Delta_n(\alpha_i, z_i)| = O\left( n^{-1/4} y_n^{-2} \right) \quad \text{a.s.,} \tag{21} \]

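where the supremum is again taken over the points of the $n^{-C}$-net of $D$. To extend the estimate (21) to the supremum over the whole region $D$, we note that for $(\alpha, z) \in D$,

\begin{align}
\left| \frac{\partial \Delta_n(\alpha, z)}{\partial \operatorname{Re} \alpha} \right| &\le \frac{1}{y_n^2}, \tag{22} \\
\left| \frac{\partial \Delta_n(\alpha, z)}{\partial \operatorname{Im} \alpha} \right| &\le \frac{1}{y_n^2}, \tag{23} \\
\left| \frac{\partial \Delta_n(\alpha, z)}{\partial \operatorname{Re} z} \right| &\le \mathrm{const}_m \frac{2(n^{1+\delta} + M)}{y_n^2}, \tag{24} \\
\left| \frac{\partial \Delta_n(\alpha, z)}{\partial \operatorname{Im} z} \right| &\le \mathrm{const}_m \frac{2(n^{1+\delta} + M)}{y_n^2}, \tag{25}
\end{align}

where $\mathrm{const}_m$ is a constant that depends only on $m$.

The bounds (22)-(23) are simple properties of the Stieltjes transform. Indeed, the left-hand sides of (22) and (23) are bounded from above by $\frac{1}{|\operatorname{Im} \alpha|^2}$. The proof of (24)-(25) follows from the resolvent identity

\[ (H_n(z_2) - \alpha I)^{-1} - (H_n(z_1) - \alpha I)^{-1} = (H_n(z_1) - \alpha I)^{-1} (H_n(z_1) - H_n(z_2)) (H_n(z_2) - \alpha I)^{-1}, \]

the formula $H_n(z) = (Y^{(n)} - zI)^*(Y^{(n)} - zI)$, the bound $|z| \le M$, and the bound

\[ \|Y^{(n)}\| \le n^{1+\delta}. \tag{26} \]

We note that (26) follows from the fact that the matrix entries of $Y^{(n)}$ are bounded by $n^\delta$.

Now, choosing $C$ in the construction of the net sufficiently large, one extends the bound (21) to the whole region $D$ by (22)-(25). This finishes the proof of the lemma.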

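Lemma 20. For any $1 \le a \le m$,

\[ \sup \left\{ \operatorname{Var}\left( \frac{1}{n} \operatorname{Tr} G_{aa} \right) : |x| \le N,\ y \ge y_n,\ z \in \mathbb{C} \right\} = O(n^{-1} y_n^{-2}), \]

where $\alpha = x + iy$.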

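Proof. Let $R_k$ denote the matrix $R$ with the $k$-th column replaced by zeroes and let $P_a$ be the orthogonal projector such that $\operatorname{Tr} G_{a,a} = \operatorname{Tr}(P_a G P_a)$. Following the same procedure as in the proof of Lemma 19, we have that

\begin{align}
\left| \operatorname{Tr} (R^*R - \alpha)^{-1}_{a,a} - \operatorname{Tr} (R_k^* R_k - \alpha)^{-1}_{a,a} \right| &= \left| \operatorname{Tr}\left[ P_a (R_k^* R_k - \alpha)^{-1} (R_k^* R_k - R^*R) (R^*R - \alpha)^{-1} P_a \right] \right| \tag{27} \\
&\le \frac{C}{y_n^2}, \notag
\end{align}

where the constant $C$ depends only on $N$.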

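We can write

\[ \frac{1}{n} \operatorname{Tr} G_{a,a} - E \frac{1}{n} \operatorname{Tr} G_{a,a} = \frac{1}{n} \sum_{k=1}^{mn} \gamma_k, \]

where $\gamma_k$ is the martingale difference sequence

\[ \gamma_k = E_k\left[ \frac{1}{n} \operatorname{Tr} G_{a,a} \right] - E_{k-1}\left[ \frac{1}{n} \operatorname{Tr} G_{a,a} \right] \]

and $E_k$ denotes the conditional expectation with respect to the elements in the first $k$ columns of $Y$. Then by the bound in (27) and [6, Lemma 2.12], we have

\[ E\left| \frac{1}{n} \operatorname{Tr} G_{a,a} - E \frac{1}{n} \operatorname{Tr} G_{a,a} \right|^2 \le \frac{C}{n^2} \sum_{k=1}^{mn} |\gamma_k|^2 \le \frac{C}{n y_n^2}, \]

where the constant $C$ depends only on $N$. Since the bound holds for any $|x| \le N$, $y \ge y_n$, and $z \in \mathbb{C}$, the proof is complete.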

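Remark 21. By Lemmas 14, 19, and 20, for every $1 \le a, b, c \le m$,

\begin{align*}
E\left[ \frac{1}{n} \operatorname{Tr} G_{a,a} \right] &= E\left[ \frac{1}{mn} \operatorname{Tr} G \right] = \frac{1}{mn} \operatorname{Tr} G + O(n^{-1/4} y_n^{-5}) \quad \text{a.s.,} \\
E\left[ \frac{1}{n} \operatorname{Tr} G_{a,a}\, \frac{1}{n} \operatorname{Tr} G_{b,b} \right] &= \left( E\left[ \frac{1}{mn} \operatorname{Tr} G \right] \right)^2 + O(n^{-1/4} y_n^{-5}) \\
&= \left( \frac{1}{mn} \operatorname{Tr} G \right)^2 + O(n^{-1/4} y_n^{-5}) \quad \text{a.s.,}
\end{align*}

and

\begin{align*}
E\left[ \frac{1}{n} \operatorname{Tr} G_{a,a}\, \frac{1}{n} \operatorname{Tr} G_{b,b}\, \frac{1}{n} \operatorname{Tr} G_{c,c} \right] &= \left( E\left[ \frac{1}{mn} \operatorname{Tr} G \right] \right)^3 + O(n^{-1/4} y_n^{-5}) \\
&= \left( \frac{1}{mn} \operatorname{Tr} G \right)^3 + O(n^{-1/4} y_n^{-5}) \quad \text{a.s.,}
\end{align*}

where the bounds hold uniformly in the region $|x| \le N$, $y \ge y_n$, and $|z| \le M$.

We are now ready to prove Theorem 15.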

Proof of Theorem 15. Fix $\alpha = x + iy$ with $|x| \le N$, $y \ge y_n$ and $z \in \mathbb{C}$ with $|z| \le M$. We will show that the remainder term $r_n(\alpha, z) = O(\delta_n)$ a.s., where the constants in the term $O(\delta_n)$ depend only on $N$ and $M$. In particular, the remainder term will be estimated using Lemmas 13 and 19 and Remark 21, where the bounds all hold uniformly in the region. In the proof presented below, we will use the notation $O_{N,M}(\cdot)$ to represent a term which is bounded uniformly in the region $|x| \le N$, $y \ge y_n$, and $|z| \le M$.

By applying the resolvent identity to $G$ and replacing $R$ and $R^*$ with $Y - zI$ and $Y^* - \bar z I$, respectively, we obtain

\[ \frac{1}{n} \operatorname{Tr} G_{a,a} = -\frac{1}{\alpha} + \frac{1}{\alpha n} \operatorname{Tr}[GY^*Y]_{a,a} - \frac{z}{\alpha n} \operatorname{Tr}[GY^*]_{a,a} - \frac{\bar z}{\alpha n} \operatorname{Tr}[GY]_{a,a} + \frac{|z|^2}{\alpha n} \operatorname{Tr} G_{a,a}. \]

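We will let $Y^{(r)}$ be the $(mn) \times (mn)$ matrix containing the real entries of $Y$ and $Y^{(i)}$ be the $(mn) \times (mn)$ matrix containing the imaginary entries of $Y$ such that $Y = Y^{(r)} + iY^{(i)}$. By assumption, $Y^{(r)}$ and $Y^{(i)}$ are independent random matrices. Thus,

\begin{align}
\left( 1 - \frac{|z|^2}{\alpha} \right) E \frac{1}{n} \operatorname{Tr} G_{a,a} + \frac{1}{\alpha} &= \frac{1}{\alpha n} E \operatorname{Tr}(GY^* Y^{(r)})_{a,a} + \frac{i}{\alpha n} E \operatorname{Tr}(GY^* Y^{(i)})_{a,a} \notag \\
&\quad - \frac{z}{\alpha n} E \operatorname{Tr}(G Y^{(r)*})_{a,a} + \frac{iz}{\alpha n} E \operatorname{Tr}(G Y^{(i)*})_{a,a} \tag{28} \\
&\quad - \frac{\bar z}{\alpha n} E \operatorname{Tr}(G Y^{(r)})_{a,a} - \frac{\bar z i}{\alpha n} E \operatorname{Tr}(G Y^{(i)})_{a,a}. \notag
\end{align}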

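Let $\delta = \operatorname{Var}(\operatorname{Re}(\xi))$. Then $\operatorname{Var}(\operatorname{Im}(\xi)) = 1 - \delta$. To compute the expectation, we fix all matrix entries except one and integrate with respect to that entry. Thus, by applying the decoupling formula (13) with $p = 1$ and using the fact that $Y_{a,b;i,j} = 0$ whenever $b \neq a + 1$, we obtain the following expansions for the terms on the right-hand side of (28),

\begin{align*}
\frac{1}{\alpha n} E \operatorname{Tr}(GY^* Y^{(r)})_{a,a} &= \frac{1}{\alpha n} E \sum_{1 \le j, k, l \le n} G_{a,a;j,k}\, \bar Y_{a-1,a;l,k}\, \operatorname{Re}(Y_{a-1,a;l,j}) \\
&= \frac{\delta}{\alpha n} E \operatorname{Tr} G_{a,a} - \frac{\delta}{\alpha n^2} E \sum_{1 \le j, k, l \le n} \bar Y_{a-1,a;l,k}\, (GR^*)_{a,a-1;j,l}\, G_{a,a;j,k} \\
&\quad - \frac{\delta}{\alpha n^2} E \sum_{1 \le j, k, l \le n} \bar Y_{a-1,a;l,k}\, G_{a,a;j,j}\, (RG)_{a-1,a;l,k} + O_{N,M}\left( \frac{n^\delta}{n^{1/2} y_n^4} \right) \\
&= \frac{\delta}{\alpha n} E \operatorname{Tr} G_{a,a} - \frac{\delta}{\alpha n^2} E\left[ \operatorname{Tr} G_{a,a}\, \operatorname{Tr}(RGY^*)_{a-1,a-1} \right] + O_{N,M}\left( \frac{n^\delta}{n^{1/2} y_n^4} \right).
\end{align*}

Here we use that the $\varepsilon$ error term in (13) contains the second derivative

\[ \frac{\partial^2 (GY^*)_{a,a;j,l}}{\partial \operatorname{Re}(Y_{a-1,a;l,j})^2} = O_{N,M}\left( \frac{1}{y_n^3} \right), \]

which consists of several terms each bounded by Lemma 13. After summing over $1 \le j, l \le n$ and utilizing the fact that the third moment of $\operatorname{Re}(Y_{a-1,a;l,j})$ is of order $n^{\delta - 3/2}$, we obtain an error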