ISSN: 1083-589X, Electronic Communications in Probability

### Limiting spectral distribution of sums of unitary and orthogonal matrices^∗

### Anirban Basak^† Amir Dembo^‡

**Abstract**

We show that the empirical eigenvalue measure for the sum of d independent Haar distributed n-dimensional unitary matrices converges, as n → ∞, to the Brown measure of the free sum of d Haar unitary operators. The same applies for independent Haar distributed n-dimensional orthogonal matrices. As a byproduct of our approach, we relax the requirement of a uniformly bounded imaginary part of the Stieltjes transform of T_n that is made in [7, Thm. 1].

**Keywords:** Random matrices, limiting spectral distribution, Haar measure, Brown measure, free convolution, Stieltjes transform, Schwinger-Dyson equation.

**AMS MSC 2010:** 46L53; 60B10; 60B20.

Submitted to ECP on November 26, 2012, final version accepted on April 29, 2013.

**1 Introduction**

The method of moments and the Stieltjes transform approach provide rather precise information on asymptotics of the Empirical Spectral Distribution (in short, ESD) for many Hermitian random matrix models. In contrast, both methods fail for non-Hermitian matrix models, and the only available general scheme for finding the limiting spectral distribution in such cases is the one proposed by Girko (in [6]). It is extremely challenging to rigorously justify this scheme, even for the matrix model consisting of i.i.d. entries (of zero mean and finite variance). Indeed, after a rather long series of partial results (see historical references in [3]), the circular law conjecture, for the i.i.d. case, was only recently established by Tao and Vu [17] in full generality. Barring this simple model, very few results are known in the non-Hermitian regime. For example, nothing is known about the spectral measure of random oriented d-regular graphs. In this context, it was recently conjectured in [3] that, for d ≥ 3, the ESD for the adjacency matrix of a uniformly chosen random oriented d-regular graph converges to a measure µ_d on the complex plane, whose density with respect to Lebesgue measure m(·) on C is

$$h_d(v) := \frac{1}{\pi}\,\frac{d^2(d-1)}{(d^2-|v|^2)^2}\,\mathbb{I}_{\{|v|\le \sqrt{d}\}}. \tag{1.1}$$

This conjecture, due to the observation that µ_d is the Brown measure of the free sum of d ≥ 2 Haar unitary operators (see [9, Example 5.5]), motivated us to consider the related

∗ Support: Melvin & Joan Lane endowed Stanford Graduate Fellowship Fund, and NSF grant DMS-1106627.

† Department of Statistics, Stanford University, USA. E-mail: anirbanb@stanford.edu

‡ Department of Mathematics, and Department of Statistics, Stanford University, USA. E-mail: amir@math.stanford.edu

problem of the sum of d independent Haar distributed, unitary or orthogonal matrices, for which we prove such convergence of the ESD in Theorem 1.2. To this end, using hereafter the notation ⟨Log, µ⟩_a^b := ∫_a^b log|x| dµ(x) for any a < b and probability measure µ on R (for which such an integral is well defined), with ⟨Log, µ⟩ := ∫_R log|x| dµ(x), we first recall the definition of the Brown measure of a bounded operator (see [9, Page 333], or [2, 4]).

**Definition 1.1.** Let (A, τ) be a non-commutative W^∗-probability space, i.e. a von Neumann algebra A with a normal faithful tracial state τ (see [1, Defn. 5.2.26]). For h a positive element in A, let µ_h denote the unique probability measure on R^+ such that τ(h^n) = ∫ t^n dµ_h(t) for all n ∈ Z^+. The Brown measure µ_a associated with each bounded a ∈ A is the Riesz measure corresponding to the [−∞, ∞)-valued sub-harmonic function v ↦ ⟨Log, µ_{|a−v|}⟩ on C. That is, µ_a is the unique Borel probability measure on C such that

$$d\mu_a(v) = \frac{1}{2\pi}\,\Delta_v \langle {\rm Log},\, \mu_{|a-v|}\rangle\, dm(v), \tag{1.2}$$

where ∆_v denotes the two-dimensional Laplacian operator (with respect to v ∈ C), and the identity (1.2) holds in the sense of distributions (i.e. when integrated against any test function ψ ∈ C_c^∞(C)).

**Theorem 1.2.** For any d ≥ 1 and 0 ≤ d′ ≤ d, as n → ∞ the ESD for the sum of d′ independent, Haar distributed, n-dimensional unitary matrices {U_n^i}, and (d − d′) independent, Haar distributed, n-dimensional orthogonal matrices {O_n^i}, converges weakly, in probability, to the Brown measure µ_d of the free sum of d Haar unitary operators (whose density is given in (1.1)).
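For readers who want to see Theorem 1.2 numerically, here is a minimal sketch (ours, not part of the paper): it samples a Haar unitary via the QR factorization of a complex Ginibre matrix, forms the sum of d of them, and compares the empirical fraction of eigenvalues in a disk with the radial CDF of µ_d, which from (1.1) equals F(r) = (d−1)r²/(d²−r²) for r ≤ √d. The helper names and sample sizes are our choices.

```python
import numpy as np

def haar_unitary(n, rng):
    # QR of a complex Ginibre matrix, with the phases of R's diagonal divided
    # out, yields an exactly Haar distributed unitary matrix.
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))

def radial_cdf(r, d):
    # mu_d({|v| <= r}), obtained by integrating the density (1.1) in polar
    # coordinates: F(r) = (d - 1) r^2 / (d^2 - r^2) for 0 <= r <= sqrt(d).
    return (d - 1) * r**2 / (d**2 - r**2)

rng = np.random.default_rng(0)
n, d = 400, 3
S = sum(haar_unitary(n, rng) for _ in range(d))
lam = np.linalg.eigvals(S)
emp = np.mean(np.abs(lam) <= 1.0)  # empirical mass of the unit disk
```

At r = √d the CDF equals 1, matching the support {|v| ≤ √d} of (1.1).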

Recall that as n → ∞, independent Haar distributed n-dimensional unitary (or orthogonal) matrices converge in ∗-moments (see [16] for a definition) to the collection {u_i}_{i=1}^d of ∗-free Haar unitary operators (see [1, Thm. 5.4.10]). However, convergence of ∗-moments, or even the stronger convergence in distribution of traffics (of [11]), does not necessarily imply convergence of the corresponding Brown measures^1 (see [16, §2.6]).

While [16, Thm. 6] shows that if the original matrices are perturbed by adding a small Gaussian (of unknown variance), then the Brown measures do converge, removing the Gaussian, or merely identifying the variance needed, are often hard tasks. For example, [8, Prop. 7 and Cor. 8] provide an example of an ensemble where no Gaussian matrix of polynomially vanishing variance can regularize the Brown measures (in this sense).

Theorem 1.2 shows that sums of independent Haar distributed unitary/orthogonal matrices are smooth enough to have the convergence of ESD-s to the corresponding Brown measures without adding any Gaussian.

Guionnet, Krishnapur and Zeitouni show in [7] that the limiting ESD of U_n T_n, for non-negative definite, diagonal T_n of limiting spectral measure Θ that is independent of the Haar distributed unitary (or orthogonal) matrix U_n, exists, is supported on a single ring, and is given by the Brown measure of the corresponding bounded (see [7, Eqn. (1)]) limiting operator. Their results, as well as our work, follow Girko's method, which we now describe, in brief.

From Green's formula, for any polynomial P(v) = ∏_{i=1}^n (v − λ_i) and test function ψ ∈ C_c^2(C), we have that

$$\sum_{j=1}^{n} \psi(\lambda_j) = \frac{1}{2\pi} \int_{\mathbb C} \Delta\psi(v) \log|P(v)|\, dm(v).$$

^1 The Brown measure of a matrix is its ESD (see [16, Propn. 1]).
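Green's formula above is easy to test numerically. In the sketch below (ours), we take ψ(v) = e^{−|v|²} — whose tails are negligible on the integration box, standing in for a C_c^2 function — with ∆ψ(v) = (4|v|² − 4)e^{−|v|²} computed analytically, and check that the Riemann sum of (1/2π)∆ψ log|P| over a half-step-shifted grid (shifted so no grid point hits a root of P) recovers Σ_j ψ(λ_j).

```python
import numpy as np

roots = np.array([0.5 + 0.0j, -0.3 + 0.4j, 1.0j])  # the lambda_i of P

h = 0.01
x = np.arange(-5, 5, h) + h / 2        # half-step shift avoids the roots exactly
X, Y = np.meshgrid(x, x)
V = X + 1j * Y
R2 = np.abs(V) ** 2

psi = np.exp(-R2)                      # test function
lap_psi = (4 * R2 - 4) * np.exp(-R2)   # its 2D Laplacian, analytically

log_abs_P = sum(np.log(np.abs(V - lam)) for lam in roots)
green = (h * h / (2 * np.pi)) * np.sum(lap_psi * log_abs_P)
direct = np.sum(np.exp(-np.abs(roots) ** 2))  # sum of psi over the roots
```

The log singularities at the roots are integrable, so the midpoint sum converges to the left-hand side.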

Considering Green's formula for the characteristic polynomial P(·) of a matrix S_n (whose ESD we denote hereafter by L_{S_n}) results in

$$\int_{\mathbb C} \psi(v)\, dL_{S_n}(v) = \frac{1}{2\pi n} \int_{\mathbb C} \Delta\psi(v) \log|\det(vI_n - S_n)|\, dm(v) = \frac{1}{4\pi n} \int_{\mathbb C} \Delta\psi(v) \log\det\big[(vI_n - S_n)(vI_n - S_n)^{*}\big]\, dm(v).$$

Next, associate with any n-dimensional non-Hermitian matrix S_n and every v ∈ C the 2n-dimensional Hermitian matrix

$$H_n^v := \begin{bmatrix} 0 & (S_n - vI_n) \\ (S_n - vI_n)^{*} & 0 \end{bmatrix}. \tag{1.3}$$
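The point of the hermitization (1.3) is that the spectrum of H_n^v consists of ± the singular values of vI_n − S_n, so log-determinants of the non-Hermitian problem become spectral data of a Hermitian one. A small numerical sketch of this (ours, with an arbitrary random S_n):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
S = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
v = 0.3 + 0.2j

B = S - v * np.eye(n)
H = np.block([[np.zeros((n, n)), B], [B.conj().T, np.zeros((n, n))]])  # (1.3)

eig_H = np.sort(np.linalg.eigvalsh(H))       # spectrum of the 2n x 2n matrix
sv = np.linalg.svd(B, compute_uv=False)      # singular values of vI - S (same as of B)
pm = np.sort(np.concatenate([-sv, sv]))      # +/- the singular values

log_nu = np.mean(np.log(np.abs(eig_H)))      # <Log, nu_n^v>, the mean over 2n eigenvalues
lhs = np.log(np.abs(np.linalg.det(H))) / n   # (1/n) log|det H_n^v|
```

The last two quantities realize the identity (1/n)log|det H_n^v| = 2⟨Log, ν_n^v⟩ used just below.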

It can easily be checked that the eigenvalues of H_n^v are merely ±1 times the singular values of vI_n − S_n. Therefore, with ν_n^v denoting the ESD of H_n^v, we have that

$$\frac{1}{n} \log\det\big[(vI_n - S_n)(vI_n - S_n)^{*}\big] = \frac{1}{n} \log|\det H_n^v| = 2\langle {\rm Log},\, \nu_n^v\rangle,$$

out of which we deduce the key identity

$$\int_{\mathbb C} \psi(v)\, dL_{S_n}(v) = \frac{1}{2\pi} \int_{\mathbb C} \Delta\psi(v)\, \langle {\rm Log},\, \nu_n^v\rangle\, dm(v) \tag{1.4}$$

(commonly known as Girko's formula). The utility of Eqn. (1.4) lies in the following general recipe for proving convergence of L_{S_n} per given family of non-Hermitian random matrices {S_n} (to which we referred already as Girko's method).

**Step 1: Show that for (Lebesgue almost) every** v ∈ C, as n → ∞ the measures ν_n^v converge weakly, in probability, to some measure ν^v.

**Step 2: Justify that** ⟨Log, ν_n^v⟩ → ⟨Log, ν^v⟩ in probability (which is the main technical challenge of this approach).

**Step 3: A uniform integrability argument allows one to convert the** v-a.e. convergence of ⟨Log, ν_n^v⟩ to the corresponding convergence for a suitable collection S ⊆ C_c^2(C) of (smooth) test functions. Consequently, it then follows from (1.4) that for each fixed, non-random ψ ∈ S,

$$\int_{\mathbb C} \psi(v)\, dL_{S_n}(v) \to \frac{1}{2\pi} \int_{\mathbb C} \Delta\psi(v)\, \langle {\rm Log},\, \nu^v\rangle\, dm(v), \tag{1.5}$$

in probability.

**Step 4: Upon checking that** f(v) := ⟨Log, ν^v⟩ is smooth enough to justify the integration by parts, one has that for each fixed, non-random ψ ∈ S,

$$\int_{\mathbb C} \psi(v)\, dL_{S_n}(v) \to \frac{1}{2\pi} \int_{\mathbb C} \psi(v)\, \Delta f(v)\, dm(v), \tag{1.6}$$

in probability. For S large enough, this implies the convergence in probability of the ESD-s L_{S_n} to a limit which has the density (1/2π)∆f with respect to Lebesgue measure on C.

Employing this method in [7] requires, for **Step 2**, establishing suitable asymptotics for the singular values of T_n + ρU_n. Indeed, the key to the proofs there is to show that uniform boundedness of the imaginary part of the Stieltjes transform of T_n (of the form assumed in [7, Eqn. (3)]) is inherited by the corresponding transform of T_n + ρU_n (see (1.12) for a definition of U_n and T_n). In the context of Theorem 1.2 (for d′ ≥ 1), at the start d = 1, the expected ESD of |vI_n − U_n| has unbounded density (see Lem. 4.1), so the imaginary parts of the relevant Stieltjes transforms are unbounded. We circumvent this problem by localizing the techniques of [7], whereby we can follow the development of unbounded regions of the resolvent via the map T_n ↦ T_n + ρ(U_n + U_n^∗) (see Lem. 1.5), so as to achieve the desired convergence of the integral of the logarithm near zero, for Lebesgue almost every z. We note in passing that Rudelson and Vershynin showed in [15] that the condition of [7, Eqn. (2)] about the minimal singular value can be dispensed with (see [15, Cor. 1.4]), but the remaining uniform boundedness condition [7, Eqn. (3)] is quite rigid. For example, it excludes atoms in the limiting measure Θ (so does not allow even T_n = I_n, see [7, Rmk 2]). As a byproduct of our work, we relax below this condition on the Stieltjes transform of T_n (compare (1.8) with [7, Eqn. (3)]), thereby generalizing [7, Thm. 1].

**Proposition 1.3.** Suppose the ESD of the R^+-valued, diagonal matrices {T_n} converges weakly, in probability, to some probability measure Θ such that Θ({0}) = 0. Assume further that:

1. There exists a finite constant M so that
$$\lim_{n\to\infty} \mathbb P(\|T_n\| > M) = 0. \tag{1.7}$$

2. There exists a closed set K ⊆ R of zero Lebesgue measure such that for every ε > 0, some κ_ε > 0, M_ε finite and all n large enough,
$$\big\{z : \Im(z) > n^{-\kappa_\varepsilon},\ |\Im(G_{T_n}(z))| > M_\varepsilon\big\} \subset \Big\{z : z \in \bigcup_{x\in K} B(x,\varepsilon)\Big\}, \tag{1.8}$$
where G_{T_n}(z) is the Stieltjes transform of the symmetrized version of the ESD of T_n, as defined in (1.13).

If Θ is not a (single) Dirac measure, then the following hold:

(a) The ESD of A_n := U_n T_n converges, in probability, to a limiting probability measure µ_A.

(b) The measure µ_A possesses a radially-symmetric density h_A(v) := (1/2π)∆_v⟨Log, ν^v⟩ with respect to Lebesgue measure on C, where ν^v := Θ̃ ⊞ λ_{|v|} is the free convolution (c.f. [1, §5.3.3]) of λ_r = ½(δ_r + δ_{−r}) and the symmetrized version Θ̃ of Θ.

(c) The support of µ_A is a single ring: there exist constants 0 ≤ a < b < ∞ so that supp µ_A = {re^{iθ} : a ≤ r ≤ b}. Further, a = 0 if and only if ∫ x^{−2} dΘ(x) = ∞.

(d) The same applies if U_n is replaced by a Haar distributed orthogonal matrix O_n.

This extension accommodates Θ with atoms, unbounded density, or a singular part, as long as (1.8) holds (at the finite-n level). For example, Proposition 1.3 applies for T_n diagonal having [np_i] entries equal to x_i ≠ 0, for p_i > 0, i = 1, 2, …, k ≥ 2, whereas the case of T_n = αI_n for some α > 0 is an immediate consequence of Theorem 1.2.
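A quick numerical illustration of the two-atom example just mentioned (our sketch; the radii come from the single ring theorem of [7], which gives outer radius b = (∫x²dΘ)^{1/2} and inner radius a = (∫x^{−2}dΘ)^{−1/2}): with Θ = ½(δ_1 + δ_2), the spectrum of U_n T_n settles on the annulus a ≈ 1.265 ≤ |λ| ≤ b ≈ 1.581.

```python
import numpy as np

def haar_unitary(n, rng):
    # Haar unitary via QR of a complex Ginibre matrix with phase correction.
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))

rng = np.random.default_rng(3)
n = 400
T = np.diag(np.repeat([1.0, 2.0], n // 2))       # Theta = (delta_1 + delta_2)/2
A = haar_unitary(n, rng) @ T
lam = np.linalg.eigvals(A)

b = np.sqrt(np.mean([1.0**2, 2.0**2]))           # outer radius, sqrt(int x^2 dTheta)
a = 1.0 / np.sqrt(np.mean([1.0**-2, 2.0**-2]))   # inner radius
med = np.median(np.abs(lam))                     # should sit inside (a, b)
```

Note that |λ| ≤ ‖U_n T_n‖ = ‖T_n‖ = 2 deterministically, while the bulk of the moduli concentrates on the narrower ring.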

Our presentation of the proof of Theorem 1.2 starts with a detailed argument for d′ = d, namely, the sum of independent Haar distributed unitary matrices. That is, we first prove the following proposition, deferring to Section 5 its extension to all 0 ≤ d′ < d.

**Proposition 1.4.** For any d ≥ 1, as n → ∞ the ESD of the sum of d independent, Haar distributed, n-dimensional unitary matrices {U_n^i}_{i=1}^d converges weakly, in probability, to the Brown measure µ_d of the free sum of d Haar unitary operators.

To this end, for any v ∈ C and i.i.d. Haar distributed unitary matrices {U_n^i}_{1≤i≤d} and orthogonal matrices {O_n^i}_{1≤i≤d}, let

$$U_n^{1,v} := \begin{bmatrix} 0 & (U_n^1 - vI_n) \\ (U_n^1 - vI_n)^{*} & 0 \end{bmatrix}, \tag{1.9}$$

and define O_n^{1,v} analogously, with O_n^1 replacing U_n^1. Set V_n^{1,v} := U_n^{1,v} if d′ ≥ 1 and V_n^{1,v} := O_n^{1,v} if d′ = 0, then let

$$V_n^{k,v} := V_n^{k-1,v} + \begin{bmatrix} 0 & U_n^k \\ 0 & 0 \end{bmatrix} + \begin{bmatrix} 0 & 0 \\ (U_n^k)^{*} & 0 \end{bmatrix}, \qquad \text{for } k = 2, \ldots, d', \tag{1.10}$$

and, replacing U_n^k by O_n^k, continue similarly for k = d′ + 1, …, d. Next, let G_n^{d,v} denote the expected Stieltjes transform of V_n^{d,v}. That is,

$$G_n^{d,v}(z) := \mathbb E\Big[\frac{1}{2n}{\rm Tr}\,(zI_{2n} - V_n^{d,v})^{-1}\Big], \tag{1.11}$$

where the expectation is over all relevant unitary/orthogonal matrices {U_n^i, O_n^i, i = 1, …, d}. Part (ii) of the next lemma, about the relation between the unbounded regions of G_n^{d,v}(·) and G_n^{d−1,v}(·), summarizes the key observation leading to Theorem 1.2 (with part (i) of this lemma similarly leading to our improvement over [7]). To this end, for any ρ > 0 and an arbitrary n-dimensional matrix T_n (possibly random), which is independent of the unitary Haar distributed U_n, let

$$Y_n := \begin{bmatrix} 0 & T_n \\ T_n^{*} & 0 \end{bmatrix} + \rho \begin{bmatrix} 0 & U_n \\ 0 & 0 \end{bmatrix} + \rho \begin{bmatrix} 0 & 0 \\ U_n^{*} & 0 \end{bmatrix} \tag{1.12}$$

and consider the following two functions of z ∈ C^+:

$$G_{T_n}(z) := \frac{1}{2n}{\rm Tr}\,(zI_{2n} - T_n)^{-1}, \tag{1.13}$$

$$G_n(z) := \mathbb E\Big[\frac{1}{2n}{\rm Tr}\,(zI_{2n} - Y_n)^{-1}\,\Big|\, T_n\Big]. \tag{1.14}$$

(In (1.13), with a slight abuse of notation, T_n stands for its 2n-dimensional symmetrized embedding, i.e. the first matrix on the right-hand side of (1.12).)

**Lemma 1.5.** (i) Fixing R finite, suppose that ‖T_n‖ ≤ R and the ESD of T_n converges to some Θ̃. Then there exist 0 < κ_1 < κ small enough, and finite M_ε ↑ ∞ as ε ↓ 0, depending only on R and Θ̃, such that for all n large enough and ρ ∈ [R^{−1}, R],

$$\Im(z) > n^{-\kappa_1}\ \&\ |\Im(G_n(z))| > 2M_\varepsilon \implies \exists\, \psi_n(z) \in \mathbb C^{+}:\ \Im(\psi_n(z)) > n^{-\kappa}\ \&\ |\Im(G_{T_n}(\psi_n(z)))| > M_\varepsilon\ \&\ z - \psi_n(z) \in B(-\rho, \varepsilon) \cup B(\rho, \varepsilon). \tag{1.15}$$

The same applies when U_n is replaced by a Haar orthogonal matrix O_n (possibly with different values of 0 < κ_1 < κ and M_ε ↑ ∞).

(ii) For any R finite, d ≥ 2 and d′ ≥ 0, there exist 0 < κ_1 < κ small enough and finite M_ε ↑ ∞, such that (1.15) continues to hold for ρ = 1, all n large enough, any |v| ≤ R and some ψ_n(·) := ψ_n^{d,v}(·) ∈ C^+, even when G_n and G_{T_n} are replaced by G_n^{d,v} and G_n^{d−1,v}, respectively.

Section 2 is devoted to the proof of Lemma 1.5, building on which we prove Proposition 1.4 in Section 3. The other key ingredients of this proof, namely Lemmas 3.1 and 3.2, are established in Section 4. Finally, short outlines of the proofs of Theorem 1.2 and of Proposition 1.3, are provided in Sections 5 and 6, respectively.

**2 Proof of Lemma 1.5**

This proof uses quite a few elements from the proofs in [7]. Specifically, focusing on the case of unitary matrices, once a particular choice of ρ ∈ [R^{−1}, R] and T_n is made in part (i), all the steps appearing in [7, pp. 1202-1203] carry through, so all the equations obtained there continue to hold here (with a slight modification of the bounds on error terms in the setting of part (ii), as explained in the sequel). Since this part follows [7], we omit the details. It is further easy to check that the same applies for the estimates obtained in [7, Lem. 11, Lem. 12], which are thus also used in our proof (without detailed re-derivation).

Proof of (i): Throughout this proof we fix a realization of the matrix T_n, so expectations are taken only over the randomness in the unitary matrix U_n. Having done so, first note that from [7, Eqn. (37)-(38)] we get

$$G_n(z) = G_{T_n}(\psi_n(z)) - \widetilde O(n, z, \psi_n(z)), \tag{2.1}$$
for

$$\psi_n(z) := z - \frac{\rho^2 G_n(z)}{1 + 2\rho G_U^n(z)}, \tag{2.2}$$

and

$$G_U^n(z) := \mathbb E\Big[\frac{1}{2n}{\rm Tr}\big(U_n (zI_{2n} - Y_n)^{-1}\big)\,\Big|\, T_n\Big],$$

where for all z_1, z_2 ∈ C^+,

$$\widetilde O(n, z_1, z_2) = \frac{2\, O(n, z_1, z_2)}{1 + 2\rho G_U^n(z_1)}, \tag{2.3}$$

with O(n, z_1, z_2) as defined in [7, pp. 1202]. Thus, (2.1) and (2.2) provide a relation between G_n and G_{T_n} which is very useful for our proof. Indeed, from [7, Lem. 12] we have that there exists a finite constant C_1 := C_1(R) such that, for all large n, if ℑ(z) > C_1 n^{−1/4} then

$$\Im(\psi_n(z)) \ge \Im(z)/2. \tag{2.4}$$

Additionally, from [7, Eqn. (34)] we have that

$$\rho\,(G_n(z))^2 = 2\, G_U^n(z)\big(1 + 2\rho G_U^n(z)\big) - O_1(n, z), \tag{2.5}$$
where O_1(·,·) is as defined in [7, pp. 1203]. To this end, denoting

$$F(G_n(z)) := \frac{\rho^2 G_n(z)}{1 + 2\rho G_U^n(z)}, \tag{2.6}$$

and using (2.5), we obtain after some algebra the identity

$$G_n(z)\Big[\rho^2 - F^2(G_n(z))\Big] = F(G_n(z))\Big[1 + \frac{\rho\, O_1(n,z)}{1 + 2\rho G_U^n(z)}\Big]. \tag{2.7}$$

Since

$$1 + 2\rho G_U^n(z) = \frac{1}{2}\Big(1 + \sqrt{1 + 4\rho^2 G_n(z)^2 + 4\rho\, O_1(n,z)}\Big), \tag{2.8}$$

where the branch of the square root is uniquely determined by analyticity and the known behavior of G_U^n(z) and G_n(z) as |z| → ∞ (see [7, Eqn. (35)]), we further have that

$$F(G_n(z)) = \frac{2\rho^2 G_n(z)}{1 + \sqrt{1 + 4(\rho G_n(z))^2 + 4\rho\, O_1(n,z)}} = \frac{1}{2}\Big[\frac{\rho^2 G_n(z)\sqrt{1 + 4(\rho G_n(z))^2 + 4\rho\, O_1(n,z)}}{(\rho G_n(z))^2 + \rho\, O_1(n,z)} - \frac{\rho^2 G_n(z)}{(\rho G_n(z))^2 + \rho\, O_1(n,z)}\Big]. \tag{2.9}$$
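The algebra connecting (2.5)-(2.9) can be checked at arbitrary complex points. In this sketch (ours), we treat ρ, G and O₁ as free parameters standing in for ρ, G_n(z) and O_1(n, z), define G_U so that (2.8) holds, and verify that (2.5) and (2.7) then follow identically, as does the second expression in (2.9).

```python
import numpy as np

rho = 0.8
G = 1.3 - 0.7j        # stands for G_n(z)
O1 = 0.05 + 0.02j     # stands for O_1(n, z)

s = np.sqrt(1 + 4 * rho**2 * G**2 + 4 * rho * O1)  # principal branch suffices here
GU = (s - 1) / (4 * rho)   # chosen so that (2.8) holds: 1 + 2 rho GU = (1 + s)/2
h = 1 + 2 * rho * GU

lhs_25 = rho * G**2        # (2.5): rho G^2 = 2 GU (1 + 2 rho GU) - O1
rhs_25 = 2 * GU * h - O1

F = rho**2 * G / h         # (2.6)
lhs_27 = G * (rho**2 - F**2)            # (2.7), left side
rhs_27 = F * (1 + rho * O1 / h)         # (2.7), right side

den = rho**2 * G**2 + rho * O1
alt_29 = 0.5 * (rho**2 * G * s / den - rho**2 * G / den)  # second form in (2.9)
```

The verification rests on s² − 1 = 4ρ²G² + 4ρO₁ and h² = h + ρ²G² + ρO₁, which is exactly the algebra invoked between (2.6) and (2.7).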

The key to our proof is the observation that if |ℑ(G_n(z))| → ∞ and O_1(n, z) remains small, then from (2.9) and (2.2) necessarily F(G_n(z)) = z − ψ_n(z) → ±ρ. So, if Õ(n, z, ψ_n(z)) remains bounded, then by (2.1) also |ℑ(G_{T_n}(ψ_n(z)))| → ∞, yielding the required result.

To implement this, fix M = M_ε ≥ 10 such that 6M_ε^{−1} ≤ ε², and recall that by [7, Lem. 11] there exists a finite constant C_2 := C_2(R) such that, for all large n, if ℑ(z) > C_1 n^{−1/4} then

$$|1 + 2\rho G_U^n(z)| > C_2\, \rho\, [\Im(z)^3 \wedge 1]. \tag{2.10}$$

Furthermore, we have (see [7, pp. 1203]),

$$|O(n, z_1, z_2)| \le \frac{C\rho^2}{n^2\, |\Im(z_2)|\, \Im(z_1)^2\, (\Im(z_1) \wedge 1)}. \tag{2.11}$$

Therefore, enlarging C_1 as needed, by (2.3), (2.4), and (2.10) we obtain that, for all large n,

$$|\widetilde O(n, z, \psi_n(z))| \le \frac{C\rho}{n^2\, |\Im(\psi_n(z))|\, \Im(z)^2\, (\Im(z)^4 \wedge 1)} \le M_\varepsilon$$

whenever ℑ(z) > C_1 n^{−1/4}. This, together with (2.1), shows that if |ℑ(G_n(z))| > 2M_ε, then |ℑ(G_{T_n}(ψ_n(z)))| > M_ε. Now, fixing 0 < κ_1 < κ < 1/4, we get from (2.4) that ℑ(ψ_n(z)) > n^{−κ}. It thus remains to show only that F(G_n(z)) ∈ B(−ρ, ε) ∪ B(ρ, ε).
To this end, note that

$$|O_1(n, z)| \le \frac{C\rho^2}{n^2\, \Im(z)^2\, (\Im(z) \wedge 1)} \tag{2.12}$$

(c.f. [7, pp. 1203]). Therefore, O_1(n, z) = o(n^{−1}) whenever ℑ(z) > C_1 n^{−1/4}, and so the rightmost term in (2.9) is bounded by M_ε^{−1} whenever |ℑ(G_n(z))| > 2M_ε. Further, when ℑ(z) > C_1 n^{−1/4}, |ℑ(G_n(z))| > 2M_ε and n is large enough so that |O_1(n, z)| ≤ 1, we have that for any choice of the branch of the square root,

$$\bigg|\frac{\rho G_n(z)\sqrt{1 + 4(\rho G_n(z))^2 + 4\rho\, O_1(n,z)}}{(\rho G_n(z))^2 + \rho\, O_1(n,z)}\bigg| \le \frac{\sqrt{1 + 4|\rho G_n(z)|^2 + 4|\rho\, O_1(n,z)|}}{|\rho G_n(z)| - 1} \le 4,$$

resulting in |F(G_n(z))| ≤ 3ρ. Therefore, using (2.10) and (2.12), we get from (2.7) that if ℑ(z) > C_1 n^{−1/4} and |ℑ(G_n(z))| > 2M_ε, then

$$\big|F^2(G_n(z)) - \rho^2\big| \le 6\,|G_n(z)|^{-1} \le 6M_\varepsilon^{-1} \le \varepsilon^2.$$

In conclusion, z − ψ_n(z) = F(G_n(z)) ∈ B(ρ, ε) ∪ B(−ρ, ε), as stated. Further, upon modifying the values of κ_1 < κ and M_ε, this holds also when replacing U_n by a Haar distributed orthogonal matrix O_n. Indeed, the same analysis applies except for adding to O(n, z_1, z_2) of [7, pp. 1202] a term which is uniformly bounded by n^{−1}|ℑ(z_2)|^{−1}(ℑ(z_1) ∧ 1)^{−2} (see [7, proof of Thm. 18]), and using in this case [1, Cor. 4.4.28] to control the variance of Lipschitz functions of O_n (instead of U_n).

Proof of (ii): Consider first the case of d′ = d. Then, setting ρ = 1, T_n = V_n^{d−1,v}, and Y_n = V_n^{d,v}, one may check that following the derivation of [7, Eqn. (37)-(38)], now with all expectations taken also over T_n, we get that

$$G_n^{d,v}(z) = G_n^{d-1,v}\big(\psi_n^{d,v}(z)\big) - \widetilde O(n, z, \psi_n^{d,v}(z)), \tag{2.13}$$

for some K < ∞ and all {z ∈ C^+ : ℑ(z) ≥ K}, where

$$\psi_n^{d,v}(z) := z - \frac{G_n^{d,v}(z)}{1 + 2\, G_{U_n}^{d,v}(z)}, \tag{2.14}$$

$$G_{U_n}^{d,v}(z) := \mathbb E\Big[\frac{1}{2n}{\rm Tr}\big(U_n^d (zI_{2n} - V_n^{d,v})^{-1}\big)\Big],$$

and for any z_1, z_2 ∈ C^+,

$$\widetilde O(n, z_1, z_2) := \frac{2\, O(n, z_1, z_2)}{1 + 2\, G_{U_n}^{d,v}(z_1)}.$$

Next, note that for some C < ∞ and any C-valued function f_d(U_n^1, …, U_n^d) of i.i.d. Haar distributed {U_n^i},

$$\mathbb E\big[(f_d - \mathbb E[f_d])^2\big] \le d\, C\, \|f_d\|_L^2, \tag{2.15}$$

where ‖f_d‖_L denotes the relevant coordinate-wise Lipschitz norm, i.e.

$$\|f_d\|_L := \max_{j=1}^{d}\ \sup_{U_n^1, \ldots, U_n^d,\ \widetilde U_n \ne U_n^j} \frac{\big|f_d(U_n^1, \ldots, U_n^d) - f_d(U_n^1, \ldots, U_n^{j-1}, \widetilde U_n, U_n^{j+1}, \ldots)\big|}{\|U_n^j - \widetilde U_n\|_2}.$$

Indeed, we bound the variance of f_d by the (sum of d) second moments of the martingale differences D_j f_d := E[f_d | U_n^1, …, U_n^j] − E[f_d | U_n^1, …, U_n^{j−1}]. By the independence of {U_n^i} and the definition of ‖f_d‖_L, conditional upon (U_n^1, …, U_n^{j−1}), the C-valued function U_n^j ↦ D_j f_d is Lipschitz of norm at most ‖f_d‖_L in the sense of [1, Ineq. (4.4.31)]. It then easily follows from the concentration inequalities of [1, Cor. 4.4.28] that the second moment of this function is at most C‖f_d‖_L² (uniformly with respect to (U_n^1, …, U_n^{j−1})).
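The n^{−1/2} Lipschitz scaling behind (2.15) is visible in simulation. The sketch below (ours; the statistic and sample sizes are arbitrary choices) estimates the variance of f(U_n) = (1/n)ℜ Tr(U_n²), a Lipschitz function of a Haar unitary with ‖f‖_L = O(n^{−1/2}); the concentration bound then predicts variance decay like n^{−2}, so quadrupling n should shrink the variance by roughly a factor of sixteen.

```python
import numpy as np

def haar_unitary(n, rng):
    # Haar unitary via QR of a complex Ginibre matrix with phase correction.
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))

def var_trace_stat(n, m, rng):
    # sample variance of f(U) = (1/n) Re Tr(U^2) over m independent Haar draws
    vals = [np.trace(U @ U).real / n
            for U in (haar_unitary(n, rng) for _ in range(m))]
    return np.var(vals)

rng = np.random.default_rng(5)
v_small = var_trace_stat(20, 2000, rng)
v_large = var_trace_stat(80, 2000, rng)
```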

In the derivation of [7, Lem. 10], the corresponding error term O(n, z_1, z_2) is bounded by a sum of finitely many variances of Lipschitz functions of the form (1/2n)Tr{H(U_n^d)}, each of which has Lipschitz norm of order n^{−1/2}, hence controlled by applying the concentration inequality (2.15). We have here the same type of bound on O(n, z_1, z_2), except that each variance in question is now with respect to some function (1/2n)Tr{H(U_n^1, …, U_n^d)} having coordinate-wise Lipschitz norm of order n^{−1/2} (and with respect to the joint law of the i.i.d. Haar distributed unitary matrices). Collecting all such terms, we get here, instead of (2.11), the slightly worse bound

$$|O(n, z_1, z_2)| = O\Big(\frac{1}{n\, |\Im(z_2)|\, \Im(z_1)^2\, (\Im(z_1) \wedge 1)^2\, (\Im(z_2) \wedge 1)}\Big) \tag{2.16}$$

(with an extra factor (ℑ(z_2) ∧ 1)^{−1} due to the additional randomness in (z_2 I_{2n} − T_n)^{−1}).

Using the modified bound (2.16), we proceed as in the proof of part (i) of the lemma, first bounding Õ(n, z, ψ_n^{d,v}(z)) and O_1(n, z), and deriving the inequalities replacing (2.4) and (2.10). Out of these bounds, we establish the stated relation (1.15) between G_n^{d,v} and G_n^{d−1,v} upon following the same route as in our proof of part (i). Indeed, when doing so, the only effect of starting with (2.16) instead of (2.11) is in somewhat decreasing the positive constants κ_1, κ, while increasing each of the finite constants {M_ε, ε > 0}.

Finally, with [1, Cor. 4.4.28] applicable also over the orthogonal group, our proof of (2.15) extends to any C-valued function f_d(U_n^1, …, U_n^{d′}, O_n^{d′+1}, …, O_n^d) of independent Haar distributed unitary/orthogonal matrices {U_n^i, O_n^i}. Hence, as in the context of part (i), the same argument applies for 0 ≤ d′ < d (up to adding n^{−1}|ℑ(z_2)|^{−1}(ℑ(z_1) ∧ 1)^{−2} to (2.16), c.f. [7, proof of Thm. 18]).

**3 Proof of Proposition 1.4**

It suffices to prove Proposition 1.4 only for d ≥ 2, since the easier case of d = 1 has already been established in [12, Cor. 2.8]. We proceed to do so via the four steps of Girko's method, as described in Section 1. The following two lemmas (whose proofs are deferred to Section 4) take care of **Step 1** and **Step 2** of Girko's method, respectively.

**Lemma 3.1.** Let λ_1 = ½(δ_{−1} + δ_1) and Θ^{d,v} := Θ^{d−1,v} ⊞ λ_1 for all d ≥ 2, starting at Θ^{1,v}, which for v ≠ 0 is the symmetrized version of the measure on R^+ having the density f_{|v|}(·) of (4.1), while Θ^{1,0} = λ_1. Then, for each v ∈ C and d ∈ N, the ESD-s L_{V_n^{d,v}} of the matrices V_n^{d,v} (see (1.10)) converge weakly as n → ∞, in probability, to Θ^{d,v}.

**Lemma 3.2.** For any d ≥ 2 and Lebesgue almost every v ∈ C,

$$\langle {\rm Log},\, L_{V_n^{d,v}}\rangle \to \langle {\rm Log},\, \Theta^{d,v}\rangle, \tag{3.1}$$

in probability. Furthermore, there exists a closed Λ_d ⊂ C of zero Lebesgue measure, such that

$$\int_{\mathbb C} \phi(v)\, \langle {\rm Log},\, L_{V_n^{d,v}}\rangle\, dm(v) \to \int_{\mathbb C} \phi(v)\, \langle {\rm Log},\, \Theta^{d,v}\rangle\, dm(v), \tag{3.2}$$

in probability, for each fixed, non-random φ ∈ C_c^∞(C) whose support is disjoint from Λ_d. That is, the support of φ is contained, for some γ > 0, in the bounded, open set

$$\Gamma_\gamma^d := \Big\{v \in \mathbb C : \gamma < |v| < \gamma^{-1},\ \inf_{u \in \Lambda_d} |v - u| > \gamma\Big\}. \tag{3.3}$$

We claim that the convergence result of (3.2) already provides us with the conclusion (1.5) of **Step 3** in Girko's method, for test functions in

$$\mathcal S := \{\psi \in C_c^\infty(\mathbb C),\ \text{supported within } \Gamma_\gamma^d \text{ for some } \gamma > 0\}.$$

Indeed, fixing d ≥ 2, the Hermitian matrices V_n^{d,v} of (1.10) are precisely those H_n^v of the form (1.3) that are associated with S_n := ∑_{i=1}^d U_n^i in Girko's formula (1.4). Thus, combining the latter identity for ψ ∈ S with the convergence result of (3.2) for φ = ∆ψ, we get the following convergence in probability as n → ∞,

$$\int_{\mathbb C} \psi(v)\, dL_{S_n}(v) = \frac{1}{2\pi} \int_{\mathbb C} \Delta\psi(v)\, \langle {\rm Log},\, L_{V_n^{d,v}}\rangle\, dm(v) \to \frac{1}{2\pi} \int_{\mathbb C} \Delta\psi(v)\, \langle {\rm Log},\, \Theta^{d,v}\rangle\, dm(v). \tag{3.4}$$
Proceeding to identify the limiting measure as the Brown measure µ_d := µ_{s_d} of the sum s_d := u_1 + u_2 + ⋯ + u_d of ∗-free Haar unitary operators u_i, recall [14] that each (u_i, u_i^∗) is R-diagonal. Hence, by [9, Propn. 3.5] we have that Θ^{d,v} is the symmetrized version of the law of |s_d − v|, and so by definition (1.2) we have that for any ψ ∈ C_c^∞(C),

$$\frac{1}{2\pi} \int_{\mathbb C} \Delta\psi(v)\, \langle {\rm Log},\, \Theta^{d,v}\rangle\, dm(v) = \int_{\mathbb C} \psi(v)\, \mu_{s_d}(dv). \tag{3.5}$$

In parallel with **Step 4** of Girko's method, it thus suffices, for completing the proof, to verify that the convergence in probability

$$\int_{\mathbb C} \psi(v)\, dL_{S_n}(v) \to \int_{\mathbb C} \psi(v)\, d\mu_{s_d}(v), \tag{3.6}$$

for each fixed ψ ∈ S, yields the weak convergence, in probability, of L_{S_n} to µ_{s_d}.

To this end, suppose first that (3.6) holds almost surely for each fixed ψ ∈ S, and recall that for any γ > 0 and each open G ⊂ Γ_γ^d there exist ψ_k ∈ S such that ψ_k ↑ 1_G. Consequently, a.s.

$$\liminf_{n\to\infty} L_{S_n}(G) \ge \sup_k\, \liminf_{n\to\infty} \int_{\mathbb C} \psi_k(v)\, dL_{S_n}(v) = \sup_k \int_{\mathbb C} \psi_k(v)\, d\mu_{s_d}(v) = \mu_{s_d}(G).$$
Further, from [9, Example 5.5] we know that µ_{s_d} has, for d ≥ 2, a bounded density with respect to Lebesgue measure on C (given by h_d(·) of (1.1)). In particular, since m(Λ_d) = 0, it follows that µ_{s_d}(Λ_d) = 0 and hence µ_{s_d}(Γ_γ^d) → 1 as γ → 0. Given this, fixing some γ_ℓ ↓ 0 and open G ⊂ C, we deduce that a.s.

$$\liminf_{n\to\infty} L_{S_n}(G) \ge \lim_{\ell\to\infty}\, \liminf_{n\to\infty} L_{S_n}(G \cap \Gamma_{\gamma_\ell}^d) \ge \lim_{\ell\to\infty} \mu_{s_d}(G \cap \Gamma_{\gamma_\ell}^d) = \mu_{s_d}(G). \tag{3.7}$$

This applies for any countable collection {G_i} of open subsets of C, with the reversed inequality holding for any countable collection of closed subsets of C. In particular, fixing any countable convergence determining class {f_j} ⊂ C_b(C) and countable dense Q̂ ⊂ R such that µ_{s_d}(f_j^{−1}({q})) = 0 for all j and q ∈ Q̂, yields the countable collection G of µ_{s_d}-continuity sets (consisting of interiors and complements of closures of f_j^{−1}([q, q′)), q, q′ ∈ Q̂), for which L_{S_n}(·) converges to µ_{s_d}(·). The stated a.s. weak convergence of L_{S_n} to µ_{s_d} then follows as in the usual proof of Portmanteau's theorem, under our assumption that (3.6) holds a.s.

This proof extends to the case at hand, where (3.6) holds in probability, since convergence in probability implies that for every subsequence there exists a further subsequence along which a.s. convergence holds, and the whole argument uses only countably many functions ψ_{k,ℓ,i} ∈ S. Specifically, by a Cantor diagonal argument, for any given subsequence n_j we can extract a further subsequence j(l) such that (3.7) holds a.s. for L_{S_{n_{j(l)}}} and all G in the countable collection G of µ_{s_d}-continuity sets. Therefore, a.s. L_{S_{n_{j(l)}}} converges weakly to µ_{s_d}, and by the arbitrariness of {n_j} we have that, in probability, L_{S_n} converges to µ_{s_d} weakly.

**4 Proofs of Lemma 3.1 and Lemma 3.2**

We start with a preliminary result, needed for proving Lemma 3.1.

**Lemma 4.1.** For Haar distributed U_n and any r > 0, the expected ESD of |U_n − rI_n| has the density

$$f_r(x) = \frac{2}{\pi}\, \frac{x}{\sqrt{(x^2 - (r-1)^2)\,((r+1)^2 - x^2)}}, \qquad |r-1| \le x \le r+1, \tag{4.1}$$

with respect to Lebesgue measure on R^+ (while for r = 0, this ESD consists of a single atom at x = 1).

Proof: It clearly suffices to show that the expected ESD of (U_n − rI_n)(U_n − rI_n)^∗ has, for r > 0, the density

$$g_r(x) = \frac{1}{\pi}\, \frac{1}{\sqrt{(x - (r-1)^2)\,((r+1)^2 - x)}}, \qquad (r-1)^2 \le x \le (r+1)^2. \tag{4.2}$$
To this end, note that by the invariance of the Haar unitary measure under multiplication by e^{iθ}, we have that

$$\mathbb E\Big[\frac{1}{n}{\rm Tr}\{U_n^k\}\Big] = \mathbb E\Big[\frac{1}{n}{\rm Tr}\{(U_n^{*})^k\}\Big] = 0, \tag{4.3}$$

for all positive integers k and n. Thus,

$$\mathbb E\Big[\frac{1}{n}{\rm Tr}\big\{(U_n + U_n^{*})^k\big\}\Big] = \binom{k}{k/2} \ \text{ for } k \text{ even, and } 0 \text{ otherwise.}$$

Therefore, by the moment method, the expected ESD of U_n + U_n^∗ (denoted L̄_{U_n+U_n^∗}) satisfies

$$\bar L_{U_n + U_n^{*}} \stackrel{d}{=} 2\cos\theta = e^{i\theta} + e^{-i\theta}, \qquad \text{where } \theta \sim {\rm Unif}(0, 2\pi).$$

Consequently, we get the formula (4.2) for the density g_r(x) of the expected ESD of

$$(U_n - rI_n)(U_n - rI_n)^{*} = (1 + r^2)\, I_n - r\,(U_n + U_n^{*}),$$

by applying the change of variable formula for x = (1 + r²) − 2r cos θ (and θ ∼ Unif(0, 2π)).
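Since the spectrum of U_n is {e^{iθ_j}}, the law in Lemma 4.1 is just that of |e^{iθ} − r| for θ ∼ Unif(0, 2π). The sketch below (ours) checks (4.1) against this representation: the closed-form CDF F_r(x) = (1/π)arccos((1 + r² − x²)/(2r)) — a helper we introduce for the check, obtained from the same change of variable — has derivative f_r and matches the Monte Carlo CDF of |e^{iθ} − r|.

```python
import numpy as np

def f_r(x, r):
    # density (4.1) of the expected ESD of |U_n - r I_n|
    return (2 / np.pi) * x / np.sqrt((x**2 - (r - 1)**2) * ((r + 1)**2 - x**2))

def cdf_r(x, r):
    # P(|e^{i theta} - r| <= x) for theta ~ Unif(0, 2 pi); hypothetical helper
    # used only to check (4.1): {|e^{i theta} - r| <= x} = {cos theta >= u}.
    u = np.clip((1 + r**2 - x**2) / (2 * r), -1.0, 1.0)
    return np.arccos(u) / np.pi

r, x0 = 0.7, 1.0
rng = np.random.default_rng(6)
theta = rng.uniform(0, 2 * np.pi, 200_000)
emp = np.mean(np.abs(np.exp(1j * theta) - r) <= x0)  # Monte Carlo CDF at x0

h = 1e-5
num_deriv = (cdf_r(x0 + h, r) - cdf_r(x0 - h, r)) / (2 * h)  # should equal f_r(x0)
```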

Proof of Lemma 3.1: Recall [1, Thm. 2.4.4(c)] that for the claimed weak convergence of L_{V_n^{d,v}} to Θ^{d,v}, in probability, it suffices to show that per fixed z ∈ C^+, the corresponding Stieltjes transforms

$$f_n^{d,v}(z) := \frac{1}{2n}{\rm Tr}\big\{(zI_{2n} - V_n^{d,v})^{-1}\big\}$$

converge in probability to the Stieltjes transform G_∞^{d,v}(z) of Θ^{d,v}. To this end, note that each f_n^{d,v}(z) is a point-wise Lipschitz function of {U_n^i}, whose expected value is G_n^{d,v}(z) of (1.11), and that ‖f_n‖_L → 0 as n → ∞ (per fixed values of d, v, z). It thus follows from (2.15) that as n → ∞,

$$\mathbb E\big[(f_n^{d,v}(z) - G_n^{d,v}(z))^2\big] \to 0,$$

and therefore it suffices to prove that per fixed d, v ∈ C and z ∈ C^+, as n → ∞,

$$G_n^{d,v}(z) \to G_\infty^{d,v}(z). \tag{4.4}$$

Next, observe that by the invariance of the law of U_n^1 under multiplication by a scalar e^{iθ}, the expected ESD of V_n^{1,v} depends only on r = |v|, with Θ^{1,v} = E[L_{V_n^{1,v}}] (see Lem. 4.1). Hence, (4.4) trivially holds for d = 1 and we proceed to prove the latter pointwise (in z, v) convergence by induction on d ≥ 2. The key ingredient in the induction step is the (finite-n) Schwinger-Dyson equation in our set-up, namely Eqn. (2.13)-(2.14).

Specifically, from (2.13)-(2.14) and the induction hypothesis it follows that for some non-random K < ∞, any limit point, denoted (G^{d,v}, G_U^{d,v}), of the uniformly bounded, equi-continuous functions (G_n^{d,v}, G_{U_n}^{d,v}) on {z ∈ C^+ : ℑ(z) ≥ K} satisfies

$$G^{d,v}(z) = G_\infty^{d-1,v}(\psi(z)), \qquad \text{with } \psi(z) := z - \frac{G^{d,v}(z)}{1 + 2\, G_U^{d,v}(z)}. \tag{4.5}$$

Moreover, from the equivalent version of (2.5) in our setting, we obtain that

$$4\, G_U^{d,v}(z) = -1 + \sqrt{1 + 4\, G^{d,v}(z)^2},$$

for a suitable branch of the square root (uniquely determined by analyticity and decay to zero as |z| → ∞ of z ↦ (G^{d,v}(z), G_U^{d,v}(z))). Thus, G(z) = G^{d,v}(z) satisfies the relation

$$G(z) - G_\infty^{d-1,v}\Big(z - \frac{2\, G(z)}{1 + \sqrt{1 + 4\, G(z)^2}}\Big) = 0. \tag{4.6}$$

Since Θ^{d,v} = Θ^{d−1,v} ⊞ λ_1, it follows that (4.6) holds also for G(·) = G_∞^{d,v}(·) (c.f. [7, Rmk. 7]). Further, z ↦ G_∞^{d−1,v}(z) is analytic on C^+ with derivative O(z^{−2}) at infinity, hence by the implicit function theorem the identity (4.6) uniquely determines the value of G(z) for all ℑ(z) large enough. In particular, enlarging K as needed, G^{d,v} = G_∞^{d,v} on {z ∈ C^+ : ℑ(z) ≥ K}, which by analyticity of both functions extends to all of C^+. With (4.4) verified, this completes the proof of the lemma.

The proof of Lemma 3.2 requires the control of ℑ(G_n^{d,v}(z)) established in Lemma 4.3. This is done inductively in d, with Lemma 4.2 providing the basis d = 1 of the induction.

**Lemma 4.2.** For some C finite, all ε ∈ (0, 1) and v ∈ C,

$$\Big\{z \in \mathbb C^{+} : |\Im G_n^{1,v}(z)| \ge C\varepsilon^{-2}\Big\} \subseteq \Big\{E + i\eta : \eta \in (0, \varepsilon^2),\ E \in \bigcup \big(\pm(1 \pm |v|) - 2\varepsilon,\ \pm(1 \pm |v|) + 2\varepsilon\big)\Big\},$$

where the union runs over the four points ±(1 ± |v|).
Proof: It is trivial to confirm our claim in case v = 0 (as G_n^{1,0}(z) = z/(z² − 1)). Now, fixing r = |v| > 0, let f̃_r(·) denote the symmetrized version of the density f_r(·), and note that for any η > 0,

$$|\Im G_n^{1,v}(E + i\eta)| = \int_{|x-E| > \sqrt\eta} \frac{\eta}{(x-E)^2 + \eta^2}\, \tilde f_r(x)\, dx + \int_{|x-E| \le \sqrt\eta} \frac{\eta}{(x-E)^2 + \eta^2}\, \tilde f_r(x)\, dx \le 1 + \Big[\sup_{\{x : |x-E| \le \sqrt\eta\}} \tilde f_r(x)\Big] \int_{|x-E| \le \sqrt\eta} \frac{\eta}{(x-E)^2 + \eta^2}\, dx \le 1 + \pi\Big[\sup_{\{x : |x-E| \le \sqrt\eta\}} \tilde f_r(x)\Big]. \tag{4.7}$$

With Γ_ε denoting the union of open intervals of radius ε around the four points ±1 ± r, it follows from (4.1) that for some C_1 finite and any r, ε > 0,

$$\sup_{x \notin \Gamma_\varepsilon} \{\tilde f_r(x)\} \le C_1 \varepsilon^{-2}.$$

Thus, from (4.7) it follows that

$$\sup_{\{E, \eta\,:\, (E - \sqrt\eta,\, E + \sqrt\eta) \subset \Gamma_\varepsilon^c\}} |\Im G_n^{1,v}(E + i\eta)| \le C\varepsilon^{-2},$$

for some C finite, all ε ∈ (0, 1) and r > 0. To complete the proof, simply note that

$$\{(E, \eta) : E \in \Gamma_{2\varepsilon}^c,\ \eta \in (0, \varepsilon^2)\} \subseteq \{(E, \eta) : (E - \sqrt\eta,\, E + \sqrt\eta) \subseteq \Gamma_\varepsilon^c\},$$

and

$$\sup_{E \in \mathbb R,\ \eta \ge \varepsilon^2} |\Im G_n^{1,v}(E + i\eta)| \le \varepsilon^{-2}.$$
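The localization in Lemma 4.2 can be seen directly from the θ-parametrization used in the proof of Lemma 4.1. In this sketch (ours; for simplicity we use the unsymmetrized law of |e^{iθ} − r|, whose singular points on R^+ are the same |r − 1| and r + 1, and the grid sizes are arbitrary), ℑG(E + iη) = (1/2π)∫₀^{2π} η/((x(θ) − E)² + η²) dθ with x(θ) = |e^{iθ} − r|: at η = 10^{−3} it stays of order π f_r(E) away from the singular points, but blows up at E = 1 − r.

```python
import numpy as np

r, eta = 0.7, 1e-3
theta = np.linspace(0, 2 * np.pi, 2_000_001)
x = np.abs(np.exp(1j * theta) - r)   # x(theta), distributed with density f_r

def im_G(E):
    # Im of the Stieltjes transform at E + i*eta: since theta is uniform,
    # (1/2 pi) * integral over (0, 2 pi) is just the average over the grid.
    return np.mean(eta / ((x - E)**2 + eta**2))

im_reg = im_G(1.0)    # away from the singular points {0.3, 1.7}
im_sing = im_G(0.3)   # at the edge 1 - r, where f_r blows up
```

The bounded value near E = 1.0 is consistent with (4.7), while the spike at E = 0.3 is why no uniform bound as in [7, Lem. 13] is available here.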

Since the density f̃_{|v|}(·) is unbounded at ±1±|v|, we cannot improve Lemma 4.2 to show that ℑG^{1,v}_{n}(z) is uniformly bounded. The same applies for d ≥ 2, so a result such as [7, Lem. 13] is not possible in our setup. Instead, as we show next, inductively applying Lemma 1.5(ii) allows us to control the region where |ℑ(G^{d,v}_{n}(z))| might blow up, in a manner which suffices for establishing Lemma 3.2 (and consequently Proposition 1.4).
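The location of the four problematic points ±(1±|v|) has a simple numerical illustration (not part of the proof): since U_{n} is normal with unimodular eigenvalues e^{iθ_{k}}, the singular values of U_{n} − vI_{n} are |e^{iθ_{k}} − v|, which fill the interval [|1−|v||, 1+|v|] and accumulate near its endpoints. A numpy sketch, sampling the Haar unitary via phase-corrected QR (our own construction):

```python
import numpy as np

rng = np.random.default_rng(0)

def haar_unitary(n):
    # QR of a complex Ginibre matrix, with phases fixed so Q is Haar distributed
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))

n, v = 500, 0.5 + 0.5j
s = np.linalg.svd(haar_unitary(n) - v * np.eye(n), compute_uv=False)
r = abs(v)
# All singular values lie in [|1 - r|, 1 + r] ...
assert s.min() >= abs(1 - r) - 1e-8 and s.max() <= 1 + r + 1e-8
# ... and cluster near both edges, where the symmetrized density blows up
assert s.min() < abs(1 - r) + 0.01 and s.max() > 1 + r - 0.01
```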

**Lemma 4.3.** For r ≥ 0, γ > 0 and integer d ≥ 1, let Γ^{d,r}_{γ} ⊂ C denote the union of open balls of radius γ centered at ±m±r for m = 0, 1, 2, . . . , d. Fixing integer d ≥ 1, γ ∈ (0,1) and R finite, there exist M finite and κ > 0 such that for all n large enough and any v ∈ B(0, R),

sup{|ℑ(G^{d,v}_{n}(z))| : ℑ(z) > n^{−κ}, z ∉ Γ^{d,|v|}_{γ}} ≤ M.  (4.8)
Proof: For any d ≥ 1, v ∈ C, positive κ and finite M, set

Γ^{d,v}_{n}(M, κ) := {z : ℑ(z) > n^{−κ}, |ℑ(G^{d,v}_{n}(z))| > M},

so our thesis amounts to the existence of finite M and κ > 0, depending only on R, d ≥ 2 and γ ∈ (0,1), such that for all n large enough,

Γ^{d,v}_{n}(M, κ) ⊂ Γ^{d,|v|}_{γ}, ∀v ∈ B(0, R).  (4.9)
Indeed, for d = 1 this is a direct consequence of Lemma 4.2 (with γ = 2ε, M = Cε^{−2}), and we proceed to confirm (4.9) by induction on d ≥ 2. To carry out the inductive step from d−1 to d, fix R finite and γ ∈ (0,1), assuming that (4.9) applies at d−1 and γ/2, for some finite M_{⋆} and positive κ_{⋆} (both depending only on d, R and γ). Then, let ε ∈ (0, γ/2) be small enough such that Lemma 1.5(ii) applies for some M_{ε} ≥ M_{⋆} and 0 < κ_{1} < κ ≤ κ_{⋆}. From Lemma 1.5(ii) we know that for any n large enough, v ∈ B(0, R) and z ∈ Γ^{d,v}_{n}(2M_{ε}, κ_{1}), there exists w := ψ^{d,v}_{n}(z) for which

z − w ∈ B(−1, ε) ∪ B(1, ε)  and  w ∈ Γ^{d−1,v}_{n}(M_{ε}, κ) ⊆ Γ^{d−1,v}_{n}(M_{⋆}, κ_{⋆}) ⊂ Γ^{d−1,|v|}_{γ/2},

where the last inclusion is due to our choice of M_{⋆} and κ_{⋆}. With ε ≤ γ/2, it is easy to check that z − w ∈ B(−1, ε) ∪ B(1, ε) and w ∈ Γ^{d−1,r}_{γ/2} result with z ∈ Γ^{d,r}_{γ}. That is, we have established the validity of (4.9) at d and arbitrarily small γ, for M = 2M_{ε} finite and κ_{1} positive, both depending only on R, d and γ.

Proof of Lemma 3.2: Recall [15, Thm. 1.1] the existence of universal constants 0 < c_{1} and c_{2} < ∞, such that for any non-random matrix D_{n} and Haar distributed unitary matrix U_{n}, the smallest singular value s_{min} of U_{n} + D_{n} satisfies

P(s_{min}(U_{n} + D_{n}) ≤ t) ≤ t^{c_{1}}n^{c_{2}}.  (4.10)
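The polynomial small-ball bound (4.10) can be probed by a quick Monte Carlo experiment. The sketch below is illustrative only: we fix the arbitrary choice D_{n} = I_{n} and do not attempt to estimate the constants c_{1}, c_{2}:

```python
import numpy as np

rng = np.random.default_rng(1)

def haar_unitary(n):
    # Phase-corrected QR of a complex Ginibre matrix gives a Haar unitary
    z = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))

n, trials, t = 100, 50, 1e-4
d = np.eye(n)  # an arbitrary fixed, non-random D_n
hits = sum(
    np.linalg.svd(haar_unitary(n) + d, compute_uv=False).min() <= t
    for _ in range(trials)
)
# (4.10): P(s_min(U_n + D_n) <= t) <= t^{c1} n^{c2}, so such hits should be rare
assert hits < 5
```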
The singular values of V^{d,v}_{n} are clearly the same as those of S_{n} − vI_{n} = U^{1}_{n} + D_{n} for D_{n} = Σ^{d}_{i=2} U^{i}_{n} − vI_{n}, which is independent of the Haar unitary U^{1}_{n}. Thus, applying (4.10) conditionally on D_{n}, we get that

P(s_{min}(V^{d,v}_{n}) ≤ t) ≤ t^{c_{1}}n^{c_{2}},  (4.11)

for every v ∈ C, t > 0 and n. It then follows that for any δ > 0 and α < c_{1},

E[(s_{min}(V^{d,v}_{n}))^{−α} I{s_{min}(V^{d,v}_{n}) ≤ n^{−δ}}] ≤ (c_{1}/(c_{1}−α)) n^{c_{2}−δ(c_{1}−α)}.  (4.12)
Setting hereafter α = c_{1}/2 positive and δ = 4c_{2}/c_{1} finite, the right side of (4.12) decays to zero as n → ∞. Further, for any n, d and v,

E[⟨|Log|, L_{V^{d,v}_{n}}⟩_{0}^{n^{−δ}}] ≤ E[|log s_{min}(V^{d,v}_{n})| I{s_{min}(V^{d,v}_{n}) ≤ n^{−δ}}].  (4.13)

Hence, with |x|^{α} log|x| → 0 as x → 0, upon combining (4.12) and (4.13) we deduce that

lim sup_{n→∞} sup_{v∈C} E[⟨|Log|, L_{V^{d,v}_{n}}⟩_{0}^{n^{−δ}}] = 0.  (4.14)

Next, consider the collection of sets Γ^{d}_{γ} as in (3.3), that corresponds to the compact Λ_{d} := {v ∈ C : |v| ∈ {0, 1, . . . , d}} (such that m(Λ_{d}) = 0). In this case, v ∈ Γ^{d}_{γ} implies that {iy : y > 0} is disjoint from the set Γ^{d,|v|}_{γ} of Lemma 4.3. For such values of v we thus combine the bound (4.8) of Lemma 4.3 with [7, Lem. 15], to deduce that for any integer d ≥ 1 and γ ∈ (0,1) there exist finite n_{0}, M and positive κ (depending only on d and γ), for which

E[L_{V^{d,v}_{n}}(−y, y)] ≤ 2M(y ∨ n^{−κ})  ∀n ≥ n_{0}, y > 0, v ∈ Γ^{d}_{γ}.  (4.15)
Imitating the derivation of [7, Eqn. (49)], we get from (4.15) that for some finite C = C(d, γ, δ), any ε ≤ e^{−1}, n ≥ n_{0} and v ∈ Γ^{d}_{γ},

E[⟨|Log|, L_{V^{d,v}_{n}}⟩_{n^{−δ}}^{ε}] ≤ Cε|log ε|.  (4.16)

Thus, combining (4.14) and (4.16) we have that for any γ > 0,

lim_{ε↓0} lim sup_{n→∞} sup_{v∈Γ^{d}_{γ}} E[⟨|Log|, L_{V^{d,v}_{n}}⟩_{0}^{ε}] = 0.  (4.17)

Similarly, in view of (4.4), the bound (4.8) implies that

|ℑ(G^{d,v}_{∞}(z))| ≤ M,  ∀z ∈ C^{+} \ Γ^{d,|v|}_{γ}, v ∈ B(0, R),

which in combination with [7, Lem. 15], results with

Θ^{d,v}(−y, y) ≤ 2My  ∀y > 0, v ∈ Γ^{d}_{γ},

and consequently also

lim_{ε↓0} sup_{v∈Γ^{d}_{γ}} {⟨|Log|, Θ^{d,v}⟩_{0}^{ε}} = 0.  (4.18)
Next, by Lemma 3.1, the real valued random variables X^{(ε)}_{n}(ω, v) := ⟨Log, L_{V^{d,v}_{n}}⟩_{ε}^{∞} converge in probability, as n → ∞, to the non-random X^{(ε)}_{∞}(v) := ⟨Log, Θ^{d,v}⟩_{ε}^{∞}, for each v ∈ C and ε > 0. This, together with (4.17) and (4.18), results with the stated convergence of (3.1), for each v ∈ Γ^{d}_{γ}, so considering γ → 0 we conclude that (3.1) applies for all v ∈ Λ^{c}_{d}, hence for m-a.e. v.

Turning to prove (3.2), fix γ > 0 and a non-random, uniformly bounded φ, supported within Γ^{d}_{γ}. Since {L_{V^{d,v}_{n}}, v ∈ Γ^{d}_{γ}} are all supported on B(0, γ^{−1} + d), for each fixed ε > 0, the random variables Y^{(ε)}_{n}(ω, v) := φ(v)X^{(ε)}_{n}(ω, v)m(Γ^{d}_{γ}) with respect to the product law P̄ := P × m(·)/m(Γ^{d}_{γ}) on (ω, v) are bounded, uniformly in n. Consequently, their convergence in P̄-probability, for m-a.e. v, to Y^{(ε)}_{∞}(v) (which we have already established), implies the corresponding L^{1}-convergence. Furthermore, by (4.17) and Fubini's theorem,

E[|Y^{(0)}_{n} − Y^{(ε)}_{n}|] ≤ m(Γ^{d}_{γ})‖φ‖_{∞} sup_{v∈Γ^{d}_{γ}} E[|X^{(0)}_{n}(ω, v) − X^{(ε)}_{n}(ω, v)|] → 0,

when n → ∞ followed by ε ↓ 0. Finally, by (4.18), the non-random Y^{(ε)}_{∞}(v) → Y^{(0)}_{∞}(v) as ε ↓ 0, uniformly over Γ^{d}_{γ}. Consequently, as n → ∞ followed by ε ↓ 0,

E[|Y^{(0)}_{n} − Y^{(0)}_{∞}|] ≤ E[|Y^{(0)}_{n} − Y^{(ε)}_{n}|] + E[|Y^{(ε)}_{n} − Y^{(ε)}_{∞}|] + sup_{v∈Γ^{d}_{γ}} {|Y^{(0)}_{∞} − Y^{(ε)}_{∞}|}

converges to zero, and in particular

∫_{C} φ(v)X^{(0)}_{n}(ω, v) dm(v) → ∫_{C} φ(v)X^{(0)}_{∞}(v) dm(v),

in L^{1}, hence in P-probability, as claimed.

**5** **Proof of Theorem 1.2**

Following the proof of Proposition 1.4, it suffices for establishing Theorem 1.2 to extend the validity of Lemmas 3.1 and 3.2 in case of S_{n} = Σ^{d′}_{i=1} U^{i}_{n} + Σ_{i>d′} O^{i}_{n}. To this end, recall that Lemma 1.5(ii) applies regardless of the value of d′. Hence, Lemmas 3.1 and 3.2 hold as soon as we establish Lemma 4.2, the bound (4.11) on s_{min}(V^{d,v}_{n}), and the convergence (4.4) for d = 1. Examining Section 4, one finds that our proof of the latter three results applies as soon as d′ ≥ 1 (i.e. no new proofs are needed if we start with U^{1}_{n}).
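For numerical experiments with the orthogonal case treated next, a standard recipe samples Haar orthogonal matrices as the sign-corrected QR factor of a real Ginibre matrix. The sketch below (our own code, illustrative only) checks the vanishing of the normalized traces of powers invoked in the sequel:

```python
import numpy as np

rng = np.random.default_rng(2)

def haar_orthogonal(n):
    # Sign-corrected QR of a real Ginibre matrix gives a Haar orthogonal matrix
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))

n, trials = 300, 40
for k in (1, 2, 3):
    avg = np.mean([np.trace(np.linalg.matrix_power(haar_orthogonal(n), k)) / n
                   for _ in range(trials)])
    # E[(1/n) Tr O_n^k] -> 0 as n grows, since Tr O_n^k has O(1) fluctuations
    assert abs(avg) < 0.05
```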

In view of the preceding, we set hereafter d′ = 0, namely consider the sum of (only) i.i.d. Haar orthogonal matrices, and recall that it suffices to prove our theorem when d ≥ 2 (for the case d = 1 has already been established in [12, Cor. 2.8]). Further, while the Haar orthogonal measure is not invariant under multiplication by e^{iθ}, it is not hard to verify that nevertheless

lim_{n→∞} E[(1/n)Tr{O^{k}_{n}}] = lim_{n→∞} E[(1/n)Tr{(O^{∗}_{n})^{k}}] = 0,

for any positive integer k. Replacing the identity (4.3) by the preceding and thereafter following the proof of Lemma 4.1, we conclude that E[L_{O^{1,v}_{n}}] ⇒ Θ^{1,v} as n → ∞, for each fixed v ∈ C. This of course yields the convergence (4.4) of the corresponding Stieltjes transforms (and thereby extends the validity of Lemma 3.1 even for d′ = 0). Lacking the identity (4.3), for the orthogonal case we replace Lemma 4.2 by the following.

**Lemma 5.1.** The Stieltjes transform G^{1,v}_{n} of the ESD E[L_{O^{1,v}_{n}}] is such that

{z ∈ C^{+} : |ℑG^{1,v}_{n}(z)| ≥ Cε^{−2}} ⊂ {E + iη : η ∈ (0, ε^{2}), E ∈ (±(1±|v|) − 2ε, ±(1±|v|) + 2ε) ∪ (±|1±v| − 2ε, ±|1±v| + 2ε)},

for some C finite, all ε ∈ (0,1) and any v ∈ C.

Proof: We express G^{1,v}_{n}(z) as the expectation of a certain additive function of the eigenvalues of O^{1}_{n}, whereby information about the marginal distribution of these eigenvalues shall yield our control on |ℑ(G^{1,v}_{n}(z))|. To this end, set g(z, r) := z/(z^{2}−r) for z ∈ C^{+}, r ≥ 0, and let φ(O^{1}_{n}) := (2n)^{−1} Tr{(zI_{2n} − O^{1,v}_{n})^{−1}}. Clearly,

φ(O^{1}_{n}) = n^{−1} Σ^{n}_{k=1} g(z, s^{2}_{k}),  (5.1)

where {s_{k}} are the singular values of O^{1}_{n} − vI_{n}. For any matrix A_{n} and orthogonal matrix Õ_{n}, the singular values of A_{n} are the same as those of Õ_{n}A_{n}Õ^{∗}_{n}. Considering A_{n} = O^{1}_{n} − vI_{n}, we thus deduce from (5.1) that φ(Õ_{n}O^{1}_{n}Õ^{∗}_{n}) = φ(O^{1}_{n}), namely that φ(·) is a central function on the orthogonal group (see [1, pp. 192]).
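Identity (5.1) is elementary to verify numerically: taking O^{1,v}_{n} to be (as we assume here, consistent with the singular-value formulation) the 2n-dimensional Hermitian symmetrization [[0, A], [A^{∗}, 0]] of A = O^{1}_{n} − vI_{n}, its eigenvalues are ±s_{k}, so the normalized resolvent trace equals n^{−1}Σ_{k} g(z, s^{2}_{k}), and the same computation exhibits φ as central. A numpy sketch (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(3)

def haar_orthogonal(n):
    # Sign-corrected QR of a real Ginibre matrix gives a Haar orthogonal matrix
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))

def g(z, r):
    return z / (z**2 - r)

n, v, z = 60, 0.3 + 0.2j, 0.7 + 0.9j
o = haar_orthogonal(n)
a = o - v * np.eye(n)
# Hermitian 2n-dim symmetrization of A; its eigenvalues are +-(singular values)
h = np.block([[np.zeros((n, n)), a], [a.conj().T, np.zeros((n, n))]])
phi = np.trace(np.linalg.inv(z * np.eye(2 * n) - h)) / (2 * n)
s = np.linalg.svd(a, compute_uv=False)
assert abs(phi - np.mean(g(z, s**2))) < 1e-8   # identity (5.1)

# Centrality: conjugating O by an orthogonal matrix leaves phi unchanged
q = haar_orthogonal(n)
s2 = np.linalg.svd(q @ o @ q.T - v * np.eye(n), compute_uv=False)
assert abs(np.mean(g(z, s2**2)) - np.mean(g(z, s**2))) < 1e-8
```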

The group of n-dimensional orthogonal matrices partitions into the classes O^{+}(n) and O^{−}(n) of orthogonal matrices having determinant +1 and −1, respectively. In case n = 2ℓ+1 is odd, any O_{n} ∈ O^{±}(n) has eigenvalues {±1, e^{±iθ_{j}}, j = 1, . . . , ℓ}, for some θ = (θ_{1}, . . . , θ_{ℓ}) ∈ [−π, π]^{ℓ}. Similarly, for n = 2ℓ even, O_{n} ∈ O^{+}(n) has eigenvalues {e^{±iθ_{j}}, j = 1, . . . , ℓ}, whereas O_{n} ∈ O^{−}(n) has eigenvalues {−1, 1, e^{±iθ_{j}}, j = 1, . . . , ℓ−1}. Weyl's formula expresses the expected value of a central function of a Haar distributed orthogonal matrix in terms of the joint distribution of θ under the probability measures P^{±}_{n} corresponding to the classes O^{+}(n) and O^{−}(n). Specifically, it yields the expression
G^{1,v}_{n}(z) = E[φ(O^{1}_{n})] = (1/2)E^{+}_{n}[φ(diag(+1, R_{ℓ}(θ)))] + (1/2)E^{−}_{n}[φ(diag(−1, R_{ℓ}(θ)))],  for n = 2ℓ+1,

= (1/2)E^{+}_{n}[φ(diag(R_{ℓ}(θ)))] + (1/2)E^{−}_{n}[φ(diag(−1, 1, R_{ℓ−1}(θ)))],  for n = 2ℓ,  (5.2)

where R_{ℓ}(θ) := diag(R(θ_{1}), R(θ_{2}), · · · , R(θ_{ℓ})) for the two-dimensional rotation matrix

R(θ) = [ cos θ  sin θ ; −sin θ  cos θ ]

(see [1, Propn. 4.1.6], which also provides the joint densities of θ under P^{±}_{n}).
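The eigenvalue claims behind (5.2) are easy to confirm numerically: diag(R(θ_{1}), . . . , R(θ_{ℓ})) is orthogonal with determinant +1 and spectrum {e^{±iθ_{j}}}. A minimal check (our own code, not part of the proof):

```python
import numpy as np

def rot(theta):
    # The two-dimensional rotation matrix R(theta) from (5.2)
    return np.array([[np.cos(theta), np.sin(theta)],
                     [-np.sin(theta), np.cos(theta)]])

def block_rot(thetas):
    # R_l(theta) = diag(R(theta_1), ..., R(theta_l))
    l = len(thetas)
    m = np.zeros((2 * l, 2 * l))
    for j, t in enumerate(thetas):
        m[2 * j:2 * j + 2, 2 * j:2 * j + 2] = rot(t)
    return m

thetas = np.array([0.3, 1.1, 2.5])
m = block_rot(thetas)
assert np.allclose(m @ m.T, np.eye(6))       # orthogonal
assert np.isclose(np.linalg.det(m), 1.0)     # determinant +1, i.e. in O^+(6)
# Spectrum is exactly {e^{+-i theta_j}}: compare the sorted eigenvalue angles
angles = np.sort(np.angle(np.linalg.eigvals(m)))
assert np.allclose(angles, np.sort(np.concatenate([-thetas, thetas])))
```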