1Introduction MichaelC.Cranston StanislavA.Molchanov Onaconcentrationinequalityforsumsofindependentisotropicvectors

(1)

ISSN:1083-589X in PROBABILITY

On a concentration inequality for sums of independent isotropic vectors

^∗

Michael C. Cranston

^†

Stanislav A. Molchanov

^‡

Abstract

We consider a version of a classical concentration inequality for sums of independent, isotropic random vectors with a mild restriction on the distribution of the radial part of these vectors. The proof uses a little Fourier analysis, the Laplace asymptotic method and a conditioning idea that traces its roots to some of the original works on concentration inequalities.

Keywords:concentration inequality; isotropy.

AMS MSC 2010:Primary 60F05, Secondary 60E15.

Submitted to ECP on August 18, 2009, final version accepted on April 23, 2012.

1 Introduction

The purpose of this paper is to give a multi-dimensional version of a concentration inequality that traces its origins to P. Lévy [14]. These Lèvy type concentration results assert a rate of decay on the maximal amount of mass that the distribution of a sum of independent, nondegenerate, random variables can place in an arbitrary interval.

We remark that this is in contrast to another version of the concentration phenomena, see [13] and the vast literature listed there, where measures concentrate most of their mass on particular subsets of a measure space. However, the proof of our result relies on an early version of the latter type of concentration phenomena due to Bernstein and Hoeffding. Motivation to study the concentration inequality in the context of isotropic random vectors arises from a variety of sources. Sums of independent isotropic vectors are the so-called isotropic random flights which arise in problems in astronomy, [2], [4].

Such sums appear in random searches in the context of biological encounters, [3]. They are also used as polymer models in chemistry, [6], [8]. Yet another example of isotropic random vectors arises in a discrete model for the theory of magnetic fields generated by a turbulent media, [15], where random vectors of the formM ξappear whereM is a random element of the orthogonal group,O(3),and the random vectorξis independent ofM.

Early works on the concentration inequalities in the real valued case are due to Döblin, [5], Kolmogorov [11], Lèvy [14] and Rogozin [17]. The higher dimensional case has been treated in Kanter, [10]. In [10], the author considered independent, R^N- valued, symmetric random vectors X1,· · · , Xn. Then if C ⊂ R^N is a convex set and

∗Research supported by grants from NSF, DMS-1007176 and DMS-0706928.

†M. Cranston, University of California Irvine, USA. E-mail:[email protected]

‡S. Molchanov, University of North Carolina at Charlotte, USA. E-mail:[email protected]

(2)

λ=Pn

i=1P(X_i6∈C)it was proved that

P(X1+· · ·+Xn+x∈C)≤Φ(λ)

whereΦ(s) =e^−s(I₀(s) +I₁(s))andI₀andI₁are the modified Bessel functions given by Ik(s) =

∞

X

m=0

1 m!(m+k)!

s 2

2m+k

.

Our work considers the convex set C = Q(l), the cube centered at Q(l) = [0l]^d. As remarked in [10],Φ(ⁿ₂) ≥ cn^−1/2. Thus, the Kanter result doesn’t contain the dimensional dependence which does appear in our bound on the concentration function in Theorem 2.1.

The authors wish to thank the referee for pointing out the Lèvy decoupage technique [1], used in our conditioning argument at (3.6) below, which greatly improved an earlier version of the paper.

2 Statement of Result

Let us give a precise statement of the result proven by Kolmogorov [11]. Suppose {ξk}_k≥1are independent random variables defined on some probability space(Ω,F, P).

SetSn=Pn

k=1ξk and define forl >0,the concentration function Q_k(l) = sup

x

P(x≤ξ_k≤x+l) and

QS_n(l) = sup

x

P(x≤Sn ≤x+l).

Kolmogorov’s version of the concentration inequality states that there exists a constant C >0such that if

L≥l, L²≥l²logs where

s=

n

X

k=1

(1−Qk(l)) then

QSn(L)≤ CL l√

s.

We prove a higher dimensional version of this result. Due to possible degeneracies, the higher dimensional result will not hold in general, but a natural condition to impose on random vectors under which the result turns out to be true is isotropy. We say an R^d valued random vector, X, on a probability space(Ω,F, P) is isotropic if P X⁻¹ = P(U X)⁻¹for everyU ∈ O(d),the group of orthogonal matrices. With this definition we have,

Theorem 2.1. LetX1, X2,· · · be independent, isotropic random vectors with values in R^d, d≥2and putSn =X1+X2+· · ·+Xn.Letl >0andL >0be given. Define, for a >0,the cube

Q(l) = [0, a]^d⊂R^d. Assumepi =P(Xi ∈/ Q(l)), satisfiesPn

i=1pi → ∞.Then given any∈ (0,1),for every x∈R^d,

P(S_n∈x+Q(L))≤2 exp

−2²(Pn i=1pi)² n

+(1 +o(1)) r d

2π L

l

!d

(1−)

n

Xpi

!−^d₂

.

(2.1)

(3)

There various immediate corollaries that may be derived from this. We sate two such.

Corollary 2.2. Assume in addition to the conditions above that

n

X

i=1

p_i>2^−2/d r d

2π L

l

!2

and

1

4 > n 2(Pn

i=1pi)²ln2 ¹₂Pn i=1p_i^d₂ q

d 2π

L l

^d .

Then there is a positive constantcindependent ofn, landLsuch that,

P(Sn ∈x+Q(L))≤c

n

X

i=1

pi

!−^d₂

L l

d

. Proof. The conditions of the corollary ensure that the solution of

²= n

2(Pn

i=1p_i)²ln2 ((1−)Pn i=1pi)^d² q

d 2π

L l

d

satisfies 0 < < ¹₂. This follows since the left hand side minus the right hand side is increasing inand this forces the solution to be less than ¹₄. Sinceis then bounded from1we can take thisin (2.1) and find thecas claimed by usingTheorem 2.1.

Another possibility is the following.

Corollary 2.3. LetX1, X2,· · · be independent, isotropic random vectors with values in R^d, d≥2,for which

P(X_i∈/Q(1))≥ 1

2, for alli≥1.

There is a constantc(d)such that for everyxwe have P(S_n ∈x+Q(1))≤ c(d)

n^d/2. Proof. In this case, Pn

i=1p_i ≥ ⁿ₂ and L = l = 1. Taking = ¹₂ in Theorem 2.1 the corollary holds with a suitable choice of constantc.

3 Proof of the Concentration Inequality

We commence with two lemmas, the first is a local central limit theorem. This is likely a known result and as the proof is short we include it for completeness.

Lemma 3.1. Suppose thatYe₁,· · · ,Ye_mare iiduniformly distributed on the unit sphere inR^d.Ifpe^d_mdenotes the density ofYe1+· · ·+Yemthen

pe^d_m(0)∼

r d 2πm

!d

, m→ ∞.

(4)

Proof. IfY is uniformly distributed on the unit sphere inR^d,then fork∈R^d,we first compute

ψd(k) =E[e^ihk,Yⁱ].

Given k, one selects coordinates on the sphere φ₁, φ₂,· · ·, φ_d−1, φ_i ∈ [0, π),fori = 1,· · ·, d−2,andφ_d−1 ∈ [0,2π), so thatk= |k|cosφ1.The volume form on the sphere, normalized to have total mass one, is given in these coordinates by

dV = Γ(^d₂+ 1)

dπ^d² sin^d−2φ1sin^d−3φ2 · · ·sinφ_d−2dφ1dφ2· · ·dφ_d−1.

Then since the characteristic function of a random vector Y which is uniformly distributed on the sphere of radius1inR^dis real,

E[e^ihk,Yⁱ] =cd

Z π

0

cos(|k|cosφ1) sin^d−2φ1dφ1, wherec_d= Rπ

0 sin^d−2φ₁dφ₁⁻¹ . Ford= 2,by [12] page115,

ψ2(k) =1 π

Z π

0

cos(|k|cosφ)dφ

=1 π

Z π

0

cos(|k|sinφ)dφ

=J0(|k|),

whereJ0(z)is the0^thorder Bessel function of the first kind given by J0(z) = 1−(z/2)²

(1!)² +(z/2)⁴

(2!)² −(z/2)⁶ (3!)² +· · · . Ford= 3,

ψ₃(k) =1 2

Z π

0

cos(|k|cosφ₁) sinφ₁dφ₁

=1 2

Z 1

−1

cos(|k|w)dw

=sin(|k|)

|k| . Ford >3,we have by [12] page114,

ψ_d(k) =c_d Z π

0

cos(|k|cosφ₁) sin^d−2φ₁dφ₁

=2^d²⁻¹Γ d

2

|k|⁻^d²⁺¹Jd

2−1(|k|), whereJd

2−1is the order ^d₂−1Bessel function of the first kind.

Since for anyd≥2,

pe^d_m(0) = 1 (2π)^d

Z

R^d

(ψ_d(k))^mdk, (3.1)

we need to check the asymptotics of the right hand side.

(5)

Starting withd= 2,we observe thatψ₂(r) =J₀(r)has a unique global maximum at r = 0 withψ₂(0) = 1. Also, ψ⁰⁰₂(0) = −¹₂ and for any > 0, c₂() ≡ sup_r≥|ψ₂(r)| <1.

Similarly, ford= 3, ψ3(r) = ^sin_r^r has a unique global maximum atr= 0withψ3(0) = 1.

This timeψ⁰⁰₃(0) =−¹₃ and for any >0, c3()≡sup_r≥|ψ3(r)|<1.Finally, ford >3,we use the the representation, from [12] page114,

2^d²⁻¹Γ d

2

r⁻^d²⁺¹Jd

2−1(r) = Γ ^d₂ Γ ¹₂

Γ ^d−1₂ Z 1

−1

(1−t²)^d−3² cos(rt)dt to conclude thatψd(r) = 2^d²Γ ^d₂

r⁻^d²⁺¹Jd

2−1(r)has a unique global maximum atr = 0 withψ_d(0) = 1and

ψ⁰⁰_d(0) =− Γ ^d₂ Γ ¹₂

Γ ^d−1₂ Z 1

−1

(1−t²)^d−3² t²dt

=− Γ ^d₂ Γ ¹₂

Γ ^d−1₂

Γ ^d−1₂ Γ ³₂

Γ ^d₂+ 1

=−1 d,

and for any >0, cd()≡sup_r≥|ψd(r)|<1.

In order to determine the asymptotics at (3.1), we need ther → ∞decay ofψd(r).

Ford≥2from [12] page134for some positive constantc,

|ψ_d(r)| ≤c√

r^−d+1, r→ ∞. (3.2)

Now, withω_d= ^dπ

d 2

Γ(^d₂+1) being the volume ofS^d−1,write 1

(2π)^d Z

R^d

(ψd(k))^mdk= ωd

(2π)^d Z ∞

0

r^d−1(ψd(r))^mdr

= ωd

(2π)^d Z

0

r^d−1(ψd(r))^mdr+ ωd

(2π)^d Z ∞

r^d−1(ψd(r))^mdr.

(3.3)

We use the fact that in any dimension d ≥ 2, c_d() ≡ sup_r≥|ψd(r)| < 1 to show the second integral dies off exponentially fast inm.In fact,

Z ∞

r^d−1(ψ_d(r))^mdr

≤c_d()^m/2 Z ∞

r^d−1|ψd(r)|^m/2dr (3.4) and by (3.2), the second integral is bounded inmfor ^m₂ > _d−1^2d .

Since for anyd≥2, ψ_d⁰⁰(0) =−¹_d,in the first integral, we write (ψd(r))^m=e^m^ln^ψ^d^(r)∼e⁻^mr

2 2d

proceeding with the Laplace asymptotic method gives ω_d

(2π)^d Z

0

r^d−1(ψ_d(r))^mdr∼ ω_d (2π)^d

Z

0

|r|^d−1e⁻^mr

2 2d dr

= ωd

(2π)^d rd

m

!d

Z

√m d

0

r^d−1e⁻^r

2 2 dr

∼ r d

2πm

!^d

(3.5)

(6)

Thus, by (3.3), (3.4) and (3.5), ford≥2,

pe^d_m(0)∼

r d 2πm

!^d

and the lemma is proved.

Lemma 3.2. Let Y1, Y2,· · ·, Ymbe independent random vectors withYi uniformly distributed on the sphere of radiusR_i, i= 1,· · ·, m.Assume there is a numberl >0such that eachR_i≥land theR_iare non-random. Letmbe an even integer. Then forL >0 and anyx∈R^d,

P(Y1+Y2+· · ·+Ym∈x+Q(L))≤(1 +o(1)) r d

2πm L

l

!d

, m→ ∞.

Proof. It suffices to provide an appropriate bound on the L^∞ norm of the density of Y1+Y2+· · ·+Ym.Notice that

E[e^ihk,Yⁱⁱ] =ψd(Rik).

Then, by the independence ofY1, Y2,· · ·, Ym, E[e^ihk,Y¹^+Y²^+···+Y^mⁱ] =

m

Y

i=1

ψd(Rik).

By (3.2), for m > _d−1^2d , this characteristic function is integrable, which means S_m = Y1+Y2+· · ·+Ym has a bounded density,p^d_m(x), x ∈ R^d.In fact, for m even, by the arithmetic-geometric mean and Jensen’s inequalities,

||p^d_m||_∞≤ 1 (2π)^d

Z

R^d

m

Y

i=1

ψd(Rik)

dk

≤ 1 (2π)^d

1 m

m

X

i=1

Z

R^d

(ψd(Rik))^mdk

= 1 (2π)^d

1 m

m

X

i=1

1 R_i^d

Z

R^d

(ψd(k))^mdk

≤ 1 (2πl)^d

Z

R^d

(ψd(k))^mdk

≤(1 +o(1)) r d

2πm 1

l

!^d .

where the last line follows fromLemma 3.1.This completes the proof.

We can now prove the main result,Theorem 2.1.

Proof. We use the Lèvy decoupage technique [1] to decompose the random variables X_i,based on whether|X_i| ≥lor|X_i|< l.Set

pi=P(|Xi| ≥l).

Recall that we are assuming that p_i ≥ ¹₂, i = 1,2,· · · , n. For i = 1,2,· · · , n, let the random variables{U_i: 1≤i≤n}and{V_i: 1≤i≤n}be independent with

L(Ui) =L(|Xi| | |Xi| ≥l),L(Vi) =L(|Xi| | |Xi|< l).

(7)

Also take Bernoulli random variables{ηi: 1≤i≤n}independent of the{Ui: 1≤i≤n}

and the{V_i: 1≤i≤n}with

P(ηi= 1) = 1−P(ηi= 0) =pi.

Then, as above, take{Ye_i : 1≤i≤n}to be uniformly distributed on the unit sphere in R^dand independent of the previously defined random variables. In an obvious notation we may take the probability measureP asP =P_(U,V₎×Pη×P

Yeand our original random variables may be represented as

Xi= (ηiUi+ (1−ηi)Vi)Yei. For1≤i≤n,set

Y_i¹=η_iU_iYe_i and

Y_i⁰= (1−ηi)ViYei.

Notice that{Y_i¹: 1≤i≤n, ηi6= 0}satisfies the conditions ofLemma 3.2with respect to the probability measureP_η×P

Ye a.s..Denote the vector of outcomes of the sequence {η_i: 1≤i≤n}by

ζ_n = (η₁, η₂,· · ·, η_n).

Then for a given deterministic sequence(α₁, α₂, α₃,· · ·, α_n)∈ {0,1}ⁿ,define pη(α)≡P(ζn= (α1, α2, α3,· · ·, αn)) = Πⁿ_i=1p^α_iⁱ(1−pi)^1−αⁱ.

By Bernstein’s inequality or it’s generalization due to Hoeffding, [9], given0< <1, P_η

n

X

i=1

η_i<(1−)

n

X

i=1

p_i

!

=P_η

n

X

i=1

(η_i−p_i)<−

n

X

i=1

p_i

!

≤Pη |

n

X

i=1

(η_i−p_i)|>

n

X

i=1

p_i

!

≤2 exp

−2²(Pn i=1pi)² n

.

(3.6)

We then have

P_η×P

Ye(S_n ∈x+Q(L)) = X

α∈{0,1}ⁿ

p_η(α)P

Ye n

X

i=1

(α_iU_i+ (1−α_i)V_i)eY_i∈x+Q(L)

!

≤ X

α∈{0,1}ⁿ,Pn

i=1α_i<(1−)Pn i=1p_i

pη(α)

+ X

α∈{0,1}ⁿ,Pn

i=1α_i≥(1−)Pn i=1p_i

(pη(α)

×P

Ye n

X

i=1

(α_iU_i+ (1−α_i)V_i)eY_i ∈x+Q(L)

!!

.

(3.7)

It easily follows from Fubini’s theorem that ifZ is independent ofSn,then for anyQ

sup

x

P(Sm+Z ∈x+Q)≤sup

x

P(Sm∈x+Q). (3.8)

(8)

Using Lemma 3.2with m = [(1−)Pn

i=1p_i] which we may assume is even, we conclude from (3.8) that for eachα∈ {0,1}ⁿ for whichPn

i=1α_i ≥(1−)Pn

i=1p_i we have P_(U,V₎ a.s.

sup

x

PYe n

X

i=1

(αiUi+ (1−αi)Vi)Yei∈x+Q(L)

!

≤sup

x

PYe n

X

i=1

αiUiYei∈x+Q(L)

!

≤(1 +o(1)) r d

2π 1 l

!^d

(1−)

n

X

i=1

pi

!−^d₂

|Q|.

(3.9)

Thus, from (3.6), (3.7) and (3.9), it follows that asPn

i=1p_i→ ∞,one hasP_(U,V₎a.s.

Pη×P

Ye(Sn∈x+Q(L))≤2 exp

−2²(Pn i=1p_i)² n

+ (1 +o(1)) r d

2π L

l

!^d

(1−)

n

X

i=1

pi

!−^d₂

. Now integrate with respect toP_(U,V₎to complete the proof.

References

[1] Aloisio Araújo and Evarist Ginè. The central limit theorem for real and Banach valued random variables.. Wiley, New York, June 1980. MR-0576407

[2] Richard Barakat. Isotropic Random Flights. J. Phys. A., Nucl. Gen., Vol. 6, 796-804, June 1973. MR-0418184

[3] Richard Barakat.Lévy flights and superdiffusion in the context of biological encounters and random searches Physics of Life Reviews, Vol. 5, 133-150, 2008.

[4] S. Chandrasekhar.Stochastic Problems in Physics and Astronomy. Rev. Mod. Phys. 15 1-89, 1943. MR-0008130

[5] W. Döblin.Sur la summe d’un grande nombres des variables aleatoires independantes. Bull.

Sc. Math, 63, pp 23-32, 35-64, 1939.

[6] M. Doi and S.F. Edwards. The Theory of Polymer Dynamics. International Monographs on Physics 1988.

[7] R. Durrett. Probability: Theory and Examples, Second Edition Duxbury Press, 1996. MR- 1609153

[8] P. J. Flory Statistical Mechanics of Chain Molecules. New York: Interscience, 1969.

[9] Wassily Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association 58 (301): 13–30, March 1963. MR-0144363

[10] Marek Kanter. Probability inequalities for convex sets and multidimensional concentration functions. J. Multivariate Anal. 6 (1976), no. 2, 222–236 MR-0478328

[11] A.N. Kolmogorov. Sur les proprietes des fonctions concentrations de M. P, Lévy. Ann. de l’Inst. Henri Poincaré, XVI, 1,pp.27-34, 1958.

[12] N.N. Lebedev. Special Functions and Their Applications. Prentice-Hall, Englewood Cliffs, N.J. 1965.MR-0174795

[13] Michel Ledoux.The Concentration of Measure Phenomenon. AMS, 2001. MR-1849347 [14] Paul Lévy.Theorie de l’Addition des Variables Aleatoires. Gauthier-Villars, Paris, 1937.

[15] S.A. Molchanov and A. Ruzmaikin.Lyapunov exponents and distributions of magnetic fields in dynamo models. The Dynkin Festschrift, 287–306, Progr. Probab., 34, BirkhŁuser Boston, Boston, MA, 1994.

[16] G. Polya and G. Szego. Problems and Theorems in Analysis. Springer Verlag, New York, 1972. MR-0344042

[17] B. Rogozin.An estimate for concentration functions. Theory of Probability and its Appl. VI, 1, pp. 94-96, 1961. MR-0131893