THE BERRY-ESSEEN THEOREM


K. NEAMMANEE

Received 24 November 2004 and in revised form 8 March 2005

Dedicated to Professor Yupaporn Kemprasit on her sixtieth birthday.

In 2001, Chen and Shao gave a nonuniform estimate of the rate of convergence in the Berry-Esseen theorem for independent random variables via the Stein-Chen-Shao method.

The aim of this paper is to obtain a constant in the Chen-Shao theorem, where the random variables are not necessarily identically distributed and the existence of their third moments is not assumed. The bound is given in terms of truncated moments and the constant obtained is 21.44 for most values of $x$. We use Stein's method, in particular the Chen-Shao concentration inequality.

1. Introduction and main result

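Let $X_1, X_2, \ldots, X_n$ be independent and not necessarily identically distributed random variables with zero means and finite variances. Define $W = X_1 + X_2 + \cdots + X_n$ and assume that $\operatorname{Var}(W) = 1$. Let $F_n$ be the distribution function of $W$ and $\Phi$ the standard normal distribution function. It is well known that if the Lindeberg condition,

\[
\forall \varepsilon > 0, \quad \sum_{i=1}^{n} E X_i^2 I(|X_i| > \varepsilon) \longrightarrow 0 \quad \text{as } n \longrightarrow \infty, \tag{1.1}
\]

where $I(A)$ is an indicator random variable such that

\[
I(A) =
\begin{cases}
1 & \text{if } A \text{ is true}, \\
0 & \text{otherwise},
\end{cases} \tag{1.2}
\]

is satisfied, then

\[
\forall x \in \mathbb{R}, \quad F_n(x) \longrightarrow \Phi(x) \quad \text{as } n \longrightarrow \infty. \tag{1.3}
\]

Furthermore, if $E|X_i|^3 < \infty$, then we have the uniform Berry-Esseen theorem

\[
\sup_{x \in \mathbb{R}} |F_n(x) - \Phi(x)| \le C_0 \sum_{i=1}^{n} E|X_i|^3, \tag{1.4}
\]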

International Journal of Mathematics and Mathematical Sciences 2005:12 (2005) 1951–1967 DOI:10.1155/IJMMS.2005.1951


and the nonuniform Berry-Esseen theorem

\[
|F_n(x) - \Phi(x)| \le \frac{C_1}{1 + |x|^3} \sum_{i=1}^{n} E|X_i|^3, \tag{1.5}
\]

where both $C_0$ and $C_1$ are absolute constants.
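As a numerical illustration of (1.4), the following sketch computes the left-hand side exactly for a standardized binomial sum and compares it with $\sum_i E|X_i|^3$. The Bernoulli example, the function names, and the parameter choice $n = 100$, $p = 1/2$ are assumptions made here for illustration; they do not come from the paper. The resulting ratio is a lower bound on any admissible $C_0$.

```python
from math import comb, erf, sqrt

def normal_cdf(x):
    """Standard normal distribution function Phi(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def kolmogorov_distance_binomial(n, p):
    """Exact sup_x |F_n(x) - Phi(x)| for W = (S - n p) / sqrt(n p q), S ~ Binomial(n, p).

    F_n is a step function, so the supremum is attained at a jump point,
    either at the value F_n(x) or at the left limit F_n(x-).
    """
    q = 1.0 - p
    sigma = sqrt(n * p * q)
    cdf = 0.0
    worst = 0.0
    for k in range(n + 1):
        x = (k - n * p) / sigma
        phi = normal_cdf(x)
        worst = max(worst, abs(cdf - phi))      # left limit F_n(x-)
        cdf += comb(n, k) * p**k * q**(n - k)
        worst = max(worst, abs(cdf - phi))      # value F_n(x)
    return worst

def third_moment_sum(n, p):
    """sum_i E|X_i|^3 for X_i = (B_i - p) / sqrt(n p q), B_i ~ Bernoulli(p)."""
    q = 1.0 - p
    # E|B - p|^3 = p q (p^2 + q^2) for a Bernoulli(p) variable B.
    return n * p * q * (p * p + q * q) / (n * p * q) ** 1.5

if __name__ == "__main__":
    dist = kolmogorov_distance_binomial(100, 0.5)
    moments = third_moment_sum(100, 0.5)
    print(dist, moments, dist / moments)
```

Any example of this kind only bounds $C_0$ from below; it says nothing about the best upper bound.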

Note that in the case where the $X_i$'s are identically distributed, (1.4) and (1.5) were first obtained by Esseen [4] and Nagaev [8], respectively. Bikjalis [1] generalized Nagaev's result to the case where the $X_i$'s are not necessarily identically distributed. Paditz [9, 10] calculated $C_1$ to be 114.7 and 32 in 1977 and 1989, respectively, and Michel [7] reduced it to 30.84 for the independent and identically distributed case.

In 2001, Chen and Shao gave nonuniform and uniform bounds for independent and not necessarily identically distributed random variables without assuming the existence of third moments. Their result is stated as follows.

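Theorem 1.1 (Chen-Shao theorem). Let $X_1, X_2, \ldots, X_n$ be independent random variables with zero means and $\sum_{i=1}^{n} E X_i^2 = 1$. Let $W = X_1 + X_2 + \cdots + X_n$ and let $F_n$ be the distribution function of $W$. Then,

\[
|F_n(x) - \Phi(x)| \le C \sum_{i=1}^{n} \left\{ \frac{E X_i^2 I(|X_i| \ge 1 + |x|)}{(1 + |x|)^2} + \frac{E|X_i|^3 I(|X_i| < 1 + |x|)}{(1 + |x|)^3} \right\}, \tag{1.6}
\]

\[
|F_n(x) - \Phi(x)| \le 4.1 \sum_{i=1}^{n} \left\{ E X_i^2 I(|X_i| \ge 1) + E|X_i|^3 I(|X_i| < 1) \right\}. \tag{1.7}
\]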

Observe that the constant 4.1 in (1.7) is smaller than the constant 6 obtained by Feller [5]. It was pointed out by Loh [6] that the truncation at 1 in (1.7) is optimal in the sense that

\[
E X^2 I(|X| \ge 1) + E|X|^3 I(|X| < 1) = \inf_{A} \left\{ E X^2 I(X \in A) + E|X|^3 I(X \in A^c) \right\}. \tag{1.8}
\]

The standard tool used by Esseen [4], Nagaev [8], Bikjalis [1], Paditz [9, 10], and Michel [7] is the Fourier-analytic method. Chen and Shao [3], however, proved (1.6) and (1.7) by combining truncation with Stein's method [14] and the concentration inequality approach.

The concentration inequality approach was originally used by Stein for independent and identically distributed random variables. It was extended by Chen [2] to dependent and nonidentically distributed random variables with arbitrary index sets. In [3], the concentration inequality approach is improved and extended to nonuniform bounds. The improved approach is much more effective than that in [2]. In this paper, we combine the concentration inequality in [3] with the coupling approach to calculate the constant $C$ in (1.6). The following are our main results.

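Theorem 1.2. Let $X_1, X_2, \ldots, X_n$ be independent random variables with zero means and $\sum_{i=1}^{n} E X_i^2 = 1$. Let $W = X_1 + X_2 + \cdots + X_n$ and let $F_n$ be the distribution function of $W$. Then

\[
|F_n(x) - \Phi(x)| \le C_0 \sum_{i=1}^{n} \left\{ \frac{E X_i^2 I(|X_i| \ge 1 + |x/4|)}{(1 + |x/4|)^2} + \frac{E|X_i|^3 I(|X_i| < 1 + |x/4|)}{(1 + |x/4|)^3} \right\}, \tag{1.9}
\]

where

\[
C_0 =
\begin{cases}
21.44 & \text{if } |x| \le 3 \text{ or } |x| \ge 14, \\
32 & \text{if } 3 < |x| \le 3.99 \text{ or } 7.98 < |x| < 14, \\
60 & \text{otherwise}.
\end{cases} \tag{1.10}
\]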
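To make the statement concrete, the following sketch evaluates the right-hand side of (1.9) for discrete summands. The representation of each $X_i$ as a list of (value, probability) pairs and the function names are assumptions made here for illustration only.

```python
def c0(x):
    """Piecewise constant C_0 of (1.10)."""
    t = abs(x)
    if t <= 3 or t >= 14:
        return 21.44
    if t <= 3.99 or t > 7.98:
        return 32.0
    return 60.0          # remaining range: 3.99 < |x| <= 7.98

def bound_1_9(x, variables):
    """Right-hand side of (1.9).

    `variables` holds one discrete distribution per X_i, each given as a
    list of (value, probability) pairs (a hypothetical input format).
    """
    cut = 1.0 + abs(x / 4.0)
    total = 0.0
    for dist in variables:
        second = sum(p * v * v for v, p in dist if abs(v) >= cut)
        third = sum(p * abs(v) ** 3 for v, p in dist if abs(v) < cut)
        total += second / cut**2 + third / cut**3
    return c0(x) * total

if __name__ == "__main__":
    # 100 symmetric two-point variables +-0.1, so that sum E X_i^2 = 1.
    rademacher = [[(0.1, 0.5), (-0.1, 0.5)] for _ in range(100)]
    for x in (0.0, 3.5, 5.0, 10.0, 16.0):
        print(x, c0(x), bound_1_9(x, rademacher))
```

For these bounded summands the truncated second-moment term vanishes, and the bound becomes informative (less than 1) only for moderately large $|x|$.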

Corollary 1.3. If the $X_i$'s in Theorem 1.1 have finite third moments, then

\[
|F_n(x) - \Phi(x)| \le \frac{C_1 \sum_{i=1}^{n} E|X_i|^3}{(1 + |x/4|)^3}, \tag{1.11}
\]

where

\[
C_1 =
\begin{cases}
21.44 & \text{if } |x| \le 7.98 \text{ or } |x| \ge 14, \\
32 & \text{if } 7.98 < |x| < 14.
\end{cases} \tag{1.12}
\]

Observe that the bound in Theorem 1.2 is given in terms of truncated moments. It is worthwhile to note also that truncated moments were considered by Sazonov [13]. In his work, he gave two main methods for deriving speed-of-convergence results in the central limit theorem (CLT), namely, the Fourier-analytic method and the method of composition, which uses convolutions directly. These methods are used to derive further results for random vectors. For nonuniform bounds in the CLT for random vectors, see, for example, Rotar [11, 12].

2. Auxiliary results

In this section, we give auxiliary results needed to prove the main theorem in Section 3.

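Let $X_1, X_2, \ldots, X_n$, $W$, $F_n$, and $\Phi$ be defined as in Theorem 1.2. In order to use the concentration inequality and the coupling approach, we introduce random variables $J, \tilde{X}_1, \tilde{X}_2, \ldots, \tilde{X}_n$ defined in the following way. The random variables $J, X_1, X_2, \ldots, X_n, \tilde{X}_1, \tilde{X}_2, \ldots, \tilde{X}_n$ are independent, $J$ is uniformly distributed over the set $\{1, 2, \ldots, n\}$, and $(X_i, \tilde{X}_i)$ is a coupling pair, that is, $X_i$ and $\tilde{X}_i$ have the same distribution. For $a > 0$, we also let

\[
\begin{gathered}
Y_{j,a} = X_j I(|X_j| < 1 + a), \qquad \tilde{Y}_{j,a} = \tilde{X}_j I(|\tilde{X}_j| < 1 + a), \\
S_a = \sum_{j=1}^{n} Y_{j,a}, \qquad \tilde{S}_a = S_a - Y_{J,a} + \tilde{Y}_{J,a}, \\
\alpha_a = \sum_{j=1}^{n} E X_j^2 I(|X_j| \ge 1 + a), \qquad \beta_a = \sum_{j=1}^{n} E|X_j|^3 I(|X_j| < 1 + a), \\
\delta_a = \frac{\alpha_a}{(1 + a)^2} + \frac{\beta_a}{(1 + a)^3}.
\end{gathered} \tag{2.1}
\]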
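The truncated moments in (2.1) are straightforward to compute for discrete summands. The sketch below assumes each $X_j$ is given as a list of (value, probability) pairs; this input format and the function name are illustrative choices, not notation from the paper.

```python
def truncated_moments(variables, a):
    """Return (alpha_a, beta_a, delta_a) of (2.1) for discrete X_j.

    `variables` holds one list of (value, probability) pairs per X_j
    (a hypothetical input format chosen for this sketch).
    """
    cut = 1.0 + a
    alpha = sum(p * v * v for dist in variables for v, p in dist if abs(v) >= cut)
    beta = sum(p * abs(v) ** 3 for dist in variables for v, p in dist if abs(v) < cut)
    delta = alpha / cut**2 + beta / cut**3
    return alpha, beta, delta

if __name__ == "__main__":
    # Four symmetric variables +-0.5: nothing is truncated at 1 + a = 2.
    symmetric = [[(0.5, 0.5), (-0.5, 0.5)] for _ in range(4)]
    print(truncated_moments(symmetric, 1.0))
    # One skewed variable with mean 0 and variance 1: the atom at 2 is cut off at 1.5.
    skewed = [[(2.0, 0.2), (-0.5, 0.8)]]
    print(truncated_moments(skewed, 0.5))
```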

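Observe that $(Y_{j,a}, \tilde{Y}_{j,a})$ is a coupling pair and $(S_a, \tilde{S}_a)$ is an exchangeable pair in the sense that

\[
P(S_a \in E, \tilde{S}_a \in \tilde{E}) = P(S_a \in \tilde{E}, \tilde{S}_a \in E) \tag{2.2}
\]

for arbitrary Borel sets $E$ and $\tilde{E}$ on $\mathbb{R}$. From the fact that $(a + b)^n \le 2^{n-1}(a^n + b^n)$ for $a, b \ge 0$, we have

\[
E|Y_{J,a}|^3 = \frac{1}{n} \sum_{j=1}^{n} E|X_j|^3 I(|X_j| < 1 + a) = \frac{\beta_a}{n}, \tag{2.3}
\]

\[
E|\tilde{Y}_{J,a} - Y_{J,a}|^3 \le \frac{8}{n} \sum_{j=1}^{n} E|X_j|^3 I(|X_j| < 1 + a) = \frac{8\beta_a}{n}. \tag{2.4}
\]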

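In Proposition 2.1, we use the coupling approach to bound $E S_a^2$ and $E S_a^4$, which are used in the proof of the concentration inequality.

Proposition 2.1. (1) $E^{S_a} \tilde{S}_a = (1 - 1/n) S_a + (1/n) E S_a$, where $E^X Y$ is the conditional expectation of $Y$ given $X$.

(2) $E S_a^2 \le 1 + (\alpha_a/(1 + a))^2$.

(3) $|E S_a^3| \le 12\beta_a + 3(\alpha_a/(1 + a)) + (\alpha_a/(1 + a))^3$.

(4) $E S_a^4 \le 53(1 + a)\beta_a + 30\beta_a(\alpha_a/(1 + a)) + 6(\alpha_a/(1 + a))^2 + (\alpha_a/(1 + a))^4 + 6\beta_a + 6\alpha_a + 3$.

(5) If $(1 + a)^2\alpha_a + (1 + a)\beta_a < 1/80$ and $a \ge 3$, then $E S_a^2 \le 1 + (3.8 \times 10^{-8})$ and $E S_a^4 \le 3.69$.

(6) If $(1 + a)^2\alpha_a + (1 + a)\beta_a \ge 1/80$ and $a \ge 14$, then $E S_a^4/a^4 \le 391\delta_a$.

Proof. (1)

\[
\begin{aligned}
E^{S_a} \tilde{S}_a &= E^{S_a}\left(S_a - Y_{J,a} + \tilde{Y}_{J,a}\right) \\
&= S_a - E^{S_a} Y_{J,a} + E^{S_a} \tilde{Y}_{J,a} \\
&= S_a - \frac{1}{n} \sum_{j=1}^{n} E^{S_a} Y_{j,a} + \frac{1}{n} \sum_{j=1}^{n} E^{S_a} \tilde{Y}_{j,a} \\
&= S_a - \frac{1}{n} \sum_{j=1}^{n} Y_{j,a} + \frac{1}{n} \sum_{j=1}^{n} E Y_{j,a} \\
&= \left(1 - \frac{1}{n}\right) S_a + \frac{1}{n} E S_a.
\end{aligned} \tag{2.5}
\]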

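(2) Let $h : \mathbb{R}^2 \to \mathbb{R}$ be defined by

\[
h(\tilde{t}, t) = \tilde{t}^2 - t^2. \tag{2.6}
\]

Since $h$ is antisymmetric in the sense that $h(\tilde{t}, t) = -h(t, \tilde{t})$ and $(S_a, \tilde{S}_a)$ is an exchangeable pair, by Stein [15, equation (9), page 10],

\[
E\left(\tilde{S}_a^2 - S_a^2\right) = E h(\tilde{S}_a, S_a) = 0. \tag{2.7}
\]

From this fact and (1), we have

\[
\begin{aligned}
0 &= E(\tilde{S}_a - S_a)(\tilde{S}_a + S_a) \\
&= 2E(\tilde{S}_a - S_a) S_a + E(\tilde{S}_a - S_a)^2 \\
&= 2E\left\{ S_a E^{S_a}(\tilde{S}_a - S_a) \right\} + E(\tilde{S}_a - S_a)^2 \\
&= -\frac{2}{n} E S_a^2 + E(\tilde{S}_a - S_a)^2 + \frac{2}{n} (E S_a)^2,
\end{aligned} \tag{2.8}
\]

which implies that

\[
\begin{aligned}
E S_a^2 &= \frac{n}{2} E(\tilde{Y}_{J,a} - Y_{J,a})^2 + (E S_a)^2 \\
&= \sum_{j=1}^{n} \left\{ E X_j^2 I(|X_j| < 1 + a) - \left(E X_j I(|X_j| < 1 + a)\right)^2 \right\} + (E S_a)^2 \\
&\le \sum_{j=1}^{n} E X_j^2 I(|X_j| < 1 + a) + (E S_a)^2 \\
&\le 1 + \left(\frac{\alpha_a}{1 + a}\right)^2,
\end{aligned} \tag{2.9}
\]

where we have used the fact that $\sum_{j=1}^{n} E X_j^2 = 1$ and

\[
|E S_a| = \left| \sum_{j=1}^{n} E X_j I(|X_j| < 1 + a) \right| = \left| \sum_{j=1}^{n} E X_j I(|X_j| \ge 1 + a) \right| \le \frac{\alpha_a}{1 + a} \tag{2.10}
\]

in the last inequality.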
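The identity in the first two lines of (2.9) and the bounds (2.9) and (2.10) can be checked by brute-force enumeration on a small example. The sketch below assumes identically distributed discrete summands for brevity; the function name and the example distribution are illustrative choices, not taken from the paper.

```python
from itertools import product
from math import sqrt

def second_moment_check(dist, n, a):
    """Compare E S_a^2 obtained by enumeration with the identity in (2.9).

    `dist` is a list of (value, probability) pairs shared by all n summands
    (an assumption of this sketch). Returns (E S_a^2 by enumeration,
    sum of variances + (E S_a)^2, E S_a, alpha_a).
    """
    cut = 1.0 + a
    # Truncated variable Y = X * I(|X| < 1 + a).
    y = [(v if abs(v) < cut else 0.0, p) for v, p in dist]
    mean_y = sum(p * v for v, p in y)
    var_y = sum(p * v * v for v, p in y) - mean_y**2
    # Enumerate all joint outcomes of (Y_1, ..., Y_n).
    es2 = 0.0
    for outcome in product(y, repeat=n):
        prob, s = 1.0, 0.0
        for v, p in outcome:
            prob *= p
            s += v
        es2 += prob * s * s
    identity = n * var_y + (n * mean_y) ** 2
    alpha = n * sum(p * v * v for v, p in dist if abs(v) >= cut)
    return es2, identity, n * mean_y, alpha

if __name__ == "__main__":
    n, a = 3, 0.1
    # Mean-zero, skewed summands scaled so that the variances add up to 1.
    dist = [(2.0 / sqrt(n), 0.2), (-0.5 / sqrt(n), 0.8)]
    print(second_moment_check(dist, n, a))
```

In this example the large atom is truncated, so $E S_a \ne 0$ and the term $(E S_a)^2$ in (2.9) is genuinely needed.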

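(3) By the same argument as in (2), with $h(\tilde{t}, t) = (\tilde{t} - t)(\tilde{t}^2 + t^2)$,

\[
\begin{aligned}
E S_a^3 &= \frac{n}{2} E(\tilde{S}_a - S_a)(\tilde{S}_a^2 - S_a^2) + E S_a E S_a^2 \\
&= \frac{n}{2} E(\tilde{S}_a - S_a)^2 (\tilde{S}_a + S_a) + E S_a E S_a^2 \\
&= \frac{n}{2} E(\tilde{Y}_{J,a} - Y_{J,a})^2 \left\{ (\tilde{Y}_{J,a} - Y_{J,a}) + 2S_a \right\} + E S_a E S_a^2 \\
&= \frac{n}{2} E(\tilde{Y}_{J,a} - Y_{J,a})^3 + n E(\tilde{Y}_{J,a} - Y_{J,a})^2 S_a + E S_a E S_a^2.
\end{aligned} \tag{2.11}
\]

Hence,

\[
\begin{aligned}
|E S_a^3| &\le \frac{n}{2} E|\tilde{Y}_{J,a} - Y_{J,a}|^3 + n \left| E(\tilde{Y}_{J,a} - Y_{J,a})^2 S_a \right| + |E S_a| E S_a^2 \\
&\le 4\beta_a + n \left| E(\tilde{Y}_{J,a} - Y_{J,a})^2 S_a \right| + \frac{\alpha_a}{1 + a} + \left(\frac{\alpha_a}{1 + a}\right)^3,
\end{aligned} \tag{2.12}
\]

where we have used (2.4), (2.10), and (2) in the last inequality. Note that

\[
\begin{aligned}
\left| E(\tilde{Y}_{J,a} - Y_{J,a})^2 S_a \right|
&= \left| \frac{1}{n} \sum_{j=1}^{n} E(\tilde{Y}_{j,a} - Y_{j,a})^2 \sum_{l=1}^{n} Y_{l,a} \right| \\
&\le \frac{1}{n} \sum_{j=1}^{n} E(\tilde{Y}_{j,a} - Y_{j,a})^2 |Y_{j,a}| + \left| \frac{1}{n} \sum_{j=1}^{n} E(\tilde{Y}_{j,a} - Y_{j,a})^2 \sum_{l=1,\, l \ne j}^{n} E Y_{l,a} \right| \\
&\le \frac{1}{n} \sum_{j=1}^{n} E(\tilde{Y}_{j,a} - Y_{j,a})^2 |Y_{j,a}| + \frac{1}{n} \sum_{j=1}^{n} E(\tilde{Y}_{j,a} - Y_{j,a})^2 |E S_a| \\
&\qquad + \frac{1}{n} \sum_{j=1}^{n} E(\tilde{Y}_{j,a} - Y_{j,a})^2 |E Y_{j,a}| \\
&\le \frac{8}{n} \sum_{j=1}^{n} E|Y_{j,a}|^3 + \frac{2}{n} \frac{\alpha_a}{1 + a} \\
&\le \frac{8\beta_a}{n} + \frac{2}{n} \frac{\alpha_a}{1 + a}.
\end{aligned} \tag{2.13}
\]

Hence, by (2.12) and (2.13), $|E S_a^3| \le 12\beta_a + 3(\alpha_a/(1 + a)) + (\alpha_a/(1 + a))^3$.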

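(4) Using the same argument as in (2), with $h(\tilde{t}, t) = (\tilde{t} - t)(\tilde{t}^3 + t^3)$, we have

\[
\begin{aligned}
E S_a^4 &= \frac{n}{2} E(\tilde{S}_a - S_a)(\tilde{S}_a^3 - S_a^3) + E S_a E S_a^3 \\
&= \frac{n}{2} E(\tilde{S}_a - S_a)^2 \left\{ (\tilde{S}_a - S_a)^2 + 3\tilde{S}_a S_a \right\} + E S_a E S_a^3 \\
&= \frac{n}{2} E(\tilde{Y}_{J,a} - Y_{J,a})^4 + \frac{3n}{2} E(\tilde{Y}_{J,a} - Y_{J,a})^2 \left\{ S_a^2 + (\tilde{Y}_{J,a} - Y_{J,a}) S_a \right\} + E S_a E S_a^3 \\
&\le n(1 + a) E|\tilde{Y}_{J,a} - Y_{J,a}|^3 + \frac{3n}{2} E(\tilde{Y}_{J,a} - Y_{J,a})^2 S_a^2 \\
&\qquad + 3n(1 + a) \left| E(\tilde{Y}_{J,a} - Y_{J,a})^2 S_a \right| + |E S_a| |E S_a^3| \\
&\le 32(1 + a)\beta_a + 6\alpha_a + 12\beta_a \frac{\alpha_a}{1 + a} + 3\left(\frac{\alpha_a}{1 + a}\right)^2 + \left(\frac{\alpha_a}{1 + a}\right)^4 + \frac{3n}{2} E(\tilde{Y}_{J,a} - Y_{J,a})^2 S_a^2,
\end{aligned} \tag{2.14}
\]

where we have used (2.4), (2.10), (2.13), and (3) in the last inequality. From (2.14) and the fact that

\[
\begin{aligned}
E(\tilde{Y}_{J,a} - Y_{J,a})^2 S_a^2
&= \frac{1}{n} \sum_{j=1}^{n} E(\tilde{Y}_{j,a} - Y_{j,a})^2 E\Bigl( \sum_{l=1,\, l \ne j}^{n} Y_{l,a} \Bigr)^2
 + \frac{2}{n} \sum_{j=1}^{n} E(\tilde{Y}_{j,a} - Y_{j,a})^2 Y_{j,a} E\Bigl( \sum_{l=1,\, l \ne j}^{n} Y_{l,a} \Bigr) \\
&\qquad + \frac{1}{n} \sum_{j=1}^{n} E(\tilde{Y}_{j,a} - Y_{j,a})^2 Y_{j,a}^2 \\
&\le \frac{2}{n} \sum_{j=1}^{n} E Y_{j,a}^2 E\Bigl( \sum_{l=1,\, l \ne j}^{n} Y_{l,a} \Bigr)^2
 + \frac{8}{n} \sum_{j=1}^{n} E|Y_{j,a}|^3 \Bigl| E\Bigl( \sum_{l=1,\, l \ne j}^{n} Y_{l,a} \Bigr) \Bigr|
 + \frac{4(1 + a)}{n} \sum_{j=1}^{n} E|Y_{j,a}|^3 \\
&\le \frac{2}{n} \sum_{j=1}^{n} E Y_{j,a}^2 E S_a^2 - \frac{4}{n} \sum_{j=1}^{n} E Y_{j,a}^2 E S_a Y_{j,a} + \frac{2}{n} \sum_{j=1}^{n} E Y_{j,a}^2 E Y_{j,a}^2 \\
&\qquad + \frac{8}{n} \sum_{j=1}^{n} E|Y_{j,a}|^3 |E S_a| + \frac{8}{n} \sum_{j=1}^{n} E|Y_{j,a}|^3 |E Y_{j,a}| + \frac{4(1 + a)}{n} \beta_a \\
&\le \frac{2}{n} + \frac{2}{n} \left(\frac{\alpha_a}{1 + a}\right)^2 + \frac{4}{n} \sum_{j=1}^{n} E|Y_{j,a}|^3 \sqrt{E S_a^2} + \frac{8\beta_a}{n} \frac{\alpha_a}{1 + a} + \frac{14(1 + a)\beta_a}{n} \\
&\le \frac{2}{n} + \frac{4\beta_a}{n} + \frac{12\beta_a}{n} \frac{\alpha_a}{1 + a} + \frac{2}{n} \left(\frac{\alpha_a}{1 + a}\right)^2 + \frac{14(1 + a)\beta_a}{n},
\end{aligned} \tag{2.15}
\]

we have

\[
E S_a^4 \le 53(1 + a)\beta_a + 30\beta_a \frac{\alpha_a}{1 + a} + 6\left(\frac{\alpha_a}{1 + a}\right)^2 + \left(\frac{\alpha_a}{1 + a}\right)^4 + 6\beta_a + 6\alpha_a + 3. \tag{2.16}
\]

(5) Follows directly from (2) and (4).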
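The arithmetic behind (5) can be spelled out as follows. The sketch assumes, as a deliberately crude simplification, that $\alpha_a$ and $\beta_a$ each take the largest value allowed by $(1 + a)^2\alpha_a + (1 + a)\beta_a < 1/80$ separately; since the right-hand sides of (2) and (4) are increasing in both, this only overestimates them. The function names are illustrative.

```python
def second_moment_bound(a, alpha):
    """Right-hand side of Proposition 2.1(2)."""
    return 1.0 + (alpha / (1.0 + a)) ** 2

def fourth_moment_bound(a, alpha, beta):
    """Right-hand side of Proposition 2.1(4), i.e. of (2.16)."""
    r = alpha / (1.0 + a)
    return (53.0 * (1.0 + a) * beta + 30.0 * beta * r + 6.0 * r**2 + r**4
            + 6.0 * beta + 6.0 * alpha + 3.0)

def worst_case_bounds(a):
    """Evaluate both bounds at alpha = 1/(80 (1+a)^2) and beta = 1/(80 (1+a)).

    Each value is the supremum of what the hypothesis of (5) allows for that
    quantity alone, so the result dominates every admissible case.
    """
    alpha = 1.0 / (80.0 * (1.0 + a) ** 2)
    beta = 1.0 / (80.0 * (1.0 + a))
    return second_moment_bound(a, alpha), fourth_moment_bound(a, alpha, beta)

if __name__ == "__main__":
    m2, m4 = worst_case_bounds(3.0)
    print(m2 - 1.0, m4)   # excess over 1 is about 3.8e-8; fourth moment bound is below 3.69
```

The term $53(1 + a)\beta_a$ is at most $53/80$ for every $a$, and all other terms decrease in $a$, so $a = 3$ is the extreme case.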

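(6)

\[
\begin{aligned}
\frac{E S_a^4}{a^4} &\le \frac{53}{a^3} \left(\frac{1 + a}{a}\right) \beta_a + \frac{30\beta_a}{a^4} \frac{\alpha_a}{1 + a} + \frac{6}{a^4} \left(\frac{\alpha_a}{1 + a}\right)^2 + \frac{1}{a^4} \left(\frac{\alpha_a}{1 + a}\right)^4 + \frac{6\beta_a}{a^4} + \frac{6\alpha_a}{a^4} + \frac{3}{a^4} \\
&\le \frac{70.697\beta_a}{(1 + a)^3} + \frac{0.035\alpha_a}{(1 + a)^2} + \frac{3.997}{(1 + a)^4} \\
&\le \frac{70.697\beta_a}{(1 + a)^3} + \frac{0.035\alpha_a}{(1 + a)^2} + 319.76\delta_a \\
&\le 391\delta_a,
\end{aligned} \tag{2.17}
\]

where we have used the fact that $a \ge 14$, $\alpha_a \le 1$, and $(1 + a)/a \le 1.072$ in the second inequality and the fact that $(1 + a)^2\alpha_a + (1 + a)\beta_a \ge 1/80$, which gives $(1 + a)^{-4} \le 80\delta_a$, in the third inequality.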
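The constants in (2.17) can be sanity-checked at the extreme case $a = 14$. The grouping of terms below (which terms are charged to $\beta_a/(1+a)^3$, to $\alpha_a/(1+a)^2$, or to $(1+a)^{-4}$) is an assumption of this sketch and need not coincide with the grouping used in the paper; it only confirms that the final constant 391 is attainable with $\alpha_a \le 1$ and $(1 + a)^{-4} \le 80\delta_a$.

```python
def coefficients_2_17(a):
    """Coefficients after dividing (2.16) by a^4 and using alpha_a <= 1.

    Returns (c_beta, c_alpha, c_const) such that
    E S_a^4 / a^4 <= c_beta * beta_a/(1+a)^3 + c_alpha * alpha_a/(1+a)^2
                     + c_const / (1+a)^4.
    The grouping is one possible choice made for this sketch.
    """
    r = (1.0 + a) / a
    c_beta = 53.0 * r**4 + 30.0 * r**2 / a**2 + 6.0 * r**3 / a
    c_alpha = 6.0 * r**2 / a**2
    c_const = 3.0 * r**4 + 6.0 * r**4 / (1.0 + a) ** 2 + r**4 / (1.0 + a) ** 4
    return c_beta, c_alpha, c_const

def final_constant(a):
    """Constant multiplying delta_a once (1+a)^(-4) <= 80 * delta_a is applied."""
    c_beta, c_alpha, c_const = coefficients_2_17(a)
    return max(c_beta, c_alpha) + 80.0 * c_const

if __name__ == "__main__":
    print(coefficients_2_17(14.0), final_constant(14.0))
```

All three coefficients decrease in $a$, so checking $a = 14$ covers the whole range $a \ge 14$.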

Next, we will prove the concentration inequality.

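Proposition 2.2 (concentration inequality). Let $i \in \{1, 2, \ldots, n\}$ and $W^{(i)} = W - X_i$. Then for $3 \le a < b < \infty$ and $(1 + a)^2\alpha_a + (1 + a)\beta_a < 1/80$,

\[
P(a \le W^{(i)} \le b) \le \frac{40.98}{(1 + a)^3}(b - a) + 46.38\delta_a. \tag{2.18}
\]

Proof. Let $S_{i,a} = S_a - Y_{i,a}$. We observe that $W^{(i)} = S_{i,a}$ when $\max_{1 \le j \le n,\, j \ne i} |X_j| < 1 + a$. So

\[
\begin{aligned}
P(a \le W^{(i)} \le b) &\le P(a \le S_{i,a} \le b) + P\Bigl( \max_{1 \le j \le n,\, j \ne i} |X_j| \ge 1 + a \Bigr) \\
&\le P(a \le S_{i,a} \le b) + \frac{\alpha_a}{(1 + a)^2}.
\end{aligned} \tag{2.19}
\]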

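Let $\gamma = \beta_a/2$ and let $f : \mathbb{R} \to \mathbb{R}$ be defined by

\[
f(t) =
\begin{cases}
0 & \text{for } t < a - \gamma, \\
(1 + t + \gamma)^3 (t - a + \gamma) & \text{for } a - \gamma \le t \le b + \gamma, \\
(1 + t + \gamma)^3 (b - a + 2\gamma) & \text{for } t > b + \gamma.
\end{cases} \tag{2.20}
\]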
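A minimal sketch of the piecewise function in (2.20), with a finite-difference check that its slope on $(a - \gamma, b + \gamma)$ is at least $(1 + a)^3$. The helper names, the grid, and the step size are choices made here for illustration; the parameter values in the usage example are arbitrary admissible ones.

```python
def make_f(a, b, gamma):
    """Return the piecewise function f of (2.20) for given a, b, gamma."""
    def f(t):
        if t < a - gamma:
            return 0.0
        if t <= b + gamma:
            return (1.0 + t + gamma) ** 3 * (t - a + gamma)
        return (1.0 + t + gamma) ** 3 * (b - a + 2.0 * gamma)
    return f

def min_interior_slope(a, b, gamma, points=2000, h=1e-5):
    """Smallest central-difference slope of f on a grid inside (a - gamma, b + gamma).

    The grid stays 0.01 away from both endpoints so that t - h and t + h
    remain in the middle branch of f.
    """
    f = make_f(a, b, gamma)
    lo, hi = a - gamma + 0.01, b + gamma - 0.01
    grid = [lo + (hi - lo) * k / points for k in range(points + 1)]
    return min((f(t + h) - f(t - h)) / (2.0 * h) for t in grid)

if __name__ == "__main__":
    a, b, gamma = 3.0, 4.0, 0.001
    print(min_interior_slope(a, b, gamma), (1.0 + a) ** 3)
```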

So $f$ is a nondecreasing function satisfying $f'(t) \ge (1 + a)^3$ for $a - \gamma < t < b + \gamma$, and $f'(t) \ge 0$ otherwise. Let $M(w, t) = w[I(-w \le t < 0) - I(0 \le t < -w)]$. Hence,

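\[
\begin{aligned}
E S_{i,a} f(S_{i,a}) &= \sum_{j=1,\, j \ne i}^{n} E Y_{j,a} \left\{ f(S_{i,a}) - f(S_{i,a} - Y_{j,a}) \right\} \\
&= \sum_{j=1,\, j \ne i}^{n} E Y_{j,a} \int_{-Y_{j,a}}^{0} f'(S_{i,a} + t)\, dt \\
&= \sum_{j=1,\, j \ne i}^{n} E Y_{j,a} \int_{\mathbb{R}} f'(S_{i,a} + t) \left[ I(-Y_{j,a} \le t < 0) - I(0 < t \le -Y_{j,a}) \right] dt \\
&= \sum_{j=1,\, j \ne i}^{n} E \int_{\mathbb{R}} f'(S_{i,a} + t) M(Y_{j,a}, t)\, dt \\
&\ge (1 + a)^3 \sum_{j=1,\, j \ne i}^{n} E \left\{ I(a \le S_{i,a} \le b) \int_{|t| \le \gamma} M(Y_{j,a}, t)\, dt \right\} \\
&= (1 + a)^3 E \Bigl\{ I(a \le S_{i,a} \le b) \sum_{j=1,\, j \ne i}^{n} |Y_{j,a}| \min(\gamma, |Y_{j,a}|) \Bigr\} \\
&\ge 0.46 (1 + a)^3 \left\{ P(a \le S_{i,a} \le b) - P(U_i \le 0.46) \right\},
\end{aligned} \tag{2.21}
\]

where $U_i = \sum_{j=1,\, j \ne i}^{n} |Y_{j,a}| \min(\gamma, |Y_{j,a}|)$ and we have used the fact that

\[
I(t_1 \le w \le t_2)\, y \ge c \left\{ I(t_1 \le w \le t_2) - \left(1 - \frac{y}{c}\right) I(y \le c) \right\} \tag{2.22}
\]

for $t_1, t_2, y \ge 0$, $c > 0$ in the last inequality. Hence,

\[
P(a \le S_{i,a} \le b) \le \frac{1}{0.46 (1 + a)^3} E S_{i,a} f(S_{i,a}) + P(U_i \le 0.46). \tag{2.23}
\]

Next, we will bound the two terms on the right-hand side of (2.23). By the same argument as in Proposition 2.1, we can show that $E S_{i,a}^4 \le 3.69$ and $E S_{i,a}^2 \le 1 + (3.8 \times 10^{-8})$. So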

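\[
\begin{aligned}
E S_{i,a} f(S_{i,a}) &\le (b - a + 2\gamma) E |S_{i,a}| \left| S_{i,a} + (1 + \gamma) \right|^3 \\
&\le 4(b - a + \beta_a) \left\{ E S_{i,a}^4 + |1 + \gamma|^3 E|S_{i,a}| \right\} \\
&\le 4(b - a + \beta_a) \left\{ E S_{i,a}^4 + \left(1 + \frac{\beta_a}{2}\right)^3 \sqrt{E S_{i,a}^2} \right\} \\
&\le 18.85 (b - a + \beta_a).
\end{aligned} \tag{2.24}
\]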

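By the facts that $\min(a, b) \ge b - b^2/(4a)$ for $a, b > 0$,

\[
E X_i^2 I(|X_i| < 1 + a) \le \beta_a^{2/3} < 0.021, \qquad \alpha_a \le \frac{1}{80(1 + a)^2} \le 7.8 \times 10^{-4} \quad \text{for } a \ge 3, \tag{2.25}
\]

we have

\[
\begin{aligned}
E U_i &= \sum_{j=1,\, j \ne i}^{n} E |Y_{j,a}| \min(\gamma, |Y_{j,a}|) \\
&\ge \sum_{j=1,\, j \ne i}^{n} \left\{ E Y_{j,a}^2 - \frac{E|Y_{j,a}|^3}{4\gamma} \right\} \\
&\ge \sum_{j=1,\, j \ne i}^{n} \left\{ E X_j^2 I(|X_j| < 1 + a) - E X_j^2 I(|X_j| \ge 1 + a) \right\} - \frac{\beta_a}{4\gamma} \\
&= \sum_{j=1,\, j \ne i}^{n} \left\{ E X_j^2 I(|X_j| < 1 + a) - E X_j^2 I(|X_j| \ge 1 + a) \right\} - 0.5 \\
&\ge 1 - E X_i^2 I(|X_i| < 1 + a) - 2 \sum_{j=1}^{n} E X_j^2 I(|X_j| \ge 1 + a) - 0.5 \\
&\ge 1 - \beta_a^{2/3} - 2\alpha_a - 0.5 \\
&\ge 0.477.
\end{aligned} \tag{2.26}
\]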

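Using the same argument as in Proposition 2.1(5), we can show that

\[
E|U_i - E U_i|^4 \le 3.69\gamma^4 = 0.231\beta_a^4 \le 4.512 \times 10^{-7} \frac{\beta_a}{(1 + a)^3}. \tag{2.27}
\]

Hence,

\[
\begin{aligned}
P(U_i \le 0.46) &\le P(E U_i - U_i \ge 0.477 - 0.46) \\
&= P(E U_i - U_i \ge 0.017) \\
&\le \frac{E|U_i - E U_i|^4}{(0.017)^4} \\
&\le \frac{5.402\beta_a}{(1 + a)^3}.
\end{aligned} \tag{2.28}
\]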
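The numerical constants in (2.27) and (2.28) follow from three short computations, reproduced in the sketch below. It uses $\gamma = \beta_a/2$ and the bound $\beta_a < 1/(80(1 + a))$, which follows from the hypothesis $(1 + a)^2\alpha_a + (1 + a)\beta_a < 1/80$. The variable names are illustrative, and the code works with unrounded intermediate values, so its results are slightly smaller than the rounded constants in the text.

```python
# gamma = beta_a / 2, hence 3.69 * gamma^4 = (3.69 / 16) * beta_a^4.
coef_gamma = 3.69 / 2**4

# beta_a^3 <= (80 (1 + a))^(-3), hence
# coef_gamma * beta_a^4 <= (coef_gamma / 80^3) * beta_a / (1 + a)^3.
coef_beta = coef_gamma / 80**3

# Fourth-moment Markov inequality with threshold 0.477 - 0.46 = 0.017.
coef_tail = coef_beta / 0.017**4

if __name__ == "__main__":
    print(coef_gamma, coef_beta, coef_tail)
```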

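From (2.19), (2.23), (2.24), and (2.28),

\[
\begin{aligned}
P(a \le W^{(i)} \le b) &\le \frac{40.978}{(1 + a)^3} (b - a + \beta_a) + \frac{5.402\beta_a}{(1 + a)^3} + \frac{\alpha_a}{(1 + a)^2} \\
&\le \frac{40.98 (b - a)}{(1 + a)^3} + 46.38\delta_a.
\end{aligned} \tag{2.29}
\]

Proposition 2.3. For $x \ge 2$,

\[
E|f_x'(W)| \le \frac{15}{(1 + x)^2}, \tag{2.30}
\]

where $f_x$ is the unique solution of the Stein equation

\[
f'(w) - w f(w) = I(w \le x) - \Phi(x). \tag{2.31}
\]

Proof. From Stein [15, pages 22 and 24], we know that

\[
\begin{aligned}
&0 < f_x'(w) < 1 - \Phi(x) && \text{for } w \le 0, \\
&0 < f_x'(w) \le (1 - \Phi(x)) \left( 1 + \sqrt{2\pi}\, w e^{w^2/2} \Phi(x) \right) && \text{for } 0 < w \le x, \\
&|f_x'(w)| \le 1 && \forall w \in \mathbb{R}.
\end{aligned} \tag{2.32}
\]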

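Hence,

\[
\begin{aligned}
E|f_x'(W)| &= E|f_x'(W)| I(W \le 0) + E|f_x'(W)| I\left(0 < W \le \frac{4x}{5}\right) + E|f_x'(W)| I\left(W > \frac{4x}{5}\right) \\
&\le (1 - \Phi(x)) P(W \le 0) + (1 - \Phi(x)) E\left(1 + \sqrt{2\pi}\, W e^{W^2/2}\right) I\left(0 < W \le \frac{4x}{5}\right) + \frac{E(1 + W^2)}{1 + (4x/5)^2} \\
&\le (1 - \Phi(x)) + (1 - \Phi(x)) \left\{ 1 + \frac{4\sqrt{2\pi}}{5} x e^{8x^2/25} \right\} + \frac{2}{1 + (4x/5)^2}.
\end{aligned} \tag{2.33}
\]

Since

\[
1 - \Phi(x) \le \frac{e^{-x^2/2}}{\sqrt{2\pi}\, x} \quad \text{for } x > 0 \tag{2.34}
\]

(see Stein [15, equation (25), page 23]) and $e^{x^2/2} > x$ for $x \ge 2$, we have

\[
(1 - \Phi(x))(1 + x)^2 \le \frac{1}{\sqrt{2\pi}\, x^2} (1 + x)^2 = \frac{1}{\sqrt{2\pi}} \left(\frac{1}{x} + 1\right)^2 \le 0.9, \tag{2.35}
\]

which implies that

\[
1 - \Phi(x) \le \frac{0.9}{(1 + x)^2}. \tag{2.36}
\]

From (2.34) and the fact that $e^{9x^2/50} > 9x^2/50$, we derive

\[
\begin{aligned}
\sqrt{2\pi} (1 - \Phi(x)) (1 + x)^2 x e^{8x^2/25} &\le e^{-9x^2/50} (1 + x)^2 \\
&\le \frac{50}{9} \left(\frac{1}{x} + 1\right)^2 \\
&\le 12.5,
\end{aligned} \tag{2.37}
\]

that is,

\[
\frac{4\sqrt{2\pi}}{5} (1 - \Phi(x))\, x e^{8x^2/25} \le \frac{10}{(1 + x)^2}. \tag{2.38}
\]

From (2.33), (2.36), (2.38), and the fact that $(1 + x)/(1 + 4x/5) \le 5/4$, we have proved the proposition.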
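As a numerical sanity check on Proposition 2.3, the sketch below evaluates the last line of (2.33), multiplied by $(1 + x)^2$, on a grid and compares it with 15. The grid $[2, 20]$ and step $0.01$ are choices made here (larger $x$ would need care with floating-point underflow of the normal tail), so this illustrates the bound rather than proving it.

```python
from math import erfc, exp, pi, sqrt

def upper_tail(x):
    """1 - Phi(x), computed via erfc to stay accurate in the tail."""
    return 0.5 * erfc(x / sqrt(2.0))

def rhs_2_33(x):
    """Last line of (2.33)."""
    tail = upper_tail(x)
    middle = tail * (1.0 + 4.0 * sqrt(2.0 * pi) / 5.0 * x * exp(8.0 * x * x / 25.0))
    return tail + middle + 2.0 / (1.0 + (4.0 * x / 5.0) ** 2)

def worst_scaled_value(lo=2.0, hi=20.0, step=0.01):
    """Maximum of rhs_2_33(x) * (1 + x)^2 over a grid on [lo, hi]."""
    count = int(round((hi - lo) / step))
    return max(rhs_2_33(lo + k * step) * (1.0 + lo + k * step) ** 2
               for k in range(count + 1))

if __name__ == "__main__":
    print(worst_scaled_value())
```

On this grid the scaled value stays well below 15, which suggests the constant in (2.30) leaves some slack.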

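Proposition 2.4. Let $x \ge 14$ and let $g : \mathbb{R} \to \mathbb{R}$ be defined by $g(w) = (w f_x(w))'$. If $(1 + x)^2\alpha_x + (1 + x)\beta_x < 1/80$, then for $|u| \le 1 + x/4$,

\[
E g(W^{(i)} + u) \le \frac{4.60}{(1 + x/4)^3} + 5.13\,\delta_{x/4}(1 + x). \tag{2.39}
\]

Proof. From Chen and Shao [3, pages 248–249], we know that

\[
g(x - 1) = \left\{ \sqrt{2\pi} \left(1 + (x - 1)^2\right) e^{(x - 1)^2/2} \Phi(x - 1) + (x - 1) \right\} (1 - \Phi(x)), \tag{2.40}
\]

$g$ is increasing for $0 \le w < x$, and

\[
E g(W^{(i)} + u) \le \frac{2}{1 + x^3} + 2(1 - \Phi(x)) + g(x - 1) + E g(W^{(i)} + u) I(x - 1 < W^{(i)} + u < x). \tag{2.41}
\]

For $x \ge 14$, elementary calculation yields

\[
\frac{(1 + x)^3}{1 + x^3} \le 1.23 \tag{2.42}
\]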