
Asymptotic Expansion of the Distribution of LR Statistics for Testing the Intraclass Correlation under the Growth-Curve Model in High-Dimension and its Error Bound

KATO Naohiro (加藤直広), Mathematics Course

1 Introduction

Let $x$ be a $p$-dimensional random vector having a $p$-variate normal distribution with unknown mean vector $\mu$ and unknown covariance matrix $\Sigma$, denoted $x \sim N_p(\mu, \Sigma)$. Suppose that $x_1, \ldots, x_N$ is a sample of $N$ $(= n+1,\ n \ge p)$ independent observation vectors on $x$. Consider the problem of testing the null hypothesis

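$$H: \Sigma = \Sigma_I = \sigma^2\{(1-\rho)I + \rho\,\mathbf{1}\mathbf{1}'\} \tag{1}$$

against all alternatives, where $\mathbf{1} = (1, 1, \ldots, 1)'$, and $\sigma^2$ and $\rho$ are parameters satisfying $\sigma^2 > 0$ and $-1/(p-1) < \rho < 1$. The covariance structure (1) is known as the intraclass correlation model. The likelihood ratio criterion $\lambda$ for testing the hypothesis (1) is given by

$$\lambda = \left\{\frac{(p-1)^{p-1}\,|S|}{u\,v^{p-1}}\right\}^{N/2},$$

where

$$u = \frac{1}{p}\,\mathbf{1}'S\mathbf{1}, \qquad v = \operatorname{tr}(S) - u.$$

Here, $S$ denotes the sample covariance matrix based on $x_1, \ldots, x_N$.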
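As a minimal sketch of how the criterion can be evaluated numerically, the following Python code computes $\log\lambda^{*}$ with $\lambda^{*} = \lambda^{2/N}$ from a sample covariance matrix. The function names `intraclass_cov` and `log_lambda_star` are hypothetical helpers introduced only for illustration. Working on the log scale through `numpy.linalg.slogdet` is a choice made here to avoid overflow in $|S|$ when $p$ is large.

```python
import numpy as np

def intraclass_cov(p, sigma2, rho):
    """Sigma_I = sigma^2 {(1 - rho) I + rho 1 1'} as in equation (1)."""
    return sigma2 * ((1.0 - rho) * np.eye(p) + rho * np.ones((p, p)))

def log_lambda_star(S):
    """log of lambda* = (p-1)^(p-1) |S| / (u v^(p-1)) for a p x p matrix S."""
    p = S.shape[0]
    u = S.sum() / p                       # u = (1/p) 1'S1
    v = np.trace(S) - u                   # v = tr(S) - u
    sign, logdet = np.linalg.slogdet(S)   # log|S| computed stably
    if sign <= 0:
        raise ValueError("S must be positive definite")
    return (p - 1) * np.log(p - 1) + logdet - np.log(u) - (p - 1) * np.log(v)

# Illustration with sigma^2 = 1 and rho = 1/2.
rng = np.random.default_rng(0)
N, p = 100, 10
X = rng.multivariate_normal(np.zeros(p), intraclass_cov(p, 1.0, 0.5), size=N)
S = np.cov(X, rowvar=False)   # divisor n = N - 1; lambda* does not depend on the scale of S
print(log_lambda_star(S))
```

When $S$ has exactly the intraclass structure, $u\,v^{p-1} = (p-1)^{p-1}|S|$ and the function returns zero, which gives a quick sanity check.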

For large $N$ and fixed $p$, a Box-type asymptotic expansion of the null distribution of $\lambda^{*} = \lambda^{2/N}$ is available (see, e.g., Siotani et al., 1985). In this paper, we derive the asymptotic null distribution of $\lambda^{*}$ under the following high-dimensional framework:

$$\mathrm{A}: \quad p \to \infty, \quad N \to \infty, \quad \frac{p}{N} \to c \in (0, 1).$$

Numerical simulations are carried out to demonstrate the accuracy of the approximation. Furthermore, we obtain a computable bound on the error between the exact distribution and the asymptotic distribution.


2 Main Result

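In this section, we derive the asymptotic null distribution of $\lambda^{*}$ under the framework A. The $h$-th moment of $\lambda^{*}$ is given by

$$E(\lambda^{*h}) = K\,(p-1)^{(p-1)h}\,\frac{\prod_{j=1}^{p-1}\Gamma\left[\frac{n}{2}+h-\frac{j}{2}\right]}{\Gamma\left[(p-1)\left(\frac{n}{2}+h\right)\right]}, \qquad K = \frac{\Gamma\left[\frac{n}{2}(p-1)\right]}{\prod_{j=1}^{p-1}\Gamma\left[\frac{n}{2}-\frac{j}{2}\right]}.$$

The cumulant-generating function of $\log\lambda^{*}$ is expanded as

$$\log E[\exp(it\log\lambda^{*})] = it\,\mu_{n,p} + \frac{1}{2}(it)^2\sigma^2_{n,p} + \sum_{r=3}^{\infty}\frac{1}{r!}(it)^r\gamma_{r,n,p}, \tag{2}$$

where

$$\mu_{n,p} = (p-1)\log(p-1) + \psi_{(p-1)}\left(\frac{n}{2}-\frac{1}{2}\right) - \psi\left(\frac{n}{2}(p-1)\right)(p-1),$$

$$\sigma^2_{n,p} = \psi'_{(p-1)}\left(\frac{n}{2}-\frac{1}{2}\right) - \psi'\left(\frac{n}{2}(p-1)\right)(p-1)^2,$$

$$\gamma_{r,n,p} = \psi^{(r-1)}_{(p-1)}\left(\frac{n}{2}-\frac{1}{2}\right) - \psi^{(r-1)}\left(\frac{n}{2}(p-1)\right)(p-1)^r.$$

Here, $\psi$ is the digamma function defined by $\psi(z) = (d/dz)\log\Gamma(z)$, $\psi^{(r)}$ denotes its $r$-th derivative, and $\psi_{(p-1)}(a) = \sum_{j=1}^{p-1}\psi\left(a-\frac{1}{2}(j-1)\right)$, with $\psi^{(r)}_{(p-1)}$ defined in the same way from $\psi^{(r)}$. Let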

$$z_{n,p} = \frac{\log\lambda^{*} - \mu_{n,p}}{(\sigma^2_{n,p})^{1/2}}.$$
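The quantities above can be evaluated directly with standard special functions. The sketch below assumes SciPy is available and uses `scipy.special.polygamma` for $\psi^{(r-1)}$ and `scipy.special.gammaln` for $\log\Gamma$. The function names `log_moment`, `cumulant` and `standardize` are hypothetical and introduced only for illustration.

```python
import numpy as np
from scipy.special import gammaln, polygamma

def log_moment(h, n, p):
    """log E[(lambda*)^h] from the moment formula, for real h > -(n - p + 1)/2."""
    j = np.arange(1, p)                                  # j = 1, ..., p-1
    log_K = gammaln(0.5 * n * (p - 1)) - gammaln(0.5 * (n - j)).sum()
    return (log_K + (p - 1) * h * np.log(p - 1)
            + gammaln(0.5 * (n - j) + h).sum()
            - gammaln((p - 1) * (0.5 * n + h)))

def cumulant(r, n, p):
    """r-th cumulant of log(lambda*): mu (r = 1), sigma^2 (r = 2), gamma_r (r >= 3)."""
    j = np.arange(1, p)
    # psi^{(r-1)}_{(p-1)}(n/2 - 1/2) is the sum of psi^{(r-1)}((n - j)/2) over j
    value = (polygamma(r - 1, 0.5 * (n - j)).sum()
             - (p - 1) ** r * polygamma(r - 1, 0.5 * n * (p - 1)))
    if r == 1:
        value += (p - 1) * np.log(p - 1)
    return float(value)

def standardize(log_lam_star, n, p):
    """z_{n,p} = (log(lambda*) - mu_{n,p}) / sigma_{n,p}."""
    return (log_lam_star - cumulant(1, n, p)) / np.sqrt(cumulant(2, n, p))

n, p = 99, 50
print(cumulant(1, n, p), cumulant(2, n, p), cumulant(3, n, p))
```

Since the cumulants are the derivatives of `log_moment` at $h = 0$, a finite-difference check of `log_moment` against `cumulant` is a convenient way to validate an implementation.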

From (2), the $r$-th cumulant of $z_{n,p}$ ($r \ge 3$) can be expressed as $\gamma_{r,n,p}/(\sigma^2_{n,p})^{r/2}$. The characteristic function of $z_{n,p}$ can therefore be expressed as follows.

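$$E[\exp(itz_{n,p})] = \exp\left(-\frac{t^2}{2}\right)\left\{1 + \sum_{k=1}^{\infty}\frac{(it)^{3k}}{k!}\sum_{j=0}^{\infty}\gamma_{k,j,n,p}\,(it)^j\right\},$$

where

$$\gamma_{k,j,n,p} = \sum_{j_1+\cdots+j_k=j}\frac{\gamma_{j_1+3,n,p}\cdots\gamma_{j_k+3,n,p}}{(j_1+3)!\cdots(j_k+3)!\,\sigma_{n,p}^{\,j+3k}}.$$

Let

$$\phi_s(t) = \exp\left(-\frac{t^2}{2}\right)\left\{1 + \sum_{k=1}^{s}\frac{(it)^{3k}}{k!}\sum_{j=0}^{s-k}\gamma_{k,j,n,p}\,(it)^j\right\}. \tag{3}$$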

Inverting (3), we obtain the Edgeworth expansion of the distribution of $z_{n,p}$ up to the order $O(m^{-s})$, where $m = n - p + 1$, as

$$\Phi_s(x) = \Phi(x) - \phi(x)\sum_{k=1}^{s}\frac{1}{k!}\sum_{j=0}^{s-k}\gamma_{k,j,n,p}\,h_{3k+j-1}(x)$$


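where $\Phi$ and $\phi$ are the distribution function of the standard normal distribution and its derivative (the density), respectively, and $h_r(x)$ denotes the $r$-th order Hermite polynomial defined by

$$\left(\frac{d}{dx}\right)^{r}\exp\left(-\frac{x^2}{2}\right) = (-1)^r h_r(x)\exp\left(-\frac{x^2}{2}\right).$$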
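For $s = 1$ the expansion reduces to a single correction term, $\Phi_1(x) = \Phi(x) - \phi(x)\,\gamma_{1,0,n,p}\,h_2(x)$ with $\gamma_{1,0,n,p} = \gamma_{3,n,p}/(3!\,\sigma^3_{n,p})$ and $h_2(x) = x^2 - 1$. The following sketch evaluates this special case. It assumes SciPy is available, and the names `cumulant` and `edgeworth_cdf_s1` are hypothetical helpers introduced for illustration.

```python
import numpy as np
from scipy.special import polygamma
from scipy.stats import norm

def cumulant(r, n, p):
    """r-th cumulant of log(lambda*): mu (r = 1), sigma^2 (r = 2), gamma_r (r >= 3)."""
    j = np.arange(1, p)
    value = (polygamma(r - 1, 0.5 * (n - j)).sum()
             - (p - 1) ** r * polygamma(r - 1, 0.5 * n * (p - 1)))
    if r == 1:
        value += (p - 1) * np.log(p - 1)
    return float(value)

def edgeworth_cdf_s1(x, n, p):
    """Phi_1(x) = Phi(x) - phi(x) * gamma_{1,0,n,p} * h_2(x), the s = 1 case."""
    x = np.asarray(x, dtype=float)
    sigma = np.sqrt(cumulant(2, n, p))
    gamma_10 = cumulant(3, n, p) / (6.0 * sigma ** 3)   # gamma_3 / (3! sigma^3)
    h2 = x ** 2 - 1.0                                   # Hermite polynomial h_2
    return norm.cdf(x) - norm.pdf(x) * gamma_10 * h2

# Upper-tail probability P(z > 0) under the s = 1 expansion for N = 100, p = 10.
print(1.0 - edgeworth_cdf_s1(0.0, n=99, p=10))
```

For $n = 99$ and $p = 10$ the third cumulant is negative, so the correction moves $P(z_{n,p} > 0)$ above $1/2$, to roughly 0.526. Higher orders need the sums over $j_1 + \cdots + j_k = j$ in $\gamma_{k,j,n,p}$ and are not attempted here.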

3 Numerical Comparison

This section presents the results of numerical simulations that demonstrate how well the normal approximation for $z_{n,p}$ works for several values of $p$ and $n$. In all simulations we took $\sigma^2 = 1$ and $\rho = 1/2$. Table 1 lists the estimated significance levels for $z_{n,p}$ with $N = 100$, calculated from 1,000,000 repetitions at nominal significance levels of 0.01, 0.05, 0.50, 0.95 and 0.99.

Table 1. Actual probabilities of $z_{n,p}$ for $N = 100$

Nominal   0.01     0.05     0.50     0.95     0.99
p = 10    0.0046   0.0408   0.5261   0.9366   0.9794
p = 20    0.0068   0.0445   0.5127   0.9440   0.9855
p = 30    0.0078   0.0463   0.5089   0.9461   0.9870
p = 40    0.0081   0.0471   0.5063   0.9471   0.9879
p = 50    0.0085   0.0479   0.5046   0.9475   0.9883
p = 60    0.0087   0.0477   0.5048   0.9478   0.9884
p = 70    0.0089   0.0483   0.5049   0.9481   0.9887
p = 80    0.0088   0.0480   0.5044   0.9482   0.9887
p = 90    0.0087   0.0480   0.5054   0.9480   0.9886

From Table 1, we find that the normal approximation for $z_{n,p}$ is good for $p \ge 50$.
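A small-scale version of this experiment can be sketched as follows. The code is only an illustration under stated assumptions: it uses 2,000 repetitions instead of 1,000,000, it reads the tabulated values as upper-tail probabilities $P(z_{n,p} > u_\alpha)$ with $u_\alpha$ the upper $\alpha$ point of $N(0,1)$, and the helper names `cumulant`, `log_lambda_star` and `simulate_z` are hypothetical.

```python
import numpy as np
from statistics import NormalDist
from scipy.special import polygamma

def cumulant(r, n, p):
    """mu_{n,p} for r = 1 and sigma^2_{n,p} for r = 2."""
    j = np.arange(1, p)
    value = (polygamma(r - 1, 0.5 * (n - j)).sum()
             - (p - 1) ** r * polygamma(r - 1, 0.5 * n * (p - 1)))
    if r == 1:
        value += (p - 1) * np.log(p - 1)
    return float(value)

def log_lambda_star(S):
    """log of (p-1)^(p-1) |S| / (u v^(p-1))."""
    p = S.shape[0]
    u = S.sum() / p
    v = np.trace(S) - u
    _, logdet = np.linalg.slogdet(S)
    return (p - 1) * np.log(p - 1) + logdet - np.log(u) - (p - 1) * np.log(v)

def simulate_z(N, p, reps, rho=0.5, seed=0):
    """Draw z_{n,p} under H with sigma^2 = 1 and 0 < rho < 1."""
    rng = np.random.default_rng(seed)
    n = N - 1
    mu, sigma = cumulant(1, n, p), np.sqrt(cumulant(2, n, p))
    z = np.empty(reps)
    for i in range(reps):
        # x = sqrt(1-rho) e + sqrt(rho) g 1 has covariance (1-rho) I + rho 11'
        e = rng.standard_normal((N, p))
        g = rng.standard_normal((N, 1))
        X = np.sqrt(1.0 - rho) * e + np.sqrt(rho) * g
        S = np.cov(X, rowvar=False)
        z[i] = (log_lambda_star(S) - mu) / sigma
    return z

z = simulate_z(N=100, p=50, reps=2000)
for alpha in (0.01, 0.05, 0.50, 0.95, 0.99):
    u_alpha = NormalDist().inv_cdf(1.0 - alpha)   # upper alpha point of N(0, 1)
    print(alpha, np.mean(z > u_alpha))
```

With so few repetitions the estimates carry a Monte Carlo standard error of about 0.01 near the centre, so they can only be compared with Table 1 to roughly two decimal places.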

For large $N$ and fixed $p$, a chi-square approximation to the distribution of the LR statistic is available. Table 2 lists the significance levels for $-2\tau\log(\lambda^{*})^{n/2}$ under the same settings as in Table 1, where

$$\tau = 1 - \frac{p(2p^3+p^2-4p-3)}{6n(p-1)(p^2+p-4)}$$

is the Bartlett correction factor.

Table 2. Actual probabilities of $-2\tau\log(\lambda^{*})^{n/2}$ for $N = 100$

Nominal   0.01     0.05     0.50     0.95     0.99
p = 10    0.0100   0.0502   0.5013   0.9502   0.9902
p = 20    0.0107   0.0527   0.5121   0.9533   0.9909
p = 30    0.0127   0.0607   0.5419   0.9606   0.9928
p = 40    0.0189   0.0821   0.6077   0.9739   0.9958
p = 50    0.0385   0.1407   1.000    1.000    1.000
p = 60    1.0000   1.0000   1.0000   1.0000   1.0000
p = 70    1.0000   1.0000   1.0000   1.0000   1.0000
p = 80    1.0000   1.0000   1.0000   1.0000   1.0000
p = 90    1.0000   1.0000   1.0000   1.0000   1.0000


From Table 2, the chi-square approximation for $-2\tau\log(\lambda^{*})^{n/2}$ is accurate only for $p \le 20$.
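For comparison, the Bartlett-corrected statistic can be computed as in the sketch below. The degrees of freedom $p(p+1)/2 - 2$ are not stated in the text. They are assumed here as the usual count: the number of free parameters in $\Sigma$ minus the two parameters $\sigma^2$ and $\rho$ under $H$. The function names `bartlett_tau` and `chi2_test` are hypothetical.

```python
from scipy.stats import chi2

def bartlett_tau(n, p):
    """tau = 1 - p(2p^3 + p^2 - 4p - 3) / {6n(p-1)(p^2 + p - 4)}."""
    return 1.0 - p * (2 * p**3 + p**2 - 4 * p - 3) / (6.0 * n * (p - 1) * (p**2 + p - 4))

def chi2_test(log_lam_star, n, p):
    """Return the statistic -2 tau log (lambda*)^(n/2) and its chi-square p-value."""
    stat = -n * bartlett_tau(n, p) * log_lam_star   # equals -2 tau (n/2) log(lambda*)
    df = p * (p + 1) // 2 - 2                       # assumed degrees of freedom
    return stat, chi2.sf(stat, df)

print(bartlett_tau(99, 10))
print(chi2_test(-0.5, 99, 10))   # -0.5 is an arbitrary illustrative value of log(lambda*)
```

Because $\tau$ is derived for fixed $p$, the correction cannot compensate when $p$ is comparable to $n$, which is consistent with the breakdown visible in Table 2.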

4 Error Bound

In this section, we find an upper bound on the error between the distribution of $z_{n,p}$ and the asymptotic distribution. The following inequality gives an upper bound.

$$\sup_x |P(z_{n,p}\le x) - \Phi_s(x)| \le \int_{-\infty}^{\infty}\frac{1}{|t|}\,\bigl|E[\exp(itz_{n,p})] - \phi_s(t)\bigr|\,dt.$$

We obtain the following error bound.

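$$
\begin{aligned}
&\sup_x |P(z_{n,p}\le x)-\Phi_s(x)| \\
&\quad< \sum_{k=1}^{s}\frac{2^{3k/2}}{k!}\left[\frac{4p^{2}}{\sigma_{n,p}^{3k}\,mn(n-1)}\,B[2v/\sigma_{n,p}]^{k}-\sum_{j=0}^{s-k}\gamma_{k,j,n,p}\right]\left\{\Gamma\left(\frac{3k}{2}\right)-\Gamma\left(\frac{3k}{2},\frac{m^{2}v^{2}}{2}\right)\right\} \\
&\qquad+\frac{B[2v/\sigma_{n,p}]^{s+1}\,2^{(7s+7)/2}\,p^{2s+2}}{m^{s+1}n^{s+1}(n-1)^{s+1}\sigma_{n,p}^{3s+3}}\left(1-\frac{8B[2v/\sigma_{n,p}]\,vp^{2}}{n(n-1)\sigma_{n,p}^{3}}\right)^{-(3s+3)/2} \\
&\qquad\quad\cdot\left\{\Gamma\left(\frac{3}{2}(s+1)\right)-\Gamma\left(\frac{3}{2}(s+1),\ \frac{m^{2}v^{2}}{2}\left(1-\frac{8B[2v/\sigma_{n,p}]\,vp^{2}}{n(n-1)\sigma_{n,p}^{3}}\right)\right)\right\} \\
&\qquad+\frac{\left(\frac{m}{2}+\left[\frac{p+1}{2}\right]-1\right)^{2}}{m^{2}v^{2}\left(\frac{p-1}{2}\left\{\left[\frac{p+1}{2}\right]-1\right\}-1\right)}\cdot\left(1+\frac{m^{2}v^{2}}{\sigma_{n,p}^{2}\left(\frac{m}{2}+\left[\frac{p+1}{2}\right]-1\right)^{2}}\right)^{-\frac{p-1}{2}\left\{\left[\frac{p+1}{2}\right]-1\right\}+1} \\
&\qquad+\frac{2}{m^{2}v^{2}}\,e^{-m^{2}v^{2}/2}+\sum_{k=1}^{s}\frac{1}{k!}\sum_{j=0}^{s-k}2^{(3k+j)/2}\,\gamma_{k,j,n,p}\left\{\Gamma\left(\frac{3k+j}{2}\right)-\Gamma\left(\frac{3k+j}{2},\frac{m^{2}v^{2}}{2}\right)\right\}.
\end{aligned}
$$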

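Here, $m = n - p + 1$, $0 < v < (\sigma^2_{n,p})^{1/2}/2$ is a constant, $[\cdot]$ is the Gauss symbol (integer part), $\Gamma(z, a) = \int_a^{\infty} x^{z-1}e^{-x}\,dx$ is the incomplete gamma function, and

$$B[v] = -\frac{1}{2v} - \frac{1}{v^2} - \frac{1}{v^3}\log(1-v) + \frac{1}{m}\cdot\frac{1}{1-v}.$$

We computed the values of the error bound for $N = 100$ and $s = 0$.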

p             30         50         70          90
Error bound   0.180594   0.101753   0.0816023   0.123004
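Two ingredients recur in the bound: the function $B[\cdot]$ and differences of the form $\Gamma(z) - \Gamma(z, a)$. The sketch below shows one way to evaluate them, assuming SciPy is available. The names `B` and `lower_gamma` are hypothetical, and the code does not attempt to reproduce the full bound or the tabulated values.

```python
import math
from scipy.special import gamma, gammainc

def B(v, m):
    """B[v] = -1/(2v) - 1/v^2 - log(1-v)/v^3 + 1/(m(1-v)), defined for 0 < v < 1."""
    if not 0.0 < v < 1.0:
        raise ValueError("B[v] requires 0 < v < 1")
    return (-1.0 / (2.0 * v) - 1.0 / v**2
            - math.log1p(-v) / v**3 + 1.0 / (m * (1.0 - v)))

def lower_gamma(z, a):
    """Gamma(z) - Gamma(z, a), i.e. the integral of x^(z-1) e^(-x) over [0, a]."""
    return gamma(z) * gammainc(z, a)   # gammainc is the regularized lower incomplete gamma

m = 99 - 50 + 1          # m = n - p + 1 for N = 100, p = 50
print(B(0.2, m))         # 0.2 is an arbitrary illustrative argument
print(lower_gamma(1.5, 0.5 * m**2 * 0.1**2))
```

The first three terms of $B[v]$ equal the power series $\sum_{r\ge 3} v^{r-3}/r = 1/3 + v/4 + v^2/5 + \cdots$, so $B[2v/\sigma_{n,p}]$ is finite only when $2v/\sigma_{n,p} < 1$, in line with the restriction $v < (\sigma^2_{n,p})^{1/2}/2$. For very small arguments the closed form suffers from cancellation, and summing the series directly would be the safer choice.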

Acknowledgement: The author would like to thank Professor Yasunori Fujikoshi of Chuo University for his continuous guidance. The author is also grateful to the senior members of the Sugiyama laboratory for their help with this study.

References

[1] Siotani, M., Hayakawa, T. and Fujikoshi, Y. (1985). Modern Multivariate Statistical Analysis: A Graduate Course and Handbook, American Sciences Press, Columbus, OH.
