NULL DISTRIBUTION OF MULTIPLE CORRELATION COEFFICIENT UNDER MIXTURE NORMAL MODEL

http://ijmms.hindawi.com

© Hindawi Publishing Corp.

HYDAR ALI and DAYA K. NAGAR

Received 14 April 2001

The multiple correlation coefficient is used in a large variety of statistical tests and regression problems. In this article, we derive the null distribution of the square of the sample multiple correlation coefficient, $R^2$, when a sample is drawn from a mixture of two multivariate Gaussian populations. The moments of $1-R^2$ and the inverse Mellin transform are used to derive the density of $R^2$.

2000 Mathematics Subject Classification: 62H10, 62H15.

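1. Introduction. Suppose that $x$ $(p\times 1)$, $\mu$ $(p\times 1)$, and $\Sigma$ $(p\times p) > 0$ are partitioned as

\[
x=\begin{pmatrix} x_1 \\ x_{(2)} \end{pmatrix},\qquad
\mu=\begin{pmatrix} \mu_1 \\ \mu_{(2)} \end{pmatrix},\qquad
\Sigma=\begin{pmatrix} \sigma_{11} & \sigma_{21}' \\ \sigma_{21} & \Sigma_{22} \end{pmatrix},
\]

where $x_{(2)}=(x_2,\dots,x_p)'$ and $\mu_{(2)}=(\mu_2,\dots,\mu_p)'$ are $(p-1)\times 1$ and $\Sigma_{22}$ is $(p-1)\times(p-1)$, so that $\operatorname{Var}(x_1)=\sigma_{11}$, $\operatorname{Cov}(x_{(2)})=\Sigma_{22}$, and $\sigma_{21}$ is the $(p-1)\times 1$ vector of covariances between $x_1$ and $x_2,\dots,x_p$. The multiple correlation coefficient between $x_1$ and $x_{(2)}$, denoted by $\bar R_{1\cdot 2\cdots p}$, is defined as

\[
\bar R_{1\cdot 2\cdots p}=\left(\frac{\sigma_{21}'\Sigma_{22}^{-1}\sigma_{21}}{\sigma_{11}}\right)^{1/2}. \tag{1.1}
\]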

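Let $A$ be the sample sum of squares and products matrix formed from $N$ independent observations on $x$. Partition $A$ as

\[
A=\begin{pmatrix} a_{11} & a_{21}' \\ a_{21} & A_{22} \end{pmatrix},
\]

where $A_{22}$ is $(p-1)\times(p-1)$. The sample multiple correlation coefficient between $x_1$ and $x_{(2)}$ is defined by

\[
R=\left(\frac{a_{21}'A_{22}^{-1}a_{21}}{a_{11}}\right)^{1/2}. \tag{1.2}
\]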

It is well known that, when the underlying population is normal, the random matrix $A$ has a Wishart distribution with $n=N-1$ degrees of freedom and parameter matrix $\Sigma$.

Further, $\bar R_{1\cdot 2\cdots p}=0$ if and only if $x_1$ is independent of $x_{(2)}=(x_2,\dots,x_p)'$. Moreover, when the population multiple correlation coefficient $\bar R_{1\cdot 2\cdots p}$ is zero, the distribution of $R^2$ is beta with parameters $(1/2)(p-1)$ and $(1/2)(N-p)$.
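The beta result for the normal case can be checked by simulation. The following is a minimal sketch, not part of the original derivation: it assumes $p=3$ with independent standard normal components (so the population multiple correlation is zero), uses the closed form of $R^2$ in terms of pairwise sample correlations that holds for $p=3$, and compares the simulated mean of $R^2$ with the beta mean $(p-1)/(N-1)$. The function names are illustrative only.

```python
import math
import random


def sample_r2(N, rng):
    """R^2 of x1 on (x2, x3) from N draws of three independent N(0, 1) variables."""
    data = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(N)]
    means = [sum(row[j] for row in data) / N for j in range(3)]
    # Centered sums of squares and products a_ij (the matrix A of the text).
    a = [[sum((row[i] - means[i]) * (row[j] - means[j]) for row in data)
          for j in range(3)] for i in range(3)]
    r12 = a[0][1] / math.sqrt(a[0][0] * a[1][1])
    r13 = a[0][2] / math.sqrt(a[0][0] * a[2][2])
    r23 = a[1][2] / math.sqrt(a[1][1] * a[2][2])
    # Closed form of the squared multiple correlation when p = 3.
    return (r12 ** 2 + r13 ** 2 - 2 * r12 * r13 * r23) / (1 - r23 ** 2)


def mean_r2(N, reps, seed=0):
    """Monte Carlo estimate of E(R^2) under the null hypothesis."""
    rng = random.Random(seed)
    return sum(sample_r2(N, rng) for _ in range(reps)) / reps
```

For $N=20$ and $p=3$ the beta parameters are $1$ and $8.5$, so the estimate returned by `mean_r2(20, 4000)` should lie close to $2/19$.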

In practice, the random variables are often not normally distributed. How, then, does departure from normality affect the conventional inference procedures, and in particular, what are the sampling distributions of commonly used statistics? To address these questions, Srivastava and Awan [9] and Tan [11] derived the distribution of the sample sum of squares and products matrix when sampling from a mixture of two multivariate normal distributions. The normal mixture is defined as follows:

\[
f(x)=\epsilon\,N_p(\mu_1,\Sigma;x)+(1-\epsilon)\,N_p(\mu_2,\Sigma;x),\qquad x\in\mathbb{R}^p, \tag{1.3}
\]

where

\[
N_p(\mu,\Sigma;x)=(2\pi)^{-(1/2)p}\det(\Sigma)^{-1/2}\exp\left[-\tfrac{1}{2}(x-\mu)'\Sigma^{-1}(x-\mu)\right],\qquad x\in\mathbb{R}^p,\ \mu\in\mathbb{R}^p,\ \Sigma>0, \tag{1.4}
\]

and the mixing proportion $\epsilon$, which lies between 0 and 1, is known as the degree of contamination. This model is very common in medical, biological, and agricultural experiments (Titterington et al. [12]). For results on the distribution theory and robustness studies of certain test statistics when sampling from a mixture normal model, see Srivastava [8], Srivastava and Awan [9, 10], Kabe and Gupta [5], Amey and Gupta [2], and Nagar and Castañeda [7].
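To make the sampling scheme behind (1.3) concrete, here is a minimal sketch of a sampler. It is an illustration under stated assumptions rather than anything taken from the cited papers: the common covariance matrix is assumed to be the identity (a general $\Sigma$ would need a Cholesky factor), `eps` stands for the mixing proportion attached to the first component, and the function name is hypothetical.

```python
import random


def sample_mixture(mu1, mu2, eps, n_draws, seed=0):
    """Draw from eps * N(mu1, I) + (1 - eps) * N(mu2, I).

    Assumption: identity covariance, chosen only to keep the sketch short.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # Pick the component first, then add standard normal noise to its mean.
        mu = mu1 if rng.random() < eps else mu2
        draws.append([m + rng.gauss(0.0, 1.0) for m in mu])
    return draws
```

The mean of such a sample approaches $\epsilon\mu_1+(1-\epsilon)\mu_2$; for example, with `mu1 = [0, 0]`, `mu2 = [3, 0]`, and `eps = 0.8`, the first coordinate averages about $0.6$.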

Srivastava [8], using certain transformations, derived the null distribution of the multiple correlation coefficient when sampling from a mixture of two multivariate normal distributions (see also Gupta and Kabe [3]). Amey [1] integrated the joint density of $a_{11}$, $a_{21}$, and $A_{22}$ suitably to derive the density of $R^2$ and studied its robustness.

In this article, we derive the null distribution of $R^2$ when sampling from a mixture of two multivariate normal distributions. First, we derive the $h$th null moment of $1-R^2$. Then, by using the inverse Mellin transform, the density of $1-R^2$ is obtained, from which the density of $R^2$ is deduced.

Note that $R^2$ is a function of the elements of the sample sum of squares and products matrix $A$. Therefore, in our derivation, we use the distribution of $A$ when sampling from the above model. Srivastava and Awan [9] and Tan [11] have shown that the density of $A$, when sampling from (1.3), is a binomial sum of linear noncentral Wishart densities:

\[
f(A)=\sum_{\gamma=0}^{N}\binom{N}{\gamma}\epsilon^{\gamma}(1-\epsilon)^{N-\gamma}\,W_p\left(n,\Sigma,c_\gamma^2\Sigma^{-1}\nu\nu';A\right), \tag{1.5}
\]

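where $n=N-1$, $c_\gamma^2=\gamma(N-\gamma)/N$, and $\nu=\mu_1-\mu_2$. Here $W_p(n,\Sigma,c_\gamma^2\Sigma^{-1}\nu\nu';A)$ represents the noncentral Wishart density with $n$ degrees of freedom and noncentrality parameter matrix $c_\gamma^2\Sigma^{-1}\nu\nu'$ defined by

\[
K_p(n,\Sigma,\nu)\operatorname{etr}\left(-\tfrac{1}{2}\Sigma^{-1}A\right)\det(A)^{(1/2)(n-p-1)}\,{}_0F_1^{(p)}\left(\tfrac{1}{2}n;\tfrac{1}{4}c_\gamma^2\Sigma^{-1}A\Sigma^{-1}\nu\nu'\right), \tag{1.6}
\]

where

\[
K_p(n,\Sigma,\nu)=\left[2^{(1/2)pn}\,\Gamma_p\left(\tfrac{1}{2}n\right)\det(\Sigma)^{(1/2)n}\right]^{-1}\operatorname{etr}\left(-\tfrac{1}{2}c_\gamma^2\Sigma^{-1}\nu\nu'\right) \tag{1.7}
\]

and $\Gamma_m(a)=\pi^{(1/4)m(m-1)}\prod_{j=1}^{m}\Gamma\left(a-(1/2)(j-1)\right)$.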
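The multivariate gamma function $\Gamma_m(a)$ defined above can be evaluated on the log scale. The sketch below is a direct transcription of that definition using only the Python standard library; the function name is illustrative. It is also convenient for checking the recursion $\Gamma_p(a)=\pi^{(1/2)(p-1)}\,\Gamma_{p-1}(a)\,\Gamma\left(a-(1/2)(p-1)\right)$, which follows from the product form and is the kind of gamma-function identity needed when ratios of $\Gamma_p$ and $\Gamma_{p-1}$ are simplified.

```python
import math


def log_multigamma(a, m):
    """log Gamma_m(a) = (m(m-1)/4) log(pi) + sum_{j=1}^{m} log Gamma(a - (j-1)/2)."""
    if a <= (m - 1) / 2:
        raise ValueError("Gamma_m(a) requires a > (m - 1) / 2")
    return (m * (m - 1) / 4) * math.log(math.pi) + sum(
        math.lgamma(a - 0.5 * (j - 1)) for j in range(1, m + 1))
```

Working with logarithms avoids overflow for the moderately large arguments $(1/2)n$ that arise with realistic sample sizes.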

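2. Null moments of $1-R^2$. In this section, we derive moments of $1-R^2$ when $\bar R_{1\cdot 2\cdots p}=0$ (or equivalently $\sigma_{21}=0$). Let

\[
\Sigma_0=\begin{pmatrix} \sigma_{11} & 0 \\ 0 & \Sigma_{22} \end{pmatrix}
\]

and $U=1-R^2$. Since $a_{11}$ is scalar,

\[
U=1-R^2=1-\frac{a_{21}'A_{22}^{-1}a_{21}}{a_{11}}=\frac{\det(A)}{a_{11}\det(A_{22})}. \tag{2.1}
\]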
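The last equality in (2.1) rests on the partitioned determinant $\det(A)=\det(A_{22})\,(a_{11}-a_{21}'A_{22}^{-1}a_{21})$. As a quick sanity check, the sketch below evaluates both sides of (2.1) for a $3\times 3$ positive definite matrix; the helper names and the test matrix are illustrative choices, not taken from the paper.

```python
def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def u_via_quadratic_form(A):
    """U = 1 - a21' A22^{-1} a21 / a11 for a 3 x 3 matrix A."""
    a11 = A[0][0]
    a21 = [A[1][0], A[2][0]]
    A22 = [[A[1][1], A[1][2]], [A[2][1], A[2][2]]]
    d = det2(A22)
    inv = [[A22[1][1] / d, -A22[0][1] / d], [-A22[1][0] / d, A22[0][0] / d]]
    quad = sum(a21[i] * inv[i][j] * a21[j] for i in range(2) for j in range(2))
    return 1.0 - quad / a11


def u_via_determinants(A):
    """U = det(A) / (a11 det(A22)) for a 3 x 3 matrix A."""
    A22 = [[A[1][1], A[1][2]], [A[2][1], A[2][2]]]
    return det3(A) / (A[0][0] * det2(A22))
```

For `A = [[4, 1, 2], [1, 3, 0.5], [2, 0.5, 5]]` both functions give $44/59$.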

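The $h$th null moment of $U$ is given by

\[
E\left(U^h\right)=\sum_{\gamma=0}^{N}\binom{N}{\gamma}\epsilon^{\gamma}(1-\epsilon)^{N-\gamma}E_\gamma\left(U^h\right), \tag{2.2}
\]

where

\[
\begin{aligned}
E_\gamma\left(U^h\right)={}&K_p(n,\Sigma_0,\nu)\int_{A>0}\operatorname{etr}\left(-\tfrac{1}{2}\Sigma_0^{-1}A\right)a_{11}^{-h}\det(A_{22})^{-h}\\
&\times\det(A)^{(1/2)(n-p-1)+h}\,{}_0F_1^{(p)}\left(\tfrac{1}{2}n;\tfrac{1}{4}c_\gamma^2\Sigma_0^{-1}A\Sigma_0^{-1}\nu\nu'\right)dA.
\end{aligned} \tag{2.3}
\]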
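The binomial weights appearing in (1.5) and (2.2), together with the factor $c_\gamma^2=\gamma(N-\gamma)/N$, are easy to tabulate. The following sketch is only an illustration: `eps` denotes the mixing proportion of (1.3), and the function names are hypothetical.

```python
from math import comb


def mixture_weights(N, eps):
    """Weights C(N, g) * eps^g * (1 - eps)^(N - g) for g = 0, ..., N."""
    return [comb(N, g) * eps ** g * (1 - eps) ** (N - g) for g in range(N + 1)]


def c_gamma_sq(N, g):
    """c_gamma^2 = gamma * (N - gamma) / N."""
    return g * (N - g) / N
```

The weights sum to one, and $c_\gamma^2$ vanishes at $\gamma=0$ and $\gamma=N$, where all observations come from a single component and the corresponding Wishart term is central.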

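Replacing $a_{11}^{-h}$ and $\det(A_{22})^{-h}$ by their integral representations, namely

\[
\begin{aligned}
a_{11}^{-h}&=\frac{1}{2^{h}\Gamma(h)}\int_0^\infty\exp\left(-\tfrac{1}{2}a_{11}y_1\right)y_1^{h-1}\,dy_1,\qquad \operatorname{Re}(h)>0,\\
\det(A_{22})^{-h}&=\frac{1}{2^{(p-1)h}\Gamma_{p-1}(h)}\int_{Y_{22}>0}\operatorname{etr}\left(-\tfrac{1}{2}A_{22}Y_{22}\right)\det(Y_{22})^{h-(1/2)(p-1+1)}\,dY_{22},\qquad \operatorname{Re}(h)>\tfrac{1}{2}(p-2),
\end{aligned} \tag{2.4}
\]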

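respectively, in (2.3) and integrating $A$, the moment expression is rewritten as

\[
\begin{aligned}
E_\gamma\left(U^h\right)={}&2^{(1/2)np}K_p(n,\Sigma_0,\nu)\frac{\Gamma_p\left((1/2)n+h\right)}{\Gamma(h)\Gamma_{p-1}(h)}\\
&\times\int_0^\infty y_1^{h-1}\int_{Y_{22}>0}\det(Y_{22})^{h-(1/2)p}\det\left(\Sigma_0^{-1}+Y\right)^{-(1/2)n-h}\\
&\times{}_1F_1^{(p)}\left(\tfrac{1}{2}n+h;\tfrac{1}{2}n;\tfrac{1}{2}c_\gamma^2\Sigma_0^{-1}\left(\Sigma_0^{-1}+Y\right)^{-1}\Sigma_0^{-1}\nu\nu'\right)dy_1\,dY_{22},
\end{aligned} \tag{2.5}
\]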

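where $Y=\begin{pmatrix} y_1 & 0 \\ 0 & Y_{22} \end{pmatrix}$ and ${}_1F_1^{(p)}$ is the confluent hypergeometric function of matrix argument (Gupta and Nagar [4]). Since $\operatorname{rank}\left(\Sigma_0^{-1}(\Sigma_0^{-1}+Y)^{-1}\Sigma_0^{-1}\nu\nu'\right)=1$, the only nonzero characteristic root of the matrix $\Sigma_0^{-1}(\Sigma_0^{-1}+Y)^{-1}\Sigma_0^{-1}\nu\nu'$ is $\operatorname{tr}\left((\Sigma_0^{-1}+Y)^{-1}\Sigma_0^{-1}\nu\nu'\Sigma_0^{-1}\right)$ and therefore,

\[
\begin{aligned}
&{}_1F_1^{(p)}\left(\tfrac{1}{2}n+h;\tfrac{1}{2}n;\tfrac{1}{2}c_\gamma^2\Sigma_0^{-1}\left(\Sigma_0^{-1}+Y\right)^{-1}\Sigma_0^{-1}\nu\nu'\right)\\
&\qquad={}_1F_1\left(\tfrac{1}{2}n+h;\tfrac{1}{2}n;\tfrac{1}{2}c_\gamma^2\operatorname{tr}\left[\left(\Sigma_0^{-1}+Y\right)^{-1}\Sigma_0^{-1}\nu\nu'\Sigma_0^{-1}\right]\right),
\end{aligned} \tag{2.6}
\]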

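where ${}_1F_1$ is the confluent hypergeometric function of scalar argument (see [6]). Substituting (2.6) in (2.5) and expanding ${}_1F_1$ in series form, the moment expression simplifies to

\[
\begin{aligned}
E_\gamma\left(U^h\right)={}&2^{(1/2)np}K_p(n,\Sigma_0,\nu)\frac{\Gamma_p\left((1/2)n+h\right)}{\Gamma(h)\Gamma_{p-1}(h)}\sum_{t=0}^{\infty}\left(\frac{c_\gamma^2}{2}\right)^{t}\frac{\left((1/2)n+h\right)_t}{\left((1/2)n\right)_t\,t!}\\
&\times\int_0^\infty y_1^{h-1}\int_{Y_{22}>0}\det(Y_{22})^{h-(1/2)p}\det\left(\Sigma_0^{-1}+Y\right)^{-(1/2)n-h}\\
&\times\left[\nu'\Sigma_0^{-1}\left(\Sigma_0^{-1}+Y\right)^{-1}\Sigma_0^{-1}\nu\right]^{t}dy_1\,dY_{22},
\end{aligned} \tag{2.7}
\]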

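where $(a)_r=a(a+1)\cdots(a+r-1)$ and $(a)_0=1$. Noting that $\Sigma_0$ is a block diagonal matrix, and partitioning $\nu=(\nu_1,\nu_2')'$ conformably, we obtain

\[
\begin{aligned}
\left[\nu'\Sigma_0^{-1}\left(\Sigma_0^{-1}+Y\right)^{-1}\Sigma_0^{-1}\nu\right]^{t}
&=\left[\nu_1^2\sigma_{11}^{-1}\left(1+\sigma_{11}y_1\right)^{-1}+\nu_2'\Sigma_{22}^{-1}\left(\Sigma_{22}^{-1}+Y_{22}\right)^{-1}\Sigma_{22}^{-1}\nu_2\right]^{t}\\
&=\sum_{k+\ell=t}\frac{t!}{k!\,\ell!}\left[\nu_1^2\sigma_{11}^{-1}\left(1+\sigma_{11}y_1\right)^{-1}\right]^{k}\left[\nu_2'\Sigma_{22}^{-1}\left(\Sigma_{22}^{-1}+Y_{22}\right)^{-1}\Sigma_{22}^{-1}\nu_2\right]^{\ell},\\
\det\left(\Sigma_0^{-1}+Y\right)&=\sigma_{11}^{-1}\left(1+\sigma_{11}y_1\right)\det(\Sigma_{22})^{-1}\det\left(I_{p-1}+\Sigma_{22}Y_{22}\right).
\end{aligned} \tag{2.8}
\]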
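For readers who want the intermediate algebra behind (2.8), the following worked step is a sketch that uses only the block diagonal form of $\Sigma_0$ and $Y$. Because both matrices are block diagonal, so is $\Sigma_0^{-1}(\Sigma_0^{-1}+Y)^{-1}\Sigma_0^{-1}$, and its scalar block and the determinant reduce as

\[
\begin{aligned}
\sigma_{11}^{-1}\left(\sigma_{11}^{-1}+y_1\right)^{-1}\sigma_{11}^{-1}
&=\sigma_{11}^{-2}\,\frac{\sigma_{11}}{1+\sigma_{11}y_1}
=\sigma_{11}^{-1}\left(1+\sigma_{11}y_1\right)^{-1},\\
\det\left(\Sigma_0^{-1}+Y\right)
&=\left(\sigma_{11}^{-1}+y_1\right)\det\left(\Sigma_{22}^{-1}+Y_{22}\right)
=\sigma_{11}^{-1}\left(1+\sigma_{11}y_1\right)\det\left(\Sigma_{22}^{-1}\right)\det\left(I_{p-1}+\Sigma_{22}Y_{22}\right).
\end{aligned}
\]

The quadratic form in $\nu$ is then the sum of the contributions of the two blocks, and the binomial theorem gives the expansion over $k+\ell=t$.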

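Now substituting (2.8) in (2.7), we have

\[
\begin{aligned}
E_\gamma\left(U^h\right)={}&2^{(1/2)np}K_p(n,\Sigma_0,\nu)\frac{\Gamma_p\left((1/2)n+h\right)\det(\Sigma_0)^{(1/2)n+h}}{\Gamma(h)\Gamma_{p-1}(h)}\\
&\times\sum_{t=0}^{\infty}\left(\frac{c_\gamma^2}{2}\right)^{t}\frac{\left((1/2)n+h\right)_t}{\left((1/2)n\right)_t}\sum_{k+\ell=t}\frac{1}{k!\,\ell!}\left(\frac{\nu_1^2}{\sigma_{11}}\right)^{k}\\
&\times\int_0^\infty y_1^{h-1}\left(1+\sigma_{11}y_1\right)^{-((1/2)n+h+k)}dy_1\\
&\times\int_{Y_{22}>0}\det(Y_{22})^{h-(1/2)p}\det\left(I_{p-1}+\Sigma_{22}Y_{22}\right)^{-(1/2)n-h}\\
&\times\left[\nu_2'\Sigma_{22}^{-1}\left(\Sigma_{22}^{-1}+Y_{22}\right)^{-1}\Sigma_{22}^{-1}\nu_2\right]^{\ell}dY_{22}.
\end{aligned} \tag{2.9}
\]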

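Substituting $Z=\left(I_{p-1}+\Sigma_{22}^{1/2}Y_{22}\Sigma_{22}^{1/2}\right)^{-1}$, the integral involving $Y_{22}$ is evaluated as

\[
\begin{aligned}
&\int_{Y_{22}>0}\det(Y_{22})^{h-(1/2)p}\det\left(I_{p-1}+\Sigma_{22}Y_{22}\right)^{-(1/2)n-h}\left[\nu_2'\Sigma_{22}^{-1}\left(\Sigma_{22}^{-1}+Y_{22}\right)^{-1}\Sigma_{22}^{-1}\nu_2\right]^{\ell}dY_{22}\\
&\quad=\det(\Sigma_{22})^{-h}\int_{0<Z<I_{p-1}}\det(Z)^{(1/2)(n-p)}\det\left(I_{p-1}-Z\right)^{h-(1/2)p}\left[\nu_2'\Sigma_{22}^{-1/2}Z\Sigma_{22}^{-1/2}\nu_2\right]^{\ell}dZ\\
&\quad=\det(\Sigma_{22})^{-h}\left.\frac{\partial^{\ell}}{\partial\eta^{\ell}}\right|_{\eta=0}\int_{0<Z<I_{p-1}}\det(Z)^{(1/2)(n-p)}\det\left(I_{p-1}-Z\right)^{h-(1/2)p}\operatorname{etr}\left(\eta\,\nu_2'\Sigma_{22}^{-1/2}Z\Sigma_{22}^{-1/2}\nu_2\right)dZ\\
&\quad=\det(\Sigma_{22})^{-h}\frac{\Gamma_{p-1}\left((1/2)n\right)\Gamma_{p-1}(h)}{\Gamma_{p-1}\left((1/2)n+h\right)}\left.\frac{\partial^{\ell}}{\partial\eta^{\ell}}\right|_{\eta=0}{}_1F_1^{(p-1)}\left(\tfrac{1}{2}n;\tfrac{1}{2}n+h;\eta\,\Sigma_{22}^{-1}\nu_2\nu_2'\right)\\
&\quad=\det(\Sigma_{22})^{-h}\frac{\Gamma_{p-1}\left((1/2)n\right)\Gamma_{p-1}(h)}{\Gamma_{p-1}\left((1/2)n+h\right)}\frac{\left((1/2)n\right)_{\ell}}{\left((1/2)n+h\right)_{\ell}}\left(\nu_2'\Sigma_{22}^{-1}\nu_2\right)^{\ell},
\end{aligned} \tag{2.10}
\]

where ${}_1F_1^{(p-1)}$ is the confluent hypergeometric function of matrix argument (see [4]).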
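The exponents in the second line of (2.10) can be traced through the change of variables in two stages. The steps below are a sketch that assumes the standard Jacobians for symmetric $(p-1)\times(p-1)$ matrices, namely $d(BXB')=\det(B)^{p}\,dX$ and $d(X^{-1})=\det(X)^{-p}\,dX$ (in absolute value); the intermediate variable $W$ is introduced here only for illustration. Put $W=\Sigma_{22}^{1/2}Y_{22}\Sigma_{22}^{1/2}$ and then $Z=(I_{p-1}+W)^{-1}$, so that $W=Z^{-1}(I_{p-1}-Z)$. Then

\[
\begin{aligned}
\det(Y_{22})^{h-(1/2)p}\,dY_{22}
&=\det(\Sigma_{22})^{-(h-(1/2)p)}\det(W)^{h-(1/2)p}\det(\Sigma_{22})^{-(1/2)p}\,dW
=\det(\Sigma_{22})^{-h}\det(W)^{h-(1/2)p}\,dW,\\
\det(W)^{h-(1/2)p}\det\left(I_{p-1}+W\right)^{-(1/2)n-h}\,dW
&=\det(Z)^{-h+(1/2)p}\det\left(I_{p-1}-Z\right)^{h-(1/2)p}\det(Z)^{(1/2)n+h}\det(Z)^{-p}\,dZ\\
&=\det(Z)^{(1/2)(n-p)}\det\left(I_{p-1}-Z\right)^{h-(1/2)p}\,dZ,
\end{aligned}
\]

while $\left(\Sigma_{22}^{-1}+Y_{22}\right)^{-1}=\Sigma_{22}^{1/2}Z\Sigma_{22}^{1/2}$ turns the quadratic form into $\nu_2'\Sigma_{22}^{-1/2}Z\Sigma_{22}^{-1/2}\nu_2$. The region $Y_{22}>0$ maps onto $0<Z<I_{p-1}$.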

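Collecting terms containing $y_1$ and integrating, we obtain

\[
\int_0^\infty y_1^{h-1}\left(1+\sigma_{11}y_1\right)^{-((1/2)n+h+k)}dy_1=\sigma_{11}^{-h}\frac{\Gamma\left((1/2)n\right)\Gamma(h)}{\Gamma\left((1/2)n+h\right)}\frac{\left((1/2)n\right)_k}{\left((1/2)n+h\right)_k}. \tag{2.11}
\]

Substituting (2.10), (2.11), and (1.7) in (2.9) and simplifying the resulting expression using results on the gamma function, we get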

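\[
\begin{aligned}
E_\gamma\left(U^h\right)={}&\exp\left(-\tfrac{1}{2}c_\gamma^2\nu'\Sigma_0^{-1}\nu\right)\frac{\Gamma\left((1/2)n\right)}{\Gamma\left((1/2)(n-p+1)\right)}\sum_{t=0}^{\infty}\sum_{\ell+k=t}\left(\frac{c_\gamma^2}{2}\right)^{t}\frac{\left((1/2)n\right)_k\left((1/2)n\right)_{\ell}}{\left((1/2)n\right)_t}\\
&\times\frac{\left(\nu_1^2/\sigma_{11}\right)^{k}\left(\nu_2'\Sigma_{22}^{-1}\nu_2\right)^{\ell}}{k!\,\ell!}\,\frac{\Gamma\left((1/2)(n-p+1)+h\right)\Gamma\left((1/2)n+t+h\right)}{\Gamma\left((1/2)n+k+h\right)\Gamma\left((1/2)n+\ell+h\right)}\\
={}&\exp\left(-\tfrac{1}{2}c_\gamma^2\nu'\Sigma_0^{-1}\nu\right)\frac{\Gamma\left((1/2)n\right)\Gamma\left((1/2)(n-p+1)+h\right)}{\Gamma\left((1/2)n+h\right)\Gamma\left((1/2)(n-p+1)\right)}\sum_{k=0}^{\infty}\left(\frac{c_\gamma^2}{2}\right)^{k}\\
&\times\frac{\left(\nu_1^2/\sigma_{11}\right)^{k}}{k!}\,{}_2F_2\left(\tfrac{1}{2}n+h+k,\tfrac{1}{2}n;\tfrac{1}{2}n+k,\tfrac{1}{2}n+h;\tfrac{1}{2}c_\gamma^2\nu_2'\Sigma_{22}^{-1}\nu_2\right),
\end{aligned} \tag{2.12}
\]

where ${}_2F_2$ is the generalized hypergeometric function of scalar argument (see [6]).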
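The double series in the first form of (2.12) can be evaluated numerically by truncation. The sketch below is illustrative only: the argument names `c2`, `x1`, and `x2` stand for $c_\gamma^2$, $\nu_1^2/\sigma_{11}$, and $\nu_2'\Sigma_{22}^{-1}\nu_2$, the truncation point `terms` is an arbitrary choice, and no claim is made about efficiency. Two checks follow from the formula itself: at $h=0$ the moment equals one, and at $\nu=0$ it reduces to the beta moment $\Gamma((1/2)n)\Gamma((1/2)(n-p+1)+h)/[\Gamma((1/2)n+h)\Gamma((1/2)(n-p+1))]$.

```python
import math


def poch(a, r):
    """Pochhammer symbol (a)_r = a (a + 1) ... (a + r - 1), with (a)_0 = 1."""
    out = 1.0
    for i in range(r):
        out *= a + i
    return out


def null_moment(h, n, p, c2, x1, x2, terms=60):
    """Truncated double series for E_gamma(U^h), first form of (2.12).

    c2 = c_gamma^2, x1 = nu_1^2 / sigma_11, x2 = nu_2' Sigma_22^{-1} nu_2.
    """
    a = 0.5 * n
    b = 0.5 * (n - p + 1)
    total = 0.0
    for t in range(terms):
        for k in range(t + 1):
            l = t - k
            # Ratio of gamma functions that carries the dependence on h.
            log_gammas = (math.lgamma(b + h) + math.lgamma(a + t + h)
                          - math.lgamma(a + k + h) - math.lgamma(a + l + h))
            coeff = poch(a, k) * poch(a, l) / poch(a, t)
            coeff *= (0.5 * c2) ** t * x1 ** k * x2 ** l
            coeff /= math.factorial(k) * math.factorial(l)
            total += coeff * math.exp(log_gammas)
    front = math.exp(-0.5 * c2 * (x1 + x2) + math.lgamma(a) - math.lgamma(b))
    return front * total
```

For instance, with $n=12$, $p=5$, and $\nu=0$, the first moment is $(n-p+1)/n=2/3$, the mean of a beta variable with parameters $4$ and $2$.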

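3. Distribution of $R^2$ under mixture normal model. The density function $f(u)$ of $U=1-R^2$ is obtained by taking the inverse Mellin transform of $E(U^h)$ as

\[
f(u)=\sum_{\gamma=0}^{N}\binom{N}{\gamma}\epsilon^{\gamma}(1-\epsilon)^{N-\gamma}f_\gamma(u) \tag{3.1}
\]

with

\[
f_\gamma(u)=(2\pi\iota)^{-1}\int_{C}E_\gamma\left(U^h\right)u^{-h-1}\,dh,\qquad 0<u<1, \tag{3.2}
\]

where $\iota=\sqrt{-1}$ and $C$ is a suitable contour. Substituting (2.12) in (3.2), we obtain

\[
\begin{aligned}
f_\gamma(u)={}&\exp\left(-\tfrac{1}{2}c_\gamma^2\nu'\Sigma_0^{-1}\nu\right)\frac{\Gamma\left((1/2)n\right)}{\Gamma\left((1/2)(p-1)\right)\Gamma\left((1/2)(n-p+1)\right)}\sum_{t=0}^{\infty}\sum_{k+\ell=t}\left(\frac{c_\gamma^2}{2}\right)^{t}\\
&\times\frac{\left((1/2)n\right)_k\left((1/2)n\right)_{\ell}}{\left((1/2)n\right)_t}\frac{\left(\nu_1^2/\sigma_{11}\right)^{k}\left(\nu_2'\Sigma_{22}^{-1}\nu_2\right)^{\ell}}{k!\,\ell!}\,u^{(1/2)n+k+\ell-1}(1-u)^{(1/2)(p-3)}\\
&\times{}_2F_1\left(\tfrac{1}{2}(p-1)+k,\tfrac{1}{2}(p-1)+\ell;\tfrac{1}{2}(p-1);1-u\right),\qquad 0<u<1,
\end{aligned} \tag{3.3}
\]

where ${}_2F_1$ is the Gauss hypergeometric function (see [6]). To obtain (3.3) we have used the result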

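\[
\begin{aligned}
&\int_0^1 u^{(1/2)n+h+k+\ell-1}(1-u)^{(1/2)(p-3)}\,{}_2F_1\left(\tfrac{1}{2}(p-1)+k,\tfrac{1}{2}(p-1)+\ell;\tfrac{1}{2}(p-1);1-u\right)du\\
&\qquad=\frac{\Gamma\left((1/2)(p-1)\right)\Gamma\left((1/2)(n-p+1)+h\right)\Gamma\left((1/2)n+t+h\right)}{\Gamma\left((1/2)n+k+h\right)\Gamma\left((1/2)n+\ell+h\right)}.
\end{aligned} \tag{3.4}
\]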

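The density of $R^2=1-U$ is now derived from the density of $U$ as

\[
g\left(R^2\right)=\sum_{\gamma=0}^{N}\binom{N}{\gamma}\epsilon^{\gamma}(1-\epsilon)^{N-\gamma}g_\gamma\left(R^2\right), \tag{3.5}
\]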

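where

\[
\begin{aligned}
g_\gamma\left(R^2\right)={}&\exp\left(-\tfrac{1}{2}c_\gamma^2\nu'\Sigma_0^{-1}\nu\right)\frac{\Gamma\left((1/2)n\right)}{\Gamma\left((1/2)(p-1)\right)\Gamma\left((1/2)(n-p+1)\right)}\sum_{t=0}^{\infty}\sum_{k+\ell=t}\left(\frac{c_\gamma^2}{2}\right)^{t}\\
&\times\frac{\left((1/2)n\right)_k\left((1/2)n\right)_{\ell}}{\left((1/2)n\right)_t}\frac{\left(\nu_1^2/\sigma_{11}\right)^{k}\left(\nu_2'\Sigma_{22}^{-1}\nu_2\right)^{\ell}}{k!\,\ell!}\\
&\times\left(1-R^2\right)^{(1/2)n+k+\ell-1}\left(R^2\right)^{(1/2)(p-3)}\\
&\times{}_2F_1\left(\tfrac{1}{2}(p-1)+k,\tfrac{1}{2}(p-1)+\ell;\tfrac{1}{2}(p-1);R^2\right),\qquad 0<R^2<1.
\end{aligned} \tag{3.6}
\]

By using the result ${}_2F_1(a,b;c;z)=(1-z)^{c-a-b}\,{}_2F_1(c-a,c-b;c;z)$, the above density can be rewritten as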

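\[
\begin{aligned}
g_\gamma\left(R^2\right)={}&\exp\left(-\tfrac{1}{2}c_\gamma^2\nu'\Sigma_0^{-1}\nu\right)\frac{\Gamma\left((1/2)n\right)}{\Gamma\left((1/2)(p-1)\right)\Gamma\left((1/2)(n-p+1)\right)}\\
&\times\left(R^2\right)^{(1/2)(p-3)}\left(1-R^2\right)^{(1/2)(n-p-1)}\sum_{t=0}^{\infty}\sum_{k+\ell=t}\left(\frac{c_\gamma^2}{2}\right)^{t}\frac{\left((1/2)n\right)_k\left((1/2)n\right)_{\ell}}{\left((1/2)n\right)_t}\\
&\times\frac{\left(\nu_1^2/\sigma_{11}\right)^{k}\left(\nu_2'\Sigma_{22}^{-1}\nu_2\right)^{\ell}}{k!\,\ell!}\,{}_2F_1\left(-k,-\ell;\tfrac{1}{2}(p-1);R^2\right),\qquad 0<R^2<1.
\end{aligned} \tag{3.7}
\]

It is interesting to note that if $\nu=0$, then the density $g(R^2)$ reduces to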

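\[
g\left(R^2\right)=\frac{\Gamma\left((1/2)n\right)}{\Gamma\left((1/2)(p-1)\right)\Gamma\left((1/2)(n-p+1)\right)}\left(R^2\right)^{(1/2)(p-3)}\left(1-R^2\right)^{(1/2)(n-p-1)},\qquad 0<R^2<1. \tag{3.8}
\]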
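Because the Gauss function in (3.7) has nonpositive integer numerator parameters, it is a finite sum, which makes the density straightforward to evaluate. The following sketch is an illustration under stated assumptions, not the authors' computation: `c2`, `x1`, and `x2` again stand for $c_\gamma^2$, $\nu_1^2/\sigma_{11}$, and $\nu_2'\Sigma_{22}^{-1}\nu_2$, `eps` is the mixing proportion of (1.3), and the truncation point `terms` is arbitrary.

```python
import math


def poch(a, r):
    """Pochhammer symbol (a)_r, with (a)_0 = 1."""
    out = 1.0
    for i in range(r):
        out *= a + i
    return out


def hyp2f1_terminating(k, l, c, z):
    """2F1(-k, -l; c; z) for nonnegative integers k and l (a finite sum)."""
    term, total = 1.0, 1.0
    for j in range(min(k, l)):
        # Ratio of consecutive series terms of 2F1(-k, -l; c; z).
        term *= (j - k) * (j - l) / ((c + j) * (j + 1)) * z
        total += term
    return total


def g_gamma(r2, n, p, c2, x1, x2, terms=25):
    """Component density g_gamma(R^2) of (3.7), truncated at t < terms."""
    a, c, b = 0.5 * n, 0.5 * (p - 1), 0.5 * (n - p + 1)
    series = 0.0
    for t in range(terms):
        for k in range(t + 1):
            l = t - k
            w = (0.5 * c2) ** t * poch(a, k) * poch(a, l) / poch(a, t)
            w *= x1 ** k * x2 ** l / (math.factorial(k) * math.factorial(l))
            series += w * hyp2f1_terminating(k, l, c, r2)
    log_const = math.lgamma(a) - math.lgamma(c) - math.lgamma(b)
    beta_part = r2 ** (c - 1) * (1 - r2) ** (b - 1)
    return math.exp(-0.5 * c2 * (x1 + x2) + log_const) * beta_part * series


def g_mixture(r2, N, p, eps, x1, x2):
    """Mixture density g(R^2) of (3.5), with n = N - 1 and c_gamma^2 = g(N - g)/N."""
    n = N - 1
    return sum(math.comb(N, g) * eps ** g * (1 - eps) ** (N - g)
               * g_gamma(r2, n, p, g * (N - g) / N, x1, x2)
               for g in range(N + 1))
```

With `x1 = x2 = 0` both functions return the beta density (3.8); for $n=12$, $p=5$, and $R^2=0.3$ that value is $20\times 0.3\times 0.7^3=2.058$.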

References

[1] A. K. A. Amey, Robustness of the multiple correlation coefficient when sampling from a mixture of two multivariate normal populations, Comm. Statist. Simulation Comput. 19 (1990), no. 4, 1443–1457.

[2] A. K. A. Amey and A. K. Gupta, Testing sphericity under a mixture model, Austral. J. Statist. 34 (1992), 451–460.

[3] A. K. Gupta and D. G. Kabe, On some noncentral distribution problems for the mixture of two normal populations, Metrika 38 (1991), 1–10.

[4] A. K. Gupta and D. K. Nagar, Matrix Variate Distributions, Chapman & Hall/CRC, Florida, 2000.

[5] D. G. Kabe and A. K. Gupta, Hotelling's $T^2$-distribution for a mixture of two normal populations, South African Statist. J. 24 (1990), 87–92.

[6] Y. L. Luke, The Special Functions and Their Approximations, Vol. I, Academic Press, New York, 1969.

[7] D. K. Nagar and M. E. Castañeda, Distribution of correlation coefficient under mixture normal model, to appear in Metrika, 2002.

[8] M. S. Srivastava, On the distribution of Hotelling's $T^2$ and multiple correlation $R^2$ when sampling from a mixture of two normals, Comm. Statist. A-Theory Methods 12 (1983), no. 13, 1481–1497.

[9] M. S. Srivastava and H. M. Awan, On the robustness of Hotelling's $T^2$-test and distribution of linear and quadratic forms in sampling from a mixture of two multivariate normal populations, Comm. Statist. A-Theory Methods 11 (1982), no. 1, 81–107.

[10] M. S. Srivastava and H. M. Awan, On the robustness of the correlation coefficient in sampling from a mixture of two bivariate normals, Comm. Statist. A-Theory Methods 13 (1984), 371–382.

[11] W. Y. Tan, On the distribution of the sample covariance matrix from a mixture of normal densities, South African Statist. J. 12 (1978), 47–55.

[12] D. M. Titterington, A. F. M. Smith, and U. E. Makov, Statistical Analysis of Finite Mixture Distributions, John Wiley, Chichester, 1985.

Hydar Ali: Department of Mathematics and Computer Science, the University of the West Indies, St. Augustine, Trinidad and Tobago

Daya K. Nagar: Departamento de Matemáticas, Universidad de Antioquia, Medellín, A. A.1226, Colombia
