Therefore, the function g(x) = G(x : a/(1 − b²)) = N(0, a/(1 − b²)) is a solution of the integral equation (10). Constant multiples of g(x), as well as g(x) ≡ 0, are also solutions, though they are not probability density functions. ■
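As a numerical illustration of Lemma 3, the fixed-point property can be checked by quadrature: with arbitrarily chosen values b = 0.6 and a = 1.0 (illustrative, not from the text), the density g(x) = N(0, a/(1 − b²)) reproduces itself under the kernel H(x, w) = N(bw, a).

```python
import numpy as np

# Kernel H(x, w) = N(bw, a); illustrative values b = 0.6, a = 1.0.
b, a = 0.6, 1.0
s = a / (1 - b**2)            # candidate variance of the solution g

def normal_pdf(x, mean, var):
    return np.exp(-(x - mean)**2 / (2 * var)) / np.sqrt(2 * np.pi * var)

# Quadrature of the right-hand side of (10): int H(x, w) g(w) dw.
w = np.linspace(-12.0, 12.0, 4001)
dw = w[1] - w[0]
x = np.linspace(-3.0, 3.0, 7)
rhs = np.array([np.sum(normal_pdf(xi, b * w, a) * normal_pdf(w, 0.0, s)) * dw
                for xi in x])
lhs = normal_pdf(x, 0.0, s)
print(np.max(np.abs(rhs - lhs)))   # essentially zero
```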
4 Consistent Bivariate Distribution
4.1 Derivation of the marginal distribution

We obtain the following result from Lemma 3.

Theorem 4 The marginal distribution g(x) consistent with the pair of conditional distributions (6) and (7), where 0 < 1 − β²δ², is

    g(x) = N(μ₁, (ω₁₁ + β²ω₂₂)/(1 − β²δ²)).   (11)
Proof. Assuming μ₁ = 0, μ₂ = 0 without loss of generality, we consider the pair (6) and (7). Substituting (7) into (6) as in Lemma 1, we get the conditional distribution N(bw, a), where b = βδ and a = ω₁₁ + β²ω₂₂ > 0. Regarding this function as the kernel H(x, w), we obtain the integral equation (10). Then, under the condition 0 < 1 − b² = 1 − β²δ², we get its solution g(x) = N(0, a/(1 − b²)) from Lemma 3. The variance of this solution distribution is given as

    a/(1 − b²) = ψ/θ,   (12)

where ψ = ω₁₁ + β²ω₂₂ and θ = 1 − β²δ².
If μ₁ ≠ 0, μ₂ ≠ 0, then we can consider the distributions of X − μ₁ and Y − μ₂, which have zero means, and thus go back to the zero-mean case treated above. ■
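The marginal variance of Theorem 4 can also be seen as the fixed point of the recursion Var(X) = ω₁₁ + β² Var(Y), Var(Y) = ω₂₂ + δ² Var(X), where ω₁₁, ω₂₂ denote the conditional variances of (6) and (7). A minimal sketch with illustrative parameter values (β²δ² < 1, so the iteration contracts):

```python
# Illustrative parameter values (not from the text).
beta, delta = 0.8, 0.5
w11, w22 = 1.0, 2.0           # conditional variances omega_11, omega_22

vx = 0.0
for _ in range(200):
    vy = w22 + delta**2 * vx  # Var(Y) from the conditional of Y given X
    vx = w11 + beta**2 * vy   # Var(X) from the conditional of X given Y

psi = w11 + beta**2 * w22
theta = 1 - beta**2 * delta**2
print(vx, psi / theta)        # the iteration converges to psi/theta
```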
A similar result follows for the marginal distribution h(y) of Y:
Theorem 5 The marginal distribution h(y) consistent with the pair of conditional distributions (6) and (7), where 0 < 1 − β²δ², is

    h(y) = N(μ₂, (δ²ω₁₁ + ω₂₂)/(1 − β²δ²)).   (13)
4.2 Derivation of the joint distribution
Theorem 6 The joint distribution consistent with the pair of conditional distributions (6) and (7), where 0 < 1 − β²δ², is

    N(μ, Σ), where μ = (μ₁, μ₂)' and

        Σ = (1/θ) [ ψ   δψ ]
                  [ δψ  φ  ],   (14)

with ψ = ω₁₁ + β²ω₂₂, φ = δ²ω₁₁ + ω₂₂ and θ = 1 − β²δ².
Proof. We can assume μ₁ = 0, μ₂ = 0 without loss of generality, as in the proof of Theorem 4. Considering the pair (6) and (7), Theorem 4 shows that the consistent marginal distribution is g(x) = N(0, ψ/θ), with ψ and θ as in (12). The joint distribution is given by the product of this marginal and the conditional distribution (7). So we have

    p(x, y) = g(x)h(y|x) = (2π)⁻¹(ψ/θ)^(−1/2)ω₂₂^(−1/2) exp{−θx²/(2ψ) − (y − δx)²/(2ω₂₂)}.

After rearranging this, we get

    p(x, y) = (2πσ₁σ₂(1 − ρ²)^(1/2))⁻¹ exp{−[(x/σ₁)² − 2ρ(x/σ₁)(y/σ₂) + (y/σ₂)²]/(2(1 − ρ²))},   (15)

where σ₁² = ψ/θ, σ₂² = φ/θ, φ = δ²ω₁₁ + ω₂₂ and ρ = δ(ψ/φ)^(1/2). Under the assumption 0 < 1 − β²δ², the function (15) is the density function of the joint distribution (14) with μ₁ = 0, μ₂ = 0. ■
We also get the following result in a similar way.
Theorem 7 The joint distribution consistent with the pair of conditional distributions (6) and (7), where 0 < 1 − β²δ², is

    N(μ, Σ), where μ = (μ₁, μ₂)' and

        Σ = (1/θ) [ ψ   βφ ]
                  [ βφ  φ  ],   (16)

with ψ = ω₁₁ + β²ω₂₂, φ = δ²ω₁₁ + ω₂₂ and θ = 1 − β²δ²; the correlation coefficient is ρ = β(φ/ψ)^(1/2).
So far, we have obtained two families of joint distributions consistent with (6) and (7), namely (15) and (16). The regression coefficients of (15) are

    b_{x·y} = δψ/φ,   (17)
    b_{y·x} = δ,   (18)

and those of (16) are

    b_{x·y} = β,   (19)
    b_{y·x} = βφ/ψ.   (20)

The coefficient (17) may or may not be equal to β, and the coefficient (20) may or may not be equal to δ. In other words, the four parameters (β, δ, ω₁₁, ω₂₂) of (6) and (7) leave too many degrees of freedom to determine the parameters of the joint distribution uniquely. This is why we have obtained two families of joint distributions.
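Numerically, the two families really are distinct when δω₁₁ ≠ βω₂₂ (writing ω₁₁, ω₂₂ for the conditional variances of (6) and (7); the values below are illustrative):

```python
# Illustrative values with delta*w11 = 0.5 != beta*w22 = 1.6.
beta, delta = 0.8, 0.5
w11, w22 = 1.0, 2.0

psi = w11 + beta**2 * w22
phi = delta**2 * w11 + w22

rho_15 = delta * (psi / phi) ** 0.5   # correlation of family (15)
rho_16 = beta * (phi / psi) ** 0.5    # correlation of family (16)
b17 = delta * psi / phi               # regression of X on Y in family (15)
b20 = beta * phi / psi                # regression of Y on X in family (16)

print(rho_15, rho_16)   # the two correlations differ
print(b17, beta)        # here (17) does not equal beta
print(b20, delta)       # here (20) does not equal delta
```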
To make the two families of joint distributions of Theorems 6 and 7 coincide, we introduce another restriction in addition to the condition 0 < 1 − β²δ² already imposed.
Theorem 8 The joint distribution consistent with the pair of conditional distributions (6) and (7) under the assumptions

    0 < 1 − β²δ²,   (21)
    δω₁₁ = βω₂₂   (22)

is N(μ, Σ), where

    μ = (μ₁, μ₂)',   (23)

    Σ = (1/θ) [ ψ   δψ ]
              [ δψ  φ  ],   (24)

with ψ = ω₁₁ + β²ω₂₂, φ = δ²ω₁₁ + ω₂₂ and θ = 1 − β²δ².
Proof. Under the assumption (21), Theorems 6 and 7 hold. The further assumption (22) makes the covariance matrices of (14) and (16) have the common off-diagonal value

    δψ/θ = βφ/θ,

since δψ − βφ = δω₁₁(1 − βδ) − βω₂₂(1 − βδ) = (1 − βδ)(δω₁₁ − βω₂₂) = 0. This can be rearranged to give the matrix Σ of (24). ■
It is also possible to show, conversely, that provided the joint distribution is N(μ, Σ) with (23) and (24), the conditional distributions are (6) and (7).
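This converse can be checked numerically. In the sketch below (illustrative values, writing ω₁₁, ω₂₂ for the conditional variances; ω₂₂ is chosen so that the restriction δω₁₁ = βω₂₂ holds), the bivariate normal with the covariance matrix of Theorem 8 reproduces exactly the regression coefficients β, δ and the conditional variances of (6) and (7):

```python
import numpy as np

# Illustrative values; w22 enforces the restriction delta*w11 = beta*w22.
beta, delta, w11 = 0.8, 0.5, 1.0
w22 = delta * w11 / beta

theta = 1 - beta**2 * delta**2
psi = w11 + beta**2 * w22
phi = delta**2 * w11 + w22
Sigma = np.array([[psi, delta * psi],
                  [delta * psi, phi]]) / theta   # covariance matrix of Theorem 8

# Conditional moments of the bivariate normal N(0, Sigma).
b_xy = Sigma[0, 1] / Sigma[1, 1]                  # regression of X on Y
v_xy = Sigma[0, 0] - Sigma[0, 1]**2 / Sigma[1, 1]
b_yx = Sigma[0, 1] / Sigma[0, 0]                  # regression of Y on X
v_yx = Sigma[1, 1] - Sigma[0, 1]**2 / Sigma[0, 0]

print(b_xy, v_xy)   # recovers beta and w11
print(b_yx, v_yx)   # recovers delta and w22
```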
Theorem 8 gives a sufficient condition for the distributions (6) and (7) to be consistent with the bivariate normal distribution N(μ, Σ) defined by (23) and (24). We now summarize the discussion of the bivariate case:
(i) If the conditional distributions (6) and (7) are derived from the joint distribution N(μ, Σ) with (23) and (24), then the conditions 1 > βδ ≧ 0 and δω₁₁ = βω₂₂ hold.

(ii) If the conditions 0 < 1 − β²δ² and δω₁₁ = βω₂₂ hold, then the conditional distributions (6) and (7) are derived from the joint distribution N(μ, Σ) with (23) and (24).

(iii) The condition 1 > βδ ≧ 0 implies 0 < 1 − β²δ², but not the converse: for example, βδ = −1/2 gives 1 − β²δ² = 3/4 > 0 although βδ < 0.
5 Multivariate Normal Distribution
We here turn to the general multivariate case and consider an s-dimensional random vector Z whose probability density function is

    f(z) = (2π)^(−s/2) |Σ|^(−1/2) exp{−(z − μ)'Σ⁻¹(z − μ)/2}.

That is, Z is assumed to be distributed as N(μ, Σ), with a mean vector μ every element of which lies in the interval (−∞, +∞), and with a positive definite (symmetric) covariance matrix Σ.
According to the partition of Z as Z' = [X', Y'] = [(1 × p), (1 × q)], we partition the parameters,

    μ = (μ₁', μ₂')',   Σ = [ Σ₁₁  Σ₁₂ ]
                           [ Σ₂₁  Σ₂₂ ],

where Σ₁₂ = Σ₂₁'. Letting f(x|y) be the conditional density function of X given Y = y, and h(y|x) the one of Y given X = x, we have 2)

    f(x|y) = N(μ₁ + Σ₁₂Σ₂₂⁻¹(y − μ₂), Σ₁₁ − Σ₁₂Σ₂₂⁻¹Σ₂₁),
    h(y|x) = N(μ₂ + Σ₂₁Σ₁₁⁻¹(x − μ₁), Σ₂₂ − Σ₂₁Σ₁₁⁻¹Σ₁₂).
2) See, for instance, T. W. Anderson (1958): An Introduction to Multivariate Statistical Analysis, New York: John Wiley and Sons.
Letting the regression coefficients be B = Σ₁₂Σ₂₂⁻¹ and Δ = Σ₂₁Σ₁₁⁻¹, we have a constraint on the coefficients,

    BΣ₂₂ = (ΔΣ₁₁)' (= Σ₁₂).
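The constraint — that BΣ₂₂ and (ΔΣ₁₁)' both equal Σ₁₂ — can be verified numerically for any positive definite Σ. A minimal sketch with a randomly generated covariance matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
p, q = 2, 3
M = rng.standard_normal((p + q, p + q))
S = M @ M.T + (p + q) * np.eye(p + q)      # a positive definite covariance
S11, S12 = S[:p, :p], S[:p, p:]
S21, S22 = S[p:, :p], S[p:, p:]

B = S12 @ np.linalg.inv(S22)               # regression coefficients of X on Y
D = S21 @ np.linalg.inv(S11)               # regression coefficients of Y on X

# Both B S22 and (D S11)' equal S12, so they coincide.
print(np.max(np.abs(B @ S22 - (D @ S11).T)))
```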
5.1 The problem
Let us now consider random vectors X = (p × 1) and Y = (q × 1). Without loss of generality, it is assumed that p ≦ q. Suppose in general that the normal conditional distributions of X and Y are, respectively,

    f(x|y) = N(μ₁ + B(y − μ₂), Ω₁₁),   (25)
    h(y|x) = N(μ₂ + Δ(x − μ₁), Ω₂₂).   (26)

The problem now is to find the (p + q)-dimensional joint distribution, or its joint density function, and the marginal density functions g(x) and h(y). In particular, we try to find conditions under which the joint distribution is multivariate normal.
In a similar way to the derivation of (10), we have an integral equation

    g(x) = ∫ H(x, w)g(w) dw,   (27)

where

    H(x, w) = ∫∙∙∙∫ f(x|y)h(y|w) dy,

with ∫∙∙∙∫ dy denoting a q-dimensional integration. Now, the problem is to find first a solution g(x) of the integral equation (27), and second the joint distribution, which is given as the product g(x)h(y|x) of this g(x) and the conditional distribution (26). It will be seen that the problem in the general multivariate case requires cumbersome calculations; the preparation for that matter is given in the next section.
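By analogy with the scalar case, a candidate covariance for the solution g(x) satisfies a matrix fixed point: Var(X) = Ω₁₁ + B Var(Y) B', Var(Y) = Ω₂₂ + Δ Var(X) Δ'. A sketch (the symbols B, D, O1, O2 stand for the B, Δ, Ω₁₁, Ω₂₂ of (25) and (26); the numerical values are illustrative, scaled so that the iteration contracts):

```python
import numpy as np

# Hypothetical small example: B, D play the roles of the regression
# coefficient matrices of (25) and (26); O1, O2 the conditional covariances.
rng = np.random.default_rng(2)
p, q = 2, 3
B = 0.2 * rng.standard_normal((p, q))
D = 0.2 * rng.standard_normal((q, p))
O1, O2 = np.eye(p), np.eye(q)

Sx = np.zeros((p, p))
for _ in range(500):
    Sy = O2 + D @ Sx @ D.T       # covariance of Y given the candidate Var(X)
    Sx = O1 + B @ Sy @ B.T       # updated covariance of X

# The limit solves Sx = (O1 + B O2 B') + (B D) Sx (B D)'.
resid = np.max(np.abs(Sx - (O1 + B @ O2 @ B.T) - (B @ D) @ Sx @ (B @ D).T))
print(resid)
```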
6 Preparation 2
Some of the propositions given here may be fairly well known, so we do not provide proofs for them.
6.1 Matrix algebra
Lemma 9 For two non-singular matrices A = (m × m) and C = (n × n), and two rectangular matrices B = (m × n) and D = (n × m), define
then we have
We will make use of the following propositions concerning characteristic roots of matrices. The characteristic roots in the propositions may be multiple roots and/or zeros.
Lemma 10 (a) If the characteristic roots of a square matrix A = (n × n) are λ₁, ..., λₙ, then the characteristic roots of A' are λ₁, ..., λₙ.

(b) For two rectangular matrices A = (m × n) and B = (n × m), where m ≦ n, let the characteristic roots of AB = (m × m) be λ₁, ..., λₘ (including multiple roots); then the characteristic roots of BA = (n × n) are λ₁, ..., λₘ and n − m zeros.
Proof. (a) See Dhrymes (1978, p. 49, Proposition 42, (a)). 3)

(b) See Dhrymes (1978, p. 51, Corollary 5). ■
Lemma 11 For the regression coefficients B = (p × q) and Δ = (q × p) of the conditional distributions (25) and (26), let the characteristic roots of the product BΔ = (p × p) be λ₁, ..., λₚ (including multiple roots); then we have:

(a) The characteristic roots of Δ'B' = (p × p) are λ₁, ..., λₚ.

(b) The characteristic roots of ΔB = (q × q) are λ₁, ..., λₚ and q − p zeros.

(c) The characteristic roots of B'Δ' = (q × q) are λ₁, ..., λₚ and q − p zeros.
Proof. (a), (b) and (c) are applications of Lemma 10. ■
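Lemma 10(b), and hence Lemma 11, can be illustrated numerically: for random A = (m × n) and B = (n × m) with m ≦ n, the matrix BA has the characteristic roots of AB together with n − m zeros. A sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 2, 4
A = rng.standard_normal((m, n))
B = rng.standard_normal((n, m))

ev_AB = np.linalg.eigvals(A @ B)   # m characteristic roots
ev_BA = np.linalg.eigvals(B @ A)   # the same roots plus n - m zeros

# The n - m smallest roots of BA (in modulus) are numerically zero,
# and the remaining ones coincide with the roots of AB.
order = np.argsort(np.abs(ev_BA))
zeros, rest = ev_BA[order[:n - m]], ev_BA[order[n - m:]]
print(np.max(np.abs(zeros)))
print(np.sort_complex(rest))
print(np.sort_complex(ev_AB))
```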
6.2 Matrix equation
It will be shown in Section 7 that the problem turns out to be that of solving an integral equation, and further that of solving a matrix equation. We here deal with the matrix equation in advance. The matrices appearing in this subsection are assumed to be transformable into a diagonal form. Section 8 discusses more general cases, where the matrices can only be transformed into a block-diagonal form.
Now, let us consider two square matrices U = (m × m) and V = (n × n), and a rectangular matrix W = (m × n). We assume that U and V can be diagonalized (real and symmetric, for example), that the characteristic roots of U are λ₁, ..., λₘ (distinct roots), and that those of V are ν₁, ..., νₙ (distinct roots). Then, we get the following proposition concerning the matrix equation (28), where X = (m × n) is an unknown matrix.
Lemma 12 Concerning the equation (28), we have the propositions (a), (b), (c) and (d) below:
(a) Under the assumption
3) P. J. Dhrymes (1978): Mathematics for Econometrics, New York: Springer-Verlag.