Random Variables - Tohoku University Institutional Repository (TOUR)

5.1 Random variables and their distributions

A random variable is intuitively a variable whose values appear along with a certain probability law. A typical example appears in random sampling. Consider a variable whose values are obtained from the measurement of samples chosen randomly from a population. Then the variable obeys a certain probability law arising from random sampling, so it is a random variable.

To be slightly more precise, a random variable is a variable $X$ for which we may ask the probability $P(X \le x)$ that $X$ takes a value less than or equal to $x \in \mathbb{R}$. However, for $P(X \le x)$ to be logically meaningful we need to prepare a probability space $(\Omega, \mathcal{F}, P)$ before introducing a random variable. In the above-mentioned example of random sampling, we set $\Omega$ to be the population and define the probability by $P(A) = |A| / |\Omega|$ in accordance with combinatorial probability. Our variable $X$ gives a definite value for each individual $\omega \in \Omega$. In other words, $X : \Omega \to \mathbb{R}$ is a function. Then $P(X \le x)$ is defined by

\[ P(X \le x) = \frac{|\{X \le x\}|}{|\Omega|}, \qquad x \in \mathbb{R}, \tag{5.1} \]

where $\{X \le x\}$ is a short-hand notation for $\{\omega \in \Omega \,;\, X(\omega) \le x\}$. Abstracting the above argument, we give the following formal definition.

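Definition 5.1. Let $(\Omega, \mathcal{F}, P)$ be a probability space. A function $X : \Omega \to \mathbb{R}$ is called a random variable if $\{X \le x\} = \{\omega \in \Omega \,;\, X(\omega) \le x\}$ is an event in $\mathcal{F}$ for all $x \in \mathbb{R}$. Moreover, the function

\[ F(x) = F_X(x) = P(X \le x), \qquad x \in \mathbb{R}, \tag{5.2} \]

is called the distribution function of $X$.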

Example 5.2. Tossing a coin, we set $X = 1$ if heads occurs and $X = 0$ if tails occurs. Then $X$ becomes a random variable such that

\[ P(X = 0) = P(X = 1) = \frac{1}{2}. \]

The distribution function is given by

\[ F_X(x) = \begin{cases} 0, & x < 0, \\ 1/2, & 0 \le x < 1, \\ 1, & x \ge 1. \end{cases} \]
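The step-shaped distribution function of Example 5.2 can be computed directly from the point probabilities. The following is a minimal sketch; the helper name `distribution_function` and the dictionary representation of the probabilities are illustrative assumptions, not part of the text.

```python
from fractions import Fraction

def distribution_function(pmf, x):
    """Return F(x) = P(X <= x) for a discrete X given as {value: probability}."""
    return sum((p for a, p in pmf.items() if a <= x), Fraction(0))

# Coin toss of Example 5.2: X = 1 for heads, X = 0 for tails.
coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}

# Evaluate F on both sides of, and exactly at, the two jump points.
for x in (-0.5, 0, 0.5, 1, 2):
    print(x, distribution_function(coin, x))
```

Exact fractions are used so that the three branches of the piecewise formula are reproduced without rounding.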

Rolling dice being similar, the investigation is left to the readers in the following exercise.

[Fig. 4.4. Graph of $P(D \mid T^{+})$ for $0 \le d \le 1$.]

Exercise 5.3. Consider the experiment of rolling a fair die. Let $X$ be the random variable which assigns 1 if the number that appears is even and 0 if the number that appears is odd. Find $P(X = 1)$ and $P(X = 0)$.

Exercise 5.4. Consider the experiment of tossing a coin three times. Let $X$ be the number of heads obtained. We assume that the tosses are independent and the probability of a head is $p$. Find the probabilities $P(X = 0)$, $P(X = 1)$, $P(X = 2)$ and $P(X = 3)$.

Exercise 5.5. Suppose that a fair die is rolled seven times. Find the probability that 1 and 2 dots appear twice each; 3, 4, and 5 dots once each; and 6 dots not at all.

Example 5.6. Let $\Omega$ be the set of players of Team A. Let $X$ be the height of a player randomly chosen from $\Omega$. Then the distribution function of $X$ is given by

\[ F_X(x) = P(X \le x) = \frac{|\{\omega \in \Omega \,;\, X(\omega) \le x\}|}{|\Omega|}. \tag{5.3} \]

This is essentially the cumulative relative frequency of the heights of Team A; see Sect. 2.1.

Theorem 5.7. Let $X$ be a random variable and $F(x) = F_X(x)$ its distribution function. Then we have

(i) $\lim_{x \to -\infty} F(x) = 0$ and $\lim_{x \to +\infty} F(x) = 1$;

(ii) if $x_1 \le x_2$, then $F(x_1) \le F(x_2)$;

(iii) $\lim_{\epsilon \to +0} F(x + \epsilon) = F(x)$, namely, $F(x)$ is right-continuous.

Theorem 5.8. Let $X$ be a random variable and $F(x) = F_X(x)$ its distribution function. Then we have

\[ P(X = x) = F_X(x) - \lim_{\epsilon \to +0} F_X(x - \epsilon), \qquad x \in \mathbb{R}. \]

For the proofs see the standard textbooks. We only mention here that countable operations of sets are required in the proofs.
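Theorem 5.8 can be illustrated numerically: for a step-shaped distribution function the left limit is reached exactly once $\epsilon$ is smaller than the gap between neighbouring values. The sketch below assumes a hypothetical discrete variable with values 0, 1, 2 and probabilities 1/4, 1/2, 1/4, chosen only for illustration; the names `F` and `jump` are likewise assumptions.

```python
from fractions import Fraction

# Hypothetical point probabilities, not taken from the text.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def F(x):
    """Distribution function F(x) = P(X <= x)."""
    return sum((p for a, p in pmf.items() if a <= x), Fraction(0))

def jump(x, eps=Fraction(1, 1000)):
    """F(x) - F(x - eps); equals P(X = x) once eps is below the gap between values."""
    return F(x) - F(x - eps)

print([jump(x) for x in (0, 1, 2)])  # jumps at the possible values
print(jump(Fraction(1, 2)))          # no jump between possible values
```

The same function `F` also exhibits properties (i) and (ii) of Theorem 5.7: it is 0 far to the left, 1 far to the right, and never decreases.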

Exercise 5.9. Verify the properties (i)–(iii) in Theorem 5.7 for the distribution function in Example 5.2.

Definition 5.10. A random variable $X$ is called discrete if the distribution function $F_X(x)$ increases only by jumps. A random variable $X$ is called continuous if the distribution function $F_X(x)$ is continuous. (Note that there are random variables that are neither discrete nor continuous.)

For a discrete random variable $X$ the jump points of $F_X(x)$ are at most countable, say, $a_1, a_2, \dots$. The jump at $x = a_i$ is denoted by $p_i > 0$. Then we have

\[ p_i = P(X = a_i) = F_X(a_i) - \lim_{\epsilon \to +0} F_X(a_i - \epsilon), \qquad \sum_i p_i = 1. \]

Thus, with a discrete random variable $X$, we may associate the possible values $a_i$ and their probabilities $p_i$. It is convenient to allow $p_i = 0$ (in that case $a_i$ is not a possible value, though). The random variables in Examples 5.2 and 5.6 are discrete.

For a continuous random variable $X$, we have $P(X = x) = 0$ for all $x$. This is an immediate consequence of Theorem 5.8. We now understand why we needed to consider the probability of the events $\{X \le x\}$ instead of $\{X = x\}$ for introducing a random variable.

Example 5.11. Let $X$ be the coordinate of a point chosen from an interval $\Omega = [0, L]$, $L > 0$, in such a way that every point of $\Omega$ is equally likely to be chosen; see Example 4.4. We are interested in the distribution function $F_X(x)$. Since the event $\{X \le x\}$ never occurs if $x < 0$, we have $F_X(x) = 0$ for $x < 0$. Since the event $\{X \le x\}$ certainly occurs if $x > L$, we have $F_X(x) = 1$ for $x > L$. For $0 \le x \le L$ we have

\[ P(X \le x) = \frac{|[0, x]|}{|[0, L]|} = \frac{x}{L}. \]

Consequently, we have

\[ F_X(x) = \begin{cases} 0, & x < 0, \\ x/L, & 0 \le x \le L, \\ 1, & x > L. \end{cases} \tag{5.4} \]

Since $F_X(x)$ is obviously a continuous function, the random variable $X$ is continuous.

Example 5.12. Cutting a stick of length $L$ at a randomly chosen point, we obtain two fragments. We are interested in the length of the shorter fragment, which is denoted by $S$. The stick is modeled by an interval $\Omega = [0, L]$ in the real line, and we let $X$ denote the coordinate of a randomly chosen point. Then $X$ becomes a random variable as discussed in Example 5.11. Since $\{S \le x\}$ never occurs for $x < 0$ and $\{S \le x\}$ certainly occurs if $x > L/2$, we have $F_S(x) = 0$ for $x < 0$ and $F_S(x) = 1$ for $x > L/2$. Suppose that $0 \le x \le L/2$. Then we have

\[ P(S \le x) = P(0 \le X \le x) + P(L - x \le X \le L) = \frac{x}{L} + \frac{x}{L} = \frac{2x}{L}. \]

Summing up, we have

\[ F_S(x) = \begin{cases} 0, & x < 0, \\ 2x/L, & 0 \le x \le L/2, \\ 1, & x > L/2. \end{cases} \tag{5.5} \]

Thus, $S$ is a continuous random variable. We may alternatively derive the distribution function $F_S(x)$ by using $S = \min\{X, L - X\}$.
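The relation $S = \min\{X, L - X\}$ also lends itself to a quick simulation check of (5.5). The sketch below is only an illustration under stated assumptions: the stick length `L = 2.0`, the sample size, the fixed seed and the helper names are all chosen here and do not come from the text.

```python
import random

def shorter_fragment(L, rng):
    """Cut a stick of length L at a uniform point and return the shorter length."""
    x = rng.uniform(0.0, L)
    return min(x, L - x)

def empirical_cdf(samples, x):
    """Fraction of samples that are <= x."""
    return sum(1 for s in samples if s <= x) / len(samples)

L = 2.0                      # hypothetical stick length
rng = random.Random(0)       # fixed seed so the run is repeatable
samples = [shorter_fragment(L, rng) for _ in range(100_000)]

# Compare the empirical distribution function with 2x/L from (5.5).
for x in (0.25, 0.5, 0.75):
    print(x, empirical_cdf(samples, x), 2 * x / L)
```

The two printed columns should agree up to sampling noise, which shrinks as the number of samples grows.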

Example 5.13. Let $\Omega$ be a disc of radius $R > 0$. Choose a point randomly from $\Omega$ and let $X$ be the distance between the chosen point and the center of the disc. Then $X$ becomes a random variable. Obviously, $F_X(x) = 0$ for $x < 0$ and $F_X(x) = 1$ for $x > R$. Suppose that $0 \le x \le R$. Since the event $\{X \le x\}$ corresponds to the concentric disc with radius $x$, we have

\[ P(X \le x) = \frac{\pi x^2}{\pi R^2} = \frac{x^2}{R^2}, \]

where the probability is calculated as in Example 4.5. Consequently, we have

\[ F_X(x) = \begin{cases} 0, & x < 0, \\ x^2/R^2, & 0 \le x \le R, \\ 1, & x > R. \end{cases} \tag{5.6} \]

Thus, $X$ is a continuous random variable.

In general, the distribution function $F_X(x)$ of a continuous random variable $X$ is continuous by definition but not necessarily differentiable. If $F_X(x)$ is piecewise differentiable, the derivative

\[ f_X(x) = F_X'(x) = \frac{d}{dx} F_X(x) \]

is called the (probability) density function of $X$. It then follows from the fundamental theorem of calculus that

\[ P(X \le x) = F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt \]

and

\[ P(a \le X \le b) = \int_a^b f_X(x)\,dx, \qquad a < b. \]

Since the density function gives a probability only through integration, if $F_X(x)$ is not differentiable at $x = a$, the value of $f_X(x)$ at $x = a$ may be assigned arbitrarily. Continuous random variables with density functions, together with discrete random variables, cover quite a wide range of applications.
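The statement that a density yields probabilities only through integration can be made concrete with a simple numerical quadrature. The sketch below assumes a hypothetical density $f(t) = 3t^2$ on $[0, 1]$ (so that $F(x) = x^3$ there); this density and the helper names are illustrative choices, not taken from the text.

```python
def density(t):
    """Hypothetical density f(t) = 3 t^2 on [0, 1], zero elsewhere."""
    return 3.0 * t * t if 0.0 <= t <= 1.0 else 0.0

def integrate(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def prob_between(a, b):
    """P(a <= X <= b) as the integral of the density."""
    return integrate(density, a, b)

print(prob_between(0.0, 0.5))   # compare with F(0.5) - F(0) = 0.5**3
print(prob_between(0.2, 0.8))   # compare with 0.8**3 - 0.2**3
```

Changing the value of `density` at a single point leaves both results unchanged, which is the sense in which the value at a non-differentiable point may be assigned arbitrarily.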

Exercise 5.14. Define a function $F(x)$ by

\[ F(x) = \begin{cases} 0, & x < 0, \\ x + \dfrac{1}{2}, & 0 \le x \le \dfrac{1}{2}, \\ 1, & x \ge \dfrac{1}{2}. \end{cases} \]

Verify the properties (i)–(iii) in Theorem 5.7.

Exercise 5.15. Let $X$ be a random variable whose distribution function is the function $F(x)$ described in Exercise 5.14. Find the following probabilities:

\[ P\left(X \le \frac{1}{4}\right), \qquad P\left(0 < X \le \frac{1}{4}\right), \qquad P(X = 0). \]

(Note that $F(x)$ is not continuous at $x = 0$.)

Exercise 5.16. Determine the constants $a$ and $b$ such that

\[ F(x) = \begin{cases} 1 - a e^{-x/b}, & x \ge 0, \\ 0, & x < 0, \end{cases} \]

is the distribution function of a random variable.

Definition 5.17. The mean or expectation of a random variable $X$ is defined by

\[ E[X] = \mu_X = \int_{\Omega} X(\omega)\,P(d\omega), \]

where the right-hand side is the so-called Lebesgue integral.

For practical problems we consider two cases. For a discrete random variable $X$ with possible values $a_1, a_2, \dots$ the mean becomes

\[ E[X] = \mu_X = \sum_i a_i P(X = a_i) = \sum_x x P(X = x). \]

In the rightmost expression, which is just a convention, the sum is taken over all real numbers $x$; in fact, since $P(X = x) = 0$ except for at most countably many $x = a_i$, the expression reduces to a usual sum. For a continuous random variable with density function $f_X(x)$ the mean becomes

\[ E[X] = \mu_X = \int_{-\infty}^{+\infty} x f_X(x)\,dx. \]

Many important statistics of a random variable $X$ are defined in terms of the mean. For example, the variance of $X$ is defined by

\[ V[X] = \sigma_X^2 = E[(X - E[X])^2] = E[X^2] - E[X]^2. \]

Moreover, the central moment of degree $k$ is defined by

\[ m_k[X] = E[(X - E[X])^k]. \]
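For a discrete random variable these definitions translate directly into finite sums. The sketch below uses a fair die as an assumed illustration (it is not one of the worked examples above) and checks that the two expressions for the variance agree; the function names are hypothetical.

```python
from fractions import Fraction

def mean(pmf):
    """E[X] = sum of a_i * P(X = a_i) for pmf given as {value: probability}."""
    return sum(a * p for a, p in pmf.items())

def central_moment(pmf, k):
    """m_k[X] = E[(X - E[X])^k]."""
    mu = mean(pmf)
    return sum((a - mu) ** k * p for a, p in pmf.items())

def variance(pmf):
    """V[X] as the central moment of degree 2."""
    return central_moment(pmf, 2)

# Assumed illustration: a fair die with faces 1..6.
die = {a: Fraction(1, 6) for a in range(1, 7)}
mu = mean(die)
second_moment = sum(a * a * p for a, p in die.items())

print(mu, variance(die), second_moment - mu ** 2)
```

The last two printed values coincide, reflecting the identity $E[(X - E[X])^2] = E[X^2] - E[X]^2$.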

Example 5.18. Let $X$ be the random variable introduced in Example 5.11. It is a continuous random variable since the distribution function $F_X(x)$ in (5.4) is continuous. The density function $f_X(x)$ is obtained by differentiating $F_X(x)$ as follows:

\[ f_X(x) = \begin{cases} 1/L, & 0 \le x \le L, \\ 0, & \text{otherwise}. \end{cases} \tag{5.7} \]

Then the mean of $X$ is given by

\[ E[X] = \int_{-\infty}^{+\infty} x f_X(x)\,dx = \int_0^L x\,\frac{1}{L}\,dx = \frac{L}{2}. \]

Similarly, we have

\[ E[X^2] = \int_{-\infty}^{+\infty} x^2 f_X(x)\,dx = \int_0^L x^2\,\frac{1}{L}\,dx = \frac{L^2}{3}. \]

Hence the variance is given by

\[ V[X] = E[X^2] - E[X]^2 = \frac{L^2}{3} - \left(\frac{L}{2}\right)^2 = \frac{L^2}{12}. \]

The probability distribution defined by the density function (5.7) is called the uniform distribution on $[0, L]$.

Accordingly, the random variable $S$ introduced in Example 5.12 obeys the uniform distribution on $[0, L/2]$.
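The closed forms $L/2$ and $L^2/12$ for the uniform distribution can be cross-checked by integrating against the density (5.7) numerically. This is a rough sketch; the value `L = 3.0`, the midpoint rule and the helper names are assumptions made for the illustration.

```python
def integrate(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

L = 3.0  # hypothetical interval length

def f_X(x):
    """Density (5.7) of the uniform distribution on [0, L]."""
    return 1.0 / L if 0.0 <= x <= L else 0.0

# The density vanishes outside [0, L], so it suffices to integrate there.
mean = integrate(lambda x: x * f_X(x), 0.0, L)
second = integrate(lambda x: x * x * f_X(x), 0.0, L)
variance = second - mean ** 2

print(mean, L / 2)
print(variance, L ** 2 / 12)
```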

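Example 5.19. For $\mu \in \mathbb{R}$ and $\sigma > 0$, the normal distribution $N(\mu, \sigma^2)$ is defined by the density function

\[ f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{(x - \mu)^2}{2\sigma^2} \right\}; \]

see also Example 2.19. In particular, $N(0, 1)$ is called the standard normal distribution. If a random variable $X$ obeys $N(0, 1)$, the distribution function is given by

\[ F_X(x) = P(X \le x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\,dt. \tag{5.8} \]

Note that the right-hand side cannot be expressed in terms of elementary functions. Instead, we define the (Gauss) error function by

\[ \operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\,dt. \]

Then (5.8) becomes

\[ F_X(x) = \frac{1}{2}\left\{ 1 + \operatorname{erf}\left( \frac{x}{\sqrt{2}} \right) \right\}. \]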
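The error-function form of (5.8) is convenient in practice because `math.erf` is available in the Python standard library. The sketch below evaluates the standard normal distribution function that way; the extension to $N(\mu, \sigma^2)$ by standardizing $(x - \mu)/\sigma$ is a standard fact that is not derived in the text, and the function names and sample parameters are assumptions.

```python
import math

def standard_normal_cdf(x):
    """F_X(x) for N(0, 1), written with the error function as in the text."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    """Distribution function of N(mu, sigma^2) by standardizing (assumes sigma > 0)."""
    return standard_normal_cdf((x - mu) / sigma)

print(standard_normal_cdf(0.0))
print(standard_normal_cdf(1.96))
# Probability that N(10, 2^2) falls within one standard deviation of its mean.
print(normal_cdf(12.0, 10.0, 2.0) - normal_cdf(8.0, 10.0, 2.0))
```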

Exercise 5.20. Let $X$ be a discrete random variable such that

\[ P(X = -1) = P(X = 0) = P(X = 1) = \frac{1}{3}. \]

Find the mean and variance of $X$.

Exercise 5.21. Let $X$ be a continuous random variable whose density function is given by

\[ f_X(x) = \begin{cases} 2x, & 0 < x < 1, \\ 0, & \text{otherwise}. \end{cases} \]

Find the mean and variance of $X$.

Exercise 5.22. Let $X$ be the random variable introduced in Example 5.13. Find the density function of $X$ and show that $E[X] = 2R/3$ and $V[X] = R^2/18$.

Exercise 5.23. Prove that the moment of degree $2m$ of the standard normal distribution $N(0, 1)$ is given by

\[ \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} x^{2m} e^{-x^2/2}\,dx = \frac{(2m)!}{2^m\, m!}, \qquad m = 1, 2, \dots. \]

5.2 Joint distributions

Let $X_1, X_2, \dots, X_n$ be random variables defined on a probability space $(\Omega, \mathcal{F}, P)$. There are two points of view. One is to regard them as a sequence of random variables. This is suitable for the study of asymptotic properties and limit behavior. The other is to regard them as a random vector $(X_1, X_2, \dots, X_n)$ in $n$-dimensional space. Since the essence is the same, we switch between the two notations as convenient. The statistics of finitely many random variables $X_1, X_2, \dots, X_n$ are described by the joint distribution function defined by

\[ F_{X_1 X_2 \dots X_n}(x_1, x_2, \dots, x_n) = P(X_1 \le x_1, X_2 \le x_2, \dots, X_n \le x_n), \qquad x_1, x_2, \dots, x_n \in \mathbb{R}, \]

where the right-hand side is the probability of the product event $\bigcap_{i=1}^{n} \{X_i \le x_i\}$.

If $X_1, \dots, X_n$ are discrete random variables, it is sufficient and more convenient to deal with the joint probability of the form

\[ P(X_1 = x_1, X_2 = x_2, \dots, X_n = x_n), \]

where $x_1, x_2, \dots, x_n$ run over all possible values of $X_1, X_2, \dots, X_n$, respectively. In that case the random points $(X_1, X_2, \dots, X_n)$ are scattered in $n$-dimensional space in a discrete manner. We are also interested in a particular type of continuous random vector, where the joint distribution function is given by the integral

\[ F_{X_1 X_2 \dots X_n}(x_1, x_2, \dots, x_n) = \int_{-\infty}^{x_1} dt_1 \int_{-\infty}^{x_2} dt_2 \cdots \int_{-\infty}^{x_n} dt_n\, f(t_1, t_2, \dots, t_n) \]

for $x_1, x_2, \dots, x_n \in \mathbb{R}$. In that case, the integrand $f(x_1, x_2, \dots, x_n)$ is called the joint density function of $X_1, X_2, \dots, X_n$ and denoted by $f_{X_1 X_2 \dots X_n}(x_1, x_2, \dots, x_n)$.

Exercise 5.24. Consider an experiment of tossing a fair coin twice. Let $(X, Y)$ be a 2-dimensional random vector, where $X$ is the number of heads that occur in the two tosses and $Y$ is the number of tails that occur in the two tosses. Find $P(X = 2, Y = 0)$, $P(X = 0, Y = 1)$ and $P(X = 1, Y = 1)$.

Let $(X_1, \dots, X_n)$ be an $n$-dimensional random vector such that each $X_j$ is a discrete random variable. Then we have

\[ P(X_1 = x) = \sum_{x_2, \dots, x_n} P(X_1 = x, X_2 = x_2, \dots, X_n = x_n) \]

and the mean of $X_1$ is given by

\[ \mu_{X_1} = E[X_1] = \sum_x x P(X_1 = x) = \sum_{x_1, x_2, \dots, x_n} x_1 P(X_1 = x_1, X_2 = x_2, \dots, X_n = x_n). \]

Similarly,

\[ \mu_{X_j} = E[X_j] = \sum_{x_1, x_2, \dots, x_n} x_j P(X_1 = x_1, X_2 = x_2, \dots, X_n = x_n). \]
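The summation over the remaining coordinates can be sketched in a few lines of code. The joint probabilities below are a hypothetical two-variable table invented for this illustration, and the helper names are assumptions; note that the code indexes coordinates from 0, whereas the text numbers the variables from 1.

```python
from collections import defaultdict
from fractions import Fraction as Fr

# Hypothetical joint probabilities P(X = x, Y = y); not taken from the text.
joint = {
    (0, 0): Fr(1, 8), (0, 1): Fr(1, 4), (0, 2): Fr(1, 8),
    (1, 0): Fr(1, 4), (1, 1): Fr(1, 8), (1, 2): Fr(1, 8),
}

def marginal(joint, j):
    """P(X_j = x), obtained by summing the joint probabilities over the other coordinates."""
    out = defaultdict(Fr)
    for point, p in joint.items():
        out[point[j]] += p
    return dict(out)

def mean_of_component(joint, j):
    """E[X_j] computed directly from the joint probabilities."""
    return sum(point[j] * p for point, p in joint.items())

p_x = marginal(joint, 0)
p_y = marginal(joint, 1)
print(p_x, p_y)
# The mean agrees whether computed from the joint table or from the marginal.
print(mean_of_component(joint, 1), sum(y * p for y, p in p_y.items()))
```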

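If $(X_1, \dots, X_n)$ admits a joint density function $f_{X_1 X_2 \dots X_n}(x_1, x_2, \dots, x_n)$, we have

\[ f_{X_1}(x) = \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} f_{X_1 X_2 \dots X_n}(x, x_2, \dots, x_n)\,dx_2 \cdots dx_n, \]

and the mean of $X_1$ is given by

\[ \mu_{X_1} = E[X_1] = \int_{-\infty}^{+\infty} x f_{X_1}(x)\,dx = \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} x_1 f_{X_1 X_2 \dots X_n}(x_1, x_2, \dots, x_n)\,dx_1 dx_2 \cdots dx_n. \]

Similarly,

\[ \mu_{X_j} = E[X_j] = \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} x_j f_{X_1 X_2 \dots X_n}(x_1, x_2, \dots, x_n)\,dx_1 dx_2 \cdots dx_n. \]

Moreover, the higher-order statistics are defined by means of $E[X_1^{p_1} \cdots X_n^{p_n}]$. For example, $E[X_i^2]$ is the moment of 2nd order and $E[X_i X_j]$ is a mixed moment of 2nd order. For discrete random variables we have

\[ E[X_j X_k] = \sum_{x_1, x_2, \dots, x_n} x_j x_k P(X_1 = x_1, X_2 = x_2, \dots, X_n = x_n) \]

and for continuous random variables with a joint density function we have

\[ E[X_j X_k] = \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} x_j x_k f_{X_1 X_2 \dots X_n}(x_1, x_2, \dots, x_n)\,dx_1 dx_2 \cdots dx_n. \]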

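Definition 5.25. The covariance of two random variables $X$ and $Y$ is defined by

\[ \sigma_{XY} = \operatorname{Cov}(X, Y) = E[(X - E[X])(Y - E[Y])] = E[XY] - E[X]E[Y]. \tag{5.9} \]

The correlation coefficient of $X$ and $Y$ is defined by

\[ \rho_{XY} = \frac{\operatorname{Cov}(X, Y)}{\sqrt{V[X]}\,\sqrt{V[Y]}} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}, \tag{5.10} \]

where $\sigma_X = \sqrt{V[X]}$ and $\sigma_Y = \sqrt{V[Y]}$ are the standard deviations of $X$ and $Y$, respectively.

For random variables $X_1, \dots, X_n$, the matrix $\Sigma = [\sigma_{jk}]$ with

\[ \sigma_{jj} = \sigma_{X_j}^2 = V[X_j], \qquad \sigma_{jk} = \sigma_{X_j X_k} = \operatorname{Cov}(X_j, X_k) \]

is called the variance-covariance matrix.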
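As an illustration of (5.9) and (5.10), the variance-covariance matrix and the correlation coefficient can be computed from a table of joint probabilities. The table below is hypothetical and the helper names are assumptions made for this sketch; coordinates are indexed from 0 in the code.

```python
import math
from fractions import Fraction as Fr

# Hypothetical joint probabilities P(X = x, Y = y); not taken from the text.
joint = {
    (0, 0): Fr(1, 8), (0, 1): Fr(1, 4), (0, 2): Fr(1, 8),
    (1, 0): Fr(1, 4), (1, 1): Fr(1, 8), (1, 2): Fr(1, 8),
}

def expect(joint, g):
    """E[g(X_1, ..., X_n)] as a sum over the joint probabilities."""
    return sum(g(point) * p for point, p in joint.items())

def covariance_matrix(joint, n):
    """Matrix [sigma_jk] with sigma_jk = E[X_j X_k] - E[X_j] E[X_k]."""
    mu = [expect(joint, lambda pt, j=j: pt[j]) for j in range(n)]
    return [[expect(joint, lambda pt, j=j, k=k: pt[j] * pt[k]) - mu[j] * mu[k]
             for k in range(n)] for j in range(n)]

sigma = covariance_matrix(joint, 2)
# Correlation coefficient (5.10): covariance divided by both standard deviations.
rho = sigma[0][1] / math.sqrt(sigma[0][0] * sigma[1][1])

print(sigma)
print(rho)
```

The diagonal entries are the variances and the matrix is symmetric by construction, since $E[X_j X_k] = E[X_k X_j]$.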

Definition 5.26. We say that random variables $X_1, X_2, \dots, X_n$ are independent if the joint distribution function is factorized as

\[ P(X_1 \le x_1, X_2 \le x_2, \dots, X_n \le x_n) = \prod_{i=1}^{n} P(X_i \le x_i) \]

or equivalently,

\[ F_{X_1 X_2 \dots X_n}(x_1, x_2, \dots, x_n) = \prod_{i=1}^{n} F_{X_i}(x_i). \]

It follows from the definition that discrete random variables $X_1, \dots, X_n$ are independent if and only if

\[ P(X_1 = x_1, X_2 = x_2, \dots, X_n = x_n) = \prod_{i=1}^{n} P(X_i = x_i) \]

for all $x_1, x_2, \dots, x_n \in \mathbb{R}$. Random variables $X_1, \dots, X_n$ with joint density function $f_{X_1 X_2 \dots X_n}(x_1, x_2, \dots, x_n)$ are independent if and only if the joint density function is factorized as

\[ f_{X_1 X_2 \dots X_n}(x_1, x_2, \dots, x_n) = \prod_{i=1}^{n} f_{X_i}(x_i), \]

where $f_{X_i}(x_i)$ is the density function of $X_i$.
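For two discrete random variables the factorization criterion can be checked mechanically: compute both marginals and compare every joint probability with the product. The sketch below does this for two assumed toy tables (two separate fair coins, and one coin recorded twice); the tables and function names are illustrative assumptions.

```python
from fractions import Fraction as Fr
from itertools import product

def marginals(joint):
    """Return the marginal probabilities of X and Y from {(x, y): probability}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, Fr(0)) + p
        py[y] = py.get(y, Fr(0)) + p
    return px, py

def is_independent(joint):
    """True if P(X = x, Y = y) = P(X = x) P(Y = y) for all possible x and y."""
    px, py = marginals(joint)
    return all(joint.get((x, y), Fr(0)) == px[x] * py[y]
               for x, y in product(px, py))

# Two separate fair coins: every pair of outcomes has probability 1/4.
two_coins = {(x, y): Fr(1, 4) for x, y in product((0, 1), repeat=2)}
# One fair coin recorded twice: the two coordinates always coincide.
same_coin = {(0, 0): Fr(1, 2), (1, 1): Fr(1, 2)}

print(is_independent(two_coins), is_independent(same_coin))
```

Pairs that do not appear in the table are treated as having probability zero, which is why the second table fails the test at $(0, 1)$ and $(1, 0)$.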

Remark 5.27. Definition 5.26 applies to an arbitrary family of random variables. A family of random variables $\{X_\lambda \,;\, \lambda \in \Lambda\}$ is called independent if any finitely many random variables $X_{\lambda_1}, \dots, X_{\lambda_n}$ chosen from the family are independent in the sense of Definition 5.26.

Remark 5.28. By definition two random variables $X$ and $Y$ are independent if $P(X \le x, Y \le y) = P(X \le x) P(Y \le y)$ for all $x, y \in \mathbb{R}$. A family of random variables $\{X_\lambda \,;\, \lambda \in \Lambda\}$ is called pairwise independent if any two random variables $X_{\lambda_1}$ and $X_{\lambda_2}$, $\lambda_1 \ne \lambda_2$, chosen from the family are independent. Note that a pairwise independent family of random variables is not necessarily independent.

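Exercise 5.29. For $\alpha > 0$ and $\beta > 0$ let $F(x, y)$ be a function defined by

\[ F(x, y) = \begin{cases} (1 - e^{-\alpha x})(1 - e^{-\beta y}), & x \ge 0,\ y \ge 0, \\ 0, & \text{otherwise}. \end{cases} \]

Prove that $F(x, y)$ is the joint distribution function of a 2-dimensional random vector $(X, Y)$. Then show that $X$ and $Y$ are independent.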

Theorem 5.30. If two random variables $X$ and $Y$ are independent, we have $E[XY] = E[X]E[Y]$ and $\operatorname{Cov}(X, Y) = 0$.

Proof. Suppose that $X$ and $Y$ are discrete random variables. Since they are independent by assumption, we have the factorization $P(X = x, Y = y) = P(X = x) P(Y = y)$. Then we have

\[ E[XY] = \sum_{x, y} xy\,P(X = x, Y = y) = \sum_{x, y} xy\,P(X = x) P(Y = y) = \sum_x x P(X = x) \sum_y y P(Y = y) \]

and hence

\[ E[XY] = E[X]E[Y]. \tag{5.11} \]

Suppose next that $X$ and $Y$ admit a joint density function $f_{XY}(x, y)$. Since they are independent by assumption, we have $f_{XY}(x, y) = f_X(x) f_Y(y)$. Then we have

\[ E[XY] = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} xy f_{XY}(x, y)\,dx\,dy = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} xy f_X(x) f_Y(y)\,dx\,dy = \int_{-\infty}^{+\infty} x f_X(x)\,dx \int_{-\infty}^{+\infty} y f_Y(y)\,dy \]

and we arrive at (5.11). For a general pair of independent random variables $X$ and $Y$ we need the Lebesgue integral on a probability space, so we omit the proof; see the standard textbooks. Finally, it follows immediately from (5.11) that

\[ \operatorname{Cov}(X, Y) = E[XY] - E[X]E[Y] = E[X]E[Y] - E[X]E[Y] = 0, \]

as desired.

Remark 5.31. Two random variables X and Y are called uncorrelated if CovðX;YÞ ¼0. Theorem 5.30 says that independent random variables are uncorrelated. However, the converse is not true in general, see Exercise 5.32.
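A small computation shows how the converse can fail. The sketch below assumes a hypothetical pair that is not taken from the text: $X$ takes the values $-1, 0, 1$ with probability $1/3$ each and $Y = X^2$. The covariance vanishes, yet the joint probability at $(0, 0)$ does not factorize.

```python
from fractions import Fraction as Fr

# Hypothetical example: X uniform on {-1, 0, 1} and Y = X^2.
joint = {(-1, 1): Fr(1, 3), (0, 0): Fr(1, 3), (1, 1): Fr(1, 3)}

def expect(g):
    """E[g(X, Y)] as a sum over the joint probabilities."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

# Covariance via E[XY] - E[X] E[Y].
cov = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)

# Compare P(X = 0, Y = 0) with P(X = 0) P(Y = 0).
p_x0 = sum(p for (x, _), p in joint.items() if x == 0)
p_y0 = sum(p for (_, y), p in joint.items() if y == 0)
p_both = joint.get((0, 0), Fr(0))

print(cov, p_both, p_x0 * p_y0)
```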

Exercise 5.32. Let $Z_1$ and $Z_2$ be independent random variables such that

\[ P(Z_1 = \pm 1) = P(Z_2 = \pm 1) = \frac{1}{2}; \]

in other words, $Z_1$ and $Z_2$ stand for tossing two coins. Set

\[ X = Z_1 + Z_2, \qquad Y = Z_1 - Z_2. \]

Show that $X$ and $Y$ are uncorrelated but are not independent.

Exercise 5.33. Let $(X, Y)$ be a 2-dimensional random vector whose density function is given by

\[ f_{XY}(x, y) = \frac{x^2 + y^2}{4\pi}\, e^{-(x^2 + y^2)/2}. \]

Show that $X$ and $Y$ are uncorrelated but are not independent.

5.3 Regression curves

Let $X$ and $Y$ be two random variables. We identify $(X, Y)$ with a random point in the $xy$-coordinate plane. First we consider the case where both $X$ and $Y$ are discrete. For $x, y \in \mathbb{R}$ the conditional probability

\[ P(Y = y \mid X = x) = \frac{P(X = x, Y = y)}{P(X = x)} \]

is defined whenever $P(X = x) > 0$. Note that

\[ \sum_y P(Y = y \mid X = x) = \frac{1}{P(X = x)} \sum_y P(X = x, Y = y) = \frac{1}{P(X = x)}\, P(X = x) = 1. \]

We therefore regard $P(Y = y \mid X = x)$ as a probability distribution concentrated on the vertical line with $x$-coordinate $x$ in the $xy$-coordinate plane. The conditional expectation of $Y$ under the condition $X = x$ is defined by

\[ E[Y \mid X = x] = \sum_y y\,P(Y = y \mid X = x). \tag{5.12} \]

Thus we obtain a function $x \mapsto E[Y \mid X = x]$, where $x$ runs over the points of $\mathbb{R}$ such that $P(X = x) > 0$. This function gives rise to a discrete curve in the $xy$-coordinate plane, which is called the regression curve for $Y$ subject to $X$.
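The discrete regression curve can be tabulated directly from (5.12). The following sketch uses a hypothetical table of joint probabilities invented for the illustration; the function name `regression_curve` is likewise an assumption.

```python
from collections import defaultdict
from fractions import Fraction as Fr

# Hypothetical joint probabilities P(X = x, Y = y); not taken from the text.
joint = {
    (0, 0): Fr(1, 8), (0, 1): Fr(1, 4), (0, 2): Fr(1, 8),
    (1, 0): Fr(1, 4), (1, 1): Fr(1, 8), (1, 2): Fr(1, 8),
}

def regression_curve(joint):
    """Map each x with P(X = x) > 0 to E[Y | X = x] as in (5.12)."""
    p_x = defaultdict(Fr)       # P(X = x)
    weighted = defaultdict(Fr)  # sum over y of y * P(X = x, Y = y)
    for (x, y), p in joint.items():
        p_x[x] += p
        weighted[x] += y * p
    return {x: weighted[x] / p_x[x] for x in p_x if p_x[x] > 0}

curve = regression_curve(joint)
for x in sorted(curve):
    print(x, curve[x])
```

Each point `(x, curve[x])` is one point of the discrete curve described above.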

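We next consider the case where $X$ and $Y$ admit a joint density function $f_{XY}(x, y)$. The conditional density function of $Y$ under the condition $X = x$ is defined by

\[ f_{Y \mid X}(y \mid x) = \frac{f_{XY}(x, y)}{f_X(x)} = \frac{f_{XY}(x, y)}{\displaystyle\int_{-\infty}^{+\infty} f_{XY}(x, y)\,dy}, \tag{5.13} \]

whenever the denominator is positive. Since

\[ \int_{-\infty}^{+\infty} f_{Y \mid X}(y \mid x)\,dy = \frac{1}{\displaystyle\int_{-\infty}^{+\infty} f_{XY}(x, y)\,dy} \int_{-\infty}^{+\infty} f_{XY}(x, y)\,dy = 1, \]

as in the discrete case we understand that $f_{Y \mid X}(y \mid x)$ is a density function concentrated on the vertical line with $x$-coordinate $x$ in the $xy$-coordinate plane. The conditional expectation of $Y$ under the condition $X = x$ is defined by

\[ E[Y \mid X = x] = \int_{-\infty}^{+\infty} y\, f_{Y \mid X}(y \mid x)\,dy. \tag{5.14} \]

Thus we obtain a function $x \mapsto E[Y \mid X = x]$, where $x$ runs over the points of $\mathbb{R}$ such that $\int_{-\infty}^{+\infty} f_{XY}(x, y)\,dy > 0$. This function gives rise to a curve in the $xy$-coordinate plane, which is called the regression curve for $Y$ subject to $X$.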
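Formulas (5.13) and (5.14) can be evaluated numerically when no closed form is at hand. The sketch below assumes a hypothetical joint density $f_{XY}(x, y) = x + y$ on the unit square (zero elsewhere), for which a direct calculation gives $E[Y \mid X = x] = (3x + 2)/(6x + 3)$; the density, the midpoint rule and the helper names are assumptions made for the illustration.

```python
def f_XY(x, y):
    """Hypothetical joint density: x + y on the unit square, zero elsewhere."""
    return x + y if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 else 0.0

def integrate(g, a, b, n=2000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def conditional_expectation(x):
    """E[Y | X = x] via (5.13) and (5.14); y is integrated over [0, 1] only,
    because this particular density vanishes outside the unit square."""
    f_x = integrate(lambda y: f_XY(x, y), 0.0, 1.0)
    if f_x <= 0.0:
        raise ValueError("conditional density undefined: f_X(x) is not positive")
    return integrate(lambda y: y * f_XY(x, y), 0.0, 1.0) / f_x

for x in (0.0, 0.5, 1.0):
    print(x, conditional_expectation(x), (3 * x + 2) / (6 * x + 3))
```

Plotting `conditional_expectation(x)` against `x` would trace the regression curve for this density.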

Exercise 5.34. Let $(X, Y)$ be a 2-dimensional random vector whose density function is given by

\[ f_{XY}(x, y) = \begin{cases} e^{-y}, & 0 < x \le y, \\ 0, & \text{otherwise}. \end{cases} \]

Find the conditional density function of $Y$ under the condition $X = x$. Then calculate $E[Y \mid X = x]$.

Exercise 5.35 (Bayes' formula for continuous random variables). Let $X, Y$ be random variables with a joint density function $f_{XY}(x, y)$. Prove that

\[ f_{Y \mid X}(y \mid x) = \frac{f_{X \mid Y}(x \mid y)\, f_Y(y)}{\displaystyle\int_{-\infty}^{+\infty} f_{X \mid Y}(x \mid y)\, f_Y(y)\,dy}. \]

5.4 Two-dimensional normal distributions

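Let $\boldsymbol{\mu} = [\mu_j]$ be an $n$-dimensional column vector and $\Sigma = [\sigma_{jk}]$ a strictly positive definite $n \times n$ matrix. By definition $\langle \boldsymbol{x}, \Sigma \boldsymbol{x} \rangle > 0$ for all $\boldsymbol{x} \in \mathbb{R}^n$ with $\boldsymbol{x} \ne 0$, and necessarily $\Sigma$ is invertible and symmetric. Define a function $f(\boldsymbol{x})$ by

\[ f(\boldsymbol{x}) = \frac{1}{\sqrt{(2\pi)^n |\Sigma|}} \exp\left\{ -\frac{1}{2} \langle (\boldsymbol{x} - \boldsymbol{\mu}), \Sigma^{-1} (\boldsymbol{x} - \boldsymbol{\mu}) \rangle \right\}, \qquad \boldsymbol{x} \in \mathbb{R}^n, \tag{5.15} \]

where $|\Sigma|$ is the determinant. It is proved that

\[ \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} f(x_1, \dots, x_n)\,dx_1 \cdots dx_n = 1, \]

with the help of diagonalization of $\Sigma$ and a change of coordinates. In other words, $f(\boldsymbol{x})$ is a probability density function in $n$ variables. The corresponding probability distribution is called an $n$-dimensional normal distribution and is denoted by $N(\boldsymbol{\mu}, \Sigma)$. Moreover, we can check by elementary calculus that

\[ \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} x_j f(x_1, \dots, x_n)\,dx_1 \cdots dx_n = \mu_j \tag{5.16} \]

and

\[ \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} (x_j - \mu_j)(x_k - \mu_k) f(x_1, \dots, x_n)\,dx_1 \cdots dx_n = \sigma_{jk}. \tag{5.17} \]

As a result, $\boldsymbol{\mu}$ is the mean vector and $\Sigma$ the variance-covariance matrix of the normal distribution $N(\boldsymbol{\mu}, \Sigma)$.

Here we study the two-dimensional case. Take a vector $\boldsymbol{\mu} \in \mathbb{R}^2$ and a strictly positive definite $2 \times 2$ matrix $\Sigma$, say,

\[ \boldsymbol{\mu} = \begin{bmatrix} a \\ b \end{bmatrix}, \qquad \Sigma = \begin{bmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{bmatrix}. \]

Note that $\Sigma$ is a symmetric matrix, i.e., $\sigma_{12} = \sigma_{21}$. The density function of $N(\boldsymbol{\mu}, \Sigma)$ is defined by

\[ f(x, y) = \frac{1}{\sqrt{(2\pi)^2 |\Sigma|}} \exp\left\{ -\frac{1}{2} \langle (\boldsymbol{x} - \boldsymbol{\mu}), \Sigma^{-1} (\boldsymbol{x} - \boldsymbol{\mu}) \rangle \right\}, \qquad \boldsymbol{x} = \begin{bmatrix} x \\ y \end{bmatrix} \in \mathbb{R}^2. \tag{5.18} \]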

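Theorem 5.36. Let $X$ and $Y$ be random variables with means $\mu_X, \mu_Y$, variances $\sigma_X^2, \sigma_Y^2$ and covariance $\sigma_{XY}$. If $(X, Y)$ obeys a 2-dimensional normal distribution, the joint density function is given by

\[ f_{XY}(x, y) = \frac{1}{2\pi \sigma_X \sigma_Y \sqrt{1 - \rho^2}} \exp\left[ -\frac{1}{2(1 - \rho^2)} \left\{ \frac{(x - \mu_X)^2}{\sigma_X^2} - 2\rho\, \frac{(x - \mu_X)(y - \mu_Y)}{\sigma_X \sigma_Y} + \frac{(y - \mu_Y)^2}{\sigma_Y^2} \right\} \right], \tag{5.19} \]

where $\rho = \sigma_{XY} / (\sigma_X \sigma_Y)$ is the correlation coefficient.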

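Proof. Let $N(\boldsymbol{\mu}, \Sigma)$ be the normal distribution that $(X, Y)$ obeys and $f(x, y)$ its density function as in (5.18). As a particular case of (5.16) and (5.17), we have

\[ \mu_X = E[X] = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} x f(x, y)\,dx\,dy = a, \]

\[ \mu_Y = E[Y] = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} y f(x, y)\,dx\,dy = b, \]

and

\[ \sigma_X^2 = E[(X - \mu_X)^2] = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} (x - \mu_X)^2 f(x, y)\,dx\,dy = \sigma_{11}, \]

\[ \sigma_Y^2 = E[(Y - \mu_Y)^2] = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} (y - \mu_Y)^2 f(x, y)\,dx\,dy = \sigma_{22}, \]

\[ \sigma_{XY} = E[(X - \mu_X)(Y - \mu_Y)] = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} (x - \mu_X)(y - \mu_Y) f(x, y)\,dx\,dy = \sigma_{12} = \sigma_{21}. \]

Hence the joint density function $f_{XY}(x, y)$ is given by the density function of $N(\boldsymbol{\mu}, \Sigma)$ with $\boldsymbol{\mu}$ and $\Sigma$ as above. We now look at the quadratic function $\langle (\boldsymbol{x} - \boldsymbol{\mu}), \Sigma^{-1} (\boldsymbol{x} - \boldsymbol{\mu}) \rangle$ in the right-hand side of (5.18). First setting

\[ \Sigma^{-1} = \begin{bmatrix} \tau_{11} & \tau_{12} \\ \tau_{21} & \tau_{22} \end{bmatrix}, \]

we obtain

\[ \langle (\boldsymbol{x} - \boldsymbol{\mu}), \Sigma^{-1} (\boldsymbol{x} - \boldsymbol{\mu}) \rangle = \tau_{11} (x - a)^2 + 2\tau_{12} (x - a)(y - b) + \tau_{22} (y - b)^2. \tag{5.20} \]

[Fig. 5.1. Density function of the 2-dimensional normal distribution $N(\boldsymbol{\mu}, \Sigma)$ and its contour curves.]

Then inserting

\[ \tau_{11} = \frac{\sigma_{22}}{|\Sigma|}, \qquad \tau_{22} = \frac{\sigma_{11}}{|\Sigma|}, \qquad \tau_{12} = \tau_{21} = -\frac{\sigma_{12}}{|\Sigma|} \]

into (5.20), we have

\[ \langle (\boldsymbol{x} - \boldsymbol{\mu}), \Sigma^{-1} (\boldsymbol{x} - \boldsymbol{\mu}) \rangle = \frac{1}{|\Sigma|} \left\{ \sigma_{22} (x - a)^2 - 2\sigma_{12} (x - a)(y - b) + \sigma_{11} (y - b)^2 \right\} = \frac{\sigma_{11} \sigma_{22}}{|\Sigma|} \left\{ \frac{(x - a)^2}{\sigma_{11}} - \frac{2\sigma_{12}}{\sigma_{11} \sigma_{22}} (x - a)(y - b) + \frac{(y - b)^2}{\sigma_{22}} \right\}. \tag{5.21} \]

Finally, using the correlation coefficient

\[ \rho = \frac{\sigma_{XY}}{\sigma_X \sigma_Y} = \frac{\sigma_{12}}{\sqrt{\sigma_{11}}\,\sqrt{\sigma_{22}}} \]

together with $a = \mu_X$, $b = \mu_Y$, we come to

\[ \langle (\boldsymbol{x} - \boldsymbol{\mu}), \Sigma^{-1} (\boldsymbol{x} - \boldsymbol{\mu}) \rangle = \frac{\sigma_X^2 \sigma_Y^2}{|\Sigma|} \left\{ \frac{(x - \mu_X)^2}{\sigma_X^2} - 2\rho\, \frac{(x - \mu_X)(y - \mu_Y)}{\sigma_X \sigma_Y} + \frac{(y - \mu_Y)^2}{\sigma_Y^2} \right\}. \tag{5.22} \]

On the other hand, we have

\[ |\Sigma| = \sigma_{11} \sigma_{22} - \sigma_{12} \sigma_{21} = \sigma_X^2 \sigma_Y^2 - \sigma_{XY}^2 = \sigma_X^2 \sigma_Y^2 \left( 1 - \frac{\sigma_{XY}^2}{\sigma_X^2 \sigma_Y^2} \right) = \sigma_X^2 \sigma_Y^2 (1 - \rho^2). \tag{5.23} \]

Then (5.19) follows immediately from (5.22) and (5.23).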
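The equivalence of the matrix form (5.18) and the correlation form (5.19) can be spot-checked numerically. The sketch below evaluates both expressions for one hypothetical choice of parameters ($a = 1$, $b = -2$, $\sigma_{11} = 4$, $\sigma_{12} = 1.2$, $\sigma_{22} = 1$); the parameter values and function names are assumptions made for the illustration.

```python
import math

def density_matrix_form(x, y, a, b, s11, s12, s22):
    """Density (5.18), with the 2x2 inverse of Sigma written out explicitly."""
    det = s11 * s22 - s12 * s12
    dx, dy = x - a, y - b
    quad = (s22 * dx * dx - 2.0 * s12 * dx * dy + s11 * dy * dy) / det
    return math.exp(-0.5 * quad) / math.sqrt((2.0 * math.pi) ** 2 * det)

def density_rho_form(x, y, mu_x, mu_y, sd_x, sd_y, rho):
    """Density (5.19), written with standard deviations and the correlation."""
    zx, zy = (x - mu_x) / sd_x, (y - mu_y) / sd_y
    quad = (zx * zx - 2.0 * rho * zx * zy + zy * zy) / (1.0 - rho * rho)
    return math.exp(-0.5 * quad) / (2.0 * math.pi * sd_x * sd_y * math.sqrt(1.0 - rho * rho))

# Hypothetical parameters; Sigma is positive definite since 4 * 1 - 1.2**2 > 0.
a, b, s11, s12, s22 = 1.0, -2.0, 4.0, 1.2, 1.0
sd_x, sd_y = math.sqrt(s11), math.sqrt(s22)
rho = s12 / (sd_x * sd_y)

for x, y in ((1.0, -2.0), (0.0, 0.0), (3.5, -1.0)):
    print(density_matrix_form(x, y, a, b, s11, s12, s22),
          density_rho_form(x, y, a, b, sd_x, sd_y, rho))
```

Agreement at a handful of points is of course not a proof, but it is a quick guard against sign or factor errors when transcribing (5.19).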

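Theorem 5.37. Let $X$ and $Y$ be random variables with means $\mu_X, \mu_Y$, variances $\sigma_X^2, \sigma_Y^2$ and covariance $\sigma_{XY}$. If $(X, Y)$ obeys a 2-dimensional normal distribution, the density functions of $X$ and $Y$ (called the marginal density functions in this context) are given by

\[ f_X(x) = \int_{-\infty}^{+\infty} f_{XY}(x, y)\,dy = \frac{1}{\sqrt{2\pi \sigma_X^2}} \exp\left\{ -\frac{(x - \mu_X)^2}{2\sigma_X^2} \right\}, \tag{5.24} \]

\[ f_Y(y) = \int_{-\infty}^{+\infty} f_{XY}(x, y)\,dx = \frac{1}{\sqrt{2\pi \sigma_Y^2}} \exp\left\{ -\frac{(y - \mu_Y)^2}{2\sigma_Y^2} \right\}, \tag{5.25} \]

respectively. In other words, $X$ and $Y$ obey the normal distributions $N(\mu_X, \sigma_X^2)$ and $N(\mu_Y, \sigma_Y^2)$, respectively.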

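Proof. We see from (5.21) that

\[ \langle (\boldsymbol{x} - \boldsymbol{\mu}), \Sigma^{-1} (\boldsymbol{x} - \boldsymbol{\mu}) \rangle = \frac{\sigma_{11}}{|\Sigma|} \left\{ y - b - \frac{\sigma_{12}}{\sigma_{11}} (x - a) \right\}^2 + \frac{1}{\sigma_{11}} (x - a)^2, \]

and hence

\[ f_{XY}(x, y) = \frac{1}{\sqrt{(2\pi)^2 |\Sigma|}} \exp\left[ -\frac{1}{2} \left\{ \frac{\sigma_{11}}{|\Sigma|} \left( y - b - \frac{\sigma_{12}}{\sigma_{11}} (x - a) \right)^2 + \frac{1}{\sigma_{11}} (x - a)^2 \right\} \right]. \tag{5.26} \]

Then we have

\[ f_X(x) = \int_{-\infty}^{+\infty} f_{XY}(x, y)\,dy = \frac{1}{\sqrt{(2\pi)^2 |\Sigma|}} \sqrt{\frac{2\pi |\Sigma|}{\sigma_{11}}} \exp\left\{ -\frac{1}{2\sigma_{11}} (x - a)^2 \right\} = \frac{1}{\sqrt{2\pi \sigma_{11}}} \exp\left\{ -\frac{1}{2\sigma_{11}} (x - a)^2 \right\}, \tag{5.27} \]

which proves (5.24). Similarly, (5.25) is derived.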
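The integration over $y$ carried out in the proof can also be checked numerically. The sketch below integrates the joint density (5.19) over a truncated range of $y$ and compares the result with the one-dimensional normal density (5.24); the parameter values, the truncation width and the function names are assumptions made for the illustration.

```python
import math

# Hypothetical parameters of a 2-dimensional normal distribution.
MU_X, MU_Y, SD_X, SD_Y, RHO = 1.0, -2.0, 2.0, 1.0, 0.6

def joint_density(x, y):
    """Joint density (5.19)."""
    zx, zy = (x - MU_X) / SD_X, (y - MU_Y) / SD_Y
    quad = (zx * zx - 2.0 * RHO * zx * zy + zy * zy) / (1.0 - RHO ** 2)
    return math.exp(-0.5 * quad) / (2.0 * math.pi * SD_X * SD_Y * math.sqrt(1.0 - RHO ** 2))

def marginal_x_numeric(x, half_width=12.0, n=6000):
    """Midpoint-rule integral of the joint density over y, truncated to
    MU_Y +/- half_width * SD_Y (the tails beyond that are negligible here)."""
    lo = MU_Y - half_width * SD_Y
    h = 2.0 * half_width * SD_Y / n
    return h * sum(joint_density(x, lo + (i + 0.5) * h) for i in range(n))

def marginal_x_formula(x):
    """Density (5.24) of N(mu_X, sigma_X^2)."""
    return math.exp(-(x - MU_X) ** 2 / (2.0 * SD_X ** 2)) / math.sqrt(2.0 * math.pi * SD_X ** 2)

for x in (-1.0, 1.0, 3.5):
    print(x, marginal_x_numeric(x), marginal_x_formula(x))
```

Note that the marginal does not depend on the correlation: changing `RHO` alters the joint density but leaves both printed columns unchanged.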

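Theorem 5.38. Let $X$ and $Y$ be random variables with means $\mu_X, \mu_Y$, variances $\sigma_X^2, \sigma_Y^2$ and covariance $\sigma_{XY}$. If $(X, Y)$ obeys a 2-dimensional normal distribution, the conditional density function $f_{Y \mid X}(y \mid x)$ is given by

\[ f_{Y \mid X}(y \mid x) = \frac{f_{XY}(x, y)}{f_X(x)} = \frac{1}{\sqrt{2\pi \sigma_Y^2 (1 - \rho^2)}} \exp\left[ -\frac{1}{2\sigma_Y^2 (1 - \rho^2)} \left\{ y - \mu_Y - \frac{\sigma_{XY}}{\sigma_X^2} (x - \mu_X) \right\}^2 \right], \tag{5.28} \]

where $\rho = \rho_{XY}$ is the correlation coefficient of $X$ and $Y$. In particular, the conditional density function $f_{Y \mid X}(y \mid x)$ is the density function of a normal distribution.

Proof. By taking the ratio of (5.26) to (5.27) we obtain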
