The moment-generating function of X is given by φ 1 (θ 1 ) and that of Y is φ 2 (θ 2 ).


Academic year: 2021

3. Theorem: Let $\phi(\theta_1, \theta_2)$ be the moment-generating function of $(X, Y)$.

The moment-generating function of $X$ is given by $\phi_1(\theta_1)$ and that of $Y$ is $\phi_2(\theta_2)$.

Then, we have the following facts:

$$\phi_1(\theta_1) = \phi(\theta_1, 0), \qquad \phi_2(\theta_2) = \phi(0, \theta_2).$$

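Proof:

Again, the definition of the moment-generating function of $X$ and $Y$ is represented as:

$$\phi(\theta_1, \theta_2) = E(e^{\theta_1 X + \theta_2 Y}) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{\theta_1 x + \theta_2 y} f_{xy}(x, y) \, dx \, dy.$$

When $\phi(\theta_1, \theta_2)$ is evaluated at $\theta_2 = 0$, $\phi(\theta_1, 0)$ is rewritten as follows:

$$\phi(\theta_1, 0) = E(e^{\theta_1 X}) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{\theta_1 x} f_{xy}(x, y) \, dx \, dy$$

$$= \int_{-\infty}^{\infty} e^{\theta_1 x} \left( \int_{-\infty}^{\infty} f_{xy}(x, y) \, dy \right) dx$$

$$= \int_{-\infty}^{\infty} e^{\theta_1 x} f_x(x) \, dx = E(e^{\theta_1 X}) = \phi_1(\theta_1).$$

Thus, we obtain the result: $\phi(\theta_1, 0) = \phi_1(\theta_1)$.

Similarly, $\phi(0, \theta_2) = \phi_2(\theta_2)$ can be derived.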
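The marginal result can be checked numerically in the discrete case, where the integrals become sums. The following is a minimal sketch, assuming a small hypothetical joint probability function (the values in `joint` and the function names are illustrative, not from the source). It compares the joint moment-generating function evaluated at a second argument of zero with the moment-generating function computed from the marginal of X.

```python
import math

# Hypothetical joint probability function of (X, Y) on a small grid,
# chosen only for illustration; the probabilities sum to 1.
joint = {
    (0, 0): 0.10, (0, 1): 0.20,
    (1, 0): 0.30, (1, 1): 0.15,
    (2, 0): 0.05, (2, 1): 0.20,
}

def joint_mgf(t1, t2):
    """E(exp(t1*X + t2*Y)) computed from the joint probability function."""
    return sum(p * math.exp(t1 * x + t2 * y) for (x, y), p in joint.items())

def marginal_x():
    """Marginal probability function of X, obtained by summing over y."""
    fx = {}
    for (x, _y), p in joint.items():
        fx[x] = fx.get(x, 0.0) + p
    return fx

def mgf_x(t1):
    """E(exp(t1*X)) computed from the marginal of X alone."""
    return sum(p * math.exp(t1 * x) for x, p in marginal_x().items())

# The two columns printed for each t1 should agree up to rounding.
for t1 in (-1.0, 0.0, 0.5, 2.0):
    print(t1, joint_mgf(t1, 0.0), mgf_x(t1))
```

Summing the joint probability function over y plays the role of the inner integral in the proof.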

4. Theorem: The moment-generating function of $(X, Y)$ is given by $\phi(\theta_1, \theta_2)$.

Let $\phi_1(\theta_1)$ and $\phi_2(\theta_2)$ be the moment-generating functions of $X$ and $Y$, respectively.

If $X$ is independent of $Y$, we have:

$$\phi(\theta_1, \theta_2) = \phi_1(\theta_1) \phi_2(\theta_2).$$

Proof:

From the definition of $\phi(\theta_1, \theta_2)$, the moment-generating function of $X$ and $Y$ is rewritten as follows:

$$\phi(\theta_1, \theta_2) = E(e^{\theta_1 X + \theta_2 Y}) = E(e^{\theta_1 X}) E(e^{\theta_2 Y}) = \phi_1(\theta_1) \phi_2(\theta_2).$$

The second equality holds because $X$ is independent of $Y$.
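The factorization under independence can be illustrated with discrete random variables. This is a sketch under stated assumptions: the marginal probability functions `fx` and `fy` and the evaluation point are hypothetical. It builds an independent joint distribution as the product of marginals and compares both sides of the theorem. For contrast, it also evaluates the fully dependent case Y = X, where the joint moment-generating function at (t1, t2) equals the moment-generating function of X at t1 + t2.

```python
import math

# Hypothetical marginal probability functions, for illustration only.
fx = {0: 0.2, 1: 0.5, 2: 0.3}
fy = {-1: 0.4, 3: 0.6}

# Under independence the joint probability function is the product of marginals.
joint = {(x, y): px * py for x, px in fx.items() for y, py in fy.items()}

def mgf(pmf, t):
    """Univariate moment-generating function E(exp(t*V))."""
    return sum(p * math.exp(t * v) for v, p in pmf.items())

def joint_mgf(t1, t2):
    """Bivariate moment-generating function E(exp(t1*X + t2*Y))."""
    return sum(p * math.exp(t1 * x + t2 * y) for (x, y), p in joint.items())

t1, t2 = 0.7, -0.3
lhs = joint_mgf(t1, t2)
rhs = mgf(fx, t1) * mgf(fy, t2)
print(lhs, rhs)  # equal up to rounding

# Dependent case Y = X: E(exp(t1*X + t2*X)) = mgf of X at t1 + t2.
lhs_dep = mgf(fx, t1 + t2)
rhs_dep = mgf(fx, t1) * mgf(fx, t2)
print(lhs_dep, rhs_dep)  # these differ for this example
```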

Multivariate Case: For multivariate random variables $X_1, X_2, \cdots, X_n$, the moment-generating function is defined as:

$$\phi(\theta_1, \theta_2, \cdots, \theta_n) = E(e^{\theta_1 X_1 + \theta_2 X_2 + \cdots + \theta_n X_n}).$$

1. Theorem: If the multivariate random variables $X_1, X_2, \cdots, X_n$ are mutually independent, the moment-generating function of $X_1, X_2, \cdots, X_n$, denoted by $\phi(\theta_1, \theta_2, \cdots, \theta_n)$, is given by:

$$\phi(\theta_1, \theta_2, \cdots, \theta_n) = \phi_1(\theta_1) \phi_2(\theta_2) \cdots \phi_n(\theta_n),$$

where $\phi_i(\theta) = E(e^{\theta X_i})$.

Proof:

From the definition of the moment-generating function in the multivariate case, we obtain the following:

$$\phi(\theta_1, \theta_2, \cdots, \theta_n) = E(e^{\theta_1 X_1 + \theta_2 X_2 + \cdots + \theta_n X_n})$$

$$= E(e^{\theta_1 X_1}) E(e^{\theta_2 X_2}) \cdots E(e^{\theta_n X_n})$$

$$= \phi_1(\theta_1) \phi_2(\theta_2) \cdots \phi_n(\theta_n).$$

2. Theorem: Suppose that the multivariate random variables $X_1, X_2, \cdots, X_n$ are mutually independently and identically distributed.

Suppose that $X_i \sim N(\mu, \sigma^2)$.

Let us define $\hat{\mu} = \sum_{i=1}^{n} a_i X_i$, where $a_i$, $i = 1, 2, \cdots, n$, are assumed to be known.

Then, $\hat{\mu} \sim N\left(\mu \sum_{i=1}^{n} a_i, \; \sigma^2 \sum_{i=1}^{n} a_i^2\right)$.

Proof:

From Example 1.8 (p.111) and Example 1.9 (p.147), it is shown that the moment-generating function of $X$ is given by $\phi_x(\theta) = \exp(\mu\theta + \frac{1}{2}\sigma^2\theta^2)$ when $X$ is normally distributed as $X \sim N(\mu, \sigma^2)$.

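Let $\phi_{\hat{\mu}}$ be the moment-generating function of $\hat{\mu}$.

$$\phi_{\hat{\mu}}(\theta) = E(e^{\theta \hat{\mu}}) = E(e^{\theta \sum_{i=1}^{n} a_i X_i}) = \prod_{i=1}^{n} E(e^{\theta a_i X_i})$$

$$= \prod_{i=1}^{n} \phi_x(a_i \theta) = \prod_{i=1}^{n} \exp\left(\mu a_i \theta + \frac{1}{2} \sigma^2 a_i^2 \theta^2\right)$$

$$= \exp\left(\mu \sum_{i=1}^{n} a_i \theta + \frac{1}{2} \sigma^2 \sum_{i=1}^{n} a_i^2 \theta^2\right),$$

which is equivalent to the moment-generating function of the normal distribution with mean $\mu \sum_{i=1}^{n} a_i$ and variance $\sigma^2 \sum_{i=1}^{n} a_i^2$, where $\mu$ and $\sigma^2$ in $\phi_x(\theta)$ are simply replaced by $\mu \sum_{i=1}^{n} a_i$ and $\sigma^2 \sum_{i=1}^{n} a_i^2$ in $\phi_{\hat{\mu}}(\theta)$, respectively.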

Moreover, note the following.

When $a_i = 1/n$ is taken for all $i = 1, 2, \cdots, n$, i.e., when $\hat{\mu} = \overline{X}$ is taken, the sample mean $\overline{X}$ is normally distributed as:

$$\overline{X} \sim N(\mu, \sigma^2 / n).$$
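The algebra in this proof can be verified numerically. The following is a minimal sketch, assuming hypothetical values for the mean, the variance, the weights and the evaluation point. It multiplies the normal moment-generating functions evaluated at each weight times theta and compares the product with the normal moment-generating function whose mean and variance are those stated in the theorem.

```python
import math

def normal_mgf(theta, mu, sigma2):
    """MGF of N(mu, sigma2): exp(mu*theta + sigma2*theta^2/2)."""
    return math.exp(mu * theta + 0.5 * sigma2 * theta ** 2)

def mgf_of_weighted_sum(theta, mu, sigma2, a):
    """Product over i of the MGF of X_i evaluated at a_i*theta (uses independence)."""
    return math.prod(normal_mgf(ai * theta, mu, sigma2) for ai in a)

mu, sigma2 = 1.5, 4.0          # hypothetical parameter values
a = [0.5, -1.0, 2.0, 0.25]     # hypothetical known weights
theta = 0.3

lhs = mgf_of_weighted_sum(theta, mu, sigma2, a)
rhs = normal_mgf(theta, mu * sum(a), sigma2 * sum(ai ** 2 for ai in a))
print(lhs, rhs)  # equal up to rounding

# Special case a_i = 1/n: the weights sum to 1 and the variance becomes sigma2/n.
n = 10
equal = [1.0 / n] * n
print(sum(equal), sigma2 * sum(ai ** 2 for ai in equal), sigma2 / n)
```

The check only confirms the identity of the two moment-generating functions at the chosen point; it does not replace the argument in the proof.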

6 Law of Large Numbers and Central Limit Theorem

6.1 Chebyshev's Inequality

Theorem: Let $g(X)$ be a nonnegative function of the random variable $X$, i.e., $g(X) \geq 0$.

If $E(g(X))$ exists, then we have:

$$P(g(X) \geq k) \leq \frac{E(g(X))}{k}, \tag{6}$$

for a positive constant value $k$.

Proof:

We define the discrete random variable $U$ as follows:

$$U = \begin{cases} 1, & \text{if } g(X) \geq k, \\ 0, & \text{if } g(X) < k. \end{cases}$$

Thus, the discrete random variable $U$ takes 0 or 1.

Suppose that the probability function of $U$ is given by $f(u) = P(U = u)$, where $P(U = u)$ is represented as:

$$P(U = 1) = P(g(X) \geq k),$$

$$P(U = 0) = P(g(X) < k).$$

Then, regardless of the value that $U$ takes, the following inequality always holds:

$$g(X) \geq kU,$$

which implies that we have $g(X) \geq k$ when $U = 1$ and $g(X) \geq 0$ when $U = 0$, where $k$ is a positive constant value.

Therefore, taking the expectation on both sides, we obtain:

$$E(g(X)) \geq kE(U), \tag{7}$$

where $E(U)$ is given by:

$$E(U) = \sum_{u=0}^{1} u P(U = u) = 1 \times P(U = 1) + 0 \times P(U = 0) = P(U = 1) = P(g(X) \geq k). \tag{8}$$

Accordingly, substituting equation (8) into equation (7), we have the following inequality:

$$P(g(X) \geq k) \leq \frac{E(g(X))}{k}.$$
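The inequality and the indicator argument used in the proof can be checked exactly on a small example. This is a sketch assuming a hypothetical setting not taken from the source: X is a fair six-sided die and the nonnegative function is the squared distance from 3. Exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction

# Hypothetical example: X is a fair six-sided die and g(x) = (x - 3)^2 >= 0.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def g(x):
    return (x - 3) ** 2

def expectation(h):
    """E(h(X)) under the probability function pmf."""
    return sum(p * h(x) for x, p in pmf.items())

def tail_prob(k):
    """P(g(X) >= k), computed exactly."""
    return sum(p for x, p in pmf.items() if g(x) >= k)

e_g = expectation(g)
for k in (1, 2, 4, 9):
    # U = 1 if g(X) >= k, else 0, so g(X) >= k*U holds for every outcome.
    assert all(g(x) >= k * (1 if g(x) >= k else 0) for x in pmf)
    # Left column: exact tail probability; right column: the bound E(g(X))/k.
    print(k, tail_prob(k), e_g / k)
```

For small k the bound exceeds 1 and carries no information; it becomes useful only as k grows relative to the expectation.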

Chebyshev's Inequality: Assume that $E(X) = \mu$, $V(X) = \sigma^2$, and $\lambda$ is a positive constant value. Then, we have the following inequality:

$$P(|X - \mu| \geq \lambda\sigma) \leq \frac{1}{\lambda^2},$$

or equivalently,

$$P(|X - \mu| < \lambda\sigma) \geq 1 - \frac{1}{\lambda^2},$$

which is called Chebyshev's inequality.

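Proof:

Take $g(X) = (X - \mu)^2$ and $k = \lambda^2\sigma^2$. Then, we have:

$$P((X - \mu)^2 \geq \lambda^2\sigma^2) \leq \frac{E((X - \mu)^2)}{\lambda^2\sigma^2},$$

which implies

$$P(|X - \mu| \geq \lambda\sigma) \leq \frac{1}{\lambda^2}.$$

Note that $E((X - \mu)^2) = V(X) = \sigma^2$.

Since we have $P(|X - \mu| \geq \lambda\sigma) + P(|X - \mu| < \lambda\sigma) = 1$, we can derive the following inequality:

$$P(|X - \mu| < \lambda\sigma) \geq 1 - \frac{1}{\lambda^2}. \tag{9}$$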

An Interpretation of Chebyshev's inequality: $1/\lambda^2$ is an upper bound for the probability $P(|X - \mu| \geq \lambda\sigma)$.

Equation (9) is rewritten as:

$$P(\mu - \lambda\sigma < X < \mu + \lambda\sigma) \geq 1 - \frac{1}{\lambda^2}.$$

That is, the probability that $X$ falls within $\lambda\sigma$ units of $\mu$ is greater than or equal to $1 - 1/\lambda^2$.

Taking $\lambda = 2$ as an example, the probability that $X$ falls within two standard deviations of its mean is at least 0.75.
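To see how conservative the bound can be, the lower bound can be compared with an exact probability. The sketch below assumes a hypothetical distribution (a fair six-sided die, not from the source) and uses exact fractions. The event is tested through squared distances so that no square root of the variance is needed.

```python
from fractions import Fraction

def chebyshev_lower_bound(lam):
    """Lower bound 1 - 1/lam^2 on P(|X - mu| < lam*sigma)."""
    return 1 - Fraction(1) / (lam * lam)

# Hypothetical example distribution: a fair six-sided die.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
mu = sum(p * x for x, p in pmf.items())
var = sum(p * (x - mu) ** 2 for x, p in pmf.items())

def prob_within(lam):
    """Exact P(|X - mu| < lam*sigma), compared via squares to avoid a square root."""
    return sum(p for x, p in pmf.items() if (x - mu) ** 2 < lam * lam * var)

# Columns: lambda, exact probability, Chebyshev lower bound.
for lam in (Fraction(1), Fraction(3, 2), Fraction(2), Fraction(3)):
    print(lam, prob_within(lam), chebyshev_lower_bound(lam))
```

For this distribution the exact probability is well above the bound, which is expected because the inequality holds for every distribution with finite variance.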

Furthermore, note the following.

Taking $\epsilon = \lambda\sigma$, we obtain:

$$P(|X - \mu| \geq \epsilon) \leq \frac{\sigma^2}{\epsilon^2},$$

i.e.,

$$P(|X - E(X)| \geq \epsilon) \leq \frac{V(X)}{\epsilon^2}, \tag{10}$$

which is used in the next section.

Remark: Equation (10) can be derived when we take $g(X) = (X - \mu)^2$, $\mu = E(X)$ and $k = \epsilon^2$ in equation (6).

Even when we have $\mu \neq E(X)$, the following inequality still holds:

$$P(|X - \mu| \geq \epsilon) \leq \frac{E((X - \mu)^2)}{\epsilon^2}.$$

Note that $E((X - \mu)^2)$ represents the mean square error (MSE).

When $\mu = E(X)$, the mean square error reduces to the variance.

6.2 Law of Large Numbers and Convergence in Probability

Law of Large Numbers 1: Assume that $X_1, X_2, \cdots, X_n$ are mutually independently and identically distributed with mean $E(X_i) = \mu$ for all $i$.

Suppose that the moment-generating function of $X_i$ is finite.

Define $\overline{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$.

Then, $\overline{X}_n \longrightarrow \mu$ as $n \longrightarrow \infty$.

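Proof: The moment-generating function is written as:

$$\phi(\theta) = 1 + \mu'_1 \theta + \frac{1}{2!} \mu'_2 \theta^2 + \frac{1}{3!} \mu'_3 \theta^3 + \cdots = 1 + \mu'_1 \theta + O(\theta^2),$$

where $\mu'_k = E(X^k)$ for all $k$. That is, all the moments exist.

The moment-generating function of $\overline{X}_n$ is then:

$$\phi_{\overline{x}}(\theta) = \left(\phi\left(\frac{\theta}{n}\right)\right)^n = \left(1 + \mu'_1 \frac{\theta}{n} + O\left(\frac{\theta^2}{n^2}\right)\right)^n = \left(1 + \mu'_1 \frac{\theta}{n} + O\left(\frac{1}{n^2}\right)\right)^n$$

$$= \left((1 + x)^{\frac{1}{x}}\right)^{\mu\theta + O(n^{-1})} \longrightarrow \exp(\mu\theta) \quad \text{as } x \longrightarrow 0,$$

where $\mu'_1 = \mu$ and $x = \mu\theta/n + O(n^{-2})$, so that $nx = \mu\theta + O(n^{-1})$ and $x \longrightarrow 0$ as $n \longrightarrow \infty$. The limit $\exp(\mu\theta)$ is the moment-generating function of the following probability function:

$$f(x) = \begin{cases} 1, & \text{if } x = \mu, \\ 0, & \text{otherwise.} \end{cases}$$

Indeed, for this probability function we have:

$$\phi(\theta) = \sum_{x} e^{\theta x} f(x) = e^{\theta\mu} f(\mu) = e^{\theta\mu}.$$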
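The convergence of the moment-generating function of the sample mean can be observed numerically. The sketch below assumes a specific distribution that the source does not use: an exponential distribution with mean mu, whose moment-generating function is 1/(1 - mu*theta) for theta < 1/mu. The parameter values are hypothetical.

```python
import math

def exp_mgf(theta, mu):
    """MGF of an exponential distribution with mean mu, valid for theta < 1/mu."""
    return 1.0 / (1.0 - mu * theta)

def sample_mean_mgf(theta, mu, n):
    """MGF of the mean of n i.i.d. draws: (phi(theta/n))^n."""
    return exp_mgf(theta / n, mu) ** n

mu, theta = 2.0, 0.4           # hypothetical values with mu*theta < 1
limit = math.exp(mu * theta)   # MGF of the distribution degenerate at mu
# Columns: n, MGF of the sample mean at theta, limiting value exp(mu*theta).
for n in (1, 10, 100, 1000, 10000):
    print(n, sample_mean_mgf(theta, mu, n), limit)
```

As n grows, the middle column approaches the last one, which matches the limit derived in the proof.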

Law of Large Numbers 2: Assume that $X_1, X_2, \cdots, X_n$ are mutually independently and identically distributed with mean $E(X_i) = \mu$ and variance $V(X_i) = \sigma^2 < \infty$ for all $i$.

Then, for any positive value $\epsilon$, as $n \longrightarrow \infty$, we have the following result:

$$P(|\overline{X}_n - \mu| > \epsilon) \longrightarrow 0,$$

where $\overline{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$.

We say that $\overline{X}_n$ converges in probability to $\mu$.
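Convergence in probability can be illustrated without simulation when the tail probability is computable exactly. The following sketch assumes Bernoulli random variables with a hypothetical success probability and tolerance, neither of which is from the source. It computes the exact probability that the sample mean deviates from its mean by more than the tolerance, next to the Chebyshev-type bound obtained from the variance of the sample mean, p(1-p)/n, which is a standard fact for independent draws.

```python
import math
from fractions import Fraction

def tail_prob(n, p, eps):
    """Exact P(|Xbar_n - p| > eps) when X_i are i.i.d. Bernoulli(p)."""
    total = Fraction(0)
    for s in range(n + 1):
        if abs(Fraction(s, n) - p) > eps:
            total += math.comb(n, s) * p ** s * (1 - p) ** (n - s)
    return total

p = Fraction(1, 2)       # hypothetical success probability; mean p, variance p(1-p)
eps = Fraction(1, 10)    # hypothetical tolerance
# Columns: n, exact tail probability, Chebyshev-type bound capped at 1.
for n in (10, 50, 200, 1000):
    bound = p * (1 - p) / (n * eps ** 2)   # V(Xbar_n)/eps^2
    print(n, float(tail_prob(n, p, eps)), float(min(bound, 1)))
```

The exact probabilities shrink toward zero as n increases and stay below the bound in each row.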
