TA session note#11
Shouto Yonekura
June 29, 2016
Abstract
Although I will use this note in the session on 12 July, reading it beforehand may help you understand the assignments.
Contents
1 Modes of convergence
2 Some useful tools for asymptotic theory
3 LLN & CLT
4 Consistency
5 Asymptotic properties of the OLSE
1 Modes of convergence
Def11.1
For a random vector X := (X_1, X_2, ..., X_n) ∈ R^n, the distribution function of X, defined for x := (x_1, x_2, ..., x_n) ∈ R^n, is F_X(x) := P(X ≤ x). Let {X_n} be random vectors with values in R^n.
(1) X_n converges almost surely to X, X_n →a.s. X, if P(lim_{n→∞} X_n = X) = 1.
(2) For a real number r > 0, X_n converges in the rth mean to X, X_n →r X, if E[|X_n − X|^r] → 0 as n → ∞.
(3) X_n converges in probability to X, X_n →p X, if for every ε > 0, lim_{n→∞} P(|X_n − X| ≤ ε) = 1.
(4) X_n converges in law, or in distribution, to X, X_n →d X, if F_{X_n}(x) → F_X(x) as n → ∞ for all points x at which F_X(x) is continuous.
Example1
We say that a random vector X ∈ R^n is degenerate at a point c ∈ R^n if P(X = c) = 1. Let X_n ∈ R be degenerate at 1/n for n = 1, 2, ..., and let X ∈ R be degenerate at 0. Since 1/n → 0 as n → ∞, it might be expected that X_n →d X. The distribution function of X_n is F_{X_n} = 1_[1/n,∞), and that of X is F_X = 1_[0,∞). Then F_{X_n}(x) → F_X(x) for all x except x = 0, while at x = 0 we get F_{X_n}(0) = 0 ≠ 1 = F_X(0). However, since F_X(x) is not continuous at x = 0, we nevertheless have X_n →d X.
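Since both distribution functions here are explicit step functions, the example can be checked directly in code. A minimal sketch (plain Python; the function names are mine):

```python
# Distribution functions from Example 1: X_n degenerate at 1/n, X degenerate at 0.
def F_Xn(x, n):
    return 1.0 if x >= 1.0 / n else 0.0  # F_{X_n} = 1_[1/n, inf)

def F_X(x):
    return 1.0 if x >= 0.0 else 0.0      # F_X = 1_[0, inf)

# At any continuity point of F_X (x != 0), F_{X_n}(x) -> F_X(x):
for x in (-0.5, 0.25, 1.0):
    print(x, F_Xn(x, n=10**6), F_X(x))

# At the discontinuity x = 0 the pointwise limit fails, but that is allowed:
print(F_Xn(0.0, n=10**6), F_X(0.0))  # 0.0 vs 1.0
```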
Prop11.2
(a) X_n →a.s. X ⟹ X_n →p X
(b) X_n →r X ⟹ X_n →p X
(c) X_n →p X ⟹ X_n →d X
Proof
(a) Let ε > 0. Then
P(|X_n − X| > ε) = E[1_(ε,∞)(|X_n − X|)]
holds. Since X_n →a.s. X, using the dominated convergence theorem, we get
lim_{n→∞} E[1_(ε,∞)(|X_n − X|)] = E[lim_{n→∞} 1_(ε,∞)(|X_n − X|)] = 0.
(b) By Markov's inequality (Prop11.3 below) with h(x) = |x|^r and a = ε^r, we get
P(|X_n − X| > ε) ≤ ε^{-r} E[|X_n − X|^r]
→ 0 as n → ∞.
(c) Let ε > 0 and let ι ∈ R^n denote the vector whose components are all 1. If X_n ≤ x_0, then either X ≤ x_0 + ει or |X − X_n| > ε holds. In other words, {X_n ≤ x_0} ⊂ {X ≤ x_0 + ει} ∪ {|X − X_n| > ε}. Hence
F_{X_n}(x_0) ≤ F_X(x_0 + ει) + P(|X − X_n| > ε).
Similarly,
F_X(x_0 − ει) ≤ F_{X_n}(x_0) + P(|X − X_n| > ε).
Therefore, since P(|X_n − X| > ε) → 0,
F_X(x_0 − ει) ≤ liminf F_{X_n}(x_0) ≤ limsup F_{X_n}(x_0) ≤ F_X(x_0 + ει).
If F_X(x) is continuous at x_0, then the left and right ends of this inequality both converge to F_X(x_0) as ε → 0. This means F_{X_n}(x_0) → F_X(x_0). Q.E.D.
2 Some useful tools for asymptotic theory
Prop11.3 (Markov's inequality)
Let X be a random variable and h(·) a non-negative function. If E[h(X)] < ∞, then
P(h(X) ≥ a) ≤ (1/a) E[h(X)] for all a > 0
holds.
Proof
Let F_X be the distribution function of X. Since h(x) ≥ 0 for all x,
E[h(X)] = ∫ h(x) dF_X(x)
≥ ∫_{h(x)≥a} h(x) dF_X(x)
= ∫ 1_{h(x)≥a} h(x) dF_X(x)
≥ ∫ 1_{h(x)≥a} a dF_X(x)
= a P(h(X) ≥ a). Q.E.D.
Note that 1_A(x) is called an indicator function, defined as 1_A(x) = 1 if x ∈ A and 1_A(x) = 0 if x ∈ A^c. This function has the following property:
E[1_A(X)] = P(A) × 1 + P(A^c) × 0 = P(A).
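As a quick sanity check, Markov's inequality can be verified by simulation. A sketch, assuming X ~ Exp(1) and h(x) = x, so the bound is E[X]/a (the sample choices are mine):

```python
import random

random.seed(0)
N = 200_000
a = 3.0
xs = [random.expovariate(1.0) for _ in range(N)]  # X ~ Exp(1), so E[X] = 1

lhs = sum(x >= a for x in xs) / N  # Monte Carlo estimate of P(X >= a)
rhs = (sum(xs) / N) / a            # Markov bound E[X]/a, roughly 1/3
print(lhs, "<=", rhs)              # true P(X >= 3) = e^{-3}, about 0.0498
```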
Lem11.4 (Chebyshev's inequality)
Let X be a random variable with mean µ and variance σ². Then
P(|X − µ| ≥ ε) ≤ σ²/ε² for all ε > 0.
Proof
Just set h(X) = |X − µ|² and a = ε² in Prop11.3. Q.E.D.
Thm11.5 (Continuous mapping theorem)
(a) Let {X_n} be random variables or random vectors such that X_n →p X, and let h(·) be a continuous real-valued function. Then h(X_n) →p h(X).
(b) Let {X_n} be random variables or random vectors such that X_n →d X, and let h(·) be a continuous real-valued function. Then h(X_n) →d h(X).
Proof
Without loss of generality, I only provide the proof for random variables, in the case where h(·) is uniformly continuous. Let ε > 0. Then there exists δ > 0 such that |x − y| < δ ⟹ |h(x) − h(y)| < ε; equivalently,
|h(x) − h(y)| ≥ ε ⟹ |x − y| ≥ δ.
Setting x = X_n and y = X, we get
P(|h(X_n) − h(X)| ≥ ε) ≤ P(|X_n − X| ≥ δ)
→ 0 as n → ∞.
(b) follows from Prop11.2. Q.E.D.
Remark
The theorem above says, for example, that if X_n →p X, then
X_n^{-1} →p X^{-1} (provided P(X = 0) = 0) and X_n² →p X²
hold.
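The second claim can be illustrated numerically: add a vanishing perturbation to X and watch h(X_n) = X_n² track X². A sketch, with X ~ N(0,1) held fixed across n (the setup is mine):

```python
import random

random.seed(1)
N = 100_000
x = [random.gauss(0.0, 1.0) for _ in range(N)]   # draws of X

gaps = []
for n in (10, 100, 10_000):
    xn = [xi + 1.0 / n for xi in x]              # X_n = X + 1/n, so X_n ->p X
    # largest discrepancy |X_n^2 - X^2| over the sample shrinks with n
    gap = max(abs(a * a - b * b) for a, b in zip(xn, x))
    gaps.append(gap)
    print(n, gap)
```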
When you need to show that some random variables converge in probability, the following theorem might be useful.
Thm11.6
X_n →p X iff lim_{n→∞} E[|X_n − X| / (1 + |X_n − X|)] = 0.
Proof
Without loss of generality, we can assume that X = 0. Thus we need to show that X_n →p 0 iff lim_{n→∞} E[|X_n| / (1 + |X_n|)] = 0.
Suppose that X_n →p 0. Then for any ε > 0,
|X_n|/(1 + |X_n|) ≤ (|X_n|/(1 + |X_n|)) 1_{|X_n|>ε} + ε 1_{|X_n|≤ε} ≤ 1_{|X_n|>ε} + ε,
so that
E[|X_n|/(1 + |X_n|)] ≤ P(|X_n| > ε) + ε,
and hence
limsup_{n→∞} E[|X_n|/(1 + |X_n|)] ≤ ε.
Since ε was arbitrary, lim_{n→∞} E[|X_n|/(1 + |X_n|)] = 0.
Next suppose that lim_{n→∞} E[|X_n|/(1 + |X_n|)] = 0 holds. Since x/(1 + x) is an increasing function, we get
(ε/(1 + ε)) 1_{|X_n|>ε} ≤ (|X_n|/(1 + |X_n|)) 1_{|X_n|>ε} ≤ |X_n|/(1 + |X_n|).
Taking expectations and limits gives
(ε/(1 + ε)) lim_{n→∞} P(|X_n| > ε) ≤ lim_{n→∞} E[|X_n|/(1 + |X_n|)] = 0. Q.E.D.
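The expectation in Thm11.6 is easy to estimate by simulation. A sketch with X_n ~ N(0, 1/n), which converges in probability to 0 (the example distribution is mine):

```python
import random

random.seed(2)
N = 100_000

def metric(n):
    # Monte Carlo estimate of E[ |X_n| / (1 + |X_n|) ] for X_n ~ N(0, 1/n)
    sd = n ** -0.5
    return sum(abs(z) / (1 + abs(z))
               for z in (random.gauss(0.0, sd) for _ in range(N))) / N

vals = [metric(n) for n in (1, 10, 100, 1000)]
print(vals)  # decreasing towards 0, as the theorem predicts
```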
Thm11.7 (Slutsky's theorem)
Let {X_n}, {Y_n} be random variables or random vectors. Suppose that X_n →d X and Y_n →p c, where c is a fixed real number. Then
(a) X_n + Y_n →d X + c;
(b) Y_n X_n →d cX;
(c) X_n / Y_n →d X/c if c ≠ 0.
Proof
I only provide the proof of (a), in the case of random variables. Let t ∈ R and ε > 0. Then
F_{X_n+Y_n}(t) = P(X_n + Y_n ≤ t)
≤ P({X_n + Y_n ≤ t} ∩ {|Y_n − c| < ε}) + P(|Y_n − c| ≥ ε)
≤ P(X_n ≤ t − c + ε) + P(|Y_n − c| ≥ ε),
and, similarly,
F_{X_n+Y_n}(t) ≥ P(X_n ≤ t − c − ε) − P(|Y_n − c| ≥ ε).
If t − c + ε and t − c − ε are continuity points of F_X, then it follows from X_n →d X and P(|Y_n − c| ≥ ε) → 0 that
F_X(t − c − ε) ≤ liminf F_{X_n+Y_n}(t) ≤ limsup F_{X_n+Y_n}(t) ≤ F_X(t − c + ε).
Letting ε → 0 through such continuity points, we get, at every continuity point t − c of F_X,
lim_{n→∞} F_{X_n+Y_n}(t) = F_X(t − c).
The result follows from F_{X+c}(t) = F_X(t − c). Q.E.D.
Example2
Let x ∼ t(n). Then x →d N(0, 1) as n → ∞.
Proof
Let z ∼ N(0, 1) and y ∼ χ²(n) be independent, so that x = z/√(y/n). If v_i ∼iid χ²(1), i = 1, 2, ..., n, then Σ_{i=1}^n v_i ∼ χ²(n). Since E[v_i] = 1 (Prop5.2), we can show that (1/n) Σ_i v_i →p 1 by using the law of large numbers. Thus √(y/n) →p 1 (continuous mapping theorem). From these results, we can finally obtain z/√(y/n) →d N(0, 1) (Slutsky's theorem), as required. Q.E.D.
Example3
Let y ∼ F(l, m). Then ly →d χ²(l) as m → ∞.
Proof
Let x ∼ χ²(l) and z ∼ χ²(m) be mutually independent. By the definition of the F-distribution, y = (x/l)/(z/m), i.e. x/(z/m) = ly. Since z has the same distribution as Σ_{i=1}^m v_i with v_i ∼iid χ²(1) and E[v_i] = 1, the LLN gives z/m →p 1. Thus x/(z/m) →d x ∼ χ²(l) by Slutsky's theorem. Hence we get ly →d χ²(l). Q.E.D.
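Example3 can be checked by simulation: build F(l, m) draws from independent χ² draws and compare the moments of ly with those of χ²(l) (mean l, variance 2l). A sketch with l = 4 (the sample sizes are mine):

```python
import random

random.seed(3)
l, R = 4, 20_000

def chi2(k):
    # chi^2(k) as a sum of k squared standard normals
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(k))

stats = {}
for m in (5, 200):
    # y ~ F(l, m) built as (x/l)/(z/m); l*y should approach chi^2(l) as m grows
    draws = [l * (chi2(l) / l) / (chi2(m) / m) for _ in range(R)]
    mean = sum(draws) / R
    var = sum((d - mean) ** 2 for d in draws) / R
    stats[m] = (mean, var)
    print(m, mean, var)  # chi^2(4) has mean 4 and variance 8
```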
3 LLN & CLT
Thm11.8 ((Weak) Law of Large Numbers)
Let {X_n} be iid random variables with mean µ and variance σ² < ∞, and let X̄ := n^{-1} Σ_i X_i. Then
X̄ →p E[X_1] = µ.
Proof
First we calculate E[X̄] and V[X̄]:
E[X̄] = E[n^{-1} Σ_i X_i] = n^{-1} E[Σ_i X_i] = n^{-1} nµ = µ;
V[X̄] = V[n^{-1} Σ_i X_i] = n^{-2} Σ_i V[X_i] = n^{-2} nσ² = σ²/n.
Next we apply Chebyshev's inequality to the above:
P(|X̄ − µ| > ε) ≤ ε^{-2} σ²/n
→ 0 as n → ∞. Q.E.D.
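A minimal sketch of the WLLN in action, using Bernoulli(0.3) draws so that µ = 0.3 (the distribution choice is mine):

```python
import random

random.seed(4)
mu = 0.3  # mean of a Bernoulli(0.3) variable

errs = {}
for n in (10, 1_000, 100_000):
    # sample mean of n iid Bernoulli(0.3) draws
    xbar = sum(random.random() < mu for _ in range(n)) / n
    errs[n] = abs(xbar - mu)
    print(n, xbar)  # approaches 0.3 as n grows
```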
Thm11.9 (Central Limit Theorem (Lindeberg–Lévy))
Let {X_n} be iid random variables with mean µ and variance σ² < ∞. Then
(X̄ − µ)/√(σ²/n) →d N(0, 1),
or equivalently √n(X̄ − µ) →d N(0, σ²), holds.
Proof
Let T_n := (X_1 + X_2 + ... + X_n − nµ)/√(nσ²) and Z_i := (X_i − µ)/σ, i = 1, 2, ..., n. By the assumptions, {Z_i} are iid and
T_n = n^{-1/2} Σ_{i=1}^n Z_i
holds. The characteristic function of T_n is then
φ_{T_n}(t) = E[e^{itT_n}] = E[e^{itn^{-1/2} Σ_{i=1}^n Z_i}] = Π_{i=1}^n E[e^{itn^{-1/2} Z_i}] = Π_{i=1}^n φ_{Z_i}(t/n^{1/2}).
Since {Z_i} are identically distributed,
φ_{T_n}(t) = {φ_{Z_1}(t/n^{1/2})}^n
also holds. Next we apply a Taylor expansion to f(x) = e^{ix} around 0:
e^{ix} = 1 + ix − x² ∫_0^1 (1 − s) e^{isx} ds.
Let x = tZ_1/n^{1/2}. Then we get the following:
e^{itn^{-1/2}Z_1} = 1 + (it/n^{1/2}) Z_1 − (t²/n) Z_1² ∫_0^1 (1 − s) e^{istn^{-1/2}Z_1} ds
= 1 + (it/n^{1/2}) Z_1 − (t²/2n) Z_1² + (t²/n) Z_1² ∫_0^1 (1 − s)(1 − e^{istn^{-1/2}Z_1}) ds.
Since for any i
E[Z_i] = E[(X_i − µ)/σ] = 0 and E[Z_i²] = V[Z_i] = V[(X_i − µ)/σ] = 1
hold, taking expectations gives
φ_{Z_1}(t/n^{1/2}) = 1 − t²/2n + (t²/n) E[Z_1² ∫_0^1 (1 − s)(1 − e^{istn^{-1/2}Z_1}) ds].
Next we evaluate the remainder. Define
α_n(s; t) := −E[Z_1²(e^{istn^{-1/2}Z_1} − 1)],
so that the last term equals −(t²/n) ∫_0^1 (1 − s) α_n(s; t) ds. Since
|Z_1²(e^{istn^{-1/2}Z_1} − 1)| ≤ 2Z_1²
is valid and E[Z_1²] = 1 < ∞, the dominated convergence theorem gives α_n(s; t) → 0 as n → ∞ for each fixed s and t. Therefore we can show that
φ_{T_n}(t) = {1 − t²/2n + o(1/n)}^n,
and as n → ∞, φ_{T_n}(t) → e^{−t²/2}. This is the characteristic function of the standard normal distribution. Q.E.D.
Example4
If {X_n} ∼iid U(0, 1), then the CLT says that
(Σ_i X_i − n/2)/√(n/12) →d N(0, 1),
since E[X_1] = 1/2 and V[X_1] = 1/12. The figure below shows how (Σ_i X_i − n/2)/√(n/12) converges to N(0, 1) as n becomes large.
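The experiment behind the figure is easy to rerun. A sketch that standardizes sums of U(0, 1) draws and compares the empirical mean, standard deviation, and two-sided tail with their N(0, 1) values (the replication count is mine):

```python
import random, statistics

random.seed(5)
R = 20_000  # replications

def standardized_sum(n):
    # (sum of n U(0,1) draws - n*1/2) / sqrt(n*1/12)
    s = sum(random.random() for _ in range(n))
    return (s - n / 2) / (n / 12) ** 0.5

summary = {}
for n in (2, 30):
    draws = [standardized_sum(n) for _ in range(R)]
    tail = sum(abs(d) > 1.96 for d in draws) / R
    summary[n] = (statistics.mean(draws), statistics.stdev(draws), tail)
    print(n, summary[n])  # mean near 0, sd near 1, tail near 0.05 for large n
```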
Example5
If {X_n} ∼iid Be(α, β), then the CLT implies that
(Σ_i X_i − nα/(α + β)) / √(nαβ/((α + β + 1)(α + β)²)) →d N(0, 1),
since E[X_1] = α/(α + β) and V[X_1] = αβ/((α + β + 1)(α + β)²). The figure below shows how this standardized sum converges to N(0, 1) as n becomes large, in the case α = 1, β = 2.
4 Consistency
Def11.10 (Consistency)
Let {X_n} be random variables and let θ̂_n be an estimator of θ ∈ Θ ⊆ R^k based on {X_n}. If
θ̂_n →p θ for every θ ∈ Θ
holds, then θ̂_n is said to be a consistent estimator of θ.
Example6
Let {X_i} ∼iid N(µ, σ²) and X̄ := n^{-1} Σ_i X_i. Then X̄ is a consistent estimator of µ.
Proof
By the LLN, we can show that n^{-1} Σ_i X_i →p E[X_1] = µ. Q.E.D.
Example7
Let {X_i} ∼iid N(µ, σ²) and σ̂² := n^{-1} Σ_i (X_i − X̄)². Then σ̂² is a consistent estimator of σ².
Proof
First we can decompose σ̂² as follows:
σ̂² := n^{-1} Σ_i (X_i − X̄)²
= n^{-1} Σ_i ((X_i − µ) − (X̄ − µ))²
= n^{-1} Σ_i {(X_i − µ)² − 2(X_i − µ)(X̄ − µ) + (X̄ − µ)²}
= n^{-1} Σ_i (X_i − µ)² − (X̄ − µ)²
= n^{-1} Σ_i (X_i² − 2X_iµ + µ²) − (X̄ − µ)²
→p (σ² + µ²) − 2µ² + µ² − 0
= σ². Q.E.D.
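Example7 can also be checked by simulation. A sketch with µ = 1 and σ² = 4 (the parameter values are mine):

```python
import random

random.seed(6)
mu, sigma2 = 1.0, 4.0

est = {}
for n in (10, 1_000, 100_000):
    xs = [random.gauss(mu, sigma2 ** 0.5) for _ in range(n)]
    xbar = sum(xs) / n
    # hat sigma^2 with divisor n, as in Example 7
    est[n] = sum((x - xbar) ** 2 for x in xs) / n
    print(n, est[n])  # approaches 4 as n grows
```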
5 Asymptotic properties of the OLSE
Prop11.11
Suppose that assumptions A1–A5 hold. Under these assumptions,
(a) β̂ →p β;
(b) s² := e′e/(n − k) →p σ²
hold, where β̂ is the OLSE of β and e := y − Xβ̂.
Proof
(a) First we can decompose β̂ as follows:
β̂ = (X′X)^{-1} X′y
= (X′X)^{-1} X′(Xβ + u)
= β + (X′X)^{-1} X′u
= β + (n^{-1} X′X)^{-1} (n^{-1} X′u)
= β + Q_{xx}^{-1} Q_{xu}.
Using the LLN, we can easily show that Q_{xu} →p 0, since
n^{-1} X′u = n^{-1} Σ_i x_i u_i →p E[x_i u_i] = 0.
Suppose that Q_{xx} converges to some finite nonsingular matrix M_{xx}. Then Q_{xx}^{-1} Q_{xu} →p 0. Hence we get β̂ →p β.
(b) Since e = y − Xβ̂ = u − X(β̂ − β), we can rewrite s² as follows:
s² = (n − k)^{-1} Σ_i e_i²
= (n/(n − k)) (n^{-1} u′u − 2(β̂ − β)′ n^{-1} X′u + (β̂ − β)′ n^{-1} X′X (β̂ − β))
= (n/(n − k)) (n^{-1} Σ_i u_i² − 2(β̂ − β)′ n^{-1} Σ_i x_i u_i + (β̂ − β)′ (n^{-1} Σ_i x_i x_i′)(β̂ − β)).
We have already shown that β̂ →p β; suppose again that Q_{xx} converges to some finite matrix M_{xx}. Then the last two terms vanish in probability, and since n/(n − k) → 1 as n → ∞, this means that
s² →p E[u_i²] = σ². Q.E.D.
The figure below shows how β̂ converges to β in the case of Example 1 in TA note#7.
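The DGP of Example 1 in TA note#7 is not restated in this note, so the sketch below uses a hypothetical simple regression y_i = b1 + b2 x_i + u_i with (b1, b2) = (1, 2); only the convergence pattern matters, not the particular values:

```python
import random

random.seed(7)
b1, b2 = 1.0, 2.0  # hypothetical true coefficients

def ols(n):
    # simple regression y_i = b1 + b2*x_i + u_i estimated by least squares
    xs = [random.gauss(0.0, 1.0) for _ in range(n)]
    ys = [b1 + b2 * x + random.gauss(0.0, 1.0) for x in xs]
    xbar, ybar = sum(xs) / n, sum(ys) / n
    b2_hat = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
              / sum((x - xbar) ** 2 for x in xs))
    b1_hat = ybar - b2_hat * xbar
    return b1_hat, b2_hat

for n in (20, 2_000, 200_000):
    print(n, ols(n))  # both estimates approach (b1, b2) = (1, 2)
```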
Prop11.12
Suppose that assumptions A1–A5 hold. In addition, E[u_i⁴] and E[(x_i x_i′)²] are finite for any i, where x_i ∈ R^k. Under these assumptions,
E[‖x_i u_i‖²] < ∞;
n^{-1/2} Σ_i x_i u_i →d N(0, Ω)
hold, where Ω := E[x_i x_i′ u_i²].
Proof
By the triangle inequality and Jensen's inequality, we can show that
‖E[x_i x_i′ u_i²]‖ ≤ E[‖x_i x_i′ u_i²‖] = E[‖x_i‖² u_i²].
Using the Cauchy–Schwarz inequality, we obtain
E[‖x_i‖² u_i²] ≤ (E[‖x_i‖⁴])^{1/2} (E[u_i⁴])^{1/2}
< ∞.
Therefore, we finally get n^{-1/2} Σ_i x_i u_i →d N(0, Ω) by the CLT. Q.E.D.
Note that √n X̄ = √n n^{-1} Σ_i X_i = n^{-1/2} Σ_i X_i.
Prop11.13
Suppose that assumptions A1–A5 hold. In addition, E[u_i⁴] and E[(x_i x_i′)²] are finite for any i, where x_i ∈ R^k. Under these assumptions,
√n(β̂ − β) →d N_k(0, V)
holds, where V := M_{xx}^{-1} Ω M_{xx}^{-1}.
Proof
First, we can rewrite √n(β̂ − β) as follows:
√n(β̂ − β) = √n (X′X)^{-1} X′u
= (n^{-1} X′X)^{-1} (n^{-1/2} X′u)
= Q_{xx}^{-1} (n^{-1/2} X′u).
From Prop11.12, n^{-1/2} X′u = n^{-1/2} Σ_i x_i u_i →d N(0, Ω). Suppose that Q_{xx} converges to some finite matrix M_{xx}. Then, using Slutsky's theorem, we obtain Q_{xx}^{-1} (n^{-1/2} X′u) →d M_{xx}^{-1} N(0, Ω). Hence we can finally show that
√n(β̂ − β) →d N_k(0, M_{xx}^{-1} Ω M_{xx}^{-1})   (since M_{xx}′ = M_{xx})
= N_k(0, V). Q.E.D.
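Prop11.13 can be visualized by simulation: with a single regressor and homoskedastic errors, V reduces to σ²/E[x_i²]. A sketch checking the mean and standard deviation of √n(β̂ − β) against N(0, 1) (a hypothetical DGP of mine, in which σ² = E[x_i²] = 1, so V = 1):

```python
import random, statistics

random.seed(8)
beta, n, R = 2.0, 400, 5_000

def root_n_error():
    # one draw of sqrt(n)*(beta_hat - beta) in the model y_i = beta*x_i + u_i
    xs = [random.gauss(0.0, 1.0) for _ in range(n)]
    ys = [beta * x + random.gauss(0.0, 1.0) for x in xs]
    beta_hat = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return n ** 0.5 * (beta_hat - beta)

draws = [root_n_error() for _ in range(R)]
m, sd = statistics.mean(draws), statistics.stdev(draws)
print(m, sd)  # should be close to 0 and 1, since V = 1 here
```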