
Academic year: 2021


8 Generalized Least Squares Method (GLS)

1. Regression model: $y = X\beta + u$, $u \sim N(0, \sigma^2 \Omega)$

2. Heteroscedasticity

$$\sigma^2 \Omega = \begin{pmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \sigma_n^2 \end{pmatrix}$$

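First-Order Autocorrelation (serial correlation)

In the case of time series data, the subscript is conventionally given by $t$, not $i$.

$$u_t = \rho u_{t-1} + \epsilon_t, \qquad \epsilon_t \sim \text{iid } N(0, \sigma_\epsilon^2)$$

$$\sigma^2 \Omega = \frac{\sigma_\epsilon^2}{1 - \rho^2} \begin{pmatrix} 1 & \rho & \rho^2 & \cdots & \rho^{n-1} \\ \rho & 1 & \rho & \cdots & \rho^{n-2} \\ \rho^2 & \rho & 1 & \cdots & \rho^{n-3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho^{n-1} & \rho^{n-2} & \rho^{n-3} & \cdots & 1 \end{pmatrix}$$

$$V(u_t) = \sigma^2 = \frac{\sigma_\epsilon^2}{1 - \rho^2}$$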
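The covariance structure above can be checked with a minimal sketch. The function names `ar1_cov` and `stationary_variance` and the values $\rho = 0.5$, $\sigma_\epsilon^2 = 1$ are illustrative assumptions, not part of the notes; stationarity ($|\rho| < 1$) is assumed throughout.

```python
def ar1_cov(rho, sigma_eps2, n):
    """Return the n x n covariance matrix sigma^2 * Omega of a stationary AR(1) error."""
    if not -1.0 < rho < 1.0:
        raise ValueError("stationarity requires |rho| < 1")
    var_u = sigma_eps2 / (1.0 - rho ** 2)  # V(u_t) = sigma_eps^2 / (1 - rho^2)
    # The (i, j) element is V(u_t) * rho^{|i - j|}.
    return [[var_u * rho ** abs(i - j) for j in range(n)] for i in range(n)]


def stationary_variance(rho, sigma_eps2, steps=200):
    """Iterate V(u_t) = rho^2 V(u_{t-1}) + sigma_eps^2, starting from zero."""
    v = 0.0
    for _ in range(steps):
        v = rho ** 2 * v + sigma_eps2
    return v


cov = ar1_cov(rho=0.5, sigma_eps2=1.0, n=4)
limit = stationary_variance(rho=0.5, sigma_eps2=1.0)
```

Here `cov[0][0]` equals $1/(1 - 0.25) = 4/3$, and the variance recursion converges to the same value, which is the stationarity argument behind $V(u_t) = \sigma_\epsilon^2 / (1 - \rho^2)$.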

3. The Generalized Least Squares (GLS) estimator of $\beta$, denoted by $b$, solves the following minimization problem:

$$\min_b \; (y - Xb)' \Omega^{-1} (y - Xb)$$

The GLSE of $\beta$ is:

$$b = (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} y$$

4. In general, when $\Omega$ is symmetric, $\Omega$ is decomposed as follows.

$$\Omega = A' \Lambda A$$

$\Lambda$ is a diagonal matrix whose diagonal elements are the eigenvalues of $\Omega$.

$A$ is an orthogonal matrix ($A'A = I_n$) whose rows are the corresponding eigenvectors.

When $\Omega$ is a positive definite matrix, all the diagonal elements of $\Lambda$ are positive.
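As a small worked illustration of this decomposition, consider a $2 \times 2$ example. The matrix below is a hypothetical choice made here for checking the formula, not one taken from the notes. Its eigenvalues are 3 and 1, with normalized eigenvectors $(1, 1)'/\sqrt{2}$ and $(1, -1)'/\sqrt{2}$ placed in the rows of $A$.

```latex
\Omega = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \qquad
A = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad
\Lambda = \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}

A' \Lambda A
= \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
              \begin{pmatrix} 3 & 3 \\ 1 & -1 \end{pmatrix}
= \frac{1}{2} \begin{pmatrix} 4 & 2 \\ 2 & 4 \end{pmatrix}
= \Omega
```

Both eigenvalues are positive, in line with $\Omega$ being positive definite.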

5. There exists $P$ such that $\Omega = PP'$ (i.e., take $P = A' \Lambda^{1/2}$). $\Longrightarrow$ $P^{-1} \Omega P'^{-1} = I_n$

Multiply both sides of $y = X\beta + u$ by $P^{-1}$ from the left.

We have:

$$y^* = X^* \beta + u^*,$$

where $y^* = P^{-1} y$, $X^* = P^{-1} X$, and $u^* = P^{-1} u$.

The variance of $u^*$ is:

$$V(u^*) = V(P^{-1} u) = P^{-1} V(u) P'^{-1} = \sigma^2 P^{-1} \Omega P'^{-1} = \sigma^2 I_n,$$

because $\Omega = PP'$, i.e., $P^{-1} \Omega P'^{-1} = I_n$.

Accordingly, the regression model is rewritten as:

$$y^* = X^* \beta + u^*, \qquad u^* \sim (0, \sigma^2 I_n)$$

Apply OLS to the above model.

Let $b$ be an estimator of $\beta$ from the above model.

That is, the minimization problem is given by:

$$\min_b \; (y^* - X^* b)' (y^* - X^* b),$$

which is equivalent to:

$$\min_b \; (y - Xb)' \Omega^{-1} (y - Xb).$$

Solving the minimization problem above, we have the following estimator:

$$b = (X^{*\prime} X^*)^{-1} X^{*\prime} y^* = (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} y,$$

which is called the GLS (Generalized Least Squares) estimator.

$b$ is rewritten as follows:

$$b = \beta + (X^{*\prime} X^*)^{-1} X^{*\prime} u^* = \beta + (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} u$$

The mean and variance of $b$ are given by:

$$E(b) = \beta,$$

$$V(b) = \sigma^2 (X^{*\prime} X^*)^{-1} = \sigma^2 (X' \Omega^{-1} X)^{-1}.$$
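The equivalence between OLS on the transformed model and the direct GLS formula can be sketched numerically. The example assumes a diagonal $\Omega$ (pure heteroscedasticity), so that $P^{-1} = \mathrm{diag}(1/\sqrt{\omega_i})$, and two regressors (an intercept and one slope). The data and all function names are hypothetical.

```python
import math


def solve_2x2(a11, a12, a22, b1, b2):
    """Solve the symmetric system [[a11, a12], [a12, a22]] z = [b1, b2]."""
    det = a11 * a22 - a12 * a12
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


def ols(x0, x1, y):
    """OLS of y on the two columns x0 and x1 via the normal equations."""
    a11 = sum(a * a for a in x0)
    a12 = sum(a * c for a, c in zip(x0, x1))
    a22 = sum(c * c for c in x1)
    b1 = sum(a * v for a, v in zip(x0, y))
    b2 = sum(c * v for c, v in zip(x1, y))
    return solve_2x2(a11, a12, a22, b1, b2)


def gls_by_whitening(x0, x1, y, omega_diag):
    """Premultiply by P^{-1} = diag(1 / sqrt(omega_i)), then run OLS."""
    p_inv = [1.0 / math.sqrt(w) for w in omega_diag]
    x0s = [p * a for p, a in zip(p_inv, x0)]
    x1s = [p * c for p, c in zip(p_inv, x1)]
    ys = [p * v for p, v in zip(p_inv, y)]
    return ols(x0s, x1s, ys)


def gls_direct(x0, x1, y, omega_diag):
    """Evaluate (X' Omega^{-1} X)^{-1} X' Omega^{-1} y for a diagonal Omega."""
    w = [1.0 / d for d in omega_diag]
    a11 = sum(wi * a * a for wi, a in zip(w, x0))
    a12 = sum(wi * a * c for wi, a, c in zip(w, x0, x1))
    a22 = sum(wi * c * c for wi, c in zip(w, x1))
    b1 = sum(wi * a * v for wi, a, v in zip(w, x0, y))
    b2 = sum(wi * c * v for wi, c, v in zip(w, x1, y))
    return solve_2x2(a11, a12, a22, b1, b2)


# Hypothetical data: intercept column, one regressor, variance growing with x.
const = [1.0, 1.0, 1.0, 1.0, 1.0]
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.5]
omega = [1.0, 4.0, 9.0, 16.0, 25.0]

b_whitened = gls_by_whitening(const, x, y, omega)
b_direct = gls_direct(const, x, y, omega)
```

The two coefficient pairs agree up to floating-point error, and setting every $\omega_i = 1$ reduces both to plain OLS.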

6. Suppose that the regression model is given by:

$$y = X\beta + u, \qquad u \sim N(0, \sigma^2 \Omega).$$

In this case, when we use OLS, what happens?

$$\hat\beta = (X'X)^{-1} X' y = \beta + (X'X)^{-1} X' u$$

$$V(\hat\beta) = \sigma^2 (X'X)^{-1} X' \Omega X (X'X)^{-1}$$

Compare GLS and OLS.

(a) Expectation:

$E(\hat\beta) = \beta$ and $E(b) = \beta$. Thus, both $\hat\beta$ and $b$ are unbiased estimators.

(b) Variance:

$$V(\hat\beta) = \sigma^2 (X'X)^{-1} X' \Omega X (X'X)^{-1}$$

$$V(b) = \sigma^2 (X' \Omega^{-1} X)^{-1}$$

Which is more efficient, OLS or GLS?

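$$\begin{aligned} V(\hat\beta) - V(b) &= \sigma^2 (X'X)^{-1} X' \Omega X (X'X)^{-1} - \sigma^2 (X' \Omega^{-1} X)^{-1} \\ &= \sigma^2 \Big( (X'X)^{-1} X' - (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} \Big) \Omega \Big( (X'X)^{-1} X' - (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} \Big)' \\ &= \sigma^2 A \Omega A', \end{aligned}$$

where $A = (X'X)^{-1} X' - (X' \Omega^{-1} X)^{-1} X' \Omega^{-1}$ is a $k \times n$ matrix (not the eigenvector matrix of item 4).

$\Omega$ is the variance-covariance matrix of $u$, which is a positive definite matrix.

Therefore, $A \Omega A'$ is a positive semidefinite matrix. It is the zero matrix when $\Omega = I_n$, because then $A = 0$.

This implies that $V(\hat\beta_i) - V(b_i) \ge 0$ for the $i$th element of $\beta$. Accordingly, $b$ is at least as efficient as $\hat\beta$, and more efficient whenever $A \ne 0$.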
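The variance comparison can be illustrated with a single regressor and a diagonal $\Omega$, where both formulas reduce to scalars. The numbers and function names below are assumptions chosen for easy arithmetic, and $\sigma^2 = 1$ is assumed.

```python
def var_ols(x, omega_diag, sigma2=1.0):
    """sigma^2 (X'X)^{-1} X' Omega X (X'X)^{-1}, one regressor, diagonal Omega."""
    sxx = sum(a * a for a in x)
    sxox = sum(a * a * w for a, w in zip(x, omega_diag))
    return sigma2 * sxox / (sxx * sxx)


def var_gls(x, omega_diag, sigma2=1.0):
    """sigma^2 (X' Omega^{-1} X)^{-1}, one regressor, diagonal Omega."""
    return sigma2 / sum(a * a / w for a, w in zip(x, omega_diag))


x = [1.0, 2.0, 3.0]
omega = [1.0, 4.0, 9.0]      # hypothetical heteroscedastic pattern
v_ols = var_ols(x, omega)    # 98 / 196 = 0.5
v_gls = var_gls(x, omega)    # 1 / 3
gap = v_ols - v_gls          # 1 / 6, nonnegative as the derivation predicts
```

With `omega = [1.0, 1.0, 1.0]` both functions return $1/14$, the case $\Omega = I_n$ in which the two estimators coincide.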

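7. If $u \sim N(0, \sigma^2 \Omega)$, then $b \sim N(\beta, \sigma^2 (X' \Omega^{-1} X)^{-1})$.

Consider testing the hypothesis $H_0 : R\beta = r$.

$R$: $G \times k$, $\mathrm{rank}(R) = G \le k$.

$$Rb \sim N(R\beta, \sigma^2 R (X' \Omega^{-1} X)^{-1} R').$$

Therefore, under $H_0$, the following quadratic form is distributed as:

$$\frac{(Rb - r)' \big( R (X' \Omega^{-1} X)^{-1} R' \big)^{-1} (Rb - r)}{\sigma^2} \sim \chi^2(G)$$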

8. Because $\dfrac{(y^* - X^* b)' (y^* - X^* b)}{\sigma^2} \sim \chi^2(n - k)$, we obtain:

$$\frac{(y - Xb)' \Omega^{-1} (y - Xb)}{\sigma^2} \sim \chi^2(n - k)$$

9. Furthermore, from the fact that $b$ is independent of $y - Xb$, the following $F$ distribution can be derived:

$$\frac{(Rb - r)' \big( R (X' \Omega^{-1} X)^{-1} R' \big)^{-1} (Rb - r) / G}{(y - Xb)' \Omega^{-1} (y - Xb) / (n - k)} \sim F(G, n - k)$$

10. Let $b$ be the unrestricted GLSE and $\tilde b$ be the restricted GLSE.

Their residuals are given by $e$ and $\tilde u$, respectively.

$$e = y - Xb, \qquad \tilde u = y - X \tilde b$$

Then, the $F$ test statistic is written as follows:

$$\frac{(\tilde u' \Omega^{-1} \tilde u - e' \Omega^{-1} e) / G}{e' \Omega^{-1} e / (n - k)} \sim F(G, n - k)$$
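The two forms of the $F$ statistic (item 9 and item 10) can be compared in a minimal sketch. It assumes one regressor ($k = 1$), the single restriction $H_0 : \beta = r_0$ (so $R = 1$, $G = 1$), and a diagonal $\Omega$. The data, `r0`, and the function names are hypothetical.

```python
def weighted_dot(a, b, omega_diag):
    """a' Omega^{-1} b for a diagonal Omega."""
    return sum(p * q / w for p, q, w in zip(a, b, omega_diag))


def f_stat_two_ways(x, y, omega_diag, r0):
    """F statistics for H0: beta = r0 with a single regressor (k = 1, G = 1)."""
    n, k, g = len(y), 1, 1
    xox = weighted_dot(x, x, omega_diag)                # X' Omega^{-1} X
    b = weighted_dot(x, y, omega_diag) / xox            # unrestricted GLSE
    e = [v - a * b for a, v in zip(x, y)]               # unrestricted residuals
    u_tilde = [v - a * r0 for a, v in zip(x, y)]        # restricted residuals
    ssr_u = weighted_dot(e, e, omega_diag)              # e' Omega^{-1} e
    ssr_r = weighted_dot(u_tilde, u_tilde, omega_diag)  # u~' Omega^{-1} u~
    # Item 9 with R = 1: the numerator is (b - r0)^2 * X' Omega^{-1} X / G.
    wald_form = ((b - r0) ** 2 * xox / g) / (ssr_u / (n - k))
    # Item 10: difference of restricted and unrestricted weighted residual sums.
    ssr_form = ((ssr_r - ssr_u) / g) / (ssr_u / (n - k))
    return wald_form, ssr_form


x = [1.0, 2.0, 3.0, 4.0]
y = [1.2, 1.9, 3.4, 3.7]
omega = [1.0, 2.0, 3.0, 4.0]   # hypothetical variance pattern
wald_form, ssr_form = f_stat_two_ways(x, y, omega, r0=1.0)
```

The two values agree because $\tilde u = e + X(b - \tilde b)$ and $X' \Omega^{-1} e = 0$, so $\tilde u' \Omega^{-1} \tilde u - e' \Omega^{-1} e = (b - \tilde b)' X' \Omega^{-1} X (b - \tilde b)$.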


8.1 Example: Mixed Estimation (Theil and Goldberger Model)

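A generalization of the restricted OLS $\Longrightarrow$ Stochastic linear restriction:

$$r = R\beta + v, \qquad E(v) = 0 \text{ and } V(v) = \sigma^2 \Psi$$

$$y = X\beta + u, \qquad E(u) = 0 \text{ and } V(u) = \sigma^2 I_n$$

Using a matrix form,

$$\begin{pmatrix} y \\ r \end{pmatrix} = \begin{pmatrix} X \\ R \end{pmatrix} \beta + \begin{pmatrix} u \\ v \end{pmatrix}, \qquad E \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \text{ and } V \begin{pmatrix} u \\ v \end{pmatrix} = \sigma^2 \begin{pmatrix} I_n & 0 \\ 0 & \Psi \end{pmatrix}$$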

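For estimation, we do not need the normality assumption.

Applying GLS, we obtain:

$$\begin{aligned} b &= \left( \begin{pmatrix} X' & R' \end{pmatrix} \begin{pmatrix} I_n & 0 \\ 0 & \Psi \end{pmatrix}^{-1} \begin{pmatrix} X \\ R \end{pmatrix} \right)^{-1} \begin{pmatrix} X' & R' \end{pmatrix} \begin{pmatrix} I_n & 0 \\ 0 & \Psi \end{pmatrix}^{-1} \begin{pmatrix} y \\ r \end{pmatrix} \\ &= (X'X + R' \Psi^{-1} R)^{-1} (X'y + R' \Psi^{-1} r). \end{aligned}$$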

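Mean and Variance of $b$: $b$ is rewritten as follows:

$$\begin{aligned} b &= \left( \begin{pmatrix} X' & R' \end{pmatrix} \begin{pmatrix} I_n & 0 \\ 0 & \Psi \end{pmatrix}^{-1} \begin{pmatrix} X \\ R \end{pmatrix} \right)^{-1} \begin{pmatrix} X' & R' \end{pmatrix} \begin{pmatrix} I_n & 0 \\ 0 & \Psi \end{pmatrix}^{-1} \begin{pmatrix} y \\ r \end{pmatrix} \\ &= \beta + \left( \begin{pmatrix} X' & R' \end{pmatrix} \begin{pmatrix} I_n & 0 \\ 0 & \Psi \end{pmatrix}^{-1} \begin{pmatrix} X \\ R \end{pmatrix} \right)^{-1} \begin{pmatrix} X' & R' \end{pmatrix} \begin{pmatrix} I_n & 0 \\ 0 & \Psi \end{pmatrix}^{-1} \begin{pmatrix} u \\ v \end{pmatrix} \end{aligned}$$

Therefore, the mean and variance are given by:

$E(b) = \beta$ $\Longrightarrow$ $b$ is unbiased.

$$V(b) = \sigma^2 \left( \begin{pmatrix} X' & R' \end{pmatrix} \begin{pmatrix} I_n & 0 \\ 0 & \Psi \end{pmatrix}^{-1} \begin{pmatrix} X \\ R \end{pmatrix} \right)^{-1} = \sigma^2 (X'X + R' \Psi^{-1} R)^{-1}$$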
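A scalar sketch of the mixed estimator may help fix ideas. It assumes a single coefficient, one stochastic restriction $r = R\beta + v$ with $V(v) = \sigma^2 \psi$, and hypothetical data. The closed form is compared with OLS on the data set extended by one whitened "observation" $(R/\sqrt{\psi},\, r/\sqrt{\psi})$.

```python
import math


def mixed_closed_form(x, y, R, r, psi):
    """(X'X + R' Psi^{-1} R)^{-1} (X'y + R' Psi^{-1} r) for a scalar beta."""
    sxx = sum(a * a for a in x)
    sxy = sum(a * v for a, v in zip(x, y))
    return (sxy + R * r / psi) / (sxx + R * R / psi)


def mixed_stacked_ols(x, y, R, r, psi):
    """Append the whitened restriction as an extra observation, then run OLS."""
    scale = 1.0 / math.sqrt(psi)   # whitening factor for the restriction row
    xs = list(x) + [scale * R]
    ys = list(y) + [scale * r]
    return sum(a * v for a, v in zip(xs, ys)) / sum(a * a for a in xs)


x = [1.0, 2.0, 3.0]
y = [1.1, 2.3, 2.8]
b_closed = mixed_closed_form(x, y, R=1.0, r=1.5, psi=0.5)
b_stacked = mixed_stacked_ols(x, y, R=1.0, r=1.5, psi=0.5)
b_ols = sum(a * v for a, v in zip(x, y)) / sum(a * a for a in x)
```

In this example the mixed estimate lies between the OLS estimate and the prior value $r/R = 1.5$, and it approaches OLS as $\psi$ grows, that is, as the restriction becomes less informative.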

9 Maximum Likelihood Estimation (MLE)

−→ Review

1. The distribution function of $\{X_i\}_{i=1}^n$ is $f(x; \theta)$, where $x = (x_1, x_2, \cdots, x_n)$ and $\theta = (\mu, \Sigma)$.

Note that $X$ is a vector of random variables and $x$ is a vector of their realizations (i.e., observed data).

The likelihood function $L(\cdot)$ is defined as $L(\theta; x) = f(x; \theta)$.

Note that $f(x; \theta) = \prod_{i=1}^n f(x_i; \theta)$ when $X_1, X_2, \cdots, X_n$ are mutually independently and identically distributed.

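The maximum likelihood estimator (MLE) of $\theta$ is the value of $\theta$ that solves:

$$\max_\theta \; L(\theta; X) \iff \max_\theta \; \log L(\theta; X).$$

MLE satisfies the following two conditions:

(a) $\dfrac{\partial \log L(\theta; X)}{\partial \theta} = 0$.

(b) $\dfrac{\partial^2 \log L(\theta; X)}{\partial \theta \partial \theta'}$ is a negative definite matrix.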

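2. Fisher's information matrix is defined as:

$$I(\theta) = -E \left( \frac{\partial^2 \log L(\theta; X)}{\partial \theta \partial \theta'} \right),$$

where we have the following equality:

$$-E \left( \frac{\partial^2 \log L(\theta; X)}{\partial \theta \partial \theta'} \right) = E \left( \frac{\partial \log L(\theta; X)}{\partial \theta} \frac{\partial \log L(\theta; X)}{\partial \theta'} \right) = V \left( \frac{\partial \log L(\theta; X)}{\partial \theta} \right)$$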
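The first-order condition (a) and the information for the mean can be illustrated with an iid $N(\mu, \sigma^2)$ sample, a case where the MLE has a closed form. The sample values, the step size `h`, and the function names are assumptions made for this sketch. The gradient is approximated by central differences rather than derived analytically.

```python
import math


def log_lik(mu, s2, data):
    """Log-likelihood of iid N(mu, s2) observations."""
    n = len(data)
    ss = sum((v - mu) ** 2 for v in data)
    return -0.5 * n * math.log(2.0 * math.pi * s2) - ss / (2.0 * s2)


def normal_mle(data):
    """Closed-form MLE: sample mean and the divide-by-n variance."""
    n = len(data)
    mu_hat = sum(data) / n
    s2_hat = sum((v - mu_hat) ** 2 for v in data) / n
    return mu_hat, s2_hat


def score(mu, s2, data, h=1e-6):
    """Central-difference approximation to the gradient of log L."""
    d_mu = (log_lik(mu + h, s2, data) - log_lik(mu - h, s2, data)) / (2 * h)
    d_s2 = (log_lik(mu, s2 + h, data) - log_lik(mu, s2 - h, data)) / (2 * h)
    return d_mu, d_s2


data = [2.0, 4.0, 4.0, 5.0, 7.0]     # hypothetical sample
mu_hat, s2_hat = normal_mle(data)
grad = score(mu_hat, s2_hat, data)   # condition (a): approximately zero
# With s2 treated as known, -d^2 log L / d mu^2 = n / s2 (information for mu).
info_mu = len(data) / s2_hat
```

Because $-\partial^2 \log L / \partial \mu^2 = n / \sigma^2$ does not depend on the data, its expectation equals itself, which gives the familiar variance $\sigma^2 / n$ for the sample mean as the inverse information.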
