Stock, J.H. (1987) "Asymptotic Properties of Least Squares Estimators of Cointegrating Vectors," Econometrica, Vol. 55, pp. 1035–1056.
Proposition:
Let y_{1,t} be a scalar, y_{2,t} be a k × 1 vector, and (y_{1,t}, y_{2,t}')' be a g × 1 vector, where g = k + 1.
Consider the following model:
\[
y_{1,t} = \alpha + \gamma' y_{2,t} + z_t^*, \qquad
\Delta y_{2,t} = u_{2,t}, \qquad
\begin{pmatrix} z_t^* \\ u_{2,t} \end{pmatrix} = \Psi^*(L)\,\varepsilon_t .
\]
ε_t is a g × 1 i.i.d. vector with E(ε_t) = 0 and E(ε_t ε_t') = PP'. The OLSE is given by:
\[
\begin{pmatrix} \hat\alpha \\ \hat\gamma \end{pmatrix}
= \begin{pmatrix} T & \sum y_{2,t}' \\ \sum y_{2,t} & \sum y_{2,t} y_{2,t}' \end{pmatrix}^{-1}
\begin{pmatrix} \sum y_{1,t} \\ \sum y_{1,t} y_{2,t} \end{pmatrix}.
\]
Define λ*_1, which is a g × 1 vector, and Λ*_2, which is a k × g matrix, as follows:
\[
\Psi^*(1)\,P = \begin{pmatrix} \lambda_1^{*\prime} \\ \Lambda_2^* \end{pmatrix}.
\]
Then, we have the following results:
\[
\begin{pmatrix} T^{1/2}(\hat\alpha - \alpha) \\ T(\hat\gamma - \gamma) \end{pmatrix}
\longrightarrow
\begin{pmatrix}
1 & \bigl(\Lambda_2^* \int W(r)\,dr\bigr)' \\
\Lambda_2^* \int W(r)\,dr & \Lambda_2^* \bigl(\int W(r)W(r)'\,dr\bigr)\Lambda_2^{*\prime}
\end{pmatrix}^{-1}
\begin{pmatrix} h_1 \\ h_2 \end{pmatrix},
\]
where
\[
\begin{pmatrix} h_1 \\ h_2 \end{pmatrix}
= \begin{pmatrix}
\lambda_1^{*\prime} W(1) \\
\Lambda_2^* \bigl(\int W(r)\,dW(r)'\bigr)\lambda_1^* + \sum_{\tau=0}^{\infty} E(u_{2,t} z_{t+\tau}^*)
\end{pmatrix}.
\]
W(r) denotes a g-dimensional standard Brownian motion.
1) The OLSE of the cointegrating vector is consistent even though the error z*_t is serially correlated.
2) The consistency of the OLSE implies that T^{−1} Σ û_t² −→ σ².
3) Because T^{−1} Σ (y_{1,t} − ȳ_1)² goes to infinity, the coefficient of determination, R², goes to one.
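A quick simulation (a numpy sketch, not from the source; the coefficient values are made up) illustrates results 1) and 3): with a random-walk regressor and a serially correlated error, OLS still recovers γ, and R² approaches one.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 5000
alpha, gamma = 1.0, 2.0

# y_{2,t} is a random walk; z*_t is a serially correlated (AR(1)) error
y2 = np.cumsum(rng.normal(size=T))
z = np.zeros(T)
for t in range(1, T):
    z[t] = 0.5 * z[t - 1] + rng.normal()
y1 = alpha + gamma * y2 + z

# OLS of y1 on a constant and y2
X = np.column_stack([np.ones(T), y2])
b, *_ = np.linalg.lstsq(X, y1, rcond=None)
resid = y1 - X @ b
r2 = 1.0 - resid.var() / y1.var()
print(b[1], r2)  # gamma_hat is close to 2; R^2 is close to 1
```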
3.4 Testing Cointegration
3.4.1 Engle-Granger Test
y_t ∼ I(1)
y_{1,t} = α + γ'y_{2,t} + u_t
• u_t ∼ I(0) =⇒ Cointegration
• u_t ∼ I(1) =⇒ Spurious Regression
Estimate y_{1,t} = α + γ'y_{2,t} + u_t by OLS, and obtain û_t.
Estimate û_t = ρ û_{t−1} + δ_1 Δû_{t−1} + δ_2 Δû_{t−2} + · · · + δ_{p−1} Δû_{t−p+1} + e_t by OLS.
ADF Test:
• H_0: ρ = 1 (Spurious Regression)
• H_1: ρ < 1 (Cointegration)
=⇒ Engle-Granger Test
For example, see Engle and Granger (1987), Phillips and Ouliaris (1990) and Hansen
(1992).
Asymptotic Distribution of Residual-Based ADF Test for Cointegration

# of Regressors,      (a) Regressors have no drift      (b) Some regressors have drift
excluding constant     1%     2.5%    5%     10%         1%     2.5%    5%     10%
        1            −3.96   −3.64  −3.37  −3.07       −3.96  −3.67  −3.41  −3.13
        2            −4.31   −4.02  −3.77  −3.45       −4.36  −4.07  −3.80  −3.52
        3            −4.73   −4.37  −4.11  −3.83       −4.65  −4.39  −4.16  −3.84
        4            −5.07   −4.71  −4.45  −4.16       −5.04  −4.77  −4.49  −4.20
        5            −5.28   −4.98  −4.71  −4.43       −5.36  −5.02  −4.74  −4.46
J.D. Hamilton (1994), Time Series Analysis, p.766.
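The two-step procedure above can be sketched in numpy (an illustration on simulated data; the function name and data-generating values are mine, not from the source). The resulting t-statistic is compared with the critical values in the table above.

```python
import numpy as np

def engle_granger_stat(y1, y2, p=2):
    """Residual-based ADF t-statistic for cointegration (two-step)."""
    T = len(y1)
    # Step 1: cointegrating regression y1 = alpha + gamma'y2 + u; get residuals
    X = np.column_stack([np.ones(T), y2])
    u = y1 - X @ np.linalg.lstsq(X, y1, rcond=None)[0]
    # Step 2: ADF regression, written as
    # Delta u_t = (rho - 1) u_{t-1} + sum_j delta_j Delta u_{t-j} + e_t
    du = np.diff(u)
    Z = np.array([[u[t]] + [du[t - j] for j in range(1, p)]
                  for t in range(p, len(du))])
    dep = du[p:]
    b, *_ = np.linalg.lstsq(Z, dep, rcond=None)
    e = dep - Z @ b
    s2 = e @ e / (len(dep) - Z.shape[1])
    se = np.sqrt(s2 * np.linalg.inv(Z.T @ Z)[0, 0])
    return b[0] / se  # t-statistic for H0: rho = 1

# Example: a clearly cointegrated pair
rng = np.random.default_rng(1)
y2 = np.cumsum(rng.normal(size=500))
y1 = 0.5 + 1.0 * y2 + rng.normal(size=500)
stat = engle_granger_stat(y1, y2)
print(stat)  # far below the 5% critical value -3.37 (one regressor, no drift)
```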
3.4.2 Error Correction Representation
VAR(p) model:
y_t = α + φ_1 y_{t−1} + φ_2 y_{t−2} + · · · + φ_p y_{t−p} + ε_t,
where y_t, α and ε_t indicate g × 1 vectors for t = 1, 2, · · ·, T, and φ_s is a g × g matrix for s = 1, 2, · · ·, p.
Rewrite:
y_t = α + ρ y_{t−1} + δ_1 Δy_{t−1} + δ_2 Δy_{t−2} + · · · + δ_{p−1} Δy_{t−p+1} + ε_t, where
ρ = φ_1 + φ_2 + · · · + φ_p,
δ_s = −(φ_{s+1} + φ_{s+2} + · · · + φ_p), for s = 1, 2, · · ·, p − 1.
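The equivalence of the two representations can be checked numerically (a numpy sketch with arbitrary made-up coefficients and lagged values, not from the source):

```python
import numpy as np

rng = np.random.default_rng(1)
g, p = 2, 3
phi = [rng.normal(size=(g, g)) for _ in range(p)]   # phi_1, ..., phi_p
ylag = [rng.normal(size=g) for _ in range(p)]       # y_{t-1}, ..., y_{t-p}

# Original VAR(p) right-hand side: sum_s phi_s y_{t-s}
var_rhs = sum(phi[s] @ ylag[s] for s in range(p))

# Rewritten form: rho y_{t-1} + sum_s delta_s Delta y_{t-s}
rho = sum(phi)                               # phi_1 + ... + phi_p
delta = [-sum(phi[s:]) for s in range(1, p)] # delta_s = -(phi_{s+1}+...+phi_p)
ecm_rhs = rho @ ylag[0] + sum(
    delta[s - 1] @ (ylag[s - 1] - ylag[s]) for s in range(1, p))

print(np.allclose(var_rhs, ecm_rhs))  # True
```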
Again, rewrite:
Δy_t = α + δ_0 y_{t−1} + δ_1 Δy_{t−1} + δ_2 Δy_{t−2} + · · · + δ_{p−1} Δy_{t−p+1} + ε_t, where
δ_0 = ρ − I_g = −φ(1),
for φ(L) = I_g − φ_1 L − φ_2 L² − · · · − φ_p L^p.
If y_t has h cointegrating relations, we have the following error correction representation:
Δy_t = α − BA'y_{t−1} + δ_1 Δy_{t−1} + δ_2 Δy_{t−2} + · · · + δ_{p−1} Δy_{t−p+1} + ε_t,
where A'y_{t−1} is a stationary h × 1 vector (i.e., h I(0) processes), and B and A are g × h matrices.
Note that φ(1) = BA' for φ(L) = I_g − φ_1 L − φ_2 L² − · · · − φ_p L^p.
Each row of A' denotes a cointegrating vector, i.e., A' consists of h cointegrating vectors.
Suppose that ε_t ∼ N(0, Σ). The log-likelihood function is:
\[
\log l(\alpha, \delta_1, \cdots, \delta_{p-1}, B \mid A)
= -\frac{Tg}{2}\log(2\pi) - \frac{T}{2}\log|\Sigma|
- \frac{1}{2}\sum_{t=1}^{T}
(\Delta y_t - \alpha + BA'y_{t-1} - \delta_1 \Delta y_{t-1} - \cdots - \delta_{p-1}\Delta y_{t-p+1})'\,
\Sigma^{-1}\,
(\Delta y_t - \alpha + BA'y_{t-1} - \delta_1 \Delta y_{t-1} - \cdots - \delta_{p-1}\Delta y_{t-p+1}).
\]
Given A and h, maximize log l with respect to α, δ_1, · · ·, δ_{p−1}, B.
Then, given h, how do we estimate A? =⇒ Johansen (1988, 1991)
(*) Canonical Correlation
x' = (x_1, x_2, · · ·, x_n) and y' = (y_1, y_2, · · ·, y_m), where n ≤ m.
u = a'x = a_1 x_1 + a_2 x_2 + · · · + a_n x_n,
v = b'y = b_1 y_1 + b_2 y_2 + · · · + b_m y_m,
where V(u) = V(v) = 1 and E(x) = E(y) = 0 for simplicity.
Define:
V(x) = Σ_xx, E(xy') = Σ_xy, V(y) = Σ_yy, E(yx') = Σ_yx = Σ_xy'.
The correlation coefficient between u and v, denoted by ρ, is:
ρ = Cov(u, v) / (√V(u) √V(v)) = a'Σ_xy b,
where V(u) = a'Σ_xx a = 1 and V(v) = b'Σ_yy b = 1.
Maximize ρ = a'Σ_xy b subject to a'Σ_xx a = 1 and b'Σ_yy b = 1.
The Lagrangian is:
L = a'Σ_xy b − (1/2)λ(a'Σ_xx a − 1) − (1/2)µ(b'Σ_yy b − 1).
Take derivatives with respect to a and b:
∂L/∂a = Σ_xy b − λΣ_xx a = 0,
∂L/∂b = Σ_xy' a − µΣ_yy b = 0.
Using a'Σ_xx a = 1 and b'Σ_yy b = 1, we obtain:
λ = µ = a'Σ_xy b.
From the first equation, we obtain:
a = (1/λ) Σ_xx^{−1} Σ_xy b,
which is substituted into the second equation as follows:
(1/λ) Σ_xy' Σ_xx^{−1} Σ_xy b − λΣ_yy b = 0, i.e.,
(Σ_yy^{−1} Σ_xy' Σ_xx^{−1} Σ_xy − λ² I_m) b = 0, i.e.,
|Σ_yy^{−1} Σ_xy' Σ_xx^{−1} Σ_xy − λ² I_m| = 0.
The solution for λ² is given by the maximum eigenvalue of Σ_yy^{−1} Σ_xy' Σ_xx^{−1} Σ_xy, and b is the corresponding eigenvector.
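The eigenproblem above is easy to solve numerically. A numpy sketch (the simulated data and its cross-correlation design are mine, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, T = 2, 3, 2000

# Simulated centered data; y's first component is correlated with x's first
x = rng.normal(size=(T, n))
y = rng.normal(size=(T, m))
y[:, 0] += x[:, 0]

Sxx = x.T @ x / T
Syy = y.T @ y / T
Sxy = x.T @ y / T

# Eigenproblem from the derivation:
# |Syy^{-1} Sxy' Sxx^{-1} Sxy - lambda^2 I_m| = 0
M = np.linalg.inv(Syy) @ Sxy.T @ np.linalg.inv(Sxx) @ Sxy
vals, vecs = np.linalg.eig(M)
i = np.argmax(vals.real)
lam2 = vals.real[i]
b = vecs[:, i].real                 # weights for v = b'y
a = np.linalg.inv(Sxx) @ Sxy @ b    # proportional to (1/lambda) Sxx^{-1} Sxy b

rho = np.sqrt(lam2)                 # first canonical correlation
print(rho)  # near 1/sqrt(2) ~ 0.707 for this design
```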
Back to the Cointegration:
Estimate the following two regressions:
Δy_t = b_{1,0} + b_{1,1} Δy_{t−1} + b_{1,2} Δy_{t−2} + · · · + b_{1,p−1} Δy_{t−p+1} + u_{1,t}
y_{t−1} = b_{2,0} + b_{2,1} Δy_{t−1} + b_{2,2} Δy_{t−2} + · · · + b_{2,p−1} Δy_{t−p+1} + u_{2,t}
Obtain û_{i,t} for i = 1, 2 and t = 1, 2, · · ·, T, and compute as follows:
Σ̂_11 = (1/T) Σ_{t=1}^T û_{1,t} û_{1,t}',  Σ̂_22 = (1/T) Σ_{t=1}^T û_{2,t} û_{2,t}',
Σ̂_12 = (1/T) Σ_{t=1}^T û_{1,t} û_{2,t}',  Σ̂_21 = Σ̂_12'.
From Σ̂_22^{−1} Σ̂_21 Σ̂_11^{−1} Σ̂_12, compute the h biggest eigenvalues, denoted by λ̂_1, λ̂_2, · · ·, λ̂_h, and the corresponding eigenvectors, denoted by â_1, â_2, · · ·, â_h, where λ̂_1 > λ̂_2 > · · · > λ̂_h.
The estimate of A, Â, is given by Â = (â_1, â_2, · · ·, â_h).
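The auxiliary regressions and the eigenvalue computation above can be sketched in numpy (the simulated system, which shares one common random-walk trend so that h = 1, is my assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
p = 2

# Simulated bivariate system sharing one random-walk trend (so h = 1)
w = np.cumsum(rng.normal(size=502))
y = np.column_stack([w + rng.normal(size=502), w + rng.normal(size=502)])
N = len(y)

# Dependent variables of the two auxiliary regressions, and the regressors
D1 = np.array([y[t] - y[t - 1] for t in range(p, N)])   # Delta y_t
D2 = np.array([y[t - 1] for t in range(p, N)])          # y_{t-1}
Z = np.array([np.concatenate([[1.0],
                              *[y[t - j] - y[t - j - 1] for j in range(1, p)]])
              for t in range(p, N)])                    # const, Delta y_{t-1}, ...

# Residuals u_{1,t}, u_{2,t} and the Sigma-hat matrices
P = Z @ np.linalg.inv(Z.T @ Z) @ Z.T
u1, u2 = D1 - P @ D1, D2 - P @ D2
Tn = len(u1)
S11, S22, S12 = u1.T @ u1 / Tn, u2.T @ u2 / Tn, u1.T @ u2 / Tn

# Eigenvalues of Sigma22^{-1} Sigma21 Sigma11^{-1} Sigma12 (Sigma21 = Sigma12')
M = np.linalg.inv(S22) @ S12.T @ np.linalg.inv(S11) @ S12
lams = np.sort(np.linalg.eigvals(M).real)[::-1]
print(lams)  # the largest eigenvalue dominates: evidence of one relation
```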
How do we obtain h?
3.5 Testing the Number of Cointegrating Vectors
Trace Test:
H_0: λ_{h+1} = 0 and H_1: λ_{h+1} > 0.
2(log l_1 − log l_0) = −T Σ_{i=h+1}^{g} log(1 − λ̂_i) −→ tr(Q), where
Q = (∫_0^1 W(r) dW(r)')' (∫_0^1 W(r)W(r)' dr)^{−1} (∫_0^1 W(r) dW(r)').
Trace Test for # of Cointegrating Relations

# of Random      (a) Regressors have no drift       (b) Some regressors have drift
Walks (g − h)     1%      2.5%    5%      10%         1%      2.5%    5%      10%
     1          11.576   9.658   8.083   6.691       6.936   5.332   3.962   2.816
     2          21.962  19.611  17.844  15.583      19.310  17.299  15.197  13.338
     3          37.291  34.062  31.256  28.436      35.397  32.313  29.509  26.791
     4          55.551  51.801  48.419  45.248      53.792  50.424  47.181  43.964
     5          77.911  73.031  69.977  65.956      76.955  72.140  68.905  65.063
J.D. Hamilton (1994), Time Series Analysis, p.767.
Largest Eigenvalue Test:
H_0: λ_{h+1} = 0 and H_1: λ_{h+1} > 0.
2(log l_1 − log l_0) = −T log(1 − λ̂_{h+1}) −→ maximum eigenvalue of Q.
Maximum Eigenvalue Test for # of Cointegrating Relations

# of Random      (a) Regressors have no drift       (b) Some regressors have drift
Walks (g − h)     1%      2.5%    5%      10%         1%      2.5%    5%      10%
     1          11.576   9.658   8.083   6.691       6.936   5.332   3.962   2.816
     2          18.782  16.403  14.595  12.783      17.936  15.810  14.036  12.099
     3          26.154  23.362  21.279  18.959      25.521  23.002  20.778  18.697
     4          32.616  29.599  27.341  24.917      31.943  29.335  27.169  24.712
     5          38.858  35.700  33.262  30.818      38.341  35.546  33.178  30.774
J.D. Hamilton (1994), Time Series Analysis, p.768.
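Both statistics are simple functions of the estimated eigenvalues. A minimal sketch (the eigenvalues below are hypothetical, made up for illustration; compare the results with the tabulated critical values):

```python
import numpy as np

def trace_stat(lams, h, T):
    """Trace statistic: -T * sum_{i=h+1}^{g} log(1 - lambda_i)."""
    return -T * float(np.sum(np.log(1.0 - np.asarray(lams)[h:])))

def max_eig_stat(lams, h, T):
    """Maximum eigenvalue statistic: -T * log(1 - lambda_{h+1})."""
    return -T * float(np.log(1.0 - lams[h]))

# Hypothetical eigenvalues (sorted in descending order), sample size T = 200
lams = [0.25, 0.08, 0.01]
T = 200
print(trace_stat(lams, 0, T), max_eig_stat(lams, 0, T))  # test H0: h = 0
```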
4 GMM (Generalized Method of Moments)
1. Method of Moments:
Regression Model: y_t = x_t β + ε_t
From the assumption, E(x_t' ε_t) = 0.
The sample mean is given by:
(1/T) Σ_{t=1}^T x_t' ε_t = (1/T) Σ_{t=1}^T x_t'(y_t − x_t β) = 0.
Therefore,
β_MM = ((1/T) Σ_{t=1}^T x_t' x_t)^{−1} ((1/T) Σ_{t=1}^T x_t' y_t),
which is equivalent to OLS.
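This equivalence is easy to verify numerically (a numpy sketch on simulated data; the coefficient values are made up):

```python
import numpy as np

rng = np.random.default_rng(5)
T = 200
x = np.column_stack([np.ones(T), rng.normal(size=T)])
beta_true = np.array([1.0, 2.0])
y = x @ beta_true + rng.normal(size=T)

# Method-of-moments estimator: solve (1/T) sum x_t'(y_t - x_t beta) = 0
beta_mm = np.linalg.solve(x.T @ x / T, x.T @ y / T)

# OLS via least squares gives the identical solution
beta_ols, *_ = np.linalg.lstsq(x, y, rcond=None)
print(np.allclose(beta_mm, beta_ols))  # True
```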
2. Generalized Method of Moments (GMM):
E(h(θ; w_t)) = 0
θ is a k × 1 parameter vector to be estimated.
w_t is an observed vector, w_t = (y_t, x_t).
h(θ; w_t) is an r × 1 vector function, where r ≥ k.
Define g(θ; W_T) as follows:
g(θ; W_T) = (1/T) Σ_{t=1}^T h(θ; w_t), where W_T = {w_T, w_{T−1}, · · ·, w_1}.
Compute:
min_θ g(θ; W_T)' S^{−1} g(θ; W_T)
The solution of θ, denoted by θ̂_T, corresponds to the GMM estimator, where S is defined as follows:
S = lim_{T→∞}
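The notes break off before S is defined. As an illustration only, the minimization above can be sketched for linear instrumental-variable moments h(θ; w_t) = z_t(y_t − x_t θ), with the identity matrix standing in for S (my assumption here, not a choice made in the source; the simulated data are also made up):

```python
import numpy as np

rng = np.random.default_rng(6)
T, r = 300, 2   # one parameter (k = 1), two moment conditions (r >= k)

# Instruments z_t; regressor x_t correlated with z_t; true theta = 2
z = rng.normal(size=(T, r))
x = z @ np.array([1.0, 0.5]) + rng.normal(size=T)
theta_true = 2.0
y = x * theta_true + rng.normal(size=T)

# g(theta; W_T) = (1/T) sum z_t (y_t - x_t theta); minimize g' S^{-1} g
S = np.eye(r)            # identity weighting matrix (assumption: S not given)
Zx = z.T @ x / T         # (1/T) sum z_t x_t
Zy = z.T @ y / T         # (1/T) sum z_t y_t
Sinv = np.linalg.inv(S)

# Closed-form minimizer in the linear case:
# theta = (Zx' S^{-1} Zx)^{-1} Zx' S^{-1} Zy
theta_gmm = (Zx @ Sinv @ Zy) / (Zx @ Sinv @ Zx)
print(theta_gmm)  # close to 2.0
```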