Volume 2010, Article ID 956907, 30 pages, doi:10.1155/2010/956907
Research Article
QML Estimators in Linear Regression Models with Functional Coefficient Autoregressive Processes
Hongchang Hu
School of Mathematics and Statistics, Hubei Normal University, Huangshi 435002, China
Correspondence should be addressed to Hongchang Hu, retutome@163.com
Received 30 December 2009; Revised 19 March 2010; Accepted 6 April 2010
Academic Editor: Massimo Scalia
Copyright © 2010 Hongchang Hu. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
This paper studies a linear regression model whose errors are functional coefficient autoregressive processes. Firstly, the quasi-maximum likelihood (QML) estimators of some unknown parameters are given. Secondly, under general conditions, the asymptotic properties (existence, consistency, and asymptotic distributions) of the QML estimators are investigated. These results extend those of Maller (2003), White (1959), Brockwell and Davis (1987), and so on. Lastly, the validity and feasibility of the method are illustrated by a simulation example and a real example.
1. Introduction
Consider the following linear regression model:
y_t = x_t^T\beta + \varepsilon_t, \quad t = 1, 2, \ldots, n, \qquad (1.1)
where the y_t's are scalar response variables, the x_t's are explanatory variables, β is a d-dimensional unknown parameter, and the ε_t's are functional coefficient autoregressive processes given by

\varepsilon_1 = \eta_1, \quad \varepsilon_t = f_t(\theta)\varepsilon_{t-1} + \eta_t, \quad t = 2, 3, \ldots, n, \qquad (1.2)

where the η_t's are independent and identically distributed random errors with zero mean and finite variance σ², θ is a one-dimensional unknown parameter, and f_t(θ) is a real-valued function defined on a compact set Θ ⊂ R¹ which contains the true value θ_0 as an inner point. The values of θ_0 and σ² are unknown.
Model (1.1) includes many special cases, such as the ordinary linear regression model (when f_t(θ) ≡ 0; see [1–11]). In the sequel we always assume that f_t(θ) ≠ 0 for some θ ∈ Θ. The model then covers linear regression with constant coefficient autoregressive errors (when f_t(θ) = θ; see Maller [12], Pere [13], and Fuller [14]), time-dependent and functional coefficient autoregressive processes (when β = 0; see Kwoun and Yajima [15]), constant coefficient autoregressive processes (when f_t(θ) = θ and β = 0; see White [16, 17], Hamilton [18], Brockwell and Davis [19], and Abadir and Lucas [20]), and time-dependent or time-varying autoregressive processes (when f_t(θ) = a_t and β = 0; see Carsoule and Franses [21], Azrak and Mélard [22], and Dahlhaus [23]), and so forth.
Regression analysis is one of the most mature and widely applied branches of statistics, and linear regression is its most widely used technique. Its applications occur in almost every field, including engineering, economics, the physical sciences, management, the life and biological sciences, and the social sciences. The linear regression model is the most important and popular model in the statistical literature, and many statisticians have studied estimation of its coefficients. For the ordinary linear regression model (when the errors are independent and identically distributed random variables), Bai and Guo [1], Chen [2], Anderson and Taylor [3], Drygas [4], González-Rodríguez et al. [5], Hampel et al. [6], He [7], Cui [8], Durbin [9], Hoerl and Kennard [10], Li and Yang [11], and Zhang et al. [24] used various estimation methods (least squares estimation, robust estimation, biased estimation, and Bayes estimation) to obtain estimators of the unknown parameters in (1.1) and discussed some large- or small-sample properties of these estimators.
However, the independence assumption for the errors is not always appropriate in applications, especially for sequentially collected economic and physical data, which often exhibit evident dependence in the errors. Recently, linear regression with serially correlated errors has attracted increasing attention from statisticians. One case of considerable interest is that in which the errors are autoregressive processes; the asymptotic theory of this estimator was developed by Hannan and Kavalieris [25]. Fox and Taqqu [26] established its asymptotic normality in the case of long-memory stationary Gaussian observation errors. Giraitis and Surgailis [27] extended this result to non-Gaussian linear sequences. The asymptotic distribution of the maximum likelihood estimator was studied by Giraitis and Koul in [28] and Koul in [29] when the errors are nonlinear instantaneous functions of a Gaussian long-memory sequence. Koul and Surgailis [30] established the asymptotic normality of the Whittle estimator in linear regression models with non-Gaussian long-memory moving average errors. When the errors are Gaussian, or a function of Gaussian random variables that are strictly stationary and long-range dependent, Koul and Mukherjee [31] investigated the linear model. Shiohama and Taniguchi [32] estimated the regression parameters in a linear regression model with autoregressive errors.
In addition, the (constant, functional, or random coefficient) autoregressive model itself has gained much attention and has been applied in many fields, such as economics, physics, geography, geology, biology, and agriculture. Fan and Yao [33], Berk [34], Hannan and Kavalieris [35], Goldenshluger and Zeevi [36], Liebscher [37], An et al. [38], Elsebach [39], Carsoule and Franses [21], Baran et al. [40], Distaso [41], and Harvill and Ray [42] used various estimation methods (the least squares method, the Yule–Walker method, the method of stochastic approximation, and robust estimation methods) to obtain estimators and discussed their asymptotic properties, or investigated hypothesis testing.
This paper discusses the model (1.1)-(1.2), including stationary and explosive processes. The organization of the paper is as follows. In Section 2 some estimators of β, θ, and σ² are given by the quasi-maximum likelihood method. In Section 3, under general conditions, the existence, consistency, and asymptotic normality of the quasi-maximum likelihood estimators are investigated. Some preliminary lemmas are presented in Section 4. The main proofs are given in Section 5, with some examples in Section 6.
2. Estimation Method
Write the "true" model as

y_t = x_t^T\beta_0 + e_t, \quad t = 1, 2, \ldots, n, \qquad (2.1)

e_1 = \eta_1, \quad e_t = f_t(\theta_0)e_{t-1} + \eta_t, \quad t = 2, 3, \ldots, n, \qquad (2.2)

where f_t'(\theta_0) = df_t(\theta)/d\theta \big|_{\theta=\theta_0} \neq 0, and the η_t's are i.i.d. errors with zero mean and finite variance σ_0². Define \prod_{i=0}^{-1} f_{t-i}(\theta_0) = 1; then by (2.2) we have

e_t = \sum_{j=0}^{t-1} \Bigl( \prod_{i=0}^{j-1} f_{t-i}(\theta_0) \Bigr) \eta_{t-j}. \qquad (2.3)

Thus e_t is measurable with respect to the σ-field H_t generated by η_1, η_2, \ldots, η_t, and

E e_t = 0, \quad \operatorname{Var}(e_t) = \sigma_0^2 \sum_{j=0}^{t-1} \prod_{i=0}^{j-1} f_{t-i}^2(\theta_0). \qquad (2.4)
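The recursion (2.2) and the moment formulas (2.3)-(2.4) are easy to check numerically. The following sketch (the paper contains no code; the functional coefficient f_t(θ) = θ sin(t)/2 is a hypothetical choice with |f_t(θ)| < 1) simulates e_t with Gaussian innovations and compares the Monte Carlo variance of e_n against formula (2.4).

```python
import math
import random

def f(t, theta):
    # hypothetical functional coefficient with |f_t(theta)| < 1
    return theta * math.sin(t) / 2.0

def simulate_e(n, theta0, sigma0, rng):
    # e_1 = eta_1, e_t = f_t(theta_0) e_{t-1} + eta_t   -- recursion (2.2)
    e = [rng.gauss(0.0, sigma0)]
    for t in range(2, n + 1):
        e.append(f(t, theta0) * e[-1] + rng.gauss(0.0, sigma0))
    return e

def var_e(t, theta0, sigma0):
    # Var(e_t) = sigma_0^2 sum_{j=0}^{t-1} prod_{i=0}^{j-1} f_{t-i}^2(theta_0)  -- (2.4)
    total = 0.0
    for j in range(t):
        prod = 1.0
        for i in range(j):
            prod *= f(t - i, theta0) ** 2
        total += prod
    return sigma0 ** 2 * total

rng = random.Random(0)
n, theta0, sigma0 = 5, 0.8, 1.0
samples = [simulate_e(n, theta0, sigma0, rng)[-1] for _ in range(40000)]
mc_mean = sum(samples) / len(samples)
mc_var = sum(s * s for s in samples) / len(samples)
```

With 40000 replications the empirical mean of e_n is close to 0 and the empirical variance is close to the value of (2.4), in line with the measurability and moment claims above.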
Assume at first that the η_t's are i.i.d. N(0, σ²). Using similar arguments to those of Fuller [14] or Maller [12], we get the log-likelihood of y_2, y_3, \ldots, y_n conditional on y_1:

\Psi_n(\beta, \theta, \sigma^2) = \log L_n = -\tfrac{1}{2}(n-1)\log\sigma^2 - \frac{1}{2\sigma^2} \sum_{t=2}^{n} \bigl(\varepsilon_t - f_t(\theta)\varepsilon_{t-1}\bigr)^2 - \tfrac{1}{2}(n-1)\log 2\pi. \qquad (2.5)
At this stage we drop the normality assumption, but still maximize (2.5) to obtain QML estimators, denoted by (\hat\sigma_n^2, \hat\beta_n, \hat\theta_n) when they exist:

\frac{\partial\Psi_n}{\partial\sigma^2} = -\frac{n-1}{2\sigma^2} + \frac{1}{2\sigma^4} \sum_{t=2}^{n} \bigl(\varepsilon_t - f_t(\theta)\varepsilon_{t-1}\bigr)^2, \qquad (2.6)

\frac{\partial\Psi_n}{\partial\theta} = \frac{1}{\sigma^2} \sum_{t=2}^{n} f_t'(\theta)\bigl(\varepsilon_t - f_t(\theta)\varepsilon_{t-1}\bigr)\varepsilon_{t-1}, \qquad (2.7)

\frac{\partial\Psi_n}{\partial\beta} = \frac{1}{\sigma^2} \sum_{t=2}^{n} \bigl(\varepsilon_t - f_t(\theta)\varepsilon_{t-1}\bigr)\bigl(x_t - f_t(\theta)x_{t-1}\bigr). \qquad (2.8)
Thus (\hat\sigma_n^2, \hat\beta_n, \hat\theta_n) satisfy the following estimating equations:

\hat\sigma_n^2 = \frac{1}{n-1} \sum_{t=2}^{n} \bigl(\hat\varepsilon_t - f_t(\hat\theta_n)\hat\varepsilon_{t-1}\bigr)^2, \qquad (2.9)

\sum_{t=2}^{n} \bigl(\hat\varepsilon_t - f_t(\hat\theta_n)\hat\varepsilon_{t-1}\bigr) f_t'(\hat\theta_n)\hat\varepsilon_{t-1} = 0, \qquad (2.10)

\sum_{t=2}^{n} \bigl(\hat\varepsilon_t - f_t(\hat\theta_n)\hat\varepsilon_{t-1}\bigr)\bigl(x_t - f_t(\hat\theta_n)x_{t-1}\bigr) = 0, \qquad (2.11)

where

\hat\varepsilon_t = y_t - x_t^T\hat\beta_n. \qquad (2.12)
Remark 2.1. If f_t(θ) = θ, then the above equations become the same as Maller's [12]. Therefore, our QML estimators extend those of Maller [12].
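For fixed β and θ, equation (2.9) gives in closed form the σ² that maximizes the objective (2.5). A minimal sketch (hypothetical helper names; the residuals ε_t and the coefficients f_t(θ) are passed in as plain lists) illustrates this profile step:

```python
import math

def psi_n(eps, f_vals, sigma2):
    # conditional log-likelihood (2.5); eps[t-1] holds eps_t, f_vals[t-1] holds f_t(theta)
    n = len(eps)
    rss = sum((eps[t] - f_vals[t] * eps[t - 1]) ** 2 for t in range(1, n))
    return (-0.5 * (n - 1) * math.log(sigma2)
            - rss / (2.0 * sigma2)
            - 0.5 * (n - 1) * math.log(2.0 * math.pi))

def sigma2_hat(eps, f_vals):
    # (2.9): the maximizer of (2.5) in sigma^2 for fixed beta and theta
    n = len(eps)
    return sum((eps[t] - f_vals[t] * eps[t - 1]) ** 2 for t in range(1, n)) / (n - 1)

eps = [0.5, -0.2, 0.9, 0.1, -0.4]   # toy residuals eps_1..eps_5
fv = [0.0, 0.3, 0.3, 0.3, 0.3]      # toy values of f_t(theta), t = 1..5
s2 = sigma2_hat(eps, fv)
```

Because (2.9) solves ∂Ψ_n/∂σ² = 0 exactly, Ψ_n evaluated at σ̂_n² is no smaller than at any other value of σ², which the toy data above confirms.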
To calculate the values of the QML estimators, we may use the grid search method, the steepest ascent method, the Newton–Raphson method, or a modified Newton–Raphson method. For the calculations in Section 6, we introduce the most popular modified Newton–Raphson method, proposed by Davidon, Fletcher, and Powell (see Hamilton [18]).
Let the (d+2)×1 vector \vec\theta_m = (\sigma_m^2, \beta_m^T, \theta_m)^T denote an estimator of \vec\theta = (\sigma^2, \beta^T, \theta)^T calculated at the m-th iteration, and let A_m denote an estimate of [H(\vec\theta_m)]^{-1}. The new estimator \vec\theta_{m+1} is given by

\vec\theta_{m+1} = \vec\theta_m + s A_m g(\vec\theta_m), \qquad (2.13)

where s is the positive scalar that maximizes \Psi_n\{\vec\theta_m + s A_m g(\vec\theta_m)\}, and the (d+2)×1 gradient vector is

g(\vec\theta_m) = \frac{\partial\Psi_n(\vec\theta)}{\partial\vec\theta}\Big|_{\vec\theta = \vec\theta_m} = \Bigl( \frac{\partial\Psi_n}{\partial\sigma^2}\Big|_{\sigma^2=\sigma_m^2},\; \frac{\partial\Psi_n}{\partial\beta}\Big|_{\beta=\beta_m},\; \frac{\partial\Psi_n}{\partial\theta}\Big|_{\theta=\theta_m} \Bigr)^T, \qquad (2.14)
and the (d+2)×(d+2) symmetric matrix is

H(\vec\theta_m) = -\frac{\partial^2\Psi_n(\vec\theta)}{\partial\vec\theta\,\partial\vec\theta^T}\Big|_{\vec\theta=\vec\theta_m} = -\begin{pmatrix} \dfrac{\partial^2\Psi_n}{\partial(\sigma^2)^2} & \dfrac{\partial^2\Psi_n}{\partial\sigma^2\partial\beta} & \dfrac{\partial^2\Psi_n}{\partial\sigma^2\partial\theta} \\[4pt] \ast & \dfrac{\partial^2\Psi_n}{\partial\beta\,\partial\beta^T} & \dfrac{\partial^2\Psi_n}{\partial\beta\,\partial\theta} \\[4pt] \ast & \ast & \dfrac{\partial^2\Psi_n}{\partial\theta^2} \end{pmatrix}\Bigg|_{\vec\theta=\vec\theta_m}, \qquad (2.15)
where

\frac{\partial^2\Psi_n}{\partial(\sigma^2)^2} = \frac{n-1}{2\sigma^4} - \frac{1}{\sigma^6}\sum_{t=2}^n \bigl(\varepsilon_t - f_t(\theta)\varepsilon_{t-1}\bigr)^2,
\quad
\frac{\partial^2\Psi_n}{\partial\sigma^2\partial\beta} = -\frac{1}{\sigma^4}\sum_{t=2}^n \bigl(\varepsilon_t - f_t(\theta)\varepsilon_{t-1}\bigr)\bigl(x_t - f_t(\theta)x_{t-1}\bigr)^T,
\quad
\frac{\partial^2\Psi_n}{\partial\sigma^2\partial\theta} = -\frac{1}{\sigma^4}\sum_{t=2}^n \bigl(\varepsilon_t - f_t(\theta)\varepsilon_{t-1}\bigr) f_t'(\theta)\varepsilon_{t-1}, \qquad (2.16)

\frac{\partial^2\Psi_n}{\partial\beta\,\partial\beta^T} = -\frac{1}{\sigma^2}\sum_{t=2}^n \bigl(x_t - f_t(\theta)x_{t-1}\bigr)\bigl(x_t - f_t(\theta)x_{t-1}\bigr)^T, \qquad (2.17)

\frac{\partial^2\Psi_n}{\partial\beta\,\partial\theta} = -\frac{1}{\sigma^2}\sum_{t=2}^n \bigl[ f_t'(\theta)\varepsilon_{t-1}x_t + f_t'(\theta)\varepsilon_t x_{t-1} - 2 f_t(\theta)f_t'(\theta)x_{t-1}\varepsilon_{t-1} \bigr],
\quad
\frac{\partial^2\Psi_n}{\partial\theta^2} = -\frac{1}{\sigma^2}\sum_{t=2}^n \bigl[ \bigl( f_t'^2(\theta) + f_t(\theta)f_t''(\theta) \bigr)\varepsilon_{t-1}^2 - f_t''(\theta)\varepsilon_t\varepsilon_{t-1} \bigr]. \qquad (2.18)
Once \vec\theta_{m+1} and the gradient at \vec\theta_{m+1} have been calculated, a new estimate A_{m+1} is found from

A_{m+1} = A_m - \frac{A_m\,\Delta g_{m+1}(\Delta g_{m+1})^T A_m}{(\Delta g_{m+1})^T A_m\,\Delta g_{m+1}} - \frac{\Delta\vec\theta_{m+1}(\Delta\vec\theta_{m+1})^T}{(\Delta g_{m+1})^T \Delta\vec\theta_{m+1}}, \qquad (2.19)

where

\Delta\vec\theta_{m+1} = \vec\theta_{m+1} - \vec\theta_m, \quad \Delta g_{m+1} = g(\vec\theta_{m+1}) - g(\vec\theta_m). \qquad (2.20)
It is well known that least squares estimators in the ordinary linear regression model are very good estimators, so a recursive procedure is to start the iteration with (\beta_0, \sigma_0^2) taken as the least squares estimators of β and σ², respectively, and to take θ_0 such that f_t(\theta_0) = 0. Iterations are stopped when some termination criterion is reached, for example, when

\frac{\|\vec\theta_{m+1} - \vec\theta_m\|}{\|\vec\theta_m\|} < \delta \qquad (2.21)

for some prechosen small number δ > 0.
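The update (2.19)-(2.20) can be sketched directly in code. The block below is a minimal illustration, not the paper's simulation: a toy two-dimensional concave quadratic stands in for Ψ_n, the line search for s is exact (which is possible for a quadratic), and the stopping rule is in the spirit of (2.21).

```python
def dfp_update(A, dtheta, dg):
    # (2.19): A_{m+1} = A_m - (A dg dg^T A)/(dg^T A dg) - (dtheta dtheta^T)/(dg^T dtheta)
    d = len(A)
    Adg = [sum(A[i][j] * dg[j] for j in range(d)) for i in range(d)]
    den1 = sum(dg[i] * Adg[i] for i in range(d))
    den2 = sum(dg[i] * dtheta[i] for i in range(d))
    return [[A[i][j] - Adg[i] * Adg[j] / den1 - dtheta[i] * dtheta[j] / den2
             for j in range(d)] for i in range(d)]

# Toy concave objective Psi(t) = -((t1 - 1)^2 + 2 (t2 + 0.5)^2), maximized at (1, -0.5).
def grad(t):
    return [-2.0 * (t[0] - 1.0), -4.0 * (t[1] + 0.5)]

B = [[2.0, 0.0], [0.0, 4.0]]          # curvature matrix of the toy quadratic
theta = [0.0, 0.0]                    # starting value
A = [[1.0, 0.0], [0.0, 1.0]]          # A_0 = identity
g = grad(theta)
for _ in range(20):
    if max(abs(v) for v in g) < 1e-10:          # stop, cf. criterion (2.21)
        break
    d_dir = [sum(A[i][j] * g[j] for j in range(2)) for i in range(2)]
    # exact line search for a quadratic: s maximizes Psi(theta + s * d_dir)
    s = (sum(gi * di for gi, di in zip(g, d_dir))
         / sum(d_dir[i] * B[i][j] * d_dir[j] for i in range(2) for j in range(2)))
    new_theta = [theta[k] + s * d_dir[k] for k in range(2)]
    new_g = grad(new_theta)
    A = dfp_update(A, [new_theta[k] - theta[k] for k in range(2)],
                   [new_g[k] - g[k] for k in range(2)])
    theta, g = new_theta, new_g
```

On a quadratic with exact line search, the DFP iteration terminates at the maximizer in at most two steps here, so the loop exits almost immediately via the stopping rule.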
Up to this point, we have obtained the values of the QML estimators when the function f_t(θ) = f(t, θ) is known. In practice, however, f_t(θ) is rarely known, and we have to estimate it. By (2.12) and (1.2), we obtain

\hat f(t, \hat\theta_n) = \frac{\hat\varepsilon_t}{\hat\varepsilon_{t-1}}, \quad t = 2, 3, \ldots, n. \qquad (2.22)

Based on the dataset \{\hat f(t, \hat\theta_n),\; t = 2, 3, \ldots, n\}, we may obtain an estimate of f(t, θ) by some smoothing method (see Simonoff [43], Fan and Yao [33], Green and Silverman [44], Fan and Gijbels [45], etc.).
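A minimal sketch of this two-stage idea follows (hypothetical helper names; a crude centered moving average stands in for the kernel or spline smoothers cited above):

```python
def ratio_estimates(eps):
    # pointwise estimates f_hat(t) = eps_t / eps_{t-1}, t = 2, ..., n   -- (2.22)
    return [eps[t] / eps[t - 1] for t in range(1, len(eps))]

def smooth(vals, window=3):
    # crude centered moving average standing in for a kernel/spline smoother
    half = window // 2
    out = []
    for i in range(len(vals)):
        lo, hi = max(0, i - half), min(len(vals), i + half + 1)
        out.append(sum(vals[lo:hi]) / (hi - lo))
    return out

# noise-free toy series with constant f_t(theta) = 0.5: eps_t = 0.5 * eps_{t-1}
eps = [1.0]
for _ in range(9):
    eps.append(0.5 * eps[-1])
f_hat = smooth(ratio_estimates(eps))
```

On this noise-free series every smoothed ratio recovers the constant coefficient 0.5. In practice \hat\varepsilon_{t-1} can be close to zero, so the raw ratios are noisy; this is exactly why the paper passes them through a smoother.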
To obtain our results, the following conditions are sufficient.

(A1) X_n = \sum_{t=2}^n x_t x_t^T is positive definite for sufficiently large n, and

\lim_{n\to\infty} \max_{1\le t\le n} x_t^T X_n^{-1} x_t = 0, \qquad (2.23)

\limsup_{n\to\infty} |\lambda|_{\max}\bigl( X_n^{-1/2} Z_n X_n^{-T/2} \bigr) < 1, \qquad (2.24)

where Z_n = \tfrac{1}{2}\sum_{t=2}^n \bigl(x_t x_{t-1}^T + x_{t-1} x_t^T\bigr) and |\lambda|_{\max}(\cdot) denotes the maximum in absolute value of the eigenvalues of a symmetric matrix.

(A2) There is a constant α > 0 such that

\sum_{j=1}^{t} \prod_{i=0}^{j-1} f_{t-i}^2(\theta) \le \alpha \qquad (2.25)

for any t ∈ \{1, 2, \ldots, n\} and θ ∈ Θ.

(A3) The derivatives f_t'(\theta) = df_t(\theta)/d\theta and f_t''(\theta) = d^2 f_t(\theta)/d\theta^2 exist and are bounded for any t and θ ∈ Θ.
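Condition (A2) is easy to check numerically for a candidate coefficient function. The sketch below uses the hypothetical choice f_t(θ) = θ sin(t); since |f_t(θ)| ≤ |θ| ≤ 0.9 on the grid used, the geometric bound α = 1/(1 − 0.9²) dominates the left-hand side of (2.25).

```python
import math

def f(t, theta):
    # hypothetical coefficient function with |f_t(theta)| <= |theta|
    return theta * math.sin(t)

def lhs_A2(t, theta):
    # sum_{j=1}^{t} prod_{i=0}^{j-1} f_{t-i}^2(theta)  -- left-hand side of (2.25)
    total = 0.0
    for j in range(1, t + 1):
        prod = 1.0
        for i in range(j):
            prod *= f(t - i, theta) ** 2
        total += prod
    return total

alpha = 1.0 / (1.0 - 0.9 ** 2)   # geometric bound when sup_t |f_t(theta)| <= 0.9
worst = max(lhs_A2(t, theta)
            for t in range(1, 51)
            for theta in [-0.9, -0.5, 0.0, 0.5, 0.9])
```

Because each product term is bounded by 0.9^{2j}, the partial sums stay below the geometric series limit, so the computed worst case over the grid never exceeds α.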
Remark 2.2. Maller [12] applied condition (A1), and Kwoun and Yajima [15] used conditions (A2) and (A3); thus our conditions are not restrictive. (A1) delineates the class of x_t for which our results hold in the sense required; it is further discussed by Maller in [12]. Kwoun and Yajima [15] call \{e_t\} stable if \operatorname{Var}(e_t) is bounded; thus (A2) implies that \{e_t\} is stable. However, \{e_t\} is not stationary. In fact, by (2.3) we obtain

\operatorname{Cov}(e_t, e_{t+k}) = \sigma_0^2 \Bigl[ \prod_{i=0}^{k-1} f_{t+k-i}(\theta_0) + f_t(\theta_0)\prod_{i=0}^{k} f_{t+k-i}(\theta_0) + \cdots + \prod_{l=0}^{t-2} f_{t-l}(\theta_0)\prod_{i=0}^{t+k-2} f_{t+k-i}(\theta_0) \Bigr], \qquad (2.26)

which depends on t.
For ease of exposition, we introduce the following notation, which will be used later in the paper. Define the (d+1)-vector \varphi = (\beta^T, \theta)^T, and

S_n(\varphi) = \sigma^2 \frac{\partial\Psi_n}{\partial\varphi} = \sigma^2 \Bigl( \frac{\partial\Psi_n}{\partial\beta^T},\; \frac{\partial\Psi_n}{\partial\theta} \Bigr)^T, \quad F_n(\varphi) = -\sigma^2 \frac{\partial^2\Psi_n}{\partial\varphi\,\partial\varphi^T}. \qquad (2.27)
By (2.7) and (2.8), we get

F_n(\varphi) = \begin{pmatrix} X_n(\theta) & \displaystyle\sum_{t=2}^n \bigl[ f_t'(\theta)\varepsilon_{t-1}x_t + f_t'(\theta)\varepsilon_t x_{t-1} - 2f_t(\theta)f_t'(\theta)x_{t-1}\varepsilon_{t-1} \bigr] \\ \ast & \displaystyle\sum_{t=2}^n \bigl[ \bigl( f_t'^2(\theta) + f_t(\theta)f_t''(\theta) \bigr)\varepsilon_{t-1}^2 - f_t''(\theta)\varepsilon_t\varepsilon_{t-1} \bigr] \end{pmatrix}, \qquad (2.28)

where X_n(\theta) = -\sigma^2\,\partial^2\Psi_n/\partial\beta\,\partial\beta^T and the \ast indicates that the element is filled in by symmetry.
Thus,

D_n = E F_n(\varphi_0) = \begin{pmatrix} X_n(\theta_0) & 0 \\ \ast & \displaystyle\sum_{t=2}^n \bigl[ \bigl( f_t'^2(\theta_0) + f_t(\theta_0)f_t''(\theta_0) \bigr) E e_{t-1}^2 - f_t''(\theta_0) E e_t e_{t-1} \bigr] \end{pmatrix} = \begin{pmatrix} X_n(\theta_0) & 0 \\ \ast & \displaystyle\sum_{t=2}^n f_t'^2(\theta_0) E e_{t-1}^2 \end{pmatrix} = \begin{pmatrix} X_n(\theta_0) & 0 \\ \ast & \Delta_n(\theta_0, \sigma_0) \end{pmatrix}, \qquad (2.29)

where

\Delta_n(\theta_0, \sigma_0) = \sum_{t=2}^n f_t'^2(\theta_0) E e_{t-1}^2 = \sigma_0^2 \sum_{t=2}^n f_t'^2(\theta_0) \sum_{j=0}^{t-2} \prod_{i=0}^{j-1} f_{t-1-i}^2(\theta_0) = O(n). \qquad (2.30)
3. Statement of Main Results

Theorem 3.1. Suppose that conditions (A1)–(A3) hold. Then there is a sequence A_n ↓ 0 such that, for each A > 0, as n → ∞, the probability

P\bigl( \text{there are estimators } (\hat\varphi_n, \hat\sigma_n^2) \text{ with } S_n(\hat\varphi_n) = 0 \text{ and } (\hat\varphi_n, \hat\sigma_n^2) \in \tilde N_n(A) \bigr) \longrightarrow 1. \qquad (3.1)

Furthermore,

(\hat\varphi_n, \hat\sigma_n^2) \xrightarrow{\;p\;} (\varphi_0, \sigma_0^2), \quad n \to \infty, \qquad (3.2)

where, for each n = 1, 2, \ldots, A > 0, and A_n ∈ (0, \sigma_0^2), the neighborhoods are defined by

N_n(A) = \bigl\{ \varphi \in \mathbb{R}^{d+1} : (\varphi - \varphi_0)^T D_n (\varphi - \varphi_0) \le A^2 \bigr\}, \quad \tilde N_n(A) = N_n(A) \cap \bigl\{ \sigma^2 \in [\sigma_0^2 - A_n,\, \sigma_0^2 + A_n] \bigr\}. \qquad (3.3)

Theorem 3.2. Suppose that conditions (A1)–(A3) hold. Then

\frac{1}{\hat\sigma_n} F_n^{T/2}(\hat\varphi_n)\,(\hat\varphi_n - \varphi_0) \xrightarrow{\;D\;} N(0, I_{d+1}), \quad n \to \infty. \qquad (3.4)
Remark 3.3. For θ ∈ \mathbb{R}^m, m ∈ \mathbb{N}, our results still hold.

In the following, we investigate some special cases of the model (1.1)-(1.2). Although the following results are obtained directly from Theorems 3.1 and 3.2, we state them in order to compare with the corresponding known results.

Corollary 3.4. Let f_t(θ) = θ. If condition (A1) holds, then, for |θ| ≠ 1, (3.1), (3.2), and (3.4) hold.

Remark 3.5. These results are the same as the corresponding results of Maller [12].

Corollary 3.6. If β = 0 and f_t(θ) = θ, then, for |θ| ≠ 1,
\frac{\sqrt{\sum_{t=2}^n \varepsilon_{t-1}^2}}{\hat\sigma_n}\,(\hat\theta_n - \theta_0) \xrightarrow{\;D\;} N(0, 1), \quad n \to \infty, \qquad (3.5)

where

\hat\sigma_n^2 = \frac{1}{n-1}\sum_{t=2}^n (\varepsilon_t - \hat\theta_n\varepsilon_{t-1})^2, \quad \hat\theta_n = \frac{\sum_{t=2}^n \varepsilon_t\varepsilon_{t-1}}{\sum_{t=2}^n \varepsilon_{t-1}^2}. \qquad (3.6)

Remark 3.7. These estimators are the same as the least squares estimators (see White [16]). For |θ| > 1, \{\varepsilon_t\} are explosive processes; in that case the corollary agrees with the results of White [17]. When |θ| < 1, notice that \hat\sigma_n^2 \xrightarrow{p} \sigma_0^2 and (1/(n-1))\sum_{t=2}^n \varepsilon_{t-1}^2 \xrightarrow{p} E\varepsilon_t^2 = \sigma_0^2/(1-\theta_0^2), and by Corollary 3.6 we obtain

\sqrt{n}\,(\hat\theta_n - \theta_0) \xrightarrow{\;D\;} N\bigl(0,\, 1-\theta_0^2\bigr). \qquad (3.7)

This result was discussed by many authors, such as Fujikoshi and Ochi [46] and Brockwell and Davis [19].
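The limit (3.7) is easy to illustrate by simulation. The sketch below (a stationary AR(1) with θ_0 = 0.5 and standard normal η_t; this is not the paper's simulation example) computes θ̂_n from (3.6) over repeated samples and compares the empirical mean and variance of √n(θ̂_n − θ_0) with the N(0, 1 − θ_0²) limit.

```python
import math
import random

def simulate_ar1(n, theta0, rng):
    # eps_1 = eta_1, eps_t = theta0 * eps_{t-1} + eta_t, eta_t ~ N(0, 1)
    e = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        e.append(theta0 * e[-1] + rng.gauss(0.0, 1.0))
    return e

def theta_hat(e):
    # least squares / QML estimator from (3.6)
    num = sum(e[t] * e[t - 1] for t in range(1, len(e)))
    den = sum(e[t - 1] ** 2 for t in range(1, len(e)))
    return num / den

rng = random.Random(42)
theta0, n, reps = 0.5, 400, 500
z = [math.sqrt(n) * (theta_hat(simulate_ar1(n, theta0, rng)) - theta0)
     for _ in range(reps)]
mean_z = sum(z) / reps
var_z = sum((v - mean_z) ** 2 for v in z) / (reps - 1)
```

With these settings the empirical mean of √n(θ̂_n − θ_0) is close to 0 and its empirical variance is close to 1 − θ_0² = 0.75, up to Monte Carlo error.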
Corollary 3.8. Let β = 0. If conditions (A2) and (A3) hold, then

\frac{F_n^{1/2}(\hat\theta_n)}{\hat\sigma_n}\,(\hat\theta_n - \theta_0) \xrightarrow{\;D\;} N(0, 1), \quad n \to \infty, \qquad (3.8)

where

F_n(\hat\theta_n) = \sum_{t=2}^n \bigl[ \bigl( f_t'^2(\hat\theta_n) + f_t(\hat\theta_n)f_t''(\hat\theta_n) \bigr)\varepsilon_{t-1}^2 - f_t''(\hat\theta_n)\varepsilon_t\varepsilon_{t-1} \bigr], \quad \hat\sigma_n^2 = \frac{1}{n-1}\sum_{t=2}^n \bigl(\varepsilon_t - f_t(\hat\theta_n)\varepsilon_{t-1}\bigr)^2. \qquad (3.9)
Corollary 3.9. Let f_t(θ) = a_t. If condition (A1) holds, then

\frac{1}{\hat\sigma_n}\Bigl( \sum_{t=2}^n (x_t - a_t x_{t-1})(x_t - a_t x_{t-1})^T \Bigr)^{T/2} (\hat\beta_n - \beta_0) \xrightarrow{\;D\;} N(0, I_d), \quad n \to \infty. \qquad (3.10)

Remark 3.10. Let a_t = 0. Noting that \sum_{t=2}^n x_t x_t^T = O(\sqrt{n}) and \hat\sigma_n^2 \xrightarrow{p} \sigma_0^2, we easily obtain from the corollary the asymptotic normality of the quasi-maximum likelihood (or least squares) estimator in ordinary linear regression models.
4. Some Lemmas

To prove Theorems 3.1 and 3.2, we first introduce the following lemmas.

Lemma 4.1. The matrix D_n is positive definite for large enough n, with E S_n(\varphi_0) = 0 and \operatorname{Var}(S_n(\varphi_0)) = \sigma_0^2 D_n.

Proof. It is easy to show that the matrix D_n is positive definite for large enough n. By (2.8), we have

\sigma_0^2\, E\,\frac{\partial\Psi_n}{\partial\beta}\Big|_{\beta=\beta_0} = \sum_{t=2}^n E\bigl[(e_t - f_t(\theta_0)e_{t-1})(x_t - f_t(\theta_0)x_{t-1})\bigr] = \sum_{t=2}^n (x_t - f_t(\theta_0)x_{t-1})\, E\eta_t = 0. \qquad (4.1)
Note that e_{t-1} and η_t are independent of each other; thus by (2.7) and Eη_t = 0, we have

\sigma_0^2\, E\,\frac{\partial\Psi_n}{\partial\theta}\Big|_{\theta=\theta_0} = \sum_{t=2}^n E\bigl[(e_t - f_t(\theta_0)e_{t-1}) f_t'(\theta_0) e_{t-1}\bigr] = \sum_{t=2}^n f_t'(\theta_0)\, E(\eta_t e_{t-1}) = 0. \qquad (4.2)

Hence, from (4.1) and (4.2),

E S_n(\varphi_0) = \sigma_0^2\, E\Bigl( \frac{\partial\Psi_n}{\partial\beta^T}\Big|_{\beta=\beta_0},\; \frac{\partial\Psi_n}{\partial\theta}\Big|_{\theta=\theta_0} \Bigr)^T = 0. \qquad (4.3)
By (2.8) and (2.17), we have

\operatorname{Var}\Bigl( \sigma_0^2\frac{\partial\Psi_n}{\partial\beta}\Big|_{\beta=\beta_0} \Bigr) = \operatorname{Var}\Bigl( \sum_{t=2}^n (e_t - f_t(\theta_0)e_{t-1})(x_t - f_t(\theta_0)x_{t-1}) \Bigr) = \operatorname{Var}\Bigl( \sum_{t=2}^n \eta_t (x_t - f_t(\theta_0)x_{t-1}) \Bigr) = \sigma_0^2 X_n(\theta_0). \qquad (4.4)
Note that \{f_t'(\theta_0)\eta_t e_{t-1}, H_t\} is a martingale difference sequence with

\operatorname{Var}\bigl( f_t'(\theta_0)\eta_t e_{t-1} \bigr) = f_t'^2(\theta_0)\, E\eta_t^2\, E e_{t-1}^2 = \sigma_0^2 f_t'^2(\theta_0)\, E e_{t-1}^2, \qquad (4.5)

so

\operatorname{Var}\Bigl( \sigma_0^2\frac{\partial\Psi_n}{\partial\theta}\Big|_{\theta=\theta_0} \Bigr) = \operatorname{Var}\Bigl( \sum_{t=2}^n \eta_t f_t'(\theta_0) e_{t-1} \Bigr) = \sigma_0^2 \sum_{t=2}^n f_t'^2(\theta_0)\, E e_{t-1}^2 = \sigma_0^2 \Delta_n(\theta_0, \sigma_0). \qquad (4.6)
By (2.7) and (2.8), and noting that e_{t-1} and η_t are independent of each other, we have

\operatorname{Cov}\Bigl( \sigma_0^2\frac{\partial\Psi_n}{\partial\beta}\Big|_{\beta=\beta_0},\; \sigma_0^2\frac{\partial\Psi_n}{\partial\theta}\Big|_{\theta=\theta_0} \Bigr) = E\Bigl( \sum_{t=2}^n \eta_t^2 (x_t - f_t(\theta_0)x_{t-1}) f_t'(\theta_0) e_{t-1} \Bigr) + E\Bigl( \sum_{t=3}^n \eta_t (x_t - f_t(\theta_0)x_{t-1}) \sum_{s=2}^{t-1} \eta_s f_s'(\theta_0) e_{s-1} \Bigr) + E\Bigl( \sum_{s=3}^n \eta_s f_s'(\theta_0) e_{s-1} \sum_{t=2}^{s-1} \eta_t (x_t - f_t(\theta_0)x_{t-1}) \Bigr) = 0. \qquad (4.7)

From (4.4)–(4.7), it follows that \operatorname{Var}(S_n(\varphi_0)) = \sigma_0^2 D_n.
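The identity E S_n(\varphi_0) = 0 of Lemma 4.1 can be illustrated numerically. The sketch below uses a hypothetical special case (f_t(θ) = θ, so f_t' ≡ 1, scalar x_t ≡ 1, θ_0 = 0.5, σ_0 = 1) and averages the two score components σ_0²∂Ψ_n/∂β = Σ η_t(x_t − θ_0 x_{t−1}) and σ_0²∂Ψ_n/∂θ = Σ η_t e_{t−1}, evaluated at the true parameters, over many replications.

```python
import random

def score_at_truth(n, theta0, rng):
    # f_t(theta) = theta, f_t' = 1, x_t = 1 (scalar intercept); errors follow (2.2)
    e = [rng.gauss(0.0, 1.0)]
    s_beta = 0.0    # sigma_0^2 dPsi/dbeta  = sum_t eta_t (x_t - theta0 x_{t-1})
    s_theta = 0.0   # sigma_0^2 dPsi/dtheta = sum_t eta_t e_{t-1}
    for _ in range(2, n + 1):
        eta = rng.gauss(0.0, 1.0)
        s_beta += eta * (1.0 - theta0 * 1.0)
        s_theta += eta * e[-1]
        e.append(theta0 * e[-1] + eta)
    return s_beta, s_theta

rng = random.Random(7)
reps, n, theta0 = 2000, 50, 0.5
scores = [score_at_truth(n, theta0, rng) for _ in range(reps)]
mean_beta = sum(s[0] for s in scores) / reps
mean_theta = sum(s[1] for s in scores) / reps
```

Both Monte Carlo averages are close to zero (relative to the standard deviations of the score components), consistent with (4.1)–(4.3).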
Lemma 4.2. If condition (A1) holds, then, for any θ ∈ Θ, the matrix X_n(θ) is positive definite for large enough n, and

\lim_{n\to\infty} \max_{1\le t\le n} x_t^T X_n^{-1}(\theta)\, x_t = 0. \qquad (4.8)

Proof. Let λ_1 and λ_d be the smallest and largest roots of |Z_n - \lambda X_n| = 0. Then, from Rao [47, Ex. 22.1],

\lambda_1 \le \frac{u^T Z_n u}{u^T X_n u} \le \lambda_d \qquad (4.9)

for unit vectors u. Thus, by (2.24), there are some δ ∈ \bigl( \max\{0,\; 1 - (1 + \min_{2\le t\le n} f_t^2(\theta))/\max_{2\le t\le n}|f_t(\theta)|\},\; 1 \bigr) and n_0(δ) such that n ≥ n_0 implies

u^T Z_n u \le (1-\delta)\, u^T X_n u. \qquad (4.10)

By (4.10), we have

u^T X_n(\theta) u = \sum_{t=2}^n \bigl( u^T(x_t - f_t(\theta)x_{t-1}) \bigr)^2 = \sum_{t=2}^n \bigl[ (u^Tx_t)^2 + f_t^2(\theta)(u^Tx_{t-1})^2 - f_t(\theta)u^Tx_{t-1}x_t^Tu - f_t(\theta)u^Tx_tx_{t-1}^Tu \bigr]
\ge \sum_{t=2}^n (u^Tx_t)^2 + \min_{2\le t\le n} f_t^2(\theta) \sum_{t=2}^n (u^Tx_{t-1})^2 - \max_{2\le t\le n}|f_t(\theta)|\, u^T Z_n u
\ge u^T X_n u + \min_{2\le t\le n} f_t^2(\theta)\, u^T X_n u - \max_{2\le t\le n}|f_t(\theta)|\, u^T Z_n u
\ge \Bigl( 1 + \min_{2\le t\le n} f_t^2(\theta) - \max_{2\le t\le n}|f_t(\theta)|(1-\delta) \Bigr) u^T X_n u = C(\theta, \delta)\, u^T X_n u. \qquad (4.11)

By Rao [47, page 60] and (2.23), we have

\frac{(u^Tx_t)^2}{u^T X_n u} \longrightarrow 0. \qquad (4.12)

From (4.12) and C(θ, δ) > 0,

x_t^T X_n^{-1}(\theta)\, x_t = \sup_{u} \frac{(u^Tx_t)^2}{u^T X_n(\theta) u} \le \sup_{u} \frac{(u^Tx_t)^2}{C(\theta, \delta)\, u^T X_n u} \longrightarrow 0. \qquad (4.13)
Lemma 4.3 (see [48]). Let W_n be a symmetric random matrix with eigenvalues λ_j(n), 1 ≤ j ≤ d. Then

W_n \xrightarrow{\;p\;} I \iff \lambda_j(n) \xrightarrow{\;p\;} 1, \quad n \to \infty. \qquad (4.14)

Lemma 4.4. For each A > 0,

\sup_{\varphi \in N_n(A)} \bigl\| D_n^{-1/2} F_n(\varphi) D_n^{-T/2} - \Phi_n \bigr\| \xrightarrow{\;p\;} 0, \quad n \to \infty, \qquad (4.15)

and also

\Phi_n \xrightarrow{\;D\;} \Phi, \qquad (4.16)

\lim_{c\to 0} \limsup_{A\to\infty} \limsup_{n\to\infty} P\Bigl( \inf_{\varphi\in N_n(A)} \lambda_{\min}\bigl( D_n^{-1/2} F_n(\varphi) D_n^{-T/2} \bigr) \le c \Bigr) = 0, \qquad (4.17)

where

\Phi_n = \begin{pmatrix} I_d & 0 \\ 0 & \dfrac{\sum_{t=2}^n f_t'^2(\theta_0)\, e_{t-1}^2}{\Delta_n(\theta_0, \sigma_0)} \end{pmatrix}, \quad \Phi = I_{d+1}. \qquad (4.18)
Proof. Let X_n(\theta_0) = X_n^{1/2}(\theta_0) X_n^{T/2}(\theta_0) be a square root decomposition of X_n(\theta_0). Then

D_n = \begin{pmatrix} X_n^{1/2}(\theta_0) & 0 \\ \ast & \sqrt{\Delta_n(\theta_0,\sigma_0)} \end{pmatrix} \begin{pmatrix} X_n^{T/2}(\theta_0) & 0 \\ \ast & \sqrt{\Delta_n(\theta_0,\sigma_0)} \end{pmatrix} = D_n^{1/2} D_n^{T/2}. \qquad (4.19)

Let φ ∈ N_n(A). Then

(\varphi - \varphi_0)^T D_n (\varphi - \varphi_0) = (\beta - \beta_0)^T X_n(\theta_0)(\beta - \beta_0) + (\theta - \theta_0)^2 \Delta_n(\theta_0, \sigma_0) \le A^2. \qquad (4.20)
From (2.28), (2.29), and (4.18),

D_n^{-1/2} F_n(\varphi) D_n^{-T/2} - \Phi_n = \begin{pmatrix} X_n^{-1/2}(\theta_0) X_n(\theta) X_n^{-T/2}(\theta_0) - I_d & \dfrac{X_n^{-1/2}(\theta_0) \sum_{t=2}^n \bigl[ f_t'(\theta)\varepsilon_{t-1}x_t + f_t'(\theta)\varepsilon_t x_{t-1} - 2f_t(\theta)f_t'(\theta)\varepsilon_{t-1}x_{t-1} \bigr]}{\sqrt{\Delta_n(\theta_0,\sigma_0)}} \\ \ast & \dfrac{\sum_{t=2}^n \bigl[ \bigl( f_t'^2(\theta) + f_t(\theta)f_t''(\theta) \bigr)\varepsilon_{t-1}^2 - f_t''(\theta)\varepsilon_t\varepsilon_{t-1} \bigr] - \sum_{t=2}^n f_t'^2(\theta_0)\, e_{t-1}^2}{\Delta_n(\theta_0,\sigma_0)} \end{pmatrix}. \qquad (4.21)

Let

N_n^{\beta}(A) = \bigl\{ \beta : \|(\beta - \beta_0)^T X_n^{1/2}(\theta_0)\|^2 \le A^2 \bigr\}, \qquad (4.22)

N_n^{\theta}(A) = \Bigl\{ \theta : |\theta - \theta_0| \le \frac{A}{\sqrt{\Delta_n(\theta_0,\sigma_0)}} \Bigr\}. \qquad (4.23)
In the first step, we will show that, for each A > 0,

\sup_{\theta \in N_n^{\theta}(A)} \bigl\| X_n^{-1/2}(\theta_0) X_n(\theta) X_n^{-T/2}(\theta_0) - I_d \bigr\| \longrightarrow 0, \quad n \to \infty. \qquad (4.24)

In fact, note that

X_n^{-1/2}(\theta_0) X_n(\theta) X_n^{-T/2}(\theta_0) - I_d = X_n^{-1/2}(\theta_0)\bigl( X_n(\theta) - X_n(\theta_0) \bigr) X_n^{-T/2}(\theta_0) = X_n^{-1/2}(\theta_0)(T_1 + T_2 + T_3) X_n^{-T/2}(\theta_0), \qquad (4.25)

where

T_1 = \sum_{t=2}^n \bigl( f_t(\theta_0) - f_t(\theta) \bigr) x_{t-1}\bigl( x_t - f_t(\theta_0)x_{t-1} \bigr)^T, \quad
T_2 = \sum_{t=2}^n \bigl( f_t(\theta_0) - f_t(\theta) \bigr)\bigl( x_t - f_t(\theta_0)x_{t-1} \bigr) x_{t-1}^T, \quad
T_3 = \sum_{t=2}^n \bigl( f_t(\theta_0) - f_t(\theta) \bigr)^2 x_{t-1}x_{t-1}^T. \qquad (4.26)

Let u, v ∈ \mathbb{R}^d with \|u\| = \|v\| = 1, and let u_n^T = u^T X_n^{-1/2}(\theta_0), v_n = X_n^{-T/2}(\theta_0) v. By the Cauchy–Schwarz inequality, Lemma 4.2, condition (A3), and noting that θ ∈ N_n^{\theta}(A), we have

|u_n^T T_1 v_n| = \Bigl| \sum_{t=2}^n \bigl( f_t(\theta_0) - f_t(\theta) \bigr) u_n^T x_{t-1}\bigl( x_t - f_t(\theta_0)x_{t-1} \bigr)^T v_n \Bigr|
\le \max_{2\le t\le n}\bigl| f_t(\theta_0) - f_t(\theta) \bigr| \Bigl( \sum_{t=2}^n u_n^T x_{t-1}x_{t-1}^T u_n \Bigr)^{1/2} \Bigl( \sum_{t=2}^n v_n^T \bigl( x_t - f_t(\theta_0)x_{t-1} \bigr)\bigl( x_t - f_t(\theta_0)x_{t-1} \bigr)^T v_n \Bigr)^{1/2}
\le \max_{2\le t\le n}\bigl| f_t(\theta_0) - f_t(\theta) \bigr| \Bigl( \sum_{t=2}^n u_n^T x_t x_t^T u_n \Bigr)^{1/2}
\le \max_{2\le t\le n}\bigl| f_t'(\tilde\theta) \bigr|\, |\theta_0 - \theta| \cdot \sqrt{n}\, \Bigl( \max_{1\le t\le n} x_t^T X_n^{-1}(\theta_0)\, x_t \Bigr)^{1/2}
\le C \sqrt{\frac{n}{\Delta_n(\theta_0,\sigma_0)}}\; o(1) \longrightarrow 0. \qquad (4.27)
Here \tilde\theta = a\theta + (1-a)\theta_0 for some 0 ≤ a ≤ 1. Similarly to the proof for T_1, we easily obtain

u_n^T T_2 v_n \longrightarrow 0. \qquad (4.28)

By the Cauchy–Schwarz inequality, Lemma 4.2, condition (A3), and noting that θ ∈ N_n^{\theta}(A), we have

|u_n^T T_3 v_n| = \Bigl| u_n^T \sum_{t=2}^n \bigl( f_t(\theta_0) - f_t(\theta) \bigr)^2 x_{t-1}x_{t-1}^T v_n \Bigr|
\le \max_{2\le t\le n}\bigl( f_t(\theta_0) - f_t(\theta) \bigr)^2 \Bigl( \sum_{t=2}^n u_n^T x_t x_t^T u_n \sum_{t=2}^n v_n^T x_t x_t^T v_n \Bigr)^{1/2}
\le n \max_{2\le t\le n} f_t'^2(\tilde\theta)\, |\theta_0 - \theta|^2 \max_{1\le t\le n} x_t^T X_n^{-1}(\theta_0)\, x_t
\le \frac{nA^2}{\Delta_n(\theta_0,\sigma_0)}\; o(1) \longrightarrow 0. \qquad (4.29)

Hence, (4.24) follows from (4.25)–(4.29).
In the second step, we will show that

\frac{X_n^{-1/2}(\theta_0) \sum_{t=2}^n \bigl[ f_t'(\theta)\varepsilon_{t-1}x_t + f_t'(\theta)\varepsilon_t x_{t-1} - 2f_t(\theta)f_t'(\theta)\varepsilon_{t-1}x_{t-1} \bigr]}{\sqrt{\Delta_n(\theta_0,\sigma_0)}} \xrightarrow{\;p\;} 0. \qquad (4.30)

Note that

\varepsilon_t = y_t - x_t^T\beta = x_t^T(\beta_0 - \beta) + e_t, \quad \varepsilon_t - f_t(\theta_0)\varepsilon_{t-1} = \bigl( x_t - f_t(\theta_0)x_{t-1} \bigr)^T(\beta_0 - \beta) + \eta_t. \qquad (4.31)

Consider

J = \sum_{t=2}^n \bigl[ f_t'(\theta)\varepsilon_{t-1}x_t + f_t'(\theta)\varepsilon_t x_{t-1} - 2f_t(\theta)f_t'(\theta)\varepsilon_{t-1}x_{t-1} \bigr]
= \sum_{t=2}^n \bigl[ \varepsilon_{t-1} f_t'(\theta)\bigl( x_t - f_t(\theta_0)x_{t-1} \bigr) + f_t'(\theta)\bigl( \varepsilon_t - f_t(\theta_0)\varepsilon_{t-1} \bigr) x_{t-1} + 2f_t'(\theta)\bigl( f_t(\theta_0) - f_t(\theta) \bigr)\varepsilon_{t-1}x_{t-1} \bigr]
= T_1 + T_2 + T_3 + T_4 + 2T_5 + 2T_6, \qquad (4.32)

where

T_1 = \sum_{t=2}^n f_t'(\theta)\, x_{t-1}^T(\beta_0 - \beta)\bigl( x_t - f_t(\theta_0)x_{t-1} \bigr), \quad
T_2 = \sum_{t=2}^n f_t'(\theta)\, e_{t-1}\bigl( x_t - f_t(\theta_0)x_{t-1} \bigr),

T_3 = \sum_{t=2}^n f_t'(\theta)\bigl( x_t - f_t(\theta_0)x_{t-1} \bigr)^T(\beta_0 - \beta)\, x_{t-1}, \quad
T_4 = \sum_{t=2}^n f_t'(\theta)\, \eta_t x_{t-1},

T_5 = \sum_{t=2}^n f_t'(\theta)\bigl( f_t(\theta_0) - f_t(\theta) \bigr) x_{t-1}^T(\beta_0 - \beta)\, x_{t-1}, \quad
T_6 = \sum_{t=2}^n f_t'(\theta)\bigl( f_t(\theta_0) - f_t(\theta) \bigr) e_{t-1}x_{t-1}. \qquad (4.33)

For β ∈ N_n^{\beta}(A) and each A > 0, we have

\bigl( (\beta_0 - \beta)^T x_t \bigr)^2 = (\beta_0 - \beta)^T X_n^{1/2}(\theta_0)\, X_n^{-1/2}(\theta_0) x_t x_t^T X_n^{-T/2}(\theta_0)\, X_n^{T/2}(\theta_0)(\beta_0 - \beta)
\le \max_{1\le t\le n} x_t^T X_n^{-1}(\theta_0)\, x_t\; (\beta_0 - \beta)^T X_n(\theta_0)(\beta_0 - \beta)
\le A^2 \max_{1\le t\le n} x_t^T X_n^{-1}(\theta_0)\, x_t. \qquad (4.34)