A Review of Large Sample Asymptotics

6.1 Introduction

The most widely used tool in sampling theory is large sample asymptotics. By “asymptotics” we mean approximating a finite-sample sampling distribution by taking its limit as the sample size diverges to infinity. In this chapter we provide a brief review of the main results of large sample asymptotics. It is meant as a reference, not as a teaching guide. Asymptotic theory is covered in detail in Chapters 7-9 of Introduction to Econometrics. If you have not previously studied asymptotic theory in detail you should study these chapters before proceeding.

6.2 Modes of Convergence

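Definition 6.1 A sequence of random vectors $Z_n \in \mathbb{R}^k$ converges in probability to $Z$ as $n \to \infty$, denoted $Z_n \xrightarrow{p} Z$ or alternatively $\operatorname{plim}_{n \to \infty} Z_n = Z$, if for all $\delta > 0$,

$$\lim_{n \to \infty} P\left[\|Z_n - Z\| \le \delta\right] = 1. \tag{6.1}$$

We call $Z$ the probability limit (or plim) of $Z_n$.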

The above definition treats random variables and random vectors simultaneously using the vector norm. It is useful to know that for a random vector, (6.1) holds if and only if each element in the vector converges in probability to its limit.

Definition 6.2 Let $Z_n$ be a sequence of random vectors with distributions $F_n(u) = P[Z_n \le u]$. We say that $Z_n$ converges in distribution to $Z$ as $n \to \infty$, denoted $Z_n \xrightarrow{d} Z$, if for all $u$ at which $F(u) = P[Z \le u]$ is continuous, $F_n(u) \to F(u)$ as $n \to \infty$. We refer to $Z$ and its distribution $F(u)$ as the asymptotic distribution, large sample distribution, or limit distribution of $Z_n$.


6.3 Weak Law of Large Numbers

Theorem 6.1 Weak Law of Large Numbers (WLLN) If $Y_i \in \mathbb{R}^k$ are i.i.d. and $E\|Y\| < \infty$, then as $n \to \infty$,

$$\bar{Y} = \frac{1}{n} \sum_{i=1}^{n} Y_i \xrightarrow{p} E[Y].$$

The WLLN shows that the sample mean $\bar{Y}$ converges in probability to the true population expectation $\mu = E[Y]$. The result applies to any transformation of a random vector with a finite mean.
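The content of the WLLN can be checked numerically. The following is a minimal simulation sketch and is not part of the original text. It assumes i.i.d. Exponential(1) draws, so that $\mu = 1$; the function name `coverage_probability`, the tolerance `delta`, and the replication count are illustrative choices. It estimates the probability $P[|\bar{Y} - \mu| \le \delta]$ from Definition 6.1, which under the WLLN should approach one as $n$ grows.

```python
import random


def coverage_probability(n, delta=0.1, reps=2000, seed=0):
    """Monte Carlo estimate of P[|Ybar - mu| <= delta] for sample size n.

    Assumption for illustration: Y_i are i.i.d. Exponential(1), so mu = 1.
    """
    rng = random.Random(seed)
    mu = 1.0
    hits = 0
    for _ in range(reps):
        # Sample mean of n independent draws.
        ybar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        if abs(ybar - mu) <= delta:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    # The estimated probability should rise toward one as n increases.
    for n in (10, 100, 1000):
        print(n, coverage_probability(n, reps=500))
```

The estimated probabilities depend on the seed and the replication count, so they should be read as rough illustrations rather than exact values.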

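Theorem 6.2 If $Y_i \in \mathbb{R}^k$ are i.i.d., $h(y) : \mathbb{R}^k \to \mathbb{R}^q$, and $E\|h(Y)\| < \infty$, then

$$\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} h(Y_i) \xrightarrow{p} \mu = E[h(Y)]$$

as $n \to \infty$.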

An estimator which converges in probability to the population value is called consistent.

Definition 6.3 An estimator $\hat{\theta}$ of $\theta$ is consistent if $\hat{\theta} \xrightarrow{p} \theta$ as $n \to \infty$.

6.4 Central Limit Theorem

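Theorem 6.3 Multivariate Lindeberg-Lévy Central Limit Theorem (CLT). If $Y_i \in \mathbb{R}^k$ are i.i.d. and $E\|Y\|^2 < \infty$, then as $n \to \infty$

$$\sqrt{n}\left(\bar{Y} - \mu\right) \xrightarrow{d} N(0, V)$$

where $\mu = E[Y]$ and $V = E\left[(Y - \mu)(Y - \mu)'\right]$.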

The central limit theorem shows that the distribution of the sample mean is approximately normal in large samples. For some applications it may be useful to notice that Theorem 6.3 does not impose any restrictions on $V$ other than that the elements are finite. Therefore this result allows for the possibility of singular $V$.
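The normal approximation can be examined by simulation. The sketch below is an illustration added under stated assumptions, not a result from the text: it takes scalar i.i.d. Exponential(1) draws, for which $\mu = 1$ and $V = 1$, and the name `clt_coverage` is hypothetical. It reports how often $|\sqrt{n}(\bar{Y} - \mu)|$ falls below $1.96\sqrt{V}$, a frequency that should be close to 0.95 if the $N(0, V)$ approximation is accurate.

```python
import math
import random


def clt_coverage(n, reps=2000, seed=0):
    """Fraction of replications with |sqrt(n) * (Ybar - mu)| <= 1.96 * sqrt(V).

    Assumption for illustration: Y_i are i.i.d. Exponential(1), so mu = V = 1.
    """
    rng = random.Random(seed)
    mu, v = 1.0, 1.0
    hits = 0
    for _ in range(reps):
        ybar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        z = math.sqrt(n) * (ybar - mu)  # centered and scaled sample mean
        if abs(z) <= 1.96 * math.sqrt(v):
            hits += 1
    return hits / reps


if __name__ == "__main__":
    for n in (5, 50, 200):
        print(n, clt_coverage(n))
```

The exponential distribution is skewed, so the frequency for very small $n$ can differ noticeably from 0.95; the discrepancy should shrink as $n$ increases.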

The following two generalizations allow for heterogeneous random variables.

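Theorem 6.4 Multivariate Lindeberg CLT. Suppose that for all $n$, $Y_{ni} \in \mathbb{R}^k$, $i = 1, \ldots, r_n$, are independent but not necessarily identically distributed with expectations $E[Y_{ni}] = 0$ and variance matrices $V_{ni} = E\left[Y_{ni} Y_{ni}'\right]$. Set $V_n = \sum_{i=1}^{r_n} V_{ni}$. Suppose $\nu_n^2 = \lambda_{\min}(V_n) > 0$ and for all $\epsilon > 0$

$$\lim_{n \to \infty} \frac{1}{\nu_n^2} \sum_{i=1}^{r_n} E\left[\|Y_{ni}\|^2 \, 1\left\{\|Y_{ni}\|^2 \ge \epsilon \nu_n^2\right\}\right] = 0. \tag{6.2}$$

Then as $n \to \infty$

$$V_n^{-1/2} \sum_{i=1}^{r_n} Y_{ni} \xrightarrow{d} N(0, I_k).$$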

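Theorem 6.5 Suppose $Y_{ni} \in \mathbb{R}^k$ are independent but not necessarily identically distributed with expectations $E[Y_{ni}] = 0$ and variance matrices $V_{ni} = E\left[Y_{ni} Y_{ni}'\right]$. Suppose

$$\frac{1}{n} \sum_{i=1}^{n} V_{ni} \to V > 0$$

and for some $\delta > 0$

$$\sup_{n,i} E\|Y_{ni}\|^{2+\delta} < \infty. \tag{6.3}$$

Then as $n \to \infty$

$$\sqrt{n}\, \bar{Y} \xrightarrow{d} N(0, V).$$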

6.5 Continuous Mapping Theorem and Delta Method

Continuous functions are limit-preserving. There are two forms of the continuous mapping theorem, for convergence in probability and convergence in distribution.

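Theorem 6.6 Continuous Mapping Theorem (CMT). Let $Z_n \in \mathbb{R}^k$ and $g(u) : \mathbb{R}^k \to \mathbb{R}^q$. If $Z_n \xrightarrow{p} c$ as $n \to \infty$ and $g(u)$ is continuous at $c$ then $g(Z_n) \xrightarrow{p} g(c)$ as $n \to \infty$.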

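Theorem 6.7 Continuous Mapping Theorem. If $Z_n \xrightarrow{d} Z$ as $n \to \infty$ and $g : \mathbb{R}^m \to \mathbb{R}^k$ has the set of discontinuity points $D_g$ such that $P\left[Z \in D_g\right] = 0$, then $g(Z_n) \xrightarrow{d} g(Z)$ as $n \to \infty$.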

Differentiable functions of asymptotically normal random estimators are asymptotically normal.

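Theorem 6.8 Delta Method. Let $\mu \in \mathbb{R}^k$ and $g(u) : \mathbb{R}^k \to \mathbb{R}^q$. If $\sqrt{n}\left(\hat{\mu} - \mu\right) \xrightarrow{d} \xi$, where $g(u)$ is continuously differentiable in a neighborhood of $\mu$, then as $n \to \infty$

$$\sqrt{n}\left(g(\hat{\mu}) - g(\mu)\right) \xrightarrow{d} G'\xi \tag{6.4}$$

where $G(u) = \frac{\partial}{\partial u} g(u)'$ and $G = G(\mu)$. In particular, if $\xi \sim N(0, V)$ then as $n \to \infty$

$$\sqrt{n}\left(g(\hat{\mu}) - g(\mu)\right) \xrightarrow{d} N\left(0, G'VG\right). \tag{6.5}$$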
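A scalar example may help fix ideas about the variance formula in (6.5). The sketch below is an added illustration under stated assumptions, not an example from the text: it takes i.i.d. Exponential(1) draws, so $\mu = 1$ and $V = 1$, and the transformation $g(u) = u^2$, so $G = 2\mu = 2$ and $G'VG = 4$. The name `delta_method_variance` is hypothetical. The function simulates $\sqrt{n}(g(\bar{Y}) - g(\mu))$ many times and returns its sample variance, which should be near 4 for moderately large $n$.

```python
import math
import random
import statistics


def delta_method_variance(n, reps=2000, seed=0):
    """Sample variance of sqrt(n) * (g(Ybar) - g(mu)) across replications.

    Assumptions for illustration: Y_i are i.i.d. Exponential(1), so mu = 1 and
    V = 1, and g(u) = u**2, so G = 2 * mu = 2 and G'VG = 4.
    """
    rng = random.Random(seed)
    mu = 1.0
    draws = []
    for _ in range(reps):
        ybar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        draws.append(math.sqrt(n) * (ybar ** 2 - mu ** 2))
    return statistics.variance(draws)


if __name__ == "__main__":
    # Compare the simulated variance with the delta-method value G'VG = 4.
    print(delta_method_variance(400))
```

The simulated variance is subject to Monte Carlo error and to a finite-sample term that vanishes as $n \to \infty$, so it will not equal 4 exactly.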

6.6 Smooth Function Model

The smooth function model is $\theta = g(\mu)$ where $\mu = E[h(Y)]$ and $g(\mu)$ is smooth in a suitable sense.

The parameter $\theta = g(\mu)$ is not a population moment so it does not have a direct moment estimator. Instead, it is common to use a plug-in estimator formed by replacing the unknown $\mu$ with its point estimator $\hat{\mu}$ and then “plugging” this into the expression for $\theta$. The first step is the sample mean $\hat{\mu} = n^{-1} \sum_{i=1}^{n} h(Y_i)$. The second step is the transformation $\hat{\theta} = g(\hat{\mu})$. The hat “^” indicates that $\hat{\theta}$ is a sample estimator of $\theta$. The smooth function model includes a broad class of estimators including sample variances and the least squares estimator.

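Theorem 6.9 If $Y_i \in \mathbb{R}^m$ are i.i.d., $h(u) : \mathbb{R}^m \to \mathbb{R}^k$, $E\|h(Y)\| < \infty$, and $g(u) : \mathbb{R}^k \to \mathbb{R}^q$ is continuous at $\mu$, then $\hat{\theta} \xrightarrow{p} \theta$ as $n \to \infty$.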

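Theorem 6.10 If $Y_i \in \mathbb{R}^m$ are i.i.d., $h(u) : \mathbb{R}^m \to \mathbb{R}^k$, $E\|h(Y)\|^2 < \infty$, $g(u) : \mathbb{R}^k \to \mathbb{R}^q$, and $G(u) = \frac{\partial}{\partial u} g(u)'$ is continuous in a neighborhood of $\mu$, then as $n \to \infty$

$$\sqrt{n}\left(\hat{\theta} - \theta\right) \xrightarrow{d} N(0, V_\theta)$$

where $V_\theta = G'VG$, $V = E\left[(h(Y) - \mu)(h(Y) - \mu)'\right]$, and $G = G(\mu)$.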

Theorem 6.9 establishes the consistency of $\hat{\theta}$ for $\theta$ and Theorem 6.10 establishes its asymptotic normality. It is instructive to compare the conditions. Consistency requires that $h(Y)$ has a finite expectation; asymptotic normality requires that $h(Y)$ has a finite variance. Consistency requires that $g(u)$ be continuous; asymptotic normality requires that $g(u)$ be continuously differentiable.
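The sample variance gives a concrete instance of the smooth function model. The sketch below is an added illustration, with all function names hypothetical: for scalar $Y$ take $h(Y) = (Y, Y^2)'$, so $\mu = (\mu_1, \mu_2)'$, and $g(\mu) = \mu_2 - \mu_1^2 = \operatorname{var}[Y]$. Then $G = (-2\mu_1, 1)'$, and $V_\theta = G'VG$ can be estimated by replacing $\mu$ and $V$ with sample analogs (an assumption of this sketch; the section itself does not discuss variance estimation).

```python
import random


def plug_in_variance(ys):
    """Plug-in estimate theta_hat = g(mu_hat) with g(mu) = mu2 - mu1**2."""
    n = len(ys)
    mu1_hat = sum(ys) / n                 # sample mean of Y
    mu2_hat = sum(y * y for y in ys) / n  # sample mean of Y**2
    return mu2_hat - mu1_hat ** 2


def asymptotic_variance(ys):
    """Sample analog of V_theta = G'VG with G = (-2*mu1, 1)' and V = var[h(Y)]."""
    n = len(ys)
    m1 = sum(ys) / n
    m2 = sum(y * y for y in ys) / n
    # Elements of the sample covariance matrix of h(Y) = (Y, Y**2).
    v11 = sum((y - m1) ** 2 for y in ys) / n
    v12 = sum((y - m1) * (y * y - m2) for y in ys) / n
    v22 = sum((y * y - m2) ** 2 for y in ys) / n
    g1, g2 = -2.0 * m1, 1.0
    # Quadratic form G'VG written out for the 2 x 2 case.
    return g1 * g1 * v11 + 2.0 * g1 * g2 * v12 + g2 * g2 * v22


if __name__ == "__main__":
    rng = random.Random(0)
    ys = [rng.random() for _ in range(10000)]  # Uniform(0, 1): var[Y] = 1/12
    print(plug_in_variance(ys), asymptotic_variance(ys))
```

For Uniform(0, 1) data the population values are $\theta = 1/12$ and $V_\theta = 1/180$, which gives a reference point for the printed estimates.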

6.7 Best Unbiased Estimation

This section presents an efficiency bound for estimation of the mean. The result is finite-sample rather than asymptotic, but it is convenient to introduce at this point since the bound is identical to the asymptotic variance.

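Theorem 6.11 Suppose $Y_i$ are i.i.d., $\mu = E[h(Y)]$, and $E\|h(Y)\|^2 < \infty$. If $\tilde{\mu}$ is unbiased for $\mu$ then $\operatorname{var}\left[\tilde{\mu}\right] \ge n^{-1} V$ where $V = E\left[(h(Y) - \mu)(h(Y) - \mu)'\right]$.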

For details and a proof see Section 11.6 of Introduction to Econometrics. Theorem 6.11 is an analog of the Cramér-Rao lower bound for semiparametric estimation. The result shows that the asymptotic variance from Theorem 6.3 is the best possible in any finite sample among unbiased estimators. Theorem 6.11 is sharp, since the sample mean has the finite sample variance $n^{-1} V$.

6.8 Stochastic Order Symbols

It is convenient to have simple symbols for random variables and vectors which converge in probability to zero or are stochastically bounded. In this section we introduce some of the most common notation.

Let $Z_n$ and $a_n$, $n = 1, 2, \ldots$ be sequences of random variables and constants. The notation $Z_n = o_p(1)$ (“small oh-P-one”) means that $Z_n \xrightarrow{p} 0$ as $n \to \infty$. We also write $Z_n = o_p(a_n)$ if $a_n^{-1} Z_n = o_p(1)$.

Similarly, the notation $Z_n = O_p(1)$ (“big oh-P-one”) means that $Z_n$ is bounded in probability. Precisely, for any $\epsilon > 0$ there is a constant $M_\epsilon < \infty$ such that

$$\limsup_{n \to \infty} P\left[|Z_n| > M_\epsilon\right] \le \epsilon.$$

Furthermore, we write $Z_n = O_p(a_n)$ if $a_n^{-1} Z_n = O_p(1)$.

$O_p(1)$ is weaker than $o_p(1)$ in the sense that $Z_n = o_p(1)$ implies $Z_n = O_p(1)$ but not the reverse. However, if $Z_n = O_p(a_n)$ then $Z_n = o_p(b_n)$ for any $b_n$ such that $a_n / b_n \to 0$.

A random sequence with a bounded moment is stochastically bounded.

Theorem 6.12 If $Z_n$ is a random vector which satisfies $E\|Z_n\|^\delta = O(a_n)$ for some sequence $a_n$ and $\delta > 0$, then $Z_n = O_p(a_n^{1/\delta})$. Similarly, $E\|Z_n\|^\delta = o(a_n)$ implies $Z_n = o_p(a_n^{1/\delta})$.

There are many simple rules for manipulating $o_p(1)$ and $O_p(1)$ sequences which can be deduced from the continuous mapping theorem. For example,

$$\begin{aligned}
o_p(1) + o_p(1) &= o_p(1) \\
o_p(1) + O_p(1) &= O_p(1) \\
O_p(1) + O_p(1) &= O_p(1) \\
o_p(1)\, o_p(1) &= o_p(1) \\
o_p(1)\, O_p(1) &= o_p(1) \\
O_p(1)\, O_p(1) &= O_p(1).
\end{aligned}$$
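The distinction between the two symbols can be seen in a simulation. The sketch below is an added illustration under stated assumptions: $Y_i$ are i.i.d. Uniform(-1, 1), so $E[Y] = 0$ and $\operatorname{var}[Y] = 1/3$, and the name `abs_mean_quantile` is hypothetical. By the CLT $\sqrt{n}\,\bar{Y} = O_p(1)$, so an upper quantile of $|\sqrt{n}\,\bar{Y}|$ should stay roughly constant in $n$, while $\bar{Y} = o_p(1)$, so the same quantile of $|\bar{Y}|$ should shrink toward zero.

```python
import random


def abs_mean_quantile(n, power, q=0.95, reps=1000, seed=0):
    """Empirical q-quantile of n**power * |Ybar| across replications.

    Assumption for illustration: Y_i are i.i.d. Uniform(-1, 1), so E[Y] = 0.
    power=0.5 gives the O_p(1) quantity sqrt(n) * Ybar; power=0.0 gives Ybar.
    """
    rng = random.Random(seed)
    values = sorted(
        n ** power * abs(sum(rng.uniform(-1.0, 1.0) for _ in range(n)) / n)
        for _ in range(reps)
    )
    return values[int(q * reps) - 1]


if __name__ == "__main__":
    for n in (50, 500):
        # Scaled quantile stays stable; unscaled quantile shrinks with n.
        print(n, abs_mean_quantile(n, 0.5), abs_mean_quantile(n, 0.0))
```

Under these assumptions the stable value of the scaled quantile should be near $1.96 / \sqrt{3} \approx 1.13$, the corresponding quantile of the absolute value of a $N(0, 1/3)$ variable.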

6.9 Convergence of Moments

We give a sufficient condition for the existence of the mean of the asymptotic distribution, define uniform integrability, provide a primitive condition for uniform integrability, and show that uniform integrability is the key condition under which $E[Z_n]$ converges to $E[Z]$.

Theorem 6.13 If $Z_n \xrightarrow{d} Z$ and $E\|Z_n\| \le C$ then $E\|Z\| \le C$.

Definition 6.4 The random vector $Z_n$ is uniformly integrable as $n \to \infty$ if

$$\lim_{M \to \infty} \limsup_{n \to \infty} E\left[\|Z_n\| \, 1\left\{\|Z_n\| > M\right\}\right] = 0.$$

Theorem 6.14 If for some $\delta > 0$, $E\|Z_n\|^{1+\delta} \le C < \infty$, then $Z_n$ is uniformly integrable.

Theorem 6.15 If $Z_n \xrightarrow{d} Z$ and $Z_n$ is uniformly integrable then $E[Z_n] \to E[Z]$.
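A standard counterexample shows why uniform integrability cannot be dropped from Theorem 6.15. It is added here as an illustration and does not appear in the original text. Let $Z_n = n$ with probability $1/n$ and $Z_n = 0$ otherwise. Then $P[Z_n \ne 0] = 1/n \to 0$, so $Z_n \xrightarrow{d} 0$, yet $E[Z_n] = 1$ for every $n$. The sketch below computes these quantities exactly with rational arithmetic; the function names are hypothetical.

```python
from fractions import Fraction


def zn_pmf(n):
    """Probability mass function of Z_n: value n w.p. 1/n, value 0 otherwise."""
    return {0: 1 - Fraction(1, n), n: Fraction(1, n)}


def zn_mean(n):
    """E[Z_n], computed exactly from the pmf."""
    return sum(z * p for z, p in zn_pmf(n).items())


def zn_tail_mean(n, m):
    """E[Z_n 1{Z_n > m}], the quantity appearing in Definition 6.4."""
    return sum(z * p for z, p in zn_pmf(n).items() if z > m)


if __name__ == "__main__":
    for n in (10, 1000, 100000):
        # P[Z_n != 0] shrinks, but the mean and the tail mean (m = 5) stay at 1.
        print(n, zn_pmf(n)[n], zn_mean(n), zn_tail_mean(n, 5))
```

For any fixed $M$ the tail expectation equals one whenever $n > M$, so the limit in Definition 6.4 is one rather than zero, and the sequence is not uniformly integrable.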

6.10 Uniform Stochastic Bounds

Theorem 6.16 If $|Y_i|^r$ is uniformly integrable, then as $n \to \infty$

$$n^{-1/r} \max_{1 \le i \le n} |Y_i| \xrightarrow{p} 0. \tag{6.6}$$

Equation (6.6) implies that if $Y$ has $r$ finite moments then the largest observation will diverge at a rate slower than $n^{1/r}$. The higher the moments, the slower the rate of divergence.
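The rate in (6.6) can be illustrated with a short simulation. The sketch below is an added example under stated assumptions: $Y_i$ are i.i.d. standard normal, which has finite moments of every order, and $r = 2$ by default; the name `scaled_max` is hypothetical. It computes one realization of $n^{-1/r} \max_{1 \le i \le n} |Y_i|$, which should be small for large $n$.

```python
import random


def scaled_max(n, r=2.0, seed=0):
    """One draw of n**(-1/r) * max_i |Y_i| for i.i.d. standard normal Y_i.

    Assumption for illustration: the normal distribution and r = 2 are
    arbitrary choices; any distribution with a finite r-th moment would do.
    """
    rng = random.Random(seed)
    largest = max(abs(rng.gauss(0.0, 1.0)) for _ in range(n))
    return n ** (-1.0 / r) * largest


if __name__ == "__main__":
    # The scaled maximum should decline as the sample size grows.
    for n in (100, 10000, 100000):
        print(n, scaled_max(n))
```

Each call produces a single random realization, so the decline across sample sizes is typical rather than guaranteed for every seed.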
