Semiparametric efficiency bound for linear quantile regression

Kengo Kato

2014.7.29 (edit: 2015.4.27)

Remark 1. The semiparametric efficiency bound for the linear quantile regression model is derived in [1] as a special case of that for the censored quantile regression model. Here we present a direct derivation of the efficiency bound, following Section 25.4 in [2]; indeed, the derivation is just a small modification of that in Example 25.28 for the mean regression.

Consider the quantile regression model

Y = X^Tβ + ϵ, P(ϵ ≤ 0 | X) = τ,

where Y is scalar and X is k-dimensional. Let f(ϵ, x) be the joint density of (ϵ, X) with respect to dϵdµ(x), where µ is some σ-finite measure on R^k; we assume that lim_{|ϵ|→∞} f(ϵ, x) = 0 for µ-almost all x ∈ R^k, together with other standard regularity conditions (we drop “µ-almost all x ∈ R^k” in the following discussion). Then the distribution of (Y, X), denoted by P_β,f, is of the form

dP_β,f(y, x) = f (y − x^Tβ, x)dydµ(x).

Denote by f(ϵ | x) the conditional density of ϵ given X = x, i.e., f(ϵ | x) = f(ϵ, x)/∫ f(ϵ′, x)dϵ′. The conditional quantile restriction is written as

$$\int \varphi(\epsilon) f(\epsilon \mid x)\, d\epsilon = 0, \qquad \varphi(\epsilon) = \tau - 1(\epsilon \le 0),$$

which is equivalent to

$$\int \varphi(\epsilon) f(\epsilon, x)\, d\epsilon = 0.$$
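To see why this restriction encodes the quantile condition, the integral can be evaluated directly. The following is a short check that uses only the definitions of φ and f(ϵ | x) given above.

```latex
\begin{align*}
\int \varphi(\epsilon) f(\epsilon \mid x)\, d\epsilon
  &= \tau \int_{-\infty}^{\infty} f(\epsilon \mid x)\, d\epsilon
     - \int_{-\infty}^{0} f(\epsilon \mid x)\, d\epsilon \\
  &= \tau - P(\epsilon \le 0 \mid X = x).
\end{align*}
```

The right side vanishes exactly when P(ϵ ≤ 0 | X = x) = τ. Multiplying by the marginal density ∫ f(ϵ′, x)dϵ′ of X, which does not depend on ϵ, turns the conditional form into the form stated in terms of the joint density f(ϵ, x), at least wherever that marginal density is positive.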

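Consider a perturbation f_t of f with t ∈ R, which must satisfy the relation

$$\int \varphi(\epsilon) f_t(\epsilon, x)\, d\epsilon = 0.$$

Taking the derivative with respect to t, we have

$$0 = \frac{d}{dt} \int \varphi(\epsilon) f_t(\epsilon, x)\, d\epsilon = \int \varphi(\epsilon) \frac{\partial}{\partial t} f_t(\epsilon, x)\, d\epsilon, \tag{1}$$

where we assumed that the derivative and the integral can be interchanged. Note that the score function for f in the submodel t ↦ P_{β,f_t} is

$$g(y - x^T\beta, x) = \frac{\partial}{\partial t} \log f_t(y - x^T\beta, x) \Big|_{t=0} = \frac{\partial f_t(y - x^T\beta, x)/\partial t \,|_{t=0}}{f(y - x^T\beta, x)},$$

which, because of (1), satisfies

$$\int \varphi(\epsilon) g(\epsilon, x) f(\epsilon, x)\, d\epsilon = 0.$$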
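The step from (1) to this constraint can be spelled out as follows. This is a sketch that assumes f > 0 wherever the score is evaluated, so that the score g is well defined as a ratio.

```latex
\begin{align*}
\frac{\partial}{\partial t} f_t(\epsilon, x) \Big|_{t=0}
  &= g(\epsilon, x)\, f(\epsilon, x), \\
0 = \int \varphi(\epsilon)\, \frac{\partial}{\partial t} f_t(\epsilon, x) \Big|_{t=0}\, d\epsilon
  &= \int \varphi(\epsilon)\, g(\epsilon, x)\, f(\epsilon, x)\, d\epsilon .
\end{align*}
```

In conditional terms, the constraint says that E[φ(ϵ)g(ϵ, X) | X] = 0, i.e., every score for f is conditionally uncorrelated with φ(ϵ) given X.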


This leads to an intuition that the L²(P_β,f)-closure of the set of score functions for f is H = {(y, x) ↦ g(y − x^Tβ, x) : g ∈ G}, where

$$G = \Big\{ (\epsilon, x) \mapsto g(\epsilon, x) : \iint g^2(\epsilon, x) f(\epsilon, x)\, d\epsilon\, d\mu(x) < \infty,\ \iint g(\epsilon, x) f(\epsilon, x)\, d\epsilon\, d\mu(x) = 0,\ \int \varphi(\epsilon) g(\epsilon, x) f(\epsilon, x)\, d\epsilon = 0 \Big\}.$$

Indeed, for a bounded g ∈ G, consider f_t = (1 + tg)f, for which ∂ log f_t/∂t|_{t=0} = g. To verify that the map (y, x) ↦ g(y − x^Tβ, x) is a score function for f, we have to check that t ↦ P_{β,f_t} is a submodel for t in a neighborhood of 0. For sufficiently small |t|, f_t is nonnegative and ∫∫ f_t dϵdµ = 1 (the latter follows from the fact that ∫∫ gf dϵdµ = 0), and f_t satisfies ∫ φ(ϵ)f_t(ϵ, x)dϵ = 0, so that t ↦ P_{β,f_t} gives a submodel for t in a neighborhood of 0. Taking the L²(P_β,f)-closure, we obtain the desired assertion.
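The three properties of f_t = (1 + tg)f used in this argument can be checked line by line. In the sketch below, M is a name introduced only for this illustration; it denotes sup|g| and is assumed to be positive and finite (the case g = 0 is trivial).

```latex
\begin{align*}
f_t &= (1 + t g) f \;\ge\; (1 - |t| M) f \;\ge\; 0
      \quad \text{for } |t| \le 1/M, \qquad M := \sup |g|, \\
\iint f_t \, d\epsilon \, d\mu(x)
    &= \iint f \, d\epsilon \, d\mu(x) + t \iint g f \, d\epsilon \, d\mu(x)
     = 1 + t \cdot 0 = 1, \\
\int \varphi(\epsilon) f_t(\epsilon, x) \, d\epsilon
    &= \int \varphi(\epsilon) f(\epsilon, x) \, d\epsilon
       + t \int \varphi(\epsilon) g(\epsilon, x) f(\epsilon, x) \, d\epsilon
     = 0 + t \cdot 0 = 0 .
\end{align*}
```

Each line uses exactly one of the defining conditions: boundedness of g for nonnegativity, the mean-zero condition for normalization, and the orthogonality to φ for the quantile restriction.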

The score function for β is

$$\dot{\ell}_{\beta,f}(y, x) = -x\, \frac{\partial f(y - x^T\beta, x)/\partial \epsilon}{f(y - x^T\beta, x)}.$$
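This expression comes from differentiating the log-density of P_β,f with respect to β by the chain rule. A brief sketch, assuming f is differentiable in its first argument:

```latex
\begin{align*}
\dot{\ell}_{\beta,f}(y, x)
  &= \frac{\partial}{\partial \beta} \log f(y - x^T\beta, x) \\
  &= \frac{\partial (y - x^T\beta)}{\partial \beta} \cdot
     \frac{\partial f(y - x^T\beta, x)/\partial \epsilon}{f(y - x^T\beta, x)} \\
  &= -x\, \frac{\partial f(y - x^T\beta, x)/\partial \epsilon}{f(y - x^T\beta, x)} .
\end{align*}
```

The minus sign comes solely from the inner derivative ∂(y − x^Tβ)/∂β = −x.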

The efficient score for β, denoted by ˜ℓ_β,f, is obtained by projecting (each element of) ˙ℓ_β,f onto the orthocomplement of H in L²(P_β,f). For any function a(ϵ, x) square integrable with respect to the distribution of (ϵ, X) such that ∫∫ a(ϵ, x)f(ϵ, x)dϵdµ(x) = 0, the projection of a(ϵ, x) onto the orthocomplement of H in L²(P_β,f) is identical to that of a(ϵ, x) onto the set of functions of the form φ(ϵ)h(x) where h is square integrable with respect to the marginal distribution of X, so that the desired projection is given by

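$$\varphi(\epsilon)\, \frac{E[a(\epsilon, X)\varphi(\epsilon) \mid X = x]}{E[\varphi^2(\epsilon) \mid X = x]}.$$

Hence the efficient score for β is computed, with ϵ = y − x^Tβ, as

$$\tilde{\ell}_{\beta,f}(y, x) = -x\, \varphi(\epsilon)\, \frac{\int \varphi(\epsilon') \{\partial f(\epsilon' \mid x)/\partial \epsilon\}\, d\epsilon'}{\int \varphi^2(\epsilon') f(\epsilon' \mid x)\, d\epsilon'} = x\, \frac{\varphi(\epsilon)}{\tau(1 - \tau)}\, f(0 \mid x),$$

so that the semiparametric efficiency bound for estimation of β is

$$\big(E[\tilde{\ell}_{\beta,f}(Y, X)\, \tilde{\ell}_{\beta,f}(Y, X)^T]\big)^{-1} = \tau(1 - \tau)\big(E[f^2(0 \mid X) X X^T]\big)^{-1},$$

provided that the inverse matrix on the right side exists.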
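The two integrals in the efficient score, and the resulting information matrix, can be evaluated explicitly. The sketch below relies on the assumption lim_{|ϵ|→∞} f(ϵ, x) = 0 made at the outset. It then specializes the bound to the case where ϵ is independent of X with density f_ϵ, a symbol introduced here for illustration only.

```latex
\begin{align*}
\int \varphi(\epsilon') \frac{\partial f(\epsilon' \mid x)}{\partial \epsilon}\, d\epsilon'
  &= \tau \int_{-\infty}^{\infty} \frac{\partial f(\epsilon' \mid x)}{\partial \epsilon}\, d\epsilon'
     - \int_{-\infty}^{0} \frac{\partial f(\epsilon' \mid x)}{\partial \epsilon}\, d\epsilon'
   = \tau \cdot 0 - f(0 \mid x) = -f(0 \mid x), \\
\int \varphi^2(\epsilon') f(\epsilon' \mid x)\, d\epsilon'
  &= (\tau - 1)^2\, \tau + \tau^2 (1 - \tau) = \tau(1 - \tau), \\
E\big[\tilde{\ell}_{\beta,f}\, \tilde{\ell}_{\beta,f}^T\big]
  &= \frac{E\big[E[\varphi^2(\epsilon) \mid X]\, f^2(0 \mid X)\, X X^T\big]}{\tau^2 (1 - \tau)^2}
   = \frac{E[f^2(0 \mid X)\, X X^T]}{\tau(1 - \tau)} .
\end{align*}
% Special case (assumption for illustration): \epsilon independent of X,
% so that f(0 \mid x) = f_\epsilon(0) for all x.
\[
\tau(1 - \tau)\big(E[f^2(0 \mid X)\, X X^T]\big)^{-1}
  = \frac{\tau(1 - \tau)}{f_\epsilon(0)^2}\, \big(E[X X^T]\big)^{-1} .
\]
```

In this independent-error case the bound coincides with the usual asymptotic variance formula for the ordinary (unweighted) quantile regression estimator.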

References

[1] Newey, W.K. and Powell, J.L. (1990). Efficient estimation of linear and type I censored regression models under conditional quantile restrictions. Econometric Theory 6, 295-317.

[2] van der Vaart, A.W. (1998). Asymptotic Statistics. Cambridge University Press.