• 検索結果がありません。

pdf Research Kengo Kato

N/A
N/A
Protected

Academic year: 2018

シェア "pdf Research Kengo Kato"

Copied!
6
0
0

読み込み中.... (全文を見る)

全文

(1)

2013.5.7. Minimax lower bounds via Neyman-Pearson lemma

Kengo Kato Suppose that there is a scalar dependent variable Y and a scalar covari- ate X which we assume has support in [0, 1]. Consider the nonparametric regression model

Y = f (X) + ϵ, ϵ ⊥⊥ X, ϵ ∼ N(0, σ2), σ2 > 0.

We fix the distribution of X and σ2 > 0. Let {ϕj}j=1 be an orthonormal system in L2([0, 1]). We assume that

L := sup

j≥1E[ϕ

2j(X)] < ∞.

For given α > 0 and C1> 0, suppose that f belongs to the class F(α, C1) = {f ∈ L2([0, 1]) : |⟨f, ϕj⟩| ≤ C1j−α, ∀j ≥ 1},

where ⟨·, ·⟩ denotes the inner product in L2([0, 1]). Denote by ∥ · ∥ the L2([0, 1])-norm. Let (Y1, X1), . . . , (Yn, Xn) be i.i.d. observations of (Y, X).

The purpose of this note is to prove (in a self-contained manner) the following (well-known) theorems by means of a simple application of the Neyman-Pearson lemma.1

Theorem 1. Under the above setup, we have inffˆ f ∈F (α,Csup 1)Ef[∥ ˆf − f∥

2] ≳ n−(2α−1)/(2α),

where the infimum is taken over all estimators ˆf of f .

Remark 1. The idea of the proof of Theorem 1 is borrowed from [1, 2] where minimax lower bounds in the problem of estimating structural func- tions in nonparametric instrumental variables models and slope functions in functional linear models are derived [see also the proof of 4, Theorem 7]. However, in [1, 2], detailed proofs for the minimax lower bounds are not presented (though the proofs are correct). Hence I hope that this note would be of some help in understanding their proofs.

Alternatively, we have the following theorem.

Theorem 2. There exists a small constant c > 0 such that lim inf

n→∞ inffˆ f ∈F (α,Csup 1)Pf(∥ ˆf − f∥

2 > cn−(2α−1)/(2α)) > 0.

1For various techniques to derive minimax lower bounds in nonparametric statistical problems, see [3].

1

(2)

Proof of Theorem 1. Let Mn be the integer part of n1/(2α). For θMn = (θMn+1, . . . , θ2Mn)T ∈ RMn, define

fθMn(·) = C1

2Mn

j=Mn+1

j−αθjϕj(·).

Clearly, fθMn ∈ F(α, C1) whenever θMn ∈ [0, 1]Mn. Lemma 1. We have

inffˆ f ∈F (α,Csup 1)Ef[∥ ˆf − f∥

2] ≥ inf θˆMn

sup

θMn∈{0,1}Mn

EθMn[∥fθˆMn − fθMn2],

where the infimum on the right side is taken over all estimators ˆθMn [0, 1]Mn of θMn.

Proof of Lemma 1. For arbitrary ˆf , we have sup

f ∈F (α,C1)Ef[∥ ˆf − f∥

2] ≥ sup θMn∈{0,1}Mn

EθMn[∥ ˆf − fθMn2]. Moreover, by Bessel’s inequality,

∥ ˆf − f∥2

j=1

(⟨ ˆf , ϕj⟩ − ⟨f, ϕj⟩)2,

so that when f = fθMn for some θMn ∈ {0, 1}Mn, it is enough to consider the estimator of the form

f (·) =ˆ

2Mn

j=Mn+1

ˆ αjϕj(·),

where ˆαj are data-dependent. By defining ˆθj = C1−1jααˆj, ˆf is of the form f (·) = fˆ θˆMn(·) = C1

2Mn

j=Mn+1

j−αθˆjϕj(·).

We need to show that we can restrict ˆθj in such a way that 0 ≤ ˆθj ≤ 1. For given ˆθj, define

θ˜j =





1, if ˆθj > 1, θˆj, if 0 ≤ ˆθj ≤ 1, 0, if ˆθj < 0. Then whenever θMn ∈ {0, 1}Mn,

∥fθˆMn − fθMn2≥ ∥fθ˜Mn − fθMn2.

This completes the proof of the lemma. □

(3)

For the notational convenience, write

θ−jMn = (θMn+1, . . . , θj−1, θj+1, . . . , θ2Mn)T, Mn+ 1 ≤ j ≤ 2Mn.

Observe that sup

θMn∈{0,1}MnEθ

Mn[∥fθˆMn − fθMn2]

2M1n

θMn∈{0,1}Mn

EθMn[∥fθˆMn − fθMn2]

= C

12

2Mn

2Mn

j=Mn+1

j−2α

θMn

−j ∈{0,1} Mn−1

{EθMn

−j j=0

[(ˆθj− θj)2]

+ EθMn

−j j=1

[(ˆθj− θj)2]}. (1) We want to lower bound

EθMn

−j j=0

[(ˆθj− θj)2] + EθMn

−j j=1

[(ˆθj− θj)2].

To this end, we make use of a variant of Neyman-Pearson lemma combined with Le Cam’s inequality [3, Lemma 2.3].

Lemma 2. Let (S, S, µ) be a measure space, and let p, q be probability den- sity functions with respect to µ. Then

(i) (A variant of Neyman-Pearson lemma): inf

{∫

φpdµ +

ψqdµ : φ ≥ 0, ψ ≥ 0, φ + ψ ≥ 1 }

(p ∧ q)dµ. (ii) (Le Cam’s inequality):

(p ∧ q)dµ ≥ 12

(∫ √pqdµ)2.

Proof of Lemma 2. Part (i): Let φ ≥ 0, ψ ≥ 0, φ + ψ ≥ 1. Then

φpdµ +

ψqdµ ≥

(φ ∧ 1)pdµ +

(1 − φ ∧ 1)qdµ.

We lower bound the right side with respect to φ. Clearly, we may assume that φ ≤ 1. The desired conclusion follows from the inequality

(p − q)(φ − 1(p < q))dµ ≥ 0. Part (ii): Since∫ (p ∨ q)dµ + ∫ (p ∧ q)dµ = 2, we have

(∫ √pqdµ)2=(∫ √(p ∧ q)(p ∨ q)dµ )2

(p ∧ q)dµ

(p ∨ q)dµ

=

(p ∧ q)dµ {

2 −

(p ∨ q)dµ }

≤ 2

(p ∧ q)dµ,

so that the desired inequality is obtained. □

(4)

For a while, fix Mn+1+1 ≤ j ≤ 2Mnand θM−jn ∈ {0, 1}Mn−1. Let pθj(y | x) denote the conditional density function of Y given X = x when f = fθMn

−j j

:

pθj(y | x) = 1 2πσ2 exp

{

12 (y − fθMn

−j j

(x))2 }

. Then

EθMn

−j j=0

[ˆθ2j] + EθMn

−j j=1

[(ˆθj− 1)2]

= E [∫

θˆ2j((y1, X1), . . . , (yn, Xn))

n

i=1

pθj=0(yi | Xi)dy1· · · dyn

+

{

1 − ˆθj((y1, X1), . . . , (yn, Xn))}2

n

i=1

pθj=1(yi | Xi)dy1· · · dyn ]

. (2)

Note that ˆθ2j + (1 − ˆθj)2 ≥ 1/2, i.e., 2ˆθ2j + 2(1 − ˆθj)2≥ 1, and

pθj=0(y | x)pθj=1(y | x) = 1 2πσ2 exp

σ12 (

y − fθMn

−j j=0

(x) + fθMn

−j j=1

(x) 2

)2

× exp {

12(fθMn

−j j=1(x) − fθ Mn

−j j=0

(x))2 }

, so that, by Lemma 2,

θˆ2

j((y1, X1), . . . , (yn, Xn))

n

i=1

pθj=0(yi | Xi)dy1· · · dyn

+

{

1 − ˆθj((y1, X1), . . . , (yn, Xn))}2

n

i=1

pθj=1(yi | Xi)dy1· · · dyn

14exp {

C

12j−2α

2

n

i=1

ϕ2j(Xi) }

.

By convexity of the map x 7→ e−x, we have (2) ≥ 14exp

{

C

12j−2αn

2 E[ϕ

2j(X)]

}

14exp {

C

12j−2αnL

2 }

. For j ≥ Mn+ 1,

j−2αn ≤ (Mn+ 1)−2αn ≤ 1, so that whenever j ≥ Mn+ 1,

EθMn

−j j=0

[ˆθ2j] + EθMn

−j j=1

[(ˆθj− 1)2] ≥ 1 4exp

{

C

12L

2 }

.

(5)

Since Mn+ 1 ≤ j ≤ 2Mn and θM−jn ∈ {0, 1}Mn−1 are arbitrary, combining this inequality with (1), we have

ˆinf

θMn

sup

θMn∈{0,1}Mn

EθMn[∥fθˆMn − fθMn2]

C

12

8 exp {

C

12L

2

} 2Mn

j=Mn+1

j−2α. (3)

Since 2Mj=Mnn+1j−2α ∼ Mn−2α+1 ∼ n−(2α−1)/(2α), we obtain the desired

conclusion. □

Proof of Theorem 2. It is not difficult to see that inffˆ f ∈F (α,Csup

1)

Pf(∥ ˆf − f∥2 > cn−(2α−1)/(2α))

≥ inf

θˆMn

sup

θMn∈{0,1}MnPθ

Mn(∥fθˆMn − fθMn2> cn−(2α−1)/(2α)),

where infθˆMn is taken over all estimators ˆθMn ∈ [0, 1]Mn of θMn.

Denote by δ1n the right side on (3). Fix arbitrary estimator ˆθMn [0, 1]Mn−1 of θMn. Since {0, 1}Mn is a finite set and the supremum over {0, 1}Mn is attained, there exists a sequence θMn such that

EθMn[∥fθˆMn − fθMn2] ≥ δ1n.

Moreover,

∥fθˆMn − fθMn2≤ C1 2Mn

j=Mn+1

j−2α(ˆθj − θj)2≤ C12

2Mn

j=Mn+1

j−2α =: δ2n, so that E[∥fθˆMn − fθMn4] ≤ δ22n. We recall the Paley-Zygmund inequality. Lemma 3(Paley-Zygmund inequality). Let Z be a real valued random vari- able with finite second moment. Then for every λ ∈ (0, 1),

P(Z≥ λE[Z]) ≥ (1 − λ)2(E[Z])

2

E[Z2] .

Apply the Paley-Zygmund inequality with λ = 1/2 and Z = ∥fθˆMn

fθMn2. Then

PθMn(∥fθˆMn − fθMn2 ≥ δ1n/2)

≥ PθMn

(

∥fθˆMn − fθMn21

2EθMn[∥fθˆMn − fθMn

2]

)

14(E[∥fθˆMn − fθMn

2])2

E[∥fθˆMn − fθMn4] δ1n222n

1 256exp

{

C

12L

2 }

. Therefore, we conclude that

lim inf

n→∞ inffˆ f ∈F (α,Csup

1)

Pf(∥ ˆf − f∥2> δ1n/2) ≥ 2561 exp {

C

12L

2 }

.

(6)

Since δ1n∼ n−(2α−1)/(2α), we obtain the desired conclusion. □ References

[1] Hall, P. and Horowitz, J.L. (2005). Nonparametric methods for inference in the presence of instrumental variables. Ann. Statist. 33 2904-2929. [2] Hall, P. and Horowitz, J.L. (2007). Methodology and convergence rates

for functional linear regression. Ann. Statist. 35 70-91.

[3] Tsybakov, A.B. (2009). Introduction to Nonparametric Estimation. Springer.

[4] Yuan, M. and Cai, T. (2010). A reproducing kernel Hilbert space ap- proach to functional linear regression. Ann. Statist. 38 3412-3444.

参照

関連したドキュメント

Now it makes sense to ask if the curve x(s) has a tangent at the limit point x 0 ; this is exactly the formulation of the gradient conjecture in the Riemannian case.. By the

[3] Chen Guowang and L¨ u Shengguan, Initial boundary value problem for three dimensional Ginzburg-Landau model equation in population problems, (Chi- nese) Acta Mathematicae

Kilbas; Conditions of the existence of a classical solution of a Cauchy type problem for the diffusion equation with the Riemann-Liouville partial derivative, Differential Equations,

An integral inequality is deduced from the negation of the geometrical condition in the bounded mountain pass theorem of Schechter, in a situation where this theorem does not

Then it follows immediately from a suitable version of “Hensel’s Lemma” [cf., e.g., the argument of [4], Lemma 2.1] that S may be obtained, as the notation suggests, as the m A

It is known that if the Dirichlet problem for the Laplace equation is considered in a 2D domain bounded by sufficiently smooth closed curves, and if the function specified in the

Indeed, when using the method of integral representations, the two prob- lems; exterior problem (which has a unique solution) and the interior one (which has no unique solution for

There arises a question whether the following alternative holds: Given function f from W ( R 2 ), can the differentiation properties of the integral R f after changing the sign of