• 検索結果がありません。

pdf Research Kengo Kato

N/A
N/A
Protected

Academic year: 2018

シェア "pdf Research Kengo Kato"

Copied!
3
0
0

読み込み中.... (全文を見る)

全文

(1)

2013.5.7.(minor edit: 2015.8.28)

Implications of KKT conditions in quantile regression

Kengo Kato Let y = (y1, . . . , yn)T ∈ Rn and X = (x1, . . . , xn)T ∈ Rn×p be a pair of a vector of dependent variables and a design matrix. Consider to solve the following minimization problem:

(QR) min

β∈Rp n

i=1

ρτ(yi− xTiβ),

where ρτ(u) = (τ − 1(u ≤ 0))u is the check function used in quantile regression. This minimization problem reduces to the following linear programming problem:

(QR-LP) min

u,v∈Rn,β∈Rpτ 1 T

nu+ (1 − τ )1Tnv

s.t. u − v = y − Xβ, u ≥ 0n, v ≥ 0n,

where 1n = (1, . . . , 1)T ∈ Rn and 0n = (0, . . . , 0)T ∈ Rn. The inequalities u ≥ 0 and v ≥ 0 are interpreted coordinatewise. In the problem (QR), the set of feasible solutions is non-empty and the objective function is non-negative on the set of feasible solutions. Hence there exists at least one optimal solution to the problem (QR-LP) and so to (QR). The purpose of this note is to to prove the following lemma, which roughly describes “first order conditions” for the problem (QR), by using the KKT theorem. The following lemma is a small modification of Lemma 2.1 in [2] where only the LAD case (i.e. the case with τ = 1/2) is handled. Lemma 1. Let β be an optimal solution to the problem (QR). Let I = {i ∈ {1, . . . , n} : yi = xTi β}. Then there exist ai∈ [−1, 0], i ∈ I such that

n

i=1

{τ − 1(yi≤ xTiβ)}xi =

i∈I

aixi.

Hence we have

n

i=1

{τ − 1(yi≤ xTiβ)}xi

≤ Card(I) max

1≤i≤n|xi|.

Proof. Let

ui = max{yi− xTi β, 0}, vi = max{−yi+ xTi β, 0}.

Then u− v = y − Xβ and (u, v, β) is an optimal solution to the problem (QR-LP). Defining

f (u, v, β) = τ 1Tnu+ (1 − τ )1Tnv,

g(u, v, β) = (g1(u, v, β), . . . , g2n(u, v, β))T = (−uT, −vT)T, h(u, v, β) = (h1(u, v, β), . . . , hn(u, v, β))T = u − v − y + Xβ, the problem (QR-LP) can be written as

u,v∈Rminn,β∈Rpf (u, v, β)

s.t. g(u, v, β) ≤ 02n, h(u, v, β) = 0n.

1

(2)

2

Let ei ∈ Rn denote the vector of which only the i-th element is 1 and the other ele- ments are all zero. Then the gradient vectors of f (u, v, β), gi(u, v, β), gn+i(u, v, β) and hi(u, v, β) are given by

∇f (u, v, β) =

 τ 1n

(1 − τ )1n

0p

, ∇gi(u, v, β) =

−ei

0n

0p

,

∇gn+i(u, v, β) =

 0

−ei

0p

, ∇hi(u, v, β) =

 ei

−ei

xi

, i = 1, . . . , n.

Since all the constraints are linear, by the KKT theorem [1, Proposition 3.3.7], there exist µ1, . . . , µ2n≥ 0 and λ1, . . . , λn such that

 τ 1n

(1 − τ )1n

0p

+

n

i=1

µi

−ei

0n

0p

+

n

i=1

µn+i

 0n

−ei

0p

+

n

i=1

λi

 ei

−ei

xi

= 02n+p, and

µiui = 0, µn+ivi= 0, i = 1, . . . , n. Recall I= {i ∈ {1, . . . , n} : yi= xTiβ}. Let

I+ = {i ∈ {1, . . . , n} : yi> xTiβ}, I = {i ∈ {1, . . . , n} : yi < xTi β}. Observe that

i ∈ I+ ⇒ ui > 0 ⇒ µi= 0 ⇒ λi= −τ, i ∈ I ⇒ vi> 0 ⇒ µn+i= 0 ⇒ λi= 1 − τ, Therefore,

i∈I+

τ xi+ (τ − 1)

i∈I

xi=

i∈I

λixi.

The left side is expressed as

i∈I+∪I

{τ − 1(yi≤ xTiβ)}xi=

n

i=1

{τ − 1(yi≤ xTi β)}xi+ (1 − τ )

i∈I

λixi,

so that

n

i=1

{τ − 1(yi≤ xTiβ)}xi=

i∈I

i− 1 + τ )xi. For i ∈ I, we have

τ − µi+ λi= 0 ⇒ λi≥ −τ,

1 − τ − µn+i− λi= 0 ⇒ λi≤ 1 − τ,

so that λi ∈ [−τ, 1 − τ ], i ∈ I. This completes the proof. □ The next question is how large Card(I) is. Suppose that y is random but X is fixed (if X is random, consider the conditional distribution of y given X). Lemma 2. Suppose that n ≥ p and the distribution of y is absolutely continuous with respect to the Lebesgue measure on Rn. Then

Card(I) ≤ p, a.s.

(3)

3

Proof. For I ⊂ {1, . . . , n}, let

SI = {y ∈ Rn : yi= xTiβ, ∀i ∈ I, ∃β ∈ Rp}.

Then as long as Card(I) ≥ p + 1, the set SI is a linear subspace of Rn of dimension at most n − 1, so that its Lebesgue measure in Rn is zero. The conclusion follows from the fact that

{Card(I) ≥ p + 1} ⊂

I⊂{1,...,n}

Card(I)≥p+1

{y ∈ SI},

and the absolute continuity of the distribution of y. □ References

[1] Bertsekas, D. (1999). Nonlinear Programming (2nd edition). Athena Scientific. [2] El-Attar, R.A., Vidyasagar, M. and Dutta, S.P.K. (1979). An algorithm for l1-norm minimization with application to nonlinear l1-approximation. SIAM J. Numer. anal. 16 70-86.

参照

関連したドキュメント

He thereby extended his method to the investigation of boundary value problems of couple-stress elasticity, thermoelasticity and other generalized models of an elastic

Theorem 4.8 shows that the addition of the nonlocal term to local diffusion pro- duces similar early pattern results when compared to the pure local case considered in [33].. Lemma

Keywords: continuous time random walk, Brownian motion, collision time, skew Young tableaux, tandem queue.. AMS 2000 Subject Classification: Primary:

Kilbas; Conditions of the existence of a classical solution of a Cauchy type problem for the diffusion equation with the Riemann-Liouville partial derivative, Differential Equations,

Since the boundary integral equation is Fredholm, the solvability theorem follows from the uniqueness theorem, which is ensured for the Neumann problem in the case of the

This paper presents an investigation into the mechanics of this specific problem and develops an analytical approach that accounts for the effects of geometrical and material data on

Beyond proving existence, we can show that the solution given in Theorem 2.2 is of Laplace transform type, modulo an appropriate error, as shown in the next theorem..

While conducting an experiment regarding fetal move- ments as a result of Pulsed Wave Doppler (PWD) ultrasound, [8] we encountered the severe artifacts in the acquired image2.