ONTO A REGION DEFINED BY A LINEAR CONSTRAINT AND BOX CONSTRAINTS IN R^n

STEFAN M. STEFANOV

Received 29 September 2003 and in revised form 6 April 2004

We consider the problem of projecting a point onto a region defined by a linear equality or inequality constraint and two-sided bounds on the variables. Such problems are interesting because they arise in various practical problems and as subproblems of gradient-type methods for constrained optimization. Polynomial algorithms are proposed for solving these problems and their convergence is proved. Some examples and results of numerical experiments are presented.

1. Introduction

Consider the problem of projecting a point x̄ = (x̄_1,…,x̄_n) ∈ R^n onto a set defined by a linear inequality constraint "≤", a linear equality constraint, or a linear inequality constraint "≥" with positive coefficients, and box constraints. This problem can be mathematically formulated as the following quadratic programming problem:

\min\; c(x) \equiv \sum_{j=1}^{n} c_j(x_j) \equiv \frac{1}{2} \sum_{j=1}^{n} \bigl(x_j - \bar x_j\bigr)^2 \qquad (1.1)

subject to

x \in X, \qquad (1.2)

where the feasible region X is defined by

\sum_{j=1}^{n} d_j x_j \le \alpha, \quad d_j > 0, \ j = 1,\dots,n, \qquad (1.3)

a_j \le x_j \le b_j, \quad j = 1,\dots,n, \qquad (1.4)

Copyright © 2004 Hindawi Publishing Corporation. Journal of Applied Mathematics 2004:5 (2004) 409–431. 2000 Mathematics Subject Classification: 90C30, 90C20, 90C25. URL: http://dx.doi.org/10.1155/S1110757X04309071


in the first case, by

\sum_{j=1}^{n} d_j x_j = \alpha, \quad d_j > 0, \ j = 1,\dots,n, \qquad (1.5)

a_j \le x_j \le b_j, \quad j = 1,\dots,n, \qquad (1.6)

in the second case, or by

\sum_{j=1}^{n} d_j x_j \ge \alpha, \quad d_j > 0, \ j = 1,\dots,n, \qquad (1.7)

a_j \le x_j \le b_j, \quad j = 1,\dots,n, \qquad (1.8)

in the third case.

Denote this problem by (P≤) in the first case (problem (1.1)-(1.2) with X defined by (1.3)-(1.4)), by (P=) in the second case (problem (1.1)-(1.2) with X defined by (1.5)-(1.6)), and by (P≥) in the third case (problem (1.1)-(1.2) with X defined by (1.7)-(1.8)).

Since c(x) is a strictly convex function and X is a closed convex set, this is a convex programming problem and it always has a unique optimal solution when X ≠ ∅.

Problems of the form (1.1)-(1.2) with X defined by (1.3)-(1.4), (1.5)-(1.6), or (1.7)-(1.8) arise in production planning and scheduling (see [2]), in allocation of resources (see [2, 7, 8, 15]), in the theory of search (see [4]), in facility location (see [10, 11, 12, 13, 14]), and so forth. Problems (P≤), (P=), and (P≥) also arise as subproblems of some projection optimization methods of gradient (subgradient) type for constrained optimization when the feasible region is of the form (1.3)-(1.4), (1.5)-(1.6), or (1.7)-(1.8) (see, e.g., [6]).

These projection problems are to be solved at each iteration of the algorithm because current points generated by these methods must be projected onto the feasible region at each iteration. That is why projection is the most onerous and time-consuming part of any projection gradient-type method for constrained optimization, and we need efficient algorithms for solving these problems. This is the motivation for studying the problems under consideration.

Problems like (P≤), (P=), and (P≥) are the subject of intensive study. Related problems and methods for them are considered in [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15].

An algorithm for finding a projection onto a simple polytope is proposed, for example, in [9]. Projections in the implementation of stochastic quasigradient methods are studied in [10]. Projected Newton-type methods are suggested in [1,5].

This paper is devoted to the development of new efficient polynomial algorithms for finding a projection onto the set X defined by (1.3)-(1.4), (1.5)-(1.6), or (1.7)-(1.8). The paper is organized as follows. In Section 2, characterization theorems (necessary and sufficient conditions or sufficient conditions) for the optimal solutions to the considered problems are proved. In Section 3, new algorithms of polynomial complexity are suggested and their convergence is proved. In Section 4, we consider some theoretical and numerical aspects of implementation of the algorithms and give some extensions of both characterization theorems and algorithms. In Section 5, we present results of some numerical experiments.

2. Main results. Characterization theorems

2.1. Problem (P≤). First consider the following problem:

(P≤) \qquad \min\; c(x) \equiv \sum_{j=1}^{n} c_j(x_j) \equiv \frac{1}{2}\sum_{j=1}^{n} \bigl(x_j - \bar x_j\bigr)^2 \qquad (2.1)

subject to (1.3) and (1.4).

Suppose that the following assumptions are satisfied:

(1.a) a_j ≤ b_j for all j = 1,…,n. If a_k = b_k for some k, 1 ≤ k ≤ n, then the value x_k := a_k = b_k is determined in advance;

(1.b) \sum_{j=1}^{n} d_j a_j \le \alpha; otherwise the constraints (1.3)-(1.4) are inconsistent and X = ∅, where X is defined by (1.3)-(1.4).

In addition to this assumption, we suppose that \alpha \le \sum_{j=1}^{n} d_j b_j in some cases, which are specified below.

The Lagrangian for problem (P≤) is

L(x,u,v,\lambda) = \frac{1}{2}\sum_{j=1}^{n} \bigl(x_j - \bar x_j\bigr)^2 + \lambda\Bigl(\sum_{j=1}^{n} d_j x_j - \alpha\Bigr) + \sum_{j=1}^{n} u_j\bigl(a_j - x_j\bigr) + \sum_{j=1}^{n} v_j\bigl(x_j - b_j\bigr), \qquad (2.2)

where λ ∈ R^1_+, u, v ∈ R^n_+, and R^n_+ consists of all vectors with n real nonnegative components.

The Karush-Kuhn-Tucker (KKT) necessary and sufficient optimality conditions for the minimum x* = (x*_1,…,x*_n) are

x^*_j - \bar x_j + \lambda d_j - u_j + v_j = 0, \quad j = 1,\dots,n, \qquad (2.3)

u_j\bigl(a_j - x^*_j\bigr) = 0, \quad j = 1,\dots,n, \qquad (2.4)

v_j\bigl(x^*_j - b_j\bigr) = 0, \quad j = 1,\dots,n, \qquad (2.5)

\lambda\Bigl(\sum_{j=1}^{n} d_j x^*_j - \alpha\Bigr) = 0, \quad \lambda \in \mathbb{R}^1_+, \qquad (2.6)

\sum_{j=1}^{n} d_j x^*_j \le \alpha, \qquad (2.7)

a_j \le x^*_j \le b_j, \quad j = 1,\dots,n, \qquad (2.8)

u_j \in \mathbb{R}^1_+, \quad v_j \in \mathbb{R}^1_+, \quad j = 1,\dots,n. \qquad (2.9)


Here, λ, u_j, v_j, j = 1,…,n, are the Lagrange multipliers associated with the constraints (1.3), a_j ≤ x_j, x_j ≤ b_j, j = 1,…,n, respectively. If a_j = −∞ or b_j = +∞ for some j, we do not consider the corresponding condition (2.4) [(2.5)] and Lagrange multiplier u_j [v_j].

Since λ ≥ 0, u_j ≥ 0, v_j ≥ 0, j = 1,…,n, and since the complementarity conditions (2.4), (2.5), (2.6) must be satisfied, in order to find x*_j, j = 1,…,n, from system (2.3)–(2.9), we have to consider all possible cases for λ, u_j, v_j: all λ, u_j, v_j equal to 0; all λ, u_j, v_j different from 0; some of them equal to 0 and some of them different from 0. The number of these cases is 2^{2n+1}, where 2n + 1 is the number of λ, u_j, v_j, j = 1,…,n. Obviously this is an enormous number of cases, especially for large-scale problems. For example, when n = 1500, we have 2^{3001} ≈ 10^{900} cases. Moreover, in each case we have to solve a large-scale system of (nonlinear) equations in x_j, λ, u_j, v_j, j = 1,…,n. Therefore direct application of the KKT theorem, using explicit enumeration of all possible cases, for solving large-scale problems of the considered form would not give a result, and we need results and efficient methods to cope with these problems.

The following theorem gives a characterization of the optimal solution to problem (P≤). Its proof, of course, is based on the KKT theorem. As we will see in Section 5, by using Theorem 2.1, we can solve problem (P≤) with n = 10 000 variables in 0.00055 seconds on a personal computer.

Theorem 2.1 (characterization of the optimal solution to problem (P≤)). A feasible solution x* = (x*_1,…,x*_n) ∈ X (1.3)-(1.4) is the optimal solution to problem (P≤) if and only if there exists some λ ∈ R^1_+ such that

x^*_j = a_j, \quad j \in J_a^{\lambda} \stackrel{\mathrm{def}}{=} \Bigl\{ j : \lambda \ge \frac{\bar x_j - a_j}{d_j} \Bigr\}, \qquad (2.10)

x^*_j = b_j, \quad j \in J_b^{\lambda} \stackrel{\mathrm{def}}{=} \Bigl\{ j : \lambda \le \frac{\bar x_j - b_j}{d_j} \Bigr\}, \qquad (2.11)

x^*_j = \bar x_j - \lambda d_j, \quad j \in J^{\lambda} \stackrel{\mathrm{def}}{=} \Bigl\{ j : \frac{\bar x_j - b_j}{d_j} < \lambda < \frac{\bar x_j - a_j}{d_j} \Bigr\}. \qquad (2.12)

Proof. (i) Let x* = (x*_1,…,x*_n) be the optimal solution to (P≤). Then there exist constants λ, u_j, v_j, j = 1,…,n, such that the KKT conditions (2.3)–(2.9) are satisfied. Consider both possible cases for λ.

(1) Let λ > 0. Then system (2.3)–(2.9) becomes (2.3), (2.4), (2.5), (2.8), (2.9), and

\sum_{j=1}^{n} d_j x^*_j = \alpha, \qquad (2.13)

that is, the inequality constraint (1.3) is satisfied with an equality for x*_j, j = 1,…,n, in this case.

(a) If x*_j = a_j, then u_j ≥ 0 and v_j = 0 according to (2.5). Therefore (2.3) implies x*_j − x̄_j = u_j − λd_j ≥ −λd_j. Since d_j > 0, then

\lambda \ge \frac{\bar x_j - x^*_j}{d_j} = \frac{\bar x_j - a_j}{d_j}. \qquad (2.14)


(b) If x*_j = b_j, then u_j = 0 according to (2.4) and v_j ≥ 0. Therefore (2.3) implies x*_j − x̄_j = −v_j − λd_j ≤ −λd_j. Hence

\lambda \le \frac{\bar x_j - x^*_j}{d_j} = \frac{\bar x_j - b_j}{d_j}. \qquad (2.15)

(c) If a_j < x*_j < b_j, then u_j = v_j = 0 according to (2.4) and (2.5). Therefore (2.3) implies x*_j = x̄_j − λd_j. Since d_j > 0, j = 1,…,n, and λ > 0 by the assumptions, then x̄_j > x*_j. It follows from b_j > x*_j, x*_j > a_j that

\bar x_j - b_j < \bar x_j - x^*_j, \qquad \bar x_j - x^*_j < \bar x_j - a_j. \qquad (2.16)

Using d_j > 0, we obtain λ = (x̄_j − x*_j)/d_j < (x̄_j − a_j)/d_j and λ = (x̄_j − x*_j)/d_j > (x̄_j − b_j)/d_j, that is,

\frac{\bar x_j - b_j}{d_j} < \lambda < \frac{\bar x_j - a_j}{d_j}. \qquad (2.17)

(2) Let λ = 0. Then system (2.3)–(2.9) becomes

x^*_j - \bar x_j - u_j + v_j = 0, \quad j = 1,\dots,n, \qquad (2.18)

and (2.4), (2.5), (2.7), (2.8), (2.9).

(a) If x*_j = a_j, then u_j ≥ 0, v_j = 0. Therefore a_j − x̄_j ≡ x*_j − x̄_j = u_j ≥ 0. Multiplying both sides of this inequality by −1/d_j (< 0 by the assumption), we obtain

\frac{\bar x_j - a_j}{d_j} \le 0 \equiv \lambda. \qquad (2.19)

(b) If x*_j = b_j, then u_j = 0, v_j ≥ 0. Therefore b_j − x̄_j ≡ x*_j − x̄_j = −v_j ≤ 0. Multiplying this inequality by −1/d_j < 0, we get

\frac{\bar x_j - b_j}{d_j} \ge 0 \equiv \lambda. \qquad (2.20)

(c) If a_j < x*_j < b_j, then u_j = v_j = 0. Therefore x*_j − x̄_j = 0, that is, x*_j = x̄_j. Since b_j > x*_j, x*_j > a_j, j = 1,…,n, by the assumption, then

\bar x_j - b_j < \bar x_j - x^*_j = 0, \qquad 0 = \bar x_j - x^*_j < \bar x_j - a_j. \qquad (2.21)

Multiplying both inequalities by 1/d_j > 0, we obtain

\frac{\bar x_j - b_j}{d_j} < 0 \equiv \lambda, \qquad \lambda \equiv 0 < \frac{\bar x_j - a_j}{d_j}, \qquad (2.22)

that is, in case (c) we have

\frac{\bar x_j - b_j}{d_j} < \lambda < \frac{\bar x_j - a_j}{d_j}. \qquad (2.23)


In order to describe cases (a), (b), (c) for both (1) and (2), it is convenient to introduce the index sets J_a^λ, J_b^λ, J^λ defined by (2.10), (2.11), and (2.12), respectively. Obviously J_a^λ ∪ J_b^λ ∪ J^λ = {1,…,n}. The "necessity" part is proved.

(ii) Conversely, let x* ∈ X and let the components of x* satisfy (2.10), (2.11), and (2.12), where λ ∈ R^1_+.

(1) If λ > 0, then x*_j − x̄_j < 0, j ∈ J^λ, according to (2.12) and d_j > 0. Set

\lambda = \frac{\bar x_j - x^*_j}{d_j} \ (> 0) \ \text{obtained from} \ \sum_{j \in J_a^{\lambda}} d_j a_j + \sum_{j \in J_b^{\lambda}} d_j b_j + \sum_{j \in J^{\lambda}} d_j\bigl(\bar x_j - \lambda d_j\bigr) = \alpha;

u_j = v_j = 0 \ \text{for} \ j \in J^{\lambda};

u_j = a_j - \bar x_j + \lambda d_j \ge 0 \ \text{according to the definition of} \ J_a^{\lambda}, \quad v_j = 0 \ \text{for} \ j \in J_a^{\lambda};

u_j = 0, \quad v_j = \bar x_j - b_j - \lambda d_j \ge 0 \ \text{according to the definition of} \ J_b^{\lambda}, \ \text{for} \ j \in J_b^{\lambda}. \qquad (2.24)

By using these expressions, it is easy to check that conditions (2.3), (2.4), (2.5), (2.6), (2.9) are satisfied; conditions (2.7) and (2.8) are also satisfied according to the assumption x* ∈ X.

(2) If λ = 0, then x*_j = x̄_j, j ∈ J^λ, according to (2.12), and

J^{\lambda=0} = \Bigl\{ j : \frac{\bar x_j - b_j}{d_j} < 0 < \frac{\bar x_j - a_j}{d_j} \Bigr\}. \qquad (2.25)

Since d_j > 0, then x̄_j − b_j < 0, x̄_j − a_j > 0, j ∈ J^0. Therefore x*_j = x̄_j ∈ (a_j, b_j). Set

\lambda = \frac{\bar x_j - x^*_j}{d_j} \ (= 0), \quad u_j = v_j = 0 \ \text{for} \ j \in J^{\lambda=0},

u_j = a_j - \bar x_j + \lambda d_j = a_j - \bar x_j \ (\ge 0), \quad v_j = 0 \ \text{for} \ j \in J_a^{\lambda=0},

u_j = 0, \quad v_j = \bar x_j - b_j - \lambda d_j = \bar x_j - b_j \ (\ge 0) \ \text{for} \ j \in J_b^{\lambda=0}. \qquad (2.26)

Obviously conditions (2.3), (2.4), (2.5), (2.9) are satisfied; conditions (2.7), (2.8) are also satisfied according to the assumption x* ∈ X, and condition (2.6) is obviously satisfied for λ = 0.

In both cases (1) and (2) of part (ii), x*_j, λ, u_j, v_j, j = 1,…,n, satisfy the KKT conditions (2.3)–(2.9), which are necessary and sufficient conditions for a feasible solution to be an optimal solution to a convex minimization problem. Therefore x* is the (unique) optimal solution to problem (P≤). □

In view of the discussion above, the importance of Theorem 2.1 consists in the fact that it describes the components of the optimal solution to (P≤) only through the Lagrange multiplier λ associated with the inequality constraint (1.3).


Since we do not know the optimal value of λ from Theorem 2.1, we define an iterative process with respect to the Lagrange multiplier λ, and we prove convergence of this process in Section 3.

It follows from d_j > 0 and a_j ≤ b_j, j = 1,…,n, that

u_b^j \stackrel{\mathrm{def}}{=} \frac{\bar x_j - b_j}{d_j} \le \frac{\bar x_j - a_j}{d_j} \stackrel{\mathrm{def}}{=} l_a^j, \quad j = 1,\dots,n, \qquad (2.27)

for the expressions by means of which we define the sets J_a^λ, J_b^λ, J^λ.

The problem of how to ensure a feasible solution to problem (P≤), which is an assumption of Theorem 2.1, is discussed after the statement of the corresponding algorithm.

2.2. Problem (P=). Consider problem (P=) of finding a projection of x̄ onto a set X of the form (1.5)-(1.6):

(P=) \qquad \min\; c(x) \equiv \sum_{j=1}^{n} c_j(x_j) \equiv \frac{1}{2}\sum_{j=1}^{n} \bigl(x_j - \bar x_j\bigr)^2 \qquad (2.28)

subject to (1.5) and (1.6).

We have the following assumptions:

(2.a) a_j ≤ b_j for all j = 1,…,n;

(2.b) \sum_{j=1}^{n} d_j a_j \le \alpha \le \sum_{j=1}^{n} d_j b_j; otherwise the constraints (1.5)-(1.6) are inconsistent and the feasible region (1.5)-(1.6) is empty.

The KKT conditions for problem (P=) are

x^*_j - \bar x_j + \lambda d_j - u_j + v_j = 0, \quad j = 1,\dots,n, \quad \lambda \in \mathbb{R}^1,

u_j\bigl(a_j - x^*_j\bigr) = 0, \quad j = 1,\dots,n,

v_j\bigl(x^*_j - b_j\bigr) = 0, \quad j = 1,\dots,n,

\sum_{j=1}^{n} d_j x^*_j = \alpha,

a_j \le x^*_j \le b_j, \quad j = 1,\dots,n,

u_j \in \mathbb{R}^1_+, \quad v_j \in \mathbb{R}^1_+, \quad j = 1,\dots,n. \qquad (2.29)

In this case the following theorem, which is analogous to Theorem 2.1, holds true.

Theorem 2.2 (characterization of the optimal solution to problem (P=)). A feasible solution x* = (x*_1,…,x*_n) ∈ X (1.5)-(1.6) is the optimal solution to problem (P=) if and only if there exists some λ ∈ R^1 such that

x^*_j = a_j, \quad j \in J_a^{\lambda} \stackrel{\mathrm{def}}{=} \Bigl\{ j : \lambda \ge \frac{\bar x_j - a_j}{d_j} \Bigr\}, \qquad (2.30)

x^*_j = b_j, \quad j \in J_b^{\lambda} \stackrel{\mathrm{def}}{=} \Bigl\{ j : \lambda \le \frac{\bar x_j - b_j}{d_j} \Bigr\}, \qquad (2.31)

x^*_j = \bar x_j - \lambda d_j, \quad j \in J^{\lambda} \stackrel{\mathrm{def}}{=} \Bigl\{ j : \frac{\bar x_j - b_j}{d_j} < \lambda < \frac{\bar x_j - a_j}{d_j} \Bigr\}. \qquad (2.32)

The proof of Theorem 2.2 is omitted because it is similar to that of Theorem 2.1.
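Theorem 2.2 reduces (P=) to one-dimensional root finding: h(λ) = Σ_j d_j x_j(λ) − α is continuous and nonincreasing in λ, and under assumption (2.b) its root is bracketed by the smallest breakpoint min_j (x̄_j − b_j)/d_j and the largest breakpoint max_j (x̄_j − a_j)/d_j. The sketch below locates λ by plain bisection; the paper's own algorithms in Section 3 are breakpoint-based and polynomial, and the names here are ours:

```python
def project_onto_hyperplane_box(xbar, a, b, d, alpha, tol=1e-12):
    """Project xbar onto {x : sum_j d_j x_j = alpha, a_j <= x_j <= b_j}
    (problem (P=)) by bisection on the multiplier lambda of Theorem 2.2."""
    assert all(dj > 0 for dj in d)
    assert (sum(dj * aj for dj, aj in zip(d, a)) <= alpha
            <= sum(dj * bj for dj, bj in zip(d, b))), "assumption (2.b) violated"

    def x_of(lam):  # componentwise formula (2.30)-(2.32), i.e. clipping
        return [min(max(xj - lam * dj, aj), bj)
                for xj, aj, bj, dj in zip(xbar, a, b, d)]

    def h(lam):     # equality-constraint residual; nonincreasing in lam
        return sum(dj * xj for dj, xj in zip(d, x_of(lam))) - alpha

    # At the smallest breakpoint all x_j = b_j (so h >= 0); at the largest,
    # all x_j = a_j (so h <= 0). Hence the root is bracketed.
    lo = min((xj - bj) / dj for xj, bj, dj in zip(xbar, b, d))
    hi = max((xj - aj) / dj for xj, aj, dj in zip(xbar, a, d))
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if h(mid) > 0.0:   # sum still too large -> increase lambda
            lo = mid
        else:
            hi = mid
    return x_of((lo + hi) / 2.0)
```

For example, projecting x̄ = (2, 2) onto {x_1 + x_2 = 2, 0 ≤ x_j ≤ 2} yields (1, 1), as symmetry suggests.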

2.3. Problem (P≥). Consider problem (P≥) of finding a projection of x̄ onto a set X of the form (1.7)-(1.8):

(P≥) \qquad \min\; c(x) \equiv \sum_{j=1}^{n} c_j(x_j) \equiv \frac{1}{2}\sum_{j=1}^{n} \bigl(x_j - \bar x_j\bigr)^2 \qquad (2.33)

subject to (1.7) and (1.8).

We have the following assumptions:

(3.a) a_j ≤ b_j for all j = 1,…,n;

(3.b) \alpha \le \sum_{j=1}^{n} d_j b_j; otherwise the constraints (1.7)-(1.8) are inconsistent and X = ∅, where X is defined by (1.7)-(1.8).

Rewrite (P≥) in the form (2.33), (2.34), (1.8), where

-\sum_{j=1}^{n} d_j x_j \le -\alpha, \quad d_j > 0, \ j = 1,\dots,n. \qquad (2.34)

Since the linear function d(x) := -\sum_{j=1}^{n} d_j x_j + \alpha is both convex and concave, (P≥) is a convex optimization problem.

Let λ^=, λ^≥ be the Lagrange multipliers associated with (1.5) (problem (P=)) and with (2.34) (problem (P≥)), respectively, and let x^{*=}_j, x^{*≥}_j, j = 1,…,n, be the components of the optimal solutions to (P=), (P≥), respectively. For the sake of simplicity, we use u_j, v_j, j = 1,…,n, instead of u_j^≥, v_j^≥, j = 1,…,n, for the Lagrange multipliers associated with a_j ≤ x_j, x_j ≤ b_j, j = 1,…,n, from (1.8), respectively.

The Lagrangian for problem (P≥) is

L\bigl(x,u,v,\lambda^{\ge}\bigr) = \frac{1}{2}\sum_{j=1}^{n} \bigl(x_j - \bar x_j\bigr)^2 + \lambda^{\ge}\Bigl(-\sum_{j=1}^{n} d_j x_j + \alpha\Bigr) + \sum_{j=1}^{n} u_j\bigl(a_j - x_j\bigr) + \sum_{j=1}^{n} v_j\bigl(x_j - b_j\bigr) \qquad (2.35)


and the KKT conditions for (P≥) are

x^*_j - \bar x_j - \lambda^{\ge} d_j - u_j + v_j = 0, \quad j = 1,\dots,n, \qquad (2.36)

u_j\bigl(a_j - x^*_j\bigr) = 0, \quad j = 1,\dots,n, \qquad (2.37)

v_j\bigl(x^*_j - b_j\bigr) = 0, \quad j = 1,\dots,n, \qquad (2.38)

\lambda^{\ge}\Bigl(\alpha - \sum_{j=1}^{n} d_j x^*_j\Bigr) = 0, \quad \lambda^{\ge} \in \mathbb{R}^1_+, \qquad (2.39)

-\sum_{j=1}^{n} d_j x^*_j \le -\alpha, \qquad (2.40)

a_j \le x^*_j \le b_j, \quad j = 1,\dots,n, \qquad (2.41)

u_j \in \mathbb{R}^1_+, \quad v_j \in \mathbb{R}^1_+, \quad j = 1,\dots,n. \qquad (2.42)

We can replace (2.36) and (2.39) by

x^*_j - \bar x_j + \lambda d_j - u_j + v_j = 0, \quad j = 1,\dots,n, \qquad (2.43)

\lambda\Bigl(\sum_{j=1}^{n} d_j x^*_j - \alpha\Bigr) = 0, \quad \lambda \in \mathbb{R}^1_-, \quad d_j > 0, \qquad (2.44)

respectively, where we have redenoted λ := −λ^≥ ∈ R^1_−.

Conditions (2.43) with λ^= instead of λ, together with (2.37), (2.38), (2.41), (2.42), are among the KKT conditions for problem (P=).

Theorem 2.3 (sufficient condition for an optimal solution). (i) If λ^= = (x̄_j − x^{*=}_j)/d_j ≤ 0, then x^{*=}_j, j = 1,…,n, solve problem (P≥) as well.

(ii) If λ^= = (x̄_j − x^{*=}_j)/d_j > 0, then x^{*≥}_j, j = 1,…,n, defined as

x^{*\ge}_j = b_j, \quad j \in J_b^{\lambda};

x^{*\ge}_j = \min\bigl\{b_j, \bar x_j\bigr\}, \quad j \in J^{\lambda};

x^{*\ge}_j = \min\bigl\{b_j, \bar x_j\bigr\}, \quad j \in J_a^{\lambda} \ \text{such that} \ a_j < \bar x_j;

x^{*\ge}_j = a_j, \quad j \in J_a^{\lambda} \ \text{such that} \ a_j \ge \bar x_j \qquad (2.45)

solve problem (P≥).

Proof. (i) Let λ^= = (x̄_j − x^{*=}_j)/d_j ≤ 0 (i.e., x^{*=}_j ≥ x̄_j, j ∈ J^λ, because d_j > 0). Since x^{*=}_j, j = 1,…,n, satisfy the KKT conditions for problem (P=) as components of the optimal solution to (P=), then (2.43), (2.37), (2.38), (2.40) with equality (and therefore (2.44)), (2.41), (2.42) are satisfied as well (with λ^= instead of λ). Since these are the KKT necessary and sufficient conditions for (P≥), then x^{*=}_j, j = 1,…,n, solve (P≥).

(ii) Let λ^= = (x̄_j − x^{*=}_j)/d_j > 0 (i.e., x̄_j > x^{*=}_j, j ∈ J^λ). Since x^{*=} = (x^{*=}_j)_{j=1}^n is the optimal solution to (P=) by the assumption, then the KKT conditions for (P=) are satisfied.

If x^{*≥} := (x^{*≥}_j)_{j=1}^n is the optimal solution to (P≥), then x^{*≥} satisfies (2.43), (2.37), (2.38), (2.44), (2.40), (2.41), (2.42). Since λ^= > 0, λ^= cannot play the role of λ in (2.43) and (2.44), because λ must be a nonpositive real number in (2.43) and (2.44). Therefore x^{*=}_j, which satisfy the KKT conditions for problem (P=), cannot play the roles of x^{*≥}_j, j = 1,…,n, in (2.43), (2.37), (2.38), (2.44), (2.40), (2.41), (2.42). Hence, in the general case the equality Σ_{j=1}^n d_j x_j = α is not satisfied for x_j = x^{*≥}_j. Therefore, in order that (2.44) be satisfied, λ must be equal to 0. This conclusion helps us to prove the theorem.

Let x^{*≥} := (x^{*≥}_j)_{j=1}^n be defined as in part (ii) of the statement of Theorem 2.3. Set λ = 0;

(1) u_j = 0, v_j = x̄_j − b_j (≥ 0 according to the definition of J_b^λ (2.31), λ^= > 0, d_j > 0) for j ∈ J_b^λ;

(2) u_j = v_j = 0 for j ∈ J_a^λ such that a_j < x̄_j and for j ∈ J^λ such that x̄_j < b_j;

(3) u_j = 0, v_j = x̄_j − b_j (≥ 0) for j ∈ J^λ such that x̄_j ≥ b_j;

(4) u_j = a_j − x̄_j (≥ 0), v_j = 0 for j ∈ J_a^λ such that a_j ≥ x̄_j.

In case (2) we have a_j < x̄_j, therefore a_j < x^{*≥}_j = x̄_j according to the definition of x^{*≥}_j in this case. In case (3), since b_j ≤ x̄_j, that is, x̄_j − b_j ≥ 0, then v_j := x̄_j − b_j ≥ 0. Consequently, conditions (2.41) and (2.42) are satisfied for all j according to (1), (2), (3), and (4).

As we have proved, (2.44) is satisfied with λ = 0. Since the equality constraint (1.5), Σ_{j=1}^n d_j x^{*=}_j = α, is satisfied for the optimal solution x^{*=} to (P=), since the components of x^{*≥} defined in the statement of Theorem 2.3(ii) are such that some of them are the same as the corresponding components of x^{*=}, since some of the components of x^{*≥}, namely those for j ∈ J_a^λ with a_j < x̄_j, are greater than the corresponding components x^{*=}_j = a_j, j ∈ J_a^λ, of x^{*=}, and since d_j > 0, j = 1,…,n, then obviously the inequality constraint (2.40) holds for x^{*≥}. It is easy to check that the other conditions (2.43), (2.37), (2.38) are also satisfied. Thus, x^{*≥}_j, j = 1,…,n, defined above satisfy the KKT conditions for (P≥). Therefore x^{*≥} is the optimal solution to problem (P≥). □

According to Theorem 2.3, the optimal solution to problem (P≥) is obtained by using the optimal solution and the optimal value of the Lagrange multiplier λ^= for problem (P=). That is why we suppose that Σ_{j=1}^n d_j a_j ≤ α in addition to assumption (3.b) (see Step 1 of Algorithm 3 below), as we assumed this in assumption (2.b) for problem (P=).
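Theorem 2.3 thus turns any solver for (P=) into one for (P≥): solve (P=); if λ^= ≤ 0, keep that solution, otherwise apply (2.45). In our reading, each of the four cases of (2.45) reduces to the plain box projection of x̄_j onto [a_j, b_j], consistent with λ^≥ = 0. A sketch, with bisection standing in for the paper's (P=) algorithm and with names of our own choosing:

```python
def solve_P_geq(xbar, a, b, d, alpha, tol=1e-12):
    """Project xbar onto {x : sum_j d_j x_j >= alpha, a_j <= x_j <= b_j}
    (problem (P>=)) via Theorem 2.3, assuming
    sum d_j a_j <= alpha <= sum d_j b_j."""
    def x_of(lam):  # clipping form of (2.30)-(2.32)
        return [min(max(xj - lam * dj, aj), bj)
                for xj, aj, bj, dj in zip(xbar, a, b, d)]

    # Solve (P=) by bisection: sum_j d_j x_j(lam) is nonincreasing in lam.
    lo = min((xj - bj) / dj for xj, bj, dj in zip(xbar, b, d))
    hi = max((xj - aj) / dj for xj, aj, dj in zip(xbar, a, d))
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if sum(dj * xj for dj, xj in zip(d, x_of(mid))) > alpha:
            lo = mid
        else:
            hi = mid
    lam_eq = (lo + hi) / 2.0
    if lam_eq <= 0.0:
        return x_of(lam_eq)  # case (i): the (P=) solution also solves (P>=)
    # Case (ii), formula (2.45): every branch reduces to clipping xbar itself,
    # i.e. x(lambda = 0); the ">=" constraint is inactive at the optimum.
    return x_of(0.0)
```

For x̄ = (0, 0) with Σx_j ≥ 2 and 0 ≤ x_j ≤ 2, the constraint is active and the result is (1, 1); for x̄ = (3, 3) it is inactive and the result is the box projection (2, 2).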

3. The algorithms

3.1. Analysis of the optimal solution to problem (P≤). Before the formal statement of the algorithm for problem (P≤), we discuss some properties of the optimal solution to this problem, which turn out to be useful.

Using (2.10), (2.11), and (2.12), condition (2.6) can be written as follows:

\lambda\Bigl(\sum_{j \in J_a^{\lambda}} d_j a_j + \sum_{j \in J_b^{\lambda}} d_j b_j + \sum_{j \in J^{\lambda}} d_j\bigl(\bar x_j - \lambda d_j\bigr) - \alpha\Bigr) = 0, \quad \lambda \ge 0. \qquad (3.1)

Since the optimal solution x* to problem (P≤) obviously depends on λ, we consider the components of x* as functions of λ for different λ ∈ R^1_+:

x^*_j = x_j(\lambda) = \begin{cases} a_j, & j \in J_a^{\lambda}, \\ b_j, & j \in J_b^{\lambda}, \\ \bar x_j - \lambda d_j, & j \in J^{\lambda}. \end{cases} \qquad (3.2)

The functions x_j(λ), j = 1,…,n, are piecewise linear, monotone nonincreasing, piecewise differentiable functions of λ with two breakpoints at λ = (x̄_j − a_j)/d_j and λ = (x̄_j − b_j)/d_j.

Let

\delta(\lambda) \stackrel{\mathrm{def}}{=} \sum_{j \in J_a^{\lambda}} d_j a_j + \sum_{j \in J_b^{\lambda}} d_j b_j + \sum_{j \in J^{\lambda}} d_j \bar x_j - \lambda \sum_{j \in J^{\lambda}} d_j^2 - \alpha. \qquad (3.3)

If we differentiate (3.3) with respect to λ, we get

\delta'(\lambda) \equiv -\sum_{j \in J^{\lambda}} d_j^2 < 0, \qquad (3.4)

when J^λ ≠ ∅, and δ'(λ) = 0 when J^λ = ∅. Hence δ(λ) is a monotone nonincreasing function of λ, λ ∈ R^1_+, and max_{λ≥0} δ(λ) is attained at the minimum admissible value of λ, that is, at λ = 0.

Case 1. If δ(0) > 0, in order that (3.1) and (2.7) be satisfied, there exists some λ* > 0 such that δ(λ*) = 0, that is,

\sum_{j=1}^{n} d_j x^*_j = \alpha, \qquad (3.5)

which means that the inequality constraint (1.3) is satisfied with an equality for λ* in this case.

Case 2. If δ(0) < 0, then δ(λ) < 0 for all λ ≥ 0, and the maximum of δ(λ) with λ ≥ 0 is δ(0) = max_{λ≥0} δ(λ), attained at λ = 0 in this case. In order that (3.1) be satisfied, λ must be equal to 0. Therefore x*_j = x̄_j, j ∈ J^{λ=0}, according to (2.12).

Case 3. In the special case when δ(0) = 0, the maximum δ(0) = max_{λ≥0} δ(λ) of δ(λ) is also attained at the minimum admissible value of λ, that is, for λ = 0, because δ(λ) is a monotone nonincreasing function in accordance with the above consideration.

As we have seen, for the optimal value of λ we have λ ≥ 0 in all possible cases, as the KKT condition (2.6) requires. We have shown that in Case 1 we need an algorithm for finding λ* which satisfies the KKT conditions (2.3)–(2.9) but such that λ* satisfies (2.7) with an equality. In order that this be fulfilled, the set (1.5)-(1.6) (i.e., the feasible region of problem (P=)) must be nonempty. That is why we have required α ≤ Σ_{j=1}^n d_j b_j in some cases in addition to the assumption Σ_{j=1}^n d_j a_j ≤ α (see assumption (1.b)). We have also used this in the proof of Theorem 2.1, part (ii), when λ > 0.
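The case analysis above yields a complete procedure for (P≤): if δ(0) ≤ 0 (Cases 2 and 3), then λ = 0 and x* is the componentwise box projection of x̄; if δ(0) > 0 (Case 1), then λ* > 0 is the root of the nonincreasing function δ. A sketch using bisection in place of the paper's polynomial Algorithm 1 (names ours):

```python
def project_P_leq(xbar, a, b, d, alpha, tol=1e-12):
    """Project xbar onto {x : sum_j d_j x_j <= alpha, a_j <= x_j <= b_j}
    (problem (P<=)), following Cases 1-3 of Section 3.1."""
    assert sum(dj * aj for dj, aj in zip(d, a)) <= alpha, "assumption (1.b) violated"

    def x_of(lam):   # x_j(lambda), formula (3.2): clip xbar_j - lam*d_j to [a_j, b_j]
        return [min(max(xj - lam * dj, aj), bj)
                for xj, aj, bj, dj in zip(xbar, a, b, d)]

    def delta(lam):  # delta(lambda) of (3.3) equals sum_j d_j x_j(lambda) - alpha
        return sum(dj * xj for dj, xj in zip(d, x_of(lam))) - alpha

    if delta(0.0) <= 0.0:  # Cases 2 and 3: constraint (1.3) inactive, lambda = 0
        return x_of(0.0)
    # Case 1: delta is nonincreasing; at the largest breakpoint all x_j = a_j,
    # where delta <= 0 by (1.b), so the root lambda* > 0 is bracketed.
    lo, hi = 0.0, max((xj - aj) / dj for xj, aj, dj in zip(xbar, a, d))
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if delta(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return x_of((lo + hi) / 2.0)
```

When x̄ is already feasible (δ(0) ≤ 0), the routine simply returns the box projection of x̄, which then equals x̄ itself if the bounds also hold.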
