
A Smoothing Method for Mathematical Programs with Second-Order Cone Complementarity Constraints

Guidance
Professor Masao FUKUSHIMA

Takeshi EJIRI

2005 Graduate Course in
Department of Applied Mathematics and Physics
Graduate School of Informatics
Kyoto University

February 2007


Abstract

Real-world problems are subject to various uncertainties. To cope with difficulties arising in those situations, there has been increasing attention to optimization problems under uncertainty in recent years. However, just a few works have dealt with Mathematical Programs with Equilibrium Constraints (MPECs) under uncertainty. In particular, there have been very few studies on a robust optimization approach to those problems.

In this paper, we formulate an MPEC under uncertainty as a Mathematical Program with Second-Order Cone Complementarity Constraints (MPSOCCC). Since the MPSOCCC is difficult to deal with directly, we use a smoothing function to approximate the problem and solve a sequence of smooth approximation problems to obtain a solution of the original MPSOCCC. The main theoretical result of this work establishes that the proposed smoothing method converges globally to a stationary point of the MPSOCCC.


Contents

1 Introduction
2 Preliminaries
  2.1 Bilevel program and MPSOCCC
  2.2 Jordan algebra associated with the SOC
  2.3 Reformulation of the SOCCC as an equation
3 Smoothing method for problem P_0
4 Optimality conditions
5 Concluding remarks


1 Introduction

Optimization problems whose constraints contain complementarity conditions or variational inequalities are called Mathematical Programs with Equilibrium Constraints (MPECs). Although MPECs are very difficult to deal with because of their nonconvexity and nondifferentiability, they can model a wide range of practical problems, and researchers have therefore paid much attention to them in recent years. In particular, a bilevel program is a special case of an MPEC, and it includes many important problems such as leader-follower decision-making problems. A bilevel program can be reformulated as a Mathematical Program with Complementarity Constraints (MPCC) by replacing the lower-level optimization problem with its KKT conditions. For MPECs, MPCCs and bilevel programs, there are many results on optimality conditions and effective algorithms, such as branch-and-bound methods [2, 14, 18, 22], complementarity pivoting algorithms [4] and smoothing methods [9, 12, 20].

While optimization can be applied to a wide variety of real problems and systems, the majority of methods treat the problem data as exactly known; that is, they obtain a solution by formulating an optimization problem with deterministic data. In the problems we actually face, however, the relevant information is rarely known perfectly; almost always some of the data are uncertain. To overcome this, a number of solution methods have recently been proposed that incorporate the uncertainty into the problem formulation, instead of replacing uncertain data by fixed values. For example, some methods use the expected values of stochastic data [13] or their variances [8], while other methods, called robust optimization [3], seek a solution under the worst possible scenario.

Worst-case optimization is especially important in dealing with actual problems under uncertain information, in view of accident prevention in dangerous operations that might lead to disasters such as explosions in a large-scale chemical plant, or risk avoidance in finance, which has grown massively with derivatives such as futures trading. In a robust optimization method, if the data are known to belong to an ellipsoidal set, an optimization problem with uncertain data can be represented as a Second-Order Cone Programming (SOCP) problem [1, 3].

On the other hand, there have been only a few works dealing with MPECs under uncertainty [17, 21]. In particular, despite its significance, very little work has been done on robust optimization methods for such problems. When there is uncertainty in the lower-level optimization problem of a bilevel program, the problem can be formulated as a bilevel program having an SOCP as its lower-level problem. This paper discusses Mathematical Programs with Second-Order Cone Complementarity Constraints (MPSOCCCs), which arise when the lower-level SOCP in the bilevel problem is replaced by its KKT conditions. SOCP includes Linear Programming and Quadratically Constrained Quadratic Programming, and Second-Order Cone Complementarity Conditions (SOCCC) include the ordinary complementarity conditions as a special case. Hence the MPSOCCC is a generalization of the MPCC.

In this paper, we propose a smoothing method for solving the MPSOCCC, and discuss its convergence properties. Like the MPCC and the MPLCC, the MPSOCCC is in general a nonconvex and nondifferentiable problem, and it is therefore difficult to deal with directly. Using a smoothing function [11], the proposed method constructs a smooth approximation of the nonsmooth problem. The main result of this paper is that the proposed method is guaranteed to converge globally to a stationary point of the MPSOCCC under some assumptions.

The remainder of this paper is organized as follows. In Section 2, we formulate the MPSOCCC and review the Jordan algebra associated with second-order cones. In Section 3, we consider a smoothing function and propose a smoothing method for the MPSOCCC. Optimality conditions and convergence properties are discussed in Section 4. Finally, Section 5 gives some concluding remarks.

2 Preliminaries

2.1 Bilevel program and MPSOCCC

We consider the following bilevel programming problem:

    P:  min_{x,y} f(x, y)
        s.t. x ∈ X,
             y ∈ Y(x),

where f : ℜ^{n_1} × ℜ^{n_2} → ℜ is a continuously differentiable function, X ⊆ ℜ^{n_1}, and Y(x) ⊆ ℜ^{n_2} represents the solution set of the following Second-Order Cone Program (SOCP) with parameter x:

    SOCP(x):  min_y g(x, y)
              s.t. h(x, y) := A(x)y + b(x) = 0,
                   y ∈ K,

where g : ℜ^{n_1} × ℜ^{n_2} → ℜ is continuously differentiable, A(x) = [A_1(x) ⋯ A_s(x)] ∈ ℜ^{m × n_2} with A_i(x) ∈ ℜ^{m × l_i}, b(x) ∈ ℜ^m, and the set K is the Cartesian product of second-order cones K^{l_i} = { y = (y_1, y_2) ∈ ℜ × ℜ^{l_i − 1} | y_1 ≥ ‖y_2‖ } (i = 1, ⋯, s), i.e., K = K^{l_1} × ⋯ × K^{l_s} with n_2 = l_1 + ⋯ + l_s.
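Computationally, the constraint y ∈ K is just a collection of per-block norm inequalities. The following minimal numpy sketch (an illustration, not part of the thesis; the block sizes l_i are hypothetical) tests membership in a single second-order cone and in the Cartesian product K:

    # Membership test for K^l = {(y_1, y_2) : y_1 >= ||y_2||} and for the
    # Cartesian product K = K^{l_1} x ... x K^{l_s} used in SOCP(x).
    import numpy as np

    def in_soc(y, tol=1e-12):
        # y = (y_1, y_2) with y_1 scalar and y_2 in R^{l-1}
        return y[0] >= np.linalg.norm(y[1:]) - tol

    def in_K(y, blocks):
        # split y into blocks of sizes l_1, ..., l_s and test each block
        offsets = np.cumsum([0] + blocks)
        return all(in_soc(y[a:b]) for a, b in zip(offsets[:-1], offsets[1:]))

    blocks = [3, 2]                      # hypothetical sizes: l_1 = 3, l_2 = 2
    y = np.array([2.0, 1.0, 1.0,         # (2,(1,1)): 2 >= sqrt(2), interior
                  1.0, 1.0])             # (1,(1)):   1 >= 1, boundary
    print(in_K(y, blocks))               # True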

First we consider the reformulation of P. For this purpose, we make the following assumptions on problems P and SOCP(x):

A1. X ⊆ ℜ^{n_1} is nonempty and compact.

A2. For any x ∈ X, the feasible set of SOCP(x) is nonempty.

A3. For any x ∈ X, g(x, ·) is strongly convex.

Since each second-order cone K^{l_i} is a convex set, SOCP(x) is a convex programming problem under assumption A3. Moreover, from assumptions A2 and A3, SOCP(x) has a unique solution for every x ∈ X. The upper-level problem P, however, is generally not a convex programming problem, since the feasible set of P is in general not convex, even when SOCP(x) has a unique solution for all x ∈ X.


Let I(y) denote the index set I(y) := { i | y^i ∈ bd K^{l_i} } for any y ∈ ℜ^{n_2}, where y^i ∈ ℜ^{l_i} is the i-th block component of y and bd K^{l_i} denotes the boundary of K^{l_i}. Furthermore, for the matrix A(x) = [A_1(x) ⋯ A_s(x)], let [A_i(x)]_{i∈I(y)} ∈ ℜ^{m × Σ_{i∈I(y)} l_i} denote the submatrix consisting of the blocks A_i(x), i ∈ I(y). Similarly, we denote the submatrix with blocks A_i(x), i ∉ I(y), by [A_i(x)]_{i∉I(y)} ∈ ℜ^{m × Σ_{i∉I(y)} l_i}. With these definitions, we introduce the following assumption.

A4. For any x ∈ X and any optimal solution y of SOCP(x), there exists an index i such that i ∉ I(y). Furthermore, [A_i(x)]_{i∉I(y)} has full row rank.

The KKT conditions for SOCP(x) are given by

    ∇_y g(x, y) − η − A(x)^⊤ ζ = 0,
    A(x)y + b(x) = 0,
    y ∈ K,  η ∈ K,  y^⊤ η = 0,    (1)

where η ∈ ℜ^{n_2} and ζ ∈ ℜ^m are Lagrange multiplier vectors. The last conditions in (1) are called Second-Order Cone Complementarity Constraints (SOCCC). Since SOCP(x) is a convex program under our assumptions, the KKT conditions (1) are necessary and sufficient for optimality. Thus, by replacing SOCP(x) with its KKT conditions, we can reformulate problem P as the following nonlinear programming problem involving the SOCCC:

    MPSOCCC:  min f(x, y)
              s.t. x ∈ X,
                   ∇_y g(x, y) − η − A(x)^⊤ ζ = 0,
                   A(x)y + b(x) = 0,
                   y ∈ K,  η ∈ K,  y^⊤ η = 0.

2.2 Jordan algebra associated with the SOC

In this subsection, we introduce the Jordan algebra, which provides a useful tool for dealing with SOCs.

For two n-dimensional vectors x = (x_1, x_2) ∈ ℜ × ℜ^{n−1} and y = (y_1, y_2) ∈ ℜ × ℜ^{n−1}, we define the Jordan product by

    x ∘ y := (x^⊤ y, x_1 y_2 + y_1 x_2)^⊤.

Note that the Jordan product generates an n-dimensional vector from two n-dimensional vectors. For convenience, we denote x ∘ x by x², and define the vector e by e := (1, 0, ⋯, 0)^⊤. The Jordan product then has the following properties [10]:

Property 1. For any x, y, z ∈ ℜ^n, the following properties hold.

1. e ∘ x = x,
2. x ∘ y = y ∘ x,
3. (x + y) ∘ z = x ∘ z + y ∘ z,
4. x ∘ (x² ∘ y) = x² ∘ (x ∘ y).

Property 1 shows that the vector e plays the role of the unit element. Properties 2 and 3 represent commutativity and distributivity, respectively. Notice that the Jordan product is not associative, that is, x ∘ (y ∘ z) ≠ (x ∘ y) ∘ z in general.

The next proposition shows that the second-order cone complementarity condition can be represented in another form by use of the Jordan product [11]:

Proposition 1. For any two vectors x and y in ℜ^n, the relation

    x ∈ K^n, y ∈ K^n, x^⊤ y = 0  ⟺  x ∈ K^n, y ∈ K^n, x ∘ y = 0

holds.
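The Jordan product and Proposition 1 are easy to check numerically. A small sketch (illustrative only; the test vectors are arbitrary boundary points of K^3):

    # Jordan product x∘y = (x^T y, x_1 y_2 + y_1 x_2) and a numerical check
    # of Proposition 1: for x, y in K^n, x^T y = 0 iff x∘y = 0.
    import numpy as np

    def jordan(x, y):
        return np.concatenate(([x @ y], x[0] * y[1:] + y[0] * x[1:]))

    e = np.array([1.0, 0.0, 0.0])
    x = np.array([1.0, 1.0, 0.0])                    # on bd K^3
    y = np.array([1.0, -1.0, 0.0])                   # on bd K^3, with x^T y = 0

    print(np.allclose(jordan(e, x), x))              # e is the unit: True
    print(np.allclose(jordan(x, y), jordan(y, x)))   # commutativity: True
    print(x @ y, jordan(x, y))                       # 0.0 and the zero vector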

Next we introduce the spectral factorization of vectors in ℜ^n associated with the second-order cone K^n. Let z = (z_1, z_2) ∈ ℜ × ℜ^{n−1}. Then we can decompose z as

    z = λ_1 u^{(1)} + λ_2 u^{(2)},

where λ_1 and λ_2 are called the spectral values of z, defined by

    λ_i = z_1 + (−1)^i ‖z_2‖,  i = 1, 2,    (2)

and u^{(1)} and u^{(2)} are called the spectral vectors of z, defined by

    u^{(i)} = (1/2) (1, (−1)^i z_2 / ‖z_2‖)   if z_2 ≠ 0,
    u^{(i)} = (1/2) (1, (−1)^i w)             if z_2 = 0,      i = 1, 2,    (3)

where w ∈ ℜ^{n−1} is any vector satisfying ‖w‖ = 1. Some of the notable features of the spectral values and vectors are the following [11, 15].

Property 2. For any vector z ∈ ℜ^n, the spectral values λ_1, λ_2 given by (2) and the spectral vectors u^{(1)}, u^{(2)} given by (3) possess the following properties:

1. u^{(1)} ∘ u^{(2)} = 0, ‖u^{(1)}‖ = ‖u^{(2)}‖ = 1/√2,
2. u^{(i)} ∈ bd K^n, u^{(i)} ∘ u^{(i)} = u^{(i)}, i = 1, 2,
3. λ_1 ≤ λ_2, and λ_1 ≥ 0 ⟺ z ∈ K^n.

The projection mapping P_{K^n} onto the second-order cone K^n is defined by

    P_{K^n}(z) := argmin_{z′ ∈ K^n} ‖z′ − z‖.

By use of the spectral factorization, we can express this mapping explicitly as follows [11]:

    P_{K^n}(z) = max{0, λ_1} u^{(1)} + max{0, λ_2} u^{(2)}.    (4)
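Formulas (2)-(4) translate directly into code. A minimal sketch (illustrative; the input z is arbitrary):

    # Spectral factorization (2)-(3) and projection (4) onto K^n.
    import numpy as np

    def spectral(z):
        z1, z2 = z[0], z[1:]
        n = np.linalg.norm(z2)
        # any unit vector w works when z2 = 0; the first coordinate is used here
        w = z2 / n if n > 0 else np.eye(len(z2))[0]
        u1 = 0.5 * np.concatenate(([1.0], -w))
        u2 = 0.5 * np.concatenate(([1.0], w))
        return (z1 - n, z1 + n), (u1, u2)

    def proj_soc(z):
        (l1, l2), (u1, u2) = spectral(z)
        return max(0.0, l1) * u1 + max(0.0, l2) * u2

    z = np.array([-1.0, 3.0, 4.0])              # spectral values (-6, 4)
    p = proj_soc(z)
    print(p, p[0] - np.linalg.norm(p[1:]))      # the projection lies on bd K^3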


2.3 Reformulation of the SOCCC as an equation

The SOCCC on K = K^{l_1} × ⋯ × K^{l_s},

    y ∈ K,  η ∈ K,  y^⊤ η = 0,    (5)

can be written componentwise as follows [11, 15]:

    y^i ∈ K^{l_i},  η^i ∈ K^{l_i},  (y^i)^⊤ η^i = 0  (i = 1, ⋯, s),    (6)

where y = (y^1, ⋯, y^s) ∈ ℜ^{l_1} × ⋯ × ℜ^{l_s} and η = (η^1, ⋯, η^s) ∈ ℜ^{l_1} × ⋯ × ℜ^{l_s}. For the decomposed SOCCC (6), we introduce functions φ^i : ℜ^{l_i} × ℜ^{l_i} → ℜ^{l_i} (i = 1, ⋯, s) satisfying the relation

    φ^i(y^i, η^i) = 0  ⟺  y^i ∈ K^{l_i}, η^i ∈ K^{l_i}, (y^i)^⊤ η^i = 0.    (7)

By using such functions, we can then define an equation equivalent to the SOCCC:

    Φ(y, η) = 0  ⟺  y ∈ K, η ∈ K, y^⊤ η = 0,

where Φ : ℜ^{n_2} × ℜ^{n_2} → ℜ^{n_2} is the vector-valued function having the φ^i (i = 1, ⋯, s) as block components, i.e.,

    Φ(y, η) := (φ^1(y^1, η^1), ⋯, φ^s(y^s, η^s))^⊤.

In [11], it is shown that the functions φ^i_{NR} : ℜ^{l_i} × ℜ^{l_i} → ℜ^{l_i} (i = 1, ⋯, s), defined by means of the projection mapping P_{K^{l_i}} given by (4) as

    φ^i_{NR}(y^i, η^i) := y^i − P_{K^{l_i}}(y^i − η^i),

satisfy relation (7). This function is called the natural residual function. Note that P_{K^{l_i}} is not differentiable.
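A direct implementation of φ_NR on one block is a few lines (a sketch reusing the projection above; the test pair is a complementary pair on bd K^3):

    # Natural residual phi_NR(y, eta) = y - P_K(y - eta) on a single block;
    # by (7) it vanishes exactly at the solutions of the SOCCC.
    import numpy as np

    def proj_soc(z):
        z1, z2 = z[0], z[1:]
        n = np.linalg.norm(z2)
        w = z2 / n if n > 0 else np.zeros_like(z2)   # lambda_1 = lambda_2 if z2 = 0
        return 0.5 * (max(0.0, z1 - n) * np.concatenate(([1.0], -w))
                      + max(0.0, z1 + n) * np.concatenate(([1.0], w)))

    def phi_nr(y, eta):
        return y - proj_soc(y - eta)

    y   = np.array([1.0,  1.0, 0.0])     # y on bd K^3
    eta = np.array([1.0, -1.0, 0.0])     # eta on bd K^3, y^T eta = 0
    print(phi_nr(y, eta))                # ~0: (y, eta) satisfies the SOCCC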

Using the functions φ^i_{NR}, we define the function Φ_{NR} : ℜ^{n_2} × ℜ^{n_2} → ℜ^{n_2} by

    Φ_{NR}(y, η) := (φ^1_{NR}(y^1, η^1), ⋯, φ^s_{NR}(y^s, η^s))^⊤,

and H_0 : ℜ^{n_1} × ℜ^{n_2} × ℜ^{n_2} × ℜ^m → ℜ^{n_2} × ℜ^m × ℜ^{n_2} by

    H_0(x, y, η, ζ) := ( ∇_y g(x, y) − η − A(x)^⊤ ζ,
                         A(x)y + b(x),
                         Φ_{NR}(y, η) ).

Then we can rewrite the KKT conditions (1) of SOCP(x) as

    H_0(x, w) := H_0(x, y, η, ζ) = 0,

where w := (y, η, ζ).
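As a concrete illustration of this reformulation, the residual H_0 can be assembled for a small hypothetical instance (not from the thesis): g(x, y) = ‖y − x‖²/2, so ∇_y g = y − x, a single block K^3, and constant A(x) = A, b(x) = b:

    # Residual H_0(x, y, eta, zeta) for a toy instance; proj_soc as above.
    import numpy as np

    A = np.array([[1.0, 1.0, 0.0]])            # A in R^{1x3}
    b = np.array([-2.0])

    def proj_soc(z):
        z1, z2 = z[0], z[1:]
        n = np.linalg.norm(z2)
        w = z2 / n if n > 0 else np.zeros_like(z2)
        return 0.5 * (max(0.0, z1 - n) * np.concatenate(([1.0], -w))
                      + max(0.0, z1 + n) * np.concatenate(([1.0], w)))

    def H0(x, y, eta, zeta):
        stat = (y - x) - eta - A.T @ zeta      # grad_y g - eta - A(x)^T zeta
        feas = A @ y + b                       # A(x) y + b(x) = 0
        comp = y - proj_soc(y - eta)           # Phi_NR(y, eta)
        return np.concatenate([stat, feas, comp])

    # For x = (2,0,0), the point y = x, eta = 0, zeta = 0 solves SOCP(x):
    x = np.array([2.0, 0.0, 0.0])
    print(H0(x, x, np.zeros(3), np.zeros(1)))  # all zeros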


Using the above relation, instead of the MPSOCCC we can consider the following equivalent nonlinear optimization problem:

    P_0:  min f(x, y)
          s.t. x ∈ X,
               H_0(x, y, η, ζ) = 0.

For any x ∈ X and any optimal solution y of SOCP(x), let (η, ζ) ∈ ℜ^{n_2} × ℜ^m denote a solution of the equation H_0(x, y, η, ζ) = 0. Then, for any i ∉ I(y), we have from (7) that η^i = 0. Hence the first condition of the KKT conditions (1) can be rewritten as

    η^i + A_i(x)^⊤ ζ = ∇_{y^i} g(x, y)            (i ∈ I(y)),
    η^i = 0,  A_i(x)^⊤ ζ = ∇_{y^i} g(x, y)        (i ∉ I(y)).    (8)

Since the matrix [A_i(x)]_{i∉I(y)} has full row rank by assumption A4, the matrix

    [ I_{Σ_{i∈I(y)} l_i}   [A_i(x)]^⊤_{i∈I(y)} ]
    [ 0                    [A_i(x)]^⊤_{i∉I(y)} ]

has full column rank. Therefore the solution of (8) exists and is unique.

The next theorem shows the equivalence of P and P_0.

Theorem 1. Assume that A1–A4 hold. Then the next two statements are equivalent.

1. (x^*, y^*) is a global (local) optimal solution of problem P.

2. There exists a vector (η^*, ζ^*) such that (x^*, y^*, η^*, ζ^*) is a global (local) optimal solution of problem P_0.

Proof. First, assume that (x^*, y^*) is a local minimizer of problem P; we show statement 2. From assumption A4, there exists a unique point (η^*, ζ^*) such that (x^*, y^*, η^*, ζ^*) is a feasible point of P_0. Suppose that (x^*, y^*, η^*, ζ^*) is not a local minimizer of P_0. Then there exists a sequence {(x^k, y^k, η^k, ζ^k)} of feasible points of P_0 converging to (x^*, y^*, η^*, ζ^*) and satisfying f(x^k, y^k) < f(x^*, y^*) for all k. For such points, however, (x^k, y^k) is feasible for P for all k, which contradicts the fact that (x^*, y^*) is a local optimal solution of problem P.

Next, assume that (x^*, y^*) is a global optimal solution of problem P. By what we have just shown, there is a unique vector (η^*, ζ^*) such that (x^*, y^*, η^*, ζ^*) is a local minimizer of P_0. If (x^*, y^*, η^*, ζ^*) is not a global minimizer of P_0, then there is another point (x̄, ȳ, η̄, ζ̄) satisfying f(x̄, ȳ) < f(x^*, y^*). The point (x̄, ȳ) is, however, a feasible point of problem P, which contradicts the assumption that (x^*, y^*) is a global optimal solution of problem P.

Conversely, assume now that (x^*, y^*, η^*, ζ^*) is a local minimizer of P_0. Since y^* ∈ Y(x^*), the point (x^*, y^*) is a feasible solution of P. Suppose that (x^*, y^*) is not a local minimizer of P. Then there exists a sequence {(x^k, y^k)} of feasible points of problem P which converges to (x^*, y^*) and satisfies f(x^k, y^k) < f(x^*, y^*) for all k. From assumption A4, there exists a unique vector (η^k, ζ^k) such that (x^k, y^k, η^k, ζ^k) is a feasible point of problem P_0 for all k. If (η^k, ζ^k) converges to (η^*, ζ^*), we have a contradiction with the local optimality of (x^*, y^*, η^*, ζ^*). We therefore assume that (η^k, ζ^k) does not converge to (η^*, ζ^*), and derive a contradiction.

To begin with, suppose that the sequence {(η^k, ζ^k)} is unbounded. Since (x^k, y^k, η^k, ζ^k) is feasible for problem P_0 for all k, it holds that

    ∇_y g(x^k, y^k) − η^k − A(x^k)^⊤ ζ^k = 0.

Dividing both sides of this equation by ‖(η^k, ζ^k)‖ and passing to the limit k → ∞, we have

    (I  A(x^*)^⊤) (η̃; ζ̃) = 0,    (9)

where we assume, without loss of generality, that lim_{k→∞} (η^k, ζ^k) / ‖(η^k, ζ^k)‖ = (η̃, ζ̃) ≠ 0. From the SOCCC and the continuity of the functions involved, we obtain η̃^i = 0 for every index i ∉ I(y^*). Therefore, by the same reasoning as for the uniqueness of the vector (η, ζ) satisfying (8), the vector (η̃, ζ̃) satisfying (9) is uniquely determined as (η̃, ζ̃) = 0. But this contradicts (η̃, ζ̃) ≠ 0. Hence the sequence {(η^k, ζ^k)} is bounded. Then, without loss of generality, we can assume that there exists a point (η̄, ζ̄) such that (η^k, ζ^k) → (η̄, ζ̄). By the continuity of the functions, the point (x^*, y^*, η̄, ζ̄) is feasible for problem P_0 with (x^*, y^*, η̄, ζ̄) ≠ (x^*, y^*, η^*, ζ^*). From assumption A4, however, there is a unique vector (η, ζ) such that (x^*, y^*, η, ζ) is feasible for P_0. This is a contradiction.

Finally, suppose that (x^*, y^*, η^*, ζ^*) is a global optimal solution of problem P_0. If (x^*, y^*) is not a global optimal solution of problem P, then there exists a feasible solution (x̄, ȳ) of P such that f(x̄, ȳ) < f(x^*, y^*). From assumption A4, there exists a unique vector (η̄, ζ̄) such that (x̄, ȳ, η̄, ζ̄) is a feasible solution of problem P_0, which contradicts the global optimality of (x^*, y^*, η^*, ζ^*). Hence the point (x^*, y^*) is a global optimal solution of problem P.

3 Smoothing method for problem P_0

In the previous section we showed that problem P_0 is equivalent to problem P. Problem P_0 is no longer a bilevel program or an MPEC, but an ordinary nonlinear programming problem. On the other hand, owing to the nondifferentiability of the function H_0, we cannot apply algorithms that use derivatives of the constraint functions, such as Newton-like methods. In this section, we therefore discuss a smooth approximation of problem P_0, obtained by smoothing the function H_0.

For a nondifferentiable function ψ : ℜ^n → ℜ^m, we consider a function ψ_μ : ℜ^n → ℜ^m with a parameter μ > 0 having the following properties:

1. For any μ > 0, ψ_μ is differentiable.

2. For any x ∈ ℜ^n, lim_{μ→+0} ψ_μ(x) = ψ(x).

Any function having these properties is called a smoothing function of ψ.
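As a one-dimensional illustration (not from the thesis), ψ_μ(t) := (√(t² + 4μ²) + t)/2 is a smoothing function of the plus function ψ(t) = max{0, t}; it equals μ ĝ(t/μ) for the first example function ĝ introduced below:

    # Scalar smoothing of psi(t) = max{0, t}: the gap decays as mu -> +0.
    import numpy as np

    def psi_mu(t, mu):
        return 0.5 * (np.sqrt(t * t + 4.0 * mu * mu) + t)

    t = np.linspace(-2.0, 2.0, 401)
    for mu in (1.0, 0.1, 0.001):
        print(mu, np.max(np.abs(psi_mu(t, mu) - np.maximum(0.0, t))))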

We now discuss smoothing functions of the functions φ^i_{NR} defined in the previous section. For this purpose, we first consider a continuously differentiable convex function ĝ : ℜ → ℜ having the properties

    lim_{α→−∞} ĝ(α) = 0,  lim_{α→∞} (ĝ(α) − α) = 0,  0 < ĝ′(α) < 1.    (10)

For example, the functions

    ĝ(α) = (√(α² + 4) + α)/2  and  ĝ(α) = log(exp(α) + 1)

have the above properties [7]. With the use of such a ĝ and a parameter μ > 0, we can define P^μ_{K^n} : ℜ^n → ℜ^n by

    P^μ_{K^n}(z) := μ ĝ(λ_1/μ) u^{(1)} + μ ĝ(λ_2/μ) u^{(2)},

where λ_1 and λ_2 are the spectral values of z defined by (2), and u^{(1)} and u^{(2)} are the spectral vectors of z defined by (3). Then P^μ_{K^n} is a smoothing function of the projection mapping P_{K^n} onto the SOC K^n [11]. Hence, using this function, we can define a smoothing function φ^i_μ : ℜ^{l_i} × ℜ^{l_i} → ℜ^{l_i} of φ^i_{NR} by

    φ^i_μ(y^i, η^i) := y^i − P^μ_{K^{l_i}}(y^i − η^i).

In addition, let Φ_μ : ℜ^{n_2} × ℜ^{n_2} → ℜ^{n_2} and H_μ : ℜ^{n_1} × ℜ^{n_2} × ℜ^{n_2} × ℜ^m → ℜ^{n_2} × ℜ^m × ℜ^{n_2} be defined by

    Φ_μ(y, η) := (φ^1_μ(y^1, η^1), ⋯, φ^s_μ(y^s, η^s))^⊤

and

    H_μ(x, w) := H_μ(x, y, η, ζ) := ( ∇_y g(x, y) − η − A(x)^⊤ ζ,
                                      A(x)y + b(x),
                                      Φ_μ(y, η) ).

Then, for any μ > 0, the function H_μ is differentiable, and it has the properties of a smoothing function of H_0.

According to the above considerations, we can formulate a nonlinear programming problem with parameter μ > 0 as follows:

    P_μ:  min f(x, y)
          s.t. x ∈ X,
               H_μ(x, w) = 0.

This problem can be seen as an approximation of problem P_0 with parameter μ > 0. In particular, P_μ is a smooth optimization problem for any μ > 0.
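The smoothed projection is a few lines of numpy. The sketch below (illustrative; it takes ĝ(α) = (√(α² + 4) + α)/2, the first example above) checks that P^μ approaches the exact projection (4) as μ → +0:

    # Smoothed projection P^mu(z) = mu ghat(l1/mu) u1 + mu ghat(l2/mu) u2.
    import numpy as np

    def ghat(a):
        return 0.5 * (np.sqrt(a * a + 4.0) + a)

    def spectral(z):
        z1, z2 = z[0], z[1:]
        n = np.linalg.norm(z2)
        w = z2 / n if n > 0 else np.zeros_like(z2)
        return (z1 - n, z1 + n), (0.5 * np.concatenate(([1.0], -w)),
                                  0.5 * np.concatenate(([1.0], w)))

    def proj_mu(z, mu):
        (l1, l2), (u1, u2) = spectral(z)
        return mu * ghat(l1 / mu) * u1 + mu * ghat(l2 / mu) * u2

    def proj(z):
        (l1, l2), (u1, u2) = spectral(z)
        return max(0.0, l1) * u1 + max(0.0, l2) * u2

    z = np.array([-1.0, 3.0, 4.0])
    for mu in (1.0, 0.1, 0.001):
        print(mu, np.linalg.norm(proj_mu(z, mu) - proj(z)))   # -> 0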

We now make new assumptions on the function H_0 and the smoothing function H_μ:

A5. For any x̄ ∈ X and w̄ ∈ ℜ^{2n_2+m} such that H_0(x̄, w̄) = 0, all the matrices belonging to the generalized Jacobian ∂_w H_0(x̄, w̄) are nonsingular.

A6. For any x̄ ∈ X and μ > 0, the inverse ∇_w H_μ(x̄, w)^{−1} ∈ ℜ^{(2n_2+m) × (2n_2+m)} of the Jacobian of H_μ(x̄, ·) exists for all w ∈ ℜ^{2n_2+m}. Furthermore, there exists a constant Γ > 0 such that ‖∇_w H_μ(x̄, w)^{−1}‖ ≤ Γ for all w ∈ ℜ^{2n_2+m}.

Under assumption A6, Hadamard's theorem [19] ensures that H_μ(x̄, ·) is a homeomorphism of ℜ^{2n_2+m} onto ℜ^{2n_2+m} for any μ > 0 and x̄ ∈ X; that is, the function H_μ(x̄, ·) is one-to-one, and both H_μ(x̄, ·) and H_μ(x̄, ·)^{−1} are continuous. In particular, the equation H_μ(x̄, w) = 0 has a unique solution.

For μ ≥ 0, let F_μ denote the feasible set of P_μ, i.e., F_μ := { (x, w) | x ∈ X, H_μ(x, w) = 0 }. The following theorem implies that a sequence generated by the smoothing method has an accumulation point.

Theorem 2. Assume that A1–A6 hold, and let μ̄ > 0 be any fixed constant. Then F_μ is nonempty and uniformly bounded on [0, μ̄].

Proof. From the definition of F_μ, assumption A1, and the fact that the equation H_μ(x, w) = 0 (μ ≥ 0) has at least one solution for any x ∈ X (by assumption A4 or A6), there is some (x, w) ∈ F_μ for any μ ∈ [0, μ̄]. Since X is compact, F_μ is bounded with respect to the x-component.

Assuming that F_μ is not uniformly bounded with respect to the w-component, we derive a contradiction. Under this assumption, there are sequences {μ_k}, {x^k}, and {w^k} such that (x^k, w^k) ∈ F_{μ_k}, μ_k → μ̃ ∈ [0, μ̄], and ‖w^k‖ → ∞ as k → ∞. We may assume without loss of generality that {x^k} has a limit point x̃ ∈ X. By assumptions A2–A4 and A6, there exists a unique w̃ such that H_μ̃(x̃, w̃) = 0. On the other hand, from assumption A5 or A6, the (generalized) Jacobian of H_μ̃(x̃, ·) at w̃ is nonsingular. Hence, by the implicit function theorem, there exists a continuous function w(μ, x) satisfying H_μ(x, w(μ, x)) = 0 in a neighborhood of the point (μ̃, x̃). Since the solution of the equation H_μ(x, w) = 0 is also determined uniquely in a neighborhood of (μ̃, x̃) by assumption A6, it eventually holds for sufficiently large k that w^k = w(μ_k, x^k). Since the function w(·, ·) is continuous, and since the sequences {μ_k} and {x^k} are both bounded, there exists a constant β > 0 such that lim_{k→∞} ‖w(μ_k, x^k)‖ ≤ β, which contradicts the unboundedness of the sequence {w^k}.

4 Optimality conditions

In this section, we discuss optimality conditions for problems P_0 and P_μ. For μ > 0, P_μ is a smooth nonlinear program, and hence its optimality conditions can be written as KKT conditions. Our purpose is to obtain an optimal solution of the MPSOCCC, or of P_0, from KKT points of P_μ. To this end, we first consider the optimality conditions for problem P_0.

In the following, we suppose that the set X is specified as X := { x | g̃(x) ≤ 0, h̃(x) = 0 }, where g̃ : ℜ^{n_1} → ℜ^p and h̃ : ℜ^{n_1} → ℜ^q are both continuously differentiable functions.

Since P_0 is a nonsmooth problem, we cannot consider the ordinary KKT conditions. In [9], by using the C-subdifferential of the nondifferentiable functions, necessary conditions for optimality are represented as the following Fritz-John conditions: if the point (x, y, η, ζ) is optimal for P_0, then there exist vectors δ ∈ ℜ_+, θ ∈ ℜ^{n_2}, ρ ∈ ℜ^m, σ ∈ ℜ^{n_2}, ν ∈ ℜ^p_+, ξ ∈ ℜ^q such that (δ, θ, ρ, σ, ν, ξ) ≠ 0 and

    0 ∈ δ (∇_x f(x, y); ∇_y f(x, y); 0; 0)
        + (∇_x L(x, y, η, ζ); ∇_y L(x, y, η, ζ); ∇_η L(x, y, η, ζ); ∇_ζ L(x, y, η, ζ)) θ
        + (∇_x(A(x)y) + ∇b(x); A(x)^⊤; 0; 0) ρ
        + (0; D_y(y, η); D_η(y, η); 0) σ
        + (∇g̃(x); 0; 0; 0) ν + (∇h̃(x); 0; 0; 0) ξ,

    L(x, y, η, ζ) = 0,  A(x)y + b(x) = 0,
    φ^i_{NR}(y^i, η^i) = 0  (i = 1, ⋯, s),
    h̃(x) = 0,  g̃(x) ≤ 0,  ν ≥ 0,  ν^⊤ g̃(x) = 0,    (11)

where each quadruple (·; ·; ·; ·) stacks blocks according to the components (x, y, η, ζ),

    D_y(y, η) := blockdiag( ∂_{y^1} φ^1_{NR}(y^1, η^1), ⋯, ∂_{y^s} φ^s_{NR}(y^s, η^s) ),
    D_η(y, η) := blockdiag( ∂_{η^1} φ^1_{NR}(y^1, η^1), ⋯, ∂_{η^s} φ^s_{NR}(y^s, η^s) ),

and L : ℜ^{n_1} × ℜ^{n_2} × ℜ^{n_2} × ℜ^m → ℜ^{n_2} is defined by

    L(x, y, η, ζ) := ∇_y g(x, y) − η − A(x)^⊤ ζ.

According to [16], by letting z^i := y^i − η^i, the C-subdifferential of the function φ^i_{NR}(y^i, η^i) is calculated as

    ∂_{y^i} φ^i_{NR}(y^i, η^i) = I_{l_i} − ∂P_{K^{l_i}}(z^i),
    ∂_{η^i} φ^i_{NR}(y^i, η^i) = ∂P_{K^{l_i}}(z^i),    (12)

where, in the case z^i_1 ≠ ±‖z^i_2‖, the mapping P_{K^{l_i}} is differentiable and its derivative is given by

    ∇P_{K^{l_i}}(z^i) = 0                                     if z^i_1 < −‖z^i_2‖,
    ∇P_{K^{l_i}}(z^i) = I_{l_i}                               if z^i_1 > ‖z^i_2‖,
    ∇P_{K^{l_i}}(z^i) = (1/2) [ 1     (v^i)^⊤ ]
                              [ v^i   M       ]               if −‖z^i_2‖ < z^i_1 < ‖z^i_2‖,    (13)

with

    v^i := z^i_2 / ‖z^i_2‖,  M := (z^i_1/‖z^i_2‖ + 1) I_{l_i−1} − (z^i_1/‖z^i_2‖) v^i (v^i)^⊤,

while, in the case z^i_1 = ±‖z^i_2‖, P_{K^{l_i}} is not differentiable and its C-subdifferential is given as follows: if z^i_2 ≠ 0 and z^i_1 = ‖z^i_2‖, then

    ∂P_{K^{l_i}}(z^i) = co{ I_{l_i}, (1/2) [ 1 (v^i)^⊤ ; v^i M ] },
    with v^i := z^i_2/‖z^i_2‖, M := 2 I_{l_i−1} − v^i (v^i)^⊤;    (14)

if z^i_2 ≠ 0 and z^i_1 = −‖z^i_2‖, then

    ∂P_{K^{l_i}}(z^i) = co{ 0, (1/2) [ 1 (v^i)^⊤ ; v^i M ] },
    with v^i := z^i_2/‖z^i_2‖, M := v^i (v^i)^⊤;    (15)

finally, if z^i = 0, then

    ∂P_{K^{l_i}}(z^i) = co( {0} ∪ {I_{l_i}} ∪ { (1/2) [ 1 (v^i)^⊤ ; v^i M ] |
        M = (v^i_0 + 1) I_{l_i−1} − v^i_0 v^i (v^i)^⊤ for some |v^i_0| ≤ 1 and ‖v^i‖ = 1 } ),    (16)

where co denotes the convex hull.
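On the regions where P_{K^{l_i}} is differentiable, formula (13) can be checked against finite differences. A sketch (illustrative; the test point lies strictly between the two boundary cases):

    # Jacobian (13) of the projection, verified by central differences.
    import numpy as np

    def proj(z):
        z1, z2 = z[0], z[1:]
        n = np.linalg.norm(z2)
        w = z2 / n if n > 0 else np.zeros_like(z2)
        return 0.5 * (max(0.0, z1 - n) * np.concatenate(([1.0], -w))
                      + max(0.0, z1 + n) * np.concatenate(([1.0], w)))

    def dproj(z):                          # formula (13), valid for z1 != +-||z2||
        z1, z2 = z[0], z[1:]
        n = np.linalg.norm(z2)
        if z1 < -n:
            return np.zeros((len(z), len(z)))
        if z1 > n:
            return np.eye(len(z))
        v = z2 / n
        M = (z1 / n + 1.0) * np.eye(len(z2)) - (z1 / n) * np.outer(v, v)
        return 0.5 * np.block([[np.ones((1, 1)), v[None, :]],
                               [v[:, None], M]])

    z = np.array([0.5, 3.0, 4.0])          # -||z2|| < z1 < ||z2||
    num = np.column_stack([(proj(z + 1e-6 * e) - proj(z - 1e-6 * e)) / 2e-6
                           for e in np.eye(3)])
    print(np.max(np.abs(num - dproj(z))))  # ~0: matches (13)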

If δ ≠ 0 holds in the Fritz-John conditions (11), then we can assume without loss of generality that δ = 1, and conditions (11) become the KKT conditions for problem P_0:

    0 ∈ (∇_x f(x, y); ∇_y f(x, y); 0; 0)
        + (∇_x L(x, y, η, ζ); ∇_y L(x, y, η, ζ); ∇_η L(x, y, η, ζ); ∇_ζ L(x, y, η, ζ)) θ
        + (∇_x(A(x)y) + ∇b(x); A(x)^⊤; 0; 0) ρ
        + (0; D_y(y, η); D_η(y, η); 0) σ
        + (∇g̃(x); 0; 0; 0) ν + (∇h̃(x); 0; 0; 0) ξ,

    L(x, y, η, ζ) = 0,  A(x)y + b(x) = 0,
    φ^i_{NR}(y^i, η^i) = 0  (i = 1, ⋯, s),
    h̃(x) = 0,  g̃(x) ≤ 0,  ν ≥ 0,  ν^⊤ g̃(x) = 0.    (17)

Now let Ĩ(x) denote the index set of the active constraints in g̃_i(x) ≤ 0, i.e., Ĩ(x) := { i | g̃_i(x) = 0 }. We introduce the following linear independence assumption.

A7. For any point (x, y, η, ζ) satisfying the Fritz-John conditions (11), the vectors ∇g̃_i(x) (i ∈ Ĩ(x)) and ∇h̃_j(x) (j = 1, ⋯, q) are linearly independent.

Under this assumption, it can be shown that every Fritz-John point satisfies the KKT conditions (17).

Theorem 3. Assume that A1, A2, A5 and A7 hold. If (x, y, η, ζ) satisfies the Fritz-John conditions (11) with multipliers (δ, θ, ρ, σ, ν, ξ), then δ is not equal to zero.

Proof. Suppose that (x, y, η, ζ) satisfies the Fritz-John conditions (11) with δ = 0. Then there exist (θ, ρ, σ, ν, ξ) ≠ 0 satisfying the following system:

    0 ∈ (∇_x L(x, y, η, ζ); ∇_y L(x, y, η, ζ); ∇_η L(x, y, η, ζ); ∇_ζ L(x, y, η, ζ)) θ
        + (∇_x(A(x)y) + ∇b(x); A(x)^⊤; 0; 0) ρ
        + (0; D_y(y, η); D_η(y, η); 0) σ
        + (∇g̃(x); 0; 0; 0) ν + (∇h̃(x); 0; 0; 0) ξ,    (18)

    L(x, y, η, ζ) = 0,  A(x)y + b(x) = 0,
    φ^i_{NR}(y^i, η^i) = 0  (i = 1, ⋯, s),
    h̃(x) = 0,  g̃(x) ≤ 0,  ν ≥ 0,  ν^⊤ g̃(x) = 0.    (19)

Note that (18) can be rewritten as

    0 = [ ∇_x L(x, y, η, ζ)  ∇_x h(x, y)  ∇_x g̃(x)  ∇_x h̃(x) ] (θ; ρ; ν; ξ),    (20)

    0 ∈ [ ∇_y L(x, y, η, ζ)   A(x)^⊤   D_y(y, η) ;
          ∇_η L(x, y, η, ζ)   0        D_η(y, η) ;
          ∇_ζ L(x, y, η, ζ)   0        0         ] (θ; ρ; σ).    (21)

Since the matrix on the right-hand side of (21) represents the generalized Jacobian ∂_w H_0(x, w), it is nonsingular by assumption A5. Hence (21) implies that (θ, ρ, σ) = 0. Furthermore, it follows from (20), together with the complementarity condition in (19) and assumption A7, that (ν, ξ) = 0, which contradicts (θ, ρ, σ, ν, ξ) ≠ 0.

Next we consider optimality conditions for problem P_μ. Since problem P_μ is a smooth optimization problem, its KKT conditions state that there exist Lagrange multipliers θ ∈ ℜ^{n_2}, ρ ∈ ℜ^m, σ ∈ ℜ^{n_2}, ν ∈ ℜ^p_+ and ξ ∈ ℜ^q satisfying

    0 = (∇_x f(x, y); ∇_y f(x, y); 0; 0)
        + (∇_x L(x, y, η, ζ); ∇_y L(x, y, η, ζ); ∇_η L(x, y, η, ζ); ∇_ζ L(x, y, η, ζ)) θ
        + (∇_x(A(x)y) + ∇b(x); A(x)^⊤; 0; 0) ρ
        + (0; D^μ_y(y, η); D^μ_η(y, η); 0) σ
        + (∇g̃(x); 0; 0; 0) ν + (∇h̃(x); 0; 0; 0) ξ,

    L(x, y, η, ζ) = 0,  A(x)y + b(x) = 0,
    φ^i_μ(y^i, η^i) = 0  (i = 1, ⋯, s),
    h̃(x) = 0,  g̃(x) ≤ 0,  ν ≥ 0,  ν^⊤ g̃(x) = 0,    (22)

where

    D^μ_y(y, η) := blockdiag( ∇_{y^1} φ^1_μ(y^1, η^1), ⋯, ∇_{y^s} φ^s_μ(y^s, η^s) ),
    D^μ_η(y, η) := blockdiag( ∇_{η^1} φ^1_μ(y^1, η^1), ⋯, ∇_{η^s} φ^s_μ(y^s, η^s) ).

Letting z^i := y^i − η^i, the gradients ∇_{y^i} φ^i_μ and ∇_{η^i} φ^i_μ are calculated as follows [11]:

    ∇_{y^i} φ^i_μ(y^i, η^i) = I_{l_i} − ∇P^μ_{K^{l_i}}(z^i),
    ∇_{η^i} φ^i_μ(y^i, η^i) = ∇P^μ_{K^{l_i}}(z^i),    (23)

    ∇P^μ_{K^{l_i}}(z^i) = [ b^i       c^i (v^i)^⊤                             ]   if z^i_2 ≠ 0,
                          [ c^i v^i   a^i I_{l_i−1} + (b^i − a^i) v^i (v^i)^⊤ ]
    ∇P^μ_{K^{l_i}}(z^i) = ĝ′(z^i_1/μ) I_{l_i}                                     if z^i_2 = 0,    (24)

where

    v^i := z^i_2 / ‖z^i_2‖,
    a^i := μ ( ĝ(λ^i_2/μ) − ĝ(λ^i_1/μ) ) / (λ^i_2 − λ^i_1),
    b^i := (1/2) ( ĝ′(λ^i_2/μ) + ĝ′(λ^i_1/μ) ),
    c^i := (1/2) ( ĝ′(λ^i_2/μ) − ĝ′(λ^i_1/μ) ),    (25)

    λ^i_j := z^i_1 + (−1)^j ‖z^i_2‖,  j = 1, 2.    (26)

The differences between the two optimality conditions (17) and (22) thus lie in the functions φ^i_{NR} in (17) versus φ^i_μ in (22), and in the use of the C-subdifferential versus the ordinary derivative of these functions.
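The gradient formulas (24)-(26) can likewise be verified numerically. A sketch (illustrative; ĝ(α) = (√(α² + 4) + α)/2 as before):

    # Check of (24)-(26) for the smoothed projection against central differences.
    import numpy as np

    def ghat(a):  return 0.5 * (np.sqrt(a * a + 4.0) + a)
    def dghat(a): return 0.5 * (a / np.sqrt(a * a + 4.0) + 1.0)

    def proj_mu(z, mu):
        z1, z2 = z[0], z[1:]
        n = np.linalg.norm(z2)
        w = z2 / n if n > 0 else np.zeros_like(z2)
        return 0.5 * (mu * ghat((z1 - n) / mu) * np.concatenate(([1.0], -w))
                      + mu * ghat((z1 + n) / mu) * np.concatenate(([1.0], w)))

    def dproj_mu(z, mu):                   # formulas (24)-(26)
        z1, z2 = z[0], z[1:]
        n = np.linalg.norm(z2)
        if n == 0.0:
            return dghat(z1 / mu) * np.eye(len(z))
        v = z2 / n
        l1, l2 = z1 - n, z1 + n            # spectral values (26)
        a = mu * (ghat(l2 / mu) - ghat(l1 / mu)) / (l2 - l1)
        b = 0.5 * (dghat(l2 / mu) + dghat(l1 / mu))
        c = 0.5 * (dghat(l2 / mu) - dghat(l1 / mu))
        M = a * np.eye(len(z2)) + (b - a) * np.outer(v, v)
        return np.block([[np.array([[b]]), c * v[None, :]],
                         [c * v[:, None], M]])

    z, mu = np.array([0.5, 3.0, 4.0]), 0.1
    num = np.column_stack([(proj_mu(z + 1e-6 * e, mu) - proj_mu(z - 1e-6 * e, mu))
                           / 2e-6 for e in np.eye(3)])
    print(np.max(np.abs(num - dproj_mu(z, mu))))   # ~0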

A limit of KKT points for P_μ may be expected to satisfy the optimality conditions (17) for problem P_0 under suitable assumptions, as shown in the next theorem. In fact, to obtain a solution of the MPSOCCC, we may employ a smoothing method which sequentially computes KKT points for P_μ with μ > 0 tending to zero.

Theorem 4. Assume that A1, A2 and A7 hold. For any sequence {μ_k} such that μ_k > 0 and μ_k → 0, let (x^k, y^k, η^k, ζ^k) satisfy the KKT conditions for P_{μ_k} with Lagrange multipliers (θ^k, ρ^k, σ^k, ν^k). Assume that the sequence {(x^k, y^k, η^k, ζ^k, θ^k, ρ^k, σ^k, ν^k)} has an accumulation point. Then every accumulation point (x^*, y^*, η^*, ζ^*, θ^*, ρ^*, σ^*, ν^*) satisfies the optimality conditions (17) for P_0.

Proof. Without loss of generality, we assume that the sequence {(x^k, y^k, η^k, ζ^k, θ^k, ρ^k, σ^k, ν^k)} converges to (x^*, y^*, η^*, ζ^*, θ^*, ρ^*, σ^*, ν^*) as k → ∞. From the assumptions of the theorem, (x^k, y^k, η^k, ζ^k, θ^k, ρ^k, σ^k, ν^k) satisfies conditions (22) for all k. By the continuity of the functions involved, ∇f(x^k, y^k), ∇L(x^k, y^k, η^k, ζ^k), ∇h(x^k, y^k), ∇g̃(x^k), ∇h̃(x^k), L(x^k, y^k, η^k, ζ^k), A(x^k)y^k + b(x^k), φ^i_{μ_k}((y^i)^k, (η^i)^k) (i = 1, ⋯, s), h̃(x^k), g̃(x^k) and (ν^k)^⊤ g̃(x^k) converge to ∇f(x^*, y^*), ∇L(x^*, y^*, η^*, ζ^*), ∇h(x^*, y^*), ∇g̃(x^*), ∇h̃(x^*), L(x^*, y^*, η^*, ζ^*), A(x^*)y^* + b(x^*), φ^i_{NR}((y^i)^*, (η^i)^*) (i = 1, ⋯, s), h̃(x^*), g̃(x^*) and (ν^*)^⊤ g̃(x^*), respectively. Hence it suffices to prove that

    lim_{k→∞} ∇φ^i_{μ_k}((y^i)^k, (η^i)^k) ∈ ∂φ^i_{NR}((y^i)^*, (η^i)^*)    (27)

holds for every i ∈ {1, ⋯, s}. Moreover, from the relation between (12) and (23), it is sufficient for (27) to show that

    lim_{k→∞} ∇P^{μ_k}_{K^{l_i}}((z^i)^k) ∈ ∂P_{K^{l_i}}((z^i)^*),    (28)

where (z^i)^k := (y^i)^k − (η^i)^k.

For any k and i ∈ {1, ⋯, s}, define a^k_i, b^k_i, c^k_i and (λ^i_j)^k (j = 1, 2) by (25) and (26) with μ = μ_k and z^i = (z^i)^k. When lim_{k→∞} (λ^i_j)^k ≠ 0, since μ_k → 0 as k → ∞, we have |(λ^i_j)^k / μ_k| → ∞. Thus, from the properties (10) of the function ĝ, the following relations hold:

    if lim_{k→∞} (λ^i_j)^k < 0, then lim_{k→∞} ĝ((λ^i_j)^k/μ_k) = 0 and lim_{k→∞} ĝ′((λ^i_j)^k/μ_k) = 0;
    if lim_{k→∞} (λ^i_j)^k > 0, then lim_{k→∞} ( ĝ((λ^i_j)^k/μ_k) − (λ^i_j)^k/μ_k ) = 0 and lim_{k→∞} ĝ′((λ^i_j)^k/μ_k) = 1.    (29)

On the other hand, when lim_{k→∞} (λ^i_j)^k = 0, the following hold: if lim_{k→∞} (λ^i_j)^k/μ_k = −∞, then

    lim_{k→∞} ĝ((λ^i_j)^k/μ_k) = 0;

if lim_{k→∞} (λ^i_j)^k/μ_k = ∞, then

    lim_{k→∞} ( ĝ((λ^i_j)^k/μ_k) − (λ^i_j)^k/μ_k ) = 0;

and finally, if lim sup_{k→∞} (λ^i_j)^k/μ_k < ∞, then by the continuity of the function ĝ, we have

    lim sup_{k→∞} ĝ((λ^i_j)^k/μ_k) < ∞.

As a result, in each case we have

    lim_{k→∞} μ_k ĝ((λ^i_j)^k/μ_k) = 0.    (30)

Using the above relations, we will show that a limit of ∇P^{μ_k}_{K^{l_i}}((z^i)^k) belongs to the set ∂P_{K^{l_i}}((z^i)^*) represented by (13)–(16) in each of the following cases:

1. (z^i_1)^* < −‖(z^i_2)^*‖,
2. (z^i_1)^* > ‖(z^i_2)^*‖,
3. −‖(z^i_2)^*‖ < (z^i_1)^* < ‖(z^i_2)^*‖,
4. (z^i_1)^* = ‖(z^i_2)^*‖ ≠ 0,
5. (z^i_1)^* = −‖(z^i_2)^*‖ ≠ 0,
6. (z^i)^* = 0,

where (z^i)^* := lim_{k→∞} (z^i)^k. Similarly, we define a^*_i := lim_{k→∞} a^k_i, b^*_i := lim_{k→∞} b^k_i, c^*_i := lim_{k→∞} c^k_i and (λ^i_j)^* := lim_{k→∞} (λ^i_j)^k (j = 1, 2).

Case 1: (z^i_1)^* < −‖(z^i_2)^*‖.

By taking an appropriate subsequence if necessary, it is sufficient to consider the following two cases: (i) ‖(z^i_2)^k‖ = 0 holds for all k, (ii) ‖(z^i_2)^k‖ ≠ 0 holds for all k.

First, we consider the case where ‖(z^i_2)^k‖ = 0 holds for all k. Since lim_{k→∞} (z^i_1)^k/μ_k = −∞ holds because (z^i_1)^* < 0, it follows from (24) and (10) that

    lim_{k→∞} ∇P^{μ_k}_{K^{l_i}}((z^i)^k) = lim_{k→∞} ĝ′((z^i_1)^k/μ_k) I_{l_i} = 0.

We next consider the case where ‖(z^i_2)^k‖ ≠ 0 holds for all k. It follows from the assumption (z^i_1)^* < −‖(z^i_2)^*‖, together with (26), that lim_{k→∞} (λ^i_1)^k ≤ lim_{k→∞} (λ^i_2)^k < 0. Therefore, from the definitions (25) of b^k_i and c^k_i and from (29), we obtain

    b^*_i = c^*_i = 0.

Since (λ^i_1)^k < (λ^i_2)^k holds for all k, the mean value theorem yields points α^k_i ∈ ( (λ^i_1)^k/μ_k, (λ^i_2)^k/μ_k ) for all k such that

    ĝ′(α^k_i) = a^k_i = ( ĝ((λ^i_2)^k/μ_k) − ĝ((λ^i_1)^k/μ_k) ) / ( (λ^i_2)^k/μ_k − (λ^i_1)^k/μ_k ).

As shown above, we have lim_{k→∞} (λ^i_1)^k ≤ lim_{k→∞} (λ^i_2)^k < 0, which implies lim_{k→∞} α^k_i = −∞. Moreover, by (10), we have

    a^*_i = lim_{k→∞} a^k_i = lim_{k→∞} ĝ′(α^k_i) = 0.
