An interior point method with a primal-dual

(1)

a primal-dual l

2

barrier penalty function

for nonlinear optimization

Hiroshi Yamashita 3

and Hiroshi Yab e y

March, 1999

Abstract

Inthispaper,weareconcernedwithaprimal-dualinteriorpointmetho dforsolv-

ing nonlinearly constrained optimization problems, in which Newton-like methods

are appliedto theshifted barrier KKT conditions. We propose a new primal-dual

merit function, which is called the primal-dual l

2

barrier penaltyfunction, within

theframeworkoflinesearchmethods,andshowtheglobalconvergencepropertyof

ourmethod. Furthermore,bycarefullycontrollingparameters inthealgorithm,its

sup erlinearconvergencepropertyis shown.

1 Introduction

In this paper,weconsiderthe following constrained optimizationproblem:

minimize f(x); x2R n

;

subject to g(x)=0; x

i

0; i 2I

P

; (1)

whereweassumethatthefunctionsf :R n

!R 1

andg:R n

!R m

aretwicecontinuously

dierentiable,and I

P

isasubset of theindex setf1;2;:::;ng. Letn 0

=jI

P

j>0and E be

a n 0

2n matrix whose rows consist of e t

i

; i2I

P

,where e

i 2R

n

denotes the i-th column

vector of the identity matrix. Then problem (1) is writtenas:

minimize f(x); x 2R n

;

subjectto g(x)=0; Ex 0:

(2)

In the sequel, weuse the notation

x 0

Ex 2R n

0

3

MathematicalSystems,Inc.,2-4-3,Shinjuku,Shinjuku-ku, Tokyo,Japan. [email protected]

y

Department of Applied Mathematics, Faculty of Science, Science University of Tokyo, 1-3, Kagu-

razaka,Shinjuku-ku, Tokyo,Japan. yab [email protected]

(2)

Let the Lagrangian function of the ab ove problem be denedby

L(w)=f(x)0y t

g(x)0z t

Ex=f(x)0y t

g(x)0z t

x 0

; (3)

where w=(x;y;z) t

, and y2R m

and z 2R n

0

are the Lagrange multiplier vectors which

correspond to the equality and inequality constraints respectively. Then Karush-Kuhn-

Tucker (KKT)conditions for optimality of the above problem are givenby

r

0 (w)

0

B

@ r

x L(w)

g(x)

X 0

Ze 1

C

A = 0

B

@ 0

0

0 1

C

A (4)

and

x 0

0; z 0;

(5)

where

r

x

L(w ) = rf(x)0A(x) t

y0E t

z;

A(x) = 0

B

@ rg

1 (x)

t

.

rg

m (x)

t 1

C

A

;

X 0

= diag(x 0

1

;111;x 0

n 0

);

Z = diag(z

1

;111;z

n 0)

;

e = (1;111;1) t

2R n

0

:

To solve the ab ove problem by a primal-dual interior point method, Yamashita [15]

intro duces the barrier penalty functionF(;):S !R 1

whichis denedby

F(x;)=f(x)0 n

0

X

i=1 logx

0

i +

m

X

i=1 jg

i (x)j; (6)

where and are given positive constants, and S = fx2R n

jx 0

>0g . If is su-

ciently large, the necessary condition for the optimality of the barrier penalty function

minimization problem fora given>0 is

r(w;) 0

B

@ r

x L(w)

g(x)

X 0

Ze0e 1

C

A

= 0

B

@ 0

0

0 1

C

A

; (7)

and x 0

> 0; z > 0. The above conditions are called the barrier KKT conditions. The

search direction of the proposed method is based on the Newton step for solving the

equality part of the barrier KKT conditions. Let 1w = (1x;1y;1z) t

be dened by a

solution of

J(w )1w =0r(w;) (8)

(3)

J(w)= 0

B

@

G 0A(x) t

0E t

A(x) 0 0

ZE 0 X

0 1

C

A

; (9)

where we use the relation X 0

Ze = X 0

z = ZEx. The matrix G is r 2

x

L(w) or a quasi-

Newton approximationtothe Hessian matrix.

Let 1F

l

(x;;s)bea rst order approximationtothe quantity F(x+s;)0F(x;),

i.e.,

1F

l

(x;;s)rf(x) t

s0e t

(X 0

) 01

Es+ m

X

i=1

g

i

(x)+rg

i (x)

t

s

0 m

X

i=1 jg

i (x)j: (10)

Then itis possible to provethat

1F

l

(x;;1x)01x t

(G+E t

(X 0

) 01

ZE)1x0 m

X

i=1

(0jy

i +1y

i j)jg

i (x)j:

The above inequality shows that the direction 1x which is derived from (8) can be a

descent direction of the barrier penalty function F(x;) if G is positive denite and

is suciently large. Based on this observation, the line search algorithm and the trust

regionalgorithmforthe primalvariableare proposedbyYamashita[15]andYamashitaet

al.[16, 18] respectively. Forthe variablez,the step size iscontrolledby abox constraint.

The step size for the variabley is usually taken equal to the one for z. Both algorithms

are shown to be quite ecient. Some researchers have dealt with other primal merit

functionswithinthe frameworkof line searchstrategies ortrust regionstrategies (Seefor

example, Breiteld and Shanno[4], Dennis, Heinkenschloss and Vicente[8], Byrd, Gilbert

andNocedal[5],andAkrotirianakisandRustem[1,2]). Superlinearconvergenceproperties

of primal-dual methods based on solving the barrier KKT conditions have b een also

studiedbyseveralauthors,forexample,Martinez,ParadaandTapia[12],El-Bakry,Tapia,

Tsuchiya and Zhang[9], Yamashita and Yab e[17], Yabe and Yamashita[13], Yamashita,

Yabe and Tanab e[18], and Byrd, Liu and No cedal[7].

In this paper, weconsider a moreconventional merit function:

F

0

(x;)=f(x)0 n

0

X

i=1 logx

0

i +

1

2 m

X

i=1 g

i (x)

2

; (11)

whichisextensivelydescribedinabookbyFiaccoandMcCormick[10]. Wealso call this

functionthe barrierp enaltyfunction. Todiscriminatethis functionfrom(6),wemaycall

this the l

2

barrier penalty function. Whereas the function dened in (6) may be called

the l

1

barrier penalty function.

The necessary condition for the optimality of the problem

minimize F

0

(x;); x2S

is

rF

0

(x;)=rf(x)0E t

(X 0

) 01

e+ 1

m

X

i=1 g

i (x)rg

i

(x)=0 (12)

(4)

and x > 0. As in [15], we intro duce the variables y and z by y = 0g(x)= and z =

(X 0

) 01

e. Then the ab oveconditions are written as

r(w;) 0

B

@

rf(x)0A(x) t

y0E t

z

g(x)+y

X 0

Ze0e

1

C

A

= 0

B

@ 0

0

0 1

C

A (13)

andx 0

>0;z >0. Wecalltheseconditionsthe shiftedbarrierKKT(SBKKT)conditions.

These conditions are also considered by Forsgrenand Gill [11]. We do not consider the

interior conditions x 0

>0;z >0hereafter assuming these conditionsare always satised.

In what follows, the subscript k denotes an iteration count in the inner iteration or

in the outer iteration. Let k1 k denote the l

2

norm for vectors and the operator norm

inducedfrom the l

2

vector norm for matrices. LetR n

0

+

=fz 2R n

0

j z >0g.

2 Algorithm and its global convergence

2.1 Outer iteration

A prototyp eof the algorithm that uses the SBKKT conditions is described asfollows.

Algorithm IP

Step 0. (Initialize) Set " >0, M

c

> 0 and k =0. Let a positive sequence f

k g;

k

# 0

b e given.

Step 1. (Termination)If kr

0 (w

k

)k",then stop.

Step 2. (ApproximateSBKKT point)Find apointw

k +1

that satises

kr(w

k +1

;

k

)kM

c

k : (14)

Step 3. (Update) Set k :=k+1 and go toStep 1. 2

Wenotethatthe barrierparametersequencef

k

g inAlgorithmIPneed notb edeter-

minedbeforehand. The value of each

k

maybesetadaptivelyasthe iteration pro ceeds.

Wecall condition (14) the approximateSBKKT condition, and call a pointthat satises

this condition the approximate SBKKT p oint.

The following theorem showsthe global convergenceproperty of AlgorithmIP.

Theorem 1 Let fw

k

g be an innite sequence generated by Algorithm IP. Suppose that

the sequences fx

k

g and fy

k

g are bounded. Then fz

k

g is bounded, and any accumulation

point of fw

k

g satises KKT conditions (4) and (5).

(5)

Proof. Assume that there existsan i suchthat (E z

k )

i

!1. Equation (14) yields

(rf(x

k

)0A(x

k )

t

y

k )

i

(E t

z

k )

i

01

M

c

k 01

(E t

z

k )

i

;

whichisacontradictionbecauseofthe boundednessoffx

k

gand fy

k

g. Thusthesequence

fz

k

g isbounded.

Let w^ b e any accumulationpointof fw

k

g. Sincethe sequences fw

k

g andf

k

gsatisfy

(14) for each k and

k

approaches zero, r

0

(w )^ =0 follows from the denition of r(w;).

Therefore the proof iscomplete. 2

2.2 Solving the shifted barrier KKT conditions

Inthissubsectionweconsideramethodforsolvingthe SBKKTconditionsapproximately

for a given > 0 (Step 2 of Algorithm IP). Therefore the index k denotes the inner

iteration countforagiven>0 inthissubsection. TheNewton-likeiteration forsolving

(13) is denedby

J

k 1w

k

=0r(w

k

;);

(15)

where the Jacobianmatrix J

k

isgivenby

J

k

= 0

B

@ G

k

0A(x

k )

t

0E t

A(x

k

) I 0

Z

k

E 0 X

0

k 1

C

A

; (16)

andthematrixG

k isr

2

x L(w

k

)oritsapproximation. Thefollowinglemmagivesasucient

condition for equation (15) tobe solvable.

Lemma 1 If G

k

ispositive denite, then the matrix J

k

is nonsingular.

Proof. Consider the equation

J

k 0

B

@ x

y

z 1

C

A=0;

for (x;y;z) t

2R n

2R m

2R n

0

. Thenwehave

(G

k +E

t

(X 0

k )

01

Z

k E+

1

A(x

k )

t

A(x

k

))x = 0;

y = 0

01

A(x

k )x;

z = 0(X

0

k )

01

Z

k Ex:

By the assumption we obtain x =0, and therefore y =0 and z =0. This proves the

lemma. 2

We notethat by eliminating 1y

k

and 1z

k

fromthe rst setof equations (15):

G

k 1x

k

0A(x

k )

t

1y

k 0E

t

1z

k

=0r

x L(w

k );

(17)

(6)

A(x

k )1x

k

+1y

k

= 0g(x

k )0y

k

; (18)

Z

k E1x

k +X

0

k 1z

k

= e0X 0

k z

k

; (19)

we have

(G

k +E

t

(X 0

k )

01

Z

k E+

1

A(x

k )

t

A(x

k ))1x

k

=0rF

0 (x

k

;):

(20)

Therefore it is easyto see that under appropriate assumptions the function F

0

(x;) can

be used as a merit function as in [15]. Because F

0

(x;) depends only on the primal

variables,weshoulduse a methodsimilar tothe one whichis givenin[15] for controlling

the step sizes fordual variables. Instead of following this p ossibility, weconsider amerit

function in the primal-dual space in this paper. Some primal-dual merit functions have

been proposed(See forexample,Argaezand Tapia[3],andEl-Bakry,Tapia,Tsuchiyaand

Zhang[9]for solvingthebarrierKKTconditions(7), andForsgrenandGill[11]forsolving

the SBKKT conditions (13)).

Tohaveameritfunctionwhichhas aminimump ointattheSBKKTpoint,andwhich

givesa descent direction with aNewton step, it is naturalto consider

F

0

(x;)+

2

kg(x)+yk 2

+

2 kX

0

z0ek 2

;

where is a p ositive constant. We note that the second and third terms correspond to

the secondandthird componentsinr(w;) respectively. However,this functiondoesnot

prevent eachcomponentof the variablez tend to 0,and therefore cannotgive aglobally

convergent algorithm unlessanappropriate procedureis devised. Thusweneeda sort of

the barriertermforthe variablez. Inthis paperweproposethe followingfunctionwhich

is calledthe primal-dualbarrier penalty function:

F(w;)=F

0

(x;)+log f(x

0

) t

zg

=n 0

+kg(x)+yk 2

+kX 0

z0ek 2

n 0

Q

i=1 x

0

i z

i

!

=n 0

; (21)

where>0and 2(0;2)areconstants,whichisamodicationoftheprimal-dualmerit

function prop osed by Yamashita[14]. The denominator in the second term is to prevent

z

i

tend to 0 for each i. For notational conveniencewe denote the expression in the last

term in(21) by (w), i.e.,

(w) log f(x

0

) t

zg

=n 0

+kg(x)+yk 2

+kX 0

z0ek 2

n 0

Q

i=1 x

0

i z

i

!

=n 0 (22)

= log

f(x 0

) t

zg

=n 0

+kg(x)+yk 2

+kX 0

z0ek 2

0

n 0

X

i=1 logx

0

i z

i :

Forlater convenience wequote two wellknown relations

(x 0

) t

z

n 0

0

@ n

0

Y

i=1 x

0

i z

i 1

A 1=n

0

; (23)

(7)

n

X

i=1 1

n 0

x 0

i z

i

1

n 0

Q

i=1 x

0

i z

i

!

1=n 0

: (24)

In the aboveinequalities, the equalities hold if and only if x 0

1 z

1

=111=x 0

n 0

z

n 0

.

From(23), it is easyto provethe followinglemma.

Lemma 2 There hold:

(i) (w )0:

(ii) (w )=0 if and only if g(x)+y =0 and X 0

z0e=0. 2

Nowwe calculate the derivativesof the merit function:

rF(w;) = 0

B

@ rF

0

(x;)+r

x (w)

r

y (w)

r

z (w)

1

C

A; (25)

where

r

x

(w ) =

f(x 0

) t

zg 01

E t

z=n 0

+2A(x) t

(g(x)+y)+2E t

Z(X 0

z0e)

f(x 0

) t

zg

=n 0

+kg(x)+yk 2

+kX 0

z0ek 2

0 E

t

(X 0

) 01

e

n 0

;

r

y

(w ) =

2(g(x)+y)

f(x 0

) t

zg

=n 0

+kg(x)+yk 2

+kX 0

z0ek 2

;

r

z

(w ) =

f(x 0

) t

zg 01

x 0

=n 0

+2X 0

(X 0

z0e)

f(x 0

) t

zg

=n 0

+kg(x)+yk 2

+kX 0

z0ek 2

0 Z

01

e

n 0

:

Lemma 3 There hold the following relations:

r(w;)=0 () rF

0

(x;) =0; g(x)+y=0; X 0

z0e=0 (26)

() rF(w;)=0:

Proof. The rst equivalenceis obviousfrom (12).

Thesecondrelationcomesfrom(25). IfrF

0

(x;) =0; g(x) +y=0andX 0

z0e=0,

thenwehaverF(w;) =0. Converselyassumethat rF(w;)=0. Thenitfollowsfrom

the relations r

y

(w)=0and r

z

(w)=0 that

g(x)+y =0

and

f(x 0

) t

zg 01

x 0

=n 0

+2X 0

(X 0

z0e)

f(x 0

) t

zg

=n 0

+kX 0

z0ek 2

0 Z

01

e

n 0

=0:

(27)

Equation (27) yields

f(x 0

) t

zg 01

z=n 0

+2Z(X 0

z0e)

f(x 0

) t

zg

=n 0

+kX 0

z0ek 2

0 (X

0

) 01

e

n 0

=0;

(8)

x

rF

0

(x;) =r

x

F(w;) =0:

Equation (27) also yields

2(X 0

z0e)=

n 0

f(x 0

) t

zg

n 0

+kX 0

z0ek 2

!

(X 0

Z) 01

e0 f(x

0

) t

zg 01

n 0

e:

Multiplying (X 0

z0e) t

to both sides of the ab oveequality,wehave

2kX 0

z0ek 2

=

f(x 0

) t

zg

n 0

+kX 0

z0ek 2

!

0 f(x

0

) t

zg

n 0

0

n 0

f(x 0

) t

zg

n 0

+kX 0

z0ek 2

!

e t

(X 0

Z) 01

e+f(x 0

) t

zg 01

= kX

0

z0ek 2

+f(x 0

) t

zg 01

0

n 0

f(x 0

) t

zg

n 0

+kX 0

z0ek 2

!

e t

(X 0

Z) 01

e:

Thus thereholds

(20)kX 0

z0ek 2

=f(x 0

) t

zg 01

0

n 0

f(x 0

) t

zg

n 0

+kX 0

z0ek 2

!

e t

(X 0

Z) 01

e:

By (23) and (24), we have

20+

n 0

e t

(X 0

Z) 01

e

kX 0

z0ek 2

= f(x

0

) t

zg 01

0

f(x 0

) t

zg

n 0

e t

(X 0

Z) 01

e

n 0

f(x

0

) t

zg 01

0f(x 0

) t

zg 01

Q

n 0

i=1 x

0

i z

i

1=n 0

Q

n 0

i=1 x

0

i z

i

1=n 0

= 0;

whichimplies X 0

z0e=0.

Therefore the proof is complete. 2

In the following, we set 1x 0

= E1x. To derive an upper bound on the directional

derivativeof F, werst calculate the one for.

(28)

r(w) t

1w

=

f(x 0

) t

zg 01

(z t

1x 0

+(x 0

) t

1z)=n 0

+2(A(x)1x+1y) t

(g(x)+y)+2(Z1x 0

+X 0

1z) t

(X 0

z0

f(x 0

) t

zg

=n 0

+kg(x)+yk 2

+kX 0

z0ek 2

0

n 0

X

i=1 z

i 1x

0

i +x

0

i 1z

i

x 0

i z

i :

(9)

k

r(w

k )

t

1w

k

0(20)

kg(x

k )+y

k k

2

+kX 0

k z

k 0ek

2

f(x 0

k )

t

z

k g

=n 0

+kg(x

k )+y

k k

2

+kX 0

k z

k

0ek 2

: (29)

Proof. From(28), wehave

r(w

k )

t

1w

k

=

f(x 0

) t

zg 01

(0(x 0

k )

t

z

k

=n 0

)02kg(x

k )+y

k k

2

02kX 0

k z

k 0ek

2

f(x 0

) t

zg

=n 0

+kg(x

k )+y

k k

2

+kX 0

k z

k

0ek 2

0 n

0

X

i=1

0(x 0

k )

i (z

k )

i

n 0

(x 0

k )

i (z

k )

i

=

f(x 0

) t

zg 01

0(20)kg(x

k )+y

k k

2

0(20)kX 0

k z

k

0ek 2

f(x 0

) t

zg 1+

=n 0

+kg(x

k )+y

k k

2

+kX 0

k z

k

0ek 2

0 n

0

X

i=1

n 0

(x 0

k )

i (z

k )

i :

From relations (23) and (24), weobtain

f(x 0

) t

zg 01

0(20)kg(x

k )+y

k k

2

0(20)kX 0

k z

k

0ek 2

f(x 0

) t

zg

=n 0

+kg(x

k )+y

k k

2

+kX 0

k z

k

0ek 2

0 n

0

X

i=1

n 0

(x 0

k )

i (z

k )

i

n

0

(x 0

k )

t

z

k 0

n 0

Q

i=1 (x

0

k )

i (z

k )

i

!

1=n 0

0(20)

kg(x

k )+y

k k

2

+kX 0

k z

k

0ek 2

(x 0

k )

t

z

k

=n 0

+kg(x

k )+y

k k

2

+kX 0

k z

k

0ek 2

0(20)

kg(x

k )+y

k k

2

+kX 0

k z

k

0ek 2

(x 0

k )

t

z

k

=n 0

+kg(x

k )+y

k k

2

+kX 0

k z

k

0ek 2

:

This provesthe lemma. 2

Lemma 5 If 1w

k

solves (15), then we have

rF(w

k

;) t

1w

k

01x

t

k (G

k +E

t

(X 0

k )

01

Z

k E+

1

A(x

k )

t

A(x

k ))1x

k

0(20)

kg(x

k )+y

k k

2

+kX 0

k z

k

0ek 2

(x 0

k )

t

z

k

=n 0

+kg(x

k )+y

k k

2

+kX 0

k z

k

0ek 2

:

Proof. From (20) and (25), weobtain

rF(w

k

;) t

1w

k

= 01x t

k (G

k +E

t

(X 0

k )

01

Z

k E+

1

A(x

k )

t

A(x

k ))1x

k

+r(w

k )

t

1w

k

whichprovesthe lemmafrom (29). 2

(10)

Lemma 6 Assume that1w

k

solves(15). If1x

k

=0; g(x

k )+y

k

=0andX

k z

k

0e=0,

then w

k

isan SBKKT point.

Proof. 1x

k

= 0 means rF

0 (x

k

;) = 0 from (20). Thus from (26), r(w

k

;) = 0 follows.

2

WenotethatthislemmashowsthatifG

k

ispositivedeniteandw

k

isnot anSBKKT

point, then the direction 1w

k

is a descent direction for the primal-dual barrier p enalty

function fromLemma 5.

2.3 Line search algorithm

To obtain a globally convergent algorithm to an SBKKT point for a xed > 0, it is

necessary to modify the basic Newton iteration with the unit step size somehow. Our

iterations consist of

w

k +1

=w

k +

k 1w

k

;

where

k

is a step size determined by the line searchproceduredescribedbelow.

The mainiterationistodecreasethevalueoftheprimal-dualbarrierpenaltyfunction

F(w;) for xed . Thus the step size is determined by the sucient decrease rule of

the meritfunction. Weadopt Armijo's rule. Atthe pointw

k

, wecalculate the maximum

allowedstep tothe boundaryof the feasible regionby

k max

=min (

min

i (

0 (x

0

k )

i

(1x 0

k )

i

(1x

0

k )

i

<0 )

;min

i (

0 (z

k )

i

(1z

k )

i

(1z

k )

i

<0 ))

:

A step to the nextiterate is given by

k

=

k

l

k

;

k

=minf

k max

;1g;

where 2 (0;1) and 2 (0;1) are xed constants and l

k

is the smallest nonnegative

integer such that

F(w

k +

k

l

k

1w

k

;)0F(w

k

;) "

0

k

l

k

rF(w

k

;) t

1w

k

;

where "

0

2(0;1).

Nowwe give the line searchalgorithm, which is called AlgorithmLS. This algorithm

can be regarded as the inner iteration of Algorithm IP (see Step 2 of Algorithm IP).

Algorithm LS

Step 0. (Initialize) Let w

0

2 S 2R m

2R n

0

+

, and > 0, > 0. Set "

0

> 0, 2 (0;1),

2(0;1), "

0

2(0;1). Let k =0.

Step 1. (Termination)If kr(w

k

;)k "

0

; then stop.

Step 2. (Compute direction) Calculatethe direction 1w

k

by (15).

(11)

(30)

k max

= min (

min

i (

0 (x

0

k )

i

(1x 0

k )

i

(1x

0

k )

i

<0 )

;min

i (

0 (z

k )

i

(1z

k )

i

(1z

k )

i

<0 ))

;

k

= minf

kmax

;1g: (31)

Find the smallestnonnegativeintegerl

k

that satises

F(w

k +

k

l

k

1w

k

;)0F(w

k

;) "

0

k

l

k

rF(w

k

;) t

1w

k : (32)

Calculate

k

=

k

l

k

:

Step 4. (Update variables) Set

w

k +1

=w

k +

k 1w

k :

Step 5. Set k :=k+1 and go toStep 1. 2

Toprove globalconvergenceof Algorithm LS,weneed the followingassumptions.

Assumption G

(G1) Thefunctions f and g

i

;i=1;:::;m, are twice continuously dierentiable.

(G2) Thelevelsetofthe primal-dualbarrierpenalty functionF(w ;) ataninitialpoint

w

0

2S2R m

2R n

0

+

,whichisdenedby n

w2S2R m

2R n

0

+

j F(w;)F(w

0

;) o

,

iscompact for given>0.

(G3) Thematrix G

k

is uniformly p ositive deniteand uniformly bounded. 2

We note that if a quasi-Newton approximation is used for computing the matrix G

k ,

then weneed thecontinuityof onlythe rstorder derivativesoffunctions inAssumption

(G1). Wealso note that for the case of n 0

=n, Assumption (G3) can be replaced by the

following weakercondition:

(G3) 0

The matrix G

k

is p ositive semi-deniteand uniformlybounded.

The followingtheorem givesa convergenceofan innitesequence generated by Algo-

rithm LS.

Theorem 2 Let an innite sequence fw

k

g be generated by Algorithm LS. Then there

existsatleastoneaccumulationpointoffw

k

g,andanyaccumulationpointofthesequence

fw

k

g is an SBKKT point.

(12)

Proof. Sincethe sequencefF(w

k

;)g isdecreasing,eachcomponentofthe sequencefx

k g

is bounded away from zero and bounded ab ove by the existence of the log barrier term

and the assumption. The sequence fz

k

g also has these properties. Thus there exists a

positivenumb er M suchthat

kvk 2

M v

t

(G

k +E

t

(X 0

k )

01

Z

k

E)v Mkvk 2

; 8v 2R n

; (33)

by the assumption. FromLemma 5 and (33), we have

rF(w

k

;) t

1w

k 0

k1x

k k

2

M

+r(w

k )

t

1w

k

<0;

(34)

and from (32),

F(w

k +1

;)0F(w

k

;) "

0

k

l

k

rF(w

k

;) t

1w

k (35)

0"

0

k

l

k

k1x

k k

2

M

0r(w

k )

t

1w

k

!

< 0:

(36)

Because the sequence fF(w

k

;)g is decreasing and boundedb elow, the left hand side of

(35) convergesto 0. From(33) and (20), k1x

k

kis uniformly boundedabove. Then from

(18) and (19) for 1y

k

and 1z

k

, we conclude that k1w

k

k is uniformly b ounded above.

Since liminf

k !1 (x

0

k )

i

>0;liminf

k!1 (z

k )

i

>0;i=1;111;p,wehave liminf

k !1

k

>0.

Supp ose that there exists asubsequence K f0;1;111g and a suchthat

liminf

k!1

rF(w

k

;) t

1w

k

>0; k 2K : (37)

Then we have l

k

! 1;k 2 K from (35) because the left most expression tends to zero,

and thereforewecan assumel

k

>0forsucientlylargek 2K withoutlossof generality.

If l

k

>0 then the point w

k +

k 1w

k

= doesnot satisfy condition (32). Thus,we have

F(w

k +

k 1w

k

=;)0F(w

k

;)>"

0

k rF(w

k

;) t

1w

k

=:

(38)

By the mean value theorem, there existsa

k

2(0;1) such that

F(w

k +

k 1w

k

=;)0F(w

k

;)=

k rF(w

k +

k

k 1w

k

=;) t

1w

k

=:

(39)

Then, from(38) and (39), we have

"

0 rF(w

k

;) t

1w

k

<rF(w

k +

k

k 1w

k

=;) t

1w

k :

This inequality yields

rF(w

k +

k

k 1w

k

=;) t

1w

k

0rF(w

k

;) t

1w

k (40)

> ("

0

01)rF(w

k

;) t

1w

k

>0:

Thus by the property l

k

! 1, we have k

k

k 1w

k

=k ! 0;k 2 K, because k1w

k k is

uniformly bounded above. Thusthe lefthand side of (40) and therefore rF(w

k

;) t

1w

k

(13)

proved

lim

k !1

rF(w

k

;) t

1w

k

=0:

(41)

This implies that

1x

k

!0; g(x

k )+y

k

!0; X 0

k z

k

0e!0;

(42)

from (34). We should note that the existence of an accumulation point of the sequence

fw

k

gisassuredbyAssumption(G2). Letanarbitraryaccumulationpointofthesequence

fw

k

g be w^=(^x;^y ;z)^ t

2S2R m

2R n

0

+

. Then from (42),we have

^ y=0

g(^x)

; z^=(

^

X 0

) 01

e;

where

^

X 0

= diag (^x 0

1

;111;x^ 0

n 0

). Because 1x

k

! 0 implies rF

0

(^x;) = 0 from (20), we

haver(w ;^ )=0from (26). 2

3 Superlinear Convergence

Inthe previous section,wehave provedthe globalconvergencepropertyof AlgorithmIP.

Inthissection,wediscussunderwhichconditionAlgorithmIPcanpossessthesuperlinear

convergence property. For this purp ose, we rst consider the following local algorithm,

which is called Algorithm IPlo cal. By appropriately controlling the parameters

k and

k

at each step near a KKT point, we can show that the unit Newton-likestep from an

approximateSBKKT pointyieldsa nextapproximate SBKKT p ointthat corresponds to

the new updatedbarrierparameter, and that the sequencefw

k

g generated by Algorithm

IPlocal convergessup erlinearly tothe KKT point.

Algorithm IPlocal

Step 0. (Initialize) Set w

0

2S2R m

2R n

0

+ ,

0

>0,0<

0

<1and ">0. Letk =0.

Step 1. (Termination)If kr

0 (w

k

)k";then stop.

Step 2. (Update the parameters) Choose the parameters

k

>0 and 0<

k

<1.

Step 3. (Compute direction) Calculate the direction 1w

k

by the linear system of equa-

tions

J

k 1w

k

=0r(w

k

;

k );

(43)

where the matrix J

k

is givenby

J

k

= 0

B

@ G

k

0A(x

k )

t

0E t

A(x

k

)

k

I 0

Z

k

E 0 X

0

k 1

C

A : (44)

(14)

k max

= min (

min

i (

0 (x

0

k )

i

(1x 0

k )

i

(1x

0

k )

i

<0 )

;min

i (

0 (z

k )

i

(1z

k )

i

(1z

k )

i

<0 ))

;

k

= minf

k

k max

;1g:

Step 5. (Update variables) Set

w

k +1

=w

k +

k 1w

k :

Step 6. Set k :=k+1 and go toStep 1. 2

Denote the Jacobianmatrix of r(w;) by

rr(w ;) = 0

B

@ r

2

x

L(w ) 0A(x) t

0E t

A(x) I 0

ZE 0 X

0 1

C

A :

Let w 3

= (x 3

;y 3

;z 3

) t

be a KKT point of (1). In the following, we assume that k is

sucientlylargeand

k

issucientlycloseto0. Inordertoprovesuperlinearconvergence,

we need Assumption L.

Assumption L

(L1) The sequence fw

k

g converges to w 3

.

(L2) The secondderivatives of the functions f and g are Lipschitzcontinuousat x 3

.

(L3) The linear independence of active constraint gradients, the second order sucient

condition for optimality and the strict complementarity condition hold at w 3

.

(L4)

k and

k

are updated by the rules

k

=

k kr

0 (w

k )k

1+

1

and 10

k

=

k kr

0 (w

k )k

2

forp ositive constants

1 ,

2

and suchthat min(1;

2 )>

1

and0<<1, andfor

apositive numb er

k

suchthat 1

M 0

k

M

0

,where M 0

is ap ositive constant.

(L5) The matrix G

k

satises ateachk,

kG

k 0r

2

x L(w

3

)k<

for suciently small >0,and

k(G

k 0r

2

x L(w

k ))1x

k

k=O(k1w

k k

1+

3

)

for some positive constant

3

such that

3

>

1

. 2

(15)

First weshouldnote that by(L3), the Jacobianmatrixrr

0

(w ) isnonsingular. Then

by (L2), (L4) and (L5), wehave

kJ

k 0rr

0 (w

3

)k krr

0 (w

k

)0rr

0 (w

3

)k+kG

k 0r

2

x L(w

3

)k+

k

krr

0 (w

k

)0rr

0 (w

3

)k++M 0

kr

0 (w

k )k

1+

1

:

Since for sucientlylarge k and sucientlysmall ,there holds

krr

0 (w

3

) 01

kkJ

k 0rr

0 (w

3

)k<1;

J

k

is nonsingularand we have

kJ 01

k

for a positive constant by Banach perturbation lemma. Thus the linear system of

equations (43) has a unique solution.

Nowwegivethefollowingtheorem,whichisveryimportantforprovingthesuperlinear

convergence property of Algorithm IPlocal. This theorem shows that if w

k

satises the

approximateSBKKT condition for

k 01

,then

k

issettobeunit inStep 4of Algorithm

IPlocal and w

k +1w

k

also satisesthe approximate SBKKT condition for

k .

Theorem 3 Let M

c

be a constant such that 0<M

c

<

p

n 0

.

(1) If a point w^ =(^x;y ;^ z)^ t

2S2R m

2R n

0

+

satiseskr(w ;^

k

)kM

c

k , then

1 kr

0 (w

k )k

1+

1

kr

0

(w)k^

2 kr

0 (w

k )k

1+

1

(45)

for positive constants

1 and

2 .

(2) If kr(w

k

;

k 01

)kM

c

k01

, then

k

=1:

(3) There holds

kr(w

k +1w

k

;

k

)kM

c

k : (46)

Proof. (1) Since kr (w ;^

k

)kM

c

k

, wehave

kr

0

(w )k^ =

r(w ;^

k )+

k 0

B

@ 0

0^y

e 1

C

A

=O(

k

)=O(kr

0 (w

k )k

1+1

):

Furthermore we obtain

kr

0

(w )k^ =

r(w ;^

k )+

k 0

B

@ 0

0y^

e 1

C

A

k

0

B

@ 0

0^y

e 1

C

A

0kr(w ;^

k )k

=

k q

k^yk 2

+kek 2

0kr(w ;^

k )k(

p

n 0

0M

c )

k

p

n 0

0M

c

M 0

kr

0 (w

k )k

1+

1

:

(2) We will showthat

k min

i (

0 (x

0

k )

i

(1x 0

k )

i

(1x

0

k )

i

<0 )

1:

(47)

(16)

Fori suchthat (Ex )

i

>0, it followsfrom(1x

k )

i

!0 and

k

!1 that

0

k (x

0

k )

i

(1x 0

k )

i

>1 for (1x 0

k )

i

<0:

Nowwe consideran index i such that (Ex 3

)

i

=0. In this case wenote that (z 3

)

i

>0 by

Assumption (L3). By (43), we have

(x 0

k )

i

+(1x 0

k )

i

=

k

(z

k )

i 0

(x 0

k )

i (1z

k )

i

(z

k )

i : (48)

Since kr (w

k

;

k 01

)kM

c

k01

, we have

k

1

M 0

kr

0 (w

k )k

1+

1

1+

1

M 0

kr

0 (w

k01 )k

(1+

1 )

2

(49)

by result (1), and

j(x 0

k )

i (z

k )

i 0

k 01 j M

c

k 01 :

The latter yields

(x 0

k )

i

(1+M

c )

k 01

(z

k )

i

=

1+M

c

(z

k )

i

k 01 kr

0 (w

k 01 )k

1+

1

:

Since

(1z

k )

i

k1w

k

k=O(kr(w

k

;

k

)k)=O(kr

0 (w

k

)k)=O(kr

0 (w

k 01 )k

1+

1

);

we have

(x 0

k )

i (1z

k )

i

=O

kr

0 (w

k 01 )k

2(1+

1 )

: (50)

Assumption (L4) implies (1+

1 )

2

<2(1+

1

). Thus it follows from(48), (49) and (50)

that

(x 0

k )

i

+(1x 0

k )

i

>

k

(z

k )

i

; (51)

where isgivenby (L4). Since(x 0

k )

i (z

k )

i kr

0 (w

k

)k, Assumption (L4) guarantees

k

(z

k )

i

=

k kr

0 (w

k )k

1+1

(z

k )

i

k (x

0

k )

i kr

0 (w

k )k

1

k (x

0

k )

i kr

0 (w

k )k

2

= 1

(x

0

k )

i (10

k );

then we have

k

(z

k )

i (x

0

k )

i (10

k ):

(52)

Thus by (51) and (52) weobtain

(x 0

k )

i

+(1x 0

k )

i

>(10

k )(x

0

k )

i

;

whichimplies

k 0

(x 0

k )

i

(1x 0

k )

i

!

>1 for (1x 0

k )

i

<0:

(17)

In the same way as above,wecan prove that

k min

i (

0 (z

k )

i

(1z

k )

i

(1z

k )

i

<0 )

1:

Therefore the result follows.

(3) From Assumptions (L4) and (L5), wedirectly obtain

kr(w

k +1w

k

;

k

)k = kr(w

k

;

k

)+rr(w

k

;

k )1w

k

+O(k1w

k k

2

)k

kr(w

k

;

k )+J

k 1w

k

k+O(k1w

k k

2

)

+k(J

k

0rr(w

k

;

k ))1w

k k

= k(G

k 0r

2

x L(w

k ))1x

k

k+O(k1w

k k

2

)

= O(k1w

k k

min(1+

3

; 2)

)

= O(kr (w

k

;

k )k

min(1+

3

;2)

)

= O(kr

0 (w

k )k

min(1+3; 2)

)

= o(kr

0 (w

k )k

1+1

)

= o(

k )

M

c

k :

This proves(46).

Therefore the proof of this theorem is complete. 2

4 Global and superlinear convergence

Theorem 2 assures the global convergence of Algorithm LS to an SBKKT p oint for a

xedand therefore theglobal convergenceof AlgorithmIP to aKKT pointof problem

(1), whileTheorem 3implies thesuperlinearconvergenceof AlgorithmIPlocal toaKKT

point of problem (1). Specically Theorem 3 showsthat if w

k

satises the approximate

SBKKT condition for

k 01

, then

k

is setto beunit in Step 4 of Algorithm IPlocal and

w

k +1w

k

also satises the approximate SBKKT condition for

k

. Thus by result (1) of

Theorem 3,the superlinearconvergenceprop ertyof AlgorithmIPlo calcan beobtained if

we cho ose anapproximateSBKKT pointfor

0

asan initialpoint.

Howeverthis doesnot necessarily imply the superlinearconvergenceofAlgorithm IP,

because the Armijo line search criterion required in the inner iteration (Algorithm LS)

may prevent from cho osing a unit step size even if the iterates are near a KKT point.

This phenomenon is known as the Maratos eect. However if we adopt a unit step size

when the current p oint w

k

(the initial point for the k-th inner iteration) satises the

approximate SBKKT condition for suciently small

k 01

, and w

k +1w

k

(the rst step

for the k-th inner iteration) satises the approximate SBKKT condition for

k01 , even

when the merit function value does not satisfy the Armijo rule, then Theorems 2 and

3 assure that we can have the global and superlinear convergence of Algorithm IP by

appropriately controllingthe parameters

k and

k

at the nal stage of iterations.

(18)

purpose, we could use a nonmonotone strategy like the primal-dual interior point trust

region method given by Yamashita, Yab e and Tanab e[18] for example. However we did

not adopt the technique just for simplicity of exposition of the algorithm in this pap er.

References

[1] I.Akrotirianakis and B.Rustem, A globally convergent interior point algorithm for

general nonlinear programming problems, Technical Report 97-14, Department of

Computing,ImperialCollegeofScience,TechnologyandMedicine,revisedJuly1998.

[2] I.Akrotirianakisand B.Rustem,A primal-dualinterior pointalgorithm with anexact

and dierentiable function for general nonlinear programming problems, Technical

Rep ort 98-09, Department of Computing, Imperial College of Science, Technology

and Medicine, 1998.

[3] M.Argaez and R.A.Tapia, On the global convergence of a modied augmented La-

grangian linesearch interior point Newton method for nonlinear programming,Tech-

nical Rep ort TR95-38, Department of Computational and Applied Mathematics,

Rice University,October1995 (revised February 1997).

[4] M.G.BreiteldandD.F.Shanno,Preliminarycomputationalexperiencewithmo died

log-barrierfunctionsforlarge-scalenonlinearprogramming,inLarge ScaleOptimiza-

tion, Kluweracademic publishers, Dordrecht,Boston, London, 1994.

[5] R.H.Byrd, J.C.Gilbert and J.Nocedal,A trustregion method based on interior point

techniques for nonlinear programming, Technical Rep ort OTC 96/02, Optimization

TechnologyCenter, Argonne National Laboratory,June, 1996.

[6] R.H.Byrd,M.E.HribarandJ.Nocedal,Aninteriorpointalgorithmforlargescalenon-

linear programming,TechnicalReportOTC97/05,OptimizationTechnologyCenter,

Argonne NationalLaboratory, August, 1997.

[7] R.H.Byrd, G.Liuand J.Nocedal,On the local behaviourof aninteriorpointmetho d

for nonlinear programming, in Numerical analysis 1997, D.F.Griths, D.J.Higham

and G.A.Watson eds.,Longman (1998),pp.37-56.

[8] J.E.Dennis, Jr.,M.Heinkenschloss and L.N.Vicente,Trust-region interior-point SQP

algorithms for a class of nonlinear programming problems,TR94-45, Dept. of Com-

putational and Applied Mathematics, Rice University, Houston, Texas, USA, 1994

(revised Novemb er1995).

[9] A.S.El-Bakry,R.A.Tapia, T.Tsuchiyaand Y.Zhang, On the formulation and theory

of the Newton interior-p oint method for nonlinear programming, Journal of Opti-

mization Theory and Applications,89(1996) pp.507-541.

[10] A.V.FiaccoandG.P.McCormick,Nonlinear Programming: Sequential Unconstrained

Minimization Techniques, SIAM, Philadelphia,1990.

(19)

gramming, SIAM J.on Optimization, 8(1998) pp.1132-1152.

[12] H.J.Martinez, Z.Parada and R.A.Tapia, On the characterization of Q-superlinear

convergenceofquasi-Newtoninterior-p ointmethodsfornonlinearprogramming,Bol.

Soc. Mat. Mexicana,Vol.1 (1995),pp.137-148.

[13] H.Yab e and H.Yamashita, Q-superlinear convergence of primal-dual interior point

quasi-Newton methods for constrained optimization, Journal of the Operations Re-

search Society of Japan, 40(1997), pp.415-436.

[14] H.Yamashita, A primal-dual exact merit function for constrained optimization, Op-

timization { Modeling and Algorithms 8, Coop erative Research Report 84, The In-

stitute of Statistical Mathematics, March(1996), pp.119-127.

[15] H.Yamashita, A globally convergent primal-dual interior point method for con-

strainedoptimization, Optimization Methods and Software, 10(1998), pp.443-469.

[16] H.Yamashita and T.Tanab e, A primal-dual interior point trust region method for

large scale constrained optimization, Optimization { Modeling and Algorithms 6,

Co operative Research Report 73, The Institute of Statistical Mathematics, March

(1995), pp.1-25.

[17] H.Yamashita and H.Yab e, Superlinear and quadratic convergence of some primal-

dual interior p oint methods for constrained optimization, Mathematical Program-

ming,75 (1996),pp.377-397.

[18] H.Yamashita,H.Yab eand T.Tanab e,Aglobally andsuperlinearly convergentprimal-

dualinterior pointtrustregion methodfor largescaleconstrainedoptimization,Tech-

nical Report, 1997 July (revised 1998 July).