a primal-dual l
2
barrier penalty function
for nonlinear optimization
Hiroshi Yamashita 3
and Hiroshi Yab e y
March, 1999
Abstract
Inthispaper,weareconcernedwithaprimal-dualinteriorpointmetho dforsolv-
ing nonlinearly constrained optimization problems, in which Newton-like methods
are appliedto theshifted barrier KKT conditions. We propose a new primal-dual
merit function, which is called the primal-dual l
2
barrier penaltyfunction, within
theframeworkoflinesearchmethods,andshowtheglobalconvergencepropertyof
ourmethod. Furthermore,bycarefullycontrollingparameters inthealgorithm,its
sup erlinearconvergencepropertyis shown.
1 Introduction
In this paper,weconsiderthe following constrained optimizationproblem:
minimize f(x); x2R n
;
subject to g(x)=0; x
i
0; i 2I
P
; (1)
whereweassumethatthefunctionsf :R n
!R 1
andg:R n
!R m
aretwicecontinuously
dierentiable,and I
P
isasubset of theindex setf1;2;:::;ng. Letn 0
=jI
P
j>0and E be
a n 0
2n matrix whose rows consist of e t
i
; i2I
P
,where e
i 2R
n
denotes the i-th column
vector of the identity matrix. Then problem (1) is writtenas:
minimize f(x); x 2R n
;
subjectto g(x)=0; Ex 0:
(2)
In the sequel, weuse the notation
x 0
Ex 2R n
0
3
MathematicalSystems,Inc.,2-4-3,Shinjuku,Shinjuku-ku, Tokyo,Japan. [email protected]
y
Department of Applied Mathematics, Faculty of Science, Science University of Tokyo, 1-3, Kagu-
razaka,Shinjuku-ku, Tokyo,Japan. yab [email protected]
Let the Lagrangian function of the ab ove problem be denedby
L(w)=f(x)0y t
g(x)0z t
Ex=f(x)0y t
g(x)0z t
x 0
; (3)
where w=(x;y;z) t
, and y2R m
and z 2R n
0
are the Lagrange multiplier vectors which
correspond to the equality and inequality constraints respectively. Then Karush-Kuhn-
Tucker (KKT)conditions for optimality of the above problem are givenby
r
0 (w)
0
B
@ r
x L(w)
g(x)
X 0
Ze 1
C
A = 0
B
@ 0
0
0 1
C
A (4)
and
x 0
0; z 0;
(5)
where
r
x
L(w ) = rf(x)0A(x) t
y0E t
z;
A(x) = 0
B
B
@ rg
1 (x)
t
.
.
.
rg
m (x)
t 1
C
C
A
;
X 0
= diag(x 0
1
;111;x 0
n 0
);
Z = diag(z
1
;111;z
n 0)
;
e = (1;111;1) t
2R n
0
:
To solve the ab ove problem by a primal-dual interior point method, Yamashita [15]
intro duces the barrier penalty functionF(;):S !R 1
whichis denedby
F(x;)=f(x)0 n
0
X
i=1 logx
0
i +
m
X
i=1 jg
i (x)j; (6)
where and are given positive constants, and S = fx2R n
jx 0
>0g . If is su-
ciently large, the necessary condition for the optimality of the barrier penalty function
minimization problem fora given>0 is
r(w;) 0
B
@ r
x L(w)
g(x)
X 0
Ze0e 1
C
A
= 0
B
@ 0
0
0 1
C
A
; (7)
and x 0
> 0; z > 0. The above conditions are called the barrier KKT conditions. The
search direction of the proposed method is based on the Newton step for solving the
equality part of the barrier KKT conditions. Let 1w = (1x;1y;1z) t
be dened by a
solution of
J(w )1w =0r(w;) (8)
J(w)= 0
B
@
G 0A(x) t
0E t
A(x) 0 0
ZE 0 X
0 1
C
A
; (9)
where we use the relation X 0
Ze = X 0
z = ZEx. The matrix G is r 2
x
L(w) or a quasi-
Newton approximationtothe Hessian matrix.
Let 1F
l
(x;;s)bea rst order approximationtothe quantity F(x+s;)0F(x;),
i.e.,
1F
l
(x;;s)rf(x) t
s0e t
(X 0
) 01
Es+ m
X
i=1
g
i
(x)+rg
i (x)
t
s
0 m
X
i=1 jg
i (x)j: (10)
Then itis possible to provethat
1F
l
(x;;1x)01x t
(G+E t
(X 0
) 01
ZE)1x0 m
X
i=1
(0jy
i +1y
i j)jg
i (x)j:
The above inequality shows that the direction 1x which is derived from (8) can be a
descent direction of the barrier penalty function F(x;) if G is positive denite and
is suciently large. Based on this observation, the line search algorithm and the trust
regionalgorithmforthe primalvariableare proposedbyYamashita[15]andYamashitaet
al.[16, 18] respectively. Forthe variablez,the step size iscontrolledby abox constraint.
The step size for the variabley is usually taken equal to the one for z. Both algorithms
are shown to be quite ecient. Some researchers have dealt with other primal merit
functionswithinthe frameworkof line searchstrategies ortrust regionstrategies (Seefor
example, Breiteld and Shanno[4], Dennis, Heinkenschloss and Vicente[8], Byrd, Gilbert
andNocedal[5],andAkrotirianakisandRustem[1,2]). Superlinearconvergenceproperties
of primal-dual methods based on solving the barrier KKT conditions have b een also
studiedbyseveralauthors,forexample,Martinez,ParadaandTapia[12],El-Bakry,Tapia,
Tsuchiya and Zhang[9], Yamashita and Yab e[17], Yabe and Yamashita[13], Yamashita,
Yabe and Tanab e[18], and Byrd, Liu and No cedal[7].
In this paper, weconsider a moreconventional merit function:
F
0
(x;)=f(x)0 n
0
X
i=1 logx
0
i +
1
2 m
X
i=1 g
i (x)
2
; (11)
whichisextensivelydescribedinabookbyFiaccoandMcCormick[10]. Wealso call this
functionthe barrierp enaltyfunction. Todiscriminatethis functionfrom(6),wemaycall
this the l
2
barrier penalty function. Whereas the function dened in (6) may be called
the l
1
barrier penalty function.
The necessary condition for the optimality of the problem
minimize F
0
(x;); x2S
is
rF
0
(x;)=rf(x)0E t
(X 0
) 01
e+ 1
m
X
i=1 g
i (x)rg
i
(x)=0 (12)
and x > 0. As in [15], we intro duce the variables y and z by y = 0g(x)= and z =
(X 0
) 01
e. Then the ab oveconditions are written as
r(w;) 0
B
@
rf(x)0A(x) t
y0E t
z
g(x)+y
X 0
Ze0e
1
C
A
= 0
B
@ 0
0
0 1
C
A (13)
andx 0
>0;z >0. Wecalltheseconditionsthe shiftedbarrierKKT(SBKKT)conditions.
These conditions are also considered by Forsgrenand Gill [11]. We do not consider the
interior conditions x 0
>0;z >0hereafter assuming these conditionsare always satised.
In what follows, the subscript k denotes an iteration count in the inner iteration or
in the outer iteration. Let k1 k denote the l
2
norm for vectors and the operator norm
inducedfrom the l
2
vector norm for matrices. LetR n
0
+
=fz 2R n
0
j z >0g.
2 Algorithm and its global convergence
2.1 Outer iteration
A prototyp eof the algorithm that uses the SBKKT conditions is described asfollows.
Algorithm IP
Step 0. (Initialize) Set " >0, M
c
> 0 and k =0. Let a positive sequence f
k g;
k
# 0
b e given.
Step 1. (Termination)If kr
0 (w
k
)k",then stop.
Step 2. (ApproximateSBKKT point)Find apointw
k +1
that satises
kr(w
k +1
;
k
)kM
c
k : (14)
Step 3. (Update) Set k :=k+1 and go toStep 1. 2
Wenotethatthe barrierparametersequencef
k
g inAlgorithmIPneed notb edeter-
minedbeforehand. The value of each
k
maybesetadaptivelyasthe iteration pro ceeds.
Wecall condition (14) the approximateSBKKT condition, and call a pointthat satises
this condition the approximate SBKKT p oint.
The following theorem showsthe global convergenceproperty of AlgorithmIP.
Theorem 1 Let fw
k
g be an innite sequence generated by Algorithm IP. Suppose that
the sequences fx
k
g and fy
k
g are bounded. Then fz
k
g is bounded, and any accumulation
point of fw
k
g satises KKT conditions (4) and (5).
Proof. Assume that there existsan i suchthat (E z
k )
i
!1. Equation (14) yields
(rf(x
k
)0A(x
k )
t
y
k )
i
(E t
z
k )
i
01
M
c
k 01
(E t
z
k )
i
;
whichisacontradictionbecauseofthe boundednessoffx
k
gand fy
k
g. Thusthesequence
fz
k
g isbounded.
Let w^ b e any accumulationpointof fw
k
g. Sincethe sequences fw
k
g andf
k
gsatisfy
(14) for each k and
k
approaches zero, r
0
(w )^ =0 follows from the denition of r(w;).
Therefore the proof iscomplete. 2
2.2 Solving the shifted barrier KKT conditions
Inthissubsectionweconsideramethodforsolvingthe SBKKTconditionsapproximately
for a given > 0 (Step 2 of Algorithm IP). Therefore the index k denotes the inner
iteration countforagiven>0 inthissubsection. TheNewton-likeiteration forsolving
(13) is denedby
J
k 1w
k
=0r(w
k
;);
(15)
where the Jacobianmatrix J
k
isgivenby
J
k
= 0
B
@ G
k
0A(x
k )
t
0E t
A(x
k
) I 0
Z
k
E 0 X
0
k 1
C
A
; (16)
andthematrixG
k isr
2
x L(w
k
)oritsapproximation. Thefollowinglemmagivesasucient
condition for equation (15) tobe solvable.
Lemma 1 If G
k
ispositive denite, then the matrix J
k
is nonsingular.
Proof. Consider the equation
J
k 0
B
@ x
y
z 1
C
A=0;
for (x;y;z) t
2R n
2R m
2R n
0
. Thenwehave
(G
k +E
t
(X 0
k )
01
Z
k E+
1
A(x
k )
t
A(x
k
))x = 0;
y = 0
01
A(x
k )x;
z = 0(X
0
k )
01
Z
k Ex:
By the assumption we obtain x =0, and therefore y =0 and z =0. This proves the
lemma. 2
We notethat by eliminating 1y
k
and 1z
k
fromthe rst setof equations (15):
G
k 1x
k
0A(x
k )
t
1y
k 0E
t
1z
k
=0r
x L(w
k );
(17)
A(x
k )1x
k
+1y
k
= 0g(x
k )0y
k
; (18)
Z
k E1x
k +X
0
k 1z
k
= e0X 0
k z
k
; (19)
we have
(G
k +E
t
(X 0
k )
01
Z
k E+
1
A(x
k )
t
A(x
k ))1x
k
=0rF
0 (x
k
;):
(20)
Therefore it is easyto see that under appropriate assumptions the function F
0
(x;) can
be used as a merit function as in [15]. Because F
0
(x;) depends only on the primal
variables,weshoulduse a methodsimilar tothe one whichis givenin[15] for controlling
the step sizes fordual variables. Instead of following this p ossibility, weconsider amerit
function in the primal-dual space in this paper. Some primal-dual merit functions have
been proposed(See forexample,Argaezand Tapia[3],andEl-Bakry,Tapia,Tsuchiyaand
Zhang[9]for solvingthebarrierKKTconditions(7), andForsgrenandGill[11]forsolving
the SBKKT conditions (13)).
Tohaveameritfunctionwhichhas aminimump ointattheSBKKTpoint,andwhich
givesa descent direction with aNewton step, it is naturalto consider
F
0
(x;)+
2
kg(x)+yk 2
+
2 kX
0
z0ek 2
;
where is a p ositive constant. We note that the second and third terms correspond to
the secondandthird componentsinr(w;) respectively. However,this functiondoesnot
prevent eachcomponentof the variablez tend to 0,and therefore cannotgive aglobally
convergent algorithm unlessanappropriate procedureis devised. Thusweneeda sort of
the barriertermforthe variablez. Inthis paperweproposethe followingfunctionwhich
is calledthe primal-dualbarrier penalty function:
F(w;)=F
0
(x;)+log f(x
0
) t
zg
=n 0
+kg(x)+yk 2
+kX 0
z0ek 2
n 0
Q
i=1 x
0
i z
i
!
=n 0
; (21)
where>0and 2(0;2)areconstants,whichisamodicationoftheprimal-dualmerit
function prop osed by Yamashita[14]. The denominator in the second term is to prevent
z
i
tend to 0 for each i. For notational conveniencewe denote the expression in the last
term in(21) by (w), i.e.,
(w) log f(x
0
) t
zg
=n 0
+kg(x)+yk 2
+kX 0
z0ek 2
n 0
Q
i=1 x
0
i z
i
!
=n 0 (22)
= log
f(x 0
) t
zg
=n 0
+kg(x)+yk 2
+kX 0
z0ek 2
0
n 0
n 0
X
i=1 logx
0
i z
i :
Forlater convenience wequote two wellknown relations
(x 0
) t
z
n 0
0
@ n
0
Y
i=1 x
0
i z
i 1
A 1=n
0
; (23)
n
X
i=1 1
n 0
x 0
i z
i
1
n 0
Q
i=1 x
0
i z
i
!
1=n 0
: (24)
In the aboveinequalities, the equalities hold if and only if x 0
1 z
1
=111=x 0
n 0
z
n 0
.
From(23), it is easyto provethe followinglemma.
Lemma 2 There hold:
(i) (w )0:
(ii) (w )=0 if and only if g(x)+y =0 and X 0
z0e=0. 2
Nowwe calculate the derivativesof the merit function:
rF(w;) = 0
B
@ rF
0
(x;)+r
x (w)
r
y (w)
r
z (w)
1
C
A; (25)
where
r
x
(w ) =
f(x 0
) t
zg 01
E t
z=n 0
+2A(x) t
(g(x)+y)+2E t
Z(X 0
z0e)
f(x 0
) t
zg
=n 0
+kg(x)+yk 2
+kX 0
z0ek 2
0 E
t
(X 0
) 01
e
n 0
;
r
y
(w ) =
2(g(x)+y)
f(x 0
) t
zg
=n 0
+kg(x)+yk 2
+kX 0
z0ek 2
;
r
z
(w ) =
f(x 0
) t
zg 01
x 0
=n 0
+2X 0
(X 0
z0e)
f(x 0
) t
zg
=n 0
+kg(x)+yk 2
+kX 0
z0ek 2
0 Z
01
e
n 0
:
Lemma 3 There hold the following relations:
r(w;)=0 () rF
0
(x;) =0; g(x)+y=0; X 0
z0e=0 (26)
() rF(w;)=0:
Proof. The rst equivalenceis obviousfrom (12).
Thesecondrelationcomesfrom(25). IfrF
0
(x;) =0; g(x) +y=0andX 0
z0e=0,
thenwehaverF(w;) =0. Converselyassumethat rF(w;)=0. Thenitfollowsfrom
the relations r
y
(w)=0and r
z
(w)=0 that
g(x)+y =0
and
f(x 0
) t
zg 01
x 0
=n 0
+2X 0
(X 0
z0e)
f(x 0
) t
zg
=n 0
+kX 0
z0ek 2
0 Z
01
e
n 0
=0:
(27)
Equation (27) yields
f(x 0
) t
zg 01
z=n 0
+2Z(X 0
z0e)
f(x 0
) t
zg
=n 0
+kX 0
z0ek 2
0 (X
0
) 01
e
n 0
=0;
x
rF
0
(x;) =r
x
F(w;) =0:
Equation (27) also yields
2(X 0
z0e)=
n 0
f(x 0
) t
zg
n 0
+kX 0
z0ek 2
!
(X 0
Z) 01
e0 f(x
0
) t
zg 01
n 0
e:
Multiplying (X 0
z0e) t
to both sides of the ab oveequality,wehave
2kX 0
z0ek 2
=
f(x 0
) t
zg
n 0
+kX 0
z0ek 2
!
0 f(x
0
) t
zg
n 0
0
n 0
f(x 0
) t
zg
n 0
+kX 0
z0ek 2
!
e t
(X 0
Z) 01
e+f(x 0
) t
zg 01
= kX
0
z0ek 2
+f(x 0
) t
zg 01
0
n 0
f(x 0
) t
zg
n 0
+kX 0
z0ek 2
!
e t
(X 0
Z) 01
e:
Thus thereholds
(20)kX 0
z0ek 2
=f(x 0
) t
zg 01
0
n 0
f(x 0
) t
zg
n 0
+kX 0
z0ek 2
!
e t
(X 0
Z) 01
e:
By (23) and (24), we have
20+
n 0
e t
(X 0
Z) 01
e
kX 0
z0ek 2
= f(x
0
) t
zg 01
0
f(x 0
) t
zg
n 0
e t
(X 0
Z) 01
e
n 0
f(x
0
) t
zg 01
0f(x 0
) t
zg 01
Q
n 0
i=1 x
0
i z
i
1=n 0
Q
n 0
i=1 x
0
i z
i
1=n 0
= 0;
whichimplies X 0
z0e=0.
Therefore the proof is complete. 2
In the following, we set 1x 0
= E1x. To derive an upper bound on the directional
derivativeof F, werst calculate the one for.
(28)
r(w) t
1w
=
f(x 0
) t
zg 01
(z t
1x 0
+(x 0
) t
1z)=n 0
+2(A(x)1x+1y) t
(g(x)+y)+2(Z1x 0
+X 0
1z) t
(X 0
z0
f(x 0
) t
zg
=n 0
+kg(x)+yk 2
+kX 0
z0ek 2
0
n 0
n 0
X
i=1 z
i 1x
0
i +x
0
i 1z
i
x 0
i z
i :
k
r(w
k )
t
1w
k
0(20)
kg(x
k )+y
k k
2
+kX 0
k z
k 0ek
2
f(x 0
k )
t
z
k g
=n 0
+kg(x
k )+y
k k
2
+kX 0
k z
k
0ek 2
: (29)
Proof. From(28), wehave
r(w
k )
t
1w
k
=
f(x 0
) t
zg 01
(0(x 0
k )
t
z
k
=n 0
)02kg(x
k )+y
k k
2
02kX 0
k z
k 0ek
2
f(x 0
) t
zg
=n 0
+kg(x
k )+y
k k
2
+kX 0
k z
k
0ek 2
0 n
0
X
i=1
0(x 0
k )
i (z
k )
i
n 0
(x 0
k )
i (z
k )
i
=
f(x 0
) t
zg 01
0(20)kg(x
k )+y
k k
2
0(20)kX 0
k z
k
0ek 2
f(x 0
) t
zg 1+
=n 0
+kg(x
k )+y
k k
2
+kX 0
k z
k
0ek 2
0 n
0
X
i=1
n 0
(x 0
k )
i (z
k )
i :
From relations (23) and (24), weobtain
f(x 0
) t
zg 01
0(20)kg(x
k )+y
k k
2
0(20)kX 0
k z
k
0ek 2
f(x 0
) t
zg
=n 0
+kg(x
k )+y
k k
2
+kX 0
k z
k
0ek 2
0 n
0
X
i=1
n 0
(x 0
k )
i (z
k )
i
n
0
(x 0
k )
t
z
k 0
n 0
Q
i=1 (x
0
k )
i (z
k )
i
!
1=n 0
0(20)
kg(x
k )+y
k k
2
+kX 0
k z
k
0ek 2
(x 0
k )
t
z
k
=n 0
+kg(x
k )+y
k k
2
+kX 0
k z
k
0ek 2
0(20)
kg(x
k )+y
k k
2
+kX 0
k z
k
0ek 2
(x 0
k )
t
z
k
=n 0
+kg(x
k )+y
k k
2
+kX 0
k z
k
0ek 2
:
This provesthe lemma. 2
Lemma 5 If 1w
k
solves (15), then we have
rF(w
k
;) t
1w
k
01x
t
k (G
k +E
t
(X 0
k )
01
Z
k E+
1
A(x
k )
t
A(x
k ))1x
k
0(20)
kg(x
k )+y
k k
2
+kX 0
k z
k
0ek 2
(x 0
k )
t
z
k
=n 0
+kg(x
k )+y
k k
2
+kX 0
k z
k
0ek 2
:
Proof. From (20) and (25), weobtain
rF(w
k
;) t
1w
k
= 01x t
k (G
k +E
t
(X 0
k )
01
Z
k E+
1
A(x
k )
t
A(x
k ))1x
k
+r(w
k )
t
1w
k
whichprovesthe lemmafrom (29). 2
Lemma 6 Assume that1w
k
solves(15). If1x
k
=0; g(x
k )+y
k
=0andX
k z
k
0e=0,
then w
k
isan SBKKT point.
Proof. 1x
k
= 0 means rF
0 (x
k
;) = 0 from (20). Thus from (26), r(w
k
;) = 0 follows.
2
WenotethatthislemmashowsthatifG
k
ispositivedeniteandw
k
isnot anSBKKT
point, then the direction 1w
k
is a descent direction for the primal-dual barrier p enalty
function fromLemma 5.
2.3 Line search algorithm
To obtain a globally convergent algorithm to an SBKKT point for a xed > 0, it is
necessary to modify the basic Newton iteration with the unit step size somehow. Our
iterations consist of
w
k +1
=w
k +
k 1w
k
;
where
k
is a step size determined by the line searchproceduredescribedbelow.
The mainiterationistodecreasethevalueoftheprimal-dualbarrierpenaltyfunction
F(w;) for xed . Thus the step size is determined by the sucient decrease rule of
the meritfunction. Weadopt Armijo's rule. Atthe pointw
k
, wecalculate the maximum
allowedstep tothe boundaryof the feasible regionby
k max
=min (
min
i (
0 (x
0
k )
i
(1x 0
k )
i
(1x
0
k )
i
<0 )
;min
i (
0 (z
k )
i
(1z
k )
i
(1z
k )
i
<0 ))
:
A step to the nextiterate is given by
k
=
k
l
k
;
k
=minf
k max
;1g;
where 2 (0;1) and 2 (0;1) are xed constants and l
k
is the smallest nonnegative
integer such that
F(w
k +
k
l
k
1w
k
;)0F(w
k
;) "
0
k
l
k
rF(w
k
;) t
1w
k
;
where "
0
2(0;1).
Nowwe give the line searchalgorithm, which is called AlgorithmLS. This algorithm
can be regarded as the inner iteration of Algorithm IP (see Step 2 of Algorithm IP).
Algorithm LS
Step 0. (Initialize) Let w
0
2 S 2R m
2R n
0
+
, and > 0, > 0. Set "
0
> 0, 2 (0;1),
2(0;1), "
0
2(0;1). Let k =0.
Step 1. (Termination)If kr(w
k
;)k "
0
; then stop.
Step 2. (Compute direction) Calculatethe direction 1w
k
by (15).
(30)
k max
= min (
min
i (
0 (x
0
k )
i
(1x 0
k )
i
(1x
0
k )
i
<0 )
;min
i (
0 (z
k )
i
(1z
k )
i
(1z
k )
i
<0 ))
;
k
= minf
kmax
;1g: (31)
Find the smallestnonnegativeintegerl
k
that satises
F(w
k +
k
l
k
1w
k
;)0F(w
k
;) "
0
k
l
k
rF(w
k
;) t
1w
k : (32)
Calculate
k
=
k
l
k
:
Step 4. (Update variables) Set
w
k +1
=w
k +
k 1w
k :
Step 5. Set k :=k+1 and go toStep 1. 2
Toprove globalconvergenceof Algorithm LS,weneed the followingassumptions.
Assumption G
(G1) Thefunctions f and g
i
;i=1;:::;m, are twice continuously dierentiable.
(G2) Thelevelsetofthe primal-dualbarrierpenalty functionF(w ;) ataninitialpoint
w
0
2S2R m
2R n
0
+
,whichisdenedby n
w2S2R m
2R n
0
+
j F(w;)F(w
0
;) o
,
iscompact for given>0.
(G3) Thematrix G
k
is uniformly p ositive deniteand uniformly bounded. 2
We note that if a quasi-Newton approximation is used for computing the matrix G
k ,
then weneed thecontinuityof onlythe rstorder derivativesoffunctions inAssumption
(G1). Wealso note that for the case of n 0
=n, Assumption (G3) can be replaced by the
following weakercondition:
(G3) 0
The matrix G
k
is p ositive semi-deniteand uniformlybounded.
The followingtheorem givesa convergenceofan innitesequence generated by Algo-
rithm LS.
Theorem 2 Let an innite sequence fw
k
g be generated by Algorithm LS. Then there
existsatleastoneaccumulationpointoffw
k
g,andanyaccumulationpointofthesequence
fw
k
g is an SBKKT point.
Proof. Sincethe sequencefF(w
k
;)g isdecreasing,eachcomponentofthe sequencefx
k g
is bounded away from zero and bounded ab ove by the existence of the log barrier term
and the assumption. The sequence fz
k
g also has these properties. Thus there exists a
positivenumb er M suchthat
kvk 2
M v
t
(G
k +E
t
(X 0
k )
01
Z
k
E)v Mkvk 2
; 8v 2R n
; (33)
by the assumption. FromLemma 5 and (33), we have
rF(w
k
;) t
1w
k 0
k1x
k k
2
M
+r(w
k )
t
1w
k
<0;
(34)
and from (32),
F(w
k +1
;)0F(w
k
;) "
0
k
l
k
rF(w
k
;) t
1w
k (35)
0"
0
k
l
k
k1x
k k
2
M
0r(w
k )
t
1w
k
!
< 0:
(36)
Because the sequence fF(w
k
;)g is decreasing and boundedb elow, the left hand side of
(35) convergesto 0. From(33) and (20), k1x
k
kis uniformly boundedabove. Then from
(18) and (19) for 1y
k
and 1z
k
, we conclude that k1w
k
k is uniformly b ounded above.
Since liminf
k !1 (x
0
k )
i
>0;liminf
k!1 (z
k )
i
>0;i=1;111;p,wehave liminf
k !1
k
>0.
Supp ose that there exists asubsequence K f0;1;111g and a suchthat
liminf
k!1
rF(w
k
;) t
1w
k
>0; k 2K : (37)
Then we have l
k
! 1;k 2 K from (35) because the left most expression tends to zero,
and thereforewecan assumel
k
>0forsucientlylargek 2K withoutlossof generality.
If l
k
>0 then the point w
k +
k 1w
k
= doesnot satisfy condition (32). Thus,we have
F(w
k +
k 1w
k
=;)0F(w
k
;)>"
0
k rF(w
k
;) t
1w
k
=:
(38)
By the mean value theorem, there existsa
k
2(0;1) such that
F(w
k +
k 1w
k
=;)0F(w
k
;)=
k rF(w
k +
k
k 1w
k
=;) t
1w
k
=:
(39)
Then, from(38) and (39), we have
"
0 rF(w
k
;) t
1w
k
<rF(w
k +
k
k 1w
k
=;) t
1w
k :
This inequality yields
rF(w
k +
k
k 1w
k
=;) t
1w
k
0rF(w
k
;) t
1w
k (40)
> ("
0
01)rF(w
k
;) t
1w
k
>0:
Thus by the property l
k
! 1, we have k
k
k 1w
k
=k ! 0;k 2 K, because k1w
k k is
uniformly bounded above. Thusthe lefthand side of (40) and therefore rF(w
k
;) t
1w
k
proved
lim
k !1
rF(w
k
;) t
1w
k
=0:
(41)
This implies that
1x
k
!0; g(x
k )+y
k
!0; X 0
k z
k
0e!0;
(42)
from (34). We should note that the existence of an accumulation point of the sequence
fw
k
gisassuredbyAssumption(G2). Letanarbitraryaccumulationpointofthesequence
fw
k
g be w^=(^x;^y ;z)^ t
2S2R m
2R n
0
+
. Then from (42),we have
^ y=0
g(^x)
; z^=(
^
X 0
) 01
e;
where
^
X 0
= diag (^x 0
1
;111;x^ 0
n 0
). Because 1x
k
! 0 implies rF
0
(^x;) = 0 from (20), we
haver(w ;^ )=0from (26). 2
3 Superlinear Convergence
Inthe previous section,wehave provedthe globalconvergencepropertyof AlgorithmIP.
Inthissection,wediscussunderwhichconditionAlgorithmIPcanpossessthesuperlinear
convergence property. For this purp ose, we rst consider the following local algorithm,
which is called Algorithm IPlo cal. By appropriately controlling the parameters
k and
k
at each step near a KKT point, we can show that the unit Newton-likestep from an
approximateSBKKT pointyieldsa nextapproximate SBKKT p ointthat corresponds to
the new updatedbarrierparameter, and that the sequencefw
k
g generated by Algorithm
IPlocal convergessup erlinearly tothe KKT point.
Algorithm IPlocal
Step 0. (Initialize) Set w
0
2S2R m
2R n
0
+ ,
0
>0,0<
0
<1and ">0. Letk =0.
Step 1. (Termination)If kr
0 (w
k
)k";then stop.
Step 2. (Update the parameters) Choose the parameters
k
>0 and 0<
k
<1.
Step 3. (Compute direction) Calculate the direction 1w
k
by the linear system of equa-
tions
J
k 1w
k
=0r(w
k
;
k );
(43)
where the matrix J
k
is givenby
J
k
= 0
B
@ G
k
0A(x
k )
t
0E t
A(x
k
)
k
I 0
Z
k
E 0 X
0
k 1
C
A : (44)
k max
= min (
min
i (
0 (x
0
k )
i
(1x 0
k )
i
(1x
0
k )
i
<0 )
;min
i (
0 (z
k )
i
(1z
k )
i
(1z
k )
i
<0 ))
;
k
= minf
k
k max
;1g:
Step 5. (Update variables) Set
w
k +1
=w
k +
k 1w
k :
Step 6. Set k :=k+1 and go toStep 1. 2
Denote the Jacobianmatrix of r(w;) by
rr(w ;) = 0
B
@ r
2
x
L(w ) 0A(x) t
0E t
A(x) I 0
ZE 0 X
0 1
C
A :
Let w 3
= (x 3
;y 3
;z 3
) t
be a KKT point of (1). In the following, we assume that k is
sucientlylargeand
k
issucientlycloseto0. Inordertoprovesuperlinearconvergence,
we need Assumption L.
Assumption L
(L1) The sequence fw
k
g converges to w 3
.
(L2) The secondderivatives of the functions f and g are Lipschitzcontinuousat x 3
.
(L3) The linear independence of active constraint gradients, the second order sucient
condition for optimality and the strict complementarity condition hold at w 3
.
(L4)
k and
k
are updated by the rules
k
=
k kr
0 (w
k )k
1+
1
and 10
k
=
k kr
0 (w
k )k
2
forp ositive constants
1 ,
2
and suchthat min(1;
2 )>
1
and0<<1, andfor
apositive numb er
k
suchthat 1
M 0
k
M
0
,where M 0
is ap ositive constant.
(L5) The matrix G
k
satises ateachk,
kG
k 0r
2
x L(w
3
)k<
for suciently small >0,and
k(G
k 0r
2
x L(w
k ))1x
k
k=O(k1w
k k
1+
3
)
for some positive constant
3
such that
3
>
1
. 2
First weshouldnote that by(L3), the Jacobianmatrixrr
0
(w ) isnonsingular. Then
by (L2), (L4) and (L5), wehave
kJ
k 0rr
0 (w
3
)k krr
0 (w
k
)0rr
0 (w
3
)k+kG
k 0r
2
x L(w
3
)k+
k
krr
0 (w
k
)0rr
0 (w
3
)k++M 0
kr
0 (w
k )k
1+
1
:
Since for sucientlylarge k and sucientlysmall ,there holds
krr
0 (w
3
) 01
kkJ
k 0rr
0 (w
3
)k<1;
J
k
is nonsingularand we have
kJ 01
k
k
for a positive constant by Banach perturbation lemma. Thus the linear system of
equations (43) has a unique solution.
Nowwegivethefollowingtheorem,whichisveryimportantforprovingthesuperlinear
convergence property of Algorithm IPlocal. This theorem shows that if w
k
satises the
approximateSBKKT condition for
k 01
,then
k
issettobeunit inStep 4of Algorithm
IPlocal and w
k +1w
k
also satisesthe approximate SBKKT condition for
k .
Theorem 3 Let M
c
be a constant such that 0<M
c
<
p
n 0
.
(1) If a point w^ =(^x;y ;^ z)^ t
2S2R m
2R n
0
+
satiseskr(w ;^
k
)kM
c
k , then
1 kr
0 (w
k )k
1+
1
kr
0
(w)k^
2 kr
0 (w
k )k
1+
1
(45)
for positive constants
1 and
2 .
(2) If kr(w
k
;
k 01
)kM
c
k01
, then
k
=1:
(3) There holds
kr(w
k +1w
k
;
k
)kM
c
k : (46)
Proof. (1) Since kr (w ;^
k
)kM
c
k
, wehave
kr
0
(w )k^ =
r(w ;^
k )+
k 0
B
@ 0
0^y
e 1
C
A
=O(
k
)=O(kr
0 (w
k )k
1+1
):
Furthermore we obtain
kr
0
(w )k^ =
r(w ;^
k )+
k 0
B
@ 0
0y^
e 1
C
A
k
0
B
@ 0
0^y
e 1
C
A
0kr(w ;^
k )k
=
k q
k^yk 2
+kek 2
0kr(w ;^
k )k(
p
n 0
0M
c )
k
p
n 0
0M
c
M 0
kr
0 (w
k )k
1+
1
:
(2) We will showthat
k min
i (
0 (x
0
k )
i
(1x 0
k )
i
(1x
0
k )
i
<0 )
1:
(47)
Fori suchthat (Ex )
i
>0, it followsfrom(1x
k )
i
!0 and
k
!1 that
0
k (x
0
k )
i
(1x 0
k )
i
>1 for (1x 0
k )
i
<0:
Nowwe consideran index i such that (Ex 3
)
i
=0. In this case wenote that (z 3
)
i
>0 by
Assumption (L3). By (43), we have
(x 0
k )
i
+(1x 0
k )
i
=
k
(z
k )
i 0
(x 0
k )
i (1z
k )
i
(z
k )
i : (48)
Since kr (w
k
;
k 01
)kM
c
k01
, we have
k
1
M 0
kr
0 (w
k )k
1+
1
1+
1
1
M 0
kr
0 (w
k01 )k
(1+
1 )
2
(49)
by result (1), and
j(x 0
k )
i (z
k )
i 0
k 01 j M
c
k 01 :
The latter yields
(x 0
k )
i
(1+M
c )
k 01
(z
k )
i
=
1+M
c
(z
k )
i
k 01 kr
0 (w
k 01 )k
1+
1
:
Since
(1z
k )
i
k1w
k
k=O(kr(w
k
;
k
)k)=O(kr
0 (w
k
)k)=O(kr
0 (w
k 01 )k
1+
1
);
we have
(x 0
k )
i (1z
k )
i
=O
kr
0 (w
k 01 )k
2(1+
1 )
: (50)
Assumption (L4) implies (1+
1 )
2
<2(1+
1
). Thus it follows from(48), (49) and (50)
that
(x 0
k )
i
+(1x 0
k )
i
>
k
(z
k )
i
; (51)
where isgivenby (L4). Since(x 0
k )
i (z
k )
i kr
0 (w
k
)k, Assumption (L4) guarantees
k
(z
k )
i
=
k kr
0 (w
k )k
1+1
(z
k )
i
k (x
0
k )
i kr
0 (w
k )k
1
k (x
0
k )
i kr
0 (w
k )k
2
= 1
(x
0
k )
i (10
k );
then we have
k
(z
k )
i (x
0
k )
i (10
k ):
(52)
Thus by (51) and (52) weobtain
(x 0
k )
i
+(1x 0
k )
i
>(10
k )(x
0
k )
i
;
whichimplies
k 0
(x 0
k )
i
(1x 0
k )
i
!
>1 for (1x 0
k )
i
<0:
In the same way as above,wecan prove that
k min
i (
0 (z
k )
i
(1z
k )
i
(1z
k )
i
<0 )
1:
Therefore the result follows.
(3) From Assumptions (L4) and (L5), wedirectly obtain
kr(w
k +1w
k
;
k
)k = kr(w
k
;
k
)+rr(w
k
;
k )1w
k
+O(k1w
k k
2
)k
kr(w
k
;
k )+J
k 1w
k
k+O(k1w
k k
2
)
+k(J
k
0rr(w
k
;
k ))1w
k k
= k(G
k 0r
2
x L(w
k ))1x
k
k+O(k1w
k k
2
)
= O(k1w
k k
min(1+
3
; 2)
)
= O(kr (w
k
;
k )k
min(1+
3
;2)
)
= O(kr
0 (w
k )k
min(1+3; 2)
)
= o(kr
0 (w
k )k
1+1
)
= o(
k )
M
c
k :
This proves(46).
Therefore the proof of this theorem is complete. 2
4 Global and superlinear convergence
Theorem 2 assures the global convergence of Algorithm LS to an SBKKT p oint for a
xedand therefore theglobal convergenceof AlgorithmIP to aKKT pointof problem
(1), whileTheorem 3implies thesuperlinearconvergenceof AlgorithmIPlocal toaKKT
point of problem (1). Specically Theorem 3 showsthat if w
k
satises the approximate
SBKKT condition for
k 01
, then
k
is setto beunit in Step 4 of Algorithm IPlocal and
w
k +1w
k
also satises the approximate SBKKT condition for
k
. Thus by result (1) of
Theorem 3,the superlinearconvergenceprop ertyof AlgorithmIPlo calcan beobtained if
we cho ose anapproximateSBKKT pointfor
0
asan initialpoint.
Howeverthis doesnot necessarily imply the superlinearconvergenceofAlgorithm IP,
because the Armijo line search criterion required in the inner iteration (Algorithm LS)
may prevent from cho osing a unit step size even if the iterates are near a KKT point.
This phenomenon is known as the Maratos eect. However if we adopt a unit step size
when the current p oint w
k
(the initial point for the k-th inner iteration) satises the
approximate SBKKT condition for suciently small
k 01
, and w
k +1w
k
(the rst step
for the k-th inner iteration) satises the approximate SBKKT condition for
k01 , even
when the merit function value does not satisfy the Armijo rule, then Theorems 2 and
3 assure that we can have the global and superlinear convergence of Algorithm IP by
appropriately controllingthe parameters
k and
k
at the nal stage of iterations.
purpose, we could use a nonmonotone strategy like the primal-dual interior point trust
region method given by Yamashita, Yab e and Tanab e[18] for example. However we did
not adopt the technique just for simplicity of exposition of the algorithm in this pap er.
References
[1] I.Akrotirianakis and B.Rustem, A globally convergent interior point algorithm for
general nonlinear programming problems, Technical Report 97-14, Department of
Computing,ImperialCollegeofScience,TechnologyandMedicine,revisedJuly1998.
[2] I.Akrotirianakisand B.Rustem,A primal-dualinterior pointalgorithm with anexact
and dierentiable function for general nonlinear programming problems, Technical
Rep ort 98-09, Department of Computing, Imperial College of Science, Technology
and Medicine, 1998.
[3] M.Argaez and R.A.Tapia, On the global convergence of a modied augmented La-
grangian linesearch interior point Newton method for nonlinear programming,Tech-
nical Rep ort TR95-38, Department of Computational and Applied Mathematics,
Rice University,October1995 (revised February 1997).
[4] M.G.BreiteldandD.F.Shanno,Preliminarycomputationalexperiencewithmo died
log-barrierfunctionsforlarge-scalenonlinearprogramming,inLarge ScaleOptimiza-
tion, Kluweracademic publishers, Dordrecht,Boston, London, 1994.
[5] R.H.Byrd, J.C.Gilbert and J.Nocedal,A trustregion method based on interior point
techniques for nonlinear programming, Technical Rep ort OTC 96/02, Optimization
TechnologyCenter, Argonne National Laboratory,June, 1996.
[6] R.H.Byrd,M.E.HribarandJ.Nocedal,Aninteriorpointalgorithmforlargescalenon-
linear programming,TechnicalReportOTC97/05,OptimizationTechnologyCenter,
Argonne NationalLaboratory, August, 1997.
[7] R.H.Byrd, G.Liuand J.Nocedal,On the local behaviourof aninteriorpointmetho d
for nonlinear programming, in Numerical analysis 1997, D.F.Griths, D.J.Higham
and G.A.Watson eds.,Longman (1998),pp.37-56.
[8] J.E.Dennis, Jr.,M.Heinkenschloss and L.N.Vicente,Trust-region interior-point SQP
algorithms for a class of nonlinear programming problems,TR94-45, Dept. of Com-
putational and Applied Mathematics, Rice University, Houston, Texas, USA, 1994
(revised Novemb er1995).
[9] A.S.El-Bakry,R.A.Tapia, T.Tsuchiyaand Y.Zhang, On the formulation and theory
of the Newton interior-p oint method for nonlinear programming, Journal of Opti-
mization Theory and Applications,89(1996) pp.507-541.
[10] A.V.FiaccoandG.P.McCormick,Nonlinear Programming: Sequential Unconstrained
Minimization Techniques, SIAM, Philadelphia,1990.
gramming, SIAM J.on Optimization, 8(1998) pp.1132-1152.
[12] H.J.Martinez, Z.Parada and R.A.Tapia, On the characterization of Q-superlinear
convergenceofquasi-Newtoninterior-p ointmethodsfornonlinearprogramming,Bol.
Soc. Mat. Mexicana,Vol.1 (1995),pp.137-148.
[13] H.Yab e and H.Yamashita, Q-superlinear convergence of primal-dual interior point
quasi-Newton methods for constrained optimization, Journal of the Operations Re-
search Society of Japan, 40(1997), pp.415-436.
[14] H.Yamashita, A primal-dual exact merit function for constrained optimization, Op-
timization { Modeling and Algorithms 8, Coop erative Research Report 84, The In-
stitute of Statistical Mathematics, March(1996), pp.119-127.
[15] H.Yamashita, A globally convergent primal-dual interior point method for con-
strainedoptimization, Optimization Methods and Software, 10(1998), pp.443-469.
[16] H.Yamashita and T.Tanab e, A primal-dual interior point trust region method for
large scale constrained optimization, Optimization { Modeling and Algorithms 6,
Co operative Research Report 73, The Institute of Statistical Mathematics, March
(1995), pp.1-25.
[17] H.Yamashita and H.Yab e, Superlinear and quadratic convergence of some primal-
dual interior p oint methods for constrained optimization, Mathematical Program-
ming,75 (1996),pp.377-397.
[18] H.Yamashita,H.Yab eand T.Tanab e,Aglobally andsuperlinearly convergentprimal-
dualinterior pointtrustregion methodfor largescaleconstrainedoptimization,Tech-
nical Report, 1997 July (revised 1998 July).