Finally, we apply the obtained results to the computation of the convergence rate of Krylov subspace methods when applied to a preconditioned linear system

(1)

CHEBYSHEV APPROXIMATION VIA POLYNOMIAL MAPPINGS AND THE CONVERGENCE BEHAVIOUR OF KRYLOV SUBSPACE METHODS^∗

BERND FISCHER^†ANDFRANZ PEHERSTORFER^‡

Abstract. Letϕmbe a polynomial satisfying some mild conditions. Given a setR⊂C, a continuous function fonRand its best approximationp^∗_n−1fromΠn−1with respect to the maximum norm, we show thatp^∗_n−1◦ϕm

is a best approximation tof◦ϕmon the inverse polynomial imageSofR, i.e.ϕm(S) =R, where the extremal signature is given explicitly. A similar result is presented for constrained Chebyshev polynomial approximation.

Finally, we apply the obtained results to the computation of the convergence rate of Krylov subspace methods when applied to a preconditioned linear system. We investigate pairs of preconditioners where the eigenvalues are contained in setsSandR, respectively, which are related byϕm(S) =R.

Key words. Chebyshev polynomial, optimal polynomial, extremal signature, Krylov subspace method, conver- gence rate.

AMS subject classifications. 41A10, 30E10, 65F10.

1. Notations and statement of the problem. LetR ⊂Cdenote a compact subset of the complex plane and letC(R)be the set of continuous functions onR. Forf ∈C(R)we denote bykfk^R:= maxz∈R |f(z)|the uniform norm onR. Furthermore, letg1, g2. . . , gn∈ C(R)be linearly independent functions with Vn := span{g1, g2. . . , gn}. Then the best approximationg^∗ off with respect toVn onR is the solution of the complex Chebyshev approximation problem

kf −g^∗kR= min

g∈Vnkf−gkR. (1.1)

It is well-known, thatg^∗ exists for anyRand is unique provided thatRcontains at leastn points.

Now, letΠn := {p(z) = Pn

j=0ajz^j| aj ∈ C}denote the set of all polynomials of degree up tonand letϕm∈Πm\Πm−1be a polynomial of exact degreem.

Randϕmmay be used to define the set (cf. Figure 1.1) S=S(R, ϕm) ={s∈C: ϕm(s)∈R}. In other words, we haveϕm(S) =RandS=ϕ⁻¹_m(R), respectively.

By construction, it is clear that

gmin∈Vnkf◦ϕm−g◦ϕmk^S= min

g∈Vnkf−gk^R.

However, ifVnis a space of polynomials, e.g.,Vn = Πn−1orVn = (z−c)Πn−1, one may ask the question whether even the following equations

kf −p^∗_n−1k^R= min

pn−1∈Πn−1kf−pn−1k^R

= min

pmn−1∈Πmn−1kf◦ϕm−pmn−1k^S =kf◦ϕm−p^∗_n−1◦ϕmk^S (1.2)

∗Received May 18, 1999. Accepted for publication June 22, 2001. Recommended by R. Freund.

†Institute of Mathematics, Medical University of L ¨ubeck, 23560 L ¨ubeck, Wallstraße 40, Germany. E-mail:

[email protected]

‡Institute for Analysis and Computational Mathematics, University of Linz, 4040 Linz, Altenbergerstraße, Aus- tria. This research was supported by the Austrian Fonds zur F ¨orderung der wissenschaftlichen Forschung, project- number P12985-TEC. E-mail:

[email protected] 205

(2)

−3 −2 −1 0 1 2 3

−3

−2

−1 0 1 2 3

−2 −1 0 1 2

−1.5

−1

−0.5 0 0.5 1 1.5

FIG. 1.1.(ϕ3)⁻¹(R)(left) for various ellipsesR(right) andϕ3(z) =z³+ 2z²+ 0.25z−3.

hold true. In Section 2, we will give a complete answer to this question. We remark that the case whereSis the inverse image of equipotential lines under a polynomial mapping has been considered in Peherstorfer [9].

Furthermore, we will investigate the convergence behavior of Krylov subspace methods when applied to the linear system

Ax=b, A ∈C^N^×^N.

Actually, this application, to a certain extent, provided the motivation for the discussion of Chebyshev approximation problems connected via the polynomial mappingϕm. As we will describe in Section 3, the convergence rate of Krylov subspace methods may be given in terms of so-called optimal polynomials. For a given parameterc6∈R(typicallyc= 0), such a polynomialP_n^Ris the solution of the constrained Chebyshev approximation problem

kP_n^RkR=k1−(z−c)q_n^∗−1(z)kR= min

qn−1∈Πn−1 k1−(z−c)qn−1(z)kR, (1.3)

where typicallyRis a set which contains all eigenvalues of the given matrixA.

When using iterative methods, preconditioning is an important issue. Here, one is look- ing for a preconditioning matrixMsuch that the new systemM⁻¹Ax =M⁻¹bis easier to solve than the original systemAx=b. LetSdenote a set which contains all eigenvalues ofM⁻¹A. Naturally, one is interested in studying the convergence properties of the preconditioned system as compared to the original system. It turns out that for certain classes of preconditioners and certain linear systems the eigenvalue inclusion sets and sometimes even the eigenvalues themselves are related viaR = ϕm(S). Thus, our analysis enables one to relate the corresponding convergence rates to each other. See Section 3 for details.

Finally, we note that it is often easier to compute the Chebyshev PolynomialT_n^Rthan the optimal polynomial. It is the solution of the Chebyshev approximation problem

kT_n^RkR=kzⁿ−p^∗_n−1kR= min

pn−1∈Πn−1 kzⁿ−pn−1(z)kR. (1.4)

Observe that the scaled Chebyshev polynomial T_n^R(z)/T_n^R(c) always provides an upper bound for the norm of the optimal polynomial

kP_n^Rk^R≤

T_n^R T_n^R(c)

_R

,

(3)

where equality holds for some sets R. Examples of such sets include single intervals (cf.

Markoff [8]) and certain ellipses (cf. Fischer and Freund [4]).

2. Best approximations and extremal signatures on inverse images of polynomial mappings. Throughout this section we assume thatϕm ∈ Πm\Πm−1 is a polynomial of degreemwith leading coefficientam. Moreover, let zj ∈ Cbe a given point, then we denote byzj,1, zj,2, . . . , zj,mthe zeros ofϕm(z)−zj. We always assume that these zeros are simple, i.e.,ϕmhas a full set of inverse brancheszj,l =ϕ⁻_m,l¹(zj). This assumption, for example, implies that we have the partial fraction expansion

1 ϕm(z)−zj

=

m

X

l=1

1 ϕ⁰_m(zj,l)

1 (z−zj,l). (2.1)

For the proof of our first theorem, the following well-known characterization of best approximation will be useful (cf. Rivlin and Shapiro [11]). The function g^∗ is a best approximation off with respect to Vn (cf. (1.1)) if, and only if, there exist r extremal pointsz1, z2. . . , zr ∈ {z ∈ R : |(f −g^∗)(z)| = kf −g^∗k^R} and positive numbers µ1, µ2, . . . , µr ∈ R₊ (r ≤ 2n+ 1in the complex case andr ≤ n+ 1in the real case) such that

r

X

j=1

µjsgn(f−g^∗)(zj)gk(zj) = 0 for k= 1,2, . . . , n.

(2.2)

The set{(zj, µjsgn(f −g^∗)(zj))|j = 1,2, . . . , r}is called an extremal signature for f−g^∗onRwith respect toVn.

THEOREM2.1. LetR⊂C,f ∈C(R), andS=S(R, ϕm)be given. Ifp^∗_n−1is the best approximation off with respect toΠn−1onR

kf−p^∗_n−1k^R= min

pn−1∈Πn−1 kf−pn−1k^R,

thenp^∗_n−1◦ϕmis the best approximation off◦ϕmwith respect toΠmn−1onS kf◦ϕm−p^∗_n−1◦ϕmkS = min

pmn−1∈Πmn−1 kf◦ϕm−pmn−1kS.

Furthermore, if{(zj, µjsgn(f−p^∗_n−1)(zj))|j = 1,2, . . . , r}is an extremal signature for f−p^∗_n−1onR, i.e.,

r

X

j=1

µjsgn(f−p^∗_n−1)(zj)z^k_j = 0, for k= 0,1, . . . , n−1, (2.3)

then

r

X

j=1 m

X

l=1

µjsgn(f−p^∗_n−1)(ϕm(zj,l))z_j,l^k = 0, for k= 0,1, . . . , nm−1,

where zj,1, zj,2, . . . , zj,m denote the zeros of ϕm(z)−zj, for j = 1,2, . . . , r. That is, {(zj,l, µj,lsgn(f−p^∗_n−1)(ϕm(zj,l)))|j = 1,2, . . . , r, l= 1,2, . . . , m}, withµj,l:=µj, is an extremal signature for(f−p^∗_n−1)◦ϕmonSwith respect toΠnm−1.

(4)

Proof. For convenience we set ˆ

µj:=µjsgn(f−p^∗_n−1)(ϕm(zj,l)) =µjsgn(f −p^∗_n−1)(zj).

We then have to show that

r

X

j=1

ˆ µj

m

X

l=1

z^k_j,l= 0, for k= 0,1, . . . , mn−1.

To start with, we note that by construction the pointszj,lare extremal points for(f−p^∗_n−1)◦ ϕm. Furthermore, in view of (2.1) and (2.3) we obtain

r

X

j=1

ˆ µj

m

X

l=1

1 ϕ⁰_m(zj,l)

1 (z−zj,l) =

r

X

j=1

ˆ µj

ϕm(z)−zj

=

r

X

j=1

ˆ µj

∞

X

k=0

z^k_j ϕm(z)^k+1

!

=

∞

X

k=0





r

X

j=1

ˆ µjz^k_j



 1 ϕm(z)^k+1

=O 1

z^m(n+1)

,

asz→ ∞. On the other hand, we have

r

X

j=1

ˆ µj

m

X

l=1

1 ϕ⁰_m(zj,l)

1 (z−zj,l) =

∞

X

k=0





r

X

j=1

ˆ µj

m

X

l=1

z_j,l^k ϕ⁰_m(zj,l)



 1 z^k+1, and consequently

r

X

j=1

ˆ µj

m

X

l=1

q(zj,l)

ϕ⁰_m(zj,l)= 0, for all q∈Πm(n+1)−2. For the special choiceq(z) =z^kϕ⁰_m(z)we obtain

r

X

j=1

ˆ µj

m

X

l=1

z^k_j,l= 0, for k= 0,1, . . . , mn−1, which concludes the proof.

Withf(z) =zⁿwe immediately arrive at the following corollary.

COROLLARY2.2. LetR⊂C, andS =S(R, ϕm)be given. IfT_n^Rdenotes the Cheby- shev polynomial with respect toR(cf. (1.4)), then the Chebyshev polynomial of degreemn with respect toSis given by

T_mn^S (z) = 1

aⁿ_mT_n^R(ϕm(z)).

We remark that one may also find Corollary2.2in Kamo and Borodin [6]. Their proof, however, is based on Kolmogorov’s criterion and therefore does not provide an extremal

(5)

signature forT_n^R◦ϕm. Furthermore, we note that the above theorem includes the result of Lebedev [7] for the case whereSis the union of finitely many intervals on the real line.

Next, we turn our attention to the computation of optimal polynomials or, more general, to the case of constrained Chebyshev approximation. For constrained Chebyshev problems on two and several intervals see, for example, Fischer [2], Peherstorfer and Schiefermayr [10]

and references therein.

The proof of the next theorem is along the lines of the proof of Theorem 2.1and is therefore omitted.

THEOREM 2.3. LetR ⊂ C, c 6∈ R, f ∈ C(R), andS = S(R, ϕm)be given. If Q^∗_n−1= (z−c)q_n^∗−1is the best approximation offwith respect to(z−c)Πn−1onR

kf−Q^∗_n−1k^R= min

qn−1∈Πn−1 kf−(z−c)qn−1k^R,

thenQ^∗_n−1◦ϕm= (ϕm(z)−c)q_n^∗−1◦ϕmis the best approximation off◦ϕmwith respect to(ϕm(z)−c)Πmn−1onS

kf◦ϕm−Q^∗_n−1◦ϕmkS = min

qmn−1∈Πmn−1 kf ◦ϕm−(ϕm(z)−c)qmn−1kS.

Furthermore, if{(zj, µjsgn(f(zj)−Q^∗_n−1)(zj))|j = 1,2, . . . , r}is an extremal signature forf−Q^∗_n−1onR, i.e.,

r

X

j=1

µjsgn(f(zj)−Q^∗_n−1)(zj)(zj−c)z_j^k = 0, for k= 0,1, . . . , n−1,

then, fork= 0,1, . . . , nm−1,

r

X

j=1 m

X

l=1

µjsgn(f(ϕm(zj,l))−Q^∗_n−1(ϕm(zj,l))(ϕm(zj,l)−c)z_j,l^k = 0,

where zj,1, zj,2, . . . , zj,m denote the zeros of ϕm(z)−zj, for j = 1,2, . . . , r. That is, {(zj,l, µj,lsgn(f(ϕm(zj,l))−Q^∗_n−1(ϕm(zj,l))))|j = 1,2, . . . , r, l = 1,2, . . . , m}, with µj,l:=µj, is an extremal signature forf◦ϕm−Q^∗n−1◦ϕmonSwith respect toΠnm−1.

The special casef(z) = 1gives rise to the following corollary.

COROLLARY2.4. LetR⊂C,c6∈R, andS =S(R, ϕm)be given. IfP_n^R= 1−(z− c)q^∗_n−1denotes the optimal polynomial with respect toRand toc(cf. (1.3)), then we have

kP_n^R◦ϕmkS= min

qmn−1∈Πmn−1 k1−(ϕm(z)−c)qmn−1kS.

It is worth noticing that the corollary above in general does not imply thatP_n^R◦ϕmis the optimal polynomial for the setS. This would be the case, if

kP_n^R◦ϕmk^S =k1−(ϕm(z)−c)q_n^∗−1◦ϕmk^S= min

qmn−1∈Πmn−1 k1−(z−e)qmn−1k^S, whereeis a zero ofϕm(z)−c.

In the remaining part of this section we will further investigate polynomial mappingsϕ2

of degree two. We start with the case whereS is the union of two disjoint intervals on the real line.

THEOREM2.5. LetR= [a, b], a < b∈ R, andc ∈R\[a, b]be given. Furthermore, letϕ2 be such thatS = S(R, ϕ2)is the union of two intervals and let e1, e2 denote the

(6)

zeros ofϕ2(z)−c. IfP_n^Rdenotes the optimal polynomial with respect toRand toc, then P_2n^S =P_n^R◦ϕ2is the optimal polynomial with respect to S and toe1ande2, respectively,

kP_n^R◦ϕ2kS = min

q2n−1∈Π2n−1 k1−(z−e1)q2n−1kS

= min

q2n−1∈Π2n−1 k1−(z−e2)q2n−1kS.

Moreover,P_2n^S is even optimal forΠ2n+1, i.e., we haveP_2n+1^S =P_2n^S.

Proof. LetT_n^Rdenote the Chebyshev polynomial with respect toR. By Corollary2.2we have that the Chebyshev polynomial with respect toSis given byT_2n^S(z) =T_n^R(ϕ2(z))/a²₂. SinceT_n^Ris a shifted version of the classical Chebyshev polynomials on the unit interval, it has preciselyn+ 1extremal points onR. Now, Theorem2.1implies thatT_2n^S has2(n+ 1) extremal points onS. Finally, the assertion follows from Theorem 4.3 in Fischer [2], which states that the optimal polynomial with respect toSis a scaled Chebyshev polynomial

P_2n^S(z) = T_2n^S(z)

T_2n^S(e1)= T_2n^S(z) T_2n^S(e2),

ifT_2n^S has2n+2extremal points onS. Actually, the fact thatT_2n^S has one additional extremal point also implies thatP_2n+1^S =P_2n^S, cf. Corollary 3.3.6(b) in Fischer [3].

Next, we analyze the special mappingsϕ^±₂(z) =±z(z−1).

THEOREM 2.6. LetR ⊂ C,0 6∈ R, andS^± = S(R, ϕ^±₂)withϕ^±₂(z) = ±z(z−1) be given. IfP_n^Rdenotes the optimal polynomial with respect toRand to0, then the optimal polynomials of degree2nand2n+ 1with respect toS^± and to0and1, respectively, are given by

P_2n+1^S^± =P_2n^S^± =P_n^R◦ϕ^±₂ with

kP_2n+1^S⁺ kS⁺=kP_2n^S⁺kS⁺=kP_n^RkR=kP_2n^S⁻kS⁻ =kP_2n+1^S⁻ kS⁻.

Proof. We only consider the caseϕ2 =ϕ⁺₂ withS =S⁺. We start by noting that the polynomialϕ2satisfies the symmetry relationϕ2(l(z)) =ϕ2(z)withl(z) = 1−z. Hence, by construction, we haveS =l(S). From the fact that the optimal polynomial is uniquely defined and

maxz∈S |P_k^S(z)|= max

z∈S |P_k^S(l(z))|,

we obtain the identityP_k^S(z) =P_k^S(l(z)). For odd degree polynomials this symmetry relation implies that the leading coefficient vanishes, becauseP_2n+1^S (z) =α2n+1z²ⁿ⁺¹+· · ·and P_2n+1^S (l(z)) =α2n+1(1−z)²ⁿ⁺¹+· · ·=−α2n+1z²ⁿ⁺¹+· · ·. Now, let us consider the even degree casek= 2n. Here, we deduce from the symmetry relationP_2n^S(z) =P_2n^S(l(z))that all zeros ofP_2n^S come in pairsP_2n^S(zj) = P_2n^S(l(zj)) = 0, j = 1,2, . . . , n. Letqn denote the polynomial with the zerosyj = ϕ2(zj), j = 1,2, . . . , n, and constant term 1. Since P_2n^S −qn◦ϕ2∈Π2nhas2n+ 1zeros, namelyzj, l(zj), j= 1,2, . . . , nandzn+1= 0, we conclude thatP_2n^S =qn◦ϕ2, which proves the assertion.

(7)

3. Application. In this section we will analyze the convergence of iterative methods when applied to the solution of a non-symmetric, nonsingular linear system

Ax=b, A ∈C^N×N.

Throughout this section we assume for convenience thatAis diagonalizable. Some of the most effective iterative methods available are those of Krylov subspace type which have in- built minimization properties. Here we consider minimal residual methods, whose iterates minimize the Euclidian norm of the residualr_nat each step. More precisely, for the methods under consideration thenth residualrnmay be written as

rn=b− Axn=pn(A)r0,

wherepn ∈ Πn is a polynomial of exact degreen satisfying the interpolatory constraint pn(0) = 1. The polynomialpnand consequently the iteratexn are uniquely determined by the minimization property

krnk²= min{kp(A)r0k²; p∈Πn, p(0) = 1}.

The actual implementation of such a method depends on the properties of the given coefficient matrixA. Among the most well-known methods belonging to this class are CR (forA symmetric and positive definite), MINRES (forAsymmetric but indefinite) and GMRES (for Anon-symmetric) (cf., e.g., Saad [12]).

LetV denote the eigenvector matrix of A. Then the 2-norm of the residual may be bounded by the following standard estimate

kr_nk2

kr₀k² ≤κ2(V) min

p∈Πn, p(0)=1 max

λ∈σ(A)|p(λ)|=κ2(V)kP_n^σ(^A⁾kσ(A), (3.1)

whereσ(A)denotes the spectrum ofAandκ2(V)denotes the condition number of the eigenvector matrixV. In conclusion, to bound the convergence rate of minimal residual methods one is tempted to compute the norm of the optimal polynomialPn^σ(A)(cf. (1.3)) with respect toσ(A)and to0.

In the remaining part of this section we will identify matricesAand preconditionerM such that the spectrum ofAand its preconditioned versionM⁻¹Aare related by a polynomial mapping

ϕk(σ(A)) =σ(M⁻¹A), with

P_kn^σ(^A⁾=P_n^σ(^M⁻¹^A⁾◦ϕk, and

kP_kn^σ(Â⁾kσ(A)=kP_n^σ(^M⁻¹Â⁾◦ϕkkσ(A)=kP_n^σ(^M⁻¹Â⁾kσ(M⁻¹A).

These equations together with (3.1) have an important consequence. They indicate, at least in the absence of roundoff errors, that a minimal residual method converges aboutk-times faster for the preconditioned system than for the original system.

Let us now consider linear systems of the form A=

A B^T

B 0

∈C^(n+m)^×^(n+m), (3.2)

(8)

where then×nmatrixAis regular and them×n,m < n, matrixB has full rank. The efficient solution of such systems plays an important role in many different applications. For example, the mixed finite element approximation of the Stokes problem (cf. [5]) leads to a system of the form (3.2) with symmetric positive definiteA, whereas the discretization of the Oseen problem (cf. [1]) results in a matrix problem with non-symmetricA. Other examples include linear least squares problems and linear KKT systems.

Given the structure ofA, an obvious choice is to use block preconditioning. Furthermore, often a fast solver for the top-left blockAis available. Therefore, preconditioners of the form

M(F, M) =

A F

0 M

, (3.3)

whereM is nonsingular are quite popular (cf. [1]). To provide an efficient implementation of the preconditioner, common choices for the top-right blockF areF = 0andF = B^T, respectively.

In general, there is some flexibility in applying the preconditioner. One may consider a left M⁻¹A, a right AM⁻¹, or a centrally preconditioned coefficient matrix M^−1/2AM^−1/2, respectively. For the latter case the matrixMis assumed to be positive definite. However, it is easy to see, that the three preconditioned matrices share the same eigenvalues. They are given by the solution of the generalized eigenvalue problem

Av=λMv.

(3.4)

The next lemma relates the wanted solutions λ of (3.4) for the specific choices M(0,±M)andM(B^T,−M)to the generalized eigenvaluesµof the so-called Schur complement

BA⁻¹B^Tp=µMp.

(3.5)

In particular, it shows that the two sets of eigenvalues are related by a polynomial mapping.

The proof is straightforward and might be found in Elman and Silvester [1].

LEMMA3.1. Letµk,k= 1,2, . . . , m, denote the solutions of the generalized eigenvalue problem (3.5).

(i) Letϕ⁺₂(z) = z(z−1). The solutions ofAv = λM(0, M)vare λ0 = 1, of multiplicityn−m, and

λ^±_k = 1±√ 1 + 4µk

2 , i.e., ϕ⁺₂(λ^±_k) =µk, k= 1,2, . . . , m.

(ii) Letϕ⁻₂(z) =−z(z−1). The solutions ofAv = λM(0,−M)vareλ0 = 1, of multiplicityn−m, and

λ^±_k =1±√ 1−4µk

2 , i.e., ϕ⁻₂(λ^±_k) =µk, k= 1,2, . . . , m.

(iii) The solutions ofAv=λM(B^T,−M)vareλ0= 1, of multiplicityn, and λk=µk, k= 1,2, . . . , m.

Let us now investigate the case that the matricesAandM are symmetric and positive definite. Here, it is easy to see that the corresponding matrix Ais symmetric but indefinite and that the preconditionerM(0, M)is symmetric and positive definite. It follows from Lemma3.1(i) that the preconditioned matrixA⁺ = M(0, M)^−1/2AM(0, M)^−1/2

(9)

is indefinite and that its eigenvalues are, apart from λ = 1, symmetric about the point 1/2. On the other hand, the alternative preconditionerM(0,−M)is itself indefinite and therefore may not be applied centrally. As a consequence, the preconditioned system A⁻ = M(0,−M)⁻¹A (orA⁻ = AM(0,−M)⁻¹) is no longer symmetric. However, all its eigenvalues are located in the right-half plane. Lemma3.1(ii) shows that apart from λ= 1, all of them are either located on a cross with center1/2or on a real interval.

There has been some discussion in the literature (cf. [5]) on whether MINRES applied to the symmetric indefinite systemA⁺or GMRES applied to one of the non-symmetric systems A⁻ will converge faster. The next theorem shows that at least the convergence rates are identical. The proof is a direct consequence of (3.1), Lemma3.1, and Theorem2.6.

THEOREM3.2. LetV⁻denote the eigenvector matrix ofA⁻and letr0= (0, v)^T with v ∈ R^m. Furthermore, letR = {µj : j = 1,2, . . . m}denote the generalized eigenvalues of the Schur complement (3.5) and letS^± =σ(A^±)\ {1}. Then the 2-norm of the residuals r⁺_k =p⁺_k(A⁺)r₀andr⁻_k =p⁻_k(A⁻)r₀may be estimated as follows

kr⁺_kk²

kr₀k2 ≤ kP_2n^S⁺kS⁺=kP_n^RkR, kr⁻_kk2

kr0k² ≤κ2(V⁻)kP_2n^S⁻kS⁻=κ2(V⁻)kP_n^Rk^R, fork∈ {2n,2n+ 1}.

Some remarks are in place. The above theorem is essentially a result on polynomial approximation. It says nothing about the performance of MINRES and GMRES in finite precision arithmetic. However, all our numerical test runs (cf. also [5]) have shown precisely the predicted convergence behavior. Apart from the constant factorsκ2(V⁻)the bounds of the preceding theorem are the same. Note thatκ2(V⁺) = 1. Actually, an analysis based on orthogonal polynomials shows that even the iteratesx⁺_2n+1 = x⁺_2n =x⁻_2n =x⁻_2n+1are the same. Furthermore, the special choicer0= (0, v)^T ensures that no eigendirection associated with the eigenvalue1exists in the initial residual (for details see [5]).

0 0.2 0.4 0.6 0.8 1

−0.25

−0.2

−0.15

−0.1

−0.05 0 0.05 0.1 0.15 0.2

−2 0 2 4

−0.1

−0.05 0 0.05 0.1 0.15

FIG. 3.1.(ϕ⁺₂)⁻¹(E)(left) and(ϕ⁻₂)⁻¹(E)(right) for various ellipsesE.

The above analysis is based on the fact that the top-left blockAin (3.2) is inverted in exact arithmetic. In practice, however, only an approximation to the inverse is available. Here the spectra of the preconditioned matrices are contained in regions which do includeS^±. By Lemma3.1we haveS^± = (ϕ^±₂)⁻¹(R). Figure 3.1 shows, apart fromS^±, the preimage of

(10)

ϕ^±₂ of ellipsesE containing the setR. With the help of Theorem2.6it is again possible to explicitly compute the optimal polynomials with respect to the depicted enlarged versions of S^±.

Finally, we turn our attention to the non-symmetric case. That is, we assume that the matrixAin (3.2) is non-symmetric. Following Elman and Silvester [1] we investigate the block diagonal preconditioned systemA^BD =AM(0, M)⁻¹and the block triangular pre- conditoned systemA^BT =AM(B^T,−M)⁻¹, respectively. We learn from Lemma3.1that, apart from the eigenvalueλ= 1, the eigenvaluesλofA^BD and the eigenvaluesµofA^BT are related by the quadratic mappingµ=ϕ2(λ)withϕ2(z) =z(z−1).

Actually, for the Oseen operator Elman and Silvester [1] proved that all eigenvalues ofA^BT are contained in a rectangular boxR in the right half plane. Consequently, the eigenvalues ofA^BD are contained in setsS which are the preimages ofR. The next figure displays some typical inclusion sets.

−2 0 2 4

−5 0 5

−2 0 2 4

−1.5

−1

−0.5 0 0.5 1 1.5

FIG. 3.2.S(left) andR=ϕ2(S)(right) for different setsR.

Also, Elman and Silvester reported on some numerical test runs with these preconditioners. They observed that GMRES applied toA^BD took twice as many iterations as GMRES applied toA^BT to reach the same reduction of the initial norm. Moreover, the norm of the odd iterates stagnates for the diagonal preconditioning.

The next theorem provides an explanation for their observations in terms of the respec- tive convergence rates. Again, the proof follows directly from (3.1), Lemma3.1, and Theo- rem2.6.

THEOREM 3.3. LetV^BD andV^BT denote the eigenvector matrix ofA^BD andA^BT, respectively and letr₀= (0, v)^T withv∈R^m. Furthermore, letR={µj :j = 1,2, . . . m} denote the generalized eigenvalues of the Schur complement (3.5) and letS^BD=σ(A^BD)\ {1}. Then the 2-norm of the residualsr^BD_k =p^BD_k (A^BD)r₀andr^BT_k =p^BT_k (A^BT)r₀may be estimated as follows

kr^BD_k k2

kr₀k² ≤κ2(V^BD)kP_2n^S^BDkS^BD =κ2(V^BD)kP_n^RkR, k∈ {2n,2n+ 1}, kr^BT_n k2

kr0k² ≤κ2(V^BT)kP_n^RkR.

Acknowledgments. Part of this work was done while the first author was visiting the Laboratoire d’Analyse Num´erique et d’Optimisation at the Universit´e des Sciences et Tech-

(11)

nologies de Lille. He would like to thank all members of the Institute for their warm hospi- tality. Finally, we thank the reviewers for helpful suggestions and comments.

REFERENCES

[1] H.C. ELMAN ANDD.J. SILVESTER, Fast Nonsymmetric iterations and preconditioning for Navier-Stokes equations, SIAM J. Sci. Comput., 17 (1996), pp. 33-46.

[2] B. FISCHER, Chebyshev polynomials for disjoint compact sets, Constr. Approx., 8 (1992), pp. 309-329.

[3] B. FISCHER, Polynomial Based Iteration Methods for Symmetric Linear Systems, Wiley-Teubner, Chicester, Stuttgart, 1996.

[4] B. FISCHER ANDR. FREUND, On the constraint Chebyshev approximation problem on ellipses, J. Approx.

Theory, 62 (1990), pp. 297-315.

[5] B. FISCHER, A. RAMAGE, D.J. SILVESTER,ANDA.J. WATHEN, Minimum residual methods for augmented systems, BIT, 38 (1998), pp. 527-543.

[6] S.O. KAMO AND P.A. BORODIN, Chebyshev polynomials for Julia sets, Moscow Univ. Math. Bull., 49 (1994), pp. 44-45.

[7] V.I. LEBEDEV, Iterative methods for solving operator equations with a spectrum contained in several inter- vals, Comput. Math. Math. Phys., 9 (1969), pp. 17-24.

[8] W. MARKOFF, ¨Uber Polynome, die in einem gegebenen Intervalle m¨oglichst wenig von Null abweichen, Math.

Ann., 77 (1916), pp. 213-258.

[9] F. PEHERSTORFER, Minimal polynomials for compact sets of the complex plane, Constr. Approx., 12 (1996), pp. 481-488.

[10] F. PEHERSTORFER ANDK. SCHIEFERMAYR, Theoretical and numerical description of extremal polynomials on several intervals II, Acta Math. Hungar., 83 (1999), pp. 103-128.

[11] T.J. RIVLIN ANDH.S. SHAPIRO, A unified approach to certain problems of approximation and minimization, J. Soc. Indust. Appl. Math., 9 (1961), pp. 670-699.

[12] Y. SAAD, Iterative Methods for Sparse Linear Systems, PWS, Boston, 1996.