The Fréchet Derivative of an Analytic Function of a Bounded Operator with Some Applications


Volume 2009, Article ID 239025, 17 pages, doi:10.1155/2009/239025

Research Article


D. S. Gilliam,¹ T. Hohage,² X. Ji,³ and F. Ruymgaart¹

¹ Department of Mathematics and Statistics, Texas Tech University, Lubbock, TX 79409, USA

² Institute for Numerical and Applied Mathematics, University of Göttingen, 37083 Göttingen, Germany

³ Department of Mathematics, Utah Valley University, Orem, UT 84058, USA

Correspondence should be addressed to D. S. Gilliam, [email protected]

Received 7 June 2008; Accepted 15 January 2009

Recommended by Petru Jebelean

The main result in this paper is the determination of the Fréchet derivative of an analytic function of a bounded operator, tangentially to the space of all bounded operators. Some applied problems from statistics and numerical analysis are included as a motivation for this study. The perturbation operator (increment) is not of any special form and is not supposed to commute with the operator at which the derivative is evaluated. This generality is important for the applications.

In the Hermitian case, moreover, some results on perturbation of an isolated eigenvalue, its eigenprojection, and its eigenvector if the eigenvalue is simple, are also included. Although these results are known in principle, they are not in general formulated in terms of arbitrary perturbations as required for the applications. Moreover, these results are presented as corollaries to the main theorem, so that this paper also provides a short, essentially self-contained review of these aspects of perturbation theory.

Copyright © 2009 D. S. Gilliam et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. Introduction

Motivated by certain applications in numerical analysis and, in particular, statistics, this paper deals with the Fréchet derivative of an analytic function $\varphi$ of a bounded linear operator $T$ on a separable Hilbert space $H$ in the sense of the usual functional calculus, tangentially to the Banach space $\mathcal{L}$ of all bounded linear operators mapping $H$ into itself. More precisely, a first order approximation to the difference

$$\varphi(\tilde{T}) - \varphi(T), \qquad \tilde{T} = T + \Pi, \quad \Pi \in \mathcal{L}, \tag{1.1}$$

is obtained, including the order of magnitude of the remainder. An example of such a function $\varphi$ is a generalized or regularized inverse of the square root

$$\varphi(T) = (\alpha I + T)^{-1/2}, \qquad \alpha > 0, \tag{1.2}$$

where $I$ is the identity operator. Once the Fréchet derivative has been established (Section 2), it yields the asymptotic distribution of functions of certain random operators via an ensuing "delta method": a well-known statistical technique (see Section 4).

Clearly $\tilde{T}$ can be regarded as a perturbed version of $T$, and it is not surprising that perturbation methods are employed to obtain the desired result. The authors are aware of the possibility that the rather straightforward result on the Fréchet derivative might be hidden somewhere in the rich literature on perturbation theory [1–3]. Yet they have not been successful in identifying a reference that states the result in its present form, tailored to the applications they have in mind. Some remarks are particularly in order.

(a) The perturbations $\Pi$ are typically of small norm but otherwise arbitrary (bounded or Hermitian). In the literature, they are often of the form

$$\Pi = \varepsilon T_1 + \varepsilon^2 T_2 + \cdots \tag{1.3}$$

for operators $T_1, T_2, \ldots$, and a small number $\varepsilon$. In statistics, there is no point in representing the perturbation in such a form.

(b) The perturbation $\Pi$ and the operator $T$ are not assumed to commute, because in our applications such an assumption would not in general be fulfilled. If the operators do commute, however, the Fréchet derivative would reduce to $\varphi'(T)$, in the sense of functional calculus with $\varphi'$ the derivative of $\varphi$. In the case considered here, the actual Fréchet derivative and $\varphi'(T)$ may differ considerably.

(c) A central theme in perturbation theory concerns the perturbation of an isolated eigenvalue and corresponding eigenprojection (see, e.g., the references mentioned before). Some of the results are included, because they can be easily derived from the main result on the Fréchet derivative by choosing a special function $\varphi$ (Section 3). In this way, the paper presents a concise and essentially self-contained review of some basic results in this area. They are again presented in terms of a general (Hermitian) perturbation $\Pi$, as required for statistical application, in the same vein as, but somewhat more general than, Dauxois et al. [4].

As has already been mentioned in the beginning, $H$ will be a separable Hilbert space and $\mathcal{L}$ the Banach space of all bounded linear operators mapping $H$ into itself. The inner product on $H$ will be denoted by $\langle \cdot, \cdot \rangle$ and the norm by $\|\cdot\|$. The norm on $\mathcal{L}$ will be written $\|\cdot\|_{\mathcal{L}}$, and the notation $\mathcal{L}_H$ and $\mathcal{C}_H$ will be used to denote the subspace of all Hermitian and all compact Hermitian operators, respectively.

We will exclusively deal with infinite dimensional Hilbert spaces and will not attempt to include the simpler finite dimensional case in our formulation. The Fréchet derivative for arbitrary perturbations is well known in the finite dimensional matrix case. This result and further references can be found in the recent monograph by Bhatia [5]. In the finite dimensional case, this derivative is also implicitly present in Theorem 2.1 of Ruymgaart and Yang [6] to obtain the asymptotic distribution of a function of a random matrix.


2. The Fréchet Derivative

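Let us fix an arbitrary $T \in \mathcal{L}$ with spectrum $\sigma(T)$ and a bounded open region $\Omega \subset \mathbb{C}$ in the complex plane with smooth boundary $\Gamma = \partial\Omega$, such that

$$\sigma(T) \subset \Omega, \qquad \delta_\Gamma = \operatorname{dist}(\Gamma, \sigma(T)) > 0. \tag{2.1}$$

Furthermore, let us consider functions of type

$$\varphi : D \to \mathbb{C}, \quad \text{analytic}, \tag{2.2}$$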

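where $D \supset \overline{\Omega}$ is an open neighborhood of $\overline{\Omega}$. Let us write

$$M_\Gamma = \max_{z \in \Gamma} |\varphi(z)| < \infty, \qquad L_\Gamma = \text{length of } \Gamma < \infty. \tag{2.3}$$

The resolvent

$$R(z) = (zI - T)^{-1}, \qquad z \in \rho(T), \tag{2.4}$$

is analytic on the resolvent set $\rho(T) = [\sigma(T)]^c$, and the operator

$$\varphi(T) = \frac{1}{2\pi i} \oint_\Gamma \varphi(z) R(z)\, dz \tag{2.5}$$

is well defined. This relation establishes an algebra homomorphism [7, Section 17.2] which implies in particular that

$$\varphi(T)\psi(T) = (\varphi\psi)(T), \tag{2.6}$$

if $\psi : D \to \mathbb{C}$ is also analytic. In particular, we have

$$T = \frac{1}{2\pi i} \oint_\Gamma z R(z)\, dz. \tag{2.7}$$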
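As a finite dimensional illustration of the functional calculus in (2.5), the contour integral can be approximated by the trapezoidal rule on a circle enclosing the spectrum. The sketch below is only an assumption-laden example: the 2×2 symmetric matrix, the choice $\varphi(z) = z^2$, the circle (center 2.5, radius 3), and all helper names are hypothetical and serve only to show that the integral reproduces $T^2$.

```python
import cmath
import math

def inv2(m):
    """Inverse of a 2x2 (possibly complex) matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def resolvent(T, z):
    """R(z) = (zI - T)^{-1} for a 2x2 matrix T."""
    return inv2([[z - T[0][0], -T[0][1]], [-T[1][0], z - T[1][1]]])

def phi_of_T(phi, T, center, radius, n_nodes=200):
    """Trapezoidal approximation of (1/(2*pi*i)) * contour integral of phi(z) R(z) dz."""
    acc = [[0j, 0j], [0j, 0j]]
    for k in range(n_nodes):
        w = radius * cmath.exp(2j * math.pi * k / n_nodes)  # w = z - center
        z = center + w
        fz = phi(z)
        R = resolvent(T, z)
        for i in range(2):
            for j in range(2):
                # dz = i*w*dtheta, hence dz / (2*pi*i) = w / n_nodes per node
                acc[i][j] += fz * R[i][j] * w / n_nodes
    return acc

# Hypothetical example: eigenvalues (5 +/- sqrt(5))/2 lie inside the circle.
T = [[2.0, 1.0], [1.0, 3.0]]
approx = phi_of_T(lambda z: z * z, T, center=2.5, radius=3.0)
exact = [[5.0, 5.0], [5.0, 10.0]]  # T squared, computed by hand
max_err = max(abs(approx[i][j] - exact[i][j]) for i in range(2) for j in range(2))
print(max_err)
```

Because the integrand is analytic in a neighborhood of the circle, the trapezoidal rule converges very quickly here, so a modest number of nodes suffices.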

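The operators

$$\varphi'_T \Pi = \frac{1}{2\pi i} \oint_\Gamma \varphi(z) R(z) \Pi R(z)\, dz, \tag{2.8}$$

$$S_{\varphi,\Pi} = \frac{1}{2\pi i} \oint_\Gamma \varphi(z) R(z) (\Pi R(z))^2 (I - \Pi R(z))^{-1}\, dz \tag{2.9}$$

are well defined for every $\Pi \in \mathcal{L}$ sufficiently small. Note that according to Dunford and Schwartz [8, Lemma VII.6.11], there is a constant $0 < K < \infty$, such that

$$\|R(z)\|_{\mathcal{L}} \le \frac{K}{\delta_\Gamma}, \qquad \forall z \in \Omega^c. \tag{2.10}$$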

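Theorem 2.1 (Fréchet derivative). Let $T \in \mathcal{L}$ and suppose that $\varphi$ satisfies (2.2). Then $\varphi$ maps the neighborhood $\{\tilde{T} = T + \Pi : \Pi \in \mathcal{L},\ \|\Pi\|_{\mathcal{L}} \le c\,\delta_\Gamma / K, \text{ for some } 0 < c < 1\}$ into $\mathcal{L}$, when defined in the usual way of functional calculus. This mapping is Fréchet differentiable at $T$, tangentially to $\mathcal{L}$, with bounded derivative $\varphi'_T : \mathcal{L} \to \mathcal{L}$, as defined by (2.8). More specifically, we have

$$\varphi(T + \Pi) = \varphi(T) + \varphi'_T \Pi + S_{\varphi,\Pi}, \tag{2.11}$$

where $S_{\varphi,\Pi}$ is defined in (2.9) and

$$\|\varphi'_T \Pi\|_{\mathcal{L}} \le \frac{1}{2\pi} M_\Gamma L_\Gamma \left(\frac{K}{\delta_\Gamma}\right)^{2} \|\Pi\|_{\mathcal{L}}, \tag{2.12}$$

$$\|S_{\varphi,\Pi}\|_{\mathcal{L}} \le \frac{1}{2(1-c)\pi} M_\Gamma L_\Gamma \left(\frac{K}{\delta_\Gamma}\right)^{3} \|\Pi\|^2_{\mathcal{L}}. \tag{2.13}$$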

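Proof. For $\varphi$ to be well defined on the neighborhood, let us first show that

$$\sigma(\tilde{T}) \subset \Omega, \qquad \forall\, \Pi \in \mathcal{L} \text{ with } \|\Pi\|_{\mathcal{L}} \le c\,\frac{\delta_\Gamma}{K}. \tag{2.14}$$

To verify this, note that by (2.10) we have $\|\Pi R(z)\|_{\mathcal{L}} \le c < 1$ for such $\Pi$. Consequently, the operator

$$R(z)(I - \Pi R(z))^{-1} = R(z)\{(R(z)^{-1} - \Pi) R(z)\}^{-1} = (zI - T - \Pi)^{-1} = \tilde{R}(z) \tag{2.15}$$

is bounded for each $z \in \Omega^c$, which entails (2.14). Hence,

$$\varphi(\tilde{T}) = \frac{1}{2\pi i} \oint_\Gamma \varphi(z) \tilde{R}(z)\, dz \tag{2.16}$$

is well defined for $\Pi$ with $\|\Pi\|_{\mathcal{L}} \le c\,\delta_\Gamma / K$.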

Applying a Neumann series expansion [9, Section 5.2] to the inverse on the left in (2.15), we obtain

$$\tilde{R}(z) = R(z)\{I + \Pi R(z) + (\Pi R(z))^2 + \cdots\} = R(z) + R(z)\Pi R(z) + R(z)(\Pi R(z))^2 (I - \Pi R(z))^{-1}, \tag{2.17}$$

just as in Watson [10] for matrices. Term-wise integration yields (2.11).

The upper bounds in (2.12) and (2.13) are immediate from (2.8) and (2.9), respectively, by exploiting (2.3) and (2.15). The boundedness of $\varphi'_T$ as a linear operator mapping $\mathcal{L}$ into itself follows at once from (2.12).
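The quadratic order of the remainder in (2.11) can be observed numerically in a small matrix example. The sketch below assumes the function $\varphi(z) = (\alpha + z)^{-1}$, for which the expansion (2.17) evaluated at $z = -\alpha$ gives the closed form $\varphi'_T \Pi = -(\alpha I + T)^{-1} \Pi (\alpha I + T)^{-1}$; the matrices, the value of $\alpha$, and the helper names are hypothetical choices made only for illustration.

```python
def mat_mul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mat_add(A, B, s=1.0):
    """A + s * B for 2x2 matrices."""
    return [[A[i][j] + s * B[i][j] for j in range(2)] for i in range(2)]

def inv2(M):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def frob(M):
    """Frobenius norm, used here as a convenient matrix norm."""
    return sum(x * x for row in M for x in row) ** 0.5

ALPHA = 0.5
I2 = [[1.0, 0.0], [0.0, 1.0]]

def phi(T):
    """phi(T) = (alpha I + T)^{-1}."""
    return inv2(mat_add(T, I2, ALPHA))

def derivative(T, Pi):
    """Frechet derivative of phi at T applied to Pi: -(alpha I + T)^{-1} Pi (alpha I + T)^{-1}."""
    A = phi(T)
    return [[-x for x in row] for row in mat_mul(mat_mul(A, Pi), A)]

T = [[2.0, 0.0], [0.0, 1.0]]
D = [[0.3, 0.5], [0.5, -0.2]]  # perturbation direction; it does not commute with T

def remainder_norm(eps):
    """Norm of phi(T + Pi) - phi(T) - phi'_T Pi for Pi = eps * D."""
    Pi = [[eps * x for x in row] for row in D]
    diff = mat_add(phi(mat_add(T, Pi)), phi(T), -1.0)
    return frob(mat_add(diff, derivative(T, Pi), -1.0))

r1, r2 = remainder_norm(1e-2), remainder_norm(5e-3)
ratio = r1 / r2  # close to 4 when the remainder is of order ||Pi||^2
print(r1, r2, ratio)
```

Halving the perturbation should divide the remainder by roughly four, which is what the bound (2.13) predicts.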

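Remark 2.2. It will be seen in Section 4 that for the applications we have in mind it is important that we do not require that $T$ and $\Pi$ commute. If they do, however, it is clear that the Fréchet derivative in (2.8) reduces to

$$\varphi'_T \Pi = \left\{\frac{1}{2\pi i} \oint_\Gamma \varphi(z) R^2(z)\, dz\right\} \Pi. \tag{2.18}$$

It has been shown in Dunford and Schwartz [8, proof of Theorem VII.6.10] that

$$\frac{1}{2\pi i} \oint_\Gamma \varphi(z) R^2(z)\, dz = \frac{1}{2\pi i} \oint_\Gamma \varphi'(z) R(z)\, dz. \tag{2.19}$$

Combination of (2.11) with (2.18) and (2.19) yields

$$\varphi(T + \Pi) = \varphi(T) + \varphi'(T)\Pi + O\left(\|\Pi\|^2_{\mathcal{L}}\right), \tag{2.20}$$

writing, for any $r > 0$,

$$O\left(\|\Pi\|^r_{\mathcal{L}}\right), \quad \text{as } \|\Pi\|_{\mathcal{L}} \to 0, \tag{2.21}$$

to indicate any quantity (operator, vector, number) whose norm or absolute value is of the given order. Note that in (2.20) the operator $\varphi'(T)$ is to be understood in the sense of the usual functional calculus as in (2.5) with $\varphi$ replaced by its derivative $\varphi'$.

In this situation of commuting operators, Dunford and Schwartz [8] obtain the Taylor expansion

$$\varphi(T + \Pi) = \sum_{n=0}^{\infty} \frac{\varphi^{(n)}(T)}{n!} \Pi^n, \tag{2.22}$$

which implies, of course, (2.20).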

Keeping the perturbation as before, we now restrict $T$ to the class $\mathcal{C}_H$ of compact Hermitian operators. The bounded and countable spectrum consists of the number 0, whether an eigenvalue or not, and all the nonzero eigenvalues $\lambda_1, \lambda_2, \ldots \in \mathbb{R}$. In this work, we avoid technical issues related to $\lambda = 0$ being an eigenvalue, and assume that $T$ is one-to-one, that is, $Tf = 0$ implies that $f = 0$. It is well known [7] that such a $T$ can be represented as

$$T = \sum_{j=1}^{\infty} \lambda_j P_j, \tag{2.23}$$

where the $P_j$ are the corresponding orthogonal eigenprojections onto the mutually orthogonal finite dimensional eigenspaces. These projections provide a resolution of the identity in $H$, that is,

$$I = \sum_{j=1}^{\infty} P_j. \tag{2.24}$$

The resolvent has the expansion

$$R(z) = \sum_{j=1}^{\infty} \frac{1}{z - \lambda_j} P_j, \qquad z \in \rho(T). \tag{2.25}$$

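Corollary 2.3. Let the conditions of Theorem 2.1 be fulfilled for $T \in \mathcal{C}_H$ with expansion (2.23). In this case the Fréchet derivative $\varphi'_T : \mathcal{L} \to \mathcal{L}$ is given by

$$\varphi'_T \Pi = \sum_{j} \varphi'(\lambda_j) P_j \Pi P_j + \mathop{\sum\sum}_{j \ne k} \frac{\varphi(\lambda_k) - \varphi(\lambda_j)}{\lambda_k - \lambda_j} P_j \Pi P_k, \qquad \Pi \in \mathcal{L}. \tag{2.26}$$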

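Proof. Let us substitute the expansion (2.25) for $R(z)$ into the expression for $\varphi'_T \Pi$ in (2.8). Application of the partial fraction method yields

$$\begin{aligned}
\varphi'_T \Pi &= \sum_{j} \sum_{k} \left\{\frac{1}{2\pi i} \oint_\Gamma \frac{\varphi(z)}{(z - \lambda_j)(z - \lambda_k)}\, dz\right\} P_j \Pi P_k \\
&= \sum_{j} \left\{\frac{1}{2\pi i} \oint_\Gamma \frac{\varphi(z)}{(z - \lambda_j)^2}\, dz\right\} P_j \Pi P_j \\
&\quad + \mathop{\sum\sum}_{j \ne k} \frac{1}{\lambda_k - \lambda_j} \left\{\frac{1}{2\pi i} \oint_\Gamma \left(\frac{\varphi(z)}{z - \lambda_k} - \frac{\varphi(z)}{z - \lambda_j}\right) dz\right\} P_j \Pi P_k.
\end{aligned} \tag{2.27}$$

The right-hand side of (2.27) reduces at once to the expression on the right in (2.26) by an application of Cauchy's integral formula.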
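For a diagonal matrix the formula (2.26) amounts to multiplying each entry of $\Pi$ by either a derivative or a divided difference. The following sketch, a finite dimensional illustration under the assumption of distinct eigenvalues, compares (2.26) for $\varphi(z) = z^3$ with the derivative $T^2\Pi + T\Pi T + \Pi T^2$ obtained from the product rule; the eigenvalues, the perturbation, and the function name `frechet_diag` are hypothetical.

```python
def frechet_diag(phi, dphi, lams, Pi):
    """Matrix of (2.26) when T = diag(lams) has distinct eigenvalues.

    With P_j the coordinate projections, P_j Pi P_k keeps only the entry Pi[j][k],
    so that entry is multiplied by phi'(lam_j) if j == k and by the divided
    difference (phi(lam_k) - phi(lam_j)) / (lam_k - lam_j) otherwise.
    """
    n = len(lams)
    out = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(n):
            if j == k:
                coef = dphi(lams[j])
            else:
                coef = (phi(lams[k]) - phi(lams[j])) / (lams[k] - lams[j])
            out[j][k] = coef * Pi[j][k]
    return out

def mat_mul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

lams = [3.0, 2.0, 0.5]
T = [[lams[i] if i == j else 0.0 for j in range(3)] for i in range(3)]
Pi = [[0.2, -0.1, 0.4], [-0.1, 0.0, 0.3], [0.4, 0.3, -0.5]]

# Product rule for phi(z) = z^3: the linear term of (T + Pi)^3 in Pi.
T2 = mat_mul(T, T)
terms = [mat_mul(T2, Pi), mat_mul(mat_mul(T, Pi), T), mat_mul(Pi, T2)]
direct = [[sum(M[i][j] for M in terms) for j in range(3)] for i in range(3)]

formula = frechet_diag(lambda z: z ** 3, lambda z: 3.0 * z ** 2, lams, Pi)
max_diff = max(abs(direct[i][j] - formula[i][j]) for i in range(3) for j in range(3))
print(max_diff)
```

The agreement reflects the identity $(\lambda_k^3 - \lambda_j^3)/(\lambda_k - \lambda_j) = \lambda_k^2 + \lambda_k\lambda_j + \lambda_j^2$ for the off-diagonal coefficients.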

Example 2.4. The function $\varphi(z) = z$, $z \in \mathbb{C}$, is analytic on the entire complex plane so that Corollary 2.3 applies. The Fréchet derivative in (2.26) now reduces to

$$\varphi'_T \Pi = \sum_{j} \sum_{k} P_j \Pi P_k = \Pi, \tag{2.28}$$

$\Pi \in \mathcal{L}$. Of course this result is immediate because in this simple case $\varphi(T + \Pi) = T + \Pi = \varphi(T) + \Pi$.

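Example 2.5. Next let us, for $p > 0$, consider the function

$$\varphi_\alpha(z) = (\alpha + z)^{-p}, \qquad \alpha > \delta_\Gamma > 0, \tag{2.29}$$

for $z \ne -\alpha$. Note that the choice of $\alpha$ ensures that the pole at $z = -\alpha$ remains outside the contour $\Gamma$. Clearly there exists an open region $\Omega$ of the type required, such that $\varphi_\alpha$ is analytic on some open neighborhood $D$ of $\overline{\Omega}$. Hence Corollary 2.3 applies again. The operator $\varphi_\alpha(T) = (\alpha I + T)^{-p}$ represents a regularized or generalized inverse of Tikhonov type, according to whether $T$ is injective or not. The Fréchet derivative in (2.26) now equals

$$\varphi'_{\alpha,T} \Pi = -p \sum_{j} \frac{1}{(\alpha + \lambda_j)^{p+1}} P_j \Pi P_j + \mathop{\sum\sum}_{j \ne k} \frac{(\alpha + \lambda_j)^p - (\alpha + \lambda_k)^p}{(\lambda_k - \lambda_j)(\alpha + \lambda_j)^p (\alpha + \lambda_k)^p} P_j \Pi P_k, \tag{2.30}$$

for $\Pi \in \mathcal{L}$.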

Remark 2.6. For $T \in \mathcal{C}_H$, with $T$ and $\Pi$ commuting, the double sum on the right in (2.26) cancels and we obtain

$$\varphi'_T \Pi = \sum_{j} \varphi'(\lambda_j) P_j \Pi, \tag{2.31}$$

in accordance with (2.20). Apparently, the double sum is a correction term needed when $T$ and $\Pi$ do not commute.
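The role of the correction term can be made concrete in a 2×2 example. The matrices below are hypothetical and real symmetric, chosen only for illustration, with $\varphi(z) = z^2$ so that $\varphi'_T\Pi = T\Pi + \Pi T$ by direct expansion of $(T + \Pi)^2$.

```latex
% Illustrative 2x2 case (an assumption for exposition; the paper works in infinite dimensions).
T = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}, \qquad
\Pi = \begin{pmatrix} a & b \\ b & c \end{pmatrix}, \qquad
\varphi(z) = z^2 .

% (2.26): diagonal entries carry \varphi'(\lambda_j) = 2\lambda_j,
% off-diagonal entries carry the divided difference \lambda_1 + \lambda_2.
\varphi'_T \Pi = T\Pi + \Pi T =
\begin{pmatrix} 2\lambda_1 a & (\lambda_1 + \lambda_2)\, b \\ (\lambda_1 + \lambda_2)\, b & 2\lambda_2 c \end{pmatrix},
\qquad
\varphi'(T)\Pi = 2T\Pi =
\begin{pmatrix} 2\lambda_1 a & 2\lambda_1 b \\ 2\lambda_2 b & 2\lambda_2 c \end{pmatrix}.

% The two expressions differ exactly by the commutator:
\varphi'_T \Pi - \varphi'(T)\Pi = \Pi T - T\Pi =
\begin{pmatrix} 0 & (\lambda_2 - \lambda_1)\, b \\ (\lambda_1 - \lambda_2)\, b & 0 \end{pmatrix}.
```

The difference vanishes precisely when $b = 0$ or $\lambda_1 = \lambda_2$, that is, when $T$ and $\Pi$ commute.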

3. Perturbation of Eigenvalues and Eigenvectors

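Throughout this section, both $T$ and $\Pi$ are assumed to be Hermitian, so that also $T + \Pi \in \mathcal{L}_H$. In addition to this, we assume that

$$T \in \mathcal{L}_H \text{ has an isolated eigenvalue } \lambda_1, \tag{3.1}$$

with one-dimensional eigenspace. Consequently, the eigenprojection can be written

$$P_1 = p_1 \otimes p_1, \quad \text{for some } p_1 \in H \text{ with } \|p_1\| = 1, \tag{3.2}$$

where for $a, b \in H$ the operator $a \otimes b$ is defined by $(a \otimes b)(x) = \langle x, b \rangle a$, $x \in H$.

The region $\Omega \supset \sigma(T)$ will now be chosen in such a way that it has a connected component $\Omega_1$ with the properties

$$\Omega_1 \cap \sigma(T) = \{\lambda_1\}, \qquad \operatorname{dist}(\Omega_1, \Omega \setminus \Omega_1) > 0. \tag{3.3}$$

A special analytic function $\varphi_1 : D \to \mathbb{C}$ such that

$$\varphi_1(z) = 1, \quad z \in \Omega_1, \qquad \varphi_1(z) = 0, \quad z \in \Omega_0 = \Omega \setminus \Omega_1, \tag{3.4}$$

will play an important role in the sequel. Note, for instance, that

$$\varphi_1(T) = P_1. \tag{3.5}$$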

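For the Fréchet derivative of $\varphi_1$ at $T$, a special expression can be obtained. Let us write

$$T = \lambda_1 P_1 + T_0, \tag{3.6}$$

where $T_0$ is Hermitian with spectrum $\sigma(T_0) \subset \Omega_0$. According to the spectral theorem, there exists a resolution of the identity $E(\lambda)$, $\lambda \in \sigma(T_0)$, such that

$$T_0 = \int_{\sigma(T_0)} \lambda\, dE(\lambda). \tag{3.7}$$

It should be noted that

$$P_1 E(\lambda) = E(\lambda) P_1 = O, \qquad \forall \lambda \in \sigma(T_0), \tag{3.8}$$

where $O$ is the zero operator, and that

$$R(z) = \frac{1}{z - \lambda_1} P_1 + \int_{\sigma(T_0)} \frac{1}{z - \lambda}\, dE(\lambda). \tag{3.9}$$

Let us define

$$Q_1 = \int_{\sigma(T_0)} \frac{1}{\lambda_1 - \lambda}\, dE(\lambda). \tag{3.10}$$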

Lemma 3.1. The Fréchet derivative of $\varphi_1$ at $T$ is given by

$$\varphi'_{1,T} \Pi = P_1 \Pi Q_1 + Q_1 \Pi P_1, \qquad \Pi \in \mathcal{L}_H. \tag{3.11}$$

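Proof. This follows by substitution of (3.9) in the expression on the right in

$$\varphi'_{1,T} \Pi = \frac{1}{2\pi i} \oint_{\Gamma_1} R(z) \Pi R(z)\, dz, \qquad \Gamma_1 = \partial\Omega_1, \tag{3.12}$$

for this derivative; see also (2.8). We thus obtain

$$\begin{aligned}
\varphi'_{1,T} \Pi ={}& \frac{1}{2\pi i} \oint_{\Gamma_1} \left(\frac{1}{z - \lambda_1}\right)^{2} P_1 \Pi P_1\, dz \\
&+ \frac{1}{2\pi i} \oint_{\Gamma_1} \frac{1}{z - \lambda_1} P_1 \Pi \left(\int_{\sigma(T_0)} \frac{1}{z - \lambda}\, dE(\lambda)\right) dz \\
&+ \frac{1}{2\pi i} \oint_{\Gamma_1} \left(\int_{\sigma(T_0)} \frac{1}{z - \lambda}\, dE(\lambda)\right) \Pi\, \frac{1}{z - \lambda_1} P_1\, dz \\
&+ \frac{1}{2\pi i} \oint_{\Gamma_1} \left(\int_{\sigma(T_0)} \frac{1}{z - \lambda}\, dE(\lambda)\right) \Pi \left(\int_{\sigma(T_0)} \frac{1}{z - \mu}\, dE(\mu)\right) dz.
\end{aligned} \tag{3.13}$$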

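By Cauchy's integral formula

$$\frac{1}{2\pi i} \oint_{\Gamma_1} \frac{1}{(z - \lambda_1)^2}\, dz = \varphi_1'(\lambda_1) = 0, \tag{3.14}$$

so that the first term on the right side in (3.13) is the zero operator. Regarding the second, note that

$$\frac{1}{2\pi i} \oint_{\Gamma_1} \frac{1}{(z - \lambda_1)(z - \lambda)}\, dz = \frac{1}{\lambda_1 - \lambda} \left\{\frac{1}{2\pi i} \oint_{\Gamma_1} \frac{1}{z - \lambda_1}\, dz - \frac{1}{2\pi i} \oint_{\Gamma_1} \frac{1}{z - \lambda}\, dz\right\} = \frac{1}{\lambda_1 - \lambda}, \tag{3.15}$$

because each $\lambda \in \sigma(T_0)$ lies outside the contour $\Gamma_1$. Consequently, the second term equals

$$P_1 \Pi \left\{\int_{\sigma(T_0)} \frac{1}{\lambda_1 - \lambda}\, dE(\lambda)\right\} = P_1 \Pi Q_1. \tag{3.16}$$

Similarly, the third term equals $Q_1 \Pi P_1$. The last term cancels, because

$$\frac{1}{2\pi i} \oint_{\Gamma_1} \frac{1}{(z - \lambda)(z - \mu)}\, dz = 0, \tag{3.17}$$

since both $\lambda$ and $\mu$ lie outside $\Gamma_1$.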
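In a 2×2 setting Lemma 3.1 can be checked by hand. The matrices below are hypothetical, real symmetric, with $\lambda_1 \ne \lambda_2$, and are meant only as an illustration of (3.11).

```latex
% Illustrative 2x2 case with \lambda_1 \neq \lambda_2 (an assumption made only for this example).
T = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}, \quad
P_1 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad
Q_1 = \frac{1}{\lambda_1 - \lambda_2} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \quad
\Pi = \begin{pmatrix} a & b \\ b & c \end{pmatrix}.

% Lemma 3.1:
\varphi'_{1,T}\Pi = P_1 \Pi Q_1 + Q_1 \Pi P_1
= \frac{b}{\lambda_1 - \lambda_2} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.
```

This agrees with (2.26) applied to $\varphi_1$: the diagonal terms vanish because $\varphi_1' = 0$ near each eigenvalue, and the off-diagonal coefficient is $(\varphi_1(\lambda_2) - \varphi_1(\lambda_1))/(\lambda_2 - \lambda_1) = 1/(\lambda_1 - \lambda_2)$.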

Some results about the perturbation of $\lambda_1$ and $P_1$ in a given direction as in (1.3) that are well known in the literature [1, 2] can be partly recovered for perturbations in some neighborhood, in an essentially self-contained manner, as simple consequences of the results in Section 2.

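Corollary 3.2. Under the assumptions (3.1), (3.2), and for $\Pi \in \mathcal{L}_H$ sufficiently small, the operator $\tilde{T}$ has an isolated eigenvalue $\tilde{\lambda}_1$ with eigenprojection $\tilde{P}_1 = \tilde{p}_1 \otimes \tilde{p}_1$ for some unit vector $\tilde{p}_1 \in H$, satisfying

$$\tilde{P}_1 = P_1 + P_1 \Pi Q_1 + Q_1 \Pi P_1 + O\left(\|\Pi\|^2_{\mathcal{L}}\right), \tag{3.18}$$

where $Q_1$ is defined in (3.10).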

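Proof. In view of (3.5) and (3.11), application of (2.11) with $\varphi = \varphi_1$ yields $\varphi_1(\tilde{T}) = P_1 + P_1 \Pi Q_1 + Q_1 \Pi P_1 + O(\|\Pi\|^2_{\mathcal{L}})$. Clearly $\varphi_1(\tilde{T})$ is Hermitian, and because $\varphi_1(\tilde{T})^2 = \varphi_1^2(\tilde{T}) = \varphi_1(\tilde{T})$ by (2.6), it is also idempotent, so that it is in fact some projection, $\tilde{P}_1$ say. It follows that $\|\tilde{P}_1 - P_1\|_{\mathcal{L}} < 1$ for all $\Pi$ sufficiently small, and hence the range of $\tilde{P}_1$ must also have dimension 1 [11], so that $\tilde{P}_1 = \tilde{p}_1 \otimes \tilde{p}_1$ for some $\tilde{p}_1 \in H$ with $\|\tilde{p}_1\| = 1$.

Next, let $\chi(z) = z$, $z \in \mathbb{C}$, be the identity function. By (2.6), again, on the one hand we have $(\chi\varphi_1)(\tilde{T})\tilde{p}_1 = \tilde{T}\tilde{P}_1\tilde{p}_1 = \tilde{T}\tilde{p}_1$, and on the other $(\varphi_1\chi)(\tilde{T})\tilde{p}_1 = \tilde{P}_1\tilde{T}\tilde{p}_1 = (\tilde{p}_1 \otimes \tilde{p}_1)\tilde{T}\tilde{p}_1 = \langle \tilde{T}\tilde{p}_1, \tilde{p}_1 \rangle \tilde{p}_1 = \tilde{\lambda}_1 \tilde{p}_1$. Hence $\tilde{p}_1$ is an eigenvector of $\tilde{T}$ with eigenvalue $\tilde{\lambda}_1$.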


Corollary 3.3. Under the assumptions of Corollary 3.2, we have

$$\tilde{p}_1 = p_1 + Q_1 \Pi p_1 + O\left(\|\Pi\|^2_{\mathcal{L}}\right). \tag{3.19}$$

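Proof. Let us first observe that $P_1 \Pi Q_1 p_1 = 0$ because of (3.8). Hence (3.18) yields $\tilde{p}_1 - p_1 = (\tilde{P}_1 - P_1) p_1 + r_1 + r_2 = Q_1 \Pi p_1 + r_1 + r_2 + O(\|\Pi\|^2_{\mathcal{L}})$, where

$$r_1 = P_1(\tilde{p}_1 - p_1), \qquad r_2 = (\tilde{P}_1 - P_1)(\tilde{p}_1 - p_1). \tag{3.20}$$

It suffices to show that $r_j = O(\|\Pi\|^2_{\mathcal{L}})$ for $j = 1, 2$. The idea of the proof can be found in Dauxois et al. [4].

Regarding $r_1$, note that $1 - \langle \tilde{p}_1, p_1 \rangle^2 = \|\langle p_1, \tilde{p}_1 \rangle \tilde{p}_1 - p_1\|^2 \le \|\tilde{P}_1 - P_1\|^2_{\mathcal{L}} = O(\|\Pi\|^2_{\mathcal{L}})$, once more using (3.18). Hence $\langle \tilde{p}_1, p_1 \rangle \to 1$, as $\|\Pi\|_{\mathcal{L}} \to 0$, and therefore $2 \ge 1 + \langle \tilde{p}_1, p_1 \rangle \ge 1$ for $\|\Pi\|_{\mathcal{L}}$ sufficiently small. This entails

$$\|r_1\| = |\langle \tilde{p}_1 - p_1, p_1 \rangle| = 1 - \langle \tilde{p}_1, p_1 \rangle = \frac{1 - \langle \tilde{p}_1, p_1 \rangle^2}{1 + \langle \tilde{p}_1, p_1 \rangle} = O\left(\|\Pi\|^2_{\mathcal{L}}\right). \tag{3.21}$$

For $r_2$ we have

$$\|r_2\| \le \|\tilde{P}_1 - P_1\|_{\mathcal{L}} \|\tilde{p}_1 - p_1\| = O(\|\Pi\|_{\mathcal{L}}) \sqrt{2\left(1 - \langle \tilde{p}_1, p_1 \rangle\right)} = O(\|\Pi\|_{\mathcal{L}}) \sqrt{2\|r_1\|} = O\left(\|\Pi\|^2_{\mathcal{L}}\right), \tag{3.22}$$

as can be seen from (3.21).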

Corollary 3.4. Under the assumptions of Corollary 3.2, we have

$$\tilde{\lambda}_1 = \lambda_1 + \langle \Pi p_1, p_1 \rangle + O\left(\|\Pi\|^2_{\mathcal{L}}\right). \tag{3.23}$$

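Proof. With the help of (3.19), we see that $\tilde{\lambda}_1 = \langle \tilde{T}\tilde{p}_1, \tilde{p}_1 \rangle = \langle (T + \Pi)(p_1 + Q_1 \Pi p_1),\, p_1 + Q_1 \Pi p_1 \rangle + O(\|\Pi\|^2_{\mathcal{L}})$. The result follows from a routine calculation combined with the equalities $\langle T p_1, p_1 \rangle = \lambda_1$, $\langle T p_1, Q_1 \Pi p_1 \rangle = \lambda_1 \langle p_1, Q_1 \Pi p_1 \rangle = \lambda_1 \langle Q_1 p_1, \Pi p_1 \rangle = 0$, and $\langle T Q_1 \Pi p_1, p_1 \rangle = \langle Q_1 \Pi p_1, T p_1 \rangle = \lambda_1 \langle Q_1 \Pi p_1, p_1 \rangle = \lambda_1 \langle \Pi p_1, Q_1 p_1 \rangle = 0$. For the last two equalities we use that $T$ and $Q_1$ are Hermitian and that $Q_1 p_1 = 0$ by (3.8).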
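The first order expansions (3.19) and (3.23) can be checked against the closed-form eigenpair of a 2×2 symmetric matrix. The sketch below is a finite dimensional illustration with hypothetical values: $T = \operatorname{diag}(2, 1)$, so that $p_1 = (1, 0)$ and $Q_1 = P_2/(\lambda_1 - \lambda_2)$, and $\Pi = \varepsilon D$ for a fixed symmetric direction $D$.

```python
import math

def top_eigpair(a, b, d):
    """Largest eigenvalue and a unit eigenvector of the symmetric matrix [[a, b], [b, d]]."""
    lam = 0.5 * (a + d) + math.sqrt((0.5 * (a - d)) ** 2 + b * b)
    if b != 0.0:
        v = (b, lam - a)  # solves (A - lam I) v = 0 when b != 0
    else:
        v = (1.0, 0.0) if a >= d else (0.0, 1.0)
    norm = math.hypot(v[0], v[1])
    return lam, (v[0] / norm, v[1] / norm)

LAM1, LAM2 = 2.0, 1.0          # T = diag(LAM1, LAM2), p_1 = (1, 0)
D = ((0.3, 0.5), (0.5, -0.2))  # Hermitian direction, Pi = eps * D

def first_order_errors(eps):
    """Errors of the predictions (3.23) and (3.19) for the perturbation eps * D."""
    a = LAM1 + eps * D[0][0]
    b = eps * D[0][1]
    d = LAM2 + eps * D[1][1]
    lam, p = top_eigpair(a, b, d)
    lam_pred = LAM1 + eps * D[0][0]                # lambda_1 + <Pi p_1, p_1>
    p_pred = (1.0, eps * D[1][0] / (LAM1 - LAM2))  # p_1 + Q_1 Pi p_1
    return abs(lam - lam_pred), math.hypot(p[0] - p_pred[0], p[1] - p_pred[1])

e1, v1 = first_order_errors(1e-2)
e2, v2 = first_order_errors(5e-3)
print(e1 / e2, v1 / v2)  # both ratios should be close to 4
```

Both errors shrink by a factor of about four when $\varepsilon$ is halved, consistent with remainders of order $\|\Pi\|^2_{\mathcal{L}}$.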

Corollary 3.5. Let $T \in \mathcal{C}_H$ be given by (2.23) and satisfy (3.2). Then (3.18) and (3.19) remain true with

$$Q_1 = \sum_{j=2}^{\infty} \frac{1}{\lambda_1 - \lambda_j} P_j. \tag{3.24}$$

Proof. All nonzero eigenvalues of $T$ are isolated, in particular $\lambda_1$. It is immediate from (2.23) that $T_0 = \sum_{j=2}^{\infty} \lambda_j P_j$, and this leads to the special expression for $Q_1$ in (3.24).

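Remark 3.6. The assumption that $\Pi$ be Hermitian is in fact not necessary. Of course, if we just require $\Pi$ to be bounded, the perturbed operator $\tilde{T}$ is not in general Hermitian anymore. In particular, a suitably modified version of Corollary 3.3 will now claim the existence of a pair of eigenvectors, $\tilde{p}_1$ for $\tilde{T}$ and $\tilde{p}_1^*$ for $\tilde{T}^*$, with expansions

$$\tilde{p}_1 = p_1 + Q_1 \Pi p_1 + O\left(\|\Pi\|^2_{\mathcal{L}}\right), \qquad \tilde{p}_1^* = p_1 + Q_1 \Pi^* p_1 + O\left(\|\Pi\|^2_{\mathcal{L}}\right), \tag{3.25}$$

as $\|\Pi\|_{\mathcal{L}} \to 0$.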

4. Applications

In this section, we will sketch three applications: two in statistics and one in numerical analysis.

4.1. Noisy Integral Equations

Let $K : L^2(0,1) \to L^2(0,1)$ be a compact injective integral operator, with measurable real kernel denoted by the same symbol without confusion. More specifically, input $f \in L^2(0,1)$ and output $g \in L^2(0,1)$ are related according to

$$g(s) = \int_0^1 K(s, t) f(t)\, dt. \tag{4.1}$$

In practice, only finitely many data regarding the output are available, usually blurred by random measurement error. If the data are collected according to a random design, we may think of the data set as of $n$ independent copies $(X_1, Y_1), \ldots, (X_n, Y_n)$ of a pair $(X, Y)$ of random variables, where

$$Y = g(X) + \varepsilon = (Kf)(X) + \varepsilon, \tag{4.2}$$

the design variable $X$ has a Uniform$(0,1)$ distribution, the error variable $\varepsilon$ has finite variance and zero mean, and where $X$ and $\varepsilon$ are stochastically independent.

It is the purpose to recover $f$ from these data. It is expedient to "precondition" with the adjoint operator $K^*$ and recover $f$ from the equation

$$q = K^* g = K^* K f = R f, \tag{4.3}$$

where $R$ is compact, Hermitian, and strictly positive. Under suitable conditions,

$$\hat{q}(t) = \frac{1}{n} \sum_{i=1}^{n} Y_i K^*(t, X_i) = \frac{1}{n} \sum_{i=1}^{n} Y_i K(X_i, t), \qquad t \in [0,1], \tag{4.4}$$

is an unbiased and $\sqrt{n}$-consistent estimator of $q$; see, for instance, van Rooij and Ruymgaart [12]. Since $R^{-1}$ is unbounded, an estimator of the input $f$ is obtained by applying a regularized inverse of $R$ to $\hat{q}$. Here we will use the Tikhonov type inverse

$$(\alpha I + R)^{-1} = \varphi_\alpha(R), \qquad \alpha > 0, \tag{4.5}$$

where $\varphi_\alpha(z) = (\alpha + z)^{-1}$; see also (2.29). This yields the input estimator

$$\hat{f}_\alpha = \varphi_\alpha(R)\hat{q}, \qquad \alpha > 0. \tag{4.6}$$

To assess the quality of the estimator, one considers the mean integrated squared error (MISE)

$$E\|\hat{f}_\alpha - f\|^2. \tag{4.7}$$

The behavior of the MISE is well studied in the literature.
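The estimator in (4.4) is a plain sample mean and is easy to simulate. The sketch below is only an illustration under assumptions not made in the paper: the kernel $K(s,t) = \min(s,t)$, the input $f \equiv 1$, Gaussian errors with standard deviation 0.1, the sample size, and the random seed are all hypothetical choices, picked because $g$ and $q$ can then be integrated by hand.

```python
import random

def kernel(s, t):
    """Hypothetical example kernel K(s, t) = min(s, t); not taken from the paper."""
    return min(s, t)

def g(s):
    """Output g = Kf for the input f(t) = 1: integral of min(s, t) over t in [0, 1]."""
    return s - 0.5 * s * s

def q_exact(t):
    """q = K^* g for this kernel and input, integrated by hand."""
    return t / 3.0 - t ** 3 / 6.0 + t ** 4 / 24.0

def q_hat(t, xs, ys):
    """Estimator (4.4): sample mean of Y_i * K(X_i, t)."""
    return sum(y * kernel(x, t) for x, y in zip(xs, ys)) / len(xs)

rng = random.Random(0)                              # fixed seed for reproducibility
n = 20000
xs = [rng.random() for _ in range(n)]               # Uniform(0, 1) design
ys = [g(x) + rng.gauss(0.0, 0.1) for x in xs]       # Y = g(X) + error
err = abs(q_hat(0.5, xs, ys) - q_exact(0.5))
print(err)
```

For this sample size the error at $t = 0.5$ is expected to be of the order $n^{-1/2}$, in line with the $\sqrt{n}$-consistency stated above.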

Recently, there is an interest in certain econometric models where the operator $K$ or $R$ is unknown but can be estimated from the data. Let $\hat{R}$ denote an estimator of $R$ and assume that $\hat{R}$ is also compact, Hermitian, and nonnegative. In this case, the input estimator

$$\hat{\hat{f}}_\alpha = (\alpha I + \hat{R})^{-1}\hat{q} = \varphi_\alpha(\hat{R})\hat{q}, \qquad \alpha > 0, \tag{4.8}$$

will be employed. One expects that estimation of $R$ will increase the MISE, and naturally the question arises how much bigger the MISE of $\hat{\hat{f}}_\alpha$ will be than that of $\hat{f}_\alpha$.

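An upper bound for this increase of the MISE can be easily found from the results in Section 2. For large sample size $n$, $\hat{R}$ will be close to $R$, and $\Pi = \hat{R} - R$ can be considered as a small random perturbation of $R$. Writing $\varphi'_{\alpha,R}$ for the Fréchet derivative at $R$, we see from Theorem 2.1 that

$$\hat{\hat{f}}_\alpha - f = \{\varphi_\alpha(\hat{R}) - \varphi_\alpha(R)\}\hat{q} + \varphi_\alpha(R)\hat{q} - f = (\varphi'_{\alpha,R}\Pi)\hat{q} + \hat{f}_\alpha - f + O\left(\|\Pi\|^2_{\mathcal{L}}\right). \tag{4.9}$$

Apparently, $(\varphi'_{\alpha,R}\Pi)\hat{q}$ is an extra error term due to the estimation of $R$.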

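To find an upper bound for its MISE, let us first observe that (2.30) simplifies for $p = 1$ and yields

$$\varphi'_{\alpha,R}\Pi = -\sum_{j} \sum_{k} \frac{1}{(\alpha + \lambda_j)(\alpha + \lambda_k)} P_j \Pi P_k, \tag{4.10}$$

where now the $\lambda_j$ and the $P_j$ are the spectral characteristics of $R$. Let us write, for brevity,

$$h = \sum_{k} \frac{1}{\alpha + \lambda_k} P_k \hat{q}, \tag{4.11}$$

and note that

$$\|h\|^2 \le \frac{1}{\alpha^2} \|\hat{q}\|^2. \tag{4.12}$$

We thus arrive at

$$E\|(\varphi'_{\alpha,R}\Pi)\hat{q}\|^2 = E\left\|\sum_{j} \frac{1}{\alpha + \lambda_j} P_j \Pi h\right\|^2 \le \frac{1}{\alpha^2} E\|\Pi\|^2_{\mathcal{L}} \|h\|^2 \le \frac{1}{\alpha^4} E\|\Pi\|^2_{\mathcal{L}} \|\hat{q}\|^2. \tag{4.13}$$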

Hence, under suitable assumptions, estimation of the kernel yields an extra term in the MISE of the input estimator which is of order $\alpha^{-4}$. In the Russian literature, sharper bounds can be found; see in particular Bakushinsky and Kokurin [13, Section 2.2]. For results of this type in the statistical literature, obtained in a different manner, see, for instance, Hall and Horowitz [14] and Florens [15].
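A deterministic counterpart of the chain of inequalities in (4.13), without the expectation, can be checked for a small diagonal example. The sketch below assumes a hypothetical operator $R = \operatorname{diag}(1, 0.01)$, a fixed symmetric perturbation, and a fixed vector in place of $\hat{q}$; it evaluates the extra error term via (4.10) and compares its norm with $\alpha^{-2}\|\Pi\|_{\mathcal{L}}\|\hat{q}\|$.

```python
import math

def apply_derivative(alpha, lams, Pi, q):
    """(phi'_{alpha,R} Pi) q from (4.10) for R = diag(lams).

    Entry j equals -sum_k Pi[j][k] * q[k] / ((alpha + lam_j) * (alpha + lam_k)).
    """
    n = len(lams)
    return [
        -sum(Pi[j][k] * q[k] / ((alpha + lams[j]) * (alpha + lams[k])) for k in range(n))
        for j in range(n)
    ]

def sym2_opnorm(M):
    """Operator norm of a symmetric 2x2 matrix: the largest absolute eigenvalue."""
    a, b, d = M[0][0], M[0][1], M[1][1]
    mid = 0.5 * (a + d)
    rad = math.sqrt((0.5 * (a - d)) ** 2 + b * b)
    return max(abs(mid + rad), abs(mid - rad))

def vec_norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in v))

lams = [1.0, 0.01]                    # hypothetical nonnegative eigenvalues of R
Pi = [[0.02, -0.01], [-0.01, 0.03]]   # hypothetical perturbation R_hat - R
q = [0.7, 0.4]                        # stands in for the estimated right-hand side

results = []
for alpha in (1.0, 0.1, 0.01):
    lhs = vec_norm(apply_derivative(alpha, lams, Pi, q))
    rhs = sym2_opnorm(Pi) * vec_norm(q) / alpha ** 2
    results.append((alpha, lhs, rhs))
    print(alpha, lhs, rhs)
```

As $\alpha$ decreases the bound grows like $\alpha^{-2}$ for the norm, hence like $\alpha^{-4}$ for the squared norm that enters the MISE.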

4.2. Some Asymptotics for Functional Canonical Correlations

Let $X$ be a real random element in the Hilbert space $L^2(0,1)$ and assume that $E\|X\|^4 < \infty$. Its mean $\mu \in L^2(0,1)$ and covariance operator $S : L^2(0,1) \to L^2(0,1)$ are well defined by the relations $E\langle f, X \rangle = \langle f, \mu \rangle$, $E\langle f, X - \mu \rangle \langle X - \mu, g \rangle = \langle f, Sg \rangle$ for all $f, g \in L^2(0,1)$. The operator $S$ is known to be of finite trace and hence Hilbert-Schmidt and compact. It is also nonnegative Hermitian. Without real loss of generality, we will assume $S$ to be injective, so that it will be strictly positive.

Next suppose that we are given a random sample $X_1, \ldots, X_n$ of independent copies of $X$. The usual estimators of $\mu$ and $S$ are $\bar{X} = (1/n)\sum_{i=1}^{n} X_i$ and $\hat{S} = (1/n)\sum_{i=1}^{n} (X_i - \bar{X}) \otimes (X_i - \bar{X})$, respectively, where $\hat{S}$ shares all the properties of $S$, except that it cannot be injective because it is a finite rank operator whose range has dimension at most $n - 1$.

Because $\hat{S}$ cannot be injective, the finite dimensional definition of sample canonical correlation has to be modified, and some kind of smoothing or regularization is recommended in the literature [16]. Regularization might even be useful when the population is considered, although $S$ is injective [17]. This regularization yields Tikhonov type inverses in an expression for the canonical correlation.

For a precise definition, let $H_1$ and $H_2$ be two closed subspaces of $L^2(0,1)$ and $I_j$ the orthogonal projection onto $H_j$ ($j = 1, 2$). Let us write $S_{jk} = I_j S I_k$, and note that $I_j(\alpha I + S)I_k = S_{jk}$ for $j \ne k$. Similar notation will be used for $\hat{S}$. The regularized squared principal canonical correlation for the population is now defined as

$$\rho^2 = \sup_{f_1, f_2 \ne 0} \frac{\langle f_1, S_{12} f_2 \rangle^2}{\langle f_1, (\alpha I_1 + S_{11}) f_1 \rangle \langle f_2, (\alpha I_2 + S_{22}) f_2 \rangle}. \tag{4.14}$$

Its sample analogue $\hat{\rho}^2$ is obtained by replacing the $S_{jk}$ with $\hat{S}_{jk}$ in (4.14). The supremum is actually a maximum, and pairs of maximizers will be denoted by $f_1^*, f_2^*$, and $\hat{f}_1^*, \hat{f}_2^*$, respectively. The corresponding canonical variates then are

$$\langle X, f_j^* \rangle, \qquad \langle X, \hat{f}_j^* \rangle, \qquad j = 1, 2. \tag{4.15}$$

For an alternative description of these canonical correlations, let us introduce the operator

$$R_1 = (\alpha I_1 + S_{11})^{-1/2} S_{12} (\alpha I_2 + S_{22})^{-1} S_{21} (\alpha I_1 + S_{11})^{-1/2}. \tag{4.16}$$

Interchanging the indices 1 and 2 yields $R_2$, and replacing $S_{jk}$ with $\hat{S}_{jk}$ yields $\hat{R}_1$ and $\hat{R}_2$. It can be seen that all these operators are Hilbert-Schmidt and strictly positive Hermitian. It will be assumed that $R_j$ has the largest eigenvalue with one-dimensional eigenspace generated by $f_j^*$ with $\|f_j^*\| = 1$. Under this condition, it has been shown in Cupidon et al. [18] that

$$\rho^2 = \text{largest eigenvalue of } R_j = \langle f_j^*, R_j f_j^* \rangle, \tag{4.17}$$

for $j = 1, 2$. A similar result holds true for $\hat{\rho}^2$.

It is well known that the asymptotic distribution of the eigenvalues and eigenfunctions of a random operator can be derived from the asymptotic distribution of this random operator itself (see [10] for Euclidean spaces and [4] for Hilbert spaces). This technique is based on the results of Section 3. In the present situation, this means that we have to show the convergence in distribution of the suitably standardized $\hat{R}_j$. Because all operators are Hilbert-Schmidt, it can be shown that

$$\sqrt{n}\,(\hat{R}_j - R_j) \xrightarrow{d} \mathcal{G}, \quad \text{as } n \to \infty, \text{ in } \mathcal{L}. \tag{4.18}$$

Result (4.18) follows easily if convergence in distribution can be established for each of the factors defining $\hat{R}_j$, for instance,

$$(\alpha I + \hat{S}_{jj})^{-1/2} = \varphi_\alpha(\hat{S}_{jj}), \tag{4.19}$$

where this time $\varphi_\alpha(z) = (\alpha + z)^{-1/2}$, compare (2.29). It is known [4] that

$$\sqrt{n}\,(\hat{S}_{jj} - S_{jj}) \xrightarrow{d} \mathcal{G}_{jj}, \quad \text{as } n \to \infty, \text{ in } \mathcal{L}_{HS}, \tag{4.20}$$

for some Gaussian random element $\mathcal{G}_{jj}$, where $\mathcal{L}_{HS}$ is the Hilbert space of all Hilbert-Schmidt operators mapping $H$ into itself. Writing $\varphi'_{\alpha,j}$ for the Fréchet derivative evaluated at $S_{jj}$ (Section 2) and exploiting the fact that the imbedding of $\mathcal{L}_{HS}$ in $\mathcal{L}$ is continuous, we obtain via a kind of delta method [18, 19]

$$\sqrt{n}\,\{\varphi_\alpha(\hat{S}_{jj}) - \varphi_\alpha(S_{jj})\} \xrightarrow{d} \varphi'_{\alpha,j}\mathcal{G}_{jj}, \quad \text{as } n \to \infty, \text{ in } \mathcal{L}, \tag{4.21}$$

the desired result. A combination of results like this for each of the factors of $\hat{R}_j$ yields (4.18).
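For orientation, the derivative appearing in the delta method can be written out from (2.30) with $p = 1/2$. The display below is a sketch under the assumption that the operator at which the derivative is taken admits a spectral expansion of the form (2.23); the indices $m, l$ and the abbreviation $a_m$ are introduced here only for this illustration.

```latex
% Assumption: the operator has eigenvalues \lambda_m and eigenprojections P_m as in (2.23);
% write a_m = \alpha + \lambda_m > 0.
% Diagonal coefficient from (2.30) with p = 1/2:
-\tfrac{1}{2}\, a_m^{-3/2} = -\frac{1}{\sqrt{a_m}\,\sqrt{a_m}\,(\sqrt{a_m} + \sqrt{a_m})} .

% Off-diagonal coefficient, using a_l - a_m = (\sqrt{a_l} - \sqrt{a_m})(\sqrt{a_l} + \sqrt{a_m}):
\frac{\sqrt{a_m} - \sqrt{a_l}}{(a_l - a_m)\,\sqrt{a_m}\,\sqrt{a_l}}
= -\frac{1}{\sqrt{a_m}\,\sqrt{a_l}\,(\sqrt{a_m} + \sqrt{a_l})} .

% Hence both sums in (2.30) combine into a single double sum over all m, l:
\varphi'_{\alpha}\,\Pi = -\sum_{m}\sum_{l}
\frac{1}{\sqrt{a_m}\,\sqrt{a_l}\,(\sqrt{a_m} + \sqrt{a_l})}\; P_m \Pi P_l .
```

All coefficients are bounded in absolute value by $\tfrac{1}{2}\alpha^{-3/2}$ when the eigenvalues are nonnegative, which makes the dependence of the limit on the regularization parameter explicit.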


4.3. Solution of a Nonlinear Operator Equation

In Bakushinsky and Kokurin [13], the following problem is considered. Let $H_1$ and $H_2$ be Hilbert spaces and $F : H_1 \to H_2$ an operator, not necessarily linear. The (nonlinear) equation

$$F(x) = 0, \qquad x \in H_1, \tag{4.22}$$

is studied. Let $x^*$ be a solution of (4.22) and introduce a set $\Omega = \{x \in H_1 : \|x - x^*\|_1 \le r\}$, for some $r > 0$. It is assumed that $F$ is Fréchet differentiable on $\Omega$. If $F'_x$ is the derivative at $x \in \Omega$ it is, moreover, assumed that

$$\|F'_x - F'_y\|_{\mathcal{L}(H_1, H_2)} \le L\|x - y\|_1, \qquad x, y \in \Omega, \tag{4.23}$$

where $0 < L < \infty$ is a given number. Given an initial point $x_0 \in \Omega$ and a sequence $\{\alpha_n\}$, $\alpha_n > 0$, of regularization parameters, these authors show that, under some further conditions, the generalized Gauss-Newton method generates a sequence of points $\{x_n\}$ such that

$$\|x_n - x^*\| = O\left(\alpha_n^p\right), \quad \text{for some } p \ge \frac{1}{2}. \tag{4.24}$$

In their proof of this result, the authors need a crucial upper bound. Under some additional assumptions, we want to derive this upper bound as an immediate consequence of Theorem 2.1. In order to relate the present problem to the setup of our paper, let us assume that $H_1 = H_2 = H$, and note that

$$(F'_{x^*})^* F'_{x^*} = T \in \mathcal{L}_H. \tag{4.25}$$

For $x_n$, let

$$(F'_{x_n})^* F'_{x_n} = \tilde{T} \in \mathcal{L}_H, \tag{4.26}$$

and set

$$\tilde{T} = T + (\tilde{T} - T) = T + \Pi, \tag{4.27}$$

where obviously $\Pi \in \mathcal{L}_H$. It is not hard to see that (4.23) entails

$$\|\Pi\|_{\mathcal{L}} \le a\|x^* - x_n\|, \tag{4.28}$$

for some $0 < a < \infty$. Let $\Gamma$ be the contour in (2.1) and $D$ the corresponding domain. As in Bakushinsky and Kokurin [13], a function $\Theta(z, \alpha)$, $z \in D$, is employed in the iteration scheme, which is analytic on $D$.

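Narrowing down the generality in Bakushinsky and Kokurin [13] somewhat further, so that the current conditions are satisfied, their proof of the convergence of the iterations requires an upper bound for the expression (in our notation)

$$\frac{1}{2\pi i} \oint_\Gamma \{1 - \Theta(z, \alpha_n)\}\{R(z) - \tilde{R}(z)\}\, dz = \{I - \Theta(T, \alpha_n)\} - \{I - \Theta(\tilde{T}, \alpha_n)\} = \Theta(\tilde{T}, \alpha_n) - \Theta(T, \alpha_n). \tag{4.29}$$

Keeping $n$ fixed, let us briefly write this last expression as $\Theta(\tilde{T}) - \Theta(T)$. Now Theorem 2.1 applies with $\varphi = \Theta$, and application yields at once

$$\begin{aligned}
\|\Theta(\tilde{T}) - \Theta(T)\|_{\mathcal{L}} &\le \|\Theta'_T(\tilde{T} - T)\|_{\mathcal{L}} + O\left(\|\tilde{T} - T\|^2_{\mathcal{L}}\right) \\
&\le b\|\tilde{T} - T\|_{\mathcal{L}} + O\left(\|\tilde{T} - T\|^2_{\mathcal{L}}\right) \\
&\le ab\|x^* - x_n\| + O\left(\|x^* - x_n\|^2\right),
\end{aligned} \tag{4.30}$$

for some $0 < b < \infty$, by (4.28).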

Acknowledgments

The authors are grateful to the referee for some useful comments. For this research, D. S. Gilliam was supported by AFOSR Grant no. FA9550-04-1027 and F. H. Ruymgaart by NSF Grant no. DMS-0605167.

References

[1] T. Kato, Perturbation Theory for Linear Operators, Springer, Berlin, Germany, 1966.

[2] F. Rellich, Perturbation Theory of Eigenvalue Problems, Gordon and Breach, New York, NY, USA, 1969.

[3] F. Chatelin, Spectral Approximation of Linear Operators, Computer Science and Applied Mathematics, Academic Press, New York, NY, USA, 1983.

[4] J. Dauxois, A. Pousse, and Y. Romain, “Asymptotic theory for the principal component analysis of a vector random function: some applications to statistical inference,” Journal of Multivariate Analysis, vol. 12, no. 1, pp. 136–154, 1982.

[5] R. Bhatia, Positive Definite Matrices, Princeton Series in Applied Mathematics, Princeton University Press, Princeton, NJ, USA, 2007.

[6] F. Ruymgaart and S. Yang, “Some applications of Watson’s perturbation approach to random matrices,” Journal of Multivariate Analysis, vol. 60, no. 1, pp. 48–60, 1997.

[7] P. D. Lax, Functional Analysis, Pure and Applied Mathematics, John Wiley & Sons, New York, NY, USA, 2002.

[8] N. Dunford and J. T. Schwartz, Linear Operators. Part I: General Theory, Wiley Classics Library, Wiley-Interscience, New York, NY, USA, 1988.

[9] L. Debnath and P. Mikusiński, Introduction to Hilbert Spaces with Applications, Academic Press, San Diego, Calif, USA, 2nd edition, 1999.

[10] G. S. Watson, Statistics on Spheres, vol. 6 of University of Arkansas Lecture Notes in the Mathematical Sciences, John Wiley & Sons, New York, NY, USA, 1983.

[11] F. Riesz and B. Sz.-Nagy, Functional Analysis, Dover Books on Advanced Mathematics, Dover, New York, NY, USA, 1990.

[12] A. C. M. van Rooij and F. Ruymgaart, “Asymptotic minimax rates for abstract linear estimators,” Journal of Statistical Planning and Inference, vol. 53, no. 3, pp. 389–402, 1996.

[13] A. B. Bakushinsky and M. Yu. Kokurin, Iterative Methods for Approximate Solution of Inverse Problems, vol. 577 of Mathematics and Its Applications, Springer, Dordrecht, The Netherlands, 2004.

[14] P. Hall and J. L. Horowitz, “Nonparametric methods for inference in the presence of instrumental variables,” The Annals of Statistics, vol. 33, no. 6, pp. 2904–2929, 2005.

[15] J.-P. Florens, “Inverse problems and structural econometrics: the example of instrumental variables,” in Advances in Economics and Econometrics: Theory and Applications, M. Dewatripont, L. P. Hansen, and S. J. Turnovsky, Eds., vol. 2, pp. 284–311, Cambridge University Press, Cambridge, UK, 2003.

[16] S. E. Leurgans, R. A. Moyeed, and B. W. Silverman, “Canonical correlation analysis when the data are curves,” Journal of the Royal Statistical Society. Series B, vol. 55, no. 3, pp. 725–740, 1993.

[17] J. Cupidon, R. Eubank, D. S. Gilliam, and F. Ruymgaart, “Some properties of canonical correlations and variates in infinite dimensions,” Journal of Multivariate Analysis, vol. 99, no. 6, pp. 1083–1104, 2008.

[18] J. Cupidon, D. S. Gilliam, R. Eubank, and F. Ruymgaart, “The delta method for analytic functions of random operators with application to functional data,” Bernoulli, vol. 13, no. 4, pp. 1179–1194, 2007.

[19] A. W. van der Vaart, Asymptotic Statistics, Cambridge University Press, Cambridge, UK, 1998.