
Orthogonal Rational Functions on the Unit Circle with Prescribed Poles not on the Unit Circle

Adhemar BULTHEEL †, Ruymán CRUZ-BARROSO ‡ and Andreas LASAROW §

† Department of Computer Science, KU Leuven, Belgium
E-mail: adhemar.bultheel@cs.kuleuven.be
URL: https://people.cs.kuleuven.be/~adhemar.bultheel/

‡ Department of Mathematical Analysis, La Laguna University, Tenerife, Spain
E-mail: rcruzb@ull.es

§ Fak. Informatik, Mathematik & Naturwissenschaften, HTWK Leipzig, Germany
E-mail: lasarow@imn.htwk-leipzig.de

Received August 01, 2017, in final form November 20, 2017; Published online December 03, 2017 https://doi.org/10.3842/SIGMA.2017.090

Abstract. Orthogonal rational functions (ORF) on the unit circle generalize orthogonal polynomials (poles at infinity) and Laurent polynomials (poles at zero and infinity). In this paper we investigate the properties of and the relation between these ORF when the poles are all outside or all inside the unit disk, or when they can be anywhere in the extended complex plane outside the unit circle. Some properties of matrices that are the product of elementary unitary transformations will be proved and some connections with related algorithms for direct and inverse eigenvalue problems will be explained.

Key words: orthogonal rational functions; rational Szegő quadrature; spectral method; rational Krylov method; AMPD matrix

2010 Mathematics Subject Classification: 30D15; 30E05; 42C05; 44A60

1 Introduction

Orthogonal rational functions (ORF) on the unit circle are well known as generalizations of orthogonal polynomials on the unit circle (OPUC). The pole at infinity of the polynomials is replaced by poles “in the neighborhood” of infinity, i.e., poles outside the closed unit disk. The recurrence relations for the ORF generalize the Szegő recurrence relations for the polynomials.

If $\mu$ is the orthogonality measure supported on the unit circle, and $L^2_\mu$ the corresponding Hilbert space, then the shift operator $T_\mu\colon L^2_\mu\to L^2_\mu\colon f(z)\mapsto zf(z)$, restricted to the polynomials, has a representation with respect to the orthogonal polynomials that is a Hessenberg matrix.

However, if instead of a polynomial basis one uses a basis of orthogonal Laurent polynomials, obtained by alternating between poles at infinity and poles at the origin, a full unitary representation of $T_\mu$ with respect to this basis is a five-diagonal CMV matrix [12].

The previous ideas have been generalized to the rational case by Velázquez in [47]. He showed that the representation of the shift operator with respect to the classical ORF is not a Hessenberg matrix but a matrix Möbius transform of a Hessenberg matrix. However, a full unitary representation can be obtained if the shift is represented with respect to a rational analog of the Laurent polynomials by alternating between a pole inside and a pole outside the unit disk. The resulting matrix is a matrix Möbius transform of a five-diagonal matrix.

This paper is a contribution to the Special Issue on Orthogonal Polynomials, Special Functions and Applications (OPSFA14). The full collection is available at https://www.emis.de/journals/SIGMA/OPSFA2017.html


Orthogonal Laurent polynomials on the real line, a half-line, or an interval were introduced by Jones et al. [31, 32] in the context of moment problems, Padé approximation and quadrature, and this was elaborated by many authors. González-Vera and his co-workers were in particular involved in extending the theory where the poles zero and infinity alternate (the so-called balanced situation) to a more general case where in each step either infinity or zero can be chosen as a pole in any arbitrary order [8, 20]. They also identify the resulting orthogonal Laurent polynomials as shifted versions of the orthogonal polynomials. Hence the orthogonal Laurent polynomials satisfy the same recurrence as the classical orthogonal polynomials once an appropriate shift and normalization are applied.

The corresponding case of orthogonal Laurent polynomials on the unit circle was introduced by Thron in [40] and has been studied more recently in for example [15,18]. Papers traditionally deal with the balanced situation like in [18] but in [15] also an arbitrary ordering was considered.

The structure of the matrix representation with respect to the basis of the resulting orthogonal Laurent polynomials on the circle was only investigated by Cruz-Barroso and Delvaux in [16]. They called it a “snake-shaped” matrix, which generalizes the five-diagonal matrix.

The purpose of this paper is to generalize these ideas, valid for Laurent polynomials on the circle, to the rational case. That is, we choose the poles of the ORF in an arbitrary order either inside or outside the unit disk. We relate the resulting ORF with the ORF having all their poles outside or all their poles inside the disk, and study the corresponding recurrence relations. With respect to this new orthogonal rational basis, the shift operator will be represented by a matrix Möbius transformation of a snake-shaped matrix.

In the papers by Lasarow and coworkers (e.g., [23, 24, 25, 34]) matrix versions of the ORF are considered. In these papers an arbitrary choice of the poles is also allowed, but only with the restrictive condition that if $\alpha$ is used as a pole, then $1/\overline{\alpha}$ cannot be used anymore. This means that for example the “balanced situation” is excluded. One of the goals of this paper is to remove this restriction on the poles.

In the context of quadrature formulas, an arbitrary sequence of poles not on the unit circle was also briefly discussed in [19]. The sequence of poles considered there need not be Newtonian, i.e., the poles for the ORF of degree $n$ may depend on $n$. Since our approach will emphasize the role of the recurrence relation for the ORF, we do need a Newtonian sequence, although some of the results may be generalizable to the situation of a non-Newtonian sequence of poles.

One of the applications of the theory of ORF is the construction of quadrature formulas on the unit circle that form rational generalizations of the Szegő quadrature. They are exact in spaces of rational functions having poles inside and outside the unit disk. The nodes of the quadrature formula are zeros of para-orthogonal rational functions (PORF) and the weights are all positive numbers. These nodes and weights can (like in Gaussian quadrature) be derived from the eigenvalue decomposition of a unitary truncation of the shift operator to a finite-dimensional subspace. One of the results of the paper is that there is no gain in considering an arbitrary sequence of poles inside and outside the unit disk, except in a balanced situation. When all the poles are chosen outside the closed unit disk, or when some of them are reflected in the circle, the same quadrature formula will be obtained. The computational effort for the general case will not increase, but neither will it reduce the cost.

In network applications or differential equations one often has to work with functions of large sparse matrices. If $A$ is a matrix and the matrix function $f(A)$ allows the Cauchy representation
$$ f(A) = \int_\Gamma f(z)(z-A)^{-1}\,d\mu(z), $$
where $\Gamma$ is a contour encircling all the eigenvalues of $A$, then numerical quadrature is a possible technique to obtain an approximation for $f(A)$. If for example $\Gamma$ is the unit circle, then expressions like $u^*f(A)u$ for some vector $u$ can be approximated by quadrature formulas discussed in this paper, which will be implemented disguised as Krylov subspace methods (see for example [27, 29, 33]).
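As a concrete illustration of this idea, the following minimal sketch approximates $u^*f(A)u$ starting from the standard Cauchy form $f(A)=\frac{1}{2\pi i}\oint_{\mathbb{T}} f(z)(zI-A)^{-1}\,dz$ (a special case of the representation above). It is only a stand-in: equally spaced nodes (the trapezoidal rule) replace the rational Szegő quadrature of this paper, and the matrix, vector and function are made up for the example.

```python
import numpy as np

def quad_ufAu(f, A, u, n_nodes=128):
    """Approximate u* f(A) u by discretizing the Cauchy integral over the unit
    circle with equally spaced nodes (assumes f analytic on and inside T and
    all eigenvalues of A strictly inside T)."""
    n = A.shape[0]
    I = np.eye(n)
    nodes = np.exp(2j * np.pi * np.arange(n_nodes) / n_nodes)
    total = 0.0 + 0.0j
    for zk in nodes:
        x = np.linalg.solve(zk * I - A, u)   # (zk I - A)^{-1} u via a linear solve
        total += f(zk) * zk * np.vdot(u, x)  # extra zk comes from dz = i z dtheta
    return total / n_nodes

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = 0.2 * rng.standard_normal((6, 6))    # spectrum well inside the unit circle
    u = rng.standard_normal(6)
    # reference value of u* exp(A) u via an eigendecomposition
    w, V = np.linalg.eig(A)
    exact = u @ (V @ np.diag(np.exp(w)) @ np.linalg.inv(V)) @ u
    print(abs(quad_ufAu(np.exp, A, u) - exact))   # small
```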


The purpose of the paper though is not to discuss quadrature in particular. It is just an example application that does not require much extra introduction of new terminology and notation. The main purpose however is to give a general framework on which to build for the many applications of ORFs. Just like orthogonal polynomials are used in about every branch of mathematics, ORFs can be used with the extra freedom to exploit the location of the poles.

For example, it can be shown that the ORFs can be used to solve multipoint moment problems as well as more general rational interpolation problems, where the locations of the poles inside and outside the circle are important for engineering applications like system identification, model reduction, filtering, etc. When modelling the transfer function of a linear system, poles should be chosen inside as well as outside the disk to guarantee that both the transient and the steady state of the system are well modelled. It would lead us too far to also include the interpolation properties of multipoint Padé approximation and the related applications in several branches of engineering. We only provide the basics in this paper so that it can be used in the context of more applied papers.

The interpretation of the recursion for the ORFs as a factorization of a matrix into elementary unitary transformations illustrates that the spectrum of the resulting matrix is independent of the order in which the elementary factors are multiplied. As far as we know, this fact was previously unknown in the linear algebra community, except in particular cases like unitary Hessenberg matrices. As an illustration, we develop some preliminary results in Section 11 in a linear algebra setting that is slightly more general than the ORF situation.

In the last decades, many papers appeared on inverse eigenvalue problems for unitary Hessenberg matrices and rational Krylov methods. Some examples are [4, 30, 35, 36, 37, 38, 44].

These use elementary operations that are very closely related to the recurrence that will be discussed in this paper. However, they are not the same and often miss the flexibility discussed here. We shall illustrate some of these connections with certain algorithms from the literature in Section 12.

The outline of the paper is as follows. In Section 2 we introduce the main notations used in this paper. The linear spaces and the ORF bases are given in Section 3. Section 4 brings the Christoffel–Darboux relations and the reproducing kernels, which form an essential element to obtain the recurrence relation given in Section 5, but also for the PORF in Section 6, to be used for quadrature formulas in Section 7. The alternative representation of the shift operator is given in Section 8 and its factorization in elementary 2×2 blocks in the subsequent Section 9. We end by drawing some conclusions about the spectrum of the shift operator and about the computation of rational Szegő quadrature formulas in Section 10. The ideas that we present in this paper, especially the factorization of unitary Hessenberg matrices into elementary unitary factors, are also used in the linear algebra literature, mostly in the finite-dimensional situation. These elementary factors, and what can be said about the spectrum of their product, are the subject of Section 11.

These elementary unitary transformations are intensively used in numerical algorithms such as Arnoldi-based Krylov methods, where they are known as core transformations. Several variants of these rational Krylov methods exist. The algorithms are quite similar to, yet different from, our ORF recursion, as we explain briefly in Section 12, illustrating why we believe the version presented in this paper has clear advantages.

2 Basic definitions and notation

We use the following notation. $\mathbb{C}$ denotes the complex plane, $\widehat{\mathbb{C}}$ the extended complex plane (one point compactification), $\mathbb{R}$ the real line, $\widehat{\mathbb{R}}$ the closure of $\mathbb{R}$ in $\widehat{\mathbb{C}}$, $\mathbb{T}$ the unit circle, $\mathbb{D}$ the open unit disk, $\widehat{\mathbb{D}}=\mathbb{D}\cup\mathbb{T}$, and $\mathbb{E}=\widehat{\mathbb{C}}\setminus\widehat{\mathbb{D}}$. For any number $z\in\widehat{\mathbb{C}}$ we define $z_*=1/\overline{z}$ (and set $1/0=\infty$, $1/\infty=0$), and for any complex function $f$, we define $f_*(z)=\overline{f(1/\overline{z})}$.
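To make the substar operation concrete, here is a tiny numerical sketch (our own illustration, not code from the paper; the helper names are hypothetical). The final check anticipates the superstar of the elementary factors introduced below.

```python
import numpy as np

# substar of a point and of a function:
#   z_* = 1/conj(z),    f_*(z) = conj(f(1/conj(z)))

def substar_point(z):
    if z == 0:
        return np.inf
    if np.isinf(z):
        return 0.0
    return 1.0 / np.conj(z)

def substar(f):
    return lambda z: np.conj(f(1.0 / np.conj(z)))

# example: for w(z) = 1 - conj(a) z one gets z * w_*(z) = z - a
a = 0.3 - 0.4j
w = lambda z: 1 - np.conj(a) * z
w_sub = substar(w)
z = 0.7 + 0.2j
print(np.isclose(z * w_sub(z), z - a))   # True
```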


To approximate an integral
$$ I_\mu(f) = \int_{\mathbb{T}} f(z)\,d\mu(z), $$
where $\mu$ is a probability measure on $\mathbb{T}$, one may use Szegő quadrature formulas. The nodes of this quadrature can be computed by using the Szegő polynomials. Orthogonality in this paper will always be with respect to the inner product
$$ \langle f,g\rangle = \int_{\mathbb{T}} f(z)\overline{g(z)}\,d\mu(z). $$
The weights of the $n$-point quadrature are all positive, the nodes are on $\mathbb{T}$, and the formula is exact for all Laurent polynomials $f\in\operatorname{span}\{z^k: |k|\le n-1\}$.

This has been generalized to rational functions with a set of predefined poles. The corresponding quadrature formulas are then rational Szegő quadratures. This has been discussed in many papers and some of the earlier results were summarized in the book [9]. We briefly recall some of the results that are derived there. The idea is the following. Fix a sequence $\alpha=(\alpha_k)_{k\in\mathbb{N}}$ with $\{\alpha_k\}_{k\in\mathbb{N}}\subset\mathbb{D}$, and consider the subspaces of rational functions defined by
$$ \mathcal{L}_0=\mathbb{C}, \qquad \mathcal{L}_n=\left\{\frac{p_n(z)}{\pi_n(z)}: p_n\in\mathcal{P}_n,\ \pi_n(z)=\prod_{k=1}^{n}(1-\overline{\alpha}_k z)\right\}, \qquad n\ge 1, $$
where $\mathcal{P}_n$ is the set of polynomials of degree at most $n$. These rational functions have their poles among the points in $\alpha_*=\{\alpha_{j*}=1/\overline{\alpha}_j: \alpha_j\in\alpha\}$. We denote the corresponding sequence as $\alpha_*=(\alpha_{j*})_{j\in\mathbb{N}}$. Let $\phi_n\in\mathcal{L}_n\setminus\mathcal{L}_{n-1}$, with $\phi_n\perp\mathcal{L}_{n-1}$, be the $n$th orthogonal rational basis function (ORF) in a nested sequence. It is well known that these functions have all their zeros in $\mathbb{D}$ (see, e.g., [9, Corollary 3.1.4]). However, the quadrature formulas we have in mind should have their nodes on the circle $\mathbb{T}$. Therefore, para-orthogonal rational functions (PORF) are introduced. They are defined by

$$ Q_n(z,\tau) = \phi_n(z)+\tau\,\phi_n^*(z), \qquad \tau\in\mathbb{T}, $$
where besides the ORF $\phi_n(z)=\frac{p_n(z)}{\pi_n(z)}$, also the “reciprocal” function
$$ \phi_n^*(z) = \frac{p_n^*(z)}{\pi_n(z)} = \frac{z^n p_{n*}(z)}{\pi_n(z)} $$
is introduced. These PORF have $n$ simple zeros $\{\xi_{nk}\}_{k=1}^n\subset\mathbb{T}$ (see, e.g., [9, Theorem 5.2.1]), so that they can be used as nodes for the quadrature formulas
$$ I_n(f) = \sum_{k=1}^{n} w_{nk}\,f(\xi_{nk}), $$
and the weights are all positive, given by $w_{nk}=1\Big/\sum_{j=0}^{n-1}|\phi_j(\xi_{nk})|^2$ (see, e.g., [9, Theorem 5.4.2]). These quadrature formulas are exact for all functions of the form $\{f=gh_*: g,h\in\mathcal{L}_{n-1}\}=\mathcal{L}_{n-1}\mathcal{L}_{(n-1)*}$ (see, e.g., [9, Theorem 5.3.4]).
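The weight formula is easy to exercise numerically. The sketch below does so in the simplest special case that follows from the above: all $\alpha_j=0$ and $\mu$ the normalized Lebesgue measure, so that $\phi_j(z)=z^j$ and $Q_n(z,\tau)$ reduces to the polynomial $z^n+\tau$ (an ordinary Szegő quadrature rather than a rational one); the exactness check runs over the Laurent monomials.

```python
import numpy as np

# weights w_{nk} = 1 / sum_{j<n} |phi_j(xi_{nk})|^2 in the all-poles-at-zero case
n, tau = 5, np.exp(1j * 0.7)
# zeros of z^n + tau: the n-th roots of -tau, all on T
nodes = (-tau) ** (1.0 / n) * np.exp(2j * np.pi * np.arange(n) / n)
phi_vals = np.vander(nodes, n, increasing=True)        # phi_j(xi_k) = xi_k^j
weights = 1.0 / np.sum(np.abs(phi_vals) ** 2, axis=1)  # here all equal 1/n

# exactness for Laurent monomials z^k, |k| <= n-1: int_T z^k dmu = delta_{k,0}
for k in range(-(n - 1), n):
    approx = np.sum(weights * nodes ** k)
    exact = 1.0 if k == 0 else 0.0
    assert abs(approx - exact) < 1e-12
print("weights:", weights)
```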

The purpose of this paper is to generalize the situation where the αj are all in D to the situation where they are anywhere in the extended complex plane outside T. This will require the introduction of some new notation.

So consider a sequence $\alpha$ with $\alpha\subset\mathbb{D}$ and its reflection in the circle $\beta=(\beta_j)_{j\in\mathbb{N}}$, where $\beta_j=1/\overline{\alpha}_j=\alpha_{j*}\in\mathbb{E}$. We now construct a new sequence $\gamma=(\gamma_j)_{j\in\mathbb{N}}$ where each $\gamma_j$ is either equal to $\alpha_j$ or $\beta_j$.

Partition $\{1,2,\ldots,n\}$ ($n\in\widehat{\mathbb{N}}=\mathbb{N}\cup\{\infty\}$) into two disjoint index sets: the indices where $\gamma_j=\alpha_j$ and the indices where $\gamma_j=\beta_j$:
$$ a_n=\{j: \gamma_j=\alpha_j\in\mathbb{D},\ 1\le j\le n\} \qquad\text{and}\qquad b_n=\{j: \gamma_j=\beta_j\in\mathbb{E},\ 1\le j\le n\}, $$
and define
$$ \alpha_n=\{\alpha_j: j\in a_n\} \qquad\text{and}\qquad \beta_n=\{\beta_j: j\in b_n\}. $$
It will be useful to prepend the sequence $\alpha$ with an extra point $\alpha_0=0$. That means that $\beta$ is preceded by $\beta_0=1/\overline{\alpha}_0=\infty$. For $\gamma$, the initial point can be $\gamma_0=\alpha_0=0$ or $\gamma_0=\beta_0=\infty$.
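As a quick illustration of this bookkeeping, here is a small sketch that builds $\gamma$, $a_n$ and $b_n$ from a given $\alpha$ and a choice of which indices are reflected; the helper name `make_gamma` and the example data are ours, not the paper's.

```python
import numpy as np

def make_gamma(alpha, choose_beta):
    """alpha: points in the open unit disk D (nonzero in this sketch).
    choose_beta: booleans, True meaning gamma_j = beta_j = 1/conj(alpha_j)."""
    alpha = np.asarray(alpha, dtype=complex)
    beta = 1.0 / np.conj(alpha)
    gamma = np.where(choose_beta, beta, alpha)
    a_n = [j + 1 for j, c in enumerate(choose_beta) if not c]   # gamma_j = alpha_j
    b_n = [j + 1 for j, c in enumerate(choose_beta) if c]       # gamma_j = beta_j
    return gamma, a_n, b_n

# example: alternate a pole inside the disk with one outside
alpha = [0.5, 0.3 - 0.2j, -0.4j, 0.6 + 0.1j]
gamma, a_n, b_n = make_gamma(alpha, [False, True, False, True])
print(gamma)      # [alpha_1, beta_2, alpha_3, beta_4]
print(a_n, b_n)   # [1, 3] [2, 4]
```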

With each of the sequences $\alpha$, $\beta$, and $\gamma$ we can associate orthogonal rational functions. They will be closely related as we shall show. The ORF for the $\gamma$ sequence can be derived from the ORF for the $\alpha$ sequence by multiplying with a Blaschke product, just like the orthogonal Laurent polynomials are essentially shifted versions of the orthogonal polynomials (see, e.g., [15]).

To define the denominators of our rational functions, we introduce the following elementary factors:
$$ \varpi_j^\alpha(z) = 1-\overline{\alpha}_j z, \qquad \varpi_j^\beta(z) = \begin{cases} 1-\overline{\beta}_j z, & \text{if } \beta_j\ne\infty, \\ -z, & \text{if } \beta_j=\infty, \end{cases} \qquad \varpi_j^\gamma(z) = \begin{cases} \varpi_j^\alpha(z), & \text{if } \gamma_j=\alpha_j, \\ \varpi_j^\beta(z), & \text{if } \gamma_j=\beta_j. \end{cases} $$
Note that if $\alpha_j=0$, and hence $\beta_j=\infty$, then $\varpi_j^\alpha(z)=1$ but $\varpi_j^\beta(z)=-z$.

To separate the $\alpha$- and the $\beta$-factors in a product, we also define
$$ \dot\varpi_j^\alpha(z) = \begin{cases} \varpi_j^\alpha, & \text{if } \gamma_j=\alpha_j, \\ 1, & \text{if } \gamma_j=\beta_j, \end{cases} \qquad\text{and}\qquad \dot\varpi_j^\beta(z) = \begin{cases} \varpi_j^\beta, & \text{if } \gamma_j=\beta_j, \\ 1, & \text{if } \gamma_j=\alpha_j. \end{cases} $$
Because the sequence $\gamma$ is our main focus, we simplify the notation by removing the superscript $\gamma$ when not needed, e.g., $\varpi_j=\varpi_j^\gamma=\dot\varpi_j^\alpha\dot\varpi_j^\beta$, etc.

We can now define for $\nu\in\{\alpha,\beta,\gamma\}$
$$ \pi_n^\nu(z) = \prod_{j=1}^{n}\varpi_j^\nu(z) $$
and the reduced products separating the $\alpha$- and the $\beta$-factors
$$ \dot\pi_n^\alpha(z) = \prod_{j=1}^{n}\dot\varpi_j^\alpha(z) = \prod_{j\in a_n}\varpi_j(z), \qquad \dot\pi_n^\beta(z) = \prod_{j=1}^{n}\dot\varpi_j^\beta(z) = \prod_{j\in b_n}\varpi_j(z), $$
so that
$$ \pi_n(z) = \prod_{j=1}^{n}\varpi_j(z) = \dot\pi_n^\alpha(z)\,\dot\pi_n^\beta(z). $$
We assume here and in the rest of the paper that products over $j\in\varnothing$ equal 1.

The Blaschke factors are defined for $\nu\in\{\alpha,\beta,\gamma\}$ as
$$ \zeta_j^\nu(z) = \sigma_j^\nu\,\frac{z-\nu_j}{1-\overline{\nu}_j z}, \quad \sigma_j^\nu = \frac{\overline{\nu}_j}{|\nu_j|}, \quad \text{if } \nu_j\notin\{0,\infty\}, $$
$$ \zeta_j^\nu(z) = \sigma_j^\nu z = z, \quad \sigma_j^\nu=1, \quad \text{if } \nu_j=0, \qquad\qquad \zeta_j^\nu(z) = \sigma_j^\nu/z = 1/z, \quad \sigma_j^\nu=1, \quad \text{if } \nu_j=\infty. $$
Thus
$$ \sigma_j^\nu = \begin{cases} \overline{\nu}_j/|\nu_j|, & \text{for } \nu_j\notin\{0,\infty\}, \\ 1, & \text{for } \nu_j\in\{0,\infty\}. \end{cases} $$


Because $\sigma_n^\alpha=\sigma_n^\beta$, we can remove the superscript and just write $\sigma_n$. If we also use the following notation, which maps complex numbers onto $\mathbb{T}$,
$$ u(z) = \begin{cases} \overline{z}/|z|\in\mathbb{T}, & z\in\mathbb{C}\setminus\{0\}, \\ 1, & z\in\{0,\infty\}, \end{cases} $$
then $\sigma_j=u(\alpha_j)=u(\beta_j)=u(\gamma_j)$.

Set $(\varpi_j^\nu)^*(z) = \varpi_j^{\nu*}(z) = z\,\varpi_{j*}^\nu(z)$ (e.g., $(1-\overline{\alpha}_j z)^* = z-\alpha_j$ if $\nu=\alpha$); then $\zeta_j^\nu = \sigma_j\,\dfrac{\varpi_j^{\nu*}}{\varpi_j^\nu}$. Later we shall also use $\pi_n^{\nu*}$ to mean $\prod_{j=1}^{n}\varpi_j^{\nu*}$. Note that $\zeta_j^\alpha = \zeta_{j*}^\beta = 1/\zeta_j^\beta$. Moreover, if $\alpha_j=0$ and hence $\beta_j=\infty$, then $\varpi_j^{\alpha*}(z)=z$ and $\varpi_j^{\beta*}(z)=-1$.
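These identities are easy to check numerically. The following sketch (our own illustration, using the convention $\sigma_j=\overline{\nu}_j/|\nu_j|$ as reconstructed above) verifies the reflection property $\zeta_j^\alpha\zeta_j^\beta=1$, the representation $\zeta_j^\nu=\sigma_j\varpi_j^{\nu*}/\varpi_j^\nu$, and the fact that the factors are unimodular on $\mathbb{T}$.

```python
import numpy as np

def sigma(nu):
    return 1.0 if nu in (0, np.inf) else np.conj(nu) / abs(nu)

def varpi(nu, z):                 # elementary factor: 1 - conj(nu) z  (or -z for nu = inf)
    return -z if np.isinf(nu) else 1 - np.conj(nu) * z

def zeta(nu, z):                  # Blaschke factor associated with the point nu
    if nu == 0:
        return z
    if np.isinf(nu):
        return 1 / z
    return sigma(nu) * (z - nu) / (1 - np.conj(nu) * z)

alpha = 0.6 - 0.3j
beta = 1 / np.conj(alpha)         # reflection of alpha in the circle
z = 0.8 * np.exp(0.9j)

print(np.isclose(zeta(alpha, z) * zeta(beta, z), 1.0))            # zeta^alpha = 1/zeta^beta
superstar = z * np.conj(varpi(alpha, 1 / np.conj(z)))             # varpi^*(z) = z varpi_*(z)
print(np.isclose(zeta(alpha, z), sigma(alpha) * superstar / varpi(alpha, z)))
print(np.isclose(abs(zeta(alpha, np.exp(2j))), 1.0))              # |zeta| = 1 on T
```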

Next define the finite Blaschke products for $\nu\in\{\alpha,\beta\}$:
$$ B_0^\nu = 1, \qquad\text{and}\qquad B_n^\nu(z) = \prod_{j=1}^{n}\zeta_j^\nu(z), \qquad n=1,2,\ldots. $$
It is important to note that here $\nu\ne\gamma$. For the definition of $B_n^\gamma=B_n$ see below.

Like we have split up the denominators $\pi_n=\dot\pi_n^\alpha\dot\pi_n^\beta$ in the $\alpha$-factors and the $\beta$-factors, we also define for $n\ge 1$
$$ \dot\zeta_j^\alpha = \begin{cases} \zeta_j^\alpha, & \text{if } \gamma_j=\alpha_j, \\ 1, & \text{if } \gamma_j=\beta_j, \end{cases} \qquad \dot\zeta_j^\beta = \begin{cases} \zeta_j^\beta, & \text{if } \gamma_j=\beta_j, \\ 1, & \text{if } \gamma_j=\alpha_j, \end{cases} $$
and
$$ \dot B_n^\alpha(z) = \prod_{j=1}^{n}\dot\zeta_j^\alpha(z) = \prod_{j\in a_n}\zeta_j(z), \qquad\text{and}\qquad \dot B_n^\beta(z) = \prod_{j=1}^{n}\dot\zeta_j^\beta(z) = \prod_{j\in b_n}\zeta_j(z), $$
so that we can define the finite Blaschke products for the $\gamma$ sequence:
$$ B_n(z) = \begin{cases} \dot B_n^\alpha(z), & \text{if } \gamma_n=\alpha_n, \\ \dot B_n^\beta(z), & \text{if } \gamma_n=\beta_n. \end{cases} $$

Note that the reflection property of the factors also holds for the products: $B_n^\alpha = (B_n^\beta)_* = 1/B_n^\beta$, $B_{n*} = 1/B_n$, and $(\dot B_n^\alpha\dot B_n^\beta)_* = 1/(\dot B_n^\beta\dot B_n^\alpha)$. However,
$$ \dot B_n^\alpha = \prod_{j\in a_n}\zeta_j^\alpha = \prod_{j\in a_n}\zeta_{j*}^\beta = \prod_{j\in a_n}\frac{1}{\zeta_j^\beta} \ne \prod_{j\in b_n}\frac{1}{\zeta_j^\beta} = \frac{1}{\dot B_n^\beta}. $$
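A short numerical sketch of these products (again our own illustration, reusing the `zeta` helper and conventions from the previous snippet): it checks the reflection property $B_n^\alpha=1/B_n^\beta$ and the splitting $B_n^\alpha=\dot B_n^\alpha/\dot B_n^\beta$ used at the start of Section 3.

```python
import numpy as np

def zeta(nu, z):
    if nu == 0:
        return z
    if np.isinf(nu):
        return 1 / z
    return (np.conj(nu) / abs(nu)) * (z - nu) / (1 - np.conj(nu) * z)

def blaschke_products(alpha, choose_beta, z):
    """Return B_n^alpha, B_n^beta, dot B_n^alpha, dot B_n^beta and B_n at z."""
    Ba = Bb = Bda = Bdb = 1.0 + 0j
    for a, cb in zip(alpha, choose_beta):
        b = 1 / np.conj(a)
        Ba *= zeta(a, z)
        Bb *= zeta(b, z)
        if cb:                         # gamma_j = beta_j
            Bdb *= zeta(b, z)
        else:                          # gamma_j = alpha_j
            Bda *= zeta(a, z)
    Bn = Bdb if choose_beta[-1] else Bda
    return Ba, Bb, Bda, Bdb, Bn

alpha = [0.5, 0.3 - 0.2j, -0.4j]
Ba, Bb, Bda, Bdb, Bn = blaschke_products(alpha, [False, True, False], z=0.7 + 0.4j)
print(np.isclose(Ba * Bb, 1.0))        # B_n^alpha = 1/B_n^beta
print(np.isclose(Ba, Bda / Bdb))       # B_n^alpha = dot B_n^alpha / dot B_n^beta
print(np.isclose(Bn, Bda))             # here gamma_n = alpha_n, so B_n = dot B_n^alpha
```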

3 Linear spaces and ORF bases

We can now introduce our spaces of rational functions for $n\ge 0$:
$$ \mathcal{L}_n^\nu = \operatorname{span}\{B_0^\nu, B_1^\nu,\ldots,B_n^\nu\}, \quad \nu\in\{\alpha,\beta,\gamma\}, \qquad\text{and}\qquad \dot{\mathcal{L}}_n^\nu = \operatorname{span}\{\dot B_0^\nu,\dot B_1^\nu,\ldots,\dot B_n^\nu\}, \quad \nu\in\{\alpha,\beta\}. $$
The dimension of $\mathcal{L}_n^\nu$ is $n+1$ for $\nu\in\{\alpha,\beta,\gamma\}$, but note that the dimension of $\dot{\mathcal{L}}_n^\nu$ for $\nu\in\{\alpha,\beta\}$ can be less than $n+1$. Indeed, some of the $\dot B_j^\nu$ may be repeated, so that for example the dimension of $\dot{\mathcal{L}}_n^\alpha$ is only $|a_n|+1$, with $|a_n|$ the cardinality of $a_n$, and similarly for $\nu=\beta$. Hence for $\nu=\gamma$:
$$ \mathcal{L}_n = \operatorname{span}\{B_0,\ldots,B_n\} = \operatorname{span}\big\{\dot B_0,\dot B_1^\alpha,\ldots,\dot B_n^\alpha,\dot B_1^\beta,\ldots,\dot B_n^\beta\big\} = \dot{\mathcal{L}}_n^\alpha+\dot{\mathcal{L}}_n^\beta. $$


Because for $n\ge 1$
$$ \dot B_n^\alpha = \prod_{j\in a_n}\zeta_j^\alpha = \prod_{j\in a_n}\frac{1}{\zeta_j^\beta} \qquad\text{and}\qquad \dot B_n^\beta = \prod_{j\in b_n}\zeta_j^\beta = \prod_{j\in b_n}\frac{1}{\zeta_j^\alpha}, $$
it should be clear that $B_k^\alpha = \dot B_k^\alpha/\dot B_k^\beta$ and $B_k^\beta = \dot B_k^\beta/\dot B_k^\alpha$, hence that
$$ \mathcal{L}_n^\alpha = \operatorname{span}\left\{\dot B_0, \frac{\dot B_1^\alpha}{\dot B_1^\beta},\ldots,\frac{\dot B_n^\alpha}{\dot B_n^\beta}\right\} \qquad\text{and}\qquad \mathcal{L}_n^\beta = \operatorname{span}\left\{\dot B_0, \frac{\dot B_1^\beta}{\dot B_1^\alpha},\ldots,\frac{\dot B_n^\beta}{\dot B_n^\alpha}\right\}. $$

Occasionally we shall also need the notation
$$ \dot\varsigma_n^\alpha = \prod_{j\in a_n}\sigma_j \in\mathbb{T}, \qquad \dot\varsigma_n^\beta = \prod_{j\in b_n}\sigma_j \in\mathbb{T}, \qquad\text{and}\qquad \varsigma_n = \prod_{j=1}^{n}\sigma_j \in\mathbb{T}. $$

Lemma 3.1. If $f\in\mathcal{L}_n$ then $f/\dot B_n^\beta\in\mathcal{L}_n^\alpha$ and $f/\dot B_n^\alpha\in\mathcal{L}_n^\beta$. In other words, $\mathcal{L}_n = \dot B_n^\beta\mathcal{L}_n^\alpha = \dot B_n^\alpha\mathcal{L}_n^\beta$. This is true for all $n\ge 0$ if we set $\dot B_0^\alpha = \dot B_0^\beta = 1$.

Proof. This is trivial for $n=0$ since then $\mathcal{L}_n=\mathbb{C}$. If $f\in\mathcal{L}_n$ and $n\ge 1$, then it is of the form
$$ f(z) = \frac{p_n(z)}{\pi_n(z)} = \frac{p_n(z)}{\dot\pi_n^\alpha(z)\dot\pi_n^\beta(z)}, \qquad p_n\in\mathcal{P}_n. $$
Therefore
$$ \frac{f(z)}{\dot B_n^\beta(z)} = \overline{\dot\varsigma_n^\beta}\,\frac{p_n(z)\,\dot\pi_n^\beta(z)}{\dot\pi_n^\alpha(z)\,\dot\pi_n^\beta(z)\,\dot\pi_n^{\beta*}(z)} = \overline{\dot\varsigma_n^\beta}\,\frac{p_n(z)}{\dot\pi_n^\alpha(z)\,\dot\pi_n^{\beta*}(z)}. $$
Recall that $\varpi_j^{\beta*}=-1$ and $\sigma_j=1$ if $\beta_j=\infty$ (and hence $\alpha_j=0$); we can leave these factors out, and we shall write $\prod^{\boldsymbol{\cdot}}$ for the product instead of $\prod$, the dot meaning that we leave out all the factors for which $\alpha_j=1/\overline{\beta}_j=0$. Then
$$ \frac{\overline{\dot\varsigma_n^\beta}}{\dot\pi_n^{\beta*}(z)} = \prod_{j\in b_n}^{\boldsymbol{\cdot}}\frac{\beta_j}{|\beta_j|\,(z-\beta_j)} = \prod_{j\in b_n}^{\boldsymbol{\cdot}}\frac{|\alpha_j|}{\overline{\alpha}_j\,(z-1/\overline{\alpha}_j)} = \prod_{j\in b_n}^{\boldsymbol{\cdot}}\frac{-|\alpha_j|}{1-\overline{\alpha}_j z}, $$
and thus
$$ \frac{f(z)}{\dot B_n^\beta(z)} = c_n\,\frac{p_n(z)}{\prod_{j=1}^{n}(1-\overline{\alpha}_j z)} \in\mathcal{L}_n^\alpha, \qquad c_n = \prod_{j\in b_n}^{\boldsymbol{\cdot}}(-|\alpha_j|)\ne 0. $$
The second part is similar.

Lemma 3.2. With our previous definitions we have for $n\ge 1$
$$ \dot B_n^\beta\mathcal{L}_{n-1}^\alpha = \operatorname{span}\left\{ B_k^\alpha\dot B_n^\beta = \frac{\dot B_k^\alpha}{\dot B_k^\beta}\,\dot B_n^\beta : k=0,\ldots,n-1 \right\} = \dot\zeta_n^\beta\operatorname{span}\{B_0,B_1,\ldots,B_{n-1}\} = \dot\zeta_n^\beta\mathcal{L}_{n-1}, $$
and similarly
$$ \dot B_n^\alpha\mathcal{L}_{n-1}^\beta = \operatorname{span}\left\{ B_k^\beta\dot B_n^\alpha = \frac{\dot B_k^\beta}{\dot B_k^\alpha}\,\dot B_n^\alpha : k=0,\ldots,n-1 \right\} = \dot\zeta_n^\alpha\operatorname{span}\{B_0,B_1,\ldots,B_{n-1}\} = \dot\zeta_n^\alpha\mathcal{L}_{n-1}. $$

Proof. By our previous lemma, $\dot B_n^\beta\mathcal{L}_{n-1}^\alpha = \dot\zeta_n^\beta\dot B_{n-1}^\beta\mathcal{L}_{n-1}^\alpha = \dot\zeta_n^\beta\mathcal{L}_{n-1}$. The second relation is proved in a similar way.

To introduce the sequences of orthogonal rational functions (ORF) for the different sequences $\nu$, $\nu\in\{\alpha,\beta,\gamma\}$, recall the inner product, which we can write with our $(\cdot)_*$-notation as $\langle f,g\rangle = \int_{\mathbb{T}} f(z)g_*(z)\,d\mu(z)$, where $\mu$ is assumed to be a probability measure positive a.e. on $\mathbb{T}$.

Then the orthogonal rational functions (ORF) with respect to the sequence $\nu$ with $\nu\in\{\alpha,\beta,\gamma\}$ are defined by $\phi_n^\nu\in\mathcal{L}_n^\nu\setminus\mathcal{L}_{n-1}^\nu$ with $\phi_n^\nu\perp\mathcal{L}_{n-1}^\nu$ for $n\ge 1$, and we choose $\phi_0^\nu=1$.
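For intuition, such ORFs can be computed numerically by Gram–Schmidt on the basis $B_0,\ldots,B_n$ with a discretized inner product. The sketch below does this for a small $\gamma$ sequence with $\mu$ the normalized Lebesgue measure; it is only an illustration under these assumptions, not the constructive recurrence developed later in the paper.

```python
import numpy as np

def zeta(nu, z):
    if nu == 0:
        return z
    if np.isinf(nu):
        return 1 / z
    return (np.conj(nu) / abs(nu)) * (z - nu) / (1 - np.conj(nu) * z)

M = 4000                                   # grid on T for <f,g> = int f conj(g) dmu
zs = np.exp(2j * np.pi * np.arange(M) / M)
inner = lambda f, g: np.mean(f * np.conj(g))

gamma = [0.5, 1 / np.conj(0.3 - 0.2j), -0.4j]     # poles chosen in D, E, D
basis = [np.ones(M, dtype=complex)]               # B_0 = 1
dotted = {"alpha": np.ones(M, dtype=complex), "beta": np.ones(M, dtype=complex)}
for g in gamma:
    key = "alpha" if abs(g) < 1 else "beta"
    dotted[key] = dotted[key] * zeta(g, zs)
    basis.append(dotted[key].copy())              # B_n is the last updated dotted product

phi = []                                          # Gram-Schmidt orthonormalization
for b in basis:
    v = b.copy()
    for p in phi:
        v = v - inner(v, p) * p
    phi.append(v / np.sqrt(inner(v, v).real))

gram = np.array([[inner(p, q) for q in phi] for p in phi])
print(np.allclose(gram, np.eye(len(phi)), atol=1e-8))   # orthonormal
```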

Lemma 3.3. The function $\phi_n^\alpha\dot B_n^\beta$ belongs to $\mathcal{L}_n$ and it is orthogonal to the $n$-dimensional subspace $\dot\zeta_n^\beta\mathcal{L}_{n-1}$ for all $n\ge 1$.

Similarly, the function $\phi_n^\beta\dot B_n^\alpha$ belongs to $\mathcal{L}_n$ and it is orthogonal to the $n$-dimensional subspace $\dot\zeta_n^\alpha\mathcal{L}_{n-1}$, $n\ge 1$.

Proof. First note that $\phi_n^\alpha\dot B_n^\beta\in\mathcal{L}_n$ by Lemma 3.1.

By definition $\phi_n^\alpha\perp\mathcal{L}_{n-1}^\alpha$. Thus, by Lemma 3.2 and because $\langle f,g\rangle = \langle\dot B_n^\nu f,\dot B_n^\nu g\rangle$,
$$ \dot B_n^\beta\phi_n^\alpha \perp \dot B_n^\beta\mathcal{L}_{n-1}^\alpha = \dot\zeta_n^\beta\mathcal{L}_{n-1}. $$
The second claim follows by symmetry.

Note that $\dot\zeta_n^\beta\mathcal{L}_{n-1} = \mathcal{L}_{n-1}$ if $\gamma_n=\alpha_n$. Thus, up to normalization, $\phi_n^\alpha\dot B_n^\beta$ is the same as $\phi_n$, and similarly, if $\gamma_n=\beta_n$ then $\phi_n$ and $\phi_n^\beta\dot B_n^\alpha$ are the same up to normalization.

Lemma 3.4. For $n\ge 1$ the function $\dot B_n^\alpha(\phi_n^\alpha)_*$ belongs to $\mathcal{L}_n$ and it is orthogonal to $\dot\zeta_n^\alpha\mathcal{L}_{n-1}$. Similarly, for $n\ge 1$ the function $\dot B_n^\beta(\phi_n^\beta)_*$ belongs to $\mathcal{L}_n$ and it is orthogonal to $\dot\zeta_n^\beta\mathcal{L}_{n-1}$.

Proof. Since $\phi_n^\alpha\dot B_n^\beta\perp\dot\zeta_n^\beta\mathcal{L}_{n-1}$,
$$ \phi_{n*}^\alpha(\dot B_n^\beta)_* \perp \dot\zeta_{n*}^\beta\mathcal{L}_{(n-1)*}, $$
and thus, by Lemma 3.2 and because
$$ \dot B_{n-1}^\alpha\dot B_{n-1}^\beta\mathcal{L}_{(n-1)*} = \dot B_{n-1}^\alpha\dot B_{n-1}^\beta\,\frac{\mathcal{P}_{(n-1)*}}{\dot\pi_{(n-1)*}^\alpha\dot\pi_{(n-1)*}^\beta} = \frac{\mathcal{P}_{n-1}}{\dot\pi_{n-1}^\alpha\dot\pi_{n-1}^\beta} = \mathcal{L}_{n-1}, $$
it follows that
$$ \dot B_n^\alpha\phi_{n*}^\alpha = \dot B_n^\alpha\dot B_n^\beta\,\phi_{n*}^\alpha(\dot B_n^\beta)_* \perp \dot\zeta_n^\alpha\dot B_{n-1}^\alpha\dot B_{n-1}^\beta\mathcal{L}_{(n-1)*} = \dot\zeta_n^\alpha\mathcal{L}_{n-1}. $$
The other claim follows by symmetry.

We now define the reciprocal ORFs by (recall $f_*(z)=\overline{f(1/\overline{z})}$)
$$ (\phi_n^\nu)^* = B_n^\nu(\phi_n^\nu)_*, \qquad \nu\in\{\alpha,\beta\}. $$
For the ORF in $\mathcal{L}_n$ however we set
$$ \phi_n^* = \dot B_n^\alpha\dot B_n^\beta(\phi_n)_*. $$
Note that by definition $B_n$ is either $\dot B_n^\alpha$ or $\dot B_n^\beta$, depending on $\gamma_n$ being $\alpha_n$ or $\beta_n$, while in the previous definition we do not multiply with $B_n$ but with the product $\dot B_n^\alpha\dot B_n^\beta$. The reason is that we want the operation $(\cdot)^*$ to be a map from $\mathcal{L}_n^\nu$ to $\mathcal{L}_n^\nu$ for all $\nu\in\{\alpha,\beta,\gamma\}$.


Remark 3.5. As the operation $(\cdot)^*$ is a map from $\mathcal{L}_n^\nu$ to $\mathcal{L}_n^\nu$, it depends on $n$ and on $\nu$. So to make the notation unambiguous we should in fact use something like $f^{*[\nu,n]}$ if $f\in\mathcal{L}_n^\nu$. However, in order not to overload our notation, we shall stick to the notation $f^*$, since it should always be clear from the context what the space is to which $f$ belongs. Note that we also used the same notation to transform polynomials. This is just a special case of the general definition. Indeed, a polynomial of degree $n$ belongs to $\mathcal{L}_n^\alpha$ for a sequence $\alpha$ where all $\alpha_j=0$, $j=0,1,2,\ldots$, and for this sequence $B_n^\alpha(z)=z^n$.

Note that for a constant $a\in\mathcal{L}_0=\mathbb{C}$ we have $a^*=\overline{a}$. Although $(\cdot)^*$ is mostly used for scalar expressions, we shall occasionally use $A^*$ where $A$ is a matrix whose elements are all in $\mathcal{L}_n$. Then the meaning is that we take the $(\cdot)^*$ conjugate of each element in its transpose. Thus if $A$ is a constant matrix, then $A^*$ has the usual meaning of the adjoint or complex conjugate transpose of the matrix. We shall need this in Section 8.

Remark 3.6. It might also be helpful for the further computations to note the following. If $p_n$ is a polynomial of degree $n$ with a zero at $\xi$, then $p_{n*}$ will have a zero at $\xi_*=1/\overline{\xi}$. Hence, if $\nu\in\{\alpha,\beta,\gamma\}$ and $\phi_n^\nu=\frac{p_n^\nu}{\pi_n^\nu}$, then $\phi_{n*}^\nu = \frac{p_{n*}^\nu}{\pi_{n*}^\nu} = \frac{p_n^{\nu*}}{\pi_n^{\nu*}}$. We know by [9, Corollary 3.1.4] that $\phi_n^\alpha$ has all its zeros in $\mathbb{D}$, hence $p_n^\alpha$ does not vanish in $\mathbb{E}$ and $p_n^{\alpha*}$ does not vanish in $\mathbb{D}$. By symmetry, $\phi_n^\beta$ has all its zeros in $\mathbb{E}$ and $p_n^{\beta*}$ does not vanish in $\mathbb{E}$. For the general $\phi_n$, it depends on $\gamma_n$ being $\alpha_n$ or $\beta_n$. However, from the relations between $\phi_n$ and $\phi_n^\alpha$ or $\phi_n^\beta$ that will be derived below, we will be able to guarantee that at least for $z=\nu_n$ we have $\phi_n^{\nu*}(\nu_n)\ne 0$ and $p_n^{\nu*}(\nu_n)\ne 0$ for all $\nu\in\{\alpha,\beta,\gamma\}$ (see Corollary 3.11 below).

The orthogonality conditions define $\phi_n$ and $\phi_n^*$ uniquely up to normalization. So let us now make the ORFs unique by imposing an appropriate normalization. First assume that from now on the $\phi_n^\nu$ refer to orthonormal functions in the sense that $\|\phi_n^\nu\|=1$. This makes them unique up to a unimodular constant. Defining this constant is what we shall do now.

Suppose $\gamma_n=\alpha_n$; then $\phi_n$ and $\phi_n^\alpha\dot B_n^\beta$ are both in $\mathcal{L}_n$ and orthogonal to $\mathcal{L}_{n-1}$ (Lemma 3.3). If we assume $\|\phi_n\|=1$ and $\|\phi_n^\alpha\|=1$, hence $\|\phi_n^\alpha\dot B_n^\beta\|=\|\phi_n^\alpha\|=1$, it follows that there must be some unimodular constant $s_n^\alpha\in\mathbb{T}$ such that $\phi_n = s_n^\alpha\phi_n^\alpha\dot B_n^\beta$. Of course, we have by symmetry that for $\gamma_n=\beta_n$, there is some $s_n^\beta\in\mathbb{T}$ such that $\phi_n = s_n^\beta\phi_n^\beta\dot B_n^\alpha$.

To define the unimodular factors $s_n^\alpha$ and $s_n^\beta$, we first fix $\phi_n^\alpha$ and $\phi_n^\beta$ uniquely as follows. We know that $\phi_n^\alpha$ has all its zeros in $\mathbb{D}$ and hence $\phi_n^{\alpha*}$ has all its zeros in $\mathbb{E}$, so that $\phi_n^{\alpha*}(\alpha_n)\ne 0$. Thus we can take $\phi_n^{\alpha*}(\alpha_n)>0$ as a normalization for $\phi_n^\alpha$. Similarly for $\phi_n^\beta$ we can normalize by $\phi_n^{\beta*}(\beta_n)>0$. In both cases, we have made the leading coefficient with respect to the basis $\{B_j^\nu\}_{j=0}^{n}$ positive, since $\phi_n^\alpha(z) = \phi_n^{\alpha*}(\alpha_n)B_n^\alpha(z)+\psi_{n-1}^\alpha(z)$ with $\psi_{n-1}^\alpha\in\mathcal{L}_{n-1}^\alpha$ and $\phi_n^\beta(z) = \phi_n^{\beta*}(\beta_n)B_n^\beta(z)+\psi_{n-1}^\beta(z)$ with $\psi_{n-1}^\beta\in\mathcal{L}_{n-1}^\beta$. Before we define the normalization for the $\gamma$ sequence, we prove the following lemma, which is a consequence of the normalization of the $\phi_n^\alpha$ and the $\phi_n^\beta$.

Lemma 3.7. For the orthonormal ORFs, it holds that $\phi_n^\alpha = \phi_{n*}^\beta$ and $(\phi_n^\alpha)^*\dot B_n^\beta = \phi_n^\beta\dot B_n^\alpha$, and hence also $(\phi_n^\beta)^*\dot B_n^\alpha = \phi_n^\alpha\dot B_n^\beta$, for all $n\ge 0$.

Proof. For $n=0$, this is trivial since $\phi_0$, $\phi_0^\alpha$, $\phi_0^\beta$, $\dot B_0^\alpha$ and $\dot B_0^\beta$ are all equal to 1.

We give the proof for $n\ge 1$ and $\gamma_n=\alpha_n$ (for $\gamma_n=\beta_n$, the proof is similar). Since by the previous lemmas $\dot B_n^\beta(\phi_n^\beta)_*$ and $\phi_n^\alpha\dot B_n^\beta$ are both in $\mathcal{L}_n$ and orthogonal to $\mathcal{L}_{n-1}$, and since $\|\dot B_n^\beta(\phi_n^\beta)_*\| = \|\phi_n^\beta\|=1$ and $\|\phi_n^\alpha\dot B_n^\beta\| = \|\phi_n^\alpha\|=1$, there must be some $s_n\in\mathbb{T}$ such that
$$ s_n\phi_n^\alpha\dot B_n^\beta = \phi_{n*}^\beta\dot B_n^\beta \qquad\text{or}\qquad s_n\phi_n^\alpha = \phi_{n*}^\beta. $$
Multiply with $B_n^\beta = B_{n*}^\alpha$ and evaluate at $\beta_n$ to get $s_n\phi_n^\alpha(\beta_n)B_{n*}^\alpha(\beta_n) = \phi_n^{\beta*}(\beta_n)>0$. Thus $s_n$ should arrange for
$$ 0 < s_n\phi_n^\alpha(1/\overline{\alpha}_n)B_{n*}^\alpha(1/\overline{\alpha}_n) = s_n\overline{\phi_{n*}^\alpha(\alpha_n)B_n^\alpha(\alpha_n)} = s_n\overline{\phi_n^{\alpha*}(\alpha_n)}, $$
and since $\phi_n^{\alpha*}(\alpha_n)>0$, it follows that $s_n=1$.

Because $(\phi_n^\alpha)^* = B_n^\alpha\phi_{n*}^\alpha = B_n^\alpha\phi_n^\beta$ and $B_n^\alpha = \dot B_n^\alpha/\dot B_n^\beta$, also the other claims follow.

For the normalization of the $\phi_n$, we can do two things: either we make the normalization of $\phi_n^*$ simple and choose for example $\phi_n^*(\gamma_n)>0$, similar to what we did for $\phi_n^\alpha$ and $\phi_n^\beta$ (but this is somewhat problematic as we shall see below), or we can insist on keeping the relation with $\phi_n^\alpha$ and $\phi_n^\beta$ simple, as in the previous lemma, and arrange that $s_n^\alpha = s_n^\beta = 1$. We choose the second option.

Let us assume that $\gamma_n=\alpha_n$. Denote
$$ \phi_n(z) = \frac{p_n(z)}{\dot\pi_n^\alpha(z)\dot\pi_n^\beta(z)} \qquad\text{and}\qquad \phi_n^\alpha(z) = \frac{p_n^\alpha(z)}{\pi_n^\alpha(z)}, $$
with $p_n$ and $p_n^\alpha$ both polynomials in $\mathcal{P}_n$. Then
$$ \phi_n^*(z) = \frac{\varsigma_n\,p_n^*(z)}{\dot\pi_n^\alpha(z)\dot\pi_n^\beta(z)} \qquad\text{and}\qquad \phi_n^{\alpha*}(z) = \frac{\varsigma_n\,p_n^{\alpha*}(z)}{\pi_n^\alpha(z)}, \qquad \varsigma_n = \prod_{j=1}^{n}\sigma_j. $$

We already know that there is some $s_n^\alpha\in\mathbb{T}$ such that $\phi_n = s_n^\alpha\dot B_n^\beta\phi_n^\alpha$. Take the $(\cdot)_*$ conjugate and multiply with $\dot B_n^\alpha\dot B_n^\beta$ to get $\phi_n^* = \overline{s_n^\alpha}\dot B_n^\beta\phi_n^{\alpha*}$.

It now takes some simple algebra to reformulate $\phi_n^* = \overline{s_n^\alpha}\dot B_n^\beta\phi_n^{\alpha*}$ as
$$ \phi_n^*(z) = \frac{\varsigma_n\,p_n^*(z)}{\dot\pi_n^\alpha(z)\dot\pi_n^\beta(z)} = \overline{s_n^\alpha}\,\frac{\varsigma_n\,p_n^{\alpha*}(z)}{\dot\pi_n^\alpha(z)\dot\pi_n^\beta(z)}\prod_{j\in b_n}^{\boldsymbol{\cdot}}(-|\beta_j|). $$
This implies that $p_n^*(z) = \overline{s_n^\alpha}\,p_n^{\alpha*}(z)\prod_{j\in b_n}^{\boldsymbol{\cdot}}(-|\beta_j|)$ and thus that $p_n^*(z)$ has the same zeros as $p_n^{\alpha*}(z)$, none of which is in $\mathbb{D}$. Thus the numerator of $\phi_n^*$ will not vanish at $\alpha_n\in\mathbb{D}$, but one of the factors $(1-\overline{\beta}_j\alpha_n)$ from $\dot\pi_n^\beta(\alpha_n)$ could be zero. Thus a normalization $\phi_n^*(\alpha_n)>0$ is not an option in general. We could however make $s_n^\alpha=1$ when we choose $p_n^*(\alpha_n)/p_n^{\alpha*}(\alpha_n)>0$ or, since $\phi_n^{\alpha*}(\alpha_n)>0$, this is equivalent with $\varsigma_n p_n^*(\alpha_n)/\pi_n^\alpha(\alpha_n)>0$. Yet another way to put this is requiring that $\phi_n^*(z)/\dot B_n^\beta(z)$ is positive at $z=\alpha_n$. This does not give a problem with 0 or $\infty$ since

$$ \dot B_n^\alpha(z)\phi_{n*}(z) = \frac{\phi_n^*(z)}{\dot B_n^\beta(z)} = \frac{\dot\varsigma_n^\alpha\,p_n^*(z)}{\dot\pi_n^\alpha(z)\prod_{j\in b_n}^{\boldsymbol{\cdot}}(z-\beta_j)}, \qquad \dot\varsigma_n^\alpha = \prod_{j\in a_n}\sigma_j. \tag{3.1} $$
It is clear that neither the numerator nor the denominator will vanish for $z=\alpha_n$.

Of course a similar argument can be given if $\gamma_n=\beta_n$. Then we choose $\phi_n^*(z)/\dot B_n^\alpha(z)$ to be positive at $z=\beta_n$, or equivalently $\dfrac{\varsigma_n\,p_n^*(\beta_n)}{\pi_n^\beta(\beta_n)}\prod_{j\in a_n}^{\boldsymbol{\cdot}}(-|\alpha_j|)>0$.

Let us formulate the result about the numerators as a lemma for further reference.

Lemma 3.8. With the normalization that we just imposed, the numerators $p_n^\nu$ of $\phi_n^\nu=p_n^\nu/\pi_n^\nu$, $\nu\in\{\alpha,\beta,\gamma\}$ and $n\ge 1$, are related by
$$ p_n(z) = p_n^\alpha(z)\prod_{j\in b_n}^{\boldsymbol{\cdot}}(-|\beta_j|) = p_n^{\beta*}(z)\,\varsigma_n\prod_{j\in a_n}^{\boldsymbol{\cdot}}(-|\alpha_j|), \qquad\text{if } \gamma_n=\alpha_n, $$
and
$$ p_n(z) = p_n^\beta(z)\prod_{j\in a_n}^{\boldsymbol{\cdot}}(-|\alpha_j|) = p_n^{\alpha*}(z)\,\varsigma_n\prod_{j\in b_n}^{\boldsymbol{\cdot}}(-|\beta_j|), \qquad\text{if } \gamma_n=\beta_n, $$
where as before $\varsigma_n=\prod_{j=1}^{n}\sigma_j$.

Proof. The first expression for $\gamma_n=\alpha_n$ has been proved above. The second one follows in a similar way from the relation $\phi_n(z)=\phi_n^{\beta*}(z)\dot B_n^\alpha(z)$. Indeed,
$$ \frac{p_n(z)}{\pi_n(z)} = \frac{\varsigma_n\,p_n^{\beta*}(z)}{\prod_{j\in a_n}\varpi_j^\beta(z)\prod_{j\in b_n}\varpi_j^\beta(z)}\prod_{j\in a_n}\sigma_j\,\frac{z-\alpha_j}{1-\overline{\alpha}_j z} = \frac{\varsigma_n\,p_n^{\beta*}(z)}{\prod_{j\in a_n}\varpi_j^\alpha(z)\prod_{j\in b_n}\varpi_j^\beta(z)}\prod_{j\in a_n}\sigma_j\,\frac{z-\alpha_j}{1-\overline{\beta}_j z} = \frac{\varsigma_n\,p_n^{\beta*}(z)}{\pi_n(z)}\prod_{j\in a_n}^{\boldsymbol{\cdot}}\sigma_j(-\alpha_j)\,\frac{z-\alpha_j}{z-\alpha_j}. $$
With $-\sigma_j\alpha_j = -|\alpha_j|$ the result follows.

The case $\gamma_n=\beta_n$ is similar.

Note that this normalization again means that we take the leading coefficient of $\phi_n$ to be positive in the following sense. If $\gamma_n=\alpha_n$ then $\phi_n(z) = (\dot B_n^\alpha\phi_{n*})(\alpha_n)\,\dot B_n^\alpha(z)+\psi_{n-1}(z)$ with $\psi_{n-1}\in\mathcal{L}_{n-1}$, while $\dot B_n^\alpha\phi_{n*} = \phi_n^{\alpha*}$ and $\phi_n^{\alpha*}(\alpha_n)>0$. If $\gamma_n=\beta_n$ then $\phi_n(z) = (\dot B_n^\beta\phi_{n*})(\beta_n)\,\dot B_n^\beta(z)+\psi_{n-1}(z)$ with $\psi_{n-1}\in\mathcal{L}_{n-1}$, and the conclusion follows similarly.

Whenever we use the term orthonormal, we assume this normalization, and $\{\phi_n: n=0,1,2,\ldots\}$ will denote this orthonormal system.

Thus we have proved the following theorem. It says that if $\gamma_n=\alpha_n$, then $\phi_n$ is a ‘shifted’ version of $\phi_n^\alpha$, where ‘shifted’ means multiplied by $\dot B_n^\beta$:
$$ \dot B_n^\beta(z)\phi_n^\alpha(z) = \dot B_n^\beta(z)\big[a_0B_0^\alpha+\cdots+a_nB_n^\alpha(z)\big] = a_0\dot B_n^\beta(z)+\cdots+a_n\dot B_n^\alpha(z), $$
and a similar interpretation holds if $\gamma_n=\beta_n$. We summarize this in the following theorem.

Theorem 3.9. Assume all ORFs $\phi_n^\nu$, $\nu\in\{\alpha,\beta,\gamma\}$, are orthonormal with positive leading coefficient, i.e.,
$$ \phi_n^{\alpha*}(\alpha_n)>0, \qquad \phi_n^{\beta*}(\beta_n)>0, \qquad\text{and}\qquad \begin{cases} (\phi_n^*/\dot B_n^\beta)(\alpha_n)>0 & \text{if } \gamma_n=\alpha_n, \\ (\phi_n^*/\dot B_n^\alpha)(\beta_n)>0 & \text{if } \gamma_n=\beta_n. \end{cases} $$
Then for all $n\ge 0$
$$ \phi_n = \phi_n^\alpha\dot B_n^\beta = (\phi_n^\beta)^*\dot B_n^\alpha \qquad\text{and}\qquad \phi_n^* = (\phi_n^\alpha)^*\dot B_n^\beta = \phi_n^\beta\dot B_n^\alpha \qquad\text{if } \gamma_n=\alpha_n, $$
while
$$ \phi_n = \phi_n^\beta\dot B_n^\alpha = (\phi_n^\alpha)^*\dot B_n^\beta \qquad\text{and}\qquad \phi_n^* = (\phi_n^\beta)^*\dot B_n^\alpha = \phi_n^\alpha\dot B_n^\beta \qquad\text{if } \gamma_n=\beta_n. $$

Corollary 3.10. We have for all $n\ge 1$ that $(\phi_n^\nu)^*\perp\zeta_n^\nu\mathcal{L}_{n-1}^\nu$, $\nu\in\{\alpha,\beta,\gamma\}$.

Corollary 3.11. The rational functions $\phi_n^\alpha$ and $\phi_n^{\alpha*}$ are in $\mathcal{L}_n^\alpha$ and hence have all their poles in $\{\beta_j: j=1,\ldots,n\}\subset\mathbb{E}$, while the zeros of $\phi_n^\alpha$ are all in $\mathbb{D}$ and the zeros of $\phi_n^{\alpha*}$ are all in $\mathbb{E}$.

The rational functions $\phi_n^\beta$ and $\phi_n^{\beta*}$ are in $\mathcal{L}_n^\beta$ and hence have all their poles in $\{\alpha_j: j=1,\ldots,n\}\subset\mathbb{D}$, while the zeros of $\phi_n^\beta$ are all in $\mathbb{E}$ and the zeros of $\phi_n^{\beta*}$ are all in $\mathbb{D}$.

The rational functions $\phi_n$ and $\phi_n^*$ are in $\mathcal{L}_n$ and hence have all their poles in $\{\beta_j: j\in a_n\}\cup\{\alpha_j: j\in b_n\}$.

The zeros of $\phi_n$ are the same as the zeros of $\phi_n^\alpha$, and thus are all in $\mathbb{D}$, if $\gamma_n=\alpha_n$; they are the same as the zeros of $\phi_n^\beta$, and thus are all in $\mathbb{E}$, if $\gamma_n=\beta_n$.
