
FAST MULTILEVEL EVALUATION OF SMOOTH RADIAL BASIS FUNCTION EXPANSIONS

OREN E. LIVNE AND GRADY B. WRIGHT

Abstract. Radial basis functions (RBFs) are a powerful tool for interpolating/approximating multidimensional scattered data. Notwithstanding, RBFs pose computational challenges, such as the efficient evaluation of an $n$-center RBF expansion at $m$ points. A direct summation requires $O(nm)$ operations. We present a new multilevel method whose cost is only $O((n+m)(\ln(1/\delta))^d)$, where $\delta$ is the desired accuracy and $d$ is the dimension. The method applies to smooth radial kernels, e.g., Gaussian, multiquadric, or inverse multiquadric. We present numerical results, discuss generalizations, and compare our method to other fast RBF evaluation methods. This multilevel summation algorithm can also be applied beyond RBFs, to discrete integral transform evaluation, Gaussian filtering and de-blurring of images, and particle force summation.

Key words. Radial basis functions, fast multilevel multi-summation, integral transforms, particle interaction

AMS subject classifications. 41A21, 41A30, 41A63, 65D25, 65N06, 65R10, 68Q25

1. Introduction. In many science and engineering disciplines we need to interpolate or approximate a function $f : \mathbb{R}^d \to \mathbb{R}$, $d \ge 1$, from a discrete set of scattered samples. Radial basis functions (RBFs) are a simple and powerful technique for solving this problem, with applications to cartography [31], neural networks [27], geophysics [9,10], pattern recognition [41], graphics and imaging [16,17,44], and the numerical solution of partial differential equations [33,34].

The basic RBF approximation to $f(\mathbf{x})$ is given by the expansion

(1.1)   $s(\mathbf{x}) = \sum_{j=0}^{n} \lambda(\mathbf{y}_j)\, \phi(\|\mathbf{x} - \mathbf{y}_j\|),$

where boldface symbols are in $\mathbb{R}^d$, $\|\cdot\|$ denotes the $d$-dimensional Euclidean norm, $\{\mathbf{y}_j\}_{j=0}^{n}$ are called centers, the $\lambda(\mathbf{y}_j)$ are the expansion coefficients, and $\phi$ is a univariate, radially symmetric kernel. Given data $f_j = f(\mathbf{y}_j)$, $j = 0, 1, \dots, n$, the expansion coefficients are chosen to satisfy the interpolation conditions $s(\mathbf{y}_j) = f_j$, $j = 0, 1, \dots, n$, or in matrix notation, to solve

(1.2)   $A\lambda = f, \qquad \text{where} \qquad A_{ij} = \phi(\|\mathbf{y}_i - \mathbf{y}_j\|).$

More generally, one can choose more data than centers and solve the corresponding overdetermined system for $\lambda$ [15, Ch. 8]. Some common choices of $\phi$ are listed in Table 1.1. The well-posedness of (1.2) is discussed in [18, Ch. 12–16].
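For concreteness, the fitting task (1.2) and the direct $O(nm)$ evaluation of (1.1) can be sketched as follows. This is a minimal 1-D sketch, not the authors' code; the GA kernel, shape parameter, and sizes are illustrative assumptions.

```python
import numpy as np

def rbf_fit(y, f, phi):
    """Solve (1.2): A lam = f, with A[i, j] = phi(|y_i - y_j|)."""
    A = phi(np.abs(y[:, None] - y[None, :]))
    return np.linalg.solve(A, f)

def rbf_eval_direct(x, y, lam, phi):
    """Direct summation of (1.1): O(n*m) operations."""
    return phi(np.abs(x[:, None] - y[None, :])) @ lam

c = 20.0                                  # illustrative shape parameter
phi = lambda r: np.exp(-(c * r) ** 2)     # GA kernel from Table 1.1
y = np.linspace(0.0, 1.0, 21)             # centers
f = np.sin(2 * np.pi * y)                 # samples of the target function
lam = rbf_fit(y, f, phi)
s = rbf_eval_direct(y, y, lam, phi)       # interpolant at the centers
```

By construction the interpolant reproduces the data at the centers; the point of the paper is that the evaluation step scales as $O(nm)$ and must be accelerated.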

Although RBFs have been effectively applied to many applications, their wider adoption has been hindered by a prohibitively high, non-scalable computational cost, mainly stemming from the infinite support of the commonly used radial kernels [25]. The two main computational challenges of RBFs are

* Received July 12, 2005. Accepted for publication February 16, 2006. Recommended by S. Ehrich.

Scientific Computing and Imaging Institute, University of Utah, 50 South Central Campus Drive, Room 3490, Salt Lake City, UT 84112, USA. (livne@sci.utah.edu).

Department of Mathematics, University of Utah, 155 South 1400 East, Room 233, Salt Lake City, UT 84112- 0090, USA. (wright@math.utah.edu).


TABLE 1.1
Some commonly used radial basis functions. In all cases, $c > 0$.

Type of Radial Kernel | $\phi(r)$

Smooth Kernels:
  Gaussian (GA): $e^{-(cr)^2}$
  Generalized multiquadric (GMQ): $(1 + (cr)^2)^{\nu/2}$, $\nu \ne 0$ and $\nu \notin 2\mathbb{N}$
    • Multiquadric (MQ): $(1 + (cr)^2)^{1/2}$
    • Inverse multiquadric (IMQ): $(1 + (cr)^2)^{-1/2}$
    • Inverse quadratic (IQ): $(1 + (cr)^2)^{-1}$

Piecewise Smooth Kernels:
  Generalized Duchon spline (GDS): $r^{2k} \ln r$, $k \in \mathbb{N}$; $r^{2\nu}$, $\nu > 0$ and $\nu \notin \mathbb{N}$
  Matérn: $\frac{2^{1-\nu}}{\Gamma(\nu)}\, r^{\nu} K_{\nu}(r)$, $\nu > 0$
  Wendland [43]: $(1 - r)_{+}^{k}\, p(r)$, $p$ = polynomial, $k \in \mathbb{N}$

Oscillatory Kernels:
  J-Bessel (JB) [24]: $\phi_d(r) = \frac{J_{d/2-1}(r)}{r^{d/2-1}}$, $d = 1, 2, \dots$

(A) Fitting: Given $\{\mathbf{y}_j\}_{j=0}^{n}$ and $\{f_j\}_{j=0}^{n}$, determine the expansion coefficients $\{\lambda(\mathbf{y}_j)\}_{j=0}^{n}$. Because $A$ in (1.2) is a dense $(n+1)$-by-$(n+1)$ symmetric matrix, standard direct solvers require $O(n^3)$ operations.

(B) Evaluation: Given $\{\mathbf{y}_j\}_{j=0}^{n}$ and $\{\lambda(\mathbf{y}_j)\}_{j=0}^{n}$, evaluate the RBF interpolant (1.1) at multiple points $\mathbf{x} = \mathbf{x}_i$, $i = 0, 1, \dots, m$. The cost of direct summation of (1.1) for all $\mathbf{x}_i$ is $O(mn)$.

In practice, it is sufficient to solve (A) and (B) up to some specified error tolerance. A few Krylov-subspace iterative methods have been developed for (A) [2,3,21,22,23]. These algorithms require the ability to efficiently multiply $A$ by a vector. Thus, overcoming (B) is also important to overcoming (A).

In this paper we develop a fast multilevel evaluation algorithm for reducing the computational cost of (B) to $O((m+n)(\ln(1/\delta))^d)$ operations, where $\delta$ is the desired evaluation accuracy. The method is applicable to any smooth $\phi$ (e.g., Table 1.1, top section) in any dimension $d$, and to any centers $\{\mathbf{y}_j\}_{j=0}^{n}$ and evaluation points $\{\mathbf{x}_i\}_{i=0}^{m}$. The idea is that a smooth $\phi$ can be accurately represented on a coarser grid of fewer centers and evaluation points, at which a direct summation of (1.1) is less expensive. This approach builds on Brandt's multilevel evaluation of integral transforms [11]; because of the smoothness of $\phi$, our method only requires two levels.

Alternative fast RBF expansion methods are:

• Fast Multipole Method (FMM). Beatson and Newsam [7] originally developed an FMM for evaluating RBF expansions with a GDS kernel (see Table 1.1). More recently, Beatson and colleagues have extended the FMM to other radial kernels [4,5,6,7,19]. The evaluation complexity is $O((m+n)(\ln n)(\ln(1/\delta))^{d+1})$, where $\delta$ is the desired accuracy. While these methods have been successful in applications [9,10,17], they are rather complicated to program, especially in higher dimensions, because of the complex hierarchical structure and tree codes required to decompose $\phi$ into "near field" and "far field" components [15, §7.3]. Additionally, a different set of Laurent and Taylor coefficients must be determined for each new radial kernel and dimension $d$.

• Fast Gauss Transform (FGT). Roussos and Baxter [38,39] adapted the Fast Gauss Transform (FGT) of Greengard and Strain [30] to RBF evaluation. Exploiting the conditionally positive (negative) definite properties of certain radial kernels, (1.1) is replaced by the evaluation of (1.1) with the GA kernel using FGT, followed by an appropriate Gaussian quadrature rule for translating the GA result back to the original kernel. The algorithm's cost is $O(m + n)$, where the constant increases with the desired accuracy and dimension. FGT is non-hierarchical and parallelizable, but it is hard to precisely control the evaluation error due to the complicated quadrature rules.

The main advantages of our method include:

(i) Its implementation is simple and easily parallelizable for all smooth kernels $\phi$ in any dimension $d$.

(ii) The evaluation error of the method normally depends only on simple bounds on the derivatives of $\phi(r)$ at $r = 0$ (such bounds for many useful kernels are given in Appendix B).

(iii) The accuracy and complexity of any fast evaluation method must depend on $\phi$ and thus on the "shape parameter" $c$. In practice, $c$ is required to grow with $n$ to avoid severe numerical ill-conditioning of the linear system (1.2) for computing the expansion coefficients [40]. Unlike the FMM and FGT methods, we explicitly include the shape parameter $c$ in our algorithm's analysis. Therefore, we are able to provide precise user control over the evaluation error and precise asymptotic results on the complexity.

Unlike the FMM, our method is presently limited to smooth radial kernels. Its generalization to piecewise-smooth kernels like GDS is still a work in progress; see §7.

Our algorithm has important applications beyond RBFs: filtering and de-blurring of images [29, pp. 165–184]; force summation among particles/atoms with a smooth potential of interaction $\phi$ (e.g., [11, §1–3]); evaluation and solution of continuous integral equations [20]

(1.3)   $s(\mathbf{x}) = \int_{\Omega} \phi(\|\mathbf{x} - \mathbf{y}\|)\, \lambda(\mathbf{y})\, d\mathbf{y},$

discretized on the centers $\{\mathbf{y}_j\}_{j=0}^{n}$ and evaluation points $\{\mathbf{x}_i\}_{i=0}^{m}$; and so on.

The paper is organized as follows: In §2 we derive our fast evaluation algorithm in 1-D for any smooth kernel. We apply this general algorithm to specific smooth kernels from Table 1.1 and show numerical results in §3. The generalization of the algorithm to $d > 1$ dimensions is described in §4, followed by application of the $d$-dimensional algorithm to specific smooth kernels in §5. In §6, we present several numerical results in two and three dimensions. We discuss future research in §7.

2. 1-D fast evaluation algorithm. Let $\phi$ be a smooth radial kernel (see Table 1.1, top section), and $\{y_j\}_{j=0}^{n}, \{x_i\}_{i=0}^{m} \subset \mathbb{R}$. Without loss of generality, we assume that the centers $\{y_j\}_{j=0}^{n}$ and the evaluation points $\{x_i\}_{i=0}^{m}$ lie in $[0, 1]$ and are sorted so that $y_j \le y_{j+1}$, $j = 0, 1, \dots, n-1$, and $x_i \le x_{i+1}$, $i = 0, 1, \dots, m-1$. We denote by "level $h$" the collection of $\{y_j\}_{j=0}^{n}$ and $\{x_i\}_{i=0}^{m}$ and make no assumption on the densities of $\{y_j\}_{j=0}^{n}$ and $\{x_i\}_{i=0}^{m}$. Quantities defined at these centers and points are denoted by lowercase symbols


FIG. 2.1. Illustration of the two-level approach for the case $p = 2$: level $h$ holds $y_0, \dots, y_n$ with coefficients $\lambda(y_0), \dots, \lambda(y_n)$; level $H$ holds $Y_0, \dots, Y_N$ with coefficients $\Lambda(Y_0), \dots, \Lambda(Y_N)$. $\{X_I, S(X_I)\}$ are similarly defined over $\{x_i, s(x_i)\}$. Level $H$ contains points outside the convex hull of level $h$ to allow centered interpolation.

(e.g., $\{\lambda(y_j)\}_{j=0}^{n}$, $\{s(x_i)\}_{i=0}^{m}$). The 1-D evaluation task is to compute

(2.1)   $s(x_i) = \sum_{j=0}^{n} \lambda(y_j)\, \phi(|x_i - y_j|), \qquad i = 0, 1, \dots, m.$

We define an auxiliary "level $H$" consisting of two uniform grids $\{Y_J\}_{J=0}^{N}$ and $\{X_I\}_{I=0}^{M}$ with spacing $H$ each. These grids cover $\{y_j\}_{j=0}^{n}$ and $\{x_i\}_{i=0}^{m}$, respectively, so that a discrete function defined at level $H$ can be approximated at any $y_j$ using centered $p$-th-order interpolation, for some even $p \in \mathbb{N}$. Specifically, we choose

$Y_J = y_0 - \frac{(p-1)H}{2} + JH, \qquad J = 0, 1, \dots, N, \qquad N = \left\lceil \frac{y_n - y_0}{H} - 0.5 \right\rceil + p,$

$X_I = x_0 - \frac{(p-1)H}{2} + IH, \qquad I = 0, 1, \dots, M, \qquad M = \left\lceil \frac{x_m - x_0}{H} - 0.5 \right\rceil + p.$

Quantities defined at level $H$ are denoted by uppercase symbols (e.g., $\{\Lambda(Y_J)\}_{J=0}^{N}$, $\{S(X_I)\}_{I=0}^{M}$); see Fig. 2.1. Level $H$ is coarser than, and at most comparable with, level $h$; $H, p$ are determined by the shape parameter $c$ of $\phi$ and the target accuracy $\delta$ in $\{s(x_i)\}_{i=0}^{m}$ (see §2.1, §3).
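The coarse grids are one line of code each; a sketch of the $Y_J$ formula above, with illustrative $p$ and $H$:

```python
import math
import numpy as np

def coarse_grid(z0, z_end, H, p):
    """Uniform level-H grid covering [z0, z_end], extended on both sides so
    that every data point admits a centered p-point stencil (cf. Fig. 2.1)."""
    N = math.ceil((z_end - z0) / H - 0.5) + p
    return z0 - (p - 1) * H / 2 + H * np.arange(N + 1)

p, H = 4, 0.1                                       # assumed order and spacing
y = np.sort(np.random.default_rng(0).random(50))    # centers in [0, 1]
Y = coarse_grid(y[0], y[-1], H, p)
```

The returned grid strictly contains the convex hull of the data, which is what permits centered (rather than one-sided) interpolation near the boundary.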

The evaluation algorithm replaces the expensive summation (2.1) at level $h$ by a less expensive summation at level $H$ by utilizing the spatial smoothness of $\phi$. First, $\phi(|x_i - y|)$ is a smooth function of $y$, for every fixed $x_i$. Therefore its value at $y = y_j$ can be approximated by a centered $p$-th-order interpolation from its values at neighboring $Y_J$'s. Namely,

(2.2)   $\phi(|x_i - y_j|) = \sum_{J \in \sigma_j} \omega_{jJ}\, \phi(|x_i - Y_J|) + O(\delta_h), \qquad j = 0, 1, \dots, n,$

where $\sigma_j := \{J : |Y_J - y_j| \le pH/2\}$, the $\omega_{jJ}$ are the centered $p$-th-order interpolation weights from the coarse centers $Y_J$ to $y_j$, and $\delta_h$ is the interpolation error, which we bound in §2.1 and §3. Substituting the approximation (2.2) into (2.1) and interchanging the order of summation, we obtain

$s(x_i) = \sum_{j=0}^{n} \Big[ \sum_{J \in \sigma_j} \omega_{jJ}\, \phi(|x_i - Y_J|) + O(\delta_h) \Big]\, \lambda(y_j)$

$= \sum_{J=0}^{N} \phi(|x_i - Y_J|) \sum_{j :\, J \in \sigma_j} \omega_{jJ}\, \lambda(y_j) + O(n \|\lambda\|_\infty \delta_h)$

(2.3)   $= \sum_{J=0}^{N} \Lambda(Y_J)\, \phi(|x_i - Y_J|) + O(n \|\lambda\|_\infty \delta_h), \qquad i = 0, 1, \dots, m,$

where

(2.4)   $\Lambda(Y_J) := \sum_{j :\, J \in \sigma_j} \omega_{jJ}\, \lambda(y_j), \qquad J = 0, 1, \dots, N,$

which is called anterpolation [11] or aggregation of $\{\lambda(y_j)\}_{j=0}^{n}$ to level $H$. The upper bound for the total interpolation error in $s(x_i)$ is $O(n \|\lambda\|_\infty \delta_h)$, where $\|\lambda\|_\infty$ is the maximum norm of $\{\lambda(y_j)\}_{j=0}^{n}$. In fact, if we assume that local errors accumulate randomly, the total error will only be $O(\sqrt{n}\, \|\lambda\|_\infty \delta_h)$.

Second, we similarly use the smoothness of $\phi(|x - Y_J|)$ as a function of $x$ for a fixed $Y_J$ to approximate its value at $x = x_i$ by a centered $p$-th-order interpolation from its values at neighboring $X_I$'s. Namely,

(2.5)   $\phi(|x_i - Y_J|) = \sum_{I \in \sigma_i} \omega_{iI}\, \phi(|X_I - Y_J|) + O(\delta_h), \qquad i = 0, 1, \dots, m,$

where $\sigma_i := \{I : |X_I - x_i| \le pH/2\}$, and the $\omega_{iI}$ are the centered $p$-th-order interpolation weights from the coarse evaluation points $X_I$ to $x_i$. Substituting (2.5) into (2.3) gives

$s(x_i) = \sum_{J=0}^{N} \Lambda(Y_J) \Big[ \sum_{I \in \sigma_i} \omega_{iI}\, \phi(|X_I - Y_J|) + O(\delta_h) \Big] + O(n \|\lambda\|_\infty \delta_h)$

$= \sum_{I \in \sigma_i} \omega_{iI} \sum_{J=0}^{N} \Lambda(Y_J)\, \phi(|X_I - Y_J|) + O(n \|\lambda\|_\infty \delta_h)$

(2.6)   $= \sum_{I \in \sigma_i} \omega_{iI}\, S(X_I) + O(n \|\lambda\|_\infty \delta_h), \qquad i = 0, 1, \dots, m,$

where

(2.7)   $S(X_I) := \sum_{J=0}^{N} \Lambda(Y_J)\, \phi(|X_I - Y_J|), \qquad I = 0, 1, \dots, M.$

For fast-decaying $\phi$ (e.g., GA), (2.7) can be truncated to a neighborhood of $X_I$ and replaced by

(2.8)   $S(X_I) = \sum_{J :\, |Y_J - X_I| \le tH} \Lambda(Y_J)\, \phi(|X_I - Y_J|) + O(n \|\lambda\|_\infty \delta_t), \qquad I = 0, 1, \dots, M,$

where $t \in \mathbb{N}$ and the truncation error $\delta_t$ depends on $\phi$ and $t$. If $\phi$ does not decay fast (or at all) as $r \to \infty$ (e.g., IMQ and MQ), we resort to $t = N$ (i.e., no truncation).

Thus, the original evaluation task (2.1) is replaced by the less expensive, analogous evaluation task (2.8) at level $H$. Assuming (2.8) is directly summed, we can reconstruct $\{s(x_i)\}_{i=0}^{m}$ by interpolating $\{S(X_I)\}_{I=0}^{M}$ back to level $h$, namely, computing (2.6). Summarizing, our fast evaluation task consists of the following steps:

1. Anterpolation: for every $j = 0, 1, \dots, n$, compute the anterpolation weights $\{\omega_{jJ}\}_{J \in \sigma_j}$. Compute the coarse expansion coefficients $\{\Lambda(Y_J)\}_{J=0}^{N}$ using (2.4).

2. Coarse Level Summation: evaluate $S(X_I)$, $I = 0, 1, \dots, M$, using (2.8).

3. Interpolation: for every $i = 0, 1, \dots, m$, compute the interpolation weights $\{\omega_{iI}\}_{I \in \sigma_i}$. Then interpolate $\{S(X_I)\}_{I=0}^{M}$ to $\{s(x_i)\}_{i=0}^{m}$ using (2.6).
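Steps 1–3 can be sketched end-to-end in a short program. This is a minimal 1-D sketch, not the authors' code; the kernel, sizes, and the parameters $H$, $p$ (and the optional truncation radius $t$) are illustrative assumptions.

```python
import math
import numpy as np

def coarse_grid(z0, z_end, H, p):
    """Uniform level-H grid covering [z0, z_end] (cf. the Y_J, X_I formulas)."""
    N = math.ceil((z_end - z0) / H - 0.5) + p
    return z0 - (p - 1) * H / 2 + H * np.arange(N + 1)

def stencil(z, Z, H, p):
    """Indices of the p coarse points nearest z and their Lagrange weights."""
    k = int(math.floor((z - Z[0]) / H))
    idx = np.arange(k - p // 2 + 1, k + p // 2 + 1)
    Zs = Z[idx]
    w = np.ones(p)
    for a in range(p):
        for b in range(p):
            if a != b:
                w[a] *= (z - Zs[b]) / (Zs[a] - Zs[b])
    return idx, w

def fast_eval_1d(x, y, lam, phi, H, p, t=None):
    """Two-level evaluation of s(x_i) = sum_j lam_j * phi(|x_i - y_j|)."""
    Y = coarse_grid(y[0], y[-1], H, p)      # coarse centers
    X = coarse_grid(x[0], x[-1], H, p)      # coarse evaluation points
    # Step 1: anterpolation (2.4): scatter lam(y_j) onto Lambda(Y_J).
    Lam = np.zeros(len(Y))
    for yj, lj in zip(y, lam):
        idx, w = stencil(yj, Y, H, p)
        Lam[idx] += w * lj
    # Step 2: coarse summation (2.8); t = None means no truncation (t = N).
    S = np.empty(len(X))
    for I, XI in enumerate(X):
        d = np.abs(XI - Y)
        sel = d <= t * H if t is not None else slice(None)
        S[I] = np.dot(Lam[sel], phi(d[sel]))
    # Step 3: interpolation (2.6): gather s(x_i) from S(X_I).
    s = np.empty(len(x))
    for i, xi in enumerate(x):
        idx, w = stencil(xi, X, H, p)
        s[i] = np.dot(w, S[idx])
    return s

# Illustrative comparison against direct summation, GA kernel.
rng = np.random.default_rng(0)
y = np.sort(rng.random(200))
x = np.sort(rng.random(300))
lam = rng.uniform(-1.0, 1.0, 200)
c = 10.0
phi = lambda r: np.exp(-(c * r) ** 2)
s_fast = fast_eval_1d(x, y, lam, phi, H=0.03, p=8)
s_direct = phi(np.abs(x[:, None] - y[None, :])) @ lam
```

With these assumed parameters the two-level result should agree with the $O(nm)$ direct sum to small absolute error, while the fine-level work is only $O((n+m)p)$.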


2.1. Complexity and accuracy. Step 1 consists of two parts. Computing the weights $\{\omega_{jJ}\}_J$ requires $O(pn)$ operations (see Appendix A). Then (2.4) is executed in $O(pn)$ operations (see §2.2). The truncated coarse sum in step 2 requires $O(tM)$ operations. In some cases, this cost can be cut using the Fast Fourier Transform (FFT), as discussed below. Step 3 consists of computing $\{\omega_{iI}\}_I$, which costs $O(pm)$, and interpolating the level $H$ RBF expansion to level $h$ for a cost of $O(pm)$. The algorithm's complexity is thus

(2.9)   $W = O\!\left( (n + m)\, p + \frac{t}{H} \right).$
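The weights $\omega$ are ordinary centered Lagrange weights (Appendix A computes them efficiently); the property behind the $O(H^p)$ error is exactness on polynomials of degree below $p$. A small self-contained check, with an illustrative stencil:

```python
import numpy as np

def lagrange_weights(z, Zs):
    """Weights w such that sum_a w[a]*g(Zs[a]) = g(z) exactly whenever g is a
    polynomial of degree < len(Zs)."""
    w = np.ones(len(Zs))
    for a in range(len(Zs)):
        for b in range(len(Zs)):
            if a != b:
                w[a] *= (z - Zs[b]) / (Zs[a] - Zs[b])
    return w

p, H = 6, 0.1                      # illustrative order and spacing
Zs = H * np.arange(p)              # a p-point uniform stencil
z = Zs[p // 2 - 1] + 0.3 * H       # a point in the central interval
w = lagrange_weights(z, Zs)
```

The weights sum to one (constants are reproduced) and integrate any degree-$(p-1)$ polynomial through the stencil exactly; the leftover is the $O(H^p)$ term bounded below.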

Ð P

Because the coarse grid interpolations are centered and $p$ is even, the error $\delta_h$ can be bounded by [42, p. 32]

$\delta_h = \Big| \phi(|x_i - y_j|) - \sum_{J \in \sigma_j} \omega_{jJ}\, \phi(|x_i - Y_J|) \Big| \le \frac{(H/2)^p}{\Gamma(p+1)}\, \big| \phi^{(p)}(|x_i - \xi|) \big|,$

where $\xi$ is in the convex hull of $\{Y_J\}_{J \in \sigma_j}$, $j = 0, 1, \dots, n$. For infinitely smooth $\phi$ we obtain the uniform bound

$\delta_h \le \frac{(H/2)^p}{\Gamma(p+1)}\, \|\phi^{(p)}\|_\infty.$

The truncation error $\delta_t$ in $S(X_I)$ due to (2.8) is bounded by the "tail" of $\phi$. For every $I$,

$\delta_t \le \int_{-\infty}^{X_I - tH} |\phi(|Y - X_I|)|\, dY + \int_{X_I + tH}^{\infty} |\phi(|Y - X_I|)|\, dY = 2 \int_{tH}^{\infty} |\phi(r)|\, dr.$

Let $s$ contain the values from directly summing (2.1) and $\tilde{s}$ contain the values from our fast evaluation, for $i = 0, 1, \dots, m$. We define the evaluation accuracy as the relative error norm

(2.10)   $E := \frac{\| s - \tilde{s} \|_\infty}{\| s \|_\infty}.$

Using the bounds on $\delta_h$ and $\delta_t$, we obtain

(2.11)   $E \lesssim \kappa \left( \frac{(H/2)^p}{\Gamma(p+1)}\, \|\phi^{(p)}\|_\infty + 2 \int_{tH}^{\infty} |\phi(r)|\, dr \right),$

where $\kappa := n \|\lambda\|_\infty / \|s\|_\infty$ is a measure of the condition number of the direct evaluation problem (2.1) (see the paragraph below). The algorithm's efficiency is determined by choosing the parameters $H, p, t$ to minimize $W$ for a bounded accuracy $E \lesssim \delta$ (or minimize $E$ subject to bounded $W$). An exact optimization is of course not required. The algorithm's efficiency depends only on $\phi$, not on the specific locations $\{y_j\}_{j=0}^{n}, \{x_i\}_{i=0}^{m}$ or the values $\{\lambda(y_j)\}_{j=0}^{n}$. In §3 we show for a few specific RBFs that by correctly choosing the parameters (normally $p \sim \ln(1/\delta)$), the algorithm scales linearly with $n + m$ with constant $\sim \ln(1/\delta)$.

We conclude this section with a note on the condition number $\kappa$ of (2.1). From [32], we know that if (2.1) is summed directly with machine precision $\varepsilon$, then relative errors in $s$ of size $\kappa \varepsilon$ would be introduced. For example, if $\varepsilon = 10^{-16}$, $\|\lambda\|_\infty = O(10^8)$, and $\{\lambda(y_j)\}_{j=0}^{n}$ have alternating signs so that $\|s\|_\infty = O(1)$ (as is often the case in RBF approximations), then $\kappa \varepsilon = O(10^8 \varepsilon) = O(10^{-8})$. This error analysis also holds true for any indirect summation method of (2.1) (e.g., FMM, FGT, or the current approach). Similar to these other fast evaluation methods, we hereafter assume that the condition number $\kappa$ is not too large; otherwise $\delta_h, \delta_t$ should be much smaller than $\delta$ to achieve $E \lesssim \delta$.
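The amplification factor $\kappa = n\|\lambda\|_\infty / \|s\|_\infty$ is cheap to monitor at run time. A sketch with alternating-sign coefficients as in the example above; the kernel and sizes are illustrative:

```python
import numpy as np

c = 20.0
phi = lambda r: np.exp(-(c * r) ** 2)
n = 400
y = np.sort(np.random.default_rng(1).random(n))
lam = np.where(np.arange(n) % 2 == 0, 1.0, -1.0)    # alternating signs
s = phi(np.abs(y[:, None] - y[None, :])) @ lam      # direct summation of (2.1)
kappa = n * np.max(np.abs(lam)) / np.max(np.abs(s))
# kappa * (machine epsilon) bounds the attainable relative accuracy of ANY
# summation of (2.1), fast or direct.
```

When the computed $\kappa$ is large, the tolerances $\delta_h, \delta_t$ should be tightened accordingly.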


2.2. Fast update; parallelization. Interpolation. Note that each $s(x_i)$, $i = 0, 1, \dots, m$, in step 3 can be independently interpolated from $\{S(X_I)\}_{I=0}^{M}$. Hence, evaluating at a new point $x$ costs $O(p) = O(\ln(1/\delta))$ operations, without repeating or updating steps 1 and 2. Most likely, new evaluation points lie in the convex hull of $\{Y_J\}_{J=0}^{N}$ and thus of $\{X_I\}_{I=0}^{M}$. However, if $x$ cannot be centrally interpolated from the existing $\{X_I\}_{I=0}^{M}$, we append the coarse grid with at most $p$ new $X_I$'s near $x$. The coarse summation (2.7) is then performed for the new $X_I$'s and $s(x)$ is centrally interpolated from these $S(X_I)$. This update requires $O(pN)$ operations, which for some cases may be $O(\sqrt{n}\, \ln(1/\delta))$ (see §3); further evaluations inside the convex hull of the extended coarse grid cost only $O(\ln(1/\delta))$ per new evaluation point.

Anterpolation. Adjointly to interpolation, we implement anterpolation in our code as follows. First, all $\{\Lambda(Y_J)\}_{J=0}^{N}$ are set to zero. For each $j = 0, 1, \dots, n$ we increment the coarse expansion coefficients which include $\lambda(y_j)$ in their sum (2.4); namely,

(2.12)   $\Lambda(Y_J) \leftarrow \Lambda(Y_J) + \omega_{jJ}\, \lambda(y_j), \qquad \forall J \in \sigma_j.$

This computation may be interpreted as distributing $\lambda(y_j)$ between several neighboring coarse centers, as illustrated in Fig. 2.2. A new center $y$ can now be accommodated by incrementing

FIG. 2.2. Anterpolation can be interpreted as distributing each $\lambda(y_j)$ between several neighboring coarse centers $Y_J$ with weights $\omega_{jJ}$. The figure depicts an example for $p = 4$.

the few $\Lambda(Y_J)$'s neighboring $y$ by (2.12) ($O(p)$ operations), updating the coarse sums (2.7) for all $X_I$'s ($O(pN)$ operations), and interpolating ($O(pm)$ operations), a total of $O(p(m + N))$ operations. Note that if we want to update the interpolant $s(x_i)$ only for some $x_i$ near $y$, we need update only $O(p)$ relevant $S(X_I)$ with $X_I$ near $x_i$, hence the cost of this update is only $O(p^2)$. For $y$ outside the convex hull of $\{Y_J\}_{J=0}^{N}$, use a grid extension as in the previous paragraph. This complexity analysis also applies to removing a center $y_j$ from the RBF interpolant.

Parallelization. The fast evaluation algorithm readily lends itself to distributed architectures. Anterpolation of each $\lambda(y_j)$ to the coarse lattice can be done independently of the others; however, multiple $j$'s may require conflicting access to the same coarse data (all $y_j$'s with $J \in \sigma_j$ would like to add their contributions to $\Lambda(Y_J)$). A domain decomposition approach, in which each processor is assigned a contiguous sub-domain of the centers $\{y_j\}_{j=0}^{n}$ to work on, will resolve such conflicts. Similar ideas can be applied to the coarse level summation and interpolation steps.

2.3. Fast coarse level summation. In cases where the evaluation points $\{x_i\}_{i=0}^{m}$ are not far outside the convex hull of the centers $\{y_j\}_{j=0}^{n}$, and vice versa, the coarse evaluation points can be set equal to the coarse centers, viz. $M = N$, $X_J = Y_J$, $J = 0, 1, \dots, N$. The coarse level summation (2.7) then amounts to a matrix-vector product similar to (1.2), where $A$ is now an $N \times N$ symmetric Toeplitz matrix. Thus, (2.7) can be computed in $O(N \ln N)$ operations using the FFT [28, pp. 201–202]. This approach is especially attractive for reducing the $O(N^2)$ complexity of a non-truncated coarse level summation (i.e., $t = N$) for a large enough $N$. The complexity of the algorithm thus becomes

$W = O\!\left( (n + m)\, p + \frac{1}{H} \ln \frac{1}{H} \right),$

which scales linearly with $n$ and $m$.

3. Examples of 1-D evaluation.

3.1. GA kernel. Let $\phi(r) = e^{-(cr)^2}$. Then

$\|\phi^{(p)}\|_\infty \le c^p\, p^{p/2}$

(see (B.2) of Appendix B with $d = 1$). In addition, $\phi$ decays exponentially as $r \to \infty$, and the "tail" is bounded by [1, p. 298]

(3.1)   $\int_{a}^{\infty} e^{-c^2 r^2}\, dr \le \frac{e^{-(ca)^2}}{c}.$

Using (3.1) and Stirling's asymptotic formula [1, p. 257]

(3.2)   $\Gamma(z + 1) \approx \sqrt{2\pi}\; e^{-z} z^{z + 1/2}$

in (2.11), we obtain the accuracy estimate (keeping only the main terms)

$E \lesssim \frac{1}{\sqrt{2\pi p}} \left( \frac{Hce}{2\sqrt{p}} \right)^{p} + \frac{e^{-(tHc)^2}}{c}.$

The first term in $E$ can be bounded by $\delta$ only if we require $Hce/(2\sqrt{p}) \le \theta$ for some $0 < \theta < 1$, and $p = O(\ln(1/\delta))$. Thus,

(3.3)   $H \lesssim \frac{\sqrt{p}}{c},$

(3.4)   $p \approx \ln \frac{1}{\delta}.$

In practice, $p$ is rounded to the next even integer. The second term is bounded by $O(\delta)$ if

$\ln \frac{1}{c} - (tHc)^2 \lesssim \ln \delta \iff t \gtrsim \frac{1}{Hc} \left( \ln \frac{1}{c\delta} \right)^{1/2},$

provided $c\delta < 1$. $W$ is minimized if and only if $t$ is, hence we choose

(3.5)   $t \approx \frac{1}{Hc} \left( \ln \frac{1}{c\delta} \right)^{1/2}.$

By selecting (3.3)–(3.5), we can evaluate (2.1) for the GA RBF in

(3.6)   $W = O\!\left( \ln \frac{1}{\delta}\, (n + m) + c \left( \ln \frac{1}{c\delta} \right)^{1/2} \right)$

operations. The complexity scales linearly with $n$ and $m$ for all $c \lesssim n$. If $c \gtrsim n$, the original summation (2.1) can be truncated similarly to (2.8) and directly evaluated in $O(n + m)$ operations. Hence, the evaluation task (2.1) with the GA kernel can be carried out in $O(n + m)$ operations, for all $c > 0$.


3.1.1. Numerical experiments. We numerically verify the accuracy and work estimates for our multilevel method derived above. In all experiments we set $c = \sqrt{n}/4$ and $m = 2n$. Similar results are obtained with other $n, m, c$. We do not employ the FFT summation technique of §2.3.

To enforce $\delta_h + \delta_t \lesssim \delta$, we set the algorithm's parameters so that $\delta_h = \delta_t = \delta/2$. The work (3.6) is then minimized when $H, p, t$ are chosen as follows:

(3.7)   $\tilde{p} = \frac{\ln(2/\delta)}{\ln(1/\theta)}, \qquad H = \frac{2\theta}{ec} \sqrt{\tilde{p}},$

(3.8)   $p = \lceil\lceil \tilde{p} \rceil\rceil,$

$t = \begin{cases} \left\lceil \dfrac{\beta}{Hc} \left( \ln \dfrac{2}{c\delta} \right)^{1/2} \right\rceil, & c\delta < 2, \\[1ex] 0, & c\delta \ge 2, \end{cases}$

where $\lceil \cdot \rceil$ denotes rounding to the next integer and $\lceil\lceil \cdot \rceil\rceil$ indicates rounding the argument to the next even integer. We use $\theta = 1/4$ and $\beta = 1.1$; a more precise analysis (also alleviating the need for the derivation (3.3)–(3.5)) could optimize $\theta, \beta$ by preparing a numerical table of the relative error $E = E(\theta, \beta)$ (measured for several different $n$'s versus a direct evaluation of (2.1), and averaged over several random $\{\lambda(y_j)\}_{j=0}^{n}$), and using $\theta, \beta$ to minimize $W$ under $E \lesssim \delta$. However, this hardly seems profitable, as good results are obtained for our rough estimates of $\theta, \beta$.
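The parameter choice (3.7)–(3.8), as reconstructed here, takes only a few lines; $\theta = 1/4$ and $\beta = 1.1$ as in the text:

```python
import math

def ga_parameters(c, delta, theta=0.25, beta=1.1):
    """Choose (p, H, t) for the GA kernel, per (3.7)-(3.8) as given above."""
    p_raw = math.log(2.0 / delta) / math.log(1.0 / theta)
    p = 2 * math.ceil(p_raw / 2)                    # round up to an even integer
    H = 2.0 * theta * math.sqrt(p) / (math.e * c)   # makes the first term ~ theta^p
    if c * delta < 2.0:
        t = math.ceil(beta / (H * c) * math.sqrt(math.log(2.0 / (c * delta))))
    else:
        t = 0   # tail term e^{-(tHc)^2}/c is already below delta/2
    return p, H, t

p, H, t = ga_parameters(c=5.0, delta=1e-6)
```

Tighter accuracy $\delta$ increases $p$ logarithmically and shrinks $t H$ only mildly, consistent with the $O(\ln(1/\delta)(n+m))$ work estimate (3.6).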

First, we verify that with this choice of parameters the relative error $E$ (2.10) is indeed $O(\delta)$. Table 3.1 shows $E$ for various values of $n$ and $\delta$. Each entry in the table is the average of ten different experiments, where $\{y_j\}_{j=0}^{n}$ and $\{x_i\}_{i=0}^{m}$ were randomly selected in $[0, 1]$, and $\{\lambda(y_j)\}_{j=0}^{n}$ randomly chosen in $[-1, 1]$ in each experiment. Clearly, $E$ is below $\delta$ in all cases.

TABLE 3.1
Relative error $E$ of the multilevel evaluation method for the GA, versus $n$ and $\delta$ ($c = \sqrt{n}/4$, $m = 2n$). [Measured $E$ for $n = 100, 200, 400, 800, 1600$ and $\delta = 10^{-2}, 10^{-4}, 10^{-6}, 10^{-8}, 10^{-10}$; in every case $E < \delta$.]

Second, we verify that the work (3.6) scales linearly with $m$, $n$, and $\ln(1/\delta)$. Table 3.2 compares the number of operations required for our multilevel method, for various values of $n$ and $\delta$, with a direct evaluation. Each evaluation of $e^{-x}$ is counted as one operation. As expected, the results follow the work estimate (3.6), giving $W \approx C \ln(1/\delta)(n + m)$, where $5.7 \le C \le 7.2$. Note that the hidden constant (which is of course only roughly estimated here) is very small.

3.2. MQ kernel. Let $\phi(r) = (1 + (cr)^2)^{1/2}$. This kernel grows as $r \to \infty$, hence we do not truncate the coarse level sum: we choose $t = N$ in (2.8). From (B.6) of Appendix B

TABLE 3.2
Work (number of floating-point operations) $W$ required of the multilevel evaluation method for the GA, for various $n$ and desired accuracies $\delta$. The right column indicates the number of operations required for a direct evaluation. We average over centers and expansion coefficients as in Table 3.1. [Entries for $n = 100$–$1600$ and $\delta = 10^{-2}$–$10^{-10}$.]

with $\nu = 1$, $d = 1$,

$\|\phi^{(p)}\|_\infty = c^p\, \frac{\Gamma\!\left(p + \frac{1}{2}\right) \Gamma\!\left(p - \frac{1}{2}\right)}{2\pi}.$

Applying this to (2.11) and expanding with (3.2), we obtain the accuracy estimate

(3.9)   $E \lesssim \frac{2}{p} \left( \frac{2}{\pi p} \right)^{1/2} \left( \frac{Hcp}{2e} \right)^{p} \approx \left( \frac{Hcp}{2e} \right)^{p}.$

This can be bounded by $\delta$ only if we require $Hcp \le 2e\theta$ for some $0 < \theta < 1$, and $p = O(\ln(1/\delta))$. Thus,

$H \lesssim \frac{1}{cp}, \qquad p \approx \ln \frac{1}{\delta}.$

Again, $p$ is rounded to the next even integer. The fast evaluation complexity for the MQ kernel is

$W = O\!\left( \ln \frac{1}{\delta}\, (n + m) + \left( \ln \frac{1}{\delta} \right)^2 c^2 \right)$

operations. Hence, the algorithm scales linearly with $n$ and $m$ for all $c \lesssim \sqrt{n}$ or smaller. For larger $c$, $W$ is dominated by the coarse level summation (2.7). As discussed in §2.3, we can reduce this computation to $O\big( (\ln(1/\delta)\, c)\, \ln(\ln(1/\delta)\, c) \big)$ using the FFT.
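With $X_I = Y_J$, the non-truncated coarse sum (2.7) is a symmetric Toeplitz matrix-vector product, which the FFT evaluates in $O(N \ln N)$ via circulant embedding. A sketch; the grid size and kernel below are illustrative:

```python
import numpy as np

def toeplitz_matvec(col, v):
    """Multiply the symmetric Toeplitz matrix T[i, j] = col[|i - j|] by v,
    by embedding T in a circulant matrix and using one pair of FFTs."""
    N = len(col)
    circ = np.concatenate([col, col[-2:0:-1]])   # first column of the circulant
    fv = np.fft.rfft(np.concatenate([v, np.zeros(len(circ) - N)]))
    out = np.fft.irfft(np.fft.rfft(circ) * fv, n=len(circ))
    return out[:N]

# coarse summation S = T * Lambda for the MQ kernel on a uniform grid
c, H, N = 2.0, 0.05, 64
phi = lambda r: np.sqrt(1.0 + (c * r) ** 2)
col = phi(H * np.arange(N))                      # phi(|Y_J - Y_0|)
Lam = np.random.default_rng(2).uniform(-1.0, 1.0, N)
S = toeplitz_matvec(col, Lam)
```

The embedding doubles the vector length (to $2N - 2$) but replaces the $O(N^2)$ sum with three length-$O(N)$ FFTs.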

3.2.1. Numerical experiments. Similarly to §3.1.1, we numerically verify the accuracy and work estimates derived above. The same assumptions on the centers, evaluation points, and expansion coefficients are made; $c = \sqrt{n}/4$ and $m = 2n$. The FFT summation technique of §2.3 is not used, as $c \lesssim \sqrt{n}$.

Here, $\delta_t = 0$, thus we choose $p, H$ to have $\delta_h \lesssim \delta$. The optimal $H, p$ that minimize (2.9) are

(3.10)   $\tilde{p} = \frac{\ln(1/\delta)}{\ln(1/\theta)},$

(3.11)   $H = \frac{2e\theta}{c\tilde{p}},$

(3.12)   $p = \lceil\lceil \tilde{p} \rceil\rceil,$
