
Electronic Journal of Probability

Electron. J. Probab. 18 (2013), no. 15, 1–24.

ISSN: 1083-6489 DOI: 10.1214/EJP.v18-1967

A splitting method for fully nonlinear degenerate parabolic PDEs

Xiaolu Tan∗

Abstract

Motivated by applications in Asian option pricing, optimal commodity trading, etc., we propose a splitting scheme for fully nonlinear degenerate parabolic PDEs. The splitting scheme generalizes the probabilistic scheme of Fahim, Touzi and Warin [13] to the degenerate case. General convergence as well as a rate of convergence are obtained under reasonable conditions. In particular, the scheme can be used for a class of Hamilton-Jacobi-Bellman equations, which characterize the value functions of stochastic control problems or stochastic differential games. We also provide a simulation-regression method to make the splitting scheme implementable. Finally, we give some numerical tests on an Asian option pricing problem and an optimal hydropower management problem.

Keywords: Numerical scheme ; nonlinear degenerate PDE ; splitting method ; viscosity solution.

AMS MSC 2010: Primary 65C05; secondary 49L25.

Submitted to EJP on April 23, 2012, final version accepted on January 26, 2013.

1 Introduction

Numerical methods for parabolic partial differential equations (PDEs) are well developed in the literature: finite difference schemes, finite element schemes, semi-Lagrangian schemes, Monte-Carlo methods, etc. For nonlinear PDEs, and especially in high dimensional cases, numerical resolution remains a big challenge.

A typical kind of nonlinear parabolic PDE is the Hamilton-Jacobi-Bellman (HJB) equation, which characterizes the solution of optimal control problems. In this context, for the finite difference method, one can only use the explicit scheme, since the implicit scheme needs to invert too many matrices. In the one dimensional case, the explicit finite difference scheme can be easily constructed and its monotonicity is guaranteed by the CFL condition. In high dimensional cases, Bonnans and Zidani [4] propose a numerical algorithm to construct a monotone scheme. Another numerical method for general HJB equations is the semi-Lagrangian scheme proposed in Debrabant and Jakobsen [12]. It can be easily constructed to be monotone, but one then needs a finite difference grid as well as an interpolation method to make it implementable. It can hence be viewed as a finite difference scheme.

∗CMAP, École Polytechnique, Paris. E-mail: xiaolu.tan@polytechnique.edu

Generally speaking, finite difference and semi-Lagrangian schemes are easily implemented and perform quite well in low dimensional cases, while in high dimensional cases the Monte-Carlo method is preferred. Recently, Fahim, Touzi and Warin [13] proposed a probabilistic method for nonlinear parabolic PDEs, which is closely related to the second order backward stochastic differential equations (2BSDEs) developed in Cheridito et al. [9] and Soner et al. [18]. Using simulations of a diffusion process, they estimate the value function and its derivatives by conditional expectations, with which they approximate the nonlinear part of the PDE and obtain a convergent scheme. However, their scheme can only be applied in the non-degenerate case.

We want to generalize the probabilistic scheme of Fahim, Touzi and Warin [13] to the degenerate case, motivated by its applications in finance. For example, in Asian option pricing problems, we must consider the cumulative average stock price $A_t$; for lookback options, we also consider the historical maximum and/or minimum stock prices $M_t$, $m_t$. These are all degenerate variables without a diffusion generator, and hence the pricing equation turns out to be a degenerate parabolic equation. In some optimal commodity trading models (see e.g. [1], [7] and [8]), the storage amount of commodities is an important state variable, and the optimization problem induces a PDE which degenerates in the storage amount variable. In life insurance, Dai et al. [11] proposed a financial pricing model for a Variable Annuities product, the Guaranteed Minimum Withdrawal Benefit (GMWB). In their model, the price of the GMWB depends on two variables: the reference account and the guaranteed account, where the latter degenerates, so that the pricing equation is a degenerate parabolic PDE.

For these degenerate PDEs, the degenerate part is separable. Therefore, a natural solution is the splitting scheme. Our idea is to use the probabilistic scheme to treat the non-degenerate part, and use the semi-Lagrangian scheme to solve the degenerate part, and by combining the two methods, we get a splitting scheme. In particular, it generalizes the probabilistic scheme of Fahim, Touzi and Warin [13] to the degenerate case.

Another contribution of the paper is to propose a simulation-regression technique to make the semi-Lagrangian scheme implementable, in place of the classical finite difference method together with an interpolation technique as used in Debrabant and Jakobsen [12], or Chen and Forsyth [8]. In the simulation-regression method, we can use global polynomials, local hypercubes, local polynomials, etc. as the regression function basis.

The global polynomial method approximates a function by polynomials on the whole space, while the local basis method first discretizes the space into local rectangles and then approximates the function by polynomials on every local rectangle. As illustrated in Gobet, Lemor and Warin [14] and also in Bouchard and Warin [6], the local hypercube and local polynomial bases are very efficient in concrete cases. Moreover, they show that in practice it is enough to choose a small number (about five or six) of discretization points in every dimension for the local basis method, while for the finite difference method one needs many more discretization points (more than 50 points in [8], for example) in every dimension.

In particular, this makes it possible to treat problems in high dimensions (up to 5 dimensions in [13] and up to 6 dimensions in [6]). In our context, we shall provide a four dimensional numerical example.

The rest of the paper is organized as follows. In Section 2, we introduce a degenerate PDE and a splitting scheme which combines the probabilistic scheme in [13] and the semi-Lagrangian scheme. We then provide a local uniform convergence result as well as a rate of convergence, where the main idea is to adapt the viscosity solution technique proposed in Barles and Souganidis [3] and Barles and Jakobsen [2]. In Section 3, we propose a simulation-regression technique to approximate the conditional expectations used in the splitting scheme, making the scheme implementable. We also discuss the choice of function basis used in the regression and then provide some convergence results for this implementable scheme. Finally, Section 4 provides some numerical examples.

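Notation: Let $|\eta| := \eta^1 + \cdots + \eta^d$ for $\eta \in \mathbb{N}^d$. Given $T \in \mathbb{R}^+$ and $d, d' \in \mathbb{N}$, we denote $Q_T := [0,T)\times\mathbb{R}^d\times\mathbb{R}^{d'}$, $\overline{Q}_T := [0,T]\times\mathbb{R}^d\times\mathbb{R}^{d'}$ and
\[ C^{0,1}(Q_T) := \big\{ \phi : Q_T \to \mathbb{R} \ \text{such that}\ |\phi|_1 < \infty \big\}, \]
where
\[ |\phi|_0 := \sup_{Q_T} |\phi(t,x,y)| \quad\text{and}\quad |\phi|_1 := |\phi|_0 + \sup_{Q_T\times Q_T} \frac{|\phi(t,x,y)-\phi(t',x',y')|}{|x-x'| + |y-y'| + |t-t'|^{1/2}}. \]
In this paper, the constant $C$ is used in many inequalities; its value may vary from line to line.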

2 The degenerate PDE and splitting scheme

In this section, we first introduce a nonlinear parabolic PDE which has a separable degenerate part. We then propose a splitting scheme, for which we provide a local uniform convergence result when the PDE satisfies a comparison result for bounded viscosity solutions, as well as a rate of convergence when the nonlinear part of the PDE is a concave Hamiltonian.

2.1 A degenerate nonlinear PDE

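Let $T \in \mathbb{R}^+$, and let $\mu : [0,T]\times\mathbb{R}^d \to \mathbb{R}^d$ and $\sigma : [0,T]\times\mathbb{R}^d \to \mathbb{S}^d$ be continuous. Denoting $a(t,x) := \sigma(t,x)\sigma(t,x)^T$, we define a linear operator $L^X$ on the smooth functions $\phi : Q_T \to \mathbb{R}$ by
\[ L^X\phi(t,x,y) := \partial_t\phi(t,x,y) + \mu(t,x)\cdot D_x\phi(t,x,y) + \tfrac{1}{2}\,a(t,x)\cdot D^2_{xx}\phi(t,x,y). \]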

We say that $L^X$ is the linear operator associated to the diffusion process $X = (X_t)_{0\le t\le T}$ defined by the stochastic differential equation
\[ dX_t = \mu(t,X_t)\,dt + \sigma(t,X_t)\,dW_t, \tag{2.1} \]
where $W = (W_t)_{0\le t\le T}$ is a $d$-dimensional standard Brownian motion.

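Given a nonlinear function
\[ F : (t,x,y,r,p,\Gamma) \in \mathbb{R}^+\times\mathbb{R}^d\times\mathbb{R}^{d'}\times\mathbb{R}\times\mathbb{R}^d\times\mathbb{S}^d \mapsto F(t,x,y,r,p,\Gamma) \in \mathbb{R}, \]
we obtain a nonlinear operator $F(t,x,y,\phi,D_x\phi,D^2_{xx}\phi)$ on $\phi$. We denote by $F_p$ and $F_\Gamma$ the derivatives of the function $F$ with respect to $p$ and $\Gamma$.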

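Next, we give the degenerate part, which involves the partial gradient with respect to $y$. Given functions
\[ \big( l^{\alpha,\beta},\, c^{\alpha,\beta},\, f_i^{\alpha,\beta},\, g_j^{\alpha,\beta} \big)_{\alpha\in\mathcal{A},\ \beta\in\mathcal{B},\ 1\le i\le d,\ 1\le j\le d'} \]
defined on $Q_T$ with index spaces $\mathcal{A}$ and $\mathcal{B}$, we denote $f^{\alpha,\beta} := (f_i^{\alpha,\beta})_{1\le i\le d}$ and $g^{\alpha,\beta} := (g_j^{\alpha,\beta})_{1\le j\le d'}$, and define the Lagrangian $L^{\alpha,\beta}$ by
\[ L^{\alpha,\beta}\phi(t,x,y) := l^{\alpha,\beta}(t,x,y) + c^{\alpha,\beta}(t,x,y)\,\phi(t,x,y) + f^{\alpha,\beta}(t,x,y)\cdot D_x\phi(t,x,y) + g^{\alpha,\beta}(t,x,y)\cdot D_y\phi(t,x,y), \]
and the Hamiltonian by
\[ H\big(t,x,y,\phi(t,x,y),D_x\phi(t,x,y),D_y\phi(t,x,y)\big) := \inf_{\alpha\in\mathcal{A}}\ \sup_{\beta\in\mathcal{B}}\ L^{\alpha,\beta}\phi(t,x,y). \]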

Finally, let us introduce the degenerate fully nonlinear parabolic PDE which will be considered throughout the paper:
\[ \big( -L^X v - F(\cdot,v,D_xv,D^2_{xx}v) - H(\cdot,v,D_xv,D_yv) \big)(t,x,y) = 0 \quad\text{on } Q_T, \tag{2.2} \]
with terminal condition
\[ v(T,x,y) = \Phi(x,y). \tag{2.3} \]
The PDE (2.2) is composed of three separable parts: the linear part $L^X$, the nonlinear part $F$, and the first order degenerate part $H$.

2.2 A splitting scheme

As observed above, the three parts of the PDE (2.2) are separable, so we can propose a splitting numerical scheme to solve it. The idea is to split (2.2) into the following two equations:
\[ -L^X v(t,x,y) - F(\cdot,v,D_xv,D^2_{xx}v)(t,x,y) = 0 \tag{2.4} \]
and
\[ -\partial_t v(t,x,y) - H(\cdot,v,D_xv,D_yv)(t,x,y) = 0, \tag{2.5} \]
and then to solve them separately. Equation (2.4) is nonlinear and non-degenerate for every fixed $y$, so it can be treated by the probabilistic scheme proposed in Fahim et al. [13].

Equation (2.5) is a first order Hamilton-Jacobi-Bellman-Isaacs (HJBI) equation, which we solve by a semi-Lagrangian scheme. Combining the two schemes sequentially, we get the splitting scheme.

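Let us first introduce a discrete time grid $(t_n)_{n=0,\dots,N}$ with $t_n := nh$, where $h := T/N$ for $N \in \mathbb{N}$. As in [13], we define $\hat X_h^{t,x}$ by the Euler scheme of the diffusion process $X$ in (2.1):
\[ \hat X_h^{t,x} := x + \mu(t,x)\,h + \sigma(t,x)\cdot(W_{t+h}-W_t), \quad \forall (t,x)\in[0,T]\times\mathbb{R}^d. \]
Let $v^h$ denote the numerical solution; then the probabilistic scheme of [13] for equation (2.4) is given by
\[ v^h(t_n,x,y) = T_h[v^h](t_n,x,y) := \mathbb{E}\big[ v^h(t_{n+1},\hat X_h^{t_n,x},y) \big] + h\,F\big(t_n,x,y,\mathbb{E}\mathcal{D}_h v^h(t_n,x,y)\big), \tag{2.6} \]
where
\[ \mathbb{E}\mathcal{D}_h\phi(t_n,x,y) := \Big( \mathbb{E}\big[ \phi(t_{n+1},\hat X_h^{t_n,x},y)\,H_i^{t_n,x,h}(\Delta W_{n+1}) \big] : i = 0,1,2 \Big), \]
with $\Delta W_{n+1} := W_{t_{n+1}} - W_{t_n}$, and the Hermite polynomials are defined by
\[ H_0^{t,x,h}(w) := 1, \qquad H_1^{t,x,h}(w) := \big(\sigma^T(t,x)\big)^{-1}\frac{w}{h}, \qquad H_2^{t,x,h}(w) := \big(\sigma^T(t,x)\big)^{-1}\,\frac{ww^T - hI_d}{h^2}\,\sigma(t,x)^{-1}. \]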

Remark 2.1. The scheme $T_h$ is well defined as soon as $\mathrm{Det}(\sigma(t,x)) \neq 0$ for each $(t,x)\in[0,T)\times\mathbb{R}^d$. When $\phi$ is smooth, by integration by parts, one can verify that
\[ \mathbb{E}\big[ \phi(t_{n+1},\hat X_h^{t_n,x},y)\,H_i^{t_n,x,h}(\Delta W_{n+1}) \big] = \mathbb{E}\big[ D^i_{x^i}\phi(t_{n+1},\hat X_h^{t_n,x},y) \big], \quad i = 0,1,2. \]
For more details on this fact and on the probabilistic scheme $T_h$ of (2.6), we refer to Fahim et al. [13].
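The integration-by-parts identity of Remark 2.1 can be checked numerically in the case $d = 1$. The following is a minimal sketch, not the implementation of [13]: it replaces Monte-Carlo simulation by a three-point Gauss-Hermite rule (an assumption made here so that the check is deterministic and exact for polynomials of low degree), and the function names `hermite_weights` and `expectation_derivatives` are hypothetical.

```python
import math

# Three-point Gauss-Hermite rule for a standard normal variable Z
# (assumption of this sketch): nodes 0, +sqrt(3), -sqrt(3) with weights
# 2/3, 1/6, 1/6. It integrates polynomials in Z exactly up to degree 5.
NODES = (0.0, math.sqrt(3.0), -math.sqrt(3.0))
WEIGHTS = (2.0 / 3.0, 1.0 / 6.0, 1.0 / 6.0)


def hermite_weights(sigma, h, w):
    """Return (H0, H1, H2) of Section 2.2 for d = 1 and an increment w."""
    h0 = 1.0
    h1 = w / (sigma * h)
    h2 = (w * w - h) / (sigma * sigma * h * h)
    return h0, h1, h2


def expectation_derivatives(phi, x, mu, sigma, h):
    """Approximate E[phi(X_hat) * H_i(dW)] for i = 0, 1, 2 by quadrature."""
    out = [0.0, 0.0, 0.0]
    for z, p in zip(NODES, WEIGHTS):
        w = math.sqrt(h) * z              # Brownian increment over one step
        x_next = x + mu * h + sigma * w   # one Euler step from x
        value = phi(x_next)
        for i, weight in enumerate(hermite_weights(sigma, h, w)):
            out[i] += p * value * weight
    return tuple(out)
```

For a cubic test function the three outputs coincide with $\mathbb{E}[\phi]$, $\mathbb{E}[\phi']$ and $\mathbb{E}[\phi'']$ evaluated at the Euler step, which is the content of the identity; for non-polynomial $\phi$ the quadrature is only approximate.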


The second PDE (2.5) is a first order HJBI equation; its semi-Lagrangian scheme is given by
\[ v^h(t_n,x,y) = S_h[v^h](t_n,x,y) := \inf_{\alpha\in\mathcal{A}}\ \sup_{\beta\in\mathcal{B}} \Big\{ h\,l^{\alpha,\beta}(t_n,x,y) + h\,c^{\alpha,\beta}(t_n,x,y)\,v^h(t_{n+1},x,y) + v^h\big(t_{n+1},\, x + h f^{\alpha,\beta}(t_n,x,y),\, y + h g^{\alpha,\beta}(t_n,x,y)\big) \Big\}. \tag{2.7} \]

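Remark 2.2. The semi-Lagrangian scheme $S_h$ is deduced intuitively from the discrete version of equation (2.5):
\[ \frac{v^h(t_{n+1},x,y) - v^h(t_n,x,y)}{h} + \inf_{\alpha\in\mathcal{A}}\ \sup_{\beta\in\mathcal{B}} \Big\{ l^{\alpha,\beta}(t_n,x,y) + c^{\alpha,\beta}(t_n,x,y)\,v^h(t_{n+1},x,y) + \frac{v^h\big(t_{n+1},\, x + h f^{\alpha,\beta}(t_n,x,y),\, y + h g^{\alpha,\beta}(t_n,x,y)\big) - v^h(t_{n+1},x,y)}{h} \Big\} = 0. \]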
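One step of the semi-Lagrangian scheme (2.7) can be sketched as follows, assuming finite control sets (so that the infimum and supremum become a minimum and a maximum), $d = d' = 1$, and coefficients that do not depend on time. The function `semi_lagrangian_step` and its argument names are hypothetical, and `v_next` stands for any callable approximation of $v^h(t_{n+1},\cdot,\cdot)$.

```python
def semi_lagrangian_step(v_next, x, y, h, alphas, betas, l, c, f, g):
    """One step of (2.7) at the point (x, y): min over alphas, max over betas.

    l, c, f, g are callables (alpha, beta, x, y) -> float; the time argument
    is dropped in this sketch.
    """
    def inner(a, b):
        # Value transported along the characteristic (f, g) over one step.
        shifted = v_next(x + h * f(a, b, x, y), y + h * g(a, b, x, y))
        return h * l(a, b, x, y) + h * c(a, b, x, y) * v_next(x, y) + shifted

    return min(max(inner(a, b) for b in betas) for a in alphas)
```

In an actual implementation `v_next` is only known through an interpolation or a regression, which is the point addressed in Section 3.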

Finally, we are ready to introduce the splitting scheme $S_h\circ T_h$ for the original PDE (2.2), (2.3). Concretely, with terminal condition
\[ v^h(t_N,x,y) := \Phi(x,y), \tag{2.8} \]
we compute $v^h(t_n,\cdot)$ by a backward iteration. Given $v^h(t_{n+1},\cdot)$, we introduce the fictitious time $t_{n+\frac12}$ and compute $v^h(t_n,\cdot)$ by
\[ v^h(t_{n+\frac12},x,y) := T_h[v^h](t_n,x,y) \quad\text{with } T_h \text{ defined in (2.6)}, \tag{2.9} \]
and
\[ v^h(t_n,x,y) = S_h\circ T_h[v^h](t_n,x,y) := \inf_{\alpha\in\mathcal{A}}\ \sup_{\beta\in\mathcal{B}} \Big\{ h\,l^{\alpha,\beta}(t_n,x,y) + h\,c^{\alpha,\beta}(t_n,x,y)\,v^h(t_{n+\frac12},x,y) + v^h\big(t_{n+\frac12},\, x + f^{\alpha,\beta}(t_n,x,y)\,h,\, y + g^{\alpha,\beta}(t_n,x,y)\,h\big) \Big\}. \tag{2.10} \]
Clearly, when $\mathrm{Det}(\sigma(t,x)) \neq 0$ for every $(t,x)\in[0,T)\times\mathbb{R}^d$, the scheme $S_h\circ T_h$ is well defined and it gives a unique numerical solution $v^h$.
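The backward iteration (2.8), (2.9), (2.10) can be illustrated by the following toy sketch for $d = d' = 1$. It is not the implementable scheme of Section 3: the expectations are computed with a three-point Gauss-Hermite rule instead of simulation-regression, the coefficients are taken time-homogeneous, the control sets are finite, and the numerical solution is represented by nested closures, whose cost grows exponentially with the number of steps. All function names are hypothetical.

```python
import math

# Three-point Gauss-Hermite rule for a standard normal (assumption of this sketch).
NODES = (0.0, math.sqrt(3.0), -math.sqrt(3.0))
WEIGHTS = (2.0 / 3.0, 1.0 / 6.0, 1.0 / 6.0)


def diffusion_step(v_next, x, y, h, mu, sigma, F):
    """T_h of (2.6): expectation of v_next plus h * F(value, gradient, Hessian)."""
    m, s = mu(x), sigma(x)
    e0 = e1 = e2 = 0.0
    for z, p in zip(NODES, WEIGHTS):
        w = math.sqrt(h) * z
        val = v_next(x + m * h + s * w, y)
        e0 += p * val
        e1 += p * val * w / (s * h)
        e2 += p * val * (w * w - h) / (s * s * h * h)
    return e0 + h * F(x, y, e0, e1, e2)


def transport_step(v_half, x, y, h, alphas, betas, l, c, f, g):
    """S_h as in (2.10): min over alphas, max over betas."""
    def inner(a, b):
        moved = v_half(x + h * f(a, b, x, y), y + h * g(a, b, x, y))
        return h * l(a, b, x, y) + h * c(a, b, x, y) * v_half(x, y) + moved

    return min(max(inner(a, b) for b in betas) for a in alphas)


def solve(terminal, n_steps, T, mu, sigma, F, alphas, betas, l, c, f, g):
    """Return v^h(0, ., .) as a callable, starting from v^h(T, ., .) = terminal."""
    h = T / n_steps
    v = terminal
    for _ in range(n_steps):
        # Default arguments bind the current functions inside each closure.
        def v_half(x, y, v_next=v):
            return diffusion_step(v_next, x, y, h, mu, sigma, F)

        def v_new(x, y, v_half=v_half):
            return transport_step(v_half, x, y, h, alphas, betas, l, c, f, g)

        v = v_new
    return v
```

With $F \equiv 0$, $\Phi(x,y) = x + y$ and a control acting only through $g^{\alpha} = \alpha \in \{-1, 1\}$, each step lowers the value by $h$, so the sketch returns $x + y - T$ at time $0$; this gives a quick sanity check of the composition order $S_h \circ T_h$.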

2.3 The convergence results

We shall provide two convergence results for the splitting scheme $S_h\circ T_h$ in (2.10), similar to Fahim et al. [13]. The first one is the local uniform convergence in the context of Barles and Souganidis [3], and the second is a rate of convergence.

We first recall that an upper semicontinuous (resp., lower semicontinuous) function $\underline{v}$ (resp., $\overline{v}$) on $Q_T$ is called a viscosity subsolution (resp., supersolution) of (2.2) if, for any $(t,x,y)\in Q_T$ and any smooth function $\phi$ satisfying
\[ 0 = (\underline{v}-\phi)(t,x,y) = \max_{Q_T}(\underline{v}-\phi) \quad \Big(\text{resp., } 0 = (\overline{v}-\phi)(t,x,y) = \min_{Q_T}(\overline{v}-\phi)\Big), \]
we have
\[ -L^X\phi - F(t,x,y,\phi,D_x\phi,D^2_{xx}\phi) - H(t,x,y,\phi,D_x\phi,D_y\phi) \le (\text{resp., } \ge)\ 0. \]

Definition 2.3. We say that the PDE (2.2) satisfies a comparison result for bounded functions if, for any bounded upper semicontinuous subsolution $\underline{v}$ and any bounded lower semicontinuous supersolution $\overline{v}$ on $Q_T$ satisfying
\[ \underline{v}(T,\cdot) \le \overline{v}(T,\cdot), \]
we have $\underline{v} \le \overline{v}$.


Let us now give some assumptions on the equation (2.2), and then provide a first convergence result.

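Assumption F: (i) The diffusion coefficients $\mu$ and $\sigma$ are Lipschitz in $x$ and continuous in $t$, $\sigma\sigma^T(t,x) > 0$ for all $(t,x)\in[0,T]\times\mathbb{R}^d$ and
\[ \int_0^T \big( |\sigma\sigma^T(t,0)| + |\mu(t,0)| \big)\,dt < \infty. \]
(ii) The nonlinear operator $F$ is uniformly Lipschitz in $(x,y,r,p,\Gamma)$, continuous in $t$ and $\sup_{(t,x,y)\in Q_T}|F(t,x,y,0,0,0)| < \infty$.

(iii) $F$ is elliptic and satisfies
\[ a^{-1}\cdot F_\Gamma \le 1 \quad\text{on } \mathbb{R}\times\mathbb{R}^d\times\mathbb{R}^{d'}\times\mathbb{R}\times\mathbb{R}^d\times\mathbb{S}^d. \tag{2.11} \]
(iv) $F_p \in \mathrm{Image}(F_\Gamma)$ and $\big| F_p^T F_\Gamma^{-1} F_p \big|_\infty < +\infty$.

Remark 2.4. Assumption F is almost the same as Assumption F in [13]; here we just add a variable $y$ in the nonlinear operator $F$.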

Assumption H: The coefficients in the Hamiltonian $H$ are all uniformly bounded, i.e.
\[ \sup_{(\alpha,\beta)\in\mathcal{A}\times\mathcal{B},\ 1\le i\le d,\ 1\le j\le d'} \Big( |l^{\alpha,\beta}|_0 + |c^{\alpha,\beta}|_0 + |f_i^{\alpha,\beta}|_0 + |g_j^{\alpha,\beta}|_0 \Big) < \infty. \]

Assumption M: $F_r - \tfrac14 F_p^T F_\Gamma^{-1} F_p \ge 0$ and $c^{\alpha,\beta} \ge 0$ for every $\alpha\in\mathcal{A}$, $\beta\in\mathcal{B}$.

Remark 2.5. Assumption M is imposed to guarantee the monotonicity of the splitting scheme $S_h\circ T_h$. However, it is not crucial as soon as Assumptions F and H hold true. In fact, as discussed in Remark 3.13 of [13], since the equation is parabolic, we can introduce a new function $u(t,x,y) := e^{\theta(T-t)}v(t,x,y)$ for some positive constant $\theta$ large enough; then the new PDE for $u(t,x,y)$ satisfies Assumption M under Assumptions F and H. Here, we impose this assumption only to simplify the presentation and the arguments.
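The change of variable in Remark 2.5 can be made explicit as follows. This is one possible bookkeeping, written here as a sketch rather than taken from [13]; the splitting $\theta = \theta_1 + \theta_2$ and the notations $\hat F$, $\hat c$, $\hat l$ are introduced only for this illustration.

```latex
% With v = e^{-\theta(T-t)} u one has
%   \partial_t v = e^{-\theta(T-t)} (\partial_t u + \theta u),
% and every spatial derivative of v is e^{-\theta(T-t)} times that of u.
% Multiplying (2.2) by e^{\theta(T-t)} and writing \theta = \theta_1 + \theta_2:
\begin{align*}
  & -L^X u - \hat F(\cdot, u, D_x u, D^2_{xx} u) - \hat H(\cdot, u, D_x u, D_y u) = 0, \\
  & \hat F(t,x,y,r,p,\Gamma) := \theta_1 r
      + e^{\theta(T-t)} F\big(t,x,y, e^{-\theta(T-t)} r, e^{-\theta(T-t)} p, e^{-\theta(T-t)} \Gamma\big), \\
  & \hat H(t,x,y,r,p,q) := \inf_{\alpha}\sup_{\beta}
      \big\{ \hat l^{\alpha,\beta} + \hat c^{\alpha,\beta} r + f^{\alpha,\beta}\cdot p + g^{\alpha,\beta}\cdot q \big\},
      \quad \hat l^{\alpha,\beta} := e^{\theta(T-t)} l^{\alpha,\beta},
      \quad \hat c^{\alpha,\beta} := c^{\alpha,\beta} + \theta_2 .
\end{align*}
% Since \hat F_r = \theta_1 + F_r, \hat F_p = F_p and \hat F_\Gamma = F_\Gamma
% (evaluated at the rescaled arguments), Assumption M holds for (\hat F, \hat H) as soon as
\[
  \theta_1 \ge |F_r|_\infty + \tfrac14 \big| F_p^T F_\Gamma^{-1} F_p \big|_\infty ,
  \qquad
  \theta_2 \ge \sup_{\alpha,\beta} |c^{\alpha,\beta}|_0 .
\]
```

Both lower bounds are finite under Assumptions F and H, the terminal condition is unchanged since $u(T,\cdot) = v(T,\cdot) = \Phi$, and the rescaled coefficients stay bounded on $[0,T]$.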

Theorem 2.6. Let Assumptions F, H and M hold true, and assume that the degenerate fully nonlinear parabolic PDE (2.2) satisfies a comparison result for bounded viscosity solutions. Then for every bounded Lipschitz terminal condition function $\Phi$, there exists a bounded function $v$ such that
\[ v^h \longrightarrow v \quad\text{locally uniformly as } h \to 0, \]
where $v^h$ is the numerical solution of the scheme $S_h\circ T_h$ defined by (2.8), (2.9) and (2.10). Moreover, $v$ is the unique bounded viscosity solution of the equation (2.2) with terminal condition (2.3).

It is clear that Assumptions F and H hold true for a class of HJB equations as well as a class of HJBI equations which characterize the value functions of stochastic differential game problems. We next provide a rate of convergence in the case where $F$ and $H$ are both concave Hamiltonians, i.e. when the nonlinear equation (2.2) is an HJB equation. We shall use the arguments developed by Barles and Jakobsen [2]. The following stronger assumptions imply that the nonlinear PDE (2.2) satisfies a comparison result for bounded functions, and has a unique bounded viscosity solution given a bounded and Lipschitz continuous function $\Phi$; see e.g. Proposition 2.1 of [2].

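Assumption HJB: Assumptions F and M hold and $F$ is a concave Hamiltonian, i.e.
\[ \mu\cdot p + \tfrac12\,a\cdot\Gamma + F(t,x,y,r,p,\Gamma) = \inf_{\gamma\in\mathcal{C}} L^\gamma(t,x,y,r,p,\Gamma), \]
with
\[ L^\gamma(t,x,y,r,p,\Gamma) := l^\gamma(t,x,y) + c^\gamma(t,x,y)\,r + f^\gamma(t,x,y)\cdot p + \tfrac12\,a^\gamma(t,x,y)\cdot\Gamma. \]
Moreover, $\mathcal{B} = \{\beta\}$ is a singleton, hence $H$ is also a concave Hamiltonian, so that it can be written as
\[ H(t,x,y,r,p,q) = \inf_{\alpha\in\mathcal{A}} \big\{ l^\alpha(t,x,y) + c^\alpha(t,x,y)\,r + f^\alpha(t,x,y)\cdot p + g^\alpha(t,x,y)\cdot q \big\}. \]
Finally, the functions $l$, $c$, $f$, $g$ and $\sigma$ satisfy
\[ \sup_{\alpha\in\mathcal{A},\ \gamma\in\mathcal{C}} \Big( |l^\alpha+l^\gamma|_1 + |c^\alpha+c^\gamma|_1 + |f^\alpha+f^\gamma|_1 + |g^\alpha|_1 + |\sigma^\gamma|_1 \Big) < \infty. \]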

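Assumption HJB+: Assumption HJB holds true, and for any $\delta > 0$, there exists a finite set $\{\alpha_i,\gamma_i\}_{i=1}^{I_\delta}$ such that for any $(\alpha,\gamma)\in\mathcal{A}\times\mathcal{C}$:
\[ \inf_{1\le i\le I_\delta} \Big( |l^\alpha-l^{\alpha_i}|_0 + |c^\alpha-c^{\alpha_i}|_0 + |f^\alpha-f^{\alpha_i}|_0 + |\sigma^\alpha-\sigma^{\alpha_i}|_0 \Big) \le \delta, \]
and
\[ \inf_{1\le i\le I_\delta} \Big( |l^\gamma-l^{\gamma_i}|_0 + |c^\gamma-c^{\gamma_i}|_0 + |f^\gamma-f^{\gamma_i}|_0 + |g^\gamma-g^{\gamma_i}|_0 \Big) \le \delta. \]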

Theorem 2.7. Suppose that the terminal condition function $\Phi$ is bounded and Lipschitz-continuous. Then there is a constant $C$ such that (i) under Assumption HJB, we have $v - v^h \le Ch^{1/4}$; (ii) under Assumption HJB+, we have $-Ch^{1/10} \le v - v^h \le Ch^{1/4}$, where $v$ is the unique bounded viscosity solution of (2.2) introduced in Theorem 2.6.

Remark 2.8. The above convergence rate is the same as that obtained in Fahim et al.[13]. It may not be the best rate in general. However, to the best of our knowledge, it is the optimal rate that we can prove in this stochastic control problem context so far.

2.4 Proof of local uniform convergence

To prove the local uniform convergence in Theorem 2.6, we shall verify the criteria proposed in Theorem 2.1 of Barles and Souganidis [3]: the monotonicity and the consistency of the scheme, and the stability of the numerical solutions. Moreover, as discussed in Remark 3.2 of [13], we also need to show that
\[ \liminf_{(t',x',y',h)\to(T,x,y,0)} v^h(t',x',y') \ge \Phi(x,y) \quad\text{and}\quad \limsup_{(t',x',y',h)\to(T,x,y,0)} v^h(t',x',y') \le \Phi(x,y). \tag{2.12} \]

Remark 2.9. By the definition of the numerical scheme $S_h\circ T_h$ in (2.10), the numerical solution $v^h$ is only defined on the product of the time grid $(t_n)_{0\le n\le N}$ with $\mathbb{R}^d\times\mathbb{R}^{d'}$. However, we can use linear interpolation to extend $v^h$ to the whole space $Q_T$.

Proposition 2.10. Let Assumptions F, H and M hold true. Then for two functions $\phi$ and $\psi$ defined on $Q_T$ with exponential growth, we have
\[ \phi \le \psi \ \Longrightarrow\ S_h\circ T_h[\phi](t,x,y) \le S_h\circ T_h[\psi](t,x,y). \]

Proof. By Lemma 3.12 and Remark 3.13 of [13], $\phi\le\psi$ implies that $T_h[\phi](t,x,y) \le T_h[\psi](t,x,y)$. Then, since $c^{\alpha,\beta}\ge 0$ according to Assumption M, it follows immediately from (2.10) that $S_h\circ T_h[\phi](t,x,y) \le S_h\circ T_h[\psi](t,x,y)$.

We first define a consistency error function, then prove that our splitting scheme $S_h\circ T_h$ is consistent.

Definition 2.11. Given a smooth function $\phi$ defined on $Q_T$, the consistency error function of the scheme $S_h\circ T_h$ is given by
\[ \Lambda^\phi_h(\cdot) := \frac{\phi(\cdot) - S_h\circ T_h[\phi](\cdot)}{h} + L^X\phi(\cdot) + F(\cdot,\phi,D_x\phi,D^2_{xx}\phi) + H(\cdot,\phi,D_x\phi,D_y\phi). \tag{2.13} \]
The scheme $S_h\circ T_h$ is said to be consistent if
\[ \Lambda^{\phi+c}_h(t',x',y') \to 0 \quad\text{as } (c,h,t',x',y') \to (0,0,t,x,y), \tag{2.14} \]
for every $(t,x,y)\in Q_T$ and every smooth function $\phi$ with bounded derivatives.

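Proposition 2.12. Let Assumptions F, H and M hold true. Then the scheme $S_h\circ T_h$ is consistent. In addition, if $\mu$ and $\sigma$ are uniformly bounded, then the consistency error function $\Lambda^\phi_h$ is uniformly bounded by $h\,\mathcal{E}(\phi)$, where
\[ \mathcal{E}(\phi) := C\Big( 1 + |\partial_{tt}\phi|_0 + \sum_{i=0}^{2} |\partial_t D^i_{z^i}\phi|_0 + \sum_{i=0}^{4} |D^i_{z^i}\phi|_0 \Big) \quad\text{with } z := (x,y)\in\mathbb{R}^{d+d'}, \]
for a constant $C$ independent of $\phi$ and $h$.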

Proof. For every $(t,x,y)\in Q_T$, the value $\Lambda^\phi_h(t,x,y)$ is independent of the value of $(\mu(\bar t,\bar x),\sigma(\bar t,\bar x))$ when $(\bar t,\bar x)\neq(t,x)$. Hence we can always change the values of $\mu$ and $\sigma$ outside a neighborhood of $(t,x)$ without any influence on the definition of consistency in (2.14). Therefore, without loss of generality, we can suppose that $\mu$ and $\sigma$ are uniformly bounded and show that, for every smooth function $\phi$ with bounded derivatives of any order, the consistency error function $\Lambda^\phi_h$ defined in (2.13) satisfies
\[ \big| \Lambda^\phi_h(\cdot) \big|_0 \le h\,\mathcal{E}(\phi). \tag{2.15} \]

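First, let us denote
\[ L^{\hat X^{t,x}}\phi(t',x',y) := \partial_t\phi(t',x',y) + \mu(t,x)\cdot D_x\phi(t',x',y) + \tfrac12\,a(t,x)\cdot D^2_{xx}\phi(t',x',y); \]
then by Itô's formula,
\begin{align*}
\mathcal{E}^h(t,x,y,\phi) &:= T_h[\phi](t,x,y) - \phi(t,x,y) \\
&= h\big( L^X\phi(\cdot) + F(\cdot,\phi,D_x\phi,D^2_{xx}\phi) \big)(t,x,y) + h^2\,\frac{1}{h^2}\,\mathbb{E}\int_t^{t+h}\!\!\int_t^u L^{\hat X^{t,x}}L^{\hat X^{t,x}}\phi(s,\hat X_s^{t,x},y)\,ds\,du \tag{2.16} \\
&\quad + h^2\Big[ \frac1h\Big( F(\cdot,\mathbb{E}\mathcal{D}_h\phi)(t,x,y) - F(\cdot,\phi,D_x\phi,D^2_{xx}\phi)(t,x,y) \Big) \Big].
\end{align*}
Denote $\mathcal{E}_1(t,x,y,\phi) := L^X\phi(t,x,y) + F(\cdot,\phi,D_x\phi,D^2_{xx}\phi)(t,x,y)$ and by $\mathcal{E}_2(t,x,y,\phi)$ the last two terms of the equality (2.16) divided by $h^2$; then $\mathcal{E}^h(t,x,y,\phi)$ can be rewritten as
\[ \mathcal{E}^h(t,x,y,\phi) = h\,\mathcal{E}_1(t,x,y,\phi) + h^2\,\mathcal{E}_2(t,x,y,\phi). \]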

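Clearly, by the boundedness of $\mu$ and $\sigma$, together with Assumption F, there is a constant $C$ independent of $h$ such that
\[ \big| \mathcal{E}_2(\cdot,\phi) \big|_0 \le C\Big( 1 + |\partial_{tt}\phi|_0 + \sum_{i=0}^{2} |\partial_t D^i_{x^i}\phi|_0 + \sum_{i=0}^{4} |D^i_{x^i}\phi|_0 \Big), \]
and moreover, $\mathcal{E}_1$ is Lipschitz in $z := (x,y)$ with coefficient
\[ L_{\mathcal{E}_1} \le C\big( 1 + |\partial_t D_z\phi|_0 + |D_z\phi|_0 + |D^2_{zz}\phi|_0 + |D^3_{zzz}\phi|_0 \big). \]
Abbreviating $\big( c^{\alpha,\beta}(t,x,y),\, l^{\alpha,\beta}(t,x,y),\, f^{\alpha,\beta}(t,x,y),\, g^{\alpha,\beta}(t,x,y) \big)$ as $(c^{\alpha,\beta}, l^{\alpha,\beta}, f^{\alpha,\beta}, g^{\alpha,\beta})$, we deduce that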

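\begin{align*}
&\frac1h\Big( S_h\big[\phi + \mathcal{E}^h(\cdot,\phi)\big](t,x,y) - \phi(t,x,y) - \mathcal{E}^h(t,x,y,\phi) \Big) \\
&= \frac1h \inf_{\alpha\in\mathcal{A}}\sup_{\beta\in\mathcal{B}} \Big[ h\,l^{\alpha,\beta} + h\,c^{\alpha,\beta}\phi(t,x,y) + \phi(t,x+f^{\alpha,\beta}h,y+g^{\alpha,\beta}h) - \phi(t,x,y) \\
&\qquad\qquad + h\,c^{\alpha,\beta}\mathcal{E}^h(t,x,y,\phi) + \mathcal{E}^h(t,x+f^{\alpha,\beta}h,y+g^{\alpha,\beta}h,\phi) - \mathcal{E}^h(t,x,y,\phi) \Big] \\
&= \inf_{\alpha\in\mathcal{A}}\sup_{\beta\in\mathcal{B}} \Big[ l^{\alpha,\beta} + c^{\alpha,\beta}\phi(t,x,y) + \big(f^{\alpha,\beta}\cdot D_x\phi + g^{\alpha,\beta}\cdot D_y\phi\big)(t,x,y) \\
&\qquad\qquad + \frac1h\big( \phi(t,x+f^{\alpha,\beta}h,y+g^{\alpha,\beta}h) - \phi(t,x,y) \big) - \big(f^{\alpha,\beta}\cdot D_x\phi + g^{\alpha,\beta}\cdot D_y\phi\big)(t,x,y) \\
&\qquad\qquad + c^{\alpha,\beta}\mathcal{E}^h(t,x,y,\phi) + \frac1h\big( \mathcal{E}^h(t,x+f^{\alpha,\beta}h,y+g^{\alpha,\beta}h,\phi) - \mathcal{E}^h(t,x,y,\phi) \big) \Big] \\
&=: H(\cdot,\phi,D_x\phi,D_y\phi)(t,x,y) + h\,\mathcal{E}_3(t,x,y,\phi), \tag{2.17}
\end{align*}
where $\mathcal{E}_3(t,x,y,\phi)$ is defined by the last equality of (2.17), and it satisfies
\[ |\mathcal{E}_3(t,x,y,\phi)| \le C\Big( |D^2_{zz}\phi|_0 + \frac1h\big|\mathcal{E}^h(t,x,y,\phi)\big| + 2\,|\mathcal{E}_2(t,x,y,\phi)| \Big) + L_{\mathcal{E}_1} \le \mathcal{E}(\phi). \]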

Combining the estimates (2.16) and (2.17), and using (2.13) as well as the equality
\[ \frac{\phi(t,x,y) - S_h\circ T_h[\phi](t,x,y)}{h} = \frac{\phi(t,x,y) - T_h[\phi](t,x,y)}{h} + \frac{\phi(t,x,y) + \mathcal{E}^h(t,x,y,\phi) - S_h\big[\phi+\mathcal{E}^h(\cdot,\phi)\big](t,x,y)}{h}, \]
it follows that (2.15) holds true.

Proposition 2.13. Let Assumptions F, H and M hold true, and let the terminal condition function $\Phi$ be $L^\infty$-bounded. Then $(v^h)_h$ is $L^\infty$-bounded, uniformly in $h$ for $h$ small enough.

Proof. Suppose that $|v^h(t_{n+1},\cdot)|_0 \le C_{n+1}$. Then from Lemma 3.14 of [13], there exists a constant $C$ independent of $h$ such that
\[ \big| v^h(t_{n+\frac12},\cdot) \big|_0 \le C_{n+1}(1+hC) + hC. \]
It follows from (2.10) that when $h < C^{-1}$,
\[ |v^h(t_n,\cdot)|_0 \le (1+hC)\big( C_{n+1}(1+hC) + hC \big) + hC \le (1+3hC)\,C_{n+1} + 3hC. \]
Therefore, $|v^h(t_n,\cdot)|_0 \le C' e^{C'T}$ for some constant $C'$ (independent of $h$) by the discrete Gronwall inequality.
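For completeness, the discrete Gronwall step at the end of this proof can be written out as follows. This is a sketch of one standard way to conclude, using only the recursion obtained above with $C_N = |\Phi|_0$; the shift by $1$ is a convenience chosen here.

```latex
% Recursion from the proof: C_n <= (1 + 3hC) C_{n+1} + 3hC, with C_N = |\Phi|_0.
\begin{align*}
  C_n + 1 &\le (1 + 3hC)\,(C_{n+1} + 1)
           && \text{(add 1 to both sides)} \\
          &\le (1 + 3hC)^{N-n}\,(C_N + 1)
           && \text{(iterate from } n \text{ to } N\text{)} \\
          &\le e^{3C(T - t_n)}\,\big(|\Phi|_0 + 1\big)
           && \text{(use } 1 + x \le e^x \text{ and } (N-n)h = T - t_n\text{)}.
\end{align*}
% Hence |v^h(t_n, .)|_0 <= C' e^{C' T} with C' := max(3C, |\Phi|_0 + 1).
```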

We have shown above the monotonicity, consistency and stability of the scheme $S_h\circ T_h$; it remains to confirm (2.12). In fact, we will provide a slightly stronger property of $(v^h)_{h>0}$, which implies that
\[ \lim_{(t',x',y',h)\to(T,x,y,0)} v^h(t',x',y') = \Phi(x,y). \]

Proposition 2.14. Let Assumptions F, H and M hold true, and let $\Phi$ be Lipschitz and uniformly bounded. Then $(v^h)_h$ is Lipschitz in $(x,y)$, uniformly in $h$.

Proof. To prove that $v^h$ is Lipschitz in $(x,y)$, we shall use the discrete Gronwall inequality as in the proof of Lemma 3.16 of [13].

Suppose that $v^h(t_{n+1},\cdot)$ is Lipschitz with coefficient $L_{n+1}$. Then by the proof of Lemma 3.16 of [13], the function $v^h(t_{n+\frac12},\cdot) = T_h[v^h](t_n,\cdot)$ is Lipschitz in $x$ with coefficient $L_{n+1}\big((1+Ch)^{1/2}+Ch\big)+Ch$; moreover, $v^h(t_{n+\frac12},\cdot)$ is Lipschitz in $y$ with coefficient $L_{n+1}(1+Ch)$ by Lemma 3.14 of [13]. It follows that $v^h(t_{n+\frac12},\cdot)$ is Lipschitz in $(x,y)$ with coefficient $L_{n+\frac12} \le L_{n+1}\big((1+Ch)^{1/2}+Ch\big)+Ch$.

Next, we can easily verify by (2.10) that $v^h(t_n,\cdot)$ is Lipschitz in $(x,y)$ with coefficient $L_n \le L_{n+\frac12}(1+Ch)+Ch$. Therefore, the proof is concluded by the discrete Gronwall inequality.

We can also prove that $v^h$ is $1/2$-Hölder in $t$, as was done in Lemma 3.17 of [13] for their numerical solution. However, to avoid the heavy calculation in their proof, we shall give a weaker result which is enough to guarantee the condition (2.12).

Proposition 2.15. Let Assumptions F, H and M hold true, and let $\Phi$ be Lipschitz and uniformly bounded. Then $|v^h(t_n,x,y)-\Phi(x,y)| \le C\sqrt{T-t_n}$.

Proof. We first introduce $\bar v^h$ as the numerical solution of (2.4) computed by the scheme $T_h$, i.e. $\bar v^h(T,\cdot) := \Phi(\cdot)$ and $\bar v^h(t_n,\cdot) := T_h[\bar v^h](t_n,\cdot)$. Clearly, by Lemmas 3.14 and 3.17 of [13], $(\bar v^h)_{h>0}$ is uniformly bounded and satisfies
\[ |\bar v^h(t_n,\cdot)-\Phi(\cdot)| \le C(T-t_n)^{1/2}, \quad\text{uniformly in } h. \tag{2.18} \]
We claim that
\[ |\bar v^h(t_n,x,y)-v^h(t_n,x,y)| \le C(T-t_n). \tag{2.19} \]
Then by (2.18), we conclude the proof. Thus it is enough to prove the claim (2.19).

We first recall that by Assumption F and (2.6), for a constant $c\in\mathbb{R}$, we have $T_h[v^h+c](t,x,y) \le T_h[v^h](t,x,y) + c + hF_r|c|$. Suppose that, for $L$ large enough,
\[ |\bar v^h(t_{n+1},x,y)-v^h(t_{n+1},x,y)| \le L(T-t_{n+1}). \]
It follows by the monotonicity of $T_h$ and the uniform boundedness of $v^h$ and $\bar v^h$ that
\[ |\bar v^h(t_n,x,y)-v^h(t_{n+\frac12},x,y)| \le L(T-t_{n+1}) + Ch. \]
And hence by (2.10),
\[ |\bar v^h(t_n,x,y)-v^h(t_n,x,y)| \le L(T-t_{n+1}) + 2Ch \le L(T-t_n), \]
which confirms (2.19).

We remark finally that with Propositions 2.10, 2.12, 2.13, 2.14 and 2.15 together with Theorem 2.1 of Barles and Souganidis [3], Theorem 2.6 holds true.

2.5 Proof for rate of convergence

As in [13], our arguments to prove the rate of convergence in Theorem 2.7 are based on Krylov’s shaking coefficient method, and our analysis stays in the context of Barles and Jakobsen [2]. We first derive some technical Lemmas similar to that in [13].

Lemma 2.16. Let Assumptions F, H and M hold true and $h\le 1$. Define $\lambda_1 := |F_r|_\infty$, $\lambda_2 := \sup_{\alpha,\beta}|c^{\alpha,\beta}|_0$ and $\lambda := \lambda_1+\lambda_2+\lambda_1\lambda_2$. Then, for every $(a,b,c)\in\mathbb{R}^3_+$ and all bounded functions $\phi\le\psi$ defined on $Q_T$, with the function $\delta(t) := e^{\lambda(T-t)}\big(a+b(T-t)\big)+c$, we have
\[ S_h\circ T_h[\phi+\delta](t,x,y) \le S_h\circ T_h[\psi](t,x,y) + \delta(t) - h(b-\lambda c), \quad \forall\, t\le T-h \text{ and } x\in\mathbb{R}^d. \]

Proof. First, from the proof of Lemma 3.21 in [13], we have
\[ T_h[\phi+\delta](t,x,y) \le T_h[\phi](t,x,y) + (1+h\lambda_1)\,\delta(t+h). \]
It follows by the definition of the splitting scheme $S_h\circ T_h$ in (2.10) that
\[ S_h\circ T_h[\phi+\delta](t,x,y) \le S_h\circ T_h[\phi](t,x,y) + (1+h\lambda_1)(1+h\lambda_2)\,\delta(t+h). \]
By the monotonicity of the splitting scheme $S_h\circ T_h$, we get
\[ S_h\circ T_h[\phi+\delta](t,x,y) \le S_h\circ T_h[\psi](t,x,y) + \delta(t) + \zeta(t), \quad\text{where } \zeta(t) := (1+h\lambda)\,\delta(t+h) - \delta(t). \]
Finally, using exactly the same arguments as in the proof of Lemma 3.5 of [13], it follows that $\zeta(t) \le -h(b-\lambda c)$, which concludes the proof.
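The two elementary inequalities used in this proof can be spelled out as follows. This is a sketch of the computation, not a quotation of the argument in [13]; it only uses $h\le 1$, $a,b,c\ge 0$ and the definitions of $\lambda$ and $\delta$.

```latex
% Step 1: since h <= 1,
%   (1 + h\lambda_1)(1 + h\lambda_2) = 1 + h(\lambda_1 + \lambda_2) + h^2 \lambda_1 \lambda_2 <= 1 + h\lambda.
% Step 2: bound \zeta(t) = (1 + h\lambda)\,\delta(t+h) - \delta(t), using 1 + h\lambda <= e^{\lambda h}:
\begin{align*}
  (1 + h\lambda)\,\delta(t+h)
    &= (1 + h\lambda)\, e^{\lambda(T-t-h)}\big(a + b(T-t) - bh\big) + (1 + h\lambda)\,c \\
    &\le e^{\lambda(T-t)}\big(a + b(T-t)\big) - bh\, e^{\lambda(T-t)} + c + h\lambda c , \\
  \zeta(t)
    &\le -\,bh\, e^{\lambda(T-t)} + h\lambda c
     \;\le\; -\,h\,(b - \lambda c),
\end{align*}
% where the last inequality uses e^{\lambda(T-t)} >= 1 and b >= 0,
% and the first bound uses a + b(T-t-h) >= 0 for t <= T - h.
```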

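Proposition 2.17. Let Assumptions F, H and M hold true, let $h\le 1$, and let $\phi$, $\psi$ be two bounded functions defined on $Q_T$ satisfying
\[ \frac1h\big( \phi - S_h\circ T_h[\phi] \big) \le g_1 \quad\text{and}\quad \frac1h\big( \psi - S_h\circ T_h[\psi] \big) \ge g_2 \quad\text{on } Q_T \]
for some bounded functions $g_1$ and $g_2$. Then for every $n = 0,\dots,N$,
\[ (\phi-\psi)(t_n,x,y) \le e^{\lambda(T-t_n)}\,\big|(\phi-\psi)^+(T,\cdot)\big|_0 + (T-h)\,e^{\lambda(T-t_n)}\,\big|(g_1-g_2)^+\big|_0, \]
with some constant $\lambda \ge |F_r|_\infty + \sup_{\alpha,\beta}|c^{\alpha,\beta}|_0 + |F_r|_\infty\,\sup_{\alpha,\beta}|c^{\alpha,\beta}|_0$.

Proof. With Lemma 2.16, the proof is exactly the same as that of Proposition 3.20 of [13]. Note that we replace $\beta$ by $\lambda$ in our proposition.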

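Now, we are ready to give the proof of Theorem 2.7.

Proof of Theorem 2.7 (i). First, under Assumption HJB, we can rewrite the original PDE (2.2) as a standard HJB equation
\[ -\partial_t v - \inf_{\alpha\in\mathcal{A},\,\gamma\in\mathcal{C}} \Big\{ (l^\alpha+l^\gamma) + (c^\alpha+c^\gamma)\,v + (f^\alpha+f^\gamma)\cdot D_xv + g^\alpha\cdot D_yv + \tfrac12\big(\sigma^\gamma\sigma^{\gamma T}\big)\cdot D^2_{xx}v \Big\} = 0. \]
With Assumption HJB and the Lipschitz terminal condition, it satisfies a comparison result and admits a unique viscosity solution in $C^{0,1}(Q_T)$ (see e.g. Proposition 2.1 of [2]). Then by the shaking coefficients method, we can construct a bounded subsolution $v^\varepsilon\in C^{0,1}(Q_T)$ such that
\[ v - \varepsilon \le v^\varepsilon \le v. \]
Let $\rho\in C_c^\infty(Q_T)$ be a positive function supported in $\{(t,x,y) : t\in[0,1],\ |x|\le 1,\ |y|\le 1\}$ with unit mass, and define
\[ w^\varepsilon(t,x,y) := v^\varepsilon * \rho^\varepsilon, \quad\text{where } \rho^\varepsilon(t,x,y) := \frac{1}{\varepsilon^{d+d'+2}}\,\rho\Big( \frac{t}{\varepsilon^2}, \frac{x}{\varepsilon}, \frac{y}{\varepsilon} \Big). \]
Then $w^\varepsilon$ is a smooth subsolution of (2.2) and satisfies $|w^\varepsilon - v| \le 2\varepsilon$. Moreover, since $v^\varepsilon\in C^{0,1}(Q_T)$ is uniformly Lipschitz in $(x,y)$ and $1/2$-Hölder in $t$, it follows that $w^\varepsilon\in C^\infty$, and
\[ \big| \partial_t^{\eta_0} D_x^{\eta_1} D_y^{\eta_2} w^\varepsilon \big| \le C\varepsilon^{1-2\eta_0-|\eta_1|-|\eta_2|}, \quad \forall\,(\eta_0,\eta_1,\eta_2)\in\mathbb{N}^{1+d+d'}\setminus\{0\}. \tag{2.20} \]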


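Now, let us consider the consistency error function $\Lambda^{w^\varepsilon}_h(t,x,y)$ defined in (2.13). By Proposition 2.12 and (2.20), there exists a constant $C$ independent of $\varepsilon$ and of $h$ for $0\le h\le 1$ such that
\[ \big| \Lambda^{w^\varepsilon}_h \big|_0 \le R(h,\varepsilon) := Ch\varepsilon^{-3}. \tag{2.21} \]
Moreover, since $w^\varepsilon$ is a subsolution of equation (2.2), it follows by the definition of $\Lambda^{w^\varepsilon}_h$ in (2.13) that
\[ w^\varepsilon \le S_h\circ T_h[w^\varepsilon] + Ch^2\varepsilon^{-3}. \]
Finally, by Proposition 2.17, we get
\[ w^\varepsilon - v^h \le C(\varepsilon + h\varepsilon^{-3}), \quad\text{and}\quad v - v^h = v - w^\varepsilon + w^\varepsilon - v^h \le C(\varepsilon + h\varepsilon^{-3}), \]
and it follows by minimizing over $\varepsilon$ that
\[ v - v^h \le C\inf_{\varepsilon>0}\big( \varepsilon + h\varepsilon^{-3} \big) \le C'h^{1/4}. \tag{2.22} \]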

Proof of Theorem 2.7 (ii). Under Assumption HJB+, we can apply the switching system method of Barles and Jakobsen [2], which constructs a smooth supersolution close to the viscosity solution of the PDE (2.2) and provides the lower bound
\[ v - v^h \ge -\inf_{\varepsilon>0}\big( C\varepsilon^{1/3} + R(h,\varepsilon) \big) = -C'h^{1/10}, \tag{2.23} \]
where $R(h,\varepsilon)$ is defined in (2.21).
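The exponents $1/4$ and $1/10$ come from balancing the two terms in the infima of (2.22) and (2.23). The following sketch evaluates the closed-form minimizers with all constants $C$ set to $1$ (an assumption of this illustration, which only affects the prefactor and not the exponent); the function names are hypothetical.

```python
def upper_rate_bound(h):
    """Minimize eps + h * eps**-3 over eps > 0.

    Setting the derivative 1 - 3 h eps**-4 to zero gives eps = (3h)**(1/4).
    Returns (minimizer, minimum value).
    """
    eps = (3.0 * h) ** 0.25
    return eps, eps + h * eps ** -3


def lower_rate_bound(h):
    """Minimize eps**(1/3) + h * eps**-3 over eps > 0.

    Setting the derivative (1/3) eps**(-2/3) - 3 h eps**-4 to zero gives
    eps**(10/3) = 9h, i.e. eps = (9h)**(3/10).
    Returns (minimizer, minimum value).
    """
    eps = (9.0 * h) ** 0.3
    return eps, eps ** (1.0 / 3.0) + h * eps ** -3
```

Dividing $h$ by $16$ halves the first minimum, and dividing $h$ by $1024$ halves the second, which is exactly the scaling $h^{1/4}$ and $h^{1/10}$.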

3 Basis projection and simulation-regression method

To get an implementable scheme, we need to specify how to compute the expectations
\[ \Big( \mathbb{E}\big[ \phi(t_{n+1},\hat X_h^{t_n,x},y)\,H_i^{t_n,x,h}(\Delta W_{n+1}) \big] \Big)_{i=0,1,2} \]
in the splitting scheme $S_h\circ T_h$. When analytic closed formulas are not available in concrete examples, we usually use a Monte-Carlo simulation-regression method to estimate them. Some estimators were discussed in recent works, e.g. Malliavin estimators [5], function basis regression [14], the cubature method [10], etc.

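All of these methods need simulations of $X$. Given a discrete time grid $(t_n)_{0\le n\le N}$, where $t_n := nh$ and $h := T/N$, we define an Euler approximation $\hat X$ of $X$ by
\[ \hat X_{t_{n+1}} := \hat X_{t_n} + \mu(t_n,\hat X_{t_n})\,h + \sigma(t_n,\hat X_{t_n})\,\Delta W_{n+1}, \tag{3.1} \]
where $\Delta W_{n+1} := W_{t_{n+1}} - W_{t_n}$. Then with simulations of the process $\hat X$ as well as $W$, one can estimate the conditional expectations
\[ \Big( \mathbb{E}\big[ \phi(t_{n+1},\hat X_{t_{n+1}},y)\,H_i^{t_n,\hat X_{t_n},h}(\Delta W_{n+1}) \,\big|\, \hat X_{t_n} \big] \Big)_{i=0,1,2}. \]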
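A minimal sketch of the Euler simulation (3.1) for $d = 1$ is given below. It uses only the Python standard library; the function name `simulate_euler_paths` and its signature are hypothetical, and the Brownian increments are stored alongside the paths because the Hermite weights $H_i$ need them.

```python
import math
import random


def simulate_euler_paths(x0, mu, sigma, T, n_steps, n_paths, seed=0):
    """Simulate n_paths Euler paths of (3.1) on the grid t_n = n * T / n_steps.

    mu and sigma are callables (t, x) -> float. Returns (paths, increments),
    where paths[m][n] is X_hat at time t_n and increments[m][n] is the
    Brownian increment used between t_n and t_{n+1} on path m.
    """
    rng = random.Random(seed)
    h = T / n_steps
    paths, increments = [], []
    for _ in range(n_paths):
        xs, dws = [x0], []
        for n in range(n_steps):
            t = n * h
            dw = rng.gauss(0.0, math.sqrt(h))  # increment with variance h
            xs.append(xs[-1] + mu(t, xs[-1]) * h + sigma(t, xs[-1]) * dw)
            dws.append(dw)
        paths.append(xs)
        increments.append(dws)
    return paths, increments
```

With $\sigma \equiv 0$ the recursion reduces to the explicit Euler scheme for an ordinary differential equation, which gives an easy deterministic check of the indexing.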

However, these methods are usually discussed in a non-degenerate context; in other words, they can be used for a given fixed $y$, which is not appropriate for the implementation of our splitting scheme $S_h\circ T_h$.

One solution is to discretize the space of $Y$ into a discrete grid $(y_i)_{i\in I}$. For each fixed $y_i$, we simulate the diffusion process $X$ and get estimates of the conditional expectations for all $x$; we then use an interpolation method to get estimates of these expectations for all $x$ and $y$. This is a combination of the finite difference method and the Monte-Carlo method, which may lose the advantages of the Monte-Carlo method in high dimensional cases.

Therefore, we propose to simulate the diffusion process $X$ with the Euler scheme and to simulate $Y$ with a continuous probability distribution (e.g. normal distribution, uniform distribution, etc.) independent of $X$. We then use a regression method, as in Longstaff and Schwartz [15] in the American option pricing context or Gobet, Lemor and Warin [14] in the BSDE context, to estimate the conditional expectations
\[ \Big( \mathbb{E}\big[ \phi(t_{n+1},\hat X_{t_{n+1}},Y)\,H_i^{t_n,\hat X_{t_n},h}(\Delta W_{n+1}) \,\big|\, \hat X_{t_n}, Y \big] \Big)_{i=0,1,2}, \tag{3.2} \]
with which we shall make the splitting scheme $S_h\circ T_h$ implementable.

Remark 3.1. (i) The distribution of $Y$ may be chosen arbitrarily according to the concrete context.

(ii) In practice, if we choose local hypercubes or local polynomials as functions basis for the regression method, we still need to discretize the space. However, as discussed in the introduction, this discretization can be coarse in practice, which permits to keep the advantage of the simulation-regression method in high-dimensional cases (see also the numerical examples in Section 4).

In the following, we first give a basis projection scheme as well as a simulation-regression method to estimate the regression coefficients. Then we discuss the convergence of the Monte-Carlo errors in our context.

3.1 Basis projection scheme and simulation-regression method

3.1.1 The basis projection scheme

To compute the conditional expectations (3.2), we first project them on a functional space spanned by the basis functions(ek(x, y))1≤k≤K, whereK∈N∪ {+∞}. We recall thatH₂^t,x,his a matrix of dimensiond×d,H₁^t,x,his a vector of dimensiondandH₀^t,x,h = 1. In order to simplify the presentation, we shall suppose thatd=d⁰= 1. All of the results can be easily extended to the cased >1and/ord⁰ >1. Let

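$$\tilde\lambda^i := \arg\min_{\lambda}\; \mathbb{E}\Big[\Big(\varphi\big(t_{n+1}, \hat X_{t_{n+1}}, Y\big)\, H_i^{t_n, \hat X_{t_n}, h}(\Delta W_{n+1}) - \sum_{k=1}^{K} \lambda_k\, e_k(\hat X_{t_n}, Y)\Big)^2\Big], \tag{3.3}$$

then the projected approximation of (3.2) is denoted by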

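$$\tilde{\mathbb{E}}\Big[\varphi\big(t_{n+1}, \hat X_{t_{n+1}}, Y\big)\, H_i^{t_n, \hat X_{t_n}, h}(\Delta W_{n+1}) \,\Big|\, \hat X_{t_n}, Y\Big] := \sum_{k=1}^{K} \tilde\lambda^i_k\, e_k(\hat X_{t_n}, Y). \tag{3.4}$$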

Remark 3.2. There are several choices for the function basis $(e_k(x, y))_{1 \le k \le K}$, for example global polynomials, local hypercubes or local polynomials. We refer to Bouchard and Warin [6] for some interesting discussions.

We replace the conditional expectations (3.2) in the scheme $S_h \circ T_h$ by their projected approximations (3.4), and denote the new splitting scheme by $S_h \circ \tilde T_h$. Concretely, it is defined as follows:

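$$\tilde T_h[\tilde v^h](t_n, x, y) := \tilde{\mathbb{E}}\big[\tilde v^h(t_{n+1}, \hat X_h^{t_n, x}, y)\big] + h\, F\big(\cdot, \tilde{\mathbb{E}} D_h \tilde v^h(\cdot)\big)(t_n, x, y),$$

where

$$\tilde{\mathbb{E}} D_h \varphi(t_n, x, y) = \Big(\tilde{\mathbb{E}}\big[\varphi(t_{n+1}, \hat X_h^{t_n, x}, y)\, H_i^{t_n, x, h}(\Delta W_{n+1})\big] : i = 0, 1, 2\Big),$$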


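and

$$\tilde v^h(t_n, x, y) = S_h \circ \tilde T_h[\tilde v^h](t_n, x, y) := \inf_{\alpha} \sup_{\beta} \Big\{ h\, l^{\alpha,\beta}(t_n, x, y) + h\, c^{\alpha,\beta}(t_n, x, y)\, \tilde T_h[\tilde v^h](t_n, x, y) + \tilde T_h[\tilde v^h]\big(t_n,\; x + f^{\alpha,\beta}(t_n, x, y)\, h,\; y + g^{\alpha,\beta}(t_n, x, y)\, h\big) \Big\}. \tag{3.5}$$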
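To make the structure of (3.5) concrete, here is a small Python sketch of the outer operator $S_h$ at a single point, with the time argument $t_n$ suppressed. It assumes that the control sets have been replaced by finite grids `alphas` and `betas`, and that $\tilde T_h[\tilde v^h](t_n, \cdot, \cdot)$ is available as a callable `T`; the function name `apply_S` and the toy coefficients are hypothetical.

```python
import math

def apply_S(T, x, y, h, alphas, betas, l, c, f, g):
    """inf over alpha of sup over beta of the bracket in (3.5), on finite control grids."""
    base = T(x, y)                                   # T[v](t_n, x, y)
    best = math.inf
    for a in alphas:
        worst = -math.inf
        for b in betas:
            # T[v] evaluated at the point shifted by the first-order drifts f and g
            shifted = T(x + f(a, b, x, y) * h, y + g(a, b, x, y) * h)
            value = h * l(a, b, x, y) + h * c(a, b, x, y) * base + shifted
            worst = max(worst, value)
        best = min(best, worst)
    return best

# Toy data (hypothetical): T(x, y) = x + y, l = c = 0, drifts f = alpha and g = beta.
value = apply_S(
    T=lambda x, y: x + y, x=0.0, y=0.0, h=0.1,
    alphas=[-1.0, 1.0], betas=[-1.0, 1.0],
    l=lambda a, b, x, y: 0.0, c=lambda a, b, x, y: 0.0,
    f=lambda a, b, x, y: a, g=lambda a, b, x, y: b,
)
```

In an actual implementation `T` would interpolate or evaluate the regression estimate at the shifted point, which is why the projected approximation must be defined for all $(x, y)$ and not only at the simulated samples.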

3.1.2 Simulation-regression method

Next, we propose to use a simulation-regression method to approximate $\tilde\lambda$. We still suppose that $d = d' = 1$ for simplicity.

Let $\big((\hat X^m_{t_n})_{0 \le n \le N}, (\Delta W^m_n)_{0 < n \le N}, Y^m\big)_{1 \le m \le M}$ be $M$ independent simulations of $\hat X$, $\Delta W$ and $Y$, where $\hat X$ is defined in (3.1). The regression method with function basis $(e_k(x, y))_{1 \le k \le K}$ consists in solving the least-squares problem:

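$$\hat\lambda^{i,M} = \arg\min_{\lambda}\; \sum_{m=1}^{M} \Big(\varphi\big(t_{n+1}, \hat X^m_{t_{n+1}}, Y^m\big)\, H_i^{t_n, \hat X^m_{t_n}, h}(\Delta W^m_{n+1}) - \sum_{k=1}^{K} \lambda_k\, e_k(\hat X^m_{t_n}, Y^m)\Big)^2. \tag{3.6}$$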
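The least-squares problem (3.6) is an ordinary linear regression on the design matrix with entries $e_k(\hat X^m_{t_n}, Y^m)$. The following sketch solves it with NumPy for $i = 0$ (so $H_0 = 1$); the degree-one polynomial basis, the sampling laws and the driftless Euler step with constant `sigma` are assumptions chosen so that the exact answer is known.

```python
import numpy as np

def regression_coefficients(basis, x_n, y, regressand):
    """Solve the least-squares problem (3.6) for one index i."""
    design = np.column_stack([e(x_n, y) for e in basis])   # M x K matrix of e_k(X^m, Y^m)
    lam, *_ = np.linalg.lstsq(design, regressand, rcond=None)
    return lam

# Hypothetical global polynomial basis (1, x, y).
basis = [lambda x, y: np.ones_like(x), lambda x, y: x, lambda x, y: y]

rng = np.random.default_rng(0)
M, h, sigma = 20000, 0.1, 0.2
x_n = rng.normal(1.0, 0.3, size=M)
y = rng.uniform(0.0, 2.0, size=M)
dw = rng.normal(0.0, np.sqrt(h), size=M)
x_next = x_n + sigma * dw                                   # assumed driftless Euler step
lam0 = regression_coefficients(basis, x_n, y, x_next + y)   # phi(x, y) = x + y, H_0 = 1
```

For this choice the conditional expectation is $\hat X_{t_n} + Y$, so `lam0` should be close to $(0, 1, 1)$, with a Monte-Carlo error that decreases as $M$ grows.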

A raw regression estimation of the conditional expectations (3.2) from these $M$ samples is given by

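$$\bar{\mathbb{E}}^M\Big[\varphi\big(t_{n+1}, \hat X_{t_{n+1}}, Y\big)\, H_i^{t_n, \hat X_{t_n}, h}(\Delta W_{n+1}) \,\Big|\, \hat X_{t_n}, Y\Big] := \sum_{k=1}^{K} \hat\lambda^{i,M}_k\, e_k(\hat X_{t_n}, Y), \qquad i = 0, 1, 2. \tag{3.7}$$

Then with a priori upper bounds $\overline{\Gamma}_i(\hat X_{t_n}, Y)$ and lower bounds $\underline{\Gamma}_i(\hat X_{t_n}, Y)$, we define the regression estimation of (3.2):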

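$$\hat{\mathbb{E}}^M\Big[\varphi\big(t_{n+1}, \hat X_{t_{n+1}}, Y\big)\, H_i^{t_n, \hat X_{t_n}, h}(\Delta W_{n+1}) \,\Big|\, \hat X_{t_n}, Y\Big] := \underline{\Gamma}_i(\hat X_{t_n}, Y) \,\vee\, \bar{\mathbb{E}}^M\Big[\varphi\big(t_{n+1}, \hat X_{t_{n+1}}, Y\big)\, H_i^{t_n, \hat X_{t_n}, h}(\Delta W_{n+1}) \,\Big|\, \hat X_{t_n}, Y\Big] \,\wedge\, \overline{\Gamma}_i(\hat X_{t_n}, Y). \tag{3.8}$$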
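The truncation in (3.8) is a pointwise clipping of the raw regression estimate between the lower and upper bounds. A minimal sketch, in which the bound values are hypothetical and the lower bound is taken as the negative of the upper one:

```python
import numpy as np

def truncate(raw_estimate, lower, upper):
    """Apply (3.8) pointwise: lower v raw ^ upper."""
    return np.minimum(np.maximum(raw_estimate, lower), upper)

raw = np.array([-3.0, 0.5, 7.0])        # raw estimates from (3.7) at three points
bound = np.array([2.0, 2.0, 2.0])       # hypothetical a priori bound at the same points
clipped = truncate(raw, -bound, bound)
```

Only estimates that violate the a priori bounds are modified, so the truncation never increases the error of an estimate whose target lies within the bounds.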

Remark 3.3. As observed in Bouchard and Touzi [5], the truncation method is an important technique to obtain an $L^p$-convergence. By Lemma (2.15), we can choose $\overline{\Gamma}_0(x, y) = \Gamma_0(x, y)$ and $\underline{\Gamma}_0(x, y) = -\Gamma_0(x, y)$ with a function $\Gamma_0$ satisfying

$$\Gamma_0(x, y) \le \Phi(x, y) + C\sqrt{T - t_n} \quad \text{for some constant } C. \tag{3.9}$$

Remark 3.4. In Gobet et al. [14], the authors propose the following minimization problem in place of (3.6):

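$$\min_{\lambda^0, \lambda^1}\; \sum_{m=1}^{M} \Big(\varphi\big(t_{n+1}, \hat X^m_{t_{n+1}}, Y^m\big) - \sum_{k=1}^{K} \lambda^0_k\, e_k(\hat X^m_{t_n}, Y^m) - \sum_{k=1}^{K} \lambda^1_k\, e_k(\hat X^m_{t_n}, Y^m)\, \Delta W^m_{n+1}\Big)^2,$$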

which also gives a good estimation of $\tilde\lambda^i$, since $\Delta W_{n+1}$ is independent of the $\sigma$-field generated by $Y, W_0, \Delta W_1, \cdots, \Delta W_n$.
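The joint minimization of Remark 3.4 can be sketched as a single least-squares problem whose design matrix stacks the columns $e_k$ and $e_k\, \Delta W_{n+1}$. The basis $(1, x, y)$, the sampling laws and the driftless Euler step below are assumptions chosen so that the target is exactly linear in the regressors; the coefficient on $\Delta W$ relates to a regression on $\varphi H_1$ only up to the normalization of $H_1$, which is not specified here.

```python
import numpy as np

def joint_regression(design, dw, phi_next):
    """Minimize sum_m (phi^m - sum_k lam0_k e_k^m - sum_k lam1_k e_k^m dW^m)^2 over (lam0, lam1)."""
    K = design.shape[1]
    stacked = np.hstack([design, design * dw[:, None]])   # columns e_k, then e_k * dW
    coef, *_ = np.linalg.lstsq(stacked, phi_next, rcond=None)
    return coef[:K], coef[K:]

rng = np.random.default_rng(1)
M, h, sigma = 5000, 0.1, 0.2
x_n = rng.normal(1.0, 0.3, size=M)
y = rng.uniform(0.0, 2.0, size=M)
dw = rng.normal(0.0, np.sqrt(h), size=M)
design = np.column_stack([np.ones(M), x_n, y])            # hypothetical basis (1, x, y)
phi_next = (x_n + sigma * dw) + y                         # phi(x, y) = x + y after the Euler step
lam0, lam1 = joint_regression(design, dw, phi_next)
```

In this toy case the fit is exact, giving $\lambda^0 = (0, 1, 1)$ and $\lambda^1 = (\sigma, 0, 0)$ up to rounding. Removing the $\Delta W$ component from the residual is what reduces the variance of the estimate of $\lambda^0$ compared with regressing $\varphi$ alone.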

We replace the conditional expectations (3.2) in the scheme $S_h \circ T_h$ by their regression estimations (3.8) and denote the new numerical splitting scheme by $S_h \circ \hat T^M_h$, which is

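$$\hat T^M_h[\hat v^h](t_n, x, y) := \hat{\mathbb{E}}^M\big[\hat v^h(t_{n+1}, \hat X_h^{t_n, x}, y)\big] + h\, F\big(\cdot, \hat{\mathbb{E}}^M D_h \hat v^h(\cdot)\big)(t_n, x, y),$$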


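and

$$\hat{\mathbb{E}}^M D_h \varphi(t_n, x, y) = \Big(\hat{\mathbb{E}}^M\big[\varphi(t_{n+1}, \hat X_h^{t_n, x}, y)\, H_i^{t_n, x, h}(\Delta W_{n+1})\big] : i = 0, 1, 2\Big),$$

so that $S_h \circ \hat T^M_h$ is defined by

$$\hat v^h(t_n, x, y) = S_h \circ \hat T^M_h[\hat v^h](t_n, x, y) := \inf_{\alpha \in A} \sup_{\beta \in B} \Big\{ h\, l^{\alpha,\beta}(t_n, x, y) + h\, c^{\alpha,\beta}(t_n, x, y)\, \hat T^M_h[\hat v^h](t_n, x, y) + \hat T^M_h[\hat v^h]\big(t_n,\; x + f^{\alpha,\beta}(t_n, x, y)\, h,\; y + g^{\alpha,\beta}(t_n, x, y)\, h\big) \Big\}. \tag{3.10}$$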

3.2 The convergence results of simulation-regression scheme

To get a convergence result for the schemes $S_h \circ \tilde T_h$ and $S_h \circ \hat T^M_h$, we can no longer use the same arguments as in Fahim et al. [13], since there is no uniform convergence property in $L^p$ for the Monte-Carlo error $(\hat{\mathbb{E}}^M - \mathbb{E})(R)$ as in Assumption E of [13]. To see this, consider the extreme case where the equation is totally degenerate (i.e. $d = 0$ and $d' > 0$): we then need to approximate an arbitrary bounded function in a functional space spanned by a finite number of basis functions, which does not give a uniform convergence.

Also, since we work in the viscosity solution framework of Barles and Souganidis [3], we cannot hope to obtain a probabilistic $L^2(\Omega)$-convergence as in Gobet et al. [14].

However, we can get a convergence result if we choose local hypercubes as function basis. Let us restrict the numerical resolution to $[0, T] \times D$ instead of $Q_T$, where $D \subset \mathbb{R}^{d+d'}$ is a bounded domain. Clearly, we need to assume that the boundary conditions on the domain $D^c := \mathbb{R}^{d+d'} \setminus D$ are available for the scheme $S_h \circ \hat T^M_h$.

Definition 3.5. Given a domain $D \subseteq \mathbb{R}^{d+d'}$, a class of hypercube sets $(B_k)_{1 \le k \le K}$ is called a partition of $D$ whenever $\cup_{k=1}^{K} B_k = D$ and $B_i \cap B_j = \emptyset$ for all $i \neq j$.

Remark 3.6. The simplest example of a partition of $D$ is the uniform partition: with uniform intervals $[x_k, x'_k)$ and $[y_k, y'_k)$, the sets $B_k$ are of the form $[x_k, x'_k) \times [y_k, y'_k)$. Recently, Bouchard and Warin [6] proposed a partition based on the simulations. They first sort all the simulations and then divide the space in a non-uniform way such that every hypercube $B_k$ contains the same number of simulation particles.
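Both kinds of partition can be sketched in a few lines for $d = d' = 1$. The helper names below are hypothetical, and `equal_count_edges` is only a one-dimensional analogue of the sample-based idea (cutting at empirical quantiles), not the algorithm of [6].

```python
import numpy as np

def uniform_edges(low, high, n_cells):
    """Interior edges of a uniform partition of [low, high) into n_cells intervals."""
    return np.linspace(low, high, n_cells + 1)[1:-1]

def equal_count_edges(samples, n_cells):
    """Interior edges at empirical quantiles, so each interval holds about M / n_cells samples."""
    return np.quantile(samples, np.arange(1, n_cells) / n_cells)

def cell_index(x, y, x_edges, y_edges):
    """Index k of the hypercube [x_k, x'_k) x [y_k, y'_k) containing each point (x, y)."""
    ix = np.digitize(x, x_edges)          # half-open intervals, left edge included
    iy = np.digitize(y, y_edges)
    return ix * (len(y_edges) + 1) + iy

rng = np.random.default_rng(2)
x = rng.normal(size=1000)
counts = np.bincount(np.digitize(x, equal_count_edges(x, 4)), minlength=4)
idx = cell_index(np.array([0.1, 0.9]), np.array([0.6, 0.2]),
                 uniform_edges(0.0, 1.0, 2), uniform_edges(0.0, 1.0, 2))
```

Note that `np.digitize` assigns points outside the outermost edges to the boundary cells, so samples leaving $D$ would have to be handled separately through the boundary conditions.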

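Remark 3.7. If we use the hypercube indicators $(\mathbf{1}_{B_k})_{1 \le k \le K}$ as basis functions in the projection (3.3), where $(B_k)_{1 \le k \le K}$ is a partition of $D \subseteq \mathbb{R}^{d+d'}$, then the projection approximation is equivalent to taking another conditional expectation with respect to the $\sigma$-field generated by $\big(\{(\hat X_{t_n}, Y) \in B_k\}\big)_{1 \le k \le K}$. In other words,

$$\tilde{\mathbb{E}}\Big[\varphi\big(t_{n+1}, \hat X_{t_{n+1}}, Y\big)\, H_i^{t_n, \hat X_{t_n}, h}(\Delta W_{n+1}) \,\Big|\, \hat X_{t_n}, Y\Big] = \sum_{k=1}^{K} \mathbb{E}\Big[\varphi\big(t_{n+1}, \hat X_{t_{n+1}}, Y\big)\, H_i^{t_n, \hat X_{t_n}, h}(\Delta W_{n+1}) \,\Big|\, (\hat X_{t_n}, Y) \in B_k\Big]\, \mathbf{1}_{B_k}(\hat X_{t_n}, Y). \tag{3.11}$$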
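The sample counterpart of (3.11) can be checked numerically: with the indicator basis, the least-squares solution of (3.6) coincides with the empirical mean of the regressand in each hypercube. The sketch below uses a uniform $4 \times 4$ partition of $[0, 1)^2$ and a synthetic regressand `r`, both of which are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
M, n_x, n_y = 2000, 4, 4
x = rng.uniform(0.0, 1.0, size=M)
y = rng.uniform(0.0, 1.0, size=M)
r = np.sin(3.0 * x) + y + rng.normal(0.0, 0.1, size=M)    # hypothetical regressand R_i

# Index k of the uniform hypercube B_k containing each sample.
k = np.minimum((x * n_x).astype(int), n_x - 1) * n_y + np.minimum((y * n_y).astype(int), n_y - 1)
K = n_x * n_y

# Least squares with the indicator basis (1_{B_k})_k ...
design = (k[:, None] == np.arange(K)[None, :]).astype(float)
lam_lstsq, *_ = np.linalg.lstsq(design, r, rcond=None)

# ... coincides with the empirical mean of the regressand in each cell.
lam_mean = np.bincount(k, weights=r, minlength=K) / np.bincount(k, minlength=K)
```

Because the columns of the design matrix have disjoint supports, the regression decouples across cells, so no linear system needs to be solved in practice; the identity requires every cell to contain at least one sample.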

Let us use $(e_k)_{1 \le k \le K} = (\mathbf{1}_{B_k})_{1 \le k \le K}$ as projection basis in (3.3) and (3.6), where $(B_k)_{1 \le k \le K}$ is a partition of $D$. Given a bounded function $\varphi$ on $D$, a process $\hat X$ and a random variable $Y$, we shall consider the random variables of the form

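$$R_i(\varphi) := \varphi\big(t_{n+1}, \hat X_{t_{n+1}}, Y\big)\, H_i^{t_n, \hat X_{t_n}, h}(\Delta W_{n+1}), \qquad i = 0, 1, 2, \tag{3.12}$$

and then give an estimation for the regression error $(\hat{\mathbb{E}}^M - \tilde{\mathbb{E}})\big[R_i(\varphi) \,\big|\, \hat X_{t_n} = x, Y = y\big]$.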