Duality on gradient estimates and Wasserstein controls

(1)

Kazumasa Kuwada



Ochanomizu University / Universität Bonn





(2)

§ 1 Introduction

(3)

Equivalent conditions for a lower Ricci curvature bound (von Renesse & Sturm ’05, etc...)

$X$: complete Riemannian manifold

$P_t$: heat semigroup associated with $\Delta$

(i) $\mathrm{Ric} \ge k$

(ii) $d_p^W(P_t^*\mu, P_t^*\nu) \le e^{-kt}\, d_p^W(\mu, \nu)$ for some $p \in [1, \infty]$

(iii) $|\nabla P_t f|(x) \le e^{-kt}\, P_t(|\nabla f|^q)(x)^{1/q}$ for some $q \in [1, \infty]$

(4)

Our goal:

Generalization of (ii) ⇔ (iii) for $p, q$ with $\frac{1}{p} + \frac{1}{q} = 1$.

(ii) $L^p$-Wasserstein control

$d_p^W(P_t^*\mu, P_t^*\nu) \le e^{-kt}\, d_p^W(\mu, \nu)$

(iii) $L^q$-gradient estimate

$|\nabla P_t f|(x) \le e^{-kt}\, P_t(|\nabla f|^q)(x)^{1/q}$

(5)

§ 2 Framework and main result

(6)

$(X, d)$: Polish metric space.

• $(P_x)_{x \in X} \subset \mathcal{P}(X)$: Markov kernel.

$P f(x) := \int_X f \, dP_x$, $\quad P^*\mu(A) := \int_X P_x(A)\, \mu(dx)$.

(e.g. $P = P_t$: heat semigroup)

• $\tilde d$: continuous distance function on $X$ (e.g. $\tilde d = e^{-kt} d$).

(7)

$\Pi(\mu, \nu)$: set of couplings for $\mu, \nu \in \mathcal{P}(X)$, i.e.

$\Pi(\mu, \nu) := \{\, \pi \mid \pi(A \times X) = \mu(A),\ \pi(X \times A) = \nu(A) \,\}$.

$L^p$-Wasserstein (pseudo) distance: for $p \in [1, \infty]$,

$d_p^W(\mu, \nu) := \inf_{\pi \in \Pi(\mu, \nu)} \|d\|_{L^p(\pi)} \in [0, \infty]$.
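As a concrete illustration of the L^p-Wasserstein distance, here is a minimal Python sketch for the special case X = R with d(x, y) = |x - y| and two empirical measures with the same number of equally weighted atoms. In that case the infimum over couplings is attained by the monotone (sorted) coupling. The function name `wasserstein_1d` and the restriction to equal-size samples are assumptions made for this example, not part of the slides.

```python
import math


def wasserstein_1d(xs, ys, p):
    """L^p-Wasserstein distance between two empirical measures on the real line.

    Assumes both measures have the same number of equally weighted atoms
    and that d(x, y) = |x - y|; p may be any value in [1, inf].
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty samples of equal size")
    # Monotone coupling: pair the i-th smallest atom of one sample
    # with the i-th smallest atom of the other.
    gaps = [abs(a - b) for a, b in zip(sorted(xs), sorted(ys))]
    if math.isinf(p):
        # L^infinity norm of the distance under the coupling.
        return max(gaps)
    return (sum(g ** p for g in gaps) / len(gaps)) ** (1.0 / p)


# Two-atom example: the distance is nondecreasing in p.
w1 = wasserstein_1d([0.0, 3.0], [1.0, 1.0], 1)
w_inf = wasserstein_1d([0.0, 3.0], [1.0, 1.0], math.inf)
```

In the two-atom example the sorted gaps are 1 and 2, so the distance increases from their average at p = 1 to their maximum at p = ∞.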

(8)

Gradient

$|\nabla_d f|(x) := \lim_{r \downarrow 0} \sup_{y \in B_r(x)} \left| \frac{f(x) - f(y)}{d(x, y)} \right|$

$\|\nabla_d f\|_\infty := \sup_{x \in X} |\nabla_d f|(x)$

$f$: Lipschitz ⇒ $|\nabla_d f|$ is an upper gradient of $f$, i.e.

$f(y) - f(x) \le \int_a^b |\nabla_d f|(\gamma(s))\, ds$

for every 1-Lipschitz curve $\gamma : [a, b] \to X$ from $x$ to $y$.

(10)

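$L^p$-Wasserstein control

$d_p^W(P^*\mu, P^*\nu) \le \tilde d_p^W(\mu, \nu)$ $\quad (W_p)$

for $p \in [1, \infty]$ and $\mu, \nu \in \mathcal{P}(X)$

$L^q$-gradient estimate

$|\nabla_{\tilde d} P f|(x) \le P(|\nabla_d f|^q)(x)^{1/q}$ $\quad (G_q)$

for $q \in [1, \infty)$ and $f \in C_b^{\mathrm{Lip}}(X)$,

$\|\nabla_{\tilde d} P f\|_\infty \le \|\nabla_d f\|_\infty$ $\quad (G_\infty)$

for $q = \infty$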

(11)

$v$: Radon measure on $X$ with $\mathrm{supp}(v) = X$.

Assumption 1. $(X, d)$: proper length space

Assumption 2. $(X, d, v)$ supports

• local (uniform) volume doubling condition

• $(1, \rho)$-local Poincaré inequality ($\exists \rho \ge 1$)

Assumption 3. $\tilde d$: geodesic distance

Assumption 4. $P_x \ll v$, and $x \mapsto \frac{dP_x}{dv}(y)$: continuous

(12)

Assumptions 1 and 2 enable us to employ the general theory of Hamilton-Jacobi semigroups.

(13)

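Local uniform volume doubling condition

$\exists D > 0$, $\exists R_1 > 0$ s.t. $\forall x \in X$, $\forall r < R_1$: $\ v(B_{2r}(x)) \le D\, v(B_r(x))$.

$(1, \rho)$-local Poincaré inequality

$\forall R_2 > 0$, $\exists \lambda \ge 1$, $\exists C_P > 0$ s.t. $\forall r < R_2$,

$\fint_{B_r(x)} |f - f_{x,r}|\, dv \le C_P\, r \left( \fint_{B_{\lambda r}(x)} g^\rho\, dv \right)^{1/\rho}$

for every $f$ and every upper gradient $g$ of $f$, where

$f_{x,r} := \frac{1}{v(B_r(x))} \int_{B_r(x)} f\, dv =: \fint_{B_r(x)} f\, dv$.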

(14)

Theorem (K. ’09)

For $p, q \in [1, \infty]$ with $\frac{1}{p} + \frac{1}{q} = 1$,

(i) $(W_p) \Rightarrow (G_q)$

(ii) Under Assumptions 1-4, $(G_q) \Rightarrow (W_p)$

(15)

Remarks

• For $p' > p$, $(G_p) \Rightarrow (G_{p'})$ and $(W_{p'}) \Rightarrow (W_p)$ (without Assumptions 1-4).

• $(G_\infty) \Leftrightarrow (W_1)$ is well known (via the Kantorovich-Rubinstein formula; without Assumptions 1-4).

• $(W_\infty) \Rightarrow (G_1)$ is essentially well known.

(16)

§ 3 Examples and applications

How do we obtain (W_p)?

(17)

(A) Two known derivations of $(W_p)$

(18)

(1) Coupling by parallel transport of B.m.’s

$X$: cpl. Riem. mfd., $\mathrm{Ric} \ge k$

$\exists (B_t, \tilde B_t)$: coupling of B.m.’s s.t.

$d(B_t, \tilde B_t) \le e^{-kt/2}\, d(B_0, \tilde B_0)$ $\ \mathbb{P}$-a.s.,

i.e. $\mathrm{Ric} \ge k \Rightarrow (W_\infty)$

(19)

Extension:

Backward (super) Ricci flow $\partial_t g(t) \le \mathrm{Ric}_{g(t)}$

For $g(t)$-B.m.’s (↔ generator $\partial_t + \frac{1}{2} \Delta_{g(\cdot)}$),

$d_{g(t)}(B_t, \tilde B_t) \le d_{g(s)}(B_s, \tilde B_s)$ $\ \mathbb{P}$-a.s.

◦ McCann & Topping ’08: $(W_2)$, $X$: cpt

◦ Arnaudon, Coulibaly & Thalmaier ’09: $(W_\infty)$ (cf. K. & Philipowski ’09 for non-explosion)

(20)

(2) Gradient flow formulation of the heat flow $\mu_t$

$X$: cpl. Riem. mfd.

“$\partial_t \mu_t = -\nabla E(\mu_t)$” ($E$: relative entropy)

• $\mathrm{Ric} \ge k$ ⇔ “$\mathrm{Hess}\, E \ge k$”,

• $\mathrm{Hess}\, E \ge k \Rightarrow (W_2)$ for $\mu_t$ ($= P_t^*\mu$)

(Heuristically, differential geometry on $\mathcal{P}(X)$)

⇝ Extension to singular spaces (e.g. Alexandrov spaces)

(21)

Remark

“(a lower Ricci bound) ⇒ $(W_p)$” in the literature.

No direct way “$(G_q) \Rightarrow (W_p)$” was known.

E.g. in von Renesse & Sturm ’05,

    Ric ≥ k
       ⇓  (coupling method)
    (W_∞)  ⇒  (W_p)  ⇒  (W_1)
       ⇓                   ⇓
    (G_1)  ⇒  (G_q)  ⇒  (G_∞)  ⇒  Ric ≥ k  (Bochner)

(22)

(B) Hörmander-type operators on a Lie group

(23)

$X$: Lie group

$\{X_i\}_{i=1}^n$: left-invariant, lin. indep. vector fields satisfying the Hörmander condition

$P_t := e^{tA}$, $\quad A := \sum_{i=1}^n X_i^2$,

$|\Gamma f| := \frac{1}{2}\left( A(f^2) - 2 f A f \right) = \sum_{i=1}^n |X_i f|^2$

$L^q$-gradient estimate

$|\Gamma P_t f|(x) \le K_q(t)\, P_t(|\Gamma f|^{q/2})(x)^{2/q}$ $\quad (G^*_q)$

(24)

Known results

• 3-dim. Heisenberg group, $K_q(t) \equiv K_q > 1$

◦ $q > 1$: Driver & Melcher ’05

◦ $q = 1$: H.-Q. Li ’06 / Bakry, Baudoin, Bonnefont & Chafaï ’08

• $X$: general, $q > 1$: Melcher ’08 ($K_q(t) \equiv K_q$ if $X$: nilpotent)

• $X$: group of type H, $q = 1$, $K_q(t) \equiv K_q$: Eldredge ’10

• $X = SU(2)$, $q > 1$, $K_q(t) = K_q e^{-t}$: Baudoin & Bonnefont ’09

(25)

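Carnot-Carathéodory distance

For $V \in T_x X$,

$|V| = \begin{cases} \left( \sum_{i=1}^n a_i^2 \right)^{1/2} & \text{if } V = \sum_{i=1}^n a_i X_i(x), \\ \infty & \text{otherwise.} \end{cases}$

$d(x, y) := \inf \left\{ \int_0^1 |\dot\gamma_s|\, ds \;\middle|\; \gamma_0 = x,\ \gamma_1 = y \right\}$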

(26)

$v$: right Haar measure, $P = P_t$.

Proposition

(i) $(X, d, v; P)$ satisfies Assumptions 1-4

(ii) $(G^*_q) \Rightarrow (G_q)$

Corollary

$(G^*_q) \Rightarrow (W_p)$ for $q \in [1, \infty]$.

(27)

Examples

(28)

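3-dim. Heisenberg group

$X = \mathbb{R}^3$, $v$: Lebesgue measure

$(x, y, z) \cdot (x', y', z') = \left( x + x',\ y + y',\ z + z' + \frac{1}{2}(x y' - y x') \right)$

$X_1 = \partial_x - \frac{y}{2} \partial_z$, $\quad X_2 = \partial_y + \frac{x}{2} \partial_z$

Associated diffusion $(B_t^1, B_t^2, B_t^3)$ from $(x, y, z)$:

$B_t^1 = W_t^1$, $\quad B_t^2 = W_t^2$,

$B_t^3 = z + \frac{1}{2} \int_0^t \left( W_s^1\, dW_s^2 - W_s^2\, dW_s^1 \right)$,

where $(W_t^1, W_t^2)$: 2-dim. BM from $(x, y)$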
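The group law of the 3-dim. Heisenberg group can be checked directly. The following is a minimal Python sketch; the names `heis_mul` and `heis_inv` are hypothetical helpers introduced only for this illustration. It encodes the product stated on this slide and exposes the two features that matter here: the product is associative, and it is not commutative (the z-component picks up a signed area term).

```python
def heis_mul(g, h):
    """Product (x, y, z) . (x', y', z') in the 3-dim. Heisenberg group."""
    x, y, z = g
    x2, y2, z2 = h
    return (x + x2, y + y2, z + z2 + 0.5 * (x * y2 - y * x2))


def heis_inv(g):
    """Inverse element: for this group law it is simply (-x, -y, -z)."""
    x, y, z = g
    return (-x, -y, -z)


# Sample elements with dyadic coordinates, so float arithmetic is exact.
a, b, c = (1.0, 2.0, 3.0), (-0.5, 4.0, 1.0), (2.0, -1.0, 0.5)

left = heis_mul(heis_mul(a, b), c)    # (a . b) . c
right = heis_mul(a, heis_mul(b, c))   # a . (b . c)

# The two orders of multiplying the horizontal generators differ in z.
xy = heis_mul((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
yx = heis_mul((0.0, 1.0, 0.0), (1.0, 0.0, 0.0))
```

The difference between `xy` and `yx` lies purely in the z-direction, which reflects the bracket relation [X_1, X_2] = ∂_z behind the Hörmander condition.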

(29)

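$(W_\infty)$: For each $t > 0$,

$\exists$ a coupling $(\mathbf{B}_t, \tilde{\mathbf{B}}_t)$ of $(B_t^1, B_t^2, B_t^3)$ s.t.

$d(\mathbf{B}_t, \tilde{\mathbf{B}}_t) \le K_1\, d(\mathbf{B}_0, \tilde{\mathbf{B}}_0)$ $\ \mathbb{P}$-a.s.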

(30)

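Definition

$X$: a group of type H iff, for $\mathfrak{X}$: Lie alg. associated with $X$ with a scalar product $\langle \cdot, \cdot \rangle$,

• $\mathfrak{X} = \mathcal{V} \oplus \mathcal{Z}$ with $[\mathcal{V}, \mathcal{V}] = \mathcal{Z}$, $[\mathcal{V}, \mathcal{Z}] = [\mathcal{Z}, \mathcal{Z}] = 0$.

• $J : \mathcal{Z} \to \mathrm{End}\, \mathcal{V}$ given by $\langle J(Z) V_1, V_2 \rangle := \langle Z, [V_1, V_2] \rangle$ satisfies $J(Z)^2 = -\|Z\|^2\, \mathrm{Id}$.

$\{X_i\}_{i=1}^n$ will be an ONB of $\mathcal{V}$.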

(31)

Remarks

• (Heisenberg) ⊂ (type H)

• (type H) ⊂ (stratified, step 2 nilpotent)

• (type H) ∩ (free step 2 nilpotent) = {3-dim. Heisenberg}

• For type H, the possible dimensions are completely determined

(33)

§ 4 Sketch of the proof of $(G_q) \Rightarrow (W_p)$

(34)

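Recall:

$d_p^W(P^*\mu, P^*\nu) \le \tilde d_p^W(\mu, \nu)$ $\quad (W_p)$

$|\nabla_{\tilde d} P f|(x) \le P(|\nabla_d f|^q)(x)^{1/q}$ $\quad (G_q)$

• $p = 1$ ($q = \infty$): well known

• $(W_p)$ for $\forall p < \infty$ ⇒ $(W_\infty)$

⇝ We may assume $p \in (1, \infty)$

◦ $(W_p)$ for $\mu = \delta_x$, $\nu = \delta_y$ ⇒ $(W_p)$

⇝ We show $\frac{d_p^W(P_x, P_y)^p}{p} \le \frac{\tilde d(x, y)^p}{p}$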

(37)

General theory of the Hamilton-Jacobi semigroup

(Lott & Villani ’07; Balogh, Engoulatov, Hunziker & Maasalo ’09)

$Q_t f(x) := \inf_{y \in X} \left[ f(y) + t \cdot \frac{1}{p} \left( \frac{d(x, y)}{t} \right)^p \right]$

• Under Assumption 1, $Q_\cdot f \in C_b^{\mathrm{Lip}}([0, \infty) \times X)$ if $f \in C_b^{\mathrm{Lip}}(X)$

• Under Assumptions 1-2, for $\forall t > 0$, $v$-a.e.,

$\partial_t Q_t f = -\frac{1}{q} |\nabla_d Q_t f|^q$

(Note: $q^{-1} u^q = \sup_{s \ge 0} \left( u s - p^{-1} s^p \right)$)
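To make the infimum defining Q_t f concrete, here is a minimal Python sketch that evaluates it by brute force on a finite grid of real numbers with d(x, y) = |x - y|. The grid, the test function f(x) = |x| and the name `hopf_lax` are assumptions chosen for illustration. A finite grid is not a length space, so this only illustrates the formula, not the Hamilton-Jacobi equation itself.

```python
def hopf_lax(f_vals, grid, t, p):
    """Brute-force Q_t f on a finite grid, with d(x, y) = |x - y|.

    f_vals[i] is f(grid[i]); requires t > 0 and p > 1.
    """
    return [
        min(fy + t * (abs(x - y) / t) ** p / p for y, fy in zip(grid, f_vals))
        for x in grid
    ]


# Illustrative data: f(x) = |x| on a uniform grid of [-2, 2].
grid = [i / 10 for i in range(-20, 21)]
f_vals = [abs(x) for x in grid]

q_half = hopf_lax(f_vals, grid, 0.5, 2)  # Q_{1/2} f
q_one = hopf_lax(f_vals, grid, 1.0, 2)   # Q_1 f
```

Two elementary properties are visible directly from the formula and survive on the grid: Q_t f ≤ f (take y = x in the infimum), and t ↦ Q_t f is nonincreasing, since the penalty d^p / (p t^{p-1}) decreases in t.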

(38)

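Kantorovich duality

$d_p^W(\mu, \nu)^p = \sup_{f \in C_b^{\mathrm{Lip}}} \left[ \int_X f^*\, d\mu - \int_X f\, d\nu \right]$,

$f^*(x) := \inf_{y \in X} \left[ f(y) + d(x, y)^p \right] = p\, Q_1(p^{-1} f)(x)$

⇓

$\frac{d_p^W(\mu, \nu)^p}{p} = \sup_f \left[ \int_X Q_1 f\, d\mu - \int_X f\, d\nu \right]$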
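The passage from the first form of the duality to the second is a rescaling of the test function. A short worked version, writing $g := p^{-1} f$ (the symbol $g$ is introduced here only for this computation):

```latex
\begin{align*}
p\, Q_1(p^{-1} f)(x)
  &= p \inf_{y \in X} \Big[ \tfrac{1}{p} f(y) + \tfrac{1}{p}\, d(x, y)^p \Big]
   = \inf_{y \in X} \big[ f(y) + d(x, y)^p \big] = f^*(x), \\
d_p^W(\mu, \nu)^p
  &= \sup_{f} \Big[ \int_X p\, Q_1(p^{-1} f)\, d\mu - \int_X f\, d\nu \Big]
   = p \, \sup_{g} \Big[ \int_X Q_1 g\, d\mu - \int_X g\, d\nu \Big].
\end{align*}
```

Dividing by $p$ gives the second form; the supremum is unchanged because $f \mapsto p^{-1} f$ is a bijection of $C_b^{\mathrm{Lip}}(X)$.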

(39)









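$\tilde\gamma : [0, 1] \to X$: $\tilde d$-minimal geodesic, $\tilde\gamma_0 = y$, $\tilde\gamma_1 = x$,

$\tilde d(\tilde\gamma_s, \tilde\gamma_t) = |t - s|\, \tilde d(x, y)$ (Assumption 3)

$\frac{d_p^W(P_x, P_y)^p}{p} = \sup_f \left[ P Q_1 f(x) - P f(y) \right] = \sup_f \left[ \int_0^1 \partial_t \left( P Q_t f(\tilde\gamma_t) \right) dt \right]$ (interpolation)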

(40)

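$\partial_t \left( P Q_t f(\tilde\gamma_t) \right)$

“=” $P(\partial_t Q_t f)(\tilde\gamma_t) + \langle \nabla P Q_t f(\tilde\gamma_t), \dot{\tilde\gamma}_t \rangle$

$\le -\frac{1}{q} P(|\nabla_d Q_t f|^q)(\tilde\gamma_t) + \tilde d(x, y)\, |\nabla_{\tilde d} P Q_t f|(\tilde\gamma_t)$ (HJ eq., up. grad.)

$\le \tilde d(x, y)\, \sigma - \frac{1}{q} \sigma^q$ (by $(G_q)$)

$\le \frac{\tilde d(x, y)^p}{p}$,

where $\sigma := P(|\nabla_d Q_t f|^q)(\tilde\gamma_t)^{1/q}$. ∎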
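The last inequality of this chain is the Legendre (Young) duality from the note on the Hamilton-Jacobi slide with the roles of $p$ and $q$ exchanged. A short worked version, writing $a := \tilde d(x, y)$ (the symbol $a$ is introduced here only for brevity) and using $(p - 1)(q - 1) = 1$:

```latex
\begin{align*}
\sup_{\sigma \ge 0} \Big( a \sigma - \tfrac{1}{q} \sigma^q \Big)
  &\text{ is attained where } a - \sigma^{q-1} = 0,
   \text{ i.e. at } \sigma = a^{1/(q-1)} = a^{p-1}, \\
a \cdot a^{p-1} - \tfrac{1}{q}\, a^{(p-1) q}
  &= a^p - \tfrac{1}{q}\, a^p
   = \tfrac{1}{p}\, a^p
   \qquad \text{since } (p - 1) q = p .
\end{align*}
```

Hence $a \sigma - \frac{1}{q} \sigma^q \le \frac{1}{p} a^p$ for every $\sigma \ge 0$. Integrating the resulting bound over $t \in [0, 1]$ and taking the supremum over $f$ yields $\frac{1}{p} d_p^W(P_x, P_y)^p \le \frac{1}{p} \tilde d(x, y)^p$.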

(41)

§ 5 Duality under different assumptions

(42)

$X$: Polish space

$d$, $\tilde d$: lower semicontinuous pseudo-distances

$P$: Markov kernel, $P(C_b(X)) \subset C_b(X)$

(43)

Assumption 5

For $\forall x, y \in X$ with $d(x, y) < \infty$, $\exists \gamma$: minimal geodesic from $x$ to $y$. The same is true for $\tilde d$.

Assumption 6

$Q_t f$ is measurable for $\forall f \in C_b(X)$

Assumption 7

$\lim_{r \downarrow 0} \sup_{y;\ \tilde d(x, y) \le r} P Q_t f(y) \le P Q_t f(x)$

(44)

“Assumption 8”

The following holds locally:

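• $\exists D \ge 0$ s.t. for $\forall \gamma$: $d$-min. geod., $\forall \lambda \in [0, 1]$,

$d(x, \gamma(\lambda))^2 \ge (1 - \lambda)\, d(x, \gamma(0))^2 + \lambda\, d(x, \gamma(1))^2 - D \lambda (1 - \lambda)\, d(\gamma(0), \gamma(1))^2$ $\quad (S(D))$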

• ∀y, x ∈ X, with d(x, y): small,

y is on a min. geod. from x of given length

(45)

Assumption 8

For ∀K ⊂ X cpt., ∃D_K ≥ 0, ∃η_K > 0 s.t.

(i) for ∀γ: d-min. geod. with γ(0) ∈ K,

d(γ(0), γ(1)) < η_K , d(γ(0), x) < η_K and λ ∈ (0, 1), (S(D_K)) holds.

(ii) For ∀x ∈ K, ∀y ∈ X with d(x, y) < η_K ,

∃γ: min. geod. with d(x, γ(1)) = η_K and γ(λ) = y for some λ ∈ (0, 1).

(46)

Examples

• X: cpl. Riem. mfd., d: Riem. distance

◦ Ass. 8 (i) ⇐ local lower sect. curv. bound

◦ Ass. 8 (ii) ⇐ local positivity of inj. radius

• X: Wiener space, d: Cameron-Martin norm

◦ Ass. 8 (i) holds with D_K = 1 (“=” holds!)

◦ Ass. 8 (ii) is obvious

(47)

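Subgradient

$|\nabla_d^- f|(x) := \lim_{r \downarrow 0} \sup_{y \in B_r(x)} \left[ \frac{f(x) - f(y)}{d(x, y)} \right]_+$

$|\nabla_{\tilde d}^- P f|(x) \le P(|\nabla_d^- f|^q)(x)^{1/q}$ $\quad (G_q^-)$

Theorem (K. ’10)

For $p, q \in [1, \infty]$ with $\frac{1}{p} + \frac{1}{q} = 1$,

(i) $(W_p) \Rightarrow (G_q^-)$

(ii) Under Assumptions 5-8, $(G_q^-) \Rightarrow (W_p)$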

(48)

Difficulty: Leibniz rule for $P Q_t f(\tilde\gamma(t))$

(Neither $s \mapsto P Q_t f(\tilde\gamma(s))$ nor $s \mapsto P Q_s f(\tilde\gamma(t))$ is of class $C^1$)

Key of the proof

Assumption 8 ⇒ sharp (local) uniform upper bound of $\frac{Q_{t+s} f - Q_s f}{t}$ for small $t$

(49)

§ 6 Questions

(50)

(i) When (weak) ⇒ (strong)?

i.e. $(W_p) \Rightarrow (W_{p'})$ or $(G_{p'}) \Rightarrow (G_p)$ for $p' > p$

(OK if $X$: Riem., $P = P_t$)

(ii) When “$(W_p)$ ⇒ (pathwise control)” in the case $P = P_t$?

(iii) “Bakry-Émery’s $\Gamma_2$-criterion ⇔ $(G_q)$” in the case $P = P_t$, $\tilde d = e^{-kt} d$?

(When $|\nabla_d f| = |\Gamma f|^{1/2}$?)

(iv) Relation with other “lower curvature bounds”...