Equivalence of the Poincaré inequality with a transport-chi-square inequality in dimension one

ISSN:1083-589X in PROBABILITY

Benjamin Jourdain∗

Abstract

In this paper, we prove that, in dimension one, the Poincaré inequality is equivalent to a new transport-chi-square inequality linking the square of the quadratic Wasserstein distance with the chi-square pseudo-distance. We also check tensorization of this transport-chi-square inequality.

Keywords: Poincaré inequality; transport inequality; chi-square pseudo-distance; Wasserstein distance.

AMS MSC 2010: 26D10; 60E15.

Submitted to ECP on June 26, 2012, final version accepted on September 26, 2012.

Supersedes arXiv:1206.5931v1.

For $q\ge 1$, the Wasserstein distance with index $q$ between two probability measures $\mu$ and $\nu$ on $\mathbb{R}^d$ is denoted by

\[ W_q^q(\mu,\nu)=\inf_{\gamma<^{\mu}_{\nu}}\int_{\mathbb{R}^d\times\mathbb{R}^d}|x-y|^q\,d\gamma(x,y) \tag{0.1} \]

where the infimum is taken over all probability measures $\gamma$ on $\mathbb{R}^d\times\mathbb{R}^d$ with respective marginals $\mu$ and $\nu$. We also introduce the relative entropy and the chi-square pseudo-distance

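\[ H(\nu|\mu)=\begin{cases}\displaystyle\int_{\mathbb{R}^d}\ln\left(\frac{d\nu}{d\mu}(x)\right)d\nu(x) & \text{if $\nu$ is absolutely continuous w.r.t. $\mu$,}\\ +\infty & \text{otherwise,}\end{cases} \]

\[ \chi_2^2(\nu|\mu)=\begin{cases}\displaystyle\int_{\mathbb{R}^d}\left(\frac{d\nu}{d\mu}(x)-1\right)^2d\mu(x)=\left\|\frac{d\nu}{d\mu}-1\right\|^2_{L^2(\mu)} & \text{if $\nu$ is absolutely continuous w.r.t. $\mu$,}\\ +\infty & \text{otherwise.}\end{cases} \]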

Next, we make precise the inequalities that will be discussed in the paper.

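Definition 0.1. The probability measure $\mu$ on $\mathbb{R}^d$ is said to satisfy

• the Poincaré inequality $P(C)$ with constant $C$ if

\[ \forall\varphi:\mathbb{R}^d\to\mathbb{R}\ C^1\text{ with a bounded gradient},\quad \int_{\mathbb{R}^d}\varphi^2(x)\,d\mu(x)-\left(\int_{\mathbb{R}^d}\varphi(x)\,d\mu(x)\right)^2\le C\int_{\mathbb{R}^d}|\nabla\varphi(x)|^2\,d\mu(x). \]

• the transport-chi-square inequality $T_\chi(C)$ with constant $C$ if

\[ \forall\nu\text{ probability measure on }\mathbb{R}^d,\quad W_2(\mu,\nu)\le\sqrt{C}\,\chi_2(\nu|\mu). \]

• the log-Sobolev inequality $LS(C)$ with constant $C$ if, for all $\varphi:\mathbb{R}^d\to\mathbb{R}$ $C^2$ compactly supported,

\[ \int_{\mathbb{R}^d}\varphi^2(x)\ln(\varphi^2(x))\,d\mu(x)-\int_{\mathbb{R}^d}\varphi^2(x)\,d\mu(x)\,\ln\left(\int_{\mathbb{R}^d}\varphi^2(x)\,d\mu(x)\right)\le C\int_{\mathbb{R}^d}|\nabla\varphi(x)|^2\,d\mu(x). \]

• the transport-entropy inequality $T_H(C)$ with constant $C$ if

\[ \forall\nu\text{ probability measure on }\mathbb{R}^d,\quad W_2(\mu,\nu)\le\sqrt{C\,H(\nu|\mu)}. \]

∗Université Paris-Est, CERMICS, France. E-mail:[email protected]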

According to [9], the log-Sobolev inequality is stronger than the transport-entropy inequality, which is itself stronger than the Poincaré inequality; more precisely, $LS(C)\Rightarrow T_H(C)\Rightarrow P(C/2)$. The transport-entropy inequality is strictly weaker than the log-Sobolev inequality (see [3, 5] for examples of one-dimensional probability measures $\mu$ satisfying the transport-entropy inequality but not the log-Sobolev inequality) and is strictly stronger than the Poincaré inequality (see for example [5] Theorem 1.7).

To obtain some transport inequality equivalent to the Poincaré inequality, one may try to replace either $W_2(\nu,\mu)$ on the left-hand side by some smaller Wasserstein distance or the relative entropy $H(\nu|\mu)$ on the right-hand side by some larger pseudo-distance.

The first possibility is successfully explored in [2] Corollary 5.1 where the Poincaré inequality is proved to be equivalent to the modified transport-entropy inequality

\[ \exists C<+\infty,\ \forall\nu\text{ probability measure on }\mathbb{R}^d,\quad \inf_{\gamma<^{\mu}_{\nu}}\int_{\mathbb{R}^d\times\mathbb{R}^d}\left(|x-y|^2\wedge|x-y|\right)d\gamma(x,y)\le C\,H(\nu|\mu) \]

with possibly different constants $C$. The present paper is devoted to the second possibility. More precisely, since the inequality $x\ln(x)\le(x-1)+(x-1)^2$ implies $H(\nu|\mu)\le\chi_2^2(\nu|\mu)$, we consider replacing the transport-entropy inequality $T_H(C)$ by the weaker transport-chi-square inequality $T_\chi(C)$. It turns out that, by an easy adaptation of the linearization argument in [9], the transport-chi-square inequality implies the Poincaré inequality. Moreover, in dimension $d=1$, we are able to prove the converse implication so that both inequalities are equivalent. Last, we prove tensorization of the transport-chi-square inequality.
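The comparison $H(\nu|\mu)\le\chi_2^2(\nu|\mu)$ can be checked numerically on a finite state space. The following is only a minimal sketch: it assumes discrete measures given as probability vectors (the paper works on $\mathbb{R}^d$), and the function names `relative_entropy` and `chi_square` are illustrative.

```python
import math

def relative_entropy(nu, mu):
    """H(nu|mu) for two discrete distributions given as lists of probabilities."""
    total = 0.0
    for n, m in zip(nu, mu):
        if n > 0.0:
            if m == 0.0:
                return math.inf  # nu is not absolutely continuous w.r.t. mu
            total += n * math.log(n / m)
    return total

def chi_square(nu, mu):
    """chi_2^2(nu|mu) = sum over atoms of (nu/mu - 1)^2 * mu."""
    total = 0.0
    for n, m in zip(nu, mu):
        if m == 0.0:
            if n > 0.0:
                return math.inf  # nu is not absolutely continuous w.r.t. mu
            continue
        total += (n / m - 1.0) ** 2 * m
    return total

# Hypothetical example: two distributions on four atoms.
mu = [0.1, 0.2, 0.3, 0.4]
nu = [0.4, 0.3, 0.2, 0.1]
h = relative_entropy(nu, mu)
chi2 = chi_square(nu, mu)
print(h, chi2)  # the entropy should not exceed the chi-square pseudo-distance
```

On this example the chi-square term is noticeably larger than the entropy, which is consistent with $T_\chi(C)$ being a weaker requirement than $T_H(C)$.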

1 Main results

Theorem 1.1. $\forall d\ge1$, $T_\chi(C)\Rightarrow P(C)$. Moreover, when $d=1$, $P(C)\Rightarrow T_\chi(32C)$ and the transport-chi-square and Poincaré inequalities are equivalent.

Before proving Theorem 1.1, we state our second main result dedicated to the tensorization property of the transport-chi-square inequality. Its proof is postponed to Section 4.

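Theorem 1.2. If $\mu_1$ and $\mu_2$ are probability measures on $\mathbb{R}^{d_1}$ and $\mathbb{R}^{d_2}$ respectively satisfying $T_\chi(C_1)$ and $T_\chi(C_2)$, then the measure $\mu_1\otimes\mu_2$ satisfies

\[ T_\chi\Big(\big(C_1+C_2(1+\sqrt{(3d_2+2)d_2})\big)\wedge\big(C_2+C_1(1+\sqrt{(3d_1+2)d_1})\big)\Big). \]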

Remark 1.3. According to Proposition 8.4.1 [1], if $\mu_1$ and $\mu_2$ respectively satisfy $T_H(C_1)$ and $T_H(C_2)$, then $\mu_1\otimes\mu_2$ satisfies $T_H(C_1\vee C_2)$. The constant that we obtain in the tensorization of the transport-chi-square inequality is larger than $C_1\vee C_2$.
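To get a feel for the size of the constant in Theorem 1.2 compared with the constant $C_1\vee C_2$ of Remark 1.3, one can simply evaluate both. This is a small sketch; the function names are illustrative and not taken from the paper.

```python
import math

def tensorized_chi_square_constant(c1, c2, d1, d2):
    """Constant of Theorem 1.2 for mu1 (x) mu2, where mu_i lives on R^{d_i}
    and satisfies the transport-chi-square inequality with constant c_i."""
    via_first = c1 + c2 * (1.0 + math.sqrt((3 * d2 + 2) * d2))
    via_second = c2 + c1 * (1.0 + math.sqrt((3 * d1 + 2) * d1))
    return min(via_first, via_second)

def tensorized_entropy_constant(c1, c2):
    """Constant C1 v C2 of Remark 1.3 for the transport-entropy inequality."""
    return max(c1, c2)

# Hypothetical example: two one-dimensional factors with constants 1 and 3.
print(tensorized_chi_square_constant(1.0, 3.0, 1, 1))
print(tensorized_entropy_constant(1.0, 3.0))
```

In this example the minimum is attained by the second expression, $C_2+C_1(1+\sqrt5)$, and it exceeds $C_1\vee C_2$ as stated in Remark 1.3.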

The proof of the one-dimensional implication $P(C)\Rightarrow T_\chi(32C)$ in Theorem 1.1 relies on the next two propositions, the proofs of which are postponed to Sections 2 and 3 respectively. When $d=1$, we denote by $F(x)=\mu((-\infty,x])$ and $G(x)=\nu((-\infty,x])$ the cumulative distribution functions of the probability measures $\mu$ and $\nu$. The left-continuous (càg) pseudo-inverse of $G$ (resp. $F$) is defined by $G^{-1}:\,]0,1[\,\ni u\mapsto\inf\{x\in\mathbb{R}:G(x)\ge u\}$ (resp. $F^{-1}(u)=\inf\{x\in\mathbb{R}:F(x)\ge u\}$) and satisfies

\[ \forall x\in\mathbb{R},\ \forall u\in(0,1),\quad x<G^{-1}(u)\Leftrightarrow G(x)<u. \tag{1.1} \]

When $\mu$ (resp. $\nu$) admits a density w.r.t. the Lebesgue measure, this density is denoted by $f$ (resp. $g$). Moreover, the optimal coupling in (0.1) is given by $\gamma=du\circ(F^{-1},G^{-1})^{-1}$ where $du$ denotes the Lebesgue measure on $(0,1)$ so that

\[ W_q^q(\mu,\nu)=\int_0^1\left|F^{-1}(u)-G^{-1}(u)\right|^q\,du \]

(see [10] p107-109). We take advantage of this optimal coupling to work with the cumulative distribution functions and check the following proposition. In higher dimensions, far less is known on the optimal coupling and this is the main reason why we have not been able to check whether the Poincaré inequality implies the transport-chi-square inequality.
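The quantile formula for $W_q^q$ becomes especially simple for empirical measures. The sketch below assumes two samples with the same number $n$ of equally weighted atoms (an assumption made for illustration): the pseudo-inverses are then piecewise constant and equal to the order statistics, so the integral over $(0,1)$ reduces to an average over sorted pairs. The function name is illustrative.

```python
def wasserstein_q_pow(xs, ys, q=2.0):
    """W_q^q between the empirical measures of two samples of equal size.

    On ((i-1)/n, i/n] the left-continuous pseudo-inverse of an empirical
    c.d.f. equals the i-th order statistic, so the optimal coupling pairs
    the sorted samples.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("both samples must be non-empty and of equal size")
    paired = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) ** q for a, b in paired) / len(xs)

# Hypothetical example: translating a sample by 2 gives W_2^2 = 4.
sample = [0.0, 1.0, 3.0]
shifted = [x + 2.0 for x in sample]
print(wasserstein_q_pow(sample, shifted))          # squared W_2 distance
print(wasserstein_q_pow(sample, [5.0, 2.0, 4.0]))  # pairs (0,2), (1,4), (3,5)
```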

Proposition 1.4. If a probability measure $\mu$ on the real line admits a positive probability density $f$, then, for any probability measure $\nu$ on $\mathbb{R}$,

\[ W_2^2(\mu,\nu)\le4\int_{\mathbb{R}}\frac{(F-G)^2}{f}(x)\,dx. \tag{1.2} \]

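Remark 1.5. • One deduces that $W_1^2(\mu,\nu)\le4\int_{\mathbb{R}}\frac{(F-G)^2}{f}(x)\,dx$. Notice that since, by (1.1) and Fubini's theorem,

\[ W_1(\mu,\nu)=\int_0^1\int_{\mathbb{R}}\left(1_{\{F^{-1}(u)\le x<G^{-1}(u)\}}+1_{\{G^{-1}(u)\le x<F^{-1}(u)\}}\right)dx\,du=\int_{\mathbb{R}}\int_0^1\left(1_{\{G(x)<u\le F(x)\}}+1_{\{F(x)<u\le G(x)\}}\right)du\,dx=\int_{\mathbb{R}}|F(x)-G(x)|\,dx, \]

the stronger bound

\[ W_1^2(\mu,\nu)=\left(\int_{\mathbb{R}}\frac{|F-G|}{\sqrt f}\times\sqrt f\,(x)\,dx\right)^2\le\int_{\mathbb{R}}\frac{(F-G)^2}{f}(x)\,dx \]

is a consequence of the Cauchy-Schwarz inequality.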

• It is not possible to control $\int_{\mathbb{R}}\frac{(F-G)^2}{f}(x)\,dx$ in terms of $W_2^2(\mu,\nu)$. Indeed for $f(x)=\frac12e^{-|x|}$ and $d\nu(x)=\frac12e^{-|x-m|}dx$, one has $W_2^2(\mu,\nu)=m^2$, $G(x)=\frac{e^{x-m}}{2}1_{\{x\le m\}}+\left(1-\frac{e^{m-x}}{2}\right)1_{\{x>m\}}$, and for $m>0$,

\[ \int_{\mathbb{R}}\frac{(F-G)^2}{f}(x)\,dx\ge\int_m^{+\infty}\frac{(F-G)^2}{f}(x)\,dx=\frac{e^{-m}}{2}(e^m-1)^2. \]

Next, when the probability measure $\mu$ on the real line admits a positive probability density satisfying a tail assumption known to be equivalent to the Poincaré inequality (see Theorem 6.2.2 [1]), we are able to control the right-hand side of (1.2) in terms of $\chi_2^2(\nu|\mu)$.
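The Laplace example of Remark 1.5 can be explored numerically. The following sketch approximates $\int_{\mathbb{R}}\frac{(F-G)^2}{f}(x)\,dx$ by a midpoint rule on a truncated interval (the truncation width and step are arbitrary choices, and the function names are illustrative), then compares it with $W_2^2(\mu,\nu)=m^2$ and with the tail lower bound $\frac{e^{-m}}{2}(e^m-1)^2$.

```python
import math

def laplace_cdf(x, center=0.0):
    """C.d.f. of the density exp(-|x - center|) / 2."""
    t = x - center
    return 0.5 * math.exp(t) if t <= 0.0 else 1.0 - 0.5 * math.exp(-t)

def laplace_density(x):
    return 0.5 * math.exp(-abs(x))

def weighted_cdf_gap(m, half_width=30.0, h=0.005):
    """Midpoint-rule approximation of the integral of (F - G)^2 / f
    over [-half_width, m + half_width], with G the c.d.f. shifted by m."""
    lo = -half_width
    n = round((m + 2.0 * half_width) / h)
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        gap = laplace_cdf(x) - laplace_cdf(x, center=m)
        total += gap * gap / laplace_density(x) * h
    return total

def tail_lower_bound(m):
    """Closed-form value of the integral restricted to (m, +infinity)."""
    return 0.5 * math.exp(-m) * (math.exp(m) - 1.0) ** 2

for m in (2.0, 10.0):
    value = weighted_cdf_gap(m)
    # Proposition 1.4 predicts m^2 <= 4 * value; the ratio value / m^2 is unbounded in m.
    print(m, value, tail_lower_bound(m), value / (m * m))
```

The ratio in the last column grows quickly with $m$, which illustrates why no bound of the weighted integral by a multiple of $W_2^2(\mu,\nu)$ can hold.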

Proposition 1.6. Let $f(x)$ be a positive probability density on the real line with cumulative distribution function $F(x)=\int_{-\infty}^xf(y)\,dy$ and median $m$ such that

\[ b\stackrel{\rm def}{=}\sup_{x\ge m}\int_x^{+\infty}f(y)\,dy\int_m^x\frac{dy}{f(y)}\ \vee\ \sup_{x\le m}\int_{-\infty}^xf(y)\,dy\int_x^m\frac{dy}{f(y)}<+\infty. \tag{1.3} \]

Then for any probability density $g$ on the real line with cumulative distribution function $G(x)=\int_{-\infty}^xg(y)\,dy$,

\[ \int_{\mathbb{R}}\frac{(F-G)^2}{f}(x)\,dx\le4b\int_{\mathbb{R}}\frac{(f-g)^2}{f}(x)\,dx. \tag{1.4} \]
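The constant $b$ of (1.3) can be approximated on a grid. The sketch below is illustrative only: it handles the branch $x\ge m$ for a density symmetric about its median (so that the other branch gives the same value), truncates the supremum to a bounded interval, and uses a midpoint rule for $\int_m^x\frac{dy}{f(y)}$. The function name and the two test densities (Laplace and standard Gaussian) are choices made here, not taken from the paper. For the Laplace density $\frac12e^{-|x|}$ the product equals $1-e^{-x}$ for $x\ge0$, so $b=1$.

```python
import math

def right_branch_of_b(density, tail, median, upper, steps):
    """Approximate sup over median <= x <= upper of
    tail(x) * integral_{median}^{x} dy / density(y)."""
    h = (upper - median) / steps
    inv_integral = 0.0
    best = 0.0
    for i in range(steps):
        y_mid = median + (i + 0.5) * h
        inv_integral += h / density(y_mid)  # midpoint rule on the i-th cell
        x = median + (i + 1) * h
        best = max(best, tail(x) * inv_integral)
    return best

def laplace_density(x):
    return 0.5 * math.exp(-abs(x))

def laplace_tail(x):
    """mu((x, +infinity)) for x >= 0."""
    return 0.5 * math.exp(-x)

def gauss_density(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def gauss_tail(x):
    return 0.5 * math.erfc(x / math.sqrt(2.0))

b_laplace = right_branch_of_b(laplace_density, laplace_tail, 0.0, 30.0, 30000)
b_gauss = right_branch_of_b(gauss_density, gauss_tail, 0.0, 8.0, 8000)
print(b_laplace, b_gauss)
```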

Remark 1.7. • The combination of these two propositions implies that any probability measure $\mu$ on the real line admitting a positive density $f$ such that $b<+\infty$ satisfies $T_\chi(16b)$.

• Proposition 1.6 is a generalization of the last assertion in Lemma 2.3 [7] where $f$ is restricted to the class of probability densities $f_\infty$ solving $f_\infty(x)=-A(F_\infty(x))$ on the real line with

$A:[0,1]\to\mathbb{R}$ $C^1$, negative on $(0,1)$ and s.t. $A(0)=A(1)=0$, $A'(0)<0$, $A'(1)>0$.

The constant $b$ associated with any such density is finite by the proof of Lemma 2.1 [7]. Moreover, in order to investigate the long-time convergence of the solution $f_t$ of the Fokker-Planck equation

\[ \partial_tf_t(x)=\partial_{xx}f_t(x)+\partial_x\left(A'(F_t(x))f_t(x)\right),\quad(t,x)\in[0,+\infty)\times\mathbb{R} \]

to the density $f_\infty$ such that $\int_{\mathbb{R}}xf_\infty(x)\,dx=\int_{\mathbb{R}}xf_0(x)\,dx$, [7] first investigates the exponential convergence to $0$ of $\int_{\mathbb{R}}\frac{(F_t-F_\infty)^2}{f_\infty}(x)\,dx$ (Lemma 2.8) before dealing with that of $\int_{\mathbb{R}}\frac{(f_t-f_\infty)^2}{f_\infty}(x)\,dx$ (Theorem 2.4).

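• It is not possible to control $\int_{\mathbb{R}}\frac{(f-g)^2}{f}(x)\,dx$ in terms of $\int_{\mathbb{R}}\frac{(F-G)^2}{f}(x)\,dx$, even when $b<+\infty$. Indeed let $f(x)=\frac12e^{-|x|}$ and, for $n\in\mathbb{N}$,

\[ g_n(x)=\sum_{k\le n}f(x)1_{[k-1,k)}(|x|)+\sum_{k\ge n}\frac{e^{-|x|/2}}{2}1_{[x_k,k+1)}(|x|) \]

where $x_k=k+1-2\ln\left(1+\frac{e-1}{2}e^{-\frac{k+1}{2}}\right)$ belongs to $(k,k+1)$ and is such that $\int_{x_k}^{k+1}e^{-x/2}\,dx=\int_k^{k+1}e^{-x}\,dx$. One has, using $\forall y\ge0$, $\ln(1+y)\ge\frac{y}{1+y}$ by concavity of the logarithm and $1+\frac{e-1}{2}e^{-\frac{k+1}{2}}\le\sqrt e$ for the inequality,

\[ \int_{\mathbb{R}}\frac{(f-g_n)^2}{f}(x)\,dx=2\int_n^{+\infty}\frac{g_n^2}{f}(x)\,dx-e^{-n}=2\sum_{k\ge n}\ln\left(1+\frac{e-1}{2}e^{-\frac{k+1}{2}}\right)-e^{-n}\ge\frac{e-1}{\sqrt e}\sum_{k\ge n}e^{-\frac{k+1}{2}}-e^{-n}=(\sqrt e+1)e^{-\frac{n+1}{2}}-e^{-n}. \]

On the other hand, since for $k\ge n$ and $x\in[k,k+1]$, $1-\frac{e^{-k}}{2}\le G_n(x)\le F(x)=1-\frac{e^{-x}}{2}$,

\[ \int_{\mathbb{R}}\frac{(F-G_n)^2}{f}(x)\,dx\le\sum_{k\ge n}\int_k^{k+1}\frac{(e^{-k}-e^{-x})^2}{e^{-x}}\,dx=\frac{e^2-2e-1}{e-1}e^{-n}. \]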
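The closed-form quantities in the last item of Remark 1.7 are easy to sanity-check. The sketch below (with illustrative function names) verifies the defining identity of $x_k$, namely $\int_{x_k}^{k+1}e^{-x/2}\,dx=\int_k^{k+1}e^{-x}\,dx$, using the explicit antiderivatives, and then compares the lower bound on $\int\frac{(f-g_n)^2}{f}$ with the upper bound on $\int\frac{(F-G_n)^2}{f}$.

```python
import math

def x_k(k):
    """Left endpoint of the interval [x_k, k+1) carrying the mass of g_n."""
    return k + 1 - 2.0 * math.log(1.0 + 0.5 * (math.e - 1.0) * math.exp(-(k + 1) / 2.0))

def mass_identity_sides(k):
    """Both sides of the identity defining x_k, from explicit antiderivatives."""
    lhs = 2.0 * (math.exp(-x_k(k) / 2.0) - math.exp(-(k + 1) / 2.0))
    rhs = math.exp(-k) - math.exp(-(k + 1))
    return lhs, rhs

def chi_square_lower_bound(n):
    return (math.sqrt(math.e) + 1.0) * math.exp(-(n + 1) / 2.0) - math.exp(-n)

def cdf_gap_upper_bound(n):
    return (math.e ** 2 - 2.0 * math.e - 1.0) / (math.e - 1.0) * math.exp(-n)

for n in (0, 10, 20):
    # The ratio grows like exp(n / 2): no control of the chi-square term is possible.
    print(n, chi_square_lower_bound(n) / cdf_gap_upper_bound(n))
```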

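Proof of Theorem 1.1. The implication $T_\chi(C)\Rightarrow P(C)$ is obtained by linearization of the transport-chi-square inequality $T_\chi(C)$. For $\nu_\varepsilon=(1+\varepsilon\phi)\mu$ with $\phi:\mathbb{R}^d\to\mathbb{R}$ a $C^2$ function compactly supported and such that $\int_{\mathbb{R}^d}\phi(x)\,d\mu(x)=0$, according to [9] p394, there is a finite constant $K$ not depending on $\varepsilon$ such that

\[ \int_{\mathbb{R}^d}\phi^2(x)\,d\mu(x)\le\sqrt{\int_{\mathbb{R}^d}|\nabla\phi(x)|^2\,d\mu(x)}\times\frac{W_2(\mu,\nu_\varepsilon)}{\varepsilon}+K\frac{W_2^2(\mu,\nu_\varepsilon)}{\varepsilon}. \]

When $T_\chi(C)$ holds, then $W_2(\mu,\nu_\varepsilon)\le\varepsilon\sqrt{C\int_{\mathbb{R}^d}\phi^2(x)\,d\mu(x)}$ and taking the limit $\varepsilon\to0$, one deduces that

\[ \int_{\mathbb{R}^d}\phi^2(x)\,d\mu(x)\le\sqrt{\int_{\mathbb{R}^d}|\nabla\phi(x)|^2\,d\mu(x)}\times\sqrt{C\int_{\mathbb{R}^d}\phi^2(x)\,d\mu(x)}. \]

This implies $\int_{\mathbb{R}^d}\phi^2(x)\,d\mu(x)\le C\int_{\mathbb{R}^d}|\nabla\phi(x)|^2\,d\mu(x)$. Let now $\varphi,\phi_n:\mathbb{R}^d\to\mathbb{R}$ be $C^2$ functions compactly supported with $\phi_n$ taking its values in $[0,1]$, equal to $1$ on the ball centered at the origin with radius $n$ and $\nabla\phi_n$ bounded by $1$. Taking the limit $n\to\infty$ in the inequality written with $\phi$ replaced by $\varphi_n=\varphi-\phi_n\frac{\int_{\mathbb{R}^d}\varphi(x)\,d\mu(x)}{\int_{\mathbb{R}^d}\phi_n(x)\,d\mu(x)}$, one deduces that the Poincaré inequality $P(C)$ holds for $\varphi$. The extension to $C^1$ functions $\varphi$ with a bounded gradient is obtained by density.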

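To prove the converse implication, we now suppose that $d=1$, $\mu$ satisfies the Poincaré inequality $P(C)$ and that $\chi_2(\nu|\mu)<+\infty$. We set $\mu_n=\rho_n\star\mu$ and $\nu_n=\rho_n\star\nu$ for $n\ge1$ where

\[ \rho_n(x)=\sqrt{\frac{n}{2\pi}}\,e^{-\frac{nx^2}{2}} \tag{1.5} \]

denotes the density of the centered Gaussian law with variance $1/n$. For $\varphi$ a $C^1$ function on $\mathbb{R}$ with a bounded derivative such that $0=\int_{\mathbb{R}}\varphi(x)\,d\mu_n(x)=\int_{\mathbb{R}}\rho_n\star\varphi(x)\,d\mu(x)$, one has

\[ \begin{aligned}\int_{\mathbb{R}}\varphi^2(x)\,d\mu_n(x)&=\int_{\mathbb{R}}\left((\rho_n\star\varphi^2)(x)-(\rho_n\star\varphi)^2(x)\right)d\mu(x)+\int_{\mathbb{R}}(\rho_n\star\varphi)^2(x)\,d\mu(x)\\&\le\int_{\mathbb{R}}\frac1n(\rho_n\star(\varphi')^2)(x)\,d\mu(x)+C\int_{\mathbb{R}}(\rho_n\star\varphi')^2(x)\,d\mu(x)\\&\le\frac{1+nC}{n}\int_{\mathbb{R}}(\rho_n\star(\varphi')^2)(x)\,d\mu(x)=\frac{1+nC}{n}\int_{\mathbb{R}}(\varphi')^2(x)\,d\mu_n(x)\end{aligned} \]

where we used the Poincaré inequalities for the Gaussian density $\rho_n$ ([1] Théorème 1.5.1 p10) applied to $\varphi$ and for $\mu$ applied to $\rho_n\star\varphi$ for the first inequality, then Jensen's inequality for the second. The probability measure $\mu_n$ admits a positive density w.r.t. the Lebesgue measure and satisfies $P\left(\frac{1+nC}{n}\right)$. According to Théorème 6.2.2 [1], this property is equivalent to the fact that the constant associated with $\mu_n$ through (1.3) is $b_n\le2\frac{1+nC}{n}$. Combining Propositions 1.4 and 1.6, one deduces that

\[ W_2^2(\mu_n,\nu_n)\le32\,\frac{1+nC}{n}\,\chi_2^2(\nu_n|\mu_n). \]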

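To conclude, let us check that $W_2^2(\mu,\nu)\le\liminf_{n\to\infty}W_2^2(\mu_n,\nu_n)$ and that $\chi_2^2(\nu_n|\mu_n)\le\chi_2^2(\nu|\mu)$. First, the probability measures $\mu_n$ with c.d.f. $F_n(x)=\mu_n((-\infty,x])$ (resp. $\nu_n$ with c.d.f. $G_n(x)=\nu_n((-\infty,x])$) converge weakly to $\mu$ (resp. $\nu$) which ensures that $du$ a.e. on $(0,1)$, $(F_n^{-1}(u),G_n^{-1}(u))$ tends to $(F^{-1}(u),G^{-1}(u))$ as $n\to\infty$. With Fatou's lemma, one deduces that

\[ W_2^2(\mu,\nu)=\int_0^1(F^{-1}(u)-G^{-1}(u))^2\,du\le\liminf_{n\to\infty}\int_0^1(F_n^{-1}(u)-G_n^{-1}(u))^2\,du=\liminf_{n\to\infty}W_2^2(\mu_n,\nu_n). \]

On the other hand, by Jensen's inequality,

\[ \begin{aligned}\chi_2^2(\nu_n|\mu_n)&=\int_{\mathbb{R}}\left(\frac{\int_{\mathbb{R}}\left(\frac{d\nu}{d\mu}(y)-1\right)\rho_n(x-y)\,d\mu(y)}{\int_{\mathbb{R}}\rho_n(x-y)\,d\mu(y)}\right)^2\int_{\mathbb{R}}\rho_n(x-z)\,d\mu(z)\,dx\\&\le\int_{\mathbb{R}}\int_{\mathbb{R}}\left(\frac{d\nu}{d\mu}(y)-1\right)^2\rho_n(x-y)\,d\mu(y)\,dx=\chi_2^2(\nu|\mu).\end{aligned} \]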

Remark 1.8. Since

\[ W_2^2(\mu_n,\nu_n)\le\inf_{\gamma<^{\mu}_{\nu}}\int_{\mathbb{R}^3}((x+z)-(y+z))^2\,d\gamma(x,y)\,\rho_n(z)\,dz=W_2^2(\mu,\nu), \]

one has $\lim_{n\to\infty}W_2(\mu_n,\nu_n)=W_2(\mu,\nu)$.

Moreover, when $\chi_2^2(\nu|\mu)<+\infty$, then interpreting $\mu_n$ (resp. $\nu_n$) as the distribution at time $\frac1n$ of a Brownian motion initially distributed according to $\mu$ (resp. $\nu$) and using Theorem 1.7 [4], one obtains $\lim_{n\to\infty}\chi_2^2(\nu_n|\mu_n)=\chi_2^2(\nu|\mu)$.

2 Proof of Proposition 1.4

To prove the proposition, one first needs to express the Wasserstein distance in terms of the cumulative distribution functions $F$ and $G$ instead of their pseudo-inverses.

Lemma 2.1.

\[ W_2^2(\mu,\nu)=\int_{\mathbb{R}^2}\left((F(x\wedge y)-G(x\vee y))^++(G(x\wedge y)-F(x\vee y))^+\right)dy\,dx. \tag{2.1} \]

Proof of Lemma 2.1. Using Fubini's theorem and (1.1) for the third equality, one obtains

\[ \begin{aligned}W_2^2(\mu,\nu)&=\int_0^1(G^{-1}(u)-F^{-1}(u))^2\,du\\&=2\int_{[0,1]}\int_{\mathbb{R}^2}\left(1_{\{F^{-1}(u)\le x\le y<G^{-1}(u)\}}+1_{\{G^{-1}(u)\le x\le y<F^{-1}(u)\}}\right)dx\,dy\,du\\&=2\int_{\mathbb{R}^2}1_{\{x\le y\}}\int_0^1\left(1_{\{G(y)<u\le F(x)\}}+1_{\{F(y)<u\le G(x)\}}\right)du\,dy\,dx\\&=2\int_{\mathbb{R}}\int_x^{+\infty}\left((F(x)-G(y))^++(G(x)-F(y))^+\right)dy\,dx.\end{aligned} \tag{2.2} \]

By symmetry, one deduces that (2.1) holds.
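Formula (2.1) can be tested on a case where $W_2^2$ is known. The sketch below takes $\mu=N(0,1)$ and $\nu=N(1,1)$, for which the pseudo-inverses differ by the constant $1$ and hence $W_2^2(\mu,\nu)=1$, and approximates the double integral by a midpoint rule on a truncated square. The truncation bounds, the step and the function names are arbitrary choices made here.

```python
import math

def std_normal_cdf(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def lemma_2_1_integral(shift, lo=-7.0, hi=8.0, h=0.05):
    """Midpoint-rule approximation of the right-hand side of (2.1)
    for mu = N(0, 1) and nu = N(shift, 1), truncated to [lo, hi]^2."""
    n = round((hi - lo) / h)
    pts = [lo + (i + 0.5) * h for i in range(n)]
    F = [std_normal_cdf(t) for t in pts]          # c.d.f. of mu on the grid
    G = [std_normal_cdf(t - shift) for t in pts]  # c.d.f. of nu on the grid
    total = 0.0
    for i in range(n):
        for j in range(n):
            a, b = (i, j) if i <= j else (j, i)   # a indexes min(x, y), b indexes max(x, y)
            total += max(F[a] - G[b], 0.0) + max(G[a] - F[b], 0.0)
    return total * h * h

approx = lemma_2_1_integral(1.0)
print(approx)  # should be close to W_2^2(N(0,1), N(1,1)) = 1
```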

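Proof of Proposition 1.4. One has

\[ \int_x^{+\infty}(F(x)-G(y))^+\,dy=1_{\{F(x)>G(x)\}}\int_x^{G^{-1}(F(x))}(F(x)-G(y))\,dy\le(F(x)-G(x))^+(G^{-1}(F(x))-x). \tag{2.3} \]

By Fubini's theorem and a similar argument,

\[ \int_{\mathbb{R}}\int_x^{+\infty}(G(x)-F(y))^+\,dy\,dx=\int_{\mathbb{R}}\int_{-\infty}^x(G(y)-F(x))^+\,dy\,dx\le\int_{\mathbb{R}}(G(x)-F(x))^+(x-G^{-1}(F(x)))\,dx. \]

With (2.2) and (2.3), then using the Cauchy-Schwarz inequality and the change of variables $u=F(x)$, one deduces that when $\mu$ admits a positive density $f$ w.r.t. the Lebesgue measure, then

\[ \begin{aligned}W_2^2(\mu,\nu)&\le2\int_{\mathbb{R}}|G(x)-F(x)|\,|x-G^{-1}(F(x))|\,dx\\&\le2\left(\int_{\mathbb{R}}\frac{(G(x)-F(x))^2}{f(x)}\,dx\right)^{1/2}\times\left(\int_{\mathbb{R}}(x-G^{-1}(F(x)))^2f(x)\,dx\right)^{1/2}\\&=2\left(\int_{\mathbb{R}}\frac{(G(x)-F(x))^2}{f(x)}\,dx\right)^{1/2}\times\left(\int_0^1(F^{-1}(u)-G^{-1}(u))^2\,du\right)^{1/2}.\end{aligned} \]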

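Recognizing that the second factor in the r.h.s. is equal to $W_2(\mu,\nu)$, one concludes that (1.2) holds as soon as $W_2(\mu,\nu)<+\infty$. To prove (1.2) without assuming finiteness of $W_2(\mu,\nu)$, one defines a sequence $(G_n)_n$ of cumulative distribution functions converging pointwise to $G$ by setting

\[ G_n(x)=\begin{cases}F(x)\wedge\frac1n & \text{if }x<G^{-1}(\frac1n)\\ G(x) & \text{if }x\in[G^{-1}(\frac1n),G^{-1}(\frac{n-1}{n}))\\ F(x)\vee\frac{n-1}{n} & \text{if }x\ge G^{-1}(\frac{n-1}{n}).\end{cases} \]

For $x<G^{-1}(\frac1n)$, $G(x)<\frac1n$ and

\[ |F(x)-G_n(x)|=\left(F(x)-\frac1n\right)^+\le\min\left(|F(x)-G(x)|,\left(F(x)-\frac{1}{n+1}\right)^+\right)\le|F(x)-G_{n+1}(x)|. \]

Similarly, for $x\ge G^{-1}(\frac{n-1}{n})$, $G(x)\ge\frac{n-1}{n}$ and

\[ |F(x)-G_n(x)|=\left(\frac{n-1}{n}-F(x)\right)^+\le\min\left(|F(x)-G(x)|,\left(\frac{n}{n+1}-F(x)\right)^+\right)\le|F(x)-G_{n+1}(x)|. \]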

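As a consequence, for fixed $x\in\mathbb{R}$, the sequence $(|G_n(x)-F(x)|)_{n\in\mathbb{N}}$ is non-decreasing and goes to $|G(x)-F(x)|$ as $n\to\infty$. By monotone convergence, one deduces that $\lim_{n\to+\infty}\int_{\mathbb{R}}\frac{(G_n-F)^2}{f}(x)\,dx=\int_{\mathbb{R}}\frac{(G-F)^2}{f}(x)\,dx$. Moreover,

\[ G_n^{-1}(u)=\begin{cases}F^{-1}(u)\wedge G^{-1}(\frac1n) & \text{if }u\le\frac1n\\ G^{-1}(u) & \text{if }u\in(\frac1n,\frac{n-1}{n}]\\ F^{-1}(u)\vee G^{-1}(\frac{n-1}{n}) & \text{if }u>\frac{n-1}{n}.\end{cases} \]

As a consequence, denoting by $\nu_n$ the probability measure with c.d.f. $G_n$,

\[ W_2^2(\mu,\nu_n)=\int_0^1(F^{-1}(u)-G_n^{-1}(u))^2\,du<+\infty \]

and $W_2^2(\mu,\nu)\le\liminf_{n\to\infty}W_2^2(\mu,\nu_n)$ by Fatou's lemma. One concludes by taking the limit $n\to+\infty$ in (1.2) written with $(\nu_n,G_n)$ replacing $(\nu,G)$.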

3 Proof of Proposition 1.6

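Let us assume that $b<+\infty$ and $\int_{\mathbb{R}}\frac{(f-g)^2}{f}(x)\,dx<+\infty$. By integration by parts, for $n\in\mathbb{N}^*$,

\[ \int_{-n}^n\frac{(F-G)^2}{f}(x)\,dx=\left[(F-G)^2(x)\int_m^x\frac{dy}{f(y)}\right]_{-n}^{+n}-2\int_{-n}^n(F-G)(f-g)(x)\int_m^x\frac{dy}{f(y)}\,dx. \tag{3.1} \]

For $x$ larger than the median $m$ of the density $f$, by definition of $b$, then by the equality $(F-G)(x)=\int_x^\infty(g-f)(y)\,dy$ and the Cauchy-Schwarz inequality, one has

\[ 0\le(F-G)^2(x)\int_m^x\frac{dy}{f(y)}\le\frac{b\,(F-G)^2(x)}{\int_x^{+\infty}f(y)\,dy}=\frac{b\left(\int_x^\infty(f-g)(y)\,dy\right)^2}{\int_x^{+\infty}f(y)\,dy}\le b\int_x^\infty\frac{(f-g)^2}{f}(y)\,dy, \]

where the right-hand side tends to $0$ as $x\to+\infty$ by integrability of $\frac{(f-g)^2}{f}$ on the real line. Similarly, $\lim_{x\to-\infty}(F-G)^2(x)\int_x^m\frac{dy}{f(y)}=0$. Taking the limit $n\to\infty$ in (3.1) and using again the definition of $b$, one deduces that

\[ \int_{\mathbb{R}}\frac{(F-G)^2}{f}(x)\,dx\le2b\int_{\mathbb{R}}|(F-G)(f-g)|(x)\left(\frac{1_{\{x\ge m\}}}{\int_x^\infty f(y)\,dy}+\frac{1_{\{x<m\}}}{\int_{-\infty}^xf(y)\,dy}\right)dx. \tag{3.2} \]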

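The product $|(F-G)(f-g)|(x)\times\left(\frac{1_{\{x\ge m\}}}{\int_x^\infty f(y)\,dy}+\frac{1_{\{x<m\}}}{\int_{-\infty}^xf(y)\,dy}\right)$ is locally integrable on $\mathbb{R}$ since the first factor is integrable and the second one is locally bounded. Let $a_n<+\infty$ denote the integral of this function on $[-n,n]$.

By the Cauchy-Schwarz inequality,

\[ a_n\le\sqrt{\int_{\mathbb{R}}\frac{(f-g)^2}{f}(x)\,dx}\left(\int_{-n}^nf(F-G)^2(x)\left(\frac{1_{\{x\ge m\}}}{\int_x^\infty f(y)\,dy}+\frac{1_{\{x<m\}}}{\int_{-\infty}^xf(y)\,dy}\right)^2dx\right)^{1/2}. \tag{3.3} \]

Now, setting $\varepsilon_n=\frac{(F-G)^2(n)}{\int_n^\infty f(y)\,dy}+\frac{(F-G)^2(-n)}{\int_{-\infty}^{-n}f(y)\,dy}$, we obtain by integration by parts that for $n\ge|m|$,

\[ \begin{aligned}&\int_{-n}^nf(F-G)^2(x)\left(\frac{1_{\{x\ge m\}}}{\int_x^\infty f(y)\,dy}+\frac{1_{\{x<m\}}}{\int_{-\infty}^xf(y)\,dy}\right)^2dx\\&\quad=\left[\frac{(F-G)^2(x)}{\int_x^\infty f(y)\,dy}\right]_m^n-2\int_m^n\frac{(F-G)(f-g)(x)}{\int_x^\infty f(y)\,dy}\,dx-\left[\frac{(F-G)^2(x)}{\int_{-\infty}^xf(y)\,dy}\right]_{-n}^m+2\int_{-n}^m\frac{(F-G)(f-g)(x)}{\int_{-\infty}^xf(y)\,dy}\,dx\\&\quad=-4(F-G)^2(m)+\varepsilon_n-2\int_{-n}^n(F-G)(f-g)(x)\left(\frac{1_{\{x\ge m\}}}{\int_x^\infty f(y)\,dy}-\frac{1_{\{x<m\}}}{\int_{-\infty}^xf(y)\,dy}\right)dx\\&\quad\le2a_n+\varepsilon_n.\end{aligned} \]

Plugging this estimate into (3.3), one deduces that

\[ \forall n\ge|m|,\quad a_n\le1_{\{a_n>0\}}\left(2+\frac{\varepsilon_n}{a_n}\right)\int_{\mathbb{R}}\frac{(f-g)^2}{f}(x)\,dx. \]

Using that, according to the analysis of the boundary terms in the first integration by parts performed in the proof, $\lim_{n\to+\infty}\varepsilon_n=0$ and that $(a_n)_n$ is non-decreasing, one may take the limit $n\to\infty$ in this inequality to obtain

\[ \int_{\mathbb{R}}|(F-G)(f-g)|(x)\left(\frac{1_{\{x\ge m\}}}{\int_x^\infty f(y)\,dy}+\frac{1_{\{x<m\}}}{\int_{-\infty}^xf(y)\,dy}\right)dx\le2\int_{\mathbb{R}}\frac{(f-g)^2}{f}(x)\,dx. \]

One easily concludes with (3.2).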

4 Proof of Theorem 1.2

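Let $\nu$ be a probability measure on $\mathbb{R}^{d_1}\times\mathbb{R}^{d_2}$ with respective marginals $\nu_1$ and $\nu_2$ and such that $\chi_2(\nu|\mu_1\otimes\mu_2)<+\infty$, let $\rho$ denote the Radon-Nikodym derivative $\frac{d\nu}{d\mu_1\otimes\mu_2}$ and, for $x_1\in\mathbb{R}^{d_1}$, $\rho_1(x_1)=\int_{\mathbb{R}^{d_2}}\rho(x_1,x_2)\,d\mu_2(x_2)$. Notice that

\[ \chi_2^2(\nu|\mu_1\otimes\mu_2)=\int_{\mathbb{R}^{d_1+d_2}}(\rho(x_1,x_2)-1)^2\,d\mu_1(x_1)\,d\mu_2(x_2). \]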

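According to the tensorization property of transport costs (see for instance Proposition A.1 [6]),

\[ W_2^2(\mu_1\otimes\mu_2,\nu)\le W_2^2(\mu_1,\nu_1)+\int_{\mathbb{R}^{d_1}}1_{\{\rho_1(x_1)>0\}}W_2^2\left(\mu_2,\frac{\rho(x_1,\cdot)}{\rho_1(x_1)}\mu_2\right)d\nu_1(x_1). \tag{4.1} \]

By $T_\chi(C_1)$ satisfied by $\mu_1$, the equality $\frac{d\nu_1}{d\mu_1}(x_1)=\rho_1(x_1)=\int_{\mathbb{R}^{d_2}}\rho(x_1,x_2)\,d\mu_2(x_2)$ and Jensen's inequality, one has

\[ W_2^2(\mu_1,\nu_1)\le C_1\chi_2^2(\nu_1|\mu_1)=C_1\int_{\mathbb{R}^{d_1}}(\rho_1(x_1)-1)^2\,d\mu_1(x_1)\le C_1\chi_2^2(\nu|\mu_1\otimes\mu_2). \tag{4.2} \]

So the first term of the right-hand side of (4.1) is controlled by $\chi_2^2(\nu|\mu_1\otimes\mu_2)$. By the inequality $T_\chi(C_2)$ satisfied by $\mu_2$, when $\rho_1(x_1)>0$,

\[ W_2^2\left(\mu_2,\frac{\rho(x_1,\cdot)}{\rho_1(x_1)}\mu_2\right)\le C_2\int_{\mathbb{R}^{d_2}}\left(\frac{\rho(x_1,x_2)}{\rho_1(x_1)}-1\right)^2d\mu_2(x_2). \]

Unfortunately, there is no hope to control

\[ \int_{\mathbb{R}^{d_1+d_2}}1_{\{\rho_1(x_1)>0\}}\left(\frac{\rho(x_1,x_2)}{\rho_1(x_1)}-1\right)^2d\nu_1(x_1)\,d\mu_2(x_2)=\int_{\mathbb{R}^{d_1+d_2}}1_{\{\rho_1(x_1)>0\}}\left(\frac{\rho(x_1,x_2)}{\rho_1(x_1)}-1\right)^2\rho_1(x_1)\,d\mu_1(x_1)\,d\mu_2(x_2) \]

in terms of $\chi_2^2(\nu|\mu_1\otimes\mu_2)$ because of the possible very small values of $\rho_1(x_1)$. Therefore it is not enough to plug the latter inequality into the right-hand side of (4.1) to conclude that $\mu_1\otimes\mu_2$ satisfies a transport-chi-square inequality. So we are only going to use this inequality for $\rho_1(x_1)\ge\frac1\alpha$ where $\alpha$ is some constant larger than $1$ to be optimized at the end of the proof. Using Lemma 4.1 below with $\beta=\alpha$, one obtains

\[ \int_{\mathbb{R}^{d_1}}W_2^2\left(\mu_2,\frac{\rho(x_1,\cdot)}{\rho_1(x_1)}\mu_2\right)1_{\{\rho_1(x_1)\ge\frac1\alpha\}}\,d\nu_1(x_1)\le\alpha C_2\int_{\mathbb{R}^{d_1+d_2}}(\rho(x_1,x_2)-1)^2\,1_{\{\rho_1(x_1)\ge\frac1\alpha\}}\,d\mu_1(x_1)\,d\mu_2(x_2). \tag{4.3} \]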

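For small positive values of $\rho_1$, we use the estimation of $W_2^2\left(\mu_2,\frac{\rho(x_1,\cdot)}{\rho_1(x_1)}\mu_2\right)$ deduced from the optimal coupling for the total variation distance. If $\nu\ne\mu$, let $\varepsilon$ denote a Bernoulli random variable with parameter $p=\int_{\mathbb{R}^{d_2}}\left(\frac{\rho(x_1,x_2)}{\rho_1(x_1)}\wedge1\right)d\mu_2(x_2)$ and $(X,Y,Z)$ denote an independent $\mathbb{R}^{d_2}\times\mathbb{R}^{d_2}\times\mathbb{R}^{d_2}$-valued random vector with $X$, $Y$ and $Z$ respectively distributed according to $\frac1p\left(\frac{\rho(x_1,x_2)}{\rho_1(x_1)}\wedge1\right)d\mu_2(x_2)$, $\frac{1}{1-p}\left(1-\frac{\rho(x_1,x_2)}{\rho_1(x_1)}\right)^+d\mu_2(x_2)$ and $\frac{1}{1-p}\left(\frac{\rho(x_1,x_2)}{\rho_1(x_1)}-1\right)^+d\mu_2(x_2)$. The random variables $\varepsilon X+(1-\varepsilon)Y$ and $\varepsilon X+(1-\varepsilon)Z$ are respectively distributed according to $d\mu_2(x_2)$ and $\frac{\rho(x_1,x_2)}{\rho_1(x_1)}d\mu_2(x_2)$. As a consequence,

\[ \begin{aligned}W_2^2\left(\mu_2,\frac{\rho(x_1,\cdot)}{\rho_1(x_1)}\mu_2\right)&\le E\left((1-\varepsilon)^2|Y-Z|^2\right)=(1-p)E\left(|Y-Z|^2\right)\\&\le2(1-p)\left[E\left(\left|Y-\int_{\mathbb{R}^{d_2}}y_2\,d\mu_2(y_2)\right|^2\right)+E\left(\left|Z-\int_{\mathbb{R}^{d_2}}y_2\,d\mu_2(y_2)\right|^2\right)\right]\\&\le2\int_{\mathbb{R}^{d_2}}\left|x_2-\int_{\mathbb{R}^{d_2}}y_2\,d\mu_2(y_2)\right|^2\left|\frac{\rho(x_1,x_2)}{\rho_1(x_1)}-1\right|d\mu_2(x_2).\end{aligned} \]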

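One deduces

\[ \begin{aligned}&\int_{\mathbb{R}^{d_1}}1_{\{0<\rho_1(x_1)<\frac1\alpha\}}W_2^2\left(\mu_2,\frac{\rho(x_1,\cdot)}{\rho_1(x_1)}\mu_2\right)d\nu_1(x_1)\\&\quad\le2\int_{\mathbb{R}^{d_1+d_2}}\left|x_2-\int_{\mathbb{R}^{d_2}}y_2\,d\mu_2(y_2)\right|^2|\rho(x_1,x_2)-\rho_1(x_1)|\,1_{\{\rho_1(x_1)<\frac1\alpha\}}\,d\mu_1(x_1)\,d\mu_2(x_2)\\&\quad\le2\left(\int_{\mathbb{R}^{d_1+d_2}}\left|x_2-\int_{\mathbb{R}^{d_2}}y_2\,d\mu_2(y_2)\right|^4 1_{\{\rho_1(x_1)<\frac1\alpha\}}\,d\mu_1(x_1)\,d\mu_2(x_2)\right)^{1/2}\times\left(\int_{\mathbb{R}^{d_1+d_2}}(\rho(x_1,x_2)-\rho_1(x_1))^2\,1_{\{\rho_1(x_1)<\frac1\alpha\}}\,d\mu_1(x_1)\,d\mu_2(x_2)\right)^{1/2}\\&\quad\le2C_2\sqrt{(3d_2+2)d_2}\left(\int_{\mathbb{R}^{d_1}}\frac{\alpha^2(\rho_1(x_1)-1)^2}{(\alpha-1)^2}1_{\{\rho_1(x_1)<\frac1\alpha\}}\,d\mu_1(x_1)\right)^{1/2}\times\left(\int_{\mathbb{R}^{d_1+d_2}}\left[(\rho(x_1,x_2)-1)^2-(\rho_1(x_1)-1)^2\right]1_{\{\rho_1(x_1)<\frac1\alpha\}}\,d\mu_1(x_1)\,d\mu_2(x_2)\right)^{1/2}\\&\quad\le\frac{C_2\,\alpha\sqrt{(3d_2+2)d_2}}{\alpha-1}\int_{\mathbb{R}^{d_1+d_2}}(\rho(x_1,x_2)-1)^2\,1_{\{\rho_1(x_1)<\frac1\alpha\}}\,d\mu_1(x_1)\,d\mu_2(x_2),\end{aligned} \]

where we used the Cauchy-Schwarz inequality for the second inequality, then Lemma 4.2 below and an explicit computation of the third factor for the third inequality and last the inequality $\sqrt b\sqrt{a-b}\le\frac a2$ for any $a\ge b\ge0$.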

Inserting this estimate together with (4.2) and (4.3) into (4.1), one obtains

\[ W_2^2(\mu_1\otimes\mu_2,\nu)\le C_1\chi_2^2(\nu_1|\mu_1)+C_2\,\alpha\left(1\vee\frac{\sqrt{(3d_2+2)d_2}}{\alpha-1}\right)\chi_2^2(\nu|\mu_1\otimes\mu_2). \]

For the optimal choice $\alpha=1+\sqrt{(3d_2+2)d_2}$, one concludes that the measure $\mu_1\otimes\mu_2$ satisfies $T_\chi(C_1+C_2(1+\sqrt{(3d_2+2)d_2}))$. Exchanging the roles of $\mu_1$ and $\mu_2$ in the above reasoning, one obtains that $\mu_1\otimes\mu_2$ also satisfies $T_\chi(C_2+C_1(1+\sqrt{(3d_1+2)d_1}))$.
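The choice of $\alpha$ can be checked directly. Writing $s=\sqrt{(3d_2+2)d_2}$, the factor $\alpha\left(1\vee\frac{s}{\alpha-1}\right)$ is decreasing in $\alpha$ on $(1,1+s]$ and increasing on $[1+s,+\infty)$, so its minimum $1+s$ is reached at $\alpha=1+s$. The sketch below, with illustrative function names, evaluates the factor around this point.

```python
import math

def tensor_factor(alpha, d2):
    """Factor alpha * max(1, s / (alpha - 1)) with s = sqrt((3*d2 + 2) * d2), alpha > 1."""
    s = math.sqrt((3 * d2 + 2) * d2)
    return alpha * max(1.0, s / (alpha - 1.0))

def optimal_alpha(d2):
    return 1.0 + math.sqrt((3 * d2 + 2) * d2)

for d2 in (1, 2, 5):
    star = optimal_alpha(d2)
    # Values slightly below and above the candidate minimizer are larger.
    print(d2, tensor_factor(star - 0.1, d2), tensor_factor(star, d2), tensor_factor(star + 0.1, d2))
```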

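Lemma 4.1. For $\beta\ge\alpha>0$,

\[ \int_{\mathbb{R}^{d_1+d_2}}\left(\frac{\rho(x_1,x_2)}{\rho_1(x_1)}-1\right)^2 1_{\{\rho_1(x_1)\ge\frac1\alpha\}}\,d\nu_1(x_1)\,d\mu_2(x_2)+\beta\int_{\mathbb{R}^{d_1}}(\rho_1(x_1)-1)^2\,1_{\{\rho_1(x_1)\ge\frac1\alpha\}}\,d\mu_1(x_1)\le\beta\int_{\mathbb{R}^{d_1+d_2}}(\rho(x_1,x_2)-1)^2\,1_{\{\rho_1(x_1)\ge\frac1\alpha\}}\,d\mu_1(x_1)\,d\mu_2(x_2). \]

Proof. Developing the squares and using the definition of $\rho_1$ and the equality $d\nu_1(x_1)=\rho_1(x_1)\,d\mu_1(x_1)$, one checks that the difference between the right-hand side and the first term of the left-hand side is equal to

\[ \int_{\mathbb{R}^{d_1}}\left[\left(\beta-\frac{1}{\rho_1(x_1)}\right)\int_{\mathbb{R}^{d_2}}\rho^2(x_1,x_2)\,d\mu_2(x_2)+(1-2\beta)\rho_1(x_1)+\beta\right]1_{\{\rho_1(x_1)\ge\frac1\alpha\}}\,d\mu_1(x_1). \]

One easily concludes by remarking that the first integral is restricted to the $x_1\in\mathbb{R}^{d_1}$ such that $\frac{1}{\rho_1(x_1)}\le\alpha\le\beta$ and that

\[ \int_{\mathbb{R}^{d_2}}\rho^2(x_1,x_2)\,d\mu_2(x_2)\ge\left(\int_{\mathbb{R}^{d_2}}\rho(x_1,x_2)\,d\mu_2(x_2)\right)^2=\rho_1^2(x_1). \]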

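Lemma 4.2. If a probability measure $\mu$ on $\mathbb{R}^d$ satisfies $T_\chi(C)$, then

\[ \int_{\mathbb{R}^d}\left|x-\int_{\mathbb{R}^d}y\,d\mu(y)\right|^2d\mu(x)\le dC\quad\text{and}\quad\int_{\mathbb{R}^d}\left|x-\int_{\mathbb{R}^d}y\,d\mu(y)\right|^4d\mu(x)\le(3d+2)dC^2. \]

Proof. According to Theorem 1.1, $\mu$ satisfies $P(C)$. By spatial translation, one may assume that $\int_{\mathbb{R}^d}y\,d\mu(y)=0$. Applying the Poincaré inequality $P(C)$ to the functions $x=(x_1,\dots,x_d)\in\mathbb{R}^d\mapsto x_i$, $x\mapsto x_i^2$ and $x\mapsto x_ix_j$ with $1\le i\ne j\le d$ yields

\[ \int_{\mathbb{R}^d}x_i^2\,d\mu(x)\le C, \]

\[ \int_{\mathbb{R}^d}x_i^4\,d\mu(x)\le4C\int_{\mathbb{R}^d}x_i^2\,d\mu(x)+\left(\int_{\mathbb{R}^d}x_i^2\,d\mu(x)\right)^2\le5C^2, \]

\[ \int_{\mathbb{R}^d}(x_ix_j)^2\,d\mu(x)\le C\int_{\mathbb{R}^d}(x_i^2+x_j^2)\,d\mu(x)+\left(\int_{\mathbb{R}^d}x_ix_j\,d\mu(x)\right)^2\le2C^2+\int_{\mathbb{R}^d}x_i^2\,d\mu(x)\int_{\mathbb{R}^d}x_j^2\,d\mu(x)\le3C^2. \]

One easily concludes by summation of these inequalities.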

References

[1] Cécile Ané, Sébastien Blachère, Djalil Chafaï, Pierre Fougères, Ivan Gentil, Florent Malrieu, Cyril Roberto, and Grégory Scheffer, Sur les inégalités de Sobolev logarithmiques, Panoramas et Synthèses [Panoramas and Syntheses], vol. 10, Société Mathématique de France, Paris, 2000, With a preface by Dominique Bakry and Michel Ledoux. MR-1845806

[2] Sergey G. Bobkov, Ivan Gentil, and Michel Ledoux, Hypercontractivity of Hamilton-Jacobi equations, J. Math. Pures Appl. (9) 80 (2001), no. 7, 669–696. MR-1846020

[3] Patrick Cattiaux and Arnaud Guillin, On quadratic transportation cost inequalities, J. Math. Pures Appl. (9) 86 (2006), no. 4, 341–361. MR-2257848

[4] Joaquin Fontbona and Benjamin Jourdain, A trajectorial interpretation of the dissipations of entropy and Fisher information for stochastic differential equations, preprint HAL-00608977

[5] Nathael Gozlan, Transport entropy inequalities on the line, Electron. J. Probab. 17 (2012), no. 49, 1–18.

[6] Nathael Gozlan and Christian Léonard, Transport inequalities. A survey, Markov Process. Related Fields 16 (2010), no. 4, 635–736. MR-2895086

[7] Benjamin Jourdain and Florent Malrieu, Propagation of chaos and Poincaré inequalities for a system of particles interacting through their CDF, Ann. Appl. Probab. 18 (2008), no. 5, 1706–1736. MR-2462546

[8] Laurent Miclo, Quand est-ce que des bornes de Hardy permettent de calculer une constante de Poincaré exacte sur la droite?, Ann. Fac. Sci. Toulouse Math. (6) 17 (2008), no. 1, 121–192. MR-2464097

[9] Felix Otto and Cédric Villani, Generalization of an inequality by Talagrand and links with the logarithmic Sobolev inequality, J. Funct. Anal. 173 (2000), no. 2, 361–400. MR-1760620

[10] Svetlozar T. Rachev and Ludger Rüschendorf, Mass transportation problems. Vol. I, Probability and its Applications (New York), Springer-Verlag, New York, 1998, Theory. MR-1619170

Acknowledgments. I thank Arnaud Guillin for fruitful discussions and in particular for pointing out the implication $T_\chi(C)\Rightarrow P(C)$ and the interest of tensorization to me. I also thank the anonymous referee for suggesting how to shorten the proof of Lemma 2.1.