• 検索結果がありません。

Fenchel’s duality theorem is given in Theorem 2

N/A
N/A
Protected

Academic year: 2021

シェア "Fenchel’s duality theorem is given in Theorem 2"

Copied!
2
0
0

読み込み中.... (全文を見る)

全文

(1)

集中講義「深層学習および機械学習の数理」

レポート課題 締め切り:925

2020-9-2

Taiji Suzuki 6th Laboratory of Mathematical Informatics (Room 352, Faculty of Engineering Building 6) e-mail: [email protected] 以下の問題のうち2問以上を選択して解答してください.レポートは日本語で構いません.

1. Consider the second order polynomial kernel k(x, y) = (1 +xy)2 on R2. Construct a function having a formf(x) =m

j=1αjk(xj, x) withxjR2 andαj R(j= 1, . . . , m) such that {

f(x)0 (x∥ ≤1), f(x)>0 (x>1), forxR2

2. Derive the dual problem of support vector machine by using Fenchel’s duality theorem:

min

α∈Rn

n i=1

max {

1yi

n j=1

k(xi, xjj,0 }

+λ 2

n i=1

n j=1

αiαjk(xi, xj).

Fenchel’s duality theorem is given in Theorem 2.

3. Let k : Rd ×Rd R be a characteristic kernel that has H as its corresponding RKHS. For a probability measureP, define

mP(x) =

k(x, X)dP(X).

Assume thatk is transition invariant, i.e. there exitsψ such thatk(x, y) =ψ(xy). Then show that, for any probability measuresP andQ,

mP mQH = 0 P =Q.

Hint: Use Bochner’s theorem and the necessary and sufficient condition of positive definiteness.

4. For 1 q < , let wq := (d

i=1|wi|q)1/q for w Rd, and Hq := {f(x) = wx (x Rd) |

wq 1, wRd}. Givenx1, . . . , xnRn, its empirical Rademacher complexity is denoted by

Rˆn(Hq) = Eσ

[ sup

f∈Hq

1 n

n i=1

σif(xi)

x1, . . . , xd

] .

Now, suppose that 1p, q <satisfiesq < p. Then, show that Rˆn(Hp)d1/p1/qRˆn(Hq) wherep =p/(p1) andq=q/(q1).

5. Prove the Massart’s theorem: For a finite set of functions F ={f1, . . . , fM} where eachfi (i = 1, . . . , M) is a function fromRd to Rsatisfying supx|fi(x)| ≤R, it holds that

Rˆn(F) = Eσ

[ sup

f∈F

1 n

n i=1

σif(xi)

(xi)ni=1 ]

= Eσ

[

sup

f∈F∪{−f|f∈F}

1 n

n i=1

σif(xi)(xi)ni=1 ]

R

2 log 2M

n .

Hint: You may use the following inequalities:

1

(2)

Jensen’s inequality: exp(sEσ[g(σ1, . . . , σn)])Eσ[exp(sg(σ1, . . . , σn))] forsRandg:Rn R.

exp(maxf∈FF(f))

f∈Fexp(F(f)) forF :F R.

Hoeffding’s inequality: Eσi[exp(σia)]exp(a2/2) foraR.

6. Suppose thatx<1 (a.s.). Derive an upper bound of the Rademacher complexity of a neural network model:

F=

M j=1

αjη(ajx)|αj R, aj Rd, max

1jMajpC1, αq C2

where 1p, q≤ ∞andη(u) = max{u,0}.

7. Write your opinion about the future direction of machine learning and artificial intelligence tech- niques. Of course, you may discuss the future of deep learning.

Definition 1 (Legendre transform, convex conjugate) Let f : Rd Rbe a convex function. Its convex conjugate functionf:RdR∪ {+∞} is defined by

f(y) = sup

x∈Rd{xyf(x)}. This transformationf 7→f is called Legendre transformation.

Theorem 2 (Fenchel’s duality theorem (special case)) Letf :Rn Randg:Rd Rbe convex functions, and let ARn×d. Then it holds that

inf

x∈Rd{f(Ax) +g(x)}= inf

y∈Rn{f(y) +g(Ay)} where f, gare the convex conjugate functions off andg respectively.

2

参照

関連したドキュメント

Secondary 05A15, 05B20, 05C38, 05C50, 05C70, 11A51, 11B83, 15A15, 15A36, 52C20 Keywords: directed graph, cycle system, path system, walk system, Aztec diamond, Aztec pillow,

This is a consequence of a more general result on interacting particle systems that shows that a stationary measure is ergodic if and only if the sigma algebra of sets invariant

For example, a maximal embedded collection of tori in an irreducible manifold is complete as each of the component manifolds is indecomposable (any additional surface would have to

In the language of category theory, Stone’s representation theorem means that there is a duality between the category of Boolean algebras (with homomorphisms) and the category of

Keywords: nonlinear operator equations, Banach spaces, Halley type method, Ostrowski- Kantorovich convergence theorem, Ostrowski-Kantorovich assumptions, optimal error bound, S-order

In [9], it was shown that under diffusive scaling, the random set of coalescing random walk paths with one walker starting from every point on the space-time lattice Z × Z converges

Matroid intersection theorem (Edmonds) Discrete separation (Frank). Fenchel-type

An integral inequality is deduced from the negation of the geometrical condition in the bounded mountain pass theorem of Schechter, in a situation where this theorem does not