(1)

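Information Geometry on Positive Definite Symmetric Matrices and Some of Its Applications

Atsumi Ohara (小原 敦美)

Graduate School of Engineering, University of Fukui

"The World of Modeling, Mathematics, and Algorithms around Positive Definite Symmetric Matrices" (正定対称行列をめぐるモデリング・数理・アルゴリズムの世界), 2014115, at the National Graduate Institute for Policy Studies (政策研究大学院大学)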

(2)

1. Introduction

PD(n): the set of positive definite real symmetric matrices

Related application areas:

- matrix (in)equalities (Lyapunov, Riccati, …)
- mathematical programming (SDP)
- statistics, signal processing, time series analysis (Gaussian distributions, covariance matrices)

(3)

1. Introduction

PD(n): the set of positive definite real symmetric matrices

Related branches of mathematics:

- linear algebra, convex analysis
- Riemannian symmetric spaces
- symmetric cones (Jordan algebras)
- symplectic geometry (Siegel-Poincare)
- information (Hessian) geometry

(4)

Our interests

- stable matrices and IG [O, Amari Kybernetika93]
- standard IG [O, Suda, Amari LAA96]
- dual connections and Jordan algebras [O, Uohashi Positivity04]
- means on symmetric cones [O IEOT04]
- complexity analysis of interior-point methods (IPM) [Kakihara, O, Tsuchiya JOTA13]
- deformed IG [O, Eguchi ISM_RM05]
- update formula for quasi-Newton methods [Kanamori, O OMS13]
- group invariance and q-Gaussian pdf [O, Eguchi 13]

(5)

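2. Means and Information Geometry

Introduction (1)

- Arithmetic mean
- Geometric mean
- Harmonic mean

AGH inequality (the basis of various inequalities)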

(6)

Introduction (2)

PD(n) (= the set of n × n positive definite symmetric matrices)

Matrix inequalities (AGH ineq.)
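The formulas on this slide did not survive extraction. As a hedged illustration of what a matrix AGH inequality says, the sketch below uses the textbook definitions of the three means of A, B in PD(n): arithmetic (A + B)/2, harmonic 2(A^{-1} + B^{-1})^{-1}, and geometric A^{1/2}(A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}. It checks H ≤ G ≤ A in the Loewner order, where X ≤ Y means that Y − X is positive semidefinite. The 2 × 2 helper functions and the test matrices are assumptions made only to keep the example dependency-free; they are not taken from the slides.

```python
import math

# Minimal 2x2 helpers so that the sketch needs no third-party packages.
def add(X, Y, c=1.0):
    """Return X + c*Y."""
    return [[X[i][j] + c * Y[i][j] for j in range(2)] for i in range(2)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv(X):
    (a, b), (c, d) = X
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def sqrtm(X):
    """Square root of a 2x2 matrix with positive trace and determinant.

    By Cayley-Hamilton, sqrt(X) = (X + s*I) / t with
    s = sqrt(det X) and t = sqrt(tr X + 2*s).
    """
    (a, b), (c, d) = X
    s = math.sqrt(a * d - b * c)
    t = math.sqrt(a + d + 2.0 * s)
    return [[(a + s) / t, b / t], [c / t, (d + s) / t]]

def min_eig(S):
    """Smallest eigenvalue of a symmetric 2x2 matrix."""
    (a, b), (_, d) = S
    return 0.5 * (a + d) - math.hypot(0.5 * (a - d), b)

def arithmetic(A, B):
    return [[0.5 * v for v in row] for row in add(A, B)]

def harmonic(A, B):
    return [[2.0 * v for v in row] for row in inv(add(inv(A), inv(B)))]

def geometric(A, B):
    Ah = sqrtm(A)
    Aih = inv(Ah)
    C = matmul(matmul(Aih, B), Aih)          # A^{-1/2} B A^{-1/2}
    return matmul(matmul(Ah, sqrtm(C)), Ah)

A = [[2.0, 1.0], [1.0, 3.0]]
B = [[4.0, -1.0], [-1.0, 1.0]]
Am, Gm, Hm = arithmetic(A, B), geometric(A, B), harmonic(A, B)

# X <= Y in the Loewner order means that Y - X is positive semidefinite.
gap_HG = min_eig(add(Gm, Hm, -1.0))
gap_GA = min_eig(add(Am, Gm, -1.0))
print(gap_HG >= -1e-12, gap_GA >= -1e-12)
```

For scalars the same chain reduces to 2ab/(a + b) ≤ √(ab) ≤ (a + b)/2.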

(7)

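Purpose

1. The relation between the arithmetic, geometric, and harmonic means and information geometry

- α-means and midpoints of α-geodesics

2. Group invariance of means

- characterization of (a class of) operator monotone functions by the α-geodesic equation

Extras

- extension to symmetric cones
- doubly autoparallel submanifolds
- characterization by Jordan algebras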

(8)

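OUTLINE

- Introduction
- Means of (self-adjoint) positive operators and their properties: the theory of operator means, Kubo-Ando 80, Hiai-Kosaki 03
- Information geometry on PD(n) and midpoints of geodesics
- How means work: homogeneity and spectral decomposition
- Application examples: image processing, etc.?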

(9)

(Operator) means and operator monotone functions

[Kubo & Ando 80]

Def (axioms of a mean)

σ is a mean on PD(n), the set of positive definite symmetric matrices, if

i)

ii)

iii)

where

iv)
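The formulas of the four axioms are missing from this copy of the slides. For orientation, the axioms as they are commonly quoted from Kubo and Ando (1980) are listed below. The assignment of the labels i) to iv) is an assumption: it follows the usual ordering, which is at least consistent with the later slides calling axiom ii) "invariance" and with the "where" clause attached to iii).

```latex
\begin{align*}
\text{i)}\;   & A \le C,\ B \le D \;\Longrightarrow\; A\,\sigma\,B \le C\,\sigma\,D
              && \text{(monotonicity)} \\
\text{ii)}\;  & C\,(A\,\sigma\,B)\,C \le (CAC)\,\sigma\,(CBC),
              \text{ with equality when } C \text{ is invertible}
              && \text{(transformer inequality)} \\
\text{iii)}\; & A_n \downarrow A,\ B_n \downarrow B \;\Longrightarrow\;
              A_n\,\sigma\,B_n \downarrow A\,\sigma\,B
              && \text{(continuity from above)} \\
\text{iv)}\;  & I\,\sigma\,I = I
              && \text{(normalization)}
\end{align*}
```

Here $A_n \downarrow A$ means $A_1 \ge A_2 \ge \cdots$ and $A_n \to A$. On PD(n) the equality case of ii) extends to congruence by any invertible $G$: $G (A\,\sigma\,B) G^T = (G A G^T)\,\sigma\,(G B G^T)$.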

(10)

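Means and operator monotone functions (2)

Thm [KA80]

σ is a mean on PD(n) if and only if there exists an operator monotone function f(t) such that σ can be written as

$A \,\sigma\, B = A^{1/2} f(A^{-1/2} B A^{-1/2}) A^{1/2}$

Here f(X) is defined through the spectral decomposition of X:

$X = \sum_i \lambda_i P_i \;\Rightarrow\; f(X) = \sum_i f(\lambda_i) P_i$

f(t) and σ are in one-to-one correspondence (f is the representing function of σ)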
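To make the correspondence between σ and f concrete, here is a minimal sketch that evaluates the standard Kubo-Ando form A σ B = A^{1/2} f(A^{-1/2} B A^{-1/2}) A^{1/2} with f(X) computed from the spectral decomposition. The representing functions (1 + t)/2, 2t/(1 + t) and √t are the usual ones for the arithmetic, harmonic and geometric means. The 2 × 2 restriction, the helper names and the test matrices are assumptions chosen for illustration only.

```python
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv(X):
    (a, b), (c, d) = X
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def funm(X, f):
    """Apply f to a symmetric 2x2 matrix via X = l1*P1 + l2*P2."""
    (a, b), (_, d) = X
    mid = 0.5 * (a + d)
    rad = math.hypot(0.5 * (a - d), b)
    if rad < 1e-14:                      # X is a multiple of the identity
        return [[f(mid), 0.0], [0.0, f(mid)]]
    l1, l2 = mid + rad, mid - rad
    # Orthogonal projectors onto the two eigenspaces (P1 + P2 = I).
    P1 = [[(a - l2) / (l1 - l2), b / (l1 - l2)],
          [b / (l1 - l2), (d - l2) / (l1 - l2)]]
    P2 = [[1.0 - P1[0][0], -P1[0][1]], [-P1[1][0], 1.0 - P1[1][1]]]
    return [[f(l1) * P1[i][j] + f(l2) * P2[i][j] for j in range(2)]
            for i in range(2)]

def mean(A, B, f):
    """A sigma B = A^{1/2} f(A^{-1/2} B A^{-1/2}) A^{1/2}."""
    Ah = funm(A, math.sqrt)
    Aih = inv(Ah)
    C = matmul(matmul(Aih, B), Aih)
    off = 0.5 * (C[0][1] + C[1][0])      # remove round-off asymmetry
    C = [[C[0][0], off], [off, C[1][1]]]
    return matmul(matmul(Ah, funm(C, f)), Ah)

def max_abs_diff(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))

A = [[2.0, 1.0], [1.0, 3.0]]
B = [[4.0, -1.0], [-1.0, 1.0]]

arith = mean(A, B, lambda t: (1.0 + t) / 2.0)
harm = mean(A, B, lambda t: 2.0 * t / (1.0 + t))
geom = mean(A, B, math.sqrt)

# Direct formulas for comparison.
arith_direct = [[(A[i][j] + B[i][j]) / 2.0 for j in range(2)] for i in range(2)]
iA, iB = inv(A), inv(B)
harm_direct = inv([[(iA[i][j] + iB[i][j]) / 2.0 for j in range(2)]
                   for i in range(2)])

print(max_abs_diff(arith, arith_direct))
print(max_abs_diff(harm, harm_direct))
# The geometric mean G is the positive definite solution of G A^{-1} G = B.
print(max_abs_diff(matmul(matmul(geom, inv(A)), geom), B))
```

All three printed deviations should be at round-off level.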

(11)

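Means and operator monotone functions (3)

Def: f(t) is operator monotone if,

for arbitrary Hermitian matrices A, B (of arbitrary size),

the following holds:

$A \le B \;\Rightarrow\; f(A) \le f(B)$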
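Operator monotonicity is strictly stronger than ordinary monotonicity on the real line. A standard illustration, sketched here with matrices chosen for this example rather than taken from the slides, is that f(t) = t² is increasing on t > 0 but not operator monotone, whereas f(t) = √t is. The code checks A ≤ B and then tests the sign of the smallest eigenvalue of f(B) − f(A).

```python
import math

def sub(X, Y):
    return [[X[i][j] - Y[i][j] for j in range(2)] for i in range(2)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sqrtm(X):
    """Square root of a 2x2 positive definite matrix (Cayley-Hamilton form)."""
    (a, b), (c, d) = X
    s = math.sqrt(a * d - b * c)
    t = math.sqrt(a + d + 2.0 * s)
    return [[(a + s) / t, b / t], [c / t, (d + s) / t]]

def min_eig(S):
    """Smallest eigenvalue of a symmetric 2x2 matrix."""
    (a, b), (_, d) = S
    return 0.5 * (a + d) - math.hypot(0.5 * (a - d), b)

A = [[2.0, 1.0], [1.0, 2.0]]
B = [[3.0, 1.0], [1.0, 2.0]]      # B - A = diag(1, 0) is positive semidefinite

gap = min_eig(sub(B, A))                                  # A <= B
gap_square = min_eig(sub(matmul(B, B), matmul(A, A)))     # f(t) = t^2
gap_sqrt = min_eig(sub(sqrtm(B), sqrtm(A)))               # f(t) = sqrt(t)

print("A <= B:", gap >= -1e-12)
print("A^2 <= B^2:", gap_square >= -1e-12)
print("sqrt(A) <= sqrt(B):", gap_sqrt >= -1e-12)
```

Here B² − A² = [[5, 1], [1, 0]] has determinant −1, so it is indefinite even though A ≤ B.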

(12)

Means and operator monotone functions (4)

Note that f(t) is the mean of 1 and t in the scalar case: $f(t) = 1 \,\sigma\, t$

(13)

Means and operator monotone functions (5)

Relation between inequalities and the representing functions of means: when

(14)

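Information geometry of PD(n)

[O Suda Amari 96]

On PD(n) we define

- a Riemannian metric: distances and angles between infinitesimally close points
- a connection: how the space is curved

There are many ways to define them; here we consider a geometric structure that reflects various dualities (information geometry).

Next: a crash course in differential geometry (4 slides)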

(15)

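Tangent space

(16)

Riemannian metric: an inner product on the tangent space

(17)

Connection (or parallel translation) (1): the relation between tangent spaces

(18)

Connection (or parallel translation) (2): the relation between tangent spaces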

(19)

Information geometry of PD(n) (2)

Basis matrices of the set of symmetric matrices (a vector space)

Coordinates

Potential function

Riemannian metric

α-connection
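The formulas for the potential and the metric are missing here. Assuming the standard choice that the later slides refer to as the standard case V = −log, namely the potential ψ(P) = −log det P, the Riemannian metric is its Hessian, g_P(X, Y) = tr(P^{-1} X P^{-1} Y). The sketch below checks this identity numerically with a central finite difference. The matrices P, X, Y, the step size and the helper names are assumptions made for illustration.

```python
import math

def det(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

def inv(X):
    d = det(X)
    return [[X[1][1] / d, -X[0][1] / d], [-X[1][0] / d, X[0][0] / d]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(X):
    return X[0][0] + X[1][1]

def psi(P):
    """Assumed standard potential: psi(P) = -log det P."""
    return -math.log(det(P))

def shift(P, X, Y, s, t):
    """Return P + s*X + t*Y."""
    return [[P[i][j] + s * X[i][j] + t * Y[i][j] for j in range(2)]
            for i in range(2)]

def hessian_fd(P, X, Y, h=1e-4):
    """Mixed second directional derivative of psi at P along X and Y."""
    return (psi(shift(P, X, Y, h, h)) - psi(shift(P, X, Y, h, -h))
            - psi(shift(P, X, Y, -h, h)) + psi(shift(P, X, Y, -h, -h))) / (4 * h * h)

def metric(P, X, Y):
    """g_P(X, Y) = tr(P^{-1} X P^{-1} Y)."""
    Pi = inv(P)
    return trace(matmul(matmul(Pi, X), matmul(Pi, Y)))

P = [[2.0, 1.0], [1.0, 3.0]]          # a point of PD(2)
X = [[1.0, 0.5], [0.5, -1.0]]         # symmetric tangent vectors at P
Y = [[0.3, -0.2], [-0.2, 0.7]]

print(hessian_fd(P, X, Y), metric(P, X, Y))
```

The two printed numbers should agree up to the finite-difference error.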

(20)

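Information geometry of PD(n) (3)

Dual geometric structure of PD(n)

Note 1: $\nabla^{(0)}$ is the Levi-Civita connection (self-dual)

Note 2: in information geometry the dual connections $\nabla^{(\pm 1)}$ are the most important

The geometric structure is invariant under congruence transformations and under inversion ( )

α-geodesic: the solution of the following system of differential equations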

(21)

Example: the one-dimensional case

Potential function

Riemannian metric

α-connection

α-geodesic equation
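The one-dimensional formulas did not survive extraction, so the following is a sketch under stated assumptions rather than a transcription of the slide. Take the potential ψ(x) = −log x on x > 0, which gives the metric g(x) = 1/x². With the sign convention in which α = 1 is the connection that is flat in the coordinate x (papers differ on this sign), the α-connection coefficient is −(1 − α)/x and the α-geodesic equation reads x'' − (1 − α) x'²/x = 0. Its solution with x(0) = a, x(1) = b makes x^α linear in s (and log x linear in s for α = 0). The code checks the equation by finite differences and evaluates the midpoints.

```python
import math

def geodesic(a, b, alpha, s):
    """Closed-form alpha-geodesic from a to b under the assumed convention."""
    if alpha == 0:
        return a ** (1.0 - s) * b ** s
    return ((1.0 - s) * a ** alpha + s * b ** alpha) ** (1.0 / alpha)

def residual(a, b, alpha, s, h=1e-4):
    """Finite-difference value of x'' - (1 - alpha) * x'^2 / x at parameter s."""
    x0 = geodesic(a, b, alpha, s)
    xp = geodesic(a, b, alpha, s + h)
    xm = geodesic(a, b, alpha, s - h)
    d1 = (xp - xm) / (2.0 * h)
    d2 = (xp - 2.0 * x0 + xm) / (h * h)
    return d2 - (1.0 - alpha) * d1 * d1 / x0

a, b = 1.0, 4.0
worst = max(abs(residual(a, b, al, s))
            for al in (-1.0, 0.0, 0.5, 1.0)
            for s in (0.25, 0.5, 0.75))
print("largest residual:", worst)

# Midpoints s = 1/2: arithmetic, geometric and harmonic means of a and b.
midpoints = {al: geodesic(a, b, al, 0.5) for al in (1.0, 0.0, -1.0)}
print(midpoints)
```

Under this convention the midpoint is the power mean ((a^α + b^α)/2)^{1/α}; with the opposite sign convention the roles of α = 1 and α = −1 are exchanged.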

(22)

Information geometry of PD(n) (4)

Writing the solution as P(x(s)) = P(s) and imposing the boundary conditions P(0) = A, P(1) = B, we obtain

More generally

(23)

Means on PD(n) and midpoints of α-geodesics

[O 04]

Theorem: the α-power mean is the midpoint of the α-geodesic.

In particular

Midpoints of α-geodesics and means
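The explicit curve is not recoverable from this copy. As a hedged sketch, assume the sign convention in which α = 1 is the connection that is flat in P itself (the slides may use the opposite sign). Then the curve P(s) = A^{1/2} ((1 − s) I + s C^α)^{1/α} A^{1/2}, with C = A^{-1/2} B A^{-1/2} and the factor C^s for α = 0, is a straight line in P for α = 1, a straight line in P^{-1} for α = −1, and has the geometric mean as its midpoint for α = 0. The code below verifies these three cases and the boundary conditions for 2 × 2 matrices; all names and test data are illustrative assumptions.

```python
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv(X):
    (a, b), (c, d) = X
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def funm(X, f):
    """Apply f to a symmetric 2x2 matrix through its spectral decomposition."""
    (a, b), (_, d) = X
    mid = 0.5 * (a + d)
    rad = math.hypot(0.5 * (a - d), b)
    if rad < 1e-14:
        return [[f(mid), 0.0], [0.0, f(mid)]]
    l1, l2 = mid + rad, mid - rad
    P1 = [[(a - l2) / (l1 - l2), b / (l1 - l2)],
          [b / (l1 - l2), (d - l2) / (l1 - l2)]]
    P2 = [[1.0 - P1[0][0], -P1[0][1]], [-P1[1][0], 1.0 - P1[1][1]]]
    return [[f(l1) * P1[i][j] + f(l2) * P2[i][j] for j in range(2)]
            for i in range(2)]

def alpha_geodesic(A, B, alpha, s):
    """P(s) = A^{1/2} ((1-s) I + s C^alpha)^{1/alpha} A^{1/2} (assumed convention)."""
    Ah = funm(A, math.sqrt)
    Aih = inv(Ah)
    C = matmul(matmul(Aih, B), Aih)
    off = 0.5 * (C[0][1] + C[1][0])      # remove round-off asymmetry
    C = [[C[0][0], off], [off, C[1][1]]]
    if alpha == 0:
        f = lambda t: t ** s
    else:
        f = lambda t: ((1.0 - s) + s * t ** alpha) ** (1.0 / alpha)
    return matmul(matmul(Ah, funm(C, f)), Ah)

def lin(X, Y, s):
    """(1 - s) X + s Y."""
    return [[(1.0 - s) * X[i][j] + s * Y[i][j] for j in range(2)]
            for i in range(2)]

def max_abs_diff(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))

A = [[2.0, 1.0], [1.0, 3.0]]
B = [[4.0, -1.0], [-1.0, 1.0]]
s = 0.3

err_flat_P = max_abs_diff(alpha_geodesic(A, B, 1.0, s), lin(A, B, s))
err_flat_Pinv = max_abs_diff(inv(alpha_geodesic(A, B, -1.0, s)),
                             lin(inv(A), inv(B), s))
G = alpha_geodesic(A, B, 0.0, 0.5)
err_geometric = max_abs_diff(matmul(matmul(G, inv(A)), G), B)
err_ends = max(max_abs_diff(alpha_geodesic(A, B, 0.5, 0.0), A),
               max_abs_diff(alpha_geodesic(A, B, 0.5, 1.0), B))
print(err_flat_P, err_flat_Pinv, err_geometric, err_ends)
```

At s = 1/2 the curve gives A^{1/2} ((I + C^α)/2)^{1/α} A^{1/2}, a matrix power mean that reduces to the arithmetic, geometric and harmonic means for α = 1, 0, −1 under this convention.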

(24)

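Means on PD(n) and midpoints of α-geodesics (2)

Note 1: even a point other than the midpoint satisfies the axioms of a mean.

- Uhlmann's interpolating means (two parameters)

Note 2: the Finsler geometry approach

- Corach, Porta, Recht 93: geometric mean
- Fujii 94: arithmetic and harmonic means
- defined as shortest curves for three kinds of Finsler distances
- Kamei 94: no means other than these can be obtained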

(25)

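A closer look

Key point 1: homogeneity

PD(n) is a homogeneous space (the group of congruence transformations acts transitively)

i)

ii)

The geometric structure is uniform everywhere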

(26)

Key point 1: homogeneity

Axiom ii): invariance

I

(27)

Key point 1: homogeneity

Axiom ii): invariance

Focusing on the formula for the mean

I

where

(28)

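Key point 2: spectral decomposition

IσZ has the same eigenvectors as Z

$Z = \sum_i \lambda_i P_i$: the $P_i$ are orthogonal projection matrices. Hence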

(29)

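What follows when combined with the theorem

Relation between operator monotone functions and geodesics

Let be the α-geodesic connecting I and Z; then

the geodesic is confined to

operator monotone function = solution of the one-dimensional geodesic equation:

(30)

What follows when combined with the theorem

Relation between operator monotone functions and geodesics

(1) Boundary conditions

(2) α-geodesic equation (restated)

Corollary: the representing function of the α-power mean is characterized as the solution of (1) and (2).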

(31)

3.1 Deformed IG on PD(n)

- The standard case: $V(s) = -\log s$, $\varphi(P) = -\log \det P$

Def. V-potential: $\varphi^{(V)}(P) = V(\det P)$, $V(s): \mathbb{R}_+ \to \mathbb{R}$

Purpose:

- Their different and/or common geometric structures
- Geometry of multivariate elliptical pdf.

(32)

Def.

Rem. The standard case V = -log:

$\nu_1(s) = -1, \quad \nu_k(s) = 0 \ (k \ge 2)$

Prop.1 (convexity conditions)

The Hessian matrix of the V-potential is positive definite on PD(n) if and only if

(33)

Assumption: the convexity conditions hold.

- Riemannian metric is

=

Here,

X, Y in sym(n,R) ~ tangent vectors at P

Rem. The standard case V = -log:

$g_P(X, Y) = \mathrm{tr}(P^{-1} X P^{-1} Y)$

(34)

Dual affine connections

Let $\nabla$ be the canonical flat connection on PD(n). The V-potential defines the following dual connection with respect to the Riemannian metric:

(35)

Divergence

- a variant of relative entropy,
- Pythagorean type decomposition

Dually flat structure on PD(n) induced by the V-potential

(36)

Group invariance of the structure on PD(n)

Linear transformation on PD(n), the congruence transformation:

$\tau_G P = G P G^T, \quad G \in GL(n, \mathbb{R})$

the differential:

$(\tau_G)_* X = G X G^T$

(37)

Group invariance of the structure on PD(n)

Linear transformation on PD(n), the congruence transformation:

$\tau_G P = G P G^T, \quad G \in GL(n, \mathbb{R})$

the differential:

$(\tau_G)_* X = G X G^T$

Invariance

metric:

connections:

and the same for where

(38)

Prop.

The largest group that leaves the dualistic structure invariant is

$\tau_G$ with $G \in SL(n, \mathbb{R})$

except in the standard case.

Rem. The standard case:

$\tau_G$ with $G \in GL(n, \mathbb{R})$

Rem. The power potential of the form:

has a special property.
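The difference between SL(n, R) and GL(n, R) can be checked numerically. This is a sketch under the assumption that the V-potential has the form φ(P) = V(det P), as the surviving fragments suggest, and that the metric is its Hessian. It compares the Hessian at (P; X, Y) with the Hessian at the transformed data (G P G^T; G X G^T, G Y G^T). The deformed V used below, V(s) = −log s + 0.1 s, is a hypothetical example picked only to break the scaling symmetry of −log; it is not taken from the slides and is not claimed to satisfy the convexity conditions.

```python
import math

def det(X):
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def congruence(G, X):
    """tau_G X = G X G^T."""
    Gt = [[G[j][i] for j in range(2)] for i in range(2)]
    return matmul(matmul(G, X), Gt)

def shift(P, X, Y, s, t):
    return [[P[i][j] + s * X[i][j] + t * Y[i][j] for j in range(2)]
            for i in range(2)]

def hessian(V, P, X, Y, h=1e-4):
    """Finite-difference Hessian of phi(P) = V(det P) along X and Y."""
    phi = lambda Q: V(det(Q))
    return (phi(shift(P, X, Y, h, h)) - phi(shift(P, X, Y, h, -h))
            - phi(shift(P, X, Y, -h, h)) + phi(shift(P, X, Y, -h, -h))) / (4 * h * h)

def transported(V, G, P, X, Y):
    """Hessian evaluated after applying tau_G to the point and both vectors."""
    return hessian(V, congruence(G, P), congruence(G, X), congruence(G, Y))

V_standard = lambda s: -math.log(s)
V_deformed = lambda s: -math.log(s) + 0.1 * s    # hypothetical non-standard V

P = [[2.0, 1.0], [1.0, 3.0]]
X = [[1.0, 0.5], [0.5, -1.0]]
Y = [[0.3, -0.2], [-0.2, 0.7]]
G_sl = [[2.0, 1.0], [1.0, 1.0]]      # det G = 1
G_gl = [[2.0, 0.0], [1.0, 1.0]]      # det G = 2

for name, V in (("standard", V_standard), ("deformed", V_deformed)):
    base = hessian(V, P, X, Y)
    print(name,
          abs(transported(V, G_sl, P, X, Y) - base),
          abs(transported(V, G_gl, P, X, Y) - base))
```

The reason is that det(G P G^T) = (det G)² det P: a factor (det G)² only shifts −log by a constant, but it changes a general V unless det G = ±1.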

(39)

3.2 Application to multivariate statistics

- Non-Gaussian distributions (generalized exponential family)
- Robust statistics: beta-divergence, machine learning, and so on
- Nonextensive statistical physics: power distributions, generalized (Tsallis) entropy, and so on

(40)

U-model and U-divergence

U-model Def.

Given a convex function U on R, set u = U'.

The U-model is a family of elliptic pdfs specified by P:

: normalizing const.

(41)

Rem. When U = exp, the U-model is the family of Gaussian distributions.

U-divergence:

Natural closeness measure on the U-model,

Rem. When U = exp, the U-divergence is the Kullback-Leibler divergence (relative entropy).

(42)

Example: beta-model and beta-divergence (1)

Beta-model

For and

q-exponential and q-logarithmic functions
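The definitions on this slide are missing, including the relation between β and q. As a hedged reference point, the sketch below uses the q-exponential and q-logarithm in their common Tsallis form, exp_q(x) = [1 + (1 − q) x]^{1/(1 − q)} and ln_q(y) = (y^{1 − q} − 1)/(1 − q); the slide's own parameterization may differ. It checks that the two functions are mutually inverse where the bracket is positive and that q → 1 recovers exp and log.

```python
import math

def exp_q(x, q):
    """q-exponential in the common Tsallis form; q = 1 gives exp."""
    if q == 1.0:
        return math.exp(x)
    base = 1.0 + (1.0 - q) * x
    if base <= 0.0:
        if q < 1.0:
            return 0.0                 # usual cut-off convention for q < 1
        raise ValueError("exp_q is not defined here for q > 1")
    return base ** (1.0 / (1.0 - q))

def log_q(y, q):
    """q-logarithm, the inverse of exp_q on its range; q = 1 gives log."""
    if q == 1.0:
        return math.log(y)
    return (y ** (1.0 - q) - 1.0) / (1.0 - q)

# Round trip log_q(exp_q(x)) = x wherever the bracket stays positive.
round_trip_error = max(abs(log_q(exp_q(x, q), q) - x)
                       for q in (0.5, 1.5)
                       for x in (-0.5, 0.3, 1.0))

# Limit q -> 1: the deformed functions approach exp and log.
q_near_one = 1.0 - 1e-6
limit_error = max(abs(exp_q(0.3, q_near_one) - math.exp(0.3)),
                  abs(log_q(2.0, q_near_one) - math.log(2.0)))

print(round_trip_error, limit_error)
```

For q ≠ 1 the q-exponential is a power function, which is the source of the power-law tails of the q-Gaussian family.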

(43)

Example: beta-model and beta-divergence (2)

Beta-divergence

(44)

IG induced from divergences

Divergence induces IG structure.

where


(45)

Relation between the U- and V-geometries

Prop.

The IG on PD(n) induced from the U-divergence coincides with the one derived from the following V-potential function:

(46)

Group invariance for the power potentials

Prop.

If V is of the power form, then

1) Orthogonality is GL(n)-invariant.

2) The dual affine connections derived from the power potentials are GL(n)-invariant.

Hence, the projections with respect to both dual connections are GL(n)-invariant.

(47)

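Thm [O & Eguchi 13]

The IG on the beta-model induced from the beta-divergence coincides with the IG on PD(n) induced from the power potential.

Implication: statistical inference on the beta-model using the beta-divergence is GL(n)-invariant.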

(48)

Main References

A. Ohara, N. Suda and S. Amari, Dualistic Differential Geometry of Positive Definite Matrices and Its Applications to Related Problems, Linear Algebra and its Applications, Vol. 247, 31-53 (1996).

A. Ohara, Geodesics for Dual Connections and Means on Symmetric Cones, Integral Equations and Operator Theory, Vol. 50, 537-548 (2004).

A. Ohara and S. Eguchi, Geometry on positive definite matrices and V-potential function, Research Memorandum No. 950, The Institute of Statistical Mathematics, Tokyo, July (2005).

T. Kanamori and A. Ohara, A Bregman Extension of quasi-Newton updates I: An Information Geometrical Framework, Optimization Methods and Software, Vol. 28, No. 1, 96-123 (2013).

A. Ohara and S. Eguchi, Group Invariance of Information Geometry on q-Gaussian Distributions Induced by Beta-Divergence, Entropy, Vol. 15, 4732-4747 (2013).
