• 検索結果がありません。

Probabilistic Inference by Minimizing the TRW Free Energy using CCCP IBIS2010 poster

N/A
N/A
Protected

Academic year: 2018

シェア "Probabilistic Inference by Minimizing the TRW Free Energy using CCCP IBIS2010 poster"

Copied!
1
0
0

読み込み中.... (全文を見る)

全文

(1)

Probabilistic Inference by minimizing the TRW

Free Energy using CCCP

Yu Nishiyama

1

, Xingyao Ye

2

, Alan L. Yuille

2

1

Laboratory for Integrated Theoretical Neuroscience, RIKEN, Brain Science Institute, Wako, JAPAN

2

Department of Statistics, UCLA, Los Angeles, USA

Introduction

◮ Probabilistic inference – computing marginals and/or the most probable

assignment (MAP estimate) given a large-scale graphical model.

◮ Many applications – e.g., computer vision, protein folding, genetic analysis,

neural science.

◮ No exact inference algorithms as belief propagation in tree graphs when

graphical models have cycles.

◮ Goal – developing approximate algorithms for graphs with cycles, favoring

algorithms which are as accurate as possible and are guaranteed to converge.

◮ Efficient TRW-BP algorithms [Wainwright et. al. 2005]

⊲ guaranteed to compute the global minimum of TRW free energy, though

not convergent.

◮ TRW-GP [Globerson and Jaakkola 2007]

guaranteed to converge by optimizing the convex dual of a modified TRW

free energy.

◮ sum-TRW-S [Meltzer et al 2009]

TRW-BP is convergent if messages are updated with right update order.

Purpose

◮ Developing a set of probabilistic inference algorithms by minimizing the

TRW free energy using CCCP.

Markov Random Field & Free Energy

◮ Markov Random Fields (MRFs) defined over graph G = (V, E)

p(x; θ) = exp  

X

i∈V

θi(xi) + X ij∈E

θij(xi, xj) − Φ(θ)  

 .

◮ The true singleton and pairwise marginals p = {{pi(xi)}, {pij(xi, xj)}} are

searched within the local polytope.

L(G) =   

 

b ≥ 0

P xi

bi(xi) = 1, ∀i ∈ V

P xj

bij(xi, xj) = bi(xi), P

xi

bij(xi, xj) = bj(xj), ∀xi, ∀xj, ∀ij ∈ E.

  

 

.

◮ The beliefs b which minimize the free energy yield approximations to the

marginals b∗ ≃ p. The free energies have the following form;

F (b) = −b · θ − S(b),

where −b · θ = P

i∈V

i(xi)ibi + P ij∈E

ij(xi, xj)ib

ij and S(b) is the entropy term.

Tree-ReWeighted (TRW) Entropy

◮ The TRW entropy is given by the convex combination of tree entropies

ST (b):

STRW (b) = X

T∈T

ρ(T)ST(b) = X i∈V

Si(bi) − X

ij∈E

ρijIij(bij).

⊲ T is a spanning tree on graph G. T is the set of all spanning trees. ρ(.) is

a probability distribution over the spanning trees. ρij is the edge

appearance probability of ij ∈ E.

◮ Edge appearance vector ρe = {ρij} is chosen within the spanning tree

polytope:

I

T (G) =

(

ρe ∈ IR|E||ρe = X T∈T

ρ(T )v(T) )

.

⊲ v(T) ∈ {0, 1}|E| is an indicator vector such that [v(T)]ij = 1, ij ∈ T, and

otherwise 0.

◮ The TRW entropy can be rewritten over the local polytope L(G) as

¯

STRW(b) = X

i∈V

ciSi(bi) + X

ij∈E

cijSij(bij). (1)

⊲ cij = ρij and ci = 1 − P

j∈Ni ρij. Ni is the set of neighboring nodes of node i.

◮ The TRW entropy STRW(b) is concave over the local polytope L(G).

◮ The Bethe-approximation entropy corresponds to ρe = 1, though 1 /TI (G).

Hence the Bethe-approximation entropy is not guaranteed to be concave if the graph is loopy.

Concave-Convex Procedure(CCCP)

◮ Let Fvex be the set of all convex functions and Dec(F ) be the set of all pairs

of two convex functions such that the difference of the two convex functions

is equal to the objective function F (b):

Dec(F ) = (fvex, gvex) ∈ F2

vex |fvex(b) − gvex(b) = F (b) .

◮ Theorem

Given a series of decompositions of an objective function F (b) into convex

and concave functions, {(fvext , gvext ) ∈ Dec(F )}, the iterative algorithm

∂fvext (bt+1) = ∂gvext (bt) (2 )

is guaranteed to monotonically decrease the objective function F (b).

◮ Let CCCP(F ; f 0

vex, gvex0 ) denote the specific CCCP algorithm under a

decomposition (fvex0 , gvex0 ). The family of CCCP algorithms is given by the

collection:

CCCP(F ) = {CCCP(F ; fvex, gvex)|(fvex, gvex) Dec(F )} . (3)

A CCCP(F ; f 0

vex, gvex0 ) can be

expanded into a family of

CCCP algorithms CCCP(F ).

geometrical meaning

time t time t+1 time t+2

TRW-CCCP

◮ Let fvex be a convex free energy which has the form of

fvex(b; u) = −b · θ − u · S, (4)

where u · S = P

i∈V

uiSi(bi) + P

ij∈E

uijSij(bij), and u = ({ui}, {uij}) is chosen in a set U

to ensure that fvex(b; u) is convex.

◮ We decompose the TRW free energy, using two convex free energies

fvex(b; ˜u) and gvex(b; u), as

FTRW (b) = −b · θ + fvex(b; ˜u) − gvex(b; u), (5)

where u˜ = c + u. The set in which CCCP algorithms are guaranteed to

converge is

UCCCP =

n

u ∈ IR|V|+|E||u ∈ U, u˜ ∈ Uo. (6)

◮ Resulting TRW-CCCP algorithms are

Algorithm TRW-CCCP; A family of CCCP to minimize the TRW free energy.

1. Initialization: bα(1)(xα) = bα(0)(xα) ∝ exp h

θα(xα)

cα i

α ∈ V ∪ E

2. Choose a free vector u in the set UCCCP. 3. Inner loop:

bi(xi) ∝ h bij(xi)

bi(xi)

i

˜ uij ˜ ui+˜uij

bi(xi)

bij(xi, xj) ∝ h bij(xi)

bi(xi)

i−˜ ui˜

ui+˜uij

bij(xi, xj) ij ∈ E

4. Outer loop:

bα(t+2)(xα) ∝

h

bα(t+1)(xα)

bα(t)(xα)

iuu˜αα

bα(t+1)(xα) α ∈ V ∪ E

5. Output: A set of approximate marginals b∗(≃ p) when outer loop converges. Otherwise, set beliefs to b(t) = b(t+1), b(t+1) = b(t+2) and go to 2.

Numerical Results: Ising Spin Models

◮ In our experiments,

TRW-CCCP under any

setting of parameter u˜

computed the same

pseudo-marginals b∗ as that

from TRW-BP. Larger

settings of u made

TRW-CCCP local and

smaller settings of u made

TRW-CCCP non-local.

◮ Two dimensional grid (10x10 size)

5 10 15 20 25

‑1.26 ‑1.24 ‑1.22 ‑1.2 ‑1.18 ‑1.16 ‑1.14 ‑1.12 ‑1.1

Mixed(10x10grid)

step(outer loop)

TRW Free Energy

 

 

tilde‑u=0.1

tilde‑u=0.2

tilde‑u=0.5

tilde‑u=1

tilde‑u=2

tilde‑u=5

tilde‑u=10

0 200 400 600 800 1000

0

0.001 0.002 0.003 0.004 0.005 0.006 0.007 0.008 0.009 0.01

Mixed(5x5grid), Inner Loop=1

step

difference of beliefs

 

 

TRW‑BP

tilde‑u=1

tilde‑u=2

20 40 60 80 100 120 140 160

20

40 60 80

100 120

140

Mixed(10x10grid)

step(outer loop)

Number of inner loop

 

 

tilde‑u=0.1

tilde‑u=0.2

tilde‑u=0.5

tilde‑u=1

tilde‑u=2

tilde‑u=5

tilde‑u=10

100 200 300 400 500 600 700 800 900 1000

0

0.001 0.002 0.003 0.004 0.005 0.006 0.007 0.008 0.009 0.01

Mixed(5x5grid), Inner Loop=2

step

difference of beliefs

 

 

TRW‑BP

tilde‑u=1

tilde‑u=2

Conclusions

◮ We developed a new set of convergent CCCP algorithms which are

guaranteed to find the global minimum of the TRW free energy. The

algorithms include many free parameters (u ∈ IR|V|+|E|) which could be altered

to change the convergence rate. TRW-CCCP seems more stable for difficult problems such that energy functions have large random weights and

参照

関連したドキュメント

Department of Orthopedic Surgery Okayama University Medical School Okayama Japan.. in

In Section 2, we introduce the infinite-wedge space (Fock space) and the fermion operator algebra and write the partition function in terms of matrix elements of a certain operator..

On figures 2 and 6, the minimum, maximum and two of the intermediate free energies discussed in subsections 3.5 and 6.5 are shown for sinusoidal and exponential histories with n =

With this goal, we are to develop a practical type system for recursive modules which overcomes as much of the difficulties discussed above as possible. Concretely, we follow the

Synopsis The Guidelines for Design and Construction of Grouting for Prestressed Concrete Structures, established in 2005 by the Japan Prestressed Concrete Institute, have been

Amount of Remuneration, etc. The Company does not pay to Directors who concurrently serve as Executive Officer the remuneration paid to Directors. Therefore, “Number of Persons”

Using the CMT analysis for aftershocks (M j >3.0) of 2004 Mid Niigata earthquake (M j 6.8) carried out by National Research Institute for Earth Science and Disaster Prevention

Abstract: Using the CMT analysis for local events (M>3.5) carried out regularly by National Research Institute for Earth Science and Disaster Prevention (NIED), the spatial variation