Sparse Additive Matrix Factorization for Robust PCA and Its Generalization
Shinichi Nakajima (Nikon)
Masashi Sugiyama (Tokyo Tech.)
S. Derin Babacan (Illinois Univ.)
✤ Model-induced regularization
✤ Formulation of SAMF
✤ Mean update (MU) algorithm
✤ Experimental results
Sparse additive matrix factorization (SAMF) is a Bayesian model that allows arbitrary sparsity design (like Group-LASSO). It makes use of the model-induced regularization of variational Bayesian matrix factorization.
Contents:
Model-induced regularization
Bayesian matrix factorization (Probabilistic PCA)
Observations: V ∈ R^{L×M}
Parameters: A ∈ R^{M×H} (loading matrix), B ∈ R^{L×H} (hidden variables)
Hyperparameters: C_A = diag(c_{a_1}², …, c_{a_H}²), C_B = diag(c_{b_1}², …, c_{b_H}²)
Likelihood: p(V|A,B) ∝ exp(−‖V − BA^⊤‖²_Fro / (2σ²))
Priors: p(A|C_A) ∝ exp(−tr(A C_A^{−1} A^⊤)/2), p(B|C_B) ∝ exp(−tr(B C_B^{−1} B^⊤)/2)
Approximate V with the low-rank matrix BA^⊤.
[Salakhutdinov&Mnih:NIPS2008, Tipping&Bishop:JRSS1999]
[Figure: VB vs. ML estimates of each singular-value coefficient γ_h; VB shrinks γ_h toward zero, ML does not.]
Model induced regularization (MIR)
[Nakajima&Sugiyama:JMLR2011]
Let V = ∑_{h=1}^H γ_h ω_{b_h} ω_{a_h}^⊤ be the SVD of the observed matrix V.
The VB solution is written as Û^VB = ∑_{h=1}^H γ̂_h^VB ω_{b_h} ω_{a_h}^⊤, where γ̂_h^VB = 0 whenever the observed singular value γ_h falls below a threshold.
Factorization induces implicit regularization and low-rank sparsity.
A one-dimensional example (1×1 matrix, H = 1):

MF + Gauss prior:
p(y|a,b) ∝ exp(−(y − ab)²/2)
p(a|c_a) ∝ exp(−a²/(2c_a²)), p(b|c_b) ∝ exp(−b²/(2c_b²))

Linear + ARD prior:
p(y|u) ∝ exp(−(y − u)²/2)
p(u|c_u) ∝ exp(−u²/(2c_u²))

Correspondence: u = ab (a → u/b, b → c_u, with c_a = 1, c_b → ∞).
The linear model reduces to ML estimation when c_u → ∞, while the factorized model remains regularized unless b → ∞.
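The one-dimensional example can be checked numerically. The sketch below runs plain VB coordinate ascent on q(a)q(b) for y = ab + noise with unit noise variance; the function name, initialization, and prior widths are illustrative choices, not taken from the slides.

```python
def vb_1d(y, ca=1.0, cb=100.0, n_iter=1000):
    """VB coordinate ascent for y = a*b + N(0,1) noise, with priors
    a ~ N(0, ca^2), b ~ N(0, cb^2) and q(a) = N(mu_a, s_a), q(b) = N(mu_b, s_b)."""
    mu_a, s_a = 1.0, 1.0   # illustrative initialization
    mu_b, s_b = 1.0, 1.0
    for _ in range(n_iter):
        s_a = 1.0 / (mu_b**2 + s_b + 1.0 / ca**2)   # posterior variance of a
        mu_a = s_a * y * mu_b                        # posterior mean of a
        s_b = 1.0 / (mu_a**2 + s_a + 1.0 / cb**2)
        mu_b = s_b * y * mu_a
    return mu_a * mu_b  # VB estimate of u = ab (ML would simply give u = y)

# Even with a nearly flat prior on b (cb large), the VB estimate of u
# is shrunk below the ML value, and small observations collapse to zero:
print(vb_1d(4.0))   # strictly between 0 and 4
print(vb_1d(0.5))   # (near) zero
```

This reproduces the MIR effect: the factorization alone, not the prior width, produces the shrinkage and the exact-zero regime.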
Sparse additive matrix factorization (SAMF)
Idea: By using factorization, we will construct a Bayesian variant of
Group-LASSO, where no manual parameter tuning is required.
Sparse matrix factorization (SMF) term

Construction of an SMF term:
1. Partition the matrix into parts.
2. Rearrange each part into a matrix.
3. Factorize each rearranged matrix.
Partition-wise sparsity is expected!

Formal expression: U = G(U'^(1), …, U'^(K)) with U'^(k) = B^(k) A^(k)⊤, such that each partition corresponds to one factorized part, where G(·) is a one-to-one map from the entries of {U'^(k)} to the entries of U.
Examples of SMF term (∘: Hadamard product):
- Row-wise sparse term: the l-th row of U is b_l a^(l)⊤ (one factor pair per row).
- Column-wise sparse term: the m-th column of U is b^(m) a_m (one factor pair per column).
- Element-wise sparse term: U = B ∘ A, i.e., U_{lm} = B_{lm} A_{lm}.
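The two extreme cases above can be sketched in a few lines; shapes and values here are hypothetical, chosen only to show how zeroing one factor entry prunes the corresponding group.

```python
import numpy as np

rng = np.random.default_rng(0)
L, M = 4, 5

# Element-wise SMF term: U = B ∘ A (Hadamard product); a zero in B (or A)
# removes a single entry of U.
A_elem = rng.standard_normal((L, M))
B_elem = rng.standard_normal((L, M))
U_elem = B_elem * A_elem

# Row-wise SMF term: the l-th row of U is b_l * a^(l); driving the scalar
# b_l to zero prunes the whole l-th row at once.
a_rows = rng.standard_normal((L, M))
b_rows = rng.standard_normal(L)
b_rows[2] = 0.0                      # the third row is switched off
U_row = b_rows[:, None] * a_rows

assert np.allclose(U_row[2], 0.0)    # entire row pruned by one factor
```

The sparsity pattern (entry, row, column, or any partition) is thus a pure modeling choice in how the factors are attached.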
Sparse additive matrix factorization (SAMF) model

The SAMF model is a sum of SMF terms: V = ∑_{k=1}^K U^(k) + E.
✤ An extension of robust PCA [Candes+:CoRR2009, Babacan+:IEEETSP2012].
Ex: robust PCA corresponds to V = U^low-rank + U^element-wise + E.
✤ A unified framework of factorization: a single theory is applied to all SMF terms, giving a useful algorithm called mean update (MU).
Formulation of SAMF

Observations: V ∈ R^{L×M}
Parameters: {A^(k), B^(k)} for k = 1, …, K
Hyperparameters: {C_A^(k), C_B^(k)} for k = 1, …, K
Likelihood: p(V | {A^(k), B^(k)}) ∝ exp(−‖V − ∑_{k=1}^K U^(k)‖²_Fro / (2σ²))
Priors: Gaussian priors on A^(k) and B^(k), as in plain MF, for each k.
Bayesian inference is intractable!

Variational Bayesian approximation:
Trial distribution r(Θ); free energy F(r) = ⟨log( r(Θ) / p(V, Θ) )⟩_{r(Θ)}, minimized s.t. a constraint on r.
The Bayes posterior is the unconstrained minimizer of the free energy.
Imposing the decomposability constraint r(Θ) = ∏_k r(A^(k)) r(B^(k)) makes the posterior tractable.
Standard VB Iteration

The standard VB procedure gives an iterative algorithm for the means and covariances of r(A^(k)) and r(B^(k)), with additional update rules for empirical variational Bayes (hyperparameters) and for unknown noise variance σ².

Note 1: The standard VB algorithm gives a local solution.
Note 2: [Babacan+:IEEETSP2012] derived an almost identical algorithm for robust PCA.
Mean Update (MU) Algorithm

The VBMF solution for a plain MF model can be computed analytically, so a partial analytic solution is available for SAMF:

Theorem 1: Given {U^(k')}_{k'≠k}, the VB posterior on (A^(k), B^(k)) coincides with the VB posterior of a plain MF model, for each k.

Mean update (MU) algorithm:
1. U^(k) is updated for each k, given {U^(k')}_{k'≠k}, by the analytic VBMF solution.
2. σ² is updated, given {U^(k)}_{k=1}^K.
3. Iterate 1 and 2.

It can also be applied to empirical VB. [Nakajima+:NIPS2012]
Each step gives the global solution of its subproblem, but there is no guarantee of the global optimality of MU; its advantage will be shown experimentally.
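The alternating structure of MU can be sketched for the robust PCA (`LE') case. This is only a structural sketch: where the real algorithm plugs in the analytic VB solution of a plain MF model at each step, the code substitutes simple shrinkage operators (singular-value and entry-wise soft-thresholding), so `lam_lr` and `lam_el` are hypothetical knobs rather than the paper's tuning-free estimators.

```python
import numpy as np

def mu_robust_pca(V, lam_lr=1.0, lam_el=1.0, n_iter=50):
    """Mean-update-style loop for V ~ U_lowrank + U_element + noise.

    Sketch only: soft-thresholding stands in for the analytic per-term
    VB solution, and the noise-variance update (step 2) is omitted.
    """
    U_lr = np.zeros_like(V)
    U_el = np.zeros_like(V)
    for _ in range(n_iter):
        # Update the low-rank term, given the element-wise term.
        R = V - U_el
        Uu, s, Vt = np.linalg.svd(R, full_matrices=False)
        U_lr = Uu @ np.diag(np.maximum(s - lam_lr, 0.0)) @ Vt
        # Update the element-wise term, given the low-rank term.
        R = V - U_lr
        U_el = np.sign(R) * np.maximum(np.abs(R) - lam_el, 0.0)
    return U_lr, U_el
```

Substituting the analytic VBMF solution for each shrinkage step, and adding the σ² update, would recover the actual MU iteration.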
Experimental results
MU vs Standard VB (with artificial data)

MU tends to give a better local solution (lower free energy and smaller reconstruction error) than standard VB, although the computation time for each step is a bit longer.
`LRCE'-SAMF: U_LRCE = U^low-rank + U^row-wise + U^column-wise + U^element-wise

[Figure: free energy F/(LM) vs. iteration, and reconstruction error ‖Û − U*‖²_Fro/(LM) (overall / low-rank / row / column / element), for MeanUpdate, Standard(iniML), Standard(iniMLSS), Standard(iniRan).]
`LE'-SAMF (robust PCA): U_LE = U^low-rank + U^element-wise

[Figure: free energy F/(LM) vs. iteration, and reconstruction error ‖Û − U*‖²_Fro/(LM) (overall / low-rank / row / column / element), for MeanUpdate, Standard(iniML), Standard(iniMLSS), Standard(iniRan).]
Free energy and reconstruction error (blue: MU, red: standard VB).
Computation time:
[Figure: time (sec) vs. iteration for MeanUpdate, Standard(iniML), Standard(iniMLSS), Standard(iniRan), in both settings.]
FG/BG video separation

Robust PCA (`LE'-SAMF) [Candes+:CoRR2009, Babacan+:IEEETSP2012]:
V = U^BG + U^FG + E = U^low-rank + U^element-wise + E,
where V is the (pixels × time) video matrix: the background is modeled as low-rank, and the foreground as element-wise sparse.
Segmentation-based SAMF (sSAMF)

Sparsity design based on pre-segmentation: the element-wise foreground term is replaced by a segment-wise term U^segment-wise over the (pixels × time) matrix.
Efficient graph-based image segmentation [Felzenszwalb&Huttenlocher04] takes only ~0.05 sec per image.
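A segment-wise term can be sketched as one gate coefficient per pre-computed segment, so that zeroing the gate removes the whole segment from the foreground. The labels, shapes, and function name below are hypothetical, for illustration only.

```python
import numpy as np

def segment_term(values, labels, gates):
    """Segment-wise sparse term for one frame (sketch).

    values: (pixels,) per-pixel magnitudes
    labels: (pixels,) segment id of each pixel, from pre-segmentation
    gates:  (num_segments,) per-segment coefficients; gate 0 kills a segment
    """
    return gates[labels] * values

labels = np.array([0, 0, 1, 1, 1, 2])
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
gates = np.array([0.0, 1.0, 0.0])   # only segment 1 stays in the foreground
u = segment_term(values, labels, gates)
assert np.allclose(u, [0, 0, 3, 4, 5, 0])
```

This is exactly the partition-rearrange-factorize recipe with the partitions chosen by the segmentation rather than by rows, columns, or single entries.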
Segmentation-based SAMF: V = U^low-rank + U^segment-wise + E
`LE'-SAMF vs sSAMF (with the Caviar dataset)

`LE'-SAMF: U^low-rank + U^element-wise
sSAMF: U^low-rank + U^segment-wise

The segment-wise sparse term works better than the element-wise term!

[Figure: (a) Original, (b) BG (`LE'-SAMF), (c) FG (`LE'-SAMF), (d) Segmented, (e) BG (sSAMF), (f) FG (sSAMF).]