• 検索結果がありません。

Introduction to Sequential Data assimilation methods:

N/A
N/A
Protected

Academic year: 2021

シェア "Introduction to Sequential Data assimilation methods:"

Copied!
23
0
0

読み込み中.... (全文を見る)

全文

(1)

Introduction to Sequential Data assimilation methods:

Their mathematical basis and recent development

Tomoyuki Higuchi

Research Organization of Information and Systems The Institute of Statistical Mathematics/JST CREST

1/42

Deductive Approach

Inductive Approach

TESD: Four Kinds of Methodology of Science

T:Theory E:Experiment

S:Simulation D 䠖 Massive Data Analysis

(Axle)

Data Assimilation

Drive Force for Science

2/42 Red colorindicates a slide used in the last year’s presentation

(2)

3

What is Data Assimilation?

• Emerging subject in meteorology and oceanography.

• Methodology to synthesize numerical simulation model and observed data

– Simulation model can not reflect real physics accurately.

e.g.Accurate weather forecast needs good initial conditions.

• Uncertainty in the model (boundary condition, initial condition, unknown parameters, unknown dynamics...) exists.

– Observation data have some physical/budgetary restrictions.

Correct variables in numerical simulation model using observation data. = Data Assimilation

Simulation model Observation data

3/42

Objects of Data Assimilation from a viewpoint of Meteorology and Oceanography

[1] To produce the best (better) initial condition for forecasting. It is actually realized in the real weather forecast (ex., Japan Meteorological Agency).

[2] To find the best (better) boundary condition in constructing a simulation model. This procedure includes a setting of appropriate boundary conditions necessary for dealing with a coupled phenomena.

[3] To attain an optimal parametervector that appears in an empirical law (scheme) employed for describing complicated phenomena with the different time and spatial scales. A validationof the empirically given values is

regarded as this problem.

[4] To inter/extrapolate (estimate) an physical quantity at times and locations without observations based on a numerical simulation model. This procedure is called “a generation of re-analysis dataset (product)”. This dataset is used to discover a new scientific findings by general geophysical researchers.

[5] To conduct an experiment with a virtual observation network and perform a sensitivity analysis in an attempt to construct an effective observation

(3)

5

Outline

• Mathematical basis and Bayesian computation

• Sequential data assimilation

• Ensemble-based nonlinear filtering method

䞊Particle filter (PF)

• Advanced methods for PF

䞊Merging PF 䞊Meta PF

䞊PF with GPGPU

• Conclusions

5/42

Construction of Simulation Model

Observation points and observed variables are limited.

) , , ( i i i

i T U V

[

i 1 i

physical variable vector is assigned at each grid point. temperature

Wind vector

1

[i

[i

) , ( t 1 t

t F x v

x

) ( t1

t F x

x

vt

w

w 2 t cx x

䠄simplified meteorological model around Japan䠅

PDE : Partial differential equation

(time varying)

6/42

(4)

777

Past and

Present Present

and Future State

x t

State Vector 䠖 Contact point between past and future

) ( t 1

t F x

x

Simulation Model

State of time t-1

State of timet

7/42

From one path to PDF (=Probability Distribution Function)

Simulation model

1 t t

t F x

x

1

| t t

t F x

x

Simulation model with uncertainty

x

t

x

t

t t1

t F x

x

p G

xt

p

t t

t h x

y |

Relation to observed data yt

xt yt

p |

Estimate

DA

xt

xt yy

p |

1 t

t x

F

)

| (

t t

p x y

Conditional distribution

Next slide

(5)

999

Conditional and Joint Probabilities

9/42

B

A ˆ

{ ) A (

p Probability of A

) { B A, (

p Probability of A and B

{ ) B

| A (

p Probability of B given A

Conditional Probability Joint Probability

Total: 100

# of consumers to buy a bag of coffee grounds : 30

# of consumers to buy milk: 60

# of consumers to buy a coffee bag and milk: 10

A=1: Buy a bag of coffee grounds

=0: Not buy B=1: Buy milk

=0: Not buy

A ‰ B

20 10 50

30 60

20 )

10 20 (

10 30

) 10 1 A

| 1 B ( 100, ) 10 1 B , 1 A ( 100, ) 30 1 A

( p p

p

¦¦

4

1 6

1

1 ) B , A (

i j

j i

p

10 10 10

Marginalization

10/42

¦

6

1

) B

, 3 A ( )

3 A (

j

j p

p p ( A i , B j )

Joint probability

A=3

B=1 B=6

A

A=1: Yahoo

=2: Google

=3: Microsoft

=4: Others

B=1: NEC, =2: Fujitsu, =3: Dell

=4: Toshiba, =5: Apple , =6: Others Others

B

possible

¦

B '

) B B , A ( )

A A (

j j

j i

i p

p

PC Web Search Engine

³ ( A, B ) B

) A

( p d

p

(6)

11 11 11

Bayes’ Theorem

11/42

B

B A ‰

20 10 50

30 60

20

¦

¦

Ÿ



) A ( A)

| (B

) A ( A)

| (B )

B A, (

) B A, ( )

B (

B) , ) (A

B

| A (

) A ( ) A

| B ( B) , ) (A

A (

B) , ) (A

A

| B (

A A

p p

p p

p p p

p p

p p

p p p p

䚷䚷 䚷䚷䚷䚷

䛾䛸䜛ྍ⬟ᛶ

50 10

10 100

70 70 50 100

30 30

10 100 30 30 10

0) (A 0) A

| 1 (B 1) (A 1) A

| 1 (B

1) (A 1) A

| 1 (B ) 1 B

| 1 A (

˜

˜

˜

p p

p p

p p

p

It is

easy

to

calculate.

A=1: Buy a bag of coffee grounds (Search Engine type) B=1: Buy milk (PC type)

Generative Model, Inversion with Bayes’ theorem, and Data Assimilation

y x x y p( | )

Data distribution :Forward Posterior distribution:

Inverse

y x y x p( | ) x

x p( )

Prior distribution :Forward

Build a generative model and Use Bayes’ theorem Simulation Fitness of Simulation to Data

x

Latent Variables䠖

) , (

) ( )

| (

) (

) ( )

| ) (

| (

y x p

x p x y p

y p

x p x y y p

x p

˜ v

{ ˜

䚷䚷䚷䚷

Bayes’ Theorem

䚷䚷䚷䚷

xsimulation variables

ydata

Prior dist.

Data dist.likelihood function

Posterior dist.

(7)

) ,

(

) ,

( 1

t t

t

t t

t

w x

h y

v x

f

x

Data Assimilation in Generalized State Space Model

State VectorSimulation variables

Stochastic simulation model

Observation model

Measurement model

map nonlinear :

L LŸ

t t

t t

G G

!!

' '

1 step time simulation :

ns observatio of

time sampling :

14/42

G t

' t

Missing value (Outlier)

time integration

%D\HVLDQ HVWLPDWLRQ

Prior Belief about values of x Posterior

Improved knowledge about values of x

13 Likelihood

Feasibility of realization of y for given x

) ( )

| ( )

|

( x y p y x p x

p v ˜

13/42

Cyclical structure

(8)

x

1

x

2

x

3

) ( x

1

x p

)

| ( X x

T

p

:

x

1

X {

x

t

) (

t

x p x

x

T

State Vector and Concatenated State Vectors

15 15

Chain Structure Graphical Model

x

0

y

2

y

t

x

2

x

t

Latent Vector

Observable Vector

y

1

Observation model

x

1

System model

^ `

^

x0,yx11,,,,yxtt,,,,yxTT

`

)

| (

]) ,

, , [

| (

]) ,

, , [

| (

]) ,

, , [

| (

: 1 :

1

2 1 :

1

2 1 :

1

1 2

1 1

: 1

T T

T T

t

t t

t

t t

t

p

y y

y p

y y

y p

y y

y p

y x

y x

y x

y x

{

{

{

Today’s economic situation given

yesterday’s stock market data Today’s economic situation estimated by the stock market data up to today Today’s economic situation analyzed by

using all available data when we look back on the today in future

Suppose a statistical inference problem on a daily economic status given daily stock market data

15/42

(9)

Find X=[x1, …, xT ] that maximize

y

T

p X |

1:

For t= 1, …,T :

estimate p

xt | y1:T

x1 xt1 xt xt1 xT

y1 yt1 yt yt1 yT

x1 xt1 xt xt1 xT

y1 yt1 yt yt1 yT

Variational DA method Sequential DA method

•Adjoint method (4DVar)

•Representer method

•Kalman filter (KF), smoother

•Extended KF (EKF)

•Ensemble KF (EnKF)

•Particle filter (PF)

Two ways of DA method

Smoothing dist.

17/42

Optimization and Statistical Inference

x

t

t t t

t

t t

d p

p f

x x x

x

x x

˜

Ÿ

³ ( )

ˆ

) ( )

(

^ ( ) `

ˆ max ) (

X X

X

f f

Dimension of Xis huge

^ `

^ ` ^

f T

`

T f

T

h f exp ( )/

/ ) ( exp

/ ) ( ) exp

( ș

ș

ș ș v

¦

18/42

(10)

Recursive formula

)

|

di ti d it (

Conditional Distribution

)

| (

)

|

(

t 1t: 1

x

p x p

y y

predictive density:

filter density:

Today’s economic situation given yesterday’s stock market data Today’s economic situation

)

| (

)

| (

: 1

: 1

T t

t t

x p

x p

y y

filter density:

smoother density:

estimated by the stock market data up to today

Today’s economic situation analyzed by using all available data when we

)

| (

t 1T:

p y

y

)

| ( x

j 1:k

p y j

y g

look back on the today in future

)

| ( x

t 1

y

1:t 1

p (

t1

| y

1:tprediction1

) p ( x

t

| y

1t: 1

) y

1:t

{ { y

1

, , y

t

} p

k p ( x

t1

| y

1:t

)

)

| (

t

y

1:t1

p

)

| ( x

t

y

1t:

p

filtering

)

| ( x

n1

y

1:n1

p

19 19

smoothing 19

)

| ( x

t 1

y

1:T

p

)

| ( x

t

y

1T:

p p ( x

T

| y

1T:

)

19/42

Prediction

1 :

1

)

| ( x

t

y

t

p

1 1

: 1 1 1

: 1

)

| ,

(

|

³

t t t t

t t

dx y

x x p y

1 1

: 1 1 1

: 1

1

, ) ( | )

|

(

³ p x

t

x

t

y

t

p x

t

y

t

dx

t

³

)

| ( ) ,

|

(xt xt1 y1:t1 p xt xt1

p Markov property䠄䠍䠅

1 1

: 1 1

1

) ( | )

|

(

³ p x

t

x

t

p x

t

y

t

dx

t

Filter pdf at time t-1

(11)

filtering

t

t

y

x

p ( |

1:

) Posterior, Belief

t t t

t t t

y y x p

y y x p

)

| , (

) ,

| (

1 : 1

1 : 1

t t

t t t

y y p

y y p

)

| ( )

| (

)

| (

)

| , (

1 : 1

1 : 1

Markov Property䠄䠎䠅

˜

t t

t t t

t t

y y p

y x p y

x y p

)

| (

)

| ( ) ,

| (

1 : 1

1 : 1 1

:

1 p(yt |xt,y1:t1) p(yt |xt)

Markov Property䠄䠎䠅

˜

t t t

t

y y p

y x p x y p

)

| (

)

| ( )

|

(

1: 1

˜

t t

t t

t t

y x p x y p

y y p

)

| ( )

| (

)

| (

1 : 1 1

: 1

³ p ( y

t

| x

t

) ˜ p ( x

t

| y

1:t1

) dx

t 21 19-2/42

Smoothing

: 1

)

| (

)

| (

³

T t

dx y

x x p y x p

1 :

1 1 :

1 1

1 :

1 1

)

| ( ) ,

| (

)

| , (

³

³

˜

t T t

T t

t

t T t

t

dx y

x p y

x x p

dx y

x x p

1 :

1 1 :

1 1

1 :

1 1 :

1 1

)

| ( ) ,

| (

)

| ( ) ,

| (

³

³

˜

t T t

t t

t

t T t

T t

t

dx y

x p y

x x p

y p

y p

1 :

1 1 :

1

1

( | )

)

| (

)

| , (

³ p p x

t

x x

t

y y

t

˜ p x

t

y

T

dx

t

1 :

1 : 1

1 1

: 1

: 1 1

)

| ) (

,

| ( )

| (

)

| (

³

t t

˜

t t t

˜

t T t

t t

dx y

x y p

x x p y

x p

y x p

1

1 :

1 1 :

1 1

)

| ) (

| ) (

| (

)

| ) (

| (

³

³

t t

t T t

t t

x d x p

y y p

x p

Filter Dist. Smoothing Dist.

22 1

: 1 1 :

1 1 : 1

1

( | )

)

| (

)

| ) (

|

(

³

˜

˜

t T t

t t

t t t

t

p x y dx

y x p y p

x p

Prediction Dist.

19-3/42

(12)

Sequential Data Assimilation

)

|

( x

t1

y

1:t1

p p ( x

t

| y

1:t1

) y

t Simulation

)

| ( x

t

y

1:t

p p ( x

t1

| y

1:t

)

Simulation

) , , ,

| ( )

|

(xi y1:k p xi y1 y2 yk

p

䠄 䠅

Estimate PDF of state vector or its moments (mean, variance, …) sequentially on each observation

1

y

t

xt

1

y

t

20/42

– : All variables in simulation model – : All observed variables

– : Stochastic part to represent uncertainty of model (boundary condition, …)

– : Observation error

– : Normally Gaussian

&KDOOHQJLQJSUREOHP +XJHGLPHQVLRQDQGLQYHUVLRQ

t t t t

t t t t

w x H y

x v x F x

, )

(

1 0

ytt

x

t

t w

v ,

• Data Assimilation = Estimation problem of state vector 䠖

vt

wt (system model)

(observation model)

dimension

x

t:

10 4 ~10 6 y

t :102~105

x

t

t t

t

t

h x w

y

or

) dim(

)

dim( x

t

!! y

t

x0: Initial condition

(13)

25 25 25

Numerical representation of distribution

True distribution

)

| ( ),

| ( ),

|

( x

t

y

1:t 1

p x

t

y

1:t

p x

t

y

1:T

p

Monte Carlo approximation

Represent pdf by the actual realizations.

N: # of particles

> @

>

(| )

@

) 2 (

| ) 1 (

|

| :

1

) (

1

| )

2 (

1

| ) 1 (

1

| 1

| 1

: 1

, , ,

)

| (

, , ,

)

| (

N t t t

t t t t

t t

t

N t t t

t t

t t

t t

t

x x

x X

y x p

x x

x X

y x p

{

#

{

#

22/42

^ ` x

t(|it)1 iN1

^ x

t(i)1|t1

`

iN1

) , ( t(i)1|t 1 t(i)

t x v

f

x

t

Time

State

Prediction Step (Common in EnKF and PF)

) 1 (

1

| 1 t

xt ) 2 (

1

| 1

t

xt

) (

1

| 1 N t

xt

) 2 (

1

|t

xt

) 1 (

1

|t

xt ) (

1

| N t

xt ) (

1

| i t

xt

Prediction step

: ensemble member of predictive PDF : ensemble member of filtered PDF simulation

1 t

23/42

(14)

^ ` x

t(|it)1 iN1

x

t

Time

State

Filtering Step of PF

) 2 (

1

|t

xt

) 1 (

1

|t

xt ) (

1

| N t

xt

^ ` x

t(|it) iN1

Observation:

y

t

Filtered by a resample proportional to likelihood Calculate

likelihood for each particle likelihood

Filtering step

¸¸

¸

¹

·

¨¨

¨

©

§

¦

j

j t t t

i t t i t

t

t p y x

x y p

)

| (

)

| (

) (

1

| ) (

1 ) |

(

Z|

: ensemble member of predictive PDF : ensemble member of filtered PDF

) 2 (

|t

Zt

) 1 (

|t

Zt ) (

| N t

Zt

24/42

㽢 㽢 㽢 㽢

1

| 1 1

: 1

1| )

( t t #Xtt p x y

1

| 1 :

1 )

|

( t t #Xtt p x y

t t t

t X

p(x |y1:)# | timet-1

timet

y

t

) (i

vt 䝅䝇䝔䝮䝜䜲䝈䝪䞊䝹

(˜) Ft

) (

1

| i t t

Z

N

i t t

) 1

(

Z|

25/42

(15)

Family of particle filters

• Kalman filter (EKF: Extended Kalman filter)

• EnKF: Ensemble Kalman filter (Evensen 1994)

• Particle filter

– SIR filter (e.g., Gordon et al. 1993, Kitagawa 1993)䖩 – Gaussian particle filter (e.g., Kotecha and Djuric 2003) – Kernel filter (e.g., Hurzeler and Kunsh 1998)

– Merging particle filter (MPF)

It encounters a problem called ‘degeneration’ in applying to high-dimensional models. (i.e., the diversity of an ensemble is lost after repeating resampling procedures. )y p g p g p )

ͤGaussian PF: 1) Each particle in filtered ensemble is drawn from a Gaussian function with the mean and covariance of the forecast ensemble. 2) It requires high computational cost due to a factorization of a high-dimensional covariance matrix in generating Gaussian samples.

29

g g g p

26/42

Particle filter

(SIR: Sequential Importance Resampling)

A posterior (filtered) ensemble is obtained by resampling the forecast ensemble with weights of likelihood. Thus, an ensemble member is duplicated in the filtered ensemble according to its likelihood.

27/42

(16)

31

Merging particle filter (MPF)

28/42

We draw samples from the forecast ensemble with weights of , and obtain an ensemble: . A subset from this samples satisfies

because it is a filtered ensemble obtained using normal PF.

Each number of a filtered ensemble is generated as a weighted sum of n samples from the sample set as:

n N u w

i

n N u

n N u

MPF algorithm (1)

(17)

MPF algorithm (2)

In order to ensure that the newly generated ensemble

approximately preserves the mean and covariances of the filtered PDF the merging weights

D

are set to satisfy

PDF, the merging weights are set to satisfy

1

2

1 ( 3 such that 0 for all )

n n

n j

D D d D z

¦ ¦

D

j

h h i l b

1 1

1 , 1 ( 3 such tha t 0 for all )

j j

j j j

n j

D D d D z

¦ ¦

where each is a real number.

Then, a new ensemble approximation of the filtered PDF

D

j

|

1:

(

t t

) p x y

is obtained as

30/42

)ORZFKDUWRI3)

A concept of the “Islands” in GA is similar, but different.

31/42

(18)

0HWDSDUWLFOHILOWHU

'3)'LVWULEXWHGSDUWLFOHILOWHU

Inter-node resampling:

Resampling between ensembles (not ensemble member) each of which consists of many particles in a node. This ensemble is called

“super-particle”.

When a weight for any super- particles (i.e., nodes) exceeds 0.3, the inter-node resampling procedure is applied.

32/42

] [

1

| j t

t

:

] [

1

| j t

t

: ]

)[

( 1

| j i t

t

Z

5HVXOW

FRPSXWDWLRQDOWLPH

1 core

Each core has 4,098particles.

2 cores 4 cores 64 cores

Each core has 256 particles.

16 times

(19)

Inside GPU: 128 240 parallel computing

GPGPU’s power is equivalent to a computer with 128 1.2GHz processors

3(memory 3( memory

3( memory 3( memory

3( memory 3(memory

3(memory 3(memory

4Kb=1,000 variables

Shared

memory 4,000ኚᩘ

3( 3URFHVVLQJ(OHPHQW On tip memory

Device memory 䠐GB

# of multiprocessor䠍䠒~䠏䠌

$FFHVVVSHHG

register

[ 3(

0XOWLSURFHVVRU

34/42

㽢 㽢 㽢 㽢

1

| 1 1

: 1

1| )

( t t #Xtt p x y

1

| 1 :

1 )

|

( t t #Xtt p x y

t t t

t X

p(x |y1:)# | timet-1

timet

timet+1

y

t

) (i

vt 䝅䝇䝔䝮䝜䜲䝈䝪䞊䝹

(˜) Ft

) (

1

| i t t

Z

N

i t t

) 1

(

Z|

෌ᥖ/42

(20)

㻻㼜㼠㼑㼞㼛㼚 㻗㻌㼀㼑㼟㼘㼍㻌㻗㻌㻯㼘㼑㼍㼞㻿㼜㼑㼑㼐㻬 㻴㼕㼓㼡㼏㼔㼕㻌㻸㼍㼎㻚

Host machine (CPU)

HP xw9400 Workstation

Dual-Core AMD Opteron(tm) Processor 2220

Tesla (GPU/Cuda)

nVIDIA S1070 1U GPU computing server (1.6GiB/1.6GHz/410GB/s)

ClearSpeed e 710)

35/42

SISmeetsGPGPU.

• SISonGPGPUdesignedforparameterestimation.

– SimulationiscarriedoutonGPGPU.

– ParameterestimationiscarriedoutonCPU.

particles PF

Opteron 2220 1core

PF

Opteron2220 1core +GPGPU(Tesla C870)

SIS

Opteron2220 1core +GPGPU(Tesla C870) 100,000,000

(1)

8Days (6.8105sec)

12Hours (4.5104sec)

3Hours (1.0104sec)

㽢䠍䠑 㽢䠒䠓

1,000,000,000 (10)

2.6Months? 5Days? 28Hours

(1.0105sec) (Nakamura et al., 2009)

(21)

HPC of our group

1HKDOHP

;HRQ

# of Cores

2000

200

37/42

# of Cores

Next-Generation of Supercomputer in Japan at Kobe

䕔 Grand Challenge:

-- Nanotech

(Institute for Molecular Science)

-- Life Science

(RIKEN)

Japanese Government will spend more than 1 billion US$ for this national project. It has more than 600,000cores.

38/42

(22)

43

Research projects in progress by our group

࣭ Coupled Ocean-Atmosphere model

䞉 3D structure of ring current

࣭ Tsunami model

࣭ Ocean tide

࣭ Genome informatics

࣭ Marketing (with agent simulations)

39/42

TIPS:Achoiceofthedataassimilationmethods

Reductionofdegreeoffreedom

Degree of Freedom of the system

Information provided by data at each time- integration step of the simulation.

Nonlinearity

Small Large

Genome Informatics

Oceanandatmospheric science

(23)

Contact

Email: [email protected] Homepage:

http://daweb ism ac jp/

http://daweb.ism.ac.jp/

42/42

参照

関連したドキュメント

Section 5 considers the analog of the right-tail upper bound for the minimal eigenvalue in the β -Laguerre ensemble, this case holding the potential for some novelty granted

In this case, the extension from a local solution u to a solution in an arbitrary interval [0, T ] is carried out by keeping control of the norm ku(T )k sN with the use of

In the second computation, we use a fine equidistant grid within the isotropic borehole region and an optimal grid coarsening in the x direction in the outer, anisotropic,

He thereby extended his method to the investigation of boundary value problems of couple-stress elasticity, thermoelasticity and other generalized models of an elastic

In section 2 we present the model in its original form and establish an equivalent formulation using boundary integrals. This is then used to devise a semi-implicit algorithm

Theorem 4.8 shows that the addition of the nonlocal term to local diffusion pro- duces similar early pattern results when compared to the pure local case considered in [33].. Lemma

Kilbas; Conditions of the existence of a classical solution of a Cauchy type problem for the diffusion equation with the Riemann-Liouville partial derivative, Differential Equations,

A Darboux type problem for a model hyperbolic equation of the third order with multiple characteristics is considered in the case of two independent variables.. In the class