Sequential Data Assimilation:
Online Information Fusion Platform for Simulation and Observation Data
Tomoyuki Higuchi
Research Organization of Information and Systems, The Institute of Statistical Mathematics / JST CREST*
* Japan Science and Technology Agency, Core Research for Evolutional Science and Technology
TESD: Four Kinds of Methodology of Science
[Figure: the four methodologies T (Theory), E (Experiment), S (Simulation), and D (Massive Data Analysis), arranged under the deductive approach and the inductive approach. Other labels: Axle; Cyber-space; Data Assimilation; Driving Force for Science.]
Outline
1. Simulations with uncertainties
2. Data Assimilation (DA)
3. Modeling uncertainties
4. Sequential DA and generalized state space model
5. Ensemble-based nonlinear filtering methods
   1. Ensemble Kalman filter
   2. Particle filter
6. Applications with peta-scale computing
   1. Tsunami Simulation model
   2. Ocean Tide Simulation
   3. Genome Science
7. Next generation of supercomputer
8. Conclusions
Construction of Simulation Model
(Simplified meteorological model around Japan)
PDE: partial differential equation, discretized on a grid.
A physical variable vector $\xi_i = (T_i, U_i, V_i)'$ is assigned at each grid point $i$: temperature $T_i$ and wind vector $(U_i, V_i)$.
State vector (time varying): $x_t = [\xi_1', \ldots, \xi_K']'$
Simulation model: $x_t = F(x_{t-1})$, or, with a noise term, $x_t = F(x_{t-1}, v_t)$.
Observation points and observed variables are limited.
State Vector: Contact Point between Past and Future
[Figure: the state $x_t$ separates "past and present" from "present and future".]
State at time $t-1$ → state at time $t$: $x_t = F(x_{t-1})$ (simulation model)
Simulation
Simulation model: $x_t = F_t(x_{t-1})$
$x_t$: state vector (simulation variables)
$t = 0, 1, 2, \ldots$: time (in units of the simulation time step $\delta t$)
Initial condition $x_0$: given.
When $x_0 = D$: $x_1 = F_1(x_0) = F_1(D)$, $x_2 = F_2(x_1)$, …, $x_T = F_T(x_{T-1})$.
The sequence $x_1, x_2, \ldots, x_T$ is obtained deterministically.
Simulation including uncertainty
Simulation model: $x_t = F_t(x_{t-1})$ becomes $x_t \approx F_t(x_{t-1})$ with uncertain evolution (BC, parameters, …).
Initial condition: $x_0 = D$ becomes $x_0 \approx D$ with uncertain IC.
Without uncertainty: $x_1 = F_1(x_0) = F_1(D)$, $x_2 = F_2(x_1)$, …, $x_T = F_T(x_{T-1})$. The sequence $x_1, x_2, \ldots, x_T$ is obtained deterministically.
With uncertainty: $x_1 \approx F_1(x_0) \approx F_1(D)$, $x_2 \approx F_2(x_1)$, …, $x_T \approx F_T(x_{T-1})$. The sequence $x_1, x_2, \ldots, x_T$ should be evaluated probabilistically.
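The contrast between a deterministic run and a probabilistic evaluation can be sketched with a small Monte Carlo experiment. Everything specific here is an assumption made for illustration and does not come from the slides: the scalar map `F`, the noise levels, and the helper names `run_deterministic` and `run_ensemble`.

```python
import numpy as np

def F(x):
    """Hypothetical one-step simulation model (a smooth nonlinear map)."""
    return 0.9 * x + 0.5 * np.sin(x)

def run_deterministic(x0, T):
    """x_t = F(x_{t-1}) from a fixed initial condition: a single trajectory."""
    xs = [x0]
    for _ in range(T):
        xs.append(F(xs[-1]))
    return np.array(xs)

def run_ensemble(x0_mean, x0_std, noise_std, T, n_members, seed=0):
    """Uncertain initial condition and uncertain evolution.

    Returns an array of shape (T + 1, n_members): each column is one
    realization, so each row is a sample from the distribution of x_t.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(x0_mean, x0_std, size=n_members)            # uncertain IC
    out = [x]
    for _ in range(T):
        x = F(x) + rng.normal(0.0, noise_std, size=n_members)  # uncertain evolution
        out.append(x)
    return np.array(out)

single = run_deterministic(1.0, T=20)
ensemble = run_ensemble(1.0, 0.1, 0.05, T=20, n_members=1000)
print(single[-1], ensemble[-1].mean(), ensemble[-1].std())
```

The deterministic run yields one number per time step, while the ensemble yields a sample whose spread quantifies the uncertainty at that step.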
What is Data Assimilation ?
An emerging subject in meteorology and oceanography.
A methodology for synthesizing a numerical simulation model and observed data:
- A simulation model cannot represent real phenomena accurately.
  (e.g.) An accurate weather forecast needs good initial conditions.
  Uncertainty exists in the model: boundary conditions, initial conditions, unknown parameters, unknown dynamics.
- Observation data are subject to physical and budgetary restrictions.
Correcting the variables in a numerical simulation model using observation data = Data Assimilation
Objectives of Data Assimilation from the Viewpoint of Meteorology and Oceanography
[1] To produce the best (better) initial condition for forecasting. This is already realized in operational weather forecasting (e.g., Japan Meteorological Agency).
[2] To find the best (better) boundary condition in constructing a simulation model. This procedure includes setting the appropriate boundary conditions necessary for dealing with coupled phenomena.
[3] To attain an optimal parameter vector that appears in an empirical law (scheme) employed for describing complicated phenomena that possess different time and spatial scales. Validation of empirically given values is regarded as part of this problem.
[4] To interpolate/extrapolate (estimate) a physical quantity at times and locations without observations, based on a numerical simulation model. This procedure is called "generation of a re-analysis dataset (product)". This dataset is used by general geophysical researchers to discover new scientific findings.
[5] To conduct an experiment with a virtual observation network and perform a sensitivity analysis in an attempt to construct an effective observation network system with less budgetary cost and less time.
(ex. Kamachi et al., 2006)
Modeling uncertainties
Represent the wide variety of uncertainty in a research target by a distribution function.
Understand a complex target NOT from its simple statistics, such as the mean, BUT from its distribution directly.
Notion of probability: the machinery of probability theory is used to describe the uncertainty in model parameters or in the choice of the model itself.
Probability theory provides a framework for the quantification and manipulation of uncertainty. We will introduce a basic concept of probability theory next.
Bayesian View
Central role in pattern recognition and machine learning.
Bayes' theorem:
$p(x \mid y) = \dfrac{p(y \mid x)\, p(x)}{p(y)} \propto p(y \mid x)\, p(x) = p(x, y)$
X: parameter vector; Y: data
• Posterior dist. $p(x \mid y)$
• Data dist. (likelihood function) $p(y \mid x)$: it expresses how probable the observed dataset is for different settings of the parameter vector X.
• Prior dist. $p(x)$: it is independent of the data Y and describes numerically the degree of conviction about X.
• Probability of data $p(y)$: since the data are given (the actually observed ones), it takes some fixed value.
• Joint dist. $p(x, y)$
We are interested in estimating a posterior distribution in most circumstances.
We would like to be able to quantify our expression of uncertainty and make precise revisions of that uncertainty in the light of new evidence, and subsequently to be able to take optimal actions/decisions as a consequence.
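The relation "posterior ∝ likelihood × prior" can be checked numerically on a grid. This is a minimal sketch under assumptions that are not in the slides: a scalar parameter, a standard normal prior, and a Gaussian data distribution with a hypothetical observation model y = x + w. The function name `posterior_on_grid` is illustrative.

```python
import numpy as np

def posterior_on_grid(x_grid, prior, y, noise_std):
    """Apply p(x|y) ∝ p(y|x) p(x) on a discrete grid of x values."""
    # Gaussian likelihood p(y|x) for the assumed model y = x + w
    likelihood = np.exp(-0.5 * ((y - x_grid) / noise_std) ** 2)
    joint = likelihood * prior        # p(y|x) p(x) = p(x, y)
    evidence = joint.sum()            # p(y): a constant once y is observed
    return joint / evidence

x_grid = np.linspace(-5.0, 5.0, 1001)
prior = np.exp(-0.5 * x_grid ** 2)
prior /= prior.sum()                  # discretized standard normal prior
post = posterior_on_grid(x_grid, prior, y=2.0, noise_std=1.0)
post_mean = float((x_grid * post).sum())
print(post_mean)
```

For this conjugate Gaussian case the exact posterior mean is 1.0, which the grid computation approximates closely.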
Generative Model, Inversion with Bayes’ theorem, and Data Assimilation
Observation: $y$; latent variables: $x$
• Prior distribution $p(x)$ (forward): simulation.
• Data distribution $p(y \mid x)$ (forward, $x \to y$): fitness of the simulation to the data.
• Posterior distribution $p(x \mid y)$ (inverse, $y \to x$).
Build a generative model and use Bayes' theorem.
Data Assimilation in Generalized State Space Model
State vector $x_t$ (simulation variables)
System model (stochastic simulation model): $x_t = F_t(x_{t-1}, v_t)$, where $F_t$ is a nonlinear map.
Observation model (measurement model): $y_t = H x_t + w_t$
$\delta t$: simulation time step; $\Delta t$: sampling time of observations; $\Delta t \gg \delta t$.
[Diagram: a simulation system and large-scale observations are combined into data assimilation through the Bayesian approach.]
Conditional Distribution
Notation: $y_{1:t} \equiv \{y_1, \ldots, y_t\}$; the general conditional distribution is $p(x_j \mid y_{1:k})$.
• Predictive density $p(x_t \mid y_{1:t-1})$: today's economic situation given yesterday's stock market data.
• Filter density $p(x_t \mid y_{1:t})$: today's economic situation estimated from the stock market data up to today.
• Smoother density $p(x_t \mid y_{1:T})$: today's economic situation analyzed by using all available data when we look back on today from the future.
Recursive formula
• Prediction: $p(x_{t-1} \mid y_{1:t-1}) \to p(x_t \mid y_{1:t-1})$
• Filtering: $p(x_t \mid y_{1:t-1}) \to p(x_t \mid y_{1:t})$
• Smoothing: $p(x_{t+1} \mid y_{1:T}) \to p(x_t \mid y_{1:T})$, starting from $p(x_T \mid y_{1:T})$
Prediction
$p(x_t \mid y_{1:t-1}) = \int p(x_t, x_{t-1} \mid y_{1:t-1})\, dx_{t-1}$
$= \int p(x_t \mid x_{t-1}, y_{1:t-1})\, p(x_{t-1} \mid y_{1:t-1})\, dx_{t-1}$
$= \int p(x_t \mid x_{t-1})\, p(x_{t-1} \mid y_{1:t-1})\, dx_{t-1}$
Markov property (1): $p(x_t \mid x_{t-1}, y_{1:t-1}) = p(x_t \mid x_{t-1})$
$p(x_{t-1} \mid y_{1:t-1})$: filter pdf at time $t-1$
Filtering
$p(x_t \mid y_{1:t})$: posterior (belief)
$p(x_t \mid y_{1:t}) = p(x_t \mid y_t, y_{1:t-1})$
$= \dfrac{p(x_t, y_t \mid y_{1:t-1})}{p(y_t \mid y_{1:t-1})}$
$= \dfrac{p(y_t \mid x_t, y_{1:t-1})\, p(x_t \mid y_{1:t-1})}{p(y_t \mid y_{1:t-1})}$
$= \dfrac{p(y_t \mid x_t)\, p(x_t \mid y_{1:t-1})}{p(y_t \mid y_{1:t-1})}$
$= \dfrac{p(y_t \mid x_t)\, p(x_t \mid y_{1:t-1})}{\int p(y_t \mid x_t)\, p(x_t \mid y_{1:t-1})\, dx_t}$
Markov property (2): $p(y_t \mid x_t, y_{1:t-1}) = p(y_t \mid x_t)$
Smoothing
$p(x_t \mid y_{1:T}) = \int p(x_t, x_{t+1} \mid y_{1:T})\, dx_{t+1}$
$= \int p(x_t \mid x_{t+1}, y_{1:T})\, p(x_{t+1} \mid y_{1:T})\, dx_{t+1}$
$= \int p(x_t \mid x_{t+1}, y_{1:t})\, p(x_{t+1} \mid y_{1:T})\, dx_{t+1}$
$= \int \dfrac{p(x_t, x_{t+1} \mid y_{1:t})}{p(x_{t+1} \mid y_{1:t})}\, p(x_{t+1} \mid y_{1:T})\, dx_{t+1}$
$= \int \dfrac{p(x_{t+1} \mid x_t, y_{1:t})\, p(x_t \mid y_{1:t})}{p(x_{t+1} \mid y_{1:t})}\, p(x_{t+1} \mid y_{1:T})\, dx_{t+1}$
$= p(x_t \mid y_{1:t}) \int \dfrac{p(x_{t+1} \mid x_t)\, p(x_{t+1} \mid y_{1:T})}{p(x_{t+1} \mid y_{1:t})}\, dx_{t+1}$
Filter dist.: $p(x_t \mid y_{1:t})$. Smoothing dist.: $p(x_{t+1} \mid y_{1:T})$. Prediction dist.: $p(x_{t+1} \mid y_{1:t})$.
Sequential Data Assimilation
Estimate the PDF of the state vector, or its moments (mean, variance, …), sequentially at each observation.
$p(x_i \mid y_{1:k}) = p(x_i \mid y_1, y_2, \ldots, y_k)$
Recursion: $p(x_{t-1} \mid y_{1:t-1})$ → [simulation] → $p(x_t \mid y_{1:t-1})$ → [observation $y_t$] → $p(x_t \mid y_{1:t})$ → [simulation] → $p(x_{t+1} \mid y_{1:t})$ → …
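The alternation of a prediction step and a filtering step can be sketched exactly on a small discrete state space, where the integrals become sums. The three-state transition matrix, the likelihood values, and the function names below are hypothetical and chosen only to make the recursion concrete.

```python
import numpy as np

def predict(filt_prev, transition):
    """p(x_t|y_{1:t-1}) = sum over x_{t-1} of p(x_t|x_{t-1}) p(x_{t-1}|y_{1:t-1}).

    transition[i, j] = p(x_t = i | x_{t-1} = j).
    """
    return transition @ filt_prev

def update(pred, likelihood):
    """p(x_t|y_{1:t}) ∝ p(y_t|x_t) p(x_t|y_{1:t-1}), then normalize."""
    post = likelihood * pred
    return post / post.sum()

def sequential_filter(prior, transition, likelihoods):
    """Run predict/update once per observation and return all filter pdfs."""
    filt = prior
    history = []
    for lik in likelihoods:
        pred = predict(filt, transition)
        filt = update(pred, lik)
        history.append(filt)
    return np.array(history)

# Hypothetical 3-state example
transition = np.array([[0.8, 0.1, 0.1],
                       [0.1, 0.8, 0.1],
                       [0.1, 0.1, 0.8]])
prior = np.full(3, 1.0 / 3.0)
likelihoods = [np.array([0.7, 0.2, 0.1])] * 3   # each observation favours state 0
history = sequential_filter(prior, transition, likelihoods)
print(history)
```

In a real data assimilation problem the state is far too large for such an exact recursion, which is why the distributions have to be approximated numerically.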
Challenging problem: Huge dimension and inversion
• Data Assimilation = estimation problem of the state vector $x_t$
$x_t = F_t(x_{t-1}, v_t)$ (system model)
$y_t = H_t x_t + w_t$ (observation model), or $y_t = h_t(x_t) + w_t$
- $x_t$: all variables in the simulation model
- $y_t$: all observed variables
- $v_t$: stochastic part to represent the uncertainty of the model (boundary condition, …)
- $w_t$: observation error (normally Gaussian)
- $x_0$: initial condition
- $v_t$, $w_t$: normally Gaussian
Dimension: $x_t$: $10^4 \sim 10^6$; $y_t$: $10^2 \sim 10^5$; $\dim(x_t) \gg \dim(y_t)$
Numerical representation of distribution
True distribution: $p(x_t \mid y_{1:t-1})$, $p(x_t \mid y_{1:t})$, $p(x_t \mid y_{1:T})$
Monte Carlo approximation: represent the pdf by actual realizations. $N$: # of particles.
$p(x_t \mid y_{1:t-1}) \cong X_{t|t-1} \equiv [x_{t|t-1}^{(1)}, x_{t|t-1}^{(2)}, \ldots, x_{t|t-1}^{(N)}]$
$p(x_t \mid y_{1:t}) \cong X_{t|t} \equiv [x_{t|t}^{(1)}, x_{t|t}^{(2)}, \ldots, x_{t|t}^{(N)}]$
$p(x_t \mid y_{1:T}) \cong X_{t|T} \equiv [x_{t|T}^{(1)}, x_{t|T}^{(2)}, \ldots, x_{t|T}^{(N)}]$
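Representing a pdf by realizations means that any statistic of the distribution is replaced by the corresponding sample statistic. A minimal sketch, assuming for illustration a scalar state whose "true" distribution is N(2, 0.5²); the distribution and the variable names are not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10_000                                  # number of particles
particles = rng.normal(2.0, 0.5, size=N)    # realizations of the assumed pdf

mean_est = particles.mean()                 # approximates the mean
var_est = particles.var(ddof=1)             # approximates the variance
prob_est = np.mean(particles > 2.5)         # approximates P(x > 2.5)
print(mean_est, var_est, prob_est)
```

The approximation error shrinks roughly like 1/√N, which is one reason large particle numbers are attractive when the state dimension is high.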
Sequential DA Methodology
Ensemble Kalman Filter (EnKF) is widely used.
- The conditional PDF is approximated by a set (ensemble) of realizations.
- The Kalman filter is used for filtering.
Application of the Particle Filter (PF) is rare.
- This is also ensemble based.
$p(x_t \mid y_{1:t-1}) \cong \dfrac{1}{N} \sum_{i=1}^{N} \delta(x_t - x_{t|t-1}^{(i)})$, with ensemble $\{x_{t|t-1}^{(i)}\}_{i=1}^{N}$
$p(x_t \mid y_{1:t}) \cong \dfrac{1}{N} \sum_{i=1}^{N} \delta(x_t - x_{t|t}^{(i)})$, with ensemble $\{x_{t|t}^{(i)}\}_{i=1}^{N}$
In $x_{t|t-1}^{(i)}$, the superscript $(i)$ marks the $i$-th member, the first subscript is the time step, and the second subscript is the time step of the used observations.
Prediction Step (Common in EnKF and PF)
$x_{t|t-1}^{(i)} = F_t(x_{t-1|t-1}^{(i)}, v_t^{(i)})$
$\{x_{t-1|t-1}^{(i)}\}_{i=1}^{N}$: ensemble members of the filtered PDF at time $t-1$
$\{x_{t|t-1}^{(i)}\}_{i=1}^{N}$: ensemble members of the predictive PDF at time $t$
[Figure: state versus time. Each member $x_{t-1|t-1}^{(1)}, x_{t-1|t-1}^{(2)}, \ldots, x_{t-1|t-1}^{(N)}$ is advanced by the simulation from time $t-1$ to time $t$ (prediction step).]
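The prediction step amounts to running the simulation once per ensemble member with an independent noise draw. The sketch below assumes a hypothetical two-dimensional `toy_model` (a small rotation with additive Gaussian noise); neither the model nor the function names come from the slides.

```python
import numpy as np

def prediction_step(filtered_ensemble, model, noise_std, rng):
    """Propagate each member: x_{t|t-1}^{(i)} = model(x_{t-1|t-1}^{(i)}, v_t^{(i)}).

    filtered_ensemble has shape (N, state_dim).
    """
    noise = rng.normal(0.0, noise_std, size=filtered_ensemble.shape)
    return np.array([model(x, v) for x, v in zip(filtered_ensemble, noise)])

def toy_model(x, v):
    """Hypothetical simulation model: rotate a 2-D state by 0.1 rad, add noise v."""
    c, s = np.cos(0.1), np.sin(0.1)
    return np.array([c * x[0] - s * x[1], s * x[0] + c * x[1]]) + v

rng = np.random.default_rng(0)
filtered = rng.normal([1.0, 0.0], 0.1, size=(100, 2))   # filtered ensemble at t-1
predicted = prediction_step(filtered, toy_model, noise_std=0.05, rng=rng)
print(predicted.mean(axis=0))
```

Because the members are propagated independently, this step parallelizes trivially across members, which is what makes large ensembles feasible on parallel machines.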
Filtering step of EnKF
$\{x_{t|t-1}^{(i)}\}_{i=1}^{N}$: ensemble members of the predictive PDF
$\{x_{t|t}^{(i)}\}_{i=1}^{N}$: ensemble members of the filtered PDF
Observation: $y_t$
Sample covariance matrix: $\hat{V}_{t|t-1}$
Kalman gain: $\hat{K}_t = \hat{V}_{t|t-1} H_t^T (H_t \hat{V}_{t|t-1} H_t^T + R_t)^{-1}$
Filtering step: $x_{t|t}^{(i)} = x_{t|t-1}^{(i)} + \hat{K}_t\,(y_t + w_t^{(i)} - H_t x_{t|t-1}^{(i)})$
[Figure: state versus time. At time $t$ the predictive members $x_{t|t-1}^{(1)}, \ldots, x_{t|t-1}^{(N)}$ are moved toward the observation $y_t$.]
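The EnKF filtering step can be sketched in a few lines of NumPy. This is an illustrative perturbed-observation variant under stated assumptions (linear observation matrix `H`, known noise covariance `R`, a small two-dimensional state), not a reproduction of the implementation behind the slides; the function name `enkf_filtering_step` is hypothetical.

```python
import numpy as np

def enkf_filtering_step(pred, y, H, R, rng):
    """Perturbed-observation EnKF update.

    pred: (N, n) predictive ensemble, y: (m,) observation,
    H: (m, n) observation matrix, R: (m, m) observation-noise covariance.
    """
    N = pred.shape[0]
    anomalies = pred - pred.mean(axis=0)
    V = anomalies.T @ anomalies / (N - 1)        # sample covariance V_{t|t-1}
    S = H @ V @ H.T + R
    K = np.linalg.solve(S, H @ V).T              # K = V H^T S^{-1} (S and V symmetric)
    w = rng.multivariate_normal(np.zeros(len(y)), R, size=N)   # w_t^{(i)}
    innovations = y + w - pred @ H.T             # y_t + w^{(i)} - H x^{(i)}
    return pred + innovations @ K.T

rng = np.random.default_rng(1)
pred = rng.normal([0.0, 0.0], 1.0, size=(500, 2))
H = np.array([[1.0, 0.0]])                       # only the first component is observed
R = np.array([[0.1]])
filt = enkf_filtering_step(pred, np.array([2.0]), H, R, rng)
print(filt.mean(axis=0), filt.var(axis=0))
```

The gain is obtained with a linear solve instead of an explicit inverse. For the state dimensions quoted in this talk the covariance would not be formed explicitly, so this dense version is only suitable for small examples.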
Filtering Step of PF
$\{x_{t|t-1}^{(i)}\}_{i=1}^{N}$: ensemble members of the predictive PDF
$\{x_{t|t}^{(i)}\}_{i=1}^{N}$: ensemble members of the filtered PDF
Observation: $y_t$
Calculate the likelihood $p(y_t \mid x_{t|t-1}^{(i)})$ for each particle, and normalize it:
$\dfrac{p(y_t \mid x_{t|t-1}^{(i)})}{\sum_{j} p(y_t \mid x_{t|t-1}^{(j)})}$
The filtered ensemble is obtained by resampling in proportion to the likelihood.
[Figure: state versus time. At time $t$ the predictive members $x_{t|t-1}^{(1)}, \ldots, x_{t|t-1}^{(N)}$ are weighted by their likelihood given $y_t$ and resampled (filtering step).]
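Weighting by likelihood and resampling can be sketched as follows. The Gaussian observation model in `gaussian_log_likelihood`, its noise level, and the use of plain multinomial resampling are assumptions made for this example; the slides only state that resampling is proportional to the likelihood.

```python
import numpy as np

def pf_filtering_step(pred, y, log_likelihood, rng):
    """Resample the predictive ensemble with probability proportional to likelihood."""
    log_w = np.array([log_likelihood(y, x) for x in pred])
    w = np.exp(log_w - log_w.max())   # subtract the max for numerical stability
    w /= w.sum()                      # normalized weights
    idx = rng.choice(len(pred), size=len(pred), replace=True, p=w)
    return pred[idx], w

def gaussian_log_likelihood(y, x, noise_std=0.3):
    """Hypothetical observation model: y observes the first state component."""
    return -0.5 * ((y - x[0]) / noise_std) ** 2

rng = np.random.default_rng(2)
pred = rng.normal([0.0, 0.0], 1.0, size=(2000, 2))
filt, weights = pf_filtering_step(pred, 1.0, gaussian_log_likelihood, rng)
ess = 1.0 / np.sum(weights ** 2)      # effective sample size
print(filt[:, 0].mean(), ess)
```

The effective sample size shows how many particles carry appreciable weight. When it becomes small relative to N the ensemble degenerates, which is the practical motivation for very large particle numbers.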
Projects in progress
• Coupled Ocean-Atmosphere model
  Genta Ueno (ISM/JST CREST), T. Kagimoto (JAMSTEC, FRCGC), N. Hirose (Kyushu Univ., RIAM)
• Tsunami model
  Kazuyuki Nakamura (JST CREST), N. Hirose (Kyushu Univ., RIAM)
• Ocean tide
  Daisuke Inazu (JST CREST), T. Sato, S. Miura (Tohoku Univ.), and others (Alaska Univ.)
• 3D structure of ring current
  Shin'ya Nakano (JST CREST), Y. Ebihara (Nagoya Univ.), M.-C. Fok (NASA), S.-I. Ohtani, P. C. Brandt (Johns Hopkins Univ.)
• Genome informatics
  Ryo Yoshida (ISM/JST CREST), Miyano lab. (Tokyo Univ./IMS)
Time and Spatial Scale
[Figure: application areas placed by spatial scale (1 cm, 1 m, 1 km, 1,000 km, 1 AU) and time scale (hour, day, month, year, 100 years): Genome Informatics (measurement of gene and protein expressions); Ocean Tsunami and Tide; Ocean and Atmosphere; Near-Earth Space (Ring Current). Data assimilation in ocean and atmospheric sciences is marked as the leading area in DA research.]
Revisit : What is Data Assimilation?
An emerging subject in meteorology and oceanography.
A methodology for synthesizing a numerical simulation model and observed data:
- A simulation model cannot reflect real physics accurately.
  (e.g.) An accurate weather forecast needs good initial conditions.
  Uncertainty exists in the model: boundary conditions, initial conditions, unknown parameters, unknown dynamics.
- Observation data are subject to physical and budgetary restrictions.
Correcting the variables in a numerical simulation model using observation data = Data Assimilation
Simulation model: shallow water equations model. Observation data: tide gauge data.
Tsunami Simulation Model
Based on PDE → shallow water equations [Choi et al. 01]
Discretized temporally and spatially
- 4 physical variables at each grid point $i$
  Flow vector (longitudinal/latitudinal): $U_i$, $V_i$
  Displacement of sea surface height: $\eta_i$
  Water depth at each grid point: $d_i$
- Number of grid points: … (longitude) × 240 (latitude)
  Half of them are on the sea.
  Dimension of the state vector is about … × 10^4.
[Figure: cross-section showing the normal sea surface, the displacement $\eta_i$ when a tsunami occurs, the water depth $d_i$, and the sea bottom.]
Uncertainty in measured water depth!
Propagation speed depends on water depth.
- Deep water makes tsunami propagation faster.
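The dependence of propagation speed on depth can be made concrete with the long-wave speed of linear shallow-water theory, c = √(g·d). That formula is standard but is not written on the slide, and the depths below are arbitrary illustrative values.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2

def long_wave_speed(depth_m):
    """Phase speed sqrt(g * d) of a shallow-water (long) wave, in m/s."""
    if depth_m < 0:
        raise ValueError("depth must be non-negative")
    return math.sqrt(G * depth_m)

for depth in (100.0, 1000.0, 4000.0):
    speed = long_wave_speed(depth)
    print(f"depth {depth:6.0f} m -> {speed:6.1f} m/s ({speed * 3.6:6.0f} km/h)")
```

Since the speed scales with the square root of the depth, an error in the bottom topography translates directly into an error in arrival time at a tide gauge. This makes the water depth a natural quantity to treat as uncertain.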
Numerical Simulation (Not DA)
Simulation of the Okushiri Tsunami
- Simulations based on topographies made by different organizations
- The results look similar, but the time series of sea surface displacement at a given point differ.
[Figure label: From Japan Coast Guard]
Comparison of Sea Surface Displacement
[Figure: sea surface displacement (cm) versus time (minutes) at one location, with curves labeled SKKU and DBDBV.]
Observation Data
Tide gauge data
- Linear/nonlinear response to the sea surface displacement at the instrument installation site
- One-dimensional time series
Number of tide gauge stations
- … points
(ex.) Okushiri tsunami (1993)
[Figure: tide gauge record (cm) versus time (min).]
Application to Real Data
Analysis using a real tsunami that occurred in the Japan Sea in …
• The depths in and around the Yamato Rises (area A) vary among 4 bottom topography data sets.
• Uncertainty is introduced into the South Rise and the surrounding area as a linear combination of the 4 data sets: the depth at grid point $i$ is a weighted sum over the data sets $j$, with normally distributed weights $w_j$.
[Figure: tide gauges used and the DA result, with a color scale from shallow to deep.]
• The South Rise might be shallower than the public data indicate.
• A deeper area exists on the south slope.
Personalized Simulation: A boundary condition is assimilated to local information.
We introduce local/personal information into a numerical simulation model and personalize the simulation for each location/person.
Motion Eq.: $\dfrac{\partial v}{\partial t} + (v \cdot \nabla) v + f\,\hat{z} \times v = -g \nabla \eta - \dfrac{\gamma_b\, v |v|}{H + \eta} + A \nabla^2 v$
Continuity Eq.: $\dfrac{\partial \eta}{\partial t} + \nabla \cdot [(H + \eta)\, v] = 0$
$v$: 2-dimensional flow vector; $\eta$: water surface height; $H$: water depth (sea depth); $f$: Coriolis parameter; $\gamma_b$: sea bottom friction coefficient
Ocean tide simulation by our CREST project: Alaska Southeastern region
Water level and Flow vectors
The sea level range is about ±5 m. The current speed is (much) more than 1 m/s at the mouth of Chatham Strait.
Water level and flow vectors (close-up)
Result with GINA
GINA is deeper in a bay.
M2 component tide
GINA shows great performance in representing an ocean tide.
[Figure: amplitude and phase of the M2 tide at stations 1 to 7.]
Genomic Data Assimilation
Statistical framework to link simulation model and data
Simulation model + biological data (P-P interaction, expression)
Formulated by the generalized State Space Model:
System model: $x_t = f(x_{t-1}, v_t, \theta)$
Observation model: $y_t = H x_t + w_t$, $t = 1, \ldots, T$
$x_t$: state vector at time $t$; $v_t$: system noise; $\theta$: parameter vector; $f$: simulation device
$y_t$: observation vector at time $t$; $w_t \sim N(0, \sigma^2)$: observation noise; $H$: observation matrix
Circadian Rhythm Model with HFPN
HFPN (Hybrid Functional Petri Net): a graphical programming language that is suitable for modeling biological pathways and can be used for simulations.
Circadian rhythm model of mouse, represented by HFPN (Fujii et al. 2005)
Parameters:
$m_1(0), \ldots, m_{12}(0)$: initial values
$k_1, k_2, k_3, k_4$: speeds
$s_1, s_2, s_3, s_4$: thresholds
$\sigma^2, \tau^2$: noise variances
45 parameters (12 states), 4 observations
Number of variable parameters: 33 (14 rate constants, 7 thresholds, and 12 initial values); number of states: 12.
Toward Petascale Computing
[Figure: prediction with 100,000 particles (10^5) and with 100,000,000 particles (10^8).]
Next-Generation Supercomputer in Japan at Kobe
2008/11/26
The Japanese Government will spend more than 1 billion US$ on this national project.
■ Grand Challenge:
-- Nanotech (Institute for Molecular Science)
-- Life Science (RIKEN)
Next-Generation Computational Science R&D Program (tentative)
Development of next-generation simulation software for the whole human body
Next-Generation simulation R&D group for integrating life form simulations
RIKEN Next-Generation Supercomputer R&D Center
1. Molecular scale
2. Cell scale
3. Body organ scale
4. Data analysis fusion
5. Upgrading of basic software
6. Brain and Neuron
Data analysis fusion Team
• Univ. Tokyo (Prof. Miyano), representative: estimation of large-scale gene network and its applications
• ISM (Higuchi): development of data assimilation technique for life science simulations
• Tokyo Inst. Tech. (Prof. Akiyama): estimation of large-scale protein network and its applications
• Riken Genome Med. Inst. (Md. and Dr. Kamatani): development for associating polymorphic data and phenotype data and its validation
Techniques: estimation method for large-scale gene network; Bayesian information fusion technique; data assimilation technique; genetic linkage analysis; haplotype analysis technique; prediction technique for protein network
Attempt to realize personalization technique
Making the scale of parallel computation larger enables us to carry out a data-dependent simulation, which results in drawing a scenario and in making a risk assessment.
Prior and posterior distributions for three parameters among … parameters estimated by PF are shown in 3-dimensional space (left: 10^5 particles, right: 10^8 particles).
Estimation by PF: although the PF with 10^5 particles results in a PDF represented by a small number of particles, the PF with 10^8 particles leaves many particles.
[Figure panels: prior distribution of parameters; posterior distribution of parameters.]