最適値関数に表れる黄金比
(Golden
Optimal
Value
in
Discrete-time
Dynamic
Optimization
Processes)
岩本誠–
(Seiichi
IWAMOTO)
Kyushu University,
Fukuoka
812-8581,
Japan
E–mail:
$iwamotoQen.kyushu-u.ac.jp$
安田正實
(Masami YASUDA)
Chiba
University,
Chiba 263-8522,
Japan
E–mail:
yasudaMath.
$s$.
chiba-u.$ac$.
jp
概要
Aesthetics feature fascinatesmathematician. Sinceancienttimes,theGoldenRatio $(\phi)$has been
$keeP^{\dot{u}l}g$togiveaprofound influence invariousfields. Wewill showtypical dynamic programming
problems: Allocation problem, Linear-Quadraticcontrolproblemand Multi-variatestopping prob-lem. Fortheseproblems,thereappearstheGolden Ratioin thesolution of Bellmanequation. This
paper also considers aminimization problem ofquadratic functions over an infinite horizon. We showthat theGoldentrajectory isoptimalintheoptimization. The Goldenoptimal trajectory is
obtained through the corresponding Bellman equation, which in turn admits the Golden optimal policy. ForMulti-variatestopping problemwith three playersontheunit interval$[0,1]$, itsrelated expectedvaluecouldbeobtainedas $\phi^{-1}$
.
KEYWORDS:
Golden Ratio, Dynamic Programming, Allocation Problem, LQ Control Problem,Monotone StoppingProblem. Primary$90C39$; Secondary llA99.
1
Introduction.
The Golden Ratio $(\phi=1.61803\cdots)$ has been
a
profound influence since ancient times suchas
theParthenonatAthens. The shape of theGoldenRatio is supposedto be interesting inagraphicformsfor
their sculptures andpaintings. The beautyappears
even
intheingredientofnature creatures. The mostinfluential mathematics textbook by Euclid of Alexandria defines the proportion. These information
presents
a
broad samplingof$\phi$-related topics inan
engagingand easy-to-understand format.The Fibonaccisequence (1,1,2;3,5,8,13,$\cdots$ ) is closelyrelated to theGolden Ratio, whichis
a
limitingratioofits two adjacent numbers. It isalsoknownthat the diagonalsummation producestheFibonacci
sequence in the Pascal’striangle. These repeated procedure or iteration havesomethingin
common.
Theprinciple of Dynamic Programming is said to ‘divide and conquer.’ In fact, ifit is not poesible
original problem an effective family of sub-problems. The Bellman’s
curse
of dimensionality conquers thecomputational explosionwith the problem dimension throughtheuse
of parametricrepresentations.The
more
it’s ina complex, themore
it is divided. Whena
problem is ina
multi-stage decisionform,we
should consider the problem repeatedly. If this reduction procedure gives a self-similar one, themethodology turns out to be effective. The Golden Ratio is created repeatedly by its
own
in a quitesame
form. Arecurrence
relationis ubiquitous. Letus
say thata
beautiful continued fraction representsthe Golden number. It is interesting that this quite introductory problem of Dynamic programming
produces the basic mathematical aspects.
In the following sections, we treat typical dynamic programming problems; Allocation problem and
Linear-Quadratic Control problem. However problems
are
in asimple fashion, it figures out theessence
of Dynamic Programming.
Let
us
considera
typical type of criterion ina
deterministic optimization. We minimize the nextquadratic criteria:
(1.1) $I(x)= \sum_{n=0}^{\infty}[x_{n}^{2}+(x_{n}-x_{n+1})^{2}]$
,
and
(1.2) $J(x)= \sum_{n=0}^{\infty}[(x_{n}-x_{n+1})^{2}+x_{n+1}^{2}]$
.
Let $R^{\infty}$ be theset ofallsequences ofrealvalues :
$R^{\infty}=\{x=(x_{0},x_{1}, \ldots, x_{n}, \ldots)|x_{n}\in R^{1}n=0,1, \ldots\}$
.
First wetake thequadratic criterion (1.1).
Nowweconsider
a
mathematical programmingproblem for anygiveninitial value $c$:$\underline{MP_{1}(c):}minimizeI(x)$ subject to $(i)x\in R^{\infty}$, $(\ddot{u})x_{0}=c$
.
Let
us
evaluatea
few special trajectories:Example 1 First allbut
on
and$aRer$ nothing $y=\{c,$ $0,0$,...,
$0$,...
$\}$ yields $I(y)=2c^{2}$.
Example 2 Always all constant $z=\{c,$ $c$
, ...,
$c$,.. .
$\}$ yields $I(z)=\infty$.
Exmple $ A proportional (geometrical) $w=\{c,$ $\rho c$,
. .
., $\rho^{n}c$,.
..
$\}$ yields$I(w)=\backslash$ $\{c^{2}+(1-\rho)^{2}c^{2}\}(1+\rho^{2}+\cdots+\rho^{2n}+\cdots)$
(13)
$=$ $\frac{1+(1-\rho)^{2}}{1-\rho^{2}}c^{2}$ $(0<\rho<1)$
.
Since
$\min_{0\leq\rho<1}\frac{1+(1-\rho)^{2}}{1-\rho^{2}}$
is attaind at $\hat{\rho}=2-\emptyset$
,
we
have the minimum valueExample 4 The propotional $\text{\^{u}}=\{c, (2-\phi)c, . .., (2-\phi)^{n}c, . ..\}$, with ratio $(2-\phi)$, yields
(14) $I(\hat{u})=\phi c^{2}$
.
Thus $\text{\^{u}}=(\hat{u}_{n})$
.
givesa
Golden optimal trajectory, because $\hat{u}_{n+1}$ is always a Golden section point ofinterval $[0,\hat{u}_{n}]$
.
Next let
us now
considera
control process withan
additive transition$T(x, u)=x+u$.
minimize $\sum_{n=0}^{\infty}(x_{n}^{2}+u_{n}^{2})$
subject to (i) $x_{n+1}=x_{n}+u_{n}$, $n\geq 0$
(ii) -\infty <un<o科
(iii) $x_{0}=c$
.
Then the valuefunction$vsati\epsilon fies$ Bellmanequation:(1.5) $v(x)= \min_{-\infty<u<\infty}[x^{2}+u^{2}+v(x+u)]$
.
Eq. (1.5) hasa
quadratic form$v(x)=\phi x^{2}$.
Second
we
take the following quadraticcriterion$J(x)= \sum_{n=0}^{\infty}[(x_{n}-x_{n+1})^{2}+x_{n+1}^{2}]$
.
We$con8ider$
a
problem ofform:$\underline{MP_{2}(c):}minimizeJ(x)$ subject to $(i)x\in R^{\infty}$, $(ii)x_{0}=c$
.
Since $J(x)=I(x)-c^{2}$
,
the minimumvalue is $J(\hat{u})=(\phi-1)c^{2}$ at the Golden trajectory$\text{\^{u}}=\{c, \mu c, ..., \mu^{n}c, .. \};\mu=2-\phi$
.
In fact,
a
proportional $w=\{c,$ $\mu$,...,
$\rho^{n}c$,...
$\}$ yields$J(w)$ $=\{\rho^{2}c^{2}.+(1-\rho)^{2}c^{2}\}(1+\rho^{2}+\cdots+\rho^{2n}+\cdots)$
$= \frac{\rho^{2}+(1-\rho)^{2}}{1-\rho^{2}}c^{2}$ $(0<\rho<1)$
.
Figure 1 in theAppendixshows that
$\min_{0\leq x<1}\frac{x^{2}+(1-x)^{2}}{1-x^{2}}$
isattained at $\hat{x}=2-\emptyset$with the minimum value
2
An lllustrative Graph.
Let
us now
describe a graph which has dual Goldenextrem.um
points in the previous section. Thegraph is $x=f(u)= \frac{u^{2}+(1-u)^{2}}{1-u^{2}}$. See Figure 1 in Appendix. For this function, it is seen that two
equalities:
(2.1) $\min_{0<u<1}f(u)=\min_{0<u<1}\frac{u^{2}+(1-u)^{2}}{1-u^{2}}=-1+\phi$
and $1<u< \infty\max f(u)=1<u<\infty\max\frac{u^{2}+(1-u)^{2}}{1-u^{2}}=-\emptyset$hold. Equivalently, the latterequalityis that
(2.2) $\min_{1<u<\infty}\{-f(u)\}=\min_{1<u<\infty}\frac{u^{2}+(1-u)^{2}}{u^{2}-1}=\phi$
The minimum in (2.1) attains iff$\text{\^{u}}=2-\phi$, and the minimum in (2.2) attainsiff$u^{*}=1+\phi$
.
Thuswe
have the inequality
$f(u)\geq-1+\phi$
on
$(-1,1)$ and $f(u)\leq-\emptyset$on
$(-\infty, -1)\cup(1, \infty)$.
Refer to the shape for this graph in the Figure 1 of Appendix.
3
Dynamic Programming of
the
discrete-time
system.
The conceptualcluster of Dynamic Programmingare investigated throughout themathematics. Not only the analytical aspect ofoptimization method, but also the investigate problem with
a
repeatedstructure. To give a useful explanation and
an
interesting implication,we
showsome
explicitly solvableproblems.
First the general setting of Dynamic Programming problem
are
illustrated. It is composedas
$(S, A,r,T)$
.
Let $S$ be a state space in the Euclidean space $R$ and $A=(A_{x}),A_{x}\subset R,$$x\in S$ meansa feasible action space depending on a current state $x\in S$
.
The immediate reward is a function of$r=r(x, a,t),x\in S,$$a\in A_{x},$$t=0,1,2,$$\cdots$
.
And the terminal reward $K=K(x),$$x\in S$ is given. Thetransition law fromthe current $x$ tothe new$y=m(x,$$a,$$t\rangle$by the action
or
decision$a\in A_{x}$ at atime$t$.
Ifthe transition law $m(x, a, t)$ does not depend $t$, it is called astationary $m(x, a,t)$ and
we
treat it inthis paper. Here
we
consider additive costs and the optimal value of $a_{t}$ will be depend on the decisionhistory. Assumeits value at time$t$ denoted by
$x_{t}$,which enjoythefollowing properties:
(a) The value of$x_{t}$ isobservable attime$t$
.
(b) The sequence $\{x_{t}\}$ follows
a
recurrence
in time:(3.1) $x_{t+1}=m(x_{t}, a_{t},t)$
.
It is termed thatthe function$y=m(x, \pi(x,t), t)$
means a move
from the current $x$to the next$y$attsothe lawofmotion
or
the plantequation by adaptingapolicya$=\pi(x,t)$.(d) Thecost function $C_{\pi}(x,t)$starting
a
state$x$ at time-to-go$T_{t}=T-t$ tooptimizeover
all policies $\pi$has the additive form;(3.2) $C_{\pi}(x_{0}, t_{0})= \sum_{t=t_{0}}^{T-1}r$($x_{t}$,at,$t$)$+K(x_{T})$
with $x_{0}=x_{t_{0}}$ and
$x_{t+1}=m(x_{t}, \pi(x_{t},t), t)$, $at=\pi(x_{t}, i)$
where$T$ is agiven finite planning horizon.
Let
$F(x,t)= \inf_{\pi}C_{\pi}(x,t)$
.
It iswellknown that the sequence $\{F(\cdot,t)\}$ satisfiesthe optimality equation(DP equation):
(3.3) $F(x,i+1)= \inf_{a\in A_{g}}[r(x,a,t)+F(m(x,a,T_{t}),t)]$
with the boundary$F(x,T)=K(x)$ where $T_{t}=T-t$ for $x\in S,$ $0\leq t<T$
.
All ofthese
are
referredfromtextbooks by Bertsekas [2], Whittle [15], Sniedovich [14], etc.Therelationbetween TheGoldenratio formula andFibonacci sequenceis knownas [4]etc. To produce
the Fibonacci sequence, it is a good example in
a
recursive programming [13]. Also the Fibonaccisequence
are
related with continued ffaction. Forthe notationofcontinued fraction,we
adoptourselftothe following notations:
$b_{0}+ \frac{c_{1}}{b_{1}+\frac{c_{2}}{b_{2}+\frac{c_{3}}{b_{3}+\frac{c_{4}}{b_{4}+}}}}\ldots=b_{0}+\frac{c_{1}}{b_{1}+}\frac{c_{2}}{b_{2}+}\frac{c_{3}}{b_{3}+}\frac{c_{4}}{b_{4}+}\cdots$
.
Note that the Golden number satisfiae $\phi^{2}=1+\phi$
.
By using this relation repeatedly, $\phi=1+\frac{1}{\phi}=$$1+ \frac{1}{1+\frac{1}{\phi}}=1+\frac{1}{1+}\frac{1}{\phi}=1+\frac{1}{1+}\frac{1}{1+}\frac{1}{\phi}=1+\frac{1}{1+}\frac{1}{1+}\frac{1}{1+}\cdots$
.
Similarly the reciprocal (or$8ometimescaUed$as
a dual Goldennumber) is denoted $\phi^{-1}=0.618\cdots=\phi-1=\frac{1}{\phi}=\frac{1}{1+}\frac{1}{\phi}=\frac{1}{1+}\frac{1}{1+}\frac{1}{\phi}=\frac{1}{1+}\frac{1}{1+}\frac{1}{1+}\cdots$.
This reproductive property suggestsour fundamental claim for the following typical example of Dy-namic Programming. Beforewe solve the problem, let
us
induceasequence $\{\phi_{n}\}$ as(3.4) $\phi_{n+1}=1+\frac{1}{1+1/\phi_{n}}=1+\frac{1}{1+}\frac{1}{\phi_{n}}$ $(n\geq 1),$ $\phi_{1}=1$
.
ノ1lso let $\{\hat{\phi}_{n}\}$
as
$\hat{\phi}_{n+1}=\frac{1}{1+}\frac{1}{1+}\hat{\phi}_{n}$ $(n\geq 1),\hat{\phi}_{1}=1$,
(3.5) $i.e$
.
Thesequence $\{\phi_{n}\}$ of(3.4) satisfies
$\phi_{n+1}$ $=1+ \frac{1}{1+}\frac{1}{1+}\frac{1}{1+}\frac{1}{\phi_{n-i}}$
$=1+ \frac{1}{1+}\frac{1}{1+}\frac{1}{1+}\frac{1}{1+}\frac{1}{1+}\frac{1}{\phi_{n-2}}$
Similarly $\{\hat{\phi}_{n}\}$ of(3.5) satisfies
$\frac{1}{\hat{\phi}_{n+1}}$ $=1+ \frac{1}{1+}\frac{1}{1+}\frac{1}{1+}\hat{\phi}_{n-1}$
$=1+ \frac{1}{1+}\frac{1}{1+}\frac{1}{1+}\frac{1}{1+}.n-2$
Rom this definition, it is
seen
easily that(3.6) $\lim_{narrow\infty}\phi_{n}=\phi=(\sqrt{5}+1)/2$
(3.7) $\lim_{narrow\infty}\hat{\phi}_{n}=1/\phi=(\sqrt{5}-1)/2$
4
$\llcorner lnear$-Quadratic
Control
problem.
The LinearQuadratic $(LQ)$ control problemis to minimize the quadratic cost function
over
thelinearsystem. If the state of system $\{x_{t}\}$ movae on
(4.1) $x_{t+1}=x_{t}+a_{t},$ $t=0,1,2,$$\cdots$
$withx_{0}=1byaninputcontro1\{a_{t}, -\infty<a_{t}<\infty\}$ Thecost incured 狛
(42) $\sum_{t=0}^{T-1}(x_{t}^{2}+a_{t}^{2})+x_{T}^{2}$
.
Then DPequation of LQ is
(4.3) $v_{t+1}(x)= \min_{a\in Aae}\{r(a,x)+v_{t}(a+x)\}$
where
$r(a,x)=a^{2}+x^{2}$,
$a\in A_{l}=(-\infty, \infty),$$x\in(-\infty, \infty)$
Theorem 4.1 The solution of(4.3) is given by
$usi\dot{n}g$the Golden nunberrelated sequence$\{\phi_{n}\}$ by (3.4).
(Proof) The proofisimmediatelyobtainedby
an
elementaryquadraticminimization andthenthe5 AIIocation
problem.
Allocationproblem or sometime called
as
partition problem,isofthe form(5.1) $v_{t+1}(x)= \min_{a\in A_{\alpha}}\{r(a,x)+v_{t}(a)\}$
for$t=0,1,2,$$\cdots,$$T$, where
$r(a,x)=a^{2}+(x-a)^{2}$, $a\in A_{x}=[0,x],x\in.(-\infty, \infty)$
.
Theorem 5.1 Thesolutionof(5.1) is given by using thedualgolden number as
(Prvof) Using the Schwartzinequality, the following holds immediately: Forgiven positiveconstants $A$
and $B$with a fixed $x$,
$\min_{0\leq a\leq ae}\{Aa^{2}+B(x.-a)^{2}\}=\frac{x^{2}}{1/A+1/B}$
.
Sothe proofcould be done inductively. $\square$
Remark 1 : Wenoteherethat thenumber$\phi^{-1}=0.618\cdots$ ofreciprocalof the Goldennumber iscalled
sometimae Dual Golden number. The abovetwo problems
are
closely related.Remark 2 : Itis
seen
that thesame
quadratic function of the $form\cdot v(|x)=cx^{2}$ where $c$isa
constant,becomes
a
solution ifthe DPequation is, for Allocation and$LQ$,(5.2) $v_{t+1}(x)= \min_{a\in A_{g}}\{r(a,x)+2\int_{0}^{a}v_{t}(y)/ydy\}$
(5.3) $v_{t+1}(x)= \min_{a\in A_{*}}\{r(a, x)+2\int_{0}^{a+x}v_{t}(y)/ydy\}$
respectively. Referto [9].
6
Monotone Stopping
Game.
Amonotonerule is introducedto
sum
up individualdeclarirationsina
multi-variatestopping problem[16]. The rule is defined by a monotone logical function and is equivalent to the winning claes of
Kadane [12]. There given p–dimensional random process $\{X_{n};n=1,2, \cdots\}$ and a stopping rule $\pi$ by
which thegroup decisiondetermined ffom the declararion of$p$players at each stage. The stopping rule
is$\gamma variate\{0,1\}$-valuedmonotone logical function. Weconsidertwo
cases
of rules with$p=3$as
follows:(6.1) $\pi(x_{1}, x_{2},x_{3})=x_{1}+x_{2}$
and
$x_{1} x_{2} x_{3}$ $\pi(x_{1},x_{2}, x_{3})$
$0 0 0$
$0 0 1$
$0 1 0$
$0 1 1$
$1 0 0$
$1 0 1$
$1 1 0$
$1 1 1$
$0$ $0$ $0$ $0$ $0$ $1$ $1$ $1$X1 $\pi(x_{1},x_{2},xa)=x_{1}+x_{2}$ for any$xs$
図2 $\pi(x_{1},x_{2},xs)=x_{1}x_{2}+x_{1}xs$
Thatis, in
case
of thecase
(6.1),if either ofplayer1 and2declaraestop,then the systemstopsneglectingof$x_{S}$
.
Incase
of (6.1),we bave
thesystem stopswheneither of player1
and 2declaraestop accompanyingwith player 1. Without loss ofgenerality,
we
can assume
that each $X_{n}$ takae theuniformly distributionon
$[0,1]$.
Then equilibrium expected values for each player is givenas
Table 1. Theequilibrium expectedvalue for each players.
Inorder to derive the value $\phi^{-1}=\frac{\sqrt{5}-1}{2}$ , weconsider
an
equilibrium stoppingstratey of thraeholdtype in the form $\{X_{n}>a\}$ for
some
$a$.
Bellman type equation for thisgameversion will be givenas
[16].Thatis,eachplayerdeclares “stop” or “continue” iftheobserved valueexceeds
some
$a$or not. The eventof the
occurence
is denoted by $D_{n}^{1}=$ {Player$i$ declaresstop}. Two trivialcases are
the whole event $\Omega$and the emptyevent $\phi$
.
In generally alogicalfunction is assumed “monotone” so itsfunctioncan be written
as
$\pi(x^{1}, \cdots,x^{p})=x^{:}\cdot\pi(x^{1}, \cdots,\dot{i}, \cdots x^{p})+\overline{x^{i}}\cdot\pi(x^{1}, \cdots,0,\cdot x^{p}):\cdot$
.
$x^{:}\in\{0,1\}\forall i$$wherex^{*}=1-x^{:}-$
.
Corraeponding to thisexpression,$II(D^{1}, \cdots,D^{p})=D^{i}\cdot\Pi(D^{1}, \cdots,\Omega,\cdot D^{p})+\overline{D^{t}}\cdot \mathbb{I}(D^{1}:\cdot\cdot, \cdots,\emptyset,\cdot\cdot D^{p})i$
.
where$\overline{D^{1}}$
is thecomplement ofthe event $D^{:}$
.
The generalequation for the expected for player $i$ at thestep $n$ equakas follows:
where $D_{n}^{i}=\{X_{n}^{i}\geq v^{i}\}$ and $(x)^{+}= \max\{x,0\},$ $(x)^{-}= \min\{x, 0\}$. If
we assume an
independencecase
betweenplayer’s randomvariable$X_{n}^{1}$ for each $i$. The equation (6.3) becomes
(6.4) $\beta_{n}^{\Pi(i)}E[(X_{n}^{i}-v^{i})^{+}]-\alpha_{n}^{n(i)}E[(X_{n}^{i}-v^{1})^{-}]=0$
.
Our objective is to find
an
equilibrium strategy and values of players fora
given monotone ruleas
therule (6.1) and (6.2). A sequence of expected value (a net gain) under the situation formulated in the
aection isobtained
as
(6.5) $v_{n+1}^{i}=v_{n}^{1}+\beta_{n}^{II(\dot{*})}E[(X_{n}^{i}-v_{n}^{i})^{+}]-\alpha_{n}^{n(i)}E[(X_{n}^{1}-v_{n}^{i})^{-}]$
for player $i=1,$$\cdots,p$and$n$denotesatime-to-go. The detailsrefer toTheorem 2.1 inYKN [16]. Under
thaee derivation,
now we are
able to calculate theoptimal (equilibrium) value $v:= \lim_{n}v_{n}^{1}$ for player7. Apppendix. $x$
参考文献
[1] E.F. Beckenbachand R. Bellman, Inequalities, 3rd
revised..
printing, Springer, New York, 1971.[2] Dimitri P. $Bertse\ovalbox{\tt\small REJECT}$
,
Dynamic Programming and Stochastic Control, Academic Press, New York,1976.
[3] A. Beutelspacher and B. Petri, Der Goldene Schnitt 2., tiberarbeitete und enveiterte Auflange,
El-sevier$GmbH$, Spectrum AkademischerVerlag, Heidelberg, 1996.
[4] R. A. Dunlap, The Golden Ratio andFibonacci Numbers, World Scientific Press, 1997.
[5] S. Iwamoto, Inverse theorem in dynamic programming I, II, III, J. Math. Anal. Appl. 58(1977),
113-134, 247-279, 439-448.
[6] S.Iwamoto, Cross dual on the Goldenoptimumsolutions, RIMSKokyuroku, Kyoto Yniv. No.1443,
pp.27-43,
2005.
[7] S.Iwamoto, $Theo\eta$
of
Dynamic Program (in Japanese), KyushuUniv. Press, Fukuoks, Japan1987.
[8] S. Iwamoto, The Golden optimum solution in quadratic programming, Ed. W. Takahashi and T. Tanala, Proceedings of The Fourth International Conference
on
Nonlinear Analysis and ConvexAnalysis (NACA05), Okinawa, Japan, June-July, 2005, YokohamaPublishers, to appear.
[9] S. Iwamoto, Contrvlled integral equations
on
nondeterminstic dynamic programming: ffidholmandVolterra types, Abstract ofMath Soc Japan, StatisticDivision, 51-52, March,
2003.
[10] S. Iwamoto and M. Yasuda, “Dynamic programming creates the Golden Ratio, too,” Proc.
of
theSixth Intl
Conference
on Optimization: Techniques andApplications (ICOTA 2004), Ballarat,Aus-tralia, December
2004.
[11]
S.
Iwamoto and M. Yasuda, Golden Optimal Path in Discrete-Time Dynamic OptimizationPro-cesses, The 15-th International Conference
on
Difference Equations and Application (ICDEA 2006),Kyoto University, Kyoto, Japan, July, 2006.
[12] J.B.Kadane;“Reversibilityof
a
MultilateralSequaentialGame: ProofofaConjectureofSakaguchi”,J.Oper.Res.Soc.Japan,vol.21,509-516,(1978).
[13] Ron Knott, “Fibonacci Numbers and the Golden Section” in http:$//www.mcs$
.
surrey.$ac.uk/-$$Personal/R.Knott/Fibonacci/fib$.html
[14] M. Sniedovich, Dynamic Programming, Marshal Dekker, Inc. NewYork, 1992.
[15] P. Whittle, Optmization
over
Time, Dynamic Programming and Stochastic Control, Vol.$I$, JohnWiley&Sons Ltd, NewYork, 1982.
[16] M.Yasuda, J.Nakagami, M.Kurano, ““Multi-variate stopping problems with