Characterizations of Embeddable 3 3 Stochastic Matrices with a Negative Eigenvalue

(1)

Characterizations of Embeddable 3 3 Stochastic Matrices with a Negative Eigenvalue

Philippe Carette

Abstract. The problem of identifying a stochastic matrix as a transition matrix between two xed times, say^t= 0 and ^t= 1, of a continuous-time and nite-state Markov chain has been shown to have practical importance, especially in the area of stochastic models applied to social phenomena. The embedding problem of nite Markov chains, as it is called, comes down to investigating whether the stochastic matrix can be expressed as the exponential of some matrix with row sums equal to zero and nonnegative o-diagonal elements. The aim of this paper is to answer a question left open by S. Johansen (1974), i.e., to characterize those stochastic matrices of order three with an eigenvalue^<0 of multiplicity 2.

Contents

1. Introduction 120

2. Notations and preliminaries 122

3. Characterizations of embeddability 123

4. Illustration of the results 128

References 129

1. Introduction

Let the stochastic process (X(t))t ⁰ on a probability triple (^A^P) be a continuous time Markov chain with a nite numberN of states and time-homogeneous transition probabilities

Pij(t) =^PX(t+u) =j^jX(u) =i] ij= 1::: N i.e., which are independent of the timeu0. Under the assumption

tlim^!0^> Pii(t) = 1 i= 1::: N

Received February 20, 1995.

Mathematics Subject Classication. 60J27,15A51.

Key words and phrases. Continuous-time Markov chain, stochastic matrix, matrix exponential, embedding problem.

c1995StateUniversityofNewYork

ISSN1076-9803/95

120

(2)

the transition probabilities Pij(t) are gathered into stochastic N N matrices

P

⁽^t⁾^t0, which satisfy

P

^{(0) =}

I

the identity matrix (1)

P

⁽^t⁺^s^{) =}

P

⁽^t⁾

P

⁽^s⁾ ^ts⁰^:

(2)

It is a well-known result (see e.g., Fre71], p. 148) that in this case dtd

P

^{(0) =}

R

^and

P

⁽^t^{) = exp(}

R

^t⁾ ^t⁰

(3)

where

R

is aNN matrix satisfying

R_ij 0 fori6=j

N

X

j⁼¹Rij = 0 i= 1::: N:

(4)

The (ij)-th element of the matrix

R

represents the transition intensity from state i to statej. In general, any matrix

R

that satises (4) will be called an intensity matrix. Thus, the Markov chain is completely determined by its intensity matrix.

Now, an interesting problem emerges as follows. Suppose (Y(t))t⁼⁰¹²::: is a discrete time Markov chain withN states and time-homogeneous transition probability matrix

P

, i.e., the elements of

P

Pij=^PY(u+ 1) =j^jY(u) =i] ij= 1::: N (5)

are independent of the timeu= 012:::. Then, one can ask the question whether or not this process is a discrete manifestation of an underlying time-homogeneous and continuousN-state Markov chain (X(s))s ⁰, or equivalently, under what cir- cumstances

P

has the form

P

^{= exp}

R

(6)

for some intensity matrix

R

. If this occurs, we say that fY(t)jt = 012:::gis embeddable into a continuous and time-homogeneous Markov chain, or briey that

P

is embeddable and that

R

generates

P

^.

This problem has been acknowledged to have practical relevance when mathematical modelling of social phenomena is concerned (see Bar82, SS77]).

From a purely mathematical point of view however, the question has already been investigated by Kingman (Kin62]), who published a simple necessary and sucient condition (due to D. G. Kendall) for the two state case: det

P

^> ^0.

He gave also necessary and sucient conditions for embeddability of any N N stochastic matrix, but they are not applicable in practice. At the time, it was already known that forN 3 more than one intensity matrix could generate the same

P

(Spe67]).

However, for the three state case, some simplications do also occur. Papers Joh74, Cut73] have shown that necessary and sucient conditions were in a critical sense dependent upon the nature of the eigenvalues of

P

. In both of them such conditions were given, except for the case when

P

has a negative eigenvalue of multiplicity two.

Since then, the problem has mainly been dealt with in the more general context of time-inhomogeneous Markov chains. (See FS79, JR79, Fry80, Fry83].)

(3)

The purpose of this paper is to ll the gap for 33 stochastic matrices with a negative eigenvalue of multiplicity 2, by providing a method to check whether or not they are embeddable and giving an explicit form of the possible intensity matrices that generate them (Theorem 3.3). Another interesting characterization involving a lower bound for the negative eigenvalues is contained in Theorem 3.7.

2. Notations and preliminaries

The set of positive integers will be denoted by^N, the set of real numbers by^Rand the set of complex numbers by^C. The sets of real (resp. complex)mnmatrices is denoted by ^R^mⁿ(^C^mⁿ) and matrices by bold face capital letters. We shall use the notation diag(pqr) for the diagonal matrix

0

@

p 0 0 0 q 0 0 0 r

1

A pqr ² ^C. Finally, the (ij)-th element of then-th power

M

ⁿ of a square matrix

M

shall be written asM_ij⁽ⁿ⁾,n= 23:::

It was already remarked in Joh74] that a negative eigenvalue of an embeddable stochastic matrix has to be of even algebraic multiplicity. Indeed, in our case, if

P

is a 33 stochastic matrix with an eigenvalue < 0, then

R

has an eigenvalue ²^C with e=. But then must also be an eigenvalue of

R

⁽

R

is real), hence the three distinct eigenvalues of

R

are 0,and forcing

R

to be diagonalizable:

R

⁼

U

^diag(0⁾

U

^;1

(7)

where

U

is a nonsingular 33-matrix. We have then

P

^{= exp}

R

⁼

U

^diag(e⁰^e^e⁾

U

^;1

=

U

^diag(1⁾

U

^;1^:

Consequently, has algebraic multiplicity 2 and ^;¹² < <0, because 1 + 2= trace

P

0. Now

P

¹ = limn^!+1

P

ⁿ certainly exists (as is the case for every embeddable stochastic matrix

P

, see Kin62]) and

P

¹⁼

U

^diag(1⁰⁰⁾

U

^;1 ^since^j^j^<^1.

(8)

This leads to the following useful relation between

P

and

P

¹(see also Joh74])

P

⁼

P

¹⁺⁽

I

^;

P

¹⁾^:

More can be said about the nature of

P

¹. Indeed, since

P

¹ is a diagonalizable stochastic matrix with eigenvalues 1 and 0 (the latter of algebraic multiplicity 2),

P

¹ must consist of identical rows, equal to say^;p¹ p² p³. Furthermore, none of thepi can be equal to 0: Otherwise, ifpi= 0 for somei, thenPii = <0.

Thus, in seeking characterizations for embeddability of

P

, we only need to focus our attention on these matrices that have the form

P

=

P

(

P

¹), where

P

⁽

P

¹^{) =}

P

¹⁺⁽

I

^;

P

¹⁾

(9)

(4)

with

P

¹⁼

0

@

11 1

1

A

;p¹ p² p³

p¹+p²+p³= 1 pi>0 i= 123 and <0 :=;min_i pi

1^;pi (10)

the last condition being a regularity one ensuring that

P

⁽

P

¹) has nonnegative entries.

3. Characterizations of embeddability

A characterization involving the generating intensity matrix.

^{The idea}

behind this approach is to seek an explicit form of a general matrix

R

^{with exp}

R

⁼

P

and then to submit the elementsRij to the constraints (4). We begin with a few properties of matrix square roots of

P

¹^;

I

, which we shall need in the proof of our main result (Theorem 3.3).

Lemma 3.1.

^Let

X

be a real square matrix with

X

²⁼

P

¹^;

I

^{. Then}

(a)

XP

¹⁼

P

¹

X

⁼

0

(b) exp(2k+ 1)

X

^{= 2}

P

¹^;

I

⁽^k²^N⁾

(c) ^P_jXij = 0 for alli.

Proof.

(a) Using (8), we have

P

¹^;

I

⁼

U

^diag(0^;¹^;¹⁾

U

^;1

and so (see Gan60])

X

=

UV

diag(0i;i)

V

^;1

U

^;1

where

V

is a real nonsingular matrix that commutes with diag(0^;1^;1). But then

V

commutes also with diag(100), and

P

¹

X

⁼

UV

^diag(1⁰⁰⁾

V

^;1

U

^;1

UV

^diag(0ⁱ^;ⁱ⁾

V

^;1

U

^;1

=

UV0V

^;1

U

^;1

=

0

^:

Analogously

XP

¹⁼

0

^.

(b) By

X

⁼

UV

^diag(0ⁱ^;ⁱ⁾

V

^;1

U

^;1^{, we get}

exp(2k+ 1)

X

=

UV

diag(1e⁽²^k⁺¹⁾ⁱe^;(2^k⁺¹⁾ⁱ)

V

^;1

U

^;1

=

UV

^diag(1^;¹^;¹⁾

V

^;1

U

^;1

= 2

UV

^diag(1⁰⁰⁾

V

^;1

U

^;1^;

UV

^diag(1¹¹⁾

V

^;1

U

^;1

= 2

P

¹^;

I

^:

(5)

(c) By (a), we have

XP

¹^@¹¹

1

A=^@0 00

A. Hence

X

0

@

11 1

1

A=

0

@

00 0

1

A: This concludes the proof.

Corollary 3.2.

If

R

= ln^j^j(

I

^;

P

¹)+(2k+1)

X

with

X

²=

P

¹^;

I

andk²^N,

then exp

R

=

P

¹+(

I

^;

P

¹):

Proof.

Statement (a) of Lemma 3.1 implies that

X

commutes with

I

^;

P

¹^{, in}

which case we have

exp

R

^{= exp}^ln^j^j⁽

I

^;

P

¹⁾^exp⁽²^k^{+ 1)}

X

^:

We now calculate expln^j^j(

I

^;

P

¹)and use the same notations as in the proof of (a), Lemma 3.1:

expln^j^j(

I

^;

P

¹⁾^{= exp}

U

^diag(0^ln^j^j^ln^j^j⁾

U

^;1

=

U

diag(1jjjj)

U

^;1

= (1 +)

U

diag(100)

U

^;1^;

U

diag(111)

U

^;1

= (1 +)

P

¹^;

I

^:

Hence

exp

R

⁼^;^{(1 +}⁾

P

¹^;

I

⁽²

P

¹^;

I

⁾

= (1^;)

P

¹+

I

as (

P

¹)²=

P

¹.

Theorem 3.3.

Let

P

(

P

¹) be as in (9) and (10). Then the following conditions are equivalent:

(a)

P

(

P

¹) is embeddable.

(b) There existk2^N and

X

²^R³³ such that

X

²=

P

¹^;

I

and

R

= ln^j^j(

I

^;

P

¹) + (2k+ 1)

X

is an intensity matrix.

(c) There exists

X

²^R³³ ^{such that}

X

²⁼

P

¹^;

I

^and

Xij 1ln^j^jpj i⁶=j:

Proof.

(a ⁾b) Let

P

= exp

R

, where

R

is an intensity matrix. As remarked in the preliminaries section, the eigenvalues of

R

are in this case 0,_k, and_k, where k is a complex number satisfying e^k=, i.e.,

k= ln^j^j+ (2k+ 1) i (k²^N) and

R

⁼

U

^diag(0kk)

U

^;1 ^with

U

nonsingular

= ln^j^j

U

^diag(0¹¹⁾

U

^;1^{+ (2}^k^{+ 1)} ⁱ

U

^diag(0¹^;¹⁾

U

^;1^:

(11)

(6)

Using (8) and Equation (11), and putting

X

^{= i}

U

^diag(0¹^;¹⁾

U

^;1, we obtain the desired result.

(b ⁾ c) Because

R

is an intensity matrix, we must have Rij 0 i ⁶=j. This means that, by (b), fori⁶=j

ln^j^j(^;pj) + (2k+ 1) Xij 0 which yields

X_ij 1

(2k+ 1) ln^jjp_j 1ln^jjp_j:

(c ⁾ a) In this case, by Lemma 3.1 (b) and (c), ln^j^j(

I

^;

P

¹^{) +}

X

^{denes an}

intensity matrix, say

R

, which has the property exp

R

=

P

¹+(

I

^;

P

¹) =

P

(

P

¹). The proof is now complete.

A connection with embeddable stochastic matrices with complex conju- gated eigenvalues.

Another approach can be given starting from the following observation, which is valid for all stochastic matrices.

Lemma 3.4.

A stochastic matrix

P

is embeddable if and only if it is the square of a stochastic matrix

Q

that is also embeddable.

Proof.

If

P

is embeddable, then an intensity matrix

R

exists with exp

R

=

P

. Now, ¹²

R

is also an intensity matrix, hence the matrix

Q

= exp¹²

R

is stochastic (see Fre71], p. 125), embeddable, and has the property

Q

²⁼

P

^.

Conversely, suppose that an embeddable stochastic matrix

Q

exists with

Q

²⁼

P

^{. Then}

Q

^{= exp}

R

for some intensity matrix

R

^and

P

⁼

Q

²^{= (exp}

R

⁾²^{= exp(2}

R

⁾

so

P

is embeddable.

This is interesting, because any stochastic `square root'

Q

^of

P

⁽

P

¹) must have eigenvalues 1, i^p^j^j and ^;i^p^j^j. The following characterization of embeddable stochastic matrices of order three with complex conjugate eigenvalues, has been proven in Joh74].

Theorem 3.5.

Joh74]. A stochastic matrix

P

of order three with eigenvalues e⁺ⁱ and e^;i0 < < , can be embedded if and only if one of the following conditions holds.

(12) P_ij⁽²⁾(ecos^;1)^;esin

Pij

(e²cos2^;1)^;e²sin2 for alli⁶=j (13) P_ij⁽²⁾(;2 )(ecos;1)^;esin

Pij

(;2 )(e²cos2;1);e²sin2 for alli6=j A direct application of this result now yields the following characterization.

(7)

Theorem 3.6. P

⁽

P

) is embeddable if and only if there exists a stochastic matrix

Q

of order three such that

Q

²⁼

P

⁽

P

¹) and one of the following conditions holds.

Qij(1 +

pj^j_ln_j^j)pi for alli⁶=j (14)

Qij (1^;

pj^j

3 ln^j^j)pi for alli⁶=j (15)

Proof.

In view of Theorem 3.5, Lemma 3.4 and the observations made about the eigenvalues of any stochastic square root

Q

of

P

(

P

¹), we have, with the notations used in Theorem 3.5,= ln^p^j^j,= ², andQ⁽²⁾_ij = (1^;)pi. Inequality (12) then becomes

Qij 1 + ^p^j^jln^j^j

1 +^j^j ⁽¹^;)pi= (1 +

pjjln^j^j)pi i⁶=j and inequality (13)

Qij 1^;^p³^j^jln^j^j

1 +^j^j ⁽¹^;)pi= (1^;

pjj

3 ln^j^j)pi i⁶=j:

It may be interesting for the sake of consistency to remark the equivalence of the characterizations in Theorems 3.3 and 3.6.

Suppose (14) holds. Then fori⁶=j,

Qij(1 +

pj^j_ln_j^j)pi

1 m

pj^j⁽Qij^;pi) 1ln^j^jpi: Consequently, the matrix

X

dened as

X

^{= 1}^p_j_j⁽

Q

^;

P

¹⁾

(16)

is the one that satises (c) of Theorem 3.3, for it has also the property

X

²=

P

¹^;

I

, which remains to be shown:

X

²= 1_j^j⁽

Q

^;

P

¹)²

= 1_j^j⁽

Q

²^;²

QP

¹^{+ (}

P

¹⁾²^{) (}

Q

^and

P

¹ ^commute)

= 1_jj(

P

⁽

P

¹⁾^;²

P

¹⁺

P

¹⁾

= 1_j^j⁽

P

(

P

¹)^;

P

¹)

=

P

¹^;

I

^{by (9) and}^<⁰^:

(8)

If (15) holds, however, we'll have to put

X

=^; 1

pj^j⁽

Q

^;

P

¹) to arrive at the same conclusion.

On the other hand, one can easily show using the power series denition exp(

A

) =^X¹

p⁼⁰

p1!

A

^p

that, starting from (c) Theorem 3.3, the relation (16) denes a matrix

Q

that is stochastic since it can be written as the exponential of an intensity matrix¹ :

Q

=^p^j^j

X

+

P

¹= exp1

2(ln^j^j(

I

^;

P

¹) +

X

):

Furthermore,

Q

²⁼

P

⁽

P

¹) by Corollary 3.2. Finally,

Q

satises (14) by its very denition.

A characterization involving a lower bound for the negative eigenvalue.

Instead of seeking

R

, we put the following question. For a xed

P

¹ that has the property (10), what are the possible values of <0 that make the stochastic matrix

P

⁽

P

¹) embeddable? As a consequence of Theorem 3.3 among other things, the following result emerges. Recall the denition offrom (10).

Theorem 3.7.

^Let

P

¹ be dened by (10), then there exists ²]0, depending on

P

¹, such that

P

⁽

P

¹) embeddable ^, <0:

Proof.

Dene the map F : 0^{! M} : 7!

P

(

P

¹), where ^M is the set of 33 stochastic matrices with positive determinant. Let P be the set of all embeddable stochastic 33 matrices. We have then by the nature of the condition (c) of Theorem 3.3

2F^;1(^P) ⁾ 0 F^;1(^P): (17)

Also, if we take an arbitrary real square root

X

of

P

¹ and a⁰such that ln^j⁰^j mini⁶⁼j Xp^j^ij, then

P

(⁰

P

¹) is embeddable, so

F^;1(^P)⁶=: (18)

In addition, it was proven in Kin62] that P is closed in M, so by continuity of F, we have that F^;1(^P) is closed in 0. Hence, in conjunction with (17) and (18), F^;1(^P) = 0 , for some ² 0. We still have to show that ⁶= . Suppose =. Then ²F^;1(^P) and

P

⁽

P

¹) is embeddable. But then there existsk such thatPkk(

P

¹) = 0, so we can apply Ornstein's Theorem (Chu67], II.5, Theorem 2) which states that whenever a stochastic matrix

Q

^{, with}^Qij = 0 for somei andj, is embeddable, thenQ⁽_ijⁿ⁾= 0 n1 and in particular Q¹_ij = 0.

Consequently limn^!+1P_kk⁽ⁿ⁾(

P

¹) =pk= 0, giving a contradiction.

1The exponential of an intensity matrix is always stochastic. For a proof, see Fre71], p. 151.

(9)

4. Illustration of the results

Let p¹ = p² = p³ = 1=3, then = ^;1=2. For

P

⁽

P

¹^{) with} ^;¹⁼²^{< <} ⁰

to be embeddable, there must exist by Theorem 3.3 a matrix

X

²^R³³ ^satisfying

X

²⁼

P

¹^;

I

^{such that}

X_ij 1

3c i6=j withc= ln^j^j_. (19)

By Gan60], such a matrix

X

assumes the form

X

⁼

X

⁽^uv^{) =}

V

0

@

0 0 0

0 u ;¹⁺vu²

0 v ;u

1

A

V

^;1^uv²^R^v⁶^{= 0}

(20)

where

V

is a nonsingular matrix satisfying

P

¹^;

I

⁼

V

^diag(0^;¹^;¹⁾

V

^;1^:

It is easy to check that we can take

V

=

0

@

1 1=3 1=3 1 ^;1=3 0 1 0 ^;1=3

1

A whence

X

⁽^uv_{) = 13}

0

@

;uu²u^;²^;;uvvvvv²⁺¹⁺¹ ^;u²u⁺²²u⁺²+ 2vv²uvv⁺³⁺¹vuv⁺¹ ²u^;²;⁺²u2v²²uv⁺⁺³v;uvuv⁺²v⁺²

1

A: (21)

We shall now show that

P

⁽

P

¹) cannot be embeddable for values of with

;2=^p5< c<0.

Suppose

P

(

P

¹) is embeddable. Then according to (19), there exist real num- bersuandv,v⁶= 0 such that

Xij(uv)1

3c i⁶=j:

Using (21), these conditions become algebraic inequalities in u and v. Among others:

(a) ;u+v ;c for (ij) = (31) (b) u+ 2vcfor (ij) = (32)

(c) X¹²(uv) +X²¹(uv)2c=3, yielding^;2u^;vc

(d) jvj 2c+ 2^p1 +c²for (ij) = (12) and (21)

It is a straightforward matter to see that (a), (b), (c) and (d) together imply 2c+ 2^q1 +c²^;c:

The contradiction lies in the fact that this inequality cannot be satised if^;2=^p5<

c < 0. Hence for

P

(

P

¹) to be embeddable, one must necessarily have c

;2=^p5 or^;e^;2⁼^p⁵. In this case, the lower bound from Theorem 3.7 must satisfy ^;e^;2⁼^p⁵. In fact, =^;e^;^p³, which is a consequence of the following more general result that provides a formula to express in terms ofp¹,p² andp³:

ln^j^j^;2= 4p=s²^;1 ifpm(1^;pm)> s=2 pm=(1^;pm) otherwise

(10)

wheremis an index such thatpm= minipi,p=p¹p²p³ands=p¹p²+p¹p³+p²p³. The proof uses the characterization in Theorem 3.3 and is very technical, so it will be omitted here.

Acknowledgement.The author is indebted to the referee for his suggestions resulting in an enhancement of the manuscript.

References

Bar82] D. J. Bartholomew,Stochastic Models for Social Processes, third ed., John Wiley & Sons, Chichester, 1982.

Chu67] K. L. Chung,Markov Chains with Stationary Transition Probabilities, second ed., N. Y., Springer, 1967.

Cut73] James R. Cuthbert,The logarithm function for nite-state Markov semi-groups, J. Lon- don Math. Soc. (2)⁶(1973), 524{532.

Fre71] David Freedman,Markov Chains, Holden-Day, 1971.

Fry80] Halina Frydman, The embedding problem for Markov chains with three states, Math.

Proc. Camb. Phil. Soc.⁸⁷(1980), 285{294.

Fry83] Halina Frydman,On a number of Poisson matrices in bang-bang representations for3 3 embeddable matrices, J. Multivariate Analysis¹³⁽³⁾(1983), 464{472.

FS79] Halina Frydman and Burton Singer, Total positivity and the embedding problem for Markov chains, Math. Proc. Camb. Phil. Soc.⁸⁶(1979), 339{344.

Gan60] F. R. Gantmacher,The Theory of Matrices, vol. one, Chelsea Publishing Company, N.

Y., 1960.

Kin62] J. F. C. Kingman, The imbedding problem for nite Markov chains, Z. Wahrschein- lichkeitstheorie¹(1962), 14{24.

JR79] Sren Johansen and Fred L. Ramsey,A bang-bang representation for3 3embeddable stochastic matrices, Z. Wahrscheinlichkeitstheorie verw. Gebiete⁴⁷(1979), 107{118.

Joh74] Sren Johansen, Some results on the imbedding problem for nite Markov chains, J.

London Math. Soc. (2)⁸(1974), 345{351.

Spe67] J. M. O. Speakman,Two Markov chains with a common skeleton, Z. Warscheinlichkeit- stheorie⁷(1967), 224.

SS77] Burton Singer and Seymour Spilerman,Fitting stochastic models to longitudinal survey data { some examples in the social sciences, Bull. Int. Statist. Inst. (3) ⁴⁷ (1977a), 283{300.

Centrum voor Statistiek en Operationeel Onderzoek, Vrije Universiteit Brussel, Belgium

[email protected]