質問に基づく形式言語の機械学習に関する研究

(1)

九州大学学術情報リポジトリ

Kyushu University Institutional Repository

質問に基づく形式言語の機械学習に関する研究

坂本, 比呂志

Graduate School of Information Science and Electrical Engineering, Kyushu University

https://doi.org/10.11501/3147889

出版情報：Kyushu University, 1998, 博士（理学）, 課程博士

(2)

CHAPTER 4 Learning Finite-Memory Automata

Automata theory is greatly attractive and several types of automata have b n introduced, e.g., two-way, two-head, and infinite-state automata.

A finit -memory autornaton is one of thes representations. Our interest is whether a finite-memory automaton expresses a regular language in a mor compact way than finite automata as well as whether such automata are l arnable using helpful queries.

In this chapter, we deal with finite-memo;y automata, denoted by FMA, introduced in

(32).

Assuming a restricted memory structur that consists of a finite number of registers, an FMA can store arbitrary input syn1bols. Thus, th language of an FMA is defin d ov r a potentially infinite alphab t.

W first study the following d cision problem for any FMA A as well as for any deterministic one, d noted by DFMA: th m mbe;ship p;oblem (i.e., given an A and a string w decid whether w is accepted by A), and the non-emptiness p;oblem (i.e., given an A, d cide wheth r th language of A i nonempty). The membership probl m is P-complet provided th A is d t rministic and any other problem is NP-complete.

Other results ar obtained for inequivalence probl m (i.e., giv n two FMAs, decide whether they accept th same languag ). Using th NP-hardness of non-emptiness for DF MA, we show that the in qui val nee probl m for DFMAs is also NP-hard.

Moreover, this problem for FMAs is at least PSPACE-hard.

By the resulting intractability of the problen1s, we consider a restriction of FMAs and the class is referred to as simple DFMAs. We second establish the learnability of

(3)

the simple deterministic class via membership and equivalence queries.

In order to arrive at a meaningful learning model, w first present a procedure, given any two simple DFMAs A and

^B,

to output a string accepted by exactly one of them if L(A) # L(B) and output "yes' otherwise. Then, we provide the announced learning algorithm, show its correctness, and analyze its running time. The algorithm is partially based on Angluin 's [4] observation table technique. Finally, we obtain the result that the class of simpl DFMAs is exactly learnable via membership and equivalenc qu ries.

4.1 FMAs and DFMAs

Rec ntly Kamin ki and Francez [32] introduced a n w model of computation the so

called finite-memory automata denoted by FMAs, for dealing with languages defined over an infinite alphabet. In this model, we assume a very restricted memory structure lik a work tape which consists of a finite number of registers. A register stores arbitrary symbols.

An FMA can check whether or not a register contains an input symbol. Reading an input symbol it changes its stat by exclusively using inforn1ation in which register of it work tap th input symbol is or will b tor d. If th input symbol has not been stored y t, then it is written into a cell of th work tape depending only on the current tate. Cons qu ntly, an FMA deals with any finite and pot ntially infinite alphabet.

Compar d to oth r con1putational models FMA is natural extension of finite au

tomaton and it can be also finitely described analogou ly to oth r computational mod

els. Moreover, the class of FMAs possesses s veral u ful closur properties ( cf. [32]).

We b gin with th definition.

^L

t

ⁿ=

{

^ai

I i

^E

IN} be any fix d countably infinite alphabet and let

^�

be any finite subset of

^n.

he fr monoid ov r

ⁿ

is denoted by D*

and

ⁿ⁺

is defined similarly. For any str·ing a, the xpres ions [a] and a[i] denote the range and the ith symbol of a, resp ctiv ly, wh r the range of a is the set of symbols appearing in a. Furth rmor ' for any

n E lN'

w d fine

ⁿⁿ=

{a

^E

D* I Ia I

^{= n}

}

.

For each

^�

�

^n,

the set

^�n

is

^d

fined similarly. Any s t

^L

� D* is also referred to as a

language.

Let � be a special symbol not belonging to

^n.

An assignment

IS

a finite string

(4)

x1x2

^·^·^·

Xn E (!1

U

{rt} _)n

such that if

xi = Xj

and

i # j,

_then

_{xi =} rt

_{for all}1

:s; i, j _:s;

n.

Finally, we define a

per·mutation

^over

n

u

{rt}

as follows. Let

51,52 � n

u

{rt}. A

total function 'lj; :

51

^---+

52

is called

one-to-one

if there exists inverse of

'l/;,

denot d

by �>-I, such that for every

a E 51, 'lj;-1

^·

'lj;(a) =a.

Every such one-to-one mapping

is said to b a

permutation,

especially d noted by 7r' over

51

if

51 = 52

and

7r(rt) = rt

provid d

rt E

¹^·

DEFINITION 4.1.1. (Kaminski

&

Francez

[32]). A finite-memory automaton

^(abbr.

FMA)

is d noted by a 6-tupl

A= (Q,q0

^T

,

g

, 8 ,

^F

)

, where

(1) Q

is a finite set and

q0 E Q,

the 1 m nts of

Q

are are called

states

^and

q0

is said to be the

initial state, (2)

T

E (!1

U

{rt} _)k

is an assignment of length

k

called the

initial assignment, (3)

(},called the

reas ignm nt,

is a partial function from

Q

to

{ 1, 2 _. . . k},

that is, for each p

E Q,

g(

p ) E {1, 2, ... , k}

or und fined,

(4) 8 � Q

x

{1,

2,

.

^.^.

, k}

x

Q

is called the

transition relation,

and

( ₅₎

F

� Q

is call d the set of

final states.

Let A be an FMA. When A is in a state p and reads a symbol

a

then A changes its stat into

q

by a transition (p,

i _q) E 8

provided

a

is the ith symbol of As current assignm nt i. If

a tf:_ [

i

]

, thenAr writes the g(p)th window by

a

and change the state by a transition (p, g(p),

q) E 8.

More formally, th computations of an FMA A ar defined as follows. Every pair of a state and an a ignm nt is call d a

configuration.

The pair

( _q0, T)

i ref rred to as the

initial configuration. A

ll configuration with final tates are called

final configurations.

We defin a r lation f-as follows: Let

a

and /3 b a signments, and let p

q E Q.

Then, (p,

a)

f-

(q,/3)

iff there exist an

a En, ani E {1, 2, ... , k},

_anda (p,

i, _q) E 8

such that:

1. _{a =} a[i]

and

a _{= /3}

or

2. _a rf_ [a],

_g(p)

₌ i fJ[i] _=a

_and

a[j] ₌ fJ[j]

_{for all}

j f i.

When necessary, we write

(p, a)

f-a

( _{q /3)}

by sp cifying th symbol

a.

If ther exists a sequence of configurations eo,

c1, ... , Cn

such that

ci

^f-a•

ci+l

for all 0

:s; i :s;

n, then we write

c0

^f--w

cn,

^where^w

= a0a1

^·^· ^·^an^- Finally, we use

c0

^f-*

Cn

provided there exists a w

E

fl* such that Co f--w Cn^· A is said to

accept

a string w if

c0

^f--w

cn,

where

c0

is the initial configuration and

Cn

is a final configuration of A. The language accepted by A is denoted by L(A) and defin d to be the set of all strings w

E

fl* that are accepted by A.

(5)

DEFINITION 4.1.2. (Kaminski and Francez

[32]).

An FMA is said to be determin

istic (abbr. DFMA) if for ach p

E

Q and each 1 � i � k, the value of

e(p)

is defined and th re exists exactly one

q E

Q such that

(p,

i,

q) E

b, where k is the length of the initial a signment.

Th following results are du to

[32].

The first proposition establishes a relationship betw en FMA and finite automata, that is, FMAs and finit automata are computa

tionally equival nt with r spect to every finite alphabet. The second one deals with an FMA it elf, that is, ev ry language of an FMA is closed under permutations over the alphab t n.

PROPOSITION 4.1.1. (Kaminski

&

Francez

[32]).

For every finite automaton Mover a finite alphab t

�'

th re exists an FMA

A

such that

L(A)

n

�* = L(M).

Conversely for every FMA

A

and finit alphabet

�'

there exists a finite automaton M such that

L(M) = L(A)

n

�*.

PROPOSITION 4.1.2. (Kaminski

&

Francez

[32]).

Let

A

be an FMA, T be the initial assignment of

A,

and ^1rbe a permutation over

n

with

1r(a) =a

for all

a E [

^T

].

Then,

1r

( L (A) )

⁼ L

(A)

, w her 1r

(

L

(A) ) = { 1r ( w) I w E L (A)} .

Sin1ilar to th ca e of finite automata, an FMA can be described as a directed graph whose nod d not stat s and dges denot tran itions. There is an edge lab led by k from nod ito node

j

iff

(

i k,

Sj) E

b. A node is labeled by

e(s)

if

e

is defin d for

s.

EXAMPLE 4.1.1. An example for an FMA with two regist rs is displayed in Fig

ure 4.1. Th registers ar initially mpty. Thus, th initial assignment is denoted by

The states from th left order are d noted by

q0, p

and

q,

where

q0

is the initial state and

q

is the unique final state. Since th transition r lation and reassignment are everywhere defined, this FMA is deterministic.

An example run of this DFMA for the string

abae

is:

(q0, rtrt)

�a

(p, art)

�b

(p, ab)

�a

(q,ab)

�c

(q,ae).

The languag ace pted by the DFMA is

L = {awa I a En, wED*}.

Additionally, a corresponding DFA for th alphabet

� = {a, b}

is also illustrated.

This DFA accepts the languag

{ewe I e E �' wE �*}.

(6)

a

2 1

�

² ^a _b ^b _a

Initial assignment: ##

ao·�M

Figure 4.1: A DFMA and the corresponding DFA.

By D finition 4.1.2, the class of DFMAs is a subset of the set of all FMAs. In ord r to se that it is even a proper subset, let

A= (Q,

q0, T, !],

8, F)

be any DFMA.

By changing F into

Q \ F,

we obtain a DFMA

A'

such that

L(A') = n* \ L(A),

since the computation of A is unique for each input. Thus, the class of DFMAs is closed under compl ment but the class of all FMAs is not.

THEOREM 4.1.1.

(

Kaminski & Francez

[32]).

Th class of FMAs is closed und r union and intersection, but not closed under complement. The class of DFMAs is closed under complement, but not closed und r union and intersection.

On th oth r hand by Proposition 4.1.1 wh n r tricted to finite alphabets every languag ace pted by an FMA must be a regular language. L t

L(A)

be the language of an FMA over an infinit alphab t and let

L:1 L:2, ... , L:n,

..

.

be a sequence of finite alphabets such that

L:i

C

l:i+I

for all i � 1. Then, we have that

L( A)

n

L:i

C

L(A)

n

L:i+I·

hus, many d cision problems for FMAs can be reduced to the question whether we can efficiently find an evidence string w E

L:�

such that w

rf_ L( A)

n

L:�,

and indeed, in the next section, we solve several decision problems for FMAs by reducing the decision problems to a finding problem. Esp cially, we show that there exists a polynomial upper bound for the l ngth of a minimum string in

L(A)

if

L(A) =/- 0.

(7)

4.2 Decision Problems for FMAs

In this section, w first study the membership problem* for FMAs as well as for deter

ministic FMAs. The definition is provided as follows.

PROBLEM 4.2.1. The membership problem

f

or FMAs IS given an FMA A and a string w E 0*, to decide wh ther w E L(A). The problem is denoted by MEM. The m mb r hip problem for DFMAs is defined similarly and denoted by MEMD.

Since the computation time of an FMA equals the input length, we directly get MEMD E P and MEM E NP. A Turing machin simulating an FMA needs O(k log n

)

steps to r write the assignm nt, where k is the length of the assignment and n =

max

{ i I

a

i

E

[

w

] } .

We show th P-hardness of MEMD by reducing the monotone circuit value problem to it. Th monotone circuit value problem is defined as follows ( cf.

[22]).

Let X ⁼

{Xi I li

E

lN}

be th set of Boolean variables. We denote a circuit C over X by a sequence C1 C2, ... , Cn- Each component C

i (1 � i �

n

)

is call d a gate of

ct.

A circuit C is called monotone if C contains no negation. A truth assignment of a circuit ofm variables is a mapping

f:{x1, ... , xm}---+ {0,1}.

The value ofC with respect to f is defin d as usual (cf.

[50]).

PROBLEM 4.2.2. The monotone circuit valu problem, denot d by monotone Cvp, is given a monotone circuit C and a truth assignment

f

appropriate for C, to decide wheth r

f(

C) ⁼ 1.

THEOREM 4.2.1. MEMD is P-compl t

PROOF. We show that monoton CvP IS log- pac reducibl to MEMD. Let C =

cl) c2 .

. . Cn be a mono ton circuit, let

xl) x2). . . Xm

b all variabl s in c and l t

f

be any fixed truth assignment of C. First, w construct a string w E 0* from C as follows. For each

1 � i �

n, 1 t

( ') { ^ai,

^{if Ci}⁼

^Xi,

wz =

akaja·i, if C

i

⁼Ck 1\ Cj or C

i

⁼Ck v Cj.

*In d cision problems, we assume a symbol ai E n = { aj

I j

E lN} to be encoded by a followed by bin(i) E {0, 1}*, where bin(i) denotes i in binary.

twithout loss of generality, we assume the m input gates of a circuit C to be C1, ... , Cm, i.e., C'i =Xi fori= 1,

...

, m.

(8)

The string

w

is th concatenation of all ^w

(i)

^'s

,

that is,

w

⁼

w(1)w(2)

^·^·^·

w(n).

Com

puting

w

can be done in O(log

n

)-space.

Next, we have to construct an FMA

A

⁼

( ^Q, q0, T,

(!, 8,

F).

We set ^{T =}

�2n,

and define th set Q of states as well as the reassignment (! by distinguishing the following cases for all

i E {

1'

.

..

'n}.

If

ei

⁼

Xi

we add the state

Pi

to Q and set

e(Pi)

⁼

i

provid d

f(xi)

⁼ 1 and

e(Pi)

⁼

n + i

otherwise. If

ei

is not an input gate, we add four states

Pi, Pi1, Pi2, Pi3.

Furth rmore, we set

e(PiJ

⁼

i, e(Pi3)

⁼

n + i

in case

ei

⁼

e k

^1\

ej

and

e(PiJ

⁼

n + i, e(PiJ

⁼

i

if

ei

⁼

ek

^V

ej.

Clearly, this computation can be done in O(log ⁿ)-space.

For each 1

::;

i

::; n,

th transition relation 8 is defined as follows:

2.

Let

ei

⁼

ek

^1\

ej;

we distinguish the following cases:

(a) If

i

^<

n

then let

(pi,k,piJ, (pi,n + k,pi3), (Pi1,J,Pi2), (Pi1,n + j,piJ,

(Pi2,i,Pi+I), (Pi3,j,piJ, (Pi3,n + j,pi3), (Pi3 n + i,Pi+I) E

8.

3. Let

ei

⁼

ek

^V

ej;

we distinguish the following cases:

(a) If

i

^<

n

then let

(pi,n + k,Pi1), (pi,k,piJ, (Pi1,n + j Pi2), (Pi1,J,Pi3), (Pi2 n+i,Pi+I) (Pi3,J,Pi3) (Pi3 n + j Pi3), (Pi3,i,Pi+I) E

^8.

(b) In case

i

⁼

n

l t

(Pi n + k, Pi1) (Pi k Pi3) (Pi1 j, Pi3), (Pi3 ^j PiJ, (Pi3, n + j, PiJ, (Pi3, i, Pi+I) E

^8.

Finally, we define Qo ⁼

PI

and

F

⁼

_{Pn+I}.

For each stat in Q, at most two tran itions are defined. Thus the FMA

A

can be computed in O(log

n)

spac

.

It remains to show that

wE L(A)

iff

J(e)

⁼

1.

CLAIM 4.2.1. Let

i 2 1,

let

(qo,T) r-w(I)w(2)···w(i) (Pi+I,ai+I)

and l t ^a be the last symbol of

w( i),

where

( q0, T)

⁼

(PI, ai )

. Then,

ai+I [i]

^{= a}if

f( ei)

⁼ 1 and

ai+I [n+i]

⁼

a if

j(ei)

⁼0.

The proof is done by induction on

i.

By construction, only the gates

ell e2 .. . , em

are variables. For each 1

::; i ::;

^m,

lw( i) I

⁼ 1 and

e(p) -j. e( q)

if p

-j. q.

Thus, by the definition of A, the claim follows for all

i

⁼

1,

^{.. . ,}^m.

(9)

Next, assume the induction hypothesis for all

1 ::; i ::;

^m. Note that, for each 1

::; i ::; jTj,

the ith window is rewritten at most once, since

g(p) -f g(q)

if

p -1

q. Now, let

i ^�

rn; then

Ci+l

⁼

Ck

A

Cj

^or

Ci+l

⁼

Ck

V

Cj,

^where

k

^<

j::; i.

First, w consider

Ci+l

⁼

Ck

A

Cj.

If

f(Ci+I)

⁼ 1, then

f(Ck)

⁼ 1 and

f(Cj)

⁼ 1. By the induction hypothesis,

ak[k]

⁼

ak

and

aj[j]

⁼

aj.

Since these two symbols are never replaced,

ai+I[k]

⁼

ak

and

ai+I[j]

⁼

aj.

^Thus,

(Pi+I,ai+I) �

^aka]

(P(i+I)2,ai+I)·

Note that

w(i)

^contains

exactly on

ai

that does not appear in any

w(j)

for all

j

^<

i.

Then,

ai+l rf_ [ai+1].

Since

e(P(i+I)2)

⁼

i +

1, it follows that

(P(i+I)2, ai+I)

^�^a•+l

(Pi+2, ai+2)

and

ai+2[i

+

1]

⁼

ai+l·

Thus,

(Pi+l ai+I) �w(i+l) (Pi+2,ai+2)

and

ai+2[i

+

1]

⁼

ai+l·

If

f(Ci+I)

⁼ 0 th n

f(Ck)

⁼ 0 or

f(Cj)

⁼ 0. Suppose that

f(Ck)

⁼ 0. By the induction hypothesis,

ak[n+k]

⁼

ak,

and thus

ai+I[n+k]

⁼

ak.

Hence,

(Pi+I,ai+I) �

^ak

(P(i+l)3' O'i+l)'

and by definition,

(P(i+lb' O'i+l) �

^a)

(P(i+lb' O'i+I).

Since

e(P(i+l)J

⁼

n + i + 1,

w also have

(Pi+I, ai+I)

^�^aka1a'+1

(Pi+2, ai+2).

Next, 1 t

f(Ck)

⁼ 1. Since

f(Ci+I)

⁼0, we conclude

f(Cj)

⁼0. Then

ai+I[k]

⁼

ak

and

O'i+l[n

+

j]

⁼ aj. Thus,

(Pi+l O'i+I) �

^ak

(P(i+l)p O'i+t)

and

(P(i+l)p O'i+I)

^�^a1

(P(i+Ih ai+I).

Again, since

e(P(i+Ih)

⁼

n + i + 1,

we obtain

(Pi+ I, ai+I) �

^akaJa•+l

(Pi+2, ai+2)·

Hence,

(Pi+I, ai+I) �w(i+l) (Pi+2, ai+2)

and

ai+2[n + i + 1]

⁼

ai+I·

Second 1 t

Ci+l

⁼

Ck

v

Cj.

The reassignment

e

on

Ci+l

is obtained from the reassignment for

Ci+l

⁼

Ck

A

Cj

by switching

i + 1

and

n + i + 1 j

and

n + j,

and

k

and

n + k

to one another, and th same witch s apply to 8. Thus, this case can be handl d analogously.

Hence, for each

1 ::; i ::; n- 1, (p1,ai) �w(l)w(2)···w(i) (Pi+I,ai+I),

and for each 1

::; j ::;

i,

ai+

¹

[j]

⁼

a

j iff

f ( C j)

⁼

1

and

ai +

¹

[ n + j]

⁼

a j

iff

f ( C j)

⁼ 0. Now,

(PI a!) �w(l)w(2)···w(n-l) (Pn an)·

Finally th computation

(Pn, an) �w(n) (Pn+I, an+d

is defined only if

f(n)

⁼

1.

This prov s Claim 4.2.1.

We conclude that

f(C)

⁼ 1 iff

w

E

L(A)

since F ⁼

{Pn+1}.

Ther fore, MEMD is

P-complete.

Q.E.D.

EXAMPLE 4.2.1. We illustrat th reduction d fined abov using the monotone cir

cuit

c

⁼

cl, c2, ... 'c6,

wher

ci

⁼

Xi

for i ⁼

1,

2, 3,

c4

⁼

cl

A

c3, Cs

⁼

c2

v

c4

and

C6

⁼

C3

A

C5

as well as the truth assignment

f

d fined as

f(xi)

⁼ 0 and

(10)

j(x2)

⁼

J(x3)

= 1.

Then, the string

w

is the concatenation of

w(l)

₌

a1, ^w(2)

=

a2, ^w

(3) =

a3,

w(4) =

a1a3a4, ^w(5)

=

a2a4a5

and

w(6)

=

a3a5a6,

and

�12

is the initial assignm nt.

The resulting FMA is shown in Figure 4.2.

Cs•---·.

Figure 4.2: The FMA A comput d in Exampl 4.2.1.

We next how the NP-hardness of MEM by reducing circuit SAT to it (for circuit SAT, see Problem 2.3.2). For a given circuit

C,

since there are 2m possible truth assignments for

m

variables of

C,

the main part of the proof is a one-to-one mapping between the truth assignment of

C

and th configurations of an FMA A on a string

w con tructed from

C.

THEOREM 4.2.2. MEM is NP-complet

PROOF. We provide a log-space reduction from circuit SAT to MEM. This reduction is almost the same as the on in Theorem 4.2.1. The construction of the string

w

remains unchanged except that we add

a0

to it a it first symbol. Suppose the target circuit

c

=

cl' c2, ... 'Cn

to have

m

variables, and

ci

=

Xi

for all

i

₌ _1'

... ' m.

Again, we have to construct an FMA A ₌ (Q,

q0,

T, (}, 8,

F).

First, we set T =

�2n+l.

Next, we d fine Q =

{qo,Pn+d

^U

{pi, qi jl::; ^i::; m}

^UQ', and F ₌

{Pn+I},

wher Q' is the union of the following s ts.

(11)

Finally, we define (} and 8 as follows. Let

e(q0)

⁼

2n +

1 and let

(q0, 2 n + l,pi), (qo, 2n+l, qt) E

8. For ach

1 � i

^<m, let

e (p i )

= i and let

e(qi)

⁼

n+i.

More

over, w put

(Pi i Pi+l) , (Pi

i,

qi+I) E

8 as w ll as

(qi, n +

i,

qi+1), (qi, n + i, Pi+l) E

^8.

Fori= rn, we d fin

e (Pm)

= m and

e(qm)

⁼

n

+ m. Also, let

(Pm,m,Pm+I), (qm, n +

^rn,

Pm+d E

^8.

Since we can assume that for any

i �

^m

+ 1, C i

⁼

Ck

^1\

Cj

^or

C i

⁼

Ck

^v

Cj,

wher k <

j

^<

i.

Th gates can be treated exactly as in Theorem

4.2. 1 .

Clearly, th reduction is O(log

n

)-spac computable.

We show Lhat

f( C)

⁼

1

^iff

w E L(A).

First,

A

reads

a0

in state

q0,

it writes

a0

in the

2n +

1st window, and changes its state nondeterministically to

p1

^or

q1.

Furthermore, for each 1

�

i < m, when

A

reads

w(i)

⁼

ai

in a state

Pi,

^the

ai

is written in the ith window and

A

nondet rministically changes its tate to

Pi+l

^or

qi+l·

If

A

is in a state qi, then ai is written on th n + ith window and A nondetern1inistically changes its state to

Pi+l

or qi+l· When A reads

am

in state

Pm,

it writes

am

in the mth window and changes its state to

Pm+l·

Suppose,

A

is in state

qm·

Then

am

is written in the

( n +

^m)th window, but

A

changes its state to

Pm+l,

^too.

Thus there xist 2m distinct computations on

(

^{q0 T}

) �aow(l)w(2)···w(m) (Pm+l, am+d·

Moreov r w can easily defin a on -to-on correspondence betw en the 2m possible assignm nt

am+

^I and all pos ible

2m

truth as ignn1ents

f

: X -+ {0,

1 }m

by setting

Xi

1--t 1 provided

A

writes

w( i)

in the ith window and

Xi

1--t 0 oth rwise.

Let i > m; th cases of logical and/ or gates have alr ady been shown in Theo

rem

4.2.1.

^Let

ci

⁼

--.cj w(i)

⁼

ajai

^and

(qo,T) �aow(l)w(2)···w(i-l) (Pi ai)·

Since

aj

already app ars in w

(

l

)w( 2)

^·^·^·^u

(i-

1),

ai[j]

=

aj

or

ai[n + j]

=

aj.

If

f(Cj)

=

1,

then

ai[j]

=

aj

and if

f(Cj)

⁼ 0, th n

ai[n + j]

=

aj.

If

i

^{< n,} when

A

reads the

aj,

it changes its state to

Pi1

if

ai[j]

⁼

aj,

and changes to

Pi2

otherwise. Since

g(piJ

=

n

+

i

and

e(PiJ

⁼ i,

(pi,ai) �a1a' (Pi+l,ai+t)

such that

ai+I[i]

⁼

ai

if

f( C i)

= 1 and

ai+l [n

+ i

]

=

ai

if f(

Ci)

= 0. If i =

n,

then

Pn1

is not defined. Thus,

(Pn,an) �a1a' (Pn+l,an+d

iff J

( C n )

⁼

1 .

^Hence,

J(C)

⁼

1

^iff

wE L(A).

Therefor ,

MEM is NP-complete. Q.E.D.

(12)

Next, we study further decision problems for FMAs, that is, the non-emptiness and inequivalence probl ms. We b gin with the non- mptiness problem for FMAs defined as follows.

PROBLEM 4.2.3. The

non-emptiness problem jo1· FMAs

is given an FMA

A,

to d - cide whether

L(A) =J

^0. This problem is d noted by ^--.EM^P^. The non-emptiness problem for DFMAs is defined similarly and denoted by --.EMPD.

Whil an FMA deals with an infinite alphabet, we can reduce it to a finite alphabet using the Proposition 4.1.2 which tells us that if a string

w

is accepted by

A,

then the string

1r(w)

for any ^7ris accept d by

A.

Vic versa, if

w ¢:_ L(A),

then so is

1r(w)

for any permutation ^1r.

Mor ov r, Kaminski and Francez

[32]

showed that if

A

accepts a string

w

of length n, then A also ace pts a string

w'

of length n _{such that}

ll[w]ll

^{:S k,}wher k is the length of the initial assignm nt. Thus,

II [w] II

of any string accepted by

A

can be reduced to be finite. However, we can not conclude --.EMP

E

P from their result, since it is not providing an upp r bound on the length of a minimum string ace pted. Thu first we have to show an upper bound on the length of minimum strings accepted by

A.

Clearly, in order to establish --.EMP

E

NP, this upp r bound has to be polynomial in the siz of A.

Let

A

=

(Q, q0 T

e,

8, F)

be an FMA and let ⁿ b an assignment of

A

where

IIQII

= n and

ITI

= k. _By

C(A)

w denote th t {(p a)

I (q0 T)

1-* (p,a), p

E Q

a ED U

{�}k}.

A state p is said to b

reachable

if (p, a)

E C(A)

for some a. By a we denote the set

{ i I

1 :S

i

^{:S k,}a

[ i] =J �}.

Now w ar r ady to prove the announced upper bound.

LEMMA 4.2.1. If an FMA

A

⁼

(Q q0, T,

e

8, F)

ace pts at least one string ^w,then there is also a string

w'

such that

lwl

^{:S kn,}wh re k ⁼

ITI

and n ₌

IIQII·

PROOF. Assume any FMA A =

(Q, q0, T,

e,

8, F)

and any string

w E L(A).

Then

w = b1 b2

·

· · bm for some ^m 2:: 1. If m :S _kn, we ar done. Now, suppose that ^m^> kn.

Since

wE L(A),

there exists a computation ^c⁼

(q0,T)

^l-b1 ^{• •}^• ^f-bm (p,a) with p

E

F.

Without loss of generality, w can also assume that p is the first state in ^cb longing

(13)

to F. Cons]der any con1putation step of the form

(Pi-l,ai_1)

r61

(Pi,ai)

such that

ai-l

⁼

ai.

There ar two cases to be distingui hed.

Case 1.

bi

E

[ai-l]·

Then then the computation does exclusively depend on 8, that is,

ai-l

⁼

ai.

Assum

bi

to be found at position

j

in

ai-l·

If

(Pi-1,J,Pi-l)

E 8 and

Pi = Pi-l,

then we just r move

bi

from w. L t w' be the resulting string. Clearly, w E L(A) iff w' E L(A).

Otherwis , A has to reach a state that is different from

Pi-l·

The worst-case would happen, if the following symbols in w are equal to

bi,

since then

j

remains unchanged.

But even in this case, there are at most n-2 states being different from

Pi-l

and the possible singl ton tat

p

1 E F. Finally, if

then A must have reached on state twice or

Pi+n-2

is in

F.

In the latter case, we are done again. Otherwise, i.e., if

Pi+n-2 tf_ F,

we simply cut the whole circle by removing the corresponding symbols in w. Again, th resulting string is accepted iff w has been accepted.

Case 2.

bi tf_ [ai-l]·

Then, the reassignment

g

i applied and we have again two cas s to consider. Let

j

⁼

g(Pi-1)

such that

ai-l (j) f:- �·

Hence th r has already be n a symbol in

ai-l (j),

say tj. In this ca e we can r place

bi

in w by t1 without altering the computation and hence w are back to Cas 1.

Now, suppos

ai-l #- ai,

that i ,

j = g(pi_1)

such that

ai_1(j)

replace

�

by

bi.

Note that this case can happ n only k times.

Cons quently, the resulting str]ng u can have a length at most kn.

LEMMA 4.2.2. ^-,EM

P

E NP.

�·

Then, we

Q.E.D.

PROOF. By Proposition 4.1.2 and Le1nma 4.2.1, it suffices to consider the finite al

phabet

'E = {

a1, ... , ak

}

, w h re k ⁼

IT 1.

A nondeterministic Turing machine M that accepts -,EMP in polynomial tim b haves as follows. Given an FMA A ⁼

(Q, q0, T, g,

8,

F),

the machine gu sses a string w E

'E*

with

lwl :::;

kn, where n ⁼

IIQII

as well as a state

p

E F and an assignm nt

a

E

('E U {�})k.

If

(q0,T)

r--w

(p,a),

the machine M accepts A.

(14)

By Lemma

4. 2.1,

^!VIbehaves correctly, and clearly, M works in time polynomial in

the _{siz of}A.

Q.E.D.

In order to prove th NP-completeness of •EMP, we recall the satisfiability problen1, denoted by SAT (for this problem, see Probl ^In

2.3.1).

Let

n 2:: 1,

and let G _b

a Boolean formula in conjunctive normal form over

X = { x1, ... , x2n},

where the

x1, ... , Xn

are variables of G _and

Xn+i = •xi

for all

i

⁼

1, ... , n.

For each claus

Ci = fj1

^V^· ^·^·^V

eik

of G,

eiz EX

for ^z

=

^1,

.

^{, k.} In particular, we use

(Ci)

to denote the set

{jl, J2, .

^..

, j k}.

Summarizing the r sults obtained above, we arrive at the following theorem.

THEOREM 4.2.3. •EMP _isNP-complete.

PROOF. _{By Lemma}

4.2.1

and

4.2.2,

it suffices to prove that SAT is log-space reducible to •EMP. _Let

n 2:: 1,

and let G

= /\J"=1 Cj

be a Boolean formula in conjunctive normal form ov r

X= ^{{x1, ...} ^{, X2n}·}

Th want d FMA A

= (Q, q0,

^T,^{!,

8, F)

is d fined as follows. Q ⁼

{ qi I

⁰

::;

ⁱ

::;

2n +

^m

+ 1},

^T

= {�}2n+I, F = { q2n+m+1}, e(q0)

⁼

2n + 1

and for each

i::; 2n,

we set

e( qi) = i.

Finally, we define

8 = 81

^U

82

^U

83,

wh re

2. 82 = {(qi, i qi+I), (qi, i, qn+i+l) 11::; i::; n-1}U{ (qi i qi+I) (qi, i, qi-n+l) I n+1::;

i ::; 2n -

¹

}

, and

Note that th FMA A _is_loop-fr _e._Th _tate

qi

corre pond to the positive literal

xi

and

qn+i

to

•Xi.

For ach

qi,

exactly two transitions

(qi, i, qi+l)

and

(qi, i, qn+i+I)

are defined if i is odd and

1 ::;

ⁱ

::; n -

^1,and exactly two transitions

(qi, i, qi+l)

and

( qi, i, qi-n+d

are defin d if

i

is ven and

n

^{+ 1}

::; i ::; 2n -

^1. For the states

qn

and

q2n,

the transitions

(qn, n, q2n+d

and

(q2n 2n, q2n+d

are d fined. Thus, for each

s

� { 1 , 2,

.. .

, n}

and

s

⁼

{n

⁺

1::; i::; 2n I i-n rf_ s},

there exists an ^a

E (D

^U

{�})171

such that

(qo,

^T

)

^�*

(q2n+l,

^a

)

and

a[j] En

iff

jEsus (1 ::; j::; 2n).

(15)

Now, we prove that G is satisfiable iff

q2n+m+I

is reachable. Note that (} is undefined for each

q2n+i ⁽¹ ::; i ::;

m

+ ^1).

Thus, all computations from

q2n+I

depend only on the assigniTI nt 0'. For each 1

::;

k

::;

m, the clause

ck

is satisfiable by a truth assignment

j

iff ther xists a lit ral of a variable

Xi

appearing in

Ck

such that

j(xi)

⁼ ¹ or

J( ^•Xi)

⁼ ⁰

⁽ ¹ ::; i ::; n).

By construction, the latter condition is fulfilled iff there exist a s t �

{1,2, ... n}

with s =

{n+ ^1::; i::; 2n I i-n tf_ s}

such that i E

s

or

i + n

^{E :S.} This condition is equivalent to the existence of an

a

such that

a[i]

^{E !1}or

a[i + n]

^{E !1.} Finally, the latter condition holds iff

(q2n+k,i,q2n+k+1)

or

( q2n+k, i + n, q2n+k+

^I

)

ar defined. Hence, G is satisfiable iff

q2n+m+l

is reachable.

Therefore,

•EMP

is NP-complete. Q.E.D.

EXAMPLE 4.2.2. We ex mplify the construction outlined above by using G =

(x1

^V

x2

^V

x3)

^1\

^{( •x1}

^V

•x2

^V

•x3)

^1\

⁽ •x1

^V

x2

^V

•x3).

The resulting automaton is displayed in Figure 4.3. Th initial assignment is

r.

Figure 4.3: _TheFMA A r duced from G in Example 4.2.2.

Next w show that

•EMP

remains NP-compl t when r stricted to DFMAs. Ob

viously,

•EMPD

E NP by Lemma 4.2.2. _Weshow that SAT _{is also}_log-space_reducible to

•EMPD.

However, th technique used in Theor m 4.2.3 can not be applied directly to the deterministic case because th automata constructed there ar nondeterminis

tic. Just converting the obtain d nondeterministic automata into deterministic ones is infeasible, since it could lead to automata having siz exponential in the length of the formula G. Therefore, we have to reduce SAT dir ctly to

•EMPD.

THEOREM 4.2.4.

•EMPD

is NP-compl te.

(16)

PROOF. W have to show that SAT is log-spacer ducible to --.EMPD. Let n � 1, and let let G

= /\j=1 C1

b a Boolean fonnula over X in conjunctive normal form.

Th n an FMA

A= (Q,q0,r,g,8,F)

is computed as follows. Let

Q = {qi I

^0:::;

^i:::;

2n

+ ^n-z,+

1},

¹ ^t^T

= {�}2n+l,

¹ ^t

^{F =} {q2n+m+1},

^{and set}

^{g(q0) =}

²ⁿ^{+ 1} as well as

g( ^{qi) = i}

^{for all} ^l

^{:::; i}

^{:::; 2n.} Furth rmore, we s t

8 = 81

U

8 2

^U

^83,

^where

Not that ach path from th initial state

q0

to

q2n+2

contains no loop. Each path from

qo

to

q2n+2

contains xactly one of two edg s labeled

i

or n + i for all 1

:::; i

:::; n.

Then, for every string ^wwe hav

(q0, r)

�w

(q2n+2 ^a)

^iff

^a[i]

-:/- � or

a[n

+

i]

-:/- � for all 1 :::;

i :::; n.

Moreover, there exist exactly

2n

distinct configurations

(q2n+2,

^a).

Thus, th r exists a one-to-one mapping from the truth assignments for all variables

x1, x2,

..^.

, Xn

to th possible assignments on

q2n+2.

The remaining part of the definition of the automaton

A

is the same as in the proof of Th orem 4.2.3. Thus, G is satisfiable iff the final state

q2n+m+l

is reachable.

However, inc om states are not lab led and some transitions are not defin d, th automaton i till not d terministic. By adding an extra state as a trash box, it is easy to compl t the definition of

A

to b d t rmini tic. We omit th details. Clearly,

the computation of

A

is in log-space. Q.E.D.

EXAMPLE 4.2.3. Again, let us take th 3-CNF in Exan1ple 4.2.2 and we show the reduced DFMA

A

in Figure 4.4. This automaton

A

has also the initial assignm nt

�7.

Although some transitions ar not defined in

A,

for instance there is no transition for the initial state and the first window, we put a special state p as a trash box for undefined transitions. Thus, this automaton is essentially deterministic.

Additionally, we mention more difficult problem inequival nee for FMAs.

PROBLEM 4.2.4. Th inequival nee probl ^mfor FMAs

A

and B, denoted by --.EQ, is to decide whether

L(A)

-:/- L(B). This problem is denoted by --.EQD if the automata

A and B are both deterministic.

(17)

Figure 4.4: The DFMA

A

computed from G in Exampl 4.2.3.

Let�= {a1,

^{• • .}^{, an}

}

^,^{and let} ^M ⁼

(Q,�, 8, q0, F)

^be^afinite automaton. Then, we can con truct an FMA

A

⁼

( Q, q0, T,

^(},

8', F)

such that for each 1 :s; _i:s;

IT I

⁼ ^n,

T[i]

⁼ai, and for any p

q

E

Q

and

j

⁼ ^1,.

.

. , n

e(p)

is undefined and

(p,j, q)

E

8'

iff

(p

^aj,

q)

^E

8.

The automaton

A

accepts the regular language

L(M)

^(cf.

[32]).

^Thus

•EQ is log-spac reducible to the finite automata inequivalence problem which is known to be PSPACE-compl te ( cf., e.g.,

[23]).

This directly follows the PSPACE-hardness of •EQ, however, it is an open question whether or not •EQ is decidablet.

On th _othr hand, for the class of DFMAs, we have already shown that •EMPD is NP-complet . By this result, the NP-hardne s of •EQD is concluded as follows.

Let

A

⁼

(Q, q0,

^T,

e, 8, F)

be any DFMAs. The FMA B is obtain d from

A

^by^B⁼

(Q

^,

q

^o^,^T

e,8 Q\F).

^Since

L(B)

⁼

!l*\L(A)

it follow that

L(A)-#

⁰^iff

L(B)-# !1*

^iff

L(B)-# L(B'),

wh re B ' is a trivial automaton such that

L(B)

⁼

!1*.

^Hence, ^•EMPD

is log-spac r ducible to •EQD that is •EQD is NP-hard.

4.3 Learning Simple DFMAs

We introduce a restrict d FMA, th so-call d a imple DFMA. The aim is to inves

tigate the principl 1 arnability of languages d fin d over an infinite alphabet within Angluin's [6] minimally ad quate teacher (i.e., th learning probl m of simple DFMAs via membership and equivalenc qu ries).

In order to solve this probl m, first we establi h reasonableness of the setting con

sidered. As we have already shown, the m mbership problem for any FMA and any string is a typical probl m for P. Thus, it is reasonable to require the membership tThis problem is known to be decidable if one side of given two FMAs is deterministic and it has at most two registers (cf. [32]).

(18)

orad . On the oth r hand, while it is an open question whether the equivalence prob

lem for DFMAs is NP-hard ven if the automata are simple, we pr sent a procedure, giv n any two sin1pl DFMAs, to output a counterexample if they are not equal and output "y s" oth rwise. For the corT ctness of this procedure, we -first show that two auton1ata A and B are not equivalent iff there exists a finite alphabet

EA,B

such that L(A) n

E�,B =J L(B)

n

EA.,B·

Thus, it also se ms to be r asonable to require the equival nc oracle.

N xt, we show the learnability of simple DFMAs using membership and equiva

lence queri s. Our learning algorithm is partially based on Angluin's observation table technique ( cf. [4]). However, sine the size of an alphabet is potentially infinite in our s tting, it is impossible to directly apply her notion. As we have already seen, the model of FMAs is a natural extension of DFAs (i.e., for every finite alphabet and every FMA th re xisLs a DFA consist nt with the th FMA over the alphabet). For the class of simple DFMAs, we show such a DFA to be computable by membership and equival nee queries using an observation tabl . Furthermore, we also provide th

method to transform the DFA to a simple DFMA, thereby pre erving consistency.

Consequently for each finite alphabet and each targ t, the algorithm can compute a simple DFMA consistent with the target over that alphabet. Together with the first re

sult we conclud that our algorithm can learn every sirnpl DFMA using membership and equivalence queri s.

DEFINITION 4.3.1. A simpl FMA i an FMA A ⁼

(Q,

q0, T, e, p,,

F)

such that its initial assignm nt contains no symbol in n, that i , T E {�}*. A simple DFMA is a DFMA A such that A is simple.

In Definition 4.3.1, we use th differ nt expr ssion p, as a transition of an FMA and a DFMA. It is b cau that in thi s ction, the 8 is used as transitions for DFAs as usual. Thus in order to distinguish th se transitions, we set p, for finite-memory automata and 8 for finite automata.

Next, for decidability of quivalence of simpl DFMAs, we forn1ally define consis

tency for simple DF MAs.

DEFINITION 4.3.2. L _t A _and B _{b simple} _DFMAs. We say that A and B are consistent over an alphabet

E

if for every string w E

E*,

w E L( A) iff w E L( B). This

(19)

is denoted by

A �L:

B. In particular, when the alphabet is n, we say that

A

and B are

equivalent,

denoted by

A�

^B. Similarly, for finite alphabets, we use this notation for DFAs.

In Figure 4.5 below, we provide an algorithm DEC deciding whether or not any two sin1pl Dl� MAs

A

and B are equivalent. The behavior of DEC can be summarized as follows. Let A =

(QA,qt,TA,eA,flA,FA)

and B ⁼

(Q8,qt,T8,g8,fl8,F8).

Fur

thern1or , let k = _max

{ jTAj

,

jT8j}.

Then, DEC takes

A

and B as input, and computes

� =

{ a1, a2,

... ,

a

k

} ·

N xt, it computes a particular set of configurations of

A

and of B d fin d on L: as follows.

For the A let

C(A)

=

QA

^x

(L:U{�} )ITAI.

Initially, let

C(A, 0)

=

{(q0, TA)}.

For each

i

^{E lN,}w proc d inductively. For all

i � 1

we define

C(A, i)

=

C(A, i -1) U ^C(A)i-I,

wher

C(A)i-I

⁼

{c

^E

C(A) I �a

^E

L:, �c'

^E

C(A,i- 1), c'

^�a

c}.

If

C(A,i)

⁼

C(A, i- 1)

for an

i � 0,

then 1 t

C(A)

:=

C(A, i).

Analogously,

C(B)

is computed from B.

Now, using

C(A),

the algorithm DEC computes a DFA

MA

⁼

(Q,

^L:,

^b, q0 F)

as follows.

Q

^:=

^C(A) q0

^:=

(qt,TA) ^{b �} ^C(A)

^x ^L:^x

^C(A)

wher

(c a,c')

^E

b A

iff

c

^�a

c'.

Furthermore,

F ^� ^C(A),

where

(p, T)

^E

F

iff

p

^E

FA.

The automaton

1148

is defined analogously. Now, using any standard algorithm for testing equivalence of two DFAs a a subroutine, DEC d cides wheth r or not

L(J\.,fA)

⁼

L(!II8),

and returns

yes' and 'no re pecb vely.

As it was hown in

[32],

for any FMA

A

and any finite alphabet ^�, there exists a DFA M such that M

�l: A

and M is computable from

A

and L:. Thus, we can establish the termination of DEC and for any giv n simpl DFMAs

A

and B, and any finite L:, w can d cid wh ther or not

A �l:

^B. ^W show that there exists a number k depending on

A

and B such that

A �l:

^Birnplies

A �

B for a finite alphab t L: of cardinality k.

First, we consider reflecting a string ^w E !1* to a string ^w' E L:* such that the transition of states of

A

on ^wis preserved. That is, w propose a method computing for every ^wE !1* a string ^w' E

L:*

such that

A(

^w

)

⁼

A(

w'

)

for a sufficiently larg but

finite L:. In order to formalize this idea, we ne d the following d finition.

(20)

Procedure DEC(A,B); A,B;

simple DFMAs

begin

let k =

max{JTAJ,JT8J},

� =

{a1, ... , a

^k

} ,

and

C(A,O)

=

{(qo,TA)}

compute

C(A,

i), fori= 1

while(C(A,

i)

# C(A,

i-

1))do

compute

C(A,

i +

1)

and i := i + 1 let

C(A)

:=

C(A,i)

let

MA

=

(Q, �'

8,

q0, F)

such that

Q

= C(A,i)

qo

=

(qt,TA),

8 �

C(A)

^X� ^x

C(A)

where

(

^{c, a, c'}

)

E 8 ^iff^cf-a ^c',and

F

�

C(A),

where

(p,T)

E

F

iff p E

FA

analogously compute

Ms if MA

^�

M8 then

output 'yes and halt

if MA 7:- Ms then

output 'no" and halt

end

Figure 4.5: Th procedure for deciding consistency of simple DFMAs.

(21)

DEFINITION 4.3.3. Let A =

(Q,

q0

T,

g, �L

F)

be a simple DFMA and � a finite alphabet of cardinality k =

IT

1. Then, for w E n+, we define sets w

�

recursively as follows. For each ^aE n, 1 t a

�

= �. Then, for each a E n and w E n+, l t

(

wa

)�

=

Uw'Ew� {

w'a^'}, where

a1EL:

if a=

Tw[i]

or a

� [Tw],

^a=

Twa[i]

and��

[Tw]

if a

� [Tw],

a=

Twa[i], Tw[i]-::/-

�and� E

[Tw]

if a �

[Tw],

a=

Twa[i], Tw[i]

= �·

For the r ader's understanding, we show an application of this notion taking an exampl a follows.

EXAMPLE 4.3.1. A simple DFMA, denoted by A, is illustrated in Figure 4.6. Since

ITI

=

2,

s t � =

{

al, a2}· Now, asa6alla6 En+. By Definition 4.3.3, al E

(

as)

�

= �.

Since

Ta5

^as

�

^,

Ta5a6

^{= asa6,} and

Ta1

^{= a1}

�

^, this case corresponds to the third condition of Definition 4.3.3. Then, a2 E �

\ [Ta1]

follows a1a2 E

(

asa6

)�

. Since

Ta5a6

^{= asa6}

Ta5a6a11

= a san, and

Ta1 a2

= a1 a2 this case corresponds to the first condition of Definition 4.3.3. Then, a2 =

Ta1a2 [2]

follows a1a2a2 E

(

asa6an

)�.

Similarly,

a1a2a2a2 E

(

asa6alla6)

�

.

On this simple DFMA A, the first symbol of every input is substituted in the first window of the assignn1 nt. Once A reads other symbol , they are substituted in the second window. Thus ther is no computation su h as

(p

a

�)

r6 (q b�). Hence, in this example, th econd condition of Definition 4.3.3 do s not happen.

On the other hand, the string a2a1a1a1 is also in

(

asa6a11a6

)�

and the two strings

a1a2a2a2 and a2a1a1a1 can simulate the computation (q0, ��)

r-asa6a11a6

(q1, asa6

)

. That is, asa6ana6 E L(A) iff a1a2a2a2,a2a1a1a1 E L(A).

Next, we formalize the prop rty of strings in w

�

.

LEMMA 4.3.1. Let wE nn and w' E w

�

, wher the length of the initial assignment of A is k. Then, for each 1:::;

i:::;

k,

Tw[i] =�iff Tw'[i]

= �·

PROOF. The proof is done by induction on th length of w E n+. For the induction basis, let a En. Then, a

�

= � and since T E {�}*,it is also true that for each a' E a

�

,

(22)

Initial assignment: ##

Figure ^4.6: A simple DFMA with the initial assignment of empty two registers.

Ta[i]

=�iff

'a'[i]

= �- Assume the induction hypothesis for each

w E nn.

Let

w' E w�.

Taking a syn1bol

a E n,

consider

( wa )�.

In case

a

=

Tw[i],

or

a

tf_

[Tw], a= Twa[i]

and� tf_

[Tw]

(the first case of Definition 5), there exists

a, ab E pre(w)

such that

Ta[i]

= � and

Tab[i]

=

b.

Let

a'b' E (ab)�.

By the induction hypothesis,

Ta[i]

= � implies

Ta'[i] =

� as well as

Tab[i]

=

b

implies

Ta'b'[i]

⁼⁼

b'.

Thus

Tw'[i] #- �- By th

d finition

w'a'

^E

(wa)�

for some symbol

a'

^{E �}

such that

a'= Tw'[i] .

Hence, for each ¹

� j � ITI, Twa[)]=�

iff

Tw'a'[j] =

�-

In case

a

tf_

[Tw], a

=

Twa[i], Tw[i]

-1-� and �

E [Tw]

(the second case of Definition 5) since

Tw[i]

#-� by the induction hypothesis

Tw'[i]

=

a'

-1-�- Thus,

w'a' E (wa)�.

Similarly to th above case,

Twa[i]

= � iff

Tw'a'[i]

= �- In addition �

E [Tw]·

By the induction hypothesis

Tw[j]

=

Tw'[j]

=�for som j -1-

i.

Since

IIL:II

=

ITI,

there exists an

b' E

^l:

\ [Tw'],

namely,

w'b' E (wa)�.

Moreov r,

Tw'[i] =a'

and

Tw'b'[i]

=

b'.

Hence,

Twa [ i]

= � iff

T w'b' [ i]

= �.

The last ca e is proved analogously. Therefore we have prov d that

( wa )� � L:n+I

and that for each

w'a' E

(

wa)�

and ach ¹

� i �

^k,

Twa[i]

=�iff

Tw'a'[i]

= �- The proof

is completed. Q.E.D.

By Lemma ^4.3.1,for each

wa E n+

and each

w'a' E (wa)�,

if

(P,Tw) �a (q,Twa),

then

(p, T w') �a' ( q, T w'a') .

^H nc , we have the following le1nma.

LEMMA 4.3.2. For each

w E n+

and

w' E w�, A( w) = A( w').

For each simple DFMA

A

and ^Band for each finite alphabet L:, we can decide whether or not Â�E B. Hence, it is sufficient to prove that Â�E B implies Â� B

(23)

for the equivalence problem, where

JJL..JJJ = max{jTA J, jTB J} .

Moreover, by Lemma 4.3.2 this condition is simplified to that for each

w

^{E n+,}there exists

w'

^{E E+}such that

w' E

w� n w�.

The next lemma is the most critical part of this section.

LEMMA 4.3.3. L t A and B be simple DFMAs with initial assignments

TA

and

TB

respectively. L t ^L..J be an alphabet of cardinality k

= max{jTAj, jTBj}.

Then, for each

w

^{E n+}

w� n w� f= ^0.

PROOF. Suppos that

jTAj ^� jTBj.

We show the desired result by induction on the 1 ngth of

w.

For the induction base, let

a

^E

t.

Clearly, for each

a

^{E n,}

a� = a� =

^E.

OW' assum the induction hypothesis on nn' n

� 1.

_Let

wa

^{E nn+l.}

Case 1. Let

a

^E

[T;] n [T!]

^.Then, there exists

w'

^E

w� nw�

such that

w'T;,[i]

^E

(wa)�

for

a = T;[i]

and

w'T!,[j]

^E

(wa)�

for

a = T![j],

where 1

:::; i :::; jTAj

and 1

:::; j ^:::; J TB J.

Sine

a

^E

[ T;],

we can assume that

w = aa{3

for some

a

^E

D*

and j3 E

(D\ {a})*.

By the induction hypothesis, there xists

a'a'f3'

^E

(aa{3)� n (aa{3)�.

Since

a ti [{3]

and

a

^E

[T;] n [T!], T:a[i] = u�[i]

⁼

T!aUJ = T![j] = a.

Thus by Lemma 4.3.1

T:,a

,

[i] = T:,[i] = T!,a,[j] = T!,[J] = a'.

Thus

w'a'

⁼ ^W1

T:

,

[i]

^E

(wa)�

and

w'a' = w'T!,[j]

^E

(wa)�.

That is,

(wa)� n (wa)� f= ^0.

Case 2. Let

a ti [ T;]

^U

[T!].

Since

JJEJJ = jTBj ^� jTAj

we can take a

w'

^E

w� n w�

such that

[T;,] ^� [ T!,]

^.

If

[T!,]

i a proper subs t of E, that is at 1 ast on window ofT

!

, contains no symbol in E, then a ymbol

a'

^{E E}

\ [T!,]

is not in

[T;,].

Thu

w'a'

^E

(wa)� n (wa)�.

If

[T!,J = E,

then, by Lemma 4.3.1,

T![j] =

band

T!aUJ = a

implies

T!, [ j] = b'

for some

b'

^E

E

and 1

:::; j :::; jTBj.

Th n, let ^w'

= a'b'{3',

where

b' ti [{3].

Since

b'

^E

[T:,6,] n [Tt,6,]

and

b' ti [{3'], w'

^E

w� n w�

impli s

w'b' = a'b'f3'b'

^E

(wb)� n (wb)�,

where

b = T![j].

Sine

a ti [T!], � ti [T!]

and

b

^E

[T!],

by th definition,

(wb)� = (wa)�.

Similarly,

(wb)� = (wa)�

if

b ti [ T

^w

]

, and

(wb)� ^� (wa)�

oth rwise. Thus,

w'b' =E (wa)� n (wa)�,

namely,

(wa)� n (wa)� # ^0.

Case 3. L t

a

^E

[ T!] \ [T:].

Sine

a

^E

[w],

let

w = aa{3

for son1e

a

^E

D*

and

(3

E

(D\{a})*.

By the induction hypoth sis, 1 t

a'a'

^E

(aa)�.

By Lemma 4.3.1, there exists

j

such that

T!a [j] = a

iff

Tt,a' [j] = a'.

Let

w' = a' a' {3'

^E

w�.

Since

a ti [{3],

by Lemma 4.3.1,

a' ti [{3'].

Wh n w take

w'a'

^E

(wa)�,

since

a

^E

[T!],

we can assume that

a= T![i]

and

a'= T:,a,[i].

Since

a ti [T:],

by Lemma 4.3.1,

T![i] =a f= T:[i]

implies

質問に基づく形式言語の機械学習に関する研究