Conclusion - 質問に基づく形式言語の機械学習に関する研究

In this chapter, w discuss several open questions not answered in this thesis. We mainly discuss the NP-cornpleteness for inequivalence of two DFMAs as well as the consistency problem for one-variable patterns in detail.

First, we consider the inequivalence for DFMAs. That is, the problem, denoted by -.EQD, given any two DFMAs

A

and

B,

to decide whether

L(A) -=J L(B).

In Chapter 4, we defined th non-emptiness problem for DFMAs, denoted by -.EMPD, given any

DFMA

A,

to decide whether

L(A) -=J

^0.

We also have shown the -.EMPD to be NP-complete in Theorem 4.2.4. There is a trivial log-space r^ed^{u c}tion from -.EMPD to -.EQD by setting that one side of DFMAs accepts the n*. H nee, it is straightforward that the problem -.EQD is NP-hard.

Consequently, for the proof of its NP-completeness, it remains to show only whether or not -.EQD E NP. As we showed in Lemma 4.2.2, for every FMA

A

and for any state p of

A,

a nondeterministic Turing machine can decide in time polynomial whether there exists a string such that

A

arrives at the state p by processing the string. The difficulty in case of -.EQD is, in a word,

whether there is a polynomial-sized upper

boun d of strings for any pair of states of given two DFMAs.

Needless to say, a trivial upper bound for these strings is the length of sequence of all pairs of configurations, that is an exponential in the size of input DFMAs. This problem can be reduced to the reachability problem for two DFMAs. In this problem, given a pair

(

q)

of states and two DFMAs, an algorithm must decide whether there exists a string such that it guides one DFMA top and the other to

q

at the same time.

By Lemma 4.2.1 and Theorem 4.3.1, we can simulate all configurations of each

DFMA

A

by strings ov r a polynomial-sized alphabet �. Thus, the NP-completeness of --.EQD is equivalent to finding a witness for the following hypothesis.

HYPOTHESIS 6.0.1. Let

A

and B be DFMAs. For each states qA of

A

and q8 of B, if

A

arrives at qA and B arrives at q8 at the same time for a string, then there exists a polynon1ial

p

such that the length of at least one of these strings is less than

p(

^{n , m}

)

wher n

=

n1ax{IIQAII, IIQ8II} and ^m

=

max{ITAI, IT81}.

Next, w consid r th consist ncy problem. In Chapter 5, we studied on several variants of this problem for one-variable patterns with respect to their computational compl xity. However, it is still open whether there exists an effective algorithm to decide th consist ncy problem itself.

A nonnal id a may be derived from Angluin's

[1) pattern automata.

Let

s, t

E �+, wh ret is a substring of

s.

Let

P(s,t) = {1r

P1 l1r[tjx] = s

}. We can construct a particular automaton to recognize

P(s,t)

as follows.

DEFINITION 6.0.1. (Angluin

[1]).

one-variable pattern automaton A(s, t)

for

s,t

E �+such that tis a substring of sis a 5-tuple

(Q,�,8,q0,F)

where (i,j) E

Q

iff i,j

� 0

_{and i}₊

jltl::; l si, qo = (0, 0),

_(i,j)_E

F

iff

j � 1

and i +

jltl = lsi,

and for the

variable ^x and each

a

^E�' the transition is defined as follows:

8

( ( i j)

a) =

( i +

1 ,

j) iff

s [

i + j

It

I +

1) = a

and

8(

( i j)

x) =

( i j +

1)

iff ther xists an occurT nee of

t

s

beginning at position i + j

It I

1 s.

A language accepted by the above one-variable pattern automaton

A( s, t)

is the set of all

1r

P1

such that

8( q0, 1r) = p

F.

THEOREM 6.0.1. (Angluin

[1]).

For every

, t

�+

such that tis a substring of

s,

it holds that

L(A(s,t)) = P(s,t).

Let

Ai = (Qi,

�'

8i, q0, Fi)

fori

= 1,

2 be two patt rn automata. Then, define

A1nA2

to be the finite automaton

A= (Q,

�'

8, q0, F),

where

Q = Q1

Q2, F = F1 n F2,

and for each p E

Q

and

a

^E �'if

81(p,a)

and

82(p,a)

are defined and equal, then

8(p,a)

is defined and equal their common value. Then, the most interesting property of one-variable pattern automata is the following compactness for intersection.

THEOREM 6.0.2. (Angluin

[1]).

Let A and B be one-variable pattern automata.

Then, L(A)

n L ( B )

= L(A

n B).

EXAMPLE 6.0.1. LetT=

{(ab)3a(ab)3a, (ab)2a(ab)2a}

be a set of positive exa1npl and let F =

{ babbaabbab, abababba}

be a set of negative examples.

For patterns 1r E P1 such that

�( 1r, x)

= 2*, there exist 1r1 =

xbaxba 1r2

₌

xbaabx, 1r3

abxxba

and

1r

4 =

abxabx

such that T

�

^L(^1ri

)

^for^{i =}1, ... , 4. The

1r2

and

1r3

are removed because

babbaabbab

_E_L(

1r2)

for the substitution

bab

_and

abababba

_E_L(

1r3)

_for

the

a b.

The r maining 1r1 and 1r 4 are selected as consistent patterns for T U F.

On the other hand, let F' ⁼ F U

{ abbbaabbba}.

In this case, th re are substitutions

abb

for 1r1 and

bba

_for

1r

4. Thus,

abbbaabbba

_E _L(1r!

)

_n_L(1r 4

)

, that is there exists no consistent 1r E P1 with T U F such that

�(

1r,

x)

₌_2.

A (u, v)

u = (abababa)"2 v =ababa

B (x, y) x = (ababa)"2 y = aba

Inter ection:

A(u, v) (l B(x, y)

Figure 6.1: Patt rn automata.

By Theorem 6.0.2, th class of one-variabl pattern automata is closed under in

tersection, and for any one-variable pattern automata A and B, we can represent

*Since there are at most polynomially many substrings, we can fix the number of variables and constants for possible patterns.

L(A) n L(B)

by a compact automaton. However, no one knows a compact representa

tion for th difference

L(A) \ L(B)

because the class of one-variable pattern automata is not closed under complement.

One way to simplify the consistency problern is a restriction of alphabets

(

e.g., I; =

{

b}).

Moreov r, by Theorem 6.0.1 and 6.0.2, we can generally assume that

IITII

2

b cause for each T and

T'

such that

T � T', L(Ar) � L(AT'),

where

Ar

is a one

variabl pattern automaton that accepts

T

and

Ar'

is defined analogously.

Suppos that there xists a polynomial

p

such that for any

w E F,

the number of patterns ¹r

E L(Ar)

satisfying

w E L(1r)

is at most

p(n),

where

n

is the number of states of

Ay.

In this case, we can guess

IIFII · p( n)

patterns of

Ar

and eventually find a pattern consistent with

T

F

within these patterns provided it exists.

For the other cases, we use the notion of overlapping substrings. Let

u

be a non-null substring of a

w

su h that

�(w, u)

⁼

k,

where

k 2: 2.

For each 1

:::;

i <

j :::; k,

let

Pi

and

Pj

be the positions of the ith and ith occurrences of

u,

respectively. If

Pj -Pi

JuJ,

then the suffix of

u

of length

Jul -(Pj- Pi)

is said to be the overlap of the ith and jth occurrences. We denote an ith occurrence of u in

w

(

i)w·

Case 1. There exists an

1 :::;

:::; k

-1 such that the length of the overlap of

( u,

i)w and

(u,

i +

1)w

is greater than

lluJ/2J.

The substring

u

is said to be a long solution of an equation

w

= 7r if 7r

E

P1 and

w

= ¹r

[

j

] .

Case 2. For any 1

:::;

-

1, it is less than or equal to

llu I /2 J.

The substring

u

is said to b an exception if it is not a long olution of

w

= 7r for any 7r

E P1.

Our first hypothesis is that for any

Ar

and for any

wE F,

either

IIL(ArnAw)ll:::;

1 or

L(Ar n Aw)

L(Ar)

wher

Aw

is a on -variable pattern automaton that accepts all

1r E

P1 such that

w E L ( 1r).

Unfortunately, this hypothesis is dismissed by Ex

ample 6.0.1. Since th negative example

w

abbbaabbba

is g n rated by two pat

terns

1r

1 ⁼

xbaxba,1r4

abxabx E L(Ar),

thus neither

IIL(Ar n A(w,u))ll :S

1 nor

L(Ar n A(w, u))

L(Ar ).

However, not all possibilities w r removed because the pairs

(w, 1r1)

and

(w,

¹r

4)

both correspond to Case 2. Currently, there is no known counterexample for Case 1.

Thus, let us modify our hypoth sis as follows.

HYPOTHESIS 6.0.2. Let

A

b a one-variabl pattern automaton for strings wi, where i = 1, ... nand n � 2. Assume that w = 1r

[

j

]

, 1r E

L(A),

and that

(

w,u

)

satisfies Case 1. Then, either

L( A n A(

w,u

))

= {1r} or

L( A n A(

w,u

))

L(A).

Finally, we consider a strategy for Case ²using one-variable pattern equations

1fx = 1fy, where 1fx and 1fy are in ^P1distinguished by the different variables x andy. A pair of strings u and u' is considered as a solution of the equation if 1rx

[

/

]

^{= 7ry}

[

^u'/y].

It is known that if the 1r x = 1r y has solutions u, then either u is of

(

^a.

f3)

^na.for all n � ¹ (it correspond to as 1) or the length of u is bounded by a polynomial in the length of max{

l

1r x

I

l

1r Y I} (it corresponds to Case 2).

We not that for any 1r E P1, if 1r

[

j

]

= 1r

[

j

]

and

l

i

l

, then u = u'. Thus, w have that th number of exceptions for 1r x = 1r y is also bounded by a polynomial.

Hence, for the Cas 2, the following hypothesis arises naturally.

HYPOTHESIS 6.0.3. Let

A

be a one-variable pattern automaton in the above Hy

pothesis. Assume that w = 1r

[

j

]

, 1r E

L(A),

and the

(

w,u

)

satisfies the Case 2.

Then there exists a polynomial

p

such that II

L( An A(

^{w, u}

))

:::; p( m),

where

m

is the number of states of

A. References

[1] Angluin, D. (1980). Finding patterns common to a set of strings.

Journal of Com

puter and System Sciences

21:46-62.

[2] Angluin, D. (1981). A note on the number of queries needed to identify regular languag s.

Information and Control

51:76-87.

[3] Angluin, D. (1987a). Learning k-bounded context-free grammars.

Technical Report

YALEU /DCS/RR-557. Department of Computer Science, Yale University.

[4] Angluin,

^D.

(1987b ). Learning regular sets from queries and counterexamples.

formation and Computation

75:87-106.

[5] Angluin, D. (1987c). Learning k-term DNF formulas using queries and counterex

ample .

Technical Report

YALE/DCS/RR-559. Department of Computer Science, Yale University.

[6] Angluin,

^D.

(1988). Queri s and concept

arning.

Machine Learning

2:319-342.

[7] Angluin,

^D.

(1990). Negative results for equivalence queries.

Machine Learning

5:121-150.

[8] Angluin,

^D.,

and Kharitonov, M. (1991). When won t memb rship queries help? In

Proc. 23rd Annual ACM Symposium on Theory of Computing

(pp. 444-454). ACM Press.

[9] Arikawa, S., Miyano, S., Shinohara, A., Kuhara, S., Mukouchi, Y., and Shino

hara,

^T.

(1993). A machine discovery from amino acid sequences by decision trees

over regular patterns.

New Generation Computing

11:361-375.

[10] Arimura, H., Shinohara, T., and Otsuki, S. (1994). Finding minimal generaliza

tions for unions of pattern languages and its application to inductive inference from positive data. In Proc. llst Symposium on Theoretical Aspects of Computer Science (pp. 649-660). LNCS 775, Springer-Verlag.

[11] Beimel, A., Geller F. and Kushilevitz, E. (1998). The query complexity of finding local minin1a in the lattic . In Proc. llst Annual Conference on Computational Learning Th ory (pp. 294-302). The Association for Computing Machinery.

[12] Berman, P., and Roos, R. (1987). Learning one-counter languages in polynomial tim . In Proc. 28th IEEE Symposium on Foundations of Computer Science (pp.

61-67). IEEE Computer Society Press.

[13] Birkendorf, A., Boker, A. and Simon, H.U. (1997). Learning deterministic finite automata from smallest counterexamples. eCOLT Tec hnical Report eC-TR-97-001.

[14] Boros, E., Ibaraki, T., and Makino, K. (1997). Monotone ext nsions of Boolean data sets. In Proc. 8th International Work hop on Algorithmic Learning Theory (pp.

161-175). LNAI 1316, Springer-Verlag.

(15] Brazma, A. (1993). Efficient identification of regular expressions from represen

tative examples. In Proc. 6th Annual ACM Con} renee on Computational Learning Theory (pp. 236-242). The As ociation for Computing Machinery.

[16] Burago, A. (1994). Learning structurally reversible cont xt-free grammars from queries and counterexamples in polynomial time. In Proc. 7th Annual ACM Confer

ence on Computational Learning Theory (pp. 140-146). Th Association for Com

puting Machinery.

[17] Cook, S.A. (1971 ). The complexity of th orem-proving procedur s. In Proc. 3rd Annual ACM Symposium on Theory of Computing (pp. 151-158). ACM Press.

[18] Erl bach, T., Rossmanith, P. Stadtherr, H., Steg r, A. and Zeugmann, T. (1997).

Learning one-variable pattern languages very fficiently on average, in parallel, and by asking queries. In Proc. 8th International Workshop on Algorithmic Learning

Theory (pp. 260-276). LNAI 1316, Springer-Verlag.

[19] Gavalda,

^R.

(1993). On the power of equivalence queries. In

Proc. 1st European Conference on Computational Learning Theory.

Clarendon Press, London.

[20] Gavalda

^R.

( 1994). The complexity of learning via queries. In

Proc. 9th Structure in Complexity Theory Conference

(pp.324-337).

^IEEE

Computer Society Press.

[21] Gold

^M.E.

( 1967). Language identification in the limit.

Information and Control

10:44 7-4 7 4.

[22] Goldschlager, L.M. (1977). The monoton and planar circuit value problems are log space complete for P.

SICA CT News

9:25-29.

[23] Garey,

^M.R.

and Johnson, D.S. (1979). Computers and intractability. W.H. Free

man

Co.

[24] Gray, J.N., and Harrison, M.A. (1972). On the covering and reduction problems for context-free grammars.

Journal of the A CM

19:675-698.

[25] Hancock T., Galea, M., and Marchand M. (1991). Learning nonoverlapping per

ception n tworks from examples and membership queries.

Technical Report

TR-2691. Harvard University Center for Research in Computing Technology.

[26] Harrison, M.A. ( 1978). Introduction to formal language th ory. Addison- Wesley.

[27] Hopcroft,

^J.E.,

and Ullman, J.D. (1979). Introduction to automata theory, lan

guages, and computation. Addi on- Wesley.

[28] Ibarra,

^0.,

and Jiang,

^T.

(1988). Learning regular languag s from counterexam

ples. In

Proc. 1988 Workshop on Computational Learning Theory

(pp. 371-385).

Morgan Kaufmann.

[29] Ishizaka, H. (1990). Polynomial time learnability of simple d terministic lan

guages.

Machine Learning

5:151-164.

[30] Ishizaka, H., Arimura,

^H.,

and Shinohara, T. (1994). Finding tree patterns con

sistent with positive and negative examples using queries. In

Proc. 5th International Workshop on Algorithmic Learning Theory

(pp. 317-332). LNAI 872, Springer

Verlag.

[31]

Jiang, ., Salomaa, A., Salomaa K., and Yu, S.

(1995).

Decision probl ms for patterns.

Journal of Computer and System Sciences 50:53-63.

[32]

Kaminski, M., and Francez, N.

(1994).

Finite-memory automata.

Theoretical Co·mput r Science 134:329-363.

[33]

K arn , M., and Pitt, L.

(1989).

A polynomial-time algorithm for k-variable pat

tern languages from examples. In

Proc. 2nd Annual Workshop on Computational Learning Th ory

(pp.

57-71).

Morgan Kaufmann.

[34]

Kearns, M., and Valiant, L.G.

(1994).

Cryptographic limitations on learning Bool an formula and finite automata.

Journal of the ACM 41(1):67-95.

[35]

Kearns, M.J., and Vazirani, U.J.

(1994).

An introduction to computational learn

ing th ory. MIT Press.

[36]

Knuth, D.E.

(1967).

Characterization of parenthesis languages.

Information and Control11:269-2 9.

[37]

Ladner, R.E.

(1975).

Th circuit value probl m Is log space complete for P.

SIGACT News 7(1):18-20.

[3:1]

Laird, P.

(19

). Learning from good data and bad. Ph.D. thesis Yale University.

Kluwer Acad mic Publi h rs.

[39]

Lange, S. and Wi hagen R.

(1991).

Polynon1ial-time infer nee of arbitrary pat

tern languages.

New Generation Computing 8:361-370.

[40]

Levy, L.S., and Joshi, A.K.

(1978).

Sk letal structural descriptions.

Information and Control 39:192-211.

(41]

Maass, W., and Turan, G.

(1990).

On the compl xity of learning from counterex

amples and membership queries. In

Proc. 31st Annual Symposium on Foundations of Computer Science (

^pp.

^203-210).

^IEEEComputer Society Press.

[42]

Marron, A.

(1988).

Learning pattern languages from a single initial example and from queries. In

Proc. 1988 Workshop on Computational Learning Theory (

^pp.

311-325).

Morgan Kaufmann.

[43]

Marron, A., and Ko, K.

(1987).

Identification of pattern languages from examples and queries. Information and Computation

74(2).

[44]

McNaughton, R.

(1967).

Parenthesis gramn1ars. Journal of the ACM

14:490-500.

[45]

_Mitchell,_A.R.

(1998).

Learnability of a subclass of extended pattern languages.

In Proc. 11 t Annual Confer nee on Computational Learning Theory (pp.

64-71 ) _.

The Association for Computing Machinery.

[46]

Miyano, S., Shinohara, A., and Shinohara, T.

(1991).

Which classes of elementary formal systems are polynomial-time learnable? In Proc. 2nd Workshop on Algorith

mic Learning The01�y (pp.

139-150).

Japanese Society for Artificial Intelligence.

[47]

_Natarajan _B.K.

(1989).

On learning from exercises. In Proc. 2nd Annual Work

shop on Computational Learning Theory (pp.

72-87).

Morgan Kaufmann.

[48]

Natarajan, B.K.

(1991).

Machine learning: a theoretical approach. Morgan Kauf-mann.

[49]

_Oncina,_J.,and Garda, P.

(1992).

Inferring regular languages in polynomial up

date time. In Perez, N. et al. eds., Pattern Recognition and Image Analysis (pp.

49-61).

World Scientific.

[50)

Papadimitriou C.H.

(1994).

Computational complexity. Addi on-Wesley.

[51]

Porat, S., and Feldman, J.

(1988).

Learning automata from ordered examples.

In Proc. 1988 Workshop on Computational Learning Theory (pp.

386-396).

_Morgan

Kaufmann.

[52]

_Reischuk,R. and Z ugn1ann, T.

(199 ₎

. Learning !-variable patt rn languages in linear av rag time. In Proc. list Annual Conference on Computational Learning

Theory (pp.

198-208).

The Association for Con1puting Machinery.

[53]

Sakakibara, Y.

(1990).

Learning context-free grammars from structural data in polynomial time. Theoretical Computer Science

76:223-242.

[54]

_Sakamoto,H.

(1995).

Language learning from membership queries and character

istic examples. In Proc. 6th International Workshop on Algorithmic Learning Theory (pp.

55-65).

_{L AI}

997,

Springer-Verlag.

[55]

_Sakamoto,H.

(1997).

Learning sirnple deterministic finite-memory automata. In

Proc. 8th Int rnational Workshop on Algorithmic Learning Theory (pp.

416-431).

LN AI

1316

Spring r-Verlag.

[56]

_Sakamoto,H.

(1998a).

The learnability of simple deterministic finite-memory au

tomata via queries. Bulletin of Informatics and Cybernetics

30(1 ):93-108.

[57]

_Sakamoto, H.

(199

b). Finding a one-variable pattern from incomplete data. In

Proc. 9th International Conference on Algorithmic Learning Theory (pp.

234-246).

LNAI

1501,

_Spring_r-V_rlag.

[58]

Sakan10to, H., and Ikeda, D.

(1998).

Intractability of decision problems for finite

memory automata. In Margenstern, M. _ed., Proc. 2nd International Colloquium Universal Machines and Computations (vol. 2, pp.

58-74,

ISB _:

2-9511539-1-0).

Special issue on Theoretical Computer Science, to appear.

[59]

Salomaa, A.

(1969).

Theory of automata. International Series of Monographs in Pure and Applied Math matics (vol.

100).

Pergamon Press, London.

[60]

Sammut, C., and Banerji, R.

(1986).

Learning cone pts by asking questions. In Michalski R.S., Carbonell, J.G., and Mitchell T.M. eds., Machine Learning: an Artificial Intelligence Approach (vol.

2).

Morgan Kaufmann.

[61]

_Schapir_, R.E.

(1990).

Patt rn languages are not learnable. In Proc. 3rd Annual Workshop on Computational Learning Theory (pp.

122-129).

Morgan Kaufmann.

[62]

_Shapiro,E.Y.

(1981).

A gen ral incremental algorithm that infers theories from facts. In Proc. 7th International Joint Conference on Artificial Intelligence (pp.

446-451).

Morgan Kaufmann.

[63]

_Shapiro,E.Y.

(1982).

Algorithmic program diagnosis. In Proc. 9th A CM _Sympo

sium on Principles of Programming Languages (pp.

299-308).

The Association for Cornputing Machinery.

[64] Shapiro, E.Y. (1983). Algorithmic program debugging. MIT Press.

[65] Shinohara, T. (1982a). Polynomial time inference of pattern languages and its ap

plications. In Proc. 4th IB/1.1 Symposium on Mathematical Foundations of Computer Sci nc (pp. 191-209).

[66] Shinohara, T. (1982b ). Polynomial tim inference of extended regular pattern languages. In Proc. RIMS Symposia on Software Science and Engineering (pp.

115-127). LN S 14 7 Springer-V rlag.

[67] Shinohara, T. (1986). Studies on Inductive Inference from Positive data. Ph.D.

the i , Kyushu Univ rsity.

[68] Stephan, . (1998). The consistent extension problem for one-variable patterns is NP-compl t . Personal communication.

[69] Stockmeyer, L.J., and Meyer, A.R. (1973). Word problems requiring exponential time. In Proc. 5th Annual ACM Symposium on Theory of Computing (pp. 1-9).

ACM Pr ss.

[70] Takada, Y. (1988). Grammatical inferenc for even linear languages based on control sets. Infor·mation Processing Letters 28:193-199.

[71] Trakhtenbrot B., and Barzdin, Ya. (1973). Finite automata: behavior and syn

thesis. North Holland Publishing Company, Am t rdam.

[72] Valiant, L.G. (1984). A theory of the l arnable. Communications of the ACM 27(11):1134-1142.

[73] Wiehag n, R., and Zeug mann, T. (1994). Ignoring data rnay be the only way to learn efficiently. Journal of Experimental and Theoretical Inte lligence 6:131-144.

[74] Yokomori, T. (1991). Polynomial-time l arning of very simple grammars from positive data. In Proc. 4th Annual Workshop on Computational Learning Theory (pp. 213-227). Morgan Kaufmann.

[75] Yokomori, T. (1993). Learning two-tape automata from queries and counterex

an1ples. In Proc. 6th Annual A CM Conference on Computational Learning Theory (pp. 228-235). The Association for Computing Machinery.

ドキュメント内質問に基づく形式言語の機械学習に関する研究 (ページ 53-68)