In this chapter, w discuss several open questions not answered in this thesis. We mainly discuss the NP-cornpleteness for inequivalence of two DFMAs as well as the consistency problem for one-variable patterns in detail.
First, we consider the inequivalence for DFMAs. That is, the problem, denoted by -.EQD, given any two DFMAs
A
andB,
to decide whetherL(A) -=J L(B).
In Chapter 4, we defined th non-emptiness problem for DFMAs, denoted by -.EMPD, given anyDFMA
A,
to decide whetherL(A) -=J
0.We also have shown the -.EMPD to be NP-complete in Theorem 4.2.4. There is a trivial log-space redu ction from -.EMPD to -.EQD by setting that one side of DFMAs accepts the n*. H nee, it is straightforward that the problem -.EQD is NP-hard.
Consequently, for the proof of its NP-completeness, it remains to show only whether or not -.EQD E NP. As we showed in Lemma 4.2.2, for every FMA
A
and for any state p ofA,
a nondeterministic Turing machine can decide in time polynomial whether there exists a string such thatA
arrives at the state p by processing the string. The difficulty in case of -.EQD is, in a word,whether there is a polynomial-sized upper
boun d of strings for any pair of states of given two DFMAs.
Needless to say, a trivial upper bound for these strings is the length of sequence of all pairs of configurations, that is an exponential in the size of input DFMAs. This problem can be reduced to the reachability problem for two DFMAs. In this problem, given a pair
(
p,q)
of states and two DFMAs, an algorithm must decide whether there exists a string such that it guides one DFMA top and the other toq
at the same time.By Lemma 4.2.1 and Theorem 4.3.1, we can simulate all configurations of each
DFMA
A
by strings ov r a polynomial-sized alphabet �. Thus, the NP-completeness of --.EQD is equivalent to finding a witness for the following hypothesis.HYPOTHESIS 6.0.1. Let
A
and B be DFMAs. For each states qA ofA
and q8 of B, ifA
arrives at qA and B arrives at q8 at the same time for a string, then there exists a polynon1ialp
such that the length of at least one of these strings is less thanp(
n , m)
,wher n
=
n1ax{IIQAII, IIQ8II} and m=
max{ITAI, IT81}.Next, w consid r th consist ncy problem. In Chapter 5, we studied on several variants of this problem for one-variable patterns with respect to their computational compl xity. However, it is still open whether there exists an effective algorithm to decide th consist ncy problem itself.
A nonnal id a may be derived from Angluin's
[1) pattern automata.
Lets, t
E �+, wh ret is a substring ofs.
LetP(s,t) = {1r
EP1 l1r[tjx] = s
}. We can construct a particular automaton to recognizeP(s,t)
as follows.DEFINITION 6.0.1. (Angluin
[1]).
Aone-variable pattern automaton A(s, t)
fors,t
E �+such that tis a substring of sis a 5-tuple(Q,�,8,q0,F)
where (i,j) EQ
iff i,j� 0
and i +jltl::; l si, qo = (0, 0),
(i,j) EF
iffj � 1
and i +jltl = lsi,
and for thevariable x and each
a
E �' the transition is defined as follows:8
( ( i j)a) =
( i +1 ,
j) iffs [
i + jIt
I +1) = a
and8(
( i j)x) =
( i j +1)
iff ther xists an occurT nee oft
ins
beginning at position i + jIt I
+1
ofs.
A language accepted by the above one-variable pattern automaton
A( s, t)
is the set of all1r
EP1
such that8( q0, 1r) = p
EF.
THEOREM 6.0.1. (Angluin
[1]).
For every, t
E�+
such that tis a substring ofs,
it holds that
L(A(s,t)) = P(s,t).
Let
Ai = (Qi,
�'8i, q0, Fi)
fori= 1,
2 be two patt rn automata. Then, defineA1nA2
to be the finite automaton
A= (Q,
�'8, q0, F),
whereQ = Q1
nQ2, F = F1 n F2,
and for each p EQ
anda
E �'if81(p,a)
and82(p,a)
are defined and equal, then8(p,a)
is defined and equal their common value. Then, the most interesting property of one-variable pattern automata is the following compactness for intersection.
THEOREM 6.0.2. (Angluin
[1]).
Let A and B be one-variable pattern automata.Then, L(A)
n L ( B )
= L(An B).
EXAMPLE 6.0.1. LetT=
{(ab)3a(ab)3a, (ab)2a(ab)2a}
be a set of positive exa1npl and let F ={ babbaabbab, abababba}
be a set of negative examples.For patterns 1r E P1 such that
�( 1r, x)
= 2*, there exist 1r1 =xbaxba 1r2
=xbaabx, 1r3
=abxxba
and1r
4 =abxabx
such that T�
L( 1ri)
for i = 1, ... , 4. The1r2
and1r3
are removed becausebabbaabbab
E L(1r2)
for the substitutionbab
andabababba
E L(1r3)
forthe
a b.
The r maining 1r1 and 1r 4 are selected as consistent patterns for T U F.On the other hand, let F' = F U
{ abbbaabbba}.
In this case, th re are substitutionsabb
for 1r1 andbba
for1r
4. Thus,abbbaabbba
E L( 1r!)
n L( 1r 4)
, that is there exists no consistent 1r E P1 with T U F such that�(
1r,x)
= 2.A (u, v)
u = (abababa)"2 v =ababa
B (x, y) x = (ababa)"2 y = aba
Inter ection:
A(u, v) (l B(x, y)
Figure 6.1: Patt rn automata.
By Theorem 6.0.2, th class of one-variabl pattern automata is closed under in
tersection, and for any one-variable pattern automata A and B, we can represent
*Since there are at most polynomially many substrings, we can fix the number of variables and constants for possible patterns.
L(A) n L(B)
by a compact automaton. However, no one knows a compact representation for th difference
L(A) \ L(B)
because the class of one-variable pattern automata is not closed under complement.One way to simplify the consistency problern is a restriction of alphabets
(
e.g., I; ={
ab}).
Moreov r, by Theorem 6.0.1 and 6.0.2, we can generally assume thatIITII
=2
b cause for each T and
T'
such thatT � T', L(Ar) � L(AT'),
whereAr
is a onevariabl pattern automaton that accepts
T
andAr'
is defined analogously.Suppos that there xists a polynomial
p
such that for anyw E F,
the number of patterns 1rE L(Ar)
satisfyingw E L(1r)
is at mostp(n),
wheren
is the number of states ofAy.
In this case, we can guessIIFII · p( n)
patterns ofAr
and eventually find a pattern consistent withT
UF
within these patterns provided it exists.For the other cases, we use the notion of overlapping substrings. Let
u
be a non-null substring of aw
su h that�(w, u)
=k,
wherek 2: 2.
For each 1:::;
i <j :::; k,
letPi
andPj
be the positions of the ith and ith occurrences ofu,
respectively. IfPj -Pi
<JuJ,
then the suffix of
u
of lengthJul -(Pj- Pi)
is said to be the overlap of the ith and jth occurrences. We denote an ith occurrence of u inw
by(
ui)w·
Case 1. There exists an
1 :::;
i:::; k
-1 such that the length of the overlap of( u,
i)w and(u,
i +1)w
is greater thanlluJ/2J.
The substringu
is said to be a long solution of an equationw
= 7r if 7rE
P1 andw
= 1r[
uj
x] .
Case 2. For any 1
:::;
i:::;
k-
1, it is less than or equal tollu I /2 J.
The substringu
is said to b an exception if it is not a long olution ofw
= 7r for any 7rE P1.
Our first hypothesis is that for any
Ar
and for anywE F,
eitherIIL(ArnAw)ll:::;
1 orL(Ar n Aw)
=L(Ar)
wherAw
is a on -variable pattern automaton that accepts all1r E
P1 such thatw E L ( 1r).
Unfortunately, this hypothesis is dismissed by Example 6.0.1. Since th negative example
w
=abbbaabbba
is g n rated by two patterns
1r
1 =xbaxba,1r4
=abxabx E L(Ar),
thus neitherIIL(Ar n A(w,u))ll :S
1 norL(Ar n A(w, u))
=L(Ar ).
However, not all possibilities w r removed because the pairs
(w, 1r1)
and(w,
1r4)
both correspond to Case 2. Currently, there is no known counterexample for Case 1.
Thus, let us modify our hypoth sis as follows.
HYPOTHESIS 6.0.2. Let
A
b a one-variabl pattern automaton for strings wi, where i = 1, ... nand n � 2. Assume that w = 1r[
uj
x]
, 1r EL(A),
and that(
w,u)
satisfies Case 1. Then, eitherL( A n A(
w,u))
= {1r} orL( A n A(
w,u))
=L(A).
Finally, we consider a strategy for Case 2 using one-variable pattern equations
1fx = 1fy, where 1fx and 1fy are in P1 distinguished by the different variables x andy. A pair of strings u and u' is considered as a solution of the equation if 1rx
[
u/
x]
= 7ry[
u' /y].It is known that if the 1r x = 1r y has solutions u, then either u is of
(
a.f3)
na. for all n � 1 (it correspond to as 1) or the length of u is bounded by a polynomial in the length of max{l
1r xI
,l
1r Y I} (it corresponds to Case 2).We not that for any 1r E P1, if 1r
[
uj
x]
= 1r[
u'j
x]
andl
ui
=l
u'l
, then u = u'. Thus, w have that th number of exceptions for 1r x = 1r y is also bounded by a polynomial.Hence, for the Cas 2, the following hypothesis arises naturally.
HYPOTHESIS 6.0.3. Let
A
be a one-variable pattern automaton in the above Hypothesis. Assume that w = 1r
[
uj
x]
, 1r EL(A),
and the(
w,u)
satisfies the Case 2.Then there exists a polynomial
p
such that IIL( An A(
w, u))
II:::; p( m),
wherem
is the number of states ofA.
References
[1] Angluin, D. (1980). Finding patterns common to a set of strings.
Journal of Computer and System Sciences
21:46-62.
[2] Angluin, D. (1981). A note on the number of queries needed to identify regular languag s.
Information and Control51:76-87.
[3] Angluin, D. (1987a). Learning k-bounded context-free grammars.
Technical ReportYALEU /DCS/RR-557. Department of Computer Science, Yale University.
[4] Angluin,
D.(1987b ). Learning regular sets from queries and counterexamples.
Information and Computation
75:87-106.
[5] Angluin, D. (1987c). Learning k-term DNF formulas using queries and counterex
ample .
Technical ReportYALE/DCS/RR-559. Department of Computer Science, Yale University.
[6] Angluin,
D.(1988). Queri s and concept
1arning.
Machine Learning2:319-342.
[7] Angluin,
D.(1990). Negative results for equivalence queries.
Machine Learning5:121-150.
[8] Angluin,
D.,and Kharitonov, M. (1991). When won t memb rship queries help? In
Proc. 23rd Annual ACM Symposium on Theory of Computing
(pp. 444-454). ACM Press.
[9] Arikawa, S., Miyano, S., Shinohara, A., Kuhara, S., Mukouchi, Y., and Shino
hara,
T.(1993). A machine discovery from amino acid sequences by decision trees
over regular patterns.
New Generation Computing11:361-375.
[10] Arimura, H., Shinohara, T., and Otsuki, S. (1994). Finding minimal generaliza
tions for unions of pattern languages and its application to inductive inference from positive data. In Proc. llst Symposium on Theoretical Aspects of Computer Science (pp. 649-660). LNCS 775, Springer-Verlag.
[11] Beimel, A., Geller F. and Kushilevitz, E. (1998). The query complexity of finding local minin1a in the lattic . In Proc. llst Annual Conference on Computational Learning Th ory (pp. 294-302). The Association for Computing Machinery.
[12] Berman, P., and Roos, R. (1987). Learning one-counter languages in polynomial tim . In Proc. 28th IEEE Symposium on Foundations of Computer Science (pp.
61-67). IEEE Computer Society Press.
[13] Birkendorf, A., Boker, A. and Simon, H.U. (1997). Learning deterministic finite automata from smallest counterexamples. eCOLT Tec hnical Report eC-TR-97-001.
[14] Boros, E., Ibaraki, T., and Makino, K. (1997). Monotone ext nsions of Boolean data sets. In Proc. 8th International Work hop on Algorithmic Learning Theory (pp.
161-175). LNAI 1316, Springer-Verlag.
(15] Brazma, A. (1993). Efficient identification of regular expressions from represen
tative examples. In Proc. 6th Annual ACM Con} renee on Computational Learning Theory (pp. 236-242). The As ociation for Computing Machinery.
[16] Burago, A. (1994). Learning structurally reversible cont xt-free grammars from queries and counterexamples in polynomial time. In Proc. 7th Annual ACM Confer
ence on Computational Learning Theory (pp. 140-146). Th Association for Com
puting Machinery.
[17] Cook, S.A. (1971 ). The complexity of th orem-proving procedur s. In Proc. 3rd Annual ACM Symposium on Theory of Computing (pp. 151-158). ACM Press.
[18] Erl bach, T., Rossmanith, P. Stadtherr, H., Steg r, A. and Zeugmann, T. (1997).
Learning one-variable pattern languages very fficiently on average, in parallel, and by asking queries. In Proc. 8th International Workshop on Algorithmic Learning
Theory (pp. 260-276). LNAI 1316, Springer-Verlag.
[19] Gavalda,
R.(1993). On the power of equivalence queries. In
Proc. 1st European Conference on Computational Learning Theory.Clarendon Press, London.
[20] Gavalda
R.( 1994). The complexity of learning via queries. In
Proc. 9th Structure in Complexity Theory Conference(pp.324-337).
IEEEComputer Society Press.
[21] Gold
M.E.( 1967). Language identification in the limit.
Information and Control10:44 7-4 7 4.
[22] Goldschlager, L.M. (1977). The monoton and planar circuit value problems are log space complete for P.
SICA CT News9:25-29.
[23] Garey,
M.R.and Johnson, D.S. (1979). Computers and intractability. W.H. Free
man
&Co.
[24] Gray, J.N., and Harrison, M.A. (1972). On the covering and reduction problems for context-free grammars.
Journal of the A CM19:675-698.
[25] Hancock T., Galea, M., and Marchand M. (1991). Learning nonoverlapping per
ception n tworks from examples and membership queries.
Technical ReportTR-2691. Harvard University Center for Research in Computing Technology.
[26] Harrison, M.A. ( 1978). Introduction to formal language th ory. Addison- Wesley.
[27] Hopcroft,
J.E.,and Ullman, J.D. (1979). Introduction to automata theory, lan
guages, and computation. Addi on- Wesley.
[28] Ibarra,
0.,and Jiang,
T.(1988). Learning regular languag s from counterexam
ples. In
Proc. 1988 Workshop on Computational Learning Theory(pp. 371-385).
Morgan Kaufmann.
[29] Ishizaka, H. (1990). Polynomial time learnability of simple d terministic lan
guages.
Machine Learning5:151-164.
[30] Ishizaka, H., Arimura,
H.,and Shinohara, T. (1994). Finding tree patterns con
sistent with positive and negative examples using queries. In
Proc. 5th International Workshop on Algorithmic Learning Theory(pp. 317-332). LNAI 872, Springer
Verlag.
[31]
Jiang, ., Salomaa, A., Salomaa K., and Yu, S.(1995).
Decision probl ms for patterns.Journal of Computer and System Sciences 50:53-63.
[32]
Kaminski, M., and Francez, N.(1994).
Finite-memory automata.Theoretical Co·mput r Science 134:329-363.
[33]
K arn , M., and Pitt, L.(1989).
A polynomial-time algorithm for k-variable pattern languages from examples. In
Proc. 2nd Annual Workshop on Computational Learning Th ory
(pp.57-71).
Morgan Kaufmann.[34]
Kearns, M., and Valiant, L.G.(1994).
Cryptographic limitations on learning Bool an formula and finite automata.Journal of the ACM 41(1):67-95.
[35]
Kearns, M.J., and Vazirani, U.J.(1994).
An introduction to computational learning th ory. MIT Press.
[36]
Knuth, D.E.(1967).
Characterization of parenthesis languages.Information and Control11:269-2 9.
[37]
Ladner, R.E.(1975).
Th circuit value probl m Is log space complete for P.SIGACT News 7(1):18-20.
[3:1]
Laird, P.(19
). Learning from good data and bad. Ph.D. thesis Yale University.Kluwer Acad mic Publi h rs.
[39]
Lange, S. and Wi hagen R.(1991).
Polynon1ial-time infer nee of arbitrary pattern languages.
New Generation Computing 8:361-370.
[40]
Levy, L.S., and Joshi, A.K.(1978).
Sk letal structural descriptions.Information and Control 39:192-211.
(41]
Maass, W., and Turan, G.(1990).
On the compl xity of learning from counterexamples and membership queries. In
Proc. 31st Annual Symposium on Foundations of Computer Science (
pp.203-210).
IEEE Computer Society Press.[42]
Marron, A.(1988).
Learning pattern languages from a single initial example and from queries. InProc. 1988 Workshop on Computational Learning Theory (
pp.311-325).
Morgan Kaufmann.[43]
Marron, A., and Ko, K.(1987).
Identification of pattern languages from examples and queries. Information and Computation74(2).
[44]
McNaughton, R.(1967).
Parenthesis gramn1ars. Journal of the ACM14:490-500.
[45]
Mitchell, A.R.(1998).
Learnability of a subclass of extended pattern languages.In Proc. 11 t Annual Confer nee on Computational Learning Theory (pp.
64-71 ) .
The Association for Computing Machinery.
[46]
Miyano, S., Shinohara, A., and Shinohara, T.(1991).
Which classes of elementary formal systems are polynomial-time learnable? In Proc. 2nd Workshop on Algorithmic Learning The01�y (pp.
139-150).
Japanese Society for Artificial Intelligence.[47]
Natarajan B.K.(1989).
On learning from exercises. In Proc. 2nd Annual Workshop on Computational Learning Theory (pp.
72-87).
Morgan Kaufmann.[48]
Natarajan, B.K.(1991).
Machine learning: a theoretical approach. Morgan Kauf-mann.[49]
Oncina, J., and Garda, P.(1992).
Inferring regular languages in polynomial update time. In Perez, N. et al. eds., Pattern Recognition and Image Analysis (pp.
49-61).
World Scientific.[50)
Papadimitriou C.H.(1994).
Computational complexity. Addi on-Wesley.[51]
Porat, S., and Feldman, J.(1988).
Learning automata from ordered examples.In Proc. 1988 Workshop on Computational Learning Theory (pp.
386-396).
MorganKaufmann.
[52]
Reischuk, R. and Z ugn1ann, T.(199 )
. Learning !-variable patt rn languages in linear av rag time. In Proc. list Annual Conference on Computational LearningTheory (pp.
198-208).
The Association for Con1puting Machinery.[53]
Sakakibara, Y.(1990).
Learning context-free grammars from structural data in polynomial time. Theoretical Computer Science76:223-242.
[54]
Sakamoto, H.(1995).
Language learning from membership queries and characteristic examples. In Proc. 6th International Workshop on Algorithmic Learning Theory (pp.
55-65).
L AI997,
Springer-Verlag.[55]
Sakamoto, H.(1997).
Learning sirnple deterministic finite-memory automata. InProc. 8th Int rnational Workshop on Algorithmic Learning Theory (pp.
416-431).
LN AI
1316
Spring r-Verlag.[56]
Sakamoto, H.(1998a).
The learnability of simple deterministic finite-memory automata via queries. Bulletin of Informatics and Cybernetics
30(1 ):93-108.
[57]
Sakamoto, H.(199
b). Finding a one-variable pattern from incomplete data. InProc. 9th International Conference on Algorithmic Learning Theory (pp.
234-246).
LNAI
1501,
Spring r-V rlag.[58]
Sakan10to, H., and Ikeda, D.(1998).
Intractability of decision problems for finitememory automata. In Margenstern, M. ed., Proc. 2nd International Colloquium Universal Machines and Computations (vol. 2, pp.
58-74,
ISB :2-9511539-1-0).
Special issue on Theoretical Computer Science, to appear.
[59]
Salomaa, A.(1969).
Theory of automata. International Series of Monographs in Pure and Applied Math matics (vol.100).
Pergamon Press, London.[60]
Sammut, C., and Banerji, R.(1986).
Learning cone pts by asking questions. In Michalski R.S., Carbonell, J.G., and Mitchell T.M. eds., Machine Learning: an Artificial Intelligence Approach (vol.2).
Morgan Kaufmann.[61]
Schapir , R.E.(1990).
Patt rn languages are not learnable. In Proc. 3rd Annual Workshop on Computational Learning Theory (pp.122-129).
Morgan Kaufmann.[62]
Shapiro, E.Y.(1981).
A gen ral incremental algorithm that infers theories from facts. In Proc. 7th International Joint Conference on Artificial Intelligence (pp.446-451).
Morgan Kaufmann.[63]
Shapiro, E.Y.(1982).
Algorithmic program diagnosis. In Proc. 9th A CM Symposium on Principles of Programming Languages (pp.
299-308).
The Association for Cornputing Machinery.[64] Shapiro, E.Y. (1983). Algorithmic program debugging. MIT Press.
[65] Shinohara, T. (1982a). Polynomial time inference of pattern languages and its ap
plications. In Proc. 4th IB/1.1 Symposium on Mathematical Foundations of Computer Sci nc (pp. 191-209).
[66] Shinohara, T. (1982b ). Polynomial tim inference of extended regular pattern languages. In Proc. RIMS Symposia on Software Science and Engineering (pp.
115-127). LN S 14 7 Springer-V rlag.
[67] Shinohara, T. (1986). Studies on Inductive Inference from Positive data. Ph.D.
the i , Kyushu Univ rsity.
[68] Stephan, . (1998). The consistent extension problem for one-variable patterns is NP-compl t . Personal communication.
[69] Stockmeyer, L.J., and Meyer, A.R. (1973). Word problems requiring exponential time. In Proc. 5th Annual ACM Symposium on Theory of Computing (pp. 1-9).
ACM Pr ss.
[70] Takada, Y. (1988). Grammatical inferenc for even linear languages based on control sets. Infor·mation Processing Letters 28:193-199.
[71] Trakhtenbrot B., and Barzdin, Ya. (1973). Finite automata: behavior and syn
thesis. North Holland Publishing Company, Am t rdam.
[72] Valiant, L.G. (1984). A theory of the l arnable. Communications of the ACM 27(11):1134-1142.
[73] Wiehag n, R., and Zeug mann, T. (1994). Ignoring data rnay be the only way to learn efficiently. Journal of Experimental and Theoretical Inte lligence 6:131-144.
[74] Yokomori, T. (1991). Polynomial-time l arning of very simple grammars from positive data. In Proc. 4th Annual Workshop on Computational Learning Theory (pp. 213-227). Morgan Kaufmann.
[75] Yokomori, T. (1993). Learning two-tape automata from queries and counterex
an1ples. In Proc. 6th Annual A CM Conference on Computational Learning Theory (pp. 228-235). The Association for Computing Machinery.