九州大学学術情報リポジトリ
Kyushu University Institutional Repository
論理プログラム学習における汎化と述語発見
石坂, 裕毅
https://doi.org/10.11501/3065651
出版情報:Kyushu University, 1992, 博士(理学), 論文博士 バージョン:
権利関係:
Chapter 5
Model Inference with Predicate Invention
It seems that the theory of model inference is applicable to more practical problems such as automated program synthesis. However, several problems in applying the theory to the practical problems are also pointed out
[
AI87]
. The most serious problem is that we need togi
ve in advance a first order language £with finitely many predicate symbols over which a target model is described. Furthermore, the oracle which gives information about a target model to an inference machine is assumed to be able to answer the truth of all elements in B.c. In more practical setting such as in automated program synthesis, it seems difficult to assume such a power on the oracle. The assumptions require a user of the synthesis system to have too much knowledge about the target program.Now, we consider, via a simple simulation, what will happen when the assumptions are removed.
Suppose that an inexperienced programmer is going to make a program for the pred
icate
reverse(listl, list2),
wherelist2
is a reversed list oflistl,
using a system which is an implementation of the model inference algorithm, e.g. Model Inference System(
Sha82]
. Since he/
she is inexperienced, he/
she does not have enough knowledge to understand of an algorithm for reversing a list or he/
she understands an algorithm but he/
she does not see what kind of auxiliary predicates(
e.g.concat(_, _, _)
orappend(_,_,_)
etc.)
are necessary for the algorithm. So, he/
she gives the first order language with only one predicate symbolreverse(_,-)
to the system, then he/
she starts the task. He/
She patiently continues to input facts aboutreverse(_,_).
However, if there is no program for reversing a list which consists of only the predicate symbol, then his/
her efforts will result in failure. On the other hand, suppose that the system realizes the difficulty of the programming with only one predicatereverse(_,_)
and it introduces a new predicate as an auxiliary one. Since the programmer has no interpretation for such a new predicate, he/
she cannot answer the questions from the contradiction backtracing algorithm which plays a central role in the model inference algorithm. Hence he/
she cannot continue the inference process anymore.Thus, it seems that the purpose of the inexperienced programmer is not accomplished.
However, from the above simulation, the following two questions arise.
(
1)
Is there no list reversal program with only one predicatereverse(_,_).
(
2)
Is there no method for giving an interpretation, which reflects the programmer's intended model, to the new predicates introduced by the system.
For applying the theory of model inference to more practical problems such as automated program synthesis, it seems very important to answer the above two questions. We leave the first question open for researchers of the theory of logic programs. Here, we reconsider the model inference from the point of view of the second question.
The predicates, which may be necessary to define a target predicate but are not directly observable from the examples of the target predicate, are called
theoretical terms [
Sha81]
.Shapiro assumed that the theoretical terms and their intended interpretations necessary for defining the target predicate are given to the inference algorithm. Recently, however, several researchers are trying to develop a method for automatically inventing theoretical terms
[
MB88, Mug90, Ban88, Lin89b, Lin89a]
. In this chapter, we also consider the challenging problem of inventing theoretical terms.In Section 5.1, we extend the model inference problem so that it may include more practi
cal problems. In Section 5.2, we consider the problems caused in solving the extended model inference problem, especially, in introducing new predicates. In Section 5.3, we propose an easy approach to the problem. Several classes of programs for which the easy approach might go well are introduced in Section 5.4.
This chapter is based on the paper
[
Ish88b]
.5.1 An Extended Model Inference Problem
In this section, we redefine a model inference problem so that it can include more practical problems. We assume that a set of function symbols and a set of variable symbols are common to
allfirst order languages considered in this chapter. The set of predicate symbols of a first order language £ is denoted by II ( £ ) . For any first order language £, an extended language £' of £ is a first order language such that II ( £ ) � II ( £' ) . If £' is an extended language of £, then £ is called an restricted language of £'. Let P be a program over £' that is an extended language of£. Then M(P).c denotes the set M(P)
nB.c.
Let £ be a first order language and M be the least Herbrand model of an unknown program over an extended language of £. An oracle for M over £ is a device that can answer the membership of
aE B.c in M. The answer is true if
aEM, false otherwise. Note that the oracle over £ answers the membership for only the elements of B.c. If a ground atom {3 rf_ B.c is given, then the oracle is assumed to returns a special symbol
l_that means
"unknown".
An extended model inference problem is defined as follows:
Suppose that £ and an oracle for M over £ are given. Then infer a program P over an extended language£' of£ such that M(P).c =
Mfrom the information given by the oracle.
Example 5.1
Let£ be a first order language such that II ( £ )
={reverse(_, -)} and M = {reverse(lr, l2) E B.c ll2 is the reversed list of a list fr}. An extended model inference prob
lem is to find a program such as
P=
reverse([XIY], Z)
+-reverse(Y, Yi), concat(X, Yi, Z).
reverse(O, 0).
concat(X, [Y IZ], [Y IW] )
+-concat(X, Z, W).
concat(X, 0, [X]).
only by using an oracle for
M,that is, using information about reverse(_, _). In this case, II £' ( ) = II ( £ )
U{concat(_,
_,-)}.
In what follows, we distinguish the use of terms, predicate names, predicate symbols,
and predicates according to the objects indicated by them as follows.
Apredicate name
indicates just a string such as
reverse, concat,
etc. Apredicate symbol
indicates a predicate name with arity such as
reverse(_,_), concat(_, _, _),
etc. Apredicate
indicates a predicate symbol concerned with its model, e.g.reverse(_,_)
concerned with the model{reverse(O, 0), reverse([a], [a]), reverse([a, b], [a, b])}.
5.2 Problems in Extended Model Inference
The most difficult problem for an algorithm to solve an extended model inference problem is how to introduce an unknown predicate symbol in
II(£') -II(£)
when it is necessary. In this section, we review recent two approaches to such an extended model inference problem.One was given by Muggleton and Buntine [MB88] and another was given by Banerji [Ban88].
Through their approaches, we consider the essential problems in solving an extended model inference problem.
M uggleton and Bun tine proposed a method called
inverse resolution
[MB88]. They implemented their method as a inductive logic programming system called CIGOL. Inverse resolution is executed by three operators, truncation, absorption, and intra-construction.
The truncation constructs unit clauses from given positive facts using the least generaliza
tion algorithm. The absorption constructs non-unit clauses from unit clauses obtained by the truncation. A new predicate symbol is introduced by the intra-construction. CIGOL re
quires a more powerful oracle than that we introduced in the previous section. For example, the oracle can answer correctness of a general clause such as
reverse([XIY ], Z)
+-reverse(Y, Y1), concat(X, Yi, Z).
in a target model. Furthermore, although CIGOL invents predicate symbols, it does not invent predicates. That is, assigning a model to a newly invented predicate symbol is left to the oracle.
The following two problems should be considered for introducing new predicates in the extended model inference.
1. When (or why) is it necessary to introduce a new predicate symbol?
2. How to define the model assigned to the new predicate symbol.
Although CIGOL avoids the second problem by oracle power, the problem is apparently most essential in extended model inference.
The first problem is concerned with the convergency of an inference algorithm. For example, CIGOL introduces a new predicate symbol in inferring a model
M
={ arch(t) I
tis a triplet(t1, beam, t1)
andt1
is a list(
possibly empty)
which consists ofblock
andbrick}.
The new predicate symbol corresponds to a predicate
column(_)
whose intended model is{column(t) I
tis a list consists ofblock
andbrick}.
Of course, the predicatearch(_)
can be defined usingcolumn(_)
asarch((X, beam, X))
�column(X).
However, there exists a programP
such thatM(P)
=M
andP:
arch(([blockiX], beam, [block IX]))
�arch((X, beam, X)).
P =
arch(([brickiX], beam, [brick IX]))
�arch((X, beam, X)).
arch((O, beam, 0)).
which consists of only the predicate symbol
arch(_).
In order for the inference algorithm to converge to some program, it should be very careful to introduce new predicates. If an inference algorithm introduces new predicates immoderately, then it might diverge. In a setting in which an inference algorithm infers not only a target model but also a language for describing the model, it becomes difficult to ensure the convergency of the inference process.On the other hand, Banerji proposed a procedure called DREAM that produces a new predicate
[
Ban88]
. DREAM works as follows. Suppose that there exist two clauses p � A, D and p � B, D in the current hypothesis, where D is a set of atoms appearing in both bodies in common, and A and B are the sets of left atoms of the two bodies after removing the common atoms. Then, DREAM replaces the two clauses by the following three clauses:p �
new(t1, ... , tn),
D.new(X1, ... , Xn)
� A'.new(X1,
... , Xn)
� B'. ,where
t1,
... , tn are the terms in A and B, A' and B' are atoms obtained from A and B by replacingt1
withXi
at every occurrence andnew(_, ... ,_)
is a new predicate symbol.This method answers very clearly for the second problem we mentioned above. The model of
new(_, ...
,-)
is strictly defined according to the models of A' and B' that are already known. However, for the first problem, the method seems to be problematic. Actually, as Banerji noticed in his paper[
Ban88]
, this kind of new predicates may be useful for simplifying programs. However, it is not necessary to introduce the new predicate. The predicate p had been already defined by the clauses p +-- A, D and p +-- B, D.Ling classified predicates to be invented into two categories called
necessary intermediate terms
anduseful intermediate terms [
Lin89a]
. A necessary intermediate term is a predicate which is necessary to describe the target program. For example, the predicate such asconcat(_,
_, _)
inreverse
program is a necessary intermediate term. On the other hand, a useful intermediate term is a predicate which is useful for simplifying conjectures, but not necessary for describing the target program. For example, both predicates,column(_)
introduced by CIGOL and new
(
_, ... ,_)
introduced by DREAM, are useful intermediate terms. As mentioned in the introduction of this chapter, our interest is in invention of necessary intermediate terms.5.3 A Simple Approach to the Problems
It seems very difficult to develop a general strategy, which has plausible answers to the two problems mentioned in the previous section, for introducing necessary intermediate terms. However, if each model in the target class to be inferred can be represented by a fairly restrictive program, it may be possible to develop a plausible strategy for introducing necessary terms in inferring the class. In this section, we consider what kind of restrictions on the program is preferable for such a strategy.
5.3.1 When a new predicate is necessary
First, we consider the situation in which a new predicate should be introduced. Let M be a target model. In such a situation, the current conjecture must be inconsistent with at least one fact known so far, for a new predicate becomes necessary. There are two possible cases.
One is when the conjecture
P
is too strong, that is, a Elvf(P)
for some known negativefact a rf. M. The other is when Pis too weak, that is, a rf.
M(P)
for some known positive fact aE
M. It follows from Corollary 3.3 that, for any too strongP, P
has a false clause in M. Hence, whenP
is too strong then it suffices to remove the false clause. Thus, a new predicate is not necessary in this case.On the other hand, when
P
is too weak then there exists some ground atom {3E
M(possibly, {3 =a
)
such that {3 is not covered by any clause inP
with respect to M. In such a case, a clause which covers {3 with respect to M should be added toP.
A new predicate becomes necessary when there is no clause which consists of known predicates and covers {3 with respect toM.Thus, in order to launch into the introduction of a new predicate, the inference sy stem must be able to examine all the candidate clauses and recognize the point when such clauses are exhausted. Hence, it is preferable that, for any positive fact a, the number of all possible
clauses
whichcovers
ain a target model is finitely bounded. Several types of restrictions for
bounding the number of possible clauses can be considered. At least the number of atoms allowed to appear in the body and the depth of terms in the atoms should be bounded in advance. It is preferable that the number or the depth are bounded as small as possible for the efficiency of the inference algorithm.
5.3.2 The model of a new predicate
As mentioned in the previous subsection, a new predicate at first appears as an atom in the body of a clause which covers a positive fact. The model of the new predicate should closely be related to the models of other predicates in the clause. By using this property, we may define the model of the new predicate. For example, suppose that a new predicate
new(_,_,_)
is introduced with a clausereverse([XjY],
Z) +-- reverse(Y, Y1), new(X,
Yi, Z)
. Then, the model ofnew(_,_,_):
{new(t1, t2, t3) I reverse([t1lt4], t3) EM
andreverse(t4, t2) E .N/}.
can be defined in terms of the model M of
reverse(_,_,_):
{reverse( t1, t2) I t2
is the reversed list oft1}.
Thus, the model is concerned with M through the shared variables in the clause.
In order for this simple method to work well, the following items should have been fixed or bounded as tight as possible.
(a) The arity of a predicate symbol to be introduced.
(b) The possible form of atoms occurring in a clause.
(c) The possible form of terms occurring in the atoms.
It is difficult to fix these items when general programs are targeted. Conversely, if we restrict the target to the class of programs for which the above matter can be fixed, then it may be possible to develop a plausible strategy for introducing new predicates and solving an extended model inference problem for the class of models described by the programs. In the next section, we give some examples of such classes of programs.
5.4 Examples of Restricted Programs
In this section, we give some examples of the class of programs which have considerably restricted syntax, and discuss the possibility of predicate invention in the extended model inference.
5.4.1 DRLP
A
deterministic regular logic program
(DRLP, for short) is a program which is equivalent to a deterministic finite automaton. Thus, the class of models described by DRLP's, called regular models, is equivalent to the class of regular languages. A DRLP consists of clauses of the following two forms:qi([aiX])
+-qj(X).
qi ([]).
Furthermore, for each predicate symbol
qi(-)
and each constant symbola (#- 0),
a DRLP has at most one clause whose head isqi([aiX]).
A DRLP is so restrictive as to satisfy the requirement mentioned in the previous section.
The model of a new predicate
qj(-)
introduced by a clauseqi([aiX])
+-qj(X)
can be definedas
{ qj( w) I qi([alw])
EM},
wherew
denotes a ground list. In Chapter 6, we will discuss in detail the extended model inference of regular models.5.4.2 LMLP
A linear monadic logic program
(
LMLP, for short)
is a program which is equivalent to a deterministic tree automaton. A L:NILP consists of clauses of the following two forms:qi(j(X1, ... , Xn))
+-qj1 (X1), ... , qin (Xn)·
qi( a).
where x1, .
.
., Xn
are mutually distinct variables. That is, the class of LMLP's is a natural extension of that of DRLP's. Skakibara showed that the extended model inference problem forthe class
of models of LMLP's can efficiently be solvable[
Sak90)
.Since the syntax of LMLP's is fixed, they also satisfy the requirement mentioned in the previous section. Furthermore, the atoms appearing in the body do not share a variable.
Thus, as in the case of DRLP, it seems that the model of a new predicate which is introduced by a clause
can be defined as
However, for a LMLP, the uniqueness of a clause according to its structure is not assumed.
Hence, a LMLP is allowed to contain the following clauses:
ql (X
vY)
+-ql (X), q2(Y).
q1 (X
vY)
+-ql (X), q3(Y).
In such a case, it is impossible to distinguish the model of
q2(_)
from the model ofq3(_).
Thus, for the programs containing such clauses, that is, for the programs which essentially have OR-parallelism, the simple way of defining the model may not work well. Actually, the efficient extended model inference algorithm by Sakakibara is based on a special strategy.
5.4.3 SDG
A LMLP has no shared variable in the body of each clause but may have clauses with a common head. Conversely, a program introduced here has no other clauses than those which have a common head but may have a shared variable in the body of a clause. This class corresponds to that of
simple deterministic grammars (
SDG, for short)
. An SDG is a context-free grammar G =(N,
:E, P,S)
such thatif both
A� a
a andA--t a(3
are production rules in P then a=(3. (5.1)
A language generated by an SDG is called a simple deterministic language
(
SDL, for short)
.It is well known that every context-free grammar can be represented by a definite clause grammar which is a kind of logic program. A definite clause grammar corresponding to an SDG, denoted by SDDCG, consists of the following three types of clauses:
A([aiX], Y)
t-B(X, Z), C(Z, Y).
A([aiX], Y)
t-B(X, Y).
A([aiX], X). ,
where each predicate name corresponds to a nonterminal of the SDG and
a
is a terminal symbol of the SDG. Each predicateN(x, y)
is interpreted as a predicate which is true if and only the string represented by thedifferential list x
-y1
can be generated from the nonterminal N in the SDG. From the condition
(5.1)
on the production rules, for each predicate symbolA(_,_)
and each terminal symbol a, there exists at most one clause whose head is an instance ofA([aiX], Y).
Thus, the problem as mentioned in the previous subsection does not occur.For SDDCG's, however, a new problem arises because a clause of the first type shares a variable in its body. For example, the models
M
BandMC
of the predicate symbolsB(_, _)
and
C(_,
_)
can be defined respectively as follows:M B
={ B(t1, t2) I A([alt1], t3)
Elvf, C(t3, t2)
EN/C}, MC
={C(t2,t2) I A([alt3],t2)
EM,B(t3,t1)
Elv!B}.
Since each model is defined mutually by its alternative's, the definitions are problematic.
1 For example, a differential list
[a, b, c, d] - [c, d]
represents the list[a, b].
Thus, for a program which needs a clause with shared variables in the body, when two or more predicates are necessary to be introduced simultaneously by one clause, a problem how to define the models of both predicate arises. For general programs, the problem may be intractable. Fortunately, however, for an SDG G
= (N,
I:, P, S)
,it follows from the condition 5.1 thatif
A=}* w a
andA=}* w{3 (A EN, wE
r:+,{3 EN*)
thena= {3.
Hence, for a clause of the form
A([a!X], Y)
+-B(X,
Z)
, C(
Z,Y ),
if some positive factB([wju], u)2
is found, then the model MC of C(
_, _)
can uniquely be defined by Mas follows:A positive fact about
B(_,
_)
can be obtained asB([wlu], u)
from a positive factA( [ax j y] , y)
such that
xy = wu
and w is a prefix ofx.
In Chapter 7, we will discuss in detail the extended model inference problem for this class. The algorithm described in there will use the above property and achieve an efficient inference.
5.4.4 Simple recursive programs
Since each program introduced above is a direct representation of an automaton or a gram
mar, there exists a special pattern on the structure of a clause or the arity of a predicate sy mbol appearing in the clause. It may be impossible to expect a class of general programs to have such a pattern. However, it may also be a fact that we often make a program by repeating some stereoty ped patterns. Here, we see such patterns found in some simple programs called
term-free transformations
introduced by Shapiro[
Sha81]
.Multiplication:
times(O, X, 0).
times(s(X), Y,
Z)
+-times(X, Y, U),plus(Y, U,
Z)
.plus(O, X, X).
plus(s(X), Y, s(Z)) +- plus(X, Y,
Z)
.2 As in the next chapter, we abbreviate a list
[a1,
... , anI [b1, ... , bm]]
as[wlu]
where w = a1 a2 ···an andU =
b1�1
• • • bm.Subset relation:
Insertion sort:
subset(O, X).
subset([XIY], Z) +-subset(Y, Z), member(X, Z).
member(X, [XIY]).
member(X, [YIZ]) +-member(X, Z).
sort([], 0).
sort([XIY], Z) +-sort(Y, Yi), insert(X, Yi, Z).
insert(X, 0, [X]).
insert(X, [YIZ], [X, YIZ]) +-X � Y.
insert(X, [YIZ], [YIZ1]) +-Y �X, insert(X, Z, Z1).
x�x.
X � s(Y) +-X � Y.
The list reversal program described in the introduction of this chapter is also a term-free transformation. There are following properties common to these programs.
(1)
There is at most one auxiliary predicate in the body of each clause.(2)
All arguments of the atoms appearing in the bodies are variables.(3)
No free-variable appears in the body, where a free-variable is a variable which appears at most once only in the body of a clause.( 4)
Each head has no common instance with other head.These properties seem to be preferable for clearing the problems mentioned before. From the property
(1),
we do not need to worry if several new predicates are introduced simultaneously by one clause. By the propety( 4),
we can clear the problem mentioned in Section5.4.2.
Bythe properties
(2)
and(3),
we can restrict(
b)
and(
c)
in Section5.3.2.
Furthermore, there is one more typical property common to the above programs, which may be useful for deciding the arity of predicate to be introduced.(5)
Each variable processed recursively does not occur in each auxiliary predicate.For example, the variable
x
intimes(s(x), y,
z)
does not occur inplus(y, u,
z)
.Thus, there may exist a plausible strategy to solve the extended model inference problem such as in Example 1.1. It is a very interesting and challenging theme to develop such a strategy.
In the following two chapters, we shall consider some concrete extended model inference problems. One is concerned with inferring DRLP's and the other with inferring SDG's. In both cases, we shall develop efficient inference algorithms. The existence of an effective method for introducing predicate symbols and assigning appropriate models to the symbols is essential for the efficiency.
Chapter 6
Learning Regular Languages
In this chapter, we consider a problem of inferring a class of restricted logic programs that corresponds to the class of acceptors for regular languages. In our setting, a target logic program is over a first order language £ with countably many unary predicate symbols:
q0, q1, q2, . . . . A given oracle is that for a model M0 over £0, the restricted language of £ in which only one predicate symbol q0 is allowed. As mentioned in the previous chapter, in such a setting, the oracle has no interpretation for predicates other than the predicate q0 . This implies that we cannot take advantage of the contradiction backtracing algorithm which is one of the most important part for the efficiency of Shapiro's model inference algorithm.
In order to overcome the disadvantage, we develop a method for giving an interpretation for predicates other than the predicate q0, which is based on the idea of using the oracle for M0 and a one to one mapping from a set of predicates to a set of strings. Furthermore, we propose a model inference algorithm for regular languages using the method, then discuss the correctness and the time complexity of the algorithm.
In Section 6.1, we define the problem of learning regular languages as an extended model inference problem. In Section 6.2, we propose a key idea to solve the extended model inference problem. We introduce a simple mapping, called a predicate characterization, by which a model over £0 can be extended to a model over £. Then, we show some conditions which the mapping should satisfy to perform an appropriate extension of the model. In Chapter 6.3, we give an inference algorithm to solve the extended model inference problem.
The correctness of the algorithm is shown in Chapter 6.4, and time analysis of the algorithm
is made in Chapter 6.5.
This chapter is based on the paper
[
Ish88a]
.6.1 Regular Model Inference Problem
We introduce a class of Herbrand models called regular models and show that the class is equivalent to that of regular languages. Then a regular model inference problem, which is an extended model inference problem, is defined.
Let .C be a first order language with countably many unary predicate symbols
q0, q1,
... , a list constructor[-1-J,
and finitely many constant symbols0, a1, a2, ... , am.
Since every predicate symbol treated in this chapter is unary, we shall identify each predicate symbolqi(-)
with its predicate name
qi
for notational convenience. £0 denotes the restricted language of£ in which only one predicate symbol
q0
is allowed. Let P be a logic program over .C . .Cp denotes the language with only predicate, function and constant symbols occurring in P, IT(
P)
denotes the set of the predicate symbols.Let
:E
denote the set{ a1, a2, ... , am}·
A stringx
=ai1 ai2
• • •ain
E:E*
is denoted using list notation by[ai1, ai2,
•••,aiJ·
Furthermore, the list[ai1, ai2,
• ••,aiJ
is abbreviated as[ai1 ai2
• • •ain].
Thus the stringx
is represented as[x]
in the following context of logic programs. Note that, for the empty string,
[c]
is denoted by the empty list0.
For a program P over .C, M(
P)
q, denotes the set{qi([x])
E M(
P) I x
E:E*}.
Definition 6.1 A deterministic regular logic program
(
DRLP, for short)
P is the logic program over .C which satisfies the following conditions.
1)
Each clause in P is of one of the following two forms:qi([ak!X])
+--qj(X), qi([] ).
2)
For anyqi
E II(P)
andak
E :E, there is at most one clause inP
whose head isqi([ak!X] ).
Definition 6.2 Let Mo � Bc0• We say M0 is a regular model if there exists a DRLP P such that Nfo =
NI(P)qo
Let
L(Mo)
denote the set{x
E I:*I q0([x])
EM0}.
Then we have the following theorem which ensures that the class of regular models is equivalent to that of regular languages.Theorem 6.1
For any M0
� l3.c0,M0 is a regular model if and only if L(Mo) is a regular language.
Proof: First, we prove the
only if
direction. LetP
be a DRLP such thatM(P)qo
=M0.
Without loss of generality, we may assume that
II(P)
={ qo, q1, ... , qm}·
Then, construct a DFA A =(
Q U{ qm+1},
I:,8, q0, F),
where Q =II(P), qm+l
is a predicate symbol which does not appear inII(P), F
={qi
E QI qi([])
EP},
and{ qj
ifqi([akiX])
�qj(X)
EP, 8(qi, ak)
=qm+l
otherwise.Here, we show that
8(qi, x)
EF
if and only ifqi([x])
EM(P)
for anyqi
EII(P)
andx
E I:* by induction on the length ofx.
If
lx I
= 0, that is,x
=c
then it follows from the definition of A that(6.1)
From Corollary 3.3, if
qi([])
EP
thenqi([])
EM(P).
Also, ifqi([])
EM(P)
then there exists a clauseC
inP
such thatqi([])
EC(M(P)).
Hence, it follows from the definition of DRLP's thatC
is of the formqi([]).
Thus it holds thatqi(O)
EP
if and only ifqi([])
EM(P).
With(6.1),
this implies that8(qi, c)
EF
if and only ifqi([c])
EM(P).
Suppose that
8(qi, a)
=qj.
Since8(qi, ax)
=8(8(qi, a), x),
it holds that8(qi, ax)
EF
if and only if
8(qj, x)
EF.
From the induction assumption, it holds that8(qj, x)
EF
if and only ifqj([x])
EM(P)
for anyqj
EII(P)
and any stringx
E I:* such thatlxl
� n for some n � 0. On the other hand, from the definition of A, it holds that, for anya
E I: andqi
EII(P),
Thus, it holds that
From Corollary 3.3, if
qi([aiX])
�qj(X)
EP
andqj(x)
EM(P),
thenqi([ax])
EM(P).
Also, if
qi([ax])
EM(P),
then there exists a clauseC
inP
such thatqi([ax])
EC(M(P)).
It follows from the definition of DRLP's that suchC
is unique if it exists. Hence, ifqi([ax])
EM(P)
then there uniquely exists a clauseqi([aiX])
�qj(X)
inP
such thatqj(x)
EM(P).
Thus, it holds that
b(qi, ax)
E F iffqi([ax])
EM(P).
This completes the induction step of the proof. Hence, it holds that
8( qi, x)
E F if and only ifqi([x])
EM(P)
for anyqi
EIT(P)
andx
E E*. As a result, we obtain the following relation:b(qo, x)
E F iffqo((x])
EM(P)qo
iffqo((x])
EMo.
Thus
L(Mo)
is a regular language.Conversely, from a DFA which accepts
L(M0),
we can construct a DRLP and show thatthe other direction in a similar fashion.
D
Here we define the problem of learning regular languages as an extended model inference problem. Let
M0 (� B.c0)
be a regular model. Anoracle
forM0
over C0 is the device which, for any inputa
EB.c0,
returnstrue
if a EM0, false
otherwise.Facts
aboutM0
are pairs of the form(a, V),
wherea
EB.c0
andV
E{true, f al se}
is the output value of an oracle forM0
on an inputa.
Ground atoms inM0
are calledpositive facts,
while othersnegative facts.
An
enumeration of M0
is an infinite sequence: F1, F2, F3, . •., where each Fi is a fact aboutM0
and everya
EB.c0
occurs in a fact Fi =(a, V)
for some i � 1. We assume the oracle forM0
can give any enumeration ofM0
to an inference algorithm.The main problem considered in this chapter is as follows.
Suppose an oracle for some unknown regular model
M0
over C0 is given. Find a DRLPP
such thatM(P)qo =Mo.
In this chapter, the inference algorithm is allowed to use membership queries about a target regular model
M0•
A membership query aboutM0
is to propose a ground atoma
EB.c0
and ask the oracle ifa
E lv/0. An inference process of the algorithm is as follows. At each time, the algorithm reads one fact(ai, Vj)
from a given enumeration of the target modelM0.
Then it makes finitely many membership queries aboutM0.
According to answers from the oracle, the algorithm produces a DRLP Pi as a conjecture.Although, the almost all definitions concerned with the behavior of an inference algorithm are same as those defined in Chapter 2, we need to redefine the notion of identifiability. An inference algorithm is said to identify a regular model
M0
in the limit if it converges on every enumeration ofM0
given by the oracle to a conjecture P such thatM(
P)
qo =M0.
Figure 6.1 illustrates the framework of the regular model inference.Figure 6.1: The framework of the regular model inference
(qo([xl]), Vl), (qo([x2]), V2),
..., (qo([x;]), Vi),
...
•I
Oracle forM0 I I
Inference}
algorithmI )
!
Query:q0([x])
EMo
?P1, P2, ... , Pi, ... , Pn, Pn, Pn, ... :
M(
Pn)
qo =Mo
In the framework of Shapiro's model inference
[
Sha81, Sha82]
, a first order language ..C with finitely many predicate symbols is given in advance. The Model Inference System(
MIS, for short)
based on the model inference algorithm requires an oracle for a model M over£. Then MIS efficiently synthesizes a logic program P over ..C such thatM(
P)
=M.
The model inference algorithm is equipped with the contradiction backtracing algorithm as a sub-algorithm. The sub-algorithm plays the most important part for the efficiency of the model inference algorithm. With the help of the given oracle, the contradiction backtracing algorithm finds out a wrong clause in a hypothesis unsuitable for a conjecture by examining falsity of clauses in the hypothesis. The examination is executed by making membership queries about
M,
that is, by proposing a ground atom a E B.c and asking if a E M.Note that, in the setting, hypotheses are logic programs over ..C and a target model
M
is a Herbrand model over the same language ..C.In our setting, however, the given oracle cannot answer membership queries for a E B.c
B.c,0• Thus, the oracle cannot be used directly to examine the falsity of clauses which contain a predicate symbol except
q0.
In order to construct an efficient regular model inference algorithm, we should develop some method for examining falsity of clauses over ..Cp for anyhypotheses P. In the next section, we shall argue such a method.
6.2 An Extended Model of a Regular Model
We introduce a mapping called a predicate characterization. W ith the mapping satis
fy
ingsome conditions, we can appropriately extend a regular model over
£0
to a model over£p
for any DRLP P.
Definition 6.3 A predicate characterization for a DRLP P, denoted by
CHp,
is a one to one mapping from Il(
P)
to I:*.For any
qi
E Il(
P)
,CHp(qi)
is called the characteristic string ofqi
withCHp.
Definition 6.4 Let
M0
� B.c0 andCHp
be a predicate characterization for a DRLP P. We define an extended model ofM0
withCHp,
denoted byI(M0
,CHp ),
as follows:With the above extension of a model over £0, for the present, we can get a model over
£p.
However, it is nonsense that a model over£0
is arbitrarily extended with a haphazard predicate characterization. The extension should satisfy the following condition:M(
P)
qo =Mo
iffM(P)
=I(Mo, CHp )
. We show that some restrictions onC H p
lead to such an extension.Definition 6.5 Let
CHp
be a predicate characterization for a DRLP P. TheCHp
is saidto be consistent if, for any
qi
E Il(
P)
, there exists a derivation tree ofq0([CHp(qi)])
on P inwhich
qi(O)
appears.Note that, for any DRLP P and any ground atom
qi([x])
E B.cP, any possible derivation tree of a on P is unique and is of the very simple form as in Figure 6.2. The above definition is translated in terms of the theory of finite-state automata as follows. Let P be a DFA with a transition function8, CHp
is consistent if and only if8(q0, CHp(qi))
=qi
for any state qi in P.Figure 6.2: A derivation tree on a DRLP
qi([a1a2a3 · · ·]) qi1 ([a2a3
I ·· ·])
qi2 ([a3 · · ·])
I ILemma 6.2 For any DRLP
P,
if a predicate characterizationCHp
forP
is consistent, then it holds thatM(P)
=I(M(P)q0, CHp ).
Proof: From the uniqueness of the derivation tree on a DRLP and the consistency of
CHp,
for any
qi
EIT(P)
andx
E �*, it holds thatP
�qi ( [ x])
iffP
�qo ( [ C H p ( qi) · x]).
Since
PI-
a if and only if a EM(P),
it holds thatqi([x])
EM(P)
iffqo([CHp(qi)·x])
EM(P)
iff
qo([CHp(qi)·x])
EM(P)qo
iff
qi([x])
EI(M(P)q0, CHp ).
D
Theorem 6.3 Suppose
CHp
for a DRLP Pis consistent andCHp(q0)
= c. Then, for anyMo
�B.c0, M(P)q0
=Mo
if and only ifM(P)
=I(Mo, CHp ).
Proof: Since the only if direction immediately follows from Lemma 6.2, it is sufficient to prove the if direction. For any
x
E �*, it holds thatqo([x])
EMo
iffqo([CHp(qo) ·x])
EMo (
fromCHp(q0)
= c)
iff
qo([x])
EJ(Mo, CHp)
iff
qo([x])
Elv!(P) (
from the assumption)
iff
qo([x])
Elvf(P)q0•
D
By the above theorem, the regular model inference problem can be restated as follows.
Suppose an oracle for some unknown regular model
M0
is given. Find a DRLP P such thatM(
P)
=I(M0, CHp )
, whereCHp
is consistent andCHp(q0)
=c.With such a predicate characterization, it is possible to take advantage of the contradiction backtracing algorithm. Whenever the algorithm needs information about membership of some ground atom
qi([x]) E
B.cp in an extended modelI(lvf0, CHp )
, it can get the information by making a query
"q0([CHp(qi)·x]) E M0?''
to the given oracle.6.3 A Regular Model Inference Algorithm
In this section, first we show that every clause in a DRLP P has a property called complete
ness with respect to
M(
P)
. Then we give a regular model inference algorithm in which an extended model of a target regular model is used for detecting incomplete clause in unsuitable hypotheses with respect to the extended model.Definition 6.6 Let
M
� B.c andC
be a clause. Then Cis said to besufficient in
M if, for every aEM
such thathead(C)
t a, it holds that aE C(M).
Definition 6. 7 A clause Cis said to be
complete in
M if Cis both true and sufficient inM.
The following proposition directly follows from the above definitions.
Proposition 6.4
The following two statements are equivalent.
1. A
clause qi([a!X])
+--qj(X) is complete in a model M .
2.
For any x E
I:*,qi([ax]) E M if and only if qj([x]) E M.
For any DRLP Panda
E M(
P)
, since there exists at most one clause which covers a in M(
P)
, each clause in Pis sufficient inlvf.
Thus we have the following proposition.Proposition 6.5
For any DRLP P, every clause in Pis complete in M(P).
With Lemma 6.2, this implies the following proposition.
Proposition 6.6
Let CHp be a consistent predicate characterization for a DRLP P. If
lvf(P)qo
= lvfothen every clause in Pis complete in I(lv/0, CHp).
By the above proposition, if there is a clause in
P
which is not complete inI(M0, CHp)
for some
CHp,
then.Nf(P)qo ::1 M0•
Hence, whenever a clause which is not complete inI(M0, CHp)
exists in a hypothesisP,
then the clause must be eliminated fromP.
Thus inour setting, the clause that are not sufficient in an extended model are removed from an unsuitable hypothesis, while, in Shapiro's model inference algorithm, only the clauses that are not true in a target model are removed.
Algorithm 6.1: A regular model inference algorithm
Given:
An oracle for a regular modelM0
over£0.
Input:
An enumeration aboutM0•
Output:
A sequence of DRLP's.Procedure:
P := ¢; CHp := {(q0, c-)};
Strue:= ¢;
Sfalse:= ¢; State:=
0;repeat
read the next fact
(a, V); Sv := Sv
U{ a };
repeat
while
there existsa
E Sfalse such that a EM(P) do
let
PTa
be the proof tree ofa
onP;
C := contradiction_backtracing(PTa);
P := P- {C};
C' := next_clause( C);
P := P u {C'};
while
there exists(3
E Strue such that(3 � M(P) do (3' := uncovered_atom(/3);
C
:=search_clause(/3');
P := Pu {C};
until
neither of thewhile
loop is entered;output
P;
forever.
Now we state an outline of the algorithm mainly concentrating on the following two:
• how to modify hypotheses,
• how to construct a predicate characterization.
For the present, we assume that, at any time, the predicate characterization
CHp
constructed by the algorithm is consistent andC
Hp ( q0) =
c.Algorithm 6.2: Sub-procedures for modifying a too strong hypothesis
contradiction_backtracing:
Given: An oracle for a regular model
M0
over£0.
Input: A proof tree of an atom
qi([ax])
onP
such thatqi([ax])
EM(P)
butqi([ax]) rt I(M0, Clip).
Output: A clause
C
EP
which is false inI(M0, Clip).
Procedure:
let
qj([x])
be the child ofqi([ax])
in the proof tree;if
qj([x])
EI(M0, Clip)
then returnqi([aiX])
*-qj(X);
else
let
PT
be the proof tree ofqj([x])
onP;
I*
Such a proof tree can be obtained from the input proof tree by removing the root node*I
return contradiction_backtracing(
PT).
next_clause:
Input: A clause
qi([aiX])
*-qj(X).
Output: A clause
qi([ajX])
*-qj+l (X).
Procedure:
if
j = State
thenState := State+
1;let x be the string such that
( qi, x)
EC Hp;
Clip:= Clip
U{(qj+l, xa)};
return
qi([a!X])
*-qj(X).
There are following two cases in which a hypothesis
P
should be modified.(
1)
The hypothesis is too strong, that is,M(P)
contains some negative fact(
the first while loop in Algorithm 6.1)
.(
2)
The hypothesis is too weak, that is,M(P)
does not contain some positive fact(
thesecond while loop in Algorithm 6.1
)
.In the first case, there exists at least one clause in
P
which is not true inI(M0, Clip)
(
this will be clear in the proof of Lemma 6.8)
. The algorithm finds such a clause using the contradiction backtracing algorithm(
procedure contradiction_backtracing). The clauseis removed from the hypothesis. Then an alternate clause constructed by a clause generator (procedure
next_clause)
is added to the hypothesis.In the procedures
contradiction_backtracing
anduncovered_atom,
the examinationof
if
statement"qj([x])
EJ(M0, CHp )
" is done by making a membership query proposingq0([CHp(qj)·x]).
We do not consider the atom of the formqi([])
as an input forcontradic
tion_backtracing.
The reason be described after Proposition6.7.
Algorithm
6.3:
Sub-procedures for modifying a too weak hypothesisundovered_atom:
Input:
An atom {3 such that {3 EJ(M0, CHp)
but {3 <$M(
P) .
Output:
An atom inI(M0, CHp)
which is not covered by any clause in P with respect toI(M0, CHp)
Procedure:
if there exists a clause
qi([aiX])
f--qj(X)
E P such thatqi([aiX])e
= {3 for some substitution ethen if qj(X)B
EJ(M0, CHp) then
return
uncovered_atom(qj(X)B);
else
return {3;else
return {3.search_clause:
Input:
An uncovered atomqi([x])
which is returned by the above procedure.Output:
A new clauseC
whose head is unifiable withqi([x]).
Procedure:
if x
= cthen
return
qi ( []);
else
let x =ax';
if
there existsC
E P such thathead( C)
=qi([aiX]) then
p := p-
{C};
return
next_clause( C);
else
returnqi([aiX])
f--qo(X).
In the second case, there exists a ground atom
qi([x])
EI(lv£0, CHp)
which is not covered by any clause in P with respect toI(lv£0, CHp)
(this will be clear in the proof of Lemma6.9).
The procedure
uncovered_atom
finds out such an atom. Note that when the procedure iscalled with an input
qi([]),
it always return the input. Because, ifqi([])
� M(P), then it is immediately found thatq1([])
is not covered by any clause in P with respect to any model.For such an uncovered atom, the following two cases are possible.
(i) There is no clause in P whose head is unifiable with
qi([x]).
(ii) Although there is a clause C whose head is unifiable with
qi([x]),
C does not coverqi([x])
inI(M0,
CHp)
.In the case (i), if
x =
c, then the unit clauseqi([])
is added to the hypothesis, and ifx = aw (a
E:E, w
E:E*),
then the clauseqi([ajX])
�q0(X)
is added to the hypothesis. In the case (ii), according to Proposition 6.6, Cis removed from the hypothesis, then an alternate clause is added to the hypothesis similarly in the case (1). The procedure search_clause works according to each case.In the algorithm, the predicate characterization is represented as a set of pairs of a predi
cate symbol and a string. The set is constructed as follows. Let P be the current hypothesis and CHp
= {(q0,
c)
,(q1, x1), (q2, x2), ... , (qk, xk)}
be the current predicate characterization.Now we assume that the algorithm finds out a clause C
= qi([ajX])
�qk(X)
in P which is not complete inI(M0,
CHp ).
Then P is modified in the way mentioned above. Hence, Cis removed from P and an alternative constructed by the clause generator is added to P. Since the clause generator increases the index ofqk,
the alternative has a new predicate symbolqk+l
which has never appeared in P before. When such a new predicate symbol is introduced, the algorithm adds a pair
(qk+l, xia)
to CHp, that is, a characteristic string of the new predicate symbol is determined by the head of the clause which caused the introduction of the new predicate symbol.Figure 6.3 illustrates the outline of modifying hypotheses mentioned above. The dotted line denotes the operation which is made in the case (2)-(i). The two solid lines denote the operations that are made simultaneously in the case (1) and (2)-(ii).
6.4 Correctness of the Algorithm
First, we give a simple proposition concerned with the property of predicate characterization.
From the definition of an extended model, the following holds.
Figure 6.3: The outline of modifying hypotheses
qo(O) qJ(O)
qo([a IX]).- qo(X) qk([aiX]).- qo(X)
---...
'
\ I
A clause which is not complete in
l(Mo,
CHp)qi([a IX])+- qJ(X)
The alternate clause
qo( i a IX])+- ql(X) } qi([a IX])+- q J+l(X)
qk([a IX])+- Qz(X)
�--+--���
Proposition 6.7 Let P1 and P2 be any DRLP's such that II(P1) � II(P2). Suppose that
CHp1(q)
=CHp2(q)
for anyq
E II(P1). Then, for anyMo
� Bc0 andq([x])
E Bcp1,q([x])
EI(Mo, CHpJ
if and only ifq([x])
EJ(M0, CHp2).
The predicate characterization constructed by the algorithm changes with increase of its domain, the set of predicate symbols in hypotheses. This leads to the change of the extended model with the predicate characterization. However, from the way of constructing a predi
cate characterization, it is clear that the change of the predicate characterization caused by modifying hypotheses satisfies the premises of Proposition 6. 7. Hence the change of the ex
tended model with the predicate characterization satisfies the conclusion of Proposition 6.7.
Thus, once a ground atom is true
(
false)
in an extended model, the atom is still true(
false)
in subsequent models.
In Algorithm 6.2, we do not consider the atom of the form
qi([])
as an input for contradiction_backtracing. Because, at any time on the inference process, there is no case in which
qi(O)
is in M(P) but not inJ(M0, CHp )
. The ground atomqi([])
is in M(P) if and only if there exists a unit clauseqi([])
in P. The clauseqi([])
is added to P after the procedure search_clause is called on the inputqi([]).
Let P' be the hypothesis for which the procedure call is occurred. Thenqi([])
is inJ(J\;!0, Cflp,).
By Proposition 6.7,qi([])
isensured to be in
I(M0, CHp)
for any subsequent hy pothesisP.
Hence, there is no case in whichqi([])
is inM( P )
but not inJ(M0, CHp ) .
Here we show the correctness of the sub-procedures contradiction_backtracing and uncovered_atom.
Lemma 6.8
Suppose that the procedure
contradiction_backtracingis called with the in
put proof tree of a ground atom
a EM(P) such that
a¢ I(M0, CHp )
.Then the procedure returns a clause in P which is not true in I ( M0, C H p).
Proof: Suppose that the procedure contradiction_backtracing given a proof tree of a ground atom
qi([ax])
returns a clauseqi([a!X])
�q1(X).
Then it is ensured thatqi([ax]) ¢ I(M0, CHp)
butq1([x])
EJ(M0, CHp )
. Hence the clause is ensured to be false in the extended modelI(M0, CHp )
. Furthermore, sinceq1([x])
is the child ofqi([ax])
in the input proof tree onP,
the clauseqi([aiX])
�q1(X)
exists inP.
On the other hand, every input proof tree has the leaf
qk(O)
for someqk(O)
EP.
Fromthe discussion above, it is ensured that
qk([])
EJ(M0, CHp )
. Since the input proof tree of each recursive call clear the input condition, a clause which is false inJ(M0, CHp)
must befound eventually.
D
Lemma 6.9
Suppose that the procedure
uncovered_atomis called with an input f3
EM0 such that f3 ¢ M(P). Then the procedure returns some ground atom /3'
EJ(M0, CHp) such that f3' is not covered by any clause in P with respect to I(M0, CHp ) .
Proof: Since the procedure examines if
q1(X)B
EJ(M0, CHp)
before calling itself recursively, every input
q1(X)B
for its recursive call is ensured to be inI(M0, CHp )
. On the other hand, if an input for its recursive call is inM(P),
then the input has a proof tree onP.
This implies that all ancestors of the input have also proof trees on
P.
This contradicts thatf3 ¢ M(P),
that is, the initial input of the procedure has no proof tree onP.
Thus every input for its recursive call is not inJvf(P).
Since
qi([])
is not unifiable with anyqi([a!X]),
the procedure called with an input of the formqi([])
returns the input directly. Sinceqi([]) ¢ lvf(P),
it holds thatqi([]) rt P.
For anyclause in a DRLP, a ground atom
qi([])
is covered only by the clauseqi([]).
Thus, if theprocedure is called with the input
qi([]),
then it immediately follows thatqi([])
is not covered by any clause inP
with respect toJ(M0, CHp )
.For an input of the form
qi([ax]),
if there is no clause whose head is unifiable withqi([ax]),
it is clear thatqi([ax])
is not covered by any clause inP.
Since a clause of the formqi([ajX]) +--- qj+1(X)
is introduced into a hypothesis after the clauseqi([ajX]) +---%(X)
is removed from the hypothesis, there is at most one clause whose head is unifiable with
qi([ax]).
Thus ifqi([ax])
is not covered such a clause, there is no other clause which can coverqi([ax])
with respect toJ(M0, CHp )
. Thus, if the procedure make an output, then the output is ensured to satisfy the claim of the lemma.On the other hand, the size of each input
qj([X])B
for the recursive call is decreasing one by one. Thus, even if in the worst case, the procedure will encounter an input of the formqi([])
and terminate. This completes the proof of the lemma.D
Next, we show the justification of the way of constructing a predicate characterization.
We can restate Proposition 6. 7 as follows.
Proposition 6.10 Let
P1
andP2
be any DRLP's such thatIT(P1) � II(P2).
Suppose thatCHp1 (q)
=CHp2(q)
for anyq
EII(P1).
Then, for anyMo �
B.c0 andC
EP1, C
is complete inI(M0, CHpJ
if and only ifC
is complete inI(M0, CHp2).
Lemma 6.11 Let
P
be a DRLP. SupposeCHp(qj)
=CHp(qi)·a,
wherea
E I: andqi, qj
EIT(P).
Then, for anyM0 �
B.c0, the clauseqi([ajX]) +--- qj(X)
is complete inI(M0, CHp )
. Proof: By the definition of the extended model, for anyx
E I:*, it follows thatqi([ax])
EJ(Mo, CHp)
iffqo([CHp(qi)·ax])
EMo
iff
qo([CHp(qj) ·x])
EMo
iff
qj([x])
EJ(M0, CHp )
.Hence, from Proposition 6.4,
qi([a!X]) +--- qj(X)
is complete inJ(M0, CHp )
.D
Theorem 6.12 The predicate characterization