Efficient Algorithms on the Moore Family Associated to an Implicational System

Karell Bertet¹ and Mirabelle Nebut²

¹ L3I, Université de La Rochelle, Pôle Sciences et Technologies, av. Michel Crépeau, 17042 La Rochelle Cédex 1, France. karell.bertet@univ-lr.fr

² LIFL, Université des Sciences et Technologies de Lille, Bâtiment M3, 59655 Villeneuve d'Ascq Cédex, France. mirabelle.nebut@lifl.fr

Received Jun 7, 2002; revised Jun 13, 2002, Jun 28, 2003; accepted Dec 18, 2003.

An implication system (IS) Σ on a finite set S is a set of rules called Σ-implications of the kind A →_Σ B, with A, B ⊆ S. A subset X ⊆ S satisfies A →_Σ B when "A ⊆ X implies B ⊆ X" holds, so ISs can be used to describe constraints on sets of elements, such as dependency or causality. ISs are formally closely linked to the well-known notions of closure operators and Moore families. This paper focuses on their algorithmic aspects. A number of problems issued from an IS Σ (e.g. is it minimal, is a given implication entailed by the system) can be reduced to the computation of closures φ_Σ(X), where φ_Σ is the closure operator associated to Σ. We propose a new approach to compute such closures, based on the characterization of the direct-optimal IS Σ_do, which has the following properties: 1. it is equivalent to Σ; 2. φ_{Σ_do}(X) (thus φ_Σ(X)) can be computed by a single scanning of Σ_do-implications; 3. it is of minimal size with respect to ISs satisfying 1. and 2. We give algorithms that compute Σ_do, and from Σ_do the closures φ_Σ(X) and the Moore family associated to φ_Σ.

Keywords: Moore family, implicational system, closure operator, algorithm, lattice.

1 Introduction

As recalled in [CM04], the basic mathematical notion of a closure operator (an isotone, extensive and idempotent map φ) defined on a poset (P, ≤) is fundamental in a number of fields linked to computer science, in particular when defined on the lattice (2^S, ⊆) of all subsets of a finite set S. In this case, closure operators are closely linked to the notion of Moore family, a family F ⊆ 2^S which contains S and is closed under intersection (see [CM04] for more details). The notions of closure operator and Moore family both involve the concept of logical implication or entailment, used for instance in knowledge systems or relational data-bases (these fields handle systems of implications, called for example functional dependencies in relational data-bases [MR92, Mai83], and association rules in data-mining [PBTL99]). Hence the notion of Implicational System (IS for short) defined in [CM04], to which this paper is dedicated.

Formally, an IS on S, denoted by Σ ⊆ 2^S × 2^S, is a set of rules called Σ-implications of the kind A →_Σ B, with A, B ⊆ S. A subset X ⊆ S satisfies an implication A →_Σ B when "A ⊆ X implies B ⊆ X". So ISs

1365–8050 © 2004 Discrete Mathematics and Theoretical Computer Science (DMTCS), Nancy, France


can be used to easily describe constraints between sets of elements, such as dependency or causality. Let us give here an intuitive example, which will also be used in the core of the paper (see Ex. 1 in Sect. 3).

Assume that S = {a, b, c, d, e} is a set of events. The IS Σ = {a → b, ac → d, e → a} can be interpreted as "if a (resp. e) occurs then so does b (resp. a), and if a and c occur then so does d".

Given such a system, several types of questions arise. A common problem is to find a minimum "full" system of implications, from which all implications between elements can be obtained. Another very natural issue is for instance the question "is it possible that a and e occur and not c?". One can answer using either the implicational Moore family associated to Σ (F_Σ contains all subsets X ⊆ S that satisfy each Σ-implication) or the closure operator φ_{F_Σ} associated to F_Σ (φ_{F_Σ} maps a subset X ⊆ S to the least element F ∈ F_Σ s.t. X ⊆ F). In our example the answer is "yes" because abe ∈ F_Σ and c ∉ abe, or because c ∉ φ_{F_Σ}(ae) = abe. Answering questions about a system using the closure φ_{F_Σ}(X) has a great advantage: it avoids the construction of the whole Moore family (which contains 14 elements in our example). Moreover φ_{F_Σ} can also be used to compute F_Σ efficiently, whereas the direct definition-based generation relies upon an exponential enumeration of all subsets of S. Note that data-mining has to deal with a reverse problem, addressing the efficient generation of association rules from a family of closures called itemsets [PBTL99].

The properties of implicational Moore families and ISs have been studied in [GD86, Wil94, Wil95, CM04] from a theoretical point of view. This paper focuses on algorithmic issues. Following the intuition given before, it is based on the efficient computation of φ_{F_Σ}(X). As detailed in the core of the paper, this computation was addressed in several ways in [Mai83, MR92, Wil95]: φ_{F_Σ}(X) is obtained by several enumerations of the implications of Σ. For instance, in the previous example the computation of φ_{F_Σ}(ae) = abe is performed by scanning the Σ-implications once (first and third implications), but the computation of φ_{F_Σ}(ce) = abcde is performed by scanning them twice: the first enumeration brings ace ⊆ φ_{F_Σ}(ce) (third implication) and the second one brings bd ⊆ φ_{F_Σ}(ce) (first and second implications).

The new approach we propose is based on two fundamental algorithmic observations: 1. the computation of φ_{F_Σ}(X) is more efficient when Σ is optimal, where optimal means "of minimal size"; 2. the number of enumerations of Σ-implications needed to compute φ_{F_Σ}(X) can be reduced to 1 when Σ is direct.

Let us illustrate it on our example. The IS Σ_d = Σ ∪ {e → b, ce → d} is direct and equivalent to Σ (it is easy to check that φ_{F_{Σ_d}}(ce) can now be computed by a single scanning of the Σ_d-implications). It is not optimal. The IS Σ_o = {e → ab, ac → d, a → b, ce → d} is similarly equivalent to Σ and direct, but also direct-optimal in the sense that there exists no equivalent direct IS of smaller size (though Σ_o ⊈ Σ_d). Our approach thus consists in computing φ_{F_Σ}(X) (hence F_Σ) by exploiting the directness and optimality properties: we define the direct-optimal IS Σ_do generated from Σ. First Σ is completed by some implications into the direct IS Σ_d, then Σ_d is modified into the optimal IS Σ_do (Σ, Σ_d and Σ_do being equivalent).

The paper is organized as follows: Sect. 2 gives notations and standard definitions. Section 3 first gives some preliminaries on the computation of φ_{F_Σ}(X) (Sect. 3.1), then defines the notion of direct IS and characterizes the direct IS Σ_d generated from Σ (Sect. 3.2). In the same way, Sect. 3.3 defines the notion of direct-optimal IS and characterizes the direct-optimal IS Σ_o generated from a direct IS Σ. By combination of these two definitions, we naturally obtain the direct-optimal IS Σ_do generated from a given IS Σ (Sect. 3.4).

Section 4 deals with algorithmic aspects of the above result. We first describe an efficient data structure

We abuse notations and write ac for {a, c}.

At this stage the reader should admit the following recipe: initialize φ_{F_Σ}(X) with X, then iteratively scan the Σ-implications until stabilization, doing: if A → B ∈ Σ and A ⊆ φ_{F_Σ}(X) then add B to φ_{F_Σ}(X).


introduced in [Gan84, HN96, NR99] and called the lexicographic tree, traditionally used to represent families and extended here to represent ISs (Sect. 4.1). We then give an algorithm to compute the closure φ_Σ(X) from a direct-optimal IS (Sect. 4.2), and an algorithm to compute the direct-optimal IS Σ_do generated from some IS Σ, where Σ and Σ_do are represented by a lexicographic tree. We finally propose an algorithm to generate F_Σ (Sect. 4.3), based on properties of the lattice (F_Σ, ⊆).

2 Definitions and Notations

Let us consider a finite set of elements S. A family F on S is a set of subsets of S: F ⊆ 2^S. A Moore family F on S is a family stable by intersection and which contains S: S ∈ F, and F_1, F_2 ∈ F implies F_1 ∩ F_2 ∈ F. The poset (F, ⊆) is a lattice with, for each F_1, F_2 ∈ F, F_1 ∧ F_2 = F_1 ∩ F_2 and F_1 ∨ F_2 = ∩{F ∈ F | F_1 ∪ F_2 ⊆ F}

(recall that a lattice is an order relation (i.e. reflexive, antisymmetric and transitive) over a set of elements such that any pair x, y of elements has a join (i.e. a least upper bound) denoted by x ∨ y, and a meet (i.e. a greatest lower bound) denoted by x ∧ y).

Let X, X′ be subsets of S. A closure operator φ on S is a map on 2^S which is isotone (X ⊆ X′ implies φ(X) ⊆ φ(X′)), extensive (X ⊆ φ(X)) and idempotent (φ²(X) = φ(X)). φ(X) is called the closure of X by φ. X is said to be closed by φ whenever it is a fixed point of φ, i.e. φ(X) = X.

The set of all Moore families and the set of all closure operators on S are in a one-to-one correspondence. The Moore family F_φ associated to the closure operator φ is the set of all closed elements of φ:

F_φ = {F ⊆ S | F = φ(F)}   (1)

The closure operator φ_F associated to the Moore family F is such that, for any X ⊆ S, φ_F(X) is the least element F ∈ F that contains X:

φ_F(X) = ∩{F ∈ F | X ⊆ F}   (2)

In particular φ_F(∅) = ⊥_F. Note that φ_F(X) ∈ F because Moore families are closed under intersection. Moreover, for all F_1, F_2 ∈ F, F_1 ∨ F_2 = φ_F(F_1 ∪ F_2) and F_1 ∧ F_2 = φ_F(F_1 ∩ F_2) = F_1 ∩ F_2.

Let A, B be subsets of S. An Implicational System (IS for short) Σ on S is a binary relation on 2^S: Σ ⊆ 2^S × 2^S. A couple (A, B) ∈ Σ is called a Σ-implication, whose premise is A and conclusion is B. It is written A →_Σ B or A → B (meaning "A implies B"). The family F_Σ on S associated to Σ is:

F_Σ = {X ⊆ S | A ⊆ X implies B ⊆ X for each A → B ∈ Σ}   (3)

i.e. it is the set of sets X ⊆ S such that "X contains A implies X contains B". F_Σ is clearly a Moore family, called the implicational Moore family on S associated to Σ. Several ISs can describe the same Moore family: Σ and Σ′ on S are equivalent if F_Σ = F_Σ′. The problem is to find the smallest ones, according to various criteria [Wil94]. Σ is non-redundant if Σ \ {X → Y} is not equivalent to Σ, for all X → Y in Σ. It is minimum if |Σ| ≤ |Σ′| for all ISs Σ′ equivalent to Σ. Σ is optimal if s(Σ) ≤ s(Σ′) for all ISs Σ′ equivalent to Σ, where s(Σ) is the size of Σ defined by:

s(Σ) = Σ_{A→B ∈ Σ} (|A| + |B|)   (4)

Other definitions not recalled here can be found in the survey of Caspard and Monjardet [CM04].

In the following, S is endowed with a total order <_α, or simply α. A subset X = {x_{j_1}, x_{j_2}, ..., x_{j_n}} is viewed as the word x_{j_1} x_{j_2} ... x_{j_n} sorted according to α: x_{j_1} <_α x_{j_2} <_α ··· <_α x_{j_n}. Σ is an IS on S, F_Σ or F is the Moore family associated to Σ, and φ_{F_Σ}, or φ_Σ, or simply φ, is the induced closure operator.


3 Characterization of φ_Σ from Σ

As explained in the introduction, a number of problems related to an IS Σ can be answered by computing closures of the kind φ_Σ(X), for some X ⊆ S. Section 3.1 presents important notions used further and introduces our method: the idea is to perform the computation of φ_Σ(X) not on Σ but on another, equivalent IS which makes the computation more efficient. Section 3.2 defines such a convenient and equivalent IS, called direct. Section 3.3 characterizes the smallest equivalent direct IS inferred from a direct one, called direct-optimal. Finally, Sect. 3.4 characterizes the direct-optimal IS equivalent to some IS Σ.

3.1 Preliminaries

A direct and naive computation of φ_Σ (or simply φ) follows from equations (2) and (3):

φ(X) = ∩{X′ ⊆ S | X ⊆ X′ and A ⊆ X′ implies B ⊆ X′ for each A →_Σ B}   (5)

It requires an enumeration of all subsets X′ such that X ⊆ X′ ⊆ S, plus a test on the premise and conclusion of each implication. Moreover, these enumerations must be done for each particular X under consideration.

[Wil94, Wil95] propose a definition of φ(X) which induces a more efficient computation:

φ(X) = ((X^Σ)^Σ)^…^Σ   (6)

where

X^Σ = X ∪ ⋃{B | A ⊆ X and A →_Σ B}   (7)

According to [Wil95], φ(X) is in this way obtained in O(|S|²|Σ|) by iteratively scanning the Σ-implications: φ(X) is initialized with X, then increased with B for each implication A →_Σ B such that φ(X) contains A. The computation cost depends on the number of iterations, in any case bounded by |S|. In order to practically limit this number (keeping the same complexity), [Wil95] tunes algorithms using additional data structures.
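This iterative scanning is easy to render as code. The sketch below is our own illustration (the names and the data representation are ours, not from [Wil95]): implications are pairs of frozensets, and the loop repeats full scans until a scan adds nothing, which happens after at most |S| scans.

```python
def closure(sigma, x):
    """Closure of x under the IS sigma, following Eqs. (6)-(7):
    repeatedly scan the implications, adding the conclusion of any
    implication whose premise lies in the current result, until a
    whole scan brings no new element."""
    result = set(x)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in sigma:
            if premise <= result and not conclusion <= result:
                result |= conclusion
                changed = True
    return result

# The running example: a -> b, ac -> d, e -> a.
sigma = [(frozenset('a'), frozenset('b')),
         (frozenset('ac'), frozenset('d')),
         (frozenset('e'), frozenset('a'))]
assert closure(sigma, {'c', 'e'}) == set('abcde')   # two productive scans
assert closure(sigma, {'a', 'e'}) == set('abe')     # one scan suffices
```

The two assertions reproduce the closures φ(ce) and φ(ae) discussed in the introduction.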

It is worth noting that for some particular ISs the computation of φ requires only one iteration. Such an IS is called direct (one also finds iteration-free in [Wil94]):

Definition 1 An IS Σ is direct if, for all X ⊆ S:

φ(X) = X^Σ = X ∪ ⋃{B | A ⊆ X and A →_Σ B}   (8)

Instead of tuning algorithms applied to some IS Σ, a possible approach is to infer from Σ an equivalent and direct IS Σ′. Once this is done, each closure φ(X) can be computed by simply enumerating the Σ′-implications. As an illustration, let us consider full ISs, which are a classical type of direct ISs.

According to [CM04] (Def. 49 p. 20), a full IS is a preorder (a reflexive and transitive relation) that contains the preorder ⊇ on 2^S × 2^S and is ∪-stable, that is, it verifies the property:

for all A, B, C, D ⊆ S, A → B and C → D imply A∪C → B∪D.

As stated by Prop. 1, a full IS is direct.


Proposition 1 (Corollary 53 in [CM04]) For Σ a full IS,

φ(X) = ⋃{B ⊆ S | X →_Σ B} = X^Σ

Starting from the notion of full IS, and given some IS Σ, we define the full IS Σ_f inferred from Σ, equivalent to Σ (Prop. 2) and direct (Prop. 1): it contains all Σ-implications, all implications due to inclusions in 2^S × 2^S, and all implications generated by Σ-implications and inclusions.

Definition 2 The full IS Σ_f inferred from Σ is defined as the smallest§ IS s.t.:

1. Σ ⊆ Σ_f and

2. Σ_f verifies the three following properties, for all A, B, C, D ⊆ S:

P1 (inclusion axiom): B ⊆ A implies A →_{Σ_f} B;

P2 (transitivity axiom): A →_{Σ_f} B and B →_{Σ_f} C imply A →_{Σ_f} C;

P3 (union axiom): A →_{Σ_f} B and C →_{Σ_f} D imply A∪C →_{Σ_f} B∪D.

Proposition 2 Σ and Σ_f are equivalent.

For completeness, we give the proof of this simple result.

Proof: Let us prove that F_Σ = F_{Σ_f}.

⊇. Immediate since Σ ⊆ Σ_f.

⊆. Consider F ∈ F_Σ. It is easy to check by induction that F satisfies "A ⊆ F implies B ⊆ F" for any A →_{Σ_f} B induced by P1, P2 and P3. □

Using Σ_f, one can compute a closure φ_Σ(X) in only one iteration. Nevertheless, note that the directness of Σ_f is due to the fact that any subset A ⊆ S appears as the premise of a Σ_f-implication: this makes the computation of Σ_f exponential, thus impracticable. The idea is then to look for smaller ISs, not necessarily full, but still direct and equivalent to Σ (and Σ_f). The smallest such IS is called direct-optimal.

Definition 3 An IS Σ is direct-optimal if it is direct, and if s(Σ) ≤ s(Σ′) for any direct IS Σ′ equivalent to Σ.

Our approach can be summarized as follows. Given some IS Σ:

• We start from the three axioms that describe a full IS (cf. Def. 2) to define in Sect. 3.2 the direct IS Σ_d inferred from Σ, whose directness is stated by Th. 1;

• Consider Σ direct but perhaps not optimal: in this case some Σ-implications can be removed or simplified, while preserving the directness and semantics of Σ. In Sect. 3.3 we first formally characterize direct-optimal ISs (Th. 2) then, given a direct IS Σ, we define the direct-optimal IS Σ_o inferred from Σ.

By combination of these two results, we obtain the definition of the direct-optimal IS Σ_do inferred from some IS Σ. Moreover, we state that equivalent ISs define a unique direct-optimal IS (Corollary 1). Closures φ_Σ(X) can then be computed by only one enumeration of Σ_do-implications, at a minimal cost.

§ "Smallest" for the preorder ⊆.

"Minimal" in the sense that using any other equivalent direct IS would be less efficient. Nevertheless, in cases where few closures are needed, or where a small non-direct IS is considered, it may be more efficient to iterate Σ-enumerations instead of computing Σ_d then Σ_do.


3.2 Σ_d: a Direct IS Generated from an IS Σ

In this section we define an IS smaller than Σ_f, but still direct and equivalent to Σ. To do so, let us consider again the three axioms that characterize Σ_f (Def. 2), and let us explain which Σ_f-implications can be removed without altering the directness and semantics of the IS, or dually which implications must necessarily be added to Σ. We consider the computation of φ(X) indicated by (6), for X ⊆ S.

Given a pair of implications (I_1, I_2) present in the IS under construction, the principle is to "summarize" via a third implication the result of the iterative computation of φ(X) applied to (I_1, I_2). Axioms P2 and P3 do apply this principle. Nevertheless, the inferred implications (including those inferred by P1) are sometimes clearly redundant with properties particular to the closure operator φ. This is the case when no iterative process is needed, because X contains both implication premises:

1. Assume A ⊆ X. The implication A →_{Σ_f} B stated by P1 is redundant: it causes the explicit enrichment of φ(X) with B while, according to Eq. (7) (and due to the extensiveness of φ), we have φ(X) ⊇ X and X ⊇ A ⊇ B.

2. Assume A, B ⊆ X. The implication A →_{Σ_f} C stated by P2 is redundant with B →_{Σ_f} C, which already states the enrichment of φ(X) with C.

3. Assume A, C ⊆ X. Similarly, the implication A∪C →_{Σ_f} B∪D stated by P3 is redundant with A →_{Σ_f} B and C →_{Σ_f} D.

When an iterative process is required to compute φ(X), implications inferred by a combination of the three axioms are necessary. For example, let us consider the following IS Σ:

{ac →_Σ d, e →_Σ a}

Assume X = ce. The computation of φ(X) = acde through Σ requires an iterative process: the fact d ∈ φ(X) is known from the first implication only once the intermediate φ(X) has been enriched with a (second implication). To be direct, the IS must contain the implication

ce →_Σ d

obtained by applying successively:

• P1 to infer the implication c → c;

• P3 applied to c → c and e →_Σ a to infer ce → ac;

• P2 applied to ce → ac and ac →_Σ d to infer ce →_Σ d.

Nevertheless, implications c → c and ce → ac are redundant with others, as explained below. To avoid this redundancy, let us consider the pair

{A →_Σ B, C →_Σ D}   (9)

in the case where the computation of φ(X) requires an iteration: A ⊆ X but C ⊈ X. Because A ⊆ X, the first implication adds B to φ(X). Now if C ⊆ X∪B, the second implication adds D to φ(X). Since "A ⊆ X and C ⊆ X∪B" is equivalent to "A ∪ (C\B) ⊆ X", we can summarize this reasoning by implication (10):

A ∪ (C\B) →_Σ D   (10)


In the previous example, ce → d is indeed obtained from the pair {e →_Σ a, ac →_Σ d}.

Note that implication (10) is redundant with C →_Σ D when B ∩ C = ∅, since it then yields A∪C → D. This case does not happen here due to the condition C ⊈ X (C ⊈ X and C ⊆ X∪B imply C ∩ B ≠ ∅): we enforce it by imposing B ∩ C ≠ ∅ as the application condition of the rule.

The rule that infers implication (10) from the implications (9) (called overlap axiom in Def. 4) encompasses the combination of axioms P2 and P3, but also P1: the goal of P1 is mainly to make any subset A ⊆ S appear as the premise of a Σ_f-implication, in order to compute φ(X) by Prop. 1. Since we compute φ(X) by Equations (6) and (7) instead, we can drop P1.

The definition of the direct IS inferred from Σ now follows directly from what precedes:

Definition 4 The direct implicational system Σ_d generated from Σ is defined as the smallest IS s.t.:

1. Σ ⊆ Σ_d and

2. Σ_d verifies the following property:

P4 (overlap axiom): for all A, B, C, D ⊆ S: A →_{Σ_d} B, C →_{Σ_d} D and B ∩ C ≠ ∅ imply A ∪ (C\B) →_{Σ_d} D.

We now adapt Prop. 1 to characterize φ from Σ_d.

Theorem 1 φ(X) = X^{Σ_d} = X ∪ ⋃{B | A ⊆ X and A →_{Σ_d} B}
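To make Def. 4 and Th. 1 concrete, the saturation under P4 and the resulting single-scan closure can be sketched as follows. This is our own illustration (representation and names are ours); the naive fixpoint pays no attention to complexity.

```python
def direct_is(sigma):
    """Saturate sigma under the overlap axiom P4 of Def. 4: from
    A -> B and C -> D with B and C overlapping, add A | (C - B) -> D.
    Implications are pairs of frozensets; naive fixpoint loop."""
    sigma_d = set(sigma)
    changed = True
    while changed:
        changed = False
        for a, b in list(sigma_d):
            for c, d in list(sigma_d):
                new = (a | (c - b), d)
                if b & c and new not in sigma_d:
                    sigma_d.add(new)
                    changed = True
    return sigma_d

def single_scan_closure(sigma, x):
    """X^Sigma of Th. 1: one scan of the implications, premises
    being tested against X itself, not against the growing result."""
    x = frozenset(x)
    result = set(x)
    for a, b in sigma:
        if a <= x:
            result |= b
    return result

# Running example: Sigma = {a -> b, ac -> d, e -> a}.
sigma = {(frozenset('a'), frozenset('b')),
         (frozenset('ac'), frozenset('d')),
         (frozenset('e'), frozenset('a'))}
sigma_d = direct_is(sigma)
# P4 adds exactly e -> b and ce -> d, as announced in the introduction.
assert sigma_d == sigma | {(frozenset('e'), frozenset('b')),
                           (frozenset('ce'), frozenset('d'))}
assert single_scan_closure(sigma_d, 'ce') == set('abcde')
```

On Σ itself a single scan of ce only yields ace, which illustrates that Σ is not direct while Σ_d is.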

Two lemmas are needed to prove this theorem. Lemma 1 states that Σ_d ⊆ Σ_f, therefore that Σ_d is equivalent to Σ, since Σ ⊆ Σ_d and Σ_f is equivalent to Σ. Lemma 2 is the core of the proof: it states that Σ_d contains all "significant" Σ_f-implications. By "significant" we mean an implication A →_{Σ_f} B s.t. B ⊈ A, so that it can add B\A to some φ(X) and is not trivially redundant like the implications inferred by P1 in Def. 2.

Lemma 2 states that any such Σ_f-implication A →_{Σ_f} B is imitated by a set of Σ_d-implications, where a Σ_d-implication is associated to each y ∈ B\A.

Lemma 1 Σ_d ⊆ Σ_f

Proof: Let {X_i → Y_i}_{1≤i≤p} be the implications successively added to Σ_d in order to complete Σ by application of P4. We define Σ_0 = Σ, Σ_i = Σ_{i−1} ∪ {X_i → Y_i} and Σ_p = Σ_d. The proof is by induction on i, with 0 ≤ i ≤ p. The base case is obtained by definition and Def. 2: Σ_0 = Σ ⊆ Σ_f.

Inductive step: for i ≥ 1, let us prove that Σ_{i−1} ⊆ Σ_f implies Σ_i ⊆ Σ_f, equivalently that X_i → Y_i ∈ Σ_f. Since X_i → Y_i is added to Σ_{i−1} by application of P4, there exist A →_{Σ_{i−1}} B and C →_{Σ_{i−1}} D such that B ∩ C ≠ ∅, X_i = A ∪ (C\B) and Y_i = D. By the induction hypothesis, A → B ∈ Σ_f and C → D ∈ Σ_f. Then:

• From A → B ∈ Σ_f (by hypothesis) and B → B∩C ∈ Σ_f (by P1) we deduce from P2 that A → B∩C ∈ Σ_f.

• From A → B∩C ∈ Σ_f and C\B → C\B ∈ Σ_f (by P1) we deduce from P3 that A ∪ (C\B) → (B∩C) ∪ (C\B) = C ∈ Σ_f.

• From A ∪ (C\B) → C ∈ Σ_f and C → D ∈ Σ_f (by hypothesis) we deduce from P2 that A ∪ (C\B) → D ∈ Σ_f.

Therefore X_i → Y_i ∈ Σ_f and the proof is achieved. □


Lemma 2 For all X →_{Σ_f} Y and y ∈ Y\X, there exists X′ →_{Σ_d} Y′ such that X′ ⊆ X and y ∈ Y′.

Proof: Let {X_i → Y_i}_{1≤i≤p} be the implications successively added to Σ in order to complete it into Σ_f by application of P1, P2 or P3. We define Σ_0 = Σ, Σ_i = Σ_{i−1} ∪ {X_i → Y_i} and Σ_p = Σ_f. The proof is by induction on i, with 0 ≤ i ≤ p.

Base case: since Σ_0 = Σ and Σ ⊆ Σ_d, for all X →_{Σ_0} Y and y ∈ Y\X, the implication X →_{Σ_d} Y verifies X ⊆ X and y ∈ Y.

Inductive step: for i ≥ 1, assume the property is proved for Σ_{i−1}, i.e. for all X →_{Σ_{i−1}} Y and y ∈ Y\X, there exists X′ →_{Σ_d} Y′ such that X′ ⊆ X and y ∈ Y′. We consider the implication X_i → Y_i and some element y ∈ Y_i\X_i, and show that:

there exists X′_i →_{Σ_d} Y′_i s.t. X′_i ⊆ X_i and y ∈ Y′_i   (11)

If Y_i ⊆ X_i then Y_i\X_i = ∅ and (11) is trivially satisfied. Assume Y_i ⊈ X_i; then X_i → Y_i has been added to Σ_{i−1} by the application of P2 or P3 (since applying P1 implies that Y_i ⊆ X_i, which contradicts the hypothesis). Let us consider successively the application of P3 and P2.

⋆ Case P3: there exist A →_{Σ_{i−1}} B and C →_{Σ_{i−1}} D such that X_i = A∪C and Y_i = B∪D; moreover y ∈ (B∪D)\(A∪C). We may assume that y ∈ B, the case y ∈ D being dual. Then y ∈ B\A and, since A → B ∈ Σ_{i−1}, by the induction hypothesis there exists A′ →_{Σ_d} B′ such that A′ ⊆ A and y ∈ B′. Since A′ ⊆ A ⊆ A∪C = X_i, A′ →_{Σ_d} B′ satisfies (11).

⋆ Case P2: there exist A →_{Σ_{i−1}} B and B →_{Σ_{i−1}} C such that X_i = A, Y_i = C and y ∈ C\A. Let us consider the two sub-cases y ∈ B and y ∉ B.

• y ∈ B implies y ∈ B\A and, since A → B ∈ Σ_{i−1}, by the induction hypothesis there exists A′ →_{Σ_d} B′ such that A′ ⊆ A and y ∈ B′. Since A′ ⊆ A = X_i, A′ →_{Σ_d} B′ satisfies (11).

• y ∉ B implies y ∈ C\B and, since B → C ∈ Σ_{i−1}, by the induction hypothesis there exists B′ →_{Σ_d} C′ such that B′ ⊆ B and y ∈ C′. If B′ ⊆ A = X_i then B′ →_{Σ_d} C′ satisfies (11). If B′ ⊈ A, let us write

B′\A = {y_k}_{1≤k≤q}

Since B′ ⊆ B: y_k ∈ B\A and, since A → B ∈ Σ_{i−1}, by the induction hypothesis there exist q implications A_k →_{Σ_d} B_k such that A_k ⊆ A and y_k ∈ B_k. Therefore:

B′\A ⊆ ⋃_{1≤k≤q} B_k   and   B′ ⊆ A ∪ ⋃_{1≤k≤q} B_k

Axiom P4 is now used to build an implication whose premise is included in A and whose conclusion is C′, so that it verifies (11) since X_i = A and y ∈ C′. This implication is the last element of a sequence of q implications A′_k →_{Σ_d} C′ obtained by applying P4 iteratively to the implications A_k →_{Σ_d} B_k:

– initialization: we define A′_1 →_{Σ_d} C′ as the result of P4 applied to A_1 →_{Σ_d} B_1 and B′ →_{Σ_d} C′ (note that y_1 ∈ B_1 ∩ B′, so B_1 ∩ B′ ≠ ∅ and P4 can be applied), so A′_1 = A_1 ∪ (B′\B_1);

– induction: for 1 < k ≤ q, we define A′_k →_{Σ_d} C′ as:

  · the result of P4 applied to A_k →_{Σ_d} B_k and A′_{k−1} →_{Σ_d} C′ if B_k ∩ A′_{k−1} ≠ ∅, so A′_k = A_k ∪ (A′_{k−1}\B_k);

  · A′_{k−1} →_{Σ_d} C′ otherwise, so A′_k = A′_{k−1}.

To prove that A′_q ⊆ A, let us prove by induction on k ∈ [1, q] that:

A′_k ⊆ A ∪ ⋃_{k<j≤q} B_j

– initialization: for k = 1, we have A′_1 = A_1 ∪ (B′\B_1), so A′_1 ⊆ A ∪ ⋃_{1<j≤q} B_j directly follows from B′ ⊆ A ∪ ⋃_{1≤k≤q} B_k and A_1 ⊆ A.

– induction step: for k > 1, the induction hypothesis is

A′_{k−1} ⊆ A ∪ ⋃_{k−1<j≤q} B_j

moreover the computation of A′_k depends on the emptiness of B_k ∩ A′_{k−1}.

  · If B_k ∩ A′_{k−1} = ∅, then A′_{k−1} ⊆ (A ∪ ⋃_{k−1<j≤q} B_j)\B_k; moreover A′_k = A′_{k−1}. So we directly obtain A′_k ⊆ A ∪ ⋃_{k<j≤q} B_j.

  · If A′_{k−1} ∩ B_k ≠ ∅, then A′_{k−1}\B_k ⊆ A ∪ ⋃_{k<j≤q} B_j. Moreover A′_k = A_k ∪ (A′_{k−1}\B_k) and, since A_k ⊆ A, we also obtain A′_k ⊆ A ∪ ⋃_{k<j≤q} B_j.

We finally obtain

A′_q ⊆ A ∪ ⋃_{q<j≤q} B_j = A

and A′_q →_{Σ_d} C′ satisfies (11) (since A = X_i and y ∈ C′). Thus the property is proved. □

We can now prove Theorem 1.

Proof of Theorem 1: Since Σ and Σ_f are stated equivalent by Prop. 2, proving φ_Σ(X) = X^{Σ_d} for X ⊆ S is equivalent to proving φ_{Σ_f}(X) = X^{Σ_d}, where from Prop. 1 and Def. 1:

φ_{Σ_f}(X) = X^{Σ_f} = ⋃{B ⊆ S | X →_{Σ_f} B}   (12)

X^{Σ_d} = X ∪ ⋃{B ⊆ S | A →_{Σ_d} B and A ⊆ X}   (13)

⊇. Using Eq. (7), X^{Σ_f} = X ∪ ⋃{B ⊆ S | A →_{Σ_f} B and A ⊆ X}. Then X^{Σ_d} ⊆ X^{Σ_f} directly follows from Σ_d ⊆ Σ_f, stated by Lemma 1.

⊆. Consider any b ∈ X^{Σ_f}. If b ∈ X then b ∈ X^{Σ_d} by (13). Assume b ∉ X. Since b ∈ X^{Σ_f}, there exists by (12) X →_{Σ_f} B such that b ∈ B. b ∉ X implies b ∈ B\X and, by Lemma 2, there exists A′ →_{Σ_d} B′ such that A′ ⊆ X and b ∈ B′. So b ∈ X^{Σ_d} and X^{Σ_f} ⊆ X^{Σ_d}. □


3.3 Σ_o: a Direct-Optimal IS Generated from a Direct IS Σ

Let us consider a direct IS Σ. If Σ is not direct-optimal then there exists an equivalent direct IS of smaller size. As in Sect. 3.2, this means that some premise or conclusion parts of Σ are redundant with some properties particular to closure operators. This redundancy can be suppressed without altering the directness property. Let us consider the computation of φ(X) for X ⊆ S and an implication A →_Σ B.

1. Assume A ⊆ X and A ∩ B ≠ ∅. A →_Σ B causes the explicit enrichment of φ(X) with B = (A∩B) ∪ (B\A). The A∩B part is redundant with the isotony and extensiveness of φ, from which we have A ⊆ φ(A) ⊆ φ(X) (moreover A∩B ⊆ A). So only the part B\A of the conclusion of A →_Σ B is useful.

2. Assume C → D ∈ Σ with C ⊂ A, B ∩ D ≠ ∅ and A ⊆ X. Since C ⊆ X, C →_Σ D causes the explicit enrichment of φ(X) with D = (B∩D) ∪ (D\B). The part B∩D is similarly redundant with A →_Σ B, which already states the enrichment of φ(X) with B = (B∩D) ∪ (B\D).

3. Assume A → B′ ∈ Σ, with B ≠ B′. Then the cardinality |A| is added twice to the size of Σ, while it is only added once if the pair {A →_Σ B, A →_Σ B′} — in a way redundant — is replaced by the equivalent implication A → B∪B′.

4. Assume A ⊆ X and B = ∅. A →_Σ B is clearly useless to compute φ(X).

Theorem 2 generalizes these remarks: it states that the absence of such redundancies is a necessary and sufficient condition for a direct IS to be direct-optimal.

Theorem 2 A direct IS Σ is direct-optimal iff:

P5 (extensiveness axiom): for all A →_Σ B, A ∩ B = ∅;

P6 (isotony axiom): for all A →_Σ B and C →_Σ D, C ⊂ A implies B ∩ D = ∅;

P7 (premise axiom): for all A →_Σ B and A →_Σ B′, B = B′;

P8 (not empty conclusion axiom): for all A →_Σ B, B ≠ ∅.

Two lemmas are needed to prove this theorem. Lemma 3 states that the deletion of the previously mentioned redundancies preserves the directness of the considered IS. In Lemma 4 we consider the particular direct ISs whose conclusions are singletons. Such an IS Σ does not necessarily verify P7, but Lemma 4 states that if Σ verifies P5 and P6 then Σ is smaller‖ than any other equivalent such IS (whose conclusions are also singletons).

Lemma 3 Let Σ be a direct IS.

1. If A → B ∈ Σ with A ∩ B ≠ ∅ then Σ \ {A →_Σ B} ∪ {A → B\A} is also a direct IS equivalent to Σ, of smaller size.

2. If A → B ∈ Σ and C → D ∈ Σ with C ⊂ A and B ∩ D ≠ ∅ then Σ \ {A →_Σ B} ∪ {A → B\D} is also a direct IS equivalent to Σ, of smaller size.

‖ In the sense of inclusion.


3. If A → B ∈ Σ and A → B′ ∈ Σ with B ≠ B′ then Σ \ {A →_Σ B, A →_Σ B′} ∪ {A → B∪B′} is also a direct IS equivalent to Σ, of smaller size.

4. If A → B ∈ Σ with B = ∅ then Σ \ {A →_Σ B} is also a direct IS equivalent to Σ, of smaller size.

Proof:

1. Let A →_Σ B be such that A ∩ B ≠ ∅. Let us denote by Σ′ the IS Σ \ {A →_Σ B} ∪ {A → B\A}. Let us consider X ⊆ S and prove that Σ′ is a direct IS equivalent to Σ by stating X^{Σ′} = X^Σ. When A ⊈ X, the implications involved in the computation of X^Σ and X^{Σ′} are the same, thus X^{Σ′} = X^Σ. When A ⊆ X, X^{Σ′} is obtained as follows:

X^{Σ′} = X ∪ ⋃{B′ | A′ ⊆ X, A′ →_{Σ′} B′}
= X ∪ ⋃{B′ | A′ ⊆ X, A′ →_{Σ′} B′ ≠ A →_{Σ′} B\A} ∪ (B\A)
= X ∪ ⋃{B′ | A′ ⊆ X, A′ →_{Σ′} B′ ≠ A →_{Σ′} B\A} ∪ B   since A ⊆ X, so X ∪ (B\A) = X ∪ B
= X ∪ ⋃{B′ | A′ ⊆ X, A′ →_Σ B′ ≠ A →_Σ B} ∪ B   by definition of Σ′
= X ∪ ⋃{B′ | A′ ⊆ X, A′ →_Σ B′}
= X^Σ

2. The proof is the same for A →_Σ B and C →_Σ D such that C ⊂ A and B ∩ D ≠ ∅. Let us denote by Σ′ the IS Σ \ {A →_Σ B} ∪ {A → B\D}. Stating X^{Σ′} = X^Σ allows to conclude that Σ′ is a direct IS equivalent to Σ. In this case, C →_Σ D ∈ Σ implies D ⊆ X^{Σ′} when C ⊂ A ⊆ X.

3. The proof is the same for A →_Σ B and A →_Σ B′ such that B ≠ B′.

4. Immediate, since the implication A →_Σ ∅ adds no element to closures. □

Lemma 4 Let Σ and Σ′ be two equivalent and direct ISs whose conclusions are singletons. If Σ verifies P5 and P6 then Σ ⊆ Σ′.

Proof: Let A → B be a Σ-implication. By hypothesis, the conclusion B contains only one element, say b. Since Σ′ only owns implications whose conclusions are singletons, let us prove that Σ ⊆ Σ′ by stating that A → b is also a Σ′-implication.

Let us consider φ_{Σ′}(A), the closure of A in Σ′, as a union of subsets:

φ_{Σ′}(A) = A ∪ ⋃{D | C ⊆ A, C →_{Σ′} D}
= A ∪ ⋃{B′ | A →_{Σ′} B′} ∪ ⋃{D | C ⊂ A, C →_{Σ′} D}

Similarly, φ_Σ(A) = A ∪ ⋃{B′ | A →_Σ B′} ∪ ⋃{D | C ⊂ A, C →_Σ D}. Since Σ and Σ′ are equivalent, φ_Σ(X) = φ_{Σ′}(X) for any X ⊆ S. In particular, since A → B ∈ Σ and B = {b}: b ∈ φ_Σ(A) and b ∈ φ_{Σ′}(A).

Since Σ verifies P5, we deduce from A → b ∈ Σ that A ∩ {b} = ∅ and b ∉ A. Since Σ verifies P6, {b} ∩ D = ∅ for any implication C →_Σ D such that C ⊂ A. So b ∉ ⋃{D | C ⊂ A, C →_Σ D}. Since b ∉ A, we also have b ∉ C and b ∉ φ_Σ(C) = C ∪ ⋃{D | C →_Σ D} for each C ⊂ A. Since φ_Σ(C) = φ_{Σ′}(C), b ∉ ⋃{D | C ⊂ A, C →_{Σ′} D}.

Therefore, the only subset of φ_{Σ′}(A) containing b is ⋃{B′ | A →_{Σ′} B′}, and A →_{Σ′} b is a Σ′-implication. This achieves the proof. □

We can now prove Theorem 2.

Proof of Theorem 2:

⇒) By Lemma 3, Σ is not direct-optimal when:

1. there exists A →_Σ B such that A ∩ B ≠ ∅, or

2. there exist A →_Σ B and C →_Σ D such that C ⊂ A and B ∩ D ≠ ∅, or

3. there exist A →_Σ B and A →_Σ B′ such that B ≠ B′, or

4. there exists A →_Σ B such that B = ∅.

⇐) Let us introduce s(Σ|A), the size of the IS Σ reduced to its Σ-implications of premise A ⊆ S. Note that

s(Σ) = Σ_{A⊆S} s(Σ|A)   (14)

Let Σ be an IS verifying P5, P6, P7 and P8, and let Σ′ be a direct IS equivalent to Σ. To prove that Σ is direct-optimal we have to show that s(Σ) ≤ s(Σ′). To do so, we use (14) and prove the stronger property:

∀A ⊆ S, s(Σ|A) ≤ s(Σ′|A)   (15)

Let us consider a set A ⊆ S. If there is no Σ-implication of premise A, then we have s(Σ|A) = 0 ≤ s(Σ′|A). If there is a Σ-implication A → B, where B = {b_1, b_2, ..., b_n}, then it is the only Σ-implication of premise A by P7, and n > 0 by P8. Let A →_{Σ′} B_1, ..., A →_{Σ′} B_m be the m Σ′-implications whose premise is A, with∗∗ m ≥ 0, and let p be the total cardinality of their conclusions:

p = 0 if m = 0;   p = Σ_{1≤i≤m} |B_i| if m > 0.

Then:

s(Σ|A) = |A| + n
s(Σ′|A) = m|A| + p

In order to compare s(Σ|A) and s(Σ′|A), let us define from Σ another IS Σ̄ whose conclusions are singletons:

Σ̄ = ⋃{C → d_1, ..., C → d_p | C → {d_1, ..., d_p} ∈ Σ, C ⊆ S}   (16)

Σ̄ is direct and equivalent to Σ by Lemma 3(3). It also verifies P5 and P6. Let Σ̄′ be defined from Σ′ in the same way. Σ̄ contains n > 0 implications of premise A: A →_{Σ̄} b_1, ..., A →_{Σ̄} b_n; and Σ̄′ contains p ≥ 0 implications of premise A (whose conclusions are also singletons). So we have:

s(Σ̄|A) = n(|A| + 1)
s(Σ̄′|A) = p(|A| + 1)

∗∗ Note that m = 0 when there is no Σ′-implication of premise A.


Since Σ̄ verifies P5 and P6, and the conclusions of Σ̄ and Σ̄′ are of cardinality 1, Lemma 4 states that Σ̄ ⊆ Σ̄′. Then

s(Σ̄′|A) ≥ s(Σ̄|A)
p(|A| + 1) ≥ n(|A| + 1)

Therefore p ≥ n > 0. Remark that p > 0 implies p = Σ_{1≤i≤m} |B_i| and m > 0. We finally obtain s(Σ′|A) ≥ s(Σ|A) by:

s(Σ′|A) = m|A| + p ≥ m|A| + n ≥ |A| + n = s(Σ|A)   □

We can now derive from Th. 2 the direct-optimal IS Σ_o generated from a direct IS Σ:

Definition 5 The direct-optimal IS Σ_o generated from a direct IS Σ is the direct IS s.t.:

P9 (optimization axiom): for all A, B ⊆ S, A → B ∈ Σ_o iff B ≠ ∅ and

B = ⋃{B′ ⊆ S | A →_Σ B′} \ ⋃{D ⊆ S | C →_Σ D and C ⊂ A} \ A   (17)
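Definition 5 admits a direct reading as code. The sketch below is our own rendering (names are ours): implications are pairs of frozensets, and Eq. (17) is applied premise by premise, with C ⊂ A read as proper inclusion.

```python
def optimize(sigma):
    """Direct-optimal IS of Def. 5 from a direct IS sigma: for each
    premise A, gather the union of its conclusions (one implication
    per premise, as in P7), remove what the implications with premise
    strictly included in A already provide, remove A itself
    (Eq. (17)), and drop empty conclusions."""
    sigma_o = set()
    for a in {a for a, _ in sigma}:
        b = set()
        for c, d in sigma:
            if c == a:
                b |= d          # union of the conclusions of premise A
        for c, d in sigma:
            if c < a:           # proper inclusion C in A
                b -= d          # already provided by C -> D
        b -= a                  # remove A itself
        if b:
            sigma_o.add((a, frozenset(b)))
    return sigma_o

# Sigma_d of the running example: a -> b, ac -> d, e -> a, e -> b, ce -> d.
sigma_d = {(frozenset('a'), frozenset('b')),
           (frozenset('ac'), frozenset('d')),
           (frozenset('e'), frozenset('a')),
           (frozenset('e'), frozenset('b')),
           (frozenset('ce'), frozenset('d'))}
assert optimize(sigma_d) == {(frozenset('a'), frozenset('b')),
                             (frozenset('ac'), frozenset('d')),
                             (frozenset('e'), frozenset('ab')),
                             (frozenset('ce'), frozenset('d'))}
```

The assertion reproduces the Σ_do of the introduction: e → a and e → b merge into e → ab, and ce → d loses nothing since a and b are already provided by the implications of premise e ⊂ ce.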

3.4 Σ_do: a Direct-Optimal IS Generated from an IS Σ

Let us consider an IS Σ. The combination of Def. 4 and Def. 5 describes Σ_do, the direct-optimal IS generated from Σ:

Definition 6 The direct-optimal IS Σ_do generated from some IS Σ is defined as the direct-optimal IS obtained by Def. 5 from the direct IS Σ_d, which itself is obtained by Def. 4 from Σ.

Σ_do is then an IS of minimal size, equivalent to Σ, and such that φ_Σ(X) can be obtained by scanning its implications only once (see Ex. 1). Moreover, equivalent ISs define a unique direct-optimal IS, as stated by the following corollary.

Corollary 1 Let Σ and Σ′ be equivalent ISs. Then Σ_do = Σ′_do.

Proof: Let us define Σ̄ from Σ_do and Σ̄′ from Σ′_do as indicated by Eq. (16). Remark that Σ_do (resp. Σ′_do) can dually be defined from Σ̄ (resp. Σ̄′) since it satisfies axiom P7:

Σ_do = {C → {d_1, ..., d_n} | C →_{Σ̄} d_1, ..., C →_{Σ̄} d_n, C ⊆ S}   (18)

Σ̄ (resp. Σ̄′) is direct and equivalent to Σ_do (resp. Σ′_do) by Lemma 3(3). By construction Σ̄ and Σ̄′ satisfy P5 and P6. So by Lemma 4, Σ̄ = Σ̄′. We conclude using Eq. (18):

Σ_do = {C → {d_1, ..., d_n} | C →_{Σ̄} d_1, ..., C →_{Σ̄} d_n, C ⊆ S}
= {C → {d_1, ..., d_n} | C →_{Σ̄′} d_1, ..., C →_{Σ̄′} d_n, C ⊆ S}
= Σ′_do   □


Fig. 1: F_Σ for Σ given in Ex. 1 (Hasse diagram with 14 elements: ∅, b, c, d, ab, bc, bd, cd, abd, abe, bcd, abcd, abde, abcde)

Example 1 Consider the following IS Σ on {a,b,c,d,e}:

Σ = { 1: a → b,  2: ac → d,  3: e → a }

The full IS Σ_f is not given since it contains more than 3^5 = 243 implications††. Σ_d and Σ_do are given below. Note that Σ_d is not direct-optimal because (3) and (4) do not verify P7.

Σ_d = { 1: a → b,  2: ac → d,  3: e → a,  4: e → b (P4 on 3 and 1),  5: ce → d (P4 on 3 and 2) }

Σ_do = { 6: a → b,  7: ac → d,  8: e → ab,  9: ce → d }

For example, φ(ce) = ce ∪ ab ∪ d = abcde is directly deduced from implications 8 and 9 by Th. 1. Similarly, φ(ae) = ae ∪ b ∪ ab = abe is deduced from implications 6 and 8. It is also easy to check on F_Σ (given on Fig. 1 by its Hasse diagram, where the cover relation of the order is oriented from bottom to top) that abcde (resp. abe) is the least set of F_Σ that contains ce (resp. ae).
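The family of Fig. 1 can be cross-checked by brute force directly from Eq. (3) — precisely the exponential generation that Sect. 4.3 is designed to avoid. A small sketch (our own code, meant for small examples only):

```python
from itertools import chain, combinations

def moore_family(sigma, s):
    """Brute-force F_Sigma of Eq. (3): all subsets X of S that satisfy
    every implication "A included in X implies B included in X".
    Exponential in |S|; only for cross-checking small examples."""
    subsets = chain.from_iterable(combinations(sorted(s), r)
                                  for r in range(len(s) + 1))
    return {frozenset(x) for x in subsets
            if all(not a <= set(x) or b <= set(x) for a, b in sigma)}

sigma = [(frozenset('a'), frozenset('b')),
         (frozenset('ac'), frozenset('d')),
         (frozenset('e'), frozenset('a'))]
family = moore_family(sigma, set('abcde'))
assert len(family) == 14            # the 14 elements of Fig. 1
assert frozenset('abe') in family
assert frozenset('ace') not in family
```

This confirms in particular the answer of the introduction: abe belongs to F_Σ while no closed set equal to ace exists.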

4 Algorithms

We give in this section algorithms that rely on the results obtained in Sect. 3. We first present in Sect. 4.2 an algorithm which takes as input a direct-optimal IS Σ and a subset X ⊆ S, and computes the closure ϕΣ(X). We also give an algorithm which computes from any IS Σ the associated direct-optimal IS Σdo. In Sect. 4.3 we give an algorithm which takes as input some IS Σ and computes the associated Moore family FΣ, based not on the direct characterization of FΣ but on properties of the lattice (FΣ, ⊆). All algorithms handle ISs and Moore families on S. Both are represented by a data structure called a lexicographic tree, presented in Sect. 4.1.

4.1 Lexicographic Tree

A nice and well-known data structure to represent a family F on S ordered by α, a total order on S, is its lexicographic tree of depth |S|. Using this tree, basic operations on F (such as deletion, addition and search of a subset) can be performed efficiently in O(|S|). Introduced for a distributive Moore family in [Gan84, MN96], it has been generalized in [NR99] to any family F by introducing marked nodes. Its principle is intuitively the following. Nodes represent subsets X ⊆ S: the tree contains a node for each prefix X (w.r.t. α) of some F ∈ F. Conventionally the root represents the empty set. A node that represents an element of F is marked. Edges are labelled by elements of S so that labels of edges that leave a given node are sorted according to α from left to right. Moreover, consider a marked node n that represents an element F ∈ F sorted according to α. Then (see Prop. 3 below) F can be retrieved from the tree by collecting labels along the path from the root to n (labels along such a path are by construction sorted according to α).

†† Σf exactly contains 275 implications: 3^5 = 243 implications such that the conclusion is a subset of the premise, and 32 implications such that the conclusion is not included in the premise.

Fig. 2: The lexicographic tree associated to the Moore family F given on Fig. 1 for the order α = a < b < c < d < e.

Example 2 Figure 2 shows the lexicographic tree TF associated to the Moore family F given in Fig. 1 for the order α = a < b < c < d < e, where each node n is labelled by the set X it represents (0 denotes ∅) and marked nodes are doubly circled.

A lexicographic tree is formally defined as follows:

Definition 7 Let F be a family on S = {s1, . . . , s|S|} whose elements are sorted according to α = s1 < s2 < . . . < s|S|. The lexicographic tree TF of F (or simply T) is a 3-tuple (N, child, mark), where:

N is the set of nodes of T, where a node nX ∈ N is associated to every prefix X (w.r.t. α) of some element F ∈ F. By convention n∅ is the root of the tree:

N = {nX | X prefix of F and F ∈ F}
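To make the definition concrete, here is a small Python sketch of such a tree (the class and method names are ours, not the paper's): `child` maps edge labels to subtrees and `mark` flags the nodes that represent members of F. Both insertion and membership test walk a single root-to-leaf path, hence the O(|S|) cost mentioned above.

```python
class LexTree:
    """Sketch of the lexicographic tree (a trie) for a family F on S,
    with S ordered by a total order alpha given as a sequence."""
    def __init__(self):
        self.child = {}    # edge label in S -> subtree
        self.mark = False  # True iff this node represents some F in F

    def _path(self, subset, alpha):
        # visit the elements of the subset in increasing alpha order
        return sorted(subset, key=alpha.index)

    def add(self, subset, alpha):
        """Insert a member F of the family in O(|S|)."""
        node = self
        for s in self._path(subset, alpha):
            node = node.child.setdefault(s, LexTree())
        node.mark = True

    def contains(self, subset, alpha):
        """Membership test in O(|S|): follow the path, check the mark."""
        node = self
        for s in self._path(subset, alpha):
            if s not in node.child:
                return False
            node = node.child[s]
        return node.mark
```

Inserting the 14 sets of Fig. 1 with α = a < b < c < d < e produces the marked paths of Fig. 2; an unmarked node such as n_abc exists only as a prefix of abcd.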
