• 検索結果がありません。

The binomial ideal of the intersection axiom for conditional probabilities

N/A
N/A
Protected

Academic year: 2022

シェア "The binomial ideal of the intersection axiom for conditional probabilities"

Copied!
9
0
0

読み込み中.... (全文を見る)

全文

(1)

DOI 10.1007/s10801-010-0253-5

The binomial ideal of the intersection axiom for conditional probabilities

Alex Fink

Received: 10 February 2009 / Accepted: 27 August 2010 / Published online: 17 September 2010

© The Author(s) 2010. This article is published with open access at Springerlink.com

Abstract The binomial ideal associated with the intersection axiom of conditional probability is shown to be radical and is expressed as an intersection of toric prime ideals. This solves a problem in algebraic statistics posed by Cartwright and Eng- ström.

Keywords Conditional independence·Intersection axiom

Conditional independence constraints are a family of natural constraints on proba- bility distributions, describing situations in which two random variables are indepen- dently distributed given knowledge of a third. Statistical models built around con- siderations of conditional independence, in particular graphical models in which the constraints are encoded in a graph on the random variables, enjoy wide applicability in determining relationships among random variables in statistics and in dealing with uncertainty in artificial intelligence.

One can take a purely combinatorial perspective on the study of conditional inde- pendence, as does Studený [10], conceiving of it as a relation on triples of subsets of a set of observables which must satisfy certain axioms. A number of elementary implications among conditional independence statements are recognized as axioms.

Among these are the semi-graphoid axioms, which are implications of conditional independence statements lacking further hypotheses, and hence are purely combina- torial statements. The intersection axiom is also often added to the collection, but unlike the semi-graphoid axioms it is not uniformly true; it is our subject here.

Formally, a conditional independence modelMis a set of probability distribu- tions characterized by satisfying several conditional independence constraints. We

A. Fink (

)

Department of Mathematics, North Carolina State University, Raleigh, NC, USA e-mail:[email protected]

(2)

will work in the discrete setting, where a probability distributionp is a multi-way table of probabilities, and we follow the notational conventions in [1].

Consider the discrete conditional independence modelMgiven by {X1⊥⊥X2|X3, X1⊥⊥X3|X2}

whereXi is a random variable taking values in the set[ri] = {1, . . . , ri}. Throughout we assumer1≥2. Letpij k be the unknown probabilityP (X1=i, X2=j, X3=k) in a distribution from the modelM. The set of distributions in the modelMis the variety whose defining idealIMS=C[pij k]is

IM=

pij kpijkpijkpij k: i, i∈ [r1], j, j∈ [r2], k∈ [r3] +

pij kpij kpij kpij k: i, i∈ [r1], j∈ [r2], k, k∈ [r3] .

The intersection axiom is the implication whose premises are the statements ofM and whose conclusion isX1⊥⊥(X2, X3). To be true, this implication requires the fur- ther hypothesis that the distributionpis in the interior of the probability simplex, i.e.

that no individual probabilitypij kis zero. It is thus a natural question to ask what can be inferred about distributionsp which may lie on the boundary of the probability simplex. In algebraic terms, we are asking for the (set-theoretic) components of the varietyV (IM).

A problem posed by Dustin Cartwright and Alexander Engström appears in Sect. 6.6.3 of [1], giving a conjectural description of the associated primes ofIMin terms of certain subgraphs of a complete bipartite graph. Our main theorem resolves this conjecture in the positive, and gives stronger information, namely the primary decomposition ofIM.

In the course of this project the author computed primary decompositions ofIM for various values ofr1,r2, andr3with the computer algebra system Singular [4,5].

Thomas Kahle has recently written dedicated Macaulay2 code [3] for binomial pri- mary decompositions [7], in which the same computations may be carried out.

A broad generalization of this paper’s results to the class of binomial edge ideals of graphs has been obtained by Herzog, Hibi, Hreinsdóttir, Kahle, and Rauh [6]. The r=2 case ofIMis treated, with a different term order, in their Sect. 4.

LetKp,q be the complete bipartite graph with bipartitioned vertex set[p] [q].

Given a subgraph Gof Kr2,r3 with edge set Edges(G), the primePG to which it corresponds is defined to be

PG=PG(0)+PG(1) where

PG(0)=

pij k: i∈ [r1], (j, k) /∈Edges(G) , PG(1)=

pij kpijkpijkpij k: i, i∈ [r1]; and

j, j∈ [r2], k, k∈ [r3]are in the same connected component ofG .

(3)

Note thatj need not be distinct fromj, nork fromk. We will also want to refer to the individual summandsPC(1) of PG(1), where PC(1) includes only the generators {pij k: (j, k)C}arising from edges in the connected componentCofG. Then

PG=PG(0)+

C

PC(1), (1)

whereCruns over connected components ofG.

We say that a subgraphGofKr2,r3 is admissible ifGhas vertex set[r2] [r3] and all connected components ofGare isomorphic to some complete bipartite graph Kp,q withp, q≥1.

Let ≺dp be the revlex term order on S over the lexicographic variable order on subscripts, with earlier subscripts more significant. Thus under ≺dp, we have p111dpp112dpp211.

Theorem 1 The primary decomposition IM=

G

PG (2)

holds and is an irredundant decomposition, where the union is over admissible graphsGon[r2] [r3]. We also have

indpIM=

G

indpPG.

Each indpPGis squarefree, so indpIMand henceIMare radical ideals.

In particular, the value ofr1is irrelevant to the combinatorial nature of the primary decomposition.

Corollary 2 (Conjecture, Cartwright–Engström) The set of minimal primes of the idealIMis

PG: Gan admissible graph on[r2] [r3] .

Remark 3 This corollary amounts to the set-theoretic identity V (IM)=

Gadmissible

V (PG).

Points(pij k)on the varietyV (PG)are characterized by the conditions thatpij k=0 for (j, k) /∈Edges(G), and that for any two edges (j, k) and (j, k)in the same connected component ofG, the vectors(p·j k)and(p·jk)inCr1are proportional.

The core ideas of a proof of Corollary2are present in [1, Sect. 6.6.4]. That discus- sion focuses on the primePKr

2,r3, corresponding to the locus where the conclusion of the intersection axiom is valid, but it extends without great difficulty to anyPG.

(4)

It is noted in [1, §6.6] that the numberη(p, q)of admissible graphsGon[p][q] is given by the generating function

exp

ex−1

ey−1

=

p,q≥0

η(p, q)xpyq

p!q!, (3)

which in that reference is said to follow from manipulations of Stirling numbers.

This equation (3) can also be obtained as a direct consequence of a bivariate form of the exponential formula for exponential generating functions [9, §5.1], using the observation that

ex−1 ey−1

=

p,q1

xpyq p!q!

is the exponential generating function for complete bipartite graphs withp, q≥1, and these are the possible connected components of admissible graphs.

We now review some standard facts on binomial and toric ideals [2]. Let I be a binomial ideal in C[x1, . . . , xn], generated by binomials of the form xvxw with v, w∈Nn. There is a lattice LIZn such that the localization Ix1···xn ⊆ C[x1±1, . . . , xn±1] has the form (xv−1: vLI), provided that this localization is a proper ideal, i.e.Icontains no monomial. IfφI:Zn→Zmis aZ-linear map whose kernel containsLI, thenφI provides a multigrading with respect to whichI is ho- mogeneous. (If kerφI=LIexactly thenφI is said to compute the minimal sufficient statistics for the statistical model associated toI.)

The condition that a multivariate Laurent polynomialf ∈C[x1±1, . . . , xn±1]lies inIx1···xncan be expressed in terms of a graphΓ, whose vertices areZnand whose edge set is{(v, w): xvxwis a Laurent monomial multiple of a generator ofI}; in the statistical context these edges are known as moves. To wit,f is inIx1···xn if and only if, for each connected componentC ofΓ, the sum of the coefficients on all monomialsxvwithvC is zero. In particularIx1···xn is determined by the partition ofZn into connected components ofΓ. Note that this partition refines the partition ofZninto fibers ofφI, for any mapφI as in the last paragraph. If we are concerned with membership inI rather thanIx1···xn, analogues of everything in this paragraph are true if we substituteNnforZn and use ordinary monomials rather than Laurent monomials in defining the edges. We will denote the resulting graph onNnbyΓ (I ), and its induced subgraph on a subsetF⊆NnbyΓF(I ).

Any prime binomial idealI is equal to the toric idealIAof a lattice point configu- rationA, whereIAis the kernel of the monomial map whose monomials are the points ofA. Sturmfels shows in [8] that the radicals of the monomial initial ideals of IA are exactly the Stanley–Reisner ideals of regular triangulations ofA. The Stanley–

Reisner idealIΔ of a simplicial complexΔon a vertex setT is the monomial ideal ofC[xt:tT]generated by the products of variablesxt1· · ·xtkfor which{t1, . . . , tk} is not a face ofΔ. Primary decompositions of Stanley–Reisner ideals are easily de- scribed:IΔis the intersection of the ideals(xt: t /F )over all facetsF ofΔ.

Sturmfels treats explicitly the 2×2 determinantal ideal of anr×smatrix, which is the toric idealIA for A the set of vertices of the productΔr1×Δs1 of two simplices.

(5)

Theorem 4 (Sturmfels [8]) LetI be the ideal of 2×2 minors of anr×s matrix of indeterminatesY =(yij). For any term order≺, inI is a squarefree monomial ideal.

Remark 5 If≺is the revlex term order on theyij, set up analogously to≺dp, thenΔ has a pleasant description [8]: it is the staircase triangulation. The facets ofΔare the setsπ of entries of the matrixY which form maximal (“staircase”) paths throughY starting at the upper left corner, taking steps right and down, and terminating at the lower right corner. Note that staircase paths are maximal subsets of indeterminates not including bothyij andyij for anyi < iandj < j. Thus the associated primes of inI are generated by minimal sets of variablesyij which include at least one of yij andyij wheneveri < iandj < j.

The significance of the idealsPC(1) of connected components comes from the fol- lowing fact.

Fact 6 IfG is an admissible graph, then (1) expressesPG as a sum of primes in disjoint sets of variables.

Indeed, PG(0) is the irrelevant ideal in thepij k with(j, k) /∈Edges(G), and for each connected componentCofGwiths left andtright vertices,PC(1)is the 2×2 determinantal ideal of ther1×st matrix of indeterminates (pij k)where the row indices arei∈ [r1]and the column indices(j, k)∈EdgesC. The irrelevant ideal can mostly be ignored, and so this fact reduces many of our considerations to handling 2×2 determinantal ideals. (Note that the hypothesis thatGbe admissible is needed, since otherwisePG(1)includes variables corresponding to nonedges ofG. We could amend the definition ofPGto salvage Fact6, but we would lose the also important fact that the summands are determinantal.)

For a first immediate application, by Theorem4the indpPGare squarefree mono- mial ideals, implying the radicality claim of Theorem1.

For a second, we recover the primary decomposition of inPG for an arbitrary admissible graphG. Let the connected components ofGbeC1, . . . , Cl. Fact6im- plies that inPG=inPG(0)+ iinPC(1)

i . It then also gives us that if inPC(1)

i =

jQCi,j are primary decompositions of the inPC(1)

i , then inPG=

j

PG(0)+

l i=1

inQCi,ji

is a primary decomposition ofPG, where j=(j1, . . . , jl)ranges over the Cartesian product of the index sets in

jQC,j.

Lemma 7 Let G and G be subgraphs of Kr2,r3. Then PGPG if and only if Edges(G)⊆Edges(G)and every connected component of G is a union of con- nected componentsC1, . . . , ClofGsuch that at most oneCicontains more than one vertex.

(6)

In particular, for any subgraphG of Kr2r3 there exists an admissible graph G such thatPGPG. Such aGcan be constructed fromGas follows: add toGone new edge incident to each of its isolated vertices, and then complete each connected component of the new graph to a bipartite complete graph.

Proof First suppose the consequence fails. Then either (1) Gcontains an edge not inG, or

(2) some connected component ofGis not contained in a single connected compo- nent ofG, or

(3) a connected component ofG contains two connected components ofG both larger than one vertex.

In case (1), let(j, k)be an edge ofGnot inG. Thenp1j kPG, butp1j k/PG, since Remark3describes points inV (PG)withp1j k=0. Case (2) implies case (1). And in case (3), let(j, k)and(j, k)be edges ofGin different connected components there but in the same connected component ofG. Thenp1j kp2jkp1jkp2j k is in PG but notPG, again using Remark3.

Suppose instead the consequence holds. The generators ofPG(0) are inPG, since nonedges ofGare nonedges ofG. The generators ofPG(1) are also inPG. For every pair of edges(j, k),(j, k)in a connected component ofG, either all their endpoints are in the same component ofGor one of their endpoints is isolated: in the former casepij kpijkpij kpijk is inPG(1), in the latter case inPG(0). Proof of Theorem1 We first check that the right side of (2) is an irredundant primary decomposition. LetGbe an admissible graph. SincePGis a sum of primes in disjoint variables by Fact6, it is prime. Irredundance of (2) is the assertion that forGandG distinct admissible graphs,PGis not contained inPG. This follows directly from the definition of admissibility and Lemma7.

So we must prove the intersection statement (2). Let≺be≺dp, and writeI=IM. It is apparent that

IPG (4)

for eachG(without using admissibility). Indeed, given a generatorf ofI, without loss of generality of the formf =pij kpijkpijkpij k, either both edges(j, k)and (j, k)lie in Edges(G), in which casef is a generator ofPG(1), or one of these edges is not in Edges(G), in which casefPG(0). Therefore the containments

inI⊆in

G

PG

G

inPG

hold, the intersections still being over admissibleG. It now suffices to show an equal- ity of Hilbert functions

H (S/inI )=H

S/

G

inPG

, (5)

forcing these containments to be equalities.

(7)

The latticeLI associated to ourI is generated by all vectors of the formeij k+ eijkeijkeij k andeij k+eij keij keij k. The map φ=φI :Zr1r2r3 → Zr1+r2r3 sending(uij k)to

(j,k)

u1j k, . . . ,

(j,k)

ur1j k,

i

ui11, . . . ,

i

uir2r3

has kernel containingLI. Therefore we obtain aZr1+r2r3-valued multigrading onS, degφpij k=(ei, ej k), in whichI is homogeneous. The degφmultigrading refines the standard grading. We will prove that (5) holds in this stronger context ofφ-graded Hilbert functions.

Letd∈Zr1+r2r3 be the multidegree of some monomial, and write its components asdi fori∈ [r1]anddj k for(j, k)∈ [r2] × [r3]. LetG(d)be the bipartite graph with vertex set[r2] [r3]and edge set{(j, k):dj k=0}. We now prove the following two claims:

Claim 1 Id=(PG(d))d. Claim 2 (

GadmissibleinPG)d=(inPG(d))d. These claims imply

H (inI )(d)=H (I )(d)=H (PG(d))(d)=H (inPG(d))(d)

=H

G

inPG

(d).

Thence we conclude that (5) holds, proving Theorem1.

Proof of Claim 1 Observe first that no polynomial homogeneous of multidegreedcan be divisible by anypij kwith(j, k) /∈Edges(G(d)). Accordingly we have(PG(d))d= (PG(d)(1) )d, and we will work withPG(d)(1) .

SinceI andPG(d)(1) are binomial ideals generated by differences of monomials, it will suffice to show that the two graphsΓF(I )andΓF(PG(d)(1) )of moves on the fiber F=φ1(d)have the same partition into connected components. It is clear thatΓF(I ) is a subgraph ofΓF(PG(d)(1) ), since containment (4) impliesId(PG(d))d=(PG(d)(1) )d. So given an edge ofΓF(PG(d)(1) ), say with endpointsu, uF, we must show that this edge is contained in a connected component ofΓF(I ), i.e. thatpupuI. We haveu=u+ei0j0k0+eijkei0jkeij0k0 for somei0, i∈ [r1]and some two edges(j0, k0), (j, k)of G(d)in the same component. Let(jm, km)m=0,..., be the edges of a path inG(d)between these, so that for each 0≤m < eitherjm=jm+1

orkm=km+1. Assume the(jm, km)are pairwise distinct. For each 1≤m−1, let imbe such thatpimjmkm dividespu. Such animmust exist becausedjmkmis positive.

Then

(8)

(pi0j0k0pijkpi0jkpij0k0)pi1j1k1· · ·pi1j1k1

=

1

m=0

pi1j0k0· · ·pimjm1km1gii0jmkm

m+1jm+1km+1pim+2jm+2km+2· · ·pijk

2

m=0

pi1j0k0· · ·pimjm1km1giijmkm

m+1jm+1km+1

×pim+2jm+2km+2· · ·pi1j1k1pi0jk

is inI, where to save spacegiij kjkdenotes the generatorpij kpijkpijkpij k ofI. The binomialpupu is a monomial multiple of this binomial, sopupuI. Proof of Claim 2 There is an admissible graph G such that PGPG(d), by Lemma7. Then inPG⊆inPG(d), which implies

GadmissibleinPG⊆inPG(d), one of the containments of the claim.

For the other containment, suppose pu is a monomial of multidegree d be- longing to inPG(d). By Remark5,pu is divisible by some pijkpij k for i < i in[r1]and(j, k), (j, k)two edges lying in the same connected component ofG(d) with (j, k) < (j, k) lexicographically. Now let G be any admissible graph. If G(d)is not a subset ofG, thenpu is divisible by some indeterminatepijk with (j, k) /E(G), so pu∈inPG. OtherwiseG(d)G, in which case the edges (j, k)and(j, k)lie in the same component ofG, sopijkpij k∈inPG, implying pu∈inPG. Therefore inPG(d)

GadmissibleinPG.

Observe finally that Claim 1 alone would suffice for the radicality in Theorem1, supposing we already knew thePGto be the associated primes. For radicality it suf- fices thatI contain the intersection of its minimal primes, and this follows using Claim 1 one multidegree at a time, since the multidegreedpart of this intersection is contained in(PG(d))d.

Acknowledgements We thank Thomas Kahle for discussions, and Bernd Sturmfels and a patient referee for careful readings and for several helpful suggestions.

Open Access This article is distributed under the terms of the Creative Commons Attribution Noncom- mercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

References

1. Drton, M., Sturmfels, B., Sullivant, S.: Lectures on Algebraic Statistics. Oberwolfach Seminars, vol. 39. Springer, Berlin (2009)

2. Eisenbud, D., Sturmfels, B.: Binomial ideals. Duke Math. J. 84, 1–45 (1996)

3. Grayson, D.R., Stillman, M.: Macaulay2, a software system for research in algebraic geometry. Avail- able athttp://www.math.uiuc.edu/Macaulay2/

4. Greuel, G.-M., Pfister, G., Schönemann, H.: SINGULAR3.0—a computer algebra system for poly- nomial computations. In: Kerber, M., Kohlhase, M. (eds.) Symbolic Computation and Automated Reasoning, The Calculemus-2000 Symposium, pp. 227–233 (2001)

(9)

5. Greuel, G.-M., Pfister, G.:primdec.lib, a SINGULAR3.0 library for computing the primary de- composition and radical of ideals (2005)

6. Herzog, J., Hibi, T., Hreinsdóttir, F., Kahle, T., Rauh, J.: Binomial edge ideals and conditional inde- pendence statements. arXiv:0909.4717

7. Kahle, T.: Binomials.m2, code for binomial primary decomposition in Macaulay2. http://

personal-homepages.mis.mpg.de/kahle/bpd/index.html

8. Sturmfels, B.: Gröbner bases of toric varieties. T¯ohoku Math. J. 43, 249–261 (1991)

9. Stanley, R.P.: Enumerative Combinatorics, vol. 2. Cambridge Studies in Advanced Mathematics, vol. 62. Cambridge University Press, Cambridge (1997)

10. Studený, M.: Probabilistic Conditional Independence Structures, Information Science and Statistics.

Springer, New York (2005)

参照

関連したドキュメント

This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided

Using conditional variance denotes the expected risk model which is known as the ARCH mean regression model ARCH-M.. The left is the logarithm of conditional variance which means

We establish why expected value is insensitive to catastrophic risks see the study by Chichilnisky 1996, and use another criterion to evaluate risk based on axioms for choice

As a result of this computer-based market analysis, the following findings were made: 1 improvements in the forecast accuracy of fundamentalists can contribute to an increase in

The new results provided here show that the standard axiom of decision theory, Monotone Continuity, is equivalent to De Groot’s Axiom SP 4 that lies at the foundation of

The equality in (2.6) holds if and only if a and b are linearly dependent, or the vector c is a linear combination of a and b, and (a, c)(b, c) = 0, but (a, c) and (b, c) are

Relations between (set-theoretic) complete intersections and lo- cal cohomology are studied; it is explained in what sense Matlis duals of cer- tain local cohomology modules

The author of this paper does not currently know whether (E , M) ∗ can be equal to the collection of all countable sets of reals (see Problem 21 below).. However, all the other nodes