DOI 10.1007/s10801-010-0253-5
The binomial ideal of the intersection axiom for conditional probabilities
Alex Fink
Received: 10 February 2009 / Accepted: 27 August 2010 / Published online: 17 September 2010
© The Author(s) 2010. This article is published with open access at Springerlink.com
Abstract The binomial ideal associated with the intersection axiom of conditional probability is shown to be radical and is expressed as an intersection of toric prime ideals. This solves a problem in algebraic statistics posed by Cartwright and Eng- ström.
Keywords Conditional independence·Intersection axiom
Conditional independence constraints are a family of natural constraints on proba- bility distributions, describing situations in which two random variables are indepen- dently distributed given knowledge of a third. Statistical models built around con- siderations of conditional independence, in particular graphical models in which the constraints are encoded in a graph on the random variables, enjoy wide applicability in determining relationships among random variables in statistics and in dealing with uncertainty in artificial intelligence.
One can take a purely combinatorial perspective on the study of conditional inde- pendence, as does Studený [10], conceiving of it as a relation on triples of subsets of a set of observables which must satisfy certain axioms. A number of elementary implications among conditional independence statements are recognized as axioms.
Among these are the semi-graphoid axioms, which are implications of conditional independence statements lacking further hypotheses, and hence are purely combina- torial statements. The intersection axiom is also often added to the collection, but unlike the semi-graphoid axioms it is not uniformly true; it is our subject here.
Formally, a conditional independence modelMis a set of probability distribu- tions characterized by satisfying several conditional independence constraints. We
A. Fink (
)Department of Mathematics, North Carolina State University, Raleigh, NC, USA e-mail:[email protected]
will work in the discrete setting, where a probability distributionp is a multi-way table of probabilities, and we follow the notational conventions in [1].
Consider the discrete conditional independence modelMgiven by {X1⊥⊥X2|X3, X1⊥⊥X3|X2}
whereXi is a random variable taking values in the set[ri] = {1, . . . , ri}. Throughout we assumer1≥2. Letpij k be the unknown probabilityP (X1=i, X2=j, X3=k) in a distribution from the modelM. The set of distributions in the modelMis the variety whose defining idealIM⊆S=C[pij k]is
IM=
pij kpijk−pijkpij k: i, i∈ [r1], j, j∈ [r2], k∈ [r3] +
pij kpij k−pij kpij k: i, i∈ [r1], j∈ [r2], k, k∈ [r3] .
The intersection axiom is the implication whose premises are the statements ofM and whose conclusion isX1⊥⊥(X2, X3). To be true, this implication requires the fur- ther hypothesis that the distributionpis in the interior of the probability simplex, i.e.
that no individual probabilitypij kis zero. It is thus a natural question to ask what can be inferred about distributionsp which may lie on the boundary of the probability simplex. In algebraic terms, we are asking for the (set-theoretic) components of the varietyV (IM).
A problem posed by Dustin Cartwright and Alexander Engström appears in Sect. 6.6.3 of [1], giving a conjectural description of the associated primes ofIMin terms of certain subgraphs of a complete bipartite graph. Our main theorem resolves this conjecture in the positive, and gives stronger information, namely the primary decomposition ofIM.
In the course of this project the author computed primary decompositions ofIM for various values ofr1,r2, andr3with the computer algebra system Singular [4,5].
Thomas Kahle has recently written dedicated Macaulay2 code [3] for binomial pri- mary decompositions [7], in which the same computations may be carried out.
A broad generalization of this paper’s results to the class of binomial edge ideals of graphs has been obtained by Herzog, Hibi, Hreinsdóttir, Kahle, and Rauh [6]. The r=2 case ofIMis treated, with a different term order, in their Sect. 4.
LetKp,q be the complete bipartite graph with bipartitioned vertex set[p] [q].
Given a subgraph Gof Kr2,r3 with edge set Edges(G), the primePG to which it corresponds is defined to be
PG=PG(0)+PG(1) where
PG(0)=
pij k: i∈ [r1], (j, k) /∈Edges(G) , PG(1)=
pij kpijk−pijkpij k: i, i∈ [r1]; and
j, j∈ [r2], k, k∈ [r3]are in the same connected component ofG .
Note thatj need not be distinct fromj, nork fromk. We will also want to refer to the individual summandsPC(1) of PG(1), where PC(1) includes only the generators {pij k: (j, k)∈C}arising from edges in the connected componentCofG. Then
PG=PG(0)+
C
PC(1), (1)
whereCruns over connected components ofG.
We say that a subgraphGofKr2,r3 is admissible ifGhas vertex set[r2] [r3] and all connected components ofGare isomorphic to some complete bipartite graph Kp,q withp, q≥1.
Let ≺dp be the revlex term order on S over the lexicographic variable order on subscripts, with earlier subscripts more significant. Thus under ≺dp, we have p111≺dpp112≺dpp211.
Theorem 1 The primary decomposition IM=
G
PG (2)
holds and is an irredundant decomposition, where the union is over admissible graphsGon[r2] [r3]. We also have
in≺dpIM=
G
in≺dpPG.
Each in≺dpPGis squarefree, so in≺dpIMand henceIMare radical ideals.
In particular, the value ofr1is irrelevant to the combinatorial nature of the primary decomposition.
Corollary 2 (Conjecture, Cartwright–Engström) The set of minimal primes of the idealIMis
PG: Gan admissible graph on[r2] [r3] .
Remark 3 This corollary amounts to the set-theoretic identity V (IM)=
Gadmissible
V (PG).
Points(pij k)on the varietyV (PG)are characterized by the conditions thatpij k=0 for (j, k) /∈Edges(G), and that for any two edges (j, k) and (j, k)in the same connected component ofG, the vectors(p·j k)and(p·jk)inCr1are proportional.
The core ideas of a proof of Corollary2are present in [1, Sect. 6.6.4]. That discus- sion focuses on the primePKr
2,r3, corresponding to the locus where the conclusion of the intersection axiom is valid, but it extends without great difficulty to anyPG.
It is noted in [1, §6.6] that the numberη(p, q)of admissible graphsGon[p][q] is given by the generating function
exp
ex−1
ey−1
=
p,q≥0
η(p, q)xpyq
p!q!, (3)
which in that reference is said to follow from manipulations of Stirling numbers.
This equation (3) can also be obtained as a direct consequence of a bivariate form of the exponential formula for exponential generating functions [9, §5.1], using the observation that
ex−1 ey−1
=
p,q≥1
xpyq p!q!
is the exponential generating function for complete bipartite graphs withp, q≥1, and these are the possible connected components of admissible graphs.
We now review some standard facts on binomial and toric ideals [2]. Let I be a binomial ideal in C[x1, . . . , xn], generated by binomials of the form xv−xw with v, w∈Nn. There is a lattice LI ⊆Zn such that the localization Ix1···xn ⊆ C[x1±1, . . . , xn±1] has the form (xv−1: v∈LI), provided that this localization is a proper ideal, i.e.Icontains no monomial. IfφI:Zn→Zmis aZ-linear map whose kernel containsLI, thenφI provides a multigrading with respect to whichI is ho- mogeneous. (If kerφI=LIexactly thenφI is said to compute the minimal sufficient statistics for the statistical model associated toI.)
The condition that a multivariate Laurent polynomialf ∈C[x1±1, . . . , xn±1]lies inIx1···xncan be expressed in terms of a graphΓ, whose vertices areZnand whose edge set is{(v, w): xv−xwis a Laurent monomial multiple of a generator ofI}; in the statistical context these edges are known as moves. To wit,f is inIx1···xn if and only if, for each connected componentC ofΓ, the sum of the coefficients on all monomialsxvwithv∈C is zero. In particularIx1···xn is determined by the partition ofZn into connected components ofΓ. Note that this partition refines the partition ofZninto fibers ofφI, for any mapφI as in the last paragraph. If we are concerned with membership inI rather thanIx1···xn, analogues of everything in this paragraph are true if we substituteNnforZn and use ordinary monomials rather than Laurent monomials in defining the edges. We will denote the resulting graph onNnbyΓ (I ), and its induced subgraph on a subsetF⊆NnbyΓF(I ).
Any prime binomial idealI is equal to the toric idealIAof a lattice point configu- rationA, whereIAis the kernel of the monomial map whose monomials are the points ofA. Sturmfels shows in [8] that the radicals of the monomial initial ideals of IA are exactly the Stanley–Reisner ideals of regular triangulations ofA. The Stanley–
Reisner idealIΔ of a simplicial complexΔon a vertex setT is the monomial ideal ofC[xt:t∈T]generated by the products of variablesxt1· · ·xtkfor which{t1, . . . , tk} is not a face ofΔ. Primary decompositions of Stanley–Reisner ideals are easily de- scribed:IΔis the intersection of the ideals(xt: t /∈F )over all facetsF ofΔ.
Sturmfels treats explicitly the 2×2 determinantal ideal of anr×smatrix, which is the toric idealIA for A the set of vertices of the productΔr−1×Δs−1 of two simplices.
Theorem 4 (Sturmfels [8]) LetI be the ideal of 2×2 minors of anr×s matrix of indeterminatesY =(yij). For any term order≺, in≺I is a squarefree monomial ideal.
Remark 5 If≺is the revlex term order on theyij, set up analogously to≺dp, thenΔ has a pleasant description [8]: it is the staircase triangulation. The facets ofΔare the setsπ of entries of the matrixY which form maximal (“staircase”) paths throughY starting at the upper left corner, taking steps right and down, and terminating at the lower right corner. Note that staircase paths are maximal subsets of indeterminates not including bothyij andyij for anyi < iandj < j. Thus the associated primes of in≺I are generated by minimal sets of variablesyij which include at least one of yij andyij wheneveri < iandj < j.
The significance of the idealsPC(1) of connected components comes from the fol- lowing fact.
Fact 6 IfG is an admissible graph, then (1) expressesPG as a sum of primes in disjoint sets of variables.
Indeed, PG(0) is the irrelevant ideal in thepij k with(j, k) /∈Edges(G), and for each connected componentCofGwiths left andtright vertices,PC(1)is the 2×2 determinantal ideal of ther1×st matrix of indeterminates (pij k)where the row indices arei∈ [r1]and the column indices(j, k)∈EdgesC. The irrelevant ideal can mostly be ignored, and so this fact reduces many of our considerations to handling 2×2 determinantal ideals. (Note that the hypothesis thatGbe admissible is needed, since otherwisePG(1)includes variables corresponding to nonedges ofG. We could amend the definition ofPGto salvage Fact6, but we would lose the also important fact that the summands are determinantal.)
For a first immediate application, by Theorem4the in≺dpPGare squarefree mono- mial ideals, implying the radicality claim of Theorem1.
For a second, we recover the primary decomposition of in≺PG for an arbitrary admissible graphG. Let the connected components ofGbeC1, . . . , Cl. Fact6im- plies that in≺PG=in≺PG(0)+ iin≺PC(1)
i . It then also gives us that if in≺PC(1)
i =
jQCi,j are primary decompositions of the in≺PC(1)
i , then in≺PG=
j
PG(0)+
l i=1
in≺QCi,ji
is a primary decomposition ofPG, where j=(j1, . . . , jl)ranges over the Cartesian product of the index sets in
jQC,j.
Lemma 7 Let G and G be subgraphs of Kr2,r3. Then PG ⊆PG if and only if Edges(G)⊆Edges(G)and every connected component of G is a union of con- nected componentsC1, . . . , ClofGsuch that at most oneCicontains more than one vertex.
In particular, for any subgraphG of Kr2r3 there exists an admissible graph G such thatPG ⊆PG. Such aGcan be constructed fromGas follows: add toGone new edge incident to each of its isolated vertices, and then complete each connected component of the new graph to a bipartite complete graph.
Proof First suppose the consequence fails. Then either (1) Gcontains an edge not inG, or
(2) some connected component ofGis not contained in a single connected compo- nent ofG, or
(3) a connected component ofG contains two connected components ofG both larger than one vertex.
In case (1), let(j, k)be an edge ofGnot inG. Thenp1j k∈PG, butp1j k∈/PG, since Remark3describes points inV (PG)withp1j k=0. Case (2) implies case (1). And in case (3), let(j, k)and(j, k)be edges ofGin different connected components there but in the same connected component ofG. Thenp1j kp2jk−p1jkp2j k is in PG but notPG, again using Remark3.
Suppose instead the consequence holds. The generators ofPG(0) are inPG, since nonedges ofGare nonedges ofG. The generators ofPG(1) are also inPG. For every pair of edges(j, k),(j, k)in a connected component ofG, either all their endpoints are in the same component ofGor one of their endpoints is isolated: in the former casepij kpijk−pij kpijk is inPG(1), in the latter case inPG(0). Proof of Theorem1 We first check that the right side of (2) is an irredundant primary decomposition. LetGbe an admissible graph. SincePGis a sum of primes in disjoint variables by Fact6, it is prime. Irredundance of (2) is the assertion that forGandG distinct admissible graphs,PGis not contained inPG. This follows directly from the definition of admissibility and Lemma7.
So we must prove the intersection statement (2). Let≺be≺dp, and writeI=IM. It is apparent that
I⊆PG (4)
for eachG(without using admissibility). Indeed, given a generatorf ofI, without loss of generality of the formf =pij kpijk−pijkpij k, either both edges(j, k)and (j, k)lie in Edges(G), in which casef is a generator ofPG(1), or one of these edges is not in Edges(G), in which casef ∈PG(0). Therefore the containments
in≺I⊆in≺
G
PG⊆
G
in≺PG
hold, the intersections still being over admissibleG. It now suffices to show an equal- ity of Hilbert functions
H (S/in≺I )=H
S/
G
in≺PG
, (5)
forcing these containments to be equalities.
The latticeLI associated to ourI is generated by all vectors of the formeij k+ eijk−eijk −eij k andeij k+eij k −eij k −eij k. The map φ=φI :Zr1r2r3 → Zr1+r2r3 sending(uij k)to
(j,k)
u1j k, . . . ,
(j,k)
ur1j k,
i
ui11, . . . ,
i
uir2r3
has kernel containingLI. Therefore we obtain aZr1+r2r3-valued multigrading onS, degφpij k=(ei, ej k), in whichI is homogeneous. The degφmultigrading refines the standard grading. We will prove that (5) holds in this stronger context ofφ-graded Hilbert functions.
Letd∈Zr1+r2r3 be the multidegree of some monomial, and write its components asdi fori∈ [r1]anddj k for(j, k)∈ [r2] × [r3]. LetG(d)be the bipartite graph with vertex set[r2] [r3]and edge set{(j, k):dj k=0}. We now prove the following two claims:
Claim 1 Id=(PG(d))d. Claim 2 (
Gadmissiblein≺PG)d=(in≺PG(d))d. These claims imply
H (in≺I )(d)=H (I )(d)=H (PG(d))(d)=H (in≺PG(d))(d)
=H
G
in≺PG
(d).
Thence we conclude that (5) holds, proving Theorem1.
Proof of Claim 1 Observe first that no polynomial homogeneous of multidegreedcan be divisible by anypij kwith(j, k) /∈Edges(G(d)). Accordingly we have(PG(d))d= (PG(d)(1) )d, and we will work withPG(d)(1) .
SinceI andPG(d)(1) are binomial ideals generated by differences of monomials, it will suffice to show that the two graphsΓF(I )andΓF(PG(d)(1) )of moves on the fiber F=φ−1(d)have the same partition into connected components. It is clear thatΓF(I ) is a subgraph ofΓF(PG(d)(1) ), since containment (4) impliesId⊆(PG(d))d=(PG(d)(1) )d. So given an edge ofΓF(PG(d)(1) ), say with endpointsu, u∈F, we must show that this edge is contained in a connected component ofΓF(I ), i.e. thatpu−pu∈I. We haveu=u+ei0j0k0+eijk−ei0jk−eij0k0 for somei0, i∈ [r1]and some two edges(j0, k0), (j, k)of G(d)in the same component. Let(jm, km)m=0,..., be the edges of a path inG(d)between these, so that for each 0≤m < eitherjm=jm+1
orkm=km+1. Assume the(jm, km)are pairwise distinct. For each 1≤m≤−1, let imbe such thatpimjmkm dividespu. Such animmust exist becausedjmkmis positive.
Then
(pi0j0k0pijk−pi0jkpij0k0)pi1j1k1· · ·pi−1j−1k−1
=
−1
m=0
pi1j0k0· · ·pimjm−1km−1gii0jmkm
m+1jm+1km+1pim+2jm+2km+2· · ·pijk
−
−2
m=0
pi1j0k0· · ·pimjm−1km−1giijmkm
m+1jm+1km+1
×pim+2jm+2km+2· · ·pi−1j−1k−1pi0jk
is inI, where to save spacegiij kjkdenotes the generatorpij kpijk−pijkpij k ofI. The binomialpu−pu is a monomial multiple of this binomial, sopu−pu∈I. Proof of Claim 2 There is an admissible graph G such that PG ⊆PG(d), by Lemma7. Then in≺PG⊆in≺PG(d), which implies
Gadmissiblein≺PG⊆in≺PG(d), one of the containments of the claim.
For the other containment, suppose pu is a monomial of multidegree d be- longing to in≺PG(d). By Remark5,pu is divisible by some pijkpij k for i < i in[r1]and(j, k), (j, k)two edges lying in the same connected component ofG(d) with (j, k) < (j, k) lexicographically. Now let G be any admissible graph. If G(d)is not a subset ofG, thenpu is divisible by some indeterminatepijk with (j, k) /∈E(G), so pu∈in≺PG. OtherwiseG(d)⊆G, in which case the edges (j, k)and(j, k)lie in the same component ofG, sopijkpij k∈in≺PG, implying pu∈in≺PG. Therefore in≺PG(d)⊆
Gadmissiblein≺PG.
Observe finally that Claim 1 alone would suffice for the radicality in Theorem1, supposing we already knew thePGto be the associated primes. For radicality it suf- fices thatI contain the intersection of its minimal primes, and this follows using Claim 1 one multidegree at a time, since the multidegreedpart of this intersection is contained in(PG(d))d.
Acknowledgements We thank Thomas Kahle for discussions, and Bernd Sturmfels and a patient referee for careful readings and for several helpful suggestions.
Open Access This article is distributed under the terms of the Creative Commons Attribution Noncom- mercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
References
1. Drton, M., Sturmfels, B., Sullivant, S.: Lectures on Algebraic Statistics. Oberwolfach Seminars, vol. 39. Springer, Berlin (2009)
2. Eisenbud, D., Sturmfels, B.: Binomial ideals. Duke Math. J. 84, 1–45 (1996)
3. Grayson, D.R., Stillman, M.: Macaulay2, a software system for research in algebraic geometry. Avail- able athttp://www.math.uiuc.edu/Macaulay2/
4. Greuel, G.-M., Pfister, G., Schönemann, H.: SINGULAR3.0—a computer algebra system for poly- nomial computations. In: Kerber, M., Kohlhase, M. (eds.) Symbolic Computation and Automated Reasoning, The Calculemus-2000 Symposium, pp. 227–233 (2001)
5. Greuel, G.-M., Pfister, G.:primdec.lib, a SINGULAR3.0 library for computing the primary de- composition and radical of ideals (2005)
6. Herzog, J., Hibi, T., Hreinsdóttir, F., Kahle, T., Rauh, J.: Binomial edge ideals and conditional inde- pendence statements. arXiv:0909.4717
7. Kahle, T.: Binomials.m2, code for binomial primary decomposition in Macaulay2. http://
personal-homepages.mis.mpg.de/kahle/bpd/index.html
8. Sturmfels, B.: Gröbner bases of toric varieties. T¯ohoku Math. J. 43, 249–261 (1991)
9. Stanley, R.P.: Enumerative Combinatorics, vol. 2. Cambridge Studies in Advanced Mathematics, vol. 62. Cambridge University Press, Cambridge (1997)
10. Studený, M.: Probabilistic Conditional Independence Structures, Information Science and Statistics.
Springer, New York (2005)