The M¨obius function of a composition poset

(1)

DOI 10.1007/s10801-006-0017-4

The M¨obius function of a composition poset

Bruce E. Sagan·Vincent Vatter

Received: 22 July 2005 / Accepted: 11 January 2006 / Published online: 11 July 2006

CSpringer Science+Business Media, LLC 2006

Abstract We determine the M¨obius function of a poset of compositions of an integer.

In fact, we give two proofs of this formula, one using an involution and one involving discrete Morse theory. This composition poset turns out to be intimately connected with subword order, whose Möbius function was determined by Björner. We show that, using a generalization of subword order, we can obtain both Björner’s results and our own as special cases.

1. Introduction

If A is any set then the corresponding Kleene closure or free monoid, A^∗is the set of words with letters from A, i.e.,

A^∗= {w=w(1)w(2). . . w(n)|n ≥0 andw(i )∈ A for all i}.

We denote the length (number of elements) ofwby|w|.

This work was partially done while B. E. Sagan was on leave at DIMACS.

Partially supported by an award from DIMACS and an NSF VIGRE grant to the Rutgers University Department of Mathematics.

B. E. Sagan ( )

Department of Mathematics, Michigan State University East Lansing, MI 48824-1027

e-mail: sagan@math.msu.edu V. Vatter

School of Mathematics and Statistics, University of St Andrews, St Andrews, Fife, Scotland KY16 9SS

e-mail: vince@mcs.st-and.ac.uk

(2)

Letting Pdenote the positive integers, we see thatP^∗ is the set of integer compositions (ordered partitions). We can turnP^∗ into a partially ordered set by letting u ≤wif there is a subwordw(i₁)w(i₂). . . w(il) ofwhaving length l= |u|such that u( j)≤w(ij) for 1≤ j ≤l. For example, 433≤16243 because of the subword 643.

Bergeron, Bousquet-Mélou and Dulucq [2] were the first to study P^∗, enumerating saturated chains that begin at its minimal element. Snellman has also studied saturated chains in this poset as well as two other partial orders onP^∗ [21, 22]. One of these other partial orders was originally defined by Björner and Stanley [8] who showed that it has analogues of many of the properties of Young’s lattice. One of the main results of this paper is a formula for the Möbius function ofP^∗.

This order on P^∗ is closely related to subword order. If A is any set then the subword order on A^∗ is defined by letting u≤w if w contains a subsequence w(i1), w(i2), . . . , w(il) such that u( j )=w(ij) for 1≤ j ≤l = |u|. By way of illus- tration, if A= {a,b}then abba≤aabbbaba becausew(1)w(3)w(5)w(8)=abba.

Note that we use the notation A^∗when referring to subword order as opposed to the partial order on P^∗, even though we use≤ for both. We will always give enough context to make it clear which poset we are dealing with. Björner [4] was the first to completely determine the Möbius function of subword order, although special cases had been obtained previously by Farmer [13] and Viennot [24]. In fact, Björner gave two proofs of his formula, one using an involution [5] and one using shellability [4]. He also gave a demonstration with Reutenauer [6] via generating functions on monoids. Another proof was given by Warnke [26] using induction, while Wang and Ma [25] used Cohen-Macaulayness to investigate μ. We derive the Möbius function formula for P^∗ using first combinatorial and then topological techniques.

In the next section, we review Bj¨orner’s result for subword order as well as the related definitions which will be useful forP^∗. This section contains a statement of our formula for the M¨obius function ofP^∗ in Theorem 2.2. Section 3 is devoted to giving a proof of this theorem using a sign-reversing involution.

Although intervals in A^∗are shellable, those inP^∗need not even be connected as evident from the example in Fig. 2, so we need a more powerful tool to study the topology of this composition poset. For this we turn to discrete Morse theory, which was developed by Forman [14, 15] and has been a powerful tool in the computation of the homology of many CW-complexes arising naturally in topological combinatorics.

A method for applying this theory to the order complex of a poset was given by Babson and Hersh [1], and further studied by Hersh herself [16]. Since this is a relatively recent addition to the combinatorial toolbox, we provide an exposition of the basic ideas of the theory in Section 4. The subsequent section gives a Morse theoretic proof of Theorem 2.2.

The similarity between the formulas for the M¨obius functions of A^∗andP^∗leads one to ask if there is a common generalization. In fact, if P is any poset then there is a partial order on P^∗which we call generalized subword order. It has been used in the context of well-quasi-ordering; see Kruskal’s article [18] for a survey of the early literature.

When P is a chain or an antichain, one recovers our results or Bj¨orner’s, respectively.

This construction is studied in Section 6. We end with a section of comments and open problems.

(3)

2. Subword and composition order

We now review Björner’s formula for the Möbius function of subword order, refor- mulating it slightly so as to emphasize the connection to the composition order which is our main objective. We assume the reader is familiar with Möbius functions, but all the necessary definitions and theorems we use here can be found in Stanley’s text [23, Sections 3.6–3.7].

We first need to restate the definition of the partial order in A^∗ in a way that, although slightly more complicated, has a direct connection with the M¨obius function.

Suppose we have a distinguished symbol, 0, and suppose that 0∈ A. Then a word η=η(1)η(2). . . η(n)∈( A∪0)^∗has support

Suppη= {i |η(i )=0}.

An expansion of u∈( A∪0)^∗ is a wordηu∈( A∪0)^∗such that the restrictions of u andηu to their supports are equal. For example, if u=abba then one expansion of u isηu =a0b0b00a. An embedding of u intowis an expansionηuwof u which has length|w|and satisfies

ηuw(i )=w(i ) for all i ∈Suppηuw.

Note that u ≤win A^∗if and only if there is an embedding of u intow. The example expansionηuwgiven above is exactly the embedding which corresponds to the subword ofw=aabbbaba given at the beginning of the third paragraph of this paper. Ifwis clear from context we simply writeηuforηuw.

The M¨obius function of A^∗ counts certain types of embeddings. If a∈ A then a run of a’s inwis a maximal interval of indices [r,t] such that

w(r )=w(r+1)= · · · =w(t)=a.

Continuing our example, the runs inw=aabbbaba are [1,2], [3,5], [6,6], [7,7], and [8,8]. Call an embeddingηuintownormal if, for every a∈ A and every run [r,t]

of a’s inw, we have

(r,t]⊆Suppηu

where (r,t] denotes the half-open interval. In our running example, this means that w(2)=a and w(4)=w(5)=b must be in any normal embedding. (If r =t then (r,t]= ∅, so there is no restriction on runs of one element.) Thus in this case there are precisely two normal embeddings of u intow, namely

ηu =0a0bba00 and 0a0bb00a.

Define_w

u

nto be the number of normal embeddings of u intow.

We now have everything in place to state Bj¨orner’s result.

(4)

Theorem 2.1 (Bj¨orner [4]). If u, w∈ A^∗then μ(u, w)=(−1)^|w|−|u|

w u

n

.

Finishing our example, we see that

μ(abba,aabbbaba)=(−1)⁸⁻⁴·2=2.

We now turn toP^∗. The definitions of support and expansion are exactly as before, but the notion of embedding must be updated to reflect the different partial order. To this end we define an embedding of u intowas an expansionηu of u having length

|w|such that

ηu(i )≤w(i ) for 1≤i ≤ |w|.

Again, u≤winP^∗ is equivalent to the existence of an embedding of u intow. An interval inP^∗is displayed in Fig. 1.

If u≤w andηw is an expansion ofw then there is a unique last or rightmost embedding ρuw of u intoηw which has the property that for any other embedding ηuwof u intoηwone has Supp (ηuw)≤Supp (ρuw). (If S= {i1<· · ·<im}and S= {i₁ <· · ·<i_m}then we write S≤ Sto mean that ij≤i_j for 1≤ j≤m.) Note that ρuwdepends onη_w, not just onw, but the expansion ofwused will always be clear from context.

Fig. 1 The interval [322,3322] and its maximal chains

(5)

Like subword order, we define normal embeddings forP^∗in terms of runs (defined in the same way in this context). We say that an embeddingηuintowis normal if the following conditions hold.

1. For 1≤i≤ |w|we haveηu(i )=w(i ),w(i )−1, or 0.

2. For all k≥1 and every run [r,t] of k’s inw, we have (a) (r,t]⊆Suppηuif k=1,

(b) r ∈Suppηuif k≥2.

Comparing this with Bj¨orner’s definition, we see that inP^∗a normal embedding can have three possible values at each position instead of two. Also, the run condition for ones is the same as in A^∗, while that condition for integers greater than one is complementary. As an example, ifw=2211133 and u=21113, then there are two normal embeddings, namely ηu =2101130 and 2011130. Note that 2001113 and 0211130 are not normal since they violate conditions (1) and (2), respectively.

As with subword order, the M¨obius function for compositions is given by a signed sum over normal embeddings, although here the sign of a normal embedding depends on the embedding itself and not just the length of the compositions. Given a normal embeddingηu intowwe define its defect to be

d(ηu)=#{i|ηu(i )=w(i )−1}.

The sign of the embedding is, then, (−1)^d(^η^u⁾. We can now state our main theorem aboutP^∗. Theorem 2.2. If u, w ∈P^∗then

μ(u, w)=

ηu

(−1)^d(^η^u⁾

where the sum is over all normal embeddingsηuintow.

In the example of the previous paragraph, this gives

μ(21113,2211133)=(−1)²+(−1)⁰=2.

Although this example does not show it, it is possible to have cancellation among the terms in the sum forμ.

3. Proof by sign-reversing involution

We now prove Theorem 2.2 using a sign-reversing involution. The proof is similar in nature to Bj¨orner’s proof in [5], but is significantly more complicated.

Proof (of Theorem 2.2): If u=wthen there is exactly one normal embeddingηww

and it has defect 0. This gives (−1)⁰=1=μ(w, w), as desired.

(6)

Now assume that u< w=w(1)· · ·w(n). Since the M¨obius recurrence uniquely definesμ, it suffices to show that

v∈[u,w]

ηvw

(−1)^d(^η^vw⁾=0,

where the inner sum is over all normal embeddings of v intow. We prove this by constructing a sign-reversing involution on the set of normal embeddings ηvw for v∈[u, w].

Letρuvdenote the rightmost embedding of u intoηvw, so for all i ∈[1,n], ρuv(i )≤η_vw(i )≤w(i ). (1) Also let s denote the left-most index whereρuvandwdiffer, i.e., s=min{i :ρuv(i )= w(i )}. Since u< w, s must exist, and by the definition of s we have

ρuv(i )=η_vw(i )=w(i ) for all i∈[1,s). (2) Set k =w(s), soηvw(s) is k, k−1, or 0 by normality andρuv(s)≤k−1. Finally, let [r,t] denote the indices of the run of k’s inwthat contains the index s.

Our involution mapsη_vwto the embeddingη_vw=η_vw(1)· · ·η_vw(n) whereη_vw(i )= η_vw(i ) for all i =s andη_vw(s) is determined by the following rules. If k=1 then

ηvw(s)=

0 ifη_vw(s)=1,

1 ifηvw(s)=0. (3)

If k ≥2, s>r , andρuv(s)=0 then η_vw(s)=

0 ifη_vw(s)=k−1,

k−1 ifη_vw(s)=0. (4)

Finally, if k ≥2 and either s=r orρuv(s)=0 then ηvw(s)=

k−1 ifη_vw(s)=k,

k ifηvw(s)=k−1. (5)

It is not obvious that this map is defined for all normal embeddings η_vw when k≥2. For example, if s >r and ρuv(s)=0, then we should apply (4), but it is a priori possible that η_vw(s)=k, in which case (4) is not defined. However, since s>r ,ρuv(s−1)=w(s−1)=k by (2), and this contradicts our choice ofρuvas the rightmost embedding of u intoη_vw. Similar issues arise when s=r orρuv(s)=0: if s=r thenηvw(s)=0 by normality, and ifρuv(s)=0 thenηvw(s)=0 by (1).

Having established that this map is indeed defined on all normal embeddings, we have several properties to prove. First, it is evident from (3), (4), and (5) that the number of elements equal to k−1 changes by exactly one in passing fromηvw toηvw, and

(7)

thus (−1)^d(^η^vw⁾= −(−1)^d(^η^vw⁾. We must now prove thatηvwis a normal embedding of vintowfor somev∈[u, w] and that this map is an involution.

We begin by showing thatηvwis an embedding of somev∈[u, w] intow. It follows from the fact thatηvwis an embedding intowand the definition of our map thatηvwis an embedding of some wordvintow, so we only need to show thatv≥u. We prove this by showing that

η_vw(i )≥ρuv(i ) (6)

for all i ∈[1,n]. This is clear for all i =s because for these indicesηvw(i )=ηvw(i )≥ ρuv(i ). Furthermore,ρuv(s)≤k−1 by the definition of s, so the only case in which (6) is not immediate is when k ≥2 and η_vw(s)=0. However, this can only occur from using (4), which requires that ρuv(s)=0, completing the demonstration that v∈[u, w].

We now aim to show that η_vw is a normal embedding. If η_vw(s)>0 then Supp (η_vw)⊇Supp (η_vw), so the normality ofη_vwfollows from the normality ofη_vw and the fact thatη_vw(s) is either k−1 or k. Ifη_vw(s)=0 then there are two cases depending on whether (3) or (4) was applied. Suppose first that (3) was applied, so k=1. Comparing (3) with the definition of normality, we see that it suffices to show s=r . Suppose to the contrary that s∈(r,t]. Since s is the left-most position at which ρuv andwdiffer, we haveρuv(s)=0 andρuv(s−1)=1. However, this contradicts our choice ofρuvas the rightmost embedding of u inηvw. Now suppose that (4) was applied, so s>r andρuv(s)=0. Then (2) implies that ηvw(r )=ηvw(r )=k, and normality is preserved.

It only remains to show that this map is an involution. Consider applying the map toηvw. In this process, we defineρuv to be the rightmost embedding of u intoηvw, s=min{i :ρuv(i )=w(i )}, and k=w(s). We then follow the rules (3) (4), and (5) to construct an embeddingη_vw, which we would like to show is equal toη_vw.

First we claim that ρuv=ρuv. Suppose to the contrary that ρuv=ρuv. By (6), η_vw(i )≥ρuv(i ) for all i , soρuvalso gives an embedding of u intov, and thus the only way we can haveρuv=ρuvis ifρuvis further to the right thanρuv. This requires that 0=ρuv(s)< ρu ¯v(s)≤k (7) and that

ηvw(s)< ηvw(s). (8)

Because we are assuming thatρvw is further to the right thanρvw, there is some position to the left of s at whichρuvis nonzero. In fact, (2) shows that we must have ρuv(s−1)=0 and also implies that

ηvw(s)< ρuv(s)=ρuv(s−1)=w(s−1). (9) We now consider the three cases arising from each of the rules (3), (4), and (5) in turn.

(8)

Suppose (3) was applied so k=1. It follows from (8) and the definition of our map thatηvw(s)=0. Also (7) and (9) givew(s−1)=ρuv(s)=1. However, this implies thatηvwzeroed out a 1 which was not the first in its run, contradicting normality.

Now suppose (4) was applied. Then by (8) we haveηvw(s)=k−1. We also have s>r which in conjunction with (9) givesρuv(s)=w(s−1)=k. This implies that ηvw(s)< ρuv(s), but that contradicts thevversion of (1).

Finally suppose that (5) was used. By (7) it must be the case that s=r . Also, Eq. (8) gives η_vw(s)=k−1 and η_vw(s)=k. Now applying (7) and (9) we have k−1< w(s−1)=ρuv(s)≤k, sow(s−1)=k which contradicts that fact that s = r .

Now that we have established the equality ofρuv andρuv, the fact that this map is an involution can be readily observed. We must haves=s, so k=k, and thus we apply the same rule to go fromη_vwtoη_vwas we applied to getη_vwfromη_vw, and each of these rules is clearly an involution.

4. Introduction to discrete Morse theory

In this section we review the basic ideas behind Forman’s discrete Morse theory [14, 15]

as well as Babson and Hersh’s method for applying the theory to the order complex of a poset [1].

Let X be a CW-complex. Since we will be working in reduced homology, we assume that X has an empty cell∅of dimension−1 which is contained in every cell of X . Ifσ is a d-cell (cell of dimension d) in X then letσ^∂be the set of (d−1)-cells τ which are contained in the closure ¯σ. Dually, letσ^δdenote the set of (d+1)-cells τ such thatσ ⊂τ¯.

A real-valued function f on the cells of X is a Morse function if it satisfies the following two conditions.

1. For every cellσ of X we have (a) #{τ ∈σ^∂| f (τ)≥ f (σ)} ≤1, and (b) #{τ ∈σ^δ| f (τ)≤ f (σ)} ≤1.

2. Ifτ ∈σ^δ and f (τ)≤ f (σ) thenσis a regular face ofτ.

Intuitively, the first condition says that, with only certain exceptions, f increases with dimension. In fact, f (σ)=dimσ is a perfectly good Morse function on X , although we will see shortly that it is not very interesting. A simple example of a Morse function on a CW-complex is given in Fig. 1 where the value of f is given next to each cellσ and we also set f (∅)= −1.

The fact that condition (1) holds for every cell implies that, in fact, at most one of the two sets under consideration has cardinality equal to 1. Thus the function f induces a Morse matching between pairs of cellsσ, τ withσ ∈τ^∂ and f (σ)≥ f (τ). Furthermore, condition (1) implies that this matching is acyclic in the following sense: the directed graph formed from the Hasse diagram of the face poset of X by directing every edge of the matching upwards and all other edges downwards has no

(9)

directed cycles. The regularity condition ensures that for each matched pair there is an elementary collapse of ¯τ onto ¯τ−(τ ∪σ).

The cells which are not matched by f are called critical. From the acyclicity property and the fact that each individual collapse is a homotopy equivalence, it follows that X can be collapsed onto a homotopic complex X^f built from the critical cells. In our example, the cells labeled 1 and 2 are matched, and after collapsing we clearly have a complex which is still homotopically a circle. Note that if we take f to be the dimension function then every cell is critical and X^f =X , so the cell complex does not simplify in this case which does not help in understanding its structure.

Let ˜md be the number of critical d-cells of X and let ˜bd be the d-th reduced Betti number over the integers. We also use ˜χ(X ) for the reduced Euler characteristic. From the considerations in the previous paragraph, we have the following Morse inequalities which are analogous to those in traditional Morse theory.

Theorem 4.1 (Forman [15]). For any Morse function on a cell complex X we have 1. ˜bd ≤m˜d for d≥ −1, and

2. ˜χ(X )=

d≥−1(−1)^dm˜d.

(One can get further inequalities relating various partial alternating sums of the ˜bd

and ˜md.) Continuing our example, we see that ˜m₋₁=m˜0 =m˜1=1 which bound

˜b₋₁= ˜b0 =0 and ˜b₁=1, as well as ˜χ(X )= −1+1−1= −1, as expected.

We now turn to the special case of order complexes. Let P be a poset and consider an open interval (u, w) in P. The corresponding order complex (u, w) is the abstract simplicial complex whose simplices (faces) are the chains in (u, w). We are interested in the order complex because of the fundamental fact [20] that

μ(u, w)=χ( (u, w)).˜ (10)

Therefore finding a Morse function for (u, w) could permit us to derive the corresponding M¨obius value as well as give extra information about its Betti numbers.

Suppose we have an ordering of the maximal chains of (u, w) (facets of (u, w)), say C1,C2, . . . ,Cl. Call a face (subchain)σ of Ck new if it is not contained in any Cj

for j <k. We would like to construct a Morse matching inductively, where at the kth stage we extend the matching on the faces in Cj for j <k by matching up as many of the new facesσ in Ck as possible. It turns out that under fairly mild conditions on the facet ordering, one can construct such a matching so that all the new faces in Ckare matched if there are an even number of them, and only one is left unmatched if the number is odd. Thus adding each facet contributes at most one critical cell.

A maximal chain contributing a critical cell is called a critical chain. In reading the details of this construction, the reader may find it useful to refer to the example of the interval [322,3322]⊂P^∗given in Fig. 2. Note that, by abuse of notation, we include u andwwhen writing out a maximal chain C, even though C is really a subset of the open interval (u, w). Also, because of the way our chain order is constructed, we start with the top elementwand work down to u which is dual to what is done normally. Thus in a chain C, terms like “first” and “last” refer to this ordering of C’s elements. Finally, we list the elements of a chain as embeddings intowfor reasons which will become apparent when we also describe the labels given to the edges (covers) of a chain.

(10)

Fig. 2 A Morse function on a CW-complex

To define the types of chain orderings we consider, suppose we have two chains C : w=v0 v1 . . . u and C: w=v₀ v₁ . . . u where x y means that x covers y. Then we say that C and Cagree to index j ifvi =v_i for i≤ j. In addition, C and C diverge from index j if they agree to index j and vj+1 =v_j+1. In addition, we use the notation C<C to mean that Ccomes before C in the order under consideration. An ordering of the maximal chains of [u, w] is a poset lexicographic order, or PL-order for short, if it satisfies the following condition.

Suppose Cand C diverge from index j with C<C. Then for any maximal chains D and D which agree to index j+1 with Cand C, respectively, we must have D<D.

Note that orderings coming from the EL-labelings introduced by Bj¨orner [3] or from the more general CL-labelings of Bj¨orner and Wachs [9] are PL-orders as long as one breaks ties among labels consistently.

The PL-order we use inP^∗ is as follows. If x y is a cover then y is obtained from x by reducing a single part of x by 1. Thus there is a unique normal embedding of y into x, since if a 1 is reduced to 0 then it must be the first element in the run of ones to which it belongs. Similarly, for any expansionηxthere is a unique normal embedding of y intoηx. Now given any chain C : w=v0 v1 . . . u we inductively associate with eachvj an embeddingη_vj intowwhereη_v0 =wand, for j ≥0,ηvj+1is the unique normal embedding ofvj+1intoηvj. We label the edge vj vj+1 of C with the index i of the position which was decreased in passing from ηvj toηvj+1. Furthermore, we often writeηvj in place ofvj when listing the elements of C. Figure 2 illustrates this labeling. It is important to note that although ηvj+1 is normal inηvj, it need not be normal inw. We should also remark that this labeling is similar to the one used by Bj¨orner [4] in his CL-shelling of the intervals in subword order. Finally, if one orders the chains of [u, w] using ordinary lexicographic order on their label sequences, then the result is a PL-order. This is due to the fact that if two chains agree to index j then their first j labels are the same. The chains in Fig. 2 are listed in PL-order.

We now return to the general exposition. To construct our matching, when we come to a chain C : w=v0 v1 . . . u in a given order we must be able to determine which faces of C are new. Denote the open interval I fromvitovj in C by

I =C(vi, vj)=vi+1 vi+2 . . . vj−1.

(Do not to confuse this with an open interval in the poset.) Then I is a skipped interval if C−I ⊂Cfor some C<C. It is a minimal skipped interval or MSI if it does not strictly contain another skipped interval. In Fig. 2, the MSI’s are circled. One can find

(11)

the MSI’s by taking the maximal intervals in C−(C∩C) for each C<C and then throwing out any that are not containment minimal in C. LetI=I(C) be the set of MSI’s in C. Then it is easy to check that a faceσ is new in C if and only ifσ has a nonempty intersection with every I ∈I(C).

The setIin not quite sufficient to construct the matching because the MSI’s can overlap and we will need a family of disjoint intervalsJ. It happens that in all the critical chains forP^∗, the intervals inIwill already be disjoint. SoI=J and we will not need the following construction, but we include it for completeness. In general, there are no containments among the intervals inI, so they can be ordered I₁,I2,I3, . . . according to when they are first encountered on C. We now inductively construct the set J =J(C) of J -intervals as follows. Let J₁=I1. Then consider the intervals I₂ =I2−J1,I₃=I3−J1, . . .; throw out any which are not minimal; and pick the first one which remains to be J2. Continue this process until there are no nonempty modified MSI’s left.

We are finally in a position to describe the matching. List the maximal chains of [u, w] using a PL-order. A family K of intervals of a maximal chain C covers the chain if C= ∪K. There are three cases depending on whetherI(C) or J(C) covers C or not. First suppose that I(C) does not cover C, so neither does J(C), and pick x0to be the first vertex in C− ∪I. Consider the mapσ →σ {x0}where is symmetric difference (not the order complex). One can show that this map is a fixed-point free involution on the new facesσin C which extends the Morse matching already constructed from the previous chains. Now suppose thatJ = {J1,J2, . . . ,Jr} does cover C and consider the new faceσC = {x1,x2, . . . ,xr}where xi is the first element of Jifor 1≤i ≤r . Given any other new faceσ =σC, we find the interval Ji

of smallest index whereσ∩Ji = {xi}and mapσ →σ {xi}. This involution pairs up all new faces in C exceptσC, which is critical. Finally, suppose thatI covers C butJ does not. Then we use the mapping of the second case to pair up all new faces whose restriction to ∪J is different fromσC. We also pair up the remaining new faces (includingσC) by using the mapping of the first case where we take x0to be the first vertex in C− ∪J. Thus we have outlined the proof of the following theorem, remembering that the dimension of a simplex is one less than its number of vertices.

Theorem 4.2 ([1]). Let P be a poset and [u, w] be a finite interval in P. For any PL-order on the maximal chains of [u, w], the above construction produces a Morse matching in (u, w) with the following properties.

1. The maximal chain C is critical if and only ifJ covers C.

2. If C is critical then its unique critical cell has dimension #J(C)−1.

5. A Morse theory derivation ofμ

We are now ready to find the critical cells for the PL-order inP^∗ defined previously.

We first need three lemmas which will prove useful in a number of cases. Unless otherwise specified, we always use the notation

C : w=v0 l1 v1 l2 v2 l3 . . . ^l^d vd =u (11)

(12)

for labeled maximal chains, or C : w=η_v0

l1 η_v1

l2 η_v2

l3 . . . ^l^d ηu (12) if we wish to be specific about the embeddings determined by C. We also use

l(C)=(l1,l2, . . . ,ld) for its label sequence.

Take an interval [u, w]⊂P^∗ with|u| = |w|and let m_i =w(i )−u(i). Now con- sider the multiset M_uw= {{1^m¹,2^m², . . .}}where i^mⁱmeans that i is repeated m_itimes.

Then every permutation of M is the label sequence for a unique maximal chain in [u, w] and this accounts for all the chains. (In fact, [u, v] is isomorphic to the poset of submultisets of Muw.) We record this simple observation for later reference.

Lemma 5.1 (Same length lemma). If|u| = |w|then the the label function l gives a bijection between the maximal chains in [u, w] and the permutations of Muw. In particular, if Muwcontains only one distinct element (possibly with multiplicity) then [u, w] contains a unique maximal chain.

If|u|<|w|then we no longer have the nice bijection of the previous paragraph, but we can still say something. Let C be a maximal chain as in (11) and let l=(l₁, . . . ,l_d) be any permutation of the label sequence l(C). Then ldefines a sequence of expansions η_v₀, η_v₁, . . . , η_v_dwhereη_v₀=wand for j ≥1 we getη_v_jfromη_v_j₋₁by subtracting one from position l_jinηv_j−1. It is still true that C:w=v₀ v₁ . . . vd =u is a maximal chain in [u, w]. We call Cthe chain specified by l. Sinceη_v_jmay not be a normal embedding inη_v_j−1, we may not have l(C)=l. Still, at the first place where l(C) and ldiffer, that difference must have been caused because using the label in lwould have resulted in changing a 1 to a 0 where that 1 was not the first in its run.

Thus the corresponding normal embedding in l(C) uses the first 1 in that run which is to the left. Hence l(C)≤lin lexicographic order. We summarize this discussion in the following lemma.

Lemma 5.2 (Chain specification lemma). If C is a maximal chain in [u, w] and lis any permutation of l(C) then l(C)≤lwhere Cis the chain specified by l.

As our first application of the Chain Specification Lemma, we can determine what happens at descents. A descent of C isvj ∈C such that lj >lj+1. An ascent is defined by reversing the inequality.

Lemma 5.3 (Descent lemma). Ifvjis a descent of C then it is an MSI.

Proof: Let lbe the permutation of l(C) gotten by interchanging lj and lj+1 and let Cbe the chain specified by l. Then by Lemma 5.2 we have l(C)≤l<l(C), so C comes before C in PL-order and it is easy to check that Cdiverges from C atvj−1

and rejoins C atvj+1. Thus{vj}is a skipped interval; and since the interval contains

only one element it must also be minimal.

(13)

We only need a few more definitions to state our result characterizing the critical chains. A chain C will be said to have a certain property, e.g., weakly decreasing, if l(C) has that property. Also, ifηuis a normal embedding intowthen we need to keep track of the zero positions which did not come from decreasing a 1 inwby letting

D(ηu)=#{i|ηu(i )=0, w(i )≥2}.

Theorem 5.4. Consider the maximal chains in [u, w]⊂P^∗in the given PL-order.

1. There is a bijection between critical chains C and normal embeddingsηu intow where the chain corresponding toηuis the unique weakly decreasing chain ending atηu.

2. If C is critical and ends atηu thenI(C)=J(C) and

#I(C)=d(ηu)+2D(ηu)−1.

We shall prove this theorem by considering 3 cases: when C is weakly decreasing and ends at a normal embedding, when C is weakly decreasing and does not end at a normal embedding, and when C is not weakly decreasing. Note that for any embedding ηu intow, there is at most one weakly decreasing chain ending atηu, and that ifηu

is normal then such a chain will exist because it will be possible to make each cover normal. Thus there is a bijection between normal embeddings and weakly decreasing chains ending at them, but we need to show such chains are critical. To do so, we define a plateau of C to be an interval C(vi, vj) such that (li+1,li+2, . . . ,lj) is a run of length at least 2 in l(C).

Proposition 5.5. If C is weakly decreasing and ends at a normal embeddingηu then C is critical,I(C)=J(C), and #I(C)=d(ηu)+2D(ηu)−1.

Proof: Every descent is an MSI of C by the Descent Lemma, so any other MSI must be contained in a plateau by minimality. In fact, we claim that any plateau C(vi, vj) is an MSI. Without loss of generality we can assumevi =w(since otherwisevi is a descent and so no MSI can contain it) andvj =u.

To show C =C(w,u) is a skipped interval, first note that by construction l(C) consists of c repeated k = j−i ≥2 times, sow(c)=ηu(c)+k. Thus by normality and the fact that k≥2 we getηu(c)=0 andw(c)=k. Using normality again implies that c cannot be the first element in its run of k’s inw, and thusw(c−1)=k. Because of this, there is a chain Cfromwto u all of whose labels are c−1. By construction C<C and C∩C= ∅, so C is a skipped interval as desired.

To show the plateau is minimal suppose, to the contrary, that there is a skipped interval I ⊂C and letw=v0, u=vk. Note that because|v0| = |v1| = · · · = |vk−1|, the Same Length Lemma applies to show that there is only one chain (namely an interval of C) between any two of these compositions. Thus the chain Cgiving rise to I must rejoin C at an embeddingηuof u intow, and hence also containv1in order to cut out a proper subinterval. From this and normality ofηuwe have

ηu(c)<k=w(c−1)=ηu(c−1). (13)

(14)

Also,|u| = |w| −1 implies that C must zero out exactly one element ofw. Since C<C, that element must be in a position strictly to the left of position c, but then becauseηuandη_u are both expansions of u we are forced to haveηu(c−1)=η_u(c), contradicting (13).

Now we know thatI(C) consists of the descents and plateaus of C which are disjoint and cover C by their definition, soJ(C)=I(C) and C is critical by Theorem 4.2. To count the number of MSI’s, note that if c is a position counted by d(ηu) then the vertex just after the edge labeled c in C will be a descent, unless that edge is the very last one. On the other hand, if c is counted by D(ηu) then the run of c’s in l(C) contribute both a plateau and a descent just after the plateau toI (unless the run is at the end of C when only the plateau will be an interval). In this manner we count each MSI exactly once for a total of d(ηu)+2D(ηu)−1 intervals.

Proposition 5.6. If C is weakly decreasing and ends at an embeddingηuwhich is not normal then C is not critical.

Proof: As in the proof of the previous proposition, it suffices to consider the case where l(C) consists of a label c repeated k≥2 times so that w(c)=ηu(c)+k. If ηu(c)>0 then|w| = |u|and so the Same Length Lemma applies to show that C is the only chain fromwto c. In particular, it is the lexicographically first chain and thus not critical.

Ifηu(c)=0 thenw(c)=k and, sinceηuis not normal, it must be that c is the first index in this run of k’s inw. Let

ηu(c−1)=w(c−1)=h=k. (14) To demonstrate that C is not critical, it suffices to show that there is no MSI I containing the elementv1 in C. Suppose, to the contrary that such an interval I exists and let C be a chain giving rise to I . Using the Same Length Lemma as in the proof of Proposition 5.5 (third paragraph) we see that Cmust rejoin C at u and this forces I =C(w,u).

To finish the proof, it suffices to find a skipped interval I⊂I since that will contradict the minimality of I . Let b be the smallest label in l(C). Then b<c since l(C)<l(C)=(c^k). The same argument used at the end of the third paragraph of the previous proposition together with Eq. (14) give

ηu(c)=ηu(c−1)=h =k=w(c).

Since the parts of a composition can only (weakly) decrease along a chain, we must haveηu(c)< w(c). It follows that c is a label on C. Now consider any permutation of l(C) which starts l=(c,b, . . .) and let C be the chain specified by l. Then by construction and the Chain Specification Lemma l(C)≤l<l(C). This implies that C<C in PL-order and, by construction again, Ccontainsv1. Thus no MSI of C can containv1, a contradiction.

Our third and final proposition completes the proof of Theorem 5.4

(15)

Proposition 5.7. If C is not weakly decreasing then C is not critical.

Proof: If C is not weakly decreasing then it has an ascentv. It suffices to show thatv is in no MSI. Suppose, to the contrary, thatvis in an MSI, I =C(vi, vj). Then by the Descent Lemma, I contains no descents and so C is weakly increasing fromvi tovj. As in the previous two proofs, it is no loss of generality to assumevi =wandvj =u so that C is itself an MSI.

Since C is an MSI, it is not the first chain in [u, w]. That first chain is the unique weakly increasing chain which ends at the rightmost embeddingρuof u intow. Thus ifηuis the embedding defined by C then we must haveηu =ρu.

Forηu define

z_η( j )=#{i ≤ j|ηu(i )=0}

and similarly define z_ρ( j ) forρu. Becauseρu is rightmost we always have z_ρ( j )≥ z_η( j ) with equality when j = |w|. Butρu is not equal toηu, so there is an index a such that z_ρ(a)>z_η(a). Thus there is a first index c>a such that z_ρ(c)=z_η(c). This definition of c forces ηu(c)=0 and ρu(c)=k for some k>0. Butηu andρu are embeddings of the same composition, so there must be some index b with a≤b<c such thatηu(b)=ρu(c)=k andηu(i )=0 for b<i ≤c.

We can now derive a contradiction by constructing a smaller skipped interval in C as follows. We havew(c)≥ρu(c)=k andηu(c)=0. Since C is weakly increasing, the labels equal to c must occur as a plateau. Therefore there must be verticesw,u∈C such that I=C(w,u) satisfiesηw(c)=k,ηu(c)=0, and l(I)=(c^k). But

ηw(i )=ηu(i )=ηu(i )= k if i =b, 0 if b<i<c,

so there is a chain Cfromw to u with l(C)=(b^k). Since b<c, I is a skipped interval and we have obtained the desired contradiction.

We can now rederive the formula for μ(u, w) in P^∗. Combining Eq. (10) with Theorems 4.1, 4.2, and 5.4 we obtain

μ(u, w)=χ˜(u, w)=

C

(−1)^dim^σ^C =

ηu

(−1)^d(^η^u⁾⁺^2D(^η^u⁾⁻²=

ηu

(−1)^d(^η^u⁾

where the first sum is over all critical chains C in (u, w) and the other two are over all normal embeddingsηuintow.

We end this section by remarking that the Morse method can be used as a powerful tool not just for proving theorems but for discovering the correct statement to be proved.

The reader may have found our definition of a normal embedding somewhat ad hoc.

However, by starting with the very natural chain labeling used above and looking at the critical chains, one is quickly led to this definition in order to characterize the embeddings at which such chains end. Similarly, the defect may seem to have come

(16)

out of nowhere, but in order to determine the dimension of the critical cells one is forced to define this quantity as well as its big brother, D(ηu).

6. Generalized subword order

We can now generalize both our result and Bj¨orner’s as follows. Let (P,≤P) be any poset. The generalized subword order over P is the partial order on P^∗ obtained by setting u≤P^∗ wif wcontains a subsequence w(i₁), w(i₂), . . . , w(i_|u|) such that u( j)≤P w(ij) for 1≤ j≤ |u|. We get ordinary subword order when P= A is an antichain and we get our composition poset when P =P.

It is a simple matter to recast this generalized order in terms of embeddings. Let ˆ0 be a special element which is not in P and let ˆP be the poset obtained by adjoining ˆ0 as a minimum element, i.e., ˆ0<Pˆ x for all x ∈P. Then the definitions of support and expansion are as usual, just replacing 0 with ˆ0. An embedding of u intowis a length

|w|expansionηuof u with

ηu(i )≤Pˆ w(i ) for 1≤i ≤ |w|.

As expected, u≤P^∗wif and only if there is an embedding of u intow.

Finding an analogue of normality in this context is more delicate, and so far we have only been able to do it for a special class of posets. However, there is evidence that more general results are possible; the next section contains a discussion of this issue. First note that the definition of a run carries over verbatim to any P^∗. Now call P a rooted tree if its Hasse diagram is a tree with a minimum element. A rooted forest is a poset where each connected component of its Hasse diagram is a rooted tree. Note that both antichains and chains are rooted forests. Note also that if P is a rooted forest then ˆP is a rooted tree so the following definition makes sense. If x ∈ P where P is a rooted forest then let x⁻be the element adjacent to x on the unique path from x to ˆ0 in the Hasse diagram for ˆP. For a rooted forest, a normal embedding of u intowis an embeddingηuintowsatisfying two conditions.

1. For 1≤i≤ |w|we haveηu(i )=w(i ),w(i )⁻, or ˆ0.

2. For all x ∈ P and every run [r,t] of x’s inw, we have (a) (r,t]⊆Suppηu if x is minimal in P,

(b) r∈Suppηuotherwise.

Finally, we need the definition of defect in this situation, which is as expected:

d(ηu)=#{i|ηu(i )=w(i )⁻}

for a normal embeddingηu intow. The following theorem is the promised generalization of Theorems 2.1 and 2.2. Both of the two proofs we have given of the special case where P =Pgeneralize easily, with the minimal elements in P playing the role of 1 and the rest functioning like the integers k ≥2.

(17)

Theorem 6.1. Let P be a rooted forest. Then the M¨obius function of P^∗is given by μ(u, w)=

ηu

(−1)^d(η^u⁾, where the sum is over all normal embeddingsηuof u intow.

7. Comments and open problems

There are several possible avenues for future research. We discuss some of them here.

7.1. Generating functions

As mentioned in the introduction, Bj¨orner and Reutenauer [6] gave another proof of the formula forμin A^∗ using generating functions on monoids. LetZAdenote the algebra of formal series using the elements in A as noncommutative variables and the integers as coefficients. Such a series can be written

f =

w∈A^∗

c_ww

for certain c_w∈Z. For example, given u∈ A^∗one can consider the series

m(u)=

w≥u

w u

n

w. (15)

Björner and Reutenauer showed that (15) is rational for any u and obtained, upon specialization of the variables, nice expressions for various ordinary generating func- tions associated with the Möbius function of A^∗. They also derived results for the zeta function of A^∗. The map m : A^∗→ZAcan be extended to a continuous linear endomorphism ofZA. In fact, the full incidence algebra of A^∗is isomorphic to a subalgebra of this endomorphism algebra. Björner and Reutenauer gave another proof of Theorem 2.1 using this fact.

It is natural to try and apply these ideas toP^∗, and more generally to rooted forests.

This has been done by Bj¨orner and Sagan [7].

7.2. The poset of permutations

Our original interest in P^∗ came from the rapidly growing subject of permutation patterns. For an overview of permutation patterns the reader is referred to B´ona’s text [10]. Let Sndenote the nth symmetric group and letπ ∈Snandσ ∈Sk. We say thatπcontains aσ-pattern, and writeπ≥σ, if there are indices i1<i2<· · ·<ik

such that the subsequence π(i1)π(i2). . . π(ik) has the same pairwise comparisons as σ(1)σ(2). . . σ(k). This subsequence is called a copy of σ in π. For example, 312≤24153 because of the copy 413. This defines a partial order on the set of all finite permutations. Figure 3 displays the interval [1,31524] in this poset. Wilf was the first to ask the following question.

(18)

Fig. 3 The Hasse diagram for the interval [1,31524] in the pattern containment ordering on permutations

Question 7.1 (Wilf [27]). What can be said about the M¨obius function of permutations under the pattern-containment ordering?

Given two permutationsπ ∈Smandσ ∈Sn, their direct sum is the permutation of length m+n whose first m elements formσand whose last n elements are the copy of πgotten by adding m to each element ofπ. For example, 132⊕32145=13265478.

A permutation is said to be layered if it can expressed as the direct sum of some number of decreasing permutations. (An equivalent characterization of layered permutations is that they are the permutations that contain neither a 231-pattern nor a 312-pattern.) Our previous example is layered because 13265478=1⊕21⊕321⊕1⊕1. Clearly the set of layered permutation of length n is in bijection with the set of compositions of n. Almost as clearly, this bijection sends the pattern-containment order to the composition order we have considered, so Theorem 2.2 answers Wilf’s question for the set of layered permutations.

Any normal embedding approach to describing the M¨obius function for permutations in general would need to incorporate non-unitary weights, as witnessed by the fact thatμ(1,31524)=6.

7.3. Factor order

Subword order is not the only partial order on the set of words. We say that the word u is a factor of the wordwif there exist (possibly empty) wordsv1 andv2 so that w=v1uv2, or in other words, if u occurs as a contiguous subword inw. Bj¨orner [5]

showed that the M¨obius function for factor order only takes on values in{0,±1}and gave a recursive rule that allows the computation ofμ(u, w) in O(|w|²) steps.

The factor order can be defined on P^∗for any poset P: we say that u is a factor of wif there are wordsv1, v2, v3such that:

1. w=v1v2v3, 2. |v2| = |u|,

3. u(i )≤v2(i ) for all 1≤i ≤ |u|.

Indeed, this is one of the orders onP^∗ studied by Snellman [21, 22]. The M¨obius function ofP^∗under factor order remains unknown.