DOI 10.1007/s10801-006-0017-4

**The M¨obius function of a composition poset**

**Bruce E. Sagan****·****Vincent Vatter**

Received: 22 July 2005 / Accepted: 11 January 2006 / Published online: 11 July 2006

CSpringer Science+Business Media, LLC 2006

**Abstract We determine the M¨obius function of a poset of compositions of an integer.**

In fact, we give two proofs of this formula, one using an involution and one involving discrete Morse theory. This composition poset turns out to be intimately connected with subword order, whose M¨obius function was determined by Bj¨orner. We show that, using a generalization of subword order, we can obtain both Bj¨orner’s results and our own as special cases.

**1. Introduction**

*If A is any set then the corresponding Kleene closure or free monoid, A*^{∗}is the set of
*words with letters from A, i.e.,*

*A*^{∗}= {w=*w*(1)*w*(2)*. . . w(n)*|*n* ≥0 and*w(i )*∈ *A for all i}.*

We denote the length (number of elements) of*w*by|w|.

This work was partially done while B. E. Sagan was on leave at DIMACS.

Partially supported by an award from DIMACS and an NSF VIGRE grant to the Rutgers University Department of Mathematics.

B. E. Sagan ( )

Department of Mathematics, Michigan State University East Lansing, MI 48824-1027

e-mail: sagan@math.msu.edu V. Vatter

School of Mathematics and Statistics, University of St Andrews, St Andrews, Fife, Scotland KY16 9SS

e-mail: vince@mcs.st-and.ac.uk

Letting Pdenote the positive integers, we see thatP^{∗} is the set of integer com-
positions (ordered partitions). We can turnP^{∗} into a partially ordered set by letting
*u* ≤*w*if there is a subword*w(i*_{1})*w(i*_{2})*. . . w(i**l*) of*whaving length l*= |u|such that
*u( j)*≤*w(i**j*) for 1≤ *j* ≤*l. For example, 433*≤16243 because of the subword 643.

Bergeron, Bousquet-M´elou and Dulucq [2] were the first to study P^{∗}, enumerating
saturated chains that begin at its minimal element. Snellman has also studied saturated
chains in this poset as well as two other partial orders onP^{∗} [21, 22]. One of these
other partial orders was originally defined by Bj¨orner and Stanley [8] who showed
that it has analogues of many of the properties of Young’s lattice. One of the main
results of this paper is a formula for the M¨obius function ofP^{∗}.

This order on P^{∗} *is closely related to subword order. If A is any set then the*
*subword order on A*^{∗} *is defined by letting u*≤*w* if *w* contains a subsequence
*w(i*1), w(i2), . . . , w(i*l**) such that u( j )*=*w(i**j*) for 1≤ *j* ≤*l* = |u|. By way of illus-
*tration, if A*= {a,*b}then abba*≤*aabbbaba becausew*(1)*w*(3)*w*(5)*w*(8)=*abba.*

*Note that we use the notation A*^{∗}when referring to subword order as opposed to the
partial order on P^{∗}, even though we use≤ for both. We will always give enough
context to make it clear which poset we are dealing with. Bj¨orner [4] was the
first to completely determine the M¨obius function of subword order, although spe-
cial cases had been obtained previously by Farmer [13] and Viennot [24]. In fact,
Bj¨orner gave two proofs of his formula, one using an involution [5] and one using
shellability [4]. He also gave a demonstration with Reutenauer [6] via generating
functions on monoids. Another proof was given by Warnke [26] using induction,
while Wang and Ma [25] used Cohen-Macaulayness to investigate *μ. We derive*
the M¨obius function formula for P^{∗} using first combinatorial and then topological
techniques.

In the next section, we review Bj¨orner’s result for subword order as well as the
related definitions which will be useful forP^{∗}. This section contains a statement of
our formula for the M¨obius function ofP^{∗} in Theorem 2.2. Section 3 is devoted to
giving a proof of this theorem using a sign-reversing involution.

*Although intervals in A*^{∗}are shellable, those inP^{∗}need not even be connected as
evident from the example in Fig. 2, so we need a more powerful tool to study the
topology of this composition poset. For this we turn to discrete Morse theory, which
was developed by Forman [14, 15] and has been a powerful tool in the computation of
the homology of many CW-complexes arising naturally in topological combinatorics.

A method for applying this theory to the order complex of a poset was given by Babson and Hersh [1], and further studied by Hersh herself [16]. Since this is a relatively recent addition to the combinatorial toolbox, we provide an exposition of the basic ideas of the theory in Section 4. The subsequent section gives a Morse theoretic proof of Theorem 2.2.

*The similarity between the formulas for the M¨obius functions of A*^{∗}andP^{∗}leads one
*to ask if there is a common generalization. In fact, if P is any poset then there is a partial*
*order on P*^{∗}which we call generalized subword order. It has been used in the context
of well-quasi-ordering; see Kruskal’s article [18] for a survey of the early literature.

*When P is a chain or an antichain, one recovers our results or Bj¨orner’s, respectively.*

This construction is studied in Section 6. We end with a section of comments and open problems.

**2. Subword and composition order**

We now review Bj¨orner’s formula for the M¨obius function of subword order, refor- mulating it slightly so as to emphasize the connection to the composition order which is our main objective. We assume the reader is familiar with M¨obius functions, but all the necessary definitions and theorems we use here can be found in Stanley’s text [23, Sections 3.6–3.7].

*We first need to restate the definition of the partial order in A*^{∗} in a way that,
although slightly more complicated, has a direct connection with the M¨obius function.

Suppose we have a distinguished symbol, 0, and suppose that 0∈ *A. Then a word*
*η*=*η(1)η(2). . . η(n)*∈*( A*∪0)^{∗}*has support*

Supp*η*= {i |*η(i )*=0}.

*An expansion of u*∈*( A*∪0)^{∗} is a word*η**u*∈*( A*∪0)^{∗}*such that the restrictions of u*
and*η**u* *to their supports are equal. For example, if u*=*abba then one expansion of*
*u isη**u* =*a0b0b00a. An embedding of u intow*is an expansion*η**u**w**of u which has*
length|w|and satisfies

*η**u**w**(i )*=*w(i )* *for all i* ∈Supp*η**u**w*.

*Note that u* ≤*win A*^{∗}*if and only if there is an embedding of u intow*. The example
expansion*η**u**w*given above is exactly the embedding which corresponds to the subword
of*w*=*aabbbaba given at the beginning of the third paragraph of this paper. Ifw*is
clear from context we simply write*η**u*for*η**u**w*.

*The M¨obius function of A*^{∗} *counts certain types of embeddings. If a*∈ *A then a*
*run of a’s inwis a maximal interval of indices [r,t] such that*

*w(r )*=*w(r*+1)= · · · =*w(t)*=*a.*

Continuing our example, the runs in*w*=*aabbbaba are [1,*2], [3*,*5], [6*,*6], [7*,*7],
and [8*,*8]. Call an embedding*η**u*into*wnormal if, for every a*∈ *A and every run [r,t]*

*of a’s inw*, we have

*(r,t]*⊆Supp*η**u*

*where (r,t] denotes the half-open interval. In our running example, this means that*
*w(2)*=*a and* *w(4)*=*w(5)*=*b must be in any normal embedding. (If r* =*t then*
*(r,t]*= ∅, so there is no restriction on runs of one element.) Thus in this case there
*are precisely two normal embeddings of u intow*, namely

*η**u* =*0a0bba00 and 0a0bb00a.*

Define_{w}

*u*

*n**to be the number of normal embeddings of u intow.*

We now have everything in place to state Bj¨orner’s result.

* Theorem 2.1 (Bj¨orner [4]). If u, w*∈

*A*

^{∗}

*then*

*μ(u, w)*=(−1)

^{|w|−|u|}

*w*
*u*

*n*

*.*

Finishing our example, we see that

*μ(abba,aabbbaba)*=(−1)^{8}^{−}^{4}·2=2*.*

We now turn toP^{∗}. The definitions of support and expansion are exactly as before,
but the notion of embedding must be updated to reflect the different partial order. To
*this end we define an embedding of u intow*as an expansion*η**u* *of u having length*

|w|such that

*η**u**(i )*≤*w(i )* for 1≤*i* ≤ |w|.

*Again, u*≤*w*inP^{∗} *is equivalent to the existence of an embedding of u intow*. An
interval inP^{∗}is displayed in Fig. 1.

*If u*≤*w* and*η**w* is an expansion of*w* *then there is a unique last or rightmost*
embedding *ρ**u**w* *of u intoη**w* which has the property that for any other embedding
*η**u**w**of u intoη**w*one has Supp (*η**u**w*)≤Supp (*ρ**u**w**). (If S*= {i1*<*· · ·*<i**m*}*and S*^{}=
{i_{1}^{} *<*· · ·*<i*_{m}^{}}*then we write S*≤ *S*^{}*to mean that i**j*≤*i*^{}* _{j}* for 1≤

*j*≤

*m.) Note that*

*ρ*

*u*

*w*depends on

*η*

*, not just on*

_{w}*w, but the expansion ofw*used will always be clear from context.

**Fig. 1 The interval [322,**3322] and its maximal chains

Like subword order, we define normal embeddings forP^{∗}in terms of runs (defined
in the same way in this context). We say that an embedding*η**u*into*wis normal if the*
following conditions hold.

1. For 1≤*i*≤ |w|we have*η**u**(i )*=*w(i ),w(i )*−1, or 0.

*2. For all k*≥*1 and every run [r,t] of k’s inw, we have*
*(a) (r,t]*⊆Supp*η**u**if k*=1,

*(b) r* ∈Supp*η**u**if k*≥2.

Comparing this with Bj¨orner’s definition, we see that inP^{∗}a normal embedding can
have three possible values at each position instead of two. Also, the run condition
*for ones is the same as in A*^{∗}, while that condition for integers greater than one is
complementary. As an example, if*w*=*2211133 and u*=21113, then there are two
normal embeddings, namely *η**u* =2101130 and 2011130. Note that 2001113 and
0211130 are not normal since they violate conditions (1) and (2), respectively.

As with subword order, the M¨obius function for compositions is given by a signed
sum over normal embeddings, although here the sign of a normal embedding depends
on the embedding itself and not just the length of the compositions. Given a normal
embedding*η**u* into*wwe define its defect to be*

*d(η**u*)=#{i|*η**u**(i )*=*w(i )*−1}.

*The sign of the embedding is, then, (*−1)^{d(}^{η}^{u}^{)}.
We can now state our main theorem aboutP^{∗}.
* Theorem 2.2. If u, w* ∈P

^{∗}

*then*

*μ(u, w*)=

*η**u*

(−1)^{d(}^{η}^{u}^{)}

*where the sum is over all normal embeddingsη**u**intow.*

In the example of the previous paragraph, this gives

*μ(21113,*2211133)=(−1)^{2}+(−1)^{0}=2.

Although this example does not show it, it is possible to have cancellation among the
terms in the sum for*μ.*

**3. Proof by sign-reversing involution**

We now prove Theorem 2.2 using a sign-reversing involution. The proof is similar in nature to Bj¨orner’s proof in [5], but is significantly more complicated.

* Proof (of Theorem 2.2): If u*=

*w*then there is exactly one normal embedding

*η*

*ww*

and it has defect 0. This gives (−1)^{0}=1=*μ*(*w, w*), as desired.

*Now assume that u< w*=*w*(1)· · ·*w(n). Since the M¨obius recurrence uniquely*
defines*μ*, it suffices to show that

*v∈**[u**,w*]

*η**vw*

(−1)^{d(}^{η}^{vw}^{)}=0,

where the inner sum is over all normal embeddings of *v* into*w. We prove this by*
constructing a sign-reversing involution on the set of normal embeddings *η**vw* for
*v*∈*[u, w*].

Let*ρ**u**v**denote the rightmost embedding of u intoη**vw**, so for all i* ∈[1*,n],*
*ρ**u**v**(i )*≤*η*_{vw}*(i )*≤*w(i ).* (1)
*Also let s denote the left-most index whereρ**u**v*and*wdiffer, i.e., s*=min{i :*ρ**u**v**(i )*=
*w(i )*}*. Since u< w, s must exist, and by the definition of s we have*

*ρ**u**v**(i )*=*η*_{vw}*(i )*=*w(i )* *for all i*∈[1,*s).* (2)
*Set k* =*w(s), soη**vw**(s) is k, k*−1, or 0 by normality and*ρ**u**v**(s)*≤*k*−1. Finally, let
*[r,t] denote the indices of the run of k’s inwthat contains the index s.*

Our involution maps*η** _{vw}*to the embedding

*η*

*=*

_{vw}*η*

*(1)· · ·*

_{vw}*η*

_{vw}*(n) whereη*

_{vw}*(i )*=

*η*

_{vw}*(i ) for all i*=

*s andη*

_{vw}*(s) is determined by the following rules. If k*=1 then

*η**vw**(s)*=

0 if*η*_{vw}*(s)*=1,

1 if*η**vw**(s)*=0. (3)

*If k* ≥*2, s>r , andρ**u**v**(s)*=0 then
*η*_{vw}*(s)*=

0 if*η*_{vw}*(s)*=*k*−1,

*k*−1 if*η*_{vw}*(s)*=0. (4)

*Finally, if k* ≥*2 and either s*=*r orρ**u**v**(s)*=0 then
*η**vw**(s)*=

*k*−1 if*η*_{vw}*(s)*=*k,*

*k* if*η**vw**(s)*=*k*−1. (5)

It is not obvious that this map is defined for all normal embeddings *η** _{vw}* when

*k*≥

*2. For example, if s*

*>r and*

*ρ*

*u*

*v*

*(s)*=0, then we should apply (4), but it is a priori possible that

*η*

_{vw}*(s)*=

*k, in which case (4) is not defined. However, since*

*s>r ,ρ*

*u*

*v*

*(s*−1)=

*w(s*−1)=

*k by (2), and this contradicts our choice ofρ*

*u*

*v*as the

*rightmost embedding of u intoη*

_{vw}*. Similar issues arise when s*=

*r orρ*

*u*

*v*

*(s)*=0: if

*s*=

*r thenη*

*vw*

*(s)*=0 by normality, and if

*ρ*

*uv*

*(s)*=0 then

*η*

*vw*

*(s)*=0 by (1).

Having established that this map is indeed defined on all normal embeddings, we
have several properties to prove. First, it is evident from (3), (4), and (5) that the number
*of elements equal to k*−1 changes by exactly one in passing from*η**vw* to*η**vw*, and

thus (−1)^{d(}^{η}^{vw}^{)}= −(−1)^{d(}^{η}^{vw}^{)}. We must now prove that*η**vw*is a normal embedding of
*v*into*w*for some*v*∈*[u, w*] and that this map is an involution.

We begin by showing that*η**vw*is an embedding of some*v*∈*[u, w*] into*w*. It follows
from the fact that*η**vw*is an embedding into*w*and the definition of our map that*η**vw*is
an embedding of some word*v*into*w, so we only need to show thatv*≥*u. We prove*
this by showing that

*η*_{vw}*(i )*≥*ρ**u**v**(i )* (6)

*for all i* ∈[1*,n]. This is clear for all i* =*s because for these indicesη**vw**(i )*=*η**vw**(i )*≥
*ρ**u**v**(i ). Furthermore,ρ**u**v**(s)*≤*k*−*1 by the definition of s, so the only case in which*
*(6) is not immediate is when k* ≥2 and *η*_{vw}*(s)*=0. However, this can only occur
from using (4), which requires that *ρ**u**v**(s)*=0, completing the demonstration that
*v*∈*[u, w].*

We now aim to show that *η** _{vw}* is a normal embedding. If

*η*

_{vw}*(s)>*0 then Supp (η

*)⊇Supp (η*

_{vw}*), so the normality of*

_{vw}*η*

*follows from the normality of*

_{vw}*η*

*and the fact that*

_{vw}*η*

_{vw}*(s) is either k*−

*1 or k. Ifη*

_{vw}*(s)*=0 then there are two cases depending on whether (3) or (4) was applied. Suppose first that (3) was applied, so

*k*=1. Comparing (3) with the definition of normality, we see that it suffices to show

*s*=

*r . Suppose to the contrary that s*∈

*(r,t]. Since s is the left-most position at which*

*ρ*

*uv*and

*w*differ, we have

*ρ*

*uv*

*(s)*=0 and

*ρ*

*uv*

*(s*−1)=1. However, this contradicts our choice of

*ρ*

*u*

*v*

*as the rightmost embedding of u inη*

*vw*. Now suppose that (4) was

*applied, so s>r andρ*

*u*

*v*

*(s)*=0. Then (2) implies that

*η*

*vw*

*(r )*=

*η*

*vw*

*(r )*=

*k, and*normality is preserved.

It only remains to show that this map is an involution. Consider applying the map
to*η**vw*. In this process, we define*ρ**u**v* *to be the rightmost embedding of u intoη**vw*,
*s*=min{i :*ρ**u**v**(i )*=*w(i )}, and k*=*w(s). We then follow the rules (3) (4), and (5) to*
construct an embedding*η** _{vw}*, which we would like to show is equal to

*η*

*.*

_{vw}First we claim that *ρ**u**v*=*ρ**u**v*. Suppose to the contrary that *ρ**u**v*=*ρ**u**v*. By (6),
*η*_{vw}*(i )*≥*ρ**u**v**(i ) for all i , soρ**u**v**also gives an embedding of u intov, and thus the only*
way we can have*ρ**u**v*=*ρ**u**v*is if*ρ**u**v*is further to the right than*ρ**u**v*. This requires that
0=*ρ**u**v**(s)< ρ**u ¯**v**(s)*≤*k* (7)
and that

*η**vw**(s)< η**vw**(s).* (8)

Because we are assuming that*ρ**vw* is further to the right than*ρ**vw*, there is some
*position to the left of s at whichρ**u**v*is nonzero. In fact, (2) shows that we must have
*ρ**u**v**(s*−1)=0 and also implies that

*η**vw**(s)< ρ**u**v**(s)*=*ρ**u**v**(s*−1)=*w(s*−1)*.* (9)
We now consider the three cases arising from each of the rules (3), (4), and (5) in turn.

*Suppose (3) was applied so k*=1. It follows from (8) and the definition of our map
that*η**vw**(s)*=0. Also (7) and (9) give*w(s*−1)=*ρ**u**v**(s)*=1. However, this implies
that*η**vw*zeroed out a 1 which was not the first in its run, contradicting normality.

Now suppose (4) was applied. Then by (8) we have*η**vw**(s)*=*k*−1. We also have
*s>r which in conjunction with (9) givesρ**u**v**(s)*=*w(s*−1)=*k. This implies that*
*η**vw**(s)< ρ**u**v**(s), but that contradicts thev*version of (1).

*Finally suppose that (5) was used. By (7) it must be the case that s*=*r . Also,*
Eq. (8) gives *η*_{vw}*(s)*=*k*−1 and *η*_{vw}*(s)*=*k. Now applying (7) and (9) we have*
*k*−1*< w(s*−1)=*ρ**u**v**(s)*≤*k, sow(s*−1)=*k which contradicts that fact that s* =
*r .*

Now that we have established the equality of*ρ**u**v* and*ρ**u**v*, the fact that this map
is an involution can be readily observed. We must have*s*=*s, so k*=*k, and thus we*
apply the same rule to go from*η** _{vw}*to

*η*

*as we applied to get*

_{vw}*η*

*from*

_{vw}*η*

*, and each of these rules is clearly an involution.*

_{vw}

**4. Introduction to discrete Morse theory**

In this section we review the basic ideas behind Forman’s discrete Morse theory [14, 15]

as well as Babson and Hersh’s method for applying the theory to the order complex of a poset [1].

*Let X be a CW-complex. Since we will be working in reduced homology, we*
*assume that X has an empty cell*∅of dimension−1 which is contained in every cell
*of X . Ifσ* *is a d-cell (cell of dimension d) in X then letσ*^{∂}*be the set of (d*−1)-cells
*τ* which are contained in the closure ¯*σ*. Dually, let*σ*^{δ}*denote the set of (d*+1)-cells
*τ* such that*σ* ⊂*τ*¯.

*A real-valued function f on the cells of X is a Morse function if it satisfies the*
following two conditions.

1. For every cell*σ* *of X we have*
(a) #{τ ∈*σ** ^{∂}*|

*f (τ*)≥

*f (σ*)} ≤1, and (b) #{τ ∈

*σ*

*|*

^{δ}*f (τ*)≤

*f (σ*)} ≤1.

2. If*τ* ∈*σ*^{δ}*and f (τ*)≤ *f (σ*) then*σ*is a regular face of*τ*.

*Intuitively, the first condition says that, with only certain exceptions, f increases*
*with dimension. In fact, f (σ*)=dim*σ* *is a perfectly good Morse function on X ,*
although we will see shortly that it is not very interesting. A simple example of a
*Morse function on a CW-complex is given in Fig. 1 where the value of f is given next*
to each cell*σ* *and we also set f (∅)*= −1.

The fact that condition (1) holds for every cell implies that, in fact, at most one
of the two sets under consideration has cardinality equal to 1. Thus the function
*f induces a Morse matching between pairs of cellsσ, τ* with*σ* ∈*τ*^{∂}*and f (σ*)≥
*f (τ). Furthermore, condition (1) implies that this matching is acyclic in the following*
*sense: the directed graph formed from the Hasse diagram of the face poset of X by*
directing every edge of the matching upwards and all other edges downwards has no

directed cycles. The regularity condition ensures that for each matched pair there is
an elementary collapse of ¯*τ* onto ¯*τ*−(*τ* ∪*σ*).

*The cells which are not matched by f are called critical. From the acyclicity*
property and the fact that each individual collapse is a homotopy equivalence, it
*follows that X can be collapsed onto a homotopic complex X** ^{f}* built from the critical
cells. In our example, the cells labeled 1 and 2 are matched, and after collapsing we

*clearly have a complex which is still homotopically a circle. Note that if we take f to*

*be the dimension function then every cell is critical and X*

*=*

^{f}*X , so the cell complex*does not simplify in this case which does not help in understanding its structure.

Let ˜*m**d* *be the number of critical d-cells of X and let ˜b**d* *be the d-th reduced Betti*
number over the integers. We also use ˜*χ(X ) for the reduced Euler characteristic. From*
*the considerations in the previous paragraph, we have the following Morse inequalities*
which are analogous to those in traditional Morse theory.

**Theorem 4.1 (Forman [15]). For any Morse function on a cell complex X we have***1. ˜b**d* ≤*m*˜*d* *for d*≥ −1, and

2. ˜*χ(X )*=

*d*≥−1(−1)^{d}*m*˜*d**.*

*(One can get further inequalities relating various partial alternating sums of the ˜b**d*

and ˜*m**d*.) Continuing our example, we see that ˜*m*_{−1}=*m*˜0 =*m*˜1=1 which bound

*˜b*_{−1}= *˜b*0 =*0 and ˜b*_{1}=1, as well as ˜*χ(X )*= −1+1−1= −1, as expected.

*We now turn to the special case of order complexes. Let P be a poset and consider*
*an open interval (u, w) in P. The corresponding order complex* *(u, w*) is the abstract
*simplicial complex whose simplices (faces) are the chains in (u, w*). We are interested
in the order complex because of the fundamental fact [20] that

*μ(u, w)*=*χ( (u, w)).*˜ (10)

Therefore finding a Morse function for *(u, w*) could permit us to derive the
corresponding M¨obius value as well as give extra information about its Betti numbers.

*Suppose we have an ordering of the maximal chains of (u, w*) (facets of *(u, w*)), say
*C*1*,C*2*, . . . ,C**l*. Call a face (subchain)*σ* *of C**k* *new if it is not contained in any C**j*

*for j* *<k. We would like to construct a Morse matching inductively, where at the kth*
*stage we extend the matching on the faces in C**j* *for j* *<k by matching up as many*
of the new faces*σ* *in C**k* as possible. It turns out that under fairly mild conditions
on the facet ordering, one can construct such a matching so that all the new faces in
*C**k*are matched if there are an even number of them, and only one is left unmatched
if the number is odd. Thus adding each facet contributes at most one critical cell.

*A maximal chain contributing a critical cell is called a critical chain. In reading the*
details of this construction, the reader may find it useful to refer to the example of the
interval [322*,*3322]⊂P^{∗}*given in Fig. 2. Note that, by abuse of notation, we include u*
and*wwhen writing out a maximal chain C, even though C is really a subset of the open*
*interval (u, w*). Also, because of the way our chain order is constructed, we start with
the top element*wand work down to u which is dual to what is done normally. Thus in*
*a chain C, terms like “first” and “last” refer to this ordering of C’s elements. Finally,*
we list the elements of a chain as embeddings into*w*for reasons which will become
apparent when we also describe the labels given to the edges (covers) of a chain.

**Fig. 2 A Morse function on a CW-complex**

*To define the types of chain orderings we consider, suppose we have two chains C :*
*w*=*v*0 *v*1 *. . .* *u and C*^{}: *w*=*v*_{0}^{} *v*^{}_{1} *. . .* *u where*
*x* *y means that x covers y. Then we say that C and C*^{}*agree to index j ifv**i* =*v*^{}_{i}*for i*≤ *j. In addition, C and C*^{} *diverge from index j if they agree to index j and*
*v**j+1* =*v*^{}_{j+1}*. In addition, we use the notation C*^{}*<C to mean that C*^{}comes before
*C in the order under consideration. An ordering of the maximal chains of [u, w*] is a
*poset lexicographic order, or PL-order for short, if it satisfies the following condition.*

*Suppose C*^{}*and C diverge from index j with C*^{}*<C. Then for any maximal chains D*^{}
*and D which agree to index j*+*1 with C*^{}*and C, respectively, we must have D*^{}*<D.*

Note that orderings coming from the EL-labelings introduced by Bj¨orner [3] or from the more general CL-labelings of Bj¨orner and Wachs [9] are PL-orders as long as one breaks ties among labels consistently.

The PL-order we use inP^{∗} *is as follows. If x* *y is a cover then y is obtained*
*from x by reducing a single part of x by 1. Thus there is a unique normal embedding*
*of y into x, since if a 1 is reduced to 0 then it must be the first element in the run*
of ones to which it belongs. Similarly, for any expansion*η**x*there is a unique normal
*embedding of y intoη**x**. Now given any chain C :* *w*=*v*0 *v*1 *. . .* *u*
we inductively associate with each*v**j* an embedding*η*_{v}*j* into*w*where*η** _{v}*0 =

*w*and,

*for j*≥0,

*η*

*v*

*j*+1is the unique normal embedding of

*v*

*j+1*into

*η*

*v*

*j*. We label the edge

*v*

*j*

*v*

*j+1*

*of C with the index i of the position which was decreased in passing*from

*η*

*v*

*j*to

*η*

*v*

*j*+1. Furthermore, we often write

*η*

*v*

*j*in place of

*v*

*j*when listing the

*elements of C. Figure 2 illustrates this labeling. It is important to note that although*

*η*

*v*

*j*+1 is normal in

*η*

*v*

*j*, it need not be normal in

*w*. We should also remark that this labeling is similar to the one used by Bj¨orner [4] in his CL-shelling of the intervals in

*subword order. Finally, if one orders the chains of [u, w*] using ordinary lexicographic order on their label sequences, then the result is a PL-order. This is due to the fact

*that if two chains agree to index j then their first j labels are the same. The chains in*Fig. 2 are listed in PL-order.

We now return to the general exposition. To construct our matching, when we come
*to a chain C :* *w*=*v*0 *v*1 *. . .* *u in a given order we must be able to*
*determine which faces of C are new. Denote the open interval I fromv**i*to*v**j* *in C by*

*I* =*C(v**i**, v**j*)=*v**i*+1 *v**i*+2 *. . .* *v**j*−1*.*

*(Do not to confuse this with an open interval in the poset.) Then I is a skipped interval*
*if C*−*I* ⊂*C*^{}*for some C*^{}*<C. It is a minimal skipped interval or MSI if it does not*
strictly contain another skipped interval. In Fig. 2, the MSI’s are circled. One can find

*the MSI’s by taking the maximal intervals in C*−*(C*∩*C*^{}*) for each C*^{}*<C and then*
*throwing out any that are not containment minimal in C. LetI*=*I(C) be the set of*
*MSI’s in C. Then it is easy to check that a faceσ* *is new in C if and only ifσ* has a
*nonempty intersection with every I* ∈*I(C).*

The set*I*in not quite sufficient to construct the matching because the MSI’s can
overlap and we will need a family of disjoint intervals*J*. It happens that in all the
critical chains forP^{∗}, the intervals in*I*will already be disjoint. So*I*=*J* and we will
not need the following construction, but we include it for completeness. In general,
there are no containments among the intervals in*I, so they can be ordered I*_{1}*,I*2*,I*3*, . . .*
*according to when they are first encountered on C. We now inductively construct the*
set *J* =*J(C) of J -intervals as follows. Let J*_{1}=*I*1. Then consider the intervals
*I*_{2}^{} =*I*2−*J*1*,I*_{3}^{}=*I*3−*J*1*, . . .*; throw out any which are not minimal; and pick the
*first one which remains to be J*2. Continue this process until there are no nonempty
modified MSI’s left.

We are finally in a position to describe the matching. List the maximal chains
*of [u, w] using a PL-order. A family* *K* *of intervals of a maximal chain C covers*
*the chain if C*= ∪K. There are three cases depending on whether*I(C) or* *J(C)*
*covers C or not. First suppose that* *I(C) does not cover C, so neither does* *J(C),*
*and pick x*0*to be the first vertex in C*− ∪I. Consider the map*σ* →*σ {x*0}where
is symmetric difference (not the order complex). One can show that this map is a
fixed-point free involution on the new faces*σin C which extends the Morse matching*
already constructed from the previous chains. Now suppose that*J* = {J1*,J*2*, . . . ,J**r*}
*does cover C and consider the new faceσ**C* = {x1*,x*2*, . . . ,x**r*}*where x**i* is the first
*element of J**i*for 1≤*i* ≤*r . Given any other new faceσ* =*σ**C**, we find the interval J**i*

of smallest index where*σ*∩*J**i* = {x*i*}and map*σ* →*σ {x**i*}. This involution pairs
*up all new faces in C exceptσ**C*, which is critical. Finally, suppose that*I* *covers C*
but*J* does not. Then we use the mapping of the second case to pair up all new faces
whose restriction to ∪J is different from*σ**C*. We also pair up the remaining new
faces (including*σ**C**) by using the mapping of the first case where we take x*0to be the
*first vertex in C*− ∪J. Thus we have outlined the proof of the following theorem,
remembering that the dimension of a simplex is one less than its number of vertices.

**Theorem 4.2 ([1]). Let P be a poset and [u**, w] be a finite interval in P. For any*PL-order on the maximal chains of [u, w], the above construction produces a Morse*
*matching in (u, w) with the following properties.*

*1. The maximal chain C is critical if and only ifJ* *covers C.*

*2. If C is critical then its unique critical cell has dimension #J(C)*−*1.*

**5. A Morse theory derivation of****μ**

We are now ready to find the critical cells for the PL-order inP^{∗} defined previously.

We first need three lemmas which will prove useful in a number of cases. Unless otherwise specified, we always use the notation

*C :* *w*=*v*0 *l*1 *v*1 *l*2 *v*2 *l*3 *. . .* ^{l}^{d}*v**d* =*u* (11)

for labeled maximal chains, or
*C :* *w*=*η** _{v}*0

*l*1 *η** _{v}*1

*l*2 *η** _{v}*2

*l*3 *. . .* ^{l}^{d}*η**u* (12)
*if we wish to be specific about the embeddings determined by C. We also use*

*l(C)*=*(l*1*,l*2*, . . . ,l**d*)
for its label sequence.

*Take an interval [u, w*]⊂P^{∗} with|u| = |w|*and let m** _{i}* =

*w(i )*−

*u(i). Now con-*

*sider the multiset M*

*= {{1*

_{uw}

^{m}^{1}

*,*2

^{m}^{2}

*, . . .}}where i*

^{m}

^{i}*means that i is repeated m*

*times.*

_{i}*Then every permutation of M is the label sequence for a unique maximal chain in*
*[u, w] and this accounts for all the chains. (In fact, [u, v*] is isomorphic to the poset
*of submultisets of M**u**w*.) We record this simple observation for later reference.

* Lemma 5.1 (Same length lemma). If*|u| = |w|

*then the the label function l gives a*

*bijection between the maximal chains in [u, w] and the permutations of M*

*u*

*w*

*. In*

*particular, if M*

*u*

*w*

*contains only one distinct element (possibly with multiplicity) then*

*[u, w] contains a unique maximal chain.*

If|u|*<*|w|then we no longer have the nice bijection of the previous paragraph, but
*we can still say something. Let C be a maximal chain as in (11) and let l*^{}=*(l*_{1}^{}*, . . . ,l*^{}* _{d}*)

*be any permutation of the label sequence l(C). Then l*

^{}defines a sequence of expansions

*η*

_{v}_{0}

^{}

*, η*

_{v}^{}

_{1}

*, . . . , η*

_{v}^{}

*where*

_{d}*η*

_{v}^{}

_{0}=

*wand for j*≥1 we get

*η*

_{v}^{}

*from*

_{j}*η*

_{v}^{}

_{j}_{−}

_{1}by subtracting one

*from position l*

^{}

*in*

_{j}*η*

*v*

^{}

_{j−1}*. It is still true that C*

^{}:

*w*=

*v*

_{0}

^{}

*v*

_{1}

^{}

*. . .*

*v*

*d*

^{}=

*u*

*is a maximal chain in [u, w]. We call C*

^{}

*the chain specified by l*

^{}. Since

*η*

_{v}^{}

*may not be a normal embedding in*

_{j}*η*

_{v}^{}

_{j−1}*, we may not have l(C*

^{})=

*l*

^{}. Still, at the first place where

*l(C*

^{}

*) and l*

^{}differ, that difference must have been caused because using the label in

*l*

^{}would have resulted in changing a 1 to a 0 where that 1 was not the first in its run.

*Thus the corresponding normal embedding in l(C*^{}) uses the first 1 in that run which
*is to the left. Hence l(C*^{})≤*l*^{}in lexicographic order. We summarize this discussion
in the following lemma.

**Lemma 5.2 (Chain specification lemma). If C is a maximal chain in [u**, w] and l^{}*is*
*any permutation of l(C) then l(C*^{})≤*l*^{}*where C*^{}*is the chain specified by l*^{}*.*

As our first application of the Chain Specification Lemma, we can determine what
*happens at descents. A descent of C isv**j* ∈*C such that l**j* *>l**j*+1*. An ascent is defined*
by reversing the inequality.

**Lemma 5.3 (Descent lemma). If**v*j**is a descent of C then it is an MSI.*

**Proof: Let l**^{}*be the permutation of l(C) gotten by interchanging l**j* *and l**j*+1 and let
*C*^{}*be the chain specified by l*^{}*. Then by Lemma 5.2 we have l(C*^{})≤*l*^{}*<l(C), so C*^{}
*comes before C in PL-order and it is easy to check that C*^{}*diverges from C atv**j−1*

*and rejoins C atv**j+1*. Thus{v*j*}is a skipped interval; and since the interval contains

only one element it must also be minimal.

We only need a few more definitions to state our result characterizing the critical
*chains. A chain C will be said to have a certain property, e.g., weakly decreasing, if*
*l(C) has that property. Also, ifη**u*is a normal embedding into*w*then we need to keep
track of the zero positions which did not come from decreasing a 1 in*w*by letting

*D(η**u*)=#{i|*η**u**(i )*=0*, w(i )*≥2}.

* Theorem 5.4. Consider the maximal chains in [u, w]*⊂P

^{∗}

*in the given PL-order.*

*1. There is a bijection between critical chains C and normal embeddingsη**u* *intow*
*where the chain corresponding toη**u**is the unique weakly decreasing chain ending*
*atη**u**.*

*2. If C is critical and ends atη**u* *thenI(C)*=*J(C) and*

#I*(C)*=*d(η**u*)+*2D(η**u*)−1.

*We shall prove this theorem by considering 3 cases: when C is weakly decreasing*
*and ends at a normal embedding, when C is weakly decreasing and does not end at a*
*normal embedding, and when C is not weakly decreasing. Note that for any embedding*
*η**u* into*w*, there is at most one weakly decreasing chain ending at*η**u*, and that if*η**u*

is normal then such a chain will exist because it will be possible to make each cover
normal. Thus there is a bijection between normal embeddings and weakly decreasing
chains ending at them, but we need to show such chains are critical. To do so, we
*define a plateau of C to be an interval C(v**i**, v**j**) such that (l**i*+1*,l**i*+2*, . . . ,l**j*) is a run
*of length at least 2 in l(C).*

**Proposition 5.5. If C is weakly decreasing and ends at a normal embedding**η*u* *then*
*C is critical,I(C)*=*J(C), and #I(C)*=*d(η**u*)+*2D(η**u*)−*1.*

**Proof: Every descent is an MSI of C by the Descent Lemma, so any other MSI must***be contained in a plateau by minimality. In fact, we claim that any plateau C(v**i**, v**j*)
is an MSI. Without loss of generality we can assume*v**i* =*w*(since otherwise*v**i* is a
descent and so no MSI can contain it) and*v**j* =*u.*

*To show C* =*C(w,u) is a skipped interval, first note that by construction l(C)*
*consists of c repeated k* = *j*−*i* ≥2 times, so*w(c)*=*η**u**(c)*+*k. Thus by normality*
*and the fact that k*≥2 we get*η**u**(c)*=0 and*w(c)*=*k. Using normality again implies*
*that c cannot be the first element in its run of k’s inw, and thusw(c*−1)=*k. Because*
*of this, there is a chain C*^{}from*wto u all of whose labels are c*−1. By construction
*C*^{}*<C and C*∩*C*^{}= ∅*, so C is a skipped interval as desired.*

To show the plateau is minimal suppose, to the contrary, that there is a skipped
*interval I* ⊂*C and letw*=*v*0*, u*=*v**k*. Note that because|v0| = |v1| = · · · = |v*k*−1|,
the Same Length Lemma applies to show that there is only one chain (namely an
*interval of C) between any two of these compositions. Thus the chain C*^{}giving rise
*to I must rejoin C at an embeddingη*^{}*u**of u intow*, and hence also contain*v*1in order
to cut out a proper subinterval. From this and normality of*η**u*we have

*η**u*^{}*(c)<k*=*w(c*−1)=*η**u**(c*−1)*.* (13)

Also,|u| = |w| −*1 implies that C*^{} must zero out exactly one element of*w. Since*
*C*^{}*<C, that element must be in a position strictly to the left of position c, but then*
because*η**u*and*η*_{u}^{} *are both expansions of u we are forced to haveη**u**(c*−1)=*η*^{}_{u}*(c),*
contradicting (13).

Now we know that*I(C) consists of the descents and plateaus of C which are disjoint*
*and cover C by their definition, soJ(C)*=*I(C) and C is critical by Theorem 4.2. To*
*count the number of MSI’s, note that if c is a position counted by d(η**u*) then the vertex
*just after the edge labeled c in C will be a descent, unless that edge is the very last*
*one. On the other hand, if c is counted by D(η**u**) then the run of c’s in l(C) contribute*
both a plateau and a descent just after the plateau to*I* (unless the run is at the end
*of C when only the plateau will be an interval). In this manner we count each MSI*
*exactly once for a total of d(η**u*)+*2D(η**u*)−1 intervals.

**Proposition 5.6. If C is weakly decreasing and ends at an embedding**η*u**which is not*
*normal then C is not critical.*

**Proof: As in the proof of the previous proposition, it suffices to consider the case**
*where l(C) consists of a label c repeated k*≥2 times so that *w(c)*=*η**u**(c)*+*k. If*
*η**u**(c)>*0 then|w| = |u|*and so the Same Length Lemma applies to show that C is*
the only chain from*wto c. In particular, it is the lexicographically first chain and thus*
not critical.

If*η**u**(c)*=0 then*w(c)*=*k and, sinceη**u**is not normal, it must be that c is the first*
*index in this run of k’s inw. Let*

*η**u**(c*−1)=*w(c*−1)=*h*=*k.* (14)
*To demonstrate that C is not critical, it suffices to show that there is no MSI I containing*
the element*v*1 *in C. Suppose, to the contrary that such an interval I exists and let*
*C*^{} *be a chain giving rise to I . Using the Same Length Lemma as in the proof of*
*Proposition 5.5 (third paragraph) we see that C*^{}*must rejoin C at u and this forces*
*I* =*C(w,u).*

*To finish the proof, it suffices to find a skipped interval I*^{}⊂*I since that will*
*contradict the minimality of I . Let b be the smallest label in l(C*^{}*). Then b<c since*
*l(C*^{})*<l(C)*=*(c** ^{k}*). The same argument used at the end of the third paragraph of the
previous proposition together with Eq. (14) give

*η*^{}*u**(c)*=*η**u**(c*−1)=*h* =*k*=*w(c).*

Since the parts of a composition can only (weakly) decrease along a chain, we must
have*η*^{}*u**(c)< w(c). It follows that c is a label on C*^{}. Now consider any permutation
*of l(C*^{}*) which starts l*^{}=*(c,b, . . .) and let C*^{} *be the chain specified by l*^{}. Then
*by construction and the Chain Specification Lemma l(C*^{})≤*l*^{}*<l(C). This implies*
*that C*^{}*<C in PL-order and, by construction again, C*^{}contains*v*1. Thus no MSI of
*C can containv*1, a contradiction.

Our third and final proposition completes the proof of Theorem 5.4

**Proposition 5.7. If C is not weakly decreasing then C is not critical.**

* Proof: If C is not weakly decreasing then it has an ascentv. It suffices to show thatv*
is in no MSI. Suppose, to the contrary, that

*vis in an MSI, I*=

*C(v*

*i*

*, v*

*j*). Then by the

*Descent Lemma, I contains no descents and so C is weakly increasing fromv*

*i*to

*v*

*j*. As in the previous two proofs, it is no loss of generality to assume

*v*

*i*=

*w*and

*v*

*j*=

*u*

*so that C is itself an MSI.*

*Since C is an MSI, it is not the first chain in [u, w*]. That first chain is the unique
weakly increasing chain which ends at the rightmost embedding*ρ**u**of u intow*. Thus
if*η**u**is the embedding defined by C then we must haveη**u* =*ρ**u*.

For*η**u* define

*z*_{η}*( j )*=#{i ≤ *j*|*η**u**(i )*=0}

*and similarly define z*_{ρ}*( j ) forρ**u*. Because*ρ**u* *is rightmost we always have z*_{ρ}*( j )*≥
*z*_{η}*( j ) with equality when j* = |w|. But*ρ**u* is not equal to*η**u**, so there is an index a*
*such that z*_{ρ}*(a)>z*_{η}*(a). Thus there is a first index c>a such that z*_{ρ}*(c)*=*z*_{η}*(c). This*
*definition of c forces* *η**u**(c)*=0 and *ρ**u**(c)*=*k for some k>*0. But*η**u* and*ρ**u* are
*embeddings of the same composition, so there must be some index b with a*≤*b<c*
such that*η**u**(b)*=*ρ**u**(c)*=*k andη**u**(i )*=*0 for b<i* ≤*c.*

*We can now derive a contradiction by constructing a smaller skipped interval in C as*
follows. We have*w(c)*≥*ρ**u**(c)*=*k andη**u**(c)*=*0. Since C is weakly increasing, the*
*labels equal to c must occur as a plateau. Therefore there must be verticesw*^{}*,u*^{}∈*C*
*such that I*^{}=*C(w*^{}*,u*^{}) satisfies*η**w*^{}*(c)*=*k,η**u*^{}*(c)*=*0, and l(I*^{})=*(c** ^{k}*). But

*η**w*^{}*(i )*=*η**u*^{}*(i )*=*η**u**(i )*= *k* if *i* =*b,*
0 if *b<i<c,*

*so there is a chain C*^{}from*w*^{} *to u*^{} *with l(C*^{})=*(b*^{k}*). Since b<c, I*^{} is a skipped
interval and we have obtained the desired contradiction.

We can now rederive the formula for *μ(u, w) in* P^{∗}. Combining Eq. (10) with
Theorems 4.1, 4.2, and 5.4 we obtain

*μ(u, w*)=*χ*˜*(u, w*)=

*C*

(−1)^{dim}^{σ}* ^{C}* =

*η**u*

(−1)^{d(}^{η}^{u}^{)}^{+}^{2D(}^{η}^{u}^{)}^{−}^{2}=

*η**u*

(−1)^{d(}^{η}^{u}^{)}

*where the first sum is over all critical chains C in (u, w) and the other two are over all*
normal embeddings*η**u*into*w.*

We end this section by remarking that the Morse method can be used as a powerful tool not just for proving theorems but for discovering the correct statement to be proved.

The reader may have found our definition of a normal embedding somewhat ad hoc.

However, by starting with the very natural chain labeling used above and looking at the critical chains, one is quickly led to this definition in order to characterize the embeddings at which such chains end. Similarly, the defect may seem to have come

out of nowhere, but in order to determine the dimension of the critical cells one is
*forced to define this quantity as well as its big brother, D(η**u*).

**6. Generalized subword order**

*We can now generalize both our result and Bj¨orner’s as follows. Let (P,*≤*P*) be any
*poset. The generalized subword order over P is the partial order on P*^{∗} obtained
*by setting u*≤*P*^{∗} *w*if *w*contains a subsequence *w(i*_{1})*, w(i*_{2})*, . . . , w(i*_{|}*u*|) such that
*u( j)*≤*P* *w(i**j*) for 1≤ *j*≤ |u|*. We get ordinary subword order when P*= *A is an*
*antichain and we get our composition poset when P* =P.

It is a simple matter to recast this generalized order in terms of embeddings. Let ˆ0
*be a special element which is not in P and let ˆP be the poset obtained by adjoining ˆ0*
as a minimum element, i.e., ˆ0*<**P*ˆ *x for all x* ∈*P. Then the definitions of support and*
*expansion are as usual, just replacing 0 with ˆ0. An embedding of u intow*is a length

|w|expansion*η**u**of u with*

*η**u**(i )*≤*P*ˆ *w(i )* for 1≤*i* ≤ |w|.

*As expected, u*≤*P*^{∗}*wif and only if there is an embedding of u intow*.

Finding an analogue of normality in this context is more delicate, and so far we
have only been able to do it for a special class of posets. However, there is evidence
that more general results are possible; the next section contains a discussion of this
*issue. First note that the definition of a run carries over verbatim to any P*^{∗}. Now call
*P a rooted tree if its Hasse diagram is a tree with a minimum element. A rooted forest*
is a poset where each connected component of its Hasse diagram is a rooted tree. Note
*that both antichains and chains are rooted forests. Note also that if P is a rooted forest*
then ˆ*P is a rooted tree so the following definition makes sense. If x* ∈ *P where P is*
*a rooted forest then let x*^{−}*be the element adjacent to x on the unique path from x to*
*ˆ0 in the Hasse diagram for ˆP. For a rooted forest, a normal embedding of u intow*is
an embedding*η**u*into*w*satisfying two conditions.

1. For 1≤*i*≤ |w|we have*η**u**(i )*=*w(i ),w(i )*^{−}, or ˆ0.

*2. For all x* ∈ *P and every run [r,t] of x’s inw*, we have
*(a) (r,t]*⊆Supp*η**u* *if x is minimal in P,*

*(b) r*∈Supp*η**u*otherwise.

*Finally, we need the definition of defect in this situation, which is as expected:*

*d(η**u*)=#{i|*η**u**(i )*=*w(i )*^{−}}

for a normal embedding*η**u* into*w*. The following theorem is the promised general-
ization of Theorems 2.1 and 2.2. Both of the two proofs we have given of the special
*case where P* =P*generalize easily, with the minimal elements in P playing the role*
*of 1 and the rest functioning like the integers k* ≥2.

**Theorem 6.1. Let P be a rooted forest. Then the M¨obius function of P**^{∗}*is given by*
*μ(u, w)*=

*η**u*

(−1)^{d(η}^{u}^{)}*,*
*where the sum is over all normal embeddingsη**u**of u intow.*

**7. Comments and open problems**

There are several possible avenues for future research. We discuss some of them here.

7.1. Generating functions

As mentioned in the introduction, Bj¨orner and Reutenauer [6] gave another proof of
the formula for*μin A*^{∗} using generating functions on monoids. LetZAdenote
*the algebra of formal series using the elements in A as noncommutative variables and*
the integers as coefficients. Such a series can be written

*f* =

*w∈**A*^{∗}

*c*_{w}*w*

*for certain c** _{w}*∈Z

*. For example, given u*∈

*A*

^{∗}one can consider the series

*m(u)*=

*w≥**u*

*w*
*u*

*n*

*w.* (15)

*Bj¨orner and Reutenauer showed that (15) is rational for any u and obtained, upon*
specialization of the variables, nice expressions for various ordinary generating func-
*tions associated with the M¨obius function of A*^{∗}. They also derived results for the zeta
*function of A*^{∗}*. The map m : A*^{∗}→ZAcan be extended to a continuous linear
endomorphism ofZ*A. In fact, the full incidence algebra of A*^{∗}is isomorphic to a
subalgebra of this endomorphism algebra. Bj¨orner and Reutenauer gave another proof
of Theorem 2.1 using this fact.

It is natural to try and apply these ideas toP^{∗}, and more generally to rooted forests.

This has been done by Bj¨orner and Sagan [7].

7.2. The poset of permutations

Our original interest in P^{∗} came from the rapidly growing subject of permutation
patterns. For an overview of permutation patterns the reader is referred to B´ona’s
*text [10]. Let S**n**denote the nth symmetric group and letπ* ∈*S**n*and*σ* ∈*S**k*. We say
that*πcontains aσ-pattern, and writeπ*≥*σ, if there are indices i*1*<i*2*<*· · ·*<i**k*

such that the subsequence *π(i*1)π*(i*2)*. . . π(i**k*) has the same pairwise comparisons
as *σ*(1)σ(2)*. . . σ(k). This subsequence is called a copy of* *σ* in *π*. For example,
312≤24153 because of the copy 413. This defines a partial order on the set of all
finite permutations. Figure 3 displays the interval [1*,*31524] in this poset. Wilf was
the first to ask the following question.

**Fig. 3 The Hasse diagram for the interval [1,**31524] in the pattern containment ordering on permutations

**Question 7.1 (Wilf [27]). What can be said about the M¨obius function of permutations***under the pattern-containment ordering?*

Given two permutations*π* ∈*S**m*and*σ* ∈*S**n**, their direct sum is the permutation of*
*length m*+*n whose first m elements formσand whose last n elements are the copy of*
*πgotten by adding m to each element ofπ*. For example, 132⊕32145=13265478.

*A permutation is said to be layered if it can expressed as the direct sum of some number*
of decreasing permutations. (An equivalent characterization of layered permutations
is that they are the permutations that contain neither a 231-pattern nor a 312-pattern.)
Our previous example is layered because 13265478=1⊕21⊕321⊕1⊕1. Clearly
*the set of layered permutation of length n is in bijection with the set of compositions*
*of n. Almost as clearly, this bijection sends the pattern-containment order to the*
composition order we have considered, so Theorem 2.2 answers Wilf’s question for
the set of layered permutations.

Any normal embedding approach to describing the M¨obius function for permuta-
tions in general would need to incorporate non-unitary weights, as witnessed by the
fact that*μ*(1*,*31524)=6.

7.3. Factor order

Subword order is not the only partial order on the set of words. We say that the word
*u is a factor of the wordw*if there exist (possibly empty) words*v*1 and*v*2 so that
*w*=*v*1*uv*2*, or in other words, if u occurs as a contiguous subword inw*. Bj¨orner [5]

showed that the M¨obius function for factor order only takes on values in{0*,*±1}and
gave a recursive rule that allows the computation of*μ(u, w) in O(*|w|^{2}) steps.

*The factor order can be defined on P*^{∗}*for any poset P: we say that u is a factor of*
*w*if there are words*v*1*, v*2*, v*3such that:

1. *w*=*v*1*v*2*v*3,
2. |v2| = |u|,

*3. u(i )*≤*v*2*(i ) for all 1*≤*i* ≤ |u|.

Indeed, this is one of the orders onP^{∗} studied by Snellman [21, 22]. The M¨obius
function ofP^{∗}under factor order remains unknown.