
ISSN:1083-589X in PROBABILITY

Consistent Markov branching trees with discrete edge lengths∗

Harry Crane†

Abstract

We study consistent collections of random fragmentation trees with random integer-valued edge lengths. We prove several equivalent necessary and sufficient conditions under which Geometrically distributed edge lengths can be consistently assigned to a Markov branching tree. Among these conditions is a characterization by a unique probability measure, which plays a role similar to the dislocation measure for homogeneous fragmentation processes. We discuss this and other connections to previous work on Markov branching trees and homogeneous fragmentation processes.

Keywords: Markov branching model; homogeneous fragmentation process; splitting rule; dislocation measure; sampling consistency; exchangeable random partition; weighted tree; random tree.

AMS MSC 2010: Primary 60J80, Secondary 60G09; 60C05.

Submitted to ECP on June 14, 2013, final version accepted on August 15, 2013.

1 Introduction

Random tree models arise in population genetics when inferring unknown phylogenetic relationships among extant species. Phylogenetic trees are often used to represent these relationships, with leaves labeled by species and branch points corresponding to speciation events. The root of the tree corresponds to the most recent common ancestor of the species under consideration. In [1], Aldous provides some modeling axioms for phylogenetic trees; among these axioms are exchangeability and consistency (under subsampling). Typically, the species labeling the leaves are represented by distinct elements of $[n] := \{1, \ldots, n\}$, and the exchangeability axiom reflects the assumption that the model should be invariant under arbitrary reassignment of elements to species. In a statistical setting, consistency reflects the assumption that the observed phylogenetic tree is a finite subtree sampled from the (possibly infinite) phylogenetic tree for all species. An admissible statistical model, therefore, corresponds to a family of probability measures on the space of infinite phylogenetic trees, that is, trees with leaves labeled in the natural numbers $\mathbb{N}$.

Along with these axioms, Aldous introduced the beta-splitting family of Markov branching trees. In general, a Markov branching tree is a random tree for which non-overlapping subtrees are conditionally independent. Within the phylogenetic framework, it is natural to consider random trees with edge lengths or weights (weighted Markov branching trees), where edge lengths are interpreted as time between speciation events. Previous authors [4, 6] have considered the task of assigning continuous (Exponentially distributed) edge lengths to Markov branching trees in a consistent way as the size of the initial mass varies. In this paper, we undertake the related question of assigning discrete (Geometrically distributed) edge lengths to Markov branching trees.

∗This work has been supported in part by NSF Grant DMS-1308899 and NSA Grant MSP-121011.

†Department of Statistics, Rutgers University, USA. E-mail:[email protected]

In a phylogenetic context, discrete edge lengths correspond to evolution occurring in discrete time and, therefore, reflect the assumption that generations are nonoverlapping, an assumption shared by some classical population genetics models; see [7] for an extensive treatment of probability models in population genetics.

Aside from applications to phylogenetics, random tree models are of their own mathematical interest. In particular, part of the treatment in [4] relates weighted Markov branching trees to homogeneous fragmentation processes [2], a class of continuous-time Feller processes on partitions of $\mathbb{N}$. In our main theorem, we give precise conditions under which discrete edges can be consistently attached to a Markov branching tree; and we characterize these trees by a unique probability measure on the space of ranked-mass partitions.

We point out at least one novelty that distinguishes this paper from previous work. In contrast to [4], we do not appeal to Bertoin's theory of homogeneous fragmentations; rather, our proofs rely on a construction of discrete-weighted Markov branching trees as the projective limit of a sequence of finite weighted trees. At least some of the conclusions in [4] could be derived using our methods; however, as we explicitly consider trees with integer-valued edge lengths, we cannot appeal to the theory of homogeneous fragmentations, which evolve in continuous time. Nevertheless, our characterization of discrete-weighted Markov branching models also ties into previous work on homogeneous fragmentations, which we discuss in Sections 3.1 and 3.4.

Probabilistically, discrete-weighted Markov branching models are complementary to continuous-weighted Markov branching models. Taken together, these weighted tree models illustrate a fundamental aspect of the memoryless property: the Exponential and Geometric distributions are, respectively, the unique memoryless distributions on the positive real numbers and positive integers. An interesting twist, however, is that, unlike the continuous weight case, it is not always possible to attach Geometric random edge weights consistently for all $n \in \mathbb{N}$. Our main theorem states precisely when this embedding is possible.

An overview of the paper is as follows: in Section 2, we state our main theorem as well as give some preliminary definitions and notation; in Section 3, we discuss the components of the main theorem in detail, putting our observations in the context of previous literature on the topic; in Section 4, we formally define some concepts introduced in previous sections; in Section 5, we prove the main theorem.

2 Preliminaries and statement of main theorem

Throughout the paper, a fragmentation formalizes the notion of a phylogenetic tree.

Definition 2.1. A fragmentation of a finite set $A \subset \mathbb{N}$ is a collection $t_A$ of subsets of $A$ such that

(i) $A \in t_A$ and

(ii) if $\#A \geq 2$, then there exists a (root) partition $\pi_A := \{A_1, \ldots, A_k\}$ of $A$ such that
\[ t_A := \{A\} \cup t_1 \cup \cdots \cup t_k, \]
where $t_i$ is a fragmentation of $A_i$ for each $i = 1, \ldots, k$.

We call the elements of $\pi_b$, for $b \in t_A$, the children of $b$ and write $\Pi_{t_A} = \pi_A$ to denote the root partition of $t_A$. We identify the set $A \in t_A$ as the root of $t_A$ and we write $\mathcal{T}_A$ to denote the collection of all fragmentations with root $A$. Alternatively, we may refer to a fragmentation as a fragmentation tree or, simply, a tree.

The illustration in (2.2) makes clear the connection between Definition 2.1 and the visual interpretation of a phylogenetic tree.

Remark 2.2. Definition 2.1 is initialized by taking $t_{\{i\}} := \{\{i\}, \emptyset\}$ for each singleton $\{i\} \subset \mathbb{N}$. Inclusion of the empty set in the definition of $t_A$ is done for notational convenience, which arises when taking restrictions of weighted trees in the sequel.

To any subset $A' \subset A$, there is a natural restriction of any $t \in \mathcal{T}_A$ to $\mathcal{T}_{A'}$ by
\[ R_{A',A}\, t = t_{|A'} := \{b \cap A' : b \in t\}, \quad t \in \mathcal{T}_A, \tag{2.1} \]
called the reduced subtree. For $m \leq n$, we write $R_{m,n} := R_{[m],[n]}$. The projective limit of $\{\mathcal{T}_{[n]}\}_{n \in \mathbb{N}}$ under the restriction maps $\{R_{m,n}\}_{m \leq n}$ is denoted $\mathcal{T}_{\mathbb{N}}$ and corresponds to the space of fragmentation trees with root $\mathbb{N}$. For $n \in \mathbb{N}$, we write $R_n : \mathcal{T}_{\mathbb{N}} \to \mathcal{T}_{[n]}$ to denote the restriction to $\mathcal{T}_{[n]}$ of an infinite tree, as defined in (2.1) with $A' = [n]$ and $A = \mathbb{N}$. We equip $\mathcal{T}_{\mathbb{N}}$ with the $\sigma$-field $\sigma\langle R_n \rangle_{n \in \mathbb{N}}$ so that these maps are measurable.

We illustrate the action of the restriction map $R_{4,5}$ in (2.2) below. Note that, in the left panel, $t$ is a tree with root $\{1,2,3,4,5\}$ and root partition $\{\{1,2\},\{3,4\},\{5\}\}$. Also, relating to Definition 2.1, $t$ corresponds to the collection of subsets
\[ \{\emptyset, \{1,2,3,4,5\}, \{1,2\}, \{3,4\}, \{1\}, \{2\}, \{3\}, \{4\}, \{5\}\} \]
that label its vertices.

[Diagram (2.2): in the left panel, the tree $t$ with root $\{1,2,3,4,5\}$, children $\{1,2\}$, $\{3,4\}$, $\{5\}$ of the root, and leaves $\{1\}, \{2\}$ below $\{1,2\}$ and $\{3\}, \{4\}$ below $\{3,4\}$; in the right panel, its image $t \mapsto R_{4,5}(t) = t_{|[4]}$, the tree with root $\{1,2,3,4\}$, children $\{1,2\}$, $\{3,4\}$ of the root, and leaves $\{1\}, \{2\}, \{3\}, \{4\}$.] (2.2)

We are specifically interested in probability models for fragmentation trees with integer-valued edge lengths. From any $t \in \mathcal{T}_A$, we obtain a discrete-weighted tree $t^\bullet$ by assigning a positive integer weight $w_b > 0$ to every $b \in t$. The pair $t^\bullet := (t, w)$, with $w := \{w_b\}_{b \in t}$, then determines a tree with edge lengths. We write $\mathcal{T}^\bullet_A$ to denote the space of discrete-weighted trees with root $A$, for which there is also a natural restriction map $R^\bullet_{A',A}$, for every $A' \subseteq A$, defined by removing elements and elongating edges as needed. These restrictions make the collection $\{\mathcal{T}^\bullet_{[n]}\}_{n \in \mathbb{N}}$ of finite discrete-weighted trees projective with limit denoted $\mathcal{T}^\bullet_{\mathbb{N}}$. Weighted fragmentations are formally introduced in Section 4.2; a pictorial representation of a discrete-weighted tree is given in (4.1).
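As a concrete sketch of Definition 2.1 and the restriction map (2.1), a finite fragmentation can be stored as a set of frozensets, and the reduced subtree is then obtained by intersecting every vertex with the smaller label set. The names `restrict` and `fs` are illustrative and not part of the paper; the tree is the one described for (2.2).

```python
def fs(*xs):
    """Shorthand for a frozenset of labels (illustrative helper)."""
    return frozenset(xs)

def restrict(tree, subset):
    """Reduced subtree (2.1): intersect every vertex of `tree` with `subset`."""
    subset = frozenset(subset)
    return {b & subset for b in tree}

# The tree t of (2.2), with the empty set included as in Remark 2.2.
t = {fs(), fs(1, 2, 3, 4, 5), fs(1, 2), fs(3, 4),
     fs(1), fs(2), fs(3), fs(4), fs(5)}

# R_{4,5}(t): the vertex {5} collapses onto the empty set,
# and the root becomes {1, 2, 3, 4}.
t4 = restrict(t, range(1, 5))
```

Because the collection is a set, vertices that coincide after intersection (here $\{5\}$ and $\emptyset$) are merged automatically, which is exactly the behaviour of (2.1).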

The probability models we consider are extensions of Markov branching models on $\mathcal{T}_{\mathbb{N}}$. By the projective structure of $\mathcal{T}_{\mathbb{N}}$, any probability measure $Q$ on $\mathcal{T}_{\mathbb{N}}$ is determined by its finite-dimensional restrictions $Q^{[n]} := Q R_n^{-1}$ to $\mathcal{T}_{[n]}$, for every $n \in \mathbb{N}$. Specifically, we consider the task of assigning random Geometrically distributed edge lengths to exchangeable Markov branching trees.

In general, the collection $Q := (Q^{[n]})_{n \in \mathbb{N}}$ determines an exchangeable Markov branching model if, for every $n \in \mathbb{N}$, $T \sim Q^{[n]}$ is

• exchangeable: the law of $T$ is invariant under the obvious action of relabeling its leaves by an arbitrary permutation $\sigma : [n] \to [n]$;

• consistent: $R_{m,n} T \sim Q^{[m]}$ for every $m \leq n$; and,

• Markovian: given any collection $\{A_1, \ldots, A_k\}$ of non-overlapping subsets in $T$, the collection $\{T_{|A_1}, \ldots, T_{|A_k}\}$ of reduced subtrees is conditionally independent and distributed according to $Q^{[n_1]}, \ldots, Q^{[n_k]}$, respectively, where $n_j := \#A_j$, $j = 1, \ldots, k$.

Any exchangeable Markov branching model $Q$ is determined by a family of exchangeable splitting rules $p := (p_n)_{n \geq 2}$, where each $p_n$ is a probability measure on the space $\mathcal{P}_{[n]} \setminus \{1_{[n]}\}$ of partitions of the set $[n]$ with the trivial partition $1_{[n]} := \{[n]\}$ removed. For $m \leq n$, there is an obvious deletion operation $D_{m,n} : \mathcal{P}_{[n]} \to \mathcal{P}_{[m]}$ defined by removing elements in $[n] \setminus [m]$,
\[ D_{m,n}(\pi) := \{b \cap [m] : b \in \pi\} \setminus \{\emptyset\}, \quad \pi \in \mathcal{P}_{[n]}. \tag{2.3} \]
It has been shown, e.g. in [1, 4, 6], that $p := (p_n)_{n \geq 2}$ determines an exchangeable Markov branching model if and only if $p_n$ is exchangeable and
\[ p_n(\pi) = p_{n+1}(D_{n,n+1}^{-1}(\pi)) + p_{n+1}(e^{(n+1)}_{n+1})\, p_n(\pi), \quad \pi \in \mathcal{P}_{[n]} \setminus \{1_{[n]}\}, \text{ for every } n \geq 2, \tag{2.4} \]
where $e^{(n+1)}_{n+1} := \{[n], \{n+1\}\}$. We write $Q_p := (Q^{[n]}_p)_{n \in \mathbb{N}}$ to denote the Markov branching model determined by the consistent splitting rule $p$. Note that (2.4) is merely the requirement that the marginal distribution of the root partition of $T \sim Q^{[n+1]}_p$, after removal of element $n+1$, is the same as the distribution of the root partition under $Q^{[n]}_p$, for every $n \geq 2$.

Given a Markov branching tree $T_n \sim Q^{[n]}_p$, we randomly assign edge lengths to $T_n$ as follows. First, we specify $\tau := (\tau_n)_{n \geq 0}$, with $\tau_0 = \tau_1 = 0$ and $\tau_n \in (0,1]$ for all $n \geq 2$. Given $T_n = t$, we take independent random variables $W_n := \{W_n(b)\}_{b \in t}$, where $W_n(b) \sim \mathrm{Geo}(\tau_{\#b})$ has the Geometric distribution with parameter $\tau_{\#b}$. (We define $\mathrm{Geo}(0)$ to be the point mass at $\infty$.) We write $Q^{[n]}_{p,\tau}$ to denote the distribution of $T^\bullet_n := (T_n, W_n)$ obtained in this way. Our main theorem considers the question of when the collection $(Q^{[n]}_{p,\tau})_{n \in \mathbb{N}}$ of finite-dimensional distributions determines a unique probability measure $Q^\bullet_{p,\tau}$ on the limit space $\mathcal{T}^\bullet_{\mathbb{N}}$.
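The weighting step just described can be sketched in a few lines. The following is a minimal illustration, assuming the tree is stored as a set of frozensets and $\tau$ as a list indexed by block size; the function names and the particular values of $\tau$ are hypothetical. The Geometric distribution is taken on $\{1, 2, \ldots\}$, matching the positive integer weights above.

```python
import math
import random

def sample_geometric(tau, rng):
    """Draw from Geo(tau) on {1, 2, ...}; Geo(0) is the point mass at infinity."""
    if tau == 0:
        return math.inf
    k = 1
    while rng.random() >= tau:  # each trial fails with probability 1 - tau
        k += 1
    return k

def assign_edge_lengths(tree, tau, rng):
    """Independent weights W(b) ~ Geo(tau[#b]) for every vertex b of `tree`."""
    return {b: sample_geometric(tau[len(b)], rng) for b in tree}

# A tree on [3] with root partition {{1,2},{3}}; tau[0] = tau[1] = 0.
tree = {frozenset(), frozenset({1, 2, 3}), frozenset({1, 2}),
        frozenset({1}), frozenset({2}), frozenset({3})}
tau = [0, 0, 0.5, 0.75]  # illustrative success probabilities
weights = assign_edge_lengths(tree, tau, random.Random(0))
```

Singletons and the empty set receive infinite weight because $\tau_0 = \tau_1 = 0$, in line with the convention $\mathrm{Geo}(0) = \delta_\infty$.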

We now state our main theorem.

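Theorem 2.3. Let $p := (p_n)_{n \geq 2}$ be a family of exchangeable splitting rules satisfying (2.4). The following are equivalent.

(i) There exists a collection $\tau := (\tau_n)_{n \geq 0}$ of Geometric success probabilities such that $(Q^{[n]}_{p,\tau})_{n \in \mathbb{N}}$ are the finite-dimensional restrictions of a unique probability measure $Q^\bullet_{p,\tau}$ on $\mathcal{T}^\bullet_{\mathbb{N}}$.

(ii) The family $\tau := (\tau_n)_{n \geq 0}$ satisfies $\tau_0 = \tau_1 = 0$ and
\[ \tau_n = \tau_{n+1}\,(1 - p_{n+1}(e^{(n+1)}_{n+1})) \quad \text{for all } n \geq 2. \tag{2.5} \]

(iii) There is a unique probability measure $\nu^*$ on
\[ \Delta^\downarrow := \Big\{ (s_1, s_2, \ldots) : s_1 \geq s_2 \geq \cdots \geq 0, \ \sum_i s_i \leq 1 \Big\} \]
satisfying
\[ \nu^*(\{(1,0,\ldots)\}) < 1 \tag{2.6} \]
so that $(p, \tau)$ is given by $p := (p^{\nu^*}_n)_{n \geq 2}$ and $\tau := (\tau^{\nu^*}_n)_{n \geq 0}$ in (3.1) and (3.2), respectively.

(iv) There exists a unique $\tau_\infty \in (0,1]$ and a unique probability measure $\nu$ on $\Delta^\downarrow$ satisfying $\nu(\{(1,0,\ldots)\}) = 0$ such that the pair $(\nu, \tau_\infty)$ determines $\nu^*$ through (3.5).

(v) $Q_p$-almost every $t \in \mathcal{T}_{\mathbb{N}}$ possesses a root partition.

(vi) $\lambda_\infty := \lim_{n \to \infty} \lambda_n < \infty$, where $\lambda := (\lambda_n)_{n \geq 2}$ is defined recursively by $\lambda_2 = 1$ and
\[ \lambda_{n+1} = \lambda_n / (1 - p_{n+1}(e^{(n+1)}_{n+1})), \quad n \geq 2. \tag{2.7} \]

2.1 The paintbox measure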

The paintbox measure plays a key role in our discussion in the next section as well as in our proof of uniqueness of $\nu^*$ in Theorem 2.3(iii). For $s \in \Delta^\downarrow$, we write $s_0 := 1 - \sum_{i=1}^{\infty} s_i$ to denote the amount of dust in $s$ and we define the paintbox measure $\varrho_s$ directed by $s$ as the distribution of a random partition $\Pi$ generated as follows. First, we take independent random variables $X_1, X_2, \ldots$ with distribution
\[ P_s(X_i = j) := \begin{cases} s_j, & j \geq 1, \\ s_0, & j = -i. \end{cases} \]
Given $(X_1, X_2, \ldots)$, we define $\Pi$ by
\[ i \text{ and } j \text{ are in the same block of } \Pi \iff X_i = X_j. \]
We write $\Pi \sim \varrho_s$ to denote that $\Pi$ is distributed as a paintbox directed by $s$. Given a measure $\nu$ on $\Delta^\downarrow$, the paintbox measure directed by $\nu$ is the mixture of paintboxes:
\[ \varrho_\nu(d\pi) := \int_{\Delta^\downarrow} \varrho_s(d\pi)\, \nu(ds), \quad \pi \in \mathcal{P}_{\mathbb{N}}. \]
According to Kingman's correspondence [5], to any exchangeable random partition $\Pi$ of $\mathbb{N}$ there corresponds a unique probability measure $\nu^*$ on $\Delta^\downarrow$ such that $\Pi \sim \varrho_{\nu^*}$.
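The paintbox construction above can be simulated directly on a finite label set. The sketch below is only an illustration, assuming $s$ has finitely many nonzero masses and is passed as a tuple; `sample_paintbox` is a hypothetical name. Element $i$ receives colour $j$ with probability $s_j$ and the private label $-i$ (dust) with the remaining probability $s_0$.

```python
import random

def sample_paintbox(s, n, rng):
    """Partition of {1, ..., n} drawn from the paintbox directed by s."""
    blocks = {}
    for i in range(1, n + 1):
        u = rng.random()
        acc = 0.0
        label = -i  # dust: a label shared with no other element
        for j, sj in enumerate(s, start=1):
            acc += sj
            if u < acc:
                label = j
                break
        blocks.setdefault(label, set()).add(i)
    return {frozenset(b) for b in blocks.values()}

rng = random.Random(2013)
one_block = sample_paintbox((1.0,), 5, rng)   # s = (1, 0, ...): trivial partition
all_dust = sample_paintbox((), 5, rng)        # s_0 = 1: all singletons
binary = sample_paintbox((0.5, 0.5), 6, rng)  # at most two blocks
```

The two degenerate calls show the extremes: $s = (1, 0, \ldots)$ always yields the trivial partition, while pure dust yields the partition into singletons.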

3 Discussion of Theorem 2.3

We now discuss the components of Theorem 2.3 in some detail, paying attention to the interplay among (i)-(vi) as well as connections to previous literature. Roughly speaking, the six parts of the theorem can be decomposed into three motifs: (i)-(ii) is a condition in the vein of Markov branching trees with Exponentially distributed edge lengths; (iii)-(iv) gives a structure result reminiscent of the characterization of homogeneous fragmentations; (v)-(vi) describes the existence of $Q^\bullet_{p,\tau}$ without explicit reference to $\tau$; in particular, both (v) and (vi) depend only on $p$. The connection between (v)-(vi) and existence of $Q^\bullet_{p,\tau}$ is tied to the existence of a well-defined root partition of the limiting fragmentation tree. This also relates to the existence of a Markov branching tree with Exponentially distributed edge lengths; see Sections 3.4-3.6.

3.1 The characteristic measure $\nu^*$

Theorem 2.3(iii) establishes a bijection between probability laws $Q^\bullet_{p,\tau}$ of infinite Markov branching trees with integer edge lengths and probability measures $\nu^*$ satisfying (2.6). Given such a $\nu^*$, we define $(p, \tau)$ by $p := (p^{\nu^*}_n)_{n \geq 2}$ and $\tau := (\tau^{\nu^*}_n)_{n \geq 0}$, where
\[ p^{\nu^*}_n(\pi) := \frac{\varrho^{(n)}_{\nu^*}(\pi)}{1 - \varrho^{(n)}_{\nu^*}(1_{[n]})}, \quad \pi \in \mathcal{P}_{[n]} \setminus \{1_{[n]}\}, \ n \geq 2, \tag{3.1} \]
$\tau^{\nu^*}_0 = \tau^{\nu^*}_1 = 0$, and
\[ \tau^{\nu^*}_n := 1 - \varrho^{(n)}_{\nu^*}(1_{[n]}), \quad n \geq 2. \tag{3.2} \]

Note that we have written $\varrho^{(n)}_{\nu^*}$ to denote the image of $\varrho_{\nu^*}$ by the obvious restriction map $\mathcal{P}_{\mathbb{N}} \to \mathcal{P}_{[n]}$. Condition (2.6) ensures that (3.1) is a well-defined probability distribution on $\mathcal{P}_{[n]} \setminus \{1_{[n]}\}$ and the success probabilities $\tau_n$ are strictly positive for every $n \geq 2$.
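To see (3.1), (3.2) and the recursion (2.5) at work in the simplest case, one can take $\nu^*$ to be a point mass at a single $s \in \Delta^\downarrow$ with finitely many nonzero masses. This choice, and the function names below, are assumptions made only for illustration. For $n \geq 2$ the paintbox gives $\varrho^{(n)}_s(1_{[n]}) = \sum_i s_i^n$ and $\varrho^{(n+1)}_s(e^{(n+1)}_{n+1}) = \sum_i s_i^n (1 - s_i)$, since the first $n$ elements must share a colour that element $n+1$ avoids.

```python
from fractions import Fraction

def rho_trivial(s, n):
    """rho^(n)_s(1_[n]) for n >= 2: all n elements share one colour."""
    return sum(si ** n for si in s)

def rho_split_last(s, n):
    """rho^(n+1)_s({[n], {n+1}}): elements 1..n share a colour, n+1 does not."""
    return sum(si ** n * (1 - si) for si in s)

def tau(s, n):
    """Success probability (3.2), with tau_0 = tau_1 = 0."""
    return Fraction(0) if n < 2 else 1 - rho_trivial(s, n)

def p_split_last(s, n):
    """p_{n+1}(e^{(n+1)}_{n+1}) computed from (3.1)."""
    return rho_split_last(s, n) / tau(s, n + 1)

# A mass partition with dust s_0 = 1/6 (illustrative values).
s = (Fraction(1, 2), Fraction(1, 3))
recursion_holds = all(
    tau(s, n) == tau(s, n + 1) * (1 - p_split_last(s, n)) for n in range(2, 12)
)
```

Exact rational arithmetic is used so that the identity (2.5) is checked without rounding error; here $\tau_n$ increases to $\tau_\infty = 1$ because $\nu^*$ puts no mass on $(1, 0, \ldots)$.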

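A further consequence of the characterization by $\nu^*$ ties into part (iv) of the theorem. In particular, from (3.2), the sequence $\tau$ is monotonically nondecreasing and bounded above by 1; hence, the limit $\tau_\infty := \lim_{n \to \infty} \tau^{\nu^*}_n$ exists and equals
\[ \tau_\infty = 1 - \varrho_{\nu^*}(1_{[\infty]}) = 1 - \nu^*(\{(1,0,\ldots)\}) > 0. \tag{3.3} \]
From $\nu^*$, we can define a finite measure $\nu_K$, for any $K \in (0,\infty)$, by
\[ \nu_K(ds) := K \nu^*(ds)\,(1 - \delta_{(1,0,\ldots)}(s)), \quad s \in \Delta^\downarrow, \tag{3.4} \]
where $\delta_\bullet(\cdot)$ is the point mass at $\bullet$. Note that $\nu_K$ is finite and satisfies $\nu_K(\{(1,0,\ldots)\}) = 0$. Since trivial partitions are assigned zero probability by any splitting rule, the measures $\nu^*$ and $\nu_K$ determine the same splitting rule through the generalization to (3.1):
\[ p^{\nu_K}_n(\pi) := \frac{\varrho^{(n)}_{\nu_K}(\pi)}{\nu_K(\Delta^\downarrow) - \varrho^{(n)}_{\nu_K}(1_{[n]})}, \quad \pi \in \mathcal{P}_{[n]} \setminus \{1_{[n]}\}, \ n \geq 2. \]
Indeed, from (3.4), we have, for $\pi \in \mathcal{P}_{[n]} \setminus \{1_{[n]}\}$,
\[ p^{\nu_K}_n(\pi) = \frac{\varrho^{(n)}_{\nu_K}(\pi)}{\nu_K(\Delta^\downarrow) - \varrho^{(n)}_{\nu_K}(1_{[n]})} = \frac{K \varrho^{(n)}_{\nu^*}(\pi)}{K(1 - \nu^*(\{(1,0,\ldots)\})) - \varrho^{(n)}_{\nu_K}(1_{[n]})} = \frac{\varrho^{(n)}_{\nu^*}(\pi)}{1 - \varrho^{(n)}_{\nu^*}(1_{[n]})}, \]
which coincides with (3.1).

Conversely, given $\tau_\infty \in (0,1]$ and a finite measure $\nu$ satisfying $\nu(\{(1,0,\ldots)\}) = 0$, we obtain a measure $\nu^*$ satisfying (2.6) by
\[ \nu^*(ds) := \frac{\nu(ds)}{\nu(\Delta^\downarrow)}\, \tau_\infty + (1 - \tau_\infty)\, \delta_{(1,0,\ldots)}(s), \quad s \in \Delta^\downarrow. \tag{3.5} \]
For any $K \in (0,\infty)$, any probability measure $\nu^*$ satisfying (2.6) coincides with (3.5) for $\nu := \nu_K$ and $\tau_\infty := 1 - \nu^*(\{(1,0,\ldots)\})$.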
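The round trip between $\nu^*$ and the pair $(\nu_K, \tau_\infty)$ in (3.4) and (3.5) can be checked on a finitely supported example. In the sketch below a measure on $\Delta^\downarrow$ is stored, purely as an illustrative assumption, as a dictionary from mass partitions (tuples of their nonzero entries) to masses; the names `nu_K`, `nu_star_from` and the example measure are hypothetical.

```python
from fractions import Fraction

TRIVIAL = (Fraction(1),)  # stands for the mass partition (1, 0, ...)

def nu_K(nu_star, K):
    """(3.4): K * nu_star with the atom at (1, 0, ...) removed."""
    return {s: K * m for s, m in nu_star.items() if s != TRIVIAL}

def nu_star_from(nu, tau_inf):
    """(3.5): normalise nu, weight it by tau_inf, put the rest at (1, 0, ...)."""
    total = sum(nu.values())
    out = {s: tau_inf * m / total for s, m in nu.items()}
    if tau_inf < 1:
        out[TRIVIAL] = 1 - tau_inf
    return out

half = (Fraction(1, 2), Fraction(1, 2))
third = (Fraction(1, 3),) * 3
nu_star = {TRIVIAL: Fraction(1, 4), half: Fraction(1, 2), third: Fraction(1, 4)}

tau_inf = 1 - nu_star.get(TRIVIAL, Fraction(0))  # as in (3.3)
recovered = {K: nu_star_from(nu_K(nu_star, K), tau_inf) for K in (1, 7, Fraction(2, 3))}
```

Every choice of $K$ returns the same $\nu^*$, which is the sense in which the scaling constant carries no information about the splitting rule.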

3.2 The role of $\tau_\infty$

The quantity $\tau_\infty := \lim_{n \to \infty} \tau_n$ plays an important role in the description of the limiting tree $T^\bullet \sim Q^\bullet_{p,\tau}$ in that it parameterizes its edge lengths. That is, the limiting object $T^\bullet$ is an infinite Markov branching tree with independent Geometrically distributed edge lengths, all with success probability $\tau_\infty$. Moreover, the special case $\tau_\infty = 1$ corresponds to Geometric edge lengths all with success probability 1. Hence, almost surely, the edge lengths of the limiting tree $T^\bullet$ are all identically 1. In this case, the randomness of the edge lengths disappears in the limiting object. Viewed another way, from (3.5), we notice that $1 - \tau_\infty = \nu^*(\{(1,0,\ldots)\})$ corresponds to the probability that a random partition of $\mathbb{N}$ is trivial. Since only non-trivial partitions correspond to dislocations in a fragmentation tree, $\tau_\infty = 1 - \nu^*(\{(1,0,\ldots)\})$ naturally corresponds to a success probability in our Geometric weighting scheme.


3.3 The success probabilities $\tau$

Given a splitting rule $p = (p_n)_{n \geq 2}$ and a collection $\lambda := (\lambda_n)_{n \geq 0}$ with $\lambda_0 = \lambda_1 = 0$ and $\lambda_n > 0$ for all $n \geq 2$, we can assign independent random lengths $W_n(b) \sim \mathrm{Exp}(\lambda_{\#b})$ to each $b \in T_n$, where $\mathrm{Exp}(\lambda)$ denotes the Exponential distribution with rate parameter $\lambda$. (The $\mathrm{Exp}(0)$ distribution corresponds to the point mass at $\infty$.) We write $Q^{[n]}_{p,\lambda}$ to denote the law of a $Q^{[n]}_p$-distributed Markov branching tree with Exponentially distributed edge lengths parameterized by $\lambda$. By Proposition 3 of [4], the collection $(Q^{[n]}_{p,\lambda})_{n \in \mathbb{N}}$ is consistent if and only if $p$ satisfies (2.4) and $\lambda$ satisfies
\[ \lambda_n = \lambda_{n+1}\,(1 - p_{n+1}(e^{(n+1)}_{n+1})) \quad \text{for every } n \geq 2. \tag{3.6} \]
Note that (3.6) is identical to condition (2.5) of Theorem 2.3(ii); however, in the discrete case we encounter the additional constraint $0 \leq \tau_n \leq 1$ for all $n \geq 0$. Moreover, while continuous embedding is always possible for an infinitely exchangeable family of splitting rules, discrete embedding is not. Conditions (2.5) and (3.6) seem intimately tied to the memoryless property of the Exponential and Geometric distributions. Both (2.5) and (3.6) can be proven using the same strategy as in Theorem 5.1, with the modification that to prove (3.6) we use characteristic functions rather than probability generating functions.

3.4 Relation to homogeneous fragmentations

The definition of $\nu_K$ in (3.4) connects the characteristic measure $\nu^*$ to a collection of dislocation measures of homogeneous fragmentation processes. From Theorem 1 of [4], any exchangeable splitting rule $p = (p_n)_{n \geq 2}$ satisfying (2.4) is associated to a pair $(c, \nu)$ (see equations (2) and (3) of [4]), where $c \geq 0$ is the erosion coefficient and $\nu$ is the dislocation measure of a homogeneous fragmentation process $T^\circ$. To ensure that each finite restriction of $T^\circ$ determines a fragmentation of a finite set with strictly positive edge lengths, the dislocation measure $\nu$ is subject to the constraint
\[ \nu(\{(1,0,\ldots)\}) = 0 \quad \text{and} \quad \int_{\Delta^\downarrow} (1 - s_1)\, \nu(ds) < \infty; \tag{3.7} \]
see also Bertoin [3] (Theorem 3.1). The measure $\nu_K$ constructed in (3.4) trivially satisfies (3.7) and, therefore, is the dislocation measure of some homogeneous fragmentation. As shown in Section 3.1, for $K, K' \in (0,\infty)$, any two pairs $(\nu_K, \tau_\infty)$ and $(\nu_{K'}, \tau_\infty)$ defined from the same characteristic measure $\nu^*$ determine the same splitting rule and, hence, the same discrete-weighted Markov branching model. Similarly, by Theorem 1 of [4], $(c, \nu)$ determines the same splitting rule as $(Kc, K\nu)$ for all $K \in (0,\infty)$.

3.5 Root partitions

The erosion coefficient $c \geq 0$ also relates to (v) and (vi) of our theorem. In particular, the erosion coefficient is the rate at which “erosion” of a single element occurs, that is, the event that the initial split of the entire mass $\mathbb{N}$ is into $\{\mathbb{N} \setminus \{n\}, \{n\}\}$. Assuming the dislocation measure $\nu$ is finite, the total rate at which a $(c, \nu)$-fragmentation process with initial mass $[n]$ experiences dislocation is $\lambda_n = cn + \varrho^{(n)}_\nu(\mathcal{P}_{[n]} \setminus \{1_{[n]}\})$. As a result, we see that $\lambda_n \to \infty$ whenever $c > 0$ and $\lambda_n \to \nu(\Delta^\downarrow) < \infty$ when $c = 0$. Therefore, (iv) and (vi) together imply that discrete-weighted fragmentations correspond to homogeneous fragmentations with zero erosion coefficient and finite dislocation measure.

Furthermore, Theorem 2.3(v) asserts that the existence of a collection $\tau$ for which $Q^\bullet_{p,\tau}$ exists depends on whether $T \sim Q_p$ possesses a well-defined root partition. Intuitively, there will be such a root partition only if $\lambda_\infty$ is finite because if $\lambda_\infty = \infty$ then the root edges of the finite trees must be getting shorter as $n$ increases. Thus, Theorem 2.3(v) separates Markov branching trees into two classes, those with root partition and those without. By (v), Markov branching trees with a root partition can be assigned Geometrically distributed edge lengths, while those without a root partition cannot. To be explicit, given $\lambda_\infty < \infty$, we can choose any $\lambda^* \in [\lambda_\infty, \infty)$ and put $\tau_n = \lambda_n / \lambda^*$ for each $n \geq 2$. By (2.7), $(\tau_n)_{n \geq 2}$ chosen this way satisfies (2.5). Moreover, relating to Section 3.2, we have $\tau_\infty = \lambda_\infty / \lambda^* \in (0,1]$.
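The recipe $\tau_n = \lambda_n / \lambda^*$ can be traced numerically. As an assumed example (not taken from the paper), take the splitting rule induced through (3.1) by the paintbox directed by $s = (1/2, 1/2)$, for which $p_m(e^{(m)}_m) = 1/(2^{m-1} - 1)$; the function names are hypothetical. The recursion (2.7) then gives $\lambda_n = 2 - 2^{2-n}$, so $\lambda_\infty = 2$.

```python
from fractions import Fraction

def lambdas(p_last, n_max):
    """lambda_2 = 1 and lambda_{n+1} = lambda_n / (1 - p_{n+1}(e)), as in (2.7).

    `p_last(m)` must return p_m(e^{(m)}_m)."""
    lam = {2: Fraction(1)}
    for n in range(2, n_max):
        lam[n + 1] = lam[n] / (1 - p_last(n + 1))
    return lam

def p_last_half(m):
    """p_m(e^{(m)}_m) for the paintbox at s = (1/2, 1/2) (assumed example)."""
    return Fraction(1, 2 ** (m - 1) - 1)

lam = lambdas(p_last_half, 20)

def taus(lam, lam_star):
    """tau_n = lambda_n / lambda_star for n >= 2; requires lam_star >= lambda_inf."""
    return {n: value / lam_star for n, value in lam.items()}

tau_tight = taus(lam, 2)  # lambda_star = lambda_inf, so tau_inf = 1
tau_slack = taus(lam, 4)  # lambda_star > lambda_inf, so tau_inf = 1/2
```

With $\lambda^* = \lambda_\infty$ the success probabilities increase to 1, while a larger $\lambda^*$ rescales them so that $\tau_\infty = \lambda_\infty / \lambda^* < 1$; both choices satisfy (2.5).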

3.6 Beta-splitting model

We conclude this section with an illustration of Theorem 2.3 in the special case of the beta-splitting model. For $-2 < \beta < \infty$, we define the splitting rule
\[ p^\beta_n(\pi) := 2 \kappa_n^{-1}\, \frac{\beta^{\uparrow \#\pi_1}\, \beta^{\uparrow \#\pi_2}}{(2\beta)^{\uparrow n}}, \tag{3.8} \]
where $\pi = \{\pi_1, \pi_2\}$ is a partition of $[n]$ with exactly two blocks, $\kappa_n := 1 - 2\beta^{\uparrow n} / (2\beta)^{\uparrow n}$ and $\beta^{\uparrow n} := \beta(\beta + 1) \cdots (\beta + n - 1)$. (The limiting cases $\beta \to -2$ and $\beta \to \infty$ are also defined: $\beta = -2$ corresponds to the exchangeable distribution on “combs” and $\beta = \infty$ corresponds to the “symmetric binary trie.” For simplicity, we ignore these cases.)

These splitting rules are based on the family of dislocation measures
\[ \nu_\beta(dx) := 2 x^\beta (1 - x)^\beta\, \mathbf{1}_{[1/2,1]}(x)\, dx, \quad -2 < \beta < \infty, \]
which is supported on the subspace of binary mass partitions. Note that $\nu_\beta$ satisfies (3.7) and is, therefore, a dislocation measure for a sub-family of homogeneous fragmentation processes. In particular, for $\beta > -1$, $\nu_\beta$ is a finite measure and, for $-2 < \beta \leq -1$, $\nu_\beta$ is infinite. Therefore, even when $c = 0$, $\lambda_n \to \nu_\beta(\Delta^\downarrow) < \infty$ only for $\beta > -1$, and so these are the only $\beta$ for which $(p^\beta_n)_{n \geq 2}$ in (3.8) determines a distribution $Q^\bullet_{p,\tau}$ on discrete-weighted trees. In fact, in the case $\beta > -1$, the splitting rule $(p^\beta_n)_{n \geq 2}$ is determined by the Beta distribution with parameter $(\beta, \beta)$. In particular, $\nu_\beta(dx) := 2 x^\beta (1 - x)^\beta\, \mathbf{1}_{[1/2,1]}(x)\, dx$ is the kernel of the probability measure $\nu^*_\beta$ governing $\max(X, 1 - X)$ for $X \sim \mathrm{Beta}(\beta, \beta)$. Note that $\nu^*_\beta(\{(1,0,\ldots)\}) = 0$ in this case and so we are in the situation $\tau_\infty = 1$. Alternatively, given $\nu^*_\beta$, $\beta > -1$, we can define $\nu^*$ with arbitrary $\tau_\infty \in (0,1]$ through (3.5). Through (3.1) and (3.2), the resulting probability measure $\nu^*$ determines a unique pair $(p, \tau)$ that parameterizes $Q^\bullet_{p,\tau}$.
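A quick numerical sanity check of (3.8) is possible with exact arithmetic. The sketch below is restricted, as an assumption of this illustration, to rational $\beta > 0$, where all rising factorials are positive; it takes $\tau_n = \kappa_n$, which is what (3.2) gives when the trivial partition has probability $2\beta^{\uparrow n}/(2\beta)^{\uparrow n}$ (the case $\tau_\infty = 1$). Function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def rising(x, n):
    """Rising factorial x (x + 1) ... (x + n - 1)."""
    out = Fraction(1)
    for k in range(n):
        out *= x + k
    return out

def kappa(beta, n):
    """Normalising constant kappa_n = 1 - 2 beta^{rising n} / (2 beta)^{rising n}."""
    return 1 - 2 * rising(beta, n) / rising(2 * beta, n)

def p_beta(beta, a, b):
    """(3.8) for one two-block partition of [a + b] with block sizes a and b."""
    n = a + b
    return 2 * rising(beta, a) * rising(beta, b) / (kappa(beta, n) * rising(2 * beta, n))

def total_mass(beta, n):
    """Sum of (3.8) over all two-block partitions of [n]."""
    # Summing over ordered sizes (a, n - a) counts each unordered partition twice.
    return sum(comb(n, a) * p_beta(beta, a, n - a) for a in range(1, n)) / 2

def recursion_holds(beta, n):
    """(2.5) with tau_n = kappa_n and e = {[n], {n + 1}} (block sizes n and 1)."""
    return kappa(beta, n) == kappa(beta, n + 1) * (1 - p_beta(beta, n, 1))
```

For instance, `total_mass(Fraction(1), 4)` evaluates to 1 and `recursion_holds(Fraction(1, 2), 5)` is true, so in this range (3.8) is a probability distribution on two-block partitions and $\kappa_n$ satisfies (2.5).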

4 Some formalities

In preparation for the proof of Theorem 2.3, we now formally introduce some concepts from previous sections.

4.1 Root partitions

With $A \subset_f \mathbb{N}$ denoting that $A \subset \mathbb{N}$ is finite, a partition of $A$ is a collection $\{A_1, \ldots, A_k\}$ of non-empty, disjoint subsets for which $\bigcup_{i=1}^{k} A_i = A$. We write $\mathcal{P}_A$ to denote the collection of all partitions of $A$. The collection $\{\mathcal{P}_{[n]}\}_{n \in \mathbb{N}}$ of spaces of finite set partitions is projective under the deletion maps (2.3). We write $\mathcal{P}_{\mathbb{N}}$ to denote the projective limit of partitions of $\mathbb{N}$, which we furnish with the discrete $\sigma$-algebra $\sigma\big(\bigcup_{n \in \mathbb{N}} \mathcal{P}_{[n]}\big)$. For each $n \in \mathbb{N}$, we write $D_n := D_{n,\infty}$ to denote the deletion operation $\mathcal{P}_{\mathbb{N}} \to \mathcal{P}_{[n]}$, where $[\infty] := \mathbb{N}$ in (2.3). Partitions appear in the study of Markov branching trees through the splitting rule, which is a distribution on $\mathcal{P}_{[n]} \setminus \{1_{[n]}\}$ that determines the law of the branching below a child of size $n$ in a random fragmentation.

Also, in Theorem 2.3(v), partitions of $\mathbb{N}$ arise in the notion of a limiting root partition. For any $A \subset_f \mathbb{N}$, $\#A \geq 2$, every $t \in \mathcal{T}_A$ has a well-defined root partition denoted by $\Pi_t$ and defined by the partition $\pi_A$ in Definition 2.1(ii). In general, for any $A \subseteq \mathbb{N}$, we say that $t \in \mathcal{T}_A$ possesses a root partition if there exists $N \in \mathbb{N}$ such that the sequence $(\Pi_{t_{|[m]}})_{m \geq N}$ has a projective limit in $\mathcal{P}_{\mathbb{N}}$, that is, if for all $n \geq m \geq N$, $\Pi_{t_{|[m]}} = D_{m,n} \Pi_{t_{|[n]}}$. We denote this root partition by $\Pi_t := \lim_{n \to \infty} \Pi_{t_{|[n]}}$.

Example 4.1. An infinite tree need not possess a well-defined root partition. For example, the infinite comb $c$ is defined by the collection $c := (c_n)_{n \geq 2}$, where $\Pi_{c_n} = e^{(n)}_n$ for every $n \geq 2$. In this case, the sequence of finite root partitions is $(e^{(n)}_n)_{n \geq 2}$, for which $D_{m,n} e^{(n)}_n = 1_{[m]} \neq e^{(m)}_m$ for every $m < n$; hence, $\lim_{n \to \infty} \Pi_{c_n}$ does not exist.
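The failure of projectivity in Example 4.1 can be checked mechanically with the deletion map (2.3). The following sketch stores a partition as a set of frozensets; the helper names are illustrative only.

```python
def delete(partition, m):
    """D_{m,n} from (2.3): keep only the elements of [m] and drop empty blocks."""
    keep = frozenset(range(1, m + 1))
    return {b & keep for b in partition} - {frozenset()}

def comb_root(n):
    """e^{(n)}_n = {[n-1], {n}}, the root partition of the comb on [n]."""
    return {frozenset(range(1, n)), frozenset({n})}

def trivial(m):
    """The trivial partition 1_[m] = {[m]}."""
    return {frozenset(range(1, m + 1))}

# For every m < n, deleting n and its neighbours down to [m] gives 1_[m],
# which differs from the root partition e^{(m)}_m of the smaller comb.
never_projective = all(
    delete(comb_root(n), m) == trivial(m) != comb_root(m)
    for n in range(3, 10)
    for m in range(2, n)
)
```

Since no tail of the sequence $(e^{(n)}_n)_{n \geq 2}$ is compatible under deletion, no choice of $N$ in the definition above can work for the comb.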

4.2 Weighted fragmentation trees

We define a weighted fragmentation of $A \subset_f \mathbb{N}$ as a pair $t^\circ := (t, w)$ such that $t \in \mathcal{T}_A$ and $w := \{w_b\}_{b \subseteq A}$, with $w_b \in [0, \infty]$ for all $b \subseteq A$ and

(i)$^w$ $w_b = \infty$ if and only if $b$ is a singleton or the empty set;

(ii)$^w$ $w_b = 0$ if and only if $b \notin t$.

Remark 4.2. Item (i)$^w$ is not necessary for the above definition to make sense; however, we are interested in constructing consistent collections of weighted fragmentations of $\mathbb{N}$ and (i)$^w$ is the convention that works best in this context.

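Pictorially, we interpret $w_b$ as the length of the edge above $b \in t$, although we suppress the edge of infinite length associated to $\emptyset$. For example, for the tree $t$ in (2.2), if we specify $w_{\{1,2,3,4,5\}} = 1$, $w_{\{1,2\}} = 3$ and $w_{\{3,4\}} = 2$, then we obtain

[Diagram (4.1): the tree $t$ of (2.2) with edge lengths marked: length 1 on the edge above the root $\{1,2,3,4,5\}$, length 3 above $\{1,2\}$, length 2 above $\{3,4\}$, and length $\infty$ above $\{5\}$ and above each of the leaves $\{1\}, \{2\}, \{3\}, \{4\}$.] (4.1)

where edge lengths are not drawn to scale. We write $\mathcal{T}^\circ_A$ to denote the collection of weighted fragmentations of $A \subset_f \mathbb{N}$.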

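For non-empty subsets $A' \subseteq A \subset_f \mathbb{N}$, we define $R^\circ_{A',A} : \mathcal{T}^\circ_A \to \mathcal{T}^\circ_{A'}$ by $t^\circ \mapsto t^\circ_{|A'} := (R_{A',A}(t), w')$, with $R_{A',A}$ defined in (2.1) and $w' := \{w'_b\}_{b \subseteq A'}$, where
\[ w'_b := \sum_{\{b' \subseteq A \,:\, b' \cap A' = b\}} w_{b'}, \quad b \subseteq A', \tag{4.2} \]
the sum of all weights associated to $b$ by restriction of $A$ to $A'$. In particular, for $m \leq n < \infty$, we write $R^\circ_{m,n} := R^\circ_{[m],[n]}$ and denote the projective limit of $\{\mathcal{T}^\circ_{[n]}\}_{n \in \mathbb{N}}$ under these restrictions by $\mathcal{T}^\circ_{\mathbb{N}}$, the space of weighted fragmentations of $\mathbb{N}$. Any $t^\circ \in \mathcal{T}^\circ_{\mathbb{N}}$ is determined by a sequence $(t^\circ_n)_{n \in \mathbb{N}}$ satisfying $R^\circ_{m,n} t^\circ_n = t^\circ_m$ for all $m \leq n$, for every $n \in \mathbb{N}$. For each $n \in \mathbb{N}$, we define $R^\circ_n : \mathcal{T}^\circ_{\mathbb{N}} \to \mathcal{T}^\circ_{[n]}$ by the projection of $\mathcal{T}^\circ_{\mathbb{N}}$ into $\mathcal{T}^\circ_{[n]}$, $t^\circ \mapsto t^\circ_n$.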
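The effect of (4.2), removing elements and elongating edges, is easiest to see on a three-leaf tree. The sketch below is a direct, brute-force transcription of (4.2) that enumerates all subsets of $A$, so it is only meant for very small label sets; the function names and the example weights are illustrative assumptions. Weights of subsets not in $t$ are treated as 0, following (ii)$^w$.

```python
import math
from itertools import combinations

def subsets(universe):
    """All subsets of `universe`, as frozensets."""
    items = sorted(universe)
    for r in range(len(items) + 1):
        for c in combinations(items, r):
            yield frozenset(c)

def restrict_weighted(tree, w, A, A_sub):
    """(4.2): the new weight of b sums w over all b' in A with b' & A_sub == b."""
    A_sub = frozenset(A_sub)
    new_tree = {b & A_sub for b in tree}
    new_w = {
        b: sum(w.get(bp, 0) for bp in subsets(A) if bp & A_sub == b)
        for b in subsets(A_sub)
    }
    return new_tree, new_w

f = frozenset
tree = {f(), f({1, 2, 3}), f({1, 2}), f({1}), f({2}), f({3})}
w = {f(): math.inf, f({1, 2, 3}): 2, f({1, 2}): 3,
     f({1}): math.inf, f({2}): math.inf, f({3}): math.inf}

# Removing leaf 3 merges the edges above {1,2,3} and {1,2}: 2 + 3 = 5.
tree2, w2 = restrict_weighted(tree, w, {1, 2, 3}, {1, 2})
```

After restriction the root $\{1,2\}$ carries weight 5, the sum of the two edges that were separated by the removed leaf, while singletons and the empty set keep infinite weight as required by (i)$^w$.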

4.2.1 Integer-valued edge weights

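For each $n \in \mathbb{N}$, we write $\mathcal{T}^\bullet_{[n]} \subset \mathcal{T}^\circ_{[n]}$ to denote the subspace of all $t^\circ := (t, w) \in \mathcal{T}^\circ_{[n]}$ such that $w_b \in \{0, 1, \ldots, \infty\}$ for all $b \subseteq [n]$. For $m \leq n$, we let $R^\bullet_{m,n}$ be the restriction of $R^\circ_{m,n}$ to $\mathcal{T}^\bullet_{[n]}$ and we define $\mathcal{T}^\bullet_{\mathbb{N}}$ as the projective limit of $\{\mathcal{T}^\bullet_{[n]}\}_{n \in \mathbb{N}}$ under these restriction maps. The space $\mathcal{T}^\bullet_{\mathbb{N}}$ comes equipped with $R^\bullet_n : \mathcal{T}^\bullet_{\mathbb{N}} \to \mathcal{T}^\bullet_{[n]}$, the restriction of $R^\circ_n$ to $\mathcal{T}^\bullet_{\mathbb{N}}$ for each $n \in \mathbb{N}$. Writing $\mathcal{D}_n := \bigotimes_{b \subseteq [n]} 2^{\{0,1,\ldots,\infty\}}$ to denote the product of discrete $\sigma$-fields on subsets of $\{0, 1, \ldots, \infty\}$, we equip $\mathcal{T}^\bullet_{[n]}$ with the $\sigma$-field $\mathcal{T}_{[n]} \otimes \mathcal{D}_n$ and $\mathcal{T}^\bullet_{\mathbb{N}}$ with the $\sigma$-field $\sigma\langle R^\bullet_n \rangle_{n \in \mathbb{N}}$ so that the restriction maps are measurable.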

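4.3 Random weighted fragmentations of $\mathbb{N}$

Let $p := (p_n)_{n \geq 2}$ be a collection of splitting rules satisfying (2.4) and let $\tau := (\tau_n)_{n \geq 0}$ satisfy $\tau_0 = \tau_1 = 0$ and $\tau_n \in (0,1]$ for all $n \geq 2$. Formally, we define $Q^{[n]}_{p,\tau}$ as the law of $T^\bullet_n := (T_n, W_n)$, where $T_n \sim Q^{[n]}_p$ is a Markov branching tree with splitting rule $p$ and $W_n := \{W_n(b)\}_{b \subseteq [n]}$ is a collection of discrete edge weights defined as follows. First, we generate independent Geometric random variables $\Upsilon_n := \{\Upsilon_n(b)\}_{b \subseteq [n]}$ with $\Upsilon_n(b) \sim \mathrm{Geo}(\tau_{\#b})$ for each $b \subseteq [n]$; then, given $T_n = t$ and $\Upsilon_n$, we define a discrete-weighted tree $T^\bullet_n := (T_n, W_n)$ in $\mathcal{T}^\bullet_{[n]}$, where $W_n := \{W_n(b)\}_{b \subseteq [n]}$ is defined from $\Upsilon_n$ by
\[ W_n(b) := \begin{cases} \Upsilon_n(b), & b \in T_n, \\ 0, & \text{otherwise}. \end{cases} \tag{4.3} \]
We can express $Q^{[n]}_{p,\tau}$ explicitly by
\[ Q^{[n]}_{p,\tau}(t^\bullet) = \prod_{b \in t \,:\, \#b \geq 2} p_b(\Pi_{t_{|b}})\, \tau_{\#b}\, (1 - \tau_{\#b})^{w_b - 1}, \quad t^\bullet := (t, w) \in \mathcal{T}^\bullet_{[n]}, \tag{4.4} \]
where $p_b(\cdot)$ denotes the splitting rule induced on $\mathcal{P}_b \setminus \{1_b\}$ by $p_{\#b}$ through exchangeability.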
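Formula (4.4) is a product over the internal vertices of the tree and can be evaluated directly. The sketch below assumes, for illustration only, that the splitting rule depends on a partition through its block sizes (as exchangeability requires) and is supplied as a function `split_prob(n, sizes)`; the concrete rule used is the one induced through (3.1) and (3.2) by the paintbox at $s = (1/2, 1/2)$, for which every two-block partition of a set of size $n$ has probability $1/(2^{n-1} - 1)$ and $\tau_n = 1 - 2^{1-n}$. All names are hypothetical.

```python
from fractions import Fraction

def children(tree, b):
    """Blocks of the root partition of the subtree at b: maximal proper subsets of b."""
    inside = [c for c in tree if c and c < b]
    return [c for c in inside if not any(c < d for d in inside)]

def tree_probability(tree, w, split_prob, tau):
    """Evaluate (4.4) for the weighted tree (tree, w)."""
    prob = Fraction(1)
    for b in tree:
        n = len(b)
        if n >= 2:
            sizes = sorted(len(c) for c in children(tree, b))
            prob *= split_prob(n, sizes) * tau(n) * (1 - tau(n)) ** (w[b] - 1)
    return prob

def tau_half(n):
    """tau_n for the paintbox at (1/2, 1/2) (assumed example)."""
    return 1 - Fraction(1, 2 ** (n - 1))

def split_half(n, sizes):
    """Splitting rule for the same paintbox: uniform on two-block partitions."""
    return Fraction(1, 2 ** (n - 1) - 1) if len(sizes) == 2 else Fraction(0)

f = frozenset
tree = {f(), f({1, 2, 3}), f({1, 2}), f({1}), f({2}), f({3})}
w = {f({1, 2, 3}): 2, f({1, 2}): 3}
q = tree_probability(tree, w, split_half, tau_half)
```

Here the root contributes $\tfrac13 \cdot \tfrac34 \cdot \tfrac14$ and the vertex $\{1,2\}$ contributes $1 \cdot \tfrac12 \cdot \tfrac14$, so `q` equals $1/128$.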

5 Proof of Theorem 2.3

Theorem 2.3 summarizes the conclusions of a series of theorems and propositions that we prove in this section. Throughout this section, assume $p := (p_n)_{n \geq 2}$ is a collection of splitting rules satisfying (2.4) and $\tau := (\tau_n)_{n \geq 0}$ is a collection of success probabilities. The pair $(p, \tau)$ determines a family $(Q^{[n]}_{p,\tau})_{n \in \mathbb{N}}$ of finite-dimensional probability distributions through (4.4). By Kolmogorov's extension theorem, $(Q^{[n]}_{p,\tau})_{n \in \mathbb{N}}$ determines a unique probability measure $Q^\bullet_{p,\tau}$ on $\mathcal{T}^\bullet_{\mathbb{N}}$ if and only if
\[ Q^{[m]}_{p,\tau} = Q^{[n]}_{p,\tau} (R^\bullet_{m,n})^{-1} \quad \text{for all } m \leq n, \text{ for every } n \in \mathbb{N}. \tag{5.1} \]

Theorem 5.1. The family $(Q^{[n]}_{p,\tau})_{n \in \mathbb{N}}$ satisfies (5.1) if and only if $\tau_0 = \tau_1 = 0$ and
\[ \tau_n = \tau_{n+1}\,(1 - p_{n+1}(e^{(n+1)}_{n+1})) \quad \text{for every } n \geq 2. \tag{5.2} \]

Proof. Clearly, $\tau_0 = \tau_1 = 0$ is both necessary and sufficient for $Q^{[n]}_{p,\tau}$-almost every $t \in \mathcal{T}^\bullet_{[n]}$ to satisfy (i)$^w$ in the definition of a weighted fragmentation tree, for every $n \in \mathbb{N}$. Henceforth, we fix $n \geq 2$ and examine condition (5.2) for $\tau_n$.

For $Q^{[n]}_{p,\tau}$ defined as in (4.4), let $T^\bullet_{n+1} = (T_{n+1}, W_{n+1}) \sim Q^{[n+1]}_{p,\tau}$ and define $T^\bullet_n = (T_n, W_n) := R^\bullet_{n,n+1} T^\bullet_{n+1}$. By (5.1), we must show that $T^\bullet_n \sim Q^{[n]}_{p,\tau}$.

In general, for any pair $(t, t')$, with $t' \in \mathcal{T}_{[n+1]}$ and $t := R_{n,n+1} t'$, there is a unique element $b \in t$ such that $b \cup \{n+1\}$, $b$ and $\{n+1\}$ are all elements of $t'$. We denote this unique element by $b^* \in t$ and we say that $\{n+1\}$ is attached below $b^*$ in $t'$. Now, by construction, $R^\bullet_{n,n+1} T^\bullet_{n+1} = T^\bullet_n$ and, therefore, $R_{n,n+1} T_{n+1} = T_n$. Hence, we can define $b^* \in T_n$ as the unique $b^*$ below which $n+1$ is attached in $T_{n+1}$. By definition of $R^\bullet_{n,n+1}$ in (4.2),
\[ W_n(b) = \max(W_{n+1}(b), W_{n+1}(b \cup \{n+1\})) \quad \text{for all } b \in T_n \setminus \{b^*\} \]
and
\[ W_n(b^*) = W_{n+1}(b^*) + W_{n+1}(b^* \cup \{n+1\}) > \max(W_{n+1}(b^*), W_{n+1}(b^* \cup \{n+1\})) \quad \text{a.s.} \]

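By assumption (2.4), the finite-dimensional distributions $(Q^{[n]}_p)_{n \in \mathbb{N}}$ on $\{\mathcal{T}_{[n]}\}_{n \in \mathbb{N}}$ are consistent and, therefore, $T_n$ is distributed according to $Q^{[n]}_p$ for each $n \in \mathbb{N}$. The Markov property of $T_n$, together with conditional independence of the edge lengths, implies that $T^\bullet_n \sim Q^{[n]}_{p,\tau}$ if and only if, for every $n \geq 0$,
\[ X + X' I_E =_{\mathcal{L}} X', \tag{5.3} \]
where $X \sim \mathrm{Geo}(\tau_{n+1})$, $X' \sim \mathrm{Geo}(\tau_n)$, $E$ is an event with probability $p_{n+1}(e^{(n+1)}_{n+1})$ and $X$, $X'$, $E$ are mutually independent. (Here, we write $X =_{\mathcal{L}} Y$ to denote that random variables $X$ and $Y$ are equal in law.) Note that, by assumption $\tau_0 = \tau_1 = 0$, (5.3) plainly holds for $n \in \{0, 1\}$, and so we need only consider the case $n \geq 2$. The probability generating function $G_Y(s) := E s^Y$ of a Geometric variable $Y$ with success probability $p \in (0,1)$ is
\[ G_Y(s) := \frac{sp}{1 - s(1 - p)}; \]
and so, (5.3) implies that
\[ E s^{X + X' I_E} = \frac{s \tau_n}{1 - s(1 - \tau_n)}, \quad \text{for all } n \in \mathbb{N}. \]
Fixing $s > 0$ and writing $\sigma_n := 1 - \tau_n$, we have
\begin{align*}
E s^{X + X' I_E}
&= \frac{s \tau_{n+1}}{1 - s \sigma_{n+1}} \left[ p_{n+1}(e^{(n+1)}_{n+1})\, \frac{s \tau_n}{1 - s \sigma_n} + 1 - p_{n+1}(e^{(n+1)}_{n+1}) \right] \\
&= \frac{s \tau_n}{1 - s \sigma_n} \left\{ \frac{s \tau_{n+1}}{1 - s \sigma_{n+1}} \left[ \frac{p_{n+1}(e^{(n+1)}_{n+1})\, s \tau_n + (1 - p_{n+1}(e^{(n+1)}_{n+1})) - s \sigma_n (1 - p_{n+1}(e^{(n+1)}_{n+1}))}{s \tau_n} \right] \right\} \\
&= \frac{s \tau_n}{1 - s \sigma_n} \left\{ \frac{s \tau_{n+1}}{1 - s \sigma_{n+1}} \left[ \frac{(1 - s)(1 - p_{n+1}(e^{(n+1)}_{n+1})) + s \tau_n}{s \tau_n} \right] \right\}.
\end{align*}
It follows that $X + X' I_E =_{\mathcal{L}} X'$ if and only if
\[ \frac{\tau_{n+1}}{1 - s \sigma_{n+1}} = \frac{\tau_n}{(1 - s)(1 - p_{n+1}(e^{(n+1)}_{n+1})) + s \tau_n}. \]
By assumption, both $\tau_n$ and $\tau_{n+1}$ are strictly positive. Hence, there exists a unique $\alpha > 0$ such that $\alpha \tau_n = \tau_{n+1}$. We must have
\[ \frac{\tau_{n+1}}{1 - s \sigma_{n+1}} = \frac{\alpha}{\alpha} \cdot \frac{\tau_n}{(1 - s)(1 - p_{n+1}(e^{(n+1)}_{n+1})) + s \tau_n} = \frac{\tau_{n+1}}{(1 - s)(1 - p_{n+1}(e^{(n+1)}_{n+1}))\alpha + s \tau_{n+1}}. \]
Because $1 - s \sigma_{n+1} = (1 - s) + s \tau_{n+1}$ and $s > 0$ is arbitrary, comparing denominators shows that $\alpha(1 - p_{n+1}(e^{(n+1)}_{n+1})) = 1$, which is (5.2). This completes the proof.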
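The generating-function identity behind (5.3) can be verified exactly for particular numbers. In the sketch below the values of $\tau_{n+1}$ and $q := p_{n+1}(e^{(n+1)}_{n+1})$ are arbitrary illustrative choices and the function names are hypothetical; $\tau_n$ is then set by (5.2), and the two sides of the identity are compared with rational arithmetic.

```python
from fractions import Fraction

def pgf_geo(tau, s):
    """E s^Y for Y ~ Geo(tau) on {1, 2, ...}."""
    return s * tau / (1 - s * (1 - tau))

def pgf_lhs(tau_next, tau_n, q, s):
    """E s^{X + X' 1_E}: X ~ Geo(tau_next), X' ~ Geo(tau_n), P(E) = q, independent."""
    return pgf_geo(tau_next, s) * (q * pgf_geo(tau_n, s) + 1 - q)

q = Fraction(1, 3)         # illustrative value of p_{n+1}(e)
tau_next = Fraction(3, 4)  # illustrative value of tau_{n+1}
tau_n = tau_next * (1 - q) # condition (5.2)

grid = [Fraction(k, 10) for k in range(1, 10)]
matches = all(pgf_lhs(tau_next, tau_n, q, s) == pgf_geo(tau_n, s) for s in grid)

# Any other value of tau_n breaks the identity at every s in (0, 1).
wrong = Fraction(2, 5)
mismatches = all(pgf_lhs(tau_next, wrong, q, s) != pgf_geo(wrong, s) for s in grid)
```

At $s = 1/2$, for example, both sides equal $1/3$ when $\tau_n = 1/2$, in agreement with the algebra in the proof.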

Our next step is to show the correspondence between probability measures $\nu^*$ satisfying (2.6) and pairs $(p, \tau)$ satisfying (2.4) and (2.5). In this direction, let $\nu^*$ be a probability measure on $\Delta^\downarrow$ satisfying (2.6). By Kingman's correspondence, $\nu^*$ determines a unique exchangeable paintbox measure $\varrho_{\nu^*}$ on $\mathcal{P}_{\mathbb{N}}$. As before, we write $\varrho^{(n)}_{\nu^*} := \varrho_{\nu^*} D_n^{-1}$ to denote the distribution $\varrho_{\nu^*}$ induces on $\mathcal{P}_{[n]}$ through deletion. Furthermore, for any $b \subset_f \mathbb{N}$, we write $\varrho^{b}_{\nu^*}$ to denote the measure $\varrho_{\nu^*}$ induces on $\mathcal{P}_b$. By construction, $(\varrho^{(n)}_{\nu^*})_{n \in \mathbb{N}}$ is exchangeable and satisfies the consistency condition
\[ \varrho^{(m)}_{\nu^*}(\pi) = \varrho^{(n)}_{\nu^*}(D_{m,n}^{-1}(\pi)), \quad \pi \in \mathcal{P}_{[m]}, \text{ for every } m \leq n < \infty. \tag{5.4} \]

Given $\nu^*$, define $p := (p^{\nu^*}_n)_{n \geq 2}$ as in (3.1). By assumption (2.6), it is clear that $p^{\nu^*}_n$ is a probability distribution on $\mathcal{P}_{[n]} \setminus \{1_{[n]}\}$ for every $n \geq 2$. Exchangeability and consistency (2.4) of $p$ follow easily from properties of $\varrho_{\nu^*}$.

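Theorem 5.2. The identities (3.1) and (3.2) establish a bijection between pairs $(p, \tau)$ satisfying (2.4) and (2.5) and probability measures $\nu^*$ on $\Delta^\downarrow$ satisfying (2.6). Therefore, to any such $(p, \tau)$, there is a unique measure $\nu^*$ such that $Q^\bullet_{p,\tau}$ has finite-dimensional marginal distributions $Q^{[n]}_{p,\tau} := Q^{[n]}_{\nu^*}$, where
\[ Q^{[n]}_{\nu^*}(t^\bullet) := \prod_{b \in t \,:\, \#b \geq 2} \varrho^{b}_{\nu^*}(1_b)^{w_b - 1}\, \varrho^{b}_{\nu^*}(\Pi_{t_{|b}}), \quad t^\bullet := (t, w) \in \mathcal{T}^\bullet_{[n]}, \text{ for every } n \in \mathbb{N}. \tag{5.5} \]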

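Proof. First, suppose $(p, \tau)$ satisfies (2.4) and (2.5). For each $n \in \mathbb{N}$, we define a probability measure $P_n(\cdot)$ on $\mathcal{P}_{[n]}$ by

\[ P_n(\pi) := \begin{cases} \tau_n p_n(\pi), & \pi \neq 1_{[n]}, \\ 1 - \tau_n, & \pi = 1_{[n]}. \end{cases} \]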

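Putting $P_1(1_{[1]}) = 1$, we have a collection $(P_n)_{n \in \mathbb{N}}$ of exchangeable marginal distributions on $\{\mathcal{P}_{[n]}\}_{n \in \mathbb{N}}$ that corresponds to $p$ through (3.1). From the assumptions (2.4) and (2.5), it is easy to check that $(P_n)_{n \in \mathbb{N}}$ is consistent. Therefore, by Kolmogorov's extension theorem, $(P_n)_{n \in \mathbb{N}}$ determines a unique exchangeable probability measure on $\mathcal{P}_{\mathbb{N}}$ which, by Kingman's correspondence, is a paintbox measure $\varrho_{\nu^*}$ for some unique probability measure $\nu^*$ on $\Delta^{\downarrow}$. Moreover, by assumption, $\tau_{n+1} \geq \tau_n > 0$ for all $n \geq 2$ and so $\tau_n \to \tau_\infty > 0$. By monotone convergence, we have

\[ \varrho_{\nu^*}(1_{[\infty]}) = \lim_{n \to \infty} \downarrow \varrho_{\nu^*} D_n^{-1}(1_{[n]}) = \lim_{n \to \infty} \downarrow \varrho^{(n)}_{\nu^*}(1_{[n]}) = 1 - \lim_{n \to \infty} \tau_n = 1 - \tau_\infty < 1. \]

Hence, $\nu^*$ must satisfy (2.6).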

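Conversely, let $\nu^*$ be a probability measure on $\Delta^{\downarrow}$ satisfying (2.6) and define $p^* := (p_n^{\nu^*})_{n \in \mathbb{N}}$ by (3.1) and $\tau^* := (\tau_n^{\nu^*})_{n \geq 0}$ by (3.2). Plainly, $p^*$ satisfies (2.4). We also see that, for every $n \geq 2$,

\begin{align*}
\tau_{n+1}^{\nu^*} \left(1 - p_{n+1}^{\nu^*}(e^{(n+1)}_{n+1})\right)
&= \left(1 - \varrho^{(n+1)}_{\nu^*}(1_{[n+1]})\right) \left(1 - \frac{\varrho^{(n+1)}_{\nu^*}(e^{(n+1)}_{n+1})}{1 - \varrho^{(n+1)}_{\nu^*}(1_{[n+1]})}\right) \\
&= 1 - \varrho^{(n+1)}_{\nu^*}(1_{[n+1]}) - \varrho^{(n+1)}_{\nu^*}(e^{(n+1)}_{n+1}) \\
&= 1 - \varrho^{(n)}_{\nu^*}(1_{[n]}) \\
&= \tau_n^{\nu^*},
\end{align*}

where the above expression simplifies because $\varrho^{(n+1)}_{\nu^*}(1_{[n+1]}) + \varrho^{(n+1)}_{\nu^*}(e^{(n+1)}_{n+1}) = \varrho^{(n)}_{\nu^*}(1_{[n]})$ by consistency (5.4) of $(\varrho^{(n)}_{\nu^*})_{n \in \mathbb{N}}$. Hence, (2.5) is satisfied.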

Equation (5.5) follows immediately from (4.4). This completes the proof.
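As a concrete check of the converse direction, consider the hypothetical example in which $\nu^*$ is the point mass at $(1/2, 1/2, 0, \ldots)$, so that the paintbox colours each element independently with one of two equally likely colours. The sketch below assumes that $e^{(n+1)}_{n+1}$ denotes the partition of $[n+1]$ with blocks $[n]$ and $\{n+1\}$, and reads the formulas for $\tau_n^{\nu^*}$ and $p_{n+1}^{\nu^*}(e^{(n+1)}_{n+1})$ off the displayed chain of equalities, since (3.1) and (3.2) are not reproduced in this section. All function names are illustrative.

```python
from fractions import Fraction


def rho_one_block(n):
    """P(all of 1..n share a colour) under two equally likely colours: 2 * (1/2)^n."""
    return 2 * Fraction(1, 2) ** n


def rho_last_singleton(n):
    """P(elements 1..n-1 share one colour and element n has the other), n >= 2."""
    return 2 * Fraction(1, 2) ** n


def tau(n):
    """tau_n = 1 - rho^(n)(1_[n]), as read off the display above (assumption)."""
    return 1 - rho_one_block(n)


def p_last_singleton(n):
    """p_n(e_n^(n)) = rho^(n)(e_n^(n)) / (1 - rho^(n)(1_[n])) (assumption)."""
    return rho_last_singleton(n) / (1 - rho_one_block(n))


for n in range(2, 12):
    # Consistency (5.4) restricted to the two partitions that project onto 1_[n].
    assert rho_one_block(n + 1) + rho_last_singleton(n + 1) == rho_one_block(n)
    # Relation (2.5): tau_{n+1} * (1 - p_{n+1}(e)) = tau_n.
    assert tau(n + 1) * (1 - p_last_singleton(n + 1)) == tau(n)
```

In this example $\tau_n = 1 - 2^{1-n}$ increases to $\tau_\infty = 1$, which matches $1 - \nu^*(\{(1, 0, \ldots)\}) = 1$.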

Theorem 5.3. Let $p := (p_n)_{n \geq 2}$ be a family of splitting rules satisfying (2.4) and let $\lambda := (\lambda_n)_{n \geq 2}$ be as defined in (2.7) with respect to $p$. Then $Q_p$-almost every $t \in \mathcal{T}_{\mathbb{N}}$ possesses a root partition if and only if $\lambda_\infty := \lim_{n \to \infty} \lambda_n < \infty$.

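Proof. First, suppose that $Q_p$-almost every $t \in \mathcal{T}_{\mathbb{N}}$ possesses a root partition. Then, by our definition of root partition in Section 4.1,

\[ P(\{\Pi_T \text{ exists}\}) = P\left( \bigcup_{n=1}^{\infty} \{\Pi_T \in D_n^{-1}(\mathcal{P}_{[n]} \setminus \{1_{[n]}\})\} \right) = 1. \]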


On the other hand, by (2.7), we have

\[ \lambda_n / \lambda_{n+1} = 1 - p_{n+1}(e^{(n+1)}_{n+1}) \quad \text{for all } n \geq 2. \]

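Now, $p_n(e^{(n)}_n) \in [0,1]$ for every $n \in \mathbb{N}$, and so the sequence $\lambda := (\lambda_n)_{n \geq 2}$ is monotonically nondecreasing and $\lambda_\infty := \lim_{n \to \infty} \lambda_n$ exists. For fixed $n \in \mathbb{N}$ and $\pi \in \mathcal{P}_{[n]} \setminus \{1_{[n]}\}$,

\[ P(\{\Pi_T \in D_n^{-1}(\pi)\}) = p_n(\pi) \prod_{j=1}^{\infty} \left(1 - p_{n+j}(e^{(n+j)}_{n+j})\right) = p_n(\pi) \lambda_n \lim_{j \to \infty} \lambda_{n+j}^{-1}; \]

hence,

\[ P(\{\Pi_T \in D_n^{-1}(\mathcal{P}_{[n]} \setminus \{1_{[n]}\})\}) = \lambda_n \lim_{j \to \infty} \lambda_{n+j}^{-1} = \lambda_n / \lambda_\infty. \tag{5.6} \]

Now, either $\lambda_\infty = \infty$ or $0 < \lambda_\infty < \infty$. On the one hand, if $\lambda_\infty = \infty$, then $P(\{\Pi_T \in D_n^{-1}(\mathcal{P}_{[n]} \setminus \{1_{[n]}\})\}) = \lambda_n / \lambda_\infty = 0$ for all $n \in \mathbb{N}$; whence,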

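\[ 1 = P(\{\Pi_T \text{ exists}\}) = P\left( \bigcup_{n=1}^{\infty} \{\Pi_T \in D_n^{-1}(\mathcal{P}_{[n]} \setminus \{1_{[n]}\})\} \right) \leq \sum_{n=1}^{\infty} P\{\Pi_T \in D_n^{-1}(\mathcal{P}_{[n]} \setminus \{1_{[n]}\})\} = 0, \]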

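a contradiction. On the other hand, if $\lambda_\infty < \infty$, then $\lambda_n / \lambda_\infty \to 1$ as $n \to \infty$ and, therefore, $P\{\Pi_T \in D_n^{-1}(\mathcal{P}_{[n]} \setminus \{1_{[n]}\})\} \to 1$ as $n \to \infty$. Consequently,

\[ 1 = P(\{\Pi_T \text{ exists}\}) = P\left( \bigcup_{n=1}^{\infty} \{\Pi_T \in D_n^{-1}(\mathcal{P}_{[n]} \setminus \{1_{[n]}\})\} \right) \leq \sum_{n=1}^{\infty} P\{\Pi_T \in D_n^{-1}(\mathcal{P}_{[n]} \setminus \{1_{[n]}\})\} = \infty, \]

establishing the first claim.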

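Conversely, suppose $\lambda_\infty := \lim_{n \to \infty} \lambda_n < \infty$. For each $n \geq 2$, we define the event $A_n := \{\Pi_{T_{|[n]}} = e^{(n)}_n\}$. By the Markov branching property and consistency (2.4), the events $\{A_n\}_{n \geq 2}$ are independent; hence, the random variables $\{1_{A_n}\}_{n \geq 2}$ are independent Bernoulli random variables with parameter $p_n(e^{(n)}_n)$ for each $n \geq 2$. Moreover, $\{\Pi_T \text{ exists}\} = \{\sum 1_{A_n} < \infty\}$. Clearly, the event $\{\sum 1_{A_n} < \infty\}$ is in the tail $\sigma$-field generated by $\{A_n\}_{n \geq 2}$. Hence, the event $\{\Pi_T \text{ exists}\}$ has probability 0 or 1 by Kolmogorov's 0-1 law. However, by (5.6),

\[ P\{\Pi_T \in D_n^{-1}(\mathcal{P}_{[n]} \setminus \{1_{[n]}\})\} = \lambda_n \lim_{j \to \infty} \lambda_{n+j}^{-1} = \lambda_n / \lambda_\infty > 0 \quad \text{for every } n \geq 2. \]

Therefore, $P(\{\Pi_T \text{ exists}\}) \geq \lambda_n / \lambda_\infty > 0$ and we conclude that $\{\Pi_T \text{ exists}\}$ has probability one.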
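The dichotomy in Theorem 5.3 can be illustrated by computing $\lambda_n$ from the recursion $\lambda_n / \lambda_{n+1} = 1 - p_{n+1}(e^{(n+1)}_{n+1})$. The sketch below is only an illustration. It assumes the normalisation $\lambda_2 = 1$ (only ratios of the $\lambda_n$ enter (5.6)) and uses two hypothetical sequences for $p_n(e^{(n)}_n)$, namely $1/n^2$ and $1/n$, without claiming that either arises from a particular family of splitting rules.

```python
from fractions import Fraction


def lambdas(p_e, n_max):
    """Return {n: lambda_n} for 2 <= n <= n_max, using the recursion
    lambda_n / lambda_{n+1} = 1 - p_e(n + 1) and the assumed choice lambda_2 = 1."""
    lam = {2: Fraction(1)}
    for n in range(2, n_max):
        lam[n + 1] = lam[n] / (1 - p_e(n + 1))
    return lam


def summable(n):
    """Hypothetical p_n(e_n^(n)) = 1/n^2; the infinite product converges."""
    return Fraction(1, n * n)


def non_summable(n):
    """Hypothetical p_n(e_n^(n)) = 1/n; the infinite product vanishes."""
    return Fraction(1, n)


lam_a = lambdas(summable, 2000)
lam_b = lambdas(non_summable, 2000)

# Telescoping gives lambda_n = 3n / (2(n + 1)) in the first case, so lambda_inf = 3/2.
assert all(lam_a[n] == Fraction(3 * n, 2 * (n + 1)) for n in lam_a)
# In the second case lambda_n = n / 2, which is unbounded.
assert all(lam_b[n] == Fraction(n, 2) for n in lam_b)

# By (5.6), the root partition is visible at level n with probability lambda_n / lambda_inf.
lam_inf_a = Fraction(3, 2)
prob_visible = {n: lam_a[n] / lam_inf_a for n in (2, 10, 100)}
```

In the first case $\lambda_n / \lambda_\infty = n/(n+1)$, which is positive and tends to 1, consistent with a root partition existing almost surely. In the second case $\lambda_n \to \infty$, so the probability in (5.6) is 0 for every $n$.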

Proposition 5.4. Let $p := (p_n)_{n \geq 2}$ be a family of splitting rules satisfying (2.4) and define $\lambda := (\lambda_n)_{n \geq 2}$ as in (vi) of Theorem 2.3. Then there exists a collection $\tau := (\tau_n)_{n \geq 0}$ of success probabilities satisfying (2.5) with respect to $p$ if and only if $\lim_{n \to \infty} \lambda_n < \infty$.

Proof. We have already noted that $(\lambda_n)_{n \geq 2}$ defined in (2.7) is monotonically nondecreasing, and so $\lim_{n \to \infty} \lambda_n$ exists. Suppose there exists $\tau$ satisfying (2.5) with respect to $p$. Then $\lambda := (\lambda_n)_{n \geq 2}$, as defined in (2.7), satisfies (3.6), which is identical to (2.5); hence, there exists $\alpha \in (0, \infty)$ such that $\lambda_n = \alpha\tau_n$ for every $n \in \mathbb{N}$. Since $\tau_n \leq 1$ for all $n \in \mathbb{N}$, we conclude $\lim_{n \to \infty} \lambda_n = \alpha \lim_{n \to \infty} \tau_n \leq \alpha < \infty$. Conversely, if $\lambda_n \to \lambda_\infty < \infty$, we can define $\tau_n := \lambda_n / \lambda_\infty$ for $n \geq 2$, which satisfies (2.5).

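In fact, we could take any $\lambda_\infty \leq \lambda^* < \infty$ and put $\tau_n := \lambda_n / \lambda^*$. The choice $\lambda^* = \lambda_\infty$ coincides with the case $\tau_\infty = 1$; in general, to specify $\tau_\infty \in (0, 1]$, we choose $\lambda^* = \lambda_\infty / \tau_\infty \geq \lambda_\infty$ and we have

\[ \lim_{n \to \infty} \tau_n = \lim_{n \to \infty} \lambda_n / \lambda^* = \frac{\tau_\infty}{\lambda_\infty} \lim_{n \to \infty} \lambda_n = \tau_\infty. \]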
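The rescaling $\tau_n := \lambda_n / \lambda^*$ can be checked on a small example. The sketch below is illustrative only. It assumes the hypothetical sequence $p_n(e^{(n)}_n) = 1/n^2$ with the normalisation $\lambda_2 = 1$, for which telescoping gives $\lambda_n = 3n / (2(n+1))$ and $\lambda_\infty = 3/2$, and it takes (2.5) in the form $\tau_n = \tau_{n+1}(1 - p_{n+1}(e^{(n+1)}_{n+1}))$ used in the proof of Theorem 5.2.

```python
from fractions import Fraction

LAM_INF = Fraction(3, 2)  # limit of lambda_n for the hypothetical p_n(e) = 1/n^2


def lam(n):
    """Closed form lambda_n = 3n / (2(n + 1)) for p_n(e) = 1/n^2 and lambda_2 = 1."""
    return Fraction(3 * n, 2 * (n + 1))


def tau_sequence(tau_inf, n_max):
    """tau_n := lambda_n / lambda_star with lambda_star = lambda_inf / tau_inf."""
    lam_star = LAM_INF / tau_inf
    return {n: lam(n) / lam_star for n in range(2, n_max + 1)}


tau_inf = Fraction(1, 2)  # any target value in (0, 1]
taus = tau_sequence(tau_inf, 50)

for n in range(2, 50):
    # Relation (2.5) with p_{n+1}(e) = 1 / (n + 1)^2.
    assert taus[n] == taus[n + 1] * (1 - Fraction(1, (n + 1) ** 2))
    # The success probabilities are positive, nondecreasing and bounded by tau_inf.
    assert 0 < taus[n] <= taus[n + 1] <= tau_inf
```

Here $\tau_n = \tau_\infty \, n / (n+1)$, which increases to the prescribed $\tau_\infty$; choosing $\tau_\infty = 1$ recovers $\tau_n = \lambda_n / \lambda_\infty$.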

Proposition 5.5. To any probability measure $\nu^*$ satisfying $\nu^*(\{(1, 0, \ldots)\}) < 1$ and $K \in (0, \infty)$, there corresponds a unique pair $(\nu_K, \tau_\infty)$, where $\nu_K$ is a measure on $\Delta^{\downarrow}$ with total mass $K\tau_\infty$, $\nu_K(\{(1, 0, \ldots)\}) = 0$ and $\tau_\infty \in (0, 1]$, such that $(\nu_K, \tau_\infty)$ determines $\nu^*$ through (3.5).

Proof. This follows by the discussion in Section 3.1: Given $\nu^*$ satisfying (2.6) and $K \in (0, \infty)$, we define $\nu_K$ as in (3.4) and put $\tau_\infty := 1 - \nu^*(\{(1, 0, \ldots)\})$. From $(\nu_K, \tau_\infty)$, we define $\nu^*$ by (3.5). Uniqueness is a consequence of the constraints placed on $\nu_K$ and $\tau_\infty$ and follows immediately from (3.5). This completes the proof.

The equivalence of Parts (i)-(vi) of Theorem 2.3 has now been proven according to the following scheme.

(i)⇔(ii): Theorem 5.1
(ii)⇔(iii): Theorem 5.2
(v)⇔(vi): Theorem 5.3
(ii)⇔(vi): Proposition 5.4
(iii)⇔(iv): Proposition 5.5

This completes the proof.

References

[1] D. Aldous. Probability distributions on cladograms. In Random Discrete Structures, pages 1–18. Springer, 1996. MR-1395604

[2] J. Bertoin. Homogeneous fragmentation processes. Probab. Theory Related Fields, 121(3):301–318, 2001. MR-1867425

[3] J. Bertoin. Random fragmentation and coagulation processes, volume 102 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 2006. MR-2253162

[4] B. Haas, G. Miermont, J. Pitman, and M. Winkel. Continuum tree asymptotics of discrete fragmentations and applications to phylogenetic models. Ann. Probab., 36(5):1790–1837, 2008. MR-2440924

[5] J. F. C. Kingman. The representation of partition structures. J. London Math. Soc. (2), 18(2):374–380, 1978. MR-0509954

[6] P. McCullagh, J. Pitman, and M. Winkel. Gibbs fragmentation trees. Bernoulli, 14(4):988–1002, 2008. MR-2543583

[7] S. Tavaré. Ancestral Inference in Population Genetics, volume 1837 of Lecture Notes in Mathematics. Springer-Verlag, Berlin, 2004. Lectures from the 31st Summer School on Probability Theory held in Saint-Flour, 2001. MR-2071630