42
Computing Phylogenetic Roots with Bounded Degrees and
Errors
is
Hard
Tsukiji
Tatsuie1
and Zhi-ZhongChen2
1 築地 立家
DepartmentofInformation Science, TokyoDenki University, Hatoyama, Saitama 350-0394, Japan.
$\mathrm{t}$sukiji@j .dendai. $\mathrm{a}\mathrm{c}.\mathrm{j}\mathrm{p}$
2 隙致中
DepartmentofMathematical Sciences,Tokyo Denki University, Hatoyama, Saitama 350-0394,
Japan. chenQr.dendai
.
$\mathrm{a}\mathrm{c}.\mathrm{j}\mathrm{p}$Abstract. The $\mathrm{D}\mathrm{E}\mathrm{G}\mathrm{R}\mathrm{E}\mathrm{E}-\Delta$ CLOSEST PHYLOGENETIC $k\mathrm{T}\mathrm{H}$ Root PROBLEM $(\Delta \mathrm{C}\mathrm{P}\mathrm{R}k)$ is the problem of finding a (phylogenetic) tree $T$ from a given graph $G=(V, E)$ such that (1) the
degreeofeachinternal node of$T$isat least 3 andat most$\Delta$,(2)theexternalnodes($i.e$
.
leaves)of$T$areexactlytheelementsof$V$,(3)The number of disagreements, $|E\mathrm{a}$
{
$(u, v)$ :$u$,$v$areleavesof$T$ and$dr(u, v)\leq k\}|$ does notexceed a given number, where$d_{T}$(u,$v$) denotes the distance
between $u$ and $v$ in tree $T$
.
We show that this problem is $\mathrm{N}\mathrm{P}$-hard for every fixed constants6,$k\geq 3.$
Our majortechnical contribution is the determination of all pylogenetic roots that approximate the almost largest cliques. In more precise, let $f_{\Delta}(k)$ be the size of alargest clique having a
$k\mathrm{t}\mathrm{h}$phylogenetic rootwithmaximumdegree$\Delta$
.
We determinetheall phylogenic$k\mathrm{t}\mathrm{h}$rootswithmaximumdegree4 that approximate the$(f_{\Delta}(k)-1)$-cliquewithinerror2, whereweallowthe
internal nodes ofphylogenyto have degree 2.
1
Introduction
A
phylogeny isa
tree where the leavesare
labeled by species and each internal node representsa
speciation event whereby
an
ancestral species gives rise to twoor
more
child species. The internalnodes of
a
phylogeny have degrees (in thesense
ofunrooted trees, $i$.
$e$.
thenumber of incident edges)at least
3.
Specifically, interspecies similarity is represented bya
graph where the verticesare
thespecies and the adjacency relation representsevidence of evolutionary similarity.
A
phylogenyis thenreconstructed from the graph such that the leaves of the phylogeny
are
labeled by vertices of thegraph ($i.e$. species) and for anytwo vertices ofthe graph, they
are
adjacent in the graph if andonlyiftheir corresponding leaves inthe phylogeny
are
connected bya
pathoflengthatmost $k$, where $k$isa
predeterminedproximitythreshold. To beclear, vertices in the graphare
called vertices while thoseinthe phylogeny nodes. Recall that the length ofthe (unique) path connecting two nodes $u$ and$v$
in
phylogeny $T$is the number of edgeson
the path, which is denoted by $d_{T}(u, v)$.
This approach givesrise to the followingalgorithmicproblem [5]:
PHYLOGENETIC $k\mathrm{T}\mathrm{H}$ Root PROBLEM (PRfc):
Given
a
graph $G=(V, E)$, finda
phylogeny $T$ with leaves labeled by theelementsof$V$ such that foreach pairofvertices $u$,$v\in V$, $(u, v)\in E$if and
onlyif$d_{T}(u,v)\leq k.$ 数理解析研究所講究録 1375 巻 2004 年 42-48
43
Such
a
phylogeny$T$ (if exists) iscalleda
phylogenetic$k\mathrm{t}\mathrm{h}$root,or a
$k\mathrm{t}\mathrm{h}$rootphylogeny, ofgraph$G$.
Graph $G$ is called the$k\mathrm{t}\mathrm{h}$ phylogeneticpowerof$T$
.
For convenience,we
denote the $k$th phylogeneticpower of
a
phylogeny $T$as
7$k$$(T)$.
That is, Vu(T) has the vertex set $L(T)=\{u$ : $u$are
leaves of$T$
$\}$ and the edge set $T^{k}=$
{
$(u$,$v$) $|u$ and $v$are
leaves of$T$ and $d_{T}(u,$$v)\leq k$}.
Thus, $\mathrm{P}\mathrm{R}k$ asks fora
phylogeny$T$ such that $G=P_{k}(T)$
.
The input graphin$\mathrm{P}\mathrm{R}k$ is derived from
some
similarity data, whichis usuallyinexact in practiceand may have
erroneous
(spuriousor
missing) edges.Such
errors
may
result in graphs that haveno
phylogenetic roots and hence
we are
interested in finding approximate phylogenetic roots for suchgraphs. For
a
graph $G=(V, E)$, each tree $T$ whoseleavesare
exactly theelements of$V$ iscalledan
approximatephylogeny of$G$, and the
error
of$T$ is $|7$$k$
ea
$E|=|$$(E-T^{k})\cup(T^{k}-E)|$.
This motivatedChen et.
at.
to consider thefollowing problem:CLOSEST PHYLOGENETIC $k\mathrm{T}\mathrm{H}$ Root PROBLEM (CPRfc):
Given
a
graph $G=(V, E)$ anda
nonnegative integer $\ell$, decide if$G$ hasan
approximate phylogeny$T$with at most $\ell$
errors.
An approximate phylogeny of $G$ with the minimum
errors
is calleda
closest $k\mathrm{t}\mathrm{h}$ root phylogeny ofgraph $G$
.
The hardness of PRCforlarge$k$
seems
tocome
fromtheunbounded degreeofan
internal nodeinthe output phylogeny.Ontheotherhand,in the practice ofphylogenyreconstruction,mostphylogenies
considered
are
trees of degree 3 [7] because speciation eventsare
usually bifurcatingevents
in theevolutionaryprocess. We call these restricted
versions
the$\mathrm{D}\mathrm{E}\mathrm{G}\mathrm{R}\mathrm{E}\mathrm{E}-\Delta$$\mathrm{P}\mathrm{R}k$and
the$\mathrm{D}\mathrm{E}\mathrm{G}\mathrm{R}\mathrm{E}\mathrm{E}-\Delta$CPRk,and denotethem for short
as
$4\mathrm{P}\mathrm{R}7\mathrm{c}$and ACPRk, respectively.1.1 Previous Results
on
Phylogenetic Root Problems$\mathrm{P}\mathrm{R}k$
was
first studied in [5] wherelinear-timealgorithms for PR2 andPR3were
proposed. Alinear-time algorithm for the special
case
of PR4 where the input graph is required to be connected,was
also presented in [5]. At present, the complexity of$\mathrm{P}\mathrm{R}k$with $k\geq 5$ is stillunknown.
Chen et. al. [2] presented
a
linear-time algorithm that determines, for any input connectedgraph$G$ and constant $\Delta$ $\geq 3,$ if$G$ has
a
$k\mathrm{t}\mathrm{h}$root phylogeny with degree at most$\mathrm{a}$, and if so, demonstrates
one
such phylogeny. Onthe other hand, Chen et. al. [2] showedthat CPRk is $\mathrm{N}\mathrm{P}$-complete, for any$k\geq 2.$ One of theiropen questions asks for the complexity of$\Delta$CPRk.
Ofspecial interest is CPR2. CPR2 is essentially identical to the correlation clustering problem
which has drawn much attention recently [1], The proofof the $\mathrm{N}\mathrm{P}$-hardness of CPR2 given in [2] is
also
a
valid
proofof
the$\mathrm{N}\mathrm{P}$-hardnessof the
correlation Clustering problem.1.2 Our Contribution
In this
paper,
we
will show that ACPRkis $\mathrm{N}\mathrm{P}$-complete, forevery $k\geq 3$ andIS
$\geq 3.$ Thisanswers
an
open question in [2].In
a
course
of the proofwe
first reduce HAMILTONIAN PATH,a
famous $\mathrm{N}\mathrm{P}$-complete problem, to$4\mathrm{C}\mathrm{P}\mathrm{R}3$, and then ACPRSto ACPRk. Theformerreduction is tedious but
a
routinework.On
theother ha $\mathrm{d}$, the latter reduction
seems
to requirenew combinatorial
investigation that is proper of$\Delta$
CPRk.
44
-There is
a
tree$T$ of maximum degree4
whose phylogenetic $k\mathrm{t}\mathrm{h}$power is $G$and such that $T$ hasa
uniqueunsaturated ($i.e$.
degree $<4$) internal node $\alpha$,
the degree ofat isa
– 1, $\mathrm{d}\mathrm{r}(\mathrm{a}, u)$ $=h$holds for
one
leaf$u$ and $d_{T}(\alpha,v)\geq h+1$ for all leaves $v$otherthan $u$.-For everyapproximate phylogeny$T$of$G$ with maximum degree$\Delta$ and at most $\ell$errors, there is
at most
one
pair $(\alpha,u)$ such that $\alpha$ isan
unsaturated internal node of$T$,
$u$ isa
leaf of$T$, and$\mathrm{d}\mathrm{r}(\mathrm{a}, u)$ $\leq h;$ moreover, if$(\alpha,u)$ existsthen the degreeof$\alpha$ in $T$is
5-1.
Then,
we
establish the reduction from ACPRZ to$\Delta \mathrm{C}\mathrm{P}\mathrm{R}k$byprovidinga
family of$(\Delta$,
$k$, $\lfloor k/2\rfloor-$$1$,2)-core
graphs
for every fixed
$4\geq 3$and
$k\geq 4.$Our construction of a
(A $k,\cdot\lfloor k/2\rfloor-1,2$)more
graphis
a
pile of $(\mathrm{A}, k’, 1,2)$-core
graphs for $k’=5,7$,
$\ldots$,
$k$ if$k\geq 5$ is odd, and $k’=4,6$,
$\ldots$,
$k$ if$k\geq 4$is
even.
So,a
more
basicproblem is toprove
thata
certaingraph isa
$(\mathrm{A}, k, 1,2)$-core
graph.The maximum size of
a
clique havinga
(nO-error) $k\mathrm{t}\mathrm{h}$root phylogeny withmaximum
degree4
isgivenby the followingfunction,
$f_{\Delta}(k)=\{$
$\Delta\cdot(\Delta-1)9-1$,if$k$ is even,
2
.
(4-1)$k\mathrm{T}^{1}$, if$k$ is odd.
We prove that the clique of size $f_{\Delta}(k)-1$ is
a
$(\Delta, k, 1,2)$-core
graph. Moreover,we
determine theall $k\mathrm{t}\mathrm{h}$root phylogenies with maximum degree
a
that approximate the clique withinerror
2, wherewe
allow theinternal nodes ofphylogeny tohave degree 2. Forexample, all phylogeneticroots ofthe$(f_{3}(5)-1)$ clique
are
$D_{5}$ in Figure 1, $E_{5}$ inFigure 2, and the tree obtained from$D_{5}$ by removingtheleaf$u$
.
2
Notations and
Definitions
We employ standard terminologies in graph theory. In particular, for
a
graph $G$, $V(G)$ and $E(G)$denote the sets of vertices and edges of $G$, respectively. An induced subgraph of
a
graph $G$ is the subgraph $H$ induced bya
subset $W$ of $V(G)$,
$i.e$.
$\mathrm{E}(\mathrm{H})=${
$(\mathrm{u},\mathrm{v})$:
$u,v\in W$ and $(\mathrm{w},\mathrm{v})\in E(G)$}.
Two graphs $G=(V, E)$ and$G’=(V’, E’)$
are
isomorphicif there isa
one-tO-One correspondence $\phi$between $V$ and$V’$ such that $(u, v)\in E$ if and only if$(\phi(u), \mathrm{O}(\mathrm{v}))\in E’$, and
we
denoteitas
$G\cong_{\phi}G’$.
The distance between twovertices$u$and $v$in $G$is denotedby $d_{G}(u,v)$
.
The degree ofa
vertex$v$in $G$is denotedby da(v), which isthe number of vertices adjacent to$v$ in$G$
.
Similarly, fora
tree$T_{:}V(T)$,$E(T)$
,
and $L(T)$ denote the setsofnodes, edgesand leavesof$T$, respectively.We alsointroduce
some new
terminologies oftrees forconvenience.
Fora
tree$T$ofmaximumdegree$\Delta$,
an
internalnode$\alpha$of$T$isunsaturatedif$d_{T}(\alpha)\leq$A-l. Tree$T$is$i$-extensibleif$i= \sum_{v}(\Delta-deg_{T}(v))$,
where the summation is taken
over
all unsaturated internal nodes $v$ of$T$.
A tree $T$ is $h$-away if for each unsaturatedinternal node$x$of$T$,
theminimum
distancefrom
$x$toa
leaf is at least $h$andfurther
there is exactly
one
leaf$v_{l}$ such that $d_{T}(x,v_{li})=h.$ For any set $U$ of nodes of$T$, $\mathrm{T}[\mathrm{J}7]$ denotes theminimum subtree containing$U$. Note that each leaf of$T[U]$ belongs to $U$
.
A phylogeny is atreethatcontains
no
degree2nodes.As
already mentioned,the$k\mathrm{t}\mathrm{h}$phylogeneticpowerofany tree$T$isdenotedas
Vk(T) $=(L(T),T^{k})$, $T^{k}$ is the set of edges $(u, v)$ with $\{u,v\}\subseteq L(T)$ and$d_{T}(u, v)\leq k.$3
Construction
of
$(\Delta, k, \lfloor k/2\rfloor ・1, 2)$-core
graphs
In this section
we
givea
construction
of (3,$k$,$\lfloor$7c/2$\rfloor$ – 1,2)-core graphs for every odd $k\geq 5.$ It isstraightforward to generalize the arguments of this sectionto obtain $(\Delta, k, \lfloor k/2\rfloor ・1, 2)$
-core
graphs45
Throughout thissection, alltreesand phylogenies
are
ofmaximum degree3 or
less. Weabbreviate$\mathrm{f}\mathrm{s}(\mathrm{k})$
as
$\mathrm{f}(\mathrm{k})$.
The proofs of themost lemmas and corollariesare
omitted due to lack of space.A
phylogenywhose$k\mathrm{t}\mathrm{h}$phylogenetic power realizes the $f(k)$-cliquecan
beconstructed
as
follows:Start
witha
path $P$ of length exactly $k$.
Let $u$ and $v$ be the endpoints of$P$.
Then connectas
manynew
nodesas
possibleso
that $P$ becomesa
tree of degree 3 and every node has distance at most $k$ffom both$u$ and $v$
.
Thistree is unique up toisomorphism and hencewe
denote it by $C_{k}$.
Moreover,removing
an
arbitraryleaf from$C_{k}$ yieldsa
tree, which is unique up toisomorphism.We denote thistree by$D_{k}$
.
Figure 1 depicts$D_{4}$,
$D_{5}$, and$D_{6}$ where themissingsiblingleaf ofti hasbeen removed. Bydefinition, the $k\mathrm{t}\mathrm{h}$phylogenetic powerof$D_{k}$ is
an
$f(k)-1$ clique.Fig.1.$D_{4}$,$D_{6}$ and$D_{6}$
.
Lemma 1. Forevery tree$T$ (of maimum degree 3),
if
there are two leaves$u$ and$v$ with$d_{T}$(u,$v$) $=k$and allleaves$w$
of
$T$ have distance at most$k$from
both $u$ and$v$, then $T$ is isomorphic toa
subtreeof
$C_{k}$
.
Corollary 1. Forany tree$T$,
if
$dr\{u,$$v$) $\leq k$for
all leaves$u$ and$v$, then$T$ is isomorphic toa subtreeof
$c_{k}$.
FVict 1 For every tree$T$ with $|L(T)|=f(k)$-1 and $|Tk|\geq(_{2}^{f(k)-1})-2,$
we
have$d_{T}(u,v)\leq k$for
allbut at
most teuo
unorderedpairs $(u,v)$of
leavesof
$T$.
Lemma 2. Let$k\geq 4.$ Let$T$ be
an
arbitrary tree such that$|L(T)$$|=f(k)-1$ and$|Tk|\geq(_{2}^{f(k)-1})-2.$Then, there
are
leaves$u$ and$v$of
$T$ with$d_{T}(u,v)=k.$Lemma 3. Let$k\geq 6.$ Let$T$ be
an
arbitrary treehaving$f(k)-1$ leaves. Suppose that there areleaves$u$
,
$v$,$w$of
$T$ such that$d_{T}(u, v)=k$ and$\max(d_{T}(u, w),$$dr\{u,$$w$)) $\geq kf$$1$.
Then, $|T^{k}|\leq(_{2}^{f(k)-1})-3.$Lemma 4. Let$k\geq 6.$ Let$T$ be
a
treehaving $f(k)-1$ leaves such that $|\mathrm{r}k|\geq(_{2}^{f(k)-1})$ - 2. Then$T$is 0-extensible
or
1-extensible. Moreover,if
$T$ is 1-extensible then$T\underline{\simeq}D_{k}$.
For $k\in\{4,5\}$
,
let$E_{k}$ bethetree inFigure 2.Lemma
5.
Let$k\in\{4,5\}$.
Let
$T$ bea
tree
having$f(k)$ $-1$ leaves such that $|Tk|\geq(_{2}^{f(k)-1})-2.$ Then48
Fig.2. $E_{4}$ and$E_{6}$
.
Fig.$\theta$
.
$R_{7,2}$Theorem 1. Forevery$k\geq 4,$ the $(f(k)-1)$ -clique is $a(3, k, 1,2)$
-core
graph.Now
we are
readytoconstructa
$(3, k, \lfloor \mathrm{A}/2\rfloor-1, 2)$-core
graphfor every odd$k\geq 5.$ Werecursivelyconstructtrees$R_{k,\lfloor k/2\rfloor-1}$, $k=5,7,9$,$\ldots$, anddefine
a
family of$(3, k, \lfloor k/2\rfloor- 1, 2)$-core
graphsas
thekthphylogenetic power of thetrees $R_{k,\lfloor k/2\rfloor-1}$ (seeFigure
3
for$R_{7,2}$ ):-Let $h_{k}=\lfloor k/2\rfloor-$$1$
.
For each $1\leq i\leq h_{k}$,
let$g(i)= \prod_{j=1}^{t}(f(2j+3)-1)$.
Let $g(0)=1.$$-\tilde{R}_{k,h_{k}}$ is
a
leveled tree of depth $h_{k}$ such that at depth $i(0\leq i\leq h_{k})$are
placed $g(i)$ nodes andeach node $v$ at depth$i<h_{k}$ is connected to
some
$f(2i+5)-1$ nodesat depth $i+$ l.$-R_{k,h_{k}}$ is
an
expansion of$\tilde{R}_{k,h_{k}}$ such that each node $v$ of $\tilde{R}_{k,h_{k}}$ at depth $i$ $(0\leq i\leq h_{k}-1)$ isexpandedto
a
copy $D(v)$ of$D_{2:+5}$ in $R_{k,h_{k}}$, where $v$ isidentified with the degree-2node of$D(v)$and thechild nodes of$v$ in $\tilde{R}_{k,h_{k}}$
are
identified with the leaves of$\mathrm{D}(\mathrm{v})$ in
an
arbitraryonetoonemanner.
By construction, $R_{k,h_{h}}$ is 1-extensible and $h_{k}$-away, and has
a
unique degree-2 node, namely, theunique degree-2 nodeof$D_{5}$
.
Lemma 6. Let$k\geq 4.$ Let$T$ be
a
tree having$f(k)-1$ leaves such that$|T_{k}|$ $\geq(_{2}^{f(k)-1})-2.$ Supposefurther
that$T$ isnot0-extensible. Let$T(w)$ bethetree obtainedby connectinganeen
leaf
to an arbitraryleaf
$w$of
T. Then, $|\mathrm{T}(\mathrm{w})|^{k}\leq(_{2}^{f(k)-1})-3.$Lemma
7.
Let$k\geq 5$ beodd. Let$T$ bea tree
such that$L$(7 ) $=$L(Rk,hk) and$|Tk$$\oplus R_{k,h_{k}}^{k}|\leq 2.$ Then, $T$ is 0-extensibleor
1-extensible. Moreover,47
Theorem 2. Forevery odd$k\geq 5,$ the graph$7^{\mathit{2}_{k}}(R_{k,\lfloor k/2\rfloor-1})$is $a$ $(3, k, \lfloor k/2\rfloor-1, 2)$
-core
graph.Theseconstructions, lemmas and theorems
can
be generalized to every fixed$k\geq 4$ and $\Delta\geq 3.$ Aphylogenic $k\mathrm{t}\mathrm{h}$root of the $f_{\Delta}(k)$-clique
can
be constructed in thesame
wayas
$D_{i}$ and
we
denote itas
$Ds,i$.
Wecan
constructa
phylogeny$R_{\Delta,k,h_{k}}$ of degree $\Delta$ recursivelyin thesame
wayas
$R_{k,h_{k}}$ but replacing$f$ and $D_{i}$ therein with $f_{\Delta}$ and $D_{\Delta,i}$, respectively; further, if$k$ is
even
then thefunction$g(i)$ thereinshould be replaced by $\prod_{j=1}^{i}(f_{\Delta}(2j+2)-1)$.
Lemma7
and Theorem 2can
be generalizedas
follows:
Lemma 8. Let$k\geq 4$ and
a
$\geq 3.$ Let$T$ bea
treeof
mairnurn degree$\Delta$ such that$L(T)=L(R_{\Delta,k,h_{k}})$and$|Tk\oplus R_{\Delta,k,h_{k}}^{k}|\leq 2.$ Then$T$ is $\theta$-extensible
or
1-extensible. Moreover,if
$T$ is 1-extensible then itis $h_{k}$-away.
Theorem 3. For every$k$
a
4,$\mathcal{P}_{k}(R_{\Delta,k,\lfloor}c\mathit{7}2\rfloor-1)$ is $a(\Delta,$$k$
,
$\lfloor k/2\rfloor-$ $1$,$2i$-core
graph.4
The
$\mathrm{N}\mathrm{P}$-hardness of
$\Delta$CPRk
This section proves that 3CPR& is $\mathrm{N}\mathrm{P}$-hard for each odd $k\geq 3.$ It is straightforward to generalize
thearguments ofthis section to prove that $3\mathrm{C}\mathrm{P}\mathrm{R}k$ is $\mathrm{N}\mathrm{P}$-hardforforevery $4\geq 3$ and
$k\geq 3.$
Throughoutthis section, alltreesandphylogenies
are
of maximumdegree3or
less. Proofsof most lemmas andcorollariesare
omitted due to lackof space.We begin with the $\mathrm{N}\mathrm{P}$-hardness proofof$3\mathrm{C}\mathrm{P}\mathrm{R}3$because the $\mathrm{N}\mathrm{P}$-hardness proofs of$3\mathrm{C}\mathrm{P}\mathrm{R}\mathrm{A}$; for largerodd$k$ arereductionsfromit.Wereduce the following version ofHAMILTONIAN PATH PROBLEM
(HP) to $3\mathrm{C}\mathrm{P}\mathrm{R}3$, whose $\mathrm{N}\mathrm{P}$-hardnessproofs
can
be foundin [3] and [6, Section 9.3].HAMILTONIAN PATH PROBLEM (HP): Given agraph $G=(V, E)$ such that
- all vertices
are
of degree3or
less,- twospecificvertices
are
of degree 1 and each ofthemis adjacent toa
vertex of degree 2, and- thereis
no
cycle of length less than5.
Find
a
Hamiltonian
pathof$G$, $i.e$.
finda
linearordering of the vertices of$G$ such that each pairof consecutive vertices
are
adjacent in $G$.
Let $G=(V, E)$ be
an
arbitrary instance of $\mathrm{H}\mathrm{P}$, hence the maximum degree of $G$ is 3 and $G$contains
no
cycle oflength less than 5. Let $T=(V, E(T))$ bean
approximate phylogeny of $G$.
Wedefine
a
fractionalvalue$cost_{\mathit{3}}(v)$ associated with each vertex$v\in V$as
follows:COSt3
$(v)= \frac{1}{2}|${
$u$ : $(u,v)\in E$and$d_{T}(u,$$v)>3$}
$|$$+|$
{(
$u$,$w$) : $u\neq w$,$(u,$$v)\in E$,$(v,$$w)\in E$, $(u,$ $w)$\not\in
$E$and$d_{T}(u,$$w)\leq 3$}
$|$.
Lemma 9. The
surn
of
$cost_{\mathit{3}}(v)$ overall vertices $v\in V$ is nomore
than $|7$$3\oplus E|$.
Lemma
10. Let
$v$ bea vertex
of
$G$ having threepairwise nonadjacent neighbors$u_{1},u_{2}$ and113. Then,COSt3
$(v)= \frac{1}{2}$or
cOst3$(v)\geq 1.$ Moreover,if
$cost_{\mathit{3}}(v)$ $= \frac{1}{2}$, then$d_{T}(u_{i}, v)>3$for
one
$u_{\dot{*}}\in\{u_{1},u_{2},u_{3}\}$and$d_{T}(u_{j}, v)=3$
for
the othertwo
$u_{j}\in\{u_{1},u_{2},u_{3}\}-$$(u_{\mathrm{i}}\}$.48
Theorem
5. Forevery
odd$k\geq 3,$3CPR&
is NP-complete.It is straightforward togeneralize Theorem 5 to every $4\geq 3$ and $k\geq 4,$ obtaining thefollowing
theorem.
Theorem 6. For
every
$\Delta\geq 3$ and every$k\geq 3,$ 3CPR& is NP-complete.5
Summary
and
an
Open Question
Wehave provedthat
ACPRk
is$\mathrm{N}\mathrm{P}$-completefor every $\mathrm{i}$ $\geq 3$and$k\geq 3.$
A
more
fundamental
problemis the TREE $k\mathrm{T}\mathrm{H}$ Root PROBLEM $(\mathrm{T}\mathrm{R}/\mathrm{c})$
,
where the nodes (not only the leaves) of$T$ correspondto the vertices of $G$
.
Kearney and Corneil proved that CTRA; is $\mathrm{N}\mathrm{P}$-complete when $k\geq 3[4]$.
Weconjecturethat $4\mathrm{C}\mathrm{T}\mathrm{R}k$is $\mathrm{N}\mathrm{P}$-completefor
every
fixed $4\geq 3$ and $k\geq 2.$References
1. N. BANSAL, A. BLUIVI, AND S. CHAWLA, Correlation Clustering, in Proceedings of the $43\mathrm{r}\mathrm{d}$ Symposium
on Foundations ofComputer Science (FOCS 2002), pages 238-250, 2002.
2. Z.-Z. CHEN, T. JIANG, AND G.-H. LIN, Computing phylogenetic roots with bounded degrees and errors, SIAM Journalon Computing,toappear. A preliminary versionappearedin Proceedings ofWADS2001.
3. M. R. GAREY, D. S. JOHNSON AND R. E. TARJAN, The Planar Hamiltonian Circuit Problem is
$NP$-Complete,SIAMJournalon Computing, $5(4):704-714$, 1976.
4. P. E. KEARNEYAND D. G. CORNEIL, $\mathrm{R}ee$powers, Journal of Algorithms, 29:111-131, 1998.
5. G.-H. Lin, P. E. KEARNEY, AND T. JIANG, Phylogenetic $k$-root and Steiner$k$-rvot,in The 11thAnnual
International SymposiumonAlgorithms and Computation (ISAAC 2000), volume 1969 of Lecture Notes inComputer Science, pages539-551, Springer, 2000.
6. C. H. PAPADJMITRIOU, Computatational Compleity,Addison-Wesley, 1994.
7. D. L. SWOFFORD, G. J. OLSEN, P. J. WADDELL, AND D. M. HILLIS, Phylogenetic inference,In D. M. Hillis, C. Moritz, and B. K. Mable, editors, MolecularSystematics (2nd Edition), pages407-514, Sinauer