• 検索結果がありません。

ラベル選択を有する最小全域木問題 (アルゴリズムと計算機科学の数理的基盤とその応用)

N/A
N/A
Protected

Academic year: 2021

シェア "ラベル選択を有する最小全域木問題 (アルゴリズムと計算機科学の数理的基盤とその応用)"

Copied!
6
0
0

読み込み中.... (全文を見る)

全文

(1)

ラベル選択を有する最小全域木問題

藤芳明生 鈴木昌和

茨城大学工学部情報工学科 九州大学大学院数理学研究院

[email protected] [email protected]

Abstract

In this paper, we study the minimum spanning tree problem for vertex-labeled

graphs, where the weights of edges may vary depending on the selection of labels

of vertices at both ends. The problem is especially important as the application to mathematical OCR. It is shown that the problem is NP-hard. However, there exists alinear-time algorithm for graphs of small tree-width. In this paper, a linear-time algorithm for series-parallelgraphs is presented. The relation tothe generalized minimumspanning tree problem is discussed.

1

Introduction

The minimum spanning tree problem is one ofthe most famous combinatorial problems in computer science. Fast algorithms to solve the problem

are

well-known. In this paper, we study a generalization of the problem for vertex-labeled graphs, where the weight of edges may

vary

depending

on

the selection of labels of vertices at both ends.

An instance of the problem is illustrated in Fig. 1 (a): a finite set of vertex labels is $\Sigma=\{a, b,c, d\}$; vertices

are

indicated by dotted rectangles; each vertex has at least

one

candidates of labels represented by circled symbols; each weighted edge connects labels that belong to different vertices; and

some

pairs oflabels may not be connected. For this instance of the problem, the minimum spanning tree is illustrated in Fig. 1 (b): exactly

one

label is selected from candidates for each vertex; and the graph induced by selected labels and selectededges becomes

a

spanning treewhere the

sum

of weights of edges is the minimum. We also introduce the notion of a base graph. The corresponding base graph is illustrated in Fig. 1 (c). We say that two vertices

are

connected if

some

candidates of labels of two vertices

are

connected.

For the development of mathematical OCR [1], the problem is especially important. As shown in Fig. 2 (a) and (b), a mathematical OCR system constructs a graph that expresses possible adjacency connections of bounding boxes from

a

scanned image. At this moment, several character recognition candidatesmay remain for each bounding box. Eachedge is weighted by the positional relation and

co-occurrence

ofcharacter recognition candidates. In order to outputabetter recognition result

as

shown in Fig. 2 (c), thesystem wants to find the minimum spanning tree from the graph not only by selecting character recognition candidates for boundingboxes but also by determining adjacency connections of bounding boxes.

(2)

Figure 1: (a)

an

instance ofthe minimum spanning tree problem with label selection, (b) the minimum spanning tree, and (c) the base graph.

2

NP-Hardness

To show the NP-hardness, we reduce the Boolean satisfiability problem (SAT) to this problem.

Theorem 1 The minimum spanning tree problem with label selection is NP-hard.

Sketch

of

proof.

Given a CNF

formula,

we

can

construct

a

graph such that it has

a

spanning tree if and only if the formula has

a

truth assignment. For example, the graph correspondingtothe CNF formula $(x_{1}\vee x_{2}^{-}\vee x_{5})$A$(x_{2}\vee x_{3}\vee x_{4}^{-})$A$(x_{3}\vee x_{4}^{-}\vee\overline{x}_{5})$ isillustrated

in Fig. 3. All edges are weighted by the unit value. $\square$

3

Linear-Time

Algorithm

Inthis section, wepresent a linear-time algorithm for graphs whose base graph is a series-parallel graph [4]. The idea behind the algorithm is described as follows:

.

When two graphs are connected in series, the minimum spanning tree is obtained by simply connecting the two minimum spanning trees of original graphs.

(3)

(a)

$\mu(a,$ $b)= \int_{l}^{b}\overline{\Theta}\frac{\ }{(\epsilon)}$

(b)

$b$ $d-c$

(c) $\mu-(-a-,-b-)-=-(\backslash -+$

a

$\Theta-(-c-)$

Figure 2: (a)

a

scanned image, (b) thegraph expressing possible adjacencyconnections of bounding boxes, (c) the correct recognition result.

Figure 3: the DAG corresponding to the formula.

$\bullet$ When two graphs are connected in parallel, the minimumspanning tree is obtained

by connecting the minimum spanning tree of

one

original graph and the minimum two disconnected spaming subtrees of the other graph.

Let us write $G(s, t)$ to

mean

that the graph $G$ has two distinguished vertices, namely,

the

source

$s$ and the sink $t$

.

Definition 1 A graph is

a

series-parallel graph $(SPG)$ if (1) it consists of a single edge

connecting $s$and$t$, or (2) it

can

be produced by asequence of thefollowing two operations:

Series Composition: Given two series-parallel graphs $G_{1}(s_{1}, t_{1})$ and $G_{2}(s_{2}, t_{2})$, form a

(4)

Parallel Composition: Given two series-parallel graphs $G_{1}(s_{1}, t_{1})$ and $G_{2}(s_{2}, t_{2})$, form

a new

graph $G(s, t)$ by identifying $s=s_{1}=s_{2}$, and $t=t_{1}=t_{2}$

.

Due to the recursive definit\’ion of SPGs,

we

can

obtain

an

vertex-labeled ordered tree in accordance with a decomposition of

an

SPG.

Deflnition 2 A series-paralleltree $(SPT)T$for

an

SPG $G(s, t)=(V, E)$ is

an

edge-labeled rooted tree defined

as

follows: The set of vertex labels is $\{S, P\}\cup E$

.

$\bullet$ If $G(s,t)$ consists of

a

single edge, then

$T=(r,\emptyset)$ where $r$ is

a

new

vertex (the root

of $T)$, and the label of$r$ is $(s, t)$

.

$\bullet$ If $G(s,t)$ is obtained by a series composition of

$G_{1}(s_{1}, t_{1})$ and $G_{2}(s_{2}, t_{2})$, and $T_{1}=$

$(V_{1},E_{1})$ and$T_{2}=(V_{2}, E_{2})$

are SPTs

of them, then$T=(\{r\}\cup V_{1}\cup V_{2},$ $\{(r,r_{1}), (r,r_{2})\}\cup$ $E_{1}\cup E_{2})$ where $r$ is

a new

vertex (the root of$T$), and $r_{1}$ and $r_{2}$

are

the roots of$T_{1}$

and $T_{2}$, the label of$r$ is $S$, the first child of$r$ is $r_{1}$, and the second child of$r$ is $r_{2}$

.

$\bullet$ If$G(s_{1}t)$ is

obtained

by

a

parallel composition of

$G_{1}(s_{1}, t_{1})$ and $G_{2}(s_{2},t_{2})$, and $T_{1}=$ $(V_{1}, E_{1})$ and$T_{2}=(V_{2}, E_{2})$

are

SPTsofthem, then$T=(\{r\}\cup V_{1}\cup V_{2},$ $\{(r,r_{1}), (r, r_{2})\}\cup$ $E_{1}\cup E_{2})$ where $r$ is

a

new vertex (the root of$T$), and $r_{1}$ and $r_{2}$

are

the roots of$T_{1}$

and $T_{2}$, the label of$r$ is $P$, the order of children of$r$ is ignored.

Notethat thechildren of

a

vertex labeled $S$

are

ordered, while thechildren of

a

vertex

labeled $P$

are

unordered. All edges of $D$ appear exactly

once

as

a

label of leaves. An

SPG may have many corresponding SPTs since the above decomposition is not unique in general. It is known that an SPT is obtained from any SPG in linear time depending on the number of edges of

an

SPG [4].

Let $\Sigma=\{\sigma_{1}, \sigma_{2}, \ldots , \sigma_{m}\}$ be the set of vertex lables. Note that the number of vertex

labels $m$ is

a

constant number.

The algorithm consists of the following functions Main and Calculate:

$\overline{Ehnction}$

Main

$\overline{Input:}$

an SPG $G(s, t)=(V, E)$ anda corresponding SPT $T=(V_{T}, E_{T})$;

Output: the minimumweight of spanning trees of $G$;

1: Let $u$ be the root vertex of$T$

:

2: $(A, B)$ $:=$ Calculate$(u)$;

3: $\min=\infty$

4: for $i:=$ lto $m$ do

$\check{o}$: for $j$ $:=1$ to $m$ do 6: if $A[i,$$j]<$ 7漉$n$ then 7: $\min:=A[i,j]$; 8: end if 9: end for 10: end for 11: return $\min$; $\overline{\frac{EunctionCa1cu1ate}{Input:avertexu\in V_{\mathcal{T}_{1}}}}$

(5)

Output: arrays of real numbers $A[1\ldots m, 1\ldots m]$ and $B[1\ldots m, 1\ldots m]$;

1: if the label of$u$ is $(v_{1}.v_{2})\in E$ then

2: for $i:=$ lto $m$ do

3: for $j:=$ lto $m$ do

4: if there extsts an weighted edge $e$ between the lable candidate $\sigma i$ of$v_{1}$ and the

lable candidate $\sigma j$ of $v_{2}$ then

5: $A[i_{1}j]$ $:=$ the weight of $e$.

6: else 7: $A[i,j]:=\infty$; $s$: end if 9: $B[i,j]:=0_{:}$

.

10: end for 11: end 最)r

12: else if the label of$u$ is $S$ then

13: Let $u_{1}$ and $u2$ be the first and second child of$u$, respectively;

14: $(A_{1}, B_{1})$ :$=$ Calculate(ui);

5: $(A_{2}, B_{2})$ $:=$ Calculate$(u_{2})$;

16: for $i$ $:=1$ to $m$ do

17: for $j$ $:=1$ to $m$ do lS: $m$伽ノ 1 $=\infty$;

19: $minB=\infty$;

20: for $k:=1$ to $m$ do

21: if $A_{1}[i,$ $k]$ 十 $A_{2}[$ん,$j]< \min$ノ 1 then

22: $minA:=A_{1}[i$,た$]$ $+A_{2}[$た,$j]$;

23: end if

24: if $A_{1}[i,$ $k]$ 十 $B_{2}[k,$

$j]<minB$

then

25: $minB:=A_{1}|$ $+B_{2}[k,j]$;

26: end if

27 if $B_{1}[i,$$k]+A_{2}[k,$

$j]<m$

伽$B$ then

28: $minB:=B_{1}[i,$$k]+A_{2}[$た,$j]$; 29: end if 30: end for 31: $A[i,j]:=minA$; 32: $B[i,j]:=minB_{:}$

.

33: end for 34: end for

35: else if the lable of $u$ is $P$ then

36: Let $u_{1}$ and $u_{2}$ be the children of$u$;

37: $(A_{1}, B_{1})$ :$=$ Calculate(ui);

38: $($ノ12,$B_{2})$ $:=Calculate(u_{2})$

:

39: 最)r $i:=1$ to $m$ do

40: for ブ:$=1$ to $m$ do

41: if $A_{1}[i, j]+B_{2}[i,j]<B_{1}[i,j]+A_{2}[i,j]$ then

42: $A[i,j]:=A_{1}[i_{:}j]+B_{2}[i,j]$;

43’. else

(6)

45: end if 46: $B[i,j]:=B_{1}[i,j]+B_{2}[i.j]$; 47: end for $4S$: end for 49: end if

$\underline{60:return(A,B);}$

4

Relation to the

Generalized Minimum

Spanning

Tree

Prob-lem

The minimum spanning tree problem with label selection is closely related to the

gener-alized minimum spanning tree problem (GMSTP) [2, 3]. The following results

are

known for

GMSTP

[3]:

$\bullet$ GMSTP is NP-haid, and the problem is still NP-hard

even on

trees.

$\bullet$ If the number of clusters is fixed, then GMSTP

can

be solved in polynomial-time with respect to the number ofvertices.

The result of this paper

can

be translated into the following new results for

GMSTP:

.

GMSTP is still NP-hard

even

ifthe size of each cluster is at most 2.

.

Ifthe tree-widthofthe base graph issmall, tllenGMSTPcan be solved in polynomial-time with respect to the number ofvertices.

References

[1] Yuko Eto and Masakazu Suzuki. Mathematical formula recognition using virtual link network. In Proceedings

of

the 6th Intemational

Conference

on Document Analysis

and Recognition (ICDA$R$ 2001), pp. $430-437_{:}$ 2001.

[2] Young-Soo Myung, Chang-Ho Lee, and Dong-Wan Tcha. Onthe generalized minimum spanning tree problem. Networks, Vol. 26, No. 4, pp. 231-241, 1995.

[3] Petrica Claudiu Pop. The generalized minimum spanning tree problem. $PhD$ thesis,

Twente University Press, http://doc.utwente.nl/38643/, December 2002.

[4] Kazuhiko Takamizawa, Takao Nishizeki, and Nobuji Saito. Linear-time computability of combinatorial problems on series-parallel graphs. Joumal

of

the $ACM$, Vol.

29:

Figure 1: (a) an instance of the minimum spanning tree problem with label selection, (b) the minimum spanning tree, and (c) the base graph.
Figure 2: (a) a scanned image, (b) the graph expressing possible adjacency connections of bounding boxes, (c) the correct recognition result.

参照

関連したドキュメント

Transirico, “Second order elliptic equations in weighted Sobolev spaces on unbounded domains,” Rendiconti della Accademia Nazionale delle Scienze detta dei XL.. Memorie di

Then it follows immediately from a suitable version of “Hensel’s Lemma” [cf., e.g., the argument of [4], Lemma 2.1] that S may be obtained, as the notation suggests, as the m A

Applications of msets in Logic Programming languages is found to over- come “computational inefficiency” inherent in otherwise situation, especially in solving a sweep of

It is known that if the Dirichlet problem for the Laplace equation is considered in a 2D domain bounded by sufficiently smooth closed curves, and if the function specified in the

Indeed, when using the method of integral representations, the two prob- lems; exterior problem (which has a unique solution) and the interior one (which has no unique solution for

This paper gives a decomposition of the characteristic polynomial of the adjacency matrix of the tree T (d, k, r) , obtained by attaching copies of B(d, k) to the vertices of

Shi, “The essential norm of a composition operator on the Bloch space in polydiscs,” Chinese Journal of Contemporary Mathematics, vol. Chen, “Weighted composition operators from Fp,

We present evidence on the global existence of solutions of De Gregorio’s equation, based on numerical computation and a mathematical criterion analogous to the