• 検索結果がありません。

1Introduction KeqinLi AnalysisofanApproximationAlgorithmforSchedulingIndependentParallelTasks

N/A
N/A
Protected

Academic year: 2022

シェア "1Introduction KeqinLi AnalysisofanApproximationAlgorithmforSchedulingIndependentParallelTasks"

Copied!
12
0
0

読み込み中.... (全文を見る)

全文

(1)

Analysis of an Approximation Algorithm for Scheduling Independent Parallel Tasks

Keqin Li

Department of Mathematics and Computer Science, State University of New York, New Paltz, NY 12561-2499, USA received April 14, 1998, revised May 28, 1999, accepted May 30, 1999.

In this paper, we consider the problem of scheduling independent parallel tasks in parallel systems with identical processors. The problem is NP-hard, since it includes the bin packing problem as a special case when all tasks have unit execution time. We propose and analyze a simple approximation algorithm called , where is a positive integer. Algorithm has a moderate asymptotic worst-case performance ratio in the range

for all ; but the algorithm has a small asymptotic worst-case performance ratio in the range !

, when task sizes do not exceed of the total available processors, where#"$ is an integer. Furthermore, we show that if the task sizes are independent, identically distributed (i.i.d.) uniform random variables, and task execution times are i.i.d.

random variables with finite mean and variance, then the average-case performance ratio of algorithm% is no larger than 1.2898680..., and for an exponential distribution of task sizes, it does not exceed 1.2898305.... As demonstrated by our analytical as well as numerical results, the average-case performance ratio improves significantly when tasks request for smaller numbers of processors.

Keywords: approximation algorithm, average-case performance ratio, parallel task scheduling, probabilistic analy- sis, worst-case performance ratio

1 Introduction

In this paper, we consider the problem of scheduling independent parallel tasks in parallel systems with identical processors. Assume that we are given a list of & tasks ')(+*-,/.102,4310657585709,4:<; . Each task,4=

is specified by its execution time >6=, and its size ?=, i.e., ,4= requires ?= processors to execute. There are @ identical processors, and any ?= processors can be allocated to ,4=. Once,4= starts to execute, it runs without interruption until it completes. Tasks in ' are mutually independent, that is, there is no precedence constraint nor data communication among the& tasks. The problem addressed here is to find a nonpreemptive schedule of ' such that its makespan (i.e., the total execution time of the & tasks) is minimized.

The problem is NP-hard, since it includes the bin packing problem as a special case when all tasks have unit execution time. Therefore, one practical way to solve this problem is to design and analyze approximation algorithms that produce near optimal solutions. LetAB*C'D; be the makespan of the schedule

E

The author can be reached at email: [email protected], phone: (914) 257-3534, fax: (914) 257-3571.

1365–8050 c

F

1999 Maison de l’Informatique et des Math´ematiques Discr`etes (MIMD), Paris, France

(2)

generated by an algorithm A for ' , and GIHKJL*'; be the makespan of an optimal schedule of ' . The

quantity MON

P

(RQ7S8T

UWV

NYX Z2[<\

]_^ `_a8b<cd UIe AB*';

GIHKJL*';1fhg

is called the asymptotic worst-case performance ratio of algorithmA . If there exist two constantsi andj such that for all' ,AB*';%klinmoGIHDJL*'D;Wpqj/>r , where>rs(tTvu1wx.zy{=y{:_*>6=|; is the longest execution time of the& tasks, then

M N

P

k}i , andi is called an asymptotic worst-case performance bound of algorithm

A . Moreover, if for any small~hY€ and all large‚l€ , there exists a list' , such thatGIHKJL*C'D;ƒ„ , and

AB*';%ƒ…*i†‡~z;ˆGIHKJL*';, then the boundi is called tight, i.e.,

M N

P

(‰i . When task sizes and execution times are random variables, bothAB*'; andGIHDJL*'D; become random variables, and

Š

MON

P

(‹Q7S8T

: V N XŒ

*AŽ*';9;

Œ *|GIHDJO*';9;g

is called the asymptotic average-case performance ratio of algorithm A , where Œ *m; stands for the ex- pectation of a random variable. Of course,

Š

M N

P depends on the probability distributions of task sizes and execution times. If there exists a constant‘ such that for all' ,Œ *CAB*';9;#k’‘“m

Œ

*|GIHDJI*C'D;2; as&”–• , then

Š

M N

P

k—‘ , and‘ is called an asymptotic average-case performance bound of algorithmA .

We notice that our scheduling problem defined above looks similar to but is quite different from the two dimensional rectangle packing problem [1, 2, 7, 9], where each task, = is treated as a rectangle with width

? = and height> =. The rectangle packing model implies that processors should be allocated in contiguous groups. That is, the@ processors have indices 1, 2, 3, ...,@ , and task, = must be allocate? = processors with indices˜ ,˜spY™ , ...,˜#pš? = †l™ for some˜ . Such a scheduling problem arises in parallel systems like linear arrays. The rectangle packing problem has been extensively studied, where a complicated algorithm with asymptotic worst-case performance ratio as low as 1.25 has been found [1]. However, contiguous processor allocation is not required in our model, where any?= processors can be allocated to,4=. Our problem could be regarded as a resource constraint scheduling problem [3, 6], where the resource is a set of processors. It has applications in parallel computing systems such as symmetric shared memory multiprocessors and distributed computing systems such as bus-connected networks of workstations. In these systems, a processor allocation mechanism is independent of the topology of an interconnection network. Another related problem is scheduling malleable tasks which has also been investigated in the literature [8, 10]. In that problem, each task requests for several possible numbers of processors, i.e., a task has adjustable size, and for each size, an execution time is also specified. The problem has several variations depending on different ways in which the execution time of a task changes with the number of processors allocated to it, and the performance measures to be optimized (e.g., makespan, average flow time).

The problem we consider here is to schedule nonmalleable tasks with noncontiguous processor alloca- tion. Even though the complicated algorithm in [1] for rectangle packing can also be applied to solve our problem, we propose and analyze a simple approximation algorithm called›#œ , where is a positive inte- ger. Algorithm› œ has a moderate asymptotic worst-case performance ratio (™ž5ŸžŸŸ<57575 k

M N

¡W¢

k£™ž5¤¥¥<58575

for all¦ƒ£Ÿ ); but the algorithm has a small asymptotic worst-case performance ratio ™hp£™§<*-¨#p£™©;sk M N¡ ¢

k)™spª™§1¨ , when task sizes do not exceed@„§1¨ , where¨—«™ is an integer. We notice that the

capability to deal with small tasks is important in real applications since many task sizes are relatively small as compared with the system size so that a large scale parallel system can be shared by many users

(3)

simultaneously. However, it is not clear whether the algorithm in [1] has such capability. Furthermore, the simplicity of our algorithm allows us to conduct average-case performance analysis. In particular, we show that if the numbers of processors requested by the tasks are independent, identically distributed (i.i.d.) random variables uniformly distributed in the range¬8™575@£­, and task execution times are i.i.d. ran- dom variables with finite mean and variance, then

Š

M N

¡4¢

k®™ž5¥¯°ž¯Ÿ¯ž€<57585, i.e.,Œ *›sœŽ*C'D;2;2§

Œ

*|GIHKJO*C'D;2;

is asymptotically bounded from above by 1.2898680... as&£”±• . For an exponential distribution of task sizes, we have

Š

M N

¡4¢

k«™5¥¯ž°¯²ž€ž³<58575. As demonstrated by our analytical as well as numerical re-

sults, the average-case performance ratio improves significantly when tasks request for smaller numbers of processors. We notice that there is lack of such results on probabilistic algorithm analysis, especially in multi-dimensional cases [4]. The average-case performance of the algorithm in [1] is unknown.

The rest of the paper is organized as follows. We present algorithm› œ in Section 2. The worst-case performance of the algorithm is analyzed in Section 3, and its average-case performance is studied in Section 4. Finally we give a summary in Section 5.

2 The Approximation Algorithm H



Our algorithm Hœ for scheduling independent parallel tasks divides' into  sublists' . ,' 3 , ..., 'Kœ

according to task sizes (i.e., numbers of processors requested by tasks), where´ƒ‰™ is a positive integer.

For™µk}¶·k‰¸†t™ , define'¹Ž(º©, =D» '…¼ @„§ *|¶Bp£™©;s½‰? = k}@„§¶{¾ , i.e.,'D¹ contains all tasks in'

that have sizes in the interval¿o¹(*C@„§ *|¶Àp£™©;0ˆ@„§¶­. Define'Dœª(ºo, =D» '‚¼€Á½„? = k‚@„§¾ , i.e.,

'Kœ contains all tasks whose sizes are in the range¿œ‚(…*C€<0ˆ@„§1Â­.

Algorithm Hœ produces schedules of the ' ¹ ’s sequentially and separately. To process tasks in ' ¹ , where™·kölkÝ«†£™ , the@ processors are partitioned into groups, ÄB.0ˆÄO30o58575702Ä ¹ , each contains

@„§¶ processors. Each group ÄsÅ of processors is treated as a unit, and is assigned to a task in ' ¹ . Such an allocation can be implemented using, for example, the list scheduling algorithm [4]. Suppose

' ¹ ()*,

¹

.

09,

¹

3

0657585709,

¹

;, where& ¹ is the number of tasks in' ¹ . Initially, groupÄsÅ is assigned to, ¹Å , where™ŽkY˜nk}¶ , and, .¹ 09, 3¹ 0657585709,

¹

¹ are removed from' ¹ . Upon the completion of a task, ¹Å , the first unscheduled task in , i.e.,, ¹zǹ . , is removed from and scheduled to execute onÄ Å . This process repeats until all tasks in'D¹ are finished. Then algorithm Hœ begins the scheduling of next sublist'D¹Ç . .

For 'Dœ , there is no need to divide the @ processors. The list scheduling algorithm is again em- ployed here. Let'Kœ«(È*-, .œ 09, 3œ 0657585709,

œ

: ¢ ;. Initially, as many tasks in 'Kœ are scheduled as possible, i.e., tasks , .œ 02, 3œ 0o58575709,

œ

É start their execution, whereÊ is defined in such a way that the total size of

, œ

.

02,

œ

3

0o58575802,

œ

É is no larger than @ , but the total size of , œ.

09,

œ

3

0657585709,

œ

ÉÇ . exceeds @ . When a task finishes, the next task in' œ begins its execution, provided that there are enough idle processors. Notice that it is the scheduling of tasks in' œ that takes advantage of noncontiguous processor allocation.

3 Combinatorial Analysis

In this section, we analyze the worst-case performance of algorithm Hœ . Let›#œ*'; be the makespan of the schedule produced by algorithm Hœ for' , andGIHKJL*'; be the makespan of an optimal schedule of

' . First, we show the following result.

Theorem 1. For any list' of& tasks and)ƒl² , we have

›#œµ*C'D;k‰™

™o²

GIHDJL*'D;_p‰*-Ë†l™;Ì>

r 0

(4)

where>r is the longest execution time of the& tasks. Furthermore, forÍƒÎŸ and any large«€ , there exists' , such that GIHKJL*';σ‚ , and›sœ*C'D;ˆ§GIHKJI*';(Ù

. Therefore, for allÈƒ£Ÿ , we have

™5ŸŸŸx58575k M N

¡4¢

k}™5¤¥ž¥ 57585.

Proof. Assume that all tasks in' are executed in the time interval ¬€<0ˆ› œ *';Ñ­, and that tasks in' ¹ are scheduled in¬Ò ¹ 09Ò¹zÇ .Ó­, where™LkY¶“k$ , i.e., the first task in' ¹ starts at timeÒ¹ , and the last completion time of the tasks in isÒ9¹zÇ . . Letʏ¹ be the starting time of the last task, :žÆ¹ in . DefineÔ . („Ò 3 †ÕÒ. , and for all¥µk‰¶nkY , defineÔ1¹À(}Ê©¹I†qÒ9¹ , and¨¹Ž(‰Ò9¹zÇ . †šÊ¹ be the remaining execution time of'D¹ once, ¹ starts. Clearly,¨©¹Žk’>r, and

›sœŽ*C'D;Ö( Ô . p$Ô 3 p—Ô

Ð

p„m6mom©p$Ô©œ’pš¨ 3 pš¨

Ð

p„m6m6m1pš¨©œ

k Ô . p$Ô 3 p—Ô

Ð

p„m6mom©p$Ô©œ’p„*®†’™©;Ì>

r 5 (1)

We give several lower bounds forGIHKJL*';. First, tasks in' . have large sizes such that no two of them can execute in parallel. Therefore, we have

GIHDJI*C'D;ƒ’Ô . 5 (2)

Second, there can be at most two tasks from 'K3 that can execute at the same time, and there can be at most one task from' 3 that can execute simultaneously with a task in' . . Thus,

GIHDJI*C'D;ƒ’Ô1.Kp

¥Ô©3%†qÔ.

¥ (

Ô.

¥

p$Ô©35 (3)

Third, define the area of a task ,4= to be ? =C>6=, and the total area of ' ¹ to be A ¹ (Ø×tÙÚÛ b Æ

? =>6= for

™Ükª¶‡k} , andA(ÝA . pYA 3 p}mom6mpYA#œ . For convenience, we assume that processor requirements

are normalized such that€“½‰? = kޙ for all™µk£ßhk‰& . Clearly,GIHDJL*'D;sƒ}A . We give a lower bound forA as follows. For™Žk}¶Õk„¸†Y™ , we notice that all the groupsÄ . ,Ä 3 , ...,ÄÀ¹ are busy until, ¹ starts its execution. That is, during time interval¬Ò9¹0Óʏ¹©­, at least¶x§ *|¶%p—™©; of the processors are busy, i.e., we haveAs¹£*|¶x§ *|¶#pl™©;2;Ô1¹ . For'Dœ , we note that during time interval¬ÒœB0ÓÊ©œÏ­, the percentage of busy processors is at least*Ã†—™©;ˆ§ , i.e.,A#œÞ}*2*-Ã†—™;2§Õ;Ôœ ; otherwise, some tasks should start earlier.

Hence,

GIHDJI*C'D;ƒ

™

¥ Ô . p ¥² Ô 3 p ²à Ô Ð p„m6mom©p

†l™



ԏœ#á . p

®†l™



Ô©œB5 (4)

Now let us consider the ratio

â ( Ô . p$Ô 3 p$Ô

Ð

p„m6mom©p$Ô©œsá . p—ԏœ

Tvuwã|Ô1.10

.

3

Ô.äp$ԏ30

.

3

Ô1.äp

Ô©3ptmom6m©p

œsá .

œ Ô

œsá .äp

œsá .

œ Ô

œIå

5 (5)

LetæÁ(‰Ô

Ð

p„m6mom©p$Ô œ . Apparently,

â k

Ô.äp$ԏ3Dpšæ

Tvu1w ãÔ . 0 .3 Ô . p—Ô 3 0 .3 Ô . p Ô 3 p Ð

ç æ å 5 (6)

We give the proof of the following result in the appendix.

Ô.äp$Ô©3Kp—æ

TµuwãÑÔ . 0 . Ô . p—Ô 3 0 . Ô . p Ô 3 p Ðç æ å k‰™

™o²

™o¯

0 whereÔ.©0ˆÔ309淃l€<5 (7)

(5)

Combining Equations (1)–(7), we get›sœŽ*C'D;k}™

. Ð

.9è

GIHDJI*C'D;/p‰*-Ë†l™;Ì>r . To show the lower bound for

M N

¡ ¢ , let us consider a list' of tasks which contains& tasks of size .3 pÕ~,

& tasks of size pY~, and& tasks of size .é †l¥~ , where& is a multiple of 6, and~ހ is a very small

quantity. All tasks have unit execution time. Clearly,GIHDJI*C'D;L(& . Algorithm Hœ divides' into' . ,

' 3 , and' é , and›#œ*';#(Þ&Ápl&ê§¥Ipl&ê§1ŸÂ(®™

& . Thus, we can choose& a sufficiently large number

while keeping›sœŽ*C'D;2§žGIHKJL*';ë(…™

. This completes the proof of the theorem.

For tasks with small sizes, algorithm›#œ exhibits much better performance due to increasing processor utilization, as claimed by the following theorem.

Theorem 2. For any list' of& tasks, such that? = k’@„§1¨ for allß, where¨‰™ is an integer, we have

›#œ*';k

e

™Dp

™¨ f

GIHDJL*'D;êp„*†¨#p„™©;>

r 0

where>r is the longest execution time of the& tasks. Furthermore, forÈƒY¨#p„™ and any large…„€ , there exists ' , such that GIHDJI*C'D;ƒ¸ , and ›#œ*';2§žGIHKJL*';(ì™spޙ§<*-¨Op…™;. Therefore, we have

™Dpt™1§ *-¨Ïp„™©;Dk M N

¡4¢

k}™Dpt™1§¨ .

Proof. The proof is similar to that of Theorem 1. Since¨Áƒ‚¥ , and sublists' . ,' 3 , ...,'Kí6á . are empty, Equation (1) becomes

› œ *';k’Ô í p$Ô íˆÇ .äp„m6mom©p—Ô œ p‰*-Ë†¨Ïp„™©;Ì>

r 5

By using the area lower bound forGIHDJI*C'D; , we have

GIHKJL*';Rƒ

e ¨

¨hp„™xf ԏíKp

e

¨Ïpt™

¨Ïp$¥_f ԏí2Ç

.

p„m6mom©p

e

†’™

 f

ԏœsá

. p e

†’™

 f

ԏœ

ƒ ¨

¨Ïp„™

í p$Ô í2Ç .äp„m6mom©p$Ô œ ;z5

The above two inequalities give the asymptotic worst-case performance bound ™#p‚™1§¨ in the theorem.

To show the lower bound™Dpt™1§ *¨hp„™©; for

M N¡ ¢ , let us consider a list' which contains&4¨ tasks of size

™§ *¨#p£™©;êpl~ , and& tasks of size™1§ *¨Ïp£™;ä†q¨1~, where& is a multiple of¨sp‰™ , and~ is a sufficiently small value. All tasks have unit execution time. Clearly,GIHKJL*C'D;À(Ã& . Algorithm Hœ divides' into

'Kí and'Dí2Ç . , and›sœ*C'D;(¸&Õp„&ê§ *¨Opª™©;. Thus, we can choose & sufficiently large while keeping

›#œ*'D;ˆ§GIHDJI*C'D;ë(ª™Dpt™1§ *¨%p„™©;.

When¨Áƒ…³ , algorithm›#œ has better asymptotic worst-case performance ratio than the algorithm in

[1].

4 Probabilistic Analysis

Now let us consider the average-case performance of algorithm Hœ . For convenience, we assume the task sizes are normalized such that€µ½l? =ëk‰™ , and that the?=’s are independent, identically distributed (i.i.d.) random variables with a common probability density functionîï*æ{; in the range*C€<0o™­. Our assumption on the task execution times is quite general, i.e., the>o=’s are i.i.d. random variables with meanð and variance

(6)

ñ 3

, whereð andñ are any finite numbers independent of& . The probability distributions of task sizes and execution times are independent of each other.

Theorem 3. We have the following asymptotic average-case performance bound for algorithm›sœ :

Š

MON

¡4¢

(òQ8S7T

: V N

XLŒ *›sœµ*'D;2;

Œ *ÑGIHKJO*C'D;9;1g

k

œ#á .

ó¹ d . ™

¶ÜôÎõ Æõ

Æ2ö

õ

îï*-æ{;9÷ævp



®†l™sôøõ

¢

ù

æ{îï*-æW;÷žæ

Tvuw

e ô .

ù

æ{îï*æ{;÷žæ/0

ô .

õú

îï*æ{;÷žæ

f 5

(Note that the bound only depends onîï*-æW; .) Proof. It is clear that the mean task size is

Š

?(

ô .

ù

æ{îï*æ{;÷žæ/5

Since a task size falls into¿ ¹ with probabilityû©üÆ

îï*-æ{;9÷æ , where¿ ¹ (Î*™§<*C¶Àp‰™©;06™1§¶ ­ for all™k‰¶Õk

†l™ , and¿œ‚(…*C€<06™1§“­, the expected number of tasks in is

Œ *-&/¹;K(

Xô ü Æ

îï*-æW;÷žæ

g

&ä0

for all™OkY¶Âk’ . Also, the expected size of tasks in'D¹ is

Š

?¹O(

ô ü Æ

æWîï*-æW;÷žæ

ô ü Æ îï*æ{;÷žæ

5

Since the area of a task, = has expectation? ðŠ , and tasks in' . have to be executed sequentially, we have

Œ *|GIHDJI*C'D;2;ƒlTvu1wn*-&

Š

? ðä0 Œ *& . ;ð/;K(týv&4ðä0 (8)

where

ýÎ(„Tvu1wn*

Š

?{0 Œ *& . ;2§1&/;ë(tTvuw

e ô .

ù

æ{îï*æ{;9÷æê0

ô .

õú

îï*-æ{;9÷æ

f 5

Letþv¹ be the makespan of the schedule for . Then,Œ *C›#œµ*';9;ë(

Œ

.

; p

Œ

3

;žpmom6mÑp

Œ

*þµœI;. Clearly,

Œ

*Cþ . ;ë(

Œ

*-& . ;Ìðÿ(

Xô .

õú

îï*-æW;÷žæ

g

&4ðä5 (9)

For¥Bkt¶“kl®†l™ , we haveŒ *Cþv¹;K( Œ *Ô1¹;/p Œ *-¨¹; , and

Œ *Ô1¹;k

Œ *&/¹;

ðä5

(7)

Furthermore,¨©¹ is no more than the maximum of random execution times. It is well known from order statistics [5] that the mean of the maximum of i.i.d. random variables . 0 3 065757580 with meanð and varianceñ 3 is

Œ *-Tvu1w_* . 0 3 0o5857570;2;%k’ðvp

s†l™

¥s†l™

ñ 5

Therefore,

Œ ¹ ; k Œ *&/¹;

ðÜp—ðÜp

¶Ž†l™

¥¶Ž†’™

ñ

( X ™

e ô ü Æ îï*æ{;÷žæ

f

&Âpt™

g

ðÜp

¶B†’™

¥ž¶À†’™

ñ 5 (10)

Finally, we considerþµœ . Since

Œ *A œ ;D( Œ *& œ ; Š

? œ ðÿ(

Xô ü¢

æ{îï*-æW;÷žæ

g

&4ðä0

and the processor utilization is at least™h†’™§1 in the time interval¬ÒœB0ÓÊoœ#­, we get

Œ *CþµœL;k

Œ *CA#œO;

™h†

.

œ p Œ *-¨oœO;ë(



†’™

Xô ü¢

æWîï*-æW;÷žæ

g

&4ðÜp Œ *-¨oœO;5

ForŒ œ ;, the main difficulty is that when task, :œ¢ starts execution, the number of active tasks still in execution is unknown, which could be as large as& œ , the total number of tasks in' œ . SinceŒ *& œ ; could be*&/; , we use the following quite loose upper bound forŒ *-¨oœO;, that is,¨oœ is no more than the maximum of& random execution times,

Œ

*-¨©œL;k’ðÜp

&Á†’™

¥1&Á†’™

ñ

½’ðvp

&¥ ñ 5

Therefore, we get

Œ

*CþµœL;k

e 

†’™

Xô ü¢

æ{îï*æ{;÷žæ

g

&vp„™

f

ðÂp

&¥ ñ 5 (11)

Combining Equations (9)–(11), we obtain

Œ

*›sœµ*'D;2;k

e ô .

õú

îï*-æW;÷žæµp

œsá .

ó¹ d 3 ™

¶Üô õ

Æõ

Æ2ö

õ

îï*-æ{;9÷æ

p 

†’™sô õ

¢

ù

æ{îï*æ{;9÷ævp

Ë†l™

& f

&4ðÜp

e

œsá .

ó¹ d 3

¶Ž†l™

¥¶B†l™

p &

¥f ñ 5

Notice that

¶†’™

½ ¥

(8)

for all¶“ƒ£™ . Consequently,

œsá .

ó

¹ d 3

¶B†’™

¥ž¶À†’™

½

œ#á .

ó

¹ d 3 ¥ ½ ô œ

3 æ¥

÷žæÿ½

¥

²  . 5

The above calculations give rises to

Œ

*C›#œµ*';9;k

e

œsá .

ó

¹ d . ™

¶vôÎõ

Æõ

Æ9ö

õ

îï*æ{;÷žæµp



†’™Iô õ

¢

ù

æWîï*-æW;÷žæµp

†l™

& f

&4ðÜp

e ¥

² 

. p &

¥f ñ 5

(12) Using Equations (8) and (12), we obtain

Œ *›sœ*C'D;2;

Œ *ÑGIHKJL*';9;

k ™

ý e

œsá .

ó¹ d . ™

¶vô õ

Æõ

Æ9ö

õ

îï*-æW;÷žæµp



®†’™sô õ

¢

ù

æWîï*-æ{;9÷ævp

®†l™

& f

p ™

ý e ¥

² m 

.

& p ™

¥&$f

ñð

( ™

ý e

œsá

.

ó¹ d . ™

ôÎõ Æõ

Æ9ö

õ

îï*-æW;÷žæµp



®†’™

ô õ

¢

ù

æWîï*-æ{;9÷æ

f

p

e  .

& p ™& f 5

It is clear that as&Ք¦• , we get the asymptotic average-case performance bound in the theorem.

As an example, let us consider the uniform distributions, that is, the? =’s are i.i.d. random variables uniformly distributed in the range *€x06™§1¨©­, where¨tƒ ™ is a positive integer. That is, îï*æ{;n(´¨ for

€µ½$æÿk£™§¨ . (Notice that when@ is sufficiently large, a discrete uniform distribution onºž™0Ó¥ 0o5857580Ó@„§¨ ¾

can be treated as a continuous uniform distribution on*C€<06™1§¨©­.)

Theorem 4. If the? =’s be i.i.d. random variables uniformly distributed in the range*€x06™§1¨©­, we have

Š

M N

¡4¢

k í , that is,

Œ

*C›#œµ*';9;ksí

Œ

*|GIHKJO*C'D;2;z0

as&n”Ø• , where

IíÏ(‰¥¨

3

X

3

Ÿ † ™¨ † e

™Dp

™

¥ 3 p‰mom6m©p

™

*-¨I†l™;

3

fhg

0

as¸”¦• .

Proof. We examine the numerator and denominator of the bound in Theorem 3. The denominator is simply

ýÎ(„Tvu1w

e ô .

ù

æ{îï*-æW;÷žæ/0

ô .

õú

îï*æ{;÷žæ

f ( ™

¥¨

0

and the numerator is

(t¨

e

œ#á .

ó d ™

3

*C¶Op„™©;

p ™

¥1q*-†’™©;f

5

(9)

Since

™

3

*C¶Àp„™©;

( ™

3 † ™

p ™

¶Opt™

0

we have

œsá .

ó¹ d í ™

3

*C¶Àp„™©;

( e

œ#á .

ó¹ d í ™

3 f † ™

¨ p ™

 5

Note that

œ#á .

ó¹ d í ™

3 ( N

ó

¹ d . ™

3 † í6á .

ó

¹ d . ™

3 † N

ó

¹ d œ ™

3 k 3

Ÿ †

í6á .

ó

¹ d . ™

3 † ™

 5

Thus, we have

œ#á .

ó¹ d í ™

3

*C¶Àp„™©;

k 3

Ÿ † ™

¨ † e

™Dp

™

¥ 3 p„m6m6m1p

™

*¨s†’™©;

3 f 0

and

k’¨

e 3

Ÿ † ™

¨ † e

™Dp

™

¥ 3 ptmom6m©p

™

*-¨L†’™©;

3 f p ™

¥‡*†l™; f 5

By choosing sufficiently large, the average-case performance bound

§1ý can be made arbitrarily close to .

When¨L(ª™ , the asymptotic average-case performance bound given in Theorem 4 is . ( 3 §1²%†Õ¥s(

™5¥¯°ž¯Ÿž¯€<57575. To show the quality of the average-case performance bound in Theorem 4, we give the

following numerical data.

. ( ™ž5¥¯°¯žŸ¯ž€<57585

3 ( ™ž58™³°

à

¤¥€<57585

Ð ( ™ž58™©€¯¯x™©¥<™57585

ç ( ™ž5€ž¯ž¥²ž²ž¥ ¤57585

( ™ž5€žŸŸ<™

àà

°<57585

é ( ™ž5€ ³³¥

à

¯¤57585

( ™ž5€ à ¤ à

¥ ™©°<57585

è ( ™ž5€ à

™©³²€žŸ<57585

( ™ž5€ž²Ÿ°ž² ¤¥ 57585

.ù ( ™ž5€ž²²ž¥ž³³°<57585

It is clear that í ½™Ïp}™1§¨ for all¨ÿΙ , i.e., í is less than the asymptotic worst-case performance bound in Theorem 2.

Though closed form solutions are not available, the average-case performance bounds of algorithm› œ could be calculated using Theorem 3 numerically for arbitrary probability distribution of task sizes. For instance, let us consider a truncated exponential distribution, i.e.,

îï*-æ{;D(

á!

™h†

á

0 €µ½$æ·k‰™5

(10)

Theorem 5. If the? =’s be i.i.d. random variables exponentially distributed in the range *€x06™­, we have

Š

M N¡ ¢

k" , that is,

Œ

*C›#œµ*';9;%k#

Œ

*ÑGIHKJL*';9;0

as&n”Ø• , where

( N

ó

¹ d . ™

m á$&%

a

¹zÇ . c † á'%Ó¹

™h†

á

™

†

á$

™h†

á

0

as¸”¦• .

Proof. It can be easily verified by straightforward calculation that the numerator and denominator in Theorem 3 are

(

œsá .

ó¹ d . ™

m á$&%

a

¹zÇ . c † á$&%Ó¹

™h†

á

p 

Ë†l™

m ™

™h†

á$

e

™h†

á$&%ˆœ

†(

á&%2œ

 f 0

and

ý(tTvuw

e ™

†

á$

™h†

á$

0

á$&% 3 † á

™h†

á

f ( ™

†

á$

™h†

á$

0

respectively. By letting ”¦• , the average-case performance bound

§1ý can be made arbitrarily close to .

To show the average-case performance bound" in Theorem 5, we let ()™o€ž¥ à , and choose

in such a way that the mean task size

Š

?(

™

†

á$

™h†

á

takes the values ™1§ *|¥1¨; for ¨’( ™ž0ˆ¥<0657585706™©€ , so that a comparison can be made between performance bounds of›#œ under the uniform and the exponential distributions.

(t€x5€ž€€<™©¯<™©²<5758570 (ª™5¥¯°ž¯²ž€ž³ 57575

(t²x5³°²ž³<™™©°<5758570

(ª™5¥ž¤1²x™o€žŸ

à

57575

(„³<5°ž€²€ž€€ž€<5758570 (ª™5¥ ™

à

³ž¤³³ 57575

(‰¤ 5°¤1¯<™©€ ¤ž¤5758570

(ª™57™oŸ

à

€€x™oŸ<57575

(t°x5°ž°ž³ àžà

™ž™5758570 "B(ª™57™©¥ž¤²

à

€€<57575

(ª™™ž5°ž°°x™™

à

³ 5758570 "B(ª™57™o€ž¥<™™©¯<™57575

(ª™o²x5°ž°°ž¯² ¤€<5758570 "B(ª™5€¯

à

Ÿ ¤

à

³ 57575

(ª™©³<5°ž°°ž° ¤™ž™5758570 "B(ª™5€ ¤¥ž¥€¤1Ÿ<57575

(ª™¤ 5°ž°°ž°°ž³€<5758570 "B(ª™5€Ÿž¥°²ž€ ¤57575

(ª™o°x5°ž°°ž°°°x™5758570 "B(ª™5€ž³³ ¤1Ÿ ³²<57575

As shown in the above list, is slightly smaller than 3 §1²v†‰¥ , when?„(Ȁ<5³Š , i.e, ¨l(–™ , due to distribution imbalance in*C€<0o™­. However, is larger than í for all¨B}™ , becauseŒ *-&ê.6; is never null.

(11)

5 Conclusions

We have studied the problem of scheduling independent nonmalleable parallel tasks in parallel systems with identical processors. We proposed a simple approximation algorithm called ›#œ , and performed combinatorial analysis for its worst-case performance and probabilistic analysis for its average-case per- formance. In particular, we proved the following results. (1) The asymptotic worst-case performance ratio

M N

¡W¢ is in the range¬8™ 5857™ .

Ð

.9è

­. (2) If the numbers of processors requested by the tasks are uniformly distributed i.i.d. random variables and task execution times are i.i.d. random variables with finite mean and variance, then the average-case performance ratio is

Š

M N

¡4¢

kޙž5¥¯°ž¯Ÿ¯ž€<57585. In other words, less than 22.5% of the allocated computing power is wasted. (3) Both the worst- and average-case performance ratios improve significantly when tasks request for smaller numbers of processors. (4) Results similar to (2)–(3) also hold for the truncated exponential distribution of task sizes.

Acknowledgements

The author wishes to express his gratitude to the editor and two anonymous referees for their comments on improving the paper. This research was supported in part by National Aeronautics and Space Adminis- tration and the Research Foundation of State University of New York through the NASA/University Joint Venture in Space Science Program under Grant NAG8-1313.

References

[1] B.S. Baker, D.J. Brown, and H.P. Katseff, “A 5/4 algorithm for two-dimensional packing,” Journal of Algorithms 2, 348-368, 1981.

[2] B.S. Baker and J.S. Schwarz, “Shelf algorithms for two-dimensional packing problems,” SIAM Jour- nal on Computing 12, 508-525, 1983.

[3] J. Blazewicz, J.K. Lenstra, and A.H.G. Rinnooy Kan, “Scheduling subject to resource constraints:

classification and complexity,” Discrete Applied Mathematics 5, 11-24, 1983.

[4] E.G. Coffman, Jr., D.S. Johnson, P.W. Shor, and G.S. Lueker, “Probabilistic analysis of packing and related partitioning problems,” in Probability and Algorithms, 87-107, National Research Council, 1992.

[5] H.A. David, Order Statistics, John Wiley & Sons, 1970.

[6] M.R. Garey and R.L. Graham, “Bound for multiprocessor scheduling with resource constraints,”

SIAM Journal on Computing 4, 187-200, 1975.

[7] I. Golan, “Performance bounds for orthogonal oriented two-dimensional packing algorithms,” SIAM Journal on Computing 10, 571-582, 1981.

[8] R. Krishnamurti and E. Ma, “An approximation algorithm for scheduling tasks on varying partition sizes in partitionable multiprocessor systems,” Technical Report RC 15900, IBM Research Division, 1990.

(12)

[9] D.K.D.B. Sleator, “A 2.5 times optimal algorithm for bin packing in two dimensions,” Information Processing Letters 10, 37-40, 1980.

[10] J. Turek, J.L. Wolf, K.R. Pattipati, and P.S. Yu, “Scheduling parallelizable tasks: putting it all on the shelf,” Proc. International Conference on Measurement and Modeling of Computer Systems, 225-236, 1992.

Appendix. Proof of Equation (7)

The following fact is used in the proof. If )

*æ{;K(+*

æµp-,

.

æp$÷

0

where

*

0/,10

.

0ˆ÷x02æ$ª€ , then

)

*-æW; is an increasing (decreasing) function ofæ if

*

֠,

.

ހ (½ª€ ). We consider three cases.

Case 1.Ô . is the maximum. SinceÔ . ƒ .3 Ô . pqÔ 3 , we haveÔ 3 k .3 Ô . . SinceÔ . ƒ .3 Ô . p Ô 3 p Ðç æ , we haveæÕk

çÐ * .

3 Ô . † Ô 3 ; . Hence,

â ( ™Dp

Ô 3

Ô . p æ

Ô .

( ™Dp

Ô 3

Ô.

p ಠe ™¥ † ¥² m Ô 3

Ô. f

( ™Dp

¥² p ™

° m Ô 3 Ô .

k ™Dp

¥² p ™

° m ™

¥

( ™

™o²

™o¯

5

Case 2. 3. Ô . p£Ô 3 is the maximum. Since Ô . k 3. Ô . p£Ô 3 , we haveÔ . k ¥Ô 3 . Since 3. Ô . p‰Ô 3 ƒ

.

3 Ô . p Ô 3 p Ðç æ , we getæÿk

ç Ô 3 . Now,

â (

Ô.äp$ԏ3Dpšæ

.3 Ô . p$Ô 3 k Ô.äp

. Ð

Ô©3

.3 Ô . p—Ô 3 0

which is an increasing function ofÔ . . Henceâ takes its maximum value™ .

Ð

.9è whenÔ . (£¥Ô 3 .

Case 3. .3 Ô . p Ô 3 p Ðç æ is the maximum. Since .3 Ô . plÔ 3 k .3 Ô . p Ô 3 p Ðç æ , we getæƒ

ç Ô 3 . Note that

â (

æŽplÔ1.äp$ԏ3

Ðç

æµp

.3 Ô . p Ô 3

is a decreasing function ofæ . Thus,â gets its maximum value whenæÁ(

ç

ԏ3 , i.e.,

â k Ô . p .Ð

Ô 3

.

3 Ô . p$Ô 3 0

which is an increasing function ofÔ. . Since, Ô.µk .3 Ô1.p

ԏ3%p

Ðç æ , andæš(

ç

ԏ3 , we haveÔ.µkޥԏ3 . Hence

â

reaches its maximum value™ .

Ð

.è whenÔ.h(£¥Ô3 .

参照

関連したドキュメント

I observe confidence interval changes according to sample size use the normally distributed random number generator to produce 10 sets of normally distributed random numbers with mean

Abstract: In this work I dedused the moments of order statistics of independent nonidentically (inid) distributed Kumaraswamy random variables.. Numerical

An estimate is given for the lower bound of real zeros of random algebraic polynomials whose coefficients are non-identically distributed dependent Gaussian random variables..

We will show in section 2 that the matrix transformation corresponding to the regular convolution method generated by an independent and identically distributed sequence of

The random elements {X,} belong to a type p stable space and m’e assumed to be independent, but not necessarily identically distributed.. No assumptions are placed on the

Complete convergence is studied for linear statistics that are weighted sums of identically distributed ρ ∗ -mixing random variables under a suitable moment condition.. The

An exponential inequality is established for identically distributed negatively associated random variables which have the finite Laplace transforms.. The inequality improves

We will examine double sequence to double sequence transformation of independent identically distribution random variables with respect to four-dimensional summability matrix