Japan Advanced Institute of Science and Technology
JAIST Repository
https://dspace.jaist.ac.jp/
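Title
A Study on the Construction of Software that Achieves Fault Tolerance and Adaptability in Loosely Coupled Distributed Environments (疎結合分散環境における耐故障性と適応性を実現するソフトウェアの構成に関する研究)
Author(s)
Toyoshima, Masumi
Citation
Issue Date
2001-06
Type
Thesis or Dissertation
Text version
author
URL
http://hdl.handle.net/10119/927
Rights
Description
Supervisor: Takuya Katayama, School of Information Science, Doctoral
on Loosely Coupled Distributed Systems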
Masumi Toyoshima
School of Information Science,
Japan Advanced Institute of Science and Technology
June 2001
Abstract
This thesis presents a task allocation method for implementing fault-tolerant software using the
functional programming paradigm. The design and implementation of a runtime system for a loosely
coupled distributed environment, built on a group communication system, are also discussed.
Recently, many COTS computers have been connected by LANs or WANs, and several projects have arisen
that use the computational power of these inexpensive resources. In such loosely coupled distributed
environments, a single system is built from many components, the computers differ in performance
class, and communication links may sometimes be down. To run large applications, including
long-running scientific computations, in such an environment, it is important for the system to be
fault tolerant, that is, able to tolerate some faults and continue running at an acceptable
level.
Much research has been done in this area since the 1970s, and some basic techniques for
implementing fault tolerance exist. However, most of these techniques are based on the imperative
programming paradigm. Although the operational semantics of imperative programs is easy to
understand, programmers have to deal with many complications: detecting faults, checkpointing the
system state, using the checkpoints to recover to a correct state, and so on.
To avoid these difficulties, the APR replication technique, which is based on the functional
programming paradigm, was introduced in 1998. The APR approach covers the whole development, from
the model of computation through implementation. APR not only provides fault tolerance but also
shortens the time to complete a computation. To gain these benefits, programmers only need to
describe the application program in a functional manner.
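
To illustrate why a functional description makes replication simple, here is a minimal Python sketch. It is not the APR algorithm itself, which the abstract does not specify. It relies only on the general property that a pure function returns the same value on every replica, so the first replica to finish can be used and failed replicas can simply be ignored, with no checkpointing. The names `replicated_call`, `healthy_node`, and `crashed_node` are hypothetical, and threads stand in for remote machines.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def replicated_call(func, arg, replicas):
    """Run the pure function func(arg) on every replica; return the first success.

    Each replica is a hypothetical callable taking (func, arg). It stands in
    for a remote node and may raise an exception to model a failure.
    """
    pool = ThreadPoolExecutor(max_workers=len(replicas))
    errors = []
    try:
        futures = [pool.submit(replica, func, arg) for replica in replicas]
        for future in as_completed(futures):
            try:
                # func is pure, so every surviving replica computes the same
                # value and the earliest result can be used safely.
                return future.result()
            except Exception as exc:
                errors.append(exc)
        raise RuntimeError(f"all {len(replicas)} replicas failed: {errors}")
    finally:
        # Do not wait for slower replicas once a result (or total failure) is known.
        pool.shutdown(wait=False)


def healthy_node(func, arg):
    return func(arg)


def crashed_node(func, arg):
    raise ConnectionError("node unreachable")


def square(x):
    return x * x


result = replicated_call(square, 7, [crashed_node, healthy_node, healthy_node])
```

In this sketch one crashed replica does not affect the caller, and the completion time is that of the fastest surviving replica, which mirrors the two benefits claimed above under the stated assumptions.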
Although a scheduling algorithm for functions in APR has been introduced, neither the resource
allocation method nor the details of design and implementation, including communication in a
loosely coupled distributed environment, have been defined yet.
This thesis starts with the formalization of the APR task scheduling algorithm. Then the RAFT
resource management system is introduced to manage computation resources and allocate tasks to
them. RAFT divides APR functions into more fine-grained tasks, called RAFT processes, and
distributes these processes to the computation resources on the network. The recovery process in
RAFT is also defined so as to minimize the recovery time when failures occur.
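
The idea of distributing fine-grained tasks over resources and recovering from a failure can be sketched as follows. This is only an illustration under assumed policies: the abstract does not state how RAFT allocates or recovers, so the round-robin placement and the least-loaded reassignment below are placeholders, and the names `allocate` and `recover` are hypothetical.

```python
def allocate(tasks, resources):
    """Place tasks on resources round-robin (placeholder policy, not RAFT's).

    Returns a dict mapping each resource name to its list of tasks.
    """
    if not resources:
        raise ValueError("no resources available")
    plan = {resource: [] for resource in resources}
    for i, task in enumerate(tasks):
        plan[resources[i % len(resources)]].append(task)
    return plan


def recover(plan, failed):
    """Reassign only the failed resource's tasks to the least-loaded survivors.

    Tasks on surviving resources are left untouched, so the work redone after
    a failure is limited to what the failed resource was holding.
    """
    survivors = {r: list(ts) for r, ts in plan.items() if r != failed}
    if not survivors:
        raise RuntimeError("no surviving resources")
    for task in plan.get(failed, []):
        target = min(survivors, key=lambda r: len(survivors[r]))
        survivors[target].append(task)
    return survivors


plan = allocate(["t0", "t1", "t2", "t3", "t4", "t5"], ["a", "b", "c"])
after_failure = recover(plan, "b")
```

Here the six tasks are first spread evenly over the three resources, and when resource "b" fails its two tasks are moved to "a" and "c" while the other four stay where they are.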
The design and implementation using group communication are also presented in this thesis, together
with a cost analysis. This work shows the effectiveness and characteristics of a fault-tolerant
system based on the functional programming paradigm.
keywords: fault-tolerant software, loosely coupled distributed systems, parallel computation,
functional programming, resource allocation