博士論文概要

全文

(1)早稲田大学大学院情報生産システム研究科. 博士論文概要. 論. 文. 題. 目. Study on Probabilistic Model Building Genetic Network Programming. 申請者 Xianneng LI 情報生産システム工学専攻ニューロコンピューティング研究. 2012 年. 11 月.

(2) I n s p i r e d b y D a r w i n ’s t h e o r y o f n a t u r a l e v o l u t i o n , E v o l u t i o n a r y A l g o r i t h m s (EAs) use iterative progress to evolve a population of candidate solutions for solving optimization problems . The population is evolved towards more promising region of the search space by natural selection and random genetic operators, such as crossover and mutation. EAs have shown the advantages of simplicity of implementation and flexibility of different problems, comparing with the classical methods of operational research or mathematics. which. requires. much. prior. knowle dge. to. build. the. mathematical models of problems . As a result, EAs have attracted much attention by researchers to propose numerous. algorithms in the past. several years, such as Genetic Algorithm (GA), Evolution Strategy (ES), Evolutionary Programming (EP) and Genetic Programming (GP), etc. The main difference among these algorithms relies on the representation of individual structures. GA encodes its individual by a sequence of bit-strings, ES individuals are coded as vectors of real numbers, EP represents its individuals by finite state machines and GP uses tree structure to represent its individuals. In the last decade, a new EA named Genetic Network Programming (GNP) was proposed. GNP extends the classical EAs to the graph individual representation, whe re a directed graph structure is developed. Comparing with the classical bit -string and tree structures EAs, the distinguished directed graph structure allows GNP to ensure higher expression ability for modeling complex problems. The fundamental basis that makes EAs succeed in solving optimization problems is that evolution is capable of reproducing and combining the high-quality partial solutions (generally called Building Blocks, BBs) to f o r m n e w s o l u t i o n s w i t h h i g h e r q u a l i t y. H o w e v e r, i n c l a s s i c a l E A s , t h i s process is achieved implicitly through the problem -independent, stochastic and fixed crossover and mutation. The evolution of crossover and mutation are actually the variants. of stochastic search , which. have risks for. breaking down the BBs which causes the low evolution efficiency or even make the problem unsolvable. To o v e r c o m e t h e a b o v e p r o b l e m s , r e s e a r c h e r s d e v e l o p e d a n e w r e s e a r c h branch. of. EAs. called. Estimation. of. Distribution. Algorithm. (EDA).. Different from conventional EAs which implicitly recombine the BBs by stochastic. genetic. operators,. EDA. builds. a. probabilistic. model. by. estimating the probability distribution from the promising solutions and 2.

(3) sample the model for generating the new population. The fundamental basis. of. EDA is. that. the. BBs. can. be. explicitly. represented. in. the. probabilistic model and recombined by sampling the model. The process of estimating the probability distribution is carried out by the advanced techniques of statistic and machine learning. Many previous studies have shown that the breakage of the BBs in conventional EAs can be reduced by EDA, which causes the improvement of evolution performance s. A large. number. of. studies. have. been. conducted. on. EDA to. propose. n u m e r o u s a l g o r i t h m s . H o w e v e r, m o s t o f t h e c u r r e n t E D A s w e r e p r o p o s e d i n b i t - s t r i n g s t r u c t u r e b a s e d G A a n d t r e e s t r u c t u r e b a s e d G P. D u e t o t h e r e s t r i c t i o n o f t h e s e s t r u c t u r e s i n t e r m s o f e x p r e s s i o n a b i l i t y, m o s t o f t h e c u r r e n t E D A s a r e a p p l i e d t o s o l v e t h e b e n c h m a r k p r o b l e m s o f G A a n d G P, which produces one of the essential challenges in exploring EDA to solve many other problems. On the other hand, in most of the advanced EDAs, the complex machine learning techniques, such as Bayesian network, are very. time. consuming. for. constructing. the. probabilistic. model.. The. construction of the probabilistic model itself is an optimization problem. The objective of th is thesis is to propose a novel paradigm of EDA named Probabilistic Model Building Genetic Network Programming (PMBGNP). It e x t e n d s E D A t o t h e g r a p h i n d i v i d u a l r e p r e s e n t a t i o n , w h e r e G N P ’s d i r e c t e d graph. is. used.. PMBGNP. ensures. higher. expression. ability. than. the. existing EDAs, where a number of problems can be explored and solved e f f i c i e n t l y a n d e f f e c t i v e l y, s u c h a s d a t a m i n i n g p r o b l e m s a n d t h e p r o b l e m s of. controlling. PMBGNP. is. the. agents’. studied. by. b e h a v i o r. integrating. M o r e o v e r,. the. enhancement. Reinforcement. Learning. of. (RL). techniques, which does not require additional large number of time cost for the probabilistic modeling comparing with conventional EDAs. C h a p t e r 2 p r o p o s e s s t a n d a r d P M B G N P, w h e r e t w o M a x i m u m L i k e l i h o o d Estimation. (MLE) -based. methods. are. developed. for. the. probabilistic. m o d e l i n g o f P M B G N P. T h e p r o b a b i l i s t i c m o d e l e s t i m a t e s t h e d i s t r i b u t i o n of. node. connections. for. finding. the. optimal. solutions.. To. verify. its. performance, PMBGNP is applied to solve the data mining problems, including the time series traffic dataset and the UCI benchmark datasets , by comparing with the conventional EAs and data mining methods. In chapter 3, the issue of the population diversity loss in PMBGNP is 3.

(4) addressed by the theoretical comparison with classical EDAs. A hybrid PMBGNP is therefore proposed to improve its exploration ability for the m a i n t e n a n c e o f d i v e r s i t y. T h e e f f e c t i v e n e s s o f t h e h y b r i d P M B G N P a n d i t s theoretical. ability. to. maintain. the. population. diversity. is. testified. t h r o u g h t h e p r o b l e m o f c o n t r o l l i n g t h e a g e n t s ’ b e h a v i o r, i . e . , r o b o t c o n t r o l . In chapter 4, an extended PMBGNP is proposed to accelerate the evolution by using both of the good and bad individuals. Most of the existing EDAs focus on estimating the distribution of the good individuals, while the bad ones are ignored. This chapter proposes a RL -based method to extract the good sub-structures from the bad individuals. The proposed method learns the experiences of individuals to formulate. the Q values, which can. measure the quality of sub-structures. By incorporating the learnt Q values, the good sub-structures from the bad individuals can be extracted and combined into the probabilistic modeling of PMBGN P to boost the evolution. The simulation results on robot control. problems show its. superiority over conventional methods. In chapter 5, the algorithm of integrating RL is used from another sight of EDA. That is, the learnt Q values are directly used in t he probabilistic m o d e l i n g o f P M B G N P, r a t h e r t h a n e x t r a c t i n g s u b - s t r u c t u r e s f r o m t h e b a d individuals. The proposed algorithm in this chapter is called Reinforced P M B G N P. C o m p a r i n g w i t h t h e e x i s t i n g a d v a n c e d E D A s b a s e d o n B a y e s i a n network which requires much time cost, Reinforced PMBGNP only requires linear additional time for the. learning. of. Q. values. to construct. an. accurate model. The proposed algorithm is systematically studied in both b e n c h m a r k p r o b l e m , i . e . , Ti l e w o r l d s y s t e m , a n d r o b o t c o n t r o l p r o b l e m s b y the comparison with the state -of-the-art algorithms in EAs, EDA and RL . Chapter. 6. continuous. extends. PMBGNP. domains,. where. from an. discrete algorithm. optimization named. problems. PMBGNP. to. with. Actor-Critic (PMBGNP-AC) is proposed. In PMBGNP -AC, the cont inuous variables of nodes are formulated by Gaussian distribution, which is updated by the analog of AC through evolution. The superiority of the proposed algorithm is verified in robot control problem by comparing with the classical algorithms. F i n a l l y, c h a p t e r 7 c o n c l u d e s t h e t h e s i s b y d r a w i n g t h e u n i q u e f e a t u r e s o f PMBGNP and its contributions. 4.

(5)

博 士 論 文 概 要

博士論文概要