早稲田大学大学院情報生産システム研究科
博 士 論 文 概 要
論 文 題 目
A Study of Meta-heuristic Optimization based on Probabilistic Distribution to solve
Structural Design Problems
申 請 者
Haydee Rocio MELO CISNEROS
情報生産システム工学専攻 経営工学研究
2015 年 12 月
2
Genetic Algorithms (GA), Particle Swarm Optimization (PSO) and Neural Networks (NN) have demonstrated a strong ability to solve complex problems. Even though they have a simple structure based on a biological model such as brain, bird flock or evolution; they yield good performance.
However, nowadays applications are more complex. A vast majority of applications such as classification, forecasting, and function approximation have been investigated using various different approaches. Those approaches have some limitations such as being trapped in local minima, having experience the loss of information and possible poor solutions. In a real application, the higher accuracy and more efficiency level are required. The objectives of the work reported in this thesis are primarily to 1) introduce probabilistic distributions in meta-heuristics algorithms, 2) design a new training algorithm, for NN by using normal distribution, and 3) demonstrate how such hybrid algorithms that use probabilistic distributions, can be exploited to improve training in neural networks and solve complex problems. In addition, a pursuing goal is to improve and expand the training algorithm with fuzzy and structural learning that includes the development of new methodologies and software for the simulation and analysis of its performance in experiments. The thesis is organized as follows:
Chapter 1[Introduction] presents the objective and motivation by explaining the innovation and difficulties that must be overcome and mitigated for the proposed methods.
Chapter 2 [Previous works] introduces the rapidly progressing research field of meta-heuristics algorithms and explains why this field is important particularly in the computational sciences.
Chapter 3 [Preliminaries of Basic Methods] provides the basic concepts of Genetic Algorithms, Particle Swarm Optimization and Neural Network concepts used throughout the thesis.
Chapter 4 [Estimation of Distribution Algorithm for solving a Redundancy Allocation Problem]
proposes an Estimation of Distribution Algorithm (EDA) approach to solve a redundancy allocation problem (RAP) for a high-security-system. This algorithm can yield an optimal solution for combinatorial optimization problems by predicting the movements of the populations in the search space and avoiding several parameters associated with other evolutionary computation algorithms.
GAs have been widely used in research studies on redundancy allocation to find the minimal cost configuration of a series-parallel and r-out-of-n systems under the constraints of reliability or availability. Although a GA yields good results in solving various NP hard (non-deterministic polynomial-time hard) problems, it does not guarantee global optimization and requires many modifications of the algorithm to solve each problem. This mixed integer problem is difficult to solve
3
even by the heuristics method; therefore an EDA approach is proposed. The EDA incorporates the knowledge the previous solutions into the search to learn about the problem structure to find the best combinations for the number of components and get an optimal solution in a small number of iterations. It can assist engineers during the design phase by creating a highly reliable and available system. A redundancy allocation problem for a high security system is proposed to test the Estimation of Distribution Algorithm (EDA) in a large space. In this study, also a new series-parallel redundancy allocation mathematical model is formulated to maximize the system availability. The motivation of this research is not only to develop a new formulation for the RAP but also to create an optimization model from the perspective of availability. The highest system availability achieved by EDA with 1-1.11 x10-4and a cost of $982,071.94 cost. In comparison, PSO achieved a 1- 0.61 x 10-3availability with a cost of $821,541.16. Tabu Search (TS) achieved an availability of 1 -0.13x10-3and a cost of
$813,295.54. Even though PSO and TS have lower cost, their availability values are also low.
Therefore, compared with PSO and TS, the EDA resulted in a superior performance in terms of solution quality and less computational time. EDA achieved 4.322 s, while PSO and TS achieved 9.246 s and 9.354 s, respectively.
Chapter 5 [Gaussian-Cauchy Particle Swarm Optimization for training a Neural Network]
proposes a PSO with Gaussian and Cauchy random variables to train the NN. Training an NN can be formulated as an optimization problem to search the best network design. Although GA is widely used as a training algorithm in NN, PSO is scarcely used in applications for training the NN. PSO has demonstrated better performance than BP algorithm, GA or other evolutionary algorithms in training the NN. Also, PSO is used to optimize the weights and bias in the network without taking into account the network structure. Even though PSO has a good performance in a global optima search, considering that PSO has the ability to find and explore optimal solution in the search space, parameter tuning influences its efficiency, PSO is not able to find the optimal network design. Since, PSO has no momentum this is a serious disadvantage; causing difficulty to reach the global optimum.
The use of normal distribution has demonstrated to yield good result in meta-heuristics, especially population based algorithms such as GA and PSO. The application of statistical distributions makes the meta-heuristics algorithms learn the solutions. Gaussian random variables improve the search performance and the information sharing among the particles by allowing a more stable search for an optimal solution while Cauchy random variables are used when the global best does not change in the search for another global best. In contrast to other methods that only use classical PSO to find the optimal values for the weights in the network, the combination of Gaussian-Cauchy probability distributions induces learning into the algorithm by faster convergence and escape from the local minima. The proposed weight encoding strategy allows a graphic representation and easy decoding of
4
the solution. The proposed improved PSO with Gaussian-Cauchy random variables that straighten out the convergence to avoid local minima is applied to train a three-layer neural network. For the function approximation problem the Mean Squared Error (MSE) for GCPSO-NN is 6.65e-010 while PSO and BP are 8.75e-008and -1.91e001respectively. In Iris classification the best recognition rate that the BP algorithm achieved was 0.85 while the best recognition rate attained by the PSO algorithm was 0.88 and 0.97 for the GCPSO with any number of hidden nodes. The proposed Gaussian-Cauchy PSO for training the NN had better recognition rate, and its best mean recognition rate 0.9185 was achieved under a different number of hidden nodes.
Chapter 6 [Gaussian Particle Swarm Optimization for Fuzzy Reasoning based Structural Learning]proposes a Fuzzy Reasoning based Structure Learning with (SLFR) to identify the optimal network structure on the GPSO-NN. Gaussian Random variables in the PSO improve search performance and information sharing capability among the particles providing more stability during the search for an optimal solution. In contrast to other methods that use only PSO to find the optimal values for the weights in the network, the Gaussian Particle Swarm Optimization (GPSO) avoids parameters tuning problems, while the learning of the search space with a Gaussian probability distribution enables escape from the local minima. However the size of the solution influences the computational time and the convergence. The total number of the weights in the Neural Network determines the size of each solution, therefore the size of the network structure is computationally time consuming. The structural learning methods improve the computational time by modifying the network structure and the number of units in the hidden layer. The proposed method improves the learning and removes the stress by eliminating the necessity of determining a detailed network. The fuzzy reasoning based on structural learning supports a more compact network by finding the weak weights and the optimal number of units for the network. This is determined by the goodness factor and membership functions that avoid the deletion in early stages of the algorithm and assure the link to be erased has no effect on the Mean Square Error (MSE). Therefore the most important weights and units in the structure of the NN are identified. The proposed algorithm is applied to train a three-layer NN. To test the proposed algorithm an iris data set classification problem is used. The best recognition rate that BP algorithm achieved was 0.85 while the best recognition rate for the PSO and SLFR-GPSO was 0.88 and 0.95, respectively. The proposed algorithm has a better recognition rate and requires less computational time with 11.598s while PSO achieved 15.207 s and Back Propagation 24.775 s with 14 units in the hidden layer. Moreover the best recognition mean-rate achieved by the proposed method was 0.9055 under different number of hidden units.
Chapter 7 [Conclusions and future works]concludes and summarizes the thesis and explains future works.