Improved Hybrid Swarm Intelligence Algorithm for Voice Command Recognition


JOURNAL OF SOUTHWEST JIAOTONG UNIVERSITY

Vol. 55 No. 2

Apr. 2020

ISSN: 0258-2724 DOI：10.35741/issn.0258-2724.55.2.23

Research article

Computer and Information Science

Jamal Salahaldeen Majeed Alneamy, Ghada Mohammad Tahir Kasim Aldabagh

Software Department, College of Computers Science and Mathematics, University of Mosul Alkhatonia, Mosul, 41001, Iraq, [email protected]

Received: August 20, 2019 ▪ Review: September 18, 2019 ▪ Accepted: February 27, 2020

This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution License

(http://creativecommons.org/licenses/by/4.0)

Abstract

Swarm intelligence involves the aggregation of boids, which interact with each other in their own environment. The boids are agents that follow simple rules in the absence of a centralized structure. The artificial neural network is a connectionist system that originated from the biological neural network. The aim of the present study is to improve the hybrid swarm intelligence algorithm for voice command recognition. The proposed hybrid algorithm combines the lion optimization algorithm with the fish swarm algorithm and was improved by using voice features. Ten voice commands were recorded: “open”, “close”, “open door”, “close door”, “open window”, “close window”, “on”, “off”, “play”, and “stop”.

Keywords: Artificial Neural Network, Swarm Intelligence, Hybrid Algorithm, Voice Recognition


I. INTRODUCTION

The artificial neural network (ANN) is known as a connectionist system that originated from the biological neural network. The ANN has various applications, including image recognition and image processing. Around 20 years ago, there was significant interest in cellular automata. A cellular structure is a group-based representation built on cellular automata, and it leads further to the notion of a swarm.

Swarm intelligence (SI) was introduced by Beni and Wang in 1989 [1], [2], [3]. Swarm behavior is similar to aggregative motion. Initially, only biological researchers dealt with swarm behavior. More recently, engineers have taken a significant interest in swarm behavior and therefore in SI. SI is a related concept, based on artificial intelligence, that captures the behavior of decentralized and self-organized systems, which may be artificial or natural. SI has been applied in various fields and plays an important role in the optimization of telecommunication engineering [4], automated traffic systems, military defense system applications [5], and robotic engineering [1], [6].

II. BASIC CONCEPT

SI involves the aggregation of the boids. The boids exist in their own environment without centralized structure, and interact with each other to follow simple rules. The concept of SI is seen in several natural examples such as the growth of bacteria, fish schooling, bird flocking, ant colonies, and many others. SI involves a generalized set of algorithms.

The swarm behavior model includes the boids, an artificial life program created by C. Reynolds in 1986 [7]. It imitates flocking behavior in birds. Separation, alignment, and cohesion are the basic rules applied in the boids’ world. Interaction between individual boids gives rise to complex behavior.
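The three rules named above can be sketched as a single update step. This is a minimal illustration and not Reynolds' original program: the function name `boids_step`, the neighborhood radius, the minimum distance, and the three rule weights are all assumptions chosen for readability.

```python
import math


def boids_step(positions, velocities, radius=5.0, min_dist=1.0,
               w_sep=0.05, w_ali=0.05, w_coh=0.01):
    """Advance a 2-D flock by one time step (all parameters are illustrative)."""
    new_velocities = []
    for i, (p, v) in enumerate(zip(positions, velocities)):
        sep = [0.0, 0.0]      # separation: push away from boids that are too close
        avg_vel = [0.0, 0.0]  # alignment: average velocity of neighbors
        center = [0.0, 0.0]   # cohesion: average position of neighbors
        count = 0
        for j, (q, u) in enumerate(zip(positions, velocities)):
            if i == j:
                continue
            d = math.dist(p, q)
            if d > radius:
                continue  # only boids inside the radius count as neighbors
            count += 1
            avg_vel[0] += u[0]
            avg_vel[1] += u[1]
            center[0] += q[0]
            center[1] += q[1]
            if d < min_dist:
                sep[0] += p[0] - q[0]
                sep[1] += p[1] - q[1]
        vx, vy = v
        if count:
            vx += (w_sep * sep[0]
                   + w_ali * (avg_vel[0] / count - v[0])
                   + w_coh * (center[0] / count - p[0]))
            vy += (w_sep * sep[1]
                   + w_ali * (avg_vel[1] / count - v[1])
                   + w_coh * (center[1] / count - p[1]))
        new_velocities.append((vx, vy))
    # Every boid moves with its updated velocity; there is no central controller.
    new_positions = [(p[0] + v[0], p[1] + v[1])
                     for p, v in zip(positions, new_velocities)]
    return new_positions, new_velocities
```

Each boid reads only its neighbors' positions and velocities, which is the decentralized structure the text describes.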

SI has two essential properties: self-organization and division of labor. Self-organization is defined as the ability of a system to evolve its boids (i.e., agents or components) into an appropriate form without external help. Self-organization depends on several functions, namely fluctuation, complex interaction, positive feedback, and negative feedback [8]. Feedback plays an important role in amplification and stabilization, while fluctuation is used to create randomness. Complex interaction is useful when swarms exchange data among themselves within their own search area. The division of labor property of SI is defined as the synchronous execution of different feasible and simple duties by individuals. This property enables the swarm to tackle complex problems that require individuals to work together [9].

Swarm is favored because of its simplicity and reliability. Compared to centralized systems, swarm requires simpler components. In systems-level robotics engineering, swarm can help make robotic parts modularized, dominant, mass-produced, interchangeable, and disposable. Swarm is also highly reliable. It can be designed in such a way as to endure different types of disturbances. Due to redundancy, swarm is able to adapt to its working environment dynamically. Swarm acts like a massively parallel computational system, so it can carry out duties beyond the reach of other robotic systems, such as centralized systems or complex robot systems. Thus, swarm becomes a promising asset for robotics. SI algorithms include Ant Colony Optimization (ACO), Artificial Bee Colony (ABC), Genetic Algorithms (GA), Glowworm Swarm Optimization (GSO), Particle Swarm Optimization (PSO), Cuckoo Search Algorithm (CSA), and Differential Evolution (DE) [9]. All of the aforementioned algorithms have exhibited the potential to resolve various optimization difficulties.

SI algorithms are used in various forms of recognition software, such as image, speech, and voice recognition [10]. Hybrid swarm algorithms increase the reliability and efficiency of such software. The SI framework works as follows: initialize the population, define the pause or stop condition, evaluate the fitness function, update and move the agents, and return the global best solution [11]. Combining the basic SI algorithms with other techniques yields hybrid swarm algorithms that achieve better optimization [12], [13], [14]. For example, swarm optimization combined with the Viterbi algorithm is used for speech or voice recognition, and optimizing a hidden Markov model (HMM) with an evolutionary algorithm provides a solution to the voice tag problem. In this paper, the input voice is compared with the weighted output parameters, and the neighboring parameters and fitness function are analyzed to obtain the best solution. The PSO-Viterbi algorithm is routinely used because it provides a framework for future improvement. Feature extraction must be able to characterize any speech or voice so that we can adjust the weight (a parameter that transforms input data into output within the hidden layers of a neural network) and compute the result by comparing the original voice with the weighted voice. For clearer output, the algorithm can be run for a number of iterations while varying the weighted parameter [15]. The aim of the present study is to improve the hybrid SI algorithm for voice command recognition.
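The generic SI framework listed above (initialize, stop condition, evaluate fitness, update and move, return the global best) can be sketched as a short loop. The update rule here, a random drift toward the current best plus a small perturbation, is a placeholder assumption and is not the lion, fish, or hybrid algorithm of this paper; `swarm_search` and all of its parameters are hypothetical names.

```python
import random


def swarm_search(fitness, dim, n_agents=20, max_iter=100,
                 bounds=(-5.0, 5.0), step=0.5, seed=0):
    """Minimize `fitness` with a generic swarm loop (illustrative update rule)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # 1. Initialize the population at random positions inside the bounds.
    agents = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_agents)]
    best = min(agents, key=fitness)[:]
    best_fit = fitness(best)
    # 2. Stop condition: a fixed iteration budget.
    for _ in range(max_iter):
        for agent in agents:
            # 4. Update and move: drift toward the global best, plus noise.
            for d in range(dim):
                agent[d] += step * rng.random() * (best[d] - agent[d])
                agent[d] += rng.uniform(-0.1, 0.1)
                agent[d] = min(hi, max(lo, agent[d]))
            # 3. Evaluate the fitness of the moved agent.
            f = fitness(agent)
            if f < best_fit:
                best, best_fit = agent[:], f
    # 5. Return the global best solution found.
    return best, best_fit


def sphere(x):
    """Simple test function with its minimum at the origin."""
    return sum(v * v for v in x)


best, best_fit = swarm_search(sphere, dim=2)
```

A concrete SI algorithm differs mainly in step 4, where it defines how agents share information and move.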

Weight is the parameter within a neural network that transforms input data within the network's hidden layers.


III. MATERIAL AND METHODS

A. Proposed Algorithm

The proposed hybrid algorithm combines the lion optimization algorithm with the fish swarm algorithm. It improves on a previous hybrid algorithm by using features extracted from the voice signals.

The proposed algorithm consists of four stages, which are as follows:

First stage: In this stage, we built the database to be used later in recognition. This stage consists of the following steps:

Step 1: The commands were given and the voices were recorded. In the present study, 10 voice commands were recorded. They are shown in Figure 1.

Figure 1. The database of voice commands

Step 2: About twenty samples were recorded for each voice command.

Second stage: In this stage, the input voice command was recorded.

Third stage: This is known as the “feature extraction stage.” In this stage, the features of the database and input samples were extracted.

B. Analysis of the Features

The features we calculated are as follows:

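1) Summation of Each Sample's Values

$$\text{Sum value}(K) = \sum_{i=1}^{N} x_K(i) \tag{1}$$

where $x_K(i)$ = the $i$-th value of sample $K$, N = length of the sample, K = number of the sample.

2) Standard Deviation of Each Sample's Values

$$\text{Standard value}(K) = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( x_K(i) - \text{Mean value}(K) \right)^2} \tag{2}$$

where $x_K(i)$ = the $i$-th value of sample $K$, N = length of the sample, K = number of the sample.

3) Mean of Each Sample's Values

$$\text{Mean value}(K) = \frac{1}{N} \sum_{i=1}^{N} x_K(i) \tag{3}$$

where $x_K(i)$ = the $i$-th value of sample $K$, N = length of the sample, K = number of the sample.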
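The three features of Eqs. (1) to (3) can be computed per sample as in the sketch below. The function name `extract_features` and the division by N (rather than N - 1) in the standard deviation are assumptions; the source names the features but does not state the normalization.

```python
import math


def extract_features(sample):
    """Return (sum, standard deviation, mean) of one recorded sample.

    `sample` is a sequence of N amplitude values. The standard deviation
    divides by N, which is an assumption about the intended definition.
    """
    n = len(sample)
    total = sum(sample)                      # Eq. (1): summation
    mean = total / n                         # Eq. (3): mean
    variance = sum((x - mean) ** 2 for x in sample) / n
    std = math.sqrt(variance)                # Eq. (2): standard deviation
    return total, std, mean


# Hypothetical four-value sample, used only to show the call.
features = extract_features([1.0, 2.0, 3.0, 4.0])
```

Each recording is thereby reduced to a three-element feature vector, so later comparisons work on three numbers instead of the full waveform.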

Fourth stage: In this stage, a new version of the hybrid algorithm used the extracted features to recognize the voice command. The proposed algorithm used the lion algorithm to calculate the visual value for the fish algorithm.

$$X_{new} = X_{old} + visual \times rand(0 \text{ to } 1) \tag{4}$$

where $X_{new}$ = the new location of the current fish, $X_{old}$ = the old location of the current fish, visual = the visual distance, rand(0 to 1) = random number generated between (0,1).
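A minimal sketch of the fish move in Eq. (4), assuming it has the usual artificial fish swarm form in which the old location is shifted by the visual distance scaled by a random number. The name `move_fish` and the choice to draw one random number per coordinate are assumptions; how the lion algorithm produces `visual` is not specified in the source and is therefore passed in as a plain argument.

```python
import random


def move_fish(old_location, visual, rng=random):
    """Return the new location of a fish: old + visual * rand(0 to 1).

    `visual` is assumed to be supplied by the lion optimization step.
    """
    return [x + visual * rng.random() for x in old_location]


# Hypothetical three-feature location and visual distance.
rng = random.Random(1)
new_location = move_fish([0.0, 1.0, 2.0], visual=0.5, rng=rng)
```

Because `rng.random()` lies in [0, 1), each coordinate moves by less than `visual`, so a smaller visual value gives a finer local search.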

The minimum Euclidean distance fitness function is used to recognize the voice command.
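Recognition by minimum Euclidean distance can be sketched as a nearest-template search over the stored feature vectors. The function name `recognize`, the database layout (command name mapped to a list of feature vectors), and the numeric values are hypothetical; the swarm search that the paper uses to drive the comparison is omitted here.

```python
import math


def recognize(input_features, database):
    """Return the command whose stored features are closest to the input."""
    best_label, best_dist = None, math.inf
    for label, feature_vectors in database.items():
        for stored in feature_vectors:
            d = math.dist(input_features, stored)  # Euclidean distance
            if d < best_dist:
                best_label, best_dist = label, d
    return best_label, best_dist


# Made-up (sum, standard deviation, mean) vectors for two commands.
database = {
    "open": [[10.0, 1.0, 0.50], [10.5, 1.1, 0.52]],
    "close": [[20.0, 2.0, 1.00]],
}
label, distance = recognize([11.0, 1.1, 0.55], database)
```

In practice the features would likely need scaling to comparable ranges, since the sum feature can otherwise dominate the distance.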

IV. RESULTS

The proposed hybrid algorithm provides many advantages over the fish and lion optimization SI algorithms, mainly with respect to recognition time and accuracy. Two aspects of the proposed algorithm were tested and compared with the earlier algorithms: the time needed to recognize the voice command and the accuracy of recognition of the voice command. Table 1 shows the results of the experiment conducted to determine the recognition time needed by the hybrid algorithm without feature extraction compared with the fish and lion optimization algorithms.

Table 1.

Recognition time (sec)

Voice command   Fish       Lion optimization   Hybrid algorithm   Proposed features hybrid algorithm
Open            0.004092   0.005049            0.00321            0.000395
Close           0.004172   0.005063            0.00320            0.000367
Open door       0.004998   0.005697            0.00419            0.000435
Close door      0.004999   0.005299            0.003933           0.000455
Open window     0.004938   0.005538            0.003981           0.000501
Close window    0.005001   0.005741            0.004033           0.000441
On              0.004996   0.005548            0.004012           0.000467
Off             0.005540   0.006290            0.004812           0.000441
Play            0.005006   0.005655            0.004611           0.000423
Stop            0.00302    0.00316             0.00249            0.000578

Table 1 also shows the results of the experiment conducted to determine the recognition time needed by the proposed features hybrid algorithm compared with the standard fish and lion optimization algorithms. As shown in Table 1, the hybrid algorithm took less time to recognize the voice commands than the standard fish and lion optimization algorithms. That difference is small, whereas the proposed features hybrid algorithm is faster than the earlier algorithms by a wide margin: for the “Open” command, for example, it needed 0.000395 sec against 0.00321 sec for the hybrid algorithm, roughly an eightfold reduction.
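The size of that time difference can be checked directly from the Table 1 values. The short script below divides the hybrid algorithm's recognition time by that of the proposed features hybrid algorithm for each command; the variable names are illustrative and the numbers are copied from Table 1.

```python
# Recognition times in seconds, copied from Table 1.
hybrid = {
    "Open": 0.00321, "Close": 0.00320, "Open door": 0.00419,
    "Close door": 0.003933, "Open window": 0.003981,
    "Close window": 0.004033, "On": 0.004012, "Off": 0.004812,
    "Play": 0.004611, "Stop": 0.00249,
}
proposed = {
    "Open": 0.000395, "Close": 0.000367, "Open door": 0.000435,
    "Close door": 0.000455, "Open window": 0.000501,
    "Close window": 0.000441, "On": 0.000467, "Off": 0.000441,
    "Play": 0.000423, "Stop": 0.000578,
}

# Ratio > 1 means the proposed features hybrid algorithm is faster.
speedup = {cmd: hybrid[cmd] / proposed[cmd] for cmd in hybrid}

for cmd, ratio in speedup.items():
    print(f"{cmd:<13} {ratio:5.1f}x")
```

On these figures the proposed algorithm is roughly 8 to 11 times faster than the hybrid algorithm for nine of the ten commands; “Stop” is the exception at about 4.3 times.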

Table 2 presents the recognition results of each sample using the standard fish and lion optimization algorithm and the hybrid and features hybrid algorithms after removing silent moments using Matlab10.

Table 2.

Recognition time using the standard fish, lion optimization, hybrid, and proposed features hybrid algorithms

Recognition time (sec)

Voice command   Fish      Lion optimization   Hybrid algorithm   Proposed features hybrid algorithm
Open            0.00304   0.00319             0.00191            0.000022
Close           0.00276   0.00279             0.00120            0.000047
Open door       0.00371   0.00457             0.00419            0.0000215
Close door      0.00413   0.005198            0.003933           0.000026
Open window     0.00493   0.005538            0.003981           0.000043
Close window    0.00471   0.004941            0.00403            0.000084
On              0.00143   0.00343             0.00312            0.000023
Off             0.00240   0.00470             0.00112            0.000026
Play            0.00496   0.004995            0.00351            0.0000308
Stop            0.00102   0.00203             0.00123            0.0000251
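The Table 2 measurements were taken after silent moments had been removed from the recordings. The source does not describe how this was done; one common approach, shown here purely as an assumption, is to trim leading and trailing values whose absolute amplitude falls below a threshold. The function name `trim_silence` and the threshold value are hypothetical, and silences inside the utterance are left untouched in this sketch.

```python
def trim_silence(signal, threshold=0.02):
    """Drop leading and trailing values with |amplitude| below `threshold`."""
    start = 0
    end = len(signal)
    while start < end and abs(signal[start]) < threshold:
        start += 1
    while end > start and abs(signal[end - 1]) < threshold:
        end -= 1
    return signal[start:end]


# Hypothetical recording with quiet values at both ends.
trimmed = trim_silence([0.0, 0.01, 0.5, -0.3, 0.0, 0.4, 0.005])
```

Shorter signals mean fewer values to process per sample, which is consistent with the lower times in Table 2, although the source does not break down where the saving comes from.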

Figures 2 to 11 show the input voice command signals and the signals that were recognized.

Figure 2. Open voice command

Figure 3. Close voice command

Figure 4. Open door voice command

Figure 5. Close door voice command

Figure 6. Open window voice command

Figure 7. Close window voice command

Figure 8. On voice command

Figure 9. Off voice command

Figure 10. Play voice command

Figure 11. Stop voice command

V. DISCUSSION

Beni and Wang (1989) [1] reported on the swarm algorithm, focusing on the uses and advantages of swarm algorithms and their importance in intelligence. They related SI to robotic engineering [1]. Garg et al. [3] reported on SI in relation to ant colony optimization, based on previous research.

Wahab et al. reported a comparative analysis of various types of algorithms through experiments conducted on well-known benchmark functions [9]. They explained the basic steps of a generic algorithm briefly and with diagrams, and carried out many statistical tests to determine significant differences in performance. The results were reported diagrammatically to explain the overall advantage of differential evolution (DE) and particle swarm optimization (PSO) [9]. Artificial ant colony optimization was considered, and the particle swarm algorithm, cuckoo search algorithm, and glowworm swarm optimization were also explained with examples [9]. Some researchers reported analyses inspired by nature, such as ant and bee colonies, along with applications of many SI algorithms [16]. Much research has been conducted on SI-based algorithms [16]: over the last two decades, about 9000 research studies have become available on insect- and animal-based algorithms. Some reports were also published on nature-inspired optimization algorithms [17].

Speech recognition has a higher complexity (recognition of input voice and translation of spoken language into text) and a broad application range. Shukla et al. [18] studied speech recognition using different optimization techniques to redesign an artificial neural network. They compared three different algorithms to check which achieved the highest accuracy of voice recognition. All three produced good results, with an average accuracy of 95.3%. A similar report on voice recognition was published by Kaur et al. [19]. Our results are in accordance with these reports.

Swarm algorithms based on HMM have been reported in the literature [20]. These reports discussed improved particle SI algorithms for enhanced voice recognition and suggested a novel voice recognition technique based on a particle SI algorithm and vector quantization. They produced better results than those of Shukla et al., with an accuracy of 97.14%. Other researchers obtained 97.8% accuracy in speech recognition using an advanced feature extraction technique along with the particle SI algorithm [21].

REFERENCES

[1] BENI, G. and WANG, J. (1989) Swarm intelligence in cellular robotic systems. In: Proceedings of the NATO Advanced Workshop on Robots and Biological System, Tuscany, June 1989. New York: North Atlantic Treaty Organization.

[2] BENI, G. (2014) From swarm intelligence to swarm robotics. In: ŞAHIN, E. and SPEARS, W.M. (eds.) Swarm Robotics. SR 2004. Lecture Notes in Computer Science, Vol. 3342. Berlin, Heidelberg: Springer, pp. 1-9.

[3] GARG, A., GILL, P., RATHI, P., AMARDEEP and GARG, K.K. (2009) An insight into swarm intelligence. International Journal of Recent Trends in Engineering, 2 (8), pp. 42-44.

[4] BONABEAU, E., DORIGO, M., and THERAULAZ, G. (1999) Swarm Intelligence: From Natural to Artificial Systems. New York: Oxford University Press.

[5] PACHTER, M. and CHANDLER, P. (1998) Challenges of autonomous control. IEEE Control Systems Magazine, 18 (4), pp. 92-97.

[6] ARKIN, R. (1998) Behavior-Based Robotics. Cambridge, Massachusetts: MIT Press.

[7] REYNOLDS, C. (1987) Flocks, herds and schools: A distributed behavioral model. ACM Siggraph Computer Graphics, 21, pp. 25-34.

[8] SANFILIPPO, F., YNDESTAD, H., and ALALIYAT, S. (2014) Optimisation of boids swarm model based on genetic algorithm and particle swarm optimisation algorithm (comparative study). In: Proceedings of the 28th European Conference on Modelling and Simulation, Brescia, May 2014. European Council for Modeling and Simulation, pp. 643-650.

[9] WAHAB, M., NEFTI-MEZIANI, S., and ATYABI, A. (2015) A comprehensive review of swarm optimization algorithms. PLoS ONE, 10 (5), e0122827.

[10] ZHANG, Y., AGARWAL, V., BALOCHIAN, S., and YAN, J. (2013) Swarm Intelligence and Its Applications. The Scientific World Journal, 2013, 528069.

[11] BREZOČNIK, L., FISTER, I., and PODGORELEC, V. (2018) Swarm intelligence algorithms for feature selection: a review. Applied Sciences, 8 (9), 1521.

[12] AMUDHA, P., KARTHIK, S., and SIVAKUMARI, S. (2015) A hybrid swarm intelligence algorithm for intrusion detection using significant features. Scientific World Journal, 2015, 574589.

[13] CHENG, M. and LIEN, L. (2011) A hybrid swarm intelligence based particle bee algorithm for benchmark functions and construction site layout optimization. In: Proceedings of the 28th International Symposium on Automation and Robotics in Construction, Seoul, June 2011. The International Association for Automation and Robotics in Construction, pp. 898-904.

[14] SINGH, N. and SINGH, S. (2017) Hybrid algorithm of particle swarm optimization and grey wolf optimizer for improving convergence performance. Journal of Applied Mathematics, 2017, 2030489.

[15] SUN, S., LIN, H., and LIU, H. (2011) A hybrid PSO-Viterbi algorithm for HMMs parameters weighting in Part-of-Speech tagging. In: Proceedings of the International Conference of Soft Computing and Pattern Recognition, Dalian, October 2011. Piscataway, New Jersey: Institute of Electrical and Electronics Engineers, pp. 518-522.

[16] CHAKRABORTY, A. and KAR, A.K. (2017) Swarm intelligence: a review of algorithms. In: PATNAIK, S., YANG, X.S., and NAKAMATSU, K. (eds.) Nature-Inspired Computing and Optimization. Modeling and Optimization in Science and Technologies, Vol. 10. Cham: Springer, pp. 475-495.

[17] DHANALAKSHMI, N. (2018) Nature inspired optimization algorithms in artificial neural network for speaker recognition. International Journal of Electrical Engineering & Technology, 9 (3), pp. 114-120.

[18] SHUKLA, S., JAIN, M., and DUBEY, R.K. (2019) Increasing the performance of speech recognition system by using different optimization techniques to redesign artificial neural network. Journal of Theoretical and Applied Information Technology, 97 (8), pp. 2404-2415.

[19] KAUR, G., SRIVASTAVA, M., and KUMAR, A. (2018) Genetic algorithm for combined speaker and speech recognition using deep neural networks. Journal of Telecommunication and Information Technology, 2, pp. 23-31.

[20] SELVARAJ, L. and GANESAN, B. (2014) Enhancing Speech Recognition Using Improved Particle Swarm Optimization Based Hidden Markov Model. Scientific World Journal, 2014, 270576.

[21] KANISHA, B. and BALAKRISHNAN, G. (2016) Speech recognition with advanced feature extraction methods using adaptive particle swarm optimization. International Journal of Intelligent Engineering and Systems, 9 (4), pp. 21-30.

