Histogram of Gradient and Local Binary Pattern with Extreme Learning Machine Based Ear Recognition


JOURNAL OF SOUTHWEST JIAOTONG UNIVERSITY

Vol. 54 No. 6

Dec. 2019

ISSN: 0258-2724 DOI: 10.35741/issn.0258-2724.54.6.31

Research Article

Computer Science and Image Processing

HISTOGRAM OF GRADIENT AND LOCAL BINARY PATTERN WITH EXTREME LEARNING MACHINE BASED EAR RECOGNITION


Ahmed Kawther Hussein

Department of Computer Science, College of Education, Mustansiriyah University, Palestine St., P.O. Box: 14022, Baghdad, Iraq

Abstract

Ear recognition is an attractive research topic in the area of biometrics. It involves building machine learning models that verify the identities of humans from images of their ears. This article explores the performance of ear recognition with two features, local binary pattern and histogram of gradient, on the well-known USTB dataset. The finding is that the two features perform similarly in terms of accuracy but differ in the number of false predictions. The extreme learning machine based on histogram of gradient achieved an accuracy of 99.86%, while the extreme learning machine based on local binary pattern achieved 99.59%.

Keywords: Biometrics, Ear Recognition, Extreme Learning Machine, Local Binary Pattern, Histogram of Gradient

I. INTRODUCTION

Human identification is crucial to the operation of many of today’s technologies and services. A whole field, called biometrics, has emerged around technologies that can perform human identification [1].

In biometrics, various traits are used for human identification: fingerprints, voice, the iris of the eye, gait, etc. Biometrics attracts special interest because it is practical for consumers and harder to hack than passwords and cards [2]. One of the most challenging and unique applications of human identification is ear based identification. The ear has unique physiological and structural aspects. For example, in surveillance systems, detecting and identifying an ear is more feasible than detecting and identifying an eye. Ear recognition can also be applied to the identification of twins [3]. In some models, ears are fused with other parts of the body or other human characteristics to enable multi-modal human identification [2], [3], [4].

Developing an ear based identification system involves several phases: image pre-processing, ear detection, feature design and extraction, and classifier building. Each phase has its challenges. Image pre-processing is concerned with image enhancement, with handling parts of the body or other objects that occlude the ear, such as hair or earrings [5], and with handling illumination changes [12]. Ear detection is crucial for determining the region of interest (ROI). Ear recognition involves machine learning approaches [6]. Therefore, feature design and extraction is regarded as the core of any work related to ear identification [7], and classifier building and training are top priorities in ear recognition systems [1].

The literature contains a wide range of features for vision based recognition systems. Histogram of Oriented Gradients (HOG) and Local Binary Pattern (LBP) are two well-known features that perform well in many pattern recognition applications [8]. In this article, we use HOG and LBP for ear based identification.

II. RELATED WORKS

The literature contains numerous approaches to ear based identification. In [1], a convolutional neural network with a deep topology was designed for the problem of human ear recognition. The approach selected the optimal activation function with the goal of preventing over-fitting. The number of feature maps, the learning rate, and the other network parameters were configured during the training of the network model. Lastly, a human ear recognition test was conducted on the trained network model.

The work of [9] presented local color texture descriptors and evaluated them across several color spaces, using a Support Vector Machine (SVM) as the classifier. The authors concluded that BSIF-RGB shows promising performance.

A pipeline for ear recognition was proposed by [10]. The pipeline is composed of ear detection, ear feature extraction, and matching, and is similar to a complete pattern recognition pipeline. The pipeline approach makes it possible to use arbitrary images of subjects taken in different environments and to recognize the subjects based only on their ears. In [11], block based principal component analysis (PCA) was used to decide whether an ear is occluded and to recognize the subject's ear. In [12], a convolutional neural network based on multiple scale Faster R-CNN was used to detect ears in 2D profile images. In [13], eight types of features were extracted, including Local Binary Pattern, Local Phase Quantization, Binarized Statistical Image Features, Patterns of Oriented Edge Magnitudes, histogram of gradient, scale invariant feature transform, and Gabor features. The authors provided a tool for extracting these features from any ear dataset. In [14], the use of light field cameras for ear recognition was proposed; the authors argued that, because light field cameras are being commercialized, ear recognition based on them is needed. They built the Lenslet Light Field Ear Database, which contains 536 light field images of 67 subjects in 4 different poses, taken with a Lytro ILLUM lenslet light field camera. The database contains challenging cases such as ear images partly occluded by piercings, earrings, hair, or a combination of these. The proposed light field based ear recognition solution takes advantage of the richer spatio-angular information.

III. METHODOLOGY

Our goal is to compare the performance of two classifiers for ear recognition: one trained on HoG features and the other trained on LBP features. An extreme learning machine classifier was used for both. This section describes the methodology for implementing this study.

A. LBP Features

Local binary pattern (LBP) is a type of image feature that concentrates on extracting the texture information in an image [16]. In LBP, a binary code is generated for each pixel to represent the relative change in intensity between the pixel and its surrounding pixels. Next, the frequency of occurrence of each binary code is represented by a histogram. The histogram provides a compact representation of the textural patterns in the image. There are several variants of LBP: typical LBP, uniform LBP, and completed LBP.
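The per-pixel coding and histogram steps can be illustrated with a minimal sketch of the basic 3x3, 8-neighbour LBP. The article does not specify its LBP variant or parameters, so the neighbourhood size, the `>=` comparison, the clockwise bit order, and the function names `lbp_codes` and `lbp_histogram` are assumptions made for illustration only.

```python
import numpy as np

def lbp_codes(image):
    """Return the 8-bit LBP code of every interior pixel of a 2-D grayscale array.

    Assumes the image is at least 3x3; border pixels are skipped.
    """
    img = np.asarray(image, dtype=np.int32)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    # Eight neighbours, visited clockwise starting at the top-left pixel (assumed order).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit where the neighbour is at least as bright as the centre pixel.
        codes |= (neighbour >= center).astype(np.int32) << bit
    return codes

def lbp_histogram(image):
    """Return the normalized 256-bin histogram of LBP codes (the feature vector)."""
    codes = lbp_codes(image)
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()
```

Calling `lbp_histogram` on a cropped ear image would give a 256-dimensional vector whose entries sum to one; uniform or completed LBP would change how codes are grouped into bins.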

B. HoG Feature

The histogram of gradient (HoG) feature is a histogram of the gradient orientations within a zone of the image. The zone may be described in either Cartesian or polar coordinates. To compensate for rotation differences of the strokes within the zone, the histogram is normalized with respect to the dominant gradient orientation, which is taken as the first bin of the histogram. This normalization is important for good matching between the original histogram and the one it is matched against.
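The description above can be sketched for a single zone as follows. This is not the full HOG descriptor with cells and blocks, only an illustration of an orientation histogram that is shifted so the dominant orientation lands in the first bin. The number of bins, the use of unsigned orientations, the magnitude weighting, and the function name `zone_orientation_histogram` are assumptions not stated in the article.

```python
import numpy as np

def zone_orientation_histogram(zone, n_bins=9):
    """Magnitude-weighted histogram of gradient orientations for one image zone."""
    zone = np.asarray(zone, dtype=np.float64)
    gy, gx = np.gradient(zone)                  # derivatives along rows, then columns
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in [0, 180) degrees (assumed convention).
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    bins = np.minimum((angle / (180.0 / n_bins)).astype(int), n_bins - 1)
    hist = np.bincount(bins.ravel(), weights=magnitude.ravel(), minlength=n_bins)
    # Circularly shift so that the dominant orientation becomes the first bin.
    hist = np.roll(hist, -int(np.argmax(hist)))
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist
```

With this shift, a zone containing a horizontal intensity ramp and the same zone transposed (a vertical ramp) produce the same histogram, which is the kind of rotation tolerance the normalization is meant to provide.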

C. Extreme Learning Machine (ELM)

An extreme learning machine trains a feed-forward neural network with a single hidden layer using the concept of least mean square error [14]. It starts with a random initialization of the input-to-hidden weights, after which the hidden layer output matrix is computed. The hidden-to-output weights are then found using the Moore-Penrose inverse. Compared with the usual gradient based training, this approach trains faster and is less prone to getting stuck in local minima.
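The three steps (random input weights, hidden layer output matrix, Moore-Penrose solution for the output weights) can be written compactly. The following is a minimal sketch, assuming one-hot class targets, standard-normal random weights, and a sigmoid activation; the function names `elm_train` and `elm_predict` are hypothetical and not taken from the article.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def elm_train(X, y, n_hidden, n_classes, rng):
    """Train an ELM. X: (n_samples, n_features); y: integer class labels."""
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input-to-hidden weights
    b = rng.standard_normal(n_hidden)                # random hidden biases
    H = sigmoid(X @ W + b)                           # hidden layer output matrix
    T = np.eye(n_classes)[y]                         # one-hot targets (assumed encoding)
    beta = np.linalg.pinv(H) @ T                     # least-squares output weights
    return W, b, beta

def elm_predict(X, model):
    """Return the predicted class index for each row of X."""
    W, b, beta = model
    return np.argmax(sigmoid(X @ W + b) @ beta, axis=1)
```

Only `beta` is learned; `W` and `b` stay at their random values, which is why training reduces to a single pseudo-inverse instead of iterative gradient updates.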

IV. PROCEDURE FOR BUILDING AN EAR BASED RECOGNITION SYSTEM

We state in this section the procedure for building an ear based recognition system using LBP and HoG. The procedure is provided in Table 1.

Table 1.

Procedure of building an ear based recognition system

Input:
    Dataset
    The number of hidden neurons
    The activation function

Output:
    Accuracy

Start
    1. Divide the data into training and testing sets
    2. Extract HoG and LBP features
    3. Normalize
    4. Train an ELM on the HoG features (ELM-HoG) using the number of hidden neurons and the activation function
    5. Train an ELM on the LBP features (ELM-LBP) using the number of hidden neurons and the activation function
    6. Test ELM-HoG using the testing data
    7. Test ELM-LBP using the testing data
    8. Return the testing accuracy
End
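The procedure in Table 1 can be sketched end to end as follows, assuming the HoG and LBP feature vectors have already been extracted into arrays with one row per image. The split ratio, the z-score normalization, the small default number of hidden neurons, and the names `elm_fit_predict` and `run_procedure` are assumptions for illustration; the article does not state these details.

```python
import numpy as np

def elm_fit_predict(X_train, y_train, X_test, n_hidden, rng):
    """Train a sigmoid ELM on the training set and predict labels for the test set."""
    W = rng.standard_normal((X_train.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    hidden = lambda X: 1.0 / (1.0 + np.exp(-(X @ W + b)))
    targets = np.eye(int(y_train.max()) + 1)[y_train]   # one-hot targets
    beta = np.linalg.pinv(hidden(X_train)) @ targets
    return np.argmax(hidden(X_test) @ beta, axis=1)

def run_procedure(feature_sets, labels, n_hidden=50, test_fraction=0.3, seed=0):
    """feature_sets maps a name (e.g. 'HoG', 'LBP') to an (n_samples, n_features) array.

    labels is a NumPy array of integer subject indices. Returns test accuracy per name.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(labels))
    n_test = int(round(test_fraction * len(labels)))
    test_idx, train_idx = order[:n_test], order[n_test:]   # same split for both features
    accuracies = {}
    for name, X in feature_sets.items():
        # Normalize using statistics of the training split only (assumed z-score).
        mean = X[train_idx].mean(axis=0)
        std = X[train_idx].std(axis=0) + 1e-12
        Xn = (X - mean) / std
        pred = elm_fit_predict(Xn[train_idx], labels[train_idx], Xn[test_idx], n_hidden, rng)
        accuracies[name] = float(np.mean(pred == labels[test_idx]))
    return accuracies
```

Using the same train/test split for both feature types keeps the comparison between ELM-HoG and ELM-LBP tied to identical test images.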

V. DATASET

The dataset includes 180 images of 60 subjects, both students and teachers from USTB [15], captured in 3 sessions in July and August 2002. The database contains images of the right ear of each subject. In each session, the images were acquired under different lighting and rotation conditions. Figure 1 shows examples of images from the dataset.

Figure 1. Example of four images from USTB dataset

VI. RESULTS AND DISCUSSION

In order to evaluate both HoG and LBP features, an ELM with 10000 hidden neurons was created. The sigmoid function was used as the activation function. For each type of feature, the values predicted by the ELM were compared with the ground truth. The results are shown in Figures 2 and 3. In addition, we show the confusion matrices in Figures 4 and 5. It was observed that the accuracy of HoG, at 99.86%, was superior to that of LBP, at 99.59%. From the confusion matrices, we observed that the HoG and LBP predictions shared one false identification out of the total number of false identifications. For further exploration, we plotted the accuracy of 10 experiments on LBP and HoG features with the corresponding mean values in Figure 6. It turned out that the mean accuracy of HoG is slightly higher than that of LBP. In order to determine whether the difference is statistically significant, we conducted a t-test on the two sets of accuracies. The resulting p-value was 8.05974E-08, which indicates that HoG is superior to LBP.
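As an illustration of the significance test, the sketch below computes a two-sample t statistic from two lists of per-experiment accuracies. The article does not state which t-test variant was used; Welch's unequal-variance form is an assumption here, and the function name `welch_t` is hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    va = variance(a) / len(a)   # squared standard error of the mean of a
    vb = variance(b) / len(b)   # squared standard error of the mean of b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    dof = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, dof
```

Turning the statistic into a p-value requires the cumulative distribution of Student's t with `dof` degrees of freedom, which a statistics library such as SciPy provides (for example `scipy.stats.ttest_ind` with `equal_var=False` performs the whole test).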

Figure 2. The predicted values vs. ground truth for ELM trained on LBP (x-axis: subject; y-axis: prediction vs. truth)

Figure 3. The predicted values vs. ground truth for ELM trained on HoG (x-axis: subject; y-axis: prediction vs. truth)

Figure 4. The confusion matrix for extreme learning machine and LBP

Figure 5. The confusion matrix for extreme learning machine and HoG

Figure 6. The accuracy of 10 experiments for LBP and HoG features based ELM (series: lbp, hog, meanlbp, meanhog)

VII. CONCLUSION

Ear based recognition identifies people from images of their ears. It attracts researchers because of its potential for high accuracy and the feasibility of deploying it in various environments such as security systems. In this article, we found that both HoG-ELM and LBP-ELM are highly effective and accurate combinations of features and classifier. The resulting accuracy on the USTB dataset is more than 99%. Future work is to explore the performance of other types of features and to use 3D ear images as input.

ACKNOWLEDGMENT

The author would like to thank Mustansiriyah University (www.uomustansiriyah.edu.iq), Baghdad, Iraq, for its support in the present work.

REFERENCES

[1] YING, T., SHINING, W., and WANXIANG, L. (2018) Human ear recognition based on deep convolutional neural network. In: Proceedings of the 2018 Chinese Control and Decision Conference, Shenyang, June 2018. Piscataway, New Jersey: Institute of Electrical and Electronics Engineers, pp. 1830-1835.

[2] SÁNCHEZ, D., MELIN, P., and CASTILLO, O. (2017) Optimization of modular granular neural networks using a firefly algorithm for human recognition. Engineering Applications of Artificial Intelligence, 64, pp. 172-186.

[3] AKIN, C., KACAR, U., and KIRCI, M. (2018) A Multi-Biometrics for Twins Identification Based Speech and Ear. Available from https://arxiv.org/ftp/arxiv/papers/1801/1801.09056.pdf.

[4] FAN, T.Y., MU, Z.C., and YANG, R.Y. (2017) Multi-modality recognition of human face and ear based on deep learning. In: Proceedings of the 2017 International Conference on Wavelet Analysis and Pattern Recognition, Ningbo, July 2017. New Jersey: Institute of Electrical and Electronics Engineers, pp. 38-42.

[5] RAHIM, M.S.M., REHMAN, A., KURNIAWAN, F., and SABA, T. (2017) Ear biometrics for human classification based on region features mining. Biomedical Research, 28 (10), pp. 4660-4664.

[6] ZHANG, Y., MU, Z., YUAN, L., ZENG, H., and CHEN, L. (2017) 3D ear normalization and recognition based on local surface variation. Applied Sciences, 7 (1), 104.

[7] YUAN, L., LIU, W., and LI, Y. (2016) Non-negative dictionary based sparse representation classification for ear recognition with occlusion. Neurocomputing, 171, pp. 540-550.

[8] WANG, X., HAN, T.X., and YAN, S. (2009) An HOG-LBP human detector with partial occlusion handling. In: Proceedings of the 2009 IEEE 12th International Conference on Computer Vision, Kyoto, September-October 2009. New Jersey: Institute of Electrical and Electronics Engineers, pp. 32-39.

[9] BENZAOUI, A. and BOUKROUCHE, A. (2019) Ear Biometric Recognition in Unconstrained Conditions. In: BOYACI, A., EKTI, A., AYDIN, M., and YARKAN, S. (eds.) International Telecommunications Conference. Lecture Notes in Electrical Engineering, Vol. 504. Singapore: Springer, pp. 261-269.

[10] EMERŠIČ, Ž., KRIŽAJ, J., ŠTRUC, V., and PEER, P. (2019) Deep Ear Recognition Pipeline. In: HASSABALLAH, M. and HOSNY, K. (eds.) Recent Advances in Computer Vision. Studies in Computational Intelligence, Vol. 804. Cham: Springer, pp. 333-362.

[11] RATNA KUMARI, V., RAJESH KUMAR, P., and SRINIVASA KUMAR, S. (2019) Occluded Ear Recognition Using Block-Based PCA. In: WANG, J., REDDY, G., PRASAD, V., and REDDY, V. (eds.) Soft Computing and Signal Processing. Advances in Intelligent Systems and Computing, Vol. 898. Singapore: Springer, pp. 569-577.

[12] ZHANG, Y. and MU, Z. (2017) Ear detection under uncontrolled conditions with multiple scale faster region-based convolutional neural networks. Symmetry, 9 (4), 53.

[13] EMERŠIČ, Ž., ŠTRUC, V., and PEER, P. (2017) Ear recognition: More than a survey. Neurocomputing, 255, pp. 26-39.

[14] DING, S., ZHAO, H., ZHANG, Y., XU, X., and NIE, R. (2015) Extreme learning machine: algorithm, theory and applications. Artificial Intelligence Review, 44 (1), pp. 103-115.

[15] PFLUG, A. and BUSCH, C. (2012) Ear biometrics: a survey of detection, feature extraction and recognition methods. IET Biometrics, 1 (2), pp. 114-129.

[16] WANG, Q., LI, B., HOU, Y., and FAN, H. (2018) An Improved LBP Feature for Rail Fastener Identification. Journal of Southwest Jiaotong University, 53 (5), pp. 893-899.
