博士学位論文


博士学位論文 Doctoral Dissertation

内容の要旨及び審査結果の要旨

Dissertation Abstract and Summary of the Dissertation Review Result

第 34 号

The Thirty-Fourth Issue

2019年9月 September, 2019 The University of Aizu


はしがき

博士の学位を授与したので、学位規則（昭和２８年４月１日文部省令第９号）第８条の規定に基づき、その論文の内容の要旨及び論文審査の結果の要旨をここに公表する。

学位記番号に付した｢甲｣は学位規則第４条第１項（いわゆる課程博士）によるものであることを示す。

Preface

On granting the Doctoral Degree to the individuals mentioned below, abstracts of their theses and the thesis review results are hereby publicly announced, in accordance with the provisions of Article 8 of the Ruling of Degrees (Ministry of Education Ordinance No. 9, enacted on April 1, 1953).

The character “甲” at the beginning of the diploma number indicates that the degree has been granted in accordance with Article 4, Paragraph 1 of the Ruling of Degrees (the so-called “Katei Hakase,” that is, a doctoral degree granted by the university at which the grantee was enrolled).


目次 Contents

掲載順 Order | 学位記番号 Diploma No. | 学位 Degree | 氏名 Name | 論文題目 Dissertation Title | 頁 Page

1 | 甲CI博第71号 | 博士（コンピュータ理工学） The Degree of Doctor of Science and Engineering | 赵昊立 ZHAO, Haoli | Trainable Sparse Coding with Lp-Norm-Based Regularization Lpノルムに基づく正則化を用いた学習可能スパースコーディング | 3

2 | 甲CI博第72号 | 博士（コンピュータ理工学） The Degree of Doctor of Science and Engineering | NGUYEN VAN, Duc | Viewport-adaptive Streaming of 360-degree Video over Networks ネットワーク上360度動画用ビューポート対応ストリーミング | 6

3 | 甲CI博第73号 | 博士（コンピュータ理工学） The Degree of Doctor of Science and Engineering | 黄華錕 HUANG, Huakun | Machine Learning Approaches for Device-Free Localization デバイスフリーローカリゼーションのための機械学習アプローチ | 9

4 | 甲CI博第74号 | 博士（コンピュータ理工学） The Degree of Doctor of Science and Engineering | 于建国 YU, Jianguo | Multimodal Information Fusion based on Deep Learning ディープラーニングに基づくマルチモーダル情報融合の研究 | 12

5 | 甲CI博第75号 | 博士（コンピュータ理工学） The Degree of Doctor of Science and Engineering | 趙凌君 ZHAO, Lingjun | Classification for Device-free Localization based on Deep Neural Networks ディープニューラルネットワークに基づくデバイスフリーな位置推定のための分類 | 15

6 | 甲CI博第76号 | 博士（コンピュータ理工学） The Degree of Doctor of Science and Engineering | VU HUY, The | Algorithms and Architectures for Spiking Neuromorphic Systems スパイキングニューロモルフィックシステムのためのアルゴリズム及びアーキテクチャ | 18

7 | 甲CI博第77号 | 博士（コンピュータ理工学） The Degree of Doctor of Science and Engineering | 王鈺 WANG, Yu | Models and Algorithms for Efficient Data Processing in Fog Computing Supported Disaster Areas 災害地におけるフォグコンピューティング基盤の効率化のためのモデルとアルゴリズム | 23

8 | 甲CI博第78号 | 博士（コンピュータ理工学） The Degree of Doctor of Science and Engineering | 周璐 ZHOU, Lu | Continuous Authentication and Lightweight Implementation of Elliptic-Curve Cryptography for the Internet of Things IoT向けの継続的ユーザ認証および楕円曲線暗号の軽量化実装に関する研究 | 26

Name 氏名: ZHAO, Haoli 赵昊立

The relevant degree 学位の種類: Doctoral degree (in Computer Science and Engineering) 博士（コンピュータ理工学）

Number of the diploma of the Doctoral Degree 学位記番号: 甲CI博第71号

The Date of Conferment 学位授与日: September 18, 2019 2019年9月18日

Requirements for Degree Conferment 学位授与の要件: Please refer to Article 5 of the “University Regulation on University Degrees” 会津大学学位規程第5条該当

Dissertation Title 論文題目: Trainable Sparse Coding with Lp-Norm-Based Regularization Lpノルムに基づく正則化を用いた学習可能スパースコーディング

Dissertation Review Committee Members 論文審査委員

The University of Aizu, Associate Prof. LI, X. (Chief Referee)

The University of Aizu, Senior Associate Prof. ZHU, X.

The University of Aizu, Associate Prof. OKUYAMA, Y.

The University of Aizu, Senior Associate Prof. MARKOV, K.

Guilin University of Electronic Technology, Prof. DING, S.

会津大学准教授李想（主査）

会津大学上級准教授朱欣

会津大学准教授奥山祐市

会津大学上級准教授コンスタンティンマルコフ

桂林電子科学技術大学教授丁数学


Abstract

Sparse representation, which aims at finding appropriate sparse representations of data with an overcomplete dictionary set, has been proven to be a powerful tool for analysis and processing of various signals. Performance of sparse representation mainly depends on a well-defined dictionary and an appropriate sparse constraint for corresponding data.

In this thesis, we emphasize the importance of the l_p norm (0<p<1) in three directions of sparse representation to show its potential for enhancing sparsity and accuracy. First, we consider a dictionary learning problem with l_p norm (0<p<1) regularization. To handle the nonconvexity of the l_p norm (0<p<1), we introduce a weighted l_1 norm as a convex approximation of the l_p norm and combine it with a hierarchically alternating update strategy. Because the absolute value function is not differentiable at zero, we further propose to approximate it with logarithmic and hyperbolic functions. We present an efficient algorithm for learning a dictionary with the weighted l_1 norm as the sparsity constraint, which alternates between two phases: sparse coding and dictionary update. The algorithm is robust to noise: when the signal-to-noise ratio (SNR) is higher than 10 dB, dictionaries are recovered at a rate of nearly 100%, and the algorithm outperforms other classical dictionary learning algorithms.
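As a rough illustration of the weighted l_1 idea described above, the following minimal Python sketch applies iteratively reweighted soft-thresholding in the simplest setting of an identity dictionary. The weight formula w_i = p(|x_i| + eps)^(p-1), the log-cosh smoothing of the absolute value, and all function names are assumptions chosen for illustration; the update rules actually used in the thesis may differ.

```python
import math


def soft_threshold(v, t):
    """Proximal operator of t*|.|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def lp_weights(x, p, eps):
    """Assumed reweighting: w_i = p * (|x_i| + eps)^(p - 1), so that
    sum_i w_i * |x_i| locally approximates sum_i |x_i|^p."""
    return [p * (abs(xi) + eps) ** (p - 1.0) for xi in x]


def reweighted_l1_denoise(y, lam, p=0.5, eps=1e-3, n_outer=10):
    """Approximately minimize 0.5*||y - x||^2 + lam * sum |x_i|^p
    by solving a sequence of convex weighted l_1 problems."""
    x = list(y)
    for _ in range(n_outer):
        w = lp_weights(x, p, eps)
        # With an identity dictionary each weighted l_1 subproblem
        # has a closed-form solution: per-entry soft-thresholding.
        x = [soft_threshold(yi, lam * wi) for yi, wi in zip(y, w)]
    return x


def smooth_abs(x, a=50.0):
    """Assumed smooth surrogate of |x|: log(cosh(a*x)) / a.
    It is differentiable at zero; intended for moderate |a*x| only."""
    return math.log(math.cosh(a * x)) / a


if __name__ == "__main__":
    print(reweighted_l1_denoise([3.0, 0.05, -2.0, 0.01], lam=0.1))
    print(smooth_abs(1.0), smooth_abs(0.0))
```

Small entries receive large weights and are driven to zero, while large entries receive small weights and are shrunk less than under a plain l_1 penalty, which is the sparsity-enhancing behavior attributed to the l_p norm.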

Then, we propose a family of l_p norm (0<p<1) based Deep Neural Network structured Sparse Coding (DNN-SC) algorithms. DNN-SC uses the parameter training methodology of Recurrent Neural Networks (RNNs) to train the parameters of a truncated sparse coding algorithm. An encoder with well-trained parameters can perform as well as converged sparse coding algorithms while markedly improving efficiency. We show how to unfold the Iterative Half Thresholding Algorithm (IHTA) and the Weighted Iterative Shrinkage Thresholding Algorithm (WISTA) to form feed-forward neural networks whose parameters can be learned by back-propagation. Moreover, by changing the loss function, all DNN-SC algorithms can be trained in either a supervised or an unsupervised scheme. We validate the proposed algorithms in synthetic data experiments and image denoising experiments. We show that DNN-SC algorithms can denoise 360×480-pixel grayscale video at 25 frames/s in real time using only a CPU.
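The unfolding idea can be sketched in a few lines of Python. The example below unfolds plain ISTA (not IHTA or WISTA, which the thesis uses) into a fixed number of feed-forward layers and checks that, before any training, the network reproduces the same number of ISTA iterations. The parameter names `We`, `S` and `theta`, the toy dictionary, and the step constant `L` are assumptions for illustration; training by back-propagation is omitted.

```python
def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def soft(v, t):
    """Soft-thresholding, the shrinkage nonlinearity of each layer."""
    return v - t if v > t else v + t if v < -t else 0.0


def init_layers(D, lam, L, n_layers):
    """Initialize every layer from the dictionary D (m x n):
    We = D^T / L, S = I - D^T D / L, theta = lam / L.
    In a trained network each layer would own and update its copy."""
    Dt = transpose(D)
    n = len(Dt)
    We = [[d / L for d in row] for row in Dt]
    S = [[(1.0 if i == j else 0.0)
          - sum(a * b for a, b in zip(Dt[i], Dt[j])) / L
          for j in range(n)] for i in range(n)]
    return [{"We": We, "S": S, "theta": lam / L} for _ in range(n_layers)]


def unfolded_forward(y, layers):
    """Feed-forward pass: x <- soft(We y + S x, theta), layer by layer."""
    x = [0.0] * len(layers[0]["S"])
    for layer in layers:
        b = matvec(layer["We"], y)
        z = matvec(layer["S"], x)
        x = [soft(bi + zi, layer["theta"]) for bi, zi in zip(b, z)]
    return x


def ista(y, D, lam, L, n_iter):
    """Reference ISTA for 0.5*||y - D x||^2 + lam*||x||_1."""
    Dt = transpose(D)
    x = [0.0] * len(Dt)
    for _ in range(n_iter):
        r = [yi - di for yi, di in zip(y, matvec(D, x))]
        g = matvec(Dt, r)
        x = [soft(xi + gi / L, lam / L) for xi, gi in zip(x, g)]
    return x


if __name__ == "__main__":
    D = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]  # toy overcomplete dictionary
    y = [1.0, 0.2]
    print(unfolded_forward(y, init_layers(D, lam=0.1, L=2.0, n_layers=5)))
    print(ista(y, D, lam=0.1, L=2.0, n_iter=5))
```

Because the layer count is fixed, the cost of encoding is known in advance, which is what makes a trained truncated network attractive for real-time use.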

Furthermore, we show how a well-posed sparsity constraint affects performance on highly correlated data by introducing the l_p norm (0<p<1) into the regularization part of the Independently Interpretable Lasso (IILasso). IILasso proposes a new regularization based on the strategy of selecting uncorrelated variables. We introduce the l_p norm (0<p<1) into the regularization part of IILasso to further enhance the performance of sparse coding on highly correlated data. We propose to solve the optimization problem with the new regularization using the coordinate descent algorithm with a weighted l_1 norm and the proximal operator. We then validate the proposed algorithms on synthetic data and highly correlated gene expression data, and show that they can present reasonable suggestions for diseases and developmental stages from gene expression. Finally, we further improve the efficiency of the independently interpretable algorithms by building DNN-SC versions of them. We show how to unfold IILasso, II-ISTA and IIWLasso to form feed-forward neural networks whose parameters can be learned by back-propagation. We then validate the proposed algorithms in synthetic data experiments and show that the proposed DNN-SC algorithms yield reasonably good sparse representations with better efficiency on highly correlated data.
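To make the "selecting uncorrelated variables" strategy concrete, here is a minimal Python sketch of a correlation-aware penalty. It assumes the penalty takes the form lam * (||beta||_1 + (alpha/2) * |beta|^T R |beta|), with R built from absolute pairwise correlations and a zero diagonal; this form, the choice of R, and the function names are assumptions for illustration and are not taken from the abstract. The l_p extension described above is not included.

```python
def correlation_penalty_matrix(corr):
    """Assumed choice of R: R[j][k] = |corr[j][k]| for j != k, 0 on the diagonal."""
    n = len(corr)
    return [[abs(corr[j][k]) if j != k else 0.0 for k in range(n)]
            for j in range(n)]


def correlation_aware_penalty(beta, R, lam, alpha):
    """lam * (sum_j |beta_j| + (alpha / 2) * sum_{j,k} |beta_j| R[j][k] |beta_k|).
    The second term grows when correlated variables are active together."""
    a = [abs(b) for b in beta]
    n = len(a)
    l1 = sum(a)
    cross = sum(a[j] * R[j][k] * a[k] for j in range(n) for k in range(n))
    return lam * (l1 + 0.5 * alpha * cross)


if __name__ == "__main__":
    corr = [[1.0, 0.9, 0.0],
            [0.9, 1.0, 0.0],
            [0.0, 0.0, 1.0]]
    R = correlation_penalty_matrix(corr)
    # Selecting the two correlated variables costs more than
    # selecting two uncorrelated ones with the same magnitudes.
    print(correlation_aware_penalty([1.0, 1.0, 0.0], R, lam=1.0, alpha=1.0))
    print(correlation_aware_penalty([1.0, 0.0, 1.0], R, lam=1.0, alpha=1.0))
```

Under this assumed form, two solutions with the same l_1 norm are distinguished by how correlated their active variables are, which pushes the solver toward uncorrelated selections.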

Summary of the Dissertation Review Result

The contributions of the dissertation in the area of sparse representation were recognized and confirmed by the committee in the final review. Moreover, the committee agreed with the responses given in the final review to the issues raised in the preliminary review, including the concerns about the research vision, the difference from existing weighted-norm methods, and how to choose the value of p.

Furthermore, the applicant has shown good scholastic aptitude and sufficient English ability in his dissertation and through his oral presentation. In summary, the committee members unanimously agreed that the candidate passes his doctoral thesis review with high quality in both publication record and oral presentation.

The research contributes to the sparse representation of signals and concentrates on three subareas: dictionary learning, deep neural network structured sparse coding (DNN-SC), and sparse coding for highly correlated data. It introduces l_p norm regularization in these three subareas to develop several algorithms and validates their performance on both synthetic and real-world data.

The three main contributions of the dissertation presented during the final review are listed as follows.

1. We have introduced the non-convex l_p norm (0<p<1) regularization in dictionary learning for sparse representation. By using appropriate approximations, we make the non-convex optimization problem convex and smooth. We have validated in experiments that the proposed dictionary learning algorithm, Hierarchical Dictionary Learning with Weighted l_1 norm (HDLWL), performs better than other dictionary learning algorithms;

2. We have built l_p norm (0<p<1) based Deep Neural Network structured Sparse Coding (DNN-SC) algorithms on the Weighted Iterative Shrinkage Thresholding Algorithm (WISTA) and the Iterative Half Thresholding Algorithm (IHTA). We have validated that DNN-SCs trained without supervision can significantly improve efficiency in image denoising tasks and achieve online video denoising;

3. We have developed efficient sparse coding algorithms for highly correlated data by introducing the l_p norm (0<p<1) into the regularization part of the Independently Interpretable Least Absolute Shrinkage and Selection Operator (IILASSO). We have also validated that l_p norm (0<p<1) based independently interpretable sparse coding algorithms can enhance performance on highly correlated data in both synthetic experiments and gene expression experiments.

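Name 氏名: NGUYEN VAN, Duc（グエンヴァンドゥック）

The relevant degree 学位の種類: Doctoral degree (in Computer Science and Engineering) 博士（コンピュータ理工学）

Number of the diploma of the Doctoral Degree 学位記番号: 甲CI博第72号

Dissertation Title 論文題目: Viewport-adaptive Streaming of 360-degree Video over Networks ネットワーク上360度動画用ビューポート対応ストリーミング

Dissertation Review Committee Members 論文審査委員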

The University of Aizu, Senior Associate Prof. TRUONG, C.T. (Chief Referee)

The University of Aizu, Prof. PHAM, A.

The University of Aizu, Prof. PAIK, I.

The University of Aizu, Prof. COHEN, M.

会津大学上級准教授コンタンチョオン（主査）

会津大学教授アントゥアンファン

会津大学教授白寅天

会津大学教授マイケルコーエン


Abstract

Virtual Reality (VR) is changing the way we play, learn, and communicate by offering “immersive experiences”. Yet, truly immersive VR places extreme requirements on visual quality, sound quality, and interactions. To bring VR to users, new technologies for the entire VR delivery chain are required.

In this dissertation, I focus on viewport-adaptive streaming of 360-degree video, which is the most important content type in VR applications. Specifically, three key aspects of tiling-based viewport-adaptive streaming are investigated: 1) server-based adaptation, 2) client-based adaptation, and 3) adaptive tiling.

First, a server-based adaptation framework for tiling-based viewport-adaptive streaming is proposed. The tile selection problem is formulated as an optimization problem with a new quality metric that takes into account the visible portion of each tile. Two tile version selection options are devised, considering viewport estimation errors and user head movements. Experimental results show that the proposed approach can improve the average viewport quality by up to 3.8 dB while reducing the standard deviation of viewport quality by up to 1.1 dB. The impacts of segment duration and buffer size are also investigated. It is found that long segment durations and large buffer sizes can greatly reduce the performance of tile version selection methods.
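A toy version of the tile version selection problem can be written as follows. The sketch weights each tile's quality by its visible fraction and uses a simple greedy upgrade heuristic under a bitrate budget. The greedy rule, the shared version ladder, and all names and numbers are assumptions for illustration; they are not the optimization method or quality metric of the dissertation.

```python
def select_tile_versions(visible, versions, budget):
    """Pick one version per tile to favor visible-area-weighted quality.

    visible  : visible fraction of each tile in the estimated viewport
    versions : list of (bitrate, quality_dB), ascending, shared by all tiles
    budget   : total bitrate available for the segment
    Returns (chosen version index per tile, total bitrate used).
    """
    n = len(visible)
    choice = [0] * n                      # start every tile at the lowest version
    used = n * versions[0][0]
    if used > budget:
        raise ValueError("budget cannot cover the lowest version of all tiles")
    while True:
        best, best_gain = None, 0.0
        for i in range(n):
            k = choice[i]
            if k + 1 >= len(versions):
                continue                  # already at the highest version
            d_rate = versions[k + 1][0] - versions[k][0]
            d_quality = versions[k + 1][1] - versions[k][1]
            if used + d_rate > budget:
                continue                  # upgrade does not fit
            gain = visible[i] * d_quality / d_rate
            if gain > best_gain:          # invisible tiles (gain 0) are never upgraded
                best, best_gain = i, gain
        if best is None:
            return choice, used
        used += versions[choice[best] + 1][0] - versions[choice[best]][0]
        choice[best] += 1


def viewport_quality(visible, versions, choice):
    """Visible-area-weighted average quality of the selected versions."""
    total = sum(visible)
    return sum(v * versions[k][1] for v, k in zip(visible, choice)) / total


if __name__ == "__main__":
    visible = [0.6, 0.4, 0.0, 0.0]
    versions = [(1, 30.0), (2, 36.0), (4, 40.0)]
    choice, used = select_tile_versions(visible, versions, budget=9)
    print(choice, used, viewport_quality(visible, versions, choice))
```

In this toy setting the bitrate budget is spent on the tiles that actually appear in the viewport, while tiles outside it stay at the lowest version.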

Second, a client-based adaptation framework for viewport adaptive streaming that can support different application scenarios is presented. The proposed framework supports estimation of instant bitrate/quality. A set of adaptation methods and their various options are evaluated to show the benefits of bitrate/quality estimations. Experimental results show that the proposed framework can effectively improve the viewport quality compared to conventional systems.

Third, a solution that dynamically adapts the tiling scheme on the fly according to user head movements is proposed. Experimental results show that the proposed solution can improve the average viewport quality by up to 2.3 dB compared to a fixed tiling solution. It is also found that the 4x3 tiling scheme results in the lowest performance and thus should not be used in practice.

Summary of the Dissertation Review Result

Virtual Reality (VR) is changing the way we play, learn, and communicate by offering “immersive experiences”. The focus of this research is on viewport-adaptive streaming of 360-degree video, which is the most important content type in VR applications. Specifically, three key contributions are presented. In the first contribution, a server-based adaptation framework for tiling-based viewport-adaptive streaming is proposed. The tile selection problem is formulated as an optimization problem with a new quality metric that takes into account the visible portion of each tile. Two tile version selection options are devised, considering viewport estimation errors and user head movements. Experimental results show that the proposed approach can improve the average viewport quality by up to 3.8 dB while reducing the standard deviation of viewport quality by up to 1.1 dB. In the second contribution, a client-based adaptation framework for viewport-adaptive streaming that can support different application scenarios is presented. The proposed framework supports estimation of instant bitrate and quality. A set of adaptation methods and their various options are evaluated to show the benefits of bitrate/quality estimation. Experimental results show that the proposed framework can improve the viewport quality by up to 8 dB compared to conventional systems. In the third contribution, a solution that dynamically adapts the tiling scheme on the fly according to user head movements is proposed. Experimental results show that the proposed solution can improve the average viewport quality by up to 2.3 dB compared to fixed tiling solutions. It is also found that the 4x3 tiling scheme results in the lowest performance and thus should not be used in practice.

The three contributions above have been published as three major journal papers. The candidate is the first author of all these papers. In the final review, the candidate successfully improved his presentation and dissertation according to the review committee's comments. In conclusion, the candidate has fulfilled all of the formal requirements for the doctoral degree.

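Name 氏名: HUANG, Huakun 黄華錕

The relevant degree 学位の種類: Doctoral degree (in Computer Science and Engineering) 博士（コンピュータ理工学）

Number of the diploma of the Doctoral Degree 学位記番号: 甲CI博第73号

Dissertation Title 論文題目: Machine Learning Approaches for Device-Free Localization デバイスフリーローカリゼーションのための機械学習アプローチ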

Dissertation Review Committee Members 論文審査委員

The University of Aizu, Associate Prof. SU, C. (Chief Referee)

The University of Aizu, Associate Prof. PEI, Y.

The University of Aizu, Prof. MORI, K.

The University of Aizu, Associate Prof. LI, X.

Guilin University of Electronic Technology, Prof. DING, S.

会津大学准教授蘇春華（主査）

会津大学准教授裴岩

会津大学教授森和好

会津大学准教授李想

桂林電子科学技術大学教授丁数学


Abstract

Internet-of-Things (IoT) networks have spawned an extensive set of emerging applications in industrial, domestic, infrastructure, consumer, and commercial spaces. Among the various IoT-based applications, wireless localization and machine failure detection still face demanding requirements for high accuracy and efficiency, which hinders their development in fields such as smart homes and smart factories. To solve these problems, this thesis takes advantage of machine learning for classification and transforms the localization problem and the failure-detection problem into corresponding classification problems. Machine learning algorithms, including sparse coding and deep learning, are then developed to improve classification accuracy and efficiency.

First, for wireless localization, in order to achieve accurate and efficient device-free localization (DFL), we formulated DFL as a sparse representation classification (SRC) problem, presented a sparse model, and conducted sparse coding in the signal subspace to locate targets, which led to two algorithms: sparse coding via the iterative shrinkage-thresholding algorithm (SC-ISTA) and subspace-based SC-ISTA (SSC-ISTA). Experimental results showed that, for locating a single target, the proposed SC-ISTA was robust to noisy testing data, with an accuracy of almost 100%. Even when the original data was reduced to a low dimension, SSC-ISTA still achieved the same localization accuracy as SC-ISTA for both single-target and multi-target localization.

Further, in order to achieve more accurate and robust DFL, we exploited the l_2,1 norm as the regularizer and devised an optimization method based on the proximal operator, which led to the proposed block-sparse-coding algorithm, BSCPO. Compared with conventional work that employed the l_0 norm or the l_1 norm as the regularizer, the proposed approach has the advantage of generating group sparsity and improving the joint sparsity of the sparse solution. Experimental results on a real-world dataset and on our real testbeds showed that, in both indoor and outdoor DFL, the proposed approach outperformed state-of-the-art methods that adopt the l_0 norm or the l_1 norm. The proposed BSCPO algorithm showed robust target localization performance in severely noisy environments.
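The proximal operator of the l_2,1 norm has a well-known closed form, row-wise block soft-thresholding, which the short Python sketch below implements. Treating rows as groups and the function name are assumptions for illustration; how BSCPO defines its blocks and embeds this step in the full optimization is not specified in the abstract.

```python
import math


def prox_l21(X, t):
    """Proximal operator of t * ||X||_{2,1}, with rows as groups.

    Each row x is scaled by max(0, 1 - t / ||x||_2): rows with a small
    l_2 norm vanish entirely, the rest are shrunk toward zero as a block.
    """
    out = []
    for row in X:
        norm = math.sqrt(sum(v * v for v in row))
        scale = max(0.0, 1.0 - t / norm) if norm > 0.0 else 0.0
        out.append([scale * v for v in row])
    return out


if __name__ == "__main__":
    X = [[3.0, 4.0],    # strong row: kept and shrunk as a block
         [0.1, 0.1]]    # weak row: removed entirely
    print(prox_l21(X, t=1.0))
```

Unlike entry-wise soft-thresholding for the l_1 norm, whole rows are kept or discarded together, which is the group (joint) sparsity effect described above.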

Finally, we studied failure detection for Self-driving network (SelfDN) based edge IoT networks. We first reviewed recent work on applying failure detection techniques to IoT networks. We then transformed the failure detection problem into a classification problem and proposed an enhanced deep neural network to achieve accurate detection. Real trace-driven experiments showed that the proposed scheme achieved a detection accuracy of about 89% and outperformed five other machine learning algorithms. Some open issues have also been summarized for future research. We hope this work can inspire further studies on topics related to SelfDN based edge intelligence IoT networks.


Summary of the Dissertation Review Result

The candidate's study proposes accurate and efficient methods to solve the wireless localization problem and the machine failure detection problem. For wireless localization, to achieve accurate and efficient device-free localization (DFL), the thesis formulates DFL as a sparse representation classification problem, presents a sparse model, and conducts sparse coding in the signal subspace to locate targets. The proposed algorithms achieve high localization accuracy for both single-target and multi-target localization. Regarding failure detection for self-driving-network based edge IoT networks, the thesis transforms the failure detection problem into a classification problem and proposes an enhanced deep neural network to achieve accurate detection. Experimental results show that the proposed approach outperforms other commonly used machine learning approaches. The thesis is excellent in terms of originality, theory, and experimental evaluation.

During the final doctoral dissertation review, the candidate, Mr. Huakun Huang, showed good scholastic aptitude and good English language ability in both his presentation and his dissertation. In addition, the candidate properly addressed and responded to all the comments and requests discussed in the preliminary thesis review. Moreover, two major journal papers related to the dissertation have been published. Overall, the review committee unanimously agreed that the candidate, who demonstrated great ability to do research independently, passed the final review and fulfilled all the requirements for the doctoral degree.

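Name 氏名: YU, Jianguo 于建国

The relevant degree 学位の種類: Doctoral degree (in Computer Science and Engineering) 博士（コンピュータ理工学）

Number of the diploma of the Doctoral Degree 学位記番号: 甲CI博第74号

Dissertation Title 論文題目: Multimodal Information Fusion based on Deep Learning ディープラーニングに基づくマルチモーダル情報融合の研究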

The University of Aizu, Senior Associate Prof. MARKOV, K. (Chief Referee)

The University of Aizu, Prof. SUGIYAMA, M.

The University of Aizu, Prof. ZHAO, Q.

The University of Aizu, Prof. WILSON, I.

会津大学上級准教授コンスタンティンマルコフ（主査）

会津大学教授杉山雅英

会津大学教授趙強福

会津大学教授イアンウイルソン


Abstract

To survive in this complex world, we have to constantly obtain information about events happening around us. A modality is a particular form of a signal from which we can extract information about an event, and the information about the same event can come in many modalities. This thesis concerns deep learning-based multimodal learning approaches.

Chapter 1 describes the motivations and applications of multimodal learning, the challenges we are facing, and the basic approaches and related studies.

Chapter 2 presents the most commonly used representations for different modalities, i.e., acoustic information, articulatory information, video information, image information, and text information, as well as their pre-processing and feature extraction approaches.

Chapter 3 presents the basic knowledge about deep neural networks, i.e. feedforward neural networks, recurrent neural networks (LSTM and GRU implementations), convolutional neural networks, autoencoders, correlational neural networks, and word embedding.

Chapter 4 first presents the basic knowledge of ASR systems, i.e. the GMM-HMM system, and the DNN-HMM hybrid speech recognition system, then gives a brief review of related studies of acoustic and articulatory information integration. After that, the details of our proposed methods are presented, i.e. acoustic RNN training using generalized distillation and joint inversion training followed by the experiments, results, and analysis.

Chapter 5 presents the basic knowledge of personality recognition: the Big Five personality traits, the details of our low-level and high-level feature extraction, the multi-stage training strategy, the formulation of our objective functions, and the neural network settings. The experiments, results, and analysis are then reported.

Chapter 6 summarizes the overall principles of multimodal information fusion, along with its advantages and disadvantages. Directions for future research are also discussed.
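The generalized distillation used for acoustic model training in Chapter 4 can be sketched with a simple loss function. In the sketch below, a teacher that sees both acoustic and articulatory features provides softened targets for a student that sees acoustic features only. The mixing weight `lam`, the `temperature`, the toy logits, and the function names are assumptions for illustration; the abstract does not give the exact objective used in the thesis.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target, probs):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0.0)


def distillation_loss(student_logits, teacher_logits, hard_label,
                      lam=0.5, temperature=2.0):
    """(1 - lam) * CE(hard label, student) + lam * CE(soft teacher target, student).

    The teacher logits are assumed to come from a model trained on
    acoustic plus articulatory input; the student sees acoustic input only.
    """
    student = softmax(student_logits)
    soft_target = softmax(teacher_logits, temperature)
    one_hot = [1.0 if i == hard_label else 0.0
               for i in range(len(student_logits))]
    return ((1.0 - lam) * cross_entropy(one_hot, student)
            + lam * cross_entropy(soft_target, student))


if __name__ == "__main__":
    student_logits = [2.0, 0.5, -1.0]   # hypothetical per-class scores
    teacher_logits = [3.0, 1.0, -2.0]
    print(distillation_loss(student_logits, teacher_logits, hard_label=0))
```

With `lam = 0` the loss reduces to ordinary supervised training, and with `lam = 1` the student imitates the teacher only, so `lam` controls how strongly the articulatory-informed teacher regularizes the acoustic-only model.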

Contributions

Proposed an ASR system that uses acoustic and articulatory information as a regularization to guide the training of a model using only acoustic features.

Proposed an ASR system that jointly trains a hybrid model combining an inversion model and an acoustic model.

Proposed a system to automatically learn the text representation for APR, which is combined with author information.

Proposed an architecture that automatically captures speaking styles for AAPR.

Proposed a training strategy that deals with the different convergence speeds of multiple modalities.

Summary of the Dissertation Review Result

The dissertation presents a framework for integrating multiple information sources (modalities) using deep neural networks for several machine learning tasks. First, for automatic speech recognition (ASR), the author proposed two deep neural network structures in which articulatory data are combined with the conventional acoustic data for improved speech recognition performance. Both approaches achieved superior recognition rates compared to existing methods on the XRMB database. Second, the author developed several deep neural network structures for the personality recognition task. As input data, he used multiple sources, such as text, speech, and video. The novelty of his approach was the use of speaking style extraction and modeling. Based on this model, the system he developed achieved state-of-the-art (SOTA) results on the Apparent Personality Recognition database. Another contribution of the author is a new training scheme that allows neural network models for multiple modalities to be trained jointly, overcoming the effect of different convergence speeds, which usually impedes training. This scheme achieved the best performance among all published studies on this task.

In the final review, the candidate presented his work for 60 minutes, followed by 50 minutes of questions and discussion. The committee reviewed the submitted dissertation and the responses to the questions raised after the preliminary review and was satisfied with the answers. All members of the committee agreed to confirm the significance of the dissertation for a PhD degree.

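Name 氏名: ZHAO, Lingjun 趙凌君

The relevant degree 学位の種類: Doctoral degree (in Computer Science and Engineering) 博士（コンピュータ理工学）

Number of the diploma of the Doctoral Degree 学位記番号: 甲CI博第75号

Dissertation Title 論文題目: Classification for Device-free Localization based on Deep Neural Networks ディープニューラルネットワークに基づくデバイスフリーな位置推定のための分類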

Dissertation Review Committee Members 論文審査委員

The University of Aizu, Associate Prof. SU, C. (Chief Referee)

The University of Aizu, Prof. PAIK, I.

The University of Aizu, Senior Associate Prof. TOMIOKA, Y.

The University of Aizu, Associate Prof. LI, X.

Guilin University of Electronic Technology, Prof. DING, S.

会津大学准教授蘇春華（主査）

会津大学教授白寅天

会津大学上級准教授富岡洋一

会津大学准教授李想

桂林電子科学技術大学教授丁数学


Abstract

Wireless sensor networks (WSNs) have spawned a variety of emerging applications in recent years, such as location-based services in smart cities, patient and elderly healthcare monitoring in smart homes, and mobile robot localization in smart factories. In conventional device-based wireless localization techniques, e.g., the global positioning system (GPS), the target must carry a wireless device or tag. However, this may not be applicable to some emerging scenarios. For example, in security safeguarding, one usually cannot pre-equip an intruder with a traceable device to monitor the intruder's location. To address this kind of problem, a new kind of wireless localization technology, named device-free localization (DFL), has been proposed.

In contrast to conventional localization technology, DFL does not require targets to carry any electronic devices. Therefore, it has recently attracted extensive attention. This thesis presents contributions to the field of DFL based on deep neural networks (DNNs). The organization and main contributions of this dissertation are briefly summarized as follows.

Chapter 1 first offers an introduction to and background on the DFL problem. It then gives an overview of DNNs and some other architectures, including restricted Boltzmann machines (RBMs), the autoencoder (AE), and the convolutional neural network (CNN), which are used to solve the DFL problems in this thesis.

Chapter 2 describes an outdoor DFL experiment and an accurate approach based on a pre-trained AE to address the DFL problem. In this algorithm, multiple Gaussian-Bernoulli RBMs (GBRBMs), a variant of RBMs, are utilized for the pre-training of the AE. The GBRBMs-based AE (GBRBMs-AE) automatically extracts discriminative features from the received signal strength indicator (RSSI) measurements to characterize the target’s location. In addition, this algorithm is also used for dimensionality reduction, which reduces the effects of noise as well as the time cost of localization.

Chapter 3 describes two real testbeds designed for indoor DFL scenarios and an outdoor DFL experiment. In addition, a pre-processing method, named background elimination (BE), is employed to extract the variation component with distinguishing features from the collected RSSI measurements. In this chapter, the DFL problem is formulated as an image classification problem; moreover, Chapter 3 presents a BE pre-processing based CNN (BE-CNN) to locate targets in both indoor and outdoor DFL. According to the experimental results, for both indoor and outdoor DFL, the proposed BE-CNN can accurately locate the target even when the environment is noisy.
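The background elimination step and the image formulation can be illustrated with a small Python sketch. It assumes that BE subtracts an empty-scene RSSI matrix from the current measurement and that the absolute variation is scaled to [0, 1] to form a link-by-link image for a CNN. Both assumptions, together with the function names and the toy numbers, are for illustration only; the exact pre-processing in the thesis may differ.

```python
def background_elimination(rssi, background):
    """Subtract the empty-scene RSSI (dBm) from the measured RSSI, link by link."""
    return [[m - b for m, b in zip(m_row, b_row)]
            for m_row, b_row in zip(rssi, background)]


def to_image(variation):
    """Scale the absolute variation to [0, 1] so it can be fed to a CNN
    as a single-channel image (rows: transmitters, columns: receivers)."""
    peak = max(abs(v) for row in variation for v in row)
    if peak == 0.0:
        return [[0.0 for _ in row] for row in variation]
    return [[abs(v) / peak for v in row] for row in variation]


if __name__ == "__main__":
    background = [[-50.0, -60.0],   # hypothetical RSSI with no target present
                  [-60.0, -50.0]]
    measured = [[-50.0, -70.0],     # hypothetical RSSI with a target blocking links
                [-65.0, -50.0]]
    variation = background_elimination(measured, background)
    print(variation)
    print(to_image(variation))
```

Links whose signal strength changes most when a target is present stand out as bright pixels, which is what allows the localization task to be treated as image classification.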

In order to improve the anti-interference ability of the DFL approach, Chapter 4 presents an accurate and robust DFL approach based on a convolutional autoencoder (CAE). The proposed approach performs unsupervised feature extraction from raw signals followed by supervised fine-tuning for classification. The CAE combines the advantages of a CNN and an AE in feature learning and signal reconstruction. The experimental results on a real-world dataset show that the proposed approach achieves localization performance superior to that of the other DFL approaches compared, especially when the environment is very noisy. In addition, the average testing time is 4 ms, which is sufficiently fast for online localization.

Finally, Chapter 5 summarizes our contributions in this thesis and presents some directions for further research.

Summary of the Dissertation Review Result

So far, many researchers have paid attention to a promising wireless localization technology called device-free localization (DFL), which can locate targets without requiring them to carry any electronic devices. The key purpose of DFL is to locate the target with high accuracy and high efficiency. The main concern of Ms. Zhao’s doctoral dissertation is to solve the DFL problem by employing deep neural networks. The novelty of the proposed schemes for solving the DFL problem in this research is sufficient for a good dissertation.

In addition, the doctoral dissertation contains three contributions by Ms. Zhao, including a major conference paper published at the IEEE flagship Conference on Systems, Man, and Cybernetics (SMC) and two major journal papers published in Symmetry and the IEEE Internet of Things Journal. As the first author of the three publications, Ms. Zhao fulfills the requirements of the doctoral dissertation review.

In both the preliminary review and the final review, Ms. Zhao showed good scholastic aptitude in her dissertation and great English language ability in her oral presentation. Moreover, Ms. Zhao addressed the major questions from the referees well and carefully revised the final draft of the dissertation according to the referees' comments. In summary, the quality of the contributions was confirmed to be significant, and the committee members unanimously agreed that the candidate passed her doctoral dissertation review with high quality in both the thesis and the oral presentation.

The committee agreed that, as a result of the thesis review, the thesis has been recognized as qualified for the conferment of an academic degree.

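Name 氏名: VU HUY, The ヴーフィテー

The relevant degree 学位の種類: Doctoral degree (in Computer Science and Engineering) 博士（コンピュータ理工学）

Number of the diploma of the Doctoral Degree 学位記番号: 甲CI博第76号

Dissertation Title 論文題目: Algorithms and Architectures for Spiking Neuromorphic Systems スパイキングニューロモルフィックシステムのためのアルゴリズム及びアーキテクチャ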

The University of Aizu, Prof. BEN, A. (Chief Referee)

The University of Aizu, Prof. MIYAZAKI, T.

The University of Aizu, Prof. TSUKAHARA, T.

The University of Aizu, Prof. KITAMICHI, J.

Keio University, Prof. AMANO, H.

会津大学教授アブデラゼクベンアブダラ（主査）

会津大学教授宮崎敏明

会津大学教授束原恒夫

会津大学教授北道淳司

慶應義塾大学教授天野英晴


Abstract

Inventing a powerful machine like the human brain has been a driving force in computing for decades. The von Neumann architecture has been considered a clear standard for such a system. However, the significant differences in organization, power consumption requirements, and computational power between the von Neumann architecture and a biological brain have led to the creation of alternative architectures. Brain-inspired computing, or neuromorphic computing, is a biologically inspired approach built from highly connected neurons, not only to model neuroscience theories but also to solve machine learning problems. The term neuromorphic was first introduced by Carver Mead in 1990, where it referred to very large-scale integration (VLSI) with analog components that mimic biological neural systems.

In recent years, artificial neural networks (ANNs) with efficient learning methods (e.g., backpropagation) have shown remarkable improvements in accuracy (even better than human-level) for large-scale visual/auditory recognition and classification tasks. In particular, the convolutional neural network (CNN) and the recurrent neural network (RNN) have proven to be promising tools for a wide range of applications involving images, video, and speech. To reach such achievements, however, state-of-the-art neural networks tend to greatly increase their number of layers and size (i.e., deep learning). Consequently, they require hardware platforms with a huge amount of computation and power consumption. On the other hand, spiking neural networks (SNNs) were proposed not only to efficiently mimic the behavior of biological neurons but also to make neuromorphic systems extremely power-efficient, with tens of pJ per connection.

However, implementing a scalable interneuron communication architecture is one of the major challenges for hardware-based SNNs. The architecture must sustain the huge amount of traffic generated by the massive number of neurons and synapses accommodated on the neural computation units.

Furthermore, since the arrival time of spikes is used to encode information, timing violations in such a communication architecture affect the overall performance of SNNs. A shared bus is a poor choice of communication medium for a large-scale, complex SNN chip or system, because adding neurons decreases the communication capacity of the chip and, as the shared bus grows longer, may affect the neurons' firing rate. Moreover, neural connectivity grows nonlinearly, too quickly to be implemented directly with a dedicated point-to-point communication scheme. Two-dimensional packet-switched network-on-chip (2D-NoC) has been considered a potential solution to the interconnection problems of previously proposed SNNs built on a shared communication medium. However, such interconnect strategies make it difficult to achieve a high level of parallelism and scalability with low power consumption, especially in large-scale SNN chips.
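To make the scaling argument concrete, the short Python sketch below compares the number of dedicated wires needed to connect every pair of nodes point-to-point with the number of links in a 2D mesh of the same size. All-to-all connectivity is an assumed worst case chosen purely for illustration, and the function names are hypothetical; neither is taken from the dissertation.

```python
def point_to_point_links(n):
    """Dedicated wires needed to connect every pair of n nodes."""
    return n * (n - 1) // 2


def mesh2d_links(x, y):
    """Bidirectional links in an x-by-y mesh (nearest neighbors only)."""
    return (x - 1) * y + x * (y - 1)


# Point-to-point wiring grows quadratically with the node count,
# while the number of mesh links grows roughly linearly.
for side in (2, 4, 8):
    n = side * side
    print(n, "nodes:", point_to_point_links(n), "dedicated wires vs",
          mesh2d_links(side, side), "mesh links")
```

The mesh keeps the wiring cost per node roughly constant, at the price of multi-hop routing, which is why the routing algorithm becomes the central design question.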

We also consider three-dimensional networks-on-chip (3D-NoCs), which take advantage of 3D integrated circuits (3D-ICs) and mesh-based networks-on-chip (NoCs), opening up a promising architecture for SNNs. They offer the scalability and parallelism of NoCs, enhanced in the third dimension thanks to the short wire length and low power consumption of 3D-IC interconnects.

Consequently, 3D-NoCs are considered to be one of the most advanced and suitable interconnects for SNN systems, offering extremely high bandwidth, efficient scalability, and low power. However, to exploit these advantages for SNNs, a 3D-NoC demands an efficient multicast routing algorithm that can handle a heavy traffic pattern in which a presynaptic neuron sends spikes to a subset of postsynaptic ones.

Furthermore, due to the complex nature of 3D-ICs and the continuing shrinkage of semiconductor components, 3D-NoC-based systems are becoming susceptible to a variety of faults. In SNNs especially, when connections are faulty, the postsynaptic neuron becomes silent or near-silent (i.e., its firing rate is reduced). This may degrade overall system performance.

Starting from the facts mentioned above, this dissertation proposes algorithms and architectures for spiking neural network systems based on 3D-NoC (3DNoC-SNN). First, a performance assessment for 3DNoC-SNN is presented that analyzes system performance under different spiking neural network topologies and spike routing methods (i.e., unicast, multicast, and broadcast), both with and without faults in the system. This analytical model is intended for analyzing the system architecture early, before actual implementation. Second, this dissertation proposes novel multicast spike routing algorithms that combine k-means clustering with a tree-based routing method. Adopting k-means as a partitioning method helps balance the overall traffic and thereby improves system performance. Moreover, a fault-tolerant multicast routing algorithm is proposed to deal with connection faults in the system, in which primary and backup routing paths are pre-defined. When faults appear in the primary route, routers redirect incoming spike packets to the backup path. This reduces recovery overhead and average latency, and enables the system to avoid timing violations in SNNs. Finally, the architecture and hardware design of the proposed 3DNoC-SNN system are presented and evaluated, both to assess the proposed methods and to compare the results with the analytical model.
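As a rough illustration of how k-means partitioning can be combined with tree-based multicast, the Python sketch below clusters the destination nodes of one spike, picks the destination closest to each cluster centroid as a branch point, and joins the source, branch points, and destinations with dimension-order (X, then Y, then Z) paths. The mesh coordinates, the choice of branch point, the XYZ routing, and all function names are assumptions made for this example; the dissertation's actual algorithms may differ.

```python
import random


def xyz_path(src, dst):
    """Dimension-order path (X, then Y, then Z) as a list of directed links."""
    links, cur = [], list(src)
    for axis in range(3):
        step = 1 if dst[axis] > cur[axis] else -1
        while cur[axis] != dst[axis]:
            nxt = cur.copy()
            nxt[axis] += step
            links.append((tuple(cur), tuple(nxt)))
            cur = nxt
    return links


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on 3D coordinates; returns non-empty clusters."""
    rng = random.Random(seed)
    k = min(k, len(points))
    centroids = rng.sample(points, k)
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Recompute each centroid; keep the old one if a cluster is empty.
        centroids = [
            tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return [cl for cl in clusters if cl]


def multicast_links(src, dests, k):
    """Union of links in a two-level tree: source -> branch point -> members."""
    links = set()
    for cluster in kmeans(dests, k):
        centroid = tuple(sum(c) / len(cluster) for c in zip(*cluster))
        # Hypothetical choice: branch at the member nearest the centroid.
        root = min(cluster, key=lambda p: sum(abs(a - b) for a, b in zip(p, centroid)))
        links.update(xyz_path(src, root))
        for d in cluster:
            links.update(xyz_path(root, d))
    return links


src = (0, 0, 0)
dests = [(3, 0, 0), (3, 1, 0), (0, 3, 1), (1, 3, 1)]
tree = multicast_links(src, dests, k=2)
print(len(tree), "links used by the multicast tree")
```

Because the links are collected in a set, a link shared by several destinations is counted once, which is the saving that a tree offers over sending one unicast packet per destination.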

概要

数十年にわたり、人間の脳のような強力な計算機を発明することがコンピュータの分野においての原動力とされてきた。フォン・ノイマン型アーキテクチャは、これらのようなシステムにおいて、明らかな基準とされている。しかし、その構成における重大な違いである、電力消費量、生物の脳に比べたフォン・ノイマン型アーキテクチャの計算能力は、新たなアーキテクチャの創出につながった。脳に想起された、もしくは、脳の構造を模した計算システムという新たな計算手段は、高度に結びついている神経細胞から創出され、人間の脳構造を模した理論を形成するだけでなく、機械学習における問題を解くことにつながっている。“Neuromorphic”という専門用語は１９９０年にCarver Meadにより最初につくられたものであり、それはアナログな部品を付帯した超大規模集積回路(VLSI)による生物的神経細胞システムの模倣を指したことばからきている。

近年では、人工神経細胞ネットワーク(ANNs)と、誤差逆伝搬法のような効率的な学習手法が、大規模な視覚・聴覚的認識と分類において、精度の観点から、顕著な功績を示しており、それはときに人間のレベルを凌駕することもあった。特に、畳み込みニューラルネット(CNN) と再帰型ニューラルネットは、画像、動画、音声のような幅広い分野における有望なツールとしての成果を示している。著しい成果、最先端なニューラルネットに達するとき、そこには深層ニューラルネットとよばれるような、深く増加された層や大きさのネットワークが形成される。結果として、それらは大規模な計算量と消費電力を必要とするハードウェアプラットフォームを必要とする。一方で、スパイキングニューラルネット(SNN)は、生物の神経細胞を効率的に模倣するだけでなく、非常に電力効率の良い脳の構造を模したシステムの構成（一つの結びつきにつき、数十pJ程度）を可能にする。

しかしながら、拡張可能な神経細胞の通信アーキテクチャを実装することは、ハードウェアを基盤としたSNNの実装における大きな課題となっている。そのアーキテクチャは、膨大なニューロンとその計算に用いられる接続部における、大規模な通信網の制御性を維持することが必要とされる。さらに、スパイクの到達時間はデータの加工に用いられ、タイミング違反はSNNの処理全体に影響を与えてしまう。通信手段としての共有バスは、ニューロン数を増やすことは通信容量を減少させることにつながるため、大規模で複雑な SNN 回路/システムの実装において乏しい選択であり、共有バスの長さを増加させることからそれはニューロンの発火率に影響するとされる。さらに、ニューロンの接続における非線形的増加はとても著しく、ポイント・ツー・ポイント型の通信に適用されるような直接の接続は実装できない。二次元パケットスイッチ型ネットワーク・オン・チップ（NoC）は、先で述べられたような共有バスを媒体とした SNN の実装における相互通信問題に対する潜在的な解決策として考えられている。しかしながら、そのような相互通信における戦略は、高い並列性と拡張性および大規模な SNN 回路における低消費電力を獲得する上で大きな困難を要することにつながる。

私たちは、SNNにおいて有望なアーキテクチャとされる３次元階層における集積に利点を持つ３次元ネットワーク・オン・チップについても考える。それらは、３次元化することにより縮小されたワイヤーのおかげで、ネットワーク・オン・チップにおいての拡張性と並列性を提供する。結果として、非常に高い帯域幅と効率的な拡張性、低消費電力により、３次元型 NoC は SNNのシステムにおいて最も適しているとされるものの中の一つである。しかしながら、SNNにおいてこのような利点を受けるために、３次元型NoCは、高度な交通形態に対処するための、効率的なマルチキャストルーティングアルゴリズムを必要とする。さらに、３次元型NoCの複雑性と継続的なセミコンダクタ部品の縮小により、３次元型 NoCを元としたシステムは様々な欠陥に対して影響を受けやすくなってしまう。特にSNN において、接続に欠陥が生じた場合、ポストシナプティックなニューロンは、発火率の減少に見られるような、静もしくはほとんど静な状態になってしまう。これはシステム全体の性能を低下させてしまう。

以上に述べられた事実をはじめ、この論文では３D-NoCをベースとしたスパイキングニューラルネット(3D-NoC-SNN)のためのアルゴリズムとアーキテクチャを提案する。第一に、３DNoC-SNNの性能評価は、ユニキャスト、マルチキャスト、ブロードキャストのようなスパイクルーティングの方法、そしてシステムに発生する欠陥のあるかないかにおいて、異なるトポロジーのSNNとの比較で行われた。第二に、この論文は画期的な、k-平均法とツリーベースのルーティング方法の組み合わせによるマルチキャストスパイクルーティングアルゴリズムについて提案する。分割方法として k-平均法を採用することは、全体的にバランスの取れた交通を可能にし、システムの性能を向上させることにつながった。さらに、フォールトトレラントなマルチキャストルーティングアルゴリズムは、主要な経路とバックアップ経路を設けることにより、システムにおける接続の欠落に対処することにも役立てられた。主要ルートにおいて欠陥が現れた時、ルーターはバックアップ経路を用いて、入力スパイクのパッケージを切り替える。これは復旧にかかるオーバーヘッド、平均遅延を削減し、システムがタイミング違反を回避することを可能とした。最後に、アーキテクチャとハードウェアの設計と提案された3D-NoC-SNNシステムは、解析的な比較により提案された仕事についての評価を示した。

Summary of the Dissertation Review Result

The dissertation makes three main contributions. (1) A performance assessment for 3DNoC-SNN. The assessment provides an analytical model of system performance under different spiking neural network topologies and spike routing methods (i.e., unicast, multicast, and broadcast), both with and without faults in the system. The goal is an efficient and accurate assessment that exposes the advantages and drawbacks of candidate neural network topologies early, before actual hardware development of the SNN system. (2) Multicast spike routing algorithms for 3DNoC-SNN. In SNNs, a neuron needs to send its output spikes to thousands of others. In addition, neurons have different spiking operation modes with different spike rates. As a result, an efficient multicast routing method is in high demand. The thesis proposes novel routing algorithms that combine k-means clustering with a tree-based routing method. Adopting k-means as a partitioning method helps balance the overall traffic and thereby improves system performance. (3) A fault-tolerant routing mechanism to deal with link faults in the 3DNoC-SNN system. In SNNs, when faults occur in an inter-neuron connection, the postsynaptic neuron becomes silent because it does not receive enough inputs (spikes) from presynaptic ones. To address this issue, the thesis proposes a new fault-tolerant routing algorithm that pre-defines primary and backup routing paths. When faults appear in the primary route, routers redirect incoming spike packets to the backup path. This reduces recovery overhead and average latency, and enables the system to avoid timing violations in SNNs.
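The primary/backup idea in contribution (3) can be sketched in a few lines of Python. The example assumes, purely for illustration, that the primary path uses X-Y-Z dimension-order routing and the backup uses Z-Y-X; these orderings, the function names, and the fault set are hypothetical and are not taken from the thesis. In this toy scheme the two paths share no link whenever source and destination differ in at least two dimensions; when they differ in only one, both paths coincide and the sketch cannot bypass a fault.

```python
def dimension_order_path(src, dst, order):
    """List the directed links visited when correcting one axis at a time."""
    links, cur = [], list(src)
    for axis in order:
        step = 1 if dst[axis] > cur[axis] else -1
        while cur[axis] != dst[axis]:
            nxt = cur.copy()
            nxt[axis] += step
            links.append((tuple(cur), tuple(nxt)))
            cur = nxt
    return links


def select_path(primary, backup, faulty_links):
    """Use the primary path unless it crosses a faulty link."""
    if not any(link in faulty_links for link in primary):
        return primary
    if not any(link in faulty_links for link in backup):
        return backup
    return None  # both pre-defined paths are broken


src, dst = (0, 0, 0), (2, 1, 1)
primary = dimension_order_path(src, dst, order=(0, 1, 2))  # X, then Y, then Z
backup = dimension_order_path(src, dst, order=(2, 1, 0))   # Z, then Y, then X
faulty = {primary[1]}  # inject a fault on the second primary link
chosen = select_path(primary, backup, faulty)
print("backup used:", chosen is backup)
```

Because both paths are computed ahead of time, the switch at run time is a lookup rather than a route search, which is consistent with the low recovery overhead the summary describes.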

The committee evaluated the significance of the dissertation by reviewing the candidate’s doctoral thesis and by listening to his presentations. During the Q&A sessions, the candidate answered all questions asked by the review committee. The committee judges that the body of work accomplished by the candidate is relevant and essential to the scientific community. In conclusion, the dissertation contains sufficient contributions and results and is recognized as qualifying for the conferment of a doctoral degree.

Name

氏名

WANG, Yu 王鈺

The relevant degree

学位の種類

甲CI博第77号

論文題目

Models and Algorithms for Efficient Data Processing in Fog Computing Supported Disaster Areas

災害地におけるフォグコンピューティング基盤の効率化のためのモデルとアルゴリズム

The University of Aizu, Associate Prof. WANG, J. (Chief Referee)

The University of Aizu, Associate Prof. LI, P.

The University of Aizu, Prof. PHAM, A.

The University of Aizu, Senior Associate Prof. TRUONG, C. T.

会津大学准教授王軍波（主査）

会津大学准教授李鵬

会津大学教授アントゥアンファン

会津大学上級准教授コンタンチョオン

Abstract

Mobile devices are commonplace today, and smartphones have enabled big data analytics. The spread of mobile phones makes it easy to collect data on human activities and to generate large quantities of mobile data, such as health data, commuting routes, or restaurant recommendations. Big data analysis is especially important in disaster scenarios, where it can be used to detect obstructions and dispatch rescue teams in a timely manner.

Communication infrastructure can be destroyed by a disaster, especially by earthquakes that occur in or near major cities and cause extensive damage. It is critical to restore the communication system immediately after a disaster occurs. Movable base stations (MBSs) can be deployed to establish an emergency communication network after a disaster.

Such an emergency communication network brings new challenges for big data analytics, because in the general case big data is analyzed in a cloud center, where high-performance PCs and servers save processing time. The transmission delay becomes extremely large when data is collected from mobile phones and sent to the cloud center over an MBS-based network. To cope with the volume of data collected, small amounts of data need to be processed before they reach the server.

In this thesis, we first proposed a set of delay models for spatial data processing networks that specifically target disaster-stricken areas. We also implemented a genetic algorithm (GA) solution, which achieved a lower maximum end-to-end (E2E) delay across various network sizes. We studied several realistic constraints, tested some of the conventional methods in a MATLAB simulator, and modeled their worst-case delays. The results showed that none of the conventional methods matched the performance of the GA under increased computation or transmission rates.

Second, because the GA solution required significant computation time, we proposed the disaster area adaptive delay minimization (DAADM) algorithm, which is designed to run in real time. We also presented a detailed mathematical model of data processing and transmission in an emergency communication network (ECN) fog network, together with a proof that minimizing the overall delay is NP-hard. We evaluated the systems across various transmission speeds, processing speeds, and network sizes, and measured their calculation time, accuracy, and percent error. The evaluation showed that the proposed DAADM algorithm can be implemented in a real-time system, a major advantage over the GA.

All emergency communication networks have performance drawbacks. To improve network performance further, we proposed an algorithm that determines the best way to connect the MBSs. We assume that each fog node carries traffic that must be processed in addition to the data used by the algorithm, so the system needs to account for queueing delay at both the processor and the transmitter. The delay models were run in a genetic algorithm, which solved for a delay-optimized configuration of the network. The results show that the GA outperforms the greedy algorithm.

Finally, the proposed architectures and algorithms were simulated in MATLAB and achieved effective data processing and transmission in disaster scenarios. The results show that the proposed system achieves higher performance with minimal delay for the overall system and can run in real time. The resulting system addresses the problem of network topology and optimizes both processing delay and transmission delay. It is more suitable for deployment than existing systems.

Summary of the Dissertation Review Result

This dissertation focuses on various issues that arise in wirelessly networked disaster areas (WNDAs). Traditional networks can transfer large amounts of data because their wired connections can meet the demand of all users without any need for methods that reduce the data size. WNDAs are much slower but can be the only viable option after disasters such as earthquakes. The candidate targeted three main issues of WNDA-based networks.

First, she addressed the problem of determining how much data to process at each node. This is an important factor because processing too much of the data delays the results by keeping them in the fog nodes, while sending too much data can overload the system and cause delays. Determining the ideal value was proven to be NP-hard, so a genetic algorithm (GA) was applied to obtain the best results possible. The GA started from random solutions and repeatedly crossbred the best ones until the results plateaued.
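The trade-off between processing data locally and sending it onward, and a genetic search that stops once the best result plateaus, can be sketched as follows. The delay model here is a deliberately simple toy chosen for illustration, not the model from the dissertation: each fog node processes a fraction of its data locally, processed data shrinks by an assumed factor `shrink`, and all nodes share one uplink. Every name and parameter value is hypothetical.

```python
import random


def delay(fractions, data, proc_rate, uplink_rate, shrink):
    """Toy end-to-end delay: slowest local processing plus shared-uplink transfer."""
    proc = max(f * d / p for f, d, p in zip(fractions, data, proc_rate))
    sent = sum(f * d * shrink + (1 - f) * d for f, d in zip(fractions, data))
    return proc + sent / uplink_rate


def genetic_search(data, proc_rate, uplink_rate, shrink,
                   pop_size=40, patience=30, seed=0):
    """Evolve per-node processing fractions until the best delay plateaus."""
    rng = random.Random(seed)
    n = len(data)

    def cost(ind):
        return delay(ind, data, proc_rate, uplink_rate, shrink)

    # Start from random solutions.
    pop = [[rng.random() for _ in range(n)] for _ in range(pop_size)]
    best, stale = min(pop, key=cost), 0
    while stale < patience:
        pop.sort(key=cost)
        parents = pop[: pop_size // 2]  # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            i = rng.randrange(n)  # mutate one gene, clipped to [0, 1]
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.1)))
            children.append(child)
        pop = parents + children
        candidate = min(pop, key=cost)
        if cost(candidate) < cost(best) - 1e-9:
            best, stale = candidate, 0
        else:
            stale += 1  # no improvement this generation
    return best, cost(best)


data = [40.0, 25.0, 60.0]      # data held at each fog node (arbitrary units)
proc_rate = [5.0, 2.0, 8.0]    # local processing rates (units per second)
best, best_delay = genetic_search(data, proc_rate, uplink_rate=10.0, shrink=0.2)
print([round(f, 2) for f in best], round(best_delay, 2))
```

The `patience` counter is one simple way to express "until the results plateau"; a fixed generation budget would be an equally reasonable stopping rule.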

Second, she improved upon the previous solution by creating a real-time adaptive solution. The accuracy and error rates of the real-time solution were compared with the results obtained by the GA. Even though the real-time solution ran 236 times faster than the GA, it still maintained an accuracy of 99%.

Third, because users are often located where they could be served by any of several base stations, she developed an algorithm that allocates each user to the base station that yields the best system performance.

The committee reviewed the submitted dissertation and responses carefully. In evaluating her research, aptitude, and presentation capabilities, we determined that her level in each category was sufficient to obtain a Ph.D. degree. As a result, the committee members unanimously voted to affirm the significance of her research as sufficient for her Ph.D. degree.
