Adaptive Frame-Rate Control for Low Energy Object Tracking Systems

Kyushu University Institutional Repository

Inoue, Yusuke

http://hdl.handle.net/2324/2236258

Publication information: Kyushu University, 2018, Doctor of Engineering, course doctorate

Version:

Rights:


Adaptive Frame-Rate Control for Low Energy Object Tracking Systems

Yusuke Inoue

A DISSERTATION

Kyushu University

March, 2019


Abstract

Object tracking, which pursues target objects in consecutively captured images (called frames), has rapidly grown in importance as technology increasingly supports daily life. Object tracking estimates an object's behavior and position and is classified into two types: offline and online. Offline object tracking is executed after videos are recorded. It is used for post analysis, e.g., feature extraction from surveillance cameras and encoding for video compression.

In contrast, online object tracking is executed at the same time as recording. Its purpose is to use the features from videos immediately, and it is applied to autofocus, obstacle detection, and so on. Online object tracking has become more important as emerging real-time applications, e.g., advanced driving-assist systems and augmented reality, are developed. Since online object tracking is often incorporated into battery-driven systems, not only improving tracking accuracy but also reducing energy consumption is important.

Online object tracking mainly involves two energy-consuming processes: image capturing and object tracking. Employing low-power devices such as CMOS (Complementary Metal-Oxide-Semiconductor) image sensors and processing units is a well-known approach to improving energy efficiency. However, such device-level local optimizations have a potential limitation: they cannot take into account the energy trade-off between the two processes. For example, if an object tracking system runs at a high frame-rate, the energy for obtaining frames increases, whereas the energy for object tracking decreases. This is because the distance the tracked object moves between frames becomes small, which reduces the number of comparisons with the templates required for the tracking process.


The first contribution of this thesis is an energy-oriented adaptive frame-rate optimization method based on the moving speed of the target object. The impact of frame-rate on the total energy of an online object tracking system is analyzed theoretically, and it is argued that, for a given object speed, there is an optimal frame-rate that minimizes the total energy consumption. In a simulation evaluation targeting objects in actual videos, the method reduced energy consumption by up to 74.8%, and by 53.9% on the benchmark average, compared with a conventional object tracking method with a fixed frame-rate.

The second contribution is an accuracy-oriented frame-rate control method that improves tracking accuracy. Reducing the frame-rate according to the movement of an object reduces energy consumption, but tracking accuracy deteriorates. When the estimate of the object speed is inaccurate, it is difficult to continue tracking. Because speed estimation fails when the object speed changes significantly at a low frame-rate, an algorithm with control parameters that strongly throttle frame-rate reduction in such cases is introduced. The evaluation shows that the extended algorithm can dramatically improve object tracking accuracy.

The third contribution of this thesis is an extension of the accuracy-oriented frame-rate control algorithm that improves the energy reduction rate. Although the speed of a target object changes from moment to moment, the accuracy-oriented frame-rate control method sets the frame-rate with fixed control parameters. To improve energy efficiency, the parameters must be tuned to the video characteristics. Therefore, a method of adaptively changing these control parameters according to the object speed is proposed. Evaluation results demonstrate that the dynamic parameter tuning achieves a 12.7% energy reduction compared with the original accuracy-oriented frame-rate control method.


List of Figures

2.1 Components of object tracking system.

2.2 Search area size

2.3 Dynamic search area determination

3.1 Relationship between energy consumption and frame-rate.

3.2 Relationship between energy consumption, frame-rate, and object speed.

4.1 Walkthroughs of the (a) conventional and (b) proposed method.

4.2 Flowchart of the velocity based frame-rate control.

4.3 Simulation flow

4.4 Definition of FLA (Frame Level Accuracy)

4.5 Tracker Benchmark v1.0

4.6 Tracking accuracy results.

4.7 Energy consumption results.

4.8 Transitions of actual object speed (Top), estimated object speed, FLA, search-area size, frame-rate, on (A) FIX and (B) ADAPT in Dog1.

4.9 Transitions of actual object speed (Top), estimated object speed, FLA, search-area size, frame-rate, on (A) FIX and (B) ADAPT in Doll.

4.10 Transitions of actual object speed (Top), estimated object speed, FLA, search-area size, frame-rate, on (A) FIX and (B) ADAPT in Dudek.

4.11 Transitions of actual object speed (Top), estimated object speed, FLA, search-area size, frame-rate, on (A) FIX and (B) ADAPT in FaceOcc1.

4.12 Transitions of actual object speed (Top), estimated object speed, FLA, search-area size, frame-rate, on (A) FIX and (B) ADAPT in FaceOcc2.

4.13 Transitions of actual object speed (Top), estimated object speed, FLA, search-area size, frame-rate, on (A) FIX and (B) ADAPT in FleetFace.

4.14 Transitions of actual object speed (Top), estimated object speed, FLA, search-area size, frame-rate, on (A) FIX and (B) ADAPT in Sylvester.

5.1 Classification of object speed transitions.

5.2 Frame-rate control with energy and accuracy aware parameter settings.

5.3 Flowchart of the velocity-variation based frame-rate control.

5.4 Tracking accuracy results.

5.5 Energy consumption results.

5.6 Transitions of actual object speed (Top), estimated object speed, FLA, search-area size, frame-rate, on (A) ADAPT and (B) ADAPT_AC in Dog1.

5.7 Transitions of actual object speed (Top), estimated object speed, FLA, search-area size, frame-rate, on (A) ADAPT and (B) ADAPT_AC in Doll.

5.8 Transitions of actual object speed (Top), estimated object speed, FLA, search-area size, frame-rate, on (A) ADAPT and (B) ADAPT_AC in Dudek.

5.9 Transitions of actual object speed (Top), estimated object speed, FLA, search-area size, frame-rate, on (A) ADAPT and (B) ADAPT_AC in FaceOcc1.

5.10 Transitions of actual object speed (Top), estimated object speed, FLA, search-area size, frame-rate, on (A) ADAPT and (B) ADAPT_AC in FaceOcc2.

5.11 Transitions of actual object speed (Top), estimated object speed, FLA, search-area size, frame-rate, on (A) ADAPT and (B) ADAPT_AC in FleetFace.

5.12 Transitions of actual object speed (Top), estimated object speed, FLA, search-area size, frame-rate, on (A) ADAPT and (B) ADAPT_AC in Sylvester.

5.13 Impact on α to tracking accuracy.

5.14 Impact on α to energy consumption.

5.15 Impact on α to frame-rate.

5.16 Transitions of object speed (a), accuracy and frame-rate at α = 1 (b), α = 10 (c) in Dog1.

5.17 Transitions of object speed (a), accuracy and frame-rate at α = 1 (b), α = 10 (c) in Dudek.

5.18 Impact of N_t to tracking accuracy.

5.19 Impact of N_t to energy consumption.

6.1 Parameters analysis results (accuracy)

6.2 Parameters analysis results (energy)

6.3 Flowchart of dynamic parameter selection for frame-rate control.

6.4 Results of tracking accuracy.

6.5 Results of energy consumption.

6.6 Transitions of actual object speed (Top), estimated object speed, FLA, search-area size, frame-rate, on (A) ADAPT_AC and (B) ADAPT_PARAM in Dog1.

6.7 Transitions of actual object speed (Top), estimated object speed, FLA, search-area size, frame-rate, on (A) ADAPT_AC and (B) ADAPT_PARAM in Doll.

6.8 Transitions of actual object speed (Top), estimated object speed, FLA, search-area size, frame-rate, on (A) ADAPT_AC and (B) ADAPT_PARAM in Dudek.

6.9 Transitions of actual object speed (Top), estimated object speed, FLA, search-area size, frame-rate, on (A) ADAPT_AC and (B) ADAPT_PARAM in FaceOcc1.

6.10 Transitions of actual object speed (Top), estimated object speed, FLA, search-area size, frame-rate, on (A) ADAPT_AC and (B) ADAPT_PARAM in FaceOcc2.

6.11 Transitions of actual object speed (Top), estimated object speed, FLA, search-area size, frame-rate, on (A) ADAPT_AC and (B) ADAPT_PARAM in FleetFace.

6.12 Transitions of actual object speed (Top), estimated object speed, FLA, search-area size, frame-rate, on (A) ADAPT_AC and (B) ADAPT_PARAM in Sylvester.

List of Tables

3.1 Parameters for energy analysis

4.1 Frame size and average template size in each benchmark

Chapter 1 Introduction

1.1 Background

To solve critical social problems, such as the declining birth rate and aging population, the use of advanced information technology has been spreading. For example, technical keywords such as big data, AI (Artificial Intelligence), and IoT (Internet of Things) have become popular, and related computing technologies, such as cloud computing, AI-accelerated computing, and real-time computing, have been progressing dramatically. In particular, edge computing, in which computing nodes are distributed in target fields, is emerging as a next-generation computing platform. A representative edge device consists of sensors that obtain data from the physical (or real) world and computing units for data processing; for instance, a smart camera has an image sensor and an embedded CPU [23, 75]. Since edge devices tend to be battery operated, reducing energy consumption is an important challenge.

Energy consumption is the product of power consumption and execution time, i.e., Energy = Power × Time. Therefore, there are at least three main approaches to reducing energy consumption: 1) reducing power consumption while maintaining performance, 2) improving performance (reducing execution time) without increasing power consumption, or 3) reducing both power consumption and execution time. Many researchers have focused on device-level energy optimizations because shrinking CMOS (Complementary Metal-Oxide-Semiconductor) transistor sizes contributes directly to the third approach, i.e., performance and power improvements. For example, when the transistor size becomes 1/k (k is a scaling coefficient), the power consumption becomes 1/k² according to Dennard scaling [22]. Also, in line with Moore's Law [61], i.e., integrated circuit density doubles every 1.5 years, ICs (Integrated Circuits) have been growing in scale and their performance has been improving. However, due to limitations in CMOS device characteristics, these benefits are about to come to an end. Therefore, it is difficult to improve energy efficiency further at the device level.

Image processing is also applied to various areas such as robotics and automated driving systems. In the image processing community, object tracking, which pursues target objects in consecutively captured images (called frames), is rapidly growing in importance as technology increasingly supports daily life. Object tracking estimates an object's behavior and position, and it is classified into two types: offline and online. Offline object tracking is executed after the videos are recorded. It is used for post analysis, e.g., feature extraction from surveillance cameras and encoding for video compression. In contrast, online object tracking is executed at the same time as recording. Its purpose is to use the features from videos immediately, and it is applied to autofocus, obstacle detection, and so on. Online object tracking has become more important as emerging real-time applications, e.g., advanced driving-assist systems and augmented reality, are developed. Since online object tracking is often incorporated into battery-driven systems, not only improving tracking accuracy but also reducing energy consumption is important.

Online object tracking mainly executes two energy-consuming processes: image capturing and object tracking. Employing low-power devices such as CMOS image sensors [32, 71] and processing units [87, 89] is a well-known approach to improving energy efficiency. However, such device-level local optimizations have a potential limitation: they cannot take into account the energy trade-off between the two processes. For example, if an object tracking system runs at a high frame-rate, the energy for obtaining frames increases, whereas the energy for object tracking decreases. This is because the distance the tracked object moves between frames becomes small, which reduces the number of comparisons with the templates required for the tracking process.


1.2 Research Goal

As described in Section 1.1, it is difficult to reduce energy consumption at the device level. Therefore, it is important to reduce energy consumption at the system level, taking the energy trade-off into account. System-level techniques have been studied, e.g., dynamic voltage and frequency scaling and reduction of computation cost. In particular, there are two methods for reducing the power or energy spent on computation: 1) improving efficiency by skipping computation while the amount of input data stays constant, and 2) reducing the input data. This thesis focuses on the latter approach, which we call "input throttling".

In online object tracking, there is an energy trade-off between the two processes of frame acquisition and object tracking. This trade-off depends on the frame-rate, which determines the input quantity. To clarify how effectively optimizing the input quantity reduces energy, frame-rate control in online object tracking is evaluated.

In this thesis, to clarify the effectiveness of the input throttling scheme, simple and basic template matching is assumed as the tracking algorithm. Since the idea of input throttling is orthogonal to the choice of tracking algorithm, the frame-rate optimization techniques proposed in this thesis can be extended to other tracking algorithms.

1.3 Contributions

This thesis proposes several system-wide energy reduction methods for online object tracking systems. Unlike conventional methods, my method adaptively tunes the frame-rate by taking into account the energy trade-off based on the behavior of the tracked object. The key challenge is to determine the optimum frame-rate that minimizes the total energy without degrading tracking accuracy. Researchers have proposed fast tracking algorithms to reduce computational cost [26, 59]. In contrast, my method takes into account not only the computational cost but also the cost of obtaining frames. Although a few studies have discussed static frame-rate optimization for reducing energy consumption [52], my method has the advantage of increasing energy efficiency by controlling the frame-rate dynamically. The contributions of this thesis are as follows.


The first contribution of this thesis is an energy-oriented adaptive frame-rate optimization method based on the moving speed of the target object. The impact of frame-rate on the total energy of online object tracking systems was analyzed theoretically, and it is argued that, for a given object speed, there is an optimal frame-rate that minimizes the total energy consumption. In a simulation evaluation targeting objects in actual videos, the method reduced energy consumption by up to 74.8%, and by 53.9% on the benchmark average, compared with a conventional object tracking method with a fixed frame-rate.

The second contribution is an accuracy-oriented frame-rate control method that improves tracking accuracy. Reducing the frame-rate according to the movement of an object reduces energy consumption, but tracking accuracy deteriorates. When the estimate of the object speed is inaccurate, it is difficult to continue tracking. Because speed estimation fails when the object speed changes significantly at a low frame-rate, an algorithm with control parameters that strongly throttle frame-rate reduction in such cases is introduced. The evaluation shows that the extended algorithm can dramatically improve object tracking accuracy.

The third contribution of this thesis is an extension of the accuracy-oriented frame-rate control algorithm that improves the energy reduction rate. Although the speed of a target object changes from moment to moment, the accuracy-oriented frame-rate control method sets the frame-rate with fixed control parameters. To improve energy efficiency, the parameters must be tuned to the video characteristics. Therefore, a method of adaptively changing these control parameters according to the object speed is proposed. Evaluation results demonstrate that the dynamic parameter tuning achieves a 12.7% energy reduction compared with the original accuracy-oriented frame-rate control method.

1.4 Overview

This thesis consists of seven chapters. Chapter 1 describes the background and purpose of this thesis. Chapter 2 clarifies the position of this research by classifying related research. Chapter 3 analyzes the impact of the input throttling scheme on online object tracking systems. Chapter 4 discusses the proposed adaptive frame-rate optimization method. Chapter 5 presents accuracy-oriented frame-rate control for improving tracking accuracy. Chapter 6 proposes adaptive management of the frame-rate throttling level to improve the energy reduction rate. Finally, Chapter 7 summarizes the thesis and discusses future directions of this research.


Chapter 2 Online object tracking system

2.1 Principle of object tracking

Object tracking is a technique for estimating the position of a target object in consecutive frames. It is also applied to object recognition, posture estimation, text recognition, and so on, and it is a fundamental technique in image processing. Object tracking is mainly classified into two types: offline and online. Offline object tracking is executed after recording. For example, it is used for post analysis of surveillance cameras and feature extraction from various videos. In contrast, online object tracking is executed while the video is being recorded and is applied to emerging applications such as advanced driving-assist systems and augmented reality. This thesis focuses on online object tracking because its importance is increasing rapidly with the spread of emerging real-time applications.

An online object tracking system mainly consists of an image sensor, a frame memory, and a processor. These components and the data flow are shown in Figure 2.1. Frames taken with the image sensor are written to the frame memory. The frames are then read from the memory by the processor, which executes the tracking process. The processor tracks the object by executing the following two processes for each frame read out: 1) determining the search area size and 2) finding the object in the search area.

Figure 2.1: Components of object tracking system. (Image sensor, frame memory, and processor connected by a data bus; 1. write data, 2. read data, 3. tracking process.)

1. Determining search area

The search area of an object is determined on the basis of the detection result in the previous frame. The simplest method is to use the entire frame as the search area. Although this method can potentially achieve higher tracking accuracy, it is quite energy inefficient because the number of pattern-matching operations grows in proportion to the frame size. Alternatively, an area smaller than the frame can be set as the search target. With this method, it is necessary to determine the position and size of the search area.

There are two methods for determining the position of the search area.

• The position detected by the object tracking system in the previous frame is set as the center of the search area in the next frame.

• The system estimates the position on the basis of previous tracking results, and the estimated coordinate is set as the center of the search area for the next frame.

This thesis assumes the former because accuracy degradation due to failures in position estimation then need not be taken into account.

Next, there are two methods for determining the search area size:


• The size of the search area is determined on the basis of the assumed maximum object speed at the system-design phase, and its size is fixed during execution of the tracking process.

• The destination of the object is predicted from the tracking results of the previous frames, and the search area size is updated so that the area includes that destination.

This thesis assumes the latter, which can reduce the amount of processing by changing the size of the search area according to the movement of the object. If the object speed is low, the object is presumed not to move far from the position where it was detected, so the search area can be made small. On the other hand, if the object speed is high, the search area must be expanded. It is assumed that the search area is square and that its size S in pixels is defined by Equation (2.1).

\[ S = (2r + 1)^2, \tag{2.1} \]

where r is the predicted moving distance of the object between two consecutive frames in pixels. This is because the object is predicted to move up to r pixels in any direction from its center estimated in the previous frame, so the search area is a square whose sides are 2r + 1 pixels long. For example, for r = 3 pixels, the search area is 7×7 pixels, as shown in Figure 2.2.

The moving distance r of the object is expressed by

\[ r = \frac{v}{FR}, \tag{2.2} \]

where v (pixels/s) is the moving speed and FR (frames per second: fps) is the frame-rate.

Computing the exact object speed v would require the distance r that the object moves from the previous frame to the next one, which is not yet known when the search area is set, so obtaining the correct value is impossible. To solve this problem, real implementations predict v from the distance moved between the two most recent frames, assuming that the object maintains its moving speed.

Figure 2.3 shows dynamic search area determination based on the object speed. The arrow shows the speed of the object, and its length shows the magnitude of the speed vector.

Figure 2.2: Search area size (a square of 2r + 1 pixels per side, illustrated for r = 3 pixels).

First, the object speed is estimated from the two frames F_{i-1} and F_{i-2} immediately preceding the target frame F_i. The system then predicts that the object will move from F_{i-1} to F_i at the same speed. Finally, the size of the search area is determined so that it contains the object when the object moves at the estimated speed from the center position detected at F_{i-1}.
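The search-area determination described in this step can be sketched in a few lines of Python. This is a minimal illustration of Equations (2.1) and (2.2), not the implementation used in this thesis: the function names, the use of the Euclidean distance between detected centers, the rounding of r up to whole pixels, and the assumption that the frame-rate is the same over both frame intervals are all choices made here for illustration.

```python
import math


def estimate_speed(pos_prev2, pos_prev1, frame_rate):
    """Speed in pixels/s from the centers detected at F_{i-2} and F_{i-1}.

    Assumes the two frames were captured 1 / frame_rate seconds apart.
    """
    dx = pos_prev1[0] - pos_prev2[0]
    dy = pos_prev1[1] - pos_prev2[1]
    return math.hypot(dx, dy) * frame_rate


def predicted_distance(speed, frame_rate):
    """Equation (2.2), r = v / FR, rounded up to whole pixels (an assumption)."""
    return math.ceil(speed / frame_rate)


def search_area_size(r):
    """Equation (2.1): S = (2r + 1)^2 pixels."""
    return (2 * r + 1) ** 2


# Hypothetical centers (x, y) detected at F_{i-2} and F_{i-1}, at 30 fps.
v = estimate_speed((100, 80), (103, 84), frame_rate=30)  # 150.0 pixels/s
r = predicted_distance(v, frame_rate=30)                 # 5 pixels
print(search_area_size(r))                               # 121, i.e. 11 x 11 pixels
```

With r = 3 the last function returns 49, matching the 7×7 example of Figure 2.2.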

2. Detection of object in search area

There are two main ways to detect the object: area-based and feature-point-based methods. First, area-based object detection in the search area includes, for example, template matching and background subtraction. Template matching scans the search area, compares sample object images prepared in advance (templates) with each partial area, and searches for the most similar area. In contrast, background subtraction compares frames with and without the target object and detects the region where the amount of change is the largest. The feature values used with these methods include, for example, a pixel value (brightness value of the pixel) vector and a color histogram.

Figure 2.3: Dynamic search area determination. (Frames F_{i-2}, F_{i-1}, and F_i along the time axis for (a) low object speed and (b) high object speed.)

Second, feature-point-based methods detect characteristic points of the target object. There are various types of feature points, e.g., SIFT (Scale-Invariant Feature Transform), which is robust to rotation, illumination, and scale changes [55], and the KAZE feature, which adopts a nonlinear scale space rather than the Gaussian scale space used in SIFT [3]. SURF (Speeded-Up Robust Features) [9] and the AKAZE feature [4] were proposed as accelerated versions of SIFT and KAZE, respectively. Other feature points adopt binary patterns to reduce computational complexity, e.g., BRIEF (Binary Robust Independent Elementary Features) [11], BRISK (Binary Robust Invariant Scalable Keypoints) [50], ORB (Oriented FAST and Rotated BRIEF) [77], and FREAK (Fast Retina Keypoint) [2]. This thesis assumes template matching with a simple pixel value vector as the feature, because it is not necessary to acquire a background frame and the object can be detected even if the position of the camera changes.

Similarity measures for template matching include SSD (Sum of Squared Differences), SAD (Sum of Absolute Differences), and NCC (Normalized Cross-Correlation). NCC is used because it is robust against changes in brightness. If the template is N pixels wide and M pixels high, an N×M pixel block in the defined search area is selected. Then the NCC given by Equation (2.3) is calculated as a similarity score.

\[ R_{NCC}(x, y) = \frac{\sum_{j}\sum_{i} SA(x+i, y+j) \cdot T(i, j)}{\sqrt{\sum_{j}\sum_{i} SA(x+i, y+j)^2 \cdot \sum_{j}\sum_{i} T(i, j)^2}}. \tag{2.3} \]

Here, SA(x+i, y+j) and T(i, j) are the pixel values at each position in the search area and the template, respectively. The NCCs for all candidates in the search area are calculated. Finally, the pixel block that has the highest NCC score is selected as the result of the tracking process. Each NCC calculation requires 1) three sums of products, each with N×M multiply-and-add operations, 2) one multiply operation in the denominator, 3) one square-root operation, and 4) one divide operation. Counting each multiply-and-add as two operations, the number of operations required to obtain one NCC score, n_matching, is as follows:

\[ n_{matching} = 6 \cdot N \cdot M + 3. \tag{2.4} \]
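As a concrete reading of Equations (2.3) and (2.4), the following pure-Python sketch scores every template-sized block of a search area and returns the best one. It is an illustration under stated assumptions rather than the code used in this thesis: images are taken to be lists of rows of pixel values, (x, y) is taken to be the top-left corner of the candidate block, the function names are hypothetical, and a block with zero energy is given a score of 0.

```python
import math


def ncc(search_area, template, x, y):
    """Equation (2.3) for the template-sized block whose top-left pixel is (x, y)."""
    num = sa_sq = t_sq = 0.0
    for j, t_row in enumerate(template):
        for i, t in enumerate(t_row):
            s = search_area[y + j][x + i]
            num += s * t        # numerator: sum of SA * T
            sa_sq += s * s      # denominator term: sum of SA^2
            t_sq += t * t       # denominator term: sum of T^2
    denom = math.sqrt(sa_sq * t_sq)
    return num / denom if denom > 0 else 0.0


def best_match(search_area, template):
    """Return (score, (x, y)) of the block with the highest NCC score."""
    m, n = len(template), len(template[0])
    scored = [
        (ncc(search_area, template, x, y), (x, y))
        for y in range(len(search_area) - m + 1)
        for x in range(len(search_area[0]) - n + 1)
    ]
    return max(scored)


def n_matching(n, m):
    """Equation (2.4): operations per NCC score for an n x m template."""
    return 6 * n * m + 3
```

For example, `n_matching(8, 8)` gives 387 operations per candidate for a hypothetical 8×8 template, so the cost of one frame grows with both the template size and the number of candidate blocks in the search area.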

To determine the search area size, the object speed is used rather than the acceleration, with emphasis on implementation cost (i.e., model complexity and the number of frames needed for estimation), because no difference in tracking accuracy between speed-based and acceleration-based estimation was observed in the experiments of this thesis.
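Combining Equations (2.1), (2.2), and (2.4) gives a rough feel for the frame-rate trade-off described in Section 1.1. The sketch below is a toy model under stated assumptions, not the energy analysis of Chapter 3: it assumes that each frame costs a fixed capture energy `e_capture`, that each operation costs `e_op`, that one NCC score is computed per position in the search area, and that r is left unrounded. All names and numerical values are hypothetical and in arbitrary units.

```python
def n_matching(n, m):
    """Equation (2.4): operations per NCC score for an n x m template."""
    return 6 * n * m + 3


def energy_per_second(frame_rate, speed, n, m, e_capture, e_op):
    """Toy model: capture energy plus matching energy, summed over one second."""
    r = speed / frame_rate              # Equation (2.2), left unrounded
    positions = (2 * r + 1) ** 2        # Equation (2.1), one NCC score per position
    per_frame = e_capture + positions * n_matching(n, m) * e_op
    return frame_rate * per_frame


def best_frame_rate(frame_rates, speed, n, m, e_capture, e_op):
    """Pick the candidate frame-rate with the lowest modeled energy."""
    return min(
        frame_rates,
        key=lambda fr: energy_per_second(fr, speed, n, m, e_capture, e_op),
    )


rates = [5, 10, 15, 30, 60]  # hypothetical candidate frame-rates in fps
for v in (30, 120):          # hypothetical object speeds in pixels/s
    print(v, best_frame_rate(rates, v, n=8, m=8, e_capture=1161.0, e_op=1.0))
# prints "30 30" and then "120 60"
```

In this toy setting, a low frame-rate inflates the search area and a high frame-rate inflates the capture cost, so the modeled energy has a minimum in between, and the minimum moves to a higher frame-rate for the faster object.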

2.2 Low energy techniques

Since online object tracking is incorporated into battery-powered systems such as automobiles, smartphones, and robots, reduced energy consumption is required as well as tracking accuracy. To tackle this issue, researchers have proposed techniques at the device, system, and algorithm levels. On the hardware side, employing low-power devices, such as image sensors and processing units, is a well-known approach to improving energy efficiency. At the system level, many approaches have been proposed that coordinate the devices while considering the characteristics of each device. On the software side, fast tracking algorithms that reduce computational cost have been proposed. The following sections survey the technologies and approaches related to low-energy object tracking at these levels.

2.2.1 Device level approaches for image sensing

This section discusses image sensors, which are one of the main components of online object tracking systems. Image sensors are classified into two types: CCD (Charge-Coupled Device) and CMOS (Complementary Metal-Oxide-Semiconductor) image sensors. This thesis focuses on CMOS sensors, which are commonly used. A CMOS image sensor is equipped with a pixel array. Each pixel has a photodiode, which produces electric charge when it receives light. The image sensor is also equipped with horizontal and vertical scanners as peripheral circuits. These scanners switch the readout from pixel to pixel. Each pixel value is then output as a digital value by an ADC (Analog-to-Digital Converter). To reduce the power consumption of image sensors, improvements to the ADC have been proposed.

McIlrath et al. targeted sensors in which an ADC is provided in parallel for each pixel unit [57]. By operating the transistors that constitute each pixel of such a sensor in the weak inversion region, the drain current and power consumption are reduced, resulting in a low power consumption of 40 nW/pixel. Ignjatovic et al. also improved the power efficiency of a pixel-parallel image sensor [36]. Part of the Σ∆ converter is moved from the pixel into the column signal circuit, which reduces the current required by the pixel circuit and thereby achieves low power consumption.

To operate image sensors at a low voltage, PWM (Pulse Width Modulation) image sensors were proposed [62]. A PWM imager has the advantage that the pulse width can be easily converted to a digital value by counting it with a counter and latch memories. Kagawa et al. proposed a PWM-based image sensor [41]. This sensor eliminates the need for a large capacitor because the feedback reset of the photodiode suppresses the pixel fixed-pattern noise. Therefore, the power consumed by the capacitor can be eliminated. In addition, they reduced power consumption by showing that a static bias current is necessary only for resetting the pixel and by shortening the reset cycle [42]. Hanson et al. proposed an image sensor that reads out pixel values by PWM and converts them to digital values with a TDC (Time-to-Digital Converter) instead of an ADC [32]. In this sensor, the circuit configuration of the comparator, which compares the voltage and the photodiode voltage, was changed from that of a conventional comparator. As a result, operation was achieved at a lower voltage (power supply voltage Vdd = 0.5 V) than with a conventional comparator.

To operate in isolated and remote areas under harsh environmental conditions, self-powered sensors called EHIs (Energy Harvesting Imagers) have become desirable in recent years. The challenge with an EHI is not only to increase the amount of power generated but also to reduce the drive voltage. The EHI proposed by Cevik et al. generates current efficiently by entering harvest mode while not sensing (i.e., in standby mode). In addition, its low-speed, low-resolution image sensor architecture contributes to a low voltage [13]. Chiou and Hsieh adjusted the cycle of harvesting and sensing appropriately, making it possible to drive the sensor with the minimum necessary voltage [14].

Among other low-power image sensors, Cottini et al. [20] implemented a background subtraction algorithm in hardware on a pixel-by-pixel basis in an image sensor. In the buffer that holds the voltage of each pixel, the sensor reduces power consumption by reducing the drive duty ratio, and it achieved 620 pW/pixel. Couniot et al. exploited a row-based gating scheme to reduce the gate leakage of the comparator during the exposure phase [21].

2.2.2 Device level approaches for image processing

A crucial component for object tracking is the processing unit. To process images that consist of large volumes of data, it is essential to improve the energy efficiency of the processing unit. This section describes conventional energy-efficient processing hardware. Notably, the survey covers image processing units not only for object tracking but also for video compression and recognition as related work.

Recently, feature-point-based algorithms such as SIFT [55] and KAZE [3] have been used for object tracking and recognition. Optimized architectures based on FPGAs (Field Programmable Gate Arrays) have been proposed to compute this feature point detection efficiently [40, 96, 35]. These focus on the SIFT algorithm and optimize memory access and the data path. Jiang et al. also implemented a parallel processing architecture on an FPGA for the AKAZE algorithm [39]. Although these architectures achieve high computing speed and reduce power consumption, redundant logic in the FPGA makes the design power-inefficient. As another feature extraction approach, a fully pixel-parallel architecture was explored for adaptive binarization of filtered images for essential feature extraction as well as for their temporal integration and derivative operations [99]. Feature-point-specific architectures have also been proposed for SURF [53] and ASIFT [67].

As other hardware, application-specific many-core processors for tracking and recognition have been proposed [64, 65, 46, 70, 93]. Xu et al. presented a low-power processor for image recognition [93]. The processor attempts to reduce static power consumption by applying clock and power gating at various granularities. When H.264 decoding at 30 fps was executed on 32 processor cores, the observed power consumption was 500 mW. An architecture proposed by Oh et al. achieves high throughput by optimizing data parallelism [65].

As another approach, heterogeneous architectures composed of processors, an image processing engine, a GPU (Graphics Processing Unit), and so on have been proposed. Sasagawa and Mori developed a multi-core architecture with a GPU and a computer vision engine based on a pre-processor, target detector, and target tracker [82]. The engine has multiple pipelined data paths with local memory, and it manages the data flow to fill the pipeline effectively. Park et al. designed an object recognition system for mobile applications [69]. The system adopts multiple accelerators and DRM (Dynamic Resource Management) for real-time operation. DRM controls the power mode of two domains (i.e., the feature detection domain and the description generation domain) to reduce power consumption.

As hardware for efficient image compression, SDRAM interface architectures have been proposed that improve memory efficiency by reducing row-activation overhead [45, 84, 85, 100]. For example, to achieve high bandwidth in H.264, Song et al. reduced redundant memory transfer overhead with a tile-based memory access method and a pixel cache [84].

2.2.3 System level approach

A system is composed of various devices, e.g., a processor, sensor, memory, and so on. Therefore, system-level low-power and low-energy approaches coordinate the devices while considering the characteristics of each device. In general computer systems, the OS (operating system) performs power management.

The power consumption of CMOS devices consists of two factors, i.e., dynamic power and static power. The dynamic power consumption is proportional to the operating frequency and the square of the supply voltage. Therefore, reducing the voltage and the frequency is effective.
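As a rough illustration of why lowering both voltage and frequency pays off, the sketch below evaluates the proportionality stated above (dynamic power scales with frequency times voltage squared). The function name and the two operating points are hypothetical and chosen only for the example.

```python
def relative_dynamic_power(v_scaled, f_scaled, v_base, f_base):
    """Ratio of dynamic power at a scaled operating point to the baseline,
    using only the proportionality P_dyn ~ f * V^2."""
    return (f_scaled / f_base) * (v_scaled / v_base) ** 2

# Hypothetical operating points: 1.0 V at 1.0 GHz scaled down to 0.8 V at 0.5 GHz.
ratio = relative_dynamic_power(0.8, 0.5e9, 1.0, 1.0e9)
print(f"dynamic power falls to {ratio:.0%} of the baseline")
```

Halving the frequency alone would halve the dynamic power; the additional voltage reduction contributes the quadratic factor.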

DVFS (Dynamic Voltage and Frequency Scaling) is a well-known power reduction scheme in computer systems that dynamically controls the supply voltage and operating frequency. As a simple form of DVFS, an interval-based scheduling algorithm was proposed [27, 91]. This approach uses uniform-length intervals to monitor the system utilization in the previous intervals and then controls the voltage for the next interval. This technique is useful for applications such as batch processing. On the other hand, an intra-task voltage scheduling technique was also proposed. This technique controls the voltage for each segment of application code, where the segments are obtained by pre-analysis [83].

In addition to this scheme, a method using a software feedback loop was proposed [49]. This method provides a time slot and defines the slack time as the extra time between the actual execution time and the deadline of the time slot. In the next time slot, it decides the operating frequency from the current slack time and the deadline of the next time slot.

Also, some studies on DVFS for video decoding have been conducted. Ma et al. analyzed the frame decoding frequencies [56]. Choi et al. proposed frame-based DVFS, which reduces the voltage and the frequency if the computation is independent between frames [17]. Lee et al. focused on predicting the proper frequencies and voltages [48]. In addition to these approaches, Liang et al. dealt with multiple codec types by using history-table-based DVFS [51].

As the number of cores installed in a system has increased, concurrency has attracted attention as another optimization parameter for improving power and energy efficiency. Concurrency is the number of threads or cores used to execute a multithreaded program at the same time. To achieve high system efficiency, researchers have proposed thread management schemes. Nguyen et al. [63] and Corbalan et al. [18, 19] presented systems that measure the efficiency at different allocations and adjust the job allocation. Suleman et al. devised a framework that dynamically controls the number of threads using run-time information [86].

Rangan et al. proposed a power management technique based on thread migration between cores with different voltage and frequency settings [74]. Vega et al. investigated the optimum combination of core-wise SMT (Simultaneous Multithreading) level and the number of active cores [90].

To manage power efficiently under a power constraint, researchers have focused on overprovisioning systems [72, 81]. An overprovisioning system has more hardware resources than can be fully powered within its thermal design power. The system therefore selectively activates resources based on workload characteristics under the given power constraints. For overprovisioning systems with various processors, the impact of power inhomogeneity on application performance has been investigated, and a variation-aware power budgeting algorithm has been presented by Inadomi et al. [37].

Approximate computing is based on the intuitive observation that efficiency can be gained by sacrificing some correctness. In specific domains such as image processing, approximation is commonly used because of its high compatibility. Approximate computing has also been studied at various layers, such as circuits [47, 73], architecture [34], and compilers [79]. At the system level, researchers have explored output quality management that is independent of software algorithms.

The Green system builds a QoS (Quality of Service) model based on profiling data from code annotated with pragmas [8]. This model is used to check output quality. The QoS is then used to judge whether to use the various approximate versions of the code. PowerDial can dynamically tune control knobs that are statically configured into approximate code [33]. Ansel et al. utilized a genetic algorithm to find the best approximate code with acceptable quality [5]. CCG is another approach for monitoring output quality at run-time [80]. In this technique, the host CPU checks the output quality provided by approximate code executed on a GPU. Khudia et al. also proposed output quality management with lightweight checkers that monitor only a small representative subset of the data [43].

2.2.4 Algorithm level approach

On the software side, researchers have proposed many approaches that improve the algorithm to reduce the computation required for object tracking. Recently, appearance models based on machine learning have been utilized for efficient object tracking. In general, tracking algorithms are mainly classified into two types, i.e., generative and discriminative models.

The generative algorithm uses features of a target object and updates the learning model so as to minimize reconstruction errors. To deal with appearance variation, the WSL Tracker [38] and an incremental learning method [76] have been proposed as online learning algorithms. Among the approaches that use adaptive appearance models, linear appearance models have been extensively studied within the pattern recognition community. The appearance of a target changes from moment to moment. For example, the features of a target face differ greatly depending on its direction. In tracking with an appearance model, the model is updated during tracking to deal with changes in the target appearance. Gonzalez-Mora and Torre cut the main cost of the model update to perform tracking in real time. Their technique reduces the amount of calculation for tracking by extracting terms that can be calculated in advance from the model [26]. As another generative algorithm, SR (Sparse Representation)-based tracking was proposed [58].

SR is an algorithm for reconstructing a signal (e.g., an image) using the prior knowledge that the reconstruction is sparse or compressible. With this technique, the signal is represented by a sparse set of basis functions. That is, all the coefficients corresponding to the basis functions vanish except for a few. Mei and Ling proposed an SR-based tracker that uses ℓ1 regularization to handle the object occlusion problem by casting tracking as a sparse approximation in a particle filter framework. This method needs to solve the ℓ1 regularization problem repeatedly, and thus its computational cost is high. Yang et al. enhanced the ℓ1 tracker. They modeled the appearance of a target with an orthogonalized template set and computed the coefficients using least-squares regularization instead of the computationally expensive ℓ1 optimization [95].

These generative algorithms have a drawback: they do not take into account the surrounding visual context and therefore cannot exploit it to better separate the target object from the background.

The discriminative model detects the target object by classifying positive samples (i.e., the target object) and negative samples (e.g., the background). Boosting methods were proposed for object tracking. The boosting methods achieve robust and efficient tracking by combining results from weak classifiers [6, 29, 28]. Liu and Yu [54] improved the efficiency of the boosting method [28] by taking into account the time dependency of target features. Zhang et al. also proposed a supervised learning method that optimizes the classifier objective function in the steepest ascent direction with respect to the positive samples and in the steepest descent direction with respect to the negative ones [97]. However, the discriminative model also has a drawback. These algorithms use only one positive sample and multiple negative samples for the model update. If the estimated position of the target is incorrect, the model includes noise. Therefore, the accumulation of errors may cause tracking failure [7]. To alleviate the problem, an online semi-supervised approach was proposed that increases the weight of features from the first detected frame [30].

As other discriminative approaches, methods based on compressive sensing theory [1, 12, 98] were proposed. These achieve efficient tracking by randomly extracting low-dimensional features of the target object from the high-dimensional ones.

2.3 Conclusions

The principle of the online object tracking system was explained. Object tracking is a fundamental technique for estimating the position of a target object in consecutive frames. It is also applied to object recognition, posture estimation, text recognition, and so on. Because the technique is used in battery-operated systems, its energy consumption must be reduced without degrading the tracking accuracy.

Researchers have presented low-power and low-energy approaches in each layer, i.e., the device, system, and algorithm levels. Approaches that target an individual device, or an algorithm executed on a single device, struggle to improve performance and power efficiency because of limitations in CMOS device characteristics. Therefore, a system-level approach is needed to reduce energy consumption. Since these low-power and low-energy techniques are orthogonal, each technique can be combined with the others.

The system-level energy-efficient techniques are 1) optimizing the supply voltage and frequency and 2) reducing computation. Reducing computation can in turn be classified into two types: 1) skipping the computation and 2) input throttling. The system-level approaches introduced in this section belong to the former. In contrast, the input throttling approach is discussed in this thesis. The impact of input throttling on online object tracking systems is clarified in the following chapters.


Chapter 3 Energy impact of input throttling for online object tracking systems

3.1 Concept of input throttling

In this thesis, online object tracking is used as the target application to clarify the effect of reducing the amount of input data (called input throttling). The input data to object tracking are images that arrive continuously in time, and the frame-rate controls the input amount per unit time. Therefore, it is possible to throttle the input amount by lowering the frame-rate.

Tracking accuracy is an important performance indicator of online object tracking. It represents how correctly the position of the target object is estimated. The key challenge is to determine the optimum frame-rate that minimizes the total energy without degrading tracking accuracy. This approach is similar to approximate computing [60, 78, 94]. However, approximate computing focuses on improving processing efficiency, whereas input throttling reduces the total energy consumption of the system by reducing the amount of data. As a proof of concept of input throttling, the relationship between frame-rate and energy consumption is analyzed in the following sections.


3.2 Energy modeling

In this section, an energy model for an online object tracking system is presented. This thesis assumes that the tracking system is equipped with a CPU, main memory, and an input/output interface. To understand the energy consumption of the tracking system intuitively, the abstracted energy model targets only the CPU and main memory. The energy consumed for object tracking in one frame consists of two sources: the energy for frame input from DRAM (dynamic random access memory), E_f [J], and that for object tracking in the CPU, E_t [J]. Thus, the total energy consumed per second, E [J], at a frame-rate FR can be expressed as follows.

E = (E_f +E_t)·FR, (3.1)

where E_f is defined by the energy for obtaining a pixel from DRAM, E_mem [J], the frame width W [pixels], and the frame height H [pixels]:

E_f = E_mem·W·H. (3.2)

Determining the fetch area at every frame would require modeling the selector hardware, which would make the energy model more complicated. Thus, a simple object tracking system that reads the whole frame, as in Equation (3.2), is modeled to keep the implementation for evaluation straightforward. Because single object tracking is the focus, the impact of the cache is not considered. An energy model that takes the fetch area and cache into account is left for future work.

Generally, E_t is proportional to the number of computations required for the tracking process. Because the computing complexity of arithmetic operations (e.g., add, multiply, square-root, and divide) is implementation dependent, it is assumed for simplicity that every operation consumes the same amount of energy, E_op, in the CPU. Input throttling does not depend on the differences in operation complexity, so it can easily be applied to any execution platform by accounting for those differences. Therefore, E_t can be defined as follows by using S and n_matching from Equation (2.1) and Equation (2.4), respectively.

E_t = E_op·n_matching·S. (3.3)

In summary, the total energy consumed in one second can be expressed by Equation (3.4).

$$E = \left(\alpha \cdot W \cdot H + (6 \cdot N \cdot M + 3)\left(\frac{2 \cdot v}{FR} + 1\right)^{2}\right) \cdot E_{op} \cdot FR, \tag{3.4}$$

where v is the object speed and α is the energy ratio, E_mem divided by E_op. The exact values of E_op and E_mem depend on the specification of the devices, e.g., the performance capability and memory capacity.
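The model is compact enough to evaluate directly. The following minimal sketch computes the two terms of Equation (3.4) in units of E_op; the function and parameter names are illustrative, and n_matching and S are taken in the forms in which they appear in Equation (3.4). The numeric values are those used for the analysis in Section 3.3.

```python
def energy_per_second(fr, v, alpha, frame_pixels, template_pixels):
    """Evaluate Equation (3.4) in units of E_op.

    fr: frame-rate [fps]; v: object speed [pixels/s];
    alpha: E_mem / E_op; frame_pixels: W*H; template_pixels: N*M.
    Returns (frame_input, tracking, total) energy per second.
    """
    n_matching = 6 * template_pixels + 3         # as it appears in Eq. (3.4)
    search_area = (2 * v / fr + 1) ** 2          # S as it appears in Eq. (3.4)
    frame_input = alpha * frame_pixels * fr      # E_f * FR
    tracking = n_matching * search_area * fr     # E_t * FR
    return frame_input, tracking, frame_input + tracking

# alpha = 5, 320x240 frame, 5,914.2-pixel template, object speed 10 pixels/s.
for fr in (2, 6, 30):
    e_f, e_t, e = energy_per_second(fr, 10, 5.0, 76_800, 5_914.2)
    print(f"FR={fr:2d} fps  E_f*FR={e_f:.3e}  E_t*FR={e_t:.3e}  E={e:.3e}")
```

The frame-input term grows linearly with the frame-rate, while the tracking term is large at low frame-rates because the search area grows, which is the trade-off the model is meant to expose.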

In this thesis, the memory and CPU are focused on because these devices are dominant in real systems. Therefore, the energy consumption of the memory and CPU is approximated by using the energy model. The other devices in the tracking system (e.g., the image sensor) are out of the scope of this thesis. To rigorously evaluate the energy consumption of an entire tracking system, it is necessary to evaluate the proposed input throttling on a real system; this evaluation is left for future work.

3.3 Impact of frame-rate control on energy consumption

On the basis of the energy model introduced in Section 3.2, the fundamental impact of the frame-rate on the total energy is analyzed by assuming a constant speed of the tracked object.

Based on implementations of a DRAM [44] and a multimedia processor chip [88], the energy ratio α between a memory access and an arithmetic operation is estimated.

When RGB565 (16-bit) video data of VGA size (640×480 pixels) is transferred at 66 fps, an electric current of 128 µA at 1.1 V is supplied to the interface of the DRAM [44]; therefore, E_mem is calculated by Equation (3.5).

$$E_{mem} = \frac{1.1\,\mathrm{V} \times 128\,\mu\mathrm{A}}{66\,\mathrm{fps} \times (640 \times 480)\,\mathrm{pixels}} = 6.94 \times 10^{-12}\,\mathrm{J}. \tag{3.5}$$

It has also been reported that a peak performance of 388.1 GOPS/W was achieved in a multi-core processor for image processing equipped with four processor cores [88]. Thus, E_op is obtained by Equation (3.6).

$$E_{op} = \frac{1}{388.1\,\mathrm{GOPS/W}} = 2.58 \times 10^{-12}\,\mathrm{J}. \tag{3.6}$$

Therefore, α is estimated as

$$\alpha = \frac{E_{mem}}{E_{op}} \simeq 2.69. \tag{3.7}$$

Since α is device specific, it is assumed that 1 ≤ α ≤ 10 for analysis and the representative value is five in this thesis.
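The arithmetic behind Equations (3.5) to (3.7) can be re-checked with a few lines. The variable names below are illustrative; the numbers are the ones quoted above from [44] and [88], and the last digit of α may differ slightly depending on where the intermediate values are rounded.

```python
# Values quoted in the text for the DRAM interface [44] and the processor [88].
supply_voltage = 1.1          # V
supply_current = 128e-6       # A
frame_rate = 66               # fps
pixels_per_frame = 640 * 480  # VGA
ops_per_joule = 388.1e9       # 388.1 GOPS/W, i.e. operations per joule

e_mem = supply_voltage * supply_current / (frame_rate * pixels_per_frame)  # J per pixel
e_op = 1.0 / ops_per_joule                                                # J per operation
alpha = e_mem / e_op

print(f"E_mem = {e_mem:.3e} J, E_op = {e_op:.3e} J, alpha = {alpha:.3f}")
```

The estimate falls inside the range 1 ≤ α ≤ 10 assumed for the analysis.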


Table 3.1: Parameters for energy analysis

| Explanation | Parameter | Value |
|---|---|---|
| Energy for memory reading (J) | E_mem | 6.94×10⁻¹² |
| Energy for arithmetic operations (J) | E_op | 2.58×10⁻¹² |
| Energy ratio | α = E_mem/E_op | 5.0 |
| Frame size (pixels) | W·H | 76,800 |
| Template size (pixels) | N·M | 5,914.2 |

Figure 3.1: Relationship between energy consumption and frame-rate. (x-axis: frame-rate [fps]; y-axis: normalized energy consumption; curves: E, E_f·FR, and E_t·FR.)

It is also assumed that the frame size W ·H is 76,800 (320×240) pixels, that the template size N ·M is 5,914.2 pixels, and that the object speed is 10 pixels/s. A frame size based on Dog1 is assumed, which is a video stream included in the Tracker Benchmark [92]. Template data has been prepared for each benchmark video stream because the Tracker Benchmark does not provide templates for object tracking. Dog1 has seven templates, and their average size is 5,914.2 pixels. Table 3.1 lists these parameters for energy analysis of online object tracking system.

Figure 3.1 shows the relationship between the frame-rate (x-axis) and energy consumption (y-axis), expressed in units of E_op. There is an optimum frame-rate that minimizes the total energy consumption E. This is because, as the frame-rate rises, 1) the energy for frame inputs E_f·FR increases proportionally and 2) the energy for object tracking E_t·FR decreases, since the search area per frame shrinks roughly in inverse proportion to the square of the frame-rate. Similar trends to those in Figure 3.1 are observed even if other benchmarks are assumed.

Figure 3.2: Relationship between energy consumption, frame-rate, and object speed. (x-axis: frame-rate [fps]; y-axis: normalized energy consumption; curves for v = 10, 20, 30, 40, and 50 pixels/s; annotated reductions of 55.3% and 24.7%.)

Next, the impact of object speed on energy consumption is discussed. Figure 3.2 presents the total energy for various object speeds. The x-axis and the y-axis are the same as those in Figure 3.1. The dot marker on each energy curve shows its minimum point. As can be seen from Figure 3.2, the optimum frame-rate depends on the object speed. Unfortunately, conventional online object tracking systems cannot exploit this feature because their frame-rate is fixed. If the target object speed is 50 pixels/s, the minimum energy is achieved at a frame-rate of 29 fps, as shown in Figure 3.2. When the object slows to 10 pixels/s, the conventional method reduces energy consumption by 55.3% owing to the smaller search area. However, it misses the chance to reduce energy by a further 24.7%, because the optimum frame-rate for an object moving at 10 pixels/s is 6 fps.
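The figures quoted in this discussion can be checked against Equation (3.4) with a small scan over integer frame-rates. The sketch below is only an illustration: the function names and the 1 to 30 fps scan range are assumptions, the parameters are those of Table 3.1, and both savings are expressed relative to the energy at 50 pixels/s and the corresponding optimum frame-rate.

```python
def total_energy(fr, v, alpha=5.0, frame_pixels=76_800, template_pixels=5_914.2):
    """Equation (3.4) in units of E_op, with the Table 3.1 parameters as defaults."""
    n_matching = 6 * template_pixels + 3
    return (alpha * frame_pixels + n_matching * (2 * v / fr + 1) ** 2) * fr

def best_integer_frame_rate(v, max_fr=30):
    """Scan 1..max_fr fps and return the frame-rate with the lowest energy."""
    return min(range(1, max_fr + 1), key=lambda fr: total_energy(fr, v))

fast, slow = 50, 10                        # object speeds [pixels/s]
fr_fast = best_integer_frame_rate(fast)    # optimum while the object is fast
fr_slow = best_integer_frame_rate(slow)    # optimum after it slows down

baseline = total_energy(fr_fast, fast)     # fast object at its optimum
fixed = total_energy(fr_fast, slow)        # slow object, frame-rate left unchanged
adaptive = total_energy(fr_slow, slow)     # slow object, frame-rate re-optimized

print(f"optimum frame-rates: {fr_fast} fps and {fr_slow} fps")
print(f"saving with fixed frame-rate: {1 - fixed / baseline:.1%}")
print(f"additional saving if adapted: {(fixed - adaptive) / baseline:.1%}")
```

Under these assumptions the scan returns 29 fps and 6 fps, and the two savings agree with the 55.3% and 24.7% annotated in Figure 3.2.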

3.4 Conclusions

To clarify the relationship between energy consumption and the frame-rate in an online object tracking system, an energy model was constructed and the energy characteristics were analyzed. As a result, it can be seen that there is an energy trade-off between obtaining frames and the tracking process. Furthermore, the analysis results show that, on the basis of this trade-off, there is a frame-rate that minimizes energy consumption.


Chapter 4 Energy-oriented adaptive frame-rate control

4.1 Motivation

The analysis results of the energy impact explained in Chapter 3 motivate exploiting the frame-rate for energy-efficient online object tracking systems for the following two reasons. First, a conventional online object tracking method with a fixed frame-rate cannot reduce wasted energy consumption. Although there is an energy trade-off between obtaining frames and the tracking process, the conventional method cannot take it into account because of its fixed frame-rate. Therefore, frame-rate control based on the energy trade-off can reduce energy consumption. Second, the frame-rate that minimizes energy consumption depends on the object speed, and the target object speed changes from moment to moment. Therefore, the frame-rate needs to be controlled adaptively based on the movement of the target object.

In summary, the analysis results motivate exploiting the frame-rate as an energy-control knob and dynamically tuning the knob based on the object speed to implement energy-efficient online object tracking systems. In the following sections, adaptive frame-rate control for energy-efficient online object tracking systems is proposed.


4.2 Related work

Researchers have discussed frame-rate control for reducing data size, power consumption, and computation. LiKamWa et al. reported the development of an image sensing device that supports static frame-rate configuration [52]. Choi et al. also proposed an image sensor that can change its frame-rate; they fabricated the sensor and measured the impact on power consumption [15, 16]. Although static optimization works well for fixed systems such as surveillance cameras installed in buildings, it cannot exploit the dynamic behavior of tracked objects, resulting in energy inefficiency in mobile online object tracking applications. On the other hand, Han et al. presented a dynamic energy reduction technique that lowers the frame-rate of a smartphone's display based on scroll actions [31]. Their scope covers only output devices, so it is orthogonal to my adaptive optimization. As work closely related to my approach, frame-skipping techniques based on motion activity have been studied for video transcoding [10, 25, 68]. These techniques exploit bi-directional information in terms of time. In contrast, my approach must process the input data sequentially.

4.3 Velocity based frame-rate optimization

To reduce the wasted energy, adaptive frame-rate optimization is proposed. Figure 4.1 shows walkthroughs of the conventional and proposed methods.

The conventional method decides the search-area by using the estimated object speed. The estimated object speed in the current frame is calculated by Equation (2.2). Here, r is the moving distance of the tracked object from the immediately preceding frame to the current one. In Figure 4.1(a), the conventional method sets a wide search-area at frame₃ because of the high object speed at frame₂. It also reduces the search-area size at frame₄ and frame₅.

However, the proposed method adaptively optimizes not only the search-area but also the frame-rate. The proposed method attempts to minimize energy consumption by taking into account the energy trade-off between the number of processed frames and the search area. Until frame₃, the behavior of the proposed method is the same as that of the conventional method, because the proposed method processes frames at a high frame-rate owing to the high object speed. If the estimated object speed is low, the proposed method reduces the frame-rate, i.e., it decreases the number of frames that are used for object tracking. For instance, the proposed method does not track the object at frame₄, which is called an unprocessed frame in Figure 4.1(b). Instead, the search area is expanded to continue tracking the object at frame₅. Note that conventional object tracking is assumed to run at a fixed, maximum frame-rate. In addition, conventional object tracking completes its computation before the next frame arrives (i.e., within the minimum time budget). Therefore, the object tracking computation is completed within the time budget at any frame-rate chosen for optimizing energy consumption.

Algorithm 1 Velocity based adaptive frame-rate optimization
Ensure: coordinate[i]
 1: while SystemIsRunning() do
 2:     t_c ← t_i
 3:     frame[i] ← ObtainFrame(t_c)
 4:     search_area ← DecideSearchArea(coordinate[i−1], v[t_c−1])
 5:     coordinate[i] ← SearchObject(frame[i], search_area)
 6:     v[t_c] ← EstimateSpeed(coordinate, FR_current)
 7:     FR_min ← DecideFR(v[t_c])
 8:     FR_next ← FR_min
 9:     i ← i + 1
10: end while

Figure 4.1: Walkthroughs of the (a) conventional and (b) proposed method. (Each panel shows frame₁ to frame₅ along the time axis with the estimated speed and the search-area.)

Figure 4.2: Flowchart of the velocity based frame-rate control. (Boxes: Obtain a frame, Detect an object, Decide search area, Search the object, Estimate speed, Set optimized frame-rate.)

Algorithm 1 shows the pseudo code of the proposed velocity based adaptive frame-rate control method, and Figure 4.2 illustrates its flowchart. In lines 1–6, the method 1) captures the current frame, 2) defines a search area based on the object speed obtained in the previous iteration, 3) finds the object in the search-area, and 4) calculates the predicted object speed, in the same way as a conventional online object tracking method with a fixed frame-rate, as explained in Section 2.1. Then, 5) the proposed method optimizes the frame-rate based on the estimated object speed to minimize the energy consumption (lines 7–8). If multiple target objects exist, the fastest one is targeted. The frame-rate that minimizes energy consumption, FR_min, is obtained by setting the derivative of Equation (3.4) with respect to the frame-rate to zero, as follows.

$$FR_{min}(t_c) = 2 \cdot \sqrt{\frac{6 \cdot N \cdot M + 3}{\alpha \cdot W \cdot H + 6 \cdot N \cdot M + 3}} \cdot v(t_c). \tag{4.1}$$
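For reference, the step from Equation (3.4) to Equation (4.1) can be written out as follows. This is a sketch that uses only the symbols of Equation (3.4); the shorthand A and B is introduced here for brevity and is not part of the original notation.

```latex
\begin{align*}
A &= \alpha \cdot W \cdot H, \qquad B = 6 \cdot N \cdot M + 3, \\
\frac{E}{E_{op}} &= \left(A + B\left(\frac{2v}{FR} + 1\right)^{2}\right) FR
  = (A + B)\,FR + 4Bv + \frac{4Bv^{2}}{FR}, \\
\frac{d}{d\,FR}\left(\frac{E}{E_{op}}\right) &= (A + B) - \frac{4Bv^{2}}{FR^{2}} = 0
  \;\Longrightarrow\;
  FR_{min} = 2 \cdot \sqrt{\frac{B}{A + B}} \cdot v .
\end{align*}
```

The second derivative, 8Bv²/FR³, is positive for FR > 0, so the stationary point is a minimum.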

The four parameters, α, W, H, and n_matching, are constants known at design time. Thus, only monitoring the object speed v at run-time, as obtained in line 6, is required for the dynamic optimization. The frame-rate is determined every time a frame is input. The main overhead of the proposed technique is the calculation of the optimized frame-rate. Since this calculation is much smaller than the matching process, its cost is negligible. For example, counting operations for Equation (4.1) in the same way as for Equation (2.4) gives a cost that is the sum of 1) three add operations, 2) eight multiply operations, 3) one divide operation, and 4) one square-root operation. Therefore, the cost is 13 operations per frame, which is far smaller than that of Equation (2.4).
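The loop of Algorithm 1 together with Equation (4.1) can be sketched on a synthetic one-dimensional trajectory. Several parts are assumptions made only for this illustration and are not taken from the thesis: the 30 fps maximum and 1 fps floor used to clamp the frame-rate, the idealised search that always finds the object, the speed estimate as distance moved multiplied by the current frame-rate, and the continuous waiting time of 1/FR instead of skipping discrete sensor frames.

```python
import math

ALPHA, FRAME_PIXELS, TEMPLATE_PIXELS = 5.0, 76_800, 5_914.2  # Table 3.1 values
FR_MAX = 30.0    # assumed maximum frame-rate [fps] (hypothetical)
FR_FLOOR = 1.0   # assumed lower bound so a stationary object is still revisited

def decide_fr(v):
    """Equation (4.1), clamped to [FR_FLOOR, FR_MAX] (the clamp is an assumption)."""
    b = 6 * TEMPLATE_PIXELS + 3
    fr = 2 * math.sqrt(b / (ALPHA * FRAME_PIXELS + b)) * v
    return min(FR_MAX, max(FR_FLOOR, fr))

def track(position_at, duration):
    """Run the loop of Algorithm 1 on a synthetic 1-D trajectory.

    position_at(t) stands in for ObtainFrame and SearchObject: it returns the
    object's coordinate at time t, i.e. the search is assumed to succeed.
    Returns a list of (time, coordinate, estimated speed, next frame-rate).
    """
    t, fr = 0.0, FR_MAX                  # start at the maximum frame-rate
    prev = position_at(t)
    log = []
    while t < duration:                  # line 1: while SystemIsRunning()
        t += 1.0 / fr                    # lines 2-3: wait for and obtain the next frame
        coord = position_at(t)           # lines 4-5: search (idealised here)
        v = abs(coord - prev) * fr       # line 6: distance moved x current frame-rate
        fr = decide_fr(v)                # lines 7-8: FR_next <- FR_min
        log.append((t, coord, v, fr))
        prev = coord                     # line 9: move to the next iteration
    return log

def demo_position(t):
    """Synthetic object: 50 pixels/s for the first second, then 10 pixels/s."""
    return 50.0 * t if t < 1.0 else 50.0 + 10.0 * (t - 1.0)

log = track(demo_position, duration=3.0)
print(f"{len(log)} frames processed; final frame-rate {log[-1][3]:.2f} fps")
```

In this toy run the frame-rate follows the object speed, so fewer frames are processed than the 90 frames a fixed 30 fps system would handle in the same three seconds.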


Figure 4.3: Simulation flow. (A benchmark input video is fed to the simulator, which obtains a frame, performs object tracking, and calculates accuracy and energy; the outputs are tracking accuracy and energy consumption.)

4.4 Evaluation

4.4.1 Evaluation overview

Experimental setup

To evaluate the effectiveness of the proposed method on actual videos, the tracking accuracy and energy consumption for each video were simulated. For the evaluation, an object tracking simulator constructed using OpenCV [66] is utilized. The procedure of this evaluation is shown in Figure 4.3. First, a benchmark video for object tracking is input to the simulator. Subsequently, the simulator reads the input frame by frame and executes the tracking process. Every time one frame is processed, the tracking accuracy is calculated and the energy consumption is estimated. After the last frame of the input video has been processed, the obtained tracking accuracy and total energy consumption are output.

The simulator calculates tracking accuracy and energy consumption. The evaluation index
