Doctoral Dissertation Abstract


Graduate School of Information, Production and Systems, Waseda University

Title: Motion Correlation Based Low Complexity and Low Power Schemes for Video Codec

Applicant: Chen LIU
Major in Information, Production and Systems Engineering, Multimedia Systems Research

August 2012

Among the many kinds of multimedia applications, video is the most expressive and attractive. Owing to the fast development of multimedia communication technology, video is more and more widely used in modern society. With the increasing demand for a more convenient and higher-quality video experience, more efficient video coding schemes are attracting growing attention. The popularity of video applications on mobile devices has made low-complexity and low-power design for video codecs an important research topic. To optimize the video codec system, a number of algorithms have been proposed for both the encoder and the decoder.

This dissertation presents my research on low-complexity and low-power schemes for the H.264/AVC encoder and decoder based on motion correlation. The motion correlation information, such as motion vectors (MVs) and residuals, which is generated during the video compression process, is exploited in the proposed schemes. This information reveals underlying spatial and temporal relations in the video content, which helps motion feature analysis and the optimization of the coding process.

In a video encoder, Motion Estimation (ME) accounts for about 70% of the computation, so reducing the computation of ME is an important issue. ME offers 8 inter modes for inter-frame prediction. Reducing the computational complexity through proper mode selection, while preserving visual quality and compression efficiency, has become a research topic known as fast inter mode decision. Unlike previous inter mode decision algorithms (e.g., Yu's, ICASSP 2004), which mainly analyze features of the frame contents in the original video, the proposed algorithms use motion correlation information such as motion vectors and residuals. The proposed motion adaptation based algorithm is introduced in Chapter 2, and the proposed residual feature based algorithm is introduced in Chapter 3.

For the video decoder, a novel partial video decoding scheme is proposed to reduce the decoding time. Usually, when High-Definition (HD) video is played on a relatively low-resolution screen, such as that of a smartphone, the frame contents have to be fully decoded and then down-sampled to fit the screen resolution of the device. In Chapter 4, an Object of Interest (OOI) oriented partial decoding scheme is proposed. The OOI is defined by the user, and only the OOI-related region is decoded and displayed on the device screen. In this way, the OOI region keeps the visual quality of the original video. To achieve partial decoding while keeping the visual quality of the Decoded Partial Area (DPA), motion correlation information such as motion vectors and residuals is used. The proposed partial decoding also decreases the decoding time.

To verify the low-power effectiveness of the proposed method in practice, a new workload prediction algorithm is developed for the partial decoding scheme proposed in Chapter 4. It works together with Dynamic Voltage Frequency Scaling (DVFS) on an embedded platform to reduce power consumption and energy cost. This envelope detection based workload prediction method is proposed in Chapter 5.

This dissertation consists of six chapters as follows:

Chapter 1 [Introduction] introduces the H.264/AVC video coding standard. Motion correlation, which is the key to this dissertation, is discussed. The contributions of this dissertation are also summarized.

Chapter 2 [Motion Adaptation Based Inter Mode Decision] proposes a fast inter mode decision algorithm for H.264/AVC encoding based on the motion vector of the current macroblock. The dynamically updated motion vector (MV) obtained from the 16×16 motion search, together with its absolute value, is used to select the modes. According to the absolute value of the MV, the mode to be chosen is judged among the 8 modes. First, candidate modes are selected from the 5 modes in the first level according to appropriate thresholds. Then, for the 8×8 block modes in the second level, all 4 modes (8×8, 8×4, 4×8, 4×4) are evaluated. The encoding time saving can be increased further by combining the algorithm with Choi's SKIP-early strategy (IEEE Trans. CSVT 2006). Compared with the H.264 reference software JM14.1, the proposed algorithm achieves an average time reduction of 33.4% with minimal quality loss and little bit-rate increase.
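
The two-level candidate selection described for Chapter 2 can be illustrated with a minimal sketch. The abstract gives neither the threshold values nor the exact mapping from MV magnitude to candidate modes, so the constants `T_LOW` and `T_HIGH`, the magnitude measure |mv_x| + |mv_y|, and the three candidate sets below are all assumptions chosen for illustration only.

```python
# Illustrative sketch only: thresholds, magnitude measure and candidate
# sets are hypothetical; the abstract does not specify them.
T_LOW = 4    # assumed threshold, in quarter-pel units
T_HIGH = 16  # assumed threshold, in quarter-pel units

FIRST_LEVEL = ["SKIP", "16x16", "16x8", "8x16", "P8x8"]  # 5 first-level modes
SECOND_LEVEL = ["8x8", "8x4", "4x8", "4x4"]              # 4 second-level modes


def candidate_modes(mv_x, mv_y, t_low=T_LOW, t_high=T_HIGH):
    """Return (first_level, second_level) candidate mode lists.

    mv_x, mv_y: motion vector from the 16x16 motion search.
    A small |MV| keeps only large partitions; a large |MV| keeps all
    first-level modes and therefore also all second-level modes.
    """
    magnitude = abs(mv_x) + abs(mv_y)  # assumed measure of the absolute MV
    if magnitude <= t_low:
        first = ["SKIP", "16x16"]
    elif magnitude <= t_high:
        first = ["SKIP", "16x16", "16x8", "8x16"]
    else:
        first = list(FIRST_LEVEL)
    # When the 8x8 partition survives the first level, all four
    # second-level modes are evaluated, as the abstract states.
    second = list(SECOND_LEVEL) if "P8x8" in first else []
    return first, second
```

For a nearly static macroblock, `candidate_modes(0, 0)` keeps only the two largest modes under these assumed thresholds, which is where the saving in mode search comes from.
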
Compared with JM combined with Choi's SKIP-early strategy, the algorithm of Chapter 2 still reduces encoding time by 19.1%.

Chapter 3 [Residual Feature Based Fast Inter Mode Decision] proposes a fast inter mode decision algorithm based on the residual of macroblocks. The proposed algorithm optimizes both levels of inter mode decision. The residual is obtained after the motion search of the 16×16 mode or the 8×8 mode. In the previous residual-based method by Wang (ICME 2007), all residuals are summed together to select modes. In the proposed algorithm, the positive and negative residuals are calculated separately. The motion analysis is therefore more detailed, which results in more accurate mode selection. The most appropriate mode for the current block is chosen according to the complexity of the block (16×16, 8×8) and the similarity between two blocks (8×8, 4×4). Experiments show that, compared with JM14.1, the amount of mode search is reduced by 60% to 72%. As a result, the proposed algorithm achieves an average time saving of 56.6% compared with JM. Compared with Wang's method, the time saving is 21.7%.

Chapter 4 [Encoder Unconstrained User Interactive Partial Decoding Scheme] presents an Object of Interest (OOI) oriented partial decoding scheme that operates at the decoder side only. No processing is required at the encoder side. The OOI, which is defined by the user at the decoder side, is tracked, and only the OOI-related partial area is decoded. OOI tracking is conducted for every P frame, and the Decoded Partial Area (DPA) is calculated and updated after decoding each I frame. In these processes, all motion vectors are used, but only the residual of the macroblocks that belong to the DPA is used. Thus, only compressed-domain information is used for OOI tracking and DPA adaptation: the motion vectors are recorded to serve OOI tracking and DPA adaptation, and the residual information is selectively used for data decoding. During partial decoding, the reference block or pixels may belong to the undecoded area (outside the DPA). In that case, for an inter block, the closest block in the DPA is used as the candidate reference block (named direction preferred reference block relocation (RBR) in this dissertation); for an intra block, certain pixels in the co-located block of the previous frame are used as the candidate reference pixels (named co-located temporal Intra prediction (CTIP) in this dissertation). The proposed partial decoding scheme provides an average decoding time reduction of 50.2% at a DPA ratio of 29% compared with the full decoding process, and the PSNR (Peak Signal to Noise Ratio) drop of the displayed region is only 0.09 dB. This proposal is especially useful for displaying HD video on portable devices, where battery life is a crucial factor.
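
The idea of relocating a reference block that falls outside the DPA can be sketched as follows. This is not the dissertation's RBR rule: the abstract does not define how the direction preference is applied, so the sketch uses a plain nearest-position clamp into a rectangular DPA. The `Rect` type, the function name `relocate_reference`, and the assumption that the DPA is a rectangle at least one block wide and tall are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Hypothetical rectangular DPA in pixels; right/bottom are exclusive."""
    left: int
    top: int
    right: int
    bottom: int


def relocate_reference(x, y, size, dpa):
    """Move a size x size reference block at (x, y) to the nearest
    position that lies fully inside the DPA.

    Simplification: a per-axis clamp. The direction preference of the
    dissertation's RBR is not modelled here.
    Assumes the DPA is at least `size` pixels wide and tall.
    """
    max_x = dpa.right - size   # last valid left edge
    max_y = dpa.bottom - size  # last valid top edge
    new_x = min(max(x, dpa.left), max_x)
    new_y = min(max(y, dpa.top), max_y)
    return new_x, new_y
```

A block that already lies inside the DPA is returned unchanged, so the relocation only affects references that point into the undecoded area.
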
An encoder-unconstrained solution is also a necessary condition for real-time broadcasting.

Chapter 5 [Envelope Detection Based Workload Prediction] presents a new workload prediction algorithm for the partial decoding scheme proposed in Chapter 4, which works together with Dynamic Voltage Frequency Scaling (DVFS) on an embedded platform to achieve low power consumption and energy cost. For workload estimation, the previous Hilbert transform based method by Jin (IEEE Trans. CAD 2012) has the drawbacks of a relatively large computational cost and a still noticeable deadline missing rate. The proposed algorithm predicts future workloads by detecting the envelope of the differences between previous adjacent actual workloads of P frames. To further improve the accuracy of the envelope detection, negative truncation is introduced, which ignores negative differences in the calculation. The deadline missing rate of Jin's method is 4.61%, while that of the proposed method is only 0.66%. This proposal is implemented on the RP2 board and works together with DVFS. Compared with the method without workload prediction and DVFS, the power reduction is about 41.7% and the energy saving is about 10.2%. Compared with full decoding, the energy saving is about 61.1%.

Chapter 6 [Conclusion] concludes the whole dissertation.
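
The workload prediction summarized for Chapter 5 can be illustrated with a minimal sketch. The abstract states only that the envelope of the differences between adjacent P-frame workloads is detected and that negative differences are ignored. The peak-hold envelope follower with a `decay` factor, and the rule of adding the envelope to the last workload, are assumptions made here for illustration and may differ from the dissertation's actual detector.

```python
def predict_next_workload(workloads, decay=0.9):
    """Predict the next P-frame workload from past actual workloads.

    workloads: non-empty sequence of past actual workloads (e.g. cycles).
    decay: hypothetical decay factor of the peak-hold envelope follower.
    """
    if not workloads:
        raise ValueError("at least one past workload is required")
    envelope = 0.0
    for prev, cur in zip(workloads, workloads[1:]):
        # Negative truncation: a drop in workload contributes nothing.
        diff = max(cur - prev, 0.0)
        # Assumed envelope follower: hold the peak, let it decay slowly.
        envelope = max(diff, decay * envelope)
    # Assumed prediction rule: last workload plus the current envelope.
    return workloads[-1] + envelope
```

Because negative differences are truncated, the prediction never falls below the last observed workload, which biases the predictor toward over-provisioning and is consistent with the stated goal of a low deadline missing rate.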
