Blind Source Separation Based on Fast-Convergence Algorithm Using ICA and Beamforming for Real Convolutive Mixture
$$\qquad -\,\langle \boldsymbol{\Phi}(\boldsymbol{Y}(f,t))\,\boldsymbol{Y}^{\mathrm H}(f,t)\rangle_t \,\bigr]\,\boldsymbol{W}_i(f) + \boldsymbol{W}_i(f), \tag{1}$$

where $\langle\cdot\rangle_t$ denotes the time-averaging operator, $i$ is used to express the value of the $i$th step in the iterations, and $\eta$ is the step-size parameter. Also, we define the nonlinear vector function $\boldsymbol{\Phi}(\cdot)$ as

$$\boldsymbol{\Phi}(\boldsymbol{Y}(f,t)) \equiv [\Phi(Y_1(f,t)), \ldots, \Phi(Y_L(f,t))]^{\mathrm T}, \tag{2}$$

$$\Phi(Y_l(f,t)) \equiv [1+\exp(-Y_l^{(R)}(f,t))]^{-1} + j\,[1+\exp(-Y_l^{(I)}(f,t))]^{-1}, \tag{3}$$

where $Y_l^{(R)}(f,t)$ and $Y_l^{(I)}(f,t)$ are the real and imaginary parts of $Y_l(f,t)$, respectively.

3. PROPOSED ALGORITHM

The conventional ICA method inherently has a significant disadvantage, namely slow convergence of the nonlinear optimization in ICA. In order to resolve the problem, we propose an algorithm based on the temporal alternation of learning between ICA and beamforming; the inverse of the mixing matrix obtained through ICA is temporally substituted by the matrix based on null beamforming for a temporal initialization or acceleration of the iterative optimization. The proposed algorithm is conducted by the following steps with respect to all frequency bins in parallel (see Fig. 2).

Fig. 2. Proposed algorithm combining frequency-domain ICA and beamforming.

[Step 1: Initialization] Set the initial $\boldsymbol{W}_i(f)$, i.e., $\boldsymbol{W}_0(f)$, to an arbitrary value, where the subscript $i$ is set to be 0.

[Step 2: 1-time ICA iteration] Optimize $\boldsymbol{W}_i(f)$ using the following 1-time ICA iteration:

$$\boldsymbol{W}_{i+1}^{(\mathrm{ICA})}(f) = \eta\,\bigl[\,\mathrm{diag}\bigl(\langle \boldsymbol{\Phi}(\boldsymbol{Y}(f,t))\,\boldsymbol{Y}^{\mathrm H}(f,t)\rangle_t\bigr) - \langle \boldsymbol{\Phi}(\boldsymbol{Y}(f,t))\,\boldsymbol{Y}^{\mathrm H}(f,t)\rangle_t \,\bigr]\,\boldsymbol{W}_i(f) + \boldsymbol{W}_i(f), \tag{4}$$

where the superscript "(ICA)" is used to express that the inverse of the mixing matrix is obtained by ICA.

[Step 3: DOA estimation] Estimate DOAs of the sound sources by utilizing the directivity pattern of the array system, $F_l(f,\theta)$, which is given by

$$F_l(f,\theta) = \sum_{k=1}^{K} W_{lk}^{(\mathrm{ICA})}(f)\,\exp[\,j2\pi f d_k \sin\theta / c\,], \tag{5}$$

where $W_{lk}^{(\mathrm{ICA})}(f)$ is the element of $\boldsymbol{W}_{i+1}^{(\mathrm{ICA})}(f)$. In the directivity patterns, directional nulls exist in only two particular directions. Accordingly, by obtaining statistics with respect to the directions of nulls at all frequency bins, we can estimate the DOAs of the sound sources. The DOA of the $l$th sound source, $\hat\theta_l$, can be estimated as

$$\hat\theta_l = \frac{2}{N}\sum_{m=1}^{N/2} \theta_l(f_m),$$

where $N$ is the total number of DFT points, and $\theta_l(f_m)$ represents the DOA of the $l$th sound source at the $m$th frequency bin. These are given by

$$\theta_1(f_m) = \min\bigl[\,\arg\min_\theta |F_1(f_m,\theta)|,\ \arg\min_\theta |F_2(f_m,\theta)|\,\bigr], \tag{6}$$

$$\theta_2(f_m) = \max\bigl[\,\arg\min_\theta |F_1(f_m,\theta)|,\ \arg\min_\theta |F_2(f_m,\theta)|\,\bigr], \tag{7}$$

where $\min[x,y]$ ($\max[x,y]$) is defined as a function that returns the smaller (larger) value of $x$ and $y$.

[Step 4: Beamforming] Construct an alternative matrix for signal separation, $\boldsymbol{W}^{(\mathrm{BF})}(f)$, based on the null-beamforming technique where the DOA results obtained in the previous step are used. In the case that the look direction is $\hat\theta_1$ and the directional null is steered to $\hat\theta_2$, the elements of the matrix for signal separation are given as

$$W_{11}^{(\mathrm{BF})}(f_m) = \exp[-j2\pi f_m d_1 \sin\hat\theta_1/c] \times \bigl\{\exp[\,j2\pi f_m d_1(\sin\hat\theta_2-\sin\hat\theta_1)/c\,] - \exp[\,j2\pi f_m d_2(\sin\hat\theta_2-\sin\hat\theta_1)/c\,]\bigr\}^{-1}, \tag{8}$$

$$W_{12}^{(\mathrm{BF})}(f_m) = -\exp[-j2\pi f_m d_2 \sin\hat\theta_1/c] \times \bigl\{\exp[\,j2\pi f_m d_1(\sin\hat\theta_2-\sin\hat\theta_1)/c\,] - \exp[\,j2\pi f_m d_2(\sin\hat\theta_2-\sin\hat\theta_1)/c\,]\bigr\}^{-1}. \tag{9}$$

Also, in the case that the look direction is $\hat\theta_2$ and the directional null is steered to $\hat\theta_1$, the elements of the matrix are given as

$$W_{21}^{(\mathrm{BF})}(f_m) = -\exp[-j2\pi f_m d_1 \sin\hat\theta_2/c] \times \bigl\{-\exp[\,j2\pi f_m d_1(\sin\hat\theta_1-\sin\hat\theta_2)/c\,] + \exp[\,j2\pi f_m d_2(\sin\hat\theta_1-\sin\hat\theta_2)/c\,]\bigr\}^{-1}, \tag{10}$$

$$W_{22}^{(\mathrm{BF})}(f_m) = \exp[-j2\pi f_m d_2 \sin\hat\theta_2/c] \times \bigl\{-\exp[\,j2\pi f_m d_1(\sin\hat\theta_1-\sin\hat\theta_2)/c\,] + \exp[\,j2\pi f_m d_2(\sin\hat\theta_1-\sin\hat\theta_2)/c\,]\bigr\}^{-1}. \tag{11}$$

[Step 5: Diversity with cost function] Select the most suitable unmixing matrix in each frequency bin and at each iteration point, i.e., algorithm diversity in both the iteration and frequency domains. As a cost function used to achieve the diversity, we calculate two kinds of cosine distances between the separated signals which are obtained by ICA and beamforming. These are given by

$$J^{(\mathrm{ICA})}(f) = \frac{\bigl|\langle Y_1^{(\mathrm{ICA})}(f,t)\,Y_2^{(\mathrm{ICA})}(f,t)^{*}\rangle_t\bigr|}{\bigl\{\langle |Y_1^{(\mathrm{ICA})}(f,t)|^2\rangle_t \cdot \langle |Y_2^{(\mathrm{ICA})}(f,t)|^2\rangle_t\bigr\}^{1/2}}, \tag{12}$$

$$J^{(\mathrm{BF})}(f) = \frac{\bigl|\langle Y_1^{(\mathrm{BF})}(f,t)\,Y_2^{(\mathrm{BF})}(f,t)^{*}\rangle_t\bigr|}{\bigl\{\langle |Y_1^{(\mathrm{BF})}(f,t)|^2\rangle_t \cdot \langle |Y_2^{(\mathrm{BF})}(f,t)|^2\rangle_t\bigr\}^{1/2}}, \tag{13}$$

where $Y_l^{(\mathrm{ICA})}(f,t)$ is the signal separated by ICA, and $Y_l^{(\mathrm{BF})}(f,t)$ is the signal separated by beamforming. If the separation performance of beamforming is superior to that of ICA, we obtain the condition $J^{(\mathrm{ICA})}(f) > J^{(\mathrm{BF})}(f)$; otherwise $J^{(\mathrm{ICA})}(f) \le J^{(\mathrm{BF})}(f)$. Thus, an observation of the conditions yields the following algorithm:

$$\boldsymbol{W}(f) = \begin{cases} \boldsymbol{W}_{i+1}^{(\mathrm{ICA})}(f), & J^{(\mathrm{ICA})}(f) \le J^{(\mathrm{BF})}(f), \\ \boldsymbol{W}^{(\mathrm{BF})}(f), & J^{(\mathrm{ICA})}(f) > J^{(\mathrm{BF})}(f). \end{cases} \tag{14}$$

If the $(i+1)$th iteration was the final iteration, go to step 6; otherwise go back to step 2 and repeat the ICA iteration, inserting the $\boldsymbol{W}(f)$ given by Eq. (14) into $\boldsymbol{W}_i(f)$ in Eq. (4) with an increment of $i$.

[Step 6: Ordering and scaling] Using the DOA information obtained in step 3, we detect and correct the source permutation and the gain inconsistency [6].

4. EXPERIMENTS IN REVERBERANT ROOM

4.1. Conditions for experiments

A two-element array with an interelement spacing of 4 cm is assumed. The speech signals are assumed to arrive from two directions, $-30^\circ$ and $40^\circ$. Two kinds of sentences, those spoken by two male and two female speakers selected from the ASJ continuous speech corpus for research, are used as the original speech samples. Using these sentences, we obtain 12 combinations with respect to speakers and source directions. In these experiments, we use the following signals as the source signals: the original speech convolved with the impulse responses specified by different reverberation times (RTs) of 150 msec and 300 msec. The impulse responses are recorded in a variable reverberation time room as shown in Fig. 3.
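The construction of the test signals described above (speech convolved with room impulse responses) can be sketched as follows. This is only a minimal illustration under the assumption that the signal at microphone k is the sum, over sources, of each source convolved with the impulse response from that source to microphone k. The function name `convolutive_mixture` and the toy arrays are hypothetical; the recorded impulse responses and the ASJ speech samples are not reproduced here.

```python
import numpy as np

def convolutive_mixture(sources, impulse_responses):
    """Return x[k] = sum_l (h[k][l] * s[l]), where * is linear convolution.

    sources:            list of 1-D arrays, one per source l
    impulse_responses:  impulse_responses[k][l] is the response from source l
                        to microphone k (hypothetical layout for this sketch)
    """
    n_mics = len(impulse_responses)
    out_len = (max(len(s) for s in sources)
               + max(len(h) for row in impulse_responses for h in row) - 1)
    x = np.zeros((n_mics, out_len))
    for k, row in enumerate(impulse_responses):
        for s, h in zip(sources, row):
            y = np.convolve(s, h)       # contribution of one source at mic k
            x[k, :len(y)] += y
    return x

# Toy stand-ins (hypothetical): 3-sample "sources" and 2-tap "impulse responses".
s = [np.array([1.0, 2.0, 3.0]), np.array([4.0, 5.0, 6.0])]
h = [[np.array([1.0, 0.0]), np.array([0.0, 1.0])],
     [np.array([0.0, 1.0]), np.array([1.0, 0.0])]]
x = convolutive_mixture(s, h)
```

With measured responses of several thousand taps, an FFT-based convolution would likely be preferable to `np.convolve`; the direct form is used here only to keep the sketch short.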
The analytical conditions of these experiments are as follows: the sampling frequency is 8 kHz, the frame length is 128 msec, the frame shift is 2 msec, and the step-size parameter $\eta$ is set to be $1.0 \times 10^{-5}$.

Fig. 3. Layout of reverberant room used in experiments.

4.2. Objective evaluation of separated signals

In order to compare the performance of the proposed algorithm with that of the conventional BSS described in Sect. 2 for different iteration points in ICA, the noise reduction rate (NRR), defined as the output signal-to-noise ratio (SNR) in dB minus the input SNR in dB, is shown in Fig. 4. These values are averages over all of the combinations with respect to speakers and source directions. As for the proposed algorithm, we also plot the NRR rescaled by the computational cost (see dotted lines) because the proposed algorithm has a computational complexity of about 1.9-fold compared with the conventional ICA.

Fig. 4. Noise reduction rates for different iteration points in ICA. Reverberation time is 150 msec (top) and 300 msec (bottom). [Horizontal axis: number of iterations. Curves: conventional ICA, proposed method, proposed method rescaled by computational cost.]

In Fig. 4, it is evident that the separation performances of the proposed algorithm are superior to those of the conventional ICA-based BSS method at every iteration point, even considering the additional computational cost of the proposed algorithm. For example, compared with the conventional method, the proposed method improves the NRR by about 4.6 dB at the 50-iteration point in the conventional ICA when the RT is 150 msec. Also, when the RT is 300 msec, the proposed method improves the NRR by about 1.5 dB.

Figure 5 shows a result of the alternation between ICA and null beamforming through iterative optimization by the proposed algorithm when the RT is 300 msec. In this figure, the symbol "・" represents that null beamforming is used at the iteration point and frequency bin. As shown in Fig. 5, the proposed algorithm can work automatically as follows: (1) null beamforming is used for the acceleration of learning at early times in the iterations because $\boldsymbol{W}^{(\mathrm{BF})}(f)$ is a rough approximation of the inverse of the mixing matrix $\boldsymbol{A}(f)$, (2) ICA is used after the early part of the iterations because ICA can update the inverse of the mixing matrix more accurately, and (3) the inverse of the mixing matrix obtained by ICA is substituted by the matrix based on null beamforming throughout all iteration points at particular frequency bins where the independence between the sources is low.

Fig. 5. The result of alternation between ICA and null beamforming through iterative optimization by the proposed algorithm. The symbol "・" represents that null beamforming is used at the iteration point and frequency bin. The RT is 300 msec. [Horizontal axis: number of iterations.]

From these results, although null beamforming is not suitable for signal separation under the condition that the direct sounds and their reflections exist, we can confirm that the temporal utilization of null beamforming for algorithm diversity through ICA iterations is effective for improving the separation performance and convergence.

5. EXPERIMENTS IN CAR ENVIRONMENT

A two-element array with an interelement spacing of 4 cm is assumed. The speech signals are assumed to arrive from two directions, $5^\circ$ for the driver and $50^\circ$ for the speaker in the assistant seat. The impulse responses are recorded in a real car environment as shown in Fig. 6, where we use three kinds of array positions. The analytical conditions in this experiment are the same as those of the previous section, except for the sampling frequency (which is 16 kHz).

Fig. 6. Layout of array in car cabin used in experiment.

Figure 7 shows NRR results of the proposed method, where we also plot the results of the conventional 16-element Delay-and-Sum (DS) array for comparison (a priori information on DOAs was given to the DS array). From this figure, it is evident that the separation performances of the proposed method are remarkably superior to those of the conventional DS array at every array position. This indicates that the BSS is effective for speech enhancement in the car environment.

6. CONCLUSION

In this paper, we described a fast- and high-convergence algorithm for BSS where null beamforming is used for temporal algorithm diversity through ICA iterations. The results of the signal separation experiments reveal that the signal separation performance of the proposed algorithm is superior to that of the conventional ICA-based BSS method, and that the utilization of null beamforming in ICA is effective for improving the separation performance and convergence, even under reverberant conditions.
Also, the experiment in a real car environment shows that the separation performances of the proposed method are remarkably superior to those of the conventional DS array.

Fig. 7. Noise reduction rates for different array positions (Array 1, Array 2, Array 3).

7. ACKNOWLEDGEMENT

The authors are grateful to Dr. Shoji Makino and Mr. Ryo Mukai of NTT. CO., LTD, and Mr. Masaru Yamazaki of NISSAN MOTOR CO., LTD. for their discussions on this work. This work was partly supported by NISSAN MOTOR CO., LTD. and CREST (Core Research for Evolutional Science and Technology) in Japan.

8. REFERENCES

[1] P. Comon, "Independent component analysis, a new concept?," Signal Processing, vol.36, pp.287-314, 1994.

[2] N. Murata and S. Ikeda, "An on-line algorithm for blind source separation on speech signals," Proceedings of 1998 International Symposium on Nonlinear Theory and Its Application (NOLTA '98), vol.3, pp.923-926, Sep. 1998.

[3] P. Smaragdis, "Blind separation of convolved mixtures in the frequency domain," Neurocomputing, vol.22, pp.21-34, 1998.

[4] L. Parra and C. Spence, "Convolutive blind separation of non-stationary sources," IEEE Trans. Speech & Audio Process., vol.8, pp.320-327, 2000.

[5] H. Saruwatari, S. Kurita, K. Takeda, F. Itakura, and K. Shikano, "Blind source separation based on subband ICA and beamforming," Proc. ICSLP2000, vol.3, pp.94-97, Oct. 2000.

[6] S. Kurita, H. Saruwatari, S. Kajita, K. Takeda, and F. Itakura, "Evaluation of blind signal separation method using directivity pattern under reverberant conditions," Proc. ICASSP2000, vol.5, pp.3140-3143, June 2000.
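As an illustration of Steps 2 to 5 of Sect. 3 for a single frequency bin, the following is a minimal sketch under several assumptions, not the authors' implementation. The speed of sound (340 m/s), the microphone positions, the angle grid, and the synthetic anechoic data are hypothetical. The null-beamforming matrix is obtained here as the numerical inverse of the 2×2 steering matrix built with the phase convention of Eq. (5), rather than from closed-form elements, and the cosine distance is assumed to be the normalized cross-correlation magnitude between the two outputs.

```python
import numpy as np

C = 340.0  # speed of sound in m/s (assumed; not stated in the text)

def phi(y):
    """Eq. (3): sigmoid applied separately to real and imaginary parts."""
    return 1.0 / (1.0 + np.exp(-y.real)) + 1j / (1.0 + np.exp(-y.imag))

def ica_step(W, X, eta):
    """Eq. (4) for one bin. W: (L, K) matrix, X: (K, T) spectra over frames."""
    Y = W @ X
    R = phi(Y) @ Y.conj().T / X.shape[1]           # time average <Phi(Y) Y^H>_t
    return eta * (np.diag(np.diag(R)) - R) @ W + W

def steering(f, thetas, mic_pos):
    """Element [k, n] is exp(j 2 pi f d_k sin(theta_n) / c), as in Eq. (5)."""
    return np.exp(2j * np.pi * f * np.outer(mic_pos, np.sin(thetas)) / C)

def estimate_doas(W, f, mic_pos, grid):
    """Eqs. (5)-(7) for one bin: null direction of each output, sorted."""
    F = np.abs(W @ steering(f, grid, mic_pos))     # |F_l(f, theta)| on the grid
    nulls = grid[np.argmin(F, axis=1)]
    return nulls.min(), nulls.max()

def null_beamformer(f, doas, mic_pos):
    """Row l has unit gain toward doas[l] and a null toward the other DOA."""
    return np.linalg.inv(steering(f, doas, mic_pos))

def cosine_distance(Y):
    """Assumed cost: normalized cross-correlation magnitude of the outputs."""
    num = np.abs(np.mean(Y[0] * Y[1].conj()))
    den = np.sqrt(np.mean(np.abs(Y[0]) ** 2) * np.mean(np.abs(Y[1]) ** 2))
    return num / den

def select(W_ica, W_bf, X):
    """Eq. (14): keep the ICA matrix unless beamforming has the smaller cost."""
    if cosine_distance(W_ica @ X) <= cosine_distance(W_bf @ X):
        return W_ica
    return W_bf

# Toy check with synthetic anechoic data (all values hypothetical).
rng = np.random.default_rng(0)
f, mics = 1000.0, np.array([0.0, 0.04])
doas = np.deg2rad([-30.0, 40.0])
S = rng.standard_normal((2, 2000)) + 1j * rng.standard_normal((2, 2000))
X = steering(f, doas, mics) @ S                    # instantaneous mixing per bin
W_bf = null_beamformer(f, doas, mics)
W_next = select(ica_step(np.eye(2, dtype=complex), X, 1e-5), W_bf, X)
```

In this anechoic toy case the beamforming matrix is selected after the first iteration, since a single ICA step from the identity has barely moved. In a full implementation the same per-bin logic would run over all bins and iterations, followed by the permutation and scaling correction of Step 6, which is omitted here.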