Timing Optimization of Filter Replacement in Compressive Coding for Stereo Audio Signals Using Independent Component Analysis
$$X(f,t) = A(f)S(f,t). \tag{1}$$

2.2. Blind source separation

Although sparseness between individual musical instrument sources is generally high, sparseness between the channels of a signal consisting of many instruments is low, because the channels are highly correlated with each other. We therefore use blind source separation (BSS) based on frequency-domain independent component analysis (FDICA) [4] to separate the mixed signals into the individual instrument sources, which enhances the sparseness between the channels of the signals. First, we perform signal separation using an $L \times M$ complex-valued separation matrix $W(f)$, optimized so that the output signals $Y(f,t)$ in the time-frequency domain become mutually statistically independent. Thus $Y(f,t)$ is given as

$$Y(f,t) = [Y_1(f,t), \ldots, Y_L(f,t)]^{\mathrm T} = W(f)X(f,t). \tag{2}$$

In addition, $W(f)$ can be optimized by the following iterative update rule, the so-called higher-order ICA (HO-ICA):

$$W^{[i+1]}(f) = \mu\left[I - \left\langle \Phi(Y(f,t))\,Y(f,t)^{\mathrm H} \right\rangle_t\right] W^{[i]}(f) + W^{[i]}(f), \tag{3}$$

where $I$ denotes the identity matrix, $\langle\cdot\rangle_t$ denotes the time-averaging operator, superscript $\mathrm H$ denotes complex conjugate transposition of a vector or matrix, $[i]$ indicates the value at the $i$-th step of the iterations, and $\mu$ is the step-size parameter. Also, $\Phi(\cdot)$ is an appropriate nonlinear vector function.

Next, we obtain the $M$-channel signal of each source by using projection back (PB) [5], which applies the inverse matrix of $W(f)$. Hence the restored source $\hat{Z}_l(f,t)$ for $l = 1, \ldots, L$ is given as

$$\hat{Z}_l(f,t) = W(f)^{-1}\,[0, \ldots, 0, Y_l(f,t), 0, \ldots, 0]^{\mathrm T}. \tag{4}$$

2.3. Selective placement (SP) [3]

We show the configuration of the SP coding algorithm in Fig. 1. Here we compare the powers of the time-frequency grids between the two channels, which have become highly sparse via FDICA, and we perform encoding.

Figure 1. Configuration of our previous SP coding.

First, we need to detect the non-sparse grids between the channels, so we check whether the power of the ICA outputs exceeds a threshold. For this we use the largest power component among the channels,

$$P_l(f,t) = \max_m \left|\hat{Z}_{ml}(f,t)\right|^2, \quad l = 1, 2, \tag{5}$$

where $\hat{Z}_{ml}(f,t)$ is the $m$-th component of $\hat{Z}_l(f,t)$. Non-sparse grids are detected with a threshold $\mathrm{Th}$ as

$$\mathrm{Th} < P_1(f,t)\,P_2(f,t). \tag{6}$$

A grid is thus judged to be sparse if (6) is not satisfied.

We now describe the encoding process. If (6) holds, we set the flag of non-sparseness by setting

$$J^{\mathrm{SP}}(f,t) = 0, \tag{7}$$

and store both values of $Y(f,t)$ in $V^{\mathrm{SP}}(f,t)$ as

$$V^{\mathrm{SP}}(f,t) = Y(f,t). \tag{8}$$

If (6) does not hold, $Y(f,t)$ is regarded as a sparse grid, and we store the index of the dominant signal in $J^{\mathrm{SP}}(f,t)$ and only the signal with the larger power in $V^{\mathrm{SP}}(f,t)$, as

$$J^{\mathrm{SP}}(f,t) = \operatorname*{argmax}_{l=1,2} P_l(f,t), \tag{9}$$

$$V^{\mathrm{SP}}(f,t) = Y_{J^{\mathrm{SP}}(f,t)}(f,t). \tag{10}$$

In the decoding steps, we first obtain the estimate $\hat{Y}^{\mathrm{SP}}(f,t)$ of $Y(f,t)$ by allocating $V^{\mathrm{SP}}(f,t)$ correctly according to $J^{\mathrm{SP}}(f,t)$. If $J^{\mathrm{SP}}(f,t) = 1$ or $2$,

$$\hat{Y}^{\mathrm{SP}}_i(f,t) = \begin{cases} V^{\mathrm{SP}}(f,t) & \text{if } i = J^{\mathrm{SP}}(f,t), \\ 0 & \text{otherwise}. \end{cases} \tag{11}$$

On the other hand, if $J^{\mathrm{SP}}(f,t) = 0$,

$$\hat{Y}^{\mathrm{SP}}(f,t) = V^{\mathrm{SP}}(f,t). \tag{12}$$

3. Proposed Method

3.1. Motivation

SP achieves high compression efficiency by using ICA to extract sparseness among the sound sources and transmitting only the dominant one. However, sparseness tends to be low when the performance of source separation degrades, so compression efficiency worsens as the number of grids that must be transmitted increases. In fact, the separation filter $W(f)$ is time-invariant and no additional source separation is performed when the composition of musical instruments changes, so the deterioration of sound quality is a serious problem inherent in SP coding. If we conduct source separation of spatially time-variant audio signals using a single separation filter $W(f)$, the performance degrades because the filter cannot adapt to the change of source composition. Moreover, if we use multiple separation filters formed at regular intervals, the performance still degrades in the intervals in which the composition of sources changes. In addition, using several separation filters within an interval of unchanged composition leads to redundant transmission, because we transmit the same separation filter again and again.
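To make the dependence of the transmitted data on sparseness concrete, the SP encoding and decoding rules of Sect. 2.3 can be sketched for a single time-frequency grid as follows. This is a minimal pure-Python sketch for the two-source, two-channel case, not the implementation of [3]. The function names `sp_encode_grid` and `sp_decode_grid`, the tuple layout of the stored values, the tie-break toward source 1, and the example grids are all assumptions introduced for illustration.

```python
def sp_encode_grid(y, w_inv, th):
    """Encode one time-frequency grid.

    y     : pair (Y1, Y2) of ICA outputs at this grid
    w_inv : 2x2 inverse separation matrix as nested lists (used for projection back)
    th    : threshold Th of (6)
    Returns (index J, tuple of stored values V).
    """
    # Projection back gives Z_ml = w_inv[m][l] * Y_l; (5) keeps the largest channel power.
    p = [max(abs(w_inv[m][l] * y[l]) ** 2 for m in range(2)) for l in range(2)]
    if th < p[0] * p[1]:
        # (6) holds: non-sparse grid, flag 0 and keep both outputs, as in (7) and (8).
        return 0, (y[0], y[1])
    # Sparse grid: keep only the dominant output, as in (9) and (10).
    j = 1 if p[0] >= p[1] else 2  # assumption: ties go to source 1
    return j, (y[j - 1],)


def sp_decode_grid(j, v):
    """Rebuild the estimate of (Y1, Y2) from the index J and the stored values V."""
    if j == 0:
        return (v[0], v[1])  # non-sparse grid: both values were stored
    # Sparse grid: place the stored value at position J, zero elsewhere.
    return tuple(v[0] if i == j else 0j for i in (1, 2))


# Hypothetical example: identity projection-back matrix and three grids.
W_INV = [[1.0, 0.0], [0.0, 1.0]]
TH = 0.01
grids = [(1 + 1j, 0.01), (0.5, 2j), (0.02, 1.5)]

encoded = [sp_encode_grid(y, W_INV, TH) for y in grids]
decoded = [sp_decode_grid(j, v) for j, v in encoded]
n_sent = sum(len(v) for _, v in encoded)  # number of complex values transmitted
print(encoded, decoded, n_sent)
```

In this example only the second grid satisfies (6), so four complex values are stored instead of six. The saving shrinks as more grids become non-sparse, which is the degradation described above.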
3.2. Timing optimization algorithm of replacement separation filter

To solve the problem described in Sect. 3.1, we replace the separation filter in accordance with changes in the composition of instruments. Thus, we need to find automatically the optimal timing for replacing the separation filter as the composition of instruments changes. We therefore propose a timing optimization algorithm for replacement, which detects the optimal timing based on the distortion of the decoded signal. We show the configuration of the proposed algorithm in Fig. 2.

Figure 2. Configuration of proposed algorithm.

First, we encode at all possible replacement timings in order to detect the optimal one. The decoded stereo signal $\hat{Z}_{T_k}(f,t_k)$ for one replacement timing $T_k$ ($T_k \in t$) is denoted as

$$\hat{Z}_{T_k}(f,t_k) = W_{T_k}^{-1}(f,t_k)\,\hat{Y}_{T_k}(f,t_k) \quad (t_k = T_{k-1}, \ldots, T_k), \tag{13}$$

where the decoded signal $\hat{Y}_{T_k}(f,t_k)$ is obtained by using the separation filter $W_{T_k}(f,t_k)$, which is in turn obtained by ICA from the input signals $X(f,t_k)$, and $T_0 = 1$. Next, we calculate $\mathrm{SNR}_{T_k}$, the signal-to-noise ratio (SNR) for each replacement timing, as

$$\mathrm{SNR}_{T_k} = 10 \log_{10} \frac{\sum_t \sum_f \|X(f,t)\|^2}{\sum_t \sum_f \|X(f,t) - \hat{Z}_{T_k}(f,t)\|^2}. \tag{14}$$

After calculating $\mathrm{SNR}_{T_k}$ for all replacement timings $T_k$, the optimal timing of separation filter replacement $T_{\mathrm{opt}}$ is given as

$$T_{\mathrm{opt}} = \operatorname*{argmax}_{T_k}\left(\mathrm{SNR}_{T_k}\right). \tag{15}$$

The higher the achieved $\mathrm{SNR}_{T_k}$, the better sparseness is satisfied.

Updating the separation filter $W(f)$ by FDICA in every switching interval requires a huge amount of computation. Fortunately, our purpose here is not high-performance source separation itself but the detection of the filter replacement timing, so source separation of limited performance suffices for searching the optimal replacement timing. In the search steps we therefore use an ICA method faster than HO-ICA (see (3)), e.g., closed-form 2nd-order ICA (SO-ICA) [6][7] or fastICA [8]. Both have fast convergence and a certain level of separation performance.

3.3. Closed-form 2nd-order ICA

Closed-form SO-ICA was found by Tanaka [6], and its application to acoustic signals is now being studied by us [7]. This subsection briefly describes the signal processing in the closed-form ICA. The strict proofs of the theorem are omitted due to space limitations.

First, we obtain the correlation matrices at different time points as

$$R_{t_i}(f) = \left\langle x(f,t)\,x(f,t)^{\mathrm H} \right\rangle_{t_i}, \tag{16}$$

where $\langle\cdot\rangle_{t_i}$ denotes the time-averaging operator over a specific time duration $t_i$, and $i = 1, 2, \ldots$ are the indices of the time-averaging blocks. Next, we apply the singular value decomposition (SVD) to the superposition of $R_{t_i}(f)$, which is represented as

$$\sum_i R_{t_i}(f) = U(f)\,\mathrm{diag}(\lambda_1, \lambda_2, \ldots)\,U(f)^{\mathrm H}, \tag{17}$$

where $\lambda_k$ are the eigenvalues, $\mathrm{diag}(\lambda_1, \ldots)$ denotes the diagonal matrix containing the eigenvalues, and $U(f)$ is the matrix consisting of the eigenvectors. Then we obtain a full-rank decomposition of the pseudo-inverse of $\sum_i R_{t_i}(f)$ as follows:

$$\Big[\sum_i R_{t_i}(f)\Big]^{+} = L(f)L(f)^{\mathrm H}, \tag{18}$$

$$L(f) = U(f)\,\mathrm{diag}\!\left(\frac{1}{\sqrt{\lambda_1}}, \frac{1}{\sqrt{\lambda_2}}, \ldots\right). \tag{19}$$

If the covariance between the sources $s(f,t)$ in $t_i$ is negligible, $L(f)^{\mathrm H} R_{t_i}(f) L(f)$ shares the same eigenvectors for every $i$, and this is written in SVD form as

$$L(f)^{\mathrm H} R_{t_i}(f) L(f) = T(f)\,\mathrm{diag}(\sigma_1(t_i), \sigma_2(t_i), \ldots)\,T(f)^{\mathrm H}, \tag{20}$$

where $\sigma_k(t_i)$ are the eigenvalues for a specific time block $t_i$, and $T(f)$ denotes the matrix of shared eigenvectors, which is independent of the time-block index $i$. Therefore, for any $i$, the simultaneous diagonalization of $R_{t_i}(f)$ can be achieved as

$$T(f)^{\mathrm H} L(f)^{\mathrm H} R_{t_i}(f) L(f) T(f) = \mathrm{diag}(\sigma_1(t_i), \sigma_2(t_i), \ldots), \tag{21}$$

and this means that the optimal separation filter matrix in the 2nd-order sense is given by

$$W_{\mathrm{SO}}(f) = \left(L(f)T(f)\right)^{\mathrm H}. \tag{22}$$

The computational cost of the closed-form SO-ICA is very small. In fact, the whole computation in the closed-form solution is almost the same as that of 1 or 2 iterations of HO-ICA.

4. Experiments and results

4.1. Conditions of experiments

In this experiment we evaluate the performance of the timing optimization algorithm for the separation filter in compressive coding using ICA, and we use two stereo recordings of music. At first, in track 1, a flute is localized at the right and a guitar at the center. Next,
track 1 changes so that the flute lies between the right and the center while the guitar stays at the center, as shown in Fig. 3. Track 2 at first has a flute at the left and a guitar at the right. Next, track 2 changes so that the flute is absent, the guitar is at the center, and both drums and a bass are at the center, as shown in Fig. 3. Both track 1 and track 2 were recorded and edited by professional musicians, and in both the composition of the instruments changes near the middle of the signal. They were recorded at a sampling frequency of 44.1 kHz with 16-bit quantization. The filter length is 1024 taps. The window size is 1024 points with 90 overlap points (60-point hanning, 30-point zeros). In this experiment we use three ICA techniques: HO-ICA initialized with the identity matrix, SO-ICA, and fastICA. We calculate $\mathrm{SNR}_{T_k}$ at each timing $T_k$, as given by (14).

Figure 3. The compositions of sources at track 1 and track 2.

4.2. Experimental results

We show two results of searching for the optimal timing: track 1, which has two sources, in Fig. 4, and track 2, which has four sources, in Fig. 5. The shaded zone marks the neighborhood of the correct timing. In both tracks the composition of the sources changes near the middle of the signal, and in both results all ICA techniques show a peak at the correct timing. In addition, SO-ICA and fastICA are adequate techniques for searching the optimal timing of the change in source composition, although they provide lower $\mathrm{SNR}_{T_k}$ than HO-ICA. Note that the computational efficiency of SO-ICA is remarkable: SO-ICA requires only 4% of the computations of HO-ICA.

Figure 4. The result of proposed searching optimal timing to replace at track 1. (Horizontal axis: frame index n.)

Figure 5. The result of proposed searching optimal timing to replace at track 2. (Horizontal axis: frame index n.)

5. Conclusion

In this paper we proposed a timing optimization of the separation filter in a compressive coding method for stereo signals using ICA. Experimental results show that the proposed algorithm finds the optimal timing corresponding to the change in composition of instruments localized in two directions. In the future, we need to apply the optimal timing algorithm to a large number of music pieces from compact discs, and to develop a more efficient algorithm that handles more than three timing changes.

6. References

[1] C. Faller and F. Baumgarte, "Binaural cue coding-part II: schemes and applications," IEEE Trans. Speech and Audio Processing, vol. 11, pp. 520-531, 2003.
[2] J. Herre et al., "Spatial audio coding: next-generation efficient and compatible coding of multi-channel audio," 117th Conv. Aud. Eng. Soc., Preprint 6187, 2004.
[3] S. Miyabe et al., "Compressive coding of stereo audio signals extracting sparseness among sound sources with independent component analysis," Proc. WASPAA, pp. 331-334, 2007.
[4] P. Smaragdis, "Blind separation of convolved mixtures in the frequency domain," Neurocomputing, vol. 22, pp. 21-34, 1998.
[5] N. Murata and S. Ikeda, "An on-line algorithm for blind source separation," Proc. NOLTA, pp. 923-926, 1998.
[6] A. Tanaka et al., "Theoretical foundations of second-order-statistics-based blind source separation for non-stationary sources," Proc. ICASSP, pp. 600-603, 2006.
[7] K. Tachibana et al., "Efficient blind source separation combining closed-form second-order ICA and nonclosed-form higher-order ICA," Proc. ICASSP, vol. 1, pp. 45-48, 2007.
[8] A. Hyvärinen and E. Oja, "A fast fixed-point algorithm for independent component analysis," Neural Computation, vol. 9, pp. 1483-1492, 1997.
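As a supplementary illustration of the closed-form SO-ICA of Sect. 3.3, the steps (17) to (22) can be sketched numerically for one frequency bin as follows. This is a minimal NumPy sketch under stated assumptions, not the implementation of [6] or [7]: the block correlation matrices are built from ideal statistics ($R_{t_i} = A D_i A^{\mathrm H}$ with uncorrelated sources), the summed matrix is assumed to be full rank, and a Hermitian eigendecomposition (`numpy.linalg.eigh`) is used in place of the SVD, with which it coincides for a Hermitian positive definite matrix. The names `closed_form_so_ica`, `A`, and `source_powers` are hypothetical.

```python
import numpy as np


def closed_form_so_ica(R_blocks):
    """Closed-form 2nd-order ICA at one frequency bin.

    R_blocks : list of Hermitian block correlation matrices R_{t_i}(f)
    Returns W_SO(f) = (L(f) T(f))^H as in (22).
    """
    R_sum = sum(R_blocks)
    lam, U = np.linalg.eigh(R_sum)      # (17): R_sum = U diag(lam) U^H
    L = U / np.sqrt(lam)                # (19): scale column k by 1/sqrt(lam_k); assumes lam_k > 0
    M = L.conj().T @ R_blocks[0] @ L    # (20), evaluated for the first block
    M = (M + M.conj().T) / 2            # remove rounding asymmetry before eigh
    _, T = np.linalg.eigh(M)            # shared eigenvectors T(f)
    return (L @ T).conj().T             # (22)


# Hypothetical test case: a fixed 2x2 mixing matrix and two blocks in which
# the source powers differ, so the eigenvalues in (20) are distinct.
A = np.array([[1.0, 0.5j], [0.3, 1.0]])
source_powers = [np.array([4.0, 1.0]), np.array([1.0, 3.0])]
R_blocks = [A @ np.diag(p) @ A.conj().T for p in source_powers]

W_so = closed_form_so_ica(R_blocks)

# If separation succeeds, W_so @ A has one dominant entry per row
# (sources are recovered up to order and scale).
G = np.abs(W_so @ A)
leakage = G.sum(axis=1) - G.max(axis=1)
print(leakage)
```

Because the example uses ideal block statistics, the residual leakage is only numerical rounding. With correlation matrices estimated from finite data, the source cross-covariances are not exactly zero and the separation is approximate, consistent with the limited performance noted in Sect. 3.2.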