
A survey of a method of restoration noisy reverberation for improving speech intelligibility


Academic year: 2021


Japan Advanced Institute of Science and Technology

JAIST Repository

https://dspace.jaist.ac.jp/

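Title A survey of noise and reverberation removal aimed at restoring speech intelligibility (音声明瞭度回復を目的とした雑音・残響除去に関する調査研究) [Project research report]

Author(s) 森田, 翔太 (Morita, Shota)

Citation

Issue Date 2010-03

Type Thesis or Dissertation

Text version author

URL http://hdl.handle.net/10119/8951

Rights

Description Supervisor: 鵜木 祐史 (Masashi Unoki), School of Information Science, Master's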


A survey of a method of restoration noisy reverberation for improving speech intelligibility

Shota Morita (0810062)

School of Information Science, Japan Advanced Institute of Science and Technology

February 9, 2010

Keywords: Speech intelligibility, Listening difficulty, Restoration noisy reverberation.

Speech communication is a fundamental means of conveying information. Noisy and/or reverberant environments impair speech communication in a room. For example, when the hands-free function of a speech application is used at a distance from the microphone, noise and/or reverberation degrade the speech, and the other party cannot understand what was said. Solving this problem requires a method that removes both noise and reverberation in order to restore speech intelligibility, but no such method has yet been proposed.

Our final goal is to achieve smooth speech communication in noisy reverberant environments. First, we surveyed methods for evaluating speech transmission performance to determine whether speech intelligibility can serve as a measure of smooth speech communication. Second, we surveyed previous methods for removing noise and reverberation. Finally, we surveyed whether speech intelligibility can be recovered by methods that restore noisy reverberant speech.

The survey of methods for evaluating speech transmission performance showed that word intelligibility or sentence intelligibility is needed to evaluate speech communication, and that word familiarity has to be taken into account when evaluating speech intelligibility. “Listening difficulty” was proposed because spoken words of high familiarity showed no difference in speech intelligibility even when they differed in how easily they were heard. Speech communication should therefore be evaluated by both speech intelligibility and listening difficulty. The Speech Transmission Index (STI), proposed by Houtgast and Steeneken, is an objective measure that correlates highly with speech intelligibility. Toida showed that the correlation between STI and speech intelligibility is low in some cases, but Sato showed that STI correlates very highly with listening difficulty. Therefore, STI is the best objective measure for evaluating speech communication.

Copyright © 2010 by Shota Morita

We surveyed the following noise reduction methods: Spectral Subtraction (SS), Active Noise Canceling, Minimum Mean Square Error Short-Time Spectral Amplitude (MMSE-STSA) estimation, Wiener filtering, the Maximum Likelihood method, and RelAtive SpecTrAl (RASTA) processing. These methods reduce noise well, but they are poorly suited to dereverberation because the characteristics of noise and of reverberation are quite different. Moreover, many noise reduction methods target Automatic Speech Recognition (ASR), where the aim is to recover feature parameters, which differs from recovering speech intelligibility.
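As an illustration of the simplest method in this list, the following is a minimal sketch of magnitude spectral subtraction. It is not taken from the report: the function name, the frame length, the hop size, the spectral floor, and the assumption that a separate noise-only segment is available for estimating the noise spectrum are all illustrative choices.

```python
import numpy as np

def spectral_subtraction(noisy, noise_only, frame_len=256, hop=128, floor=0.01):
    """Magnitude spectral subtraction with a Hann window and overlap-add.

    All parameter names and defaults are illustrative assumptions.
    """
    window = np.hanning(frame_len)

    def frames(x):
        # Split x into overlapping, windowed frames.
        n = 1 + (len(x) - frame_len) // hop
        return np.stack([x[i * hop:i * hop + frame_len] * window for i in range(n)])

    # Average noise magnitude spectrum, estimated from a noise-only segment
    # (assumed to be available, e.g. from a pause in the speech).
    noise_mag = np.abs(np.fft.rfft(frames(noise_only), axis=1)).mean(axis=0)

    spec = np.fft.rfft(frames(noisy), axis=1)
    mag = np.abs(spec)
    # Subtract the noise estimate; a small spectral floor avoids negative magnitudes.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)
    # Reuse the noisy phase, as spectral subtraction only modifies the magnitude.
    clean = np.fft.irfft(clean_mag * np.exp(1j * np.angle(spec)), n=frame_len, axis=1)

    # Overlap-add resynthesis, normalised by the summed squared window.
    out = np.zeros(len(noisy))
    norm = np.zeros(len(noisy))
    for i, frame in enumerate(clean):
        out[i * hop:i * hop + frame_len] += frame * window
        norm[i * hop:i * hop + frame_len] += window ** 2
    return out / np.maximum(norm, 1e-12)
```

The sketch only changes the short-time magnitude spectrum and keeps the noisy phase, so it has no mechanism for undoing the temporal smearing caused by reverberation, which is consistent with the limitation noted above.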

We surveyed the following dereverberation methods: minimum-phase inverse filtering, the Multiple-input/output INverse Theorem (MINT) method, acoustic inverse filtering through multi-microphone sub-band processing, and Harmonicity-based dEReverBeration (HERB). All of these except HERB are non-blind: they require the room impulse response (RIR) to be measured precisely before dereverberation. HERB can dereverberate blindly with a single microphone, but it targets ASR, where the aim is to recover feature parameters, which differs from recovering speech intelligibility. With these methods, the only way to restore noisy reverberant speech is to combine noise reduction with dereverberation. Methods of this kind have been proposed, such as combining the SS method with Linear Prediction (LP) and combining Wiener filtering with a linear filter. Both process the signal sequentially: noise reduction subtracts the noise component, and dereverberation inverse-filters the reverberant component. These methods are limited in how far they can recover speech intelligibility because they do not restore the parameters that are important for speech.

On the other hand, noise reduction and dereverberation methods based on the Modulation Transfer Function (MTF), which is closely related to STI, have been proposed. These methods use the MTF to restore the temporal envelope of the signal, which carries features important for speech intelligibility but is smeared by noise and reverberation.

STI can predict speech intelligibility because it is calculated from the MTF, which is a static approximation of the room acoustics (noise and impulse response). STI can therefore be used to evaluate speech intelligibility in noisy reverberant environments.

Therefore, a method based on the MTF can restore speech intelligibility and improve listening difficulty in noisy reverberant environments.
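The link between the MTF and STI can be made concrete with the well-known Houtgast and Steeneken model, in which the MTF at modulation frequency f_m is m(f_m) = [1 + (2π f_m T_R / 13.8)^2]^(-1/2) · [1 + 10^(-SNR/10)]^(-1), where T_R is the reverberation time and SNR is the signal-to-noise ratio in dB. The sketch below computes this MTF and a simplified STI from it. It is a simplification under stated assumptions, not the full standardized procedure: a single T_R and SNR are assumed for all octave bands, and the band weighting and masking corrections are omitted. The function names are illustrative.

```python
import numpy as np

# 14 modulation frequencies in 1/3-octave steps from 0.63 Hz to 12.5 Hz.
MOD_FREQS = np.array([0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5,
                      3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5])

def mtf(f_mod, t_r, snr_db):
    """MTF for exponentially decaying reverberation plus stationary noise."""
    reverb = 1.0 / np.sqrt(1.0 + (2.0 * np.pi * f_mod * t_r / 13.8) ** 2)
    noise = 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))
    return reverb * noise

def simplified_sti(t_r, snr_db):
    """Simplified STI: one band, unweighted mean (assumption, see lead-in)."""
    m = np.clip(mtf(MOD_FREQS, t_r, snr_db), 1e-6, 1.0 - 1e-6)
    # Apparent signal-to-noise ratio for each modulation frequency, limited to +/-15 dB.
    snr_apparent = np.clip(10.0 * np.log10(m / (1.0 - m)), -15.0, 15.0)
    # Map each value to a transmission index in [0, 1] and average.
    return float(np.mean((snr_apparent + 15.0) / 30.0))
```

In this model both a longer reverberation time and a lower SNR reduce m(f_m), and therefore the STI, which is why restoring the modulation (the temporal envelope) is expected to raise the predicted intelligibility.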

We further surveyed MTF-based noise reduction and dereverberation. Here we identified the problems of the MTF-based power envelope inverse filtering method in reverberant environments and studied how to improve the accuracy of the restored power envelope. As a result, we found that the proposed method adequately improves on the power envelope restoration accuracy of the previous method.
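The abstract does not spell out the inverse filter, so the following sketch shows only the basic idea under a commonly used assumption: the power envelope of the RIR decays exponentially, e_h^2(t) = a^2 exp(-13.8 t / T_R), and the reverberant power envelope is the convolution of the original power envelope with e_h^2(t). Under that assumption the inverse filter is a first-order difference. The function names, the gain a, and the assumption that T_R is known are illustrative; estimating T_R blindly and handling noise are outside this sketch.

```python
import numpy as np

def smear_power_envelope(env_x, t_r, fs, a=1.0):
    """Simulate reverberation on a power envelope (assumed exponential RIR envelope)."""
    t = np.arange(len(env_x)) / fs
    env_h = a ** 2 * np.exp(-13.8 * t / t_r)
    # Causal convolution, truncated to the input length.
    return np.convolve(env_x, env_h)[:len(env_x)]

def inverse_filter_power_envelope(env_y, t_r, fs, a=1.0):
    """Undo the exponential smearing with a first-order inverse filter.

    Implements e_x[n] = (e_y[n] - exp(-13.8 / (t_r * fs)) * e_y[n - 1]) / a^2.
    """
    decay = np.exp(-13.8 / (t_r * fs))
    prev = np.concatenate(([0.0], env_y[:-1]))
    return (env_y - decay * prev) / a ** 2

# Usage: a 4 Hz modulated power envelope sampled at 100 Hz.
fs = 100
t = np.arange(200) / fs
env_x = 0.5 * (1.0 + np.sin(2.0 * np.pi * 4.0 * t))
env_y = smear_power_envelope(env_x, t_r=0.5, fs=fs)
env_restored = inverse_filter_power_envelope(env_y, t_r=0.5, fs=fs)
```

With the correct T_R the restoration is exact in this idealized setting; in practice the accuracy depends on how well T_R and the power envelope are estimated from the observed signal.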

The power envelopes were improved, but the degree of improvement was not as large as we expected. Speech intelligibility cannot be improved by restoring the speech envelope alone; we learned that it can be improved by also restoring the carrier. However, carrier restoration takes the form of carrier regeneration, which requires fundamental frequency estimation and Voice Activity Detection (VAD) in noisy reverberant environments. In the future, we would like to propose such methods and ultimately recover speech intelligibility and listening difficulty based on the MTF in noisy reverberant environments.
