Analysis of Acoustic Characteristics of Spontaneous Speech Using the Corpus of Spontaneous Japanese
IPSJ SIG Technical Report 2004-SLP-53 (2), 2004/10/22
Information Processing Society of Japan

Masanobu Nakamura, Koji Iwano, and Sadaoki Furui
Department of Computer Science, Tokyo Institute of Technology
2-12-1 Ookayama, Meguro-ku, Tokyo, 152-8552 Japan
Email: {masa, iwano, furui}@furui.cs.titech.ac.jp

This paper compares the acoustic characteristics of spontaneous speech and read speech using the Corpus of Spontaneous Japanese (CSJ). Academic presentations, extemporaneous presentations, and dialogue utterances were analyzed as spontaneous speech; utterances read aloud from manual transcriptions of the academic presentations were analyzed as read speech. Because the same group of speakers produced all of the speaking styles, the problem of individual spectral differences could be avoided. Reduction of the cepstral distribution of each phone was analyzed using utterances by 5 male and 5 female speakers. The cepstral distribution of spontaneous speech was found to be reduced compared to that of read speech, and the reduction was especially pronounced for dialogue utterances. Speaking rate was also analyzed using phonetically labeled utterances spoken by 3 male and 3 female speakers, and the speaking rate of spontaneous speech was found to be higher than that of read speech.
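The speaking-rate measurement mentioned in the abstract can be illustrated with a minimal sketch. It assumes that speaking rate is counted as phones per second of phone-labeled speech with pauses excluded; the surviving text does not confirm the exact unit or the treatment of pauses. The function name `speaking_rate`, the `(label, start, end)` segment format, and the pause labels `sil` and `sp` are hypothetical choices made for illustration.

```python
def speaking_rate(segments, pause_labels=("sil", "sp")):
    """Return phones per second for a phone-labeled utterance.

    segments: iterable of (label, start_sec, end_sec) tuples.
    ASSUMPTION: pause segments are excluded from both the phone count
    and the duration; the paper's exact definition is not recoverable.
    """
    n_phones = 0
    duration = 0.0
    for label, start, end in segments:
        if label in pause_labels:
            continue  # skip silences and short pauses
        if end <= start:
            raise ValueError(f"segment {label!r} has non-positive duration")
        n_phones += 1
        duration += end - start
    if duration == 0.0:
        raise ValueError("no speech segments found")
    return n_phones / duration


# Hypothetical labels: three phones spanning 0.75 s of speech, one pause.
example = [
    ("k", 0.0, 0.25),
    ("a", 0.25, 0.5),
    ("sil", 0.5, 1.0),
    ("N", 1.0, 1.25),
]
rate = speaking_rate(example)  # 3 phones / 0.75 s
```

Computing the rate separately for each speaker and speaking style (R, A, S, D) would give values comparable across styles, since the same speakers produced all styles.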
[Sections 2 to 4: body text not recoverable. The surviving fragments are summarized below.]

Table 2: Phonemes analyzed.
  Vowels:     /a, i, u, e, o, a:, i:, u:, e:, o:/
  Consonants: /w, y, r, p, t, k, b, d, g, j, ts, ch, z, s, sh, h, f, N, N:, m, n/

Analysis conditions (section 3): speech sampled at 16 kHz; the values 25 ms and 10 ms, presumably the analysis window length and the frame shift; 12 MFCCs within a 39-dimensional feature vector; cepstral mean subtraction (CMS); left-to-right monophone HMMs, with the figure 3 presumably giving the number of states.

Reduction ratio (section 4): for a phoneme p and a speaking style X in {R, A, S, D}, a cepstral vector c_p(X) is defined, and the reduction ratio red_p(X) is given, following [2], as a ratio of two norms,

  red_p(X) = || c_p(X) - [.](X) || / || c_p(R) - [.](R) ||

where [.] marks a term whose notation did not survive extraction, and R denotes read speech.
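A minimal sketch of how a reduction ratio of this kind could be computed is given here. It rests on an assumption that the source does not confirm: the reference point subtracted inside each norm is taken to be the average of all phoneme mean vectors in the same speaking style, so that the ratio compares how far a phoneme lies from the center of the cepstral distribution in style X and in read speech R. All function names are hypothetical, and plain Python lists stand in for MFCC vectors.

```python
import math


def phone_means(frames, labels):
    """Average the feature vectors (e.g. 12 MFCCs) of each phoneme label."""
    sums, counts = {}, {}
    for vec, p in zip(frames, labels):
        acc = sums.setdefault(p, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[p] = counts.get(p, 0) + 1
    return {p: [s / counts[p] for s in sums[p]] for p in sums}


def _center(means, phones):
    # ASSUMPTION: the distribution center is the unweighted average
    # of the phoneme mean vectors.
    dim = len(means[phones[0]])
    return [sum(means[p][i] for p in phones) / len(phones) for i in range(dim)]


def _dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def reduction_ratios(means_x, means_r):
    """Distance-from-center ratio per phoneme, style X over read speech R."""
    phones = sorted(set(means_x) & set(means_r))
    cx, cr = _center(means_x, phones), _center(means_r, phones)
    ratios = {}
    for p in phones:
        d_r = _dist(means_r[p], cr)
        if d_r > 0.0:  # skip phonemes that coincide with the center in R
            ratios[p] = _dist(means_x[p], cx) / d_r
    return ratios


# Toy 2-dimensional example with made-up values.
read = {"a": [2.0, 0.0], "i": [-2.0, 0.0]}
spont = {"a": [1.0, 0.0], "i": [-1.0, 0.0]}
ratios = reduction_ratios(spont, read)
```

Under this assumed definition, a ratio below 1 means the phoneme lies closer to the center of the distribution in style X than in read speech, which is the sense in which the abstract describes the cepstral distribution as reduced.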
Table 1: Values per speaker ID for read speech (R), academic presentations (A), extemporaneous presentations (S), and dialogue (D). [Caption text and the unit of the counts are not recoverable.]

  ID      (R)      (A)      (S)      (D)
  M1    7,420    7,371    5,213    9,915
  M2   10,768   10,815    6,000   14,489
  M3   12,118   12,211    8,525   17,616
  M4   23,154   23,208    8,615   19,892
  M5    8,598    8,651   11,518   29,862
  F1   12,162   12,071   10,119   25,428
  F2    7,843    7,757    7,206   20,141
  F3   11,383   11,360    4,837   17,044
  F4    8,111    8,038    8,232   20,999
  F5   17,797   17,848    9,598   22,083

[One further column, marked "–" for some speakers, could not be reconstructed.]

[Section 5.1: body text not recoverable. Surviving fragments refer to red_p(A), red_p(S), and red_p(D) computed on the 12-dimensional MFCC, to the reference value red_p(X) = 1, and to the phonemes /ch/ and /N:/.]
[Figure 1: Reduction. Remaining caption text not recoverable.]

[Figure 2: Radar charts in six panels (A-Vowel, S-Vowel, D-Vowel, A-Consonant, S-Consonant, D-Consonant), one spoke per phoneme, radial axis from 0 to 1.2, for styles (A), (S), (D).]

[Figure 3: Bar chart of the reduction ratio for vowels and consonants in styles A, S, D; vertical axis from 0.7 to 1.0. Data labels that survive, with their assignment to bars not recoverable: 0.95, 0.94, 0.90, 0.87, 0.85, 0.81.]

[Section 5.2, concerning HMMs: body text not recoverable.]
[Figure 4: Radar charts in six panels (A-Vowel, S-Vowel, D-Vowel, A-Consonant, S-Consonant, D-Consonant), one spoke per phoneme, radial axis from 0 to 1.2, for styles (A), (S), (D).]

[Section 6: body text not recoverable. Surviving fragments refer to the styles R, A, S, D, to the CSJ [3], and to ATR.]
[Figure 5: Six panels (A-Vowel, S-Vowel, D-Vowel, A-Consonant, S-Consonant, D-Consonant), horizontal axis from 0 to 60, for styles (A), (S), (D).]

[Figure 6: Speaking rate for styles R, A, S, D; vertical axis from 10 to 35. Data labels that survive, with their assignment to bars not recoverable: 27.8, 27.2, 25.5, 25.0, 16.7, 16.3, 16.0, 13.9.]

[Acknowledgment: the surviving fragment refers to the 21st Century COE program.]

References

[1] (Authors and title not recoverable), 1-1-9 (2001-10).
[2] (Authors and title not recoverable), 2-P-5 (2004-9).
[3] K. Maekawa, "Corpus of Spontaneous Japanese: Its Design and Evaluation," Proc. SSPR 2003, pp. 7-12 (2003-4).