Investigation of integer–based phase information extraction for automatic speaker verification


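IPSJ SIG Technical Report, Vol.2016-SLP-114 No.16, 2016/12/20

Nakano Shiori 1,a), Shiota Sayaka 1, Kiya Hitoshi 1

1 Tokyo Metropolitan University, 6–6 Asahigaoka, Hino–shi, Tokyo, 191–0065, Japan
a) nakano-shiori1@ed.tmu.ac.jp

ⓒ 2016 Information Processing Society of Japan

Summary: Cepstral features, which are widely used for feature extraction from speech signals, are computed only from the magnitude spectrum obtained by frequency analysis. Recent studies, however, have shown that the phase spectrum is useful in speech perception, and ways of exploiting it are being examined in speech recognition, speech synthesis, speaker verification and other fields. Extracting useful features from phase is nevertheless difficult, because the effect of frame segmentation and the phase jumps that arise when the phase is computed must be taken into account. Methods that normalize the phase, and methods that avoid phase jumps by using the group delay as phase information, have therefore been proposed. Even so, a phase spectrum computed on a computer can contain spurious components caused by small changes in value such as numerical errors. This study therefore examines phase extraction methods based on integerization (rounding) and simplification. Speaker verification experiments were carried out with phase-based features extracted by the examined methods together with magnitude-based features. The results show that combining magnitude and phase improves verification accuracy over the conventional approach that uses magnitude-based features alone.

Abstract: Almost all automatic speaker verification (ASV) systems are based on statistical approaches (e.g., GMM, SVM, i–vector). These systems traditionally assume that the input feature vectors are mel–frequency cepstral coefficients (MFCCs), which are extracted from a short-time magnitude spectrum. Recently, it has been reported that phase-related features perform well in various research topics. However, the phase information changes considerably according to the frame position in the input speech. In addition, phase jumps sometimes occur depending on the calculation method. It is therefore necessary to normalize the phase response with respect to the frame position. This paper investigates the effectiveness of an integer phase extraction method and a phase simplification method. The experimental results show that combining systems based on the magnitude spectrum and the phase spectrum improves performance over the conventional methods.

Keywords: automatic speaker verification, phase extraction, integer phase, UBM–GMM, i–vector

1. Introduction

Current speaker verification generally uses features derived only from the magnitude spectrum of speech, such as mel–frequency cepstral coefficients (MFCC). The phase spectrum, which is obtained in the course of MFCC extraction, has hardly been used as a feature. This is because the human auditory system was believed to be insensitive to the phase spectrum and to rely mainly on the magnitude spectrum for speech perception [1]. In recent years, however, the phase spectrum has been reported to be useful in speech perception [2], and its usefulness has been reported in various research fields. The phase spectrum is known to be affected by frame segmentation and by phase jumps that arise during computation, so it is difficult to use directly. Methods that normalize the phase [3] and methods that use the group delay as phase information [4], [5] have therefore been proposed.

This paper further examines the phase extraction method of [3]. We found that the phase spectrum extracted by the method of [3] contains speaker information but also spurious spectral components, and that the phase information fluctuates strongly because of the way it is computed. We therefore propose to remove spurious phase values by integerization and to suppress the fluctuation of the phase information by simplification, and we report the effectiveness of the examined methods through speaker verification experiments.

2. Speaker verification systems

A speaker verification system decides whether an input utterance was spoken by the enrolled speaker. This section introduces speaker verification based on UBM–GMM (universal background model–GMM) [6] and on i–vector [7], both of which are widely used statistical approaches.

2.1 UBM–GMM

A GMM is expressed as a linear superposition of $M$ unimodal Gaussian distributions $p_i(X)$ weighted by mixture weights $\omega_i$. The GMM that represents enrolled speaker $s$ is defined by Eq. (1).

$$p(X \mid \lambda_s) = \sum_{i=1}^{M} \omega_i \, p_i(X). \tag{1}$$

Here $X = \{x_1, x_2, \ldots, x_T\}$ denotes the feature vectors, and $\sum_{i=1}^{M} \omega_i = 1$. To train the target-speaker model used in UBM–GMM-based verification, features are first extracted from the data of enrolled speaker $s$ and the GMM $\lambda_s$ of that speaker is trained. Next, a UBM, which is an average model of unspecified speakers, is trained in advance and adapted to the distribution of enrolled speaker $s$ by maximum a posteriori probability (MAP) adaptation [8]. Finally, the model is optimized with the EM algorithm to obtain the model of enrolled speaker $s$. At verification time, the frame-averaged log-likelihood of the input data $X$ given the enrolled-speaker model $\lambda_s$ is computed by Eq. (2).

$$\log p(X \mid \lambda_s) = \frac{1}{T} \sum_{i=1}^{T} \log p(x_i \mid \lambda_s). \tag{2}$$

Instead of using the log-likelihood itself as the verification score, UBM–GMM uses the log-likelihood ratio between the target-speaker model $\lambda_s$ and the background model $\lambda_{ubm}$. If the ratio exceeds a preset threshold, the input is judged to be the speech of the enrolled speaker.

2.2 i–vector

Speaker verification based on i–vector is a modeling approach that uses factor analysis: utterance data are mapped by factor analysis into a total variability (TV) space that depends on both speaker and channel [9]. The GMM supervector $m_s$, formed by concatenating only the means of the GMM $\lambda_s$ of speaker $s$, is defined through factor analysis as follows.

$$m_s = m + T \cdot \omega_s. \tag{3}$$

Here $m$ is the speaker- and channel-independent GMM supervector obtained from the UBM. $T$ is a low-rank rectangular matrix composed of the basis vectors that span the TV space. $\omega_s$ is the i–vector of the given utterance. At verification time, the cosine similarity between the i–vector computed from the input data, $\omega_{test}$, and the i–vector enrolled as the speaker model, $\omega_{trg}$, is widely used.

$$\mathrm{score}(\omega_{trg}, \omega_{test}) = \frac{\omega_{trg} \cdot \omega_{test}}{|\omega_{trg}|\,|\omega_{test}|}. \tag{4}$$

If the verification score $\mathrm{score}(\omega_{trg}, \omega_{test})$ exceeds a preset threshold, the input is judged to be the speech of the enrolled speaker.

3. Phase extraction methods

Conventional speaker verification mainly uses MFCC as the speech feature and does not take into account the phase information contained in speech. Recent studies have shown that the phase spectrum is also an indispensable element for representing a speech signal and carries information that is useful for improving performance in various research fields. When the phase spectrum is used as a feature, however, it is known to be affected by frame segmentation, among other problems, and is difficult to handle. Methods that use the group delay spectrum as phase information [10], [11] and a method that normalizes the phase [3] have therefore been proposed. This paper further examines phase extraction methods on the basis of [3].

3.1 Relative phase information [3]

The discrete Fourier transform of a speech signal is expressed as follows.

$$\sqrt{X^2(\omega, t) + Y^2(\omega, t)} \times e^{j\theta(\omega, t)}. \tag{5}$$

Here $\omega$ and $t$ denote frequency and time, and $X$ and $Y$ denote the real and imaginary parts. $\sqrt{X^2(\omega, t) + Y^2(\omega, t)}$ is the magnitude spectrum and $\theta(\omega, t)$ is the phase spectrum. Even at the same frequency $\omega$, the value of the phase spectrum changes greatly with the position at which the frame is cut. The phase is therefore normalized as in Eq. (6): the phase at a chosen base frequency $\omega_b$ is fixed to a constant, and the phase at the other frequencies is obtained relative to it.

$$\tilde{\theta}(\omega, t) = \theta(\omega, t) + \frac{\omega}{\omega_b}\bigl(A - \theta(\omega_b, t)\bigr). \tag{6}$$

Here $A$ is the phase value assigned to the base frequency $\omega_b$. In this paper, $A = 0$.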
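The normalization of Eq. (6) can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the paper: the helper names `dft`, `wrap` and `relative_phase` are hypothetical, the frequency ratio $\omega/\omega_b$ is taken as the ratio of DFT bin indices, and wrapping the result back into $[-\pi, \pi]$ is an added assumption.

```python
import cmath
import math


def dft(frame):
    """Naive O(N^2) discrete Fourier transform of a real-valued frame."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        for k in range(n)
    ]


def wrap(theta):
    """Wrap an angle in radians into [-pi, pi] (assumption, not stated in the text)."""
    return math.atan2(math.sin(theta), math.cos(theta))


def relative_phase(frame, base_bin, a=0.0):
    """Eq. (6): fix the phase at bin `base_bin` (must be nonzero) to `a`.

    The ratio omega / omega_b is approximated by k / base_bin.
    Returns the relative phase for bins 0 .. N/2.
    """
    theta = [cmath.phase(c) for c in dft(frame)]
    theta_b = theta[base_bin]
    return [
        wrap(theta[k] + (k / base_bin) * (a - theta_b))
        for k in range(len(frame) // 2 + 1)
    ]
```

For a periodic test signal with components at bins 4 and 8, cutting the frame at two different positions changes the raw phase at bin 8, whereas the relative phase with `base_bin=4` stays the same, which is the property the normalization is meant to provide.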

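3.2 Integerization

When the Fourier transform of a speech signal is computed by machine, frequencies that would carry no component in a calculation by hand can still end up with phase information, for example because of the limits of numerical precision. Since the phase takes values between $-\pi$ and $\pi$, a value that appears only through numerical error can still be large as a phase. To suppress this effect, we examined rounding values to integers when the phase information is computed. Fig. 1(b) shows the phase spectrum extracted as in Section 3.1, and Fig. 1(c) shows the phase spectrum obtained when the phase is integerized. In (b), large changes in value appear even in noise segments, whereas in (c) the values in those spurious regions vanish and the characteristics of the speech segments are represented more clearly.

Fig. 1 Speech waveform and phase spectrogram. (b) Relative phase spectrogram, (c) Round relative phase spectrogram.

3.3 Simplification of phase information

Phase information can be expressed in polar coordinates. As shown in Fig. 2, the phase $\theta$ then falls into one of the ranges $-\pi \le \theta < -\frac{\pi}{2}$, $-\frac{\pi}{2} \le \theta < 0$, $0 \le \theta < \frac{\pi}{2}$ and $\frac{\pi}{2} \le \theta \le \pi$. The value of the phase fluctuates greatly under the influence of frame segmentation and other factors. The phase feature integerized in Section 3.2 is therefore further divided into these four regions, and the actual numerical value is replaced by a simpler representation, so that feature extraction focuses on the coarse variation of the phase rather than on its large fluctuations.

Fig. 2 Simplification of phase information.

4. Modeling of phase information and system integration

4.1 Modeling of phase information

The phase information extracted by the methods of Section 3 is modeled with a GMM. When speaker verification experiments were run with the phase feature alone, the mean and variance of the log-likelihood $L_{phase}$ of the phase-based GMM varied widely. The score is therefore normalized by the following equation.

$$\hat{L}_{phase} = \frac{L_{phase} - m}{\alpha V}. \tag{7}$$

Here $m$ and $V$ are the mean and variance of $L_{phase}$, respectively, and $\alpha$ is a parameter that corrects the variance after normalization.

4.2 Score integration

In this paper, two systems are integrated: UBM–GMM or i–vector using MFCC, and a GMM using phase. At verification time, the verification score obtained from UBM–GMM or i–vector and the log-likelihood obtained from the phase-based GMM are combined linearly as in the following equation to obtain the integrated score $L^{s}_{comb}$.

$$L^{s}_{comb} = (1 - \beta)\, L^{s}_{MFCC} + \beta\, L^{s}_{phase}. \tag{8}$$

Here $L^{s}_{MFCC}$ and $L^{s}_{phase}$ are the verification score and the log-likelihood for speaker $s$, respectively, and $\beta$ is a weighting coefficient.

5. Experimental conditions

To examine the effectiveness of the investigated phase feature extraction methods for speaker verification, we carried out speaker verification experiments with UBM–GMM and i–vector. To compare results, the false rejection rate and the false acceptance rate were computed from the verification scores, and the equal error rate (EER) obtained with a single threshold common to all speakers was used. The experiments used speech recorded with a headset microphone from the VLD database [12]. The interval between the first and second recording sessions is about three weeks. The data of the first session are referred to as period A and those of the second session as period B.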
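The integerization of Section 3.2 and the four-region simplification of Section 3.3 might look as follows in code. The text does not state which quantity is rounded or how a region is encoded, so this sketch makes two labeled assumptions: the real and imaginary parts of each DFT coefficient are rounded to the nearest integer before the phase is taken, and each region is represented by an index from 0 to 3. The function names `rounded_phase` and `region_index` are hypothetical.

```python
import math


def rounded_phase(coeff):
    """Phase of a DFT coefficient after rounding its real and imaginary parts.

    Assumption: rounding is applied to the real and imaginary parts.
    Coefficients that are pure numerical noise round to 0 + 0j, and
    math.atan2(0, 0) returns 0.0, so their phase becomes 0.
    """
    re, im = round(coeff.real), round(coeff.imag)
    return math.atan2(im, re)


def region_index(theta):
    """Map a phase in [-pi, pi] to one of the four regions of Section 3.3.

    Assumption: the regions are encoded as the integers 0..3, in the order
    [-pi, -pi/2), [-pi/2, 0), [0, pi/2), [pi/2, pi].
    """
    if theta < -math.pi / 2:
        return 0
    if theta < 0:
        return 1
    if theta < math.pi / 2:
        return 2
    return 3
```

With this choice, a coefficient such as `complex(1e-13, -3e-13)` has a raw phase of about -1.25 rad but a rounded phase of 0, which mirrors the removal of spurious values described for Fig. 1(c). How much is removed depends on the amplitude scale of the input samples, which is not specified here.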

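The score normalization of Eq. (7), the linear fusion of Eq. (8) and the EER used for evaluation can be sketched as below. This is an illustrative sketch under assumptions: the mean and variance in Eq. (7) are taken over whatever list of scores is passed in, the EER is approximated by sweeping one global threshold over the observed scores, and the function names are hypothetical.

```python
def normalize_scores(scores, alpha):
    """Eq. (7): subtract the mean and divide by alpha times the variance."""
    m = sum(scores) / len(scores)
    v = sum((s - m) ** 2 for s in scores) / len(scores)
    return [(s - m) / (alpha * v) for s in scores]


def fuse(score_mfcc, score_phase, beta):
    """Eq. (8): linear combination of the MFCC-based and phase-based scores."""
    return (1.0 - beta) * score_mfcc + beta * score_phase


def equal_error_rate(target_scores, impostor_scores):
    """Approximate EER with a single threshold shared by all speakers.

    A trial is accepted when its score is larger than the threshold.
    The threshold where the false rejection rate and the false acceptance
    rate are closest is selected, and their average is returned.
    """
    best_gap, eer = None, None
    for thr in sorted(set(target_scores) | set(impostor_scores)):
        frr = sum(s <= thr for s in target_scores) / len(target_scores)
        far = sum(s > thr for s in impostor_scores) / len(impostor_scores)
        gap = abs(frr - far)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (frr + far) / 2.0
    return eer
```

In an experiment of the kind described here, `fuse` would be applied to every trial for each candidate value of `beta`, and `equal_error_rate` would then be evaluated on the fused target and impostor scores to pick the best weight.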
The experimental conditions for speaker verification based on UBM–GMM and i–vector are shown in Table 1, and those for phase feature extraction and GMM modeling are shown in Table 2. Phase features were extracted with three methods described in Section 3: relative phase information (enph), relative phase information with integerization ((R)enph), and relative phase information with integerization and simplification ((R)sep4–enph). Each method was applied with five frame lengths, giving 15 kinds of phase feature in total. The frame lengths used for phase feature extraction are listed in Table 3. Two utterance lengths were used for the test data: the original utterance length in the database (about 4 seconds), referred to as original, and a version cut so that the speech segment lasts about 1 second, referred to as short. Target-speaker models were trained with the MFCC and phase features respectively, and the test data were recorded either in the same period as the training data or in a different period.

Using the method of Section 4.2, the verification scores obtained from UBM–GMM or i–vector with MFCC were integrated with the log-likelihoods obtained from the GMM for each phase feature extraction method, and the sentence-level EERs were compared. As preprocessing for score integration, the scores of the phase features (R)enph and (R)sep4–enph were normalized by the method of Section 4.1. The normalization parameter $\alpha$ was 0.25 and 0.1, respectively.

Table 1 Experimental conditions for UBM–GMM and i–vector based speaker verification systems

Enrolled-speaker database               VLD database (female speakers only)
Training data (target-speaker models)   70 sentences × 17 speakers (1190 sentences in total)
Test data                               30 sentences × 17 speakers (510 sentences in total)
Database for UBM                        JNAS (female speakers only)
UBM training data                       23657 sentences
Number of GMM mixtures                  1024
i–vector dimension                      400
Sampling frequency                      16 kHz
Frame length / frame shift              25 msec / 10 msec
Features                                MFCC 19 dimensions + Δ + ΔΔ

Table 2 Experimental conditions for phase feature extraction and GMM modeling

Enrolled-speaker database               VLD database
Training data (target-speaker models)   70 sentences × 17 speakers (1190 sentences in total)
Test data                               30 sentences × 17 speakers (510 sentences in total)
Number of GMM mixtures                  1
Sampling frequency                      16 kHz
Frequency band used                     60–700 Hz

Table 3 Frame length and frame shift used for phase feature extraction (msec)

             Frame length   Frame shift
frameleg0    12.5           5
frameleg1    50             25
frameleg2    75             37.5
frameleg3    100            50
frameleg4    500            100

6. Experimental results

6.1 Phase features and utterance length

For each utterance length, one UBM–GMM using MFCC, one i–vector system using MFCC, and three GMMs based on the phase feature extraction methods were trained, and verification scores were computed by feeding the test data to each model. Period A was used for both the training data and the test data. Integrated scores were computed from the verification scores obtained from UBM–GMM or i–vector with MFCC and the log-likelihoods obtained from the GMM for each phase feature extraction method. Fig. 3 shows the EER when the verification score from UBM–GMM is integrated with the log-likelihood from the GMM of each phase extraction method. The integration parameter $\beta$ was varied from 0.1 to 0.9 in steps of 0.1. Fig. 4 shows the EER when the verification score from i–vector is integrated with the log-likelihood from the GMM of each phase extraction method. In this case $\beta$ was varied from 0.0001 to 0.001 in steps of 0.0001. Table 4 summarizes the results of Figs. 3 and 4: the EER obtained with MFCC alone, and the lowest EER obtained for each phase feature extraction method integrated with MFCC.

Fig. 3 EERs of integrated systems (UBM–GMM and phase). (a) Utterance length original, (b) utterance length short.

Fig. 4 EERs of integrated systems (i–vector and phase). (a) Utterance length original, (b) utterance length short.

Table 4 Minimum EER for each phase feature extraction method (%)

(a) Integration of UBM–GMM and phase (training data: period A, test data: period A)

                        original   short
MFCC                    0.26       0.59
MFCC+enph               0.18       0.59
MFCC+(R)enph            0.25       0.45
MFCC+(R)sep4–enph       0.26       0.56

(b) Integration of i–vector and phase (training data: period A, test data: period A)

                        original   short
MFCC                    0.98       1.30
MFCC+enph               0.98       1.30
MFCC+(R)enph            0.86       1.27
MFCC+(R)sep4–enph       0.91       1.30

First, we compare using MFCC alone with using both MFCC and phase. Tables 4(a) and 4(b) show that, for both UBM–GMM and i–vector, integrating a phase feature improves the EER over MFCC alone, which confirms the usefulness of phase information.

Next, we compare the utterance lengths of the test data. For the original utterance length, Table 4(a) shows that enph gives the lowest EER when integrated with UBM–GMM. Table 4(b), on the other hand, shows that (R)enph gives the lowest EER when integrated with i–vector, while enph brings no improvement. For the short utterance length, (R)enph gives the lowest EER in both Table 4(a) and Table 4(b). This suggests that the variability of the phase has a stronger effect when the utterance is short. It also suggests that integerizing the phase removes spurious components, which makes stable modeling possible even with a small amount of data and thereby improves the EER.

6.2 Recording period and frame length

To investigate the effect of the frame length used for phase extraction, speaker verification experiments were carried out with UBM–GMM. GMMs were trained for the 15 phase feature extraction settings (3 methods × frameleg0 to frameleg4), and log-likelihoods were computed by feeding the test data to each model. Integrated scores were computed from the verification scores of UBM–GMM and the log-likelihoods of the GMM for each phase feature extraction method. Table 5 shows the lowest EER for each frame length. The integration parameter $\beta$ was varied from 0.1 to 0.9 in steps of 0.1. Table 5(a) shows the EER with period A for training and period A for testing, and Table 5(b) shows the EER with period A for training and period B for testing. The MFCC row gives the EER of UBM–GMM alone. The column "Phase extraction method" gives the method that achieved the highest accuracy among the phase extraction methods integrated with MFCC.

Table 5 Minimum EER for each frame length (%)

(a) Training and test data from the same period (period A-A)

Frame length            Phase extraction method   EER
MFCC                    –                         0.26
frameleg0               enph                      0.18
frameleg1               enph                      0.18
frameleg2               enph                      0.21
frameleg3, frameleg4    enph                      0.23
                        (R)enph                   0.23
                        enph                      0.23
                        (R)sep4–enph              0.23
                        (R)sep4–enph              0.23

(The five tied entries for frameleg3 and frameleg4 are listed in the order in which they appear in the source; their assignment to the two frame lengths could not be recovered.)

(b) Training and test data from different periods (period A-B)

Frame length            Phase extraction method   EER
MFCC                    –                         1.37
frameleg0               enph                      1.27
frameleg1               (R)enph                   1.31
frameleg2               (R)enph                   1.31
frameleg3               (R)enph                   1.24
frameleg4               (R)enph                   1.31

First, we compare using MFCC alone with using both MFCC and phase as features. In both Table 5(a) and Table 5(b), integrating a phase feature gives a lower EER than MFCC alone at every frame length. This is the same tendency as in the preceding experiment and confirms that phase information is useful as a feature.

Next, we compare the frame lengths. Because the phase is affected by frame segmentation, a longer frame might be expected to reduce that effect. Tables 5(a) and 5(b), however, show that the improvement in EER is not proportional to the frame length. Looking at the relationship between the frame length and the phase extraction method that attains the lowest EER, the integerized or four-level phase tends to give the lowest EER particularly when the frame is long (frameleg4). This suggests that the appropriate frame length differs between phase extraction methods.

Next, we compare the kinds of phase feature. In Table 5(a), enph gives the lowest EER over all conditions. At frameleg3 and frameleg4, however, the investigated phase features ((R)enph and (R)sep4–enph) reach a comparable EER. In other words, although (R)enph and (R)sep4–enph have fewer feature points than enph, they appear to represent the characteristics of the phase equally well.

Finally, we compare the frame lengths with respect to the recording period of the test data. Table 5(a) shows that frameleg0 and frameleg1 give the lowest EER when the training and test data are from the same period, whereas Table 5(b) shows that frameleg3 gives the lowest EER when they are from different periods. A likely reason is that, even for the same utterance content, variation between recording periods is large, so a longer frame allows more stable phase extraction. Comparing the phase extraction methods that attain the lowest EER at each frame length, Table 5(a) shows that when the training and test data are from the same period the conventional method enph gives the lowest EER over all frame lengths, and (R)enph or (R)sep4–enph attain the lowest EER only when the frame is long. Table 5(b), on the other hand, shows that when the training and test data are from different periods (R)enph gives the lowest EER at every frame length except frameleg0. This suggests that the proposed method (R)enph improves the robustness of the phase feature.

7. Conclusion

This paper examined more appropriate ways of extracting phase information, which has attracted attention as a feature in recent years. To investigate whether features based on the phase information obtained by the examined extraction methods are effective, speaker verification experiments were carried out with UBM–GMM and i–vector. In the experiments, when the training and test data were from the same period, using MFCC together with phase gave better results than using MFCC alone. Future work includes examining the differences in utterance length and recording period, examining methods for modeling the phase, and examining other phase extraction methods.

Acknowledgments

Part of this work was supported by Grant-in-Aid for Scientific Research (B) 26280066 and Grant-in-Aid for Young Scientists (B) 93008552.

References

[1] Zhu, D. and Paliwal, K. K.: Product of power spectrum and group delay function for speech recognition, Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP'04). IEEE International Conference on, Vol. 1, IEEE, pp. I–125 (2004).
[2] Paliwal, K. K. and Alsteris, L. D.: Usefulness of phase spectrum in human speech perception, INTERSPEECH (2003).
[3] Wang, L., Yoshida, Y., Kawakami, Y. and Nakagawa, S.: Relative phase information for detecting human speech and spoofed speech, Proc. Interspeech, pp. 2092–2096 (2015).
[4] Hegde, R. M., Murthy, H. A. and Gadde, V. R. R.: Significance of the modified group delay feature in speech recognition, IEEE Transactions on Audio, Speech, and Language Processing, Vol. 15, No. 1, pp. 190–202 (2007).
[5] Yamamoto, K., Sueyoshi, E. and Nakagawa, S.: A study of speech recognition using phase information based on long-term analysis (in Japanese), IEICE Technical Report, SP (Speech), Vol. 110, No. 143, pp. 31–36 (2010).
[6] Reynolds, D. A., Quatieri, T. F. and Dunn, R. B.: Speaker verification using adapted Gaussian mixture models, Digital Signal Processing, Vol. 10, No. 1, pp. 19–41 (2000).
[7] Dehak, N., Kenny, P. J., Dehak, R., Dumouchel, P. and Ouellet, P.: Front-end factor analysis for speaker verification, IEEE Transactions on Audio, Speech, and Language Processing, Vol. 19, No. 4, pp. 788–798 (2011).
[8] Povey, D., Chu, S. M. and Varadarajan, B.: Universal background model based speech recognition, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE, pp. 4561–4564 (2008).
[9] Ogawa, T. and Shiota, S.: Speaker recognition using i-vector (in Japanese), Journal of the Acoustical Society of Japan, Vol. 70, No. 6, pp. 332–339 (2014).
[10] Yegnanarayana, B. and Murthy, H. A.: Significance of group delay functions in spectrum estimation, IEEE Transactions on Signal Processing, Vol. 40, No. 9, pp. 2281–2289 (1992).
[11] Correia, M. J., Abad, A. and Trancoso, I.: Preventing converted speech spoofing attacks in speaker verification, Information and Communication Technology, Electronics and Microelectronics (MIPRO), 2014 37th International Convention on, IEEE, pp. 1320–1325 (2014).
[12] Shiota, S., Fernando, V., Yamagishi, J., Ono, N., Echizen, I. and Matsui, T.: Voice liveness detection algorithms based on pop noise caused by human breath for automatic speaker verification, in Proc. Interspeech 2015 (accepted) (2015).
