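Kyushu University Institutional Repository
Perceptual roles of spectral-change factors in Japanese speech (日本語音声におけるパワースペクトル因子の音声知覚上の役割)
Kishida, Takuya (岸田 拓也)
https://doi.org/10.15017/1931919
Publication information: Kyushu University, 2017, doctoral degree in design (博士(芸術工学)), doctorate by coursework (課程博士)
Version:
Rights: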
Perceptual roles of spectral-change factors in Japanese speech
Takuya Kishida
March 2018
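Contents
Chapter 1 Introduction
1.1 Research background
1.1.1 Perception of phonemes
1.1.2 Redundancy of perceptual cues
1.1.3 Cues carried by the global structure of the spectrum
1.1.4 Linguistic rhythm and sonority
1.2 Purpose of this thesis
1.3 Structure of this thesis
Chapter 2 Number of power-spectral factors required for intelligible perception of sentences
2.1 Purpose of Chapter 2
2.2 Analysis 1: Extraction of power-spectral factors by origin-shifted principal component analysis
2.2.1 Speech materials
2.2.2 Procedure
2.2.3 Results and discussion
2.3 Experiment 1: Relation between the number of power-spectral factors and sentence intelligibility
2.3.1 Participants
2.3.2 Apparatus
2.3.3 Stimuli
2.3.4 Procedure
2.3.5 Results and discussion
Chapter 3 Conversion of power-spectral factors into a non-negative orthogonal basis
3.1 Purpose of Chapter 3
3.2 Analysis 2: Quantitative examination of the effect of making the power-spectral factors non-negative
3.2.1 Method of conversion into a non-negative orthogonal basis
3.2.2 Change in cumulative contribution caused by the conversion
3.3 Experiment 2: Measurement of sentence intelligibility using non-negative orthogonal basis factors
3.3.1 Participants
3.3.2 Apparatus
3.3.3 Stimuli
3.3.4 Procedure
3.3.5 Results and discussion
Chapter 4 Roles of the individual power-spectral factors
4.1 Purpose of Chapter 4
4.2 Experiment 3: Roles of the individual factors in the four-factor solution
4.2.1 Participants
4.2.2 Apparatus
4.2.3 Stimuli
4.2.4 Procedure
4.2.5 Results and discussion
4.3 Experiment 4: Roles of the individual factors in the two-, three-, and four-factor solutions
4.3.1 Participants
4.3.2 Apparatus
4.3.3 Stimuli
4.3.4 Procedure
4.3.5 Results and discussion
Chapter 5 General discussion
5.1 Summary of results
5.2 Conclusions
5.3 Theoretical positioning
5.3.1 Relation to previous studies showing redundancy of perceptual cues
5.3.2 Interpretation of what is obtained from statistical analyses of spectra
5.3.3 Relation to theories of speech perception
5.3.4 Limitations of this study
5.3.5 Potential applications of this study
5.4 Future prospects
References
Acknowledgments
Notes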
Chapter 1 Introduction
Preface
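Communication between living organisms takes place through various senses, including vision, hearing, touch, and smell. Because the use of language is unique to humans, communication through hearing can be said to be especially important for us among these senses. Language is conveyed mainly by speech. Sound is merely a physical phenomenon in which vibration propagates through the air, yet humans have evolved to communicate in complex ways through sound by making skillful use of the auditory and vocal organs. Clarifying the mechanism of speech perception that supports this complex communication is one of the most important themes in research on speech.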
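In this thesis, statistical methods were applied to the spectral representation of speech at the auditory periphery in order to extract its principal components, and listening experiments were conducted to examine how these components are used as cues for speech perception. The work connects two lines of research on how much redundancy exists in the perceptual cues contained in the acoustic features of speech: studies based on listening experiments with synthesized speech, and studies based on statistical analysis.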
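This chapter first introduces, as background on speech perception, previous studies that are particularly relevant to the content of this thesis. In organizing what these studies have established, it identifies points that have not been examined sufficiently. It then states the purpose of the thesis and finally explains the structure of the thesis as a whole.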
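1.1 Research background
Auditory communication by means of speech as linguistic sound is thought to involve several stages (de Saussure, 1959). Figure 1.1 shows these stages schematically. When a speaker tries to convey thoughts or feelings to another person, the thoughts or feelings are first put into language in the speaker's brain, where language is represented as an auditory image. To realize this auditory image as speech, an actual physical phenomenon, motor commands are sent from the brain to the articulatory organs and an utterance is produced. When the resulting speech enters the listener's ear, the sound, which was a vibration of the air, is processed by the auditory system and evokes an auditory image in the listener. In the listener's brain, too, auditory images are linked to language itself. The linguistic content that the speaker intended is thus conveyed to the listener. This is the basic mechanism of auditory communication. Denes and Pinson (1993) pointed out that this mechanism also includes a stage in which speakers hear their own speech, and, because the successive stages are linked like a chain, they called the mechanism of auditory communication the speech chain. Within the speech chain, speech perception is the stage at which speech is mapped onto auditory images.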
Figure 1.1: Stages of auditory communication (the speech chain). The chain runs from the speaker (brain, vocal muscles, ear) through the sound wave to the listener (ear, brain), across the linguistic, physiological, and acoustic levels. Drawn by the author with reference to a figure in Denes and Pinson (1993).
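Speech perception is an interdisciplinary area that attracts researchers from various fields, including experimental psychology, linguistics, hearing research, electrical engineering, and artificial intelligence (Pisoni, 1985). Findings obtained from these different standpoints can be used to improve everyday life, for example in foreign-language learning, efficient transmission of speech signals, speech enhancement, automatic speech recognition, and the development of hearing aids and cochlear implants for people with hearing loss. Besides linguistic content, speech also conveys the speaker's traits, state, and emotions (Schuller et al., 2013). As far as the transmission of linguistic content is concerned, however, research on speech perception can be divided roughly into two areas (Plomp, 2002; Samuel, 2011). One examines how particular acoustic features of speech relate to the perception of units of speech such as phonemes, syllables, and words. The other examines how listeners recognize and process continuously spoken speech, as in everyday conversation, as language. In the present study, speech-signal analyses and listening experiments were carried out on sentence-length speech. The study belongs to the latter area in that it deals with continuously spoken speech, but it is also closely related to the former in that it relates acoustic features of speech to perception. The background below therefore introduces previous studies closely related to the present study without regard to the distinction between the two areas.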
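1.1.1 Perception of phonemes
Phonemes are the smallest constituent units of speech. They can be broadly divided into vowels and consonants, and, although languages share some of them, the inventory and system of phonemes differ somewhat from language to language. The earliest studies of speech perception examined in detail which acoustic features are needed to identify phonemes. The sound spectrograph (Potter, 1945) is a device that makes sound visible. When a speech signal is fed into the device, the signal is analyzed and the temporal variation of its frequency components is drawn on paper as a pattern of light and dark (a spectrogram). Figure 1.2 shows an example of a speech waveform and the spectrogram obtained by analyzing it. The Pattern Playback (Cooper, Liberman, & Borst, 1951), in contrast, is based on the reverse idea: it reads a pattern drawn in imitation of a spectrogram and plays it back as sound. Acoustic signals imitating speech were created with the Pattern Playback and used in experiments on phoneme perception. These earliest studies, carried out by researchers at Haskins Laboratories, showed that formants are important for the perception of phonemes (Delattre, Liberman, Cooper, & Gerstman, 1952; Liberman, 1957). The sound generated by vocal-fold vibration is strengthened or weakened in amplitude, frequency by frequency, before it is radiated from the lips. Which frequencies are strengthened or weakened, and by how much, is determined by the resonance characteristics of the shape from the vocal folds to the lips, that is, the vocal-tract shape. Formants are the resulting peaks in the spectral envelope of speech. The frequency of each peak is a formant frequency, and the formants are called the first, second, third formant, and so on, in ascending order of frequency. Because formants are observed in a steady state during vowel production, vowels can be perceived by distinguishing patterns of formant frequencies, whereas consonants are characterized by transition patterns of formant frequencies, which are thought to serve as cues for their perception. Figure 1.3 shows the difference in formant patterns between the Japanese vowels /a/ and /i/. Because the vocal tract is constricted at different positions when /a/ and /i/ are produced, the resonance characteristics change, and this appears as a difference in the formant pattern (spectral envelope). The fine pattern of variation in the spectrum (fine structure), in contrast, is produced by vocal-fold vibration, so that the spacing of this variation hardly differs between the two vowels as long as they are produced at the same voice pitch.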
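Figure 1.2: Example of a speech waveform (top; amplitude against time, 0–2.5 s) and its spectrogram (bottom; frequency, 0–8000 Hz, against time). The sentence "省エネルギーが叫ばれています。" ("Energy conservation is being called for.") spoken by a male speaker, taken from the NTT-AT Multilingual Speech Database 2002.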
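Figure 1.3: Spectra of Japanese vowels (thin solid line: /a/; thin broken line: /i/) and their spectral envelopes (thick solid line: /a/; thick broken line: /i/), plotted as amplitude (dB) against frequency (0–8000 Hz). The vowels were produced by the author, recorded, and analyzed.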
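Although formants were found to be important for identifying phonemes, it was observed at the same time that the temporal transition pattern of formant frequencies for the same consonant differs greatly depending on the following vowel, and it became clear that it is difficult to find invariant acoustic features by which a particular consonant can be identified. Focusing on the consistent correspondence between phonemes and articulatory movements, Liberman and colleagues therefore argued that the invariant features for identifying consonants are to be found not in acoustic features but in the motor commands to the muscles that move the articulators (Liberman, Cooper, Shankweiler, & Studdert-Kennedy, 1967). This is the Motor Theory of speech perception. The Motor Theory further claims that humans use a specialized mechanism for perceiving speech, separate from the perception of other sounds. Researchers supporting the Motor Theory discovered phenomena such as duplex perception (Rand, 1974) and categorical perception (Liberman, Harris, Hoffman, & Griffith, 1957), and attempted to prove the theory by arguing that these phenomena occur only with speech stimuli. Given the magnitude of its influence on later research, the Motor Theory can be regarded as one of the most important theories of speech perception.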
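Several theories of speech perception have been proposed in opposition to the Motor Theory. Here we touch on the theory of acoustic invariance advocated by Blumstein and Stevens, which is related to this thesis in that the global structure of the spectrum serves as the cue. Whereas supporters of the Motor Theory gave up on finding invariant features for phoneme identification among the acoustic features of speech, Blumstein and Stevens (1979, 1980) tried to find such invariant features in the acoustic signal, following the idea of classifying phonemes by binary distinctive features. For example, they showed that stop consonants can be classified into three places of articulation by observing, in the spectrum of the interval up to 20–30 ms after the release of the closure, whether the energy is concentrated around the center of the frequency axis or diffused overall (the compact–diffuse opposition) and, when it is diffused, whether the amplitudes of the formant peaks increase or decrease toward higher frequencies (the acute–grave opposition). On the other hand, it has been shown that when invariant features are presented together with features that vary with the following vowel, listeners tend to judge the phoneme on the basis of the varying features (Blumstein, Isaacs, & Mertus, 1982; Walley & Carrell, 1983), so the theory of acoustic invariance is not complete either.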
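Other representative theories of speech perception include the Direct Realism Theory (Fowler, 1991) and the General Approach (Diehl, Lotto, & Holt, 2004). These theories were built on experimental reports that duplex perception also occurs with non-speech sounds (Fowler & Rosenblum, 1990) and that categorical perception is also found in non-human animals (Kluender, Diehl, Killeen, et al., 1987). The positions taken by theories of speech perception can be classified according to whether the object of perception is articulatory movement and whether speech perception relies on a specialized mechanism (Diehl et al., 2004). No theory has won universal acceptance, however, and active debate continues to this day.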
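1.1.2 Redundancy of perceptual cues
Real environments are full of sounds other than speech, and during conversation we often hear unrelated sounds at the same time as speech. Moreover, sound fades away immediately. For stable communication by speech, speech should therefore contain a certain surplus of information that serves as perceptual cues. Various reports have in fact demonstrated the redundancy of perceptual cues in speech. Here we introduce redundancy with respect to frequency information.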
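When sound enters the cochlea, a structure of the inner ear, the place on the basilar membrane that vibrates most strongly differs with frequency, and it has thus been established that our ears work as frequency analyzers (Schnupp, Nelken, & King, 2011; Plack, 2014). This frequency-analysis function of the auditory system can be modeled as a filter bank consisting of multiple filters with different center frequencies and bandwidths. These filters are called auditory filters (Patterson, 1974; Unoki, Irino, Glasberg, Moore, & Patterson, 2006; Moore, 2013). Critical bands (Fletcher, 1940; Zwicker & Terhardt, 1980; Greenwood, 1990; Schneider, Morrongiello, & Trehub, 1990) are rectangular approximations of the auditory filters. The bandwidths of the critical bands are obtained from listening experiments using simultaneous masking (Fletcher, 1940; Zwicker & Terhardt, 1980). The critical bandwidth is approximately constant at about 100 Hz for center frequencies up to 500 Hz and is about 20% of the center frequency above 500 Hz. The actual shape of an auditory filter, however, is not rectangular. Patterson (1974) proposed an experimental method for determining the shape of the auditory filter, and the shape has since been clarified. The auditory filter is not symmetric about its center frequency: the filter output falls off gently on the low-frequency side and steeply on the high-frequency side. It has also become clear that the auditory filter, which was thought to have a constant bandwidth at low frequencies, in fact becomes narrower toward lower frequencies. In addition, the shape of the auditory filter is known to change with input level. At moderate levels, the filter output may be regarded as symmetric on a logarithmic frequency axis. It is useful to convert the bandwidth of the auditory filter into an equivalent rectangular bandwidth, which is the bandwidth of a rectangular filter that passes the same power of white noise as the auditory filter does (Moore, 2013). The height of the rectangular filter is matched to the height of the auditory filter at its center frequency. Figure 1.4 compares the equivalent rectangular bandwidth of the auditory filter for moderate-level sounds with the critical bandwidth. In the relatively high frequency range, the two are of similar magnitude. The width of a critical band corresponds to a length of 1.3 mm on the basilar membrane (Fastl & Zwicker, 2006), which is about 1/4 to 1/3 octave around the center frequency of the band (Plomp, 2002). This is the frequency resolution of hearing, but, as the following examples show, this resolution is more than sufficient for the perception of speech.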
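The rule of thumb for the critical bandwidth stated above can be written down directly. The minimal Python sketch below implements that rule (about 100 Hz up to 500 Hz, about 20% of the center frequency above that). For comparison it also computes an equivalent rectangular bandwidth with the Glasberg and Moore (1990) approximation, ERB = 24.7(4.37 f/1000 + 1) with f in Hz; this formula does not appear in the thesis and is only an assumed stand-in for the ERB curve of Figure 1.4. The function names are hypothetical.

```python
def critical_bandwidth_hz(fc_hz: float) -> float:
    """Rule of thumb from the text: ~100 Hz up to 500 Hz, ~20% of fc above."""
    return 100.0 if fc_hz <= 500.0 else 0.2 * fc_hz


def erb_hz(fc_hz: float) -> float:
    """ERB approximation (assumption: Glasberg & Moore, 1990), fc in Hz."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)


# Print both bandwidths for a few center frequencies.
for fc in (100, 500, 1000, 4000):
    print(fc, critical_bandwidth_hz(fc), round(erb_hz(fc), 1))
```

At low center frequencies the ERB computed this way is well below 100 Hz, which matches the statement above that the auditory filter narrows toward low frequencies.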
Figure 1.4: Comparison of the critical bandwidth and the equivalent rectangular bandwidth of the auditory filter, plotted as bandwidth (Hz) against center frequency (Hz). Both axes are logarithmic. Drawn with reference to Zwicker and Terhardt (1980) and Moore (2013).
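The energy of speech is distributed over a frequency range of about 50–8000 Hz, and various cues for speech perception are available within this range. Several studies have examined how speech intelligibility changes with cutoff frequency under two conditions, low-pass filtered speech and high-pass filtered speech with the same cutoff frequency (French & Steinberg, 1947; Hirsh, Reynolds, & Joseph, 1954; Miller & Nicely, 1955; Studebaker, Pavlovic, & Sherbecoe, 1987). When the cutoff frequency is low, high-pass filtered speech is more intelligible, and in the opposite case low-pass filtered speech is more intelligible. At the cutoff frequency where the two conditions yield the same intelligibility (around 1700 Hz), many studies have reported syllable or word scores of 50% or higher. This indicates that syllables or words can be perceived to a certain extent using either of two complementary sets of acoustic information alone, and it is one example of the redundancy of perceptual cues. In a related study, Warren, Riener, Bashford, and Brubaker (1995) reported that very high intelligibility is obtained even for speech passed through a narrow band-pass filter 1/3 octave wide, and that fairly high intelligibility is obtained even with a still narrower 1/20-octave filter if its center frequency is around 1500 Hz.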
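Besides the studies above, which restricted the frequency band, listening experiments with speech in which the information contained in the whole spectrum was degraded have also shown that the cues for speech perception are redundant. Ter Keurs, Festen, and Plomp (1992, 1993) smeared the variation of the spectral envelope of speech with a Gaussian filter and examined how intelligible the resulting speech, with lower peaks and shallower valleys in the spectral envelope, was in steady-state noise. Comparing speech with different degrees of smearing, obtained by varying the bandwidth of the Gaussian filter, they found that speech smeared with bandwidths up to 1/3 octave was as intelligible as unprocessed speech. Moreover, even when the spectral envelope was smeared with a bandwidth of four octaves, speech could be perceived clearly if the speech level was sufficiently high relative to the noise level. Because the results did not differ between male and female speakers, they concluded that the envelope structure of the whole spectrum, rather than the fine structure contained in the spectrum, is the factor that determines speech intelligibility.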
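Another example of speech in which the information in the whole spectrum is degraded is channel-vocoded speech (Dudley, 1939). Channel-vocoded speech is synthesized by dividing speech into several frequency bands (channels), extracting only the amplitude envelope in each channel, and using each extracted envelope to drive the corresponding channel of another signal, the carrier (Figure 1.5). The result is called noise-vocoded speech when the carrier is band-limited noise and sine-vocoded speech when it is a sinusoid. With channel-vocoded speech, spectral information can be degraded step by step by changing the number of channels. Results of research with channel-vocoded speech are used, for example, in setting the frequency channels of cochlear implants (Xu & Pfingst, 2008).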
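Figure 1.5: Flow chart of the procedure for creating N-channel channel-vocoded speech. The original speech is passed through N band-pass filters (BPF 1, BPF 2, ..., BPF N) with different center frequencies; each output is rectified (Rect.) and passed through a low-pass filter (LPF) to obtain the amplitude envelope of the speech signal in each frequency band. Each envelope amplitude-modulates the corresponding frequency band of the carrier signal (noise or a sinusoid), and the band signals are summed (Σ) to synthesize the vocoded speech.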
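The processing chain in Figure 1.5 can be sketched in a few lines. The following Python sketch is only an illustration under stated assumptions, not the procedure used in any of the cited studies: the band-pass and low-pass filters are replaced by simple brick-wall filters implemented by zeroing FFT bins, the envelope cutoff (16 Hz) and the band edges in the usage example are hypothetical values, and the function names are made up for this example.

```python
import numpy as np


def band_limit(x, fs, lo, hi):
    """Brick-wall filter: keep only FFT bins with lo <= f < hi (Hz)."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spec[(freqs < lo) | (freqs >= hi)] = 0.0
    return np.fft.irfft(spec, n=len(x))


def noise_vocode(x, fs, edges_hz, env_cutoff_hz=16.0, seed=0):
    """Noise-vocode x using the channels defined by consecutive band edges."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(len(x))
    out = np.zeros(len(x))
    for lo, hi in zip(edges_hz[:-1], edges_hz[1:]):
        band = band_limit(x, fs, lo, hi)                        # BPF
        env = band_limit(np.abs(band), fs, 0.0, env_cutoff_hz)  # Rect. + LPF
        env = np.maximum(env, 0.0)                              # no negative gain
        carrier = band_limit(noise, fs, lo, hi)                 # noise in same band
        out += env * carrier                                    # modulate and sum
    return out


# Usage: a 1000-Hz tone vocoded with four hypothetical bands.
fs = 16000
t = np.arange(fs) / fs
vocoded = noise_vocode(np.sin(2 * np.pi * 1000 * t), fs, [50, 570, 1600, 3400, 7000])
```

In this sketch the temporal envelope of each channel is preserved while the spectral detail inside each channel is replaced by noise, which is the sense in which reducing the number of channels degrades spectral information step by step.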
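In the experiment of Shannon, Zeng, Kamath, Wygonski, and Ekelid (1995), word scores for sentences exceeded 90% even when speech was band-limited to 4000 Hz or below and synthesized as four-channel noise-vocoded speech. In an experiment comparing noise-vocoded and sine-vocoded speech, word scores for sentences exceeded 90% with four channels in both conditions (Dorman, Loizou, & Rainey, 1997). Similar experiments have been carried out under various conditions, for example with multiple speakers (Loizou, Dorman, & Tu, 1999), with separate groups of younger and older participants (Sheldon, Pichora-Fuller, & Schneider, 2008), with manipulation of the temporal information in the channels (Souza & Rosen, 2009), and with different languages and channel cutoff frequencies (Ellermeier, Kattner, Ueda, Doumoto, & Nakajima, 2015). In all of these experiments, channel-vocoded speech was found to be sufficiently intelligible with about four to six bands. That speech can be perceived so clearly with so few bands has been attributed to the temporal variation within the channels being the primary perceptual cue (Shannon et al., 1995), although another study reports that the coarse spectral structure given by level differences between channels serves as a cue (Roberts, Summers, & Bailey, 2010). Studies using channel-vocoded speech have shown how redundant speech is. They have not, however, sufficiently answered the question of why speech can be perceived clearly even as channel-vocoded speech, nor has it been sufficiently examined which frequency bands contribute most to intelligibility.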
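1.1.3 Cues carried by the global structure of the spectrum
So far, we have repeatedly pointed to the possibility that the global structure of the spectrum serves as a cue in speech perception. Another way of examining the cues carried by the global structure of the spectrum is to analyze the acoustic features of speech with statistical methods. Here we introduce previous studies on statistical analyses of speech and introduce the power-spectral factors that are the subject of this thesis.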
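Plomp, Pols, and Geer (1967) divided the spectra of 15 Dutch vowels into 18 bands using 1/3-octave bands, which are close to critical bandwidths, and performed principal component analysis on the power in each band. With this analysis they determined what simple patterns make up the global features of the spectrum. They found that about 70% of the data could be explained by the first two principal components and that the 15 vowels could be distinguished adequately in the space of the first and second principal components. Furthermore, the arrangement of the vowels in this space was found to correspond to their arrangement on a plane whose axes are the logarithms of the first and second formant frequencies (Pols, Tromp, & Plomp, 1973; Plomp, 1976, 2002). Against this background, Zahorian and Jagharghi (1993) showed that the global shape of the spectrum is more useful than formant frequencies as a cue for vowel identification.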
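Ueda and Nakajima (2017) extended the analysis method of Plomp et al. (1967), Pols et al. (1973), and Plomp (1976, 2002) to continuously spoken sentences in eight different languages and dialects. They divided the temporal variation of the power spectrum of the sentences into 20 critical bands set with reference to Zwicker and Terhardt (1980), extracted the power in the 20 bands every 1 ms, and subjected these data to factor analysis. They discovered that when three or four factors are extracted in this analysis, factors with a common pattern are obtained in all eight languages. This result indicates that speech contains universal acoustic features that transcend languages. Because these factors make up the temporal variation of the power spectrum, they are called "power-spectral factors" (パワースペクトル因子) in this thesis.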
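As far as the author has been able to find, no previous study has examined how the principal features that make up the spectrum, as obtained from statistical analyses of the global spectral shape of speech, function as cues for speech perception. Rather than relating the results of such statistical analyses of speech spectra to the mechanism of speech perception, research appears to have developed in the direction of using the extracted principal-component space for automatic speech recognition.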
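The three or four power-spectral factors extracted by Ueda and Nakajima (2017) have the property of dividing the speech spectrum into four bands. This suggests some connection with the high intelligibility of the four-band channel-vocoded speech mentioned above. Ellermeier et al. (2015) synthesized four-band noise-vocoded speech in German and Japanese according to the four bands delimited by the three or four power-spectral factors of Ueda and Nakajima (2017), and confirmed in listening experiments that this four-band noise-vocoded speech is highly intelligible. Their study is an important report for considering the cues for speech perception carried by the power-spectral factors. For a more direct examination, however, it is necessary to synthesize speech that carries only the information the power-spectral factors can represent and to use it in listening experiments. Such an attempt was made by Zahorian and Rothenberg (1981). They resynthesized speech from principal components extracted in the same way as in the principal component analysis of Plomp et al. (1967) and measured its intelligibility, but their study focused on finding the optimal conditions for the analysis and did not sufficiently consider what the principal components mean for speech perception.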
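1.1.4 Linguistic rhythm and sonority
As stated at the outset, research on speech perception is divided into an area that examines the perception of phonemes, syllables, and words spoken in isolation and an area that examines the perception of continuously spoken speech, as in conversation. Speech uttered in daily life is an acoustically seamless continuum. To hear this continuous speech and recognize it correctly as language, the seamless speech must be segmented into units of some kind. Into what kind of units, then, is speech segmented when it is perceived? Here we address this question with a focus on linguistic rhythm.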
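Languages have rhythm. Rhythm is built hierarchically from words, phrases, and sentences, and it is sometimes used to convey the speaker's emotion or to emphasize particular words (Handel, 1989). Listening to speech in different languages side by side, one notices that their rhythms differ. Languages are in fact classified into several representative groups according to their rhythmic structure. Ramus, Nespor, and Mehler (1999) analyzed speech in various languages and showed that, when each language is placed on a plane defined by the proportion of vocalic intervals and the standard deviation of consonantal intervals within a sentence, the languages fall into three groups. These three groups correspond to what are regarded as the representative rhythm classes: stress-timed languages, syllable-timed languages, and mora-timed languages. For example, English and German are stress-timed, French and Italian are syllable-timed (Ladefoged & Johnson, 2011), and Japanese and Tamil are mora-timed (Port, Dalby, & O'Dell, 1987; Ramus et al., 1999). One role of rhythm in language is said to be the formation of units of perception (Cutler, 1994). Perceptual experiments have yielded data indicating that English speech is perceived in units of stress and French speech in units of syllables (Cutler, Mehler, Norris, & Segui, 1986), and that Japanese speech is perceived in units of morae (Otake, Hatano, Cutler, & Mehler, 1993).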
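There is research showing that linguistic rhythm and power-spectral factors are related. Yamashita et al. (2013) made longitudinal recordings of the spontaneous vocalizations of infants raised in English-speaking and Japanese-speaking environments and observed how the acoustic features of their speech changed during development. They applied factor analysis, in the same way as Ueda and Nakajima (2017), to speech recorded at 15, 20, and 24 months of age and found that the older the infants, the closer the pattern of their power-spectral factors was to that of adults. Furthermore, for one of the three extracted factors, the factor with large loadings in the mid-frequency band around 1100 Hz, they computed the autocorrelation function of its factor scores to see whether something like a rhythmic pattern appears in the temporal variation of the scores. When a clear peak appears in the autocorrelation function, a rhythm whose time interval equals the lag of that peak can be considered to have formed. This analysis showed that the speech rhythm of older infants is close to that of adults.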
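The idea of reading a rhythm period off the autocorrelation of a factor-score time series can be illustrated as follows. This is a minimal sketch, assuming factor scores sampled every 1 ms and a simple normalized (biased) autocorrelation estimate; it is not the computation used by Yamashita et al. (2013), and the function names, the minimum lag, and the synthetic test signal are hypothetical.

```python
import numpy as np


def autocorrelation(score):
    """Normalized autocorrelation of a mean-removed series for lags >= 0."""
    s = np.asarray(score, dtype=float)
    s = s - s.mean()
    full = np.correlate(s, s, mode="full")
    acf = full[len(s) - 1:]          # keep non-negative lags only
    return acf / acf[0]


def dominant_period(score, frame_period_s, min_lag_s):
    """Lag (in seconds) of the highest autocorrelation peak beyond min_lag_s."""
    acf = autocorrelation(score)
    start = int(round(min_lag_s / frame_period_s))
    stop = len(acf) // 2             # long lags are estimated from few samples
    lag = start + int(np.argmax(acf[start:stop]))
    return lag * frame_period_s, acf[lag]


# Usage: a synthetic "factor score" that rises and falls every 200 ms.
frame_period = 0.001
t = np.arange(2000) * frame_period
score = 1.0 + np.cos(2 * np.pi * t / 0.2)
period_s, peak_height = dominant_period(score, frame_period, min_lag_s=0.05)
```

A clear, high peak (here near a lag of 0.2 s) would be read as a rhythm with that time interval; a flat autocorrelation function would indicate that no regular rhythm has formed.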
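Having seen that the power-spectral factor with large loadings in the band around 1100 Hz can be used to study linguistic rhythm, we now turn to the study by Nakajima, Ueda, Fujimaru, Motomura, and Ohsaka (2017), which analyzed this factor in more detail. Nakajima et al. (2017) extracted three power-spectral factors from British English speech, using the factor analysis of spectral structure employed by Ueda and Nakajima (2017), and observed the factor scores of each phoneme. When the phonemes were placed in the three-factor space according to their factor scores, they tended to be distributed along a curve in the order of the sonority scale. From this, the authors further found that the power-spectral factor with large loadings in the mid-frequency band around 1100 Hz is positively correlated with the sonority scale and that the factor with large loadings in the high-frequency band above 3300 Hz is negatively correlated with it. Sonority is an ordinal scale indicating how loudly and resonantly each phoneme can be produced, and it was proposed by researchers in linguistics and phonetics (Selkirk, 1984; Harris, 1994; Spencer, 1996). De Saussure (1959) classified phonemes on an ordinal scale called aperture, from the viewpoint of how open the articulators are during production and how resonant the sound consequently is. Although it is called aperture, the scale is based on the idea that articulation and hearing cannot be separated, and the discussion places weight on the relation to how sounds are heard. Aperture can thus be considered equivalent to sonority. In the sonority scale of Spencer (1996), sonority decreases in the order of vowels, glides, liquids, nasals, fricatives and affricates, and plosives. A syllable is formed by concatenating phonemes, and, as a rule, phonemes are connected from lower to higher sonority and then back to lower sonority. This is called the sonority sequencing principle (Rahilly, 2016); when phonemes are strung together according to this rule, the nucleus of a syllable is formed where sonority peaks. What is noteworthy about the study of Nakajima et al. (2017) is that it proposed a psychophysical entity corresponding to sonority. If sonority is defined in this way, the fact that /s/ does not become a syllable nucleus in onset clusters of the fricative /s/ and the plosive /t/, which are frequent in English as in the word "stop", can be explained without contradicting the sonority sequencing principle.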
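The rhythm felt when listening to sound is formed by perceiving strong and weak elements arranged with temporal regularity. In speech, these strong and weak elements are thought to correspond to sonority (Handel, 1989). Galves, Garcia, Duarte, and Galves (2002) confirmed that sonority is related to linguistic rhythm by acoustically analyzing speech in languages with different rhythmic structures. Because capturing linguistic rhythm is important for speech perception, it is also important to examine how sonority affects speech perception. The studies of Ueda and Nakajima (2017) and Nakajima et al. (2017) have made it possible to capture sonority through measurable quantities, the power-spectral factors. Hence, if speech is resynthesized from power-spectral factors, listening experiments examining the relation between speech perception and sonority become possible.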
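1.2 Purpose of this thesis
The purpose of this thesis is to examine, through listening experiments, the role in speech perception of the power-spectral factors that make up the power fluctuations of speech in each critical band. To this end, a method for resynthesizing speech from power-spectral factors is established. The work is positioned as research that connects listening experiments measuring the intelligibility of synthesized speech with structural analyses of speech spectra by statistical methods.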
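1.3 Structure of this thesis
Chapter 1 has introduced, as background to the present study, previous studies that examined, through listening experiments and through statistical analyses of speech, how the spectral representation of speech obtained at the auditory periphery is used in speech perception. In doing so, it has identified the open issues and stated the purpose of the study.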
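Chapters 2 to 4 report the two analyses and four experiments conducted in this study. Analysis 1 in Chapter 2 proposes a factor-analysis method suited to reconstructing the power fluctuations in each critical band from power-spectral factors, such as those obtained by the factor analysis of critical-band power fluctuations by Ueda and Nakajima (2017), and to creating stimuli for listening experiments. Japanese, British English, and Mandarin Chinese (Putonghua) speech is analyzed with this method to confirm whether the resulting power-spectral factors are equivalent to those in the analysis of Ueda and Nakajima (2017). The subsequent Experiment 1 examines how accurately Japanese speech can be conveyed by the information on temporal changes in the power spectrum that the power-spectral factors can represent. This experiment determines how many power-spectral factors are needed to synthesize sufficiently intelligible speech. Chapter 3 focuses on a problem that arises when speech is resynthesized from the power-spectral factors of Chapter 2 and, as a way of avoiding it, proposes modifying the power-spectral factors so that they become non-negative while their orthogonality is maintained. Analysis 2 examines how much this modification affects the proportion of the power-spectral variation of speech explained by the factors. Experiment 2 measures, with the same method as Experiment 1, the intelligibility of speech synthesized from the non-negative power-spectral factors. By comparing the results of Experiments 1 and 2, we examine whether the problem that arose in resynthesis in Chapter 2 matters for achieving the purpose of this study. Chapter 4 focuses on the roles of the individual factors among the power-spectral factors found in Chapters 2 and 3 to be important for the intelligible perception of speech. When speech is resynthesized from the power-spectral factors, the information on temporal changes in the power spectrum given by some of the factors is removed, and the resulting drop in intelligibility is measured. By comparing how intelligibility differs depending on which factor is removed, the role of each factor is discussed.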
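Chapter 5 presents the general discussion. It first summarizes the results of the analyses and experiments in Chapters 2 to 4 and states conclusions about the role that power-spectral factors play in speech perception. It then discusses how these conclusions are positioned within the history of the field and what new interpretations they offer for previous studies. In that context, it attempts to explain, on the basis of the conclusions of this study, why the content of channel-vocoded speech can be perceived clearly even with only a few bands. Finally, it addresses the limitations of the study and describes future prospects.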
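Chapter 2 Number of power-spectral factors required for intelligible perception of sentences
2.1 Purpose of Chapter 2
When we listen to speech, a spectral representation of the speech is obtained at the auditory periphery through the frequency-analysis function of hearing. This representation can be simulated with critical-band filters. The power-spectral factors derived by Ueda and Nakajima (2017) approximate the power spectrum of speech obtained with 20 critical-band filters by a linear combination of a smaller number of factors. That is, the closer the number of extracted factors is to 20, the more faithfully the outputs of the critical-band filters can be reproduced. A factor structure common to the eight languages analyzed by Ueda and Nakajima (2017) was obtained only up to four factors. Because their factor analysis is based on principal component analysis, this result indicates that the four factors making up the principal features of the power-spectral structure of speech are common across languages. How much information, then, can power-spectral factors with features shared across languages provide for the intelligible perception of speech?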
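This chapter proposes a new form of principal component analysis, "origin-shifted principal component analysis", which yields power-spectral factors suited to resynthesizing speech. Analysis 1 confirms whether factors equivalent to those obtained in previous research are also obtained when speech is analyzed with the proposed method. Experiment 1 then examines how many factors are needed for resynthesized noise-vocoded speech to be heard with sufficient intelligibility when Japanese speech is resynthesized from power-spectral factors as noise-vocoded speech.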
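2.2 Analysis 1: Extraction of power-spectral factors by origin-shifted principal component analysis
When speech is resynthesized from power-spectral factors obtained by conventional principal component analysis, a steady-state noise arises, as described below, which makes the speech unsuitable for listening experiments. A modified principal component analysis was therefore proposed to avoid this problem. This section compares the power-spectral factors obtained with the newly proposed origin-shifted principal component analysis with those obtained with conventional principal component analysis, and checks that the change in analysis method does not affect the results in any essential way.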
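2.2.1 Speech materials
Japanese, British English, and Mandarin Chinese (Putonghua) speech digitally recorded (16-bit quantization, 16000-Hz sampling) in the NTT-AT Multilingual Speech Database 2002 (NTT-AT, 2002) was analyzed. The Japanese and British English materials each consisted of 200 sentences, each spoken by five male native speakers of the language. The Mandarin Chinese material consisted of 78 sentences, each spoken by five male native speakers. Each sentence was about 2 s long on average. The total duration of the analyzed speech was 2484 s for Japanese, 1979 s for British English, and 870 s for Mandarin Chinese. These durations far exceed the 30 s considered necessary for stable analysis results (Li, Hughes, & House, 1969; Zahorian & Rothenberg, 1981). The mean fundamental frequency of the speech was 136 Hz (SD = 31 Hz) for Japanese, 126 Hz (SD = 30 Hz) for British English, and 164 Hz (SD = 38 Hz) for Mandarin Chinese. The three languages were selected from the eight languages analyzed by Ueda and Nakajima (2017) as representatives of different language groups. Japanese, British English, and Mandarin Chinese have mutually different linguistic rhythms: British English is stress-timed, Japanese is mora-timed, and Chinese is syllable-timed (Cutler, 1994; Ramus et al., 1999). In the analysis of Ueda and Nakajima (2017), speech of female as well as male speakers was used. In this thesis, speech of male speakers was used as the original speech in the listening experiments. Male speech has a lower fundamental frequency than female speech, and information about the envelope shape of the power spectrum is easier to extract from speech with a low fundamental frequency. For this reason, male speech was used as the original speech for the listening experiments, and only male speech was analyzed to obtain the power-spectral factors used in those experiments.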
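2.2.2 Procedure
Figure 2.1 shows the analysis procedure. The analysis consists of two parts: processing each speech signal to obtain the power fluctuations in each critical band, and subjecting these power fluctuations to origin-shifted principal component analysis to obtain the power-spectral factors.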
Figure 2.1: Flow chart of the analysis procedure. The digitally recorded original speech signals (File 1, File 2, ..., File 200) undergo signal processing to yield the power fluctuations in each of the 20 critical bands (center frequencies from 75 to 5800 Hz). The power fluctuations in the 20 critical bands are subjected to origin-shifted principal component analysis and varimax rotation to obtain the power-spectral factors (labeled "Spectral change factors" in the figure).
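We first describe the procedure for obtaining the power fluctuations in each critical band, which are the data submitted to the factor analysis. The signal processing is summarized in Figure 2.2.
The speech to be analyzed was divided into 20 critical bands, and the power fluctuation in each band was obtained at 1-ms intervals. To obtain the power spectrum of the speech at a given point in time, a 30-ms segment of the waveform centered on that point was cut out with a window function; a Hamming window was used. A fast Fourier transform was then applied to obtain the amplitude spectrum of the 30-ms windowed signal, and the amplitude spectrum was squared to give the power spectrum. In the power spectrum obtained up to this point, the power varies finely along the frequency axis. These fine components of the power spectrum originate mainly from vocal-fold vibration and are related to the fundamental frequency of the speech. The features of the envelope of the power spectrum, in contrast, are determined by the vocal-tract shape during production. The power spectrum of speech is modeled as the product of the frequency characteristic produced by vocal-fold vibration (the source) and the frequency characteristic of the resonances determined by the vocal-tract shape (the filter). This view is known as the source-filter theory (Fant, 1960). The aim here is to analyze the features of the power-spectral envelope that originate from the vocal-tract shape. In the results of Ueda and Nakajima (2017), the four-factor analysis yielded a factor whose loadings were large in every other critical band among the critical bands with low center frequencies. The spacing between adjacent peaks in the fine structure of the power spectrum corresponds approximately to the rate of vocal-fold vibration, that is, the fundamental frequency. When the fundamental frequency exceeds the critical bandwidth, a peak of the fine structure may fall in one critical band while a valley falls in the neighboring band. This is probably why the factor loadings were large in every other critical band. In the present study, the envelope shape of the power spectrum was therefore estimated and analyzed. Cepstral analysis (e.g., Rabiner & Schafer, 1978), which is widely used in acoustics for estimating the power-spectral envelope, was adopted. It has been shown on the basis of the source-filter theory that, in the cepstrum obtained by taking the logarithm of the power spectrum and Fourier-transforming it, the frequency characteristic of the source appears in the higher-order (high-quefrency) components and that of the vocal tract in the lower-order (low-quefrency) components. The envelope of the power spectrum can be estimated by removing the quefrency components above a certain value from the cepstrum and transforming the result back into a power spectrum.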
Figure 2.2: Signal-processing steps for obtaining the power fluctuations in the 20 critical bands from a speech signal. Input: speech signal → Hamming window (30 ms) → short segment → fast Fourier transform → power spectrum → short-pass liftering (5 ms) → smoothed power spectrum → averaging in each critical band → critical-band filter output → output: power fluctuations. The spectrum panels plot power (dB) against frequency (Hz); the critical-band output is plotted against the center frequencies of the bands (75–5800 Hz).
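In the present study, the high-quefrency components at 5 ms and above on the quefrency axis were removed (liftering). This gave a power spectrum in which the fine structure was smoothed. The smoothed power spectrum was divided into the 20 critical bands, and the mean power within each band was calculated. These operations give the power in the 20 critical bands at one point in time of the speech signal. By repeating them while shifting the Hamming window in 1-ms steps (a frame period of 1 ms), the power fluctuations in the 20 critical bands were obtained at 1-ms intervals. To use the same conditions as Ueda and Nakajima (2017) and Nakajima et al. (2017), the center frequencies and cutoff frequencies of the 20 critical bands were set with reference to Zwicker and Terhardt (1980). However, because the band below 50 Hz has little relation to speech and had been removed when the database was recorded, the first critical band was changed from 0–100 Hz to 50–100 Hz. The 20 critical bands were thus arranged over the frequency range of 50–6400 Hz (Table 2.1). Although the original speech was recorded up to 8000 Hz, only frequencies up to 6400 Hz were analyzed. It was confirmed beforehand that the linguistic content of the speech can be heard correctly even when the original speech is band-limited to 6400 Hz or below. In addition, frequency components near the upper limit of the range that can be recorded at a sampling frequency of 16000 Hz may be distorted even if they were recorded. For these reasons, the frequency range for analysis was set to extend up to 6400 Hz.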
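The signal processing of Figure 2.2 can be sketched as follows. This is a minimal illustration under assumptions that the text does not specify: an FFT length of 1024 points, averaging of linear (not dB) power within each band, and a small constant added before taking the logarithm. The band edges are the passbands of Table 2.1; the function name and the synthetic test signal are hypothetical.

```python
import numpy as np

# Passband edges (Hz) of the 20 critical bands listed in Table 2.1.
BAND_EDGES_HZ = [50, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270,
                 1480, 1720, 2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400]


def critical_band_power(x, fs=16000, win_s=0.030, hop_s=0.001,
                        lifter_s=0.005, n_fft=1024):
    """Return an array (frames x 20) of smoothed power per critical band."""
    win_len = int(round(win_s * fs))
    hop = int(round(hop_s * fs))
    window = np.hamming(win_len)
    q_cut = int(round(lifter_s * fs))      # first quefrency index to remove
    keep = np.zeros(n_fft)
    keep[:q_cut] = 1.0                     # low quefrencies ...
    keep[n_fft - q_cut + 1:] = 1.0         # ... and their mirror images
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    masks = [(freqs >= lo) & (freqs < hi)
             for lo, hi in zip(BAND_EDGES_HZ[:-1], BAND_EDGES_HZ[1:])]
    frames = []
    for start in range(0, len(x) - win_len + 1, hop):
        seg = x[start:start + win_len] * window           # Hamming window
        power = np.abs(np.fft.fft(seg, n_fft)) ** 2       # power spectrum
        cepstrum = np.fft.ifft(np.log(power + 1e-12)).real
        smooth_log = np.fft.fft(cepstrum * keep).real     # short-pass liftering
        smooth = np.exp(smooth_log[: n_fft // 2 + 1])     # smoothed power
        frames.append([smooth[m].mean() for m in masks])  # band averages
    return np.array(frames)


# Usage: 0.2 s of a harmonic complex with a 125-Hz fundamental.
fs = 16000
t = np.arange(int(0.2 * fs)) / fs
x = sum(np.sin(2 * np.pi * 125 * k * t) / k for k in range(1, 40))
band_power = critical_band_power(x, fs)
```

With a 16000-Hz sampling frequency the 5-ms cutoff corresponds to 80 samples on the quefrency axis, which lies below the pitch period of the male voices analyzed here (about 6–8 ms), so the harmonic fine structure is removed while the envelope is retained.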
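The appropriateness of using critical bands in the analysis should also be addressed. The aim of this study is to extract, as factors, the features found in the power-spectral envelope of speech. The fundamental frequency of the speech analyzed is about 150 Hz, so it is not necessary to use auditory filters whose bandwidths are 100 Hz or less at low frequencies. Noise-vocoded speech with 20 channels divided according to critical bandwidths is almost perfectly intelligible. In addition, when noise-vocoded speech is synthesized, it is simpler if the spectrum is flat within each channel. For these reasons, it is considered reasonable to take critical bands as the starting point of the discussion.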
Table 2.1. Center frequencies and passbands of the critical-band filters.
Band no.   Center frequency (Hz)   Passband (Hz)
1 75 50–100
2 150 100–200
3 250 200–300
4 350 300–400
5 450 400–510
6 570 510–630
7 700 630–770
8 840 770–920
9 1000 920–1080
10 1170 1080–1270
11 1370 1270–1480
12 1600 1480–1720
13 1850 1720–2000
14 2150 2000–2320
15 2500 2320–2700
16 2900 2700–3150
17 3400 3150–3700
18 4000 3700–4400
19 4800 4400–5300
20 5800 5300–6400
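The 20 power fluctuations obtained with the procedure in Figure 2.2 can be regarded as multivariate data consisting of 20 variables. These data were subjected to the origin-shifted principal component analysis newly proposed in this study, principal components were extracted, and the factor loadings of the principal components were varimax-rotated (Kaiser, 1958) to extract the power-spectral factors.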
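Principal component analysis in its original form is a method that, for multivariate data represented in a multidimensional space, successively finds as principal components the directions in that space along which the variance of the data is greatest (e.g., Jolliffe, 2002). The eigenvectors that determine the principal components are therefore obtained with the center of gravity of the data as their origin. Reconstructing the original multivariate data using only the information that the principal components obtained in this way can represent amounts to replacing the original data with their orthogonal projection onto the principal-component space. If the zero point of the data, which in this study is the point representing silence, is not contained in the subspace given by the principal component analysis, the data points representing silence are distorted in the reconstruction and move to points that have power somewhere in the critical bands. Speech resynthesized from data reconstructed in this way therefore contains a steady-state noise, and listeners would feel that a noise sounds continuously throughout the resynthesized speech. Speech in which such unintended steady-state noise arises is not appropriate for listening experiments. A similar steady-state noise probably arose in the resynthesized speech of Zahorian and Rothenberg (1981), but they made no particular mention of it.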
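Origin-shifted principal component analysis, in contrast, is a modification in which the vectors defining the subspace found by principal component analysis have as their origin not the center of gravity of the data but the point at which all variables are zero, which in this thesis is the point representing silence (see footnote 1). With this modification, the point representing silence always remains silent after the data are reconstructed, and the unintended steady-state noise caused by the problem described above no longer arises. Figure 2.3 is a conceptual diagram showing how the principal component calculated for non-negative data consisting of two variables differs between conventional principal component analysis and origin-shifted principal component analysis.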
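Figure 2.3: Conceptual diagram of how the first principal component (PC1) is calculated for two variables (V1, V2) in conventional principal component analysis (left, "Conventional PCA"), where it passes through the center of gravity of the data, and in origin-shifted principal component analysis (right, "Origin-shifted PCA"). In conventional principal component analysis, the principal component is not calculated so as to contain the origin of the multidimensional space, and a displacement arises when the origin is orthogonally projected onto the principal-component axis. This displacement causes the steady-state noise in the resynthesis of speech.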
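The difference between the two analyses can be made concrete with a small numerical sketch. The code below assumes, for illustration only, an eigen-decomposition of the unstandardized second-moment matrix; the thesis does not state whether the variables were standardized, and the function names and random test data are hypothetical. Shifting the origin to the all-zero point is implemented directly, which is equivalent to the practical method described in footnote 1 (appending a sign-inverted copy of the data): the stacked data have zero mean, and their covariance equals the second-moment matrix about the origin.

```python
import numpy as np


def pca_basis(data, n_components, origin_shifted):
    """Return (origin, basis); basis columns are the leading eigenvectors."""
    x = np.asarray(data, dtype=float)
    origin = np.zeros(x.shape[1]) if origin_shifted else x.mean(axis=0)
    centred = x - origin
    # Second-moment matrix about the chosen origin. For origin_shifted=True
    # this equals the covariance of the data stacked with its sign-flipped copy.
    moment = centred.T @ centred / len(x)
    eigval, eigvec = np.linalg.eigh(moment)
    order = np.argsort(eigval)[::-1][:n_components]
    return origin, eigvec[:, order]


def reconstruct(points, origin, basis):
    """Orthogonally project points onto the subspace through `origin`."""
    p = np.atleast_2d(points) - origin
    return origin + p @ basis @ basis.T


# Usage: non-negative data in 5 "bands"; the all-zero row stands for silence.
rng = np.random.default_rng(0)
data = rng.random((500, 5)) * np.array([1.0, 2.0, 4.0, 2.0, 1.0])
silence = np.zeros(5)

o_conv, b_conv = pca_basis(data, 2, origin_shifted=False)
o_shift, b_shift = pca_basis(data, 2, origin_shifted=True)
print(reconstruct(silence, o_conv, b_conv))    # generally non-zero: residual "noise"
print(reconstruct(silence, o_shift, b_shift))  # stays at zero
```

Because the subspace of the origin-shifted analysis passes through the all-zero point, silence is reproduced as silence, whereas the conventional subspace passes through the center of gravity and generally maps silence to a point with non-zero values.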
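By applying varimax rotation to the vectors that define the principal-component space obtained by origin-shifted principal component analysis, that is, the factor loadings, the power-spectral factors can be put into a form that is easier to interpret, in which it is easier to see which critical bands each factor is strongly related to. Nine sets of power-spectral factors were obtained by varying the number of principal components included in the varimax rotation. For example, to obtain a set of three power-spectral factors, the principal components up to the third were varimax-rotated.
Footnote 1: There are several practical ways of making the origin of the vectors defining the principal-component space the point at which all variables are zero. In this thesis, the multivariate data submitted to principal component analysis were concatenated with a copy of the data with inverted signs, so that the center of gravity of the data moved to the point at which the variables are zero.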
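Varimax rotation itself can be sketched compactly. The code below is a commonly used iterative, SVD-based formulation of the varimax criterion of Kaiser (1958) without Kaiser normalization; whether the thesis used normalization, and which software was used, is not stated, so this is an illustrative assumption. The function name, iteration limit, and tolerance are hypothetical.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-8):
    """Rotate a (variables x factors) loading matrix; return (rotated, R)."""
    a = np.asarray(loadings, dtype=float)
    n_vars, n_factors = a.shape
    rotation = np.eye(n_factors)
    previous = 0.0
    for _ in range(max_iter):
        rotated = a @ rotation
        # Gradient-like matrix of the varimax criterion.
        target = rotated ** 3 - rotated @ np.diag(np.sum(rotated ** 2, axis=0)) / n_vars
        u, s, vt = np.linalg.svd(a.T @ target)
        rotation = u @ vt                      # nearest orthogonal matrix
        current = s.sum()
        if previous != 0.0 and current / previous < 1.0 + tol:
            break
        previous = current
    return a @ rotation, rotation


# Usage: rotate the first three columns of a hypothetical 20-band loading matrix.
rng = np.random.default_rng(0)
loadings = rng.standard_normal((20, 3))
rotated, rotation = varimax(loadings)
```

Because the rotation matrix is orthogonal, the subspace spanned by the factors and the sum of squared loadings of each band are unchanged; only the distribution of loadings among the factors changes, which is what makes each factor easier to associate with particular critical bands.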
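2.2.3 Results and discussion
Figures 2.4, 2.5, and 2.6 show the factor loadings in each critical band for the nine sets of power-spectral factors obtained from the three languages. We first consider whether the features of the power-spectral factors are similar across the three languages. Up to the extraction of four factors, similar arrangements of factors were obtained. When five or more factors were extracted, it became difficult to find factors that could be considered common across the languages. These results agree with what was reported by Ueda and Nakajima (2017).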
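Next, we consider what kind of factors were obtained for the numbers of factors at which power-spectral factors with common features were found across languages. In the one-factor analysis, the factor loadings were positive in all critical bands. This is a property that the first principal component (factor) obtained by origin-shifted principal component analysis always has. In the two-factor analysis, a factor with large loadings in the mid-frequency band centered on 1000 Hz and a factor with large loadings in the bands on both sides of it were obtained. In the three-factor analysis, the following factors were obtained: a factor with large loadings in the mid-frequency band centered on 1000 Hz, a factor with large loadings in the high-frequency band above 3000 Hz, and a bimodal factor with large loadings split between the low-frequency band below 500 Hz and the band between the mid- and high-frequency bands (1500–3000 Hz). The appearance of a bimodal factor was likewise reported in the analysis of Ueda and Nakajima (2017), and it can be taken as an indication that the present analysis leads to essentially the same results as theirs. In the four-factor analysis, in addition to the factors with large loadings in the mid- and high-frequency bands obtained in the three-factor analysis, two factors were obtained that look as if the bimodal factor had been split in two. This result also agrees with the results of Ueda and Nakajima (2017). It can therefore be concluded that the origin-shifted principal component analysis introduced here yields power-spectral factors equivalent to those in the previous research.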
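When, in a given critical band, the absolute value of the loading of one particular factor is large and the absolute values of the loadings of the other factors are small in comparison, a large proportion of the power variation in that band is explained by the factor with the large absolute loading. When the factor loadings show the same relation in other critical bands as well, the power in those bands varies in the same way. For example, in several critical bands around 1000 Hz in the three-factor analysis, the loading of one factor (shown by open circles) is large and positive while the loadings of the other factors are close to zero (Figure 2.4). The power in these bands can therefore be interpreted as varying together.
Figure 2.4: Factor loadings in each critical band for the power-spectral factors obtained with origin-shifted principal component analysis. The values on the horizontal axis are the center frequencies of the 20 critical bands arranged over the frequency range of 50–6400 Hz. See Table 2.1 for the center frequencies and bandwidths of the critical bands. From left to right: results for native speakers of British English, Japanese, and Mandarin Chinese (Putonghua). From top to bottom: results when one, two, and three factors were extracted.
Figure 2.5: Factor loadings in each critical band for the power-spectral factors obtained with origin-shifted principal component analysis. The values on the horizontal axis are the center frequencies of the 20 critical bands arranged over the frequency range of 50–6400 Hz. See Table 2.1 for the center frequencies and bandwidths of the critical bands. From left to right: results for native speakers of British English, Japanese, and Mandarin Chinese (Putonghua). From top to bottom: results when four, five, and six factors were extracted.
Figure 2.6: Factor loadings in each critical band for the power-spectral factors obtained with origin-shifted principal component analysis. The values on the horizontal axis are the center frequencies of the 20 critical bands arranged over the frequency range of 50–6400 Hz. See Table 2.1 for the center frequencies and bandwidths of the critical bands. From left to right: results for native speakers of British English, Japanese, and Mandarin Chinese (Putonghua). From top to bottom: results when seven, eight, and nine factors were extracted.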
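Furthermore, in the four-factor analysis, the factor with large loadings in the five critical bands below 500 Hz had loadings of similar magnitude in all five bands. This is probably because the source characteristics of the power spectrum were appropriately removed by the cepstral analysis.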
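Depending on the structure of the multivariate data analyzed (the distribution of the data in the multidimensional space), conventional principal component analysis and origin-shifted principal component analysis may lead to very different factors. The fact that the power-spectral factors obtained in the three- and four-factor analyses have features similar to those of the corresponding factors obtained with conventional principal component analysis means that, when the data on power-spectral variation in each critical band are represented in the multidimensional space, the data are distributed near the straight line connecting the center of gravity of the data and the silent point (the origin of the multidimensional space).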
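Table 2.2 shows the cumulative contributions of the factors obtained with origin-shifted principal component analysis and of those obtained with conventional principal component analysis. The contribution indicates what proportion of the variance of the original multivariate data is retained by a principal component or factor. It can be regarded as expressing how much of the information in the original multivariate data each principal component or factor explains. The cumulative contribution is the contribution accumulated up to a given principal component or number of factors. The contribution of the first factor obtained with origin-shifted principal component analysis was about 17–22%; the cumulative contribution rose gradually as the number of factors increased and reached about 74–77% with nine factors. The cumulative contribution up to the fourth factor, the range within which the features of the power-spectral factors are common across the languages, was about 49–56%. This indicates that roughly half of the features of the power-spectral variation of speech can be explained by the first four factors and, moreover, that these features are common to the three languages.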
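When the cumulative contributions of conventional principal component analysis and origin-shifted principal component analysis are compared within each language for the same number of factors, the differences are small: about 2 percentage points or less in most cases and about 3 points at most (Table 2.2). In terms of cumulative contribution, the power-spectral factors obtained with origin-shifted principal component analysis can thus be regarded as equivalent to those obtained with conventional principal component analysis.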
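Data on the power-spectral variation of continuously spoken speech can be said to be suited to origin-shifted principal component analysis. On the other hand, when the power spectra of the steady-state portions of vowels are observed one vowel at a time and the levels in each frequency band of these spectra are used as variables, origin-shifted principal component analysis is considered unsuitable. The speech analyzed by Plomp et al. (1967) is exactly such a case. The power in the steady-state portion of a vowel is stable, and when vowels are produced one at a time, the power is unlikely to differ greatly between vowels unless the speaker deliberately makes it do so. The observed data are therefore not distributed near the straight line connecting the center of gravity of the data and the origin of the multidimensional space. If conventional principal component analysis is applied to such data, some of the factor loadings of the first principal component are expected to be negative (see footnote 2), whereas if origin-shifted principal component analysis is applied to the same data, the factor loadings of the first principal component are all positive, so the results of the two analyses may differ greatly.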
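Figure 2.7 shows an example of the steady-state noise that arises when speech is resynthesized from power-spectral factors obtained with conventional principal component analysis. The original speech was resynthesized as noise-vocoded speech; the actual resynthesis method is described in detail in the next section. In the example shown in the figure, the original speech is almost silent around 1.9–2.0 s. When it is resynthesized from four factors obtained with conventional principal component analysis, a steady-state noise can be seen in the band of about 1000–1500 Hz. When it is resynthesized from factors obtained with origin-shifted principal component analysis, no such steady-state noise arises.
Footnote 2: The sign of a factor loading has no meaning in itself. Reversing all the signs does not change what a principal component or factor represents. The difference in sign is meaningful when factor loadings are compared between variables within one principal component or factor. It can therefore equally be said that the factor loadings of the first principal component obtained with origin-shifted principal component analysis are all negative. To make the results easier to read, the signs in this thesis are aligned so that, for each factor, the loading with the largest absolute value is positive.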
Table 2.2. Cumulative contribution (%) of the factors obtained with conventional principal component analysis (Conventional) and with origin-shifted principal component analysis (Proposal).
                    Japanese                 British English          Mandarin Chinese
Number of factors   Conventional  Proposal   Conventional  Proposal   Conventional  Proposal
1                   23.3          22.9       22.4          21.5       17.9          17.3
2                   38.0          37.3       35.3          34.5       32.0          31.5
3                   49.9          47.9       48.7          45.6       43.1          40.7
4                   55.9          55.6       54.0          53.7       48.8          48.6
5                   61.7          61.3       59.9          59.7       55.7          55.3
6                   66.5          66.3       65.2          64.5       61.2          60.3
7                   70.7          70.5       69.2          68.8       65.8          65.4
8                   74.3          74.1       73.3          72.3       70.3          69.3
9                   77.7          77.6       76.5          76.3       73.8          73.7