
JAIST Repository


Academic year: 2021



https://dspace.jaist.ac.jp/

Title

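A study on speaker individuality contained in the temporal variation of fundamental frequency in sentence speech (文音声中の基本周波数の時間変化に含まれる個人性に関する研究)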

Author(s)

Ohno, Hiroshi (大野, 宏)

Citation

Issue Date

1997‑09

Type

Thesis or Dissertation

Text version

author

URL

http://hdl.handle.net/10119/1105

Rights

Description

Supervisor: Masato Akagi (赤木 正人), School of Information Science, Master's degree


contours of sentences

Hiroshi Ohno

School of Information Science,

Japan Advanced Institute of Science and Technology

February 13, 1998

Keywords: fundamental frequency contours, speaker individuality, Fujisaki model.

1 Introduction

This paper discusses speaker individuality in fundamental frequency contours of sentences, based on analysis using the Fujisaki model and on psychoacoustic experiments. The stimuli used for the experiments are synthesized using STRAIGHT [1], with fundamental frequency contours modified by the Fujisaki model. The experimental results indicate that (1) fundamental frequency contours of sentences carry much speaker individuality; (2) in particular, the base frequency $F_{\min}$ and the timing parameters ($T_0$, $T_1$ and $T_2$) of the frequency contour carry more speaker individuality than the other parameters, and subjects can be divided into two groups, for whom either the fundamental frequency height or the timing of the fundamental frequency dynamics affects discrimination; and (3) speaker individuality can be controlled by manipulating a few parameters, including the timing parameters.

2 Fujisaki model

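A fundamental frequency contour $F_0(t)$ is expressed as follows [2]:

$$\ln F_0(t) = \ln F_{\min} + \sum_{i=1}^{I} A_{pi}\, G_{pi}(t - T_{0i}) + \sum_{j=1}^{J} A_{aj} \left\{ G_{aj}(t - T_{1j}) - G_{aj}(t - T_{2j}) \right\},$$

$$G_{pi}(t) = \begin{cases} \alpha_i^2\, t \exp(-\alpha_i t) & (t \ge 0), \\ 0 & (t < 0), \end{cases} \qquad (1)$$

$$G_{aj}(t) = \begin{cases} \min\left[\, 1 - (1 + \beta_j t) \exp(-\beta_j t),\ \theta_j \,\right] & (t \ge 0), \\ 0 & (t < 0). \end{cases}$$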

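Figure 1: F ratio of each parameter (F-ratio axis from 0 to 50; parameters shown: $F_{\min}$, $A_p$ ($A_{p0}$ to $A_{p2}$), $A_a$ ($A_{a0}$ to $A_{a4}$), $\Delta T_0$ ($\Delta T_{01}$, $\Delta T_{02}$), $\Delta T_1$ ($\Delta T_{10}$ to $\Delta T_{14}$) and $\Delta T_2$ ($\Delta T_{20}$ to $\Delta T_{24}$))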

In Eq. (1), $F_{\min}$ is the baseline value of an $F_0$ contour, $I$ is the number of phrase commands, $J$ is the number of accent commands, $A_{pi}$ is the magnitude of the $i$-th phrase command, $A_{aj}$ is the amplitude of the $j$-th accent command, $T_{0i}$ is the instant of occurrence of the $i$-th phrase command, $T_{1j}$ is the onset of the $j$-th accent command, $T_{2j}$ is the end of the $j$-th accent command, $\alpha_i$ is the natural angular frequency of the phrase control mechanism for the $i$-th phrase command, $\beta_j$ is the natural angular frequency of the accent control mechanism for the $j$-th accent command, and $\theta_j$ is the ceiling level of the accent component for the $j$-th accent command.
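As a rough illustration of Eq. (1), the following Python sketch evaluates the phrase and accent responses and sums them on a logarithmic scale. It is only a sketch under stated assumptions: the names `alpha`, `beta` and `theta` stand for the natural angular frequencies and the ceiling level, the function names are hypothetical, and every numeric value is made up for illustration rather than taken from the paper.

```python
import math


def phrase_response(t, alpha):
    """G_p(t) = alpha^2 * t * exp(-alpha * t) for t >= 0, and 0 otherwise."""
    if t < 0:
        return 0.0
    return alpha ** 2 * t * math.exp(-alpha * t)


def accent_response(t, beta, theta):
    """G_a(t) = min[1 - (1 + beta*t) * exp(-beta*t), theta] for t >= 0, and 0 otherwise."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), theta)


def ln_f0(t, f_min, phrase_cmds, accent_cmds):
    """Log F0 at time t.

    phrase_cmds: list of (A_p, T_0, alpha)
    accent_cmds: list of (A_a, T_1, T_2, beta, theta)
    """
    value = math.log(f_min)
    for a_p, t0, alpha in phrase_cmds:
        value += a_p * phrase_response(t - t0, alpha)
    for a_a, t1, t2, beta, theta in accent_cmds:
        value += a_a * (accent_response(t - t1, beta, theta)
                        - accent_response(t - t2, beta, theta))
    return value


# Illustrative values only (assumptions, not taken from the paper).
F_MIN = 100.0                            # baseline frequency in Hz
PHRASES = [(0.5, 0.0, 3.0)]              # one phrase command at t = 0 s
ACCENTS = [(0.4, 0.3, 0.8, 20.0, 0.9)]   # one accent command from 0.3 s to 0.8 s

# F0 contour sampled every 10 ms over 2 s.
contour = [math.exp(ln_f0(0.01 * k, F_MIN, PHRASES, ACCENTS)) for k in range(200)]
```

Before any command starts, the contour sits at the baseline $F_{\min}$; each command then adds its response to the logarithm of the frequency.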

3 Analysis of differences in fundamental frequency contours of sentences

The speech data for all the experiments are sentences such as "aoi aoi ga aoi yane no ue ni aru" (with the accent positions marked), uttered by five male speakers.

The parameters of the Fujisaki model are estimated by minimizing the mean squared error between the extracted $F_0$ contour and the modeled $F_0$ contour on a logarithmic scale. The minimization process uses the analysis-by-synthesis method.
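The idea of analysis-by-synthesis can be sketched as follows: synthesize a contour for each candidate parameter set, measure the mean squared error against the observed contour on a logarithmic scale, and keep the best candidate. The example below is a toy under strong assumptions: it uses a single phrase command, no accent commands, a fixed `alpha`, and an exhaustive grid search, none of which is stated in the text. The actual search procedure used in the study is not described here.

```python
import math


def model_ln_f0(t, f_min, a_p, t0, alpha=3.0):
    """Toy model: baseline plus a single phrase command (no accent commands)."""
    dt = t - t0
    g = alpha ** 2 * dt * math.exp(-alpha * dt) if dt >= 0 else 0.0
    return math.log(f_min) + a_p * g


def log_mse(times, observed_f0, params):
    """Mean squared error between observed and modeled F0 on a logarithmic scale."""
    errors = [(math.log(f) - model_ln_f0(t, *params)) ** 2
              for t, f in zip(times, observed_f0)]
    return sum(errors) / len(errors)


def fit_by_grid_search(times, observed_f0, f_min_grid, a_p_grid, t0_grid):
    """Synthesize a contour for every candidate and keep the lowest-error one."""
    best = None
    for f_min in f_min_grid:
        for a_p in a_p_grid:
            for t0 in t0_grid:
                err = log_mse(times, observed_f0, (f_min, a_p, t0))
                if best is None or err < best[0]:
                    best = (err, (f_min, a_p, t0))
    return best


# Synthetic "observed" contour generated from known parameters (illustration only).
times = [0.02 * k for k in range(100)]
true_params = (110.0, 0.6, 0.2)
observed = [math.exp(model_ln_f0(t, *true_params)) for t in times]

best_err, best_params = fit_by_grid_search(
    times, observed,
    f_min_grid=[90.0, 100.0, 110.0, 120.0],
    a_p_grid=[0.2, 0.4, 0.6, 0.8],
    t0_grid=[0.0, 0.1, 0.2, 0.3],
)
```

Because the synthetic contour was generated from a point on the grid, the search recovers those parameters. With real speech, a finer or iterative search would be needed.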

To choose physical characteristics that represent speaker individuality among the analyzed parameters, we calculated the F ratio (inter-speaker variation divided by averaged intra-speaker variation) for each parameter.

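$$F_k = \frac{\sum_{i}^{n} \left( \bar{c}_{ik} - \frac{1}{n} \sum_{i'}^{n} \bar{c}_{i'k} \right)^2}{\frac{1}{N} \sum_{i}^{n} \sum_{j}^{N} \left( c_{ijk} - \bar{c}_{ik} \right)^2}, \qquad \left( \bar{c}_{ik} = \frac{1}{N} \sum_{j}^{N} c_{ijk} \right) \qquad (2)$$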

where $c_{ijk}$ is the $j$-th observation of the $i$-th speaker for the parameter $k$. A larger F ratio indicates that the parameter is more significant for speaker classification. Note that the $\Delta$ in $\Delta T_{0i}$, $\Delta T_{1j}$ and $\Delta T_{2j}$ indicates the difference between the phrase command timings and the mora boundary $T_{00}$.
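The F ratio of Eq. (2) can be computed for one parameter $k$ as in the sketch below. It assumes that every speaker has the same number of observations $N$. The function name `f_ratio` and the sample numbers are hypothetical and serve only to show the arithmetic.

```python
def f_ratio(observations):
    """F ratio for one parameter.

    observations[i][j] is the j-th observation of the i-th speaker.
    Assumes every speaker has the same number of observations N.
    """
    n = len(observations)              # number of speakers
    big_n = len(observations[0])       # observations per speaker (N)
    speaker_means = [sum(row) / big_n for row in observations]
    grand_mean = sum(speaker_means) / n

    # Numerator: squared deviation of each speaker mean from the mean of means.
    inter = sum((m - grand_mean) ** 2 for m in speaker_means)

    # Denominator: within-speaker squared deviations, scaled by 1/N.
    intra = sum((c - m) ** 2
                for row, m in zip(observations, speaker_means)
                for c in row) / big_n
    return inter / intra


# Made-up observations for two speakers (illustration only).
well_separated = f_ratio([[1.0, 3.0], [5.0, 7.0]])   # speaker means 2 and 6
overlapping = f_ratio([[1.0, 3.0], [2.0, 4.0]])      # speaker means 2 and 3
```

In this toy data the within-speaker spread is identical in both cases, so the larger value for `well_separated` comes only from the larger gap between the speaker means.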


Speaker          5
Subject          5
Headphone        SENNHEISER HDA200
Headphone amp    SANSUI AU-907MR
Hearing level    76 dB(A)

Table 2: t-test of the experiment result (between synthetic speech)

stimuli    same sample    same speaker    different speaker
O, ST      1.424          4.079           9.111
O, SF      1.585          3.654           9.199
ST, SF     1.187          0.115           0.265

$t_{0.05} = 1.960$, $t_{0.01} = 2.576$

4 Perception of speaker individuality

To investigate fundamental frequency contours modeled by the Fujisaki model, the psychoacoustic experiments used STRAIGHT speech waves with the spectra and amplitudes exchanged.

The types of stimuli are as follows:

1. O: original speech waves.

2. ST: speech synthesized by STRAIGHT and TEMPO, whose spectra come from another speaker's speech.

3. SF: speech synthesized by STRAIGHT and the Fujisaki model, whose spectra also come from another speaker's speech.

The psychoacoustic experiment was carried out by the method of paired comparison with a five-point judgment scale.

The results of the t-tests among the three stimuli are shown in Table 2 and Table 3.

The experimental results indicate that (1) fundamental frequency contours of sentences carry speaker individuality, and (2) fundamental frequency contours generated by the Fujisaki model carry as much speaker individuality as those generated by TEMPO.

5 Shift of perception by each parameter

The psychoacoustic experiment used the ABX method: the stimulus x was resynthesized with a few parameters exchanged, and subjects judged whether the synthetic speech x was closer to speaker a or to speaker b.


Table 3

stimuli    same stimuli and different speaker    same speaker and different speaker
ST         41.024                                61.221
SF         37.722                                57.52

$t_{0.05} = 1.960$, $t_{0.01} = 2.576$

Table 4: Parameter set

type      A  B  C  D  E  F  G  H
base      a  b  a  a  a  b  b  a
phrase    a  a  b  a  a  b  a  b
accent    a  a  a  b  a  a  b  b
timing    a  a  a  a  b  a  a  a

2. phrase: $A_{pi}$

3. accent: $A_{aj}$

4. timing: $T_{0i}$, $T_{1j}$, $T_{2j}$

The exchanged parameter sets are shown in Table 4.
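The exchange scheme of Table 4 can be pictured as a lookup that takes each parameter group from speaker a or speaker b. The sketch below encodes the table as given. The dictionary layout, the function name `build_stimulus_parameters` and all numeric values are hypothetical, and it assumes that the "base" group corresponds to $F_{\min}$.

```python
# Table 4 as given in the text: which speaker supplies each parameter group.
PARAMETER_SETS = {
    "A": {"base": "a", "phrase": "a", "accent": "a", "timing": "a"},
    "B": {"base": "b", "phrase": "a", "accent": "a", "timing": "a"},
    "C": {"base": "a", "phrase": "b", "accent": "a", "timing": "a"},
    "D": {"base": "a", "phrase": "a", "accent": "b", "timing": "a"},
    "E": {"base": "a", "phrase": "a", "accent": "a", "timing": "b"},
    "F": {"base": "b", "phrase": "b", "accent": "a", "timing": "a"},
    "G": {"base": "b", "phrase": "a", "accent": "b", "timing": "a"},
    "H": {"base": "a", "phrase": "b", "accent": "b", "timing": "a"},
}


def build_stimulus_parameters(set_name, speaker_a, speaker_b):
    """Pick each parameter group from speaker a or b according to the chosen set."""
    source = {"a": speaker_a, "b": speaker_b}
    return {group: source[who][group]
            for group, who in PARAMETER_SETS[set_name].items()}


# Hypothetical analysis results for two speakers (illustration only).
speaker_a = {
    "base": {"F_min": 100.0},
    "phrase": {"A_p": [0.5]},
    "accent": {"A_a": [0.4]},
    "timing": {"T_0": [0.0], "T_1": [0.30], "T_2": [0.80]},
}
speaker_b = {
    "base": {"F_min": 120.0},
    "phrase": {"A_p": [0.6]},
    "accent": {"A_a": [0.3]},
    "timing": {"T_0": [0.05], "T_1": [0.35], "T_2": [0.90]},
}

# Set E: everything from speaker a except the timing parameters.
stimulus_e = build_stimulus_parameters("E", speaker_a, speaker_b)
```

The resulting parameter dictionary would then drive resynthesis of the stimulus x; that step depends on the synthesis system and is not sketched here.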

The result of the psychoacoustic experiment is shown in Table 5. The values are the average rates at which subjects judged the stimulus to be speaker b.

The experimental results indicate that (1) the shift of perception is affected by the difference in the parameters between speakers; (2) $F_{\min}$ and the timing parameters ($T_0$, $T_1$ and $T_2$) of the frequency contour carry more speaker individuality than the other parameters; (3) subjects can be divided into two groups, for whom either the fundamental frequency height or the timing of the fundamental frequency dynamics affects discrimination; and (4) speaker individuality can be controlled by manipulating three parameters, including the timing parameters.

The results indicate that the timing parameters in the fundamental frequency contours of sentences carry more speaker individuality than those of words. The experimental result agrees with that of the report [4], in that speaker individuality is affected by differences in acoustic features.

6 Conclusion

In order to investigate speaker individuality in fundamental frequency contours of sentences, parameter extraction by the Fujisaki model, analysis of differences, and psychoacoustic experiments were carried out.

The results indicate that fundamental frequency contours of sentences carry speaker individuality, and that the timing parameters carry more speaker individuality than the other parameters.


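Table 5 (rows: subject 1 to subject 5 and the average; columns: parameter sets A to H; each cell is a symbol for one of four perceptual-rate ranges: 0 to 5 (×), 5 to 20, 20 to 40, and 40 to 100; the per-cell symbols are not recoverable)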

References

[1] H. Kawahara, "A high quality speech analysis, modification and synthesis method STRAIGHT", J. Acoust. Soc. Jpn., pp. 189-192 (1997)

[2] H. Fujisaki and K. Hirose, "Analysis of voice fundamental frequency contours for declarative sentences of Japanese", J. Acoust. Soc. Jpn. (E) 5, 4 (1984)

[3] M. Akagi and T. Ienaga, "Speaker individualities in fundamental frequency contours and its control", J. Acoust. Soc. Jpn. (E) 18, 2 (1997)

[4] M. Hashimoto and N. Higuchi, "Analysis of acoustic features affecting speaker identification", Eurospeech '95, pp. 435-438 (1995)
