Original Report

A Model of Coarticulation

Minoru SHIGENAGA, Kiyoshi TANAKA, Hitoshi ARIIZUMI, Kenji SAITO

(Received October 26, 1970)

Synopsis

In the study of speech, one of the principal problems is to investigate the mechanism of context in speech and to express it by rule. We have constructed a simple rule for the transition of vocal tract area functions. The rule assumes target configurations $S_1$, $S_2$, ... for each phoneme, and expresses the area function $S$ on the way of transition between $S_1$ and $S_2$ as follows: $S = S_1 + (S_2 - S_1)\cdot T(n)$ ($n$ = normalized time, $T(n)$ = normalized locus of the second formant frequency $F_2$ between $S_1$ and $S_2$). For the transition among $S_1$, $S_2$ and $S_3$, the locus of $F_2$ is assumed to be the product of the locus of $F_2$ between $S_1$ and $S_2$ ($T_1(n)$) and the one between $S_2$ and $S_3$ ($T_2(n)$), and on the articulatory level it is expressed as follows: $S = S_1 + [S_2 + (S_3 - S_2)\cdot T_2(n - n_\alpha) - S_1]\cdot T_1(n)$ ($n_\alpha$ = instant of beginning to carry out the command to move on to $S_3$). It has been shown that the results calculated by our rule coincide well enough with the observed effects in $V_1 V_2 V_1$, $V_1 C V_2$ and $CVC$ contexts ($V$ = vowel, $C$ = stop or nasal) and that the results can be used for phoneme discrimination.

1. Introduction

In the study of speech, one of the fundamental problems is to clarify the mechanism of coarticulation in speech and to express it by rule. This paper is concerned with how coarticulation effects appear in the contexts $V_1 V_2 V_1$, $V_1 C V_2$, etc. ($V$ = vowel, $C$ = consonant) and how the vocal tract area function should be moved in order to realize these effects. Furthermore, this rule may contribute to discrimination among clusters of similar phonemes.

Coarticulation has so far been reported mainly in terms of the frequency spectrum 1), and where it has been discussed at the articulatory level 2) the relation to the frequency spectrum has not been clearly resolved. Such a problem cannot be solved at the acoustical level alone; it requires investigating the movement of the articulatory organs. However, X-ray photographs are not easy to take, and moreover the vocal tract area function cannot yet be determined precisely enough from lateral cineradiographs. Therefore, it is necessary to make a model for the transition of the area function. The writers have worked to build such a model, and the formant patterns calculated by the model have come to coincide qualitatively with the observed ones and to express the coarticulation effect.

2. A model for the transition of vocal tract area function: A rule of coarticulation

Our model is constructed as follows: each phoneme has its own target area function (target configuration), and uttering phonemes in succession is expressed by combining the target configurations of the phonemes by rule. That is, if the target configurations of two phonemes are expressed as $S_1(x)$ and $S_2(x)$ ($x$ = distance from the glottis) respectively, then the area function $S(x)$ on the way of transition from $S_1(x)$ to $S_2(x)$ is given as follows:

$$S^{1/p}(x) = S_1^{1/p}(x) + \left[S_2^{1/p}(x) - S_1^{1/p}(x)\right]\cdot n \tag{1}$$

$n$ = normalized time, $0 \le n \le 1$.

Let $p$ be equal to 2 to 3, and let the second formant frequencies corresponding to $S(x)$, $S_1(x)$ and $S_2(x)$ be $F$, $F_1$ and $F_2$ respectively (here the subscripts refer to the configurations, not to the formant number); then the following relation holds approximately:

$$F = F_1 + (F_2 - F_1)\cdot n \tag{2}$$

This relation also holds approximately for the transition of the first formant frequency, and it has also been applied to the transitions between any consonant and vowel. Accordingly, by setting $p = 3$ and using $T(n)$ instead of $n$ in Eqs. 1 and 2, where $T(n)$ is a normalized function of $n$ coinciding approximately with the second formant curve, the following equation is obtained:

$$S^{1/3}(x) = S_1^{1/3}(x) + \left[S_2^{1/3}(x) - S_1^{1/3}(x)\right]\cdot T(n), \quad 0 \le T(n) \le 1 \tag{3}$$
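As an illustration of Eq. 3, the following minimal sketch interpolates between two target area functions in the cube-root domain. The function name `interpolate_area`, the three-sample area functions and the choice of passing the value of $T(n)$ directly as `t` are assumptions made for this example; they are not taken from the paper.

```python
def interpolate_area(s1, s2, t, p=3.0):
    """Eq. 3 (Eq. 1 when t = n): interpolate linearly in the S**(1/p) domain.

    s1, s2 : target area functions sampled at the same positions x (cm^2)
    t      : value of the time function T(n), with 0 <= t <= 1
    """
    out = []
    for a1, a2 in zip(s1, s2):
        r1, r2 = a1 ** (1.0 / p), a2 ** (1.0 / p)
        out.append((r1 + (r2 - r1) * t) ** p)
    return out


# Toy area functions in cm^2 (illustrative values only, not measured data).
S1 = [1.0, 8.0, 0.5]
S2 = [8.0, 1.0, 4.0]

halfway = interpolate_area(S1, S2, 0.5)
print(halfway)
```

Because the interpolation is linear in $S^{1/3}$ rather than in $S$, the halfway area is not the arithmetic mean of the two targets: in the first sample the cube roots 1 and 2 average to 1.5, which gives an area of about 3.4 rather than 4.5.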

Moreover, by adding terms of complementary functions, the following equation is formed for the transition between two phonemes:

[Eq. (4): not legible in the source text] (4)

where the two complementary functions of $n$ and $x$ in Eq. 4 (the symbol of the first is illegible in the source; the second is $K(n, x)$) express the inherent features of the coarticulation between the phonemes. 3), 9)

Next, let us consider the case of uttering three phonemes successively. Let the target configurations of the phonemes be $S_1$, $S_2$ and $S_3$. Then the $F_2$ curve may be expressed as the product of the normalized time function $T_1(n)$, which is the normalized $F_2$ curve drawn on the way of transition from $S_1$ to $S_2$, and the function $T_2(n - n_\alpha)$, which is the normalized $F_2$ curve for the transition from $S_2$ to $S_3$; $n_\alpha$ is the instant, on the way of transition from $S_1$ to $S_2$, at which the command to move on to $S_3$ begins to be carried out. That is,

$$T'(n) = T_1(n)\cdot\left[1 - c\cdot T_2(n - n_\alpha)\right] \tag{5}$$

where $c$ is the coefficient of transformation between the normalized $F_2$ curve from $S_1$ to $S_2$ ($T_1(n)$) and the normalized $F_2$ curve from $S_2$ to $S_3$ ($T_2(n)$). This rule was derived from the $F_2$ curves drawn by successive utterance of three vowels $V_1 V_2 V_1$, but we assume that it is applicable to every context.
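The product rule for three phonemes can be sketched numerically as follows. This is only an illustration under stated assumptions: the time function is taken as $1 - \exp(-\alpha n)$, it is assumed to be zero before its command starts, and the values $\alpha = 5$ and $c = 1$ as well as the helper names are hypothetical.

```python
import math


def T(n, alpha=5.0):
    """Assumed time function: 1 - exp(-alpha*n) once the command has started, 0 before."""
    return 1.0 - math.exp(-alpha * n) if n > 0.0 else 0.0


def T_prime(n, n_alpha, c, alpha=5.0):
    """Product of the S1->S2 locus and the delayed S2->S3 locus."""
    return T(n, alpha) * (1.0 - c * T(n - n_alpha, alpha))


def peak(n_alpha, c=1.0, alpha=5.0, steps=2000, n_max=2.0):
    """Largest value of the combined locus on a uniform grid over [0, n_max]."""
    return max(T_prime(i * n_max / steps, n_alpha, c, alpha) for i in range(steps + 1))


early = peak(n_alpha=0.3)  # next command issued early (faster utterance)
late = peak(n_alpha=0.5)   # next command issued later (slower utterance)
print(early, late)
```

Under these assumptions the combined locus stays further from 1 when the next command is issued earlier, which is the undershoot of the middle target that the rule is meant to express.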

Rewriting this relation in terms of formant frequencies and designating the $F_2$'s of $S_1$, $S_2$, $S_3$ and $S$ as $F_1$, $F_2$, $F_3$ and $F$ respectively, we get the following equation:

$$\frac{F - F_1}{F_2 - F_1} = T_1(n)\cdot\left[1 - \frac{F_2 - F_3}{F_2 - F_1}\cdot T_2(n - n_\alpha)\right]$$

$$\therefore\; F = F_1 + \left[F_2 + (F_3 - F_2)\cdot T_2(n - n_\alpha) - F_1\right]\cdot T_1(n) \tag{6}$$

This equation is interpreted as follows: instead of the target frequency $F_2$ for the transition from $S_1$ to $S_2$, the $F_2$ on the way from $S_2$ to $S_3$, that is $F_2 + (F_3 - F_2)\,T_2(n - n_\alpha)$, is regarded as the target frequency for the transition from $S_1$ to $S_2$.

Meanwhile, if we use Eq. 3 for the transition of the area function, the relation between the $F_2$ curve calculated from the area function and the time function for the transition of the area function becomes almost linear. So the area function $S$ among $S_1$, $S_2$ and $S_3$ is given as follows:

$$S^{1/3} = S_1^{1/3} + \left[S_2^{1/3} + (S_3^{1/3} - S_2^{1/3})\cdot T_2(n - n_\alpha) - S_1^{1/3}\right]\cdot T_1(n) \tag{7}$$

In the case of more than three phonemes, the above relation can easily be applied in expanded form.

Moreover, in the case of a context of three vowels $V_1 V_2 V_1$, if the $F_2$ is traced, each symbol is defined as shown in Fig. 1, and $F_{2t}$ is the target frequency of the second formant of $V_2$, the following relation has been confirmed by tracing the $F_2$ curves for various $V_1 V_2 V_1$, for example as shown in Fig. 1:

$$F_{2o} = \kappa\,(F_{2i} - F_{2t})\cdot\exp(-\beta\tau) + F_{2t}, \quad \kappa, \beta = \text{constants} \tag{8}$$

This relation shows the effect of coarticulation through the variation of the maximum or minimum value of the $F_2$ curve, and it has to agree with the result obtained by Eq. 5, and consequently Eq. 7.

[Fig. 1: Relation between $(F_{2o} - F_{2t})/(F_{2i} - F_{2t})$ and transitional duration $\tau$ (ms) for /ioi/ and /oio/ uttered by male talkers A and B.]

Now, if we approximate both $T_1(n)$ and $T_2(n)$ by $[1 - \exp(-\alpha n)]$, the maximum or minimum point of the $F_2$ curve occurs at the crossing point of $T_1(n)$ and $T_2(n - n_\alpha)$, at which $n = \tau/2$, and the value at the extreme point, which is the normalized $F_{2o}$, is given as follows:

$$[1 - \exp(-\alpha\tau/2)]^2 = 1 - 2\exp(-\alpha\tau/2) + \exp(-\alpha\tau)$$

If $\alpha\tau$ is not so small, then

$$[1 - \exp(-\alpha\tau/2)]^2 \approx 1 - 2\exp(-\alpha\tau/2) \tag{9}$$

By Eq. 8, on the other hand, the normalized $F_{2o}$ is expressed as follows:

$$\frac{F_{2o} - F_{2i}}{F_{2t} - F_{2i}} = 1 - \kappa\cdot\exp(-\beta\tau) \tag{10}$$

Namely, the normalized $F_{2o}$'s in Eqs. 9 and 10, which are derived from Eqs. 5 and 8 respectively, take approximately the same form. Therefore, by moving the target configuration as shown in Eq. 7, the relation of Eq. 8 is almost satisfied. The above equations have been verified for vowels, but by taking the inherent features of each phoneme into consideration, Eq. 7 may be applicable to the transition among any phonemes. After all, the problem of coarticulation is expressed quantitatively by Eq. 4 for the transition between two phonemes and by Eqs. 5, 6 or 7 for the transition among three phonemes, and for the transition among more phonemes the rule may be expanded.
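The statement that Eqs. 9 and 10 take approximately the same form can be checked numerically. In the sketch below the squared form of Eq. 9 is compared with the form of Eq. 10 under the identification $\kappa = 2$, $\beta = \alpha/2$; that identification is an inference from matching the two expressions, not a value given in the paper, and the rate constant `alpha` and the durations are hypothetical.

```python
import math


def normalized_f2o_model(alpha, tau):
    """Squared form of Eq. 9: [1 - exp(-alpha*tau/2)]**2."""
    return (1.0 - math.exp(-alpha * tau / 2.0)) ** 2


def normalized_f2o_empirical(kappa, beta, tau):
    """Right-hand side of Eq. 10: 1 - kappa*exp(-beta*tau)."""
    return 1.0 - kappa * math.exp(-beta * tau)


alpha = 0.04  # hypothetical rate constant in 1/ms
rows = []
for tau in (50.0, 100.0, 200.0, 400.0):
    exact = normalized_f2o_model(alpha, tau)
    approx = normalized_f2o_empirical(2.0, alpha / 2.0, tau)
    rows.append((tau, exact, approx))
    print(tau, round(exact, 4), round(approx, 4), round(exact - approx, 4))
```

The difference between the two columns is exactly the dropped term $\exp(-\alpha\tau)$, so the agreement improves as $\alpha\tau$ grows, in line with the condition that $\alpha\tau$ is not so small.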

In the above relations there are some points which are not proved strictly, but they are simple expressions and are consistent enough as a first approximation. We will show by a few examples that these relations are consistent and also useful in speech recognition.

3. Examples

3.1 Vowels

When three vowels $V_1 V_2 V_1$ are uttered successively at different speeds, Eq. 8 holds fairly well if $F_{2i}$, $F_{2o}$ and $\tau$ are defined as in Fig. 1 and $F_{2t}$ is the target frequency of $F_{2o}$. Inversely, the vowel $V_2$ in $V_1 V_2 V_1$ may be identified by calculating $F_{2t}$, substituting the measured $F_{2i}$, $F_{2o}$ and $\tau$ into Eq. 8. 4) In this case, instead of expressing $\kappa$ and $\beta$ as functions of $(F_{2i} - F_{2o})$, one may give the time function $T(n)$; then $F_{2t}$ can also be calculated by Eq. 11:

$$(F_{2o} - F_{2i})/(F_{2t} - F_{2i}) = T^{2}(n) \tag{11}$$

The same relation can be discussed for the first formant, and with the same calculation of $F_{1t}$ from $F_{1o}$, $V_2$ can be distinguished even when $V_1 V_2 V_1$ is uttered rapidly. There can, however, be some cases in which inaccuracy remains in measuring $\tau$.
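The inverse use of Eq. 8 described above amounts to solving it for $F_{2t}$, which gives $F_{2t} = (F_{2o} - \kappa e^{-\beta\tau} F_{2i}) / (1 - \kappa e^{-\beta\tau})$. The following sketch shows this round trip; the function names, the constants $\kappa$ and $\beta$ and the frequencies are hypothetical values chosen for illustration, not measurements from the paper.

```python
import math


def f2_observed(f2i, f2t, tau, kappa, beta):
    """Eq. 8: extreme F2 value reached in V2 for a transitional duration tau."""
    return kappa * (f2i - f2t) * math.exp(-beta * tau) + f2t


def f2_target(f2i, f2o, tau, kappa, beta):
    """Eq. 8 solved for the target F2t, given measured F2i, F2o and tau."""
    decay = kappa * math.exp(-beta * tau)
    if abs(1.0 - decay) < 1e-9:
        raise ValueError("target is not identifiable: kappa*exp(-beta*tau) equals 1")
    return (f2o - decay * f2i) / (1.0 - decay)


# Hypothetical numbers for an /ioi/-like utterance (Hz, ms); not measured data.
kappa, beta = 1.0, 0.02
f2o = f2_observed(f2i=2200.0, f2t=800.0, tau=100.0, kappa=kappa, beta=beta)
recovered = f2_target(2200.0, f2o, 100.0, kappa, beta)
print(round(f2o, 1), round(recovered, 1))
```

The denominator shrinks as $\tau$ becomes short, so errors in measuring $\tau$ or $F_{2o}$ are amplified for rapid utterances, which is consistent with the caution about inaccuracy in $\tau$.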

Fig. 2 shows the calculated vocal tract area functions and the formant frequencies in the cases where $n_\alpha$ in Eq. 7 is set to 0.3 and 0.5 for the context /i u i/. It shows that the vocal tract returns to /i/ without sufficiently reaching /u/. We can also see that the faster the utterance (the smaller $n_\alpha$), the clearer this phenomenon becomes.
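The dependence on $n_\alpha$ can be reproduced qualitatively with a small sketch of Eq. 7. Everything numerical here is an assumption for illustration: the three-sample area functions stand in for /i/ and /u/, the time function is $1 - \exp(-\alpha n)$ with $\alpha = 5$, and it is taken to be zero before its command starts.

```python
import math


def T(n, alpha=5.0):
    """Assumed time function: 1 - exp(-alpha*n) after the command starts, 0 before."""
    return 1.0 - math.exp(-alpha * n) if n > 0.0 else 0.0


def area_three_targets(s1, s2, s3, n, n_alpha, alpha=5.0):
    """Eq. 7, applied sample by sample along the tract in the cube-root domain."""
    t1, t2 = T(n, alpha), T(n - n_alpha, alpha)
    out = []
    for a1, a2, a3 in zip(s1, s2, s3):
        r1, r2, r3 = (a ** (1.0 / 3.0) for a in (a1, a2, a3))
        moving_target = r2 + (r3 - r2) * t2
        out.append((r1 + (moving_target - r1) * t1) ** 3)
    return out


def closest_approach(s1, s2, n_alpha, steps=400, n_max=2.0):
    """Smallest summed |S - S2| reached along an S1 -> S2 -> S1 trajectory."""
    best = float("inf")
    for i in range(steps + 1):
        s = area_three_targets(s1, s2, s1, i * n_max / steps, n_alpha)
        best = min(best, sum(abs(a - b) for a, b in zip(s, s2)))
    return best


# Toy three-sample area functions in cm^2 (illustrative, not the paper's data).
S_I = [4.0, 0.5, 2.0]
S_U = [1.0, 6.0, 0.3]
gap_fast = closest_approach(S_I, S_U, n_alpha=0.3)
gap_slow = closest_approach(S_I, S_U, n_alpha=0.5)
print(gap_fast, gap_slow)
```

With these toy values the remaining distance to the middle target is larger for $n_\alpha = 0.3$ than for $n_\alpha = 0.5$, i.e. the faster utterance undershoots more.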

3.2 Nasal consonants

Let us consider the case in which $V_1 N V_2$ ($V_1$ = /i, o/, $N$ = /m, n, ŋ/ and $V_2$ = /i, e, a, o, u/) are uttered at different speeds. If we define $F_{2l}$ and $\tau$ as shown in Fig. 3 and $F_{2t}$ designates the target frequency of $F_{2l}$, we find that the following relation holds almost sufficiently, as shown in Fig. 4. 6)

$$F_{2l} = \kappa\,(F_{2i} - F_{2t})\cdot\exp(-\beta\tau) + F_{2t}, \quad \kappa, \beta = \text{constants} \tag{12}$$

[Fig. 2: Transitions of area functions and formant frequencies of /iui/ for $n_\alpha$ = 0.3 and 0.5 obtained by the model; axes: normalized time $n$ and distance from glottis $x$ (cm). The smaller $n_\alpha$ corresponds to the faster utterance.]

[Fig. 3: Illustration of symbols ($F_{2i}$, $\tau$, $F_{2l}$).]

Moving the vocal tract area function by Eq. 7 and calculating $F_{2i}$, $F_{2l}$ and $\tau$ from the area function, we have plotted $F_{2l}$ according to Eq. 12, taking $F_{2t}$ to be identical with the $F_{2l}$ of the monosyllable ($N V_2$); the result is shown in Fig. 5. It reproduces the behavior of Fig. 4 well enough qualitatively. 7)

In the case of monosyllables, discrimination among the nasal consonants /m/, /n/ and /ŋ/ before any given following vowel may be possible by accurate extraction of $F_{2l}$. And if only the discrimination between /m/ and /n/ is required, it is possible in real time even in words. 5) But it is impossible to discriminate /m/, /n/ and /ŋ/ in words by the extraction of $F_{2l}$ only. Therefore, we tried transforming the $F_{2l}$'s into the domain of $F_{2t}$ by Eq. 12. By that, the $F_{2t}$'s of /m/, /n/ and /ŋ/ related to the same following vowel were separated into their respective nasal domains, and discrimination among them became possible, as shown in Fig. 6. 6)

[Fig. 4: Normalized $F_{2l}$ plotted against nasal segment duration $\tau$ (ms) in the contexts of /ino/, /iŋo/ and /imo/ (male voice).]

[Fig. 5: Normalized $F_{2l}$ plotted against nasal segment duration $\tau$ (in normalized time) calculated for /ino/ by the model.]

[Fig. 6: $F_{2t}$'s calculated by Eq. 12, assigning 1.0 and 0.017 to $\kappa$ and $\beta$ respectively; horizontal axis $\tau$ (ms).]

[Fig. 7: Area functions at the instant of /k/-explosion [(1) and (3)] and at the onset of the following vowel /a/ [(2) and (4)] for /ka/ and /ika/ ($n_\alpha$ = 0.225) respectively, calculated by the model; horizontal axis: distance from glottis $x$ (cm). Curve (5) is the target configuration of /k/. Curves (3) and (4) show the influence of the preceding vowel /i/ upon /ka/.]

3.3 Stop Consonants

In the case of stop consonants, we also move the vocal tract area function according to Eq. 7, with particular consideration given to the features of stops, i.e. the existence of the period of vocal tract closure and the rapid movement of the place of constriction just after the instant of explosion. Fig. 7 shows the area functions at the instant of explosion and at the onset of the following vowel /a/ for /ka/ and /ika/ calculated by the model. They show the effect of coarticulation by /i/ upon /ka/.

Now we compare the values actually measured for the influences of coarticulation on $V_1 C V_2$ and $C V C$ ($C$ = stop) with the results of calculation by our model.

(i) In the case of $V_1 C V_2$

For $V_1 C V_2$ ($V_1$ = /i, o/, $C$ = /k, t, p/, $V_2$ = /a/) uttered at different speeds, we have measured the second formant frequencies at the onset of $V_2$ ($F_{2l}$). The $F_{2l}$'s of /ka/, /ta/ and /pa/, which are separated in the case of monosyllables, overlap each other because of the coarticulation. But when we investigate the relation between $\log[(F_{2l} - F_{2t})/(F_{2i} - F_{2t})]$ and $\tau$, defining $F_{2i}$, $F_{2l}$ and $\tau$ in the same way as in Fig. 3 (putting $C$ in place of $N$), we find an almost linear relation, as shown in Fig. 8. So the following relation can be concluded, just as in the case of nasals:

$$F_{2l} = \kappa\,(F_{2i} - F_{2t})\cdot\exp(-\beta\tau) + F_{2t}, \quad \kappa, \beta = \text{constants}, \quad F_{2t} = \text{target of } F_{2l} \tag{13}$$

We have calculated the $F_{2t}$'s inversely from the measured $F_{2i}$, $F_{2l}$ and $\tau$, fixing $\kappa$ and $\beta$ to the mean values for each context as shown in Fig. 8. The $F_{2t}$'s are then divided into separate domains for /ka/, /ta/ and /pa/, which makes it possible to distinguish them from each other. 8) Our model also shows results similar to the above observation, as shown in Fig. 9. 9) So we may conclude that our model is able to represent the influence of the coarticulation well enough.

[Fig. 8: Normalized $F_{2l}$ plotted against consonant segment duration $\tau$ (ms) for various contexts of $V_1 C V_2$ ($V_1$ = /i, o/, $V_2$ = /a/, $C$ = /k, t, p/) (male voice). In Figs. 8 and 9, ○: /ika/, ●: /oka/, △: /ita/, ▲: /ota/, +: /ipa/ and ×: /opa/.]

[Fig. 9: Normalized $F_{2l}$ plotted against consonant segment duration $\tau$ (in normalized time) calculated by the model. Notations are the same as in Fig. 8.]

(ii) In the case of $C V C$

Dr. B. Lindblom 1), in his detailed experiments, confirmed that the following relation holds almost consistently, with the symbols defined as shown in Fig. 10:

$$F_{jo} = \kappa\,(F_{ji} - F_{jt})\cdot\exp(-\beta\tau) + F_{jt}, \quad j = 1, 2, \quad \kappa, \beta = \text{constants}, \quad F_{jt} = \text{target of } F_{jo} \tag{14}$$

Calculating $F_{ji}$, $F_{jo}$ and $\tau$ by our model for the various contexts of $C$ = /g, d, b/ and $V$ = /a/, the relations between $\log[(F_{jo} - F_{jt})/(F_{ji} - F_{jt})]$ and $\tau$ at various speeds of utterance become as shown in Fig. 10. 9) These results agree well with those of the experiments shown in reference 1) (ii), except that the correction of $T'(n)$ in /bab/ is a little small and that we have dealt with $F_{1o}$ in the same way as $F_{2o}$.

[Fig. 10: Normalized $F_{jo}$ ($j$ = 1, 2) plotted against vowel segment duration $\tau$ (in normalized time) calculated for $CVC$ contexts (/dad/, /bab/, /gag/) by the model.]
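Eqs. 12 to 14 share the form $(\text{observed} - \text{target})/(\text{initial} - \text{target}) = \kappa\exp(-\beta\tau)$, so a plot of the logarithm of this ratio against $\tau$ is a straight line with slope $-\beta$ and intercept $\ln\kappa$. A minimal sketch of how $\kappa$ and $\beta$ might be estimated from such a plot is given below; the least-squares fit, the function name and the synthetic data points are assumptions for illustration and are not the procedure or the data of the paper.

```python
import math


def fit_kappa_beta(taus, ratios):
    """Least-squares line through (tau, ln ratio).

    Returns (kappa, beta) for the model ratio = kappa * exp(-beta * tau).
    """
    ys = [math.log(r) for r in ratios]
    n = len(taus)
    mean_t = sum(taus) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in taus)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(taus, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -slope


# Synthetic points generated from hypothetical constants (not the paper's data).
true_kappa, true_beta = 0.6, 0.02
taus = [40.0, 60.0, 80.0, 100.0, 120.0]
ratios = [true_kappa * math.exp(-true_beta * t) for t in taus]

kappa, beta = fit_kappa_beta(taus, ratios)
print(kappa, beta)
```

With noise-free synthetic points the fit returns the generating constants; with measured data the scatter around the line indicates how well the single-exponential form holds for a given context.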

4. Discussion and Conclusion

We have shown by some examples that the effects of coarticulation upon the second formant frequencies of $V_1 V_2 V_1$, $V_1 C V_2$ ($C$ = stops or nasals) and $C V C$ ($C$ = stops) are represented by Eqs. 8, 12, 13 and 14 and that, inversely, if the target frequencies are calculated by these relations, the phonemes can be distinguished even in the case that the frequency spectrum is considerably deformed by the influence of coarticulation. And we have shown that these relations may be realized almost sufficiently if the transition between two phonemes is represented by Eq. 4 and that among three phonemes simply by Eqs. 5 or 6 in the frequency domain and by Eq. 7 at the articulatory level.

In the articulation discussed here, it is assumed that the command from the brain always works so as to lead the area function to the new target and that the command aiming at the new target is carried out with invariant effort. But it may be hard to assume that the same effort is made even in short sentences, because of stress or intonation activity. A phenomenon like 'laziness' may also take place. So we should take such influences into consideration in our model.

Although the concept of the $F_2$ locus is advocated for the stop consonant, and our model can give a frequency which is similar to the $F_2$ locus given up to now, it is difficult to consider that the way of obtaining the $F_2$ locus discussed up to the present gives the frequency corresponding to the real articulation of the stop consonant as initially intended. According to our model, however, the formant frequency corresponding to the target configuration of the stop consonant (we call it the target frequency) can be decided as the $F_1$ calculated by substituting $F$ and $F_2$ into Eq. 2. 10) (There is a premise that Eqs. 1 and 2 still hold formally during the closure of the vocal tract.) With the use of this target frequency, we can represent the effect of coarticulation, for cases containing not only vowels but also both vowels and consonants, simply by forms like Eqs. 5 or 6.

But the study of coarticulation should be pursued at the articulatory level, and by doing so we can solve simultaneously the effect on each formant and antiformant frequency of the vocal tract. In order to proceed with the study, it is necessary to decide the accurate target configuration of each phoneme and to get accurate data on the vocal tract area function during the utterance, which may allow us to improve our model further.
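The target frequency of a stop consonant mentioned above is obtained by solving Eq. 2 for its starting frequency: $F_1 = (F - F_2\,n)/(1 - n)$. The sketch below illustrates this; it assumes that the normalized time $n$ of the observation is known, which the text does not spell out, and the function name and the frequency values are hypothetical.

```python
def target_frequency(f_obs, f_next, n):
    """Eq. 2 solved for F1: the starting (consonant) target frequency.

    f_obs  : formant frequency F observed at normalized time n
    f_next : target frequency F2 of the following phoneme
    n      : normalized time of the observation, 0 <= n < 1
    """
    if not 0.0 <= n < 1.0:
        raise ValueError("n must satisfy 0 <= n < 1")
    return (f_obs - f_next * n) / (1.0 - n)


# Hypothetical values in Hz: formant seen at n = 0.4 on the way to a vowel target.
f_consonant = target_frequency(f_obs=1500.0, f_next=1200.0, n=0.4)
print(f_consonant)
```

Substituting the result back into Eq. 2 returns the observed value: 1700 + (1200 - 1700) x 0.4 = 1500 Hz.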

Since $n_\alpha$ smaller than 0.3 corresponds to a very rapid utterance, in utterance at natural speed it is enough to consider that the two preceding and the two following phonemes exert the influence of coarticulation on a phoneme. And in the case of rapid utterance, it will be enough to take into consideration the three phonemes which precede or follow.

From the physiological viewpoint, our model does not sufficiently explain the choice of the complementary terms used in it or of some parameters proper to the context, but they represent well the physiological constraints of articulatory manner, and the formant pattern calculated by the model coincides sufficiently well with the observed one.

Furthermore, the transition of the area function may be decomposed into the movements of each articulatory organ, such as the tongue, lips and mandible. We are now constructing a new model 11) in which these organs move by some command, such as time-optimal control, satisfying the acoustical demand according to a sequence of discrete phonemic instructions.

Although many approaches are discussed in the study of speech recognition, we consider it reasonable to transform the observed patterns into the target frequency domain by a method such as that described in this article and, if possible, into the recognition space of target configurations.

References

1) (i) K. N. Stevens, A. S. House and A. P. Paul: Acoustical Description of Syllabic Nuclei: An Interpretation in Terms of a Dynamic Model of Articulation, J. Acoust. Soc. America, 40, 1, p. 123-132 (1966).
(ii) B. Lindblom: Spectrographic Study of Vowel Reduction, J. Acoust. Soc. America, 35, 11, p. 1773-1781 (1963).
(iii) S. E. G. Ohman: Coarticulation in VCV Utterances: Spectrographic Measurements, J. Acoust. Soc. America, 39, 1, p. 151-168 (1966) etc.

2) (i) S. E. G. Ohman: Numerical Model of Coarticulation, J. Acoust. Soc. America, 41, 2, p. 310-332 (1967).
(ii) R. A. Houde: A Study of Tongue Body Motion during Selected Speech Sounds, SCRL Monograph No. 2, Speech Communications Research Laboratory, Inc. (1968-8).
(iii) J. S. Perkell: Physiology of Speech Production: Results and Implications of a Quantitative Cineradiographic Study, The MIT Press, Cambridge (1969) etc.

3) M. Shigenaga, H. Ariizumi and T. Tanaka: (i) A Model for Transitions of Vocal Tract Area Functions of Vowels, Report of Committee on Automaton, Inst. Electronics and Communication Eng. Japan (1968-2).
(ii) 6th International Congress on Acoust.: A Model for Transitions of Vocal Tract Area Functions, B-5-9 (1968-8).

4) H. Ariizumi and M. Shigenaga: On the Transition of the Second Formant Frequency in Three Vowels V1 V2 V1, Report of Joint Meeting of Elect. Eng. No. 2951 (1967-4).

5) M. Shigenaga and H. Ariizumi: Automatic Recognition of Nasal Consonants, J. Acoust. Soc. Japan, 21, 5, p. 263-271 (1965).

6) H. Ariizumi and M. Shigenaga: On the Discrimination of Nasal Consonants, Report of Faculty of Eng., Yamanashi Univ. No. 17, p. 126-138 (1966-12).

7) H. Ariizumi, T. Tanaka and M. Shigenaga: Transitions of Vocal Tract Area Functions for Nasals, Report of Meeting of Acoust. Soc. Japan, p. 47 (1968-4).

8) F. Ogawa and T. Kobayashi: Effects of the Preceding Vowels upon Voiceless Stops, Graduate Thesis of Dept. of Elect. Eng., Yamanashi Univ. (1967-3).

9) T. Tanaka and M. Shigenaga: Transition of Vocal Tract Area Functions for Stop Consonants, Report of Committee on Speech, Acoust. Soc. Japan (1968-12) and J. Acoust. Soc. Japan, 25, 3, p. 144 (1969-5).
