
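An examination of the factors that determine the limits of sound source segregation (23rd Annual Meeting, Summary of Awarded Presentation)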


Academic year: 2021


The Japanese Psychonomic Society

NII-Electronic Library Service


The Japanese Journal of Psychonomic Science
2005, Vol. 24, No. 1, 127-128

Summary of Awarded Presentation 2P15

Factors affecting the perception of multiple simultaneous voices

Takayuki KAWASHIMA*,1) and Takao SATO**

The University of Tokyo

The maximum number of streams that could be heard from a mixed sound was measured. In each trial a mixed voice and a single voice (probe) were successively presented. The 7 adult subjects were required to judge whether or not the probe was present in the multiple voices. The perceptual limit (the maximum number of streams) was calculated by multiplying the true hit ratio by the number of speakers. The estimated perceptual limit was approximately 3 when more than 4 speakers were presented. The results indicate that auditory processing is more effective than previously believed (cf. Kashino & Hirahara, 1996). The data indicated that the probe could be detected more easily when it was presented before the mixed voices rather than after the mixed voices. This may indicate that the perceptual limit results from a limited attention capacity.

Key words: multiple voices, attention, sound source segregation, masking

We have measured the maximum number of perceptual streams that can be heard from a sound which arises from multiple simultaneous sources. The limit of perception was then considered to measure the efficiency of auditory processing.

Other than Kashino and Hirahara (1996) there are few studies that have attempted to measure the limit of perception of multiple simultaneous sounds. In their report multiple voices were presented to the subjects. The number of speakers (voices) was manipulated as an independent variable, and the subjects were required to give their number. The proportion of the correct responses was very high (near to unity) when 1 or 2 speakers were presented but decreased rapidly when more than 3 speakers were presented. The authors therefore concluded that the maximum number of streams that can be perceived is approximately two.

One drawback of estimating the perceptual limit by asking the subjects to count the number of voices is that they can answer (correctly) without perceiving the individual voices separately. For example, they might use the timbre of the mixed sound as a cue for estimating the number. It is possible, therefore, that Kashino and Hirahara (1996) did not measure the perceptual limit correctly.

* 21st century COE program "Center for [...]ary cognitive sciences," Graduate School of Arts and Sciences, The University of Tokyo, 3-8-1 Komaba, Meguro-ku, Tokyo 153-8902

1) Takayuki Kawashima is now at the Department of Cognitive and Behavioral Science, Graduate School of Arts and Sciences, The University of Tokyo

** Department of Psychology, Graduate School of Humanities and Sociology, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033

We estimated the limit of perception with a new method. In the present experiment, in every trial we presented mixed voices and a single voice (probe) successively. The subjects were required to judge whether or not the probe was present in the multiple voices. At the conclusion, we calculated the true hit ratio ($H_t$) according to Equation 1 (Macmillan & Creelman, 1991, p. 89).

$$H_t = \frac{H - F}{1 - F} \qquad (1)$$

The symbols $H$ and $F$ in Equation 1 represent the hit ratio and the false alarm ratio, respectively. For each condition of the number of speakers, $H_t$ was calculated and the number of streams was estimated by multiplying $H_t$ by the number of concurrent speakers. This procedure ensured that the estimation of the perceptual limit was based on a perceptual separation of a mixed sound.
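The estimation described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the hit and false alarm ratios have already been tallied for one condition. The function names and the example numbers are hypothetical and are not taken from the study.

```python
def true_hit_ratio(hit_ratio, false_alarm_ratio):
    """Correct the hit ratio for guessing: (H - F) / (1 - F), as in Equation 1."""
    if not 0.0 <= false_alarm_ratio < 1.0:
        raise ValueError("false_alarm_ratio must be in [0, 1)")
    return (hit_ratio - false_alarm_ratio) / (1.0 - false_alarm_ratio)


def estimated_streams(hit_ratio, false_alarm_ratio, n_speakers):
    """Estimate the number of perceived streams for one speaker-number condition."""
    return true_hit_ratio(hit_ratio, false_alarm_ratio) * n_speakers


# Hypothetical example values (not data from the experiment):
# H = 0.75, F = 0.5, 6 concurrent speakers.
print(true_hit_ratio(0.75, 0.5))        # 0.5
print(estimated_streams(0.75, 0.5, 6))  # 3.0
```

With these illustrative values, half of the hits are attributed to guessing, so the corrected ratio of 0.5 times 6 speakers gives an estimate of 3 streams.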

Methods

Participants The participants were 7 Japanese adults with normal hearing.

Apparatus and stimuli A set of 30 Japanese words was digitally recorded on audio tape (44100 Hz, 16 bit) in a soundproof room by 7 female speakers who did not participate as subjects. All of the words consisted of 4 moras and their average duration was 0.87 seconds. All of the stimuli were presented diotically through headphones (Sennheiser HDA200) and a single word was played at a sound level of 63 dB SPL. The stimulus presentation and data acquisition were controlled by a personal computer (Apple Power Mac G4).

Procedure As mentioned above, we presented a mixed sound and a single voice successively in every trial. The probability that the mixed sound contained the probe was 0.5. In one condition, the probe was presented before the multiple voices (preprobe condition), and in the other condition, it was presented after the multiple voices (postprobe condition). A silent interval (0.3 seconds) was inserted between the two sounds in both conditions. There were 6 conditions in which 1, 2, 3, 4, 5, and 6 voices produced the mixed sound. These 6 conditions were presented in a random order in a block. The mixed voices were composed of different speakers, each of whom said a different word. The speaker of the probe voice was randomly selected from the 7 speakers in every trial. The participants repeated their judgments 33 times in each condition. Three of the 7 subjects were tested first in the preprobe condition and the remainder were tested first in the postprobe condition.
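The trial structure described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' stimulus program: the text does not say how probe-absent trials were built, so the sketch assumes the absent probe uses a word that does not occur in the mixture. The names `make_trial` and `make_block` and the dictionary layout are hypothetical.

```python
import random

N_SPEAKERS = 7                    # recorded speakers (from the text)
N_WORDS = 30                      # recorded words (from the text)
VOICE_COUNTS = (1, 2, 3, 4, 5, 6)  # number of voices in the mixed sound


def make_trial(n_voices, rng):
    """Build one trial: a mixture of (speaker, word) pairs plus a probe."""
    # Different speakers, each saying a different word.
    speakers = rng.sample(range(N_SPEAKERS), n_voices)
    words = rng.sample(range(N_WORDS), n_voices)
    mixture = list(zip(speakers, words))
    # The mixture contains the probe with probability 0.5.
    probe_present = rng.random() < 0.5
    if probe_present:
        probe = rng.choice(mixture)
    else:
        # Assumption: an absent probe uses a word not in the mixture.
        unused_words = [w for w in range(N_WORDS) if w not in words]
        probe = (rng.randrange(N_SPEAKERS), rng.choice(unused_words))
    return {"n_voices": n_voices, "mixture": mixture,
            "probe": probe, "probe_present": probe_present}


def make_block(rng):
    """One block: the 6 voice-count conditions in random order."""
    order = list(VOICE_COUNTS)
    rng.shuffle(order)
    return [make_trial(n, rng) for n in order]


block = make_block(random.Random(0))  # fixed seed for repeatability
```

Repeating `make_block` 33 times per probe position would give the 33 judgments per condition mentioned in the text; whether the probe is played before or after the mixture is a property of the session rather than of a single trial.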

Results and Discussion

The mean values of the estimated perceptual limit across the subjects are shown in Figure 1. In the postprobe condition the number of streams was approximately 3 when more than 4 speakers were presented (square symbols). This suggests that at least 3 streams can be separated from multiple simultaneous voices and indicates that auditory processing is more efficient than previously reported.

There was a considerable difference in the estimated number of streams between the 2 probe conditions when more than 3 speakers were presented. A two-way ANOVA revealed that the effect of the interaction was significant (F(5, 30) = 14.12, p < 0.01), and the simple main effect of the probe position was significant when more than 3 speakers were presented (p < 0.05).

[Figure 1 image not reproduced; x-axis: Number of Talkers, 1 to 6.]

Figure 1. The estimated number of streams as a function of the number of concurrent speakers. The symbols (□ and small ○) indicate the average values across the subjects. Error bars are the SEMs. The thin line represents the theoretical maximum number.
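As a hedged consistency check, the degrees of freedom reported for the interaction, F(5, 30), match what a fully within-subject (repeated-measures) two-way design would give, assuming that all 7 subjects completed every combination of the 6 speaker-number conditions and the 2 probe positions. The symbols $a$, $b$, and $n$ are introduced here for illustration only and do not appear in the original.

$$df_{\text{interaction}} = (a - 1)(b - 1) = (6 - 1)(2 - 1) = 5$$

$$df_{\text{error}} = (n - 1)(a - 1)(b - 1) = (7 - 1) \times 5 = 30$$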

If information about the individual sounds was lost by mutual interference in the peripheral auditory system (as in energetic masking), then the changed probe position would not affect the estimation of the number of streams. The difference between the probe conditions therefore may indicate that the perceptual limit is determined by cognitive (central) factors, such as attention or memory.

It is not clear how the limit of perception is changed when sounds other than human voices create a mixed sound, or when objects are presented across different sense modalities. Measurement of the perceptual limit under these conditions will be helpful for understanding how our perceptual world arises.

References

Kashino, M. & Hirahara, T. 1996 One, two, many: Judging the number of concurrent talkers. Journal of the Acoustical Society of America, 99, 2597.

Macmillan, N. A. & Creelman, C. D. 1991 Detection theory: A user's guide. Cambridge: Cambridge University Press.
