The Japanese Psychonomic Society
NII-Electronic Library Service
The Japanese Journal of Psychonomic Science, 2005, Vol. 24, No. 1, 127-128
Summary of Awarded Presentation 2P15
Factors affecting the perception of multiple simultaneous voices
Takayuki KAWASHIMA*, 1) and Takao SATO**
The University of Tokyo
The maximum number of streams that could be heard from a mixed sound was measured. In each trial a mixed voice and a single voice (probe) were successively presented. The 7 adult subjects were required to judge whether or not the probe was present in the multiple voices. The perceptual limit (the maximum number of streams) was calculated by multiplying the true hit ratio by the number of speakers. The estimated perceptual limit was approximately 3 when more than 4 speakers were presented. The results indicate that auditory processing is more effective than previously believed (cf. Kashino & Hirahara, 1996). The data indicated that the probe could be detected more easily when it was presented before the mixed voices rather than after the mixed voices.
This may indicate that the perceptual limit results from a limited attention capacity.

Key words: multiple voices, attention, sound source segregation, masking

We have measured the maximum number of perceptual streams that can be heard from a sound which arises from multiple simultaneous sources. The limit of perception was then considered to measure the efficiency of auditory processing.
Other than Kashino and Hirahara (1996) there are few studies that have attempted to measure the limit of perception of multiple simultaneous sounds. In their report multiple voices were presented to the subjects. The number of speakers (voices) was manipulated as an independent variable, and the subjects were required to give their number. The proportion of the correct responses was very high (near to unity) when 1 or 2 speakers were presented but decreased rapidly when more than 3 speakers were presented. The authors therefore concluded that the maximum number of streams that can be perceived is approximately two.
One drawback of estimating the perceptual limit by asking the subjects to count the number of voices is that they can answer correctly without perceiving the individual voices separately. For example, they might use the timbre of the mixed sound as a cue for estimating the number. It is possible, therefore, that Kashino and Hirahara (1996) did not measure the perceptual limit correctly.

* 21st Century COE program "Center for Evolutionary Cognitive Sciences," Graduate School of Arts and Sciences, The University of Tokyo, 3-8-1 Komaba, Meguro-ku, Tokyo 153-8902
1) Takayuki Kawashima is now at the Department of Cognitive and Behavioral Science, Graduate School of Arts and Sciences, The University of Tokyo
** Department of Psychology, Graduate School of Humanities and Sociology, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033
We estimated the limit of perception with a new method. In every trial of the present experiment we presented mixed voices and a single voice (probe) successively. The subjects were required to judge whether or not the probe was present in the multiple voices.
At the conclusion we calculated the true hit ratio ($H_t$) according to Equation 1 (Macmillan & Creelman, 1991, p. 89).

$$H_t = \frac{H - F}{1 - F} \qquad (1)$$

The symbols $H$ and $F$ in Equation 1 represent the hit ratio and the false alarm ratio respectively. For each condition of the number of speakers, $H_t$ was calculated and the number of streams was estimated by multiplying $H_t$ by the number of concurrent speakers. This procedure ensured that the estimation of the perceptual limit was based on a perceptual separation of a mixed sound.
Methods
Participants. The participants were 7 Japanese adults with normal hearing.
Apparatus and stimuli. A set of 30 Japanese words was digitally recorded on audio tape (44,100 Hz, 16 bit) in a soundproof room by 7 female speakers who did not participate as subjects. All of the words consisted of 4 moras and their average duration was 0.87 seconds. All of the stimuli were presented diotically through headphones (Sennheiser HDA200) and a single word was played at a sound level of 63 dB SPL. The stimulus presentation and data acquisition were controlled by a personal computer (Apple PowerMac G4).
Procedure. As mentioned above, we presented a mixed sound and a single voice successively in every trial. The probability that the mixed sound contained the probe was 0.5. In one condition, the probe was presented before the multiple voices (preprobe condition), and in the other condition, it was presented after the multiple voices (postprobe condition). A silent interval (0.3 seconds) was inserted between the two sounds in both conditions. There were 6 conditions in which 1, 2, 3, 4, 5, and 6 voices produced the mixed sound. These 6 conditions were presented in a random order in a block. The mixed voices were composed of different speakers, each of whom said a different word. The speaker of the probe voice was randomly selected from the 7 speakers in every trial. The participants repeated their judgments 33 times in each condition. Three of the 7 subjects were tested first in the preprobe condition and the remainder were tested first in the postprobe condition.
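The trial structure described above might be sketched as follows. This is only an illustration under stated assumptions, not the software used in the experiment: the names `make_trial` and `make_session` are hypothetical, each block is assumed to contain the 6 voice-count conditions once in random order (giving 33 blocks), probe presence is assumed to be drawn independently on each trial, and a probe that is absent is assumed to be a word not contained in the mixture.

```python
import random

N_SPEAKERS = 7                    # female speakers who recorded the words
N_WORDS = 30                      # size of the word set
VOICE_COUNTS = (1, 2, 3, 4, 5, 6)
REPEATS = 33                      # judgments per voice-count condition


def make_trial(n_voices: int, rng: random.Random) -> dict:
    """Build one trial: a mixture of distinct speakers and words, plus a probe."""
    speakers = rng.sample(range(N_SPEAKERS), n_voices)  # different speakers
    words = rng.sample(range(N_WORDS), n_voices)        # different words
    mixture = list(zip(speakers, words))
    probe_present = rng.random() < 0.5                  # probability 0.5
    if probe_present:
        probe = rng.choice(mixture)
    else:
        # ASSUMPTION: an absent probe is a word that is not in the mixture,
        # spoken by any of the 7 speakers. The source does not specify this.
        unused = [w for w in range(N_WORDS) if w not in words]
        probe = (rng.randrange(N_SPEAKERS), rng.choice(unused))
    return {"n_voices": n_voices, "mixture": mixture,
            "probe": probe, "probe_present": probe_present}


def make_session(rng: random.Random) -> list:
    """ASSUMPTION: 33 blocks, each holding the 6 conditions in random order."""
    trials = []
    for _ in range(REPEATS):
        block = list(VOICE_COUNTS)
        rng.shuffle(block)
        trials.extend(make_trial(n, rng) for n in block)
    return trials


if __name__ == "__main__":
    session = make_session(random.Random(0))
    print(len(session))  # 198 trials: 6 conditions x 33 repetitions
```

A session of this kind would be generated once for the preprobe condition and once for the postprobe condition, with only the order of probe and mixture differing between the two.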
Results and Discussion
The mean values of the estimated perceptual limit across the subjects are shown in Figure 1. In the postprobe condition the number of streams was approximately 3 when more than 4 speakers were presented (square symbols). This suggests that at least 3 streams can be separated from multiple simultaneous voices and indicates that auditory processing is more efficient than previously reported.

There was a considerable difference in the estimated number of streams between the 2 probe conditions when more than 3 speakers were presented. A two-way ANOVA revealed that the effect of the interaction was significant (F(5, 30) = 14.12, p < 0.01), and the simple main effect of the probe position was significant when more than 3 speakers were presented (p < 0.05).

[Figure 1: estimated number of streams (vertical axis) plotted against Number of Talkers (horizontal axis, 1 to 6).]

Figure 1. The estimated number of streams as a function of the number of concurrent speakers. The symbols (squares and small circles) indicate the average values across the subjects. Error bars are the SEMs. The thin line represents the theoretical maximum number.
If information about the individual sounds were lost by mutual interference in the peripheral auditory system (as in energetic masking), then changing the probe position would not affect the estimation of the number of streams. The difference between the probe conditions therefore may indicate that the perceptual limit is determined by cognitive (central) factors, such as attention or memory.

It is not clear how the limit of perception changes when sounds other than human voices create a mixed sound, or when objects are presented across different sense modalities. Measurement of the perceptual limit under these conditions will be helpful for understanding how our perceptual world arises.
References
Kashino, M. & Hirahara, T. 1996 One, two, many: Judging the number of concurrent talkers. Journal of the Acoustical Society of America, 99, 2597.
Macmillan, N. A. & Creelman, C. D. 1991 Detection theory: A user's guide. Cambridge: Cambridge University Press.