NII-Electronic Library Service
The Japanese Journal of Psychonomic Science 2009, Vol. 28, No. 1, 130-134
Lecture
Exploring multimodal integration in two functions of perception
Kenzo SAKURAI*
Tohoku Gakuin University*
Focusing on the topics of event perception and self-motion perception, in this paper I introduce our recent research on the integration of visual information with auditory and vestibular information. We have been investigating the limits of audio/visual integration by modifying conventional stream/bounce displays in spatial and temporal domains. We found that a sound has a markedly greater organizing influence on visual perception than was previously thought, influencing the resolution of visual motion sequences over a wide range of spatiotemporal manipulations. Regarding the integration of visual and vestibular information in perceived self-motion, the results of our experiments, in which we manipulated the congruency between vestibular and visual (optic flow) inputs, suggest that the multimodal integration is an either-or process when the discrepancy between visual and vestibular information is large, but the integration is a weighted combination of both inputs when that difference is small.
Key words: stream/bounce effect, multimodal perception, multisensory integration, audio/visual interactions
Introduction: Two major functions of perception
Human perception provides at least two functions. One is to identify, locate and track objects or events in the environment based on the retinal images and other receptors (sound, touch, smell, etc.). Another is to perceive self-position or self-motion from retinal optical flow, vestibular stimulation, auditory signals, etc. Focusing on these two functions of perception, I will introduce our recent investigations on multisensory integration in this article. First, I will report some of our research on audio/visual integration. Second, I will describe investigations on the integration of visual information with vestibular information.
Spatiotemporal limits of audio/visual integration in the stream/bounce effect
Our interest and understanding about audio/visual integration has increased significantly over the last decade. Prior to this surge in interest, a classical view of audio/visual integration was that vision dominates conflicting auditory information. A typical example is the ventriloquist effect, in which observers perceive the ventriloquist's voice as if it comes from the puppet's mouth. A contrary modern view of audio/visual integration is that there are instances where auditory information influences vision. A seminal finding was Sekuler, Sekuler and Lau's (1997) report of the so-called 'stream/bounce' effect, a transient-induced shift in perceptual bias when resolving an ambiguous motion display. In a typical display, two identical targets (usually dots or squares) move toward one another from opposite sides of a display at a constant speed, superimpose at the center (referred to as "the point of coincidence"), and continue past one another to the other object's starting point. This visual sequence is equally consistent with the two objects "streaming" past one another with their individual motions unchanged or "bouncing" off of one another where the targets reverse their motion after superimposing (Figure 1). Streaming is the dominant perception in visual only displays, though bouncing is occasionally reported (Bertenthal, Banton, & Bradbury, 1993; Sekuler & Sekuler, 1999). Interestingly, however, Sekuler et al. showed that a brief auditory tone presented at or near the point of coincidence alters this bias from predominantly streaming to predominantly bouncing. This stream/bounce effect has been replicated and examined in detail by several subsequent investigators (e.g., Fujisaki, Shimojo, Kashino, & Nishida, 2004), though all of them have employed ambiguous visual motion sequences, leading to the assumption that auditory stimulation has little influence, sufficient only to bias the resolution of ambiguous visual displays.
Figure 1. Two possible percepts generated in a typical stream/bounce display (panels: Streaming Percept, Bouncing Percept).
* Department of Psychology, Tohoku Gakuin University, 2-1-1 Tenjinzawa, Izumi-ku, Sendai 981-3193
Copyright 2009. The Japanese Psychonomic Society. All rights reserved.
In order to more thoroughly assess the organizing strength of an auditory stimulus on a visual display, we have been investigating the spatiotemporal limits of audio/visual integration by manipulating the conventional stream/bounce display in spatial and temporal domains.
Spatial domain
In the spatial domain, we began with a conventional, ambiguous, stream/bounce display and progressively offset the motion trajectories of the moving targets either vertically in a 2-D display or in depth in a 3-D display (Figures 2 and 3). We found that the bias towards bouncing, induced by an auditory transient, persists despite the trajectory offsets reducing the probability of a motion reversal (Grove & Sakurai, 2009).
Offset conditions were combined with two auditory conditions (tone or no tone at the point of coincidence) in the presence or absence of a central occluder. In conditions with no sound, streaming was reported on a clear majority of trials regardless of spatial offset. When a transient tone was presented, reported motion reversals dominated and persisted for increasing 2-D vertical offsets up to 17.9 min arc and for 3-D trajectory offsets up to 25.6 min arc. The persistence of the bounce promoting effect of an auditory tone at the point of coincidence, in displays rendered unambiguous and more consistent with streaming, shows that the organizing strength of auditory stimulation on visual perception is stronger than previously thought.
Figure 2. Schematic illustration of observers' view and possible perceived trajectories of the targets in a 2-D display. In a) no occluder is present. Observers could perceive the targets to either stream past (left panel) or to reverse their trajectory after coincidence (right panel). In b) possible streaming and bouncing percepts when the targets coincide behind a central occluder. Dashed lines in the left panel of a) and b) also represent the objective path of the targets.
Figure 3. Oblique view illustrating possible perceived trajectories of the targets in a 3-D display. In a) no occluder is present. Observers could perceive the targets to either stream past (left panel) or to reverse their trajectory after coincidence (right panel). Motion reversal after coincidence would involve targets switching depth planes after the point of coincidence. In b) possible streaming (left panel) and bouncing (right panel) percepts when the targets coincide behind a central occluder. Dashed lines in the left panel of a) and b) also represent the objective path of the targets.
Temporal domain
To explore the temporal properties of audiovisual interaction, we employed a novel motion sequence, a multiple stream/bounce display, consisting of two disc-pairs tracing orthogonal oblique (±45 deg) trajectories at equal speeds (13.44 deg/sec), and coinciding at a central fixation point (Kawachi, Grove, Sakurai, & Gyoba, 2008) (Fig. 4). We discovered that in the absence of any transient stimuli, when all four discs simultaneously coincided at the center of the display, perceived bouncing dominates. However, when one of the disc-pairs lags behind the other such that the disc-pairs coincide at different times, visual only sequences are resolved as streaming events. Bouncing of both disc-pairs is recovered, however, if a transient tone is presented simultaneously with the first coincidence event. Therefore, in the temporal domain, we found that one auditory tone crossmodally affects multiple visual events. Our experiments are systematically exploring this phenomenon.
In our first experiment with this display, we manipulated the temporal offsets between the coincidences of the disc-pairs (0-250 ms) by staggering motion onset between the pairs. A brief tone, synchronous with the first coincidence, was presented on half the trials. Observers judged whether all disc-pairs appeared to bounce or not. The tone promoted bounce percepts in both disc-pairs in spite of increasing offsets up to 250 ms, again demonstrating a robust effect of auditory stimulation on visual perception. However, it is possible that perceived bouncing in the first disc-pair per se promoted bouncing in the second. A subsequent experiment with three disc-pairs ruled out this possibility by eliciting reliable vision-only bouncing in the first motion sequence (simultaneous coincidence of four discs) without a transient sound (Kawachi et al., 2008). Results showed that transient-free bouncing in the first coincidence did not promote bouncing in the second. We conclude that one auditory tone alters the perception of multiple visual events.
Figure 4. The multiple stream/bounce display. The left sequence shows the 0 ms delay condition in which all the objects simultaneously begin moving toward the center, simultaneously coincide and then move away from one another. The right illustrates the delay conditions in which one of two disc-pairs moves first and then the other pair moves such that the coincidence of the two disc-pairs is no longer simultaneous.
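The timing of the multiple stream/bounce display can be made concrete with a small sketch. It is only an illustration under stated assumptions: the disc speed (13.44 deg/sec), the ±45 deg trajectories, and the 0-250 ms onset offsets come from the text, whereas the starting distance `START_ECC_DEG` and all function names are hypothetical and are not taken from the original experiments.

```python
import math

SPEED_DEG_PER_S = 13.44  # disc speed reported in the text
START_ECC_DEG = 3.36     # ASSUMPTION: start distance from fixation (not given in the text)


def coincidence_time_ms(onset_ms, start_ecc_deg=START_ECC_DEG, speed=SPEED_DEG_PER_S):
    """Time at which a disc-pair starting at onset_ms meets at the fixation point."""
    return onset_ms + 1000.0 * start_ecc_deg / speed


def disc_position(t_ms, onset_ms, angle_deg,
                  start_ecc_deg=START_ECC_DEG, speed=SPEED_DEG_PER_S):
    """(x, y) in deg of one disc moving through fixation along angle_deg.

    The signed eccentricity becomes negative once the disc has passed the
    center, i.e. the objective path is always "streaming".
    """
    elapsed_s = max(0.0, t_ms - onset_ms) / 1000.0
    signed_ecc = start_ecc_deg - speed * elapsed_s
    theta = math.radians(angle_deg)
    return (signed_ecc * math.cos(theta), signed_ecc * math.sin(theta))


def trial_schedule(offset_ms, tone=True):
    """Event times for one trial; the second pair starts offset_ms (>= 0) later."""
    first = coincidence_time_ms(0.0)
    second = coincidence_time_ms(offset_ms)
    return {
        "first_coincidence_ms": first,
        "second_coincidence_ms": second,
        # The tone, when present, is synchronous with the first coincidence.
        "tone_ms": first if tone else None,
    }
```

Calling `trial_schedule(0)` corresponds to the simultaneous-coincidence condition, and `trial_schedule(250)` to the largest onset offset described above.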
Perceived direction of self-motion by visual/vestibular integration
To investigate the integration of visual and vestibular information in perceived self-motion, we measured the apparent direction of self-motion when the directions of visual and vestibular information are inconsistent, extending our previous research (Sakurai, Kikuchi, Kikuchi, & Misawa, 2002; 2003). Observers were seated on an oscillating motor-driven swing at various orientations to the direction of swing motion, while they viewed expanding/contracting optic flow consistent with forward/backward self-motion via a head mounted display. In all conditions, the optic flow was phase-locked to the actual motion of the swing. Observers performed a rod-pointing task to report the perceived direction of self-motion.
When observers' bodies were oriented less than 90 degrees away from the direction of motion specified by optic flow, some observers perceived mid-way diagonal self-motion, consistent with a weighted combination of visual and vestibular inputs. For orientations more than 90 degrees away from the direction of optic flow, observers reported the veridical direction of their body motion. That is, they reported the direction of motion as signaled by the vestibular organs, discounting the visual input. The results suggest that the multimodal integration in the perception of self-motion is an either-or process when the direction difference between visual and vestibular information is large, but the integration is a combination of both inputs when that difference is small.
Figure 5. The motor-driven swing for vestibular stimulation and synchronized expanding/contracting optical flow stimuli produced by capturing the image of a random dot pattern (RDP) on a board with a CCD video camera mounted on a swing. White arrow indicates observer's rightward motion and synchronized expansion of the RDP. Gray arrow indicates leftward motion and synchronized contraction of the RDP.
Figure 6. Illustration of the integration of visual and vestibular information for small (<90 degrees) differences between visual and vestibular inputs. When observers experience rightward vestibular stimulation and expanding optic flow, observers perceive mid-way diagonal self-motion direction.
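The pattern of results described above can be summarized as a toy decision rule. This is a descriptive sketch only, not a model proposed in the original experiments: the function name, the fixed visual weight `w_visual`, and the treatment of a difference of exactly 90 degrees (assigned here to the vestibular-only branch) are all assumptions made for illustration.

```python
def perceived_direction(visual_deg, vestibular_deg, w_visual=0.5, threshold_deg=90.0):
    """Toy rule for the perceived self-motion direction, in degrees (0-360).

    visual_deg     -- direction specified by optic flow
    vestibular_deg -- direction of actual body motion
    w_visual       -- ASSUMED weight on the visual input (0.5 gives "mid-way")
    """
    # Signed angular difference (visual minus vestibular), wrapped to [-180, 180).
    diff = (visual_deg - vestibular_deg + 180.0) % 360.0 - 180.0
    if abs(diff) >= threshold_deg:
        # Large discrepancy: either-or outcome, the vestibular direction is reported.
        return vestibular_deg % 360.0
    # Small discrepancy: weighted combination of the two directions.
    return (vestibular_deg + w_visual * diff) % 360.0
```

With `w_visual=0.5`, a 60 degree discrepancy yields a direction half-way between the two inputs, whereas a 120 degree discrepancy returns the vestibular direction unchanged.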
Question about multimodal integration
Here I address the question of whether there is any common rule of multimodal integration in the two functions of perception. Ernst and Banks (2002) showed that the multimodal integration between visual and haptic information is a weighted combination of both inputs. Their claim can be generalized to any other combination of sensory inputs as far as object or event perception is concerned. On the other hand, there are not yet enough psychophysical measurements of multimodal integration in self-motion perception to support the same generalization. Our data show the possibility that the finding by Ernst and Banks (2002) can be generalized to the multimodal integration of visual and vestibular inputs.
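The weighted combination reported by Ernst and Banks (2002) is commonly formalized as reliability-weighted averaging, in which each cue is weighted by the inverse of its variance. The sketch below illustrates that rule under the assumption of independent Gaussian noise on each cue; the function name and the example numbers are hypothetical and do not come from the data discussed here.

```python
import math


def combine_cues(est_a, sigma_a, est_b, sigma_b):
    """Reliability-weighted combination of two single-cue estimates.

    est_a, est_b     -- estimates from each cue (e.g. visual and vestibular)
    sigma_a, sigma_b -- standard deviations of the noise on each cue (> 0)
    Returns (combined_estimate, combined_sigma, weight_a, weight_b).
    """
    rel_a = 1.0 / sigma_a ** 2  # reliability = inverse variance
    rel_b = 1.0 / sigma_b ** 2
    w_a = rel_a / (rel_a + rel_b)
    w_b = 1.0 - w_a
    combined = w_a * est_a + w_b * est_b
    # The combined estimate is never noisier than the more reliable cue.
    combined_sigma = math.sqrt(1.0 / (rel_a + rel_b))
    return combined, combined_sigma, w_a, w_b
```

For example, `combine_cues(10.0, 1.0, 14.0, 2.0)` gives the less noisy first cue four times the weight of the second. An either-or outcome corresponds to the limiting case in which one weight goes to zero.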
Conclusions
The bias towards bouncing, induced by an auditory transient, persists as the probability of a motion reversal is reduced by introducing a spatial offset either vertically in a 2-D display or in depth in a 3-D display. A single auditory tone crossmodally affects multiple visual events with various temporal offsets between the two coincidences. These results demonstrate that auditory information has a stronger organizing influence on visual perception than previously thought and contradict the old view that vision dominates the other senses.
The apparent perceived direction of self-motion is crossmodally integrated from visual and vestibular senses when the inputs are moderately disparate, but is an either-or process, with the vestibular sense dominating, when the inputs are widely different.
Regarding the two major functions of perception, the question remains whether a common rule of multimodal integration exists.
Acknowledgements
I thank all my collaborators, Philip M. Grove, Sakamoto, Jiro Gyoba, and Yôiti Suzuki. This research was supported by a Grant-in-Aid of MEXT for Specially Promoted Research (no. 19001004).
References
Bertenthal, B. I., Banton, T., & Bradbury, A. (1993). Directional bias in the perception of translating patterns. Perception, 22, 193-207.
Ernst, M. O., & Banks, M. S. (2002). Humans integrate visual and haptic information in a statistically optimal fashion. Nature, 415, 429-433.
Fujisaki, W., Shimojo, S., Kashino, M., & Nishida, S. (2004). Recalibration of audiovisual simultaneity. Nature Neuroscience, 7, 773-778.
Grove, P. M., & Sakurai, K. (2007). Equivalent stream/bounce perception in cyclopean and luminance defined displays. [Abstract] Journal of Vision, 7(9):302, 302a, http://journalofvision.org/7/9/302/, doi:10.1167/7.9.302.
Grove, P. M., & Sakurai, K. (2009). Auditory induced bounce perception persists as the probability of a motion reversal is reduced. Perception, advance online publication, doi:10.1068/p5860.
Kawachi, Y., Grove, P. M., Sakurai, K., & Gyoba, J. (2008). Crossmodal effects of a single auditory tone on multiple visual events. Perception (Supplement), 27.
Sakurai, K., Kikuchi, A., Kikuchi, R., & Misawa, Y. (2002). Perceived direction of self-motion from visual and vestibular sensory integration. Proceedings of the Second Asian Conference on Vision, 102.
Sakurai, K., Kikuchi, A., Kikuchi, R., & Misawa, Y. (2003). Perceived direction and distance of self-motion from visual and vestibular sensory integration. Perception, 32 (Supplement), 71.
Sekuler, R., & Sekuler, A. B. (1999). Collisions between moving visual targets: what controls alternative ways of seeing an ambiguous display? Perception, 28, 415-432.
Sekuler, R., Sekuler, A. B., & Lau, R. (1997). Sound alters visual motion perception. Nature, 385, 308.