2M4-OS-20a-4
Collection and analysis of multi-party interaction data
for boredom recognition
Nataliia Biriukova 1
Koutaro Funakoshi 2
Koichi Shinoda 1
1 Tokyo Institute of Technology
2 Honda Research Institute
In human-computer interaction systems such as tutoring systems or entertainment robots, it is important to
keep users' attention and not to let them get bored. For this purpose, such systems should first recognize whether
users are bored or not. We plan to develop an automatic boredom recognition system in which several non-verbal
cues from users, such as gestures and facial expressions, are captured and utilized. In this paper we report our
database collection for this development. It consists of a set of multi-party conversations including a personal
robot, recorded by an RGB-D camera and microphones. We annotated the recordings with `bored', `not bored',
`cannot say', and `face not visible' categories. We found a correlation between the physical activity of subjects
and their boredom states: a lack of body movement during interaction indicates a bored state.
1. Introduction
The most common human-computer interaction style now
is the desktop style, in which the interaction is performed
through graphical user interfaces, keyboards, and pointing
devices. Although it is very useful when interacting with
PCs, it is not enough for emerging applications of
computers, such as intelligent tutoring systems or social
assistants [1].
With recent advances in technology, new kinds of computers
for those new applications have been developed. In those
applications a system needs to understand users' affective
states, such as emotions, interest level, engagement, and
boredom. Humans express their affective state through both
verbal and non-verbal cues. Several studies (e.g. [2]) have
reported that humans mostly rely on non-verbal cues when
judging affective states. Non-verbal cues therefore play
important roles in affective state recognition.
Different affective states play different roles, and several
studies have been devoted to recognition of emotions,
interest level, and engagement. On the other hand, the
importance of automatic boredom recognition has not been
fully explored. When a person is bored during interaction in
any area of life, the goals of the interaction might not be
fully reached.
In this paper we first review previous studies and
their methods for dataset labeling, then describe our
dataset and annotation strategy, and report our results.
2. Previous studies
2.1 Affective states recognition
There have been a number of studies dealing with non-verbal
communication cues: detecting a user's curiosity in a
customer service application [3], and interest detection in
one-to-one interaction [4] and in meetings [5]. There have
also been studies on boredom recognition based on head
positions [6] or on postures [7].
Contact: Nataliia Biriukova, 080 3019 8755,
biriukova.n.aa@m.titech.ac.jp
So far, most of those works have focused on only one
modality, while simultaneous use of multiple modalities has
increased recognition accuracy [8]. Some studies have
combined one visual modality, such as facial expression,
with one audio modality (e.g. [9]).
2.2 Dataset labeling methods
Most of the labeling methods in affective computing
research have used annotation by judges and questionnaires.
Jacobs et al. [6] used their combination to label boredom
states. Participants first labeled how bored they were in
each video on a 7-point Likert scale, then two judges put
one label per video. The two judges achieved an average of
76.9% agreement after the first annotation. They then went
back and re-annotated the events where there was
disagreement. This improved the agreement to an average of
96.7%.
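The agreement figures quoted above are simple percent agreement between two judges. As a minimal sketch of how such a figure can be computed (the function name and the example labels are hypothetical and are not taken from [6]):

```python
def percent_agreement(labels_a, labels_b):
    """Share of items (in %) on which two judges gave the same label."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both judges must label the same items")
    if not labels_a:
        raise ValueError("no items to compare")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return 100.0 * matches / len(labels_a)


# Hypothetical labels for five videos, for illustration only.
judge_1 = ["bored", "not bored", "bored", "not bored", "bored"]
judge_2 = ["bored", "not bored", "not bored", "not bored", "bored"]

print(percent_agreement(judge_1, judge_2))  # 80.0
```

Percent agreement does not correct for agreement expected by chance; it is shown here only because it is the measure reported in the text.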
In Castellano et al. [10], the dataset was annotated in
terms of user engagement with a robot by three annotators.
Annotators chose one out of three options and the results
from each annotator were then compared. A label was
confirmed when it was chosen by two or three of the
annotators. In case each of the annotators chose a
different label, the segment was labeled as `cannot say'
and was not used in their further study.
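The voting rule described above can be sketched as follows. This is an illustration only: the function name and the three option names are assumptions, since the text does not list the options used in [10].

```python
from collections import Counter


def majority_label(votes, fallback="cannot say"):
    """Return the label chosen by at least two of the three annotators.

    If all three annotators chose different labels, return the fallback.
    """
    if len(votes) != 3:
        raise ValueError("expected exactly three votes")
    label, count = Counter(votes).most_common(1)[0]
    return label if count >= 2 else fallback


# Hypothetical option names, for illustration only.
print(majority_label(["engaged", "engaged", "not engaged"]))  # engaged
print(majority_label(["engaged", "not engaged", "neutral"]))  # cannot say
```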
Our strategy differs from theirs: we do not use a
questionnaire. Aiming for natural interaction, we focused
on long-time interaction scenarios, in which subjects may
not be able to correctly report their boredom state in each
phase of their conversation.
3. Database
3.1 Data
The database [12] (footnote 1) consists of 60 recordings,
in each of which three users interact with a robot,
recorded by an RGB-D camera and microphones. The number of
subjects in total is 90. Each recording is 25 minutes long.
We used a Nao robot [13] and employed the Wizard-of-Oz
(WoZ) technique, in which an operator remotely manipulates
a robot, controlling its movement, speech, and gestures.
During one session the users can appear in the scene all
together, in pairs, or alone. They were instructed to
behave naturally and were free to leave or join the scene
whenever they wanted.
Fig. 1: `Gesture Game' scenario
Footnote 1: this paper's notion of participation is different from the
Each group of three users (hereafter called A, B, and C)
participated in two different interaction scenarios. The
first scenario is the `Quiz Game'. In the `Quiz Game', the
robot imagines a word (e.g. `apple') and answers users'
yes-no questions. The users' goal is to correctly guess the
imagined word by asking questions and discussing the
robot's answers with each other. The second scenario is the
`Gesture Game' (Fig. 1). It is a game in which the robot
tries to teach users a set of gestures in English. For
example, the robot touches its nose and says `Nose', asking
users to repeat the same gesture. If a user's gesture is
correct, the robot gives an approving comment.
3.2 Annotation strategy
Annotation was conducted by three judges (hereafter called
X, Y, and Z), two females (X, Z) and one male (Y). In the
annotation we used the `bored', `not bored', `cannot say',
and `face not visible' labels. If a state is observable for
less than 2 sec, it is not labeled. The four labels are
described as follows:
A) No face visible - The face of the user is turned away
   from the robot by 90 degrees or more, or the face is
   blocked by another user.
B) Bored - The user is not active, reacts slowly, or doesn't
   react at all to the other participants.
C) Not Bored - The user actively participates in the game,
   reacts quickly to the robot's questions, and interacts
   energetically with the robot or the other users.
D) Cannot say - It is extremely hard for the judge to assign
   any of the above categories.
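The 2-second minimum duration rule stated above could be applied to a list of annotated segments as in the following sketch. The `Segment` structure and the example timings are assumptions made for illustration; the text does not specify how annotations were stored.

```python
from dataclasses import dataclass

MIN_DURATION_S = 2.0  # states observable for less than 2 sec are not labeled


@dataclass
class Segment:
    start_s: float  # segment start time in seconds (hypothetical field)
    end_s: float    # segment end time in seconds (hypothetical field)
    label: str


def drop_short_segments(segments, min_duration_s=MIN_DURATION_S):
    """Keep only the segments that last at least min_duration_s seconds."""
    return [s for s in segments if s.end_s - s.start_s >= min_duration_s]


# Hypothetical segments, for illustration only.
segments = [
    Segment(0.0, 5.0, "not bored"),
    Segment(5.0, 6.5, "bored"),       # 1.5 sec, too short to be labeled
    Segment(6.5, 20.0, "not bored"),
]

kept = drop_short_segments(segments)
print([s.label for s in kept])  # ['not bored', 'not bored']
```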
Figure 2 shows the decision tree used in the labeling. To
answer the questions `Subject participates?' and `Subject
looks interested in participating?', judges used the
following rules:
1. Subject participates, when he or she:
   (a) Performs the gestures that the robot asked for
       within 3 sec after the robot finished its speech.
   (b) Replies to the questions within 3 sec after the
       robot finished its speech.
   (c) Raises a hand to reply to the questions within 3
       sec after the robot finished its speech.
   (d) Touches or talks to the other subjects.
   (e) Makes excited or happy noises.
   (f) Does not avert his/her gaze from the robot and
       the other subjects for longer than 7 sec.
2. Subject looks interested in participating, when he/she:
   (a) Looks at the robot or the other participants with
       a smile.
   (b) When standing in the back, fixes his/her gaze
       on the robot or the other participants.
   (c) Starts talking to the robot before the robot asks
       him/her to play.
Fig. 2: Annotation decision chart
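The timing thresholds in rules 1(a)-(c) and 1(f) above can be written as two small checks. This is only a sketch under stated assumptions: the function names and the time representation (seconds from the start of the recording) are hypothetical, a response that starts before the robot finishes speaking is assumed not to count, and the text does not state how the individual rules are combined, so no combination is shown.

```python
RESPONSE_WINDOW_S = 3.0    # rules 1(a)-(c): respond within 3 sec
MAX_GAZE_AVERSION_S = 7.0  # rule 1(f): gaze averted for no longer than 7 sec


def responded_in_time(robot_speech_end_s, response_start_s):
    """Rules 1(a)-(c): the gesture, reply, or raised hand starts within
    3 sec after the robot finished its speech."""
    delay = response_start_s - robot_speech_end_s
    return 0.0 <= delay <= RESPONSE_WINDOW_S


def gaze_kept(aversion_durations_s):
    """Rule 1(f): no single gaze aversion lasts longer than 7 sec."""
    return all(d <= MAX_GAZE_AVERSION_S for d in aversion_durations_s)


# Hypothetical timings, for illustration only.
print(responded_in_time(10.0, 12.5))  # True
print(responded_in_time(10.0, 14.0))  # False
print(gaze_kept([1.0, 6.5]))          # True
print(gaze_kept([2.0, 8.0]))          # False
```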
In the cases when judges were not sure about the presence
of features from the list above and therefore were not able
to answer the questions in the chart, they assigned the
`cannot say' label.
Some spontaneous gestures are informative for annotators.
We listed them in Table 1. When annotators found
Group      Gesture
Fixing     Clothes fixing
           Hair touching
           Face touching
Waving     Wave
Win        Win
           Clap
           Hands up
Pointing   Pointing
           Self pointing
           Next
Playful    Dancing
Table 1: List of participants' spontaneous gestures
          A    B    C
Before    7   10   43
After     5    9   13
Table 2: Disagreement rates before and after re-annotation (%)
4. Results
Table 2 shows the disagreement rates for the `Gesture Game'
session before and after re-annotation. Before
re-annotation the disagreement rate between judges was
high. For example, it was 43% for subject C due to his
ambiguous behavior. Table 3 shows an example of the amount
of time per state, as labeled by judge X for the three
participants, before and after re-annotation. Judge X
tended to assign the `bored' label more often initially.
Re-annotation also reduced the amount of time with the
`cannot say' label for all judges. However, it is not clear
whether this was due to a better understanding of subjects'
reactions or to more biased decisions.
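As a small worked example, the reduction in `cannot say' time for judge X can be read directly from the `Cannot say' row of Table 3. The sketch below converts the min:sec values to seconds; the helper name is hypothetical, and only judge X's values are available in the table.

```python
def to_seconds(mmss):
    """Convert a 'min:sec' string such as '01:09' to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# 'Cannot say' row of Table 3: (before, after) re-annotation, judge X.
cannot_say = {
    "A": ("01:09", "00:00"),
    "B": ("01:07", "00:21"),
    "C": ("01:33", "00:31"),
}

reductions = {
    subject: to_seconds(before) - to_seconds(after)
    for subject, (before, after) in cannot_say.items()
}
print(reductions)  # {'A': 69, 'B': 46, 'C': 62}
```

For all three subjects the `cannot say' time labeled by judge X drops by roughly 45 to 70 seconds after re-annotation.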
We found a strong correlation between boredom states
and the number of spontaneous gestures of subjects.
Table 4 shows the number of gestures in each state for each
subject. In the `bored' state subjects tend to be more
still and make fewer gestures. We also found a correlation
between the occurrence of the `bored' state and the number
of participants present in the scene. In cases where only
one person interacted with the robot and that person became
bored, the appearance of the other participants in the
scene always caused a state change to `not bored'. There
was no disagreement between judges in any such instance,
which leads us to trust the labeling here.
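To make the gesture counts easier to compare across subjects, the share of each subject's gestures that fell into the `bored' state can be computed from Table 4. This is only an illustrative calculation with hypothetical names; the counts are raw and are not normalized by the time spent in each state.

```python
# Gesture counts per state for each subject, copied from Table 4.
gestures = {
    "A": {"bored": 5, "not bored": 58, "cannot say": 0, "no face visible": 0},
    "B": {"bored": 3, "not bored": 16, "cannot say": 2, "no face visible": 0},
    "C": {"bored": 11, "not bored": 60, "cannot say": 5, "no face visible": 2},
}


def bored_share(counts):
    """Percentage of a subject's gestures that occurred in the bored state."""
    return 100.0 * counts["bored"] / sum(counts.values())


shares = {s: round(bored_share(c), 1) for s, c in gestures.items()}
print(shares)  # {'A': 7.9, 'B': 14.3, 'C': 14.1}
```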
                      A               B               C
                 Before  After   Before  After   Before  After
Bored            00:57   00:42   01:43   00:50   03:58   03:13
Not Bored        12:47   14:02   11:10   12:23   08:59   10:58
Cannot say       01:09   00:00   01:07   00:21   01:33   00:31
No face visible      00:00           00:05           00:41
Table 3: Time per state before and after re-annotation (min:sec)

                  A    B    C
Bored             5    3   11
Not Bored        58   16   60
Cannot say        0    2    5
No face visible   0    0    2
Table 4: Number of gestures that occurred in each state

5. Conclusion
An automatic boredom recognition system plays an important
role in affective state recognition. We collected and
analyzed a dataset for such a system, using multiple
modalities. We used interactive scenarios for human-robot
interaction, recorded the dataset with an RGB-D camera and
microphones, and labeled the recordings in terms of boredom
states. We achieved an 80% agreement rate between three
judges. We found a correlation between the physical
activity of subjects and their boredom states: a lack of
body movement and activeness during the interaction
indicates a bored state.
We plan to develop an automatic boredom recognition system
in the future.
References
[1] M. Turk, G. Robertson, The Human-Computer Interaction
Handbook: Fundamentals, Evolving Technologies and Emerging
Applications, 2008.
[2] A. Mehrabian, Communication without words, Psychol.
Today, vol. 2, no. 4, pp. 53-56, 1968.
[3] P. Qvarfordt, D. Beymer, S. X. Zhai, RealTourist - a study
of augmenting human-human and human-computer dialogue with
eye-gaze overlay. INTERACT 2005, vol. LNCS 3585,
pp. 767-780, 2005.
[4] A. Pentland, A. Madan, Perception of social interest.
Proc. IEEE Int. Conf. on Computer Vision, Workshop on
Modeling People and Human Interaction (ICCV-PHI), 2005.
[5] L. Kennedy, D. Ellis, Pitch-based emphasis detection
for characterization of meeting recordings, Proc. ASRU,
2003.
[6] A. Jacobs, B. Fransen, J. M. McCurry, F. Heckel, A.
Wagner, J. G. Trafton, A preliminary system for recognizing
boredom. Proceedings of the Fourth ACM/IEEE International
Conference on Human Robot Interaction, 2009.
[7] S. Mota, R. W. Picard, Automated posture analysis for
detecting learner's interest level, Computer Vision and
Pattern Recognition Workshop, 2003.
[8] E. Hudlicka, To feel or not to feel: The role of affect
in human-computer interaction, Int. J. Hum.-Comput.
[9] M. Pantic et al., Affective Multimodal Human-Computer
Interaction, Proc. of ACM Int'l Conf. on Multimedia,
pp. 669-676, 2005.
[10] G. Castellano, A. Pereira, I. Leite, A. Paiva, P.
McOwan, Detecting user engagement with a robot companion
using task and social interaction-based features.
Proceedings of the 11th ICMI, 2009.
[11] S. K. D'Mello, P. Chipman, A. C. Graesser, Posture as a
predictor of learner's affective engagement. Proceedings of
the 29th Annual Cognitive Science Society, pp. 571-576, 2006.
[12]