ICASSP 2014 poster (PDF). Ryo Masumura: Web

Copyright©2014 NTT corp. All Rights Reserved.

Role Play Dialogue Topic Model for Language Model Adaptation in Multi-Party Conversation Speech Recognition

Ryo MASUMURA, Takanobu OBA, Hirokazu MASATAKI, Osamu YOSHIOKA and Satoshi TAKAHASHI

NTT Media Intelligence Laboratories, NTT Corporation, Japan


• Share the topic distribution among all speakers.

• Treat not only the conversational topic but also the speaker role.

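In RPDTM, the word probability distribution takes a different form depending on the value of the condition variable, which is either 0 or 1. For one value the word is drawn from a distribution related to the speaker role; for the other it is drawn from a distribution related to the topic of the dialogue.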

3-1. Role Play Dialogue Topic Model (RPDTM)

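1. For each topic t = 1, …, T:
   (a) Draw a topic-dependent word distribution from $\mathrm{Dirichlet}(\beta)$.
2. For each speaker r = 1, …, R:
   (a), (b) Draw two speaker-dependent distributions, each from a Dirichlet prior.
3. For each dialogue m = 1, …, M:
   (a) Draw a topic distribution from a Dirichlet prior.
   For each speaker r = 1, …, R and each word i of that speaker:
   (b-1) Draw a topic variable from a multinomial distribution.
   (b-2) Draw a condition variable from a multinomial distribution.
   (b-3) Depending on the value of the condition variable, draw the word from one of two multinomial word distributions.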
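The generative process above can be sketched in Python as follows. This is a minimal illustration using only the standard library, not the authors' implementation. The names (`sample_dirichlet`, `generate_dialogue`, `role_switch`), the symmetric priors, the toy sizes, and the convention that condition value 1 selects the role-dependent word distribution are all assumptions made for the sketch.

```python
import random

def sample_dirichlet(rng, dim, concentration):
    """Draw from a symmetric Dirichlet via normalized Gamma variates."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(dim)]
    total = sum(draws)
    return [d / total for d in draws]

def sample_index(rng, probs):
    """Draw an index from a categorical distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def generate_dialogue(rng, topic_word, role_word, role_switch, alpha, words_per_role):
    """Generate one dialogue; a single topic distribution is shared by all roles."""
    theta = sample_dirichlet(rng, len(topic_word), alpha)  # shared across speakers
    dialogue = []
    for r in range(len(role_word)):
        words = []
        for _ in range(words_per_role):
            z = sample_index(rng, theta)           # topic variable
            x = sample_index(rng, role_switch[r])  # condition variable, 0 or 1
            # Assumption: x == 1 -> role-dependent word, x == 0 -> topic-dependent word.
            dist = role_word[r] if x == 1 else topic_word[z]
            words.append(sample_index(rng, dist))
        dialogue.append(words)
    return theta, dialogue

rng = random.Random(0)
T, R, V = 3, 2, 10  # topics, roles, vocabulary size (toy values)
topic_word = [sample_dirichlet(rng, V, 0.5) for _ in range(T)]
role_word = [sample_dirichlet(rng, V, 0.5) for _ in range(R)]
role_switch = [sample_dirichlet(rng, 2, 1.0) for _ in range(R)]
theta, dialogue = generate_dialogue(rng, topic_word, role_word, role_switch, 0.5, 20)
```

Here `dialogue[r]` holds the word indices generated for role `r`. Both roles draw their topic variables from the same `theta`, which is the topic sharing the model relies on.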

1. Overview

[Objective]

Introduce an unsupervised language model adaptation technique for multi-party conversation tasks.

[Points]

Propose a novel topic model called the role play dialogue topic model (RPDTM), together with an adaptation framework that can utilize multi-party conversation attributes.

• Each speaker shares the same conversational topic.

• Each speaker's utterances depend not only on the conversational topic but also on the speaker's own role.

2-1. Latent Dirichlet Allocation (LDA) [D. M. Blei+, 2003.]

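1. For each topic t = 1, …, T:
   (a) Draw $\phi_t \sim \mathrm{Dirichlet}(\beta)$.
2. For each document m = 1, …, M:
   (a) Draw $\theta_m \sim \mathrm{Dirichlet}(\alpha)$.
   (b) For each word i in document m:
   (b-1) Draw $z_{m,i} \sim \mathrm{Multi}(\theta_m)$.
   (b-2) Draw $w_{m,i} \sim \mathrm{Multi}(\phi_{z_{m,i}})$.

A topic model can capture semantic properties of words and documents.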

2-2. LDA-based unsupervised LM adaptation [Y. Tam+, 2006.]

[Figure: single recognition hypothesis → topic probability estimation → adapted unigram → unigram marginal (with the background N-gram) → adapted N-gram]

A recognition hypothesis is used to estimate the topic probability, from which the adapted unigram probability is calculated. The N-gram is then adapted with this unigram probability using the unigram marginal technique.
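The two steps described here can be sketched as follows. This is a minimal sketch under stated assumptions, not the system used on the poster: the adapted unigram is taken as the topic mixture P(w) = Σ_t P(t) P(w | t), and the unigram marginal step is taken as rescaling each background probability P(w | h) by (P_adapt(w) / P_bg(w))^γ and renormalizing per history. The function names, the exponent `gamma`, and all numbers are hypothetical.

```python
def adapted_unigram(topic_probs, topic_word):
    """P_adapt(w) = sum_t P(t) * P(w | t)."""
    vocab_size = len(topic_word[0])
    return [sum(p_t * topic_word[t][w] for t, p_t in enumerate(topic_probs))
            for w in range(vocab_size)]

def unigram_marginal_adapt(bg_cond, bg_unigram, ad_unigram, gamma=0.5):
    """Rescale P_bg(w | h) by (P_adapt(w) / P_bg(w)) ** gamma, then renormalize."""
    scaled = [p * (ad_unigram[w] / bg_unigram[w]) ** gamma
              for w, p in enumerate(bg_cond)]
    norm = sum(scaled)
    return [s / norm for s in scaled]

# Toy example: 2 topics, 3-word vocabulary (all numbers are made up).
topic_word = [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]
topic_probs = [0.9, 0.1]   # estimated from a recognition hypothesis
bg_unigram = [0.4, 0.2, 0.4]
bg_cond = [0.3, 0.3, 0.4]  # background P(w | h) for one history h
ad_unigram = adapted_unigram(topic_probs, topic_word)
ad_cond = unigram_marginal_adapt(bg_cond, bg_unigram, ad_unigram)
```

In this toy case the first word is favoured by the dominant topic, so its conditional probability rises after adaptation, while the third word's falls.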

3-3. RPDTM-based unsupervised LM adaptation

Multiple recognition hypotheses, one per speaker, are taken as input, and the LMs for all speaker roles are adapted simultaneously using the shared conversational topic.
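One way to picture the role-dependent adapted unigrams is as a mixture, for each role, of a topic-driven part (using the shared topic probabilities) and a role-driven part. The sketch below follows from the generative story under assumptions of its own: the mixture form, the function name `role_adapted_unigram`, the ordering of the switch probabilities, and every number are hypothetical and are not taken from the poster.

```python
def role_adapted_unigram(shared_topic_probs, topic_word, role_word_r, switch_r):
    """Mix topic-dependent and role-dependent word distributions for one role."""
    p_topic, p_role = switch_r  # assumed order: (topic-driven, role-driven)
    adapted = []
    for w in range(len(role_word_r)):
        topic_part = sum(p_t * topic_word[t][w]
                         for t, p_t in enumerate(shared_topic_probs))
        adapted.append(p_topic * topic_part + p_role * role_word_r[w])
    return adapted

# Toy values: 2 topics, 3-word vocabulary, 2 roles.
topic_word = [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]
shared_topic_probs = [0.9, 0.1]  # one estimate for the whole dialogue
role_word = {"operator": [0.1, 0.8, 0.1], "customer": [0.3, 0.3, 0.4]}
role_switch = {"operator": (0.6, 0.4), "customer": (0.8, 0.2)}

unigrams = {
    role: role_adapted_unigram(shared_topic_probs, topic_word,
                               role_word[role], role_switch[role])
    for role in role_word
}
```

Each entry of `unigrams` could then feed a separate unigram marginal step, giving one adapted N-gram per role while the topic estimate stays common to both.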

[Figure: recognition hypothesis A (operator) → adapted unigram A → unigram marginal (with the background N-gram) → adapted N-gram A]

4. Experiments

• The conventional models are appropriate only for single-speaker tasks because they assume that each document has a different topic.

• In multi-party conversation, the correlation among the speech sets of the different speakers has to be taken into account.

[Figure: recognition hypothesis B (customer) → adapted unigram B → unigram marginal → adapted N-gram B]

3-2. Inference of RPDTM

Gibbs sampling can be used to assign the topic variable and the condition variable.

[Equations: conditional distributions for sampling the topic variable and the condition variable of each word, given all other assignments]
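For orientation, the sketch below shows collapsed Gibbs sampling for standard LDA, where the conditional for a topic assignment is proportional to (n_{m,t} + α)(n_{t,w} + β) / (n_t + Vβ). It is not the RPDTM sampler: RPDTM would additionally sample the condition variable and keep role-dependent counts. The function name `gibbs_lda`, the hyperparameter values, and the toy corpus are assumptions.

```python
import random

def gibbs_lda(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, iters=50, seed=0):
    """Collapsed Gibbs sampling for standard LDA; returns assignments and counts."""
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assign = []
    # Random initialization of the topic assignments.
    for m, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[m][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)
    for _ in range(iters):
        for m, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[m][i]
                # Remove the current assignment from the counts.
                doc_topic[m][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[m][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights, k=1)[0]
                assign[m][i] = z
                doc_topic[m][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return assign, doc_topic, topic_word

# Toy corpus of word indices over a 6-word vocabulary.
docs = [[0, 1, 0, 1, 2], [3, 4, 3, 4, 5], [0, 1, 2, 0], [3, 5, 4, 4]]
assign, doc_topic, topic_word = gibbs_lda(docs, num_topics=2, vocab_size=6)
```

After sampling, the count tables can be smoothed with `alpha` and `beta` and normalized to estimate the document-topic and topic-word distributions.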

Contact center dialogue data sets were used. One dialogue corresponds to a telephone call between one operator and one customer.

| Method | Description | Topic sharing | Considers speaker role | PPL | WER (%) |
|---|---|---|---|---|---|
| BASE | First decoding pass based on the background LM. | - | - | 47.12 | 22.70 |
| LDA1 | Adapted LM constructed individually for each speaker from that speaker's recognition hypothesis, based on LDA. | × | ✓ | 42.56 | 22.26 |
| LDA2 | Single adapted LM constructed from all speakers' recognition hypotheses, based on LDA. | ✓ | × | 43.88 | 22.18 |
| RPDTM | Adapted LM constructed individually for each speaker from all speakers' recognition hypotheses, based on RPDTM. | ✓ | ✓ | 39.66 | 21.20 |

• Training set: 1922 dialogues (1.7M)

• Test set: 18 dialogues (20K)

• Background LM: 3-gram hierarchical Pitman-Yor LM (60K)

• Acoustic model: Triphone DNN-HMM (7 hidden layers of 2048 nodes)

• Decoder: WFST-based VoiceRex

• Number of topics: 20

RPDTM-based adaptation is more effective than LDA-based adaptation. Both topic sharing and role-dependent adaptation are effective for multi-party conversation.
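To put the reported numbers on a common scale, the relative reductions against BASE can be computed directly from the PPL and WER values in the results. The helper name `relative_reduction` is hypothetical; the figures are those reported above.

```python
def relative_reduction(baseline, value):
    """Relative reduction of `value` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - value) / baseline

# (PPL, WER %) as reported in the experimental results.
results = {
    "BASE": (47.12, 22.70),
    "LDA1": (42.56, 22.26),
    "LDA2": (43.88, 22.18),
    "RPDTM": (39.66, 21.20),
}

base_ppl, base_wer = results["BASE"]
for name, (ppl, wer) in results.items():
    if name == "BASE":
        continue
    print(f"{name}: PPL -{relative_reduction(base_ppl, ppl):.1f}%, "
          f"WER -{relative_reduction(base_wer, wer):.1f}%")
```

Relative to BASE, RPDTM corresponds to a reduction of roughly 15.8% in perplexity and 6.6% in WER.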

Once the topic variable and condition variable assignments have been determined, each probability distribution can be calculated.

• For one value of the condition variable, the word is related to the speaker role.

• For the other value, the word is related to the topic of the dialogue.

A role is a speaker type. For example, a contact center dialogue has two roles: operator and customer.

RPDTM assumes that each speaker's role in the conversation is given.

RPDTM generates a topic distribution for each dialogue, which includes several speech sets.

[Figure labels: topic variable; topic variable and condition variable]

One of the most accurate approaches is based on probabilistic topic models such as latent Dirichlet allocation.

Problems:

RPDTM is used to estimate topic probability for the target dialogue.
