JAIST Repository: Tree-to-String Phrase-based Statistical Machine Translation

Academic year: 2021

Japan Advanced Institute of Science and Technology

JAIST Repository

https://dspace.jaist.ac.jp/

Title: Tree-to-String Phrase-based Statistical Machine Translation

Author(s): NGUYEN, Thai Phuong

Citation:

Issue Date: 2008-03

Type: Thesis or Dissertation

Text version: author

URL: http://hdl.handle.net/10119/12070

Rights:

Abstract

The major aim of our study is to improve phrase-based statistical machine translation (SMT) using syntactic information represented as constituent trees. In recent years there have been many studies of syntactic SMT, most of which rely on formal grammars such as synchronous context-free grammars (SCFG) and tree transducers. These approaches differ in several respects: the input type (string or tree), the rule form (for example, SCFG or xRs rules), and the rule function (word reordering or word choice). Because these studies aim to improve both word reordering and word choice, their grammars are fully lexicalized. We instead distinguish between word order and word choice when statistically modelling the translation process, and we assume that the input of an SMT system is a syntactic tree. Treating word order as a syntactic problem, we define a syntactic transformation task that involves word reordering and the deletion and insertion of function words. We propose a syntactic transformation model based on probabilistic context-free grammar. Using this model, we studied a number of tree-to-string phrase-based SMT approaches that vary in how syntactic information is used (in preprocessing or in decoding) and in the level of syntactic analysis (chunking or parsing). Our experimental results showed significant improvements in translation quality. Treating word choice as a semantic problem, we aim to incorporate word sense disambiguation (WSD) into phrase-based SMT. Our empirical study of this problem reveals various aspects of the integration, and our experiments showed a significant improvement in translation quality.

Key words: Computational Linguistics, Statistical Machine Translation, Syntactic Parsing, Word Reordering, Word Sense Disambiguation, Word Choice.
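
The syntactic transformation task named in the abstract (reordering, deletion, and insertion of function words over a constituent tree) can be illustrated with a small sketch. This is not the thesis's model: the rule table `RULES`, its probabilities, the placeholder token `<fw>`, and the function `transform` are all hypothetical, and the sketch simply assumes that each source production has a few candidate target-side orderings with probabilities and greedily applies the most probable one at every node.

```python
from typing import Dict, List, Tuple, Union

# A tree node is (label, children); a leaf is (pos_tag, word).
# A target-side element is either an index into the source children
# (the child is kept, possibly moved) or a string (an inserted word).
Target = Tuple[Union[int, str], ...]
Production = Tuple[str, Tuple[str, ...]]

# Hypothetical rule table: source production -> [(target side, probability)].
# The probabilities are made up for illustration only.
RULES: Dict[Production, List[Tuple[Target, float]]] = {
    # Drop the determiner and move the noun before the adjective.
    ("NP", ("DT", "JJ", "NN")): [((2, 1), 0.7), ((0, 1, 2), 0.3)],
    # Keep verb-object order.
    ("VP", ("VB", "NP")): [((0, 1), 0.9), ((1, 0), 0.1)],
    # Insert a placeholder function word between subject and predicate.
    ("S", ("PRP", "VP")): [((0, "<fw>", 1), 0.6), ((0, 1), 0.4)],
}


def transform(tree) -> List[str]:
    """Return the words of `tree` after applying the best rule at each node."""
    label, children = tree
    if isinstance(children, str):  # leaf: (pos_tag, word)
        return [children]
    production = (label, tuple(child[0] for child in children))
    options = RULES.get(production)
    if options is None:
        # No rule for this production: keep the source order unchanged.
        target: Target = tuple(range(len(children)))
    else:
        # Greedy choice: the most probable target side for this production.
        target = max(options, key=lambda option: option[1])[0]
    words: List[str] = []
    for item in target:
        if isinstance(item, int):
            words.extend(transform(children[item]))  # kept or moved child
        else:
            words.append(item)  # inserted function word
    return words


example = ("S", [
    ("PRP", "I"),
    ("VP", [
        ("VB", "read"),
        ("NP", [("DT", "the"), ("JJ", "red"), ("NN", "book")]),
    ]),
])
print(transform(example))
```

A child index that is absent from the chosen target side models deletion, a permuted index order models reordering, and a string element models insertion. The greedy per-node choice is a simplification; a probabilistic model would normally score whole derivations rather than pick each rule in isolation.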
