4.2 Discrepancies in Head Definition

Head Finalization relies on the idea that head-dependent relations are largely consistent across languages even though word orders differ. However, the definition of the head in Chinese has been much debated², possibly because Chinese has fewer surface syntactic features than languages such as English and Japanese. This causes discrepancies between the head definitions for Chinese and Japanese, which lead to undesirable reordering of Chinese sentences. Specifically, in preliminary experiments we observed unexpected reorderings caused by these differences in head definition, which we describe below.

4.2.1 Aspect particle

Although Chinese has no syntactic tense marker, three aspect particles following verbs can be used to identify the tense semantically, namely 了(did), 着(doing), and 过(done).

Their counterparts in Japanese are た(did), ている(doing), and た(done), respectively.

Both the first and the third can represent the past tense, but the third is more often used for the past perfect.

² In this thesis, we only consider the syntactic head.

³ The discussions in this chapter presuppose the syntactic analysis done by Chinese Enju, but most of the analysis is consistent with the common explanation for Chinese syntax.

Chapter 4. Head Finalization for Chinese (HFC)

The Chinese parser³ treats aspect particles as dependents of verbs, whereas their Japanese counterparts are identified as heads. The mis-reordering of “去(go to) 了(-ed)” to “了(-ed) 去(go to)” in Figure 4.1 is one example. Since “了(-ed)” is recognized as a dependent of “去(go to)” while its Japanese counterpart is the syntactic head of the verb, a simple implementation of HF leads to the wrong operation. Similarly, 着(doing) and 过(done) are wrongly reordered relative to the verbs they attach to.
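The mis-reordering of “去(go to) 了(-ed)” can be reproduced with a minimal sketch of plain HF on a binary tree. The names `Leaf`, `Node`, and `head_finalize` are hypothetical and introduced here only for illustration; they are not part of Chinese Enju or of the original HF implementation, and the sketch assumes that every branch is binary and has one child marked as the head.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Leaf:
    word: str
    pos: str  # POS tag of the word (assumed Penn Chinese Treebank style)


@dataclass
class Node:
    left: "Tree"
    right: "Tree"
    head: str  # "left" or "right": which child is the syntactic head


Tree = Union[Leaf, Node]


def head_finalize(tree: Tree) -> List[str]:
    """Plain HF: at every branch, move the head child to the right."""
    if isinstance(tree, Leaf):
        return [tree.word]
    left = head_finalize(tree.left)
    right = head_finalize(tree.right)
    if tree.head == "left":
        return right + left  # head came first, so swap the two children
    return left + right      # already head-final, keep the order


# The parser marks the verb as head and the aspect particle as dependent.
went = Node(Leaf("去", "VV"), Leaf("了", "AS"), head="left")
print(head_finalize(went))
```

Because the verb is marked as the head, the sketch moves it behind the particle and outputs 了 before 去, which is the wrong order described above.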

4.2.2 Adverbial modifier bu4(not)

In both Chinese and Japanese, verb phrase modifiers typically occur in pre-verbal positions, especially when the modifiers are adverbs. Since adverbial modifiers are dependents in both Chinese and Japanese, head finalization works perfectly for them. However, there is one exceptional adverb in Chinese, namely bu4, written 不(not), which functions as a negator and is usually translated into ない in Japanese. As an adverb in Chinese, 不(not) is always a dependent of the verb that it modifies, whereas ない in Japanese is always at the end of the sentence and is thus the syntactic head.

As an illustration, an example is shown in Figure 4.2. In the subtree of c4, the verb “看(watch)” is identified as the syntactic head and “不(not)” is its dependent; in the Japanese translation, by contrast, “ない(not)”, the counterpart of “不(not)”, is identified as the syntactic head of the verb. As a result, the alignment between the reordered Chinese sentence and its Japanese translation is not monotonic, as shown in Figure 4.2b.
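What "not monotonic" means here can be made concrete with a small check for crossing alignment links. This is only a sketch: `count_crossings` and `is_monotonic` are hypothetical helper names, the alignment is written by hand as (source index, target index) pairs, and the two-word example is illustrative rather than copied from Figure 4.2.

```python
from itertools import combinations
from typing import List, Tuple

Link = Tuple[int, int]  # (source word index, target word index)


def count_crossings(alignment: List[Link]) -> int:
    """Count pairs of links whose source order and target order disagree."""
    return sum(
        1
        for (s1, t1), (s2, t2) in combinations(alignment, 2)
        if (s1 - s2) * (t1 - t2) < 0
    )


def is_monotonic(alignment: List[Link]) -> bool:
    """An alignment is monotonic when no two links cross."""
    return count_crossings(alignment) == 0


# Illustrative tokens only (an assumption, not the sentence in Figure 4.2).
reordered_zh = ["不", "看"]  # plain HF keeps the dependent 不 before 看
japanese = ["見", "ない"]    # verb stem followed by the negator

hf_alignment = [(0, 1), (1, 0)]       # 不-ない and 看-見 cross each other
desired_alignment = [(0, 0), (1, 1)]  # links if the Chinese order were 看 不

print(is_monotonic(hf_alignment), is_monotonic(desired_alignment))
```

Under these assumptions, the order produced by plain HF gives one crossing link pair, while placing 不 after the verb would give none.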

4.2.3 Sentence-final particle

Sentence-final particles often appear at the end of a sentence to express the speaker’s attitude, e.g. 吧(right?) and 啊(ah) in Chinese, and なぁ and ねぇ in Japanese. Although they occupy the same position in both languages, the differing head definitions mean that they are identified as dependents in Chinese but as the syntactic head in Japanese.

For example, in Figure 4.3, since “啊(ah)” was identified as a dependent, it has been reordered to the beginning of the sentence. However, its Japanese translation “ね” is at the end of the sentence and acts as the syntactic head. As in the previous case, the alignment between the reordered Chinese sentence and its Japanese translation is not monotonic, as shown in Figure 4.3b.

Figure 4.2: An example of the mis-reordering of the adverbial modifier 不(not) when applying HF to Chinese pre-reordering. (a) Original HPSG Chinese parse tree, with its English translation. (b) Wrongly reordered Chinese sentence, with its Japanese translation.

4.2.4 Et cetera

In Chinese, there are two expressions for the meaning of “and other things”, both built from a single Chinese character: 等(etc.) and 等等(etc.). Both are identified as dependents of a noun phrase or verb phrase. In contrast, the Japanese translation of et cetera is など, which is always the head because it appears as the right-most word in a phrase. For instance, consider the verb phrase “包括(include) 苹果(apple) 等(etc.)” in Figure 4.4. Since “等(etc.)” is not the syntactic head in Chinese but its counterpart is in Japanese, HF produces a wrong reordering for the phrase.

Figure 4.3: An example of the mis-reordering of a sentence-final particle when applying HF to Chinese pre-reordering. (a) Original HPSG Chinese parse tree, with its English translation. (b) Wrongly reordered Chinese sentence, with its Japanese translation.

Figure 4.4: An example of the mis-reordering of et cetera when applying HF to Chinese pre-reordering. (a) Original HPSG Chinese parse tree, with its English translation. (b) Wrongly reordered Chinese sentence, with its Japanese translation.

Table 4.1: The list of POS tags for exception reordering rules

Tag   Description
AS    Aspect particle
SP    Sentence-final particle
ETC   et cetera (i.e. 等 and 等等)
IJ    Interjection
PU    Punctuation
CC    Coordinating conjunction

4.2.5 Head finalization for Chinese (HFC)

In the preceding sections, we have discussed syntactic constructions that cause Head Finalization (HF) to be applied incorrectly to Chinese sentences. Following these observations, we propose a method that improves the original Head Finalization pre-reordering rule for Chinese to obtain better alignment with Japanese.

The idea is simple: we define a list of Part-of-Speech (POS) tags⁴ (see Table 4.1) to control the application of HF. In other words, if the POS tag of a leaf node in a branch belongs to the predefined POS tag list, whether HF is applied to the branch depends on the exception rules. For example, in the case of the sentence-final particle in Figure 4.3, the POS tag of “啊(ah)” is SP, which is in Table 4.1; following the analysis in Section 4.2.3, HF is therefore not applied at node c1. That is, the branches c2 and c7 are not swapped.

Table 4.1 also includes interjections. Although we did not discuss interjections in detail, it is obvious that they should not be reordered, because they are position-independent. Moreover, the rules for PU and CC are basically equivalent to the exception rules proposed by Isozaki et al. [1].
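The exception mechanism can be sketched as follows. This is a simplified reading, not the thesis implementation: it assumes binary branches with a marked head child, and it skips the swap whenever either immediate child is a leaf whose POS tag is in Table 4.1, whereas the actual per-tag exception rules may be more specific. The names `Leaf`, `Node`, `EXCEPTION_TAGS`, and `hfc` are hypothetical, and the example tree is made up for illustration rather than taken from Figure 4.3.

```python
from dataclasses import dataclass
from typing import List, Union

# POS tags from Table 4.1 that block the head-final swap (assumed rule).
EXCEPTION_TAGS = frozenset({"AS", "SP", "ETC", "IJ", "PU", "CC"})


@dataclass
class Leaf:
    word: str
    pos: str


@dataclass
class Node:
    left: "Tree"
    right: "Tree"
    head: str  # "left" or "right": which child is the syntactic head


Tree = Union[Leaf, Node]


def is_exception(tree: Tree) -> bool:
    """True for a leaf whose POS tag is on the exception list."""
    return isinstance(tree, Leaf) and tree.pos in EXCEPTION_TAGS


def hfc(tree: Tree, use_exceptions: bool = True) -> List[str]:
    """Head-finalize a tree, optionally leaving exception branches alone."""
    if isinstance(tree, Leaf):
        return [tree.word]
    left = hfc(tree.left, use_exceptions)
    right = hfc(tree.right, use_exceptions)
    blocked = use_exceptions and (
        is_exception(tree.left) or is_exception(tree.right)
    )
    if tree.head == "left" and not blocked:
        return right + left  # ordinary HF swap
    return left + right      # head-final already, or swap blocked


# Hypothetical tree: a verb with an aspect particle, then a final particle.
verb = Node(Leaf("去", "VV"), Leaf("了", "AS"), head="left")
sentence = Node(verb, Leaf("啊", "SP"), head="left")

print(hfc(sentence, use_exceptions=False))  # plain HF
print(hfc(sentence))                        # with the exception list
```

With the exception list disabled, the sketch moves both particles in front of the verb; with it enabled, the original order 去 了 啊 is kept, matching the head-final position of the Japanese counterparts.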

⁴ The definitions of POS tags follow the guideline of the Penn Chinese Treebank v3.0 [92].
