Dependence on Context
in case of English-Japanese Machine Translation I-8
Katsuyuki SHIBATA* (Received May 31, 2008)
英和機械翻訳における文脈依存性Ⅰ-8
柴田 勝征
(平成20年5月31日受理)
* Department of Applied Mathematics, Faculty of Science, Fukuoka University, 8-19-1 Nanakuma, Jonan-ku, Fukuoka 814-0180, Japan
福岡大学理学部応用数学科
Abstract
This is the eighth in a series of articles on context dependency analysis in the English-Japanese machine translation system which we call the “US system”. In this issue we examine examples taken from Lesson 8 of English textbooks for first-year Japanese junior high school students.
The main subjects discussed in this issue are the following: (1) Selection of the Japanese word for ʻtheyʼ in case “They are” is not followed by a noun phrase. (2) Is ʻnoʼ a negative answer or a Japanese art? (3) Is ʻbatʼ an animal or a baseball tool? (4) Selection among “asobu”, “puree-suru”, “wo suru”, “wo hiku” and “wo enjiru” for the verb ʻplayʼ. (5) Adding a counting unit to a numeral. (6) Selection of Japanese for “then” among “sou sure-ba”, “sono toki”, “sorekara”, “sore-de-wa”, “suruto” and “sono koro ni-wa” for the ʻbeʼ verb case. (7) Selection among “sono you-ni”, “sore-hodo”, “totemo”, “you-ni”, “sore-de” and “mo mata” for ʻsoʼ. (8) Selection of Japanese for ʻthemʼ using information about the sentence before the last.
Key words: machine translation, context dependence, context inherited from the preceding sentences, US system.
1.Selection of Japanese word for ʻtheyʼ in case “They are” is not followed by a noun phrase
In the 5th article [1] of this series, we investigated criteria for deciding the Japanese translation of ʻtheyʼ among ʻkareraʼ, ʻkanojo-tachiʼ, ʻhito-bitoʼ and ʻsoreraʼ. Our example sentences then were “They are presents from my aunt. They are her favorite songs.” This time we consider the cases where “They are” is not followed by a noun phrase.
Mrs. Brown: Do you have any falls in Japan?
ブラウン夫人「日本に滝がありますか?」
Buraun fujin “Nihon ni taki ga ari-masu ka?”
Kenji: Yes, we do. They arenʼt so big.
But they are beautiful. ... (1)
健二「はい,あります.それら(それらの滝)はそれほど大きくはありません.しかしそれら(それらの滝)は美しいです.」
Kenji “Hai, ari-masu. Sorera (sorera no taki) wa sore-hodo ookiku-wa ari-masen. Shikashi sorera (sorera no taki) wa utsukushii desu.”
There are two ʻtheyʼs. The choice rules applied to the latter ʻtheyʼ are:
22;彼女;2;T-1<>r;TA<>W;TZ<>W;SO<>W; K630;
22;人びと;2;TZ<>ceghm; K735;
These rules reject “kanojo-tachi” and “hito-bito”, so for the moment the candidates “karera” and “sorera” remain.
In contrast, the Japanese for the former
ʻtheyʼ is immediately determined by the following rule:
22;*それら;2;E0=they;FT;T0=O;SZ<>xF;OZ=xF;OZ<>h;SZ2<>h;OZ2<>h;T1<>u; K145;
This rule verifies that this ʻtheyʼ is at the head of a sentence (FT = FirsT) and hence begins with an uppercase letter (T0=O;), that the attribute set for the subject of the preceding sentence (SZ) does not contain symbols for plural nouns but that the attribute set of the object of the preceding sentence (OZ) does, and that none of the attribute sets for the object of the preceding sentence (OZ) and for the subject and object of the sentence before the last (SZ2, OZ2) contains the human symbol ʻhʼ. These conditions imply that the first ʻtheyʼ refers to the object (ʻfallsʼ) of the preceding sentence, which has no human attribute ʻhʼ.
This determination of the first ʻtheyʼ induces the determination of the latter
ʻtheyʼ by the following choice rule:
22;*それら;2;E0<>those;T0<>o;SO<>h;PLS;P 1JT=2X;F2E=E0;F2T<>h; K560;
This rule verifies that the pronoun in question is not ʻthoseʼ, that the attribute sets of the subject and object of the preceding sentence and of the sentence before the last do not contain ʻhʼ (=human), that it is preceded by another sentence (PLS) which contains the same pronoun, and that the pronoun is plural and without the human attribute ʻhʼ.
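The attribute tests in rules such as K145 can be read as membership checks on small sets of symbols. The following Python sketch is only an illustration under assumptions: it reads `X=abc` as "attribute set X contains at least one of the symbols a, b, c" and `X<>abc` as "X contains none of them", which is consistent with the explanations above. The function name `holds` and the concrete attribute values in the usage lines are hypothetical and are not taken from the US system.

```python
import re

# Assumed reading of an attribute condition, based on the prose explanations:
#   "OZ=xF"  -> attribute set OZ contains at least one of the symbols x, F
#   "OZ<>h"  -> attribute set OZ contains none of the listed symbols
COND = re.compile(r"^([A-Z]+\d?)(<>|=)(\w+)$")

def holds(cond: str, ctx: dict) -> bool:
    """Evaluate one attribute condition against a context of attribute sets."""
    m = COND.match(cond)
    if m is None:
        raise ValueError(f"unsupported condition: {cond}")
    name, op, symbols = m.groups()
    present = any(s in ctx.get(name, "") for s in symbols)
    return present if op == "=" else not present

# Hypothetical attribute values for "Do you have any falls in Japan?":
# the subject is human and not marked plural, the object is marked plural.
ctx = {"SZ": "h", "OZ": "x", "SZ2": "", "OZ2": ""}
k145_attribute_part = ["SZ<>xF", "OZ=xF", "OZ<>h", "SZ2<>h", "OZ2<>h"]
print(all(holds(c, ctx) for c in k145_attribute_part))  # True
```

Only the attribute conditions of K145 are covered here; word-level conditions such as `E0=they` or `FT` would need a separate check on the English sentence itself.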
2.Is ʻnoʼ a negative answer or a Japanese art?
Recently we added the noun ʻnoʼ, a traditional Japanese art, to our system dictionary. As a consequence, we are obliged to distinguish between the noun ʻnoʼ and the negative adverb ʻnoʼ. For example, the second English sentence below begins with ʻNoʼ.
Tom: Is that ʻThank youʼ too?
トム「あれ(それ)も ʻありがとう'ですか?」
Tomu “Are (sore) mo ʻarigatouʼ desu ka?”
Kumi: No, it is not. It is ʻantʼ in English. ... (2)
久美「いいえ,ちがいます.それは英語で ʻ蟻'です.」
Kumi “Iie, chigai-masu. Sore wa Eigo de ʻariʼ desu.”
The choice rule for ʻnoʼ in (2) above is:
16;*6;0;E0=no;FT;LT;T0=O;E1=,;BS=*?; G865;
This rule requires that the ʻnoʼ in question be at the head of a sentence and begin with an uppercase letter, that it be followed by a comma, and finally that the preceding English sentence (BS) contain a question mark. When all these conditions are satisfied, we reject the noun candidate for ʻnoʼ and accept the adverb candidates.
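The conditions described for rule G865 amount to a simple test on two adjacent sentences. The Python sketch below is a hypothetical restatement of those conditions, not the US system's code. The function name is invented for illustration, and the `LT` condition, which the text does not explain, is left out.

```python
def no_is_negative_answer(sentence: str, previous_sentence: str) -> bool:
    """Return True when a sentence-initial 'No' should be read as the adverb."""
    # E0=no; FT; T0=O; E1=,  -> capitalized 'No' at the head, followed by a comma
    starts_with_no_comma = sentence.startswith("No,")
    # BS=*?  -> the preceding English sentence contains a question mark
    answers_a_question = "?" in previous_sentence
    return starts_with_no_comma and answers_a_question

print(no_is_negative_answer("No, it is not.", "Is that 'Thank you' too?"))  # True
print(no_is_negative_answer("No is a traditional art.", "I like Japan."))   # False
```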
3.Is ʻbatʼ an animal or a baseball tool?
The noun ʻbatʼ has two very distinct meanings, namely, an animal and a tool for hitting a ball. In our machine translation system, attribute ʻaʼ corresponds to ʻanimalʼ and ʻmʼ to “medical, health or sports”. Thus, by investigating the attribute sets (SO) of the subjects and objects of the preceding sentence and the sentence before the last, we can choose the contextually better translation for ʻbatʼ.
I like baseball.
私は野球が好きです.
Watashi wa yakyuu ga suki-desu.
I have two bats. ... (3)
私は2本のバットを持っています.
Watashi wa ni hon no batto wo motte-i-masu.
The choice rule for ʻbatʼ in (3) above is the following:
11;*バット;2;E0=bat;SO=m;SO<>a; 0740;
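As a rough illustration of the two attribute conditions in this rule (`SO=m` and `SO<>a`), the following Python sketch chooses between the two senses of ʻbatʼ. It is written under assumptions: the function name and the attribute string passed in are hypothetical, and the fallback to the animal sense (コウモリ) when the rule does not fire is a simplification, since the source does not show which rules handle that case.

```python
def translate_bat(context_attributes: str) -> str:
    """Pick a Japanese word for 'bat' from the inherited attribute set SO."""
    # SO=m; SO<>a : a sports attribute is present and no animal attribute is
    if "m" in context_attributes and "a" not in context_attributes:
        return "バット"
    # Simplified fallback for this sketch only: the animal sense
    return "コウモリ"

# Hypothetical SO after "I like baseball.": 'h' for the subject, 'm' for 'baseball'
print(translate_bat("hm"))  # バット
```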
4.Selection among “asobu”, “puree-suru”, “wo suru”, “wo hiku” and “wo enjiru” for verb ʻplayʼ
The English verb ʻplayʼ has a very wide range of usage, and it is often difficult to choose the most appropriate Japanese for it.
The Japanese verb “wo hiku” for ʻplayʼ is used only for musical instruments.
He has two rackets.
彼は2本のラケットを持っています.
Kare wa ni hon no raketto wo motte-i-masu.
He usually plays tennis at school. ... (4)
彼はたいてい学校でテニスをします.
Kare wa taitei gakkou de tenisu wo shi-masu.
The choice rule for rejecting “wo hiku” for the ʻplayʼ in (4) above is:
33;を弾く;2;E0=play;E1<>the;T1<>d;J1<>6790;EZ<>guitar;EZ<>music;VR<>弾く; S885;
Before rejecting the musical usage, the rule checks, among other things, that the ʻplayʼ in question is not followed by the definite article ʻtheʼ. This is based on the fact that we say “play the guitar”, “play the piano” or “play the violin”, whereas we say “play baseball”, “play tennis”, “play soccer” and so on.
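The definite-article test can be pictured with a few lines of Python. This is a minimal sketch of the single condition `E1<>the` only; the other conditions of rule S885 are not modelled, and the function name and the word-list representation are assumptions made for illustration.

```python
def rejects_wo_hiku(words: list[str], play_index: int) -> bool:
    """True when the word right after 'play' is not the definite article."""
    following = words[play_index + 1 : play_index + 2]  # empty at sentence end
    return following != ["the"]

tennis = "He usually plays tennis at school".split()
guitar = "She plays the guitar".split()
print(rejects_wo_hiku(tennis, 2))  # True
print(rejects_wo_hiku(guitar, 1))  # False
```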
5.Add a counting unit to a numeral.
In Asian languages such as Japanese, Chinese and Korean, we use counting units for counting concrete things like humans, animals, books, houses, fruits, vegetables, machines, etc., while European languages such as English, French, German and Russian do not attach counting units to numerals when counting things.
Consequently, we have to add a counting unit to numerals when we translate an English sentence into Japanese. When the numeral is directly followed by a countable noun, the attribute information of the noun registered in the system dictionary gives the appropriate counting unit for the noun.
But sometimes the noun phrase at the end of a sentence is omitted and the sentence ends with a numeral. See the following example.
Ken: How many rackets does she have?
健「彼女は何本のラケットを持っていますか?」
Ken “Kanojo wa nan bon no raketto wo motte-i-masu ka?”
Mike: She has only one. It is old. She wants a new racket. ... (5)
マイク「彼女はたった1本持っています.それ(その1本)は古いです.彼女は新しいラケットが欲しいと思っています.」
Maiku “Kanojo wa tatta ippon motte-i-masu. Sore (sono ippon) wa furui desu. Kanojo wa atarashii raketto ga hoshii to omotte-i-masu.”
One of the generation rules for constructing the Japanese translation for example (5) above is:
5;361;3;1;LT;T2=n;DJ3=.;DJ4=それ(;DJ1<>>;T2<>ty;J0#-格;J0#VR;T2#+OZ$;T2#KAZ1;J2#KAZ2;J2#-の;J2#OJ;J4#-( );J4#+(その;J4#+J2;J4#+);J1<->J2; E070;
This rule connects “3(verb)wo motte-iru”, “6(adverb)tatta” and “1(noun)ichi”. The command “T2#+OZ$” appends the attribute set (OZ) of the object (ʻracketʼ) of the preceding sentence to that of the second component of the phrase (ʻoneʼ). The command “T2#KAZ1” picks up the counting unit ʻhonʼ from the appended information, and the command “J2#KAZ2” attaches that ʻhonʼ to the numeral “ichi”, turning it into “ippon”.
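The idea of inheriting the counting unit from the object of the preceding sentence can be sketched as follows. This Python fragment is a hypothetical illustration, not a model of the commands `T2#+OZ$`, `T2#KAZ1` and `J2#KAZ2` themselves. The dictionary `NOUN_COUNTER`, the table `HON_READINGS` (limited to the numerals 1 to 3) and the function name are all assumptions.

```python
from typing import Optional

# Hypothetical dictionary entries: the counting unit is stored with the noun.
NOUN_COUNTER = {"racket": "hon", "bat": "hon"}

# Readings of numeral + "hon" with the usual sound changes (1 to 3 only).
HON_READINGS = {1: "ippon", 2: "nihon", 3: "sanbon"}

def count_phrase(number: int, noun: Optional[str], previous_object: str) -> str:
    """Attach a counting unit, inheriting the noun from context when omitted."""
    counted_noun = noun if noun is not None else previous_object
    counter = NOUN_COUNTER[counted_noun]
    if counter == "hon":
        return HON_READINGS[number]
    raise NotImplementedError(f"counter not covered in this sketch: {counter}")

# "She has only one." after "How many rackets does she have?"
print(count_phrase(1, None, "racket"))  # ippon
# "I have two bats."
print(count_phrase(2, "bat", "baseball"))  # nihon
```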
6.Selection of Japanese for “then” among “sou sure-ba”, “sono toki”, “sorekara”, “sore-de-wa”, “suruto” and “sono koro ni-wa” for ʻbeʼ verb case.
In [1] of this series, we showed that the English adverb ʻthenʼ has the corresponding Japanese expressions “sou sure-ba”, “sono toki”, “sorekara”, “sore-de-wa”, “suruto” and “sono koro ni-wa” registered in our system dictionary. The preceding sentence in that example was “You do not have friends.” This time we present the case where the preceding sentence has the form “ ... is not ...”.
Tom: Oh, your new watch is pretty.
トム「ああ,あなたの新しい時計はきれいです.」
Tomu “Aa, anata no atarashii tokei wa kirei desu.”
Kumi: Thanks. But it isnʼt mine.
久美「ありがとう.しかしそれ(その新しい時計)は私のものではありません.」
Kumi “Arigatou. Shikashi sore (sono atarashii tokei) wa watashi no mono de-wa ari-masen.”
Tom: Then, whose watch is it? ... (6)
トム「それでは,それ(その新しい時計)は誰の時計ですか?」
Tomu “Sore de-wa, sore (sono atarashii tokei) wa dare no tokei desu ka?”
In the process of choosing the best candidate for ʻthenʼ in (6) above, the following two choice rules are applied:
66;そうすれば;2;E0=then;PJS<>*なさい」; l990;
66;*それでは;2;FT;E0=then;T0=O;E1=,;T2=q;BS=* isnʼt ; m335;
The former rule is explained in [1]. The latter requires that the ʻthenʼ in question be at the head of a sentence and begin with an uppercase letter, that it be followed by a comma (E1=,;), that the comma be followed by an interrogative (T2=q;), and finally that the preceding sentence (BS) contain “ isnʼt”. This ʻthenʼ is therefore a response to a negative statement and starts an interrogative sentence.
7.Selection among “sono you-ni”, “sore-hodo”, “totemo”, “you-ni”, “sore-de” and “mo mata” for ʻsoʼ.
As with ʻthenʼ, the adverb (and conjunction) ʻsoʼ has various corresponding Japanese words.
Tom: I like it.
トム「私はそれが好きです.」
Tomu “Watashi wa sore ga suki-desu.”
Kumi: Me too. It shows the time all over the world.
久美「私もです.それは世界中の時刻を示しています.」
Kumi “Watashi mo desu. Sore wa sekai-juu no jikoku wo shimeshite-i-masu.”
Tom: So you know the time in New York. ... (7)
トム「それであなたはニューヨークの時刻を知っているでしょう.」
Tomu “Sore-de anata wa Nyuu-yooku no jikoku wo shitte-iru deshou.”
The choice is usually very delicate and difficult to make. In example (7) above, the choice rule is the following:
69;*それで;2;E0=so;FT;T0=O;E1=you know;BS=*It shows; o950;
The ʻsoʼ in question is at the head of the sentence, begins with an uppercase letter and is followed by “you know”. Further, the preceding sentence (BS) contains “It shows”. Therefore this ʻsoʼ starts a sentence which describes the result of what is shown.
8.Selection of Japanese for ʻthemʼ using information about the sentence before the last.
Tom: Next week we visit people in the nursing home.
トム「来週私達は老人ホームに人びとを訪れます.」
Tomu “Raishuu watashitachi wa roujin hoomu ni hito-bito wo otozure-masu.”
Kumi: Right. I remember now.
久美「そのとおりです.今思い出しました.」
Kumi “Sono toori desu. Ima omoi-dashi-mashita.”
Tom: We sing for them. ... (8)
トム「私達は彼等の為に歌います.」
Tomu “Watashitachi wa karera no tame ni utai-masu.”
In [1] of this series of articles, we explained that the English word ʻthemʼ has four corresponding Japanese words, namely, “karera” (males), “kanojo-tachi” (females), “hito-bito” (people in general) and “sorera” (inanimate things). In example (8) above, the following choice rules are applied:
22;*彼;2;E0=them;E-1<>sell;E-3<>,;OZ=h;OZ=FxX;VZ=u; J010;
22;彼女;2;E0=them;SZ<>xFX;OZ<>W;TA<>W; J310;
The former rule checks the value of the variable OZ, which contains the attributes of the object of the preceding sentence. But the verb ʻrememberʼ in the preceding sentence has no object, and consequently the value of OZ is unchanged from that of the sentence before the last, namely, the attributes “hXz” of ʻpeopleʼ. Because of this, the former rule picks up “karera” and “kanojo-tachi” as adequate candidates. Then the latter rule verifies that there is no trace of the woman attribute ʻWʼ anywhere preceding the current sentence, and thus rejects “kanojo-tachi”.
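The carry-over of OZ across an objectless sentence can be illustrated with a short Python sketch. It is written under assumptions and does not reproduce rules J010 and J310 in full: only the conditions `OZ=h` and `OZ<>W` are modelled, and the function name `update_oz` is hypothetical.

```python
from typing import Optional

def update_oz(oz: str, object_attributes: Optional[str]) -> str:
    """Keep the previous OZ when the new sentence has no object."""
    return oz if object_attributes is None else object_attributes

oz = ""
oz = update_oz(oz, "hXz")  # "Next week we visit people ...": object 'people'
oz = update_oz(oz, None)   # "I remember now.": no object, OZ is inherited

# OZ=h (part of J010): a human object in context admits both candidates
candidates = ["karera", "kanojo-tachi"] if "h" in oz else []
# OZ<>W (part of J310): no woman attribute in context, reject "kanojo-tachi"
if "W" not in oz and "kanojo-tachi" in candidates:
    candidates.remove("kanojo-tachi")

print(oz, candidates)  # hXz ['karera']
```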
Reference
[1] Katsuyuki SHIBATA: Dependence on Context in case of English-Japanese Machine Translation I-5. Fukuoka University Science Reports, vol.37, No.1, pp.93-103, 2007.
http://www1.rsp.fukuoka-u.ac.jp/cho- sho/Cntxt1-5.html