
JTF Tokyo MH: Thinking About Language Technology


(1)

standards to work… well

Manuel Herranz – PangeaMT - Pangeanic

www.pangea.com.mt

(2)

Unmanageable amounts of data? The data deluge

As of May 2009: 487 billion gigabytes, or

1,000,000,000 x 487,000,000,000 = 4.87 x 10^20 bytes

Estimates

Up 50% a year (Oracle)

Doubles every 11 hours (IBM)
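
The byte count above can be checked in a few lines. This is only a sketch: it assumes decimal gigabytes (10^9 bytes) and treats the 50% yearly estimate as a constant compound rate, which the slide does not state. The function names are illustrative.

```python
GIGABYTE = 10**9  # bytes in a decimal gigabyte (assumption)


def total_bytes(gigabytes: int) -> int:
    """Convert a volume in gigabytes to bytes."""
    return gigabytes * GIGABYTE


def projected(volume: float, annual_growth: float, years: int) -> float:
    """Compound a volume at a fixed yearly growth rate (assumed constant)."""
    return volume * (1 + annual_growth) ** years


baseline = total_bytes(487 * 10**9)  # 487 billion gigabytes
print(f"May 2009 estimate: {baseline:.2e} bytes")
print(f"After 3 years at +50%/year: {projected(baseline, 0.5, 3):.2e} bytes")
```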

Language translation as a job is becoming unmanageable: increasing demands, increasing volumes, shorter deadlines. Human production is not sufficient.

(3)

Short history

Pangeanic: LSP. Major clients in Asia, European localization, increasing number of languages and volumes

Need to produce faster, cheaper, quality

Experimenting with some rule-based (RB) systems

TAUS & TDA founding members (M's of words!)

Partnering with Valencia's Computer Science Institute (R&D and EU projects: Casacuberta, Och, Vidal, Koehn)

(4)

Short history

CHALLENGE: Turn academic development (Moses) into commercial application.

Limitations: plain text (txt), language model building (first), no reordering, no updating features (always re-start), data availability, Linux-based (server). You need computational linguists (programmers), not translators, to operate it.

Partnering with Valencia's Computer Science Institute, PangeMatic (v1) was developed, and then PangeaMT 2009 (web-based).

(5)

Short history

OBJECTIVES:

1. To provide HQ MT for Post-Editing and save time and cost.

2. To use only community-based open standards (OASIS / ISO: XLIFF / TMX, XML). NO proprietary formats (technology independence), so clients are not “locked in” to buying and updating expensive software.

3. To automate as many processes as possible.
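
To illustrate what objective 2 means in practice: because TMX is plain XML, a translation memory can be read with nothing but a standard library. The sketch below uses Python's `xml.etree.ElementTree`. The sample file content and the `read_pairs` helper are invented for illustration and are not taken from PangeaMT.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical one-unit TMX file; the sentences are made up.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Press the start button.</seg></tuv>
      <tuv xml:lang="es"><seg>Pulse el botón de inicio.</seg></tuv>
    </tu>
  </body>
</tmx>
"""


def read_pairs(tmx_text, src, tgt):
    """Return (source, target) segment pairs for the two language codes."""
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segments = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested inside inline tags.
                segments[tuv.get(XML_LANG)] = "".join(seg.itertext())
        if src in segments and tgt in segments:
            pairs.append((segments[src], segments[tgt]))
    return pairs


print(read_pairs(SAMPLE_TMX, "en", "es"))
```

Translation units that lack either language are skipped, so the same file can serve several language pairs.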

(6)

Short history - Implementations

Plus many other internal engines for ...

* Large Japanese car manufacturing firm

* Electronics firms

* Technical / Engineering

(7)

How PangeaMT works

Use an open-standards browser: Mozilla, Safari

(8)

How PangeaMT works

(9)

How PangeaMT works

Users get an email with the translation minutes later

(10)

Post-editing

(11)

Future Work

- “on the fly” MT training (minutes, not manually)

- modular data sets of CLEAN DATA to “pick & match” SMT training

- confidence scores for users (→ translators or readers) with CAT integration (web-based / desktop)

- Web interface: mobile, OCR, on-the-spot translation

(12)

Thank you!

QUESTIONS?

[email protected]
