
JAIST Repository: A study on Hierarchical Table of Indexes for Multi-documents


Academic year: 2021


Japan Advanced Institute of Science and Technology

JAIST Repository

https://dspace.jaist.ac.jp/

Title: A study on Hierarchical Table of Indexes for Multi-documents
Author(s): LE, Tho Thi Ngoc
Issue Date: 2012-09
Type: Thesis or Dissertation
Text version: author
URL: http://hdl.handle.net/10119/10752
Description: Supervisor: Professor Akira Shimazu, School of Information Science, Master's

A study on Hierarchical Table of Indexes for Multi-documents

LE Thi Ngoc Tho (1010226) School of Information Science,

Japan Advanced Institute of Science and Technology

August 09, 2012

Keywords: hierarchical summary, table of indexes, keyphrase extraction, clustering, unsupervised, graph-based ranking.

Information now grows exponentially, and keeping up with it is time-consuming, especially for busy readers. Natural language processing therefore tries to help people get the news quickly by summarizing text automatically. This work has progressed from single-document to multi-document summarization and covers genres such as news and meeting transcripts.

A summary of a document or a collection of documents is a condensed representation of the main ideas of the content, and it clearly helps readers grasp the general ideas of the documents. However, presenting the summary as running text can be inconvenient. In particular, the summary of a very long document or a collection of documents may still be too long to read, and non-native readers may not be familiar with different writing styles. Even when readers can take in every idea, they must work out the structural organization of those ideas by themselves.

In this thesis, we also take the organization of the main ideas into account when summarizing multiple documents. To do so, we generate a tree-based structure called a hierarchical table of indexes. A hierarchically structured table of indexes helps readers understand both the content and its semantic structure. It also gives readers a means of navigation, so they can quickly jump to the information that interests them.

Copyright © 2012 by LE Thi Ngoc Tho

To create the hierarchical table of indexes automatically, we propose an unsupervised framework in which an unsupervised clustering algorithm builds the hierarchical structure and a graph-based ranking method extracts keyphrases to form the indexes. We run experiments on both English and Japanese text as a contribution to Legal Engineering. A preliminary summary is provided to illustrate the approach, and in our evaluation, searching for information in the hierarchical summary performed better than searching the original plain documents.
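The two stages of the framework can be illustrated with a minimal sketch. The abstract does not name the specific algorithms, so everything here is an assumption made for illustration: a TextRank-style PageRank over a word co-occurrence graph stands in for the graph-based ranking, and bottom-up merging of clusters by Jaccard overlap of their vocabularies stands in for the unsupervised clustering. The function names (`rank_words`, `build_tree`, `index_lines`), the stopword list, and all parameter values are hypothetical, and the sketch ranks single words rather than full keyphrases.

```python
from collections import defaultdict
from itertools import combinations

# Tiny illustrative stopword list (an assumption, not from the thesis).
STOPWORDS = {"a", "an", "and", "by", "for", "in", "is", "of", "on", "the", "to"}


def tokenize(text):
    """Lowercase, strip surrounding punctuation, drop stopwords."""
    words = (w.strip(".,;:()").lower() for w in text.split())
    return [w for w in words if w and w not in STOPWORDS]


def rank_words(token_lists, window=2, damping=0.85, iterations=30):
    """Score words with PageRank-style updates on a co-occurrence graph."""
    neighbors = defaultdict(set)
    for tokens in token_lists:
        for i, word in enumerate(tokens):
            # Link each word to the next `window` words in the same text.
            for other in tokens[i + 1 : i + 1 + window]:
                if other != word:
                    neighbors[word].add(other)
                    neighbors[other].add(word)
    scores = {w: 1.0 for w in neighbors}
    for _ in range(iterations):
        scores = {
            w: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[w])
            for w in neighbors
        }
    return scores


def top_keywords(texts, k=3):
    """Return the k highest-scoring words (ties broken alphabetically)."""
    scores = rank_words([tokenize(t) for t in texts])
    return sorted(scores, key=lambda w: (-scores[w], w))[:k]


def vocabulary(cluster):
    """Set of all tokens in a cluster's documents."""
    return {w for text in cluster["docs"] for w in tokenize(text)}


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_tree(docs):
    """Bottom-up clustering: repeatedly merge the two most similar clusters."""
    if not docs:
        raise ValueError("need at least one document")
    clusters = [{"docs": [d], "children": []} for d in docs]
    while len(clusters) > 1:
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: jaccard(vocabulary(clusters[p[0]]), vocabulary(clusters[p[1]])),
        )
        merged = {
            "docs": clusters[i]["docs"] + clusters[j]["docs"],
            "children": [clusters[i], clusters[j]],
        }
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)] + [merged]
    return clusters[0]


def index_lines(node, depth=0, k=3):
    """Render the tree as indented index entries labeled by top keywords."""
    lines = ["  " * depth + ", ".join(top_keywords(node["docs"], k))]
    for child in node["children"]:
        lines.extend(index_lines(child, depth + 1, k))
    return lines


if __name__ == "__main__":
    sample = [
        "contract law governs contract terms",
        "contract law and contract breach",
        "graph ranking extracts keyphrases",
        "graph clustering groups keyphrases",
    ]
    print("\n".join(index_lines(build_tree(sample))))
```

Each node of the resulting tree is labeled with the top-ranked words of the documents beneath it, which gives the indented, navigable outline the abstract describes. A real system would need language-specific tokenization (particularly for Japanese) and phrase-level candidates instead of single words.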
