Catalog-Search
∗
Corpus Search (http://corpus-search.nii.ac.jp/)
Catalog-Search Corpus Search
1
LDC(Linguistic Data Consortium)[1] ELRA(European Language Resources Association)[2] NII SRC ( )[3] Catalog-Search Corpus Search[4] Corpus Search Catalog Search
Copyright(C) 2011 The Association for Natural Language Processing. All Rights Reserved. ― 721 ―
Proceedings of the 17th Annual Meeting of the Association for Natural Language Processing (March 2011)
2
Corpus Search ( ) 2 [5]
Figure 1 Corpus map [5]
Corpus Search
2 ( 1 )
3 Catalog-Search
3.1
[4]
( )
Table 1 Corpus attributes
Attributes      Items   Contents
Sources         7       Recording devices
Environment     4       Recording environment
Speakers        12      Numbers of speakers
Quantity        7       Quantity of data
Style           4       Speech style
Mode            5       Speech mode
Sampling Rate   3       Sampling rate
Data            9       Miscellaneous data
Languages       4       Languages of data
Purpose         11      Purpose of construction
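The attribute scheme in Table 1 can be illustrated with a minimal Python sketch. Only the attribute names and item counts come from Table 1, and "mixed-language" is the one item label mentioned in this paper. The corpus names, the other item labels, the function names, and the matching rule (a corpus matches when, for every attribute with checked items, it has at least one of them) are assumptions made for illustration, not the behaviour of the actual Catalog-Search system.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Attribute names and item counts copied from Table 1.
ATTRIBUTE_ITEM_COUNTS = {
    "Sources": 7,
    "Environment": 4,
    "Speakers": 12,
    "Quantity": 7,
    "Style": 4,
    "Mode": 5,
    "Sampling Rate": 3,
    "Data": 9,
    "Languages": 4,
    "Purpose": 11,
}


@dataclass
class CorpusEntry:
    """One catalog record: attribute name -> set of item labels."""
    name: str
    items: dict[str, set[str]] = field(default_factory=dict)


def matches(entry: CorpusEntry, checked: dict[str, set[str]]) -> bool:
    # Assumed rule: OR within an attribute, AND across attributes.
    return all(
        entry.items.get(attr, set()) & wanted
        for attr, wanted in checked.items()
        if wanted
    )


def search(catalog: list[CorpusEntry], checked: dict[str, set[str]]) -> list[str]:
    # Reject attribute names that are not listed in Table 1.
    unknown = set(checked) - set(ATTRIBUTE_ITEM_COUNTS)
    if unknown:
        raise KeyError(f"unknown attributes: {sorted(unknown)}")
    return [entry.name for entry in catalog if matches(entry, checked)]


# Hypothetical catalog; only "mixed-language" is an item label taken from the paper.
SAMPLE_CATALOG = [
    CorpusEntry("corpus_a", {"Languages": {"Japanese"}, "Style": {"read"}}),
    CorpusEntry("corpus_b", {"Languages": {"mixed-language"}, "Style": {"spontaneous"}}),
]

if __name__ == "__main__":
    print(sum(ATTRIBUTE_ITEM_COUNTS.values()))  # total number of checkable items
    print(search(SAMPLE_CATALOG, {"Languages": {"mixed-language"}}))
```

Storing each attribute as a set of item labels keeps the check for a ticked item a simple set intersection, which is one plausible way to model a checkbox-style search form.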
NII SRC 6 76
3.2
( 2 ) 3
Figure 2: Flowchart of Catalog-Search
Figure 3: An example of checking items
4
Figure 4: An example of output results
3.3
[6] Corpus Search Corpus Search Corpus Search [7] Catalog-Search
4
Catalog-Search “Languages” mixed-language “Data”
8
5
[6] (1) (2) (3) Corpus Search
[1] Linguistic Data Consortium (LDC), http://www.ldc.upenn.edu/
[2] European Language Resources Association (ELRA), http://www.elra.info/
[3] , , , “ ”, ( ), 3-Q-10, 395-396, 2008.
[4] , , , , “ ”, ( ), 2-P-6, 457-458, 2009.
[5] , , , , , “ ”, ( ), 3-P-33, 441-442, 2009.
[6] [ ] http://research.nii.ac.jp/src/
[7] , , , , , , “ ”, ( ), 2011( )
[8] Itahashi, Yamakawa, Matsui, Ishimoto, “A proposal for standardizing catalogue specifications of speech corpora”, Proc. Oriental COCOSDA Workshop 2010, 2010.
[9] , , , “ ”, ( ), 1-P-20, 447-448, 2007.
∗ Construction and application of the search system of speech corpora – Catalog-Search, by KIKUCHI Hideaki, Raymond SHEN (Waseda University)