
Chapter 6 Conclusion

6.1 Summary

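As e-testing becomes more widespread, the significance of local independence indices is expected to grow. Compared with conventional local independence tests, the LCI test proposed in this thesis has two advantages: its Type I error rate is no higher than that of the conventional methods at commonly used significance levels, and its power is at least that of the conventional methods at the same Type I error rate. It is therefore hoped that the LCI index will come to be implemented as a local independence index in the item bank databases of e-testing systems.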

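This thesis showed that the Type I error rate and the power of the LCI test depend on the choice of threshold; a method for choosing an appropriate threshold remains a topic for future work. In addition, ILS analysis assumes that the latent ability variable is unidimensional, but the results of the ILS analysis suggested that latent ability variables of more than one dimension may influence real tests. To analyze real tests more appropriately, the LCI test and ILS analysis therefore need to be extended to the case of a multidimensional latent ability variable.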

Appendix A Development of the LCI Test Software "LCItest.exe"

A.1 Development Overview

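Chapter 3 proposed the LCI test, which resolves the problem of conventional local independence tests and correctly tests the local independence of the target items regardless of the local independence relations among the items other than those under test. As part of this thesis, a Windows application, "LCItest.exe", was also developed so that the LCI test can be run easily. This appendix explains how to use LCItest.exe.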

The LCItest.exe executable and its Java source code are available at the following website:

http://homepage2.nifty.com/hashimoto-t/lcitest/

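All of the source code of LCItest.exe is written in Java. Because development was carried out on a computer running Windows Vista, the source files are encoded in Shift JIS and use CR+LF line endings. The code was compiled with Java SE 6, using JDK 1.6.0 Update 26. JSmooth 0.9.9-7 was used to wrap the resulting archive (JAR file) in an executable (EXE file).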

A.2 Usage

A.2.1 Requirements

LCItest.exe runs on computers with Windows installed. So far, it has been confirmed to work on Windows XP (Professional), Windows Vista (Home Basic, Business), and Windows 7 (Home Premium).

The Java Runtime Environment (JRE), version 6 or later, must be installed to run it.

A.2.2 Data Format

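Data must be prepared as a plain text file. Only ASCII characters may be used; full-width characters such as kanji must not appear.

Each line corresponds to one examinee. The data of several examinees cannot be recorded on one line, nor can the data of one examinee be spread over several lines.

Responses are recorded without gaps, with a correct answer coded as a half-width 1, an incorrect answer as a half-width 0, and a missing value as a half-width period (.). Separators such as spaces, tabs, or commas must not be used. The data need not start in column 1, but responses to the same item must be recorded in the same column on every line.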

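Figure A.1 shows an example data file. In this file, the data for 12 items are recorded starting at column 7. As long as the data are recorded without gaps from the same starting position on every line, the rest of each line may contain anything, provided it consists of ASCII characters. As in this file, an examinee ID may be placed at the beginning of each line.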
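The format rules above can be illustrated with a small reader. This is only a sketch in Python, not the Java code of LCItest.exe: the function name `parse_responses`, its parameters, the decision to skip empty lines, and the error messages are all assumptions made for illustration. It takes a 1-based start column and an optional item count, and maps 1, 0, and the period to correct, incorrect, and missing.

```python
def parse_responses(lines, start_position, number_of_items=None):
    """Parse fixed-column response records (hypothetical helper).

    start_position is 1-based. If number_of_items is None, each line is
    read from start_position to the end of the line.
    Returns one list per examinee containing 1, 0, or None (missing).
    """
    codes = {"1": 1, "0": 0, ".": None}
    rows = []
    for line_no, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")  # tolerate CR+LF line endings
        if not line:
            continue  # assumption: empty lines are ignored
        if not line.isascii():
            raise ValueError(f"line {line_no}: non-ASCII character found")
        begin = start_position - 1  # convert to a 0-based index
        end = len(line) if number_of_items is None else begin + number_of_items
        field = line[begin:end]
        if number_of_items is not None and len(field) != number_of_items:
            raise ValueError(
                f"line {line_no}: expected {number_of_items} responses"
            )
        bad = [ch for ch in field if ch not in codes]
        if bad:
            raise ValueError(f"line {line_no}: invalid code {bad[0]!r}")
        rows.append([codes[ch] for ch in field])
    return rows


# The first three records of Figure A.1: a 5-digit ID, a space,
# then 12 responses starting at column 7.
sample = [
    "00001 1111....1110",
    "00002 11101100....",
    "00003 ....10001111",
]
rows = parse_responses(sample, start_position=7, number_of_items=12)
print(rows[0])
```

Because the start column is given explicitly, the examinee ID and any other leading text are simply never read, which matches the rule that the rest of the line is free-form.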

A.2.3 Operation

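Double-clicking LCItest.exe opens the window shown in Figure A.2.

Clicking the Open button at the upper right of the window lets you select a data file.

In Start Position, enter the position at which the item responses begin. Column numbers are counted from 1.

In Number of Items, enter the total number of items in the test. Note that this count includes items that are not under test. If the field is left blank, each line is read automatically up to its end.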

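Analyzed Items is used when the LCI test is to be run only between specific items. To run it between all items, enter 1 in both From fields and leave both To fields blank. Values are specified as item numbers, not as column positions in the file. Item numbers are counted from 1. For example, to run the test between the three items 1 to 3 and the four items 7 to 10, enter 1 in From and 3 in To under Analyzed Items (Column), and 7 in From and 10 in To under Analyzed Items (Row).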
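The effect of the two From/To ranges can be sketched as follows. This Python fragment is an illustration only; the function names, the treatment of a blank To as "up to the last item", and the choice to report each unordered pair once and to skip pairing an item with itself are assumptions, not documented behavior of LCItest.exe.

```python
def item_range(first, last, number_of_items):
    """1-based inclusive item range; last=None stands for a blank To field."""
    if last is None:
        last = number_of_items
    if not 1 <= first <= last <= number_of_items:
        raise ValueError("need 1 <= From <= To <= number of items")
    return range(first, last + 1)


def analyzed_pairs(col_from, col_to, row_from, row_to, number_of_items):
    """List the item pairs selected by the Column and Row ranges."""
    cols = item_range(col_from, col_to, number_of_items)
    rows = item_range(row_from, row_to, number_of_items)
    # Assumption: pairs are unordered and an item is never paired with itself.
    pairs = {(min(c, r), max(c, r)) for c in cols for r in rows if c != r}
    return sorted(pairs)


# Items 1-3 against items 7-10 in a 12-item test: 3 x 4 pairs.
pairs = analyzed_pairs(1, 3, 7, 10, number_of_items=12)
print(len(pairs), pairs[0], pairs[-1])

# From = 1 and To blank in both ranges: every pair of the 12 items.
all_pairs = analyzed_pairs(1, None, 1, None, number_of_items=12)
print(len(all_pairs))
```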

In Threshold, enter the threshold of the LCI test.

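The file specified in Output File (Values) receives the values of the LCI test statistic in comma-separated text format. Item names are written as "ITEM" followed by a number. The file specified in Output File (Edges) receives the items found to be locally dependent (Figure A.6).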

00001 1111....1110
00002 11101100....
00003 ....10001111
00004 10001111....
00005 0000....0000
...

Figure A.1: Example of a data file

Figure A.2: The LCItest.exe window

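Clicking the Start button at the middle left of the window starts the LCI test. Once the test has finished for all item pairs, a dialog box with the message Calculation completed. (Figure A.4) is displayed, and the results are written to the files specified under Output Files (Figures A.5 and A.6).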

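Figure A.3: Example of entered values

Figure A.4: Screen shown when the calculation has finished

Figure A.5: Example output of the LCI test statistic

Figure A.6: Example output of locally dependent items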


Acknowledgments

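In carrying out this research, I received consistently kind guidance, as well as warm advice and strong encouragement, from my chief supervisor, Associate Professor Maomi Ueno (植野真臣) of the Graduate School of the University of Electro-Communications. I express my sincere gratitude to him.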

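Professors Toshinori Watanabe (渡辺俊典), Hiroshi Nagaoka (長岡浩司), Akihiko Ohsuga (大須賀昭彦), and Toshio Okamoto (岡本敏雄) of the Graduate School of the University of Electro-Communications gave me much valuable advice and guidance during the review of this thesis. I am deeply grateful to them.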

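The members of the Ueno Laboratory at the Graduate School of the University of Electro-Communications and of the Shigemasu Laboratory at the Graduate School of the University of Tokyo provided fruitful discussion and exchange of information throughout this research. In particular, Dr. Kazumasa Mori (森一将), who belonged to the same Shigemasu Laboratory during my time at the University of Tokyo and is now a lecturer at the University of Tokyo, read this thesis carefully and gave me many helpful comments. I thank them all.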

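Finally, I express my deep gratitude to Professor Kazuo Shigemasu (繁桝算男), Professor Emeritus of the University of Tokyo and now Professor at Teikyo University, who taught me the fundamentals of this research field through five years of supervision in the master's and doctoral programs during my time at the University of Tokyo, and who also gave me valuable advice through the review of this thesis.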


