Future Work

Chapter 6: Summary

6.2 Future Work

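One item of future work is adding smoothing to the n-gram language model. No smoothing was used in this thesis, whereas Neubig and Dyer [9] use modified Kneser-Ney smoothing. Smoothing can be expected to improve the accuracy of any language model built on n-grams, so the comparison experiments need to be repeated with an n-gram language model that applies Kneser-Ney smoothing.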
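To make the idea concrete, the following is a minimal sketch of interpolated Kneser-Ney smoothing for the bigram case only. It is not the implementation used by Neubig and Dyer and not part of this thesis: it uses a single fixed discount (modified Kneser-Ney uses several count-dependent discounts), and the function name `train_kn_bigram` and the default discount of 0.75 are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_kn_bigram(sequences, discount=0.75):
    """Return a function prob(w, v) estimating P(w | v) with interpolated
    Kneser-Ney smoothing (bigram case, one fixed discount; illustrative only)."""
    bigram_counts = Counter()
    for seq in sequences:
        bigram_counts.update(zip(seq, seq[1:]))

    context_totals = Counter()       # c(v, .): total count of bigrams starting with v
    followers = defaultdict(set)     # distinct symbols observed after v
    predecessors = defaultdict(set)  # distinct symbols observed before w
    for (v, w), count in bigram_counts.items():
        context_totals[v] += count
        followers[v].add(w)
        predecessors[w].add(v)

    num_bigram_types = len(bigram_counts)

    def prob(w, v):
        # Continuation probability: fraction of distinct bigram types ending in w.
        p_cont = len(predecessors.get(w, ())) / num_bigram_types
        total = context_totals[v]
        if total == 0:
            return p_cont  # context never seen: rely on the lower-order estimate
        discounted = max(bigram_counts[(v, w)] - discount, 0.0) / total
        # Probability mass removed by discounting, redistributed via p_cont.
        backoff_weight = discount * len(followers.get(v, ())) / total
        return discounted + backoff_weight * p_cont

    return prob


# Usage on a toy corpus of integer symbol sequences (hypothetical data).
toy_sequences = [[1, 2, 3], [1, 2, 4], [2, 3, 1]]
kn_prob = train_kn_bigram(toy_sequences)
print(kn_prob(3, 2), kn_prob(1, 2))  # seen bigram vs. unseen bigram
```

The unseen bigram still receives non-zero probability because the mass removed by the discount is handed to the continuation distribution, which is the effect the unsmoothed model lacks.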

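In addition, this study vectorized symbols with one-hot representations for both the neural network and XGBoost, but many other vectorization methods exist. In particular, distributed representations such as Skip-gram and Continuous Bag-of-Words have improved accuracy on a variety of tasks [7]. Vectorizing each symbol with a distributed representation is therefore expected to improve accuracy.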
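As a rough illustration of how Skip-gram could be applied to symbol sequences, the sketch below extracts the (center, context) training pairs that a Skip-gram model learns from. It covers only the data preparation step; training the embedding itself is omitted. The function name `skipgram_pairs` and the window size are assumptions for illustration and do not come from this thesis.

```python
def skipgram_pairs(sequence, window=2):
    """Return (center, context) pairs for every symbol and each neighbor
    located within `window` positions of it."""
    pairs = []
    for i, center in enumerate(sequence):
        start = max(0, i - window)
        stop = min(len(sequence), i + window + 1)
        for j in range(start, stop):
            if j != i:  # a symbol is not its own context
                pairs.append((center, sequence[j]))
    return pairs


# Usage on a toy symbol sequence (hypothetical data).
for center, context in skipgram_pairs([3, 1, 4, 1, 5], window=1):
    print(center, context)
```

A model trained to predict the context symbol from the center symbol yields one dense vector per symbol, which could replace the one-hot input; symbols that occur in similar contexts end up with similar vectors.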

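The idea behind the proposed method was to combine multiple machine learning methods. The more methods are combined, the more kinds of hyperparameters need to be tuned. Evaluating a hyperparameter setting requires training each learner once, so tuning takes a very long time. Efficient hyperparameter tuning that uses as few trials as possible is therefore needed.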
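One simple way to cap the tuning cost is random search under a fixed trial budget, sketched below. This is only one possible baseline and not a method evaluated in this thesis. The search space, the parameter names (`ngram_order`, `hidden_units`, `max_depth`), and the stand-in objective are all hypothetical; in practice the objective would train the combined learners and return a validation score.

```python
import random


def random_search(objective, space, n_trials=20, seed=0):
    """Sample `n_trials` settings uniformly from `space` and keep the best.

    `objective` maps a dict of hyperparameters to a score (higher is better).
    Each call stands for one full training run, so `n_trials` bounds the cost.
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.choice(choices) for name, choices in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical search space spanning the combined learners.
space = {
    "ngram_order": [2, 3, 4, 5],
    "hidden_units": [64, 128, 256],
    "max_depth": [3, 6, 9],
}


def toy_objective(params):
    # Stand-in for "train every learner, then score on validation data".
    return -abs(params["ngram_order"] - 3) - abs(params["max_depth"] - 6) / 3


best_params, best_score = random_search(toy_objective, space, n_trials=10)
print(best_params, best_score)
```

The number of training runs is fixed in advance regardless of how many hyperparameters are added to the space, which is the property that matters as more methods are combined.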

References

[1] Waleed Ammar, George Mulcaire, Miguel Ballesteros, Chris Dyer, and Noah A. Smith. One parser, many languages. CoRR, 2016.

[2] Borja Balle, Rémi Eyraud, Franco Luque, Ariadna Quattoni, and Sicco Verwer. Results of the Sequence PredIction ChallengE (SPiCe): a Competition on Learning the Next Symbol in a Sequence. In 13th International Conference in Grammatical Inference, Vol. 57. JMLR W&CP, 2016.

[3] Yoshua Bengio, Réjean Ducharme, Pascal Vincent, and Christian Janvin. A neural probabilistic language model. Journal of Machine Learning Research, pp. 1137–1155, 2003.

[4] Tianqi Chen and Carlos Guestrin. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 785–794, 2016.

[5] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural Computation, pp. 1735–1780, 1997.

[6] Melvin Johnson, Mike Schuster, Quoc V. Le, Maxim Krikun, Yonghui Wu, Zhifeng Chen, Nikhil Thorat, Fernanda B. Viégas, Martin Wattenberg, Greg Corrado, Macduff Hughes, and Jeffrey Dean. Google's multilingual neural machine translation system: Enabling zero-shot translation. CoRR, 2016.

[7] Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient estimation of word representations in vector space. CoRR, 2013.

[8] Tomas Mikolov, Martin Karafiát, Lukas Burget, Jan Černocký, and Sanjeev Khudanpur. Recurrent neural network based language model. In Interspeech, Vol. 2, pp. 1045–1048, 2010.

[9] Graham Neubig and Chris Dyer. Generalizing and hybridizing count-based and neural language models. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pp. 1163–1172, November 2016.

[10] Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. Dropout: A simple way to prevent neural networks from overfitting. Journal of Machine Learning Research, Vol. 15, pp. 1929–1958, 2014.

[11] Martin Sundermeyer, Ralf Schlüter, and Hermann Ney. LSTM neural networks for language modeling. In Interspeech, pp. 194–197, 2012.

[12] Aäron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew W. Senior, and Koray Kavukcuoglu. WaveNet: A generative model for raw audio. CoRR, 2016.

[13] Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu, Lukasz Kaiser, Stephan Gouws, Yoshikiyo Kato, Taku Kudo, Hideto Kazawa, Keith Stevens, George Kurian, Nishant Patil, Wei Wang, Cliff Young, Jason Smith, Jason Riesa, Alex Rudnick, Oriol Vinyals, Greg Corrado, Macduff Hughes, and Jeffrey Dean. Google's neural machine translation system: Bridging the gap between human and machine translation. CoRR, 2016.

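[14] Hideki Asoh, Muneki Yasuda, Shin-ichi Maeda, Daisuke Okanohara, Takayuki Okatani, Yotaro Kubo, and Danushka Bollegala. Deep Learning (深層学習, in Japanese). Kindai Kagaku Sha, 2015.

[15] Takayuki Okatani. Deep Learning (深層学習, in Japanese). Machine Learning Professional Series. Kodansha, 2015.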
