Future Work Direction - Research on Improving Annotation-Based Image Retrieval

7.2.3 Image Description Generation

Automatically describing the content of an image is a fundamental problem in artificial intelligence that connects computer vision and natural language processing. In [113], Vinyals et al. presented a generative model based on a deep recurrent architecture that combines advances in computer vision and machine translation and can be used to generate natural sentences describing an image. In [89], Miyazaki and Shimizu developed a Japanese version of the MS COCO caption dataset, together with a generative model based on a deep recurrent architecture that takes an image as input and uses this Japanese version of the dataset to generate a description in Japanese. A pressing need for future work is therefore sufficiently large corpora for image description, with high-level semantics, in other languages.

Acknowledgements

First of all, I am grateful to Kyushu University for providing an enjoyable and convenient learning environment. Second, I would like to acknowledge my supervisor, Prof. Akira Fukuda, for allowing me the freedom to pursue my own ideas and for his constant support and advice. I am grateful to Prof. Kazuaki Murakami, who helped me through my most difficult time. I would like to thank Prof. Yoichi Tomiura and Prof. Tsunenori Mine for their comments and constructive feedback on improving this thesis. I would also like to thank Mr. Antoine Trouve for his objective advice and assistance. I am greatly indebted to the secretaries, Rika Shudo and Yoko Otsuru, and to the other teachers who helped me during my studies at Kyushu University. Last but not least, I would like to thank my parents and my wife for their unconditional love, understanding and support. Without their continuous encouragement, I would not be where I am today.

Bibliography

[1] Somak Aditya, Yezhou Yang, Chitta Baral, Yiannis Aloimonos, and Cornelia Fermüller. Image understanding using vision and reasoning through scene description graph. Computer Vision and Image Understanding, 2017.

[2] Somak Aditya, Yezhou Yang, Chitta Baral, Cornelia Fermüller, and Yiannis Aloimonos. From images to sentences through scene description graphs using commonsense reasoning and knowledge. arXiv preprint arXiv:1511.03292, 2015.

[3] Sareh Aghaei, Mohammad Ali Nematbakhsh, and Hadi Khosravi Farsani. Evolution of the world wide web: From web 1.0 to web 4.0. International Journal of Web & Semantic Technology, 3(1):1, 2012.

[4] Haytham Al-Feel, MA Koutb, and Hoda Suoror. Toward an agreement on semantic web architecture. Europe, 49(3):806–810, 2009.

[5] D Allemang and J Hendler. Rdf–the basis of the semantic web. In Semantic web for the working ontologist, 2011.

[6] MB Alves, Carlos Viegas Damásio, and N Correia. Improving tag-based image search by using linked open data. In Proceedings of the 10th Conference on Open Research Areas in Information Retrieval, pages 21–24, 2013.

[7] Ahmad Alzu’bi, Abbes Amira, and Naeem Ramzan. Semantic content-based image retrieval: A comprehensive study. Journal of Visual Communication and Image Representation, 32:20–54, 2015.

[8] Peter Anderson, Basura Fernando, Mark Johnson, and Stephen Gould. Spice: Semantic propositional image caption evaluation. arXiv preprint arXiv:1607.08822, 2016.

[9] Pierre Andrews, Ilya Zaihrayeu, and Juan Pane. A classification of semantic annotation systems. Semantic web, 3(3):223–248, 2012.

[10] Sören Auer, Christian Bizer, Georgi Kobilarov, Jens Lehmann, Richard Cyganiak, and Zachary Ives. Dbpedia: A nucleus for a web of open data. In The semantic web, pages 722–735. Springer, 2007.

[11] PY Bard and SESAME Participants. The sesame project: an overview and main results. In Proc. 13th World Conf. Earth. Engng., Vancouver, 2004.

[12] David Beckett. N-triples: A line-based syntax for an rdf graph. Technical report, W3C Working Group Note 9 Apr, 2013.

[13] David Beckett, Tim Berners-Lee, and Eric Prud’hommeaux. Turtle-terse rdf triple language. W3C Team Submission, 14(7), 2008.

[14] Tamara L Berg and David A Forsyth. Animals on the web. In Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on, volume 2, pages 1463–1470. IEEE, 2006.

[15] Raffaella Bernardi, Ruket Cakici, Desmond Elliott, Aykut Erdem, Erkut Erdem, Nazli Ikizler-Cinbis, Frank Keller, Adrian Muscat, and Barbara Plank. Automatic description generation from images: A survey of models, datasets, and evaluation measures. J. Artif. Intell. Res. (JAIR), 55:409–442, 2016.

[16] Tim Berners-Lee. The world wide web: A very short personal history (1998), 2009.

[17] Tim Berners-Lee and Mark Fischetti. Weaving the Web: The original design and ultimate destiny of the World Wide Web by its inventor. Foreword by Michael L Dertouzos. HarperInformation, 2000.

[18] Tim Berners-Lee, James Hendler, Ora Lassila, et al. The semantic web. Scientific american, 284(5):28–37, 2001.

[19] Steven Bird, Ewan Klein, and Edward Loper. Natural language processing with Python: analyzing text with the natural language toolkit. O’Reilly Media, Inc., 2009.

[20] Christian Bizer, Tom Heath, and Tim Berners-Lee. Linked data-the story so far. Semantic services, interoperability and web applications: emerging concepts, pages 205–227, 2009.

[21] Kurt Bollacker, Colin Evans, Praveen Paritosh, Tim Sturge, and Jamie Taylor. Freebase: a collaboratively created graph database for structuring human knowledge. In Proceedings of the 2008 ACM SIGMOD international conference on Management of data, pages 1247–1250. ACM, 2008.

[22] Dan Brickley and R Guha. Rdf schema 1.1. w3c recommendation (25 february 2014). World Wide Web Consortium, 2014.

[23] Joseph J Budovec, Cesar A Lam, and Charles E Kahn Jr. Informatics in radiology: radiology gamuts ontology: differential diagnosis for the semantic web. Radiographics, 34(1):254–264, 2014.

[24] Erik Cambria, Yangqiu Song, Haixun Wang, and Newton Howard. Semantic multidimensional scaling for open-domain sentiment analysis. IEEE Intelligent Systems, 29(2):44–51, 2014.

[25] Erik Cambria, Robert Speer, Catherine Havasi, and Amir Hussain. Senticnet: A publicly available semantic resource for opinion mining. In AAAI fall symposium: commonsense knowledge, volume 10, 2010.

[26] Erik Cambria and Bebo White. Jumping nlp curves: A review of natural language processing research. IEEE Computational intelligence magazine, 9(2):48–57, 2014.

[27] K Selçuk Candan and Maria Luisa Sapino. Data management for multimedia retrieval. Cambridge University Press, 2010.

[28] Andrew Carlson, Justin Betteridge, Bryan Kisiel, Burr Settles, Estevam R Hruschka Jr, and Tom M Mitchell. Toward an architecture for never-ending language learning. In AAAI, volume 5, page 3, 2010.

[29] Gustavo Carneiro, Antoni B Chan, Pedro J Moreno, and Nuno Vasconcelos. Supervised learning of semantic classes for image annotation and retrieval. IEEE transactions on pattern analysis and machine intelligence, 29(3):394–410, 2007.

[30] Gavin Carothers. Rdf 1.1 n-quads. World Wide Web Consortium, Recommendation REC-n-quads-20140225, 2014.

[31] Himali Chaudhari and D Patil. A survey on automatic annotation and annotation based image retrieval. International Journal of Computer Science and Information Technologies, 5(2):1368–1371, 2014.

[32] Hua Chen, Antoine Trouve, Kazuaki J Murakami, and Akira Fukuda. Semantic image retrieval for complex queries using a knowledge parser. Multimedia Tools and Applications, pages 1–19, 2017.

[33] Xinxiong Chen, Zhiyuan Liu, and Maosong Sun. A unified model for word sense representation and disambiguation. In EMNLP, pages 1025–1035, 2014.

[34] Nupur Choudhury. World wide web and its journey from web 1.0 to web 4.0. International Journal of Computer Science and Information Technologies, 5(6):8096–8100, 2014.

[35] Lingyang Chu, Shuqiang Jiang, Shuhui Wang, Yanyan Zhang, and Qingming Huang. Robust spatial consistency graph model for partial duplicate image retrieval. IEEE Transactions on Multimedia, 15(8):1982–1996, 2013.

[36] Peter Clark, Bruce Porter, and Boeing Phantom Works. Km–the knowledge machine 2.0: Users manual. Department of Computer Science, University of Texas at Austin, 2:5, 2004.

[37] World Wide Web Consortium et al. Sparql query language for rdf. W3C Recommendation, 2008.

[38] World Wide Web Consortium et al. Json-ld 1.0: a json-based serialization for linked data. 2014.

[39] Douglas Crockford. The application/json media type for javascript object notation (json). 2006.

[40] Richard Cyganiak, Andreas Harth, and Aidan Hogan. N-quads: Extending n-triples with context. W3C Recommendation, 2008.

[41] Bo Dai, Dahua Lin, Raquel Urtasun, and Sanja Fidler. Towards diverse and natural image descriptions via a conditional gan. arXiv preprint arXiv:1703.06029, 2017.

[42] Stamatia Dasiopoulou, Eirini Giannakidou, Georgios Litos, Polyxeni Malasioti, and Yiannis Kompatsiaris. A survey of semantic image and video annotation tools. In Knowledge-driven multimedia information extraction and ontology evolution, pages 196–239. Springer, 2011.

[43] Mike Dean, Guus Schreiber, Sean Bechhofer, Frank van Harmelen, Jim Hendler, Ian Horrocks, Deborah L McGuinness, Peter F Patel-Schneider, and Lynn Andrea Stein. Owl web ontology language reference. W3C Recommendation February, 10, 2004.

[44] T Dharani and I Laurence Aroquiaraj. A survey on content based image retrieval. In Pattern Recognition, Informatics and Mobile Engineering (PRIME), 2013 International Conference on, pages 485–490. IEEE, 2013.

[45] Orri Erling and Ivan Mikhailov. Rdf support in the virtuoso dbms. In Networked Knowledge-Networked Media, pages 7–24. Springer, 2009.

[46] Ali Farhadi, Mohsen Hejrati, Mohammad Amin Sadeghi, Peter Young, Cyrus Rashtchian, Julia Hockenmaier, and David Forsyth. Every picture tells a story: Generating sentences from images. In European conference on computer vision, pages 15–29. Springer, 2010.

[47] Robert Fergus, Li Fei-Fei, Pietro Perona, and Andrew Zisserman. Learning object categories from google’s image search. In Computer Vision, 2005. ICCV 2005. Tenth IEEE International Conference on, volume 2, pages 1816–1823. IEEE, 2005.

[48] D Flanagan. Developing metaweb-enabled web applications. Metaweb Technologies, 2007.

[49] Fabien Gandon and Guus Schreiber. Rdf 1.1 xml syntax. W3C recommendation, 25, 2014.

[50] Marco Grassi and Francesco Piazza. Towards an rdf encoding of conceptnet. Advances in Neural Networks–ISNN 2011, pages 558–565, 2011.

[51] Jane Greenberg, Stuart Sutton, and D Grant Campbell. Metadata: A fundamental component of the semantic web. Bulletin of the Association for Information Science and Technology, 29(4):16–18, 2003.

[52] Michael Grobe. Rdf, jena, sparql and the “semantic web”. In Proceedings of the 37th annual ACM SIGUCCS fall conference: communication and collaboration, pages 131–138. ACM, 2009.

[53] Ankush Gupta and Prashanth Mannem. From image annotation to image description. In Neural information processing, pages 196–204. Springer, 2012.

[54] Jonathon S Hare, Paul H Lewis, Peter GB Enser, and Christine J Sandom. Mind the gap: Another look at the problem of the semantic gap in image retrieval. 2006.

[55] Bernhard Haslhofer, Elaheh Momeni Roochi, Bernhard Schandl, and Stefan Zander. Europeana rdf store report. Technical report, University of Vienna, 2011.

[56] Hamed Hassanzadeh and MohammadReza Keyvanpour. A machine learning based analytical framework for semantic annotation requirements. arXiv preprint arXiv:1104.4950, 2011.

[57] James Hendler, T Berners-Lee, and Eric Miller. Integrating applications on the semantic web. The journal of the Institute of Electrical Engineers of Japan, 122(10):676–680, 2002.

[58] Tsukasa Hirashima, Yusuke Hayashi, Sho Yamamoto, and Kazushige Maeda. Bridging model between problem and solution representations in arithmetic/mathematics word problem. Proc. ICCE2015, pages 9–18, 2015.

[59] Micah Hodosh, Peter Young, and Julia Hockenmaier. Framing image description as a ranking task: Data, models and evaluation metrics. Journal of Artificial Intelligence Research, 47:853–899, 2013.

[60] Ian Horrocks, Peter F Patel-Schneider, Harold Boley, Said Tabet, Benjamin Grosof, Mike Dean, et al. Swrl: A semantic web rule language combining owl and ruleml. W3C Member submission, 21:79, 2004.

[61] Ming-Hung Hsu, Ming-Feng Tsai, and Hsin-Hsi Chen. Query expansion with conceptnet and wordnet: An intrinsic comparison. In Asia Information Retrieval Symposium, pages 1–13. Springer, 2006.

[62] Jiewen Huang, Daniel J Abadi, and Kun Ren. Scalable sparql querying of large rdf graphs. Proceedings of the VLDB Endowment, 4(11):1123–1134, 2011.

[63] Eero Hyvönen, Samppa Saarela, Avril Styrman, and Kim Viljanen. Ontology-based image retrieval. In WWW (Posters), 2003.

[64] Dong-Hyuk Im and Geun-Duk Park. Linked tag: image annotation using semantic relationships between image tags. Multimedia Tools and Applications, 74(7):2273–2287, 2015.

[65] Ian Jacobs and Norman Walsh. Architecture of the world wide web. 2004.

[66] Priyanka Jadhav. Improvised tag ranker for tag based image retrieval (tbir). 2015.

[67] Juhani Järvikivi, Roger PG van Gompel, and Jukka Hyönä. The interplay of implicit causality, structural heuristics, and anaphor type in ambiguous pronoun resolution. Journal of psycholinguistic research, 46(3):525–550, 2017.

[68] Justin Johnson, Ranjay Krishna, Michael Stark, Li-Jia Li, David Shamma, Michael Bernstein, and Li Fei-Fei. Image retrieval using scene graphs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3668–3678, 2015.

[69] Maged N Kamel Boulos and Steve Wheeler. The emerging web 2.0 social software: an enabling suite of sociable technologies in health and health care education. Health Information & Libraries Journal, 24(1):2–23, 2007.

[70] Marja-Riitta Koivunen and Eric Miller. W3c semantic web activity. Semantic Web Kick-Off in Finland, pages 27–44, 2001.

[71] Neeraj Kumar, Peter Belhumeur, and Shree Nayar. Facetracer: A search engine for large collections of images with faces. In European conference on computer vision, pages 340–353. Springer, 2008.

[72] Nate Kushman, Yoav Artzi, Luke Zettlemoyer, and Regina Barzilay. Learning to automatically solve algebra word problems. Association for Computational Linguistics, 2014.

[73] Guillaume Lample, Miguel Ballesteros, Sandeep Subramanian, Kazuya Kawakami, and Chris Dyer. Neural architectures for named entity recognition. arXiv preprint arXiv:1603.01360, 2016.

[74] Victor Lavrenko, Raghavan Manmatha, and Jiwoon Jeon. A model for learning the semantics of pictures. In Advances in neural information processing systems, pages 553–560, 2004.

[75] Lillian Lee. “i’m sorry dave, i’m afraid i can’t do that”: Linguistics, statistics, and natural language processing circa 2001. arXiv preprint cs/0304027, 2003.

[76] Douglas B Lenat. Cyc: A large-scale investment in knowledge infrastructure. Communications of the ACM, 38(11):33–38, 1995.

[77] Siming Li, Girish Kulkarni, Tamara L Berg, Alexander C Berg, and Yejin Choi. Composing simple image descriptions using web-scale n-grams. In Proceedings of the Fifteenth Conference on Computational Natural Language Learning, pages 220–228. Association for Computational Linguistics, 2011.

[78] Kaihong Liu, William R Hogan, and Rebecca S Crowley. Natural language processing methods and systems for biomedical ontology learning. Journal of biomedical informatics, 44(1):163–179, 2011.

[79] Dan Lu, Xiaoxiao Liu, and Xueming Qian. Tag-based image search by social re-ranking. IEEE Transactions on Multimedia, 18(8):1628–1639, 2016.

[80] N Magesh and P Thangaraj. Semantic image retrieval based on ontology and sparql query. Proceedings of International Journal of Computer Applications (IJCA)–ICACT, pages 12–16, 2011.

[81] Vafa Maihami and Farzin Yaghmaee. Fuzzy neighbor voting for automatic image annotation. Journal of Electrical and Computer Engineering Innovations, 4(1):1–8, 2016.

[82] Ameesh Makadia, Vladimir Pavlovic, and Sanjiv Kumar. A new baseline for image annotation. Computer Vision–ECCV 2008, pages 316–329, 2008.

[83] Frank Manola and Eric Miller. Resource description framework (rdf) primer. W3C Recommendation, 10:5, 2004.

[84] Umar Manzoor, Mohammed A Balubaid, Bassam Zafar, Hafsa Umar, and M Shoaib Khan. Semantic image retrieval: An ontology based approach. International Journal of Advanced Research in Artificial Intelligence (IJARAI), 1(4):1–8, 2015.

[85] Brian McBride. Jena: Implementing the rdf model and syntax specification. In Proceedings of the Second International Conference on Semantic Web-Volume 40, pages 23–28. CEUR-WS.org, 2001.

[86] Sergey Melnik and Stefan Decker. Wordnet rdf representation, 2001.

[87] George A Miller. Wordnet: a lexical database for english. Communications of the ACM, 38(11):39–41, 1995.

[88] Taylor L Miller. The role of prosody in english sentence disambiguation. The Journal of the Acoustical Society of America, 136(4):2176–2176, 2014.

[89] Takashi Miyazaki and Nobuyuki Shimizu. Cross-lingual image caption generation. In ACL (1), 2016.

[90] Erfan Najmi, Khayyam Hashmi, Zaki Malik, Abdelmounaam Rezgui, and Habib Ullah Khanz. Conceptonto: An upper ontology based on conceptnet. In Computer Systems and Applications (AICCSA), 2014 IEEE/ACS 11th International Conference on, pages 366–372. IEEE, 2014.

[91] Erfan Najmi, Zaki Malik, Khayyam Hashmi, and Abdelmounaam Rezgui. Conceptrdf: An rdf presentation of conceptnet knowledge base. In Information and Communication Systems (ICICS), 2016 7th International Conference on, pages 145–150. IEEE, 2016.

[92] Roberto Navigli and Simone Paolo Ponzetto. Babelnet: Building a very large multilingual semantic network. In Proceedings of the 48th annual meeting of the association for computational linguistics, pages 216–225. Association for Computational Linguistics, 2010.

[93] Liqiang Nie, Shuicheng Yan, Meng Wang, Richang Hong, and Tat-Seng Chua. Harvesting visual concepts for image search with complex queries. In Proceedings of the 20th ACM international conference on Multimedia, pages 59–68. ACM, 2012.

[94] Martha Palmer, Daniel Gildea, and Paul Kingsbury. The proposition bank: An annotated corpus of semantic roles. Computational linguistics, 31(1):71–106, 2005.

[95] Sean B Palmer. The semantic web: An introduction, 2001. URL http://infomesh.net/2001/swintro, 2009.

[96] Xueming Qian, Dan Lu, and Xiaoxiao Liu. Tag based image search by social re-ranking. IEEE Transactions on Multimedia, 2000(006206):1, 2013.

[97] Miguel Ángel Rodríguez-García, Rafael Valencia-García, Francisco García-Sánchez, and J Javier Samper-Zapater. Ontology-based annotation and retrieval of services in the cloud. Knowledge-Based Systems, 56:15–25, 2014.

[98] Sharmi Sankar, Awny Sayed, and Jihad Alkhalaf Bani-Younis. A schematic analysis on selective-rdf database stores. International Journal of Computer Applications, 86(11), 2014.

[99] Sohail Sarwar, Zia Ul Qayyum, and Saqib Majeed. Ontology based image retrieval framework using qualitative semantic image descriptions. Procedia Computer Science, 22:285–294, 2013.

[100] Ansgar Scherp. Semantic technologies for multimedia content: foundations and applications. In Proceedings of the 21st ACM international conference on Multimedia, pages 1107–1108. ACM, 2013.

[101] A Th Schreiber, Barbara Dubbeldam, Jan Wielemaker, and Bob Wielinga. Ontology-based photo annotation. IEEE Intelligent systems, 16(3):66–74, 2001.

[102] Sebastian Schuster, Ranjay Krishna, Angel Chang, Li Fei-Fei, and Christopher D Manning. Generating semantically precise scene graphs from textual descriptions for improved image retrieval. In Proceedings of the Fourth Workshop on Vision and Language, pages 70–80, 2015.

[103] Arpit Sharma, Nguyen Ha Vo, Somak Aditya, and Chitta Baral. Towards addressing the winograd schema challenge-building and using a semantic parser and a knowledge hunting module. In IJCAI, pages 1319–1325, 2015.

[104] Li Si, Qiuyu Pan, and Xiaozhe Zhuang. An empirical analysis of user behaviour on multilingual information retrieval. The Electronic Library, 35(3), 2017.

[105] Behjat Siddiquie, Rogerio S Feris, and Larry S Davis. Image ranking and retrieval based on multi-attribute queries. In Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on, pages 801–808. IEEE, 2011.

[106] Behjat Siddiquie, Brandyn White, Abhishek Sharma, and Larry S Davis. Multi-modal image retrieval for complex queries using small codes. In Proceedings of International Conference on Multimedia Retrieval, page 321. ACM, 2014.

[107] Robert Speer and Catherine Havasi. Representing general relational knowledge in conceptnet 5. In LREC, pages 3679–3686, 2012.

[108] Nova Spivack. Web 3.0: The third generation web is coming. Lifeboat Foundation Special Report. http://lifeboat.com/ex/web, 3:20, 2015.

[109] Fabian M Suchanek, Gjergji Kasneci, and Gerhard Weikum. Yago: a core of semantic knowledge. In Proceedings of the 16th international conference on World Wide Web, pages 697–706. ACM, 2007.

[110] Ying Hua Tan and Chee Seng Chan. Phrase-based image captioning with hierarchical lstm model. arXiv preprint arXiv:1711.05557, 2017.

[111] Yukihiro Tsuboshita, Noriji Kato, Motofumi Fukui, and Masato Okada. Image annotation using adapted gaussian mixture model. In Pattern Recognition (ICPR), 2012 21st International Conference on, pages 1346–1350. IEEE, 2012.

[112] Tiberio Uricchio, Lamberto Ballan, Lorenzo Seidenari, and Alberto Del Bimbo. Automatic image annotation via label transfer in the semantic space. Pattern Recognition, 2017.

[113] Oriol Vinyals, Alexander Toshev, Samy Bengio, and Dumitru Erhan. Show and tell: A neural image caption generator. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 3156–3164, 2015.

[114] Huan Wang, Xing Jiang, Liang-Tien Chia, and Ah-Hwee Tan. Ontology enhanced web image retrieval: aided by wikipedia & spreading activation theory. In Proceedings of the 1st ACM international conference on Multimedia information retrieval, pages 195–201. ACM, 2008.

[115] Mark Wick. GeoNames. GeoNames, 2006.

[116] Dan Wu, Daqing He, and Bo Luo. Multilingual needs and expectations in digital libraries: a survey of academic users with different languages. The Electronic Library, 30(2):182–197, 2012.

[117] Hongyan Wu, Toyofumi Fujiwara, Yasunori Yamamoto, Jerven Bolleman, and Atsuko Yamaguchi. Biobenchmark toyama 2012: an evaluation of the performance of triple stores on biological data. Journal of biomedical semantics, 5(1):32, 2014.

[118] Lei Wu, Rong Jin, and Anil K Jain. Tag completion for image retrieval. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(3):716–727, 2013.

[119] Xing Xu, Li He, Atsushi Shimada, Rin-ichiro Taniguchi, and Huimin Lu. Learning unified binary codes for cross-modal retrieval via latent semantic hashing. Neurocomputing, 213:191–203, 2016.

[120] Changbo Yang, Ming Dong, and Jing Hua. Region-based image annotation using asymmetrical support vector machine-based multiple-instance learning. In Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on, volume 2, pages 2057–2063. IEEE, 2006.

[121] Yezhou Yang, UMD EDU, Yiannis Aloimonos, and Cornelia Fermüller. Deepiu: An architecture for image understanding. Advances in Cognitive Systems, 2016.

[122] Linlin Yu, Lifang Song, Jianyan Sun, and Lin Li. An improved word sense disambiguation method. DEStech Transactions on Computer Science and Engineering, 2016.

[123] Xiaoming Zhang, Zhoujun Li, and Wenhan Chao. Improving image tags by exploiting web search results. Multimedia tools and applications, 62(3):601–631, 2013.
