Re-implementation - A Study on Improving Annotation-Based Image Retrieval

The preliminary system in Section 5.2 has several problems: it cannot understand the deeper meaning of complex queries; concepts are not well recognized; and the parsing process with the K-Parser produces some errors. We therefore look for a better way to solve these problems.

Figure 5.7: Some semantic graphs generated by K-Parser. (a) “a group of white birds fly”; (b) “a black dog running through water”. Edge labels visible in the graphs include agent, instance_of, and is_subclass_of.

5.3.1 Optimization Techniques

We use WordNet and ConceptNet for better classification of concepts and define a set of rules to enhance the quality of image annotations.

5.3.1.1 Classification of Concepts

As mentioned in Section 5.2.3, the concepts generated by the K-Parser are sometimes incorrect. We correct these mistakes by using WordNet and ConceptNet: we first analyze the features of the image annotations, and then obtain a set of concept names from WordNet and ConceptNet for better classification of concepts. Furthermore, we obtain other types of relations from ConceptNet² (e.g., “CapableOf”), which can be used to infer concept names.

² In our work, we consider only the relation “CapableOf” from ConceptNet.

5.3.1.2 Rule-based Method

We also define a set of rules based on the semantic graphs generated by the K-Parser to enhance the quality of annotations and increase the retrieval accuracy.

Rule 1 If there is an “of” phrase (e.g. “a group of dogs”, “a bunch of dogs”, “a couple of dogs”) followed by a verb in a sentence, the “agent” of the verb should be the “object” (e.g. “dogs”) of the “of” phrase. Figure 5.7 (a) shows an example for “a group of white birds fly”. In this sentence, the “agent” of “fly” should be “white birds”, not “group”.
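The following Python sketch illustrates Rule 1 under simplifying assumptions. The semantic graph is represented as a plain list of (subject, predicate, object) triples, node names follow the word_position pattern seen in Figure 5.8, and the mapping from the head of an “of” phrase to its object (`of_objects`) is supplied by hand. The function name `apply_rule_1` is hypothetical; this is neither the K-Parser's API nor the implementation used in our system.

```python
def apply_rule_1(triples, of_objects):
    """Redirect 'agent' edges from the head of an "of" phrase to its object.

    triples    -- list of (subject, predicate, object) edges
    of_objects -- maps the head node of an "of" phrase (e.g. "group_2")
                  to the node of its object (e.g. "birds_5"); hand-supplied here
    """
    fixed = []
    for s, p, o in triples:
        if p == "agent" and o in of_objects:
            o = of_objects[o]  # e.g. group_2 -> birds_5
        fixed.append((s, p, o))
    return fixed


# Hand-written graph for "a group of white birds fly" (illustrative only).
graph = [
    ("fly_6", "instance_of", "fly"),
    ("fly_6", "agent", "group_2"),
    ("birds_5", "instance_of", "bird"),
]
corrected = apply_rule_1(graph, {"group_2": "birds_5"})
```

After the call, the “agent” edge of `fly_6` points to `birds_5` instead of `group_2`, and all other edges are unchanged.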

Rule 2 If a verb in “-ing” form follows a noun or pronoun in a sentence, the “agent” of the verb should be that noun or pronoun. Figure 5.7 (b) shows an example for “a black dog running through water”. In this sentence, the “agent” of “run” should be “black dog”.
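Rule 2 can be sketched in the same spirit. The example below assumes the sentence has already been tokenized and tagged with Penn Treebank part-of-speech tags (the tags are written by hand here), and it emits an “agent” edge for every “-ing” verb that directly follows a noun or pronoun. The function name `apply_rule_2` and the node naming are illustrative assumptions.

```python
# Penn Treebank tags for nouns and personal pronouns.
NOUN_TAGS = {"NN", "NNS", "NNP", "NNPS", "PRP"}


def apply_rule_2(tagged):
    """Return (verb_node, 'agent', noun_node) for each -ing verb (tag VBG)
    that directly follows a noun or pronoun."""
    triples = []
    for i in range(1, len(tagged)):
        word, tag = tagged[i]
        prev_word, prev_tag = tagged[i - 1]
        if tag == "VBG" and prev_tag in NOUN_TAGS:
            # Node names use 1-based word positions: list index i is position i + 1.
            triples.append((f"{word}_{i + 1}", "agent", f"{prev_word}_{i}"))
    return triples


# Hand-tagged tokens for "a black dog running through water".
tagged = [
    ("a", "DT"), ("black", "JJ"), ("dog", "NN"),
    ("running", "VBG"), ("through", "IN"), ("water", "NN"),
]
agent_edges = apply_rule_2(tagged)
```

For this sentence the sketch yields a single edge from `running_4` to `dog_3`.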

Rule 3 If two or more adjectives modify a noun in a sentence, each adjective should have its own concept. Figure 5.7 (c) shows an example for “a small white dog”. In this sentence, the adjectives should have their own concepts (“white” and “small”).
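A minimal sketch of Rule 3, again over hand-tagged tokens: every adjective receives its own node and its own `instance_of` edge, and the noun that follows is linked to each adjective node separately. The edge label `trait` used for the noun-adjective link is an assumption made for this illustration, as is the function name `apply_rule_3`.

```python
def apply_rule_3(tagged):
    """Give every adjective its own concept node and link the following
    noun to each adjective separately."""
    triples = []
    pending = []  # adjective nodes waiting for the noun they modify
    for pos, (word, tag) in enumerate(tagged, start=1):
        node = f"{word}_{pos}"
        if tag == "JJ":
            triples.append((node, "instance_of", word))
            pending.append(node)
        elif tag.startswith("NN"):
            for adj in pending:
                # "trait" is a hypothetical label for the noun-adjective edge.
                triples.append((node, "trait", adj))
            pending = []
        else:
            pending = []
    return triples


# Hand-tagged tokens for "a small white dog".
adjective_edges = apply_rule_3(
    [("a", "DT"), ("small", "JJ"), ("white", "JJ"), ("dog", "NN")]
)
```

The result contains separate concepts for “small” and “white”, each attached to `dog_4`.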

Rules No.1∼No.3 defined above are used to correct the inaccuracies of the K-Parser. They are enough to enhance the quality of the annotations, thereby improving the retrieval accuracy.

Considering the features of SPARQL, we also define some other rules to increase the flexibility of the image retrieval system.

(A) Original query; the pattern “?sth_4 instance_of something .” is marked “Delete”:

select ?uri ?sen where
{
  ?uri hasSen ?sen_node .
  ?sen_node SenContent ?sen .
  ?sen_node hasIns ?dog_2 .
  ?dog_2 instance_of dog .
  ?sen_node hasIns ?wears_3 .
  ?wears_3 agent ?dog_2 .
  ?wears_3 instance_of wear .
  ?wears_3 recipient ?sth_4 .
  ?sen_node hasIns ?sth_4 .
  ?sth_4 instance_of something .
}

(B) Query with that pattern deleted:

select ?uri ?sen where
{
  ?uri hasSen ?sen_node .
  ?sen_node SenContent ?sen .
  ?sen_node hasIns ?dog_2 .
  ?dog_2 instance_of dog .
  ?sen_node hasIns ?wears_3 .
  ?wears_3 agent ?dog_2 .
  ?wears_3 instance_of wear .
  ?wears_3 recipient ?sth_4 .
  ?sen_node hasIns ?sth_4 .
}

Figure 5.8: A SPARQL query improved with rule 4 for “a dog wears something”.

Rule 4 If a query sentence contains indefinite pronouns (such as “someone”, “somebody”, “something”), these pronouns can match any other concept. This rule supports fuzzy description queries; it can be implemented easily by deleting the concept restriction on “someone”, “somebody” or “something” (as shown in Figure 5.8, the concept “something” is deleted, so the node can match anything). For example, the query sentence “a dog wears something” can match images annotated with “a dog wears hat”, “a dog wears shirt”, “a dog wears something”, and so on.
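The deletion step of Rule 4 can be sketched as a filter over triple patterns. The patterns below are copied from Figure 5.8, with predicate names written bare as in the figure (a real SPARQL endpoint would need full IRIs or prefixed names). The helper names `apply_rule_4` and `to_query` are hypothetical and do not describe the actual query generator.

```python
INDEFINITE_PRONOUNS = {"someone", "somebody", "something"}


def apply_rule_4(patterns):
    """Drop instance_of restrictions whose concept is an indefinite pronoun."""
    return [
        (s, p, o)
        for (s, p, o) in patterns
        if not (p == "instance_of" and o in INDEFINITE_PRONOUNS)
    ]


def to_query(patterns):
    """Render triple patterns in the layout of Figure 5.8."""
    body = "\n".join(f"  {s} {p} {o} ." for s, p, o in patterns)
    return "select ?uri ?sen where\n{\n" + body + "\n}"


# Triple patterns for "a dog wears something", as in Figure 5.8.
patterns = [
    ("?uri", "hasSen", "?sen_node"),
    ("?sen_node", "SenContent", "?sen"),
    ("?sen_node", "hasIns", "?dog_2"),
    ("?dog_2", "instance_of", "dog"),
    ("?sen_node", "hasIns", "?wears_3"),
    ("?wears_3", "agent", "?dog_2"),
    ("?wears_3", "instance_of", "wear"),
    ("?wears_3", "recipient", "?sth_4"),
    ("?sen_node", "hasIns", "?sth_4"),
    ("?sth_4", "instance_of", "something"),
]
relaxed = apply_rule_4(patterns)
print(to_query(relaxed))
```

Because `?sth_4` is still bound by the `recipient` and `hasIns` patterns, the relaxed query requires that the dog wears some instance, but no longer constrains which concept that instance belongs to.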

Rule 5 If a number is used to count objects in a query sentence, the number matches not only the same number in an annotation but also the total count of instances of a concept. For example, the query sentence “three animals run” matches not only images with “three animals run” or “three dogs run”, but also images with “two dogs and a cat run” or “a dog and two cats run”. This rule can be implemented with the counting expression of SPARQL; the details are described with some use cases in Section 4.3.
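The matching semantics of Rule 5 can be illustrated in plain Python, independently of the SPARQL counting expression described in Section 4.3. The sketch assumes that an annotation has already been reduced to a dictionary of instance counts per concept, and it uses a small hand-written subclass table in place of WordNet; both, along with the function names, are assumptions made for this example.

```python
# Hand-written subclass links standing in for a real taxonomy (assumption).
SUBCLASS_OF = {"dog": "animal", "cat": "animal", "hat": "clothing"}


def is_a(concept, target):
    """True if `concept` equals `target` or is a (transitive) subclass of it."""
    while concept is not None:
        if concept == target:
            return True
        concept = SUBCLASS_OF.get(concept)
    return False


def matches_count(agent_counts, target, number):
    """True if the instances belonging to `target` add up to `number`.

    agent_counts -- maps a concept name to its number of instances in one
                    annotation, e.g. {"dog": 2, "cat": 1}
    """
    total = sum(n for concept, n in agent_counts.items() if is_a(concept, target))
    return total == number


# "three animals run" against "two dogs and a cat run".
two_dogs_one_cat = matches_count({"dog": 2, "cat": 1}, "animal", 3)
```

Under these assumptions, “two dogs and a cat” and “three dogs” both satisfy “three animals”, whereas “two dogs” does not.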

Rule 6 If adjectives modify a noun in a query sentence, these adjectives are treated as a set of property values, and an image is returned as a query result only when its annotation matches all of these property values. For example, with the query sentence “a small black dog”, images with “small black dog” or “black small dog” can be matched, but images with “small black and white dog” cannot be matched.
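Read literally, the example for Rule 6 behaves like an order-insensitive comparison of two sets of property values. The sketch below takes that reading (exact set equality) as an assumption; the function name `matches_properties` is hypothetical.

```python
def matches_properties(query_adjectives, annotation_adjectives):
    """Order-insensitive match: the annotation must carry exactly the
    property values named in the query (an assumed reading of Rule 6)."""
    return set(query_adjectives) == set(annotation_adjectives)


query = ["small", "black"]  # from "a small black dog"
same_order = matches_properties(query, ["small", "black"])
swapped = matches_properties(query, ["black", "small"])
extra_value = matches_properties(query, ["small", "black", "white"])
```

Here the first two comparisons succeed and the third fails, mirroring “small black dog”, “black small dog” and “small black and white dog”.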

Rule 7 If well-known features are used to describe a concept in a query sentence, we use ConceptNet to infer the concept name. For example, with the query sentence “a kind of animal that can smell drugs runs on grass”, we treat “animal that can smell drugs” as a well-known feature and infer that this kind of animal may be “dog”.
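The inference in Rule 7 can be sketched with a small in-memory table of “CapableOf” edges. The edges and the category table below are written by hand for illustration and are not fetched from ConceptNet or WordNet; the function name `infer_concepts` is likewise an assumption.

```python
# Hand-written (concept, ability) pairs standing in for ConceptNet
# "CapableOf" edges (assumption: not real ConceptNet data).
CAPABLE_OF = [
    ("dog", "smell drugs"),
    ("dog", "bark"),
    ("bird", "fly"),
    ("cat", "catch mice"),
]

# Hand-written category table standing in for a taxonomy lookup.
IS_A = {"dog": "animal", "bird": "animal", "cat": "animal"}


def infer_concepts(category, ability):
    """Return the concepts of `category` that are capable of `ability`."""
    return sorted(
        concept
        for concept, known_ability in CAPABLE_OF
        if known_ability == ability and IS_A.get(concept) == category
    )


# "a kind of animal that can smell drugs"
candidates = infer_concepts("animal", "smell drugs")
```

With this toy data the only candidate is “dog”, so the query can be rewritten with the inferred concept name before the SPARQL query is generated.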

5.3.2 A New ABIR System for Complex Queries

To address the problems mentioned in Section 5.2.3, we implement a new image retrieval system with our optimization techniques.

The overview of the new system is shown in Figure 5.9. Like the preliminary system introduced in Section 5.2, it consists of two modules: a data pre-processing module and an image query module.

The new system improves the old system in two aspects:

(1) using WordNet and ConceptNet for better classification of concepts in both the data pre-processing module and the image query module; the details are introduced above in 5.3.1.1.

(2) using a set of rules to improve the accuracy of image retrieval. Rules No.1∼No.3 defined above in 5.3.1.2 are used in both modules (data pre-processing module and image query module) to correct the inaccuracies of the K-Parser. Rules No.4∼No.7 defined above in 5.3.1.2 are used in the image query module to increase the flexibility of image retrieval.

Figure 5.9: Overview of the new ABIR system. Labels in the figure: Flickr8K (www.flickr.com), images, sentences, get URIs & captions, parsing process, generate annotations, K-Parser, Jena RDF API, Jena TDB, RDF annotations, triple store, data pre-processing module, image query module, user, queries, SPARQL query generator, SPARQL queries, results, output interface, WordNet, ConceptNet, rules set.
