From
“Snow White
and the Magic
Mirror”
To
“Scientist and
Multimedia
What was
Mirror, mirror, on the wall,
Who is the fairest of them all?
O Lady Queen, Snow-White is
the fairest of them all
Cool!
How can the mirror do that?
.. .
Searching
Process
Asking question → Sending question with invocation (spell) → Searching → Selecting
Snow-White's stepmother: the user, who creates the question (the query)
Mirror: the interpreter, who translates “muggle” language into “wizard” language, and vice versa
Wizard: the seeker, who uses his “magic” search engine to find, among the inhabitants of the fairy-tale land, the person who matches the query. The answer he finds is called the retrieval
Mr. Scientist started to think about it. He wants to turn his computer into the magic mirror
Remember
A computer only understands “0” and “1”
All data must be transformed into a 'digital signal'
Example: translating sound into a digital audio signal
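As a rough illustration of turning sound into "0" and "1", the sketch below samples a pure tone and rounds each sample to one of a few levels. The function names, the 1 Hz tone, the sample rate, and the 3-bit depth are all assumptions chosen to keep the example small; they do not come from the slides.

```python
import math

def sample_and_quantize(frequency_hz, sample_rate_hz, n_samples, bits):
    """Sample a sine tone and map each sample to an integer code (hypothetical helper)."""
    levels = 2 ** bits
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz                                # time of the n-th sample
        amplitude = math.sin(2 * math.pi * frequency_hz * t)  # value in [-1, 1]
        # Scale [-1, 1] to [0, levels] and clamp the top edge into the last level.
        code = min(int((amplitude + 1) / 2 * levels), levels - 1)
        codes.append(code)
    return codes

def to_bit_strings(codes, bits):
    """Write each integer code as a fixed-width string of 0s and 1s."""
    return [format(code, f"0{bits}b") for code in codes]

codes = sample_and_quantize(frequency_hz=1.0, sample_rate_hz=8.0, n_samples=8, bits=3)
bit_strings = to_bit_strings(codes, bits=3)
print(codes)
print(bit_strings)
```

More bits per sample and a higher sample rate give a closer copy of the original sound, at the cost of more data.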
It seems that
everything is solved,
but
...
Challenges
The question of Snow-White's stepmother is “Who is the fairest of them all?”
It means that the magic mirror must look for
− The most beautiful girl, and
− The most benevolent girl
Challenges
How to evaluate “beauty”? For example, in fairy-tale land, a beautiful girl must reach a standard
Hair as black as ebony → color
Body as slender as a willow → shape
Skin as soft as silk → texture
Singing voice like a nightingale's → signal (sound)
→ Since this information belongs to the girl herself, it is called internal information or internal features
How does a computer evaluate “beauty”?
Challenges
How to evaluate “benevolence”? For example, in fairy-tale land, a benevolent girl must reach a standard
– Being loved by most of the inhabitants of fairy-tale land
– Doing a lot of things out of charity
→ Since this information does not belong to the girl, it must be collected from inhabitants who
Some technical terms
Color, texture, shape, and signal (sound) are called low-level features. They can be extracted directly from the objects in images and video, and the computer understands them easily.
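To make "extracted directly" concrete, here is a minimal sketch of one common low-level color feature, a coarse color histogram. The slides do not name a specific feature, so the histogram, the function name, and the four sample pixels are assumptions for illustration only.

```python
def color_histogram(pixels, bins_per_channel=2):
    """Count pixels per coarse RGB bin and return the fraction in each bin.

    pixels: list of (r, g, b) tuples with values 0..255 (hypothetical input format).
    """
    if not pixels:
        raise ValueError("pixels must not be empty")
    hist = [0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Map each channel value 0..255 to a bin number 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1
    return [count / len(pixels) for count in hist]

# Three dark pixels ("hair as black as ebony") and one bright pixel.
pixels = [(10, 10, 10), (20, 15, 5), (250, 240, 245), (30, 20, 25)]
hist = color_histogram(pixels)
print(hist)
```

The computer needs no knowledge of "hair" or "skin" to compute this list of numbers, which is why such features count as low-level.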
Some technical terms
Hair, body, skin, and eyes are called high-level features. In order to let the computer understand these features, Mr. Scientist must prepare a lecture that teaches the search engine to recognize them.
→ That lecture is called “bridging the semantic gap”
Some technical terms
After the computer (the magic mirror) translates the user's query into features, the search engine (the wizard) compares these features with the features in its database and chooses the best match.
The tool used to calculate the similarity between two sets of features is called a distance measure.
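The comparison step can be sketched as follows. Euclidean distance is used here only as one familiar distance measure; the slides do not say which one is meant. The feature values are made up for illustration.

```python
import math

def euclidean_distance(a, b):
    """Distance between two feature vectors of equal length; smaller means more similar."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_match(query_features, database):
    """Return (name, distance) for the database entry closest to the query."""
    scored = ((name, euclidean_distance(query_features, features))
              for name, features in database.items())
    return min(scored, key=lambda pair: pair[1])

# Hypothetical feature vectors, e.g. (color, shape, texture) scores between 0 and 1.
database = {
    "Snow-White": [0.9, 0.8, 0.9],
    "Cinderella": [0.5, 0.7, 0.6],
    "Mermaid": [0.2, 0.9, 0.4],
}
query = [1.0, 0.8, 1.0]  # features the mirror derived from the question
name, distance = best_match(query, database)
print(name, distance)
```

Other distance measures can be swapped in by replacing `euclidean_distance`; the search loop stays the same.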
Remind … Remind ...
An object usually has internal features and external information
Multimedia includes images, audio, video, and text
A query is what the user asks the computer to search for
A retrieval is what the search engine finds
Remind … Remind ...
A computer does not understand human language. It easily recognizes low-level features but hardly understands high-level features
The semantic gap between low-level features
Remind … Remind ...
The search engine only understands and works with features
A distance measure is necessary to compare two sets of features
Browsing is the process of showing the results returned by the search engine in a form that people can easily understand
Indexing is the process of reorganizing the data in the database so that it can be searched faster
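The idea behind indexing can be sketched with a very small case: each entry is described by a single number, the entries are sorted once, and every later lookup uses binary search instead of scanning the whole database. The one-number feature, the function names, and the values are assumptions for illustration; real multimedia indexes are more elaborate.

```python
import bisect

def build_index(records):
    """Sort the entries once by feature value; returns (keys, names) in matching order."""
    pairs = sorted((value, name) for name, value in records.items())
    keys = [value for value, _ in pairs]
    names = [name for _, name in pairs]
    return keys, names

def nearest(index, query_value):
    """Find the entry whose feature value is closest to query_value using binary search."""
    keys, names = index
    if not keys:
        raise ValueError("index is empty")
    pos = bisect.bisect_left(keys, query_value)
    # Only the neighbors just below and just above the query can be the closest.
    candidates = range(max(pos - 1, 0), min(pos + 1, len(keys)))
    best = min(candidates, key=lambda i: abs(keys[i] - query_value))
    return names[best], keys[best]

# Hypothetical one-number feature per inhabitant.
records = {"Snow-White": 0.95, "Cinderella": 0.70, "Mermaid": 0.40}
index = build_index(records)
print(nearest(index, 0.9))
```

Sorting costs time up front, but each query afterwards inspects only a handful of entries.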
Query by keyword
Annotation
[Figure: example images labeled “Cinderella” and “Mermaid”, with results marked OK and FALSE]
Manual: time-consuming
Automatic: fast but very difficult
Query by color
It looks good ...
But sometimes the results are not good
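One possible reason, offered here as an assumption since the slides do not state it, is that color alone says nothing about what the object is. The toy sketch below ranks images by the distance between their average colors; the image names and pixel values are invented for illustration.

```python
def mean_color(pixels):
    """Average R, G, B over an image given as a list of (r, g, b) tuples."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    return tuple(sum(p[c] for p in pixels) / len(pixels) for c in range(3))

def color_distance(a, b):
    """Euclidean distance between two average colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def query_by_color(query_pixels, database):
    """Return database names ordered from most to least similar in average color."""
    q = mean_color(query_pixels)
    return sorted(database, key=lambda name: color_distance(q, mean_color(database[name])))

# Hypothetical tiny "images" of two pixels each.
database = {
    "red apple": [(190, 40, 50), (200, 30, 40)],
    "red car": [(204, 26, 36), (206, 26, 36)],
    "green apple": [(60, 180, 50), (70, 190, 60)],
}
query = [(205, 26, 36)]  # the user has a red apple in mind
ranking = query_by_color(query, database)
print(ranking)
```

In this toy data the red car ranks above the red apple, because the search sees only color and not the kind of object.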