
Figure 9.4: Face retrieval interface, showing scroll bars to control the facial parameters, a text box for text-based retrieval, and the faces extracted from the video clips.

The facial parameters are non-discrete. For example, the parameter for sex takes a continuous value. These values are normalized to lie between 0 and 100: a value of 0 for the sex parameter means a perfectly female-looking face, and 100 means a perfectly male-looking face. Parameters such as age are not used in their literal meaning. In Figure 9.3, the actual age of the person on the left is 25 and that of the person on the right is 23, but the parameter expresses how old they look from their facial appearance.

The same applies to skin brightness, which is semantically different from skin color. FaceXplorer focuses on how a face looks in terms of appearance rather than on its semantics.
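As a rough illustration only, the following Python sketch shows one possible way to represent such normalized appearance parameters; the class name, the field names, and the normalization helper are hypothetical and do not reflect the actual implementation.

from dataclasses import dataclass

@dataclass
class FaceParameters:
    """Appearance-based facial parameters, each normalized to the range 0-100."""
    sex: float              # 0 = looks perfectly female, 100 = looks perfectly male
    apparent_age: float     # how old the face looks, not the person's actual age
    skin_brightness: float  # how bright the skin appears, not its semantic color

def normalize(value: float, lo: float, hi: float) -> float:
    """Map a raw estimator output in [lo, hi] onto the 0-100 scale."""
    value = min(max(value, lo), hi)
    return (value - lo) / (hi - lo) * 100.0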

The current facial parameters are subjectively chosen basic factors of facial appearance. These parameters will be further extended in our future implementation. Since different users might focus on different parameters, we are also planning, as future work, an extension that adapts the parameters to each user.

9.5 Interfaces

In this section, we describe the details of FaceXplorer, a video browsing interface that uses the results of the face recognition method described above. The interface is roughly divided into two components: a facial appearance-based video retrieval interface and a video playing interface. Both components use information about the faces in a video clip, which is automatically annotated to the clips by our video analysis procedure. Figures 9.4, 9.5, and 9.6 show screenshots of the interfaces.

Figure 9.5: Video retrieval interface with visualization of faces. Each video thumbnail is shown next to the list of faces appearing in that clip.

9.5.1 Video Retrieval Interface

From the input video collection and its analysis results, we implemented interfaces for retrieving faces by their appearance.

Vi et al. [94] proposed a method that displays photos by dynamically placing them in a box, realizing an interactive photo display system. Following this method, FaceXplorer dynamically displays all faces appearing in the video collection. Each face image represents an FTC (Figure 9.4).

A text box and scroll bars are provided to control each facial parameter for retrieval. When the user inputs text, faces from clips that do not match the input text disappear. This typical text-based retrieval lets the user narrow down the candidate clips (faces) by entering a keyword, such as the title of a clip or the name of a pop group. By manipulating the scroll bars of the facial parameters, thereby specifying a range for each parameter, the candidate faces are narrowed down further to the list of faces the user wants to watch. This narrowing-down is performed immediately after every user action, so the user can interactively browse the list of faces. Thus, FaceXplorer helps the user find faces of interest.
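A minimal sketch of this narrowing-down step is given below, assuming a hypothetical FaceEntry record; the idea is simply to re-apply the text filter and the parameter-range filters after every keystroke or scroll-bar change so that the displayed list updates immediately.

from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class FaceEntry:
    clip_title: str           # title of the clip the face was extracted from
    keywords: List[str]       # e.g. words from the clip description or a group name
    params: Dict[str, float]  # normalized facial parameters (0-100)

def narrow_down(faces: List[FaceEntry], query: str,
                ranges: Dict[str, Tuple[float, float]]) -> List[FaceEntry]:
    """Keep only the faces matching the text query whose parameters lie
    inside the ranges specified with the scroll bars."""
    q = query.strip().lower()
    survivors = []
    for face in faces:
        text_ok = (not q) or q in face.clip_title.lower() or any(
            q in kw.lower() for kw in face.keywords)
        range_ok = all(lo <= face.params.get(name, 0.0) <= hi
                       for name, (lo, hi) in ranges.items())
        if text_ok and range_ok:
            survivors.append(face)
    return survivors

# Example: restrict the view to faces that look roughly 20-30 years old.
# visible = narrow_down(all_faces, "pop group", {"apparent_age": (20.0, 30.0)})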


Figure 9.6: Face playing interface, showing the list of faces in the current clip, the selected and pointed faces, the timeline visualizing the shots in which faces appear, the scenes including faces, and the scene skip button.

It is not only faces that the user wants to find. Another retrieval mode enables the user to retrieve a video clip. Figure 9.5 shows a screenshot of this mode, which can be switched to from the face retrieval interface by toggling a switch. In this mode, all the faces appearing in a clip are displayed beside the video thumbnail, so the user can select a clip to watch by looking at the list of who appears in it. This is useful for deciding whether to watch a clip on the basis of the people who appear in it. For example, a user who likes a specific member of a pop group can look through the list before watching the video. If the favorite person does not appear in some clips, even though their name is included in the descriptions, the user can avoid watching those clips by checking the list.
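As an illustration of this clip-selection mode, the sketch below groups the recognized faces by clip; the DetectedFace record and its fields are assumptions made for the example.

from collections import defaultdict
from typing import Dict, List, NamedTuple

class DetectedFace(NamedTuple):
    clip_title: str   # clip the face track was extracted from
    person_id: int    # identity label assigned by the face recognition step

def faces_per_clip(faces: List[DetectedFace]) -> Dict[str, List[DetectedFace]]:
    """Group detected faces by clip so each video thumbnail can be shown
    next to the people who actually appear in that clip."""
    by_clip: Dict[str, List[DetectedFace]] = defaultdict(list)
    for face in faces:
        by_clip[face.clip_title].append(face)
    return dict(by_clip)

def clips_featuring(faces: List[DetectedFace], person_id: int) -> List[str]:
    """Clips in which a given person actually appears, letting the user skip
    clips that only mention the person in their description."""
    return sorted({f.clip_title for f in faces if f.person_id == person_id})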

9.5.2 Video Playing Interface

After a face or a video is selected with the retrieval interface, the video playing interface is activated (Figure 9.6). The largest area of the interface is a screen for playing the video clip. Below the screen is a timeline that visualizes the scenes containing the target person and the other people who appear in the clip. On the timeline, shots in which any person appears are shown in blue, and shots in which the target person appears are shown in green. On the right side of the screen, a list of the faces appearing in the clip is shown. When the user places the cursor on a face in this list, the parts of the timeline showing scenes in which that face appears are highlighted in red. Selecting a face from the list leads to continuous playback of the scenes containing that person.

Figure 9.7: Viewing the face appearing at a certain point on the timeline.

The same operation is available when the user selects a specific face in the face retrieval interface: selecting a face from the list activates the video playing interface, which plays the scenes containing that person. When the cursor is placed over the timeline, the face of the person appearing at that point in the clip is displayed above the timeline (Figure 9.7).
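The timeline behavior described above can be sketched as follows; the Shot record and the color precedence (red hover highlight over green over blue) are assumptions made for illustration, not the exact implementation.

from typing import List, NamedTuple, Optional, Set

class Shot(NamedTuple):
    start: float          # shot start time (seconds)
    end: float            # shot end time (seconds)
    person_ids: Set[int]  # people recognized in the shot

def timeline_colors(shots: List[Shot], target_id: int,
                    hovered_id: Optional[int] = None) -> List[str]:
    """Color of each shot on the timeline: red for shots containing the face
    the cursor hovers over in the face list, green for the target person,
    blue for any other person, and grey for shots without faces."""
    colors = []
    for shot in shots:
        if hovered_id is not None and hovered_id in shot.person_ids:
            colors.append("red")
        elif target_id in shot.person_ids:
            colors.append("green")
        elif shot.person_ids:
            colors.append("blue")
        else:
            colors.append("grey")
    return colors

def faces_at(shots: List[Shot], t: float) -> Set[int]:
    """People appearing at time t in the clip, shown above the timeline
    when the cursor is placed over that point."""
    for shot in shots:
        if shot.start <= t < shot.end:
            return shot.person_ids
    return set()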

While playing a video clip, the most important thing to see is the clip itself. Therefore, we designed the timeline and the list of faces to be as small as possible. The video can also be played in full-screen mode. This interface helps the user watch a video clip with a focus on the people that the user is interested in.
