Abstract of Thesis
School: Graduate School of Natural Science (自然科学研究科)
Division: Industrial Innovation Engineering (産業創成工学専攻)
Student No.: 51430706
Name: 寇 鄴郡
Title of Thesis (with a Japanese translation when the title is in English)
Pose Estimation of Arbitrary 3D Target Object by Stereo-vision (ステレオビジョンによる任意3次元対象物のポーズ推定)
Abstract of Thesis
Making robots act like humans is a difficult task. A human can understand the surrounding situation and perceive 3D space in unknown circumstances, although not with perfect accuracy. An engineering solution that lets robots see and perceive their environment in the same way still seems a long way off. This gap may come from the robots’ lack of 3D perception of the surrounding environment: their abilities are not yet sufficient to behave intelligently, like a human, in unknown circumstances.
Most studies on robot vision that enable robots to estimate a target’s pose, i.e., its position and orientation, from visual information are based on a monocular camera with a pre-defined recognition model. The disadvantage is that monocular vision does not measure distance along the camera’s depth direction precisely enough for tasks such as grasping the target object, which require high distance-measurement accuracy. In addition, constructing a 3D model programmatically is complicated. For handling industrial parts by robots, some researchers use CAD models because they are readily available from the design phase. However, for general objects appearing in nature, e.g., a rock in the natural environment, it is impossible to describe them with CAD models. Therefore, many studies have exploited RGB-D cameras to estimate an arbitrary target’s pose. Structured light and laser point clouds avoid the complicated model-construction process. Still, RGB-D cameras can barely be used in outdoor and underwater environments, which limits the situations in which they are feasible.
To realize real-time 6DOF pose estimation of an arbitrary target, the author proposes a new method named the “projection-based 3D perception (Pb3DP)” method. The projection-based method is derived and expanded from our previous research, known as the photo-model-based method, which confirmed that a 2D photo-model of a 3D object makes it possible to estimate the 3D target’s pose. In the projection-based method, stereo vision plays an essential role. The benefit of stereo vision lies in its inherent merit, the parallax between the two views, which can improve distance estimation accuracy. The concept of Pb3DP exploits the merit of using a 2D photo-model for pose tracking of a 3D target. The 2D photo-model that would otherwise have to be prepared beforehand is replaced in Pb3DP by the 2D image of the target projected in the left camera image, so that the pose of an arbitrary 3D target can be measured as long as the target is visible in both stereo images. Since the 2D photo-model can be obtained from the left camera image at any time while the target is visible in it, there is no need to prepare the 2D photo-model in advance.
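The distance-accuracy benefit of parallax can be illustrated with the standard relation for rectified stereo cameras, Z = f B / d, where d is the disparity between the left and right image columns. The following is a minimal sketch only and is not the Pb3DP implementation: it assumes ideal rectified pinhole cameras, and the function names, focal length, and baseline are hypothetical values chosen for illustration.

```python
def depth_from_disparity(x_left, x_right, focal_px, baseline_m):
    """Depth Z in metres of a point seen at column x_left (left image)
    and x_right (right image) of a rectified stereo pair.

    Assumption: ideal pinhole cameras, rectified images, focal length
    given in pixels, baseline given in metres.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity


def depth_error_per_pixel(depth_m, focal_px, baseline_m):
    """First-order depth change caused by a one-pixel disparity error.

    Differentiating Z = f * B / d with respect to d gives
    |dZ/dd| = Z**2 / (f * B), so the error grows with the square of
    the distance and shrinks as the baseline gets longer.
    """
    return depth_m ** 2 / (focal_px * baseline_m)


if __name__ == "__main__":
    # Hypothetical camera parameters, for illustration only.
    f_px, b_m = 800.0, 0.1
    z = depth_from_disparity(420.0, 380.0, f_px, b_m)
    print("depth [m]:", z)
    print("depth error per pixel [m]:", depth_error_per_pixel(z, f_px, b_m))
```

With these assumed values the point lies about 2 m away, and a one-pixel matching error shifts the depth estimate by roughly 5 cm. A single camera has no baseline and therefore no comparable geometric constraint on depth.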
This dissertation proposes a real-time 6DOF projection-based pose estimation method, and the feasibility of real-time pose estimation of an arbitrary target has been confirmed through visual servoing experiments. The method can estimate the full pose of an arbitrary 3D target object. To the best of the author’s knowledge, no studies have yet been conducted on 3D pose visual servoing of a real-world object using only stereo-vision information. What the author aims to verify by actual experiments in this dissertation is whether an arbitrary target’s pose can be estimated without a pre-defined recognition model, and whether the estimated pose can be used for 3D pose visual servoing.
The results show that the full pose of an arbitrary 3D target can be estimated in real time, and that the estimated pose enables 3D visual servoing to the target. These results have been confirmed by actual experiments using a 6DOF manipulator with stereo-vision cameras mounted at the end-effector.