Graduate School of Global Information and Telecommunication Studies, Waseda University
Abstract of Doctoral Dissertation
Research on Improving Methods for Visualizing Common Elements in Video Game Applications
ビデオゲームアプリケーションにおける共通的な要素の視覚化手法の改良に関する研究
Candidate’s name Sven Dierk Michael Forstmann
Global Information and Telecommunication Studies Research on Image Processing II
Date March 2013
Since the invention of computers in the 1940s, data acquisition, data processing, and data visualization have been tasks and research topics in computer science. The history of computer graphics (CG) started soon after the invention of computers. In the early days, only simple graphics could be rendered, but since the 1980s, many important component technologies of modern 3D graphics have been developed. As these 3D graphics technologies have matured, CG has come to be used in many applications such as webpages, scientific and engineering research, medicine, art, education and entertainment. Among entertainment applications in particular, video games have now become a larger industry than cinema and attract a very large audience.
When video games first appeared as arcade machines built on unique hardware, they were developed independently, without any third-party software. Third-party game engines were not used for the home computers of that time either. Early games did not require complex algorithms. However, with the evolution of Intel x86 hardware and the beginning of 3D graphics in the early 1990s, the complexity of game development increased significantly. At that time, the first game engines for 3D games appeared, but because they could not handle true 3D graphics, they were normally referred to as game engines for 2.5D graphics. Then, in the late 1990s, true 3D engines appeared, which could handle arbitrary 3D geometry configurations and utilize 3D hardware acceleration. Since then, game engines have been used as common tools for producing 3D games.
“Game Engine” is defined as a software framework that includes all modules needed for producing a video game. Among these modules, the most important is the so-called “3D Engine”. Unlike the term “Game Engine”, “3D Engine” is not strictly defined. A 3D Engine can accommodate CG software applications developed on multiple platforms and can visualize the combined CG images, each of which is generated by the corresponding CG software application. State-of-the-art 3D engines visualize the following general types of objects: terrain, static objects, skeletal animated objects, plants, sky and clouds. Of these, this thesis assumes that terrain, static objects and skeletal animated objects are accommodated in a 3D Engine.
The purpose of this thesis is to overcome limitations of existing methods for visualizing terrain, static objects and skeletal animated objects. The specific goals for each of the three object types are as follows.
(1) Terrain: The goals for overcoming the limitations of existing volumetric terrain visualization methods are as follows. First, volumetric terrain data should be visualized without pre-computation, because all existing methods require a pre-computation step prior to visualization. Second, terrain of infinite size should be supported, because existing algorithms can visualize only terrain of limited size, as they need to pre-process the entire terrain data. Last, new procedural terrain data should be generated dynamically, on the fly and in parallel with the visualization, to allow the visualization of infinitely large terrains, which is impossible with conventional methods.
(2) Static objects: The goals for improving existing methods for visualizing static 3D models with complicated structures are as follows. Complex voxel scenes should be visualized faster than with conventional splatting methods and conventional triangle-based rasterization. The memory consumption should be lower than that of the scene’s equivalent triangle mesh, and lower than that of related voxel-based raycasting methods. In short, the highest rendering speed and the lowest memory consumption for visualizing detailed voxel scenes should be achieved.
(3) Skeletal animated objects: The goals for improving existing skeletal animation methods are as follows, where the goals are divided into two groups.
Goal 1 (approached by Skeletal Animation): Collapsing geometry caused by large bend angles of articulated objects, which can be observed in matrix skinning (SSD), should be prevented. Compared to quaternion skinning (QS) and dual quaternion skinning (DQS), faster computation, artifact-free deformations and more parameters for adjusting the deformation should be achieved. The number of joints in complex skeletons should be reduced while fast rendering speed is preserved and artifacts are prevented.
Goal 2 (approached by Pose Dependent Customization): In contrast to pose-space deformation, two kinds of “re-usability”, which are impossible with existing methods, should be achieved. First, a simple and abstract design of deformation styles for re-usable deformation behaviors, such as muscle bulges or cloth wrinkles, should be achieved. Second, one deformation style should be instantly applicable to any number of target characters.
The above-mentioned goals are tackled by this thesis as follows.
Chapter 1 is the introduction, which consists of the background, the related work, the purpose, the proposed approach and the organization of this thesis.
Chapter 2 overviews existing 3D engines and explains their relationship to Game Engines and how they combine multiple visualization modules. The most advanced 3D engine currently available is the so-called CryEngine 3. This thesis assumes that CryEngine 3 embeds the three methods proposed in Chapters 3 to 5, respectively, as modules.
Chapter 3 explores how to achieve goal (1) for terrain visualization. This thesis proposes a method based on nested Clip-Boxes, which allow the immediate, pre-computation-free visualization of volume data of arbitrary size. A Clip-Box consists of a cubic regular grid of voxels and the corresponding triangulation. For the visualization, multiple nested Clip-Boxes are centered on the viewer, each half the size of the previous one. To preserve this arrangement over time, their positions, voxel data and triangulations are updated frequently according to the position of the viewer, concurrently with the visualization. Since only a small and well-defined region of the terrain data around the viewer is required for the visualization, the terrain as a whole can be of arbitrary size. It is therefore possible to visualize infinitely large procedural terrains without any pre-computation. The procedural method does not need to compute the entire terrain data; computing the terrain data within the Clip-Boxes that surround the viewer is sufficient.
Experimental results confirm the validity of the proposed method. Specifically, terrains of arbitrary size, whose data are computed from mathematical functions, are procedurally generated and simultaneously visualized at 30-100 fps (frames per second), without any pre-computation and without streaming any data from mass-storage devices.
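The nested Clip-Box arrangement described above can be illustrated with a minimal Python sketch. It is not the implementation of this thesis: the names `ClipBox`, `nested_clip_boxes`, `terrain_density` and `count_solid_voxels`, the snapping of each box to its own voxel grid, and the sine-based terrain function are all assumptions made for illustration, and the triangulation step is omitted. The sketch only shows that each box is half the size of the previous one, that all boxes stay centered on the viewer, and that the procedural terrain function is evaluated only inside the boxes.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ClipBox:
    voxel_size: float  # edge length of one voxel in world units
    corner: tuple      # minimum corner of the box in world space
    grid: int          # voxels per edge (the box is cubic)

    @property
    def extent(self):
        """Edge length of the whole box in world units."""
        return self.voxel_size * self.grid

    def contains(self, p):
        return all(c <= x < c + self.extent for c, x in zip(self.corner, p))


def nested_clip_boxes(viewer, levels=4, grid=32, finest_voxel=1.0):
    """Return the boxes coarsest first; each is half the size of the previous."""
    boxes = []
    for level in reversed(range(levels)):
        voxel = finest_voxel * 2 ** level
        # Snap the box to its own voxel grid (an assumption of this sketch),
        # so the box moves in whole-voxel steps as the viewer moves.
        corner = tuple((math.floor(c / voxel) - grid // 2) * voxel for c in viewer)
        boxes.append(ClipBox(voxel, corner, grid))
    return boxes


def terrain_density(x, y, z):
    """Hypothetical procedural terrain: solid where the value is positive."""
    return 8.0 * math.sin(0.05 * x) * math.cos(0.05 * z) - y


def count_solid_voxels(box):
    """Evaluate the terrain function only inside one box, at voxel centers."""
    solid = 0
    for i in range(box.grid):
        for j in range(box.grid):
            for k in range(box.grid):
                x, y, z = (c + (n + 0.5) * box.voxel_size
                           for c, n in zip(box.corner, (i, j, k)))
                if terrain_density(x, y, z) > 0.0:
                    solid += 1
    return solid


if __name__ == "__main__":
    for box in nested_clip_boxes((100.3, 5.0, -40.7), levels=3, grid=16):
        print(box.voxel_size, box.corner, count_solid_voxels(box))
```

Because every box has the same number of voxels, the work per update in this sketch depends on the number of levels and the grid resolution, not on the total terrain size.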
Chapter 4 explores how to achieve goal (2) for visualizing static objects. This thesis proposes a parallel voxel-based raycasting approach for run-length-encoded (RLE) data sets. The proposed method raycasts the scene in vertical planes that are perpendicular to the ground plane. For each plane, only one ray is cast into the RLE structure, and the result for each vertical plane is stored as a column in a temporary buffer. The temporary buffer is finally mapped to the screen to produce the correct image. Besides casting only one ray per screen column, efficient visibility culling by an extended floating horizon algorithm and early ray termination are the main reasons for the high speed. It is therefore possible to visualize voxel scenes faster than with triangle-based rasterization or basic splatting. Owing to RLE data compression, memory consumption per element is also lower than for triangle-based rasterization, triangle-based raycasting and related voxel-based raycasting methods.
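To make the role of run-length encoding concrete, the following Python sketch encodes a single vertical column of voxels as runs and queries it. It illustrates only the storage idea, not the raycaster, the floating horizon culling or the actual data layout of this thesis; the function names and the convention that 0 denotes an empty voxel are assumptions of the sketch.

```python
def rle_encode(column):
    """Encode a list of voxel values (0 = empty) as [(value, run_length), ...]."""
    runs = []
    for value in column:
        if runs and runs[-1][0] == value:
            runs[-1] = (value, runs[-1][1] + 1)  # extend the current run
        else:
            runs.append((value, 1))              # start a new run
    return runs


def rle_lookup(runs, height):
    """Return the voxel value at the given height by walking the runs."""
    start = 0
    for value, length in runs:
        if height < start + length:
            return value
        start += length
    raise IndexError("height lies outside the encoded column")


def top_solid_height(runs):
    """Height of the highest non-empty voxel in the column, or -1 if none."""
    top, start = -1, 0
    for value, length in runs:
        if value != 0:
            top = start + length - 1
        start += length
    return top


if __name__ == "__main__":
    # 32 voxels: ground material 1, a gap, an overhang of material 2, then air.
    column = [1] * 10 + [0] * 5 + [2] * 3 + [0] * 14
    runs = rle_encode(column)
    print(len(column), "voxels stored as", len(runs), "runs:", runs)
```

In this example a column of 32 voxels is stored as 4 runs. A ray that traverses such a column can skip an entire run in one step, which is one plausible reason why a compact run-based layout suits column-wise raycasting.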
Experiments show that the proposed method and QSplat consume the least memory, while GigaVoxels, triangles, triangle raytracing and sparse voxel octrees consume more. Comparing rendering speed, the proposed method, triangle raytracing, GigaVoxels and sparse voxel octrees are the fastest, while QSplat and triangle rasterization are slower. The proposed method is thus tied with QSplat for the lowest memory consumption, and tied with triangle raycasting, GigaVoxels and sparse voxel octrees for rendering speed. A comprehensive evaluation of these results indicates that the proposed method performs best overall and that the goals are achieved.
Chapter 5 explores how to achieve goals (3) for skeletal animation. This thesis proposes a deformation-style-based approach combined with spline skinning. The proposed spline skinning is a combination of spline-aligned deformations and conventional SSD. While SSD uses vertex weights to blend matrices, spline skinning uses them to blend multiple spline curves. As spline-aligned deformations do not exhibit the artifacts common to SSD, deformation artifacts are avoided. Splines can further help to simplify complex skeletal animations, such as a spine or facial animations, by replacing multiple common joints with one spline. Further, since the per-vertex computation can be reduced almost to that of SSD, spline skinning can be computed faster than QS and DQS.
Last, the combined method with deformation styles allows the creation of muscles and other custom deformations that can be applied to any number of characters simultaneously.
Deformations are defined abstractly by three scale textures and three scale curves. Once defined, these deformation styles can be applied immediately to any number of target characters simultaneously.
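The contrast drawn above between SSD and spline-aligned deformation can be illustrated in 2D with a short Python sketch. It is a simplified illustration under stated assumptions, not the spline skinning method of this thesis: the joint sits at the origin, the parent bone is fixed, the child bone is rotated about the joint, and the "spline" is a single quadratic Bezier curve through hypothetical control points (parent end, joint, child end). The SSD part is the standard weighted blend of transformed positions; the spline part places a vertex at a fixed offset along the curve normal.

```python
import math


def rotate(p, angle):
    """Rotate a 2D point about the origin (the joint in this sketch)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])


def ssd_blend(p, angle, w_child):
    """SSD: blend the vertex as moved by the parent (identity) and the child."""
    q = rotate(p, angle)
    return ((1 - w_child) * p[0] + w_child * q[0],
            (1 - w_child) * p[1] + w_child * q[1])


def bezier(p0, p1, p2, t):
    """Point and first derivative of a quadratic Bezier curve at parameter t."""
    u = 1.0 - t
    point = tuple(u * u * a + 2 * u * t * b + t * t * c
                  for a, b, c in zip(p0, p1, p2))
    deriv = tuple(2 * u * (b - a) + 2 * t * (c - b)
                  for a, b, c in zip(p0, p1, p2))
    return point, deriv


def spline_aligned(p0, p1, p2, t, offset):
    """Place a vertex at distance `offset` along the curve normal at t."""
    (x, y), (dx, dy) = bezier(p0, p1, p2, t)
    length = math.hypot(dx, dy)
    nx, ny = -dy / length, dx / length  # unit normal (tangent rotated 90 degrees)
    return (x + offset * nx, y + offset * ny), (x, y)


if __name__ == "__main__":
    vertex = (0.0, 1.0)  # sits above the joint at distance 1 from the bone axis
    for degrees in (90, 180):
        bent = ssd_blend(vertex, math.radians(degrees), 0.5)
        print("SSD", degrees, "deg: distance to joint =", math.hypot(*bent))

    child_end = rotate((1.0, 0.0), math.radians(90))
    moved, center = spline_aligned((-1.0, 0.0), (0.0, 0.0), child_end, 0.5, 1.0)
    print("spline-aligned: distance to curve point =",
          math.hypot(moved[0] - center[0], moved[1] - center[1]))
```

With equal weights, the SSD vertex moves toward the joint as the bend grows and reaches it at 180 degrees, which is the collapsing geometry named in Goal 1. In the spline-aligned variant the offset from the curve is preserved by construction. The 180 degree case is left out of the spline part because this particular control-point choice degenerates there.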
Experimental results for spline skinning show that the deformation artifacts caused by SSD, QS and DQS can be avoided. The proposed method achieves over 90% of the speed of SSD and is thus faster than QS and DQS, which achieve 78% and 72% of the performance of SSD, respectively. Complex lip deformations using two splines and the animation of a spine with one spline demonstrate that complex animations can be achieved with fewer control points. Experimental results for deformation styles show that a deformation style can be applied immediately to two target characters simultaneously, which confirms the validity of the “re-usability”. These results indicate that this chapter’s goals in terms of artifact-free deformations, flexibility, faster speed and re-usable deformations are achieved.
Chapter 6 concludes this thesis and discusses future work.