Source: csce.uark.edu › ~cwt › COURSES › 2013-01--CSCE-5053--Adv... · 2013-05-01


University of Arkansas – CSCE Department Advanced Virtual Worlds – Spring 2013

Photorealistic 3D Rendering for VW in Mobile Devices

Rafael Aroxa

Abstract In the past few years, the demand for high-performance mobile devices has increased substantially. Most applications that require 3D graphics use commonly available frameworks such as Unity, Unreal Engine, or straight OpenGL. These frameworks use a technique called rasterization, which is relatively simple and yields good performance without sacrificing graphic quality. However, this technique cannot easily handle some phenomena of light (e.g. reflection and refraction). In order to support such effects, the framework has to emulate them, which makes it hard to achieve optimal results in terms of quality. Other techniques, such as ray tracing, do not require such emulation, as the aforementioned phenomena of light are inherently accounted for. The idea of this project is to implement a photorealistic 3D renderer for mobile devices that uses ray tracing to generate high-quality graphics. Due to its inherent complexity, this technique has several drawbacks, especially in terms of performance. To alleviate this problem, special algorithms and data structures will be used to substantially improve the performance of the rendering engine. Being able to render high-quality graphics with good performance can potentially revolutionize the gaming industry, as such a technique has never been implemented in commonly available devices, such as smartphones and tablets.

1.0 Problem Most of the currently available frameworks for rendering graphics on mobile platforms use a technique called rasterization, which is very common and yields good performance and quality. As it became widely used over time, several efforts led to hardware implementations of portions of this technique, thereby yielding optimal performance. Most of the smart devices (phones and tablets) available today have this type of specialized hardware embedded in them. Some of them present better performance than others, but most support commonly available APIs for 3D graphics (e.g. OpenGL). Even though rasterization presents good overall performance and graphical quality, the technique has some severe limitations. For instance, accounting for some light phenomena, such as reflection and refraction, is not an easy task. These effects are often emulated in order to improve image quality at the penalty of decreased performance. Not surprisingly, this emulation does not yield optimal quality compared to other techniques that can be used for rendering.


As graphics hardware for mobile devices has improved over the years, users now expect better graphics quality in their applications. This is especially true for mobile games, whose users are fastidious about graphics. Since the launch of mobile stores (e.g. Apple’s App Store and Google’s Play Store), the number of mobile game users has greatly increased. In fact, games are the most downloaded category of application in both stores.

The frameworks available today for rendering on mobile platforms implement only a handful of techniques. Therefore, they do not allow the developer to use more sophisticated techniques that produce higher-quality graphics (e.g. ray tracing, radiosity, path tracing, and photon mapping). Being able to use such techniques on mobile platforms, and thus produce better graphics, would clearly benefit users.

2.0 Objective The objective of this project is to implement a high quality 3D rendering engine for mobile devices. The technique that will be used to achieve this goal is computationally expensive, but using the correct algorithms and data structures can help alleviate this problem.

3.0 Background In computer graphics, rendering is the process of synthesizing an image from a 3D model. This model is usually defined in a scene file, which stores information about object geometries, camera position, lighting, and other parameters. The output of this process can either be an image saved to disk, or pixel colors written to a buffer and displayed on screen. There are several rendering techniques, and a trade-off between quality and performance is common to all of them.

Figure 1: Several shapes transformed into the projection plane. Note the aliasing caused by the discrete nature of the display.

Rasterization is a rendering technique that works by applying 3D geometric transformations to the scene objects and projecting them onto a 2D plane called the Projection Plane (Figure 1).


In a second step, each pixel in this plane has its color calculated based on the corresponding 2D polygon. This technique is capable of delivering reasonable-quality images at high performance.
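As a rough illustration of this per-pixel step (a sketch, not the project's code), the test that decides whether a pixel center is covered by a projected triangle can be written with edge functions:

```python
def edge(ax, ay, bx, by, px, py):
    """Signed area test: positive if point p lies to the left of edge a->b."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

def covers(tri, px, py):
    """True if pixel center (px, py) lies inside the projected triangle.
    Vertices are assumed to be given in counter-clockwise order."""
    (x0, y0), (x1, y1), (x2, y2) = tri
    return (edge(x0, y0, x1, y1, px, py) >= 0 and
            edge(x1, y1, x2, y2, px, py) >= 0 and
            edge(x2, y2, x0, y0, px, py) >= 0)
```

A rasterizer runs a test like this (in hardware) for every pixel touched by every projected polygon, then shades the covered pixels.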

Techniques such as ray tracing[1] are able to generate higher-quality images (Figure 2). Ray tracing works by tracing rays towards each pixel in the projection plane and calculating the color of each of them individually. Each ray may or may not hit the scene. If the ray does not hit any object, the color of the corresponding pixel will be the background color (a constant set in the scene file), which is usually black. If the ray hits an object, however, the physical properties of its surface will define the color of the corresponding pixel. This algorithm has a high computational cost, mainly because it requires a large number of ray–polygon intersection tests, the most expensive operation in this technique. Indeed, an average ray tracer implementation spends around 95% of the rendering time in such intersection calculations[2]. There are several approaches to optimizing the ray tracing algorithm, and most of them involve specialized data structures.
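To give a concrete sense of the calculation that dominates the rendering time, here is a minimal ray–sphere intersection test (an illustrative sketch; the project tests ray–polygon intersections, and none of these names come from its code):

```python
import math

def ray_sphere_hit(origin, direction, center, radius):
    """Return the nearest positive hit distance t along the ray, or None.
    Solves |origin + t*direction - center|^2 = radius^2, a quadratic in t."""
    ox, oy, oz = origin
    dx, dy, dz = direction
    lx, ly, lz = ox - center[0], oy - center[1], oz - center[2]
    a = dx * dx + dy * dy + dz * dz
    b = 2.0 * (dx * lx + dy * ly + dz * lz)
    c = lx * lx + ly * ly + lz * lz - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None                      # ray misses the sphere entirely
    sqrt_disc = math.sqrt(disc)
    for t in ((-b - sqrt_disc) / (2.0 * a), (-b + sqrt_disc) / (2.0 * a)):
        if t > 1e-6:                     # ignore hits behind the ray origin
            return t
    return None
```

A full renderer evaluates a test like this for every ray against every candidate primitive, which is why acceleration data structures matter so much.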

Figure 2: Side-by-side comparison of rasterization and ray tracing. Note the reflection of the scene on the surface of the teapot. Such an effect is not possible in rasterization without the use of other techniques.

OpenGL ES[3] is an open framework for rendering graphics on embedded systems, hence its name. Since its introduction, it has been adopted by major companies such as Apple and Google. Version 3.0 of the framework was released on August 6, 2012; however, 1.1 and 2.0 are the most used versions. This framework implements rasterization and is responsible for interfacing with the graphics card in order to perform calculations and update the screen.

4.0 Design The architectural design of this project involves the following entities: Application, Renderer, Storage, OpenGL ES, and Graphics Chip. The details about each of them and how they interact with each other, as well as a diagram (Figure 3), are presented below.

The Application is the binary executable per se, and is responsible for connecting the other entities. Initially, it allocates a buffer into which the pixel colors will be written, and then instantiates the Renderer. The Application is also responsible for communicating with the operating system and instantiating user interface elements. In order for the individual frames to be rendered sequentially, the Application registers a render loop with the operating system. This is where the Renderer is requested to render the frame and the output data is passed to OpenGL ES for final processing. This rendering loop executes until interrupted by the user or the operating system.


Figure 3. Overview of the architecture. Arrows represent how the data flows between the different entities.

The Renderer is the most important entity in this work. It is responsible for generating an image based on the 3D models present in the scene. Initially, it communicates with the Storage and loads into memory the models from a scene file. When the render operation is called, it traces the rays, computes the colors of each pixel, and saves the result in the buffer. After each frame is processed, the Renderer cleans up the unused resources and prepares itself for the next frame to be rendered.

In this work, OpenGL ES is used to draw the final rendered image on screen. It is responsible for instantiating a texture buffer in the video memory and receiving the buffer that is to be drawn on screen. This framework is responsible for communicating with the graphics chip present on the device and transferring data to/from it.

5.0 Implementation For the sake of simplicity, some physical phenomena were not considered in this work. More specifically, refraction and diffuse inter-reflection were left as future work (as part of my Master’s Thesis). This section explains the detailed implementation of the main algorithms in the order in which they are executed in the application.

First, the Renderer loads the scene file from the Storage. This file is described in SDL (Scene Description Language) and contains several definitions concerning the camera, the 3D environment, the number of recursive rays to be traced, etc. In addition, the SDL file references the 3D objects that are present in the scene. Figure 4 shows an example of an SDL file that represents the Cornell Box, a 3D scene model commonly used by the community. The surface material specification may vary per object; in fact, in this example file, each object has its own set of parameters. Each of the constants ka, kd, ks, ke, kr and kt has a specific purpose. The first three define how much ambient, diffuse, and specular light should be present in the surface material color (note that they have to sum to 1.0). ke represents the power exponent in the specularity equation [4]. kr and kt represent the weights of the reflected and transmitted (refracted) colors in the final value, together with the surface material color.
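As a hypothetical sketch of how these constants might combine for one color channel (the report does not give the exact equation; the `shade` helper and the (1 − kr) mixing below are assumptions, chosen to be consistent with Section 6.0’s note that kr = 0.8 attributes 80% of the color to reflection):

```python
def shade(ambient_i, diffuse, specular, reflected, ka, kd, ks, kr):
    """Combine lighting components for one color channel.
    ka, kd, ks weight the ambient/diffuse/specular terms (they sum to 1.0);
    kr mixes the recursively traced reflection color into the local color.
    This mixing rule is an assumption, not the project's exact formula."""
    local = ka * ambient_i + kd * diffuse + ks * specular
    return (1.0 - kr) * local + kr * reflected
```

For a wall with ka = 0.3, kd = 0.7, ks = 0.0 and kr = 0.8, the locally shaded color contributes only 20% of the final value; the rest comes from the reflected ray.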



eye 0.0 0.0 5.7            #(x,y,z) vector that defines camera direction
size 256 256               #(width,height) size of the rendering buffer
ortho -1 -1 1 1            #rectangle that defines the visible area
background 0.0 0.0 0.0     #(r,g,b) background color
ambient 1                  #(float) ambient light intensity
max_depth 1                #(integer) number of recursive rays to be traced
#light <name.obj> r g b intensity
light luzcornell.obj 1.0 1.0 1.0 1.0
#object <name.obj> r g b ka kd ks ke kr kt
object leftwall.obj  0.5 0.0 0.0 0.3 0.7 0.0 0.00 0.8 0.0  # left wall
object rightwall.obj 0.0 0.5 0.0 0.3 0.7 0.0 0.00 0.8 0.0  # right wall
object back.obj      1.0 1.0 1.0 0.3 0.3 0.4 10.0 0.8 0.0  # back wall
object floor.obj     1.0 1.0 1.0 0.3 0.7 0.0 0.00 0.0 0.0  # floor
object ceiling.obj   1.0 1.0 1.0 0.3 0.7 0.0 0.00 0.0 0.0  # ceiling
object cube1.obj     1.0 0.0 0.0 0.3 0.3 0.4 1.00 0.0 0.0  # cube RED
object cube2.obj     0.0 1.0 0.0 0.3 0.7 0.0 0.00 0.0 0.0  # cube GREEN

Figure 4. Example of SDL file.
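A minimal parser for the `object` lines of this format could look like the following sketch (hypothetical; the project's actual loader is not shown in the report):

```python
def parse_sdl_object(line):
    """Parse one SDL line: object <name.obj> r g b ka kd ks ke kr kt.
    Everything after '#' is a comment and is ignored."""
    line = line.split('#', 1)[0].strip()
    fields = line.split()
    assert fields[0] == 'object', 'not an object line'
    name = fields[1]
    r, g, b, ka, kd, ks, ke, kr, kt = map(float, fields[2:11])
    return {'mesh': name, 'color': (r, g, b),
            'ka': ka, 'kd': kd, 'ks': ks,
            'ke': ke, 'kr': kr, 'kt': kt}
```

For example, parsing the left-wall line above yields a mesh reference of `leftwall.obj` with kr = 0.8 (highly reflective) and kt = 0.0 (opaque).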

Algorithm 1 Render scene.
    ray_origin := camera position
    for i := 0 to image height − 1 do
        for j := 0 to image width − 1 do
            ray_direction := vector from camera focal point to center of pixel (i, j)
            pixel_color := (0,0,0)
            if intersection(ray, scene) does not exist then
                output_buffer[i][j] := background_color
                continue                       // skip to next pixel
            end if
            for all lights in scene do
                shadow_ray := vector from intersection point to current_light
                if intersection(shadow_ray, scene) ≠ current_light then
                    continue                   // light obstructed; skip to next light
                end if
                pixel_color := pixel_color + diffuse_color
                pixel_color := pixel_color + specular_color
            end for
            if intersected object has reflective properties then
                calculate reflection_ray
                trace_ray_recursively(reflection_ray)
                pixel_color := pixel_color + reflected_color
            end if
            output_buffer[i][j] := pixel_color
        end for
    end for

In the second step, the Renderer processes the scene model and generates an image. This is the step in which the objects in the scene are tested for intersections and the color components are calculated for the current pixel. One ray is traced for each pixel in the projection plane; each has its origin at the focal point of the camera and points towards the center of its pixel in the projection plane. Each ray is tested for intersection against the scene, and if it does not intersect any object, the background color is returned. If the ray does hit the scene, one shadow ray is cast towards each light object in the scene. If there is an object obstructing every light, the intersected object is considered to be in shadow. If the object is not in shadow, the diffuse and specular components of the final color are calculated for each light that reaches the intersection point on the object. It is important to mention that the final result for these components is the sum of the colors calculated for each light. Lastly, if the intersected object has reflective properties, the surface reflection angle is calculated and a secondary ray is traced (recursively). For more details about this step, please refer to Algorithm 1.

As a result of the rendering step, shadows are obtained inherently. However, the shadow model used in this implementation only contemplates hard shadows. This means that the edges of the shadow shape have a rough transition from the bright color to the dark color, rather than the smooth gradient transition of soft shadows. This improvement is left as future work.

In the last step, the application uploads the buffer to the graphics chip. This operation is the very last step in the rendering loop. Due to an inherent limitation of the OpenGL ES framework, pixels cannot be drawn directly from a buffer onto the screen. In order to do that, one has to instantiate a texture buffer and write the pixel colors to it. The downside of this approach is that an extra rasterization step needs to be performed in order to display the image on screen. More specifically, the projection plane is defined by four vertices (top left, bottom left, top right, and bottom right) that form 2 triangles on screen. The coordinates of these vertices are associated with texture coordinates in the texture buffer. In the rasterization process, these vertices are interpolated and the final image (produced by the ray tracer) is displayed on screen. This extra step requires additional processing power, but fortunately the cost is negligible: the graphics chip implements the rasterization-related operations in hardware, which makes this process extremely fast.
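The shadow test described above (and in Algorithm 1) can be sketched as follows. This is an illustrative sketch, not the project's code; `first_hit` is a hypothetical stand-in for the scene intersection routine:

```python
def light_visible(point, light_pos, first_hit):
    """Cast a shadow ray from the intersection point toward the light.
    first_hit(origin, direction) returns the distance to the nearest
    obstruction along the ray, or None if nothing is hit.
    The light is visible if no obstruction lies closer than the light."""
    dx = tuple(l - p for l, p in zip(light_pos, point))
    dist = sum(c * c for c in dx) ** 0.5
    direction = tuple(c / dist for c in dx)
    hit = first_hit(point, direction)
    return hit is None or hit >= dist
```

The diffuse and specular terms are accumulated only for lights where this test succeeds; a point for which every light fails the test is in shadow.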

6.0 Results Several configurations were used to capture the results. Each configuration had specific values for the image resolution and the maximum recursion depth. The results were captured on a MacBook Pro 15” mid-2012 using Xcode 4.6.1 and the iOS Simulator 6.1. Table 1 shows the collected data for the different run configurations. The highest frames-per-second rate occurs in the “128x128 depth 1” configuration, whereas the lowest occurs in the “512x512 depth 2” configuration. The results show that the most time-consuming part is indeed the rendering task of the ray tracer. Further optimizations are left as future work.

Time (milliseconds)     128x128            256x256            512x512
                        depth 1  depth 2   depth 1  depth 2   depth 1  depth 2
Render                   41.86    63.66    165.81   249.92    657.24   989.92
Texture Upload            0.02     0.03      0.05     0.05      0.16     0.16
OpenGL Render             0.47     0.48      1.72     1.71      6.69     6.51
Time-to-image            42.35    64.17    167.58   251.68    664.08   996.58
FPS                      20.01    15.01      5.45     3.75      1.46     1.00

Table 1. Time measurements collected for different configurations.


Figure 5. Images captured for different configurations of recursion depth (a–c).

Different configurations of the maximum recursion depth were also considered for reflection rays. Figure 5 compares the same scene rendered with different configurations. In the first configuration, no reflection rays were cast (Figure 5a). In the second, 1 recursive ray was cast (Figure 5b). In the last scenario, up to 2 rays were cast recursively (Figure 5c). The final value of the pixel color depends not only on the material color of the object hit, but also on the reflected rays. It is important to mention that the number of recursive rays greatly affects performance, and may be infeasible for dynamic applications. In all scenarios tested, only some objects, i.e. the red, white, and green walls, had reflective properties (80% of the color was attributed to reflection rays).
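To see why the recursion depth visibly changes the reflective walls, the following sketch computes how much each bounce contributes to the final color. It assumes a recursive mix final = (1 − kr)·local + kr·reflected at every reflective hit, with the deepest hit contributing its local color in full; this mixing rule is an assumption, since Algorithm 1 only states that the reflected color is added:

```python
def bounce_weights(kr, max_depth):
    """Contribution of each bounce (0 = primary hit) to the final pixel
    color, under the assumed recursive mix described above.
    Returns a list of max_depth + 1 weights that sum to 1.0."""
    weights = []
    w = 1.0
    for _ in range(max_depth):
        weights.append(w * (1.0 - kr))   # local shading at this bounce
        w *= kr                          # remainder passed to next bounce
    weights.append(w)                    # deepest bounce: local color only
    return weights
```

With kr = 0.8 and max_depth = 1 this gives weights [0.2, 0.8], matching the 80% attributed to reflection rays; at max_depth = 2 the second bounce still carries 64% of the color, which is why each extra recursion level remains visible (and costly).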

7.0 Conclusions In this work, we demonstrated that the implementation of a complex photorealistic 3D renderer is feasible on mobile platforms. Moreover, the use of more advanced techniques can greatly improve the performance. It may seem a burden to have the ray tracer use OpenGL ES to draw the result via texture coordinates on the 2 triangles that form the projection plane, but it turns out that this is the fastest way to draw the rendered buffer on mobile devices. The quality of the images generated by the renderer is unquestionable, which makes this technique suitable for real-world usage.

8.0 Future Work There are several features to be implemented in the near future. First, phenomena such as refraction and diffuse inter-reflection will be implemented, enabling the ray tracer to deliver images with better quality and improved photorealism. Second, acceleration data structures will be used to speed up ray traversal. Lastly, the project code and architecture will be refactored into a library that can easily be used by other programmers.


9.0 References

[1] Glassner, A. S. An Introduction to Ray Tracing. Academic Press, San Diego, USA, 1989.

[2] Whitted, T. An improved illumination model for shaded display. Communications of the ACM, v. 23, no. 6, 1980, pp. 343–349.

[3] OpenGL ES. http://www.khronos.org/opengles/.

[4] Phong, B. T. Illumination for computer generated pictures. Communications of the ACM, v. 18, no. 6, 1975, pp. 311–317.