Real-Time Rendering Chapter 6: Special Effects

"Special effects" is a relative term in computer graphics. In the film and television industry, today's new cool effect eventually becomes just another tool in the toolbox. Convincing hair and cloth are currently a hot property, done well at the high end, but will become mainstream for still image rendering (not real-time, yet) as the technology moves into commercial products. Image morphing techniques that were unique and exciting just a decade ago are now a standard part of children's shows. Once upon a time Gouraud shading, or even simple fill shading, was special.

The field of real-time rendering in many ways recapitulates the evolution of computer graphics for still images. As processors get faster, algorithms that took minutes now take fractions of a second. Graphics accelerators help perform operations that are widely used, such as filling triangles. However, because the new factor of dedicated graphics hardware is added to the mix, new ways of performing old algorithms arise. Multipass rendering (section 5.4) is a prime example. In traditional computer graphics the lighting equation is resolved at each pixel in a single pass, either by Gouraud interpolation or per-pixel shading. Software real-time rendering engines use this approach. Because graphics accelerators provide some basic operations that are extremely fast, an elaborate lighting equation can broken into separate pieces which hardware can handle and combine in a few passes, in less time than a single software pass can complete.

Special effects work for real-time rendering depends upon both classic computer graphics techniques and using available hardware acceleration to best effect. Algorithms leverage the existing triangle fill, filtered texturing, and transparency support. As new capabilities become available in hardware, new special effects become possible, or more general, or at least faster. A good example of a capability that is becoming standard is the stencil buffer. The stencil buffer is a special buffer that is not displayed (similar to how the Z-buffer is not displayed). Instead, it is normally used to mask areas of the screen off and make them unwritable. Primitives are written to the stencil buffer, and where these primitives appear in the buffer becomes the only areas of the screen where succeeding objects will be displayed. As will be seen in this chapter, techniques such as true reflections and shadows can be done more rapidly when a stencil buffer is available in hardware. There are many other operations that can be aided by a stencil buffer, such as capping objects cut by an arbitrary clipping plane [10] and visualizing models formed directly by adding and subtracting solids [10,15].

6.1 The Rendering Spectrum

Up to this point in the book we have focused on showing three dimensional objects on the screen by representing them with polygons. This is not the only way to get an object to the screen, nor is it always the most appropriate one. The goal of rendering is to portray an object on the screen; how we attain that goal is our own decision. There is no correct way to render a scene. Each rendering method is an approximation to reality, at least if photo-realism is the goal.

Polygons have the advantage of representing the object in a reasonable fashion from any view. As the camera moves the representation of the object does not have to change. However, to improve quality we may wish to substitute a more highly detailed model as the viewer gets closer to the object. Alternately, we may wish to use a simplified form of the model if it is off in the distance. These are called level of detail techniques (section 7.3). Their main purpose is to make the scene faster to display.

However, other techniques can come into play as an object recedes from the viewer. Speed can be gained by using images instead of polygons to represent the object. It is less expensive to represent an object with a single image which can be quickly sent to the screen. Algorithms that use images to portray objects are a part of image-based rendering. One way to represent the continuum of rendering techniques is from Lengyel [8] and shown in figure 6.1

Figure 6.1: The rendering spectrum. (after Lengyel [8])

Within the field of real-time rendering, global illumination techniques such as radiosity and ray tracing are not feasible except on extremely high-end machines (or for simple demonstration programs). Such techniques will undoubtedly move into the realm of real-time as processor speeds increase. Currently these algorithms' main contribution to real-time rendering is in precomputing data such as vertex colors, light maps, environment maps, etc.

6.2 Image-Based Rendering

One of the simplest image-based rendering primitives is the sprite. A sprite is an image that moves around on the screen. A mouse cursor is a sprite, for example. The sprite does not have to have a rectangular shape, as various pixels can be identified as being transparent. For simple sprites there is a one-for-one mapping with pixels on the screen. Each pixel stored in the sprite will be put in a pixel on the screen. Various acceleration schemes exist for sprites, such as precompiling them into a list of individual spans of pixels and so avoiding having to test for transparency at each pixel [3].

The idea of a sprite can be extended in many ways. Sprites can be trivially zoomed at integer zoom factors, for example, if the object represented by the sprite is to appear to approach the viewer. A 10x10 pixel sprite can be turned into a 20x20 or 30x30 sprite by simple replication. Transitions between zoom levels can be lessened by adding sprites at other resolutions. Such techniques preserve the simplicity that sprites offer for changing pixels directly on the screen.

Animation can be done by displaying a succession of different sprites. The video stream creates a time series of sprites which are merged with the scene. Another use for a set of sprites is interactive object representation. As the viewer sees an object from different angles, different sprites can be used to represent it. The illusion is fairly weak because of the jump when switching from one sprite to another.

A sprite can also be treated as an image texture on a polygon, with the image's alpha channel providing full or partial transparency. With the use of texturing acceleration hardware such techniques incur little more cost than direct copying of pixels. Images applied to polygons can be kept facing the viewer using various billboard strategies (section 6.2.2).

One way to think of a scene is that it is made of a series of layers put one atop another. For example, in plate XIII, the tailgate is in front of the chicken, which is in front of the truck's cab, which is in front of the road and trees. From a large number of views this layering holds true. Each sprite layer has a depth associated with it. By rendering in a back to front order the scene is built up without need for a Z-buffer, thereby saving time and resources. Camera zooms simply make the object larger, which is simple to handle with the same sprite. Moving the camera in or out actually changes the relative coverage of foreground and background, which can be handled by changing each sprite layer's coverage independently. As the viewer moves perpendicular to the direction of view the layers can be moved relative to their depths.

However, as the view changes, the appearance of the object changes. For example, viewing a cube straight on results in a square. As the view moves, the square appears as a warped quadrilateral. In the same way a sprite representing an object can also be warped as its relation to the view changes. The rectangle containing the sprite still appears on a layer with a single z-depth, just the screen (x,y) coordinates of the rectangle change. Note that as the view changes, however, new faces of the cube become visible, invalidating the sprite. At such times the sprite layer is regenerated. Determining when to warp vs. regenerate is one of the more difficult aspects of image-based rendering. In addition to surface features appearing and disappearing, specular highlights and shadows add to the challenge.

This layer and image warping process is the basis of the Talisman architecture [1,14]. Objects are rendered into sprite layers, which are then composited on the screen. The idea is that each sprite layer can be formed and reused for a number of frames. Image warping and redisplay is considerably simpler than resending the whole set of polygons for an object each frame. Each layer is managed independently. For example, in plate XIII, the chicken may be regenerated frequently because it moves or the view changes. The cab of the truck needs less frequent regeneration because its angle to the camera is not changing as much in this scene. Performing warping and determining when to regenerate a layer's image is discussed in depth by Lengyel and Snyder [7]. One interesting efficiency technique is to perform multipass rendering to generate the sprite, and use lower resolution passes which are then bilinearly magnified (section 5.2.1) and combined. Another idea is to create separate shadow and reflection sprite layers for later compositing.

Interpenetrating objects such as the wing and the tailgate are treated as one sprite. This is done because the wing has feathers both in front and behind the tailgate. So, each time the wing moves the entire layer has to be regenerated. One method to avoid this full regeneration is to split the wing into a component that is fully in front and one that is fully behind the tailgate. Another method was introduced by Snyder and Lengyel [12], in which in some situations occlusion cycles (where object A partially covers B, which partially covers C, which in turn partially covers A) can be resolved using layers and compositing operations.

Pure image layer rendering depends on fast, high quality image warping, filtering, and compositing. Image-based techniques can also be combined with polygon based rendering. Section 7.2 deals extensively with impostors, nailboards, and other ways of using images to take the place of polygonal content.

At the far end of the image-based rendering spectrum are image-based techniques such as QuickTime VR and the Lumigraph. In the Quicktime VR system [2] a 360 degree panoramic image, normally of a real scene, surrounds the viewer as a cylindrical image. As the camera's orientation changes the proper part of the image is retrieved, warped, and displayed. Though limited to a single location, it has an immersive quality compared to a static scene because the viewer's head can turn and tilt. Such scenes can serve as backdrops and polygonal objects can be rendered in front of them. This technology is practical today, and is particularly good for capturing a sense of the space in a building, on a street, or other location, real or synthetic. See figure 6.2. QuickTime VR's runtime engine is a specialized renderer optimized for cylindrical environment mapping. This allows it to achieve an order of magnitude gain in performance over software polygon renderers handling the same texture map placed on a cylinder.

Figure: A panorama of the Mission Dolores, used by QuickTime VR to display a wide range of views, with three views below generated from it. Note how the views themselves are undistorted. (Courtesy of Ken Turkowski)

The Lumigraph [5] and light field rendering [9] techniques are related to QuickTime VR. However, instead of viewing much of an environment from a single location, a single object is viewed from a set of viewpoints. Given a new viewpoint, an interpolation process is done between stored views to create the new view. This is a more complex problem, with a much higher data requirement (tens of megabytes for even small image sets), than QuickTime VR. The idea is akin to holography, where a two dimensional array of views captures the object. The tantalizing aspect of the Lumigraph and light field rendering is the ability to capture a real object and be able to redisplay it from any angle. Any real object, regardless of surface complexity, can be displayed at a nearly constant rate [11]. As with the global illumination end of the rendering spectrum, these techniques currently have limited use in real-time rendering, but they demark what is possible in the field of computer graphics as a whole.

To return to the realm of the mundane, what follows are a number of commonly used special effects techniques that have image-based elements to them.

6.2.1 Lens Flare and Bloom

Lens flare is a phenomenon that is caused by the lens of the eye or camera when directed at bright light. It consists of a halo and a ciliary corona. The halo appears because the lens material refracts light of different wavelengths by different amounts, as a prism does. The halo looks like a circular ring around the light, with its outside edge tinged with red, inside with violet. The ciliary corona is from density fluctuations in the lens, and appears as rays radiating from a point, which may extend beyond the halo [13]. Camera lenses can also create secondary effects due to parts of the lens reflecting or refracting light internally. For example, hexagonal patterns can appear due to the camera's diaphragm blades. Bloom is caused by scattering in the lens and other parts of the eye, giving a glow around the light and dimming contrast elsewhere in the scene. In video production, the video camera captures an image by converting photons to charge using a charge coupled device (CCD). Bloom occurs in a video camera when a charge site in the CCD gets saturated and overflows into neighboring sites. As a class, halos, coronae, and bloom are called glare effects.

In practice what this means is that we associate these effects with brightness. Once thought of as relatively rare image artifacts, they are now routinely added digitally to real photos to enhance the effect. There are limits to the light intensity produced by the computer monitor, so to give the impression of increased brightness to a scene these glare effects are explicitly rendered. The lens flare effect is now something of a cliché due to its common use. Nonetheless, when skillfully employed it can give strong visual cues to the viewer; see plate XIV.

Figure 6.3 shows a typical lens flare. It is produced by using a set of textures for the glare effects. Each texture is applied to a square that is made to face the viewer (i.e., is placed perpendicular to the view direction). The texture is treated as an alpha map, determining how much of the square to blend into the scene. Because it is the square itself which is being displayed, the square can be given a color (typically a pure red, green, or blue) for prismatic effects for the ciliary corona. These colored, textured squares are blended using an additive effect to get other colors. Furthermore, by animating the ciliary corona a sparkle effect is created [6].

Figure: A lens flare and its constituent textures. On the right, a halo and a bloom are shown above, two sparkle textures below. (Images from a Microsoft DirectX6 SDK program)

6.2.2 Billboarding

Lens flares have the quadrilateral drawn facing the viewer. Orienting the polygon based on the view direction is called billboarding and the polygon a billboard [10]. As the view changes, the orientation of the polygon changes. Billboarding, combined with alpha texturing and animation, can be used to represent many phenomena that do not have solid surfaces. Smoke, fire, explosions, vapor trails, and clouds are just a few of the objects that can be represented by these techniques [4,10] - see plates XII and XV. Effects such as energy beams and shields are also possible. There are a few popular forms of billboards described in this section.

Bibliography

Excerpt from the book Real-Time Rendering by Tomas Möller and Eric Haines, 512 pages, from A.K. Peters Ltd., $49.95, ISBN 1-56881-101-2.

Date this article was posted to GameDev.net: 2/24/2000
(Note that this date does not necessarily correspond to the date the article was written)