Chapter 6 Special Effects by Tomas Möller and Eric Haines
6.2 Image-Based Rendering

One of the simplest image-based rendering primitives is the sprite. A sprite is an image that moves around on the screen; a mouse cursor is one example. The sprite does not have to have a rectangular shape, as various pixels can be identified as being transparent. For simple sprites there is a one-to-one mapping with pixels on the screen: each pixel stored in the sprite is put in a pixel on the screen. Various acceleration schemes exist for sprites, such as precompiling them into a list of individual spans of pixels, thereby avoiding a transparency test at each pixel [3].

The idea of a sprite can be extended in many ways. Sprites can be trivially zoomed at integer zoom factors, for example, if the object represented by the sprite is to appear to approach the viewer. A 10x10 pixel sprite can be turned into a 20x20 or 30x30 sprite by simple replication. Transitions between zoom levels can be eased by adding sprites at other resolutions. Such techniques preserve the simplicity that sprites offer for changing pixels directly on the screen. Animation can be done by displaying a succession of different sprites; a video stream creates a time series of sprites that are merged with the scene. Another use for a set of sprites is interactive object representation: as the viewer sees an object from different angles, different sprites can be used to represent it. The illusion is fairly weak, however, because of the jump when switching from one sprite to another.

A sprite can also be treated as an image texture on a polygon, with the image's alpha channel providing full or partial transparency. With texturing acceleration hardware, such techniques incur little more cost than directly copying pixels. Images applied to polygons can be kept facing the viewer using various billboard strategies (section 6.2.2).

One way to think of a scene is as a series of layers placed one atop another. For example, in plate XIII, the tailgate is in front of the chicken, which is in front of the truck's cab, which is in front of the road and trees. This layering holds true for a large range of views. Each sprite layer has a depth associated with it. By rendering the layers in back-to-front order, the scene is built up without the need for a Z-buffer, thereby saving time and resources. A camera zoom makes the object larger, which is easy to handle with the same sprite. Moving the camera in or out actually changes the relative coverage of foreground and background, which can be handled by changing each sprite layer's coverage independently. As the viewer moves perpendicular to the direction of view, the layers can be shifted relative to their depths.

However, as the view changes, the appearance of the object changes. For example, viewing a cube straight on yields a square; as the view moves, the square appears as a warped quadrilateral. In the same way, a sprite representing an object can be warped as its relation to the view changes. The rectangle containing the sprite still appears on a layer with a single z-depth; only the screen (x, y) coordinates of the rectangle change. Note, however, that as the view changes, new faces of the cube become visible, invalidating the sprite. At such times the sprite layer has to be regenerated. Determining when to warp and when to regenerate is one of the more difficult aspects of image-based rendering. In addition to surface features appearing and disappearing, specular highlights and shadows add to the challenge.
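To make these sprite and layer mechanics concrete, here is a minimal C++ sketch; all structure and function names are illustrative assumptions, not code from any particular system. A sprite with transparent pixels is blitted to the screen with an integer zoom by pixel replication, and a set of single-depth sprite layers is composited back to front, each layer shifted inversely with its depth as the viewer moves sideways.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical sprite: a small RGBA image; alpha == 0 marks transparent pixels.
struct Sprite {
    int width = 0, height = 0;
    std::vector<uint32_t> pixels;   // 0xAARRGGBB
};

struct Framebuffer {
    int width = 0, height = 0;
    std::vector<uint32_t> pixels;
};

// Copy the sprite to (x0, y0), skipping transparent pixels and replicating
// each sprite pixel zoom x zoom times (the integer zoom described above).
void BlitSprite(const Sprite& s, Framebuffer& fb, int x0, int y0, int zoom)
{
    for (int sy = 0; sy < s.height; ++sy)
        for (int sx = 0; sx < s.width; ++sx) {
            uint32_t c = s.pixels[sy * s.width + sx];
            if ((c >> 24) == 0) continue;          // transparent: leave the screen pixel alone
            for (int dy = 0; dy < zoom; ++dy)      // replicate the pixel zoom x zoom times
                for (int dx = 0; dx < zoom; ++dx) {
                    int x = x0 + sx * zoom + dx;
                    int y = y0 + sy * zoom + dy;
                    if (x >= 0 && x < fb.width && y >= 0 && y < fb.height)
                        fb.pixels[y * fb.width + x] = c;
                }
        }
}

// A sprite layer: one image at a single depth, positioned on the screen.
struct Layer {
    Sprite image;
    float  depth = 1.0f;   // distance from the viewer; larger is farther away
    float  x = 0, y = 0;   // screen position of the layer's rectangle
};

// Composite the layers back to front; sorting by depth resolves visibility
// between layers, so no Z-buffer is needed. Sideways viewer motion produces
// parallax: nearer layers (smaller depth) shift farther across the screen.
void ComposeLayers(std::vector<Layer>& layers, Framebuffer& fb, float viewerShiftX)
{
    std::sort(layers.begin(), layers.end(),
              [](const Layer& a, const Layer& b) { return a.depth > b.depth; });
    for (const Layer& l : layers) {
        float parallax = viewerShiftX / l.depth;
        BlitSprite(l.image, fb, int(l.x + parallax), int(l.y), 1);
    }
}
```

In a real system the per-layer shift would come from reprojecting the layer's depth through the camera, but the inverse-depth scaling above captures the parallax behavior described in the text.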
This layer and image warping process is the basis of the Talisman architecture [1,14]. Objects are rendered into sprite layers, which are then composited on the screen. The idea is that each sprite layer can be formed and then reused for a number of frames; image warping and redisplay is considerably cheaper than resending the whole set of polygons for an object each frame. Each layer is managed independently. For example, in plate XIII, the chicken may be regenerated frequently because it moves or because the view changes, while the cab of the truck needs less frequent regeneration because its angle to the camera does not change as much in this scene. Performing warping and determining when to regenerate a layer's image is discussed in depth by Lengyel and Snyder [7]. One interesting efficiency technique is to generate the sprite with multipass rendering, using lower-resolution passes that are then bilinearly magnified (section 5.2.1) and combined. Another idea is to create separate shadow and reflection sprite layers for later compositing.

Interpenetrating objects, such as the wing and the tailgate, are treated as one sprite because the wing has feathers both in front of and behind the tailgate. Each time the wing moves, then, the entire layer has to be regenerated. One way to avoid this full regeneration is to split the wing into a component that is fully in front of the tailgate and one that is fully behind it. Another method was introduced by Snyder and Lengyel [12], in which some occlusion cycles (where object A partially covers B, which partially covers C, which in turn partially covers A) can be resolved using layers and compositing operations.

Pure image layer rendering depends on fast, high-quality image warping, filtering, and compositing. Image-based techniques can also be combined with polygon-based rendering. Section 7.2 deals extensively with impostors, nailboards, and other ways of using images in place of polygonal content.

At the far end of the image-based rendering spectrum are techniques such as QuickTime VR and the Lumigraph. In the QuickTime VR system [2], a 360-degree panoramic image, normally of a real scene, surrounds the viewer as a cylinder. As the camera's orientation changes, the proper part of the image is retrieved, warped, and displayed. Though limited to a single location, the technique has an immersive quality compared to a static scene because the viewer's head can turn and tilt. Such scenes can serve as backdrops, with polygonal objects rendered in front of them. This technology is practical today and is particularly good for capturing a sense of the space in a building, on a street, or at any other location, real or synthetic. See figure 6.2. QuickTime VR's runtime engine is a specialized renderer optimized for cylindrical environment mapping, which allows it to achieve an order-of-magnitude performance gain over software polygon renderers handling the same texture map placed on a cylinder.
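One simple way to decide between warping and regenerating a layer, sketched below in illustrative code (this is not Lengyel and Snyder's actual metric), is to track a few reference points on the object: compare where the cached sprite's 2D warp places them on screen with where a true perspective projection of the object would place them this frame, and regenerate when the worst-case screen-space error exceeds a pixel tolerance.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec2 { float x, y; };

// Largest screen-space distance between where the cached sprite's warp puts
// each reference point and where the true projection puts it this frame.
float MaxScreenError(const std::vector<Vec2>& warped,
                     const std::vector<Vec2>& projected)
{
    float maxErr = 0.0f;
    for (std::size_t i = 0; i < warped.size(); ++i) {
        float dx = warped[i].x - projected[i].x;
        float dy = warped[i].y - projected[i].y;
        maxErr = std::max(maxErr, std::sqrt(dx * dx + dy * dy));
    }
    return maxErr;
}

// Keep reusing the warped sprite while the error stays below the tolerance;
// otherwise re-render the layer from its polygons.
bool NeedsRegeneration(const std::vector<Vec2>& warped,
                       const std::vector<Vec2>& projected,
                       float tolerancePixels = 1.0f)
{
    return MaxScreenError(warped, projected) > tolerancePixels;
}
```

A purely geometric test like this does not catch changes in specular highlights, shadows, or newly exposed faces, which is part of why the regeneration decision remains difficult.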
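The core lookup behind a cylindrical panorama viewer can also be sketched. The following only illustrates the mapping from a view direction to panoramic image coordinates under assumed conventions (y as the cylinder axis, an image covering a full 360 degrees); it is not QuickTime VR's actual warping engine.

```cpp
#include <algorithm>
#include <cmath>

// Map a view direction (x, y, z), with y along the cylinder axis, to texel
// coordinates (u, v) in a full-turn cylindrical panorama of panoWidth x
// panoHeight pixels. cylHalfHeight is the height on the unit-radius cylinder
// covered by the top half of the image (an assumed calibration parameter).
void DirectionToPanorama(float x, float y, float z,
                         int panoWidth, int panoHeight, float cylHalfHeight,
                         int& u, int& v)
{
    const float kPi = 3.14159265f;

    // Azimuth around the axis, mapped from [-pi, pi] to [0, 1).
    float azimuth = std::atan2(z, x);
    float uf = (azimuth + kPi) / (2.0f * kPi);

    // Height at which the direction pierces the unit-radius cylinder.
    float horiz = std::sqrt(x * x + z * z);
    float h = (horiz > 0.0f) ? y / horiz : 0.0f;
    float vf = 0.5f - 0.5f * (h / cylHalfHeight);   // row 0 is the top of the image

    u = std::min(panoWidth - 1, std::max(0, int(uf * float(panoWidth))));
    v = std::min(panoHeight - 1, std::max(0, int(vf * float(panoHeight))));
}
```

A full viewer performs this kind of mapping (suitably inverted and filtered) for every screen pixel, which is the loop the specialized cylindrical renderer mentioned above is built to make cheap.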
The Lumigraph [5] and light field rendering [9] techniques are related to QuickTime VR. However, instead of viewing much of an environment from a single location, a single object is viewed from a set of viewpoints. Given a new viewpoint, an interpolation process between the stored views creates the new view. This is a more complex problem than QuickTime VR, with a much higher data requirement (tens of megabytes even for small image sets). The idea is akin to holography, where a two-dimensional array of views captures the object. The tantalizing aspect of the Lumigraph and light field rendering is the ability to capture a real object and redisplay it from any angle. Any real object, regardless of surface complexity, can be displayed at a nearly constant rate [11]. As with the global illumination end of the rendering spectrum, these techniques currently have limited use in real-time rendering, but they demarcate what is possible in the field of computer graphics as a whole. To return to the realm of the mundane, what follows are a number of commonly used special effects techniques that have image-based elements to them.
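As a closing aside on the interpolation step mentioned above, here is an illustrative sketch of a light-field-style lookup; the data layout and names are assumptions. The stored views form a regular 2D grid of camera positions, and the radiance along a new ray is reconstructed by bilinearly blending the four stored views nearest to where the ray crosses the camera plane.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Illustrative light-field sketch, not the actual Lumigraph code: stored
// views form a gridU x gridV array of images, each imgW x imgH pixels,
// taken from a regular grid of camera positions on one plane.
struct LightField {
    int gridU, gridV;            // number of stored camera positions (assume >= 2 each)
    int imgW, imgH;              // resolution of each stored view
    std::vector<float> samples;  // radiance, indexed [v][u][y][x]

    float At(int u, int v, int x, int y) const {
        return samples[((v * gridU + u) * imgH + y) * imgW + x];
    }
};

// Reconstruct the radiance along a ray that crosses the camera plane at the
// continuous grid coordinate (u, v) and the image plane at pixel (x, y):
// bilinearly interpolate between the four nearest stored views.
float SampleLightField(const LightField& lf, float u, float v, int x, int y)
{
    int u0 = std::max(0, std::min(lf.gridU - 2, int(std::floor(u))));
    int v0 = std::max(0, std::min(lf.gridV - 2, int(std::floor(v))));
    float fu = u - u0, fv = v - v0;

    float c00 = lf.At(u0,     v0,     x, y);
    float c10 = lf.At(u0 + 1, v0,     x, y);
    float c01 = lf.At(u0,     v0 + 1, x, y);
    float c11 = lf.At(u0 + 1, v0 + 1, x, y);

    return (1 - fu) * (1 - fv) * c00 + fu * (1 - fv) * c10 +
           (1 - fu) * fv * c01 + fu * fv * c11;
}
```

The real systems do considerably more, interpolating across both parameter planes and handling resampling and compression, but this lookup conveys the basic idea and hints at why the data requirement is so large.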