High Dynamic Range Environment Mapping On Mainstream Graphics Hardware
I. Introduction

The computer graphics industry has seen dramatic leaps in visual fidelity in the past decade due to advances in hardware, memory density, and display resolution. Researchers in computer graphics have long explored the dynamic range of the visual display and how best to make use of its limited range. Rendering algorithms that utilize high dynamic range (HDR) imagery are one way to better utilize the display, and with programmable graphics hardware widely deployed, game developers can take advantage of these effects in their game engines.

This article discusses the capture, storage, and display of high dynamic range images. The capability to display and process HDR imagery is widely available today – for example, in mainstream computer graphics chipsets such as the Intel® 915G Express Chipset. The article also demonstrates that HDR images can be used without support for floating point texture formats, by environment mapping objects in a scene with HDR textures in real time on integrated graphics processors.

The article first presents background on high dynamic range imagery and describes the theory and mathematics of HDR image capture and display, with emphasis on Erik Reinhard's photographic tone reproduction operator. The authors then describe their own implementation of HDR environment mapping, provided with this article, including an SSE-optimized HDR image loader, an SSE-optimized image key calculator, and a Pixel Shader 2.0 implementation of high dynamic range tone mapping. An example of the results can be seen in Figure 1. The authors obtained a >20% speedup on HDR image loading and a >30% speedup on image key calculations using SSE2-optimized routines. Performance is based on a wide variety of factors including hardware, software, and system configuration – your results may vary.

II. Background

2.1 A trip down the image acquisition pipeline

Figure 2 depicts a simplified image acquisition pipeline inspired by [Debevec97]. After passing through the lens, photons travel through the shutter to a light sensor, typically a CCD. The shutter controls the amount of time the light sensor accumulates photons, and the lens focuses the incoming photons onto the light sensor. After arriving at the light sensor, the photons are converted into digital values by an analog-to-digital converter (ADC). These digital values then pass through some final adjustments, depending on the camera manufacturer and camera settings, and are written to an image as RGB values.

2.2 Dynamic Range

Figure 3 shows the dynamic range of the human visual system compared to the dynamic range of visible light and of an LCD display. The human visual system adapts to the amount of incoming light via moderation of the pupil as well as chemical and neural processes in the photoreceptors and neurons. Photographic devices mimic this behavior via a lens aperture and exposure time. However, in doing so, the information outside the range of the chosen lens and exposure combination is forever lost – a significant blow to the use of these images for real-time rendering, where the conditions under which the light is perceived may warrant modification. In other words, we may want to capture all the information in the full dynamic range of a scene, and only later choose which parts to discard.
As we'll demonstrate in this article, the use of high dynamic range images allows more lighting information to be stored in the source image, and therefore allows runtime modification of the end user's perception of this lighting information.

2.3 Storage of HDR Images

After creating an HDR image, it needs to be stored for later retrieval, processing, and display. [Ward03] summarizes the different formats for storing HDR images. Examples include Pixar's 33-bit log-encoded TIFF, Radiance's 32-bit RGBE and XYZE, IEEE 96-bit TIFF and Portable FloatMap, LogLuv TIFF, and ILM's 48-bit OpenEXR format. Each format has its own set of advantages and disadvantages, including file size, dynamic range, and quantization. For our work we have used the RGBE file format. Ultimately, the format you choose will depend on the context of your work and the tools available.

We also need a tool to manage and manipulate HDR images. Fortunately, there is a tool available on-line to help in this effort: HDRShop. Since HDRShop exports RGBE files, and RGBE files have an acceptable displayable range, we chose to use them for our work [HDRShop04]. Commercial software packages are also available, including Version 2.0 of HDRShop and Photogenics [Photogenics04].

III. Theory

3.1 The need for HDR images

An image is composed of the response of each pixel of the sampling device. The light sensor has the largest error near its maximum or minimum input. Since any value above the saturation point is mapped to the largest storable value, we have not obtained an accurate measurement of the amount of light hitting the pixel for a given exposure time. Without HDR techniques, images therefore often fail to accurately sample and store scene intensities.

To compensate for the limitations of today's digital cameras, it is useful to vary the exposure time and take a series of images. These images can then be used to gain a more accurate understanding of the light entering the lens and being sampled by the light sensor, for later display as HDR images. By storing a set of images with different exposure times, it becomes easier to map the true luminance of a scene into the displayable range of whatever device or application we have available. Much of the work in the film industry is motivated by the desire to have images that can be matched to the displayable range of film.

3.2 Uses of HDR images

In this article, HDR images are used for environment mapping of objects in a game engine. The motivation is to create a more accurate and compelling visual experience given the capabilities of today's hardware. Since integer texture formats are widely supported, we demonstrate our HDR techniques using such a format, which also shows that floating point texture support is not a requirement for high dynamic range imagery. This may provide just the right effect in key elements of the user experience. Since integer formats have a limited dynamic range, we apply a tone mapping operator, discussed below, for display on hardware that does not support floating point textures.

In addition to environment mapping, HDR images can be used for motion blur and for simulating characteristics of the human visual system when rendering; for example, unclamped high luminance values can be used to drive a depth of field effect [Northrop04], [Kawase03], [Kawase04].

3.3 Display

To this point we've covered the acquisition, usage, and storage of HDR images.
Next, it is important to consider the display of HDR imagery. There has been research into displays that can more accurately render high dynamic range imagery [Seetzen04]; however, this technology is in its infancy and not likely to be widely deployed in the next few years. Therefore, it is necessary to find techniques that map the large variance in luminance stored in HDR textures into something that can be displayed.

One such technique is High Dynamic Range Texture Mapping (HDRTM), discussed in [Cohen01]. The basic idea is the following: since it is not possible to store the full range of luminance values in 8-bit-per-channel texture maps, split each high dynamic range texture into two textures, one holding the 8 high bits and one holding the 8 low bits of each channel. This technique is completely general: decompose the image before sending it to the hardware, then reassemble and composite on the other side by inverting the decomposition. This can be applied dynamically or statically depending on the goals of your application, and assumes 16-bit texture values; a minimal recombination sketch appears below. Fixed point hardware techniques that use texture combining hardware and two simultaneous textures also exist.
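The following HLSL fragment, written for this article rather than taken from [Cohen01], sketches the recombination step under the assumption that the CPU has already split each 16-bit channel value into a high byte and a low byte stored in two 8-bit textures; the sampler names tHigh and tLow are hypothetical:

// HDRTM recombination sketch (ps_2_0). Assumes the CPU split each
// 16-bit channel value v into a high byte (v / 256) and a low byte
// (v % 256) stored in two 8-bit textures.
sampler2D tHigh;   // hypothetical: texture of high bytes
sampler2D tLow;    // hypothetical: texture of low bytes

float4 RecombineHDRTM( float2 vTexCoord : TEXCOORD0 ) : COLOR
{
    // The 8-bit formats return each byte already scaled into [0,1].
    float3 hi = tex2D( tHigh, vTexCoord ).rgb;
    float3 lo = tex2D( tLow,  vTexCoord ).rgb;

    // Reassemble the original 16-bit value, normalized to [0,1]:
    // (hi*255*256 + lo*255) / 65535 simplifies to the form below.
    float3 hdr = ( hi * 256.0 + lo ) * ( 255.0 / 65535.0 );
    return float4( hdr, 1.0 );
}

Note that further scaling or tone mapping would normally follow, since the recombined value is still high dynamic range relative to the 8-bit framebuffer.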
Due to the real-time requirements and the wide availability of programmable graphics hardware, the authors have chosen to implement the tone mapping technique presented in [Reinhard02] for the demo seen in Figure 4. The idea of tone mapping is to map the high dynamic range of real-world luminance to the lower dynamic range of the display device. In fact, since pixels are always constrained to some maximum value in the framebuffer, we have always performed a crude form of tone mapping: a per-pixel clamp operator that discards much of the higher luminance of the scene. We would like to take a smarter approach.

Ansel Adams faced a similar problem in photography, and we adopt a technique inspired by his work: the Zone System, which is still widely used in analog image acquisition today. As seen in Figure 5, a zone is a range of luminance values, taking into account the reflectance of the print. There are eleven print zones ranging from pure black to pure white, each doubling in intensity, and each represented with a Roman numeral, Zone 0* through Zone X. The middle grey value is the subjective middle brightness region of the scene, which is mapped to print Zone V in most cases. A photographer would take a luminance reading of what was middle grey in a scene; typically, this middle grey would be whatever produces 18% reflectance on the print. If the scene was low key, this value would be lower in the spectrum of print zones; similarly, if it was high key, the middle grey would be one of the higher print zones.

* - '0' is not a Roman numeral. In fact, the Romans had not yet discovered 0.

3.4 Mathematics of Reinhard's Photographic Tone Reproduction Operator

The first step in applying Reinhard's tone reproduction operator is to obtain an average luminance value, which we will use as the key of the scene. Normally a simple average would be fine; however, since luminance values follow an exponential curve, we use the logarithmic average luminance value, or log-average luminance, as an approximation of the scene key. To compute this value, we first compute the luminance from our RGB values as follows:

(1)  L_w(x, y) = 0.2125 R + 0.7154 G + 0.0721 B

Equation 1 uses the luminance conversion found in [Akenine-Moller02], which is based on modern CRT and HDTV phosphors. Next, we compute the log-average luminance using this value, summing over the entire image:

(2)  image\_key = \exp\left( \frac{1}{N} \sum_{x,y} \log\bigl( \delta + L_w(x, y) \bigr) \right)

Here N is the total number of pixels in the image, and \delta is a small value to avoid taking the log of a completely black pixel whose luminance is zero. Now that we have the image key, we would like to re-map the pixels into a new image that scales the values so we can give greater dynamic resolution to the upper range of luminance values. Since we know that 0.18 is the middle (Zone V) of our logarithmic scale from 0 to 1 in intensity values, we use the ratio:

(3)  \frac{L(x, y)}{L_w(x, y)} = \frac{0.18}{image\_key}

Solving for the new luminance value we obtain:

(4)  L(x, y) = \frac{0.18}{image\_key} \, L_w(x, y)

This assumes our goal is to map the range such that Zone V is in the middle. For images that have a higher than average image_key, it may be desirable to raise the middle zone. To do this we generalize our expression:

(5)  L(x, y) = \frac{midzone\_luminance\_value}{image\_key} \, L_w(x, y)

Typically the midzone_luminance_value varies in amounts that double each step: 0.045, 0.09, 0.18, 0.36, 0.72.

We still have two problems. First, most images have a normal dynamic range for most of the image, and only a small percentage of the pixels – for example those at a light source such as the sun or a window – have very high dynamic range. Equation 5 is a linear mapping; what we really want is a non-linear mapping that emphasizes these areas of high dynamic range. Second, Equation 5 can still produce values that lie outside the 0.0 – 1.0 range viewable on the monitor. This leads us to a final adjustment to our scaled luminance value:

(6)  L_d(x, y) = \frac{L(x, y)}{1 + L(x, y)}

Notice that Equation 6 brings our luminance values between 0 and 1, compressing high luminance values sharply while doing almost nothing to low luminance values, for which the 1 dominates the denominator. This gives us greater dynamic resolution in the high regions of luminance, as can be seen in Figures 6 and 7. For current real-time applications, Equations 2, 5, and 6 are adequate and will bring all luminance values within a displayable range. Sometimes this is not desirable, and we would instead like values in the highest range of luminance to burn out. This is accomplished with a different tone mapping operator:

(7)  L_d(x, y) = \frac{L(x, y) \left( 1 + \frac{L(x, y)}{L_{white}^2} \right)}{1 + L(x, y)}

Here L_{white} (white_luminance) is the smallest luminance value that will be mapped to pure white. Notice that if white_luminance is large, the L(x,y)/L_{white}^2 term in the numerator goes to 0 and we are left with Equation 6. However, if white_luminance is small, we get larger values in the numerator, acting to enhance lower dynamic range pixels. In our examples, we use Equation 6.

Mapping luminance to RGB

Finally, to get the final RGB values, we multiply each original RGB value by the final luminance to compute the new pixel RGB values.
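As an illustration only (the demo's actual shader is in the accompanying effects file), a minimal HLSL sketch of Equations 1, 5, and 6 might look as follows. fImageKey and fMidzone are hypothetical names for constants assumed to be computed on the CPU and supplied to the shader; the final line scales RGB by the ratio of tone-mapped to original luminance:

// Minimal ps_2_0 sketch of Equations (1), (5), and (6).
float fImageKey;   // hypothetical: log-average luminance (Equation 2)
float fMidzone;    // hypothetical: midzone_luminance_value, e.g. 0.18

float3 ToneMap( float3 rgbHDR )
{
    // Equation (1): luminance from RGB.
    float L = dot( rgbHDR, float3( 0.2125, 0.7154, 0.0721 ) );

    // Equation (5): scale by the key.
    float Lscaled = ( fMidzone / fImageKey ) * L;

    // Equation (6): compress into [0, 1).
    float Lfinal = Lscaled / ( 1.0 + Lscaled );

    // Rescale RGB by the ratio of tone-mapped to original luminance.
    return rgbHDR * ( Lfinal / max( L, 0.0001 ) );
}

For instance, with image_key = 0.5 and midzone_luminance_value = 0.18, a pixel with luminance 10 scales to 3.6 by Equation 5 and compresses to 3.6/4.6 ≈ 0.78 by Equation 6.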
3.5 Conversion

The process to acquire a high dynamic range image is relatively simple given the tools available today. A high dynamic range image is constructed by collecting a set of conventional photographs taken with identical position and orientation but varying exposure times. One way to create a high dynamic range image is to implement the algorithm presented in [Debevec97], which recovers the response function of the camera and uses this information to construct a high dynamic range image whose pixel values represent the true radiance values in the scene.

Another option is to use HDRShop, a tool that can combine a set of low dynamic range images taken with a standard camera into a single HDR image [HDRShop04].

For many applications of HDR images in entertainment, where the incoming radiance at a point is important, a light probe is a more suitable format. A light probe is created by placing a mirrored ball in the environment and taking pictures of it from both sides. The result is an environment map: a set of ray samples (an image) of all the rays of light arriving at the center of the light probe. This can then be rendered into a high resolution sphere map or cube map for use in the rendering pipeline, where it approximates the light rays intersecting the object we are environment mapping.

IV. Implementation

4.1 Demo

Included with this article is an SSE2-optimized implementation of file loading and image key calculation, built on the HDRFormats demo from the Microsoft DirectX SDK [MSSDK04]. Our implementation was measured to have a 21.6% speedup over a C implementation when reading a 1024x768 HDR image. We also gained about a 31% speedup on image key calculations for 640x480 images compared to a C-based implementation. The demo also allows interactive adjustment of the midzone_luminance (referred to as MIDDLE_GREY in the demo) from Equation 5, to help the reader understand how adjusting the midzone_luminance affects the resulting image.

Additionally, we noticed that a pure implementation of the tone mapping mathematics for each image could result in images that changed tone too dramatically from frame to frame. Therefore, we limit the amount the image_key can vary from frame to frame, allowing the image to 'settle' to the correct value after a few iterations. The result is much more aesthetically pleasing, and is a more accurate depiction of how perceived lighting behaves in situations where the light does vary dramatically.

4.2 RGBE format

The RGBE format is suitable for storing high dynamic range imagery for real-time graphics and was used for our implementation. RGBE was originally created by Greg Ward for his Radiance software package [Radiance04]. The format consists of an 8-bit mantissa for each of the Red, Green, and Blue channels along with a shared 8-bit exponent, for 32 bits per pixel, as seen in Figure 8. Sharing the exponent significantly reduces the storage required compared to a 32-bit-per-channel floating point format (3 floats * 32 bits = 96 bits per pixel vs. 32 bits per pixel). A downside is a lack of dynamic resolution between color channels, since all channels share the one exponent.

Encoding and decoding the RGBE format is easy. To encode a pixel using the RGBE format, the following HLSL pixel shader 2.0 code can be used:

float4 EncodeRGBE8( in float3 rgb )
{
    float4 vEncoded;

    // The largest component determines the shared exponent
    float maxComponent = max( max( rgb.r, rgb.g ), rgb.b );
    float fExp = ceil( log2( maxComponent ) );

    // Normalize the mantissas and store the biased exponent in alpha
    vEncoded.rgb = rgb / exp2( fExp );
    vEncoded.a   = ( fExp + 128 ) / 255;

    return vEncoded;
}

To decode a pixel using the RGBE format, the following HLSL pixel shader 2.0 code can be used:

float3 DecodeRGBE8( in float4 rgbe )
{
    float3 vDecoded;

    // Recover the shared exponent from alpha and un-bias it
    float fExp = rgbe.a * 255 - 128;

    // Scale the mantissas back up by the shared exponent
    vDecoded = rgbe.rgb * exp2( fExp );

    return vDecoded;
}

4.3 SSE2 Optimized High Dynamic Range Image Reading

For our implementation we created an SSE2-optimized HDR reader for RGBE images that shows a speedup of 21.6% when reading images of size 1024x768. The C version is based on Greg Ward's implementation, originally written and posted by Bruce Walter at http://www.graphics.cornell.edu/~bjw/rgbe.html, and altered to read HDRShop headers by Alex at www.FusionIndustries.com. Using this SSE2 routine in your engine can speed up load times of images used for high dynamic range environment maps. If you only have one high dynamic range image in your game the difference may be negligible, but for 5, 10, or 20 images per level one can quickly see the benefits of such a routine. The demo includes this code.

4.4 SSE2 Optimized High Dynamic Range Image Key

To avoid having to transfer the image over the bus to compute an image key, we calculate the key on the CPU using an SSE-optimized image_key computation included in the example. Our SSE-optimized HDR image key computation was shown to be 33% faster than the equivalent C code on a 640x480 render target. Deciding whether to calculate the image key on the CPU or the GPU is application, graphics card, and graphics bus dependent; experiment with both approaches to see what is best for your application. A sketch of the GPU-side alternative appears below.
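For readers who want to try the GPU side of that comparison, the following first-pass shader, a sketch written for this article rather than code from the demo, writes per-pixel log luminance to a render target; repeatedly downsampling that target to 1x1 and taking exp() of the result yields the log-average luminance of Equation 2. tScene is a hypothetical sampler bound to the HDR source image:

// First pass of a GPU image_key calculation (ps_2_0 sketch).
sampler2D tScene;                  // hypothetical: the HDR source image
static const float DELTA = 0.0001; // the small delta of Equation (2)

float4 LogLuminancePass( float2 vTexCoord : TEXCOORD0 ) : COLOR
{
    float3 rgb = tex2D( tScene, vTexCoord ).rgb;

    // Equation (1): luminance from RGB.
    float L = dot( rgb, float3( 0.2125, 0.7154, 0.0721 ) );

    // Store log(delta + L); averaging these values across the image
    // and exponentiating gives the image_key of Equation (2).
    return log( DELTA + L ).xxxx;
}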
4.5 Pixel Shader for Integrated Graphics

We have also written a pixel shader in HLSL that supports using HDR images for environment mapping on Intel 915G graphics. The 915G is optimized for DirectX 9 and supports Pixel Shader 2.0, with Vertex Shader 3.0 provided by DirectX's Intel-architecture-optimized PSGP (Platform Specific Graphics Processing) path. Since this hardware does not support floating point textures, we perform the tone mapping described earlier in the pixel shader, using RGBE images. The complete source code for our pixel shaders is given in the effects file in the demo; a condensed sketch appears at the end of this section.

4.6 HDR samples in the Microsoft DirectX 9.0 SDK

Microsoft provides examples in the DirectX SDK that demonstrate the above techniques without the optimizations made in this article [MSSDK04], covering HDR in several different scenarios. HDRCubemap demonstrates cubic environment mapping and HDR lighting, using floating-point cube textures to store values where the total amount of light illuminating a surface is greater than 1.0. HDRFormats shows a technique much like the one used in this article for displaying HDR images on hardware that is not capable of using floating point textures, and was the original inspiration for our work. The most notable differences are that our version is not tied to the DDS file format – any HDR image encoded by HDRShop can be used – and that we add a SIMD-accelerated optimization to determine the powers of two for the shared exponent. HDRLighting demonstrates blue shift under low light and bloom under intense lighting conditions, as well as under- and overexposure of the camera.
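The sketch below condenses the idea of the Section 4.5 shader into a single ps_2_0 function: sample the RGBE environment cube map, decode it as in Section 4.2, and apply the tone mapping of Section 3.4. It is illustrative only – the shipped effects file differs – and tEnvCube, fImageKey, and fMidzone are hypothetical names; the reflection vector is assumed to come from the vertex shader:

samplerCUBE tEnvCube;  // hypothetical: RGBE-encoded environment map
float fImageKey;       // hypothetical: image_key from Equation (2)
float fMidzone;        // hypothetical: midzone value from Equation (5)

float4 EnvMapPS( float3 vReflect : TEXCOORD0 ) : COLOR
{
    // Sample and decode the RGBE texel (cf. DecodeRGBE8, Section 4.2).
    float4 rgbe = texCUBE( tEnvCube, vReflect );
    float3 hdr  = rgbe.rgb * exp2( rgbe.a * 255 - 128 );

    // Tone map per Equations (1), (5), and (6).
    float L       = dot( hdr, float3( 0.2125, 0.7154, 0.0721 ) );
    float Lscaled = ( fMidzone / fImageKey ) * L;
    float Lfinal  = Lscaled / ( 1.0 + Lscaled );

    return float4( hdr * ( Lfinal / max( L, 0.0001 ) ), 1.0 );
}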
V. Future Work

The authors are considering several areas for future work. One is the use of the OpenEXR file format for storage and display, along with authoring tools that can take advantage of it. While there is an SDK for reading and displaying OpenEXR images [OpenEXR04], there is no publicly available Windows-based exporter; to address this, we are considering writing a plug-in for HDRShop. We would also like to experiment with additional tone mapping operators. This article presents one well suited to today's graphics hardware, but other techniques may be more suitable depending on the desired effect.

Finally, we would like to experiment with per-pixel tone mapping. [Reinhard02] discusses a technique to simulate the photographer's use of dodge and burn: the practice of adding or subtracting light from areas of the print to increase or limit exposure, usually done with a piece of paper with a hole cut out, or a small wand. Think of it as choosing a key value for every pixel. Since we wanted a fast tone mapping technique we chose not to focus on this operator, but as graphics hardware becomes faster, per-pixel tone mapping operators will surely become feasible in real time. Another alternative is to apply tone mapping within regions of luminance values, determining local keys and applying either the per-pixel tone mapping operator or the average luminance operator described in detail above.

VI. References

[Akenine-Moller02] Tomas Akenine-Moller and Eric Haines. Real-Time Rendering, 2nd Edition. Page 193. AK Peters. 2002.

[Cohen01] Jonathan Cohen, Chris Tchou, Tim Hawkins, and Paul Debevec. Real-Time High Dynamic Range Texture Mapping. In Rendering Techniques 2001, S. J. Gortler and K. Myszkowski, eds. Pages 313-320.

[Debevec97] Paul Debevec and Jitendra Malik. Recovering High Dynamic Range Radiance Maps from Photographs. SIGGRAPH 1997. Pages 369-378.

[Debevec02] Paul Debevec. Image-Based Lighting. Computer Graphics and Applications. March/April 2002. Pages 26-34.

[Ferwerda96] James A. Ferwerda, Sumanta N. Pattanaik, Peter Shirley, and Don Greenberg. A Model of Visual Adaptation for Realistic Image Synthesis. SIGGRAPH 1996. Pages 249-258.

[Halsted93] Charles Halsted. Brightness, Luminance, and Confusion. Information Display. 1993. http://www.crompton.com/wa3dsp/light/lumin.html.

[HDRShop04] Software for creating, editing, and saving HDR imagery. http://www.ict.usc.edu/graphics/HDRShop/. 10/29/2004.

[Kawase03] Masaki Kawase. Framebuffer Post-Processing Effects in DOUBLE S.T.E.A.L. (Wreckless). Presentation. Game Developers Conference 2003.

[Kawase04] Masaki Kawase. Practical Implementation of High Dynamic Range Rendering. Presentation. Game Developers Conference 2004.

[MSSDK04] Microsoft Corporation. DirectX 9.0 SDK Summer 2004 Update. http://www.microsoft.com/downloads/search.aspx?displaylang=en&categoryid=2. August 2004.

[Northrop04] Cody Northrop. High Dynamic Range Lighting Brown Bag Presentation. Intel Brown Bag Lunch. 5/26/2004.

[OpenEXR04] The OpenEXR web site. http://www.openexr.org/downloads.html. August 19, 2004.

[Photogenics04] http://www.idruna.com/downloads.html.

[Probe04] Light Probe Image Gallery. http://athens.ict.usc.edu/Probes/. 9/8/2004.

[Radiance04] Radiance Imaging System. http://radsite.lbl.gov/radiance/HOME.html. August 2004.

[Reinhard04] Erik Reinhard. Personal email communication. July 30, 2004.

[Reinhard02] Erik Reinhard, Michael Stark, Peter Shirley, and James Ferwerda. Photographic Tone Reproduction for Digital Images. SIGGRAPH 2002. Pages 267-276.

[Shastry99] Anirudh S. Shastry. High Dynamic Range Rendering. http://www.gamedev.net/columns/hardcore/hdrrendering/.

[Seetzen04] Helge Seetzen, Wolfgang Heidrich, Wolfgang Stuerzlinger, Greg Ward, et al. High Dynamic Range Display Systems. ACM Transactions on Graphics, Volume 23, Number 3 (SIGGRAPH 2004). Pages 760-768.

[Ward03] Greg Ward. Global Illumination and HDRI Formats. SIGGRAPH 2003 Course #19: HDRI and Image Based Lighting. 2003.

[Walter04] Bruce Walter. RGBE File Format. http://www.graphics.cornell.edu/~bjw/rgbe.html. 2004.
Appendix A: A Simple Tone Mapping Implementation

// no warranties, expressed or implied, free for re-use
#include "stdafx.h"
#include "math.h"
#include "stdio.h"

#define N 12        /* number of sample luminance values below */
#define delta 1.0f
#define MIDDLE_GRAY 0.36f
#define MAX_RGB 2048.0f
#define MAX_LUMINANCE ((0.2125f*MAX_RGB)+(0.7154f*MAX_RGB)+(0.0721f*MAX_RGB))

// assume we have converted from RGB to luminance as described in the paper
float L[N] = { 0.0f, 1.0f, 3.0f, 7.0f, 15.0f, 31.0f, 63.0f, 127.0f,
               255.0f, 511.0f, 1023.0f, 2047.0f };
/* L refers to Luminance, wanted to fit on a page */

float L_NormalizedFloats[N];
float scaled_L[N];
float final_L[N];
int   final_pixel_vals[N];

int _tmain(int argc, _TCHAR* argv[])
{
    float sum = 0.0f;
    float log_avg_L = 0.0f;
    float a = MIDDLE_GRAY;

    // Equation (2): log-average luminance over the sample values
    for (int i = 0; i < N; i++)
        sum += logf(delta + L[i]);
    log_avg_L = expf(sum / N);

    for (int i = 0; i < N; i++)
    {
        L_NormalizedFloats[i] = L[i] / MAX_LUMINANCE;         // input, normalized to [0,1]
        scaled_L[i] = (a / log_avg_L) * L[i];                 // Equation (5)
        final_L[i]  = scaled_L[i] / (1.0f + scaled_L[i]);     // Equation (6)
        final_pixel_vals[i] = (int)(final_L[i] * 255.0f + 0.5f); // quantize to 8 bits

        printf("L = %7.1f  ->  final L = %6.4f  ->  pixel = %d\n",
               L[i], final_L[i], final_pixel_vals[i]);
    }
    return 0;
}