GameDev.net -- Introduction to Shader Programming Part IV: Programming Pixel Shaders

RacorX9

RacorX9 combines a diffuse reflection model with the specular reflection model. It is based on RacorX6, RacorX7 and a further improved ps.1.4 shader, which gets the specular value from the alpha value of the normal map.

Figure 7 - RacorX9 Diffuse & Specular Lighting

The main point of this example is that it handles both reflection models with the help of only two textures. This leaves some room to use additional textures for other tasks or for more per-pixel lights (see [Gosselin] for a ps.1.4 shader with three lights with falloff).
This example uses two different vertex shaders: one for the ps.1.1 pixel shader and one for the ps.1.4 pixel shader. The vertex shader that feeds the ps.1.1 pixel shader, named SpecDot3Pix.vsh, stores the half vector in oD1 and the light vector in oD0. The two texture coordinates are stored in oT0 and oT1.

    ; half vector -> oD1 ps.1.1
    mad oD1.xyz, r8, c33, c33      ; multiply by a half to bias, then add half
    ...
    ; light -> oD0
    mad oD0.xyz, r8.xyz, c33, c33  ; multiply a half to bias, then add half
    mov oT0.xy, v7.xy
    mov oT1.xy, v7.xy
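
The bias step can be illustrated on the CPU. The following is only a minimal sketch, assuming that c33 holds 0.5 in every component (as the shader comments suggest) and that the color registers only carry values between 0 and 1. The helper names BiasToColorRange and ExpandBx2 are hypothetical and are not part of the RacorX source; ExpandBx2 mirrors what the _bx2 source modifier does in the pixel shader.

```cpp
#include <cmath>

// Hypothetical helper: per-component result of "mad oD1.xyz, r8, c33, c33",
// assuming c33 = 0.5. Maps a component from [-1, 1] into [0, 1].
float BiasToColorRange(float v)
{
    return v * 0.5f + 0.5f;
}

// Hypothetical helper: the _bx2 source modifier, (x - 0.5) * 2.
// Maps a color component from [0, 1] back to [-1, 1].
float ExpandBx2(float c)
{
    return (c - 0.5f) * 2.0f;
}

// Round trip of one vector component through the color register.
float RoundTrip(float v)
{
    return ExpandBx2(BiasToColorRange(v));
}
```

For a component of -0.6, BiasToColorRange returns 0.2, and ExpandBx2 turns that back into roughly -0.6, which is why the pixel shader reads v0 and v1 with the _bx2 modifier.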

The only difference compared to the vertex shader used in RacorX7 is the storage of an additional light vector in oD0. The light vector is necessary to calculate the diffuse reflection in the pixel shader in the same way as shown in RacorX6:

    ps.1.1
    tex t0                    ; color map
    tex t1                    ; normal map
    dp3 r0, t1_bx2, v1_bx2    ; dot(normal,half)
    mul r1, r0, r0            ; raise it to 16th power
    mul r0, r1, r1
    mul r1, r0, r0
    mul r0, r1, r1
    dp3 r1, t1_bx2, v0_bx2    ; dot(normal,light)
    mad r0, t0, r1, r0

Raising the specular value to a higher power with four mul instructions in the pixel shader is a very efficient method. The drawback of visible banding effects is reduced by combining the specular reflection model with a diffuse reflection model. The light vector in the dp3 instruction is used in the same way as in RacorX7.
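
To see where the banding comes from, the four squarings can be replayed on the CPU. This is only a rough sketch under a loudly stated assumption: every intermediate result is rounded to 8 bits, which is a simplification and not the exact precision of any particular graphics card. The names Quantize8, PowerBySquaring and CountDistinctLevels are hypothetical. Each squaring doubles the exponent, so four of them give x^2, x^4, x^8 and x^16.

```cpp
#include <cmath>

// Hypothetical helper: round a value in [0, 1] to the nearest 8-bit level.
// Assumption: the pixel pipeline keeps about 8 bits per channel.
float Quantize8(float x)
{
    return std::round(x * 255.0f) / 255.0f;
}

// Four squarings, mirroring the four mul instructions.
// With quantize == true every intermediate result is rounded to 8 bits.
float PowerBySquaring(float nDotH, bool quantize)
{
    float r = nDotH;
    for (int i = 0; i < 4; ++i)
    {
        r = r * r;
        if (quantize)
            r = Quantize8(r);
    }
    return r;
}

// Pushes all 256 possible 8-bit inputs through the quantized squarings
// and counts how many different output levels remain.
int CountDistinctLevels()
{
    bool seen[256] = {};
    int count = 0;
    for (int i = 0; i < 256; ++i)
    {
        float out = PowerBySquaring(i / 255.0f, true);
        int level = static_cast<int>(std::lround(out * 255.0f));
        if (!seen[level])
        {
            seen[level] = true;
            ++count;
        }
    }
    return count;
}
```

Without quantization, PowerBySquaring agrees with pow(x, 16). With quantization, many neighboring inputs collapse onto the same output level, so CountDistinctLevels returns fewer than 256 levels; those steps are what shows up as bands in the highlight.
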
Compared to the vertex shader above, the second vertex shader, named SpecDot314.psh, stores the half vector in oT2 and oT3 instead of oD1. The texture coordinates, which are used later in the pixel shader for both textures, are stored in oT0:

    ; half vector -> oT2/oT3 ps.1.4
    mad oT2.xyz, r8, c33, c33      ; multiply by a half to bias, then add half
    mad oT3.xyz, r8, c33, c33      ; multiply by a half to bias, then add half
    ...
    ; light -> oD0
    mad oD0.xyz, r8.xyz, c33, c33  ; multiply a half to bias, then add half
    mov oT0.xy, v7.xy
    --------
    ; specular power from a lookup table
    ps.1.4
    ; r1 holds normal map
    ; t0 holds texture coordinates for normal and color map
    ; t2 holds half angle vector
    ; r0 holds color map
    texld r1, t0
    texcrd r4.rgb, t2
    dp3 r4.rg, r4_bx2, r1_bx2      ; (N dot H)
    mov r2, r1                     ; save normal map data to r2
    phase
    texld r0, t0
    texld r1, r4                   ; samples specular value from normal map with u,v
    dp3 r3, r2_bx2, v0_bx2         ; dot(normal,light)
    mad r0, r0, r3, r1.a

What is really new in this pixel shader is the storage of the specular power value in the alpha channel of the normal map. This look-up table is accessed like the look-up table in RacorX8. Therefore the normal map is sampled a second time in phase 2.
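
The idea behind this dependent read can be emulated on the CPU. The following is a simplified sketch, not the RacorX code: it assumes a one-dimensional table instead of the alpha channel of a two-dimensional normal map, nearest-texel sampling, clamped texture addressing and a table with at least two entries. BuildSpecularTable, SampleSpecular and ShadeChannel are hypothetical names.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the alpha channel of the normal map:
// entry i stores pow(i / (size - 1), power) as an 8-bit value.
// Assumes size >= 2.
std::vector<std::uint8_t> BuildSpecularTable(int size, float power)
{
    std::vector<std::uint8_t> table(static_cast<std::size_t>(size));
    for (int i = 0; i < size; ++i)
    {
        float coord = static_cast<float>(i) / static_cast<float>(size - 1);
        long level = std::lround(std::pow(coord, power) * 255.0f);
        table[static_cast<std::size_t>(i)] = static_cast<std::uint8_t>(level);
    }
    return table;
}

// Emulates "texld r1, r4": N dot H is used as a texture coordinate.
// Assumption: clamped addressing and nearest-texel sampling.
float SampleSpecular(const std::vector<std::uint8_t>& table, float nDotH)
{
    float coord = std::max(0.0f, std::min(1.0f, nDotH));
    float last = static_cast<float>(table.size() - 1);
    std::size_t index = static_cast<std::size_t>(std::lround(coord * last));
    return table[index] / 255.0f;
}

// Emulates "mad r0, r0, r3, r1.a" for one color channel:
// color * (N dot L) + specular, clamped like the final shader output.
float ShadeChannel(float colorMap, float nDotL, float specular)
{
    return std::max(0.0f, std::min(1.0f, colorMap * nDotL + specular));
}
```

With a table built by BuildSpecularTable(256, 16.0f), an N dot H of 1 returns the full specular value of 1, while a negative N dot H is clamped to the first entry and returns 0. The power function is evaluated once when the table is built and not per pixel, which is why the shader needs no mul instructions for it.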

If this pixel shader tried to use the v0 color register for the half vector, the two dp3 instructions would have to be moved into phase 2, because ps.1.4 color registers can only be read in phase 2. But then the necessary dependent texture read, done by the second texld instruction in phase 2, would no longer be possible. Therefore the ps.1.4 shader wouldn't work with the half vector in v0 at all.

The look-up table is built up with the following piece of code:

    // specular light look-up table
    void LightEval(D3DXVECTOR4 *col, D3DXVECTOR2 *input,
                   D3DXVECTOR2 *sampSize, void *pfPower)
    {
        float fPower = (float) pow(input->y, *((float*)pfPower));
        col->x = fPower;
        col->y = fPower;
        col->z = fPower;
        col->w = input->x;
    }
    ...
    //
    // create light texture
    //
    if (FAILED(D3DXCreateTexture(m_pd3dDevice, desc.Width, desc.Height, 0, 0,
                                 D3DFMT_A8R8G8B8, D3DPOOL_MANAGED, &m_pLightMap16)))
        return S_FALSE;

    FLOAT fPower = 16;
    if (FAILED(D3DXFillTexture(m_pLightMap16, LightEval, &fPower)))
        return S_FALSE;

    // copy specular power from m_pLightMap16 into the alpha
    // channel of the normal map
    D3DLOCKED_RECT d3dlr;
    m_pNormalMap->LockRect(0, &d3dlr, 0, 0);
    BYTE* pDst = (BYTE*)d3dlr.pBits;

    D3DLOCKED_RECT d3dlr2;
    m_pLightMap16->LockRect(0, &d3dlr2, 0, 0);
    BYTE* pDst2 = (BYTE*)d3dlr2.pBits;

    for (DWORD y = 0; y < desc.Height; y++)
    {
        BYTE* pPixel = pDst;
        BYTE* pPixel2 = pDst2;
        for (DWORD x = 0; x < desc.Width; x++)
        {
            // every texel is four bytes wide: the fourth byte (alpha) of the
            // normal map receives the first byte of the light map texel
            pPixel[3] = pPixel2[0];
            pPixel += 4;
            pPixel2 += 4;
        }
        // advance both pointers to the next row
        pDst += d3dlr.Pitch;
        pDst2 += d3dlr2.Pitch;
    }
    m_pNormalMap->UnlockRect(0);
    m_pLightMap16->UnlockRect(0);
    SAFE_RELEASE(m_pLightMap16);

A specular map in m_pLightMap16 is created as already shown in RacorX8 with the help of the LightEval() function. The values of this map are stored in the alpha values of the normal map after retrieving a pointer to the memory of both maps. The specular map is then released. This way the ps.1.4 pixel shader only uses two texture stages, but there is a weak point in this example.
The ps.1.4 pixel shader is slow compared to the ps.1.1 pixel shader. The higher precision of the specular value has its price. Using the normal map with 2048x1024 pixels for storage of the specular power slows down the graphics card. Using a smaller normal map speeds up the framerate substantially, but on the other hand reduces the precision of the normals. Storing the specular power in an additional texture would eat up one texture stage. Using an equivalent to the ps.1.1 shader, which is shown in the following lines, won't allow us to use more than one or two lights:

    ps.1.4
    texld r0, t0              ; color map
    texld r1, t1              ; normal map
    dp3 r2, r1_bx2, v1_bx2    ; dot(normal, half)
    mul r3, r2, r2            ; raise it to 16th power
    mul r2, r3, r3
    mul r3, r2, r2
    mul r2, r3, r3
    dp3 r1, r1_bx2, v0_bx2    ; dot(normal,light)
    mad r0, r0, r1, r2

The best way to improve the ps.1.1 and ps.1.4 pixel shaders in this example is to store the specular power value in the alpha value of an additional smaller texture, which might add new functionality to this example. This is shown by Kenneth L. Hurley [Hurley] for a ps.1.1 shader and by Steffen Bendel [Bendel] and other ShaderX authors for a ps.1.4 pixel shader.

Summary

This example has shown the usage of a combined diffuse and specular reflection model, while using only two texture stages. It also demonstrates the trade-off which has to be made by using the specular power in the alpha value of a texture, but also its advantage: there are more instruction slots left for using more than one per-pixel light. A rule of thumb might be using up to three per-pixel lights in a scene to highlight the main objects and the rest of the scene should be lit with the help of per-vertex lights.
These examples might be improved by adding an attenuation factor calculated on a per-vertex basis like in RacorX5 or by adding an attenuation map in one of the texture stages [Dietrich00][Hart].

Further Reading

I recommend the article by Philippe Beaudoin and Juan Guardado [Beaudoin] to see a power function in a pixel shader that calculates a high-precision specular power value. David Gosselin implements three lights with a light falloff at the end of his article [Gosselin]. Kenneth Hurley [Hurley] describes how to produce diffuse and specular maps in an elegant way with Paint Shop Pro. Additionally, he describes a ps.1.1 pixel shader that uses a diffuse and a specular texture map to produce better looking diffuse and specular reflections. This pixel shader is an evolutionary step forward compared to the ps.1.1 shaders shown here. Steffen Bendel [Bendel] describes a way to produce a combined diffuse and specular reflection model in a ps.1.4 pixel shader that uses a much higher precision and leads to a better visual experience. That is the reason why he called it smooth lighting.

References

[Beaudoin] Philippe Beaudoin, Juan Guardado, "A Non-Integer Power Function on the Pixel Shader", ShaderX, Wordware Inc., pp ?? - ??, 2002, ISBN 1-55622-041-3

[Bendel] Steffen Bendel, "Smooth Lighting with ps.1.4", ShaderX, Wordware Inc., pp ?? - ??, 2002, ISBN 1-55622-041-3

[Dietrich] Sim Dietrich, "Per-Pixel Lighting", NVIDIA developer web-site.

[Dietrich00] Sim Dietrich, "Attenuation Maps", Game Programming Gems, Charles River Media, pp 543 - 548, ISBN 1-58450-049-2

[Gosselin] David Gosselin, "Character Animation with Direct3D Vertex Shaders", ShaderX, Wordware Inc., pp ?? - ??, 2002, ISBN 1-55622-041-3

[Hurley] Kenneth L. Hurley, "Photo Realistic faces with Vertex and Pixel Shaders", ShaderX, Wordware Inc., pp ?? - ??, 2002, ISBN 1-55622-041-3

[Lengyel] Eric Lengyel, Mathematics for 3D Game Programming & Computer Graphics, Charles River Media Inc., 2002, pp 150 - 157, ISBN 1-58450-037-9

[Taylor] Philip Taylor, "Per-Pixel Lighting", http://msdn.microsoft.com/directx

Acknowledgment

I would like to thank Philip Taylor for permission to use the earth textures from the Shader Workshop held at Meltdown 2001. Additionally I would like to thank Jeffrey Kiel from NVIDIA for proofreading this paper.

Epilogue

Improving this introduction is a constant effort. Therefore I appreciate any comments and suggestions. Improved versions of this text will be published on http://www.direct3d.net and http://www.gamedev.net.

© 2000 - 2002 Wolfgang Engel, Frankenthal, Germany
