The main improvement of RacorX3 over RacorX2 is the addition of a per-vertex diffuse reflection model in the vertex shader. This is one of the simplest lighting calculations, which outputs the color based on the dot product of the vertex normal with the light vector.

RacorX3 uses a light positioned at (0.0, 0.0, 1.0) and a green color.

Figure 11 - RacorX3

As usual we are tracking the life-cycle of the vertex shader.

Vertex Shader Declaration

The vertex shader declaration has to map vertex data to specific vertex shader input registers. In addition to the previous examples, we need to map a normal vector to the input register v3:

// vertex shader declaration
DWORD dwDecl[] =
{
  D3DVSD_STREAM(0),
  D3DVSD_REG(0, D3DVSDT_FLOAT3 ), // position in input register v0
  D3DVSD_REG(3, D3DVSDT_FLOAT3 ), // normal in input register v3
  D3DVSD_END()
};

The corresponding layout of the vertex buffer looks like this:

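struct VERTEX // struct name assumed
{
  FLOAT x, y, z; // The untransformed position for the vertex
  FLOAT nx, ny, nz; // the normal
};

// Declare custom FVF macro (macro name assumed).
#define D3DFVF_VERTEX (D3DFVF_XYZ|D3DFVF_NORMAL)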

Each vertex consists of three position floating point values and three normal floating point values in the vertex buffer. The vertex shader gets the position and normal values from the vertex buffer via v0 and v3.
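
As a rough cross-check of this layout, the following sketch mirrors the vertex structure in portable C++ and verifies its stride and the byte offset of the normal at compile time. The struct name Vertex, the two constants and the use of plain float instead of the Windows FLOAT typedef are assumptions made for illustration; they are not taken from the RacorX3 source.

```cpp
#include <cstddef>

// Hypothetical stand-in for the vertex structure described above.
// Plain float replaces the Windows FLOAT typedef so that the sketch
// compiles without the DirectX headers.
struct Vertex {
    float x, y, z;     // untransformed position, read by the shader via v0
    float nx, ny, nz;  // normal, read by the shader via v3
};

// Stride of one vertex in the vertex buffer, in bytes.
constexpr std::size_t kVertexStride = sizeof(Vertex);

// Byte offset at which the normal starts inside one vertex.
constexpr std::size_t kNormalOffset = offsetof(Vertex, nx);

static_assert(kVertexStride == 6 * sizeof(float), "six tightly packed floats");
static_assert(kNormalOffset == 3 * sizeof(float), "normal follows the position");
```

If the structure ever grows (for example by a texture coordinate), the static_assert lines fail at compile time and point at the declaration that has to be updated as well.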

Setting the Vertex Shader Constant Registers

The vertex shader constants are set in FrameMove() and RestoreDeviceObjects(). This example uses a more elegant way to handle the constant registers. The file const.h, which is included in racorx.cpp and diffuse.vsh, gives the constant registers easier-to-remember names:

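#define CLIP_MATRIX 0
#define CLIP_MATRIX_1 1
#define CLIP_MATRIX_2 2
#define CLIP_MATRIX_3 3

#define INVERSE_WORLD_MATRIX 4
#define INVERSE_WORLD_MATRIX_1 5
#define INVERSE_WORLD_MATRIX_2 6

#define LIGHT_POSITION 11

#define DIFFUSE_COLOR 14
#define LIGHT_COLOR 15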

In FrameMove() a clip matrix and an inverse world matrix are set into the constant registers:

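HRESULT CMyD3DApplication::FrameMove()
{
  // rotates the object about the y-axis
  D3DXMatrixRotationY( &m_matWorld, m_fTime * 1.5f );

  // set the clip matrix (not transposed on the CPU)
  D3DXMATRIX matClip = m_matWorld * m_matView * m_matProj;
  m_pd3dDevice->SetVertexShaderConstant(CLIP_MATRIX, &matClip, 4);

  // set the inverse world matrix for the normal transform (c4 - c6)
  D3DXMATRIX matWorldInverse;
  D3DXMatrixInverse(&matWorldInverse, NULL, &m_matWorld);
  m_pd3dDevice->SetVertexShaderConstant(INVERSE_WORLD_MATRIX, &matWorldInverse, 3);

  return S_OK;
}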

Unlike in the previous examples, the concatenated world, view and projection matrix, which is used to rotate the quad, is not transposed here. This is because the matrix will be transposed in the vertex shader as shown below.

To transform the normal, an inverse 4x3 matrix is sent to the vertex shader via c4 - c6.

The Vertex Shader

The vertex shader is a little bit more complex than the one used in the previous examples:

; per-vertex diffuse lighting

#include "const.h"

vs.1.1

; transpose and transform to clip space
mul r0, v0.x, c[CLIP_MATRIX]
mad r0, v0.y, c[CLIP_MATRIX_1], r0
mad r0, v0.z, c[CLIP_MATRIX_2], r0
add oPos, c[CLIP_MATRIX_3], r0

; transform normal
dp3 r1.x, v3, c[INVERSE_WORLD_MATRIX]
dp3 r1.y, v3, c[INVERSE_WORLD_MATRIX_1]
dp3 r1.z, v3, c[INVERSE_WORLD_MATRIX_2]

; renormalize it
dp3 r1.w, r1, r1
rsq r1.w, r1.w
mul r1, r1, r1.w

; N dot L
; we need L vector towards the light, thus negate sign
dp3 r0, r1, -c[LIGHT_POSITION]

mul r0, r0, c[LIGHT_COLOR] ; modulate against light color
mul oD0, r0, c[DIFFUSE_COLOR] ; modulate against material

The mul, mad and add instructions transform the vertex position to clip space with the untransposed matrix provided in c0 - c3. As such they are nearly functionally equivalent to the transposition of the matrix and the four dp4 instructions shown in the previous examples. There are two caveats to bear in mind: the complex matrix instructions like m4x4 might be faster in software emulation mode, and v0.w is not used here. Because c[CLIP_MATRIX_3] is added unscaled, the w component of the position is implicitly treated as 1. These instructions should save the CPU cycles used for transposing.
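
The equivalence of the two approaches can be checked on the CPU. The following is a minimal sketch, not part of the RacorX3 source: the types Vec4 and Mat4 and the functions transformMulMad, transformDp4 and transpose are hypothetical helpers that imitate what the shader instructions compute, assuming a position whose w component is 1.

```cpp
#include <array>

using Vec4 = std::array<float, 4>;
using Mat4 = std::array<Vec4, 4>;  // four rows, one per constant register

// Imitates the mul/mad/mad/add sequence: the rows of the untransposed
// matrix are scaled by x, y and z and summed up; the last row is added
// unscaled, which treats the w component of the position as 1.
Vec4 transformMulMad(const Vec4& v, const Mat4& m) {
    Vec4 r{};
    for (int i = 0; i < 4; ++i) {
        r[i] = v[0] * m[0][i];   // mul r0, v0.x, c[CLIP_MATRIX]
        r[i] += v[1] * m[1][i];  // mad r0, v0.y, c[CLIP_MATRIX_1], r0
        r[i] += v[2] * m[2][i];  // mad r0, v0.z, c[CLIP_MATRIX_2], r0
        r[i] += m[3][i];         // add oPos, c[CLIP_MATRIX_3], r0
    }
    return r;
}

// Transposition as it would be done on the CPU for the dp4 variant.
Mat4 transpose(const Mat4& m) {
    Mat4 t{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            t[j][i] = m[i][j];
    return t;
}

// Imitates four dp4 instructions against the transposed matrix:
// each output component is the dot product of the vector with one row.
Vec4 transformDp4(const Vec4& v, const Mat4& mT) {
    Vec4 r{};
    for (int i = 0; i < 4; ++i)
        r[i] = v[0] * mT[i][0] + v[1] * mT[i][1] + v[2] * mT[i][2] + v[3] * mT[i][3];
    return r;
}
```

For any position with w equal to 1, transformMulMad(v, m) and transformDp4(v, transpose(m)) should agree up to floating point rounding. For a position with a different w the two variants diverge, which is the second caveat mentioned above.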

The normals are transformed in the following three dp3 instructions and then renormalized with the dp3, rsq and mul instructions.

You can think of a normal transform in the following way: Normal vectors (unlike position vectors) are simply directions in space, and as such they should not get squished in magnitude, and translation doesn't change their direction. They should simply be rotated in some fashion to reflect the change in orientation of the surface. This change in orientation is a result of rotating and squishing the object, but not moving it. The information for rotating a normal can be extracted from the 4x4 transformation matrix by doing transpose and inversion. A more math-related explanation is given in [Haines/Möller][Turkowski].

So the bullet-proof way to use normals is to transform them with the transpose of the inverse of the matrix that is used to transform the object. If the matrix used to transform the object is called M, then we must use the matrix N below to transform the normals of this object.

N = transpose( inverse(M) )
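
To see why this matters, the following sketch applies a non-uniform scaling to a surface direction and its normal. It is an illustration only and not taken from RacorX3: it works on the upper-left 3x3 part of the matrix (translations do not affect normals), uses the row-vector convention of Direct3D, and all type and function names (Vec3, Mat3, dot, mul, transpose, inverse, normalMatrix) are assumed helpers.

```cpp
#include <array>
#include <cmath>

using Vec3 = std::array<float, 3>;
using Mat3 = std::array<Vec3, 3>;

float dot(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Row vector times matrix, as Direct3D does it.
Vec3 mul(const Vec3& v, const Mat3& m) {
    Vec3 r{};
    for (int j = 0; j < 3; ++j)
        r[j] = v[0] * m[0][j] + v[1] * m[1][j] + v[2] * m[2][j];
    return r;
}

Mat3 transpose(const Mat3& m) {
    Mat3 t{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            t[j][i] = m[i][j];
    return t;
}

// Inverse via the adjugate and the determinant.
// Assumes that m is not singular.
Mat3 inverse(const Mat3& m) {
    const float det = m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                    - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                    + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
    Mat3 r{};
    r[0][0] = (m[1][1] * m[2][2] - m[1][2] * m[2][1]) / det;
    r[0][1] = (m[0][2] * m[2][1] - m[0][1] * m[2][2]) / det;
    r[0][2] = (m[0][1] * m[1][2] - m[0][2] * m[1][1]) / det;
    r[1][0] = (m[1][2] * m[2][0] - m[1][0] * m[2][2]) / det;
    r[1][1] = (m[0][0] * m[2][2] - m[0][2] * m[2][0]) / det;
    r[1][2] = (m[0][2] * m[1][0] - m[0][0] * m[1][2]) / det;
    r[2][0] = (m[1][0] * m[2][1] - m[1][1] * m[2][0]) / det;
    r[2][1] = (m[0][1] * m[2][0] - m[0][0] * m[2][1]) / det;
    r[2][2] = (m[0][0] * m[1][1] - m[0][1] * m[1][0]) / det;
    return r;
}

// N = transpose( inverse(M) )
Mat3 normalMatrix(const Mat3& m) {
    return transpose(inverse(m));
}
```

With M scaling only the x axis by 2, a surface direction (1, -1, 0) and its normal (1, 1, 0) stay perpendicular when the normal is transformed with normalMatrix(M). Transforming the normal with M itself gives a vector that is no longer perpendicular to the transformed surface direction.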

The normal can be transformed with the transformation matrix that is used to transform the object (usually the world matrix) in the following cases:

  • Matrix formed from rotations (orthogonal matrix), because the inverse of an orthogonal matrix is its transpose
  • Matrix formed from rotations and translation (rigid-body transforms), because translations do not affect vector direction
  • Matrix formed from rotations and translation and uniform scalings, because such scalings affect only the length of the transformed normal, not its direction. A uniform scaling is simply a matrix which uniformly increases or decreases the object’s size, vs. a non-uniform scaling, which can stretch or squeeze an object. If uniform scalings are used, then the normals do have to be renormalized.

Therefore using the world matrix would be sufficient in this example.

The source nevertheless uses the bullet-proof way. The inverse world matrix is delivered to the vertex shader via c4 - c6. The dp3 instruction handles the matrix in a similar way as dp4.

When a vector is multiplied with a matrix, the vector has to be multiplied with each column of the matrix. dp3 and dp4 can only multiply the vector with a row of the matrix, because each constant register holds one row. In the case of the position data in the previous examples, the matrix is therefore transposed to get the right results.

In the case of the normals, no transposition is done. So dp3 calculates the dot product of the vector with the rows of the matrix. This is like using a transposed matrix, which turns the inverse world matrix into the transpose of the inverse.

The normal is re-normalized with the dp3, rsq and mul instructions. Re-normalizing a vector means scaling its length to 1. That's because we need a unit vector to calculate our diffuse lighting effect.

To calculate a unit vector, divide the vector by its magnitude or length. The magnitude of vectors is calculated by using the Pythagorean theorem:

x^2 + y^2 + z^2 = m^2

The length of the vector is retrieved by

||A|| = sqrt(x^2 + y^2 + z^2)

The magnitude of a vector has a special symbol in mathematics. It is a capital letter enclosed in double vertical bars: ||A||. So dividing the vector by its magnitude is:

UnitVector = Vector / sqrt(x^2 + y^2 + z^2)

The lines of code in the vertex shader that handle the calculation of the unit vector look like this:

; renormalize it
dp3 r1.w, r1, r1  ; (src1.x * src2.x) + (src1.y * src2.y) + (src1.z * src2.z)
rsq r1.w, r1.w    ; if (v != 0 && v != 1.0) v = (float)(1.0f / sqrt(v))
mul r1, r1, r1.w  ; r1 * r1.w

dp3 squares the x, y and z components of the temporary register r1, adds them and returns the result in r1.w. rsq computes the reciprocal square root of r1.w, that is, it divides 1 by the square root of the value, and stores the result in r1.w. mul multiplies all components of r1 with r1.w. Afterwards, the result in r1.w is not used anymore in the vertex shader.

The underlying calculation of these three instructions can be represented by the following formula, which is mostly identical to the formula postulated above:

UnitVector = Vector * 1/sqrt(x^2 + y^2 + z^2)
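
The same three steps can be written down on the CPU. This is only a sketch for comparison with the shader code: the Vec3 type and the functions normalize and length are assumed names, and the input is assumed not to be the zero vector.

```cpp
#include <cmath>

struct Vec3 {
    float x, y, z;
};

// Imitates the dp3/rsq/mul sequence of the shader.
// Assumes that v is not the zero vector.
Vec3 normalize(const Vec3& v) {
    const float lengthSquared = v.x * v.x + v.y * v.y + v.z * v.z;  // dp3 r1.w, r1, r1
    const float invLength = 1.0f / std::sqrt(lengthSquared);        // rsq r1.w, r1.w
    return Vec3{v.x * invLength, v.y * invLength, v.z * invLength}; // mul r1, r1, r1.w
}

// Magnitude ||A|| of a vector.
float length(const Vec3& v) {
    return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
}
```

Normalizing the vector (3, 0, 4), which has a length of 5, should give approximately (0.6, 0, 0.8), and length() of the result should be 1 up to rounding.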

Lighting is calculated with the following three instructions:

dp3 r0, r1, -c[LIGHT_POSITION]
mul r0, r0, c[LIGHT_COLOR] ; modulate against light color
mul oD0, r0, c[DIFFUSE_COLOR] ; modulate against diffuse color

The lighting models used in current games are not based on much physical theory. Game programmers use approximations that try to simulate the way photons are reflected from objects in a rough but efficient manner.

One usually differentiates between different kinds of light sources and different reflection models. The common light sources are called directional light, point light and spotlight. The most common reflection models are ambient, diffuse and specular lighting.

This example uses a directional light source with a diffuse reflection model.

Directional Light

RacorX3 uses a light source at an infinite distance. This simulates the long distance the light beams have to travel from the sun. We treat these light beams as being parallel. This kind of light source is called a directional light source.

Diffuse Reflection

Whereas ambient light is considered to be uniform from any direction, diffuse light simulates the illumination of an object by a particular light source. Therefore, by using the diffuse lighting model, you are able to see that light falls onto the surface of an object from a particular direction.

It is based on the assumption that light is reflected equally well in all directions, so the appearance of the reflection does not depend on the position of the observer. The intensity of the light reflected in any direction depends only on how much light falls onto the surface.

If the surface of the object is facing the light source, which means it is perpendicular to the direction of the light, the density of the incident light is the highest. If the surface is facing the light source at some angle smaller than 90 degrees, the density is proportionally smaller.

The diffuse reflection model is based on a law of physics called Lambert's Law, which states that for ideally diffuse (totally matte) surfaces, the reflected light is determined by the cosine of the angle between the surface normal N and the light vector L.

Figure 12 - Diffuse Lighting

The left figure shows a geometric interpretation of Lambert's Law (see also [RTR]). The middle figure shows the light rays hitting the surface perpendicularly at a distance d apart. The intensity of the light is related to this distance. It decreases as d becomes greater. This is shown by the right figure. The light rays make an angle α with the normal of the plane. This illustrates that the same amount of light that passes through one side of a right-angle triangle is reflected from the region of the surface corresponding to the triangle's hypotenuse. Due to the relationships that hold in a right-angle triangle, the length of the hypotenuse is d/cos α, where d is the length of the considered side. Thus you can deduce that if the intensity of the incident light is I_directed, the amount of light reflected from a unit surface is I_directed cos α. Adjusting this with a coefficient that describes reflection properties of the matter leads to the following equation (see also [Savchenko]):

I_reflected = C_diffuse * I_directed * cos α

This equation demonstrates that the reflection is at its peak for surfaces that are perpendicular to the direction of light, because the cosine reaches its maximum of 1 when α is 0. The reflection diminishes as α grows, because the cosine gets smaller. The light is obscured by the surface if α is more than 90 degrees, because the cosine becomes negative. You will then obtain a negative intensity of the reflected light, which will be clamped by the output registers.

In an implementation of this model, you have to find a way to compute cos α. By definition the dot or scalar product of the light and normal vector can be expressed as

N dot L = ||N|| ||L|| cos α

where ||N|| and ||L|| are the lengths of the vectors. If both vectors are unit length, you can compute cos α as the scalar or dot product of the light and normal vector. Thus the expression is

I_reflected = C_diffuse * I_directed * (N dot L)

So (N dot L) is the same as the cosine of the angle between N and L. Therefore, as the angle decreases, the resulting diffuse value is higher. This is exactly what the dp3 instruction and the first mul instruction are doing. Here is the source with the relevant part of const.h:

#define LIGHT_POSITION 11
#define DIFFUSE_COLOR 14
#define LIGHT_COLOR 15

dp3 r0, r1, -c[LIGHT_POSITION]

mul r0, r0, c[LIGHT_COLOR] ; modulate against light color
mul oD0, r0, c[DIFFUSE_COLOR] ; modulate against material

So the vertex shader registers are involved in the following way:

r0 = (r1 dot -c11) * c14

This example additionally modulates against the blue light color in c15:

r0 = (c15 * (r1 dot -c11)) * c14
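
For comparison, here is a CPU-side sketch of the same calculation. It is an illustration under assumptions and not code from RacorX3: the types Vec3 and Color and the functions saturate and diffuse are made-up names, the normal is assumed to be unit length already, and the clamping to the range 0..1 imitates what the output register oD0 does with the result.

```cpp
#include <algorithm>

struct Vec3 {
    float x, y, z;
};

struct Color {
    float r, g, b;
};

// Imitates the clamping of the color output register to 0..1.
float saturate(float v) {
    return std::min(std::max(v, 0.0f), 1.0f);
}

// normal        ~ r1  (unit length)
// lightPosition ~ c[LIGHT_POSITION], c11
// lightColor    ~ c[LIGHT_COLOR],    c15
// materialColor ~ c[DIFFUSE_COLOR],  c14
Color diffuse(const Vec3& normal, const Vec3& lightPosition,
              const Color& lightColor, const Color& materialColor) {
    // dp3 r0, r1, -c[LIGHT_POSITION]
    const float nDotL = normal.x * -lightPosition.x
                      + normal.y * -lightPosition.y
                      + normal.z * -lightPosition.z;

    // mul r0, r0, c[LIGHT_COLOR]
    // mul oD0, r0, c[DIFFUSE_COLOR]
    return Color{saturate(nDotL * lightColor.r * materialColor.r),
                 saturate(nDotL * lightColor.g * materialColor.g),
                 saturate(nDotL * lightColor.b * materialColor.b)};
}
```

With a light at (0.0, 0.0, 1.0), a normal of (0, 0, -1) yields the full product of light and material color, whereas a normal of (0, 0, 1) yields a negative dot product that is clamped to black.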


RacorX3 shows the usage of an include file to give constants names that are easier to remember. It shows how to normalize vectors, and although it only touches on the problem of transforming normals, it shows a bullet-proof method to do it.

The example introduces an optimization technique that eliminates the need to transpose the clip space matrix with the help of the CPU, and it shows the usage of a simple diffuse reflection model that lights the quad on a per-vertex basis.


