Transformation Math

Let's give our old math teachers a smile :-) . I learned math at school from the early seventies to the mid-eighties (yes ... we've got a different education system here in Germany). At that time, I never thought there would be such an interesting use for it (i.e. game programming). I wonder whether today's math teachers talk about the use of math in computer games.

Any impressive game requires correct transformations: Consider the following example. An airplane, let's say an F22, is oriented such that its nose is pointing in the positive z direction, its right wing is pointing in the positive x direction and its cockpit is pointing in the positive y direction. So the F22's local x, y and z axes are aligned with the world x, y and z axes. If this airplane is to be rotated 90 degrees about its y axis, its nose would be pointing toward the world -x axis, its right wing toward the world z axis and its cockpit will remain in the world +y direction. From this new position, rotate the F22 about its z axis. If your transformations are correct, the airplane will rotate about its own z-axis. If your transformations are incorrect, the F22 will rotate about the world z axis. In Direct3D you can guarantee the correct transformation by using 4x4 matrices.

Matrices are rectangular arrays of numbers. A 4x4 world matrix contains 4 vectors, which represent the world space coordinates of the x, y and z unit axis vectors, and the world space coordinates of the origin of these axis vectors:

| x1  x2  x3  0 |     <- local x axis
| y1  y2  y3  0 |     <- local y axis
| z1  z2  z3  0 |     <- local z axis
| tx  ty  tz  1 |     <- origin (translation)
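
To make the layout concrete, here is a minimal sketch (the position values are made up) that fills such a world matrix by hand for an object whose local axes are still aligned with the world axes, placed at the world position (10, 0, 5):

// Hypothetical example: local axes aligned with the world axes,
// object placed at world position (10, 0, 5).
D3DMATRIX matWorld;
matWorld._11 = 1.0f;  matWorld._12 = 0.0f;  matWorld._13 = 0.0f;  matWorld._14 = 0.0f;  // local x axis
matWorld._21 = 0.0f;  matWorld._22 = 1.0f;  matWorld._23 = 0.0f;  matWorld._24 = 0.0f;  // local y axis
matWorld._31 = 0.0f;  matWorld._32 = 0.0f;  matWorld._33 = 1.0f;  matWorld._34 = 0.0f;  // local z axis
matWorld._41 = 10.0f; matWorld._42 = 0.0f;  matWorld._43 = 5.0f;  matWorld._44 = 1.0f;  // origin (translation)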

Vectors are one of the most important concepts in 3D games. They are mathematical entities that describe a direction and a magnitude (which can, for example, be used for speed). A general purpose vector consists of two points. You can see the direction of the vector by drawing a line from the first point to the second; the magnitude is the distance between the points.

The first point is called the initial point and the second is the final point. Three dimensional games often use a specific kind of vector - the free vector. Its initial point is assumed to be the origin, and only the final point is specified.

Vectors are usually denoted by a bold face letter of the alphabet, e.g. a. So, we could say the vector v = (1,2,3). The first component is the number of units in the x direction, the second component the units in the y direction, and the third component the units in the z direction.
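
As a small illustration (the point values are made up; Magnitude() is one of the d3dvec.inl helpers covered later in this article), a free vector can be built by subtracting the initial point from the final point:

// Hypothetical example: the free vector from (1, 0, 2) to (4, 2, 2).
D3DVECTOR vInitial( 1.0f, 0.0f, 2.0f );
D3DVECTOR vFinal  ( 4.0f, 2.0f, 2.0f );
D3DVECTOR v     = vFinal - vInitial;   // free vector (3, 2, 0)
FLOAT     fDist = Magnitude( v );      // distance between the points: sqrt(13)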

The first row contains the world space coordinates of the local x axis. The second row contains the local y axis and the third row the world space coordinates of the local z axis. These vectors are unit vectors whose magnitude is 1. Basically, unit vectors are used to define directions, when magnitude is not really important. The last row contains the world space coordinates of the object's origin, which translates the object.

A special matrix is the identity matrix:

| 1  0  0  0 |
| 0  1  0  0 |
| 0  0  1  0 |
| 0  0  0  1 |

The identity matrix represents a set of object axes that are aligned with the world axes. The world x coordinate of the local x axis is 1, the world y and z coordinates of the local x axis are 0, and the origin vector is (0, 0, 0). So the local model x axis lies directly on the world x axis. The same is true for the local y and z axes. So it's a "set back to the roots" matrix.

This matrix could be set up element by element like this:

D3DMATRIX mat;
mat._11 = 1.0f; mat._12 = 0.0f; mat._13 = 0.0f; mat._14 = 0.0f;
mat._21 = 0.0f; mat._22 = 1.0f; mat._23 = 0.0f; mat._24 = 0.0f;
mat._31 = 0.0f; mat._32 = 0.0f; mat._33 = 1.0f; mat._34 = 0.0f;
mat._41 = 0.0f; mat._42 = 0.0f; mat._43 = 0.0f; mat._44 = 1.0f;

If an object's position in model space corresponds to its position in world space, simply set the world transformation matrix to the identity matrix.
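
As a sketch, assuming the framework's m_pd3dDevice pointer is valid, that could look like this:

// Model space == world space: use the identity matrix as the world transform.
D3DMATRIX matWorld;
D3DUtil_SetIdentityMatrix( matWorld );   // helper from d3dutil.h
m_pd3dDevice->SetTransform( D3DTRANSFORMSTATE_WORLD, &matWorld );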

A typical transformation operation is a 4x4 matrix multiply operation. A transformation engine multiplies a vector representing 3D data, typically a vertex or a normal vector, by a 4x4 matrix. The result is the transformed vector. This is done with standard linear algebra:

Transform        Original     Transformed
Matrix           Vector       Vector

| a b c d |     | x |     | ax + by + cz + dw |     | x' |
| e f g h |  *  | y |  =  | ex + fy + gz + hw |  =  | y' |
| i j k l |     | z |     | ix + jy + kz + lw |     | z' |
| m n o p |     | w |     | mx + ny + oz + pw |     | w' |
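
Written out in code, the arithmetic might look like the sketch below. Keep in mind that Direct3D actually treats the vector as a row vector and multiplies it from the left (which is why the translation sits in the fourth row of the world matrix); the framework already provides D3DMath_VectorMatrixMultiply() for this, so the helper here is only an illustration:

// Illustration only: transform a 3D point (w assumed to be 1) by a 4x4 matrix
// using Direct3D's row-vector convention, then divide by the resulting w.
VOID TransformVector( D3DVECTOR& vDest, const D3DVECTOR& vSrc, const D3DMATRIX& mat )
{
    FLOAT x = vSrc.x*mat._11 + vSrc.y*mat._21 + vSrc.z*mat._31 + mat._41;
    FLOAT y = vSrc.x*mat._12 + vSrc.y*mat._22 + vSrc.z*mat._32 + mat._42;
    FLOAT z = vSrc.x*mat._13 + vSrc.y*mat._23 + vSrc.z*mat._33 + mat._43;
    FLOAT w = vSrc.x*mat._14 + vSrc.y*mat._24 + vSrc.z*mat._34 + mat._44;

    vDest.x = x / w;
    vDest.y = y / w;
    vDest.z = z / w;
}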

Before a vector can be transformed, a transform matrix must be constructed. This matrix holds the data to convert vector data to the new coordinate system. Such an interim matrix must be created for each action (scaling, rotation and translation) that should be performed on the vector. Those matrices are then multiplied together to create a single matrix that represents the combined effect of all of those actions (matrix concatenation). This single matrix, called the transform matrix, can be used to transform one vector or one million vectors. The time spent setting it up is amortized by the ability to re-use it. The concatenation of the world, view and projection matrices is handled internally by Direct3D.

One of the pros of using matrix multiplication is that scaling, rotation and translation all take the same amount of time to perform. So the performance of a dedicated transform engine is predictable and consistent. This allows software developers to make informed decisions regarding performance and quality.

The World Matrix

Usually the world matrix is a combination of an object's translation, rotation and scaling matrices. Code for a world matrix built from a translation and three rotations could look like this:

struct Object
{
    D3DVECTOR vLocation;
    FLOAT     fYaw, fPitch, fRoll;
    ...
    D3DMATRIX matLocal;
};

class CMyD3DApplication : public CD3DApplication
{
    ...
    Object m_pObjects[NUM_OBJECTS];
    ...
};

// in FrameMove()
for( WORD i = 0; i < NUM_OBJECTS; i++ )
{
    // translate the object into its place in the world
    D3DUtil_SetTranslateMatrix( matWorld, m_pObjects[i].vLocation );

    // build the rotation matrices from the object's yaw, pitch and roll
    D3DMATRIX matTemp, matRotateX, matRotateY, matRotateZ;
    D3DUtil_SetRotateYMatrix( matRotateY, m_pObjects[i].fYaw );
    D3DUtil_SetRotateXMatrix( matRotateX, m_pObjects[i].fPitch );
    D3DUtil_SetRotateZMatrix( matRotateZ, m_pObjects[i].fRoll );

    // concatenate the rotations, then combine them with the translation
    D3DMath_MatrixMultiply( matTemp, matRotateX, matRotateY );
    D3DMath_MatrixMultiply( matTemp, matRotateZ, matTemp );
    D3DMath_MatrixMultiply( matWorld, matTemp, matWorld );

    m_pObjects[i].matLocal = matWorld;
}

// in Render()
for( WORD i = 0; i < NUM_OBJECTS; i++ )
{
    ...
    m_pd3dDevice->SetTransform( D3DTRANSFORMSTATE_WORLD, &m_pObjects[i].matLocal );
    ...
}

You can make life easy for yourself by storing matrices which contain axis information in each object structure. We're only storing the world matrix here, because the object itself isn't animated, so a model matrix isn't used. A very important thing to remember is that matrix multiplication is not commutative. That means [a] * [b] != [b] * [a]. The formula for the transformation is

|W| = |M| * |T| * |X| * |Y| * |Z|

where M is the model's matrix, T is the translation matrix and X, Y and Z are the rotation matrices.
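
As a quick sketch of why the order matters (g_PI comes from d3dutil.h; the angle and distance are made-up values), concatenating a rotation and a translation in the two possible orders gives two different matrices: the first spins the object about its own origin and then moves it, the second moves it first and then swings it around the world origin.

// Assumed example: 90 degrees about y and 10 units along x, in both orders.
D3DMATRIX matRot, matTrans, matA, matB;
D3DUtil_SetRotateYMatrix  ( matRot,   g_PI / 2.0f );        // 90 degrees about y
D3DUtil_SetTranslateMatrix( matTrans, 10.0f, 0.0f, 0.0f );  // move 10 units along x

D3DMath_MatrixMultiply( matA, matRot,   matTrans ); // rotate about the model origin, then translate
D3DMath_MatrixMultiply( matB, matTrans, matRot   ); // translate first, then swing around the world origin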

The FrameMove() code above translates the object into its place with D3DUtil_SetTranslateMatrix(). Translation can best be described as a linear change in position. This change can be represented by a delta vector [tx, ty, tz], where tx (often called dx) represents the change in the object's x position, ty (or dy) the change in its y position, and tz (or dz) the change in its z position. You can find D3DUtil_SetTranslateMatrix() in d3dutil.h.

inline VOID D3DUtil_SetTranslateMatrix( D3DMATRIX& m,
                                        FLOAT tx, FLOAT ty, FLOAT tz )
{
    D3DUtil_SetIdentityMatrix( m );
    m._41 = tx;
    m._42 = ty;
    m._43 = tz;
}

      | 1   0   0   0 |
      | 0   1   0   0 |
  =   | 0   0   1   0 |
      | tx  ty  tz  1 |

Using our F22 sample from above, if the nose of the airplane is oriented along the object's local z axis, then translating this airplane in the +z direction by using tz will make the airplane move forward in the direction its nose is pointing.

The next operation performed by our code piece is rotation. Rotation can be described as circular motion about some axis. The incremental angles used here represent rotation from the current orientation. That means that by rotating 1 degree about the z axis, you tell your object to rotate 1 degree about its z axis regardless of its current orientation and regardless of how it got into that orientation. This is how the real world operates.

D3DUtil_SetRotateYMatrix() rotates the objects about the y-axis, where fRads equals the amount you want to rotate about this axis. You can find it, like all the other rotation matrices, in d3dutil.h.

VOID D3DUtil_SetRotateYMatrix( D3DMATRIX& mat, FLOAT fRads )
{
    D3DUtil_SetIdentityMatrix( mat );
    mat._11 =  cosf( fRads );
    mat._13 = -sinf( fRads );
    mat._31 =  sinf( fRads );
    mat._33 =  cosf( fRads );
}

      |  cos(fRads)  0  -sin(fRads)  0 |
      |  0           1   0           0 |
  =   |  sin(fRads)  0   cos(fRads)  0 |
      |  0           0   0           1 |

D3DUtil_SetRotateXMatrix() rotates the objects about the x-axis, where fRads equals the amount you want to rotate about this axis:

VOID D3DUtil_SetRotateXMatrix( D3DMATRIX& mat, FLOAT fRads )
{
    D3DUtil_SetIdentityMatrix( mat );
    mat._22 =  cosf( fRads );
    mat._23 =  sinf( fRads );
    mat._32 = -sinf( fRads );
    mat._33 =  cosf( fRads );
}

      | 1   0            0           0 |
      | 0   cos(fRads)   sin(fRads)  0 |
  =   | 0  -sin(fRads)   cos(fRads)  0 |
      | 0   0            0           1 |

D3DUtil_SetRotateZMatrix() rotates the objects about the z-axis, where fRads equals the amount you want to rotate about this axis:

VOID D3DUtil_SetRotateZMatrix( D3DMATRIX& mat, FLOAT fRads )
{
    D3DUtil_SetIdentityMatrix( mat );
    mat._11 =  cosf( fRads );
    mat._12 =  sinf( fRads );
    mat._21 = -sinf( fRads );
    mat._22 =  cosf( fRads );
}

      |  cos(fRads)  sin(fRads)  0  0 |
      | -sin(fRads)  cos(fRads)  0  0 |
  =   |  0           0           1  0 |
      |  0           0           0  1 |

The prototype of D3DMath_MatrixMultiply() looks like VOID D3DMath_MatrixMultiply( D3DMATRIX& q, D3DMATRIX& a, D3DMATRIX& b ). In other words: q = a * b. Matrix multiplication is the operation by which one matrix is transformed by another. A matrix multiplication stores the results of the sums of the products of matrix rows and columns.

| a b c d |   | A B C D |
| e f g h | * | E F G H | =
| i j k l |   | I J K L |
| m n o p |   | M N O P |

| a*A+b*E+c*I+d*M   a*B+b*F+c*J+d*N   a*C+b*G+c*K+d*O   a*D+b*H+c*L+d*P |
| e*A+f*E+g*I+h*M   e*B+f*F+g*J+h*N   etc.                              |

A slow but more understandable matrix multiplication routine could look like this:

VOID D3DMath_MatrixMultiply( D3DMATRIX& q, D3DMATRIX& a, D3DMATRIX& b )
{
    FLOAT* pA = (FLOAT*)&a;
    FLOAT* pB = (FLOAT*)&b;
    FLOAT  pM[16];

    ZeroMemory( pM, sizeof(D3DMATRIX) );

    for( WORD i=0; i<4; i++ )
        for( WORD j=0; j<4; j++ )
            pM[4*i+j] = pA[4*i+0] * pB[4*0+j] +
                        pA[4*i+1] * pB[4*1+j] +
                        pA[4*i+2] * pB[4*2+j] +
                        pA[4*i+3] * pB[4*3+j];

    memcpy( &q, pM, sizeof(D3DMATRIX) );
}

A more compact version is implemented in the framework's d3dmath.cpp:

VOID D3DMath_MatrixMultiply( D3DMATRIX& q, D3DMATRIX& a, D3DMATRIX& b )
{
    FLOAT* pA = (FLOAT*)&a;
    FLOAT* pB = (FLOAT*)&b;
    FLOAT  pM[16];

    ZeroMemory( pM, sizeof(D3DMATRIX) );

    for( WORD i=0; i<4; i++ )
        for( WORD j=0; j<4; j++ )
            for( WORD k=0; k<4; k++ )
                pM[4*i+j] += pA[4*i+k] * pB[4*k+j];

    memcpy( &q, pM, sizeof(D3DMATRIX) );
}

Once you've built the world transformation matrix, you need to call the SetTransform() method in the Render() method of the Direct3D framework, passing the D3DTRANSFORMSTATE_WORLD flag as the first parameter.

The View Matrix

The view matrix describes the position and the orientation of the viewer in the scene. This is normally the position and orientation of you, looking through the glass of your monitor into the scene. Many authors abstract this mental model by talking about a camera through which you look into the scene.

To rotate and translate the viewer or camera in the scene, three vectors are needed. These could be called the LOOK, UP and RIGHT vectors.

They define a local set of axes for the camera and will be set at the start of the application in the InitDeviceObjects() or in the FrameMove() method of the framework.

static D3DVECTOR vLook  = D3DVECTOR( 0.0f, 0.0f, -1.0f );
static D3DVECTOR vUp    = D3DVECTOR( 0.0f, 1.0f,  0.0f );
static D3DVECTOR vRight = D3DVECTOR( 1.0f, 0.0f,  0.0f );

The LOOK vector is a vector that describes which way the camera is facing. It's the camera's local z axis. To set the camera's look direction so that it is facing into the screen, we would have to set the LOOK vector to D3DVECTOR (0, 0, 1). The LOOK vector alone isn't enough to describe the orientation of the camera. The camera could stand upside down and the LOOK vector wouldn't reflect this change in orientation. The UP vector helps here; it points vertically up relative to the direction the camera points. It's like the camera's y axis. So the UP vector is defined as D3DVECTOR (0, 1, 0). If you turn the camera upside down, the UP vector will be D3DVECTOR (0, -1, 0). We can generate the RIGHT vector from the LOOK and UP vectors by using the cross product of the two vectors.

Taking the cross product of any two vectors forms a third vector perpendicular to the plane formed by the first two. The cross product is used to determine which way polygons are facing. It uses two of the polygon's edges to generate a normal. Thus, it can be used to generate a normal to any surface for which you have two vectors that lie within the surface. Unlike the dot product, the cross product is not commutative: a x b = -(b x a). The magnitude of the cross product of a and b, ||a x b||, is given by ||a|| * ||b|| * sin(θ), where θ is the angle between a and b. The direction of the resultant vector is orthogonal to both a and b.

Furthermore, the cross product is used to derive the plane equation for a plane determined by two intersecting vectors.
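
Written out by components, the cross product looks like the sketch below (the framework already provides a CrossProduct() helper for D3DVECTORs, used in the regeneration code further down; the function here is only to show the formula):

// The cross product of a and b, written out by component.
inline D3DVECTOR Cross( const D3DVECTOR& a, const D3DVECTOR& b )
{
    return D3DVECTOR( a.y*b.z - a.z*b.y,
                      a.z*b.x - a.x*b.z,
                      a.x*b.y - a.y*b.x );
}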

Now imagine the player sitting in the cockpit of an F22 instead of looking at it from the outside. If the player pushes his foot pedals to the left or right, the LOOK and RIGHT vectors have to be rotated about the UP vector (YAW effect) or y axis. If he pushes his flightstick to the right or left, the UP and RIGHT vectors have to be rotated around the LOOK vector (ROLL effect) or z axis. If he pushes the flightstick forward or backward, we have to rotate the LOOK and UP vectors around the RIGHT vector (PITCH effect) or x axis.

There's one problem: when computers handle floating point numbers, small errors accumulate while doing all this rotation math. After a few rotations these rounding errors leave the three vectors no longer perpendicular to each other. It's obviously important for the three vectors to stay at right angles to each other. The solution is Base Vector Regeneration, which must be performed before the vectors are rotated around one another. We'll use the following code to handle base vector regeneration:

vLook  = Normalize( vLook );
vRight = CrossProduct( vUp, vLook );    // cross product of the UP and LOOK vectors
vRight = Normalize( vRight );
vUp    = CrossProduct( vLook, vRight ); // cross product of the LOOK and RIGHT vectors
vUp    = Normalize( vUp );

First, we normalize the LOOK vector, so its length is 1. Vectors with a length of one are called unit or normalized vectors. To calculate a unit vector, divide the vector by its magnitude or length. You can calculate the magnitude of a vector by using the Pythagorean theorem:

x²+y²+z² = m²

The length of the vector is retrieved by

||A|| = sqrt (x² + y² + z²)

It's the square root of the Pythagorean theorem. The magnitude of a vector has a special symbol in mathematics: a capital letter enclosed in two vertical bars, ||A||.

To normalize a vector, the following inline functions are defined in d3dvec.inl:

inline _D3DVECTOR Normalize( const _D3DVECTOR& v )
{
    return v / Magnitude( v );
}

inline D3DVALUE Magnitude( const _D3DVECTOR& v )
{
    return (D3DVALUE) sqrt( SquareMagnitude( v ) );
}

inline D3DVALUE SquareMagnitude( const _D3DVECTOR& v )
{
    return v.x*v.x + v.y*v.y + v.z*v.z;
}

The Normalize() function divides the vector by its magnitude, which is the square root of the sum of the squared components (the Pythagorean theorem).

sqrt() is a mathematical function from the math library of Visual C/C++ provided by Microsoft™. Other compilers should have a similar function.
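
As a quick worked example (the numbers are made up): the vector (3, 4, 0) has the magnitude sqrt(3² + 4² + 0²) = 5, so normalizing it gives (0.6, 0.8, 0).

D3DVECTOR v( 3.0f, 4.0f, 0.0f );
FLOAT     fLength = Magnitude( v );   // sqrt(9 + 16 + 0) = 5
D3DVECTOR vUnit   = Normalize( v );   // (0.6, 0.8, 0.0), magnitude 1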

After normalizing the LOOK vector, we create the RIGHT vector by assigning it the cross product of the UP and LOOK vectors and normalizing it. The UP vector is then created from the cross product of the LOOK and RIGHT vectors and normalized afterwards.

After that, we build the pitch, yaw and roll matrices out of these vectors:

// Matrices for pitch, yaw and roll
D3DMATRIX matPitch, matYaw, matRoll;

// This creates a rotation matrix around the viewer's RIGHT vector.
D3DUtil_SetRotationMatrix( matPitch, vRight, fPitch );

// Creates a rotation matrix around the viewer's UP vector.
D3DUtil_SetRotationMatrix( matYaw, vUp, fYaw );

// Creates a rotation matrix around the viewer's LOOK vector.
D3DUtil_SetRotationMatrix( matRoll, vLook, fRoll );

By multiplying, for example, the LOOK and RIGHT vectors with the matYaw matrix, we rotate these two vectors around the UP vector.

// Now multiply these vectors with the matrices we've just created.
// First we rotate the LOOK & RIGHT vectors about the UP vector
D3DMath_VectorMatrixMultiply( vLook,  vLook,  matYaw );
D3DMath_VectorMatrixMultiply( vRight, vRight, matYaw );

// Then we rotate the LOOK & UP vectors about the RIGHT vector
D3DMath_VectorMatrixMultiply( vLook, vLook, matPitch );
D3DMath_VectorMatrixMultiply( vUp,   vUp,   matPitch );

// Now rotate the RIGHT & UP vectors about the LOOK vector
D3DMath_VectorMatrixMultiply( vRight, vRight, matRoll );
D3DMath_VectorMatrixMultiply( vUp,    vUp,    matRoll );

Now we can set up the view matrix:

D3DMATRIX view;
D3DUtil_SetIdentityMatrix( view );  // defined in d3dutil.h and d3dutil.cpp

view._11 = vRight.x;  view._12 = vUp.x;  view._13 = vLook.x;
view._21 = vRight.y;  view._22 = vUp.y;  view._23 = vLook.y;
view._31 = vRight.z;  view._32 = vUp.z;  view._33 = vLook.z;

view._41 = -DotProduct( vPos, vRight );  // DotProduct() is defined in d3dtypes.h
view._42 = -DotProduct( vPos, vUp );
view._43 = -DotProduct( vPos, vLook );

m_pd3dDevice->SetTransform( D3DTRANSFORMSTATE_VIEW, &view );

      |  vx        ux        nx       0 |
      |  vy        uy        ny       0 |
  =   |  vz        uz        nz       0 |
      | -(v * c)  -(u * c)  -(n * c)  1 |

In this matrix v, u and n are the RIGHT, UP and LOOK direction vectors, and c is the camera's world space position. This matrix contains all the elements needed to translate and rotate vertices from world space to camera space.

The x, y and z translation factors are computed by taking the negative of the dot product between the camera position and the v, u and n vectors. They are negated because moving the camera through the world is equivalent to moving the whole world in the opposite direction relative to the camera.

To rotate the vectors about one another, we change the fPitch, fYaw and fRoll variables like this:

fPitch = -0.3f * m_fTimeElapsed;
...
fPitch = +0.3f * m_fTimeElapsed;
...
fYaw   = -0.3f * m_fTimeElapsed;
...
fYaw   = +0.3f * m_fTimeElapsed;
...
fRoll  = -0.3f * m_fTimeElapsed;
...
fRoll  = +0.3f * m_fTimeElapsed;
...

To keep the behaviour of the objects independent of the frame rate, we scale by the time elapsed since the last frame. To move the camera forward and backward, we change the position variable:

// move forward
vPos.x += fspeed * vLook.x;
vPos.y += fspeed * vLook.y;
vPos.z += fspeed * vLook.z;

// move backward
vPos.x -= fspeed * vLook.x;
vPos.y -= fspeed * vLook.y;
vPos.z -= fspeed * vLook.z;

The Projection Matrix

An interesting transform is the perspective projection, which is used in Direct3D. It converts the camera's viewing frustum (the pyramid-like shape that defines what the camera can see) into a cube-shaped space (with cube-shaped geometry, clipping is much easier). Objects close to the camera are enlarged greatly, while objects farther away are enlarged less; parallel lines are generally not parallel after projection. This transformation applies perspective to a 3D scene: it projects 3D geometry into a form that can be viewed on a 2D display.

The projection matrix is set with D3DUtil_SetProjectionMatrix() in d3dutil.cpp.

HRESULT D3DUtil_SetProjectionMatrix( D3DMATRIX& mat, FLOAT fFOV, FLOAT fAspect,
                                     FLOAT fNearPlane, FLOAT fFarPlane )
{
    if( fabs( fFarPlane - fNearPlane ) < 0.01f )
        return E_INVALIDARG;
    if( fabs( sin( fFOV/2 ) ) < 0.01f )
        return E_INVALIDARG;

    FLOAT w = fAspect * ( cosf(fFOV/2) / sinf(fFOV/2) );
    FLOAT h =   1.0f  * ( cosf(fFOV/2) / sinf(fFOV/2) );
    FLOAT Q = fFarPlane / ( fFarPlane - fNearPlane );

    ZeroMemory( &mat, sizeof(D3DMATRIX) );
    mat._11 = w;
    mat._22 = h;
    mat._33 = Q;
    mat._34 = 1.0f;
    mat._43 = -Q * fNearPlane;

    return S_OK;
}

      | w  0   0      0 |
      | 0  h   0      0 |
  =   | 0  0   Q      1 |
      | 0  0  -Q*Zn   0 |

This code sets up the projection matrix from the field of view (fFOV, in radians), the aspect ratio, and the front (near) and back (far) clipping planes. After the divide by w, the near plane maps to z = 0 and the far plane to z = 1 in cube space. Note that the projection matrix is normalized so that element [3][4] is 1.0. This is done so that w-based range fog will work correctly.
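
A typical call from the framework's application code might look like the following sketch (the field of view, aspect ratio and plane distances are made-up values; in the DX7 framework samples the aspect parameter is usually passed as height divided by width, and g_PI comes from d3dutil.h):

// Assumed values: 90-degree field of view, 0.75 aspect (height/width),
// near plane at 1.0 and far plane at 1000.0.
D3DMATRIX matProj;
D3DUtil_SetProjectionMatrix( matProj, g_PI / 2.0f, 0.75f, 1.0f, 1000.0f );
m_pd3dDevice->SetTransform( D3DTRANSFORMSTATE_PROJECTION, &matProj );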

After this last transform, the geometry must be clipped to the cube space and converted from homogeneous coordinates to screen coordinates by dividing the x, y and z coordinates of each point by w. Direct3D performs these steps internally.

Homogeneous coordinates: Just think of a 3x3 matrix whose rows, let's say i, j and k, are the object's axis vectors. As you've learned above, a 4x4 matrix has a fourth row whose first three elements translate the object. With a 3x3 matrix an object cannot be translated without changing its orientation: if you add some vector (representing the translation) to the i, j and k vectors, their orientation will also change. So we need a fourth dimension, with the so-called homogeneous coordinates.

In this 4D space, every point has a fourth component that measures distance along an imaginary fourth-dimensional axis called w. To convert a 4D point into a 3D point, you divide each of the components x, y and z by w. Every multiple of a point therefore represents the same point; for example, (4, 2, 8, 2) represents the same point as (2, 1, 4, 1).
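
A tiny, hypothetical helper makes the conversion explicit (Direct3D does this for you after clipping, as described below):

// Hypothetical helper: convert a homogeneous point (x, y, z, w) to a 3D point.
D3DVECTOR FromHomogeneous( FLOAT x, FLOAT y, FLOAT z, FLOAT w )
{
    return D3DVECTOR( x / w, y / w, z / w );
}

// Both calls yield the same 3D point (2, 1, 4):
// FromHomogeneous( 4.0f, 2.0f, 8.0f, 2.0f );
// FromHomogeneous( 2.0f, 1.0f, 4.0f, 1.0f );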

To describe distance on the w-axis, we need another vector, l. It points in the positive direction of the w-axis and its magnitude is 1, like the other unit vectors.

In Direct3D, the points remain homogeneous even after being sent through the geometry pipeline. Only after clipping, when the geometry is ready to be displayed, are these points converted into Cartesian coordinates by dividing each component by w.

Before doing the transformation work, the viewport is normally set by the Direct3D IM framework. It defines how the horizontal, vertical and depth components of a polygon's coordinates in cube space will be scaled before the polygon is displayed.
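
For reference, setting a viewport by hand could look like this sketch (the framework normally does this during initialization; the dimensions are made-up values):

// Assumed: a 640x480 render target and the full 0.0 - 1.0 depth range.
D3DVIEWPORT7 vp;
ZeroMemory( &vp, sizeof(vp) );
vp.dwX     = 0;     vp.dwY      = 0;
vp.dwWidth = 640;   vp.dwHeight = 480;
vp.dvMinZ  = 0.0f;  vp.dvMaxZ   = 1.0f;
m_pd3dDevice->SetViewport( &vp );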




Next : Down to the Code