Matrix Transformation

        OTMMATX.DOC     -   matrix transformation tutorial

        (or if you prefer, "A proven antidote for VLA 3d tutorials" :)

        Copyright (C) 1995, Zach Mortensen  [Voltaire OTM.TAP]
        All rights reserved.

        email - mortens1@nersc.gov


Table of Contents

        0.  Introduction
        1.  Math Prerequisites
        2.  Importance of Correct Transformations
        3.  What a Matrix Represents
        4.  Matrix Multiplication
        5.  Transforming a Vector by a Matrix
        6.  Object Space Transformations
        7.  Camera Transformations
        8.  Inverse Transformations
        9.  Hierarchical Transformations
        A.  Precision
        B.  Conclusion


Introduction

            It has been entirely too long since I wrote my last doc, I just
        had to make another one :)  This is a topic which I have spent a great
        deal of time teaching lately, so I decided to make a doc that would
        expedite the process for me.

            If you are looking for advice from a certified expert, you will
        not find it in this doc.  I am merely an individual who has spent the
        past year studying realtime rendering, and who is willing to share
        what he has learned.  However, you should rest assured that I do not
        write docs about things I have never implemented in working code, and
        I never code anything I don't fully understand.  In other words, I
        think I know what I am talking about here :)

            All of the techniques in this file can be implemented regardless
        of the programming language you use, from assembler to C to Visual
        Basic.  I will, however, be giving any pseudocode examples in C,
        because it seems to be the universal language of coders right now.
        For the sake of simplicity, all code examples will use floating point
        math.  Floating point is the wave of the future, as a matter of fact
        its faster than fixed point integer math on the newest RISC machines
        (PowerPC, DEC Alpha).  Unfortunately, Intel users have been given
        hideously slow floating point processors in the past and are less than
        confident in the ability of their machines; but things will get better
        very shortly.


Math Prerequisites

            Sadly, matrix math is something that is all but ignored in most
        high school curricula in the United States.  But hey, the best stuff
        is skipped in high school, everybody knows that :)  You should not be
        afraid of matrix math just because your high school math teacher does
        not teach it.  Now that I think about it, matrix math is probably
        skipped in high school simply because there is nothing taught in high
        school that applies its principles.

            The only matrix math I was taught in high school was for solving
        systems of equations.  Basically, we were taught that a matrix is a
        simple and powerful way to represent a complex system of equations.
        This is a great explanation, although extremely simplistic.  Keep this
        in mind throughout the course of this doc.

            The math of matrices is very simple.  Nothing higher than first
        year algebra is used, although an understanding of trigenometry will
        make your life much easier when it comes to understanding the meaning
        of all those numbers in a matrix.  If you have had a course in Linear
        Algebra at a university, this doc will probably be old news to you.


Importance of Correct Transformations

            When I first began coding vector graphics, I had a great
        understanding of trigenometry and anything but fond memories of matrix
        math as it was presented in my high school courses.  To me, matrix
        math seemed to be an unneccessarily complex way to solve a simple
        algebraic problem, like trying to swat a fly with an elephant gun.
        Indeed, many algebraic problems can be solved without the use of
        matrix math.

            However, vector graphics pose a system of equations that are
        anything but simple to solve algebraically.  Consider the following:
        in a typical vector world, there are many objects.  Each of these
        objects moves in its own space (object space), relative to its own set
        of coordinate axes.  There are several cameras, each of which moves in
        its own space (camera space or eyespace) according to its own set of
        coordinate axes.  At some point, the objects must be transformed from
        object space to eyespace so they can be displayed as the eye sees
        them.  Apart from this, there is the issue of object hierarchies.  In
        a realistic vector environment, object hierarchies must be handled
        correctly.  These issues present formidable challenges without the use
        of matrix math.  However, by using matrix math we can deal with all
        of these issues in a simple and speedy manner.

            Granted, I am assuming that everyone wants to make simulation
        quality vector code.  This type of code requires correct
        transformations.  Speaking from a demo scene point of view, very few
        demos implement correct transformations simply because they are not
        required for the application.  The stereotypical vector scene in a
        demo is a childish attempt to display speed and pretty rendering on a
        single object.  Most demo coders do not care if their objects are
        moving correctly, they just want them to move around a bit.  This
        attitude is fine, assuming you never want to do anything with your
        code besides show a spinning cube or toroid.

            Any impressive vector application (game, simulator, etc.) requires
        correct transformations.  Consider the following example.  An airplane
        is oriented such that its nose is pointing in the positive z
        direction, its right wing is pointing in the positive x direction, its
        cockpit is pointing in the positive y direction.  The airplane's local
        x, y, and z axes are aligned with the world x, y, and z axes.  If this
        airplane were rotated 90 degrees about its y axis, its nose would be
        pointing toward the world -x axis, its right wing pointing toward
        the world +z, and its cockpit remaining in the world +y direction.
        Most transformations, whether correct or incorrect, would accomplish
        this.  Here is the 'acid test' to see whether your transformations are
        correct.  From this new position, rotate the airplane about its z
        axis.  If your transformations are correct, the airplane will rotate
        about its own z axis (it will roll).  If your transformation is
        incorrect, the airplane will rotate about the world z axis (its nose
        will pitch up or down).

            This rather silly example poses quite a serious question.  If your
        transformations are not correct, how will you control object movement
        in a vector world?  How will you guarantee your airplane will roll
        when you tell it to instead of pitching?  The same problem arises with
        incorrect camera rotation.  The consequences of such incorrectness in
        a flight simulator or game would make the game unplayable.

            The solution to this problem is simple; the use of 4x4 matrix
        transformations.


What a Matrix Represents

            Before we continue, it will help you greatly to understand what
        the values in a matrix represent.  A 4x4 matrix contains 4 vectors,
        which represent the world space coordinates of the x, y and z unit
        axis vectors, and the world space coordinate which is the origin of
        these axis vectors.

                X     Y     Z     C
            Ú                         ż
            ł Ú   ż Ú   ż Ú   ż Ú   ż ł
            ł ł   ł ł   ł ł   ł ł   ł ł
            ł łX  ł łX  ł łX  ł ł0  ł ł
            ł ł   ł ł   ł ł   ł ł   ł ł
            ł ł   ł ł   ł ł   ł ł   ł ł
            ł łY  ł łY  ł łY  ł ł0  ł ł
            ł ł   ł ł   ł ł   ł ł   ł ł
            ł ł   ł ł   ł ł   ł ł   ł ł
            ł łZ  ł łZ  ł łZ  ł ł0  ł ł
            ł Ŕ   Ů Ŕ   Ů Ŕ   Ů ł   ł ł
            ł ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄż ł   ł ł
        O   ł  X     Y     Z    ł1  ł ł
            ł ŔÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄŮ Ŕ   Ů ł
            Ŕ                         Ů

        The X column contains the world space coordinates of the local X axis
        The Y column contains the world space coordinates of the local Y axis
        The Z column contains the world space coordinates of the local Z axis

        These vectors are unit vectors.  A unit vector is a vector whose
        magnitude is 1.  Basically, unit vectors are used to define
        directions, when magnitude is not really important.

        The C column always contains the specified values
        The O row contains the world space coordinates of the object's origin

        You can make life easy for yourself by storing matrices which contain
        axis information in each object.  I keep two matrices for every
        object; omatrix, which stores the object space matrix, and ematrix,
        which stores the eyespace matrix for the object.

        A special matrix is the identity matrix:

        Ú                         ż
        ł Ú   ż Ú   ż Ú   ż Ú   ż ł
        ł ł   ł ł   ł ł   ł ł   ł ł
        ł ł1  ł ł0  ł ł0  ł ł0  ł ł
        ł ł   ł ł   ł ł   ł ł   ł ł
        ł ł   ł ł   ł ł   ł ł   ł ł
        ł ł0  ł ł1  ł ł0  ł ł0  ł ł
        ł ł   ł ł   ł ł   ł ł   ł ł
        ł ł   ł ł   ł ł   ł ł   ł ł
        ł ł0  ł ł0  ł ł1  ł ł0  ł ł
        ł Ŕ   Ů Ŕ   Ů Ŕ   Ů ł   ł ł
        ł ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄż ł   ł ł
        ł  0     0     0    ł1  ł ł
        ł ŔÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄŮ Ŕ   Ů ł
        Ŕ                         Ů

        Notice why the identity matrix is special?  The identity matrix
        represents a set of object axes that are aligned with the world axes.
        Remember that the vectors stored in the matrix are unit vectors.  Now,
        because the world x coordinate of the local x axis is 1, the world y
        and z coordinates of the local x axis are 0, and the origin vector
        is [0, 0, 0], the local x axis lies directly on the world x axis.  The
        same is true for the local y and z axes.

        The other special property of the identity matrix is given away in its
        name.  If you are familiar with math, you know that there are identity
        elements in the set of any arithmetic operation.  When an binary
        operation is performed on some operand and the identity element of the
        set, the operand is the result of the operation.  For example,
        identity elements for multiplication and division are 1, and identity
        elements for addition and subtraction are 0.  x + 0 = x; x - 0 = x; x
        * 1 = x; x / 1 = x.  Similarly, [x] * [identity] = [x] (I will denote
        matrices in brackets [] throughout this doc, for example [x] is
        "matrix x").


Matrix Multiplication

            There are two matrix operations which we will use in our
        matrix transformations, multiplying (concatenating) two matrices, and
        transforming a vector by a matrix.  We will now examine the first of
        these two operations, matrix multiplication.

            Matrix multiplication is the operation by which one matrix is
        transformed by another.  A very important thing to remember is that
        matrix multiplication is not commutative.  That is, [a] * [b] != [b] *
        [a].  For now, it will suffice to say that a matrix multiplication
        stores the results of the sum of the products of matrix rows and
        columns.  Here is some example code of a matrix multiplication routine
        which multiplies matrix [a] * matrix [b], then copies the result to
        matrix a.


void matmult(float a[4][4], float b[4][4])
{
    float temp[4][4];       // temporary matrix for storing result
    int i, j;               // row and column counters

    for (j = 0; j < 4; j++)         // transform by columns first
        for (i = 0; i < 4; i++)     // then by rows
            temp[i][j] = a[i][0] * b[0][j] + a[i][1] * b[1][j] +
                         a[i][2] * b[2][j] + a[i][3] * b[3][j];

    for (i = 0; i < 4; i++)         // copy result matrix into matrix a
        for (j = 0; j < 4; j++)
            a[i][j] = temp[i][j];
}

            I have been informed that there is a faster way of multiplying
        matrices, which involves taking the dot product of rows and columns.
        However, I have yet to implement such a method, so I will not discuss
        it here at this time.


Transforming a Vector by a Matrix

            This is the second operation which is required for our matrix
        transformations.  It involves projecting a stationary vector onto
        transformed axis vectors using the dot product.  One dot product is
        performed for each coordinate axis.


        x = x0 * matrix[0][0] + y0 * matrix[1][0] + z0 * matrix[2][0] +
            w0 * matrix[3][0];

        y = x0 * matrix[0][1] + y0 * matrix[1][1] + z0 * matrix[2][1] +
            w0 * matrix[3][1];

        z = x0 * matrix[0][2] + y0 * matrix[1][2] + z0 * matrix[2][2] +
            w0 * matrix[3][2];


        The x0, y0, etc. coordinates are the original object space coordinates
        for the vector.  That is, they never change due to transformation.

        "Alright," you say.  "Where did all the w coordinates come from???"
        Good question :)  The w coordinates come from what is known as a
        homogenous coordinate system, which is basically a way to represent 3d
        space in terms of a 4d matrix.  Because we are limiting ourselves to
        3d, we pick a constant, nonzero value for w (1.0 is a good choice,
        since anything * 1.0 = itself).  If we use this identity axiom, we can
        eliminate a multiply from each of the dot products:


        x = x0 * matrix[0][0] + y0 * matrix[1][0] + z0 * matrix[2][0] +
            matrix[3][0];

        y = x0 * matrix[0][1] + y0 * matrix[1][1] + z0 * matrix[2][1] +
            matrix[3][1];

        z = x0 * matrix[0][2] + y0 * matrix[1][2] + z0 * matrix[2][2] +
            matrix[3][2];


        These are the formulas you should use to transform a vector by a
        matrix.


Object Space Transformations

            Now that we know how to multiply matrices together, we can
        implement the actual formulas used in our transformations.  There are
        three operations performed on a vector by a matrix transformation:
        translation, rotation, and scaling.

            Translation can best be described as linear change in position.
        This change can be represented by a delta vector [dx, dy, dz], where
        dx represents the change in the object's x position, dy represents the
        change in its y position, and dz its z position.

            If done correctly, object space translation allows objects to
        translate forward/backward, left/right, and up/down, relative to the
        current orientation of the object.  Using our airplane as an example,
        if the nose of the airplane is oriented along the object's local z
        axis, then translating the airplane in the +z direction will make the
        airplane move forward (the direction in which its nose is pointing)
        regardless of the airplane's orientation.

        Here is the translation matrix:

        Ú                         ż
        ł Ú   ż Ú   ż Ú   ż Ú   ż ł
        ł ł   ł ł   ł ł   ł ł   ł ł
        ł ł1  ł ł0  ł ł0  ł ł0  ł ł
        ł ł   ł ł   ł ł   ł ł   ł ł
        ł ł   ł ł   ł ł   ł ł   ł ł
        ł ł0  ł ł1  ł ł0  ł ł0  ł ł
        ł ł   ł ł   ł ł   ł ł   ł ł
        ł ł   ł ł   ł ł   ł ł   ł ł
        ł ł0  ł ł0  ł ł1  ł ł0  ł ł
        ł Ŕ   Ů Ŕ   Ů Ŕ   Ů ł   ł ł
        ł ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄż ł   ł ł
        ł  dx    dy    dz   ł1  ł ł
        ł ŔÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄŮ Ŕ   Ů ł
        Ŕ                         Ů

        where [dx, dy, dz] is the displacement vector.  After this operation,
        the object will have moved in its own coordinate system, according to
        the displacement (translation) vector.

            The next operation that is performed by our matrix transformation
        is rotation.  Rotation can be described as circular motion about some
        axis, in this case the axis is one of the object's local axes.  Since
        there are three axes in each object, we need to rotate around each of
        them.  Here are the matrices for rotation about each axis:

        about the x axis:

        Ú                         ż
        ł Ú   ż Ú   ż Ú   ż Ú   ż ł
        ł ł   ł ł   ł ł   ł ł   ł ł
        ł ł1  ł ł0  ł ł0  ł ł0  ł ł
        ł ł   ł ł   ł ł   ł ł   ł ł
        ł ł   ł ł   ł ł   ł ł   ł ł
        ł ł0  ł łcx ł łsx ł ł0  ł ł
        ł ł   ł ł   ł ł   ł ł   ł ł
        ł ł   ł ł   ł ł   ł ł   ł ł
        ł ł0  ł ł-sxł łcx ł ł0  ł ł
        ł Ŕ   Ů Ŕ   Ů Ŕ   Ů ł   ł ł
        ł ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄż ł   ł ł
        ł  0     0     0    ł1  ł ł
        ł ŔÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄŮ Ŕ   Ů ł
        Ŕ                         Ů


        about the y axis:

        Ú                         ż
        ł Ú   ż Ú   ż Ú   ż Ú   ż ł
        ł ł   ł ł   ł ł   ł ł   ł ł
        ł łcy ł ł0  ł ł-sył ł0  ł ł
        ł ł   ł ł   ł ł   ł ł   ł ł
        ł ł   ł ł   ł ł   ł ł   ł ł
        ł ł0  ł ł1  ł ł0  ł ł0  ł ł
        ł ł   ł ł   ł ł   ł ł   ł ł
        ł ł   ł ł   ł ł   ł ł   ł ł
        ł łsy ł ł0  ł łcy ł ł0  ł ł
        ł Ŕ   Ů Ŕ   Ů Ŕ   Ů ł   ł ł
        ł ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄż ł   ł ł
        ł  0     0     0    ł1  ł ł
        ł ŔÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄŮ Ŕ   Ů ł
        Ŕ                         Ů

        about the z axis:

        Ú                         ż
        ł Ú   ż Ú   ż Ú   ż Ú   ż ł
        ł ł   ł ł   ł ł   ł ł   ł ł
        ł łcz ł łsz ł ł0  ł ł0  ł ł
        ł ł   ł ł   ł ł   ł ł   ł ł
        ł ł   ł ł   ł ł   ł ł   ł ł
        ł ł-szł łcz ł ł0  ł ł0  ł ł
        ł ł   ł ł   ł ł   ł ł   ł ł
        ł ł   ł ł   ł ł   ł ł   ł ł
        ł ł0  ł ł0  ł ł1  ł ł0  ł ł
        ł Ŕ   Ů Ŕ   Ů Ŕ   Ů ł   ł ł
        ł ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄż ł   ł ł
        ł  0     0     0    ł1  ł ł
        ł ŔÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄŮ Ŕ   Ů ł
        Ŕ                         Ů


            The cx, sx, cy, sy, cz, and sz variables are the values of the
        cosines and sines of the angles of rotation about the x, y, and z
        axes, respectively.  Remeber that the angles used represent angular
        displacement just as the values used in the translation step denote a
        linear displacement.  Correct transformation CANNOT be accomplished
        with matrix multiplication if you use the cumulative angles of
        rotation.  I have been told that quaternions are able to perform this
        operation correctly, however I know nothing of quaternions and how
        they are implemented.  The incremental angles used here represent
        rotation from the current object orientation.  In other words, by
        rotating 1 degree about the z axis, you are telling your object
        "Rotate 1 degree about your z axis, regardless of your current
        orientation, and regardless of how you got to that orientation."  If
        you think about it a bit, you will realize that this is how the real
        world operates.  In object space, the series of rotations an object
        undergoes to attain a certain orientation have no effect on the
        object space results of any upcoming rotations.

            Now that we know the matrix formulas for translation and rotation,
        we can combine them to transform our objects.  The formula for
        transformations in object space is

        [O] = [O] * [T] * [X] * [Y] * [Z]

        where O is the object's matrix, T is the translation matrix, and X, Y,
        and Z are the rotation matrices for their respective axes.  Remember,
        that order of matrix multiplication is very important!

            The recursive assignment of O poses a question:  What is the
        original value of the object matrix?  To eliminate any terrible errors
        in transformation, the matrices which store an object's orientation
        should always be initialized to identity.


Camera Transformations

            Good camera code is a key to a good vector engine.  If the camera
        does not move correctly in a first-person type simulation, any hope of
        realism is lost.  Fortunately, camera code is very easy to implement
        using matrices.

            Each camera should contain one matrix which defines its current
        orientation.  Like the object matrix, the camera matrix (I call it
        cmatrix) should be initialized to identity to avoid hideous errors.
        The camera matrix is built in exactly the same way as the object
        matrix, using CT, CX, CY and CZ.  These translation and
        rotation matrices are made in exactly the same way as their
        object space counterparts, except for the obvious fact that
        they are made using camera rotation and translation vectors.
        Depending on your implementation, the components of the displacement
        vectors may need to be negated.  Sometimes all of the [dx, dy, dz] and
        [xangle, yangle, zangle] need to be negated, sometimes some of them.
        Once again, it depends on your implementation.  If you find your
        camera is moving forward instead of backward, right instead of left,
        up instead of down, etc. you need to negate a vector component
        somewhere.  Once you get that straightened out, build the camera
        matrix like this

        [C] = [C] * [CT] * [CX] * [CY] * [CZ]

        Once you have made the camera matrix, you can transform your object
        matrices to camera space with a simple matrix multiplication

        [E] = [O] * [C]

        where [E] is the eyespace matrix, [O] is the object space matrix for
        the object, and [C] is the camera space matrix for the camera you are
        using (which we calculated above).

            If you are an incredibly astute individual, you will have realized
        that I have left a step out of this process.  Where did I add the
        world coordinates of the object's origin?  The translation in the
        object matrix is a displacement which is added to the origin of
        the object, but what of the original value of the object's origin?

            The reason I left this step out until now is because it is easiest
        to perform this operation in the middle of transforming an object to
        eyespace.  Something to remember is that the object origin is
        expressed in terms of world space coordinates.  If we want our
        transformations to be correct, we have to add world space coordinates
        to world space coordinates, adding them to camera or object space
        coordinates would give us incorrect results.  The only time we have
        world space coordinates in this process is after object transformation
        is complete.  At this time, we can add the coordinates of the object's
        origin to the origin vector found in the transformation matrix.  Here
        is some example code illustrating this process.  omatrix contains the
        object space transformation matrix, cmatrix contains the camera
        transformation matrix.  ematrix will be the result of omatrix *
        cmatrix.


    (initialize tmatrix, xmatrix, ymatrix, zmatrix...)

    ...

    matmult(omatrix, tmatrix);      // translate the object
    matmult(omatrix, zmatrix);      // rotate about z
    matmult(omatrix, xmatrix);      // rotate about x
    matmult(omatrix, ymatrix);      // rotate about y

    initmatrix(ematrix);            // initialize eye space matrix to identity

    matmult(ematrix, omatrix);      // copy the object space matrix to eye space

    ematrix[3][0] += origin->lx;    // add origin vector
    ematrix[3][1] += origin->ly;
    ematrix[3][2] += origin->lz;

    matmult(ematrix, cmatrix);      // convert world space to eye space



Inverse Transformations

            Inverse transformations are used to perform hidden surface removal
        in object space, without transforming any normal vectors.  Likewise,
        the same technique can be used in shading.  This method provides some
        obvious speed increases over transforming normal vectors, the correct
        culling or shading information can be determined by inversely
        transforming a view or light vector, then taking the dot product of
        this inversely transformed vector and the non-transformed normal
        vectors.  In addition to the rendering time saved by not transforming
        normal vectors, this method can give significant time savings by
        allowing you to hide points as well as polys.  If you determine that
        only the points contained in visible polygons are visible, you can
        avoid transforming points contained in those polygons which are not
        visible.  This usually accounts for around half of the points in an
        object.

            Inverse transformation requires an inverse matrix, which is made
        by negating all of the angles in the constituent rotation matrices and
        multiplying them together in reverse order.  As you can imagine, this
        is quite a slow process, especially when a camera and hierarchical
        object are involved!  Thankfully though, there is a shortcut.

            Certain types of matrices (I believe those that have a determinant
        of 1) can be inverted by swapping rows and columns.  The
        transformation matrix is this type of matrix.  I made a utility to
        test this property, generating an inverse matrix with the negate and
        multiply in reverse order algorithm, and comparing it to the original
        matrix.  I found that transformation matrices do fit the pattern of
        inversion by row and column swapping.

            This means that you can inverse transform your view and light
        vectors with the following formulas (notice that these are identical
        to the regular transformation formulas, but have the row and column
        indices swapped):


        x = x0 * matrix[0][0] + y0 * matrix[0][1] + z0 * matrix[0][2];

        y = x0 * matrix[1][0] + y0 * matrix[1][1] + z0 * matrix[1][2];

        z = x0 * matrix[2][0] + y0 * matrix[2][1] + z0 * matrix[2][2];


        If you are a smart one, you will also notice that there are only 3
        terms in these equations, the w term (translation part of the matrix)
        is missing.  This is because we are inverse transforming, the result
        is object space.  We want all of our vectors to start at the origin
        for this operation, in object space the origin is [0, 0, 0].


Hierarchical Transformations

            I must confess, MechWarrior ][ is the game that inspired me to
        grind out all of this matrix code.  I found it so realistic, being
        able to move in so many different coordinate systems (I wonder if the
        casual user realizes this? :)  I had been wanting to implement all
        this in my code for quite some time, and playing MW2 was the event
        that pushed me over the edge.

            Another thing that this game showed me was the usefulness of
        hierarchical transformations.  If you are lost in my terminology, let
        me explain what I mean by hierarchical transformations.

            Apart from being a great spelling bee word, a hierarchy is an
        organazation of things into levels.  In a very general sense, the
        things higher up in a hierarchy have control over the lower things.
        This applies to business, government, and our favorite subject, matrix
        transformations.

            As far as matrix transformations are concerned, consider an arm as
        an example of a hierarchical object.  The topmost object in the arm
        hierarchy is the bicep or upper arm, followed by the forearm, the
        hand, and the fingers.  Each of these 'objects' has an origin, the
        bicep has the shoulder, the forearm has the elbow, the hand has the
        wrist, and the fingers have the knuckles.

            Here is where the hierarchical part comes into play, remember what
        I said earlier about higher things on a hierarchy being able to
        control the lower things?  Well, move your bicep up in the air by
        pivoting your shoulder.  Surprise!  Your forearm, hand and fingers
        moved with your bicep.  Now move your forearm by bending your elbow.
        Notice that your hand and fingers moved, but your bicep did not.  This
        is because your forearm is lower in the hierarchy than your bicep, and
        therefore has no control over it.

            Here is an interesting question.  How on earth do you
        mathematically represent this hierarchy of objects in a correct
        manner?  More importantly, how can you do it quickly?  Using matrices,
        you can correctly transform a hierarchy of objects in no more time
        than it takes to transform non-hierarchical objects.

            Alright, its time for a little side trip into the dreaded land of
        data structures.  While it is coding hell for the beginner, the land
        of data structures is the playground of any experienced programmer.
        Data structures are easy, its the implementation that kills most
        people.  Most people with little experience in data structures and
        data relationship immediately try to swat a fly with an elephant gun
        by applying every data structure in the book to a simple problem.
        "How about if I make an array of stacks containing doubly linked lists
        of AVL trees, would that work?"  Well, I'm not one to say it would
        not.  However, I have found that the simplest solution to a problem is
        generally the best.

            In this case, the simplest solution to creating a hierarchical
        object data structure is to have parent objects keep track of their
        children only.


typedef struct object
{
    (other stuff here)

    ...

    object **children;
    int numchildren;

} object;


        The **children member of the struct above allows you to dynamically
        allocate enough memory for the children of each object.  For example,
        if we were making an instance of this struct for a hand, we would
        allocate an array of five pointers for **children.  These pointers
        would be set to the addresses of the object structs which contain the
        children of the hand, namely the fingers.

            Now that we have discussed how to implement the hierarchy data
        structures, lets move on to cover how to perform the actual
        hierarchical transformations.  First, if you will recall, our system
        of matrix transformations handles incremental rotation and
        translation; that is, objects are transformed according to some change
        in position and change in angles of rotation.  This feature lends
        itself well to hierarchical transformation.

            If you were to transform two objects by the same matrix, what
        would be the result?  The two objects would have the same orientation
        relative to each other before and after the transformation.  This
        property of matrix transformation is key to hierarchical
        transformations.  Consider the arm example again.  Keeping your elbow
        and wrist stiff, extend your arm in front of you and raise and lower
        it by rotating your arm at the shoulder.  Your arm will stay straight,
        it appears that all objects (bicep, forearm, hand, fingers) are
        behaving as one object.

            If this were the extent of a hierarchy's usefulness, we could
        simply transform each child object using the matrix of its parent.
        But we want each child object to be able to move on its own too; that
        is, we want to be able to bend the elbow, wrist, and fingers while we
        raise and lower our shoulders, using the arm example again.  Here is
        where the incremental rotation comes in handy.  Consider the matrix of
        a parent object as a starting point.  The orientation of a child can
        then be expressed in terms of this parent orientation plus some
        rotation and translation increment.  This increment can be expressed
        in terms of an object matrix.

            Actually, we already calculate this matrix in our code; in my code
        its the omatrix member of the object struct.  This matrix contains the
        orientation of the object, and we can easily express this orientation
        in terms of an increment to the orientation of a parent object.  To do
        this, simply understand that when you rotate the forearm 5 degrees,
        you are putting a 5 degree bend in your arm between the forearm and
        the bicep.  Likewise, translating the forearm or hand would move it
        relative to its parent according to the displacement vector specified.
        Of course, the initial position of the object's origin must be given
        as a displacement from the origin of the parent object if you are
        going to use this system of object hierarchy.

            Here is where the big advantages of this method come into play.
        According to what we have already said, the transformation formula for
        a child in an object hierarcy would be

        [O] = [O] * [T] * [X] * [Y] * [Z] * parent.[O] * parent.parent.[O] ...
        [E] = [O] * [C]

        where parent.[O] is the object space matrix of the parent to this
        object, parent.parent.[O] is the object space matrix of the parent's
        parent, etc.

        We can shorten this definition to

        [E] = [O] * [T] * [X] * [Y] * [Z] * parent.[O] * ... * [C]

        Since [E] = [O] * [C], we can substitute parent.[E] = parent.[O] *
        [C].  Then, our transformation formula becomes

        [E] = [O] * [T] * [X] * [Y] * [Z] * parent.[E]

        Which is the same number of matrix multiplies as our original
        transformation formula!

            Now back to data structures for a second, so we can make some
        pseudo code illustrating this point.  In my rendering engine, I keep a
        list of objects to be rendered.  Using this method of hierarchy, only
        the parent object in the hierarchy should be in this list.  This is
        because all objects in the object list are rendered using the camera
        matrix, whereas all children must be rendered using the eyespace
        matrix of their parent (which contains the parent object space matrix
        and the camera matrix).  Therefore, in a call to transform or render
        an object, you should make sure that the object will transform and
        render its children.  here is some pseudocode

void transformobject(object *obj, float cmatrix[4][4])
{

    (do transformations on this object)

    ...

    for (count = 0; count < obj->numchildren; count++)
        transformobject(obj->children[count], obj->ematrix);

}

void renderobject(object *obj)
{

    (render the object however you like)

    ...

    for (count = 0; count < obj->numchildren; count++)
        renderobject(obj->children[count]);

}


Precision

            There is one drawback to this method of matrix transformation, it
        requires that the values stored in your matrix be very precise.  This
        is a problem because all digital representations of fractional numbers
        are approximated somewhat, and no matter how precise the
        approximation, errors will propagate.  This means that the more you
        use an approximated value in calculations, the less precise the value
        becomes.  There are several solutions to this problem, I will briefly
        discuss each and then leave the decision to you.

            The first (and most obvious) solution is to use the most precise
        data type availible.  On the PC, we have 64 bit (double) and 80 bit
        (long double) floating point data types.  Clearly, using these data
        types will result in VERY little loss of precision in calculations.
        This is a viable solution to our problem but may not be desireable due
        to the larger storage requirements and possible speed hits which
        accompany these data types.

            The second solution is to use 32 bit single precision floating
        point data.  Its precision is more than adequate in most cases, and
        its speed is tolerable when a floating point coprocessor is present.
        Also, single precision floats do not require any more space to store
        than long integers.  Disadvantages of this data type are that it is
        somewhat slow even on machines with a floating point coprocessor, and
        there is always the chance that someone without a FPU will use the
        software.  In this case, pure floating point math will be anywhere
        from 10 to 100 times slower than integer math.

            The third option is to use integer math.  The obvious benefit here
        is speed, the obvious disadvantage is precision.  16.16 integer math
        (16 bit integer, 16 bit fraction) allows for less than 5 decimal
        places of precision.  This may sound like a lot, but consider the fact
        that the axis vectors in the transformation matrix all have a
        magnitude of 1, which means that the individual x, y, and z components
        are always less than or equal to 1.  Your fractional bits become very
        important in such a situation.

            As far as my personal preference, I use single precision floating
        point math in my transformation matrix.  I have found it to be the
        best compromise between speed and precision.  As processors become
        more and more adept at floating point calculation, floating point math
        will be faster than integer math.  Many of the newer RISC based
        processors already have floating point math that is faster than its
        integer counterpart.  For example, friends of mine developing
        PowerPC rendering software tell me that floating point math is three
        to four times faster than integer math on their platforms.  The
        PowerPC has a nifty little instruction called fpmuladd which performs
        a floating point multiply and add in once clock cycle!  It makes your
        matrix multiplication routines pretty fast.  There are also faster
        desktop FPUs.  The floating point unit in a DEC Alpha chip is roughly
        the same speed as those in Cray supercomputers made in the early
        1980's!

            Now, about how to handle the loss of precision.  Two things
        happen when a matrix loses precision; its axis vectors change
        magnitude so that they are no longer unit vectors, and its axis
        vectors "wander" around a bit so that they are no longer
        perpendicular to each other.  I have implemented both integer and
        single precision floating point versions of these matrix
        transformations.  I found that the 16.16 fixed point integer versions
        will lose precision after somewhere around 10^2 transformations, while
        the single precision floating point version shows no noticable loss of
        precision after 10^5 transformations.  However, there was a slightly
        noticable wobble of objects in the third level of hierarchy when using
        single precision floating point.  I attribute this to a normal,
        usually unnoticable loss of precision which is more noticable because
        it is shown in the same frame as objects transformed with more precise
        versions of the same matrix.

            If you are a die hard integer math freak who refuses to accept the
        fact that floating point is the wave of the future, there are a few
        things you can do to justify your use of integer math with reasonable
        matrix precision.  First, you could try using 0.32 fixed point in the
        rotation portion of the transformation matrix.  However, you are going
        to have to do some 64 bit shifting around, and some tricky shifting in
        your matrix multiplication routine to accomidate translation values,
        which are almost always greater than 1.  The next method involves
        fixing your matrix so all the vectors are the proper length and
        mutually perpendicular.  This is quite a math problem to solve if you
        have never considered the solution or have no knowledge of vector
        operations.  There are two ways of going about this, one involves dot
        products, the other involves cross products.

            The dot product method is based on the fact that the dot product
        of perpendicular vectors is 0, because the dot product is the
        projection of one vector onto another.  If the dot product is not 0,
        you have a value which is related to how far the vectors overlap in a
        certain direction.  By subtracting this value from the vector, you can
        straighten it in that direction.  After you straighten all your
        vectors in this manner, you re-normalize them so that their lengths
        are 1 (length changes as a result of the perpendicularity correction),
        and you are set.

            The cross product method involves the fact that the cross product
        operation yeilds a vector which is perpendicular to its operands.  By
        taking the cross product of two axes, you can make a third axis which
        is perpendicular to both.  Then, by taking the cross product of this
        new axis and the first of the two axes in the first cross product
        operation, you can generate a new second axis which is perpendicular
        to axes one and three.  One is perpendicular to three and two, two is
        perpendicular to three.  There you have it, mutual perpendicularity.
        Of course, you also need to normalize these vectors when you are done.


Conclusion

            Wow, writing docs is loads of fun...I wonder why I waited this
        long to make a new one?  heh...

            All of this stuff came from my head.  I'm not the type who sits
        down in front of a terminal with a copy of Foley's book (I don't even
        own a copy of that monster...probably never will) or any other book
        for that matter (don't own any books on graphics at all, come to think
        of it :)  On the other hand, I'm not claiming that everything in here
        is my own idea.  Actually, there are no big secrets divulged here,
        sorry if thats what you were looking for.  These techniques are pretty
        much common knowledge, if that were not the case, how would I have
        found out about them? :)  "So why did you write a doc then, fool?"
        Because I have found myself spending lots of time explaining this
        stuff lately, and its better for all parties involved for me to write
        things down in a doc rather than teaching the same things to different
        people day after day.  Now I can just say, "here, read this! :)"

            I have been asked if its ok to publish my other docs in diskmags,
        newsletters, etc.  I have no problem with this whatsoever, as long as
        I am dealt with fairly.  That is, you can publish this doc anywhere as
        long as you do not lead anyone to believe it is not my work.

        Greets to -

            OTM
            TAP coders
            lfp
            HardCode
            Darkshade and wili
            StarScream
            ShadowH
            Daredevil
            MaxD
            tmL
            Simm
            MoominG
            Silvio
            MiKiEx
            Riz

            y todos he olvidado
Discuss this article in the forums
Date this article was posted to GameDev.net: 7/16/1999
(Note that this date does not necessarily correspond to the date the article was written)
See Also:
Matrices