3D Matrix Math Demystified
Note for math weenies: This is not meant to be a full tutorial on matrices, and is also not intended to cover the math behind them in the usual manner. It is my personal way of thinking about rotation matrices in 3D graphics, and it will hopefully provide at least some people with a new (and possibly more intuitive) way of visualizing matrices that they may not have considered before.
For serious 3D graphics, you will need to use matrix math. The problem is that at first glance it seems bloody complicated. The truth is there are simpler ways to think about matrix math than as abstract rectangular arrays of numbers. My personal favorite way of thinking about and visualizing 3D rotation matrices is this:
Matrices can be thought of as representing the transformation (or change) in orientation and position required to get from one Coordinate Space, or Frame of Reference, to another one. Imagine two people, one standing up, and one lying down (or try it yourself). To the person who is standing, up is up, down is down, forward is forward, etc. But to the person lying down, forward is really up, and backward is really down, and down is forward, and up is backward. Earth's gravity dictates that our normal frame of reference has "down" pointing towards the center of the earth, but even then, everything has its own local frame of reference, where "up" and "down" are in relation to the person or thing and not in relation to the earth. All a matrix contains is a vector pointing in the direction that the local reference frame considers "up" when seen from the global, Identity reference frame (you have to have some fixed frame that you measure everything else in relation to), plus another vector pointing "right", and another vector pointing either "forward" or "backward", depending on convention.
The Identity Matrix, which produces no rotation at all, simply has an X Axis Vector of (1, 0, 0), a Y Axis Vector of (0, 1, 0), and a Z Axis Vector of (0, 0, 1). Notice how each Axis Vector extends only along the same axis (X, Y, or Z) of the Identity coordinate system. In Matrix notation, this is usually represented (in column-major order) as:
    | 1  0  0 |
    | 0  1  0 |
    | 0  0  1 |
This notation (which is OpenGL's notation style) fits perfectly into C's two-dimensional arrays, where the first index into the array is the Axis Vector you want (0 for X, 1 for Y, and 2 for Z), and the second index is the component (X, Y, or Z) of that Axis Vector. (It's also backwards from C's normal syntax of [row][column], where in this case you specify [column][row] in the matrix.) For example, the number in "matrix[2][1]" is the Y component of the Z Axis Vector for the matrix. A very useful product of that arrangement in RAM is that if you take the address of the X component (second index is 0) of say, the Y Axis Vector (e.g. "&matrix[1][0]"), the result is a pointer to an ordinary 3D vector, which can be passed into vector math functions that expect a pointer to a 3-value vector. This leads into a different way of thinking about the actual calculation of rotations:
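As a minimal sketch of that layout in C (the Mat3 type and the helper names are made up for illustration, not taken from any particular library), the first index picks the Axis Vector and the second picks the component, so a pointer to any Axis Vector can be handed to an ordinary vector function:

```c
/* Assumed layout: m[axis][component], with axis 0 = X, 1 = Y, 2 = Z. */
typedef float Mat3[3][3];

/* Hypothetical vector helper that expects a pointer to a 3-value vector. */
float vec3_length_squared(const float *v)
{
    return v[0] * v[0] + v[1] * v[1] + v[2] * v[2];
}

/* Fill m with the Identity Matrix: each Axis Vector lies along its own axis. */
void mat3_set_identity(Mat3 m)
{
    for (int axis = 0; axis < 3; axis++) {
        for (int component = 0; component < 3; component++) {
            m[axis][component] = (axis == component) ? 1.0f : 0.0f;
        }
    }
}

/* Return a pointer to one Axis Vector, usable as a plain 3D vector. */
float *mat3_axis(Mat3 m, int axis)
{
    return &m[axis][0];
}
```

With this layout, m[2][1] is the Y component of the Z Axis Vector, and vec3_length_squared(&m[1][0]) measures the Y Axis Vector directly without copying it out of the matrix.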
What you do is this: Take the X Axis Vector of the Rotation Matrix, and scale it (multiply all three of its components by the same scaling factor) by the X component of the 3D point to be rotated, scale the Y Axis Vector by the Y component of the point, and scale the Z Axis Vector by the Z component of the point. Finally, take those three scaled vectors and add them together (add all of the X components to produce the new X, add all of the Ys to produce the new Y, etc.), and the result is your original 3D point, but rotated into the coordinate space defined by the Axis Vectors in your Matrix. The end result is the same as doing a "normal" point times matrix operation, but with an entirely different way of thinking about it.
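The scaling and adding described above can be spelled out quite literally. This is only a sketch, assuming the matrix[axis][component] layout, and all of the function names are hypothetical:

```c
/* Scale a 3D vector: multiply all three components by the same factor. */
void vec3_scale(const float v[3], float s, float out[3])
{
    out[0] = v[0] * s;
    out[1] = v[1] * s;
    out[2] = v[2] * s;
}

/* Add two 3D vectors component by component. */
void vec3_add(const float a[3], const float b[3], float out[3])
{
    out[0] = a[0] + b[0];
    out[1] = a[1] + b[1];
    out[2] = a[2] + b[2];
}

/* Rotate point p through matrix m (assumed layout m[axis][component]). */
void rotate_by_scale_and_add(float m[3][3], const float p[3], float out[3])
{
    float sx[3], sy[3], sz[3], sum[3];

    vec3_scale(m[0], p[0], sx);  /* X Axis Vector scaled by the point's X */
    vec3_scale(m[1], p[1], sy);  /* Y Axis Vector scaled by the point's Y */
    vec3_scale(m[2], p[2], sz);  /* Z Axis Vector scaled by the point's Z */

    vec3_add(sx, sy, sum);       /* add the three scaled vectors together */
    vec3_add(sum, sz, out);
}
```

For example, with a matrix whose X Axis Vector is (0, 1, 0), Y Axis Vector is (-1, 0, 0), and Z Axis Vector is (0, 0, 1) (a quarter turn about Z), the point (1, 2, 3) comes out as (-2, 1, 3).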
See Visualizing Vector Addition for a visual representation of vector addition. Vector Scaling is simply the act of changing the length of the vectors without changing their direction.
To help visualize a rotation matrix, hold out your right hand in a "hand gun" position with thumb up and index finger out, then extend your middle finger so it points out of your palm. Now rotate your whole hand so that your index finger points at you, and you have a standard right-handed Identity Matrix, with your index finger the Z Axis Vector, your thumb the Y Axis Vector, and your middle finger the X Axis Vector. As you rotate your hand, those vectors will rotate in 3D space, and any points multiplied by that matrix will follow. To see for yourself how points are rotated by the matrix using the above scaling and adding method, first start with a point lying along one of the axes, say the point (5, 0, 0) along the X axis. Now the Y and Z components are zero, so the Y and Z Axis Vectors of the Matrix will be scaled to zero, and all you have to think about is the X Axis Vector, which is simply multiplied by 5 (made 5 times as long). So as the X Axis Vector of the Matrix rotates, so does the point which lies on it.
In practice, the scaling and adding of vectors can be simplified to the following math, applicable for use in an actual program. Notice that the X coordinate of the point is only ever multiplied with a component of the X Axis Vector. This is just a simplification and condensation of the scaling and adding of vectors that is still going on under the covers.
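A minimal sketch of that condensed math in C, assuming the matrix[axis][component] layout described earlier (the function name is made up for illustration):

```c
/* Transform point p through the 3x3 rotation matrix m.
   Assumed layout: m[axis][component], axis 0 = X, 1 = Y, 2 = Z. */
void mat3_transform_point(float m[3][3], const float p[3], float out[3])
{
    /* Each point component only ever multiplies its own Axis Vector. */
    float x = p[0] * m[0][0] + p[1] * m[1][0] + p[2] * m[2][0];
    float y = p[0] * m[0][1] + p[1] * m[1][1] + p[2] * m[2][1];
    float z = p[0] * m[0][2] + p[1] * m[1][2] + p[2] * m[2][2];

    /* Written last so that out may safely be the same array as p. */
    out[0] = x;
    out[1] = y;
    out[2] = z;
}
```

Each output line is just one component of the three scaled Axis Vectors summed together, so the point (5, 0, 0) simply becomes 5 times the X Axis Vector.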
Extending to Affinity
The previous section only covered rotational, or "Linear" transformations. Usually you also want to have a matrix be able to translate, or shift, a point through space as well as rotate it, and that sort of operation is called an "Affine" transformation. You do that with either a 3x4 or a 4x4 matrix, but I'll deal with 3x4 matrices to keep things simpler. 4x4 matrices enter into the realm of Homogeneous Coordinates, and Perspective transforms, which are not things you usually have to deal with yourself. A 3x4 matrix (3 tall, 4 wide) basically just adds another Vector to the matrix, but instead of being an Axis Vector, it is a Translation Vector. The representation is usually as follows:
    | Xx  Yx  Zx  Tx |
    | Xy  Yy  Zy  Ty |
    | Xz  Yz  Zz  Tz |
The Translation Vector is merely the amount to shift the point in X, Y, and Z, as seen from the Identity reference frame. It would work the same as if it was outside of the matrix, and you simply added it to the point after the matrix transform. If the matrix was representing a jet plane's orientation in the world, the Translation Vector would merely be the coordinates of the center of the jet plane in the world. When describing matrix multiplication as the scaling and adding of vectors before, the X, Y, and Z components of the point to be rotated scaled the X, Y, or Z Axis Vectors respectively, but what scales the Translation Vector? If you're working with matrices that are larger in one or both dimensions than your vectors, you usually fill in the extra vector components with 1s. So a 3-component X, Y, Z vector gets an extra imaginary Translation Component added to the end which is always a 1, and that Translation Component scales the Translation Vector of the matrix. Scaling by 1 is easy, since it means no change at all. Here's the code to transform a point through a 3x4 matrix:
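One way that could look in C, as a sketch rather than a definitive listing. It assumes the 3x4 matrix is stored as four 3-component vectors (X, Y, and Z Axis Vectors followed by the Translation Vector), and the function name is hypothetical:

```c
/* Transform point p through a 3x4 matrix.
   Assumed layout: m[0], m[1], m[2] are the X, Y, Z Axis Vectors and
   m[3] is the Translation Vector, each with 3 components. */
void mat34_transform_point(float m[4][3], const float p[3], float out[3])
{
    /* The imaginary fourth point component is always 1, so the
       Translation Vector is added in unscaled. */
    float x = p[0] * m[0][0] + p[1] * m[1][0] + p[2] * m[2][0] + m[3][0];
    float y = p[0] * m[0][1] + p[1] * m[1][1] + p[2] * m[2][1] + m[3][1];
    float z = p[0] * m[0][2] + p[1] * m[1][2] + p[2] * m[2][2] + m[3][2];

    out[0] = x;
    out[1] = y;
    out[2] = z;
}
```

With a quarter turn about Z and a Translation Vector of (10, 20, 30), the point (1, 2, 3) rotates to (-2, 1, 3) and then shifts to (8, 21, 33).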
Going in Reverse
Multiplying forwards through a matrix is great, but what if you want to multiply "backwards", to take a point that has been transformed through a matrix, and bring it back into the Identity reference frame where it started from? In the general case, this requires calculating the Inverse of the matrix, which is a lot of work for general matrices. However in the case of standard rotation matrices (said to represent an Orthonormal Basis), where the three Axis Vectors are at perfect right angles to each other and each is exactly one unit long, the Transpose of the matrix happens to also be the Inverse, and the Transpose is created merely by flipping the matrix about its primary diagonal (which runs from upper left to lower right). Here is a visual representation:
    | Xx  Yx  Zx |                 | Xx  Xy  Xz |
    | Xy  Yy  Zy |  transposes to  | Yx  Yy  Yz |
    | Xz  Yz  Zz |                 | Zx  Zy  Zz |
You can either make a transposed version of your matrix and then multiply by that, or you can do the math to directly multiply through the transpose of the matrix without actually changing it. If you go for the latter, it just so happens that instead of scaling and adding, you do Dot Products. The Dot Product of the vectors (X, Y, Z) and (A, B, C) is X*A + Y*B + Z*C. The following code to multiply a point through the Transpose of a matrix simply does 3 Dot Products, Point Dot X Axis Vector, Point Dot Y Axis Vector, and Point Dot Z Axis Vector.
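A sketch of that transposed multiply in C, assuming the matrix[axis][component] layout and an orthonormal rotation matrix; both function names are made up for illustration:

```c
/* Dot Product of two 3D vectors: X*A + Y*B + Z*C. */
float vec3_dot(const float a[3], const float b[3])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

/* Multiply point p through the Transpose of m without building it.
   Assumed layout: m[axis][component]. This only undoes the rotation
   if m is orthonormal (unit-length, mutually perpendicular axes). */
void mat3_transform_point_transposed(float m[3][3], const float p[3], float out[3])
{
    float x = vec3_dot(p, m[0]);  /* Point Dot X Axis Vector */
    float y = vec3_dot(p, m[1]);  /* Point Dot Y Axis Vector */
    float z = vec3_dot(p, m[2]);  /* Point Dot Z Axis Vector */

    out[0] = x;
    out[1] = y;
    out[2] = z;
}
```

Feeding in a point that was rotated forwards through the same matrix, such as (-2, 1, 3) for the quarter turn about Z used earlier, brings back the original (1, 2, 3).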
Going in reverse works similarly for 3x4 matrices with Translation, except that you have to subtract the Translation Vector from the point before rotating it, to properly reverse the order of operations, since when going forwards, the Translation Vector was added to the point after rotation.
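As a rough sketch of that reverse transform, assuming the same four-vector storage for the 3x4 matrix (three Axis Vectors, then the Translation Vector) and an orthonormal rotation part; the function name is hypothetical:

```c
/* Bring a transformed point back through a 3x4 matrix.
   Assumed layout: m[0..2] are the X, Y, Z Axis Vectors and m[3] is
   the Translation Vector. The rotation part must be orthonormal. */
void mat34_inverse_transform_point(float m[4][3], const float p[3], float out[3])
{
    /* Step 1: undo the translation, which was applied last going forwards. */
    float tx = p[0] - m[3][0];
    float ty = p[1] - m[3][1];
    float tz = p[2] - m[3][2];

    /* Step 2: undo the rotation with three Dot Products (the transpose). */
    out[0] = tx * m[0][0] + ty * m[0][1] + tz * m[0][2];
    out[1] = tx * m[1][0] + ty * m[1][1] + tz * m[1][2];
    out[2] = tx * m[2][0] + ty * m[2][1] + tz * m[2][2];
}
```

For a quarter turn about Z with a Translation Vector of (10, 20, 30), the transformed point (8, 21, 33) first becomes (-2, 1, 3) and then rotates back to (1, 2, 3).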
Now things start to get really interesting. Multiplying matrices together, or Concatenating them, allows you to combine the actions of multiple separate matrices into a single matrix, such that points multiplied through it will produce the same result as if you had multiplied them through each of the original matrices in turn. Imagine all the computation that can save you, when you have a lot of points to transform!
I like to say that you multiply one matrix "through" another matrix, since that's what you're really doing mathematically. The process is astoundingly simple, and all you basically do is take the three Axis Vectors of the first matrix (for 3x3 matrices), and do a normal Vector Times Matrix operation on each of them with the second matrix to produce the three Axis Vectors in the result matrix. The same is true for 3x4 matrices being multiplied through each other, in which case you multiply each of the X, Y, and Z Axis Vectors and the Translation Vector of the first matrix through the second matrix as normal, and you get your resultant 3x4 matrix out the other side. Keep in mind that you can also reverse multiply one matrix through another, by using the Transpose of the second matrix (or the appropriate code to do a transposed multiply directly). For a 3x3 matrix, the code for the normal forward case might look like this:
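A minimal sketch of the forward 3x3 case in C, assuming the matrix[axis][component] layout; the function names are made up for illustration:

```c
/* Transform point p through m (assumed layout m[axis][component]). */
void mat3_transform_point(float m[3][3], const float p[3], float out[3])
{
    float x = p[0] * m[0][0] + p[1] * m[1][0] + p[2] * m[2][0];
    float y = p[0] * m[0][1] + p[1] * m[1][1] + p[2] * m[2][1];
    float z = p[0] * m[0][2] + p[1] * m[1][2] + p[2] * m[2][2];
    out[0] = x;
    out[1] = y;
    out[2] = z;
}

/* Multiply matrix a "through" matrix b: each Axis Vector of a is
   transformed through b to give the matching Axis Vector of the result. */
void mat3_multiply_through(float a[3][3], float b[3][3], float out[3][3])
{
    float tmp[3][3];

    for (int axis = 0; axis < 3; axis++) {
        mat3_transform_point(b, a[axis], tmp[axis]);
    }

    /* Copy at the end so that out may be the same array as a or b. */
    for (int axis = 0; axis < 3; axis++) {
        for (int component = 0; component < 3; component++) {
            out[axis][component] = tmp[axis][component];
        }
    }
}
```

If a is a quarter turn about Z and b is a quarter turn about X, transforming (1, 2, 3) through the combined matrix gives (-2, -3, 1), the same as transforming it through a and then through b.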
If you have two matrices, A and B, where A pitches forwards a bit, and B rotates to the right a bit, and you multiply A through B to produce matrix C, then multiplying a point through matrix C will produce the same result as if the point was first multiplied through matrix A, and the resulting point was multiplied through matrix B.
Note: This is the way I prefer to think of my matrix operations right now, but it is actually the reverse of the normal way. When working with matrix operations in APIs such as OpenGL, matrix concatenations actually work in reverse order. In the above example of concatenating A and B into C, multiplying a point by C would actually be the same as first multiplying the point by B, and then multiplying the result by A. You can think of it as each successively concatenated matrix acting in respect to the local coordinate reference frame built up by the previously concatenated matrices, whereas with my method each successive matrix acts in the identity reference frame, which keeps the same order as if you took the time to rotate the point by each original matrix in turn.
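To see that the two orderings really do differ, here is a small sketch assuming the matrix[axis][component] layout. All names are hypothetical, and mat3_concat_reverse_order is only meant to mimic the reversed ordering, not any real API call:

```c
/* Transform point p through m (assumed layout m[axis][component]). */
void mat3_transform_point(float m[3][3], const float p[3], float out[3])
{
    float x = p[0] * m[0][0] + p[1] * m[1][0] + p[2] * m[2][0];
    float y = p[0] * m[0][1] + p[1] * m[1][1] + p[2] * m[2][1];
    float z = p[0] * m[0][2] + p[1] * m[1][2] + p[2] * m[2][2];
    out[0] = x;
    out[1] = y;
    out[2] = z;
}

/* "A through B": points go through a first, then through b. */
void mat3_multiply_through(float a[3][3], float b[3][3], float out[3][3])
{
    float tmp[3][3];
    for (int axis = 0; axis < 3; axis++) {
        mat3_transform_point(b, a[axis], tmp[axis]);
    }
    for (int axis = 0; axis < 3; axis++) {
        for (int component = 0; component < 3; component++) {
            out[axis][component] = tmp[axis][component];
        }
    }
}

/* Reversed ordering: concatenating a and then b yields a matrix
   that sends points through b first, then through a. */
void mat3_concat_reverse_order(float a[3][3], float b[3][3], float out[3][3])
{
    mat3_multiply_through(b, a, out);
}
```

With a as a quarter turn about Z and b as a quarter turn about X, the point (1, 2, 3) ends up at (-2, -3, 1) through mat3_multiply_through(a, b, ...) but at (3, 1, 2) through mat3_concat_reverse_order(a, b, ...), so it pays to know which convention your code is using.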
The End, For Now:
This is by no means a complete treatment of matrix math, but hopefully it will help you to understand the basics better. Once you grasp this much, go dig up a few good books on matrices and learn some of their REAL power, while remembering that at their core, the simpler matrices don't have to be thought of as matrices at all.
Copyright 1998-1999 by Seumas McNally.
Courtesy Of Longbow Digital Artists