Midpoint Texture Mapping
What has all this got to do with games like Doom, Descent and Quake? These games provide a 3D view of the game environment which changes in real time as the player changes position. To provide reasonably smooth screen updates, many frames per second must be calculated and drawn. For example, if 20 frames per second are calculated in a graphics resolution of 320 x 200 pixels, about 1 us is available per pixel. This doesn't sound much, and it isn't: in this time the program has to perform all the 3D, perspective, lighting, texture mapping, and hidden surface calculations as well as game logic. The part of the program which has greatest effect on program speed are the innermost loops, an important one of which is the drawing of surface textures on the screen.
For each pixel on the screen, the game has to choose a bitmap pixel (texel) to draw. The surface to be drawn may be at any distance or orientation relative to the player, so this calculation is not trivial. There is a major difference here between the way that Doom and Descent work. Doom only permits rotation of the player about a vertical axis: he cannot look up and down, or roll from side to side. Also, all surfaces are either vertical or horizontal. This makes texture mapping much easier because the texture can be drawn on the screen by simple linear scaling, and the midpoint line algorithm described above can be use. Texture mapping in Descent is more interesting: full rotation and perspective calculations are required for each screen pixel. This is the type of texture mapping which is described below.
Using the Midpoint Algorithm for Texture Mapping
Texture mapping is the process of projecting a position on one plane onto another plane. One plane corresponds to a position in a bitmap containing an image drawn on that surface; the other surface is the viewing surface or computer screen. The planes may be at any position and orientation relative to each other in three dimensions, so the projection calculations must incorporate offset, rotation and perpective transformations. The equations which relate screen pixel coordinates (x, y) to bitmap texel coordinates (u, v) are as follows:
A..H are constants for a given surface being projected onto a particular row of pixels on the screen. They are related to the player's and surface's position and orientation and can be calculated by straight forward geometry. A..H do not need to be calculated at each screen pixel so their calculation does not have a major impact on the speed of drawing. Speed is crucial, however, in the calculation of (u,v) at each screen pixel position.
Consider the first equation, which relates u and x. Although we do not want to draw the curve u vs. x on the display, the midpoint algorithm still can be used for fast calculation. Two sets of algorithms will be needed: one set relating u to x and another relating v to x. The first step in deriving a midpoint algorithm for this calculation is to identify segments in the graph of u vs. x which can be drawn with the same algorithm. This is not so simple as for the straight line or circle. For the moment, consider the segment where the gradient of u vs. x is between 0 and 1, as we did for the straight line. This means that each time x is incremented, u either does not change or increments by 1 - a decision variable can be used to choose which.
The next step is to write down expressions for the decision variable and its increments. As for the circle these are derived in a very few lines of algebra, resulting in the following:
We use the same trick as we used for the straight line: the decision variable includes a factor 2 to eliminate awkward fractions. In common with the circle drawing algorithm, g(x,u) is a second order equation so the increments turn out to be in two levels.
Algorithms for other inclinations of the u vs. x curve can be derived very similar to the above. In addition, a complete set of algorithms for v vs. x must be derived.
High speed texture mapping using the Midpoint Algorithm is implemented in listing 7. The viewer is looking along the Z axis at a bitmap which is projected onto an imaginary viewing screen between the viewer and the bitmap. For simplicity in this example, no bitmap is used - instead a chessboard is drawn using the bitmap coordinates. The program starts by defining the way in which the bitmap is viewed. The distance of the viewing screen is defined and also the position and orientation of the bitmap relative to the viewer. Directions are expressed as 3D vectors. To avoid the need for floating point arithmetic, the components of the vectors are integers scaled up by a factor of 16. The next step is to calculate the starting positon in texel coordinates for the left end of the row of pixels to be drawn on the screen. This is a brute force projection calculation with no tricks and follows directly from the basic geometry. A factor of 16 is introduced again to compensate for the scaling of 16 used previously in the 3D vectors. The inner loop draws a row of pixels on the screen given the starting point just calculated, and it is here that the midpoint algorithm is used. As for the circle, there are two levels of increment required. A major difference is that there are two separate midpoint calculations because there are two independent parameters to calculate, u and v. Normally, these values would be used to index into a bitmap, but in this example they are used to choose the colour to plot so that a coloured chess board is drawn. The result is illustrated below.
Figure 3 - A chessboard rotated and tilted by listing 9
For comparison, the non-midpoint calculations are included in listing 7. Remove the definition of MIDPOINT to confirm that the results are exactly the same. Table 3 shows the speed achieved by both methods.
The midpoint algorithm is a great improvement over the brute force approach, and it can be improved still further. An obvious improvement is to compile the example for 32 bit arithmetic using the full extended register set of the 80386. The next step is to translate the code into assembler as we did for the line and circle above.
Listing 7 has some limitations introduced for simplicity. Firstly, not all cases of the midpoint algorithm are included in the example - only that for which u increases and v decreases as x increases. It is a fairly simple matter to derive the other cases and add a test before drawing each line to see which algorithm to use. This does not affect the timings appreciably. Secondly, the algorithm only works for the case when no more than one texel increment is required per screen pixel increment. The solution here is store lower resolution versions of the bitmap and choose the one to draw whose resolution is closest to the plotting resolution required.
Improvements and Optimisations
An improved version of the algorithm for texture mapping is shown in listing 8. An optimized version of that algorithm is in listing 9 (C) and listing 10 (assembler).
A slightly different version of the midpoint algorithm to that which is described above is used. This is because, in general, for each increment in screen x coordinate, the increment in bitmap texel coordinates can be positive or negative, and may not be 0 or 1 as in the simpler cases of a line or a circle. For the midpoint algorithm to work there must always be a choice between two alternative texel coordinate values to move to next. The choice unfortunately cannot be precalculated because the gradient of u vs. x varies as x varies. When the inclination of u vs. x is shallow (less than 45 degrees) the choice of the next value of u is either u or u + 1. However, if the curve gets steeper, the choice might be between u + 1 and u + 2. If the curve has negative gradient, the choice could be between u and u - 1. Thus, in general, the choice of next value for u is between u + n and u + n - 1, where n represents the "gear" of the algorithm.
The calculations for Gu, Ku1, ku2, ku21 and ku22 have a parameter DeltaU which corresponds to n above: similarly for v. Since the gradient of u vs. x in general varies as x varies, DeltaU and DeltaV may have to be adjusted (gear changed) as the algorithm runs.
We need a fast way of detecting when a gear change is required. In general, the algorithm is working correctly as long as the ideal values of u and v lie between the two possible texels which are being considered for plotting. This is the case as long as the magnitude of the decision variable is less than the difference in its value between the two adjacent texels. It turns out that this is equivalent to the test abs(Gu) < abs((Cu * (x + 1) + Du) and similarly for Gv. Thus to implement this test needs another increment in the main loop and a fairly complicated comparison of two numbers which slows down the algorithm considerably.
Another method for gear change detection which is true intuitively is based on the signs of the decision variable and its increments. Suppose the graph of u vs. x curls upwards so that the gear has to be increased from time to time as the scan line is drawn. Aas the algorithm starts to lose control, the sign of the decision variable and its increment is the same so this can be used as a simple test for gear changing. Note that the increment appropriate to the sign of the decision variable must be considered. This is the method which is actually implemented in the code examples.
Getting a given algorithm to run quickly is usually a matter of minimizing references to memory outside the CPU as it runs. Clearly, this means using CPU registers as much as possible. Thus, instead of having one loop in x, with all calculations performed and a point plotted within it, I have split the calculations into 3 parts. The first part calculates all the u values, and second part all the v values, and the final part draws the pixels on the screen. Each loop can than execute largely using CPU registers. The code executed when a gear change is required is not particularly optimized since it is rarely executed. Although this is a DOS real mode program, it makes use of the full 32 bit register set of the 386 and later processor. This is important because 32 bit resolution is required for the calculations not to overflow. If you find a way to make it run faster without introducing approximations I would be interested to hear from you via email (I have made no attempt to optimize the setup code or the gear change code as these are hardly ever executed).
The optimised version of the algorithm performs fast enough for over 30 full screen updates per second which is fast enough for realistic animation.
Before using the code sample for a real texture mapping applications, you would need to consider:
The midpoint algorithm provides a method for high speed perspective texture mapping without approximation. It is faster than brute force division, though not hugely. To write a 3D game using this method would be possible with a high specification PC but would leave little time for other processing. I do not have inside information but I suspect today's games use tricks such as linear interpolation, sacrificing exactness for speed. Nevertheless, we have seen how the midpoint algorithm leads to a feasible high speed algorithm for true perspective texture mapping without approximation. More importantly, I hope I have demonstrated how general the midpoint algorithm is in its application.