
A tutorial for 2d and 3d vector and texture mapped graphics
Version 0.60a
(c) 1994, 1995 by Sébastien Loisel

Acknowledgements: I am very thankful to the following people for providing support and helping in the development of this document. Kevin Hunter, for doing his best to explain a bit of linear algebra to the rest of the world. Simos Hadjiyiannis, for providing everyone with WWW access to this document. For more information about these people, see the end of this file.

Note: This package is Shareware! This means that you can enjoy this package freely, pass it on to your friends, and learn from it. However, if you find that this package (3d document and C source files) has been useful to you, you should register it with me. Registering costs US$10.00 and will allow you to use the source code in any way you like, even for commercial purposes if you want to. You will be registered for life, which means that you can get any future version of Zed3D freely and use it without feeling bad in the morning. (Note, however, that you will have to get the Zed3D package yourself.) Sending me money will encourage me to give you more quality products like this. Please support the shareware concept and support shareware authors.

If you want to register, simply fill in the following registration form and print it, then mail it to my snail-mail address along with a cheque or money order for US$10.00 (no other currency accepted).

Yes, I want to register my copy of the Zed3D document and all future versions. As a bonus, I will be able to use the Zed3D source code in any way I please. Enclosed, you will find my US$10 contribution to the shareware concept.
My name is: __________________________________________
My (snail mail) address is: _____________________________________________
_____________________________________________
_____________________________________________
My e-mail address is: _________________________________
I got Zed3D from: ____________________________________
I would like to see this in future versions of Zed3D:
__________________________________________________
__________________________________________________
__________________________________________________
__________________________________________________
Signature: __________________________________________

Mail to:
Sebastien Loisel / Zed3D registration
1 J.K. Laflamme
Levis, Quebec, Canada
Postal Code G6V 3R1
Cheques should be made out to Sebastien Loisel.

Disclaimer & copyright notice: This document is provided "as is", without guarantee of ANY kind. This document is copyright 1994, 1995 by Sébastien Loisel. It may be distributed freely as long as no fee except shipping and handling is charged for it. Including this document in a commercial package requires the written permission of the author, Sébastien Loisel. The term "document" refers both to this document and the accompanying source code. If you want to reach me, see the end of this file.

____________________

1 An introduction to 3d transformations

First, let's introduce the idea of projections. We have to realize that we are trying to view a 3d world through a 2d "window". Thus, one full dimension is lost, and it would be absurd to think that we can reconstitute it wholly and recognizably with one less dimension. However, there are many forms of depth-cueing that we can introduce in a 2d image to help us recognize that third dimension. Examples include, but are not limited to, reduced lighting or visibility at a distance, reduced size at a distance, out-of-focus blur, etc. Our medium is a screen, ideally flat, and the 3d world being represented is supposed to look as real as possible.
Let's look at how the eyes see things:

Figure 1

The projection rays are the beams of light that reflect off the objects and head to your eye. Macroscopically, they are straight lines. If you're projecting these beams on a plane, you have to find the intersection of these beams with the plane.

Figure 2

The projection points are thus the intersections of the projection rays with the projection plane. Of course, projection points coming from projection lines originating from behind an object are hidden, and thus comes the loss of detail from 3d to 2d (i.e. closer objects can hide objects that lie further away from the eye).

The objects can be defined as a set of (x,y,z) points in 3d space (or 3-space). Thus, a filled sphere can be defined as x**2+y**2+z**2<=1. But we need to simplify the problem right away because of machine limitations. We can safely assume the eye is always outside any object, thus we do not need to define the interior. [Note: a**b stands for "a raised to the bth power", thus x**2 is x squared.]

Now let us define a projection onto a plane. Many different projections could be defined, but the one that interests us is the following, for its simplicity. Projecting takes any given point (x,y,z) and locates it in a (u,v) coordinate system; i.e. space (x,y,z) is mapped onto a plane (u,v). Let's transform (x,y,z) first. Multiplying (x,y,z) by a constant keeps it on a line that passes through (0,0,0) (the origin). We will define that line as the projection ray for our projection. Since all of these lines pass through the origin, we thus define the origin as our eye. Let the constant be k. Thus our transformation as of now is k(x,y,z). Now we want to map everything into a plane. If we select k as being 1/z, assuming z is nonzero, we get a projected point of (1/z)(x,y,z). This becomes (x/z, y/z, 1). Thus all points of any (x,y,z) value except z=0 will be projected with a value z'=z/z of one. Thus all points lie in the plane z=1.
This projection thus fits our need for a projection with linear projectors through the eye onto a plane surface. If you didn't get all that right away, you can always look it up later. The important thing to remember now is this: to project from (x,y,z) onto (u,v) we decide to use

u=x/z
v=y/z

But, I hear you say, this means the eye is facing parallel to the z axis, centered on the origin, and is not banked. What if that is not what I desire? First, it is obvious that if the eye is at location (a,b,c), it will not affect what one would see to move EVERYTHING, including the eye, by (-a,-b,-c). Then the eye will be centered. Second, and almost as obvious, if the eye is rotated around the x, y and z axes by angles of rx, ry and rz respectively, you can rotate EVERYTHING by the exact opposite angles, which will orient the eye correctly without changing what the eye would see.

Translation (moving) is a simple addition; i.e. (x,y,z)+(a,b,c) becomes (x+a,y+b,z+c). Additions are fairly quick. But there is the problem of the rotations.

1.1 Trigonometry

We will do it in two dimensions first.

Figure 3

We have the point defined by (x,y) and we want to rotate it by a counterclockwise angle of b to (x',y'). See figure 3. The radius r stays the same throughout the rotation; by definition, rotation changes the angle but not the distance to the origin. Let r be the radius (not shown on figure 3). We need to express x' and y' as functions of what we know, that is x, y and b. The following can be said:

y'=r sin(a+b)
x'=r cos(a+b)

With the identities sin(a+b)=sin(a)cos(b)+sin(b)cos(a) and cos(a+b)=cos(a)cos(b)-sin(a)sin(b), we substitute:

y'=r sin(a)cos(b)+r cos(a)sin(b)
x'=r cos(a)cos(b)-r sin(a)sin(b)

But from figure 3 we know that r sin(a)=y and r cos(a)=x. We now substitute:

y'=y cos(b)+x sin(b)
x'=x cos(b)-y sin(b)

A special note: you do not normally calculate sines and cosines in real time. Normally, you build a lookup table, that is, an array.
This array contains the precalculated value of sin(x) for all values of x within a certain range. That is, if the smallest possible angle in your model is one half of a degree, then you need a sine table containing 720 entries, each entry being the value of the sine of the corresponding angle. You can even note that the sine is highly symmetric: you need the values of sin(x) only for x between 0 and 90 degrees. Furthermore, you do not need a cosine table, as cos(x)=sin(x+90 degrees). This and fixed point mathematics may become obsolete as floating point processors become faster and faster. A good algorithm for calculating a sine can be very quick today on a reasonably high performance platform, and it shouldn't use powers.

With that said, we will briefly approach 3d transformations. In 3d, all rotations can be performed as a combination of 3 rotations, one about the x axis, one about the y axis and one about the z axis. If you are rotating about the z axis, use the exact same equations as above, and leave the z coordinate untouched. You can extrapolate the two other transformations from that. Note that rotating first about the x axis and then about the y axis is not the same as rotating about the y axis and then the x axis. Consider this: the point (0,0,1) rotated 90° about the x axis becomes (0,1,0). Then, rotated 90° about the y axis, it remains there. Now, rotating first about the y axis yields (1,0,0), which, rotated then about the x axis, stays at (1,0,0). Thus the order of rotations is important [(0,1,0) and (1,0,0) are not the same].

Now, suppose you are facing west (along the x axis, towards negative infinity), and the user pulls the joystick up. You want the ship to dive down as a result. You will have to rotate the ship in the xy plane, which is the same as rotating about the z axis. Things get more complicated when the ship is facing an arbitrary angle.

Here I include a document authored by Kevin Hunter.
Since he did a very nice job of it, I didn't see a reason to re-do it, especially considering the fact that it would probably not have been as tidy. :-) So, from here to chapter two has been entirely written by Kevin Hunter.

_____________________

Unit Vector Coordinate Translations
By Kevin Hunter

One of the more common means of doing "world to eye" coordinate transformations involves maintaining a system of "eye" unit vectors, and using a property of the vector dot product. The advantage of this approach, if your "eye" frame of reference is rotated with respect to the "world" frame of reference, is that you can do all the rotational calculation as simple projections onto the unit vectors. The dot product of two vectors yields a scalar quantity:

R dot S = |R| |S| cos a

where |R| and |S| are the lengths of the vectors R and S, respectively, and a is the angle between the two vectors. If S is a unit vector (i.e. a vector whose length is 1.0), this reduces to

R dot S = |R| cos a

However, a bit of geometry shows that |R| cos a is the portion of the R vector that is parallel to the S vector (|R| sin a is the part perpendicular). This means that if R is a vector with its origin at the eye position, then R can be converted from "world" space into "eye" space by calculating the dot product of R with the eye X, Y, and Z unit vectors. Getting an "eye origin" vector is trivial - all you need to do is subtract the "world" coordinates of the eye point from the "world" coordinates of the point to be transformed. If all this sounds a bit foggy, here's a more concrete example, along with a picture. To make the diagram easier, we'll consider the two-dimensional case first.

Figure 3a

In this figure, the "world" coordinates have been shown in their typical orientation. The "eye" is displaced up and left, and rotated to the right.
Thus, the EyeX vector would have world-space coordinates something like (0.866, -0.500) [I've picked a 30 degree rotation just for the numbers, although my less-than-perfect drawing talents don't make it look like it]. The EyeY vector, which is perpendicular to the EyeX vector, has world-space coordinates (0.500, 0.866). The eye origin point is at something like (1, 2). Based on these numbers, any point in world space can be turned into eye-relative coordinates by the following calculations:

EyeSpace.x = (Px - 1.0, Py - 2.0) dot (0.866, -0.500)
           = (0.866Px - 0.866) + (-0.500Py + 1.000)
           = 0.866Px - 0.500Py + 0.134

EyeSpace.y = (Px - 1.0, Py - 2.0) dot (0.500, 0.866)
           = (0.500Px - 0.500) + (0.866Py - 1.732)
           = 0.500Px + 0.866Py - 2.232

The simple way of implementing this transformation is a two-step approach. First, translate the point by subtracting the eye origin. This gives you the eye-relative vector. Then, perform each of the three dot products to calculate the resulting X, Y, and Z coordinates in eye space. Doing this takes the following operations for each point:

1. Three subtractions to translate the point
2. Three multiplications and two additions for the X coordinate
3. Three multiplications and two additions for the Y coordinate
4. Three multiplications and two additions for the Z coordinate

Since additions and subtractions cost the same in virtually any system, this reduces to a total of nine multiplications and nine additions to transform a point. If you're striving for performance, however, you can make this run a little better. Note, in the two-dimensional example above, that we were able to combine two constants that fell out of the second line into a single constant in the third line. That last constant turns out to be the dot product of the particular eye coordinate vector and the negative of the eye origin point. If you're smart, you can evaluate this constant once at the beginning of a set of point transformations and use it over and over again.
This doesn't change the total number of operations - the three subtractions you take away at the beginning you put back later. The advantage, however, is that the constant is known ahead of time, which may let you keep it in a register or cache area. Basically, you've reduced the number of constants needed to transform the coordinates and the number of intermediate values, which may let you keep things in registers or the cache area. For example, this approach lets you use a stack-based FPU in the following manner:

push P.x
push UnitVector.x
mult
push P.y
push UnitVector.y
mult
add
push P.z
push UnitVector.z
mult
add
push CombinedConstant
add
pop result

This algorithm never has more than three floating point values on the FPU stack at a time, and doesn't need any of the intermediate values to be removed from the FPU. This type of approach can cut down on FPU-to-memory transfers, which take time. Of course, if you don't have a stack-based FPU, this second approach may or may not help. The best thing to do is to try both algorithms and see which works better for you.

Maintaining Systems of Unit Vectors Spinning In Space

To effectively use the system discussed in the previous section, you need to be able to generate and maintain a set of eye-space unit vectors. Generating the vectors is almost never a problem. If nothing else, you can use the world X, Y and Z vectors to start. Life gets more interesting, however, when you try to maintain these vectors over time. Translations of the eye point are never a problem. Translations are implemented by adding or subtracting values to or from the eye origin point; they don't affect the unit vectors. Rotations, however, are another matter. This kind of makes sense, since handling rotations is what maintaining this set of vectors is all about. To be effective, the set of three unit vectors has to maintain two basic properties. First, they have to stay unit vectors.
This means that we can't let their lengths be anything other than 1.0. Second, the three vectors have to stay mutually perpendicular. Ensuring that a vector is a unit vector is very straightforward conceptually. All you need to do is calculate the length of the vector and then divide each of its individual coordinates by that length. The length, in turn, can easily be calculated as SQRT(x^2 + y^2 + z^2). The only interesting part of this is the SQRT() function. We'll come back to that one later.

Keeping the set of vectors perpendicular takes a bit more work. If you restrict rotations of the vector set to sequences of rotations about the individual vectors, it's not too hard to do a decent job of this. For example, if you assume that, in eye space, Z is to the front, X is to the right and Y is down (a common convention, since it lets you use X and Y for screen coordinates in a 0,0 == top-left system), rotation of the eye to the right involves rotating around the Y axis, with Z moving toward X, and X toward -Z. The calculations here are very simple:

Znew = Zold cos a + Xold sin a
Xnew = Xold cos a - Zold sin a

Of course, while you're doing this you need to save at least the original Z vector so that you don't mess up the X calculation by overwriting the Z vector before you get a chance to use it in the second equation, but this is a detail. In a perfect world, when you're done with this rotation, X, Y, and Z are still all unit vectors and still mutually perpendicular. Unfortunately, there's this thing called round-off error that creeps into calculations eventually. With good floating point implementations, this won't show up quickly. If you're having to do fixed-point approximations (for speed, I assume) it will show up a lot faster, since you're typically storing many fewer mantissa bits than standard FP formats. To get things back square again there are a couple of things you can do. The most straightforward is to use the cross product.
The cross product of two vectors yields a third vector which is guaranteed (within your round-off error) to be perpendicular to both of the input vectors. The cross product is typically represented in determinant form:

                | x   y   z  |
A cross B = det | Ax  Ay  Az |
                | Bx  By  Bz |

In more conventional format, the vector resulting from A cross B is

(AyBz - AzBy, AzBx - AxBz, AxBy - AyBx)

One thing about the cross product is that you have to watch out for the sign of things and the right-handedness or left-handedness of your coordinate system. A cross B is not equal to B cross A - it's equal to a vector in exactly the opposite direction. In a right-handed coordinate system, X cross Y is Z, Y cross Z is X, and Z cross X is Y. In a left-handed coordinate system, the arguments of the crosses have to be exchanged.

One tactic, then, is to rotate one of the vectors, and then, instead of rotating the other, to recalculate the other using the cross product. That guarantees that the third vector is perpendicular to the first two. It does not, of course, guarantee that the vector that we rotated stayed perpendicular to the stationary vector. To handle this, we could then adjust either the stationary vector or the one we originally rotated by another cross product if we wanted to. In practice, this frequently isn't necessary, because we rarely are rotating about only one axis. If, for example, we pitch and then roll, we can combine rotations and crosses so that no vector has undergone more than one or two transformations since it was last readjusted via a cross product. This keeps us from building up too much error as we go along.

One other thing to remember about the cross product: the length of the generated vector is the product of the lengths of the two input vectors times the sine of the angle between them (which is 1 for perpendicular vectors). This means that, over time, your calculated "cross" vectors will tend to drift away from "unitness" and will need to be readjusted in length.

Faster!
It is unusual in interactive graphics for absolute and unwavering precision to be our goal. More often (much more often) we're willing to give up a tiny bit of perfection for speed. This is particularly the case with relatively low-resolution screens, where a slight error in scale will be lost in pixel rounding. As a result, we'd like a measure of how often we need to readjust the perpendicularity of our unit vectors. As it happens, our friend the dot product can help us out. Remember that we said that the dot product of two vectors involves the cosine of the angle between them. If our unit vectors are actually perpendicular, the dot product of any pair of them should be zero, since the cosine of a right angle is zero. This lets us take the approach of measuring just how badly things have gotten out of alignment, and correcting only if a tolerable threshold has been exceeded. If we take this approach, we "spend" three multiplications and two additions, hoping to save six multiplications and three additions. If we're within our tolerance more often than not, we're in good shape.

Foley et al. point out that we can make this tradeoff work even better, however. Remember that the dot product represents the projection of one vector along another. Thus, if we find that X dot Y is not zero, the resulting value is the projection of the X vector along the Y vector (or the Y vector along the X vector, if you want). But if we subtract from the X vector a vector parallel to the Y vector whose length is this projection, we end up with X being perpendicular to Y again. The figure may help to make this clearer.

Figure 3b

This lets us use the following algorithm:

A = X dot Y
if (abs(A) > threshold) {
    X -= Y times A
}

This approach uses three multiplies and two adds if we don't need to adjust, and six multiplies and five adds if we do. This represents a savings of three multiplications over the "test and then cross product" approach.
Note that, like the cross product approach, this technique will tend to change the length of the vector being adjusted, so vectors will need to be "re-unitized" periodically.

Perfection?

If we want to use our unit vector system, but we need to model rotations that aren't around one of our unit axes, life gets more difficult. Basically, except in a very few circumstances (such as if the rotation is around one of the world axes) the calculations are much harder. In this case, we have at least two options. The first is to approximate the rotation by incremental rotations that we can handle. This has to be done carefully, however, because rotations are not interchangeable - rotating about X then Y doesn't give the same answer as rotating around Y then X unless the angles are infinitesimally small. The second option is to use quaternions to do our rotations. Quaternions are a very different way of representing rotations and rotational orientations in space. They use four coordinates rather than three, in order to avoid singularities, and can model arbitrary rotations very well. Unfortunately, I'm not completely up on them at this point, so we'll have to defer discussion of them until some future issue when I've had a chance to do my studying.

Efficient Computation of Square Roots

We mentioned earlier that we'd need to look at square roots to "unitize" vectors. Let's assume for the moment that we need to calculate the square root of a number, and we don't have the hardware to do it for us. Let us assume further that multiplications and divisions are cheap, comparatively speaking. How can we go about calculating the square root of a number? [Note: Intel-heads like me think this is an unnatural situation, but it does, in fact, occur out there. For example, the PowerPC can do FP add/subtract/multiply/divide in a single cycle, but lacks sqrt() or trig function support.]

Isaac Strikes Again

Perhaps the simplest approach is to use Newton's method.
Newton's method is a technique for successively approximating the roots of an equation based on the derivative of that equation. To calculate the square root of N, we use the polynomial equation x^2 - N = 0. Assuming that we have a guess at the square root, which we'll call x_i, the next approximation can be calculated by the formula

x_(i+1) = x_i + (N - x_i^2) / (2 x_i)

This begs two questions. First, how do we get started? Second, when do we get there? To answer the first part, Newton's method works really well for a well-behaved function like the square root function. In fact, it will eventually converge to the correct answer no matter what initial value we pick. To answer the second part: "Well, it all depends on where you start." To get an idea of how this iteration performs, let's expand it a bit. Let's assume that our current guess x_i is off by a value e (in other words, the square root of N is actually x_i + e, so N = (x_i + e)^2). In this case,

x_(i+1) = x_i + ((x_i + e)^2 - x_i^2) / (2 x_i)
        = x_i + (x_i^2 + 2 e x_i + e^2 - x_i^2) / (2 x_i)
        = x_i + (2 e x_i + e^2) / (2 x_i)
        = x_i + e + (e^2 / 2 x_i)

But since x_i + e is the answer we were looking for, this says that the error of the next iteration is e^2 / 2 x_i. If the error is small, the operation converges very quickly, because the number of accurate bits doubles with each iteration. If the error is not small, however, it's not so obvious what happens. I'll spare you the math and simply tell you that if the error is large, the error typically halves each iteration (except for the first iteration, in which the error can actually increase). So, while a bad initial guess won't prevent us from finding an answer, it can slow us down a lot. So how do we find a good initial guess? Well, in many cases, we have a good idea what the value ought to be. If we're re-unitizing a vector, we suspect that the vector we're working with is already somewhere near a length of 1.0, so an initial guess of 1.0 will typically work very well.
In this situation, four iterations or so usually gets you a value as accurate as you need. What if we're trying to convert a random vector into a unit vector, or just taking the square root of a number out of the blue? In this case, we can't be sure that 1.0 will be a very good guess. Here are a few things you can do:

1. If you have easy access to the floating point number's exponent, halve it. The square root of A x 2^M is SQRT(A) x 2^(M/2). Using A x 2^(M/2) is a really good first approximation.
2. If you are taking the length of a vector (x,y,z), abs(x)+abs(y)+abs(z) won't be off by more than a factor of 2 - also a good start.
3. It turns out that the operation will converge faster if the initial guess is higher than the real answer. If all else fails, using N/2 will converge pretty quickly.

Don't go crazy trying to get a good first guess. Things converge fast enough even if you're somewhat off that a fast guess and an extra iteration may be faster than taking extra time to refine your guess.

If You Don't Like Newton...

If you're looking for a really accurate answer, Newton's method may not be for you. First, you can't, in general, just iterate a fixed number of times. Rather, you have to check whether the last two iterations differ by less than a threshold, which adds calculation time to each iteration. And if you're in fixed-point land, the assumption we made earlier that multiplications are cheap may not hold. In this case, there is an alternate closed-form algorithm you can use that will produce the "exact" answer in a predefined amount of time. This algorithm also uses an incremental approach. Since (x + a)^2 = x^2 + 2xa + a^2, we can iterate from x^2 to (x+a)^2 by adding 2xa + a^2 to our working sum. If a is 2^N, this translates into adding 2^(N+1) x + 2^(2N). Since we can multiply by 2^N by shifting left N bits, we can take a square root using only shifts and subtractions.
To take the square root of a 32-bit number, for example, this results in an algorithm that looks something like this (the answer has 16 bits, so the iteration starts at 15):

remainder = valueSquared;
answer = 0;
for (iteration = 15; iteration >= 0; iteration--) {
    if (remainder >= (answer << (iteration + 1)) + (1 << (iteration * 2))) {
        remainder -= (answer << (iteration + 1)) + (1 << (iteration * 2));
        answer += 1 << iteration;
    }
}

A "real" algorithm won't repeatedly perform the shifts in the test and the subtraction. Rather, an incremental approach can be taken in which the answer and the "1 bit" are kept in pre-shifted form and shifted to the right as the algorithm proceeds. With appropriate temporary variable space, this algorithm shifts two values and does two subtractions and either one or zero adds or "or"s per iteration, and uses one iteration for each two bits in the original value. This algorithm can be used very effectively on integer values, and can also be adapted to take the square root of the mantissa portion of a floating-point number. In the latter case, care has to be taken to adjust the number into a form in which the exponent is even before the mantissa square root is taken, or the answer won't be correct.

Table It

If you don't need absolute precision, there is a table-driven square root algorithm that you can use as well. The code for the 32-bit FP format ("float") is in Graphics Gems, and the 64-bit FP format ("double") is in Graphics Gems III. The source code is online at ftp://Princeton.edu/pub/Graphics/GraphicsGems. (Note the capital `G'. There is also a /pub/graphics (small `g') which is not what you want.) The way this algorithm works is to reach inside the IEEE floating point format. It checks the exponent to see if it is odd or even, modifies the mantissa appropriately for odd exponents, and then halves the exponent. The mantissa is then turned into a table index with a shift and mask operation, and the new mantissa is found in the table.
The version in Graphics Gems III is set up to allow you to adjust the size of the lookup table, and thus the accuracy of the operation. Whether or not this method is appropriate for you depends on your tolerance for accuracy vs. table size tradeoffs. The algorithm runs in constant time, like the previous one, which is an advantage in many situations. It's also possible to combine the two, using a table lookup to get the first N bits of the guess, and then finishing it out using the bit-shifting approach above.

Trig Functions

Unfortunately, trig functions like sine and cosine don't lend themselves nicely to methods like the square root algorithms above. One of the reasons for this is that sine and cosine are interrelated. If you try to use Newton's method to find a cosine, you need to know the sine for your guess. But to figure out the sine, you need the cosine. Department of Redundancy Department. One alternate approach you can take is to use a Taylor series. A Taylor series is a polynomial that will approximate a sine or cosine to any desired accuracy. The problem with this is that the Taylor series for sine and cosine don't converge really quickly. They don't do too badly, but you still have to do a bunch of computation. A better approach in this case is probably to use a combination of computation and a lookup table, like the last square root algorithm. Since sine and cosine are cyclic, and interrelated, you only really need to be able to calculate one of them over the range 0 to pi/2. Given this, the rest of the values can be determined. Depending on how much calculation you want to do, how much table space you want to spend, and how accurate your answer needs to be, you can break up this region into a fixed number of subranges and use some form of interpolation between points. If you're in Floating Point Land, over a relatively short region you can fit a quadratic or cubic function to either the sine or cosine function with pretty good accuracy.
This gives you a second or third order polynomial to evaluate and a little bit of table lookup overhead. For all but the lowest accuracy cases, this will probably net out faster than calculating the number of Taylor terms you'd need. If you're in Fixed Point Land, you may be better off with a table lookup and linear interpolation, rather than using a higher-order polynomial. For comparable accuracy, this approach will probably need more table space, but if you're avoiding floating point, you'll probably be avoiding integer multiplications as well. (Particularly because you have the decimal point adjustments to worry about.) The linear interpolation can be reduced to a single multiply by storing delta information in the table, so all you have to do is multiply the bits you stripped off while building the table index (the "fractional part") by this delta value and add the result to the table entry.

One trick you can use in either case to make your table lookups easier is to measure your angles in 2^16 units rather than radians or degrees. 2^16 units basically consist of the following mapping:

Degrees    Radians    2^16 units
0          0          0x0000
45         pi/4       0x2000
90         pi/2       0x4000
180        pi         0x8000
270        3pi/2      0xC000
359.999    2pi-e      0xFFFF

The neat thing about this set of units is that they wrap just the way that radians do (and degrees are supposed to). They give you a resolution of about 0.005 degrees, and they make table lookups very easy, since you can shift-and-mask them down into table indices very easily. Interestingly enough, this is something I'd devised for myself late last year (feeling very proud of myself) only to see the technique published in Embedded Systems a month or so later. Just goes to show you...

____________________

2 Polygon drawing on a raster display in two dimensions

The second main problem is to display a filled polygon on a screen. We will discuss that problem here. First, let us approach the 2d polygon.
We will define a polygon as a set of points linked by segments, with a defined inner region and outer region. The inner region need not be in a single piece. Our rule for determining whether a point is in a given region is to draw a line from infinity to that point. If that line crosses an odd number of segments (edges), the point is within the polygon; otherwise it is not.

Figure 4: Drawing a line from infinity to any of the gray areas of the star will cross an odd number of edges. In the white areas, it will cross an even number of edges. Try it.

Figure 5: The polygons shown in a, b and c are all the same. In a, the inner region has been shaded, but this has been omitted in b and c for clarity. The polygon shown in a can be listed as vectors, as in b or c.

Let us agree that both versions are equally satisfying. In b, we enumerated the edges in a counterclockwise continuous fashion, meaning that the head of each vector points to the tail of another. In c, we made each vector point generally downward, or more strictly, each vector's topmost point is its tail. These vectors are noted (a,b)-(c,d), where (a,b) is the location of the tail of the vector and (c,d) is the location of the head (arrow) of the vector. In diagram c, for all vectors, d<b (or b>d, which is saying the same thing). (Note: we will see what to do with b=d later. That being a special case, we will not discuss it now.) As we are drawing on a screen (supposedly), it would be interesting to be able to color each pixel once and only once, going in horizontal or vertical lines (as these are easy to compute). We will choose horizontal lines for hardware reasons. Thus, we start scanning from infinitely upward to infinitely downward using horizontal scanlines (i.e., the "first" scanline is way up, and the last is way down). Now, our screen being finite, we only need to draw those lines that are on screen. At each pixel, we will draw an imaginary horizontal line from infinitely far to the left to the current pixel.
If it crosses an odd number of edges, we color it; otherwise, we don't. This means that from the infinite left to the first edge, we do not color. Then, from the first edge to the second edge, we color. From the second edge to the third edge, we do not color. From the third edge to the fourth edge, we color. And so on. If you want, you can always draw a background bitmap left of the first edge, from the second edge to the third edge, etc. Thus, you could put the picture of a beautiful blue sky behind the scene at the same time you're drawing a polygon. Now, let's start that over again. We are coming from infinitely upward and drawing. We do not need to consider every edge, only those that cross the current scanline (horizontal line). We will thus build an inactive edge list (those edges that have not yet been used in the drawing process) and an active edge list (those edges that are currently in use to draw the polygon, i.e. that intercept the current scanline). Thus, at first, all edges (named m,n,o,p in figure 5) would be in the inactive edge list. As the algorithm reaches the topmost line of the polygon, any relevant edge is transferred from the inactive edge list to the active edge list. When an edge becomes unused (we're lower than its lowermost scanline), we remove it from the active list and throw it to the dogs. (# grin #) Now, you will realize that at each scanline we have to check every edge in the inactive edge list to see if we need to transfer it to the active edge list. To accelerate that, we can sort the inactive edge list so that the edges that start highest come first. Thus, the ordered list would become {m,p,n,o}. Then, we only need to check the first vector in the list at every scanline; since the vectors point downward, only their first (topmost) point needs to be checked. If you get any horizontal lines in the edge list, remove them from the list. Horizontal lines can be identified easily as b=d. Proving that it works is left as an exercise (# muhaha! #). The polygon drawing algorithm (scan-line algorithm):
I = inactive edges list, initially holding all edges
A = active edges list, initially empty
sort I with highest first endpoint first
for each scanline i, starting from the top of the screen
  while the first edge in I starts at scanline i
    transfer it from I to A
  find the x value of all edges in A for the current scanline
    (see the line algorithm below)
  sort all edges in A by increasing x-intercept value (the value we just calculated)
  previous_x_intercept <- leftmost pixel on screen
  for all edges n in A, counting by two, do
    draw background between previous_x_intercept and edge n in A
    fill polygon between edge n and edge n+1 in A
    previous_x_intercept <- x intercept of edge n+1
  end for
  draw background for the rest of the scanline (from previous_x_intercept to end of line)
  for all edges for which we have reached the bottom
    remove the edge from A
  end for
end for

2.1 A line algorithm for the polygon drawing algorithm

This section discusses a somewhat shortened version of Bresenham's line drawing algorithm, optimized for drawing filled polygons. You assuredly realize you need to calculate the x coordinate of the intersection of a line with a horizontal line. Consider the line equation that follows:

x = (dx/dy) y + k

If the line is defined by the vector (a,b)-(c,d), then we define dy as d-b and dx as c-a. Say x0 is the x coordinate of the topmost point of the segment: x0 = a. Say xn is the x intercept value of the segment for the nth scanline from the top. Then xn can be defined as follows:

xn = (dx/dy) n + a

Say we know x(n-1) and want to know xn based on that knowledge. It is obvious that:

xn - x(n-1) = (dx/dy) n + a - [(dx/dy)(n-1) + a]
xn - x(n-1) = dx/dy
xn = x(n-1) + dx/dy

Thus, knowing x(n-1) and adding dx/dy yields xn. We can thus do an addition instead of a multiply and an add to calculate the new x intercept value of a line at each scanline.
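As a minimal C sketch of this incremental stepping (the function and its interface are my own invention, not from the Zed3D sources), keeping the fractional part as an integer running error term so no floating point is needed:

```c
/* Step the edge (x0,y0)-(x1,y1), assumed non-horizontal with y1 > y0,
 * and write one x intercept per scanline into out[] (y = y0 .. y1-1). */
static void scan_edge(int x0, int y0, int x1, int y1, int *out)
{
    int dx = x1 - x0;
    int dy = y1 - y0;            /* denominator; dy > 0 by assumption   */
    int add = dx / dy;           /* integer part of dx/dy               */
    int numerator = dx % dy;     /* fractional part, as a numerator     */
    int running = 0, x = x0, y;

    /* floor division keeps the error term in [0, dy) when dx < 0 */
    if (numerator < 0) { numerator += dy; add -= 1; }

    for (y = y0; y < y1; y++) {
        *out++ = x;              /* x intercept for this scanline       */
        x += add;
        running += numerator;
        if (running >= dy) {     /* the fraction overflowed one pixel   */
            running -= dy;
            x += 1;
        }
    }
}
```

For the edge (0,0)-(10,5), this produces the intercepts 0, 2, 4, 6, 8 for the five scanlines it spans, with never more than an add and a compare per scanline.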
An addition being at least 5 times as fast as a multiply, usually more like 15 times, this will speed things up considerably. However, it might annoy us to use floating point or fixed point, even more so with roundoff error. To this end, we have found a solution. dx/dy can be expressed as a + b/d, with b < d; a is the integer part of dx/dy. Substituting, we find that

xn = x(n-1) + a + b/d

Thus, we always add a at each scanline. Furthermore, when we have added a sufficient amount of b/d, we add one more to xn. We will keep a running total of the fractional part: we keep the denominator in mind, and set the initial numerator (when n=0) to 0. Then, at each line we increase the numerator by b. If it becomes greater than or equal to d, we add one to xn and decrease the numerator by d. In short, in pseudocode, we draw a segment from (x0,y0) to (x1,y1) this way:

dx = x1 - x0
dy = y1 - y0
denominator = dy
numerator = dx modulo dy
add = dx divided (integer) by dy
running = 0
x = x0
for each line (0 to n) do
  x = x + add
  running = running + numerator
  if running >= denominator then
    running = running - denominator
    x = x + 1
  end if
  do whatever you want with the line here, such as drawing a pixel at location (x, line)
next line

__________________________

3 3d polygon drawing

The first step is to realize that a projected 3d line with our given projection becomes a 2d line; that is, it stays straight and does not bend. That can be easily proved. The simplest proof is the geometric one: the intersection of two planes in 3d is a line in the general case. The projectors all lie in the plane formed by the origin and the line to be projected, and the projection plane is, well, a plane. Thus, the projection is contained in a line. This means that all the boundaries of a polygon can be projected as simple lines. Only their endpoints need be projected. Then, you can simply draw a line between the two endpoints of each projected edge (or more precisely, draw a 2d polygon with the given endpoints). However, there comes the problem of overlapping polygons.
Obviously, the color of a pixel must be that of the closest polygon at that point (well, for our purposes, that suffices). But this means that we must determine the distance of each polygon that could be drawn in the current pixel, and that for the whole image. That can be time consuming. We will make a few assumptions. First, we will assume no polygons are interpenetrating; this means that to come in front of another polygon, a polygon must go around it. This means that you only need to determine which polygon is topmost when you cross an edge. Less obviously, but equally useful: if from one scanline to the next you encounter the same edges in the same order when scanning left to right, you do not need to recalculate the priorities, as they have not changed. Inevitably, though, you will need to determine the depth of the polygon (z value) in a given pixel. That can be done rather easily if you carry with the polygon its plane equation in the form Ax+By+Cz+D=0. Remember that the projection ray equation for pixel (u,v) is x=zu and y=zv. Feed that x and y into the plane equation and you find that

Azu+Bzv+Cz+D=0
z(Au+Bv+C)=-D
z=-D/(Au+Bv+C)

Thus you can easily calculate the z coordinate of any plane at a given (u,v) pixel. The cautious reader will have noticed that it is useful to do some of these multiplications and additions incrementally from scanline to scanline. Also note that if Au+Bv+C equals zero, you cannot find z unless D is also zero. These (u,v) coordinates correspond to the escape line of the plane of the polygon (you can picture it as the "horizon" line of the given plane). The projection ray is parallel to the plane, so you never cross it. This brings me to another very important thing about planes. Another way to define a plane is to give a normal vector. A vector and a point of reference generate one and only one plane perpendicular to the vector and passing through the point of reference. Think of a plane.
Think of a vector sticking up perpendicular to it. That vector is called normal to the plane. It can be easily proved that this vector is (A,B,C), with A, B and C being the coefficients of the plane equation Ax+By+Cz+D=0. We might want to normalize the vector (A,B,C) (make it length 1), that is, divide it by sqrt(A^2+B^2+C^2). But to keep the equation valid, we must divide all of A, B, C and D by that value. The equation thus becomes A'x+B'y+C'z+D'=0. If we want to measure the distance of a point (a,b,c) perpendicularly to the plane, it can be shown that this (signed) distance is A'a+B'b+C'c+D' (you can always take its absolute value if you want an unsigned distance). That might be useful for different reasons. It is also notable that since the A, B and C coefficients are also the coordinates of the normal vector, and rotating a vector can be done as we saw previously, it is easy to rotate the plane. If the vector is normalized in the first place, it will stay normalized after rotation. After rotation, you can ignore the D component of the plane, but you do know the coordinates of a point in the rotated plane: any point in the polygon will fit the bill. Thus, you can deduce D. In fact, when performing back-face culling (see chapter 4), you will calculate D even as you are performing some optimization. Normally, you will build an inactive edge table (IET) in which you will store all edges of all polygons to be rendered. Also initialize an empty list called the active edge table (AET). In the IET, make sure that the edges are sorted in increasing values of y for the topmost endpoint of any edge. Also, as per the 2d polygon drawing algorithm, discard any horizontal edge. As you scan from the top of the screen to the bottom, move the edges from the IET to the AET when you encounter them. You do not need to check every edge in the IET: as they are sorted in increasing y order, you only need to check the first edge in the IET.
As you are scanning from left to right, each time you cross an edge, you toggle the associated polygons on or off. That is, if poly X is currently "off", it means you are not inside poly X. Then, as you cross an edge that happens to be one of poly X's edges, turn it "on". Of all polys that are "on", draw the topmost only. If polys are not interpenetrating, you only need to recalculate which one is on top when you cross edges. Your edge lists must contain pointers to their parent polygons. Here's pseudocode for the algorithm:

let polylist be the list of all polys
put all edges of all polygons in IET
empty AET
for Y varying from the topmost scanline to the last
  while the Y coordinate of the topmost endpoint of the edge on top of the IET equals Y
    take the edge off the IET and insert it into the AET with an insertion sort,
    so that all edges in the AET are in increasing X order
  end while
  discard any edge in the AET for which the second endpoint's Y value is Y
  previous_X = minus infinity
  current_poly = none
  for i = first edge through last edge in AET
    for X varying from previous_X to i's X coordinate
      draw a pixel at position (X,Y), attribute taken from current_poly
    end for
    toggle the polygons that are parent to i
    current_poly = find_topmost_poly()
  end for
  update the intercept values of all edges in the AET and sort them by increasing X intercept
end for

A recent uproar about "s-buffers" made me explain this algorithm a little more carefully. Paul Nettle wrote a FAQ explaining something that he calls "s-buffering". It is essentially an algorithm where, instead of calculating the spans on the fly as above, you buffer them; that is, you precalculate all spans and send them to a span buffer. Then you hand the span buffer to a very simple rasterizing function.
It's a bit harder to take advantage of coherence in that case (an example of coherence is our earlier observation that if the order of the edges does not change from one scanline to the next, we don't need to recalculate which spans are topmost, assuming polygons aren't interpenetrating). Another very nice algorithm is the z-buffer algorithm, discussed later. It allows interpenetrating polygons and permits mixing rendering techniques in a rather efficient way (for real-time rendering purposes).

_________________________

4 Data Structures

A very important aspect of 3d graphics is data structures. They can speed up calculations by a factor of 6 and up, provided we put some restrictions on what can be represented as 3d entities. Suppose that everything we have in the 3d world is a closed polyhedron (no self-crossings!). Each polyhedron is made up of several polygons. We will need each polygon's normal and vertices. But how we list these is where we can improve speed. It can be proved that each edge is shared by exactly two polygons in a closed polyhedron. Thus, if we list the edges twice, once in each polygon, we will do the transformations twice, and it will take twice as long. To that, add that each vertex is part of at least 3 polygons, with no maximum. If you list each vertex explicitly in every polygon it belongs to, you will do its transformation at least thrice, which means it will take thrice the time. Three times two is six: you're doing the same transformations at least six times over. Thus, you need to list the vertices in every polygon as pointers to vertices. This way, two polygons can share the same data for a shared vertex, which allows you to do the transformation only once. The same goes for edges: list the endpoints as pointers to endpoints, and you will spare yourself much runtime work. Another interesting thing: all polygons have an interior and an exterior face in this model. Make it so that the normals of all polygons point towards the exterior face.
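As a sketch of such a structure in C (the type and field names are illustrative assumptions, not the actual Zed3D layout):

```c
/* Shared-vertex structures: each vertex and edge is stored exactly once,
 * and polygons refer to them through pointers. */
typedef struct { double x, y, z; } Vertex;

typedef struct {
    Vertex *p1, *p2;            /* endpoints, shared with other edges    */
} Edge;

typedef struct {
    double  nx, ny, nz;         /* outward-pointing normal               */
    int     nvertices;
    Vertex **vertices;          /* pointers: vertex data stored once     */
    int     nedges;
    Edge   **edges;             /* each edge appears in two polygons,    */
} Polygon;                      /* but its data lives in one place       */

typedef struct {
    int      nvertices, nedges, npolygons;
    Vertex  *vertices;          /* the single copy that gets transformed */
    Edge    *edges;
    Polygon *polygons;
} Polyhedron;
```

Because edges and polygons point at the same Vertex records, rotating the vertices array once transforms every polygon and edge that refers to them, which is exactly the factor-of-six saving described above.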
It is clear that, if you are standing outside all objects, and all objects (polyhedra) are closed, you cannot see an inside face. If, when transforming polygons, you realize a polygon is facing away from the eye, you do not need to transform it; just skip it. To know whether it is pointing away from you, do the following (this is called back-face culling). Take the vector (a,b,c) from the eye to any one endpoint of the polygon. (E.g., if the eye is at (m,n,o) and a vertex of the poly is at (i,j,k), then the said vector is (i-m,j-n,k-o).) Take also (u,v,w), the normal vector to the polygon. No matter what direction you are looking, compute the following:

au+bv+cw

If the result is positive, the polygon is facing away from you. If it is negative, it is facing towards you. If it is null, it is parallel to you. This operation is called the scalar product (or dot product) of two vectors. It is very notable that the above scalar product is, interestingly enough, the D part of the plane equation written in the form Ax+By+Cz=D. You can also illuminate objects with a very distant light source (e.g. the sun) very easily. Take the vector pointing away from the light source, (a,b,c), and the normal of the plane, (u,v,w). Take the scalar product of both, as above. The result can be interpreted as the intensity of lighting or shadow on the polygon. The approximation is good enough to give interesting results. You might also want to attenuate the shading of the polygon according to its distance from the origin, to yield a fog effect.

_______________________

5 Texture mapping

What we are about to do would probably be more aptly named pixmap mapping onto a plane. We want to take a pixmap (or bitmap; although less appropriate a word, it is very popular) and "stick" it to the surface of a polygon, cutting away any excess. The result can be quite impressive. This, however, can be a painfully slow operation. We will discuss the mathematics involved, and after that, we will try to optimize them.
Let's give the polygon its own coordinate system, (i,j). Thus, all points on the said polygon can be located in (i,j) coordinates, or in (x,y,z), or, once projected, in (u,v).

Figure 6 illustrates the problem. The blue plane is the projection plane. The green triangle is a polygon we are projecting on that plane and then scan-converting in (u,v) coordinates.

What we want to do is find what (i,j) point corresponds to any given (u,v) point when projected this way. Then, the point (i,j) can be indexed into a pixmap to see what color this pixel in (u,v) space should be. Let the i and j vectors be expressed in (x,y,z) coordinates, which they should normally be; we want a general solution.

i=(a,b,c)
j=(d,e,f)

Let also the origin of (i,j) space be pointed to in (x,y,z) by the vector k=(m,n,o). Thus, a point at location (p,q) in (i,j) space is the same as the point pi+qj+k in (x,y,z) space. Furthermore, the projection ray through point (u,v) in screen space passes through the point (u,v,1) in (x,y,z) space. The equation of that ray is thus t(u,v,1), where t can take any real value; that is, the ray is (tu,tv,t), or

R: x=tu, y=tv, z=t    or equivalently    x=zu, y=zv

The point in the plane is

P: x=pa+qd+m (1)
   y=pb+qe+n (2)
   z=pc+qf+o (3)

We want to find (p,q) as a function of (u,v). Solving this system for p and q, and factoring a little (since on a computer a(b+c) is usually faster than ab+ac), we get:

p=(Mv+Nu+O)/(Pv+Qu+R)

where M,N,O,P,Q,R are as follows (in these constants, (Xp,Yp,Zp) stands for i=(a,b,c), (Xq,Yq,Zq) for j=(d,e,f), and (A,B,C) for k=(m,n,o)):

M=CXq-AZq
N=BZq-CYq
O=AYq-BXq
P=ZqXp-ZpXq
Q=YqZp-YpZq
R=YpXq-YqXp

They are all constants, so they need only be calculated once to draw the whole image.
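Bundling these constants (together with the analogous ones for q, derived next) into a one-time setup step gives something like the following C sketch; the struct and function names, and the use of double precision, are my own assumptions.

```c
/* Reverse texture mapping: precompute the constants once per polygon,
 * then map any screen pixel (u,v) back to texture coordinates (p,q).
 * i=(a,b,c) and j=(d,e,f) are the texture axes, k=(m,n,o) the texture
 * origin, all expressed in world space. */
typedef struct {
    double M, N, O;     /* numerator of p: M*v + N*u + O      */
    double S, T, U;     /* numerator of q: S*v + T*u + U      */
    double P, Q, R;     /* common denominator: P*v + Q*u + R  */
} TexMap;

static TexMap texmap_setup(double a, double b, double c,   /* i */
                           double d, double e, double f,   /* j */
                           double m, double n, double o)   /* k */
{
    TexMap t;
    t.M = o*d - m*f;   t.N = n*f - o*e;   t.O = m*e - n*d;
    t.S = m*c - o*a;   t.T = o*b - n*c;   t.U = n*a - m*b;
    t.P = f*a - c*d;   t.Q = e*c - b*f;   t.R = b*d - e*a;
    return t;
}

static void texmap_pixel(const TexMap *t, double u, double v,
                         double *p, double *q)
{
    double denom = t->P * v + t->Q * u + t->R;   /* once per pixel */
    *p = (t->M * v + t->N * u + t->O) / denom;
    *q = (t->S * v + t->T * u + t->U) / denom;
}
```

As a sanity check: for the plane z=1 with i=(1,0,0), j=(0,1,0) and k=(0,0,1), the mapping collapses to p=u, q=v, as it should.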
Similarly for q, we have:

q=(Sv+Tu+U)/(Vv+Wu+X)

and

S=AZp-CXp
T=CYp-BZp
U=BXp-AYp
V=ZqXp-ZpXq
W=YqZp-YpZq
X=YpXq-YqXp

All these constants should be calculated only once. Knowing that V=P, W=Q and X=R, we can simplify a little:

p=(Mv+Nu+O)/(Pv+Qu+R)
q=(Sv+Tu+U)/(Pv+Qu+R)

This is noteworthy: (Q,P,R).(u,v,1)=uQ+vP+R, the denominator. But (Q,P,R)=(Xq,Yq,Zq)x(Xp,Yp,Zp). This cross product of two vectors yields a vector that is perpendicular to the first two (i.e. a normal to the plane). If these two vectors are of length 1, the cross product will also be of length 1. But we might already have the normal from previous transformations (e.g. if we're doing back-face culling); then we won't need to calculate (Q,P,R), as it is the normal to the plane. Or we can deduce the normal from these equations if we don't have it. The denominator needs to be calculated only once per pixel, not once for p and again for q. The numerators for p and q are different, though. Furthermore, it is obvious that Mv+O is constant throughout a constant-v line (a horizontal line); thus, it needs to be calculated only once per line. We can also spare ourselves the multiply per pixel by doing it incrementally (we're still stuck with a divide, though). For example, let G=Mv+O for a given v. Then, when calculating Mv+Nu+O, we can use G+Nu. If we know H=G+Nua, then we can calculate G+N(ua+1) as H+N. That's a simple add instead of a multiply. But we cannot get rid of the divide, so we're stuck with one divide (and at least two adds) per pixel per coordinate. There is a way around this, however, and here's how to do it. First, it is interesting to note the following. The plane equation is Ax+By+Cz=D, and x=zu, y=zv, where (u,v) are the screen coordinates of a point and (x,y,z) are the corresponding world coordinates of the unprojected point. Thus,

Auz+Bvz+Cz=D

What happens when z is a constant? Let us express v as a function of u.
Bvz=-Auz-Cz+D
v=(-A/B)u+(-C/B)+D/(Bz)

Let M=-A/B and N=D/(Bz)-C/B; then we have v=Mu+N, a linear equation. Furthermore, M is independent of z. Thus, a slice of constant z in the plane is projected as a line on (u,v). All of these slices project to lines that are parallel to each other (i.e. the slope, M, is independent of the slice taken). Now, a slice of constant z in the original Ax+By+Cz=D plane is a line. Since z is constant throughout that line, all points of that line are divided by the same number, independently of (x,y). Thus, the projected line is merely the original line scaled up or down. Up to now, we have been rendering polygons with a scanline algorithm where the scanlines are horizontal. First, we have to conceive that the scanlines need not be horizontal, merely parallel. If we use a slope of M for the scanlines (M=-A/B, as above), then we will be performing scans of constant z value. Two advantages are very obvious. First, it is easy to compute the z value for a large number of pixels (it is constant throughout a scanline), and second, texture mapping the scanline becomes a mere scaling operation, which can be done incrementally. Furthermore, M need be calculated only once for the whole polygon, while N requires at most a divide and an add per scanline. Note that if abs(A)>abs(B), you might want to express u as a function of v instead of the above, so that the slope ends up being -B/A (i.e., make certain that the slope is less than or equal to 1 in absolute value). There is a degenerate case: A=B=0. This is the case where the whole plane is at constant z. In that case, you can use horizontal scanlines and just scale the texture by a factor of 1/z. Thus, we're going to be rasterizing the polygon with nonhorizontal scanlines of slope M. The equation of a scanline can be expressed as:

v=Mu+Vi

where Vi is the v coordinate when u is 0.
All scanlines being parallel, M never changes, and Vi uniquely identifies each scanline. Say i=0 corresponds to the topmost scanline. Say we want to find the intersection of the scanline with a polygon edge; what we're really interested in is the u value of the intersection of the two lines. The edge can be represented as:

v=Nu+K

where K is the v value of the edge when u=0. Intersecting the two yields

Nu+K=Mu+Vi
(N-M)u=Vi-K
u=(Vi-K)/(N-M)

That is, assuming N-M is not zero, which it never is, because we do not include edges parallel to the scanlines in our scanline algorithm. Now, V(i+1) is (Vi)+1: when we go from scanline i to scanline i+1, V is incremented by one. We thus get u', the intersection of the two lines for the next scanline, as:

u'=((Vi)+1-K)/(N-M)
u'=(Vi-K)/(N-M)+1/(N-M)
u'=u+1/(N-M)

It is thus possible to calculate u from scanline to scanline incrementally, Bresenham style. To clarify things a bit: if P1 and P2 delimit the edge, then let (p,q) be P2 minus P1 (that is, p is the delta u and q is the delta v). Therefore, N is q/p (by definition of slope). Now, we know M to be of the form -A/B (from the plane equation, above). We need to calculate N-M for our incremental calculations, which is relatively easy:

N-M=q/p-(-A/B)
N-M=q/p+A/B
N-M=(qB+Ap)/(Bp)

Thus, u' is really:

u'=u+Bp/(qB+Ap)

At one point in the algorithm, we will need to calculate which scanline contains a particular point. For example, when sorting from the topmost edge downwards, we need to use a definition of "top" that is perpendicular to the scanline used. We have the slope of the scanline, say m. If the equation of the line that interests us is

y=mx+b

then we can sort the edges according to increasing b values in the above equation. If we want to calculate b for the point (i,j), simply isolate b in the above equation. This last bit is called constant-z texture mapping.
It is not particularly useful in the general case, for a number of reasons, mainly added aliasing and span generation problems. However, it has a very well known and much used application. If we have vertical walls and horizontal floors exclusively, and the eye is never banked, then the following is always true: the constant-z lines are always either horizontal or vertical. For example, the walls' constant-z lines are vertical, while the floors' and ceilings' are horizontal. So basically, you can render floors and walls using the techniques described in this chapter in a very straightforward manner, if you use both horizontal and vertical scanlines. Just use the last bit about constant-z texture mapping and optimize it for horizontal and vertical scanlines. Note that looking up or down will not affect the fact that walls have vertical scanlines and floors and ceilings have horizontal scanlines.

________________________

6 Of logarithms

If we have many multiplies or divides to make, little time, lots of memory, and absolute precision is not utterly necessary, we may use the following tricks. Assume x and y are positive or null real numbers, and b is the base of a logarithm. (Notation: x**y means x raised to the yth power, and logb(y) denotes the base-b logarithm of y.) We can do the following:

logb(xy)=logb(x)+logb(y)
b**logb(xy)=b**(logb(x)+logb(y))
xy=b**(logb(x)+logb(y))

Now, you might say that taking the log or the power of a value is very time consuming, and it usually is. However, we will reuse the trick people used not so long ago, when pocket calculators were not available: we will make a logarithm table. It is easy to store the values of all possible logarithms within a finite, discrete range. As an example, if we are using 16-bit fixed point or integer numbers, we can store the log of every such number in a table with 65536 entries. If each entry is 32 bits wide, which should be sufficient, the resulting table will use 256kb of memory.
As for the power table, you can build another lookup table using only the 16 most significant bits of the logarithm, yielding a 16- or 32-bit wide integer or fixed point result. This will yield a 128 to 256kb table, for a grand total of under 512kb. With today's machines, that should not be a problem. The only real limitation is that you must then restrict yourself to 16-bit numbers. You realize, of course, that you cannot multiply this way if one or both of the numbers are negative. Well, you can do it as follows: say x<0, then let u=-x, thus u>0. Then xy=-(uy), and uy you can calculate with this method. Similarly,

x/y=b**(logb(x)-logb(y))

Thus divides are also a simple matter of looking values up in the table and subtracting. Powers become much simpler too. I won't say anything long-winded about powers, but here it is, in short:

logb(x**y)=y logb(x)
x**y=b**(y logb(x)) (*)

And from there, even better yet:

x**y=b**(b**[logb(y)+logb(logb(x))])

For constant y, some shortcuts are usually possible. Since multiplication by a constant can sometimes be done very quickly with left shifts, you can use (*) instead of the last equation. A very common example:

320y=(256+64)y
320y=256y+64y
320y=(y<<8)+(y<<6)

where x<<y denotes "x, left shifted y times". A left shift by y is equivalent to multiplying by 2**y. The ancient Egyptians only knew how to multiply this way (at least, as far as I know, for a while, like for the pyramids and stuff - I think they even calculated an approximation of pi during that time, and they couldn't even multiply efficiently! :-)

______________________

7 More on data structures and clipping

Usually, the space viewed by the eye can be defined by a very high square-based pyramid. Normally, it has no bottom, but in this case it might have one for practical purposes. Let us say that the eye has a field of vision (horizontally and vertically) of 90 degrees. (Since it is square, that angle is slightly larger when taken from one corner to the opposite one.) Thus, we will see 45 degrees to each side, up and down.
This means a slope of one. The planes bounding that volume can be defined as x<z, x>-z, y<z and y>-z. To a programmer with an eye for this, it means that it is easy to "clip", i.e. remove obviously hidden polygons, with these bounds: any polygon that lies entirely outside one of these bounds can be discarded. Note that such polygons must fall totally outside the bounding volume. This is called trivial rejection. For polygons that are partially inside, partially outside that region, you may or may not want to clip (I am not sure yet which is more efficient - I suspect it depends on the situation, i.e. number of polygons, number of edges in each polygon...). There is also a slight problem with polygons that are partially behind, partially in front of the eye. Let us give a little proof. All of our polygon-drawing schemes are based on the assumption that a projected line is also a line. Here is the proof of that. Let us consider the arbitrary line L, not parallel to the plane z=0, defined parametrically as:

L: x=at+b, y=ct+d, z=t

Now, let us define the projection P(x,y,z) as follows:

P(x,y,z): u=x/z, v=y/z

Then P(L) becomes

P(L): u=(at+b)/t, v=(ct+d)/t

We are using a segment here, not an infinite line. If no point with t=0 needs to be projected, we can simplify:

P(L): u=a+b/t, v=c+d/t

Let us measure the slope of that projection:

du/dv=(du/dt)/(dv/dt)=(-b/t^2)/(-d/t^2)

Since t is not 0,

du/dv=b/d

a constant; thus the parametric curve defined by P(L) is a straight line if t is never zero. However, if t is null, the result is indeterminate. We can try to find the limits, which are pretty obvious. When t tends towards zero from above, u will tend towards positive (or negative, if b is negative) infinity. If t tends towards zero from below, u will tend towards negative (or positive, if b is negative) infinity. Thus, the line escapes to one infinity and comes back from the other if t can reach a zero value.
This means that if the segment crosses the plane z=0, the projection is not a line per se. However, if the line is parallel to z=0, L can be defined as follows:

L: x=at+b, y=ct+d, z=k

where z=k is the plane that contains L. P(L) becomes

P(L): u=(at+b)/k, v=(ct+d)/k

The equations do not admit a null k. For a non-null k, we can find the slope:

du/dv=(du/dt)/(dv/dt)=(a/k)/(c/k)
du/dv=a/c

Still a constant, thus P(L) is a line. However, if k is null, we do not get a projection at all. As k gets close to zero, we find that the projection lies entirely at infinity, except for the special case b=d=0, which is an indetermination. This has a practical impact on our 3d models. If a cube is centered on the eye, it will be distorted unless we somehow take this into consideration. The best way to do it, with little compromise and efficient coding, is to remove all parts of polygons that lie in the volume defined by z<=0. Of course, if an edge spans both sides of the plane z=0, you will have to cut it at z=0, excluding everything with z<=0. Thus, you will have to make sure the endpoint of the edge with the smaller z coordinate gets a z strictly greater than 0. Normally, you would have to take the next higher real number (whatever that might be!), but due to hardware constraints we will use an arbitrarily small number, such as z=.001, or even z=.1 if you are building a real-time something in which infinite precision is not required. Lastly, you might not want to render objects that are very far away, as they may have very little impact on the global scene, or because you are using a lookup table for logarithms and you want to make sure that all values are within range. Thus, you will set an arbitrary maximum z for all objects. It might even be a fuzzy maximum z, such as "I don't want any object to be centered past z0". Thus, some objects may have parts farther than z0, but that may be more efficient than actually clipping the objects.
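The edge-cutting step described above can be sketched in C as follows; the interface and the particular ZNEAR value are illustrative choices, not the Zed3D source.

```c
#define ZNEAR 0.001   /* the arbitrarily small z from the text */

/* Clip the edge (ax,ay,az)-(bx,by,bz) against the plane z = ZNEAR,
 * keeping only the part with z >= ZNEAR.  Endpoints are updated in
 * place; returns 0 if the whole edge lies behind the eye. */
static int clip_edge_znear(double *ax, double *ay, double *az,
                           double *bx, double *by, double *bz)
{
    if (*az < ZNEAR && *bz < ZNEAR)
        return 0;                    /* fully behind: discard         */
    if (*az >= ZNEAR && *bz >= ZNEAR)
        return 1;                    /* fully in front: keep as is    */

    if (*az < ZNEAR) {               /* move endpoint a onto z=ZNEAR  */
        double t = (ZNEAR - *az) / (*bz - *az);
        *ax += t * (*bx - *ax);
        *ay += t * (*by - *ay);
        *az = ZNEAR;
    } else {                         /* move endpoint b onto z=ZNEAR  */
        double t = (ZNEAR - *bz) / (*az - *bz);
        *bx += t * (*ax - *bx);
        *by += t * (*ay - *by);
        *bz = ZNEAR;
    }
    return 1;
}
```

Running every edge of a polygon through this before projecting guarantees that no projected point ever has a null or negative z, which is exactly the degenerate case the proof above warns about.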
Moreover, you might want to make several polygon representations of each object, with different levels of detail as measured by the number of polygons, edges and vertices, and show the less detailed ones at greater distances. Obviously, you would not normally need to rasterize 10 polygons for an object that will occupy a single pixel, unless you are doing precision rendering. (But then, in that case, you'll have to change the whole algorithm...) You may want to perform your trivial rejection clipping quickly by first rotating the handle coordinate of each polyhedron (the handle coordinate is the coordinate of its local coordinates' origin), and performing trivial rejection on that. As an example, if you determine that all of a polyhedron is contained within a sphere of radius r, and that this polyhedron is "centered" at z=-3r, then it is obvious that it is entirely outside the view volume. Normally, the view volume is exactly one sixth of the total viewable volume (if you do not take into account the bottom of the pyramid). Thus, you will statistically eliminate five sixths of all displayable polyhedra with such a clipping algorithm.

Let's make a brief review of our speed improvements over the brute-force-transform-then-see-no-think algorithm. We multiplied the speed by six with our structure of pointers to edges that point to vertices. Then, we multiplied again by two (on average) with back-face culling, and now by six again. The overall improvement is a factor of 72! This shows you how a little thinking pays. Note however that that last number is a bit skewed, because some operations do not take as long as others, and because most polyhedra do not contain a nearly infinite number of vertices. We have not taken into account several optimizations we mentioned, such as the lessening of detail or a bottom to the view volume, and we did not compare with other kinds of algorithms. 
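The bounding-sphere rejection described above can be sketched as follows, assuming the slope-one pyramid |x|&lt;z, |y|&lt;z. The plane-distance formula is standard geometry, not something from the Zed3D sources: a plane like x=z has unit normal (1,0,-1)/sqrt(2), so the signed distance of the sphere's center to it is (cx-cz)/sqrt(2), and the sphere is wholly outside once that distance reaches r.

```python
import math

def sphere_outside_view(cx, cy, cz, r):
    """Conservative trivial rejection: True if a sphere of radius r centered
    at (cx, cy, cz) lies entirely outside the pyramid |x|<z, |y|<z, z>0."""
    s = math.sqrt(2.0)
    if cz <= -r:
        return True                  # entirely behind the eye
    if (cx - cz) / s >= r:
        return True                  # entirely beyond the plane x = z
    if (-cx - cz) / s >= r:
        return True                  # entirely beyond the plane x = -z
    if (cy - cz) / s >= r:
        return True                  # entirely beyond the plane y = z
    if (-cy - cz) / s >= r:
        return True                  # entirely beyond the plane y = -z
    return False
```

The example from the text, a polyhedron of radius r "centered" at z=-3r, is rejected immediately.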
But another appeal of this algorithm is that it draws once, and only once, to each pixel, thus allowing us to superimpose several polygons with minimal performance loss, or make complicated calculations for each pixel knowing that we will not do them twice (or more!) for some pixels. Thus, this algorithm lends itself very well to texture mapping (or more properly, pixmap mapping), shading and others. However, we have to recognize that this approach is at best an approximation. Normally, we would have to fire projection rays over the entire area covered by a pixel (an infinite number of rays), which will usually amount to an integral, and we did not perform any kind of anti-aliasing (more on that later), nor did we filter our signal, nor did we take curved or interpenetrating surfaces into account, etc... But it lends itself very well to real-time applications.

8 A few more goodies...

Anti-aliasing is something we invented to let us decrease the contrast between very different pixels. More properly, it should be considered an approximation of high frequencies in a continuous image through intensity variations in the discrete view space. Thus, to reduce the stairway effect in a diagonal line, we might want to put in some gray pixels.

Figure 7

Figure 7 demonstrates this. The diagonal line a was "anti-aliased" manually (I hear the scientists scream) in b, but it should let you understand the intention behind anti-aliasing. Scientists use filters, and so will we, but I will not discuss the theory of filters fully. Let us see it this way. We first generate a normal image, without any kind of anti-aliasing. Then we want to anti-alias it. What we do is take each pixel, modify it so it resembles its non-anti-aliased neighbors a little more, and show that on screen. 
Let us further imagine that the specific way we want to do that is to take one eighth of each of the four surrounding pixels (north, south, east and west) and one half of the "real" pixel, and use the resulting color for the output pixel. If our image is generated using a 256-color palette, there comes a handy little trick. Let us imagine that we want to take one half of color a and one half of color b and add them together. Well, there are 256 possible values for a. And for each single value of a, there are 256 values of b, for a total number of 256x256 pairs, or 64K pairs. We can precalculate the result of mixing one half a with one half b and store it in a matrix 64K bytes large. Now let us call the four surrounding pixels n, s, e, w and the central point o. If we use the above matrix to mix n and s, we get a color A that is one half n plus one half s, that is A=n/2+s/2. Then, mix e and w as B=e/2+w/2. Then, mix A and B: C=A/2+B/2=n/4+s/4+e/4+w/4. Lastly, mix C and o, and we get D=o/2+n/8+s/8+e/8+w/8, the exact desired result. Now, this was all done through lookup tables, thus it is very fast. It can actually be done in real time. Thus, if you want to do this, you will first create the original, un-anti-aliased image in a memory buffer, and then copy that buffer to the screen as you anti-alias it. Obviously, it will be a little slower, but you might want to consider the increased picture quality. However, you will probably want to do this in assembler, because it can be highly optimized that way (most compilers get completely confused...).

Another very nice technique is this. For pixel (u,v) on screen, mix equal parts of pixels (u,v), (u+1,v), (u,v+1), (u+1,v+1) in your rendering buffer. It's as if the screen were off center by half a pixel in relation to the rendering buffer. You'll need to render an extra row and column, though.

__________________

It is also interesting to note that the silhouette of a projected sphere is a circle. 
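The lookup-table mixing scheme described above can be sketched like this. To keep the sketch self-contained I assume a hypothetical grayscale palette, where the closest match to a half-and-half mix is simply the average of the two indices; with a real color palette you would search for the nearest palette entry instead.

```python
# Hypothetical 256-entry grayscale palette: index i has intensity i.
PAL = list(range(256))

# mix[a][b] = palette index closest to (half of color a + half of color b).
# 256 x 256 entries, i.e. the 64K-byte matrix from the text.
mix = [[(PAL[a] + PAL[b]) // 2 for b in range(256)] for a in range(256)]

def antialias(img, x, y):
    """D = o/2 + n/8 + s/8 + e/8 + w/8, built from four table lookups."""
    o = img[y][x]
    n, s = img[y - 1][x], img[y + 1][x]
    e, w = img[y][x + 1], img[y][x - 1]
    A = mix[n][s]          # n/2 + s/2
    B = mix[e][w]          # e/2 + w/2
    C = mix[A][B]          # n/4 + s/4 + e/4 + w/4
    return mix[C][o]       # o/2 + n/8 + s/8 + e/8 + w/8
```

A white pixel surrounded by black comes out at half intensity, exactly the o/2 term.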
The centre of the circle is the projection of the centre of the sphere. However, the radius is NOT the projection of the radius of the sphere. Still, when the distance is large enough compared to the radius of the sphere, this approximation is good enough.

__________________

To determine if a point is contained inside a polyhedron, we extend our test from the polygons like this: if a line drawn from the point to infinity crosses an odd number of faces, then it is contained; otherwise it is not.

__________________

If you want to approximate a continuous spectrum with 256 colors (it will not look perfect, though), use base 6. That is, using the RGB model (we could use HSV or any other), you will say that red, green and blue can all take 6 different values, for a total number of combinations of 216. If you add an extra value to green or red, you get 252 colors, close to the maximum. However, 216 is not bad, and it leaves you 40 colors for system purposes, e.g. palette animations, specific color requirements for, let's say, fonts and stuff. Under Microsoft Windows, 20 colors in the palette are already reserved by the system, but you still have enough room for your 216 colors. By the way, I think it's stupid that MS Windows does not allow you to do that as a standard palette allocation. You have to define your own video driver to enable programs that limit themselves to dithering to use those colors. And that's living hell. In addition, if the colors are arranged in a logical way, it will be easier to find the closest match to the mix of two colors as per anti-aliasing, above. (A bit like cheaper 24 bit color.) Some people give 8 values to red and green, and 4 values to blue, for a total of 256 colors, arguing that shades of blue are harder to perceive. Personally, I think the result is not as good as the base six model I previously described. Some other nice techniques: divide your palette into spans. 
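Generating the base-6 color cube described above takes only a few lines; `base6_palette` is my name for it, and I assume 8-bit color channels spread evenly from 0 to 255.

```python
def base6_palette():
    """Build the 6x6x6 color cube: red, green and blue each take 6 evenly
    spaced levels, giving 216 entries and leaving 40 palette slots free."""
    levels = [round(i * 255 / 5) for i in range(6)]   # 0, 51, 102, 153, 204, 255
    return [(r, g, b) for r in levels for g in levels for b in levels]

pal = base6_palette()
```

Because the entries are laid out in a logical base-6 order, the index of a color is 36*r + 6*g + b (with r, g, b in 0..5), which makes nearest-match lookups for the anti-aliasing mix easy.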
For example, allocate palette registers 16-48 to shades of blue. Use only registers 20-44 in your drawings, but arrange the span so that it goes from white-blue at color #16 through black-blue at color #48. Then, shading is a simple matter of adding something to a color index to make it darker, or subtracting to make it lighter. Make a few spans for your essential colors, and maybe keep 30 colors for anti-aliasing. These should not be used in your pixmaps. They could be used only when performing final anti-aliasing of the rendered picture. Pick these colors so as to minimize visual artifacts. (E.g. you have a blue span at 16-48 and a red span at 64-100, but you don't have any purple, so if blue happens to sit next to red, anti-aliasing will be hard to achieve. Then, use a bunch of your 30 special colors for shades of purple.) As another particular trick to increase the apparent number of colors (or rather, decrease the artifacts created by too few colors), you could restrict your intensities to a narrower range. E.g. make your darkest black/gray 25% luminosity, and your lightest white 75%. This will make your graphics look washed out, but your 256 colors will be spread over a narrower spectrum.

_______________________

9 Sorting

Sorting becomes an important issue because we do it a lot in our 3d rendering process. We will briefly discuss two general sorting algorithms here. The first sort we will discuss is the bubble sort. It is so called because the elements seem to "float" into position as the sort progresses, with "lighter" elements floating up and "heavier" ones sinking down. Given the set S, composed of elements s0, s1, ..., sn-1. This set has n elements. Let us also suppose that we can compare two elements to know which of the two is "lighter", i.e. which one goes first. Let us also suppose that we can exchange two consecutive elements in the set S. To sort the set S, we will proceed as follows. 
Starting at element 0 and going up to element n-2, we will compare each element to its successor and switch them if necessary. If no switches are made, the set is sorted. Otherwise, repeat the process. In pseudocode, we have:

    repeat the following:
        for i varying from 0 to n-2, do the following:
            if s[i] should go after s[i+1], then exchange s[i] and s[i+1]
        end of for loop
    until no elements are switched in the for loop (i.e. the set is sorted)

This algorithm, while very inefficient, is very easy to implement, and may be sufficient for relatively small sets (n of about 15 or less). But we need a better algorithm for greater values of n. However, we have to keep in mind that we are sorting values that are bounded; that is, these values have a finite range, and it is indeed small, probably 16 bits or 32 bits. This will be of use later. But let us first introduce the sort. It is called the radix sort. Assume we are given a set S of numbers, all ranging from 0 to 9999 inclusively (finite range). If we can sort all s[i] according to the digit in the ones' place, then in the tens' place, then in the hundreds' place, then in the thousands' place, we will get a sorted list. To do this, we will make a stack for each of the 10 possible values a digit can take, then take each number in the set S and insert it into the appropriate stack, first for the rightmost digit. Then, we will regroup the 10 stacks and start over again for the second rightmost digit, etc... This can be done in pseudocode as follows:

    divisor=1
    while divisor<10000, do the following:
        for i varying from 0 to n-1, do the following:
            digit=(s[i] divided (integer division) by divisor) modulo 10
            put s[i] on stack # digit
        end of for loop
        empty set S
        for i varying from 0 to 9, do the following:
            take stack # i and add it to set S
        end of for loop
        divisor=divisor x 10
    end of while loop

Now, let us assume that we want to work with hexadecimal numbers. Then we will need 16 stacks. In real life, what we want to sort is edges or something like that. 
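The radix sort pseudocode translates almost directly into running code; here is a sketch in Python, with lists standing in for the stacks:

```python
def radix_sort(values, base=10, max_value=9999):
    """Sort bounded non-negative integers digit by digit, least significant
    digit first, using one 'stack' (a list) per possible digit value."""
    divisor = 1
    while divisor <= max_value:
        stacks = [[] for _ in range(base)]
        for v in values:
            digit = (v // divisor) % base       # current digit of v
            stacks[digit].append(v)             # put v on stack # digit
        values = [v for stack in stacks for v in stack]   # regroup the stacks
        divisor *= base
    return values
```

Passing `base=16` gives the hexadecimal variant with 16 stacks mentioned in the text. Note that each pass must keep the order produced by the previous pass (the stacks preserve insertion order), which is why the algorithm works.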
If we limit ourselves to 16384 edges (16K, 14 bits), then our stacks need to be able to hold 16384 objects each. We could implement them as arrays and use lots of memory. A better way is to use lists. One could also use the quicksort algorithm, which has a complexity of O(n x log n); that seems worse than the O(n) behavior of the radix sort for this particular problem. It is also a bit more tedious to code (in my humble opinion). In our problem, we may not even need to sort on 16 bits. The only place where we need a powerful sorting algorithm is when sorting all edges from top to bottom and from left to right. If we are certain that all coordinates are bounded between, say, 0 and 200 (e.g. for the y coordinate), then we can sort on 8 bits. For the x coordinate, though, depending on whether we have already clipped or not, we may have to sort on the full 16 bits. When using the z-buffer algorithm (described scantily elsewhere), you want to do a z-sort, then draw the polygons that are on top FIRST, so that you can avoid those nasty overwritten pixel writes.

______________________

10 Depth-field rendering and the Z-buffer algorithm

You can approximate any 3d function z=f(x,y) with a depth field, that is, a 2d matrix containing the z-values of f(x,y). Thus, z=M[x][y]. Of course, if you want any precision at all, it has somewhat large memory requirements. However, rendering this depth field is fairly efficient and can give very interesting results for smooth, round surfaces (like a hill range or something). To render a depth field, find the vector V from our eye to the pixel in the projection plane in 3-space. Shoot a ray, as before, using that vector. Start at the eye and move a little in direction V. If you find that you are still above the surface (depth field), then keep going (add another V and so on) until you hit or fall under the surface. When that happens, color that pixel with the color of the surface. The color of the surface can be stored in another matrix, C[x][y]. 
In pseudocode, we get:

    M[x][y]=depth field
    C[x][y]=color field
    for each pixel to be rendered, do the following:
        V=projection ray vector
        P=eye coordinates
        while M[Px][Py]<Pz, do:
            P=P+V
        end of while loop
        color the current pixel with C[Px][Py]
        if we are using the z-buffer algorithm, then set Z[Px][Py] to Pz (see below)
    end of for loop

Of course, you will want to break the loop when P is too far from the origin (i.e. when you see that nothing is close along that projection ray, stop checking). For vector graphics, we were forced to place our eye centered on the origin and looking towards positive z values. That forced us to rotate everything else so the eye would be positioned correctly. Now, obviously, we cannot rotate the whole depth field, as it would be very time- and memory-consuming. Instead, the eye can now be centered anywhere (the initial value for P need not be (0,0,0)) and the vector V can be rotated into place. Even better, if we can find the vector W from the eye to the center of the projection plane, and the two base vectors u and v in 3d for the projection plane, then all of the vectors V(a,b) will be linear combinations of the form:

    V=W+au+bv

where (a,b) is the coordinate of the pixel in the (u,v) coordinate system. If we are doing this rendering with scan lines (as we most probably are), then the above equation can be computed incrementally; e.g. if V(a,b)=A, then V(a+1,b)=A+u. No multiplications whatsoever. The z coordinate of each pixel is readily available in the above algorithm, and it has a use. Let us store all the z coordinate values in a matrix Z[a][b]. If we then want to add a flat image with constant z value (say z=k) to the already rendered scene, we can do it very easily by writing only to pixels for which Z[a][b]>k. Thus, we can use another rendering technique along with depth-field rendering. This is called the z-buffer algorithm. To accelerate calculations for the z-buffer with polygons, you might want to remove the divide per pixel. 
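The ray-marching loop above, sketched in Python (the names are mine; walking off the edge of the field stands in for the "too far from origin" break, and the surface is hit as soon as the ray sinks to or below the stored height):

```python
def march(depth, color, eye, ray, step_limit=1000):
    """March along the projection ray until we sink under the depth field.
    depth[x][y] holds the surface height at (x, y); returns the surface
    color where the ray hits, or None for a background pixel."""
    px, py, pz = eye
    vx, vy, vz = ray
    for _ in range(step_limit):
        px, py, pz = px + vx, py + vy, pz + vz    # P = P + V
        ix, iy = int(px), int(py)
        if not (0 <= ix < len(depth) and 0 <= iy < len(depth[0])):
            return None                # walked off the field: background pixel
        if pz <= depth[ix][iy]:        # no longer above the surface
            return color[ix][iy]
    return None
```

A ray aimed down at a flat field hits it; a ray aimed up and away never does.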
That is fully discussed towards the end of the texture-mapping chapter. A few optimizations are easily implemented. Let us assume we are drawing from the bottom of the screen to the top, going in columns instead of rows. First, suppose you are not banked (or banked less than the steepest slope, i.e. no "overhangs" are seen in the picture) and you have rendered pixel a. Then you know how far pixel a is from you, and the next pixel up, pixel b, will be at least as far as pixel a. Thus, you do not need to fire a ray starting from your eye to pixel b; you need only start as far away as pixel a was. In fact, if you are not banked, you need only move your Pz coordinate up a little and keep going. Therefore, if pixel a is determined to be a background pixel, so will be all the pixels above it. You might want to render all of your polygons and other entities using a z-buffer algorithm. As an example, say you have 3 polygons, A, B and C, and a sphere S. You could tell your renderer to render poly A, which it would do right away, filling the z-buffer at the same time. That is, for each screen pixel that polygon A covers, it would calculate the z-value of the polygon at that pixel, then check with the z-buffer whether it stands in front of whatever is already drawn. If so, it would draw the pixel and store the polygon's z-value for that pixel in the z-buffer. Then it would render sphere S, again right away, calculating the z-value for each pixel, displaying only the visible pixels using the z-buffer, and updating the z-buffer accordingly. Lastly, polygons B and C would be displayed. What you gain by using this algorithm instead of a scan-line algorithm is obvious: interpenetrating polygons behave correctly, and mixing rendering techniques is straightforward. There is but one annoying thing: calculating the z value of a polygon at any point on the screen involves a division, hence a division per pixel when using the z-buffer. But we can do better. 
We are not really interested in the z value of a polygon at a given pixel; what we really want is to know what is visible. To that purpose, we can obviously calculate z and show the object with the smallest z value. Or we could evaluate z^2 and show the one with the smallest z^2 value. If we want whatever has the smallest z value, this is the same as saying that we want whatever has the largest 1/z value. Fortunately, 1/z is way easier to calculate. Let the plane equation be Ax+By+Cz=D, and the projection equations be u=x/z and v=y/z, or:

    Ax+By+Cz=D
    x=uz
    y=vz

or

    A(uz)+B(vz)+Cz=D
    z(Au+Bv+C)=D
    1/z=(Au+Bv+C)/D
    1/z=(A/D)u+(B/D)v+(C/D)

Let M=A/D (a constant), N=B/D (another constant) and K=C/D (a third constant); then

    1/z=Mu+Nv+K

As we can plainly see, we don't have a divide. This is linear, so it can be calculated incrementally. E.g. for a given scanline (v=k), the equation is

    1/z=Mu+Nk+K

Let P=Nk+K (a constant):

    1/z=Mu+P

A linear equation. When going from pixel u to pixel u+1, we just need to add M to the value of 1/z. A single add per pixel. There is also a double-edged optimization that you can make, which follows. Sort your polygons in generally increasing z order (of course, some polygons will overlap in z, but that's everybody's problem with this algorithm - your solution is just as good as anybody else's). Now, you're scan-converting a polygon. If you can find a span (i.e. a part of the current scanline) for which both ends are behind the same convex object, then that span is wholly hidden. As an example, if both endpoints of the current span (on the scanline) are behind the same convex object, then the current span is entirely hidden by that convex object. If all of your objects are convex polygons, then you don't have to check for convexity of the object. 
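The incremental 1/z evaluation for one scanline can be sketched as follows (names mirror the derivation above):

```python
def inv_z_scanline(M, N, K, v, u0, width):
    """Walk one scanline of the plane 1/z = M*u + N*v + K incrementally.
    Only the first pixel needs multiplications; after that it is one add
    per pixel, since going from pixel u to u+1 just adds M."""
    inv_z = M * u0 + N * v + K   # evaluate once at the left end of the span
    out = []
    for _ in range(width):
        out.append(inv_z)
        inv_z += M               # single add per pixel, no divide
    return out
```

Comparing these 1/z values against a 1/z buffer (largest wins) gives the same visibility answer as comparing z values (smallest wins), without the per-pixel division.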
Another interesting example is this:

    if the current span's left endpoint is hidden, then
        let A be the object that hides the left endpoint of the span
        if A hides the right endpoint, then
            stop drawing the scanline
        else
            start drawing from the right end of the span, leftwards,
            until object A is in front of the span
        end if
    else if the current span's right endpoint is hidden, then
        let B be the object that hides the right endpoint
        scan rightwards until object B is in front of the span
    else
        draw the span normally
    end if

This can also be applied to polyhedra if they are convex. This last optimization should be relatively benign (e.g. I expect a factor of two or some other constant from it).

_____________________

11 Bitmap scaling and mixing rendering techniques

Another popular technique is called bitmap (or more properly pixmap) scaling. Say you are looking in a direction parallel to the z axis, and you are looking at a picture with constant z. Through the projection, you will realize that all that happens to it is that it is scaled by a constant factor of 1/z. That is, if you look at a photograph lying in the plane z=2, it will look exactly the same (once projected) as the same photograph at z=1, except that it will be twice as small. You could also do that with the picture of a tree. From a great distance, the approximation will be fairly good. When you get closer, you might realize that the tree looks flat as you move. But nonetheless, it can yield impressive results. Bitmap scaling often suffices for ball-like objects. You could use it for an explosion, smoke, a spherical projectile, etc... The advantage of bitmap scaling is that it has only one point, probably its center, to transform (i.e. translate and rotate into place in 3d), and scaling it is very fast. The advantages of such a technique are obvious. First, it is very easy to scale a pixmap by a given factor. 
Second, the z value is a constant, so it can easily be incorporated into the above z-buffer algorithm with depth-field rendering for very nice results. However, it will be obvious to the careful observer that you cannot smoothly go around such an object, as it always faces you the same way (i.e. you cannot see the back door of a house if it is rendered this way, because, obviously, the bitmap being scaled does not have a back door). Moreover, the depth-field rendering technique blends quite smoothly with vector graphics. First, render the depth field in a buffer B. Then do a scan-line algorithm for the vector graphics, but with an added twist: the "background" bitmap is B, and use the z-buffer algorithm to determine whether a pixel in B lies in front of the current polygon, if any. If you do decide to use depth-field rendering along with vector graphics, and you decide not to bank your eye, then you can eliminate one full rotation from all calculations. I.e. if your eye is not banked, you do not need to rotate around the z axis.

_________________________

Scaling a bitmap can be done fairly easily this way. The transformation is this: we want to take every (u,v) point and send it to (u/s,v/s), where s is the scaling factor (s=2 means that we are halving the size of the bitmap). The process is linear:

    (i,j)=(u/s,v/s)
    (is,js)=(u,v)

Thus, if we are at pixel (i,j) on screen, it corresponds to pixel (is,js) in the bitmap. This can also be done incrementally: if the bitmap coordinate for screen pixel n is u(n)=A, then u(n+1)=A+s. No multiplications are involved. Thus, if we are viewing a bitmap at z=2 according to the above-said 3d projections, then we can render it using a scan-line algorithm with these equations and s=z=2. If a bitmap has holes in it, we can precalculate continuous spans in it; e.g. in an O, drawing a scanline through the center shows you that you are drawing two series of pixels. 
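The incremental scaling of one scanline might look like this sketch (one row of the bitmap at a time; the function name is mine):

```python
def scale_row(row, s, out_width):
    """Scale one scanline of a bitmap by factor s (s=2 halves its size).
    Screen pixel i samples bitmap pixel int(i*s), tracked incrementally:
    going from screen pixel i to i+1 just adds s to the bitmap coordinate."""
    out = []
    u = 0.0
    for _ in range(out_width):
        out.append(row[int(u)])
        u += s              # one add per pixel, no multiplications
    return out
```

With s=2 every other bitmap pixel is skipped; with s=0.5 each bitmap pixel is doubled up, magnifying the image.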
If you want the center to be transparent (to allow a background to show through), then you can divide each scanline of the O into two spans, one left and one right, and render them as two separate bitmaps; e.g. break the O into ( and ) parts, both of which have no holes. Note however that we have a multiplication at the beginning of each span, so this might not be efficient. Or you could select a color number as being the "transparent" color, and do a compare per pixel, not blitting if it is the transparent color. Or you can allocate an extra bitplane (a field of bits, one bit per pixel) where each bit is either 1 for opaque (draw this pixel) or 0 for transparent (do not draw this pixel).

______________________

Partially translucent primitives can be rendered pretty easily. This is an old trick, you see. You can even scavenge the lookup table you used for anti-aliasing. If you want to have a red glass over a part of the screen, the lookup table will tell you how to redden a color. That is, item #c in the lookup table tells you what color is color c with a little red added. The lookup table need not be calculated on the fly, of course. It can be calculated as part of your initialization or whatever.

______________________

12 About automatically generating correctly oriented normals to planes

This bit is not meant to be a blindingly fast algorithm for region detection. It is assumed that these calculations are made before we start generating pictures. Our objective is to generate a normal for each plane in a polyhedron, oriented so that it points outwards with respect to the surface. We assume all surfaces are defined using polygons. But before determining normals for planes, we will need to examine polygons in 3d. Each polygon being constituted of a set of edges, we will later require, for each edge, a vector that points towards the inside of the polygon. See below.

Figure 8

In figure 8, we can see in red the said vectors for all edges. 
As we can see, the vectors do not necessarily point towards the general center of mass. We could use the algorithm we used for rendering polygons in two dimensions to determine the direction of the arrows, but that is a bit complicated and involves trying to find a direction that is not parallel to any of the above lines. Thus, we will try another approach. First, pick any point in the polygon. Then, from that point, take the farthest point. See below:

Figure 9

(Please, do not measure it with a ruler :-) ) Now say we started with the point marked by the big red square. Both point a and point b are then farthest points from the red square. You can pick either of the two. The objective here is to find three consecutive points around the polygon for which the inside angle is less than 180 degrees. If all three are on the circle (worst case), their angle is strictly smaller than 180 degrees (unless they are all the same point, silly). You could also pick the topmost point. If several are at the same height, use the leftmost (or rightmost, whatever) of the topmost points. Now we are certain that the small angle is the inside angle. Let us draw the general situation:

Figure 10

Now, in a we see the above-mentioned farthest point and the two adjacent edges. If we take these edges as vectors, as in b, and then sum them, as in c, we find a point (pointed to by the red vector in c) which is necessarily on the "in" side for both edges. Now it is easy to find a perpendicular vector for each of the two edges and make sure that they point in the right general direction. A way would be to take the perpendicular vector and compute its scalar (dot) product with the above red vector; the result must be strictly positive. If it is not, then you must take the opposite vector as the perpendicular vector. Now we know the situation for at least one edge. We need some way to deduce the other edges' status from that edge's. 
If we take Figure 10.c as our general case, the point at the end of the red vector is not necessarily on the "in" side for both edges. However, if it is on the "in" side for either of the two edges, it is on the "in" side for both, and if it is on the "out" side for either edge, it is on the "out" side for both. Normally, we will never find two collinear edges in a polygon, but if we do, it is fairly easy to solve anyway (left as an exercise). Now, briefly, here is how to determine if a normal to a plane points in the right direction. Pick the point with the greatest z value (or x, or y). If many points have the same z value, break ties with the greatest x value, then the greatest y value. Then, take all connected edges. Make them into vectors pointing from the above point to the other endpoints (so that the z component is less than or equal to zero). Make them unit vectors, that is, of length one. Find the vector with the most positive (or least negative) z value. Break ties with the greatest x value, and then the greatest y value. Take its associated edge. That edge is said to be outermost. Now, from that outermost edge, take the two associated polygons. The edge has an associated vector for each connected polygon to indicate the "in" direction. Starting from either of the two points of the edge, add both these vectors. The resulting point is inside. Adjust the normal so that it is on the right side of the plane for both polygons. Now we have the normal for at least one plane straight. We will deduce the correct side for the normals of the other polygons in a similar manner as we did for the edges of a polygon, above. Take two adjacent polygons, one for which you know the normal, the other for which you want to calculate the correct normal. Now take a shared edge (any edge will do). That edge has two "in" vectors for the two polygons. Starting from either of the two points of the edge, add the two "in" vectors. 
If the resulting point is on the "in" side for one plane, it is also on the "in" side of the other plane. If it is on the "out" side of one plane, it is on the "out" side for both planes. To get a general picture of what I'm trying to say, here's a drawing. Basically, where two lines intersect, you get four distinct regions. One region lies entirely on the "in" side of both lines and another region lies entirely on the "out" side of both lines (if the two lines define a polygon). The two other regions are mixed regions (they lie on the "in" side of one line and on the "out" side of the other). The "mixed" regions may or may not be part of the actual polygon, depending on whether the angle is less than or greater than 180 degrees.

Figure 11

So we are trying to find a point that is in the "in" or "out" region, but not in a "mixed" region. If, as above, we take the edges as vectors and add them together, we end up in either the "in" region or the "out" region, but not in a "mixed" region. That can be proved very easily, geometrically (yet strictly), but I don't feel like drawing it. :) Anyway, it should be simple enough.

__________________

13 Reducing a polygon to a mesh of triangles

Now that's fairly simple. The simplest case is when you have a convex polygon. Pick any vertex. Now, take its two adjacent vertices. That's a triangle. In fact, that's your first triangle in your mesh of triangles. Remove it from your polygon. Now you have a smaller polygon. Repeat the same operation until you are left with nothing. Example:

Figure 12

For the above polygon, we pick the green edges in 1. The triangle is shown in 2. When we remove it from the polygon, we're left with what we see in 3. This will take an n-sided polygon and transform it into n-2 triangles. If the polygon is concave, we can subdivide it until it is convex, and here is how we do that. Take any vertex where the polygon is concave (i.e. the inside angle at that vertex is more than 180 degrees) and call that vertex A. 
From that vertex, it can be proved geometrically that there exists another vertex, not connected to A by an edge, to which you can extend a line segment from A without intersecting any of the polygon's edges. In short, from vertex A, find another vertex not immediately connected to it by an edge (there are two vertices connected to A by an edge; don't pick either of them). Call that new vertex B. Make certain that AB does not intersect any of the other edges. If it does, pick another B. Once you're settled on your choice of B, split your polygon into two smaller polygons by inserting the edge AB. Repeat the operation with the two smaller polygons until you either end up with triangles or end up with convex polygons, which you can split as above. It can again be proved (though with a little more difficulty) that you will end up with n-2 triangles if you had an n-sided polygon. Here's a tip on how to determine whether a vertex is locally concave.

Figure 13
[The arrows point towards the "in" side]

The two possible cases are shown in figure 13: either the vertex is convex or it is not. In 1 it is convex; in 2 it is concave. To determine which, take the line that passes through a and u, above. Now, if v stands on the "in" side of the au line, the vertex is convex. Otherwise, it is concave.

__________________

14 Gouraud shading

First, let us discuss a simple way of shading polygons. Imagine we have a light source that is infinitely far away. Imagine it's at z=positive infinity, x=0, y=0. If it's not, you can rotate everything so that it is (in practice, you'll only need to rotate the normals). Now, the z component of a polygon's normal gives us a clue as to the angle between the polygon and the light rays. The z component goes from -1, facing away from the light source (should be darkest), to 0, facing perpendicular to the light source (should be in half light/penumbra), to +1, facing the light source (should be brightest). From that, you can shade it linearly or any way you want. 
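The linear shading rule just described, the z component of the unit normal mapped onto a brightness ramp, can be sketched as follows (the 32-shade span is an arbitrary choice of mine, matching the palette-span idea from earlier):

```python
def flat_shade(nz, shades=32):
    """Map a unit normal's z component (light at z = +infinity) onto a
    shade index: -1 -> darkest (0), 0 -> penumbra, +1 -> brightest."""
    return int((nz + 1.0) / 2.0 * (shades - 1))
```

With a palette span arranged dark-to-light, this index can be added directly to the span's base register.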
With this model, a polygon is uniformly shaded, so approximating round objects with polygons will always look faceted. Gouraud shading is what we call an interpolated shading. It is not physically correct, but it looks good. What it does is this: it makes the shade of a polygon change smoothly to match the shade of the adjacent polygons. This makes the polyhedron look smooth and curved. However, it does not change the actual edges to curves, so the silhouette will remain unsmoothed. You can help that by allowing curved edges if you feel like it. First, we will interpolate a vertex normal for all vertices. That is, we will give each vertex a "normal" (note that a point does not really have a normal, so you could call it a pseudo-normal). One way to get that vertex "normal" is to average the normals of all the polygons connected to the vertex. Now, find the "shade" for each vertex. Figure 14 Now, say point a is at (ax,ay) and point b is at (bx,by). We're scan-converting the polygon, and the gray-and-red line represents the current scanline, y0. Point P is the pixel we are currently drawing, located at (x0,y0). Points m and n define the span we're currently drawing. Now, we will interpolate the color for point m using a and b, and the color for n using a and c, then interpolate the color for P using the colors for m and n. Our interpolation will be linear. Say the color for point a is Ca, for point b it is Cb, for point c it is Cc, Cm for m, Cn for n and CP for P. We will say that the color at point m, Cm, is as follows: Cm=(y0-ay)/(by-ay) x (Cb-Ca)+Ca That is, if point m is halfway between a and b, we'll use half Ca and half Cb. If it's two-thirds of the way towards b, we'll use 1/3 Ca and 2/3 Cb. You get the picture. Same for n: Cn=(y0-ay)/(cy-ay) x (Cc-Ca)+Ca. Then we do the same for CP: CP=(x0-mx)/(nx-mx) x (Cn-Cm)+Cm. If you think about it a little, these calculations are exactly the same as in Bresenham's line drawing algorithm, seen previously in chapter 2.1. It is thus possible to do them incrementally.
Example: Say we start with the topmost scanline. The color for point m is at first Cb. Then, it will change linearly. When point m reaches point a, it will be color Ca. Now, say dy=by-ay, dC=Cb-Ca, and u=y0-ay. Cm=dC/dy x u+Ca. Then, when y0 increases by 1, u increases by one and we get Cm'=dC/dy x (u+1)+Ca=dC/dy x u+Ca+dC/dy Cm'=Cm+dC/dy Same as Bresenham. Thus, when going from one scanline to the next, we simply need to add dC/dy to Cm; no multiplications or divisions are involved. The same goes for Cn. CP is done incrementally from pixel to pixel. ____________________ 15 Clipping polygons to planes Eventually, you might need to clip your polygon, minimally to the z>0 volume. Several approaches can be used. We will discuss here the Sutherland-Hodgman algorithm. But first, let us speak of trivial rejection/acceptance. If a polygon does not intersect the clipping plane, it can be either trivially accepted or rejected. For example, if every single point of a polygon is in z>0 and we are clipping with z>0, then the whole polygon can be accepted. If every single point is in z<=0, the whole polygon can be trivially rejected. The real problem comes with cases where some points are in z>0 and some are in z<=0. Another nifty thing you can do is pre-calculate the bounding sphere of a poly, with its center on a given point in the poly. Let O be the sphere's center and R its radius. If Oz-R>0, then the poly can be trivially accepted even faster (no need to check every single point). If Oz+R<=0, the poly can be trivially rejected. You can extend this to polyhedra. You could also check whether the intersection of the sphere and the polygon's plane is in z>0 (or z<=0), which might be slightly better than checking the whole sphere. Here comes the Sutherland-Hodgman algorithm. Start with a point that you know is going to be accepted. Now, move clockwise (or counter-clockwise) around the poly, accepting all edges that should be trivially accepted.
Now, when an edge crosses from acceptance to rejection (AtoR), find the intersection with the clipping plane and replace the offending edge by a shortened one that can be trivially accepted. Then, keep moving until you find an edge that crosses from rejection to acceptance (RtoA), clip that edge and keep the acceptable half. Add an edge between the two edges AtoR and RtoA. Keep going until you are done. Figure 15 Figure 15 illustrates the process of clipping the abcde polygon to the clipping plane. Of note is this: when performing this kind of clipping on convex polygons, the results are clear. Furthermore, a convex polygon always produces a convex polygon when clipped. However, concave polygons introduce the issue of degenerate edges and whether they should be there in a correctly clipped polygon. Figure 16 shows such a case of degenerate edge generation. Figure 16 In figure 16, the polygon to the left is clipped to the plane shown; the result is on the right. The bold edge is double. That is, two edges pass there. This is what we call a degenerate edge. Degenerate edges don't matter if what interests us is the area of the polygon, or if they happen to fall outside of the view volume. What interests us is indeed the area of the polygon, but due to roundoff error, we could end up with the faint trace of an edge on screen. One could eliminate these degenerate edges pretty easily. To do so, do not add an edge between the points where you clip at first (edge xy in figure 15). Instead, once you know all the clipping points (x and y in the above example), sort them in increasing or decreasing order according to one of the coordinates (e.g. you could sort the clipping points in increasing x). Then, between the first and second points, add an edge; between the third and fourth, add an edge; between the fifth and sixth, add an edge; and so on. This is based on the same idea that we used for our polygon drawing algorithm in an earlier chapter.
That is, when you intersect an edge, you either come out of the polygon or go into the polygon. Thus, between the first and second clipped points belongs an edge, etc... However, since we are clipping to z=0, the degenerate edge has to be outside the view volume. (Division by a very small z, when making projections, ensures that the projected edge is far from the view window.) Since we are clipping to z>0, let us pull out our limit mathematics. Let us first examine the case where the edge lies entirely in the z=0 plane. Let us assume we are displaying our polygon using a horizontal scan-line algorithm. Now, if both of the offending edge's endpoints are at y>0, then the projected edge will end up far above the view window; likewise, if both endpoints are at y<0, it will end up far below. If they are on either side of y=0, then all projected points will end up above or below the view window, except the exact point on the edge where y=0. If that point's x<0, then x/z will end up far to the left, and if x>0, the point will end up far to the right. If x=0, then we have the case x=y=0 and z tends towards positive zero (don't flame me for that choice of words). In the case x=0, x/z will be zero (because 0/a for any nonzero a is 0), and likewise for y. Thus, the projected edge should pass through the center of the view window and go to infinity in both directions. Its slope should be the same as the one it has in the z=0 plane. Just to make it simple, if we have the plane equation (as we should) in the form Ax+By+Cz+D=0, we can find the projected slope by fixing z=1: Ax+By+C+D=0, or y=-(A/B)x-(C+D)/B. Thus the slope of the projected line is -A/B. If B is zero, we have a vertical line. What we conclude is that if the edge's z coordinate is 0 and both of its endpoints are on the same side of the plane y=0, then the edge will not intersect any horizontal scanline and can thus be fully ignored. If the endpoints are on either side of y=0, then it will have an intersection with all horizontal scanlines.
The projected edge will end up far to the left if, when evaluated at y=0 and z=0, its x coordinate is less than 0 (from the plane equation, x=-D/A). If the x coordinate is positive, the edge ends up far to the right. If x=y=0 when z=0, the edge will be projected to a line passing through the center of the screen with a slope of -A/B. In the case where only one of the edge's endpoints is in the z=0 plane, that endpoint will be projected to infinity and the other will not; we will end up with a half-line in the general case, sometimes a line segment. Let us name P0 the endpoint whose z coordinate is 0. If P0's x<0, then it will be projected to the left of the window. If P0's x>0, it will be projected to the right. Likewise, if P0's y<0, it will be projected upwards; if y>0, it will be projected downwards. The slope will be (v0-v1)/(u0-u1) where (u0,v0) is the projected P0 and (u1,v1) is the projected P1. This can be written as u0=P0x/P0z u1=P1x/P1z v0=P0y/P0z v1=P1y/P1z It is of note that u1 and v1 are both relatively small numbers compared to u0 and v0. As P0z tends to 0, the slope is therefore: m=(P0y/P0z-P1y/P1z)/(P0x/P0z-P1x/P1z) m=(P0y/P0z)/(P0x/P0z-P1x/P1z)-(P1y/P1z)/(P0x/P0z-P1x/P1z) m=(P0y/P0z)/([P0xP1z-P1xP0z]/(P0zP1z))-0 m=(P0y/P0z)(P0zP1z/[P0xP1z-0]) m=P0yP1z/(P0xP1z) m=P0y/P0x. By symmetry, we could have found the inverse slope p=P0x/P0y. Thus, in the case where P0x=0, we are facing a vertical projection. Otherwise, the projection's slope is P0y/P0x. In any case, the line passes through (u1,v1). In the case where P0x=P0y=0, (u0,v0)=(0,0), and the projected edge remains a line segment. _________________________ 16 Interpolations and forward differencing Oftentimes, we graphics programmers want to interpolate stuff, especially when the calculations are very complex. For example, the texture mapping equations involve at least two divides per pixel, which can be just a bit too much for small machines. Most of the time, we have a scanline v=k on screen that we are displaying, and we want to display a span of pixels, e.g.
a scanline in a triangle or something. The calculations for these pixels can sometimes be complicated. If, for example, we have to draw pixels in the span from u=2 to u=16 (a 15-pixel span) for a given scanline, we would appreciate reducing the per-pixel calculations to the minimum. Let's take, as our example, the z value of a plane at a given pixel. Let's say the plane equation is x+2y+z=1. We have already seen that z can be calculated with z=D/(Au+Bv+C), or z=1/(u+2v+1). Let us imagine that we are rendering a polygon, and that the current scanline is v=1. Then, z=1/(u+3) for the current scanline. Say that a span goes from u=0 to u=20 (that is, on line v=1, the polygon covers pixels u=0 through u=20). We can see from the equation for z that when u=0, z=1/3, and when u=20, z=1/23. Calculating the other values is a bit expensive because of the division. (Divisions on computers are typically slower than other operations.) As such, we could perhaps use an approximation of z instead. The most naive one would be the following: just assume z is really of the form z=mu+b and passes through the points u=0, z=1/3 and u=20, z=1/23. Thus: z=mu+b verifies 1/3=m(0)+b and 1/23=m(20)+b thus 1/3=b and 1/23=20m+b or 1/23=20m+1/3 1/23-1/3=20m m=1/460-1/60, or approximately -0.01449275362319. Thus, instead of using z=1/(u+3), we could use z=-0.01449275362319u+1/3. In this particular case, this would be somewhat accurate because m is small. For example, we know from the real equation that z=1/(u+3), thus when u=10, z=1/13, approximately 0.07692307692308. From the interpolated equation, when u=10, we get z=-0.01449275362319*10+1/3, or approximately 0.1884057971014. The absolute error is approximately 0.11, which may or may not be large according to the units used for z. However, the relative error is very high (the approximation is more than twice the true value), and it is possible to create cases where the absolute error is quite drastic. To reduce the error, we could use a higher degree polynomial. E.g.
instead of using z=mx+b, we could for instance use z=Ax^2+Bx+C, a second degree polynomial. We would expect this polynomial to reduce the error. The only problem with this and higher degree polynomials is generating the coefficients (A, B and C in this case) that will minimize the error. Even defining what "minimizing the error" means can be troublesome, and depending on the exact definition, we will find different coefficients. A nice way of "minimizing the error" is using a Taylor series. Since this is not a calculus document, I will not teach it in detail, but merely remind you of the sum. It can be demonstrated that all sufficiently smooth functions can be expressed as a sum of polynomial terms over a given range. We can translate that function so as to center that range about any real number, and other such things. In brief, the Taylor series of f(x) around a is: T=f(a)+f'(a)(x-a)/1!+f''(a)(x-a)^2/2!+f'''(a)(x-a)^3/3!+f''''(a)(x-a)^4/4!+... This may look a bit scary, so here is the form of the general term: Ti=f^(i)(a)(x-a)^i/i! and T=T0+T1+T2+T3+...+Ti+... where f^(i)(a) is the ith derivative of f evaluated at point a, and i! is the factorial of i, or 1x2x3x4x...xi. As an example, the factorial of 4 is 1x2x3x4=24. It is possible to study this series and determine for which values of x it converges and such things, and it is very interesting to note that for values of x for which it does not converge, it usually diverges badly. Taylor series usually do a good job of converging relatively quickly in general. For particular cases, one can usually find better solutions than a Taylor series, but in the general case, they are quite handy. Saying that they converge "relatively quickly" means that you could evaluate, say, the first 10 terms and be quite confident that the error is small. There are exceptions and special cases, of course, but this is generally true.
What interests us in Taylor series is that they give us a very convenient way of generating a polynomial of any degree to approximate any given function over a certain range. In particular, the z=1/(mu+b) equation could be approximated with a Taylor series, its radius of convergence determined, etc... (Please look carefully at the Taylor series and you will see that it is indeed a polynomial.) Here are a few well known and much used Taylor series expansions: exp(x)=1+x+x^2/2!+x^3/3!+x^4/4!+x^5/5!+x^6/6!+... sin(x)=x-x^3/3!+x^5/5!-x^7/7!+x^9/9!-... cos(x)=1-x^2/2!+x^4/4!-x^6/6!+x^8/8!-... Taylor series for multivariable functions also exist, but they will not be discussed here. For a good description of these, you can get a good advanced calculus book. My personal reference is Advanced Engineering Mathematics, seventh edition, by Erwin Kreyszig, ISBN 0-471-55380-8. As for derivatives, limits et al., grab an introductory level calculus book. Now the only problem is evaluating a polynomial at any point. As you know, evaluating Ax^2+Bx+C at point x0 requires that we square x0, multiply it by A, add B times x0 and then finally add C. If we're doing that once per pixel, we are losing a lot of time. However, the same idea as in incremental calculations can be reused for a polynomial of any degree. Generally speaking, we're going to try to find what happens to f(x) as we increase x by one. That is, we're trying to find f(x+1)-f(x) to see the difference between one pixel and the next. Then, all we need to do is add that value as we move from one pixel to the next. Let's do an example with f(x)=mx+b. f(x+1)-f(x)=[m(x+1)+b]-[mx+b] f(x+1)-f(x)=m So, as we move from one pixel to the next, we just need to add m to the previous value of f(x).
If we have a second order polynomial, the calculations are similar: f(x)=Ax^2+Bx+C f(x+1)-f(x)=[A(x+1)^2+B(x+1)+C]-[Ax^2+Bx+C] =[A(x^2+2x+1)+Bx+B+C]-[Ax^2+Bx+C] =[Ax^2+(2A+B)x+A+B+C]-[Ax^2+Bx+C] =2Ax+A+B Let's name that last equation g(x). So, as we move from x to x+1, we just need to add g(x) to the value of f(x). However, calculating g(x) involves two multiplications (one of which can be optimized out by using bit shifts) and at least an add. But g(x) is a linear equation, so we can apply forward differencing to it again and calculate: g(x+1)-g(x)=[2A(x+1)+A+B]-[2Ax+A+B] =2A So what we do as we move from x to x+1 is first add g(x) to f(x) and then add 2A to g(x): only two adds per pixel. This can be extended to a polynomial of any degree. In particular, NURBS (not described in this document) can be optimized this way. (I do not intend to discuss NURBS; this optimization is considered less efficient than subdivision, but it is worth mentioning.) So what is the use of all this? Well, you could use a Taylor series to do texture mapping instead of two divisions per pixel. This is a generalization of linear approximation for texture mapping. You could also use it for evaluating the z value of a polygon at any pixel, as we were doing in the introduction to this chapter. This, however, might not be very useful since you might not need the actual z value; the 1/z value might suffice, which, as we have already seen, can be calculated incrementally without approximation (i.e. no error except roundoff). This would be useful in visible surface determination. _________________________ 17 Specular highlights and the Phong illumination model Up to now, we have been using a very plain illumination model. We have assumed that the "quantity of light at a given point" can be calculated with some form of dot product of the light vector with the normal vector. (Let me remind you that (a,b,c) dot (d,e,f) is ad+be+cf.)
You should normalize the light vector for this, and the plane normal should be of unit length too. Let us assume the plane normal is (A,B,C) and that it is already normalized. Let us further assume that the light vector (the vector from the light source to the point being lit in 3d) is (P,Q,R). Then, the normalized light vector is (P,Q,R) times 1/sqrt(P*P+Q*Q+R*R). As you can see, normalizing the light vector is very annoying. (See chapter 2 for a lengthier discussion of related topics.) Now, we have been imagining that no matter what angle you look at the object from, the intensity of the light is the same. In reality, that may not be the case. If the object is somewhat shiny and a bright light shines on it and you are looking at the object from an appropriate angle, there should be a very intense spot. If you move your eye, the spot moves, and if you move your eye enough, the spot disappears entirely. (Try it with an apple in a room with a single, intense light source, no mirrors et al.) This means that the light is reflected more in one general direction than in the others. Let us look at a little drawing. Let N be the surface normal, L the light vector, R the reflection vector and V the eye-to-object vector. Figure 17 In the above picture, we see clearly that R is L mirrored about N. Please note that the angle between L and R is not 90 degrees as it may appear, but rather twice the angle between L and N. It can be demonstrated that: R=2N(N dot L)-L Or, in more detail, if R=(A,B,C), N=(D,E,F) and L=(G,H,I), then N dot L is DG+EH+FI, and thus we have A=2D(DG+EH+FI)-G B=2E(DG+EH+FI)-H C=2F(DG+EH+FI)-I It is of note that, if the light is at infinity, N dot L is constant for a flat polygon. If the light is not at infinity, or if the surface is curved, then N dot L varies.
Now, if we do not take into account the angle b in figure 17, the amount of light perceived should be something like I=Kcos(a), where K is some constant that depends on the material and a is the angle shown in figure 17. If to that we want to add the specular highlight, we should add a term that depends on the angle b of figure 17. I becomes I=Kcos(a)+Ccos^n(b) where C is a constant that depends on the material. The constant n is a value that specifies how sharp the reflection is. The higher the value of n, the more localized the reflection will be. Lower values of n can be used for matte surfaces, and higher values for more reflective surfaces. Most light sources don't shine as bright from a distance (though some do shine just as bright at any reasonable distance, such as the sun, but such a light can be considered to lie "infinitely" far away). In that case, we might want to divide the above intensity by some function of the distance. We know from physics that the quantity of energy at a distance r from a light source is inversely proportional to the square of the distance. However, this may not look good in computer graphics. In short, you might want to use the square of the length of the light vector, or the length of the light vector, or some other formula. Let us assume we are using the length of the light vector. Then, the intensity becomes: I=1/|L| * [Kcos(a)+Ccos^n(b)] Furthermore, it is known that if t is the angle between the vectors A and B, then A dot B is |A||B|cos(t). Then, cos(t)=(A dot B)/(|A||B|) If |A| and |B| are both one, we can ignore them. So, let L' be normalized L, R' be normalized R and V' be normalized V, and of course N is of unit length; then the equation is: I=D * (K[N dot L']+C[R' dot V']^n) where D=1/|L|. We could have used some other function of |L|. To the equation above, you may want to add some ambient light, e.g.
light that shines from everywhere at the same time, such as when you are standing outside (you wouldn't see a very dark corner in daylight, normally, so the light that is still present in a "dark corner" is the ambient light). If you have multiple light sources, just calculate the intensity from each light source and add them together. This of course assumes that a is between 0 and 90 degrees; otherwise the light is on the wrong side of the surface and doesn't affect it at all. (In which case the only light present would be the ambient light.) _________________________ 18 Phong shading Polygons are flat, and as such have a constant normal across their surface. We used pseudo-normals for Gouraud shading because we were attempting to model a curved surface with flat polygons. We can use the Phong illumination model with flat polygons if we want. That is, forget about the pseudo-normals and use the plane normal everywhere in the Phong illumination equation. This will yield very nice looking flat surfaces, very realistic. However, most of the time, we want to model curved surfaces, and the polygons are merely an approximation to that curved surface. On a curved surface, the normal would not be constant over a large area; it would change smoothly to reflect the curvature. On our polygons, of course, the normal does not change. However, if we really want to model curved surfaces with polygons, we find that the sharp contrast between facets is visually disturbing and does not represent a curved surface well enough. To this end, we made the intensity change gradually with Gouraud shading. There is, however, a problem with Gouraud shading. It is possible to demonstrate that no point inside a Gouraud shaded polygon can be of higher intensity than any of the vertices. As such, if a highlight were to fall inside the polygon (as is most probably the case) and not on a vertex, we would perhaps miss it entirely.
To this end, we might want to interpolate the polygon normal instead of the intensity from the vertices' pseudo-normals. The idea is the same as with Gouraud shading, except that instead of calculating a pseudo-normal at the vertices, calculating the intensities at the vertices and then interpolating the intensity linearly, we will calculate pseudo-normals at the vertices, interpolate the pseudo-normals linearly and then calculate the intensity. The intensity can be calculated using any illumination model, and in particular the Phong illumination model can be visually pleasant, though some surfaces don't have any specular highlights. (In which case you can remove the specular factor from the equation entirely.) The calculations for Phong shading are very expensive per pixel, in fact often too expensive even for dedicated hardware. It would be absurd to think that one could do real-time true Phong shading on today's platforms. But in the end, what is true Phong shading? Nothing but an approximation to what would happen if the polygon mesh really were meant to represent a curved surface. I.e. Phong shading is an approximation to an approximation. The principal objective of Phong shading is to yield nonlinear intensity falloff and not miss any specular highlights. Who is to say that no other function will achieve that? One of the solutions to this was proposed at SIGGRAPH '86, and is included with this document. What they do is pretty straightforward (but probably required a lot of hard work to achieve). They make a Taylor series out of Phong shading and use only the first two terms of the series for the approximation. Thus, once the initialization phase is completed, Phong shading requires but two adds per pixel, a little more with specular highlights. The main problem, though, is the initialization phase, which includes many multiplications, divisions, and even a square root.
It will doubtlessly be faster than exact Phong shading, but I am not so confident that it will be sufficiently fast to become very popular. Using Gouraud shading and a very large number of polygons can yield comparable results at a very acceptable speed. Approximating Phong shading There are a few approaches to approximating Phong shading. They all boil down to a simple technique. Find the spherical coordinates equivalent of the normal vector at each vertex and interpolate this value linearly along edges and inside the polygon. Or, you can interpolate the (x,y) value of the normal vector linearly, and then, instead of normalizing the (x,y,z) vector, use a lookup table to find what exact z value will make the particular (x,y,z) vector normal. If you are using a lookup-table technique, the lookup table can be viewed as a texture mapped onto the polygon. In this case, you could envision placing something other than a z value or shade value in the lookup table. Nice effects such as environment mapping can be achieved this way. _________________________ Version history Original version (no version number): original draft, chapters 1 through 8. Version 0.10 beta: added chapters 9 through 11, added version history. Still no proofreading, spellchecking or whatever. Version 0.11 beta: modified some chapters, noticed that (Q,P,R) is the normal vector (chapter 5). A little proofreading was done. Version 0.12 beta: I just noticed that this document badly needs a revision. Still, I don't have time. I just added a few things about the scalar multiplication of two vectors being the D coefficient in the plane equation. Added chapter 12. Also corrected a part about using arrays instead of lists. The overhead is just as bad. Better to use lists. Ah, and I have to remember to add a part on calculating the plane equation from more than three points to reduce roundoff error. Version 0.20 beta: Still need to add the part about calculating A, B and C in the plane equation from many points.
However, corrected several of the mistakes I talked about in the preceding paragraph (i.e. revision). Improved the demonstration for 2d rotations. Started work on 3d rotations (this had me thinking...) Version 0.21 beta: converted to another word processor. Will now save in RTF instead of WRI. Version 0.22 beta: corrected a few goofs. 3d rotations still on the workbench. Might add a part about subdividing a polygon into triangles (a task which seems to fascinate people, though I'm not certain why). Will also add the very simple Gouraud shading algorithm in the next version (and drop a line about bump mapping). This thing needs at least a table of contents, geez... I ran the grammar checker on this. Tell me if it left anything out. Version 0.23 beta: well, I made the version number at the start of this doc correct! :-) Did chapter 13 (subdividing into triangles) and chapter 14 (Gouraud shading). Version 0.24 beta: removed all GIFs from the file because of the recent CompuServe-Unisys thingy. Don't ask me for them, call CompuServe or Unisys [sigh]. Version 0.30 beta: added the very small yet powerful technique for texture mapping that eliminates the division per pixel. It can also help for z-buffering. Corrected the copyright notice. Version 0.31 beta: I, err, added yet another bit to texture mapping [did this right after posting it to x2ftp.oulu, silly me]. Corrected a typo in chapter 13: an n-sided polygon turns into n-2 triangles, not n triangles. Was bored tonight, added something to the logs chapter (I know, it's not that useful, but hey, why would I remove it?) Added "Canada" to my address below (it seems I almost forgot the internet is worldwide, tee hee hee). Addendum: I'm not certain that the GIF thing (see the paragraph about v0.24 beta above) applies to me, but I'm not taking any chances. Version 0.32 beta: added bits and pieces here and there. The WWW version supposedly just became available today. Special thanks to Lasse Rasinen for converting it.
Special thanks also to Antti Rasinen. I can't seem to connect to the WWW server, though; I just hope it really works. Corrected a goof in the free directional texture mapping section. Version 0.40 beta: Kevin Hunter joined forces with me and wrote that part in chapter two about changes of coordinate systems, finding the square root et al. Modified the part about triangulating a polygon. Added the Sutherland-Hodgman clipping algorithm and related material (the whole chapter 15). This document deserves a bibliography/suggested readings/whatever, but I don't think I have the time for that. Might add a pointer to the 3d books list if I can find where it is. Anyway, if you're really interested and want to know more, I'm using Computer Graphics, Principles and Practice by Foley, van Dam, Feiner and Hughes, from Addison-Wesley, ISBN 0-201-12110-7. There. It's not in the right format, but it's there. MY E-MAIL ADDRESS CHANGED! PLEASE USE THIS ONE FROM NOW ON! Version 0.41 beta: added to chapter 3. Gave a more detailed discussion of a scan-line algorithm. Put in a line about Paul Nettle's s-buffering in the same chapter. Will now include a "where can I get this doc from" in the readme, and I'll try to put the last few version history entries in my announcements for future versions of the doc. Version 0.42 beta: removed the buggy PS file. Added an ASCII file containing most of the pics of the document. Version 0.50 beta: added chapters 16, 17 and 18 about the Phong illumination model and Phong shading, interpolations, forward differencing and Taylor series. I am also including an extract from SIGGRAPH '86 for fast Phong shading. Source code is coming out good, so it just might be in the next version. Version 0.60 beta: added source code and Phong approximations in chapter 18. Zed3D is now SHAREWARE. This means you should give me a small contribution if you find it useful. See the beginning of the document for more details.
_____________________________ About the author Sébastien Loisel is a university student. He is studying computer science and mathematics and has been writing all sorts of programs, both down-to-earth and hypothetical, theoretical models. He's been thinking a lot about computer graphics lately and so decided to write this because he wanted to get his ideas straight. After a while, he decided to distribute it on the Internet (why the hell am I using the third person?) The author would be happy to hear your comments, suggestions, criticisms, ideas, or to receive a postcard, money, a new computer and/or hardware or whatever (grin). If you do want to get in touch with me, well, here's my snail mail address: Sébastien Loisel 1 J.K. Laflamme Lévis, Québec Canada G6V 3R1 MY E-MAIL ADDRESS CHANGED! PLEASE USE THIS ONE FROM NOW ON! loiselse@ift.ulaval.ca See also the enclosed file, README.3D. I would like to thank Lasse and Antti Rasinen for making a WWW version available in Finland. I would like to thank Simos Hadjiyiannis for making the WWW version available to the rest of the world. Mr. Hadjiyiannis can be reached at hadjiyia@cs.colostate.edu. The WWW page is http://www.cs.colostate.edu/~hadjiyia/3d.html. I would like to thank Kevin Hunter for writing a big chunk of chapter 2. Thank you, Kevin. Kevin Hunter can be reached (at the time of this writing) as HUNTER@symbol.com.