Fast Antialiasing

 Journal:   Dr. Dobb's Journal  June 1992 v17 n6 p139(7)
 -----------------------------------------------------------------------------
 Title:     Fast antialiasing. (Column)
 Author:    Abrash, Michael.
 AttFile:    Program:  GP-JUN92.ASC  Source code listing.

 Abstract:  Advanced computer graphics techniques require powerful computers
            and dedicated systems to handle the output, but utility
            microcomputers based on Intel's 80386 and 80486 microprocessors
            become burdened when trying to compute such graphics features as
            anti aliasing.  Anti aliasing involves smoothing lines and edges
            to remove the jagged appearance.  Anti aliasing makes computer
            images more attractive and aids in making technical drawings more
            accurate.  An algorithm developed by Xiaolin Wu describing line
            anti aliasing is described.  The algorithm brackets either side of
            a line's pixel with pixels of differing values of color.  The sum
            intensity of the bordering pixels equals the intensity of the
            line's pixel.
 -----------------------------------------------------------------------------
 Descriptors..
 Topic:     Tutorial
            Computer Graphics
            Programming
            Program Development Techniques
            Rendering
            Dithering
            Algorithms.
 Feature:   illustration
            chart
            program.

 -----------------------------------------------------------------------------
 Full Text:

 The thought first popped into my head as I unenthusiastically picked through
 the salad bar at a local "family" restaurant, trying to decide whether the
 meatballs, the fried clams, or the lasagna was likely to shorten my life the
 least.  I decided on the chicken in mystery sauce.

 The thought recurred when my daughter asked, "Dad, is that fried chicken?"

 "I don't think so," I said.  "I think it's stewed chicken."

 "It looks like fried chicken."

 "Maybe it's fried, stewed chicken," my wife volunteered hopefully.  I took a
 bite.  It was, indeed, fried, stewed chicken.  I can now, unhesitatingly and
 without reservation, recommend that you avoid fried, stewed chicken at all
 costs.

 The thought I had was as follows: This is not good food.  Not a profound
 thought, but it raises an interesting question: Why was I eating in this
 restaurant?  The answer, to borrow a phrase from E.F.  Schumacher, is
 appropriate technology.  For a family on a budget, with a small child, tired
 of starting at each other over the kitchen table, this was a perfect place to
 eat.  It was cheap, it had greasy food and ice cream, no one cared if
 children dropped things or talked loudly or walked around, and most important
 of all, it wasn't home.  So what if the food was lousy?  Good food was a
 luxury, a bonus; everything on the above list was necessary.  A family
 restaurant was the appropriate dining-out technology, given the parameters
 within which we had to work.

 When I read through SIGGRAPH proceedings and other state-of-the-art
 computer-graphics material, all too often I feel like I'm dining at a
 four-star restaurant with two-year-old triplets and an empty wallet.  We're
 talking incredibly inappropriate technology for PC graphics here.  Sure, I
 say to myself as I read about an antialiasing technique, that sounds
 wonderful--if I had 24-bpp color, and dedicated hardware to do the
 processing, and all day to wait to generate one image.  Yes, I think, that is
 a good way to do hidden surface removal--in a system with hardware
 z-buffering.  Most of the stuff in Computer Graphics is riveting, but, alas,
 pretty much useless on PCs.  When an 80x86 has to do all the work, speed
 becomes the overriding parameter, especially for real-time graphics.

 Literature that's applicable to fast PC graphics is hard enough to find, but
 what we'd really like is above-average image quality combined with terrific
 speed, and there's almost no literature of that sort around.  There is some,
 though, and you folks are right on top of it.  For example, alert reader
 Michael Chaplin, of San Diego, wrote to suggest that I might enjoy the
 line-antialiasing algorithm presented in Xiaolin Wu's article, "An Efficient
 Antialiasing Technique," in the July 1991 issue of Computer Graphics.
 Michael was dead-on right.  This is a great algorithm, combining excellent
 antialiased line quality with speed that's close to that of non-antialiased
 Bresenham's line drawing.  This is the sort of algorithm that makes you want
 to go out and write a wireframe animation program, just so you can see how
 good those smooth lines look in motion.  Wu antialiasing is a wonderful
 example of what can be accomplished on inexpensive, mass-market hardware with
 the proper programming perspective.  In sort, it's a splendid example of
 appropriate technology for PCs.

 Wu Antialiasing

 To recap briefly, antialiasing is the process of smoothing lines and edges so
 that they appear less jagged.  Antialiasing is partly an aesthetic issue,
 because it makes images more attractive.  It's partly an accuracy issue,
 because it makes it possible to position and draw images with effectively
 more precision than the resolution of the display.  Finally, it's partly a
 flat-out necessity, to avoid the horrible, crawling, jagged edges of temporal
 aliasing when performing animation.

 The basic premise of Wu antialiasing is almost ridiculously simple: As the
 algorithm steps one pixed unit at a time along the major (longer) axis of a
 line, it draws the two pixels bracketing the line along the minor acis at
 each point.  Each of the two bracketing pixels is drawn with a weighted
 fraction of the full intensity of the drawing color, with the weighting for
 each pixel equal to one minus the pixel's distance along the minor axis from
 the ideal line.  Figure 1 illustrates this concept.

 The intensities of the two pixels that bracket the line are selected so that
 they always sum to exactly 1; that is, to the intensity of one fully
 illuminated pixel of the drawing color.  The presence of aggregate full-pixel
 intensity means that at each step, the line has the same brightness it would
 have if a single pixel were drawn at precisely the correct location.
 Moreover, thanks to the distribution of the intensity weighting, that
 brightness is centered at the ideal line.  Not coincidentally, a line drawn
 with pixel pairs of aggregate single-pixel intensity, centered on the ideal
 line, is perceived by the eye not as a jagged collection of pixel pairs, but
 as a smooth line centered on the ideal line.  Thus, by weighting the
 bracketing pixels properly at each step, we can readily produce what looks
 like a smooth line at precisely the right location, rather than the jagged
 pattern of line segments that non-antialiased line-drawing algorithms such as
 Bresenham's trace out.

 You might expect that the implementation of Wu antialiasing would fall into
 two distinct areas: tracing out the line (that is, finding the appropriate
 pixel pairs to draw) and calculating the appropriate weightings for each
 pixel pair.  Not so, however.  The weighting calculations involved only a few
 shifts, XORs, and adds; for all practical purposes, tracing and weighting are
 rolled into one step--and a very fast step it is.  How fast is it?  On a
 33-MHz 486 with a fast VGA, a good but not maxed-out assembler implementation
 of Wu antialiasing draws a more than respectable 5000 150-pixel-long vectors
 per second.  That's especially impressive considering that about 1,500,000
 actual pixels are drawn per second, meaning that Wu antialiasing is drawing
 at over 50 percent of the maximum memory bandwidth--and hence more than half
 the faster theoretically possible drawing speed--of an AT-bus VGA.  In short,
 Wu antialiasing is about as fast an antialiased line approach as you could
 ever hope to find for the VGA.

 Tracing and Intensity in One

 Horizontal, vertical, and diagonal lines do not require Wu antialiasing
 because they pass through the center of every pixel they meet; such lines can
 be drawn with fast, special-case code.  For all other cases, Wu lines are
 traced out one step at a time along the major axis by means of a simple,
 fixed-point algorithm.  The move along the minor axis with respect to a
 one-pixel move along the major axis (the line slope for lines with slopes
 less than 1, 1/slope for lines with slopes greater than 1) is calculated with
 a single integer divide.  This value, called the "error adjust," is stored as
 a fixed-point fraction, in 0.16 format (that is, all bits are fractional, and
 the decimal point is just to the left of bit 15).  An error accumulator, also
 in 0.16 format, is initialized to 0.  Then the first pixel is drawn; no
 weighting is needed, because the line intersects its endpoints exactly.

 Now the error adjust is added to the error accumulator.  The error
 accumulator indicates how far between pixels the line has progressed along
 the minor axis at any given step; when the error accumulator turns over, it's
 time to advance one pixel along the minor axis.  At each step along the line,
 the major-axis coordinate advances by one pixel.  The two bracketing pixels
 to draw are simply the two pixels nearest the line along the minor axis.  For
 instance, if X is the current major-axis coordinate and Y is the current
 minor-axis coordinate, then the two pixels to be drawn are (X,Y) and (X,Y +
 1).  In short, the derivation of the pixels at which to draw involves nothing
 more complicated than advancing one pixel along the major axis, adding the
 error adjust to the error accumulator, and advancing one pixel along the
 minor axis when the error accumulator turns over.

 So far, nothing special; but now we come to the true wonder of Wu
 antialiasing.  We know which pair of pixels to draw at each step along the
 line, but we also need to generate the two proper intensities, which must be
 inversely proportional to distance from the ideal line and sum to 1, and
 that's a potentially time-consuming operation.  Let's assume, however, that
 the number of possible intensity levels to be used for weighting is the value
 NumLevels = [2.sup.n] for some integer n, with the minimum weighting (0
 percent intensity) being the value [2.sup.n-1], and the maximum weighting
 (100 percent intensity) being the value 0.  Given that, lo and behold, the
 most significant n bits of the error accumulator select the proper intensity
 value for one element of the pixel pair, as shown in Figure 2.  Better yet,
 [2.sup.n-1] minus the intensity of the first pixel selects the intensity of
 the other pixel in the pair, because the intensities of the two pixels must
 sum to 1; as it happens, this result can be obtained simply by flipping the n
 least-significant bits of the first pixel's value.  All this works because
 what the error accumulator accumulates is precisely the ideal line's current
 distance between the two bracketing pixels.

 The intensity calculations take longer to describe than they do to perform.
 All that's involved is a shift of the error accumulator to right-justify the
 desired intensity weighting bits, and then an XOR to flip the
 least-significant n bits of the first pixel's value in order to generate the
 second pixel's value.  Listing One (page 154) illustrates just how efficient
 Wu antialiasing is; the intensity calculations take only three statements;
 and the entire Wu line-drawing loop is only nine statements long.  Of course,
 a single C statement can hide a great deal of complexity, but Listing Six
 (page 156), the inner loop of an assembler implementation, shows that only 15
 instructions are required per step along the major axis--and the number of
 instructions could be reduced to ten by special-casing and loop unrolling.
 Make no mistake about it, Wu antialiasing is fast.

 Sample Wu Antialiasing

 The true test of any antialiasing technique is how good it looks, so lt's
 have a look at Wu antialiasing in action.  Listing One is a C implementation
 of Wu antialiasing.  Listing Two (page 155) is a sample program that draws a
 variety of Wu-antialiasing lines, followed by non-antialiased lines, for
 comparison.  Listing Three (page 156) contains DrawPixel() and SetMode()
 functions for mode 13h, the VGA's 320x200 256-color mode.  Finally, Listing
 Four (page 156) is a simple, non-antialiased line-drawing routine.  Link
 these four listings together and run the resulting program to see both
 Wu-antialiased and non-antialiased lines.

 Listing One isn't particularly fast, because it calls DrawPixel() for each
 pixel.  On the other hand, DrawPixel() makes it easy to try out Wu
 antialiasing in a variety of modes; just adapt the code in Listing Three for
 the 256-color mode you want to support.  For example, Listing Five (page 156)
 shows code to draw Wu-antialiased lines in 640x480 256-color mode on a
 Super-VGA built around the Tseng Labs ET4000 chip with at least 512K of
 display memory installed.  It's well worth checking out Wu antialiasing at
 640x480.  Although antialiased lines look much smoother than normal lines at
 320 x 200 resolution, they're far from perfect, because the pixels are so big
 that the eye can't blend them properly.  At 640x480, however, Wu-antialiased
 lines look fabulous; from a couple of feet away, they look as straight and
 smooth as if they were drawn with a ruler.

 Listing One requires that the DAC palette be set up so that a NumLevel-long
 block of palette entries contains linearly decreasing intensities of the
 drawing color.  The size of the block is programmable, but must be a power of
 two.  The more intensity levels, the better.  Wu says that 32 intensities is
 enough; on my system, eight and even four levels looked pretty good.  I found
 that gamma correction, which gives linearly spaced intensity steps, improved
 antialiasing quality significantly.  Fortunately, we can program the palette
 with gamma-corrected values, so our drawing code doesnt's have to do eny
 extra work.

 Listing One isn't very fast, so I implemented Wu antialiasing in assembler,
 hard-coded for mode 13h.  The inner loop of the assmbler code is shown in
 Listing Six; the full assembler routine will be available as part of the code
 archive from this issue of DDJ.  (See "Availability" on page 3.)  High-speed
 graphics code and fast VGAs go together like peanut butter and jelly, which
 is to say very well indeed; the assembler implementation ran more than twice
 as fast on my 486 after I ran the SETBUS utility from last month to put the
 VGA into 16-bit mode.  Enough said!

 Notes on Wu Antialiasing

 Wu antialiasing can be applied to any curve for which it's possible to
 calculate at each step the positions and intensities of two bracketing
 pixels, although the implementation will generally be nowhere near as
 efficient as it is for lines.  However, Wu's anticle in Computer Graphics
 does describe an efficient algorithm for drawing antialiased circles.  Wu
 also describes a technique for antialiasing solids, such as filled circles
 anf polygons.  Wu's approach biases the edges of filled objects outward.
 Although this is no good for adjacent polygons of the sort used in rendering,
 it's certainly possible to design a more accurate polygon-antialiasing
 approach around Wu's basic weighting technique.  The results would not be
 quite so good as more sophisticated antialiasing techniques, but they woudl
 be much faster.

 In general, in fact, the results obtained by Wu antialiasing are only so-so,
 by theoretical measures.  Wu antialiasing amounts to a simple box filter
 placed over a step approximation of a line, and that process introduces a
 good deal of deviation from the ideal.  On the other hand, Wu notes that even
 a 10 percent error in intensity doesn't lead to noticeable loss of image
 quality, and for Wu-antialiased lines up to 1K pixels in length, the error is
 under 10 percent.  If it looks good, it is good--and it looks good.  With a
 16-bit error accumulator, fixed-point inaccuracy becomes a problem for
 Wu-antialiased lines longer than 1K.  For such lines, you should switch to
 using 32-bit error values, which would let you handle lines of any practical
 length.

 In the listings, I have chosen to truncate, rather than round, the
 error-adjust value.  This increases the intensity error of the line but
 guarantees that fixed-point inaccuracy won't cause the minor axis to advance
 past the endpoint.  Over-running the endpoint would result in the drawing of
 pixels outside the line's bounding box, and potentially even in an attempt to
 access pixels off the edge of the bitmap.

 Finally, I should mention tha, as published, Wu's algorithm draws lines
 symmetrically, from both ends at once.  I haven't done this for a number of
 reasons, not least of which is that symmetric drawing is an inefficient way
 to draw lines that span banks on banked Super-VGAs.  Banking aside, however,
 symmetric drawing is potentially faster, because it eliminates half of all
 calculations; in so doing, it cuts intensity error in half, as well.

 Matrix Orthogonalization Made Easy

 Peter Brooks of MicroMind passed along the following nifty idea.  Fixed-point
 rotation matrices tend to drift from the proper values over the course of
 repeated concatenations, due to fixed-point error, with the colums gradually
 ceasing to be mutually perpendicular  (orthogonal).  It is essential that
 rotation matrices remain orthogonal, however, in order to perform
 nondistorting rotations.  One way to deal with this is to periodically
 recalculate one of the column as the cross-product of the other two, which
 works because the cross-product of two vectors is an orthogonal vector.

 Peter's insight was: Why not recalculate one colum as the cross-product of
 the other two every time you recalculate a rotation matrix?  In other words,
 do the matrix multiplication to generate two columns of the result matrix,
 then just calculate the third column as the cross-product of the other two.
 The row or column calculated as the cross-product should be rotated
 regularly.  Not only does this guarantee an orthogonal matrix at all times,
 but it also turns out to be faster, because it takes fewer operations to
 calculate a cross-product than to recalculate a matrix column.  (All of the
 above applies equally well if you substitute "row" for "column.")

 This sounds like a great idea to me.  My only question is whether one should
 alternate between rows and columns, in order to distribute error more evenly.
 Any thoughts, readers?

 Next Time

 We didn't quite make it back to X-Sharp this month, although the topics were
 certainly related.  Next month, for sure.  In the meantime, keep those
 excellent suggestions coming.  And stay the heck away from fried, stewed
 chicken.

 [BACK] Back

Discuss this article in the forums


Date this article was posted to GameDev.net: 7/16/1999
(Note that this date does not necessarily correspond to the date the article was written)

See Also:
Michael Abrash's Articles

© 1999-2011 Gamedev.net. All rights reserved. Terms of Use Privacy Policy
Comments? Questions? Feedback? Click here!