Fifteen Ways to do Faster Blits.
A slow game is not a good game. Without a fast blitter,
your game can run like an 8086 doing ray-tracing.
With that in mind, let's take a look at this code:
void BltLinear(LINEAR_BITMAP far * BM, int x, int y, UINT8 far * ScreenBase)
{
    int Top;                  /* coordinate values of bitmap top-left corner */
    int Left;
    int BltWidth;             /* width of bitmap so we don't dereference pointers */
    int BltHeight;            /* height of bitmap so we don't dereference pointers */
    UINT16 TempOffset;        /* temp variable to calc far pointer offsets */
    UINT8 far * Screen;       /* pointer to current screen position */
    UINT8 far * Bitmap;       /* pointer to current bitmap position */
    unsigned WidthCounter;
    unsigned HeightCounter;
    unsigned ScreenIncrement;

    assert(BM != NULL);
    assert(ScreenBase != NULL);

    //Compute our left and top starting points
    Left = x - BM->OriginX;
    Top = y - BM->OriginY;

    //Compute our Screen location
    TempOffset = Top * ScreenWidth + Left;
    Screen = ScreenBase + TempOffset;

    //Compute our bitmap pointer
    Bitmap = &(BM->Data);

    //Alias pointers
    BltWidth = BM->Width;
    BltHeight = BM->Height;

    //How much should we increment the screen inside of the loop
    ScreenIncrement = ScreenWidth - BltWidth;

    for (HeightCounter = 0; HeightCounter < BltHeight; HeightCounter++)
    {
        for (WidthCounter = 0; WidthCounter < BltWidth; WidthCounter++)
        {
            if (*Bitmap != 0)
            {
                *Screen = *Bitmap;
            }
            Screen++;
            Bitmap++;
        }
        Screen += ScreenIncrement;
    }
}
How could we optimize that? Well, this brings us to our first
optimization:
Always flip and unroll loops.
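Flipping a loop usually means counting down to zero instead of
up to a limit, so the end test is a cheap compare against zero.
Unrolling means repeating the loop body several times per pass,
so there are fewer jumps. We can use these and other "classic"
optimizations to improve our code. This may seem obvious, but
it is worth mentioning.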
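As a rough sketch of what this can look like, here is one row of
the inner loop, flipped and unrolled. The function name
CopyRowTransparent and the unroll factor of four are my own
assumptions for illustration, not part of the original blitter,
and the far keyword is left out so the code compiles on a
current compiler.

```c
/* One row of a transparent blit, flipped and unrolled by four.
   Illustrative sketch only: the name and the unroll factor are assumptions. */
void CopyRowTransparent(unsigned char *Screen,
                        const unsigned char *Bitmap,
                        unsigned Width)
{
    unsigned Blocks = Width >> 2;   /* number of full groups of four pixels */
    unsigned Rest   = Width & 3;    /* leftover pixels (0 to 3) */

    /* Flipped loop: count down, so the end test is a compare with zero. */
    while (Blocks != 0)
    {
        /* Unrolled body: four pixels per pass, one loop jump for all four. */
        if (Bitmap[0] != 0) Screen[0] = Bitmap[0];
        if (Bitmap[1] != 0) Screen[1] = Bitmap[1];
        if (Bitmap[2] != 0) Screen[2] = Bitmap[2];
        if (Bitmap[3] != 0) Screen[3] = Bitmap[3];
        Screen += 4;
        Bitmap += 4;
        Blocks--;
    }

    /* Finish the pixels that did not fill a whole group of four. */
    while (Rest != 0)
    {
        if (*Bitmap != 0) *Screen = *Bitmap;
        Screen++;
        Bitmap++;
        Rest--;
    }
}
```

Whether four is the best unroll factor depends on the machine,
so it is worth timing a few values.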
Don't use far pointers to bitmaps unless you need to.
Unless you are going to be blitting very large (64K or over)
bitmaps, don't use far pointers to bitmaps. Far pointer operations
are always slower than near pointer operations. Far pointers
are composed of a segment and an offset, while near pointers
are composed only of an offset, so a far access may also have
to load a segment register.
Use register variables for the loop counters whenever possible.
We can turn the loop counters into register variables. Those
are probably the best ones to use registers for because they
are frequently accessed. To do this, we just need to add
the register keyword to the declaration of the counter
variables.
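Here is a minimal sketch of the same loop with both counters
declared as register variables. The function name and its plain
parameter list are assumptions made so the example stands on
its own. Keep in mind that register is only a hint, and the
compiler is free to ignore it.

```c
/* Transparent blit with register loop counters.
   Hypothetical stand-alone version: the name and parameters are assumptions. */
void BltWithRegisterCounters(unsigned char *Screen,
                             const unsigned char *Bitmap,
                             unsigned Width, unsigned Height,
                             unsigned ScreenWidth)
{
    register unsigned WidthCounter;    /* touched once per pixel */
    register unsigned HeightCounter;   /* touched once per row */
    unsigned ScreenIncrement = ScreenWidth - Width;

    for (HeightCounter = 0; HeightCounter < Height; HeightCounter++)
    {
        for (WidthCounter = 0; WidthCounter < Width; WidthCounter++)
        {
            if (*Bitmap != 0)
            {
                *Screen = *Bitmap;
            }
            Screen++;
            Bitmap++;
        }
        Screen += ScreenIncrement;   /* skip to the start of the next row */
    }
}
```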
Don't use asserts!
Asserting something might be good debugging practice, but
it does not help the execution time of our blitter. You
can either take out the asserts completely, or you
can just add a #define NDEBUG before including assert.h when
you compile the final version. If you don't know already, an
assert macro expands to an if statement. That means for each
assert we have a conditional jump. Conditional jumps are slow.
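A small sketch of the NDEBUG route follows. FillRow is a
made-up helper, not part of the blitter above; the only point
is where the #define goes.

```c
#define NDEBUG            /* must come before <assert.h> is included */
#include <assert.h>
#include <stddef.h>       /* NULL */

/* Hypothetical helper: fills Count bytes starting at Dest with Color. */
void FillRow(unsigned char *Dest, unsigned char Color, unsigned Count)
{
    assert(Dest != NULL);  /* compiles to nothing while NDEBUG is defined */

    while (Count != 0)
    {
        *Dest++ = Color;
        Count--;
    }
}
```

Leave the #define out (or pass it on the compiler command line
only for release builds) and the check comes back for debugging.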
Use global variables instead of passing parameters.
You can use global variables to speed up this function.
Instead of passing parameters, which have to be pushed onto the
stack on every call, you just set the value of the global
variables. This may seem odd, but it works.
Almost everyone has heard or read somewhere
that global variables are a no-no, but some rules have to
be broken when you are a game programmer.
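As a sketch of the idea, the blitter below takes no parameters
at all. Every g_ name is hypothetical, made up for this example,
and far is omitted so it compiles on a current compiler.

```c
/* Hypothetical globals standing in for the blitter's parameters. */
unsigned char       *g_Screen;        /* top-left corner of the destination */
unsigned             g_ScreenWidth;   /* bytes per screen row */
const unsigned char *g_BltData;       /* bitmap pixels, 0 = transparent */
unsigned             g_BltX, g_BltY;  /* where to draw */
unsigned             g_BltWidth, g_BltHeight;

/* Takes no parameters, so the caller pushes nothing onto the stack. */
void BltGlobals(void)
{
    unsigned char *Screen = g_Screen + g_BltY * g_ScreenWidth + g_BltX;
    const unsigned char *Bitmap = g_BltData;
    unsigned ScreenIncrement = g_ScreenWidth - g_BltWidth;
    unsigned WidthCounter, HeightCounter;

    for (HeightCounter = 0; HeightCounter < g_BltHeight; HeightCounter++)
    {
        for (WidthCounter = 0; WidthCounter < g_BltWidth; WidthCounter++)
        {
            if (*Bitmap != 0)
            {
                *Screen = *Bitmap;
            }
            Screen++;
            Bitmap++;
        }
        Screen += ScreenIncrement;
    }
}
```

The caller sets the globals (often only the position and the
bitmap change between calls) and then calls BltGlobals().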
Don't use variables where you don't need them!
The code references the variable ScreenWidth. While
this may seem to make sense, it is better to hard-code
a value (in this case 320). Since this function will only work in
mode 13h, how would supporting different resolutions help?
Also, if you don't use that variable, you don't have to set
it, which makes reusing the code in another game easier.
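A minimal sketch of the hard-coded version of the offset
calculation is shown here. SCREEN_WIDTH and Mode13hOffset are
names I made up for the example. With a constant in place of
the variable, the compiler gets the chance to replace the
multiply with something cheaper, though whether it does depends
on the compiler.

```c
#define SCREEN_WIDTH 320   /* mode 13h is always 320 pixels wide */

/* Offset of pixel (x, y) from the start of the mode 13h screen.
   Hypothetical helper: the name is an assumption. */
unsigned Mode13hOffset(unsigned x, unsigned y)
{
    return y * SCREEN_WIDTH + x;
}
```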
Take out as many features as possible!
This is important, albeit not quite as much in this particular
function. You don't always want to use a blitter packed with
features you don't need. For instance, say you have a blitter
which can do clipping, rotating, scaling, skewing, and dithering,
but you just need a simple blitter. You should use the
blitter that has the least overhead. The fewer features, the
faster. In this example, we can take out the Left and Top
variables because all they do is subtract OriginX and OriginY
from x and y. While that may be helpful in some cases, it is
hardly necessary.
Use the smallest size variables possible.
Take a look at the declaration for TempOffset. The type UINT16
means an unsigned 16-bit integer, and that is all you need to
hold the value: the largest offset on a 320x200 screen is
63999, and an unsigned 16-bit integer can hold up to 65535,
about 1,500 more. Anything bigger is wasted. Note that a
signed 16-bit int stops at 32767, so it is too small.
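The arithmetic can be checked with a few lines of C. This is
only a sketch: the macro and function names are assumptions,
and it uses the standard stdint.h limits rather than the
article's UINT16 typedef.

```c
#include <stdint.h>

#define MODE13H_WIDTH  320
#define MODE13H_HEIGHT 200

/* Largest offset a blitter can produce: the bottom-right pixel (63999). */
#define MODE13H_MAX_OFFSET ((long)MODE13H_WIDTH * MODE13H_HEIGHT - 1)

/* Returns 1 if the largest offset fits in an unsigned 16-bit variable. */
int OffsetFitsInUnsigned16(void)
{
    return MODE13H_MAX_OFFSET <= UINT16_MAX;   /* 63999 <= 65535 */
}

/* Returns 1 if the largest offset fits in a signed 16-bit variable. */
int OffsetFitsInSigned16(void)
{
    return MODE13H_MAX_OFFSET <= INT16_MAX;    /* 63999 <= 32767 is false */
}
```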
Don't pass structs as parameters!
In the code you pass a far pointer to a structure to the
function. Bad idea! Every member you read has to be fetched
through that pointer, and passing the structure itself by value
would be worse: you would have to push the whole structure onto
the stack and pop it off again on every call. This means a
serious slowdown. What you should do is make the width and
height global variables, and pass the bitmap as a pointer to
unsigned char. This makes it even faster because you can take
out the assignments that alias the pointers! Passing structures
by value is a bad idea in any time-critical function, because
you have to push and pop each member.
Don't pass the screen as a parameter!
You want to have the screen as a global variable.
Since the address of the screen does not change, you don't need
to make a new far pointer each time you blit. This helps in
two ways, since you also need to push and pop one less
parameter.
Always write a transparent version and a non-transparent version.
You want to write two versions because if you know something
is non-transparent, you can use a non-transparent blitter.
Non-transparent blitters are much faster because the
conditional jump that tests each pixel for transparency is
never executed.
This also relates to not putting in every feature.
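Here is a sketch of the two versions side by side. The function
names and parameter lists are assumptions for the example, and
the opaque version uses the standard memcpy. With far pointers
on a DOS compiler you would need that compiler's far-pointer
copy routine instead.

```c
#include <string.h>

/* Transparent version: color 0 is skipped, so every pixel needs a test. */
void BltTransparent(unsigned char *Screen, const unsigned char *Bitmap,
                    unsigned Width, unsigned Height, unsigned ScreenWidth)
{
    unsigned WidthCounter;

    while (Height != 0)
    {
        for (WidthCounter = 0; WidthCounter < Width; WidthCounter++)
        {
            if (Bitmap[WidthCounter] != 0)
            {
                Screen[WidthCounter] = Bitmap[WidthCounter];
            }
        }
        Screen += ScreenWidth;
        Bitmap += Width;
        Height--;
    }
}

/* Opaque version: no per-pixel test, so each row is one block copy. */
void BltOpaque(unsigned char *Screen, const unsigned char *Bitmap,
               unsigned Width, unsigned Height, unsigned ScreenWidth)
{
    while (Height != 0)
    {
        memcpy(Screen, Bitmap, Width);
        Screen += ScreenWidth;
        Bitmap += Width;
        Height--;
    }
}
```

Backgrounds and tiles usually have no transparent pixels, so
they are natural candidates for the opaque version.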
Use compiled sprites.
A compiled sprite is a function that is created in
inline assembly by a program called a sprite compiler.
A compiled sprite function is simple to use. You
just call a function and it draws itself.
Compiled sprites take out the tests for transparency and all
conditional and unconditional jumps, and are generally written
as assembly code. Use them whenever possible. Soon I will have
a section on that, but it's under construction.
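To give a feel for the idea, here is what the output of a
sprite compiler might look like if it were written in C for a
tiny made-up 3x2 sprite. Real compiled sprites are generated
as assembly. The sprite, the function name, and the C form are
all assumptions for illustration.

```c
#define SCREEN_WIDTH 320   /* mode 13h row length in bytes */

/* Hand-written stand-in for generated code. Draws the 3x2 sprite
       5 . 7
       . 6 .
   ('.' = transparent) with its top-left corner at Dest.
   There are no loops and no transparency tests: one store per
   visible pixel, and transparent pixels cost nothing. */
void DrawCompiledSprite(unsigned char *Dest)
{
    Dest[0] = 5;                  /* row 0, column 0 */
    Dest[2] = 7;                  /* row 0, column 2 */
    Dest[SCREEN_WIDTH + 1] = 6;   /* row 1, column 1 */
}
```

The price is code size: every sprite becomes its own function.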
Don't use pixel plot functions.
Don't use pixel plot functions inside blit functions, because
you pay for a function call on every pixel.
The only time that is acceptable is when the pixel plot
is a macro and not actually a function. Our example
does not have this shortcoming, but many blitters
do.
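A pixel plot written as a macro might look like the sketch
below. PUT_PIXEL and HLine are hypothetical names, and the
macro assumes a 320-byte-wide screen as in mode 13h.

```c
#define SCREEN_WIDTH 320   /* mode 13h row length in bytes */

/* Pixel plot as a macro: it expands in place, so there is no call overhead. */
#define PUT_PIXEL(Screen, x, y, Color) \
    ((Screen)[(unsigned)(y) * SCREEN_WIDTH + (unsigned)(x)] = (unsigned char)(Color))

/* Hypothetical use: draw a horizontal line from x0 to x1 inclusive. */
void HLine(unsigned char *Screen, unsigned x0, unsigned x1, unsigned y,
           unsigned char Color)
{
    unsigned x;

    for (x = x0; x <= x1; x++)
    {
        PUT_PIXEL(Screen, x, y, Color);
    }
}
```

Even as a macro, this recomputes the offset for every pixel,
so stepping a pointer as the example blitter does is still the
faster choice inside a blit loop.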
Never use the BGI in any shape or form.
The same goes for the Microsoft libraries. They are just too
slow! You might use them to demonstrate a concept or for
an example on game AI, but never in a
game. Especially not in our blitter. This does not apply
to our example blitter, but it is fairly important. As far as
I know, no successful game has ever been written using either
library.
A final note.
I would like to thank you for taking the time to read this article.
As final advice, never stop looking for faster ways to do things.
I am sure that these are not all. Please email me with your
tips, tricks, and advice. That's all.
You can reach me at form1@aol.com
Copyright 1996, David Berube
Date this article was posted to GameDev.net: 7/24/1999
(Note that this date does not necessarily correspond to the date the article was written)