Fifteen Ways to do Faster Blits.


A slow game is not a good game. Without a fast blitter, your game can run like an 8086 doing ray-tracing.

With that in mind, let's take a look at this code:


void BltLinear(LINEAR_BITMAP far * BM, int x, int y, UINT8 far * ScreenBase)
{
    int Top;    /* coordinate values of bitmap top-left corner */
    int Left;
    int BltWidth;       /* width of bitmap so we don't dereference pointers */
    int BltHeight;  /* height of bitmap so we don't dereference pointers */
    UINT16 TempOffset;  /* temp variable to calc far pointer offsets */
    UINT8 far * Screen; /* pointer to current screen position */
    UINT8 far * Bitmap; /* pointer to current bitmap position */
    unsigned WidthCounter; 
    unsigned HeightCounter; 
    unsigned ScreenIncrement;

    assert(BM != NULL);
    assert(ScreenBase != NULL);

    //Compute our left and top starting points    
    Left = x - BM->OriginX;
    Top  = y - BM->OriginY;

    //Compute our Screen location
    TempOffset = Top * ScreenWidth + Left;
    Screen = ScreenBase + TempOffset;

    //Compute our bitmap pointer
    Bitmap = &(BM->Data);

    //Alias pointers 
    BltWidth = BM->Width;
    BltHeight = BM->Height;
    
    //How much should we increment the screen inside of the loop
    ScreenIncrement = ScreenWidth - BltWidth;


    for (HeightCounter = 0; HeightCounter < BltHeight; HeightCounter++)
    {
        for (WidthCounter = 0; WidthCounter < BltWidth; WidthCounter++)
        {
            if (*Bitmap != 0)
            {
                *Screen = *Bitmap;
            }
            Screen++;
            Bitmap++;
        }
        Screen += ScreenIncrement;
    }
}

How could we optimize that? Well, this brings us to our first optimization:

Always flip and unroll loops.

We can apply this and other "classic" optimizations to improve our code. This may seem obvious, but it is worth mentioning.
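As a rough sketch, here is what flipping and unrolling might look like for one row of the transparent blit, taking "flipping" to mean counting down to zero instead of up to the width. The helper name BltRowUnrolled and the plain (non-far) pointers are assumptions made for illustration; they are not part of the original blitter.

```c
/* Sketch only: copy one row of a transparent bitmap (0 = transparent).
   BltRowUnrolled is a hypothetical helper, not part of the original code. */
void BltRowUnrolled(unsigned char *Screen, const unsigned char *Bitmap,
                    unsigned Width)
{
    unsigned Blocks = Width / 4;   /* number of full four-pixel groups */
    unsigned Rest   = Width % 4;   /* pixels left over after the groups */

    /* Flipped loop: count down so the loop test is a compare with zero. */
    while (Blocks != 0)
    {
        /* Unrolled body: four pixels per loop test instead of one. */
        if (Bitmap[0] != 0) Screen[0] = Bitmap[0];
        if (Bitmap[1] != 0) Screen[1] = Bitmap[1];
        if (Bitmap[2] != 0) Screen[2] = Bitmap[2];
        if (Bitmap[3] != 0) Screen[3] = Bitmap[3];
        Screen += 4;
        Bitmap += 4;
        Blocks--;
    }

    /* Handle the remaining 0 to 3 pixels one at a time. */
    while (Rest != 0)
    {
        if (*Bitmap != 0) *Screen = *Bitmap;
        Screen++;
        Bitmap++;
        Rest--;
    }
}
```

The transparency test still runs once per pixel; what shrinks is the number of times the loop counter is tested and the jump back to the top is taken.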

Don't use far pointers to bitmaps unless you need to.

Unless you are going to be blitting very large (64K or over) bitmaps, don't use far pointers to bitmaps. Far pointer operations are always slower than near pointer operations. Far pointers are composed of a segment and an offset, while near pointers are composed of an offset only, so a far access may have a segment register to load on top of everything else.

Use register variables for the loop counters whenever possible.

We can turn the loop counters into register variables. They are probably the best candidates for registers because they are accessed so frequently. To do this, we just add the register keyword to the declarations of the counter variables.
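A minimal sketch of the change, using a cut-down blit with plain pointers. The name BltRegister and the ScreenPitch parameter are assumptions for this example only. Keep in mind that register is only a hint, and a compiler is free to ignore it.

```c
/* Sketch only: transparent blit with the loop counters declared register.
   BltRegister and ScreenPitch (bytes per screen row) are hypothetical. */
void BltRegister(unsigned char *Screen, const unsigned char *Bitmap,
                 unsigned Width, unsigned Height, unsigned ScreenPitch)
{
    register unsigned WidthCounter;    /* hint: keep in a CPU register */
    register unsigned HeightCounter;   /* hint: keep in a CPU register */
    unsigned ScreenIncrement = ScreenPitch - Width;

    for (HeightCounter = 0; HeightCounter < Height; HeightCounter++)
    {
        for (WidthCounter = 0; WidthCounter < Width; WidthCounter++)
        {
            if (*Bitmap != 0)
            {
                *Screen = *Bitmap;
            }
            Screen++;
            Bitmap++;
        }
        Screen += ScreenIncrement;   /* skip to the start of the next row */
    }
}
```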

Don't use asserts!

Asserting something might be good debugging practice, but it does not help the execution time of our blitter. You can either take out the asserts completely, or you can just add a #define NDEBUG (before assert.h is included) when you compile the final version. If you don't know already, an assert macro expands to an if statement. That means for each assert we have a conditional jump. Conditional jumps are slow.
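A small sketch of where the cost comes from. FillRow is a hypothetical helper used only to carry the two asserts. As written this is the debug build; defining NDEBUG ahead of the include (or with a compiler switch) turns both assert lines into nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch only: FillRow is a hypothetical helper, not part of the article. */
void FillRow(unsigned char *Row, unsigned Width, unsigned char Color)
{
    /* Debug build: each assert costs a compare and a conditional jump.
       Release build: with NDEBUG defined before <assert.h> is included,
       both lines expand to nothing and cost nothing. */
    assert(Row != NULL);
    assert(Width <= 320);

    while (Width != 0)
    {
        *Row++ = Color;
        Width--;
    }
}
```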

Use global variables instead of passing parameters.

You can use global variables to speed up this function. Instead of passing parameters, you just set the values of the global variables before the call. This may seem odd, but it works. Almost everyone has heard or read somewhere that global variables are a no-no, but some rules have to be broken when you are a game programmer.
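One way this could look, as a sketch. The global names (BltScreen, BltBitmap, BltWidth, BltHeight) and the function BltFromGlobals are made up for this example, and plain pointers stand in for the far pointers of the original.

```c
/* Hypothetical blit "parameters": the caller sets these instead of
   pushing arguments on the stack. */
unsigned char *BltScreen;   /* destination: top-left pixel of the blit */
unsigned char *BltBitmap;   /* source pixel data, 0 = transparent */
unsigned BltWidth;
unsigned BltHeight;

/* Sketch only: transparent blit that reads everything from globals. */
void BltFromGlobals(void)
{
    unsigned char *Screen = BltScreen;
    unsigned char *Bitmap = BltBitmap;
    unsigned x, y;

    for (y = 0; y < BltHeight; y++)
    {
        for (x = 0; x < BltWidth; x++)
        {
            if (Bitmap[x] != 0)
                Screen[x] = Bitmap[x];
        }
        Screen += 320;         /* mode 13h rows are 320 bytes apart */
        Bitmap += BltWidth;    /* bitmap rows are stored back to back */
    }
}
```

A caller would assign the four globals and then call BltFromGlobals() with no arguments at all.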

Don't use variables where you don't need them!

The code references the variable ScreenWidth. While this may seem to make sense, it is better to hard-code a value (in this case 320). Since this function will only work in mode 13h, how will supporting different resolutions help? Also, if you don't use that variable, you don't have to set it, which makes reusing the code in another game easier.
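A sketch of the offset calculation with the width fixed at 320. The names SCREEN_WIDTH, ScreenOffset and ScreenOffsetShift are assumptions for this example. The second function shows something a constant width makes possible: because 320 = 256 + 64, the multiply can be written as two shifts and an add.

```c
#define SCREEN_WIDTH 320u   /* mode 13h is always 320 pixels wide */

/* Offset of pixel (x, y) with the width as a compile-time constant. */
unsigned ScreenOffset(unsigned x, unsigned y)
{
    return y * SCREEN_WIDTH + x;
}

/* Same offset without a multiply: y * 320 = y * 256 + y * 64. */
unsigned ScreenOffsetShift(unsigned x, unsigned y)
{
    return (y << 8) + (y << 6) + x;
}
```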

Take out as many features as possible!

This is important, albeit not quite as much in this particular function. You don't always want to use a blitter packed with features you don't need. For instance, say you have a blitter which can do clipping, rotating, scaling, skewing, and dithering, but you just need a simple blitter. You should use the blitter that has the least overhead. The fewer features, the faster. In this example, we can take out the Left and Top variables because all they do is subtract OriginX and OriginY from x and y. While that may be helpful in some cases, it is hardly necessary.

Use the smallest size variables possible.

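Take a look at the declaration for TempOffset. The type UINT16 means an unsigned 16-bit integer. On a 16-bit compiler a plain unsigned integer is all you need to hold that value, because the largest offset in mode 13h is 63,999 (320 x 200 - 1), and an unsigned 16-bit integer can hold up to 65,535, which leaves a thousand-odd to spare. A signed integer would not do, since it tops out at 32,767.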

Don't pass structs as parameters!

In the code you pass a structure to the function. Bad idea! When a structure is passed by value, the whole thing has to be pushed onto the stack and popped off again, and even when only a far pointer to it is passed, as here, every member has to be fetched through that pointer. This means a serious slowdown. What you should do is make the width and height global variables, and pass the bitmap as an unsigned char pointer. This makes it faster still, because you can take out the assignments that alias the pointers! Passing structures as parameters is a bad idea in any time-critical function, because you have to push and pop each member.
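To get a feel for the difference, here is a sketch that passes the same structure by value and by pointer. SMALL_BITMAP is a hypothetical layout invented for this example (the article does not show LINEAR_BITMAP), and the two Area functions exist only to give the calls something to do.

```c
/* Hypothetical bitmap layout, for illustration only. */
typedef struct
{
    int Width;
    int Height;
    int OriginX;
    int OriginY;
    unsigned char Data[64];
} SMALL_BITMAP;

/* By value: the caller copies the entire structure for every call. */
unsigned AreaByValue(SMALL_BITMAP BM)
{
    return (unsigned)(BM.Width * BM.Height);
}

/* By pointer: only one pointer is passed, whatever the structure's size. */
unsigned AreaByPointer(const SMALL_BITMAP *BM)
{
    return (unsigned)(BM->Width * BM->Height);
}
```

Comparing sizeof(SMALL_BITMAP) with sizeof(SMALL_BITMAP *) on your own compiler shows how many bytes each style of call has to move.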

Don't pass the screen as a parameter!

You want to have the screen as a global variable. Since the address of the screen does not change, you don't need to make a new far pointer each time you blit. This helps in two ways, since you also have one less parameter to push and pop.

Always write a transparent version and a non-transparent version.

You want to write two versions because if you know something is non-transparent, you can use a non-transparent blitter. Non-transparent blitters are much faster because the conditional jump for the transparency test is never executed. This also relates to not putting in every feature.
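A sketch of the pair, assuming a 320-byte-wide screen and plain pointers. The names BltOpaque and BltTransparent are made up here. The opaque version has no per-pixel test at all, so each row can be handed to memcpy.

```c
#include <string.h>

#define SCREEN_WIDTH 320   /* mode 13h row length in bytes */

/* Opaque blit: no transparency test, so whole rows are copied at once. */
void BltOpaque(unsigned char *Screen, const unsigned char *Bitmap,
               unsigned Width, unsigned Height)
{
    while (Height != 0)
    {
        memcpy(Screen, Bitmap, Width);
        Screen += SCREEN_WIDTH;
        Bitmap += Width;
        Height--;
    }
}

/* Transparent blit: color 0 is skipped, which costs a test per pixel. */
void BltTransparent(unsigned char *Screen, const unsigned char *Bitmap,
                    unsigned Width, unsigned Height)
{
    unsigned x;

    while (Height != 0)
    {
        for (x = 0; x < Width; x++)
        {
            if (Bitmap[x] != 0)
                Screen[x] = Bitmap[x];
        }
        Screen += SCREEN_WIDTH;
        Bitmap += Width;
        Height--;
    }
}
```

Backgrounds and tiles with no see-through pixels would go through BltOpaque; only sprites that really need holes pay for BltTransparent.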

Use compiled sprites.

A compiled sprite is a function that is created in inline assembly by a program called a sprite compiler. A compiled sprite function is simple to use. You just call a function and it draws itself. Compiled sprites take out the tests for transparency, all conditional and unconditional jumps, and are generally assembly code. Use them whenever possible. Soon I will have a section on that, but it's under construction.
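To give the idea, here is roughly what a sprite compiler's output could look like for a tiny 3x3 "plus" shape, written in C for readability rather than in assembly. The sprite, its color (15) and the function name DrawPlusSprite are all hypothetical.

```c
/* Sketch of sprite-compiler output for a hypothetical 3x3 "plus" sprite.
   Screen points at the top-left pixel; rows are 320 bytes apart.
   Transparent pixels produce no code at all, so there is nothing to test
   and nothing to loop over. */
void DrawPlusSprite(unsigned char *Screen)
{
    Screen[1]       = 15;   /* row 0: . X . */
    Screen[320 + 0] = 15;   /* row 1: X X X */
    Screen[320 + 1] = 15;
    Screen[320 + 2] = 15;
    Screen[640 + 1] = 15;   /* row 2: . X . */
}
```

The price is code size: every sprite becomes its own function, with one store per visible pixel.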

Don't use pixel plot functions.

Don't use pixel plot functions inside blit functions. The only time that is acceptable is when the function is a macro and not actually a function. Our example does not have this shortcoming, but many blitters do.
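The difference, sketched with made-up names (PlotPixel and PLOT_PIXEL) and a 320-byte-wide screen. The function version is written as a real call with its own parameters and return; the macro expands in place, so no call is written at all.

```c
/* Function version: written as a call, with parameters and a return. */
void PlotPixel(unsigned char *Screen, unsigned x, unsigned y,
               unsigned char Color)
{
    Screen[y * 320u + x] = Color;
}

/* Macro version: the store is pasted straight into the calling code. */
#define PLOT_PIXEL(Screen, x, y, Color) \
    ((Screen)[(unsigned)(y) * 320u + (unsigned)(x)] = (unsigned char)(Color))
```

Even the macro recomputes the offset for every pixel, which is why the example blitter walks a pointer across the row instead.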

Never use the BGI in any shape or form.

The same goes for the Microsoft libraries. They are just too slow! You might use them to demonstrate a concept or for an example on game AI, but never in a game. Especially not in our blitter. This does not apply to our example blitter, but it is fairly important. No successful game has ever been written using either library.

A final note.

I would like to thank you for taking the time to read this article. As final advice, never stop looking for faster ways to do things. I am sure that these are not all of them. Please email me with your tips, tricks, and advice. That's all.

You can reach me at form1@aol.com

Copyright 1996, David Berube


Date this article was posted to GameDev.net: 7/24/1999
(Note that this date does not necessarily correspond to the date the article was written)

© 1999-2011 Gamedev.net. All rights reserved.