This appendix is to address a few issues that were not sufficiently explained in the article (or in some cases left out altogether). The article is intended for beginners using Direct3D for 2D applications. For this reason, when writing the code I decided to go with flexibility over speed in many areas. However, I neglected to mention other, faster methods. I will attempt to do so here, as well as point out some omissions from the sample code.

Issue 1: Cleanup

This is just a little one, but apparently not very well known. When a call is made to IDirect3DDevice9::SetStreamSource(), the reference count of the passed vertex buffer is increased. Unless you later call SetStreamSource() with another vertex buffer, or NULL, calling Release() on that vertex buffer will not actually free it, because the device still holds a reference to it. This is just a small memory leak, and is cleaned up by DirectX upon program termination anyway. However, if you're using a lot of different vertex buffers, you will notice some slowdown as your video memory is consumed. Also, it's just good programming practice to release everything properly yourself.

For you copy-and-pasters, you'll want to add the following to your DirectX cleanup code:

//Clear stream source
DEVICE->SetStreamSource (0, NULL, 0, 0);

It must come before this:

//Release vertex buffer
if (vertexBuffer)
   vertexBuffer->Release ();

NOTE: Replace DEVICE with the name of your IDirect3DDevice9* object

And voila, no memory lost.
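To make the reference-counting behaviour concrete, here is a minimal sketch that runs without Direct3D at all. MockBuffer and MockDevice are hypothetical stand-ins invented for this illustration (they are not DirectX types), and the sketch assumes the device simply adds a reference to whatever buffer is bound as the stream source, as described above:

```cpp
#include <cassert>

// Hypothetical stand-in for a COM object such as IDirect3DVertexBuffer9.
struct MockBuffer {
    unsigned long refCount = 1;   // the creator holds the first reference
    bool destroyed = false;

    unsigned long AddRef() { return ++refCount; }
    unsigned long Release() {
        assert(refCount > 0);
        if (--refCount == 0)
            destroyed = true;     // a real COM object would free itself here
        return refCount;
    }
};

// Hypothetical stand-in for the device: it keeps a reference to the bound stream.
struct MockDevice {
    MockBuffer* stream = nullptr;

    void SetStreamSource(MockBuffer* buffer) {
        if (buffer) buffer->AddRef();    // take a reference to the new buffer
        if (stream) stream->Release();   // drop the reference to the old one
        stream = buffer;
    }
};

// Returns true if the buffer is really destroyed when the app releases it.
bool ReleaseIsFinal(bool clearStreamFirst) {
    MockBuffer buffer;
    MockDevice device;
    device.SetStreamSource(&buffer);         // refCount is now 2
    if (clearStreamFirst)
        device.SetStreamSource(nullptr);     // refCount back to 1
    buffer.Release();                        // 0 if cleared first, otherwise 1
    return buffer.destroyed;
}
```

Calling ReleaseIsFinal(false) models the leak: the application's Release() succeeds, but the buffer survives because the mock device still references it.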

Issue 2: Direct3D Uses Inclusive-Exclusive Coordinates

Wow, I can't believe people complained about this one. It's fairly common knowledge. When you're setting up a destination rectangle, the bottom and right coordinates are one past the last row and column that get drawn:

RECT.bottom = RECT.top + height
RECT.right = RECT.left + width

You even save two cycles over inclusive-inclusive coordinates by losing the "- 1"s.
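If you'd rather not write the additions out by hand every time, a small helper does the job. This is only a sketch: Rect is a local stand-in with the same field names as the Win32 RECT, and MakeRect, RectWidth and RectHeight are names made up for this example:

```cpp
#include <cassert>

// Local stand-in for the Win32 RECT structure (same field names).
struct Rect {
    long left;
    long top;
    long right;
    long bottom;
};

// Build an inclusive-exclusive rectangle from a position and a size.
Rect MakeRect(long x, long y, long width, long height) {
    assert(width >= 0 && height >= 0);
    Rect r;
    r.left = x;
    r.top = y;
    r.right = x + width;     // one past the last covered column
    r.bottom = y + height;   // one past the last covered row
    return r;
}

// With inclusive-exclusive coordinates the size is a plain subtraction.
long RectWidth(const Rect& r)  { return r.right - r.left; }
long RectHeight(const Rect& r) { return r.bottom - r.top; }
```

A 64x32 image placed at (10, 20) therefore gets right = 74 and bottom = 52, and the width and height fall straight back out with no "+ 1" corrections.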

Issue 3: Locking the Vertex Buffer

Here's where we get into some optimization. Locking is more expensive than simply using translation and scaling matrices to move the vertices of the quad around, but it also affords more flexibility. If you want to use colour modulation, you're pretty much stuck locking the vertex buffer in order to get access to that sweet, sweet colour member (don't worry, you're probably not losing too much processing time). However, if you're not using colour modulation and want an extra bit of speed, read on.

This method uses transformation matrices to move around the vertices in the buffer. Understand that this method is not particularly compatible with the drawing methods presented in the first article. In fact, it requires a change to a fairly major item from the first article: the vertex format. You should use this one instead:

//Custom vertex format
const DWORD D3DFVF_TLVERTEX = D3DFVF_XYZ | D3DFVF_DIFFUSE | D3DFVF_TEX1;

//Custom vertex
struct TLVERTEX
{
   float x;
   float y;
   float z;
   D3DCOLOR colour;
   float u;
   float v;
};

In order to use matrix transformations, an orthographic projection matrix must first be set up. This must be done at initialization, after the device is open for business. Here is the code:

D3DXMATRIX matOrtho;
D3DXMATRIX matIdentity;

//Setup orthographic projection matrix
D3DXMatrixOrthoLH (&matOrtho, RESOLUTION_WIDTH, RESOLUTION_HEIGHT, 1.0f, 10.0f);
D3DXMatrixIdentity (&matIdentity);
DEVICE->SetTransform (D3DTS_PROJECTION, &matOrtho);
DEVICE->SetTransform (D3DTS_WORLD, &matIdentity);
DEVICE->SetTransform (D3DTS_VIEW, &matIdentity);

NOTE: Replace DEVICE with the name of your IDirect3DDevice9* object

NOTE 2: Replace RESOLUTION_WIDTH with the width of the backbuffer

NOTE 3: Replace RESOLUTION_HEIGHT with the height of the backbuffer

Please note that this method does not allow colour modulation (though a similar effect can be achieved with a simple vertex shader). It simply uses the vertex colours that are already specified in the buffer. Before using this method, we must ensure that the colour values in the buffer are all white, and that there is a valid quad in the buffer. This function will see to that:

//Setup the quad
void SetupQuad ()
{

    TLVERTEX* vertices = NULL;
    vertexBuffer->Lock(0, 4 * sizeof(TLVERTEX), (VOID**)&vertices, 0);

    //Setup vertices
    vertices[0].colour = 0xffffffff;
    vertices[0].x = 0.0f;
    vertices[0].y = 0.0f;
    vertices[0].z = 1.0f;
    vertices[0].u = 0.0f;
    vertices[0].v = 0.0f;

    vertices[1].colour = 0xffffffff;
    vertices[1].x = 1.0f;
    vertices[1].y = 0.0f;
    vertices[1].z = 1.0f;
    vertices[1].u = 1.0f;
    vertices[1].v = 0.0f;

    vertices[2].colour = 0xffffffff;
    vertices[2].x = 1.0f;
    vertices[2].y = -1.0f;
    vertices[2].z = 1.0f;
    vertices[2].u = 1.0f;
    vertices[2].v = 1.0f;

    vertices[3].colour = 0xffffffff;
    vertices[3].x = 0.0f;
    vertices[3].y = -1.0f;
    vertices[3].z = 1.0f;
    vertices[3].u = 0.0f;
    vertices[3].v = 1.0f;

    vertexBuffer->Unlock();
}

If you are not using colour modulation at all in your program, this only needs to be called once, upon program startup.

Here is code to actually draw the textured quad:

//Draw a textured quad on the backbuffer
void Blit(IDirect3DTexture9* texture, RECT* rDest, float rotate)
{
    float X;
    float Y;
    D3DXMATRIX matTranslation;
    D3DXMATRIX matScaling;
    D3DXMATRIX matTransform;
    
    //Get coordinates
    X = rDest->left - (float)(RESOLUTION_WIDTH) / 2;
    Y = -rDest->top + (float)(RESOLUTION_HEIGHT) / 2; 

    //Setup translation and scaling matrices
    D3DXMatrixScaling (&matScaling, (float)(rDest->right - rDest->left),
        (float)(rDest->bottom - rDest->top), 1.0f);
    D3DXMatrixTranslation (&matTranslation, X, Y, 0.0f);
    matTransform = matScaling * matTranslation;

    //Check if quad is rotated
    if (rotate)
    {
        D3DXMATRIX matRotate;

        //Create rotation matrix about the z-axis
        D3DXMatrixRotationZ (&matRotate, rotate);

        //Multiply matrices together
        matTransform *= matRotate;
    }

    //Draw the quad
    DEVICE->SetTransform (D3DTS_WORLD, &matTransform);
    DEVICE->SetTexture (0, texture);
    DEVICE->DrawPrimitive(D3DPT_TRIANGLEFAN, 0, 2);
}

NOTE: Replace DEVICE with the name of your IDirect3DDevice9* object

NOTE 2: Replace RESOLUTION_WIDTH with the width of the backbuffer

NOTE 3: Replace RESOLUTION_HEIGHT with the height of the backbuffer

Though slightly less powerful, this function should be faster than the ones presented in the article. It also has better-looking rotation.

Please see NoLocking.zip for a demonstration of this technique in action.
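If the coordinate juggling in Blit() seems opaque, the following sketch reproduces just the arithmetic of the scale-then-translate step on the CPU, with no D3DX involved and rotation ignored. Point, TransformCorner and ToPixels are hypothetical names used only here; the sketch assumes the unit quad from SetupQuad() (x from 0 to 1, y from 0 to -1) and the centred, y-up space produced by the orthographic projection:

```cpp
#include <cassert>

struct Point { float x; float y; };

// Where a unit-quad corner (qx in [0,1], qy in [-1,0]) ends up after the
// scaling and translation used by Blit(), ignoring rotation.
// The result is in the centred, y-up space of the orthographic projection.
Point TransformCorner(float qx, float qy,
                      long left, long top, long right, long bottom,
                      float backWidth, float backHeight) {
    const float scaleX = static_cast<float>(right - left);
    const float scaleY = static_cast<float>(bottom - top);
    const float transX = static_cast<float>(left) - backWidth / 2.0f;
    const float transY = -static_cast<float>(top) + backHeight / 2.0f;
    Point p;
    p.x = qx * scaleX + transX;   // scale first, then translate
    p.y = qy * scaleY + transY;
    return p;
}

// Convert from the centred, y-up space back to top-left pixel coordinates.
Point ToPixels(Point p, float backWidth, float backHeight) {
    Point out;
    out.x = p.x + backWidth / 2.0f;
    out.y = backHeight / 2.0f - p.y;
    return out;
}
```

Feeding the quad's bottom-right corner (1, -1) through TransformCorner() and then ToPixels() should give back rDest's right and bottom values, which is a handy sanity check when the quad shows up in the wrong place.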

Issue 4: Batching

It's time for a bit of batching. If you manage this well, this could potentially give you a rather large speed increase.

In the Blit() functions provided in the article, SetTexture() is called every time you want to draw something, even if you're drawing the same texture on screen many times in a row. You can save some valuable cycles by simply not calling it if you're drawing the same texture multiple times.
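One simple way to skip the redundant calls is to remember which texture was set last. The sketch below is not taken from the demos: TextureState is a hypothetical helper, and the counter stands in for the real DEVICE->SetTexture (0, texture) call so the logic can be tried without a device:

```cpp
#include <cassert>

// Hypothetical state tracker: remembers the last texture bound to stage 0.
struct TextureState {
    const void* current = nullptr;
    bool valid = false;
    int bindCalls = 0;   // counts how often the (simulated) SetTexture ran

    // Bind the texture only if it differs from the one already bound.
    void Use(const void* texture) {
        if (valid && texture == current)
            return;                  // same texture as last time: skip the call
        ++bindCalls;                 // real code: DEVICE->SetTexture (0, texture);
        current = texture;
        valid = true;
    }

    // Forget the cached texture, e.g. after the device is reset or other
    // code has changed the texture behind the tracker's back.
    void Invalidate() {
        current = nullptr;
        valid = false;
    }
};
```

The catch with this kind of caching is that every texture change has to go through the tracker; if anything else sets a texture directly, call Invalidate() afterwards.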

Another feature the blitting methods could use is setting up a source rectangle to blit from. This way you can have many images on one texture and draw all that are needed in the current frame with a single call to SetTexture(). Since texture coordinates are not specified in pixels (or texels), but rather as a value from 0.0f to 1.0f, a quick conversion must be done from pixel coordinates to texture coordinates.
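The conversion itself is just a division by the texture size. Here is a minimal sketch of it; TexCoords and ToTexCoords are names invented for this example, and the source rectangle is assumed to be inclusive-exclusive, like the destination rectangles above:

```cpp
#include <cassert>

struct TexCoords {
    float u0;   // left
    float v0;   // top
    float u1;   // right
    float v1;   // bottom
};

// Convert a source rectangle given in texels (inclusive-exclusive) into
// texture coordinates in the 0.0f to 1.0f range.
TexCoords ToTexCoords(long left, long top, long right, long bottom,
                      float texWidth, float texHeight) {
    assert(texWidth > 0.0f && texHeight > 0.0f);
    TexCoords t;
    t.u0 = static_cast<float>(left)   / texWidth;
    t.v0 = static_cast<float>(top)    / texHeight;
    t.u1 = static_cast<float>(right)  / texWidth;
    t.v1 = static_cast<float>(bottom) / texHeight;
    return t;
}
```

For a 64x64 tile starting at texel (64, 0) on a 256x256 texture, this gives u from 0.25 to 0.5 and v from 0.0 to 0.25.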

Furthermore, when drawing multiple quads from the same texture, it is possible to put them all in one large vertex buffer and draw them all in a single call. To do this, we must use the triangle-list primitive type, as opposed to the triangle fans used up to this point. To accommodate this, we will also need an index buffer so that we do not create duplicate vertices in the buffer. More information about index buffers can be found in the Two Kings tutorial here: http://www.two-kings.de/tutorials/d3d08/d3d08.html

So, first thing we need to do is add another vertex buffer and an index buffer, as well as some state information about the vertex buffer and texture. Put this in a scope visible to all of your graphics code:

//Vertex buffer and index buffer for batched drawing
IDirect3DVertexBuffer9* vertexBatchBuffer;
IDirect3DIndexBuffer9* indexBatchBuffer;

//Max amount of vertices that can be put in the batching buffer
const int BATCH_BUFFER_SIZE = 1000;

//Vertices currently in the batching buffer
int numBatchVertices;
TLVERTEX* batchVertices;

//Info on texture used for batched drawing
float batchTexWidth;
float batchTexHeight;

You can tweak the BATCH_BUFFER_SIZE constant to a number that works best with your app. The lower it is, the more often it has to be flushed. The higher it is, the longer it takes to lock. Make sure it's a multiple of four though, so you can completely fill it with quads.
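The relationship between the vertex count and the other buffer sizes is easy to get wrong when tweaking, so it may help to spell it out. The sketch below assumes four vertices and six 16-bit indices per quad; the helper names are made up for this example, and the compile-time checks are a suggestion rather than something the demos contain:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

const int BATCH_BUFFER_SIZE = 1000;   // vertices, same value as in the article

// Catch bad values at compile time instead of at draw time.
static_assert(BATCH_BUFFER_SIZE % 4 == 0,
              "buffer must hold a whole number of quads");
static_assert(BATCH_BUFFER_SIZE <= 65536,
              "16-bit indices cannot address more than 65536 vertices");

// Four vertices per quad.
constexpr int QuadsPerBatch(int vertices) { return vertices / 4; }

// Two triangles, i.e. six indices, per quad.
constexpr int IndicesPerBatch(int vertices) {
    return QuadsPerBatch(vertices) * 6;
}

// Size in bytes of an index buffer holding 16-bit indices.
constexpr std::size_t IndexBufferBytes(int vertices) {
    return static_cast<std::size_t>(IndicesPerBatch(vertices))
           * sizeof(std::uint16_t);
}
```

With 1000 vertices that works out to 250 quads, 1500 indices and 3000 bytes of index data, or three bytes of index buffer per vertex.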

Now we need to initialize our new buffers. Do this around the same time as you initialize the other vertex buffer (just make sure it's after the device is created):

//Create batching vertex and index buffers
d3dDevice->CreateVertexBuffer(BATCH_BUFFER_SIZE * sizeof(TLVERTEX),
    D3DUSAGE_WRITEONLY, D3DFVF_TLVERTEX, D3DPOOL_MANAGED, &vertexBatchBuffer, NULL);
d3dDevice->CreateIndexBuffer (BATCH_BUFFER_SIZE * 3, D3DUSAGE_WRITEONLY,
    D3DFMT_INDEX16, D3DPOOL_MANAGED, &indexBatchBuffer, NULL);
numBatchVertices = 0;

You may have noticed that the vertex and index buffers are both static. The vertex buffer is static because there will likely be a lot of switching between it and our original vertex buffer. The index buffer is static because we only change its contents once. Also, by making them static, we can keep them in the managed pool which makes for easier handling of alt-tabbing in a fullscreen app.

Of course, you must remember to set the buffers free when you're done with them. Add this to your cleanup routine:

//Release batching buffers
if (vertexBatchBuffer)
    vertexBatchBuffer->Release ();
if (indexBatchBuffer)
    indexBatchBuffer->Release ();

Since you'll only be putting quads in the vertex buffer, you'll only need to fill up the index buffer once and never modify it again. Use this function to fill it (call it after the index buffer has been initialized):

//Fill the index buffer
void FillIndexBuffer ()
{
    int index = 0;
    short* indices = NULL;

    //Lock index buffer
    indexBatchBuffer->Lock(0, BATCH_BUFFER_SIZE  * 3,
                           (void**) &indices, 0);

    for (int vertex = 0; vertex < BATCH_BUFFER_SIZE; vertex += 4)
    {
        indices[index] = vertex;
        indices[index + 1] = vertex + 2;
        indices[index + 2] = vertex + 3;
        indices[index + 3] = vertex;
        indices[index + 4] = vertex + 1;
        indices[index + 5] = vertex + 2;
        index += 6;
    }

    //Unlock index buffer
    indexBatchBuffer->Unlock ();
}
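To check the index pattern without touching a real index buffer, the same loop can be run against a std::vector. This is only a sketch for experimenting: BuildQuadIndices is a hypothetical helper, not part of the demo code, but it uses the same vertex ordering as FillIndexBuffer() above:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Build the index list for (vertexCount / 4) quads, two triangles each,
// using the same ordering as FillIndexBuffer().
std::vector<std::uint16_t> BuildQuadIndices(int vertexCount) {
    assert(vertexCount >= 0 && vertexCount % 4 == 0 && vertexCount <= 65536);

    // Offsets of the six indices of one quad, relative to its first vertex.
    static const int offsets[6] = { 0, 2, 3, 0, 1, 2 };

    std::vector<std::uint16_t> indices;
    indices.reserve(static_cast<std::size_t>(vertexCount) / 4 * 6);

    for (int vertex = 0; vertex < vertexCount; vertex += 4) {
        for (int i = 0; i < 6; ++i)
            indices.push_back(static_cast<std::uint16_t>(vertex + offsets[i]));
    }
    return indices;
}
```

For two quads (eight vertices) this yields 0, 2, 3, 0, 1, 2 followed by 4, 6, 7, 4, 5, 6: each quad reuses two of its four vertices, which is exactly the duplication the index buffer saves.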

Alright. Now you just have to let your computer know you're ready to batch a bunch of quads together. Call this function every time you want to draw a series of images from a single texture:

//Get ready for batch drawing
void BeginBatchDrawing (IDirect3DTexture9* texture)
{
    D3DXMATRIX matIdentity;
    D3DSURFACE_DESC surfDesc;

    //Lock the batching vertex buffer
    numBatchVertices = 0;
    vertexBatchBuffer->Lock (0, BATCH_BUFFER_SIZE * sizeof(TLVERTEX),
                             (void **) &batchVertices, 0);

    //Get texture dimensions
    texture->GetLevelDesc (0, &surfDesc);
    batchTexWidth = (float) surfDesc.Width;
    batchTexHeight = (float) surfDesc.Height;

    //Set texture
    d3dDevice->SetTexture (0, texture);

    //Set world matrix to an identity matrix
    D3DXMatrixIdentity (&matIdentity);
    d3dDevice->SetTransform (D3DTS_WORLD, &matIdentity);

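    //Set stream source to batch buffer
    d3dDevice->SetStreamSource (0, vertexBatchBuffer,
                                0, sizeof(TLVERTEX));

    //Bind the index buffer used by DrawIndexedPrimitive()
    //(not needed here if you already set it once at initialization)
    d3dDevice->SetIndices (indexBatchBuffer);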
}

You should probably make this function set a flag somewhere so the program can tell it's in the middle of a batch drawing process. I'm leaving it out of the example to keep things simple (the demos already have far too many global variables for my liking).

Now it's time to put some quads in the vertex buffer. This function will do just that for you:

//Add a quad to the batching buffer
void AddQuad (RECT* rSource, RECT* rDest, D3DCOLOR colour)
{
    float X;
    float Y;
    float destWidth;
    float destHeight;

    //Calculate coordinates
    X = rDest->left - (float)(d3dPresent.BackBufferWidth) / 2;
    Y = -rDest->top + (float)(d3dPresent.BackBufferHeight) / 2; 
    destWidth = (float)(rDest->right - rDest->left);
    destHeight = (float)(rDest->bottom - rDest->top);

    //Setup vertices in buffer
    batchVertices[numBatchVertices].colour = colour;
    batchVertices[numBatchVertices].x = X;
    batchVertices[numBatchVertices].y = Y;
    batchVertices[numBatchVertices].z = 1.0f;
    batchVertices[numBatchVertices].u = rSource->left / batchTexWidth;
    batchVertices[numBatchVertices].v = rSource->top / batchTexHeight;
    
    batchVertices[numBatchVertices + 1].colour = colour;
    batchVertices[numBatchVertices + 1].x = X + destWidth;
    batchVertices[numBatchVertices + 1].y = Y;
    batchVertices[numBatchVertices + 1].z = 1.0f;
    batchVertices[numBatchVertices + 1].u = rSource->right / batchTexWidth;
    batchVertices[numBatchVertices + 1].v = rSource->top / batchTexHeight;

    batchVertices[numBatchVertices + 2].colour = colour;
    batchVertices[numBatchVertices + 2].x = X + destWidth;
    batchVertices[numBatchVertices + 2].y = Y - destHeight;
    batchVertices[numBatchVertices + 2].z = 1.0f;
    batchVertices[numBatchVertices + 2].u = rSource->right / batchTexWidth;
    batchVertices[numBatchVertices + 2].v = rSource->bottom / batchTexHeight;

    batchVertices[numBatchVertices + 3].colour = colour;
    batchVertices[numBatchVertices + 3].x = X;
    batchVertices[numBatchVertices + 3].y = Y - destHeight;
    batchVertices[numBatchVertices + 3].z = 1.0f;
    batchVertices[numBatchVertices + 3].u = rSource->left / batchTexWidth;
    batchVertices[numBatchVertices + 3].v = rSource->bottom / batchTexHeight;

    //Increase vertex count
    numBatchVertices += 4;

    //Flush buffer if it's full
    if (numBatchVertices == BATCH_BUFFER_SIZE)
    {
        //Unlock vertex buffer
        vertexBatchBuffer->Unlock();
        
        //Draw quads in the buffer
        d3dDevice->DrawIndexedPrimitive (D3DPT_TRIANGLELIST, 0, 0,
              numBatchVertices, 0, numBatchVertices / 2);        

        //Reset vertex count        
        numBatchVertices = 0;        

        //Lock vertex buffer
        vertexBatchBuffer->Lock (0, BATCH_BUFFER_SIZE * sizeof(TLVERTEX),
              (void **) &batchVertices, 0);
    }

}

As you can see, the function has basic colour modulation functionality. It can easily be extended to behave similarly to the BlitExD3D() function from the article, but I'll leave that up to you. Also, when the vertex buffer gets full, this function automatically draws whatever's inside it, and prepares to receive a new batch of quads.

When you're done adding quads, call this function to draw whatever is left in the buffer and switch back to the regular vertex buffer:

//Finish batch drawing
void EndBatchDrawing()
{
    //Unlock vertex buffer
    vertexBatchBuffer->Unlock();

    //Draw the quads in the buffer if it wasn't just flushed
    if (numBatchVertices)
        d3dDevice->DrawIndexedPrimitive (D3DPT_TRIANGLELIST, 0, 0,
              numBatchVertices, 0, numBatchVertices / 2);

    //Set stream source to regular buffer
    d3dDevice->SetStreamSource (0, vertexBuffer, 0, sizeof(TLVERTEX));

    //Reset vertex count        
    numBatchVertices = 0;        
}

A demonstration of batching building on the NoLocking.zip app can be found in Batching.zip. The batching code presented relies heavily upon global variables and has very little error checking. I urge the reader to encapsulate the batching functionality in a class, and add a lot of error checking (asserting that the vertex buffer is not overflowing in AddQuad() would be a good start).
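As a starting point for that encapsulation, here is a rough sketch of what the bookkeeping side of such a class might look like. It is deliberately CPU-only so it can be tried without a device: QuadBatcher and BatchVertex are hypothetical names, the vertices live in a std::vector instead of a locked vertex buffer, and Flush() only counts where the real code would unlock, call DrawIndexedPrimitive() and lock again:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for TLVERTEX (std::uint32_t plays the role of D3DCOLOR).
struct BatchVertex {
    float x, y, z;
    std::uint32_t colour;
    float u, v;
};

// Hypothetical batcher that owns its state instead of using globals.
class QuadBatcher {
public:
    explicit QuadBatcher(std::size_t capacityVertices)
        : capacity_(capacityVertices) {
        assert(capacity_ >= 4 && capacity_ % 4 == 0);
        vertices_.reserve(capacity_);
    }

    void Begin() {
        assert(!active_);            // catches nested Begin() calls
        active_ = true;
        vertices_.clear();
    }

    void AddQuad(const BatchVertex (&corners)[4]) {
        assert(active_);                              // must be inside Begin/End
        assert(vertices_.size() + 4 <= capacity_);    // overflow check
        vertices_.insert(vertices_.end(), corners, corners + 4);
        if (vertices_.size() == capacity_)
            Flush();                                  // buffer full: draw now
    }

    void End() {
        assert(active_);
        if (!vertices_.empty())
            Flush();                 // draw whatever was not flushed yet
        active_ = false;
    }

    bool Active() const { return active_; }
    std::size_t FlushCount() const { return flushCount_; }
    std::size_t QuadsDrawn() const { return quadsDrawn_; }

private:
    void Flush() {
        // Real code: Unlock(), DrawIndexedPrimitive(), then Lock() again.
        ++flushCount_;
        quadsDrawn_ += vertices_.size() / 4;
        vertices_.clear();
    }

    std::size_t capacity_;
    std::vector<BatchVertex> vertices_;
    bool active_ = false;
    std::size_t flushCount_ = 0;
    std::size_t quadsDrawn_ = 0;
};
```

The asserts double as the "in the middle of a batch" flag mentioned earlier: calling AddQuad() outside of a Begin()/End() pair trips immediately in a debug build instead of scribbling over an unlocked buffer.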


