This appendix addresses a few issues that were not sufficiently explained in the article (or in some cases were left out altogether). The article is intended for beginners using Direct3D for 2D applications. For this reason, when writing the code I decided to go with flexibility over speed in many areas. However, I neglected to mention other, faster methods. I will attempt to do so here, as well as point out some omissions from the sample code.

Issue 1: Cleanup

This is just a little one, but apparently not very well known. When a call is made to IDirect3DDevice9::SetStreamSource(), a reference count is increased in the passed vertex buffer. If you do not call SetStreamSource() with another vertex buffer, or NULL, calling Release() on the previously passed vertex buffer will not actually free it, because the device is still holding a reference. This is just a small memory leak, and is cleaned up by DirectX upon program termination anyway. However, if you're using a lot of different vertex buffers, you will notice some slowdown as your video memory is consumed. Also, it's just good programming practice to release everything properly yourself. For you copy-and-pasters, you'll want to add the following to your DirectX cleanup code:
//Clear stream source
DEVICE->SetStreamSource (0, NULL, 0, 0);
It must be before this:
//Release vertex buffer
if (vertexBuffer)
    vertexBuffer->Release ();
NOTE: Replace DEVICE with the name of your IDirect3DDevice9* object

And voila, no memory lost.

Issue 2: Direct3D Uses Inclusive-Exclusive Coordinates

Wow, I can't believe people complained about this one. It's fairly common knowledge. The left and top edges of a rectangle are inclusive, while the right and bottom edges are exclusive: they name the first column and row just outside the rectangle. So when you're setting up a destination rectangle, the bottom and right coordinates are:

RECT.bottom = RECT.top + height
RECT.right = RECT.left + width

You even save two cycles over inclusive-inclusive coordinates by losing the "- 1"s.

Issue 3: Locking the Vertex Buffer

Here's where we get into some optimization. Locking is more expensive than simply using translation and scaling matrices to move the vertices of the quad around, but it also affords more flexibility. If you want to use colour modulation, you're pretty much stuck locking the vertex buffer in order to get access to that sweet, sweet colour member (don't worry, you're probably not losing too much processing time). However, if you're not using colour modulation and want an extra bit of speed, read on.

This method uses transformation matrices to move around the vertices in the buffer. Understand that this method is not particularly compatible with the drawing methods presented in the first article. In fact, it requires a change to a fairly major item from the first article: the vertex format. You should use this one instead:

//Custom vertex format
const DWORD D3DFVF_TLVERTEX = D3DFVF_XYZ | D3DFVF_DIFFUSE | D3DFVF_TEX1;

//Custom vertex
struct TLVERTEX
{
    float x;
    float y;
    float z;
    D3DCOLOR colour;
    float u;
    float v;
};

In order to use matrix transformations, an orthographic projection matrix must first be set up. This must be done at initialization, after the device is open for business. Here is the code:
D3DXMATRIX matOrtho;
D3DXMATRIX matIdentity;
//Setup orthographic projection matrix
D3DXMatrixOrthoLH (&matOrtho, RESOLUTION_WIDTH, RESOLUTION_HEIGHT, 1.0f, 10.0f);
D3DXMatrixIdentity (&matIdentity);
DEVICE->SetTransform (D3DTS_PROJECTION, &matOrtho);
DEVICE->SetTransform (D3DTS_WORLD, &matIdentity);
DEVICE->SetTransform (D3DTS_VIEW, &matIdentity);
NOTE: Replace DEVICE with the name of your IDirect3DDevice9* object
NOTE 2: Replace RESOLUTION_WIDTH with the width of the backbuffer
NOTE 3: Replace RESOLUTION_HEIGHT with the height of the backbuffer

Please note that this method does not allow colour modulation (though a similar effect can be achieved with a simple vertex shader). It simply uses the vertex colours that are already specified in the buffer. Before using this method, we must ensure that the colour values in the buffer are all white, and that there is a valid quad in the buffer. This function will see to that:

//Setup the quad
void SetupQuad ()
{
    TLVERTEX* vertices = NULL;
    vertexBuffer->Lock(0, 4 * sizeof(TLVERTEX), (VOID**)&vertices, 0);

    //Setup vertices (a unit quad with its top-left corner at the origin)
    vertices[0].colour = 0xffffffff;
    vertices[0].x = 0.0f;
    vertices[0].y = 0.0f;
    vertices[0].z = 1.0f;
    vertices[0].u = 0.0f;
    vertices[0].v = 0.0f;

    vertices[1].colour = 0xffffffff;
    vertices[1].x = 1.0f;
    vertices[1].y = 0.0f;
    vertices[1].z = 1.0f;
    vertices[1].u = 1.0f;
    vertices[1].v = 0.0f;

    vertices[2].colour = 0xffffffff;
    vertices[2].x = 1.0f;
    vertices[2].y = -1.0f;
    vertices[2].z = 1.0f;
    vertices[2].u = 1.0f;
    vertices[2].v = 1.0f;

    vertices[3].colour = 0xffffffff;
    vertices[3].x = 0.0f;
    vertices[3].y = -1.0f;
    vertices[3].z = 1.0f;
    vertices[3].u = 0.0f;
    vertices[3].v = 1.0f;

    vertexBuffer->Unlock();
}

If you are not using colour modulation at all in your program, this only needs to be called once, upon program startup.
Here is code to actually draw the textured quad:

//Draw a textured quad on the backbuffer
void Blit(IDirect3DTexture9* texture, RECT* rDest, float rotate)
{
    float X;
    float Y;
    D3DXMATRIX matTranslation;
    D3DXMATRIX matScaling;
    D3DXMATRIX matTransform;

    //Get coordinates (origin at the centre of the screen, y pointing up)
    X = rDest->left - (float)(RESOLUTION_WIDTH) / 2;
    Y = -rDest->top + (float)(RESOLUTION_HEIGHT) / 2;

    //Setup translation and scaling matrices
    D3DXMatrixScaling (&matScaling, (float)(rDest->right - rDest->left),
                       (float)(rDest->bottom - rDest->top), 1.0f);
    D3DXMatrixTranslation (&matTranslation, X, Y, 0.0f);
    matTransform = matScaling * matTranslation;

    //Check if quad is rotated
    if (rotate != 0.0f)
    {
        D3DXMATRIX matRotate;

        //Create rotation matrix about the z-axis
        D3DXMatrixRotationZ (&matRotate, rotate);

        //Multiply matrices together (the rotation is applied after the
        //translation, so it is about the world origin, i.e. the screen centre)
        matTransform *= matRotate;
    }

    //Draw the quad
    DEVICE->SetTransform (D3DTS_WORLD, &matTransform);
    DEVICE->SetTexture (0, texture);
    DEVICE->DrawPrimitive(D3DPT_TRIANGLEFAN, 0, 2);
}

NOTE: Replace DEVICE with the name of your IDirect3DDevice9* object
NOTE 2: Replace RESOLUTION_WIDTH with the width of the backbuffer
NOTE 3: Replace RESOLUTION_HEIGHT with the height of the backbuffer

Though slightly less powerful, this function should be faster than the ones presented in the article. It also has better-looking rotation. Please see NoLocking.zip for a demonstration of this technique in action.

Issue 4: Batching

It's time for a bit of batching. If you manage this well, this could potentially give you a rather large speed increase. In the Blit() functions provided in the article, SetTexture() is called every time you want to draw something, even if you're drawing the same texture on screen many times in a row. You can save some valuable cycles by simply not calling it if you're drawing the same texture multiple times. Another feature the blitting methods could use is setting up a source rectangle to blit from.
This way you can have many images on one texture and draw all that are needed in the current frame with a single call to SetTexture(). Since texture coordinates are not specified in pixels (or texels), but rather as a value from 0.0f to 1.0f, a quick conversion must be done from pixel coordinates to texture coordinates: divide the x coordinate by the texture width and the y coordinate by the texture height.

Furthermore, when drawing multiple quads from the same texture, it is possible to put them all in one large vertex buffer and draw them all in a single call. To do this, we must use the triangle-list primitive type, as opposed to the triangle fans used up to this point. To accommodate this, we will also need an index buffer so we do not create duplicate vertices in the buffer. More information about index buffers can be found in the Two Kings tutorial here: http://www.two-kings.de/tutorials/d3d08/d3d08.html

So, the first thing we need to do is add another vertex buffer and an index buffer, as well as some state information about the vertex buffer and texture. Put this in a scope visible to all of your graphics code:

//Vertex buffer and index buffer for batched drawing
IDirect3DVertexBuffer9* vertexBatchBuffer;
IDirect3DIndexBuffer9* indexBatchBuffer;

//Max amount of vertices that can be put in the batching buffer
const int BATCH_BUFFER_SIZE = 1000;

//Vertices currently in the batching buffer
int numBatchVertices;
TLVERTEX* batchVertices;

//Info on texture used for batched drawing
float batchTexWidth;
float batchTexHeight;

You can tweak the BATCH_BUFFER_SIZE constant to a number that works best with your app. The lower it is, the more often it has to be flushed. The higher it is, the longer it takes to lock. Make sure it's a multiple of four though, so you can completely fill it with quads, and keep it at or below 65536, since the index buffer uses 16-bit indices.

Now we need to initialize our new buffers. Do this around the same time as you initialize the other vertex buffer (just make sure it's after the device is created):
//Create batching vertex and index buffers
d3dDevice->CreateVertexBuffer(BATCH_BUFFER_SIZE * sizeof(TLVERTEX),
    D3DUSAGE_WRITEONLY, D3DFVF_TLVERTEX, D3DPOOL_MANAGED, &vertexBatchBuffer, NULL);
//Index buffer length is in bytes: each quad (4 vertices) needs
//6 indices of 2 bytes each, which works out to 3 bytes per vertex
d3dDevice->CreateIndexBuffer (BATCH_BUFFER_SIZE * 3, D3DUSAGE_WRITEONLY,
    D3DFMT_INDEX16, D3DPOOL_MANAGED, &indexBatchBuffer, NULL);
numBatchVertices = 0;
You may have noticed that the vertex and index buffers are both static. The vertex buffer is static because there will likely be a lot of switching between it and our original vertex buffer. The index buffer is static because we only change its contents once. Also, by making them static, we can keep them in the managed pool which makes for easier handling of alt-tabbing in a fullscreen app. Of course, you must remember to set the buffers free when you're done with them. Add this to your cleanup routine:
//Release batching buffers
if (vertexBatchBuffer)
    vertexBatchBuffer->Release ();
if (indexBatchBuffer)
    indexBatchBuffer->Release ();
Since you'll only be putting quads in the vertex buffer, you'll only need to fill up the index buffer once and never modify it again. Use this function to fill it (call it after the index buffer has been initialized):
//Fill the index buffer
void FillIndexBuffer ()
{
    int index = 0;
    short* indices = NULL;

    //Lock index buffer
    indexBatchBuffer->Lock(0, BATCH_BUFFER_SIZE * 3,
                           (void**) &indices, 0);

    for (int vertex = 0; vertex < BATCH_BUFFER_SIZE; vertex += 4)
    {
        indices[index] = vertex;
        indices[index + 1] = vertex + 2;
        indices[index + 2] = vertex + 3;
        indices[index + 3] = vertex;
        indices[index + 4] = vertex + 1;
        indices[index + 5] = vertex + 2;
        index += 6;
    }

    //Unlock index buffer
    indexBatchBuffer->Unlock ();
}
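If you want to convince yourself that the index pattern and the buffer size line up, the arithmetic can be checked without a device. What follows is only a minimal sketch, not code from the demos: BuildQuadIndices is a hypothetical helper that writes the same six-indices-per-quad pattern into a std::vector instead of a locked index buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the article's demos): builds the same
// index pattern as FillIndexBuffer() for a buffer of maxVertices vertices.
std::vector<std::uint16_t> BuildQuadIndices(std::size_t maxVertices)
{
    // Quads use four vertices each, and 16-bit indices can only
    // address vertices 0..65535.
    assert(maxVertices % 4 == 0);
    assert(maxVertices <= 65536);

    std::vector<std::uint16_t> indices;
    indices.reserve(maxVertices / 4 * 6);

    for (std::size_t vertex = 0; vertex < maxVertices; vertex += 4)
    {
        const std::uint16_t base = static_cast<std::uint16_t>(vertex);

        // First triangle: top-left, bottom-right, bottom-left
        indices.push_back(base);
        indices.push_back(static_cast<std::uint16_t>(base + 2));
        indices.push_back(static_cast<std::uint16_t>(base + 3));

        // Second triangle: top-left, top-right, bottom-right
        indices.push_back(base);
        indices.push_back(static_cast<std::uint16_t>(base + 1));
        indices.push_back(static_cast<std::uint16_t>(base + 2));
    }
    return indices;
}
```

With 1000 vertices this gives 250 quads and 1500 indices. At two bytes per index that is 3000 bytes, which is where the BATCH_BUFFER_SIZE * 3 passed to CreateIndexBuffer() and Lock() comes from.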
Alright. Now you just have to let your computer know you're ready to batch a bunch of quads together. Call this function every time you want to draw a series of images from a single texture:

//Get ready for batch drawing
void BeginBatchDrawing (IDirect3DTexture9* texture)
{
    D3DXMATRIX matIdentity;
    D3DSURFACE_DESC surfDesc;

    //Lock the batching vertex buffer
    numBatchVertices = 0;
    vertexBatchBuffer->Lock (0, BATCH_BUFFER_SIZE * sizeof(TLVERTEX),
                             (void **) &batchVertices, 0);

    //Get texture dimensions
    texture->GetLevelDesc (0, &surfDesc);
    batchTexWidth = (float) surfDesc.Width;
    batchTexHeight = (float) surfDesc.Height;

    //Set texture
    d3dDevice->SetTexture (0, texture);

    //Set world matrix to an identity matrix
    D3DXMatrixIdentity (&matIdentity);
    d3dDevice->SetTransform (D3DTS_WORLD, &matIdentity);

    //Set stream source to batch buffer
    d3dDevice->SetStreamSource (0, vertexBatchBuffer, 0, sizeof(TLVERTEX));

    //Set index buffer (needed by DrawIndexedPrimitive; harmless if
    //your code already sets it elsewhere)
    d3dDevice->SetIndices (indexBatchBuffer);
}

You should probably make this function set a flag somewhere so the program can tell it's in the middle of a batch drawing process. I'm leaving it out of the example to keep things simple (the demos already have far too many global variables for my liking).

Now it's time to put some quads in the vertex buffer. This function will do just that for you:
//Add a quad to the batching buffer
void AddQuad (RECT* rSource, RECT* rDest, D3DCOLOR colour)
{
    float X;
    float Y;
    float destWidth;
    float destHeight;

    //Calculate coordinates
    X = rDest->left - (float)(d3dPresent.BackBufferWidth) / 2;
    Y = -rDest->top + (float)(d3dPresent.BackBufferHeight) / 2;
    destWidth = (float)(rDest->right - rDest->left);
    destHeight = (float)(rDest->bottom - rDest->top);

    //Setup vertices in buffer
    batchVertices[numBatchVertices].colour = colour;
    batchVertices[numBatchVertices].x = X;
    batchVertices[numBatchVertices].y = Y;
    batchVertices[numBatchVertices].z = 1.0f;
    batchVertices[numBatchVertices].u = rSource->left / batchTexWidth;
    batchVertices[numBatchVertices].v = rSource->top / batchTexHeight;

    batchVertices[numBatchVertices + 1].colour = colour;
    batchVertices[numBatchVertices + 1].x = X + destWidth;
    batchVertices[numBatchVertices + 1].y = Y;
    batchVertices[numBatchVertices + 1].z = 1.0f;
    batchVertices[numBatchVertices + 1].u = rSource->right / batchTexWidth;
    batchVertices[numBatchVertices + 1].v = rSource->top / batchTexHeight;

    batchVertices[numBatchVertices + 2].colour = colour;
    batchVertices[numBatchVertices + 2].x = X + destWidth;
    batchVertices[numBatchVertices + 2].y = Y - destHeight;
    batchVertices[numBatchVertices + 2].z = 1.0f;
    batchVertices[numBatchVertices + 2].u = rSource->right / batchTexWidth;
    batchVertices[numBatchVertices + 2].v = rSource->bottom / batchTexHeight;

    batchVertices[numBatchVertices + 3].colour = colour;
    batchVertices[numBatchVertices + 3].x = X;
    batchVertices[numBatchVertices + 3].y = Y - destHeight;
    batchVertices[numBatchVertices + 3].z = 1.0f;
    batchVertices[numBatchVertices + 3].u = rSource->left / batchTexWidth;
    batchVertices[numBatchVertices + 3].v = rSource->bottom / batchTexHeight;

    //Increase vertex count
    numBatchVertices += 4;

    //Flush buffer if it's full
    if (numBatchVertices == BATCH_BUFFER_SIZE)
    {
        //Unlock vertex buffer
        vertexBatchBuffer->Unlock();

        //Draw quads in the buffer
        d3dDevice->DrawIndexedPrimitive (D3DPT_TRIANGLELIST, 0, 0,
                                         numBatchVertices, 0, numBatchVertices / 2);

        //Reset vertex count
        numBatchVertices = 0;

        //Lock vertex buffer
        vertexBatchBuffer->Lock (0, BATCH_BUFFER_SIZE * sizeof(TLVERTEX),
                                 (void **) &batchVertices, 0);
    }
}
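The coordinate arithmetic in AddQuad() is easy to get wrong by a sign, so here is a minimal sketch that isolates it from Direct3D. Everything in it is a stand-in of my own for illustration: RectPx plays the role of RECT, QuadCorner holds just the position and texture coordinates, and ComputeQuadCorners is a hypothetical helper, not a function from the demos.

```cpp
#include <cassert>

// Plain stand-ins so the sketch compiles without the Windows headers.
struct RectPx { long left, top, right, bottom; };  // plays the role of RECT
struct QuadCorner { float x, y, u, v; };

// Hypothetical helper: computes the four corners that AddQuad() would write.
// World space has its origin at the centre of the screen with y pointing up;
// texture coordinates are pixel coordinates divided by the texture size.
void ComputeQuadCorners(const RectPx& src, const RectPx& dest,
                        float backBufferWidth, float backBufferHeight,
                        float texWidth, float texHeight,
                        QuadCorner out[4])
{
    assert(texWidth > 0.0f && texHeight > 0.0f);

    // Top-left corner of the destination in world space
    const float x = dest.left - backBufferWidth / 2.0f;
    const float y = -dest.top + backBufferHeight / 2.0f;

    // right and bottom are exclusive, so no "- 1" is needed here
    const float w = static_cast<float>(dest.right - dest.left);
    const float h = static_cast<float>(dest.bottom - dest.top);

    // Pixel coordinates to texture coordinates in the 0.0f..1.0f range
    const float u0 = src.left / texWidth;
    const float u1 = src.right / texWidth;
    const float v0 = src.top / texHeight;
    const float v1 = src.bottom / texHeight;

    out[0] = { x,     y,     u0, v0 };  // top-left
    out[1] = { x + w, y,     u1, v0 };  // top-right
    out[2] = { x + w, y - h, u1, v1 };  // bottom-right
    out[3] = { x,     y - h, u0, v1 };  // bottom-left
}
```

For example, a 64x32 destination at the top-left of an 800x600 backbuffer has its first corner at (-400, 300) and its third at (-336, 268). Note that y decreases as you move down the screen.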
As you can see, AddQuad() has basic colour modulation functionality. It can easily be extended to behave similarly to the BlitExD3D() function from the article, but I'll leave that up to you. Also, when the vertex buffer gets full, this function automatically draws whatever's inside it, and prepares to receive a new batch of quads.

When you're done adding quads, call this function to draw whatever is left in the buffer and switch back to the regular vertex buffer:

//Finish batch drawing
void EndBatchDrawing()
{
    //Unlock vertex buffer
    vertexBatchBuffer->Unlock();

    //Draw the quads in the buffer if it wasn't just flushed
    if (numBatchVertices)
        d3dDevice->DrawIndexedPrimitive (D3DPT_TRIANGLELIST, 0, 0,
                                         numBatchVertices, 0, numBatchVertices / 2);

    //Set stream source to regular buffer
    d3dDevice->SetStreamSource (0, vertexBuffer, 0, sizeof(TLVERTEX));

    //Reset vertex count
    numBatchVertices = 0;
}

A demonstration of batching building on the NoLocking.zip app can be found in Batching.zip. The batching code presented relies heavily upon global variables and has very little error checking. I urge the reader to encapsulate the batching functionality in a class, and add a lot of error checking (asserting that the vertex buffer is not overflowing in AddQuad() would be a good start).
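To give an idea of what such a class might look like, here is a minimal sketch of the bookkeeping only. It is not the code from Batching.zip, and it deliberately leaves Direct3D out: QuadBatch, BatchVertex and the flush callback are all hypothetical names of my own. In a real app the callback is where you would unlock the vertex buffer, call DrawIndexedPrimitive() and lock the buffer again.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Stand-in for TLVERTEX so the sketch compiles without Direct3D headers.
struct BatchVertex { float x, y, z; unsigned long colour; float u, v; };

// Hypothetical wrapper around the batching state that the article keeps
// in global variables.
class QuadBatch
{
public:
    // Called with the pending vertices whenever the batch must be drawn.
    using FlushFn = std::function<void(const BatchVertex*, std::size_t)>;

    QuadBatch(std::size_t capacity, FlushFn flush)
        : vertices_(capacity), count_(0), active_(false), flush_(std::move(flush))
    {
        // The buffer must hold a whole number of quads.
        assert(capacity >= 4 && capacity % 4 == 0);
    }

    void Begin()
    {
        assert(!active_);  // no nested batches
        active_ = true;
        count_ = 0;
    }

    void AddQuad(const BatchVertex (&quad)[4])
    {
        assert(active_);                         // Begin() was called
        assert(count_ + 4 <= vertices_.size());  // no overflow
        for (const BatchVertex& v : quad)
            vertices_[count_++] = v;
        if (count_ == vertices_.size())
            Flush();  // buffer is full: draw and start over
    }

    void End()
    {
        assert(active_);
        Flush();  // draw whatever is left
        active_ = false;
    }

    std::size_t Pending() const { return count_; }

private:
    void Flush()
    {
        if (count_ != 0)
            flush_(vertices_.data(), count_);
        count_ = 0;
    }

    std::vector<BatchVertex> vertices_;
    std::size_t count_;
    bool active_;
    FlushFn flush_;
};
```

The active_ flag is the "in the middle of a batch" flag mentioned earlier, and the asserts catch the two mistakes the global-variable version lets through silently: adding quads outside a batch and writing past the end of the buffer.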