Occlusion Culling Using DirectX 9
by Dustin Franklin

Introduction to Occlusion Culling

Object culling is an important facet of graphics programming. Rendering objects that will never be visible to the user wastes processing time. The culling process itself must also be optimized, however, or it can consume more processing time than it saves.

Numerous traditional culling methods have proven fast and adequate for standard situations, yet they still leave much to be desired. Some cull too many objects, and others do not cull enough.

The theory of occlusion culling stems from the fact that even though an object is inside the camera frustum, it could still be hidden and out of view.

Diagram 1.1: Example scene layout

Diagram 1.2: Example scene render

Here, as Diagram 1.1 shows, five primitives are displayed in a scene. However, in the final render (Diagram 1.2), only three of them are actually visible. Even though the other two objects turn out to be hidden, they are still rendered, which wastes time. A simple frustum-based culling procedure would still render them, since they are inside the camera’s view.

Occlusion-based culling procedures are used to determine which objects will actually be visible. Only those objects are rendered, which saves the time otherwise spent drawing hidden geometry. An occluder is an object that hides other objects (for example, the large red box in Diagram 1.1). Half-occluded objects are partly visible (the blue pentagon and purple wedge), and are still rendered. Fully-occluded objects are completely hidden (the green sphere and orange box), and are excluded from being rendered.

For more background information on occlusion culling, please refer to Occlusion Culling Algorithms, by Tomas Möller and Eric Haines.

Introduction to IDirect3DQuery9

The IDirect3DQuery9 interface is one of the new features of DirectX 9. It allows developers to access a wealth of statistics, including optimization information, objects handled by the resource manager, and triangle processing.

IDirect3DQuery9 can also perform occlusion queries, which count the number of pixels visible on the screen. Only pixels rendered between the start and the end of the query are included in this count. If the result is zero, the vertices rendered are fully occluded, meaning they are not visible from the current camera position. Conversely, if the result is greater than zero, at least part of the rendered geometry is visible to the user.
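To make the counting rule concrete, here is a minimal CPU-side sketch of what an occlusion query measures. It is not Direct3D code: the `DepthBuffer` class and its `beginQuery`, `endQuery`, and `drawRect` members are hypothetical stand-ins for a Z-buffer and for the bracket that `IDirect3DQuery9::Issue` opens with `D3DISSUE_BEGIN` and closes with `D3DISSUE_END`. The value returned by `endQuery` plays the role of the DWORD read back through `IDirect3DQuery9::GetData`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy software depth buffer (an illustration, not the Direct3D API).
// Smaller z is closer to the camera; 1.0 is the far plane.
class DepthBuffer {
public:
    DepthBuffer(int width, int height)
        : width_(width), height_(height),
          depth_(static_cast<std::size_t>(width) * static_cast<std::size_t>(height), 1.0f) {
        assert(width > 0 && height > 0);
    }

    // Start counting pixels that pass the depth test.
    void beginQuery() { counting_ = true; passed_ = 0; }

    // Stop counting and return how many pixels passed since beginQuery().
    unsigned endQuery() { counting_ = false; return passed_; }

    // Rasterize the rectangle [x0, x1) x [y0, y1) at constant depth z,
    // clipped to the buffer. A pixel passes when z <= the stored depth.
    void drawRect(int x0, int y0, int x1, int y1, float z) {
        for (int y = std::max(y0, 0); y < std::min(y1, height_); ++y) {
            for (int x = std::max(x0, 0); x < std::min(x1, width_); ++x) {
                float& stored = depth_[static_cast<std::size_t>(y) * width_ + x];
                if (z <= stored) {
                    stored = z;
                    if (counting_) ++passed_;
                }
            }
        }
    }

private:
    int width_;
    int height_;
    std::vector<float> depth_;
    bool counting_ = false;
    unsigned passed_ = 0;
};
```

For example, after an occluder is drawn over the whole of an 8 x 8 buffer at depth 0.2, a 4 x 4 rectangle queried at depth 0.7 yields a count of zero, so it is fully occluded. The same rectangle queried at depth 0.1 yields 16 and is visible.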

Query Type	Datatype	Use
D3DQUERYTYPE_VCACHE	D3DDEVINFO_VCACHE	Information about optimization, pertaining to data layout for vertex caching
D3DQUERYTYPE_RESOURCEMANAGER	D3DDEVINFO_RESOURCEMANAGER	Number of objects sent, created, evicted, and managed in video memory
D3DQUERYTYPE_VERTEXSTATS	D3DDEVINFO_D3DVERTEXSTATS	Number of triangles that have been processed and clipped
D3DQUERYTYPE_EVENT	BOOL	For any and all asynchronous events issued from API calls
D3DQUERYTYPE_OCCLUSION	DWORD	The number of pixels that pass Z-testing, or are visible on-screen
Table 2.1: Uses of IDirect3DQuery9

The ATI Occlusion Query demo conveys the basics of IDirect3DQuery9 implementation.

Occlusion Culling with DirectX 9

The emergence of IDirect3DQuery9 provides an easy way to implement effective occlusion culling. The basic process is presented below:

Render every object's bounding mesh
For every object:
1. Begin query
2. Re-render the bounding mesh
3. End query
4. Retrieve occlusion query data. If the number of visible pixels is greater than zero, the object should be rendered. Otherwise, the object should be excluded from rendering.
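The control flow of the process above can be sketched as follows. This is a sketch under stated assumptions, not the article's implementation: `findVisibleObjects`, `ScanlineRenderer`, and `Span` are hypothetical names, and the renderer is a one-dimensional software stand-in so that the example runs without a graphics device. Note that the depth test accepts equal depths. Without that, re-rendering a bounding mesh at the depth it already wrote in the first pass would never count any pixels. In Direct3D 9 the default Z-compare function, D3DCMP_LESSEQUAL, behaves this way.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a bounding mesh: covers pixels [begin, end) at depth z.
struct Span { int begin; int end; float z; };

// One-scanline software renderer used only to make the sketch runnable.
class ScanlineRenderer {
public:
    ScanlineRenderer(int width, std::vector<Span> bounds)
        : depth_(static_cast<std::size_t>(width), 1.0f), bounds_(std::move(bounds)) {
        for (const Span& s : bounds_) {
            assert(s.begin >= 0 && s.begin <= s.end && s.end <= width);
        }
    }

    void beginQuery() { counting_ = true; passed_ = 0; }
    unsigned endQuery() { counting_ = false; return passed_; }

    // Draw one object's bounds; a pixel passes when its depth is <= the stored depth.
    void drawBounds(std::size_t object) {
        const Span& s = bounds_[object];
        for (int x = s.begin; x < s.end; ++x) {
            float& stored = depth_[static_cast<std::size_t>(x)];
            if (s.z <= stored) {
                stored = s.z;
                if (counting_) ++passed_;
            }
        }
    }

private:
    std::vector<float> depth_;
    std::vector<Span> bounds_;
    bool counting_ = false;
    unsigned passed_ = 0;
};

// Works with any Renderer providing drawBounds(i), beginQuery(), and endQuery().
template <class Renderer>
std::vector<bool> findVisibleObjects(Renderer& renderer, std::size_t objectCount) {
    // First pass: put every bounding mesh into the Z-buffer.
    for (std::size_t i = 0; i < objectCount; ++i) renderer.drawBounds(i);

    // Second pass: re-render each bounding mesh inside its own query.
    std::vector<bool> visible(objectCount, false);
    for (std::size_t i = 0; i < objectCount; ++i) {
        renderer.beginQuery();
        renderer.drawBounds(i);
        visible[i] = renderer.endQuery() > 0;  // zero pixels means fully occluded
    }
    return visible;
}
```

With an occluder spanning pixels 0 to 10 at depth 0.2, a second object spanning 8 to 14 at depth 0.5, and a third spanning 2 to 6 at depth 0.8, the function reports the first two as visible and the third as fully occluded.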

Step 1

The actual mesh contains too many vertices to use in the occlusion culling process, so a bounding mesh, with a much lower vertex count, will be used as a substitute. Why use a bounding mesh instead of a bounding box or sphere?

Diagram 3.1: Types of bounding volumes

Diagram 3.1 shows multiple types of bounding volumes, including box, sphere, and mesh. Note that, in this particular case, the sphere and the mesh have the same number of vertices. Even so, the fit of the volumes varies drastically. The bounding mesh is the only volume that truly approximates the original mesh well enough to be accurate. This is very important in the occlusion process, as a large number of vertices may be mistakenly rendered or excluded because of a poorly fitting bounding volume.

However, a bounding mesh cannot be calculated through an algorithm like a bounding box or sphere can. It needs to be modeled and loaded at runtime, just like a normal mesh.
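For contrast, the following sketch shows how directly a bounding box and a bounding sphere fall out of the vertex data. The types and function names here (`Vec3`, `Box`, `Sphere`, `computeBoundingBox`, `computeBoundingSphere`) are hypothetical; D3DX offers D3DXComputeBoundingBox and D3DXComputeBoundingSphere for the same job. The sphere computed here is a simple enclosing sphere centered on the box, not necessarily the smallest possible one.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };
struct Box { Vec3 lo; Vec3 hi; };            // axis-aligned box corners
struct Sphere { Vec3 center; float radius; };

// Axis-aligned bounding box: per-axis minimum and maximum over all vertices.
Box computeBoundingBox(const std::vector<Vec3>& vertices) {
    assert(!vertices.empty());
    Box box{vertices[0], vertices[0]};
    for (const Vec3& v : vertices) {
        box.lo.x = std::min(box.lo.x, v.x);
        box.lo.y = std::min(box.lo.y, v.y);
        box.lo.z = std::min(box.lo.z, v.z);
        box.hi.x = std::max(box.hi.x, v.x);
        box.hi.y = std::max(box.hi.y, v.y);
        box.hi.z = std::max(box.hi.z, v.z);
    }
    return box;
}

// Enclosing sphere: centered on the box, radius reaches the farthest vertex.
Sphere computeBoundingSphere(const std::vector<Vec3>& vertices) {
    const Box box = computeBoundingBox(vertices);
    const Vec3 c{(box.lo.x + box.hi.x) * 0.5f,
                 (box.lo.y + box.hi.y) * 0.5f,
                 (box.lo.z + box.hi.z) * 0.5f};
    float maxDistSq = 0.0f;
    for (const Vec3& v : vertices) {
        const float dx = v.x - c.x, dy = v.y - c.y, dz = v.z - c.z;
        maxDistSq = std::max(maxDistSq, dx * dx + dy * dy + dz * dz);
    }
    return Sphere{c, std::sqrt(maxDistSq)};
}
```

Neither function knows anything about the shape between the extreme vertices, which is why both volumes can fit loosely. A bounding mesh has no comparable closed-form recipe.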

Each object’s bounding mesh is rendered first to make sure the entire scene is present in the Z-buffer. If the occlusion query were to take place before all the objects were present in the Z-buffer, then the object being queried could mistakenly be found to be visible, even though it would actually be occluded in the final scene.

Step 2

Now that every object's bounding mesh is in the Z-buffer, the same thing must be done again, except this time, the occlusion query is used to determine each object's visibility status. If the query finds zero visible pixels, the object is excluded from the final, full-scale rendering. If the query finds one or more visible pixels, the object is included in the render.

It is important to note that the occlusion cull rendering does not take place on the primary, full-size surface. A much smaller surface (320 pixels by 240 pixels seems to work well) is used to improve performance.
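A little arithmetic gives a rough sense of the saving. The 320 x 240 size comes from the text above, while the 1024 x 768 primary surface used in the usage note is only an assumed example, and `pixelCount` and `fillRatio` are hypothetical helpers.

```cpp
#include <cassert>

// Number of pixels on a surface of the given dimensions.
constexpr unsigned long pixelCount(unsigned long width, unsigned long height) {
    return width * height;
}

// How many times more pixels the primary surface has than the query surface.
constexpr double fillRatio(unsigned long primaryW, unsigned long primaryH,
                           unsigned long queryW, unsigned long queryH) {
    return static_cast<double>(pixelCount(primaryW, primaryH)) /
           static_cast<double>(pixelCount(queryW, queryH));
}

// The query surface suggested in the text holds 76,800 pixels.
static_assert(pixelCount(320, 240) == 76800UL, "320 x 240 query surface");
```

Against an assumed 1024 x 768 primary surface (786,432 pixels), `fillRatio(1024, 768, 320, 240)` evaluates to 10.24, so each bounding mesh touches roughly a tenth as many pixels during the culling passes.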

The Code
