The following article is an excerpt from ShaderX2 - Shader Programming Tips and Tricks, published by Wordware presumably in August 2003 (see www.shaderx2.com for more information).
Terrain rendering has heretofore been computed by a CPU and rendered by a combination of CPU and GPU. It is possible to implement a fast terrain renderer which works optimally with current 3D hardware. This is done by using geo-mipmapping which splits the terrain into a set of smaller meshes called patches. Each patch is triangulated view-dependently into one single triangle strip. Special care is taken to avoid gaps and t-vertices between neighboring patches. An arbitrary number of textures can be applied to the terrain which are combined using multiple alpha-blended rendering passes. Since the terrain's triangulation changes over time, vertex normals cannot be used for lighting. Instead a pre-calculated lightmap is used. In order to reduce popping when a patch switches between two tessellation levels geo-morphing is implemented. As will be pointed out later, this splitting of the terrain into small patches allows some very helpful optimizations.
Terrain rendering has been an active research area for quite a long time. Although some impressive algorithms were developed, the game development community has community has rarely used these methods because of the high computational demands. Recently, another reason for not using the classic terrain rendering approaches such as ROAM [Duc97] or VDPM [Hop98] emerged: modern GPUs just don't like CPU generated dynamic vertex data. The game developers' solution for this problem was to build very low resolution maps and fine-tuned terrain layout for visibility optimization. In contrast to indoor levels, terrain visibility is more difficult to tune and there are cases where the level designer just wants to show distant views.
The solution to these problems is to introduce some kind of terrain-LOD (level of detail). The problem with simple LOD methods is that at the moment of adding or removing vertices, the mesh is changed which leads to very noticeable popping effects. The only clean way out of this is to introduce geomorphing which inserts new vertices along an existing edge and later on moves that vertex to its final position. As a consequence the terrain mesh is no longer static but changes ("morphs") every frame. It is obvious that this morphing has to be done in hardware in order to achieve a high performance.
A lot of work has already been done on rendering terrain meshes. Classic algorithms such as ROAM and VDPM attempt to generate triangulations which optimally adapt to terrain given as a heightmap. This definition of "optimally" was defined to be as few triangles as possible for a given quality criteria. While this was a desirable aim some years ago, things have changed.
Today the absolute number of triangles is not as important. As of 2003, games such as "Unreal 2" have been released which render up to 200000 triangles per frame. An attractive terrain triangulation takes some 10000 triangles. This means that it is no longer important if we need 10000 or 20000 triangles for the terrain mesh as long as it is done fast enough. Today "fast" also implies using as little CPU processing power as possible since in real life applications the CPU usually has more things to do than just drawing terrain (e.g. AI, physics, voice-over-ip compression, etc...). The other important thing today is to create the mesh in such a way that the graphics hardware can process it quickly, which usually means the creation of long triangle strips. Both requirements are mostly unfulfilled by the classic terrain meshing algorithms.
The work in this article is based on the idea of geo-mipmapping de Boer by [Boe00]. Another piece of work that uses the idea of splitting the terrain into a fixed set of small tiles is [Sno01], although the author does not write about popping effects or how to efficiently apply materials to the mesh.
Building the Mesh
The terrain mesh is created from an 8-bit heightmap which has to be sized 2^n+1 * 2^n+1 (e.g. 17*17, 33*33, 65*65, etc…) in order to create n^2 * n^2 quads. The heightmap (see Figure 1a) can be created from real data (e.g. DEM) [Usg86] or by any program which can export into raw 8-bit heightmap data (e.g. Corel Bryce [Cor01]). The number of vertices of a patch changes during rendering (see view-dependent tessellation) which forbids using vertex normals for lighting. Therefore a lightmap (see Figure 1b) is used instead.
In order to create the lightmap, the normals for each point in the heightmap have to be calculated first. This can be done by creating two 3d-vectors, each pointing from the current height value to the neighboring height positions. Calculating the cross product of these two vectors gives the current normal vector, which can be used to calculate a diffuse lighting value. To get better results, including static shadows, advanced terrain data editing software such as Wilbur [Slay95] or Corel Bryce should be used.
The heightmap is split into 17*17 values sized parts called patches. The borders of neighboring patches overlap by one value (e.g. value column 16 is shared by patch 0/0 and patch 1/0). Geometry for each patch is created at runtime as a single indexed triangle strip. A patch can create geometry in 5 different tessellation levels ranging from full geometry (2*16*16 triangles) down to a single flat quad (2 triangles, see Figure 2) for illustration. Where needed degenerate triangles are inserted to connect the sub-strips into one large strip [Eva96].
In order to connect two strips the last vertex of the first strip, and the first vertex of the second strip have to be inserted twice. The result is triangles which connect the two strips in the form of a line and are therefore invisible (unless rendered in wireframe mode). The advantage of connecting small strips to one larger strip is that less API calls are needed to draw the patch. Since index vertices are used and a lot of today's graphics hardware can recognize and automatically remove degenerate triangles the rendering and bandwidth overhead of the degenerate triangles is very low.