Working with AVI Files
by Jonathan Nix


You have several options when working with AVI files.

* Parse the file yourself.
* Parse the file with MMIO routines.
* Piece together a DirectShow filter.
* Use the Win32 AVIFile API.
* Use MCI to draw it to a window.

Each has benefits and drawbacks, but only Microsoft's AVIFile API is both easy enough for the novice and advanced enough for most purposes. This document deals with loading and interpreting AVI files through the AVIFile interface. If you are already comfortable with DirectShow, I recommend that method instead. Otherwise, AVIFile lets you process the many different kinds of AVI files, handle decompression, and read the video frames easily. The API is built around the Resource Interchange File Format (RIFF), so learning it will also help you process other RIFF files such as WAV, and even help you create your own custom format that extends AVI's capabilities while remaining compatible with other software.

This document will show you how to extract video and sound information from an AVI file. You'll see how to synchronize game elements over the top of the AVI video, and the sample code shows how the AVI can be blitted with transparency so it's superimposed over a game's action. You will receive royalty-free wrapper classes for AVI files, bitmap files, and DirectDraw to get you started.


When learning to process a file, I usually learn its file format and create a wrapper class to make it easier for the rest of my code. One thing I've noticed is that as file formats get more advanced, they tend to use a chunk-based layout. WAV is probably the easiest of the chunk-based formats, and PCM waves can be parsed with little difficulty. I recommend using an API for AVI files, though, because they come in so many forms that you must either learn them all or limit your capability in some way. They're also usually compressed with one of several different formats.

This document is designed to lead you through the entire process from beginning to end. I only talk about the stuff that's pertinent to opening the file and getting frames and audio from it. I don't go into how the sound can be played back or the images rendered, because there are many different ways depending on what you want to do. The sample code, written using MSVC++ 6.0, displays the frame sequence using DirectX 7.0 in full-screen exclusive mode. If there's popular demand I will submit other articles to describe in more detail how the video is rendered and sound played back.


Project Settings

You'll need to include vfw.h and link to winmm.lib and vfw32.lib to use the AVIFile functions. The libraries are included with MSVC++ 6.0, and should also be available for versions as low as 4.0. They should be included with Borland compilers as well. Newer versions of the libraries are also part of the latest Platform SDK release from Microsoft.

You can use these functions with any kind of project: C, C++, MFC, Win32, console app, DirectDraw, Direct3D, etc.



Opening the AVI File

The AVIFileOpen function only takes a string for the filename, as opposed to a file handle. This means that you'll be unable to embed an AVI file into a proprietary WAD format and load it directly from within while using this API, unless you can figure out a trick. It's possible to encode the file and/or change the extension to protect your copyright for a game's release. Like the other AVIFile functions used here, AVIFileOpen returns zero on success, so a nonzero result means the call failed.

AVIFileInit(); // must be called once before any other AVIFile function
PAVIFILE pAviFile;
if(AVIFileOpen(&pAviFile, "filename.avi", OF_READ, NULL))
   // error

Getting the File's Info

AVIFILEINFO info;
AVIFileInfo(pAviFile, &info, sizeof(info));

The info structure contains some extra stuff you might use later on, but nothing spectacular or essential for our purpose, so I won't go into that here.

Finding Audio and Video Streams

An AVI file may have any number of streams of any type. Usually they're just audio and video streams. It's possible to open all of the streams and then query what type they are later. Usually a program ignores streams it doesn't recognize or need. This allows you to extend the AVI file format while retaining compatibility with other programs.

I'll use preallocated arrays to contain only the audio and video streams in the file. Ordinarily I would recommend a linked list, but such implementation details are outside this document's scope, and most AVI files will have only one audio and one video stream anyway.


PAVISTREAM pAudio[MAX_AUDIO_STREAMS], pVideo[MAX_VIDEO_STREAMS];
int nNumAudioStreams=0, nNumVideoStreams=0;

The loops to open each stream are pretty straightforward. I explicitly specify what type of stream to load, either streamtypeAUDIO or streamtypeVIDEO. I ignore any other stream that might be in the file, like streamtypeTEXT or streamtypeMIDI. To load any stream type available, specify zero for the stream type. Each loop stops at the first index for which AVIFileGetStream fails, which means there are no more streams of that type.

do {
   if(AVIFileGetStream(pAviFile, &pAudio[nNumAudioStreams],
      streamtypeAUDIO, nNumAudioStreams))
      break; // no more audio streams
} while(++nNumAudioStreams < MAX_AUDIO_STREAMS);

do {
   if(AVIFileGetStream(pAviFile, &pVideo[nNumVideoStreams],
      streamtypeVIDEO, nNumVideoStreams))
      break; // no more video streams
} while(++nNumVideoStreams < MAX_VIDEO_STREAMS);

Now we have neat arrays of audio and video streams, and we know the number contained in each. Processing them will consist of looping through these streams, so from here forward I simply refer to the current stream as pStream. Note that we haven't actually loaded anything yet; we've merely obtained a handle to the data that's in the file. This allows us to play potentially massive AVI files without a significant memory impact.

Getting a Stream's Info

A stream's information is obtained simply through the use of this function.

AVISTREAMINFO infoStream;
if(AVIStreamInfo(pStream, &infoStream, sizeof(infoStream)))
   // error

Like the file's info, this stuff isn't essential for processing an AVI file. There are some things you can calculate from the structure members, but I've found an easier way to determine these values that's described later.

Determining a Stream's Format

Since a stream can be one of several kinds, we need to know its format. First of all, we must know how long the format data is. Passing NULL for the format buffer makes AVIStreamReadFormat report the required size in lSize, which the following code does.

LONG lSize; // in bytes
if(AVIStreamReadFormat(pStream, AVIStreamStart(pStream), NULL, &lSize))
   // error

* Audio Stream Specifics

The format data for an audio stream is based on the WAVEFORMAT structure, but it may have a few extra data members at the end and a different structure name. You can tell which structure it is by comparing the lSize variable with sizeof(WAVEFORMATEX) or sizeof(PCMWAVEFORMAT). These structures, and most others, simply extend WAVEFORMAT with a few extra bytes.

PCMWAVEFORMAT includes the important wBitsPerSample member, and WAVEFORMATEX includes both wBitsPerSample and cbSize. The cbSize member tells how many extra bytes are stored after the WAVEFORMATEX structure. The extra bytes are for non-Pulse Code Modulation (PCM) formats, if you want to support those. Usually you'll only find PCM formats, but the code you're about to see supports all of them.

We'll accomplish that by reading in a chunk and casting it to a WAVEFORMAT pointer.

LPBYTE pChunk = new BYTE[lSize];
if(!pChunk)
   // allocation error

if(AVIStreamReadFormat(pStream, AVIStreamStart(pStream), pChunk, &lSize))
   // error

LPWAVEFORMAT pWaveFormat = (LPWAVEFORMAT)pChunk;

Now that we know the audio format, we are better equipped to interpret the actual sound information in order to play it back. It's not necessary to be familiar with the structure members just yet.
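To make the size comparison described above a little more concrete, here is a minimal sketch that runs anywhere. The three structs are stand-ins I declare only so the example does not depend on the Windows headers; they mirror the byte layout of WAVEFORMAT, PCMWAVEFORMAT, and WAVEFORMATEX. The names classifyFormatChunk and bitsPerSample are hypothetical helpers, not part of AVIFile.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Stand-ins mirroring the packed layout of the Win32 structures.
// In a Windows build you would use the real ones from the SDK headers.
#pragma pack(push, 1)
struct WaveFormat {              // mirrors WAVEFORMAT (14 bytes)
    uint16_t wFormatTag;
    uint16_t nChannels;
    uint32_t nSamplesPerSec;
    uint32_t nAvgBytesPerSec;
    uint16_t nBlockAlign;
};
struct PcmWaveFormat {           // mirrors PCMWAVEFORMAT (16 bytes)
    WaveFormat wf;
    uint16_t wBitsPerSample;
};
struct WaveFormatEx {            // same byte layout as WAVEFORMATEX (18 bytes)
    WaveFormat wf;
    uint16_t wBitsPerSample;
    uint16_t cbSize;             // count of extra bytes that follow
};
#pragma pack(pop)

static_assert(sizeof(WaveFormat) == 14, "unexpected WAVEFORMAT size");
static_assert(sizeof(PcmWaveFormat) == 16, "unexpected PCMWAVEFORMAT size");
static_assert(sizeof(WaveFormatEx) == 18, "unexpected WAVEFORMATEX size");

enum WaveFormatKind { kTooShort, kWaveFormat, kPcmWaveFormat, kWaveFormatEx };

// Decides which structure a format chunk of lSize bytes can hold.
WaveFormatKind classifyFormatChunk(long lSize) {
    if (lSize >= static_cast<long>(sizeof(WaveFormatEx)))  return kWaveFormatEx;
    if (lSize >= static_cast<long>(sizeof(PcmWaveFormat))) return kPcmWaveFormat;
    if (lSize >= static_cast<long>(sizeof(WaveFormat)))    return kWaveFormat;
    return kTooShort;
}

// Returns wBitsPerSample when the chunk is long enough to contain it,
// or 0 when it is not.
uint16_t bitsPerSample(const unsigned char* pChunk, long lSize) {
    assert(pChunk != nullptr);
    if (classifyFormatChunk(lSize) < kPcmWaveFormat) return 0;
    PcmWaveFormat pcm;
    std::memcpy(&pcm, pChunk, sizeof(pcm));
    return pcm.wBitsPerSample;
}
```

With the real API you would pass the pChunk and lSize obtained from AVIStreamReadFormat.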

* Video Stream Specifics

The BITMAPINFO structure defines the format for a video stream. That structure contains one of the BITMAPINFOHEADER, BITMAPV4HEADER, or BITMAPV5HEADER structures, followed by the palette information if the image format is 8 bits per pixel. Software for Windows 98 or Windows 2000 should write the BITMAPV5HEADER, but for reading an AVI file we need to determine which version was stored in the file in order to be backward compatible. The easiest way is to allocate and read it all in one chunk, like we did with the sound format.

LPBYTE pChunk = new BYTE[lSize];
if(!pChunk)
    // allocation error

if(AVIStreamReadFormat(pStream, AVIStreamStart(pStream), pChunk, &lSize))
    // error

LPBITMAPINFO pInfo = (LPBITMAPINFO)pChunk;

That's possible because each structure begins with the same members as BITMAPINFOHEADER and adds a few extra members to the previous version. If you determine that you need these extra members for a project you're working on, it's easy to cast the data chunk over to the appropriate structure pointer:

// Only if you need to:
DWORD biSize = pInfo->bmiHeader.biSize;
switch(biSize) {
    case sizeof(BITMAPV5HEADER):
        // ...
        break;
    case sizeof(BITMAPV4HEADER):
        // ...
        break;
    case sizeof(BITMAPINFOHEADER):
        // ...
        break;
}

The BITMAPINFO structure tells us a lot about the image format. It tells what type of compression is used, the frame size, bit depth, etc. That's all the necessary information one needs when converting image data over to a GDI HBITMAP, MFC CBitmap, LPDIRECTDRAWSURFACE, or custom format.
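As one example of what the frame size and bit depth are good for, here is a small sketch that works out how many bytes an uncompressed frame occupies. It relies on the DIB rule that every scan line is padded to a multiple of four bytes. The function names dibStride and dibImageBytes are my own for illustration; the parameters correspond to the biWidth, biHeight, and biBitCount members of the header.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Bytes per scan line of an uncompressed DIB. Each row is padded so
// that it ends on a 32-bit (4-byte) boundary.
uint32_t dibStride(int32_t biWidth, uint16_t biBitCount) {
    assert(biWidth > 0);
    return ((static_cast<uint32_t>(biWidth) * biBitCount + 31) / 32) * 4;
}

// Bytes needed for the whole uncompressed image. biHeight is negative
// for a top-down DIB, so only its magnitude matters here.
uint32_t dibImageBytes(int32_t biWidth, int32_t biHeight, uint16_t biBitCount) {
    return dibStride(biWidth, biBitCount) *
           static_cast<uint32_t>(std::abs(biHeight));
}
```

For a 320x240 frame at 24 bits per pixel, dibStride gives 960 bytes per row and dibImageBytes gives 230400 bytes. This only applies to uncompressed data; for compressed formats the header's biSizeImage member is the value to trust.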

Processing an Audio Stream

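Streams of this type are typically uncompressed, so it's probably best to describe that method. We'll start by asking how much audio data a read will return. Passing NULL for the buffer makes AVIStreamRead report the size without reading anything. Note that AVISTREAMREAD_CONVENIENT lets the stream handler choose how many samples to read, so the reported size may cover only part of the stream; to read everything at once, pass the sample count from AVIStreamLength(pStream) instead.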

LONG lSize;
if(AVIStreamRead(pStream, 0, AVISTREAMREAD_CONVENIENT, NULL, 0, &lSize, NULL))
    // error

Since we already know the stream's format, we'll load the sound data into a byte buffer.

LPBYTE pBuffer = new BYTE[lSize];
if(!pBuffer)
    // error

if(AVIStreamRead(pStream, 0, AVISTREAMREAD_CONVENIENT, pBuffer, lSize, NULL, NULL))
    // error

So now, with the sound format and data, it's a simple task to create a DirectSound buffer, or to play the sound back through a Win32 multimedia function or a custom library. Changing the other arguments to AVIStreamRead won't do anything special here, but you can read about the parameters in the online docs if you want.
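When you size a playback buffer, a little arithmetic on the format members goes a long way. The sketch below assumes uncompressed PCM, where nBlockAlign is the number of bytes in one sample frame (one sample for every channel). The helper names pcmBlockAlign and pcmDurationMs are hypothetical, and the parameters stand for the like-named members of the wave format structure.

```cpp
#include <cassert>
#include <cstdint>

// For PCM, nBlockAlign is the size in bytes of one sample frame.
uint16_t pcmBlockAlign(uint16_t nChannels, uint16_t wBitsPerSample) {
    return static_cast<uint16_t>(nChannels * wBitsPerSample / 8);
}

// Playback length, in milliseconds, of cbData bytes of PCM audio.
uint32_t pcmDurationMs(uint32_t cbData, uint32_t nSamplesPerSec,
                       uint16_t nBlockAlign) {
    assert(nSamplesPerSec > 0 && nBlockAlign > 0);
    uint64_t frames = cbData / nBlockAlign;   // whole sample frames only
    return static_cast<uint32_t>(frames * 1000 / nSamplesPerSec);
}
```

For 16-bit stereo at 44100 Hz, pcmBlockAlign returns 4, and 176400 bytes of data last 1000 milliseconds.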

Processing a Video Stream

The great thing about AVIFile is that it handles decompression of video for us. We'll initialize that feature in the following code.

Since we determined what format the frames are stored in earlier, you can modify it slightly, or calculate a new format from scratch, and have AVIFile convert the frames into whatever format best suits your rendering system.

The PGETFRAME pointer is used by the system to handle the decompression of the video frames. The NULL parameter says I just want AVIFile to leave the image format the way it is. You can try passing a BITMAPINFO pointer, perhaps by changing the image format that was loaded before.

PGETFRAME pgf = AVIStreamGetFrameOpen(pStream, NULL);
if(!pgf)
    // error

Now that the decompression system has been initialized, we can enter the loop that plucks and displays each frame for this video stream. AVIFile is organized so you only need one frame in memory at a time, allowing you to quickly play large files, but you can copy or buffer the frames as needed.

Next we'll determine which frame we need to pluck from the stream. Usually that's accomplished by incrementing the value lTime each millisecond via a multimedia timer, or calculating it via the difference in time between frame renderings. The API functions are then used to calculate a frame value based on the amount of time elapsed since the beginning of play. This allows accurate playback for whatever the file specifies as its playback speed, regardless of the time it takes us to render a frame. With this code you'll be able to synchronize the video with speech or game events. Alternatively, you can calculate the lFrame variable through any means, depending on what effect you're accomplishing.

// Precalculated: When stream is opened
lEndTime = AVIStreamEndTime(pStream);

// Calculated just before next frame is blitted
if(lTime <= lEndTime)
    lFrame = AVIStreamTimeToSample(pStream, lTime);
else // the video is done
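If you're curious what the time-to-frame conversion amounts to, here is a rough sketch of the arithmetic. It assumes the stream's dwRate and dwScale values from AVISTREAMINFO, where dwRate divided by dwScale is the number of samples (frames, for video) per second. This is not AVIFile's implementation, and AVIStreamTimeToSample may round differently; timeToFrame and frameForTime are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Converts elapsed milliseconds to a frame index, rounding down.
// dwRate / dwScale is the stream's frame rate in frames per second.
int32_t timeToFrame(int32_t lTimeMs, uint32_t dwRate, uint32_t dwScale) {
    assert(lTimeMs >= 0 && dwRate > 0 && dwScale > 0);
    int64_t frame = static_cast<int64_t>(lTimeMs) * dwRate /
                    (static_cast<int64_t>(dwScale) * 1000);
    return static_cast<int32_t>(frame);
}

// Same conversion, but holds on the last frame once the video is done.
int32_t frameForTime(int32_t lTimeMs, uint32_t dwRate, uint32_t dwScale,
                     int32_t frameCount) {
    assert(frameCount > 0);
    int32_t frame = timeToFrame(lTimeMs, dwRate, dwScale);
    return frame < frameCount ? frame : frameCount - 1;
}
```

At 30 frames per second (dwRate 30, dwScale 1), one second of elapsed time maps to frame 30. For an NTSC-style rate of 30000/1001, half a second maps to frame 14.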

With that information, it's easy to pluck a packed DIB from the video stream.

lpbi = (LPBITMAPINFOHEADER)AVIStreamGetFrame(pgf, lFrame);

The packed DIB is comprised of a BITMAPINFOHEADER structure, followed by the palette information if needed, and then followed by the bitmap data. All of this is just one sequential block of memory, so it's possible to calculate the palette and bitmap pointers using pointer arithmetic.

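// For 16, 24, or 32bit image formats (no palette).
// The cast makes the offset count bytes rather than whole headers.
LPBYTE pData = (LPBYTE)lpbi + lpbi->biSize;

// For 8bit image formats (assuming a full 256-entry palette)
LPBYTE pData = (LPBYTE)lpbi + lpbi->biSize + 256 * sizeof(RGBQUAD);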
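The two cases above cover the common formats. A slightly more general sketch follows, which also allows for a shortened palette (a nonzero biClrUsed) and for the three color masks that follow a plain BITMAPINFOHEADER when biCompression is BI_BITFIELDS. The BitmapInfoHeader struct is a stand-in that mirrors the layout of BITMAPINFOHEADER so the example compiles without the Windows headers; colorTableEntries, pixelDataOffset, and pixelData are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in mirroring the 40-byte layout of BITMAPINFOHEADER.
struct BitmapInfoHeader {
    uint32_t biSize;
    int32_t  biWidth;
    int32_t  biHeight;
    uint16_t biPlanes;
    uint16_t biBitCount;
    uint32_t biCompression;
    uint32_t biSizeImage;
    int32_t  biXPelsPerMeter;
    int32_t  biYPelsPerMeter;
    uint32_t biClrUsed;
    uint32_t biClrImportant;
};
static_assert(sizeof(BitmapInfoHeader) == 40, "unexpected header size");

const uint32_t kBiBitfields = 3;  // numeric value of BI_BITFIELDS
const uint32_t kRgbQuadSize = 4;  // sizeof(RGBQUAD)

// Number of palette entries stored after the header.
uint32_t colorTableEntries(const BitmapInfoHeader& h) {
    if (h.biClrUsed != 0) return h.biClrUsed;
    return h.biBitCount <= 8 ? (1u << h.biBitCount) : 0;
}

// Byte offset from the start of the packed DIB to the pixel data.
uint32_t pixelDataOffset(const BitmapInfoHeader& h) {
    assert(h.biSize >= sizeof(BitmapInfoHeader));
    uint32_t offset = h.biSize + colorTableEntries(h) * kRgbQuadSize;
    // A plain BITMAPINFOHEADER with BI_BITFIELDS is followed by three
    // DWORD color masks; the V4 and V5 headers already contain them.
    if (h.biSize == sizeof(BitmapInfoHeader) && h.biCompression == kBiBitfields)
        offset += 3 * static_cast<uint32_t>(sizeof(uint32_t));
    return offset;
}

// Pointer to the first byte of pixel data in a packed DIB.
const unsigned char* pixelData(const BitmapInfoHeader* lpbi) {
    return reinterpret_cast<const unsigned char*>(lpbi) + pixelDataOffset(*lpbi);
}
```

For a 24-bit frame the offset is just biSize (40 for a plain header), and for an 8-bit frame with a full palette it is 40 + 1024.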

The image data that has been extracted is only good until the next time we call AVIStreamGetFrame, so it's important to display it to screen, copy it to a texture or bmp file, or whatever you want to do with it. I don't go into such details here, but you'll see how in the provided sample code.

When all of the frames have been processed, it's important to close down the decompression system as follows.

if(AVIStreamGetFrameClose(pgf))
    // error

Cleaning up

After you're done with the streams and the file, you release them as follows. That's all there is to it! Hopefully I have helped you to understand this file format. Feel free to email me if you have any questions.

// Remember to release all streams
AVIStreamRelease(pStream);
AVIFileRelease(pAviFile);
AVIFileExit(); // pairs with AVIFileInit

Did you skip the document just to get to the samples?

How can you have any pudding if you haven't finished your meat?

- Pink Floyd, "The Wall"

* Sample One

The first sample plays an AVI file that was made by Klowner. This sample uses my CDirectDraw wrapper class to do the rendering. The file is first played forwards, and then played in reverse so the action is seamless. It's time synchronized so the visual frame rate is consistent with the frame rate specified by Klowner when he made the file. Each frame is also blitted with transparency, so the sequence can be superimposed over a backdrop during a game's action.
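The forwards-then-reverse playback can be done in many ways. The following is one possible sketch, not lifted from the sample code: a hypothetical pingPongFrame function that maps an ever-increasing step counter onto frame indices that run up to the last frame and back down again.

```cpp
#include <cassert>

// Maps step 0, 1, 2, ... onto 0, 1, ..., n-1, n-2, ..., 1, 0, 1, ...
// where n is frameCount. The end frames are not repeated at the turns.
long pingPongFrame(long step, long frameCount) {
    assert(step >= 0 && frameCount > 0);
    if (frameCount == 1) return 0;
    long period = 2 * (frameCount - 1);   // steps in one full there-and-back
    long phase = step % period;
    return phase < frameCount ? phase : period - phase;
}
```

With four frames, successive steps give 0, 1, 2, 3, 2, 1, 0, 1, and so on. Feeding it a step derived from elapsed time rather than a simple counter keeps the playback rate tied to the clock.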

* Sample Two, featuring the PowerRender API

The second sample shows a rotating teapot superimposed over the AVI file's action. It may look like the teapot is being stretched, skewed, and zoomed in and out, but I used PR's camera features, namely field of view and aspect ratio, to achieve those effects. This sample requires a hardware accelerator compatible with Microsoft's Direct3D in order to operate, but should be compatible with OpenGL, GLIDE, or software rendering depending on what you recompile it for. If you want more information on the professional PowerRender API, see the Egerter Software site.


©1999 Jonathan Nix. All Rights Reserved.

All sample code is subject to the most current copyrights and/or disclaimers posted on my website.

PowerRender is a trademark of Egerter Software.

Direct3D and Microsoft are trademarks of Microsoft Corporation.


Date this article was posted to 11/8/1999
(Note that this date does not necessarily correspond to the date the article was written)
