Sound Server Programming
by Arnaud Carré

Part 1 : WaveOut Sound Server

What is a Sound Server?

A Sound Server is some code to play sound. (What a general definition !). Each time you want to code computer sound, you have to use a Sound Server. A Sound Server can be represented as a single thread, managing the System sound device all the time. The user (you !) just send some sound data to the Sound Server regularly.

Why need I a sound Server?

Let's imagine you just coded a great demo, and you want to add some music. You like these old soundchip tunes and you want to play an YM file (ST-Sound music file format). You download the package to play an YM file, but unfortunatly the package is only a "music rendering" package. That is, with the library, you can ONLY generate some samples in memory, not into the sound device ! Many music library are made like this. Tradionnaly, the library provides a function like:

MusicCompute(void *pSampleBuffer,long bufferSize)

So the SoundServer is for you !!! My SoundServer provides all Windows sound device managing, and call your callback regulary. Here is what your code should be:

#include <windows.h>
#include "SoundServer"

static CSoundServer soundServer;

void myCallback(void *pSampleBuffer,long bufferLen)
{
  MusicCompute(pSampleBuffer,bufferSize); // original music package API
}

void main(void)
{
  soundServer.open(myCallback);
  // wait a key or anything you want
  soundServer.close(); 
}

How does it work?

Managing sound device under Windows can be done with various API. Today we'll use the classic Multimedia API called WaveOut. So our Sound Server will work properly even if you don't have DirectSound. We'll see a DirectSound version of the Sound Server in the next article.

The main problem is that we're speaking of sound, so we have some rules to respect to avoid listening some nasty *blips* in the sound stream. Let's imagine we want to play an YM file at 44100Khz, 16bits, mono, and we have internal buffers of 1 second. First, we fill our buffer with the start of the music, and we play the buffer through the Windows API. After one second, buffer is finished, so Windows tell us that the buffer is done, and we have to fill it again with the next of the song. We can fill our buffer again, and send back to the sound device. BUT, in this case, playback is stopped until we fill the buffer again, so we hear some *blips* !!

To avoid that problem, we'll use the queuing capability of the WaveOut API. Just imagine you have two buffers of one second each, already filled with valid sound data (let's call them buffer1 and buffer2). If you play buffer1 and IMMEDIATLY play buffer2, buffer1 is not cutted. Buffer2 is just "queued", and buffer1 is still playing. when buffer1 is finished, Windows starts IMMEDIATLY buffer2 so there is no *blips*, and inform you buffer1 is finished through a callback. So you have 1 second to fill buffer1 again and send it to the sound device. Quite simple, no ?

Let's do the code

All the sound server is encapsulated in a class called CSoundServer. You start the server by calling the "open" method. Open method gets your own callback function as an argument. Then we initialize the Windows WaveOut API by calling

waveOutOpen( &m_hWaveOut, WAVE_MAPPER, &wfx, (DWORD)waveOutProc, (DWORD)this, // User data.
            (DWORD)CALLBACK_FUNCTION);

Please note that waveOutProc is our internal callback. And this callback will call your user-callback.

Then we fill all our sound buffer (remember the multi-buffering to avoid *blips*).

for (i=0;i<REPLAY_NBSOUNDBUFFER,i++)
{
  fillNextSoundBuffer();
}

Let's have a look to the most important function: "fillNextSoundBuffer". First, we have to call your user callback, to fill the sound buffer with real sample data.

// Call the user function to fill the buffer with anything you want ! :-)
if (m_pUserCallback) m_pUserCallback(m_pSoundBuffer[m_currentBuffer],m_bufferSize);

Then we have to prepare the buffer before sending it to the sound device:

// Prepare the buffer to be sent to the WaveOut API
m_waveHeader[m_currentBuffer].lpData = (char*)m_pSoundBuffer[m_currentBuffer];
m_waveHeader[m_currentBuffer].dwBufferLength = m_bufferSize;
waveOutPrepareHeader(m_hWaveOut,&m_waveHeader[m_currentBuffer],sizeof(WAVEHDR));

and finnaly we can send it to the device with the waveOutWrite

// Send the buffer the the WaveOut queue
waveOutWrite(m_hWaveOut,&m_waveHeader[m_currentBuffer],sizeof(WAVEHDR));

That's all folks !! Quite easy, no ??

How can I use it?

I like "clean and short" code. Traditionnaly, when I get a source code from the web, it's always a nightmare to compile and run it. So I try to do things as simple as possible. To use the sound server, just copy SoundServer.cpp and SoundServer.h files in your project directory.

WARNING: Do not forget to link your project with WINMM.LIB to use the Sound Server.

Download the Sound Server source code AND a sample project, playing a generated sin-wave.

Part 2 : DirectX Sound Server

In part 1, we learned to make a SoundServer using the windows WaveOut API. Now we'll use the famous DirectSound API.

Is DirectSound better than WaveOut?

As all simple questions, answer is quite not simple ! :-) In fact, you have to know exactly what's important for your sound server. If your program have to be very accurate (I mean game, demo, or anything requiring high visual/sound synchronisation), use DirectSound. The drawback is that it's a bit more complicated to use (thread usage) and user should have the DirectX API installed. If you only want to play a streaming audio in the background in a tool, just use the WaveOut version.

How does it work?

If you read the previous part, you're familiar with the multi-buffering. With DirectSound, we don't use the same technic. Basicaly DirectSound provides a set of sample buffers, and mix them together. If you want some sound fx in your next generation video game, just create a DirectSoundBuffer for each sample, and play them. DirectSound manages all the mixing, polling etc.. for you !

So you say, "great", that's quite easy ! Yes, but we're speaking of a sound server, for streaming purpose ! So we have the same problem: we want a short sound buffer, and we want the sound server call our own callback periodicaly. Unfortunatly, DirectX7 does not provide streaming sound buffer (maybe in DX8). So we'll use that sheme:

Create a DirectSoundBuffer
Create and launch a thread rout, wich goal is to poll the SoundBuffer without end. Each time we have a little space in it, we fill the buffer with our own data, and so on.

Some words about DirectSound buffers...

DirectSound uses SoundBuffer to play sound. You can use one sound buffer for each sound effect you have to play. All these sounds are mixed into an special sound buffer called the primary sound buffer. All DirectSound app must create a primary sound buffer. For our SoundServer, we can fill directly the primary sound buffer with our data, but writing to the primary sound buffer is not allowed on all drivers or operating system (NT). So we'll use a second buffer, wich is not a primary one. We can lock/unlock and write data in that new buffer without trouble. So our SoundServer will contain a primary sound buffer and a classic sound buffer.

Let's do the code

All the sound server is encapsulated in a class called CDXSoundServer. You have to send the handle of your main window to the constructor, because DirectX need it. Then you can start the server by calling the "open" method. Open method gets your own callback function as an argument. Let's see the open method in detail:

1) Create a DirectSound object.

HRESULT hRes = ::DirectSoundCreate(0, &m_pDS, 0);

WARNING: In our sample I simply check if all is ok. If not, open returns FALSE. You have to add some better error message handler. As an example, if DirectSoundCreate returns an error, maybe DirectSound is not installed on the machine.

2) At that point, m_pDS is a valid LPDIRECTSOUND. Now we set a cooperative level. We choose DSSCL_EXCLUSIVE because we want our app to be the only one to play sound (others apps stops playing sound if they don't have the focus) and DSSCL_PRIORITY allowing us to set our own sound buffer format. (This is only for easy enderstanding, because DSSCL_EXCLUSIVE includes DSSCL_PRIORITY).

hRes = m_pDS->SetCooperativeLevel(m_hWnd,DSSCL_EXCLUSIVE | DSSCL_PRIORITY);

3) Now we can create the primary sound buffer and set its internal format.

DSBUFFERDESC bufferDesc;
memset(&bufferDesc, 0, sizeof(DSBUFFERDESC));
bufferDesc.dwSize = sizeof(DSBUFFERDESC);
bufferDesc.dwFlags = DSBCAPS_PRIMARYBUFFER|DSBCAPS_STICKYFOCUS;
bufferDesc.dwBufferBytes = 0;
bufferDesc.lpwfxFormat = NULL;
hRes = m_pDS->CreateSoundBuffer(&bufferDesc,&m_pPrimary, NULL);
if (hRes == DS_OK)
{
  WAVEFORMATEX format;
  memset(&format, 0, sizeof(WAVEFORMATEX));
  format.wFormatTag = WAVE_FORMAT_PCM;
  format.nChannels = 1; // mono
  format.nSamplesPerSec = DXREPLAY_RATE;
  format.nAvgBytesPerSec = DXREPLAY_SAMPLESIZE * DXREPLAY_RATE;
  format.nBlockAlign = DXREPLAY_SAMPLESIZE;
  format.wBitsPerSample = DXREPLAY_DEPTH;
  format.cbSize = 0;
  hRes = m_pPrimary->SetFormat(&format);

4) Now create a normal sound buffer (to be filled by our rout) and set the same format as the primary. Of course you can set another format, but in that case you'll get a speed penalty.

DSBUFFERDESC bufferDesc;
memset(&bufferDesc,0,sizeof(bufferDesc));
bufferDesc.dwSize = sizeof(bufferDesc);
bufferDesc.dwFlags = DSBCAPS_GETCURRENTPOSITION2|DSBCAPS_STICKYFOCUS;
bufferDesc.dwBufferBytes = DXREPLAY_BUFFERLEN;
bufferDesc.lpwfxFormat = &format; // Same format as primary
hRes = m_pDS->CreateSoundBuffer(&bufferDesc,&m_pBuffer,NULL);

WARNING: Please notice the DSBCAPS_STICKYFOCUS flags. This flags allow our app to play sound even if we don't have have focus. Very useful if you write a sound player. The DSBCAPS_GETCURRENTPOSITION2 tells DirectSound we'll use the GetPosition method later on that sound buffer.

5) And finally, play the empty sound buffer in loop mode, and launch a new thread to fill it:

hRes = m_pBuffer->Play(0, 0, DSBPLAY_LOOPING);
m_hThread = (HANDLE)CreateThread(NULL,0,(LPTHREAD_START_ROUTINE)threadRout,(void *)this,0,&tmp);

Some words about threads...

What's a thread ?? A thread is another task of your program. That is, you have benefit of multi-tasking AND memory sharing ! Our thread have to check the sound buffer all the time. So let's imagine we have only two threads running: our app and our sound thread. All threads will share 50% of CPU each. But I'm sure you don't want your SoundServer takes 50% of CPU time !! :-) So we'll use the "sleep" function. Sleep tells window to forgot the thread for a given amount of time. Sleep(20) suspends the thread for 20ms, so the app have 100% of CPU in that time ! 20ms is a good timing for a sound server. Of course, in practice your app will never have exactly 100% CPU, because of the operating system himself. Our thread routs looks like:

static DWORD WINAPI __stdcall threadRout(void *pObject)
{
  CDXSoundServer *pDS = (CDXSoundServer *)pObject;
  if (pDS)
  {
    while ( pDS->update() )
    {
      Sleep(20);
    }
  }

  return 0;
}

NOTE: You may have notice the m_bThreadRunning is "volatile". Don't forget thread uses shared memory, so the m_bThreadRunning member can be changed by another task. That's why we don't want the compiler uses registers. Volatile tells compiler the memory can be changed by an interrupt routine.

How to fill a sound buffer...

Our thread rout calls DCXSoundServer::update() as often as possible. That function have to check where the sound buffer is currently playing (sound buffer are circular). We keep an internal position (m_writePos) wich is our own position. Let's imagine the sound buffer is 8192 bytes len and we already computed 120 bytes. so m_writePos = 120. At the same time, let's say the playing position is 4120. So we can compute safely 4000 bytes of new data from m_writePos to playPos. (because we can't write over the playing cursor without hear nastly glitchs). So, first we get the playing position, and we compute the data size to be generated from m_writePos to playPos (don't forget we're in a circular buffer)

HRESULT hRes = m_pBuffer->GetCurrentPosition(&playPos,&unusedWriteCursor);
if (m_writePos < playPos) writeLen = playPos - m_writePos;
else writeLen = DXREPLAY_BUFFERLEN - (m_writePos - playPos);

Now we can safely compute"wrileLen" bytes of data at the m_writePos. To fill a DirectSoundBuffer, we have to lock it:

while (DS_OK != m_pBuffer->Lock(m_writePos,writeLen,&p1,&l1,&p2,&l2,0))
{
  m_pBuffer->Restore();
  m_pBuffer->Play(0, 0, DSBPLAY_LOOPING);
}

Please notice that lock can returns an error when SoundBuffer have to be restored. Our error check is very lame, because we don't check the DSERR_BUFFERLOST error code. But that's quite enough for our article! Finally we can call our user callback with valid pointer and size:

if (m_pUserCallback)
{
  if ((p1) && (l1>0)) m_pUserCallback(p1,l1);
  if ((p2) && (l2>0)) m_pUserCallback(p2,l2);
}

Source code and sample project...

As always, the source code of a very "short and simple" sound server using DirectSound API. If you want to use it in your project, just use DXSoundServer.cpp and DXSoundServer.h

As a sample, you can download a complete project containing a sin wave generator, using both WaveOut or DirectSound API.

WARNING: Do not forget to link your project with WINMM.LIB to use the WaveOut Sound Server, and DSOUND.LIB to use the DirectSound API.

Download the Sound Server source code AND a sample project, playing a generated sin-wave.

Hope you like the article !

Discuss this article in the forums

Date this article was posted to GameDev.net: 3/16/2001
(Note that this date does not necessarily correspond to the date the article was written)

See Also:
General