3D Sound in Games
by Jake Simpson

Sound in games always seems to get lost in the rush to talk about the visuals, especially in reviews. Have you noticed that? There may be some comment on the music and whether it suits the game. Maybe some reference to the atmosphericness (is that a word?) of the ambient sounds, but that's usually about all. Then it's off onto the *Great* visuals, or the terrific power of the weapons, or the general abundance of blood.

Which is really not fair to the poor sound designer, who's spent hours sweating over getting the music and sounds *just* right, or to the poor programmer who's had the sound guy hammering on him to 'lower the attenuation on this set of sounds, but up it on that'. Anyone who's ever done sound stuff in games knows what I mean. Sounds require more tweaking than any graphics event going. Sound is far more subjective than graphics, but far less likely to need toning down the way graphics inevitably do, since graphics take so much of the processing/rendering time of the game.

However, sounds are important - more so than most people realize. Try playing your favorite game with all the sounds turned off. Doesn't play right, does it? In a study done by Lucasfilm when they were testing out the THX standards, it became apparent that decent sound can actually fool the brain into thinking the picture is better. In the study, one group of people who were shown a movie with average sound, then the same movie with better sound, actually commented that the picture seemed sharper too! Sound = important. QED.

However, now we have a new tool to play with: 3D spatialized sound. Not that spatialized sound hasn't been used before - lots of games split mono sounds over two channels and use the 3D distance from the camera to determine each channel's volume. But this has always been more of a gimmick than a really helpful tool for the game's player. That's not to say it isn't helpful at all, just that for you - as a games player - to really be able to use it, you have to have just the right set of circumstances. Things are coming along with the new 5.1 speaker setups that we are starting to see developed for the home PC. A quick stroll down Best Buy's PC sound aisle shows you many manufacturers offering 5 speaker setups, and cards from Creative Labs, Diamond and Aureal that will use them.
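The older two-channel trick described above can be sketched roughly as follows. This is only an illustration, not code from any particular game: the function name, the linear distance falloff, the `max_distance` parameter and the constant-power pan law are all assumptions made for the example.

```python
import math

def stereo_gains(listener_pos, listener_right, source_pos, max_distance=50.0):
    """Return (left_gain, right_gain), each in [0, 1], for a mono sound.

    listener_right is assumed to be a unit vector pointing out of the
    listener's right ear. The linear falloff is an illustrative choice.
    """
    # Vector from the listener to the sound source.
    to_source = [s - l for s, l in zip(source_pos, listener_pos)]
    distance = math.sqrt(sum(d * d for d in to_source))
    if distance >= max_distance:
        return 0.0, 0.0  # too far away to hear at all

    # Linear attenuation: 1.0 at the listener, 0.0 at max_distance.
    volume = 1.0 - distance / max_distance

    if distance == 0.0:
        pan = 0.0  # source is on top of the listener, so centre it
    else:
        # Project onto the right-ear axis: -1 is hard left, +1 is hard right.
        pan = sum(d * r for d, r in zip(to_source, listener_right)) / distance
        pan = max(-1.0, min(1.0, pan))  # guard against rounding error

    # Constant-power pan law keeps perceived loudness steady across the pan.
    angle = (pan + 1.0) * math.pi / 4.0  # maps [-1, 1] to [0, pi/2]
    return volume * math.cos(angle), volume * math.sin(angle)
```

Note that a source directly in front of the listener and one directly behind produce identical gains here, which is a big part of why this approach is hard for a player to actually use.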

However, it's worth pointing out that we still haven't reached the nirvana of speaker support yet, since all of these speaker solutions only offer 2 dimensional sound, i.e. the speakers are all on one level. With 4 speakers, you can hear a sound anywhere around you in the horizontal, but not the vertical. You need a speaker over the top of you, and one beneath you, for that to work correctly. Both Creative and Aureal mention that they offer some algorithms for simulating vertical sounds on a horizontal setup, but quite frankly, if you can hear it, then you've got better ears than everyone I know. There's no substitute for physical positioning of a sound source, I'm afraid. Not that it is really that much of a problem for today's crop of FPS games, since most of them are played on a horizontal basis. Quake is a good example. While it gives you 6 degrees of freedom, it really only uses 4 degrees in practical game play. Hearing a sound directly above you is not critical to successful game play. On the other hand, if you're playing Descent, or Freespace, or any space sim, then it's a different story.

So the current state of 3D sound in games requires one of the two prevalent sound systems: Creative Labs' EAX or Aureal's A3D. The Miles Sound System deserves a mention as well. There are others on the horizon, but since they aren't here now, we won't dwell on them.

Creative Labs gives us EAX. And what a solution it is. I attended this year's Creativity Conference, and they certainly have their... "stuff" together :). They have a road map for EAX 3.0, 4.0, 5.0 and so on. EAX revolves around 'environment sets' (EAX stands for Environmental Audio Extensions). The idea behind environmental audio is that the audio properties you hear reflect the type of surroundings you are in, e.g. if you are in an echo-y cave, then you get lots of reflective sound effects. When you move from one area to another, you can re-set these environment property sets, thus changing the surrounding area's aural characteristics. There is more to EAX of course, but this is the core of what it's attempting to do. Also, it's important to distinguish between EAX and the physical SB Live! card. Think of EAX as a 'Glide' type API, and the SB Live! as a Voodoo card, and you get the idea. EAX is suited to the SB Live!'s feature set, but unlike 'Glide' it can be adapted for use on other cards - and we'll talk about that a bit further in.

Getting back to the property sets - the big problem is knowing which one to set. That involves knowing the environment you are in. There are numerous ways to do this, from dropping markers in the world maps that contain a distance and a property set (if you are within range of the marker, this is your property set) to detecting the architecture around you, how close it all is, and deciding which property set to use based on that. But Creative has come to the rescue on this front with a tool they call Eagle. Developed by Keith Charley of Creative, Eagle loads up the raw map file that designers use to create their game levels and creates a low polygon version of it that contains all the information necessary for setting property sets, as well as the data necessary for obstruction and occlusion, which we will talk about a bit later.
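The marker approach mentioned above might look something like this in outline. It is a hedged sketch only: `EnvMarker`, `pick_environment`, the preset names and the nearest-marker tie-break are hypothetical names and choices made for illustration, and none of it is taken from the EAX SDK or from Eagle.

```python
import math
from dataclasses import dataclass

@dataclass
class EnvMarker:
    position: tuple  # (x, y, z) in world units
    radius: float    # the marker applies within this distance
    preset: str      # hypothetical property-set name, e.g. "cave"

def pick_environment(listener_pos, markers, default="generic"):
    """Return the preset of the nearest marker whose radius covers the listener."""
    best = None
    best_dist = math.inf
    for marker in markers:
        d = math.dist(listener_pos, marker.position)
        # Only markers in range count; the closest one wins (an assumed rule).
        if d <= marker.radius and d < best_dist:
            best, best_dist = marker, d
    return best.preset if best is not None else default

# Example level data (made up): a cave next to a large hall.
markers = [
    EnvMarker((0.0, 0.0, 0.0), 10.0, "cave"),
    EnvMarker((20.0, 0.0, 0.0), 15.0, "hall"),
]
current = pick_environment((12.0, 0.0, 0.0), markers)  # only the hall is in range
```

In a game you would call something like this when the listener moves and only push a new property set to the sound API when the result changes, since re-setting the environment every frame would be wasted work.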

As a tool, it’s a great idea, but it does have some drawbacks too. Firstly, it's an extra step in the designer's development cycle. After building your level, you have to set up all the sound info the game requires too. Any change to a level requires the same change to the sound map as well. Secondly, the map that it generates is extra data that has to be loaded per level. It increases memory consumption, and adds to the loading time. Both bad things, especially if you're looking at complex levels to begin with. However, the one thing that it does offer you is that you end up with geometry that is intentionally designed for the job of adding cool sound effects to your game. It's fast to access, which is something that comes into play in the obstruction/occlusion areas.

Something else that should be discussed is the move by Microsoft to include EAX support inside of DirectSound. In their quest to support hardware sound cards, Microsoft decided that EAX was the way to go. This could be seen as both a drawback and a plus. A plus in that many people can now use the EAX protocols, which means you don't HAVE to have an SB Live! to use EAX, just a card that can support property sets. This is a double-edged sword though, since EAX could now be applied to a card that really isn't made for EAX support, resulting in a sound system that doesn't do either the card or EAX justice. It can also be seen as a drawback in that Microsoft have a way of usurping control of systems like EAX to their own ends.

Aureal offers a comprehensive solution. The A3D system has its own way of doing things that covers almost all the bases, and what it doesn't, it will soon :). Currently Aureal is the major competitor for Creative Labs in the sound card market, and A3D is their answer to EAX. Their home grown chip is the Vortex chip set, although A3D will run on other systems. Interestingly, Creative even had a small A3D emulation driver they shipped with the SB Live!. A3D is not property set based, although A3D 3.0 will add A3D sets, which just happen to map to EAX property sets, meaning that effectively you can run A3D on a property set supporting card. Draw your own conclusions about what this will mean. Also supported in A3D 3.0 is .mp3 decompression. It's software based, so it's not free, but it would certainly be worth investigating for those music channels. Something worth mentioning is that Quake III will support A3D 3.0. This is bound to boost the popularity of this sound system.

As it stands right now, there are tools being created to allow easier access to A3D functions - but it's hard to comment on them without seeing them.

The A3D protocol is much more geared directly to the card than the EAX approach. It's almost direct to the metal kind of stuff, which means it's tougher to pick up initially, but you do get a better feel for what's going on, and the controls you have over it.

The Miles Sound System is a bit of a misnomer. It's more of an API layer that sits on top of whatever kind of card you already have. The idea is that it can support A3D, EAX, DirectSound, you name it. You just say "I want this sound at this location, with 3D sound, this kind of environment, this kind of echo, off you go" and the system does it. It does all the initialization it needs to, figures out what kind of sound card you have, and basically does everything for you, plus more. There's .mp3 decompression built in, as well as other very sexy stuff. It's very handy, and pretty cheap to license considering the amount it does for you as a programmer. The downside is that it's fairly expensive CPU wise. Not terribly, but there is a cost. I understand that Daikatana uses Miles to good effect though.

Regarding obstruction and occlusion, it's important to actually distinguish between the two. Obstruction is where you can hear a sound, but there is something in the way, like a pillar. You can hear the sound reflected off walls or other objects, but not directly. Occlusion is where the sound has a wall between you and it. Thus it would either be muffled, or not heard at all, depending on the composition of the wall and your proximity to it. These are two different problems to solve algorithmically. For the second problem, you need access to some world geometry to know where the walls are. In order to figure out if a wall is between you and the sound, you need to do some kind of trace function from you to the sound source, looking to see what's in the way. These traces are not free, and can be expensive if you are traveling the length of the map. They need to be done every frame to update the volume for the sound, which means a fair bit of processing per frame. This is where you start spending CPU time, and where people start getting worried about sound CPU demands. It's interesting to realize that this is all done in software, before the sound card gets involved at all. While both Aureal and Creative mention that A3D 2.0 and EAX 2.0 have obstruction and occlusion built in, they mean that they will write the code for your game that does the obstruction and occlusion, not that it's handled in hardware. There's really nothing to stop you doing this yourself in your own engine, with just a basic DirectX sound system.
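To give a feel for the kind of trace involved in occlusion, here is a deliberately tiny sketch. It assumes a flat 2D map where each wall is a line segment with a made-up "transmission" factor (how much sound gets through); a real engine would trace against its own 3D collision geometry instead. All of the names here are hypothetical.

```python
def _orient(a, b, c):
    """Twice the signed area of triangle abc; the sign gives the turn direction."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def segments_cross(p1, p2, q1, q2):
    """True if segment p1-p2 properly crosses segment q1-q2.

    Grazing contacts and collinear overlaps are ignored to keep the sketch short.
    """
    d1 = _orient(q1, q2, p1)
    d2 = _orient(q1, q2, p2)
    d3 = _orient(p1, p2, q1)
    d4 = _orient(p1, p2, q2)
    return d1 * d2 < 0 and d3 * d4 < 0

def occluded_volume(listener, source, walls, base_volume=1.0):
    """Scale base_volume by the transmission of every wall the trace crosses.

    walls is a list of (start, end, transmission) tuples, where transmission
    is an assumed per-wall factor: 0.0 blocks everything, 1.0 blocks nothing.
    """
    volume = base_volume
    for wall_start, wall_end, transmission in walls:
        if segments_cross(listener, source, wall_start, wall_end):
            volume *= transmission
    return volume

# One thick wall between the listener and a sound 10 units away.
walls = [((5.0, -1.0), (5.0, 1.0), 0.25)]
muffled = occluded_volume((0.0, 0.0), (10.0, 0.0), walls)
```

Even in this toy form you can see where the cost comes from: every audible sound needs a test against every candidate wall, every frame, which is why purpose-built low polygon geometry is attractive.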

Something else worth mentioning regarding occlusion is that most games out there don't have it right now. If you support a system that has it, as well as a system that doesn't, then you can get yourself into a bit of a mess. To clarify, with Heretic II we had the basic Quake II sound system, which was just that. Basic. Then we added A3D 2.0 support (with occlusion and obstruction) in the patch. Now the system automatically selected which to use, depending on what card you had. The drawback here is that if you were playing deathmatch using the default sound system, as long as you were close to other players, you could hear their footsteps, no matter if there was a wall in the way. With A3D 2.0, due to occlusion, you couldn't. Those playing with default sound had aural clues as to your whereabouts that you didn't have for them. You could always switch to default sound, but that would have rather defeated the point of having A3D in the first place. Of course the solution would have been to put in the occlusion for all the sound systems, but hindsight is always 20/20, isn't it?

When you think of "sound card acceleration" it's hard not to think of it in video card terms, and to a great degree the comparison holds. Video cards take pixel data, perform operations on that data, like interpolation, and push the transformed pixels out to a canvas that the game player sees on the screen. Similarly, sound cards take raw sample data, perform operations on that data, mix it together and then send the result to the speakers. Some of these operations include echo, reflections, Doppler shifts, pitch bending and so on. The mixer mixes samples processed by the card, as well as outside analog inputs, like the sound from your CD-ROM drive or microphones that you plug in. There's other stuff the card does, like MIDI playback, but we don't need to get into that for the purposes of this document.
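As a rough picture of what "mix it together" means, here is a minimal software sketch of mixing. It assumes signed 16-bit samples held in plain Python lists and a simple hard clamp on overflow; the function name is made up, and actual cards do this in dedicated hardware with their own strategies.

```python
def mix_samples(channels, gains):
    """Mix equal-length lists of signed 16-bit samples into a single list.

    channels: list of sample lists, one per sound being played.
    gains:    one volume multiplier per channel.
    """
    length = len(channels[0])
    mixed = []
    for i in range(length):
        # Sum every channel's contribution at this sample position.
        total = sum(ch[i] * g for ch, g in zip(channels, gains))
        # Hard-clamp to the 16-bit range so loud mixes clip instead of wrapping.
        mixed.append(max(-32768, min(32767, round(total))))
    return mixed

quiet = mix_samples([[1000, -1000], [2000, 30000]], [1.0, 1.0])
loud = mix_samples([[30000], [30000]], [1.0, 1.0])
```

The clamp is the crude part: two loud sounds playing at once simply clip, which is one reason the relative volumes of sounds get tweaked so much.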

The cool part of the 3D sound card is what the card does with 3D sounds. The SB Live! can handle up to 32 concurrent 3D sounds (the difference between a 3D sound and a 2D sound is that a 3D sound has a 3D origin, is spatialized for 4 speakers, and has a ton of clever stuff done to it after the sound is submitted to the card), while the A3D Vortex 2 chip can handle 16. On the A3D card, once a sound is submitted, the dedicated on-card chip can work out all the reflections for the sound inside a room, which is cool, along with all the filters required for the sound - like Doppler effects, orientation effects and so on. The SB Live! contains a programmable DSP that works out the reverb for a sound, plus all the same sort of filters, just done in a different fashion.
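Of the filters listed above, the Doppler effect is the easiest to show in a few lines. The sketch below uses the standard textbook Doppler formula for a moving source and a moving listener; it is not how either card implements it, and the function name, the speed of sound constant and the clamping are assumptions made for the example.

```python
import math

SPEED_OF_SOUND = 343.0  # metres per second in air at roughly 20 C (assumed units)

def doppler_pitch_factor(listener_pos, listener_vel, source_pos, source_vel):
    """Return the multiplier to apply to a sound's playback pitch."""
    offset = [l - s for l, s in zip(listener_pos, source_pos)]
    distance = math.sqrt(sum(d * d for d in offset))
    if distance == 0.0:
        return 1.0  # no meaningful direction, so leave the pitch alone

    # Unit vector pointing from the source towards the listener.
    direction = [d / distance for d in offset]
    # Speed of the source towards the listener (positive means approaching).
    v_source = sum(v * u for v, u in zip(source_vel, direction))
    # Speed of the listener towards the source (positive means approaching).
    v_listener = -sum(v * u for v, u in zip(listener_vel, direction))

    # Keep both speeds below the speed of sound so the formula stays finite.
    limit = 0.99 * SPEED_OF_SOUND
    v_source = max(-limit, min(limit, v_source))
    v_listener = max(-limit, min(limit, v_listener))

    return (SPEED_OF_SOUND + v_listener) / (SPEED_OF_SOUND - v_source)
```

A source heading towards a stationary listener at a tenth of the speed of sound gives a factor of about 1.11, so the sound plays roughly 11% higher; the same source heading away gives about 0.91.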

Both cards are pretty much the same speed - the real differences are in what occurs physically on the card once the sound gets there. There are some differences in approach, and in the software that gets the sound to the card, but by and large it would be hard to pick one over the other. From a developer's point of view, both have their good sides and their bad sides; it's very much a swings and roundabouts kind of thing as to which is superior. The bottom line really is "how do they sound?" Well, they both sound really good to me; time will tell on this one...

Of course the one thing that games could really use that's not here is .mp3 hardware decompression. We know we want it, so pretty please manufacturers, can we have it? :)



Date this article was posted to GameDev.net: 7/17/2000
(Note that this date does not necessarily correspond to the date the article was written)

© 1999-2011 Gamedev.net. All rights reserved.