XNA Creators Club Online

Localizing 3D sounds

Last post 04-30-2008 7:01 PM by JSpiel. 5 replies.
  • 04-28-2008 11:42 PM

    Localizing 3D sounds

    I'm trying to localize sounds for 2-channel (left/right) headphones.  After looking at some binaural recordings, it's apparent that 5.1 surround sound headphones are an unnecessary gimmick, as the human auditory system uses a number of audio cues to localize sounds with just 2 channels.

    Two approaches for producing such 3D audio are:
    1. Record with binaural microphones (one placed inside each ear or the ears of a mannequin head), capturing the sound as it would be heard if one were actually there.  This I would call the brute force approach; the auditory analogue of motion capture.
    2. Use HRTF (head-related transfer functions) and position data to localize monaural sounds synthetically.
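
    A minimal sketch of approach 2, assuming the HRTF for one direction is available as a pair of time-domain impulse responses (HRIRs), one per ear. The function names and the HRIR values are made up for illustration; real HRIRs come from measurements and are much longer.

```python
# Sketch: binaural synthesis by convolving a mono signal with a pair of
# head-related impulse responses (HRIRs, the time-domain form of an HRTF).
# The HRIR values below are made-up toy numbers, not measured data.

def convolve(signal, impulse_response):
    """Direct-form FIR convolution; returns len(signal)+len(ir)-1 samples."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def binauralize(mono, hrir_left, hrir_right):
    """Return (left, right) ear signals for one source direction."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)

# Toy HRIRs for a source to the listener's right: the right ear gets the
# sound first and louder; the left ear gets it two samples later and quieter.
HRIR_RIGHT = [1.0, 0.3, 0.0, 0.0]
HRIR_LEFT = [0.0, 0.0, 0.5, 0.2]

mono = [1.0, 0.0, -1.0]
left, right = binauralize(mono, HRIR_LEFT, HRIR_RIGHT)
```

    A moving source would need a different HRIR pair per direction, with interpolation or crossfading between them, which this sketch leaves out.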

    What, if any, HRTF data does XNA's audio system use when localizing sounds with Cue.Apply3D?

    Does or can XNA itself distinguish between 2-channel speakers and headphones when rendering 3D audio?
    If not, is it able to send position data to the sound card driver so the driver can use its own HRTF data for localizing sounds correctly for 2-channel headphones?

    It seems to me like XNA is not capable of doing either of the above, and the results are not impressive, because I can't tell whether a sound is coming from in front of me or behind me.  If a proper HRTF were used anywhere at all between the C# code and my eardrums, the audio would sound noticeably different.  If XNA doesn't use HRTF, then it probably should.  Also, it really should have settings for targeting headphones specifically rather than stereo speakers, unless of course all this information is passed to the sound card driver and handled properly.

    The most realistic 3D sounds you'll ever hear, besides actually being there, are going to come from in-ear 2-channel headphones.  Example: http://youtube.com/watch?v=wT1XuB95qMk  (use in-ear headphones or it will suck)

    External speakers, unfortunately, have "sweet spots", and 5.1 and even 7.1 surround is cheesy and sounds flat if you're outside the sweet spot.  It's also lazy to handle 3D audio by simply balancing the sound across 5 or 7 speakers, and I certainly hope such an approach is never used for 2 speakers (that would be stereo, the stuff that sounds like it's inside your head rather than all around you).   All modern games should use HRTFs to properly localize sounds for 2-channel headphones.  Binaural recordings won't cut it, because characters are always spinning around relative to sound sources (unless of course you record the sound a hundred times from various positions, and cue/transition to the correct recording as the player turns while keeping them all synchronized).
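
    To illustrate why plain channel balancing can't separate front from back on 2 channels, here is a sketch of a constant-power pan law driven only by the left/right component of the source direction. This is an assumed, simplified model for illustration, not a description of what XNA or XACT actually does internally.

```python
import math

def stereo_pan_gains(azimuth_deg):
    """Constant-power pan gains (left, right) for a source azimuth.

    0 = straight ahead, 90 = hard right, 180 = directly behind.
    Only the left/right component of the direction is used, which is the
    simplifying assumption this sketch is meant to expose.
    """
    x = math.sin(math.radians(azimuth_deg))   # -1 (left) .. +1 (right)
    angle = (x + 1.0) * math.pi / 4.0         # 0 .. pi/2
    return math.cos(angle), math.sin(angle)

front = stereo_pan_gains(30.0)    # 30 degrees right of straight ahead
behind = stereo_pan_gains(150.0)  # the mirrored position behind the listener
```

    Both positions produce the same pair of gains (up to floating-point rounding), so with this model nothing in the output tells the listener whether the source is in front or behind.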
  • 04-30-2008 12:02 AM In reply to

    Re: Localizing 3D sounds

    The human brain computes spatial information from interaural time differences, interaural phase differences, interaural intensity differences and the HRTF. Unfortunately every human has a different 'H' so no two HRTFs are the same. This all means that HRTF is not sufficient as a means of deceiving the brain into thinking that the sound has a precise locality.
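
    As a rough illustration of the first cue listed above, the interaural time difference can be estimated with Woodworth's spherical-head approximation, ITD = (a / c) * (theta + sin(theta)). The head radius and speed of sound below are assumed typical values, not measurements of any particular listener.

```python
import math

HEAD_RADIUS_M = 0.0875      # assumption: typical average head radius
SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at about 20 C

def woodworth_itd_seconds(azimuth_deg):
    """ITD for a distant source, azimuth in [-90, 90] degrees (0 = ahead).

    Positive azimuth (source to the right) gives a positive ITD.
    """
    if not -90.0 <= azimuth_deg <= 90.0:
        raise ValueError("this form of the model only covers the frontal half-plane")
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))

itd_ahead = woodworth_itd_seconds(0.0)   # no delay for a source dead ahead
itd_side = woodworth_itd_seconds(90.0)   # largest delay, source at the side
```

    With these assumed constants the delay for a source directly to one side comes out at roughly 0.66 ms. A different head radius changes the number, which is one small piece of why no two listeners share the same cues.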
  • 04-30-2008 12:39 PM In reply to

    Re: Localizing 3D sounds

    HRTF processing is also extremely expensive. Cool tech for putting together a preprocessed sound effect, or for doing in realtime as a specialized demo, but not really feasible to apply to large numbers of sounds in realtime while also playing a game.
    XNA Framework Developer
  • 04-30-2008 3:46 PM In reply to

    Re: Localizing 3D sounds

    No, that's not true.  It's definitely NOT that expensive at all.

    I'm running Holistics Amphiotik Synthesis mixing at least 25 separate sounds with binaural 3D audio simultaneously in REAL-TIME and it's using only 4% of my CPU.  With 50 sounds it still uses only 8%, AND I'm running Unreal Tournament 3 in the background with no slowdown at all on a dual-display setup @ 3360x1050 resolution.  I can drag any one of the sound sources in 3D space around the listener as everything is playing and clearly hear that single sound source move to its new location.  This technology is incredible and sounds amazing.  I have an Intel Core2 Quad CPU Q6600 @ 2.4GHz, 2 GB of RAM, and a GeForce 8800 GTS.
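
    A back-of-envelope sketch of how the cost scales, assuming plain time-domain filtering with one HRIR per ear per source. The tap count and sample rate are assumptions picked for illustration; how Amphiotik Synthesis actually does its processing is not stated here.

```python
def hrir_macs_per_second(num_sources, hrir_taps, sample_rate_hz, ears=2):
    """Multiply-accumulates per second for direct time-domain HRIR filtering."""
    return num_sources * ears * hrir_taps * sample_rate_hz

ASSUMED_TAPS = 128         # assumption: HRIR length in samples
ASSUMED_RATE_HZ = 44_100   # assumption: CD-quality sample rate

cost_25 = hrir_macs_per_second(25, ASSUMED_TAPS, ASSUMED_RATE_HZ)
cost_50 = hrir_macs_per_second(50, ASSUMED_TAPS, ASSUMED_RATE_HZ)
```

    Under these assumptions 25 sources need 282,240,000 multiply-accumulates per second and 50 sources need exactly twice that. The linear scaling matches the doubling from 4% to 8% CPU reported above.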

    Holistics published a paper last year on the system http://smc07.uoa.gr/SMC07%20Proceedings/SMC07%20Paper%2048.pdf titled "Real-time Spatial Mixing Using Binaural Processing".  I'd suggest reading it.

    Also, ear shape is not that big of a deal unless you naively assume a person is sitting in a FIXED location facing a fixed direction with a fixed sound source.  Well, of course, in that kind of situation it's hard for anyone to locate a sound without good context like knowing where the sound is coming from in advance.  When sounds are moving and/or you can move your head relative to the sound source (like in VIDEO GAMES), your brain is very efficient at localizing the sound.  We're awesome at processing moving time-sensitive information, not static information.  Also, you underestimate the ability of the brain to adapt.  If your ear got cut off or your hearing got worse in one ear, your HRTF would change, and your brain is capable of adjusting and reinterpreting the sound; our HRTFs are not FIXED in our brains.  The important thing is that even if your ear is not cut off, with MOVING SOUNDS your brain can quickly adjust to the new HRTF; it takes minutes, not years.  Even over a short time, with motion, we're capable of adjusting to different HRTFs.  You should do an experiment before you make assumptions.

    Maybe game developers should think about licensing Holistics' software ;)  It's a pretty nice API with full control over many parameters in real-time.





  • 04-30-2008 5:14 PM In reply to

    Re: Localizing 3D sounds

    Thanks for the paper. It looks interesting. I'm not sure what you are claiming about this system. Can you hear a sound behind you using two stereo speakers in front of you?
  • 04-30-2008 7:01 PM In reply to

    Re: Localizing 3D sounds

    Yes, absolutely, but only if you're at the correct angle to the two speakers -- whatever angle was used when transaural processing was applied (you can toggle transaural processing in Amphiotik Synthesis, and configure the speaker angle).

    "Transaural audio is a method used to deliver binaural signals to the ears of a listener using stereo loudspeakers. The basic idea is to filter the binaural signal such that the subsequent stereo presentation produces the binaural signal at the ears of the listener".
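
    The filtering idea in that quote can be sketched at a single frequency as a 2x2 matrix inversion: if H describes the paths from each speaker to each ear, feeding the speakers with the inverse of H applied to the binaural signal cancels the crosstalk. The path gain, delay, and frequency below are made-up toy values, and a real system would do this per frequency with measured responses.

```python
import cmath

def invert_2x2(m):
    """Invert a 2x2 complex matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ZeroDivisionError("acoustic matrix is singular at this frequency")
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_vec(m, v):
    """Multiply a 2x2 matrix by a 2-element vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

# Toy symmetric setup: same-side path has gain 1, cross path is quieter and late.
FREQ_HZ = 1000.0         # assumption: analysis frequency
CROSS_GAIN = 0.7         # assumption: head-shadow attenuation
CROSS_DELAY_S = 0.00025  # assumption: extra travel time to the far ear
cross = CROSS_GAIN * cmath.exp(-2j * cmath.pi * FREQ_HZ * CROSS_DELAY_S)
H = [[1.0 + 0j, cross],   # rows: left ear, right ear
     [cross, 1.0 + 0j]]   # columns: left speaker, right speaker

C = invert_2x2(H)                     # crosstalk-cancelling filter
binaural = [1.0 + 0j, 0.0 + 0j]       # signal meant for the left ear only
speaker_feeds = mat_vec(C, binaural)
at_ears = mat_vec(H, speaker_feeds)   # what arrives with cancellation
without_filter = mat_vec(H, binaural) # what arrives if fed straight to speakers
```

    With the filter, the right ear receives essentially nothing; without it, the right ear hears the left-ear signal at the cross-path gain. H only holds for one listener position, which is why the effect depends on sitting at the configured angle.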

    Having said all that, remember that in general, 2-channel headphones are the best thing to target.  Loudspeakers are basically sloppy with sound, spewing it all over the place with total ignorance of where the listener's ears are.  With transaural processing loudspeakers can be alright, but then you can't move from a specific spot.  Headphones create a personal experience, one that will be the same for everyone regardless of their location if you just have multiple headphone jacks for your friends!  If you're worried about hearing each other, just build the headphones with a microphone pass-through in each ear!

    Binaural audio works off the idea that if you record a sound from inside the ears (of a mannequin head for example) with two microphones, then if you play it back through in-ear headphones you will reproduce exactly what the person would hear if their ears had been there.  Genius!

    And it works.  It works so well that it's creepy, because you think something is there that isn't.  It's like an auditory hallucination.  Get some in-ear headphones and listen to the audio of that YouTube video I linked to in my first post.

    Now, as far as position goes...  we're not that great at locating sounds, but if we turn our heads or the sounds are moving enough, we can get a much better idea of where something is -- good enough to follow a moving sound source.
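
    A small sketch of why turning the head helps, using a simple sine-law model of interaural time difference (two point ears, distant source). The ear spacing and speed of sound are assumed values. In this model a source in front and its mirror image behind give the same ITD while the head is still, but the ITD moves in opposite directions once the head turns.

```python
import math

EAR_SPACING_M = 0.175       # assumption: distance between the ears
SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at about 20 C

def itd_seconds(source_azimuth_deg, head_yaw_deg=0.0):
    """Sine-law ITD; angles are clockwise from straight ahead, in degrees."""
    relative = math.radians(source_azimuth_deg - head_yaw_deg)
    return (EAR_SPACING_M / SPEED_OF_SOUND_M_S) * math.sin(relative)

front_still = itd_seconds(30.0)    # source ahead and to the right
back_still = itd_seconds(150.0)    # mirrored source behind and to the right

# The listener turns the head 10 degrees to the right.
front_turned = itd_seconds(30.0, head_yaw_deg=10.0)
back_turned = itd_seconds(150.0, head_yaw_deg=10.0)
```

    After the turn the ITD shrinks for the front source and grows for the rear one, so the direction of the change is enough to tell them apart.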

    Really, nothing out there today is better than in-ear headphones with binaural audio as far as producing eerily realistic sound, and surround sound and loudspeakers in general are nothing but a channel-balancing gimmick that takes virtually nothing into consideration about how humans perceive sound; speakers are nice if many people need to hear something, but they're simply not the best at creating a realistic experience for any individual.  Surround sound is nothing but channel separation and volume balancing, and the positioning is wrong anyway if you're not right in the center; you get crosstalk when the sound from all the speakers is hitting your ears wrong.

    The sad truth is (and I'm sure companies that sell thousand-dollar surround systems don't want you to know this), it's possible to get very realistic audio from a cassette player and a cheap pair of in-ear headphones if the audio is recorded or synthesized correctly.  I've read that there are actually a few cassette albums that were recorded with binaural audio.  Most audio recordings were never made for headphones, so most of them sound like they're coming from inside your skull rather than all around you.