Wavelets, HRTFs, and sound localization

Paul Hubbard, Kristin L. Umland, M. Cristina Pereyra


Wavelet-domain HRTF (Head-Related Transfer Function), Non-Standard form, Daubechies-6 basis, thresholded at maximal row norm of 1.0e-7. See fig2.m for the code to generate this figure.

Synopsis

We implemented wavelet-domain convolution in the non-standard matrix form developed by Beylkin et al. The code here loads a head-related transfer function (HRTF), converts it to the non-standard form (NSF), and then convolves it with a WAV-format audio file. The convolution is done in the wavelet domain, and the result is an (approximate) localized sound clip.
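
For orientation, here is a minimal sketch of the ordinary time-domain operation that the wavelet-domain version approximates: convolving one mono clip with a left-ear and a right-ear impulse response. It is written in plain Python rather than the project's MATLAB, and the function names and the toy 3-tap responses are made up for illustration; they are not taken from the project code or from the KEMAR data.

```python
def convolve(signal, kernel):
    """Direct (time-domain) linear convolution of two sequences."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def localize(mono, hrir_left, hrir_right):
    """Return (left, right) channels for a mono clip and one impulse-response pair."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy data (hypothetical): a unit impulse and two 3-tap responses.
mono = [1.0, 0.0, 0.0, 0.0]
left, right = localize(mono, [0.5, 0.25, 0.0], [0.0, 0.5, 0.25])
```

Feeding in a unit impulse simply returns each response padded with zeros, which is a quick sanity check before trying real audio.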

Err...what?

In plainer English - you can process audio so that it sounds as if it's coming from a specific location in 3D space. This is really cool, especially for games and theater. This project was an effort to speed up the localization process using math tricks from people much smarter than I am. It didn't really work, but it is still an interesting use of the wavelet transform. And hey, the filtering might work better for your application.

Please read the paper for more details. Also feel free to contact me (address below) if you have questions, comments, ideas, and so forth.

Publication Info

This has been accepted into the proceedings of the Wavelets X Conference, Aug 3-8 2003 in San Diego.

A copy of my presentation slides (PDF format) is now available.

Copyright

The paper copyright has been transferred to SPIE, so I cannot send you the LaTeX source or PostScript versions. However, older versions are posted below, and the code and other supporting materials are all here.

Software Requirements to run the code

(URLs for these are found below)
  1. MATLAB, v5 or greater
  2. KEMAR HRTF set from MIT
  3. WaveLab toolkit from Stanford
  4. Source code below
In the interest of reproducible research, we are making all of the code available here. Please contact us if you use this code in your projects or research - we would very much like to hear of further work and/or applications in this area.

Matlab versions

This code was originally written for MATLAB v5.0/5.1. When I ported it to v6.5, I discovered that the 'flops' counter I used to estimate complexity was gone, replaced by the elapsed-time tic/toc pair, which is less useful for this purpose. So the code as posted no longer computes flops, for which I apologize - you have to run it on v5.1 to see those.
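
To illustrate why an operation count is handier than a stopwatch for estimating complexity, here is a small sketch (plain Python, not the project's MATLAB) that counts multiplications in a direct convolution by hand and also times it. The signal and kernel lengths are arbitrary choices for the example, and `convolve_counted` is a hypothetical helper, not something from the posted code.

```python
import time


def convolve_counted(signal, kernel):
    """Direct convolution that also returns the number of multiplications."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    mults = 0
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
            mults += 1
    return out, mults


signal = [1.0] * 1000   # arbitrary test length
kernel = [0.5] * 128    # arbitrary filter length

start = time.perf_counter()
_, mults = convolve_counted(signal, kernel)
elapsed = time.perf_counter() - start

# mults is always len(signal) * len(kernel); elapsed varies by machine and run.
print(mults, elapsed)
```

The count is the same on every machine, so two algorithms can be compared directly; the elapsed time also depends on the hardware, the interpreter, and whatever else is running.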

What about MATLAB clones like Octave?

I've experimented off and on with porting the code to Octave, but have run into problems with the runtime. For example, there are no routines to read and write WAV files, though I could convert my audio clips into another format. Also, I get odd results with the KEMAR HRTF files, where Octave reads too few samples. The port is currently on hold, though I'd gladly accept any help.
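
When chasing a short read like this, it can help to check the sample count independently of MATLAB or Octave. The sketch below is in plain Python and assumes the file is a headerless stream of 16-bit big-endian signed integers; that layout is an assumption here, so check it against the KEMAR documentation. The helper name and the sample values are hypothetical.

```python
import os
import struct
import tempfile


def read_int16_be(path):
    """Read a headerless file of 16-bit big-endian signed integers."""
    with open(path, "rb") as f:  # binary mode, so no bytes are translated
        data = f.read()
    count = len(data) // 2
    return list(struct.unpack(">%dh" % count, data[: 2 * count]))


# Round-trip check on a temporary file with made-up sample values.
samples = [0, 1, -1, 26, 32767, -32768]
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(struct.pack(">%dh" % len(samples), *samples))
    tmp_path = tmp.name

loaded = read_int16_be(tmp_path)
os.remove(tmp_path)
print(len(loaded), loaded == samples)
```

Comparing the count from a reader like this with the file size divided by two shows whether samples are being dropped on the way in. Byte order and text-mode versus binary-mode reads are two things worth ruling out, though whether either explains the Octave behavior is not established here.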

The Source code

  1. Code, as written to run on version 5 of MATLAB, zip file
  2. Code, current version, ported to v6.5 of MATLAB, zip file

Sound files: Listen to the Results

We found the most useful test clip to be a Beethoven piano concerto (the complete reference is in the paper).
  1. lvb.wav, original, 504kB
  2. Localized to zero degrees elevation, 40 degrees azimuth, using MATLAB's 'conv' function: ref_lvb.wav, 1006kB
    Note the addition of artifacts and distortion - this is because the KEMAR HRTF set, while free, is in need of work.
  3. And now, here is the same clip, localized in the wavelet domain: wd_lvb.wav, 1008kB
    The parameters for this are: Daubechies D4 basis, all detail levels, epsilon (Maximal row norm error) set to 0.001
    There are some discernible differences from the normal convolution; a slight metallic artifact is present.
  4. Same clip, better approximation (Epsilon reduced to 0.0001). wd_lvb_2.wav, 1008kB
    Now indistinguishable from the standard convolution.
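
The trade-off between epsilon and audible quality can be seen in a much-simplified setting. The sketch below (plain Python, not the project's MATLAB) applies one periodic Daubechies D4 analysis step to a short vector, zeroes the detail coefficients smaller than a threshold, and reconstructs. This is a stand-in for illustration only: the project thresholds a non-standard-form matrix by row norm, which this code does not attempt, and the test vector and function names are made up.

```python
import math

S3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)
# Daubechies 4-tap low-pass filter and its matching high-pass filter.
H = [(1 + S3) / NORM, (3 + S3) / NORM, (3 - S3) / NORM, (1 - S3) / NORM]
G = [H[3], -H[2], H[1], -H[0]]


def forward_step(x):
    """One periodic D4 analysis step; len(x) must be even and at least 4."""
    n = len(x)
    avg = [sum(H[k] * x[(2 * i + k) % n] for k in range(4)) for i in range(n // 2)]
    det = [sum(G[k] * x[(2 * i + k) % n] for k in range(4)) for i in range(n // 2)]
    return avg, det


def inverse_step(avg, det):
    """Invert forward_step; the transform is orthogonal, so apply its transpose."""
    n = 2 * len(avg)
    x = [0.0] * n
    for i in range(n // 2):
        for k in range(4):
            x[(2 * i + k) % n] += H[k] * avg[i] + G[k] * det[i]
    return x


def approximate(x, eps):
    """Drop detail coefficients with magnitude below eps, then reconstruct."""
    avg, det = forward_step(x)
    kept = [d if abs(d) >= eps else 0.0 for d in det]
    return inverse_step(avg, kept)


def l2_error(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


clip = [math.exp(-0.3 * i) for i in range(16)]  # made-up decaying test vector
err_coarse = l2_error(clip, approximate(clip, 1e-1))
err_fine = l2_error(clip, approximate(clip, 1e-4))
print(err_coarse, err_fine)
```

Because the transform is orthogonal, the reconstruction error equals the norm of the discarded coefficients, so a smaller threshold can never give a larger error. That matches the pattern in the clips above, where reducing epsilon removes the audible artifact.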

Related Links and References

I used Emacs to edit my code, and the MATLAB extensions for Emacs, called "matlab.el", are mirrored locally.

There are instructions in the file as to how to add MATLAB support to your .emacs file. Highly recommended!

MATLAB is a product of The MathWorks, and the MATLAB home page is here.

I used the freeware WaveLab toolbox. WaveLab is found at this Stanford link.

G. Beylkin did the groundbreaking work in wavelet-domain operators. He is at U of Colorado in Boulder; his list of downloadable papers is here, and his homepage is available here.

The HRTF tables are from the MIT KEMAR project; that is available here.

I strongly recommend good headphones for evaluating these results. My personal preference is the Grado Labs SR series; I like the SR60 and SR325 a great deal. I've had good luck with headphone.com as a supplier.

One particularly impressive convolution system is BruteFIR, which is an extremely efficient convolver using FFT techniques.

Johns Hopkins wrote some excellent Java applets that interactively demonstrate discrete and continuous convolution. Highly recommended, a very useful site. The JHU site also has many other very useful applets and demos to play with; wander over and take a look!

Update 11/11/10 - Amber Sullivan sent me a link to this wavelet site, which seems to be a pretty good resource list as well.

Previous versions of the paper

Paper, MS Word 97 format, as rejected by VR2000. Lacking in detail, but covers most of the essentials.

In the interests of, well, something, here are the reviews. Others talk about full disclosure, but we deliver!

Paper, VR00 version, PostScript formatted, for those of you not using MS products.

Contacting the authors

  • Paul Hubbard's email is [email protected]
