Wavelets, HRTFs, and sound localization
Paul Hubbard, Kristin L. Umland, M. Cristina Pereyra
Wavelet-domain HRTF (Head-Related Transfer Function), Non-Standard
form,
Daubechies-6 basis, thresholded at maximal row norm of 1.0e-7. See
fig2.m for the code
to generate this figure.
Synopsis
We implemented wavelet-domain convolution, in the non-standard matrix
form developed by Beylkin et al. The code here will load a head-related
transfer function (HRTF), convert it to the non-standard form (NSF) and
then convolve it with a WAV-format audio file. The convolution is done
in the wavelet domain, and the result is an (approximate) localized
sound clip.
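To make the idea concrete, here is a minimal sketch in Python (not the project's MATLAB code) of applying a convolution operator in a wavelet basis and thresholding it. It simplifies a great deal, and every simplification is an assumption of the sketch: it uses the Haar basis and the standard matrix form rather than Daubechies bases and the non-standard form, circular rather than linear convolution, and per-entry thresholding rather than a row-norm criterion. All function names are made up for illustration.

```python
import math

def haar_matrix(n):
    """Orthonormal Haar transform matrix (full decomposition); n must be a power of two."""
    if n == 1:
        return [[1.0]]
    s = 1.0 / math.sqrt(2.0)
    # Coarse rows: the half-size transform applied to scaled pairwise sums.
    top = [[s * v for v in row for _ in (0, 1)] for row in haar_matrix(n // 2)]
    # Detail rows: scaled pairwise differences.
    bottom = []
    for i in range(n // 2):
        row = [0.0] * n
        row[2 * i], row[2 * i + 1] = s, -s
        bottom.append(row)
    return top + bottom

def circulant(h, n):
    """n-by-n circular convolution matrix: (T x)[i] = sum_j h[(i - j) % n] * x[j]."""
    hp = list(h) + [0.0] * (n - len(h))
    return [[hp[(i - j) % n] for j in range(n)] for i in range(n)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in a]

def matvec(a, v):
    return [sum(p * q for p, q in zip(row, v)) for row in a]

def wavelet_domain_convolve(h, x, eps):
    """Convolve x with h via a thresholded wavelet-domain operator.

    Returns the approximate result and the number of nonzero operator entries kept.
    """
    n = len(x)
    w = haar_matrix(n)
    wt = transpose(w)
    a = matmul(matmul(w, circulant(h, n)), wt)      # operator in the wavelet basis
    a = [[v if abs(v) >= eps else 0.0 for v in row] for row in a]  # drop small entries
    kept = sum(1 for row in a for v in row if v != 0.0)
    y = matvec(wt, matvec(a, matvec(w, x)))         # transform, apply, transform back
    return y, kept

h = [0.5, 0.3, 0.2]                         # toy impulse response, not a KEMAR HRTF
x = [math.sin(0.4 * k) for k in range(16)]  # toy signal
exact = matvec(circulant(h, 16), x)
approx, kept = wavelet_domain_convolve(h, x, 1e-2)
print("entries kept:", kept, "of", 16 * 16)
print("max abs error:", max(abs(p - q) for p, q in zip(exact, approx)))
```

Because the transform is orthonormal, a threshold of zero reproduces the ordinary convolution up to rounding. Raising the threshold trades accuracy for a sparser operator, which is the trade-off the epsilon parameter controls in the sound clips on this page.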
Err...what?
In plainer English - you can process audio such that it sounds as if
it's coming
from a specific location in 3D space. This is really cool, especially
for games
and theater. This project was an effort to speed up the localization
process using
math tricks from people much smarter than I am. It didn't really work,
but is
still an interesting use of the wavelet transform. And hey, the
filtering might
work better for your application.
Please read the paper for more details. Also feel free to contact me
(address below)
if you have questions, comments, ideas, and so forth.
Publication Info
This has been accepted into the proceedings of the Wavelets X
Conference, Aug 3-8
2003 in San Diego.
A copy of my presentation
slides (PDF format) is now available.
Copyright
The paper copyright has been transferred to SPIE, so I cannot send you
the LaTeX source
or PostScript versions. However, older versions are posted
below, and the code
and other supporting materials are all here.
Software Requirements to run the code
(URLs for these are found below)
- MATLAB, v5 or greater
- KEMAR HRTF set from MIT
- WaveLab toolkit from Stanford
- Source code below
In the interest of
reproducible research, we are making all of the code available
here. Please
contact us if you use this code in your projects or research - we would
very much like to hear
of further work and/or applications in this area.
Matlab versions
This code was originally written for MATLAB v5.0/5.1. When I ported it
to v6.5, I
discovered that the 'flops' counter I used to estimate complexity was
gone, leaving only
elapsed-time measurement (tic/toc), which was less useful. So the code as
posted no longer
computes flops, for which I apologize - you have to run it on v5.1 to
see those.
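The difference between the two kinds of measurement can be illustrated with a small Python sketch (not the project's MATLAB code; the function name is hypothetical). Counting multiply-adds gives a machine-independent figure, similar in spirit to what 'flops' reported, while elapsed time depends on the machine and whatever else it is doing.

```python
import time

def convolve_counting(x, h):
    """Direct linear convolution; also returns the number of multiply-adds performed."""
    y = [0.0] * (len(x) + len(h) - 1)
    ops = 0
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
            ops += 1
    return y, ops

x = [float(k % 7) for k in range(2000)]    # toy signal
h = [1.0 / (k + 1) for k in range(128)]    # toy filter, not a KEMAR HRTF

start = time.perf_counter()
y, ops = convolve_counting(x, h)
elapsed = time.perf_counter() - start

print("multiply-adds:", ops)        # deterministic: len(x) * len(h)
print("elapsed seconds:", elapsed)  # varies from run to run and machine to machine
```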
What about MATLAB clones like Octave?
I've experimented off and on with porting the code to
Octave, but have run into
problems with the runtime. For example, there are no routines to read
and write WAV
files, though I could convert my audio clips into another format. Also,
I get odd
results with the KEMAR HRTF files, where Octave reads too few samples.
The port is currently
on hold, though I'd gladly accept any help.
The Source code
- Code, as written to run on version 5 of
MATLAB, zip file
- Code, current version, ported to v6.5 of
MATLAB, zip file
Sound files: Listen to the Results
We found the most useful test clip to be a Beethoven piano concerto
(the complete reference is in the paper).
- lvb.wav, original, 504kB
- Localized to zero degrees elevation, 40 degrees azimuth, using
MATLAB's 'conv' function: ref_lvb.wav,
1006kB
Note the addition of artifacts and distortion - this is because the
KEMAR HRTF set, while free, is in need of work.
- And now, here is the same clip, localized in the wavelet domain. wd_lvb.wav, 1008kB
The parameters for this are: Daubechies D4 basis, all detail levels,
epsilon (Maximal row norm error) set to 0.001
There are some discernible differences from the normal convolution: a slight
metallic artifact is present.
- Same clip, better approximation (Epsilon reduced to 0.0001).
wd_lvb_2.wav, 1008kB
Now indistinguishable from the standard convolution.
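For readers who want to see what the reference (time-domain) localization amounts to, here is a minimal Python sketch, not the project's MATLAB code. It assumes the HRTF is available as a pair of equal-length impulse responses, one per ear, and convolves a mono signal with each to produce a stereo result. The function names and the toy impulse responses are hypothetical; real ones would come from the KEMAR set.

```python
def convolve(x, h):
    """Direct linear convolution; the output has len(x) + len(h) - 1 samples."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def localize(mono, hrir_left, hrir_right):
    """Return a list of (left, right) sample pairs for a mono input signal."""
    left = convolve(mono, hrir_left)
    right = convolve(mono, hrir_right)
    return list(zip(left, right))

# Toy impulse responses: the right ear hears the sound two samples later and
# at half the amplitude, as if the source were nearer the left ear.
hrir_left = [1.0, 0.0, 0.0]
hrir_right = [0.0, 0.0, 0.5]

stereo = localize([1.0, 2.0, 3.0], hrir_left, hrir_right)
for left_sample, right_sample in stereo:
    print(left_sample, right_sample)
```

The wavelet-domain clips replace this direct convolution with an approximate one; the listening comparison is between the two results.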
Related Links and References
I used Emacs to edit my code, and the MATLAB extensions for Emacs,
called "matlab.el", are mirrored locally.
There are instructions in the file as to how to add MATLAB support
to your .emacs file. Highly recommended!
MATLAB is a product of The
MathWorks, and the MATLAB home page is here.
I used the freeware WaveLab toolkit. WaveLab is found at this Stanford
link.
G. Beylkin did the groundbreaking work in wavelet-domain operators;
he is at U of Colorado in Boulder. His list of
downloadable papers is here, and his homepage is available
here.
The HRTF tables are from the MIT KEMAR project; that is available here.
I strongly recommend good headphones for evaluating these results.
My personal preference is the Grado
Labs SR series, I like the SR60 and SR325 a great deal. I've had
good luck with
headphone.com as a supplier.
One particularly impressive convolution system is
BruteFIR,
which is an extremely efficient convolver using FFT techniques.
Johns Hopkins wrote some excellent Java applets that interactively
demonstrate discrete and continuous convolution. Highly recommended, a
very useful site.
The JHU site also has many other very useful applets and demos to play with;
wander over and take a look!
Update 11/11/10 - Amber Sullivan sent me a link to this wavelet site, which seems to be a pretty good resource list as well.
Previous versions of the paper
Paper, MS Word 97 format, as rejected by
VR2000. Lacking in detail, but covers most of the essentials.
In the interests of, well, something,
here are the reviews. Others talk about full disclosure,
but we deliver!
Paper, VR00 version, PostScript formatted,
for those of you not using MS products.
Contacting the authors
Paul Hubbard's email is [email protected]