Wednesday, June 24, 2015

Thesis: Listening to Distances and Hearing Shapes: Inverse Problems in Room Acoustics and Beyond

Here is another illuminating thesis!



Listening to Distances and Hearing Shapes: Inverse Problems in Room Acoustics and Beyond by Ivan Dokmanic

A central theme of this thesis is using echoes to achieve useful, interesting, and sometimes surprising results. One should have no doubts about the echoes' constructive potential; it is, after all, demonstrated masterfully by Nature. Just think about the bat's intriguing ability to navigate in unknown spaces and hunt for insects by listening to echoes of its calls, or about similar (albeit less well-known) abilities of toothed whales, some birds, shrews, and ultimately people. We show that, perhaps contrary to conventional wisdom, multipath propagation resulting from echoes is our friend. When we think about it the right way, it reveals essential geometric information about the sources-channel-receivers system. The key idea is to think of echoes as being more than just delayed and attenuated peaks in 1D impulse responses; they are actually additional sources with their corresponding 3D locations. This transformation allows us to forget about the abstract room, and to replace it by more familiar point sets. We can then engage the powerful machinery of Euclidean distance geometry. A problem that always arises is that we do not know a priori the matching between the peaks and the points in space, and solving the inverse problem is achieved by echo sorting, a tool we developed for learning correct labelings of echoes. This has applications beyond acoustics, whenever one deals with waves and reflections, or more generally, time-of-flight measurements.

Equipped with this perspective, we first address the "Can one hear the shape of a room?" question, and we answer it with a qualified "yes". Even a single impulse response uniquely describes a convex polyhedral room, whereas a more practical algorithm to reconstruct the room's geometry uses only first-order echoes and a few microphones.

Next, we show how different problems of localization benefit from echoes. The first one is multiple indoor sound source localization. Assuming the room is known, we show that discretizing the Helmholtz equation yields a system of sparse reconstruction problems linked by the common sparsity pattern. By exploiting the full bandwidth of the sources, we show that it is possible to localize multiple unknown sound sources using only a single microphone. We then look at indoor localization with known pulses from the geometric echo perspective introduced previously. Echo sorting enables localization in non-convex rooms without a line-of-sight path, and localization with a single omni-directional sensor, which is impossible without echoes. A closely related problem is microphone position calibration; we show that echoes can help even without assuming that the room is known. Using echoes, we can localize arbitrary numbers of microphones at unknown locations in an unknown room using only one source at an unknown location (for example, a finger snap) and get the room's geometry as a byproduct.

Our study of source localization outgrew the initial form factor when we looked at source localization with spherical microphone arrays. Spherical signals appear well beyond spherical microphone arrays; for example, any signal defined on Earth's surface lives on a sphere. This resulted in the first slight departure from the main theme: We develop the theory and algorithms for sampling sparse signals on the sphere using finite rate-of-innovation principles and apply it to various signal processing problems on the sphere.

One way our brain uses echoes to improve speech communication is by integrating them with the direct path to increase the effective useful signal power. We take inspiration from this human ability, and propose acoustic rake receivers (ARRs) for speech: microphone beamformers that listen to echoes. We show that by beamforming towards echoes, ARRs improve not only the signal-to-interference-and-noise ratio (SINR), but also the perceptual evaluation of speech quality (PESQ).

The final chapter is motivated by yet another localization problem, this time a tomographic inversion that must be performed extremely fast on computation- and storage-constrained hardware. We initially proposed the sparse pseudoinverse as a solution, and this led us to the second slight departure from the main theme: an investigation of the properties of various norm-minimizing generalized inverses. We categorize matrix norms according to whether or not their minimization yields the MPP, and show that norm-minimizing generalized inverses have interesting properties. For example, the worst-case and average-case ℓ^p-norm blowup is minimized by generalized inverses minimizing certain induced and mixed matrix norms; we call this a poor man's ℓ^p minimization.
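
To make the abstract's key idea concrete (an echo treated as an additional source with its own 3D location), here is a minimal sketch of the classical image-source construction for a single first-order echo. It is not code from the thesis: the function names, the example coordinates, and the nominal speed of sound are all assumptions chosen for illustration. A source is mirrored across a planar wall, and the echo's time of flight is then simply the distance from that mirrored point to the microphone divided by the speed of sound.

```python
import math

# Assumed nominal speed of sound in air (m/s); not a value taken from the thesis.
SPEED_OF_SOUND = 343.0


def mirror_across_wall(source, wall_point, wall_normal):
    """Reflect a 3D point across the plane through wall_point with normal wall_normal."""
    norm = math.sqrt(sum(n * n for n in wall_normal))
    unit = [n / norm for n in wall_normal]
    # Signed distance from the source to the wall plane.
    signed_dist = sum((s - p) * u for s, p, u in zip(source, wall_point, unit))
    # Move the source twice that distance along the normal to reach its mirror image.
    return [s - 2.0 * signed_dist * u for s, u in zip(source, unit)]


def echo_delay(source, mic, wall_point, wall_normal, c=SPEED_OF_SOUND):
    """Time of flight of the first-order echo off one wall, via the image source."""
    image = mirror_across_wall(source, wall_point, wall_normal)
    return math.dist(image, mic) / c


# Hypothetical geometry: the wall is the floor (the plane z = 0).
source = [1.0, 2.0, 1.5]
mic = [3.0, 1.0, 1.2]
floor_point, floor_normal = [0.0, 0.0, 0.0], [0.0, 0.0, 1.0]

image = mirror_across_wall(source, floor_point, floor_normal)
direct = math.dist(source, mic) / SPEED_OF_SOUND
echo = echo_delay(source, mic, floor_point, floor_normal)

print("image source:", image)
print("direct-path delay (s):", direct)
print("floor-echo delay (s):", echo)
```

Once every echo is replaced by such a point, the measured delays become distances between points, which is the setting in which the abstract brings in Euclidean distance geometry. The sketch assumes the wall generating each echo is known; deciding which peak belongs to which wall is the labeling problem that the abstract's echo sorting addresses.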
 
 
