Simulation of Auditory Near-field Distance
This 3-year project aims to build the next generation of sound spatializers to be used in conjunction with Head-Mounted Displays (HMDs) in Virtual Reality (VR) environments. As a result of this project, a prototype of a real-time virtual sound spatializer that allows distance control for binaural reproduction will be implemented. This proof of concept will let us evaluate the accuracy of the spatialization methods, especially in the near field and close to virtual walls. By demonstrating its feasibility and benefits, we expect the general public to take greater interest in the introduction of spatial sound alongside visual 3D technologies.
This project, SAUND, aims to develop a sound spatializer (a software program) that can be used with Head-Mounted Displays (HMDs). An HMD is a device featuring 3D displays that is usually worn as a helmet or a pair of glasses. The proposed spatializer would be able to simulate not only distance changes of remote sound sources (as current spatializers do) but also those produced in the near proximity of the user. Specifically, we plan to develop a prototype of such a spatializer as depicted in Figure 1.
Conservative estimates [1] put the number of HMD units sold by 2020 at around 25 million. This multi-billion-dollar market would increase the demand for immersive content, not only visual but also auditory, tactile, etc. Audition, i.e., the sense of hearing, is arguably the second most important modality of perception.
In contrast with the verisimilitude of the visual images obtained with HMDs, the projection of auditory images still faces many hurdles that prevent such realism. For example, binaural recordings (recordings made with two microphones fitted into a surrogate of the listener: another person, a mannequin, etc.) allow impressive auditory illusions and also capture the characteristics of the room where the recording was made. However, situations arise where such recordings become impractical, as in the case of virtual worlds where the interactions of agents (inhabitants, assets, etc.) are difficult to predict.
Although the angle from which a sound is projected is relatively easy to simulate, its auditory distance constitutes an elusive problem. Several techniques have been proposed to solve it: simulating changes with monaural intensity (i.e., changing the sound level at both ears by the same amount) [2]; manipulating the ratio between the acoustic energy traveling directly from the sound source to the listener's ears and that arriving via other paths (bouncing off walls, etc.), the direct-to-reverberant energy ratio [3]; computing a Distance Variation Function (DVF) [5], i.e., a function describing changes in the filtering effect of the ears, head, and upper body (Head-Related Impulse Responses, HRIRs) when a sound source moves from the far field to the near field; decomposing HRIRs into spherical harmonics (a mathematical way to describe a sound field) [4]; interpolating previously captured near-field HRIRs [6]; etc. To complicate this problem, when a listener is in the near field of a source (i.e., at a relatively short distance) or in the proximity of walls, monaural intensity changes are no longer adequate. Furthermore, listeners tend to overestimate the distance of virtual stimuli in the near field, but the causes of such errors are currently undetermined.
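As an illustration of the first and simplest of these cues, monaural intensity follows the inverse-distance (1/r) law: doubling the distance lowers the level by about 6 dB. The following minimal Python sketch shows this cue in isolation; it is not part of the SAUND implementation, and the function name is ours:

```python
import numpy as np

def monaural_distance_gain(signal, distance_m, ref_distance_m=1.0):
    """Scale a mono signal by the inverse-distance (1/r) law.

    This applies the same level change to both ears, the monaural
    intensity cue discussed above; it carries no near-field information.
    """
    r = max(distance_m, ref_distance_m)  # clamp inside the reference radius
    return signal * (ref_distance_m / r)

# Example: the same 440 Hz tone rendered at 1 m, 2 m, and 4 m.
fs = 48_000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 440 * t)
for d in (1.0, 2.0, 4.0):
    peak = monaural_distance_gain(tone, d).max()
    print(f"{d:.0f} m -> peak {peak:.3f} ({20 * np.log10(peak):+.1f} dB)")
```

As noted above, this cue alone cannot account for near-field effects, which is why DVF- and HRIR-based methods become necessary at short distances.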
The purpose of this research is to bridge the gap between the realism of visual and auditory displays in HMDs, especially for near-field sources. Since headphones are the de facto reproduction apparatus for HMDs, this research focuses on the Simulation of AUditory Near-field Distance (SAUND) over headphone (binaural) reproduction.
With this system in place, a typical user would be able to hear multiple sounds coming from arbitrary directions and distances, corresponding to the virtual assets in a scene: e.g., other users, sound effects, etc. A single spatializer block is used for each audio stream (belonging to an asset). The location (x, y, z coordinates) of each asset is retrieved from the logic of the VR scene, as is the pose of the user (x, y, z and rotations around these axes). The resulting sound mix is then presented to the user, commonly via headphones.
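To make this dataflow concrete, here is a minimal Python sketch of the per-asset spatializer block and final mix described above. The project itself targets Pure Data, so this is only an illustration under our own assumptions; the class and field names are hypothetical:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class ListenerPose:
    position: np.ndarray   # x, y, z in metres, from the VR scene logic
    rotation: np.ndarray   # rotations around the x, y, z axes (radians)

@dataclass
class Spatializer:
    """One block per audio stream, i.e., per virtual asset."""
    source_position: np.ndarray  # x, y, z of the asset

    def process(self, mono_block: np.ndarray, listener: ListenerPose) -> np.ndarray:
        # Placeholder processing: 1/r gain only. A real block would also
        # apply direction- and distance-dependent (HRIR/DVF) filtering
        # separately per ear to render angle and near-field distance.
        r = np.linalg.norm(self.source_position - listener.position)
        g = 1.0 / max(r, 1.0)
        return np.stack([mono_block * g, mono_block * g])  # [left, right]

def mix(rendered_blocks):
    """Sum the per-asset binaural outputs into the final headphone mix."""
    return np.sum(rendered_blocks, axis=0)
```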
The creation, deletion, assignment, and management of the spatializer modules is performed by the spatializer manager module. For data and audio communication, protocols such as Open Sound Control (OSC) would be used.
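As an example of what such OSC traffic could look like, the following Python snippet uses the python-osc library to send an asset position and a listener pose to the manager. The address patterns and port are hypothetical, chosen only for illustration:

```python
# pip install python-osc
from pythonosc.udp_client import SimpleUDPClient

# Host and port of the (hypothetical) spatializer manager.
client = SimpleUDPClient("127.0.0.1", 9000)

# Position of asset 1 in metres (x, y, z).
client.send_message("/saund/asset/1/pos", [0.4, 0.0, 1.2])

# Listener pose: x, y, z plus rotations around the three axes (degrees).
client.send_message("/saund/listener/pose", [0.0, 0.0, 0.0, 90.0, 0.0, 0.0])
```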
Several aspects of SAUND are new, including:
Besides disseminating our results, we will build a sound spatializer prototype to demonstrate the feasibility and benefits of near-field spatialization. This prototype would feature the aforementioned functionalities, but integration with a full-scale VR system is beyond the scope of this project.
The way we interact with information will change drastically when HMDs, or similar technologies, are widely adopted. Hence, we must prepare audio technologies with capabilities similar to those of their visual counterparts.
SAUND aims to unveil latent difficulties in achieving this goal and to stress the importance of spatial sound in virtual environments. These are some of SAUND's significant aspects:
The SAUND project comprises six work packages (WPs), as illustrated in Figure 2. These WPs correspond to different aspects of the project; concretely:
WP1 comprises the subjective evaluation of different methods used for near-field localization of virtual sources (see the short review above).
The real-time implementation of the spatializers (in the Pure Data programming language) is grouped in WP2. Tasks related to the development of visual content for HMD visualization (developed in Unity) will be conducted in WP3, while WP4 will cover the integration of the visual and audio parts, as well as their evaluation.
Dissemination and demonstration tasks are covered in WP5, and the administrative tasks that guarantee the normal development of the project (monitoring, coordination, progress meetings, technical progress, objective achievement, financial issues, communication, quality assurance, and punctuality of reports and demonstrations) are contained in WP6. Work packages 1, 2, 4, and 6 will be led by Assoc. Prof. Julian Villegas, and WP3 and WP5 by Senior Assoc. Prof. Jie Huang.
Julian Villegas: http://onkyo.u-aizu.ac.jp/
Jie Huang: http://web-ext.u-aizu.ac.jp/~j-huang/
2016
Project title: "SAUND: Simulation of auditory near-field distance".
This project is supported by JSPS KAKENHI Grant Number 16K00277.