Welcome to the Onkyo Winter Camp (2024)

In collaboration with Rochester Institute of Technology (RIT), Korea Advanced Institute of Science & Technology (KAIST), and the University of Aizu (UoA), we will host a series of seminars on the 7th of February (Wednesday).
The objective of this event is to strengthen the collaboration between these esteemed institutions and share our latest developments with the members of the University of Aizu.



Here is the program of presentations for the Onkyo Winter Camp.
Location: Room M12 (3rd floor, Research Quadrangles, University of Aizu)

EEG measurement for human spatial auditory perception
Akira Takeuchi (RIT)
Time: 10:00–10:30

Even in situations where sound comes from various directions, such as a cafeteria, an office, or a party lounge, people can focus on a particular sound. This ability, called the “cocktail party effect,” remains a mystery. Our study investigates listeners’ biological responses to speech under different masking noise conditions during a spatial selective attention task. We also focus on cultural differences in listening style between interdependent and independent listeners, as defined in a previous study. By controlling the signal-to-noise ratio and the position of the sound source, we compare listeners’ levels of attention using their EEG responses.

Subjective personalization of HRTFs
Camilo Arevalo (UoA)
Time: 10:30–11:00

A mismatch between generic HRTFs, commonly used in audio spatializers, and those of a particular listener can lead to an inaccurate representation of audio sources for that listener; hence the importance of HRTF personalization. To improve the subjective performance of audio spatializers, we propose a method for personalizing HRTFs by manipulating latent-space features extracted from multiple HRTF databases with autoencoders. Subjective experiments show that the proposed personalization method, despite using few variables, causes no perceptual detriment compared to a non-personalized method. Furthermore, it performs better for elevations located at the back than a non-personalized HRTF database.
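The latent-space idea above can be illustrated with a minimal sketch. Note the many assumptions here: random data stands in for real HRTF magnitude responses, a linear PCA encoder/decoder stands in for the talk's autoencoders, and the latent offsets are arbitrary; none of the dimensions or values are from the study.

```python
import numpy as np

# Toy stand-in data: 500 "HRTF magnitude responses" of 128 frequency bins.
rng = np.random.default_rng(0)
hrtfs = rng.standard_normal((500, 128))

# Linear encoder/decoder via SVD (PCA) as a simple proxy for an autoencoder.
mean = hrtfs.mean(axis=0)
U, S, Vt = np.linalg.svd(hrtfs - mean, full_matrices=False)
k = 4  # few latent variables, as in the abstract


def encode(h):
    # Project a response onto the k-dimensional latent space.
    return (h - mean) @ Vt[:k].T


def decode(z):
    # Reconstruct a response from a latent code.
    return z @ Vt[:k] + mean


# "Personalization": nudge the latent code for one listener, then decode.
z = encode(hrtfs[0])
z_personalized = z + np.array([0.5, 0.0, -0.3, 0.0])  # illustrative offsets
hrtf_personalized = decode(z_personalized)
```

The appeal of this approach is that a listener only has to tune a handful of latent variables instead of hundreds of frequency-bin values per direction.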

AR auditory training for speech in noise perception and sound localization
Sean Koh (KAIST)
Time: 11:00–11:30

The World Health Organization reports that around 20% of the global population is affected by hearing impairment. Addressing this, we introduce an AR-based auditory training game to offer innovative in-situ training, focusing on enhancing auditory selective attention and word discrimination. Additionally, it tracks user movement in relation to audio sources, providing insight into user behavior in localization tasks. By analyzing navigation behavior, we can distinguish between hearing-impaired and non-impaired individuals, as well as track their rate of improvement throughout the training.

Effect of distortion on the intelligibility of speech in noise
Julián Villegas, Ph.D. (UoA)
Time: 11:30–12:00

In this research, we investigate the impact of combining linear and non-linear distortion on the intelligibility of speech presented in noise. Specifically, we focus on the effect of non-linear distortion introduced by half-wave rectification in conjunction with pre-emphasis and dynamic range compression. Both pre-emphasis and dynamic range compression contribute linear distortion to the signal.
Preliminary results from our ongoing research suggest that the proposed method significantly enhances speech intelligibility compared to spectral shaping with dynamic range compression (SSDRC). Notably, this improvement is achieved with reduced CPU demand, making our approach more computationally efficient.
It is important to note that this research is still in progress, and further analysis and experimentation are underway to validate and refine our findings.
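For readers unfamiliar with the operations named above, here is a minimal NumPy sketch of the processing chain. All parameter values (pre-emphasis coefficient, compression threshold and ratio) are illustrative assumptions, not those used in the study.

```python
import numpy as np


def pre_emphasis(x, alpha=0.97):
    # Linear distortion: first-order FIR high-pass, y[n] = x[n] - alpha*x[n-1].
    return np.append(x[0], x[1:] - alpha * x[:-1])


def half_wave_rectify(x):
    # Non-linear distortion: keep only the positive half of the waveform.
    return np.maximum(x, 0.0)


def compress_dynamic_range(x, threshold=0.1, ratio=4.0):
    # Linear-ish distortion: simple static compressor that attenuates
    # sample magnitudes above the threshold by the given ratio.
    mag = np.abs(x)
    over = mag > threshold
    gain = np.ones_like(x)
    gain[over] = (threshold + (mag[over] - threshold) / ratio) / mag[over]
    return x * gain


def process(x):
    # Chain: pre-emphasis -> half-wave rectification -> compression.
    return compress_dynamic_range(half_wave_rectify(pre_emphasis(x)))
```

In a real intelligibility experiment the processed signal would be level-normalized before being mixed with noise; that step is omitted here for brevity.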

Towards individualization of binaural music reproduction
Sungyoung Kim, Ph.D. (KAIST/RIT)
Time: 12:00–12:30

The AIRIS laboratory (Applied and Innovative Research for Immersive Sound) conducted a comprehensive comparison of four commercial binaural renderers, assessing listeners’ preferences for each. Building upon our prior research, this comparative analysis revealed distinct between-group differences that can be attributed to individual listening proficiency. Participants with extensive backgrounds in music and audio production exhibited heightened sensitivity to subtle perceptual variations in binaural renderings, underscoring their ability to discern nuanced differences. In contrast, the other group, comprising listeners with less experience in music and audio production, demonstrated a tendency to overlook minor distinctions in binaural rendering. Instead, their sensitivity was directed towards the overall direct-to-reverberation ratio. This intriguing finding suggests that introducing room-related reverberation could potentially enhance the binaural presentation of musical content for this specific group of listeners. Conversely, participants with more critical and advanced listening skills prioritized timbre-related fidelity over the precise reproduction of space-induced characteristics.

Spatial soundscape superposition
Michael Cohen, Ph.D. (UoA)
Time: 12:30–13:00

Contemporary listeners are exposed to overlaid cacophonies of sonic sources. Such soundscape superposition can be usefully characterized by where such combination actually occurs: at the ears of listeners, in the auditory imagery subjectively evoked by such events, or in whatever audio equipment is used to mix, transmit, and display such signals. Besides physical and psychological combinations, procedural (logical and cognitive) superposition considers such aspects as layering of soundscapes; parameterized binaural and spatial effects; audio windowing, narrowcasting, and multipresence as strategies for managing privacy; metaphorical mappings between audio sources and virtual location; separation of visual and auditory perspectives; rotation as revolution; separation of direction and distance; and range-compression and -indifference. Exploiting multimodal sensation and mental models of situations and environments, convention and idiom can tighten apprehension of a scene, using metaphor and relaxed expectation of sonorealism to enrich communication.



Day 1 (arrival)
17:00: Arrival at Aizu Wakamatsu
18:00: Welcome party

Day 2 (workshop)
10:00–13:00: Workshop
13:00–18:00: Social program
18:00: Farewell party