
Student projects


Graduate research:

    • «Machine learning-based music synthesis,» Edward Ly (in progress).
    • «Real-time spatialization using speakers,» N. Fukasawa (in progress).
    • «Lateralization of sound by spectral energy equalization and delay adjustments using single-layer loudspeaker arrays,» S. Nogami (2015).
      In this research, we investigated the influence of spectral energy changes and delay adjustments on azimuth judgements.
      First, we examined spectral energy changes by comparing subjective azimuth judgements obtained with three methods: Vector-Based Amplitude Panning (VBAP), VBAP mixed with binaural rendering over loudspeakers (VBAP+HRTF), and a newly proposed method based on equalizing spectral energy. Significantly smaller errors were found for stimuli treated with VBAP+HRTF; differences between the other two treatments were not significant. Regarding the spherical dispersion of the judgements, VBAP yielded the greatest dispersion, whereas the other two methods yielded significantly smaller, mutually similar dispersions. These results suggest that horizontal localization with VBAP can be improved by applying a frequency-dependent panning factor instead of the constant scalar commonly used, and led us to hypothesize that including the Interaural Time Difference (ITD) benefits azimuth judgements.
      Second, we examined the influence of delay by comparing subjective azimuth judgements obtained with three methods: reproducing the stimuli from real loudspeakers, Ambisonics, and a newly proposed method based on equalizing spectral energy with delay adjustments. Adding delay had a beneficial effect on the accuracy of panning judgements: our method yielded smaller absolute errors than Ambisonics, and also smaller absolute errors than our previous method. These results suggest that delay adjustments can improve horizontal localization.
    • «Elevation of sound by spectral energy equalization and delay adjustments using single-layer loudspeaker arrays,» T. Nagasaka (2015).
      We investigated the relative influence of spectral cues on the elevation localization of virtual sources, comparing five methods. Two were based on Vector-Based Amplitude Panning: 3D Vector-Based Amplitude Panning (3D-VBAP), and 2D-VBAP in conjunction with HRIR convolution. The other three were an equalizing-filter method, which filtered the stimuli to simulate the spectral peaks and troughs naturally occurring at different elevation angles; a modification of that method adding delay adjustments; and reproduction from loudspeakers at the real positions. A single horizontal loudspeaker array was used for three of the methods (2D-VBAP with HRIR convolution, and equalizing filters with and without delay adjustments).
      The study comprised two experiments. In the first, the smallest absolute errors were observed for the 3D-VBAP judgements regardless of azimuth; no significant difference in mean absolute error was found between the other methods. However, for most presentation azimuths, the equalizing-filter method yielded the least dispersed results.
      In the second experiment, localization with the equalizing-filter treatment improved after a change of HRTF database and the addition of delay adjustments. The changes were related to the reproduction of elevated sounds, although the relationship between localization and these adjustments remained unclear. These results could be used to improve elevation localization in two-dimensional VBAP reproduction systems.
      Throughout the experiments, we developed an experiment system that runs Pd as an iOS application, with the OSC protocol as the back end. Most of the implementation was GUI-based programming, making it accessible to experimenters unfamiliar with text-based programming. The system can reduce the time cost of previous procedures such as oral communication and note taking.
    • «Relative influence of spectral bands in horizontal-front localization of white noise,» T. Sugasawa (2014).
      In this research, the frequency ranges important for recognizing sound as coming from the front are investigated, and a method to reproduce realistic sound for that direction in the absence of a front loudspeaker is suggested. Stereophonic systems reproduce sound over two channels, usually arranged symmetrically about the median plane at the same distance from the listener. With such a system it is possible to present realistic sound in the frontal direction, but its localization accuracy and sound spreading are worse than those of a multichannel surround system, since there is no front loudspeaker. To ameliorate this, I focused on manipulating the frequency spectrum and investigated the relationship between energy in spectral bands and front localization. Yunoue [1], a graduate of the University of Aizu, had previously investigated the same question using three loudspeakers. He divided the 0.02–22.05 kHz range into 13 bands and conducted experiments to determine which bands are more important for localizing sound images in the frontal area. However, some problems in his method may have affected his results. To assess the correctness of his findings, some modifications were introduced and the same experiment was conducted again. In addition, Head-Related Transfer Function (HRTF) data was analyzed to create compensation filters at 0° and ±30° to improve the focus of front sources in stereophonic systems. The created filters were convolved with several sound sources and reproduced via two-way loudspeakers. For comparison, sounds processed with a panning technique were also reproduced. The results show that the inverse-filter method was better than simple panning at improving the perceptual focus of the frontal image. This method could easily be implemented in real-time systems and probably extended to other spatial dimensions.
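Several of the graduate projects above build on Vector-Based Amplitude Panning (VBAP). As a rough illustration of the underlying idea, and not any student's actual code, a two-loudspeaker sketch might look like the following; the function name and the power normalization are assumptions.

```python
import math

def vbap_2d_gains(source_deg, spk1_deg, spk2_deg):
    """Compute 2D VBAP gains for a virtual source between two loudspeakers.

    Angles are in degrees on the horizontal plane. Returns (g1, g2),
    power-normalized so that g1**2 + g2**2 == 1.
    """
    # Unit vectors toward each loudspeaker and toward the virtual source.
    l1 = (math.cos(math.radians(spk1_deg)), math.sin(math.radians(spk1_deg)))
    l2 = (math.cos(math.radians(spk2_deg)), math.sin(math.radians(spk2_deg)))
    p = (math.cos(math.radians(source_deg)), math.sin(math.radians(source_deg)))

    # Solve p = g1*l1 + g2*l2 by inverting the 2x2 matrix [l1 l2].
    det = l1[0] * l2[1] - l2[0] * l1[1]
    g1 = (p[0] * l2[1] - l2[0] * p[1]) / det
    g2 = (l1[0] * p[1] - p[0] * l1[1]) / det

    # Power normalization keeps perceived loudness roughly constant.
    norm = math.hypot(g1, g2)
    return g1 / norm, g2 / norm
```

For a source half-way between loudspeakers at ±30°, both gains come out equal (about 0.707 each), the classic equal-power pan position; a frequency-dependent variant, as suggested by the results above, would compute such gains per band instead of once per source.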

Undergraduate research:

    • «Floor reflection effect on elevation perception of sounds,» D. Hasegawa (2019).
      The aim of this study is to improve the accuracy of sound elevation perception. The localization accuracy reported for artificial sounds below a listener's ears is usually inferior to that for sounds above them. To address this, signals with and without a floor reflection were recorded in an anechoic chamber, and the recordings were processed to eliminate the effect of the recording apparatus (microphone, loudspeaker, etc.). Four methods (direct, delay, reflection, and mixed HRIR) were compared: participants listened to sounds processed with each method and judged their elevation. As a result, the ability to distinguish sounds coming from below the ears improved when using HRIRs that include the reflection from the floor.
    • «Improving Speech Localization in Virtual Reality,» N. Miyauchi (2019).
      The purpose of this study is to build a database of HRIRs (Head-Related Impulse Responses) that includes mouth radiation. We collected impulse responses using two HATS (Head and Torso Simulators) in an anechoic chamber, and used equalizing filters to create directional sound images by convolving the impulse responses with monophonic sound. An experiment examined how people perceive the resulting directional sound sources. The average error was 39.2°. The results indicate that using two HATS affected sound perception at a listener azimuth of 0°. This project was carried out in collaboration with Ruri Moriyama.
    • «Improving speech localization in VR,» R. Moriyama (2019).
      The aim of this study is to measure voice directivity using two HATS (Head and Torso Simulators), one as a listener and the other as a speaker. In virtual reality (VR) we often know the location of a speaker (right, left, back, front, etc.), but it is difficult to perceive the speaker's orientation, because HRIRs (Head-Related Impulse Responses) have no voice directivity pattern built into them. In the real world, by contrast, we can determine the location and orientation of a speaker with good accuracy even when he or she cannot be seen. The HATS we used as the speaker featured a mouth simulator with a directivity pattern similar to that found in humans. We applied diffuse-field equalization before creating the spatialized sounds. An experiment verified how people perceive spatialized sounds, comparing a standard HRIR database with ours. The results indicate that in VR scenes it was more difficult to perceive a speaker's orientation correctly from the side than from the front. Because of its large scope, this project was carried out in collaboration with Ms. Miyauchi.
    • «Replacing vocalic part with Shepard-tone like spectrum in real-time,» S. Hirata (2019).
      The aim of this study is to explore the aesthetic possibilities of the sung voice combined with inharmonic instruments, and real-time vocalic detection for future applications. We used a Pure-data program to replace the vocalic part with a Shepard-tone-like spectrum in real time (same pitch and same reported duration). An experiment investigated whether the psychoacoustic roughness of the voice combined with an inharmonic instrument is minimized. Participants estimated the roughness of four types of sound sources with different mixing ratios. The results indicate that the roughness of the voice with an inharmonic timbre was not minimized by the Pure-data program.
    • «Platform for comparing sound spatialization methods in virtual reality,» A. Uemura (2018).
      This research aimed at building an acoustic environment for Virtual Reality (VR) spatialization of sound. To test the built VR program, a subjective experiment was conducted to compare judgements of virtual sound images processed with two methods: the default spatialization in Unity, and HRTF convolution. The results indicated that the proposed environment could be used for other method comparisons in the future.
    • «Perception of spatialized Risset tones,» N. Fukasawa (2018).
    • «Improving sound perception in elevation using single layer loudspeaker array display,» Y. Suzuki (2018).
      The purpose of this study is to improve the localization in elevation of sound sources reproduced by a single-layer loudspeaker array. We used equalizing filters to create elevated sound images. An experiment elicited how people perceive the elevated sound sources: participants reported the perceived direction as elevation and azimuth angles. The results indicate that elevation perception was improved by using a loudspeaker-grouping method.
    • «Computer-assisted singing experience,» M. Ishihara (2018).
    • «Assisting System for Grocery Shopping Navigation and Product Recommendation,» S. Saito (2018).
      We present a system that supports grocery shopping by recommending products according to those currently in the shopping basket, and by guiding users from their current position to the location of a recommended item in the store. Suggestions are derived from the analysis of large sets of purchase data.
    • «Implementation of a transaural system in Pure-data,» T. Ninagawa (2017).
      Transaural audio is a method used to deliver binaural signals to a listener through regular stereo loudspeakers. This thesis discusses the implementation of a transaural audio filter in Pure-data, a visual programming language used to process and generate sound, video, and 2D/3D graphics, and to interface with sensors, input devices, etc.
    • «Quantifying the benefits of bimodal navigation systems,» T. Takahashi (2015).
    • «Loudness perception with headphone and vibration,» Y. Ito (2015).
      When listening to music, people often raise the level because the bass is difficult to hear, and this causes a variety of problems: the number of people who listen to music every day is increasing, and so is the incidence of hearing impairment. This research aims to reduce these problems by exploiting the low-frequency energy of music. We used vibration motors to reinforce, through vibration, the low-frequency energy found in the bass. We assumed that difficulty hearing the bass is one reason listeners turn the level up; accordingly, we considered adding bass reinforcement by vibration to the music that is routinely heard, so that users would not have to increase the playback level.
    • «A study of ultrasound encoding and decoding based on steganography,» R. Igarashi (2015).
      This thesis describes a method of information communication using steganography, one of the information-hiding technologies. The technique uses sound rather than the radio waves conventionally used by cell phones and the Internet. It conceals information in a high-frequency band beyond the human audible range, making it possible to transmit and receive information without it being noticed by a third party. Text messages were transmitted with the suggested technique, and its effectiveness was inspected by decoding them with the reverse process at the receiver.
    • «Steganography in stereo signals using phase changes,» S. Hoshi (2014).
      The present study was undertaken to embed information into the right channel of a stereo audio file and to create a corresponding detection system. We focused on phase modulation as the watermarking technique, implemented with all-pass filters. The embedded information was the binary data of an ASCII-coded message. This simple approach shows that information can be embedded this way, with applications in indoor localization, security, etc.
    • «Navigating in virtual worlds using smartphones: Reflecting real world motion in virtual environments,» Y. Chiba (2013).
      «Freedom» is a project that lets users navigate a virtual environment using the sensors in smartphones. Users control 3D animation created in Alice, software for creating and rendering 3D animation with the Java programming language. «Freedom» is a new way of controlling 3D animation intuitively and impressing users with a realistic experience.
    • «‘Machi-Beacon’: Spatial Sound For Mobile Navigation System,» W. Sanuki (2013).

      This research explores the development of mobile navigation systems using spatial sound. A combination of spatial sound and geographic information allows mobile-device users to perform auditory localization tasks while their eyes, hands, and attention are otherwise occupied. We created a spatial-sound navigation system using Unity that informs users of their orientation toward a goal and the distance from their present position to it. The system runs on iOS and Android. Preliminary results indicate that a combination of spatial sound, GPS, and GIS can be used for navigation.

      This research was presented at the «Aizu Industry IT Technology» contest where it won an Encouragement Prize.

    • «Implementing an A-weighting filter as an external object in Pure-data,» K. Sakui (2013).
      This research aimed at the construction of an A-weighting filter as an external object in Pd, built in the C language. The external's output was compared graphically with that of an actual A-weighting filter. Similar results were obtained in the middle frequency range, but for low frequencies (under 100 Hz) the values differed noticeably.
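Hasegawa's floor-reflection study above combines a direct-path HRIR with a delayed, attenuated floor reflection. A minimal sketch of that combination follows; the function name, delay, and gain are illustrative assumptions, not the measured processing chain.

```python
def mix_floor_reflection(h_direct, h_reflect, delay_samples, gain=0.5):
    """Combine a direct-path HRIR with a delayed, attenuated floor reflection.

    h_direct and h_reflect are impulse responses as lists of floats;
    delay_samples is the extra propagation delay of the reflected path.
    """
    n = max(len(h_direct), delay_samples + len(h_reflect))
    out = [0.0] * n
    for i, v in enumerate(h_direct):
        out[i] += v                      # direct path, no delay
    for i, v in enumerate(h_reflect):
        out[delay_samples + i] += gain * v  # attenuated, delayed reflection
    return out
```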
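The Shepard-tone-like spectrum used in Hirata's project can be illustrated with octave-spaced partials under a Gaussian envelope in log-frequency, which gives a clear pitch class but an ambiguous octave. A sketch, with the envelope center and width chosen arbitrarily:

```python
import math

def shepard_partials(base_hz, n_octaves=8, center_hz=440.0, sigma_oct=1.5):
    """Return (frequency, amplitude) pairs for a Shepard-tone-like spectrum.

    Partials sit at octave multiples of base_hz; amplitudes follow a
    Gaussian envelope in log-frequency centered at center_hz.
    """
    partials = []
    for k in range(n_octaves):
        f = base_hz * (2 ** k)
        dist_oct = math.log2(f / center_hz)       # distance in octaves
        amp = math.exp(-(dist_oct ** 2) / (2 * sigma_oct ** 2))
        partials.append((f, amp))
    return partials
```

Summing sinusoids at these frequencies and amplitudes yields the tone; shifting base_hz while keeping the envelope fixed produces the well-known endless-glissando effect of Shepard and Risset tones.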
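Ninagawa's transaural filter delivers binaural signals over stereo loudspeakers by cancelling the acoustic crosstalk between channels. For a symmetric listener/loudspeaker setup this reduces, at each frequency, to inverting a 2×2 matrix of ipsilateral and contralateral responses; a per-frequency sketch (not the Pure-data implementation itself) might look like this:

```python
def crosstalk_canceller(h_ipsi, h_contra):
    """Per-frequency crosstalk-cancellation filters for a symmetric setup.

    h_ipsi and h_contra are the complex frequency responses from a
    loudspeaker to the same-side and opposite-side ear. Returns
    (c_direct, c_cross): applying the matrix [[c_direct, c_cross],
    [c_cross, c_direct]] before the acoustic paths yields identity
    at the ears, so each ear receives only its binaural channel.
    """
    det = h_ipsi * h_ipsi - h_contra * h_contra
    return h_ipsi / det, -h_contra / det
```

In practice such filters are computed per frequency bin, regularized where det is small, and applied by fast convolution; the one-bin function above only shows the core inversion.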
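Igarashi's project hides data in a band above the audible range. One simple way to realize this, an illustrative assumption since the thesis does not specify its modulation scheme, is frequency-shift keying with near-ultrasonic tone bursts, detected with the Goertzel algorithm:

```python
import math

def encode_bits_ultrasound(bits, sr=44100, f0=18000.0, f1=20000.0, bit_dur=0.05):
    """Encode bits as high-frequency tone bursts (FSK above most adults' hearing).

    f0 carries 0 and f1 carries 1; at sr = 44.1 kHz both stay below Nyquist.
    """
    n = int(sr * bit_dur)
    samples = []
    for bit in bits:
        f = f1 if bit else f0
        for i in range(n):
            samples.append(math.sin(2 * math.pi * f * i / sr))
    return samples

def goertzel_power(samples, f, sr=44100):
    """Signal power at frequency f, via the Goertzel recurrence."""
    w = 2 * math.pi * f / sr
    coeff = 2 * math.cos(w)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2
```

The receiver splits the signal into bit-length blocks and, for each block, compares the Goertzel power at f0 and f1 to recover the bit stream.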
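Hoshi's watermark uses phase modulation implemented with all-pass filters, which shift phase while leaving the magnitude spectrum flat. A first-order sketch of the idea follows; the coefficients, block size, and function names are illustrative, not the thesis implementation.

```python
def allpass(x, a):
    """First-order all-pass filter: H(z) = (-a + z**-1) / (1 - a*z**-1).

    Flat magnitude response, frequency-dependent phase shift set by a.
    """
    y = []
    x_prev = y_prev = 0.0
    for xn in x:
        yn = -a * xn + x_prev + a * y_prev
        y.append(yn)
        x_prev, y_prev = xn, yn
    return y

def embed_bits(right_channel, bits, block=1024, a0=0.1, a1=0.5):
    """Embed one bit per block by choosing the all-pass coefficient.

    A detector estimating each block's phase shift relative to the
    untouched left channel can recover the bit stream.
    """
    out = []
    for i, bit in enumerate(bits):
        seg = right_channel[i * block:(i + 1) * block]
        out.extend(allpass(seg, a1 if bit else a0))
    return out
```

Because the filter is all-pass, the watermark changes only phase: the impulse response's total energy is exactly 1, so the watermarked channel sounds essentially unchanged.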
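Sakui's external implements A-weighting, whose standard magnitude curve (IEC 61672) can be written directly. For reference, the weighting in dB as a Python function rather than the C external described above:

```python
import math

def a_weight_db(f):
    """A-weighting gain in dB at frequency f (Hz), per the IEC 61672 curve.

    The +2.00 dB offset normalizes the curve so that
    a_weight_db(1000) is approximately 0 dB.
    """
    f2 = f * f
    ra = (12194.0 ** 2 * f2 * f2) / (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20.0 * math.log10(ra) + 2.00
```

The curve falls off steeply below a few hundred hertz (roughly -19 dB at 100 Hz), which is exactly the region where the comparison above found the external's values diverging.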