
JBE, vol. 24, no. 2, pp. 227-233, March 2019

DOI: https://doi.org/10.5909/JBE.2019.24.2.227

A Sound Interpolation Method Using Deep Neural Network for Virtual Reality Sound

Jaegyu Choi and Seung Ho Choi

Corresponding author e-mail: shchoi@seoultech.ac.kr

Abstract:

In this paper, we propose a deep neural network-based sound interpolation method for realizing virtual reality sound. The method generates the sound at an intermediate point from acoustic signals obtained at two points. Sound interpolation can be performed with statistical methods such as the arithmetic or geometric mean, but these are insufficient to reflect the actual nonlinear acoustic characteristics. To address this problem, we train a deep neural network on the acoustic signals of the two points and the target point, and the experimental results show that the deep neural network-based sound interpolation method outperforms the statistical methods.
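The contrast drawn in the abstract can be illustrated with a minimal sketch: the two statistical baselines (arithmetic and geometric mean) versus a small network trained to map the two endpoint signals to the target-point signal. This is an assumption-laden toy example, not the paper's actual system: the signals are synthetic random frames, the nonlinear "target point" response is an invented `tanh` mixture, and the one-hidden-layer numpy network stands in for the unspecified deep architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy frame-level signals "recorded" at two points (shapes are assumptions).
x1 = rng.standard_normal((1000, 1))
x2 = rng.standard_normal((1000, 1))
# Invented nonlinear ground truth at the target point, for illustration only.
y = np.tanh(0.6 * x1 + 0.4 * x2)

# Statistical baselines mentioned in the abstract.
arith = (x1 + x2) / 2.0
geom = np.sign(x1 * x2) * np.sqrt(np.abs(x1 * x2))  # sign-preserving geometric mean

# Minimal one-hidden-layer ReLU network trained with full-batch gradient descent
# to map (x1, x2) -> y, standing in for the paper's deep neural network.
X = np.hstack([x1, x2])
W1 = rng.standard_normal((2, 16)) * 0.5
b1 = np.zeros(16)
W2 = rng.standard_normal((16, 1)) * 0.5
b2 = np.zeros(1)

lr = 0.05
for _ in range(3000):
    h = np.maximum(0.0, X @ W1 + b1)   # hidden activations
    pred = h @ W2 + b2
    err = pred - y
    # Backpropagation of the mean-squared error.
    dh = (err @ W2.T) * (h > 0)
    W2 -= lr * (h.T @ err) / len(X)
    b2 -= lr * err.mean(axis=0)
    W1 -= lr * (X.T @ dh) / len(X)
    b1 -= lr * dh.mean(axis=0)

def mse(a, b):
    return float(np.mean((a - b) ** 2))

pred = np.maximum(0.0, X @ W1 + b1) @ W2 + b2
print("arithmetic mean MSE:", mse(arith, y))
print("geometric mean MSE: ", mse(geom, y))
print("trained network MSE:", mse(pred, y))
```

Because the target response is nonlinear in the two inputs, the fixed linear (arithmetic) and multiplicative (geometric) combinations leave a residual error that the trained network can reduce, which is the intuition behind the paper's claim.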



Keywords: VR Sound, Deep Neural Network, Sound Interpolation


Editorial Office
1108, New building, 22, Teheran-ro 7-gil, Gangnam-gu, Seoul, Korea
Homepage: www.kibme.org TEL: +82-2-568-3556 FAX: +82-2-568-3557
Copyright ⓒ 2012 The Korean Institute of Broadcast and Media Engineers
All Rights Reserved