High Parameter Frequency Resolution Encoding Scheme for Spatial Audio Objects Using Stacked Sparse Autoencoder
作者:Yulin Wu, Ruimin Hu, Xiaochen Wang, Chenhao Hu, Shanfa Ke
摘要
Object-based audio systems have become common in recent years as they provide the flexibility for many auditory scenarios, such as virtual reality games, interactive theater, and spatial audio communication. For saving bitrates, multiple audio objects are compressed into a mono downmix signal and side information parameters. However, side information parameter frequency resolution is too low to cause aliasing distortion. To overcome this issue, a new encoding scheme based on high parameter frequency resolution (224 sub-bands in a frame) is proposed in this paper. The side information parameters with high frequency resolution are compressed and reconstructed via SSAE (stacked sparse autoencoder) neural network and further used for recovering the audio objects. The performance of the proposed method is compared against existing SAOC (spatial audio object coding) methods at the same overall bitrate, judged by both objective and subjective results. The evaluation shows that our approach can facilitate the high quality of spatial audio objects.
论文关键词:Aliasing distortion, Frequency resolution, SSAE, SAOC
论文评审过程:
论文官网地址:https://doi.org/10.1007/s11063-021-10659-8