Audio scene recognition based on audio events and topic model

Authors:

Abstract

Topic models are an active research area that is attracting attention from many fields. Several recent studies have applied topic models to audio scene recognition (ASR), and most of them use the document-word co-occurrence matrix for topic analysis. In this work, we propose a new ASR algorithm based on audio events and a topic model, which instead uses the document-event co-occurrence matrix for topic analysis. Our work rests on the hypothesis that, for an audio document, the event distribution is more in line with the human way of thinking than the word distribution, so the topic distribution obtained from the document-event co-occurrence matrix represents the audio document better. The contributions of this work are as follows: (1) we propose an ASR algorithm that uses the document-event co-occurrence matrix for topic analysis. Compared with current studies that use the document-word co-occurrence matrix, the proposed algorithm extracts a topic distribution that expresses the audio documents better and therefore yields better recognition results; (2) we propose a much easier method for obtaining the document-event co-occurrence matrix; (3) we propose a method for weighting the event distribution of audio documents. This weighting emphasizes the audio events that are important in reflecting the unique topics of an audio document and suppresses the audio events that are common to many topics. Experimental results on two public datasets verify the effectiveness of the proposed ASR algorithm, as well as the necessity and effectiveness of the proposed weighting method. The ideas in this work are not limited to ASR and can be extended to other fields, such as video classification.
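
The abstract does not give the formula behind the event weighting in contribution (3). As a rough illustration of the general idea only, the sketch below applies an IDF-style weight to a toy document-event co-occurrence matrix, so that events occurring in every document are suppressed and rarer events are emphasized. The IDF formula, the function names, and the toy counts are all assumptions made for illustration and are not taken from the paper.

```python
import math


def idf_style_event_weights(doc_event_counts):
    """Return one weight per event: log(N / df).

    N is the number of documents and df is the number of documents in
    which the event occurs at least once. This IDF-style formula is an
    assumption, not the paper's weighting method.
    """
    n_docs = len(doc_event_counts)
    n_events = len(doc_event_counts[0])
    weights = []
    for e in range(n_events):
        df = sum(1 for row in doc_event_counts if row[e] > 0)
        # An event that never occurs gets weight 0 to avoid dividing by zero.
        weights.append(math.log(n_docs / df) if df > 0 else 0.0)
    return weights


def weight_matrix(doc_event_counts):
    """Scale each column of the document-event matrix by its event weight."""
    weights = idf_style_event_weights(doc_event_counts)
    return [[c * w for c, w in zip(row, weights)] for row in doc_event_counts]


# Hypothetical toy data: 3 audio documents (rows) x 3 audio events (columns).
# Event 0 occurs in every document, so it says little about any single topic.
counts = [
    [5, 2, 0],
    [4, 0, 3],
    [6, 1, 1],
]
weights = idf_style_event_weights(counts)
weighted = weight_matrix(counts)
```

In this toy case the event shared by all three documents receives weight zero, while the two events that each appear in only two documents keep a positive weight. A weighted matrix of this kind could then be passed to a topic model such as PLSA or LDA, both of which are named in the keywords.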

Keywords: Audio scene recognition, Audio event, Topic model, PLSA, LDA, Support vector machine

Review history: Received 9 May 2016, Revised 1 April 2017, Accepted 7 April 2017, Available online 8 April 2017, Version of Record 21 April 2017.

Official paper URL: https://doi.org/10.1016/j.knosys.2017.04.001