Learning representations of sound using trainable COPE feature extractors

Authors:

Highlights:

• We introduce trainable COPE feature extractors for sound representation learning.

• A COPE feature extractor is trained using a single prototype sound of interest.

• We propose a method for audio event detection with robustness to varying SNR.

• We experiment on four data sets: MIVIA audio, MIVIA roads, ESC-10 and TU Dortmund.

• We achieve better results than existing methods for audio event detection.

Keywords: Audio analysis, Event detection, Peaks of energy, Representation learning, Trainable feature extractors

Review history: Received 27 July 2017, Revised 26 February 2019, Accepted 21 March 2019, Available online 21 March 2019, Version of Record 27 March 2019.

Paper URL: https://doi.org/10.1016/j.patcog.2019.03.016