Speech-music discrimination using deep visual feature extractors
Authors:
Highlights:
• CNNs provide state-of-the-art results in speech-music discrimination.
• Transfer learning can be used to transfer knowledge across different modalities.
• CNNs and transfer learning can significantly reduce the training data requirements.
• Data augmentation and representation play an important role when training audio CNNs.
Keywords: CNNs, Speech-music discrimination, Transfer learning, Audio analysis
Review history: Received 3 January 2018, Revised 11 May 2018, Accepted 12 May 2018, Available online 19 May 2018, Version of Record 4 August 2018.
Official paper URL: https://doi.org/10.1016/j.eswa.2018.05.016