Stacked auto-encoders based visual features for speech/music classification
Authors:
Highlights:
• The stacked auto-encoder (SAE) model proposed in this study was used for the first time for speech/music classification (SMC).
• Experiments were conducted on spectrograms and chromagrams.
• A mean classification accuracy of 93.81% was attained using SAE and softmax.
• SAE outperforms traditional classifiers and some deep learning techniques.
• Benefits and challenges of the model were also discussed.
Keywords: Speech/Music classifier, Auto-encoders, Time-frequency visual features, Deep learning
Review history: Received 21 February 2022, Revised 22 June 2022, Accepted 30 June 2022, Available online 4 July 2022, Version of Record 18 July 2022.
Official paper URL: https://doi.org/10.1016/j.eswa.2022.118041