Music emotion recognition using recurrent neural networks and pretrained models
作者:Jacek Grekow
摘要
The article presents conducted experiments using recurrent neural networks for emotion detection in musical segments. Trained regression models were used to predict the continuous values of emotions on the axes of Russell’s circumplex model. A process of audio feature extraction and creating sequential data for learning networks with long short-term memory (LSTM) units is presented. Models were implemented using the WekaDeeplearning4j package and a number of experiments were carried out with data with different sets of features and varying segmentation. The usefulness of dividing the data into sequences as well as the point of using recurrent networks to recognize emotions in music, the results of which have even exceeded the SVM algorithm for regression, were demonstrated. The author analyzed the effect of the network structure and the set of used features on the results of the regressors recognizing values on two axes of the emotion model: arousal and valence. Finally, the use of a pretrained model for processing audio features and training a recurrent network with new sequences of features is presented.
论文关键词:Emotion detection, Audio features, Sequential data, Recurrent neural networks
论文评审过程:
论文官网地址:https://doi.org/10.1007/s10844-021-00658-5