Multimodal fusion methods with deep neural networks and meta-information for aggression detection in surveillance
Authors:
Highlights:
• A DNN-based approach is proposed for aggression detection in surveillance.
• Four multimodal fusion methods are used with and without an intermediate level.
• Acoustic, visual and textual features are combined with meta-features.
• Linguistic and word affect features are used with properties of spontaneous speech.
• The different methods are validated on the dataset of aggression in trains.
Keywords: Aggression detection, Deep learning, Multimodal fusion, Audio–visual fusion, Text-based features, Meta-features
Review history: Received 3 March 2021, Revised 6 July 2022, Accepted 10 August 2022, Available online 13 August 2022, Version of Record 26 August 2022.
Official paper URL: https://doi.org/10.1016/j.eswa.2022.118523