Assessing the performances of different neural network architectures for the detection of screams and shouts in public transportation

Authors:

Highlights:

• Models including temporal decoding deal better with the temporal information.

• Vocal sounds are those which benefit the most from temporal information.

• Decomposing the acoustic environment helps classify shouts and speech sounds.

• Real-world cases impose further constraints that affect the system’s performance.

Keywords: Audio surveillance, Acoustic event detection, Transportation, Classification, Neural networks

Review history: Received 28 March 2018, Revised 31 July 2018, Accepted 29 August 2018, Available online 20 September 2018, Version of Record 26 September 2018.

Official paper page: https://doi.org/10.1016/j.eswa.2018.08.052