AUCO ResNet: an end-to-end network for Covid-19 pre-screening from cough and breath

Authors:

Highlights:

• The Auditory Cortex ResNet (AUCO ResNet) is proposed and tested. It is a deep neural network architecture designed specifically for audio classification and trained end-to-end. Its design is inspired by the architectural organization of the rat auditory cortex, and it incorporates the innovations described in the second and third highlights. The network exceeds state-of-the-art accuracy on a reference audio benchmark dataset without any preprocessing, imbalanced-data handling or, most importantly, data augmentation.

• A trainable Mel-like spectrogram layer that fine-tunes the Mel-like spectrogram to capture relevant time-frequency information.

• A novel learnable sinusoidal attention mechanism that weights local and global feature descriptors, with a focus on high-frequency details.

• State-of-the-art accuracy in cross-dataset testing.
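
The trainable Mel-like spectrogram layer in the second highlight can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the common triangular filterbank built from the formula mel = 2595 · log10(1 + f / 700), and the function names (`hz_to_mel`, `mel_to_hz`, `mel_filterbank`, `apply_filterbank`) and parameter values are hypothetical. It only shows the fixed Mel weights that such a layer could plausibly start from; in a trainable layer these weights would be treated as parameters and updated by backpropagation together with the rest of the network.

```python
import math


def hz_to_mel(f_hz):
    # Common Mel-scale formula (assumed here; the paper's variant may differ).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Return n_mels triangular filters over the n_fft // 2 + 1 FFT bins."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 edge frequencies, equally spaced on the Mel scale.
    edges_hz = [mel_to_hz(mel_max * i / (n_mels + 1)) for i in range(n_mels + 2)]
    # Centre frequency of every FFT bin in Hz.
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]

    bank = []
    for m in range(n_mels):
        lo, mid, hi = edges_hz[m], edges_hz[m + 1], edges_hz[m + 2]
        row = []
        for f in bin_hz:
            rising = (f - lo) / (mid - lo)    # slope up to the filter peak
            falling = (hi - f) / (hi - mid)   # slope down from the peak
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank


def apply_filterbank(bank, power_frame):
    # Weighted sum of one power-spectrum frame per Mel band.
    # In a trainable layer, `bank` would be the learnable weight matrix.
    return [sum(w * p for w, p in zip(row, power_frame)) for row in bank]


if __name__ == "__main__":
    bank = mel_filterbank(n_mels=8, n_fft=256, sample_rate=16000)
    flat_spectrum = [1.0] * (256 // 2 + 1)
    print(apply_filterbank(bank, flat_spectrum))
```

Starting from Mel weights rather than random ones keeps the first training steps close to a conventional Mel spectrogram, which is one plausible reading of "fine-tune" in the highlight.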

Keywords: Audio classification, Spectrograms, Attention mechanism, Covid, Pre-screening, Convolutional neural network

Review history: Received 31 March 2021, Revised 10 March 2022, Accepted 14 March 2022, Available online 15 March 2022, Version of Record 20 March 2022.

Official paper URL: https://doi.org/10.1016/j.patcog.2022.108656