Environmental Sound Classification on the Edge: A Pipeline for Deep Acoustic Networks on Extremely Resource-Constrained Devices

Authors:

Highlights:

• We introduce the first-ever pipeline for deep-learning-based sound classification on extremely resource-constrained microcontrollers (MCUs): battery-powered devices with little memory and low CPU speed.

• Firstly, we present a CNN architecture (ACDNet) that achieves state-of-the-art classification accuracy for raw audio classification on ESC-10, ESC-50, UrbanSound8K and AudioEvent benchmark datasets.

• Secondly, we compress ACDNet using our proposed hybrid structured compression technique to obtain Micro-ACDNet, which is 97.22% smaller and requires 97.28% fewer FLOPs yet achieves close to state-of-the-art performance.

• Finally, after 8-bit quantization, we deploy Micro-ACDNet on a standard off-the-shelf MCU.

Abstract

We introduce the first-ever pipeline for deep-learning-based sound classification on extremely resource-constrained microcontrollers (MCUs): battery-powered devices with little memory and low CPU speed. First, we present a CNN architecture (ACDNet) that achieves state-of-the-art classification accuracy for raw audio classification on the ESC-10, ESC-50, UrbanSound8K and AudioEvent benchmark datasets. Second, we compress ACDNet using our proposed hybrid structured compression technique to obtain Micro-ACDNet, which is 97.22% smaller and requires 97.28% fewer FLOPs yet achieves close to state-of-the-art performance. Finally, after 8-bit quantization, we deploy Micro-ACDNet on a standard off-the-shelf MCU.
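The 8-bit quantization step mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simple symmetric, per-tensor scheme, and the function names `quantize_int8` and `dequantize` as well as the sample weights are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: w is approximated by scale * q,
    with q an integer in [-127, 127]. Assumed scheme, for illustration only."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map the 8-bit integers back to approximate floating-point weights."""
    return [scale * v for v in q]


# Hypothetical weights, not taken from ACDNet.
weights = [0.50, -1.27, 0.003, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding to the nearest integer bounds the error by half a quantization step.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_err)
```

Storing each weight as an 8-bit integer instead of a 32-bit float cuts weight storage by a factor of four, at the cost of a rounding error of at most half a quantization step per weight. This saving is separate from the size and FLOP reductions reported for the structured compression step.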

Keywords: Deep learning, Audio classification, Environmental sound classification, Acoustics, Intelligent sound recognition, Micro-Controller, IoT, Edge-AI

Review history: Received 2 April 2021, Revised 27 August 2022, Accepted 4 September 2022, Available online 8 September 2022, Version of Record 12 September 2022.

Official paper URL: https://doi.org/10.1016/j.patcog.2022.109025