Deep4SNet: deep learning for fake speech classification

Authors:

Highlights:

• Deep4SNet is a text-independent classifier of original/fake speech recordings.

• It is based on a customized deep learning architecture.

• Speech recordings are transformed into histograms to feed the model.

• Experiments are performed on the Deep Voice and Imitation datasets.

• The accuracy of the classifier is over 98%.
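The highlights state that recordings are turned into histograms before reaching the model, but this summary does not say how the histograms are built. As a minimal sketch under assumptions, the Python below bins raw amplitude samples into a fixed number of equal-width bins and normalizes the counts. The function names, the 64-bin choice, the [-1, 1] amplitude range, and the synthetic test tone are all hypothetical illustrations, not details taken from the paper.

```python
import math


def amplitude_histogram(samples, num_bins=64, lo=-1.0, hi=1.0):
    """Count samples per equal-width bin over [lo, hi].

    Assumption: amplitudes are floats nominally in [lo, hi];
    values outside that range are clipped to the nearest edge.
    """
    if num_bins <= 0 or hi <= lo:
        raise ValueError("need num_bins > 0 and hi > lo")
    counts = [0] * num_bins
    width = (hi - lo) / num_bins
    for s in samples:
        clipped = min(max(s, lo), hi)
        # The top edge (clipped == hi) is folded into the last bin.
        idx = min(int((clipped - lo) / width), num_bins - 1)
        counts[idx] += 1
    return counts


def normalize(counts):
    """Scale counts so they sum to 1 (all zeros if there are no samples)."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


# Hypothetical input: one second of a 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate)
        for n in range(sample_rate)]

hist = amplitude_histogram(tone)
features = normalize(hist)
```

A fixed-length vector such as `features` has the same size regardless of recording duration or spoken content, which is one plausible reason a histogram representation suits a text-independent classifier.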

Abstract

Deep4SNet is a text-independent classifier of original and fake speech recordings, built on a customized deep learning architecture. Speech recordings are transformed into histograms that are fed to the model. Experiments are performed on the Deep Voice and Imitation datasets, and the accuracy of the classifier is over 98%.

Keywords: Fake voice, Convolutional neural network, Imitation, Deep learning, Deep voice, Classification

Review history: Received 22 November 2019, Revised 22 April 2021, Accepted 21 June 2021, Available online 29 June 2021, Version of Record 3 July 2021.

Official paper URL: https://doi.org/10.1016/j.eswa.2021.115465