An improved method for voice pathology detection by means of a HMM-based feature space transformation

Authors:

Abstract

This paper presents a new feature transformation technique that improves screening accuracy in the automatic detection of pathological voices. The statistical transformation is based on Hidden Markov Models: the transformation and classification stages are obtained simultaneously, and the model parameters are adjusted with a criterion that minimizes the classification error. The original feature vectors are built from classic short-term noise parameters and mel-frequency cepstral coefficients. Compared with conventional approaches in the literature on the automatic detection of pathological voices, the proposed feature space transformation technique yields a significant improvement in performance without adding new features to the original input space. In view of these results, the technique is expected to perform well in other areas, such as speaker verification and/or identification.

Keywords: Pathological voice, Hidden Markov models, Minimum classification error, Dynamic feature space transformation

Abbreviations: AUC, area under the ROC curve; ANN, artificial neural networks; EM, expectation maximization; FFT, fast Fourier transform; FST, feature space transformation; GMM, Gaussian mixture model; GPD, generalized probabilistic descent; GNE, glottal to noise excitation ratio; HNR, harmonics to noise ratio; HMM, hidden Markov models; KNN, k-nearest neighbour; KLFDA, kernel local Fisher discriminant analysis; LDA, linear discriminant analysis; LPCC, linear prediction cepstral coefficients; LPC, linear prediction coefficients; MEEI, Massachusetts Eye and Ear Infirmary Voice & Speech Lab; MFCC, mel-frequency cepstrum coefficients; MCE, minimum classification error; MCP, minimum cost point; MDVP, multi-dimensional voice program; MLP, multi-layer perceptron; MDA, multiple discriminant analysis; ML, maximum likelihood; NNE, normalized noise energy; PCA, principal component analysis; ROC, receiver operating characteristic; SNR, signal to noise ratio; SPI, soft phonation index; SVM, support vector machines; SE, standard error; UPM, Universidad Politécnica de Madrid; VTI, voice turbulence index

Article history: Received 15 August 2009, Revised 16 March 2010, Accepted 23 March 2010, Available online 27 March 2010.

Official paper URL: https://doi.org/10.1016/j.patcog.2010.03.019