Empirical analysis of linguistic and paralinguistic information for automatic dialect classification

作者：Shweta Sinha, Aruna Jain, Shyam S. Agrawal

摘要

Current research in automatic speech recognition is primarily concerned with the correct evaluation of linguistic information transmitted in the speech signal and with the identification of variations, naturally present in speech. These differences in speech may be due to the individual’s age; gender; or speaking style influenced by his dialect. Undoubtedly, the focus of research in this field is to strengthen further the techniques developed thus far, regarding their reliability and accuracy. The endeavour of this research paper is to primarily concentrate on analysis and modelling of linguistic and paralinguistic information embedded in the speech signal for discovering the similarities and dissimilarities among acoustic characteristics arising out of different dialects. This paper investigates the influence of dialectal variations, by measuring and analysing certain acoustic features such as formant frequencies, pitch, pitch slope, duration and intensity of vowel sounds. For automatic identification of native dialect, these differences are further exploited, given a sample of native speaker’s speech. For the classification of dialect in the spoken utterances support vector machines along with dialect-specific Gaussian mixture models were used. The system performance is compared with human perception of dialects. The proposed study focuses on various dialects of one of the world’s major language; Hindi.

论文关键词：Support vector machine, Gaussian mixture model, Dialect identification, Acoustic feature, Human perception

论文评审过程：

论文官网地址：https://doi.org/10.1007/s10462-017-9573-3