Speech translation system for English to Dravidian languages

Authors: J. Sangeetha, S. Jothilakshmi

Abstract

This paper proposes a Speech-to-Speech Translation (SST) system focused on translation from English to the Dravidian languages Tamil and Malayalam. An SST system involves three major components: automatic continuous speech recognition, machine translation, and text-to-speech synthesis. Automatic Continuous Speech Recognition (CSR) was developed using the Auto Associative Neural Network (AANN), Support Vector Machine (SVM), and Hidden Markov Model (HMM). The HMM yields better results than the SVM and AANN, so the HMM-based speech recognizer is adopted for English. We propose a hybrid Machine Translation (MT) system, combining rule-based and statistical approaches, for converting English text to Dravidian-language text. A syllable-based concatenative Text-To-Speech (TTS) synthesis system is proposed for Tamil and Malayalam. AANN-based prosody prediction is performed for Tamil to improve naturalness and intelligibility. The domain is restricted to sentences covering announcements at railway stations, bus stops, and airports. This work presents a novel translation method for English to Dravidian languages. The improved performance of each module (HMM-based CSR, hybrid MT, and concatenative TTS) increases the overall speech translation performance. The proposed system can be applied to translation from English to any Indian language, provided a parallel corpus is created for that language and the system is trained on it.
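The three-stage cascade described in the abstract (CSR, then MT, then TTS) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `SpeechTranslator`, the toy stage functions, and the placeholder lexicon are all hypothetical stand-ins, chosen only to show how the output of each module feeds the next.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SpeechTranslator:
    """Hypothetical cascade: recognizer -> translator -> synthesizer."""
    recognize: Callable[[bytes], str]        # CSR: English audio -> English text
    translate: Callable[[str], str]          # MT: English text -> target-language text
    synthesize: Callable[[str], List[str]]   # TTS: target text -> sequence of speech units

    def run(self, audio: bytes) -> List[str]:
        english_text = self.recognize(audio)
        target_text = self.translate(english_text)
        return self.synthesize(target_text)


# Toy stand-ins (assumptions for illustration only, not real models).
def toy_recognize(audio: bytes) -> str:
    # Pretend the "audio" already carries its transcript as UTF-8 bytes.
    return audio.decode("utf-8")


# Placeholder tokens, not real Tamil or Malayalam words.
TOY_LEXICON: Dict[str, str] = {"train": "TGT_train", "arrives": "TGT_arrives"}


def toy_translate(text: str) -> str:
    # Word-by-word lookup; unknown words pass through unchanged.
    return " ".join(TOY_LEXICON.get(word, word) for word in text.lower().split())


def toy_synthesize(text: str) -> List[str]:
    # Stand-in for concatenative synthesis: one labeled unit per word.
    return [f"unit:{word}" for word in text.split()]


pipeline = SpeechTranslator(toy_recognize, toy_translate, toy_synthesize)
units = pipeline.run(b"Train arrives")
```

Because each stage is passed in as a plain callable, any single module could be swapped out (for example, a different recognizer) without touching the other two, which mirrors how the abstract evaluates the modules separately.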

Keywords: Speech-to-speech translation, Spoken language translation, Automatic continuous speech recognition, Machine translation, Text to speech synthesis, Dravidian languages

Official paper URL: https://doi.org/10.1007/s10489-016-0846-3