Visual units and confusion modelling for automatic lip-reading
Authors:
Highlights:
• A novel technique for automatic lip-reading is proposed.
• A weighted finite state transducer cascade is used incorporating a confusion model.
• Performance was slightly better than a standard HMM system.
• The issue of suitable units for automatic lip-reading was also studied.
• It was found that visemes are sub-optimal because of reduced contextual modelling.
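
To make the confusion-model idea in the highlights concrete, here is a minimal sketch, not the paper's implementation. It assumes a hypothetical confusion-count table over visual units and a hypothetical four-word lexicon, and it allows substitutions only (no insertions or deletions). Counts are turned into negative log probabilities, the kind of arc weight a weighted finite state transducer would carry in the tropical semiring, and each lexicon entry is scored against the recognised unit sequence. A real system would compose transducers rather than enumerate words. Every name and number below is illustrative.

```python
import math

# Hypothetical confusion counts: COUNTS[spoken][recognised].
# These numbers are invented for illustration only.
COUNTS = {
    "p": {"p": 6, "b": 3, "m": 1},
    "b": {"p": 3, "b": 5, "m": 2},
    "m": {"p": 1, "b": 2, "m": 7},
    "a": {"a": 9, "i": 1},
    "i": {"a": 2, "i": 8},
    "t": {"t": 10},
}

# Hypothetical lexicon mapping words to unit sequences.
LEXICON = {
    "pat": ["p", "a", "t"],
    "bat": ["b", "a", "t"],
    "mat": ["m", "a", "t"],
    "bit": ["b", "i", "t"],
}


def confusion_weights(counts):
    """Convert counts to costs: -log P(recognised | spoken)."""
    weights = {}
    for spoken, row in counts.items():
        total = sum(row.values())
        weights[spoken] = {
            rec: -math.log(c / total) for rec, c in row.items()
        }
    return weights


def decode(observed, lexicon, weights, unseen_cost=10.0):
    """Return (word, cost) for the lowest-cost lexicon entry.

    Substitution-only: words whose length differs from the
    observed sequence are skipped. Pairs never seen in the
    counts receive a fixed penalty (unseen_cost, an assumption).
    """
    best_word, best_cost = None, math.inf
    for word, units in lexicon.items():
        if len(units) != len(observed):
            continue
        cost = sum(
            weights[spoken].get(rec, unseen_cost)
            for spoken, rec in zip(units, observed)
        )
        if cost < best_cost:
            best_word, best_cost = word, cost
    return best_word, best_cost


WEIGHTS = confusion_weights(COUNTS)
```

With these invented counts, `decode(["p", "i", "t"], LEXICON, WEIGHTS)` picks "bit": the recognised sequence is not itself a lexicon entry, but the confusion weights make "b" recognised as "p" cheaper than "a" recognised as "i".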
Keywords: Lip-reading, Speech recognition, Visemes, Weighted finite state transducers, Confusion matrices, Confusion modelling
Review history: Received 26 June 2015, Revised 20 January 2016, Accepted 3 March 2016, Available online 1 April 2016, Version of Record 17 April 2016.
Official paper URL: https://doi.org/10.1016/j.imavis.2016.03.003