Rapid speaker ID using discrete MMI feature quantisation

Authors:

Highlights:

Abstract

This paper presents a method of rapidly determining speaker identity from a small sample of speech, using a tree-based vector quantiser trained to maximise mutual information (MMI). The method is text-independent and new speakers may be rapidly enrolled. Unlike most conventional hidden Markov model approaches, this method is computationally inexpensive enough to work on a modest integer microprocessor, yet is robust even with only a small amount of test data. Thus speaker identification is rapid in terms of both computational cost and the small amount of test speech necessary to identify the speaker. This paper presents theoretical and experimental results, showing that perfect ID accuracy may be achieved on a 15-speaker corpus using little more than 1 s of text-independent test speech. Also presented is a demonstration of how this method may be used to segment audio data by speaker.
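To make the training criterion concrete, the following is a minimal sketch of what maximising mutual information can mean for a single binary split in a tree-based quantiser: choose the threshold on one scalar feature that makes the resulting leaf label most informative about speaker identity. This is an illustration under assumptions, not the paper's implementation. The function names `mutual_information` and `best_threshold`, the exhaustive midpoint search, and the toy data are all hypothetical.

```python
import math


def mutual_information(counts):
    """Mutual information I(leaf; speaker) in bits.

    counts[leaf][speaker] holds how many training frames from each
    speaker fell into each leaf (hypothetical layout for illustration).
    """
    total = sum(sum(row) for row in counts)
    leaf_totals = [sum(row) for row in counts]
    speaker_totals = [sum(col) for col in zip(*counts)]
    mi = 0.0
    for row, leaf_n in zip(counts, leaf_totals):
        for n, spk_n in zip(row, speaker_totals):
            if n > 0:  # empty cells contribute nothing to the sum
                mi += (n / total) * math.log2(n * total / (leaf_n * spk_n))
    return mi


def best_threshold(values, speakers, n_speakers):
    """Return (mi, threshold) for the most informative split of one feature.

    Candidate thresholds are midpoints between adjacent distinct values;
    this exhaustive search is an assumption made to keep the sketch short.
    """
    best_mi, best_t = 0.0, None
    candidates = sorted(set(values))
    for lo, hi in zip(candidates, candidates[1:]):
        t = (lo + hi) / 2
        counts = [[0] * n_speakers for _ in range(2)]
        for v, s in zip(values, speakers):
            counts[int(v > t)][s] += 1  # leaf 0: v <= t, leaf 1: v > t
        mi = mutual_information(counts)
        if mi > best_mi:
            best_mi, best_t = mi, t
    return best_mi, best_t


# Toy data: one scalar feature per frame, two speakers.
values = [0.1, 0.2, 0.8, 0.9]
speakers = [0, 0, 1, 1]
mi, threshold = best_threshold(values, speakers, n_speakers=2)
```

In a full tree, a split like this would presumably be applied recursively to the frames reaching each node, and the leaf index would serve as the discrete code for a frame. How the paper grows, prunes, and scores the tree is not stated in the abstract, so those steps are left out here.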

Keywords:

Publication history: Available online 19 May 1998.

Official paper URL: https://doi.org/10.1016/S0957-4174(97)00051-1