Learning variable-length representation of words
Authors:
Highlights:
• A variable-length representation-learning (embedding) method for words.
• Provides a means of compressing the word vectors.
• The proposed algorithm uses fewer dimensions for words with consistent contexts (words with specific meanings); a sketch of this idea follows the list.
• Variable-length embedding potentially helps remove bias (over-fitting) on certain datasets.
• The proposed approach outperforms fixed-length embedding, as well as transformation-based approaches relying on regularization and binarization, on standard word-semantics datasets.
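To make the third highlight concrete, here is a minimal, hypothetical sketch of the variable-length idea, not the paper's actual algorithm: it truncates a pre-trained fixed-length embedding to a per-word length driven by how consistent the word's observed contexts are. The names base_embedding, context_vectors, min_dim, and max_dim are illustrative assumptions.

```python
import numpy as np

def variable_length_embedding(base_embedding, context_vectors,
                              min_dim=50, max_dim=300):
    """Assign each word a vector length related to its context variance:
    consistent contexts -> fewer dimensions (illustrative heuristic only)."""
    embeddings, lengths = {}, {}
    for word, ctx in context_vectors.items():
        # Mean per-dimension variance of the word's context vectors as a
        # crude proxy for how many distinct senses/usages the word has.
        spread = float(np.mean(np.var(np.stack(ctx), axis=0)))
        # Map the spread to an integer dimensionality in [min_dim, max_dim].
        scale = spread / (spread + 1.0)              # squash to (0, 1)
        dim = int(min_dim + scale * (max_dim - min_dim))
        embeddings[word] = base_embedding[word][:dim]  # keep leading dims
        lengths[word] = dim
    return embeddings, lengths
```

In the paper the per-word dimensionality comes out of the learning procedure itself; the post-hoc variance heuristic above only illustrates the variable-length output format.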
Keywords: Word embedding, Compression and sparsity, Lexical semantics
Article history: Received 6 August 2019, Revised 13 January 2020, Accepted 23 February 2020, Available online 27 February 2020, Version of Record 5 March 2020.
Paper URL: https://doi.org/10.1016/j.patcog.2020.107306