Topical classification of domain names based on subword embeddings
Authors:
Highlights:
• Subword information can alleviate the impact of noisiness and sparsity in extremely short text classification.
• Convolutional Neural Networks outperform Recurrent Neural Networks when processing noisy text sequences.
• Subword embeddings learned from Wikipedia can significantly improve the performance of domain name classification.
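To illustrate why subword units help with very short, noisy strings such as domain names, the sketch below breaks each label of a domain into overlapping character n-grams. This is only one common way to derive subword units (character n-grams with boundary markers, in the style popularized by fastText); the highlights above do not state which subword scheme the authors use, so the function names `char_ngrams` and `domain_subwords` and the n-gram range of 3 to 5 are assumptions made for illustration.

```python
def char_ngrams(token, n_min=3, n_max=5):
    """Return overlapping character n-grams of a token.

    The token is wrapped in '<' and '>' so that prefixes and suffixes
    yield n-grams distinct from those found in the middle of a word.
    """
    marked = f"<{token}>"
    grams = []
    for n in range(n_min, n_max + 1):
        # No n-grams are produced when the marked token is shorter than n.
        for i in range(len(marked) - n + 1):
            grams.append(marked[i:i + n])
    return grams


def domain_subwords(domain, n_min=3, n_max=5):
    """Split a domain name on dots and hyphens, then collect n-grams per label.

    Hypothetical helper: a real pipeline might also segment concatenated
    words inside a label, which this sketch does not attempt.
    """
    labels = [p for p in domain.lower().replace("-", ".").split(".") if p]
    return {label: char_ngrams(label, n_min, n_max) for label in labels}


if __name__ == "__main__":
    for label, grams in domain_subwords("book-shop.com").items():
        print(label, grams)
```

Because two unseen or misspelled labels can still share many n-grams with known words, a model that embeds these units can assign a meaningful vector to a label it has never seen as a whole word.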
Keywords: Domain names, Text classification, WWW, Internet, E-Commerce
Article history: Received 9 March 2019, Revised 4 February 2020, Accepted 11 February 2020, Available online 27 February 2020, Version of Record 9 March 2020.
Paper URL: https://doi.org/10.1016/j.elerap.2020.100961