Deep transfer learning baselines for sentiment analysis in Russian

Authors:

Highlights:

• We identified the most commonly used sentiment analysis datasets for Russian-language texts.

• We fine-tuned Multilingual BERT, RuBERT, and two versions of the Multilingual USE on seven sentiment analysis datasets.

• Fine-tuned RuBERT achieved new state-of-the-art results on Russian sentiment datasets.

• Compared with existing approaches, sentiment analysis of Russian-language texts based on language models outperforms rule-based and basic machine-learning approaches in classification quality.

Keywords: Sentiment analysis, Transfer learning, Russian texts

Review history: Received 15 April 2020, Revised 29 November 2020, Accepted 23 December 2020, Available online 27 January 2021, Version of Record 27 January 2021.

Official paper URL: https://doi.org/10.1016/j.ipm.2020.102484