An ensemble model for classifying idioms and literal texts using BERT and RoBERTa
Authors:
Highlights:
• Text classification, a fundamental NLP task, assigns text to structured categories.
• We propose a predictive ensemble model to classify idioms and literals.
• We use BERT and RoBERTa, fine-tuned on the TroFi dataset.
• The model is tested on a newly created dataset of 1,470 idiomatic and literal expressions, annotated by domain experts.
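
The highlights above do not say how the BERT and RoBERTa predictions are combined. As an illustration only, the sketch below assumes soft voting: each model's logits are turned into probabilities, and the probabilities are averaged. The label order in LABELS, the function names, and the example logits are all hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence

# Hypothetical label order; the paper's actual label encoding is not given here.
LABELS = ("literal", "idiom")


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw logits to probabilities in a numerically stable way."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_vote(logits_per_model: Sequence[Sequence[float]]) -> List[float]:
    """Average the class probabilities produced by each model (soft voting)."""
    probs = [softmax(logits) for logits in logits_per_model]
    n_models = len(probs)
    n_classes = len(probs[0])
    return [sum(p[i] for p in probs) / n_models for i in range(n_classes)]


def predict(logits_per_model: Sequence[Sequence[float]]) -> str:
    """Return the label with the highest averaged probability."""
    averaged = soft_vote(logits_per_model)
    best = max(range(len(averaged)), key=averaged.__getitem__)
    return LABELS[best]


if __name__ == "__main__":
    # Made-up logits standing in for the outputs of two fine-tuned classifiers.
    bert_logits = [2.0, 0.0]
    roberta_logits = [0.0, 1.0]
    print(predict([bert_logits, roberta_logits]))
```

In practice the logits would come from the two fine-tuned sequence classifiers. Other combination rules, such as majority voting or a learned meta-classifier, are equally plausible readings of "ensemble".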
Keywords: BERT, RoBERTa, Ensemble model, Idiom, Literal classification
Review history: Received 19 April 2021, Revised 25 August 2021, Accepted 5 September 2021, Available online 26 September 2021, Version of Record 26 September 2021.
Paper URL: https://doi.org/10.1016/j.ipm.2021.102756