Identifying entities from scientific publications: A comparison of vocabulary- and model-based methods
Authors:
Highlights:
• Five vocabulary- and model-based methods to extract terms from scientific publications are evaluated.
• Three conditional random fields (CRF)-based methods outperform the two vocabulary-based ones.
• The CRF method with a keyword-based dictionary performs best.
• The keyword-based method has higher recall, and the Wikipedia-based method has higher precision.
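To make the vocabulary-based approach and the precision/recall comparison above concrete, here is a minimal sketch in Python. It is not the paper's implementation: the greedy longest-match strategy, the function names `extract_terms` and `precision_recall`, the `max_len` parameter, and the toy vocabulary are all assumptions made for illustration.

```python
def extract_terms(tokens, vocabulary, max_len=4):
    """Greedy longest-match lookup of vocabulary phrases in a token list.

    `vocabulary` is assumed to be a set of lowercase phrases whose words
    are separated by single spaces (a hypothetical format).
    """
    found = []
    i = 0
    while i < len(tokens):
        matched_len = 0
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n]).lower()
            if phrase in vocabulary:
                found.append(phrase)
                matched_len = n
                break
        # Skip past a matched phrase, otherwise advance one token.
        i += matched_len if matched_len else 1
    return found


def precision_recall(predicted, gold):
    """Set-based precision and recall of predicted terms against gold terms."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy data, invented for illustration only.
vocabulary = {"conditional random fields", "random fields", "entity extraction"}
tokens = "We train conditional random fields for entity extraction".split()
gold = ["conditional random fields", "entity extraction", "citation analysis"]

terms = extract_terms(tokens, vocabulary)
precision, recall = precision_recall(terms, gold)
```

In this toy case every extracted term is correct but one gold term is missed, which shows the kind of precision/recall trade-off the highlights describe. A larger vocabulary would tend to raise recall at the risk of lowering precision.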
Keywords: Entity extraction, Vocabulary, Dictionary, Conditional random fields, Content-aware
Review history: Received 5 November 2014, Revised 22 April 2015, Accepted 22 April 2015, Available online 16 May 2015, Version of Record 16 May 2015.
Official paper URL: https://doi.org/10.1016/j.joi.2015.04.003