Learning multilingual named entity recognition from Wikipedia
Authors:
Abstract
We automatically create enormous, free and multilingual silver-standard training annotations for named entity recognition (NER) by exploiting the text and structure of Wikipedia. Most NER systems rely on statistical models of annotated data to identify and classify names of people, locations and organisations in text. This dependence on expensive annotation is the knowledge bottleneck our work overcomes.
Keywords: Named entity recognition, Information extraction, Wikipedia, Semi-structured resources, Annotated corpora, Semi-supervised learning
Review history: Received 9 November 2010, Revised 8 March 2012, Accepted 11 March 2012, Available online 13 March 2012.
Official paper URL: https://doi.org/10.1016/j.artint.2012.03.006