Classifier subset selection for biomedical named entity recognition

Authors: Nazife Dimililer, Ekrem Varoğlu, Hakan Altınçay

Abstract

A classifier ensemble approach is considered for the biomedical named entity recognition task. A vote-based classifier selection scheme is developed whose search complexity lies between that of static classifier selection and that of real-valued, class-dependent weighting approaches. Assuming that the reliability of each classifier's predictions differs among classes, the proposed approach selects classifiers by taking their individual votes into account. For this purpose, a large set of classifiers is generated, each based on a different set of features and modeling parameter settings. A genetic algorithm is developed to label the predictions of these classifiers as reliable or unreliable. During testing, the votes labeled as reliable are combined using weighted majority voting. The classifier ensemble formed by the proposed scheme surpasses the full object F-score of the best individual classifier by 2.75%, which is the highest score achieved on the data set considered.
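
The test-time combination step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the reliability labels are already available (in the paper they come from the genetic algorithm) and are represented as one set of trusted class labels per classifier. The function name `combine_votes`, the example tags, and the weight values are all hypothetical, and the abstract does not say how the voting weights are obtained.

```python
from collections import defaultdict


def combine_votes(votes, reliable, weights):
    """Weighted majority voting restricted to votes marked as reliable.

    votes    -- one predicted class label per classifier
    reliable -- one set per classifier holding the classes for which that
                classifier's votes are trusted (assumed given here)
    weights  -- one non-negative weight per classifier (hypothetical values)

    Returns the winning label, or None if every vote was discarded.
    """
    scores = defaultdict(float)
    for label, trusted, weight in zip(votes, reliable, weights):
        # A vote counts only if this classifier is trusted for the class it voted for.
        if label in trusted:
            scores[label] += weight
    if not scores:
        return None
    # Ties resolve to the label that entered the tally first.
    return max(scores, key=scores.get)


# Three hypothetical classifiers voting on a single token.
votes = ["B-protein", "O", "B-protein"]
reliable = [{"B-protein"}, {"O"}, {"O"}]
weights = [0.7, 0.9, 0.8]
winner = combine_votes(votes, reliable, weights)
```

In this toy case the third classifier's "B-protein" vote is discarded because that classifier is trusted only for "O", so "O" wins with 0.9 against 0.7. Plain weighted voting over all three votes would instead have chosen "B-protein" with 1.5.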

Keywords: Biomedical named entity recognition, Classifier ensembles, Classifier subset selection, Genetic algorithms, Weighted voting, Natural language processing

Paper URL: https://doi.org/10.1007/s10489-008-0124-0