Biological relation extraction and query answering from MEDLINE abstracts using ontology-based text mining

Abstract

The rapid growth of biological text repositories makes it difficult for people to access the information they need conveniently and effectively. The problem arises because most of this information is embedded in unstructured or semi-structured text that computers cannot easily interpret. In this paper we present an ontology-based Biological Information Extraction and Query Answering (BIEQA) system, which initiates text mining with a set of concepts stored in a biological ontology and then mines possible biological relations among those concepts using NLP techniques and co-occurrence-based analysis. The system extracts all frequently occurring biological relations between a pair of biological concepts. Each mined relation is associated with a fuzzy membership value proportional to its frequency of occurrence in the corpus, and is termed a fuzzy biological relation. The fuzzy biological relations extracted from a text corpus, along with other relevant information components such as the biological entities occurring within a relation, are stored in a database. The database is integrated with a query-processing module whose interface guides users in formulating biological queries at different levels of specificity.
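To make the idea of a fuzzy biological relation concrete, the following is a minimal sketch and not the BIEQA implementation. It assumes relations have already been extracted as (concept, relation, concept) triples, and it assumes the membership value is the triple's frequency divided by the highest frequency in the corpus; the abstract states only that membership is proportional to frequency. The names `fuzzy_relations`, `query`, and `min_count`, and the sample triples, are hypothetical.

```python
from collections import Counter


def fuzzy_relations(triples, min_count=2):
    """Map each frequent (concept_a, relation, concept_b) triple to a membership in (0, 1].

    Assumption: membership = count / max count over all triples.
    Triples seen fewer than min_count times are treated as infrequent and dropped.
    """
    counts = Counter(triples)
    if not counts:
        return {}
    max_count = max(counts.values())
    return {t: c / max_count for t, c in counts.items() if c >= min_count}


def query(memberships, a=None, relation=None, b=None):
    """Return matching triples with memberships, strongest first.

    Leaving an argument as None acts as a wildcard, so a query can be
    posed at different levels of specificity.
    """
    hits = [
        (t, mu)
        for t, mu in memberships.items()
        if (a is None or t[0] == a)
        and (relation is None or t[1] == relation)
        and (b is None or t[2] == b)
    ]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical extracted triples, standing in for the output of the text-mining step.
triples = (
    [("protein", "activates", "gene")] * 4
    + [("protein", "inhibits", "gene")] * 2
    + [("protein", "binds", "gene")]
)
memberships = fuzzy_relations(triples)
```

Calling `query(memberships, a="protein")` lists every stored relation involving that concept as the first argument, while adding `relation="inhibits"` narrows the result to a single relation type.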

Keywords: Text mining, Ontology, Biological relation extraction, Biological query processing

Article history: Received 30 October 2005, Revised 18 May 2006, Accepted 3 June 2006, Available online 17 July 2006.

Paper URL: https://doi.org/10.1016/j.datak.2006.06.007