Using lexical disambiguation and named-entity recognition to improve spelling correction in the electronic patient record
Authors:
Abstract
In this article, we show how a set of natural language processing (NLP) tools can be combined to improve the processing of clinical records. The study concentrates on improving spelling correction, which is of major importance for quality control in the electronic patient record (EPR). As a first task, we report on the design of an improved interactive tool for correcting spelling errors. Unlike traditional systems, it uses the linguistic context (both semantic and syntactic) to improve the correction strategy. The system is organized into three modules. Module 1 is based on a classical spelling checker: it is context-independent and simply measures a string-edit distance between a misspelled word and a list of well-formed words. Module 2 attempts to rank the set of candidates provided by the first module more relevantly, using morpho-syntactic disambiguation tools. Module 3 processes candidates that share the same part of speech (POS) and applies word-sense (WS) disambiguation in order to rerank them. As a second task, we show how this improved interactive spell checker can be turned into a fully automatic system by adding another NLP module: a named-entity (NE) extractor, i.e. a tool able to identify words such as patient and physician names. This module is used to avoid replacing named entities when the system is not used in interactive mode. Results confirm that using the linguistic context can improve interactive spelling correction, and justify the use of a named-entity recognizer to conduct fully automatic spelling correction. It is concluded that NLP is mature enough to help information processing in the EPR.
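To make the pipeline described above more concrete, the following is a minimal Python sketch of two of its pieces: the context-independent edit-distance candidate generation of Module 1, and the named-entity guard used in automatic mode. It is not the authors' implementation. The function names (`edit_distance`, `candidates`, `auto_correct`), the distance threshold `max_dist`, and the use of a plain set of names in place of a real NE extractor are all assumptions made for illustration. The contextual reranking of Modules 2 and 3 is omitted, so the closest candidate by edit distance is simply taken as the correction.

```python
from typing import Iterable, List, Set


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def candidates(word: str, lexicon: Iterable[str], max_dist: int = 2) -> List[str]:
    """Module 1 stand-in: lexicon words within max_dist, closest first.

    max_dist is a hypothetical threshold, not a value from the paper.
    """
    scored = sorted((edit_distance(word, w), w) for w in lexicon)
    return [w for d, w in scored if d <= max_dist]


def auto_correct(tokens: Iterable[str], lexicon: Set[str],
                 named_entities: Set[str]) -> List[str]:
    """Automatic mode: leave named entities and known words untouched.

    named_entities is a plain set here; a real system would obtain it
    from an NE extractor (assumption for illustration).
    """
    out = []
    for tok in tokens:
        if tok in named_entities or tok in lexicon:
            out.append(tok)
            continue
        cands = candidates(tok.lower(), lexicon)
        # No contextual reranking here: take the closest candidate, if any.
        out.append(cands[0] if cands else tok)
    return out
```

With the toy lexicon `{"patient", "has", "severe", "fever"}`, the call `auto_correct(["Hass", "has", "sevre", "fevr"], lexicon, {"Hass"})` returns `["Hass", "has", "severe", "fever"]`. Passing an empty set of named entities instead turns the surname `Hass` into `has`, which is the kind of unwanted replacement the NE module is meant to prevent.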
Keywords: Electronic patient record, Named-entity recognition, Spelling correction, Natural language processing, Artificial intelligence
Article history: Received 7 January 2002, Revised 11 April 2003, Accepted 12 April 2003, Available online 16 June 2003.
Official URL: https://doi.org/10.1016/S0933-3657(03)00052-6