A de-identifier for medical discharge summaries

Abstract

Objective: Clinical records contain significant medical information that can be useful to researchers in various disciplines. However, these records also contain personal health information (PHI), whose presence limits the use of the records outside of hospitals. The goal of de-identification is to remove all PHI from clinical records. This is a challenging task because many records contain foreign and misspelled PHI; they also contain PHI that is ambiguous with non-PHI. These complications are compounded by the linguistic characteristics of clinical records. For example, medical discharge summaries, which are studied in this paper, are characterized by fragmented, incomplete utterances and domain-specific language; they cannot be fully processed by tools designed for lay language.

Keywords: Automatic de-identification of narrative patient records, Local lexical context, Local syntactic context, Dictionaries, Sentential global context, Syntactic information for de-identification

Article history: Received 23 November 2006, Revised 8 October 2007, Accepted 9 October 2007, Available online 28 November 2007.

Official URL: https://doi.org/10.1016/j.artmed.2007.10.001