Med7: A transferable clinical natural language processing model for electronic health records

Authors:

Highlights:

• Introduced a principled end-to-end workflow for developing a named-entity recognition model from a limited number of annotated examples, to identify medical concepts in clinical notes.

• Self-supervised pre-training using the cloze-type task.

• An iterative model improvement workflow with active learning.

• Demonstrated the importance of domain adaptation for the same task, using medical records from different clinical backgrounds.

• A case study on information extraction using the largest mental health secondary care electronic health records database in the United Kingdom.

Keywords: Clinical natural language processing, Neural networks, Self-supervised learning, Noisy labelling, Active learning

Review history: Received 18 August 2020, Revised 24 February 2021, Accepted 3 May 2021, Available online 18 May 2021, Version of Record 1 June 2021.

Official paper URL: https://doi.org/10.1016/j.artmed.2021.102086