Combining contextualized word representation and sub-document level analysis through Bi-LSTM+CRF architecture for clinical de-identification
Authors:
Highlights:
• De-identify entities belonging to various classes in unstructured medical records.
• Stack embeddings and extend the context to boost the performance of Bi-LSTM+CRF systems.
• Establish a new state of the art in the classification of entities at category level.
Keywords: Clinical de-identification, Named entity recognition, Deep learning, Contextualized embedding, Sub-document level analysis
Review history: Received 30 June 2020, Revised 29 September 2020, Accepted 1 December 2020, Available online 24 December 2020, Version of Record 24 December 2020.
Official paper URL: https://doi.org/10.1016/j.knosys.2020.106649