A deep neural network model for speakers coreference resolution in legal texts

Authors:

Highlights:

Abstract

Coreference resolution is one of the fundamental tasks in natural language processing (NLP) and is of great significance for understanding the semantics of texts. It is also essential for many downstream NLP applications. Existing methods largely focus on resolving pronouns, possessives, and noun phrases in the general domain, while little work has been proposed for professional domains such as the legal field. Legal texts differ from general texts, and encoding them, capturing the relationships between the entities they mention, and then resolving coreference is a challenging problem. To better understand legal texts and to facilitate a series of downstream tasks in legal text mining, we propose a deep neural network model for coreference resolution in court record documents. Specifically, a pre-trained language model and bi-directional long short-term memory networks are first used to encode the legal texts. Second, graph neural networks are applied to incorporate reference relations between entities. Finally, two distinct classifiers are used to score the candidate pairs. Results on the dataset show that our model achieves an F1 score of 87.53% on court record documents, outperforming neural baseline models by a large margin. Further analysis shows that the proposed method can effectively identify the reference relations between entities and model the entity dependencies.
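
The last two stages described in the abstract (propagating information along reference relations, then scoring candidate pairs) can be illustrated with a toy sketch. This is not the authors' model. It assumes mention embeddings are already available (in the paper they come from a pre-trained language model and a BiLSTM), uses a single round of mean-aggregation message passing as a stand-in for the graph neural network, and uses one sigmoid dot-product scorer in place of the two classifiers, whose details the abstract does not give. The names `message_pass`, `pair_score`, and `score_candidates` are hypothetical.

```python
import math


def message_pass(embeddings, edges):
    """One round of mean aggregation over an undirected graph.

    Assumption: a simple stand-in for a graph neural network layer.
    Each node averages its own vector with those of its neighbors.
    """
    n = len(embeddings)
    neighbors = {i: [i] for i in range(n)}  # self-loop keeps a node's own features
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    dim = len(embeddings[0])
    return [
        [sum(embeddings[j][d] for j in neighbors[i]) / len(neighbors[i])
         for d in range(dim)]
        for i in range(n)
    ]


def pair_score(u, v):
    """Sigmoid of the dot product; a stand-in for a learned pair classifier."""
    dot = sum(x * y for x, y in zip(u, v))
    return 1.0 / (1.0 + math.exp(-dot))


def score_candidates(embeddings, edges, pairs):
    """Update the embeddings over the graph, then score each candidate pair."""
    updated = message_pass(embeddings, edges)
    return {(i, j): pair_score(updated[i], updated[j]) for i, j in pairs}


# Toy input: three mentions with 2-d embeddings and one known relation (0, 1).
embeddings = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
scores = score_candidates(embeddings, edges=[(0, 1)], pairs=[(0, 1), (0, 2)])
```

In this toy input, mentions 0 and 1 share an edge, so message passing pulls their vectors together and their pair scores above 0.5, while the unconnected pair (0, 2) scores below it.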

Keywords: Legal text mining, Coreference resolution, Court record document, Neural networks, Attention mechanism

Review history: Received 3 February 2020, Revised 12 June 2020, Accepted 27 July 2020, Available online 20 August 2020, Version of Record 20 October 2020.

Official paper URL: https://doi.org/10.1016/j.ipm.2020.102365