An adaptable, high-performance relation extraction system for complex sentences

Highlights:

• Automatic extraction of semantic relations from large volumes of natural language text in a closed domain, particularly the judicial domain.

• Combination of knowledge-based and semi-supervised learning systems to extract domain-specific relations from complex text that lacks domain-specific labelled data.

• Solutions to various challenges in handling complex sentences that are in passive voice or contain conjunctive forms.

• Evaluation of state-of-the-art open information extraction approaches on judicial text of varying complexity and length.

• First attempt to provide a solution for extracting domain-specific relations from judicial text.

Keywords: Information extraction, Natural language processing, Domain-specific relation extraction, Domain ontology, Semi-supervised learning, Knowledge-based system, Judicial text, Open information extraction, Domain adaptability

Article history: Received 7 January 2020, Revised 24 January 2022, Accepted 29 April 2022, Available online 7 May 2022, Version of Record 24 June 2022.