ECHR-OD: On building an integrated open repository of legal documents for machine learning applications

Authors:

Highlights:

• A repository of documents based on the European Court of Human Rights is proposed.

• The data is preprocessed to ease exploration and the use of ML algorithms.

• The database is periodically and automatically updated.

• The whole ETL pipeline that generates the repository is provided as open-source software.

• Our classifier correctly predicted 96% of decisions, with an 82% F1-score.
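The last highlight reports two different figures, accuracy (share of decisions predicted correctly) and F1-score, and the two can diverge when one outcome is much rarer than the other. The sketch below illustrates this with a binary confusion matrix. The counts are hypothetical, chosen only to show how such a gap can arise; they are not taken from the paper, and the source does not say which F1 variant (binary, macro, or weighted) was reported.

```python
def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Share of all cases that were predicted correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for an imbalanced test set of 100 cases
# (illustrative only, not data from the paper).
tp, fp, fn, tn = 9, 2, 2, 87

acc = accuracy(tp, fp, fn, tn)
f1 = f1_score(tp, fp, fn)
print(f"accuracy={acc:.2f}, f1={f1:.2f}")
```

With these counts, the many true negatives raise accuracy, while F1 ignores them and reflects only how well the rarer class is handled.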


Keywords: Open data repository, Legal documents repository, Judgment documents, European Court of Human Rights, Machine learning, Classification of legal documents

Review history: Received 14 October 2020, Revised 25 March 2021, Accepted 6 April 2021, Available online 7 June 2021, Version of Record 9 February 2022.

Official paper URL: https://doi.org/10.1016/j.is.2021.101822